Difference of Convex Functions

DIFFERENCE OF CONVEX FUNCTIONS
A thesis presented to the faculty of

San Francisco State University
In partial fulfilment of
The requirements for
The degree
Master of Arts
In
Mathematics
by
Miriam Schussler
San Francisco, California
May 2013
Copyright by
Miriam Schussler
2013
CERTIFICATION OF APPROVAL
I certify that I have read DIFFERENCE OF CONVEX FUNCTIONS by Miriam
Schussler and that in my opinion this work meets the criteria for approving a thesis
submitted in partial fulfillment of the requirements for the degree: Master of Arts
in Mathematics at San Francisco State University.
Sergei Ovchinnikov
Professor of Mathematics
Eric Hayashi
Alex Schuster
DIFFERENCE OF CONVEX FUNCTIONS
Miriam Schussler
San Francisco State University
2013
The main goal of this thesis is to describe a wide class of functions on a closed
bounded interval that are representable as the difference of two convex functions
(DC functions). The necessary background for this thesis is given in Chapter 1.
Then, in Chapter 2 we begin with three proofs that a piecewise linear function (PL
function) is a lattice polynomial on its linear components. We then show that any
PL function is representable as the difference of two convex functions. In Chapter 3
we investigate DC functions from an analytic standpoint. To motivate our proof we
represent a PL function as the indefinite integral of the step function which is its left
derivative. We then use the positive and negative variation of this step function to
construct two convex functions whose difference is equal to our original PL function.
This construction is possible because a PL function is the integral of a function of
bounded variation (BV functions). We conclude this thesis with an extension of this
proof to functions on a closed interval which are indefinite integrals of BV functions.
I certify that the Abstract is a correct representation of the content of this thesis.
Chair, Thesis Committee Date

ACKNOWLEDGMENTS
Foremost I would like to express my gratitude to Dr. Ovchinnikov for

suggesting my thesis topic and generously sharing his time and immense
knowledge as my advisor.
I would also like to offer my thanks to Dr. Hayashi and Dr. Schuster for
being on my thesis committee, and thank them for carefully reading my
thesis and suggesting improvements.
I have taken courses in analysis from all three members of my committee

and have benefited enormously. I would not have been able to complete
this thesis without the knowledge I gained there. Many thanks.
TABLE OF CONTENTS
1 Introduction
1.1 Definition and Examples of PL-functions
1.2 Lattices
1.2.1 R as a Distributive Lattice
1.2.2 Lattice polynomials
1.2.3 Translation Invariance
2 PL Functions Written as Polynomials
2.1 First Proof
2.2 Second Proof
2.3 Third Proof
2.4 PL Functions Written as the Difference of Two Convex PL Functions
3 Analytic Proofs
3.1 A PL function is the difference of the indefinite integrals of increasing step functions
3.2 Indefinite Integrals of BV Functions are DC Functions
Bibliography
LIST OF FIGURES
1.1 An example of a PL-function
2.1 A PL function translated by a line
2.2 Components of a PL function which satisfy the Separation Property
Chapter 1
Introduction
In this chapter we provide the definition of a piecewise linear function as well as

terms from lattice theory that will be necessary in what follows. Besides definitions,
three facts will be needed from this chapter. The first is that the real numbers
form a distributive lattice under the operations of maximum and minimum (Section
1.2.1). The second is that in any distributive lattice the object known as a lattice
polynomial can be represented as a meet of joins (Section 1.2.2). This fact is central
to the proof presented in Section 2.4. The last is translation invariance of lattice
polynomials. This is used in the first proof of Chapter 2 to show that PL functions
are representable as lattice polynomials.
1.1 Definition and Examples of PL-functions
Definition 1.1. Let R denote the set of real numbers, and let I = [a, b] be a closed,
bounded, nondegenerate interval in R. A function F on I is said to be piecewise
linear (in a narrow sense) if there is a strictly increasing sequence a = x0 , . . . , xn = b
of points in I and a sequence of (not necessarily distinct) linear functions G1 , . . . , Gn
on I such that
F (x) = Gk (x), if x ∈ Ik , for 1 ≤ k ≤ n,
where Ik = [xk−1 , xk ], 1 ≤ k ≤ n. We call each Gk a component of F . The graph

of F is a polygonal curve with vertices (xk , F (xk )), 0 ≤ k ≤ n, and edges ek defined
by equations y = Gk (x) for x ∈ Ik .
Note that according to our definition
$$G_k(x_k) = G_{k+1}(x_k),$$
for 1 ≤ k ≤ n − 1, because x_k belongs to both I_k and I_{k+1}. Therefore F is
automatically continuous. Definition 1.1 is referred to whenever “PL function” is
mentioned in this thesis.
Let us consider an example of a PL function.
Example 1.1. Let I = [1, 6], and x_0 = 1, x_1 = 2, x_2 = 4, x_3 = 5, x_4 = 6. The four
intervals I_1–I_4 are:
I_1 = [1, 2], I_2 = [2, 4], I_3 = [4, 5], I_4 = [5, 6].
The components G_1–G_4 of F are defined by the equations:
G_1(x) = 2x − 1, G_2(x) = 3, G_3(x) = −x + 7, G_4(x) = 2x − 8.
The function F defined by F(x) = G_k(x), if x ∈ I_k, for 1 ≤ k ≤ 4, is a PL function.
The graph of F is displayed in Figure 1.1.
Figure 1.1: An example of a PL-function.
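The data of Example 1.1 can also be checked mechanically. The following is a minimal Python sketch, not part of the original thesis; the names `BREAKPOINTS`, `COMPONENTS`, `pl_eval` and `is_continuous` are our own illustrative choices. It evaluates F from its breakpoints and components and confirms the matching condition at the interior vertices.

```python
from bisect import bisect_left

# Data of Example 1.1: breakpoints x_0..x_4 and components G_k(x) = slope * x + intercept.
BREAKPOINTS = [1, 2, 4, 5, 6]
COMPONENTS = [(2, -1), (0, 3), (-1, 7), (2, -8)]  # G_1, G_2, G_3, G_4


def pl_eval(x, breakpoints=BREAKPOINTS, components=COMPONENTS):
    """Evaluate F(x) = G_k(x) for x in I_k = [x_{k-1}, x_k]."""
    if not breakpoints[0] <= x <= breakpoints[-1]:
        raise ValueError("x lies outside the interval I")
    # Smallest k (1-based, as in the text) with x <= x_k; x = a is assigned to I_1.
    k = max(1, bisect_left(breakpoints, x))
    slope, intercept = components[k - 1]
    return slope * x + intercept


def is_continuous(breakpoints=BREAKPOINTS, components=COMPONENTS):
    """Check G_k(x_k) == G_{k+1}(x_k) at every interior breakpoint."""
    for k in range(1, len(components)):
        xk = breakpoints[k]
        left = components[k - 1][0] * xk + components[k - 1][1]
        right = components[k][0] * xk + components[k][1]
        if left != right:
            return False
    return True
```

For instance, `pl_eval(4.5)` evaluates G_3 at 4.5, and `is_continuous()` confirms that adjacent components agree at x_1, x_2 and x_3.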
1.2 Lattices
In this section we review the definition of a lattice and some properties of lattices
which will be used in Chapter 2.
Definition 1.2. [2, p.6] A lattice is a poset P any two of whose elements have a
greatest lower bound or “meet” denoted by x ∧ y, and a least upper bound or “join”
denoted by x ∨ y.
Lemma 1.1. [2, p.8] In any lattice the operations of meet and join satisfy the
following laws:
L1. x ∧ x = x, x ∨ x = x (Idempotent)
L2. x ∧ y = y ∧ x, x ∨ y = y ∨ x (Commutative)
L3. x ∧ (y ∧ z) = (x ∧ y) ∧ z, x ∨ (y ∨ z) = (x ∨ y) ∨ z (Associative)
L4. x ∧ (x ∨ y) = x ∨ (x ∧ y) = x (Absorption)
Moreover, x ≤ y is equivalent to each of the conditions
x ∧ y = x and x ∨ y = y.
Two additional distributivity properties hold for some lattices:
Definition 1.3. [2, p. 12] A lattice L is distributive if and only if the following
properties hold for all x, y, z ∈ L:
L5. x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)
L6. x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
It follows from L2 and L3 that we can denote the meet and join of a finite number
of elements by $\bigwedge$ and $\bigvee$ respectively.
Let J be a finite set of indices, and each S_j, j ∈ J, be a subset of {1, . . . , n}.
We omit brackets in expressions such as $\bigwedge_{j\in J}\bigl(\bigvee_{i\in S_j} a_i\bigr)$
and write $\bigwedge_{j\in J}\bigvee_{i\in S_j} a_i$.
1.2.1 R as a Distributive Lattice
The set of real numbers is totally ordered by ≤. Therefore a ∧ b and a ∨ b exist
for every a, b ∈ R, and R satisfies the definition of a lattice. Because R is totally
ordered, the least upper bound and greatest lower bound of two elements are the
maximum and minimum, respectively. Therefore we refer to the meet, ∧, as the
minimum and the join, ∨, as the maximum:
$$a \wedge b = \begin{cases} a, & \text{if } a \le b,\\ b, & \text{if } a > b, \end{cases}
\qquad\text{and}\qquad
a \vee b = \begin{cases} a, & \text{if } a \ge b,\\ b, & \text{if } a < b, \end{cases}$$
for a, b ∈ R.
Theorem 1.2. The set R, equipped with operations ∧ and ∨ is a distributive lattice.
Proof. The number x ∧ y is the lesser of x and y while x ∨ y is the greater of x and
y. Both x ∧ (y ∨ z) and (x ∧ y) ∨ (x ∧ z) are equal to x if x ≤ y or x ≤ z.
In the case that x is bigger than both y and z, both x ∧ (y ∨ z) and (x ∧ y) ∨ (x ∧ z) are
equal to y ∨ z. This proves L5. To prove the dual law L6 is a matter of exchanging ∨ for ∧,
“lesser of” for “greater of”, etc. (see footnote 1).
1.2.2 Lattice polynomials
Definition 1.4. [2, p.30] Expressions involving the symbols ∧, ∨ and letters are
called lattice polynomials. More formally, let us define individual letters x, y, z,
. . . as polynomials of weight one. Recursively, if p and q are lattice polynomials of
weights w and w′, respectively, then p ∧ q and p ∨ q are called lattice polynomials of
weight w + w′.
In this definition, the forms x ∧ x and x ∨ (y ∧ x) are considered as different
polynomials (of weights two and three, respectively), even though they are equivalent
in the sense that they represent the same function p : L → L in any lattice. In this
thesis we write p = q when p and q represent the same function.
Lemma 1.3. ([2, Lemma 3, p.30]) In a distributive lattice, every polynomial p is
equivalent to a meet of joins:
$$p(a_1, \dots, a_n) = \bigwedge_{j\in J}\bigvee_{i\in S_j} a_i,$$
where J is a set of indices and each S_j, j ∈ J, is a subset of the set {1, . . . , n}.
Footnote 1. In fact a lattice satisfies L5 if and only if it satisfies L6. Also, the proof
given here that R is a distributive lattice is valid not only for R but for any chain
[2, pp. 11-12].
1.2.3 Translation Invariance
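We will state without proof the following lemma:
Lemma 1.4. For a finite set of indices S, and u, a_i ∈ R, for all i ∈ S,
$$\Bigl(\bigwedge_{i\in S} a_i\Bigr) + u = \bigwedge_{i\in S} (a_i + u) \tag{i}$$
$$\Bigl(\bigvee_{i\in S} a_i\Bigr) + u = \bigvee_{i\in S} (a_i + u) \tag{ii}$$
Lemma 1.5. A lattice polynomial on R is translation invariant:
$$p(a_1, \dots, a_n) + u = p(a_1 + u, \dots, a_n + u).$$
Proof.
$$\begin{aligned}
p(a_1, \dots, a_n) + u
  &= \Bigl(\bigwedge_{j\in J}\bigvee_{i\in S_j} a_i\Bigr) + u && \text{by Lemma 1.3}\\
  &= \bigwedge_{j\in J}\Bigl(\bigvee_{i\in S_j} a_i + u\Bigr) && \text{by Lemma 1.4(i)}\\
  &= \bigwedge_{j\in J}\bigvee_{i\in S_j} (a_i + u) && \text{by Lemma 1.4(ii)}\\
  &= p(a_1 + u, \dots, a_n + u) && \text{by the definition of } p
\end{aligned}$$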
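Lemma 1.5 is easy to test numerically for any polynomial given in meet-of-joins form. The sketch below is our own illustration and not part of the thesis: the helper names `meet_of_joins` and `check_translation_invariance` are hypothetical, and the index sets passed in are an arbitrary choice of the caller. Integers are used so that the comparison is exact.

```python
import random


def meet_of_joins(values, index_sets):
    """p(a_1, ..., a_n) = min over j of max over i in S_j of a_i (indices are 1-based)."""
    return min(max(values[i - 1] for i in s) for s in index_sets)


def check_translation_invariance(index_sets, n, trials=1000, seed=0):
    """Test p(a) + u == p(a + u) on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.randint(-50, 50) for _ in range(n)]
        u = rng.randint(-50, 50)
        shifted = [x + u for x in a]
        if meet_of_joins(a, index_sets) + u != meet_of_joins(shifted, index_sets):
            return False
    return True
```

A random test of this kind does not replace the proof; it only guards against a misreading of the formula.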

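Lemma 1.6. If F, L, G_1, ..., G_n are functions on I, and
$$(F + L)(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} (G_i + L)(x),$$
then
$$F(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x).$$
Proof. If for each x ∈ I,
$$(F + L)(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} (G_i + L)(x),$$
then
$$\begin{aligned}
F(x) = (F + L)(x) - L(x)
  &= \bigwedge_{j\in J}\bigvee_{i\in S_j} (G_i + L)(x) - L(x)\\
  &= \bigwedge_{j\in J}\bigvee_{i\in S_j} \bigl(G_i(x) + L(x)\bigr) - L(x)\\
  &= \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x) + L(x) - L(x) && \text{by Lemma 1.5}\\
  &= \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x).
\end{aligned}$$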
Chapter 2
PL Functions Written as Polynomials
In this chapter we review three proofs that a PL function can be represented as a

lattice polynomial. Then, in the last section we prove that the set of convex functions
is closed under the operation ∨ and addition. Because any lattice polynomial on
the real numbers can be expressed as a meet of joins we are able to show that a PL
function is a DC function.
Although in this thesis we present all proofs for functions of one variable, the
advantage of using a lattice polynomial representation is that a multivariable analog
of the proof that PL functions are DC functions is possible with this approach.
Such a proof was given as Corollary 2.1 in [1]. This proof depends on the lattice
polynomial representation of a PL function. Of course this was stated and proved
in [1] but the proof was not rigorous. The proof of this fact given in Section 2.3 can
be extended to the multivariable case without significant modification to provide a
rigorous proof.
Figure 2.1: A PL function translated by a line
2.1 First Proof
Theorem 2.1. Let F be a PL-function on an interval I with components G1 , . . . , Gn .

There is a lattice polynomial p such that
F (x) = p(G1 (x), . . . , Gn (x)), for x ∈ I.
Proof. The proof is by induction on the number of components of F. For two
components, the assertion of the theorem is obvious because
F(x) = G_1(x) ∧ G_2(x) or F(x) = G_1(x) ∨ G_2(x)
in this case.
Let L be a linear function defined by
$$L(x) = F(a) + \frac{F(b) - F(a)}{b - a}\cdot(x - a).$$
For each 1 ≤ k ≤ n, we have F − L = G_k − L over I_k, so F − L is a PL function
with components G_1 − L, ..., G_n − L (see Figure 2.1). Without loss of generality we
can assume that (F − L)(x) ≠ 0 for some x ∈ I. We consider the case in which
(F − L)(x) > 0 for some x ∈ I. The maximum of a PL function occurs at a vertex. Because
(F − L)(a) = (F − L)(b) = 0, the maximum of F − L cannot be at an endpoint.
Therefore max{(F − L)(x)} = (F − L)(x_k) for some k ∈ {1, ..., n − 1}.
Let us define two PL-functions F_1 and F_2 on I by:
F_1(x) = G_i(x), for x ∈ I_i, 1 ≤ i ≤ k, F_1(x) = G_k(x), for x ∈ I_j, k < j ≤ n,
F_2(x) = G_{k+1}(x), for x ∈ I_i, 1 ≤ i ≤ k, F_2(x) = G_j(x), for x ∈ I_j, k < j ≤ n.
Suppose that the induction hypothesis holds when a PL function has fewer than
n components. We see that (F − L)(x) = (F_1 − L)(x) ∧ (F_2 − L)(x). Thus we can
write (F − L)(x) as a lattice polynomial and therefore as a meet of joins (Lemma 1.3):
$$(F - L)(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} (G_i - L)(x).$$
By translation invariance (Lemma 1.6),
$$F(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x).$$
If min{(F − L)(x)} < 0, we may choose the minimum rather than the maximum
value and the proof is similar.
2.2 Second Proof
For the second proof, we begin by introducing two sets in the plane defined by a
function F on I. The epigraph of F is a subset of the plane defined by
epi(F) = {(x, y) ∈ R² : y ≥ F(x), x ∈ I}.
Likewise, the hypograph of F is defined by
hyp(F) = {(x, y) ∈ R² : y ≤ F(x), x ∈ I}.
The intersection of the two sets, epi(F) ∩ hyp(F), is the graph of F.
There is a relationship between the operations of union/intersection and
minimum/maximum:
epi(F) ∩ epi(G) = epi(F ∨ G) and epi(F) ∪ epi(G) = epi(F ∧ G).
Here is a short proof of the first relation:
(x, y) ∈ epi(F ) ∩ epi(G) ⇔ (x, y) ∈ epi(F ) and (x, y) ∈ epi(G)
⇔ y ≥ F (x) and y ≥ G(x)
⇔ y ≥ F (x) ∨ G(x)
⇔ (x, y) ∈ epi(F ∨ G),
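By induction, we immediately obtain
$$\bigcap_{j\in J} \operatorname{epi}(F_j) = \operatorname{epi}\Bigl(\bigvee_{j\in J} F_j\Bigr)
\quad\text{and}\quad
\bigcup_{j\in J} \operatorname{epi}(F_j) = \operatorname{epi}\Bigl(\bigwedge_{j\in J} F_j\Bigr), \tag{2.1}$$
for a finite family of functions {F_j}_{j∈J} on I. (For hypographs we have ‘dual’
relations: hyp(F) ∩ hyp(G) = hyp(F ∧ G) and hyp(F) ∪ hyp(G) = hyp(F ∨ G).)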
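Only the first of the two epigraph relations is proved in the text. For completeness, the union relation can be derived in the same way; the following LaTeX block is our own filling-in of that step. The third equivalence uses the fact that R is totally ordered, so y is at least the minimum of two numbers exactly when it is at least one of them.

```latex
\begin{align*}
(x, y) \in \operatorname{epi}(F) \cup \operatorname{epi}(G)
  &\iff (x, y) \in \operatorname{epi}(F) \text{ or } (x, y) \in \operatorname{epi}(G) \\
  &\iff y \ge F(x) \text{ or } y \ge G(x) \\
  &\iff y \ge F(x) \wedge G(x) \\
  &\iff (x, y) \in \operatorname{epi}(F \wedge G).
\end{align*}
```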
We now find an expression for the epigraph of a PL function in terms of unions
and intersections, and then use the relationship in Equation 2.1 to write the function
as a lattice polynomial.
Definition 2.1. Let F be a PL-function on an interval I. A point B on the graph
of F is visible from a point A ∈ epi(F) if the closed line segment AB belongs to
epi(F). We say that an edge of the graph of F is visible from a point A ∈ epi(F) if
there is an inner point B of the edge which is visible from A.
For a point A ∈ epi(F), let J_A be a subset of {1, . . . , n} such that an edge e_k
is visible from A if and only if k ∈ J_A. A line connecting A to any point B in the
hypograph intersects the graph of F. If the intersection of the line AB with the graph
of F is at a vertex, at least one of the edges adjacent to that vertex is visible.
Therefore the set J_A is not empty.
Lemma 2.2. Let $F_A = \bigcap_{j\in J_A} \operatorname{epi}(G_j)$. Then
F_A ⊆ epi(F).
Proof. The set F_A is not empty because it contains A. Suppose that there is a point
B such that B ∈ F_A and B ∉ epi(F). Then B belongs to the interior of hyp(F).
Let C be the point closest to A at which the segment AB intersects the graph of F.
The point C exists because AB and the graph of F are closed subsets of the plane.
There is an edge e_k which contains C. Therefore, e_k is visible from A but B is
in the interior of hyp(G_k). This contradicts our assumption that B ∈ F_A. Hence,
F_A ⊆ epi(F).
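Now we present the second proof of Theorem 2.1.
Proof. By Lemma 2.2, and because A ∈ F_A for every A ∈ epi(F), we have
$\bigcup_{A\in \operatorname{epi}(F)} F_A = \operatorname{epi}(F)$. Since each J_A is a
subset of a finite set, there are only a finite number of distinct sets J_A. Let
{S_j}_{j∈J} be the finite family of distinct subsets J_A. Then we have
$$\operatorname{epi}(F) = \bigcup_{A\in \operatorname{epi}(F)} F_A = \bigcup_{j\in J}\bigcap_{i\in S_j} \operatorname{epi}(G_i).$$
By (2.1),
$$\operatorname{epi}(F) = \operatorname{epi}\Bigl(\bigwedge_{j\in J}\bigvee_{i\in S_j} G_i\Bigr).$$
Certainly epi(F) = epi(G) implies F = G. Therefore
$$\operatorname{epi}(F) = \operatorname{epi}\Bigl(\bigwedge_{j\in J}\bigvee_{i\in S_j} G_i\Bigr)
\quad\text{implies}\quad
F(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x).$$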
Thus we have obtained a representation of F as a lattice polynomial.
2.3 Third Proof
Theorem 2.3. Let F be a PL function on the interval I = [a, b] with components
G_1, . . . , G_n. Then F has the following two properties:
1. G_k(a) ≤ F(a) and G_k(b) ≥ F(b), for some 1 ≤ k ≤ n.
2. G_j(a) ≥ F(a) and G_j(b) ≤ F(b), for some 1 ≤ j ≤ n.
In what follows, the above properties are called Separation Properties.
Figure 2.2: Components of a PL function which satisfy the Separation Property
Proof. Let F be a PL function on I. Let L be the line segment with endpoints
(a, F(a)) and (b, F(b)). By this same symbol L we denote the linear function whose
graph contains this segment. The set S = {x ∈ I : x > a and F(x) = L(x)}
contains b, so it is not empty. If L = G_i for some i, then G_i satisfies both Separation
Properties.
If L is different from every G_i then L intersects each G_i at most once on I, so S
is finite. Let c = min S. Since F and L are continuous,
F(x) ≥ L(x) for all x ∈ [a, c], or F(x) ≤ L(x) for all x ∈ [a, c].
We consider the case in which the graph of F is above the graph of L on [a, c]. In
this case, G_1(x) > L(x) for all x > a. Therefore G_1(b) > L(b) = F(b) and G_1
satisfies the first Separation Property. To find a component of F which satisfies
the second Separation Property, choose the smallest k such that c ∈ I_k. We have
G_k(c) = L(c), so G_k(x) > L(x) for all x < c, and G_k(x) < L(x) for all x > c.
Therefore G_k(a) ≥ F(a) and G_k(b) ≤ F(b), so G_k satisfies the second Separation
Property.
In the case that F(x) ≤ L(x) on [a, c], G_k satisfies the first Separation Property
and G_1 the second.
Using the Separation Properties we give our final proof of Theorem 2.1.
Proof. For a given c ∈ I = [a, b], we define a set
S_c = {i ∈ {1, . . . , n} : G_i(c) ≤ F(c)}.
For each c there is a component G_i such that G_i(c) = F(c), so S_c ≠ ∅. By the
Separation Property applied either to the interval [x, c] (or to [c, x]), x ∈ I, we have
$$\bigvee_{i\in S_c} G_i(x) \ge F(x).$$
Since F(c) = G_k(c) for some k ∈ S_c, we have
$$\bigvee_{i\in S_c} G_i(c) = F(c).$$
It follows that
$$\bigwedge_{c\in I}\bigvee_{i\in S_c} G_i(x) = F(x).$$
There are only a finite number of distinct sets S_c. Let {S_j}_{j∈J} be the family of distinct
sets S_c. Then we have the following representation of F by a lattice polynomial:
$$F(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x).$$
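The third proof is constructive, so it can be tried on Example 1.1. The following Python sketch is our own illustration and rests on two assumptions: the sets S_c are collected only over a finite sample of points c (a grid with spacing 1/8, which happens to meet every region where S_c is constant in this example), and exact rational arithmetic is used so that the comparison G_i(c) ≤ F(c) is not affected by rounding. All function names are hypothetical.

```python
from fractions import Fraction

# Data of Example 1.1; component G_i(x) = slope * x + intercept.
BREAKPOINTS = [1, 2, 4, 5, 6]
COMPONENTS = [(2, -1), (0, 3), (-1, 7), (2, -8)]


def component(i, x):
    """Value G_i(x), with i counted from 1 as in the text."""
    slope, intercept = COMPONENTS[i - 1]
    return slope * x + intercept


def pl_eval(x):
    """Value F(x) = G_k(x) for x in I_k."""
    for k in range(1, len(BREAKPOINTS)):
        if BREAKPOINTS[k - 1] <= x <= BREAKPOINTS[k]:
            return component(k, x)
    raise ValueError("x lies outside the interval I")


def index_sets(sample_points):
    """Distinct sets S_c = {i : G_i(c) <= F(c)} over the sampled points c."""
    n = len(COMPONENTS)
    return {
        frozenset(i for i in range(1, n + 1) if component(i, c) <= pl_eval(c))
        for c in sample_points
    }


def lattice_polynomial(x, sets):
    """Meet over the sets S_j of the join of G_i(x), i in S_j."""
    return min(max(component(i, x) for i in s) for s in sets)


GRID = [Fraction(1) + Fraction(k, 8) for k in range(41)]  # 1, 1 + 1/8, ..., 6
SETS = index_sets(GRID)
```

On this example the grid yields the sets {1, 4}, {1, 2, 4}, {2, 4}, {2, 3, 4} and {3, 4}. By absorption the supersets can be dropped, which leaves F = (G_1 ∨ G_4) ∧ (G_2 ∨ G_4) ∧ (G_3 ∨ G_4).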
2.4 PL Functions Written as the Difference of Two Convex
PL Functions
Convex PL-functions Written as the Sum of Lattice Polynomials
We begin by recalling some basic definitions and properties of convex sets and
functions.
An arbitrary function F on an interval I is said to be convex if, for any x, y ∈ I, each
point on the chord between (x, F(x)) and (y, F(y)) lies on or above the graph of F
[3, p.113].
The following lemma is a well known fact that we state without proof.
Lemma 2.4. A function f on an interval I is convex if and only if the set epi(f )
is convex.
The chord with endpoints (x, F(x)) and (y, F(y)) is a subset of the plane of
the form
{(tx + (1 − t)y, tF(x) + (1 − t)F(y)) ∈ R² : 0 ≤ t ≤ 1}.
Thus the convexity property of functions can be formulated analytically as follows:
F (tx + (1 − t)y) ≤ tF (x) + (1 − t)F (y)
for all x, y ∈ I and 0 ≤ t ≤ 1.
Note. It is clear that any linear function is convex.
Lemma 2.5. The family of convex functions on I is closed under addition and
operation ∨.
Proof. For two convex functions F, G we have
$$\begin{aligned}
(F + G)(tx + (1 - t)y) &= F(tx + (1 - t)y) + G(tx + (1 - t)y)\\
  &\le tF(x) + (1 - t)F(y) + tG(x) + (1 - t)G(y)\\
  &= t\bigl(F(x) + G(x)\bigr) + (1 - t)\bigl(F(y) + G(y)\bigr).
\end{aligned}$$
Thus F + G is a convex function.
Recall that epi(F ∨ G) = epi(F) ∩ epi(G). By Lemma 2.4 the sets epi(F) and epi(G)
are convex, and hence their intersection is convex. Therefore the function F ∨ G is convex.
Theorem 2.6. A PL function on an interval I is representable as the difference of

two convex PL functions.
Proof. Suppose F is a PL function such that F(x) = G_k(x) on I_k and G_k has slope
m_k. Let M = max |m_k|.
Let H_1(x) = 0 and H_k(x) = 2kM(x − x_{k−1}) + H_{k−1}(x_{k−1}) for 1 < k ≤ n. Define
H by H(x) = H_k(x) for x ∈ I_k. The slopes of H on successive intervals are
nondecreasing, so H is a convex function [3, 5.18].
For k ≥ 2 the function F + H has slope 2kM + m_k on I_k and 2(k + 1)M + m_{k+1} on
I_{k+1}. Since 2kM + m_k ≤ (2k + 1)M and 2(k + 1)M + m_{k+1} ≥ (2k + 1)M, and since
the slope m_1 of F + H on I_1 satisfies m_1 ≤ M ≤ 4M + m_2, the
slopes of successive components, and therefore the left hand derivative of F + H, are
nondecreasing. Thus for the same reason as for H, the function F + H is convex.
We have F = (F + H) − H, and both F + H and H are convex functions.
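The construction of H depends only on the slopes of F, so it can be illustrated on Example 1.1, whose slopes are 2, 0, −1, 2. The sketch below is ours and not part of the thesis; `h_slopes` and `is_nondecreasing` are hypothetical helper names. It relies on the criterion used in the proof: a continuous PL function is convex when the slopes of its successive components are nondecreasing.

```python
SLOPES = [2, 0, -1, 2]  # m_1, ..., m_4 for Example 1.1


def h_slopes(slopes):
    """Slopes of H: 0 on I_1 and 2kM on I_k for k >= 2, where M = max |m_k|."""
    big_m = max(abs(m) for m in slopes)
    return [0 if k == 1 else 2 * k * big_m for k in range(1, len(slopes) + 1)]


def is_nondecreasing(seq):
    """True when each entry is at most the next one."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


H_SLOPES = h_slopes(SLOPES)
SUM_SLOPES = [m + h for m, h in zip(SLOPES, H_SLOPES)]  # slopes of F + H
```

Here M = 2, the slopes of H are 0, 8, 12, 16 and those of F + H are 2, 8, 11, 18, so both sequences are nondecreasing while the slopes of F itself are not.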
The preceding proof is sufficient for a PL function of one variable. Unfortunately
there is no way to easily extend it to the multivariable case. The next proof does
not have this limitation.
Proof. By Theorem 2.1 and Lemma 1.3,
$$F(x) = \bigwedge_{j\in J}\bigvee_{i\in S_j} G_i(x),$$
for some family {S_j}_{j∈J} of subsets of {1, . . . , n}. For each j ∈ J let us define the
function H_j by
$$H_j(x) = \bigvee_{i\in S_j} G_i(x).$$
By Lemma 2.5, the functions H_j are convex.
We have the following identity:
$$H_j(x) = \sum_{k\in J} H_k(x) - \sum_{\substack{k\in J\\ k\ne j}} H_k(x).$$
Therefore,
$$F(x) = \bigwedge_{j\in J} H_j(x) = \sum_{k\in J} H_k(x) - \bigvee_{j\in J}\sum_{\substack{k\in J\\ k\ne j}} H_k(x).$$
By Lemma 2.5, $\sum_{k\in J} H_k(x)$ and $\bigvee_{j\in J}\sum_{k\in J,\, k\ne j} H_k(x)$
are both convex PL functions.
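The identity used in this proof can be checked on Example 1.1. The sketch below is our own illustration. It assumes the meet-of-joins form F = (G_1 ∨ G_4) ∧ (G_2 ∨ G_4) ∧ (G_3 ∨ G_4), which we worked out by hand for this example and which is not stated in the thesis; the names `join`, `convex_minuend` and `convex_subtrahend` are hypothetical.

```python
from fractions import Fraction

# Components of Example 1.1; G_i(x) = slope * x + intercept.
COMPONENTS = [(2, -1), (0, 3), (-1, 7), (2, -8)]
# Assumption: index sets of a meet-of-joins form of F, found by hand for this example.
INDEX_SETS = [{1, 4}, {2, 4}, {3, 4}]


def component(i, x):
    slope, intercept = COMPONENTS[i - 1]
    return slope * x + intercept


def join(j, x):
    """H_j(x): the maximum of G_i(x) over i in S_j (j is a 0-based position)."""
    return max(component(i, x) for i in INDEX_SETS[j])


def meet_of_joins(x):
    """F(x) as the minimum of the H_j(x)."""
    return min(join(j, x) for j in range(len(INDEX_SETS)))


def convex_minuend(x):
    """Sum of all H_k(x)."""
    return sum(join(k, x) for k in range(len(INDEX_SETS)))


def convex_subtrahend(x):
    """Maximum over j of the sum of H_k(x) with k different from j."""
    indices = range(len(INDEX_SETS))
    return max(sum(join(k, x) for k in indices if k != j) for j in indices)
```

The difference `convex_minuend(x) - convex_subtrahend(x)` equals `meet_of_joins(x)` at every x, since subtracting the largest of the partial sums from the full sum leaves the smallest H_j.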
Chapter 3
Analytic Proofs
In Chapter 2 we used a lattice polynomial representation of a PL function F to

construct convex PL functions G and H such that F = G − H. A natural question
is what other functions can be represented as the difference of convex functions.
In the last section of this chapter we provide a partial answer. In section 3.1, we
represent a PL function as an indefinite integral of a step function and use this
representation to write it as a DC function. We construct it in this way to provide
motivation for the classification given in the last section. Note that in all cases
where an integral is used we refer to the Lebesgue integral.
3.1 A PL function is the difference of the indefinite integrals
of increasing step functions.
We will need the following lemmas in order to represent a PL function as an integral.
Lemma 3.1. Given a step function f on I = [a, b], the function F defined by
$$F(x) = \int_a^x f$$
is a PL function.
Proof. Let a = x_0 < x_1 < ... < x_n = b be a partition of [a, b] such that
f(x) = m_k for all x ∈ (x_{k−1}, x_k),
for some real number m_k, 1 ≤ k ≤ n. Then
$$\int_{x_{k-1}}^{x_k} f = m_k (x_k - x_{k-1}).$$
From the additivity of the integral,
$$\int_a^x f = \int_a^{x_{k-1}} f + \int_{x_{k-1}}^x f = \int_a^{x_{k-1}} f + m_k (x - x_{k-1}), \quad\text{for each } x \in [x_{k-1}, x_k].$$
For each k the quantity $\int_a^{x_{k-1}} f$ is constant. Therefore the last equality shows that
$\int_a^x f$ defines a linear function on each interval [x_{k−1}, x_k]. It is clear that the integral
is a PL function on I.
In the next two proofs (Lemma 3.2 and our second proof of Theorem 2.6) we
refer to the notation given in the following paragraphs.
Suppose F is a PL function on I = [a, b] with components G_1, ..., G_n. We denote
the value of the constant function G′_k by the real number m_k, 1 ≤ k ≤ n.
We define two increasing step functions p and n on I as follows:
$$p(x) = \frac{1}{2}\sum_{i=2}^{k} \bigl[(m_i - m_{i-1}) + |m_i - m_{i-1}|\bigr], \quad\text{if } x \in (x_{k-1}, x_k], \text{ for } 2 \le k \le n,$$
$$p(x) = 0, \quad\text{for } x \in [x_0, x_1],$$
and
$$n(x) = \frac{1}{2}\sum_{i=2}^{k} \bigl[|m_i - m_{i-1}| - (m_i - m_{i-1})\bigr], \quad\text{if } x \in (x_{k-1}, x_k], \text{ for } 2 \le k \le n,$$
$$n(x) = 0, \quad\text{for } x \in [x_0, x_1].$$
Lemma 3.2. Every PL function F on the interval I = [a, b] is the indefinite integral
of a left continuous step function.
Proof. We define a left continuous step function f by f(x) = m_k for x ∈ (x_{k−1}, x_k],
1 ≤ k ≤ n, and f(a) = m_1.
Note that $\int_{x_{k-1}}^x f = m_k (x - x_{k-1})$ for each x in I_k = [x_{k−1}, x_k]. For each k and
x ∈ I_k,
$$F(x) = G_k(x) = F(x_{k-1}) + m_k (x - x_{k-1}). \tag{3.1}$$
Suppose for some k, $F(x) = F(a) + \int_a^x f$ for all x in I_k. Then for all x in I_{k+1},
$$\begin{aligned}
F(x) &= F(x_k) + m_{k+1}(x - x_k)\\
  &= F(a) + \int_a^{x_k} f + \int_{x_k}^x f\\
  &= F(a) + \int_a^x f.
\end{aligned}$$
Since $F(x) = F(a) + \int_{x_0 = a}^x f$ for all x ∈ I_1, by induction we have for each x ∈ I,
$$F(x) = F(a) + \int_a^x f. \tag{3.2}$$
Lemma 3.3. If a function is the indefinite integral of an increasing function on
I = [a, b] then it is convex on I.
Proof. If f is an integrable function on [x_1, x_2] and is bounded below and above by
real numbers M_1 and M_2 respectively, then
$$M_1 (x_2 - x_1) \le \int_{x_1}^{x_2} f \le M_2 (x_2 - x_1)$$
(see [3, Proposition 4.15 iii]).
Suppose f is an increasing function on [x_1, x_2]. Then it is bounded below by
f(x_1) and above by f(x_2).
Let $F(x) = \int_a^x f$, x ∈ I. For any x_1 < x_2 in I, and any y ∈ [x_1, x_2] we have
$$\int_{x_1}^{y} f = F(y) - F(x_1) = c_0 (y - x_1) \quad\text{for some } c_0 \in [f(x_1), f(y)] \tag{i}$$
$$\int_{y}^{x_2} f = F(x_2) - F(y) = c_1 (x_2 - y) \quad\text{for some } c_1 \in [f(y), f(x_2)] \tag{ii}$$
$$\int_{x_1}^{x_2} f = c_0 (y - x_1) + c_1 (x_2 - y) \tag{iii}$$
Let L be the linear function whose graph contains the points (x_1, F(x_1)) and
(x_2, F(x_2)). Then
$$\begin{aligned}
L(y) &= F(x_1) + \frac{F(x_2) - F(x_1)}{x_2 - x_1}\cdot(y - x_1)\\
  &= F(x_1) + \frac{c_0 (y - x_1) + c_1 (x_2 - y)}{x_2 - x_1}\cdot(y - x_1) && \text{(equation iii)}\\
  &= F(x_1) + \frac{y - x_1}{x_2 - x_1}\cdot\bigl[c_0 (y - x_1) + c_1 (x_2 - y)\bigr].
\end{aligned}$$
To establish convexity of F we need to show that F(y) ≤ L(y):
$$\begin{aligned}
F(y) &= F(x_1) + (y - x_1) c_0\\
  &= F(x_1) + \frac{y - x_1}{x_2 - x_1}\cdot\bigl[c_0 (x_2 - x_1)\bigr]\\
  &= F(x_1) + \frac{y - x_1}{x_2 - x_1}\cdot\bigl[c_0 (y - x_1) + c_0 (x_2 - y)\bigr]\\
  &\le F(x_1) + \frac{y - x_1}{x_2 - x_1}\cdot\bigl[c_0 (y - x_1) + c_1 (x_2 - y)\bigr] = L(y).
\end{aligned}$$
The last inequality holds because f is an increasing function, so c_0 ≤ f(y) ≤ c_1. Therefore,
F is convex.
Now we proceed with an analytic proof of Theorem 2.6.
Proof. By replacing f with f(a) + p − n in Equation 3.2, we write:
$$F(x) = F(a) + \int_a^x \bigl(f(a) + p\bigr) - \int_a^x n. \tag{3.3}$$
Let $F_1(x) = F(a) + \int_a^x (f(a) + p)$ and $F_2(x) = \int_a^x n$. By Lemma 3.1, F_1
and F_2 are PL functions. To show convexity of F_1 and F_2 note that f(a) + p and
n are increasing functions. Thus by Lemma 3.3, $\int_a^x (f(a) + p)$ and F_2 are convex
functions. Recall that the sum of two convex functions is convex, so F_1 is also
convex. Therefore Equation 3.3 shows F written as the difference of two convex PL
functions.
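The step functions p and n are the accumulated positive and negative jumps of the slopes, which makes the replacement of f by f(a) + p − n easy to verify on the slopes 2, 0, −1, 2 of Example 1.1. The sketch below is our own illustration; `variation_steps` is a hypothetical helper that returns the values of p and n on the intervals I_1, ..., I_n.

```python
from itertools import accumulate

SLOPES = [2, 0, -1, 2]  # m_1, ..., m_4 for Example 1.1


def variation_steps(slopes):
    """Values of p and n on I_1, ..., I_n, following the two defining sums."""
    jumps = [b - a for a, b in zip(slopes, slopes[1:])]  # m_i - m_{i-1}, i >= 2
    p = [0] + list(accumulate((j + abs(j)) / 2 for j in jumps))
    n = [0] + list(accumulate((abs(j) - j) / 2 for j in jumps))
    return p, n


P, N = variation_steps(SLOPES)
# f = f(a) + p - n, where f(a) = m_1.
RECONSTRUCTED = [SLOPES[0] + pk - nk for pk, nk in zip(P, N)]
```

For these slopes p takes the values 0, 0, 0, 3 and n takes the values 0, 2, 3, 3; both are nondecreasing, and m_1 + p − n recovers 2, 0, −1, 2.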
3.2 Indefinite Integrals of BV Functions are DC Functions
The second proof of Theorem 2.6 depends on the fact that we could write the step
function f as the difference of two increasing functions. In fact we can write any
function of bounded variation (a BV function) as the difference of two increasing
functions [3, Theorem 5.5]. Perhaps the class of functions which is representable as
the difference of convex functions consists of just those functions which are indefinite
integrals of BV functions. Unfortunately, as the next example shows, this is not the
case.
Example 3.1. Let $F(x) = -\sqrt{1 - x^2}$.
F is certainly a convex function defined on the closed interval [−1, 1]. On any
closed interval [a′, b′] ⊂ (−1, 1), F can be written as $F(x) = F(a') + \int_{a'}^x F'$. Yet, there is no
increasing function f on [−1, 1] such that $F(x) = F(-1) + \int_{-1}^x f$. The problem occurs at
x = −1 and x = 1 where the derivative of F does not exist.
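The failure of differentiability at the endpoint can be seen numerically. The following sketch is our own illustration, using floating point arithmetic and the hypothetical helper names `f` and `right_quotient_at_minus_one`. It evaluates the one-sided difference quotient of F at x = −1, which equals −√(2h − h²)/h and therefore behaves like −√(2/h) as h → 0.

```python
import math


def f(x):
    """F(x) = -sqrt(1 - x^2) on [-1, 1]."""
    return -math.sqrt(1.0 - x * x)


def right_quotient_at_minus_one(h):
    """Difference quotient (F(-1 + h) - F(-1)) / h for a small step h > 0."""
    return (f(-1.0 + h) - f(-1.0)) / h


STEPS = [1e-2, 1e-4, 1e-6]
QUOTIENTS = [right_quotient_at_minus_one(h) for h in STEPS]
```

The quotients decrease without bound as h shrinks, which is consistent with the claim that no finite derivative exists at x = −1.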
Indeed, if in addition to convexity we require that a function is differentiable at
the endpoints of [a, b] then we can write the function as the indefinite integral of
an increasing function. We show this in the next proof.
Theorem 3.4. A function F is the indefinite integral of an increasing function on
I = [a, b] if and only if it is convex on I and differentiable at the endpoints of I.
Proof. Suppose $F(x) = F(a) + \int_a^x f$ where f is an increasing function. To show
differentiability at the endpoints we need to show that
$$\lim_{x\to b} \frac{1}{b - x}\bigl(F(b) - F(x)\bigr) \quad\text{and}\quad \lim_{x\to a} \frac{1}{x - a}\bigl(F(x) - F(a)\bigr)$$
exist and are finite.
From Lemma 3.3 we know F is a convex function. For a convex function F on
[a, b], the functions $\frac{F(x) - F(a)}{x - a}$ and $\frac{F(b) - F(x)}{b - x}$ are increasing functions of x
[3, Theorem 5.16]. We have assumed that F is an indefinite integral of an increasing function f.
Therefore
$$\frac{1}{b - x}\bigl(F(b) - F(x)\bigr) = \frac{1}{b - x}\int_x^b f \le f(b).$$
The function f is increasing on [a, b] so the left hand limit exists at each point and
is finite. Since the quotient above is increasing in x and bounded above by f(b),
$F'(b) = \lim_{x\to b} \frac{1}{b - x}\bigl(F(b) - F(x)\bigr)$ exists. Similarly, F′(a) exists.
For the converse, assume F is convex on [a, b] and F′(a) and F′(b) exist. For
a convex function on (a, b) the right and left hand derivatives of F exist and are
increasing on (a, b). Since F′(a) and F′(b) exist, we know that both left and right
hand derivatives are bounded on [a, b]. This implies that F is absolutely continuous
on [a, b] [3, Chapter 5, Problem 20] and therefore is an indefinite integral of its
derivative [3, Theorem 5.14]. Since F is continuous at a, $F(x) = F(a) + \int_a^x f$ where
f is the left hand derivative of F on (a, b] and f(a) = F′(a).
Theorem 3.5. Consider the class of functions F on [a, b] representable in the form
F = G − H where G and H are convex and differentiable at the endpoints of [a, b].
Then a function belongs to this class if and only if it is an indefinite integral of a
function of bounded variation.
Proof. First we assume that the function F on [a, b] is representable in the form
F = G − H where G and H are convex and differentiable at the endpoints of [a, b].
By Theorem 3.4, $G(x) = G(a) + \int_a^x g$ and $H(x) = H(a) + \int_a^x h$ where g and h are
increasing functions. Then
$$(G - H)(x) = G(a) - H(a) + \int_a^x (g - h).$$
The function (g − h) is the difference of two monotone functions and therefore is a
BV function.
To prove the converse, we assume $F(x) = F(a) + \int_a^x f$ where f is a BV function.
Then f = g − h and $\int f = \int g - \int h$ for some increasing functions g and h. By
Theorem 3.4, $G(x) = \int_a^x g$ and $H(x) = \int_a^x h$ define convex functions G and H which
are differentiable at a and b. Therefore F = (F(a) + G) − H is the difference of
convex functions that are differentiable at the endpoints.
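The converse direction can be imitated on a grid. The sketch below is a discrete analogue only and is not a construction from the thesis: the integrand f(x) = x³ − x on [−1, 1], the grid size N, the left Riemann sums standing in for the integrals, and all names are our own assumptions. Exact rational arithmetic keeps every identity free of rounding error.

```python
from fractions import Fraction
from itertools import accumulate

N = 40  # number of grid cells (an arbitrary choice)
A, B = Fraction(-1), Fraction(1)
DX = (B - A) / N
XS = [A + k * DX for k in range(N + 1)]


def f(x):
    """Hypothetical integrand of bounded variation on [-1, 1]."""
    return x**3 - x


FS = [f(x) for x in XS]
JUMPS = [b - a for a, b in zip(FS, FS[1:])]
# Discrete Jordan decomposition f = g - h with g and h nondecreasing.
G_DENSITY = [FS[0] + s for s in accumulate((max(j, 0) for j in JUMPS), initial=0)]
H_DENSITY = list(accumulate((max(-j, 0) for j in JUMPS), initial=0))


def left_riemann(values):
    """Left Riemann sums: a grid stand-in for the indefinite integral from a."""
    return list(accumulate((v * DX for v in values[:-1]), initial=0))


def is_discretely_convex(values):
    """True when every second difference on the grid is nonnegative."""
    return all(
        values[k - 1] - 2 * values[k] + values[k + 1] >= 0
        for k in range(1, len(values) - 1)
    )


F_VALUES = left_riemann(FS)
G_VALUES = left_riemann(G_DENSITY)
H_VALUES = left_riemann(H_DENSITY)
```

On the grid, `F_VALUES` equals `G_VALUES` minus `H_VALUES` entry by entry, and both `G_VALUES` and `H_VALUES` pass the second difference test although `F_VALUES` does not.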
Bibliography
[1] S. Bartels, Ludwig Kuntz, and Stefan Scholtes, Continuous selections of linear
functions and nonsmooth critical point theory, Nonlinear Analysis 24 (1995),
385–407.
[2] Garrett Birkhoff, Lattice theory, third ed., American Mathematical Society
Colloquium Publications, vol. 25, American Mathematical Society, Providence, R.I.,
1979. MR 598630 (82a:06001)
[3] H. L. Royden, Real analysis, third ed., Macmillan Publishing Company, New
York, 1988. MR 1013117 (90g:00004)
