Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Analysis
Gerald Teschl
Gerald Teschl
Fakult
at f
ur Mathematik
Oskar-Mogenstern-Platz 1
Universit
at Wien
1090 Wien, Austria
E-mail: Gerald.Teschl@univie.ac.at
URL: http://www.mat.univie.ac.at/~gerald/
Contents
Preface
vii
Introduction
Chapter 1.
3
3
9
1.1.
1.2.
20
1.3.
28
1.4.
Completeness
34
1.5.
Bounded operators
35
1.6.
38
1.7.
39
Chapter 2.
Hilbert spaces
43
2.1.
Orthonormal bases
43
2.2.
48
2.3.
50
2.4.
52
Chapter 3.
Compact operators
55
3.1.
Compact operators
55
3.2.
57
iii
iv
Contents
3.3.
60
Chapter
4.1.
4.2.
4.3.
65
65
71
79
Chapter
5.1.
5.2.
5.3.
87
87
90
93
Chapter
6.1.
6.2.
6.3.
6.4.
97
97
102
104
107
113
113
121
125
127
130
137
140
146
150
Chapter
8.1.
8.2.
8.3.
8.4.
8.5.
155
155
157
161
163
168
171
171
Contents
9.2.
Derivatives of measures
174
9.3.
Complex measures
180
9.4.
186
Chapter 10.
10.1.
10.2.
The dual of Lp
193
The dual of
Lp ,
p<
193
The dual of
194
Chapter 11.
199
L1
L2
11.1.
11.2.
207
11.3.
Sobolev spaces
212
11.4.
216
Chapter 12.
and
Interpolation
199
225
12.1.
225
12.2.
229
237
13.1.
237
13.2.
Contraction principles
241
13.3.
243
Chapter 14.
247
14.1.
Introduction
247
14.2.
249
14.3.
253
14.4.
259
14.5.
262
14.6.
265
14.7.
267
Chapter 15.
269
15.1.
269
15.2.
Compact maps
270
vi
Contents
15.3.
15.4.
15.5.
Chapter 16.
271
274
277
16.1.
277
16.2.
278
16.3.
283
Chapter 17.
Monotone maps
287
17.1.
Monotone maps
287
17.2.
289
17.3.
291
Appendix A.
Zorns lemma
293
Bibliography
295
Glossary of notation
297
Preface
The present manuscript was written for my course Functional Analysis given
at the University of Vienna in winter 2004 and 2009. It was adapted and
extended for a course Real Analysis given in summer 2011. The last part are
the notes for my course Nonlinear Functional Analysis held at the University
of Vienna in Summer 1998 and 2001. The three parts are to a large extent
independent. In particular, the first part does not assume any knowledge
from measure theory (at the expense of not mentioning Lp spaces).
It is updated whenever I find some errors and extended from time to
time. Hence you might want to make sure that you have the most recent
version, which is available from
http://www.mat.univie.ac.at/~gerald/ftp/book-fa/
Please do not redistribute this file or put a copy on your personal
webpage but link to the page above.
Acknowledgments
I wish to thank my students, Kerstin Ammann, Phillip Bachler, Alexander Beigl, Peng Du, Christian Ekstrand, Damir Ferizovic, Michael Fischer,
Melanie Graf, Matthias Hammerl, Florian Kogelbauer, Helge Kr
uger, Reinhold K
ustner, Oliver Leingang, Joris Mestdagh, Alice Mikikits-Leitner, Caroline Moosm
uller, Piotr Owczarek, Martina Pflegpeter, Tobias Preinerstorfer, Christian Schmid, Vincent Valmorin, Richard Welke, David Wimmesberger, Song Xiaojun, Rudolf Zeidler, and colleagues Nils C. Framstad,
Heinz Hanmann, Aleksey Kostenko, Wallace Lam, Daniel Lenz, Johanna
Michor, Alex Strohmaier, and Hendrik Vogt, who have pointed out several
typos and made useful suggestions for improvements. I am also grateful to
Volker En for making his lecture notes on nonlinear Functional Analysis
vii
viii
Preface
available to me.
Finally, no book is free of errors. So if you find one, or if you
have comments or suggestions (no matter how small), please let
me know.
Gerald Teschl
Vienna, Austria
August, 2013
Part 1
Functional Analysis
Chapter 0
Introduction
2
u(t, x) =
u(t, x).
t
x2
(0.1)
0. Introduction
Since finding the solution seems at first sight not possible, we could try
to find at least some solutions of (0.1) first. We could, for example, make
an ansatz for u(t, x) as a product of two functions, each of which depends
on only one variable, that is,
u(t, x) = w(t)y(x).
(0.2)
Accordingly, this ansatz is called separation of variables. Plugging everything into the heat equation and bringing all t, x dependent terms to the
left, right side, respectively, we obtain
w(t)
y 00 (x)
=
.
w(t)
y(x)
(0.3)
Here the dot refers to differentiation with respect to t and the prime to
differentiation with respect to x.
Now if this equation should hold for all t and x, the quotients must be
equal to a constant (we choose instead of for convenience later on).
That is, we are lead to the equations
w(t)
= w(t)
(0.4)
and
y 00 (x) = y(x),
y(0) = y(1) = 0,
(0.5)
w(t) = c1 et
(0.6)
(0.7)
However, y(x) must also satisfy the boundary conditions y(0) = y(1) = 0.
The first one y(0) = 0 is satisfied if c2 = 0 and the second one yields (c3 can
be absorbed by w(t))
sin( ) = 0,
(0.8)
which holds if = (n)2 , n N. In summary, we obtain the solutions
2
n N.
(0.9)
So we have found a large number of solutions, but we still have not dealt
with our initial condition u(0, x) = u0 (x). This can be done using the
superposition principle which holds since our equation is linear. Hence any
finite linear combination of the above solutions will again be a solution.
Moreover, under suitable conditions on the coefficients we can even consider
infinite linear combinations. In fact, choosing
u(t, x) =
X
n=1
cn e(n) t sin(nx),
(0.10)
cn sin(nx)
(0.11)
n=1
u0,n sin(nx),
(0.12)
n=1
2
u(t, x) 2 u(t, x) + q(x)u(t, x) = 0,
t
x
u(0, x) = u0 (x),
u(t, 0) = u(t, 1) = 0.
(0.13)
Here u(t, x) could be the density of some gas in a pipe and q(x) > 0 describes
that a certain amount per time is removed (e.g., by a chemical reaction).
Wave equation:
2
2
u(t,
x)
u(t, x) = 0,
t2
x2
u
u(0, x) = u0 (x),
(0, x) = v0 (x)
t
u(t, 0) = u(t, 1) = 0.
(0.14)
2
u(t, x) = 2 u(t, x) + q(x)u(t, x),
t
x
u(0, x) = u0 (x),
u(t, 0) = u(t, 1) = 0.
(0.15)
0. Introduction
Ly(x) = y(x),
d2
+ q(x),
dx2
(0.16)
(0.17)
cn un (x).
n=1
[0, 1]. (Hint: Weierstra M-test. When can you interchange the order of
summation and differentiation?)
Chapter 1
(1.1)
P
Example. Euclidean space Rn together with d(x, y) = ( nk=1 (xk yk )2 )1/2
P
is a metric space and so is Cn together with d(x, y) = ( nk=1 |xk yk |2 )1/2 .
The set
Br (x) = {y X|d(x, y) < r}
(1.2)
is called an open ball around x with radius r > 0. A point x of some set
U is called an interior point of U if U contains some ball around x. If x
9
10
d(x, y) =
|xk yk |.
(1.3)
k=1
Then
v
u n
n
n
X
X
uX
1
|xk | t
|xk |2
|xk |
n
k=1
k=1
(1.4)
k=1
11
Lemma 1.1. If B O is a base for the topology, then every open set can
be written as a union of elements from B.
If there exists a countable base, then X is called second countable.
Example. By construction, the open balls {B1/m (x)}mN,xX are a base
for the topology in a metric space. In the case of Rn (or Cn ) it even suffices
to take balls with rational center, and hence Rn (as well as Cn ) is second
countable.
A topological space is called a Hausdorff space if for two different
points there are always two disjoint neighborhoods.
Example. Any metric space is a Hausdorff space: Given two different
points x and y, the balls Bd/2 (x) and Bd/2 (y), where d = d(x, y) > 0, are
disjoint neighborhoods (a pseudometric space will not be Hausdorff).
The complement of an open set is called a closed set. It follows from
de Morgans rules that the family of closed sets C satisfies
(i) , X C,
(ii) C1 , C2 C implies C1 C2 C,
T
(iii) {C } C implies C C.
That is, closed sets are closed under finite unions and arbitrary intersections.
The smallest closed set containing a given set U is called the closure
\
U=
C,
(1.6)
CC,U C
12
and the largest open set contained in a given set U is called the interior
[
U =
O.
(1.7)
OO,OU
It is not hard to see that the closure satisfies the following axioms (Kuratowski closure axioms):
(i) = ,
(ii) U U ,
(iii) U = U ,
(iv) U V = U V .
In fact, one can show that they can equivalently be used to define the topology by observing that the closed sets are precisely those which satisfy A = A.
We can define interior and limit points as before by replacing the word
ball by open set. Then it is straightforward to check
Lemma 1.2. Let X be a topological space. Then the interior of U is the set
of all interior points of U , and the closure of U is the union of U with all
limit points of U .
Example. The closed ball
r (x) = {y X|d(x, y) r}
B
(1.8)
is a closed set (check that its complement is open). But in general we have
only
r (x)
Br (x) B
(1.9)
since an isolated point y with d(x, y) = r will not be a limit point. In Rn
(or Cn ) we have of course equality.
A sequence (xn )
n=1 X is said to converge to some point x X if
d(x, xn ) 0. We write limn xn = x as usual in this case. Clearly, the
limit is unique if it exists (this is not true for a pseudometric).
Every convergent sequence is a Cauchy sequence; that is, for every
> 0 there is some N N such that
d(xn , xm ) ,
n, m N.
(1.10)
If the converse is also true, that is, if every Cauchy sequence has a limit,
then X is called complete.
Example. Both Rn and Cn are complete metric spaces.
13
if
dX (x, y) < .
(1.11)
14
(1.12)
(1.13)
(1.14)
15
kN,1l<n
j,n is open, O
j,n Oj , and it is a cover since for
Then, by construction, O
every x there is a smallest j such that x Oj and a smallest n such that
k,l for some l n.
B32n (x) Oj implying x O
j,n is locally finite fix some x and let j be the smallest
To show that O
j,n for some n. Moreover, choose m such that
integer such that x O
16
j,i and z O
k,i with j < k. We will show
To show (ii) let y O
j,i
d(y, z) > 2nm+1 . There are points r and s such that y B2i (r) O
k,i . Then by definition B32i (r) Oj but s 6 Oj . So
and z B2i (s) O
i
d(r, s) 3 2 and d(y, z) > 2i 2nm+1 .
A subset K X is called compact if every open cover has a finite
subcover. A set is called relatively compact if its closure is compact.
Lemma 1.9. A topological space is compact if and only if it has the finite
intersection property: The intersection of a family of closed sets is empty
if and only if the intersection of some finite subfamily is empty.
Proof. By taking complements, to every family of open sets there is a corresponding family of closed sets and vice versa. Moreover, the open sets
are a cover if and only if the corresponding closed sets have empty intersection.
Lemma 1.10. Let X be a topological space.
(i) The continuous image of a compact set is compact.
(ii) Every closed subset of a compact set is compact.
(iii) If X is Hausdorff, every compact set is closed.
(iv) The product of finitely many compact sets is compact.
(v) The finite union of compact sets is again compact.
(vi) If X is Hausdorff, any intersection of compact sets is again compact.
Proof. (i) Observe that if {O } is an open cover for f (Y ), then {f 1 (O )}
is one for Y .
(ii) Let {O } be an open cover for the closed subset Y (in the induced
with O = O
Y and {O
}{X\Y }
topology). Then there are open sets O
is an open cover for X which has a finite subcover. This subcover induces a
finite subcover for Y .
(iii) Let Y X be compact. We show that X\Y is open. Fix x X\Y
(if Y = X, there is nothing to do). By the definition of Hausdorff, for
every y Y there are disjoint neighborhoods V (y) of y and Uy (x) of x. By
compactness of
TnY , there are y1 , . . . , yn such that the V (yj ) cover Y . But
then U (x) = j=1 Uyj (x) is a neighborhood of x which does not intersect
Y.
(iv) Let {O } be an open cover for X Y . For every (x, y) X Y
there is some (x, y) such that (x, y) O(x,y) . By definition of the product
topology there is some open rectangle U (x, y) V (x, y) O(x,y) . Hence for
fixed x, {V (x, y)}yY is an open cover of Y . Hence there are finitely many
17
T
points yk (x) such that the V (x, yk (x)) cover Y . Set U (x) = k U (x, yk (x)).
Since finite intersections of open sets are open, {U (x)}xX is an open cover
and there are finitely many points xj such that the U (xj ) cover X. By
construction, the U (xj ) V (xj , yk (xj )) O(xj ,yk (xj )) cover X Y .
(v) Note that a cover of the union is a cover for each individual set and
the union of the individual subcovers is the subcover we are looking for.
(vi) Follows from (ii) and (iii) since an intersection of closed sets is
closed.
As a consequence we obtain a simple criterion when a continuous function is a homeomorphism.
Corollary 1.11. Let X and Y be topological spaces with X compact and
Y Hausdorff. Then every continuous bijection f : X Y is a homeomorphism.
Proof. It suffices to show that f maps closed sets to closed sets. By (ii)
every closed set is compact, by (i) its image is also compact, and by (iii) it
is also closed.
A subset K X is called sequentially compact if every sequence
from K has a convergent subsequence. In a metric space, compact and
sequentially compact are equivalent.
Lemma 1.12. Let X be a metric space. Then a subset is compact if and
only if it is sequentially compact.
Proof. Suppose X is compact and let xn be a sequence which has no convergent subsequence. Then K = {xn } has no limit points and is hence compact
by Lemma 1.10 (ii). For every n there is a ball Bn (xn ) which contains only
finitely many elements of K. However, finitely many suffice to cover K, a
contradiction.
Conversely, suppose X is sequentially compact and let {O } be some
open cover which has no finite subcover. For every x X we can choose
some (x) such that if Br (x) is the largest ball contained in O(x) , then
either r 1 or there is no with B2r (x) OS(show that this is possible).
Now choose a sequence xn such that xn 6 m<n O(xm ) . Note that by
construction the distance d = d(xm , xn ) to every successor of xm is either
larger than 1 or the ball B2d (xm ) will not fit into any of the O .
Now let y be the limit of some convergent subsequence and fix some r
(0, 1) such that Br (y) O(y) . Then this subsequence must eventually be in
Br/5 (y), but this is impossible since if d = d(xn1 , xn2 ) is the distance between
two consecutive elements of this subsequence, then B2d (xn1 ) cannot fit into
18
O(y) by construction whereas on the other hand B2d (xn1 ) B4r/5 (a)
O(y) .
In a metric space, a set is called bounded if it is contained inside some
ball. Note that compact sets are always bounded since Cauchy sequences
are bounded (show this!). In Rn (or Cn ) the converse also holds.
Theorem 1.13 (HeineBorel). In Rn (or Cn ) a set is compact if and only
if it is bounded and closed.
Proof. By Lemma 1.10 (ii) and (iii) it suffices to show that a closed interval
in I R is compact. Moreover, by Lemma 1.12, it suffices to show that
every sequence in I = [a, b] has a convergent subsequence. Let xn be our
a+b
sequence and divide I = [a, a+b
2 ] [ 2 , b]. Then at least one of these two
intervals, call it I1 , contains infinitely many elements of our sequence. Let
y1 = xn1 be the first one. Subdivide I1 and pick y2 = xn2 , with n2 > n1 as
before. Proceeding like this, we obtain a Cauchy sequence yn (note that by
construction In+1 In and hence |yn ym | ba
n for m n).
By Lemma 1.12 this is equivalent to
Theorem 1.14 (BolzanoWeierstra). Every bounded infinite subset of Rn
(or Cn ) has at least one limit point.
Combining Theorem 1.13 with Lemma 1.10 (i) we also obtain the extreme value theorem.
Theorem 1.15 (Weierstra). Let X be compact. Every continuous function
f : X R attains its maximum and minimum.
A metric space for which the HeineBorel theorem holds is called proper.
Lemma 1.10 (ii) shows that X is proper if and only if every closed ball is
compact. Note that a proper metric space must be complete (since every
Cauchy sequence is bounded). A topological space is called locally compact if every point has a compact neighborhood. Clearly, a proper metric
space is locally compact.
The distance between a point x X and a subset Y X is
dist(x, Y ) = inf d(x, y).
yY
(1.16)
(1.17)
19
dist(x,C2 )
dist(x,C1 )+dist(x,C2 ) .
For the
second claim, observe that there is an open set O such that O is compact
and C1 O O X\C2 . In fact, for every x C1 , there is a ball B (x)
such that B (x) is compact and B (x) X\C2 . Since C1 is compact, finitely
many of them cover C1 and we can choose the union of those balls to be O.
Now replace C2 by X\O.
Note that Urysohns lemma implies that a metric space is normal; that
is, for any two disjoint closed sets C1 and C2 , there are disjoint open sets
O1 and O2 such that Cj Oj , j = 1, 2. In fact, choose f as in Urysohns
lemma and set O1 = f 1 ([0, 1/2)), respectively, O2 = f 1 ((1/2, 1]).
Lemma 1.18. Let X be a metric space and {Oj } a countable open cover.
Then there is a continuous partition of unity subordinate to this cover;
that is, there are continuous functions hj : X [0, 1] such that hj has
compact support contained in Oj and
X
hj (x) = 1.
(1.18)
j
Moreover, the partition of unity can be chosen locally finite; that is, every x
has a neighborhood where all but a finite number of the functions hj vanish.
Proof. For notational simplicity we assume j N. Now introduce fn (x) =
min(1, supjn d(x, X\Oj )) and gn = fn fn1 (with the convention f0 (x) =
0). Since fn is increasing we have 0 gn 1. Moreover, gn (x) > 0
implies d(x, X\On ))P
> 0 and thus supp(gn ) On . Next, by monotonicity
f = limn fn = n gn exists and is everywhere positive since {Oj } is a
cover. Finally, by
|fn (x) fn (y)| | sup d(x, X\Oj ) sup d(y, X\Oj )|
jn
jn
20
we see that all fn (and hence all gn ) are continuous. Moreover, the very
same argument shows that f is continuous and thus we have found the
required partition of unity hj = gj /f .
Finally, by Lemma 1.8 we can replace the cover by a locally finite refinement and the resulting partition of unity for this refinement will be locally
finite.
Problem 1.1. Show that |d(x, y) d(z, y)| d(x, z).
Problem 1.2. Show the quadrangle inequality |d(x, y) d(x0 , y 0 )|
d(x, x0 ) + d(y, y 0 ).
Problem 1.3. Show that the closure satisfies the Kuratowski closure axioms.
Problem 1.4. Show that the closure and interior operators are dual in the
sense that
X\A = (X\A)
and
X\A = (X\A).
(Hint: De Morgans laws.)
Problem 1.5. Let U V be subsets of a metric space X. Show that if U
is dense in V and V is dense in X, then U is dense in X.
Problem 1.6. Show that every open set O R can be written as a countable
union of disjoint intervals. (Hint: Let {I } be the set of all maximal open
subintervals of O; that is, I O and there is no other subinterval of O
which contains I . Then this is a cover of disjoint open intervals which has
a countable subcover.)
Problem 1.7. Show that the boundary of A is given by A = A\A .
It is not hard to see that with this definition C(I) becomes a normed vector
space:
A normed vector space X is a vector space X over C (or R) with a
nonnegative function (the norm) k.k such that
kf k > 0 for f 6= 0 (positive definiteness),
21
[0, 1].
(1.21)
X
kak1 =
|aj |
(1.22)
j=1
22
|aj + bj |
j=1
k
X
|aj | +
j=1
k
X
(1.23)
j=1
n
|am
j aj |
(1.24)
|aj anj | .
(1.25)
j=1
and take m :
k
X
j=1
Since this holds for all finite k, we even have kaan k1 . Hence (aan )
`1 (N) and since an `1 (N), we finally conclude a = an + (a an ) `1 (N).
By our estimate ka an k1 , our candidate a is indeed the limit of an .
Example. The previous example can be generalized by considering the
space `p (N) of all complex-valued sequences a = (aj )
j=1 for which the norm
1/p
X
kakp =
|aj |p ,
p [1, ),
(1.26)
j=1
1
1
+ ,
p
q
1 1
+ = 1,
p q
, 0,
(1.27)
which implies H
olders inequality
kabk1 kakp kbkq
(1.28)
23
and = |bj |q in (1.27) and summing over all j. (A different proof based on
convexity will be given in Section 8.2.)
Now using |aj + bj |p |aj | |aj + bj |p1 + |bj | |aj + bj |p1 , we obtain from
H
olders inequality (note (p 1)q = p)
ka + bkpp kakp k(a + b)p1 kq + kbkp k(a + b)p1 kq
= (kakp + kbkp )ka + bkp1
p .
Hence `p is a normed space. That it is complete can be shown as in the case
p = 1 (Problem 1.13).
Example. The space ` (N) of all complex-valued bounded sequences a =
(aj )
j=1 together with the norm
kak = sup |aj |
(1.29)
jN
is a Banach space (Problem 1.14). Note that with this definition, Holders
inequality (1.28) remains true for the cases p = 1, q = and p = , q = 1.
The reason for the notation is explained in Problem 1.16.
Example. Every closed subspace of a Banach space is again a Banach space.
For example, the space c0 (N) ` (N) of all sequences converging to zero is
a closed subspace. In fact, if a ` (N)\c0 (N), then lim supj |aj | = > 0
and thus ka bk for every b c0 (N).
Now what about completeness of C(I)? A sequence of functions fn (x)
converges to f if and only if
lim kf fn k = lim sup |fn (x) f (x)| = 0.
n xI
(1.30)
m, n > N , x I,
(1.31)
we see
|f (x) fn (x)|
n > N , x I;
(1.32)
24
and so that |x y| < implies |fn (x) fn (y)| < /3. Then |x y| <
implies
|f (x)f (y)| |f (x)fn (x)|+|fn (x)fn (y)|+|fn (y)f (y)| < + + =
3 3 3
as required. Hence f (x) C(I) and thus every Cauchy sequence in C(I)
converges. Or, in other words,
Theorem 1.19. C(I) with the maximum norm is a Banach space.
Next we want to look at countable bases. To this end we introduce a few
definitions first.
The set of all finite linear combinations of a set of vectors {un }nN X
is called the span of {un }nN and denoted by
span{un }nN = {
m
X
(1.33)
j=1
1/p
X
ka am k1 =
|aj |p 0
j=m+1
m
since am
j = aj for 1 j m and aj = 0 for j > m. Hence
a=
X
n=1
an n
25
and { n }
n=1 is a Schauder basis (uniqueness of the coefficients is left as an
exercise).
Note that { n }
n=1 is also Schauder basis for c0 (N) but not for ` (N).
A set whose span is dense is called total, and if we have a countable total
set, we also have a countable dense set (consider only linear combinations
with rational coefficients show this). A normed vector space containing
a countable dense set is called separable.
Example. Every Schauder basis is total and thus every Banach space with
a Schauder basis is separable (the converse is not true). In particular, the
Banach space `p (N) is separable.
While we will not give a Schauder basis for C(I), we will at least show
that it is separable. In order to prove this, we need a lemma first.
Lemma 1.20 (Smoothing). Let un (x) be a sequence of nonnegative continuous functions on [1, 1] such that
Z
Z
un (x)dx = 1
un (x)dx 0,
and
|x|1
> 0.
(1.35)
|x|1
f ( 12 )
1/2
un (x y)f (y)dy
fn (x) =
(1.36)
1/2
1/2
|f (x)
1/2
un (x y)dy| M .
1/2
26
+ 2M + M = (1 + 3M ),
which proves the claim.
(1.37)
1
(1 x2 )n ,
In
n!
(n!)2 22n+1
n!
22n+1 =
= 1 1
.
1
(n + 1) (2n + 1)
(2n + 1)!
2 ( 2 + 1) ( 2 + n)
Indeed, the first part of (1.35) holds by construction, and the second part
follows from the elementary estimate
1
2
which shows
|x|1 un (x)dx
1
< In < 2,
+n
27
n
X
X
fj = lim
fj
j=1
j=1
28
Problem 1.16. Show that if a `p0 (N) for some p0 [1, ), then a `p (N)
for p p0 and
lim kakp = kak .
p
Problem 1.17. Formally extend the definition of `p (N) to p (0, 1). Show
that k.kp does not satisfy the triangle inequality. However, show that it is
a quasinormed space, that is, it satisfies all requirements for a normed
space except for the triangle inequality which is replaced by
ka + bk K(kak + kbk)
with some constant K 1. Show, in fact,
ka + bkp 21/p1 (kakp + kbkp ),
p (0, 1).
k.kpp
1 , 2 C,
(1.38)
(positive definite),
(symmetry)
29
n
X
aj bj
(1.40)
j=1
X
n
o
2
(aj )
|a
|
<
j
j=1
(1.41)
j=1
aj bj .
(1.42)
j=1
X
X
X
X
X
X
2
2
2
aj bj
|aj bj |
|aj |
|bj |
|aj |
|bj |2
j=1
j=1
j=1
j=1
j=1
j=1
we infer that the sum in the definition of the scalar product is absolutely
2
convergent
p (an thus well-defined) for a, b ` (N). Observe that the norm
kak = ha, ai is identical to the norm kak2 defined in the previous section.
In particular, `2 (N) is complete and thus indeed a Hilbert space.
A vector f H is called normalized or a unit vector if kf k = 1.
Two vectors f, g H are called orthogonal or perpendicular (f g) if
hf, gi = 0 and parallel if one is a multiple of the other.
If f and g are orthogonal, we have the Pythagorean theorem:
kf + gk2 = kf k2 + kgk2 ,
f g,
(1.43)
(1.44)
f = f hu, f iu,
(1.45)
30
BMB
f
fk
1
u
B f
B
B
1
(1.46)
(1.47)
(1.48)
But let us return to C(I). Can we find a scalar product which has the
maximum norm as associated norm? Unfortunately the answer is no! The
reason is that the maximum norm does not satisfy the parallelogram law
(Problem 1.20).
Theorem 1.24 (Jordanvon Neumann). A norm is associated with a scalar
product if and only if the parallelogram law
kf + gk2 + kf gk2 = 2kf k2 + 2kgk2
holds.
(1.49)
31
In this case the scalar product can be recovered from its norm by virtue
of the polarization identity
hf, gi =
1
kf + gk2 kf gk2 + ikf igk2 ikf + igk2 .
4
(1.50)
1
kf + gk2 kf gk2 + ikf igk2 ikf + igk2 .
4
The corresponding inner product space is denoted by L2cont (I). Note that
we have
p
kf k |b a|kf k
(1.52)
and hence the maximum norm is stronger than the L2cont norm.
Suppose we have two norms k.k1 and k.k2 on a vector space X. Then
k.k2 is said to be stronger than k.k1 if there is a constant m > 0 such that
kf k1 mkf k2 .
(1.53)
32
0,
fn (x) = 1 + n(x 1),
1,
0 x 1 n1 ,
1 n1 x 1,
1 x 2.
(1.54)
(1.55)
Proof. Choose
a basis {uj }1jn such that every f X can be written
P
as f =
u
. Since equivalence of norms is an equivalence relation
j
j
j
(check this!), we can assume that k.k2 is the usual Euclidean norm: kf k2 =
P
P
k j j uj k2 = ( j |j |2 )1/2 . Then by the triangle and CauchySchwarz
inequalities,
sX
X
kf k1
|j |kuj k1
kuj k21 kf k2
j
qP
kuj k21 .
33
1jn
1j<kn
(1.56)
(1.57)
sup
|s(f, g)|
kf k=kgk=1
is finite. Show
kqk ksk 2kqk.
(Hint: Use the parallelogram law and the polarization identity from the previous problem.)
34
(1.58)
holds if q(f ) 0.
(Hint: Consider 0 q(f + g) = q(f ) + 2Re( s(f, g)) + ||2 q(g) and
choose = t s(f, g) /|s(f, g)| with t R.)
Problem 1.25. Prove the claims made about fn , defined in (1.54), in the
last example.
1.4. Completeness
Since L2cont is not complete, how can we obtain a Hilbert space from it?
Well, the answer is simple: take the completion.
If X is an (incomplete) normed space, consider the set of all Cauchy
Call two Cauchy sequences equivalent if their difference consequences X.
the set of all equivalence classes. It is easy
verges to zero and denote by X
to see that X (and X) inherit the vector space structure from X. Moreover,
Lemma 1.27. If xn is a Cauchy sequence, then kxn k converges.
Consequently, the norm of a Cauchy sequence (xn )
n=1 can be defined
by k(xn )
k
=
lim
kx
k
and
is
independent
of
the
equivalence class
n
n
n=1
35
(1.60)
and range
Ran(A) = {Af |f D(A)} = AD(A) Y
(1.61)
are defined as usual. The operator A is called bounded if the operator
norm
kAk =
sup
kAf kY
(1.62)
f D(A),kf kX =1
is finite.
By construction, a bounded operator is Lipschitz continuous,
kAf kY kAkkf kX ,
f D(A),
(1.63)
x[0,1]
d
dx
:XY
x[0,1]
x[0,1]
d
dx
However, if we consider A =
: D(A) Y Y defined on D(A) =
1
C [0, 1], then we have an unbounded operator. Indeed, choose
un (x) = sin(nx)
36
fn D(A),
f X.
37
The Banach space of bounded linear operators L(X) even has a multiplication given by composition. Clearly, this multiplication satisfies
(A + B)C = AC + BC,
A(B + C) = AB + BC,
A, B, C L(X) (1.64)
and
(AB)C = A(BC),
C.
(1.65)
(1.66)
and compute the operator norm kAk with respect to this matrix in terms of
the matrix entries. Do the same with respect to the norm
X
kxk1 =
|xj |.
1jn
where K(x, y) C([0, 1] [0, 1]), defined on D(K) = C[0, 1], is a bounded
operator both in X = C[0, 1] (max norm) and X = L2cont (0, 1).
Problem 1.29. Let I be a compact interval. Show that the set of differentiable functions C 1 (I) becomes a Banach space if we set kf k,1 =
maxxI |f (x)| + maxxI |f 0 (x)|.
Problem 1.30. Show that kABk kAkkBk for every A, B L(X). Conclude that the multiplication is continuous: An A and Bn B imply
An Bn AB.
Problem 1.31. Let A L(X) be a bijection. Show
kA1 k1 =
inf
f X,kf k=1
kAf k.
X
j=0
fj z j ,
|z| < R,
38
fj Aj
j=0
(1.67)
is a Banach space.
Proof. First of all we need to show that (1.67) is indeed a norm. If k[x]k = 0
we must have a sequence xj M with xj x and since M is closed we
conclude x M , that is [x] = [0] as required. To see k[x]k = ||k[x]k we
use again the definition
k[x]k = k[x]k = inf kx + yk = inf kx + yk
yM
yM
= || inf kx + yk = ||k[x]k.
yM
39
(1.68)
xU
is a Banach space and this can be shown as in Section 1.2. The space of
continuous functions with compact support Cc (U ) Cb (U ) is in general
not dense and its closure will be denoted by C0 (U ). If U is open it can be
interpreted as the functions in Cb (U ) which vanish at the boundary
C0 (U ) = {f C(U )| > 0, K compact : |f (x)| < , x X \ K}. (1.69)
Of course Rm could be replaced by any topological space up to this point.
Moreover, the above norm can be augmented to handle differentiable
functions by considering the space Cb1 (U ) of all continuously differentiable
functions for which the following norm is
m
X
kf k,1 = kf k +
kj f k
(1.70)
j=1
xj .
40
since the sum of seminorms is again a seminorm (Problem 1.350 the above
expression defines indeed a norm. It is also not hard to see that Cb1 (U, Cn )
is complete. In fact, let f k be a Cauchy sequence, then f k (x) converges uniformly to some continuous function f (x) and the same is true for the partial
derivatives
j f k (x) gj (x). Moreover, since f k (x) =Rf k (c, x2 , . . . , xm ) +
R x1
x1
k
c j f (t, x2 , . . . , xm )dt f (x) = f (c, x2 , . . . , xm ) + c gj (t, x2 , . . . , xm )
we obtain j f (x) = gj (x). The remaining derivatives follow analogously and
thus f k f in Cb1 (U, Cn ).
Clearly this approach extends to higher derivatives. To this end let
C k (U ) be the set of all complex-valued functions which have partial derivatives of order up to k. For f C k (U ) and Nn0 we set
f =
|| f
,
x1 1 xnn
|| = 1 + + n .
(1.71)
xU
Note that the space Cbk (U ) could be further refined by requiring the
highest derivatives to be H
older continuous. Recall that a function f : U
C is called uniformly H
older continuous with exponent (0, 1] if
[f ] = sup
x6=yU
|f (x) f (y)|
|x y|
(1.73)
41
Note that by the mean value theorem all derivatives up to order lower
than k are automatically Lipschitz continuous.
While the above spaces are able to cover a wide variety of situations,
there are still cases where the above definitions are not suitable. For example, suppose you want to consider continuous function C(R) but you do
not want to assume boundedness. Then you could try to use local uniform
convergence as follows. Consider the seminorms
kf kj = sup |f (x)|,
j N,
(1.75)
|x|j
(1.76)
jN
(1.77)
j N0 , k N,
(1.78)
|x|k
is a Frechet space.
Example. The space of all entire functions f (z) (i.e. functions which are
holomorphic on all of C) together with the seminorms kf kk = sup|z|k |f (x)|,
k N, is a Frechet space. Completeness follows from the Weierstra convergence theorem which states that a limit of holomorphic functions which
is uniform on every compact subset is again holomorphic.
Note that in all the above spaces we could replace complex-valued by
functions.
Cn -valued
42
d
1+d .
(Hint:
Problem 1.39. Let X be some space together with a sequence of pseudometrics dj , j N. Show that
X 1 dj (x, y)
d(x, y) =
2j 1 + dj (x, y)
jN
(0, 1).
Use this to show that balls in a Frechet space are convex. In particular, every
Frechet space is a locally convex vector space, that is, a topological vector
space with a base of convex sets.
Problem 1.42. Consider `p (N) for p (0, 1) compare Problem 1.17.
Show that k.kp is not convex. Show that every convex open set is unbounded.
Conclude that it is not a locally convex vector space. (Hint: Consider BR (0).
Then for r < R all vectors which have one entry equal to r and all other
entries zero are in this ball. By taking convex combinations all vectors which
have n entries equal to r/n are in the convex hull. The quasinorm of such
a vector is n1/p1 r.)
Chapter 2
Hilbert spaces
fk =
n
X
huj , f iuj ,
(2.1)
j=1
(2.3)
44
2. Hilbert spaces
n
X
j uj
j=1
n
X
|j huj , f i|2
j=1
|huj , f i|2 kf k2
(2.4)
j=1
with equality holding if and only if f lies in the span of {uj }nj=1 .
Of course, since we cannot assume H to be a finite dimensional vector space, we need to generalize Lemma 2.1 to arbitrary orthonormal sets
{uj }jJ . We start by assuming that J is countable. Then Bessels inequality
(2.4) shows that
X
|huj , f i|2
(2.5)
jJ
jK
(2.8)
45
(2.11)
f1
.
kf1 k
(2.12)
46
2. Hilbert spaces
(2.14)
H
for
which
f 6= 0.
k
j=1
N
Since {uj }j=1 is total, we can find a f in its span such that kf fk < kf k,
contradicting (2.11). Hence we infer that {uj }N
j=1 is an orthonormal basis.
Theorem 2.3. Every separable Hilbert space has a countable orthonormal
basis.
Example. In L2cont (1, 1), we can orthogonalize the polynomial fn (x) =
xn (which are total by the Weierstra approximation theorem Theorem 1.21). The resulting polynomials are up to a normalization equal to the
Legendre polynomials
P0 (x) = 1,
P1 (x) = x,
P2 (x) =
3 x2 1
,
2
...
(2.15)
(2.17)
jJ
(2.18)
47
g, f H1 .
(2.19)
By the polarization identity, (1.50) this is the case if and only if U preserves
norms: kU f k2 = kf k1 for all f H1 (note the a norm preserving linear
operator is automatically injective). The two Hilbert spaces H1 and H2 are
called unitarily equivalent in this case.
Let H be an infinite dimensional Hilbert space and let {uj }jN be any
orthogonal basis. Then the map U : H `2 (N), f 7 (huj , f i)jN is unitary (by Theorem 2.4 (ii) it is onto and by (iii) it is norm preserving). In
particular,
Theorem 2.6. Any separable infinite dimensional Hilbert space is unitarily
equivalent to `2 (N).
48
2. Hilbert spaces
(2.20)
in this situation.
Proof. Since M is closed, it is a Hilbert space and has an orthonormal
basis {uj }jJ . Hence the existence part follows from Theorem 2.2. To see
uniqueness, suppose there is another decomposition f = fk + f . Then
fk fk = f f M M = {0}.
Corollary 2.9. Every orthogonal set {uj }jJ can be extended to an orthogonal basis.
Proof. Just add an orthogonal basis for ({uj }jJ ) .
49
and
hPM g, f i = hg, PM f i
(2.21)
and
hP f, gi = hf, P gi
50
2. Hilbert spaces
(2.23)
(2.24)
sup
|hf, Agi| C.
(2.25)
kf k=kgk=1
(2.26)
51
(A) = A ,
(ii) A = A,
(iii) (AB) = B A ,
(iv) kA k = kAk and kAk2 = kA Ak = kAA k.
Proof. (i) is obvious. (ii) follows from hf, A gi = hA f, gi = hf, Agi. (iii)
follows from hf, (AB)gi = hA f, Bgi = hB A f, gi. (iv) follows using (2.25)
from
kA k =
|hf, A gi| =
sup
kf k=kgk=1
sup
|hAf, gi|
kf k=kgk=1
|hg, Af i| = kAk
sup
kf k=kgk=1
and
kA Ak =
sup
|hf, A Agi| =
kf k=kgk=1
sup
|hAf, Agi|
kf k=kgk=1
where we have used that |hAf, Agi| attains its maximum when Af and Ag
are parallel (compare Theorem 1.23).
Note that kAk = kA k implies that taking adjoints is a continuous operation. For later use also note that (Problem 2.8)
Ker(A ) = Ran(A) .
(2.27)
52
2. Hilbert spaces
Combining the last two results we obtain the famous LaxMilgram theorem which plays an important role in theory of elliptic partial differential
equations.
Theorem 2.15 (LaxMilgram). Let s be a sesquilinear form which is
bounded, |s(f, g)| Ckf k kgk, and
coercive, s(f, f ) kf k2 .
Then for every g H there is a unique f H such that
s(h, f ) = hh, gi,
h H.
(2.28)
M
X
X
Hj = {
fj | fj Hj ,
kfj k2j < },
(2.30)
j=1
j=1
j=1
X
X
X
h
gj ,
fj i =
hgj , fj ij .
j=1
j=1
j=1
(2.31)
Example.
j=1 C
53
= `2 (N).
elements of H H
n
X
={
j C}.
F(H, H)
j (fj , fj )|(fj , fj ) H H,
(2.32)
j=1
where
and (f ) f = f (f) we consider F(H, H)/N
(H, H),
= span{
N (H, H)
n
X
n
n
X
X
k fk )}
j k (fj , fk ) (
j fj ,
j,k=1
j=1
(2.33)
k=1
(2.34)
To show that we
which extends to a sesquilinear form on F(H, H)/N
(H, H).
P
obtain a scalar product, we need to ensure positivity. Let f = i i fi fi 6=
0 and pick orthonormal bases uj , u
k for span{fi }, span{fi }, respectively.
Then
X
X
f=
jk uj u
k , jk =
i huj , fi ih
uk , fi i
(2.35)
i
j,k
and we compute
hf, f i =
|jk |2 > 0.
(2.36)
j,k
uj u
k is an orthonormal basis for H H.
Proof. That uj u
k is an orthonormal set is immediate from (2.34). More respectively, it is easy to
over, since span{uj }, span{
uk } are dense in H, H,
see that uj u
k is dense in F(H, H)/N (H, H). But the latter is dense in
H H.
Example. We have H Cn = Hn .
54
2. Hilbert spaces
M
M
(
Hj ) H =
(Hj H),
(2.37)
j=1
j=1
where equality has to be understood in the sense that both spaces are unitarily equivalent by virtue of the identification
X
X
(
fj ) f =
fj f.
(2.38)
j=1
j=1
Chapter 3
Compact operators
Proof. Let fj
(1)
such
(1)
(1)
(2)
that A1 fj converges. From fj choose another subsequence fj such that
(2)
(n)
A2 fj converges and so on. Since fj might disappear as n , we con(j)
sider the diagonal sequence fj = fj . By construction, fj is a subsequence
(n)
of fj for j n and hence An fj is Cauchy for every fixed n. Now
56
3. Compact operators
(3.1)
|f (t)|dt k1k kf k,
a
if
|x y| < .
(3.2)
57
bers in [a, b]). Since fn (x1 ) is bounded, we can choose a subsequence fn (x)
(1)
such that fn (x1 ) converges (BolzanoWeierstra). Similarly we can extract
(2)
(1)
a subsequence fn (x) from fn (x) which converges at x2 (and hence also
(1)
at x1 since it is a subsequence of fn (x)). By induction we get a sequence
(j)
(n)
fn (x) converging at x1 , . . . , xj . The diagonal sequence fn (x) = fn (x)
will hence converge for all x = xj (why?). We will show that it converges
uniformly for all x:
Fix > 0 and chose such that |fn (x) fn (y)| 3 for |x y| < .
The balls B (xj ) cover [a, b] and by compactness even finitely many, say
1 j p, suffice. Furthermore, choose N such that |fm (xj ) fn (xj )| 3
for n, m N and 1 j p.
Now pick x and note that x B (xj ) for some j. Thus
|fm (x) fn (x)| |fm (x) fm (xj )| + |fm (xj ) fn (xj )|
+ |fn (xj ) fn (x)|
for n, m N , which shows that fn is Cauchy with respect to the maximum
norm. By completeness of C[a, b] it has a limit.
Compact operators are very similar to (finite) matrices as we will see in
the next section.
Problem 3.1. Show that compact operators form an ideal.
Problem 3.2. Show that adjoint of the integral operator from Lemma 3.4
is the integral operator with kernel K(y, x) .
f, g D(A).
(3.3)
(3.5)
58
3. Compact operators
f :kf k=1
f :kf k=1
59
(3.6)
(3.7)
huj , f iuj + h,
(3.8)
j=1
n
X
huj , f iuj ,
j=1
we have
kA(f fn )k |n |kf fn k |n |kf k
since f fn Hn and kAn k = |n |. Letting n shows A(f f ) = 0
proving (3.8).
60
3. Compact operators
j huj , f iuj ,
(3.9)
j=1
in
Problem 3.5. Find the eigenvalues and eigenfunctions of the integral operator
Z 1
(Kf )(x) = 2
(2xy x y + 1)f (y)dy
0
in
d2
+ q(x),
dx2
(3.10)
61
(3.11)
(3.12)
=
f (x) g(x)dx +
f (x) q(x)g(x)dx
0
(3.13)
= hLf, gi.
Here we have used integration by part twice (the boundary terms vanish
due to our boundary conditions f (0) = f (1) = 0 and g(0) = g(1) = 0).
Of course we want to apply Theorem 3.8 and for this we would need to
show that L is compact. But this task is bound to fail, since L is not even
bounded (see the example in Section 1.5)!
So here comes the trick: If L is unbounded its inverse L1 might still
be bounded. Moreover, L1 might even be compact and this is the case
here! Since L might not be injective (0 might be an eigenvalue), we consider
RL (z) = (L z)1 , z C, which is also known as the resolvent of L.
In order to compute the resolvent, we need to solve the inhomogeneous
equation (L z)f = g. This can be done using the variation of constants
formula from ordinary differential equations which determines the solution
62
3. Compact operators
u (z, t)g(t)dt
0
1
u (z, x)
+
W (z)
u+ (z, t)g(t)dt ,
(3.14)
(3.15)
(3.17)
(3.18)
(L z)
Z
g(x) =
G(z, x, t)g(t)dt.
0
(3.19)
63
Chapter 4
66
But x Brn (xn ) X\Xn for every n, contradicting our assumption that
the Xn cover X.
Remark: The set of rational numbers Q can be written as a countable
union of its elements. This shows that completeness assumption is crucial.
(Sets which can be written as the countable union of nowhere dense sets
are said to be of first category. All other sets are second category. Hence
we have the name category theorem.)
In other words, if Xn X is a sequence of closed subsets which cover
X, at least one Xn contains a ball of radius > 0.
Since a closed set is nowhere dense if and only if its complement is open
and dense (cf. Problem 1.4), there is a reformulation which is also worthwhile
noting:
Corollary 4.2. Let X be a complete metric space. Then any countable
intersection of open dense sets is again dense.
Proof. Let On be open dense sets whose intersection is not dense. Then
this intersection must be missing some closed ball B . This ball will lie in
S
n Xn , where Xn = X\On are closed and nowhere dense. Now note that
{x| kA xk n}.
S
Then n Xn = X by assumption. Moreover, by continuity of A and the
norm, each Xn is an intersection of closed sets and hence closed. By Baires
theorem at least one contains a ball of positive radius: B (x0 ) Xn . Now
observe
kA yk kA (y + x0 )k + kA x0 k n + C(x0 )
x
for kyk . Setting y = kxk
, we obtain
kA xk
for every x.
n + C(x0 )
kxk
67
[
Y =
A(BnX )
n=1
and the Baire theorem implies that for some n, A(BnX ) contains a ball
BY (y). Without restriction n = 1 (just scale the balls). Since A(B1X ) =
A(B1X ) = A(B1X ) we see BY (y) A(B1X ) and by convexity of A(B1X )
we also have BY A(B1X ).
So we have BY A(B1X ), but we would need BY A(B1X ). To complete
Y A(B X ).
the proof we will show A(B1X ) A(B2X ) which implies B/2
1
For every y A(B1X ) we can choose some sequence yn A(B1X ) with
yn y. Moreover, there even is some xn B1X with yn = A(xn ). However xn might not converge, so we need to argue more carefully and ensure
Y .
convergence along the way: start with x1 B1X such that y Ax1 B/2
Y A(B X ) and hence we can
Scaling the relation BY A(B1X ) we have B/2
1/2
X such that (y Ax ) Ax B Y A(B X ). Proceeding
choose x2 B1/2
1
2
/4
1/4
like this we obtain a sequence of points xn B2X1n such that
n
X
Y
Axk B2
n .
k=1
21k
By kxk k <
the limit x =
X
y = Ax A(B2 ) as desired.
k=1 xk
68
X
j=1
1=
X
j=1
|jbj |2 =
|aj |2 < .
j=1
(4.1)
69
(4.3)
If this is the case, A is called closable and the operator A associated with
(A) is called the closure of A.
In particular, A is closable if and only if xn 0 and Axn y implies
y = 0. In this case
D(A) = {x X|xn D(A), y Y : xn x and Axn y},
Ax = y.
(4.4)
For yet another way of defining the closure see Problem 4.4.
Example. Consider the operator A in `p (N) defined by Aaj = jaj on
D(A) = {a `p (N)|aj 6= 0 for finitely many j}.
(i). A is closable. In fact, if an 0 and Aan b then we have anj 0
and thus janj 0 = bj for any j N.
(ii). The closure of A is given by
p
D(A) = {a `p (N)|(jaj )
j=1 ` (N)}
and Aaj = jaj . In fact, if an a and Aan b then we have anj aj and
janj bj for any j N and thus bj = jaj for any j N. In particular,
p
(jaj )
j=1 = (bj )j=1 ` (N). Conversely, suppose (jaj )j=1 ` (N) and
consider
(
aj , j n,
anj =
0, j > n.
Then an a and Aan (jaj )
j=1 .
1
= (A)1 = (A
).
= A1 .
70
As a consequence of the closed graph theorem we obtain:
Corollary 4.8. Suppose A : D(A) Ran(A) is closed and injective. Then
A1 defined on D(A1 ) = Ran(A) is closed. Moreover, in this case Ran(A)
is closed if and only if A1 is bounded.
The closed graph theorem tells us that closed linear operators can be
defined on all of X if and only if they are bounded. So if we have an
unbounded operator we cannot have both! That is, if we want our operator
to be at least closed, we have to live with domains. This is the reason why in
quantum mechanics most operators are defined on domains. In fact, there
is another important property which does not allow unbounded operators
to be defined on the entire space:
Theorem 4.9 (HellingerToeplitz). Let A : H H be a linear operator on
some Hilbert space H. If A is symmetric, that is hg, Af i = hAg, f i, f, g H,
then A is bounded.
Proof. It suffices to prove that A is closed. In fact, fn f and Afn g
implies
hh, gi = lim hh, Afn i = lim hAh, fn i = hAh, f i = hh, Af i
n
Problem 4.1. Show that if A is closed and B bounded, then A+B is closed.
Show that this in general fails if B is not bounded. (Here A + B is defined
on D(A + B) = D(A) D(B).)
d
Problem 4.2. Show that the differential operator A = dx
defined on D(A) =
1
C [0, 1] C[0, 1] (sup norm) is a closed operator. (Compare the example in
Section 1.5.)
d
defined on D(A) = C 1 [0, 1] L2 (0, 1).
Problem 4.3. Consider A = dx
Show that its closure is given by
Z x
2
2
D(A) = {f L (0, 1)|g L (0, 1), c C : f (x) = c +
g(y)dy}
0
and Af = g.
Problem 4.4. Consider a linear operator A : D(A) X Y , where X
and Y are Banach spaces. Define the graph norm associated with A by
kxkA = kxkX + kAxkY .
(4.5)
Show that A : D(A) Y is bounded if we equip D(A) with the graph norm.
Show that the completion XA of (D(A), k.kA ) can be regarded as a subset of
71
X if and only if A is closable. Show that in this case the completion can
be identified with D(A) and that the closure of A in X coincides with the
extension from Theorem 1.30 of A in XA .
whose norm is kly k = kykq (Problem 4.7). But can every element of `p (N)
be written in this form?
Suppose p = 1 and choose l `1 (N) . Now define
yn = l( n ),
(4.7)
n = 0, n 6= m. Then
where nn = 1 and m
(4.8)
shows kyk klk, that is, y ` (N). By construction l(x) = ly (x) for every
x span{ n }. By continuity of ` it even holds for x span{ n } = `1 (N).
Hence the map y 7 ly is an isomorphism, that is, `1 (N)
= ` (N). A similar
p
72
`(y
+ x) (y + x). But
So all we need to do is to choose `(x)
such that `(y
this is equivalent to
(y x) `(y)
inf (y + x) `(y)
`(x)
sup
>0,yY
>0,yY
and is hence only possible if
(y1 1 x) `(y1 )
(y2 + 2 x) `(y2 )
1
2
for every 1 , 2 > 0 and y1 , y2 Y . Rearranging this last equations we see
that we need to show
2 `(y1 ) + 1 `(y2 ) 2 (y1 1 x) + 1 (y2 + 2 x).
Starting with the left-hand side we have
2 `(y1 ) + 1 `(y2 ) = (1 + 2 )` (y1 + (1 )y2 )
(1 + 2 ) (y1 + (1 )y2 )
= (1 + 2 ) ((y1 1 x) + (1 )(y2 + 2 x))
2 (y1 1 x) + 1 (y2 + 2 x),
where =
2
1 +2 .
To finish the proof we appeal to Zorns lemma (see Appendix A): Let E
73
`(x)
|`(x)|
and use
x c(N),
(4.9)
(4.10)
74
y Y.
1
)kyn + x0 k
n
75
Example. The Banach spaces `p (N) with 1 < p < are reflexive: Identify
`p (N) with `q (N) and choose z `p (N) . Then there is some x `p (N)
such that
X
z(y) =
yj xj ,
y `q (N)
= `p (N) .
jN
But this implies z(y) = y(x), that is, z = J(x), and thus J is surjective.
(Warning: It does not suffice to just argue `p (N)
= `q (N)
= `p (N).)
` (N) but ` (N) 6
However, `1 is not reflexive since `1 (N) =
= `1 (N)
as noted earlier.
Example. By the same argument (using the Riesz lemma), every Hilbert
space is reflexive.
Lemma 4.17. Let X be a Banach space.
(i) If X is reflexive, so is every closed subspace.
(ii) X is reflexive if and only if X is.
(iii) If X is separable, so is X.
Proof. (i) Let Y be a closed subspace. Denote by j : Y , X the natural
inclusion and define j : Y X via (j (y 00 ))(`) = y 00 (`|Y ) for y 00 Y
and ` X . Note that j is isometric by Corollary 4.12. Then
J
X
X
j
Y
JY
X
j
Y
(JX ) : X X
x000
7 x000 JX
1 00
are inverse of each other. Moreover, fix x00 X and let x = JX
(x ).
1
0
00
00
0
0
0
0
00
76
(4.11)
pU ( x) = pU (x),
0.
(4.12)
77
x U, y V.
(4.13)
`(x) pW (x).
x U, y V.
(4.14)
sup
|`(x)|,
(4.15)
`V, k`k=1
78
|`(Ax)|,
sup
(4.16)
79
80
Remark: One can equip X with the weakest topology for which all
` X remain continuous. This topology is called the weak topology and
it is given by taking all finite intersections of inverse images of open sets
as a base. By construction, a sequence will converge in the weak topology
if and only if it converges weakly. By Corollary 4.14 the weak topology is
Hausdorff, but it will not be metrizable in general. In particular, sequences
do not suffice to describe this topology.
In a Hilbert space there is also a simple criterion for a weakly convergent
sequence to converge in norm.
Lemma 4.22. Let H be a Hilbert space and let fn * f . Then fn f if
and only if lim sup kfn k kf k.
Proof. By (i) of the previous lemma we have lim kfn k = kf k and hence
kf fn k2 = kf k2 2Re(hf, fn i) + kfn k2 0.
The converse is straightforward.
Now we come to the main reason why weakly convergent sequences are
of interest: A typical approach for solving a given equation in a Banach
space is as follows:
(i) Construct a (bounded) sequence xn of approximating solutions
(e.g. by solving the equation restricted to a finite dimensional subspace and increasing this subspace).
(ii) Use a compactness argument to extract a convergent subsequence.
(iii) Show that the limit solves the equation.
Our aim here is to provide some results for the step (ii). In a finite dimensional vector space the most important compactness criterion is boundedness (HeineBorel theorem, Theorem 1.13). In infinite dimensions this
breaks down:
Theorem 4.23. The closed unit ball in X is compact if and only if X is
finite dimensional.
For the proof we will need
Lemma 4.24. Let X be a normed vector space and Y X some subspace.
If Y 6= X, then for every (0, 1) there exists an x with kx k = 1 and
inf kx yk 1 .
yY
(4.18)
81
1
kx y k
as required.
x =
82
L1 (R).
Finally, let me remark that similar concepts can be introduced for operators. This is of particular importance for the case of unbounded operators,
where convergence in the operator norm makes no sense at all.
A sequence of operators An is said to converge strongly to A,
s-lim An = A
n
An x Ax
x D(A) D(An ).
(4.19)
An x * Ax x D(A) D(An ).
(4.20)
Clearly norm convergence implies strong convergence and strong convergence implies weak convergence.
83
(4.21)
and the operator Sn L(`2 (N)) which shifts a sequence n places to the right
and fills up the first n places with zeros, that is,
Sn (x1 , x2 , . . . ) = (0, . . . , 0, x1 , x2 , . . . ).
| {z }
(4.22)
n places
Then Sn converges to zero strongly but not in norm (since kSn k = 1) and Sn
converges weakly to zero (since hx, Sn yi = hSn x, yi) but not strongly (since
kSn xk = kxk) .
Lemma 4.26. Suppose An L(X) is a sequence of bounded operators.
(i) s-lim An = A implies kAk lim inf kAn k.
n
4C
84
x X.
(4.23)
Note that this is not the same as weak convergence on X , since ` is the
weak limit of `n if
j(`n ) j(`)
j X ,
(4.24)
whereas for the weak- limit this is only required for j J(X) X (recall
J(x)(`) = `(x)). So the weak topology on X is the weakest topology for
which all j X remain continuous and the weak- topology on X is the
weakest topology for which all j J(X) remain continuous. In particular,
the weak- topology is weaker than the weak topology and both are equal
if X is reflexive.
With this notation it is also possible to slightly generalize Theorem 4.25
(Problem 4.15):
Theorem 4.28. Suppose X is separable. Then every bounded sequence
`n X has a weak- convergent subsequence.
Example. Let us return to the example after Theorem 4.25. Consider the
Banach
space of bounded continuous functions X = C(R). Using `f () =
R
f dx we can regard L1 (R) as a subspace of X . Then the Dirac measure
centered at 0 is also in X and it is the weak- limit of the sequence uk .
Problem 4.12. Suppose `n ` in X and xn * x in X. Then `n (xn )
`(x).
Similarly, suppose s-lim `n ` and xn x. Then `n (xn ) `(x).
Problem 4.13. Show that xn * x implies Axn * Ax for A L(X).
85
Chapter 5
More on compact
operators
(5.1)
j=1
(5.2)
The numbers sj = sj (K) are called singular values of K. There are either
finitely many singular values or they converge to zero.
Theorem 5.1 (Canonical form of compact operators). Let K be compact
and let sj be the singular values of K and {uj } corresponding orthonormal
eigenvectors of K K. Then
X
K=
sj huj , .ivj ,
(5.3)
j
where vj = s1
j Kuj . The norm of K is given by the largest singular value
kKk = max sj (K).
j
(5.4)
87
88
as required. Furthermore,
hvj , vk i = (sj sk )1 hKuj , Kuk i = (sj sk )1 hK Kuj , uk i = sj s1
k huj , uk i
shows that {vj } are orthonormal. Finally, (5.4) follows from
X
X
kKf k2 = k
sj huj , f ivj k2 =
s2j |huj , f i|2 max sj (K)2 kf k2 ,
j
(5.5)
we see that a compact operator is finite rank if and only if the sum in (5.3)
is finite. Note that the finite rank operators form an ideal in L(H) just as
the compact operators do. Moreover, every finite rank operator is compact
by the HeineBorel theorem (Theorem 1.13).
Lemma 5.2. The closure of the ideal of finite rank operators in L(H) is the
ideal of compact operators.
Proof. Since the limit of compact operators is compact, it remains to show
that every compact operator K can be approximated by finite rank ones.
To this end assume that K is not finite rank and note that
n
X
Kn =
sj huj , .ivj
j=1
89
converges to K as n since
kK Kn k = max sj (K)
jn
by (5.4).
Moreover, this also shows that the adjoint of a compact operator is again
compact.
Corollary 5.3. An operator K is compact (finite rank) if and only K is.
In fact, sj (K) = sj (K ) and
X
sj hvj , .iuj .
(5.6)
K =
j
Proof. First of all note that (5.6) follows from (5.3) since taking adjoints
is continuous and (huj , .ivj ) = hvj , .iuj . The rest is straightforward.
Finally, let me remark that there are a number of other equivalent definitions for compact operators.
Lemma 5.4. For K L(H) the following statements are equivalent:
(i) K is compact.
s
2
N
N
X
X
kAn Kk2 sup
sj |huj , f i|kAn vj k
s2j kAn vj k2 0.
kf k=1
j=1
j=1
90
Example. Let H = `2 (N) and let An be the operator which shifts each
sequence n places to the left and let K = h1 , .i1 , where 1 = (1, 0, . . . ).
Then s-lim An = 0 but kKAn k = 1.
Problem 5.1. Show that Ker(A A) = Ker(A) for any A L(H).
Problem 5.2. Show (5.4).
(5.8)
(5.9)
(5.10)
The two most important cases are p = 1 and p = 2: J2 (H) is the space
of HilbertSchmidt operators and J1 (H) is the space of trace class
operators.
We first prove an alternate definition for the HilbertSchmidt norm.
Lemma 5.5. A bounded operator K is HilbertSchmidt if and only if
X
kKwj k2 <
(5.11)
jJ
X
kKwj k2
1/2
(5.12)
jJ
j>n
j>n
where f =
91
j cj wj .
k,j
k,j
sk (K)2 = kKk22 .
kAKk2 kAkkKk2 .
respectively,
(5.13)
Since HilbertSchmidt operators turn out easy to identify (cf. Section 8.5),
it is important to relate J1 (H) with J2 (H):
Lemma 5.7. An operator is trace class if and only if it can be written as
the product of two HilbertSchmidt operators, K = K1 K2 , and in this case
we have
kKk1 kK1 k2 kK2 k2 .
(5.14)
X
n
kK1 vn k2
kK2 un k2
1/2
= kK1 k2 kK2 k2
92
Lemma 5.8. If K is trace class, then for every orthonormal basis {wn } the
trace
X
tr(K) =
hwn , Kwn i
(5.15)
n
n,m
hK2 vm , wn ihwn , K1 vm i
X
hK2 vm , K1 vm i
m
m,n
hvm , K2 K1 vm i.
Hence the trace is independent of the ONB and we even have tr(K1 K2 ) =
tr(K2 K1 ).
Clearly for self-adjoint trace class operators, the trace is the sum over
all eigenvalues (counted with their multiplicity). To see this, one just has to
choose the orthonormal basis to consist of eigenfunctions. This is even true
for all trace class operators and is known as Lidskij trace theorem (see [19]
for an easy to read introduction).
Problem 5.3. Let H = `2 (N) and let A be multiplication by a sequence
2
a = (aj )
j=1 . Show that A is HilbertSchmidt if and only if a ` (N).
Furthermore, show that kAk2 = kak in this case.
P
Problem 5.4. An operator of the form K : `2 (N) `2 (N), fn 7 jN kn+j fj
is called Hankel operator.
P
Show that K is HilbertSchmidt if and only if jN j|kj |2 <
and this number equals kKk2 .
Show that K is HilbertSchmidt with kKk2 kck1 if |kj | cj ,
where cj is decreasing and summable.
(Hint: For the first item use summation by parts.)
If kj is not square
summable both numbers
are infinite. So assume kj is square summable and
P
2
abbreviate Kn =
|k
j=n j | . Then, by summation by parts,
n
X
j=1
Kj = nKn+1 +
n
X
j|kj |2 .
j=1
So if Kj is P
summable, so is j|kj |2 . Conversely, if j|kj |2 is summable observe
2
nKn+1
j=n+1 j|kj | 0 as n and thus Kj is summable. In
particular, both sums are equal.
93
(5.16)
(5.18)
there is some hope. In fact, we will show that this formula (makes sense
and) holds if K is a compact operator.
Lemma 5.9. Let K C(H) be compact. Then Ker(1 K) is finite dimensional and Ran(1 K) is closed.
Proof. We first show dim Ker(1 K) < . If not we could find an infinite
orthonormal system {uj }
j=1 Ker(1 K). By Kuj = uj compactness of
K implies that there is a convergent subsequence ujk . But this is impossible
by kuj uk k2 = 2 for j 6= k.
To see that Ran(1 K) is closed we first claim that there is a > 0
such that
k(1 K)f k kf k,
f Ker(1 K) .
(5.19)
In fact, if there were no such , we could find a normalized sequence fj
Ker(1 K) with kfj Kfj k < 1j , that is, fj Kfj 0. After passing to a
subsequence we can assume Kfj f by compactness of K. Combining this
with fj Kfj 0 implies fj f and f Kf = 0, that is, f Ker(1 K).
On the other hand, since Ker(1K) is closed, we also have f Ker(1K)
which shows f = 0. This contradicts kf k = lim kfj k = 1 and thus (5.19)
holds.
Now choose a sequence gj Ran(1 K) converging to some g. By
assumption there are fk such that (1 K)fk = gk and we can even assume
fk Ker(1K) by removing the projection onto Ker(1K). Hence (5.19)
shows
kfj fk k 1 k(1 K)(fj fk )k = 1 kgj gk k
that fj converges to some f and (1 K)f = g implies g Ran(1 K).
94
Since
Ran(1 K) = Ker(1 K )
(5.20)
by (2.27) we see that the left and right hand side of (5.18) are at least finite
for compact K and we can try to verify equality.
Theorem 5.10. Suppose K is compact. Then
dim Ker(1 K) = dim Ran(1 K) ,
(5.21)
(5.22)
95
(5.23)
(5.25)
for compact K.
This theory can be generalized to the case of operators where both
Ker(1 K) and Ran(1 K) are finite dimensional. Such operators are
called Fredholm operators (also Noether operators) and the number
ind(1 K) = dim Ker(1 K) dim Ran(1 K)
(5.26)
is the called the index of K. Theorem 5.10 now says that a compact operator is Fredholm of index zero.
Problem 5.5. Compute Ker(1 K) and Ran(1 K) for the operator
K = hv, .iu, where u, v H satisfy hu, vi = 1.
Chapter 6
Bounded linear
operators
x(y + z) = xy + xz,
x, y, z X,
(6.1)
and
(xy)z = x(yz),
C.
(6.2)
and
kxyk kxkkyk.
(6.3)
x X
(6.4)
98
(6.5)
Rn
Rn
(6.6)
In this case y is called the inverse of x and is denoted by x1 . It is straightforward to show that the inverse is unique (if one exists at all) and that
(xy)1 = y 1 x1 .
(6.7)
S xn =
, S + xn = xn+1
xn1 n > 1
(6.8)
(i.e., S shifts each sequence one place right (filling up the first place with a
0) and S + shifts one place left (dropping the first place)). Then S + S = I
but S S + 6= I. So you really need to check both xy = e and yx = e in
general.
Lemma 6.1. Let X be a Banach algebra with identity e. Suppose kxk < 1.
Then e x is invertible and
X
(e x)1 =
xn .
(6.9)
n=0
X
X
X
n
n
(e x)
x =
x
xn = e
n=0
n=0
n=1
respectively
X
n=0
!
xn
(e x) =
X
n=0
xn
xn = e.
n=1
Corollary 6.2. Suppose x is invertible and kx1 yk < 1 or kyx1 k < 1.
Then (x y) is invertible as well and
X
X
1
1 n 1
1
(x y) =
(x y) x
or (x y) =
x1 (yx1 )n . (6.10)
n=0
n=0
In particular, both conditions are satisfied if kyk < kx1 k1 and the set of
invertible elements is open.
99
(6.11)
( 0 )n (x 0 )n1 ,
n=0
1
dist(, (x))
(6.14)
(x )
1 X x n
=
,
n=0
|| > kxk,
100
which implies (x) {| || kxk} is bounded and thus compact. Moreover, taking norms shows
1 X kxkn
1
1
k(x ) k
=
, || > kxk,
n
||
||
|| kxk
n=0
)1
which implies (x
0 as . In particular, if (x) is empty,
then `((x )1 ) is an entire analytic function which vanishes at infinity.
By Liouvilles theorem we must have `((x )1 ) = 0 in this case, and so
(x )1 = 0, which is impossible.
As another simple consequence we obtain:
Theorem 6.4. Suppose X is a Banach algebra in which every element except 0 is invertible. Then X is isomorphic to C.
Proof. Pick x X and (x). Then x is not invertible and hence
x = 0, that is x = . Thus every element is a multiple of the identity.
Theorem 6.5 (Spectral mapping). For every polynomial p and x X we
have
(p(x)) = p((x)).
(6.16)
Proof. Fix 0 C and observe
p(x) p(0 ) = (x 0 )q0 (x).
If p(0 ) 6 (p(x)) we have
(x 0 )1 = q0 (x)((x 0 )q0 (x))1 = ((x 0 )q0 (x))1 q0 (x)
(check this since q0 (x) commutes with (x 0 )q0 (x) it also commutes
with its inverse). Hence 0 6 (x).
Conversely, let 0 (p(x)). Then
p(x) 0 = a(x 1 ) (x n )
and at least one j (x) since otherwise the right-hand side would be
invertible. But then p(j ) = 0 , that is, 0 p((x)).
Next let us look at the convergence radius of the Neumann series for
the resolvent
1 X x n
1
(x ) =
(6.17)
n=0
encountered in the proof of Theorem 6.3 (which is just the Laurent expansion
around infinity).
The number
r(x) = sup ||
(x)
(6.18)
101
(6.19)
nN
(6.20)
`((x )1 ) =
1X 1
`(xn ).
(6.21)
n=0
To end this section let us look at two examples illustrating these ideas.
Example. Let X = L(C2 ) be the space of two by two matrices and consider
0 1
x=
.
(6.22)
0 0
Then x2 = 0 and consequently r(x) = 0. This is not surprising, since x has
the only eigenvalue 0. The same is true for any nilpotent matrix.
Example. Consider the linear Volterra integral operator
Z t
K(x)(t) =
k(t, s)x(s)ds,
x C([0, 1]),
(6.23)
kkkn tn
kxk .
n!
(6.24)
102
Consequently
kK n xk
that is kK n k
kkkn
n! ,
kkkn
kxk ,
n!
which shows
kkk
= 0.
n (n!)1/n
r(K) lim
Hence r(K) = 0 and for every C and every y C(I) the equation
x K x = y
(6.25)
n K n y.
(6.26)
n=0
Problem 6.1. Show that the multiplication in a Banach algebra X is continuous: xn x and yn y imply xn yn xy.
Problem 6.2. Suppose x has both a right inverse y (i.e., xy = e) and a left
inverse z (i.e., zx = e). Show that y = z = x1 .
Problem 6.3. Show (6.24).
Problem 6.4. Show that L1 (Rn ) with convolution as multiplication is a
commutative Banach algebra without identity (Hint: Lemma 8.12).
(x) = x ,
x = x,
(xy) = y x ,
(6.27)
satisfying
kxk2 = kx xk
(6.28)
103
(6.30)
If X is a C algebra, then x X is called normal if x x = xx , selfadjoint if x = x, and unitary if x = x1 . Moreover, x is called positive
if x = y 2 for some y = y X. Clearly both self-adjoint and unitary
elements are normal and positive elements are self-adjoint.
Lemma 6.7. If x X is normal, then kx2 k = kxk2 and r(x) = kxk.
Proof. Using (6.28) three times we have
kx2 k = k(x2 ) (x2 )k1/2 = k(xx ) (xx )k1/2 = kx xk = kxk2
k
xx = x x,
(6.31)
x = x .
(6.32)
Moreover, in this case C (x) is isomorphic to C((x)) (the continuous functions on the spectrum).
Theorem 6.9 (Spectral theorem). If X is a C algebra and x is self-adjoint,
then there is an isometric isomorphism : C((x)) C (x) such that
f (t) = t maps to (t) = x and f (t) = 1 maps to (1) = e.
Moreover, for every f C((x)) we have
(f (x)) = f ((x)),
where f (x) = (f (t)).
(6.33)
104
sup
(p(x))
(x)
u H.
(6.34)
105
We have
u,v1 +v2 = u,v1 + u,v2 ,
u,v = u,v ,
v,u = u,v
(6.36)
f (t)du,v (t),
f B((A)).
(6.37)
(A)
Moreover, if fn (t) f (t) pointwise and supn kfn k is bounded, then fn (A)u
f (A)u for every u H.
Proof. The map is well-defined linear operator by Lemma 2.11 since we
have
Z
f (t)du,v (t) kf k |u,v |((A)) kf k kukkvk
(A)
106
g(A)fn (A)u g(A)f (A)u. This establishes the case where f is bounded
and g is continuous. Similarly, approximating g removes the continuity requirement from g.
The last claim follows since fn f in L2 by dominated convergence in
this case.
In particular, given a self-adjoint operator A we can define the spectral
projections
PA () = (A),
B(R).
(6.38)
They are orthogonal projections, that is P 2 = P and P = P .
Lemma 6.12. Suppose P is an orthogonal projection. Then H decomposes
in an orthogonal sum
H = Ker(P ) Ran(P )
(6.39)
and Ker(P ) = (I P )H, Ran(P ) = P H.
Proof. Clearly, every u H can be written as u = (I P )u + P u and
h(I P )u, P ui = hP (I P )u, ui = h(P P 2 )u, ui = 0
shows H = (IP )HP H. Moreover, P (IP )u = 0 shows (IP )H Ker(P )
and if u Ker(P ) then u = (IP )u (IP )H shows Ker(P ) (IP )H.
In addition, the spectral projections satisfy
PA (R) = I,
PA (
n=1
n )u =
PA (n )u,
n m = , m 6= n. (6.40)
n=1
(6.42)
(6.44)
107
(6.45)
m
X
j PA ({j }),
(6.46)
j=1
108
(f + g) + |f g|
,
2
min{f, g} =
(f + g) |f g|
2
in A.
Now fix f C(K, R). We need to find some f A with kf f k < .
First of all, since A separates points, observe that for given y, z K
there is a function fy,z A such that fy,z (y) = f (y) and fy,z (z) = f (z)
(show this). Next, for every y K there is a neighborhood U (y) such that
fy,z (x) > f (x) ,
x U (y),
and since K is compact, finitely many, say U (y1 ), . . . , U (yj ), cover K. Then
fz = max{fy1 ,z , . . . , fyj ,z } A
and satisfies fz > f by construction. Since fz (z) = f (z) for every z K,
there is a neighborhood V (z) such that
fz (x) < f (x) + ,
x V (z),
109
1 einx ,
2
n Z, form an
Part 2
Real Analysis
Chapter 7
Almost everything
about Lebesgue
integration
114
S
P
A
)
=
(Aj ) if Aj Ak = for all j 6= k (-additivity).
(
j
j=1
j=1
115
Example. Take a set X and = P(X) and set (A) to be the number
of elements of A (respectively, if A is infinite). This is the so-called
counting measure. It will be finite if and only if X is finite and -finite if
and only if X is countable.
Example. Take a set X and = P(X). Fix a point x X and set
(A) = 1 if x A and (A) = 0 else. This is the Dirac measure centered
at x.
Example. Let 1 , 2 be two measures on (X, ) and 1 , 2 0. Then
= 1 1 + 2 2 defined via
(A) = 1 1 (A) + 2 2 (A)
is again a measure. Furthermore,
Pgiven a countable number of measures n
and numbers n 0, then = n n n is again a measure (show this).
Example. Let be a measure on (X, ) and Y X a measurable subset.
Then
(A) = (A Y )
is again a measure on (X, ) (show this).
116
inf
(O)
open
OA,O
(7.1)
sup
(K).
compact
(7.2)
KA,K
((0, x]),
x > 0,
which is right continuous and nondecreasing.
Example. The distribution function of the Dirac measure centered at 0 is
(
1, x < 0,
(x) =
0,
x 0.
117
I = [a, b],
I = (a, b),
or
I = [a, b),
(7.4)
with a b, a, b R{, }. Note that (a, a), [a, a), and (a, a] denote the
empty set, whereas [a, a] denotes the singleton {a}. For any proper interval
with different endpoints (i.e. a < b) we can define its measure to be
m(b)
m(a+),
I
=
(a,
b),
(7.6)
(which agrees with (7.5) except for the case I = (a, a) which would give a
negative value for the empty set if jumps at a). Note that ({a}) = 0 if
and only if m(x) is continuous at a and that there can be only countably
many points with ({a}) > 0 since a nondecreasing function can have at
most countably many jumps. Moreover, observe that the definition of
does not involve the actual value of m at a jump. Hence any function m
118
S
whenever Ik A and I = k Ik A. Since each Ik is a finite union of intervals, we can as well assume each Ik is just one interval (just split Ik into
its subintervals and note that the sum does not change by additivity). Similarly, we can assume that I is just one interval (just treat each subinterval
separately).
By additivity is monotone and hence
n
X
(Ik ) = (
k=1
which shows
n
[
Ik ) (I)
k=1
(Ik ) (I).
k=1
n
n
X
X
[
(Ik ) + .
(Jk )
Jk )
(I) (
k=1
k=1
k=1
n
n
X
X
X
(Ik )
(Ik )
(Jk ) ([x, x] I) .
k=1
k=1
k=1
119
Since the proof of this theorem is rather involved, we defer it to the next
section and look at some examples first.
Example. Suppose (x) = 0 for x < 0 and (x) = 1 for x 0. Then we
obtain the so-called Dirac measure at 0, which is given by (A) = 1 if
0 A and (A) = 0 if 0 6 A.
Example. Suppose (x) = x. Then the associated measure is the ordinary
Lebesgue measure on R. We will abbreviate the Lebesgue measure of a
Borel set A by (A) = |A|.
A set A is called a support for if (X\A) = 0. Note that a
support is not unique (see the examples below). If X is a topological space
and = B(X), one defines the support (also topological support) of
via
supp() = {x X|(O) > 0 for every open neighborhood O of x}. (7.7)
Equivalently one obtains supp() by removing all points which have an
open neighborhood of measure zero. In particular, this shows that supp()
is closed. If X is second countable, then supp() is indeed a support for :
For every point x 6 supp() let Ox be an open neighborhood of measure
zero. These sets cover X\ supp() and by the Lindelof theorem there is a
countable subcover, which shows that X\ supp() has measure zero.
Example. Let X = R, = B. The support of the Lebesgue measure
is all of R. However, every single point has Lebesgue measure zero and so
has every countable union of points (by -additivity). Hence any set whose
complement is countable is a support. There are even uncountable sets of
Lebesgue measure zero (see the Cantor set below) and hence a support might
even lack an uncountable number of points.
The support of the Dirac measure centered at 0 is the single point 0.
Any set containing 0 is a support of the Dirac measure.
In general, the support of a Borel measure on R is given by
supp(d) = {x R|(x ) < (x + ), > 0}.
Here we have used d to emphasize that we are interested in the support
of the measure d which is different from the support of its distribution
function (x).
A property is said to hold -almost everywhere (a.e.) if it holds on a
support for or, equivalently, if the set where it does not hold is contained
in a set of measure zero.
Example. The set of rational numbers is countable and hence has Lebesgue
measure zero, (Q) = 0. So, for example, the characteristic function of the
rationals Q is zero almost everywhere with respect to Lebesgue measure.
120
121
122
the -algebra. In order to show that it holds for every set in (S), it suffices
to show that the collection of sets for which it holds is a Dynkin system.
As an application we show that a premeasure determines the corresponding measure uniquely (if there is one at all):
Theorem 7.5 (Uniqueness of measures). Let S be a collection of sets
which generates and which is closed under finite intersections and contains
a sequence of increasing sets Xn % X of finite measure (Xn ) < . Then
is uniquely determined by the values on S.
Proof. Let
be a second measure and note (X) = limn (Xn ) =
limn
(Xn ) =
(X). We first suppose (X) < .
Then
D = {A |(A) =
(A)}
is a Dynkin system. In fact, by (A0 ) = (X)(A) =
(X)
(A) =
(A0 )
for A D we see that D is closed under complements. Furthermore, by
continuity of measures from below it is also closed under countable disjoint
unions. Since D contains S by assumption, we conclude D = (S) =
from Lemma 7.4. This finishes the finite case.
To extend our result to the general case observe that the finite case
implies (A Xj ) =
(A Xj ) (just restrict ,
to Xj ). Hence
(A) = lim (A Xj ) = lim
(A Xj ) =
(A)
j
nX
o
[
(A) = inf
(An )A
An , An A
(7.8)
n=1
n=1
where the infimum extends over all countable covers from A. Then the
function : P(X) [0, ] is an outer measure; that is, it has the
properties (Problem 7.6)
() = 0,
A1 A2 (A1 ) (A2 ), and
S
P
(
(subadditivity).
n=1 An )
n=1 (An )
Note that (A) = (A) for A A (Problem 7.7).
123
E X
(7.9)
(where A0 = X\A is the complement of A) forms a -algebra and restricted to this -algebra is a measure.
Proof. We first show that is an algebra. It clearly contains X and is
closed under complements. Concerning unions let A, B . Applying
Caratheodorys condition twice shows
(E) = (A B E) + (A0 B E) + (A B 0 E)
+ (A0 B 0 E)
((A B) E) + ((A B)0 E),
where we have used de Morgan and
(A B E) + (A0 B E) + (A B 0 E) ((A B) E)
which follows from subadditivity and (A B) E = (A B E) (A0
B E) (A B 0 E). Since the reverse inequality is just subadditivity, we
conclude that is an algebra.
Next, let An be a sequence of sets from . Without restriction we can
assume that they are disjoint (compare the argument for item (ii) in the
S
S
proof of Theorem 7.1). Abbreviate An = kn Ak , A = n An . Then for
every set E we have
(An E) = (An An E) + (A0n An E)
= (An E) + (An1 E)
= ... =
n
X
(Ak E).
k=1
n
X
k=1
(E)
(Ak E) + (A0 E)
k=1
(A E) + (A0 E) (E)
and we infer that is a -algebra.
(7.10)
124
(Ak A) + (A0 A) =
k=1
(Ak )
k=1
X
X
X
(An ) =
(An A) +
(An A0 ) (E A) + (E A0 )
n=1
n=1
n=1
125
Problem 7.8. Show that the measure constructed in Theorem 7.7 is complete.
Problem 7.9. Let be a finite measure. Show that
d(A, B) = (AB),
AB = (A B)\(A B)
(7.11)
I =
n
Y
(aj , ).
(7.12)
j=1
126
a R.
(7.14)
Again the intervals (a, ] can also be replaced by [a, ], [, a), or [, a].
Hence it is not hard to check that the previous lemma still holds if one
either avoids undefined expressions of the type and 0 or makes
a definite choice, e.g., = 0 and 0 = 0.
Moreover, the set of all measurable functions is closed under all important limiting operations.
Lemma 7.13. Suppose fn : X R is a sequence of measurable functions.
Then
inf fn , sup fn , lim inf fn , lim sup fn
(7.15)
nN
nN
127
(7.16)
(7.17)
x0 X.
x0 X.
and
(O\F ) .
(7.18)
Proof. That is -finite is immediate from the definitions since for any
fixed x0 X the open balls Xn = Bn (x0 ) have finite measure and satisfy
Xn % X.
128
To see that (7.18) holds we begin with the case when is finite. Denote
by A the set of all Borel sets satisfying (7.18). Then A contains every closed
set F : Given F define On = {x X|d(x, F ) < 1/n} and note that On are
open sets which satisfy On & F . Thus by Theorem 7.1 (iii) (On \F ) 0
and hence F A.
Moreover, A is even a -algebra. That it is closed under complements
= X\F and F = X\O are the required sets
is easy to see (note that O
for AS= X\A). To see that it is closed under countable unions consider
A=
n A. Then there are Fn , On such that (On \Fn )
n=1 An with AS
SN
2n1 . Now O =
n=1 On is open and F =
n=1 Fn is closed for any
finite
S N . Since (A) is finite we can choose N sufficiently large such that
\F ) /2. Then weShave found two sets of the required type:
(
N +1 FnP
(O\F )
n=1 (On \Fn ) + ( n=N +1 Fn \F ) . Thus A is a -algebra
containing the open sets, hence it is the entire Borel algebra.
Now suppose is not finite. Pick some x0 X and set X0 = B2/3 (x0 )
and Xn = Bn+2/3 (x0 )\Bn2/3 (x0 ), n N. Let An = A Xn and note that
S
A=
n On Xn such
n=0 An . By the finite case we canSchoose Fn AS
n1
that (On \Fn ) 2
. Now set F = n Fn and O = n On and observe
that F is closed. Indeed, let x F and let xj be some sequence from F
converging to x. Since d(x0 , xj ) d(x0 , x) this sequence must eventually
lie in Fn Fn+1 for P
some fixed n implying x Fn Fn+1 = Fn Fn+1 F .
Finally, (O\F )
n=0 (On \Fn ) as required.
This result immediately gives us outer regularity and, if we strengthen
our assumption, also inner regularity.
Corollary 7.15. Under the assumptions of the previous lemma
(A) =
inf
(O) =
sup
(F )
open
F A,F closed
OA,O
(7.19)
sup
(K).
compact
(7.20)
KA,K
129
Proof. Since G sets are Borel, only the converse direction is nontrivial.
By Lemma
T 7.14 we can find open sets On such that (On \A) 1/n. Now
let G = n On . Then (G\A) (On \A) 1/n for any n and thus
(G\A) = 0 The second claim is analogous.
Problem 7.12. Show directly (without using regularity) that for every > 0
there is an open set O of Lebesgue measure |O| < which covers the rational
numbers.
Problem 7.13. Show that a Borel set A R has Lebesgue measure zero
if and only if for every
P there exists a countable set of Intervals Ij which
cover A and satisfy j |Ij | < .
130
p
X
s(X) = {j }pj=1 ,
j A j ,
Aj = s1 (j ) .
(7.21)
j=1
p
X
j (Aj A).
(7.22)
j=1
,
t
=
j j Aj
j j Bj as in (7.21) and
abbreviate Cjk = (Aj Bk ) A. Then, by (ii),
Z
XZ
X
(s + t)d =
(s + t)d =
(j + k )(Cjk )
A
j,k
Cjk
X Z
j,k
j,k
Z
s d +
Cjk
t d
Cjk
Z
=
Z
s d +
t d.
A
(v) follows
from monotonicity
of . (vi) follows since by (iv) we can write
P
P
s = j j Cj , t = j j Cj where, by assumption, j j .
131
where the supremum is taken over all simple functions s f . Note that,
except for possibly (ii) and (iv), Lemma 7.17 still holds for arbitrary nonnegative functions s, t.
Theorem 7.18 (Monotone convergence, Beppo Levis theorem). Let fn be
a monotone nondecreasing sequence of nonnegative measurable functions,
fn % f . Then
Z
Z
fn d
f d.
(7.24)
R
Proof. By property (vi), A fn d is monotone and converges to some number . By fn f and again (vi) we have
Z
f d.
A
To show the converse, let s be simple such that s f and let (0, 1). Put
An = {x A|fn (x) s(x)} and note An % A (show this). Then
Z
Z
Z
fn d
fn d
s d.
A
An
An
Letting n , we see
Z
s d.
A
Since this is valid for every < 1, it still holds for = 1. Finally, since
s f is arbitrary, the claim follows.
In particular
Z
Z
f d = lim
n A
sn d,
(7.25)
n2
X
k
sn (x) =
1
(x),
2n f (Ak )
k=0
Ak = [
k k+1
,
), An2n = [n, ).
2n 2n
(7.26)
132
(7.27)
Z
g d =
gf d
(7.28)
[
X
X
(
An ) = (
An )f d =
An f d =
(An ).
n=1
n=1
n=1
n=1
The second claim holds for simple functions and hence for all functions by
construction of the integral.
If fn is not necessarily monotone, we have at least
Theorem 7.20 (Fatous lemma). If fn is a sequence of nonnegative measurable function, then
Z
Z
lim inf fn d lim inf
fn d.
(7.29)
A n
Now take the lim inf on both sides and note that by the monotone convergence theorem
Z
Z
Z
Z
lim inf
gn d = lim
gn d =
lim gn d =
lim inf fn d,
n
n A
A n
A n
f + d
f d =
A
(7.30)
133
Similarly, we handle the case where f is complex-valued by calling f integrable if both the real and imaginary part are and setting
Z
Z
Z
f d =
Re(f )d + i
Im(f )d.
(7.31)
A
Clearly f is integrable if and only if |f | is. The set of all integrable functions
is denoted by L1 (X, d).
Lemma 7.21. The integral is linear and Lemma 7.17 holds for integrable
functions s, t.
Furthermore, for all integrable functions f, g we have
Z
Z
|
f d|
|f | d
A
(7.32)
(7.33)
a.e.
(7.34)
f d = 0.
(7.35)
S
Proof. Observe that
we have A = {x|f (x) 6= 0} = n An , where An =
R
{x| |f (x)| > n1 }. If X |f |d = 0 we must have (An ) = 0 for every n and
hence (A) = limn (An ) = 0.
TheRconverse will
R follow from (7.35) since (A) = 0 (with A as before)
implies X |f |d = A |f |d = 0.
Finally, to see (7.35) note that by our convention 0 = 0 it holds for any
simple function and hence for any nonnegative f by definition of the integral
(7.23). Since any function can be written as a linear combination of four
nonnegative functions this also implies the case when f is integrable.
134
Note that the proof also shows that if f is not 0 almost everywhere,
there is an > 0 such that ({x| |f (x)| }) > 0.
In particular, the integral does not change if we restrict the domain of
integration to a support of or if we change f on a set of measure zero. In
particular, functions which are equal a.e. have the same integral.
Finally, our integral is well behaved with respect to limiting operations.
We first state a simple generalization of Fatous lemma.
Lemma 7.23 (generalized Fatou lemma). If fn is a sequence of real-valued
measurable function and g some integrable function. Then
Z
Z
lim inf fn d lim inf
fn d
(7.36)
A n
if g fn and
Z
n
Z
fn d
lim sup
A
lim sup fn d
A
(7.37)
if fn g.
R
Proof. To see the first apply Fatous lemma to fn g and subtract A g d on
both sides of the result. The second follows from the first using lim inf(fn ) =
lim sup fn .
If in the last lemma we even have |fn | g, we can combine both estimates to obtain
Z
Z
Z
Z
lim inf fn d lim inf
fn d lim sup
fn d
lim sup fn d,
A n
(7.38)
which is known as FatouLebesgue theorem. In particular, in the special
case where fn converges we obtain
Theorem 7.24 (Dominated convergence). Let fn be a convergent sequence
of measurable functions and set f = limn fn . Suppose there is an integrable function g such that |fn | g. Then f is integrable and
Z
Z
lim
fn d = f d.
(7.39)
n
Proof. The real and imaginary parts satisfy the same assumptions and
hence it suffices to prove the case where fn and f are real-valued. Moreover,
since lim inf fn = lim sup fn = f equation (7.38) establishes the claim.
Remark: Since sets of measure zero do not contribute to the value of the
integral, it clearly suffices if the requirements of the dominated convergence
theorem are satisfied almost everywhere (with respect to ).
Example. Note that the existence of g is crucial:
The functions fn (x) =
R
1
(x)
on
R
converge
uniformly
to
0
but
f
(x)dx
= 1.
2n [n,n]
R n
135
In fact, the integral can be restricted to any support and hence to {0}.
P
If (x) = n n (x xn ) is a sum of Dirac measures, (x) centered at
x = 0, then (Problem 7.14)
Z
X
n f (xn ).
f (x)d(x) =
R
(x,x+]
(where (x, x + ] has to be understood as (x + , x] if < 0) and
Z
1
lim sup
|f (y) f (x)|dy lim sup sup |f (y) f (x)| = 0
0 (x,x+]
0 y(x,x+]
by the continuity of f at x. Thus F C 1 (a, b) and
F 0 (x) = f (x),
which is a variant of the fundamental theorem of calculus. This tells us
that the integral of a continuous function f can be computed in terms of its
antiderivative and, in particular, all tools from calculus like integration by
parts or integration by substitution are readily available for the Lebesgue
integral on R. Moreover, the Lebesgue integral must coincide with the Riemann integral for (piecewise) continuous functions. More on the connection
with the Riemann integral will be given in Section 7.9. A generalization of
the fundamental theorem of calculus will be given in Theorem 9.25.
Problem 7.14.
P Consider a countable set of measures n and numbers n
0. Let = n n n and show
Z
X Z
n
f d =
f dn
(7.41)
A
136
Problem 7.15. Show that the set B(X) of bounded measurable functions
with the sup norm is a Banach space. Show that the set S(X) of simple
functions is dense in B(X). Show that the integral is a bounded linear
functional on B(X) if (X) < . (Hence Theorem 1.30 could be used to
extend the integral from simple to bounded measurable functions.)
Problem 7.16. Show that the monotone convergence holds for nondecreasing sequences of real-valued measurable functions fn % f provided f1 is
integrable.
Problem 7.17. Show that the dominated convergence theorem implies (under the same assumptions)
Z
lim
|fn f |d = 0.
n
is continuous if there is an integrable function g(y) such that |f (x, y)| g(y).
Problem 7.20. Let X R, Y be some measure space, and f : X Y C.
Suppose y 7 f (x, y) is integrable for all x and x 7 f (x, y) is differentiable
for a.e. y. Show that
Z
F (x) =
f (x, y) d(y)
(7.43)
A
f (x, y)|
is differentiable if there is an integrable function g(y) such that | x
0
F (x) =
f (x, y) d(y)
(7.44)
A x
137
and
(7.45)
are measurable.
Proof. Denote all sets A 1 2 with the property that A1 (x2 ) 1 by
S. Clearly all rectangles are in S and it suffices to show that S is a -algebra.
Now, if A S, then (A0 )1 (x2 ) = (A1 (x2 ))S0 1 and thus
S S is closed under
complements. Similarly, if An S, then ( n An )1 (x2 ) = n (An )1 (x2 ) shows
that S is closed under countable unions.
This implies that if f is a measurable function on X1 X2 , then f (., x2 ) is
measurable on X1 for every x2 and f (x1 , .) is measurable on X2 for every x1
(observe A1 (x2 ) = {x1 |f (x1 , x2 ) B}, where A = {(x1 , x2 )|f (x1 , x2 ) B}).
Given two measures 1 on 1 and 2 on 2 , we now want to construct
the product measure 1 2 on 1 2 such that
1 2 (A1 A2 ) = 1 (A1 )2 (A2 ),
Aj j , j = 1, 2.
(7.46)
Since the rectangles are closed under intersection, Theorem 7.5 implies that
there is at most one measure on 1 2 provided 1 and 2 are -finite.
Theorem 7.26. Let 1 and 2 be two -finite measures on 1 and 2 ,
respectively. Let A 1 2 . Then 2 (A2 (x1 )) and 1 (A1 (x2 )) are measurable and
Z
Z
2 (A2 (x1 ))d1 (x1 ) =
1 (A1 (x2 ))d2 (x2 ).
(7.47)
X1
X2
Proof. As usual, we begin with the case where 1 and 2 are finite. Let
D be the set of all subsets for which our claim holds. Note that D contains
at least all rectangles. Thus it suffices to show that D is a Dynkin system
by Lemma 7.4. To see this, note that measurability and equality of both
integrals follow from A1 (x2 )0 = A01 (x2 ) (implying 1 (A01 (x2 )) = 1 (X1 )
1 (A1 (x2 ))) for complements and from the monotone convergence theorem
for disjoint unions of sets.
138
X1
and the -finite case follows from the monotone convergence theorem.
Hence for given A 1 2 we can define
Z
Z
1 2 (A) =
2 (A2 (x1 ))d1 (x1 ) =
1 (A1 (x2 ))d2 (x2 )
X1
(7.49)
X2
(7.50)
X1
(7.52)
(7.53)
139
follows by applying the monotone convergence theorem (twice for the double
integrals).
For (ii) we can assume that f is real-valued by considering its real and
imaginary parts separately. Moreover, splitting f = f + f into its positive
and negative parts, the claim reduces to (i).
In particular, if f (x1 , x2 ) is either nonnegative or integrable, then the
order of integration can be interchanged. The case of nonnegative functions is also called Tonellis theorem. In the general case the integrability
condition is crucial, as the following example shows.
Example. Let X = [0, 1] [0, 1] with Lebesgue measure and consider
xy
f (x, y) =
.
(x + y)3
Then
Z
1Z 1
Z
f (x, y)dx dy =
Z
f (x, y)dy dx =
1
1
dy =
2
(1 + y)
2
1
1
dx = .
2
(1 + x)
2
(7.54)
140
141
Proof. In fact, it suffices to check this formula for simple functions g, which
follows since A f = f 1 (A) .
Example. Suppose f : X Y and g : Y Z. Then
(g f )? = g? (f? ).
since (g
f )1
f 1
g 1 .
1
| det(M )|
g(y)dn y,
M A+a
(7.59)
142
d y=
|Jf (x)|dn x
f (R)
For < 0 the integrand is nonzero only for z K = f 1 (B0 (y)), where
K is some compact set containing x = f 1 (y). Using the affine change of
coordinates z = x + w we obtain
Z
f (x + w) f (x)
1
dn w,
W (x) = (K x).
h (y) =
W (x)
By
f (x + w) f (x)
1 |w|,
C
C = sup kdf 1 k
K
BC (0)
f (R)
f (R)
n
R |Jf (z)|d z.
143
cos() sin()
T2
=
det
= det
sin() cos()
(, )
f (x)d2 x.
Note that T2 is only bijective when restricted to (0, ) [0, 2). However,
since the set {0} [0, 2) is of measure zero, it does not contribute to the
integral on the left. Similarly, its image T2 ({0} [0, 2)) = {0} does not
contribute to the integral on the right.
Example. We can use the previous example to obtain the transformation
formula for spherical coordinates in Rn by induction. We illustrate the
process for n = 3. To this end let x = (x1 , x2 , x3 ) and start with spherical coordinates in R2 (which are just polar coordinates) for the first two
components:
x = ( cos(), sin(), x3 ),
r [0, ), [0, ].
Note that the range for follows since 0. Moreover, observe that
r2 = 2 + x23 = x21 + x22 + x23 = |x|2 as already anticipated by our notation.
In summary,
x = T3 (r, , ) = (r sin() cos(), r sin() sin(), r cos()).
Furthermore, since T3 is the composition with T2 acting on the first two
coordinates with the last unchanged and polar coordinates P acting on the
first and last coordinate, the chain rule implies
T3
T2
P
det
= det
= r2 sin().
=r sin() det
(r, , )
(, , x3 ) x3 =r cos()
(r, , )
Hence one has
Z
f (x)d3 x.
T3 (U )
144
with
= r cos() sin(1 ) sin(2 ) sin(3 ) sin(n2 ),
= r sin() sin(1 ) sin(2 ) sin(3 ) sin(n2 ),
=
r cos(1 ) sin(2 ) sin(3 ) sin(n2 ),
=
r cos(2 ) sin(3 ) sin(n2 ),
..
.
x1
x2
x3
x4
xn1 =
xn
=
r cos(n3 ) sin(n2 ),
r cos(n2 ).
Tn
= rn1 sin(1 ) sin(2 )2 sin(n2 )n2 .
(r, , 1 , . . . , n2 )
S n1
(7.61)
where Vn = n (B1 (0)) is the volume of the unit ball in Rn , and if g(x) =
g(|x|) is radial we have
Z
Z
n
g(|x|)d x = Sn
g(r)rn1 dr.
(7.62)
Rn
x
)
Proof. Consider the transformation f : Rn [0, ) S n1 , x 7 (|x|, |x|
0
n1
(with |0| = 1). Let d(r) = r
dr and
(7.63)
145
n/2
.
( n2 + 1)
(7.64)
e|x| dn x = n/2 .
Rn
(Hint: Use Fubini to show In = I1n and compute I2 using polar coordinates.)
146
(7.65)
Verify that the integral converges and defines an analytic function in the
indicated half-plane (cf. Problem 7.22). Use integration by parts to show
(z + 1) = z(z),
(1) = 1.
(7.66)
(7.67)
Clearly we have f (x) f (x+) and a strict inequality can occur only at
a countable number of points. By monotonicity the value of f has to lie
between these two values f (x) f (x) f (x+). It will also be convenient
to extend f to a function on the extended reals R {, +}. Again
by monotonicity the limits f () = limx f (x) exist and we will set
f () = .
If we want to define an inverse, problems will occur at points where f
jumps and on intervals where f is constant. Formally speaking, if f jumps,
then the corresponding jump will be missing in the domain of the inverse and
if f is constant, the inverse will be multivalued. For the first case there is a
natural fix by choosing the inverse to be constant along the missing interval.
In particular, observe that this natural choice is independent of the actual
147
value of f at the jump and hence the inverse loses this information. The
second case will result in a jump for the inverse function and here there is no
natural choice for the value at the jump (except that it must be between the
left and right limits such that the inverse is again a nondecreasing function).
To give a precise definition it will be convenient to look at relations
instead of functions. Recall that a (binary) relation R on R is a subset of
R2 .
To every nondecreasing function f associate the relation
(f ) = {(x, y)|y [f (x), f (x+)]}.
(7.68)
(7.70)
(7.71)
148
of the next lemma is to investigate to what extend this remains valid for a
generalized inverse.
Note that for every y there is some x with y [f (x), f (x+)]. Moreover,
if we can find two values, say x1 and x2 , with this property, then f (x) = y
is constant for x (x1 , x2 ). Hence, the set of all such x is an interval which
is closed since at the left, right boundary point the left, right limit equals y,
respectively.
We collect a few simple facts for later use.
Lemma 7.34. Let f be nondecreasing.
(i) f1 (y) x if and only if y f (x+).
(i) f+1 (y) x if and only if y f (x).
(ii) f (f1 (y)) y if f is right continuous at f1 (y) with equality if
y Ran(f ).
(ii) f (f+1 (y)) y if f is left continuous at f+1 (y) with equality if
y Ran(f ).
Proof. Item (i) follows since both claims are equivalent to y f (
x) for
all x
> x. Next, let f be right-continuous. If y is in the range of f ,
then y = f (f1 (y)+) = f (f1 (y)) if f 1 ({y}) contains more than one
point and y = f (f1 (y)) else. If y is not in the range of m we must have
y [f (x), f (x+)] for x = f1 (y) which establishes item (ii). Similarly for
(i) and (ii).
In particular, f (f 1 (y)) = y if f is continuous. We will also need the
set
L(f ) = {y|f 1 ((y, )) = (f+1 (y), )}.
(7.72)
Note that y 6 L(f ) if and only if there is some x such that y [f (x), f (x)).
Lemma 7.35. Let m : R R be a nondecreasing function on R and its
associated measure via (7.5). Let f (x) be a nondecreasing function on R
such that ((0, )) < if f is bounded above and ((, 0)) < if f is
bounded below.
Then f? is a Borel measure whose distribution function coincides up
to a constant with m+ f+1 at every point y which is in L(f ) or satisfies
({f+1 (y)}) = 0. If y [f (x), f (x)) and ({f+1 (y)}) > 0, then m+ f+1
jumps at f (x) and (f? )(y) jumps at f (x).
Proof. First of all note that the assumptions in case f is bounded from
above or below ensure that (f? )(K) < for any compact interval. Moreover, we can assume m = m+ without loss of generality. Now note that we
149
+, 1 y,
1
f+ (y) = 0,
0 y < 1,
, y < 0,
and L(f ) = (, 0)[1, ). Moreover, (f+1 (y)) = [0,) (y) and (f? )(y) =
1
[1,) (y). If we choose g(x) = (0,) (x), then g+
(y) = f+1 (y) and L(g) =
1
R. Hence (g+ (y)) = [0,) (y) = (g? )(y).
For later use it is worth while to single out the following consequence:
Corollary 7.36. Let m, f be as in the previous lemma and denote by ,
the measures associated with m, m f 1 , respectively. Then, (f )? =
and hence
Z
Z
g d(m f 1 ) =
(g f ) dm.
(7.73)
hull(Ran(m))
150
for any Borel function g which is either nonnegative or for which one of the
two integrals is finite. Similarly, if n is left continuous and im is replaced
by m(m1
+ (y)).
R
R
Hence the usual R (g m) d(n m) = Ran(m) g dn only holds if m is
continuous. In fact, the right-hand side looses all point masses of . The
above formula fixes this problem by rendering g constant along a gap in the
range of m and includes the gap in the range of integration such that it
makes up for the lost point mass. It should be compared with the previous
example!
If one does not want to bother with im one can at least get inequalities
for monotone g.
Corollary 7.38. Suppose m, n are nondecreasing functions on R and g is
monotone. Then we have
Z
Z
(g m) d(n m)
g dn
(7.77)
hull(Ran(m))
(7.78)
kP k = max xj xj1
(7.79)
The number
1jn
151
n
X
mj =
inf
Mj =
j=1
sP,f,+ (x) =
n
X
f (x),
(7.80)
f (x),
(7.81)
x[xj1 ,xj )
sup
x[xj1 ,xj )
j=1
Hence sP,f, (x) is a step function approximating f from below and sP,f,+ (x)
is a step function approximating f from above. In particular,
m sP,f, (x) f (x) sP,f,+ (x) M,
x[a,b]
(7.82)
Moreover, we can define the upper and lower Riemann sum associated with
P as
n
n
X
X
L(P, f ) =
mj (xj xj1 ),
U (P, f ) =
Mj (xj xj1 ). (7.83)
j=1
j=1
Of course, L(f, P ) is just the Lebesgue integral of sP,f, and U (f, P ) is the
Lebesgue integral of sP,f,+ . In particular, L(P, f ) approximates the area
under the graph of f from below and U (P, f ) approximates this area from
above.
By the above inequality
m (b a) L(P, f ) U (P, f ) M (b a).
(7.84)
(7.85)
as well as
L(P1 , f ) L(P2 , f ) U (P2 , f ) U (P1 , f ).
Hence we define the lower, upper Riemann integral of f as
Z
Z
f (x)dx = sup L(P, f ),
f (x)dx = inf U (P, f ),
P
(7.86)
(7.87)
respectively. Clearly
Z
m (b a)
Z
f (x)dx
f (x)dx M (b a).
(7.88)
We will call f Riemann integrable if both values coincide and the common
value will be called the Riemann integral of f .
152
Example. Let [a, b] = [0, 1] and f (x) = Q (x). Then sP,f, (x) = 0 and
R
R
sP,f,+ (x) = 1. Hence f (x)dx = 0 and f (x)dx = 1 and f is not Riemann
integrable.
On the other hand, every continuous function f C[a, b] is Riemann
integrable (Problem 7.33).
Lemma 7.39. A function f is Riemann integrable if and only if there exists
a sequence of partitions Pj such that
lim L(Pj , f ) = lim U (Pj , f ).
(7.89)
In this case the above limit equals the Riemann integral of f and Pj can be
chosen such that Pj Pj+1 and kPj k 0.
Proof. If there is such a sequence of partitions then f is integrable by
limj L(Pj , f ) supP L(P, f ) inf P U (P, f ) limj U (Pj , f ).
Conversely, given an integrable f , there is a sequence of partitions PL,j
R
R
such that f (x)dx = limj L(PL,j , f ) and a sequence PU,j such that f (x)dx =
limj U (PU,j , f ). By (7.86) the common refinement Pj = PL,j PU,j is the
partition we are looking for. Since, again by (7.86), any refinement will also
work, the last claim follows.
With the help of this lemma we can give a characterization of Riemann
integrable functions and show that the Riemann integral coincides with the
Lebesgue integral.
Theorem 7.40 (Lebesgue). A bounded measurable function f : [a, b] R is
Riemann integrable if and only if the set of its discontinuities is of Lebesgue
measure zero. Moreover, in this case the Riemann and the Lebesgue integral
of f coincide.
Proof. Suppose f is Riemann integrable and let Pj be a sequence of partitions as in Lemma 7.39. Then sf,Pj , (x) will be monotone and hence converge to some function sf, (x) f (x). Similarly, sf,Pj ,+ (x) will converge to
some function sf,+ (x) f (x). Moreover, by dominated convergence
Z
Z
0 = lim
sf,Pj ,+ (x) sf,Pj , (x) dx =
sf,+ (x) sf, (x) dx
j
and thus by Lemma 7.22 sf,+ (x) = sf, (x) almost everywhere. Moreover,
f is continuous at every x at which equality holds and which is not in any
of the partitions. Since the first as well as the second set have Lebesgue
measure zero, f is continuous almost everywhere and
Z
Z
lim L(Pj , f ) = lim U (Pj , f ) = sf, (x)dx = f (x)dx.
j
153
kP k0
kP k0
Chapter 8
|f | d
1/p
,
1 p,
(8.1)
(8.2)
(8.3)
155
156
But before that let us also define L (X, d). It should be the set of bounded
measurable functions B(X) together with the sup norm. The only problem is
that if we want to identify functions equal almost everywhere, the supremum
is no longer independent of the representative in the equivalence class. The
solution is the essential supremum
kf k = inf{C | ({x| |f (x)| > C}) = 0}.
(8.6)
(8.7)
8.2. Jensen H
older Minkowski
157
lim kf kp = kf k ,
8.2. Jensen H
older Minkowski
As a preparation for proving that Lp is a Banach space, we will need Holders
inequality, which plays a central role in the theory of Lp spaces. In particular, it will imply Minkowskis inequality, which is just the triangle inequality
for Lp . Our proof is based on Jensens inequality and emphasizes the connection with convexity. In fact, the triangle inequality just states that a
norm is convex:
k f + (1 )gk kf k + (1 )kgk,
(0, 1).
(8.8)
(0, 1),
(8.9)
that is, on (x, y) the graph of (x) lies below or on the line connecting
(x, (x)) and (y, (y)):
,
zx
yx
yz
x < z < y,
(8.10)
158
0
monotone nondecreasing. Moreover, exists except at a countable
number of points.
(iii) For fixed x we have (y) (x) + (y x) for every with
0 (x) 0+ (x). The inequality is strict for y 6= x if is
strictly convex.
Proof. Abbreviate D(x, y) = D(y, x) = (y)(x)
and observe that (8.10)
yx
implies
D(x, z) D(y, z)
for x < z < y.
Hence 0 (x) exist and we have 0 (x) 0+ (x) 0 (y) 0+ (y) for x < y.
So (ii) follows after observing that a monotone function can have at most a
countable number of jumps. Next
0+ (x) D(y, x) 0 (y)
shows (y) (x)+0 (x)(y x) if (y x) > 0 and proves (iii). Moreover,
z ) for z < x, y < z proves (i).
0+ (z) |D(y, x)| 0 (
Remark: It is not hard to see that C 1 is convex if and only if 0 (x)
is monotone nondecreasing (e.g., 00 0 if C 2 ).
With these preparations out of the way we can show
Theorem 8.3 (Jensens inequality). Let : (a, b) R be convex (a =
or b = being allowed). Suppose is a finite measure satisfying (X) = 1
and f L1 (X, d) with a < f (x) < b. Then the negative part of f is
integrable and
Z
Z
f d)
(
X
( f ) d.
(8.11)
f d (a, b).
I=
X
8.2. Jensen H
older Minkowski
159
Taking n the claim follows from Xn % X and the monotone convergence theorem.
Observe that if is strictly convex, then equality can only occur if f is
constant.
Now we are ready to prove
Theorem 8.4 (H
olders inequality). Let p and q be dual indices; that is,
1 1
+ =1
(8.12)
p q
with 1 p . If f Lp (X, d) and g Lq (X, d), then f g L1 (X, d)
and
kf gk1 kf kp kgkq .
(8.13)
Proof. The case p = 1, q = (respectively p = , q = 1) follows directly
from the properties of the integral and hence it remains to consider 1 <
p, q < .
First of all it is no restriction to assume kgkq = 1. Let A = {x| |g(x)| >
0}, then (note (1 q)p = q)
Z
p Z
Z
|f | |g|1q |g|q d (|f | |g|1q )p |g|q d =
|f |p d kf kpp ,
kf gkp1 =
A
160
Now let us turn to the case p = . For every > 0 the set A =
{x| |f (x)| kf k } has positive measure. Moreover, considering Xn % X
with (Xn ) < there must be some n such that B R= A Xn satisfies 0 <
(B ) < . Then g = sign(f )B /(B ) satisfies X f g d kf k .
Finally, choosing a sequence of simple functions sn g in L1 finishes the
proof.
Note that in the case p = it suffices to assume that for every set A of
positive measure, there is a subset B A with 0 < (B) < . Moreover,
without this assumption the claim is clearly false since there will be no
simple functions s supported on A with ksk1 = 1.
As another consequence of Holders inequality we also get
Theorem 8.6 (Minkowskis integral inequality). Suppose, and are two
-finite measures and f is measurable. Let 1 p . Then
Z
Z
f (., y)d(y)
kf (., y)kp d(y),
(8.14)
Y
xk k
n
X
k=1
k xk ,
if
n
X
k = 1,
(8.16)
k=1
for k > 0, xk > 0. (Hint: Take a sum of Dirac-measures and use that the
exponential function is convex.)
161
(8.18)
j=1
kf kp0 (X) p0
p1
kf kp ,
1 p0 p.
(Hint: H
olders inequality.)
|gk (x)|
k=1
k=1
k=1
X
gn (x) = lim fn (x)
n=1
is absolutely convergent for those x. Now let f (x) be this limit. Since
|f (x) fn (x)|p converges to zero almost everywhere and |f (x) fn (x)|p
(2G(x))p L1 , dominated convergence shows kf fn kp 0.
In the case p = note that the Cauchy sequence property |fn (x)
fm (x)| < for n, m > N holds except for sets Am,n of measure zero. Since
162
S
A = n,m An,m is again of measure zero, we see that fn (x) is a Cauchy
sequence for x X\A. The pointwise limit f (x) = limn fn (x), x X\A,
is the required limit in L (X, d) (show this).
In particular, in the proof of the last theorem we have seen:
Corollary 8.9. If kfn f kp 0, then there is a subsequence (of representatives) which converges pointwise almost everywhere.
Consequently, if fn Lp0 Lp1 converges in both Lp0 and Lp1 , then the
limits will be equal a.e. Be warned that the statement is not true in general
without passing to a subsequence (Problem 8.8).
It even turns out that Lp is separable.
Lemma 8.10. Suppose X is a second countable topological space (i.e., it has
a countable basis) and is an outer regular Borel measure. Then Lp (X, d),
1 p < , is separable. In particular, for every countable base which is
closed under finite unions the set of characteristic functions O (x) with O
in this base is total.
Proof. The set of all characteristic functions A (x) with A and (A) <
is total by construction of the integral (Problem 8.10)). Now our strategy
is as follows: Using outer regularity, we can restrict A to open sets and using
the existence of a countable base, we can restrict A to open sets from this
base.
Fix A. By outer regularity, there is a decreasing sequence of open sets
On A such that (On ) (A). Since (A) < , it is no restriction to
assume (On ) < , and thus kA On kp = (On \A) = (On )(A) 0.
Thus the set of all characteristic functions O (x) with O open and (O) <
is total. Finally let B be a countable
S base for the topology. Then, every
open set O can be written as O =
j=1 Oj with Oj B. Moreover, by
consideringS
the set of all finite unions of elements from B, it is no restriction
j B. Hence there is an increasing sequence O
n % O
to assume nj=1 O
163
O\K
Clearly this result has to fail in the case p = (in general) since the
uniform limit of continuous functions is again continuous. In fact, the closure
of Cc (Rn ) in the infinity norm is the space C0 (Rn ) of continuous functions
vanishing at (Problem 1.36).
If X is some subset of Rn , we can do even better and approximate
integrable functions by smooth functions. The idea is to replace the value
f (x) by a suitable average computed from the values in a neighborhood.
This is done by choosing a nonnegative bump function , whose area is
normalized to 1, and considering the convolution
Z
Z
( f )(x) =
(x y)f (y)dy =
(y)f (x y)dy.
(8.19)
Rn
Rn
|Br (0)|1 Br (0)
164
(8.20)
in this case.
(ii) Suppose Cck (Rn ) and f L1loc (Rn ), then f C k (Rn ) and
( f ) = ( ) f
(8.21)
(8.22)
Rn
where we have use Jensens inequality with (x) = |x|p , d = ||dn y in the
first and Fubini in the second step.
Next we turn to approximation of f . To this end we call a family of
integrable functions , (0, 1], an approximate identity if it satisfies
the following three requirements:
(i) k k1 C for all > 0.
R
(ii) Rn (x)dn x = 1 for all > 0.
(iii) For every r > 0 we have lim0
R
|x|r
(x)dn x = 0.
165
In fact, (i), (ii) are obvious from k k1 = 1 and the integral in (iii) will be
identically zero for rs , where s is chosen such that supp Bs (0).
Now we are ready to show that an approximate identity deserves its
name.
Lemma 8.13. Let be an approximate identity. If f Lp (Rn ) with
1 p < . Then
lim f = f
(8.23)
0
Lp .
Proof. We begin with the case where f Cc (Rn ). Fix some small > 0.
Since f is uniformly continuous we know |f (x y) f (x)| 0 as y 0
uniformly in x. Since the support of f is compact, this remains true when
taking the Lp norm and thus we can find some r such that
,
|y| r.
2C
(Here the C is the one for which k k1 C holds.) Now we use
Z
( f )(x) f (x) =
(y)(f (x y) f (x))dn y
kf (. y) f (.)kp
Rn
166
| (y)|kf (. y) f (.)kp dn y
2
|y|r
and
Z
n
(y)(f
(x
y)
f
(x))d
y
|y|>r
p
Z
2kf kp
| (y)|dn y
2
|y|>r
provided is sufficiently small such that the integral in (iii) is less than /2.
This establishes the claim for f Cc (Rn ). Since these functions are
dense in Lp for 1 p < and in C0 (Rn ) for p = the claim follows from
Lemma 4.26 and Youngs inequality.
Note that in case of a mollifier with support in Br (0) this result implies
a corresponding local version since the value of ( f )(x) is only affected
by the values of f on Br (x). The question when the pointwise limit exists
will be addressed in Problem 8.13.
Now we are ready to prove
Theorem 8.14. If X Rn is open and is a regular Borel measure, then
the set Cc (X) of all smooth functions with compact support is dense in
Lp (X, d), 1 p < .
Proof. By Theorem 8.11 it suffices to show that every continuous function
f (x) with compact support can be approximated by smooth ones. By setting
f (x) = 0 for x 6 X, it is no restriction to assume X = Rn . Now choose
a mollifier and observe that f has compact support (since f has).
Moreover, f f in Lp by the previous lemma.
Lemma 8.15. Suppose f L1loc (Rn ). Then
Z
(x)f (x)dn x = 0,
Cc (Rn ),
(8.24)
Rn
167
g f d n x =
g f dn x =
Problem 8.10. Show that for any f Lp (X, d), 1 p < there exists
a sequence of integrable simple functions sn such that |sn | |f | and sn f
in Lp (X, d). Show that for p = this only holds if the measure is finite.
(Hint: Split f into the sum of four nonnegative functions and use (7.26).)
R
Problem 8.11. Let be integrable and normalized such that Rn (x)dn x =
1. Show that (x) = n ( x ) is an approximate identity.
Problem 8.12. Show that the Poisson kernel
P (x) =
1
2
x + 2
is an approximate identity on R.
Show that the Cauchy transform
Z
1
f ()
F (z) =
d
R z
of a function f Lp (R) is analytic in the upper half-plane with imaginary
part given by
Im(F (x + iy)) = (Py f )(x).
In particular, by Youngs inequality kIm(F (. + iy))kp kf kp and thus
supy>0 kIm(F (. + iy))kp = kf kp . Such analytic functions are said to be
in the Hardy space H p (C+ ).
(Hint: To see analyticity of F use Problem 7.22 plus the estimate
1
1
1 + |z|
z 1 + || |Im(z)| .)
Problem R8.13. Let be bounded with support in B1 (0) and normalized
such that Rn (x)dn x = 1. Set (x) = n ( x ).
For f locally integrable show
Vn kk
|( f )(x) f (x)|
|B (x)|
Z
|f (y) f (x)|dy.
B (x)
(8.25)
168
(8.26)
for -almost every x, respectively, for -almost every y. Then the operator
K : Lp (Y, d) Lp (X, d), defined by
Z
(Kf )(x) =
K(x, y)f (y)d(y),
(8.27)
Y
1/q Z
1/p
p
K1 (x, y) d(y)
|K2 (x, y)f (y)| d(y)
Z
Z
C1
1/p
|K2 (x, y)f (y)| d(y)
p
for a.e. x (if K2 (x, .)f (.) 6 Lp (X, d), the inequality is trivially true). Now
take this inequality to the pth power and integrate with respect to x using
Fubini
p
Z Z
Z Z
|K2 (x, y)f (y)|p d(y)d(x)
|K(x, y)f (y)|d(y) d(x) C1p
X
Y
X Y
Z Z
p
= C1
|K2 (x, y)f (y)|p d(x)d(y) C1p C2p kf kpp .
Y X
R
Hence Y |K(x, y)f (y)|d(y) Lp (X, d) and, in particular, it is finite for
-almost
every x. Thus K(x, .)f (.) is integrable for -almost every x and
R
Y K(x, y)f (y)d(y) is measurable.
Note that the assumptions are, for example, satisfied if kK(x, .)kL1 (Y,d)
C and kK(., y)kL1 (X,d) C which follows by choosing K1 (x, y) = |K(x, y)|1/q
and K2 (x, y) = |K(x, y)|1/p . For related results see also Problems 8.15 and
12.3.
169
where K(x, y)
Schmidt operator.
jJ
Z
2
K(x, y)uj (y)d(y) d(x)
X
2
Z X Z
=
K(x, y)uj (y)d(y) d(x)
X j
X
Z Z
=
|K(x, y)|2 d(x)d(y)
X
as claimed.
Hence in combination with Lemma 5.5 this shows that our definition for
integral operators agrees with our previous definition from Section 5.2. In
particular, this gives us an easy to check test for compactness of an integral
operator.
Example. Let [a, b] be some compact interval and suppose K(x, y) is
bounded. Then the corresponding integral operator in L2 (a, b) is Hilbert
Schmidt and thus compact. This generalizes Lemma 3.4.
Problem 8.14. Suppose (Y, d) = (X, d) with X Rn . Show that the
integral operator with kernel K(x, y) = k(x y) is bounded in Lp (X, d) if
k L1 (X, d) with kKk kkk1 .
Problem 8.15 (Schur test). Let K(x, y) be given and suppose there are
positive measurable functions a(x) and b(y) such that
kK(x, .)b(.)kL1 (Y,d) C1 a(x),
170
Chapter 9
171
172
L2 (X, d)
hg d.
X
By construction
Z
(A) =
Z
A d =
Z
A g d =
g d.
(9.3)
173
then (N ) = 0 and
Z
XZ
X
X
f d,
fn d = (A N ) +
(A Nn ) +
n (A) =
(A) =
n
174
Problem 9.3. Suppose and are inner regular measures. Show that
if and only if (C) = 0 implies (C) = 0 for every compact set.
Problem 9.4. Suppose (A) C(A) for all A . Then d = f d with
0 f C a.e.
Problem 9.5. Let d = f d. Suppose f > 0 a.e. with respect to . Then
and d = f 1 d.
Problem 9.6 (Chain rule). Show that is a transitive relation. In
particular, if , show that
d d
d
=
.
d
d d
Problem 9.7. Suppose . Show that for every measure we have
d
d
d =
d + d,
d
d
where is a positive measure (depending on ) which is singular with respect
to . Show that = 0 if and only if .
(B (x))
|B (x)|
(9.5)
((x + , x ))
(x + ) (x )
= lim
= 0 (x).
0
2
2
(B (x))
|B (x)|
and
(B (x))
.
|B (x)|
(9.6)
175
(D)(x) = lim
sup
(9.7)
`=1
Proof. Assume that the balls Bj are ordered by decreasing radius. Start
with Bj1 = B1 and remove all balls from our list which intersect Bj1 . Observe that the removed balls are all contained in B3r1 (x1 ). Proceeding like
this, we obtain the required subset.
The upshot of this lemma is that we can select a disjoint subset of balls
which still controls the Lebesgue volume of the original set up to a universal
constant 3n (recall |B3r (x)| = 3n |Br (x)|).
Now we can show
Lemma 9.5. Let > 0. For every Borel set A we have
|{x A | (D)(x) > }| 3n
(A)
(9.9)
and
|{x A | (D)(x) > 0}| = 0, whenever (A) = 0.
(9.10)
(O)
for every open set O with A O and every compact set K A . The
first claim then follows from outer regularity of and inner regularity of the
Lebesgue measure.
Given fixed K, O, for every x K there is some rx such that Brx (x) O
and |Brx (x)| < 1 (Brx (x)). Since K is compact, we can choose a finite
176
subcover of K from these balls. Moreover, by Lemma 9.4 we can refine our
set of balls such that
|K| 3n
k
X
i=1
k
3n X
(O)
(Bri (xi )) 3n
.
i=1
where d =
|h|dn x.
r0
This implies
and using the first part of Lemma 9.5 plus |{x | |h(x)| }| 1 khk1
(Problem 9.11), we see
r0
Since is arbitrary, the Lebesgue measure of this set must be zero for every
. That is, the set where the lim sup is positive has Lebesgue measure
zero.
Note that the balls can be replaced by more general sets: A sequence of
sets Aj (x) is said to shrink to x nicely if there are balls Brj (x) with rj 0
and a constant > 0 such that Aj (x) Brj (x) and |Aj | |Brj (x)|. For
example, Aj (x) could be some balls or cubes (not necessarily containing x).
However, the portion of Brj (x) which they occupy must not go to zero! For
example, the rectangles (0, 1j ) (0, 2j ) R2 do shrink nicely to 0, but the
rectangles (0, 1j ) (0, j22 ) do not.
177
Corollary 9.8. Let be a Borel measure on R which is absolutely continuous with respect to Lebesgue measure. Then its distribution function is
differentiable a.e. and d(x) = 0 (x)dx.
Proof. By assumption d(x) = f (x)dx for some locally
R x integrable function
f . In particular, the distribution function (x) = 0 f (y)dy is continuous.
Moreover, since the sets (x, x + r) shrink nicely to x as r 0, Lemma 9.7
implies
((x, x + r))
(x + r) (x)
lim
= lim
= f (x)
r0
r0
r
r
at every Lebesgue point of f . Since the same is true for the sets (x r, x),
(x) is differentiable at every Lebesgue point and 0 (x) = f (x).
As another consequence we obtain
Theorem 9.9. Let be a Borel measure on Rn . The derivative D exists
a.e. with respect to Lebesgue measure and equals the RadonNikodym derivative of the absolutely continuous part of with respect to Lebesgue measure;
that is,
Z
(D)(x)dn x.
ac (A) =
(9.13)
178
Using the upper and lower derivatives, we can also give supports for the
absolutely and singularly continuous parts.
Theorem 9.10. The set {x|0 < (D)(x) < } is a support for the absolutely continuous and {x|(D)(x) = } is a support for the singular part.
Proof. The first part is immediate from the previous theorem. For the
second part first note that by (D)(x) (Dsing )(x) we can assume that
is purely singular. It suffices to show that the set Ak = {x | (D)(x) < k}
satisfies (Ak ) = 0 for every k N.
Let K Ak be compact, and let Vj K be some open set such that
|Vj \K| 1j . For every x K there is some = (x) such that B (x) Vj
and (B3 (x)) k|B3 (x)|. By compactness, finitely many of these balls
cover K and hence
X
(K)
(Bi (xi )).
i
179
= 0. Set M = {x M0 | <
1
1
ac (M ) ac (M0 ) = 0
0 x 31 ,
2 f (3x),
1
2
K(f )(x) = 12 f (3x),
3 x 3,
1
2
2 (1 + f (3x 2)),
3 x 1.
Since kfn+1 fn k 12 kfn+1 fn k we can define the Cantor function
as f = limn fn . By construction f is a continuous function which is
constant on every subinterval of [0, 1]\C. Since C is of Lebesgue measure
zero, this set is of full Lebesgue measure and hence f 0 = 0 a.e. in [0, 1]. In
particular, the corresponding measure, the Cantor measure, is supported
on C and is purely singular with respect to Lebesgue measure.
Problem 9.8. Show that
(B (x)) lim inf (B (y)) lim sup (B (y)) (B (x)).
yx
yx
K(g)(y)| 32 M |x y| .)
180
[
X
(
An ) =
(An ),
An Am = , n 6= m.
(9.14)
n=1
n=1
k=1
(9.16)
181
Let > 0 be fixed and for each An choose a disjoint partition Bn,k oaf
An such that
m
X
||(An )
|(Bn,k )| + n .
2
k=1
Then
N
X
||(An )
n=1
since
P
SN
N X
m
X
|(Bn,k )| + ||(
n=1 k=1
n=1
Sm
k=1 Bn,k
N
[
An ) + ||(A) +
n=1
SN
n=1 An .
n=1 ||(An ).
m X
X
X
X
|(Bk )| =
(Bk An )
|(Bk An )|
k=1 n=1
k=1
X
m
X
k=1 n=1
|(Bk An )|
||(An ).
n=1
n=1 k=1
n=1 ||(An ).
X
(Bn )
n=1
diverges (the terms of a convergent series mustSconverge to zero). But additivity requires that the sum converges to ( n Bn ), a contradiction.
It remains to show existence of this splitting. Let A with ||(A) =
be given. Then there are disjoint sets Aj such that
n
X
|(Aj )| 2 + |(A)|.
j=1
S
S
Now let A+ = {Aj |(Aj ) 0} and A = A\A+ = {Aj |(Aj ) < 0}.
Then the above inequality reads (A+ ) + |(A )| 2 + |(A+ ) |(A )||
implying (show this) that for both of them we have |(A )| 1 and by
||(A) = ||(A+ ) + ||(A ) either A+ or A must have infinite || measure.
182
Note that this implies that every complex measure can be written as
a linear combination of four positive measures. In fact, first we can split
into its real and imaginary part
= r + ii ,
(9.17)
Second we can split every real (also called signed) measure according to
||(A) (A)
.
(9.18)
2
By (9.16) both and + are positive measures. This splitting is also known
as Jordan decomposition of a signed measure.
= + ,
(A) =
183
finite sums, (9.14) holds for finitely many sets. Hence it remains to show
(An ) (A). This follows from
|(An ) (A)| |(An ) k (An )| + |k (An ) k (A)| + |k (A) (A)|
2Ck + |k (An ) k (A)|.
Finally, k since |k (A) (A)| Ck implies kk k 4Ck (Problem 9.17).
If is a positive and a complex measure we say that is absolutely
continuous with respect to if (A) = 0 implies (A) = 0.
Lemma 9.15. If is a positive and a complex measure then if
and only if || .
Proof. If , then (A) = 0 implies (B) = 0 for every B A
and hence ||(A) = 0. Conversely, if || , then (A) = 0 implies
|(A)| ||(A) = 0.
Now we can prove the complex version of the RadonNikodym theorem:
Theorem 9.16 (Complex RadonNikodym). Let (X, ) be a measurable
space, a positive -finite measure and a complex measure which is absolutely continuous with respect to . Then there is a unique f L1 (X, d)
such that
Z
(A) =
f d.
(9.19)
184
S
Proof. If An are disjoint sets and A = n An we have
Z
XZ
X
X Z
|f |d.
|f |d =
f d
|(An )| =
n
Hence ||(A)
A |f |d.
An
An
k1
arg(f (x)) +
k
<
},
n
2
n
Then the simple functions
Ank = {x A|
sn (x) =
n
X
e2i
k1
+i
n
1 k n.
Ank (x)
k=1
shows
A |f |d
k=1
An
k
k=1
||(A).
185
(9.23)
|(A)| < ,
A .
(9.24)
= 2n
. Then, if (A) < we have
B,BA
186
sup
V (P, f )
(9.26)
partitions P of [a, b]
is called the total variation of f over [a, b]. If the total variation is finite,
f is called of bounded variation. Since we clearly have
Vab (f ) = ||Vab (f ),
(9.27)
the space BV [a, b] of all functions of finite total variation is a vector space.
However, the total variation is not a norm since (consider the partition
P = {a, x, b})
Vab (f ) = 0 f (x) c.
(9.28)
Moreover, any function of bounded variation is in particular bounded (consider again the partition P = {a, x, b})
sup |f (x)| |f (a)| + Vab (f ).
(9.29)
x[a,b]
c [a, b],
(9.30)
(9.31)
187
f (x) =
1 x
(V (f ) f (x)) ,
2 a
(9.32)
(9.33)
188
a < b we have
Vab () =
sup
V (P, )
sup
n
X
for every countable collection of pairwise disjoint intervals (xk , yk ) [a, b].
The set of all absolutely continuous functions on [a, b] will be denoted by
AC[a, b]. The special choice of just one interval shows that every absolutely
continuous function is (uniformly) continuous, AC[a, b] C[a, b].
Example. Every Lipschitz continuous function is absolutely continuous. In
fact, if |f (x) f (y)| L|x y| for x, y [a, b], then we can choose = L . In
particular, C 1 [a, b] AC[a, b]. Note that Holder continuity is not sufficient
(cf. Problem 9.21 and Theorem 9.25 below).
Theorem 9.24. A complex Borel measure on R is absolutely continuous with respect to Lebesgue measure if and only if its distribution function
is locally absolutely continuous (i.e., absolutely continuous on every compact sub-interval). Moreover, in this case the distribution function (x) is
differentiable almost everywhere and
Z x
(x) = (0) +
0 (y)dy
(9.35)
0
with
integrable,
R
R
| 0 (y)|dy
= ||(R).
189
as required.
The rest follows from Corollary 9.8.
Vax (f ) =
|g(y)|dy.
(9.37)
Proof. This is just a reformulation of the previous result. To see the last
claim combine the last part of Theorem 9.23 with Lemma 9.17.
In particular, since the fundamental theorem of calculus fails for the
Cantor function, this function is an example of a continuous which is not
absolutely continuous. Note that even if f is differentiable everywhere it
might fail the fundamental theorem of calculus (Problem 9.28).
Finally, we note that in this case the integration by parts formula
continues to hold.
Lemma 9.26. Let f, g BV [a, b], then
Z
Z
f (x)dg(x) = f (b)g(b) f (a)g(a)
[a,b)
as well as
Z
Z
f (x+)dg(x) = f (b+)g(b+) f (a+)g(a+)
(a,b]
[a,b)
[a,b)
190
(y,b)
[a,b)
Problem 9.18. Compute Vab (f ) for f (x) = sign(x) on [a, b] = [1, 1].
Problem 9.19. Show (9.30).
Problem 9.20. Consider fj (x) = xj cos(/x) for j N. Show that fj
C[0, 1] if we set fj (0) = 0. Show that fj is of bounded variation for j 2
but not for j = 1.
P
Moreover, show
Vab (Re(f )) Vab (f ),
191
Z
F (x) =
f (x, y) d(y)
(9.41)
f (x, y) d(y)
x
(9.42)
Z
A
|f (x) f (y)| kf 0 kp |x y|
Show that the functions f (x) = log(x)1 is absolutely continuous but not
H
older continuous on [0, 21 ].
Problem 9.28. Consider f (x) = x2 sin( x2 ) on [0, 1] (here f (0) = 0). Show
that f is differentiable everywhere and compute its derivative. Show that
its derivative is not integrable. In particular, this function is not absolutely
continuous and the fundamental theorem of calculus does not hold for this
function.
Problem 9.29. Show that the function from the previous problem is H
older
continuous of exponent 21 . (Hint: Consider 0 < x < y. There is an x0 < y
Chapter 10
The dual of Lp
X
X
X
(A) = `(
A j ) =
`(Aj ) =
(Aj ).
j=1
j=1
j=1
193
194
for every simple function f . Next let An = {x||g| < n}, then gn = gAn Lq
and by Lemma 8.5 we conclude kgn kq k`k. Letting n shows g Lq
and finishes the proof for finite .
If is -finite, let Xn % X with (Xn ) < . Then for every n there is
some gn on Xn and by uniqueness of gn we must have gn = gm on Xn Xm .
Hence there is some g and by kgn k k`k independent of n, we have g
Lq .
Corollary 10.2. Let be some -finite measure. Then Lp (X, d) is reflexive for 1 < p < .
Proof. Identify Lp (X, d) with Lq (X, d) and choose h Lp (X, d) .
Then there is some f Lp (X, d) such that
Z
h(g) = g(x)f (x)d(x),
g Lq (X, d)
= Lp (X, d) .
But this implies h(g) = g(f ), that is, h = J(f ), and thus J is surjective.
Note that in the case 0 < p < 1, where Lp fails to be a Banach space,
the dual might even be empty (see Problem 10.1)!
Problem 10.1. Show that Lp (0, 1) is a quasinormed space if 0 < p < 1 (cf.
Problem 1.17). Moreover, show that Lp (0, 1) = {0} in this case. (Hint:
Suppose there were a nontrivial ` Lp (0, 1) . Start with f0 Lp such that
|`(f0 )| 1. Set g0 = (0,s] f and h0 = (s,1] f , where s (0, 1) is chosen
such that kg0 kp = kh0 kp = 21/p kf0 kp . Then |`(g0 )| 12 or |`(h0 )| 21 and
we set f1 = 2g0 in the first case and f1 = 2h0 else. Iterating this procedure
gives a sequence fn with |`(fn )| 1 and kfn kp = 2nn/p kf0 kp .)
(10.2)
195
is a bounded linear functional on B(X) (the Banach space of bounded measurable functions) with norm
k` k = ||(X)
(10.3)
n
[
k=1
Ak ) =
n
X
Aj Ak = , j 6= k.
(Ak ),
(10.4)
k=1
Given a content
P we can define the corresponding integral for simple functions s(x) = nk=1 k Ak as usual
Z
n
X
s d =
k (Ak A).
(10.5)
A
k=1
As in the proof of Lemma 7.17 one shows that the integral is linear. Moreover,
Z
|
s d| ||(A) ksk ,
(10.6)
A
where ||(A) is defined as in (9.15) and the same proof as in Theorem 9.12
shows that || is a content. However, since we do not require -additivity,
it is not clear that ||(X) is finite. Hence we will call finite if ||(X) < .
Moreover, for a finite content this integral can be extended to all of B(X)
such that
Z
f d| ||(X) kf k
(10.7)
|
X
by Theorem 1.30 (compare Problem 7.15). However, note that our convergence theorems (monotone convergence, dominated convergence) will no
longer hold in this case (unless happens to be a measure).
In particular, every complex content gives rise to a bounded linear functional on B(X) and the converse also holds:
Theorem 10.3. Every bounded linear functional ` B(X) is of the form
Z
`(f ) =
f d
(10.8)
X
196
n
X
n
X
k Ak ) =
k=1
k=1
k (Ak ) =
n
X
|(Ak )|,
k = sign((Ak )),
k=1
[0, 1],
(10.9)
for f in the subspace of bounded measurable functions which have left and
right limits at 0. Since k`k = 1 we can extend it to all of B(R) using the
HahnBanach theorem. Then the corresponding content is no measure:
= ([1, 0)) = (
X
1
1
1
1
[ ,
)) 6=
([ ,
)) = 0. (10.10)
n n+1
n n+1
n=1
n=1
197
xn0
fn (x) = f (xn0 )[xn0 ,xn1 ) + f (xn1 )[xn1 ,xn2 ) + + f (xnn1 )[xnn1 ,xnn ] .
converges uniformly to f by continuity of f (and the fact that f vanishes as
x in the case I = R). Moreover,
Z
Z
n
X
f d = lim
fn d = lim
f (xnk1 )((xnk ) (xnk1 ))
n I
= lim
n
X
k=1
(xnk )
f (xnk1 )(
(xnk1 ))
Z
= lim
k=1
n I
fn d
Z
=
f d
= `(f )
I
provided the points xnk are chosen to stay away from all discontinuities of
(x) (recall that there are at most countably many).
To see k`k = ||(I) recall d = hd|| where |h| = 1 (Corollary 9.18).
n =
Now choose continuous functions hn (x) h(x) pointwise a.e. Using h
n | 1. Hence
max(1, |hRn |) sign(hn ) we
get such a sequence with |h
R even
`(hn ) = hn h d|| |h| d|| = ||(I) implying k`k ||(I). The converse follows from (10.7).
Note that will be a positive measure if ` is a positive functional,
that is, `(f ) 0 whenever f 0.
Problem 10.2. Let M be a closed subspace of a Banach space X. Show
that (X/M )
= {` X |M Ker(`)}.
Problem 10.3 (Vague convergence of measures). Let I be a compact interval. A sequence of measures n is said to converge vaguely to a measure
if
Z
Z
f dn
I
f d,
f C(I).
(10.12)
198
Chapter 11
1
(2)n/2
eipx f (x)dn x.
(11.1)
Rn
(11.2)
1
(2)n/2
Z
Rn
|eipx f (x)|dn x =
1
(2)n/2
|f (x)|dn x.
Rn
Moreover, a straightforward application of the dominated convergence theorem shows that f is continuous.
Note that if f is nonnegative we have equality: kfk = (2)n/2 kf k1 =
f (0).
The following simple properties are left as an exercise.
199
200
(11.3)
aR ,
(11.4)
> 0,
(11.5)
(e
ixa
a Rn ,
(11.6)
(11.7)
Similarly, if f (x), xj f (x) L1 (Rn ) for some 1 j n, then f(p) is differentiable with respect to pj and
(xj f (x)) (p) = ij f(p).
(11.8)
(j f ) (p) =
f (x)dn x
eipx
xj
(2)n/2 Rn
Z
1
ipx
e
f (x)dn x
=
xj
(2)n/2 Rn
Z
1
=
ipj eipx f (x)dn x = ipj f(p).
(2)n/2 Rn
Similarly, the second formula follows from
Z
1
xj eipx f (x)dn x
(xj f (x)) (p) =
(2)n/2 Rn
Z
1
ipx
=
i
e
f (x)dn x = i
f (p),
n/2
pj
pj
(2)
Rn
where interchanging the derivative and integral is permissible by Problem 7.20. In particular, f(p) is differentiable.
This result immediately extends to higher derivatives. To this end let
C (Rn ) be the set of all complex-valued functions which have partial derivatives of arbitrary order. For f C (Rn ) and Nn0 we set
f =
|| f
,
xnn
x1 1
x = x1 1 xnn ,
|| = 1 + + n .
(11.9)
201
(11.10)
(11.11)
Proof. The formulas are immediate from the previous lemma. To see that
f S(Rn ) if f S(Rn ), we begin with the observation that f is bounded
by (11.2). But then p ( f)(p) = i|||| ( x f (x)) (p) is bounded since
x f (x) S(Rn ) if f S(Rn ).
Hence we will sometimes write pf (x) for if (x), where = (1 , . . . , n )
is the gradient. Roughly speaking this lemma shows that the decay of
a functions is related to the smoothness of its Fourier transform and the
smoothness of a functions is related to the decay of its Fourier transform.
In particular, this allows us to conclude that the Fourier transform of
an integrable function will vanish at . Recall that we denote the space of
all continuous functions f : Rn C which vanish at by C0 (Rn ).
Corollary 11.5 (Riemann-Lebesgue). The Fourier transform maps L1 (Rn )
into C0 (Rn ).
Proof. First of all recall that C0 (Rn ) equipped with the sup norm is a
Banach space and that S(Rn ) is dense (Problem 1.36). By the previous
lemma we have f C0 (Rn ) if f S(Rn ). Moreover, since S(Rn ) is dense
in L1 (Rn ), the estimate (11.2) shows that the Fourier transform extends to
a continuous map from L1 (Rn ) into C0 (Rn ).
Next we will turn to the inversion of the Fourier transform. As a preparation we will need the Fourier transform of a Gaussian.
Lemma 11.6. We have ez|x|
F(ez|x|
2 /2
2 /2
)(p) =
1 |p|2 /(2z)
e
.
z n/2
(11.12)
Here z n/2 is the standard branch with branch cut along the negative real axis.
202
Proof. Due to the product structure of the exponential, one can treat
each coordinate separately, reducing the problem to the case n = 1 (Problem 11.2).
Let z (x) = exp(zx2 /2). Then 0z (x)+zxz (x) = 0 and hence i(pz (p)+
0
z z (p)) = 0. Thus z (p) = c1/z (p) and (Problem 7.26)
Z
1
1
exp(zx2 /2)dx =
c = z (0) =
z
2 R
at least for z > 0. However, since the integral is holomorphic for Re(z) > 0
by Problem 7.22, this holds for all z with Re(z) > 0 if we choose the branch
cut of the root along the negative real axis.
Now we can show
Theorem 11.7. The Fourier transform is a bounded injective map from
L1 (Rn ) into C0 (Rn ). Its inverse is given by
Z
1
2
eipx|p| /2 f(p)dn p,
(11.13)
f (x) = lim
n/2
0 (2)
Rn
where the limit has to be understood in L1 .
Proof. Abbreviate (x) = (2)n/2 exp(|x|2 /2). Then the right-hand
side is given by
Z Z
Z
n
ipx
(p)eipx f (y)eipy dn ydn p
(p)e f (p)d p =
Rn
Rn
Rn
and, invoking Fubini and Lemma 11.2, we further see that this is equal to
Z
Z
1
ipx
n
=
( (p)e ) (y)f (y)d y =
1/ (y x)f (y)dn y.
n/2
n
n
R
R
But the last integral converges to f in L1 (Rn ) by Lemma 8.13.
(11.14)
where
f(p) =
1
(2)n/2
(11.15)
Rn
203
However, note that F : L1 (Rn ) C0 (Rn ) is not onto (cf. Problem 11.5).
Nevertheless the Fourier transform F 1 is a closed map from Ran(F)
L1 (Rn ) by Lemma 4.7. As F 1 and F differ only by a reflection in the
argument, the same is true for F.
Lemma 11.9. Suppose f, f L1 (Rn ). Then f, f L2 (Rn ) and
kf k2 = kfk2 (2)n/2 kf k1 kfk1
2
(11.16)
holds.
Proof. This follows from Fubinis theorem since
Z
Z Z
1
|f(p)|2 dn p =
f (x) f(p)eipx dn p dn x
n/2
n
n
n
(2)
R
R
R
Z
=
|f (x)|2 dn x
Rn
for f, f
L1 (Rn ).
f(p) = lim
Z
|x|m
eipx f (x)dn x,
(11.17)
204
f (x y)g(y)dn y
(11.18)
Rn
(11.20)
ipx
(f g) (p) =
e
f (y)g(x y)dn y dn x
n/2
n
n
(2)
R
R
Z
Z
1
ipy
=
e
f (y)
eip(xy) g(x y)dn x dn y
(2)n/2 Rn
Rn
Z
g (p),
=
eipy f (y)
g (p)dn y = (2)n/2 f(p)
Rn
(f g) = (2)n/2 f g
in this case.
Proof. Clearly the product of two functions in S(Rn ) is again in S(Rn )
(show this!). Since S(Rn ) L1 (Rn ) the previous lemma implies (f g) =
(2)n/2 fg S(Rn ). Moreover, since the Fourier transform is injective on
L1 (Rn ) we conclude f g = (2)n/2 (fg) S(Rn ). Replacing f, g by f, g
in the last formula finally shows f g = (2)n/2 (f g) and the claim follows
by a simple change of variables using f(p) = f(p).
205
(f g) = (2)n/2 fg
in this case.
Proof. The inequality kf gk kf k2 kgk2 is immediate from Cauchy
Schwarz and shows that the convolution is a continuous bilinear form from
L2 (Rn ) to L (Rn ). Now take sequences fm , gm S(Rn ) converging to
f, g L2 (Rn ). Then using the previous corollary together with continuity
of the Fourier transform from L1 (Rn ) to C0 (Rn ) and on L2 (Rn ) we obtain
(f g) = lim (fm gm ) = (2)n/2 lim fm gm = (2)n/2 f g.
m
Similarly,
(f g) = lim (fm gm ) = (2)n/2 lim fm gm = (2)n/2 fg
m
from which that last claim follows since F : Ran(F) L1 (Rn ) is closed by
Lemma 4.7.
Finally, note that by looking at the Gaussians (x) = exp(x2 /2) one
observes that a well centered peak transforms into a broadly spread peak and
vice versa. This turns out to be a general property of the Fourier transform
known as uncertainty principle. One quantitative way of measuring this
fact is to look at
Z
k(xj x0 )f (x)k22 =
(xj x0 )2 |f (x)|2 dn x
(11.21)
Rn
(11.22)
Rn
Rn
Hence, by CauchySchwarz,
kf k22 2kxj f (x)k2 kj f (x)k2 = 2kxj f (x)k2 kpj f(p)k2
the claim follows.
206
where
1
K(x, y) =
(2)n
ei(xy)p A (y)dn p =
B (y x)A (y).
(2)n
(ii) f (p) =
1
,
p2 +k2
Re(k) > 0.
207
(11.23)
(11.24)
Since the right-hand side is integrable for n 3 we obtain that our solution
is necessarily given by
f (x) = (|p|2 g(p)) (x).
(11.25)
In fact, this formula still works provided g(x), |p|2 g(p) L1 (Rn ). Moreover,
if we additionally assume g L1 (Rn ), then |p|2 f(p) = g(p) L1 (Rn ) and
Lemma 11.3 implies that f C 2 (Rn ) as well as that it is indeed a solution.
Note that if n 3, then |p|2 g(p) L1 (Rn ) follows automatically from
g, g L1 (Rn ) (show this!).
Moreover, we clearly expect that f should be given by a convolution.
However, since |p|2 is not in Lp (Rn ) for any p, the formulas derived so far
do not apply.
208
Lemma 11.17. Let 0 < < n and suppose g L1 (Rn ) L (Rn ) as well
as |p| g(p) L1 (Rn ). Then
Z
(|p| g(p)) (x) =
I (|x y|)g(y)dn y,
(11.26)
Rn
( n
1
2 )
r n .
n/2
2 ( 2 )
(11.27)
Proof. Note that, while |.| is not in Lp (Rn ) for any p, our assumption
0 < < n ensures that the singularity at zero is integrable.
We set t (p) = exp(t|p|2 /2) and begin with the elementary formula
Z
1
|p| = c
t (p)t/21 dt, c = /2
,
2 (/2)
0
which follows from the definition of the gamma function (Problem 7.27)
after a simple scaling. Since |p| g(p) is integrable we can use Fubini and
Lemma 11.6 to obtain
Z
Z
c
ixp
/21
(|p| g(p)) (x) =
e
t (p)t
dt g(p)dn p
(2)n/2 Rn
0
Z Z
c
ixp
n
=
e
(p)
g
(p)d
p
t(n)/21 dt.
1/t
n
(2)n/2 0
R
g = (2)n/2 ( g)
Since , g L1 we know by Lemma 11.12 that
g L1 Theorem 11.7 gives us (
g ) = (2)n/2 g.
Moreover, since
Thus, we can make a change of variables and use Fubini once again (since
g L )
Z Z
c
n
(|p| g(p)) (x) =
1/t (x y)g(y)d y t(n)/21 dt
(2)n/2 0
Rn
Z Z
c
n
=
t (x y)g(y)d y t(n)/21 dt
n/2
n
(2)
0
R
Z Z
c
(n)/21
=
t (x y)t
dt g(y)dn y
(2)n/2 Rn
0
Z
c /cn
g(y)
=
dn y
n
n/2
(2)
Rn |x y|
to obtain the desired result.
Note that the conditions of the above theorem are, for example, satisfied
if g, g L1 (Rn ) which holds, for example, if g S(Rn ). In summary, if
g L1 (Rn ) L (Rn ), |p|2 g(p) L1 (Rn ) and n 3, then
f =g
(11.28)
209
( n2 1) 1
,
4 n/2 |x|n2
n 3,
(11.29)
P 1 g
F 1
F
?
210
(11.31)
(11.32)
(11.33)
As before we find that if g(x), (1 + |p|2 )1 g(p) L1 (Rn ) then the solution is
given by
f (x) = ((1 + |p|2 )1 g(p)) (x).
(11.34)
((1 + |p| )
g(p)) (x) =
B(n)/2 (|x y|)g(y)dn y,
(11.35)
Rn
r > 0, <
n
,
2
(11.36)
with
1 r
K (r) = K (r) =
2 2
Z
0
et 4t
dt
t+1
r > 0, R,
(11.37)
the modified Bessel function of the second kind of order ([16, (10.32.10)]).
Proof. We proceed as in the previous lemma. We set t (p) = exp(t|p|2 /2)
and begin with the elementary formula
Z
( 2 )
2
=
t/21 et(1+|p| ) dt.
/2
2
(1 + |p| )
0
211
Since g, g(p) are integrable we can use Fubini and Lemma 11.6 to obtain
Z
Z
( 2 )1
g(p)
/21 t(1+|p|2 )
ixp
(
t
e
dt g(p)dn p
) (x) =
e
(1 + |p|2 )/2
(2)n/2 Rn
0
1 Z Z
( 2 )
ixp
n
=
e 1/2t (p)
g (p)d p et t(n)/21 dt
n/2
n
(4)
0
R
1 Z Z
( 2 )
n
=
1/2t (x y)g(y)d y et t(n)/21 dt
n/2
n
(4)
0
R
Z
1 Z
( 2 )
t (n)/21
=
1/2t (x y)e t
dt g(y)dn y
n/2
(4)
Rn
0
to obtain the desired result. Using Fubini in the last step is allowed since g
is bounded and B(n)/2 (|x|) L1 (Rn ) (Problem 11.11).
Note that since the first condition g L1 (Rn ) L (Rn ) implies g
L2 (Rn ) and thus the second condition (1 + |p|2 )/2 g(p) L1 (Rn ) will be
satisfied if n2 < .
In particular, if g, g L1 (Rn ), then
f = Bn/21 g
(11.38)
1
,
2n|x|n2
2
|x|
2n ,
|x| 1,
|x| 1.
(Hint: Observe that the result depends only on |x|. Then choose x =
(0, . . . , 0, R) and evaluate the integral using spherical coordinates.)
212
> 0.
Conclude that
kBn/2 gkp kgkp .
(Hint: Fubini.)
(11.41)
f H r (Rn ), || r.
(11.43)
By Lemma 11.4 this definition coincides with the usual one for every f
S(Rn ).
q
2 cos(p)1
Example. Consider f (x) = (1 |x|)[1,1] (x). Then f(p) =
p2
and f H 1 (R). The weak derivative is f 0 (x) = sign(x)[1,1] (x).
213
We also have
Z
g(x)( f )(x)dn x = hg , ( f )i = h
g (p) , (ip) f(p)i
Rn
Rn
is also called the weak derivative or the derivative in the sense of distributions of f (by Lemma 8.15 such a function is unique if it exists). Hence,
choosing g = in (11.44), we see that H r (Rn ) is the set of all functions having partial derivatives (in the sense of distributions) up to order r, which
are in L2 (Rn ).
In this connection the following norm for H m (Rn ) with m N0 is more
common:
X
kf k22,m =
k f k22 .
(11.46)
||m
214
|| k.
(11.48)
Proof. Abbreviate hpi = (1 + |p|2 )1/2 . Now use |(ip) f(p)| hpi|| |f(p)| =
hpis hpi||+s |f(p)|. Now hpis L2 (Rn ) if s > n2 (use polar coordinates
to compute the norm) and hpi||+s |f(p)| L2 (Rn ) if s + || r. Hence
hpi|| |f(p)| L1 (Rn ) and the claim follows from the previous lemma.
In fact, we can even do a bit better.
Lemma 11.21 (Morrey inequality). Suppose f H n/2+ (Rn ) for some
(0, 1). Then f C00, (Rn ), the set of functions which are H
older continuous
with exponent and vanish at . Moreover,
|f (x) f (y)| Cn, kf(p)k2,n/2+ |x y|
(11.49)
in this case.
Proof. We begin with
1
f (x + y) f (x) =
(2)n/2
Rn
implying
1
|f (x + y) f (x)|
(2)n/2
Z
Rn
|eipy 1| n/2+
hpi
|f (p)|dn p,
hpin/2+
215
S
r
dr
n
n+2
hrin+2
Rn hpi
0
Z
4
+ Sn
rn1 dr
n+2
1/|y| hri
Sn
Sn 2
Sn
|y|2 +
|y| =
|y|2 ,
2(1 )
2
2(1 )
where Sn = nVn is the surface area of the unit sphere in Rn .
The last example shows that in the case r < n2 functions in H r are no
longer necessarily continuous. In this case we at least get an embedding into
some better Lp space:
Theorem 11.23 (Sobolev inequality). Suppose 0 < r < n2 . Then H r (Rn )
2n
is continuously embedded into Lp (Rn ) with p = n2r
, that is,
kf kp Cn,r k|p|r f(p)k2 Cn,r kf k2,r .
(11.51)
Proof. We will give a prove based on the HardyLittlewoodSobolev inequality to be proven in Theorem 12.10 below.
It suffices to prove the first inequality. Set |p|r f(p) = g(p) L2 . Moreover, choose some sequence fm S f H r . Then, by Lemma 11.17
fm = Ir gm , and since the HardyLittlewoodSobolev inequality implies that
m k2 =
the map Ir : L2 Lp is continuous, we have kfm kp = kIr gm kp Ckg
r f (p)k and the claim follows after taking limits.
gm k2 = Ck|p|
Ck
m
2
2n
Problem 11.12. Use dilations f (x) 7 f (x), > 0, to show that p = n2r
is the only index for which the Sobolev inequality kf kp Cn,r k|p|r f(p)k2 can
hold.
216
u(0) = g.
(11.52)
= lim
(11.53)
0
217
= Au(t),
u(0) = g.
(11.56)
This is, for example, the case if g D(A) in which case we even have
u(t) C 1 ([0, ), X).
Proof. Let g D(A), then
1
1
lim u(t + ) u(t) = lim T (t) T ()f g = T (t)Ag.
0
0
This shows the first part. To show that u(t) is differentiable for t > 0 it
remains to compute
1
1
u(t ) u(t) = lim T (t ) T ()g g
lim
0
0
= lim T (t ) Ag + o(1) = T (t)Ag
0
u
(0) = g.
(11.57)
218
Next, one verifies (Problem 11.15) that the solution (in the sense defined
above) of this differential equation is given by
2
u
(t)(p) = g(p)e|p| t .
(11.58)
TH (t) = F 1 e|p| t F.
(11.59)
(11.62)
Moreover, one can check directly that (11.62) defines a solution for arbitrary
g Lp (Rn ).
Theorem 11.27. Suppose g Lp (Rn ), 1 p . Then (11.62) defines
a solution for the heat equation which satisfies u C ((0, ) Rn ). The
solutions has the following properties:
(i) If 1 p < , then limt0 u(t) = g in Lp . If p = this holds for
g C0 (Rn ).
(ii) If p = , then
ku(t)k kgk .
(11.63)
(11.64)
219
(11.65)
Rn
Rn
and
ku(t)k
1
kgk1 .
(4t)n/2
(11.66)
Now (i) follows from Lemma 8.13, (ii) is immediate, and (iii) follows from
Fubini.
Note that using H
olders inequality we even have
1
1 1
ku(t)k kt kq kgkp = n
+ = 1.
n kgkp ,
p q
q 2q (4t) 2p
(11.68)
u(0) = g.
(11.69)
TS (t) = F 1 ei|p| t F.
(11.70)
2
f (p) = e(it+)p ,
> 0.
(11.72)
|xy|2
Z
e
Rn
4(it+)
g(y)dn y
(11.73)
220
and hence
Z
|xy|2
1
i 4t
TS (t)g(x) =
e
g(y)dn y
(11.74)
(4it)n/2 Rn
for t 6= 0 and g L2 (Rn ) L1 (Rn ). In fact, letting 0 the left-hand
side converges to TS (t)g in L2 and the limit of the right-hand side exists
pointwise by dominated convergence and its pointwise limit must thus be
equal to its L2 limit.
Using this explicit form, we can again draw some further consequences.
For example, if g L2 (Rn ) L1 (Rn ), then g(t) C(Rn ) for t 6= 0 (use
dominated convergence and continuity of the exponential) and satisfies
1
ku(t)k
kgk1 .
(11.75)
|4t|n/2
Thus we have again spreading of wave functions in this case.
Finally we turn to the wave equation
utt u = 0,
u(0) = g,
ut (0) = f.
(11.76)
This equation will fit into our framework once transform it to a first order
system with respect to time:
ut = v,
vt = u,
u(0) = g,
v(0) = f.
(11.77)
(11.78)
sin(t|p|)
f (p),
|p|
v(t, p) = sin(t|p|)|p|
g (p) + cos(t|p|)f(p).
(11.79)
u
t = v,
vt = |p|2 u
,
u
(0) = g,
= |p|
v (p) instead of v, then
u(t)
g
cos(t|p|) sin(t|p|)
1
= TW (t)
, TW (t) = F
F, (11.81)
w(t)
h
sin(t|p|) cos(t|p|)
(11.82)
221
sin(t|p|)
|p|
sin(t|p|)
|p|
sin(t|p|)
t (p) =
,
t
|p|
t (p) =
1 cos(t|p|)
,
|p|2
(11.84)
dr,
R
|x|r
|x|r
|x|r
2 0
1
= lim
(2 Si(R|x|) + Si(R(t |x|)) Si(R(t + |x|))) ,
R
2|x|
(11.87)
222
where
sin(x)
dx
(11.88)
x
0
is the sine integral. Using Si(x) = Si(x) for x R and (Problem 11.19)
lim Si(x) =
(11.89)
x
2
we finally obtain (since the pointwise limit must equal the L2 limit)
r
[0,t] (|x|)
t (x) =
.
(11.90)
2
|x|
Si(z) =
(11.93)
223
u(0) = g,
lim
dx = .
R 0
x
2
RRR
(Hint: Write Si(R) = 0 0 sin(x)ext dt dx and use Fubini.)
Chapter 12
Interpolation
kf kp kf k1
p0 kf kp1 ,
(12.1)
p0
p1 contains all integrable
where p1 = 1
p0 + p1 , (0, 1). Note that L L
p
simple functions which are dense in L for 1 p < (for p = this is
only true if the measure is finite cf. Problem 8.10).
p0 < p < p1 ,
(12.2)
226
12. Interpolation
them by virtue of
A : Lp0 + Lp1 Lq0 + Lq1 ,
f0 + f1 7 A0 f0 + A1 f1
(12.3)
|F (z)| M0
Re(z)
M1
(12.5)
for every z S.
Proof. Without loss of generality we can assume M0 , M1 > 0 and after
the transformation F (z) M0z1 M1z F (z) even M0 = M1 = 1. Now we
consider the auxiliary function
Fn (z) = e(z
2 1)/n
F (z)
which still satisfies |Fn (z)| 1 for Re(z) = 0 and Re(z) = 1 since Re(z 2
1) Im(z)2 0 for z S.pMoreover, by assumption |F (z)| M implying
|Fn (z)| p
1 for |Im(z)| log(M )n. Since we also have |Fn (z)| 1 for
|Im(z)| log(M )n by the maximum modulus principle we see |Fn (z)| 1
for all z S. Finally, letting n the claim follows.
Now we are abel to show the RieszThorin interpolation theorem
Theorem 12.2 (RieszThorin). Let (X, d) and (Y, d) measure spaces and
1 p0 , p1 , q0 , q1 . If q0 = q1 = assume additionally that is -finite.
If A is a linear operator on
A : Lp0 (X, d) + Lp1 (X, d) Lq0 (Y, d) + Lq1 (Y, d)
(12.6)
satisfying
kAf kq0 M0 kf kp0 ,
kAf kq1 M1 kf kp1 ,
(12.7)
then A has continuous restrictions
1
1
1
1
A : Lp (X, d) Lq (Y, d),
=
+ ,
=
+
(12.8)
p
p0
p 1 q
q0
q1
satisfying kA k M01 M1 for every (0, 1).
227
(12.10)
228
12. Interpolation
where
1
r
1
p
1
q
1.
1
1
1
1
where p1 = 1
1 + q 0 = 1 q and r = q + = p + q 1. To see that
f (y)g(x y) is integrable a.e. consider fn (x) = |x|n (x) max(n, |f (x)|).
Then the convolution (fn |g|)(x) is finite and converges for every x by
monotone convergence. Moreover, since fn |f | in Lp we have fn |g|
Ag f in Lr , which finishes the proof.
Combining the last two corollaries we obtain:
Corollary 12.6. Let f Lp (Rn ) and g Lq (Rn ) with
and 1 r, p, q 2. Then
1
r
1
p
1
q
1 0
(f g) = (2)n/2 fg.
Proof. By Corollary 11.13 the claim holds for f, g S(Rn ). Now take a
sequence of Schwartz functions fm f in Lp and a sequence of Schwartz
0
functions gm g in Lq . Then the left-hand side converges in Lr , where
1
1
1
r0 = 2 p q , by the Young and Hausdorff-Young inequalities. Similarly,
0
229
R |f (x)|
0
rp1 dr
(12.13)
(12.14)
|f |>r
(12.15)
1 p < ,
(12.16)
r>0
and the corresponding spaces Lp,w (X, d) consist of all equivalence classes
of functions which are equal a.e. for which the above norm is finite. Clearly
the distribution function and hence the weak Lp norm depend only on the
equivalence class. Despite its name the weak Lp norm turns out to be only
a quasinorm (Problem 12.4). By construction we have
kf kp,w kf kp
(12.17)
and thus Lp (X, d) Lp,w (X, d). In the case p = we set k.k,w = k.k .
Example. Consider f (x) =
1
x
1
2
Ef (r) = |{x| | | > r}| = |{x| |x| < r1 }| =
x
r
shows that f L1,w (R) with kf k1,w = 2. Slightly more general the function
f (x) = |x|n/p 6 Lp (Rn ) but f Lp,w (Rn ). Hence Lp,w (Rn ) is strictly
larger than Lp (Rn ).
230
12. Interpolation
(12.18)
(12.19)
kT (f )kq,w Cp,q,w kf kp .
(12.20)
By (12.17) strong type (p, q) is indeed stronger than weak type (p, q) and
we have Cp,q,w Cp,q .
Theorem 12.7 (Marcinkiewicz). Let (X, d) and (Y, d) measure spaces
and 1 p0 < p1 . Let T be a subadditive operator defined for all
f Lp (X, d), p [p0 , p1 ]. If T is of weak type (p0 , p0 ) and (p1 , p1 ) then it
is also of strong type (p, p) for every p0 < p < p1 .
Proof. We begin by assuming p1 < . Fix f Lp as well as some number
s > 0 and decompose f = f0 + f1 according to
f0 = f {x| |f |>s} Lp0 Lp ,
Next we use (12.12),
Z
p
kT (f )kp = p
p1
ET (f ) (r)dr = p2
rp1 ET (f ) (2r)dr
and observe
ET (f ) (2r) ET (f0 ) (r) + ET (f1 ) (r)
since |T (f )| |T (f0 )| + |T (f1 )| implies |T (f )| > 2r only if |T (f0 )| > r or
|T (f1 )| > r. Now using (12.15) our assumption implies
C0 kf0 kp0 p0
C1 kf1 kp1 p1
ET (f0 ) (r)
,
ET (f1 ) (r)
r
r
and choosing s = r we obtain
Z
Z
C p0
C p1
ET (f ) (2r) p00
|f |p0 d + p11
|f |p1 d.
r
r
{x| |f |>r}
{x| |f |r}
In summary we have kT (f )kpp p2p (C0p0 I1 + C1p1 I2 ) with
Z Z
I0 =
rpp0 1 {(x,r)| |f (x)|>r} |f (x)|p0 d(x) dr
0
X
p0
|f (x)|
=
X
|f (x)|
rpp0 1 dr d(x) =
1
kf kpp
p p0
231
and
Z
I1 =
1/p
kf kp .
f1 = f {x| |f |s/C1 } Lp L
(12.22)
Theorem 12.8 (HardyLittlewood maximal inequality). The maximal function is of weak type (1, 1),
EM(f ) (r)
3n
kf k1 ,
r
(12.23)
232
12. Interpolation
3n p
p1
1/p
kf kp ,
(12.24)
(12.25)
and the estimate follows upon taking absolute values and observing kk1 =
P
j j |Brj (0)|.
Now we will apply this to the Riesz potential (11.27) of order :
I f = I f.
(12.26)
I0 = I (0,) , I = I [,) ,
where > 0 will be determined later. Note that I0 (|.|) L1 (Rn ) and
p
n
, ). In particular, since p0 = p1
233
|I f (x)|
kf
k
=
n/p kf kp .
p
0 (n ) n
(n)p0
p
|y|
|y|
p/n
kf kp
such that
Now we choose = M(f )(x)
p
( , 1),
n
n
where C/2 is the larger of the two constants in the estimates for I0 f and
I f . Taking the Lq norm in the above expression gives
k kM(f )1 kq = Ckf
k kM(f )k1 = Ckf
k kM(f )k1
kI f kq Ckf
kp M(f )(x)1 ,
|I f (x)| Ckf
q(1)
T (f )(x) = ei arg(
f (y)dy )
f (x).
Show that T is subadditive and norm preserving. Show that T is not continuous.
Part 3
Nonlinear Functional
Analysis
Chapter 13
Analysis in Banach
spaces
(13.1)
where o, O are the Landau symbols. The linear map dF (x) is called derivative of F at x. If F is differentiable for all x U we call F differentiable.
In this case we get a map
dF : U
x
L(X, Y )
.
7 dF (x)
(13.2)
238
Pn
being fixed). We have dF v =
i=1 i F vi , v = (v1 , . . . , vn ) X, and
F C 1 (X, Y ) if and only if all partial derivatives exist and are continuous.
In the case of X = Rm and Y = Rn , the matrix representation of dF
with respect to the canonical basis in Rm and Rn is given by the partial
derivatives i Fj (x) and is called Jacobi matrix of F at x.
We can iterate the procedure of differentiation and write F C r (U, Y ),
r 1, if the r-th derivative of F , dr F (i.e., the derivative of the (r 1)th
of F ), exists and is continuous. Finally, we set C (U, Y ) =
T derivative
r
0
rN C (U, Y ) and, for notational convenience, C (U, Y ) = C(U, Y ) and
d0 F = F .
It is often necessary to equip C r (U, Y ) with a norm. A suitable choice
is
|F | = max sup |dj F (x)|.
(13.3)
0jr xU
The set of all r times continuously differentiable functions for which this
norm is finite forms a Banach space which is denoted by Cbr (U, Y ).
If F is bijective and F , F 1 are both of class C r , r 1, then F is called
a diffeomorphism of class C r .
Note that if F L(X, Y ), then dF (x) = F (independent of x) and
dr F (x) = 0, r > 1.
For the composition of mappings we note the following result (which is
easy to prove).
Lemma 13.1 (Chain rule). Let F C r (X, Y ) and G C r (Y, Z), r 1.
Then G F C r (X, Z) and
d(G F )(x) = dG(F (x)) dF (x),
x X.
(13.4)
(13.5)
x, y U,
(13.6)
then
sup |dF (x)| M.
xU
(13.7)
239
(13.8)
for any > 0. Let t0 = max{t [0, 1]|(t) 0}. If t0 < 1 then
+ )(t0 + )
(t0 + ) = |f (t0 + ) f (t0 ) + f (t0 ) f (0)| (M
+ ) + (t0 )
|f (t0 + ) f (t0 )| (M
+ )
|df (t0 ) + o()| (M
+ o(1) M
) = ( + o(1)) 0,
(M
(13.9)
(13.10)
since we can assume |o()| < for > 0 small enough, a contradiction.
As an immediate consequence we obtain
Corollary 13.3. Suppose U is a connected subset of a Banach space X. A
mapping F C 1 (U, Y ) is constant if and only if dF = 0. In addition, if
F1,2 C 1 (U, Y ) and dF1 = dF2 , then F1 and F2 differ only by a constant.
QmNext we want to look at higher derivatives more closely. Let X =
i=1 Xi , then F : X Y is called multilinear if it is linear with respect to
each argument.
It is not hard to see that F is continuous if and only if
|F | =
sup
x:
Qm
|F (x1 , . . . , xm )| < .
(13.11)
240
n
X
1
t1 tn F (
ti xi )|t1 ==tn =0 .
n!
(13.12)
i=1
C r (X, Y
dr Fx (v1 , . . . , vr ) = t1 tr F (x +
r
X
) is symmetric since,
ti vi )|t1 ==tr =0 ,
(13.13)
i=1
i=1
(13.15)
: R(I, X) X
(13.16)
(13.17)
R t2
t1
f (s)ds =
Rb
a
241
s[t,t+]
(13.18)
(s))ds for any F C 1 (I, X).
(
F
a
Rt
x, x
C.
(13.19)
n
|F (x) x|,
1
x C.
(13.20)
n
X
|xj xj1 |
j=m+1
m
nm1
X
j |x1 x0 |
j=0
|x1 x0 |.
1
Thus xn is Cauchy and tends to a limit x. Moreover,
(13.22)
(13.23)
shows that x is a fixed point and the estimate (13.20) follows after taking
the limit n in (13.22).
242
x, x
U , y V.
(13.24)
(13.25)
we infer
1
|F (x(y), y + v) F (x(y), y)|
(13.26)
1
and hence x(y) C(V, U ). Now let r = 1 and let us formally differentiate
x(y) = F (x(y), y) with respect to y,
|x(y + v) x(y)|
(13.27)
(13.28)
Let us abbreviate u = x(y + v) x(y), then using (13.27) and the fixed point
property of x(y) we see
(1 x F (x(y), y))(u x0 (y)v) =
= F (x(y) + u, y + v) F (x(y), y) x F (x(y), y)u y F (x(y), y)v
= o(u) + o(v)
(13.29)
243
(13.30)
(13.31)
244
Proof. Fix x0 Cb (I, U ) and > 0. For each t I we have a (t) > 0
such that B2(t) (x0 (t)) U and |f (x) f (x0 (t))| /2 for all x with
|x x0 (t)| 2(t). The balls B(t) (x0 (t)), t I, cover the set {x0 (t)}tI
and since I is compact, there is a finite subcover B(tj ) (x0 (tj )), 1 j n.
Let |x x0 | = min1jn (tj ). Then for each t I there is ti such
that |x0 (t) x0 (tj )| (tj ) and hence |f (x(t)) f (x0 (t))| |f (x(t))
f (x0 (tj ))| + |f (x0 (tj )) f (x0 (t))| since |x(t) x0 (tj )| |x(t) x0 (t)| +
|x0 (t) x0 (tj )| 2(tj ). This settles the case r = 0.
Next let us turn to r = 1. We claim that df is given by (df (x0 )x)(t) =
df (x0 (t))x(t). Hence we need to show that for each > 0 we can find a
> 0 such that
sup |f (x0 (t) + x(t)) f (x0 (t)) df (x0 (t))x(t)| sup |x(t)|
tI
(13.32)
tI
(13.33)
whenever |x(t)| (t). Now argue as before to show that (t) can be chosen
independent of t. It remains to show that df is continuous. To see this we
use the linear map
: Cb (I, L(X, Y )) L(Cb (I, X), Cb (I, Y )) ,
T
7 T
(13.34)
(13.35)
tI
Now we come to our existence and uniqueness result for the initial value
problem in Banach spaces.
Theorem 13.9. Let I be an open interval, U an open subset of a Banach
space X and an open subset of another Banach space. Suppose F
C r (I U , X), then the initial value problem
x(t)
= F (t, x, ),
x(t0 ) = x0 ,
(t0 , x0 , ) I U ,
(13.36)
245
(13.37)
such that we know the solution for = 0. The implicit function theorem will
show that solutions still exist as long as remains small. At first sight this
doesnt seem to be good enough for us since our original problem corresponds
to = 1. But since corresponds to a scaling t t, the solution for one
> 0 suffices. Now let us turn to the details.
Our problem (13.37) is equivalent to looking for zeros of the function
G : Dr+1 (0 , 0 ) Cbr ((1, 1), X)
.
(x, , )
7 x F (x, )
(13.38)
Lemma 13.8 ensures that this function is C r . Now fix 0 , then G(0, 0 , 0) = 0
Rt
and x G(0, 0 , 0) = T , where T x = x.
Since (T 1 x)(t) = 0 x(s)ds we
can apply the implicit function theorem to conclude that there is a unique
solution x(, ) C r (1 (0 , 0 ), Dr+1 ). In particular, the map (, t) 7
x(, )(t/) is in C r (1 , C r+1 ((, ), X)) , C r ( (, ), X). Hence it
is the desired solution of our original problem.
Chapter 14
14.1. Introduction
Many applications lead to the problem of finding all zeros of a mapping
f : U X X, where X is some (real) Banach space. That is, we are
interested in the solutions of
f (x) = 0,
x U.
(14.1)
In most cases it turns out that this is too much to ask for, since determining
the zeros analytically is in general impossible.
Hence one has to ask some weaker questions and hope to find answers for
them. One such question would be Are there any solutions, respectively,
how many are there?. Luckily, these questions allow some progress.
To see how, lets consider the case f H(C), where H(U ) denotes the
set of holomorphic functions on a domain U C. Recall the concept of
the winding number from complex analysis. The winding number of a path
: [0, 1] C around a point z0 C is defined by
Z
dz
1
n(, z0 ) =
Z.
(14.2)
2i z z0
It gives the number of times encircles z0 taking orientation into account.
That is, encirclings in opposite directions are counted with opposite signs.
In particular, if we pick f H(C) one computes (assuming 0 6 f ())
Z 0
X
1
f (z)
n(f (), 0) =
dz =
n(, zk )k ,
(14.3)
2i f (z)
k
247
248
249
properties a degree should have and then we will look for functions satisfying
these properties.
(14.4)
(14.5)
Dy (U , Rn ) = Dy0 (U , Rn )
(14.6)
n
r
for y R . We will use the topology induced by the sup norm for C (U , Rn )
such that it becomes a Banach space (cf. Section 13.1).
Note that, since U is bounded, U is compact and so is f (U ) if f
C(U , Rn ). In particular,
dist(y, f (U )) = inf |y f (x)|
xU
(14.7)
250
251
N
X
(14.8)
i=1
It suffices to consider one of the zeros, say x1 . Moreover, we can even assume
x1 = 0 and U (x1 ) = B (0). Next we replace f by its linear approximation
around 0. By the definition of the derivative we have
f (x) = df (0)x + |x|r(x),
r C(B (0), Rn ),
r(0) = 0.
(14.9)
N
X
(14.10)
i=1
252
= 0.
(14.13)
253
|f (x) f (
x) df (
x)(x x
)|
|df (
x + t(x x
)) df (
x)||
x x|dt
N
0
(14.14)
for x
, x Qi . Now pick a Qi which contains a critical point x
i CP(f ).
Without restriction we assume x
i = 0, f (
xi ) = 0 and set M = df (
xi ). By
i
n
det M = 0 there is an orthonormal basis {b }1in of R such that bn is
orthogonal to the image of M . In addition,
v
u n
n
n
X
X
uX
i t
i bi | |i | n }
Qi {
i b |
|i |2 n } {
N
N
i=1
i=1
i=1
254
M Qi {
i bi | |i | C
i=1
(e.g., C =
}
N
(14.15)
n
X
f (Qi ) {
i bi | |i | (C + ) , |n | }
N
N
(14.16)
i=1
C
and hence the measure of f (Qi ) is smaller than N
n . Since there are at most
n
N such Qi s, we see that the measure of f (CP(f ) Q) is smaller than
C.
Having this result out of the way we can come to step one and two from
above.
Step 1: Admitting critical values
By (ii) of Theorem 14.2, deg(f, U, y) should be constant on each component of Rn \f (U ). Unfortunately, if we connect y and a nearby regular
value y by a path, then there might be some critical values in between. To
overcome this problem we need a definition for deg which works for critical
values as well. Let us try to look for an Rintegral representation. Formally
(14.12) can be written as deg(f, U, y) = U y (f (x))Jf (x)dx, where y (.) is
the Dirac distribution at y. But since we dont want to mess with distributions, we replace y (.) by (. y), where { }>0 is a family of functions
such
that is supported on the ball B (0) of radius around 0 and satisfies
R
Rn (x)dx = 1.
Lemma 14.6. Let f Dy1 (U , Rn ), y 6 CV(f ). Then
Z
deg(f, U, y) =
(f (x) y)Jf (x)dx
(14.17)
for x U \
Z
SN
i=1 U (x
i)
and hence
N Z
X
i=1
N
X
255
U (xi )
Z
(
x)d
x = deg(f, U, y),
sign(Jf (x))
(14.18)
B0 (0)
i=1
(14.19)
n
X
xj Df (u)j =
j=1
n
X
Df (u)j,k ,
(14.20)
j,k=1
where Df (u)j,k is the determinant of the matrix obtained from the matrix
associated with Df (u)j by applying xj to the k-th column. Since xj xk f =
xk xj f we infer Df (u)j,k = Df (u)k,j , j 6= k, by exchanging the k-th and
the j-th column. Hence
n
X
div Df (u) =
Df (u)i,i .
(14.21)
i=1
(i,j)
Jf (x)
P
(i,j)
Now let
denote the (i, j) minor of df (x) and recall ni=1 Jf xi fk =
j,k Jf . Using this to expand the determinant Df (u)i,i along the i-th column
shows
n
n
n
X
X
X
(i,j)
(i,j)
div Df (u) =
Jf xi uj (f ) =
Jf
(xk uj )(f )xi fk
=
i,j=1
n
X
i,j=1
(xk uj )(f )
j,k=1
as required.
n
X
i=1
(i,j)
Jf
k=1
n
X
xj fk =
(14.22)
j=1
256
(14.23)
for suitable C 2 (Rn , R) and suitable > 0. Now observe
Z 1
(div u)(y) =
zj yj (y + tz)dt
0
Z 1
d
(14.24)
=
( (y + t z))dt = (y y 2 ) (y y 1 ),
dt
0
where
Z 1
u(y) = z
(y + t z)dt, (y) = (y y 1 ), z = y 1 y 2 ,
(14.25)
0
R
and apply the previous lemma to rewrite the integral as U div Df (u)dx.
Since the integrand vanishes in a neighborhood of U it is no restriction to
assume
R that U is smooth
R such that we can apply Stokes theorem. Hence we
have U div Df (u)dx = U Df (u)dF = 0 since u is supported inside B (
y)
i
provided is small enough (e.g., < max{|y y|}i=1,2 ).
As a consequence we can define
deg(f, U, y) = deg(f, U, y),
y 6 f (U ),
f C 2 (U , Rn ),
(14.26)
(14.28)
Since f : U S n1 the integrand can also be written as the pull back f dS
of the canonical surface element dS on S n1 .
257
This coincides with the boundary value approach for complex functions
(note that holomorphic functions are orientation preserving).
Step 2: Admitting continuous functions
Our final step is to remove the condition f C 2 . As before we want the
degree to be constant in each ball contained in Dy (U , Rn ). For example, fix
f Dy (U , Rn ) and set = dist(y, f (U )) > 0. Choose f i C 2 (U , Rn ) such
that |f i f | < , implying f i Dy (U , Rn ). Then H(t, x) = (1 t)f 1 (x) +
tf 2 (x) Dy (U , Rn ) C 2 (U, Rn ), t [0, 1], and |H(t) f | < . If we can
show that deg(H(t), U, y) is locally constant with respect to t, then it is
continuous with respect to t and hence constant (since [0, 1] is connected).
Consequently we can define
(14.29)
deg(f, U, y) = deg(f, U, y),
f Dy (U , Rn ),
where f C 2 (U , Rn ) with |f f | < dist(y, f (U )).
It remains to show that t 7 deg(H(t), U, y) is locally constant.
Lemma 14.10. Suppose f Cy2 (U , Rn ). Then for each f C 2 (U , Rn ) there
is an > 0 such that deg(f + t f, U, y) = deg(f, U, y) for all t (, ).
Proof. If f 1 (y) = the same is true for f + t g if |t| < dist(y, f (U ))/|g|.
Hence we can exclude this case. For the remaining case we use our usual
strategy of considering y RV(f ) first and then approximating general y
by regular ones.
Suppose y RV(f ) and let f 1 (y) = {xi }N
j=1 . By the implicit function
theorem we can find disjoint neighborhoods U (xi ) such that there exists a
unique solution xi (t) U (xi ) of (f + t g)(x) = y for |t| < 1 . By reducing
U (xi ) if necessary, we can even assume that the sign of Jf +t g is constant on
S
i
U (xi ). Finally, let 2 = dist(y, f (U \ N
i=1 U (x )))/|g|. Then |f + t g y| > 0
for |t| < 2 and = min(1 , 2 ) is the quantity we are looking for.
It remains to consider the case y CV(f ). pick a regular value y
B/3 (y), where = dist(y, f (U )), implying deg(f, U, y) = deg(f, U, y).
Then we can find an > 0 such that deg(f, U, y) = deg(f + t g, U, y) for
|t| < . Setting = min(
, /(3|g|)) we infer y(f +t g)(x) /3 for x U ,
that is |
y y| < dist(
y , (f + t g)(U )), and thus deg(f + t g, U, y) = deg(f +
t g, U, y). Putting it all together implies deg(f, U, y) = deg(f + t g, U, y) for
|t| < as required.
Now we can finally prove our main theorem.
Theorem 14.11. There is a unique degree deg satisfying (D1)-(D4). Moreover, deg(., U, y) : Dy (U , Rn ) Z is constant on each component and given
258
f Dy (U , Rn ) we have
X
deg(f, U, y) =
sign Jf(x)
(14.30)
xf1 (y)
(14.31)
Then we have |g(x)| = 5|x| and |f (x) g(x)| = 1 and hence h(t)=
(1 t)g + t f = g + t(f g) satisfies |h(t)| |g| t|f g| > 0 for |x| > 1/ 5
implying
259
|x| = R.
(14.34)
(14.35)
260
The same is true for the map f : B1 (0) B1 (0), x 7 x (B1 (0) Rn
is simply connected for n 3 but not homeomorphic to a convex set).
It remains to prove the result from topology needed in the proof of
the Brouwer fixed-point theorem. It is a variant of the Tietze extension
theorem.
Theorem 14.15. Let X be a metric space, Y a Banach space and let K
be a closed subset of X. Then F C(K, Y ) has a continuous extension
F C(X, Y ) such that F (X) hull(F (K)).
Proof. Consider the open cover {B(x) (x)}xX\K for X\K, where (x) =
dist(x, K)/2. Choose a (locally finite) partition of unity { } subordinate to this cover and set
X
F (x) =
(x)F (x ) for x X\K,
(14.36)
where x K satisfies dist(x , supp ) 2 dist(K, supp ). By construction, F is continuous except for possibly at the boundary of K. Fix
x0 K, > 0 and choose > 0 such that |F (x) F (x0 )| for all
x K with |x x0 | < 4. We will show that |F (x) F (x0 )| for
all
P x X with |x x0 | < . Suppose x 6 K, then |F (x) F (x0 )|
(x)|F (x ) F (x0 )|. By our construction, x should be close to x
for all with x supp since x is close to K. In fact, if x supp we
have
|x x | dist(x , supp ) + d(supp ) 2 dist(K, supp ) + d(supp ),
(14.37)
where d(supp ) = supx,ysupp |x y|. Since our partition of unity is
subordinate to the cover {B(x) (x)}xX\K we can find a x
X\K such
that supp B(x) (
x) and hence d(supp ) (
x) dist(K, B(x) (
x))
dist(K, supp ). Putting it all together implies that we have |x x |
3 dist(K, supp ) 3|x0 x| whenever x supp and thus
|x0 x | |x0 x| + |x x | 4|x0 x| 4
(14.38)
261
fixed point we can define a retraction R(x) = f (x) + t(x)(x f (x)), where
t(x) 0 is chosen such that |R(x)|2 = 1 (i.e., R(x) lies on the intersection
of the line spanned by x, f (x) with the unit sphere).
Using this equivalence the Brouwer fixed point theorem can also be derived easily by showing that the homology groups of the unit ball B1 (0) and
its boundary (the unit sphere) differ (see, e.g., [20] for details).
Finally, we also derive the following important consequence know as
invariance of domain theorem.
Theorem 14.16 (Brower). Let U Rn be open and let f : U Rn be
continuous and injective. Then f (U ) is also open.
Proof. By scaling and translation it suffices to show that if f : B1 (0) Rn
is injective, then f (0) is an inner point for f (B1 (0)). Abbreviate C = B1 (0).
Since C is compact so ist f (C) and thus f : C f (C) is a homeomorphism.
In particular, f 1 : f (C) C is continuous and can be extended to a
continuous left inverse g : Rn Rn (i.e., g(f (x)) = x for all x C.
Note that g has a zero in f (C), namely f (0), which is stable in the sense
that any perturbation g : f (C) Rn satisfying |
g (y) g(y)| 1 for all
y f (C) also has a zero. To see this apply the Brower fixed point theorem
to the function F (x) = x tig(f (x)) = g(f (x)) tig(f (x)) which maps C
to C by assumption.
Our strategy is to find a contradiction to this fact. Since g(f (0)) = 0
vanishes there is some such that |g(y)| 13 for y B2 (f (0)). If f (0) were
not in the interior of f (C) we can find some z B2 (f (0)) which is not in
f (C). After a translation we can assume z = 0 without loss of generality,
that is, 0 6 f (C) and |f (0)| < . In particular, we also have |g(y)| 13 for
y B (0).
Next consider the map : f (C) Rn given by
(
y,
|y| > ,
(y) =
y
|y| , |y| .
It is continuous away from 0 and its range is contained in 1 2 , where
1 = {y f (C)| |y| } and 2 = {y Rn | |y| = }.
Since f is injective, g does not vanish on 1 and since 1 is compact
there is a such that |g(y)| for y 1 . We may even assume < 13 .
Next, by the StoneWeierstra theorem we can find a polynomial P such
that
|P (y) g(y)| <
for all y . In particular, P does not vanish on 1 . However, it could
vanish on 2 . But since 2 has measure zero, so has P (2 ) and we can find
262
2
+ 1.
3
(14.39)
f (x) =
m
X
i yi ,
(14.41)
i=1
P
where
of x (i.e., i 0, m
i=1 i = 1 and
Pmi are the barycentric coordinates
x = i=1 i vi ). By construction, f 1 C(K, K) and there is a fixed point
x1 . But unless x1 is one of the vertices, this doesnt help us too much. So
lets choose a better function as follows. Consider the k-th barycentric subdivision and for each vertex vi in this subdivision pick an element yi f (vi ).
Now define f k (vi ) = yi and extend f k to the interior of each subsimplex as
before. Hence f k C(K, K) and there is a fixed point
xk =
m
X
i=1
ki vik =
m
X
ki yik ,
yik = f k (vik ),
(14.42)
i=1
k i. Since (xk , k , . . . , k , y k , . . . , y k ) K
in the subsimplex hv1k , . . . , vm
m 1
m
1
m
m
[0, 1] K we can assume that this sequence converges to some limit
0 ) after passing to a subsequence. Since the sub(x0 , 01 , . . . , 0m , y10 , . . . , ym
simplices shrink to a point, this implies vik x0 and hence yi0 f (x0 ) since
(vik , yik ) (vi0 , yi0 ) by the closedness assumption. Now (14.42) tells
us
m
X
x0 =
0i yi0 f (x0 )
(14.43)
i=1
= (1 , . . . , n ) =
n
Y
(14.44)
i=1
of the i-th player. We will consider the case where the game is repeated
a large number of times and where in each step the players choose their
action according to a fixed strategy. Here a strategy si for the i-th player is
k
i
a probability distribution on i , that is, si = (s1i , . . . , sm
i ) such that si 0
264
P i k
and m
k=1 si = 1. The set of all possible strategies for the i-th player is
denoted by Si . The number ski is the probability for the
Qn k-th pure strategy
to be chosen. Consequently, if s = (s1 , . . . , sn ) S = i=1 Si is a collection
of strategies, then the probability that a given collection of pure strategies
gets chosen is
n
Y
s() =
si (),
si () = ski i , = (k1 , . . . , kn )
(14.45)
i=1
(assuming all players make their choice independently) and the expected
payoff for player i is
X
Ri (s) =
s()Ri ().
(14.46)
Here s\
si denotes the strategy combination obtained from s by replacing
si by si . The set of all best replies against s for the i-th player is denoted
by Bi (s). Explicitly, si B(s) if and only if ski = 0 whenever Ri (s\k) <
max1lmi Ri (s\l) (in particular Bi (s) 6= ).
Let s, s S, we call s a best reply against s if si is a Q
best reply against
s for all i. The set of all best replies against s is B(s) = ni=1 Bi (s).
A strategy combination s S is a Nash equilibrium for the game if it is
a best reply against itself, that is,
s B(s).
(14.48)
R2 d2 c2
d1 0 1
c1 2
1
(14.49)
265
both players could get the payoff 1 if they both agree to cooperate. But if
one would break this agreement in order to increase his payoff, the other
one would get less. Hence it might be safer to defect.
Now that we have seen that Nash equilibria are a useful concept, we
want to know when such an equilibrium exists. Luckily we have the following
result.
Theorem 14.19 (Nash). Every n-person game has at least one Nash equilibrium.
Proof. The definition of a Nash equilibrium begs us to apply Kakutanis
theorem to the set valued function s 7 B(s). First of all, S is compact
and convex and so are the sets B(s). Next, observe that the closedness
condition of Kakutanis theorem is satisfied since if sm S and sm B(sn )
both converge to s and s, respectively, then (14.47) for sm , sm
Ri (sm \
si ) Ri (sm \sm
i ),
si Si , 1 i n,
(14.50)
si Si , 1 i n,
by continuity of Ri (s).
(14.51)
f
ij
j
0
JI+f(x) = det(I + f )(x) = det
0
ij
= det(ij + j fi ) = J (x)
(14.53)
I+fm
266
x(gf )1 (y)
X
zg 1 (y)
sign(Jg (z))
sign(Jf (x))
xf 1 (z)
zg 1 (y)
m
X
j=1 zg 1 (y)Gj
m
X
j=1
m
X
deg(f, U, Gj )
sign(Jg (z))
(14.55)
zg 1 (y)Gj
(14.56)
j=1
267
(14.57)
l , y) =
l deg(g, U
l>0
l deg(g, Ul , y)
l>0
deg(f, U, Gj ) deg(g, Gj , y)
(14.58)
(14.60)
and hence fore every i we have Ki Gl for some l since components are
maximal connected sets. Let Nl = {i|Ki Gl } and observe that we have
268
P
deg(k, Gl , y) = iNl deg(k, Ki , y) and deg(h, Hj , Gl ) = deg(h, Hj , Ki ) for
every i Nl . Therefore,
XX
X
1=
deg(h, Hj , Ki ) deg(k, Ki , y) =
deg(h, Hj , Ki ) deg(k, Ki , Hj )
l
iNl
(14.61)
By reversing the role of C1 and C2 , the same formula holds with Hj and Ki
interchanged.
Hence
X
i
1=
XX
i
deg(h, Hj , Ki ) deg(k, Ki , Hj ) =
(14.62)
Chapter 15
The LeraySchauder
mapping degree
(15.1)
provided this definition is independent of the basis chosen. To see this let
be a second isomorphism. Then A = 1 GL(n). Abbreviate
f = f 1 , y = (y) and pick f Cy1 ((U ), Rn ) in the same
component of Dy ((U ), Rn ) as f such that y RV(f ). Then Af A1
Cy1 ((U ), Rn ) is the same component of Dy ((U ), Rn ) as A f A1 =
f 1 (since A is also a homeomorphism) and
JAf A1 (Ay ) = det(A)Jf (y ) det(A1 ) = Jf (y )
(15.2)
270
Our next aim is to tackle the infinite dimensional case. The general
idea is to approximate F by finite dimensional maps (in the same spirit as
we approximated continuous f by smooth functions). To do this we need
to know which maps can be approximated by finite dimensional operators.
Hence we have to recall some basic facts first.
n
X
i (x)xi ,
(15.3)
i=1
then
|P (x) x| = |
n
X
i=1
i (x)x
n
X
i=1
i (x)xi |
n
X
i (x)|x xi | .
(15.4)
i=1
This lemma enables us to prove the following important result.
Theorem 15.2. Let U be bounded, then the closure of F(U, Y ) in C(U, Y )
is C(U, Y ).
Proof. Suppose FN C(U, Y ) converges to F . If F 6 C(U, Y ) then we can
find a sequence xn U such that |F (xn ) F (xm )| > 0 for n 6= m. If N
271
2 =
4
2
(15.5)
(15.6)
(15.7)
272
i = 1, 2,
(15.8)
x U, (1, ).
(15.10)
274
(15.11)
is compact.
Proof. We first need to prove that F is continuous. Fix x0 C(I, Rn ) and
> 0. Set = |x0 | + 1 and abbreviate B = B (0) Rn . The function f
is uniformly continuous on Q = I I B since Q is compact. Hence for
1 = /(b a) we can find a (0, 1] such that |f (t, s, x) f (t, s, y)| 1
for |x y| < . But this implies
Z
(t)
|F (x) F (x0 )| = sup
f (t, s, x(s)) f (t, s, x0 (s))ds
tI
a
Z (t)
|f (t, s, x(s)) f (t, s, x0 (s))|ds
sup
tI
sup(b a)1 = ,
(15.12)
tI
275
that |f (t, s, x)f (t0 , s, x)| 1 and | (t) (t0 )| 2 for |tt0 | < . Hence
we infer for |t t0 | <
Z
Z (t0 )
(t)
f (t0 , s, x(s))ds
|F (x)(t) F (x)(t0 )| =
f (t, s, x(s))ds
a
a
Z (t0 )
Z (t)
|f (t, s, x(s)) f (t0 , s, x(s))|ds +
|f (t, s, x(s))|ds
(t0 )
a
(b a)1 + 2 M = .
(15.14)
x(t0 ) = x0 ,
(15.16)
(15.17)
276
and the first part follows from our previous theorem. To show the second,
()(1 + ). Then
fix > 0 and assume M (, ) M
Z t
Z t
(15.18)
|f (s, x(s))|ds M () (1 + |x(s)|)ds
|x(t)|
0
Chapter 16
The stationary
NavierStokes equation
(16.1)
277
278
where V = x1 x2 x3 .
The viscosity acting on the surface through (x1 , x2 , x3 ) normal to the x1 direction is x2 x3 1 vj by some physical law. Here > 0 is the viscosity
constant of the fluid. On the opposite surface we have x2 x3 1 (vj +
1 vj x1 ). Adding up the contributions of all surface we end up with
V i i vj .
(16.2)
(16.4)
(16.7)
(16.8)
279
Taking the completion with respect to the associated norm we obtain the
Sobolev space H 1 (U, R). Similarly, taking the completion of Cc1 (U, R) with
respect to the same norm, we obtain the Sobolev space H01 (U, R). Here
Ccr (U, Y ) denotes the set of functions in C r (U, Y ) with compact support.
This construction of H 1 (U, R) implies that a sequence uk in C 1 (U, R) converges to u H 1 (U, R) if and only if uk and all its first order derivatives
j uk converge in L2 (U, R). Hence we can assign each u H 1 (U, R) its first
order derivatives j u by taking the limits from above. In order to show that
this is a useful generalization of the ordinary derivative, we need to show
that the derivative depends only on the limiting function u L2 (U, R). To
see this we need the following lemma.
Lemma 16.1 (Integration by parts). Suppose u H01 (U, R) and v H 1 (U, R),
then
Z
Z
u(j v)dx = (j u)v dx.
(16.9)
U
(16.10)
And since C0 (U, R) is dense in L2 (U, R), the derivatives are uniquely determined by u L2 (U, R) alone. Moreover, if u C 1 (U, R), then the derivative in the Sobolev space corresponds to the usual derivative. In summary,
H 1 (U, R) is the space of all functions u L2 (U, R) which have first order
derivatives (in the sense of distributions, i.e., (16.10)) in L2 (U, R).
Next, we want to consider some additional properties which will be used
later on. First of all, the Poincare-Friedrichs inequality.
Lemma 16.2 (Poincare-Friedrichs inequality). Suppose u H01 (U, R), then
Z
Z
u2 dx d2j (j u)2 dx,
(16.11)
U
280
(1 u)2 (, x2 , . . . , xn )d,
(b a)
(16.12)
Hence, from the view point of Banach spaces, we could also equip H01 (U, R)
with the scalar product
Z
hu, vi = (j u)(x)(j v)(x)dx.
(16.14)
U
This scalar product will be more convenient for our purpose and hence we
will use it from now on. (However, all results stated will hold in either case.)
The norm corresponding to this scalar product will be denoted by |.|.
Next, we want to consider the embedding H01 (U, R) , L2 (U, R) a little
closer. This embedding is clearly continuous since by the Poincare-Friedrichs
inequality we have
d(U )
kuk2 kuk,
n
(16.15)
2 Q
Q
Q
for all u H 1 (Q, R).
Proof. After a scaling we can assume Q = (0, 1)n . Moreover, it suffices to
consider u C 1 (Q, R).
Now observe
u(x) u(
x) =
n Z
X
i=1
xi
xi1
(i u)dxi ,
(16.17)
281
where xi = (
x1 , . . . , x
i , xi+1 , . . . , xn ). Squaring this equation and using
CauchySchwarz on the right hand side we obtain
!2
2
n Z 1
n Z 1
X
X
2
2
n
|i u|dxi
u(x) 2u(x)u(
x) + u(
x)
|i u|dxi
n
i=1 0
n Z 1
X
i=1
i=1
(i u)2 dxi .
(16.18)
(16.19)
(16.20)
is compact.
Proof. Pick a cube Q (with edge length ) containing U and a bounded
sequence uk H01 (U, R). Since bounded sets are weakly compact, it is no
restriction to assume that uk is weakly convergent in L2 (U, R). By setting
uk (x) = 0 for x 6 U we can also assume uk H 1 (Q, R) (show this). Next,
subdivide Q into N subcubes Qi with edge lengths /N . On each subcube
(16.16) holds and hence
Z
2
Z
Z
Z
Nn
X
N
n2
2
2
u dx =
u dx =
udx +
(k u)(k u)dx (16.21)
2N 2 U
U
Q
Qi
i=1
2N
Qi
(16.22)
i=1
The last term can be made arbitrarily small by picking N large. The first
term converges to 0 since uk converges weakly and each summand contains
the L2 scalar product of uk u` and Qi (the characteristic function of
Qi ).
In addition to this result we will also need the following interpolation
inequality.
282
1/4
4
kuk4 8kuk2 kuk3/4 .
(16.23)
Proof. We first prove the case where u Cc1 (U, R). The key idea is to start
with U R1 and then work ones way up to U R2 and U R3 .
If U R1 we have
2
u(x) =
1 u (x1 )dx1 2
and hence
2
max u(x) 2
xU
|u1 u|dx1
(16.24)
Z
|u1 u|dx1 .
(16.25)
L
2
4
qR
R
R
2
2
4
( 1 u dx
1 dx u dx), uk converges to v in L (U, R) as well and
hence u = v. Now take the limit in the inequality for uk .
As a consequence we obtain
8d(U ) 1/4
kuk,
kuk4
3
U R3 ,
(16.28)
283
and
Corollary 16.6. The embedding
H01 (U, R) , L4 (U, R),
U R3 ,
(16.29)
is compact.
Proof. Let uk be a bounded sequence in H01 (U, R). By Rellichs theorem
there is a subsequence converging in L2 (U, R). By the Ladyzhenskaya inequality this subsequence converges in L4 (U, R).
Our analysis clearly extends to functions with values in Rn since we have
H01 (U, Rn ) = nj=1 H01 (U, R).
U
1
H0 (U, R3 ),
that is,
(16.33)
where
Z
a(u, v, w) =
uk vj (k wj ) dx.
(16.35)
284
only solve the Navier-Stokes equations in the sense of distributions. But let
us try to show existence of solutions for (16.34) first.
For later use we note
Z
Z
1
vk vj (k vj ) dx =
a(v, v, v) =
vk k (vj vj ) dx
2 U
U
Z
1
=
(vj vj )k vk dx = 0,
v X.
2 U
We proceed by studying (16.34). Let K L2 (U, R3 ), then
X such that
a linear functional on X and hence there is a K
Z
wi,
Kw dx = hK,
w X.
(16.36)
R
U
Kw dx is
(16.37)
Moreover, the same is true for the map a(u, v, .), u, v X , and hence there
is an element B(u, v) X such that
a(u, v, w) = hB(u, v), wi,
w X.
(16.38)
X2
v B(v, v) = K.
(16.40)
So in order to apply the theory from our previous chapter, we need a Banach
space Y such that X , Y is compact.
Let us pick Y = L4 (U, R3 ). Then, applying the Cauchy-Schwarz inequality twice to each summand in a(u, v, w) we see
1/2 Z
1/2
XZ
2
|a(u, v, w)|
(uk vj ) dx
(k wj )2 dx
j,k
kwk
XZ
j,k
(uk ) dx
1/4 Z
(vj )4 dx
1/4
(16.41)
Moreover, by Corollary 16.6 the embedding X , Y is compact as required.
Motivated by this analysis we formulate the following theorem.
Theorem 16.7. Let X be a Hilbert space, Y a Banach space, and suppose
there is a compact embedding X , Y . In particular, kukY kuk. Let
a : X 3 R be a multilinear form such that
|a(u, v, w)| kukY kvkY kwk
(16.42)
X , > 0 we have a solution v X to
and a(v, v, v) = 0. Then for any K
the problem
wi,
hv, wi a(v, v, w) = hK,
w X.
(16.43)
285
(16.45)
(16.46)
Chapter 17
Monotone maps
(17.1)
F (x) = y
(17.2)
lim
x 6= y,
(17.3)
288
x, y X,
(17.4)
x 6= y X,
(17.5)
strictly monotone if
hF (x) F (y), x yi > 0,
x, y X.
(17.6)
Note that the same definitions can be made if X is a Banach space and
F : X X .
Observe that if F is strongly monotone, then it automatically satisfies
hF (x), xi
lim
= .
(17.7)
kxk
|x|
(Just take y = 0 in the definition of strong monotonicity.) Hence the following result is not surprising.
Theorem 17.2 (Zarantonello). Suppose F C(X, X) is (globally) Lipschitz
continuous and strongly monotone. Then, for each y X the equation
F (x) = y
(17.8)
(17.9)
(17.10)
289
a(x, y) = b(y),
(17.12)
(17.13)
(17.14)
and
Then there is a unique x X such that (17.12) holds.
Proof. By the Riez lemma (Theorem 2.10) there are elements F (x) X
and z X such that a(x, y) = b(y) is equivalent to hF (x) z, yi = 0, y X,
and hence to
F (x) = z.
(17.15)
By (17.13) the map F is strongly monotone. Moreover, by (17.14) we infer
kF (x) F (y)k =
sup
(17.16)
x
X,k
xk=1
x X,
(17.17)
x U,
u(x) = 0,
x U,
(17.18)
inf
eS n ,xU
ei Aij (x)ej ,
b0 = sup |b(x)|,
xU
c0 = inf c(x).
xU
(17.19)
290
(17.20)
(17.21)
After integration by parts we can write this equation as
a(v, u) = f (v),
v H01 ,
(17.22)
where
a(v, u) =
Z
i v(x)Aij (x)j u(x) + bj (x)v(x)j u(x) + c(x)v(x)u(x) dx
ZU
f (x)v(x) dx,
f (v) =
(17.23)
We call a solution of (17.22) a weak solution of the elliptic Dirichlet problem (17.18).
By a simple use of the Cauchy-Schwarz and Poincare-Friedrichs inequalities we see that the bilinear form a(u, v) is bounded. To be able to apply the (linear)
LaxMilgram theorem we need to show that it satisfies
R
a(u, u) C |j u|2 dx.
Using (17.19) we have
Z
a(u, u)
a0 |j u|2 b0 |u||j u| + c0 |u|2 ,
(17.24)
1
|u||j u| |u|2 + |j u|2
(17.25)
2
2
which gives
Z
b0
b0
)|u|2 .
a(u, u)
(a0 )|j u|2 + (c0
(17.26)
2
2
U
b0
b0
0
> 0 and c0 b20 0, or equivalently 2c
Since we need a0 2
b0 > 2a0 , we
see that we can apply the LaxMilgram theorem if 4a0 c0 > b20 . In summary,
we have proven
Theorem 17.4. The elliptic Dirichlet problem (17.18) has a unique weak
solution u H01 (U, R) if a0 > 0, b0 = 0, c0 0 or a0 > 0, 4a0 c0 > b20 .
291
(17.27)
for all y X
(17.28)
whenever xn x.
The idea is as follows: Start with a finite dimensional subspace Xn X
and project the equation F (x) = y to Xn resulting in an equation
Fn (xn ) = yn ,
xn , yn Xn .
(17.29)
More precisely, let Pn be the (linear) projection onto Xn and set Fn (xn ) =
Pn F (xn ), yn = Pn y (verify that Fn is continuous and monotone!).
Now Lemma 17.1 ensures that there exists a solution S
un . Now chose the
subspaces Xn such that Xn X (i.e., Xn Xn+1 and
n=1 Xn is dense).
Then our hope is that un converges to a solution u.
This approach is quite common when solving equations in infinite dimensional spaces and is known as Galerkin approximation. It can often
be used for numerical computations and the right choice of the spaces Xn
will have a significant impact on the quality of the approximation.
So how should we show that xn converges? First of all observe that our
construction of xn shows that xn lies in some ball with radius Rn , which is
chosen such that
hFn (x), xi > kyn kkxk,
kxk Rn , x Xn .
(17.30)
(17.31)
for every z X.
(17.32)
292
Appendix A
Zorns lemma
A partial order is a binary relation over a set P such that for all
A, B, C P:
A A (reflexivity),
if A B and B A then A = B (antisymmetry),
if A B and B C then A C (transitivity).
Example. Let P(X) be the collections of all subsets of a set X. Then P is
partially ordered by inclusion .
It is important to emphasize that two elements of P need not be comparable, that is, in general neither A B nor B A might hold. However,
if any two elements are comparable, P will be called totally ordered.
Example. R with is totally ordered.
294
A. Zorns lemma
Bibliography
295
296
Bibliography
[20] J.J. Rotman, Introduction to Algebraic Topology, Springer, New York, 1988.
[21] H. Royden, Real Analysis, Prencite Hall, New Jersey, 1988.
[22] W. Rudin, Real and Complex Analysis, 3rd edition, McGraw-Hill, New York,
1987.
[23] M. Ru
zicka, Nichtlineare Funktionalanalysis, Springer, Berlin, 2004.
[24] G. Teschl, Mathematical Methods in Quantum Mechanics; With Applications to
Schr
odinger Operators, Amer. Math. Soc., Providence, 2009.
[25] J. Weidmann, Lineare Operatoren in Hilbertr
aumen I: Grundlagen, B.G.Teubner,
Stuttgart, 2000.
[26] D. Werner, Funktionalanalysis, 3rd edition, Springer, Berlin, 2000.
[27] E. Zeidler, Applied Functional Analysis: Applications to Mathematical Physics,
Springer, New York 1995.
[28] E. Zeidler, Applied Functional Analysis: Main Principles and Their Applications,
Springer, New York 1995.
Glossary of notation
AC[a, b]
arg(z)
Br (x)
B(X)
BV [a, b]
B
Bn
C
C(U )
C(U, V )
C0 (U )
297
298
Glossary of notation
D(.)
deg(D, f, y)
det
dim
div
dist(U, V )
Dyr (U, Y )
Dy (U, Y )
e
dF
F(X, Y )
GL(n)
(z)
H
hull(.)
H(U )
H 1 (U, Rn )
H01 (U, Rn )
i
Im(.)
inf
Jf (x)
Ker(A)
n
L(X, Y )
L(X)
Lp (X, d)
L (X, d)
Lploc (X, d)
L1 (X)
L2cont (I)
`p (N)
`2 (N)
` (N)
max
N
N0
n(, z0 )
O(.)
o(.)
Q
R
. . . domain of an operator
. . . mapping degree, 249, 257
. . . determinant
. . . dimension of a linear space
. . . divergence
= inf (x,y)U V kx yk distance of two sets
. . . functions in C r (U , Y ) which do not attain y on the boundary, 249
. . . functions in C(U , Y ) which do not attain y on the boundary, 271
. . . exponential function, ez = exp(z)
. . . derivative of F , 237
. . . set of compact finite dimensional functions, 270
. . . general linear group in n dimensions
. . . gamma function, 146
. . . a Hilbert space
. . . convex hull
. . . set of holomorphic functions on a domain U C, 247
. . . Sobolev space, 279
. . . Sobolev space, 279
. . . complex unity, i2 = 1
. . . imaginary part of a complex number
. . . infimum
= det df (x) Jacobi determinant of f at x, 249
. . . kernel of an operator A, 35
. . . Lebesgue measure in Rn , 140
. . . set of all bounded linear operators from X to Y , 36
= L(X, X)
. . . Lebesgue space of p integrable functions, 156
. . . Lebesgue space of bounded functions, 156
. . . locally p integrable functions, 156
. . . space of integrable functions, 133
. . . space of continuous square integrable functions, 31
. . . Banach space of p summable sequences, 22
. . . Hilbert space of square summable sequences, 29
. . . Banach space of bounded summable sequences, 23
. . . maximum
. . . the set of positive integers
= N {0}
. . . winding number
. . . Landau symbol, f = O(g) iff lim supxx0 |f (x)/g(x)| <
. . . Landau symbol, f = o(g) iff limxx0 |f (x)/g(x)| = 0
. . . the set of rational numbers
. . . the set of real numbers
Glossary of notation
RV(f )
Ran(A)
Re(.)
R(I, X)
n1
S n1
Sn
sign(z)
S(I, X)
sup
supp(f )
supp()
span(M )
Vn
Z
I
z
z
A
A
f
f
299
|x|
||
k.k
k.kp
h., ..i
bxc
x F (x, y)
M
(1 , 2 )
[1 , 2 ]
xn x
xn * x
An A
s
An A
An * A
Index
301
302
Index
Fr
echet space, 41
Fredholm alternative, 95
Fredholm operator, 95
Fubini theorem, 138
function
open, 14
Functional, linear, 240
fundamental solution
heat equation, 218
Laplace equation, 209
fundamental theorem of calculus, 135, 189
Galerkin approximation, 291
gamma function, 146
gauge, 76
Gaussian, 201
generalized inverse, 147
gradient, 201
GramSchmidt orthogonalization, 46
graph, 68
graph norm, 70
Green function, 62
Gronwalls inequality, 276
Hahn decomposition, 184
Hankel operator, 92
Hardy space, 167
HardyLittlewood maximal function, 231
HardyLittlewood maximal inequality, 231
HardyLittlewoodSobolev inequality, 232
Hausdorff space, 11
HausdorffYoung inequality, 227
heat equation, 3, 216
HeineBorel theorem, 18
Heisenberg uncertainty principle, 205
Helmholtz equation, 210
Hilbert space, 28
dimension, 47
HilbertSchmidt operator, 90, 169
H
older continuous, 40
H
olders inequality, 22, 159
Holomorphic function, 247
homeomorphism, 14
Homotopy, 248
Homotopy invariance, 249
identity, 37, 97
Implicit function theorem, 243
index, 95
induced topology, 11
injective, 13
inner product, 28
inner product space, 28
integrable, 132
Riemann, 151
Integral, 240
integral, 130
Index
303
product, 137
space, 114
support, 119
metric
translation invariant, 41
metric space, 9
Minkowski functional, 76
Minkowski inequality, 160
integral form, 160
mollifier, 165
monotone, 288
map, 287
strictly, 288
strongly, 288
monotone convergence theorem, 131
Morrey inequality, 214
multi-index, 40, 201
order, 40, 201
Multilinear function, 239
multiplier, see Fourier multiplier
mutually singular measures, 171
Nash equilibrium, 264
Nash theorem, 265
NavierStokes equation, 278
stationary, 278
neighborhood, 10
Neumann series, 100
Noether operator, 95
norm, 20
operator, 35
normal, 19, 103
normalized, 29
normed space, 20
nowhere dense, 65
n-person game, 263
null space, 35
onto, 13
open
ball, 9
function, 14
set, 10
operator
adjoint, 50
bounded, 35
closeable, 69
closed, 68
closure, 69
compact, 55
domain, 35
finite rank, 88
HilbertSchmidt, 169
linear, 35
nonnegative, 51
self-adjoint, 57
strong convergence, 82
304
symmetric, 57
unitary, 47
weak convergence, 82
order
partial, 293
total, 293
orthogonal, 29
complement, 48
projection, 49, 106
sum, 52
outer measure, 122
parallel, 29
parallelogram law, 30
Parseval relation, 46
partial order, 293
partition, 150
partition of unity, 19
Payoff, 263
Peano theorem, 275
perpendicular, 29
-system, 121
Plancherel identity, 203
Poincar
e inequality, 280
Poincar
e-Friedrichs inequality, 279
Poisson equation, 207
Poisson kernel, 167
Poissons formula, 222
polar coordinates, 143
polarization identity, 31
premeasure, 115
Prisoners dilemma, 264
probability measure, 114
product measure, 137
product rule, 191
product topology, 14
projection valued measure, 106
Proper, 271
proper metric space, 18
pseudometric, 9
Pythagorean theorem, 29
quadrangle inequality, 20
quasinorm, 28
quotient space, 38
Radon measure, 129
RadonNikodym
derivative, 173
theorem, 173
range, 35
rank, 88
Reduction property, 265
refinement, 15
reflexive, 74
Regular values, 249
Regulated function, 240
Index
Index
strong convergence, 82
strong type, 230
SturmLiouville problem, 6
subadditive, 230
subcover, 15
subspace topology, 11
support, 14
measure, 119
surjective, 13
Symmetric multilinear function, 239
tensor product, 53
theorem
Arzel`
aAscoli, 56
B.L.T., 36
Bair, 65
BanachSteinhaus, 66
BolzanoWeierstra, 18
closed graph, 68
dominated convergence, 134
Dynkins -, 121
Fatou, 132, 134
FatouLebesgue, 134
Fubini, 138
fundamental thm. of calculus, 135, 189
Hadamard three-lines, 226
HahnBanach, 72
HahnBanach, geometric, 77
HeineBorel, 18
HellingerToeplitz, 70
Jordan, 187
Jordanvon Neumann, 30
LaxMilgram, 52
Lebesgue, 134, 152
Lebesgue decomposition, 173
Levi, 131
Lindel
of, 15
Marcinkiewicz, 230
monotone convergence, 131
open mapping, 67
Plancherel, 203
Pythagorean, 29
RadonNikodym, 173
Riesz representation, 196
RieszFischer, 161
RieszThorin, 226
Schur, 168
Sobolev embedding, 214
StoneWeierstra, 108
Tietze, 260
Tonelli, 139
Urysohn, 19
Weierstra, 18, 26
Wiener, 207
Zorn, 293
Tietze extension theorem, 260
Tonelli theorem, 139
305
topological space, 10
topological vector space, 21
topology
base, 11
product, 14
total, 25
total order, 293
total variation, 180, 186
trace, 92
class, 90
trace topology, 11
transport equation, 223
triangle inequality, 9, 21
inverse, 9, 21
trivial topology, 10
uncertainty principle, 205
uniform boundedness principle, 66
Uniform contraction principle, 242
uniformly convex space, 33
unit sphere, 144
unit vector, 29
unitary, 103
upper semicontinuous, 127
Urysohn lemma, 19
vague convergence
measures, 197
Vandermonde determinant, 27
variation, 186
Vitali set, 120
wave equation, 5, 220
weak
derivative, 213
weak Lp , 229
weak convergence, 79
Weak solution, 283, 290
weak topology, 80
weak type, 230
weak- convergence, 84
weak- topology, 84
Weierstra approximation, 26
Weierstra theorem, 18
Wiener covering lemma, 175
Winding number, 247
Young inequality, 164, 204, 227, 228
Zorns lemma, 293