de Gruyter Studies in Mathematics 26
Heinz Bauer
Measure and Integration Theory
Walter de Gruyter
Berlin · New York 2001
Author
Heinz Bauer
Mathematisches Institut
der Universität Erlangen-Nürnberg
Bismarckstraße 1 1/2
91054 Erlangen
Germany
Translator
Robert B. Burckel
Department of Mathematics
Kansas State University
137 Cardwell Hall
Manhattan, Kansas 66506-2602
USA
Series Editors
Carlos E. Kenig
Department of Mathematics
University of Chicago
Chicago, IL 60637
USA
Andrew Ranicki
Department of Mathematics
University of Edinburgh
Mayfield Road
Scotland
Michael Röckner
Fakultät für Mathematik
Universität Bielefeld
Universitätsstraße 25
33615 Bielefeld
Germany
I. Title.
II. Series.
2001028235
Measure and integration theory / Heinz Bauer. Transl. from the German by
Robert B. Burckel. - Berlin ; New York : de Gruyter, 2001
(De Gruyter studies in mathematics ; 26)
Einheitssacht.: Mass- und Integrationstheorie (engl.)
ISBN 3-11-016719-0
Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany.
All rights reserved including those of translation into foreign languages. No part of this book may be
reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or
any information storage and retrieval system, without permission in writing from the publisher.
Printed in Germany.
Typesetting: Oldřich Ulrych, Prague, Czech Republic.
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen.
Cover design: Rudolf Hübler, Berlin.
In memoriam
OTTO HAUPT
(5.3.1887 - 10.11.1988)
former Professor of Mathematics
Preface
for self-study and for examination preparations, even if their interests often did
not extend as far as the probability theory.
When the decision was made to rewrite and extend the parts devoted to probability theory, it was also decided to publish the part on measure and integration
theory as a separate volume. This volume had to serve two purposes. As before
it had to provide the measure-theoretic background for my book on probability
theory. Secondly, it should be a self-contained introduction into the field. The German edition of this book was published in 1990 (with a second edition in 1992),
followed in 1992 by the rewritten book on probability theory. The latter was translated into English and the translation was published in 1995 as Probability Theory
(Volume 23) in this series.
However, once again this book is much more than a pure translation of the
German original and the following quotation of the preface of my book Probability
Theory, applies a further time: "It is in fact a revised and improved version of
that book. A translator, in the sense of the word, could never do this job. This
explains why I have to express my deep gratitude to my very special translator, to
my American colleague Professor Robert B. Burckel from Kansas State University.
He had gotten to know my book by reading its very first German edition. I owe
our friendship to his early interest in it. He expended great energy, especially on
this new book, using his extensive acquaintance with the literature to make many
knowledgeable suggestions, pressing for greater clarity and giving intensive support
in bringing this enterprise to a good conclusion."
In addition I want to thank Dr. Oldřich Ulrych from Prague for his skill and patience in preparing the book manuscript in TeX for final processing. Many thanks
are due to my family and Professor Niels Jacob, University of Swansea, for reasons
they will know. Finally, I thank my publisher Walter de Gruyter & Co., and, above
all, Dr. Manfred Karbe for publishing the translation of my book.
Erlangen, March 2001
Heinz Bauer
Introduction
Measure theory and integration are closely interwoven theories, both content-wise
and in their historical developments. They form a unit. The development of analysis in the 19th century - here one is thinking especially about the theory of Fourier
series and classical function theory - compelled the creation of a sufficiently general concept of the integral that discontinuous functions could also be integrated.
The jump function of P. G. LEJEUNE DIRICHLET should be seen in this light. At
that time only an integration theory due to CAUCHY, a precursor of Riemann's,
was known. And it was not until B. RIEMANN's Habilitation in 1854 (text published posthumously in 1867) that Cauchy's ideas were made sufficiently precise
to integrate (certain) discontinuous functions. For the first time the need was felt
for integrability criteria. Parallel to this a "theory of content" was evolving - primarily at the hands of G. PEANO and C. JORDAN - to measure the areas of plane
and the volumes of spatial "figures".
But the decisive breakthrough occurred at the turn of the century, thanks to
the French mathematicians EMILE BOREL and HENRI LEBESGUE. In 1898 Borel -
coming from the direction of function theory - described the "σ-algebra" of sets
that today bear his name, the Borel sets, and showed how to construct a "measure"
on this σ-algebra that satisfactorily resolved the problems of measuring content.
In particular, he recognized the significance of the "σ-additivity" of the measure.
In his thesis (1902) LEBESGUE presented the integral concept, subsequently named
after him, that proved decisive for the development of a general theory. At the same
time he furnished the tools needed to make Borel's ideas more precise. From then
on Lebesgue-Borel measure on the σ-algebra of Borel sets and Lebesgue measure
on a somewhat larger σ-algebra - consisting of the sets which are "measurable" in
Lebesgue's sense - became standard methods of analysis.
What was new about Lebesgue's integral concept was not just the way it was
defined, but also - and this was the real reason for its fame - its great versatility as
manifested in the way it behaved with respect to limit operations. Consequently
the convergence theorems are at the center of the integration theory developed by
Lebesgue and his intellectual progeny.
Subsequent developments are characterized by increasing recognition of the
versatility of Lebesgue's concepts in dealing with new demands from mathematics
and its applications. In the course of time (up to 1930) the general (abstract) measure concept crystallized, and a theory of integration built on it - after Lebesgue's
model.
analysis on locally compact groups, and mathematical economics. But the foremost example is probability theory, which uses measure and integration as an
indispensable tool and whose own specific kinds of questions and methods have
in turn helped to shape the former. Even today the development of measure and
integration theory is far from finished.
The book is comprised of four chapters. The first is devoted to the measure
concept and in particular to the Lebesgue-Borel measure and its interplay with
geometry. In the second chapter the integral determined by a measure, and in
particular the Lebesgue integral, the one determined by Lebesgue-Borel measure,
will be introduced and investigated. The short third chapter deals with the product
of measures and the associated integration. An application of this which is very
important in Fourier analysis is the convolution of measures. In the fourth and
last chapter the abstract concept of measure is made more concrete in the form
of Radon measures. As in the original example of Lebesgue-Borel measure, here
the relation of the measure to a topology on the underlying set moves into the
foreground. Essentially two kinds of spaces are allowed: Polish spaces and locally
compact spaces. The topological tools needed for this will mostly be developed in
the text, with the reader occasionally being given only a reference (very specific)
to the standard textbook literature.
The examples accompanying the exposition of a theme have an important function. They are supposed to illuminate the concepts and illustrate the limitations
of the theory. The reader should therefore work through them with care. Exercises also accompany the exposition. They are not essential to understanding later
developments and, in particular, proofs are not superficially shortened by consigning parts to the exercises. But the exercises do serve to deepen the reader's
understanding of the material treated in the text, and working them is strongly
recommended.
Notations
Here we assemble some of the notation and phraseology which will be used in the
text without further comment and which - with but a few exceptions - are in
general use.
$(+\infty) = +\infty$ and $(-\infty) + (-\infty) = -\infty$. On the other hand $+\infty + (-\infty)$ and
$-\infty + (+\infty)$ are not defined.
As usual too we set $a \cdot (\pm\infty) = \pm\infty$ for all real $a > 0$, including $a = +\infty$, and
$a \cdot (\pm\infty) = \mp\infty$ for all real $a < 0$, including $a = -\infty$. Not so general but typical
in measure theory are the additional conventions
(a, b) will never be used for an open interval, but only for the ordered pair with
first element a, second element b.
For every pair of elements $a, b \in \overline{\mathbb{R}}$
$a \vee b := \max\{a, b\}$,
$a \wedge b := \min\{a, b\}$
Of course, $a^+ \ge 0$ and $a^- \ge 0$ for all $a$. For finitely many $a_1, \ldots, a_n \in \overline{\mathbb{R}}$ the
corresponding expressions $a_1 \vee \ldots \vee a_n$ and $a_1 \wedge \ldots \wedge a_n$ stand for $\max\{a_1, \ldots, a_n\}$
we say that ultimately all terms of the sequence have the property. The popular
phrasing "almost all terms of the sequence possess the property" has to be avoided
in measure theory because there the concept "almost all" is employed in another
sense.
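function $x \mapsto \sum_{n=1}^{\infty} f_n(x)$. Also, functions like $\sup_{n \in \mathbb{N}} f_n$, $\inf_{n \in \mathbb{N}} f_n$, $\limsup_{n \to \infty} f_n$, $\liminf_{n \to \infty} f_n$,
$\lim_{n \to \infty} f_n$ are defined "pointwise" via $x \mapsto \sup_{n \in \mathbb{N}} f_n(x)$, $x \mapsto \inf_{n \in \mathbb{N}} f_n(x)$, etc.; whereby,
of course, use of $\lim f_n$ presupposes the convergence in $\overline{\mathbb{R}}$ of the sequence $(f_n(x))$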
$f_1 \vee \ldots \vee f_n$
and
$f_1 \wedge \ldots \wedge f_n$
and
$x \mapsto f_1(x) \wedge \ldots \wedge f_n(x)$.
At each point $x \in X$ they assume, respectively, the largest and smallest of the
function values $f_1(x), \ldots, f_n(x)$. These two functions are called, respectively, the
upper and the lower envelopes of $f_1, \ldots, f_n$. Correspondingly, $\sup_{n \in \mathbb{N}} f_n$ and $\inf_{n \in \mathbb{N}} f_n$ are
called the upper and lower envelopes of the sequence $(f_n)$ of numerical functions
on $X$.
A numerical function defined on a subset of $\mathbb{R}$ is called isotone, resp., antitone, if
it is weakly increasing, resp., decreasing. We use this terminology also for numerical
functions $f : A \to \overline{\mathbb{R}}$ when $A$ is a (partially) ordered set. That is, if from $x, y \in A$
and $x \le y$ always follows $f(x) \le f(y)$, resp., $f(x) \ge f(y)$, then $f$ is called isotone,
resp., antitone. If from $x < y$ always follows $f(x) < f(y)$, resp., $f(x) > f(y)$, then
$f$ is called strictly isotone, resp., strictly antitone.
For sequences $(a_n)$ in $\overline{\mathbb{R}}$ the symbolisms
$a_n \uparrow a$,
$a_n \downarrow a$
express that the sequence is isotone, resp., antitone, and that $a \in \overline{\mathbb{R}}$ is its supremum, resp., its infimum.
The end of a proof is signaled by the symbol □.
References of the form "RADON [1913]" are to the bibliography at the end of
the book.
Section 18, labelled with *, can be skipped over in a first reading.
Table of Contents
Preface
Introduction
Notations
2. Dynkin systems
3. Contents, premeasures, measures
4. Lebesgue premeasure
Bibliography
Symbol Index
Name Index
Subject Index
Chapter I
Measure Theory
To geometrically simple subsets of the line, the plane, and 3-dimensional space,
elementary geometry assigns "numerical measures" called length, area and volume.
At first all that is intuitively clear is how the length of a segment, the area of
a rectangle and the volume of a box should be defined. Proceeding from these we
can determine by elementary geometric methods the lengths, areas, and volumes
of more complicated sets if we accept certain calculational rules for dealing with
such numerical measures.
If one thinks for example about the elementary determination of the area of
a (topologically) open triangle, one begins by decomposing it via one of its altitudes
into two open right triangles and the altitude itself. One further recalls that every
right triangle arises from insertion of a diagonal into an appropriate rectangle.
Every line segment is assigned numerical measure 0 when considered as a surface.
The following two rules of calculation therefore lead to the determination of the
areas of triangles:
(A) If the set $A$ has numerical measure $\alpha$, and $B$ is congruent to $A$, then $B$ also
has numerical measure $\alpha$.
(B) If $A$ and $B$ are disjoint sets with numerical measures $\alpha$ and $\beta$, resp., then
$A \cup B$ has numerical measure $\alpha + \beta$.
The limits of such elementary geometric considerations are already reached in
defining the area of an open disk $K$, to which end one proceeds thus: A sequence of
open $3 \cdot 2^{n-1}$-gons $E_n$ ($n \in \mathbb{N}$) is inscribed in $K$, with $E_1$ being an open equilateral
triangle, and the vertices of $E_{n+1}$ being those of $E_n$ together with the intersections
of the circle with the radii perpendicular to the sides of $E_n$. Thus $E_{n+1}$ consists
of $E_n$ together with its $3 \cdot 2^{n-1}$ edges and the open isosceles triangles which have
these edges as bases and vertices on the circle. Since $K$ is the union of all
the $E_n$, it looks like a "mosaic of triangles", that is, like a union of disjoint open
triangles and segments (namely, common sides of various triangles). The following
broader formulation of (B) therefore leads to a definition of the area of the disk $K$:
(C) If $(A_n)$ is a sequence of pairwise disjoint sets, and $A_n$ has numerical measure $\alpha_n$ ($n \in \mathbb{N}$), then $\bigcup_{n=1}^{\infty} A_n$ has numerical measure $\sum_{n=1}^{\infty} \alpha_n$.
If we replace $K$ and every $E_n$ by its topological closure, this method would not
lead to a plausible definition of the area of a closed disk $\overline{K}$, because $\overline{K}$ is not the
union of the closures $\overline{E_n}$ of the above constructed polygons $E_n$. A peculiarity and
disadvantage of the elementary geometric procedure is precisely the necessity of
intuitive geometric one. It is just the latter reason that explains the variety of
opportunities for applying measure theory in analysis, geometry and stochastics.
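(1.1)  $\Omega \in \mathscr{A}$;
(1.2)  $A \in \mathscr{A} \implies \complement A \in \mathscr{A}$;
(1.3)  $(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{A}$.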
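For any set $\Omega$ the system of all its subsets which are either countable or
co-countable, that is, the $A \subset \Omega$ such that either $A$ or $\complement A$ is countable, constitutes a σ-algebra. Property (1.3) is confirmed as follows: If each $A_n$ is countable, then so
is the union $\bigcup_{n \in \mathbb{N}} A_n$. If some $A_{n_0}$ is not countable, then its complement is, and
$\complement \bigcup_{n \in \mathbb{N}} A_n = \bigcap_{n \in \mathbb{N}} \complement A_n \subset \complement A_{n_0}$
is likewise countable.
3.
(1.4)  $\Omega' \cap \mathscr{A} := \{\Omega' \cap A : A \in \mathscr{A}\}$
is a σ-algebra in $\Omega'$, called the trace of $\mathscr{A}$ in $\Omega'$. In case $\Omega' \in \mathscr{A}$, $\Omega' \cap \mathscr{A}$ consists
simply of all the subsets of $\Omega'$ which are elements of $\mathscr{A}$.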
4. Let $\Omega$, $\Omega'$ be sets, $\mathscr{A}'$ a σ-algebra in $\Omega'$, and $T : \Omega \to \Omega'$ a mapping. Then the
system of sets
(1.5)  $T^{-1}(\mathscr{A}') := \{T^{-1}(A') : A' \in \mathscr{A}'\}$
is a σ-algebra in $\Omega$, as follows from the known behavior of the set-theoretic operations under inverse mappings (like $T^{-1}$ here).
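(1.6)  $\emptyset \in \mathscr{A}$;
(1.7)  $(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcap_{n \in \mathbb{N}} A_n \in \mathscr{A}$.
These follow from (1.1)-(1.3) and the identities $\emptyset = \complement\Omega$ and $\bigcap A_n = \complement\bigl(\bigcup \complement A_n\bigr)$.
Moreover,
$A_1 \cup \ldots \cup A_n = A_1 \cup \ldots \cup A_n \cup \emptyset \cup \emptyset \cup \ldots$
and
$A, B \in \mathscr{A} \implies A \setminus B = A \cap \complement B \in \mathscr{A}$.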
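(i) $\mathscr{E} \subset \sigma(\mathscr{E})$,
(ii) for every σ-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$, $\sigma(\mathscr{E}) \subset \mathscr{A}$.
For a proof, consider the system $\Sigma$ of all σ-algebras $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$;
for example, $\mathscr{P}(\Omega)$ is an element of $\Sigma$. Then $\sigma(\mathscr{E})$ is the intersection of all the
$\mathscr{A} \in \Sigma$, which according to 1.2 possesses all the desired properties.
$\sigma(\mathscr{E})$ is called the σ-algebra generated by $\mathscr{E}$ (in $\Omega$) and $\mathscr{E}$ is called a generator
of $\sigma(\mathscr{E})$.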
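On a finite set the intersection construction above can be replaced by a closure computation, which makes the generated σ-algebra easy to inspect. The following is a minimal sketch under that finiteness assumption; the function name `generated_sigma_algebra` and the sample sets are illustrative and do not come from the text.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generators):
    """Close the generators under complement and union inside a finite omega.

    For finite omega, countable unions reduce to finite ones, so the
    resulting family is the smallest sigma-algebra containing the generators.
    """
    omega = frozenset(omega)
    family = {omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            complement = omega - a
            if complement not in family:
                family.add(complement)
                changed = True
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family

omega = {1, 2, 3, 4}
sigma = generated_sigma_algebra(omega, [{1}, {1, 2}])
# The atoms are {1}, {2}, {3, 4}, so sigma consists of all unions of atoms.
print(sorted(sorted(s) for s in sigma))
```

Because every σ-algebra containing the generators must contain each set produced by the loop, the result satisfies (i) and (ii) in this finite setting.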
Examples. 5. If $\mathscr{E}$ itself is a σ-algebra in $\Omega$, then $\mathscr{E} = \sigma(\mathscr{E})$.
6.
7.
of $\Omega$.
Several systems of sets possessing some of the properties of σ-algebras frequently occur as generators. Of special interest are rings of sets.
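(1.9)  $\emptyset \in \mathscr{R}$;
(1.10)  $A, B \in \mathscr{R} \implies A \setminus B \in \mathscr{R}$;
(1.11)  $A, B \in \mathscr{R} \implies A \cup B \in \mathscr{R}$.
If in addition
(1.12)  $\Omega \in \mathscr{R}$
$A \setminus B = A \cap \complement B = \complement(B \cup \complement A)$. □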
Examples. 8. Every σ-algebra is an algebra.
9. For any set $\Omega$ the system of all sets $A \subset \Omega$ which are either finite or co-finite
(i.e., have finite complement in $\Omega$) is an algebra, but is a σ-algebra only if $\Omega$ is
finite.
10. The system of all finite subsets of a set $\Omega$ is a ring, but is an algebra only
if $\Omega$ itself is finite.
11.
Exercises.
1. For every system $\mathscr{E}$ of subsets of a set $\Omega$ there exists a smallest ring $\rho(\mathscr{E})$
in $\Omega$ which contains $\mathscr{E}$. It is called the ring generated by $\mathscr{E}$. Prove this existence
assertion. Determine $\rho(\mathscr{E})$ and $\sigma(\mathscr{E})$ in the case where $\mathscr{E}$ consists of two subsets
$A$, $B$ of $\Omega$. When does $\rho(\mathscr{E}) = \sigma(\mathscr{E})$ hold in this latter case; when does it hold for
general $\mathscr{E}$?
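$A \triangle B := (A \setminus B) \cup (B \setminus A)$
is called their symmetric difference. Prove that it obeys the following rules of
calculation (in which $A$, $B$, $C$ are arbitrary sets):
$A \triangle B = B \triangle A$;
$(A \triangle B) \triangle C = A \triangle (B \triangle C)$;
$\complement A \triangle \complement B = A \triangle B$;
$(A \triangle B) \cap C = (A \cap C) \triangle (B \cap C)$;
$A \triangle A = \emptyset$;
$A \triangle \emptyset = A$;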
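$\emptyset \in \mathscr{N}$;
$N \in \mathscr{N},\ M \in \mathscr{R},\ M \subset N \implies M \in \mathscr{N}$;
$M, N \in \mathscr{N} \implies M \cup N \in \mathscr{N}$.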
2. Dynkin systems
It is often difficult to directly determine whether a given system of sets is a σ-algebra. The following concept, which goes back to DYNKIN [1961] but in inchoate
form even to SIERPIŃSKI [1928], helps to get around some of these difficulties.
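(2.1)  $\Omega \in \mathscr{D}$;
(2.2)  $D \in \mathscr{D} \implies \complement D \in \mathscr{D}$;
(2.3)  $D_n \in \mathscr{D}$ pairwise disjoint ($n \in \mathbb{N}$) $\implies \bigcup_{n \in \mathbb{N}} D_n \in \mathscr{D}$.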
Every Dynkin system $\mathscr{D}$ thus contains the empty set $\emptyset = \complement\Omega$, and then (2.3)
also insures that $\mathscr{D}$ contains the union of every finite, pairwise disjoint collection
of its sets.
Examples. 1. Every σ-algebra is obviously a Dynkin system.
2. Let $\Omega$ be a finite set with an even number $2n$ of elements ($n \in \mathbb{N}$). Then the
system $\mathscr{D}$ of all $D \subset \Omega$ which contain an even number of elements is a Dynkin
system. In case $n > 1$, $\mathscr{D}$ is not an algebra, hence certainly not a σ-algebra.
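The even-cardinality example can be verified by brute force for small sets. The sketch below is only an illustration; the helper names are hypothetical, and the reduction of (2.3) to disjoint pairs relies on the family being finite.

```python
from itertools import combinations

def even_subsets(omega):
    """All subsets of omega with an even number of elements."""
    points = sorted(omega)
    return {frozenset(c)
            for size in range(0, len(points) + 1, 2)
            for c in combinations(points, size)}

def is_dynkin_system(omega, family):
    """Check (2.1)-(2.3) for a finite family of subsets of a finite omega."""
    omega = frozenset(omega)
    if omega not in family:
        return False
    if any(omega - d not in family for d in family):
        return False
    # In a finite family, disjoint countable unions reduce to disjoint pairs.
    return all(a | b in family for a in family for b in family if not (a & b))

def is_intersection_stable(family):
    return all(a & b in family for a in family for b in family)

four_points = {1, 2, 3, 4}        # the case n = 2
family_4 = even_subsets(four_points)
two_points = {1, 2}               # the case n = 1
family_2 = even_subsets(two_points)

print(is_dynkin_system(four_points, family_4))   # Dynkin system
print(is_intersection_stable(family_4))          # fails: {1,2} & {2,3} = {2}
```

For `n = 1` the family is just the empty set and the whole set, which is a σ-algebra; the failure of intersection stability only appears for `n > 1`.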
The precise connection between the concepts of σ-algebra and Dynkin system
is elucidated in the following considerations:
2.2 Lemma. Every Dynkin system $\mathscr{D}$ is closed with respect to the formation of
proper complements, meaning that
(2.2')  $D, E \in \mathscr{D},\ D \subset E \implies E \setminus D \in \mathscr{D}$.
Proof. According to what was noted right after definition 2.1, the set $D \cup \complement E$,
being the union of the disjoint sets $D$ and $\complement E$ from $\mathscr{D}$, lies in $\mathscr{D}$. But then the
complement of this set with respect to $\Omega$, that is, $E \cap \complement D = E \setminus D$, lies in $\mathscr{D}$.
Consequently, Dynkin systems can also be defined via properties (2.1), (2.2')
and (2.3).
Proof. What needs to be shown is that every Dynkin system $\mathscr{D}$ which is closed
under finite intersections is a σ-algebra. Of the defining properties of a σ-algebra,
only (1.3) needs to be confirmed and we do that thus: According to (2.2') and
the closure hypothesis, $A \setminus B = A \setminus (A \cap B)$ lies in $\mathscr{D}$ whenever $A, B \in \mathscr{D}$. Since
$(A \setminus B) \cap B = \emptyset$ and $A \cup B = (A \setminus B) \cup B$, $\mathscr{D}$ contains the union of any two, hence
the union of any finitely many, of its elements. For any sequence $(D_n)_{n \in \mathbb{N}} \subset \mathscr{D}$,
we have
$\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=0}^{\infty} (D'_{n+1} \setminus D'_n)$
in which $D'_0 := \emptyset$ and $D'_n := D_1 \cup \ldots \cup D_n$ for each $n \in \mathbb{N}$. The sets $D'_{n+1} \setminus D'_n$ are
pairwise disjoint and, thanks to (2.2') and what has already been proved, they lie
in $\mathscr{D}$. According to (2.3) then the union of the sets $D_n$ lies in $\mathscr{D}$. □
Just as for σ-algebras, algebras and rings, every system $\mathscr{E} \subset \mathscr{P}(\Omega)$ lies in
a smallest Dynkin system. It is, of course, called the Dynkin system generated
by $\mathscr{E}$, and is denoted $\delta(\mathscr{E})$.
The significance of Dynkin systems lies primarily in the following fact:
2.4 Theorem. Every $\mathscr{E} \subset \mathscr{P}(\Omega)$ which is closed with respect to finite intersection
satisfies
(2.4)  $\delta(\mathscr{E}) = \sigma(\mathscr{E})$.
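On a finite set, both δ(ℰ) and σ(ℰ) can be computed by closure, so the role of the intersection hypothesis can be observed directly. This is a hedged sketch with hypothetical names (`close_family`, the two sample generators), not a procedure from the text.

```python
from itertools import combinations

def close_family(omega, generators, disjoint_only):
    """Close under complement and union (only disjoint unions if requested).

    With disjoint_only=True this yields the generated Dynkin system of a
    finite omega, with disjoint_only=False the generated sigma-algebra.
    """
    omega = frozenset(omega)
    family = {omega} | {frozenset(g) for g in generators}
    while True:
        new = {omega - a for a in family}
        for a, b in combinations(family, 2):
            if not disjoint_only or not (a & b):
                new.add(a | b)
        if new <= family:
            return family
        family |= new

omega = {1, 2, 3, 4}

stable = [{1}, {1, 2}]          # closed under intersection
delta_stable = close_family(omega, stable, disjoint_only=True)
sigma_stable = close_family(omega, stable, disjoint_only=False)

unstable = [{1, 2}, {2, 3}]     # {1,2} & {2,3} = {2} is missing
delta_unstable = close_family(omega, unstable, disjoint_only=True)
sigma_unstable = close_family(omega, unstable, disjoint_only=False)

print(delta_stable == sigma_stable)              # the two closures agree
print(len(delta_unstable), len(sigma_unstable))  # the Dynkin system is smaller
```

The second generator shows that the intersection hypothesis cannot simply be dropped: the Dynkin system it generates is strictly smaller than the generated σ-algebra.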
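Proof. Since every σ-algebra is a Dynkin system, $\sigma(\mathscr{E})$ is a Dynkin system containing $\mathscr{E}$ and consequently $\delta(\mathscr{E}) \subset \sigma(\mathscr{E})$. If conversely, $\delta(\mathscr{E})$ were known to be
a σ-algebra, the dual relation $\sigma(\mathscr{E}) \subset \delta(\mathscr{E})$ would also follow. In view of 2.3 therefore it suffices to show that $\delta(\mathscr{E})$ is closed under intersection. To prove this, we
introduce for every $D \in \delta(\mathscr{E})$ the system
$\mathscr{D}_D := \{Q \in \mathscr{P}(\Omega) : Q \cap D \in \delta(\mathscr{E})\}$.
A routine check confirms that $\mathscr{D}_D$ is a Dynkin system. For every $E \in \mathscr{E}$ the
hypothesis on $\mathscr{E}$ insures that $\mathscr{E} \subset \mathscr{D}_E$ and therewith that $\delta(\mathscr{E}) \subset \mathscr{D}_E$. Thus for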
Exercise.
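Determine the Dynkin system generated by the system $\mathscr{E}$ consisting of just two
subsets $A$, $B$ of $\Omega$. Show that $\delta(\mathscr{E})$ and $\sigma(\mathscr{E})$ coincide just in case one of the sets
$A \cap B$, $A \cap \complement B$, $B \cap \complement A$ or $\complement A \cap \complement B$ is empty.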
(3.1)  $\mu(\emptyset) = 0$
and for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
(3.2)  $\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n)$  (σ-additivity)
(3.3)  $\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i)$  (finite additivity)
(for every two and therewith) for every finitely many pairwise disjoint sets $A_1, \ldots, A_n \in \mathscr{R}$.
Due to (3.1) every premeasure is evidently a content. To see this, you have only
to take $A_{n+1} = A_{n+2} = \ldots = \emptyset$ in (3.2).
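$\varepsilon_\omega$
by
$\varepsilon_\omega(A) := 1$ if $\omega \in A$,
$\varepsilon_\omega(A) := 0$ if $\omega \notin A$
is a premeasure. It is called the premeasure defined by unit mass at $\omega$.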
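The unit mass is simple enough to write down directly. The sketch below uses a hypothetical helper `unit_mass` and checks additivity on one pair of disjoint finite sets; it illustrates the definition and does not prove σ-additivity.

```python
def unit_mass(point):
    """Return the set function A -> 1 if point lies in A, else 0."""
    def eps(a):
        return 1 if point in a else 0
    return eps

eps_2 = unit_mass(2)

a = {1, 2}
b = {3, 4}          # disjoint from a
print(eps_2(a), eps_2(b), eps_2(a | b))

# On a finite set, summing the unit masses over all points counts elements.
omega = {1, 2, 3, 4}
def sum_of_unit_masses(subset):
    return sum(unit_mass(w)(subset) for w in omega)
```

The last function reflects the observation that on a finite set the sum of all unit masses assigns to each subset its number of elements.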
3.
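(isotoneity);
(subtractivity);
(3.8)  $\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) \le \sum_{i=1}^{n} \mu(A_i)$  (subadditivity);
for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
(3.9)  $\sum_{n=1}^{\infty} \mu(A_n) \le \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr)$.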
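$A \cup B = A \cup (B \setminus A)$
and
$B = (A \cap B) \cup (B \setminus A)$.
$\mu(A \cup B) = \mu(A) + \mu(B \setminus A)$
and
$\mu(A \cup B) + \mu(A \cap B) + \mu(B \setminus A) = \mu(A) + \mu(B) + \mu(B \setminus A)$.
In case $\mu(B \setminus A)$ is finite, (3.5) follows from this. In case $\mu(B \setminus A) = +\infty$, the
formulas for $\mu(A \cup B)$, $\mu(B)$ show that each of them must also equal $+\infty$, and
(3.5) consequently holds in this case too. If $A \subset B$, the preceding formula for $\mu(B)$
reads
$\mu\Bigl(\bigcup_{i=1}^{n} B_i\Bigr) = \sum_{i=1}^{n} \mu(B_i)$
now follows (3.8). To prove (3.9) we only have to observe that for every sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathscr{R}$ with $A := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{R}$
($m \in \mathbb{N}$)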
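Finally, if $\mu$ is a premeasure on $\mathscr{R}$, then for any sets $A_0, A_1, \ldots \in \mathscr{R}$
(3.10)  $A_0 \subset \bigcup_{n=1}^{\infty} A_n \implies \mu(A_0) \le \sum_{n=1}^{\infty} \mu(A_n)$.
Because of $A_0 = \bigcup (A_0 \cap A_n)$ and (3.6), we can assume, in verifying (3.10), that
$A_0 = \bigcup A_n$. Then set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$
and proceed as in the proof of (3.8).
In particular, we now have
(3.10')  $\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \le \sum_{n=1}^{\infty} \mu(A_n)$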
(3.11)  $E_n \uparrow E$
and
$E_n \downarrow E$
which mean that the sets $E_1 \subset E_2 \subset \ldots$ satisfy $E = \bigcup E_n$, or that the sets
$E_1 \supset E_2 \supset \ldots$ satisfy $E = \bigcap E_n$. In other words, the sequence $(E_n)$ either
increases isotonically to $E$ or decreases antitonically to $E$.
$\lim_{n \to \infty} \mu(A_n) = 0$
(a) $\iff$ (b) $\implies$ (c) $\iff$ (d).
If $\mu$ is finite on $\mathscr{R}$, that is, $\mu(A) < +\infty$ for all $A \in \mathscr{R}$, then all four statements
(a)-(d) are equivalent.
$A = \bigcup_{n=1}^{\infty} B_n$,
$A_n = B_1 \cup \ldots \cup B_n$.
(b) $\implies$ (a): Let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{R}$ whose union
$A := \bigcup A_n$ also is in $\mathscr{R}$. If we set $B_n := A_1 \cup \ldots \cup A_n$, then $B_n \in \mathscr{R}$ and $B_n \uparrow A$;
therefore $\mu(A) = \lim \mu(B_n)$. As a result of the finite additivity of $\mu$
From this follows (c), because $A \subset A_n$ means that also $\mu(A) < +\infty$ and so
$\mu(A_1 \setminus A) = \mu(A_1) - \mu(A)$.
(c) $\implies$ (d): Here there is nothing to prove!
From $A_n \downarrow A$ follows $A_n \setminus A \downarrow \emptyset$. Since $A_n \setminus A \subset A_n$, the isotoneity of $\mu$
means that along with $\mu(A_n)$, $\mu(A_n \setminus A)$ is finite too. Hence by (d), $\lim \mu(A_n \setminus A) = 0$.
But then (c) follows because $\mu(A) \le \mu(A_n) < +\infty$, causing $\mu(A_n \setminus A)$ to equal
$\mu(A_n) - \mu(A)$
To finish off, let us consider the case that $\mu$ is finite, and show that then
(d) $\implies$ (b): If $(A_n)$ is a sequence of sets from $\mathscr{R}$ and $A_n \uparrow A \in \mathscr{R}$, then $A \setminus A_n \downarrow \emptyset$.
Taking account of the finiteness of $\mu$, it therefore follows that $0 = \lim \mu(A \setminus A_n) =$
Thus a measure is a non-negative, numerical function $\mu$ defined on a σ-algebra $\mathscr{A}$ and enjoying properties (3.1) and (3.2). The constant function $\mu = 0$
is a measure on every σ-algebra, the so-called zero-measure. The examples that
follow are still of a rather formal nature. But as early as § 6 and then quite a bit
later we will become acquainted with an abundance of important examples.
7.
(3.12)
Proof. Designate the union of all the $A_n$ as $C$. For each $r = 1, \ldots, k$ the sets
$(A_{r+mk})_{m \in \mathbb{N}_0}$ are pairwise disjoint. So if we set
$F_r := \bigcup_{m=0}^{\infty} A_{r+mk}$
then
$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{r=1}^{k} \mu(F_r)$
From this equality and the preceding inequality the asserted inequality can be read
off.
Exercises.
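1. Let $\Omega$ be a finite, non-empty set. Show that the counting measure $\zeta$ on $\mathscr{P}(\Omega)$
coincides with $\sum_{\omega \in \Omega} \varepsilon_\omega$. Show further that every measure $\mu$ on $\mathscr{P}(\Omega)$ has the form
$\mu =$
$\sum_{i=1}^{n} \ldots \sum_{1 \le i < j \le n} \ldots \sum_{1 \le i < j < k \le n} \mu(A_i \cap A_j \cap A_k) - + \ldots + (-1)^{n-1} \mu(A_1 \cap \ldots \cap A_n)$.
3. For a premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ define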
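5. Let $\mu$ be a measure on a σ-algebra $\mathscr{A}$ in $\Omega$, and denote by $\mathscr{N}_\mu$ the set of all
μ-null (or μ-negligible) sets, that is, the $N \in \mathscr{A}$ for which $\mu(N) = 0$. Check that
$\mathscr{N}_\mu$ has the following properties:
(a) $\emptyset \in \mathscr{N}_\mu$;
(b) $N \in \mathscr{N}_\mu,\ M \in \mathscr{A},\ M \subset N \implies M \in \mathscr{N}_\mu$;
(c) $(N_n)_{n \in \mathbb{N}} \subset \mathscr{N}_\mu \implies \bigcup_{n \in \mathbb{N}} N_n \in \mathscr{N}_\mu$.
Subsets of $\mathscr{A}$ with these properties are called σ-ideals in $\mathscr{A}$. Thus $\mathscr{N}_\mu$ is always
a σ-ideal. (Cf. Exercise 4 of § 1.)
6. Every σ-ideal $\mathscr{N}$ in a σ-algebra $\mathscr{A}$ is the σ-ideal $\mathscr{N}_\mu$ of μ-null sets of an appropriate measure $\mu$ on $\mathscr{A}$. To get such a $\mu$, define
$\mu(A) := 0$ if $A \in \mathscr{N}$,
$\mu(A) := +\infty$ if $A \in \mathscr{A} \setminus \mathscr{N}$.
As a special case, on the power set $\mathscr{P}(\Omega)$ of any set $\Omega$ there is a measure $\mu$ such
that $\mu(A) = 0$ precisely if $A$ is a countable subset of $\Omega$.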
($A, B \in \mathscr{R}$)
defines a pseudometric on $\mathscr{R}$, that is, $d_\mu$ has all the properties of a metric on $\mathscr{R}$ with
one possible exception: $d_\mu(A, B) = 0$ can happen without $A = B$. (Cf. Exercise 3
of § 15.)
4. Lebesgue premeasure
Now we specialize $\Omega$ to be the d-dimensional number-line $\mathbb{R}^d$ ($d \in \mathbb{N}$). For every
two points $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d) \in \mathbb{R}^d$ we write $a \le b$ (resp., $a \vartriangleleft b$)
if $\alpha_i \le \beta_i$ for all $i = 1, \ldots, d$ (resp., $\alpha_i < \beta_i$ for all $i = 1, \ldots, d$). Every set of the
form
(4.1)
is called its d-dimensional elementary content. It equals 0 just when $[a, b[ = \emptyset$, that
is, when $a \vartriangleleft b$ fails (although $a \le b$ holds, a prerequisite to employing interval
notation).
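The displayed formula for the elementary content did not survive in the passage above. The sketch below assumes the standard definition, namely the product of the side lengths $\beta_i - \alpha_i$ of the interval $[a, b[$; the function name `elementary_content` is illustrative.

```python
from math import prod

def elementary_content(a, b):
    """Product of the side lengths of the right half-open interval [a, b[.

    Assumes a <= b coordinatewise, as required for interval notation.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same dimension")
    if any(alpha > beta for alpha, beta in zip(a, b)):
        raise ValueError("interval notation requires a <= b coordinatewise")
    return prod(beta - alpha for alpha, beta in zip(a, b))

rectangle = elementary_content((0, 0), (2, 3))     # a 2-by-3 rectangle
degenerate = elementary_content((0, 1), (2, 1))    # one side has length 0
print(rectangle, degenerate)
```

The degenerate case matches the remark in the text: the content is 0 exactly when some side has length 0, which is when the half-open interval is empty.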
From now on, $\mathscr{I}^d$ shall designate the set of all right half-open intervals in $\mathbb{R}^d$,
and $\mathscr{F}^d$ the system of all finite unions of such intervals, so $\mathscr{I}^d \subset \mathscr{F}^d$. The elements
of $\mathscr{F}^d$ are called d-dimensional figures.
$I \cap J \in \mathscr{I}^d$
and
$J \setminus I \in \mathscr{F}^d$.
Every figure is a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$.
Proof. Let $I = [a, b[$, $J = [a', b'[$ with $a \le b$, $a' \le b'$, and let the corresponding
coordinates of these points be $\alpha_i$, $\beta_i$, $\alpha'_i$, $\beta'_i$. If we let $e$ and $f$ denote those points
in $\mathbb{R}^d$ whose coordinates are $\max\{\alpha_i, \alpha'_i\}$ and $\min\{\beta_i, \beta'_i\}$ ($i = 1, \ldots, d$), respectively, then $I \cap J = [e, f[$ in case $e \le f$ and otherwise $I \cap J = \emptyset$. Consequently,
possible ways. More precisely, make such replacements for the i coming from each
non-empty subset of { 1, ... , d}. The points so created give rise to at most 3d - 1
$F = I_1 \cup (I_2 \setminus I_1) \cup (I_3 \setminus (I_1 \cup I_2)) \cup \ldots \cup (I_n \setminus (I_1 \cup \ldots \cup I_{n-1}))$
exhibits $F$ as a union of $n$ pairwise disjoint sets, each of the form $I \setminus (J_1 \cup \ldots \cup J_m)$
with $I, J_1, \ldots, J_m$ intervals from $\mathscr{I}^d$. Thus it suffices to show that every set of this
form is the union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. But this
follows from
$I \setminus (J_1 \cup \ldots \cup J_m) = \bigcap_{i=1}^{m} (I \setminus J_i)$
when, using what has already been proved, we write each $I \setminus J_i$ as a union of
finitely many pairwise disjoint intervals from $\mathscr{I}^d$ and distribute the intersection
through these unions. □
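The intersection step of the proof, taking coordinatewise maxima of the lower corners and minima of the upper corners, can be sketched as follows. Intervals are represented here as pairs of coordinate tuples, which is an assumption of this illustration rather than notation from the text.

```python
from itertools import product

def intersect(interval_i, interval_j):
    """Intersection of two right half-open intervals [a, b[ and [a2, b2[.

    Returns the pair (e, f) describing [e, f[ when e <= f coordinatewise,
    and None when the intersection is empty for lack of such a pair.
    """
    (a, b), (a2, b2) = interval_i, interval_j
    e = tuple(max(x, y) for x, y in zip(a, a2))
    f = tuple(min(x, y) for x, y in zip(b, b2))
    if all(x <= y for x, y in zip(e, f)):
        return (e, f)
    return None

def contains(interval, point):
    """Membership test for the right half-open interval [a, b[."""
    a, b = interval
    return all(lo <= x < hi for lo, x, hi in zip(a, point, b))

square_1 = ((0, 0), (2, 2))
square_2 = ((1, 1), (3, 3))
overlap = intersect(square_1, square_2)
far_apart = intersect(((0, 0), (1, 1)), ((2, 2), (3, 3)))
print(overlap, far_apart)

# Compare with pointwise membership on a small grid of integer points.
grid = list(product(range(-1, 4), repeat=2))
```

Note that a returned pair with $e \le f$ but some equal coordinates still describes the empty set, in line with the remark on intervals of content 0.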
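$F = \bigcup_{i=1}^{m} I_i$
and
$G = \bigcup_{j=1}^{n} I'_j$.
But then
$F \setminus G = \bigcup_{i=1}^{m} \bigcap_{j=1}^{n} (I_i \setminus I'_j)$
and so it only has to be shown that each set $\bigcap_{j=1}^{n} (I_i \setminus I'_j)$ is a figure. According to 4.1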
4.3 Theorem. There exists exactly one content $\lambda$ on $\mathscr{F}^d$ with the property that
$\lambda(I)$ coincides with the d-dimensional elementary content of $I$, for each $I \in \mathscr{I}^d$.
This content is real-valued.
$F = I_1 \cup \ldots \cup I_n = J_1 \cup \ldots \cup J_m$
are two representations of the figure $F \in \mathscr{F}^d$, each a union of pairwise disjoint
intervals, then
$\lambda(I_1) + \ldots + \lambda(I_n) = \lambda(J_1) + \ldots + \lambda(J_m)$
$\lambda(I_j) = \sum_{i=1}^{m} \lambda(I_j \cap J_i)$  ($j = 1, \ldots, n$).
($i = 1, \ldots, m$).
Together these last two equations entail the equality $\sum_j \lambda(I_j) = \sum_i \lambda(J_i)$.
(e) Thus for every $F \in \mathscr{F}^d$ the number $\sum_j \lambda(I_j)$ is independent of the special
representation
$F = I_1 \cup \ldots \cup I_n$
of $F$ as a union of finitely many pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$. Therefore the
decree
additive. Since $\emptyset \in \mathscr{I}^d$ and $\lambda(\emptyset) = 0$, a content with the sought-for properties is
at hand.
$\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$.
Each $F_n$ being a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$, it
should be clear that by a slight leftward shift of the right endpoints of each of
these intervals a new figure $G_n \in \mathscr{F}^d$ is created, whose topological closure $\overline{G_n}$ is
still a subset of $F_n$, and
$\lambda(F_n) - \lambda(G_n) < 2^{-n}\delta$.
and by choice of $G_1$, $\lambda(F_1) - \lambda(G_1) < 2^{-1}\delta$. Suppose the inequality valid for
some $n$. Since
From the induction hypothesis $\lambda(H_n) > \lambda(F_n) - (1 - 2^{-n})\delta$; from the choice
of $G_{n+1}$, $\lambda(G_{n+1}) > \lambda(F_{n+1}) - 2^{-n-1}\delta$ and $G_{n+1} \cup H_n \subset F_{n+1} \cup F_n = F_n$, so that
Combining these observations completes the inductive
step in the confirmation of (*):
Here we encounter for the first time the name of the French mathematician
H. LEBESGUE (1875-1941), the inventor of the measure and integration concepts that today are named after him. The development of the theory of measure and integration was spurred above all by his investigations and those of his
countryman E. BOREL (1871-1956). For the history of Lebesgue integration see
DIEUDONNÉ [1978] and HAWKINS [1970].
Exercises.
1. Show that on $\mathscr{F}^1$ there is exactly one content $\mu$ that assigns to the right half-open interval $[\alpha, \beta[$, $\alpha, \beta \in \mathbb{R}$, the following values
ifa<0<$
a "numerical measure" gets assigned also to more complicated subsets of $\mathbb{R}^d$. The
most satisfactory such result would say that $\lambda^d$ can be extended in exactly one
way to a measure on an appropriate σ-algebra $\mathscr{A}$ in $\mathbb{R}^d$ with $\mathscr{F}^d \subset \mathscr{A}$.
by E in Q.
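Proof. For each subset $Q \subset \Omega$ designate by $\mathscr{U}(Q)$ the set of all sequences $(A_n)_{n \in \mathbb{N}}$ in $\mathscr{R}$ with
$Q \subset \bigcup_{n \in \mathbb{N}} A_n$.
(5.1)  $\mu^*(Q) := \inf\Bigl\{\sum_{n=1}^{\infty} \mu(A_n) : (A_n) \in \mathscr{U}(Q)\Bigr\}$ if $\mathscr{U}(Q) \ne \emptyset$, and $\mu^*(Q) := +\infty$ if $\mathscr{U}(Q) = \emptyset$,
(5.2)  $\mu^*(\emptyset) = 0$;
(5.3)  $Q_1 \subset Q_2 \implies \mu^*(Q_1) \le \mu^*(Q_2)$;
(5.4)  $(Q_n)_{n \in \mathbb{N}} \subset \mathscr{P}(\Omega) \implies \mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$.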
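Equality (5.2) follows from the observation that the constant sequence $\emptyset, \emptyset, \ldots$ is
in $\mathscr{U}(\emptyset)$. The observation that $\mathscr{U}(Q_2) \subset \mathscr{U}(Q_1)$ follows from $Q_1 \subset Q_2$ serves to
confirm (5.3). For the proof of (5.4) it can evidently be assumed that $\mu^*(Q_n)$ is
finite and so in particular $\mathscr{U}(Q_n) \ne \emptyset$, for every $n \in \mathbb{N}$. For an arbitrary $\varepsilon > 0$
then, each $\mathscr{U}(Q_n)$ contains a sequence $(A_{nm})_{m \in \mathbb{N}}$ such that
$\sum_{m=1}^{\infty} \mu(A_{nm}) \le \mu^*(Q_n) + 2^{-n}\varepsilon$.
The double sequence $(A_{nm})_{n,m \in \mathbb{N}}$ lies in $\mathscr{U}\bigl(\bigcup_{n=1}^{\infty} Q_n\bigr)$ and as a consequence
$\mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n,m \in \mathbb{N}} \mu(A_{nm}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon$
and (5.4) follows from this and the arbitrariness of $\varepsilon > 0$. It is immediate from the
definition that
(5.5)  $\mu^* \ge 0$.
(5.6)  $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$
as well as
(5.7)  $\mu^*(A) = \mu(A)$.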
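In proving (5.6) we can again assume $\mu^*(Q) < +\infty$, so that $\mathscr{U}(Q) \ne \emptyset$. First of
all we have
$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A)$
for every sequence $(A_n)$ from $\mathscr{U}(Q)$, due to the finite additivity of $\mu$. Moreover,
the sequence $(A_n \cap A)$ lies in $\mathscr{U}(Q \cap A)$ and the sequence $(A_n \setminus A)$ lies in $\mathscr{U}(Q \setminus A)$.
Consequently,
$\sum_{n=1}^{\infty} \mu(A_n) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$
for every such sequence $(A_n)$, and from this fact (5.6) is immediate. Equality (5.7)
follows on the one hand from (3.10), according to which $\mu(A) \le \mu^*(A)$, and on
the other hand from consideration of the sequence $A, \emptyset, \emptyset, \ldots$ which lies in $\mathscr{U}(A)$.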
The significance of what has been proven lies in the fact, which we will establish,
that the system $\mathscr{A}^*$ of all sets $A \in \mathscr{P}(\Omega)$ satisfying (5.6) is a σ-algebra in $\Omega$ and the
restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure. Now (5.6) as just proved says that $\mathscr{R} \subset \mathscr{A}^*$,
and so we shall have $\sigma(\mathscr{R}) \subset \mathscr{A}^*$. Then according to (5.7) $\tilde{\mu} := \mu^* \mid \sigma(\mathscr{R})$ is an
extension of $\mu$ to a measure on $\sigma(\mathscr{R})$. The definition and theorem which follow
will therefore complete the present proof. □
5.2 Definition. A numerical function $\mu^*$ on the power set $\mathscr{P}(\Omega)$ having properties
(5.2)-(5.4) is called an outer measure on the set $\Omega$. A subset $A$ of $\Omega$ is called $\mu^*$-measurable if it satisfies (5.6).
Notice that $\mu^* \ge 0$ always prevails, an immediate consequence of (5.2) and (5.3)
together.
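For a premeasure on a ring in a small finite set, both the outer measure from (5.1) and the measurability condition can be evaluated by exhaustive search. The following is a sketch under that finiteness assumption: covering sequences are replaced by finite subfamilies of the ring, which yields the same infimum because repeating a set never lowers the sum. All names and the sample ring are hypothetical.

```python
from itertools import chain, combinations

def powerset(points):
    points = list(points)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(points, size) for size in range(len(points) + 1))]

def make_outer_measure(ring_measure):
    """Outer measure induced by a premeasure given as {ring set: value}."""
    ring_sets = list(ring_measure)
    def mu_star(q):
        q = frozenset(q)
        best = float("inf")   # value when no cover exists
        for size in range(len(ring_sets) + 1):
            for cover in combinations(ring_sets, size):
                if q <= frozenset().union(*cover):
                    best = min(best, sum(ring_measure[a] for a in cover))
        return best
    return mu_star

def measurable_sets(omega, mu_star):
    """All A with mu*(Q) = mu*(Q & A) + mu*(Q - A) for every subset Q."""
    subsets = powerset(omega)
    return {a for a in subsets
            if all(mu_star(q) == mu_star(q & a) + mu_star(q - a)
                   for q in subsets)}

omega = frozenset({1, 2, 3})
ring_measure = {frozenset(): 0, frozenset({1}): 1,
                frozenset({2, 3}): 2, omega: 3}
mu_star = make_outer_measure(ring_measure)
a_star = measurable_sets(omega, mu_star)
print(mu_star({2}), sorted(sorted(a) for a in a_star))
```

In this example the measurable sets are exactly the sets of the ring, and the outer measure agrees with the premeasure on them; the singleton `{2}` is not measurable because it splits `{2, 3}` into two pieces of outer measure 2 each.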
The idea in the proof of the measure-extension theorem, which goes back to
C. CARATHÉODORY (1873-1950), consists in associating via definition (5.1) an
outer measure to the premeasure $\mu$ on $\mathscr{R}$ and then invoking the following theorem.
5.3 Theorem (Carathéodory). Let $\mu^*$ be an outer measure on a set $\Omega$. Then the
system $\mathscr{A}^*$ of all $\mu^*$-measurable sets $A \subset \Omega$ is a σ-algebra in $\Omega$. Moreover, the
restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure.
Proof. First let us note that the requirement (5.6) for a subset $A$ of $\Omega$ to lie in $\mathscr{A}^*$
is equivalent to
(5.6')  $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \setminus A)$  for all $Q \in \mathscr{P}(\Omega)$,
because from (5.4) applied to the sequence $Q \cap A, Q \setminus A, \emptyset, \emptyset, \ldots$ follows the reverse
of inequality (5.6), for every $Q \in \mathscr{P}(\Omega)$. From either (5.6) or (5.6') it is immediate
that $\Omega \in \mathscr{A}^*$, and because of their symmetry in $A$ and $\complement A$, whenever $A$ lies in $\mathscr{A}^*$,
so does $\complement A$. The following considerations will show that with each two of its sets $A$
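$$\mu^*\Bigl(Q \cap \bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu^*(Q \cap A_i)$$
$B_n := \bigcup_{i=1}^{n} A_i$
to be in $\mathscr{A}^*$, and that $Q \setminus B_n \supset Q \setminus A$, so that $\mu^*(Q \setminus B_n) \ge \mu^*(Q \setminus A)$, we obtain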
and consequently, as noted at the beginning of the proof, we actually have equality
throughout:
holding for all $Q \in \mathscr{P}(\Omega)$. Thus $A$ lies in $\mathscr{A}^*$. After all this we recognize that the
algebra $\mathscr{A}^*$ is an $\cap$-stable Dynkin system and therefore by Theorem 2.3 a $\sigma$-algebra. If in the last pair of equalities we take $Q := A$, we get
$$\mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n),$$
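5.4 Theorem (Uniqueness theorem). Let $\mathscr{E}$ be an $\cap$-stable generator of a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ and suppose that $(E_n)$ is a sequence in $\mathscr{E}$ with $\bigcup_{n \in \mathbb{N}} E_n = \Omega$. Then measures $\mu_1$ and $\mu_2$ on $\mathscr{A}$ which satisfy
(i) $\mu_1(E) = \mu_2(E)$ for all $E \in \mathscr{E}$
and
(ii) $\mu_1(E_n) = \mu_2(E_n) < +\infty$ for all $n \in \mathbb{N}$
are identical.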
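Proof. Denote by $\mathscr{E}_f$ the system of all sets $E \in \mathscr{E}$ satisfying $\mu_1(E) = \mu_2(E) < +\infty$.
For a given $E \in \mathscr{E}_f$ consider the system
$$\mathscr{D}_E := \{D \in \mathscr{A} : \mu_1(E \cap D) = \mu_2(E \cap D)\}.$$
We will show that it is a Dynkin system. Obviously $\Omega \in \mathscr{D}_E$. If $D \in \mathscr{D}_E$, then
$\mu_1(E \cap D) = \mu_2(E \cap D) < +\infty$ (since $E \in \mathscr{E}_f$), and so (3.7) shows that
at once from the $\sigma$-additivity of the measures $\mu_1$, $\mu_2$. Because $\mathscr{E}$ is $\cap$-stable,
$\mathscr{E} \subset \mathscr{D}_E$ follows from (i) and the definition of $\mathscr{D}_E$. But then $\delta(\mathscr{E}) \subset \mathscr{D}_E$ because
$\delta(\mathscr{E})$ is the smallest Dynkin system which contains $\mathscr{E}$. From Theorem 2.4 however,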
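$$\mu_1(E_n \cap A) = \mu_2(E_n \cap A) \qquad (n \in \mathbb{N},\ A \in \mathscr{A}). \tag{5.9'}$$
$$\mu_1(F_n \cap A) = \mu_1(E_n \cap F_n \cap A) = \mu_2(E_n \cap F_n \cap A) = \mu_2(F_n \cap A)$$
for all $A \in \mathscr{A}$ and all $n \in \mathbb{N}$. But then the fact that
$$A = \bigcup_{n \in \mathbb{N}} (F_n \cap A)$$
$$\mu_1(A) = \sum_{n=1}^{\infty} \mu_1(F_n \cap A) = \sum_{n=1}^{\infty} \mu_2(F_n \cap A) = \mu_2(A)$$
for every $A \in \mathscr{A}$, which says that the measures $\mu_1$, $\mu_2$ are identical. $\Box$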
For finite measures some other natural stability properties of the generator $\mathscr{E}$
(e.g., its closure under set-differences) also insure uniqueness. See, for example,
ROBERTSON [1967].
ery nEN.
nEN
Examples. 1. Suppose that the content $\mu$ on the ring $\mathscr{R}$ in $\Omega$ is finite, that is,
$\mu(A) < +\infty$ for every $A \in \mathscr{R}$. The $\sigma$-finiteness of $\mu$ is then equivalent to the existence
In summary we have
5.6 Theorem. Every $\sigma$-finite premeasure $\mu$ on a ring $\mathscr{R}$ in a set $\Omega$ can be extended
in exactly one way to a measure $\tilde{\mu}$ on $\sigma(\mathscr{R})$.
Proof. Only the uniqueness of $\tilde{\mu}$ has to be proved. But this follows immediately
from 5.4: thanks to the $\sigma$-finiteness of $\mu$, the ring $\mathscr{R}$ has all the properties required
of the generator $\mathscr{E}$ in the hypothesis of 5.4.
Remark. The hypothesis of $\sigma$-finiteness of $\mu$ in 5.6 cannot be dispensed with. It
suffices to look, as in Example 1, at a non-empty set $\Omega$ and to take for $\mathscr{R}$ the ring
consisting just of the empty set. On $\sigma(\mathscr{R}) = \{\emptyset, \Omega\}$ two different measures having
the same restriction to $\mathscr{R}$ are defined by $\mu(\emptyset) = \nu(\emptyset) := 0$ and $\mu(\Omega) := 0 =: 1 - \nu(\Omega)$.
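The counterexample in this Remark is small enough to spell out mechanically. The following is a minimal sketch, assuming a one-point set stands in for the non-empty set (the names `OMEGA`, `mu`, `nu` are illustrative only): two measures on the two-element σ-algebra agree on the ring consisting of the empty set yet differ on the whole space, and the ring cannot cover the whole space, which is where σ-finiteness fails.

```python
# Toy model: any non-empty Omega would do; a one-point set is assumed here.
OMEGA = frozenset({"w"})
EMPTY = frozenset()

ring = [EMPTY]                    # the ring consisting only of the empty set
sigma_algebra = [EMPTY, OMEGA]    # the sigma-algebra it generates

mu = {EMPTY: 0, OMEGA: 0}         # mu(Omega) := 0
nu = {EMPTY: 0, OMEGA: 1}         # nu(Omega) := 1

# Both measures have the same restriction to the ring ...
agree_on_ring = all(mu[A] == nu[A] for A in ring)
# ... but they are different measures on the generated sigma-algebra.
agree_everywhere = all(mu[A] == nu[A] for A in sigma_algebra)

# sigma-finiteness would need Omega to be a countable union of ring sets;
# the only union available from this ring is the empty set.
union_of_ring = frozenset().union(*ring)
covers_omega = (union_of_ring == OMEGA)
```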
The uniqueness of the measure $\tilde{\mu}$ which extends the $\sigma$-finite premeasure $\mu$ in
5.6 is expressed more dramatically by the following approximation property. For
simplicity we formulate it only for finite measures on an algebra.
5.7 Theorem (Approximation property). Let $\mu$ be a finite measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ which is generated by an algebra $\mathscr{A}_0$ in $\Omega$. Then for each $A \in \mathscr{A}$ there
is a sequence $(C_n)_{n \in \mathbb{N}}$ in $\mathscr{A}_0$ satisfying
$$\lim_{n \to \infty} \mu(A \,\triangle\, C_n) = 0. \tag{5.10}$$
Here $\triangle$ designates the symmetric difference defined in Exercise 2 of §1. Exercise 7 of §3 is the real justification for the terminology "approximation property".
(5.11}
11=1
If we set Cn
U An satisfies
nEN
$C_n \uparrow A'$
and
$A' \setminus C_n \downarrow \emptyset$.
<p(A'\C)+p(An)-p(A)
00
n=1
E/2 + e/2,
(5.14)
(n E N)
make this obvious: For $C, D \in \mathscr{A}$, $C \subset D \cup (C \setminus D)$, so that $\mu(C) - \mu(D) \le
\mu(C \setminus D) \le \mu(C \,\triangle\, D)$. As $C$ and $D$ may be interchanged here, (5.14) is confirmed.
$\Box$
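The estimate just derived, that the difference of the measures of two sets is bounded by the measure of their symmetric difference, can be checked exhaustively on a small example. This is a sketch under stated assumptions: the counting measure on a five-point set is used as a stand-in for a finite measure, and the names `mu` and `holds` are illustrative.

```python
from itertools import combinations

def mu(s):
    """Counting measure on a finite set (a stand-in for a finite measure)."""
    return len(s)

omega = range(5)
# All 2^5 subsets of omega, as frozensets.
subsets = [frozenset(c) for r in range(6) for c in combinations(omega, r)]

def holds(C, D):
    # |mu(C) - mu(D)| <= mu(C symmetric-difference D)
    return abs(mu(C) - mu(D)) <= mu(C ^ D)

all_hold = all(holds(C, D) for C in subsets for D in subsets)
```

The check covers both orders of each pair, which mirrors the remark that the two sets may be interchanged.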
Exercises.
1. Let $\mu = \varepsilon_\omega$ be the premeasure on a ring $\mathscr{R}$ in $\Omega$ defined by putting unit mass at
the point $\omega \in \Omega$. Under the hypothesis that $\{\omega\}$ can be realized as the intersection
of a sequence from $\mathscr{R}$ and $\Omega$ as the union of such a sequence, prove that: (a) The
outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set $A \in \mathscr{P}(\Omega)$ the
value 1 or 0, according as $\omega \in A$ or $\omega \in \complement A$. (b) Every subset of $\Omega$ is $\mu^*$-measurable.
(c) $\mu^*$ is the measure $\varepsilon_\omega$ on $\mathscr{P}(\Omega)$.
that: (a) The outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set
$A \in \mathscr{P}(\Omega)$ the value 0 or 1, according as $A$ is countable or not. (b) $\mu^*$ is not
a measure on $\mathscr{P}(\Omega)$, not even a content. (c) The only $\mu^*$-measurable sets are
those in the $\sigma$-algebra $\mathscr{A}$ on which $\mu$ is defined.
that:
(a) The measure $\mu^* \mid \mathscr{A}^*$ from Theorem 5.3 is complete.
(b) The measure in Examples 2 and 7 of §3 is complete.
(c) If $\mathscr{A}$ is a $\sigma$-algebra in a set $\Omega$, $\omega \in \Omega$ and $\{\omega\} \in \mathscr{A}$, then the Dirac measure $\varepsilon_\omega$
on $\mathscr{A}$ is complete just when $\mathscr{A} = \mathscr{P}(\Omega)$.
7. (a) Show that every measure $\mu$ on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ can be completed.
(d) Characterize the sets in $\mathscr{A}_0$ as follows: A set $A_0 \subset \Omega$ lies in $\mathscr{A}_0$ just if sets
$A_1, A_2 \in \mathscr{A}$ exist such that $A_1 \subset A_0 \subset A_2$ and $\mu(A_2 \setminus A_1) = 0$.
9. The proof of Theorem 5.4 only uses condition (i) for sets $E \in \mathscr{E}$ which satisfy $\mu_1(E) = \mu_2(E) < +\infty$. Clarify this observation by showing that under the
hypotheses of Theorem 5.4 the system $\mathscr{E}_f$ of all sets $E \in \mathscr{E}$ satisfying $\mu_1(E) =
\mu_2(E) < +\infty$ is likewise an $\cap$-stable generator of $\mathscr{A}$.
on $\sigma(\mathscr{F}^d)$, which measure will also be denoted by $\lambda^d$ from now on. Since every
figure is a union of finitely many intervals $I \in \mathscr{I}^d$, we have
$$\sigma(\mathscr{F}^d) = \sigma(\mathscr{I}^d).$$
6.1 Definition. The elements of the $\sigma$-algebra generated in $\mathbb{R}^d$ by the system $\mathscr{I}^d$
of half-open intervals are called the Borel subsets of the space $\mathbb{R}^d$. Correspondingly
$\sigma(\mathscr{I}^d)$ is called the $\sigma$-algebra of Borel subsets of $\mathbb{R}^d$; it will henceforth be
denoted $\mathscr{B}^d$.
The results reviewed in the introduction can, following 4.3, be expressed thus:
6.2 Theorem. There is exactly one measure $\lambda^d$ on $\mathscr{B}^d$ which assigns to every
right half-open interval in $\mathbb{R}^d$ its d-dimensional elementary content.
It is expedient to expand this definition: For every set $C \in \mathscr{B}^d$ the trace $\sigma$-algebra $C \cap \mathscr{B}^d$ consists of all Borel subsets of $C$ (cf. (1.4)). The restriction $\lambda^d_C$
of $\lambda^d$ to $C \cap \mathscr{B}^d$ is a measure. It will also be called the L-B measure on $C$.
Like the Lebesgue premeasure of which it is an extension, the L-B measure $\lambda^d$
is $\sigma$-finite (cf. Example 2 of §5). More generally
$$\lambda^d(B) < +\infty \tag{6.2}$$
for every bounded set $B \in \mathscr{B}^d$, since such a $B$ lies in an interval in $\mathscr{I}^d$; e.g.,
excepting finitely many $n$, $B$ lies in each interval $I_n$ from Example 2, §5, with the
result that $\lambda^d(B) \le \lambda^d(I_n) < +\infty$.
Let us recall the question formulated in the introduction to Chapter I of finding
a unified method for assigning a numerical measure of d-dimensional volume to
as many subsets of Rd as possible. Step by step we will come to recognize that
Theorem 6.2 answers this question in a most satisfactory way: for every Borel
set $B$ in $\mathbb{R}^d$ its d-dimensional measure $\lambda^d(B)$ is the number we were seeking.
First of all it seems desirable to get a deeper insight into the $\sigma$-algebra $\mathscr{B}^d$ of
Borel sets. In particular, the question naturally comes up whether topologically
interesting sets, like the open, closed, or compact ones, are Borel. The characterization of $\mathscr{B}^d$ via such sets in the next theorem is often taken as the definition of
the $\sigma$-algebra $\mathscr{B}^d$.
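6.4 Theorem. Let $\mathscr{O}^d$, $\mathscr{C}^d$, $\mathscr{K}^d$ denote the systems of open, closed and compact subsets of $\mathbb{R}^d$, respectively. Then
$$\mathscr{B}^d = \sigma(\mathscr{O}^d) = \sigma(\mathscr{C}^d) = \sigma(\mathscr{K}^d). \tag{6.3}$$
Proof. $\mathscr{K}^d \subset \mathscr{C}^d \subset \sigma(\mathscr{C}^d)$, so $\sigma(\mathscr{K}^d) \subset \sigma(\mathscr{C}^d)$. Every set $C \in \mathscr{C}^d$ is the union of
a sequence of sets $C_n \in \mathscr{K}^d$; for example, if $K_n$ are the compact balls with a fixed
center and radii $n \in \mathbb{N}$, then the sets $C_n := C \cap K_n$ furnish such a sequence. Thus
by (1.3), $\mathscr{C}^d \subset \sigma(\mathscr{K}^d)$, whence $\sigma(\mathscr{C}^d) \subset \sigma(\mathscr{K}^d)$ and so finally the equality of
these two $\sigma$-algebras. Since the open sets are the complements of the closed ones,
the equality $\sigma(\mathscr{O}^d) = \sigma(\mathscr{C}^d)$ is obvious; therewith the last two equalities in (6.3)
are confirmed.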
We finish up by showing that $\sigma(\mathscr{O}^d) = \mathscr{B}^d$. We will, as usual, use the term
bounded open interval in $\mathbb{R}^d$ for every set of the form
$$]a, b[ := \{x \in \mathbb{R}^d : a \lhd x \lhd b\}, \tag{6.4}$$
and
(n E AI)
we have
Jan, b[ .l. ]a, b[ .
[an, b[ T ]a, b[
if we set
(6.5')
(n E N),
$$\mathscr{B}^d \ne \mathscr{P}(\mathbb{R}^d)$$
will be proved. For the moment we content ourselves with computing the Lebesgue
measure $\lambda^d(B)$ of some geometrically simple Borel sets $B$.
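$$H := \{x = (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d : \xi_i = \alpha\}$$
$$H \subset \bigcup_{n \in \mathbb{N}} [x_n, y_n[$$
and
$$\lambda^d([x_n, y_n[) = 2^{-n} \varepsilon, \qquad n \in \mathbb{N}.$$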
$$[a, b] := \{x \in \mathbb{R}^d : a \le x \le b\}$$
and, in contrast to $[a, b[$, the left half-open interval
$$]a, b] := \{x \in \mathbb{R}^d : a \lhd x \le b\}.$$
Then
(6.7)
First of all the intervals $[a, b[$, $]a, b[$ and $[a, b]$ are Borel sets by Theorem 6.4. As in
its proof, we can show that
$$]a, b_n[ \,\downarrow\, ]a, b] \quad \text{and} \quad [a, b_n[ \,\downarrow\, [a, b] \tag{6.8}$$
for appropriate sequences $(b_n)$ in $\mathbb{R}^d$ converging to $b$. Again from 6.4 we then get
that $]a, b]$ is Borel. From (6.5) follows
$$\lambda^d(]a, b[) = \lim_{n \to \infty} \lambda^d([a_n, b[) = \lambda^d([a, b[),$$
the first equality using the continuity from below of a measure, and the second
using $\lim_{n \to \infty} a_n = a$ (from (6.5')) and the continuous dependence on $c$ and $d$ of the
elementary content of the interval $[c, d[$. Analogously, with the help of (6.8), we
conclude that
$$\lambda^d(]a, b]) = \lambda^d([a, b]),$$
this time citing the continuity of measures from above. Thus finally from the
inclusions $]a, b[ \subset ]a, b] \subset [a, b]$ the remaining equality in (6.7) follows.
The choice of right half-open intervals for the construction of $\lambda^d$ is now seen
to have been due solely to the fact that the ring $\mathscr{F}^d$ they generate is so simple to
describe.
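The limit argument above rests on the elementary content of a half-open interval depending continuously on its corners. A minimal numerical sketch, assuming a concrete rectangle in the plane and the hypothetical choice of lower corners shifted by 1/n in every coordinate, shows the contents increasing to the content of the full interval:

```python
from fractions import Fraction
from math import prod

def content(a, b):
    """Elementary content of [a, b[: product of edge lengths (0 if empty)."""
    return prod(max(bi - ai, 0) for ai, bi in zip(a, b))

# Hypothetical example interval in the plane.
a = (Fraction(0), Fraction(0))
b = (Fraction(2), Fraction(3))

def a_n(n):
    # Lower corners that decrease to a, strictly larger than a in each coordinate,
    # so that the intervals [a_n, b[ increase to the open interval ]a, b[.
    return tuple(ai + Fraction(1, n) for ai in a)

values = [content(a_n(n), b) for n in (1, 10, 100, 1000)]
limit = content(a, b)
```

Exact rational arithmetic is used so that the monotone increase of `values` towards `limit` is not blurred by rounding.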
or, equivalently, if $\mu(B) < +\infty$ for every bounded set $B \in \mathscr{B}^d$. L-B measure $\lambda^d$ is
such a measure, according to (6.2).
The point of departure for defining $\lambda^1$ is the determination of $\lambda^1([a, b[)$ for
intervals $[a, b[ \in \mathscr{I}^1$, namely as $b - a$. It suggests itself that this opening move
(6.9)
Thanks to the uniqueness theorem 5.4 such a measure is already thereby, i.e.,
by its values on $\mathscr{I}^1$, uniquely specified. Since $\mu([a, b[) \ge 0$, (6.9) entails that the
function $F$ must be isotone. Moreover, $F$ has to be left-continuous. This is because
for every $x \in \mathbb{R}$ and every sequence $(x_n)$ in $\mathbb{R}$ with $x_n \uparrow x$, the corresponding
interval behavior is
$[x_1, x_n[ \,\uparrow\, [x_1, x[$, and since $\mu$ must be continuous from below,
it follows that
$$\lim_{n \to \infty} F(x_n) = F(x).$$
Proof. The techniques employed in the proof of Theorem 4.3 can be repeated
to show that corresponding to $F$ there is a unique content $\mu$ on the ring $\mathscr{F}^1$
of 1-dimensional figures which has property (6.9). That part of the proof used
only the isotoneity of $F$. From the left-continuity of $F$ it follows that for every
But then the technique employed in the proof of Theorem 4.4 shows that $\mu$ is
a $\sigma$-finite (as well as finite) premeasure on $\mathscr{F}^1$.
$F(x) := \mu([0, x[)$ if $x \ge 0$
and (6.9) is confirmed analogously when $a < b \le 0$. In the remaining case $a < 0 < b$
we get (6.9) from $[a, b[ = [a, 0[ \cup [0, b[$ and the additivity of $\mu$. The uniqueness
already proved leads finally to the equality of $\mu$ with the measure $\mu_F$ derived
from $F$.
Notice that L-B measure $\lambda^1$ has the form $\mu_F$, with $F$ the identity map $x \mapsto x$
on $\mathbb{R}$.
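To make the correspondence between a measure-generating function and the values of its measure on half-open intervals concrete, here is a minimal sketch. It assumes the interval formula "measure of [a, b[ equals F(b) minus F(a)", which is how the text uses (6.9) later on; the particular function `F_jump` is a hypothetical example, not taken from the book.

```python
from fractions import Fraction

def F_identity(x):
    """The identity map; it generates one-dimensional L-B measure."""
    return Fraction(x)

def F_jump(x):
    """Hypothetical isotone, left-continuous F: identity plus a unit jump
    taking effect immediately to the right of 0 (so F_jump(0) == 0)."""
    x = Fraction(x)
    return x + (1 if x > 0 else 0)

def mu(F, a, b):
    """Value assigned to the half-open interval [a, b[ (0 if it is empty)."""
    return F(b) - F(a) if a < b else Fraction(0)

# Additivity across the split [a, b[ = [a, 0[ U [0, b[ used in the proof.
whole = mu(F_jump, -1, 1)
parts = mu(F_jump, -1, 0) + mu(F_jump, 0, 1)

# The jump shows up as a point mass at 0: [0, eps[ keeps measure close to 1.
near_zero = mu(F_jump, 0, Fraction(1, 1000))
```

With `F_identity` the same function `mu` returns interval lengths, matching the closing remark that L-B measure corresponds to the identity map.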
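measure $\mu$ on $\mathbb{R}$ is either the zero measure $\mu = 0$, or $0 < \mu(\mathbb{R}) < +\infty$ and
$\nu :=$
for all $A \in \mathscr{B}^1$.
Consider now a p-measure $\mu$ on $\mathscr{B}^1$. The open interval $]-\infty, x[$ lies in $\mathscr{B}^1$ for each
$x \in \mathbb{R}$, so a real function $F_\mu$ with values in $[0, 1]$ is defined by
$$F_\mu(x) := \mu(]-\infty, x[) \qquad (x \in \mathbb{R}). \tag{6.11}$$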
$$\mu_{F_\mu} = \mu$$
in the notation introduced in Theorem 6.5. Among the infinitely many measure-generating functions $F$ that satisfy $\mu_F = \mu$ for a given p-measure $\mu$ the distribution
function $F_\mu$ is characterized as follows:
6.6 Theorem. A real function $F$ on $\mathbb{R}$ is the distribution function of a (necessarily uniquely determined) p-measure $\mu$ on $\mathscr{B}^1$ if and only if it is measure-generating
(that is, isotone and left-continuous) and satisfies
$$\lim_{x \to -\infty} F(x) = 0 \quad \text{and} \quad \lim_{x \to +\infty} F(x) = 1. \tag{6.13}$$
Proof. The distribution function $F_\mu$ of a p-measure $\mu$ on $\mathscr{B}^1$ is always measure-generating, as (6.12) shows. Properties (6.13) follow from the continuity at $\emptyset$
and the continuity from below of every finite measure, respectively, since for sequences $(x_n)$ in $\mathbb{R}$ with $x_n \downarrow -\infty$, resp., $x_n \uparrow +\infty$ we have $]-\infty, x_n[ \,\downarrow\, \emptyset$, resp.,
$]-\infty, x_n[ \,\uparrow\, \mathbb{R}$.
If conversely $F$ is a measure-generating function satisfying (6.13), then according to 6.5 $\mu_F$ is the only Borel measure on $\mathbb{R}$ with property (6.9), in particular, with
$\mu_F([-n, n[) = F(n) - F(-n)$ for all $n \in \mathbb{N}$. When $n \to +\infty$ here, the normalization
condition $\mu_F(\mathbb{R}) = 1$ follows from (6.13). Thus $\mu_F$ is a probability measure. $F$ is
then the distribution function of $\mu_F$, because for $x \in \mathbb{R}$ and all $n \in \mathbb{N} \cap [-x, +\infty[$
tions". This is because, even before the invention of the measure concept,
T.J. STIELTJES (1856-1894) had used such functions to extend the ideas behind
the Riemann integral (cf. Remark 2 in §12).
2. Measure-generating functions (and distribution functions) also make sense
in $\mathbb{R}^d$. But they are difficult to deal with and that is not the least reason why they
are of less significance. A function $F : \mathbb{R}^d \to \mathbb{R}$ is called measure-generating if in
each of its $d$ variables $\xi_1, \ldots, \xi_d$, when the others are held fixed, it is left-continuous
and satisfies the additional condition
$$\Delta_{\alpha_1}^{\beta_1} \cdots \Delta_{\alpha_d}^{\beta_d} F \ge 0$$
Here $\alpha_k, \beta_k$ ($k = 1, \ldots, d$) are the coordinates of $a$, $b$, resp., and $\Delta_{\alpha_1}^{\beta_1} F$ is the
function defined on $\mathbb{R}^{d-1}$ via $(\xi_2, \ldots, \xi_d) \mapsto F_2(\xi_2, \ldots, \xi_d) := F(\beta_1, \xi_2, \ldots, \xi_d) - F(\alpha_1, \xi_2, \ldots, \xi_d)$. Then $\Delta_{\alpha_2}^{\beta_2} F_2 = \Delta_{\alpha_2}^{\beta_2} \Delta_{\alpha_1}^{\beta_1} F$ is defined and the further "difference
operators" $\Delta_{\alpha_k}^{\beta_k}$ are inductively brought into play. There is a theorem analogous
to 6.5: To every measure-generating function $F$ on $\mathbb{R}^d$ corresponds a unique Borel
measure $\mu_F$ on $\mathbb{R}^d$ which satisfies the iterated difference condition
(6.14)
$(\beta_d - \alpha_d)$
This function is consequently measure-generating, and generates the L-B measure $\lambda^d$ in the sense that $\mu_{F_0} = \lambda^d$. Details can be found in RICHTER [1966],
TUCKER [1967] and GNEDENKO [1988].
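The inductively defined difference operators amount to an alternating sum of the values of the function over the corners of the interval. The sketch below illustrates this under an explicit assumption: the function `F0` is taken to be the product of the coordinates, a natural candidate for a function generating L-B measure, but its definition here is hypothetical and not quoted from the text.

```python
from math import prod

def F0(x):
    """Hypothetical measure-generating function: product of the coordinates."""
    return prod(x)

def iterated_difference(F, a, b):
    """Apply all d difference operators to F for the corners a, b.

    Unfolding the induction gives a sum over the 2^d corners of the box,
    each corner weighted by (-1)^(number of coordinates taken from a).
    """
    d = len(a)
    total = 0
    for mask in range(2 ** d):
        corner = [b[k] if (mask >> k) & 1 else a[k] for k in range(d)]
        taken_from_a = d - bin(mask).count("1")
        total += (-1) ** taken_from_a * F(corner)
    return total

a = (1, 2, 3)
b = (4, 6, 5)
value = iterated_difference(F0, a, b)
volume = prod(bk - ak for ak, bk in zip(a, b))
```

For this choice the iterated difference reproduces the elementary content of the interval, which is consistent with the statement that such a function generates L-B measure.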
Exercises.
1. Prove that a Borel set $B \in \mathscr{B}^d$ is an L-B-null set if and only if one of the
two following conditions (which are hence equivalent) is satisfied: (a) For every
$\varepsilon > 0$ there is a covering of $B$ by countably many open intervals $I_n \subset \mathbb{R}^d$ such
that $\sum_{n=1}^{\infty} \lambda^d(I_n) < \varepsilon$. (b) There is a covering of $B$ by countably many open intervals $I_n$
such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < +\infty$ and every point of $B$ lies in $I_n$ for infinitely
P.
supplemental condition that for every bounded set $B \in \mathscr{B}^d$, $\mu_n(B) \ne 0$ for only
finitely many $n \in \mathbb{N}$ can be imposed if and only if $\mu$ is a Borel measure.
The measurable space $(\mathbb{R}^d, \mathscr{B}^d)$ will henceforth be called the d-dimensional
Borel measurable space. The measure space $(\mathbb{R}^d, \mathscr{B}^d, \lambda^d)$ will correspondingly be
called the d-dimensional Lebesgue-Borel measure space (abbreviated to L-B measure space).
The concept measurable space exhibits a formal analogy to that of topological
space. For a topological space is also a pair, consisting of a set and a system of its
subsets, namely, the open ones. In the sense of this analogy the next concept, that
of a measurable mapping, corresponds to the concept of continuity in topology.
$$T^{-1}(A') \in \mathscr{A}$$
and speak of a measurable mapping of the first measurable space into the second.
Using the notation introduced in (1.5), (7.1) can be written as
$$T^{-1}(\mathscr{A}') \subset \mathscr{A}. \tag{7.1'}$$
briefly put, Borel measurable. According to 6.4 the system $\mathscr{O}'$ of all open subsets
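7.2 Theorem. Let $(\Omega, \mathscr{A})$ and $(\Omega', \mathscr{A}')$ be measurable spaces; further, let $\mathscr{E}'$ be
a generator of $\mathscr{A}'$. A mapping $T : \Omega \to \Omega'$ is measurable just if
$$T^{-1}(E') \in \mathscr{A} \quad \text{for every } E' \in \mathscr{E}'. \tag{7.2}$$
Proof. The system $\mathscr{Q}'$ of all sets $Q' \in \mathscr{P}(\Omega')$ for which $T^{-1}(Q') \in \mathscr{A}$ is a $\sigma$-algebra
in $\Omega'$. Consequently, $\mathscr{A}' \subset \mathscr{Q}'$ holds just if $\mathscr{E}' \subset \mathscr{Q}'$ does. $\mathscr{A}' \subset \mathscr{Q}'$ is equivalent
to the measurability of $T$, while $\mathscr{E}' \subset \mathscr{Q}'$ is equivalent to (7.2).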
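On finite sets the criterion of Theorem 7.2 can be tested by brute force. The sketch below is illustrative only: the sets, the generator and the two mappings `T_good` and `T_bad` are hypothetical, and the helper `generated_sigma_algebra` relies on the fact that on a finite set closure under complements and pairwise unions already yields a σ-algebra.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generator):
    """Smallest system containing the generator that is closed under
    complement and (finite) union; enough for a sigma-algebra on a finite set."""
    omega = frozenset(omega)
    system = {frozenset(), omega} | {frozenset(g) for g in generator}
    changed = True
    while changed:
        changed = False
        for A in list(system):
            if omega - A not in system:
                system.add(omega - A)
                changed = True
        for A, B in combinations(list(system), 2):
            if A | B not in system:
                system.add(A | B)
                changed = True
    return system

def preimage(T, omega, target):
    """Preimage of the set `target` under the mapping T (given as a dict)."""
    return frozenset(w for w in omega if T[w] in target)

OMEGA = {1, 2, 3, 4}
A_SYS = generated_sigma_algebra(OMEGA, [{1, 2}])      # 4 sets
OMEGA2 = {"a", "b", "c"}
GEN2 = [{"a"}, {"b"}]                                 # generator on the target
A2_SYS = generated_sigma_algebra(OMEGA2, GEN2)        # all 8 subsets

def measurable_on_generator(T):
    return all(preimage(T, OMEGA, E) in A_SYS for E in GEN2)

def measurable(T):
    return all(preimage(T, OMEGA, A2) in A_SYS for A2 in A2_SYS)

T_good = {1: "a", 2: "a", 3: "b", 4: "b"}
T_bad = {1: "a", 2: "a", 3: "b", 4: "c"}
```

For both mappings the test on the two generating sets gives the same verdict as the test on all eight sets of the target σ-algebra, which is the content of the theorem in this toy setting.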
Concerning the composition of measurable mappings, what the earlier analogy
with topology suggests, prevails:
7.3 Theorem. If $T_1 : (\Omega_1, \mathscr{A}_1) \to (\Omega_2, \mathscr{A}_2)$ and $T_2 : (\Omega_2, \mathscr{A}_2) \to (\Omega_3, \mathscr{A}_3)$ are
measurable mappings, then the composite mapping $T_2 \circ T_1$ is $\mathscr{A}_1$-$\mathscr{A}_3$-measurable.
Proof. The claim follows from the validity of the equation $(T_2 \circ T_1)^{-1}(A) =
T_1^{-1}(T_2^{-1}(A))$ for all $A \in \mathscr{P}(\Omega_3)$, in particular, from its validity for all $A \in \mathscr{A}_3$.
Next consider a family of measurable spaces $((\Omega_i, \mathscr{A}_i))_{i \in I}$ and a family $(T_i)_{i \in I}$ of
mappings $T_i : \Omega \to \Omega_i$ of some fixed set $\Omega$ into the individual sets $\Omega_i$. Obviously the
and call it the $\sigma$-algebra generated by the mappings $T_i$ (and the measurable spaces
$(\Omega_i, \mathscr{A}_i)$). In the case of the finite index set $I = \{1, \ldots, n\}$, we also use the notation
a(T,)C0.
Cf. (7.1').
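$$\mathscr{E} := \bigcup_{i \in I} T_i^{-1}(\mathscr{A}_i)$$
is a generator of $\sigma(T_i : i \in I)$. Each set $E \in \mathscr{E}$ has the form $E = T_i^{-1}(A_i)$ for some
$i \in I$, $A_i \in \mathscr{A}_i$. Thus $S^{-1}(E) = (T_i \circ S)^{-1}(A_i) \in \mathscr{A}_0$ because of the hypothesized
measurability of $T_i \circ S$. From 7.2 therefore, $S$ is $\mathscr{A}_0$-$\sigma(T_i : i \in I)$-measurable.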
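7.5 Theorem. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping. Then for
every measure $\mu$ on $\mathscr{A}$,
$$A' \mapsto \mu(T^{-1}(A')) \tag{7.5}$$
defines a measure $\mu'$ on $\mathscr{A}'$.
Proof. We only have to observe that for every sequence $(A'_n)_{n \in \mathbb{N}}$ of pairwise disjoint
sets from $\mathscr{A}'$, $(T^{-1}(A'_n))_{n \in \mathbb{N}}$ is a sequence of pairwise disjoint sets from $\mathscr{A}$, and
that
$$T^{-1}\Bigl(\bigcup_{n \in \mathbb{N}} A'_n\Bigr) = \bigcup_{n \in \mathbb{N}} T^{-1}(A'_n).$$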
7.6 Definition. In the situation described in 7.5, the measure $\mu'$ is called the
image of $\mu$ under the mapping $T$ and is denoted by $T(\mu)$.
Thus according to this definition
$$T(\mu)(A') := \mu(T^{-1}(A')) \tag{7.5'}$$
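For a measure concentrated on finitely many points the image measure can be computed directly from (7.5'). The following sketch assumes a discrete measure stored as a dictionary from points to masses; the mappings `T1`, `T2` and the helper names are hypothetical. It also compares forming the image in two steps with forming the image under the composite mapping.

```python
from collections import defaultdict
from fractions import Fraction

def image_measure(T, mu):
    """Image of a discrete measure (dict: point -> mass) under the mapping T."""
    out = defaultdict(Fraction)
    for point, mass in mu.items():
        out[T(point)] += mass
    return dict(out)

def measure_of(mu, A):
    """Measure of the set A under a discrete measure."""
    return sum((m for p, m in mu.items() if p in A), Fraction(0))

# Hypothetical example: uniform distribution on {0, ..., 5}.
mu = {x: Fraction(1, 6) for x in range(6)}
T1 = lambda x: x % 3
T2 = lambda y: y % 2

T1_mu = image_measure(T1, mu)

# Defining property: T(mu)(A') equals mu of the preimage of A'.
A_prime = {0, 2}
preimage = {x for x in mu if T1(x) in A_prime}
lhs = measure_of(T1_mu, A_prime)
rhs = measure_of(mu, preimage)

# Two-step image versus image under the composite mapping.
two_step = image_measure(T2, T1_mu)
one_step = image_measure(lambda x: T2(T1(x)), mu)
```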
whenever we are in the situation of 7.3 and $\mu$ is a measure on $\mathscr{A}_1$: For every $A \in \mathscr{A}_3$,
$T := T_2 \circ T_1$ satisfies $T^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$, and $T_2^{-1}(A) \in \mathscr{A}_2$. Therefore, setting
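$$T_a(x) := a + x, \qquad x \in \mathbb{R}^d.$$
$$T_a(\lambda^d) = \lambda^d$$
$$a + A = A + a := T_a(A) = \{a + x : x \in A\}$$
for sets $A \in \mathscr{P}(\mathbb{R}^d)$ and points $a \in \mathbb{R}^d$, then $T_a(\lambda^d)(A) = \lambda^d(-a + A)$ for arbitrary
$A \in \mathscr{B}^d$. Property (7.7) can therefore also be expressed as
$$\lambda^d(a + A) = \lambda^d(A) \tag{7.7'}$$
4.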
which assigns to the point $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ the image point $x' \in \mathbb{R}^d$ having
coordinates $x'_i := \alpha x_i$, and $x'_j = x_j$ for all $j \ne i$, a dilation of $x$. It satisfies
$$D_\alpha^{(i)}(\lambda^d) = |\alpha|^{-1} \lambda^d. \tag{7.9}$$
For, every open interval $]a, b[ \subset \mathbb{R}^d$ has $D_\alpha^{(i)}$-pre-image equal to $]a', b'[$, where the
coordinates of $a'$, $b'$ except the $i$th are those of $a$, $b$, the $i$th being $\alpha^{-1}$ times those
of $a$, $b$ if $\alpha > 0$, and $\alpha^{-1}$ times those of $b$, $a$ if $\alpha < 0$. Hence
$$\lambda^d\bigl((D_\alpha^{(i)})^{-1}(]a, b[)\bigr) = |\alpha|^{-1} \lambda^d(]a, b[)$$
$D_\alpha^{(i)}(\lambda^d)$ and $|\alpha|^{-1} \lambda^d$ are therefore measures on $\mathscr{B}^d$ which coincide on all bounded
open intervals. Thanks to 6.4 such intervals constitute a generator of $\mathscr{B}^d$, which
obviously has with respect to each of these measures all the properties of the
generator $\mathscr{E}$ in the uniqueness Theorem 5.4. From that theorem (7.9) therefore
follows.
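The pre-image computation in this argument is easy to replay numerically. A minimal sketch, assuming zero-based coordinate indices and exact rational arithmetic (the function names are illustrative): the pre-image of an interval under a dilation of one coordinate by a factor alpha is again an interval, and its volume is the original volume divided by the absolute value of alpha.

```python
from fractions import Fraction
from math import prod

def volume(a, b):
    """Volume of the interval with corners a and b (a below b coordinatewise)."""
    return prod(bi - ai for ai, bi in zip(a, b))

def dilation_preimage(alpha, i, a, b):
    """Corners (a', b') of the pre-image of ]a, b[ under x_i -> alpha * x_i.

    Only coordinate i (zero-based) changes; for negative alpha the two
    endpoints of that coordinate swap roles.
    """
    lo, hi = a[i] / alpha, b[i] / alpha
    if alpha < 0:
        lo, hi = hi, lo
    a2 = a[:i] + (lo,) + a[i + 1:]
    b2 = b[:i] + (hi,) + b[i + 1:]
    return a2, b2

a = (Fraction(0), Fraction(1))
b = (Fraction(2), Fraction(4))
alpha = Fraction(-3)

a2, b2 = dilation_preimage(alpha, 1, a, b)
ratio = volume(a2, b2) / volume(a, b)
```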
5.
If we set $H_r := D_r^{(1)} \circ \cdots \circ D_r^{(d)}$ for real $r \ne 0$, we obtain the linear mapping
$$H_r(\lambda^d) = |r|^{-d} \lambda^d$$
$H_{r'}(K)$ of every two homothetic images of $K$ with $0 < r < r' < 1$ is an L-B-null
set. (This property is enjoyed by every sphere $S_\alpha(0)$ of radius $\alpha > 0$ and center
$0 := (0, \ldots, 0)$, that is, the set of $x \in \mathbb{R}^d$ having euclidean distance $\alpha$ from 0.) Show
4. Let $T := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ denote the unit circle, that is, the sphere
$S_1(0)$ in $\mathbb{R}^2$. Prove the existence of a finite non-zero measure $\nu$ on the $\sigma$-algebra
$\mathscr{B}(T) := T \cap \mathscr{B}^2$ which is invariant under all rotations of $T$. [Hint: Take for $\nu$ an
image of $\lambda^1_C$ for an appropriate interval $C \subset \mathbb{R}$.]
where $0 = (0, \ldots, 0) \in \mathbb{R}^d$ and $1 := (1, \ldots, 1) \in \mathbb{R}^d$, 6.2 insures that
$$\lambda^d(W) = 1. \tag{8.2}$$
8.1 Theorem. Every measure $\mu$ on $\mathscr{B}^d$ which is translation-invariant, i.e., satisfies $T_a(\mu) = \mu$ for every translation $x \mapsto T_a(x) := a + x$ of $\mathbb{R}^d$, and which assigns
finite measure
(8.3)
$\mu = \alpha \lambda^d$.
Proof. Let $a_n := n^{-1} \cdot 1$, the point in $\mathbb{R}^d$ all of whose coordinates are $1/n$. Then
$W_n := [0, a_n[$
a cube with
$$\mu(W_n) = \alpha / n^d$$
In fact: The interval $[0, 1[ \in \mathscr{I}^1$ is the union of the pairwise disjoint intervals
$[\nu/n, (\nu+1)/n[$
$(\rho_1, \ldots, \rho_d) \in \mathbb{R}^d$ whose coordinates all come from the set $\{\nu/n : \nu = 0, \ldots, n-1\}$,
then
a union of $n^d$ pairwise disjoint intervals. Because $[r, r + a_n[ = T_r([0, a_n[) = T_r(W_n)$
and because of the translation-invariance of $\mu$, it follows from this representation
of $W$ that $\alpha = n^d \mu(W_n)$.
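The counting behind this step can be made explicit: the unit cube splits into n^d translates of the small cube with edge length 1/n. The sketch below is illustrative (the generator name `subcubes` is an assumption); it enumerates the lower corners with coordinates in {0, 1/n, ..., (n-1)/n} and confirms that the pieces number n^d and that their elementary contents add up to 1.

```python
from fractions import Fraction
from itertools import product
from math import prod

def subcubes(n, d):
    """Yield the corners (r, r + a_n) of the n^d translates of [0, a_n[
    that tile the unit cube [0, 1[, where a_n = (1/n, ..., 1/n)."""
    step = Fraction(1, n)
    for idx in product(range(n), repeat=d):
        r = tuple(k * step for k in idx)
        yield r, tuple(x + step for x in r)

def content(a, b):
    """Elementary content of the half-open interval [a, b[."""
    return prod(bi - ai for ai, bi in zip(a, b))

n, d = 3, 3
pieces = list(subcubes(n, d))
total = sum(content(a, b) for a, b in pieces)
```

Since a translation-invariant measure gives every piece the same mass, a unit cube of mass alpha forces each small cube to carry mass alpha divided by n^d, which is the identity used in the proof.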
A repetition of these considerations will show that
$$\mu([a, b[) = \alpha \lambda^d([a, b[)$$
holds for every interval $[a, b[ \in \mathscr{I}^d$ in which the points $a$, $b$ have only rational
coordinates. Obviously in proving this we can assume that $a \lhd b$, and due to
the translation-invariance of both measures we can further assume that $a = 0$.
Then $b = (m_1/n, \ldots, m_d/n)$ for appropriate $m_1, \ldots, m_d, n \in \mathbb{N}$, and therefore
$[0, b[$ is the union of the $m_1 \cdots m_d$ pairwise disjoint intervals $[r, r + a_n[$ with
$r = (\rho_1/n, \ldots, \rho_d/n)$ and $\rho_i \in \{0, \ldots, m_i - 1\}$ for each $i$. As before, this yields
$m_1 \cdots m_d \, \mu(W_n) = \mu([0, b[)$, hence
$$\mu([0, b[) = \alpha \, \frac{m_1}{n} \cdots \frac{m_d}{n} = \alpha \lambda^d([0, b[)$$
Now the set $\mathscr{E}$ of all intervals $[a, b[ \in \mathscr{I}^d$ for which $a$, $b$ have only rational
coordinates is an $\cap$-stable system. The technique used in the proof of Theorem 6.4
shows that $\mathscr{E}$ is, just like $\mathscr{I}^d$, a generator of $\mathscr{B}^d$. Because the measures $\mu$ and $\alpha \lambda^d$
coincide on $\mathscr{E}$ and for $n := (n, \ldots, n)$ with $n \in \mathbb{N}$ the intervals $[-n, n[$ lie in $\mathscr{E}$
and increase to $\mathbb{R}^d$, our claim (8.4) follows from the uniqueness theorem 5.4.
For $\alpha = 1$ we immediately get from 8.1:
8.2 Corollary. Lebesgue-Borel measure $\lambda^d$ is the only translation-invariant measure $\mu$ on $\mathscr{B}^d$ satisfying
$\mu(W) = 1$.
This corollary says that $\lambda^d$ is, in the theory of locally-compact groups, a Haar
measure on the additive group $\mathbb{R}^d$. That theory provides an analogous non-zero
invariant measure on every locally compact abelian group $G$; it is unique to within
a positive scalar factor and is called Haar measure on $G$. The reader interested in
its theory should consult NACHBIN [1965]. (Cf. also Exercise 4 of §7 and Exercise 8
of §17.)
The conclusion of the theorem and its corollary remain valid if in the normalization (8.2) and (8.2') the unit cube $W$ is replaced by its open interior $]0, 1[$ or
its compact closure $[0, 1]$. This is immediate from (6.7). However, if $\mu(W) = +\infty$
is allowed, $\mu$ need not be a multiple of $\lambda^d$. See HENLE and WAGON [1983].
Example. 1. Besides the $D_\alpha^{(i)}$ of Example 4, §7 there is another basic class of linear
mappings in $\mathbb{R}^d$, those that skew one coordinate by means of another. Specifically,
for each $i, k \in \{1, \ldots, d\}$ with $i \ne k$ we define
$$S^{(i,k)}(x_1, \ldots, x_d) := (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d).$$
Fix such a pair $(i, k)$ and write simply $S$ for $S^{(i,k)}$. Since this is a linear mapping,
$S(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$, so (8.5) will follow from 8.2 if we
succeed in showing that $S(\lambda^d)(W) = 1$, that is, $\lambda^d(S^{-1}(W)) = 1$. In view of (7.9)
and the equality $S^{-1} = D_{-1}^{(k)} \circ S \circ D_{-1}^{(k)}$, it suffices to show instead that
$$\lambda^d(S(W)) = 1. \tag{8.5'}$$
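The reduction from the pre-image of the unit cube to its image uses the fact that the inverse of a skew mapping is obtained by conjugating it with the reflection of the k-th coordinate. A small sketch, assuming zero-based indices and integer sample points (the function names `shear`, `flip` and `shear_inverse` are illustrative), checks this identity pointwise:

```python
from itertools import product

def shear(x, i, k):
    """Skew mapping: add coordinate k to coordinate i (zero-based indices)."""
    y = list(x)
    y[i] = x[i] + x[k]
    return tuple(y)

def flip(x, k):
    """Dilation of coordinate k by the factor -1."""
    y = list(x)
    y[k] = -x[k]
    return tuple(y)

def shear_inverse(x, i, k):
    """Inverse skew mapping: subtract coordinate k from coordinate i."""
    y = list(x)
    y[i] = x[i] - x[k]
    return tuple(y)

i, k = 0, 2
points = list(product(range(-2, 3), repeat=3))

conjugation_ok = all(
    flip(shear(flip(x, k), i, k), k) == shear_inverse(x, i, k) for x in points
)
inverse_ok = all(shear(shear_inverse(x, i, k), i, k) == x for x in points)
```

Because the reflection preserves L-B measure by the dilation formula with factor -1, this identity is what allows the image of the unit cube to be studied in place of its pre-image.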
Let $a$ denote the vector in $\mathbb{R}^d$ whose only non-zero coordinate is the $i$th one, it
being $-1$. Introduce
$$T_a(W'') = \{(x_1, \ldots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ x_k \le x_i < 1\}.$$
Clearly
(8.6)
$W'$ is the intersection of $W$ with the open set $\{(x_1, \ldots, x_d) \in \mathbb{R}^d : x_i < x_k\}$, so
$W'$ is a Borel set. Similarly $T_a(W'')$ is the intersection of $W$ with the closed set
$\{(x_1, \ldots, x_d) \in \mathbb{R}^d : x_k \le x_i\}$, so it is a Borel set. Thus $W''$, its preimage under $T_a$,
is also a Borel set. Since $S$ is a homeomorphism, $S(W)$ is a Borel set. Next notice
that
(8.7)
For the conditions on the $i$th coordinate that define each set in (8.7) are obviously
incompatible with those that define the other two sets. Moreover,
(8.8)
Here the inclusion "$\subset$" is obvious from the coordinate inequalities defining the
sets. A typical point $x$ of $D_2^{(i)}(W)$ has $j$th coordinate $x_j \in [0, 1[$ if $j \ne i$ and $i$th
coordinate $t \in [0, 2[ = [0, x_k[ \cup [x_k, 1 + x_k[ \cup [1 + x_k, 2[$. If $t$ lies in the first (third)
interval, then $x \in W'$ ($x \in W''$). Otherwise, $x_i := t - x_k \in [0, 1[$, and
$x = (x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d) = (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d) \in S(W)$.
This confirms (8.8). Combining all that we have learned gives the desired (8.5') as
follows:
=Ad W)
Ad(W"i)
by (7.9)
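One usually thinks of the space $\mathbb{R}^d$ as equipped with the euclidean scalar product
$$\langle x, y \rangle := \sum_{i=1}^{d} \xi_i \eta_i,$$
$$\rho(x, y) := \sqrt{\langle x - y, x - y \rangle}$$
where $x := (\xi_1, \ldots, \xi_d)$, $y := (\eta_1, \ldots, \eta_d)$.
Every mapping $T : \mathbb{R}^d \to \mathbb{R}^d$ which
leaves this metric invariant, that is, satisfies
$$\rho(T(x), T(y)) = \rho(x, y) \tag{8.9}$$
$T(0) = 0$.
Using the linearity of $\langle \cdot, \cdot \rangle$ in each of its positions, we get
Replacing $T$ with the identity mapping here shows that (8.9) may be supplemented with
$$\langle T(x), T(y) \rangle = \langle x, y \rangle \tag{8.9'}$$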
expression. Upon doing so and re-assembling the terms, we get back a single expression like (*) but with the identity mapping in place of T. That is, we get 0. In
other words,
holding for all $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$. This says that $T$ is a linear mapping. It is
immediate from (8.9) that $T$ is then injective. The dimension of $T(\mathbb{R}^d) \subset \mathbb{R}^d$ is
therefore $d$, so $T(\mathbb{R}^d) = \mathbb{R}^d$, and $T$ is surjective. A motion $T$ that is also a linear
mapping, and the preceding deliberations show that this is equivalent to $T(0) = 0$,
is called an orthogonal transformation.
The translation-invariance of $\lambda^d$ derived in 8.1 not only characterizes L-B measure but renders excellent service in the derivation of further invariance properties.
We begin with the motion-invariance of $\lambda^d$, that is, with the proof that
$$T(\lambda^d) = \lambda^d \tag{8.10}$$
$$T_a \circ T = T \circ T_b.$$
For every $x \in \mathbb{R}^d$, $T_a \circ T(x) = T(x) + a = T(x) + T(b) = T(x + b) = T \circ T_b(x)$, confirming (8.11). From this and the translation-invariance of $\lambda^d$ we get $T_a(T(\lambda^d)) =
T(T_b(\lambda^d)) = T(\lambda^d)$. As $a \in \mathbb{R}^d$ is arbitrary, this says that $\mu := T(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$. For the unit cube $W = [0, 1[$ we have $\alpha := \mu(W) =
\lambda^d(T^{-1}(W)) < +\infty$ by (6.2), since $T$ is an isometry and therefore along with $W$
the set $T^{-1}(W)$ is also bounded. Now Theorem 8.1 comes into action and guar-
distances invariant (8.9). Hence $T^{-1}(K) = K$, and from $T(\lambda^d) = \alpha \lambda^d$ follows
$$\lambda^d(K) = \lambda^d(T^{-1}(K)) = T(\lambda^d)(K) = \alpha \lambda^d(K)$$
From this follows the desired $\alpha = 1$, because on the one hand $\lambda^d(K) < +\infty$
by (6.2) and on the other hand $\lambda^d(K) > 0$ because $K$ contains a non-empty
interval $I \in \mathscr{I}^d$, namely $I := [-t, t[$ with $t := (d^{-1/2}, \ldots, d^{-1/2})$. [In Exercise 6
of §23 we will compute $\lambda^d(K)$ explicitly.]
Since with every motion $T$ of $\mathbb{R}^d$ its inverse $T^{-1}$ is also one, the motion-invariance can also be recorded in the following form: For every motion $T$ of $\mathbb{R}^d$
and every Borel set $A \in \mathscr{B}^d$
$$\lambda^d(T(A)) = \lambda^d(A) \tag{8.12}$$
In this form Theorem 8.3 just says that any two congruent Borel sets in Rd have
the same d-dimensional Lebesgue measure. This however is the measure-theoretic
formulation and refinement of the elementary geometric principle (A) enunciated
in the introduction to the chapter. Via it L-B measure is seen in the final analysis
to be a concept from euclidean geometry.
Examples. 2. Every hyperplane $H \subset \mathbb{R}^d$ is an L-B-nullset. This follows from Example 1 of §6 and the fact that there is a motion $T$ which transforms a hyperplane
of the kind considered in that example, say the hyperplane with equation $\xi_d = 0$,
into $H$.
3. Every closed or open box (meaning a parallelepiped with pairwise orthogonal
edges) $Q \subset \mathbb{R}^d$ whose edge-lengths are $l_1, \ldots, l_d$ has Lebesgue measure $\lambda^d(Q) =
l_1 \cdots l_d$. This follows analogously from Example 3 of §6.
into itself - and then too with respect to arbitrary affine mappings - can also be
clarified using a slight modification of the preceding method of proof.
Linear mappings $T : \mathbb{R}^d \to \mathbb{R}^d$ are just those that with respect to the canonical
basis in $\mathbb{R}^d$ (or indeed any basis) can be represented in the form $T(x) = Cx$, with
$C$ a $d \times d$ matrix and $x \in \mathbb{R}^d$ interpreted as a column vector. The determinant
of $T$, in symbols $\det T$, is by definition that of $C$ (and is independent of the choice
of basis).
We will restrict ourselves to the case where $T$ is non-singular, that is, $\det T \ne 0$,
and consequently bijective. These are elements of the group $GL(d, \mathbb{R})$ known as the
general linear group. The mappings $T \in GL(d, \mathbb{R})$ with $\det T = 1$ form a subgroup
of $GL(d, \mathbb{R})$, the special linear group $SL(d, \mathbb{R})$. It is in fact the commutator subgroup
of $GL(d, \mathbb{R})$ and this fact is used by DIEROLF and SCHMIDT [1998] to give an
alternative proof of our next theorem. (The behavior of $\lambda^d$ with respect to linear
mappings $T$ with $\det T = 0$ is elucidated in Exercise 2 below.)
$$T(\lambda^d) = \frac{1}{|\det T|} \lambda^d \tag{8.13}$$
or, equivalently
$$\lambda^d(T(A)) = |\det T| \, \lambda^d(A) \tag{8.14}$$
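In the plane the scaling rule for images under a non-singular linear mapping can be checked on the unit square, whose image is a parallelogram. The following is a sketch under stated assumptions: the two matrices are hypothetical examples, and the area of the image is computed with the shoelace formula for polygons, used here as an elementary stand-in for two-dimensional L-B measure (the boundary of the parallelogram is a null set, so half-open and closed versions have the same measure).

```python
from fractions import Fraction

def det2(C):
    """Determinant of a 2x2 matrix given as ((c11, c12), (c21, c22))."""
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]

def apply(C, p):
    """Image of the point p under the linear mapping x -> Cx."""
    return (C[0][0] * p[0] + C[0][1] * p[1],
            C[1][0] * p[0] + C[1][1] * p[1])

def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order."""
    s = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        s += x1 * y2 - x2 * y1
    return Fraction(abs(s), 2)

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

C = ((2, 1), (1, 3))        # hypothetical matrix with determinant 5
swap = ((0, 1), (1, 0))     # coordinate swap, determinant -1

area_C = shoelace_area([apply(C, p) for p in unit_square])
area_swap = shoelace_area([apply(swap, p) for p in unit_square])
```

The second example has determinant -1 and still preserves area, which is why the absolute value of the determinant appears in the formula.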
and $S^{(i,k)}$ defined in Example 1 of this section. It is obvious that $\det D_\alpha^{(i)} = \alpha$
and $\det S^{(i,k)} = 1$. Therefore (7.9) and (8.5) confirm (8.13) in the special case that
$T$ lies in the set
(8.15)
Theorems 8.3 and 8.4 taken together confirm an elementary fact from linear
algebra, namely that $|\det T| = 1$ for every orthogonal transformation $T$. And this
means that 8.3 is contained in the following immediate consequence of 8.4:
,P (Ao)
or equivalently
(8.16')
IdetDcpIAd .
We will not go into this any further, but refer the reader to the textbook literature,
e.g., STROMBERG [1981], or to VARBERG [1971].
$$\mathbb{R}^d = \bigcup_{k \in K} (k + \mathbb{Q}^d) = \bigcup_{y \in \mathbb{Q}^d} (y + K) \tag{8.17}$$
and
$$y_1 \ne y_2,\ y_1, y_2 \in \mathbb{Q}^d \implies (y_1 + K) \cap (y_2 + K) = \emptyset. \tag{8.18}$$
(Otherwise there are $k, k' \in K$ with $y_1 + k = y_2 + k'$, that is, with $k \sim k'$, which
by definition of $K$ means that $k = k'$ and consequently also $y_1 = y_2$.) Let us now
suppose that $K \in \mathscr{B}^d$. Since $\mathbb{Q}$ and therewith $\mathbb{Q}^d$ is countable, it follows from
(8.17), (8.18) and the $\sigma$-additivity of $\lambda^d$ that
(8.19)
(8.20)
for all y E Qd
2 being the point in $\mathbb{R}^d$ each of whose coordinates equals 2. From this fact and
(8.18) follows, again via $\sigma$-additivity of $\lambda^d$, that
But then (8.20) means that we must have $\lambda^d(K) = 0$, contradicting (8.19). The
assumption $K \in \mathscr{B}^d$ is what led to this contradiction, so we conclude that $K$ is,
after all, not a Borel set.
The following remarks serve to round out the foregoing and to provide a glimpse
of some closely related issues.
In passing from Borel sets to Lebesgue measurable sets the important property
of the former that they are determined only by the topology of $\mathbb{R}^d$ is lost. Because
$\mathscr{B}^d$ is the defining $\sigma$-algebra for so many other important measures (for $d = 1$
Theorem 6.5 already attests to this), we will not dwell in detail on the transition
from Lebesgue-Borel to Lebesgue measure; only the former will be employed in
the sequel.
4. There exists a Borel set $B \in \mathscr{B}^2$ whose image $\pi_1(B)$ under the first projection
map $\pi_1 : \mathbb{R}^2 \to \mathbb{R}$ (which sends every point $(x_1, x_2) \in \mathbb{R}^2$ to its first coordinate $x_1$)
is not a Borel subset of $\mathbb{R}$. A proof of this will be found in SRIVASTAVA [1998],
p. 130. Such a $B$ can even be found which is a $G_\delta$-set, that is, the intersection of
countably many open subsets of $\mathbb{R}^2$; see p. 36 of CHRISTENSEN [1974]. In particular,
the continuous image of a Borel set need not be a Borel set. The system of all sets
$\pi_1(B)$ with $B \in \mathscr{B}^2$ comprises rather the so-called Souslin or analytic subsets of $\mathbb{R}$.
See SRIVASTAVA [1998] and CHOQUET [1969].
sure on $\mathscr{B}^d$ with the following properties: (i) $\mu(K) < +\infty$ for every compact
$K \subset \mathbb{R}^d$; (ii) $\mu(\{x\}) = 0$ for every $x \in \mathbb{R}^d$; (iii) $\mu(U) > 0$ for every non-empty open
$U \subset \mathbb{R}^d$; (iv) $\mu(\mathbb{R}^d) = +\infty$. OXTOBY and ULAM [1941] showed that, conversely,
every measure $\mu$ on $\mathscr{B}^d$ enjoying properties (i)-(iv) has the form $\mu = T(\lambda^d)$ for
some homeomorphism $T : \mathbb{R}^d$
Exercises.
1. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping, $\mu$ a measure on the $\sigma$-algebra $\mathscr{A}$, and $\mu' := T(\mu)$ its image under this mapping. $(\Omega, \mathscr{A}_0, \mu_0)$ and $(\Omega', \mathscr{A}'_0, \mu'_0)$
will denote the completions of these measure spaces (Exercise 7, §5). Show that the
mapping $T$ is also $\mathscr{A}_0$-$\mathscr{A}'_0$-measurable and that $T(\mu_0) = \mu'_0$. From this it follows
that Lebesgue measure in $\mathbb{R}^d$ is also motion-invariant.
2. Let $T$ be a linear mapping of $\mathbb{R}^d$ into itself with $\det T = 0$. Show that for every
$A \in \mathscr{B}^d$, $T(A)$, although it may fail to be a Borel set (as noted in Remark 4) is
at least a Lebesgue-null set, thus a subset of an L-B-null set, namely the linear
subspace $T(\mathbb{R}^d)$ of $\mathbb{R}^d$. In this sense equality (8.14) retains its validity for linear
transformations $T : \mathbb{R}^d \to \mathbb{R}^d$ with $\det T = 0$, i.e., (8.14) is valid for every linear
transformation $T$ of $\mathbb{R}^d$ into itself.
3. Show that the set $K$ constructed in the proof of Theorem 8.6 is not even
Lebesgue measurable.
4. In the section entitled "Fallacies, Flaws and Flimflam", p. 39, vol. 22, no. 1
(1991) of the College Mathematics Journal the following short "proof" of Theorem 8.6 is offered: Suppose that $\lambda^1(X)$ is defined for every subset $X$ of $[0, 1]$. By
$\mathscr{P}([0, 1]),\ \lambda^1(X) \notin X\}$. It is a subset of $[0, 1]$ and upon testing the number $\lambda^1(B)$
for membership in $B$ we find that the statements $\lambda^1(B) \in B$ and $\lambda^1(B) \notin B$ are
equivalent, a contradiction. What is the error in this reasoning, or is it perhaps
a legitimate proof of Theorem 8.6?
Chapter II
Integration Theory
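with $B \in \mathscr{B}^1$. The system $\bar{\mathscr{B}}^1$ of these sets is obviously a $\sigma$-algebra in $\bar{\mathbb{R}}$ whose
trace in $\mathbb{R}$ is $\mathscr{B}^1$:
$$\mathbb{R} \cap \bar{\mathscr{B}}^1 = \mathscr{B}^1. \tag{9.1}$$
$$1_A(\omega) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \in \Omega \setminus A \end{cases}$$
$$A \subset B \iff 1_A \le 1_B;$$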
111,e,A,
inf IA, .
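2. For an arbitrary subset $Q$ of $\mathbb{R}^d$ consider the measurable space $(Q, Q \cap \mathscr{B}^d)$. The
corresponding measurable numerical functions on $Q$ will be called Borel measurable
$$\{\omega \in \Omega : f(\omega) \ge \alpha\} \in \mathscr{A} \quad \text{for all } \alpha \in \mathbb{R}. \tag{9.3}$$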
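Proof. According to 7.2 we have only to show that the system $\bar{\mathscr{I}}$ of all intervals
$[\alpha, +\infty]$ with $\alpha \in \mathbb{R}$ generates the $\sigma$-algebra $\bar{\mathscr{B}}^1$ in $\bar{\mathbb{R}}$. Since $[\alpha, +\infty] \in \bar{\mathscr{B}}^1$
for every $\alpha \in \mathbb{R}$, we have at any rate that $\bar{\mathscr{B}} \subset \bar{\mathscr{B}}^1$ for the $\sigma$-algebra $\bar{\mathscr{B}}$ generated
by $\bar{\mathscr{I}}$. Because $[\alpha, \beta[ = [\alpha, +\infty] \setminus [\beta, +\infty]$, the intervals $[\alpha, \beta[$ with $\alpha, \beta \in \mathbb{R}$ and
$\alpha < \beta$ all lie in $\mathbb{R} \cap \bar{\mathscr{B}}$. From 6.1 therefore follows that $\mathscr{B}^1 \subset \mathbb{R} \cap \bar{\mathscr{B}}$. Now the
single-element sets
$\{+\infty\} = \bigcap_{n \in \mathbb{N}} [n, +\infty]$ and $\{-\infty\} = \bigcap_{n \in \mathbb{N}} \complement[-n, +\infty]$
both lie in $\bar{\mathscr{B}}$. Consequently, along with each $Q \in \bar{\mathscr{B}}$, the set $\mathbb{R} \cap Q$ is also in $\bar{\mathscr{B}}$.
In other words, $\mathbb{R} \cap \bar{\mathscr{B}} \subset \bar{\mathscr{B}}$ and therewith $\mathscr{B}^1 \subset \bar{\mathscr{B}}$. This fact together with
$\{-\infty\}, \{+\infty\} \in \bar{\mathscr{B}}$ and the remarks preceding (9.1) make it clear that $\bar{\mathscr{B}}^1 \subset \bar{\mathscr{B}}$,
so that finally we have $\bar{\mathscr{B}} = \bar{\mathscr{B}}^1$.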
We now introduce some popular short-hand notation: For numerical functions
$f$ and $g$ on $\Omega$
(9.4) $\{f \le g\} := \{\omega \in \Omega : f(\omega) \le g(\omega)\}$
and the sets $\{f < g\}$, $\{f = g\}$, $\{f \ne g\}$, etc., are defined analogously. Condition (9.3) in this language reads: $\{f \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.
That we can just as well employ the sets $\{f > \alpha\}$, $\{f \le \alpha\}$, etc., in the
preceding characterization is the content of
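9.2 Theorem. Each of the following conditions is equivalent to the $\mathscr{A}$-measurability of the numerical function $f$ on $\Omega$:
(a) $\{f \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(b) $\{f > \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(c) $\{f \le \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(d) $\{f < \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.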
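Proof. All that has to be shown is the equivalence of these four assertions, and
that results from the validity, for all $\alpha \in \mathbb{R}$, of the equations
$\{f > \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \ge \alpha + \tfrac{1}{n}\}$, $\{f \le \alpha\} = \complement\{f > \alpha\}$,
$\{f < \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \le \alpha - \tfrac{1}{n}\}$, $\{f \ge \alpha\} = \complement\{f < \alpha\}$.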
It may be noted that the four related assertions in which quantification is over
all a E R are also equivalent.
A plethora of assertions about calculating with measurable numerical functions
now presents itself.
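9.3 Theorem. For any $\mathscr{A}$-measurable functions $f, g : \Omega \to \bar{\mathbb{R}}$ the sets $\{f < g\}$,
$\{f \le g\}$, $\{f = g\}$ and $\{f \ne g\}$ lie in $\mathscr{A}$.
Proof. Because the set $\mathbb{Q}$ of rational numbers is countable, the claims follow (with
the help of 9.2) from the equalities
$\{f < g\} = \bigcup_{\rho \in \mathbb{Q}} \{f < \rho\} \cap \{\rho < g\}$, $\{f \le g\} = \complement\{g < f\}$,
$\{f = g\} = \{f \le g\} \cap \{g \le f\}$, $\{f \ne g\} = \complement\{f = g\}$.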
9.4 Theorem. Along with $f, g : \Omega \to \bar{\mathbb{R}}$, the function $fg$ and, if everywhere
defined, the functions $f + g$ and $f - g$ are also $\mathscr{A}$-measurable.
Proof. First of all, along with $g$, $\sigma + \tau g$ is measurable for all $\sigma, \tau \in \mathbb{R}$. This follows
from 9.2 because $\{\sigma + \tau g \ge \alpha\}$ is $\{g \ge (\alpha - \sigma)/\tau\}$ if $\tau > 0$ and is $\{g \le (\alpha - \sigma)/\tau\}$
if $\tau < 0$, the case $\tau = 0$ being trivial. This preliminary remark takes care of the
passage from $g$ to $-g$ and reduces the case $f - g$ to the case $f + g$. Furthermore,
together with the remark following 9.2 and the equalities
$\{f + g \ge \alpha\} = \{f \ge \alpha - g\}$ $(\alpha \in \mathbb{R})$
the $\mathscr{A}$-measurability of $fg$. □
A useful special case of 9.4, isolated already in the course of its proof, is that of
inf f n ,
nEN
nEN
nEN
lim sup f .
nEN
nEN
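Due to 9.4, $\inf f_n = -\sup(-f_n)$ is then also measurable. By definition we have
$\liminf_{n\to\infty} f_n = \sup_{n \in \mathbb{N}} \inf_{m \ge n} f_m$ and $\limsup_{n\to\infty} f_n = \inf_{n \in \mathbb{N}} \sup_{m \ge n} f_m$.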
$f_1 \wedge \dots \wedge f_n$ and $f_1 \vee \dots \vee f_n$ are $\mathscr{A}$-measurable.
fn, fn .... 0
$\lim_{n\to\infty} f_n = \liminf_{n\to\infty} f_n = \limsup_{n\to\infty} f_n$
To every numerical function $f : \Omega \to \bar{\mathbb{R}}$ three other functions on $\Omega$ are associated (cf. the section "Notations"): the absolute value
(9.5) $|f| := f \vee (-f)$,
the positive part
(9.6) $f^+ := f \vee 0$,
and the negative part
(9.7) $f^- := (-f) \vee 0$.
Thus $f^+(\omega) = f(\omega)$ in case $f(\omega) \ge 0$ and $f^+(\omega) = 0$ in case $f(\omega) < 0$. Observe
that not only $f^+ \ge 0$, but also $f^- \ge 0$. The important equalities
(9.8) $f = f^+ - f^-$ and $|f| = f^+ + f^-$
are immediate.
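The two equalities in (9.8) can be checked pointwise on a small example. The sketch below is only an illustration: it assumes a function on a five-point set stored as a Python dictionary, takes the negative part to be $(-f) \vee 0$, and uses the helper names `pos_part` and `neg_part`, which are not from the text.

```python
# Hypothetical finite example: a real function f on Omega = {0,...,4},
# stored as a dict omega -> f(omega).
f = {0: -2.0, 1: 0.0, 2: 3.5, 3: -0.5, 4: 1.0}

def pos_part(f):
    """Positive part: pointwise maximum of f and 0."""
    return {w: max(v, 0.0) for w, v in f.items()}

def neg_part(f):
    """Negative part: pointwise maximum of -f and 0 (so it is >= 0)."""
    return {w: max(-v, 0.0) for w, v in f.items()}

fp, fm = pos_part(f), neg_part(f)

# (9.8): f = f^+ - f^-  and  |f| = f^+ + f^-, checked at every point.
decomposition_ok = all(f[w] == fp[w] - fm[w] for w in f)
abs_value_ok = all(abs(f[w]) == fp[w] + fm[w] for w in f)
print(decomposition_ok, abs_value_ok)
```

At each point at most one of the two parts is non-zero, which is why both identities hold simultaneously.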
From 9.4 and 9.6 we effortlessly infer our concluding result:
Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $D$ a dense subset of $\mathbb{R}$ (e.g., $D = \mathbb{Q}$). Show that
a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if the analog for all $\alpha \in D$ of one of
(a)-(d) in Theorem 9.2 holds.
2. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions on a measurable
space $(\Omega, \mathscr{A})$. Why is the set of all $\omega \in \Omega$ for which the sequence $(f_n(\omega))_{n \in \mathbb{N}}$
converges in $\mathbb{R}$, and that for which it converges in $\bar{\mathbb{R}}$, $\mathscr{A}$-measurable?
3. The real function $f : \Omega \to \mathbb{R}$ is measurable on the measurable space $(\Omega, \mathscr{A})$. Are
$\exp f$ and $\sin f$, that is, the functions $\omega \mapsto e^{f(\omega)}$ and $\omega \mapsto \sin f(\omega)$, $\mathscr{A}$-measurable?
4. With the aid of Theorem 9.1 show that the real function defined on $\mathbb{R}^2$ by
$(x, y) \mapsto \max\{x, y\}$ is $\mathscr{B}^2$-measurable. Deduce from this another proof of Corollary 9.6.
$E = E(\Omega, \mathscr{A})$
of $\mathscr{A}$-elementary functions on $\Omega$, which we define as follows:
If $\{\alpha_1, \dots, \alpha_n\}$ is the set of distinct values of a function $u \in E$, then the sets
$A_i := u^{-1}(\alpha_i)$, $i = 1, \dots, n$, are pairwise disjoint, and as pre-images of the Borel
sets $\{\alpha_i\}$ they each lie in $\mathscr{A}$. Using the notation for indicator functions introduced
in (9.2), we have then
(10.1) $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}.$
(10.2) $u, v \in E,\ \alpha \in \mathbb{R}_+ \implies \alpha u,\ u + v,\ u \cdot v,\ u \vee v,\ u \wedge v \in E.$
The derivation of (10.1) shows moreover that every function $u \in E$ has a representation
of the form (10.1) in which the sets $A_i \in \mathscr{A}$ are pairwise disjoint
and cover $\Omega$, that is, constitute a decomposition of $\Omega$. Such representations will
henceforth be called normal representations of $u$.
It is easy to see that generally functions $u \in E$ can have several different normal
representations. However, for $u \ne 0$ there is only one representation in which the
coefficients are the distinct non-zero values taken by $u$. Anyway, for purposes of
integration non-uniqueness of normal representations is not an issue, as the next
lemma shows.
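10.2 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. For any normal representations
$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} = \sum_{j=1}^{n} \beta_j 1_{B_j}$
of a function $u \in E$,
$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{j=1}^{n} \beta_j \mu(B_j).$
Proof. From
$\Omega = A_1 \cup \dots \cup A_m = B_1 \cup \dots \cup B_n$
follows
$A_i = \bigcup_{j=1}^{n} (A_i \cap B_j)$ and $B_j = \bigcup_{i=1}^{m} (A_i \cap B_j)$,
in which the sets $A_i \cap B_j$ are pairwise disjoint. The finite additivity of $\mu$ therefore
supplies the equalities
$\mu(A_i) = \sum_{j=1}^{n} \mu(A_i \cap B_j)$ and $\mu(B_j) = \sum_{i=1}^{m} \mu(A_i \cap B_j)$,
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$. After further
summation
$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{i,j} \alpha_i \mu(A_i \cap B_j)$ and $\sum_{j=1}^{n} \beta_j \mu(B_j) = \sum_{i,j} \beta_j \mu(A_i \cap B_j).$
From these two equalities the claim follows when we observe the following fact:
$\alpha_i = \beta_j$ whenever $A_i \cap B_j \ne \emptyset$. □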
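Thanks to the preceding our next definition is sound:
(10.3) $\int u\,d\mu := \sum_{i=1}^{n} \alpha_i \mu(A_i)$, where $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ is a normal representation of $u \in E$.
Thus $u \mapsto \int u\,d\mu$ defines a mapping from $E$ into $\bar{\mathbb{R}}_+$. Clearly it is a mapping into $\mathbb{R}_+$ just if $\mu$ is finite. The most important properties of this mapping are
summarized in: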
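(10.4) $\int 1_A\,d\mu = \mu(A)$ for all $A \in \mathscr{A}$;
(10.5) $\int (\alpha u)\,d\mu = \alpha \int u\,d\mu$ for all $u \in E$, $\alpha \in \mathbb{R}_+$;
(10.6) $\int (u + v)\,d\mu = \int u\,d\mu + \int v\,d\mu$ for all $u, v \in E$;
(10.7) $u \le v \implies \int u\,d\mu \le \int v\,d\mu$ for all $u, v \in E$.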
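Properties (10.4) and (10.5) are immediate from 10.3. The next property in the
list is confirmed thus: Start with normal representations
$u = \sum_{i=1}^{m} \alpha_i 1_{A_i}$ and $v = \sum_{j=1}^{n} \beta_j 1_{B_j}$.
Then $\Omega = A_1 \cup \dots \cup A_m = B_1 \cup \dots \cup B_n$,
and because the sets $A_i \cap B_j$ are pairwise disjoint, these equations entail
$1_{A_i} = \sum_{j=1}^{n} 1_{A_i \cap B_j}$ and $1_{B_j} = \sum_{i=1}^{m} 1_{A_i \cap B_j}$,
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$, from which in turn
new normal representations
$u = \sum_{i,j} \alpha_i 1_{A_i \cap B_j}$, $v = \sum_{i,j} \beta_j 1_{A_i \cap B_j}$ and $u + v = \sum_{i,j} (\alpha_i + \beta_j) 1_{A_i \cap B_j}$
result. Consequently
$\int u\,d\mu = \sum_{i,j} \alpha_i \mu(A_i \cap B_j)$, $\int v\,d\mu = \sum_{i,j} \beta_j \mu(A_i \cap B_j)$,
$\int (u + v)\,d\mu = \sum_{i,j} (\alpha_i + \beta_j) \mu(A_i \cap B_j)$,
and (10.6) follows. The same argument shows that any two functions $u, v \in E$ have normal representations
$u = \sum_{i=1}^{k} \gamma_i 1_{C_i}$ and $v = \sum_{i=1}^{k} \delta_i 1_{C_i}$
involving the same sets $C_1, \dots, C_k \in \mathscr{A}$. In case $u \le v$, it then follows that $\gamma_i \le \delta_i$
for each $i \in \{1, \dots, k\}$ such that $C_i \ne \emptyset$, and from this we have (10.7).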
Consider now a representation $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ of a function
$u \in E$ with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i \in \mathscr{A}$, but not necessarily a normal
representation. From (10.4)-(10.6) it follows that
$\int u\,d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i).$
For normal representations this equation served as the definition of $\int u\,d\mu$. Its
validity without this restriction, which we now perceive, indicates that the introduction of normal representations was simply a technique of proof.
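The independence of $\sum \alpha_i \mu(A_i)$ from the chosen representation can be seen on a toy example. The following sketch is an illustration under stated assumptions, not part of the text: the six-point space, the point masses in `mass`, and all function names are hypothetical. It evaluates the sum once for an overlapping (non-normal) representation and once for the normal representation obtained from the level sets of $u$.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0,...,5}, every subset measurable,
# mu given by point masses (exact rationals avoid rounding issues).
mass = {w: Fraction(w + 1, 2) for w in range(6)}

def mu(A):
    """Measure of a subset A of Omega."""
    return sum((mass[w] for w in A), Fraction(0))

def integral_from_representation(rep):
    """Sum of a_i * mu(A_i) for a representation [(a_i, A_i), ...]."""
    return sum((a * mu(A) for a, A in rep), Fraction(0))

def normal_representation(rep, omega):
    """Regroup by the distinct values of u = sum a_i 1_{A_i}: the level sets
    are pairwise disjoint and cover Omega."""
    levels = {}
    for w in omega:
        value = sum((a for a, A in rep if w in A), Fraction(0))
        levels.setdefault(value, set()).add(w)
    return [(value, A) for value, A in levels.items()]

omega = set(range(6))
# Overlapping sets: this is NOT a normal representation.
rep = [(Fraction(2), {0, 1, 2}), (Fraction(3), {2, 3}), (Fraction(1), {1, 2, 3, 4})]
normal = normal_representation(rep, omega)

lhs = integral_from_representation(rep)      # from the overlapping representation
rhs = integral_from_representation(normal)   # from the normal representation
print(lhs, rhs)
```

Both sums agree, as the remark above asserts; exact rational arithmetic is used so that the comparison is not blurred by rounding.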
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space and $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that
for every $\mathscr{A}_0$-elementary function $u$ there are $\mathscr{A}$-elementary functions $u_1, u_2$ such
that $u_1 \le u \le u_2$ and $\mu(\{u_1 \ne u_2\}) = 0$. For every such pair, $\int u_1\,d\mu = \int u_2\,d\mu = \int u\,d\mu_0$. (Cf. Exercise 7(d) in §5.)
2. The function $1_{\mathbb{Q}}$ on $\mathbb{R}$ has long been known as Dirichlet's jump function. Is it
a $\mathscr{B}^1$-elementary function?
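11.1 Theorem. For every isotone sequence $(u_n)_{n \in \mathbb{N}}$ of functions from $E$ and
every $u \in E$
(11.1) $u \le \sup_{n \in \mathbb{N}} u_n \implies \int u\,d\mu \le \sup_{n \in \mathbb{N}} \int u_n\,d\mu.$
Proof. Consider a representation
$u = \sum_{j=1}^{m} \alpha_j 1_{A_j}$
of $u$ with sets $A_j \in \mathscr{A}$ and coefficients $\alpha_j \in \mathbb{R}_+$, and let $\alpha$ be any number in $]0,1[$.
Then because of measurability the set
$B_n := \{u_n \ge \alpha u\}$
lies in $\mathscr{A}$ for each $n \in \mathbb{N}$. From this definition follows on the one hand that $u_n \ge \alpha u 1_{B_n}$
and consequently by (10.5) and (10.7)
$\int u_n\,d\mu \ge \alpha \int u 1_{B_n}\,d\mu$
for every $n \in \mathbb{N}$. Since the sequence $(u_n)$ is isotone and $u \le \sup u_n$, it follows on
the other hand that $B_n \uparrow \Omega$, and so $A_j \cap B_n \uparrow A_j$ for each $j \in \{1, \dots, m\}$ and
consequently, because $\mu$ is continuous from below,
$\int u\,d\mu = \sum_{j=1}^{m} \alpha_j \mu(A_j) = \lim_{n\to\infty} \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) = \lim_{n\to\infty} \int u 1_{B_n}\,d\mu.$
Therefore
$\sup_{n \in \mathbb{N}} \int u_n\,d\mu \ge \sup_{n \in \mathbb{N}} \alpha \int u 1_{B_n}\,d\mu = \alpha \lim_{n\to\infty} \int u 1_{B_n}\,d\mu = \alpha \int u\,d\mu,$
where the first step follows from $\int u_n\,d\mu \ge \alpha \int u 1_{B_n}\,d\mu$. Since $\alpha \in\ ]0,1[$ is arbitrary
here, the claim follows.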
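(11.2) $\sup_{n \in \mathbb{N}} u_n = \sup_{n \in \mathbb{N}} v_n \implies \sup_{n \in \mathbb{N}} \int u_n\,d\mu = \sup_{n \in \mathbb{N}} \int v_n\,d\mu.$
Proof. For every $m \in \mathbb{N}$, $v_m \le \sup_n u_n$ and $u_m \le \sup_n v_n$, from which inequalities and 11.1 we get
$\int v_m\,d\mu \le \sup_{n \in \mathbb{N}} \int u_n\,d\mu$ and $\int u_m\,d\mu \le \sup_{n \in \mathbb{N}} \int v_n\,d\mu.$
Claim (11.2) is immediate from the validity of these inequalities for all $m \in \mathbb{N}$.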
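Now let
$E^* = E^*(\Omega, \mathscr{A})$
denote the set of all numerical functions $f \ge 0$ on $\Omega$ for which there exists an isotone sequence $(u_n)$ of functions from $E$ with
(11.3) $\sup_{n \in \mathbb{N}} u_n = f.$
According to 11.2 the number $\sup_n \int u_n\,d\mu$
depends only on $f$ and not on the special representing sequence $(u_n)$ of $f$ used
to compute it. We're in a position similar to that of 10.3. Therefore we make the definition
(11.4) $\int f\,d\mu := \sup_{n \in \mathbb{N}} \int u_n\,d\mu \in \bar{\mathbb{R}}_+.$
(11.6) $\int (\alpha f)\,d\mu = \alpha \int f\,d\mu$ for all $f \in E^*$, $\alpha \in \mathbb{R}_+$;
(11.7) $\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$ for all $f, g \in E^*$;
(11.8) $f \le g \implies \int f\,d\mu \le \int g\,d\mu$ for all $f, g \in E^*$.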
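Proof. From the definition of $E^*$ and from (10.2) follows (11.5). One only has
to note that $\sup_n u_n = \lim_n u_n$ for isotone sequences $(u_n)$. The earlier proofs carry
over almost verbatim to (11.6) and (11.7). We'll do (11.7) and leave (11.6) for the
reader. Let $(u_n)$ and $(v_n)$ be isotone sequences of functions from $E$ with $\sup_n u_n = f$ and $\sup_n v_n = g$. Then $(u_n + v_n)$ is an isotone sequence in $E$ with $\sup_n (u_n + v_n) = f + g$, and since
$\int f\,d\mu = \sup_n \int u_n\,d\mu$ and $\int g\,d\mu = \sup_n \int v_n\,d\mu$,
(10.6) gives
$\int (f + g)\,d\mu = \sup_n \left( \int u_n\,d\mu + \int v_n\,d\mu \right) = \int f\,d\mu + \int g\,d\mu.$
If in addition we assume that $f \le g$, then $u_m \le \sup_n v_n$ for every $m \in \mathbb{N}$.
By 11.1, $\int u_m\,d\mu \le \sup_n \int v_n\,d\mu = \int g\,d\mu$ for every $m$, which yields (11.8).
Properties (11.6)-(11.8) say that the integral is a positively-homogeneous, additive and isotone function on $E^*$.
Finally, it turns out that Theorem 11.1, which is so critical for our program, is
valid also in $E^*$. This is the content of a theorem which goes back to B. LEVI (1875-1961):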
11.4 Theorem (on monotone convergence). For every isotone sequence $(f_n)_{n \in \mathbb{N}}$
of functions from $E^*$, $\sup_{n} f_n \in E^*$ and
$\int \sup_{n \in \mathbb{N}} f_n\,d\mu = \sup_{n \in \mathbb{N}} \int f_n\,d\mu.$
Proof. Set $f := \sup_n f_n$. It suffices to find an isotone sequence $(v_n)$ of functions
from $E$ which satisfy
$\sup_{n \in \mathbb{N}} v_n = f$ and $v_n \le f_n$ for every $n \in \mathbb{N}$.
For then $f \in E^*$ and $\int f\,d\mu = \sup_n \int v_n\,d\mu$ by definition of the integral in $E^*$, while
$\int v_n\,d\mu \le \int f_n\,d\mu$ by (11.8). Consequently, $\int f\,d\mu \le \sup_n \int f_n\,d\mu$ and therewith the
equality claimed by the theorem follows, since the other inequality $\sup_n \int f_n\,d\mu \le \int f\,d\mu$
is immediate from (11.8) and the fact that $f_n \le f$ for all $n$.
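The sequence $(v_n)$ is gotten thus: For each $f_n$ there is by definition an isotone
sequence $(u_{mn})_{m \in \mathbb{N}}$ of functions from $E$ with $\sup_{m \in \mathbb{N}} u_{mn} = f_n$. According to (10.2)
the functions
$v_m := u_{m1} \vee \dots \vee u_{mm}$
lie in $E$ (for each $m \in \mathbb{N}$). The isotoneity of each sequence $(u_{mn})_{m \in \mathbb{N}}$ clearly entails
that of the sequence $(v_m)_{m \in \mathbb{N}}$. From the isotoneity of $(f_m)_{m \in \mathbb{N}}$ follows $v_m \le f_m$
for all $m$, and thus $\sup_m v_m \le f$. For all $m \ge n$ we have $u_{mn} \le v_m$ and so
$f_n = \sup_{m \in \mathbb{N}} u_{mn} \le \sup_{m \in \mathbb{N}} v_m$ for every $n \in \mathbb{N}$.
Together with the preceding this gives finally $\sup_m v_m = f$. Therefore $(v_n)$ is a sequence of the kind sought.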
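For every sequence $(f_n)_{n \in \mathbb{N}}$ of functions from $E^*$
$\sum_{n=1}^{\infty} f_n \in E^*$ and $\int \left( \sum_{n=1}^{\infty} f_n \right) d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu.$
Proof. Apply 11.4 to the sequence $(f_1 + \dots + f_n)_{n \in \mathbb{N}}$ and recall (11.7). □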
In analogy with the device of writing $A_n \uparrow A$, $A_n \downarrow A$ for sets, introduced in §3,
we will from now on write
$f_n \uparrow f$, $f_n \downarrow f$
for numerical functions $f, f_1, f_2, \dots$ on the set $\Omega$ to signal that $f_n(\omega) \uparrow f(\omega)$ for
every $\omega \in \Omega$, or $f_n(\omega) \downarrow f(\omega)$ for every $\omega \in \Omega$; that is, the notations mean $(f_n)$ is
an isotone sequence and $f$ is its upper envelope, or $(f_n)$ is an antitone sequence
and $f$ is its lower envelope. Obviously for a sequence $(A_n)$ of subsets of $\Omega$
$A_n \uparrow A \iff 1_{A_n} \uparrow 1_A$ and $A_n \downarrow A \iff 1_{A_n} \downarrow 1_A.$
Examples. 1. Let $(\Omega, \mathscr{A})$ be an arbitrary measurable space and $\varepsilon_\omega$ the measure
defined on $\mathscr{A}$ by unit mass at the point $\omega \in \Omega$ (cf. Example 5 in §3). Then
$\int f\,d\varepsilon_\omega = f(\omega)$
for every $f \in E^*$. Due to 11.3 we can at once assume that $f \in E$.
If, however, $f = \sum \alpha_i 1_{A_i}$ is a normal representation of $f$, then $\omega$ lies in exactly
one of the sets $A_i$, say in $A_{i_0}$. Then $\int f\,d\varepsilon_\omega = \sum \alpha_i \varepsilon_\omega(A_i) = \alpha_{i_0} = f(\omega)$.
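2. Consider $\Omega := \mathbb{N}$ and $\mathscr{A} := \mathscr{P}(\mathbb{N})$. The $\sigma$-additivity requirement means that
a measure $\mu$ on $\mathscr{A}$ is uniquely defined whenever numbers $\alpha_n = \mu(\{n\}) \in \bar{\mathbb{R}}_+$ are
specified for each $n \in \mathbb{N}$. $E^*$ consists of all numerical functions $f \ge 0$ on $\Omega$. Indeed,
one sets $f_n := f(n) 1_{\{n\}}$ for each $n \in \mathbb{N}$ and then $f_n \in E^*$, and in case $f(n) < +\infty$,
$f_n \in E$. Since
$f = \sum_{n=1}^{\infty} f_n,$
11.5 yields
$\int f\,d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$
3. Let $(\mu_n)_{n \in \mathbb{N}}$ be a sequence of measures on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ and $\mu := \sum_{n=1}^{\infty} \mu_n$ their sum. Then for every $f \in E^*$
$\int f\,d\mu = \sum_{n=1}^{\infty} \int f\,d\mu_n.$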
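This is evidently true of indicator functions $f$, so the claimed equality holds for
all elementary functions. Transition to an arbitrary $f \in E^*$ is accomplished thus:
Let $(u_m)$ be a sequence in $E$ with $u_m \uparrow f$. Then the double sequence
$a_{mn} := \sum_{i=1}^{n} \int u_m\,d\mu_i$ $(m, n \in \mathbb{N})$
satisfies
$\sup_{m \in \mathbb{N}} \sup_{n \in \mathbb{N}} a_{mn} = \sup_{n \in \mathbb{N}} \sup_{m \in \mathbb{N}} a_{mn}$ $(= \sup_{m,n \in \mathbb{N}} a_{mn}),$
in which the left side equals $\sup_m \int u_m\,d\mu = \int f\,d\mu$ and the right side equals $\sum_{n=1}^{\infty} \int f\,d\mu_n$.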
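Example 2 reduces integration on $\mathbb{N}$ to summing a series, with the partial sums playing the role of the isotone sequence in the monotone convergence theorem. The sketch below illustrates this for one hypothetical choice, $\alpha_n = 2^{-n}$ and $f(n) = n$; these values and the function names are assumptions made for the illustration only.

```python
from fractions import Fraction

def alpha(n):
    """Hypothetical point masses alpha_n = mu({n}) = 2^{-n}, n = 1, 2, ..."""
    return Fraction(1, 2 ** n)

def f(n):
    """Hypothetical non-negative function on N: f(n) = n."""
    return Fraction(n)

def partial_integral(N):
    """Integral of f_1 + ... + f_N with f_n := f(n) 1_{{n}}, i.e. the
    partial sum of f(n) * alpha_n over n = 1..N."""
    return sum((f(n) * alpha(n) for n in range(1, N + 1)), Fraction(0))

# The partial integrals increase with N; their supremum is the integral of f.
partial = [partial_integral(N) for N in range(1, 41)]
print(float(partial[-1]))
```

For this particular choice the series $\sum n 2^{-n}$ has the value 2, and the partial sums approach it from below.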
For each $n \in \mathbb{N}$ the sets
$A_{in} := \{f \ge i 2^{-n}\} \cap \{f < (i+1) 2^{-n}\}$ for $i = 0, 1, \dots, n 2^n - 1$, and $A_{in} := \{f \ge n\}$ for $i = n 2^n$,
all lie in $\mathscr{A}$, and for each fixed $n \in \mathbb{N}$ the $n 2^n + 1$ sets are a decomposition of $\Omega$.
Consequently, for each $n$
$u_n := \sum_{i=0}^{n 2^n} i 2^{-n} 1_{A_{in}}$
is a normal representation of a function in $E$. On the set $A_{in}$ the function $u_{n+1}$ can
take only the values $(2i) 2^{-n-1}$ and $(2i+1) 2^{-n-1}$ if $i \in \{0, \dots, n 2^n - 1\}$, and only
values $\ge n$ when $i = n 2^n$. Therefore the sequence $(u_n)$ is isotone. It satisfies
$\sup_n u_n = f$, because for any $\omega \in \Omega$ either $f(\omega) = +\infty$, in which case $u_n(\omega) = n$
for every $n$, or $f(\omega) < +\infty$, in which case $u_n(\omega) \le f(\omega) < u_n(\omega) + 2^{-n}$ for all $n > f(\omega)$.
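The dyadic construction just described can be evaluated pointwise: $u_n$ rounds the value of $f$ down to the nearest multiple of $2^{-n}$ and cuts it off at height $n$. The following sketch assumes only that the value $f(\omega) \ge 0$ at a point is known (possibly $+\infty$); the function name `u` and the sample values are hypothetical.

```python
import math

def u(n, value):
    """n-th dyadic approximant at a point where f takes `value`
    (value >= 0, possibly math.inf): i * 2^-n on {i 2^-n <= f < (i+1) 2^-n}
    for i < n 2^n, and n on {f >= n}."""
    if value >= n:
        return float(n)
    return math.floor(value * 2 ** n) / 2 ** n

# Hypothetical values of f at a few sample points.
samples = [0.0, 0.3, 1.0, math.pi, 7.25, math.inf]
table = {v: [u(n, v) for n in range(1, 12)] for v in samples}

for v, approximants in table.items():
    print(v, approximants[:4], "...", approximants[-1])
```

Scaling by powers of two is exact in binary floating point, so the computed values are the true values of $u_n$ at these points; each row of the table is non-decreasing in $n$.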
from Theorem 9.1, because for every $\alpha \in \mathbb{R}$ either $\{f \ge \alpha\} \subset A$ or $\complement A \subset \{f \ge \alpha\}$.
In proving the converse we can, thanks to (9.8), assume that $f \ge 0$. The claim
is then true for elementary functions $f \in E(\Omega, \mathscr{A})$, because among finitely many
pairwise disjoint sets whose union is $\Omega$, exactly one has a countable complement.
For arbitrary $f \in E^*(\Omega, \mathscr{A})$ let $(u_n)$ be a sequence of elementary functions with
$u_n \uparrow f$. Each function $u_n$ is constantly $\alpha(u_n)$ in the complement of some countable
set $A_n$. But then $f(\omega)$ has the constant value $\sup_{n \in \mathbb{N}} \alpha(u_n)$
for all $\omega \in \bigcap_{n \in \mathbb{N}} \complement A_n = \complement\left( \bigcup_{n \in \mathbb{N}} A_n \right)$. As the set $\bigcup_{n \in \mathbb{N}} A_n$ is countable, this proves that $f$ has the asserted form.
$\int f\,d\mu = \alpha(f)$
f =goT.
+oo} the difference go(w') - go(w') is not defined. But the set T(Sl) is disjoint
from U', because go' (T(w)) = +oo always entails that 9o(T(w)) = f (w) = 0.
Therefore if we set
1Cu'9o
and g"
1Cu'9o
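Exercises.
1. Show that every bounded, $\mathscr{A}$-measurable, non-negative real-valued function
on a measurable space $(\Omega, \mathscr{A})$ is the uniform limit of an isotone sequence of $\mathscr{A}$-measurable elementary functions.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with a finite measure $\mu$. Further, let
$f, f_1, f_2, \dots$ be measurable numerical functions on $\Omega$. Prove the equivalence of
(i) $\lim_{n\to\infty} \mu\left( \bigcup_{m \ge n} \{f_m > f + \varepsilon\} \right) = 0$ for every $\varepsilon > 0$;
(ii) for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ with $\mu(A_\delta) < \delta$ such that for every
$\varepsilon > 0$, $f_n(\omega) < f(\omega) + \varepsilon$ holds for all $\omega \in \complement A_\delta$ and all sufficiently large $n \in \mathbb{N}$.
[Hints: Note that (i) is also equivalent to the statement that for every $\varepsilon > 0$ and
$\delta > 0$ there exists an $A_{\delta,\varepsilon} \in \mathscr{A}$ with $\mu(A_{\delta,\varepsilon}) < \delta$ and an $N_{\delta,\varepsilon} \in \mathbb{N}$ such that
$f_n(\omega) < f(\omega) + \varepsilon$ for all $\omega \in \complement A_{\delta,\varepsilon}$ and $n \ge N_{\delta,\varepsilon}$.] Why does (i) hold, given the
sequence $(f_n)_{n \in \mathbb{N}}$, for every measurable function $f$ which satisfies $f \ge \limsup_{n\to\infty} f_n$?
3. With the hypotheses and notation of the factorization lemma, show that for
any $\omega_1, \omega_2 \in \Omega$ with $T(\omega_1) = T(\omega_2)$, and every $C \in \sigma(T)$, either $\omega_1, \omega_2 \in C$ or
$\omega_1, \omega_2 \in \complement C$. (That is, $\omega_1$ and $\omega_2$ cannot be "separated" by any set in $\sigma(T)$.)
From this fact infer that a $\sigma(T)$-measurable $f$ satisfies $f(\omega_1) = f(\omega_2)$ whenever
$T(\omega_1) = T(\omega_2)$. In case $T(\Omega) \in \mathscr{A}'$, deduce the existence of an $\mathscr{A}'$-measurable
mapping $g : \Omega' \to \bar{\mathbb{R}}$ with $f = g \circ T$. [Hint: Consider the system $\mathscr{C}$ of all $C \subset \Omega$
which have this two-point property and conclude that $\sigma(T) \subset \mathscr{C}$. Further, take
note of the equality $T(T^{-1}(A')) = A' \cap T(\Omega)$ for $A' \subset \Omega'$.]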
12. Integrability
By now the integral $\int f\,d\mu$ is defined for all non-negative $\mathscr{A}$-measurable numerical
functions on $\Omega$, as a result of 11.4 and 11.6 together. In a third and final step $\int f\,d\mu$
will now be defined for certain numerical functions $f$ which are not of constant
sign.
According to Theorem 9.8, $f$ is measurable just if both its positive part $f^+$ and
its negative part $f^-$ are measurable. This remark prompts the following definition:
(12.1) $\int f\,d\mu := \int f^+\,d\mu - \int f^-\,d\mu$
is called the ($\mu$-)integral of $f$ (over $\Omega$).
If for some reason one wants to put the variable $\omega \in \Omega$ into evidence, one also
writes
$\int f(\omega)\,\mu(d\omega)$ or $\int f(\omega)\,d\mu(\omega)$.
the integral of $f$ exists and one uses (12.1) to define $\int f\,d\mu \in \bar{\mathbb{R}}$. Only occasionally
will we be concerned with this obvious generalization.
2. In the special case $\mu = \lambda^d$ we speak of Lebesgue integrable functions (on $\mathbb{R}^d$)
and of their Lebesgue integrals. If a Borel measure $\mu_F$ on $\mathbb{R}^d$ is described with the
help of a measure-generating function $F$ on $\mathbb{R}^d$ (cf. §6), the $\mu_F$-integrable functions $f$ on $\mathbb{R}^d$ are called Lebesgue-Stieltjes integrable (or Stieltjes integrable) with respect to $F$.
Let us now summarize the most important properties of the conceptual edifice
just built:
12.2 Theorem. Each of the following four statements is equivalent to the integrability of the measurable numerical function $f$ on $\Omega$:
(12.2) $\int (\alpha f)\,d\mu = \alpha \int f\,d\mu$ and $\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu.$
are integrable.
$(\alpha f)^+ = \alpha f^+$, $(\alpha f)^- = \alpha f^-$ if $\alpha \ge 0$, and
$(\alpha f)^+ = |\alpha| f^-$, $(\alpha f)^- = |\alpha| f^+$ if $\alpha < 0$.
$f + g = f^+ + g^+ - (f^- + g^-)$. (11.7) insures that $u := f^+ + g^+$ and $v := f^- + g^-$ are integrable. Then the claims about $f + g$ follow from the equality $f + g = u - v$
via 12.2. Finally, $|f \vee g| \le |f| + |g|$ and $|f \wedge g| \le |f| + |g|$, and we know that
$|f| + |g|$ is integrable. The integrability of the measurable functions $f \vee g$ and $f \wedge g$
follows then from these inequalities and part (c) of 12.2.
(12.3) $f \le g \implies \int f\,d\mu \le \int g\,d\mu$;
(12.4) $\left| \int f\,d\mu \right| \le \int |f|\,d\mu.$
Proof. From $f \le g$ follows $f^+ \le g^+$ and $f^- \ge g^-$, and from these inequalities and
the isotoneity of the integral on $E^*$ follows (12.3). Because $f \le |f|$ and $-f \le |f|$,
(12.4) follows from the first equality in (12.2) and from (12.3), with $|f|$ in the role
of $g$ there.
Using this widespread notation it follows immediately from Theorem 12.3 and
from (12.3) that: With respect to the operations
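$(f + g)(\omega) := f(\omega) + g(\omega)$ and $(\alpha f)(\omega) := \alpha f(\omega)$ $(\omega \in \Omega)$
$\int f\,d\varepsilon_\omega = f(\omega)$
2. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Example 2 of §11, $\mu(\{n\}) = \alpha_n$
for $n \in \mathbb{N}$. From what was shown there it follows that the $\mu$-integrable functions
are exactly those $f$ for which
$\sum_{n=1}^{\infty} |f(n)| \alpha_n < +\infty$,
and that for them
$\int f\,d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$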
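3. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Examples 2 and 1 of §3. A function $f : \Omega \to \bar{\mathbb{R}}$ is then $\mu$-integrable if and only if it is equal to a real constant $\alpha$
4.
on $\Omega$ is $\mu$-integrable.
5.
$\int f\,d(\mu + \nu) = \int f\,d\mu + \int f\,d\nu.$
$\mathscr{L}^1(\mu + \nu) = \mathscr{L}^1(\mu) \cap \mathscr{L}^1(\nu)$
is valid.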
We can now free ourselves of the restriction that functions always be integrated
over the whole $\Omega$. (11.5) insures that along with any pair of functions from $E^* = E^*(\Omega, \mathscr{A})$
their product is also in $E^*$. So from $f \in E^*$ and $A \in \mathscr{A}$ follows $1_A f \in E^*$.
If $f$ is an integrable numerical function on $\Omega$, then so is $1_A f$, for every $A \in \mathscr{A}$:
Because of the trivial inequality $|1_A f| \le |f|$, this is immediate from 12.2 (and 9.4).
In the light of this the following seems natural:
$\int_A f\,d\mu := \int 1_A f\,d\mu$; in particular, $\int_\Omega f\,d\mu = \int f\,d\mu.$
The following rules of calculation are evident, for all $f, g$ which either lie in $E^*$
or are integrable:
$\int_{A \cup B} f\,d\mu + \int_{A \cap B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu$ for all $A, B \in \mathscr{A}$
(12.8)
(12.8')
AuB
Afdj<_f gd
One merely has to reflect on the definitions involved. Moreover, pursuant to the
discussion after (12.5).
f - j fd
(12.10)
But we can get at integrals over sets in $\mathscr{A}$ in a different way, namely by considering the restriction $\mu_A$ of the given measure to the trace $\sigma$-algebra $A \cap \mathscr{A}$.
That one is thereby led to the same result is the content of
12.5 Lemma. Let $A \in \mathscr{A}$ and for every function $f$ on $\Omega$ which either lies in $E^*$
or is $\mu$-integrable let $f'$ denote the restriction of $f$ to $A$, and $\mu_A$ the restriction
of $\mu$ to $A \cap \mathscr{A}$. Then
(12.11) $\int f'\,d\mu_A = \int_A f\,d\mu.$
Proof. First consider $f \in E^*(\Omega, \mathscr{A})$. Then $f' \in E^*(A, A \cap \mathscr{A})$ since
$(f')^{-1}(B) = A \cap f^{-1}(B)$
holds for all Borel sets $B$ in $\bar{\mathbb{R}}$ (cf. 11.6). For the function $1_A f \in E^*$ there is
a sequence $(u_n)$ of $\mathscr{A}$-elementary functions satisfying $u_n \uparrow 1_A f$. The sequence $(u_n')$
of restrictions to $A$ obviously consists of $A \cap \mathscr{A}$-elementary functions that satisfy
$u_n' \uparrow f'$, from all of which follows that
(12.12) $\int_A f\,d\mu = \sup_{n \in \mathbb{N}} \int u_n\,d\mu$ and $\int f'\,d\mu_A = \sup_{n \in \mathbb{N}} \int u_n'\,d\mu_A.$
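Since $0 \le u_n \le 1_A f$, each $u_n$ vanishes outside $A$ and so has a representation
$u_n = \sum_{i=1}^{k} \alpha_i 1_{A_i}$
with sets $A_i \in \mathscr{A}$, $A_i \subset A$, so that
$u_n' = \sum_{i=1}^{k} \alpha_i 1_{A_i}'.$
(Notice that for $Q \subset A$, the restriction $1_Q'$ coincides with the indicator function
with respect to $A$ of $Q$.) From the last two equalities we see that
$\int u_n\,d\mu = \sum_{i=1}^{k} \alpha_i \mu(A_i) = \int u_n'\,d\mu_A$ for all $n \in \mathbb{N}$,
and with (12.12) this gives (12.11) for $f \in E^*$. If $f$ is $\mu$-integrable, the integrals of $f^+$ and $f^-$ over $A$
are finite and it is obvious that $(f')^+ = (f^+)'$, $(f')^- = (f^-)'$, so (12.11) follows
from linearity of the integral.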
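and in the second case to say that $f$ is also $\mu$-integrable over $A$. With the aid of
Lemma 12.5 we thereby get:
$f_A(\omega) := f(\omega)$ if $\omega \in A$, and $f_A(\omega) := 0$ if $\omega \in \Omega \setminus A$,
$\int_A f\,d\mu = \int f_A\,d\mu = \int f\,d\mu_A.$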
Exercises.
1. Characterize the functions $u \in E(\Omega, \mathscr{A})$ which are $\mu$-integrable.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The indicator function $1_A$ of a set $A \in \mathscr{A}$
is $\mu$-integrable just when $\mu(A) < +\infty$. Such sets are called $\mu$-integrable, and $\mathscr{R}$
will denote their totality. Show that $\mathscr{R}$ is an ideal in the ring $\mathscr{A}$ (cf. Exercise 4
in §1); in particular, $R \in \mathscr{R}$ and $A \in \mathscr{A}$ imply $A \cap R \in \mathscr{R}$. For a $\sigma$-finite measure $\mu$
a converse also holds: $A \subset \Omega$ and $A \cap R \in \mathscr{R}$ for all $R \in \mathscr{R}$ implies $A \in \mathscr{A}$.
5. Let $(\Omega, \mathscr{A}, \mu)$ be any measure space, $(A_n)$ a sequence of pairwise disjoint sets
from $\mathscr{A}$, $A$ their union, $f$ a numerical function on $A$. Show that $f$ is $\mu$-integrable
over $A$ if and only if it is $\mu$-integrable over each $A_n$ and $\sum_{n=1}^{\infty} \int_{A_n} |f|\,d\mu < +\infty$.
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu$ finite. Show that every real function $f$ on $\Omega$ which is the uniform limit of a sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ itself belongs
to $\mathscr{L}^1(\mu)$. Why does this conclusion fail for every non-finite $\mu$ which is $\sigma$-finite?
(Hint: Construct a sequence $(g_n)$ in $\mathscr{L}^1(\mu)$ with $0 \le g_n \le 1$ and $\int g_n\,d\mu \ge n^2$ for
each $n \in \mathbb{N}$ and then consider $f_n := \sum_{j=1}^{n} j^{-2} g_j$.)
$f = g$ ($\mu$-)almost everywhere;
($\mu$-)almost everywhere;
$|f| \le \alpha$ ($\mu$-)almost everywhere,
etc.
The theorems that follow explicate the significance that this new concept has
for integration theory:
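13.2 Theorem. For every $f \in E^*(\Omega, \mathscr{A})$, that is (cf. 11.6), for every $\mathscr{A}$-measurable, non-negative numerical function $f$
$\int f\,d\mu = 0 \iff f = 0$ $\mu$-almost everywhere.
Proof. Since $f$ is measurable, the set
$N := \{f \ne 0\} = \{f > 0\}$
lies in $\mathscr{A}$. What has to be shown is that
$\int f\,d\mu = 0 \iff \mu(N) = 0.$
Suppose $\int f\,d\mu = 0$. For each $n \in \mathbb{N}$ the set $A_n := \{f \ge n^{-1}\}$ also lies in $\mathscr{A}$ and
$A_n \uparrow N$, so that $\mu(N) = \lim \mu(A_n)$ and it is enough to show that $\mu(A_n) = 0$ for
every $n$. But obviously $f \ge n^{-1} 1_{A_n}$, entailing that $0 = \int f\,d\mu \ge n^{-1} \mu(A_n) \ge 0$,
that is, $\mu(A_n) = 0$, as wanted.
Suppose conversely that $\mu(N) = 0$. Each of the functions $u_n := n 1_N$ $(n \in \mathbb{N})$
lies in $E(\Omega, \mathscr{A})$ and satisfies $\int u_n\,d\mu = 0$. Setting $g := \sup_n u_n$ gives a function
in $E^*$ with $\int g\,d\mu = \sup_n \int u_n\,d\mu = 0$. Because $f \le g$, (11.8) yields $0 \le \int f\,d\mu \le \int g\,d\mu = 0$.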
$\int_N f\,d\mu = 0.$
Proof. If $f \ge 0$, this claim follows from the theorem, because each function $1_N f$
lies in $E^*(\Omega, \mathscr{A})$ and is almost everywhere 0. In turn, application of this to $f^+$ and $f^-$ yields the general case.
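(a) $f \ge 0,\ g \ge 0 \implies \int f\,d\mu = \int g\,d\mu$;
(b) $f$ integrable $\implies$ $g$ integrable and $\int f\,d\mu = \int g\,d\mu.$
Proof. (a) The set $N := \{f \ne g\}$ is a $\mu$-nullset lying in $\mathscr{A}$, so by 13.3
$\int_N f\,d\mu = \int_N g\,d\mu = 0.$
On the other hand, for $M = \complement N$ we have $1_M f = 1_M g$ due to the definition of $N$,
and so by (12.6)
$\int_M f\,d\mu = \int_M g\,d\mu.$
(b) From $f = g$ almost everywhere it follows that $f^+ = g^+$ and $f^- = g^-$ almost everywhere. Hence by (a)
$\int f^+\,d\mu = \int g^+\,d\mu$ and $\int f^-\,d\mu = \int g^-\,d\mu.$
Because $f$ is integrable, what we have here are non-negative real numbers, showing
that $g$ is integrable (part (a) of 12.2) and, upon subtracting the second equality
from the first, we get the equality claimed in (b).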
Since, roughly speaking, all this shows that integrability and the integral of
a function are insensitive to (measurable) changes of the function on nullsets,
results proved earlier can easily be reformulated somewhat more sharply. For example:
13.5 Corollary. Let the $\mathscr{A}$-measurable numerical functions $f$ and $g$ on $\Omega$ satisfy $|f| \le g$ $\mu$-almost everywhere. Then along with $g$, the function $f$ will also be
$\mu$-integrable.
13.6 Theorem. Every $\mu$-integrable numerical function $f$ on $\Omega$ is $\mu$-almost everywhere real-valued. Moreover, the set $\{f \ne 0\}$ is of $\sigma$-finite measure.
Proof. The set $N := \{|f| = +\infty\}$ lies in $\mathscr{A}$ and for every real $\alpha > 0$ satisfies
$\alpha 1_N \le |f|$. Consequently, $\alpha \mu(N) \le \int |f|\,d\mu < +\infty$, from which follows the first
claim, $\mu(N) = 0$. To prove the second claim we pass over to $|f|$ and thereby assume
$f \ge 0$. Then $\{f \ne 0\}$ is the union of the sets $A_n := \{f \ge n^{-1}\}$, and $f \ge n^{-1} 1_{A_n}$ gives
$n^{-1} \mu(A_n) \le \int f\,d\mu < +\infty.$
This holds for all $n \in \mathbb{N}$, confirming the $\sigma$-finiteness claim. □
Theorem 13.6 has yet another consequence: Let $N$ be a $\mu$-nullset and $f$ a numerical function which is defined on $M := \complement N$ and is $M \cap \mathscr{A}$-measurable. Such
a function is described as being a ($\mu$-)almost everywhere defined ($\mathscr{A}$-)measurable
function. The function $f_M$ introduced in 12.6 extends it to an $\mathscr{A}$-measurable function on $\Omega$. Any other extension of $f$ to $\Omega$ must agree with $f_M$ almost everywhere.
According to 13.4 therefore either every such extension is integrable or none is. In
the first case moreover all extensions have the same $\mu$-integral. These observations
justify the following definition:
13.7 Definition. Let $f$ be a $\mu$-almost everywhere defined, $\mathscr{A}$-measurable numerical function on $\Omega$. It will be called ($\mu$-)integrable if it can be extended to
a ($\mu$-)integrable function $f'$ defined on the whole of $\Omega$. $\int f'\,d\mu$ will then be called
the ($\mu$-)integral of $f$ and denoted $\int f\,d\mu$.
We will only occasionally be concerned with this extension of the integral concept, but its utility is already shown by the following
$\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$
prevails unrestrictedly.
Exercises.
1. The numerical functions $f$ and $g$ on the measure space $(\Omega, \mathscr{A}, \mu)$ satisfy $f = g$
$\mu$-almost everywhere. Show via an example that in general the $\mathscr{A}$-measurability
of $g$ does not follow from that of $f$. Show however that in case $(\Omega, \mathscr{A}, \mu)$ is complete,
3. Even if the $f$ in the preceding exercise is real-valued, the functions $f_1, f_2$ which
were proved to exist there cannot always be chosen to be real-valued. Prove this
for the case where $\Omega$ is any infinite set, $\mathscr{A} := \{\emptyset, \Omega\}$ and $\mu := 0$.
ifa<0
(14.1) $N_p(f) := \left( \int |f|^p\,d\mu \right)^{1/p}$
(14.2) $N_p(\alpha f) = |\alpha|\, N_p(f)$
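14.1 Theorem. Suppose $p > 1$ is a real number and $q > 1$ is defined by the equation
$\frac{1}{p} + \frac{1}{q} = 1.$
Then all measurable numerical functions $f$ and $g$ on $\Omega$ satisfy
(14.3) $N_1(fg) \le N_p(f)\, N_q(g)$ (HÖLDER's inequality).
Proof. It is clear from definition (14.1) that we may assume $f \ge 0$ and $g \ge 0$.
Setting $\sigma := N_p(f)$ and $\tau := N_q(g)$, we may further assume $0 < \sigma, \tau < +\infty$: if $\sigma = 0$ or $\tau = 0$, then $fg = 0$ almost everywhere by 13.2, and if both are positive and one of them is $+\infty$, (14.3) is trivial. Now
$(1 + \eta)^{1/p} \le \frac{\eta}{p} + 1$ for all $\eta \in \mathbb{R}_+$
or, with $\xi := 1 + \eta$,
$\xi^{1/p} \le \frac{\xi}{p} + \frac{1}{q}$ for all $\xi \ge 1$.
If now $x$ and $y$ are positive real numbers, then one of $x y^{-1}$ and $x^{-1} y$ is such
a $\xi$. Inserting this $\xi$ into the last inequality (and reversing the roles of $p$ and $q$ if
necessary), gives
$x^{1/p} y^{1/q} \le \frac{x}{p} + \frac{y}{q}$,
which obviously holds as well when $x = 0$ or $y = 0$. With $x := (f/\sigma)^p$ and $y := (g/\tau)^q$ on the set $\{f < +\infty\} \cap \{g < +\infty\}$ this becomes
$\frac{fg}{\sigma\tau} \le \frac{f^p}{p\,\sigma^p} + \frac{g^q}{q\,\tau^q}$,
valid throughout $\Omega$, since it trivially prevails as well in the complementary set
$\{f = +\infty\} \cup \{g = +\infty\}$. Integration of this inequality leads at once to (14.3). □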
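Hölder's inequality (14.3) can be tried out numerically on a finite weighted set, where every integral is a finite sum. The sketch below is only a sanity check under stated assumptions: the 50-point space, the random weights and function values, and the names `N` and `holder_gap` are all hypothetical.

```python
import random

def N(p, f, weights):
    """N_p(f) = (sum |f(n)|^p * alpha_n)^(1/p) on a finite weighted set."""
    return sum(abs(x) ** p * a for x, a in zip(f, weights)) ** (1.0 / p)

random.seed(0)
weights = [random.uniform(0.1, 2.0) for _ in range(50)]  # hypothetical masses alpha_n
f = [random.uniform(-3.0, 3.0) for _ in range(50)]
g = [random.uniform(-3.0, 3.0) for _ in range(50)]

def holder_gap(p):
    """N_p(f) N_q(g) - N_1(fg) with 1/p + 1/q = 1; (14.3) says this is >= 0."""
    q = p / (p - 1.0)
    fg = [x * y for x, y in zip(f, g)]
    return N(p, f, weights) * N(q, g, weights) - N(1, fg, weights)

gaps = {p: holder_gap(p) for p in (1.5, 2.0, 3.0, 10.0)}
for p, gap in gaps.items():
    print(p, gap)
```

A numerical check of this kind proves nothing, of course; it merely shows the inequality at work for several conjugate pairs $(p, q)$.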
14.2 Theorem. For all measurable numerical functions $f$ and $g$ on $\Omega$ whose sum
$f + g$ is defined throughout $\Omega$, and for every $p \in [1, +\infty[$
(14.4) $N_p(f + g) \le N_p(f) + N_p(g)$ (MINKOWSKI's inequality).
Proof. Because $|f + g| \le |f| + |g|$, we have $N_p(f + g) \le N_p(|f| + |g|)$,
which shows that we may assume $f \ge 0$ and $g \ge 0$. In case $p = 1$ there is then
even equality in (14.4), by (11.7). Therefore, for the rest we can assume that
$1 < p < +\infty$, and then again define $q$ by $p^{-1} + q^{-1} = 1$. We may further assume
that both $N_p(f)$ and $N_p(g)$ are finite, that is, that $f^p$ and $g^p$ are integrable.
12.2(c) and the estimates
$(f + g)^p \le [2(f \vee g)]^p = 2^p [f^p \vee g^p] \le 2^p (f^p + g^p)$
then insure the integrability of $(f + g)^p$, that is, $N_p(f + g) < +\infty$. Now write
$\int (f + g)^p\,d\mu = \int (f + g)^{p-1} f\,d\mu + \int (f + g)^{p-1} g\,d\mu$
and apply (14.3) to each summand. Since $(p - 1) q = p$, this gives
$N_p(f + g)^p \le \left( N_p(f) + N_p(g) \right) N_p(f + g)^{p/q}.$
The desired inequality (14.4) follows from this and the finiteness of $N_p(f + g)$. □
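Minkowski's inequality (14.4) can likewise be tested on a small weighted set, together with a standard two-point example showing that it may fail for an exponent $p < 1$. Everything in the sketch below (the four-point space, the weights, the vectors, and the names `N` and `minkowski_gap`) is a hypothetical illustration.

```python
def N(p, f, weights):
    """N_p(f) = (sum |f(n)|^p * alpha_n)^(1/p) on a finite weighted set."""
    return sum(abs(x) ** p * a for x, a in zip(f, weights)) ** (1.0 / p)

def minkowski_gap(p, f, g, weights):
    """N_p(f) + N_p(g) - N_p(f + g); (14.4) says this is >= 0 for p >= 1."""
    s = [x + y for x, y in zip(f, g)]
    return N(p, f, weights) + N(p, g, weights) - N(p, s, weights)

weights = [0.5, 1.0, 2.0, 0.25]   # hypothetical masses alpha_n
f = [1.0, -2.0, 0.5, 4.0]
g = [-3.0, 1.0, 2.0, 0.0]
gaps = {p: minkowski_gap(p, f, g, weights) for p in (1.0, 2.0, 3.0)}

# For p = 1/2 the inequality can fail: two indicator functions with
# disjoint supports on a two-point set with unit masses.
failing_gap = minkowski_gap(0.5, [1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
print(gaps, failing_gap)
```

In the two-point example $N_{1/2}(f) = N_{1/2}(g) = 1$ while $N_{1/2}(f + g) = (1 + 1)^2 = 4$, so the "triangle inequality" is violated for $p = 1/2$.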
According to 12.2, 1-fold integrable functions are indeed just the integrable
functions. In the case $p = 2$ we also speak of square-integrability.
It is immediate from the definition that a measurable function $f$ is $p$-fold integrable if and only if $|f|$ is $p$-fold integrable; equivalently, if and only if there is
a $p$-fold integrable function $g \ge 0$ with $|f| \le g$. Further properties, already known
to hold when $p = 1$, are codified in:
14.4 Theorem. Consider $p \in [1, +\infty[$ and $p$-fold integrable functions $f$ and $g$.
Then for every $\alpha \in \mathbb{R}$
$\alpha f$, $f \vee g$ and $f \wedge g$
are $p$-fold integrable, and in case it is defined throughout $\Omega$, the function $f + g$ is
$p$-fold integrable.
Proof. Because $N_p(f)$ and $N_p(g)$ are
finite, the claims about $\alpha f$ and $f + g$ follow from (14.2) and (14.4). The $p$-fold
integrability of $f \vee g$ and $f \wedge g$ then follow as in the case $p = 1$ from the estimates
$|f \vee g| \le |f| + |g|$ and $|f \wedge g| \le |f| + |g|$.
14.5 Corollary. For $1 \le p < +\infty$ a numerical function $f$ on $\Omega$ is $p$-fold integrable
just if its positive part $f^+$ and its negative part $f^-$ are both $p$-fold integrable.
In view of (14.5) real-valued $p$-fold integrable functions are also known as $\mathscr{L}^p$-functions.
From (14.3) we immediately get:
14.6 Theorem. The product of a $p$-fold and a $q$-fold integrable numerical function
is integrable (where $1 < p < +\infty$ and $\frac{1}{p} + \frac{1}{q} = 1$).
In particular, the product of two square-integrable functions is always integrable.
14.7 Corollary. If $1 \le p < +\infty$ and the measure $\mu$ is finite, then every $p$-fold
integrable function is integrable.
Proof. Because $\mu(\Omega) < +\infty$, the constant function 1 is $q$-fold integrable on $\Omega$, for
each $q \in [1, +\infty[$. So the present claim follows from 14.6 upon writing any $p$-fold
integrable $f$ as $f \cdot 1$.
Remark. 1. Without the hypothesis $\mu(\Omega) < +\infty$ the conclusion of 14.7 may fail.
For example, in Example 2 of §12 choose the measure $\mu$ by requiring $\alpha_n = n^{-1/2}$
for all $n$. Then the function $f$ defined on $\Omega = \mathbb{N}$ by $f(n) := \alpha_n$ for all $n \in \mathbb{N}$ lies
in $\mathscr{L}^p(\mu)$ for every $p > 1$ but not in $\mathscr{L}^1(\mu)$.
the set of all real, $\mathscr{A}$-measurable, $\mu$-almost everywhere bounded functions on $\Omega$.
One immediately perceives that $\mathscr{L}^\infty(\mu)$ is also a vector space over $\mathbb{R}$. The union
of Theorems 14.6 and 14.8 results in the assertion
(14.8) $f \in \mathscr{L}^p(\mu),\ g \in \mathscr{L}^q(\mu),\ 1 \le p \le +\infty,\ p^{-1} + q^{-1} = 1 \implies fg \in \mathscr{L}^1(\mu),$
analytic geometry.
Remark. 2. Definition (14.1) of $N_p$ obviously makes sense for every real $p > 0$,
thus also for those $0 < p < 1$ heretofore excluded from consideration. For
these $p$, however, the fundamental properties (14.3) and (14.4) are lost and the $q$
determined by $p^{-1} + q^{-1} = 1$ is negative. (On this point, compare Exercise 5 below.)
Remark 3 at the end of §15 will show that pathologies occur when $0 < p < 1$. All
subsequent work will therefore be restricted to the case $p \ge 1$.
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p < +\infty$. Show that every function $f$
on $\Omega$ which is the uniform limit of a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ itself lies in $\mathscr{L}^p(\mu)$.
2. For an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$ and $1 \le p < +\infty$, show that a real
function $f$ on $\Omega$ is $p$-fold integrable if and only if $f |f|^{p-1}$ is integrable. (In the "if"
direction, measurability of $f$ itself is not part of the hypothesis.)
3. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p' \le p < +\infty$, and $f$ a measurable
numerical function on $\Omega$. Then
$N_{p'}(f) \le N_p(f) \cdot \mu(\Omega)^{1/p' - 1/p}$ and $\mathscr{L}^p(\mu) \subset \mathscr{L}^{p'}(\mu)$.
4. For any finite number of measurable numerical functions $f_1, \dots, f_n$ on a measure
space $(\Omega, \mathscr{A}, \mu)$ and real numbers $p_1, \dots, p_n > 1$ with $\sum_{j=1}^{n} p_j^{-1} = 1$
$N_1(f_1 \cdot \ldots \cdot f_n) \le N_{p_1}(f_1) \cdot \ldots \cdot N_{p_n}(f_n).$
5. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $p \in\ ]0,1[$ and $q < 0$ be defined by $p^{-1} + q^{-1} = 1$.
Consider non-negative $f \in \mathscr{L}^p(\mu)$ and a measurable $g : \Omega \to\ ]0, +\infty[$ satisfying
and find an example to show that generally equality does not prevail here.
$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$
having properties (14.2) and (14.4). From the second of those properties, the
Minkowski inequality, it follows that the function
$d_p(f, g) := N_p(f - g)$ $(f, g \in \mathscr{L}^p(\mu))$
satisfies the triangle inequality.
Evidently $d_p$ thus has all the properties of a metric on $\mathscr{L}^p(\mu)$, with one exception:
According to 13.2 and 13.3
$d_p(f, g) = 0 \iff f = g$ $\mu$-almost everywhere.
Distance-like functions without the property that "distance between two elements
equal to zero entails equality of the elements" are usually called pseudometrics.
$N_p$ and $d_p$ are called the $\mathscr{L}^p$-semi-norm and the $\mathscr{L}^p$-pseudometric, also the seminorm or the pseudometric of convergence in the $p$th mean or of $\mathscr{L}^p$-convergence.
To elaborate: If $(f_n)$ is a sequence in $\mathscr{L}^p(\mu)$, then it is said to converge in $p$th
mean to $f \in \mathscr{L}^p(\mu)$, or to be $\mathscr{L}^p$-convergent, if
(15.1) $\lim_{n\to+\infty} N_p(f_n - f) = 0.$
By virtue of what was noted above, the limit function $f$ is only almost everywhere
uniquely determined. (14.2) and (14.4) insure that linear computations with convergent sequences are like those we are accustomed to involving real numbers. In
immediately apprehensible symbolic form these say:
$f_n \to f$, $g_n \to g \implies \alpha f_n + \beta g_n \to \alpha f + \beta g$ for any $\alpha, \beta \in \mathbb{R}$.
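simply because
$\left| \int f\,d\mu - \int g\,d\mu \right| \le \int |f - g|\,d\mu = N_1(f - g),$
15.1 Theorem. Every sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ (resp., in $\mathscr{L}^p(\mu)$) which converges
in mean (resp., in $p$th mean) to a function $f$ from $\mathscr{L}^1(\mu)$ (resp., from $\mathscr{L}^p(\mu)$) also
satisfies
(15.5) $\lim_{n\to\infty} \int_A f_n\,d\mu = \int_A f\,d\mu$ for every $A \in \mathscr{A}$
(resp.,
(15.6) $\lim_{n\to\infty} \int_A |f_n|^p\,d\mu = \int_A |f|^p\,d\mu$ for every $A \in \mathscr{A}$).
Proof. (15.5) follows from (15.3). Correspondingly (15.6) follows from (15.4), which
gives $\lim N_p(1_A f_n) = N_p(1_A f)$, upon taking $p$th powers in this last limit and
noting that $N_p(1_A f_n - 1_A f) \le N_p(f_n - f)$.
$f \mapsto \int_A f\,d\mu$ and $f \mapsto \int_A |f|^p\,d\mu$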
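15.2 Lemma (of Fatou). Every sequence $(f_n)_{n\in\mathbb{N}}$ in $E^*(\Omega, \mathscr{A})$, that is, consisting of $\mathscr{A}$-measurable numerical functions $f_n \ge 0$, satisfies
$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$
Proof. According to 9.5 and 11.6 the functions
$f := \liminf_{n\to\infty} f_n$   and   $g_n := \inf_{m \ge n} f_m$
lie in $E^*$ for all $n \in \mathbb{N}$. Since $g_n \uparrow f$, the monotone convergence theorem gives
$\int f\,d\mu = \sup_{n\in\mathbb{N}} \int g_n\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu.$
Because $g_n \le f_m$ for all $m \ge n$, we also have
$\int g_n\,d\mu \le \inf_{m \ge n} \int f_m\,d\mu$
for every $n$, and letting $n \to \infty$ yields the claim.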
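That the inequality in Fatou's lemma can be strict is visible in a standard example, sketched here under the assumption that the measure is Lebesgue measure on $[0,1]$ and $f_n := n \cdot 1_{]0,1/n[}$ (this example and the helper names are not taken from the text). Every $f_n$ has integral 1, while $f_n(x) = 0$ as soon as $n \ge 1/x$, so the pointwise lim inf is 0.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n times the indicator of the open interval ]0, 1/n[."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Lebesgue integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)

def tail_inf(x, start, stop):
    """inf of f_m(x) over start <= m < stop, a finite stand-in for g_n(x)."""
    return min(f(m, x) for m in range(start, stop))

x = Fraction(1, 10)
print([integral_f(n) for n in (1, 5, 50)])  # each integral equals 1
print(tail_inf(x, 1, 100))                  # the tail infimum at x is 0
```

Here the left side of Fatou's inequality is 0 and the right side is 1.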
(15.7)   $\liminf_{n\to\infty} A_n := \bigcup_{n\in\mathbb{N}} \bigcap_{m \ge n} A_m.$
This is the set of $\omega \in \Omega$ which lie in ultimately all of the sets $A_n$. Dual to it one defines
(15.8)   $\limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m,$
the set of $\omega \in \Omega$ which lie in infinitely many of the sets $A_n$, more correctly, the $\omega$ which lie in $A_n$ for infinitely many $n$. Evidently
$\complement\bigl(\limsup_{n\to\infty} A_n\bigr) = \liminf_{n\to\infty} \complement A_n$
holds as well.
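For a periodic sequence of sets the two limits can be computed exactly, because every tail of the sequence contains a full period. The sketch below assumes such a periodic sequence on a hypothetical five-point set; `lim_inf`, `lim_sup`, `OMEGA` and `period` are illustrative names.

```python
def lim_inf(period):
    """lim inf of the periodic sequence A_1, ..., A_k, A_1, ...:
    each tail intersection equals the intersection over one period."""
    return frozenset.intersection(*period)

def lim_sup(period):
    """lim sup of the same sequence: the union over one period."""
    return frozenset.union(*period)

OMEGA = frozenset(range(5))
period = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({2, 3})]

print(sorted(lim_inf(period)))  # points lying in ultimately all A_n
print(sorted(lim_sup(period)))  # points lying in infinitely many A_n
```

Taking complements in `OMEGA` turns the lim sup into the lim inf of the complements, which mirrors the duality stated above.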
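$\mu(\Omega) - \mu\bigl(\limsup_{n\to\infty} A_n\bigr) = \mu\bigl(\liminf_{n\to\infty} \complement A_n\bigr) \le \liminf_{n\to\infty} \mu(\complement A_n) = \mu(\Omega) - \limsup_{n\to\infty} \mu(A_n),$
confirming (15.11).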
15.4 Theorem (of F. Riesz). Suppose $1 \le p < +\infty$ and the sequence $(f_n)_{n\in\mathbb{N}}$
in $\mathscr{L}^p(\mu)$ converges almost everywhere in $\Omega$ to a function $f \in \mathscr{L}^p(\mu)$. Then the
condition
(15.12)
which has already been used in the proof of (14.4). Since $|\alpha - \beta| \le |\alpha| + |\beta|$ this inequality yields
$|\alpha - \beta|^p \le 2^p(|\alpha|^p + |\beta|^p)$   ($\alpha, \beta \in \mathbb{R}$).
Consequently
$g_n := 2^p(|f_n|^p + |f|^p) - |f_n - f|^p$,   $n \in \mathbb{N}$,
are non-negative functions. They lie in $\mathscr{L}^1(\mu)$ and by hypothesis they converge almost everywhere to $2^{p+1}|f|^p$. In particular, $2^{p+1}|f|^p = \liminf g_n$ almost everywhere. Therefore Fatou's lemma in conjunction with (15.12) delivers the relations
$2^{p+1} \int |f|^p\,d\mu = \int \liminf_{n\to\infty} g_n\,d\mu \le \liminf_{n\to\infty} \int g_n\,d\mu$
In preparation for the proof of the next convergence theorem we extend Minkowski's inequality to series of non-negative functions.
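(15.13)   $N_p\Bigl(\sum_{n=1}^\infty f_n\Bigr) \le \sum_{n=1}^\infty N_p(f_n).$
For the partial sums $s_n := \sum_{i=1}^n f_i$ Minkowski's inequality gives $N_p(s_n) \le \sum_{i=1}^n N_p(f_i)$.
The sequence $(s_n)$ is isotone and $\sum_{n=1}^\infty f_n$ is its upper envelope; the same holds for the $p$th powers. Therefore from the monotone convergence theorem 11.4 follows
$N_p\Bigl(\sum_{n=1}^\infty f_n\Bigr) = \sup_{n\in\mathbb{N}} N_p(s_n)$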
15.6 Theorem (on dominated convergence). Let $1 \le p < +\infty$ and $(f_n)_{n\in\mathbb{N}}$ be a sequence from $\mathscr{L}^p(\mu)$ which converges almost everywhere on $\Omega$. Suppose there exists a numerical function $g \ge 0$ on $\Omega$ with $\mu$-integrable $p$th power $g^p$ such that
(15.14)   $|f_n| \le g$   for all $n \in \mathbb{N}$.
{ 0,
then $f$ is real-valued and $\mathscr{A}$-measurable, and the sequence $(f_n)$ converges almost everywhere to $f$. Consider now any function $f$ with these properties. Then $|f| \le g$ almost everywhere, so along with $g^p$ the function $|f|^p$ is also integrable, that is, $f \in \mathscr{L}^p(\mu)$, by 13.5. We set, for each $n \in \mathbb{N}$
$g_n := |f_n - f|^p$
and then what has to be shown is that $\lim \int g_n\,d\mu = 0$. From the definition of $g_n$,
and so
=J hdp.
inf
The preceding inequality therefore yields $\limsup \int g_n\,d\mu \le 0$. Since all $g_n$ are non-negative, this is equivalent to the desired $\lim \int g_n\,d\mu = 0$.
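A simple instance of dominated convergence, chosen here for illustration and not taken from the text: on $[0,1]$ with Lebesgue measure the functions $f_n(x) = x^n$ converge almost everywhere to 0 and are dominated by the constant $g = 1$. The sketch uses the elementary formula $\int_0^1 x^{np}\,dx = 1/(np+1)$, so $N_p(f_n)^p$ can be computed exactly.

```python
from fractions import Fraction

def n_p_pow(n, p):
    """N_p(f_n)^p for f_n(x) = x**n on [0, 1] with Lebesgue measure:
    the integral of x**(n*p) over [0, 1] equals 1 / (n*p + 1)."""
    return Fraction(1, n * p + 1)

def dominated(n, x):
    """Check |f_n(x)| <= g(x) for the constant majorant g = 1."""
    return abs(x ** n) <= 1

print([n_p_pow(n, 2) for n in (1, 10, 100)])  # tends to 0 as n grows
```

The values decrease to 0, which is convergence in $p$th mean to the almost-everywhere limit 0.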
The concept of a Cauchy sequence makes sense in any pseudometric space, in particular therefore in $\mathscr{L}^p(\mu)$. A sequence $(f_n)$ of functions from $\mathscr{L}^p(\mu)$ is said to be a Cauchy sequence in $\mathscr{L}^p(\mu)$ if for every $\varepsilon > 0$
$d_p(f_m, f_n) = N_p(f_m - f_n) < \varepsilon$
holds for ultimately all $m, n$. Every $\mathscr{L}^p(\mu)$-convergent sequence is a Cauchy sequence, as Minkowski's inequality shows. That the converse of this is also true, that, in other words, the space $\mathscr{L}^p(\mu)$ is (metrically) complete, is the content of the third convergence theorem. Its special case $p = 2$ goes back to F. RIESZ and E. FISCHER (1875-1956).
15.7 Theorem. For each $1 \le p < +\infty$, every Cauchy sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$. Some subsequence of $(f_n)$ converges almost everywhere to $f$.
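Proof. Straight from the definition of Cauchy sequence we can construct $1 \le n_1 < n_2 < \dots$ such that $N_p(f_{n_{k+1}} - f_{n_k}) \le 2^{-k}$ for all $k \in \mathbb{N}$. We define
$g_k := f_{n_{k+1}} - f_{n_k}$   and   $g := \sum_{k=1}^\infty |g_k|.$
Then by (15.13)
$N_p(g) \le \sum_{k=1}^\infty N_p(g_k) \le \sum_{k=1}^\infty 2^{-k} = 1.$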
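Consequently $g$ is finite almost everywhere, so the series $\sum g_k$ converges (absolutely) almost everywhere. The $k$th partial sum of this series is $f_{n_{k+1}} - f_{n_1}$, so we see that the sequence $(f_{n_k})_{k\in\mathbb{N}}$ converges almost everywhere in $\Omega$. Moreover,
$|f_{n_{k+1}}| = |g_1 + \dots + g_k + f_{n_1}| \le g + |f_{n_1}|$
and by 14.4 the sum $g + |f_{n_1}|$ is $p$th-power integrable. Thus the sequence $(f_{n_k})_{k\in\mathbb{N}}$ satisfies all the hypotheses of the dominated convergence theorem, according to which it converges in $p$th mean to some $f \in \mathscr{L}^p(\mu)$ with
$\lim_{k\to\infty} f_{n_k} = f$   almost everywhere.
Since $(f_n)$ is a Cauchy sequence, this subsequence behavior entails the convergence in $p$th mean of the whole sequence: Given $\varepsilon > 0$ there is an $m_\varepsilon \in \mathbb{N}$ such that
$N_p(f_m - f_n) < \varepsilon$   for all $m, n \ge m_\varepsilon$,
and there is a $k$ with $n_k \ge m_\varepsilon$ and
$N_p(f_{n_k} - f) < \varepsilon.$
The triangle inequality then insures that
$N_p(f_n - f) \le N_p(f_n - f_{n_k}) + N_p(f_{n_k} - f) < 2\varepsilon$   for all $n \ge m_\varepsilon$.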
Example. Consider $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$. Every natural number $n$
15.8 Corollary. If the Cauchy sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere to an $\mathscr{A}$-measurable real function $f$ on $\Omega$, then $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence converges to it in $p$th mean.
the union of this exceptional nullset and that in the hypothesis the two limits $f$ and $f^*$ must agree. Hence $f = f^*$ almost everywhere. □
Corresponding to Theorem 14.6 and its corollary we have finally the following
two convergence assertions:
15.9 Theorem. Suppose the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathscr{L}^p(\mu)$ and the sequence $(g_n)$ in $\mathscr{L}^q(\mu)$ converges in $q$th mean to $g \in \mathscr{L}^q(\mu)$. If $1 < p < +\infty$ and $p^{-1} + q^{-1} = 1$, then the sequence $(f_n g_n)$ of products converges in mean to $fg$.
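Proof. The triangle inequality in $\mathbb{R}$ yields
$|f_n g_n - fg| \le |f_n - f|\,|g_n| + |f|\,|g_n - g|$   ($n \in \mathbb{N}$)
which the Hölder inequality (14.3) transforms into
$N_1(f_n g_n - fg) \le N_p(f_n - f)\,N_q(g_n) + N_p(f)\,N_q(g_n - g)$   ($n \in \mathbb{N}$).
Our claim follows from this when we recall from (15.2) or (15.6) that the sequence $(N_q(g_n))_{n\in\mathbb{N}}$ is convergent, hence bounded.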
15.10 Corollary. If the measure $\mu$ is finite, then every sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathscr{L}^p(\mu)$ which converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, also converges to $f$ in mean.
Proof. For $p = 1$ there is nothing to prove. For $1 < p < +\infty$ the claim follows from the theorem upon taking every function $g_n$ there to be the constant function 1; because of the finiteness of $\mu$ the constant functions lie in $\mathscr{L}^q(\mu)$ for every $q \in [1, +\infty[$.
The reader should check via an example like that in the remark after 14.7 that the converse of the assertion in this corollary is not true. However, the conclusion of the corollary can be refined somewhat; namely, under its hypotheses there is $\mathscr{L}^{p'}$-convergence of $(f_n)$ to $f$ for every $p' \in [1, p]$. Cf. Exercise 2 below.
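The mechanism behind the corollary is Hölder's inequality with the constant function 1, which for $p = q = 2$ reads $N_1(f)^2 \le \mu(\Omega)\,N_2(f)^2$. The following sketch checks this on a hypothetical three-point measure space with total mass 3; the weights and function names are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical weights of a finite measure with mu(Omega) = 3.
MU = [Fraction(1, 2), Fraction(3, 2), Fraction(1)]

def n1(f):
    """N_1(f): the integral of |f| with respect to mu."""
    return sum(w * abs(v) for v, w in zip(f, MU))

def n2_squared(f):
    """N_2(f)^2: the integral of f^2 with respect to mu."""
    return sum(w * v * v for v, w in zip(f, MU))

def hoelder_gap(f):
    """mu(Omega) * N_2(f)^2 - N_1(f)^2, non-negative by Hoelder with g = 1."""
    return sum(MU) * n2_squared(f) - n1(f) ** 2

print(hoelder_gap([1, -2, 3]))  # strictly positive
print(hoelder_gap([1, 1, 1]))   # zero: equality for constant functions
```

Applied to $f_n - f$, the inequality shows that convergence in 2nd mean forces convergence in mean when $\mu(\Omega)$ is finite.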
Remarks. 1. Because
$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$
is a semi-norm, the set
$\mathscr{N} := N_p^{-1}(0)$
is a linear subspace of $\mathscr{L}^p(\mu)$. It is independent of $p$ because it consists of all measurable real functions on $\Omega$ which are almost everywhere equal to 0. The quotient vector space
One checks effortlessly that $\tilde f \mapsto \|\tilde f\|_p$ is thereby well defined and provides a norm on $L^p(\mu)$. Theorem 15.7 says that $L^p(\mu)$ is complete with respect to this norm, that is, it is a Banach space (for $1 \le p < +\infty$).
$L^2(\mu)$ is even a Hilbert space. For the product $fg$ of two functions $f, g \in \mathscr{L}^2(\mu)$ is integrable, by 14.6, and it is clear that the integral $\int fg\,d\mu$ depends only on the canonical images $\tilde f, \tilde g$ of these functions, which means that
$(\tilde f, \tilde g) \mapsto \int fg\,d\mu$
is a well-defined mapping. A short calculation suffices to confirm that it provides a scalar product in $L^2(\mu)$.
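2. $f \in \mathscr{L}^\infty(\mu)$ means that the set $W_f$ of all $\alpha \in \mathbb{R}_+$ such that $|f| \le \alpha$ almost everywhere is not empty. We can set
$N_\infty(f) := \inf W_f$
and show easily that $N_\infty : \mathscr{L}^\infty(\mu) \to \mathbb{R}_+$ is a semi-norm on $\mathscr{L}^\infty(\mu)$. Also in this case $N_\infty^{-1}(0)$ coincides with the space $\mathscr{N}$ described in 1. In the quotient space
$L^\infty(\mu) := \mathscr{L}^\infty(\mu)/\mathscr{N}$
a norm $\tilde f \mapsto \|\tilde f\|_\infty$ can be defined via $N_\infty$ just as before. One checks that $L^\infty(\mu)$ thus also becomes a Banach space.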
3. For every measure space $(\Omega, \mathscr{A}, \mu)$ and every $p \in\, ]0, 1[$ the set $\mathscr{L}^p(\mu)$ (cf. Remark 2 in §14) turns out to be a vector space. $N_p$ is generally not a semi-norm (cf. Exercise 5, §14), but $d_p(f, g) := N_p^p(f - g)$ is a complete pseudometric - with, however, strange properties: The unit "ball" centered at 0 is generally not convex. For L-B measure on $[0, 1]$, every $f \in \mathscr{L}^p$ is actually a convex combination of functions in this ball. See BOURBAKI [1965], chap. 4, §6, exer. 13.
Exercises.
$\int \limsup_{n\to\infty} f_n\,d\mu.$
Show by an explicit example that this chain of inequalities can fail if there is no such majorizing function $g$. (To this end, cf. Exercise 6 in §21.)
2. Let $\mu$ be a finite measure, $1 \le p' \le p < +\infty$. Show that if a sequence in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathscr{L}^p(\mu)$, then it also converges in $p'$th mean to $f$. (Cf. Exercise 3 in §14.)
3. Let (f',
n=1
of three examples which will be important in the sequel. The first concerns the
behavior of parameter-dependent integrals, the second the connection between the
Riemann and the Lebesgue integral, and the third the calculation of the (Gaussian)
integral
(16.1)   $G := \int e^{-x^2}\,\lambda^1(dx).$
I. Parameter-dependent Integrals. The question of the continuity and differentiability of functions which are defined by integrals will be answered in the following lemmas and corollary. Throughout, $(\Omega, \mathscr{A}, \mu)$ is an arbitrary measure space.
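$\varphi(x) := \int f(x, \omega)\,\mu(d\omega)$
is continuous at $x_0$.
Proof. The continuity of $\varphi$ at $x_0$ is proved if we show that for every sequence $(x_n)$ in $E$ with $\lim x_n = x_0$, the sequence $(\varphi(x_n))$ converges to $\varphi(x_0)$. To this end set
$f_n(\omega) := f(x_n, \omega)$   ($n \in \mathbb{Z}_+$, $\omega \in \Omega$).
By hypothesis these are integrable functions, each satisfies $|f_n| \le h$, and for every fixed $\omega \in \Omega$, $\lim_{n\to\infty} f_n(\omega) = f_0(\omega)$. From the theorem on dominated convergence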
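$\lim_{n\to\infty} \int f_n\,d\mu = \int f_0\,d\mu = \int f(x_0, \omega)\,\mu(d\omega)$
$\varphi(x) := \int f(x, \omega)\,\mu(d\omega)$
$\varphi'(x) = \int f'(x, \omega)\,\mu(d\omega)$
for every $x \in I$.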
In short, under the stated conditions (16.2) can be differentiated under the
integral sign.
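Differentiation under the integral sign can be checked numerically on a simple case. The sketch below assumes $f(x,\omega) = e^{-x\omega}$ on $\Omega = [0,1]$ with Lebesgue measure, for which $\varphi(x) = (1 - e^{-x})/x$ in closed form and $|\partial f/\partial x| = \omega e^{-x\omega} \le 1$ for $x \ge 0$; the example and the midpoint-rule helper are illustrative choices, not part of the text.

```python
import math

def midpoint(func, a, b, steps=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def phi(x):
    """phi(x) = integral over [0, 1] of exp(-x * w) dw, computed numerically."""
    return midpoint(lambda w: math.exp(-x * w), 0.0, 1.0)

def phi_prime_under_integral(x):
    """Integral of the x-derivative of the integrand, -w * exp(-x * w)."""
    return midpoint(lambda w: -w * math.exp(-x * w), 0.0, 1.0)

def phi_closed(x):
    """Closed form of phi for x != 0."""
    return (1.0 - math.exp(-x)) / x

x, h = 1.5, 1e-5
difference_quotient = (phi_closed(x + h) - phi_closed(x - h)) / (2 * h)
print(difference_quotient, phi_prime_under_integral(x))
```

The two printed numbers agree up to quadrature and rounding error, as the differentiation lemma predicts.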
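Proof. Fix $x_0 \in I$ and consider any sequence $(x_n)_{n\in\mathbb{N}} \subset I \setminus \{x_0\}$ which converges to $x_0$. Then the function defined on $\Omega$ by
$g_n(\omega) := \dfrac{f(x_n, \omega) - f(x_0, \omega)}{x_n - x_0}$
is integrable for each $n$, and $\lim_{n\to\infty} g_n(\omega) = f'(x_0, \omega)$ for all $\omega \in \Omega$.
By the mean value theorem, for each $x \in I \setminus \{x_0\}$ and each fixed $\omega \in \Omega$ there is a point $t$ in the open interval whose endpoints are $x$ and $x_0$, such that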
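Now the dominated convergence theorem comes into play to insure that the function $\omega \mapsto f'(x_0, \omega)$ to which the $g_n$ converge is $\mu$-integrable and
$\lim_{n\to\infty} \int g_n\,d\mu = \int f'(x_0, \omega)\,\mu(d\omega).$
Since
$\int g_n\,d\mu = \dfrac{\varphi(x_n) - \varphi(x_0)}{x_n - x_0}$   for all $n \in \mathbb{N}$,
this proves the differentiability of $\varphi$ at $x_0$ together with the asserted formula for $\varphi'(x_0)$. □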
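16.3 Corollary. Let $U$ be an open subset of $\mathbb{R}^d$, $i \in \{1, \dots, d\}$ and $f : U \times \Omega \to \mathbb{R}$ a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for each $x \in U$;
(b) $x \mapsto f(x, \omega)$ has an $i$th partial derivative at each point of $U$, for every $\omega \in \Omega$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that $\left|\dfrac{\partial f}{\partial x_i}(x, \omega)\right| \le h(\omega)$ for all $(x, \omega) \in U \times \Omega$.
Then the function
$\varphi(x) := \int f(x, \omega)\,\mu(d\omega)$
has an $i$th partial derivative at every $x \in U$, the function $\omega \mapsto \dfrac{\partial f}{\partial x_i}(x, \omega)$ is $\mu$-integrable, and
$\dfrac{\partial \varphi}{\partial x_i}(x) = \int \dfrac{\partial f}{\partial x_i}(x, \omega)\,\mu(d\omega)$
for every $x \in U$.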
This follows at once from the differentiation lemma: Given T = (T,, ... ,Td) E
II. Comparison of the Riemann and Lebesgue Integrals. For every $d$-dimensional Borel set $B \in \mathscr{B}^d$ and suitable Borel measurable numerical functions $f$ on $B$ the integral $\int_B f\,d\lambda^d$ was defined in §12 and identified with $\int f\,d\lambda^d_B$. This integral is called for short the Lebesgue integral of $f$ over $B$. A frequently encountered alternative way of writing it is
(16.4)   $\int_B f(x)\,dx = \int_B f\,d\lambda^d.$
In case $d = 1$ and $B = [\alpha, \beta]$, or $]-\infty, \alpha]$, or $\mathbb{R}$, etc. the notations $\int_\alpha^\beta f(x)\,dx$, or $\int_{-\infty}^\alpha f(x)\,dx$, or $\int_{-\infty}^{+\infty} f(x)\,dx$, etc., are also common.
Since in basic analysis courses it is frequently only the Riemann integral that
is dealt with, the following remarks relating it to what has been done here may be
useful.
$J := \{\alpha = \alpha_0 < \alpha_1 < \dots < \alpha_n = \beta\}$
of $I$ the Riemann theory associates the lower and upper sums
n
i=1
in which
and
$\Gamma_i := \sup f([\alpha_{i-1}, \alpha_i])$,   $i = 1, \dots, n$.
and so by 13.2, q = 0 p-almost everywhere. Since in addition for every n,1.4 < f <
uj holds p-almost everywhere (everywhere except possibly at the points of in),
lim 1 j = f
p-almost everywhere on I.
n-+oo
As has been noted, f is bounded, say If 1 <_ M E R. The sequence ([1.4. [) is therefore
I fdp=lim J
which finishes the proof. □
Remarks. 1. Consider once again Dirichlet's jump function $f$ on the unit interval (cf. Exercise 2 of §10). Being the indicator function of $\mathbb{Q} \cap [0, 1]$, it is Borel measurable and almost everywhere 0 with respect to L-B measure $\lambda^1_{[0,1]}$. Consequently it is Lebesgue integrable and $\int_0^1 f(x)\,dx = 0$. But $f$ is not Riemann integrable. So the roles of Riemann and Lebesgue integration cannot be reversed in 16.4.
2. Borel measurability of f need not be hypothesized: the above proof shows,
even without it, that lim 1.4. = f p-almost everywhere and so f is -almost everywhere equal to the Borel function lim lj, . However, in this case it can well happen
that f itself is not Borel measurable.
3. The ideas in the proof of Theorem 16.4 can be amplified into a non-trivial criterion for Riemann integrability. Namely, $f : [\alpha, \beta] \to \mathbb{R}$ is Riemann integrable if and only if it is bounded and is continuous at $\lambda^1$-almost every point of $[\alpha, \beta]$. See Theorem 2.5.1 of COHN [1980] or the multi-part Exercise 12.51 of HEWITT and STROMBERG [1965].
,0:= lim J
f (x) dx
n
pn=IA
From 11.4 and the fact that IA f T f we get
sup p
JfdA'.
The improper Riemann integral exists, by definition, just if this supremum is finite and in that case its value $\rho$ is that supremum. From these observations and the monotone convergence theorem our present result follows. □
$\int f\,d\lambda^1$ coincides with the improper Riemann integral of $f$. Obviously too, any open or half-open interval $I \subset \mathbb{R}$ can take over the role of $\mathbb{R}$ in 16.5.
By contrast, from the existence of the improper Riemann integral of $f$ does not follow the Lebesgue integrability of $f$, even for continuous functions. Consider, for example, the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := (\sin x)/x$ when $x \ne 0$ and $f(0) := \lim_{x\to 0} (\sin x)/x = \sin'(0) = 1$. Of course, it is continuous. If for each $k \in \mathbb{N}$ we set
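$a_k := \int_{k\pi}^{(k+1)\pi} \dfrac{\sin x}{x}\,dx = (-1)^k \int_0^\pi \dfrac{\sin t}{t + k\pi}\,dt,$
we see that the signs of the $a_k$ alternate, their moduli decrease as $k$ increases, and
$|a_k| \le \int_{k\pi}^{(k+1)\pi} \dfrac{1}{x}\,dx = \log\dfrac{(k+1)\pi}{k\pi} = \log\Bigl(1 + \dfrac{1}{k}\Bigr) \to 0$ as $k \to \infty$.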
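Therefore the series $\sum_{k=1}^\infty a_k$ converges. Using this it is very easy to confirm that the limit
$\lim_{R\to+\infty} \int_0^R \dfrac{\sin x}{x}\,dx$
exists in $\mathbb{R}$, that is, the improper Riemann integral of $f$ over $\mathbb{R}_+$ exists. On the other hand, for every $n \in \mathbb{N}$
$\int_{\mathbb{R}_+} |f|\,d\lambda^1 \ge \int_{[\pi, (n+1)\pi]} |f|\,d\lambda^1 = \sum_{k=1}^n \int_{k\pi}^{(k+1)\pi} \dfrac{|\sin x|}{x}\,dx \ge \sum_{k=1}^n \dfrac{1}{(k+1)\pi} \int_0^\pi \sin t\,dt = \dfrac{2}{\pi} \sum_{k=1}^n \dfrac{1}{k+1}.$
Since the harmonic series diverges, these inequalities show that $\int_{\mathbb{R}_+} |f|\,d\lambda^1 = +\infty$, and so by 12.2 $f$ is not Lebesgue integrable over $\mathbb{R}_+$.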
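The two estimates used for $(\sin x)/x$ can be checked numerically. The sketch below uses a hand-written composite Simpson rule (an assumption of this illustration, not a method from the text) to approximate $a_k$ and the integrals of $|\sin x|/x$ over $[k\pi, (k+1)\pi]$.

```python
import math

def simpson(func, a, b, steps=200):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def a_k(k):
    """a_k: the integral of sin(x)/x over [k*pi, (k+1)*pi], for k >= 1."""
    return simpson(lambda x: math.sin(x) / x, k * math.pi, (k + 1) * math.pi)

def abs_piece(k):
    """Integral of |sin(x)|/x over [k*pi, (k+1)*pi], for k >= 1."""
    return simpson(lambda x: abs(math.sin(x)) / x, k * math.pi, (k + 1) * math.pi)

for k in (1, 2, 3):
    print(k, a_k(k), abs_piece(k), 2 / ((k + 1) * math.pi))
```

The signs of `a_k` alternate and their moduli shrink, while each `abs_piece(k)` stays above the harmonic-type lower bound in the last column, so the partial sums of the absolute pieces grow without bound.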
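(16.5)   $f(x, \omega) := \dfrac{e^{-x(1+\omega^2)}}{1 + \omega^2}$,   $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$.
Both $f$ and the function $(x, \omega) \mapsto f'(x, \omega) := -e^{-x(1+\omega^2)}$ are continuous. For fixed $x_0 > 0$ form the auxiliary functions
$h(\omega) := \dfrac{1}{1 + \omega^2}$,   $h_0(\omega) := e^{-2x_0|\omega|}$,   $\omega \in \mathbb{R}$.
Their $\lambda^1$-integrability (over $\mathbb{R}$) follows from Corollary 16.5 and the fundamental theorem of calculus. For example,
$\int_{-\infty}^{+\infty} (1 + \omega^2)^{-1}\,d\omega = \lim_{n\to\infty} [\arctan \omega]_{-n}^{+n} = \pi.$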
Obviously $f(x, \omega) \le h(\omega)$ for all $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$. It follows from 12.2 that for each $x \in \mathbb{R}_+$ the function $\omega \mapsto f(x, \omega)$ is $\lambda^1$-integrable. And the real function defined by
(16.6)   $\varphi(x) := \int f(x, \omega)\,d\omega$,   $x \in \mathbb{R}_+$,
is continuous by the continuity lemma 16.1. Note that $\varphi(0) = \pi$. Since $2|\omega| \le 1 + \omega^2$ for all $\omega \in \mathbb{R}$, we have $|f'(x, \omega)| \le h_0(\omega)$ for all $(x, \omega) \in [x_0, +\infty[ \times \mathbb{R}$. Consequently the differentiation lemma 16.2 insures that $\varphi$ is differentiable in $]x_0, +\infty[$, for every
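$x_0 > 0$, with
(16.7)   $\varphi'(x) = -\int e^{-x(1+\omega^2)}\,\lambda^1(d\omega)$   for $x > 0$.
The substitution $\omega \mapsto \omega/\sqrt{x}$ turns this into
(16.8)   $\varphi'(x) = -G\,x^{-1/2} e^{-x}$   for $x > 0$,
where $G$ designates the integral (16.1) that we are trying to explicitly compute. Its existence is already part of the preceding analysis, but can also be inferred from the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the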
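fact that $\varphi(x) \to 0$ as $x \to +\infty$ (because $0 \le f(x, \omega) \le e^{-x} h(\omega)$) one obtains
$\varphi(x) = 2G \int_{\sqrt{x}}^{+\infty} e^{-\omega^2}\,d\omega$   for $x \ge 0$.
In particular
$\pi = \varphi(0) = 2G \int_0^{+\infty} e^{-\omega^2}\,d\omega = G^2,$
using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^0 e^{-\omega^2}\,d\omega = \int_0^{+\infty} e^{-\omega^2}\,d\omega$. Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,
(16.10)   $\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi}$
or, after the substitution $x \mapsto x/\sqrt{2}$,
(16.10')   $\int_{-\infty}^{+\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.$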
This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A particularly short alternative one is made possible by Tonelli's theorem (cf. Exercise 4 in §23).
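The value $G = \sqrt{\pi}$ and the relation between $\varphi$ and $G$ can be confirmed numerically. The sketch below is only a sanity check under stated assumptions: the integrals over $\mathbb{R}$ are truncated to $[-8, 8]$ and approximated by the trapezoidal rule, and the tail integral $\int_{\sqrt{x}}^{+\infty} e^{-\omega^2}\,d\omega$ is expressed through the standard library function `math.erfc` as $(\sqrt{\pi}/2)\,\mathrm{erfc}(\sqrt{x})$.

```python
import math

def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    inner = sum(func(a + i * h) for i in range(1, steps))
    return h * (0.5 * (func(a) + func(b)) + inner)

def gauss_integral():
    """Approximation of G, the integral of exp(-w^2) over the real line."""
    return trapezoid(lambda w: math.exp(-w * w), -8.0, 8.0, 1600)

def phi(x):
    """Approximation of phi(x), the integral of exp(-x(1+w^2))/(1+w^2)."""
    return trapezoid(lambda w: math.exp(-x * (1 + w * w)) / (1 + w * w), -8.0, 8.0, 1600)

G = gauss_integral()
print(G, math.sqrt(math.pi))
print(phi(1.0), G * math.sqrt(math.pi) * math.erfc(1.0))
```

Both printed pairs agree to many digits, in line with $G^2 = \pi$ and the formula for $\varphi$.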
Exercises.
1. Which of the two functions below are integrable, which are square-integrable
with respect to Lebesgue-Borel measure on the indicated intervals?
(a) $f(x) := x^{-1}$,
(b) $f(x) := x^{-1/2}$,
2. Show that for every real number $\alpha > 0$ the function $x \mapsto e^{-\alpha x}$ is $\lambda^1$-integrable over $\mathbb{R}_+$.
rsinx13 A1(dx)
x J
Jo
is continuous Oil 10, +00[.
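(17.1)   $\nu(A) := \int_A f\,d\mu$
Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathscr{A}$ with union $A$,
$1_A f = \sum_{n=1}^\infty 1_{A_n} f$
and so by 11.5
$\nu(A) = \sum_{n=1}^\infty \nu(A_n),$
(17.2)   $\nu = f\mu.$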
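(17.3')   $\int \varphi\,d(f\mu) = \int \varphi f\,d\mu.$
An $\mathscr{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is $\mu$-integrable. In this case (17.3) is again valid.
Proof. First suppose $\varphi = \sum_{i=1}^n \alpha_i 1_{A_i}$ is an elementary function. Then (17.3) holds because
$\int \varphi\,d\nu = \sum_{i=1}^n \alpha_i \nu(A_i) = \sum_{i=1}^n \alpha_i \int 1_{A_i} f\,d\mu = \int \varphi f\,d\mu.$
For an arbitrary $\varphi \in E^*$ there is a sequence $(u_n)$ in $E$ such that $u_n \uparrow \varphi$. Since then $u_n f \uparrow \varphi f$ as well, (17.3) follows from 11.4. Finally, consider any $\mathscr{A}$-measurable numerical function $\varphi$ on $\Omega$. By now we know that
$\int \varphi^{\pm}\,d\nu = \int \varphi^{\pm} f\,d\mu = \int (\varphi f)^{\pm}\,d\mu.$
From these equations and the definition of integrability follows the second part of the theorem. □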
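It now follows that the formation of measures with densities is transitive:
17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\rho := g\nu$. Then $\rho = (gf)\mu$, that is,
(17.4)   $g(f\mu) = (gf)\mu.$
Proof. For every $A \in \mathscr{A}$, by 17.3,
$\rho(A) = \int_A g\,d\nu = \int 1_A g\,d\nu = \int 1_A g f\,d\mu = \int_A (gf)\,d\mu.$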
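On a finite set, where a measure is determined by its point masses, the transitivity of densities reduces to multiplying the masses pointwise. The following sketch assumes a hypothetical three-point space; `with_density` and `mass` are illustrative helper names.

```python
from fractions import Fraction

OMEGA = ["a", "b", "c"]
MU = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0)}

def with_density(density, measure):
    """Point masses of the measure (density * measure) on a finite set."""
    return {w: density[w] * measure[w] for w in measure}

def mass(measure, subset):
    """Measure of a subset, as the sum of its point masses."""
    return sum(measure[w] for w in subset)

f = {"a": Fraction(3), "b": Fraction(1, 2), "c": Fraction(5)}
g = {"a": Fraction(2), "b": Fraction(4), "c": Fraction(7)}

nu = with_density(f, MU)     # nu = f mu
rho = with_density(g, nu)    # rho = g nu
gf = {w: g[w] * f[w] for w in OMEGA}

print(rho == with_density(gf, MU))  # rho = (gf) mu
```

The point `c` has $\mu$-mass 0, so the values of `f` and `g` there have no influence on `nu` or `rho`.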
(17.5)   $f = g$ $\mu$-almost everywhere $\implies f\mu = g\mu$.
Proof. If $f$ and $g$ coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each $A \in \mathscr{A}$, whence
$\int_A f\,d\mu = \int_A g\,d\mu$   for all $A \in \mathscr{A}$,
$N := \{f > g\}$,
which lies in $\mathscr{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and is positive, which means that the definition
the functions $1_N f$, $1_N g$ are themselves integrable. Because $f\mu = g\mu$, they have the same $\mu$-integral. From this we get that
$\int h\,d\mu = \int_N f\,d\mu - \int_N g\,d\mu = 0.$
Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles of $f$ and $g$ reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since $\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is obtained. □
The converse of implication (17.5) is not valid without some additional hypothesis on the densities $f$ and $g$. The next example illustrates this.
Example. 1. As in Example 2 of §3 let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (see Example 2 in §1). But the measure $\mu$ will be defined on $\mathscr{A}$ by $\mu(A) := 0$ or $+\infty$, according as $A$ or $\complement A$ is countable. If $f$ and $g$ are the constant functions on $\Omega$ with the respective values 1 and 2, then indeed $f\mu = g\mu$, yet $f(\omega) = g(\omega)$ holds for no $\omega \in \Omega$. Of course, it then follows from 17.5 that neither $f$ nor $g$ is $\mu$-integrable.
Before turning to the principal problem of this section, we will examine another characterization of $\sigma$-finite measures which is important for what follows and is of interest in its own right.
17.6 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The measure $\mu$ is $\sigma$-finite if and only if there exists a $\mu$-integrable function $h$ on $\Omega$ which satisfies
(17.6)   $0 < h(\omega) < +\infty$   for every $\omega \in \Omega$.
$h := \sum_{n=1}^\infty \eta_n 1_{A_n}$
does what is wanted. It is measurable, $0 < h(\omega) \le 1$ for each $\omega \in \Omega$, and $\int h\,d\mu \le 1$.
In the light of 13.2 this lemma has another formulation: For each $\sigma$-finite measure $\mu$ there exists a real, measurable function $h > 0$ such that the measure $h\mu$ is finite and has the same nullsets as $\mu$.
We come now to the main problem, already alluded to: On the $\sigma$-algebra $\mathscr{A}$ of the measurable space $(\Omega, \mathscr{A})$ two measures $\nu$ and $\mu$ are given. We pose the question of how to decide whether $\nu$ has a density with respect to $\mu$, that is, whether there is an $\mathscr{A}$-measurable, non-negative, numerical function $f$ on $\Omega$ satisfying $\nu = f\mu$, satisfying in other words
$\nu(A) = \int_A f\,d\mu$   for all $A \in \mathscr{A}$.
For an affirmative answer it is necessary, as 13.3 shows, that every $\mu$-nullset in $\mathscr{A}$ be a $\nu$-nullset as well.
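17.7 Definition. A measure $\nu$ on $\mathscr{A}$ is called continuous with respect to a measure $\mu$ on $\mathscr{A}$, for short, $\mu$-continuous, if every $\mu$-nullset from $\mathscr{A}$ is also a $\nu$-nullset.
In the case of a finite measure $\nu$ there is a condition equivalent to $\mu$-continuity which clarifies and justifies the terminology:
17.8 Theorem. A finite measure $\nu$ on $\mathscr{A}$ is $\mu$-continuous if and only if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
(17.7)   $A \in \mathscr{A}$ and $\mu(A) \le \delta \implies \nu(A) \le \varepsilon$.
Proof. From (17.7) it follows that $\nu(A) \le \varepsilon$ holds for every $\varepsilon > 0$ if $A$ is a $\mu$-nullset. Hence $\nu(A) = 0$ and $\nu$ is thus a $\mu$-continuous measure, even without the finiteness hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not $\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with the properties
$\mu(A_n) \le 2^{-n}$ and $\nu(A_n) > \varepsilon$   for each $n \in \mathbb{N}$.
We set
$A := \limsup_{n\to\infty} A_n = \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m$
and observe that on the one hand
$\mu(A) \le \mu\Bigl(\bigcup_{m=n}^\infty A_m\Bigr) \le \sum_{m=n}^\infty \mu(A_m) \le \sum_{m=n}^\infty 2^{-m} = 2^{-n+1}$   for every $n \in \mathbb{N}$,
whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3, $A$ satisfies
$\nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0,$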
nix
for every $\omega \in \Omega$, making $f = 0$ and therefore $\nu = f\mu = 0$, which is not the case because $\Omega$ is uncountable.
3. Let $(\mathbb{R}, \mathscr{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$) and denote by $\mathscr{N}$ the system of all $\mu$-nullsets. Then $\mathscr{N}$ is an example of a $\sigma$-ideal in $\mathscr{B}^1$: The union of any sequence of its sets is another, as are the intersections of its sets with those of $\mathscr{B}^1$ (cf. Exercise 5, §3). These properties insure that
$\nu(A) := 0$ if $A \in \mathscr{N}$,   $\nu(A) := +\infty$ if $A \in \mathscr{B}^1 \setminus \mathscr{N}$
defines a measure on $\mathscr{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$ is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$
(17.9)
g(A) > -E
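We may obviously suppose that $\rho(\Omega) > 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is wanted. If then $\rho(A) > -\varepsilon$ for all $A \in \mathscr{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we consider the case that some $A_1 \in \mathscr{A}$ satisfies $\rho(A_1) \le -\varepsilon$. From the definition of $\rho$ and the subtractivity of the finite measures $\sigma$ and $\tau$,
$\rho(\complement A_1) = \rho(\Omega) - \rho(A_1) \ge \rho(\Omega) + \varepsilon > \rho(\Omega).$
Therefore, if $\rho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathscr{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done. In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathscr{A}$ with $\rho(A_2) \le -\varepsilon$. Then because $A_1, A_2$ are disjoint, the same reasoning can be repeated with $\complement(A_1 \cup A_2)$ in place of $\complement A_1$. Were the procedure never to terminate, it would produce a sequence $(A_n)$ of pairwise disjoint sets in $\mathscr{A}$ with $\rho(A_n) \le -\varepsilon$ for every $n \in \mathbb{N}$.
Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that
$\rho(A_1 \cup \dots \cup A_n) = \sum_{i=1}^n \rho(A_i) \le -n\varepsilon$   for every $n \in \mathbb{N}$
and entail the divergence of the series $\sum_{n=1}^\infty \rho(A_n)$. But the latter is untenable, because when the $\sigma$-additivity of $\sigma$ and $\tau$ is applied to the disjoint union $A := \bigcup_{n\in\mathbb{N}} A_n$ it shows this series to be convergent:
$\sum_{n=1}^\infty \rho(A_n) = \sum_{n=1}^\infty \sigma(A_n) - \sum_{n=1}^\infty \tau(A_n) = \sigma(A) - \tau(A) = \rho(A).$
This contradiction proves that the construction procedure must terminate after some finite number $n$ of steps, with the set $\Omega_\varepsilon := \complement(A_1 \cup \dots \cup A_n)$ then satisfying (17.8') and (17.9').
We now take $\varepsilon = 1/n$ in (*) for successive $n \in \mathbb{N}$. The sets $\Omega_{1/n}$ can be chosen with the additional property of isotoneity. For if $\Omega_1 \supset \Omega_{1/2} \supset \dots \supset \Omega_{1/n}$ has already been realized, we simply apply (*) to $\Omega_{1/n}$ as a new base space in the role of $\Omega$, that is, we consider the restriction of the measures $\sigma$ and $\tau$ to $\Omega_{1/n} \cap \mathscr{A}$. Finally, the set $\Omega_0 := \bigcap_{n\in\mathbb{N}} \Omega_{1/n}$ will be seen to do the desired job. For since $\Omega_{1/n} \downarrow \Omega_0$, (17.8) follows from (17.8'), and (17.9) follows from (17.9'), which insures that $\rho(A) > -1/n$ for all $n \in \mathbb{N}$ and every $A \in \Omega_0 \cap \mathscr{A}$. □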
As indicated, this puts us in a position to answer the important question we
posed earlier.
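Proof. Only the implication (ii)$\implies$(i) is still in need of proof. To that end we distinguish three cases.
First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathscr{G}$ of all $\mathscr{A}$-measurable numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which satisfy
$\int_A g\,d\mu \le \nu(A)$   for all $A \in \mathscr{A}$.
The constant function $g = 0$ lies in $\mathscr{G}$, so $\mathscr{G}$ is not empty. $\mathscr{G}$ is moreover sup-stable, that is, $g \vee h \in \mathscr{G}$ whenever $g, h \in \mathscr{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$, every $A \in \mathscr{A}$ satisfies
$\int_A g \vee h\,d\mu = \int_{A \cap A_1} g\,d\mu + \int_{A \cap A_2} h\,d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A).$
Since $\int g\,d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathscr{G}$, the number
$\gamma := \sup\{\int g\,d\mu : g \in \mathscr{G}\}$
is finite and there is a sequence $(g_n')$ in $\mathscr{G}$ such that $\lim \int g_n'\,d\mu = \gamma$. Due to sup-stability the functions $g_n := g_1' \vee \dots \vee g_n'$ lie in $\mathscr{G}$, and consequently $\gamma \ge \int g_n\,d\mu \ge \int g_n'\,d\mu$ (since $g_n \ge g_n'$) for all $n \in \mathbb{N}$. This shows that $\lim \int g_n\,d\mu = \gamma$. As the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied, assuring that $f := \sup g_n$ is a function in $\mathscr{G}$ and that $\int f\,d\mu = \gamma$. All this proves that the function $g \mapsto \int g\,d\mu$ on $\mathscr{G}$ assumes its maximum value at $f$.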
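Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathscr{G}$, and so
$\tau := \nu - f\mu$
is a finite measure on $\mathscr{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the $\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real number
$\beta := \dfrac{1}{2}\,\dfrac{\tau(\Omega)}{\mu(\Omega)} > 0,$
which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and $\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathscr{A}$ which satisfies
$\tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0$ and $\tau(A) \ge \beta\mu(A)$ for all $A \in \Omega_0 \cap \mathscr{A}$.
The $\mathscr{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the property
$\int_A f_0\,d\mu = \int_A f\,d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(A) = \nu(A)$
for all $A \in \mathscr{A}$, so that $f_0 \in \mathscr{G}$. But $\mu(\Omega_0) > 0$, because $\tau(\Omega_0) > 0$ and $\tau$ is $\mu$-continuous, and hence
$\int f_0\,d\mu = \int f\,d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma,$
an inequality which is incompatible with the definition of $\gamma$ and the fact that $f_0 \in \mathscr{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.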
Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce
00
(a) A E 1o fl at
(b)
n=0
for all n E N.
To this end let $\mathscr{Q}$ denote the system of all $Q \in \mathscr{A}$ with $\nu(Q) < +\infty$ and define
$\alpha := \sup\{\mu(Q) : Q \in \mathscr{Q}\}.$
This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m\in\mathbb{N}}$ in $\mathscr{Q}$, which may be taken isotone because $\mathscr{Q}$ is closed under finite unions, with $\lim \mu(Q_m) = \alpha$; its union $Q_0$ then satisfies $\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathscr{A}$ with $\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted already, $\mathscr{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathscr{Q}$, so that $\mu(Q_m \cup A) \le \alpha$, and consequently
$\mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha.$
Since $A$ is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers $m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.
Now let $\mu_n, \nu_n$ denote the restrictions of $\mu, \nu$ to the trace $\sigma$-algebra $\Omega_n \cap \mathscr{A}$, for $n = 0, 1, \dots$ and note that each $\nu_n$ is a $\mu_n$-continuous measure. Moreover, for all $n \ge 1$ both $\mu_n$ and $\nu_n$ are finite. Case 1 therefore supplies $\Omega_n \cap \mathscr{A}$-measurable functions $f_n \ge 0$ on $\Omega_n$ with $\nu_n = f_n\mu_n$. Taking $f_0$ to be the constant function $+\infty$ on $\Omega_0$, $\nu_0 = f_0\mu_0$ also holds, thanks to (a). Finally, "putting all the pieces together" gives our result in this second case. Namely, the function $f$ on $\Omega$ defined to coincide on each $\Omega_n$ with $f_n$ ($n = 0, 1, \dots$) is non-negative, $\mathscr{A}$-measurable and satisfies $\nu = f\mu$.
Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There
The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for that it suffices to show that
$\mu(\{f \le n\}) = 0$   for each $n \in \mathbb{N}$,
which in turn is a consequence of the above alternative and the inequalities
$\nu(\{f \le n\}) = \int_{\{f \le n\}} f\,d\mu \le n\,\mu(\{f \le n\}) < +\infty.$
Next, suppose $\nu$ is $\sigma$-finite. From 17.6 once again we get a strictly positive function $k \in \mathscr{L}^1(\nu)$. Then $k\nu = (fk)\mu$ is a finite measure, that is, $fk$ is $\mu$-integrable, consequently also $\mu$-almost everywhere real-valued. Because $k$ takes only non-zero real values, this means that $f$ itself is real $\mu$-almost everywhere.
Conversely, suppose that $f$ is $\mu$-almost everywhere real-valued. We want to show that $\nu$ is $\sigma$-finite. By the $\sigma$-finiteness of $\mu$ there is a decomposition $\Omega = \bigcup_{i=1}^\infty \Omega_i$ into a sequence of pairwise disjoint sets from $\mathscr{A}$ each of finite $\mu$-measure. Set $A_0 := \{f = +\infty\}$ and $A_j := \{j - 1 \le f < j\}$ for $j \in \mathbb{N}$. This gives a decomposition $\Omega = \bigcup_{i,j} (\Omega_i \cap A_j)$ into a (doubly-indexed) sequence of pairwise disjoint sets from $\mathscr{A}$. If each has finite $\nu$-measure, this proves that $\nu$ is $\sigma$-finite. Consider any index $i$. Because $\mu(A_0) = 0$ and $\nu = f\mu$, we have $\nu(\Omega_i \cap A_0) \le \nu(A_0) = 0$. Because $\nu = f\mu$ and $f \le j$ in $A_j$, we have $\nu(\Omega_i \cap A_j) \le j\,\mu(\Omega_i) < +\infty$ for all $j \in \mathbb{N}$ as well. Thus all is proven. □
In the generality presented here Theorem 17.10 was proved in 1930 by O.M. NIKODÝM (1888-1974). H. Lebesgue proved the theorem in 1910 for the case where $\mu$ is the L-B measure $\lambda^1$. J. RADON (1887-1956) pushed things further in a fundamental work which appeared in 1913. So 17.10 is often also called the theorem of Lebesgue-Radon-Nikodym. The uniquely determined density $f$ in 17.11 is called the Radon-Nikodym density or the Radon-Nikodym integrand (of $\nu$ with respect to $\mu$). A beautiful proof of 17.10 by elementary Hilbert-space methods was discovered in 1940 by J. VON NEUMANN (1903-1957) and appears in many textbooks, e.g., in RUDIN [1987], p. 130-131.
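On a finite set the Radon-Nikodym density can be written down directly: at each point of positive $\mu$-mass it is the quotient of the point masses. The sketch below assumes measures given as dictionaries of point masses; the function names are illustrative, and the value 0 chosen on $\mu$-null points is an arbitrary convention, since the density is only determined $\mu$-almost everywhere.

```python
from fractions import Fraction

def is_continuous(nu, mu):
    """nu << mu on a finite set: every point of mu-mass 0 has nu-mass 0."""
    return all(nu[w] == 0 for w in mu if mu[w] == 0)

def radon_nikodym(nu, mu):
    """A density f with nu = f mu; on mu-null points the value 0 is chosen."""
    if not is_continuous(nu, mu):
        raise ValueError("nu is not mu-continuous")
    return {w: (nu[w] / mu[w] if mu[w] != 0 else Fraction(0)) for w in mu}

mu = {"a": Fraction(1, 2), "b": Fraction(2), "c": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(0)}
f = radon_nikodym(nu, mu)

print(f)
print(all(f[w] * mu[w] == nu[w] for w in mu))  # nu = f mu pointwise
```

If `nu` charged the point `c`, no density could exist, which is why the function raises an error in that case.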
$\nu(A) = \nu(A \cap N)$   for all $A \in \mathscr{A}$,
as follows from $\nu(A) = \nu(A \cap N) + \nu(A \cap \complement N)$ and $\nu(\complement N) = 0$. The condition that $\nu \perp \mu$ thus says that the measure $\nu$ is "carried by a $\mu$-nullset". From $\nu \ll \mu$
$\nu_c$ is called the continuous part of $\nu$ with respect to $\mu$, $\nu_s$ the singular part. The Radon-Nikodym theorem is applicable to the part $\nu_c$.
Proof. We will carry out the proof in detail only for finite p and v and indicate in
Exercise 4 how the reader can then handle the general case himself.
(17.11)
is a real number. Since $\mathscr{N}_\mu$ is closed under countable unions, there exists an isotone sequence $(A_n)$ in $\mathscr{N}_\mu$ with $\nu(A_n) \uparrow \alpha$. Since $\nu$ is continuous from below, it follows that
$\nu(N) = \alpha$
for the set $N := \bigcup_{n\in\mathbb{N}} A_n \in \mathscr{N}_\mu$. We will show that via
$\nu = \nu_c + \nu_s = \nu_c' + \nu_s'$
are two decompositions of the kind described in the theorem. The measures $\nu_s, \nu_s'$ are carried by $\mu$-nullsets $N, N'$ in the sense of (17.10); which means that
(17.13)
v,,(AnNO)
=0
There is a short, elementary proof of 17.13 that does not make use of the Radon-Nikodym theorem; see Woo [1971].
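For measures on a finite set the Lebesgue decomposition is explicit: the singular part is the restriction of $\nu$ to the largest $\mu$-nullset $N = \{\mu = 0\}$, and the continuous part is the restriction to its complement. A minimal sketch, assuming measures given as dictionaries of point masses (the names are illustrative):

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_c, nu_s): nu_c lives where mu > 0,
    nu_s is carried by the mu-nullset N = {mu = 0}."""
    null = {w for w in mu if mu[w] == 0}
    nu_c = {w: (Fraction(0) if w in null else nu[w]) for w in nu}
    nu_s = {w: (nu[w] if w in null else Fraction(0)) for w in nu}
    return nu_c, nu_s, null

mu = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(2), "d": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(5), "c": Fraction(0), "d": Fraction(1, 2)}
nu_c, nu_s, N = lebesgue_decomposition(nu, mu)

print(nu_c)
print(nu_s)
print(sorted(N))
```

Here `nu_c` vanishes on every $\mu$-nullset and `nu_s` vanishes outside `N`, which are the two properties required of the decomposition.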
Exercises.
1. Show that the Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ has no density with respect to $\lambda^d$, for any $x \in \mathbb{R}^d$. (Physicists occasionally work with such a "symbolic" density $\delta_x$, calling it the Dirac function at the point $x$. The correct mathematical object is nevertheless the Dirac measure $\varepsilon_x$.)
2. Show that the relation $\ll$ on the set of measures on a $\sigma$-algebra $\mathscr{A}$ is reflexive
have the same nullsets. For $\sigma$-finite measures $\mu$ and $\nu$ on $\mathscr{A}$ show that $\mu \sim \nu$ is equivalent to $\nu = f\mu$ for a density $f$ which satisfies $0 < f(\omega) < +\infty$ for $\mu$-almost all (or even for all) $\omega \in \Omega$.
3. On a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ two measures $\mu$ and $\nu$ are related by $\nu \le \mu$. Show that if further $\mu$ is $\sigma$-finite, then there is an $\mathscr{A}$-measurable function $f$ satisfying
4. Lebesgue's decomposition theorem was proved for finite measures $\mu$ and $\nu$. Show how to infer its validity for $\sigma$-finite measures from this. [Hint: For the existence proof use 17.6. For the uniqueness proof choose a sequence $(A_n)$ in $\mathscr{A}$ with $A_n \uparrow \Omega$ and $\mu(A_n)$, $\nu(A_n)$ finite for each $n$, and consider the measures $\nu_n(A) := \nu(A \cap A_n)$, $A \in \mathscr{A}$, $n \in \mathbb{N}$.]
for all $A \in \mathscr{A}$ and a suitable $\mu$-nullset $N \in \mathscr{A}$. Show that if $N'$ is any other $\mu$-nullset with this property, then $\mu(N \mathbin{\triangle} N') = \nu(N \mathbin{\triangle} N') = 0$.
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $\nu = f\mu$ a $\sigma$-finite measure on $\mathscr{A}$ having density $f$ with respect to $\mu$. Show that this density function is $\mu$-almost everywhere uniquely determined and is $\mu$-almost everywhere real-valued. Show that if $f$ is strictly positive, then $\mu$ itself is $\sigma$-finite.
7. Let $(\Omega, \mathscr{A})$ be a measurable space. For every measure $\mu$ on $\mathscr{A}$ let $\mathscr{N}_\mu$ denote the $\sigma$-ideal of its nullsets. Show that for any sequence $(\mu_n)_{n\in\mathbb{N}}$ of $\sigma$-finite measures
8. The set $\Omega := \,]0, +\infty[$ is a group with respect to multiplication. Show that the measure $\mu$ on $\Omega \cap \mathscr{B}^1$ defined by $\mu := h\lambda^1_\Omega$ with density function $h(x) := 1/x$ is
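18.1 Theorem. Let $\rho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$. Then there are sets $\Omega^+, \Omega^- \in \mathscr{A}$ with $\Omega = \Omega^+ \cup \Omega^-$, $\Omega^+ \cap \Omega^- = \emptyset$, and $\rho(A) \ge 0$ for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\rho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.
Proof. Set
$\gamma := \sup\{\rho(A) : A \in \mathscr{A}\}$
(18.1)   $\gamma = \sup\{\rho(P_n) : n \in \mathbb{N}\}.$
$\Omega^+ := \bigcup_{n\in\mathbb{N}} P_n$,   $\Omega^- := \Omega \setminus \Omega^+$.
Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\rho(A) \ge 0$ because such an $A$ has the form
$A = \bigcup_{n\in\mathbb{N}} B_n$
follows $\rho(A) = \sum_{n=1}^\infty \rho(B_n) \ge 0$
on $\Omega^+ \cap \mathscr{A}$, that is, the restriction of $\rho$ to $\Omega^+ \cap \mathscr{A}$ is a finite measure. Moreover, because $\rho(P_n) \le \rho(\Omega^+) \le \gamma$ and (18.1) this measure satisfies
$\gamma = \rho(\Omega^+).$
In particular, $\gamma < +\infty$ since $\rho$ assumes only real values. $\rho(A) > 0$ cannot hold for any $A \in \Omega^- \cap \mathscr{A}$, for otherwise $\rho(\Omega^+ \cup A) = \rho(\Omega^+) + \rho(A) > \gamma$. Thus, $\rho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.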
Measures (in the sense of Definition 3.3) have occasionally been interpreted as mass distributions on the underlying set $\Omega$. A finite signed measure can be analogously interpreted as an (electric) charge distribution smeared over $\Omega$. The foregoing theorem justifies this metaphor by showing that as with charge in electrostatics, there are two disjoint sets, one carrying all the positive charge, the other all the negative charge.
From this theorem another important feature of signed measures becomes evident: The difference $\rho$ in Lemma 17.9 is more than an illustrative example of a signed measure - it is the typical signed measure:
Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then evidently
$\rho^+(A) := \rho(A \cap \Omega^+)$,   $\rho^-(A) := -\rho(A \cap \Omega^-)$,   $A \in \mathscr{A}$,
define measures on $\mathscr{A}$, which satisfy $\rho = \rho^+ - \rho^-$, since each $A \in \mathscr{A}$ is the disjoint union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □
With this result the circle closes: finite signed measures are nothing more than the differences of finite measures. It is however possible to dispense with the finiteness hypothesis if $\sigma$-additivity is handled with sufficient care, but we will not go into this further.
In the final analysis it is because of the preceding corollary that we only consider
measures with non-negative values in this book. Often to emphasize the distinction
with signed measures, what we call simply measures are called positive measures.
Exercises.
1. Show that every finite signed measure on a σ-algebra is bounded and assumes
a largest and a smallest value.
$\mu' := T(\mu)$
is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:
19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f'\ge 0$ on $\Omega'$
(19.1) $\int f'\,dT(\mu) = \int f'\circ T\,d\mu.$
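Proof. Consider first an $\mathscr{A}'$-elementary function
$f' = \sum_{i=1}^{n}\alpha_i 1_{A_i'}.$
Then
$f'\circ T = \sum_{i=1}^{n}\alpha_i 1_{A_i}$ with $A_i := T^{-1}(A_i')$, and since
$T(\mu)(A_i') = \mu(A_i) \qquad (i=1,\ldots,n)$
holds by definition of image measures, (19.1) follows in this case. For an arbitrary
$\mathscr{A}'$-measurable $f'\ge 0$ there is an isotone sequence $(u_n')$ of $\mathscr{A}'$-elementary functions
for which $u_n'\uparrow f'$. Then $(u_n'\circ T)$ is a sequence of $\mathscr{A}$-elementary functions for which
$u_n'\circ T\uparrow f'\circ T$. From the validity of (19.1) for the $u_n'$ and Definition 11.3 of the
integral in general, we get (19.1) for $f'$. □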
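On a finite set the identity (19.1) reduces to regrouping a sum, which can be checked directly. The sketch below assumes a three-point space with the power set as σ-algebra; the names `mu`, `T`, `f_prime` and `image_measure` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

# Measure on Omega = {1, 2, 3}, given by point masses.
mu = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
# Mapping T : Omega -> Omega' = {"x", "y"}.
T = {1: "x", 2: "y", 3: "x"}
# A non-negative function on Omega'.
f_prime = {"x": Fraction(4), "y": Fraction(9)}

def image_measure(mu, T):
    """Point masses of T(mu): each image point collects the mass of its preimage."""
    out = {}
    for w, mass in mu.items():
        out[T[w]] = out.get(T[w], Fraction(0)) + mass
    return out

T_mu = image_measure(mu, T)
lhs = sum(f_prime[v] * m for v, m in T_mu.items())   # integral of f' with respect to T(mu)
rhs = sum(f_prime[T[w]] * m for w, m in mu.items())  # integral of f' o T with respect to mu
print(lhs, rhs)
```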
$\int (f')^+\,dT(\mu) = \int (f')^+\circ T\,d\mu$ and $\int (f')^-\,dT(\mu) = \int (f')^-\circ T\,d\mu,$
and of course
$(f'\circ T)^+ = (f')^+\circ T$ and $(f'\circ T)^- = (f')^-\circ T.$
Both claims therefore follow from the definition of the integral 12.1.
One has only to note that the integrability of $f'\circ T$ entails the measurability
of $f'\circ T$ and therewith that of $f' = f'\circ T\circ T^{-1}$.
The content of 19.1-19.3 constitutes what is called the "general transformation
theorem for integrals".
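19.4 Theorem. Let $G$, $G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G\to G'$ a $C^1$-diffeomorphism
of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the
function $f'\circ\varphi\,|\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case
(19.3) $\int_{G'} f'\,d\lambda^d = \int_G f'\circ\varphi\,|\det D\varphi|\,d\lambda^d.$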
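Proof. The $\lambda^d$-integrability of $f'$ over $G'$ and that of $f'\circ\varphi\,|\det D\varphi|$ over $G$ means
the $\lambda^d_{G'}$-integrability and the $\lambda^d_G$-integrability of those functions, respectively. According to (8.16')
$\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi|\,\lambda^d_G;$
the general transformation theorem, applied to the mapping $\varphi^{-1}$, together with the rule for integration with respect to a measure with a density therefore gives
$\int f'\,d\lambda^d_{G'} = \int f'\circ\varphi\,|\det D\varphi|\,d\lambda^d_G.$
Because of Theorem 19.1, equality (19.3) holds as well for all non-negative,
Borel measurable, numerical functions on $G'$.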
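A one-dimensional numerical check may help to see (19.3) at work. The sketch below assumes $d=1$, $G=G'=\,]0,1[$, the diffeomorphism $\varphi(x)=x^2$ and the integrand $f'=\cos$; these choices, and the helper `midpoint`, are illustrative only. Both sides should approximate $\sin 1$.

```python
import math

def midpoint(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

phi = lambda x: x * x        # C^1-diffeomorphism of ]0,1[ onto ]0,1[
dphi = lambda x: 2.0 * x     # its derivative; |det D phi| in dimension one
f_prime = math.cos

lhs = midpoint(f_prime, 0.0, 1.0)                                    # integral of f' over G'
rhs = midpoint(lambda x: f_prime(phi(x)) * abs(dphi(x)), 0.0, 1.0)   # transformed integral over G
print(lhs, rhs, math.sin(1.0))
```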
Exercises.
1. Let $(\Omega,\mathscr{A},\mu)$ be a measure space, $T:\Omega\to\Omega$ a mapping which together with
its inverse is an $\mathscr{A}$-$\mathscr{A}$-measurable bijection. Show that for every $f\in E(\Omega,\mathscr{A})$ the
image measure $T(f\mu)$ has a density with respect to $T(\mu)$, namely $f\circ T^{-1}$.
$\int_{T^{-1}(A)} f\circ T\,d\mu = \int_A f\,d\mu$
for all $\mathscr{A}$-measurable numerical functions $f\ge 0$ on $\Omega$, and all $A\in\mathscr{A}$.
(20.1) $\mu(\{|f|\ge\alpha\})\le\frac{1}{\alpha^p}\int|f|^p\,d\mu$
holds.
fd
a-r+co
One can also study the dependence on $n\in\mathbb{N}$ of the measures of the sets
$\{|f_n-f|\ge\alpha\}$ when $f, f_1, f_2, \ldots$ are measurable real functions. That leads to
the aforementioned new convergence concept.
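real function $f$ on $\Omega$, if for each real number $\alpha>0$ and each $A\in\mathscr{A}$ of finite
measure
(20.3) $\lim_{n\to\infty}\mu(\{|f_n-f|\ge\alpha\}\cap A)=0.$
(20.4) $\mu\text{-}\lim_{n\to\infty} f_n = f$
(20.5) $\lim_{n\to\infty}\mu(\{|f_n-f|\ge\alpha\})=0$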
The more complicated condition (20.3) is dictated by the desire to treat infinite,
and especially σ-finite, measures as well as finite ones.
2. For σ-finite measures $\mu$ the stochastic convergence of a sequence $(f_n)$ to $f$ is
generally not equivalent to (20.5), as the next example illustrates.
({n}) = n
for every n E N
and the requirement of σ-additivity. With $A_n := \{n, n+1, \ldots\}$ and $f_n := 1_{A_n}$, for
each $n\in\mathbb{N}$, the sequence $(f_n)$ converges stochastically to 0: For every $\alpha\in\,]0,1[$,
$\{f_n\ge\alpha\}=A_n$, and since $A_n\downarrow\emptyset$, it follows from 3.2 that $\lim_{n\to\infty}\mu(A_n\cap A)=0$
for every $A\in\mathscr{A}$ having finite measure. On the other hand, $\mu(A_n)=+\infty$ for
every $n\in\mathbb{N}$.
20.3 Theorem. For every σ-finite measure $\mu$, any two stochastic limits of a sequence of measurable real functions are $\mu$-almost everywhere equal to each other.
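Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle
inequality in $\mathbb{R}$
$\mu(\{|f-f^*|\ge\alpha\}\cap A)\le\mu(\{|f_n-f|\ge\alpha/2\}\cap A)+\mu(\{|f_n-f^*|\ge\alpha/2\}\cap A)$
for every $n\in\mathbb{N}$ and every $A\in\mathscr{A}$. Letting $n\to\infty$ shows that for every $\alpha>0$ and every $A\in\mathscr{A}$ of finite measure the set $\{|f-f^*|\ge\alpha\}\cap A$, and hence also
$\{f\ne f^*\}\cap A=\bigcup_{k\in\mathbb{N}}\{|f-f^*|\ge 1/k\}\cap A,$
is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies
$\mu(A_n)<+\infty$ for all $n$ and $A_n\uparrow\Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$
follows. □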
To supplement this fact we mention:
Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost
everywhere equal without any hypotheses on the measure itself if both functions
are $p$-fold integrable for some $p\in[1,+\infty[$. This is because for every real $\alpha>0$ the
set $\{|f-f^*|\ge\alpha\}$ has finite measure, by (20.1), and so $f=f^*$ $\mu$-almost everywhere in this set, whence $\{|f-f^*|>0\}=\bigcup_{n\in\mathbb{N}}\{|f-f^*|\ge 1/n\}$ is a countable
union of $\mu$-nullsets. This just says that $f=f^*$ $\mu$-almost everywhere in $\Omega$. But the
next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.
Example. 2. Consider the measure space $(\Omega,\mathscr{P}(\Omega),\mu)$, where $\Omega$ consists of exactly
two elements $\omega_0,\omega_1$ and $\mu(\{\omega_0\})=0$, $\mu(\{\omega_1\})=+\infty$, $f_n=f=0$ for every $n\in\mathbb{N}$.
These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically
20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function
$f\in\mathscr{L}^p(\mu)$ for some $1\le p<+\infty$, then it also converges to $f$ $\mu$-stochastically.
Proof. The Chebyshev-Markov inequality tells us that
$\mu(\{|f_n-f|\ge\alpha\}\cap A)\le\mu(\{|f_n-f|\ge\alpha\})\le\alpha^{-p}\int|f_n-f|^p\,d\mu$
holds for every $n\in\mathbb{N}$, every $\alpha>0$ and every $A\in\mathscr{A}$. The claimed stochastic
convergence, that is, the convergence to 0 of the left end of this chain as $n\to\infty$,
follows because $\int|f_n-f|^p\,d\mu\to 0$ as $n\to\infty$ is the definition of convergence
in $p$th mean. □
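The Chebyshev-Markov estimate used in this proof can be verified exactly on a small finite measure space. The sketch below is only an illustration: the ten-point space, the function `g` (playing the role of $f_n-f$) and all function names are assumptions made here, and $p$ is restricted to positive integers so that rational arithmetic stays exact.

```python
from fractions import Fraction

# Uniform measure of total mass 1 on {0, ..., 9}.
mu = {w: Fraction(1, 10) for w in range(10)}
# A sample function standing in for f_n - f.
g = {w: Fraction(w, 3) for w in range(10)}

def measure_of_exceedance(alpha):
    """mu({|g| >= alpha})."""
    return sum((m for w, m in mu.items() if abs(g[w]) >= alpha), Fraction(0))

def p_integral(p):
    """Integral of |g|^p with respect to mu (p a positive integer)."""
    return sum(abs(g[w]) ** p * m for w, m in mu.items())

def chebyshev_markov_holds(alpha, p):
    """Check mu({|g| >= alpha}) <= alpha^(-p) * integral of |g|^p."""
    return measure_of_exceedance(alpha) <= p_integral(p) / alpha ** p

checks = [chebyshev_markov_holds(Fraction(a), p)
          for a in (Fraction(1, 2), 1, 2, 3) for p in (1, 2, 3)]
print(all(checks))
```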
The proof shows that convergence in $p$th mean actually entails the stronger
form of stochastic convergence in (20.5). The situation is different when the given
sequence is almost everywhere convergent. (On this point cf. also Remark 5.)
20.5 Theorem. If a sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ converges
$\mu$-almost everywhere in $\Omega$ - or even just $\mu$-almost everywhere in each set $A\in\mathscr{A}$
of finite measure - to a measurable real function $f$ on $\Omega$, then this sequence also
converges $\mu$-stochastically to $f$.
1a}
and so
for every $A\in\mathscr{A}$. The present claim therefore follows from our next lemma, applied
(20.6')
$\mu\bigl(\limsup_{n\to\infty}\{|f_n|>\alpha\}\bigr)=0$
lim A
n-rao
(20.7)
m>n
m>n
Proof. To prove the equivalence of (20.6) with the almost everywhere convergence
of $(f_n)$ to 0, we set, for each $\alpha>0$ and each $n\in\mathbb{N}$
$A_n^\alpha := \{\sup_{m\ge n}|f_m|\ge\alpha\}.$
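then these lie in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha\in\mathscr{A}$ and
$A=\bigcap_{k\in\mathbb{N}}\bigcup_{n\in\mathbb{N}}\complement A_n^{1/k}.$
Passing to complements,
$\complement A=\bigcup_{k\in\mathbb{N}}\bigcap_{n\in\mathbb{N}}A_n^{1/k}$
and so
$\bigcap_{n\in\mathbb{N}}A_n^{1/k}\uparrow\complement A$ as $k\to\infty$,
and $A_n^{1/k}\downarrow\bigcap_{m\in\mathbb{N}}A_m^{1/k}$ as $n\to\infty$.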
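Consequently,
(20.8) $\mu(\complement A)=\sup_{k\in\mathbb{N}}\mu\Bigl(\bigcap_{n\in\mathbb{N}}A_n^{1/k}\Bigr)=\sup_{k\in\mathbb{N}}\inf_{n\in\mathbb{N}}\mu(A_n^{1/k})$
because the finite measure $\mu$ is both continuous from above and continuous from
below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number
defined by (20.8) is 0. In turn, the latter occurs exactly in case
$\inf_{n\in\mathbb{N}}\mu(A_n^{1/k})=\lim_{n\to\infty}\mu(A_n^{1/k})=0$
for every $k\in\mathbb{N}$. The first equivalence follows from this. The equivalence of (20.6)
with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$
$\{g>\alpha\}\subset\{g\ge\alpha\}\subset\{g>\alpha'\}$
whenever $0<\alpha'<\alpha$.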
Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every
(20.9)
m> n
On the one hand, $B_n\downarrow B$ and consequently $\lim_{n\to\infty}\mu(B_n)=\mu(B)$. On the other hand,
however,
m>n
$= n^p\mu(A_n) = n^{p-1}$
shows that the sequence does not converge to 0 in $p$th mean for any $p\ge 1$.
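The contrast between stochastic convergence and convergence in $p$th mean can be traced numerically. The sketch below assumes, as a concrete reading of this example that is not confirmed by the surviving text, that the measure is Lebesgue measure on $[0,1[$, that $A_n=[0,1/n[$ and that $f_n=n\,1_{A_n}$; only the relation $n^p\mu(A_n)=n^{p-1}$ is taken from the text. The function names are hypothetical.

```python
from fractions import Fraction

def exceedance_measure(n, alpha):
    """Lebesgue measure of {f_n >= alpha} for f_n = n * 1_[0, 1/n[ and alpha > 0."""
    return Fraction(1, n) if alpha <= n else Fraction(0)

def pth_moment(n, p):
    """Integral of f_n^p: the value n^p taken on a set of measure 1/n."""
    return Fraction(n) ** p * Fraction(1, n)

for n in (1, 10, 100, 1000):
    # The exceedance measure shrinks to 0, while the p-th moments do not.
    print(n, exceedance_measure(n, Fraction(1, 2)), pth_moment(n, 1), pth_moment(n, 2))
```

The first column of values tends to 0, which is stochastic convergence to 0, while the moments stay at 1 for $p=1$ and grow for $p=2$.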
4. Let $(\Omega,\mathscr{A},\mu)$ be the measure space of the preceding example. Write each $n\in\mathbb{N}$
$f_n := 1_{A_n}, \qquad n\in\mathbb{N}.$
It was shown in the example in §15 that the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ converges for
no $\omega\in\Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since
for every $\alpha>0$ and $n\in\mathbb{N}$
a/2);
thus by hypothesis $\mu(\{|f_m-f_n|\ge\alpha\})$ can be made arbitrarily small by taking $m$
and $n$ sufficiently large. If therefore $(\eta_k)_{k\in\mathbb{N}}$ is a sequence of positive real numbers
with
$\sum_{k=1}^{\infty}\eta_k<+\infty,$
$\mu(\{|f_m-f_{n_k}|\ge\eta_k\})\le\eta_k$ for all $m\ge n_k$.
Clearly the sequence $(n_k)_{k\in\mathbb{N}}$ can be chosen strictly isotone: $n_k<n_{k+1}$ for every
$k\in\mathbb{N}$. If now we set
$A_k := \{|f_{n_{k+1}}-f_{n_k}|\ge\eta_k\}, \qquad k\in\mathbb{N},$
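then
$\sum_{k=1}^{\infty}\mu(A_k)\le\sum_{k=1}^{\infty}\eta_k<+\infty$
and consequently,
$\lim_{n\to\infty}\sum_{k=n}^{\infty}\mu(A_k)=0.$
The set $A := \limsup_{k\to\infty}A_k$ therefore satisfies
$\mu(A)=0,$
because $A\subset\bigcup_{k\ge n}A_k$ for every $n\in\mathbb{N}$, entailing that $\mu(A)\le\sum_{k=n}^{\infty}\mu(A_k)$ for every $n$.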
For every $\omega\in\complement A$ the inequality $|f_{n_{k+1}}(\omega)-f_{n_k}(\omega)|\ge\eta_k$
prevails for at most finitely many $k\in\mathbb{N}$. Therefore, along with the series $\sum\eta_k$,
the series
$\sum_{k=1}^{\infty}|f_{n_{k+1}}(\omega)-f_{n_k}(\omega)|$
converges (absolutely); that is, the sequence $(f_{n_k}(\omega))_{k\in\mathbb{N}}$ converges in $\mathbb{R}$. In summary, the sequence $(f_{n_k})$ converges almost everywhere to a measurable real
function $f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k\in\mathbb{N}}$. But, as a subsequence of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence
by 20.3, $f=f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k\in\mathbb{N}}$ con-
20.8 Corollary. A sequence $(f_n)$ of measurable real functions on $\Omega$ converges $\mu$-stochastically to a measurable real function $f$ on $\Omega$ if and only if for each $A\in\mathscr{A}$ of
finite measure, each subsequence $(f_{n_k})_{k\in\mathbb{N}}$ of $(f_n)$ contains a further subsequence
which converges to $f$ $\mu$-almost everywhere in $A$.
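Proof. The preceding theorem establishes that the subsequence condition is necessary for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$
likewise converges stochastically to $f$. Let us now assume that the subsequence condition is fulfilled, and fix an $A\in\mathscr{A}$ of finite measure and a real $\alpha>0$. Since every subsequence $(f_{n_k})_{k\in\mathbb{N}}$
contains a further subsequence which converges to $f$ almost everywhere in $A$, Theorem 20.5 shows that every subsequence of the sequence of numbers
$\mu(\{|f_n-f|\ge\alpha\}\cap A) \qquad (n\in\mathbb{N})$
contains a further subsequence converging to 0. The sequence of these numbers itself therefore
converges to 0. As this is true of every $A\in\mathscr{A}$ having finite measure and every $\alpha>0$, the stochastic convergence of $(f_n)$
to $f$ is thereby confirmed. □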
Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the
finite-measure set $A\in\mathscr{A}$ can be stricken. This is already illustrated by Example 2
if one replaces the sequence $(f_n)$ there with the sequence $(f_n)$ defined by $f_n :=
n\,1_{\{\omega_1\}}$, $n\in\mathbb{N}$. This new sequence also converges stochastically to $f := 0$. See
however Exercise 5.
6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is
a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary
and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable
real function on $\Omega$ is the condition
$\lim_{m,n\to\infty}\mu(\{|f_m-f_n|\ge\alpha\})=0$ for every $\alpha>0$.
7. The sequence formed by alternately taking terms from each of two stochastically
convergent sequences whose limit functions do not coincide almost everywhere
shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some
subsequence of the full sequence $(f_n)$ converge almost everywhere.
A particularly useful consequence of 20.8 is:
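20.9 Theorem. If the sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$, and $\varphi:\mathbb{R}\to\mathbb{R}$ is
continuous, then the sequence $(\varphi\circ f_n)_{n\in\mathbb{N}}$ converges stochastically to $\varphi\circ f$.
Proof. One exploits both directions of 20.8, noting that from the almost everywhere convergence of a sequence $(f_{n_k})$ to $f$ on an $A\in\mathscr{A}$ follows the almost
everywhere convergence of $(\varphi\circ f_{n_k})$ to $\varphi\circ f$ on $A$. □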
Exercises.
1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions,
having limit functions $f$ and $g$, respectively. Show that for all $\alpha,\beta\in\mathbb{R}$
the sequence $(\alpha f_n+\beta g_n)$ converges stochastically to $\alpha f+\beta g$, and the sequences
$(f_n\wedge g_n)$, $(f_n\vee g_n)$ converge stochastically to $f\wedge g$, $f\vee g$, respectively.
2. For a measure space $(\Omega,\mathscr{A},\mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudomet-
$|f-g|$
$d\mu,$
for every pair of functions $f,g\in M(\mathscr{A})$. Show that $D$ also enjoys the properties
(a)-(c) proved for $D_\mu$ in the preceding exercise.
5. Let $(\Omega,\mathscr{A},\mu)$ be a σ-finite measure space. Show that a sequence $(f_n)$ of measurable
real functions on $\Omega$ converges stochastically to a measurable real function $f$
on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can
be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is
stochastically convergent. Choose a sequence $(A_k)$ from $\mathscr{A}$ with $\mu(A_k)<+\infty$ for
each $k$ and $A_k\uparrow\Omega$, and consider the finite measures $\mu_k(A) := \mu(A\cap A_k)$ on $\mathscr{A}$.
The claim is true of each measure $\mu_k$. Given a subsequence $\Phi$ of $(f_n)$, there is for
each $k\in\mathbb{N}$ a subsequence $(g_n^{(k)})_{n\in\mathbb{N}}$ of $\Phi$ which converges $\mu_k$-almost everywhere
To this end, show that for each $\varepsilon\in\,]0,1[$ there exists $\delta>0$ such that
that for every $\delta>0$ there exists an $A_\delta\in\mathscr{A}$ such that $\mu(A_\delta)<\delta$ and $(f_n)$ converges
to $f$ uniformly on $\complement A_\delta$. [Hint: Exercise 2 of §11.]
21. Equi-integrability
The sufficient condition for convergence in $p$th mean which is set out in Lebesgue's
dominated convergence theorem can be transformed into a necessary as well as sufficient condition with the help of stochastic convergence. But we need the concept
of equi-integrability, which is of fundamental significance.
In the following $(\Omega,\mathscr{A},\mu)$ will again be an arbitrary measure space, and $p$ is
always a real number satisfying $1\le p<+\infty$.
The point of departure is a simple observation. A measurable numerical function $f$ on $\Omega$ is integrable if and only if for every $\varepsilon>0$ there is a non-negative
integrable function $g=g_\varepsilon$ such that
(21.1) $\int_{\{|f|\ge g\}}|f|\,d\mu\le\varepsilon.$
$\int|f|\,d\mu=\int_{\{|f|\ge g\}}|f|\,d\mu+\int_{\{|f|<g\}}|f|\,d\mu$
$\int_{\{|f|\ge g\}}|f|\,d\mu\le\varepsilon.$
2. Example 1 and the fact, demonstrated in the course of proving (21.1), that any
set consisting of just one integrable function $f$ is equi-integrable, the function $2|f|$
being an $\varepsilon$-bound for every $\varepsilon>0$.
3. Suppose $M$ is a set of measurable numerical functions on $\Omega$, $1\le p<+\infty$, and
there is a $p$-fold $\mu$-integrable majorant $g$ for $M$, that is, every $f\in M$ satisfies
$|f|\le g$ $\mu$-almost everywhere.
Then the set
$M^p := \{|f|^p : f\in M\}$
is equi-integrable. Indeed, as in Example 2, the single integrable function $h := 2g^p$
is an $\varepsilon$-bound for every $\varepsilon>0$, since by 13.6
$\int_{\{|f|^p\ge h\}}|f|^p\,d\mu\le\int_{\{g^p\ge h\}}g^p\,d\mu=\int_{\{g=+\infty\}}g^p\,d\mu=0.$
fn d <
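$\int_{\{|f_n|\ge g\}}|f_n|\,d\mu=\int_{A_n\cap\{g\le n\}}n\,d\mu=\int_{A_n}n\,d\mu-\int_{A_n\cap\{g>n\}}n\,d\mu\ge 1-\int_{A_n}g\,d\mu.$
From the finiteness of the measure $g\mu$ and the fact that $A_n\downarrow\{0\}$, it follows that
$\liminf_{n\to+\infty}\int_{\{|f_n|\ge g\}}|f_n|\,d\mu\ge 1,$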
(21.3) $\sup_{f\in M}\int|f|\,d\mu<+\infty.$
(21.4) For every $\varepsilon>0$ there exists a $\mu$-integrable function $h\ge 0$ and a number
$\delta>0$ such that
$\int_A|f|\,d\mu=\int_{A\cap\{|f|\ge g\}}|f|\,d\mu+\int_{A\cap\{|f|<g\}}|f|\,d\mu\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int_A g\,d\mu$
$\int|f|\,d\mu\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int g\,d\mu.$
Assuming that the set $M$ is equi-integrable, let us choose for $g$ an $\frac{\varepsilon}{2}$-bound for it
and then set $h := g$, $\delta := \frac{\varepsilon}{2}$. Then conditions (21.3) and (21.4) follow from the
preceding inequalities.
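Conversely, assume the two conditions are fulfilled and let $\varepsilon>0$ be given. Let
$h$ and $\delta>0$ be as furnished by (21.4). For each $f\in M$ and real $\alpha>0$, consider
the obviously valid inequality
$\int|f|\,d\mu\ge\int_{\{|f|\ge\alpha h\}}|f|\,d\mu\ge\int_{\{|f|\ge\alpha h\}}\alpha h\,d\mu$
or its equivalent
$\int_{\{|f|\ge\alpha h\}}h\,d\mu\le\frac{1}{\alpha}\int|f|\,d\mu.$
Because of (21.3), $\alpha$ can therefore be chosen so large that
$\int_{\{|f|\ge\alpha h\}}h\,d\mu\le\delta$ for all $f\in M$.
(21.4) then insures that $g := \alpha h$ is an $\varepsilon$-bound for $M$, which proves that this set
is equi-integrable. □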
21.3 Corollary. Let $M\subset\mathscr{L}^p(\mu)$ and the set $M^p := \{|f|^p : f\in M\}$ be equi-integrable, where $1\le p<+\infty$. Then the set
$M^p_* := \{|\alpha f+\beta g|^p : f,g\in M,\ \alpha,\beta\in\mathbb{R},\ |\alpha|\le 1,\ |\beta|\le 1\}$
is equi-integrable.
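Proof. For every $f\in\mathscr{L}^p(\mu)$ and every $A\in\mathscr{A}$, $|1_A f|\le|f|$ shows that $1_A f\in
\mathscr{L}^p(\mu)$ too, and so for all $f_1,f_2\in\mathscr{L}^p(\mu)$ Minkowski's inequality (14.4) gives
$N_p(1_A f_1+1_A f_2)\le N_p(1_A f_1)+N_p(1_A f_2),$
whence
$\Bigl(\int_A|f_1+f_2|^p\,d\mu\Bigr)^{1/p}\le\Bigl(\int_A|f_1|^p\,d\mu\Bigr)^{1/p}+\Bigl(\int_A|f_2|^p\,d\mu\Bigr)^{1/p}.$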
21.4 Theorem. For every sequence $(f_n)_{n\in\mathbb{N}}$ of $p$-fold $\mu$-integrable real functions
on a measure space $(\Omega,\mathscr{A},\mu)$ the following two assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean.
(ii) The sequence $(f_n)$ converges $\mu$-stochastically, and the sequence $(|f_n|^p)$ is $\mu$-equi-integrable.
$\lim_{n\to\infty}N_p(f_n-f)=0.$
In the light of 20.4 only the equi-integrability of the sequence $(|f_n|^p)$ has to be
proved. By (15.2) the sequence $(N_p(f_n))_{n\in\mathbb{N}}$ converges to $N_p(f)$ and is therefore
bounded, so the set $M := \{|f_n|^p : n\in\mathbb{N}\}$ satisfies (21.3).
$\Bigl(\int_A|f_n|^p\,d\mu\Bigr)^{1/p}\le N_p(f_n-f)+\Bigl(\int_A|f|^p\,d\mu\Bigr)^{1/p}$
To every $\varepsilon>0$ corresponds an $n_\varepsilon\in\mathbb{N}$ such that $N_p(f_n-f)\le 2^{-1}\varepsilon^{1/p}$ for all $n\ge n_\varepsilon$. With
$h := |f_1|^p\vee\ldots\vee|f_{n_\varepsilon}|^p\vee|f|^p,$
condition (21.4) is also satisfied by $M$.
(ii)⟹(i): From the stochastic convergence of the sequence $(f_n)$ and Remark 6
in §20 it follows that
(21.5) $\lim_{n,m\to\infty}\mu(\{|f_m-f_n|\ge\alpha\}\cap A)=0$
for every $A\in\mathscr{A}$ of finite measure and every real $\alpha>0$. We have to show that $(f_n)$
is a Cauchy sequence in $\mathscr{L}^p(\mu)$, that is, that the doubly-indexed sequence of
functions $f_{mn} := f_m-f_n$ satisfies
$\lim_{m,n\to\infty}\int|f_{mn}|^p\,d\mu=0.$
According to 21.3, along with the set $\{|f_n|^p : n\in\mathbb{N}\}$ the set $M_0 := \{|f_{mn}|^p :
m,n\in\mathbb{N}\}$ is also equi-integrable. Hence to every $\varepsilon>0$ corresponds an integrable
function $g_\varepsilon\ge 0$ such that $\int_{\{f\ge g_\varepsilon\}}f\,d\mu\le\varepsilon$ holds for all $f\in M_0$. If we set $g := g_\varepsilon^{1/p}$
then $g$ is $p$-fold integrable and the preceding inequality can be written
$\int_{\{|f_{mn}|\ge g\}}|f_{mn}|^p\,d\mu\le\varepsilon$
for all $m,n\in\mathbb{N}$.
Because
f If,.. I" d =
frn lP d
(21.6)
{Ifm I<g}
holds for all sufficiently large $m,n\in\mathbb{N}$. Now $g^p\mu$, being a finite measure on $\mathscr{A}$,
is continuous from above. Since $\bigcap_{k\in\mathbb{N}}\{g\le k^{-1}\}=\{g=0\}$, $\eta>0$ can therefore be
chosen small enough that
fwnl
g" (11,<E.
fId J
g }
gP d <
for all m, n E N.
The Chebyshev-Markov inequality insures that the set $\{g>\eta\}$ has finite
$\mu$-measure. According to (21.5) therefore the doubly-indexed sequence of sets
$A_{mn} := \{|f_{mn}|\ge\alpha\}\cap\{g>\eta\}$
satisfies
$\lim_{m,n\to\infty}\mu(A_{mn})=0.$
()PJgpd1j
< E,
in,nEN
The $\mu$-continuity of the finite measure $g^p\mu$ and 17.8 provide for an $n_0\in\mathbb{N}$ such
that
gP dp < e
,,
Hence
(21.8)
Ifmn IP dp <
gP du < e
JIfrnnV' dp<&A({g>r)})<()"f?d
<efor allm,nE14,
17
{Ifmk9}fAm.,
However, the phenomenon discussed above does not occur for σ-finite measures. By 20.3 in that case any two stochastic limits are almost everywhere equal.
Therefore we have
21.5 Corollary. Suppose the measure $\mu$ is σ-finite. If a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$
converges stochastically to a (measurable, real) function $f$, and if the sequence
$(|f_n|^p)$ is equi-integrable, then $f\in\mathscr{L}^p(\mu)$ and $(f_n)$ converges in $p$th mean to $f$.
21.6 Lemma. Suppose the sequence of functions $f_n\ge 0$ from $\mathscr{L}^1(\mu)$ converges
stochastically to a function $f\ge 0$ from $\mathscr{L}^1(\mu)$. If in addition
$\lim_{n\to\infty}\int f_n\,d\mu=\int f\,d\mu,$
then the sequence $(f_n)$ converges to $f$ in mean.
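$0\le f\wedge f_n\le f$
and Example 3 show that it is equi-integrable. Since
$0\le f-f\wedge f_n\le|f_n-f|$ (for all $n\in\mathbb{N}$),
$\lim_{n\to\infty}\int f\wedge f_n\,d\mu=\int f\,d\mu,$
$\lim_{n\to\infty}\int f\vee f_n\,d\mu=\int f\,d\mu.$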
Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:
21.7 Theorem. For every sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ which converges $\mu$-stochastically
to a function $f\in\mathscr{L}^p(\mu)$ the following three assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\lim_{n\to\infty}\int|f_n|^p\,d\mu=\int|f|^p\,d\mu.$
Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need
therefore establish only two implications:
(i)⟹(iii): Assertion (15.6) in Theorem 15.1 affirms this.
(iii)⟹(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$
to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma
it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally,
Theorem 21.4 - with the $p$ there chosen to be 1 - shows that the convergence in
mean of this sequence entails its equi-integrability. □
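function from $\mathscr{L}^1(\mu)$. Then for any set $M$ of $\mathscr{A}$-measurable numerical functions
on $\Omega$ the following three assertions are equivalent:
(i) $M$ is equi-integrable.
(ii) For every $\varepsilon>0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies
(21.11) $\sup_{f\in M}\int|f|\,d\mu<+\infty$
as well as the following: Given $\varepsilon>0$ there exists $\delta>0$ such that
(21.12) $\int_A h\,d\mu\le\delta\ \Longrightarrow\ \int_A|f|\,d\mu\le\varepsilon$
for all $A\in\mathscr{A}$, $f\in M$.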
(21.13) $\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha h\}}|f|\,d\mu=0$
holds uniformly for $f\in M$. Condition (21.12) is for obvious reasons (cf. 17.8)
called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f\in M$.
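Proof. (i)⟹(ii): Let $g$ be an $\frac{\varepsilon}{2}$-bound for $M$. Then for all $f\in M$ and all $\alpha>0$
$\int_{\{|f|\ge\alpha h\}}|f|\,d\mu=\int_{\{|f|\ge\alpha h\}\cap\{|f|\ge g\}}|f|\,d\mu+\int_{\{|f|\ge\alpha h\}\cap\{|f|<g\}}|f|\,d\mu$
$\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int_{\{g\ge\alpha h\}}g\,d\mu\le\frac{\varepsilon}{2}+\int_{\{g\ge\alpha h\}}g\,d\mu.$
According to 13.6, $\mu(\{g=+\infty\})=0$. Since $g\mu$ is a finite measure on $\mathscr{A}$, it is
continuous from above, and from $\bigcap_{n\in\mathbb{N}}\{g\ge nh\}=\{g=+\infty\}$ it therefore follows that
$\int_{\{g\ge\alpha h\}}g\,d\mu\le\frac{\varepsilon}{2}$
for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that
indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.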
hd/1
$\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha\}}|f|\,d\mu=0$
uniformly for $f\in M$.
This condition is thus - just as (21.13) for σ-finite measures - necessary and
sufficient for equi-integrability of $M$.
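The uniform tail condition can be explored numerically for two simple families on $[0,1[$ with Lebesgue measure. The sketch below is an illustration under assumptions made here: the families $f_n=n\,1_{[0,1/n[}$ and $g_n=\sqrt{n}\,1_{[0,1/n[}$, the helper names, and the truncation of the supremum at a finite index `n_max` are not taken from the text.

```python
import math

def tail_integral(height, width, a):
    """Integral of f over {f >= a} for f = height * 1_[0, width[ (Lebesgue measure)."""
    return height * width if height >= a else 0.0

def sup_tail(family, a, n_max=100_000):
    """Supremum over n = 1..n_max of the tail integrals (a finite truncation of the sup)."""
    return max(tail_integral(*family(n), a) for n in range(1, n_max + 1))

spike = lambda n: (float(n), 1.0 / n)       # f_n = n * 1_[0, 1/n[
mild = lambda n: (math.sqrt(n), 1.0 / n)    # g_n = sqrt(n) * 1_[0, 1/n[

for a in (10.0, 100.0):
    # The tails of the spike family stay near 1; those of the mild family shrink like 1/a.
    print(a, sup_tail(spike, a), sup_tail(mild, a))
```

For the first family the tails do not become uniformly small, so the criterion fails; for the second they are bounded by $1/\alpha$, so the criterion holds.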
21.9 Lemma. Let $\mu$ be a finite measure and $M\subset\mathscr{L}^1(\mu)$. Suppose that there is
a $\mu$-integrable function $g\ge 0$ such that
(21.14) $\int_{\{|f|\ge\alpha\}}|f|\,d\mu\le\int_{\{|f|\ge\alpha\}}g\,d\mu$
a-4+oo
uniformly in f E M.
fdize.
$\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha\}}|f|\,d\mu=0$
uniformly for $f\in M$,
Exercises.
1. Show that for any measure space $(\Omega,\mathscr{A},\mu)$ a set $M$ of measurable numerical
functions is equi-integrable if and only if for every $\varepsilon>0$ there is an integrable
function $h=h_\varepsilon\ge 0$ such that $\int(|f|-h)^+\,d\mu\le\varepsilon$ for all $f\in M$. [Hint: For sufficiently
large $\eta>0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]
2. Let $(\Omega,\mathscr{A},\mu)$ be an arbitrary measure space, $1\le p<+\infty$. Suppose the sequence
$(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real
function $f$. Show that $f$ lies in $\mathscr{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the
sequence $(|f_n|^p)$ is equi-integrable.
$a_n(f) := n\,\mu(\{n\le|f|<n+1\}).$
Show that $M$ is equi-integrable if and only if the series $\sum_{n=1}^{\infty}a_n(f)$ converges uniformly
in $f\in M$. [Cf. Theorem 3.4 and its proof in BAUER [1996].]
5. Consider a finite measure $\mu$ and an $M\subset\mathscr{L}^1(\mu)$. Show that $M$ is equi-integrable
$=+\infty$ and
$\sup_{f\in M}\int q\circ|f|\,d\mu<+\infty.$
(In fact we have to do here with a necessary as well as a sufficient condition, which
goes back to CH. DE LA VALLÉE POUSSIN (1866-1962). Moreover, $q$ can always
be chosen to be convex and isotone. Cf. MEYER [1976], p. 19 or DELLACHERIE
and MEYER [1975], p. 38.)
6. Let $(\Omega,\mathscr{A},\mu)$ be a measure space with $\mu(\Omega)<+\infty$, $(f_n)_{n\in\mathbb{N}}$ a sequence of
measurable numerical functions $f_n\ge 0$, and set $f^* := \limsup_{n\to\infty}f_n$. Show that:
(a) If the sequence $(f_n)$ is equi-integrable (or at least satisfies condition (21.12)),
then the following "dual version" of Fatou's lemma is valid:
(*)
How does the corresponding result in Exercise 1 of §15 fit in? [Hint: Exercise 2
of §11.]
(b) Under the hypothesis $\int f^*\,d\mu<+\infty$, the sequence $(f_n)$ is equi-integrable if
and only if (*) holds. [In proving the "if" direction, argue indirectly.]
(c) Result (b) can fail in case $\int f^*\,d\mu=+\infty$. Try to corroborate this with a sequence
$(\alpha_n f_n)$ derived by appropriate choice of (sufficiently large) numbers
$\alpha_n>0$ from the sequence $(f_n)$ in the Example from §15.
7. Let $(\Omega,\mathscr{A},\mu)$ be a measure space with $\mu(\Omega)<+\infty$, and let $(\nu_i)_{i\in I}$ be a family
of finite and $\mu$-continuous measures on $\mathscr{A}$. Suppose this family is equi-continuous
at $\emptyset$, meaning that to every sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with $A_n\downarrow\emptyset$ and to every
and $\mu(A)\le\delta$
$\Longrightarrow\ \nu_i(A)\le\varepsilon$ for all $i\in I$.
What does this result say in view of Theorem 21.8? [Hint: Review the proof of
Theorem 17.8.]
Chapter III
Product Measures
In this short chapter we will investigate whether and how one can associate a product with finitely many measure spaces. And for the product measures thus gotten
we will want to see about how to integrate with respect to them in terms of their
factors. We will recognize the L-B measure $\lambda^d$ as being a special product measure
$\Omega := \prod_{j=1}^{n}\Omega_j=\Omega_1\times\ldots\times\Omega_n$
which assigns to each point $(\omega_1,\ldots,\omega_n)\in\Omega$ its $j$th coordinate $\omega_j$. The σ-algebra
in $\Omega$ generated by the mappings $p_1,\ldots,p_n$ is designated
$\bigotimes_{j=1}^{n}\mathscr{A}_j$
and called the product of the σ-algebras $\mathscr{A}_1,\ldots,\mathscr{A}_n$. According to (7.3) we have to
do here with the smallest σ-algebra $\mathscr{A}$ in $\Omega$ such that each $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable.
The reader may recall that the product of finitely many topological spaces is
defined in a very similar way.
An important principle of generation for such products is immediately at hand:
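22.1 Theorem. For each $j=1,\ldots,n$ let $\mathscr{E}_j$ be a generator of the σ-algebra $\mathscr{A}_j$
in $\Omega_j$ which contains a sequence $(E_{jk})_{k\in\mathbb{N}}$ of sets with $E_{jk}\uparrow\Omega_j$. Then the σ-algebra
$\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$ is generated by the system of all sets
$E_1\times\ldots\times E_n$
with $E_j\in\mathscr{E}_j$ for each $j=1,\ldots,n$.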
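Proof. Let $\mathscr{A}$ be any σ-algebra in $\Omega$. What we have to show is that the mappings $p_j$
are all $\mathscr{A}$-$\mathscr{A}_j$-measurable $(j=1,\ldots,n)$ if and only if $\mathscr{A}$ contains each of the
sets $E_1\times\ldots\times E_n$ described above. According to 7.2 $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable just
exactly if $p_j^{-1}(E_j)\in\mathscr{A}$ for every $E_j\in\mathscr{E}_j$. If this condition is fulfilled for each
$j\in\{1,\ldots,n\}$, then the sets
$E_1\times\ldots\times E_n=p_1^{-1}(E_1)\cap\ldots\cap p_n^{-1}(E_n)$
all lie in $\mathscr{A}$. If conversely, $E_1\times\ldots\times E_n\in\mathscr{A}$ for every possible choice of $E_j\in\mathscr{E}_j$
and $j\in\{1,\ldots,n\}$, then upon fixing $E_j\in\mathscr{E}_j$, the sets
$F_k := E_{1k}\times\ldots\times E_{j-1,k}\times E_j\times E_{j+1,k}\times\ldots\times E_{nk}, \qquad k\in\mathbb{N},$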
13
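A particular case of this theorem is the fact that the product $\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$ is
generated by all the sets $A_1\times\ldots\times A_n$ with each $A_j\in\mathscr{A}_j$. Our further course will
be guided by the following example:
(22.2) $\mathscr{B}^n=\mathscr{B}^1\otimes\ldots\otimes\mathscr{B}^1$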
Measure spaces $(\Omega_j,\mathscr{A}_j,\mu_j)$ are given, $1\le j\le n$ with $n\ge 2$, and for each $\mathscr{A}_j$
$\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$ satisfying
(22.3) $\pi(E_1\times\ldots\times E_n)=\mu_1(E_1)\cdot\ldots\cdot\mu_n(E_n)$
be proven?
22.2 Theorem. Suppose that for each $j=1,\ldots,n$, $\mathscr{E}_j$ is an $\cap$-stable generator
of $\mathscr{A}_j$ which contains a sequence $(E_{jk})_{k\in\mathbb{N}}$ of sets of finite $\mu_j$-measure satisfying $E_{jk}\uparrow\Omega_j$. Then there is at most one measure $\pi$ on $\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$ enjoying
property (22.3).
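Proof. Let $\mathscr{E}$ denote the system of all sets $E_1\times\ldots\times E_n$, where $E_j\in\mathscr{E}_j$ for each $j$.
According to 22.1, $\mathscr{E}$ generates the σ-algebra $\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$. Since each $\mathscr{E}_j$ is
$\cap$-stable, so is $\mathscr{E}$, as the identity
$\Bigl(\prod_{j=1}^{n}E_j\Bigr)\cap\Bigl(\prod_{j=1}^{n}F_j\Bigr)=\prod_{j=1}^{n}(E_j\cap F_j)$
shows. The sets $E_k := E_{1k}\times\ldots\times E_{nk}$ lie in $\mathscr{E}$ and satisfy
$E_k\uparrow\Omega_1\times\ldots\times\Omega_n.$
Recalling that $\mu_j(E_{jk})<+\infty$ for all (relevant) $j$ and $k$, we see that the uniqueness
claim therefore follows from 5.4. (Obviously it would suffice if $\bigcup_{k\in\mathbb{N}}E_{jk}=\Omega_j$ instead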
Under the hypotheses of 22.2, which obviously entail the σ-finiteness of each
measure $\mu_j$, the existence of the desired measure $\pi$ can also be proven. This proof
will be carried out in the next section, first for $n=2$, then for arbitrary $n\ge 2$.
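$f:\Omega_0\to\Omega_1\times\ldots\times\Omega_n$
of a measurable space $(\Omega_0,\mathscr{A}_0)$ into a product of measurable spaces $(\Omega_j,\mathscr{A}_j)$ is
measurable with respect to the σ-algebra $\mathscr{A}_1\otimes\ldots\otimes\mathscr{A}_n$ if and only if each component mapping $f_j := p_j\circ f$ of $f$ is $\mathscr{A}_0$-$\mathscr{A}_j$-measurable - a fact which is immediate
from Theorem 7.4.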
Exercise.
Finitely many measurable spaces $(\Omega_j,\mathscr{A}_j)$ are given, $j=1,\ldots,n$. Show that the
algebra in $\Omega_1\times\ldots\times\Omega_n$ generated by all sets $A_1\times\ldots\times A_n$ with each $A_j\in\mathscr{A}_j$
consists of all finite unions of such product sets.
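the sets
(23.1) $Q_{\omega_1} := \{\omega_2\in\Omega_2 : (\omega_1,\omega_2)\in Q\}, \qquad Q_{\omega_2} := \{\omega_1\in\Omega_1 : (\omega_1,\omega_2)\in Q\}$
are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1\in\Omega_1$) and the $\omega_2$-section of $Q$
($\omega_2\in\Omega_2$).
This notation is chosen for typographic simplicity and will see us through §23,
after which it is not needed. In case $\Omega_1=\Omega_2$, however, it presents obvious problems, to circumvent which, alternative notations like ${}_{\omega_1}Q$ or $Q^{\omega_1}$ for $Q_{\omega_1}$ are also
popular in the literature.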
About these sets we claim:
23.1 Lemma. If $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$, then its $\omega_1$-section lies in $\mathscr{A}_2$ for every $\omega_1\in\Omega_1$,
and its $\omega_2$-section lies in $\mathscr{A}_1$ for every $\omega_2\in\Omega_2$.
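Proof. For arbitrary subsets $Q, Q_1, Q_2, \ldots$ of $\Omega := \Omega_1\times\Omega_2$ and points $\omega_1\in\Omega_1$
$(\Omega\setminus Q)_{\omega_1}=\Omega_2\setminus Q_{\omega_1}$
and
$\Bigl(\bigcup_{n\in\mathbb{N}}Q_n\Bigr)_{\omega_1}=\bigcup_{n\in\mathbb{N}}(Q_n)_{\omega_1}.$
Furthermore $\Omega_{\omega_1}=\Omega_2$, and more generally for $A_1\subset\Omega_1$, $A_2\subset\Omega_2$ we have
$(A_1\times A_2)_{\omega_1}=A_2$ if $\omega_1\in A_1$, and $(A_1\times A_2)_{\omega_1}=\emptyset$ if $\omega_1\in\Omega_1\setminus A_1$.
For each $\omega_1\in\Omega_1$, therefore, the system of all sets $Q\subset\Omega$ having section $Q_{\omega_1}\in\mathscr{A}_2$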
23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are σ-finite. Then for every $Q\in
\mathscr{A}_1\otimes\mathscr{A}_2$ the functions
$\omega_1\mapsto\mu_2(Q_{\omega_1})$
and $\omega_2\mapsto\mu_1(Q_{\omega_2})$
Proof. The function $\omega_1\mapsto\mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the
$\mathscr{A}_1$-measurability of $s_Q$, for each $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$. The other function can be treated
analogously.
First suppose that $\mu_2(\Omega_2)<+\infty$. In this case the set $\mathscr{D}$ of all $D\in\mathscr{A}_1\otimes\mathscr{A}_2$
whose $s_D$ function is $\mathscr{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1\times\Omega_2$.
This involves the following easily checked assertions:
$s_\Omega=\mu_2(\Omega_2);$
The system $\mathscr{E}$ of all such $A_1\times A_2$ is $\cap$-stable and generates $\mathscr{A}_1\otimes\mathscr{A}_2$, by 22.1.
Therefore 2.4 insures that $\mathscr{A}_1\otimes\mathscr{A}_2$ is the Dynkin system generated by it. From
$\mathscr{E}\subset\mathscr{D}\subset\mathscr{A}_1\otimes\mathscr{A}_2$ therefore follows that $\mathscr{D}=\mathscr{A}_1\otimes\mathscr{A}_2$, which is what is being
claimed.
because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then
the mapping $\omega_1\mapsto\mu_2(Q_{\omega_1})$ is indeed
$\mathscr{A}_1$-measurable.
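23.3 Theorem. Let $(\Omega_j,\mathscr{A}_j,\mu_j)$ be σ-finite measure spaces, $j=1,2$. Then there
is exactly one measure $\pi$ on $\mathscr{A}_1\otimes\mathscr{A}_2$ which satisfies
(23.2) $\pi(A_1\times A_2)=\mu_1(A_1)\,\mu_2(A_2)$ for all $A_1\in\mathscr{A}_1$, $A_2\in\mathscr{A}_2$.
This measure satisfies
(23.3) $\pi(Q)=\int\mu_2(Q_{\omega_1})\,\mu_1(d\omega_1)=\int\mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$ for every $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$
and is σ-finite.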
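Proof. As before, for each $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$ let $s_Q$ denote the $\mathscr{A}_1$-measurable function
$\omega_1\mapsto\mu_2(Q_{\omega_1})$ on $\Omega_1$; it is of course non-negative. Consequently via
$\pi(Q) := \int s_Q\,d\mu_1$
a non-negative function $\pi$ is well defined on $\mathscr{A}_1\otimes\mathscr{A}_2$. For every sequence $(Q_n)_{n\in\mathbb{N}}$
of pairwise disjoint sets from $\mathscr{A}_1\otimes\mathscr{A}_2$ the equality $s_{\bigcup Q_n}=\sum s_{Q_n}$ and 11.5 insure
that
$\pi\Bigl(\bigcup_{n\in\mathbb{N}}Q_n\Bigr)=\sum_{n=1}^{\infty}\pi(Q_n)$
$\pi'(Q) := \int\mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$
also defines a measure on $\mathscr{A}_1\otimes\mathscr{A}_2$ having this property. But when Theorem 22.2
Thus also the question posed in §22 is answered for σ-finite measures $\mu_1$, $\mu_2$.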
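For finite sets with point masses, the two section integrals describing the product measure of a set can be computed and compared directly. The following sketch assumes two small discrete measure spaces and a set `Q` that is not a rectangle; all names and numbers are illustrative.

```python
from fractions import Fraction

mu1 = {"a": Fraction(1, 2), "b": Fraction(3)}                # point masses on Omega_1
mu2 = {0: Fraction(2), 1: Fraction(1, 4), 2: Fraction(1)}    # point masses on Omega_2
Q = {("a", 0), ("a", 2), ("b", 1)}                           # a subset of Omega_1 x Omega_2

def section_1(Q, w1):
    """The w1-section of Q, a subset of Omega_2."""
    return {w2 for (v1, w2) in Q if v1 == w1}

def section_2(Q, w2):
    """The w2-section of Q, a subset of Omega_1."""
    return {w1 for (w1, v2) in Q if v2 == w2}

def measure(mu, A):
    return sum((mu[w] for w in A), Fraction(0))

# Integrate the measure of the sections with respect to the other factor.
pi_first = sum(measure(mu2, section_1(Q, w1)) * m for w1, m in mu1.items())
pi_second = sum(measure(mu1, section_2(Q, w2)) * m for w2, m in mu2.items())
# Direct sum of the products of point masses over Q.
pi_direct = sum(mu1[w1] * mu2[w2] for (w1, w2) in Q)
print(pi_first, pi_second, pi_direct)
```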
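(23.4) $f_{\omega_1}(\omega_2) := f(\omega_1,\omega_2), \qquad f_{\omega_2}(\omega_1) := f(\omega_1,\omega_2)$
Notice that if $Q\subset\Omega_1\times\Omega_2$ and $f := 1_Q$, then these functions satisfy
(23.5) $(1_Q)_{\omega_1}=1_{Q_{\omega_1}}$ and $(1_Q)_{\omega_2}=1_{Q_{\omega_2}}.$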
Note, of course, that these indicator functions have different domains, and, just as
with (23.1), further caution is called for with (23.4) in case $\Omega_1=\Omega_2$. Equations
(23.4) and (23.5) lead us to call the mapping $f_{\omega_j}$ the $\omega_j$-section of $f$. It enjoys
the expected properties:
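23.5 Lemma. For every measurable space $(\Omega',\mathscr{A}')$ and every measurable mapping
$f:(\Omega_1\times\Omega_2,\mathscr{A}_1\otimes\mathscr{A}_2)\to(\Omega',\mathscr{A}')$
$f_{\omega_1}$ is $\mathscr{A}_2$-$\mathscr{A}'$-measurable and $f_{\omega_2}$ is $\mathscr{A}_1$-$\mathscr{A}'$-measurable for every $\omega_1\in\Omega_1$, $\omega_2\in\Omega_2$.
$f_{\omega_j}^{-1}(A')=(f^{-1}(A'))_{\omega_j},$
so the measurability claims follow from Lemma 23.1.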
Decisive is the following theorem which extends formula (23.3) from indicator
functions to non-negative measurable functions. It goes back to L. TONELLI (1885-1946), its corollary to G. FUBINI (1879-1943). Both statements are often combined
under the single designation the theorem of Fubini.
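23.6 Theorem (of Tonelli). Let $(\Omega_j,\mathscr{A}_j,\mu_j)$ be σ-finite measure spaces ($j=1,2$),
and let
$f:\Omega_1\times\Omega_2\to\overline{\mathbb{R}}_+$
be $\mathscr{A}_1\otimes\mathscr{A}_2$-measurable. Then the functions
$\omega_2\mapsto\int f_{\omega_2}\,d\mu_1$ and $\omega_1\mapsto\int f_{\omega_1}\,d\mu_2$
are $\mathscr{A}_2$-measurable and $\mathscr{A}_1$-measurable, respectively, and
(23.6) $\int f\,d(\mu_1\otimes\mu_2)=\int\Bigl(\int f_{\omega_2}\,d\mu_1\Bigr)\mu_2(d\omega_2)=\int\Bigl(\int f_{\omega_1}\,d\mu_2\Bigr)\mu_1(d\omega_1).$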
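$f := \sum_{j=1}^{n}\alpha_j 1_{Q^j}$ $(\alpha_j\ge 0,\ Q^j\in\mathscr{A},\ n\in\mathbb{N}).$
$f_{\omega_2}=\sum_{j=1}^{n}\alpha_j 1_{Q^j_{\omega_2}},$
$\int f_{\omega_2}\,d\mu_1=\sum_{j=1}^{n}\alpha_j\,\mu_1(Q^j_{\omega_2})$ and so
an $\mathscr{A}_2$-measurable function on $\Omega_2$ thanks to 23.2. Its integration is therefore accomplished by (23.3) thus:
$\int\Bigl(\int f_{\omega_2}\,d\mu_1\Bigr)\mu_2(d\omega_2)=\sum_{j=1}^{n}\alpha_j\,\pi(Q^j)=\int f\,d\pi,$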
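For an arbitrary $\mathscr{A}$-measurable numerical function $f\ge 0$ let $(u^{(n)})$ be a sequence of $\mathscr{A}$-elementary functions such that $u^{(n)}\uparrow f$. Then, as was noted in the
first part of the proof, $(u^{(n)}_{\omega_2})$
is a sequence of $\mathscr{A}_1$-elementary functions, which
obviously satisfy $u^{(n)}_{\omega_2}\uparrow f_{\omega_2}$ (for each $\omega_2\in\Omega_2$). Consequently, the functions
$\varphi^{(n)}(\omega_2) := \int u^{(n)}_{\omega_2}\,d\mu_1, \qquad \omega_2\in\Omega_2,$
which are $\mathscr{A}_2$-measurable by the first part of the proof, increase to the function
$\omega_2\mapsto\int f_{\omega_2}\,d\mu_1,$
by 11.3. This function is therefore also $\mathscr{A}_2$-measurable and the monotone convergence theorem 11.4 says that
$\int\Bigl(\int f_{\omega_2}\,d\mu_1\Bigr)\mu_2(d\omega_2)=\sup_{n\in\mathbb{N}}\int\varphi^{(n)}\,d\mu_2.$
By the first part of the proof, $\int\varphi^{(n)}\,d\mu_2=\int u^{(n)}\,d\pi$
for each $n\in\mathbb{N}$. Moreover,
$\int f\,d\pi=\sup_{n\in\mathbb{N}}\int u^{(n)}\,d\pi.$
Hence
$\int\Bigl(\int f_{\omega_2}\,d\mu_1\Bigr)\mu_2(d\omega_2)=\int f\,d\pi,$
and wholly analogous arguments establish the claims about the functions $f_{\omega_1}$. □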
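In the discrete case Tonelli's theorem is the statement that a double sum of non-negative terms may be evaluated in either order. The sketch below checks this on two small finite measure spaces; the point masses and the function `f` are assumptions chosen for illustration.

```python
from fractions import Fraction

mu1 = {0: Fraction(1, 2), 1: Fraction(2)}                    # point masses on Omega_1
mu2 = {0: Fraction(1), 1: Fraction(1, 3), 2: Fraction(5)}    # point masses on Omega_2

def f(w1, w2):
    """A non-negative function on Omega_1 x Omega_2."""
    return Fraction((w1 + 1) * (w2 + 2))

def inner_over_1(w2):
    """Integral of the w2-section of f with respect to mu1."""
    return sum(f(w1, w2) * m for w1, m in mu1.items())

def inner_over_2(w1):
    """Integral of the w1-section of f with respect to mu2."""
    return sum(f(w1, w2) * m for w2, m in mu2.items())

iterated_21 = sum(inner_over_1(w2) * m for w2, m in mu2.items())
iterated_12 = sum(inner_over_2(w1) * m for w1, m in mu1.items())
product_integral = sum(f(w1, w2) * mu1[w1] * mu2[w2] for w1 in mu1 for w2 in mu2)
print(iterated_21, iterated_12, product_integral)
```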
Having disposed of non-negative functions, the next step in integration theory
is to pass over to integrable functions. For them we get
23.7 Corollary (Theorem of Fubini). For $j=1,2$ let $(\Omega_j,\mathscr{A}_j,\mu_j)$ be σ-finite
measure spaces, $f$ a $\mu_1\otimes\mu_2$-integrable numerical function on $\Omega_1\times\Omega_2$. Then for
$\mu_1$-almost every $\omega_1$ the function $f_{\omega_1}$ is $\mu_2$-integrable and for $\mu_2$-almost every $\omega_2$
the function $f_{\omega_2}$ is $\mu_1$-integrable. The functions
$\omega_1\mapsto\int f_{\omega_1}\,d\mu_2$ and $\omega_2\mapsto\int f_{\omega_2}\,d\mu_1$
thus defined $\mu_1$-almost everywhere on $\Omega_1$ and $\mu_2$-almost everywhere on $\Omega_2$, respectively, are $\mu_1$-integrable and $\mu_2$-integrable, respectively, and equations (23.6) are
valid.
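$|f|_{\omega_j}=|f_{\omega_j}|, \qquad (f^+)_{\omega_j}=(f_{\omega_j})^+, \qquad (f^-)_{\omega_j}=(f_{\omega_j})^-,$
so we will employ parenthesis-free notation. According to (23.6) the product measure $\pi := \mu_1\otimes\mu_2$ satisfies
$\int\Bigl(\int|f_{\omega_1}|\,d\mu_2\Bigr)\mu_1(d\omega_1)=\int|f|\,d\pi<+\infty.$
In particular, the $\mathscr{A}_1$-measurable numerical function $\omega_1\mapsto\int|f_{\omega_1}|\,d\mu_2$ is $\mu_1$-integrable and so by 13.6 it is $\mu_1$-almost everywhere finite. That is (by 12.1),
for $\mu_1$-almost every $\omega_1$ the section $f_{\omega_1}$ is $\mu_2$-integrable. Consequently,
$\omega_1\mapsto\int f_{\omega_1}\,d\mu_2=\int f^+_{\omega_1}\,d\mu_2-\int f^-_{\omega_1}\,d\mu_2$
is defined $\mu_1$-almost everywhere, and the $\mathscr{A}_1$-measurability
is assured of each integral on the right by Theorem 23.6. In turn each of these
integrals is $\mu_1$-integrable by 23.6. So our $\mu_1$-almost everywhere defined function
$\omega_1\mapsto\int f_{\omega_1}\,d\mu_2$ is $\mu_1$-integrable and
$\int\Bigl(\int f_{\omega_1}\,d\mu_2\Bigr)\mu_1(d\omega_1)=\int f^+\,d\pi-\int f^-\,d\pi=\int f\,d\pi.$
(23.6') $\int f\,d(\mu_1\otimes\mu_2)=\int\!\!\int f(\omega_1,\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)=\int\!\!\int f(\omega_1,\omega_2)\,\mu_2(d\omega_2)\,\mu_1(d\omega_1).$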
That exceptional sets of measure zero cannot generally be ignored in the conclusions of Fubini's theorem is illustrated by the following example.
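Example. 1. Consider L-B measure $\lambda^2 = \lambda^1\otimes\lambda^1$ on $\mathbb{R}^2$, the set $A := \mathbb{Q}\times\mathbb{R} \in \mathscr{B}^2$, and its indicator function $f := 1_A$. According to 23.3 or 23.6 we have $\lambda^2(A) = \int f\,d\lambda^2 = 0$, so $f$ is $\lambda^2$-integrable. Nevertheless, for every $\omega_1 \in \mathbb{Q}$, the section $f_{\omega_1} = 1_{\mathbb{R}}$ is not $\lambda^1$-integrable.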
Remark. 1. For certain measures $\mu_1, \mu_2$ which are not σ-finite the existence but usually not the uniqueness of a product measure can be proved by other methods. See, e.g., BERBERIAN [1962]. Even if just one of $\mu_1$ or $\mu_2$ fails to be σ-finite, the second equality in (23.3) can fail. Cf. Exercise 1, p. 145 of HALMOS [1974], as well as chapter IV, §16 of HAHN and ROSENTHAL [1948]. Moreover, there exist $f : \Omega_1\times\Omega_2 \to \mathbb{R}_+$ which are not $\mathscr{A}_1\otimes\mathscr{A}_2$-measurable yet the "iterated integrals" on the right side of (23.6) make sense (and are finite). For an abundance of illuminating but elementary counterexamples related to this famous theorem, see CHATTERJI [1985-86] and MATTNER [1999].
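23.8 Theorem. Let $(\Omega, \mathscr{A}, \mu)$ be a σ-finite measure space and $f : \Omega \to \mathbb{R}_+$ a measurable, non-negative, real function. Further, let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous isotone function which is continuously differentiable at least on $\mathbb{R}_+^* := \,]0,+\infty[$ and satisfies $\varphi(0) = 0$. Then
$$\int \varphi\circ f\,d\mu = \int_{\mathbb{R}_+^*} \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^1(dt) = \int_0^{+\infty} \varphi'(t)\,\mu(\{f \ge t\})\,dt . \tag{23.7}$$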
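Proof. Consider the L-B measure $\lambda^* := \lambda^1_{\mathbb{R}_+^*}$ on the σ-algebra $\mathscr{B}^* := \mathbb{R}_+^* \cap \mathscr{B}^1$. The function $F : \Omega\times\mathbb{R}_+^* \to \mathbb{R}^2$ defined by
$$F(\omega, t) := (f(\omega), t)$$
is, according to Remark 2 in §22, $\mathscr{A}\otimes\mathscr{B}^*$-measurable, because each of its component functions is. Therefore the $F$-preimage of the closed half-plane $\{(x, y) \in \mathbb{R}^2 : x \ge y\}$, namely
$$E := \{(\omega, t) \in \Omega\times\mathbb{R}_+^* : f(\omega) \ge t\},$$
lies in $\mathscr{A}\otimes\mathscr{B}^*$. Theorem 23.6 for the product measure $\mu\otimes\lambda^*$ consequently supplies the equalities
$$\iint \varphi'(t)\,1_E(\omega,t)\,\lambda^*(dt)\,\mu(d\omega) = \iint \varphi'(t)\,1_E(\omega,t)\,\mu(d\omega)\,\lambda^*(dt) = \int \varphi'(t)\,\mu(E_t)\,\lambda^*(dt) = \int \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^*(dt), \tag{23.8}$$
since the $t$-section of $E$ is just the set of all $\omega \in \Omega$ which satisfy $f(\omega) \ge t$. As $\varphi$ is isotone, $\varphi'(t) \ge 0$ for all $t > 0$. The continuous function $\varphi'$ is integrable over $[1/n, a]$ whenever $1/n < a < +\infty$, and since $[1/n, a] \uparrow \,]0, a]$, and
$$\int_{]0,a]} \varphi'(t)\,\lambda^*(dt) = \lim_{n\to\infty} \int_{[1/n,a]} \varphi'(t)\,\lambda^*(dt) = \lim_{n\to\infty}\big(\varphi(a) - \varphi(1/n)\big) = \varphi(a)$$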
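($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$ for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that
$$\int_{]0,f(\omega)]} \varphi'(t)\,\lambda^*(dt) = \varphi(f(\omega)) \qquad\text{for every } \omega \in \Omega,$$
and therefore
$$\int \varphi\circ f\,d\mu = \int\Big(\int_{]0,f(\omega)]} \varphi'(t)\,\lambda^*(dt)\Big)\,\mu(d\omega) = \iint \varphi'(t)\,1_{]0,f(\omega)]}(t)\,\lambda^*(dt)\,\mu(d\omega).$$
Since $1_{]0,f(\omega)]}(t) = 1_E(\omega,t)$, this together with (23.8) gives (23.7). □
For $\varphi(t) := t^p$ with $p > 0$ equality (23.7) reads
$$\int f^p\,d\mu = p\int_0^{+\infty} t^{p-1}\,\mu(\{f \ge t\})\,dt, \tag{23.9}$$
and in particular, for $p = 1$,
$$\int f\,d\mu = \int_0^{+\infty} \mu(\{f \ge t\})\,dt. \tag{23.10}$$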
The reader should not overlook the geometric significance of this, which is that the integral $\int f\,d\mu$ is formed "vertically", while the integral on the right-hand side of (23.10) is formed "horizontally".
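The "vertical" and "horizontal" ways of integrating can be compared exactly for a finite measure space with point masses and integer exponent $p$, i.e. for $\int f^p\,d\mu = p\int_0^{+\infty} t^{p-1}\mu(\{f \ge t\})\,dt$. The sketch below rests on that assumption; the masses, the values of `f` and the function names are hypothetical. It uses the fact that $t \mapsto \mu(\{f \ge t\})$ is a step function, so the integral over $t$ reduces to a finite sum.

```python
from fractions import Fraction

# Hypothetical finite measure space: point masses and a function f >= 0.
mass = {"u": Fraction(1, 2), "v": Fraction(2), "w": Fraction(3, 4), "x": Fraction(1)}
f = {"u": Fraction(0), "v": Fraction(1, 2), "w": Fraction(3), "x": Fraction(1, 2)}


def vertical(p):
    """Integral of f**p with respect to mu, summed point by point."""
    return sum(f[w] ** p * mass[w] for w in mass)


def mu_superlevel(t):
    """mu({f >= t})."""
    return sum(mass[w] for w in mass if f[w] >= t)


def horizontal(p):
    """Integral over t > 0 of p * t**(p-1) * mu({f >= t}) dt.

    Between two consecutive values v_prev < v of f the function
    t -> mu({f >= t}) is constant on ]v_prev, v] and equals mu({f >= v}),
    so each such interval contributes mu({f >= v}) * (v**p - v_prev**p).
    """
    total, prev = Fraction(0), Fraction(0)
    for v in sorted(set(f.values()) - {Fraction(0)}):
        total += mu_superlevel(v) * (v ** p - prev ** p)
        prev = v
    return total


for p in (1, 2, 3):
    print(p, vertical(p), horizontal(p))
```

For $p = 1$ both computations give $15/4$ with the data above.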
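Now at last we turn back to the general case of §22 and consider finitely many σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $j = 1, \dots, n$ and $n \ge 2$.
The two product sets $(\Omega_1\times\dots\times\Omega_{n-1})\times\Omega_n$ and $\Omega_1\times\dots\times\Omega_{n-1}\times\Omega_n$ will be identified via the bijection
$$((\omega_1,\dots,\omega_{n-1}),\omega_n) \mapsto (\omega_1,\dots,\omega_{n-1},\omega_n).$$
The agreed-upon equality of these sets leads at once to the equality of the corresponding products of σ-algebras:
$$(\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_{n-1})\otimes\mathscr{A}_n = \mathscr{A}_1\otimes\dots\otimes\mathscr{A}_{n-1}\otimes\mathscr{A}_n. \tag{23.11}$$
In fact, by 22.1 the sets $A_1\times\dots\times A_{n-1}$ with each $A_j \in \mathscr{A}_j$ generate $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_{n-1}$,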
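$$\Big(\bigotimes_{j=1}^{m}\mathscr{A}_j\Big)\otimes\Big(\bigotimes_{j=m+1}^{n}\mathscr{A}_j\Big) = \bigotimes_{j=1}^{n}\mathscr{A}_j \qquad (1 \le m < n \in \mathbb{N}). \tag{23.12}$$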
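The convention (23.11) opens up the possibility of proving the existence of product measures on any finite number $n \ge 2$ of factors via induction on $n$.
23.9 Theorem. σ-finite measures $\mu_1, \dots, \mu_n$ on σ-algebras $\mathscr{A}_1, \dots, \mathscr{A}_n$ uniquely determine a measure $\pi$ on $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ such that
$$\pi(A_1\times\dots\times A_n) = \mu_1(A_1)\cdot\ldots\cdot\mu_n(A_n) \tag{23.13}$$
for all $A_j \in \mathscr{A}_j$, $j = 1, \dots, n$. This measure $\pi$ is σ-finite; it is denoted by
$$\bigotimes_{j=1}^{n}\mu_j = \mu_1\otimes\dots\otimes\mu_n .$$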
Proof. In 22.2 take for the various generators $\mathscr{E}_j$ the σ-algebra $\mathscr{A}_j$ itself, and learn that there is at most one measure $\pi$ which satisfies (23.13). The existence question has already been settled for $n = 2$, in 23.3. We make the inductive assumption that $\pi' := \mu_1\otimes\dots\otimes\mu_{n-1}$ exists for some $n > 2$ and show how that leads to the existence of $\mu_1\otimes\dots\otimes\mu_n$. Evidently the σ-finiteness of $\mu_1, \dots, \mu_{n-1}$ entails that of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with a measure $\pi := \pi'\otimes\mu_n$ on $(\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_{n-1})\otimes\mathscr{A}_n$ which satisfies
$$\pi(Q'\times A_n) = \pi'(Q')\,\mu_n(A_n)$$
for all $Q' \in \mathscr{A}_1\otimes\dots\otimes\mathscr{A}_{n-1}$ and all $A_n \in \mathscr{A}_n$. Because of (23.11) this measure does what is wanted at level $n$, completing the induction. Again, σ-finiteness of $\pi$ is confirmed exactly as in the proof of 23.3. □
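This inductive construction of the n-fold product measure builds in the equality
$$(\mu_1\otimes\dots\otimes\mu_{n-1})\otimes\mu_n = \mu_1\otimes\dots\otimes\mu_n, \tag{23.14}$$
and more generally
$$\Big(\bigotimes_{j=1}^{m}\mu_j\Big)\otimes\Big(\bigotimes_{j=m+1}^{n}\mu_j\Big) = \bigotimes_{j=1}^{n}\mu_j \qquad (1 \le m < n \in \mathbb{N}). \tag{23.15}$$
In particular
$$\lambda^d = \lambda^1\otimes\dots\otimes\lambda^1$$
with $d$ factors.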
In view of (23.15) induction can also be used to extend the theorems of Tonelli and Fubini to multiple factors. We will formulate only the analog of 23.6:
Let $f \ge 0$ be an $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$-measurable numerical function on $\Omega_1\times\dots\times\Omega_n$. Then for every permutation $j_1, \dots, j_n$ of $1, \dots, n$
$$\int f\,d(\mu_1\otimes\dots\otimes\mu_n) = \int\Big(\dots\Big(\int\Big(\int f(\omega_1,\dots,\omega_n)\,\mu_{j_1}(d\omega_{j_1})\Big)\mu_{j_2}(d\omega_{j_2})\Big)\dots\Big)\mu_{j_n}(d\omega_{j_n}). \tag{23.16}$$
Every integral that occurs on the right-hand side is measurable with respect to the product of the appropriate $\mathscr{A}_j$, namely those corresponding to the coordinates in which integration has not yet occurred. This right-hand side is often written in the shorter fashion
$$\int\dots\int f(\omega_1,\dots,\omega_n)\,\mu_{j_1}(d\omega_{j_1})\dots\mu_{j_n}(d\omega_{j_n}).$$
The simple proof of this theorem (involving induction), as well as the formulation and proof of the analog of 23.7, will be left to the reader.
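For three finite sets with point masses the independence of the iterated integral from the order of integration can be verified directly. The sketch below assumes this discrete setting; the masses, the function `f` and the helper names are hypothetical. The recursion integrates the innermost coordinate of the given order first and the last-named coordinate last.

```python
from fractions import Fraction
from itertools import permutations, product

# Three finite discrete measures (illustrative masses), with indices 0, 1, 2.
measures = [
    {0: Fraction(1), 1: Fraction(1, 2)},
    {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)},
    {0: Fraction(1, 4), 1: Fraction(3)},
]


def f(w):
    """A non-negative function of w = (w_0, w_1, w_2) (arbitrary choice)."""
    return Fraction(1 + w[0] + 2 * w[1] * w[2], 1 + w[2])


def product_integral():
    """Integral of f with respect to the product of the three measures."""
    total = Fraction(0)
    for w in product(*measures):
        weight = Fraction(1)
        for j, wj in enumerate(w):
            weight *= measures[j][wj]
        total += f(w) * weight
    return total


def iterated_integral(order, fixed=()):
    """Iterated integral; order = (j_1, ..., j_n) with j_1 integrated first."""
    if not order:
        point = dict(fixed)
        return f(tuple(point[j] for j in range(len(measures))))
    *inner, outer = order
    return sum(
        m * iterated_integral(tuple(inner), fixed + ((outer, w),))
        for w, m in measures[outer].items()
    )


orders = list(permutations(range(len(measures))))
values = {order: iterated_integral(order) for order in orders}
print(product_integral(), set(values.values()))
```

All six orders of integration return the same rational number as the integral with respect to the product measure.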
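One more piece of notation is convenient:
23.10 Definition. For finitely many σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $1 \le j \le n$, the triple
$$\Big(\mathop{\times}_{j=1}^{n}\Omega_j,\ \bigotimes_{j=1}^{n}\mathscr{A}_j,\ \bigotimes_{j=1}^{n}\mu_j\Big)$$
is called the product of these measure spaces and is denoted by
$$\bigotimes_{j=1}^{n}(\Omega_j, \mathscr{A}_j, \mu_j).$$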
Remark. 2. Throughout the preceding the index set was finite. But there is also a theory of products of (finite) measures indexed by arbitrary sets, which is particularly important in probability theory; it is treated in detail by BAUER [1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.
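In closing we will consider the case where each measure $\mu_j$ comes with a real density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j\mu_j$ is then a σ-finite measure too.
Let the measures $\mu_j$ be σ-finite and $f_j \ge 0$ real-valued,
$$\nu_j = f_j\mu_j, \qquad j = 1,\dots,n.$$
Then the product of these measures is defined and satisfies
$$\bigotimes_{j=1}^{n}\nu_j = F\cdot\Big(\bigotimes_{j=1}^{n}\mu_j\Big) \tag{23.17}$$
with
$$F(\omega_1,\dots,\omega_n) := \prod_{j=1}^{n} f_j(\omega_j). \tag{23.18}$$
The measures $\nu_j$ are σ-finite, meaning that their product is defined. It suffices to treat the case $n = 2$ and refer the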
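general case to induction. For sets $A_1 \in \mathscr{A}_1$ and $A_2 \in \mathscr{A}_2$
$$\nu_1(A_1)\,\nu_2(A_2) = \Big(\int_{A_1} f_1\,d\mu_1\Big)\Big(\int_{A_2} f_2\,d\mu_2\Big) = \iint 1_{A_1}(\omega_1) f_1(\omega_1)\,1_{A_2}(\omega_2) f_2(\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)$$
$$= \iint 1_{A_1\times A_2}(\omega_1,\omega_2)\,F(\omega_1,\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2).$$
From 23.6 therefore
$$\nu_1(A_1)\,\nu_2(A_2) = \int_{A_1\times A_2} F\,d(\mu_1\otimes\mu_2).$$
But then according to 23.3, $\nu_1\otimes\nu_2$ coincides with the measure $F\cdot(\mu_1\otimes\mu_2)$. □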
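The identity $\nu_1(A_1)\nu_2(A_2) = \int_{A_1\times A_2} F\,d(\mu_1\otimes\mu_2)$ with $F(\omega_1,\omega_2) = f_1(\omega_1) f_2(\omega_2)$ can be tested on every measurable rectangle when both factors are finite sets with point masses. The following sketch assumes exactly that; the masses, densities and helper names are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

# Illustrative point masses and densities on two finite sets.
mu1 = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1, 2)}
mu2 = {0: Fraction(3), 1: Fraction(1, 4)}
f1 = {"a": Fraction(2), "b": Fraction(0), "c": Fraction(5)}
f2 = {0: Fraction(1, 3), 1: Fraction(7)}


def nu(mu, dens, A):
    """(dens * mu)(A), i.e. the integral of dens over A with respect to mu."""
    return sum((dens[w] * mu[w] for w in A), Fraction(0))


def F(w1, w2):
    """Tensor product of the two densities."""
    return f1[w1] * f2[w2]


def F_times_product(A1, A2):
    """Integral of F over A1 x A2 with respect to the product measure."""
    return sum((F(w1, w2) * mu1[w1] * mu2[w2] for w1 in A1 for w2 in A2), Fraction(0))


def subsets(S):
    """All subsets of the finite set S, as tuples."""
    items = list(S)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))


agree = all(
    nu(mu1, f1, A1) * nu(mu2, f2, A2) == F_times_product(A1, A2)
    for A1 in subsets(mu1)
    for A2 in subsets(mu2)
)
print(agree)
```

Since the rectangles generate the product σ-algebra and are stable under intersection, agreement on rectangles is what the uniqueness part of 23.3 needs.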
Exercises.
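1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathscr{A}_1 = \mathscr{A}_2 := \mathscr{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-σ-finite counting measure on $\mathscr{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to hold for $Q := D$, the diagonal $\{(\omega,\omega) : \omega \in \mathbb{R}\}$ in $\Omega_1\times\Omega_2$. Why does $D$ lie in $\mathscr{A}_1\otimes\mathscr{A}_2 = \mathscr{B}^2$?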
2. Show that the function
$$(x, y) \mapsto 2e^{-2xy} - e^{-xy}$$
is not $\lambda^2$-integrable over the set $[1, +\infty[\,\times\,[0, 1]$.
3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the following lines: If $\mu$ is a translation-invariant measure on $\mathscr{B}^d$ with $\mu([0,1[) = 1$, and $f \ge 0$, $g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals
f()f(x + y)14(dx)Ad(dy)
and, finally, take f to be any indicator function, g the indicator function of [0, 1[.
4. Compute
00
and thereby evaluate anew the important integral G = 21 in (16.1), in the folye_y2V2
lowing simple way: fo a-e2 dt = fo
dx for every y > 0 and therefore
5. Let $|x| := (x_1^2 + \dots + x_d^2)^{1/2}$ denote the usual euclidean norm of the vector $x := (x_1, \dots, x_d) \in \mathbb{R}^d$. Show that the function $x \mapsto e^{-|x|^\alpha}$ is $\lambda^d$-integrable for every $\alpha > 0$. (Recall Exercise 2 of §16.) In case $\alpha = 2$, show that the $\lambda^d$-integral of this function is $G^d$.
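6. $\overline{K}_r(x_0)$ will denote the closed ball in $\mathbb{R}^d$ with center $x_0$ and radius $r > 0$. Set
$$\alpha_d := \lambda^d(\overline{K}_1(0))$$
and prove that
$$\lambda^d(\overline{K}_r(x_0)) = \alpha_d\,r^d .$$
Show further that
$$\alpha_{2q} = \frac{\pi^q}{q!} \qquad\text{and}\qquad \alpha_{2q-1} = \frac{2^q\,\pi^{q-1}}{1\cdot 3\cdot\ldots\cdot(2q-1)} \qquad (q \in \mathbb{N}).$$
[Hint: Use (7.10) and note that every $x_d$-section of $\overline{K}_r(0)$ is either empty or is a $(d-1)$-dimensional closed ball. Tonelli's theorem then leads to a recursion formula for the $\alpha_d$. Here, of course, $\pi$ has its customary geometric meaning.]
How do these relations change if we replace $\overline{K}_r(x_0)$ by the open ball $K_r(x_0)$ in $\mathbb{R}^d$ of radius $r$ and center $x_0$? [Cf. Exercise 3 in §7.]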
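As a plausibility check (not a solution of the exercise), the closed forms $\alpha_{2q} = \pi^q/q!$ and $\alpha_{2q-1} = 2^q\pi^{q-1}/(1\cdot 3\cdots(2q-1))$ can be compared numerically with the classical expression $\pi^{d/2}/\Gamma(d/2+1)$ for the volume of the unit ball. That Gamma-function formula is a standard fact that is not taken from the text above, and the function names below are hypothetical.

```python
import math


def alpha_closed_form(d):
    """Unit-ball volume via the even/odd closed forms."""
    if d % 2 == 0:
        q = d // 2
        return math.pi ** q / math.factorial(q)
    q = (d + 1) // 2
    odd_product = math.prod(range(1, 2 * q, 2))  # 1 * 3 * ... * (2q - 1)
    return 2 ** q * math.pi ** (q - 1) / odd_product


def alpha_gamma(d):
    """Unit-ball volume via the Gamma function (assumed known formula)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


for d in range(1, 7):
    print(d, alpha_closed_form(d), alpha_gamma(d))
```

For $d = 1, 2, 3$ this reproduces the familiar values $2$, $\pi$ and $4\pi/3$.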
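7. For every compact interval $[\alpha, \beta] \subset \mathbb{R}_+$ designate by $R(\alpha, \beta)$ the spherical shell $\{x \in \mathbb{R}^d : \alpha \le |x| \le \beta\}$. Show that for continuous $h$ on $[\alpha,\beta]$
$$\int_{R(\alpha,\beta)} h(|x|)\,\lambda^d(dx) = d\,\alpha_d \int_\alpha^\beta h(t)\,t^{d-1}\,dt,$$
$\alpha_d$ being the number $\lambda^d(\overline{K}_1(0))$ from the preceding exercise. [Hint: The function $H$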
defined on [a, p) by
H(t) := f
h(IxI)J1d(dx),
tE
a-t2
in order to
the set of all $t > 0$ such that $\mu(\{f = t\}) \ne 0$, as well as the set of all $t > 0$ such that $\mu(\{f > t\}) \ne \mu(\{f \ge t\})$ is countable. Therefore in the equalities (23.8), (23.9) and (23.10), $\mu(\{f \ge t\})$ can always be replaced by $\mu(\{f > t\})$.
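$$\|\mu\| := \mu(\mathbb{R}^d)$$
24.1 Definition. The image under the mapping $A_n$ of the product measure $\mu_1\otimes\dots\otimes\mu_n$ is called the convolution product of the measures $\mu_1, \dots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$, in symbols
$$\mu_1 * \dots * \mu_n := A_n(\mu_1\otimes\dots\otimes\mu_n). \tag{24.2}$$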
The theorems on product and image measures combine to yield the most important properties of the convolution operation $*$. First of all, $\mu_1 * \dots * \mu_n$ is again an element of $\mathscr{M}_+^b(\mathbb{R}^d)$ and
$$\mu_1 * \dots * \mu_n(\mathbb{R}^d) = \mu_1\otimes\dots\otimes\mu_n(\mathbb{R}^{nd}) = \|\mu_1\|\cdot\ldots\cdot\|\mu_n\|$$
so that in fact
(24.3)
for every $n + 1$ measures from $\mathscr{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by
$$\int f\,d(\mu * \nu) = \int f\circ A_2\,d(\mu\otimes\nu) = \iint f(x + y)\,\mu(dx)\,\nu(dy) = \iint f(x + y)\,\nu(dy)\,\mu(dx). \tag{24.5}$$
As this holds for $f := 1_B$, the indicator function of any set $B \in \mathscr{B}^d$, we have
(24.6)
$$\mu * (\nu_1 + \nu_2) = \mu * \nu_1 + \mu * \nu_2, \tag{24.7}$$
$$\mu * (\alpha\nu) = (\alpha\mu) * \nu = \alpha(\mu * \nu). \tag{24.8}$$
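The distributive law (24.7) even holds in the following generality: For every sequence $(\nu_n)$ in $\mathscr{M}_+^b(\mathbb{R}^d)$ with $\sum_{n=1}^{\infty}\|\nu_n\| < +\infty$, the sum $\sum_{n=1}^{\infty}\nu_n$ is also a measure in $\mathscr{M}_+^b(\mathbb{R}^d)$ (cf. Example 4 of §3). Taking account of 11.5, we get
$$\mu * \Big(\sum_{n=1}^{\infty}\nu_n\Big) = \sum_{n=1}^{\infty}\mu * \nu_n . \tag{24.9}$$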
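$$\varepsilon_a * \mu = T_a(\mu). \tag{24.10}$$
Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a - and obviously the only - unit with respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$.
For the special choice $\mu := \varepsilon_b$, (24.10) says that
$$\varepsilon_a * \varepsilon_b = \varepsilon_{a+b} \tag{24.10'}$$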
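For finitely supported measures on the integers (a special case of finite Borel measures on $\mathbb{R}$) the convolution can be computed literally as the image of the product measure under addition. The sketch below assumes this discrete setting; the dictionary representation and all function names are hypothetical. It checks $\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}$, $\varepsilon_a * \mu = T_a(\mu)$, commutativity, associativity and multiplicativity of the total mass.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(mu, nu):
    """Image of the product measure mu (x) nu under the map (x, y) -> x + y."""
    out = defaultdict(Fraction)  # Fraction() is 0
    for x, mx in mu.items():
        for y, my in nu.items():
            out[x + y] += mx * my
    return dict(out)


def total_mass(mu):
    """The total mass ||mu||."""
    return sum(mu.values(), Fraction(0))


def dirac(a):
    """Unit mass at the point a."""
    return {a: Fraction(1)}


def translate(mu, a):
    """Image of mu under the translation x -> x + a."""
    return {x + a: m for x, m in mu.items()}


# Illustrative measures: a point -> mass dictionary each.
mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(2)}
nu = {-1: Fraction(3), 2: Fraction(1, 5)}
rho = {0: Fraction(1), 3: Fraction(1, 7)}

print(convolve(dirac(2), dirac(5)))
print(total_mass(convolve(mu, nu)), total_mass(mu) * total_mass(nu))
```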
y)f(x)T-v(Ad)(dx)v(dy)
$$= \iint 1_B(x)\,f(x - y)\,\lambda^d(dx)\,\nu(dy)$$
for every $B \in \mathscr{B}^d$. With the help of Tonelli's theorem it further follows that
$$f * \nu(x) := \int f(x - y)\,\nu(dy) \qquad\text{for } x \in \mathbb{R}^d.$$
$$(f\lambda^d) * \nu = (f * \nu)\lambda^d .$$
3. Besides $\mu = f\lambda^d$, let now $\nu = g\lambda^d$ also have a $\lambda^d$-integrable density $g \ge 0$. According to 17.3 and the preceding
$$f * (g\lambda^d)(x) = \int f(x - y)\,g(y)\,\lambda^d(dy) \qquad (x \in \mathbb{R}^d)$$
is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is, we set
$$f * g(x) := \int f(x - y)\,g(y)\,\lambda^d(dy) \qquad (x \in \mathbb{R}^d) \tag{24.13}$$
and get
$$(f\lambda^d) * (g\lambda^d) = (f * g)\lambda^d . \tag{24.14}$$
Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13) and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$
$$f * g(x) = \int f(y)\,g(x - y)\,\lambda^d(dy) = g * f(x),$$
that is,
$$f * g = g * f.$$
The laws
$$(f * g) * h = f * (g * h), \qquad f * (g + h) = f * g + f * h, \qquad (\alpha f) * g = \alpha(f * g) \quad (\alpha \in \mathbb{R}_+)$$
for such functions hold as well and follow immediately from (24.13).
4. For arbitrary functions $f, g \in \mathscr{L}^1(\lambda^d)$ decomposition into their positive and negative parts and appeal to the results secured in 3. show that
$$x \mapsto \int f(x - y)\,g(y)\,\lambda^d(dy),$$
while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always $\lambda^d$-integrable. One can therefore define $f * g$ by
$$f * g(x) := \int f(x - y)\,g(y)\,\lambda^d(dy)$$
but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution is used for this $f * g$.
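A concrete instance of the convolution of two densities in dimension $d = 1$: for $f = g = 1_{[0,1]}$ the integral $\int f(x-y)\,g(y)\,dy$ is the length of $[0,1] \cap [x-1, x]$, which is the "triangle" function $\max(0, 1 - |x - 1|)$. The sketch below approximates the integral by a midpoint Riemann sum; the integration window, the step count and all names are assumptions made for illustration, and the result is only accurate up to the discretization error.

```python
def indicator_unit_interval(x):
    """Indicator function of the closed interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def convolve_at(f, g, x, lo=-1.0, hi=2.0, n=20000):
    """Midpoint Riemann sum for the integral of f(x - y) * g(y) over [lo, hi].

    Assumes g vanishes outside [lo, hi], so nothing is lost by truncating.
    """
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * h
        total += f(x - y) * g(y)
    return h * total


def triangle(x):
    """Exact value of the convolution of the two indicator functions."""
    return max(0.0, 1.0 - abs(x - 1.0))


for x in (-0.5, 0.25, 1.0, 1.5, 2.5):
    approx = convolve_at(indicator_unit_interval, indicator_unit_interval, x)
    print(x, approx, triangle(x))
```

The convolution is continuous although neither factor is, which is typical of the smoothing effect of this operation.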
Remarks. 1. For real-valued, non-negative functions $f, g \in \mathscr{L}^1(\lambda^d)$ the function $f * g$ need not be finite everywhere. It suffices to consider any real-valued, non-negative, even function $f$ which lies in $\mathscr{L}^1(\lambda^d)$ but not in $\mathscr{L}^2(\lambda^d)$ and to take $g = f$. Then $f * g(0) = +\infty$. In case $d = 1$, such a function is
$$f(x) := \begin{cases} 0 & \text{for } |x| \ge 1 \text{ or } x = 0,\\ |x|^{-1/2} & \text{for } 0 < |x| < 1. \end{cases}$$
Exercises.
1. Show that for any $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$, $T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T\circ A_2 = A_2\circ(T\otimes T)$, where $T\otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d\times\mathbb{R}^d$ into itself.
2. Compute the nth convolution power of the function f defined on R by f (x)
ethat is, the convolution f * ... * f with n(E N) factors. Is it true that for
every n E N, f has an "nth convolution root"? That is, is f the nth convolution
power of some $\lambda^1$-integrable function $g \ge 0$?
3. If we set $N_1(f) := \int |f|\,d\lambda^d$ (this is (14.1) for $\mu := \lambda^d$), then
$$N_1(f * g) \le N_1(f)\,N_1(g)$$
holds for all $f, g \in \mathscr{L}^1(\lambda^d)$, and for non-negative functions equality prevails.
4. Write out the details of Remark 2 and show that
$$\|f * g\|_1 \le \|f\|_1\,\|g\|_1$$
holds for all elements $f$ and $g$ of the Banach space $L^1(\lambda^d)$. The latter is therefore a Banach algebra.
Chapter IV
$$\mathscr{B}(E) := \sigma(\mathscr{O}) .$$
The closed sets being the complements of the open ones, $\mathscr{B}(E)$ is also generated by the system of all closed subsets of $E$. In this respect the analogy with 6.4 extends a bit farther. The intersection of a sequence of open sets is called a $G_\delta$-set, and the dual, the union of a sequence of closed sets is called an $F_\sigma$-set. All such sets are clearly Borel.
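the σ-algebra generated by the class $\mathscr{K}$ of all compact subsets of $E$ is strictly smaller than $\mathscr{B}(E)$. So at this point the analogy with 6.4 falters.
Examples. 1. From 6.4, as has already been mentioned,
$$\mathscr{B}(\mathbb{R}^d) = \mathscr{B}^d . \tag{25.2}$$
2. If $E$ is a discrete space, the system $\mathscr{K}$ of compact sets consists just of the finite subsets of $E$. Consequently (cf. Examples 2 and 7 in §1) $\sigma(\mathscr{K})$ is the countable and co-countable σ-algebra, comprised of all countable subsets of $E$ and their complements, and so $\sigma(\mathscr{K}) = \mathscr{B}(E)$ if and only if $E$ is countable.
$$\mathscr{B}(\overline{\mathbb{R}}) = \overline{\mathscr{B}}{}^1 . \tag{25.3}$$
In fact, $\mathbb{R}\cap\mathscr{B}(\overline{\mathbb{R}}) = \mathscr{B}(\mathbb{R}) = \mathscr{B}^1$ by Examples 1 and 3 above. The subsets $\{-\infty\}$ and $\{+\infty\}$ are closed in $\overline{\mathbb{R}}$ and the subset $\mathbb{R}$ is open in $\overline{\mathbb{R}}$, hence all three are Borel sets in $\overline{\mathbb{R}}$. Equality (25.3) therefore follows from the definition of $\overline{\mathscr{B}}{}^1$, given in §9.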
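In the sequel we will be studying measures on $\mathscr{B}(E)$ for two important classes of spaces $E$. In preparation for which we make the following definition: A measure $\mu$ on $\mathscr{B}(E)$ is called
(i) a Borel measure on $E$ if $\mu(K) < +\infty$ for every compact set $K \subset E$;
(ii) locally finite if every point of $E$ has an open neighborhood of finite $\mu$-measure;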
Note that a Borel measure is more than just a measure defined on $\mathscr{B}(E)$: in addition finiteness on the system of compact sets is demanded. The inner and outer regularity conditions say that the measure is determined on every Borel set by its values on the compact, resp., the open sets. The Borel measures on $E = \mathbb{R}^d$ are already familiar to us from §6.
Every finite measure on $\mathscr{B}(E)$ is obviously a Borel measure; as in §24 where $E = \mathbb{R}^d$, we naturally call it a finite Borel measure on $E$. The notation introduced there for the total mass of a finite Borel measure will be carried over to this more general setting: For every finite Borel measure $\mu$ on a Hausdorff space $E$
$$\|\mu\| := \mu(E) \tag{25.6}$$
$$\text{(ii)} \implies \text{(i)} . \tag{25.7}$$
Indeed, each point $x$ in the compact set $K$ has an open neighborhood $V_x$ with $\mu(V_x) < +\infty$, and compactness means that finitely many of these, say those corresponding to $x_1, \dots, x_n$, cover $K$. Then
$$\mu(K) \le \sum_{j=1}^{n} \mu(V_{x_j}) < +\infty .$$
The converse of (25.7) is, however, not generally valid. Exercise 2 below furnishes an example.
For the moment we will be content to illustrate the regularity concept with
some examples.
Examples. 5. Let $E$ be an arbitrary Hausdorff space, $a$ a point in $E$. The measure $\varepsilon_a$ on $\mathscr{B}(E)$ defined by unit mass at $a$:
$$\varepsilon_a(A) = 1_A(a) \qquad\text{for } A \in \mathscr{B}(E) \tag{25.8}$$
is both inner and outer regular on $E$. Henceforth it will be called the Dirac measure on $E$ at $a$.
6. As in Example 2, let $E$ be a discrete space, so that $\mathscr{O} = \mathscr{P}(E)$. The compact sets are just the finite ones. The measure defined on $\mathscr{P}(E)$ by
$$\mu(A) := \begin{cases} 0 & \text{if } A \text{ is countable}\\ +\infty & \text{otherwise} \end{cases}$$
is a locally finite Borel measure which is obviously outer regular. It is, however,
inner regular if and only if the set E is countable.
7. On $\mathscr{B}^1 = \mathscr{B}(\mathbb{R})$ consider the counting measure. It is not a Borel measure, is
however inner regular, but not outer regular. In fact, equality (25.5) fails even for
one-point sets B.
More precisely the term used is "positive" Radon measure, but in this book
we dispense with that adjective because non-negativity is built into our definition
of measure, that is, we consider only measures with values in [0, +oo]. Example 5
says that the Dirac measure at any point a E E is always a Radon measure on E.
We have already noted that Borel measures are not automatically locally finite.
Nevertheless for many spaces Radon measures can be defined simply as the inner
regular Borel measures. That is the import of
$$K := \{x\} \cup \bigcup_{n\in\mathbb{N}} K_n$$
for all $n \in \mathbb{N}$.
Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mathscr{E}$ a generator of $\mathscr{A}$ and $\Omega'$ a subset of $\Omega$. Consider the traces $\mathscr{A}'$ and $\mathscr{E}'$ of $\mathscr{A}$ and $\mathscr{E}$, resp., on $\Omega'$ and show that $\mathscr{E}'$ is a generator of the σ-algebra $\mathscr{A}'$ in $\Omega'$. Example 3 above is a special case.
2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes named after SORGENFREY [1947]) whose system $\mathscr{O}_r$ of open sets is defined as follows: A subset $U \subset \mathbb{R}$ lies in $\mathscr{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$ such that $[x, x + \varepsilon[\, \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$. Establish, one after another, the following claims:
(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The right-sided topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular, $\mathbb{R}_r$ is a Hausdorff space.
the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then $\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of finite measure. In particular, the measure $\mu$ is not locally finite and is neither inner regular nor outer regular.
llo,+ool(x)
(x E R)
26.1 Definition. A topological space E is called Polish when its topology has
a countable base and can be defined by a complete metric.
The terminology is due to N. BOURBAKI and commemorates the achievements of Polish topologists in the development of general topology.
A metric is called complete when the associated metric space is complete: every Cauchy sequence in it converges. A countable base or basis for the topology is
a countable system of open sets such that every open set is the union of those from
the system which are subsets of it. For a metrizable space E the existence of such
a basis is equivalent to the existence of a countable dense subset.
Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the ordinary euclidean metric being complete.
2. The product $E'\times E''$ of two Polish spaces is another, when given the product topology. For if $d'$, $d''$ are complete metrics generating the topologies of $E'$ and $E''$, resp., then the product topology of $E'\times E''$ is generated by the metric
metric giving the topology of $E$, and consider the set $F$ of all $(\lambda, x) \in \mathbb{R}\times E$ satisfying $\lambda\cdot d(x, E\setminus G) = 1$. Here, as usual, for $\emptyset \ne A \subset E$, $d(x, A) := \inf\{d(x, a) : a \in A\}$ is the distance from the point $x \in E$ to $A$. The mapping $x \mapsto d(x, A)$ is continuous on $E$, in fact, as the reader can easily check, $|d(x, A) - d(y, A)| \le d(x, y)$ for all $x, y \in E$.
More generally it is true (cf. COHN [1980], Theorem 8.1.4 or WILLARD [1970],
example, the set $J$ of all irrational numbers with its topology as a subspace of $\mathbb{R}$ is Polish, since
$$J = \bigcap_{x\in\mathbb{Q}} (\mathbb{R}\setminus\{x\}) .$$
Every compact space E with a countable basis is Polish. For a famous theorem
of P.S. URYSOHN (1889-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970],
6.
The key to the further discussion is the following lemma, which is here just
a preliminary to the big theorem that follows it, but nevertheless is significant in
its own right. In it we encounter our first extensive class of Radon measures.
26.2 Lemma. Every finite Borel measure $\mu$ on a Polish space $E$ is regular.
Proof. We consider the system $\mathscr{D}$ of all $B \in \mathscr{B}(E)$ which satisfy both
(26.1)
and
nEN
-F2'
j=1
kp
Each set Bn
j=1
and we have
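$$\mu(E) - \mu(K) = \mu(E\setminus K) = \mu\Big(\bigcup_{n\in\mathbb{N}} (E\setminus B_n)\Big) \le \sum_{n=1}^{\infty} \mu(E\setminus B_n) \le \sum_{n=1}^{\infty} \varepsilon\,2^{-n} = \varepsilon .$$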
This will prove (26.1) if we can confirm that the closed set K is actually compact.
For every n E N
K C B. E l l , ,
and each set in this union has diameter no greater than 2/n. This shows that K is
pre-compact (=totally bounded) and in a complete metric space that is equivalent
to compactness, by very easy arguments (cf. WILLARD [1970], Theorem 39.9 or
KELLEY [1955], p. 198).
2. Every closed set $C$ lies in $\mathscr{D}$: Let $\varepsilon > 0$ be given. We already know that there is a compact set $K$ with
$$\mu(E) - \mu(K) < \varepsilon .$$
According to 3.5 however
of a metric space, $C$ is a $G_\delta$-set, that is, there are open sets $G_n \downarrow C$. To see this we may assume $C \ne \emptyset$, so that $G := E\setminus C$ is an open proper subset of $E$. Consequently, $x \mapsto d(x, C)$ is a continuous mapping whose zero-set is $C$, as was
by "closed". But then application of step 2 to these closed sets gives us the
full (26.1) for CB.
4. Whenever pairwise disjoint sets $D_n$ lie in $\mathscr{D}$ ($n \in \mathbb{N}$), their union $D$ also lies in $\mathscr{D}$: First of all
(D.)
(D) _
n=1
(7 = 1, ... , ne)
2nE
( n,
n,
j=1
e/2"
for each n E N.
nEN
In summary, we have shown that (26.1) and (26.2) hold for $B := D = \bigcup_{n\in\mathbb{N}} D_n$.
5. The result of the first four steps is that $\mathscr{D}$ is a Dynkin system which contains the system $\mathscr{F}$ of all closed sets. The claim, namely that $\mathscr{D} = \mathscr{B}(E)$, now follows
26.3 Theorem. On a Polish space $E$ every locally finite Borel measure $\mu$ is a σ-finite Radon measure.
(26.4)
Via
A E R(E)
nEN
nEN KEA
KEr nEN
KCA
KCA
proving the inner regularity of $\mu$. The σ-finiteness of $\mu$ is affirmed by (26.4), so the proof is complete.
The question now suggests itself whether - in analogy with 26.2 - the outer regularity of $\mu$ can be proved. This is in fact the case.
26.4 Corollary. Every Radon measure on a Polish space is outer regular.
Proof. We have to show that every $B \in \mathscr{B}(E)$ satisfies (25.5). So let $B \in \mathscr{B}(E)$ and $\varepsilon > 0$ be given. Consider the open sets $G_n$ and the finite measures $\mu_n$ created in the preceding proof. Lemma 26.2 furnishes open sets $U_n \supset B$ such that
$$\mu((U_n\setminus B)\cap G_n) < \varepsilon/2^n \tag{26.5}$$
for each $n \in \mathbb{N}$.
Let U
B = B n E = B n UG,, U BnC,,,
nEN
nEN
nEN
nEN
nEN
and consequently
$$\mu(U\setminus B) \le \sum_{n=1}^{\infty} \mu((U_n\setminus B)\cap G_n) < \sum_{n=1}^{\infty} \varepsilon/2^n = \varepsilon$$
by (26.5). It follows finally that
The regularity conditions (25.4), (25.5) make sense for outer measures $\mu^*$ and together with one other minimal demand on $\mu^*$ they assure that all Borel sets are $\mu^*$-measurable. In fact, these conditions on an outer measure come up naturally in the course of proving the famous Riesz representation theorem in §29; cf. also 28.3.
26.5 Lemma. Let $E$ be a Hausdorff space and $\mu^*$ an outer measure on $E$ with the following three properties:
(i) for every set $A \subset E$
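Proof. We consider the σ-algebra $\mathscr{A}^*$ of all $\mu^*$-measurable sets, that is, according to (5.6) the set of all $A \in \mathscr{P}(E)$ which satisfy
$$\mu^*(Q) \ge \mu^*(Q\cap A) + \mu^*(Q\setminus A) \qquad\text{for all } Q \in \mathscr{P}(E). \tag{26.6}$$
First note that it suffices that this hold for all open sets $Q$ in order that it hold for all $Q$ whatsoever. In other words, what we need to check for an $A$ to be in $\mathscr{A}^*$ is that
$$\mu^*(U) \ge \mu^*(U\cap A) + \mu^*(U\setminus A) \qquad\text{for all } U \in \mathscr{O}. \tag{26.6'}$$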
use criterion (26.6) to show that $G$ lies in $\mathscr{A}^*$. To this end consider any open $U \subset E$; further, consider any compact $K_1 \subset U\cap G$ and any compact $K_2 \subset U\setminus K_1$. Since then $K_1\cap K_2 = \emptyset$ and $K_1\cup K_2 \subset U$, it follows from (iii) that
$$\mu^*(U) \ge \mu^*(K_1\cup K_2) = \mu^*(K_1) + \mu^*(K_2).$$
The set $U\setminus K_1$ is open, so if we take the supremum over all such $K_2$ in the preceding inequality and appeal to (ii), we get
the latter
This all proves that $\mathscr{O} \subset \mathscr{A}^*$. But then $\mathscr{B}(E) = \sigma(\mathscr{O}) \subset \mathscr{A}^*$, since $\mathscr{A}^*$ is a σ-algebra, by Theorem 5.3. That theorem further affirms that the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure.
The foregoing Theorem 26.3 and its corollary show in particular that the L-B measure $\lambda^d$ is a regular Borel measure on $\mathbb{R}^d$ in each dimension $d = 1, 2, \dots$. In fact every Borel measure on $\mathbb{R}^d$ is regular (cf. also Theorem 29.12). Following STROMBERG [1972] we derive from the regularity of $\lambda^d$ a purely topological result of H. STEINHAUS (1887-1972). It shows, incidentally, that every set of positive L-B measure has the cardinality of $\mathbb{R}$.
26.6 Theorem (of Steinhaus). Let $A \in \mathscr{B}^d$ be a Borel set in $\mathbb{R}^d$ of positive d-dimensional Lebesgue measure. Then 0 is an interior point of the set $A - A$ of differences of elements of $A$.
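In the very simplest case, where $A \subset \mathbb{R}$ is a finite union of compact intervals, the difference set $A - A$ can be written down explicitly, since $[a_1,b_1] - [a_2,b_2] = [a_1 - b_2,\ b_1 - a_2]$. The sketch below covers only this special case (the theorem itself concerns arbitrary Borel sets of positive measure, including nowhere dense ones, which no such computation reaches); the interval data and the function names are hypothetical.

```python
from fractions import Fraction as Fr


def difference_set(intervals):
    """A - A for A a finite union of compact intervals [a, b], as merged intervals."""
    pieces = sorted((a1 - b2, b1 - a2) for a1, b1 in intervals for a2, b2 in intervals)
    merged = [list(pieces[0])]
    for lo, hi in pieces[1:]:
        if lo <= merged[-1][1]:
            # Overlapping or touching pieces are merged into one interval.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return [tuple(p) for p in merged]


def radius_around_zero(intervals):
    """Largest r such that ]-r, r[ lies in the component of A - A containing 0."""
    for lo, hi in difference_set(intervals):
        if lo <= 0 <= hi:
            return min(-lo, hi)
    return Fr(0)


A = [(Fr(0), Fr(1)), (Fr(3), Fr(7, 2))]
print(difference_set(A))
print(radius_around_zero(A))
```

Here the radius equals the length of the longest interval of $A$, in agreement with the assertion that 0 is an interior point of $A - A$.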
logical space into another, then $f$ is Borel measurable (i.e., $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable) just if the pre-image $f^{-1}(G')$ of every open set $G' \subset E'$ is a Borel set in $E$. This follows from Theorem 7.2 and the fact that the Borel σ-algebra $\mathscr{B}(E')$ is generated by the open subsets of $E'$. By contrast, $f$ is continuous just if $f^{-1}(G')$ is open in $E$ for every open set $G' \subset E'$. What is quite remarkable is that for Polish spaces $E$ a much closer connection between those two concepts exists. This is brought out by the following theorem, discovered in its definitive form by N. LUSIN (1883-1950).
26.7 Theorem (of Lusin). Let $\mu$ be a locally finite Borel measure, thus a Radon measure, on a Polish space $E$, and $E'$ be a topological space with a countable basis. Then for every mapping $f : E \to E'$ the following are equivalent:
(a) $f$ coincides $\mu$-almost everywhere with a Borel measurable mapping of $E$ into $E'$.
(b) There is a decomposition of $E$ into a $\mu$-nullset $N \in \mathscr{B}(E)$ and a sequence $(K_n)_{n\in\mathbb{N}}$ of compact sets, such that the restriction of $f$ to each $K_n$ is continuous.
If the measure $\mu$ is finite, (a) and (b) are further equivalent to:
(c) For every $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and the restriction of $f$ to $K_\varepsilon$ is continuous.
Proof. Let us first suppose that $\mu$ is finite. Let $\mathscr{G}'$ be a countable base for the topology of $E'$ and $(G'_n)_{n\in\mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathscr{G}'$ is a generator of the Borel σ-algebra because every open subset of $E'$ is a (countable) union of sets from $\mathscr{G}'$.
(26.7)
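For every set $G'_n$, $g^{-1}(G'_n) \in \mathscr{B}(E)$. Because every Radon measure on $E$ is regular, given $\varepsilon > 0$, there exist compact sets $K_n$ and open sets $U_n$ such that
$$K_n \subset g^{-1}(G'_n) \subset U_n \quad\text{and}\quad \mu(U_n\setminus K_n) < \varepsilon\,2^{-n} \tag{26.8}$$
for each $n \in \mathbb{N}$.
The set $A := \bigcup_{n\in\mathbb{N}} (U_n\setminus K_n)$ is open, being a union of open sets. For its measure we have $\mu(A) < \varepsilon$.
Using once more the (inner) regularity of $\mu$, we find a compact $K \subset \complement(A\cup N) = \complement A\cap\complement N$ such that
$$\mu(\complement A\cap\complement N\cap\complement K) < \varepsilon - \mu(A),$$
thus (since $A\cup N \subset \complement K$ and $A\cup N\cup(\complement A\cap\complement N) = E$) such that
$$\mu(\complement K) = \mu(A\cup N\cup[\complement A\cap\complement N\cap\complement K]) < \mu(A) + \mu(N) + \varepsilon - \mu(A) = \varepsilon .$$
This set $K$ does what is wanted in (c), because by (26.7) $f$ and $g$ coincide in $K$ and because the restriction $g_0$ of $g$ to $\complement A$ is continuous, as we now confirm. For each set $G'_n$,
$$g_0^{-1}(G'_n) = g^{-1}(G'_n)\cap\complement A;$$
by (26.8) and the definition of $A$ therefore
$$g_0^{-1}(G'_n) = U_n\cap\complement A = K_n\cap\complement A,$$
showing that the $g_0$-pre-image of $G'_n$ is open (as well as closed) in $\complement A$. Since $(G'_n)_{n\in\mathbb{N}}$ is a base for the topology of $E'$, this is enough to guarantee the continuity of $g_0 = g\,|\,\complement A$.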
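(c)⇒(b): It suffices to find pairwise disjoint compact subsets $K_n$ of $E$ such that $f\,|\,K_n$ is continuous and
$$\mu\Big(\complement\bigcup_{j=1}^{n} K_j\Big) < \frac{1}{n}$$
for each $n \in \mathbb{N}$. For then
$$N := \complement\bigcup_{n\in\mathbb{N}} K_n = \bigcap_{n\in\mathbb{N}} \complement K_n$$
is a Borel set disjoint from each $K_n$ and satisfying $\mu(N) < 1/n$ for every $n \in \mathbb{N}$, i.e., $\mu(N) = 0$. The sequence $(K_n)$ is gotten inductively from (c) as follows: To start off, there is a compact $K_1 \subset E$ such that $\mu(\complement K_1) < 1$ and $f\,|\,K_1$ is continuous.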
If $K_1, \dots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$ from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that
Because
g(x) := yo for x E N.
What has to be shown is that g is Borel measurable, which is done as follows: For
every open G' C E'
$g^{-1}(G')$
nEN
Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is
Exercises.
1. Show that every inner regular finite Borel measure on a Hausdorff space is outer
regular.
2. Show that in a Polish space $E$ the Dirac measures are the only non-zero Borel measures $\mu$ which take only the values 0 and 1. [Hint: Show that the system of all compact $K \subset E$ such that $\mu(K) = 1$ is ∩-stable and investigate the intersection of all its sets.]
3. Show that $\mathscr{B}(E\times E') = \mathscr{B}(E)\otimes\mathscr{B}(E')$ for any Polish spaces $E, E'$.
4. Consider $K$ compact $\subset U$ open $\subset \mathbb{R}^d$, and for each $n \in \mathbb{N}$ let $V_n$ denote the open ball of radius $1/n$ and center 0. Show that $K + V_n \subset U$ for some $n$. [Hint: If $(K + V_n)\cap\complement U \ne \emptyset$ for every $n \in \mathbb{N}$, find $x_n \in K$, $v_n \in V_n$, $z_n \in \complement U$ such that $x_n + v_n = z_n$, for every $n \in \mathbb{N}$. Some subsequence of $(x_n)$ converges to a point $x_0 \in K$ and because $\complement U$ is closed we even have $x_0 \in K\cap\complement U$, which contradicts the fact that $K \subset U$.]
$C(E)$, $C_b(E)$
the vector space of all, respectively all bounded, continuous real functions on $E$.
The complement of $\mathrm{supp}(f)$ is thus the largest open set at every point of which $f$ takes the value zero. If $E$ is locally compact, we will designate by
$C_c(E)$
the set of all $f \in C(E)$ with compact support $\mathrm{supp}(f)$. A function $f \in C(E)$ lies in $C_c(E)$ just if there is some compact subset of $E$ in the complement of which $f$ is identically zero.
Clearly
$$C_c(E) \subset C_b(E) \subset C(E). \tag{27.2}$$
Along with $u, v \in C_c(E)$ the functions $|u|$, $u\vee v$, $u\wedge v$, and therewith $u^+$ and $u^-$, all lie in $C_c(E)$. The needed continuity of $\varphi(x, y) := x\vee y$ on $\mathbb{R}^2$ follows from the identity $x\vee y = \tfrac{1}{2}(x + y + |x - y|)$.
In the special case of a compact space $E$, all three function spaces in (27.2) coincide.
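27.2 Theorem (on partitions of unity). Suppose that the compact subset $K$ of the locally compact space $E$ is covered by the $n$ open sets $U_1, \dots, U_n$, $n \in \mathbb{N}$. Then there are functions $f_1, \dots, f_n \in C_c(E)$ with the following properties:
$$f_j \ge 0 \qquad\text{for } j = 1,\dots,n; \tag{27.3}$$
$$\mathrm{supp}(f_j) \subset U_j \qquad\text{for } j = 1,\dots,n; \tag{27.4}$$
$$\sum_{j=1}^{n} f_j(x) \le 1 \qquad\text{for all } x \in E; \tag{27.5}$$
$$\sum_{j=1}^{n} f_j(x) = 1 \qquad\text{for all } x \in K. \tag{27.6}$$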
(i)
supp(f f) C Uj,
for j = 0,..., n;
Ef,(x)=1
(ii)
j=o
supp(fj)=supp(ff)flECUUflE=UUCUj
since UU C Uj C E. In particular, Uf being a closed subset of the compact space E',
for all x E K. 0
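On the real line a partition of unity with the four properties (non-negativity, supports inside the covering sets, sum at most 1 everywhere, sum equal to 1 on $K$) can be written down explicitly with distance functions. The sketch below is one such construction and not the proof given in the text; the sets $K = [0,1]$, $U_1 = \,]-1, 0.6[$, $U_2 = \,]0.4, 2[$, the margin `DELTA`, the grid used to find the minimum, and all names are assumptions made for illustration.

```python
def dist_to_complement(x, a, b):
    """Distance from x to the complement of the open interval ]a, b[."""
    return max(0.0, min(x - a, b - x))


K = (0.0, 1.0)                  # compact set [0, 1]
U = [(-1.0, 0.6), (0.4, 2.0)]   # open cover of K by two intervals
DELTA = 0.05                    # margin keeping each support inside its U_j


def h(j, x):
    """Continuous, >= 0, with support [a_j + DELTA, b_j - DELTA] inside U_j."""
    a, b = U[j]
    return max(0.0, dist_to_complement(x, a, b) - DELTA)


def h_sum(x):
    return sum(h(j, x) for j in range(len(U)))


# Minimum of h_1 + h_2 over K, found on a grid. For this piecewise linear
# example the minimum is attained on [0.45, 0.55], which contains grid points.
GRID = [k / 1000 for k in range(1001)]
M = min(h_sum(x) for x in GRID)


def f(j, x):
    """j-th function of the partition of unity: h_j / max(h_1 + h_2, M)."""
    return h(j, x) / max(h_sum(x), M)


for x in (-0.5, 0.0, 0.5, 1.0, 1.5):
    print(x, f(0, x), f(1, x), f(0, x) + f(1, x))
```

Dividing by $\max(h_1 + h_2, M)$ rather than by $h_1 + h_2$ keeps the quotient continuous on all of $\mathbb{R}$ and bounded by 1, while leaving it equal to 1 on $K$.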
Two consequences of the foregoing will turn out to be especially useful. The first - known as Urysohn's lemma - often serves as the starting point for inductive constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can also be proven directly, as indicated in Exercise 1 below.
$$0 \le f \le 1, \qquad f(K) = \{1\},$$
and
$$\mathrm{supp}(f) \subset U .$$
Proof. We have only to apply 27.2 for $n = 1$. Since $K \subset \{f_1 > 0\} \subset \mathrm{supp}(f_1)$, the fact that $\{f_1 > 0\}$ is open means that $\mathrm{supp}(f_1)$ is indeed a neighborhood of $K$. □
27.4 Corollary 2. In the locally compact space E the compact subset K is covered
$$K_j := K\cap\mathrm{supp}(f_j), \qquad j = 1, \dots, n$$
do what is wanted; for if $x \in K$, then $1 = f_1(x) + \dots + f_n(x)$ means that $f_j(x) \ne 0$ for some $j$, and therefore $x \in K_j$.
For a locally compact space $E$ there is another function space besides $C_c(E)$ that is of importance. To define it we assign to every bounded real function $f$ on an arbitrary space $E$ its supremum norm, also called its uniform norm, via
$$\|f\| := \sup_{x\in E} |f(x)| .$$
The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$ - more generally even the vector space of all bounded real functions on $E$ - into a metric space. One speaks of the metric of uniform convergence (on $E$). That a sequence $(f_n)$ of bounded real functions on $E$ converges uniformly on $E$ to a bounded function $f$ just means that
$$\lim_{n\to\infty} \|f_n - f\| = 0 .$$
27.6 Theorem. For a real function $f$ on a locally compact space $E$ the following statements are equivalent:
(a) $f \in C_0(E)$;
(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;
(c) the function $f'$ defined on the one-point compactification $E' = E\cup\{\omega_0\}$ by $f'(x) := f(x)$ for $x \in E$ and $f'(\omega_0) := 0$ is continuous.
Proof. (a)⇒(b): Given $\varepsilon > 0$, there is by definition of $f \in C_0(E)$ a $g \in C_c(E)$ with $\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \|f - g\|$, so we see that
$$\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \mathrm{supp}(g).$$
This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity of $f$, it is also closed. Hence it is compact.
(b)⇒(c): Since the subspace topology of $E$ in $E'$ is its original topology and $E$ is an open subset of $E'$, continuity of $f'$ at each point of $E$ is assured by $f \in C(E)$. As to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$ for all $x$ in the set $E'\setminus\{|f| \ge \varepsilon\}$, which by definition of $E'$ is a neighborhood of $\omega_0$, since $\{|f| \ge \varepsilon\}$ is a compact subset of $E$.
(c)⇒(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$ for all $x \in E\setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$. Then $fg \in C_c(E)$ and satisfies
$$|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) \le \varepsilon$$
for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that $f \in C_0(E)$.
Exercises.
1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for the case $n = 2$: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]
2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact space $E$. Describe the Borel sets in $E'$ by means of the Borel sets in $E$. In particular, see how your description fits into the following general picture: For a measurable space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$ in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.
(defined on $\mathscr{B}(E)$). Here the requirement $\mu(K) < +\infty$ for every compact set $K$ is the same as the local finiteness requirement, because every point of $E$ has a compact neighborhood and the implication (25.7) holds in general. So in the present context the concepts of Borel measure and locally finite measure on $\mathscr{B}(E)$ coincide. The Radon measures on $E$ are thus (cf. 25.3) those Borel measures which are inner regular.
For a Borel measure $\mu$ every $u \in C_c(E)$ turns out to be $\mu$-integrable. For, being continuous, $u$ is Borel measurable. Denoting by $K$ the compact support of $u$, we have $|u| \le \|u\|\,1_K$. Since $\mu$ is a Borel measure, $1_K$ is $\mu$-integrable, and the $\mu$-integrability of $u$ follows. Therefore corresponding to the Borel measure $\mu$ is a linear form $I_\mu$ on $C_c(E)$ defined by
(28.1) $I_\mu(u) := \int u\,d\mu$.
This is an isotone linear form in the sense of (12.3): From $u \le v$ follows $I_\mu(u) \le I_\mu(v)$. Because of the linearity of $I_\mu$ this is equivalent to
$0 \le u \in C_c(E) \Rightarrow I_\mu(u) \ge 0$,
linear forms $I_\mu$ arising from Borel measures $\mu$ on $J$, there are no other positive linear forms on $C_c(J) = C(J)$. One of our goals is to show that every locally compact space $E$ shares this property with $J$. The result in question will, in view of this pioneering work, be called the Riesz representation theorem. En route to it we will naturally be led to the construction of Radon measures on $E$.
Besides the locally compact space $E$, let now a positive linear form
$I : C_c(E) \to \mathbb{R}$
be given. What follows will prepare the way for the proof of the Riesz representation theorem.
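For every compact $K \subset E$ we set
(28.2) $\mu_*(K) := \inf\{I(u) : 1_K \le u \in C_c(E)\}$.
Moreover, the mapping $K \mapsto \mu_*(K)$ is obviously isotone on the system $\mathscr{K}$ of all compact sets. For an arbitrary $A \in \mathscr{P}(E)$ we set
(28.4) $\mu_*(A) := \sup\{\mu_*(K) : K \text{ compact},\ K \subset A\}$.
Because of the above noted isotoneity of $\mu_*$ on $\mathscr{K}$, this new definition is consistent with (28.2). Finally, for $A \in \mathscr{P}(E)$ we define
(28.5) $\mu^*(A) := \inf\{\mu_*(U) : U \text{ open},\ U \supset A\}$.
Then
(28.6) $\mu_*(A) \le \mu^*(A)$ for all $A \in \mathscr{P}(E)$,
as follows from the obvious fact that $\mu_*(A) \le \mu_*(U)$ for every open $U \supset A$; and
(28.7) $\mu_*(U) = \mu^*(U)$ for all open $U \subset E$,
which follows from (28.5) and the isotoneity of $\mu_*$. Somewhat more effort is required to check that
(28.8) $\mu_*(K) = \mu^*(K)$ for all $K \in \mathscr{K}$.
For every $\varepsilon > 0$ definition (28.2) supplies a $u \in C_c(E)$ with $u \ge 1_K$ and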
U.
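If therefore $L$ is a compact subset of $U_\alpha$, then $1_L \le \alpha u$ and so from (28.2) $\mu_*(L) \le \alpha I(u)$. From definition (28.4) therefore
$0 \le \mu_*(U_\alpha) - \mu_*(K) \le \alpha(\mu_*(K) + \varepsilon) - \mu_*(K) = (\alpha - 1)\mu_*(K) + \alpha\varepsilon$.
As $\alpha \downarrow 1$ this majorant converges to $\varepsilon$, which shows that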
(28.9)
n=1
A*(U1U...UUn)
ps(U1)+...+AV.).
U1 + U2,
It suffices to settle the case $n = 2$, as induction then takes care of the rest. If $K$ is a compact subset of $U_1 \cup U_2$, then 27.4 provides compact $K_j \subset U_j$, $j = 1, 2$, such that $K = K_1 \cup K_2$. Then by the result of our first step
that
for every n E N.
2-11e
"EN
of $U$, then $K \subset U_1 \cup \dots \cup U_n$ for sufficiently large $n$. From this it follows that
$\mu_*(K) = \mu^*(K) \le \mu^*(U_1 \cup \dots \cup U_n) \le \sum_{j=1}^{n} \mu^*(U_j) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,
where we used the second step. As this last inequality is satisfied by every compact subset $K$ of $U$, definition (28.4) and equation (28.7) give
$\mu_*(U) = \mu^*(U) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,
and since $Q \subset U$ we will then have as well
00
j=
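$u \ge 1_{K_1 \cup K_2} = 1_{K_1} + 1_{K_2}$.
According to 27.3 there is a $v \in C_c(E)$ with $0 \le v \le 1$, $v(K_1) = \{1\}$, and $\operatorname{supp}(v) \subset \complement K_2$, hence with $v(K_2) = \{0\}$. The functions $vu$ and $(1 - v)u$ lie in $C_c(E)$ and satisfy
$vu \ge 1_{K_1}$ and $(1 - v)u \ge 1_{K_2}$.
Therefore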
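The Borel measure $\mu^* \mid \mathscr{B}(E)$ has a series of further remarkable properties:
28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies
$\mu_*(A) = \mu^*(A)$.
Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that
$Q := (U \setminus A) \cup (U \setminus L)$
then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that
$\mu^*(G) < \varepsilon$.
(28.10) $K \subset A$ and $A \setminus K \subset G$.
$K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A$,
since $L \subset U$, and on the other hand
$A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G$,
since $U \setminus L \subset Q \subset G$. From (28.10) we get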
and so $\mu^*(A) \le \mu_*(K) + \varepsilon \le \mu_*(A) + \varepsilon$. As $\varepsilon > 0$ was arbitrary, this says that $\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof.
The finiteness hypothesis in the preceding theorem can be weakened. In doing so we make use of the terminology introduced just before the proof of Theorem 13.6.
28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$ which has $\sigma$-finite $\mu^*$-measure.
Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite $\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield
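Proof. Since all compact $K$ satisfy $\mu_*(K) = \mu^*(K) < +\infty$, all that has to be proved is that $\mu_* \mid \mathscr{B}(E)$ is a measure, i.e., that $\mu_*$ is countably additive on $\mathscr{B}(E)$. To that end, let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{B}(E)$, whose union is $A$. For every compact $K \subset A$, $K = \bigcup_{n\in\mathbb{N}} (K \cap A_n)$, so from 28.3 and 28.4 we get
$\mu_*(K) = \mu^*(K) = \sum_{n=1}^{\infty} \mu^*(K \cap A_n) = \sum_{n=1}^{\infty} \mu_*(K \cap A_n) \le \sum_{n=1}^{\infty} \mu_*(A_n)$.
In proving the reverse inequality we may assume that $\mu_*(A) < +\infty$, and therefore $\mu_*(A_n) < +\infty$ for every $n \in \mathbb{N}$. There is then, given $\varepsilon > 0$, a compact $K_n \subset A_n$ satisfying
for each $n \in \mathbb{N}$.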
j=1
j=1
j=1
n
> Ep.(Aj) - E
j=1
j=1
j=1
j=1
for every n E N.
holding for every $\varepsilon > 0$. That is, $\mu_*(A) \ge \sum_n \mu_*(A_n)$, the complementary inequality we needed to finish the proof.
We now set
(28.11) $\mu_0 := \mu_* \mid \mathscr{B}(E)$ and $\mu^0 := \mu^* \mid \mathscr{B}(E)$
and, inspired by COURRÈGE [1962], call these the essential measure determined
Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with $0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2) $\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such $K$. It follows that $\mu^0(U) = \mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^0(U)$ is derived as follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set $L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2) of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_0(L) = \mu^0(L) \le \mu^0(U)$. Taking the supremum over eligible $u$ gives finally the desired complementary inequality $\gamma \le \mu^0(U)$.
Exercises.
1. For a locally compact space $E$ and a measure $\mu$ defined on $\mathscr{B}(E)$, show that it is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.
2. Let $\mu$ be a Radon measure on a locally compact space $E$ and $(G_i)_{i\in I}$ a family of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i\in I} G_i$ satisfies
$\mu(G) = \sup\{\mu(G_i) : i \in I\}$.
3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally compact space $E$:
(a) There exists a largest open set $G$ with $\mu(G) = 0$. The set $\complement G$ is called the support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.
(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of $x$ has positive $\mu$-measure.
$I_\mu(u) := \int u\,d\mu$
on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear form $I$ on $C_c(E)$ there is a Borel measure $\mu$ on $E$ such that $I = I_\mu$, that is, such that
$I(u) = \int u\,d\mu$ for all $u \in C_c(E)$?
Any such Borel measure $\mu$ will be called a representing measure for $I$. The answer, leaked earlier, to this question reads:
and because of linearity and the fact that the positive and negative parts of each $u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative $u$. So let such a $u \in C_c(E)$ be given and let the real number $b > 0$ be an upper bound for $u$. For a given $\varepsilon > 0$ choose real numbers $y_0, \dots, y_n$ with
$0 = y_0 < y_1 < \dots < y_n = b$ and $y_j - y_{j-1} < \varepsilon$.
We set
(j = 1,...,n)
and get non-negative continuous functions, each having its support in supp(u),
which satisfy
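(29.2) $u = \sum_{j=1}^{n} u_j$,
such that $y_{j-1} < u(x) \le y_j$. In that case $u_j(x) = u(x) - y_{j-1}$ and $u_k(x) = y_k - y_{k-1}$ for $k < j$ and $u_k(x) = 0$ for $k > j$. Equality (29.2) follows. Next we set
$K_0 := \operatorname{supp}(u)$ and $K_j := \{u \ge y_j\}$ for $j = 1, \dots, n$
and have
(29.3) $(y_j - y_{j-1})\,1_{K_j} \le u_j \le (y_j - y_{j-1})\,1_{K_{j-1}}$ for $j = 1, \dots, n$,
(29.4) $0 \le u_j \le y_j - y_{j-1}$,
(29.5)
(29.6) $K_j \subset \{u_j = y_j - y_{j-1}\}$,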
valid for all $j \in \{1, \dots, n\}$. The left half of (29.7') follows from the left half of (29.3) when account is taken of (28.2) and the fact that $\mu_*(K_j) = \mu^*(K_j) = \mu^0(K_j)$. From (29.5) we have $\operatorname{supp}(u_j) \subset K_{j-1}$. For every open $U \supset K_{j-1}$, the function $v := (y_j - y_{j-1})^{-1} u_j$ is therefore an element of $C_c(E)$ with $\operatorname{supp}(v) \subset U$ and satisfying, by (29.4), $0 \le v \le 1$. From Lemma 28.7 then $I(v) \le \mu^0(U)$ and hence
E(yj - yi-1)(Kj)
E(yj - yj-1)1(Kj-1)
and
j=1
j=1
and consequently
5 n
E( yj - yj -1)Fz(Kj-1 \ Kj),
if
j=1
since Kn C Kn_1 C ... C Ko. Due to the choice of the yj it follows that
Jud1L0-
Eu(KK-1\K3)-F(Ko\K.)<EIp(Ko)
I(u)I <_F,
j=1
The extreme inequality being valid for every $\varepsilon > 0$ and $\mu^0(K_0)$ being finite, the desired equality
(29.8) $I(u) = \int u\,d\mu^0$
emerges.
The measures of the compact sets $K_j$, $j = 0, \dots, n$ do not change, thanks to (28.8), when $\mu^0$ is replaced by $\mu_0$. Another pass through the preceding derivation therefore leads to the conclusion that $\mu_0$ is also a representing measure for $I$. □
Proof. Given $K$ and $U$, consider functions $u, v \in C_c(E)$ with $1_K \le v$, $0 \le u \le 1$, and $\operatorname{supp}(u) \subset U$. Integrating these inequalities,
From (28.2) and Lemma 28.7 therefore the claimed inequalities follow. □
After this preparation we can enhance the statement of the Riesz representation theorem by characterizing the measures $\mu^0$ and $\mu_0$, thereby putting into relief the role of Radon measures.
29.3 Theorem. For every positive linear form $I$ on $C_c(E)$ the associated essential measure $\mu_0$ is the unique Radon measure among the representing measures of $I$.
Proof. Let $\mu$ be a representing measure for $I$ which is inner regular, thus a Radon measure. Since $\mu_0$ is also inner regular, it follows from the first part of the preceding lemma that
In particular then all open $U \subset E$ satisfy $\mu(U) \le \mu_0(U) \le \mu^0(U)$ and when this is combined with the second part of 29.2 we have
$\mu(U) = \mu^0(U)$
(29.9) $\mu(K) = \mu_0(K)$,
valid for every compact $K \subset E$. This fact and the inner regularity of both measures
Proof. Let $\mu$ be an outer regular representing measure. By Lemma 29.2, $\mu^0(U) \le \mu(U)$ holds for all open sets $U$. Since, however, $\mu^0$ is also outer regular, that inequality passes over to Borel sets generally:
employ the adjective "regular" for just those outer regular Borel measures p that
have property (29.10), in contrast to our usage.
The following example shows that in general $\mu^0$ is not the only outer regular representing measure.
Example. 1. Let $E$ be an uncountable set and equip it with the discrete topology. For $I$ take the identically 0 form. Then from the last two theorems it follows that $\mu_0 = \mu^0 = 0$. However the measure $\mu$ from Example 6 of §25 is an outer regular representing measure which is not identically 0.
equality $\mu_0 = \mu^0$ on $\mathscr{B}(E)$. The first imposes conditions on the space $E$, but none on the linear form $I$.
We already know, for example, that for a compact space $E$ the representing measures $\mu_0$ and $\mu^0$ determined by a given positive linear form $I$ on $C_c(E)$ coincide. This follows immediately from Theorem 28.4. The reasons that underlie this need to be examined more closely.
29.6 Theorem. If the locally compact space $E$ is countable at infinity, then the representing measures $\mu_0$ and $\mu^0$ determined by any positive linear form $I$ on $C_c(E)$ coincide.
A simple consequence is:
Proof. First of all there is a sequence $(K_n)$ of compact sets such that $K_n \uparrow E$. Using Corollary 27.3 we find $0 \le u_n \in C_c(E)$ with $u_n \uparrow 1_E$. But then the sets
$L_n := \{u_n \ge 1/n\}$, $n \in \mathbb{N}$,
(29.11)
Here $\|f\|$ denotes the supremum norm of any bounded real function $f$ on $E$. The requirement (29.11) means that $I$ is continuous with respect to the metric (of uniform convergence) in $C_c(E)$ derived from this norm.
Remark. 2. If the space $E$ is compact, then every positive linear form $I$ on $C_c(E)$
$\|\mu_0\| = \sup\{I(u) : 0 \le u \le 1,\ u \in C_c(E)\}$.
Since $0 \le u \le 1$ entails $\|u\| \le 1$, (29.11) says that $0 \le I(u) \le M\|u\| \le M$, and so
$\|\mu_0\| \le M < +\infty$.
I,,(-)I = if
<
Remarks. 3. From the proof of Theorem 29.10 it also follows that the total mass $\|\mu\|$ of $\mu$ is the smallest real number $M \ge 0$ that can serve in Definition 29.9.
the space $E$ is even compact: It is the interval $[1, \Omega]$ of all ordinal numbers not greater than the first uncountable ordinal $\Omega$, equipped with the order topology. The positive linear form $I_{\varepsilon_\Omega}$ on $C([1, \Omega])$ defined by the Dirac measure $\varepsilon_\Omega$ has a representing measure $\mu$ which is neither inner regular nor outer regular. Thus $\int f\,d\varepsilon_\Omega = \int f\,d\mu$ for all $f \in C([1, \Omega])$ although $\mu \ne \varepsilon_\Omega$. Details can be found in PFEFFER [1977], p. 116.
29.12 Theorem. If the locally compact space $E$ has a countable base for its topology, then every Borel measure on $E$ is regular, hence in particular a Radon measure.
Proof. Let $\mu$ be a Borel measure, $I_\mu$ the associated positive linear form on $C_c(E)$ and $\mu^0$ the principal representing measure for $I_\mu$. Along with $E$ each of its open subspaces $U$ also has a countable base. From Example 2 therefore $U$ is countable at infinity; there exists a sequence $(K_n)$ of compact sets such that $K_n \uparrow U$. Since the measures $\mu$, $\mu^0$ are continuous from below, it follows that
$\mu(U) = \mu^0(U)$,
For an arbitrary Borel set $A$ and open $U \supset A$ we then have $\mu(A) \le \mu(U) = \mu^0(U)$ and so, on account of the outer regularity of $\mu^0$,
(29.13)
$\mu(A) = \mu^0(A)$,
(29.14) $\mu(B) = \lim_{n\to\infty} \mu(B \cap L_n) = \lim_{n\to\infty} \mu^0(B \cap L_n) = \mu^0(B)$.
That is, $\mu$ and $\mu^0$ coincide throughout $\mathscr{B}(E)$. Since the essential measure $\mu_0$ is a representing measure for $I_\mu$, this fact insures (as does Theorem 29.6, for that matter) that $\mu_0 = \mu^0$. From the double equality $\mu = \mu^0 = \mu_0$ follows finally the regularity of $\mu$. □
In this situation the Riesz representation theorem can therefore be expressed thus:
29.13 Corollary. For a locally compact space $E$ whose topology has a countable base, every positive linear form $I$ on $C_c(E)$ can be represented as
$I(u) = \int u\,d\mu$, $u \in C_c(E)$,
by exactly one Borel measure $\mu$ on $E$.
Example. 4. For each $u \in C_c(\mathbb{R})$ choose real numbers $\alpha < \beta$ such that $\operatorname{supp}(u) \subset [\alpha, \beta]$ and define
$L(u) := \int_\alpha^\beta u(x)\,dx$,
the integral being the usual Riemann integral; it is independent of the specific numbers $\alpha$ and $\beta$ used. Evidently $L$ is a positive linear form on $C_c(\mathbb{R})$. According to 16.4 L-B measure $\lambda^1$ represents $L$, and by 29.13 it is the only representing measure.
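The independence of $L(u)$ from the enclosing interval can be seen in a quick numerical sketch. The tent function `tent` and the midpoint rule `riemann_midpoint` below are assumptions made for the illustration; they are not part of the text. The step width is chosen so that the kinks of the tent function fall on cell boundaries, which makes the midpoint rule exact up to rounding.

```python
def tent(x):
    # continuous, compactly supported: supp(tent) = [-1, 1], integral 1
    return max(0.0, 1.0 - abs(x))

def riemann_midpoint(u, alpha, beta, n):
    # midpoint Riemann sum of u over [alpha, beta] with n equal cells
    h = (beta - alpha) / n
    return h * sum(u(alpha + (k + 0.5) * h) for k in range(n))

narrow = riemann_midpoint(tent, -1.0, 1.0, 2000)  # [alpha, beta] = supp(u)
wide = riemann_midpoint(tent, -3.0, 5.0, 8000)    # a larger interval containing supp(u)
print(narrow, wide)
```

Both values agree (and approximate $1$), since $u$ contributes nothing outside its support.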
Remark. 5. It is also possible to deduce Theorem 29.12 from Theorem 26.3 and its Corollary 26.4 because every locally compact space $E$ whose topology has a countable basis is Polish. In fact along with $E$, its one-point compactification $E'$ also has a countable base, as follows from Lemma 29.8 and the commentary after it. It will be shown in Remark 3 of §31 that $E'$ is consequently metrizable, and completeness of the metric follows easily from compactness (cf. Example 6, §26). Thus $E'$ is Polish and $E$ is an open subset of it. Therefore according to Example 4, §26 $E$ itself is Polish.
Summarizing, we can say that for every locally compact space $E$, the mapping that associates to each Radon measure $\mu$ on $E$ the positive linear form $I_\mu$ on $C_c(E)$ is a bijection between the set of Radon measures on $E$ and the set of positive linear forms on $C_c(E)$. That is the reason why in BOURBAKI [1965] the positive linear forms on $C_c(E)$ are themselves designated as (positive) Radon measures.
29.14 Theorem. For any regular Borel measure $\mu$ on a locally compact space $E$ and any $p \in [1, +\infty[$, the vector space $C_c(E)$ is dense in $\mathscr{L}^p(\mu)$ with respect to convergence in $p$-th mean.
Proof. First of all, $C_c(E) \subset \mathscr{L}^p(\mu)$, because $C_c(E) \subset \mathscr{L}^1(\mu)$ by (28.1) and $|u|^p \in C_c(E)$ whenever $u \in C_c(E)$. The denseness claim requires that for each $f \in \mathscr{L}^p(\mu)$ and each number $\varepsilon > 0$, a function $u \in C_c(E)$ be produced with
$N_p(f - u) := \left(\int |f - u|^p\,d\mu\right)^{1/p} < \varepsilon$.
We accomplish this by a stepwise simplification of the function $f$ to be approximated. Since along with $f$, both $f^+$ and $f^-$ are in $\mathscr{L}^p(\mu)$, and $N_p$ is a semi-norm, we can assume that $f \ge 0$. By 11.3 and 11.6 there is an isotone sequence $(f_n)$ of $\mathscr{B}(E)$-elementary functions such that $f_n \uparrow f$. All these functions also lie in $\mathscr{L}^p(\mu)$, due to $0 \le f_n \le f$. Therefore from the dominated convergence theorem
$\lim_{n\to\infty} N_p(f - f_n) = 0$.
In particular, $\mu(U) < +\infty$. Therefore the inner regularity of $\mu$ insures that for some compact $K \subset U$
that is,
$0 \le 1_U - u \le 1_U - 1_K$
and so
Exercises.
1. Let $E$ be an uncountable discrete space. Using the Borel measure from Example 6 in §25, show that every positive linear form on $C_c(E)$ has at least two different representing measures. This sharpens Example 1 of this section.
2. Let $E$ be a locally compact space and $I$ a positive linear form on $C_c(E)$. With the help of the Riesz representation theorem prove the following refinement of equality (28.12): For every open $U \subset E$
that there is exactly one finite Radon measure $\mu$ on $E$ such that $I(f) = \int f\,d\mu$ for every $f \in C_0(E)$. (Hints: Indirect proof. Or: For every $\varepsilon > 0$ and non-negative $f \in C_0(E)$ there is a $u \in C_c(E)$ with $|f - u| \le$
7. Let $E_1$, $E_2$ be the interval $[0, 1]$ equipped with the discrete topology, respectively, the usual euclidean topology, and consider the product space $E = E_1 \times E_2$. Show that
(a) $E$ is locally compact.
(b) Every product
$E_x := \{x\} \times [0, 1]$, $0 \le x \le 1$,
is a compact subspace of $E$, which is also open in $E$.
(c) A set $U \subset E$ is open if and only if $U \cap E_x$ is open for each $x \in [0, 1]$.
(d) Every compact subset of $E$ is covered by finitely many of the sets $E_x$.
Now consider $u \in C_c(E)$. By (d) $u$ vanishes in the complement of the union of finitely many $E_x$ sets, and for each fixed $x$, $y \mapsto u(x, y)$ is a continuous function on the compact interval $E_2 = [0, 1]$. Therefore
$I(u) := \sum_{0 \le x \le 1} \int_0^1 u(x, y)\,dy$
is a well defined finite sum, evidently a positive linear form on $C_c(E)$. Show that
(e) The essential and the principal representing measures for $I$ do not coincide. [Hint: Show that the set $A := E_1 \times \{0\}$ is closed and that $\mu_0(A) = 0$, while $\mu^0(A) = +\infty$.]
(f) In passing from $\mu^0$ to the Borel measure $1_B\mu^0$ for $B \in \mathscr{B}(E)$ outer regularity may be lost. [It suffices to consider $B := E \setminus A$, for the set $A$ in the preceding hint.]
With $\mu, \nu \in \mathscr{M}_+(E)$ and real numbers $\alpha \ge 0$, $\beta \ge 0$ the measure $\alpha\mu + \beta\nu$ also lies in $\mathscr{M}_+(E)$, as is easily checked. That is, $\mathscr{M}_+(E)$ is what is called a convex cone. Besides $\mathscr{M}_+(E)$ we often consider the following subsets
In $\mathscr{M}_+^1(E)$ are to be found all the Dirac measures on $E$. And $\mathscr{M}_+^b(E)$ is a convex subcone of $\mathscr{M}_+(E)$.
In the special case $E = \mathbb{R}^d$ the set $\mathscr{M}_+^b(\mathbb{R}^d)$ is the set of all finite Borel measures on $\mathbb{R}^d$, already familiar to us from §24. That the definition there is equivalent to the present one is due to Theorem 29.12, according to which every Borel measure on $\mathbb{R}^d$ is a Radon measure.
Depending on whether one thinks of the elements of $\mathscr{M}_+(E)$ as measures on $\mathscr{B}(E)$ or as positive linear forms on $C_c(E)$, two notions of convergence suggest themselves: One can define the convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ to $\mu \in \mathscr{M}_+(E)$
n-+oo
or
lim
n-+oo
f dp = J f dp
J
We will forthwith show that the first of these is of limited interest, while the second
is of considerable significance.
lim
-oo
and
udp and
vd < liimianfn(G).
From these inequalities (30.2) follows via (28.2) and (28.12). One only has to recall that the Radon measure $\mu$ coincides, thanks to Theorem 29.3, with the essential measure $\mu_0$ determined by the linear form $I_\mu$.
Now suppose conversely that condition (30.2) is fulfilled and that an $f \in C_c(E)$ has been given. Since our goal is to confirm (30.1), we lose no generality by assuming that $f \ge 0$. For a pre-assigned $\varepsilon > 0$ we choose finitely many numbers
$0 = y_0 < y_1 < \dots < y_k$
with $y_k > \|f\|$ and $y_j - y_{j-1} = \varepsilon$ for each $j = 1, \dots, k$. Set
j = 1,...,k.
(j=1,...,k).
A =Kj-1\Kj
Because of the obvious inequalities
k
j=1
j=1
yjv(A,),
from which and a simple calculation using the facts v(A,) = v(Ki_1) -v(K,) and
yi - yi _ 1 = e, we get
k
i=
i=o
i=o
Jfd/<EJL(Kj)
for all n E M,
i=o
limsopJ fdn<eE(K1)
i=o
But this right-hand side can be estimated by using the left end of the earlier chain
of inequalities, with v:= . We thereby get
lim sup
f f dn < r f d + e(K),
Jfd n <
ffd.
f fd<liminf f fdn
and we get it by an analogous procedure, using the second half of hypothesis (30.2). One sets $G_j := \{f > y_j\}$, $j = 0, \dots, k$, which are open, relatively compact subsets of $K$ with
30.3 Lemma. If the sequence $(\mu_n)_{n\in\mathbb{N}}$ of Radon measures on the locally compact space $E$ converges vaguely to the measure $\mu \in \mathscr{M}_+(E)$, then the associated total masses satisfy
(30.3)
<u<1
Proof. For every u E CA(E) with 0JudlLn
< -IIunhI
Vague convergence of sequences in $\mathscr{M}_+(E)$ is convergence in a certain topology on $\mathscr{M}_+(E)$, called, naturally, the vague topology. It is defined as the coarsest topology on $\mathscr{M}_+(E)$ with respect to which all the mappings
(30.4) $\mu \mapsto \int f\,d\mu$ ($f \in C_c(E)$)
J fi
in which $n \in \mathbb{N}$, $0 < \varepsilon \in \mathbb{R}$ and $f_1, \dots, f_n \in C_c(E)$ are all arbitrary. The vague topology is Hausdorff because the uniqueness aspect of Riesz's theorem says that if $\mu$, $\nu$ are different Radon measures, then $I_\mu \ne I_\nu$, which just means that $\int f\,d\mu \ne$
lim
t =
10
tE A
forevery f E C(E).
tEA
that
lim KrAd = e0
(30.7)
r-a+oC
.F
this and the Lebesgue dominated convergence theorem the claim (30.7) follows upon checking that, on the one hand
r-++oo
and on the other hand for all real r > 0 and all x E Rd
has no identity element with respect to convolution, but it is not hard to show that $\|K_r * f - f\|_1 \to 0$ as $r \to +\infty$ for each $f \in L^1(\lambda^d)$, and in many situations this is almost as useful as having an identity.
To $\mathscr{M}_+^b(E)$ belong in particular all discrete Radon measures on $E$. These are the measures $\delta$ which can be represented in the form
$\delta = \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}$,
for some finite number of points $x_1, \dots, x_k \in E$ and non-negative real numbers $\alpha_1, \dots, \alpha_k$. Every $\delta$ admits many such representations. Every Radon measure can be approximated, in the sense of the vague topology, by such $\delta$, as we next show.
30.4 Theorem. For every locally compact space $E$ the set of discrete Radon measures on $E$ is dense in $\mathscr{M}_+(E)$ in the vague topology.
Proof. Let a measure $\mu_0 \in \mathscr{M}_+(E)$ and a vague neighborhood $V$ of $\mu_0$ be given. As noted after (30.5), we can suppose $V$ is $V_{f_1,\dots,f_n;1}(\mu_0)$ for some non-zero $f_1, \dots, f_n \in C_c(E)$. We have to find a discrete measure $\delta$ in $V$. To that end, consider the compact set
$K := \bigcup_{j=1}^{n} \operatorname{supp}(f_j)$
and $\eta > 0$ such that $\eta\mu_0(K) < 1$. Every $y \in K$ has an open neighborhood $U_y$ in $E$ such that $|f_j(y') - f_j(y'')| < \eta$ for all $y', y'' \in U_y$ and all $j \in \{1, \dots, n\}$. Finitely many $U_y$, say $U_{y_1}, \dots, U_{y_k}$, suffice to cover $K$. Set
$A_1 := K \cap U_{y_1}$, $A_2 := (K \cap U_{y_2}) \setminus A_1$, ..., $A_k := (K \cap U_{y_k}) \setminus (A_1 \cup \dots \cup A_{k-1})$.
These are pairwise disjoint, relatively compact Borel sets whose union is $K$, and for all $j \in \{1, \dots, n\}$, $i \in \{1, \dots, k\}$ and $y', y'' \in A_i$ the inequality $|f_j(y') - f_j(y'')| \le \eta$ holds. Since only these properties of the $A_i$ are used in the sequel, we can discard those that are empty (not all are because $\emptyset \ne K = A_1 \cup \dots \cup A_k$), and re-index the others. That is, we can suppose all the $A_i$ are non-empty and then select a point $x_i \in A_i$ for each $i$. The discrete measure
$\delta := \sum_{i=1}^{k} \mu_0(A_i)\,\varepsilon_{x_i}$
(notice that $\mu_0(A_i)$ is finite because $A_i$ is relatively compact) will be shown to lie in $V$ and that will complete the proof. Indeed,
$\left|\int f_j\,d\mu_0 - \int f_j\,d\delta\right| = \left|\sum_{i=1}^{k} \int_{A_i} f_j\,d\mu_0 - \sum_{i=1}^{k} \mu_0(A_i) f_j(x_i)\right| \le \sum_{i=1}^{k} \int_{A_i} |f_j - f_j(x_i)|\,d\mu_0 \le \eta\mu_0(K)$,
using the fact that $|f_j(x) - f_j(x_i)| \le \eta$ for all $x \in A_i$, all $i \in \{1, \dots, k\}$. This holds for each $j \in \{1, \dots, n\}$, and $\eta\mu_0(K) < 1$ by choice of $\eta$. Therefore $\delta \in V_{f_1,\dots,f_n;1}(\mu_0) = V$, as was to be shown. □
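The construction in this proof can be tried out on a toy case. The sketch below assumes $E = K = [0, 1]$, takes $\mu_0$ to be Lebesgue measure on $[0, 1]$, uses $k$ cells $A_i$ of length $1/k$ with $x_i$ their left endpoints, and tests against $f_1(x) = x$ and $f_2(x) = x^2$. All of these choices and the names `discretize`, `integrate_discrete`, `max_error`, `eta` are assumptions for illustration; the exact integrals $1/2$ and $1/3$ are hard-coded.

```python
def discretize(k):
    # point masses (mu_0(A_i), x_i) with A_i = [i/k, (i+1)/k[ and x_i = i/k
    return [(1.0 / k, i / k) for i in range(k)]

def integrate_discrete(f, delta):
    # integral of f against the discrete measure sum_i mass_i * eps_{x_i}
    return sum(mass * f(x) for mass, x in delta)

# test functions f_j together with their exact Lebesgue integrals over [0, 1]
tests = [(lambda x: x, 0.5), (lambda x: x * x, 1.0 / 3.0)]

def max_error(k):
    delta = discretize(k)
    return max(abs(exact - integrate_discrete(f, delta)) for f, exact in tests)

def eta(k):
    # both test functions are Lipschitz with constant <= 2 on [0, 1],
    # so their oscillation on a cell of length 1/k is at most 2/k
    return 2.0 / k

for k in (4, 10, 100):
    print(k, max_error(k), eta(k))
```

In each case the error stays below $\eta\mu_0(K) = \eta$, which is the estimate that places $\delta$ in the neighborhood $V$ once $\eta\mu_0(K) < 1$.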
30.5 Corollary. The discrete p-measures on $E$ are dense in $\mathscr{M}_+^1(E)$ in the vague topology.
Proof. We take over the notation of the preceding proof. Now $\mu_0$ is a measure in $\mathscr{M}_+^1(E)$, but the discrete measure $\delta = \sum \mu_0(A_i)\,\varepsilon_{x_i}$ may not be a p-measure, so more work is required. Set $\alpha_i := \mu_0(A_i)$, $i = 1, \dots, k$. If $K = E$ (in which case $E$ had to be a compact space), then $\alpha_1 + \dots + \alpha_k = \mu_0(K) = 1$ and $\delta$ actually is a p-measure. In general what we have is
ak+l
which is non-negative. Then
is a discrete p-measure with $\int f_j\,d\delta = \int f_j\,d\delta'$ for each $j = 1, \dots, n$, since $x_{k+1}$ lies outside the supports of all these functions. Consequently, $\delta' \in V = V_{f_1,\dots,f_n;1}(\mu_0)$
Recall in this connection that for a measure $\mu \in \mathscr{M}_+^b(E)$, every $f \in C_b(E)$ is $\mu$-integrable: it is $\mathscr{B}(E)$-measurable and its modulus is majorized by a real constant, hence $\mu$-integrable, function. We will formulate the relevant results for sequences only; their extensions to mappings $t \mapsto \mu_t$ are routine.
f dn =
Jfd.
so is a finite measure. Definition 27.5 says that for each e > 0 there is a g =
gf E CA(E) such that 11f - g11 5 e. Therefore
for each n E N
if
and
if f du
< ae,
if f dn - f f d
- jgd.Uj
for all n E N.
valid for every e > 0. That is, the limit exists and is 0. 0
Remarks. 1. If one considers measures $\mu_n$ and $\mu \in \mathscr{M}_+^b(E)$ without the hypothesis $\sup \|\mu_n\| < +\infty$, the above conclusion can fail. The special case of Example 2 in
for $x \ne 0$, $f(0) := 1$
lies in $C_0(\mathbb{R})$. But $\int f\,d\mu_n = 1$ for every $n \in \mathbb{N}$, while $\int f\,d\mu = 0$, because here the vague limit $\mu$ is the 0-measure.
2. Example 2, again with $E := \mathbb{R}$ and $x_n := n$ for all $n$, considered earlier, but this time with the constant sequence $\alpha_n := 1$, shows that indeed $\lim \int f\,d\varepsilon_{x_n} = \int f\,d\mu$ for the measure $\mu := 0$ and all $f \in C_0(\mathbb{R})$, but this equality is already false for the constant function $f := 1_E$ in $C_b(\mathbb{R})$.
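Remark 2 can be replayed in a few lines. The sketch uses that integrating against the Dirac measure $\varepsilon_{x}$ is evaluation at $x$; the particular function `f_c0` $= 1/(1 + x^2)$ is an assumed representative of $C_0(\mathbb{R})$, and `one` is the constant function $1_E$.

```python
def integral_against_dirac(f, point):
    # the integral of f with respect to the Dirac measure at `point` is f(point)
    return f(point)

def f_c0(x):
    # vanishes at infinity, so it belongs to C_0(R)
    return 1.0 / (1.0 + x * x)

def one(x):
    # bounded and continuous, but not in C_0(R)
    return 1.0

for n in (1, 10, 1000):
    print(n, integral_against_dirac(f_c0, n), integral_against_dirac(one, n))
```

The first column of integrals tends to $0 = \int f\,d\mu$ for $\mu = 0$, while the second stays at $1$, so the limit relation fails for $f = 1_E$.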
The passage from Co(E) to Cb(E) therefore calls for a special investigation,
which we stress by introducing a new definition:
(30.8)
n-+00
JfdP=Jfdp
(iii) For every e > 0 there exists a compact subset K = K, of E such that
(E\K)<e
forallnEN.
that IIpnII < a+e for all n > no. Consequently, pn(Ko) > pn(G) > a > IIII -e,
so that p.n (E \ KO) < e, for all n > no. For each n E { 1, ... , no) inner regularity
and finiteness of pn give us a compact K C E such that pn(E \ Kn) < e. The
compact set K := Ko U K1 U ... U Kn0 then satisfies (iii).
Given e > 0, let K = K, be as described. Again from (30.2) we have
forallnEN
J(i- u)fd/)
<- If 11.
if
Jfdl
<2IIfIIE+11ufd/Ln- Juid l
limsup if f dILf
-1 f dl s 2IIf1I e,
valid for every e > 0. That is, this limit exists and equals 0, for every f E Cb(E).
Which proves (i). 0
At this point it is worth returning once more to Theorem 30.2. If the measures $\mu$, $\mu_n$ there are all finite and of the same total mass, e.g., if they are all p-measures, then the two components of the compound condition (30.2) become equivalent. The result is the following portmanteau-theorem:
30.10 Theorem. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathscr{M}_+^1(E)$. Then the following three assertions are equivalent:
(i) The sequence $(\mu_n)_{n\in\mathbb{N}}$ converges vaguely (and therefore also weakly) to $\mu$.
(ii) For every closed $F \subset E$
(30.9)
Proof. The first paragraph of the proof of 30.2 actually established that (i)$\Rightarrow$(iii), under the less restrictive hypotheses prevailing there. Since that theorem further shows that the conjunction of (ii) and (iii) implies (i), it only remains to establish
the equivalence of (ii) and (iii). That follows from the trivial observation that
Example 1 in this section shows that the weak convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+^b(E)$ to a $\mu \in \mathscr{M}_+^b(E)$ does not imply the convergence of $(\int f\,d\mu_n)$ to $\int f\,d\mu$ for every bounded Borel measurable function $f$. Nevertheless the continuity of the functions $f$ which define weak convergence can be relaxed somewhat. To this end, we consider bounded, real-valued, Borel measurable functions $f$ on $E$ which are $\mu$-almost everywhere continuous for a $\mu \in \mathscr{M}_+^b(E)$: After excision of a $\mu$-nullset $N \in \mathscr{B}(E)$, $f$ is continuous at each point of $E \setminus N$. Important examples of such are the indicator functions of boundaryless Borel sets. The latter are defined as follows:
30.11 Definition. A Borel subset $Q$ of a locally compact space $E$ is called boundaryless with respect to a measure $\mu \in \mathscr{M}_+^b(E)$, $\mu$-boundaryless (or $\mu$-quadrable) for short, if the boundary $Q^* := \overline{Q} \setminus \mathring{Q}$ of $Q$ is a $\mu$-nullset:
(30.10) $\mu(Q^*) = 0$.
only if a E E \ Q*. Look back at Example 1 with this observation and the following
theorem in mind.
it E J4 (E). Then
(30.11)
lim
n-,00
JfdPn=JfdP
holds for every bounded Borel measurable function f that is p-almost everywhere
continuous on E. In particular,
(30.12)
n-,OC
p(Eo\K)<e.
Every $x \in K$ has an open neighborhood $U_x$ on which the oscillation of $f$ is at most $\varepsilon$, meaning that
$|f(y_1) - f(y_2)| \le \varepsilon$ for all $y_1, y_2 \in U_x$.
Choose a compact neighborhood $V_x$ of $x$ with $V_x \subset U_x$ and then use the compactness of $K$ to find finitely many points $x_1, \dots, x_n \in K$ such that $V_{x_1}, \dots, V_{x_n}$ cover $K$. If we now set
$\alpha := \inf f(E)$, $\beta := \sup f(E)$, $\alpha_j := \inf f(U_{x_j})$, $\beta_j := \sup f(U_{x_j})$
for $j = 1, \dots, n$, then for each such $j$ there exist functions $g_j, h_j \in C_b(E)$ satisfying
$g_j(x) = \alpha_j$ if $x \in V_{x_j}$, $g_j(x) = \alpha$ if $x \in \complement U_{x_j}$, and $h_j(x) = \beta_j$ if $x \in V_{x_j}$, $h_j(x) = \beta$ if $x \in \complement U_{x_j}$,
as well as
$\alpha \le g_j \le \alpha_j \le \beta_j \le h_j \le \beta$.
This follows at once from 27.3 and the application of an appropriate affine transformation in the range space $\mathbb{R}$. From these properties and definitions it follows in particular that $g_j \le f \le h_j$ for all $j$. Therefore if we set
$0 \le h(x) - g(x) \le \varepsilon$ for all $x \in K$.
For each $x \in K$ lies in some $V_{x_j} \subset U_{x_j}$, and because of the way $U_{x_j}$ was chosen with respect to the oscillation of $f$, it follows that $h(x) - g(x) \le h_j(x) - g_j(x) = \beta_j - \alpha_j \le \varepsilon$. We are now in a position to finish the proof, as follows:
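$\int (h - g)\,d\mu = \int_K (h - g)\,d\mu + \int_{E \setminus K} (h - g)\,d\mu \le \varepsilon\mu(K) + (\beta - \alpha)\mu(E \setminus K) \le \varepsilon(\mu(E) + \beta - \alpha)$
and, because $g \le f \le h$ and $g, h \in C_b(E)$, the weak convergence hypothesis gives
$\int g\,d\mu = \lim_{n\to\infty} \int g\,d\mu_n \le \liminf_{n\to\infty} \int f\,d\mu_n \le \limsup_{n\to\infty} \int f\,d\mu_n \le \lim_{n\to\infty} \int h\,d\mu_n = \int h\,d\mu$.
Of course we also have $\int g\,d\mu \le \int f\,d\mu \le \int h\,d\mu$. Putting all this together shows that any pair of the numbers $\int f\,d\mu$, $\liminf \int f\,d\mu_n$ and $\limsup \int f\,d\mu_n$ differ by at most $\varepsilon(\mu(E) + \beta - \alpha)$. Since $\varepsilon > 0$ is arbitrary, (30.11) holds. □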
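The role of the boundary condition in Definition 30.11 shows up already for Dirac measures. The sketch below assumes the sequence $\mu_n := \varepsilon_{1/n}$, which converges weakly to $\mu := \varepsilon_0$ because $f(1/n) \to f(0)$ for continuous $f$; this example and the helper `dirac` are chosen here for illustration. Measures are evaluated only on rays $]-\infty, x]$.

```python
def dirac(point):
    # Dirac measure at `point`, evaluated on left rays ]-inf, x]
    def measure_of_left_ray(x):
        return 1.0 if point <= x else 0.0
    return measure_of_left_ray

mu = dirac(0.0)

def mu_n(n):
    return dirac(1.0 / n)

# Q = ]-inf, 0]: boundary {0} has mu-measure 1, so Q is not mu-boundaryless
bad = [mu_n(n)(0.0) for n in (1, 10, 1000)]
# Q = ]-inf, 1/2]: boundary {1/2} is a mu-nullset, so Q is mu-boundaryless
good = [mu_n(n)(0.5) for n in (2, 10, 1000)]

print(bad, mu(0.0))
print(good, mu(0.5))
```

For the boundaryless set the values $\mu_n(Q)$ agree with $\mu(Q) = 1$, whereas for $Q = ]-\infty, 0]$ they stay at $0$ although $\mu(Q) = 1$.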
Let us now look at an application of this theorem which relates the vague convergence of p-measures on the number line to their Theorem 6.6 description in terms of distribution functions. This is the way that weak (and hence vague) convergence made its original historical appearance.
30.13 Theorem. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathscr{M}_+^1(\mathbb{R})$, that is, probability measures on $\mathscr{B}^1$, and $F, F_1, F_2, \dots$ their distribution functions. If the sequence $(\mu_n)_{n\in\mathbb{N}}$
n +0
] - oo, x] = Qx = n
Q=+1 /k
kEN
and therefore
be given. First of all, (6.13) supplies numbers $a < b$ such that $F(a) < \varepsilon$ and $1 - F(b) < \varepsilon$. The uniform continuity of $F$ on the compact interval $[a, b]$ insures that points $a = x_0 < x_1 < \dots < x_k = b$ exist such that
$F(x_j) - F(x_{j-1}) < \varepsilon$ for $j = 1, \dots, k$.
From what has already been proven we know that there exists $n_\varepsilon \in \mathbb{N}$ such that
But then, as we will show, the inequality $|F_n(x) - F(x)| < 2\varepsilon$ prevails for every $x \in \mathbb{R}$ and all $n \ge n_\varepsilon$, which proves the uniform convergence of $(F_n)$ to $F$. For if $x < x_0$, then
$0 \le F(x) \le F(x_0) < \varepsilon$ and $0 \le F_n(x) \le F_n(x_0) < F(x_0) + \varepsilon < 2\varepsilon$,
that is, $|F_n(x) - F(x)| < 2\varepsilon$. And a similar argument works if $x > x_k$. The remaining $x$ fall into $[x_{j-1}, x_j[$ for an appropriate $j \in \{1, \dots, k\}$, so
$F(x_{j-1}) - \varepsilon < F_n(x_{j-1}) \le F_n(x) \le F_n(x_j) < F(x_j) + \varepsilon < F(x_{j-1}) + 2\varepsilon$,
confirming that in this case too $|F_n(x) - F(x)| < 2\varepsilon$.
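The uniform convergence of distribution functions toward a continuous limit can be observed numerically. The sketch assumes centered normal laws with standard deviations $1 + 1/n$, which converge weakly to the standard normal law; this family and the names `normal_cdf`, `sup_gap` are assumptions for the example. The supremum over $\mathbb{R}$ is approximated on a grid of $[-10, 10]$.

```python
import math

def normal_cdf(x, sigma=1.0):
    # distribution function of the centered normal law with standard deviation sigma
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def sup_gap(n, half_width=10.0, steps=4000):
    # grid approximation of sup_x |F_n(x) - F(x)| with sigma_n = 1 + 1/n
    sigma_n = 1.0 + 1.0 / n
    pts = [-half_width + 2.0 * half_width * k / steps for k in range(steps + 1)]
    return max(abs(normal_cdf(x, sigma_n) - normal_cdf(x)) for x in pts)

for n in (10, 100, 1000):
    print(n, sup_gap(n))
```

The sampled gaps shrink as $n$ grows, which is what the $2\varepsilon$ estimate of the proof guarantees for every continuous limit $F$.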
Remarks. 4. At a point x E R of discontinuity of F limit relation (30.13) generally
fails, as the example Ee :=
n E N, confirms.
if $E$ is a Polish space (or even just a metric space) if the measures involved in Definition 30.7 are all finite Borel measures on $E$. Only the uniqueness of limits calls for discussion:
30.14 Lemma. Finite Borel measures $\mu$ and $\nu$ on a metric space $E$ are equal if $\int f\,d\mu = \int f\,d\nu$ for all $f \in C_b(E)$.
Proof. Let $d$ be a metric giving the topology of $E$ and consider closed subsets $F \subset E$. Suppose we can always find a sequence $(f_n)$ in $C_b(E)$ with $f_n \downarrow 1_F$. Then it would follow from the hypothesis and from Lebesgue's dominated convergence
remains valid in this new situation. In the proof one merely has to secure the existence of the needed functions $g_j$ and $h_j$ somewhat differently: To this end one engages Urysohn's lemma (WILLARD [1970], p. 102 or KELLEY [1955], p. 115).
7. Weak convergence in the set of finite Radon measures on a Polish or a locally compact space $E$ derives from a topology in the same way that vague convergence does. It is called, naturally, the weak topology and it is defined by letting $C_b(E)$ take over the role of $C_c(E)$ in (30.4).
Weak convergence in (non-locally compact) Polish spaces plays only a marginal
role in this book, but is thoroughly investigated in BILLINGSLEY [1968] and PARTHASARATHY [1967].
Exercises.
1. Let E be a locally compact space, (n)fEN a sequence in ..Wb(E) which is
vaguely convergent to E . +(E). If 11I.11 !5 1 for every n E N, then R
o.D
exists and equals 1.
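2. Let $(a_n)_{n \in \mathbb{N}}$ be a convergent sequence of real numbers, with $\lim_{n \to \infty} a_n =: a \in \mathbb{R}$.
Further, let $(\alpha_n)_{n \in \mathbb{N}}$ be a sequence of non-negative real numbers such that $\alpha_1 > 0$
and the series $\sum \alpha_n$ is divergent. Then
$\lim_{n \to \infty} \frac{\alpha_1 a_1 + \dots + \alpha_n a_n}{\alpha_1 + \dots + \alpha_n} = a,$
the case in which all $\alpha_n = 1$ being the best known instance. Here is an outline for
a measure-theoretic proof: The equations
$\mu_n := \frac{1}{\alpha_1 + \dots + \alpha_n} \sum_{i=1}^{n} \alpha_i \varepsilon_i, \quad n \in \mathbb{N},$
define p-measures on the discrete space $\mathbb{N}$ which converge vaguely to the zero measure, so that
according to 30.6, $\lim \int f\,d\mu_n = 0$ holds for every $f \in C_0(\mathbb{N})$. The relevant $f$ is
the one defined by $f(n) := a_n - a$.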
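A small numerical experiment illustrates the weighted-mean statement of Exercise 2. It is a sketch only: the sequence $a_n = 2 + 1/n$, the two weight sequences, and the helper name `weighted_means` are assumptions made for this illustration.

```python
def weighted_means(a, alpha):
    """Return the running quotients
    (alpha_1 a_1 + ... + alpha_n a_n) / (alpha_1 + ... + alpha_n)."""
    means, numerator, denominator = [], 0.0, 0.0
    for a_i, alpha_i in zip(a, alpha):
        numerator += alpha_i * a_i
        denominator += alpha_i          # positive from the first step on, since alpha_1 > 0
        means.append(numerator / denominator)
    return means

N = 200_000
a = [2.0 + 1.0 / n for n in range(1, N + 1)]        # a_n converges to a = 2

# All weights equal to 1: the classical arithmetic (Cesaro) means.
cesaro = weighted_means(a, [1.0] * N)

# Harmonic weights 1/n: the series of weights still diverges, but slowly,
# so the weighted means approach 2 much more slowly.
harmonic = weighted_means(a, [1.0 / n for n in range(1, N + 1)])

print(cesaro[-1], harmonic[-1])
```

The slow approach in the second case reflects how slowly the harmonic series diverges; divergence of $\sum \alpha_n$ is what drives the limit.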
3. Let $E$ be a locally compact space and $T$ a subset of $C_c(E)$ with the following
properties: Each compact $K \subset E$ has a relatively compact neighborhood $U$ such
that every $f \in C_c(E)$ with $\operatorname{supp}(f) \subset K$ is uniformly approximable on $E$ by
functions $t \in T$ whose supports lie in $U$; and further, there exists a $t \in T$ with
$0 \le t \le 1$ and $t(K) = \{1\}$. Show that:
(a) A sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ is vaguely convergent if and only if the sequence
$(\int t\,d\mu_n)$ is convergent in $\mathbb{R}$ for every $t \in T$.
(b) For $E := \mathbb{R}$, the set of all continuously differentiable real-valued functions with
compact support is a $T$ with the above properties.
4. With the help of Exercise 3 show that for the functions $f_n(x) := 1 - \sin(nx)$
on $\mathbb{R}$, the sequence $(f_n \lambda^1)_{n \in \mathbb{N}}$ converges vaguely to $\lambda^1$, and deduce from this the
Riemann-Lebesgue lemma:
$\lim_{n \to \infty} \int f(x) \sin(nx)\,dx = 0$
for every $f \in$
6. $\mu, \mu_1, \mu_2, \ldots$ are finite Radon measures on the locally compact space $E$. Show
that condition (30.12) is also sufficient for weak convergence; that is, from
$\lim \mu_n(Q) = \mu(Q)$ for every $\mu$-boundaryless set $Q \subset E$ follows the weak convergence of
$(\mu_n)$ to $\mu$. This is also true if $E$ is a Polish space. [Hints: Imitate the proof
of Theorem 11.6 and show with the help of Exercise 5 that every $0 \le f \in C_b(E)$
is the uniform limit on $E$ of an isotone sequence $(u_n)$ in the vector space spanned
by the indicator functions of the sets in -90-1
7. As an application of Exercise 6 show that in the context of Theorem 30.13
condition (30.13) there is also sufficient for the weak convergence of $(\mu_n)$ to $\mu$.
8. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence of real numbers in $]0,1[$. From $[0,1]$ delete the open
interval $I_{11}$ centered at 1/2 having length $a_1$. There remain two disjoint closed
intervals $J_{11}, J_{12}$. From $J_{1j}$ delete the open interval $I_{2j}$ of length $a_2 \lambda^1(J_{1j})$ whose
midpoint is that of $J_{1j}$ ($j = 1, 2$). Then there remain four pairwise disjoint closed
intervals $J_{21}, J_{22}, J_{23}, J_{24}$. From $J_{2j}$ delete the open interval $I_{3j}$ of length $a_3 \lambda^1(J_{2j})$
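$C := \bigcap_{n \in \mathbb{N}} (J_{n1} \cup \dots \cup J_{n2^n})$
$\lim_{n \to \infty} \prod_{i=1}^{n} (1 - a_i).$
[Hint: Recall the inequalities $1 + a \le (1 - a)^{-1}$ and $1 - a \le e^{-a}$ for $0 \le a < 1$.]
(d) In case $\sum_{n=1}^{\infty} a_n < +\infty$, $U := \,]0,1[\, \setminus C$ is an open subset of $\mathbb{R}$ whose boundary
$\lambda^1$-measure.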
10. Let $E$ be a metric space, with metric $d$, and let $\mu, \mu_1, \mu_2, \ldots$ be p-measures
on $\mathscr{B}(E)$. Show that each of the following is necessary and sufficient for the weak
convergence of the sequence $(\mu_n)$ to $\mu$:
(a) $\lim \int f\,d\mu_n = \int f\,d\mu$ for all bounded functions $f$ which are uniformly continuous on $E$.
$\sup_{\mu \in H} \left| \int f\,d\mu \right| < +\infty$
Thus vague boundedness of a set $H \subset \mathscr{M}_+$ is a necessary condition for its vague
relative compactness. We want to show that it is also sufficient:
Proof. In view of the preceding, all that has to be shown is that vague relative
compactness follows from the vague boundedness of $H$. To this end, let $\alpha_f$ denote the real number in (31.1), for each $f \in C_c(E)$, and $J_f$ the compact interval
$[-\alpha_f, \alpha_f]$ in $\mathbb{R}$. Also denote the (vague) closure of $H$ in $\mathscr{M}_+$ by $\overline{H}$. First observe
that
$\int f\,d\mu \in J_f$
for all $f \in C_c(E)$ and all $\mu \in \overline{H}$. In fact, if $f \in C_c(E)$ and $\varepsilon > 0$ are given
ifll
As the extreme inequality holds for every $\varepsilon > 0$, we see that $|\int f\,d\mu| \le \alpha_f$, that
is, $\int f\,d\mu \in J_f$.
$P := \mathbb{R}^{C_c} = \prod_{f \in C_c} \mathbb{R}_f$
$J := \prod_{f \in C_c} J_f$
$\Phi : \mathscr{M}_+ \to P$
is defined which is injective by the Riesz representation theorem. On the basis of
what was shown in the opening campaign
$\Phi(\overline{H}) \subset J.$
Our goal will be realized if we can show that
(a) $\Phi$ maps $\mathscr{M}_+$ homeomorphically onto
and
$\Phi(\mu) \mapsto \int f\,d(\Phi^{-1}(\Phi(\mu))) = \int f\,d\mu$
of $\Phi(\mathscr{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathscr{M}_+)$
of the projection of $P = \mathbb{R}^{C_c}$ onto its coordinate specified by $f$.
for u E (f, g, f + g}
$\le |I(f+g) - I'(f+g)| + |I'(f) - I(f) + I'(g) - I(g)|$
$< \varepsilon + |I'(f) - I(f)| + |I'(g) - I(g)| < 3\varepsilon,$
and because $\varepsilon > 0$ is arbitrary, the extreme inequality means that its left-hand
side must be 0. In a completely analogous way one proves that $I(\alpha f) = \alpha I(f)$ for
every $\alpha \in \mathbb{R}$, $f \in C_c(E)$, and $I(g) \ge 0$ for every non-negative $g \in C_c(E)$. With
the linearity of $I$ confirmed, the Riesz representation theorem supplies a Radon
measure $\nu \in \mathscr{M}_+$ such that $\Phi(\nu) = I$. That is, $I$ lies in $\Phi(\mathscr{M}_+)$,
confirming that the latter is closed in $P$. □
31.3 Corollary. For every real number $\alpha \ge 0$ the set
$\mathscr{M}_\alpha := \{\mu \in \mathscr{M}_+(E) : \|\mu\| \le \alpha\}$
is vaguely compact.
Proof. For every $f \in C_c(E)$ and $\mu \in \mathscr{M}_\alpha$, $|\int f\,d\mu| \le \int |f|\,d\mu \le \alpha \|f\|$. Consequently, $\mathscr{M}_\alpha$ is vaguely bounded, hence vaguely relatively compact. What therefore
remains to be confirmed is the closedness of $\mathscr{M}_\alpha$ in $\mathscr{M}_+$. According to (28.13)
$\mathscr{M}_\alpha$ is just the set of all $\mu \in \mathscr{M}_+$ such that $\int u\,d\mu \le \alpha$ holds for all $[0,1]$-valued
$u \in C_c(E)$. Because the mapping $\mu \mapsto \int u\,d\mu$ of $\mathscr{M}_+$ into $\mathbb{R}$ is continuous, the
set $\{\mu \in \mathscr{M}_+ : \int u\,d\mu \le \alpha\}$ is closed, for each $u \in C_c(E)$, and by the preceding
observation $\mathscr{M}_\alpha$ is an intersection of such sets, those for which $u(E) \subset [0,1]$. Thus
$\mathscr{M}_\alpha$ is indeed (vaguely) closed. □
Remark. 1. The set of all measures $\mu \in \mathscr{M}_+(E)$ with $\|\mu\|$ equal to a fixed positive
number $\alpha$ is vaguely closed if $E$ is compact (because in that case $1_E \in C_c(E)$).
Example 2 of §30, with all the $\alpha_n$ there equal to $\alpha$, illustrates this.
V : E -+ .4f+ (E)
form a neighborhood basis at x as the fj run through all finite subsets of CA(E)
and 17 through all positive real numbers. In fact, if U is a neighborhood of some
for all relevant functions, q E R+ and x E E. Together with the injectivity this
clearly shows that cp is a homeomorphism.
For the former the existence of a countable basis in $E$ is sufficient, as was noted
in Remark 5 of §29. It is useful to formulate this in terms of $C_c(E)$:
31.4 Lemma. For any locally compact space E the following assertions are equivalent:
Proof. (a)⇒(b): Let $\mathscr{G}$ be a countable base for (the topology of) $E$, $\mathscr{I}$ the set of
all open intervals in $\mathbb{R}$ with rational endpoints. For every natural number $n$ let
us say that an $n$-tuple $(G_1, \dots, G_n) \in \mathscr{G}^n$ and an $n$-tuple $(I_1, \dots, I_n) \in \mathscr{I}^n$ are
compatible with each other if a function $f \in C_c(E)$ exists such that $f(G_j) \subset I_j$
for each $j = 1, \dots, n$ and $\operatorname{supp}(f) \subset G_1 \cup \dots \cup G_n$. Any such $f$ will be called
a compatibility function for the pair of $n$-tuples. Obviously, the set
$\bigcup_{n \in \mathbb{N}} (\mathscr{G}^n \times \mathscr{I}^n)$
is countable; there are therefore only countably many such pairs of $n$-tuples ($n \in \mathbb{N}$)
that are compatible with each other. We choose a compatibility function for each
such pair and designate by $F$ the set of functions chosen. It suffices to prove that
$F$ is a countable dense subset of $C_c(E)$. To prove its denseness, let $u \in C_c(E)$
and $\varepsilon > 0$ be given. Denote the support of $u$ by $K$. Every $x \in K$ lies in an open
neighborhood from $\mathscr{G}$ each point $y$ of which satisfies $|u(x) - u(y)| < \varepsilon$. The compact set $K$ is covered by finitely many such neighborhoods, say by $G_1, \dots, G_n$.
The diameter of each image set $u(G_j)$ is at most $2\varepsilon$. Consequently there are intervals $I_j \in \mathscr{I}$ of length less than $3\varepsilon$ such that $u(G_j) \subset I_j$, for $j = 1, \dots, n$. Thus
$u$ is a compatibility function for the pair of $n$-tuples $(G_1, \dots, G_n)$, $(I_1, \dots, I_n)$.
Hence there must also be such a compatibility function $f$ in the representative
set $F$. Every $x \in G_j$ therefore satisfies $|u(x) - f(x)| < \lambda^1(I_j) < 3\varepsilon$; that is,
$|u(x) - f(x)| < 3\varepsilon$ for all $x \in G_1 \cup \dots \cup G_n$. But this latter inequality prevails as
well for all $x \in E \setminus (G_1 \cup \dots \cup G_n)$ for the simple reason that both $f$ and $u$ vanish
identically in this complement. In summary, $\|u - f\| < 3\varepsilon$. This proves that $F$ is
dense in $C_c(E)$.
(b)⇒(a): Let $D$ be a dense subset of $C_c(E)$. We will show that the system $\mathscr{G}$
of all sets $\{u > 1/2\}$ with $u \in D$ is a base for the topology of $E$. For every open
$U \subset E$ and every point $x \in U$ Corollary 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$
and $\operatorname{supp}(f) \subset U$. Since $D$ is dense, there is a $u \in D$ with $\|u - f\| < 1/2$. Then
$x \in \{u > 1/2\} \subset \{f > 0\} \subset \operatorname{supp}(f) \subset U.$
If $D$ is countable, so is $\mathscr{G}$.
$C_c(E)$ separates the points of $E$, so $D$ must also; that is, for any two distinct
points $x, y \in E$ there is a $u \in D$ with $u(x) \ne u(y)$. The functions in $D \setminus \{0\}$ may
be organized into a sequence $u_1, u_2, \ldots$ and we may then define
(31.3) $d(x, y) := \sum_{n=1}^{\infty} \frac{|u_n(x) - u_n(y)|}{2^n \|u_n\|}, \quad x, y \in E.$
Point-separation by $D$ means that $d(x, y) > 0$ whenever $x \ne y$. All the other properties of a metric on $E$ are obvious for $d$. This function $d$ on $E \times E$ is a uniform
limit of continuous functions and is consequently continuous. Therefore the topology generated by $d$, which we will call the $d$-topology, is coarser than the original
topology of $E$. For any given point $x \in E$ and neighborhood $U$ of $x$ in the original
topology of $E$ there is, as was shown in the "(b)⇒(a)" part of the preceding proof,
a $u \in D$ with
$x \in V := \{u > 1/2\} \subset U.$
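The construction (31.3) can be tried out on a toy case. The sketch below is an illustration under loud assumptions: $E = \mathbb{R}$, a finite family of tent functions stands in for the sequence $u_1, u_2, \ldots$ (so the series is truncated), and the names `tent`, `functions`, and `points` are hypothetical.

```python
import itertools

def tent(center, radius):
    """Continuous function supported in [center - radius, center + radius], sup norm 1."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / radius)

# Hypothetical finite stand-in for the sequence u_1, u_2, ... taken from D \ {0}.
functions = [tent(c, 1.5) for c in range(-5, 6)]
sup_norms = [1.0] * len(functions)      # each tent takes the value 1 at its center

def d(x, y):
    """Truncated version of the series (31.3)."""
    return sum(abs(u(x) - u(y)) / (2.0 ** n * norm)
               for n, (u, norm) in enumerate(zip(functions, sup_norms), start=1))

points = [-4.2, -1.0, 0.3, 2.5, 4.9]

# Triangle inequality on all triples of sample points (small tolerance for rounding).
triangle_ok = all(d(x, z) <= d(x, y) + d(y, z) + 1e-12
                  for x, y, z in itertools.product(points, repeat=3))

# The chosen family separates the sample points, so distinct points have d > 0.
separated = all(d(x, y) > 0 for x, y in itertools.combinations(points, 2))

print(triangle_ok, separated)
```

Only positivity depends on point separation; symmetry and the triangle inequality hold term by term, which is why the text calls them obvious.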
31.5 Theorem. The following assertions about a locally compact space E are
equivalent:
sets such that $L_n \uparrow E$ and every compact subset $K$ of $E$ satisfies $K \subset L_n$ for all
but finitely many $n$. For each $n \in \mathbb{N}$ choose an $e_n \in C_c(E)$ satisfying $0 \le e_n \le 1$,
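$D := D_0 \cup \{u e_n : u \in D_0, n \in \mathbb{N}\} \cup \{e_n : n \in \mathbb{N}\}$
of $C_c(E)$ is still only countable and, of course, is dense in $C_c(E)$. Let $d_1, d_2, \ldots$ be
an enumeration of its elements:
$D = \{d_n : n \in \mathbb{N}\}.$
Using this enumeration we define a mapping
$\rho : \mathscr{M}_+ \times \mathscr{M}_+ \to \mathbb{R}_+$
by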
(31.4)
, v E
n=1
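All the properties of a metric save perhaps one are obvious for $\rho$. What needs
checking is that $\mu = \nu$ follows from $\rho(\mu, \nu) = 0$. In view of the uniqueness part of
the Riesz representation theorem this amounts to showing that from
$\int d_n\,d\mu = \int d_n\,d\nu$ for all $n \in \mathbb{N}$
follows
$\int f\,d\mu = \int f\,d\nu$
for every $f \in C_c(E)$. For such an $f$ there is a $k \in \mathbb{N}$ with
$\operatorname{supp}(f) \subset L_k \subset \{e_k = 1\}.$
Further, given $\varepsilon > 0$ there is $u \in D_0$ with $\|f - u\| < \varepsilon$, whence, since $f = f e_k$,
(31.5) $|f - u e_k| \le \varepsilon e_k.$
Integration yields
(31.6) $\left| \int f\,d\mu - \int u e_k\,d\mu \right| \le \varepsilon \int e_k\,d\mu,$
(31.6') $\left| \int f\,d\nu - \int u e_k\,d\nu \right| \le \varepsilon \int e_k\,d\nu.$
As the functions $e_k$ and $u e_k$ are in $D$, the assumption that $\rho(\mu, \nu) = 0$ entails that
their $\mu$- and $\nu$-integrals coincide, and it follows that
$\left| \int f\,d\mu - \int f\,d\nu \right| \le 2\varepsilon \int e_k\,d\mu,$
holding for every $\varepsilon > 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold.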
The next step is to show that the topology determined by $\rho$ is none other
than the vague topology. We will, to that end, make use of the fact that the sets
$V_{f_1,\dots,f_n;\varepsilon}(\nu)$ defined in (30.5) are a neighborhood base at $\nu \in \mathscr{M}_+$ in the vague
topology, when all possible finite subsets $\{f_1, \dots, f_n\}$ of $C_c(E)$ and all numbers
$\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$
(31.7)
p(, V) <
+<e
n=1
2. For finitely many $f_1, \dots, f_n \in C_c(E)$, for every number $\varepsilon > 0$ and every
$\nu \in \mathscr{M}_+$, there is a number $\eta > 0$ such that
(31.8)
$U_\eta(\nu) \subset V_{f_1,\dots,f_n;\varepsilon}(\nu)$
(j=1,...,n).
Jf)dL_Juiekd1zl<SJekd,
fjdv-
(31.9')
if
show
for j = 1, ... , n. Choosem so large) that all the functions ek, u1ek....
up among the first m functions dl,..., d,,, in the enumeration of D, to which they
,7] ._ d2-m
if
for i = 1, ... , m.
for j=1,...,n
if
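and
$\left| \int e_k\,d\mu - \int e_k\,d\nu \right| < \delta.$
(31.10')
From (31.9) and (31.9'), as well as from (31.10) it follows, via the triangle inequality,
that
$\left| \int f_j\,d\mu - \int f_j\,d\nu \right| < \delta^2 + \left(1 + 2 \int e_k\,d\nu\right) \delta < \varepsilon.$
As this holds for every $j \in \{1, \dots, n\}$, it asserts that $\mu \in V_{f_1,\dots,f_n;\varepsilon}(\nu)$ and confirms (31.8). Together (31.7) and (31.8) assert the equality of the vague and the
$\rho$-topologies.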
The next step will be to prove the completeness of the metric $\rho$, and we can
do that via slight modifications in the foregoing arguments. Let $(\mu_n)_{n \in \mathbb{N}}$ be a $\rho$-Cauchy sequence in $\mathscr{M}_+$. Instead of the functions $f_1, \dots, f_n$ and the number $\varepsilon > 0$
in 2. above, let an $f \in C_c(E)$ and a number $\delta \in \,]0,1[$ be given. We aim first
to prove that the numerical sequence $(\int f\,d\mu_n)_{n \in \mathbb{N}}$ converges in $\mathbb{R}$. Choose $k \in \mathbb{N}$
with $\operatorname{supp}(f) \subset \{e_k = 1\}$ and $u \in D_0$ with $\|f - u\| < \delta$. Then choose $m \in \mathbb{N}$ large
enough that the two functions $e_k$ and $u e_k$ are among $d_1, \dots, d_m$ and set $\eta := \delta 2^{-m}$.
Since $(\mu_n)$ is a $\rho$-Cauchy sequence, there is a natural number $N$, dependent on $\eta$,
thus on $f$ and $\delta$, such that
$\rho(\mu_r, \mu_s) < \eta$ for all $r, s \ge N$.
Just as in the earlier deduction scheme, we get that for such $r, s$
(31.11) $\left| \int u e_k\,d\mu_r - \int u e_k\,d\mu_s \right| < \delta$ and $\left| \int e_k\,d\mu_r - \int e_k\,d\mu_s \right| < \delta.$
Of course we also have the f-analogs of (31.9) and (31.9'), so that reasoning similar
if
The second of the (valid for all r,s > N) inequalities in (31.11) shows that the
numerical sequence (f ek d
forallnEN.
The earlier inequality therefore yields
$\left| \int f\,d\mu_r - \int f\,d\mu_s \right| < \delta^2 + (1 + 2M)\delta$
is therefore vaguely convergent to some $\mu_0 \in \mathscr{M}_+$. Since the vague topology coincides with the $\rho$-topology, as we have already confirmed, this means that the
sequence $(\mu_n)$ converges to $\mu_0$ in the $\rho$-metric.
We finally need to prove that, like the topology of $E$, the vague topology of $\mathscr{M}_+$
has a countable base. Since the vague topology is generated by the metric $\rho$, it
is enough to find a countable set $\mathscr{D}_0$ which is dense in $\mathscr{M}_+$; because it is obvious
that the set of all open balls with respect to the metric $\rho$ centered at points of $\mathscr{D}_0$
and having rational radii is then a countable base for the $\rho$-topology of $\mathscr{M}_+$. Our
candidate for $\mathscr{D}_0$ is the set of all discrete measures
$\delta := \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}$
with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which
is dense in $E$. We get such a set $E_0$ simply by taking a point from each set
in a countable base for the topology of $E$. Evidently, this $\mathscr{D}_0$ is countable. We
have to show that for every $\mu \in \mathscr{M}_+$, every real $\varepsilon > 0$, and every finite set
$F := \{f_1, \dots, f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1,\dots,f_n;\varepsilon}(\mu)$
contains
a measure from $\mathscr{D}_0$. At least, according to 30.4, this neighborhood contains a
<e
i=1
if fdIt-Jfd.6l<
fdlt -
fd,,- ffdal+Ea;If(=i)-f(xj)I+FI-;-ailIlfII
-ffdbl
Remarks. 4. The reader should recall the rather elementary fact that for a metric space compactness and sequential compactness are equivalent (see (6.37) in
HEWITT and STROMBERG [1965]). In view of this, a very useful consequence of
Theorems 31.2 and 31.5 for a locally compact space $E$ with a countable base is
that every vaguely bounded sequence in $\mathscr{M}_+(E)$ contains a vaguely convergent
subsequence.
In particular, for such $E$ every sequence $(\mu_n)$ in $\mathscr{M}^1_+(E)$, that is, every sequence
of p-measures, contains a vaguely convergent subsequence. Moreover, in case all
convergent subsequences have the same limit $\mu$, the original sequence $(\mu_n)$ itself
converges vaguely to $\mu$: Otherwise there would be an $f \in C_c(E)$ for which $(\int f\,d\mu_n)$
does not converge to $\int f\,d\mu$, and so an $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \dots$ such
The ideas employed in the proofs of Theorems 31.4 and 31.5, slightly modified,
lead to a further interesting result. It concerns the space
$C := C(\mathbb{R}_+, E)$
of all continuous mappings $f$ of $\mathbb{R}_+ := [0, +\infty[$ into a Polish space $E$, for example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets
of $\mathbb{R}_+$.
such metric is given by $(x, y) \mapsto \min\{1, \rho(x, y)\}$, and using it if need be, we can
simply assume that $\rho \le 1$. This lets us define $d_n$ on $C$ for each $n \in \mathbb{N}$ by
$d_n(f, g) := \sup\{\rho(f(x), g(x)) : x \in [0, n]\}, \quad f, g \in C;$
and
(31.14) $d(f, g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f, g), \quad f, g \in C.$
Just as earlier (cf. (31.3) and (31.4)), one easily confirms that $d$ is a metric
on $C$ (with all its values in $[0,1]$) which satisfies
(31.15) $2^{-n} d_n(f, g) \le d(f, g) \le d_n(f, g) + 2^{-n}$ for all $n \in \mathbb{N}$,
the right-most inequality following from the fact that $d_i \le d_{i+1}$ for all $i \in \mathbb{N}$,
resulting in
$d(f, g) \le \sum_{i=1}^{n} 2^{-i} d_n(f, g) + \sum_{i=n+1}^{\infty} 2^{-i} \le d_n(f, g) + 2^{-n}.$
It follows from (31.15) via by-now-familiar reasoning that the $d$-topology coincides
with the original topology of $C$, and moreover that $d$ is a complete metric.
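The two-sided estimate (31.15) can be checked numerically on a simple pair of paths. The following is a sketch under stated assumptions: $E = \mathbb{R}$ with the bounded metric $\min\{1, |x - y|\}$, the paths $f(t) = 0$ and $g(t) = t/5$, a grid in place of the supremum, and a partial sum in place of the series; the function names are hypothetical.

```python
def rho(x, y):
    """Bounded metric min{1, |x - y|} on E = R."""
    return min(1.0, abs(x - y))

def d_n(f, g, n, step=0.01):
    """Grid approximation of sup{rho(f(x), g(x)) : x in [0, n]}.

    The grids for different n are nested, so d_n is isotone in n,
    as required for the right-hand inequality in (31.15).
    """
    points = (k * step for k in range(int(round(n / step)) + 1))
    return max(rho(f(x), g(x)) for x in points)

def d(f, g, terms=30):
    """Partial sum of the series (31.14); the neglected tail is at most 2**-terms."""
    return sum(2.0 ** -n * d_n(f, g, n) for n in range(1, terms + 1))

f = lambda t: 0.0
g = lambda t: t / 5.0       # rho(f(t), g(t)) = min{1, t/5}, so d_n = min{1, n/5}

dist = d(f, g)
checks = [2.0 ** -n * d_n(f, g, n) <= dist <= d_n(f, g, n) + 2.0 ** -n
          for n in range(1, 11)]
print(dist, all(checks))
```

For this pair $d_n = \min\{1, n/5\}$, so the series can also be summed by hand as $0.1 + 0.1 + 0.075 + 0.05 + 2^{-4} = 0.3875$, which is what the partial sum approaches.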
U(n
nEN
for each pair of compatible n-tuples, for each n E N. The open d-balls having
centers in F and rational radii are a countable set, and it is easy to see that they
constitute a base for the d-topology of C once we confirm that F is dense in C.
c:= 2-N-2. Since f is continuous, every x E [0, NJ lies in a set 0 E 6' such that
for all y E 0.
Finitely many such sets 0 suffice to cover [0, NJ, say 01,..., 0,,. By the triangle
inequality
Q(fo(y),.fo(x)) < e
p(fo(x),fo(.., j))<e
for allxEOj.jE
The open $\rho$-ball of radius $\varepsilon$ centered at $f_0(x_j)$ meets the dense set $E_0$, say in the
point $z_j$. As $\varepsilon$ is rational, the open $\rho$-ball of center $z_j$ and radius $2\varepsilon$ is a set $G_j \in \mathscr{G}$.
Then every x E Oj satisfies
Consequently,
f(0j)CGj
forj=1,....n.
As the $O_j$ cover $[0, N]$, this inequality holds for every $x \in [0, N]$. It affirms that
$d_N(f, f_0) \le 4\varepsilon$, and so thanks to (31.15) and the definition of $\varepsilon$, $d(f, f_0) \le 4\varepsilon +
2^{-N} = 2^{-N+1}$. As $N \in \mathbb{N}$ is arbitrary, this shows that $F$ is $d$-dense in $C$, which,
as noted earlier, completes the proof.
The significance of Theorem 31.5 lies partly in the fact that for a locally compact space $E$ whose topology has a countable base the space $\mathscr{M}_+(E)$ of all (positive) Radon measures - which according to 29.12 is the set of all Borel measures
on $E$ - being also a Polish space, is itself an environment in which measure theory
can be pursued. And this happens in convex analysis, in integral geometry, and
in stochastic geometry, a meeting point between geometry and probability theory.
Exercises.
1. Let $E$ be a locally compact space, $\nu \in \mathscr{M}_+(E)$. Show that the set of all $\mu \in
\mathscr{M}_+(E)$ which satisfy $0 \le \int u\,d\mu \le \int u\,d\nu$ for every non-negative $u \in C_c(E)$ is
vaguely compact.
2. Let $E$ be a locally compact space with a countable base. Prove that there is
a countable subset of $C_c(E)$ that has the properties of the set $T$ in Exercise 3, §30.
[Hint: Try the set $D$ that featured in the proof of Theorem 31.5.]
3. (Selection theorem of E. HELLY (1884-1943)). Prove the original form of Corollary 31.3: To every sequence $(F_n)_{n \in \mathbb{N}}$ of distribution functions on $\mathbb{R}$ corresponds
a measure-generating function $F : \mathbb{R} \to \mathbb{R}$ and a subsequence $(F_{n_k})_{k \in \mathbb{N}}$ of the
original sequence such that $\lim_{k \to \infty} F_{n_k}(x) = F(x)$ for every continuity point $x$ of $F$.
Why is $F$ generally not a distribution function? How does one recover 31.3 (for
the case $E := \mathbb{R}$) from Helly's theorem?
4. For a Polish space $E$ consider the topology (introduced in Remark 7 of §30) of
weak convergence on the set of finite Borel measures (the finite Radon measures
- cf. 26.2) on E. By adapting the ideas in the proof of Theorem 31.5, show that
this topology is metrizable.
5. For what more general spaces taking over the role of R+ in the definition
of C(R+, E) does Theorem 31.6 remain valid?
Bibliography
ANONYME [1889]: "Sur l'intégrale $\int_0^\infty e^{-x^2}\,dx$", Bull. Sci. Math. (2) 13, 84.
G. AUMANN [1969]: Reelle Funktionen. Grundlehren Math. Wiss. 68 (2nd edition),
Springer-Verlag, Berlin-Heidelberg-New York.
S. BANACH [1923]: "Sur le problème de la mesure", Fund. Math. 4, 7-33.
R.G. BARTLE and J.T. JOICHI [1961]: "The preservation of convergence of measurable functions", Proc. Amer. Math. Soc. 12, 122-126.
H. BAUER [1984]: Maße auf topologischen Räumen, Kurs der Fernuniversität-Gesamthochschule-Hagen.
- [1996]: Probability Theory, de Gruyter Stud. Math. 23, Walter de Gruyter,
Berlin-New York.
S.K. BERBERIAN [1962]: "The product of two measures", Amer. Math. Monthly
69, 961-968.
P. BILLINGSLEY [1968]: Convergence of Probability Measures. John Wiley & Sons,
ture uniforme d'espace complet", C. R. Acad. Sci. Paris Ser. I Math. 209,
145-147.
gue-Stieltjes measure with Lebesgue measure", Proc. Amer. Math. Soc. 52,
196-198.
73-116+333.
W.P. NOVINGER [1972]: "Mean convergence in Lp-space", Proc. Amer. Math.
Soc. 34, 627-628.
D.A. OVERDIJK, F.H. SIMONS and J.G.F. THIEMANN [1979]: "A comment on
unions of rings", Indag. Math. 41, 439-441.
J.C. OXTOBY and S. ULAM [1941]: "Measure-preserving homeomorphisms and
metrical transitivity", Ann. of Math. (2) 42, 874-920.
K.R. PARTHASARATHY [1967]: Probability Measures on Metric Spaces, Academic
Press, New York-London.
W.F. PFEFFER [1977]: Integrals and Measures. Marcel Dekker. New York-Basel.
J. RADON [1913]: "Theorie und Anwendungen der absolut additiven Mengenfunktionen", Sitzungsber. Kaiserl. Akad. Wiss. Wien, Math.-Naturwiss. Kl. 122,
1295-1438.
R.M. SOLOVAY [1970]: "A model of set-theory in which every set of reals is
Lebesgue measurable", Ann. of Math. (2) 92, 1-56.
R.H. SORGENFREY [1947]: "On the topological product of paracompact spaces",
Symbol Index
The numbers beside the symbols refer to the pages where the symbol in question
is defined.
C, u,n u, n, c, \, xii
0,33
-00, (+)oo, xi
If >g}, If > 0, 50
IR, xi
N, Z, Q, R, xi,
Z+, Q+, R+, R+, xi
R+, 141
f f dF, 65
fA f d, fB f
R,., 156
Qd, 45.
(X)
dx, fa f dAd, f f d4
67 90
f If, 6D
fnTf,
00
F fn, E fn, xiii
n=1
a<b,a<b,14
[a, b1, a, b , a& a, b , xi, 14 28, 29
n-4o0
91
(ai)iEJ, Xli
d(x, A), 157, 201
det T, 43
8Q, 1,36
177
f";, 137
F,, 31
u 12
A (topological closure), 17
An, 147
1A (indicator function), 49
a+A, A+a, 36
A:= B, B =: A, xi
AC B, xii
A\ B, xii
AD B,5
Y(), 78
Dye, 44
E = E(1l, sd), 53
E`(1,d), 58
E, 10
E. T E.
G:= f e-` )'(dx), 88. 93, 145. 146
Dai), 31
GL(d,R), 43
H,. (homothety), 37
Ij,,171,
K,.(xo) (closed ball), 146
K,.(xo) (open ball), 158
L'(Ad), 151, 123
L(), LP(), 86, 81
L3 (lower sum), 91
M(sd), 120
Mot(] d), 42
N,(f),Np(f),87,74
Q., Q.,, 135
S(i'k) (skewing transformation), 30
S,,(0) (euclidean sphere), 37
SL(d, R), 43
Ta (translation), 36. 149
T() (image measure), 36
T-'(sd), 3
(x,27
o(cia), 7
Odd
Qa
...
5:,
F, 32
9 171
(principal measure), 176
O (essential measure), 176
A (restriction of ), 68
F, 30
- lim (stochastic limit), 113
U3 (upper sum), 91
-v,1.0Z
IKv,P Lv,1455
Sv-, 170
1 2, 1M
n 11+j, 143
v convolution of measures), 147
a(-s.P-measurable, 34
a-algebras), 132
.Vd (Borel a-algebra in Rd), 27
4'
`6'd, Cd,
9,,, 206
jd
14
.2'(), 6fi
**n,147
P(S), 4
P."W ), 171
P+,r, M2
o'(8), Q(T), a(T1,... ,T,),
o(Ti:iEI),3 35 62
0 7-119,*,i), 1.44
Name Index
Caratheodory, Constantin
(1873-1950), 20
Cauchy, Augustin Louis (1789-1857),
ix
Dynkin, E.B., 5
(1883-1950),163
MacLane, Saunders, 44
Mattner, L., L41
Meyer, P: A., 130
Minkowski, Hermann (1864-1909),75
83
Novinger, W.P., 82
Overdijk, D.A., 5
Oxtoby, John Corning (1910-1991), 4Z
Solovay, R.M., 45
Thiemann, J.G.F., 5
Tonelli, Leonida (1885-1946), 95 138,
144
Tucker, H.G., 33
Ulam, Stanislaw Marcin (1909-1984),
47
Yzeren, J. van, 95
Subject Index
fl-stable system, 7
U-stable system, 2
p-fold (p-)integrable, 7f
p-measure, 31
p-space, 34
pth-power integrable, 76
p -measurable, 20
a-additivity, 8
a-algebra, 2
a-algebra of Borel sets in Rd, 27
-R 42
Cl-diffeomorphism, 44 111
F,-set, 152
Ga-set, 47 152, 157, 1.59
K,-set, 187
29-convergence, 72
."P-functions, 77
.`gyp-pseudometric, 79
2-semi-norm, 79
e-bound, 121
- continuous, 128
- defined measurable function, 73
p-boundaryless, 19-8
p-completion, 26
p-continuous measure, 99
p-essentially bounded, 78
-- by a set, 3
a-compact, 181
a-finite content, 23
a-finite measure, 23 72 28
a-finite measure space, 34
a-ideal, 13, 100, 1117
a-ring, 177
-- generated, 177
absolutely continuous (see p-continuous)
-,a-,8
- , sub-, 9
Alexandroff compactification (see one-point compactification)
algebra, 1512 193
-,a-,2
- of sets, 4
almost everywhere, 70
- bounded,71
- defined function, 73
p-negligible, 13
p-nullset, 13, 70
p-quadrable, 198
p-singular, 105
p-stochastically convergent, 113
- equal, 70
- finite, 74
analytic set, 47
antitone, xiii
strictly, xiii
193, 194
99
- , mean square, 80
locally-finite, 154
bounded, 147
Borel set in Rd, 26
in mean, 80
- measures, 147
boundaryless, 198
bounded
Borel measure, 147
- root, 151
(z)-essentially, 78
- measurable mappings, 35
content, 8
content-problem, 411
continuity at
111
continuity from above, 10
- from below, 10
continuity lemma, 88
unit, 142
countable additivity (see Q-additivity)
countable at infinity, 181
countable and co-countable o'-algebra,
2
density of a measure, 96
denseness of C, in 2, 186
denseness of discrete measures, 1194
diffeomorphism, 44, 111
difference
set-theoretic, xii
- , symmetric, 5, 14 24, 87
differentiation lemma, 82
Dirac function, 146
Dirac measure, 12. 154
Dirichlet jump function, 57, 92, 166
disjoint sets, xii
distribution function, 31, 201
dominated convergence theorem, 83
--- , sharpened version, 124
Doob's factorization lemma, 62
Dynkin system, 6
--- generated by ', 7
- , upper, xiii
equi-(h)-continuity, 128
equi-p-continuous, 131
equi-continuous at 0, 131
equi-integrable, 121 if.
essential measure, 176
extension theorem, 19
Factorization lemma, 62
family, xii
Fatou's lemma, 81
finite additivity, 8
finite Borel measure, 3 147, 154
finite signed measure, 101
finite-co-finite algebra, 4 8 U
Fubini's theorem, 13(1
function, additive, 59
- , antitone, xiii
- , integrable of order p, 76
- , isotone, xiii
- , Lebesgue integrable, 65
- , Lebesgue-Stieltjes integrable, 65
- , measurable, 34, 49
- , measure-generating, Q. 32
- , numerical, 49
- , positively homogeneous, 59
- , real, xii
- , Riemann integrable, 91, 92
- , step, 53
- , with compact support, 167
Gaussian integral, 88, 93
general linear group, 43
generator, of a a-algebra, 3
-- , of a product a-algebra, 132
Haar measure, 39, 107
- - , generalized, 78
- , reversed, 72
homothety, 37
hull, measurable, 25
ideal, 5
-- of 1A-null sets (see a-ideal)
image measure, 3366 110
indicator function, 49
input-output formula, 13
integrable, 64
, equi-, 121
- , Lebesgue, 65
- , Lebesgue-Stieltjes, 65
-- quasi-, 64
- of order p, 76
- over a set, 69
integral of f exists, 65
integral over a set, 67
intervals in Rd, 14
isotone, xiii, 59 170
-- , strictly, xiii
isotoneity, 9
Jordan decomposition, 109
L-B measure, 27
L-B measure space, 34
L-B-nullset, 28, 29, 33
Lebesgue decomposition, 105
Lebesgue integrable function, 65
Lebesgue integral, 65
Lebesgue measure, 46
Lebesgue-Borel measure (see L-B
measure)
Lebesgue-Stieltjes integral, 65
Lebesgue's convergence theorem, 83
- Fatou's, 81
- Urysohn's, 168
- - , reversed, 79
- , Lebesgue, 46
with respect to an outer measure,
21)
- , L-B, 28 20 33, 43
- , Lebesgue, 46
totally, 1119
number line, xi
- , compactified, xi
- , extended, xi
measure, 11
Borel, 31L 153
finite, U
finite signed, 11)2
inner regular, 1.54
L-B,27
motion, 41
motion group, 42
motion-invariance of ad, 42
motion-invariant content, 46
mutually singular (measures), 1.05
, Lebesgue, 46
-, of a set, 11
outer regular, 153
positive, 1519
- , regular, 15.4
- , u-continuous, 99
- , a-finite, 23, 72, 98
- , signed, 102
positively-homogeneous function, 59
power set, xii, 2
pre-image, xii
premeasure, 8
with density, 96
measure-defining function, 311
measure-extension theorem, Q. 21
measure-generating function, Q. 32
measure space, 26. 34
- , a-finite, 34
metric of uniform convergence, 169
metrizability of locally compact spaces,
- , Lebesgue, 18
principal measure, 176
probability measure, 31
probability space, 34
product measure, 137, 143
product of measure spaces, 144
208
70 83
- , finite, 188
- , p-measure, 188
- , regularity of, 156, 161, 183
Radon-Nikodym density, 105
- integrand, 105
- theorem, 101
- integral, 91
- - , improper, 92
- of a measure, 173
reflection-invariance, 37
regular, inner, 183
- , weakly, 213
representing measure, 173
- , essential 176
- , principal, L76
restriction of p, 19, 68
restriction of f, xii
Riemann integrable, 91, 92
178, 185
- generated by intervals, 14
- Egorov, 120
- Fubini, 139
- of a set, 135
-Helly,218
semi-norm, 79
- Levi, 59
- Lusin, 183
-,2'-,79
sequence, xii
sequentially compact, 213
- , relatively, 213
set, analytic, 47
- , Borel, 26, 49, 152, 172
- , difference, 183
- , Lebesgue measurable, 47
- , non-Borel, 45, 47
- of a-finite measure, 72, 175
- , (partially) ordered, xiii
- , quadrable, 198
Souslin, 47
- Prohorov, 21.3
Radon-Nikodym, 194
F. Riesz, 82
- Riesz-Fischer, 84
- Steinhaus, 163
Tonelli, 13$
theorem on dominated convergence 83
- monotone convergence, 59
- partitions of unity, 167
tight, 197, 213
topological basis (base), 157
translation-invariant measure, 4 39
ultimately all, xii
uncountable, xii
uniformly integrable, 122
uniqueness theorem, 22
unit mass at w, 8. 12 (see also Dirac
measure)
vanish at infinity, 10
vector space, 66,7,778
Z8
weak convergence, 196, 211
--- and distribution functions, 2(111
weak relative compactness, 213
weak topology, 201
Wiener measure, 216
zero-measure, 11