
de Gruyter
Studies in Mathematics
26

Heinz Bauer

Measure and
Integration Theory

de Gruyter Studies in Mathematics 26

Editors: Carlos Kenig, Andrew Ranicki, Michael Röckner

de Gruyter Studies in Mathematics


1 Riemannian Geometry, 2nd rev. ed., Wilhelm P. A. Klingenberg
2 Semimartingales, Michel Métivier
3 Holomorphic Functions of Several Variables, Ludger Kaup and Burchard Kaup
4 Spaces of Measures, Corneliu Constantinescu

5 Knots, Gerhard Burde and Heiner Zieschang


6 Ergodic Theorems, Ulrich Krengel
7 Mathematical Theory of Statistics, Helmut Strasser
8 Transformation Groups, Tammo tom Dieck
9 Gibbs Measures and Phase Transitions, Hans-Otto Georgii
10 Analyticity in Infinite Dimensional Spaces, Michel Hervé
11 Elementary Geometry in Hyperbolic Space, Werner Fenchel
12 Transcendental Numbers, Andrei B. Shidlovskii
13 Ordinary Differential Equations, Herbert Amann
14 Dirichlet Forms and Analysis on Wiener Space, Nicolas Bouleau and
Francis Hirsch
15 Nevanlinna Theory and Complex Differential Equations, Ilpo Laine
16 Rational Iteration, Norbert Steinmetz
17 Korovkin-type Approximation Theory and its Applications, Francesco Altomare
and Michele Campiti
18 Quantum Invariants of Knots and 3-Manifolds, Vladimir G. Turaev
19 Dirichlet Forms and Symmetric Markov Processes, Masatoshi Fukushima,
Yoichi Oshima, Masayoshi Takeda
20 Harmonic Analysis of Probability Measures on Hypergroups, Walter R. Bloom
and Herbert Heyer
21 Potential Theory on Infinite-Dimensional Abelian Groups, Alexander Bendikov
22 Methods of Noncommutative Analysis, Vladimir E. Nazaikinskii,
Victor E. Shatalov, Boris Yu. Sternin
23 Probability Theory, Heinz Bauer
24 Variational Methods for Potential Operator Equations, Jan Chabrowski
25 The Structure of Compact Groups, Karl H. Hofmann and Sidney A. Morris

Heinz Bauer

Measure and Integration Theory


Translated from the German by Robert B. Burckel

Walter de Gruyter
Berlin, New York 2001

Author
Heinz Bauer
Mathematisches Institut
der Universität Erlangen-Nürnberg
Bismarckstraße 1 1/2
91054 Erlangen
Germany

Translator

Robert B. Burckel
Department of Mathematics
Kansas State University
137 Cardwell Hall
Manhattan, Kansas 66506-2602

USA

Series Editors

Carlos E. Kenig
Department of Mathematics
University of Chicago
5734 University Ave
Chicago, IL 60637
USA

Andrew Ranicki
Department of Mathematics
University of Edinburgh
Mayfield Road
Edinburgh EH9 3JZ
Scotland

Michael Röckner
Fakultät für Mathematik
Universität Bielefeld
Universitätsstraße 25
33615 Bielefeld
Germany

Mathematics Subject Classification 2000: 28-01; 28-02


Keywords: Product measures, measures on topological spaces, topological measure theory, introduction
to measures and integration theory
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress - Cataloging-in-Publication Data


Bauer, Heinz, 1928-
[Mass- und Integrationstheorie. English]
Measure and integration theory / Heinz Bauer ; translated from the
German by Robert B. Burckel.
p. cm. - (De Gruyter studies in mathematics ; 26)
Includes bibliographical references and indexes.
ISBN 3110167190 (acid-free paper)
1. Measure theory. 2. Integrals, Generalized. I. Title. II. Series.
QC20.7.M43 B4813 2001
530.8'01 - dc21
2001028235

Die Deutsche Bibliothek - Cataloging-in-Publication Data


Bauer, Heinz:

Measure and integration theory / Heinz Bauer. Transl. from the German
by Robert B. Burckel. - Berlin ; New York : de Gruyter, 2001
(De Gruyter studies in mathematics ; 26)
Einheitssacht.: Mass- und Integrationstheorie (engl.)
ISBN 3-11-016719-0

Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany.
All rights reserved including those of translation into foreign languages. No part of this book may be
reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or
any information storage and retrieval system, without permission in writing from the publisher.
Printed in Germany.
Typesetting: Oldřich Ulrych, Prague, Czech Republic.
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen.
Cover design: Rudolf Hübler, Berlin.

In memoriam

OTTO HAUPT
(5.3.1887 -10.11.1988)
former Professor of Mathematics

at the University of Erlangen

Preface

More than thirty years ago my textbook Wahrscheinlichkeitstheorie und Grundzüge
der Maßtheorie was published for the first time. It contained three introductory
chapters on measure and integration as well as a chapter on measure in topological spaces, which was embedded in the probabilistic developments. Over the years
these parts of the book were made the basis for lectures on measure and integration at various universities. Generations of students used the measure theory part

for self-study and for examination preparations, even if their interests often did
not extend as far as the probability theory.
When the decision was made to rewrite and extend the parts devoted to probability theory, it was also decided to publish the part on measure and integration
theory as a separate volume. This volume had to serve two purposes. As before
it had to provide the measure-theoretic background for my book on probability
theory. Secondly, it should be a self-contained introduction into the field. The German edition of this book was published in 1990 (with a second edition in 1992),
followed in 1992 by the rewritten book on probability theory. The latter was translated into English and the translation was published in 1995 as Probability Theory
(Volume 23) in this series.

When offering now a translation of the book Maß- und Integrationstheorie
we have two aims: To provide the reader of my book on probability theory with
the necessary auxiliary results and, secondly, to serve as a secure entry into a
theory which to an ever-increasing extent is significant not only for many areas
within mathematics, but also for applications in physics, economics and computer
science.

However, once again this book is much more than a pure translation of the
German original and the following quotation of the preface of my book Probability
Theory applies a further time: "It is in fact a revised and improved version of
that book. A translator, in the sense of the word, could never do this job. This
explains why I have to express my deep gratitude to my very special translator, to
my American colleague Professor Robert B. Burckel from Kansas State University.

He had gotten to know my book by reading its very first German edition. I owe
our friendship to his early interest in it. He expended great energy, especially on
this new book, using his extensive acquaintance with the literature to make many
knowledgeable suggestions, pressing for greater clarity and giving intensive support
in bringing this enterprise to a good conclusion."

In addition I want to thank Dr. Oldřich Ulrych from Prague for his skill and patience in preparing the book manuscript in TeX for final processing. Many thanks
are due to my family and Professor Niels Jacob, University of Swansea, for reasons
they will know. Finally, I thank my publisher Walter de Gruyter & Co., and, above
all, Dr. Manfred Karbe for publishing the translation of my book.
Erlangen, March 2001

Heinz Bauer

Introduction

Measure theory and integration are closely interwoven theories, both content-wise
and in their historical developments. They form a unit. The development of analysis in the 19th century - here one is thinking especially about the theory of Fourier
series and classical function theory - compelled the creation of a sufficiently general concept of the integral that discontinuous functions could also be integrated.
The jump function of P. G. LEJEUNE DIRICHLET should be seen in this light. At
that time only an integration theory due to CAUCHY, a precursor of Riemann's,
was known. And it was not until B. RIEMANN's Habilitation in 1854 (text published posthumously in 1867) that Cauchy's ideas were made sufficiently precise
to integrate (certain) discontinuous functions. For the first time the need was felt
for integrability criteria. Parallel to this a "theory of content" was evolving - primarily at the hands of G. PEANO and C. JORDAN - to measure the areas of plane
and the volumes of spatial "figures".

But the decisive breakthrough occurred at the turn of the century, thanks to
the French mathematicians EMILE BOREL and HENRI LEBESGUE. In 1898 Borel -
coming from the direction of function theory - described the "$\sigma$-algebra" of sets
that today bear his name, the Borel sets, and showed how to construct a "measure"
on this $\sigma$-algebra that satisfactorily resolved the problems of measuring content.
In particular, he recognized the significance of the "$\sigma$-additivity" of the measure.
In his thesis (1902) LEBESGUE presented the integral concept, subsequently named
after him, that proved decisive for the development of a general theory. At the same
time he furnished the tools needed to make Borel's ideas more precise. From then
on Lebesgue-Borel measure on the $\sigma$-algebra of Borel sets and Lebesgue measure
on a somewhat larger $\sigma$-algebra - consisting of the sets which are "measurable" in
Lebesgue's sense - became standard methods of analysis.

What was new about Lebesgue's integral concept was not just the way it was
defined, but also - and this was the real reason for its fame - its great versatility as
manifested in the way it behaved with respect to limit operations. Consequently
the convergence theorems are at the center of the integration theory developed by
Lebesgue and his intellectual progeny.
Subsequent developments are characterized by increasing recognition of the
versatility of Lebesgue's concepts in dealing with new demands from mathematics
and its applications. In the course of time (up to 1930) the general (abstract) measure concept crystallized, and a theory of integration built on it - after Lebesgue's
model.

It is this theory that will be developed here in an introductory fashion, but
far enough that from the platform so erected the reader can easily press ahead to
deeper questions and the manifold applications. Areas in which measure and integration play a key role are, for example, ergodic theory, spectral theory, harmonic
analysis on locally compact groups, and mathematical economics. But the foremost example is probability theory, which uses measure and integration as an
indispensable tool and whose own specific kinds of questions and methods have
in turn helped to shape the former. Even today the development of measure and
integration theory is far from finished.
The book is comprised of four chapters. The first is devoted to the measure
concept and in particular to the Lebesgue-Borel measure and its interplay with
geometry. In the second chapter the integral determined by a measure, and in
particular the Lebesgue integral, the one determined by Lebesgue-Borel measure,
will be introduced and investigated. The short third chapter deals with the product
of measures and the associated integration. An application of this which is very
important in Fourier analysis is the convolution of measures. In the fourth and
last chapter the abstract concept of measure is made more concrete in the form
of Radon measures. As in the original example of Lebesgue-Borel measure, here
the relation of the measure to a topology on the underlying set moves into the
foreground. Essentially two kinds of spaces are allowed: Polish spaces and locally
compact spaces. The topological tools needed for this will mostly be developed in
the text, with the reader occasionally being given only a reference (very specific)
to the standard textbook literature.
The examples accompanying the exposition of a theme have an important function. They are supposed to illuminate the concepts and illustrate the limitations
of the theory. The reader should therefore work through them with care. Exercises also accompany the exposition. They are not essential to understanding later
developments and, in particular, proofs are not superficially shortened by consigning parts to the exercises. But the exercises do serve to deepen the reader's
understanding of the material treated in the text, and working them is strongly
recommended.

Notations

Here we assemble some of the notation and phraseology which will be used in the
text without further comment and which - with but a few exceptions - are in
general use.

By $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ we designate the sets of natural numbers $1, 2, \ldots$ (excluding 0), of
whole numbers, of rational numbers and of real numbers, respectively. We always
think of the field $\mathbb{R}$ as equipped with its usual (euclidean) metric and the topology
that it determines. Thus $|x - y|$ is the euclidean distance between two numbers
$x, y \in \mathbb{R}$. We also speak of the number line $\mathbb{R}$.
Via the adjunction of $(+)\infty$ and $-\infty$ to $\mathbb{R}$, the extended or compactified number
line $\overline{\mathbb{R}}$ is produced. Addition with the improper numbers $+\infty$ and $-\infty$ is performed
in the usual way: $a + (\pm\infty) = (\pm\infty) + a = \pm\infty$ for $a \in \mathbb{R}$, and as well
$(+\infty) + (+\infty) = +\infty$ and $(-\infty) + (-\infty) = -\infty$. On the other hand $+\infty + (-\infty)$ and
$-\infty + (+\infty)$ are not defined.
As usual too we set $a \cdot (\pm\infty) = \pm\infty$ for all real $a > 0$, including $a = +\infty$, and
$a \cdot (\pm\infty) = \mp\infty$ for all real $a < 0$, including $a = -\infty$. Not so general but typical
in measure theory are the additional conventions

$0 \cdot (\pm\infty) = (\pm\infty) \cdot 0 := 0$,

which mean that the product $a \cdot b$ is defined for all $a, b \in \overline{\mathbb{R}}$.
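The arithmetic on the extended number line can be sketched in code. The following is only an illustration, assuming the usual measure-theoretic convention that a product with a factor 0 equals 0 even when the other factor is $\pm\infty$; the function names `ext_mul` and `ext_add` are hypothetical. Ordinary floating-point arithmetic does not follow this convention (there `0 * inf` is "not a number"), which is why the zero case is handled separately.

```python
import math

INF = math.inf


def ext_mul(a: float, b: float) -> float:
    """Product on the extended line, with the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        # Assumed convention: a zero factor always gives 0.
        return 0.0
    return a * b


def ext_add(a: float, b: float) -> float:
    """Sum on the extended line; (+inf) + (-inf) stays undefined."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("(+inf) + (-inf) is not defined")
    return a + b
```

With these helpers, `ext_mul(a, b)` returns a value for every pair of extended reals, whereas `ext_add` deliberately refuses the one undefined combination.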


The notation A := B or B =: A means that this equation is the definition of A
in terms of B.
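The $<$ (resp., $\le$) relation in $\mathbb{R}$ is extended to $\overline{\mathbb{R}}$ via the decree $-\infty < a < +\infty$
for all $a \in \mathbb{R}$. A plus sign affixed to $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ or $\overline{\mathbb{R}}$ as a subscript means the sets
$\mathbb{Z}_+$, $\mathbb{Q}_+$, $\mathbb{R}_+$, $\overline{\mathbb{R}}_+$ of all non-negative whole, rational, real numbers, or - in the last
case - all $a \in \overline{\mathbb{R}}$ with $0 \le a \le +\infty$.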


Intervals in R are designated as usual by [a, b], ]a, b[, ]a, b] and [a, b[. However,

(a, b) will never be used for an open interval, but only for the ordered pair with
first element a, second element b.
For every pair of elements $a, b \in \overline{\mathbb{R}}$

$a \vee b := \max\{a, b\}$,    $a \wedge b := \min\{a, b\}$

designate their respective maximum and minimum. Obviously the equations

$|a| = a \vee (-a) = a^+ + a^-$  and  $a = a^+ - a^-$

hold without any restrictions on $a$ if we set, as usual,

$a^+ := a \vee 0$  and  $a^- := (-a)^+ = -(a \wedge 0)$.

Of course, $a^+ \ge 0$ and $a^- \ge 0$ for all $a$. For finitely many $a_1, \ldots, a_n \in \overline{\mathbb{R}}$ the
corresponding expressions $a_1 \vee \ldots \vee a_n$ and $a_1 \wedge \ldots \wedge a_n$ stand for $\max\{a_1, \ldots, a_n\}$
and $\min\{a_1, \ldots, a_n\}$, respectively.


For the set-theoretic operations we use the usual symbols: $\cup$ or $\bigcup$ for union,
$\cap$ or $\bigcap$ for intersection, and the prefix $\complement$ to signify complementation. The set-theoretic
relation of inclusion is written $A \subset B$, and equality of the sets is not
thereby excluded. For the difference set $A \cap \complement B$, the set of all $x \in A$ such that
$x \notin B$, we also write $A \setminus B$. Sets $A$ and $B$ which have an empty intersection, that
is, for which $A \cap B = \emptyset$, are said to be disjoint.
The power set $\mathscr{P}(\Omega)$ of a set $\Omega$ is the set of all subsets of $\Omega$, including the
empty set $\emptyset$. A set $A$ will be called countable if it is either finite or denumerably
infinite. In other words, we will be using "countable" in lieu of the equally popular
expression "at most countable". Obviously the empty set is to be understood as
a finite set. A set will be called non-denumerable or uncountable if it is neither
finite nor denumerable.

Mappings of a set $A$ into a set $B$ will be denoted by $f : A \to B$ or by the
mapping prescription $x \mapsto f(x)$ (with $x \in A$). In case $B = \mathbb{R}$ we speak of a real
function or a real-valued function on $A$. Not universal, but useful for our purposes,
is the designation numerical function on $A$ for mappings $f : A \to \overline{\mathbb{R}}$ into the
extended number line. The restriction of a mapping $f : A \to B$ to a subset $A'$
of $A$ will be denoted by $f \mid A'$. The composition of $f$ with a mapping $g : B \to C$
will be denoted $g \circ f$ and the pre-image or inverse-image of a set $B' \subset B$ under
the mapping $f$ will be denoted $f^{-1}(B')$.
A sequence in a set $A$ is a mapping $f : \mathbb{N} \to A$ of the set $\mathbb{N}$ of natural numbers into $A$. Designating the image element $f(n)$ by $a_n$, we also write
$(a_n)_{n \in \mathbb{N}}$ or simply $(a_n)$ for the mapping $f$. If other index sets, e.g., $\mathbb{Z}_+ =
\{0, 1, \ldots\}$, come up, this notation is appropriately modified to, e.g., $(a_n)_{n \in \mathbb{Z}_+}$ or
$(a_n)_{n = 0, 1, \ldots}$. In the same way finite sets are often exhibited as $(a_i)_{i = 1, \ldots, n}$ with
$n \in \mathbb{N}$. Even more generally, we write mappings $f : I \to A$ of a set $I$ into the
set $A$ as $(a_i)_{i \in I}$, understanding by $a_i$ the element $f(i)$ of $A$. And we then speak of
a family in $A$ (with index set $I$).
If the terms of a sequence $(a_n)_{n \in \mathbb{N}}$ in a set $A$ from some index $n_0 \in \mathbb{N}$ onwards
possess a certain property, that is, if there are but finitely many exceptional indices,
we say that ultimately all terms of the sequence have the property. The popular
phrasing "almost all terms of the sequence possess the property" has to be avoided
in measure theory because there the concept "almost all" is employed in another
sense.

If $f$ and $g$ are real functions on a set $X$, then $f + g$, $f \cdot g$, etc., designate the
real functions $x \mapsto f(x) + g(x)$, $x \mapsto f(x) g(x)$, etc., on $X$. Numerical functions are
combined analogously, as long as $f(x) + g(x)$ is defined for every $x \in X$, there being
no problem with $f(x) g(x)$ in this regard, thanks to the preceding conventions. If
$(f_n)$ is a sequence of real or numerical functions on $X$ such that the series $\sum_{n=1}^{\infty} f_n(x)$
converges in $\overline{\mathbb{R}}$ for every $x \in X$, then $\sum_{n=1}^{\infty} f_n$, or simply $\sum f_n$, designates the
function $x \mapsto \sum_{n=1}^{\infty} f_n(x)$. Also, functions like $\sup_{n \in \mathbb{N}} f_n$, $\inf_{n \in \mathbb{N}} f_n$, $\limsup_{n \to \infty} f_n$, $\liminf_{n \to \infty} f_n$,
$\lim_{n \to \infty} f_n$ are defined "pointwise" via $x \mapsto \sup_{n \in \mathbb{N}} f_n(x)$, $x \mapsto \inf_{n \in \mathbb{N}} f_n(x)$, etc.; whereby,
of course, use of $\lim f_n$ presupposes the convergence in $\overline{\mathbb{R}}$ of the sequence $(f_n(x))$
for each $x \in X$.


For numerical functions $f_1, \ldots, f_n$ on a set $X$

$f_1 \vee \ldots \vee f_n$  and  $f_1 \wedge \ldots \wedge f_n$

designate the functions

$x \mapsto f_1(x) \vee \ldots \vee f_n(x)$  and  $x \mapsto f_1(x) \wedge \ldots \wedge f_n(x)$.

At each point $x \in X$ they assume, respectively, the largest and smallest of the
function values $f_1(x), \ldots, f_n(x)$. These two functions are called, respectively, the
upper and the lower envelopes of $f_1, \ldots, f_n$. Correspondingly, $\sup_{n \in \mathbb{N}} f_n$ and $\inf_{n \in \mathbb{N}} f_n$ are
called the upper and lower envelopes of the sequence $(f_n)$ of numerical functions
on $X$.
A numerical function defined on a subset of $\overline{\mathbb{R}}$ is called isotone, resp., antitone, if
it is weakly increasing, resp., decreasing. We use this terminology also for numerical
functions $f : A \to \overline{\mathbb{R}}$ when $A$ is a (partially) ordered set. That is, if from $x, y \in A$
and $x \le y$ always follows $f(x) \le f(y)$, resp., $f(x) \ge f(y)$, then $f$ is called isotone,
resp., antitone. If from $x < y$ always follows $f(x) < f(y)$, resp., $f(x) > f(y)$, then
$f$ is called strictly isotone, resp., strictly antitone.
For sequences $(a_n)$ in $\overline{\mathbb{R}}$ the symbolisms

$a_n \uparrow a$,    $a_n \downarrow a$

express that the sequence is isotone, resp., antitone, and that $a \in \overline{\mathbb{R}}$ is its supremum, resp., its infimum.
The end of a proof is signaled by the symbol $\Box$.
References of the form "RADON [1913]" are to the bibliography at the end of
the book.
Section 18, labelled with *, can be skipped over in a first reading.

Table of Contents

Preface
Introduction
Notations

Chapter I Measure Theory

1. $\sigma$-algebras and their generators
2. Dynkin systems
3. Contents, premeasures, measures
4. Lebesgue premeasure
5. Extension of a premeasure to a measure
6. Lebesgue-Borel measure and measures on the number line
7. Measurable mappings and image measures
8. Mapping properties of the Lebesgue-Borel measure

Chapter II Integration Theory

9. Measurable numerical functions
10. Elementary functions and their integral
11. The integral of non-negative measurable functions
12. Integrability
13. Almost everywhere prevailing properties
14. The spaces $\mathscr{L}^p(\mu)$
15. Convergence theorems
16. Applications of the convergence theorems
17. Measures with densities: the Radon-Nikodym theorem
18.* Signed measures
19. Integration with respect to an image measure
20. Stochastic convergence
21. Equi-integrability

Chapter III Product Measures

22. Products of $\sigma$-algebras and measures
23. Product measures and Fubini's theorem
24. Convolution of finite Borel measures

Chapter IV Measures on Topological Spaces

25. Borel sets, Borel and Radon measures
26. Radon measures on Polish spaces
27. Properties of locally compact spaces
28. Construction of Radon measures on locally compact spaces
29. Riesz representation theorem
30. Convergence of Radon measures
31. Vague compactness and metrizability questions

Bibliography
Symbol Index
Name Index
Subject Index

Chapter I
Measure Theory

To geometrically simple subsets of the line, the plane, and 3-dimensional space,
elementary geometry assigns "numerical measures" called length, area and volume.
At first all that is intuitively clear is how the length of a segment, the area of
a rectangle and the volume of a box should be defined. Proceeding from these we
can determine by elementary geometric methods the lengths, areas, and volumes
of more complicated sets if we accept certain calculational rules for dealing with
such numerical measures.
If one thinks for example about the elementary determination of the area of
a (topologically) open triangle, one begins by decomposing it via one of its altitudes
into two open right triangles and the altitude itself. One further recalls that every
right triangle arises from insertion of a diagonal into an appropriate rectangle.
Every line segment is assigned numerical measure 0 when considered as a surface.
The following two rules of calculation therefore lead to the determination of the
areas of triangles:

(A) If the set $A$ has numerical measure $\alpha$, and $B$ is congruent to $A$, then $B$ also
has numerical measure $\alpha$.
(B) If $A$ and $B$ are disjoint sets with numerical measures $\alpha$ and $\beta$, resp., then
$A \cup B$ has numerical measure $\alpha + \beta$.
The limits of such elementary geometric considerations are already reached in
defining the area of an open disk $K$, to which end one proceeds thus: A sequence of
open $3 \cdot 2^{n-1}$-gons $E_n$ ($n \in \mathbb{N}$) is inscribed in $K$, with $E_1$ being an open equilateral
triangle, and the vertices of $E_{n+1}$ being those of $E_n$ together with the intersections
of the circle with the radii perpendicular to the sides of $E_n$. Thus $E_{n+1}$ consists
of $E_n$ together with its $3 \cdot 2^{n-1}$ edges and the open isosceles triangles which have
these edges as hypotenuses and vertices on the circle. Since $K$ is the union of all
the $E_n$, it looks like a "mosaic of triangles", that is, like a union of disjoint open
triangles and segments (namely, common sides of various triangles). The following
broader formulation of (B) therefore leads to a definition of the area of the disk $K$:
(C) If $(A_n)$ is a sequence of pairwise disjoint sets, and $A_n$ has numerical measure
$\alpha_n$ ($n \in \mathbb{N}$), then $\bigcup_{n=1}^{\infty} A_n$ has numerical measure $\sum_{n=1}^{\infty} \alpha_n$.
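The exhaustion of the disk by inscribed polygons can be followed numerically. The sketch below is an illustration only: it assumes a disk of radius 1 and uses the elementary formula $\tfrac{1}{2} N r^2 \sin(2\pi/N)$ for the area of a regular $N$-gon inscribed in a circle of radius $r$; the function names are hypothetical. In the spirit of rule (C), the area of the disk is obtained as the area of the triangle $E_1$ plus the areas of the triangles added at each later step.

```python
import math


def inscribed_polygon_area(n: int, radius: float = 1.0) -> float:
    """Area of the regular (3 * 2**(n-1))-gon E_n inscribed in the disk."""
    sides = 3 * 2 ** (n - 1)
    return 0.5 * sides * radius ** 2 * math.sin(2 * math.pi / sides)


def added_triangles_area(n: int, radius: float = 1.0) -> float:
    """Total area of the triangles added when passing from E_n to E_{n+1}."""
    return inscribed_polygon_area(n + 1, radius) - inscribed_polygon_area(n, radius)


def disk_area_by_rule_c(terms: int, radius: float = 1.0) -> float:
    """Partial sum of the series from rule (C): E_1 plus the first added layers."""
    total = inscribed_polygon_area(1, radius)
    for n in range(1, terms + 1):
        total += added_triangles_area(n, radius)
    return total


if __name__ == "__main__":
    for terms in (1, 5, 10, 25):
        print(terms, disk_area_by_rule_c(terms))
```

The partial sums increase with the number of layers and approach $\pi$ for the unit disk, which is the value that rule (C) assigns to the area of $K$.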

If we replace $K$ and every $E_n$ by its topological closure, this method would not
lead to a plausible definition of the area of a closed disk $\overline{K}$, because $\overline{K}$ is not the
union of the closures $\overline{E}_n$ of the above constructed polygons $E_n$. A peculiarity and
disadvantage of the elementary geometric procedure is precisely the necessity of
choosing a special mode of decomposition tailored to the set $K$ being considered
in order to arrive at a numerical measure.
The question of a general method by means of which as many subsets of $\mathbb{R}^d$ (for
arbitrary $d \in \mathbb{N}$) "as possible" could in a natural way be assigned a $d$-dimensional
volume as numerical measure is what finally led to the mathematical discipline
called measure theory. The primary content of this chapter is an exposition of the
answer which measure theory furnished to this question. It will be seen that the key
to the answer lies in rule (C), and that this rule is obeyed by much more general
"numerical measures" which arise in situations quite remote from the original
intuitive geometric one. It is just the latter reason that explains the variety of
opportunities for applying measure theory in analysis, geometry and stochastics.

1. $\sigma$-algebras and their generators

Let $\Omega$ be an arbitrary set, $\mathscr{P}(\Omega)$ its power set, that is, the set of all subsets of $\Omega$.
Then along with every family $(A_i)_{i \in I}$ of sets from $\mathscr{P}(\Omega)$, its union $\bigcup_{i \in I} A_i$ and its
intersection $\bigcap_{i \in I} A_i$ are also in $\mathscr{P}(\Omega)$. Furthermore, $\mathscr{P}(\Omega)$ contains the complement
$\complement A$ of every set $A$ which it contains. In what follows we will be interested in
subsystems $\mathscr{A} \subset \mathscr{P}(\Omega)$ which have the corresponding properties, at least for
countable index sets $I$. According to the conventions set out in the introduction,
such index sets are those that are either finite or denumerably infinite.
1.1 Definition. A system $\mathscr{A}$ of subsets of a set $\Omega$ is called a $\sigma$-algebra (in $\Omega$) if
it has the following properties:

(1.1)  $\Omega \in \mathscr{A}$;
(1.2)  $A \in \mathscr{A} \implies \complement A \in \mathscr{A}$;
(1.3)  $(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{A}$.

Examples. 1. $\mathscr{P}(\Omega)$ is always a $\sigma$-algebra.

2. For any set $\Omega$ the system of all its subsets which are either countable or
co-countable, that is, the $A \subset \Omega$ such that either $A$ or $\complement A$ is countable, constitutes a $\sigma$-algebra.
Property (1.3) is confirmed as follows: If each $A_n$ is countable, then so
is the union $\bigcup_{n \in \mathbb{N}} A_n$. If some $A_{n_0}$ is not countable, then its complement is, and

$\complement \bigcup_{n \in \mathbb{N}} A_n = \bigcap_{n \in \mathbb{N}} \complement A_n \subset \complement A_{n_0}$

is likewise countable.

3. If $\mathscr{A}$ is a $\sigma$-algebra in a set $\Omega$ and $\Omega'$ is a subset of $\Omega$, then

(1.4)  $\Omega' \cap \mathscr{A} := \{\Omega' \cap A : A \in \mathscr{A}\}$

is a $\sigma$-algebra in $\Omega'$, called the trace of $\mathscr{A}$ in $\Omega'$. In case $\Omega' \in \mathscr{A}$, $\Omega' \cap \mathscr{A}$ consists
simply of all the subsets of $\Omega'$ which are elements of $\mathscr{A}$.

4. Let $\Omega$, $\Omega'$ be sets, $\mathscr{A}'$ a $\sigma$-algebra in $\Omega'$, and $T : \Omega \to \Omega'$ a mapping. Then the
system of sets

(1.5)  $T^{-1}(\mathscr{A}') := \{T^{-1}(A') : A' \in \mathscr{A}'\}$

is a $\sigma$-algebra in $\Omega$, as follows from the known behavior of the set-theoretic operations under inverse mappings (like $T^{-1}$ here).

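Every $\sigma$-algebra $\mathscr{A}$ has properties "dual" to (1.1) and (1.3), namely:

(1.6)  $\emptyset \in \mathscr{A}$;
(1.7)  $(A_n)_{n \in \mathbb{N}} \subset \mathscr{A} \implies \bigcap_{n \in \mathbb{N}} A_n \in \mathscr{A}$.

These follow from (1.1)-(1.3) and the identities $\emptyset = \complement \Omega$ and $\bigcap A_n = \complement(\bigcup \complement A_n)$.
Moreover,

$A_1 \cup \ldots \cup A_n = A_1 \cup \ldots \cup A_n \cup \emptyset \cup \emptyset \cup \ldots$

and

$A_1 \cap \ldots \cap A_n = A_1 \cap \ldots \cap A_n \cap \Omega \cap \Omega \cap \ldots$

Therefore, along with any finite number of sets which $\mathscr{A}$ contains, it also contains
their union and their intersection. From this observation and (1.2) follows as well:

(1.8)  $A, B \in \mathscr{A} \implies A \setminus B = A \cap \complement B \in \mathscr{A}$.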

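For constructing $\sigma$-algebras the following theorem is important:

1.2 Theorem. The intersection $\bigcap_{i \in I} \mathscr{A}_i$ of any family $(\mathscr{A}_i)_{i \in I}$ of $\sigma$-algebras in
a common set $\Omega$ is itself a $\sigma$-algebra in $\Omega$.

Its proof is just a routine check of properties (1.1)-(1.3). It follows that for every
system $\mathscr{E}$ of subsets of $\Omega$ there is a smallest $\sigma$-algebra $\sigma(\mathscr{E})$ which contains $\mathscr{E}$; that
is, $\sigma(\mathscr{E})$ is a $\sigma$-algebra in $\Omega$ with the defining properties

(i) $\mathscr{E} \subset \sigma(\mathscr{E})$,
(ii) for every $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$, $\sigma(\mathscr{E}) \subset \mathscr{A}$.
For a proof, consider the system $\Sigma$ of all $\sigma$-algebras $\mathscr{A}$ in $\Omega$ with $\mathscr{E} \subset \mathscr{A}$;
for example, $\mathscr{P}(\Omega)$ is an element of $\Sigma$. Then $\sigma(\mathscr{E})$ is the intersection of all the
$\mathscr{A} \in \Sigma$, which according to 1.2 possesses all the desired properties.
$\sigma(\mathscr{E})$ is called the $\sigma$-algebra generated by $\mathscr{E}$ (in $\Omega$) and $\mathscr{E}$ is called a generator
of $\sigma(\mathscr{E})$.
Examples. 5. If $\mathscr{E}$ itself is a $\sigma$-algebra in $\Omega$, then $\mathscr{E} = \sigma(\mathscr{E})$.

6. If $\mathscr{E}$ consists of a single set $A \subset \Omega$, then $\sigma(\mathscr{E}) = \{\emptyset, A, \complement A, \Omega\}$.

7. The $\sigma$-algebra in Example 2 is generated by the system of all finite subsets
of $\Omega$.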
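For a finite set $\Omega$ the generated $\sigma$-algebra can be computed directly, because countable unions then reduce to finite ones. The following is a minimal sketch under that finiteness assumption; instead of intersecting all $\sigma$-algebras containing the generator, it closes the generator under complements and pairwise unions until nothing new appears, which yields the same smallest system. The function name `generated_sigma_algebra` is hypothetical.

```python
from itertools import combinations


def generated_sigma_algebra(omega, generators):
    """Smallest sigma-algebra in a FINITE set omega containing all generators."""
    omega = frozenset(omega)
    system = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(system)
        # Close under complementation, property (1.2).
        for a in current:
            complement = omega - a
            if complement not in system:
                system.add(complement)
                changed = True
        # Close under pairwise unions; on a finite set this covers (1.3).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in system:
                system.add(union)
                changed = True
    return system


if __name__ == "__main__":
    omega = {1, 2, 3, 4}
    for member in sorted(generated_sigma_algebra(omega, [{1, 2}]), key=sorted):
        print(set(member) or "{}")
```

For a single generating set $A$ the result consists of the four sets $\emptyset$, $A$, $\complement A$, $\Omega$, in agreement with Example 6.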

Several systems of sets possessing some of the properties of $\sigma$-algebras frequently occur as generators. Of special interest are rings of sets.

1.3 Definition. A system $\mathscr{R}$ of subsets of a set $\Omega$ is called a ring (in $\Omega$) if it has
the following properties:

(1.9)   $\emptyset \in \mathscr{R}$;
(1.10)  $A, B \in \mathscr{R} \implies A \setminus B \in \mathscr{R}$;
(1.11)  $A, B \in \mathscr{R} \implies A \cup B \in \mathscr{R}$.

If in addition

(1.12)  $\Omega \in \mathscr{R}$

then $\mathscr{R}$ is called an algebra (in $\Omega$).

A ring contains with each two of its sets (and so, with each finite collection of
its sets) not only their union, but also their intersection. This is because $A \cap B =
A \setminus (A \setminus B)$.

1.4 Theorem. A system $\mathscr{R}$ of subsets of a set $\Omega$ is an algebra if and only if it has
properties (1.1), (1.2) and (1.11).
Proof. By definition an algebra has properties (1.1) and (1.11) and (1.10), and from
the latter follows (1.2). The converse follows from the fact that $\emptyset = \complement \Omega$, together
with the set-theoretic identity

$A \setminus B = A \cap \complement B = \complement(B \cup \complement A)$. $\Box$

Examples. 8. Every $\sigma$-algebra is an algebra.

9. For any set $\Omega$ the system of all sets $A \subset \Omega$ which are either finite or co-finite
(i.e., have finite complement in $\Omega$) is an algebra, but is a $\sigma$-algebra only if $\Omega$ is
finite.

10. The system of all finite subsets of a set $\Omega$ is a ring, but is an algebra only
if $\Omega$ itself is finite.

11. The smallest ring of subsets of a set $\Omega$ is the system $\{\emptyset\}$ consisting of the empty set alone.

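Exercises.
1. For every system $\mathscr{E}$ of subsets of a set $\Omega$ there exists a smallest ring $\rho(\mathscr{E})$
in $\Omega$ which contains $\mathscr{E}$. It is called the ring generated by $\mathscr{E}$. Prove this existence
assertion. Determine $\rho(\mathscr{E})$ and $\sigma(\mathscr{E})$ in the case where $\mathscr{E}$ consists of two subsets
$A$, $B$ of $\Omega$. When does $\rho(\mathscr{E}) = \sigma(\mathscr{E})$ hold in this latter case; when does it hold for
general $\mathscr{E}$?

2. For sets $A$ and $B$

$A \triangle B := (A \setminus B) \cup (B \setminus A)$

is called their symmetric difference. Prove that it obeys the following rules of
calculation (in which $A$, $B$, $C$ are arbitrary sets):

(a)  $A \triangle B = B \triangle A$;
(b)  $(A \triangle B) \triangle C = A \triangle (B \triangle C)$;
(c)  $A \triangle A = \emptyset$;  $A \triangle \emptyset = A$;
(d)  $\complement A \triangle \complement B = A \triangle B$;
(e)  $(A \triangle B) \cap C = (A \cap C) \triangle (B \cap C)$;
(f)  $(\bigcup_{n \in \mathbb{N}} A_n) \triangle (\bigcup_{n \in \mathbb{N}} B_n) \subset \bigcup_{n \in \mathbb{N}} (A_n \triangle B_n)$

(for arbitrary sequences $(A_n)$ and $(B_n)$ of sets).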
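Before proving the rules of exercise 2, it can be reassuring to test them on examples. The sketch below checks them on randomly chosen subsets of a small finite set; it is a plausibility check under the assumption of a finite ambient set, not a proof, and all function names are hypothetical.

```python
import random


def sym_diff(a, b):
    """Symmetric difference (A minus B) union (B minus A)."""
    return (a - b) | (b - a)


def rules_hold(a, b, c, omega):
    """Check the rules for three sets a, b, c inside the ambient set omega."""
    def comp(s):
        return omega - s

    return all([
        sym_diff(a, b) == sym_diff(b, a),
        sym_diff(sym_diff(a, b), c) == sym_diff(a, sym_diff(b, c)),
        sym_diff(a, a) == frozenset(),
        sym_diff(a, frozenset()) == a,
        sym_diff(comp(a), comp(b)) == sym_diff(a, b),
        sym_diff(a, b) & c == sym_diff(a & c, b & c),
    ])


def union_rule_holds(seq_a, seq_b):
    """Check the inclusion for unions, here for finite sequences of sets."""
    def union(sets):
        return frozenset().union(*sets)

    left = sym_diff(union(seq_a), union(seq_b))
    right = union([sym_diff(a, b) for a, b in zip(seq_a, seq_b)])
    return left <= right


def random_subset(omega, rng):
    return frozenset(x for x in omega if rng.random() < 0.5)


if __name__ == "__main__":
    rng = random.Random(0)
    omega = frozenset(range(8))
    ok = all(
        rules_hold(random_subset(omega, rng), random_subset(omega, rng),
                   random_subset(omega, rng), omega)
        for _ in range(1000)
    )
    print("all rules hold on the samples:", ok)
```

The inclusion in the rule for unions is in general strict: for example, with one-term sequences replaced by two terms $A_1 = B_2$ and $A_2 = B_1$, the left side is empty while the right side need not be.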


3. Deduce from exercise 2 that $\mathscr{R} \subset \mathscr{P}(\Omega)$ is a ring in a set $\Omega$ if and only if with
respect to the operation $\triangle$ (as addition) and $\cap$ (as multiplication) $\mathscr{R}$ constitutes
a commutative ring in the sense that the algebraists use that term.
4. A subset $\mathscr{N}$ of a ring $\mathscr{R}$ in a set $\Omega$ is called an ideal if it satisfies

(a)  $\emptyset \in \mathscr{N}$;
(b)  $N \in \mathscr{N}$, $M \in \mathscr{R}$, $M \subset N \implies M \in \mathscr{N}$;
(c)  $M, N \in \mathscr{N} \implies M \cup N \in \mathscr{N}$.

Continuing with exercise 3, show that $\mathscr{N} \subset \mathscr{R}$ is an ideal in $\mathscr{R}$ if and only if it is
an ideal in the algebraists' sense in the commutative ring $\mathscr{R}$. Every ideal in $\mathscr{R}$ is
itself a ring in $\Omega$.
5. Let $\Omega := \mathbb{N}$ and for each $n \in \mathbb{N}$, let $\mathscr{A}_n$ denote the $\sigma$-algebra in $\Omega$ generated by
the system $\mathscr{E}_n$ comprised of the singletons $\{1\}, \{2\}, \ldots, \{n\}$. Show that $\mathscr{A}_n$ consists of all subsets of $\Omega$ which are either contained in $\{1, 2, \ldots, n\}$ or contain the
complement of this set. Obviously $\mathscr{A}_n \subset \mathscr{A}_{n+1}$ for every $n \in \mathbb{N}$. Why is $\bigcup_{n \in \mathbb{N}} \mathscr{A}_n$
nevertheless not a $\sigma$-algebra in $\Omega = \mathbb{N}$?
[Hint: It is generally true of any isotone sequence $(\mathscr{R}_n)_{n \in \mathbb{N}}$ of rings in a set $\Omega$
that the union of all of them constitutes a $\sigma$-algebra if and only if they are equal
from some index onward. Cf. OVERDIJK, SIMONS and THIEMANN [1979] and, for
the special case of $\sigma$-algebras, BROUGHTON and HUFF [1977].]

2. Dynkin systems
It is often difficult to directly determine whether a given system of sets is a $\sigma$-algebra. The following concept, which goes back to DYNKIN [1961] but in inchoate
form even to SIERPINSKI [1928], helps to get around some of these difficulties.

2.1 Definition. A system $\mathscr{D}$ of subsets of a set $\Omega$ is called a Dynkin system (in $\Omega$)
if it has the following properties:

(2.1)  $\Omega \in \mathscr{D}$;
(2.2)  $D \in \mathscr{D} \implies \complement D \in \mathscr{D}$;
(2.3)  $D_n \in \mathscr{D}$ ($n \in \mathbb{N}$) pairwise disjoint $\implies \bigcup_{n \in \mathbb{N}} D_n \in \mathscr{D}$.

Every Dynkin system $\mathscr{D}$ thus contains the empty set $\emptyset = \complement \Omega$, and then (2.3)
also insures that $\mathscr{D}$ contains the union of every finite, pairwise disjoint collection
of its sets.
Examples. 1. Every $\sigma$-algebra is obviously a Dynkin system.

2. Let $\Omega$ be a finite set with an even number $2n$ of elements ($n \in \mathbb{N}$). Then the
system $\mathscr{D}$ of all $D \subset \Omega$ which contain an even number of elements is a Dynkin
system. In case $n > 1$, $\mathscr{D}$ is not an algebra, hence certainly not a $\sigma$-algebra.
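The example of the subsets with an even number of elements can be verified by brute force on a small set. The sketch below is an illustration with hypothetical function names; it relies on the observation that in a finite set closure under countable pairwise disjoint unions follows, by induction, from closure under unions of two disjoint members.

```python
from itertools import combinations


def even_subsets(omega):
    """All subsets of omega having an even number of elements."""
    elements = sorted(omega)
    return {
        frozenset(chosen)
        for size in range(0, len(elements) + 1, 2)
        for chosen in combinations(elements, size)
    }


def is_dynkin_system(omega, system):
    """Check (2.1)-(2.3) for a system of subsets of a FINITE set omega."""
    omega = frozenset(omega)
    if omega not in system:
        return False
    if any(omega - d not in system for d in system):
        return False
    # Unions of two disjoint members suffice in the finite case.
    return all(a | b in system for a in system for b in system if not (a & b))


def closed_under_intersection(system):
    return all(a & b in system for a in system for b in system)


if __name__ == "__main__":
    omega = {1, 2, 3, 4}
    system = even_subsets(omega)
    print(is_dynkin_system(omega, system), closed_under_intersection(system))
```

For a four-element set the system passes the Dynkin test but is not closed under intersection, since two different two-element sets can meet in a single point. For a two-element set the system is $\{\emptyset, \Omega\}$, which is a $\sigma$-algebra.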

The precise connection between the concepts of $\sigma$-algebra and Dynkin system
is elucidated in the following considerations:

2.2 Lemma. Every Dynkin system $\mathscr{D}$ is closed with respect to the formation of
proper complements, meaning that

(2.2')  $D, E \in \mathscr{D}$, $D \subset E \implies E \setminus D \in \mathscr{D}$.

Proof. According to what was noted right after definition 2.1, the set $D \cup \complement E$,
being the union of the disjoint sets $D$ and $\complement E$ from $\mathscr{D}$, lies in $\mathscr{D}$. But then the
complement of this set with respect to $\Omega$, that is, $E \cap \complement D = E \setminus D$, lies in $\mathscr{D}$. $\Box$
Consequently, Dynkin systems can also be defined via properties (2.1), (2.2')
and (2.3).

2.3 Theorem. A Dynkin system is a $\sigma$-algebra just if it contains the intersection of any two of its sets.

Proof. What needs to be shown is that every Dynkin system $\mathscr{D}$ which is closed under finite intersections is a $\sigma$-algebra. Of the defining properties of a $\sigma$-algebra, only (1.3) needs to be confirmed and we do that thus: According to (2.2') and the closure hypothesis, $A \setminus B = A \setminus (A \cap B)$ lies in $\mathscr{D}$ whenever $A, B \in \mathscr{D}$. Since $(A \setminus B) \cap B = \emptyset$ and $A \cup B = (A \setminus B) \cup B$, $\mathscr{D}$ contains the union of any two, hence the union of any finitely many, of its elements. For any sequence $(D_n)_{n \in \mathbb{N}} \subset \mathscr{D}$, we have

$$\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=0}^{\infty} (D'_{n+1} \setminus D'_n)$$

in which $D'_0 := \emptyset$ and $D'_n := D_1 \cup \ldots \cup D_n$ for each $n \in \mathbb{N}$. The sets $D'_{n+1} \setminus D'_n$ are pairwise disjoint and, thanks to (2.2') and what has already been proved, they lie in $\mathscr{D}$. According to (2.3) then the union of the sets $D_n$ lies in $\mathscr{D}$. □

Just as for $\sigma$-algebras, algebras and rings, every system $\mathscr{E} \subset \mathscr{P}(\Omega)$ lies in a smallest Dynkin system. It is, of course, called the Dynkin system generated by $\mathscr{E}$, and is denoted $\delta(\mathscr{E})$.
The significance of Dynkin systems lies primarily in the following fact:

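2.4 Theorem. Every $\mathscr{E} \subset \mathscr{P}(\Omega)$ which is closed with respect to finite intersection satisfies

$$\delta(\mathscr{E}) = \sigma(\mathscr{E}). \tag{2.4}$$

Proof. Since every $\sigma$-algebra is a Dynkin system, $\sigma(\mathscr{E})$ is a Dynkin system containing $\mathscr{E}$ and consequently $\delta(\mathscr{E}) \subset \sigma(\mathscr{E})$. If conversely, $\delta(\mathscr{E})$ were known to be a $\sigma$-algebra, the dual relation $\sigma(\mathscr{E}) \subset \delta(\mathscr{E})$ would also follow. In view of 2.3 therefore it suffices to show that $\delta(\mathscr{E})$ is closed under intersection. To prove this, we introduce for every $D \in \delta(\mathscr{E})$ the system

$$\mathscr{D}_D := \{Q \in \mathscr{P}(\Omega) : Q \cap D \in \delta(\mathscr{E})\}.$$

A routine check confirms that $\mathscr{D}_D$ is a Dynkin system. For every $E \in \mathscr{E}$ the hypothesis on $\mathscr{E}$ insures that $\mathscr{E} \subset \mathscr{D}_E$ and therewith that $\delta(\mathscr{E}) \subset \mathscr{D}_E$. Thus for every $D \in \delta(\mathscr{E})$ and every $E \in \mathscr{E}$ we have $E \cap D \in \delta(\mathscr{E})$; that is, $\mathscr{E} \subset \mathscr{D}_D$, and consequently $\delta(\mathscr{E}) \subset \mathscr{D}_D$, holding for every $D \in \delta(\mathscr{E})$. But this is just the property of $\delta(\mathscr{E})$ that had to be confirmed. □

Systems of subsets which are closed under intersections (respectively, unions) of two, hence of any finite number, of their sets will from now on be described as $\cap$-stable (respectively, $\cup$-stable).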
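On a finite set both generated systems can be computed by repeatedly closing under the defining operations, which makes equality (2.4) easy to observe. The sketch below assumes a four-element $\Omega$; the helper `closure` and the two sample generators are illustrative choices, not part of the text. The second generator is deliberately not $\cap$-stable.

```python
from itertools import combinations

def closure(omega, system, disjoint_only):
    """Close `system` under complements and pairwise unions.

    With disjoint_only=True only disjoint unions are added, which on a
    finite set yields the generated Dynkin system; with disjoint_only=False
    all unions are added, which yields the generated sigma-algebra.
    """
    omega = frozenset(omega)
    sets = {frozenset(s) for s in system} | {omega}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            if (omega - a) not in sets:
                sets.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):
            if disjoint_only and (a & b):
                continue
            if (a | b) not in sets:
                sets.add(a | b)
                changed = True
    return sets

OMEGA = {1, 2, 3, 4}

# An intersection-stable generator: {1} & {1,2} = {1}.
stable = [{1}, {1, 2}]
delta_stable = closure(OMEGA, stable, disjoint_only=True)
sigma_stable = closure(OMEGA, stable, disjoint_only=False)

# A generator that is not intersection-stable: {1,2} & {2,3} = {2}.
unstable = [{1, 2}, {2, 3}]
delta_unstable = closure(OMEGA, unstable, disjoint_only=True)
sigma_unstable = closure(OMEGA, unstable, disjoint_only=False)

print(delta_stable == sigma_stable, len(delta_unstable), len(sigma_unstable))
```

For the second generator the Dynkin system has 6 members while the $\sigma$-algebra is the whole power set, so the hypothesis of 2.4 cannot simply be dropped.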

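Exercise.
Determine the Dynkin system generated by the system $\mathscr{E}$ consisting of just two subsets $A, B$ of $\Omega$. Show that $\delta(\mathscr{E})$ and $\sigma(\mathscr{E})$ coincide just in case one of the sets $A \cap B$, $A \cap \complement B$, $B \cap \complement A$ or $\complement A \cap \complement B$ is empty.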


3. Contents, premeasures, measures


Combining the concepts of ring and $\sigma$-algebra with the properties (B) and (C) of
lengths, areas and volumes that we encountered in the introduction leads to the
basic concepts of measure theory.

3.1 Definition. Let $\mathscr{R}$ be a ring in $\Omega$ and $\mu$ a function on $\mathscr{R}$ with values in $[0, +\infty]$. It is called a premeasure on $\mathscr{R}$ if

$$\mu(\emptyset) = 0 \tag{3.1}$$

and for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n) \qquad (\sigma\text{-additivity}) \tag{3.2}$$

holds. $\mu$ is called a content if instead of (3.2) it only satisfies

$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i) \qquad (\text{finite additivity}) \tag{3.3}$$

(for every two and therewith) for every finitely many pairwise disjoint sets $A_1, \ldots, A_n \in \mathscr{R}$.

Due to (3.1) every premeasure is evidently a content. To see this, you have only to take $A_{n+1} = A_{n+2} = \ldots = \emptyset$ in (3.2).

Examples. 1. For every ring $\mathscr{R}$ in $\Omega$ and every point $\omega \in \Omega$ the function $\varepsilon_\omega$ defined on $\mathscr{R}$ by

$$\varepsilon_\omega(A) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$$

is a premeasure. It is called the premeasure defined by unit mass at $\omega$.

2. Let $\mathscr{A}$ be the $\sigma$-algebra defined in Example 2 of §1, for an uncountable set $\Omega$, say for $\Omega = \mathbb{R}$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is countable. Since of two disjoint subsets of $\Omega$ at most one can have a countable complement, property (3.2) is easily confirmed; thus $\mu$ is a premeasure on $\mathscr{A}$.

3. Let $\mathscr{A}$ be the algebra defined in Example 9 of §1, for a countably infinite set $\Omega$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is finite. Then $\mu$ is a content but not a premeasure. The first assertion has a proof analogous to that in the preceding example, the second follows from the fact that $\Omega$ is the disjoint union of countably many 1-element sets.

4. Let $\mu_1, \mu_2, \ldots$ be a sequence of contents (premeasures) on a ring $\mathscr{R}$, and let $\alpha_1, \alpha_2, \ldots$ be a sequence of non-negative real numbers. Then

$$\mu := \sum_{n=1}^{\infty} \alpha_n \mu_n \tag{3.4}$$

is also a content (premeasure) on $\mathscr{R}$.


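Every content $\mu$ on a ring $\mathscr{R}$ enjoys the following further properties (in which $A, B, A_1, B_1, \ldots \in \mathscr{R}$):

$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B); \tag{3.5}$$

$$A \subset B \implies \mu(A) \le \mu(B) \qquad (\text{isotoneity}); \tag{3.6}$$

$$A \subset B,\ \mu(A) < +\infty \implies \mu(B \setminus A) = \mu(B) - \mu(A) \qquad (\text{subtractivity}); \tag{3.7}$$

$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) \le \sum_{i=1}^{n} \mu(A_i) \qquad (\text{subadditivity}); \tag{3.8}$$

for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$

$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr). \tag{3.9}$$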

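Proof. For arbitrary $A, B \in \mathscr{R}$

$$A \cup B = A \cup (B \setminus A) \quad \text{and} \quad B = (A \cap B) \cup (B \setminus A).$$

Because of finite additivity, it follows from these that

$$\mu(A \cup B) = \mu(A) + \mu(B \setminus A) \quad \text{and} \quad \mu(B) = \mu(A \cap B) + \mu(B \setminus A),$$

and from addition of the last two equations

$$\mu(A \cup B) + \mu(A \cap B) + \mu(B \setminus A) = \mu(A) + \mu(B) + \mu(B \setminus A).$$

In case $\mu(B \setminus A)$ is finite, (3.5) follows from this. In case $\mu(B \setminus A) = +\infty$, the formulas for $\mu(A \cup B)$, $\mu(B)$ show that each of them must also equal $+\infty$, and (3.5) consequently holds in this case too. If $A \subset B$, the preceding formula for $\mu(B)$ reads

$$\mu(B) = \mu(A) + \mu(B \setminus A),$$

which, thanks to $\mu \ge 0$, delivers both (3.6) and (3.7). If we set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$, then $B_1, \ldots, B_n$ are pairwise disjoint sets from $\mathscr{R}$, which entails that

$$\mu\Bigl(\bigcup_{i=1}^{n} B_i\Bigr) = \sum_{i=1}^{n} \mu(B_i).$$

From the facts that $B_i \subset A_i$ ($i = 1, \ldots, n$), $\mu$ is isotone, and $\bigcup_{i=1}^{n} B_i = \bigcup_{i=1}^{n} A_i$ now follows (3.8). To prove (3.9) we only have to observe that for every sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathscr{R}$ with $A := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{R}$

$$\mu(A_1) + \ldots + \mu(A_m) = \mu(A_1 \cup \ldots \cup A_m) \le \mu(A) \qquad (m \in \mathbb{N})$$

and let $m \to \infty$. □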
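Properties (3.5) to (3.8) can be checked exhaustively for a small content of the form $\sum_\omega \alpha_\omega \varepsilon_\omega$ (Examples 1 and 4). The following sketch assumes a four-point set and arbitrarily chosen weights `ALPHA`; both are illustrative and not taken from the text. Exact rational arithmetic avoids rounding issues in the equalities.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = (1, 2, 3, 4)
# Hypothetical non-negative weights alpha_omega.
ALPHA = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(3), 4: Fraction(1, 3)}

def mu(a):
    """Content mu = sum of alpha_omega * (unit mass at omega)."""
    return sum((ALPHA[w] for w in a), Fraction(0))

SUBSETS = [frozenset(c) for r in range(len(OMEGA) + 1)
           for c in combinations(OMEGA, r)]
PAIRS = [(a, b) for a in SUBSETS for b in SUBSETS]

# (3.5): mu(A u B) + mu(A n B) = mu(A) + mu(B)
modular = all(mu(a | b) + mu(a & b) == mu(a) + mu(b) for a, b in PAIRS)
# (3.6): A subset of B implies mu(A) <= mu(B)
isotone = all(mu(a) <= mu(b) for a, b in PAIRS if a <= b)
# (3.7): A subset of B implies mu(B \ A) = mu(B) - mu(A) (all values finite here)
subtractive = all(mu(b - a) == mu(b) - mu(a) for a, b in PAIRS if a <= b)
# (3.8) for two sets: mu(A u B) <= mu(A) + mu(B)
subadditive = all(mu(a | b) <= mu(a) + mu(b) for a, b in PAIRS)

print(modular, isotone, subtractive, subadditive)
```

Such a check is of course no substitute for the proof; it only illustrates what the four statements say in a concrete case.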


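Finally, if $\mu$ is a premeasure on $\mathscr{R}$, then for any sets $A_0, A_1, \ldots \in \mathscr{R}$

$$A_0 \subset \bigcup_{n=1}^{\infty} A_n \implies \mu(A_0) \le \sum_{n=1}^{\infty} \mu(A_n). \tag{3.10}$$

Because of $A_0 = \bigcup_{n=1}^{\infty} (A_0 \cap A_n)$ and (3.6), we can assume, in verifying (3.10), that $A_0 = \bigcup_{n=1}^{\infty} A_n$. Then set $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, ..., $B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$ and proceed as in the proof of (3.8).
In particular, we now have

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \le \sum_{n=1}^{\infty} \mu(A_n) \tag{3.10'}$$

whenever all the sets $A_n$ as well as their union lie in $\mathscr{R}$.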


The following theorem characterizes premeasures via other properties related to the $\sigma$-additivity. Its formulation is facilitated by the notations:

$$E_n \uparrow E \quad \text{and} \quad E_n \downarrow E \tag{3.11}$$

which mean that the sets $E_1 \subset E_2 \subset \ldots$ satisfy $E = \bigcup_n E_n$, or that the sets $E_1 \supset E_2 \supset \ldots$ satisfy $E = \bigcap_n E_n$. In other words, the sequence $(E_n)$ either increases isotonically to $E$ or decreases antitonically to $E$.

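3.2 Theorem. For a content $\mu$ on a ring $\mathscr{R}$ consider the following statements:

(a) $\mu$ is a premeasure.
(b) $A_n, A \in \mathscr{R}$ with $A_n \uparrow A \implies \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from below).
(c) $A_n, A \in \mathscr{R}$ with $A_n \downarrow A$ and $\mu(A_n) < +\infty$ for all $n \implies \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from above).
(d) $A_n \in \mathscr{R}$ with $A_n \downarrow \emptyset$ and $\mu(A_n) < +\infty$ for all $n \implies \lim_{n \to \infty} \mu(A_n) = 0$ (continuity at $\emptyset$).

Then the following implications hold:

$$\text{(a)} \iff \text{(b)} \implies \text{(c)} \iff \text{(d)}.$$

If $\mu$ is finite on $\mathscr{R}$, that is, $\mu(A) < +\infty$ for all $A \in \mathscr{R}$, then all four statements (a)-(d) are equivalent.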

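Proof. (a)$\Rightarrow$(b): Defining $A_0 := \emptyset$, the sets $B_n := A_n \setminus A_{n-1}$ ($n \in \mathbb{N}$) are pairwise disjoint, lie in $\mathscr{R}$ and satisfy

$$A = \bigcup_{n=1}^{\infty} B_n, \qquad A_n = B_1 \cup \ldots \cup B_n.$$

Therefore on account of the $\sigma$-additivity of $\mu$

$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_{n \to \infty} \sum_{i=1}^{n} \mu(B_i) = \lim_{n \to \infty} \mu(A_n).$$

(b)$\Rightarrow$(a): Let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{R}$ whose union $A := \bigcup_n A_n$ also is in $\mathscr{R}$. If we set $B_n := A_1 \cup \ldots \cup A_n$, then $B_n \in \mathscr{R}$ and $B_n \uparrow A$; therefore $\mu(A) = \lim \mu(B_n)$. As a result of the finite additivity of $\mu$

$$\mu(B_n) = \mu(A_1) + \ldots + \mu(A_n)$$

and therefore $\mu(A) = \sum_{n=1}^{\infty} \mu(A_n)$. Thus $\mu$ is $\sigma$-additive, and consequently is a premeasure.

(b)$\Rightarrow$(c): According to (3.7), $\mu(A_1 \setminus A_n) = \mu(A_1) - \mu(A_n)$ for every $n \in \mathbb{N}$. From $A_n \downarrow A$ follows $A_1 \setminus A_n \uparrow A_1 \setminus A$, and all the sets appearing here are in $\mathscr{R}$. From (b) therefore

$$\mu(A_1 \setminus A) = \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n).$$

From this follows (c), because $A \subset A_1$ means that also $\mu(A) < +\infty$ and so $\mu(A_1 \setminus A) = \mu(A_1) - \mu(A)$.

(c)$\Rightarrow$(d): Here there is nothing to prove!

(d)$\Rightarrow$(c): From $A_n \downarrow A$ follows $A_n \setminus A \downarrow \emptyset$. Since $A_n \setminus A \subset A_n$, the isotoneity of $\mu$ means that along with $\mu(A_n)$, $\mu(A_n \setminus A)$ is finite too. Hence by (d), $\lim \mu(A_n \setminus A) = 0$. But then (c) follows because $\mu(A) \le \mu(A_n) < +\infty$, causing $\mu(A_n \setminus A)$ to equal $\mu(A_n) - \mu(A)$.

To finish off, let us consider the case that $\mu$ is finite, and show that then (d)$\Rightarrow$(b): If $(A_n)$ is a sequence of sets from $\mathscr{R}$ and $A_n \uparrow A \in \mathscr{R}$, then $A \setminus A_n \downarrow \emptyset$. Taking account of the finiteness of $\mu$, it therefore follows that $0 = \lim \mu(A \setminus A_n) = \lim [\mu(A) - \mu(A_n)]$ and therewith (b). □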


Remark. If one modifies Example 3 of this section by making $\mu(A) := 0$ for all finite sets and $\mu(A) := +\infty$ for all cofinite sets, then one gets a content that is continuous at $\emptyset$ but is not a premeasure. Thus without the finiteness hypothesis in the preceding theorem, statements (a)-(d) are not generally equivalent. On the other hand, in (c) and (d) it is enough to explicitly hypothesize $\mu(A_n) < +\infty$ for some $n \in \mathbb{N}$, as then $\mu(A_m) < +\infty$ for all $m \ge n$ (isotoneity).
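For comparison, the unmodified content of Example 3 (value 0 on finite sets, 1 on cofinite sets of $\mathbb{N}$) is finite and fails continuity at $\emptyset$, so Theorem 3.2 again shows that it is not a premeasure. The sketch below represents a finite or cofinite subset of $\mathbb{N}$ by a finite set plus a flag; the class `FinCofin` and the helper names are hypothetical and only serve this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FinCofin:
    """A subset of N that is finite, or has finite complement.

    `points` is the set itself when cofinite is False,
    and the complement of the set when cofinite is True.
    """
    points: frozenset
    cofinite: bool = False

def tail(n):
    """A_n = {n, n+1, ...}, stored via its finite complement {1, ..., n-1}."""
    return FinCofin(frozenset(range(1, n)), cofinite=True)

def mu(a):
    """Content of Example 3: 0 on finite sets, 1 on cofinite sets."""
    return 1 if a.cofinite else 0

def contains(a, k):
    """Membership test for k in the represented subset of N."""
    return (k in a.points) != a.cofinite

# The tails decrease, and every k eventually drops out (k is not in A_{k+1}),
# so A_n decreases to the empty set, yet mu(A_n) = 1 for every n.
values = [mu(tail(n)) for n in range(1, 50)]
dropped = [not contains(tail(k + 1), k) for k in range(1, 50)]
print(set(values), all(dropped))
```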
The concepts of content and premeasure are preliminary to the central concept
of this book, that of a measure.

3.3 Definition. A premeasure $\mu$ defined on a $\sigma$-algebra $\mathscr{A}$ of subsets of a set $\Omega$ is called a measure (on $\mathscr{A}$). The function value $\mu(A)$ of $\mu$ at an $A \in \mathscr{A}$ is called the ($\mu$-)measure or the ($\mu$-)mass of $A$. If $\mu(\Omega) < +\infty$ (and consequently $\mu(A) < +\infty$ for every $A \in \mathscr{A}$), the measure is called finite.

Thus a measure is a non-negative, numerical function $\mu$ defined on a $\sigma$-algebra $\mathscr{A}$ and enjoying properties (3.1) and (3.2). The constant function $\mu = 0$ is a measure on every $\sigma$-algebra, the so-called zero-measure. The examples that follow are still of a rather formal nature. But as early as §6 and then quite a bit later we will become acquainted with an abundance of important examples.

Examples. 5. If for the ring $\mathscr{R}$ in Example 1 one takes a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, then $\varepsilon_\omega$ is a measure on $\mathscr{A}$, called the measure defined by a unit point mass at $\omega$, or more briefly the unit mass at $\omega$, and also the Dirac measure at $\omega$. These designations derive from interpreting a measure $\mu$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ as a mass distribution over $\Omega$. Accordingly for $A \in \mathscr{A}$, $\mu(A)$ is viewed as the mass that has been "smeared" over $A$. The Dirac measure at $\omega$ has, in so far as the one-element set $\{\omega\}$ lies in $\mathscr{A}$, all of its (unit) mass concentrated at the point $\omega$: $\varepsilon_\omega(\{\omega\}) = 1$, $\varepsilon_\omega(\complement\{\omega\}) = 0$.

6. Let $\Omega$ be an arbitrary set. For every $A \in \mathscr{P}(\Omega)$ let $|A|$ denote the number of elements in $A$ in case $A$ is finite, and otherwise $+\infty$. Then $\zeta(A) := |A|$ defines a measure on $\mathscr{P}(\Omega)$, called the counting measure on $\Omega$ (or on $\mathscr{P}(\Omega)$). Its restriction to a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is called counting measure on $\mathscr{A}$.

7. The premeasure defined in Example 2 is a measure.


Next we derive a not-so-obvious consequence of the $\sigma$-additivity of measures.

3.4 Lemma. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ and $(A_n)_{n \in \mathbb{N}}$ a sequence of sets from $\mathscr{A}$. Suppose there is a $k \in \mathbb{N}$ such that the sets $A_m$ and $A_n$ are disjoint whenever their indices satisfy $|m - n| \ge k$. Then

$$\sum_{n=1}^{\infty} \mu(A_n) \le k\,\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr). \tag{3.12}$$

When $k = 1$ this is, in view of (3.10'), just the $\sigma$-additivity requirement of a measure.

Proof. Designate the union of all the $A_n$ as $C$. For each $r = 1, \ldots, k$ the sets $(A_{r+mk})_{m \in \mathbb{N}_0}$ are pairwise disjoint. So if we set

$$F_r := \bigcup_{m=0}^{\infty} A_{r+mk},$$

then

$$\sum_{m=0}^{\infty} \mu(A_{r+mk}) = \mu(F_r) \le \mu(C)$$

because $F_r \subset C$. Since the sum of a series of non-negative terms is independent of the ordering of the terms, it follows that

$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{r=1}^{k} \mu(F_r).$$

From this equality and the preceding inequality the asserted inequality can be read off. □
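A quick numerical illustration of (3.12), using the counting measure of Example 6: take $A_n := \{n, n+1, \ldots, n+k-1\}$ for $n \le N$ and $A_n := \emptyset$ afterwards, so that $A_m \cap A_n = \emptyset$ whenever $|m - n| \ge k$. The function name `check_lemma` and the chosen sets are assumptions made for this sketch only.

```python
def check_lemma(k, count):
    """Return (hypothesis holds, left side, right side) of (3.12)
    for the counting measure and A_n = {n, ..., n+k-1}, n = 1..count."""
    sets = [set(range(n, n + k)) for n in range(1, count + 1)]
    # Hypothesis of Lemma 3.4: A_m and A_n are disjoint when |m - n| >= k.
    hypothesis = all(
        not (sets[m] & sets[n])
        for m in range(count) for n in range(count)
        if abs(m - n) >= k
    )
    lhs = sum(len(a) for a in sets)      # sum of the measures of the A_n
    rhs = k * len(set().union(*sets))    # k times the measure of the union
    return hypothesis, lhs, rhs

print(check_lemma(3, 10))   # overlapping windows of length 3
print(check_lemma(1, 5))    # k = 1: pairwise disjoint singletons, equality
```

In the case $k = 1$ both sides coincide, matching the remark that (3.12) then reduces to $\sigma$-additivity.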

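Exercises.
1. Let $\Omega$ be a finite, non-empty set. Show that the counting measure $\zeta$ on $\mathscr{P}(\Omega)$ coincides with $\sum_{\omega \in \Omega} \varepsilon_\omega$. Show further that every measure $\mu$ on $\mathscr{P}(\Omega)$ has the form $\mu = \sum_{\omega \in \Omega} \alpha_\omega \varepsilon_\omega$, with each $\alpha_\omega := \mu(\{\omega\})$.

2. For a finite content $\mu$ on a ring $\mathscr{R}$ establish the following inclusion-exclusion formula generalizing equality (3.5): For all $n \in \mathbb{N}$, $A_1, \ldots, A_n \in \mathscr{R}$

$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} \mu(A_i \cap A_j \cap A_k) - + \ldots + (-1)^{n-1} \mu(A_1 \cap \ldots \cap A_n).$$

3. For a premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ define

$$\mathscr{A} := \{A \in \mathscr{P}(\Omega) : A \cap R \in \mathscr{R} \text{ for every } R \in \mathscr{R}\},$$

$$\bar{\mu}(A) := \sup\{\mu(R) : R \subset A,\ R \in \mathscr{R}\}, \quad \text{for } A \in \mathscr{A}.$$

Show that $\mathscr{A}$ is an algebra in $\Omega$ which contains $\mathscr{R}$, and that $\bar{\mu}$ is a premeasure on $\mathscr{A}$ which extends $\mu$.

4. Suppose that $(\mu_n)_{n \in \mathbb{N}}$ is a sequence of premeasures on a common ring $\mathscr{R}$ which is isotone, that is, satisfies $\mu_n(A) \le \mu_{n+1}(A)$ for all $A \in \mathscr{R}$, $n \in \mathbb{N}$. Show that via $\mu(A) := \sup_{n \in \mathbb{N}} \mu_n(A)$ a premeasure is defined on $\mathscr{R}$.

5. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and denote by $\mathscr{N}_\mu$ the set of all $\mu$-null (or $\mu$-negligible) sets, that is, the $N \in \mathscr{A}$ for which $\mu(N) = 0$. Check that $\mathscr{N}_\mu$ has the following properties:
(a) $\emptyset \in \mathscr{N}_\mu$;
(b) $N \in \mathscr{N}_\mu$, $M \in \mathscr{A}$, $M \subset N \implies M \in \mathscr{N}_\mu$;
(c) $(N_n)_{n \in \mathbb{N}} \subset \mathscr{N}_\mu \implies \bigcup_{n \in \mathbb{N}} N_n \in \mathscr{N}_\mu$.
Subsets of $\mathscr{A}$ with these properties are called $\sigma$-ideals in $\mathscr{A}$. Thus $\mathscr{N}_\mu$ is always a $\sigma$-ideal. (Cf. Exercise 4 of §1.)

6. Every $\sigma$-ideal $\mathscr{N}$ in a $\sigma$-algebra $\mathscr{A}$ is the $\sigma$-ideal $\mathscr{N}_\mu$ of $\mu$-null sets of an appropriate measure $\mu$ on $\mathscr{A}$. To get such a $\mu$, define

$$\mu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N} \\ +\infty & \text{if } A \in \mathscr{A} \setminus \mathscr{N}. \end{cases}$$

As a special case, on the power set $\mathscr{P}(\Omega)$ of any set $\Omega$ there is a measure $\mu$ such that $\mu(A) = 0$ precisely if $A$ is a countable subset of $\Omega$.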


7. Let $\mu$ be a finite content on a ring $\mathscr{R}$. Show that

$$d_\mu(A, B) := \mu(A \,\triangle\, B) \qquad (A, B \in \mathscr{R})$$

defines a pseudometric on $\mathscr{R}$, that is, $d_\mu$ has all the properties of a metric on $\mathscr{R}$ with one possible exception: $d_\mu(A, B) = 0$ can happen without $A = B$. (Cf. Exercise 3 of §15.)

4. Lebesgue premeasure
Now we specialize $\Omega$ to be the $d$-dimensional number-space $\mathbb{R}^d$ ($d \in \mathbb{N}$). For every two points $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d) \in \mathbb{R}^d$ we write $a \le b$ (resp., $a \lhd b$) if $\alpha_i \le \beta_i$ for all $i = 1, \ldots, d$ (resp., $\alpha_i < \beta_i$ for all $i = 1, \ldots, d$). Every set of the form

$$[a, b[\; := \{x \in \mathbb{R}^d : a \le x \lhd b\}, \tag{4.1}$$

where $a, b \in \mathbb{R}^d$ and $a \le b$, is called a right half-open interval in $\mathbb{R}^d$. Geometrically described, these are parallelepipeds "open on the right" and having sides parallel to the coordinate axes. Clearly $[a, b[$ is nonempty if and only if $a \lhd b$, and in this case the interval $[a, b[$ uniquely determines the points $a, b$.
For every such interval $[a, b[$ the real number

$$(\beta_1 - \alpha_1) \cdot \ldots \cdot (\beta_d - \alpha_d) \tag{4.2}$$

is called its $d$-dimensional elementary content. It equals 0 just when $[a, b[\; = \emptyset$, that is, when $a \lhd b$ fails (although $a \le b$ holds, a prerequisite to employing interval notation).
From now on, $\mathscr{I}^d$ shall designate the set of all right half-open intervals in $\mathbb{R}^d$, and $\mathscr{F}^d$ the system of all finite unions of such intervals, so $\mathscr{I}^d \subset \mathscr{F}^d$. The elements of $\mathscr{F}^d$ are called $d$-dimensional figures.

4.1 Lemma. For all $I, J \in \mathscr{I}^d$

$$I \cap J \in \mathscr{I}^d \quad \text{and} \quad J \setminus I \in \mathscr{F}^d.$$

Every figure is a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$.

Proof. Let $I = [a, b[$, $J = [a', b'[$ with $a \le b$, $a' \le b'$, and let the corresponding coordinates of these points be $\alpha_i, \beta_i, \alpha'_i, \beta'_i$. If we let $e$ and $f$ denote those points in $\mathbb{R}^d$ whose coordinates are $\max\{\alpha_i, \alpha'_i\}$ and $\min\{\beta_i, \beta'_i\}$ ($i = 1, \ldots, d$), respectively, then $I \cap J = [e, f[$ in case $e \lhd f$ and otherwise $I \cap J = \emptyset$. Consequently, $I \cap J$ is already in $\mathscr{I}^d$. Because $J \setminus I = J \setminus (I \cap J)$ and we now have $I \cap J \in \mathscr{I}^d$, in proving the second claim we may assume that $I \ne \emptyset$ and $I \subset J$. Then $I$ and $J$ determine the points $a, b, a', b'$ uniquely and they satisfy $a' \le a \lhd b \le b'$.
Create new points from $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d)$ by replacing $\alpha_i$ by $\alpha'_i$ and $\beta_i$ by $\alpha_i$, or by replacing $\alpha_i$ by $\beta_i$ and $\beta_i$ by $\beta'_i$, and do this in all possible ways. More precisely, make such replacements for the $i$ coming from each non-empty subset of $\{1, \ldots, d\}$. The points so created give rise to at most $3^d - 1$ pairwise disjoint intervals from $\mathscr{I}^d$ whose union is $J \setminus I$. Thus $J \setminus I$ is a figure and is representable as a finite union of pairwise disjoint sets from $\mathscr{I}^d$. That this obtains as well for every figure $F = I_1 \cup \ldots \cup I_n \in \mathscr{F}^d$ with $I_1, \ldots, I_n \in \mathscr{I}^d$ can now be seen as follows:

$$F = I_1 \cup (I_2 \setminus I_1) \cup (I_3 \setminus (I_1 \cup I_2)) \cup \ldots \cup (I_n \setminus (I_1 \cup \ldots \cup I_{n-1}))$$

exhibits $F$ as a union of $n$ pairwise disjoint sets, each of the form $I \setminus (J_1 \cup \ldots \cup J_m)$ with $I, J_1, \ldots, J_m$ intervals from $\mathscr{I}^d$. Thus it suffices to show that every set of this form is the union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. But this follows from

$$I \setminus (J_1 \cup \ldots \cup J_m) = \bigcap_{i=1}^{m} (I \setminus J_i)$$

when, using what has already been proved, we write each $I \setminus J_i$ as a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$ and distribute the intersection through these unions. □
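The two constructions in this proof translate directly into code. The sketch below stores an interval $[a, b[$ as a pair of coordinate tuples and assumes integer coordinates so that all arithmetic is exact; the function names `content`, `intersect` and `difference` are illustrative. `difference` builds the pieces of $J \setminus I$ by choosing, in each coordinate, one of the three segments $[\alpha'_i, \alpha_i[$, $[\alpha_i, \beta_i[$, $[\beta_i, \beta'_i[$ and discarding the all-middle choice, which is $I$ itself.

```python
from itertools import product
from math import prod

def content(a, b):
    """Elementary content (4.2) of [a, b[, assuming a <= b coordinatewise."""
    return prod(bi - ai for ai, bi in zip(a, b))

def intersect(I, J):
    """Intersection of two half-open intervals, or None if it is empty."""
    (a, b), (a2, b2) = I, J
    e = tuple(max(x, y) for x, y in zip(a, a2))
    f = tuple(min(x, y) for x, y in zip(b, b2))
    return (e, f) if all(x < y for x, y in zip(e, f)) else None

def difference(J, I):
    """Pairwise disjoint intervals whose union is J minus I,
    for a non-empty interval I contained in J."""
    (a2, b2), (a, b) = J, I
    # Per coordinate: left part, middle part (that of I), right part.
    segments = [((lo2, lo), (lo, hi), (hi, hi2))
                for lo2, lo, hi, hi2 in zip(a2, a, b, b2)]
    pieces = []
    for choice in product(range(3), repeat=len(a)):
        if all(c == 1 for c in choice):
            continue  # the all-middle choice is I itself
        lo = tuple(segments[i][c][0] for i, c in enumerate(choice))
        hi = tuple(segments[i][c][1] for i, c in enumerate(choice))
        if all(x < y for x, y in zip(lo, hi)):  # skip empty pieces
            pieces.append((lo, hi))
    return pieces

J = ((0, 0), (4, 3))
I = ((1, 1), (3, 2))
pieces = difference(J, I)
print(len(pieces), content(*J), content(*I), sum(content(*p) for p in pieces))
```

In this two-dimensional example all $3^2 - 1 = 8$ pieces are non-empty, and their contents together with that of $I$ add up to the content of $J$.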

4.2 Theorem. $\mathscr{F}^d$ is a ring in $\mathbb{R}^d$.

Proof. The only thing that is not obvious is property (1.10) of a ring, according to which along with any sets $F, G \in \mathscr{F}^d$ their difference $F \setminus G$ must also be in $\mathscr{F}^d$. By definition there exist intervals $I'_1, \ldots, I'_m$, $I''_1, \ldots, I''_n \in \mathscr{I}^d$ such that

$$F = \bigcup_{i=1}^{m} I'_i \quad \text{and} \quad G = \bigcup_{j=1}^{n} I''_j.$$

But then

$$F \setminus G = \bigcup_{i=1}^{m} \bigcap_{j=1}^{n} (I'_i \setminus I''_j)$$

and so it only has to be shown that each set $\bigcap_{j=1}^{n} (I'_i \setminus I''_j)$ is a figure. According to 4.1 $I'_i \setminus I''_j$ is always a figure. So it further suffices to demonstrate that the intersection of two (whence, of any finite number of) figures is itself a figure. If however $F$ and $G$ are two figures represented as above, then thanks to distributivity $F \cap G$ is just the union of the sets $I'_i \cap I''_j$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$), which by another appeal to 4.1 is a figure. □

By definition every figure is a union of finitely many intervals from $\mathscr{I}^d$. Consequently, $\mathscr{F}^d \subset \mathscr{R}$ for every ring $\mathscr{R}$ in $\mathbb{R}^d$ such that $\mathscr{I}^d \subset \mathscr{R}$. So theorem 4.2 really says that $\mathscr{F}^d$ is the ring generated by $\mathscr{I}^d$.
Our geometric intuition now suggests the validity of the following theorem:


4.3 Theorem. There exists exactly one content $\lambda$ on $\mathscr{F}^d$ with the property that $\lambda(I)$ coincides with the $d$-dimensional elementary content of $I$, for each $I \in \mathscr{I}^d$. This content is real-valued.

Proof. According to 4.1, every figure $F \in \mathscr{F}^d$ has a representation $F = I_1 \cup \ldots \cup I_n$ as a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. Every content $\lambda$ on $\mathscr{F}^d$ therefore satisfies

$$\lambda(F) = \lambda(I_1) + \ldots + \lambda(I_n),$$

which shows that $\lambda$ is determined throughout $\mathscr{F}^d$ just by its values on $\mathscr{I}^d$ and is necessarily real-valued. Thus all we have to do is settle the existence question. To this end we first define $\lambda$ only on $\mathscr{I}^d$ as it must be defined, namely $\lambda(I)$ shall be the $d$-dimensional elementary content of $I$ for each $I \in \mathscr{I}^d$. Then we have:

(a) Let $I = [a, b[\; \in \mathscr{I}^d$, $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d)$ and $\gamma$ a real number satisfying $\alpha_i \le \gamma \le \beta_i$ for a fixed $i \in \{1, \ldots, d\}$. The hyperplane with equation $\xi_i = \gamma$ divides $I$ into two disjoint intervals $I_1 := [a', b[$ and $I_2 := [a, b'[$, $a'$ being $a$ with its $i$th coordinate replaced by $\gamma$, and $b'$ being $b$ with its $i$th coordinate replaced by $\gamma$. From (4.2) then follows that $\lambda(I) = \lambda(I_1) + \lambda(I_2)$. Induction therefore yields

(b) If $I \in \mathscr{I}^d$ is decomposed by finitely many hyperplanes in the manner described in (a) into pairwise disjoint intervals $I_1, \ldots, I_n \in \mathscr{I}^d$, then $\lambda(I) = \lambda(I_1) + \ldots + \lambda(I_n)$. More generally:

(c) For any finitely many, pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$ with $I_0 := I_1 \cup \ldots \cup I_n \in \mathscr{I}^d$, $\lambda(I_0) = \lambda(I_1) + \ldots + \lambda(I_n)$. In proving this we can obviously assume that each $I_j$ is not empty. Then there are points $a_j = (\alpha_{j1}, \ldots, \alpha_{jd})$ and $b_j = (\beta_{j1}, \ldots, \beta_{jd})$ from $\mathbb{R}^d$ with $a_j \lhd b_j$ and $I_j = [a_j, b_j[$, $j = 0, 1, \ldots, n$. The hyperplanes whose respective equations are $\xi_i = \alpha_{ji}$ or $\xi_i = \beta_{ji}$, for $i \in \{1, \ldots, d\}$, $j \in \{1, \ldots, n\}$ decompose $I_0$ into pairwise disjoint intervals $I'_1, \ldots, I'_m \in \mathscr{I}^d$. Each of $I_1, \ldots, I_n$ also decomposes into certain of these $I'_1, \ldots, I'_m$. The claimed equality therefore follows from $(n + 1)$ citations of (b).

(d) If now

$$F = I_1 \cup \ldots \cup I_n = J_1 \cup \ldots \cup J_m$$

are two representations of the figure $F \in \mathscr{F}^d$, each a union of pairwise disjoint intervals, then

$$\lambda(I_1) + \ldots + \lambda(I_n) = \lambda(J_1) + \ldots + \lambda(J_m).$$

Indeed, $I_j = I_j \cap F = \bigcup_{i=1}^{m} (I_j \cap J_i)$ is a representation of $I_j$ as a union of the pairwise disjoint intervals $I_j \cap J_1, \ldots, I_j \cap J_m$, and thanks to (c)

$$\lambda(I_j) = \sum_{i=1}^{m} \lambda(I_j \cap J_i) \qquad (j = 1, \ldots, n).$$


Upon interchanging the roles of $i$ and $j$, one gets analogously

$$\lambda(J_i) = \sum_{j=1}^{n} \lambda(I_j \cap J_i) \qquad (i = 1, \ldots, m).$$

Together these last two equations entail the equality $\sum_j \lambda(I_j) = \sum_i \lambda(J_i)$.

(e) Thus for every $F \in \mathscr{F}^d$ the number $\sum_j \lambda(I_j)$ is independent of the special representation

$$F = I_1 \cup \ldots \cup I_n$$

of $F$ as a union of finitely many pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$. Therefore the decree

$$\lambda(F) := \lambda(I_1) + \ldots + \lambda(I_n)$$

well defines an extension, to be denoted still by $\lambda$, of the original function on $\mathscr{I}^d$ to one on $\mathscr{F}^d$. This function is real-valued, non-negative, and according to (d) finitely-additive. Since $\emptyset \in \mathscr{I}^d$ and $\lambda(\emptyset) = 0$, a content with the sought-for properties is at hand. □

4.4 Theorem. The content $\lambda$ on $\mathscr{F}^d$ is a premeasure.

Proof. Because $\lambda$ is finite, 3.2 says that we only need to prove the continuity of $\lambda$ at $\emptyset$. To this end, let $(F_n)$ be an antitone sequence of figures from $\mathscr{F}^d$. We will show that from the assumption that

$$\delta := \lim_{n \to \infty} \lambda(F_n) = \inf_{n \in \mathbb{N}} \lambda(F_n) > 0$$

follows

$$\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset.$$

Each $F_n$ being a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$, it should be clear that by a slight leftward shift of the right endpoints of each of these intervals a new figure $G_n \in \mathscr{F}^d$ is created, whose topological closure $\overline{G}_n$ is still a subset of $F_n$, and

$$\lambda(F_n) - \lambda(G_n) \le 2^{-n}\delta.$$

If we set $H_n := G_1 \cap \ldots \cap G_n$, then $(H_n)$ is a sequence of sets from $\mathscr{F}^d$ satisfying $H_n \supset H_{n+1}$, $\overline{H}_n \subset \overline{G}_n \subset F_n$ for all $n$. Because $F_n$ is bounded its closed subset $\overline{H}_n$ is compact. As soon as we succeed in showing that each $H_n$ is not empty, it will follow from the finite-intersection property of compacts (WILLARD [1970], p. 118, KELLEY [1955], p. 136) that $\bigcap_{n \in \mathbb{N}} \overline{H}_n \ne \emptyset$ and so a fortiori $\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$. So let us prove that no $H_n$ is empty. For every $n \in \mathbb{N}$

$$\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta, \tag{$*$}$$

as we will confirm by induction. The inequality holds for $n = 1$ because $H_1 = G_1$, and by choice of $G_1$, $\lambda(F_1) - \lambda(G_1) \le 2^{-1}\delta$. Suppose the inequality valid for some $n$. Since $H_{n+1} = G_{n+1} \cap H_n$ and everything is finite, (3.5) gives

$$\lambda(H_{n+1}) = \lambda(G_{n+1}) + \lambda(H_n) - \lambda(G_{n+1} \cup H_n).$$

From the induction hypothesis $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$; from the choice of $G_{n+1}$, $\lambda(G_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta$ and $G_{n+1} \cup H_n \subset F_{n+1} \cup F_n = F_n$, so that $\lambda(G_{n+1} \cup H_n) \le \lambda(F_n)$. Combining these observations completes the inductive step in the confirmation of ($*$):

$$\lambda(H_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta + \lambda(F_n) - (1 - 2^{-n})\delta - \lambda(F_n) = \lambda(F_{n+1}) - (1 - 2^{-n-1})\delta.$$

Recalling that $\lambda(F_n) \ge \delta$ by definition of $\delta$, we infer from ($*$) the inequality $\lambda(H_n) \ge 2^{-n}\delta > 0$ and therewith the fact that $H_n \ne \emptyset$, the last link that had to be accounted for in the logical chain. □
4.5 Definition. The premeasure $\lambda$ on the ring $\mathscr{F}^d$ of $d$-dimensional figures in $\mathbb{R}^d$ is called Lebesgue premeasure in $\mathbb{R}^d$ or $d$-dimensional Lebesgue premeasure. From now on it will be denoted by $\lambda^d$.

Here we encounter for the first time the name of the French mathematician
H. LEBESGUE (1875-1941), the inventor of the measure and integration concepts that today are named after him. The development of the theory of measure and integration was spurred above all by his investigations and those of his
countryman E. BOREL (1871-1956). For the history of Lebesgue integration see
DIEUDONNÉ [1978] and HAWKINS [1970].

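Exercises.
1. Show that on $\mathscr{F}^1$ there is exactly one content $\mu$ that assigns to the right half-open interval $[\alpha, \beta[$, $\alpha, \beta \in \mathbb{R}$, the following values

$$\mu([\alpha, \beta[) = \begin{cases} 1 & \text{if } \alpha < 0 \le \beta \\ 0 & \text{in all other cases.} \end{cases}$$

Is $\mu$ $\sigma$-additive?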
2. Two intervals $I_0, J \in \mathscr{I}^d$ with $I_0 \subset J$ are given. Prove the existence of $k \le 2d$ intervals $I_1, \ldots, I_k \in \mathscr{I}^d$ with the following two properties: (i) $I_0 \cup \ldots \cup I_j \in \mathscr{I}^d$ for each $j \in \{0, \ldots, k\}$; (ii) $J = I_0 \cup \ldots \cup I_k$. [Hint: Proceed by induction on the dimension $d$.]

5. Extension of a premeasure to a measure


Lebesgue premeasure is not a measure because its domain of definition, the ring $\mathscr{F}^d$ of $d$-dimensional figures, is not a $\sigma$-algebra. For example, the whole space $\mathbb{R}^d$ is not in $\mathscr{F}^d$, every $d$-dimensional figure being a bounded subset of $\mathbb{R}^d$.
The elementary geometric considerations sketched at the beginning of this chapter however suggest that the domain of the premeasure $\lambda^d$ be so enlarged that a "numerical measure" gets assigned also to more complicated subsets of $\mathbb{R}^d$. The most satisfactory such result would say that $\lambda^d$ can be extended in exactly one way to a measure on an appropriate $\sigma$-algebra $\mathscr{A}$ in $\mathbb{R}^d$ with $\mathscr{F}^d \subset \mathscr{A}$.

Here we encounter the following general problem: A ring $\mathscr{R}$ in a set $\Omega$ and a content $\mu$ on $\mathscr{R}$ are given. Under what conditions does there exist a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ and a measure $\tilde{\mu}$ on $\mathscr{A}$ such that $\mu$ is the restriction of $\tilde{\mu}$ to $\mathscr{R}$? An obvious necessary condition for this is that $\mu$ be a premeasure on $\mathscr{R}$. The designation "premeasure" will turn out to be justified if we can show the converse: For every premeasure $\mu$ on a ring $\mathscr{R}$ there exists a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{R} \subset \mathscr{A}$, and a measure $\tilde{\mu}$ on $\mathscr{A}$ satisfying $\tilde{\mu} \,|\, \mathscr{R} = \mu$. It suffices to take for $\mathscr{A}$ the $\sigma$-algebra $\sigma(\mathscr{R})$ generated in $\Omega$ by $\mathscr{R}$.

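5.1 Theorem (Extension theorem). Every premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ can be extended in at least one way to a measure $\tilde{\mu}$ on the $\sigma$-algebra $\sigma(\mathscr{R})$ generated by $\mathscr{R}$ in $\Omega$.

Proof. For each subset $Q \subset \Omega$ designate by $\mathscr{U}(Q)$ the set of all sequences $(A_n)_{n \in \mathbb{N}}$ of sets from $\mathscr{R}$ which cover $Q$, that is, which satisfy

$$Q \subset \bigcup_{n \in \mathbb{N}} A_n.$$

Then the numerical function $\mu^*$ may be defined on $\mathscr{P}(\Omega)$ via

$$\mu^*(Q) := \begin{cases} \inf\Bigl\{\sum_{n=1}^{\infty} \mu(A_n) : (A_n) \in \mathscr{U}(Q)\Bigr\}, & \text{in case } \mathscr{U}(Q) \ne \emptyset \\ +\infty, & \text{in case } \mathscr{U}(Q) = \emptyset. \end{cases} \tag{5.1}$$

It has the following properties:

$$\mu^*(\emptyset) = 0; \tag{5.2}$$

$$Q_1 \subset Q_2 \implies \mu^*(Q_1) \le \mu^*(Q_2); \tag{5.3}$$

$$(Q_n)_{n \in \mathbb{N}} \subset \mathscr{P}(\Omega) \implies \mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n=1}^{\infty} \mu^*(Q_n). \tag{5.4}$$

Equality (5.2) follows from the observation that the constant sequence $\emptyset, \emptyset, \ldots$ is in $\mathscr{U}(\emptyset)$. The observation that $\mathscr{U}(Q_2) \subset \mathscr{U}(Q_1)$ follows from $Q_1 \subset Q_2$ serves to confirm (5.3). For the proof of (5.4) it can evidently be assumed that $\mu^*(Q_n)$ is finite and so in particular $\mathscr{U}(Q_n) \ne \emptyset$, for every $n \in \mathbb{N}$. For an arbitrary $\varepsilon > 0$ then, each $\mathscr{U}(Q_n)$ contains a sequence $(A_{nm})_{m \in \mathbb{N}}$ such that

$$\sum_{m=1}^{\infty} \mu(A_{nm}) \le \mu^*(Q_n) + 2^{-n}\varepsilon.$$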


The double sequence $(A_{nm})_{n,m \in \mathbb{N}}$ lies in $\mathscr{U}\bigl(\bigcup_{n=1}^{\infty} Q_n\bigr)$ and as a consequence the definition of $\mu^*$ gives

$$\mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n,m \in \mathbb{N}} \mu(A_{nm}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon$$

and (5.4) follows from this and the arbitrariness of $\varepsilon > 0$. It is immediate from the definition that

$$\mu^* \ge 0. \tag{5.5}$$

Decisive for what follows is the fact that every $A \in \mathscr{R}$ satisfies

$$\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap \complement A) \quad \text{for every } Q \in \mathscr{P}(\Omega), \tag{5.6}$$

as well as

$$\mu^*(A) = \mu(A). \tag{5.7}$$

In proving (5.6) we can again assume $\mu^*(Q) < +\infty$, so that $\mathscr{U}(Q) \ne \emptyset$. First of all we have

$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A)$$

for every sequence $(A_n)$ from $\mathscr{U}(Q)$, due to the finite additivity of $\mu$. Moreover, the sequence $(A_n \cap A)$ lies in $\mathscr{U}(Q \cap A)$ and the sequence $(A_n \setminus A)$ lies in $\mathscr{U}(Q \setminus A)$. Consequently,

$$\sum_{n=1}^{\infty} \mu(A_n) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$$

for every such sequence $(A_n)$, and from this fact (5.6) is immediate. Equality (5.7) follows on the one hand from (3.10), according to which $\mu(A) \le \mu^*(A)$, and on the other hand from consideration of the sequence $A, \emptyset, \emptyset, \ldots$ which lies in $\mathscr{U}(A)$.
The significance of what has been proven lies in the fact, which we will establish, that the system $\mathscr{A}^*$ of all sets $A \in \mathscr{P}(\Omega)$ satisfying (5.6) is a $\sigma$-algebra in $\Omega$ and the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure. Now (5.6) as just proved says that $\mathscr{R} \subset \mathscr{A}^*$, and so we shall have $\sigma(\mathscr{R}) \subset \mathscr{A}^*$. Then according to (5.7) $\tilde{\mu} := \mu^* \,|\, \sigma(\mathscr{R})$ is an extension of $\mu$ to a measure on $\sigma(\mathscr{R})$. The definition and theorem which follow will therefore complete the present proof. □
5.2 Definition. A numerical function $\mu^*$ on the power set $\mathscr{P}(\Omega)$ having properties (5.2)-(5.4) is called an outer measure on the set $\Omega$. A subset $A$ of $\Omega$ is called $\mu^*$-measurable if it satisfies (5.6).

Notice that $\mu^* \ge 0$ always prevails, an immediate consequence of (5.2) and (5.3) together.
The idea in the proof of the measure-extension theorem, which goes back to C. CARATHÉODORY (1873-1950), consists in associating via definition (5.1) an outer measure to the premeasure $\mu$ on $\mathscr{R}$ and then invoking the following theorem.


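5.3 Theorem (Carathéodory). Let $\mu^*$ be an outer measure on a set $\Omega$. Then the system $\mathscr{A}^*$ of all $\mu^*$-measurable sets $A \subset \Omega$ is a $\sigma$-algebra in $\Omega$. Moreover, the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure.

Proof. First let us note that the requirement (5.6) for a subset $A$ of $\Omega$ to lie in $\mathscr{A}^*$ is equivalent to

$$\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \setminus A) \quad \text{for all } Q \in \mathscr{P}(\Omega), \tag{5.6'}$$

because from (5.4) applied to the sequence $Q \cap A, Q \setminus A, \emptyset, \emptyset, \ldots$ follows the reverse of inequality (5.6), for every $Q \in \mathscr{P}(\Omega)$. From either (5.6) or (5.6') it is immediate that $\Omega \in \mathscr{A}^*$, and because of their symmetry in $A$ and $\complement A$, whenever $A$ lies in $\mathscr{A}^*$, so does $\complement A$. The following considerations will show that with each two of its sets $A$ and $B$, $\mathscr{A}^*$ also contains their union $A \cup B$, and so $\mathscr{A}^*$ is an algebra. $B \in \mathscr{A}^*$ entails that

$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \setminus B)$$

for every $Q \in \mathscr{P}(\Omega)$. Replacing $Q$ here first by $Q \cap A$, then by $Q \setminus A = Q \cap \complement A$, we get two new equalities (valid for all $Q \in \mathscr{P}(\Omega)$) which, when inserted into (5.6'), lead to

$$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B) + \mu^*(Q \cap \complement A \cap \complement B).$$

Replacing $Q$ here by $Q \cap (A \cup B)$ gives

$$\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B), \tag{5.8}$$

which in conjunction with the preceding equality yields

$$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap \complement A \cap \complement B) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \setminus (A \cup B)).$$

This being valid for all $Q \in \mathscr{P}(\Omega)$ affirms that $A \cup B \in \mathscr{A}^*$.
Now let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{A}^*$ and $A$ be their union. The choice of $A := A_1$, $B := A_2$ in (5.8) produces

$$\mu^*(Q \cap (A_1 \cup A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2).$$

An induction argument generalizes this to

$$\mu^*\Bigl(Q \cap \bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu^*(Q \cap A_i)$$

for all $Q \in \mathscr{P}(\Omega)$, all $n \in \mathbb{N}$. Recalling that $B_n := \bigcup_{i=1}^{n} A_i$ has already been proven to be in $\mathscr{A}^*$, and that $Q \setminus B_n \supset Q \setminus A$, so that $\mu^*(Q \setminus B_n) \ge \mu^*(Q \setminus A)$, we obtain

$$\mu^*(Q) = \mu^*(Q \cap B_n) + \mu^*(Q \setminus B_n) \ge \sum_{i=1}^{n} \mu^*(Q \cap A_i) + \mu^*(Q \setminus A)$$

for all $n \in \mathbb{N}$. From this and an application of (5.4) follows

$$\mu^*(Q) \ge \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$$

and consequently, as noted at the beginning of the proof, we actually have equality throughout:

$$\mu^*(Q) = \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) = \mu^*(Q \cap A) + \mu^*(Q \setminus A),$$

holding for all $Q \in \mathscr{P}(\Omega)$. Thus $A$ lies in $\mathscr{A}^*$. After all this we recognize that the algebra $\mathscr{A}^*$ is an $\cap$-stable Dynkin system and therefore by Theorem 2.3 a $\sigma$-algebra. If in the last pair of equalities we take $Q := A$, we get

$$\mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n),$$

proving that the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure. □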
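On a finite set the whole construction can be carried out by brute force, which may help to see what (5.1) and (5.6') do. The sketch below assumes $\Omega = \{1, 2, 3, 4\}$, the ring $\{\emptyset, \{1,2\}, \{3,4\}, \Omega\}$ and a hypothetical premeasure `MU` on it; these choices and the function names are illustrative only. Because the ring is finite, finite covers suffice in place of covering sequences.

```python
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})
# Hypothetical premeasure on the ring {empty, {1,2}, {3,4}, OMEGA}.
MU = {
    frozenset(): 0,
    frozenset({1, 2}): 2,
    frozenset({3, 4}): 3,
    OMEGA: 5,
}

def subsets(s):
    """All subsets of s as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def outer(q):
    """Outer measure (5.1): cheapest cover of q by sets of the ring."""
    ring = list(MU)
    best = float("inf")
    for r in range(1, len(ring) + 1):
        for cover in combinations(ring, r):
            if q <= frozenset().union(*cover):
                best = min(best, sum(MU[a] for a in cover))
    return best

def measurable(a):
    """Caratheodory condition (5.6') tested against every Q."""
    return all(outer(q) == outer(q & a) + outer(q - a) for q in subsets(OMEGA))

A_STAR = {a for a in subsets(OMEGA) if measurable(a)}
print(sorted(map(sorted, A_STAR)), outer(frozenset({1})))
```

In this tiny example the measurable sets turn out to be exactly the four sets of the ring, and `outer` agrees with `MU` on them, in line with (5.7). A set such as $\{1\}$ has outer measure 2 but is not measurable, since it splits $Q = \{1, 2\}$ into two parts of outer measure 2 each.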


It can be further shown that in many important cases the measure $\tilde{\mu}$ from Theorem 5.1 is uniquely determined. As a preliminary we give a proof that is a typical application of the technique of Dynkin systems. (Cf. also Exercise 9.)

5.4 Theorem (Uniqueness theorem). Let 9 be an n-stable generator of a a-algebria d in 1 and suppose that (En) is a sequence in 9 with U En = n. Then
nEN

measures p1 and p2 on W which satisfy


(i)

p1(E) = p2(E)

for all E E c9

p1(En)=p2(En)<+oo

for olin EN

and

(ii)

must in fact be identical.

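Proof. Denote by $\mathcal{E}_f$ the system of all sets $E \in \mathcal{E}$ satisfying $\mu_1(E) = \mu_2(E) < +\infty$. For a given $E \in \mathcal{E}_f$ consider the system

$$\mathcal{D}_E := \{D \in \mathcal{A} : \mu_1(E \cap D) = \mu_2(E \cap D)\}.$$

We will show that it is a Dynkin system. Obviously $\Omega \in \mathcal{D}_E$. If $D \in \mathcal{D}_E$, then $\mu_1(E \cap D) = \mu_2(E \cap D) < +\infty$ (since $E \in \mathcal{E}_f$), and so (3.7) shows that

$$\mu_1(E \cap \complement D) = \mu_1(E \setminus E \cap D) = \mu_1(E) - \mu_1(E \cap D) = \mu_2(E) - \mu_2(E \cap D) = \mu_2(E \cap \complement D),$$

which says that $\complement D \in \mathcal{D}_E$. The remaining property of Dynkin systems (2.3) follows at once from the $\sigma$-additivity of the measures $\mu_1$, $\mu_2$. Because $\mathcal{E}$ is $\cap$-stable, $\mathcal{E} \subset \mathcal{D}_E$ follows from (i) and the definition of $\mathcal{D}_E$. But then $\delta(\mathcal{E}) \subset \mathcal{D}_E$ because $\delta(\mathcal{E})$ is the smallest Dynkin system which contains $\mathcal{E}$. From Theorem 2.4 however, $\delta(\mathcal{E}) = \sigma(\mathcal{E}) = \mathcal{A}$. Therefore $\delta(\mathcal{E}) \subset \mathcal{D}_E \subset \mathcal{A}$ entails $\mathcal{D}_E = \mathcal{A}$. Thus

$$\mu_1(E \cap A) = \mu_2(E \cap A) \tag{5.9}$$

holds for all $E \in \mathcal{E}_f$ and $A \in \mathcal{A}$. On account of (ii) then in particular

$$\mu_1(E_n \cap A) = \mu_2(E_n \cap A) \qquad (n \in \mathbb{N},\ A \in \mathcal{A}). \tag{5.9'}$$

In analogy with the proofs of (3.8) and (3.10) we set

$$F_1 := E_1,\quad F_2 := E_2 \setminus E_1,\ \ldots,\ F_n := E_n \setminus (E_1 \cup \ldots \cup E_{n-1}),\ \ldots,$$

and get a sequence $(F_n)$ of pairwise disjoint sets from $\mathcal{A}$ satisfying $F_n \subset E_n$ for all $n \in \mathbb{N}$ and $\bigcup_{n \in \mathbb{N}} F_n = \bigcup_{n \in \mathbb{N}} E_n = \Omega$. Since $F_n \cap A \in \mathcal{A}$ it follows from (5.9') that

$$\mu_1(F_n \cap A) = \mu_1(E_n \cap F_n \cap A) = \mu_2(E_n \cap F_n \cap A) = \mu_2(F_n \cap A)$$

for all $A \in \mathcal{A}$ and all $n \in \mathbb{N}$. But then the fact that

$$A = \bigcup_{n \in \mathbb{N}} (F_n \cap A)$$

combines with the $\sigma$-additivity of $\mu_1$ and $\mu_2$ to deliver

$$\mu_1(A) = \sum_{n=1}^{\infty} \mu_1(F_n \cap A) = \sum_{n=1}^{\infty} \mu_2(F_n \cap A) = \mu_2(A)$$

for every $A \in \mathcal{A}$, which says that the measures $\mu_1$, $\mu_2$ are identical. $\square$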
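The disjointification step used in this proof can be tried out on finite sets. The following Python sketch is only an illustration: the function name `disjointify` and the sample sequence `E` are assumptions made for the example and do not come from the text.

```python
def disjointify(sets):
    """Return F_1, F_2, ... where F_n = E_n minus (E_1 | ... | E_{n-1})."""
    covered = set()
    pieces = []
    for e in sets:
        # Keep only the points of E_n not already covered by earlier sets.
        pieces.append(frozenset(e) - covered)
        covered |= set(e)
    return pieces


# Hypothetical covering sequence E_1, ..., E_4 of Omega = {1, ..., 6}.
E = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
F = disjointify(E)
```

The resulting sets are pairwise disjoint, each $F_n$ is contained in $E_n$, and the unions of the two sequences coincide, which is exactly what the proof requires of $(F_n)$.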

For finite measures some other natural stability properties of the generator $\mathcal{E}$ (e.g., its closure under set-differences) also ensure uniqueness. See, for example, ROBERTSON [1967].

In order to be able to formulate a useful sufficient condition for the uniqueness of the measure $\tilde{\mu}$ from Theorem 5.1, we make the

5.5 Definition. A content $\mu$ on a ring $\mathcal{R}$ in $\Omega$ is called $\sigma$-finite when a sequence $(A_n)_{n \in \mathbb{N}}$ of sets from $\mathcal{R}$ exists such that $\bigcup_{n \in \mathbb{N}} A_n = \Omega$ and $\mu(A_n) < +\infty$ for every $n \in \mathbb{N}$.

Examples. 1. Suppose that the content $\mu$ on the ring $\mathcal{R}$ in $\Omega$ is finite, that is, $\mu(A) < +\infty$ for every $A \in \mathcal{R}$. The $\sigma$-finiteness of $\mu$ is then equivalent to the existence of a sequential covering $(A_n)$ of $\Omega$ by sets $A_n \in \mathcal{R}$. But the latter condition does not automatically hold, as the trivial example $\Omega \ne \emptyset$, $\mathcal{R} := \{\emptyset\}$ illustrates.

In general, the $\sigma$-finiteness of a content $\mu$ on a ring $\mathcal{R}$ is equivalent to the existence of a sequence $(A'_n)$ of sets in $\mathcal{R}$ with $\mu(A'_n) < +\infty$ for all $n$ and $A'_n \uparrow \Omega$. In fact, if $(A_n)$ is merely a covering of $\Omega$ by sets in $\mathcal{R}$ having finite $\mu$-measure, then the sets $A'_n := A_1 \cup \ldots \cup A_n$, $n \in \mathbb{N}$, furnish a sequence of the desired kind.

2. The Lebesgue premeasure $\lambda^d$ is $\sigma$-finite (as well as finite). For if we denote by $\mathbf{n}$ the point in $\mathbb{R}^d$ whose coordinates are all equal to $n$, then $I_n := [-\mathbf{n}, \mathbf{n}[$ is an interval from $\mathcal{I}^d$, $\lambda^d(I_n) < +\infty$ ($n \in \mathbb{N}$), and $I_n \uparrow \mathbb{R}^d$.

3. The counting measure on a set $\Omega$, defined in Example 6 of §3, is $\sigma$-finite (resp., finite) just when $\Omega$ is countable (resp., finite).

In summary we have

5.6 Theorem. Every $\sigma$-finite premeasure $\mu$ on a ring $\mathcal{R}$ in a set $\Omega$ can be extended in exactly one way to a measure $\tilde{\mu}$ on $\sigma(\mathcal{R})$.

Proof. Only the uniqueness of $\tilde{\mu}$ has to be proved. But this follows immediately from 5.4: thanks to the $\sigma$-finiteness of $\mu$, the ring $\mathcal{R}$ has all the properties required of the generator $\mathcal{E}$ in the hypothesis of 5.4. $\square$

Remark. The hypothesis of $\sigma$-finiteness of $\mu$ in 5.6 cannot be dispensed with. It suffices to look, as in Example 1, at a non-empty set $\Omega$ and to take for $\mathcal{R}$ the ring consisting just of the empty set. On $\sigma(\mathcal{R}) = \{\emptyset, \Omega\}$ two different measures having the same restriction to $\mathcal{R}$ are defined by $\mu(\emptyset) = \nu(\emptyset) := 0$ and $\mu(\Omega) := 0 =: 1 - \nu(\Omega)$.

The uniqueness of the measure $\tilde{\mu}$ which extends the $\sigma$-finite premeasure $\mu$ in 5.6 is expressed more dramatically by the following approximation property. For simplicity we formulate it only for finite measures on an algebra.

5.7 Theorem (Approximation property). Let $\mu$ be a finite measure on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ which is generated by an algebra $\mathcal{A}_0$ in $\Omega$. Then for each $A \in \mathcal{A}$ there is a sequence $(C_n)_{n \in \mathbb{N}}$ in $\mathcal{A}_0$ satisfying

$$\lim_{n \to \infty} \mu(A \,\triangle\, C_n) = 0. \tag{5.10}$$

Here $\triangle$ designates the symmetric difference defined in Exercise 2 of §1. Exercise 7 of §3 is the real justification for the terminology "approximation property".

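Proof. Let $A \in \mathcal{A}$, $\varepsilon > 0$ be given. At issue is the existence of a $C \in \mathcal{A}_0$ with $\mu(A \,\triangle\, C) < \varepsilon$. According to 5.1 and 5.6, especially the equation (5.1) which extends $\mu \mid \mathcal{A}_0$ to $\mathcal{A}$, there exists a sequence $(A_n)_{n \in \mathbb{N}}$ in $\mathcal{A}_0$ which covers $A$ and satisfies

$$0 \le \sum_{n=1}^{\infty} \mu(A_n) - \mu(A) < \frac{\varepsilon}{2}. \tag{5.11}$$

If we set $C_n := \bigcup_{i=1}^{n} A_i$, $n \in \mathbb{N}$, then $A' := \bigcup_{n \in \mathbb{N}} A_n$ satisfies

$$C_n \uparrow A' \quad \text{and} \quad A' \setminus C_n \downarrow \emptyset.$$

Since $\mu$ is finite, and consequently continuous at $\emptyset$, an $n_0 \in \mathbb{N}$ exists for which

$$\mu(A' \setminus C_{n_0}) < \frac{\varepsilon}{2}. \tag{5.12}$$

Let us show that the set $C := C_{n_0} \in \mathcal{A}_0$ does what is wanted:

$$A \,\triangle\, C = (A \setminus C) \cup (C \setminus A) \subset (A' \setminus C) \cup (A' \setminus A),$$

and so the subadditivity of $\mu$ yields

$$\mu(A \,\triangle\, C) \le \mu(A' \setminus C) + \mu(A' \setminus A) = \mu(A' \setminus C) + \mu(A') - \mu(A) \le \mu(A' \setminus C) + \sum_{n=1}^{\infty} \mu(A_n) - \mu(A) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2},$$

by (5.11) and (5.12), which establishes the claim $\mu(A \,\triangle\, C) < \varepsilon$. $\square$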


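It should also be noted that

$$\lim_{n \to \infty} \mu(C_n) = \mu(A) \tag{5.13}$$

follows immediately from (5.10). The inequalities

$$|\mu(C_n) - \mu(A)| \le \mu(A \,\triangle\, C_n) \qquad (n \in \mathbb{N}) \tag{5.14}$$

make this obvious: For $C, D \in \mathcal{A}$, $C \subset D \cup (C \setminus D)$, so that $\mu(C) - \mu(D) \le \mu(C \setminus D) \le \mu(C \,\triangle\, D)$. As $C$ and $D$ may be interchanged here, (5.14) is confirmed. $\square$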
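The inequality behind (5.14) can be checked exhaustively on a small finite measure space. The sketch below is an illustration under stated assumptions: the point masses in `weights` are invented for the example, and they are chosen as dyadic fractions so that floating-point sums are exact.

```python
from itertools import combinations

# Hypothetical finite measure: dyadic point masses, so float sums are exact.
weights = {"a": 0.5, "b": 1.25, "c": 2.0, "d": 0.25}


def mu(subset):
    """Measure of a subset as the sum of its point masses."""
    return sum(weights[x] for x in subset)


def all_subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


subsets = all_subsets(weights)
# Smallest value of mu(C delta D) - |mu(C) - mu(D)| over all pairs;
# the inequality |mu(C) - mu(D)| <= mu(C delta D) says it is never negative.
smallest_slack = min(mu(C ^ D) - abs(mu(C) - mu(D)) for C in subsets for D in subsets)
```

Here `C ^ D` is Python's symmetric difference of sets, playing the role of $C \,\triangle\, D$.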

Exercises.
1. Let $\mu = \varepsilon_\omega$ be the premeasure on a ring $\mathcal{R}$ in $\Omega$ defined by putting unit mass at the point $\omega \in \Omega$. Under the hypothesis that $\{\omega\}$ can be realized as the intersection of a sequence from $\mathcal{R}$ and $\Omega$ as the union of such a sequence, prove that: (a) The outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set $A \in \mathcal{P}(\Omega)$ the value 1 or 0, according as $\omega \in A$ or $\omega \in \complement A$. (b) Every subset of $\Omega$ is $\mu^*$-measurable. (c) $\mu^*$ is the measure $\varepsilon_\omega$ on $\mathcal{P}(\Omega)$.

2. Consider the measure $\mu$ in Examples 2 and 7 of §3, say for $\Omega := \mathbb{R}$, and prove that: (a) The outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set $A \in \mathcal{P}(\Omega)$ the value 0 or 1, according as $A$ is countable or not. (b) $\mu^*$ is not a measure on $\mathcal{P}(\Omega)$, not even a content. (c) The only $\mu^*$-measurable sets are those in the $\sigma$-algebra $\mathcal{A}$ on which $\mu$ is defined.

3. Let $\mathcal{A}$ be the $\sigma$-algebra generated by an algebra $\mathcal{A}_0$ on the set $\Omega$, $\mu$ and $\nu$ measures on $\mathcal{A}$. Show that the validity of $\mu(A) \le \nu(A)$ for all $A \in \mathcal{A}_0$ need not imply its validity for all $A \in \mathcal{A}$. [Hint: $\mathcal{A}_0 := \mathcal{F}^1$, $\nu$ counting measure, $\mu := 2\nu$.] Find supplemental hypotheses that will render such an implication true.
4. Show that the sequence required in Definition 5.5 of the $\sigma$-finiteness of the content $\mu$ on the ring $\mathcal{R}$ in $\Omega$ can always be chosen to be a sequence of pairwise disjoint sets from $\mathcal{R}$ which cover $\Omega$ and each have finite measure.

5. Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$, and $\mu^*$ the outer measure defined by (5.1). Then to every set $Q \in \mathcal{P}(\Omega)$ corresponds an $A \in \mathcal{A}$, called a measurable hull of $Q$, with the properties that $Q \subset A$, $\mu^*(Q) = \mu(A)$, and $\mu(B) = 0$ for all $B \in \mathcal{A}$ such that $B \subset A \setminus Q$. [Hint: In case $\mu^*(Q) < +\infty$, show that there exists a sequence $(A_n)$ in $\mathcal{A}$ with $Q \subset A_n$ and $\mu(A_n) \le \mu^*(Q) + n^{-1}$ for every $n \in \mathbb{N}$. Then $A := \bigcap_{n \in \mathbb{N}} A_n$ has the desired properties.]

6. A measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ is called complete if every subset of a $\mu$-null set (cf. Exercise 5, §3) belongs to $\mathcal{A}$, and consequently is itself a $\mu$-null set. Show that:
(a) The measure $\mu^* \mid \mathcal{A}^*$ from Theorem 5.3 is complete.
(b) The measure in Examples 2 and 7 of §3 is complete.
(c) If $\mathcal{A}$ is a $\sigma$-algebra in a set $\Omega$, $\omega \in \Omega$ and $\{\omega\} \in \mathcal{A}$, then the Dirac measure $\varepsilon_\omega$ on $\mathcal{A}$ is complete just when $\mathcal{A} = \mathcal{P}(\Omega)$.

7. (a) Show that every measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ can be completed. That is, $\mu$ can be extended to a complete measure $\mu_0$ on a $\sigma$-algebra $\mathcal{A}_0$ in $\Omega$, $\mathcal{A} \subset \mathcal{A}_0$, in such a way that every complete measure $\mu'$ on a $\sigma$-algebra $\mathcal{A}'$ in $\Omega$, $\mathcal{A} \subset \mathcal{A}'$, which extends $\mu$ is also an extension of $\mu_0$. The (obviously unique) $\sigma$-algebra $\mathcal{A}_0$ is called the $\mu$-completion of $\mathcal{A}$; the triple $(\Omega, \mathcal{A}_0, \mu_0)$ is called the completion of $(\Omega, \mathcal{A}, \mu)$. [For such triples the term measure space will be introduced in §7.]
(b) Determine the completion of $(\Omega, \mathcal{A}, \varepsilon_\omega)$ from Exercise 6(c).
(c) Show that the $\mu$-completion $\mathcal{A}_0$ of a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ consists of all sets $A \cup N$ with $A \in \mathcal{A}$ and $N$ a subset of a $\mu$-null set. For every such set, $\mu_0(A \cup N) = \mu(A)$.
(d) Characterize the sets in $\mathcal{A}_0$ as follows: A set $A_0 \subset \Omega$ lies in $\mathcal{A}_0$ just if sets $A_1, A_2 \in \mathcal{A}$ exist such that $A_1 \subset A_0 \subset A_2$ and $\mu(A_2 \setminus A_1) = 0$.

8. Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$, $\mu^*$ the outer measure $\mu$ determines via (5.1), and $\mathcal{A}^*$ the $\sigma$-algebra of all $\mu^*$-measurable subsets of $\Omega$. With the help of Exercises 5 and 7, show that $(\Omega, \mathcal{A}^*, \mu^* \mid \mathcal{A}^*)$ is the completion of $(\Omega, \mathcal{A}, \mu)$.

9. The proof of Theorem 5.4 only uses condition (i) for sets $E \in \mathcal{E}$ which satisfy $\mu_1(E) = \mu_2(E) < +\infty$. Clarify this observation by showing that under the hypotheses of Theorem 5.4 the system $\mathcal{E}_f$ of all sets $E \in \mathcal{E}$ satisfying $\mu_1(E) = \mu_2(E) < +\infty$ is likewise an $\cap$-stable generator of $\mathcal{A}$.

6. Lebesgue-Borel measure and measures on the number line

We are going to pursue further the investigations in §4. So as before $\mathcal{I}^d$ will be the set of all right half-open intervals in $\mathbb{R}^d$, $\mathcal{F}^d$ the ring of all $d$-dimensional figures, and $\lambda^d$ the Lebesgue premeasure on $\mathcal{F}^d$. We have already noted that $\lambda^d$ is $\sigma$-finite. According to 5.6, $\lambda^d$ can be extended in exactly one way to a measure on $\sigma(\mathcal{F}^d)$, which measure will also be denoted by $\lambda^d$ from now on. Since every figure is a union of finitely many intervals $I \in \mathcal{I}^d$, we have

$$\sigma(\mathcal{F}^d) = \sigma(\mathcal{I}^d).$$

6.1 Definition. The elements of the $\sigma$-algebra generated in $\mathbb{R}^d$ by the system $\mathcal{I}^d$ of half-open intervals are called the Borel subsets of the space $\mathbb{R}^d$. Correspondingly $\sigma(\mathcal{I}^d)$ is called the $\sigma$-algebra of Borel subsets of $\mathbb{R}^d$; it will henceforth be denoted $\mathcal{B}^d$.

The results reviewed in the introduction can, following 4.3, be expressed thus:

6.2 Theorem. There is exactly one measure $\lambda^d$ on $\mathcal{B}^d$ which assigns to every right half-open interval in $\mathbb{R}^d$ its $d$-dimensional elementary content.

6.3 Definition. The measure $\lambda^d$ in Theorem 6.2 is called the Lebesgue-Borel measure (L-B measure, for short) on $\mathbb{R}^d$. For every Borel set $B \in \mathcal{B}^d$, $\lambda^d(B)$ will also be called the $d$-dimensional Lebesgue measure of $B$.

It is expedient to expand this definition: For every set $C \in \mathcal{B}^d$ the trace $\sigma$-algebra $C \cap \mathcal{B}^d$ consists of all Borel subsets of $C$ (cf. (1.4)). The restriction $\lambda^d_C$ of $\lambda^d$ to $C \cap \mathcal{B}^d$ is a measure. It will also be called the L-B measure on $C$.

Like the Lebesgue premeasure of which it is an extension, the L-B measure $\lambda^d$ is $\sigma$-finite (cf. Example 2 of §5). More generally

$$\lambda^d(B) < +\infty \tag{6.2}$$

for every bounded set $B \in \mathcal{B}^d$, since such a $B$ lies in an interval in $\mathcal{I}^d$; e.g., excepting finitely many $n$, $B$ lies in each interval $I_n$ from Example 2, §5, with the result that $\lambda^d(B) \le \lambda^d(I_n) < +\infty$.

Let us recall the question formulated in the introduction to Chapter I of finding a unified method for assigning a numerical measure of $d$-dimensional volume to as many subsets of $\mathbb{R}^d$ as possible. Step by step we will come to recognize that Theorem 6.2 answers this question in a most satisfactory way: for every Borel set $B$ in $\mathbb{R}^d$ its $d$-dimensional measure is the number we were seeking.

First of all it seems desirable to get a deeper insight into the $\sigma$-algebra $\mathcal{B}^d$ of Borel sets. In particular, the question naturally comes up whether topologically interesting sets, like the open, closed, or compact ones, are Borel. The characterization of $\mathcal{B}^d$ via such sets in the next theorem is often taken as the definition of the $\sigma$-algebra $\mathcal{B}^d$.
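6.4 Theorem. Let $\mathcal{O}^d$, $\mathcal{C}^d$, $\mathcal{K}^d$ denote the system of all open, closed, compact subsets of $\mathbb{R}^d$, respectively. Then

$$\mathcal{B}^d = \sigma(\mathcal{O}^d) = \sigma(\mathcal{C}^d) = \sigma(\mathcal{K}^d). \tag{6.3}$$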

Proof. $\mathcal{K}^d \subset \mathcal{C}^d \subset \sigma(\mathcal{C}^d)$, so $\sigma(\mathcal{K}^d) \subset \sigma(\mathcal{C}^d)$. Every set $C \in \mathcal{C}^d$ is the union of a sequence of sets $C_n \in \mathcal{K}^d$; for example, if $K_n$ are the compact balls with a fixed center and radii $n \in \mathbb{N}$, then the sets $C_n := C \cap K_n$ furnish such a sequence. Thus by (1.3), $\mathcal{C}^d \subset \sigma(\mathcal{K}^d)$, whence $\sigma(\mathcal{C}^d) \subset \sigma(\mathcal{K}^d)$ and so finally the equality of these two $\sigma$-algebras. Since the open sets are the complements of the closed ones, the equality $\sigma(\mathcal{O}^d) = \sigma(\mathcal{C}^d)$ is obvious; therewith the last two equalities in (6.3) are confirmed.

We finish up by showing that $\sigma(\mathcal{O}^d) = \mathcal{B}^d$. We will, as usual, use the term bounded open interval in $\mathbb{R}^d$ for every set of the form

$$]a, b[ \; := \{x \in \mathbb{R}^d : a \lhd x \lhd b\}, \tag{6.4}$$

where $a, b \in \mathbb{R}^d$ satisfy $a \le b$. Every right half-open interval $[a, b[ \; \in \mathcal{I}^d$ is the intersection of a sequence of bounded open intervals, namely, for

$$a := (\alpha_1, \ldots, \alpha_d) \quad \text{and} \quad a_n := (\alpha_1 - n^{-1}, \ldots, \alpha_d - n^{-1}) \qquad (n \in \mathbb{N})$$

we have

$$]a_n, b[ \; \downarrow \; [a, b[.$$

Therefore $\mathcal{I}^d \subset \sigma(\mathcal{O}^d)$ by (1.7) and consequently $\sigma(\mathcal{I}^d) \subset \sigma(\mathcal{O}^d)$. Every open set in $\mathbb{R}^d$ can be exhibited as the union of countably many bounded open intervals (e.g., all those which it contains whose endpoints have only rational coordinates). Moreover, every bounded open interval $]a, b[$ is the union of a sequence of intervals from $\mathcal{I}^d$, namely

$$[a_n, b[ \; \uparrow \; ]a, b[ \tag{6.5}$$

if we set

$$a_n := (\min\{\alpha_1 + n^{-1}, \beta_1\}, \ldots, \min\{\alpha_d + n^{-1}, \beta_d\}) \qquad (n \in \mathbb{N}), \tag{6.5'}$$

$\alpha_1, \ldots, \alpha_d$ and $\beta_1, \ldots, \beta_d$ being the coordinates of $a$ and $b$, respectively. Every open set is therefore the union of a sequence of intervals from $\mathcal{I}^d$, and so $\mathcal{O}^d \subset \sigma(\mathcal{I}^d) = \mathcal{B}^d$. It thus follows that $\sigma(\mathcal{O}^d) \subset \mathcal{B}^d$ and, as the reverse inclusion has already been established, the equality $\mathcal{B}^d = \sigma(\mathcal{O}^d)$ is confirmed. $\square$
We will become acquainted with some deeper properties of L-B measure in §8. In particular, there the existence of non-Borel sets, that is, the assertion

$$\mathcal{B}^d \ne \mathcal{P}(\mathbb{R}^d)$$

will be proved. For the moment we content ourselves with computing the Lebesgue measure $\lambda^d(B)$ of some geometrically simple Borel sets $B$.

Examples. 1. Every hyperplane $H$ orthogonal to one of the coordinate axes in $\mathbb{R}^d$ is an L-B-null set, i.e., a Borel set with $\lambda^d(H) = 0$. Let, say, $H$ be orthogonal to the $i$-th coordinate axis, $i \in \{1, \ldots, d\}$, that is, be of the form

$$H := \{x = (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d : \xi_i = \alpha\} \tag{6.6}$$

for an appropriate $\alpha \in \mathbb{R}$. $H$ is a closed set, and so is Borel. For each $n \in \mathbb{N}$, let $x_n$, $y_n$ be those points in $\mathbb{R}^d$ whose coordinates are $-n$ or $n$, respectively, at every index except $i$ and whose $i$-th coordinates are $\alpha$ or $\alpha + 2^{-n}(2n)^{1-d}\varepsilon$, respectively, where $\varepsilon > 0$. Evidently

$$H \subset \bigcup_{n \in \mathbb{N}} [x_n, y_n[$$

and

$$\lambda^d([x_n, y_n[) = 2^{-n}\varepsilon, \qquad n \in \mathbb{N}.$$

From (3.10) we therefore get

$$\lambda^d(H) \le \sum_{n=1}^{\infty} \lambda^d([x_n, y_n[) = \varepsilon.$$

Since this is true for every $\varepsilon > 0$, $\lambda^d(H) = 0$ follows.

Due to the isotoneity of measures, we consequently also have $\lambda^d(B) = 0$ for every Borel subset $B$ of such a hyperplane $H$.
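The arithmetic of this covering argument can be reproduced exactly with rational numbers. The following Python sketch is an illustration only; the function names `box_volume` and `covering_total` and the sample values of $d$ and $\varepsilon$ are assumptions made for the example.

```python
from fractions import Fraction


def box_volume(n, d, eps):
    """Elementary content of [x_n, y_n[: side 2n in d-1 directions, one thin side."""
    thin_side = Fraction(1, 2**n) * Fraction(1, (2 * n) ** (d - 1)) * eps
    return Fraction(2 * n) ** (d - 1) * thin_side


def covering_total(d, eps, terms):
    """Sum of the contents of the first `terms` boxes of the covering."""
    return sum(box_volume(n, d, eps) for n in range(1, terms + 1))


eps = Fraction(1, 10)            # hypothetical tolerance
single = box_volume(4, 3, eps)   # content of the 4th box in R^3
partial = covering_total(3, eps, 20)
```

Each box has content $2^{-n}\varepsilon$ regardless of $d$, so every partial sum stays strictly below $\varepsilon$ and the full series sums to $\varepsilon$.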

2. Every countable subset of $\mathbb{R}^d$ is an L-B-null set. Because of the $\sigma$-additivity of measures, it suffices to treat the case of one-point sets $\{x\} \subset \mathbb{R}^d$. Being a closed set, it is Borel; moreover for an appropriate hyperplane $H$ of the form (6.6) we have $\{x\} \subset H$.

3. For points $a, b \in \mathbb{R}^d$ with $a \le b$ consider besides the intervals $[a, b[$ and $]a, b[$ already defined, the compact interval

$$[a, b] := \{x \in \mathbb{R}^d : a \le x \le b\}$$

and, in contrast to $[a, b[$, the left half-open interval

$$]a, b] := \{x \in \mathbb{R}^d : a \lhd x \le b\}.$$

Then

$$\lambda^d([a, b[) = \lambda^d(]a, b[) = \lambda^d([a, b]) = \lambda^d(]a, b]). \tag{6.7}$$

First of all the intervals $[a, b[$, $]a, b[$ and $[a, b]$ are Borel sets by Theorem 6.4. As in its proof, we can show that

$$]a, b_n[ \; \downarrow \; ]a, b] \quad \text{and} \quad [a, b_n[ \; \downarrow \; [a, b] \tag{6.8}$$

for appropriate sequences $(b_n)$ in $\mathbb{R}^d$ converging to $b$. Again from 6.4 we then get that $]a, b]$ is Borel. From (6.5) follows

$$\lambda^d(]a, b[) = \lim_{n \to \infty} \lambda^d([a_n, b[) = \lambda^d([a, b[),$$

the first equality using the continuity from below of a measure, and the second using $\lim_{n \to \infty} a_n = a$ (from (6.5')) and the continuous dependence on $c$ and $d$ of the elementary content of the interval $[c, d[$. Analogously, with the help of (6.8), we conclude that

$$\lambda^d(]a, b]) = \lambda^d([a, b]),$$

this time citing the continuity of measures from above. Thus finally from the inclusions $]a, b[ \; \subset \; ]a, b] \subset [a, b]$ the remaining equality in (6.7) follows.

The choice of right half-open intervals for the construction of $\lambda^d$ is now seen to have been due solely to the fact that the ring $\mathcal{F}^d$ they generate is so simple to describe.


In a second step a large class of measures on the $\sigma$-algebra $\mathcal{B}^1$ of Borel subsets of the line will now be presented. These are the Borel measures. In general for $d \in \mathbb{N}$, a measure $\mu$ defined on $\mathcal{B}^d$ is called a Borel measure on $\mathbb{R}^d$ if

$$\mu(K) < +\infty \quad \text{for every compact } K \subset \mathbb{R}^d$$

or, equivalently, if $\mu(B) < +\infty$ for every bounded set $B \in \mathcal{B}^d$. L-B measure $\lambda^d$ is such a measure, according to (6.2).

The point of departure for defining $\lambda^1$ is the determination of $\lambda^1([a, b[)$ for intervals $[a, b[ \; \in \mathcal{I}^1$, namely as $b - a$. It suggests itself that this opening move might be generalized as follows: One has a function $F : \mathbb{R} \to \mathbb{R}$ and asks for conditions on it which guarantee the existence of a measure $\mu$ on $\mathcal{B}^1$ with the property

$$\mu([a, b[) = F(b) - F(a) \quad \text{for all } a, b \in \mathbb{R} \text{ with } a \le b. \tag{6.9}$$

Thanks to the uniqueness theorem 5.4 such a measure is already thereby, i.e., by its values on $\mathcal{I}^1$, uniquely specified. Since $\mu([a, b[) \ge 0$, (6.9) entails that the function $F$ must be isotone. Moreover, $F$ has to be left-continuous. This is because for every $x \in \mathbb{R}$ and every sequence $(x_n)$ in $\mathbb{R}$ with $x_n \uparrow x$, the corresponding interval behavior is $[x_1, x_n[ \; \uparrow \; [x_1, x[$, and since $\mu$ must be continuous from below, it follows that

$$\lim_{n \to \infty} F(x_n) - F(x_1) = \lim_{n \to \infty} \mu([x_1, x_n[) = \mu([x_1, x[) = F(x) - F(x_1),$$

that is, $\lim_{n \to \infty} F(x_n) = F(x)$: $F$ is left-continuous at $x$.

Functions $F : \mathbb{R} \to \mathbb{R}$ which are isotone and left-continuous will be called measure-generating (or measure-defining) functions (on $\mathbb{R}$). Of course, whenever $F$ is such a function, so is $aF + b$ for any $a \in \mathbb{R}_+$, $b \in \mathbb{R}$. The designation "measure-generating" is justified by the next theorem, which answers completely the earlier question of what are the appropriate conditions on $F$.

6.5 Theorem. To every measure-generating function $F$ on $\mathbb{R}$ there corresponds exactly one measure $\mu_F$ on $\mathcal{B}^1$ having property (6.9), that is, satisfying

$$\mu_F([a, b[) = F(b) - F(a) \quad \text{for all } [a, b[ \; \in \mathcal{I}^1.$$

The measure $\mu_G$ determined by the measure-generating function $G$ satisfies $\mu_G = \mu_F$ if and only if $G = F + c$ for some constant $c \in \mathbb{R}$. Every $\mu_F$ is a Borel measure on $\mathbb{R}$, and every Borel measure on $\mathbb{R}$ is a $\mu_F$ for an appropriate $F$.

Proof. The techniques employed in the proof of Theorem 4.3 can be repeated to show that corresponding to $F$ there is a unique content $\mu$ on the ring $\mathcal{F}^1$ of 1-dimensional figures which has property (6.9). That part of the proof used only the isotoneity of $F$. From the left-continuity of $F$ it follows that for every $I = [a, b[ \; \in \mathcal{I}^1$ and every $\varepsilon > 0$ there is a $J = [a, c[ \; \in \mathcal{I}^1$ with $\overline{J} \subset I$ and

$$\mu(I) - \mu(J) = \mu([c, b[) = F(b) - F(c) \le \varepsilon.$$

But then the technique employed in the proof of Theorem 4.4 shows that $\mu$ is a $\sigma$-finite (as well as finite) premeasure on $\mathcal{F}^1$.

According to 5.6 it can be extended in exactly one way to a measure on $\mathcal{B}^1$. This measure does what is wanted, is a $\mu_F$. Its uniqueness with respect to its prescription on $\mathcal{I}^1$ via $F$ was settled in the deliberations preceding the present theorem. From $\mu_F = \mu_G$ we get $G(b) - G(a) = F(b) - F(a)$ whenever $a \le b$. Upon applying this with $a = 0 \le b$ as well as with $a \le 0 = b$, we learn that $G = F + c$, with $c := G(0) - F(0)$. Every $\mu_F$ is a Borel measure, because every bounded $B \in \mathcal{B}^1$ is contained in $[-n, n[$ for some $n \in \mathbb{N}$ and so $\mu_F(B) \le \mu_F([-n, n[) = F(n) - F(-n) < +\infty$.

If conversely, $\mu$ is an arbitrary Borel measure on $\mathbb{R}$, we can define

$$F(x) := \begin{cases} \mu([0, x[) & \text{if } x \ge 0 \\ -\mu([x, 0[) & \text{if } x < 0 \end{cases}$$

and get a function on $\mathbb{R}$ having property (6.9) and therewith, in light of the discussion preceding this proof, measure-generating. In fact, for real numbers $0 \le a \le b$ the subtractivity (3.7) of measures entails that

$$\mu([a, b[) = \mu([0, b[ \; \setminus \; [0, a[) = F(b) - F(a)$$

and (6.9) is confirmed analogously when $a \le b \le 0$. In the remaining case $a < 0 < b$ we get (6.9) from $[a, b[ \; = [a, 0[ \; \cup \; [0, b[$ and the additivity of $\mu$. The uniqueness already proved leads finally to the equality of $\mu$ with the measure $\mu_F$ derived from $F$. $\square$

Notice that L-B measure $\lambda^1$ has the form $\mu_F$, with $F$ the identity map $x \mapsto x$ on $\mathbb{R}$.
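Both directions of Theorem 6.5 can be sketched numerically on intervals. The Python fragment below is an illustration under stated assumptions: the names `mu_F` and `generating_function` are invented for the example, and the sample function `G(x) = ceil(x) + 7` is chosen because it is isotone and left-continuous, so that its interval values count the integers in $[a, b[$.

```python
import math


def mu_F(F, a, b):
    """Value F(b) - F(a) assigned to the half-open interval [a, b[ by (6.9)."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)


def generating_function(mu_interval):
    """Build F from interval values, as in the converse part of the proof."""
    def F(x):
        return mu_interval(0, x) if x >= 0 else -mu_interval(x, 0)
    return F


def G(x):
    # Isotone and left-continuous; the shift by 7 is an arbitrary constant c.
    return math.ceil(x) + 7


# Recover a generating function from the interval values of mu_G.
F = generating_function(lambda a, b: mu_F(G, a, b))
```

The recovered `F` differs from `G` by the constant 7 at every point, in line with the statement that two measure-generating functions define the same measure exactly when they differ by a constant.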

Of special importance are the finite measures on $\mathcal{B}^1$. Every one is a Borel measure on $\mathbb{R}$. Because $0 \le \mu(B) \le \mu(\mathbb{R}) < +\infty$ for all $B \in \mathcal{B}^1$, a finite Borel measure $\mu$ on $\mathbb{R}$ is either the zero measure $\mu = 0$, or $0 < \mu(\mathbb{R}) < +\infty$ and $\nu := \frac{1}{\mu(\mathbb{R})}\,\mu$ is a measure on $\mathcal{B}^1$ with $\nu(\mathbb{R}) = 1$. Measures normalized this way play a fundamental role in probability theory. This explains the following vocabulary: A measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ is called a probability measure (abbreviated to p-measure) if $\mu(\Omega) = 1$. Because of the isotoneity property every p-measure satisfies

$$0 \le \mu(A) \le 1 = \mu(\Omega) \quad \text{for all } A \in \mathcal{A}. \tag{6.10}$$

Consider now a p-measure $\mu$ on $\mathcal{B}^1$. The open interval $]-\infty, x[$ lies in $\mathcal{B}^1$ for each $x \in \mathbb{R}$, so a real function $F_\mu$ with values in $[0, 1]$ is defined by

$$F_\mu(x) := \mu(]-\infty, x[) \qquad (x \in \mathbb{R}). \tag{6.11}$$

It is called the distribution function of $\mu$. For example, the distribution function of the Dirac measure $\varepsilon_0$ equals 0 throughout $]-\infty, 0]$ and 1 throughout $]0, +\infty[$. Since $]-\infty, b[ \; \setminus \; ]-\infty, a[ \; = [a, b[$ whenever $a \le b$,

$$\mu([a, b[) = F_\mu(b) - F_\mu(a) \quad \text{for all } [a, b[ \; \in \mathcal{I}^1.$$


Therefore (6.11) uniquely defines a measure-generating function, which obviously satisfies

$$\mu_{F_\mu} = \mu \tag{6.12}$$

in the notation introduced in Theorem 6.5. Among the infinitely many measure-generating functions $F$ that satisfy $\mu_F = \mu$ for a given p-measure $\mu$ the distribution function $F_\mu$ is characterized as follows:

6.6 Theorem. A real function $F$ on $\mathbb{R}$ is the distribution function of a (necessarily uniquely determined) p-measure $\mu$ on $\mathcal{B}^1$ if and only if it is measure-generating (that is, isotone and left-continuous) and satisfies

$$\lim_{x \to -\infty} F(x) = 0 \quad \text{and} \quad \lim_{x \to +\infty} F(x) = 1. \tag{6.13}$$

Proof. The distribution function $F_\mu$ of a p-measure $\mu$ on $\mathcal{B}^1$ is always measure-generating, as (6.12) shows. Properties (6.13) follow from the continuity at $\emptyset$ and the continuity from below of every finite measure, respectively, since for sequences $(x_n)$ in $\mathbb{R}$ with $x_n \downarrow -\infty$, resp., $x_n \uparrow +\infty$ we have $]-\infty, x_n[ \; \downarrow \emptyset$, resp., $]-\infty, x_n[ \; \uparrow \mathbb{R}$.

If conversely $F$ is a measure-generating function satisfying (6.13), then according to 6.5 $\mu_F$ is the only Borel measure on $\mathbb{R}$ with property (6.9), in particular, with $\mu_F([-n, n[) = F(n) - F(-n)$ for all $n \in \mathbb{N}$. When $n \to +\infty$ here, the normalization condition $\mu_F(\mathbb{R}) = 1$ follows from (6.13). Thus $\mu_F$ is a probability measure. $F$ is then the distribution function of $\mu_F$, because for $x \in \mathbb{R}$ and all $n \in \mathbb{N} \cap [-x, +\infty[$

$$\mu_F([-n, x[) = F(x) - F(-n) \quad \text{and} \quad [-n, x[ \; \uparrow \; ]-\infty, x[$$

so that

$$F(x) = \lim_{n \to +\infty} \mu_F([-n, x[) + \lim_{n \to +\infty} F(-n) = \mu_F(]-\infty, x[) = F_{\mu_F}(x). \qquad \square$$

Via $\mu \mapsto F_\mu$ the set of p-measures on $\mathcal{B}^1$ is thus bijectively mapped onto the set of measure-generating functions $F$ on $\mathbb{R}$ having property (6.13). This is the significance of the preceding theorem.
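For a p-measure made of finitely many point masses, the distribution function (6.11) can be written down directly. The following Python sketch is an illustration; the dictionary `masses` and the function names are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical p-measure on the line: point masses at 0, 1 and 3 summing to 1.
masses = {0: Fraction(1, 2), 1: Fraction(1, 3), 3: Fraction(1, 6)}


def distribution(x):
    """F_mu(x) = mu(]-inf, x[); the strict inequality makes F left-continuous."""
    return sum(w for point, w in masses.items() if point < x)


def mu_interval(a, b):
    """mu([a, b[) computed directly from the point masses."""
    return sum(w for point, w in masses.items() if a <= point < b)


at_zero = distribution(0)                       # mass at 0 is not yet counted
just_right_of_zero = distribution(Fraction(1, 1000))
```

Because the mass at a point $x$ is excluded from $F_\mu(x)$, the function jumps just to the right of each atom, which is the left-continuity required in Theorem 6.6.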
Remarks. 1. Measure-generating functions are also called "Stieltjes measure functions". This is because, even before the invention of the measure concept, T.J. STIELTJES (1856-1894) had used such functions to extend the ideas behind the Riemann integral (cf. Remark 2 in §12).

2. Measure-generating functions (and distribution functions) also make sense in $\mathbb{R}^d$. But they are difficult to deal with and that is not the least reason why they are of less significance. A function $F : \mathbb{R}^d \to \mathbb{R}$ is called measure-generating if in each of its $d$ variables $\xi_1, \ldots, \xi_d$, when the others are held fixed, it is left-continuous and satisfies the additional condition

$$\Delta_{\alpha_d}^{\beta_d} \cdots \Delta_{\alpha_1}^{\beta_1} F \ge 0 \quad \text{for all } a, b \in \mathbb{R}^d \text{ with } a \le b.$$

Here $\alpha_k$, $\beta_k$ ($k = 1, \ldots, d$) are the coordinates of $a$, $b$, resp., and $\Delta_{\alpha_1}^{\beta_1} F$ is the function defined on $\mathbb{R}^{d-1}$ via $(\xi_2, \ldots, \xi_d) \mapsto F_2(\xi_2, \ldots, \xi_d) := F(\beta_1, \xi_2, \ldots, \xi_d) - F(\alpha_1, \xi_2, \ldots, \xi_d)$. Then $\Delta_{\alpha_2}^{\beta_2} F_2 = \Delta_{\alpha_2}^{\beta_2} \Delta_{\alpha_1}^{\beta_1} F$ is defined and the further "difference operators" $\Delta_{\alpha_k}^{\beta_k}$ are inductively brought into play. There is a theorem analogous to 6.5: To every measure-generating function $F$ on $\mathbb{R}^d$ corresponds a unique Borel measure $\mu_F$ on $\mathbb{R}^d$ which satisfies the iterated difference condition

$$\mu_F([a, b[) = \Delta_{\alpha_d}^{\beta_d} \cdots \Delta_{\alpha_1}^{\beta_1} F \quad \text{for all } [a, b[ \; \in \mathcal{I}^d. \tag{6.14}$$

For $d = 1$ this reduces simply to (6.9'). As an example, for the function $F_0(\xi_1, \ldots, \xi_d) := \xi_1 \cdots \xi_d$

$$\Delta_{\alpha_d}^{\beta_d} \cdots \Delta_{\alpha_1}^{\beta_1} F_0 = (\beta_1 - \alpha_1) \cdots (\beta_d - \alpha_d) \quad \text{for } a, b \in \mathbb{R}^d \text{ with } a \le b.$$

This function is consequently measure-generating, and generates the L-B measure $\lambda^d$ in the sense that $\mu_{F_0} = \lambda^d$. Details can be found in RICHTER [1966], TUCKER [1967] and GNEDENKO [1988].
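Unwinding the difference operators one variable at a time shows that the iterated difference is a signed sum of the values of $F$ at the $2^d$ corners of $[a, b[$, each corner carrying the sign $(-1)^k$ with $k$ the number of coordinates taken from $a$. The Python sketch below evaluates it that way; the function names are assumptions made for the illustration, and `F0` is the coordinate product.

```python
from itertools import product
from math import prod


def iterated_difference(F, a, b):
    """Iterated difference of F over [a, b[ as a signed sum over the 2^d corners."""
    d = len(a)
    total = 0
    for choice in product((0, 1), repeat=d):
        # choice[k] == 1 picks beta_k, choice[k] == 0 picks alpha_k.
        corner = [b[k] if choice[k] else a[k] for k in range(d)]
        sign = (-1) ** (d - sum(choice))
        total += sign * F(*corner)
    return total


def F0(*xs):
    """Product of the coordinates."""
    return prod(xs)


box_content = iterated_difference(F0, (1, 2, 3), (4, 6, 5))
```

For `F0` the result is the product of the side lengths of the box, that is, its elementary content; for $d = 1$ the routine reduces to $F(b) - F(a)$.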

Exercises.
1. Prove that a Borel set $B \in \mathcal{B}^d$ is an L-B-null set if and only if one of the two following conditions (which are hence equivalent) is satisfied: (a) For every $\varepsilon > 0$ there is a covering of $B$ by countably many open intervals $I_n \subset \mathbb{R}^d$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < \varepsilon$. (b) There is a covering of $B$ by countably many open intervals $I_n$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < +\infty$ and every point of $B$ lies in $I_n$ for infinitely many $n$. Both characterizations remain valid if the $I_n$ are allowed to be half-open or compact, instead of open. [Hint for (a): Utilize (5.1).]

2. Write $\mathbb{R}^d$ in the form $\mathbb{R}^d = \mathbb{R}^p \times \mathbb{R}^q$ with $p, q \in \mathbb{N}$, $p + q = d$, by grouping the first $p$ coordinates of a point $x \in \mathbb{R}^d$ into a point in $\mathbb{R}^p$ and the last $q$ coordinates into a point in $\mathbb{R}^q$. Denoting by $0$ the zero of the vector space $\mathbb{R}^q$, show that for a set $A \subset \mathbb{R}^p$, $A \times \{0\} \in \mathcal{B}^d$ precisely when $A \in \mathcal{B}^p$.

3. Let $\mu$ be a p-measure on $\mathcal{B}^1$ and $F$ its distribution function. Show that $F$ is continuous at the point $x \in \mathbb{R}$ just if $\mu(\{x\}) = 0$.

4. Determine the p-measure on $\mathcal{B}^1$ which has $x \mapsto 0 \vee (x \wedge 1)$ as distribution function, and answer anew the question in Exercise 1 of §4.

5. Show that every $\sigma$-finite measure $\mu$ on $\mathcal{B}^d$ can be represented in the form $\mu = \sum_{n=1}^{\infty} \alpha_n \mu_n$, where for each $n \in \mathbb{N}$, $\alpha_n \in \mathbb{R}_+$ and $\mu_n$ is a p-measure on $\mathcal{B}^d$. The supplemental condition that for every bounded set $B \in \mathcal{B}^d$, $\mu_n(B) \ne 0$ for only finitely many $n \in \mathbb{N}$ can be imposed if and only if $\mu$ is a Borel measure.


7. Measurable mappings and image measures


The following considerations can be more simply formulated if we introduce some shorthand terminology. If $\Omega$ is a set and $\mathcal{A}$ a $\sigma$-algebra in $\Omega$, the pair $(\Omega, \mathcal{A})$ will be called a measurable space and the sets in $\mathcal{A}$ measurable sets. If in addition a measure $\mu$ is defined on the $\sigma$-algebra $\mathcal{A}$, then the triple $(\Omega, \mathcal{A}, \mu)$ arising from the measurable space $(\Omega, \mathcal{A})$ is called a measure space (cf. Exercise 7 of §5). If $\mu$ is a p-measure, the measure space $(\Omega, \mathcal{A}, \mu)$ is called a probability space (p-space for short). Correspondingly, one speaks of a $\sigma$-finite measure space $(\Omega, \mathcal{A}, \mu)$ if the measure $\mu$ is $\sigma$-finite.

The measurable space $(\mathbb{R}^d, \mathcal{B}^d)$ will henceforth be called the $d$-dimensional Borel measurable space. The measure space $(\mathbb{R}^d, \mathcal{B}^d, \lambda^d)$ will correspondingly be called the $d$-dimensional Lebesgue-Borel measure space (abbreviated to L-B measure space).

The concept measurable space exhibits a formal analogy to that of topological space. For a topological space is also a pair, consisting of a set and a system of its subsets, namely, the open ones. In the sense of this analogy the next concept, that of a measurable mapping, corresponds to the concept of continuity in topology.

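7.1 Definition. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces, and $T : \Omega \to \Omega'$ a mapping of $\Omega$ into $\Omega'$. $T$ is called $\mathcal{A}$-$\mathcal{A}'$-measurable if

$$T^{-1}(A') \in \mathcal{A} \quad \text{for every } A' \in \mathcal{A}'. \tag{7.1}$$

We express the $\mathcal{A}$-$\mathcal{A}'$-measurability of $T$ symbolically by

$$T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$$

and speak of a measurable mapping of the first measurable space into the second. Using the notation introduced in (1.5), (7.1) can be written as

$$T^{-1}(\mathcal{A}') \subset \mathcal{A}. \tag{7.1'}$$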

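Examples. 1. Every constant mapping $T : \Omega \to \Omega'$ is $\mathcal{A}$-$\mathcal{A}'$-measurable.

2. Every continuous mapping $T : \mathbb{R}^d \to \mathbb{R}^{d'}$ ($d, d' \in \mathbb{N}$) is $\mathcal{B}^d$-$\mathcal{B}^{d'}$-measurable, briefly put, Borel measurable. According to 6.4 the system $\mathcal{O}^{d'}$ of all open subsets of $\mathbb{R}^{d'}$ is a generator of $\mathcal{B}^{d'}$. Because of the continuity of $T$, $T^{-1}(O') \in \mathcal{O}^d \subset \mathcal{B}^d$ for every $O' \in \mathcal{O}^{d'}$. The asserted measurability of $T$ therefore follows from the next theorem.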

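7.2 Theorem. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces; further, let $\mathcal{E}'$ be a generator of $\mathcal{A}'$. A mapping $T : \Omega \to \Omega'$ is measurable just if

$$T^{-1}(E') \in \mathcal{A} \quad \text{for every } E' \in \mathcal{E}'. \tag{7.2}$$

Proof. The system $\mathcal{Q}'$ of all sets $Q' \in \mathcal{P}(\Omega')$ for which $T^{-1}(Q') \in \mathcal{A}$ is a $\sigma$-algebra in $\Omega'$. Consequently, $\mathcal{A}' \subset \mathcal{Q}'$ holds just if $\mathcal{E}' \subset \mathcal{Q}'$ does. $\mathcal{A}' \subset \mathcal{Q}'$ is equivalent to the measurability of $T$, while $\mathcal{E}' \subset \mathcal{Q}'$ is equivalent to (7.2). $\square$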
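On finite sets, where every algebra of sets is already a $\sigma$-algebra, the content of Theorem 7.2 can be checked by brute force. The Python sketch below is an illustration under stated assumptions: all names, the sample spaces and the two mappings are invented for the example, and the closure loop is a naive way of generating a $\sigma$-algebra that is only practical for very small sets.

```python
from itertools import combinations


def generated_sigma_algebra(omega, generator):
    """Smallest family containing the generator, closed under complement and union."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generator}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for s in current:
            if omega - s not in family:
                family.add(omega - s)
                changed = True
        for s, t in combinations(current, 2):
            if s | t not in family:
                family.add(s | t)
                changed = True
    return family


def preimage(T, domain, target_set):
    return frozenset(x for x in domain if T(x) in target_set)


def is_measurable(T, domain, sigma_domain, sets_to_test):
    """Check that the pre-image of every listed set lies in the domain sigma-algebra."""
    return all(preimage(T, domain, s) in sigma_domain for s in sets_to_test)


domain = range(6)
sigma_domain = generated_sigma_algebra(domain, [{0, 1}, {2, 3}])
target_generator = [frozenset("x"), frozenset("y")]
sigma_target = generated_sigma_algebra("xyz", target_generator)

T_good = lambda w: "xyz"[w // 2]           # constant on the atoms {0,1}, {2,3}, {4,5}
T_bad = lambda w: "x" if w == 0 else "y"   # splits the atom {0, 1}
```

Testing pre-images of the two generating sets gives the same verdict as testing all eight sets of the target $\sigma$-algebra, for the measurable mapping as well as for the non-measurable one.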
Concerning the composition of measurable mappings, what the earlier analogy
with topology suggests, prevails:

7.3 Theorem. If $T_1 : (\Omega_1, \mathcal{A}_1) \to (\Omega_2, \mathcal{A}_2)$ and $T_2 : (\Omega_2, \mathcal{A}_2) \to (\Omega_3, \mathcal{A}_3)$ are measurable mappings, then the composite mapping $T_2 \circ T_1$ is $\mathcal{A}_1$-$\mathcal{A}_3$-measurable.

Proof. The claim follows from the validity of the equation $(T_2 \circ T_1)^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$ for all $A \in \mathcal{P}(\Omega_3)$, in particular, from its validity for all $A \in \mathcal{A}_3$. $\square$
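Next consider a family of measurable spaces $((\Omega_i, \mathcal{A}_i))_{i \in I}$ and a family $(T_i)_{i \in I}$ of mappings $T_i : \Omega \to \Omega_i$ of some fixed set $\Omega$ into the individual sets $\Omega_i$. Obviously the $\sigma$-algebra in $\Omega$ generated by $\bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)$ is the smallest $\sigma$-algebra $\mathcal{A}$ with respect to which every $T_i$ is $\mathcal{A}$-$\mathcal{A}_i$-measurable. We designate this $\sigma$-algebra $\sigma(T_i : i \in I)$, that is, we define

$$\sigma(T_i : i \in I) := \sigma\Bigl(\bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)\Bigr) \tag{7.3}$$

and call it the $\sigma$-algebra generated by the mappings $T_i$ (and the measurable spaces $(\Omega_i, \mathcal{A}_i)$). In the case of the finite index set $I = \{1, \ldots, n\}$, we also use the notation $\sigma(T_1, \ldots, T_n)$.

For $n = 1$ we clearly have $\sigma(T_1) = T_1^{-1}(\mathcal{A}_1)$. If therefore a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ is given, then a mapping $T_1 : \Omega \to \Omega_1$ being $\mathcal{A}$-$\mathcal{A}_1$-measurable is equivalent to

$$\sigma(T_1) \subset \mathcal{A}. \tag{7.4}$$

Cf. (7.1').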

As a further application of 7.2 we will demonstrate:

7.4 Theorem. Let $(T_i)_{i \in I}$ be a family of mappings $T_i : \Omega \to \Omega_i$ of a set $\Omega$ into measurable spaces $(\Omega_i, \mathcal{A}_i)$. Further, let $S : \Omega_0 \to \Omega$ be a mapping of a measurable space $(\Omega_0, \mathcal{A}_0)$ into $\Omega$. The mapping $S$ is then $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable if and only if each mapping $T_i \circ S$ ($i \in I$) is $\mathcal{A}_0$-$\mathcal{A}_i$-measurable.

Proof. According to Theorem 7.3 the condition is necessary. The following considerations show that it is also sufficient. By (7.3) the system

$$\mathcal{E} := \bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)$$

is a generator of $\sigma(T_i : i \in I)$. Each set $E \in \mathcal{E}$ has the form $E = T_i^{-1}(A_i)$ for some $i \in I$, $A_i \in \mathcal{A}_i$. Thus $S^{-1}(E) = (T_i \circ S)^{-1}(A_i) \in \mathcal{A}_0$ because of the hypothesized measurability of $T_i \circ S$. From 7.2 therefore, $S$ is $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable. $\square$


Finally, with the aid of measurable mappings, measures can be mapped:

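7.5 Theorem. Let $T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$ be a measurable mapping. Then for every measure $\mu$ on $\mathcal{A}$,

$$A' \mapsto \mu(T^{-1}(A')) \tag{7.5}$$

defines a measure $\mu'$ on $\mathcal{A}'$.

Proof. We only have to observe that for every sequence $(A'_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathcal{A}'$, $(T^{-1}(A'_n))_{n \in \mathbb{N}}$ is a sequence of pairwise disjoint sets from $\mathcal{A}$, and that

$$T^{-1}\Bigl(\bigcup_{n \in \mathbb{N}} A'_n\Bigr) = \bigcup_{n \in \mathbb{N}} T^{-1}(A'_n). \qquad \square$$

7.6 Definition. In the situation described in 7.5, the measure $\mu'$ is called the image of $\mu$ under the mapping $T$ and is denoted by $T(\mu)$.

Thus according to this definition

$$T(\mu)(A') := \mu(T^{-1}(A')) \quad \text{for all } A' \in \mathcal{A}'. \tag{7.5'}$$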

The formation of image measures is transitive, that is,

$$(T_2 \circ T_1)(\mu) = T_2(T_1(\mu)), \tag{7.6}$$

whenever we are in the situation of 7.3 and $\mu$ is a measure on $\mathcal{A}_1$: For every $A \in \mathcal{A}_3$, $T := T_2 \circ T_1$ satisfies $T^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$, and $T_2^{-1}(A) \in \mathcal{A}_2$. Therefore, setting $\mu' := T_1(\mu)$, $\mu'' := T_2(\mu')$ for short, it follows that

$$T(\mu)(A) = \mu(T_1^{-1}(T_2^{-1}(A))) = \mu'(T_2^{-1}(A)) = \mu''(A),$$

for all $A \in \mathcal{A}_3$, showing that $T(\mu) = \mu''$ and confirming (7.6).
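For measures carried by finitely many points, the image measure (7.5') amounts to adding up the masses of all points with the same image. The Python sketch below illustrates this together with the transitivity (7.6); the function `image_measure`, the uniform measure on $\{1, \ldots, 6\}$ and the two mappings are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def image_measure(T, mu):
    """Push point masses forward: T(mu)({y}) = mu(T^{-1}({y}))."""
    pushed = defaultdict(Fraction)   # missing keys start at Fraction(0)
    for point, weight in mu.items():
        pushed[T(point)] += weight
    return dict(pushed)


# Hypothetical p-measure: uniform distribution on {1, ..., 6}.
mu = {k: Fraction(1, 6) for k in range(1, 7)}

T1 = lambda x: x % 3          # maps into {0, 1, 2}
T2 = lambda y: y == 0         # maps into {False, True}

step_by_step = image_measure(T2, image_measure(T1, mu))
in_one_go = image_measure(lambda x: T2(T1(x)), mu)
```

Both routes assign mass $2/3$ to `False` and $1/3$ to `True`, and the total mass is preserved, so the image of a p-measure is again a p-measure.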

Examples. 3. Let $(\Omega, \mathcal{A}) = (\Omega', \mathcal{A}') := (\mathbb{R}^d, \mathcal{B}^d)$ be the $d$-dimensional Borel measurable space and $\mu := \lambda^d$ the associated L-B measure. For every point $a \in \mathbb{R}^d$, the translation mapping $T_a : \mathbb{R}^d \to \mathbb{R}^d$ is defined by

$$T_a(x) := a + x, \qquad x \in \mathbb{R}^d.$$

It is continuous and so (Example 2) measurable. We inquire into the image measure

$$\lambda' := T_a(\lambda^d).$$

The mapping $T_a$ is bijective, and $T_a^{-1} = T_{-a}$. So for every interval $[b, c[ \; \in \mathcal{I}^d$, $T_a^{-1}([b, c[) = [b - a, c - a[$, whence $\lambda'([b, c[) = \lambda^d([b - a, c - a[) = \lambda^d([b, c[)$. Both measures $\lambda^d$ and $\lambda'$ thus assign to every interval from $\mathcal{I}^d$ its $d$-dimensional elementary content. According to 6.2 therefore $\lambda^d = \lambda'$, that is,

$$T_a(\lambda^d) = \lambda^d \quad \text{for every } a \in \mathbb{R}^d. \tag{7.7}$$

This property of $\lambda^d$ is called its translation-invariance. If we set, as is customary,

$$a + A = A + a := T_a(A) = \{a + x : x \in A\} \tag{7.8}$$

for sets $A \in \mathcal{P}(\mathbb{R}^d)$ and points $a \in \mathbb{R}^d$, then $T_a(\lambda^d)(A) = \lambda^d(-a + A)$ for arbitrary $A \in \mathcal{B}^d$. Property (7.7) can therefore also be expressed as

$$\lambda^d(a + A) = \lambda^d(A) \quad \text{for all } A \in \mathcal{B}^d,\ a \in \mathbb{R}^d. \tag{7.7'}$$

4. In the context of Example 3, each non-zero real number $\alpha$ and each $i \in \{1, \ldots, d\}$ determine a continuous, hence Borel measurable, linear mapping $D_\alpha^{(i)}$ which assigns to the point $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ the image point $x' \in \mathbb{R}^d$ having coordinates $x'_i := \alpha x_i$, and $x'_j = x_j$ for all $j \ne i$, a dilation of $x$. It satisfies

$$D_\alpha^{(i)}(\lambda^d) = |\alpha|^{-1} \lambda^d. \tag{7.9}$$

For, every open interval $]a, b[ \; \subset \mathbb{R}^d$ has $D_\alpha^{(i)}$-pre-image equal to $]a', b'[$, where the coordinates of $a'$, $b'$ except the $i$-th are those of $a$, $b$, the $i$-th being $\alpha^{-1}$ times those of $a$, $b$ if $\alpha > 0$, and $\alpha^{-1}$ times those of $b$, $a$ if $\alpha < 0$. Hence

$$\lambda^d\bigl((D_\alpha^{(i)})^{-1}(]a, b[)\bigr) = |\alpha|^{-1} \lambda^d(]a, b[).$$

$D_\alpha^{(i)}(\lambda^d)$ and $|\alpha|^{-1} \lambda^d$ are therefore measures on $\mathcal{B}^d$ which coincide on all bounded open intervals. Thanks to 6.4 such intervals constitute a generator of $\mathcal{B}^d$, which obviously has with respect to each of these measures all the properties of the generator $\mathcal{E}$ in the uniqueness Theorem 5.4. From that theorem (7.9) therefore follows.
5. If we set $H_r := D_r^{(1)} \circ \ldots \circ D_r^{(d)}$ for real $r \ne 0$, we obtain the linear mapping $H_r(x) = rx$ ($x \in \mathbb{R}^d$), called a homothety. Because of the transitivity of image measures, it follows from (7.9) that
(7.10)    $H_r(\lambda^d) = |r|^{-d} \lambda^d$.
For $r = -1$ we get $H_{-1}(\lambda^d) = \lambda^d$. Because $H_{-1}$ is reflection through the origin, this property is called the reflection-invariance of $\lambda^d$.
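
The scaling factor in (7.10) can be checked on a single box by computing the pre-image explicitly, as in the argument for (7.9). The sketch below is illustrative only: the helper names `volume` and `preimage_under_homothety` and the sample box are assumptions, and a check on one box does not replace the appeal to the uniqueness theorem.

```python
from fractions import Fraction

def volume(a, b):
    """Elementary content of the box with lower corner a and upper corner b."""
    v = Fraction(1)
    for lo, hi in zip(a, b):
        v *= max(Fraction(0), Fraction(hi) - Fraction(lo))
    return v

def preimage_under_homothety(r, a, b):
    """Corners of the pre-image of the box ]a, b[ under H_r(x) = r x, r != 0."""
    lo, hi = [], []
    for ai, bi in zip(a, b):
        p, q = Fraction(ai) / r, Fraction(bi) / r
        lo.append(min(p, q))   # for r < 0 the two endpoints swap roles
        hi.append(max(p, q))
    return lo, hi

a, b = (0, -1, 2), (3, 1, 5)       # hypothetical box in R^3
r = Fraction(-2)
lo, hi = preimage_under_homothety(r, a, b)
ratio = volume(lo, hi) / volume(a, b)   # compare with |r|^{-d}
```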
Exercises.
1. For $\Omega := \mathbb{R}$, let $(\Omega, \mathscr{A}, \mu)$ be the measure space of Example 2, §3. For $\Omega' := \{0, 1\}$, $\mathscr{A}' := \mathscr{P}(\Omega')$ define the mapping $T : \Omega \to \Omega'$ by $T(\omega) := 0$ if $\omega$ is rational, $T(\omega) := 1$ if $\omega$ is irrational. Show that $T$ is $\mathscr{A}$-$\mathscr{A}'$-measurable and determine the image measure $T(\mu)$.
2. Show that for any sets $\Omega$, $\Omega'$, any mapping $T : \Omega \to \Omega'$, and any system of sets $\mathscr{E}' \subset \mathscr{P}(\Omega')$, $T^{-1}(\sigma(\mathscr{E}')) = \sigma(T^{-1}(\mathscr{E}'))$.
3. Let $K$ be a compact subset of $\mathbb{R}^d$ with the property that the intersection $H_r(K) \cap H_{r'}(K)$ of every two homothetic images of $K$ with $0 < r < r' < 1$ is an L-B-null set. (This property is enjoyed by every sphere $S_\alpha(0)$ of radius $\alpha > 0$ and center $0 := (0, \ldots, 0)$, that is, the set of $x \in \mathbb{R}^d$ having euclidean distance $\alpha$ from 0.) Show that $\lambda^d(K) = 0$. [Hint: For all $r \in\, ]0, 1]$, $H_r(K) \subset \tilde{K} := \{tx : 0 \le t \le 1, x \in K\}$, which is a compact set. Hence $\lambda^d(\tilde{K}) < +\infty$.]
4. Let $\mathbb{T} := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ denote the unit circle, that is, the sphere $S_1(0)$ in $\mathbb{R}^2$. Prove the existence of a finite non-zero measure $\nu$ on the $\sigma$-algebra $\mathscr{B}(\mathbb{T}) := \mathbb{T} \cap \mathscr{B}^2$ which is invariant under all rotations of $\mathbb{T}$. [Hint: Take for $\nu$ an image of $\lambda^1_C$ for an appropriate interval $C \subset \mathbb{R}$.]

8. Mapping properties of the Lebesgue-Borel measure


L-B measure $\lambda^d$ on $\mathscr{B}^d$ is, as was shown in Example 3 of the preceding section, translation-invariant. Of the greatest significance is the fact that $\lambda^d$ is uniquely determined by this invariance property, together with a simple normalization. For the d-dimensional unit cube, defined by
(8.1)    $W := [0, 1[$,
where $0 = (0, \ldots, 0) \in \mathbb{R}^d$ and $1 := (1, \ldots, 1) \in \mathbb{R}^d$, 6.2 ensures that
(8.2)    $\lambda^d(W) = 1$.
Along with $\lambda^d$ each non-negative multiple $\alpha\lambda^d$ ($\alpha \in \mathbb{R}_+$) of it is a translation-invariant measure $\mu$ on $\mathscr{B}^d$, which satisfies $\mu(W) = \alpha < +\infty$. The following converse of this also holds, and contains the aforementioned characterization of $\lambda^d$ as a special case.

8.1 Theorem. Every measure $\mu$ on $\mathscr{B}^d$ which is translation-invariant, i.e., satisfies $T_a(\mu) = \mu$ for every translation $x \mapsto T_a(x) := a + x$ of $\mathbb{R}^d$, and which assigns finite measure
(8.3)    $\alpha := \mu(W) < +\infty$
to the unit cube $W$, has the form
(8.4)    $\mu = \alpha\lambda^d$.

Proof. Let $a_n := n^{-1} 1$, the point in $\mathbb{R}^d$ all of whose coordinates are $1/n$. Then $W_n := [0, a_n[$ is a cube with
$\mu(W_n) = \alpha / n^d$.
In fact: The interval $[0, 1[ \in \mathscr{I}^1$ is the union of the pairwise disjoint intervals $[\nu/n, (\nu + 1)/n[$ with $\nu = 0, 1, \ldots, n - 1$. If therefore $G_n$ denotes the set of points $(\rho_1, \ldots, \rho_d) \in \mathbb{R}^d$ whose coordinates all come from the set $\{\nu/n : \nu = 0, \ldots, n - 1\}$, then
$W = \bigcup_{r \in G_n} [r, r + a_n[$,
a union of $n^d$ pairwise disjoint intervals. Because $[r, r + a_n[ = T_r([0, a_n[) = T_r(W_n)$ and because of the translation-invariance of $\mu$, it follows from this representation of $W$ that $\alpha = n^d \mu(W_n)$.
A repetition of these considerations will show that
$\mu([a, b[) = \alpha\lambda^d([a, b[)$
holds for every interval $[a, b[ \in \mathscr{I}^d$ in which the points $a, b$ have only rational coordinates. Obviously in proving this we can assume that $a \lhd b$ (every coordinate of $a$ is strictly smaller than the corresponding coordinate of $b$), and due to the translation-invariance of both measures we can further assume that $a = 0$. Then $b = (m_1/n, \ldots, m_d/n)$ for appropriate $m_1, \ldots, m_d, n \in \mathbb{N}$, and therefore $[0, b[$ is the union of the $m_1 \cdots m_d$ pairwise disjoint intervals $[r, r + a_n[$ with $r = (\rho_1/n, \ldots, \rho_d/n)$ and $\rho_i \in \{0, \ldots, m_i - 1\}$ for each $i$. As before, this yields $m_1 \cdots m_d \, \mu(W_n) = \mu([0, b[)$, hence
$\mu([0, b[) = \alpha \, \frac{m_1}{n} \cdots \frac{m_d}{n} = \alpha\lambda^d([0, b[)$.
Now the set $\mathscr{I}^d_{\mathrm{rat}}$ of all intervals $[a, b[ \in \mathscr{I}^d$ for which $a, b$ have only rational coordinates is an $\cap$-stable system. The technique used in the proof of Theorem 6.4 shows that $\mathscr{I}^d_{\mathrm{rat}}$ is, just like $\mathscr{I}^d$, a generator of $\mathscr{B}^d$. Because the measures $\mu$ and $\alpha\lambda^d$ coincide on $\mathscr{I}^d_{\mathrm{rat}}$ and for $n := (n, \ldots, n)$ with $n \in \mathbb{N}$ the intervals $[-n, n[$ lie in $\mathscr{I}^d_{\mathrm{rat}}$ and increase to $\mathbb{R}^d$, our claim (8.4) follows from the uniqueness theorem 5.4.
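
The counting step of this proof can be replayed mechanically. The sketch below enumerates the lower corners of the small cubes that tile a box with rational upper corner and compares the resulting mass with the corresponding multiple of the elementary content. The function names `subcubes` and `mu_of_box` and the sample values of `m`, `n` and `alpha` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def subcubes(m, n):
    """Lower corners r of the cubes [r, r + a_n[ that tile [0, b[,
    where b = (m_1/n, ..., m_d/n) and a_n = (1/n, ..., 1/n)."""
    return [tuple(Fraction(p, n) for p in rho)
            for rho in product(*(range(mi) for mi in m))]

def mu_of_box(m, n, alpha):
    """mu([0, b[) for a translation-invariant mu with mu(W) = alpha:
    each of the small cubes has mass alpha / n^d."""
    d = len(m)
    return len(subcubes(m, n)) * Fraction(alpha) / n**d

m, n, alpha = (3, 1, 4), 6, Fraction(5, 2)   # hypothetical values, d = 3
lebesgue = Fraction(1)
for mi in m:
    lebesgue *= Fraction(mi, n)              # elementary content of [0, b[
```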
For $\alpha = 1$ we immediately get from 8.1:
8.2 Corollary. Lebesgue-Borel measure $\lambda^d$ is the only translation-invariant measure $\mu$ on $\mathscr{B}^d$ which satisfies
(8.2')    $\mu(W) = 1$.

This corollary says that $\lambda^d$ is, in the theory of locally compact groups, a Haar measure on the additive group $\mathbb{R}^d$. That theory provides an analogous non-zero invariant measure on every locally compact abelian group $G$; it is unique to within a positive scalar factor and is called Haar measure on $G$. The reader interested in its theory should consult NACHBIN [1965]. (Cf. also Exercise 4 of §7 and Exercise 8 of §17.)

The conclusion of the theorem and its corollary remain valid if in the normalization (8.2) and (8.2') the unit cube $W$ is replaced by its open interior $]0, 1[$ or its compact closure $[0, 1]$. This is immediate from (6.7). However, if $\mu(W) = +\infty$ is allowed, $\mu$ need not be a multiple of $\lambda^d$. See HENLE and WAGON [1983].

Example. 1. Besides the $D_\alpha^{(i)}$ of Example 4, §7 there is another basic class of linear mappings in $\mathbb{R}^d$, those that skew one coordinate by means of another. Specifically, for each $i, k \in \{1, \ldots, d\}$ with $i \ne k$ we define
$S^{(i,k)}(x_1, \ldots, x_d) := (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d)$.
Evidently this mapping is continuous. It is also invertible, with inverse $D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)}$, as is easily seen. In view of Example 2 in §7, the image measure $S^{(i,k)}(\lambda^d)$ may therefore be formed. We want to show that $\lambda^d$ is also invariant under these mappings, that is, that
(8.5)    $S^{(i,k)}(\lambda^d) = \lambda^d$    for all $i, k \in \{1, \ldots, d\}$ with $i \ne k$.
Fix such a pair $(i, k)$ and write simply $S$ for $S^{(i,k)}$. Since this is a linear mapping, $S(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$, so (8.5) will follow from 8.2 if we succeed in showing that $S(\lambda^d)(W) = 1$, that is, $\lambda^d(S^{-1}(W)) = 1$. In view of (7.9) and the equality $S^{-1} = D_{-1}^{(k)} \circ S \circ D_{-1}^{(k)}$, it suffices to show instead that
(8.5')    $\lambda^d(S(W)) = 1$.
Let $a$ denote the vector in $\mathbb{R}^d$ whose only non-zero coordinate is the ith one, it being $-1$. Introduce
$W' := \{(x_1, \ldots, x_d) : 0 \le x_j < 1$ for $j \ne i$, $0 \le x_i < x_k\}$ and
$W'' := \{(x_1, \ldots, x_d) : 0 \le x_j < 1$ for $j \ne i$, $1 + x_k \le x_i < 2\}$.
Notice that
$T_a(W'') = \{(x_1, \ldots, x_d) : 0 \le x_j < 1$ for $j \ne i$, $x_k \le x_i < 1\}$.
Clearly
(8.6)    $W = W' \cup T_a(W'')$ disjointly.
$W'$ is the intersection of $W$ with the open set $\{(x_1, \ldots, x_d) \in \mathbb{R}^d : x_i < x_k\}$, so $W'$ is a Borel set. Similarly $T_a(W'')$ is the intersection of $W$ with the closed set $\{(x_1, \ldots, x_d) \in \mathbb{R}^d : x_k \le x_i\}$, so it is a Borel set. Thus $W''$, its preimage under $T_a$, is also a Borel set. Since $S$ is a homeomorphism, $S(W)$ is a Borel set. Next notice that
(8.7)    $W'$, $W''$ and $S(W)$ are pairwise disjoint.
For the conditions on the ith coordinate that define each set in (8.7) are obviously incompatible with those that define the other two sets. Moreover,
(8.8)    $W' \cup W'' \cup S(W) = D_2^{(i)}(W)$.
Here the inclusion "$\subset$" is obvious from the coordinate inequalities defining the sets. A typical point $x$ of $D_2^{(i)}(W)$ has jth coordinate $x_j \in [0, 1[$ if $j \ne i$ and ith coordinate $t \in [0, 2[ = [0, x_k[ \cup [x_k, 1 + x_k[ \cup [1 + x_k, 2[$. If $t$ lies in the first (third) interval, then $x \in W'$ ($x \in W''$). Otherwise, $x_i := t - x_k \in [0, 1[$, and
$x = (x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d) = (x_1, \ldots, x_{i-1}, x_i + x_k, x_{i+1}, \ldots, x_d) \in S(W)$.
This confirms (8.8). Combining all that we have learned gives the desired (8.5') as follows:
$2 = 2\lambda^d(W) = \lambda^d(D_2^{(i)}(W))$    by (7.9)
$= \lambda^d(W') + \lambda^d(W'') + \lambda^d(S(W))$    by (8.7) and (8.8)
$= \lambda^d(W') + \lambda^d(T_a(W'')) + \lambda^d(S(W))$    by (7.7)
$= \lambda^d(W) + \lambda^d(S(W))$    by (8.6)
$= 1 + \lambda^d(S(W))$    by (8.2).
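
The set identities (8.6), (8.7) and (8.8) are purely combinatorial, so they can be spot-checked on a grid of rational points. The sketch below does this for $d = 2$, $i = 1$, $k = 2$; the membership functions, the grid resolution `n` and the variable names are assumptions made for illustration, and a grid check is of course not a proof.

```python
from fractions import Fraction
from itertools import product

# d = 2, i = 1, k = 2 (indices 0 and 1 in code); S(x1, x2) = (x1 + x2, x2).
def in_W(x):    return 0 <= x[0] < 1 and 0 <= x[1] < 1
def in_W1(x):   return 0 <= x[1] < 1 and 0 <= x[0] < x[1]        # W'
def in_W2(x):   return 0 <= x[1] < 1 and 1 + x[1] <= x[0] < 2    # W''
def in_SW(x):   return in_W((x[0] - x[1], x[1]))                 # S(W), via S^{-1}
def in_D2W(x):  return in_W((x[0] / 2, x[1]))                    # D_2^{(1)}(W)
def in_TaW2(x): return in_W2((x[0] + 1, x[1]))                   # T_a(W''), a = (-1, 0)

n = 12  # hypothetical grid resolution
grid = [(Fraction(p, n), Fraction(q, n))
        for p, q in product(range(-n, 3 * n), range(-n, 2 * n))]

# (8.7) and (8.8): a point lies in exactly one of W', W'', S(W)
# if it lies in D_2(W), and in none of them otherwise.
ok_88 = all(in_W1(x) + in_W2(x) + in_SW(x) == (1 if in_D2W(x) else 0) for x in grid)
# (8.6): W is the disjoint union of W' and T_a(W'').
ok_86 = all(in_W1(x) + in_TaW2(x) == (1 if in_W(x) else 0) for x in grid)
```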

One usually thinks of the space $\mathbb{R}^d$ as equipped with the euclidean scalar product
$\langle x, y \rangle := \sum_{i=1}^{d} \xi_i \eta_i$
and the euclidean metric derived from it by
$\rho(x, y) := \|x - y\| := \sqrt{\langle x - y, x - y \rangle}$,
where $x = (\xi_1, \ldots, \xi_d)$ and $y = (\eta_1, \ldots, \eta_d)$. Every mapping $T : \mathbb{R}^d \to \mathbb{R}^d$ which leaves this metric invariant, that is, satisfies
(8.9)    $\rho(T(x), T(y)) = \rho(x, y)$    for all $x, y \in \mathbb{R}^d$,
is called a motion (or an isometry) in $\mathbb{R}^d$. It is obviously continuous, hence Borel measurable.
Suppose in addition that $T$ fixes 0, that is,
$T(0) = 0$.
Using the linearity of $\langle \cdot, \cdot \rangle$ in each of its positions, we get
$\rho^2(T(x), T(y)) = \langle T(x) - T(y), T(x) - T(y) \rangle$
$= \langle T(x), T(x) \rangle - 2 \langle T(x), T(y) \rangle + \langle T(y), T(y) \rangle$
$= \rho^2(T(x), T(0)) - 2 \langle T(x), T(y) \rangle + \rho^2(T(y), T(0))$,
so that in view of (8.9)
$\rho^2(x, y) = \rho^2(x, 0) - 2 \langle T(x), T(y) \rangle + \rho^2(y, 0)$.
Replacing $T$ with the identity mapping here shows that (8.9) may be supplemented with
(8.9')    $\langle T(x), T(y) \rangle = \langle x, y \rangle$    for all $x, y \in \mathbb{R}^d$ if $T(0) = 0$.
Consider $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$ and again suppose that $T(0) = 0$. Using the linearity properties of $\langle \cdot, \cdot \rangle$ once more we expand
(*)    $\langle \lambda T(x) + T(y) - T(\lambda x + y), \lambda T(x) + T(y) - T(\lambda x + y) \rangle$
into a linear combination of expressions $\langle T(a), T(b) \rangle$ with $a, b \in \{x, y, \lambda x + y\}$. Equation (8.9') allows $T$ to be replaced by the identity mapping in every such expression. Upon doing so and re-assembling the terms, we get back a single expression like (*) but with the identity mapping in place of $T$. That is, we get 0. In other words,
$\lambda T(x) + T(y) - T(\lambda x + y) = 0$, that is, $T(\lambda x + y) = \lambda T(x) + T(y)$,
holding for all $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$. This says that $T$ is a linear mapping. It is immediate from (8.9) that $T$ is then injective. The dimension of $T(\mathbb{R}^d) \subset \mathbb{R}^d$ is therefore $d$, so $T(\mathbb{R}^d) = \mathbb{R}^d$, and $T$ is surjective. A motion $T$ that is also a linear mapping, and the preceding deliberations show that this is equivalent to $T(0) = 0$, is called an orthogonal transformation.

If $T$ is any motion and we set $a := T(0)$, then the mapping $U := T - a = T_{-a} \circ T$ is a motion that fixes 0. Therefore by the above, every motion $T$ is a composite $T_a \circ U$ of a translation and an orthogonal transformation, and is consequently a bijection of $\mathbb{R}^d$. From this and (8.9) it is clear that the mapping inverse to a motion is itself a motion, and that the set of motions is a group under composition, the motion group $\mathrm{Mot}(\mathbb{R}^d)$ of $\mathbb{R}^d$.
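
The decomposition of a motion into a translation and an orthogonal transformation can be illustrated numerically in the plane. In the sketch below a sample motion is built from a rotation followed by a translation; `make_motion`, the angle, the translation vector and the test points are all hypothetical choices, and equalities hold only up to floating-point rounding.

```python
import math

def make_motion(theta, a):
    """A sample motion of R^2: rotate by theta, then translate by a."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda x: (c * x[0] - s * x[1] + a[0], s * x[0] + c * x[1] + a[1])

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def dist(x, y):
    return math.hypot(x[0] - y[0], x[1] - y[1])

T = make_motion(0.7, (2.0, -1.0))        # hypothetical parameters
a = T((0.0, 0.0))                        # a := T(0)
U = lambda x: tuple(t - ai for t, ai in zip(T(x), a))   # U := T_{-a} o T

x, y, lam = (1.5, -0.25), (-2.0, 3.0), 1.75
lin_in = (lam * x[0] + y[0], lam * x[1] + y[1])             # lam x + y
lin_out = tuple(lam * u + v for u, v in zip(U(x), U(y)))    # lam U(x) + U(y)
```

Here `U` fixes the origin, so by the discussion above it should preserve the scalar product as in (8.9') and satisfy $U(\lambda x + y) = \lambda U(x) + U(y)$.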

The translation-invariance of $\lambda^d$ not only characterizes L-B measure, as was shown in 8.1 and 8.2, but renders excellent service in the derivation of further invariance properties. We begin with the motion-invariance of $\lambda^d$, that is, with the proof that
(8.10)    $T(\lambda^d) = \lambda^d$    for all $T \in \mathrm{Mot}(\mathbb{R}^d)$.
The reflection-invariance treated in Example 5 of §7 is contained in this as a special case.

8.3 Theorem. Lebesgue-Borel measure $\lambda^d$ is motion-invariant.

Proof. Let a motion $T$ of $\mathbb{R}^d$, about which we initially assume that $T(0) = 0$, be given. Thus $T$ is an orthogonal, linear transformation. Via the following considerations, we will quickly convince ourselves that $T(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$: Denoting as before by $T_c$ the translation $x \mapsto x + c$, for each $c \in \mathbb{R}^d$, we consider any $a \in \mathbb{R}^d$, set $b := T^{-1}(a)$, and observe that
(8.11)    $T_a \circ T = T \circ T_b$.
For every $x \in \mathbb{R}^d$, $T_a \circ T(x) = T(x) + a = T(x) + T(b) = T(x + b) = T \circ T_b(x)$, confirming (8.11). From this and the translation-invariance of $\lambda^d$ we get $T_a(T(\lambda^d)) = T(T_b(\lambda^d)) = T(\lambda^d)$. As $a \in \mathbb{R}^d$ is arbitrary, this says that $\mu := T(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$. For the unit cube $W = [0, 1[$ we have $\alpha := \mu(W) = \lambda^d(T^{-1}(W)) < +\infty$ by (6.2), since $T^{-1}$ is an isometry and therefore along with $W$ the set $T^{-1}(W)$ is also bounded. Now Theorem 8.1 comes into action and guarantees that $T(\lambda^d) = \mu = \alpha\lambda^d$ holds. So what remains is to see that $\alpha = 1$. To this end we look at the compact ball $K := \{x \in \mathbb{R}^d : \rho(0, x) \le 1\}$ of radius 1 and center 0. Since $T$ and $T^{-1}$ are orthogonal transformations, they fix 0 and leave distances invariant (8.9). Hence $T^{-1}(K) = K$, and from $T(\lambda^d) = \alpha\lambda^d$ follows
$\lambda^d(K) = \lambda^d(T^{-1}(K)) = T(\lambda^d)(K) = \alpha\lambda^d(K)$.
From this follows the desired $\alpha = 1$, because on the one hand $\lambda^d(K) < +\infty$ by (6.2) and on the other hand $\lambda^d(K) > 0$ because $K$ contains a non-empty interval $I \in \mathscr{I}^d$, namely $I := [-t, t[$ with $t := (d^{-1/2}, \ldots, d^{-1/2})$. [In Exercise 6 of §23 we will compute $\lambda^d(K)$ explicitly.]

To handle the case of an arbitrary motion $T$, set $c := T(0)$ and $S := T_{-c} \circ T$, getting a motion that fixes 0, for which $\lambda^d = S(\lambda^d)$ by what was first proved. It follows finally from transitivity and $T = T_c \circ S$ that $T(\lambda^d) = T_c(S(\lambda^d)) = T_c(\lambda^d) = \lambda^d$. Thus the theorem is proved.

Since with every motion $T$ of $\mathbb{R}^d$ its inverse $T^{-1}$ is also one, the motion-invariance can also be recorded in the following form: For every motion $T$ of $\mathbb{R}^d$ and every Borel set $A \in \mathscr{B}^d$
(8.12)    $\lambda^d(T(A)) = \lambda^d(A)$.
In this form Theorem 8.3 just says that any two congruent Borel sets in $\mathbb{R}^d$ have the same d-dimensional Lebesgue measure. This however is the measure-theoretic formulation and refinement of the elementary geometric principle (A) enunciated in the introduction to the chapter. Via it L-B measure is seen in the final analysis to be a concept from euclidean geometry.

Examples. 2. Every hyperplane $H \subset \mathbb{R}^d$ is an L-B-null set. This follows from Example 1 of §6 and the fact that there is a motion $T$ which transforms a hyperplane of the kind considered in that example, say the hyperplane with equation $\xi_d = 0$, into $H$.
3. Every closed or open box (meaning a parallelepiped with pairwise orthogonal edges) $Q \subset \mathbb{R}^d$ whose edge-lengths are $l_1, \ldots, l_d$ has Lebesgue measure $\lambda^d(Q) = l_1 \cdots l_d$. This follows analogously from Example 3 of §6.

The behavior of $\lambda^d$ with respect to linear transformations of the vector space $\mathbb{R}^d$ into itself - and then too with respect to arbitrary affine mappings - can also be clarified using a slight modification of the preceding method of proof.
Linear mappings $T : \mathbb{R}^d \to \mathbb{R}^d$ are just those that with respect to the canonical basis in $\mathbb{R}^d$ (or indeed any basis) can be represented in the form $T(x) = Cx$, with $C$ a $d \times d$ matrix and $x \in \mathbb{R}^d$ interpreted as a column vector. The determinant of $T$, in symbols $\det T$, is by definition that of $C$ (and is independent of the choice of basis).
We will restrict ourselves to the case where $T$ is non-singular, that is, $\det T \ne 0$, and consequently bijective. Such mappings are the elements of the group $GL(d, \mathbb{R})$ known as the general linear group. The mappings $T \in GL(d, \mathbb{R})$ with $\det T = 1$ form a subgroup of $GL(d, \mathbb{R})$, the special linear group $SL(d, \mathbb{R})$. It is in fact the commutator subgroup of $GL(d, \mathbb{R})$ and this fact is used by DIEROLF and SCHMIDT [1998] to give an alternative proof of our next theorem. (The behavior of $\lambda^d$ with respect to linear mappings $T$ with $\det T = 0$ is elucidated in Exercise 2 below.)

8.4 Theorem. Every $T \in GL(d, \mathbb{R})$ satisfies
(8.13)    $T(\lambda^d) = \frac{1}{|\det T|} \lambda^d$
or, equivalently
(8.14)    $\lambda^d(T(A)) = |\det T| \, \lambda^d(A)$    for all $A \in \mathscr{B}^d$.

Proof. Consider first the elementary mappings $D_\alpha^{(i)}$ defined in Example 4 of §7 and $S^{(i,k)}$ defined in Example 1 of this section. It is obvious that $\det D_\alpha^{(i)} = \alpha$ and $\det S^{(i,k)} = 1$. Therefore (7.9) and (8.5) confirm (8.13) in the special case that $T$ lies in the set
(8.15)    $\{D_\alpha^{(i)}, S^{(i,k)} : \alpha \in \mathbb{R} \setminus \{0\}$, $i, k \in \{1, \ldots, d\}$, $i \ne k\}$.
Since $\det$ is a homomorphism of $GL(d, \mathbb{R})$ into the multiplicative group $\mathbb{R}^\times := \mathbb{R} \setminus \{0\}$, it follows from this and (7.6) that in fact (8.13) holds for all $T$ in the subgroup of $GL(d, \mathbb{R})$ generated by the set (8.15). The proof of Theorem 8.4 is therefore completed by recalling the key fact, proved in every linear algebra text, that this subgroup is the whole group $GL(d, \mathbb{R})$. Actually what is usually proved (see, e.g., BIRKHOFF and MACLANE [1965], p. 217) is that $GL(d, \mathbb{R})$ is generated by (8.15) together with the transformations $J^{(i,k)}$ which send every vector $x = (x_1, \ldots, x_d)$ to the vector whose coordinates are those of $x$ but with $x_i$ and $x_k$ interchanged. However, every such transformation is already in the group generated by (8.15), for it is routine to confirm, by watching the behavior of the ith and the kth coordinates at each step, that for $i \ne k$
$J^{(i,k)} = D_{-1}^{(i)} \circ S^{(k,i)} \circ D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)} \circ S^{(k,i)}$.
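
The factorization of the coordinate interchange into dilations by $-1$ and skew mappings can be confirmed with small integer matrices. The sketch below assumes the factorization $J^{(i,k)} = D_{-1}^{(i)} \circ S^{(k,i)} \circ D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)} \circ S^{(k,i)}$, which is one reading of the formula that can be verified by hand on the ith and kth coordinates; the helper names and the choice of `d`, `i`, `k` (0-based in the code) are illustrative.

```python
def identity(d):
    return [[1 if r == c else 0 for c in range(d)] for r in range(d)]

def D(d, i, alpha):
    """Matrix of the dilation D_alpha^{(i)}: multiply coordinate i by alpha."""
    m = identity(d)
    m[i][i] = alpha
    return m

def S(d, i, k):
    """Matrix of the skew mapping S^{(i,k)}: add coordinate k to coordinate i."""
    m = identity(d)
    m[i][k] = 1
    return m

def swap(d, i, k):
    """Matrix of J^{(i,k)}: interchange coordinates i and k."""
    m = identity(d)
    m[i][i] = m[k][k] = 0
    m[i][k] = m[k][i] = 1
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def compose(*ms):
    """Matrix product m1 m2 ... mn, i.e. the mapping m1 o m2 o ... o mn."""
    out = ms[0]
    for m in ms[1:]:
        out = matmul(out, m)
    return out

d, i, k = 4, 1, 3    # hypothetical dimension and indices
J = compose(D(d, i, -1), S(d, k, i), D(d, k, -1), S(d, i, k), D(d, k, -1), S(d, k, i))
```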

Theorems 8.3 and 8.4 taken together confirm an elementary fact from linear algebra, namely that $|\det T| = 1$ for every orthogonal transformation $T$. And this means that 8.3 is contained in the following immediate consequence of 8.4:

8.5 Corollary. The L-B measure $\lambda^d$ is invariant under all transformations $T \in GL(d, \mathbb{R})$ with $|\det T| = 1$, in particular, under mappings $T \in SL(d, \mathbb{R})$.
Remark. 1. As is known from the differential calculus in $\mathbb{R}^d$, a $C^1$-diffeomorphism $\varphi : G \to G'$ between two open subsets $G$ and $G'$ of $\mathbb{R}^d$ is approximable near each point $x \in G$ by a mapping $T_x \in GL(d, \mathbb{R})$, namely by the derivative $T_x := D\varphi(x)$. It should now not come as a surprise that the following transformation formula, involving the density concept from 17.2, relates the image of L-B measure on $G$ to L-B measure on $G'$:
(8.16)    $\varphi(\lambda^d_G) = \frac{1}{|\det D\varphi| \circ \varphi^{-1}} \, \lambda^d_{G'}$
or equivalently
(8.16')    $\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi| \, \lambda^d_G$.
We will not go into this any further, but refer the reader to the textbook literature, e.g., STROMBERG [1981], or to VARBERG [1971].

We will conclude the chapter by proving the existence of non-Borel subsets


of Rd. A different approach is indicated in the prologue to Theorem 26.6.

8.6 Theorem. For every dimension $d \in \mathbb{N}$, $\mathscr{B}^d \ne \mathscr{P}(\mathbb{R}^d)$.

Proof. Let $\mathbb{Q}^d$ denote the set of points in $\mathbb{R}^d$ each of whose $d$ coordinates is rational. This is a subgroup of the additive group $\mathbb{R}^d$, so congruence $x \sim y$ of points $x, y \in \mathbb{R}^d$ modulo $\mathbb{Q}^d$ is an equivalence relation; it is defined by $x \sim y$ if and only if $x - y \in \mathbb{Q}^d$. The space $\mathbb{R}^d$ decomposes into disjoint equivalence classes, each a set $x + \mathbb{Q}^d$ with $x \in \mathbb{R}^d$, the statement $x \sim y$ being equivalent to the equality $x + \mathbb{Q}^d = y + \mathbb{Q}^d$. Since to every real number $\eta$ corresponds an integer $n$ such that $n \le \eta < n + 1$, that is, such that $\eta - n \in [0, 1[$, every equivalence class contains a point $x \in [0, 1[$. Consequently, there is a set $K \subset [0, 1[$ which contains exactly one element from each equivalence class. (On the role of the Axiom of Choice from set theory in this existence claim see SOLOVAY [1970] and HALMOS [1974].) We have then
(8.17)    $\mathbb{R}^d = \bigcup_{k \in K} (k + \mathbb{Q}^d) = \bigcup_{y \in \mathbb{Q}^d} (y + K)$
and
(8.18)    $y_1 \ne y_2$, $y_1, y_2 \in \mathbb{Q}^d$ $\Rightarrow$ $(y_1 + K) \cap (y_2 + K) = \emptyset$.
(Otherwise there are $k, k' \in K$ with $y_1 + k = y_2 + k'$, that is, with $k \sim k'$, which by definition of $K$ means that $k = k'$ and consequently also $y_1 = y_2$.) Let us now suppose that $K \in \mathscr{B}^d$. Since $\mathbb{Q}$ and therewith $\mathbb{Q}^d$ is countable, it follows from (8.17), (8.18) and the $\sigma$-additivity of $\lambda^d$ that
(8.19)    $\sum_{y \in \mathbb{Q}^d} \lambda^d(y + K) = \lambda^d(\mathbb{R}^d) = +\infty$.
Translation-invariance of $\lambda^d$ says that
(8.20)    $\lambda^d(y + K) = \lambda^d(K)$    for all $y \in \mathbb{Q}^d$
and so in view of (8.19), $\lambda^d(K) > 0$. Now $K \subset [0, 1[$, and so
$\bigcup_{y \in [0, 1[ \cap \mathbb{Q}^d} (y + K) \subset [0, 2[$,
2 being the point in $\mathbb{R}^d$ each of whose coordinates equals 2. From this fact and (8.18) follows, again via $\sigma$-additivity of $\lambda^d$, that
$\sum_{y \in [0, 1[ \cap \mathbb{Q}^d} \lambda^d(y + K) \le \lambda^d([0, 2[) = 2^d < +\infty$.
But the sum on the left has infinitely many terms, all equal to $\lambda^d(K)$ by (8.20), so we must have $\lambda^d(K) = 0$, contradicting (8.19). The assumption $K \in \mathscr{B}^d$ is what led to this contradiction, so we conclude that $K$ is, after all, not a Borel set.
The following remarks serve to round out the foregoing and to provide a glimpse
of some closely related issues.

Remarks. 2. The "content-problem" in $\mathbb{R}^d$ described in the introduction to this chapter, namely the problem of determining a d-dimensional volume for as large a class of subsets of $\mathbb{R}^d$ as possible, is quite satisfactorily solved by the Lebesgue-Borel measure $\lambda^d$, especially in view of its motion-invariance. More satisfactory still for a reader without preconceived notions would be a proof of the existence of a measure $\mu$ on the whole power set $\mathscr{P}(\mathbb{R}^d)$ which assigns mass 1 to the unit cube $[0, 1[$ and is invariant under all motions of $\mathbb{R}^d$. According to Corollary 8.2 such a $\mu$ would have to be an extension of the Lebesgue-Borel measure $\lambda^d$. But it was shown by F. Hausdorff (1868-1942), cf. HAUSDORFF [1914], pp. 401-402, that no such measure $\mu$ on $\mathscr{P}(\mathbb{R}^d)$ exists, for any dimension $d \ge 1$; in fact, there is not even a $\sigma$-additive, motion-invariant content $\mu \ne 0$ on the ring of all bounded subsets of $\mathbb{R}^d$. For this result the reader is referred to the exposition in AUMANN [1969], pp. 275-276, which further exploits the ideas in the preceding proof of Theorem 8.6 and will consequently be mathematically accessible to the reader.
HAUSDORFF [1914], p. 469 further showed that the content-problem for bounded subsets of $\mathbb{R}^d$ does not even have a solution if the motion-invariant content $\mu \ne 0$ is only required to be finitely additive, and $d \ge 3$. The reader is referred for this to the presentation by STROMBERG [1979] or WAGON [1985] (in each of which a central role is played by the so-called Banach-Tarski paradox, discovered in 1924) and to the subsequent investigations of VON NEUMANN [1929] that introduced the idea of amenable groups.
S. Banach (1892-1945), cf. BANACH [1923], discovered that the finitely-additive content-problem mentioned above has a solution in dimensions $d = 1$ and $d = 2$. But such a $\mu$ is not uniquely determined by the normalization $\mu([0, 1[) = 1$ and for this very reason its further study has not seemed worthwhile. A remarkable generalization of Banach's result will be found on pp. 242-245 of HEWITT and ROSS [1979].

3. If $(\Omega, \mathscr{A}, \mu)$ is a measure space and $A$ a $\mu$-null set, then indeed because of isotoneity every subset of $A$ that belongs to $\mathscr{A}$ is itself $\mu$-null; nevertheless, not every subset of $A$ need belong to $\mathscr{A}$. This phenomenon even occurs with the L-B measure, as the second part of Remark 4 will show. If $\mathscr{A}$ contains all subsets of each $\mu$-null set, then $\mu$ is called a complete measure.
Exercise 7 of §5 describes how an arbitrary measure $\mu$ can be extended in a natural way to a complete measure by passage to its so-called completion $\mu_0$. The completion of the Lebesgue-Borel measure in $\mathbb{R}^d$ is called Lebesgue measure in $\mathbb{R}^d$; the sets in the $\sigma$-algebra on which it is defined are called Lebesgue measurable and those of them having measure zero are called Lebesgue-null sets in $\mathbb{R}^d$.
In passing from Borel sets to Lebesgue measurable sets the important property of the former that they are determined only by the topology of $\mathbb{R}^d$ is lost. Because $\mathscr{B}^d$ is the defining $\sigma$-algebra for so many other important measures (for $d = 1$ Theorem 6.5 already attests to this), we will not dwell in detail on the transition from Lebesgue-Borel to Lebesgue measure; only the former will be employed in the sequel.
4. There exists a Borel set $B \in \mathscr{B}^2$ whose image $\pi_1(B)$ under the first projection map $\pi_1 : \mathbb{R}^2 \to \mathbb{R}$ (which sends every point $(x_1, x_2) \in \mathbb{R}^2$ to its first coordinate $x_1$) is not a Borel subset of $\mathbb{R}$. A proof of this will be found in SRIVASTAVA [1998], p. 130. Such a $B$ can even be found which is a $G_\delta$-set, that is, the intersection of countably many open subsets of $\mathbb{R}^2$; see p. 36 of CHRISTENSEN [1974]. In particular, the continuous image of a Borel set need not be a Borel set. The system of all sets $\pi_1(B)$ with $B \in \mathscr{B}^2$ comprises rather the so-called Souslin or analytic subsets of $\mathbb{R}$. See SRIVASTAVA [1998] and CHOQUET [1969].
For any non-Borel set $A \subset \mathbb{R}$, Exercise 2 of §6 shows that $A \times \{0\}$ is a non-Borel subset of the $\lambda^2$-null set $\mathbb{R} \times \{0\}$.
5. Examples 4 and 5 of §7 as well as Theorem 8.4 illustrate that the L-B measure $\lambda^d$ is not invariant with respect to all homeomorphisms $T : \mathbb{R}^d \to \mathbb{R}^d$ of $\mathbb{R}^d$ onto itself. For such a homeomorphism $T$ however, $\mu := T(\lambda^d)$ is always a measure on $\mathscr{B}^d$ with the following properties: (i) $\mu(K) < +\infty$ for every compact $K \subset \mathbb{R}^d$; (ii) $\mu(\{x\}) = 0$ for every $x \in \mathbb{R}^d$; (iii) $\mu(U) > 0$ for every non-empty open $U \subset \mathbb{R}^d$; (iv) $\mu(\mathbb{R}^d) = +\infty$. OXTOBY and ULAM [1941] showed that, conversely, every measure $\mu$ on $\mathscr{B}^d$ enjoying properties (i)-(iv) has the form $\mu = T(\lambda^d)$ for some homeomorphism $T : \mathbb{R}^d \to \mathbb{R}^d$. A simpler treatment of their result was later provided by GOFFMAN and PEDRICK [1975].

Exercises.
1. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping, $\mu$ a measure on the $\sigma$-algebra $\mathscr{A}$, and $\mu' := T(\mu)$ its image under this mapping. $(\Omega, \mathscr{A}_0, \mu_0)$ and $(\Omega', \mathscr{A}'_0, \mu'_0)$ will denote the completions of these measure spaces (Exercise 7, §5). Show that the mapping $T$ is also $\mathscr{A}_0$-$\mathscr{A}'_0$-measurable and that $T(\mu_0) = \mu'_0$. From this it follows that Lebesgue measure in $\mathbb{R}^d$ is also motion-invariant.
2. Let $T$ be a linear mapping of $\mathbb{R}^d$ into itself with $\det T = 0$. Show that for every $A \in \mathscr{B}^d$, $T(A)$, although it may fail to be a Borel set (as noted in Remark 4), is at least a Lebesgue-null set, thus a subset of an L-B-null set, namely the linear subspace $T(\mathbb{R}^d)$ of $\mathbb{R}^d$. In this sense equality (8.14) retains its validity for linear transformations $T : \mathbb{R}^d \to \mathbb{R}^d$ with $\det T = 0$, i.e., (8.14) is valid for every linear transformation $T$ of $\mathbb{R}^d$ into itself.
3. Show that the set $K$ constructed in the proof of Theorem 8.6 is not even Lebesgue measurable.
4. In the section entitled "Fallacies, Flaws and Flimflam", p. 39, vol. 22, no. 1 (1991) of the College Mathematics Journal the following short "proof" of Theorem 8.6 is offered: Suppose that $\lambda^1(X)$ is defined for every subset $X$ of $[0, 1]$. By isotoneity it is a number in $[0, 1]$. Consider the set $B$ defined as $\{\lambda^1(X) : X \in \mathscr{P}([0, 1]), \lambda^1(X) \notin X\}$. It is a subset of $[0, 1]$ and upon testing the number $\lambda^1(B)$ for membership in $B$ we find that the statements $\lambda^1(B) \in B$ and $\lambda^1(B) \notin B$ are equivalent, a contradiction. What is the error in this reasoning, or is it perhaps a legitimate proof of Theorem 8.6?

Chapter II

Integration Theory

A measure space $(\Omega, \mathscr{A}, \mu)$ is given. We pose the problem of assigning to each function on $\Omega$ from as large a class as possible an integral, that is, a "mean value" constructed with respect to $\mu$. After the introduction in §9 of the property of measurability, which is fundamental, this problem will be resolved step by step in §§10-12. The later sections of this chapter are devoted to erecting the theory and exploring the applications of the integration procedure thus defined.

9. Measurable numerical functions


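On the number line $\mathbb{R}$ we have defined the $\sigma$-algebra $\mathscr{B}^1$ of Borel sets. If we compactify $\mathbb{R}$ to $\overline{\mathbb{R}}$ in the customary way by adjoining the "ideal" points $-\infty$ and $+\infty$, the sets $A \subset \overline{\mathbb{R}}$ for which $A \cap \mathbb{R} \in \mathscr{B}^1$ are called Borel in $\overline{\mathbb{R}}$. Constructively, the Borel sets in $\overline{\mathbb{R}}$ are precisely all sets $B$, $B \cup \{-\infty\}$, $B \cup \{+\infty\}$, $B \cup \{-\infty, +\infty\}$ with $B \in \mathscr{B}^1$. The system $\overline{\mathscr{B}}^1$ of these sets is obviously a $\sigma$-algebra in $\overline{\mathbb{R}}$ whose trace in $\mathbb{R}$ is $\mathscr{B}^1$:
(9.1)    $\mathbb{R} \cap \overline{\mathscr{B}}^1 = \mathscr{B}^1$.
If now $(\Omega, \mathscr{A})$ is a measurable space, the $\mathscr{A}$-$\overline{\mathscr{B}}^1$-measurability of functions $f : \Omega \to \overline{\mathbb{R}}$ is defined. Such functions will henceforth be called ($\mathscr{A}$-)measurable numerical functions on $\Omega$. Real functions $f : \Omega \to \mathbb{R}$ are special numerical functions; in view of (9.1) the $\mathscr{A}$-$\mathscr{B}^1$-measurability of such a function is just the same as its $\mathscr{A}$-$\overline{\mathscr{B}}^1$-measurability.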

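Examples. 1. Let $(\Omega, \mathscr{A})$ be a measurable space, $A$ a subset of $\Omega$. The function
(9.2)    $1_A(\omega) := 1$ if $\omega \in A$, $1_A(\omega) := 0$ if $\omega \in \Omega \setminus A$
is called the indicator function (sometimes also the characteristic function) of $A$. This real function on $\Omega$ is $\mathscr{A}$-measurable just if $A \in \mathscr{A}$, because for every $B \subset \mathbb{R}$ the set $(1_A)^{-1}(B)$ must be one of the four $\Omega$, $A$, $\Omega \setminus A$, $\emptyset$.
Thus sets and their indicator functions correspond biuniquely. The following calculation rules, in which $A, B \subset \Omega$ and $A_i \subset \Omega$ for $i \in I$, are often used, and their validity is immediately perceived:
$A \subset B \Leftrightarrow 1_A \le 1_B$;
$1_{\complement A} = 1 - 1_A$;    $1_{A \triangle B} = |1_A - 1_B|$;    $1_{A \cap B} = 1_A \cdot 1_B$;
$1_{\bigcup_{i \in I} A_i} = \sup_{i \in I} 1_{A_i}$,    $1_{\bigcap_{i \in I} A_i} = \inf_{i \in I} 1_{A_i}$.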
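
The calculation rules for indicator functions can be checked pointwise on a small finite ground set. The following sketch does so; the ground set `omega`, the sets `A` and `B` and the helper `ind` are hypothetical and serve only as an illustration.

```python
omega = set(range(10))                  # hypothetical finite ground set
A, B = {1, 2, 3, 4}, {3, 4, 5, 8}

def ind(S):
    """Indicator function 1_S on omega, returned as a dict of values."""
    return {w: 1 if w in S else 0 for w in omega}

one_A, one_B = ind(A), ind(B)
rules = {
    "complement": ind(omega - A) == {w: 1 - one_A[w] for w in omega},
    "symmetric difference": ind(A ^ B) == {w: abs(one_A[w] - one_B[w]) for w in omega},
    "intersection": ind(A & B) == {w: one_A[w] * one_B[w] for w in omega},
    "union as sup": ind(A | B) == {w: max(one_A[w], one_B[w]) for w in omega},
    "intersection as inf": ind(A & B) == {w: min(one_A[w], one_B[w]) for w in omega},
    "monotonicity": (A & B <= A) and all(ind(A & B)[w] <= one_A[w] for w in omega),
}
```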

2. For an arbitrary subset $Q$ of $\mathbb{R}^d$ consider the measurable space $(Q, Q \cap \mathscr{B}^d)$. The corresponding measurable numerical functions on $Q$ will be called Borel measurable functions or Borel functions on $Q$. Every continuous numerical function $f$ on $Q$ is such a Borel measurable function. Indeed, for every $\alpha \in \mathbb{R}$ the set $Q_\alpha$ of all $x \in Q$ with $f(x) \ge \alpha$ is a relatively closed subset of $Q$, that is, of the form $Q \cap F$ for a set $F$ which is closed in $\mathbb{R}^d$. (Such an $F$ would be, e.g., the closure of $Q_\alpha$ in $\mathbb{R}^d$.) Since $F \in \mathscr{B}^d$, this intersection lies in the trace $\sigma$-algebra $Q \cap \mathscr{B}^d$. The claim therefore follows from the next theorem. (Cf. also Example 2 of §7.)
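9.1 Theorem. A numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if and only if
(9.3)    $\{\omega \in \Omega : f(\omega) \ge \alpha\} \in \mathscr{A}$    for all $\alpha \in \mathbb{R}$.

Proof. According to 7.2 we have only to show that the system $\overline{\mathscr{E}}$ of all intervals $[\alpha, +\infty]$ with $\alpha \in \mathbb{R}$ generates the $\sigma$-algebra $\overline{\mathscr{B}}^1$ in $\overline{\mathbb{R}}$. Since $[\alpha, +\infty] \in \overline{\mathscr{B}}^1$ for every $\alpha \in \mathbb{R}$, we have at any rate that $\sigma(\overline{\mathscr{E}}) \subset \overline{\mathscr{B}}^1$ for the $\sigma$-algebra $\sigma(\overline{\mathscr{E}})$ generated by $\overline{\mathscr{E}}$. Because $[\alpha, \beta[ = [\alpha, +\infty] \setminus [\beta, +\infty]$, the intervals $[\alpha, \beta[$ with $\alpha, \beta \in \mathbb{R}$ and $\alpha < \beta$ all lie in $\mathbb{R} \cap \sigma(\overline{\mathscr{E}})$. From 6.1 therefore follows that $\mathscr{B}^1 \subset \mathbb{R} \cap \sigma(\overline{\mathscr{E}})$. Now the single-element sets
$\{-\infty\} = \bigcap_{n \in \mathbb{N}} \complement[-n, +\infty]$ and $\{+\infty\} = \bigcap_{n \in \mathbb{N}} [n, +\infty]$
both lie in $\sigma(\overline{\mathscr{E}})$. Consequently, along with each $Q \in \sigma(\overline{\mathscr{E}})$, the set $\mathbb{R} \cap Q$ is also in $\sigma(\overline{\mathscr{E}})$. In other words, $\mathbb{R} \cap \sigma(\overline{\mathscr{E}}) \subset \sigma(\overline{\mathscr{E}})$ and therewith $\mathscr{B}^1 \subset \sigma(\overline{\mathscr{E}})$. This fact together with $\{-\infty\}, \{+\infty\} \in \sigma(\overline{\mathscr{E}})$ and the remarks preceding (9.1) make it clear that $\overline{\mathscr{B}}^1 \subset \sigma(\overline{\mathscr{E}})$, so that finally we have $\overline{\mathscr{B}}^1 = \sigma(\overline{\mathscr{E}})$.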
We now introduce some popular short-hand notation: For numerical functions $f$ and $g$ on $\Omega$
(9.4)    $\{f \le g\} := \{\omega \in \Omega : f(\omega) \le g(\omega)\}$
and the sets $\{f < g\}$, $\{f = g\}$, $\{f \ne g\}$, etc., are defined analogously. Condition (9.3) in this language reads: $\{f \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.

That we can just as well employ the sets $\{f > \alpha\}$, $\{f \le \alpha\}$, etc., in the preceding characterization is the content of

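9.2 Theorem. Each of the following conditions is equivalent to the $\mathscr{A}$-measurability of the numerical function $f$ on $\Omega$:
(a)    $\{f \ge \alpha\} \in \mathscr{A}$    for all $\alpha \in \mathbb{R}$;
(b)    $\{f > \alpha\} \in \mathscr{A}$    for all $\alpha \in \mathbb{R}$;
(c)    $\{f \le \alpha\} \in \mathscr{A}$    for all $\alpha \in \mathbb{R}$;
(d)    $\{f < \alpha\} \in \mathscr{A}$    for all $\alpha \in \mathbb{R}$.

Proof. All that has to be shown is the equivalence of these four assertions, and that results from the validity, for all $\alpha \in \mathbb{R}$, of the equations
$\{f > \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \ge \alpha + n^{-1}\}$;    $\{f \le \alpha\} = \complement\{f > \alpha\}$;
$\{f < \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \le \alpha - n^{-1}\}$;    $\{f \ge \alpha\} = \complement\{f < \alpha\}$.

It may be noted that the four related assertions in which quantification is over all $\alpha \in \overline{\mathbb{R}}$ are also equivalent.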
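
On a finite set the conditions of 9.2 can be tested exhaustively. The sketch below assumes a $\sigma$-algebra generated by a finite partition, so that its members are exactly the unions of blocks; the function names, the partition and the two sample functions are hypothetical. Only finitely many thresholds need to be tried because the sets $\{f \ge \alpha\}$ and $\{f > \alpha\}$ change only when $\alpha$ passes a value of $f$.

```python
from itertools import combinations

def sigma_algebra(partition):
    """All unions of blocks of a finite partition (the sigma-algebra it generates)."""
    blocks = [frozenset(b) for b in partition]
    return {frozenset().union(*c)
            for r in range(len(blocks) + 1) for c in combinations(blocks, r)}

def satisfies_a(f, omega, algebra):
    """Condition (a): {f >= alpha} lies in the sigma-algebra for every alpha."""
    return all(frozenset(w for w in omega if f(w) >= alpha) in algebra
               for alpha in {f(w) for w in omega})

def satisfies_b(f, omega, algebra):
    """Condition (b): {f > alpha} lies in the sigma-algebra for every alpha."""
    return all(frozenset(w for w in omega if f(w) > alpha) in algebra
               for alpha in {f(w) for w in omega})

omega = range(6)                                   # hypothetical ground set
algebra = sigma_algebra([{0, 1}, {2, 3, 4}, {5}])
f = lambda w: w // 2                               # not constant on {2, 3, 4}
g = lambda w: 0 if w < 2 else (3 if w < 5 else 1)  # constant on every block
```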
A plethora of assertions about calculating with measurable numerical functions
now presents itself.

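9.3 Theorem. For any $\mathscr{A}$-measurable functions $f, g : \Omega \to \overline{\mathbb{R}}$ the sets $\{f < g\}$, $\{f \le g\}$, $\{f = g\}$ and $\{f \ne g\}$ lie in $\mathscr{A}$.
Proof. Because the set $\mathbb{Q}$ of rational numbers is countable, the claims follow (with the help of 9.2) from the equalities
$\{f < g\} = \bigcup_{\rho \in \mathbb{Q}} \{f < \rho\} \cap \{\rho < g\}$;
$\{f \le g\} = \complement\{f > g\}$;    $\{f = g\} = \{f \le g\} \cap \{g \le f\}$;
$\{f \ne g\} = \complement\{f = g\}$.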
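
The first equality in this proof rests on the fact that strictly between any two real numbers $s < t$ there is a rational number $\rho$. A minimal sketch of how such a $\rho$ can be found for finite floating-point inputs follows; the function name `rational_between` and the doubling search over denominators are assumptions for illustration, not part of the text.

```python
from fractions import Fraction
import math

def rational_between(s, t):
    """Return a rational rho with s < rho < t, for finite real numbers s < t."""
    if not s < t:
        raise ValueError("need s < t")
    n = 1
    while True:
        # Smallest multiple of 1/n that is strictly larger than s.
        rho = Fraction(math.floor(s * n) + 1, n)
        if rho < t:
            return rho
        n *= 2   # refine the grid until a multiple of 1/n fits below t

rho = rational_between(math.sqrt(2), 1.5)
```

With such a $\rho$ in hand, a point $\omega$ with $f(\omega) < g(\omega)$ lies in $\{f < \rho\} \cap \{\rho < g\}$ for $\rho$ chosen between $f(\omega)$ and $g(\omega)$.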
9.4 Theorem. Along with $f, g : \Omega \to \overline{\mathbb{R}}$, the function $f \cdot g$ and, if everywhere defined, the functions $f + g$ and $f - g$ are also $\mathscr{A}$-measurable.
Proof. First of all, along with $g$, $\sigma + \tau g$ is measurable for all $\sigma, \tau \in \mathbb{R}$. This follows from 9.2 because $\{\sigma + \tau g \ge \alpha\}$ is $\{g \ge (\alpha - \sigma)/\tau\}$ if $\tau > 0$ and is $\{g \le (\alpha - \sigma)/\tau\}$ if $\tau < 0$, the case $\tau = 0$ being trivial. This preliminary remark takes care of the passage from $g$ to $-g$ and reduces the case $f - g$ to the case $f + g$. Furthermore, together with the remark following 9.2 and the equalities
$\{f + g \ge \alpha\} = \{f \ge \alpha - g\}$    ($\alpha \in \mathbb{R}$)
it yields the measurability of $f + g$.
In investigating $f \cdot g$ we will first suppose both functions are real-valued. Then the identity
$fg = \frac{1}{4}(f + g)^2 - \frac{1}{4}(f - g)^2$
reduces the product question to the case $g = f$. But $\{f^2 \ge \alpha\}$ is $\Omega$ if $\alpha \le 0$ and is $\{f \ge \sqrt{\alpha}\} \cup \{f \le -\sqrt{\alpha}\}$ if $\alpha > 0$, which shows that the measurability of $f^2$ follows from that of $f$.
If finally $f$ and $g$ are numerical functions, introduce $\Omega_1 := \{fg = +\infty\}$, $\Omega_2 := \{fg = -\infty\}$, $\Omega_3 := \{fg = 0\}$ and $\Omega_4 := \complement(\Omega_1 \cup \Omega_2 \cup \Omega_3)$. Using 9.3 and the measurability of constant functions we check that these four pairwise disjoint sets lie in $\mathscr{A}$. The restrictions $f', g'$ of $f, g$ to $\Omega_4$ are $\Omega_4 \cap \mathscr{A}$-measurable and real-valued. The product $f'g'$ is therefore $\Omega_4 \cap \mathscr{A}$-measurable. From this follows immediately the $\mathscr{A}$-measurability of $fg$. □
A useful special case of 9.4, isolated already in the course of its proof, is that $\alpha f$ is $\mathscr{A}$-measurable whenever $f$ is and $\alpha \in \mathbb{R}$.


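9.5 Theorem. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions
on $\Omega$. Then each of the following numerical functions is $\mathscr{A}$-measurable:
$$ \sup_{n \in \mathbb{N}} f_n, \qquad \inf_{n \in \mathbb{N}} f_n, \qquad \limsup_{n \to \infty} f_n, \qquad \liminf_{n \to \infty} f_n. $$

Proof. The function $s := \sup_n f_n$ is measurable because
$$ \{s \le \alpha\} = \bigcap_{n \in \mathbb{N}} \{f_n \le \alpha\} \qquad \text{for all } \alpha \in \mathbb{R}. $$
Due to 9.4, $\inf_n f_n = -\sup_n(-f_n)$ is then also measurable. By definition we have
$$ \liminf_{n \to \infty} f_n = \sup_{n \in \mathbb{N}} \inf_{m \ge n} f_m, \qquad \limsup_{n \to \infty} f_n = \inf_{n \in \mathbb{N}} \sup_{m \ge n} f_m. $$
By what has already been proved, each of these functions is measurable. $\square$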

9.6 Corollary 1. For any finitely many $\mathscr{A}$-measurable numerical functions
$f_1, \dots, f_n$ on $\Omega$, their lower and upper envelopes
$$ f_1 \wedge \dots \wedge f_n \qquad \text{and} \qquad f_1 \vee \dots \vee f_n $$
are $\mathscr{A}$-measurable.

Proof. Apply 9.5 to the ultimately constant sequence $f_1, \dots, f_n, f_n, f_n, \dots$. $\square$

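9.7 Corollary 2. If a sequence $(f_n)_{n \in \mathbb{N}}$ of $\mathscr{A}$-measurable functions converges
pointwise throughout $\Omega$, that is, if $\lim_{n \to \infty} f_n(\omega)$ exists in $\overline{\mathbb{R}}$ for every $\omega \in \Omega$, then
the limit function $\lim_{n \to \infty} f_n$ is $\mathscr{A}$-measurable.

This is immediate from
$$ \lim_{n \to \infty} f_n = \liminf_{n \to \infty} f_n = \limsup_{n \to \infty} f_n. $$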

To every numerical function $f : \Omega \to \overline{\mathbb{R}}$ three other functions on $\Omega$ are associated (cf. the section "Notations"): the absolute value
$$ |f| := f \vee (-f), \tag{9.5} $$
the positive part
$$ f^+ := f \vee 0, \tag{9.6} $$
and the negative part of $f$
$$ f^- := (-f)^+ = -(f \wedge 0). \tag{9.7} $$
Thus $f^+(\omega) = f(\omega)$ in case $f(\omega) \ge 0$ and $f^+(\omega) = 0$ in case $f(\omega) \le 0$. Observe
that not only $f^+ \ge 0$, but also $f^- \ge 0$. The important equalities
$$ f = f^+ - f^- \qquad \text{and} \qquad |f| = f^+ + f^- \tag{9.8} $$
are immediate.
From 9.4 and 9.6 we effortlessly infer our concluding result:

9.8 Theorem. A numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if and only if both
its positive part $f^+$ and its negative part $f^-$ are $\mathscr{A}$-measurable. Furthermore,
along with $f$, its absolute value $|f|$ is always $\mathscr{A}$-measurable.
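
The equalities (9.8) can be checked pointwise on a few sample values. The following Python sketch is only an illustration: the helper names `pos_part` and `neg_part` and the sample values are assumptions introduced here, not notation from the text.

```python
import math

def pos_part(x: float) -> float:
    """f^+ = f v 0, evaluated at a single value x = f(w)."""
    return max(x, 0.0)

def neg_part(x: float) -> float:
    """f^- = (-f)^+, evaluated at a single value x = f(w)."""
    return max(-x, 0.0)

# Hypothetical values f(w) of a numerical function, including -oo and +oo.
values = [-math.inf, -2.5, 0.0, 1.0, math.inf]

for x in values:
    p, n = pos_part(x), neg_part(x)
    assert p >= 0 and n >= 0      # both parts are non-negative
    assert p - n == x             # f = f^+ - f^-
    assert p + n == abs(x)        # |f| = f^+ + f^-
    assert n == -min(x, 0.0)      # (-f)^+ = -(f ^ 0), cf. (9.7)
```

At most one of the two parts is non-zero at any point, which is why the difference $f^+ - f^-$ is always defined, even where $f$ takes an infinite value.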

Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $D$ a dense subset of $\mathbb{R}$ (e.g., $\mathbb{Q}$). Show that
a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if the analog for all $\alpha \in D$ of one of
(a)-(d) in Theorem 9.2 holds.
2. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions on a measurable
space $(\Omega, \mathscr{A})$. Why is the set of all $\omega \in \Omega$ for which the sequence $(f_n(\omega))_{n \in \mathbb{N}}$
converges in $\mathbb{R}$, and that for which it converges in $\overline{\mathbb{R}}$, $\mathscr{A}$-measurable?
3. The real function $f : \Omega \to \mathbb{R}$ is measurable on the measurable space $(\Omega, \mathscr{A})$. Are
$\exp f$ and $\sin f$, that is, the functions $\omega \mapsto e^{f(\omega)}$ and $\omega \mapsto \sin f(\omega)$, $\mathscr{A}$-measurable?
4. With the aid of Theorem 9.1 show that the real function defined on $\mathbb{R}^2$ by
$(x, y) \mapsto \max\{x, y\}$ is $\mathscr{B}^2$-measurable. Deduce from this another proof of Corollary 9.6.
5. Show via an example that the measurability of a numerical function $f$ is not
always a consequence of the measurability of $|f|$.

10. Elementary functions and their integral


Our path to the integral proceeds via the set
$$ E = E(\Omega, \mathscr{A}) $$
of $\mathscr{A}$-elementary functions on $\Omega$, which we define as follows:

10.1 Definition. A real function on $\Omega$ is called an ($\mathscr{A}$-)elementary function (or
a non-negative step function) if it is non-negative, $\mathscr{A}$-measurable, and assumes
only finitely many different values.

If $\{\alpha_1, \dots, \alpha_n\}$ is the set of distinct values of a function $u \in E$, then the sets
$A_i := u^{-1}(\alpha_i)$, $i = 1, \dots, n$, are pairwise disjoint, and as pre-images of the Borel
sets $\{\alpha_i\}$ they each lie in $\mathscr{A}$. Using the notation for indicator functions introduced
in (9.2), we have then
$$ u = \sum_{i=1}^{n} \alpha_i 1_{A_i}. \tag{10.1} $$
If conversely, numbers $\alpha_1, \dots, \alpha_n \in \mathbb{R}_+$ and sets $A_1, \dots, A_n \in \mathscr{A}$ are given ($n \in \mathbb{N}$)
and we define $u$ via (10.1), then $u$ is an elementary function, because by 9.4 it is
measurable. Thus $E$ is the set of all functions having a representation of the form
(10.1), with $n \in \mathbb{N}$, coefficients $\alpha_i$ in $\mathbb{R}_+$ and sets $A_i$ from $\mathscr{A}$.
From Definition 10.1 and the results of §9 the following further properties of $E$
are immediate:
$$ u, v \in E,\ \alpha \in \mathbb{R}_+ \implies \alpha u,\ u + v,\ u \cdot v,\ u \vee v,\ u \wedge v \in E. \tag{10.2} $$
The derivation of (10.1) shows moreover that every function $u \in E$ has a representation
of the form (10.1) in which the sets $A_i \in \mathscr{A}$ are pairwise disjoint
and cover $\Omega$, that is, constitute a decomposition of $\Omega$. Such representations will
henceforth be called normal representations of $u$.
It is easy to see that generally functions $u \in E$ can have several different normal
representations. However, for $u \ne 0$ there is only one representation in which the
coefficients are the distinct non-zero values taken by $u$. Anyway, for purposes of
integration non-uniqueness of normal representations is not an issue, as the next
lemma shows.

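10.2 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. For any normal representations
$$ u = \sum_{i=1}^{m} \alpha_i 1_{A_i} = \sum_{j=1}^{n} \beta_j 1_{B_j} $$
of an elementary function $u \in E$ we have
$$ \sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{j=1}^{n} \beta_j \mu(B_j) $$
(bearing in mind the conventions for calculating with $+\infty$).

Proof. From
$$ \Omega = A_1 \cup \dots \cup A_m = B_1 \cup \dots \cup B_n $$
follows
$$ A_i = \bigcup_{j=1}^{n} (A_i \cap B_j) \qquad \text{and} \qquad B_j = \bigcup_{i=1}^{m} (A_i \cap B_j), $$
in which the sets $A_i \cap B_j$ are pairwise disjoint. The finite additivity of $\mu$ therefore
supplies the equalities
$$ \mu(A_i) = \sum_{j=1}^{n} \mu(A_i \cap B_j) \qquad \text{and} \qquad \mu(B_j) = \sum_{i=1}^{m} \mu(A_i \cap B_j), $$
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$. After further
summation
$$ \sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{i,j} \alpha_i \mu(A_i \cap B_j) \qquad \text{and} \qquad \sum_{j=1}^{n} \beta_j \mu(B_j) = \sum_{i,j} \beta_j \mu(A_i \cap B_j). $$
From these two equalities the claim follows when we observe the following fact:
Because we started with normal representations of $u$, $\alpha_i = \beta_j$ for every index
pair $(i, j)$ such that $A_i \cap B_j \ne \emptyset$, in particular, for every pair $(i, j)$ such that
$\mu(A_i \cap B_j) \ne 0$. $\square$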
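Thanks to the preceding our next definition is sound:

10.3 Definition. Let $u$ be an elementary function. The number
$$ \int u \, d\mu := \sum_{i=1}^{n} \alpha_i \mu(A_i), \tag{10.3} $$
which is independent of the special choice of normal representation
$$ u = \sum_{i=1}^{n} \alpha_i 1_{A_i} $$
of $u$, is called the ($\mu$-)integral of $u$ (over $\Omega$).

Thus $u \mapsto \int u \, d\mu$ defines a mapping from $E$ into $\overline{\mathbb{R}}_+$. Clearly it is a mapping into $\mathbb{R}_+$ just if $\mu$ is finite. The most important properties of this mapping are
summarized in:
$$ \int 1_A \, d\mu = \mu(A) \qquad \text{for all } A \in \mathscr{A}; \tag{10.4} $$
$$ \int (\alpha u) \, d\mu = \alpha \int u \, d\mu \qquad \text{for all } u \in E,\ \alpha \in \mathbb{R}_+; \tag{10.5} $$
$$ \int (u + v) \, d\mu = \int u \, d\mu + \int v \, d\mu \qquad \text{for all } u, v \in E; \tag{10.6} $$
$$ u \le v \implies \int u \, d\mu \le \int v \, d\mu \qquad \text{for all } u, v \in E. \tag{10.7} $$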

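Properties (10.4) and (10.5) are immediate from 10.3. The next property in the
list is confirmed thus: Start with normal representations
$$ u = \sum_{i=1}^{m} \alpha_i 1_{A_i} \qquad \text{and} \qquad v = \sum_{j=1}^{n} \beta_j 1_{B_j} $$
of the functions $u$ and $v$ in $E$. As before
$$ A_i = \bigcup_{j=1}^{n} (A_i \cap B_j) \qquad \text{and} \qquad B_j = \bigcup_{i=1}^{m} (A_i \cap B_j); $$
and because the sets $A_i \cap B_j$ are pairwise disjoint, these equations entail
$$ 1_{A_i} = \sum_{j=1}^{n} 1_{A_i \cap B_j} \qquad \text{and} \qquad 1_{B_j} = \sum_{i=1}^{m} 1_{A_i \cap B_j}, $$
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$, from which in turn
new normal representations
$$ u = \sum_{i,j} \alpha_i 1_{A_i \cap B_j}, \qquad v = \sum_{i,j} \beta_j 1_{A_i \cap B_j} \qquad \text{and} \qquad u + v = \sum_{i,j} (\alpha_i + \beta_j) 1_{A_i \cap B_j} $$
emerge. Using them to compute all the integrals,
$$ \int u \, d\mu = \sum_{i,j} \alpha_i \mu(A_i \cap B_j), \qquad \int v \, d\mu = \sum_{i,j} \beta_j \mu(A_i \cap B_j), $$
$$ \int (u + v) \, d\mu = \sum_{i,j} (\alpha_i + \beta_j) \mu(A_i \cap B_j), $$
makes clear the validity of (10.6).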


These deliberations have shown that every $u, v \in E$ admit normal representations
$$ u = \sum_{i=1}^{k} \gamma_i 1_{C_i} \qquad \text{and} \qquad v = \sum_{i=1}^{k} \delta_i 1_{C_i} $$
involving the same sets $C_1, \dots, C_k \in \mathscr{A}$. In case $u \le v$, it then follows that $\gamma_i \le \delta_i$
for each $i \in \{1, \dots, k\}$ such that $C_i \ne \emptyset$, and from this we have (10.7).

Now let $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be an arbitrary representation of an elementary function
$u \in E$ with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i \in \mathscr{A}$, but not necessarily a normal
representation. From (10.4)-(10.6) it follows that
$$ \int u \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i). $$
For normal representations this equation served as the definition of $\int u \, d\mu$. Its
validity without this restriction, which we now perceive, indicates that the introduction of normal representations was simply a technique of proof.
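
That the sum $\sum_i \alpha_i \mu(A_i)$ does not depend on the chosen representation can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the six-point space `OMEGA`, the point masses `MASS`, and the helper names `mu`, `is_normal`, `evaluate` and `integral` are hypothetical and do not come from the text.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0,...,5}, every subset measurable.
OMEGA = frozenset(range(6))
MASS = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2),
        3: Fraction(2), 4: Fraction(0), 5: Fraction(1, 4)}

def mu(a):
    """Measure of a subset of OMEGA: the sum of its point masses."""
    return sum((MASS[w] for w in a), Fraction(0))

def is_normal(rep):
    """True if the sets of the representation are pairwise disjoint and cover OMEGA."""
    sets = [a for _, a in rep]
    total = sum(len(a) for a in sets)
    return total == len(OMEGA) and frozenset().union(*sets) == OMEGA

def evaluate(rep, w):
    """Value at w of the function sum_i alpha_i * 1_{A_i}."""
    return sum((alpha for alpha, a in rep if w in a), Fraction(0))

def integral(rep):
    """The sum sum_i alpha_i * mu(A_i) attached to a representation."""
    return sum((alpha * mu(a) for alpha, a in rep), Fraction(0))

# Three representations of the same u (u = 3 on {0, 1, 2}, u = 0 elsewhere).
rep_a = [(Fraction(3), frozenset({0, 1, 2})), (Fraction(0), frozenset({3, 4, 5}))]
rep_b = [(Fraction(3), frozenset({0})), (Fraction(3), frozenset({1, 2})),
         (Fraction(0), frozenset({3})), (Fraction(0), frozenset({4, 5}))]
rep_c = [(Fraction(1), frozenset({0, 1, 2})), (Fraction(2), frozenset({0, 1, 2}))]
```

Here `rep_a` and `rep_b` are normal representations, as in Lemma 10.2, while `rep_c` is not normal (its sets overlap and do not cover `OMEGA`); all three yield the same sum.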
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space and $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that
for every $\mathscr{A}_0$-elementary function $u$ there are $\mathscr{A}$-elementary functions $u_1, u_2$ such
that $u_1 \le u \le u_2$ and $\mu(\{u_1 \ne u_2\}) = 0$. For every such pair, $\int u_1 \, d\mu = \int u_2 \, d\mu =
\int u \, d\mu_0$. (Cf. Exercise 7(d) in §5.)

2. The function $1_{\mathbb{Q}}$ on $\mathbb{R}$ has long been known as Dirichlet's jump function. Is it
a $\mathscr{B}^1$-elementary function?

11. The integral of non-negative measurable functions


Further progress hinges on the following result:

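11.1 Theorem. For every isotone sequence $(u_n)_{n \in \mathbb{N}}$ of functions from $E$ and
every $u \in E$
$$ u \le \sup_{n \in \mathbb{N}} u_n \implies \int u \, d\mu \le \sup_{n \in \mathbb{N}} \int u_n \, d\mu. \tag{11.1} $$

Proof. Choose a representation
$$ u = \sum_{j=1}^{m} \alpha_j 1_{A_j} $$
of $u$ with sets $A_j \in \mathscr{A}$ and coefficients $\alpha_j \in \mathbb{R}_+$, and let $\alpha$ be any number in $]0,1[$.
Then because of measurability the set
$$ B_n := \{u_n \ge \alpha u\} $$
lies in $\mathscr{A}$ for each $n \in \mathbb{N}$. From this definition follows on the one hand that $u_n \ge
\alpha u 1_{B_n}$ and consequently by (10.5) and (10.7)
$$ \int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu $$
for every $n \in \mathbb{N}$. Since the sequence $(u_n)$ is isotone and $u \le \sup u_n$, it follows on
the other hand that $B_n \uparrow \Omega$, and so $A_j \cap B_n \uparrow A_j$ for each $j \in \{1, \dots, m\}$ and
consequently, because $\mu$ is continuous from below,
$$ \int u \, d\mu = \sum_{j=1}^{m} \alpha_j \mu(A_j) = \lim_{n \to \infty} \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) = \lim_{n \to \infty} \int u 1_{B_n} \, d\mu. $$
Hence
$$ \sup_{n \in \mathbb{N}} \int u_n \, d\mu \ge \sup_{n \in \mathbb{N}} \alpha \int u 1_{B_n} \, d\mu = \alpha \lim_{n \to \infty} \int u 1_{B_n} \, d\mu = \alpha \int u \, d\mu, $$
where the first step follows from $\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$. Since $\alpha \in\, ]0,1[$ is arbitrary
here, the claim follows. $\square$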

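11.2 Corollary. For any isotone sequences $(u_n)_{n \in \mathbb{N}}$, $(v_n)_{n \in \mathbb{N}}$ of functions from $E$
$$ \sup_{n \in \mathbb{N}} u_n = \sup_{n \in \mathbb{N}} v_n \implies \sup_{n \in \mathbb{N}} \int u_n \, d\mu = \sup_{n \in \mathbb{N}} \int v_n \, d\mu. \tag{11.2} $$

Proof. For every $m \in \mathbb{N}$, $v_m \le \sup_n u_n$ and $u_m \le \sup_n v_n$, from which inequalities
and 11.1 follow
$$ \int v_m \, d\mu \le \sup_{n \in \mathbb{N}} \int u_n \, d\mu \qquad \text{and} \qquad \int u_m \, d\mu \le \sup_{n \in \mathbb{N}} \int v_n \, d\mu. $$
Claim (11.2) is immediate from the validity of these inequalities for all $m \in \mathbb{N}$. $\square$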
Now let
$$ E^* = E^*(\Omega, \mathscr{A}) \tag{11.3} $$
designate the set of all non-negative numerical functions $f$ on $\Omega$ for which an
isotone sequence $(u_n)$ of functions from $E$ can be found satisfying
$$ \sup_{n \in \mathbb{N}} u_n = f. $$
Then according to (11.2) the number
$$ \sup_{n \in \mathbb{N}} \int u_n \, d\mu \in \overline{\mathbb{R}}_+ $$
depends only on $f$ and not on the special representing sequence $(u_n)$ of $f$ used
to compute it. We're in a position similar to that of 10.3. Therefore we make the

11.3 Definition. Let $f$ be a function in $E^*$, represented as the upper envelope
$f = \sup u_n$ of an isotone sequence $(u_n)_{n \in \mathbb{N}}$ of elementary functions. Then the
number
$$ \int f \, d\mu := \sup_{n \in \mathbb{N}} \int u_n \, d\mu \in \overline{\mathbb{R}}_+, \tag{11.4} $$
shown above to be independent of the special representing sequence $(u_n)$, is called the
($\mu$-)integral of $f$ (over $\Omega$).

Evidently $E \subset E^*$, because every $u \in E$ satisfies $u = \sup u_n$ for the constant
sequence $u_n := u$. Moreover, using this sequence (as we may) in (11.4), we see
that in case $f = u \in E$, that definition of the integral coincides with the earlier
one. The mapping $f \mapsto \int f \, d\mu$ initially defined only on $E$ is thereby extended to
a mapping of $E^*$ into $\overline{\mathbb{R}}_+$. That in this extension process the known properties of
the integral persist, will now be confirmed.
The analogs of (10.2) and of (10.5)-(10.7) are
$$ f, g \in E^*,\ \alpha \in \mathbb{R}_+ \implies \alpha f,\ f + g,\ f \cdot g,\ f \vee g,\ f \wedge g \in E^*; \tag{11.5} $$
$$ \int (\alpha f) \, d\mu = \alpha \int f \, d\mu \qquad \text{for all } f \in E^*,\ \alpha \in \mathbb{R}_+; \tag{11.6} $$
$$ \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu \qquad \text{for all } f, g \in E^*; \tag{11.7} $$
$$ f \le g \implies \int f \, d\mu \le \int g \, d\mu \qquad \text{for all } f, g \in E^*. \tag{11.8} $$

Proof. From the definition of $E^*$ and from (10.2) follows (11.5). One only has
to note that $\sup_n u_n = \lim_n u_n$ for isotone sequences $(u_n)$. The earlier proofs carry
over almost verbatim to (11.6) and (11.7). We'll do (11.7) and leave (11.6) for the
reader: Let $f = \sup_n u_n$, $g = \sup_n v_n$ be representations of $f, g \in E^*$ by means of
isotone sequences of elementary functions. Then by definition
$$ \int f \, d\mu = \sup_n \int u_n \, d\mu, \qquad \int g \, d\mu = \sup_n \int v_n \, d\mu \qquad \text{and} \qquad \int (f + g) \, d\mu = \sup_n \int (u_n + v_n) \, d\mu. $$
From this and (10.6) we get (11.7), since due to isotoneity
$$ \sup_n \int (u_n + v_n) \, d\mu = \lim_{n \to \infty} \Bigl( \int u_n \, d\mu + \int v_n \, d\mu \Bigr) = \int f \, d\mu + \int g \, d\mu. $$
If in addition we assume that $f \le g$, then $u_m \le \sup_n v_n$ for every $m \in \mathbb{N}$.
(11.8) therefore follows from 11.1. $\square$

Properties (11.6)-(11.8) say that the integral is a positively-homogeneous, additive and isotone function on $E^*$.
Finally, it turns out that Theorem 11.1, which is so critical for our program, is
valid also in $E^*$. This is the content of a theorem which goes back to B. LEVI (1875-1961):

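11.4 Theorem (on monotone convergence). For every isotone sequence $(f_n)_{n \in \mathbb{N}}$
of functions from $E^*$
$$ \sup_{n \in \mathbb{N}} f_n \in E^* \qquad \text{and} \qquad \int \sup_{n \in \mathbb{N}} f_n \, d\mu = \sup_{n \in \mathbb{N}} \int f_n \, d\mu. $$

Proof. Set $f := \sup_n f_n$. It suffices to find an isotone sequence $(v_n)$ of functions
from $E$ which satisfy
$$ \sup_{n \in \mathbb{N}} v_n = f \qquad \text{and} \qquad v_n \le f_n \quad \text{for every } n \in \mathbb{N}. $$
For then $f \in E^*$ and $\int f \, d\mu = \sup_n \int v_n \, d\mu$ by definition of the integral in $E^*$, while
$\int v_n \, d\mu \le \int f_n \, d\mu$ by (11.8). Consequently, $\int f \, d\mu \le \sup_n \int f_n \, d\mu$ and therewith the
equality claimed by the theorem follows, since the other inequality $\sup_n \int f_n \, d\mu \le
\int f \, d\mu$ is immediate from (11.8) and the fact that $f_n \le f$ for all $n$.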

The sequence $(v_n)$ is gotten thus: For each $f_n$ there is by definition an isotone
sequence $(u_{mn})_{m \in \mathbb{N}}$ of functions from $E$ with $\sup_{m \in \mathbb{N}} u_{mn} = f_n$. According to (10.2)
the functions
$$ v_m := u_{m1} \vee \dots \vee u_{mm} $$
lie in $E$ (for each $m \in \mathbb{N}$). The isotoneity of each sequence $(u_{mn})_{m \in \mathbb{N}}$ clearly entails
that of the sequence $(v_m)_{m \in \mathbb{N}}$. From the isotoneity of $(f_n)_{n \in \mathbb{N}}$ follows $v_m \le f_m$
for all $m$, and thus $\sup_m v_m \le f$. For all $m \ge n$ we have $u_{mn} \le v_m$ and so
$$ \sup_{m \in \mathbb{N}} u_{mn} = f_n \le \sup_{m \in \mathbb{N}} v_m \qquad \text{for every } n \in \mathbb{N}. $$
Together with the preceding this gives finally $\sup_m v_m = f$. Therefore $(v_n)$ is a sequence
with the needed properties. $\square$

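11.5 Corollary. For every sequence $(f_n)_{n \in \mathbb{N}}$ of functions from $E^*$
$$ \sum_{n=1}^{\infty} f_n \in E^* \qquad \text{and} \qquad \int \Bigl( \sum_{n=1}^{\infty} f_n \Bigr) d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu. $$

Proof. Apply 11.4 to the sequence $(f_1 + \dots + f_n)_{n \in \mathbb{N}}$ and recall (11.7). $\square$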
In analogy with the device of writing $A_n \uparrow A$, $A_n \downarrow A$ for sets, introduced in §3,
we will from now on write
$$ f_n \uparrow f, \qquad f_n \downarrow f $$
for numerical functions $f, f_1, f_2, \dots$ on the set $\Omega$ to signal that $f_n(\omega) \uparrow f(\omega)$ for
every $\omega \in \Omega$, or $f_n(\omega) \downarrow f(\omega)$ for every $\omega \in \Omega$; that is, the notations mean $(f_n)$ is
an isotone sequence and $f$ is its upper envelope, or $(f_n)$ is an antitone sequence
and $f$ is its lower envelope. Obviously for a sequence $(A_n)$ of subsets of $\Omega$
$$ A_n \uparrow A \iff 1_{A_n} \uparrow 1_A \qquad \text{and} \qquad A_n \downarrow A \iff 1_{A_n} \downarrow 1_A. $$

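Examples. 1. Let $(\Omega, \mathscr{A})$ be an arbitrary measurable space and $\varepsilon_\omega$ the measure
defined on $\mathscr{A}$ by unit mass at the point $\omega \in \Omega$ (cf. Example 5 in §3). Then
$$ \int f \, d\varepsilon_\omega = f(\omega) $$
for every $f \in E^*$. Due to 11.3 we can at once assume that $f \in E$.
If, however, $f = \sum_i \alpha_i 1_{A_i}$ is a normal representation of $f$, then $\omega$ lies in exactly
one of the sets $A_i$, say in $A_{i_0}$. Then $\int f \, d\varepsilon_\omega = \sum_i \alpha_i \varepsilon_\omega(A_i) = \alpha_{i_0} = f(\omega)$.

2. Consider $\Omega := \mathbb{N}$ and $\mathscr{A} := \mathscr{P}(\mathbb{N})$. The $\sigma$-additivity requirement means that
a measure $\mu$ on $\mathscr{A}$ is uniquely defined whenever numbers $\alpha_n = \mu(\{n\}) \in \mathbb{R}_+$ are
specified for each $n \in \mathbb{N}$. $E^*$ consists of all numerical functions $f \ge 0$ on $\Omega$. Indeed,
one sets $f_n := f(n) 1_{\{n\}}$ for each $n \in \mathbb{N}$ and then $f_n \in E^*$, and in case $f(n) < +\infty$,
$f_n \in E$. Since
$$ f = \sum_{n=1}^{\infty} f_n, $$
it follows from 11.5 that $f \in E^*$ and
$$ \int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n. $$

3. Let $(\Omega, \mathscr{A})$ be a measurable space, $(\mu_n)_{n \in \mathbb{N}}$ a sequence of measures on $\mathscr{A}$ and
$\mu := \sum_{n=1}^{\infty} \mu_n$ (cf. Example 4 in §3). Then for every $f \in E^*$
$$ \int f \, d\mu = \sum_{n=1}^{\infty} \int f \, d\mu_n. $$
This is evidently true of indicator functions $f$, so the claimed equality holds for
all elementary functions. Transition to an arbitrary $f \in E^*$ is accomplished thus:
Let $(u_n)$ be a sequence in $E$ with $u_n \uparrow f$. Then the double sequence
$$ \alpha_{mn} := \sum_{i=1}^{n} \int u_m \, d\mu_i \qquad (m, n \in \mathbb{N}) $$
satisfies
$$ \sup_{m \in \mathbb{N}} \Bigl( \sup_{n \in \mathbb{N}} \alpha_{mn} \Bigr) = \sup_{n \in \mathbb{N}} \Bigl( \sup_{m \in \mathbb{N}} \alpha_{mn} \Bigr) \quad \Bigl( = \sup_{m, n \in \mathbb{N}} \alpha_{mn} \Bigr), $$
which confirms the assertion.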
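
Example 2 can be made concrete with a short computation. The following Python sketch is only an illustration: the weights $\alpha_n = 2^{-n}$, the function $f(n) = n$ and the helper names are assumptions chosen here, not data from the text. The partial sums are the integrals of the isotone sequence $f_1 + \dots + f_N$ and increase to $\int f \, d\mu = \sum_n n 2^{-n} = 2$.

```python
from fractions import Fraction

def alpha(n: int) -> Fraction:
    """Hypothetical point masses alpha_n = mu({n}) = 2^(-n)."""
    return Fraction(1, 2 ** n)

def f(n: int) -> Fraction:
    """Hypothetical non-negative function f(n) = n on Omega = N."""
    return Fraction(n)

def partial_integral(N: int) -> Fraction:
    """Integral of f_1 + ... + f_N, where f_n = f(n) * 1_{{n}}, i.e. sum_{n<=N} f(n) * alpha_n."""
    return sum((f(n) * alpha(n) for n in range(1, N + 1)), Fraction(0))

# Integrals of the isotone sequence f_1 + ... + f_N for N = 1, ..., 20.
partial = [partial_integral(N) for N in range(1, 21)]
```

Exact rational arithmetic is used so that the monotonicity of the partial integrals can be compared without rounding effects.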

Now that $E^*$ is seen as a natural generalization of $E$, we might ask for a more
workable characterization of it. A surprisingly simple one exists which brings us
back to the measurability concept in §9.

11.6 Theorem. $E^*$ is the set of non-negative, $\mathscr{A}$-measurable, numerical functions
on $\Omega$.

Proof. Every elementary function is measurable and so therefore is every function
in $E^*$, by 9.5. Suppose conversely that $f$ is a non-negative, measurable, numerical
function on $\Omega$. The sets
$$ A_{in} := \begin{cases} \{f \ge i 2^{-n}\} \cap \{f < (i+1) 2^{-n}\}, & i = 0, 1, \dots, n2^n - 1, \\ \{f \ge n\}, & i = n2^n \end{cases} $$
all lie in $\mathscr{A}$, and for each fixed $n \in \mathbb{N}$ the $n2^n + 1$ sets are a decomposition of $\Omega$.
Consequently, for each $n$
$$ u_n := \sum_{i=0}^{n2^n} i 2^{-n} 1_{A_{in}} $$
is a normal representation of a function in $E$. On the set $A_{in}$ the function $u_{n+1}$ can
take only the values $(2i)2^{-n-1}$ and $(2i+1)2^{-n-1}$ if $i \in \{0, \dots, n2^n - 1\}$, and only
values $\ge n$ when $i = n2^n$. Therefore the sequence $(u_n)$ is isotone. It satisfies
$\sup_n u_n = f$, because for any $\omega \in \Omega$ either $f(\omega) = +\infty$, in which case $u_n(\omega) = n$
for every $n$, or $f(\omega) < +\infty$, in which case $u_n(\omega) \le f(\omega) < u_n(\omega) + 2^{-n}$ for all
$n > f(\omega)$. Thus $f$ lies in $E^*$. $\square$
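
The dyadic construction in the proof of 11.6 is easy to evaluate pointwise. The following Python sketch is only an illustration: the function name `u` and the sample values are assumptions made here. It computes the value of $u_n$ at a point where $f$ takes a given value `fx`, so that the isotoneity and the error bound $2^{-n}$ can be observed.

```python
import math

def u(n: int, fx: float) -> float:
    """Value of the n-th elementary approximation at a point where f equals fx >= 0.

    Returns i * 2^(-n) when i * 2^(-n) <= fx < (i + 1) * 2^(-n) with i < n * 2^n,
    and n when fx >= n (this covers fx = +oo).
    """
    if fx >= n:
        return float(n)
    return math.floor(fx * 2 ** n) / 2 ** n

# Hypothetical sample values of f, including +oo.
samples = [0.0, 0.3, 1.0, 2.75, math.pi, 7.9, math.inf]

for fx in samples:
    seq = [u(n, fx) for n in range(1, 13)]
    # The sequence (u_n) is isotone at every point.
    assert all(a <= b for a, b in zip(seq, seq[1:]))
    for n in range(1, 13):
        if n > fx:
            # u_n <= f < u_n + 2^(-n) as soon as n > f
            assert u(n, fx) <= fx < u(n, fx) + 2.0 ** (-n)
```

Multiplication and division by powers of two are exact in binary floating point here, so the inequalities are checked without rounding error for these sample values.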


Example. 4. Let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra in $\Omega$ comprised of
all sets which are either countable or have countable complement (introduced
in Example 2 of §1). We claim that a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable
just if there is a countable set $A$ in the complement of which $f$ is constant. This
constant $\alpha(f)$ does not depend on the particular set $A$, because if $B$ is another
such, $\complement A \cap \complement B$, being the complement in uncountable $\Omega$ of the countable set $A \cup B$,
is not empty. That this condition really implies the $\mathscr{A}$-measurability of $f$ follows
from Theorem 9.1, because for every $\alpha \in \mathbb{R}$ either $\{f \ge \alpha\} \subset A$ or $\complement A \subset \{f \ge \alpha\}$.
In proving the converse we can, thanks to (9.8), assume that $f \ge 0$. The claim
is then true for elementary functions $f \in E(\Omega, \mathscr{A})$, because among finitely many
pairwise disjoint sets whose union is $\Omega$, exactly one has a countable complement.
For arbitrary $f \in E^*(\Omega, \mathscr{A})$ let $(u_n)$ be a sequence of elementary functions with
$u_n \uparrow f$. Each function $u_n$ is constantly $\alpha(u_n)$ in the complement of some countable
set $A_n$. But then $f(\omega)$ has the constant value $\sup_n \alpha(u_n)$ for all $\omega \in \bigcap_{n \in \mathbb{N}} \complement A_n =
\complement\bigl(\bigcup_{n \in \mathbb{N}} A_n\bigr)$. As the set $\bigcup_{n \in \mathbb{N}} A_n$ is countable, this proves that $f$ has the asserted
property and that moreover $\alpha(f) = \sup_n \alpha(u_n)$.

If now $\mu$ is the measure defined in Examples 2 and 7 of §3 which takes only the
values 0 and 1, then it follows from the preceding deliberations that
$$ \int f \, d\mu = \alpha(f) \qquad \text{for all } f \in E^*(\Omega, \mathscr{A}). $$

In closing we will use Theorem 11.6 to derive a factorization lemma, due to
J.L. Doob, which is interesting in its own right and quite important for its applications in probability theory.

11.7 Factorization lemma. Let $T : \Omega \to \Omega'$ be a mapping of a set $\Omega$ into a measurable space $(\Omega', \mathscr{A}')$ and $f : \Omega \to \overline{\mathbb{R}}$ a numerical function on $\Omega$. The function $f$
is measurable with respect to the $\sigma$-algebra $\sigma(T) = T^{-1}(\mathscr{A}')$ in $\Omega$ generated by $T$
if and only if there exists a measurable numerical function $g$ on $(\Omega', \mathscr{A}')$ such that
$$ f = g \circ T. \tag{11.9} $$
In case $f$ is $\sigma(T)$-measurable and real (resp., non-negative)-valued, then there is
such a $g$ which is real (resp., non-negative)-valued.

Proof. If $f$ has the form $f = g \circ T$ as specified, then it is the composite of
a $\sigma(T)$-$\mathscr{A}'$-measurable with an $\mathscr{A}'$-$\overline{\mathscr{B}}^1$-measurable mapping, making it $\sigma(T)$-$\overline{\mathscr{B}}^1$-measurable. For the proof of the converse we distinguish three cases:

1. Let $f = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be a $\sigma(T)$-elementary function; so $A_i \in \sigma(T)$ and $\alpha_i \in \mathbb{R}_+$ for
$i = 1, \dots, n$. For each $A_i$ there is a set $A_i' \in \mathscr{A}'$ with $A_i = T^{-1}(A_i')$, by definition
of $\sigma(T)$. Therefore the function $g := \sum_{i=1}^{n} \alpha_i 1_{A_i'}$ does what is wanted.

2. Let $f \ge 0$. According to Theorem 11.6 there is an isotone sequence $(u_n)_{n \in \mathbb{N}}$
of $\sigma(T)$-elementary functions with $f = \sup_n u_n$, and by the proof just given, there
are $\mathscr{A}'$-elementary functions $g_n$ such that $u_n = g_n \circ T$. The function $g := \sup_n g_n$
then does what is wanted in this case.

3. An arbitrary $\sigma(T)$-measurable $f : \Omega \to \overline{\mathbb{R}}$ decomposes into its positive part $f^+$
and its negative part $f^-$. From 2. we get $\mathscr{A}'$-measurable $g_0' \ge 0$ and $g_0'' \ge 0$ on $\Omega'$
for which $f^+ = g_0' \circ T$ and $f^- = g_0'' \circ T$. For $\omega'$ in the set $U' := \{g_0' = +\infty\} \cap \{g_0'' =
+\infty\}$ the difference $g_0'(\omega') - g_0''(\omega')$ is not defined. But the set $T(\Omega)$ is disjoint
from $U'$, because $g_0'(T(\omega)) = +\infty$ always entails that $g_0''(T(\omega)) = f^-(\omega) = 0$.
Therefore if we set
$$ g' := 1_{\complement U'} g_0' \qquad \text{and} \qquad g'' := 1_{\complement U'} g_0'', $$
then $g := g' - g''$ will do the desired job.

4. If $f$ is real, 3. supplies a numerical $\mathscr{A}'$-measurable function $g_0$ on $\Omega'$ such that
$f = g_0 \circ T$. If we set $U := \{|g_0| = +\infty\}$, then $U \cap T(\Omega) = \emptyset$ since $f$ takes only real
values, and so the real function $g := 1_{\complement U} g_0$ does what is wanted. $\square$

Remark. The restriction of $g$ to $T(\Omega)$ is uniquely determined by $f$ and (11.9).
Specifically, for each $\omega' \in T(\Omega)$, $g(\omega') = f(\omega)$ for every $\omega \in T^{-1}(\omega')$. On $T(\Omega)$
one therefore has no other choice than to set $g(T(\omega)) := f(\omega)$. In case $T(\Omega) \in \mathscr{A}'$,
in particular when $T(\Omega) = \Omega'$, the existence of $g$ can thus be secured without
recourse to 11.7; cf. Exercise 3 below.
The factorization lemma is therefore noteworthy only in so far as it allows the
measurability of $T(\Omega)$ to be dispensed with. And in doing that the special structure
of $(\overline{\mathbb{R}}, \overline{\mathscr{B}}^1)$ is critical. Remark 4 in §8 shows how we are sometimes forced to do
without the measurability of $T(\Omega)$.

Exercises.
1. Show that every bounded, $\mathscr{A}$-measurable, non-negative real-valued function
on a measurable space $(\Omega, \mathscr{A})$ is the uniform limit of an isotone sequence of $\mathscr{A}$-measurable elementary functions.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with a finite measure $\mu$. Further, let
$f, f_1, f_2, \dots$ be measurable numerical functions on $\Omega$. Prove the equivalence of
the two assertions:

(i) $\lim_{n \to \infty} \mu\Bigl( \bigcup_{m \ge n} \{f_m > f + \varepsilon\} \Bigr) = 0$ for every $\varepsilon > 0$;

(ii) for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ with $\mu(A_\delta) < \delta$ such that for every
$\varepsilon > 0$, $f_n(\omega) < f(\omega) + \varepsilon$ holds for all $\omega \in \complement A_\delta$ and all sufficiently large $n \in \mathbb{N}$.

[Hints: Note that (i) is also equivalent to the statement that for every $\varepsilon > 0$ and
$\delta > 0$ there exists an $A_{\delta,\varepsilon} \in \mathscr{A}$ with $\mu(A_{\delta,\varepsilon}) < \delta$ and an $N_{\delta,\varepsilon} \in \mathbb{N}$ such that
$f_n(\omega) < f(\omega) + \varepsilon$ for all $\omega \in \complement A_{\delta,\varepsilon}$ and $n \ge N_{\delta,\varepsilon}$.] Why does (i) hold, given the
sequence $(f_n)_{n \in \mathbb{N}}$, for every measurable function $f$ which satisfies $f \ge \limsup_{n \to \infty} f_n$?

3. With the hypotheses and notation of the factorization lemma, show that for
any $\omega_1, \omega_2 \in \Omega$ with $T(\omega_1) = T(\omega_2)$, and every $C \in \sigma(T)$, either $\omega_1, \omega_2 \in C$ or
$\omega_1, \omega_2 \in \complement C$. (That is, $\omega_1$ and $\omega_2$ cannot be "separated" by any set in $\sigma(T)$.)
From this fact infer that a $\sigma(T)$-measurable $f$ satisfies $f(\omega_1) = f(\omega_2)$ whenever
$T(\omega_1) = T(\omega_2)$. In case $T(\Omega) \in \mathscr{A}'$, deduce the existence of an $\mathscr{A}'$-measurable
mapping $g : \Omega' \to \overline{\mathbb{R}}$ with $f = g \circ T$. [Hint: Consider the system $\mathscr{C}$ of all $C \subset \Omega$
which have this two-point property and conclude that $\sigma(T) \subset \mathscr{C}$. Further, take
note of the equality $T(T^{-1}(A')) = A' \cap T(\Omega)$ for $A' \subset \Omega'$.]

12. Integrability
By now the integral $\int f \, d\mu$ is defined for all non-negative $\mathscr{A}$-measurable numerical
functions on $\Omega$, as a result of 11.4 and 11.6 together. In a third and final step $\int f \, d\mu$
will now be defined for certain numerical functions $f$ which are not of constant
sign.

According to Theorem 9.8, $f$ is measurable just if both its positive part $f^+$ and
its negative part $f^-$ are measurable. This remark prompts the following definition:

12.1 Definition. A numerical function $f$ on the measure space $(\Omega, \mathscr{A}, \mu)$ is called
($\mu$-)integrable if it is $\mathscr{A}$-measurable and the integrals $\int f^+ \, d\mu$, $\int f^- \, d\mu$ are real
numbers. Then
$$ \int f \, d\mu := \int f^+ \, d\mu - \int f^- \, d\mu \tag{12.1} $$
is called the ($\mu$-)integral of $f$ (over $\Omega$).
If for some reason one wants to put the variable $\omega \in \Omega$ into evidence, one also
writes
$$ \int f(\omega) \, \mu(d\omega) \qquad \text{or} \qquad \int f(\omega) \, d\mu(\omega). $$
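
On a finite set on which every subset is measurable, Definition 12.1 reduces to finite sums, which makes it easy to try out. The following Python sketch is only an illustration under that assumption: the point masses `MASS`, the sample function `f` and the helper names are hypothetical. The integral of a non-negative function is computed as $\sum_\omega g(\omega) \mu(\{\omega\})$, which is justified by (10.4)-(10.6).

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..3 with point masses mu({w}).
MASS = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2), 3: Fraction(3)}

def integral_nonneg(g):
    """Integral of a non-negative function g (dict w -> value): sum of g(w) * mu({w})."""
    assert all(v >= 0 for v in g.values())
    return sum((g[w] * MASS[w] for w in MASS), Fraction(0))

def integral(f):
    """Integral of a real function f via (12.1): integral of f^+ minus integral of f^-."""
    f_pos = {w: max(v, Fraction(0)) for w, v in f.items()}
    f_neg = {w: max(-v, Fraction(0)) for w, v in f.items()}
    return integral_nonneg(f_pos) - integral_nonneg(f_neg)

# Hypothetical real function of non-constant sign.
f = {0: Fraction(4), 1: Fraction(-1), 2: Fraction(6), 3: Fraction(-2)}
abs_f = {w: abs(v) for w, v in f.items()}
```

For this choice the positive part integrates to 7 and the negative part to 8, so the integral of `f` is $-1$, while the integral of its absolute value is 15.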

Remarks. 1. The right side of (12.1) is meaningful for measurable $f$ if at least
one of $f^+$, $f^-$ has a real integral. One says that then $f$ is quasi-integrable or that
the integral of $f$ exists and one uses (12.1) to define $\int f \, d\mu \in \overline{\mathbb{R}}$. Only occasionally
will we be concerned with this obvious generalization.
2. In the special case $\mu = \lambda^d$ we speak of Lebesgue integrable functions (on $\mathbb{R}^d$)
and of their Lebesgue integrals. If a Borel measure $\mu_F$ on $\mathbb{R}^d$ is described with the
help of a measure-generating function $F$ on $\mathbb{R}^d$ (cf. §6), the $\mu_F$-integrable functions $f$ on $\mathbb{R}^d$ are called Lebesgue-Stieltjes integrable (or Stieltjes integrable) with
respect to $F$. One speaks of its (Lebesgue-)Stieltjes integral and writes $\int f \, dF$
instead of $\int f \, d\mu_F$. The general theory of measure and integration has however
displaced this terminology and the notation $\int f \, dF$, despite their historical significance.

Let us now summarize the most important properties of the conceptual edifice
just built:

12.2 Theorem. Each of the following four statements is equivalent to the integrability of the measurable numerical function $f$ on $\Omega$:

(a) $f^+$ and $f^-$ are integrable.
(b) There are integrable functions $u \ge 0$, $v \ge 0$ such that $f = u - v$. (Note that the
last equality entails that $u(\omega) - v(\omega)$ is defined (in $\overline{\mathbb{R}}$) for every $\omega \in \Omega$.)
(c) There is an integrable function $g$ with $|f| \le g$.
(d) $|f|$ is integrable.

From (b) follows: $\int f \, d\mu = \int u \, d\mu - \int v \, d\mu$.

Proof. What has to be shown is the equivalence of (a) through (d), since (a) constitutes the definition of $f$ being integrable.
(a)$\Rightarrow$(b): According to (9.8), $u := f^+$ and $v := f^-$ do the job required in (b).
(b)$\Rightarrow$(c): Because the integral is additive on $E^*$, along with $u$ and $v$, $u + v$ is
also integrable. Since $f = u - v \le u \le u + v$ and $-f = v - u \le v \le u + v$, the
function $g := u + v$ is as required.
(c)$\Rightarrow$(d): This follows from the isotoneity of the integral on $E^*$ and the fact
that $|f| \in E^*$ (Theorems 11.6 and 9.8): $\int |f| \, d\mu \le \int g \, d\mu < +\infty$.
(d)$\Rightarrow$(a): Upon recalling that $f^+ \le |f|$ and $f^- \le |f|$, this too follows from the
isotoneity of the integral on $E^*$.
In (b), $f = u - v = f^+ - f^-$ and so $u + f^- = v + f^+$, which via (11.7)
yields $\int u \, d\mu + \int f^- \, d\mu = \int v \, d\mu + \int f^+ \, d\mu$ and therewith the last assertion of the
theorem, since all the integrals here are finite. $\square$

12.3 Theorem. Let $f$ and $g$ be integrable numerical functions on $\Omega$, $\alpha \in \mathbb{R}$. Then
the functions $\alpha f$ and, if it is everywhere defined on $\Omega$, $f + g$ are integrable, and
satisfy
$$ \int (\alpha f) \, d\mu = \alpha \int f \, d\mu \qquad \text{and} \qquad \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu. \tag{12.2} $$
Furthermore, the functions
$$ f \vee g \qquad \text{and} \qquad f \wedge g $$
are integrable.

Proof. The claims regarding $\alpha f$ follow from (11.6), since
$$ (\alpha f)^+ = \alpha f^+, \qquad (\alpha f)^- = \alpha f^- \qquad \text{if } \alpha \ge 0, $$
and
$$ (\alpha f)^+ = |\alpha| f^-, \qquad (\alpha f)^- = |\alpha| f^+ \qquad \text{if } \alpha < 0. $$
Regarding $f + g$, we argue as follows: from $f = f^+ - f^-$ and $g = g^+ - g^-$ follows
$f + g = f^+ + g^+ - (f^- + g^-)$. (11.7) insures that $u := f^+ + g^+$ and $v := f^- + g^-$
are integrable. Then the claims about $f + g$ follow from the equality $f + g = u - v$
via 12.2. Finally, $|f \vee g| \le |f| + |g|$ and $|f \wedge g| \le |f| + |g|$, and we know that
$|f| + |g|$ is integrable. The integrability of the measurable functions $f \vee g$ and $f \wedge g$
follows then from these inequalities and part (c) of 12.2. $\square$

12.4 Theorem. For any integrable numerical functions $f$ and $g$ on $\Omega$
$$ f \le g \implies \int f \, d\mu \le \int g \, d\mu; \tag{12.3} $$
$$ \Bigl| \int f \, d\mu \Bigr| \le \int |f| \, d\mu. \tag{12.4} $$

Proof. From $f \le g$ follows $f^+ \le g^+$ and $f^- \ge g^-$, and from these inequalities and
the isotoneity of the integral on $E^*$ follows (12.3). Because $f \le |f|$ and $-f \le |f|$,
(12.4) follows from the first equality in (12.2) and from (12.3), with $|f|$ in the role
of $g$ there. $\square$

The relevant properties are particularly clearly perceptible when we consider
only real-valued integrable functions. To aid in that we define
$$ \mathscr{L}^1(\mu) := \text{the set of all } \mu\text{-integrable real functions on } \Omega. \tag{12.5} $$
Using this widespread notation it follows immediately from Theorem 12.3 and
from (12.3) that: With respect to the operations
$$ (f + g)(\omega) := f(\omega) + g(\omega) \qquad \text{and} \qquad (\alpha f)(\omega) := \alpha f(\omega) \qquad (\omega \in \Omega) $$
of pointwise addition and multiplication by scalars $\alpha \in \mathbb{R}$, $\mathscr{L}^1(\mu)$ is a vector space
over $\mathbb{R}$, and on it $f \mapsto \int f \, d\mu$ is an (isotone) linear form.


Examples. 1. Let $(\Omega, \mathscr{A})$ be any measurable space, $\varepsilon_\omega$ the measure on $\mathscr{A}$ defined
by unit mass at $\omega \in \Omega$. According to Example 1 of §11, the $\varepsilon_\omega$-integrable functions
are just the $\mathscr{A}$-measurable numerical functions $f$ on $\Omega$ with $|f(\omega)| < +\infty$. For them
$$ \int f \, d\varepsilon_\omega = f(\omega). $$

2. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Example 2 of §11, $\mu(\{n\}) = \alpha_n$
for $n \in \mathbb{N}$. From what was shown there it follows that the $\mu$-integrable functions
$f : \Omega \to \overline{\mathbb{R}}$ are precisely those for which
$$ \sum_{n=1}^{\infty} |f(n)| \alpha_n < +\infty, $$
and for such an $f$
$$ \int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n. $$

3. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Examples 2 and 1 of §3. A function $f : \Omega \to \mathbb{R}$ is then $\mu$-integrable if and only if it is equal to a real constant $\alpha$
throughout the complement of some countable subset of $\Omega$. From Example 4 of §11
we have $\int f \, d\mu = \alpha$ for such an $f$.

4. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$. Then every constant real
function, and consequently after 12.2, every bounded, measurable real function
on $\Omega$ is $\mu$-integrable.

5. Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. A numerical function $f$ on $\Omega$
is $(\mu + \nu)$-integrable if and only if it is both $\mu$- and $\nu$-integrable, and in this case
$$ \int f \, d(\mu + \nu) = \int f \, d\mu + \int f \, d\nu. $$
In fact: For every non-negative $\mathscr{A}$-measurable function $g$ on $\Omega$, $\int g \, d(\mu + \nu) =
\int g \, d\mu + \int g \, d\nu$ holds by Example 3 in §11. Applied to $g := |f|$ this and 12.2 prove
the integrability claim, and applied to $g := f^+$ and $g := f^-$ it implies the claimed
equality. In particular
$$ \mathscr{L}^1(\mu + \nu) = \mathscr{L}^1(\mu) \cap \mathscr{L}^1(\nu) $$
is valid.

We can now free ourselves of the restriction that functions always be integrated
over the whole of $\Omega$. (11.5) insures that along with any pair of functions from $E^* =
E^*(\Omega, \mathscr{A})$ their product is also in $E^*$. So from $f \in E^*$ and $A \in \mathscr{A}$ follows $1_A f \in E^*$.
If $f$ is an integrable numerical function on $\Omega$, then so is $1_A f$, for every $A \in \mathscr{A}$:
Because of the trivial inequality $|1_A f| \le |f|$, this is immediate from 12.2 (and 9.4).
In the light of this the following seems natural:

12.4 Definition. If $f$ is a numerical function which is defined on $\Omega$ and either
belongs to $E^*$ or is $\mu$-integrable, we set
$$ \int_A f \, d\mu := \int 1_A f \, d\mu \tag{12.6} $$
for every $A \in \mathscr{A}$ and call it the $\mu$-integral of $f$ over $A$.

As a special case of this notation
$$ \int_\Omega f \, d\mu = \int f \, d\mu. \tag{12.7} $$

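The following rules of calculation are evident, for all $f, g$ which either lie in $E^*$
or are integrable:
$$ \int_{A \cup B} f \, d\mu + \int_{A \cap B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu \qquad \text{for all } A, B \in \mathscr{A} \tag{12.8} $$
and, as a special case,
$$ \int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu \qquad \text{for all disjoint } A, B \in \mathscr{A}; \tag{12.8'} $$
$$ f \le g \implies \int_A f \, d\mu \le \int_A g \, d\mu \qquad \text{for all } A \in \mathscr{A}. \tag{12.9} $$
One merely has to reflect on the definitions involved. Moreover, pursuant to the
discussion after (12.5),
$$ f \mapsto \int_A f \, d\mu \quad \text{is a linear form on } \mathscr{L}^1(\mu), \tag{12.10} $$
for each $A \in \mathscr{A}$.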
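
Rules (12.8) and (12.9) can be tried out on a finite example. The following Python sketch is only an illustration: the point masses `MASS`, the functions `f` and `g`, the sets `A` and `B` and the helper name `integral_over` are assumptions made here. On a finite set with all subsets measurable, the integral of $1_A f$ is computed as a finite sum over the points of $A$.

```python
from fractions import Fraction

# Hypothetical finite measure space on the points 0..5 with point masses mu({w}).
MASS = {w: Fraction(w + 1, 2) for w in range(6)}

# Hypothetical real functions with f <= g everywhere.
f = {w: Fraction(w * w - 4) for w in range(6)}
g = {w: f[w] + 1 for w in range(6)}

def integral_over(a, h):
    """Integral of h over the set a, i.e. the integral of 1_a * h, cf. (12.6)."""
    return sum((h[w] * MASS[w] for w in MASS if w in a), Fraction(0))

A = {0, 1, 2, 3}
B = {2, 3, 4}

# (12.8): the integrals over the union and the intersection add up as stated.
lhs = integral_over(A | B, f) + integral_over(A & B, f)
rhs = integral_over(A, f) + integral_over(B, f)
```

With these values both sides of (12.8) equal 45, and replacing `f` by the larger function `g` can only increase the integral over `A`, in line with (12.9).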

But we can get at integrals over sets in $\mathscr{A}$ in a different way, namely by considering the restriction $\mu_A$ of the given measure $\mu$ to the trace $\sigma$-algebra $A \cap \mathscr{A}$.
That one is thereby led to the same result is the content of

12.5 Lemma. Let $A \in \mathscr{A}$ and for every function $f$ on $\Omega$ which either lies in $E^*$
or is $\mu$-integrable let $f'$ denote the restriction of $f$ to $A$, and $\mu_A$ the restriction
of $\mu$ to $A \cap \mathscr{A}$. Then
$$ \int f' \, d\mu_A = \int_A f \, d\mu. \tag{12.11} $$

Proof. First consider $f \in E^*(\Omega, \mathscr{A})$. Then $f' \in E^*(A, A \cap \mathscr{A})$ since
$$ (f')^{-1}(B) = A \cap f^{-1}(B) $$
holds for all Borel sets $B$ in $\overline{\mathbb{R}}$ (cf. 11.6). For the function $1_A f \in E^*$ there is
a sequence $(u_n)$ of $\mathscr{A}$-elementary functions satisfying $u_n \uparrow 1_A f$. The sequence $(u_n')$
of restrictions to $A$ obviously consists of $A \cap \mathscr{A}$-elementary functions that satisfy
$u_n' \uparrow f'$, from all of which follows that
$$ \int_A f \, d\mu = \sup_{n \in \mathbb{N}} \int u_n \, d\mu \qquad \text{and} \qquad \int f' \, d\mu_A = \sup_{n \in \mathbb{N}} \int u_n' \, d\mu_A. \tag{12.12} $$
Since $0 \le u_n \le 1_A f$, $u_n = 0$ in $\complement A$, so $u_n = 1_A u_n$ and consequently
$$ u_n = \sum_{i=1}^{k_n} \alpha_i 1_{A_i} $$
for appropriate (depending on $n$) sets $A_i \in \mathscr{A}$ which are all subsets of $A$, and
appropriate (also $n$-dependent) real coefficients $\alpha_i \ge 0$, and $k_n \in \mathbb{N}$. It follows
that
$$ u_n' = \sum_{i=1}^{k_n} \alpha_i 1_{A_i}'. $$
(Notice that for $Q \subset A$, the restriction $1_Q'$ coincides with the indicator function
with respect to $A$ of $Q$.) From the last two equalities we see that
$$ \int u_n \, d\mu = \int u_n' \, d\mu_A \qquad \text{for all } n \in \mathbb{N}, $$
because each integral equals $\sum_{i=1}^{k_n} \alpha_i \mu(A_i)$, and from these equalities and (12.12)
follows (12.11) for $f \in E^*(\Omega, \mathscr{A})$.

If $f$ is $\mu$-integrable the preceding can be applied to both $f^+$ and $f^-$. All integrals
are finite and it is obvious that $(f')^+ = (f^+)'$, $(f')^- = (f^-)'$, so (12.11) follows
from linearity of the integral. $\square$

In a final step of generalization let us note that we can conversely proceed from an $A \in \mathscr{A}$ and a function $f$ on $A$ which either lies in $E^*(A, A \cap \mathscr{A})$ or is $\mu_A$-integrable to define the $\mu$-integral of $f$ over $A$ via

$$\int_A f\,d\mu := \int f\,d\mu_A \tag{12.13}$$

and in the second case to say that $f$ is also $\mu$-integrable over $A$. With the aid of Lemma 12.5 we thereby get:

12.6 Corollary. A numerical function $f$ defined on a set $A \in \mathscr{A}$ is $\mu$-integrable over $A$ if the function defined on the whole of $\Omega$ by

$$f_A(\omega) := \begin{cases} f(\omega) & \text{if } \omega \in A \\ 0 & \text{if } \omega \in \Omega \setminus A \end{cases}$$

is $\mu$-integrable. In this case

$$\int_A f\,d\mu = \int f_A\,d\mu = \int f\,d\mu_A.$$

From this discussion we see that a $\mu$-integral over a set $A \in \mathscr{A}$ is nothing other than a $\mu_A$-integral over the new base space $A$. It can also be thought of as a $\mu$-integral over $\Omega$ employing the integrands $f_A$.
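The two descriptions of an integral over a set can be compared numerically. The following is only a minimal sketch, assuming a finite base set with point masses; the names `mu`, `A`, `f` and `integrate` are illustrative and do not come from the text.

```python
from fractions import Fraction

def integrate(f, weights):
    """Integral of f against the discrete measure with point masses `weights`."""
    return sum(f(w) * m for w, m in weights.items())

# Hypothetical finite measure space: Omega = {0,...,5}, mu({w}) = 1/(w+1).
mu = {w: Fraction(1, w + 1) for w in range(6)}
A = {1, 3, 4}

def f(w):
    """An arbitrary real function on Omega."""
    return w * w - 2

# First description: integrate f_A (f on A, zero elsewhere) over all of Omega.
int_over_omega = integrate(lambda w: f(w) if w in A else 0, mu)

# Second description: restrict mu to A and integrate f over the new base space A.
mu_A = {w: m for w, m in mu.items() if w in A}
int_over_A = integrate(f, mu_A)

print(int_over_omega, int_over_A)
```

Exact rational arithmetic is used so that the two values can be compared with `==` rather than with a tolerance.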

Exercises.
1. Characterize the functions $u \in E(\Omega, \mathscr{A})$ which are $\mu$-integrable.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The indicator function $1_A$ of a set $A \in \mathscr{A}$ is $\mu$-integrable just when $\mu(A) < +\infty$. Such sets are called $\mu$-integrable, and $\mathscr{R}$ will denote their totality. Show that $\mathscr{R}$ is an ideal in the ring $\mathscr{A}$ (cf. Exercise 4 in §1); in particular, $R \in \mathscr{R}$ and $A \in \mathscr{A}$ imply $A \cap R \in \mathscr{R}$. For a $\sigma$-finite measure $\mu$ a converse also holds: $A \subset \Omega$ and $A \cap R \in \mathscr{R}$ for all $R \in \mathscr{R}$ implies $A \in \mathscr{A}$.

3. Is the Dirichlet jump function from Exercise 2, §10 $\lambda^1$-integrable?

4. Consider the measurable space $(\Omega, \mathscr{A})$ from Example 4 of §11. On it is defined the measure $\mu$ which assigns countable sets the value 0, uncountable sets (from $\mathscr{A}$) the value $+\infty$. Determine all the $\mu$-integrable functions and their integrals.

5. Let $(\Omega, \mathscr{A}, \mu)$ be any measure space, $(A_n)$ a sequence of pairwise disjoint sets from $\mathscr{A}$, $A$ their union, $f$ a numerical function on $A$. Show that $f$ is $\mu$-integrable over $A$ if and only if it is $\mu$-integrable over each $A_n$ and $\sum_{n=1}^{\infty} \int_{A_n} |f|\,d\mu < +\infty$.

6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu$ finite. Show that every real function $f$ on $\Omega$ which is the uniform limit of a sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ itself belongs to $\mathscr{L}^1(\mu)$. Why does this conclusion fail for every non-finite $\mu$ which is $\sigma$-finite? (Hint: Construct a sequence $(g_n)$ in $\mathscr{L}^1(\mu)$ with $0 \le g_n \le 1$ and $\int g_n\,d\mu \ge n^2$ for each $n \in \mathbb{N}$ and then consider $f_n := \sum_{j=1}^{n} j^{-2} g_j$.)

13. Almost everywhere prevailing properties


For the further construction of the theory the concept of a negligible set, already frequently mentioned in Chapter I, will now play an important role. We recall: $N \subset \Omega$ is called a ($\mu$-)nullset if $N \in \mathscr{A}$ and $\mu(N) = 0$. The union of every sequence of $\mu$-nullsets is again one (3.10), as is every set in $\mathscr{A}$ which is contained in a $\mu$-nullset, thanks to isotoneity (cf. Exercise 5 in §3).

13.1 Definition. Let $\eta$ be a property of points in $\Omega$: every $\omega \in \Omega$ either enjoys property $\eta$ or does not. We say that "($\mu$-)almost all points of $\Omega$ have property $\eta$" or "$\eta$ prevails ($\mu$-)almost everywhere in $\Omega$" if there is a $\mu$-nullset $N$ such that all points of $\complement N$ enjoy property $\eta$.

Be careful: It is not required that the set $N_\eta$ of all $\omega \in \Omega$ which do not enjoy property $\eta$ be a $\mu$-nullset. Indeed, generally $N_\eta$ may not belong to $\mathscr{A}$. For example, if $A$ is a subset of $\Omega$ which does not belong to $\mathscr{A}$ and $\eta$ is the property "$\omega$ is a point of $A$", then $N_\eta = \complement A$ is not in $\mathscr{A}$.

Examples of properties $\eta$ which will come up in the sequel are: Equality of the values at a point $\omega \in \Omega$ of two functions $f$ and $g$ which are defined on $\Omega$, finiteness of the value at $\omega \in \Omega$ of a function $f$, etc. Corresponding to these we have the following modes of speaking: $f$ and $g$ are ($\mu$-)almost everywhere equal on $\Omega$, in symbols

$$f = g \qquad (\mu\text{-)almost everywhere;}$$

$f$ is ($\mu$-)almost everywhere finite, in symbols

$$|f| < +\infty \qquad (\mu\text{-)almost everywhere;}$$

$f$ is ($\mu$-)almost everywhere bounded, meaning that for some $\alpha \in \mathbb{R}$

$$|f| \le \alpha \qquad (\mu\text{-)almost everywhere,}$$

etc.

The theorems that follow explicate the significance that this new concept has
for integration theory:

13.2 Theorem. For every $f \in E^*(\Omega, \mathscr{A})$, that is, (cf. 11.6) for every $\mathscr{A}$-measurable, non-negative numerical function $f$

$$\int f\,d\mu = 0 \iff f = 0 \quad \mu\text{-almost everywhere.}$$

Proof. Since $f$ is measurable, the set

$$N := \{f \ne 0\} = \{f > 0\}$$

lies in $\mathscr{A}$. What has to be shown is that

$$\int f\,d\mu = 0 \iff \mu(N) = 0.$$

Suppose $\int f\,d\mu = 0$. For each $n \in \mathbb{N}$ the set $A_n := \{f \ge n^{-1}\}$ also lies in $\mathscr{A}$ and $A_n \uparrow N$, so that $\mu(N) = \lim \mu(A_n)$ and it is enough to show that $\mu(A_n) = 0$ for every $n$. But obviously $f \ge n^{-1} 1_{A_n}$, entailing that $0 = \int f\,d\mu \ge n^{-1}\mu(A_n) \ge 0$, that is, $\mu(A_n) = 0$, as wanted.

Suppose conversely that $\mu(N) = 0$. Each of the functions $u_n := n 1_N$ ($n \in \mathbb{N}$) lies in $E(\Omega, \mathscr{A})$ and satisfies $\int u_n\,d\mu = 0$. Setting $g := \sup_n u_n$ gives a function $g \in E^*(\Omega, \mathscr{A})$ such that $u_n \uparrow g$, so $\int g\,d\mu = \sup_n \int u_n\,d\mu = 0$. Finally, since evidently $f \le g$, $0 \le \int f\,d\mu \le \int g\,d\mu = 0$ gives the desired equality $\int f\,d\mu = 0$. □
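On a finite set with point masses the equivalence in Theorem 13.2 can be checked directly. This is only an illustrative sketch; the measure `mu`, the functions `f` and `g`, and both helper names are assumptions made for the example.

```python
def integrate(f, mu):
    """Integral of a non-negative function against point masses `mu`."""
    return sum(f[w] * m for w, m in mu.items())

def measure_of_support(f, mu):
    """mu({f != 0}), the measure of the set N from the proof."""
    return sum(m for w, m in mu.items() if f[w] != 0)

# Hypothetical measure with two null points "a" and "c".
mu = {"a": 0.0, "b": 2.0, "c": 0.0, "d": 0.5}

f = {"a": 3.0, "b": 0.0, "c": 7.0, "d": 0.0}   # nonzero only on null points
g = {"a": 3.0, "b": 0.0, "c": 7.0, "d": 0.1}   # nonzero on a set of positive measure

for name, func in (("f", f), ("g", g)):
    print(name, integrate(func, mu), measure_of_support(func, mu))
```

The function `f` is far from zero pointwise, yet its integral vanishes because `{f != 0}` is a nullset; for `g` both quantities are positive.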


13.3 Corollary. Every $\mathscr{A}$-measurable numerical function $f$ on $\Omega$ is integrable over every $\mu$-nullset $N$, and

$$\int_N f\,d\mu = 0.$$

Proof. If $f \ge 0$, this claim follows from the theorem, because each function $1_N f$ lies in $E^*(\Omega, \mathscr{A})$ and is almost everywhere 0. In turn, application of this to $f^+$ and $f^-$ delivers the full claim. □


13.4 Theorem. Let $f, g$ be $\mathscr{A}$-measurable numerical functions on $\Omega$ which are $\mu$-almost everywhere equal on $\Omega$. Then

(a) $f \ge 0,\ g \ge 0 \implies \int f\,d\mu = \int g\,d\mu$;

(b) $f$ integrable $\implies$ $g$ integrable and $\int f\,d\mu = \int g\,d\mu$.

Proof. (a): By hypothesis (and 9.3) $N := \{f \ne g\}$ is a $\mu$-nullset. From 13.3 then

$$\int_N f\,d\mu = \int_N g\,d\mu = 0.$$

On the other hand, for $M := \complement N$ we have $1_M f = 1_M g$ due to the definition of $N$, and so by (12.6)

$$\int_M f\,d\mu = \int_M g\,d\mu.$$

Adding integrals and using (12.8') leads to the conclusion in (a).

(b): The almost everywhere equality hypothesis entails that

$$f^+ = g^+ \text{ almost everywhere} \quad\text{and}\quad f^- = g^- \text{ almost everywhere.}$$

From (a) then

$$\int f^+\,d\mu = \int g^+\,d\mu \quad\text{and}\quad \int f^-\,d\mu = \int g^-\,d\mu.$$

Because $f$ is integrable, what we have here are non-negative real numbers, showing that $g$ is integrable (part (a) of 12.2) and, upon subtracting the second equality from the first, we get the equality claimed in (b). □

Since, roughly speaking, all this shows that integrability and the integral of
a function are insensitive to (measurable) changes of the function on nullsets,
results proved earlier can easily be reformulated somewhat more sharply. For example:

13.5 Corollary. Let the $\mathscr{A}$-measurable numerical functions $f$ and $g$ on $\Omega$ satisfy $|f| \le g$ $\mu$-almost everywhere. Then along with $g$, the function $f$ will also be $\mu$-integrable.

Proof. If we set $g' := g \vee |f|$, then $g'$ is measurable, $g' = g$ almost everywhere and $|f| \le g'$. From 13.4 part (b) we see first of all that $g'$ is integrable, and then from 12.2 $f$ is as well. □

Of special importance is the realization that integrability imposes limitations on how often a function can assume the values $\pm\infty$, or indeed any non-zero value. This is made precise in

13.6 Theorem. Every $\mu$-integrable numerical function $f$ on $\Omega$ is $\mu$-almost everywhere real-valued. Moreover, the set $\{f \ne 0\}$ is of $\sigma$-finite measure.

Here a set $A \in \mathscr{A}$ is said to possess a $\sigma$-finite measure if it is the union of a sequence of sets in $\mathscr{A}$ each of which has finite measure. This means nothing other than that the restriction of $\mu$ to $A \cap \mathscr{A}$ is a $\sigma$-finite measure.

Proof. The set $N := \{|f| = +\infty\}$ lies in $\mathscr{A}$ and for every real $\alpha > 0$ satisfies $\alpha 1_N \le |f|$. Consequently, $\alpha\mu(N) \le \int |f|\,d\mu < +\infty$ for every such $\alpha$, from which follows the first claim, $\mu(N) = 0$. To prove the second claim we pass over to $|f|$ and thereby assume that $f \ge 0$. Then

$$\{f \ne 0\} = \{f > 0\} = \bigcup_{n \in \mathbb{N}} \{f \ge n^{-1}\}.$$

Every set $A_n := \{f \ge n^{-1}\} = \{nf \ge 1\}$ satisfies $1_{A_n} \le nf$ and therewith

$$\mu(A_n) \le n \int f\,d\mu < +\infty.$$

This holds for all $n \in \mathbb{N}$, confirming the $\sigma$-finiteness claim. □
Theorem 13.6 has yet another consequence: Let $N$ be a $\mu$-nullset and $f$ a numerical function which is defined on $M := \complement N$ and is $M \cap \mathscr{A}$-measurable. Such a function is described as being a ($\mu$-)almost everywhere defined ($\mathscr{A}$-)measurable function. The function $f_M$ introduced in 12.6 extends it to an $\mathscr{A}$-measurable function on $\Omega$. Any other extension of $f$ to $\Omega$ must agree with $f_M$ almost everywhere. According to 13.4 therefore either every such extension is integrable or none is. In the first case moreover all extensions have the same $\mu$-integral. These observations justify the following definition:

13.7 Definition. Let $f$ be a $\mu$-almost everywhere defined, $\mathscr{A}$-measurable numerical function on $\Omega$. It will be called ($\mu$-)integrable if it can be extended to a ($\mu$-)integrable function $f'$ defined on the whole of $\Omega$. $\int f'\,d\mu$ will then be called the ($\mu$-)integral of $f$ and denoted $\int f\,d\mu$.

We will only occasionally be concerned with this extension of the integral concept, but its utility is already shown by the following

Remark. Suppose $f$ and $g$ are integrable numerical functions on $\Omega$. According to 13.6 each is almost everywhere finite. Because the union of two nullsets is itself a nullset, there is a nullset $N$ such that both $|f(\omega)| < +\infty$ and $|g(\omega)| < +\infty$ for all $\omega \in \complement N$. But then

$$\omega \mapsto f(\omega) + g(\omega) \qquad (\omega \in \complement N)$$

is an almost everywhere defined measurable function. This fact, in conjunction with what was shown above, shows that the explicit hypothesis made in 12.3 that $f + g$ be everywhere defined is of little significance. For two integrable numerical functions $f$ and $g$ on $\Omega$ the sum $f + g$ is almost everywhere defined, and in the sense of 13.7 integrable. The equality

$$\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$$

prevails unrestrictedly.

Exercises.
1. The numerical functions $f$ and $g$ on the measure space $(\Omega, \mathscr{A}, \mu)$ satisfy $f = g$ $\mu$-almost everywhere. Show via an example that in general the $\mathscr{A}$-measurability of $g$ does not follow from that of $f$. Show however that in case $(\Omega, \mathscr{A}, \mu)$ is complete, the $\mathscr{A}$-measurability of $g$ is equivalent to that of $f$.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that $f : \Omega \to \overline{\mathbb{R}}$ is $\mathscr{A}_0$-measurable just if $\mathscr{A}$-measurable numerical functions $f_1, f_2$ on $\Omega$ exist with the properties $f_1 \le f \le f_2$ everywhere in $\Omega$ and $f_1 = f_2$ $\mu$-almost everywhere. If $f$ is $\mu_0$-integrable, then any functions $f_1, f_2$ with these properties are $\mu$-integrable, and $\int f_1\,d\mu = \int f_2\,d\mu = \int f\,d\mu_0$. (This supplements Exercise 7 in §5 and generalizes Exercise 1 in §10.)

3. Even if the $f$ in the preceding exercise is real-valued, the functions $f_1, f_2$ which were proved to exist there cannot always be chosen to be real-valued. Prove this for the case where $\Omega$ is any infinite set, $\mathscr{A} := \{\emptyset, \Omega\}$ and $\mu := 0$.

14. The spaces $\mathscr{L}^p(\mu)$


According to 9.4 the product of two measurable functions is again measurable. By
contrast however the product of two integrable functions is not generally integrable,
as the next example shows:

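Example. $(\Omega, \mathscr{A}, \mu)$ is the measure space described in Example 2 of §12 and Example 2 of §11, with $\alpha_n := n^{-p-1}$ for each $n \in \mathbb{N}$, where $1 < p < +\infty$. The identity function, $f(n) := n$ for all $n \in \mathbb{N}$, is integrable, since $\int f\,d\mu = \sum_n n^{-p} < +\infty$, but its $p$-th power is not, since $\int f^p\,d\mu = \sum_n n^{-1} = +\infty$. Thus for $p = 2$, $f^2 = f \cdot f$ is not integrable.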
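The behaviour of the two series in this example can be watched on partial sums. The sketch below assumes the measure on the natural numbers with point masses $n^{-p-1}$ and takes $p = 2$; the helper name `partial_integral` is hypothetical, and a finite partial sum can of course only suggest, not prove, convergence or divergence.

```python
import math

def partial_integral(power, p, N):
    """Sum over n <= N of n**power * alpha_n, where alpha_n = n**(-p-1).

    power=1 gives a partial sum of the integral of f(n) = n,
    power=p gives a partial sum of the integral of f**p.
    """
    return sum(n ** power * n ** (-p - 1) for n in range(1, N + 1))

p = 2
for N in (10, 1_000, 100_000):
    integral_f = partial_integral(1, p, N)    # sum of n**-2, stays below pi**2/6
    integral_fp = partial_integral(p, p, N)   # harmonic sum, grows like log(N)
    print(N, round(integral_f, 4), round(integral_fp, 4))

bound = math.pi ** 2 / 6
```

The first column of values stays below `bound`, while the second keeps growing roughly like the logarithm of `N`.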

This observation suggests the investigation of those measurable functions $f$ on $\Omega$ for which $|f|^p$ is integrable.

In what follows $p$ will designate a real number, $p \ge 1$. For every $\mathscr{A}$-measurable function $f$ on $\Omega$, $|f|$ and then also $|f|^p$ is measurable, because (adopting the usual convention that $(+\infty)^p := +\infty$) for every real $\alpha$

$$\{|f|^p \ge \alpha\} = \begin{cases} \Omega & \text{if } \alpha \le 0 \\ \{|f| \ge \alpha^{1/p}\} & \text{if } \alpha > 0. \end{cases}$$

For such an $f$

$$N_p(f) := \Big( \int |f|^p\,d\mu \Big)^{1/p} \tag{14.1}$$

is therefore defined. It satisfies $0 \le N_p(f) \le +\infty$ and, clearly,

$$N_p(\alpha f) = |\alpha|\,N_p(f) \qquad \text{for all } \alpha \in \mathbb{R}. \tag{14.2}$$

Two deeper properties will now be established:

14.1 Theorem. $p > 1$ is a real number and $q > 1$ is defined by the equation

$$\frac{1}{p} + \frac{1}{q} = 1.$$

Then for any measurable numerical functions $f, g$ on $\Omega$

$$N_1(fg) \le N_p(f)\,N_q(g) \qquad \text{(HÖLDER's inequality).} \tag{14.3}$$

Proof. It is clear from definition (14.1) that we may assume $f \ge 0$ and $g \ge 0$. Setting

$$\sigma := N_p(f) \quad\text{and}\quad \tau := N_q(g),$$

we can also assume that both these numbers are positive. For if, say $\sigma = 0$, then by 13.2 $f^p$, whence also $f$, is almost everywhere equal to 0. The same is then true of $fg$ (remember that $0 \cdot (+\infty) = 0$), so that again by 13.2 we have $N_1(fg) = 0$, and (14.3) holds. Once $\sigma, \tau$ are each positive, no loss of generality is incurred by assuming that each is also finite, which we now do.

Applying the mean-value theorem of the differential calculus to the function $\eta \mapsto (1 + \eta)^{1/p}$, there follows at once the well-known Bernoulli inequality

$$(1 + \eta)^{1/p} \le \frac{\eta}{p} + 1 \qquad \text{for all } \eta \in \mathbb{R}_+$$

or, writing $\xi := 1 + \eta$,

$$\xi^{1/p} \le \frac{\xi}{p} + \frac{1}{q} \qquad \text{for all } \xi \ge 1.$$

If now $x$ and $y$ are positive real numbers, then one of $xy^{-1}$ and $x^{-1}y$ is such a $\xi$. Inserting this $\xi$ into the last inequality (and reversing the roles of $p$ and $q$ if necessary), gives

$$x^{1/p} y^{1/q} \le \frac{1}{p}x + \frac{1}{q}y.$$

This inequality, which is really equivalent to the concavity of the (natural) logarithm function, holds as well for all $x, y \in \mathbb{R}_+$. If finally we take $x := (\sigma^{-1} f(\omega))^p$ and $y := (\tau^{-1} g(\omega))^q$ for an $\omega \in \{f < +\infty\} \cap \{g < +\infty\}$, we get

$$\frac{1}{\sigma\tau} fg \le \frac{1}{p\sigma^p} f^p + \frac{1}{q\tau^q} g^q,$$

valid throughout $\Omega$, since it trivially prevails as well in the complementary set $\{f = +\infty\} \cup \{g = +\infty\}$. Integration of this inequality leads at once to (14.3), because the right-hand side integrates to $\frac{1}{p} + \frac{1}{q} = 1$. □

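Hölder's inequality (14.3) and the pointwise inequality used in its proof can be spot-checked on a finite set with point masses. This is a hedged numerical sketch only: the weights, the sample functions and the helper `N` are assumptions made for illustration, and a check on random data is of course no substitute for the proof.

```python
import random

def N(f, mu, p):
    """N_p(f) = (sum of |f|^p * mu)^(1/p) on a finite set with point masses mu."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

random.seed(0)                      # fixed seed so the sample data is reproducible
size = 50
mu = [random.uniform(0.1, 2.0) for _ in range(size)]   # hypothetical weights
f = [random.uniform(-3.0, 3.0) for _ in range(size)]
g = [random.uniform(-3.0, 3.0) for _ in range(size)]

p = 3.0
q = p / (p - 1.0)                   # conjugate exponent: 1/p + 1/q = 1

lhs = N([a * b for a, b in zip(f, g)], mu, 1.0)   # N_1(fg)
rhs = N(f, mu, p) * N(g, mu, q)                    # N_p(f) * N_q(g)
print(lhs <= rhs)

# Pointwise inequality from the proof: x^(1/p) * y^(1/q) <= x/p + y/q.
x, y = 2.5, 0.4
young_ok = x ** (1 / p) * y ** (1 / q) <= x / p + y / q
print(young_ok)
```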
14.2 Theorem. For all measurable numerical functions $f$ and $g$ on $\Omega$ whose sum $f + g$ is defined throughout $\Omega$, and for every $p \in [1, +\infty[$

$$N_p(f + g) \le N_p(f) + N_p(g) \qquad \text{(MINKOWSKI's inequality).} \tag{14.4}$$

Proof. Since $|f + g| \le |f| + |g|$,

$$N_p(f + g) \le N_p(|f| + |g|),$$

which shows that we may assume $f \ge 0$ and $g \ge 0$. In case $p = 1$ there is then even equality in (14.4), by (11.7). Therefore, for the rest we can assume that $1 < p < +\infty$, and then again define $q$ by $p^{-1} + q^{-1} = 1$. We may further assume that both $N_p(f)$ and $N_p(g)$ are finite, that is, that $f^p$ and $g^p$ are integrable. 12.2(c) and the estimates

$$(f + g)^p \le [2(f \vee g)]^p = 2^p [f^p \vee g^p] \le 2^p (f^p + g^p)$$

then insure the integrability of $(f + g)^p$, that is, $N_p(f + g) < +\infty$. Now write

$$\int (f + g)^p\,d\mu = \int (f + g)^{p-1} f\,d\mu + \int (f + g)^{p-1} g\,d\mu$$

and apply Hölder's inequality to each integral on the right to get

$$\int (f + g)^p\,d\mu \le N_q((f + g)^{p-1})\,N_p(f) + N_q((f + g)^{p-1})\,N_p(g)$$

which thanks to the fact $(p - 1)q = p$ reads

$$(N_p(f + g))^p \le (N_p(f + g))^{p-1}\,[N_p(f) + N_p(g)].$$

The desired inequality (14.4) follows from this and the finiteness of $N_p(f + g)$. □
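As with Hölder's inequality, (14.4) can be spot-checked for several exponents on a finite set with point masses. The data and the helper `N` below are illustrative assumptions, and a small tolerance absorbs floating-point rounding.

```python
import random

def N(f, mu, p):
    """N_p(f) on a finite set with point masses mu."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

random.seed(1)
size = 40
mu = [random.uniform(0.1, 2.0) for _ in range(size)]   # hypothetical weights
f = [random.uniform(-3.0, 3.0) for _ in range(size)]
g = [random.uniform(-3.0, 3.0) for _ in range(size)]
f_plus_g = [a + b for a, b in zip(f, g)]

results = {}
for p in (1.0, 1.5, 2.0, 4.0):
    # Minkowski: N_p(f + g) <= N_p(f) + N_p(g)
    results[p] = N(f_plus_g, mu, p) <= N(f, mu, p) + N(g, mu, p) + 1e-12

print(results)

# The exponent identity (p - 1) * q = p used in the last step of the proof.
p = 4.0
q = p / (p - 1.0)
identity_gap = abs((p - 1.0) * q - p)
```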

14.3 Definition. A numerical function $f$ on $\Omega$ is called $p$-fold ($\mu$-)integrable or integrable of order $p$ or $p$th-power integrable, for some $p \in [1, +\infty[$, if $f$ is measurable and $|f|^p$ is $\mu$-integrable; that is, $f$ is measurable and $N_p(f) < +\infty$.

According to 12.2, 1-fold integrable functions are indeed just the integrable functions. In the case $p = 2$ we also speak of square-integrability.

It is immediate from the definition that a measurable function $f$ is $p$-fold integrable if and only if $|f|$ is $p$-fold integrable; equivalently, if and only if there is a $p$-fold integrable function $g \ge 0$ with $|f| \le g$. Further properties, already known to hold when $p = 1$, are codified in:

14.4 Theorem. Consider $p \in [1, +\infty[$ and $p$-fold integrable functions $f$ and $g$. Then for every $\alpha \in \mathbb{R}$

$$\alpha f, \quad f \vee g \quad\text{and}\quad f \wedge g$$

are $p$-fold integrable, and in case it is defined throughout $\Omega$, the function $f + g$ is $p$-fold integrable.

Proof. Because a function $f$ is $p$-fold integrable just if it is measurable and $N_p(f)$ is finite, the claims about $\alpha f$ and $f + g$ follow from (14.2) and (14.4). The $p$-fold integrability of $f \vee g$ and $f \wedge g$ then follows as in the case $p = 1$ from the estimates

$$|f \vee g| \le |f| + |g| \quad\text{and}\quad |f \wedge g| \le |f| + |g|. \qquad □$$

14.5 Corollary. For $1 \le p < +\infty$ a numerical function $f$ on $\Omega$ is $p$-fold integrable just if its positive part $f^+$ and its negative part $f^-$ are both $p$-fold integrable.

Proof. Since $f = f^+ - f^-$, from the $p$-fold integrability of $f^+$ and $f^-$ follows that of $f$, by 14.4. The converse also follows from 14.4, and the equalities

$$f^+ = f \vee 0 \quad\text{and}\quad f^- = (-f) \vee 0. \qquad □$$


Now for each $p \in [1, +\infty[$ we define

$$\mathscr{L}^p(\mu) := \text{the set of all real } p\text{-fold } \mu\text{-integrable functions on } \Omega. \tag{14.5}$$

Then from 14.4 we get the property, already known for $p = 1$:

$$\mathscr{L}^p(\mu) \text{ is a vector space over } \mathbb{R}. \tag{14.6}$$

In view of (14.5) real-valued $p$-fold integrable functions are also known as $\mathscr{L}^p$-functions.

From (14.3) we immediately get:

14.6 Theorem. The product of a $p$-fold and a $q$-fold integrable numerical function is integrable (where $1 < p < +\infty$ and $\frac{1}{p} + \frac{1}{q} = 1$).

In particular, the product of two square-integrable functions is always integrable.

14.7 Corollary. If $1 \le p < +\infty$ and the measure $\mu$ is finite, then every $p$-fold integrable function is integrable.

Proof. Because $\mu(\Omega) < +\infty$, the constant function 1 is $q$-fold integrable on $\Omega$, for each $q \in [1, +\infty[$. So the present claim follows from 14.6 upon writing any $p$-fold integrable $f$ as $f \cdot 1$. □

Remark. 1. Without the hypothesis $\mu(\Omega) < +\infty$ the conclusion of 14.7 may fail. For example, in Example 2 of §12 choose the measure $\mu$ by requiring $\alpha_n = n^{-1/2}$ for all $n$. Then the function $f$ defined on $\Omega = \mathbb{N}$ by $f(n) := \alpha_n$ for all $n \in \mathbb{N}$ lies in $\mathscr{L}^2(\mu)$ but not in $\mathscr{L}^1(\mu)$.

More generally when $\mu$ is finite, from $p$-fold integrability follows $p'$-fold integrability for every $p' \in [1, p]$ (cf. Exercise 3 below).
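For a finite measure the passage from $p$ to a smaller exponent $p'$ is quantified by the estimate $N_{p'}(f) \le N_p(f)\,\mu(\Omega)^{1/p' - 1/p}$ of Exercise 3. The sketch below spot-checks that estimate on a finite set with point masses; the weights, the function and the helper `N` are assumptions for illustration.

```python
import random

def N(f, mu, p):
    """N_p(f) on a finite set with point masses mu."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

random.seed(2)
size = 30
mu = [random.uniform(0.1, 2.0) for _ in range(size)]   # hypothetical finite measure
total_mass = sum(mu)                                    # mu(Omega)
f = [random.uniform(-5.0, 5.0) for _ in range(size)]

# Pairs (p_small, p_big) with 1 <= p_small <= p_big.
pairs = [(1.0, 2.0), (1.5, 3.0), (2.0, 2.0), (1.0, 5.0)]
checks = []
for p_small, p_big in pairs:
    bound = N(f, mu, p_big) * total_mass ** (1.0 / p_small - 1.0 / p_big)
    checks.append(N(f, mu, p_small) <= bound + 1e-12)

print(checks)
```

The factor `total_mass ** (...)` is exactly what is missing when the measure is infinite, which is why the example in Remark 1 can exist.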
Related to 14.6 we have:

14.8 Theorem. Let $1 \le p < +\infty$, $f : \Omega \to \overline{\mathbb{R}}$ $p$-fold integrable and $g : \Omega \to \overline{\mathbb{R}}$ a measurable almost-everywhere bounded function. Then the product $fg$ is $p$-fold integrable.

Proof. The boundedness hypothesis on $g$ means that there is an $\alpha \in \mathbb{R}$ such that $|g| \le \alpha$ almost everywhere. Then of course $|fg| \le \alpha|f|$ almost everywhere. Because of the $p$-fold integrability of $\alpha|f|$, the claim follows from this inequality via 13.5. □

In particular, along with every $f \in \mathscr{L}^p(\mu)$ and $A \in \mathscr{A}$ the function $1_A f$ lies in $\mathscr{L}^p(\mu)$.

It seems natural to formulate the analog of Theorem 14.6 in case $p = 1$. To this end we define

$$\mathscr{L}^\infty(\mu) := \text{the set of all real, } \mathscr{A}\text{-measurable, } \mu\text{-almost everywhere bounded functions on } \Omega. \tag{14.7}$$

One immediately perceives that $\mathscr{L}^\infty(\mu)$ is also a vector space over $\mathbb{R}$. The union of Theorems 14.6 and 14.8 results in the assertion

$$f \in \mathscr{L}^p(\mu),\ g \in \mathscr{L}^q(\mu),\ 1 \le p \le +\infty,\ p^{-1} + q^{-1} = 1 \implies fg \in \mathscr{L}^1(\mu), \tag{14.8}$$

where of course the convention $(+\infty)^{-1} = 0$ has to be recalled. Functions which are ($\mu$-)almost everywhere bounded are also called ($\mu$-)essentially bounded.

In closing it may be noted that for counting measure on $\mathscr{P}(\Omega)$, with $\Omega := \{1, \ldots, n\}$, (14.3) and (14.4) go over into discrete versions of the Hölder and Minkowski inequalities. When $p = 2$ we get the Cauchy-Schwarz inequality and the triangle inequality for the euclidean norm, familiar from linear algebra and analytic geometry.

Remark. 2. Definition (14.1) of $N_p$ obviously makes sense for every real $p > 0$, thus also for those $0 < p < 1$ heretofore excluded from consideration. For these $p$, however, the fundamental properties (14.3) and (14.4) are lost and the $q$ determined by $p^{-1} + q^{-1} = 1$ is negative. (On this point, compare Exercise 5 below.) Remark 3 at the end of §15 will show that pathologies occur when $0 < p < 1$. All subsequent work will therefore be restricted to the case $p \ge 1$.
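The loss of Minkowski's inequality for $0 < p < 1$ is already visible on a two-point set with counting measure. The following sketch uses the hypothetical helper `N` and the choice $p = 1/2$; the two indicator functions are picked only because the arithmetic is then exact.

```python
def N(f, p):
    """N_p(f) for counting measure on a finite set; also evaluated here for 0 < p < 1."""
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

f = [1.0, 0.0]          # indicator of the first point
g = [0.0, 1.0]          # indicator of the second point
f_plus_g = [a + b for a, b in zip(f, g)]

p = 0.5
left = N(f_plus_g, p)           # (1 + 1)**2
right = N(f, p) + N(g, p)       # 1 + 1
print(left, right)

# For comparison, the same functions with p = 2 obey Minkowski's inequality.
left_2 = N(f_plus_g, 2.0)
right_2 = N(f, 2.0) + N(g, 2.0)
```

Here `left` exceeds `right`, in line with the reversed inequality stated in Exercise 5 for non-negative functions.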
Exercises.

1. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p < +\infty$. Show that every function $f$ on $\Omega$ which is the uniform limit of a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ itself lies in $\mathscr{L}^p(\mu)$.

2. For an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$ and $1 < p < +\infty$, show that a real function $f$ on $\Omega$ is $p$-fold integrable if and only if $f|f|^{p-1}$ is integrable. (In the "if" direction, measurability of $f$ itself is not part of the hypothesis.)

3. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p' \le p < +\infty$, and $f$ a measurable numerical function on $\Omega$. Then

$$N_{p'}(f) \le N_p(f) \cdot \mu(\Omega)^{1/p' - 1/p} \quad\text{and}\quad \mathscr{L}^p(\mu) \subset \mathscr{L}^{p'}(\mu).$$

4. For any finite number of measurable numerical functions $f_1, \ldots, f_n$ on a measure space and real numbers $p_1, \ldots, p_n \in\ ]1, +\infty[$ satisfying $\sum_{j=1}^{n} p_j^{-1} = 1$, prove the generalized Hölder inequality

$$N_1(f_1 \cdot \ldots \cdot f_n) \le N_{p_1}(f_1) \cdot \ldots \cdot N_{p_n}(f_n).$$

5. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $p \in\ ]0, 1[$ and $q < 0$ be defined by $p^{-1} + q^{-1} = 1$. Consider non-negative $f \in \mathscr{L}^p(\mu)$ and a measurable $g : \Omega \to\ ]0, +\infty[$ satisfying $0 < N_q(g) := (\int g^q\,d\mu)^{1/q} < +\infty$. By an appropriate application of Hölder's inequality show that

$$\int fg\,d\mu \ge N_p(f)\,N_q(g).$$

Infer that

$$N_p(f + g) \ge N_p(f) + N_p(g)$$

and find an example to show that generally equality does not prevail here.

15. Convergence theorems


Again consider $1 \le p < +\infty$ and a measure space $(\Omega, \mathscr{A}, \mu)$. The function $N_p$ is real-valued on the vector space $\mathscr{L}^p(\mu)$, and in fact a semi-norm, that is, a mapping

$$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$$

having properties (14.2) and (14.4). From the second of those properties, the Minkowski inequality, it follows that the function

$$d_p(f, g) := N_p(f - g), \qquad f, g \in \mathscr{L}^p(\mu),$$

satisfies the triangle inequality, that is,

$$d_p(f, g) \le d_p(f, h) + d_p(h, g) \qquad \text{for all } f, g, h \in \mathscr{L}^p(\mu).$$

Evidently $d_p$ thus has all the properties of a metric on $\mathscr{L}^p(\mu)$, with one exception: According to 13.2 and 13.3

$$d_p(f, g) = 0$$

is not equivalent to $f = g$, but only to

$$f = g \quad \mu\text{-almost everywhere.}$$

Distance-like functions without the property that "distance between two elements equal to zero entails equality of the elements" are usually called pseudometrics. $N_p$ and $d_p$ are called the $\mathscr{L}^p$-semi-norm and the $\mathscr{L}^p$-pseudometric, also the semi-norm or the pseudometric of convergence in $p$th mean or of $\mathscr{L}^p$-convergence. To elaborate: If $(f_n)$ is a sequence in $\mathscr{L}^p(\mu)$, then it is said to converge in $p$th mean to $f \in \mathscr{L}^p(\mu)$, or to be $\mathscr{L}^p$-convergent, if

$$\lim_{n\to\infty} N_p(f_n - f) = \lim_{n\to\infty} d_p(f_n, f) = 0. \tag{15.1}$$

By virtue of what was noted above, the limit function $f$ is only almost everywhere uniquely determined. (14.2) and (14.4) insure that linear computations with convergent sequences are like those we are accustomed to involving real numbers. In immediately apprehensible symbolic form these say:

$$f_n \to f,\quad g_n \to g \implies \alpha f_n + \beta g_n \to \alpha f + \beta g \qquad \text{for any } \alpha, \beta \in \mathbb{R}.$$
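That $d_p$ is a pseudometric rather than a metric can be seen on a three-point set in which one point carries no mass. The measure, the functions and the helper names in this sketch are assumptions chosen for illustration.

```python
def N(f, mu, p):
    """N_p(f) on a finite set with point masses mu."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

def d(f, g, mu, p):
    """The pseudometric d_p(f, g) = N_p(f - g)."""
    return N([a - b for a, b in zip(f, g)], mu, p)

mu = [1.0, 0.0, 2.0]        # hypothetical measure; the middle point is a nullset
f = [1.0, 5.0, 3.0]
g = [1.0, -7.0, 3.0]        # differs from f only on the nullset
h = [0.0, 2.0, 4.0]
p = 2.0

print(f != g, d(f, g, mu, p))            # distinct functions at distance zero

# Triangle inequality d(f, h) <= d(f, g) + d(g, h), with a rounding tolerance.
triangle_ok = d(f, h, mu, p) <= d(f, g, mu, p) + d(g, h, mu, p) + 1e-12
print(triangle_ok)
```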

From (14.4) also follows a "triangle inequality from below"

$$|N_p(f) - N_p(g)| \le N_p(f - g) \qquad \text{for all } f, g \in \mathscr{L}^p(\mu), \tag{15.2}$$

simply because

$$N_p(f) = N_p(f - g + g) \le N_p(f - g) + N_p(g),$$

$N_p(-g) = N_p(g)$, and the roles of $f$ and $g$ can be interchanged throughout.

In case $p = 1$ we speak simply of convergence in mean, and when $p = 2$ of mean-square convergence.

Taking note of the inequalities

$$\Big| \int_A f\,d\mu - \int_A g\,d\mu \Big| \le \int_A |f - g|\,d\mu \le N_1(f - g), \tag{15.3}$$

valid for all $A \in \mathscr{A}$, $f, g \in \mathscr{L}^1(\mu)$, and

$$|N_p(1_A f) - N_p(1_A g)| \le N_p((f - g)1_A) \le N_p(f - g), \tag{15.4}$$

valid for all $A \in \mathscr{A}$, $f, g \in \mathscr{L}^p(\mu)$, we immediately get:

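15.1 Theorem. Every sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ (resp., in $\mathscr{L}^p(\mu)$) which converges in mean (resp., in $p$th mean) to a function $f$ from $\mathscr{L}^1(\mu)$ (resp., from $\mathscr{L}^p(\mu)$) also satisfies

$$\lim_{n\to\infty} \int_A f_n\,d\mu = \int_A f\,d\mu \qquad \text{for every } A \in \mathscr{A} \tag{15.5}$$

(resp.,

$$\lim_{n\to\infty} \int_A |f_n|^p\,d\mu = \int_A |f|^p\,d\mu \qquad \text{for every } A \in \mathscr{A}.) \tag{15.6}$$

Proof. (15.5) follows from (15.3). Correspondingly (15.6) follows from (15.4), which gives $\lim N_p(1_A f_n) = N_p(1_A f)$, upon taking $p$th powers in this last limit and using the continuity of the mapping $x \mapsto x^p$ on $\mathbb{R}_+$. □

(15.5) and (15.6) say nothing other than that for each $A \in \mathscr{A}$ the mappings

$$f \mapsto \int_A f\,d\mu \quad\text{and}\quad f \mapsto \int_A |f|^p\,d\mu$$

on $\mathscr{L}^1(\mu)$ and $\mathscr{L}^p(\mu)$, respectively, are continuous with respect to $\mathscr{L}^1$-convergence and $\mathscr{L}^p$-convergence, respectively.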
Further developments require a lemma which is fundamental for the whole of
integration theory as well as its applications in probability theory and which goes
back to P. FATOU (1878-1929):

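15.2 Lemma (of Fatou). Every sequence $(f_n)_{n\in\mathbb{N}}$ in $E^*(\Omega, \mathscr{A})$, that is, consisting of $\mathscr{A}$-measurable numerical functions $f_n \ge 0$, satisfies

$$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$$

Proof. According to 9.5 and 11.6 the functions

$$f := \liminf_{n\to\infty} f_n \quad\text{and}\quad g_n := \inf_{m \ge n} f_m \qquad \text{for all } n \in \mathbb{N}$$

lie in $E^*(\Omega, \mathscr{A})$. By definition of limit inferior, $g_n \uparrow f$ and thus by 11.4

$$\int f\,d\mu = \sup_{n\in\mathbb{N}} \int g_n\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu.$$

From this the claimed inequality follows, because by isotoneity

$$\int g_n\,d\mu \le \inf_{m \ge n} \int f_m\,d\mu$$

holds for all $n \in \mathbb{N}$. □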

If we choose for $(f_n)$ a sequence $(1_{A_n})$ of indicator functions of measurable $A_n \subset \Omega$, then $\liminf_{n\to\infty} 1_{A_n}$ is the indicator function of the set

$$\liminf_{n\to\infty} A_n := \bigcup_{n\in\mathbb{N}} \bigcap_{m \ge n} A_m. \tag{15.7}$$

This is the set of $\omega \in \Omega$ which lie in ultimately all of the sets $A_n$. Dual to it one defines

$$\limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m, \tag{15.8}$$

the set of $\omega \in \Omega$ which lie in infinitely many of the sets $A_n$, more correctly, the $\omega$ which lie in $A_n$ for infinitely many $n$. Evidently

$$\complement \limsup_{n\to\infty} A_n = \liminf_{n\to\infty} \complement A_n. \tag{15.9}$$

Hence we get the following corollary:

15.3 Corollary. For every sequence $(A_n)_{n\in\mathbb{N}}$ of sets in the $\sigma$-algebra $\mathscr{A}$

$$\mu(\liminf_{n\to\infty} A_n) \le \liminf_{n\to\infty} \mu(A_n), \tag{15.10}$$

and if the measure $\mu$ is finite, the inequality

$$\limsup_{n\to\infty} \mu(A_n) \le \mu(\limsup_{n\to\infty} A_n) \tag{15.11}$$

holds as well.

Proof. (15.10) is an immediate consequence of 15.2. In turn, if we apply (15.10) to the sequence $(\complement A_n)$ and use (15.9), we get

$$\mu(\Omega) - \mu(\limsup_{n\to\infty} A_n) = \mu(\complement \limsup_{n\to\infty} A_n) = \mu(\liminf_{n\to\infty} \complement A_n) \le \liminf_{n\to\infty} \mu(\complement A_n) = \mu(\Omega) - \limsup_{n\to\infty} \mu(A_n),$$

confirming (15.11). □
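The set-theoretic limits and the two inequalities of Corollary 15.3 can be computed exactly for a periodic sequence of subsets of a finite set. The sketch below is an illustration under that assumption; the sequence `A`, the weights `mu` and the helper names are hypothetical.

```python
PERIOD = 2

def A(n):
    """Hypothetical 2-periodic sequence of sets: {0, 1}, {1, 2}, {0, 1}, ..."""
    return {0, 1} if n % 2 == 0 else {1, 2}

def lim_inf_sets(seq):
    # For a periodic sequence every tail intersection equals the intersection
    # over one full period, so this finite computation is exact.
    tails = [set.intersection(*(seq(m) for m in range(n, n + PERIOD)))
             for n in range(PERIOD)]
    return set.union(*tails)

def lim_sup_sets(seq):
    # Dually: intersect the tail unions, each taken over one full period.
    tails = [set.union(*(seq(m) for m in range(n, n + PERIOD)))
             for n in range(PERIOD)]
    return set.intersection(*tails)

mu = {0: 1.0, 1: 0.5, 2: 2.0, 3: 0.25}      # finite measure given by point masses
omega = set(mu)

def measure(s):
    return sum(mu[w] for w in s)

values = [measure(A(n)) for n in range(PERIOD)]   # the periodic values of mu(A_n)
lim_inf_measure, lim_sup_measure = min(values), max(values)

print(measure(lim_inf_sets(A)), lim_inf_measure)   # left side vs right side of (15.10)
print(lim_sup_measure, measure(lim_sup_sets(A)))   # left side vs right side of (15.11)
```

Both inequalities are strict for this sequence, so equality cannot be expected in general.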

Fatou's lemma leads, in the hands of NOVINGER [1972], surprisingly simply to the first convergence theorem, by which is meant a mechanism for inferring convergence in $p$th mean from almost everywhere convergence. The result itself goes back to F. RIESZ (1880-1956); cf. RIESZ [1911].

15.4 Theorem (of F. Riesz). Suppose $1 \le p < +\infty$ and the sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathscr{L}^p(\mu)$ converges almost everywhere in $\Omega$ to a function $f \in \mathscr{L}^p(\mu)$. Then the condition

$$\lim_{n\to\infty} \int |f_n|^p\,d\mu = \int |f|^p\,d\mu \tag{15.12}$$

is (necessary and) sufficient for the convergence of $(f_n)$ to $f$ in $p$th mean.

Proof. The necessity of (15.12) follows (even without the hypothesis of almost-everywhere convergence) immediately from 15.1. The proof of sufficiency proceeds from the inequality

$$(\alpha + \beta)^p \le 2^p(\alpha^p + \beta^p) \qquad (\alpha, \beta \in \mathbb{R}_+)$$

which has already been used in the proof of (14.4). Since $|\alpha - \beta| \le |\alpha| + |\beta|$ this inequality yields

$$|\alpha - \beta|^p \le 2^p(|\alpha|^p + |\beta|^p) \qquad (\alpha, \beta \in \mathbb{R}).$$

This inequality insures that

$$g_n := 2^p(|f_n|^p + |f|^p) - |f_n - f|^p, \qquad n \in \mathbb{N},$$

are non-negative functions. They lie in $\mathscr{L}^1(\mu)$ and by hypothesis they converge almost everywhere to $2^{p+1}|f|^p$. In particular, $2^{p+1}|f|^p = \liminf g_n$ almost everywhere. Therefore Fatou's lemma in conjunction with (15.12) delivers the relations

$$2^{p+1} \int |f|^p\,d\mu = \int \liminf_{n\to\infty} g_n\,d\mu \le \liminf_{n\to\infty} \int g_n\,d\mu = 2^{p+1} \int |f|^p\,d\mu - \limsup_{n\to\infty} \int |f_n - f|^p\,d\mu.$$

Since $2^{p+1} \int |f|^p\,d\mu < +\infty$, we infer by subtracting it that

$$\limsup_{n\to\infty} \int |f_n - f|^p\,d\mu \le 0,$$

which asserts the claimed $\mathscr{L}^p$-convergence. □

In preparation for the proof of the next convergence theorem we extend Minkowski's inequality to series of non-negative functions.

15.5 Lemma. Every sequence $(f_n)_{n\in\mathbb{N}}$ of functions from $E^*(\Omega, \mathscr{A})$ satisfies

$$N_p\Big(\sum_{n=1}^{\infty} f_n\Big) \le \sum_{n=1}^{\infty} N_p(f_n) \qquad \text{for every } p \in [1, +\infty[. \tag{15.13}$$

Proof. If for each $n \in \mathbb{N}$ we set $s_n := f_1 + \ldots + f_n$, then by (14.4)

$$N_p(s_n) \le \sum_{i=1}^{n} N_p(f_i) \le \sum_{i=1}^{\infty} N_p(f_i).$$

The sequence $(s_n)$ is isotone and $\sum_{n=1}^{\infty} f_n$ is its upper envelope; the same holds for the $p$th powers. Therefore from the monotone convergence theorem 11.4 follows

$$N_p\Big(\sum_{n=1}^{\infty} f_n\Big) = \sup_{n\in\mathbb{N}} N_p(s_n)$$

and together with the preceding inequalities this gives (15.13). □


We come now to a second convergence theorem. It goes back to H. Lebesgue
and is therefore frequently called Lebesgue's convergence theorem.

15.6 Theorem (on dominated convergence). Let $1 \le p < +\infty$ and $(f_n)_{n\in\mathbb{N}}$ be a sequence from $\mathscr{L}^p(\mu)$ which converges almost everywhere on $\Omega$. Suppose there exists a $p$-fold integrable numerical function $g \ge 0$ on $\Omega$ such that

$$|f_n| \le g \qquad \text{for all } n \in \mathbb{N}. \tag{15.14}$$

Then there is a real-valued measurable function $f$ on $\Omega$ to which $(f_n)$ converges almost everywhere. Every such $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges to $f$ in $p$th mean.

Proof. By assumption there is a nullset $M_1$ such that $\lim f_n(\omega)$ exists (in $\overline{\mathbb{R}}$) for every $\omega \in \complement M_1$. Because of the integrability of $g^p$ there is, according to 13.6, another nullset $M_2$ with $g(\omega) < +\infty$ for every $\omega \in \complement M_2$. If we set

$$f(\omega) := \begin{cases} \lim_{n\to\infty} f_n(\omega), & \omega \in \complement(M_1 \cup M_2) \\ 0, & \omega \in M_1 \cup M_2, \end{cases}$$

then $f$ is real-valued and $\mathscr{A}$-measurable, and the sequence $(f_n)$ converges almost everywhere to $f$. Consider now any function $f$ with these properties. Then $|f| \le g$ almost everywhere, so along with $g^p$ the function $|f|^p$ is also integrable, that is, $f \in \mathscr{L}^p(\mu)$, by 13.5. We set, for each $n \in \mathbb{N}$,

$$g_n := |f_n - f|^p$$

and then what has to be shown is that $\lim \int g_n\,d\mu = 0$. From the definition of $g_n$,

$$0 \le g_n \le (|f_n| + |f|)^p \le (g + |f|)^p.$$

Since the function $h := (g + |f|)^p$ is integrable, so is each $g_n$ (by 14.4 and 12.2). Fatou's lemma applies to the sequence $(h - g_n)$ and says that

$$\int \liminf_{n\to\infty} (h - g_n)\,d\mu \le \liminf_{n\to\infty} \int (h - g_n)\,d\mu = \int h\,d\mu - \limsup_{n\to\infty} \int g_n\,d\mu.$$

Since $(f_n)$ converges almost everywhere to $f$, $(h - g_n)$ converges almost everywhere to $h$. In particular,

$$\liminf_{n\to\infty} (h - g_n) = h \qquad \text{almost everywhere}$$

and so

$$\int \liminf_{n\to\infty} (h - g_n)\,d\mu = \int h\,d\mu.$$

The preceding inequality therefore yields $\limsup \int g_n\,d\mu \le 0$. Since all $g_n$ are non-negative, this is equivalent to the desired $\lim \int g_n\,d\mu = 0$. □

The concept of a Cauchy sequence makes sense in any pseudometric space, in particular therefore in $\mathscr{L}^p(\mu)$. A sequence $(f_n)$ of functions from $\mathscr{L}^p(\mu)$ is said to be a Cauchy sequence in $\mathscr{L}^p(\mu)$ if for every $\varepsilon > 0$

$$d_p(f_m, f_n) = N_p(f_m - f_n) \le \varepsilon$$

holds for ultimately all $m, n$. Every $\mathscr{L}^p(\mu)$-convergent sequence is a Cauchy sequence, as Minkowski's inequality shows. That the converse of this is also true, that, in other words, the space $\mathscr{L}^p(\mu)$ is (metrically) complete, is the content of the third convergence theorem. Its special case $p = 2$ goes back to F. RIESZ and E. FISCHER (1875-1956).

15.7 Theorem. For each $1 \le p < +\infty$, every Cauchy sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$. Some subsequence of $(f_n)$ converges almost everywhere to $f$.

Proof. Straight from the definition of Cauchy sequence we can construct $1 \le n_1 < n_2 < \ldots$ such that $N_p(f_{n_{k+1}} - f_{n_k}) \le 2^{-k}$ for all $k \in \mathbb{N}$. We define

$$g_k := f_{n_{k+1}} - f_{n_k} \quad \text{for each } k \in \mathbb{N}, \quad\text{and}\quad g := \sum_{k=1}^{\infty} |g_k|.$$

Then from 15.5

$$N_p(g) \le \sum_{k=1}^{\infty} N_p(g_k) \le \sum_{k=1}^{\infty} 2^{-k} = 1.$$

Consequently, the $\mathscr{A}$-measurable, non-negative numerical function $g$ is $p$th-power integrable and therefore (by 13.6) it is almost everywhere real-valued, that is, the series $\sum g_k$ is absolutely convergent almost everywhere. The $k$th partial sum of this series is $f_{n_{k+1}} - f_{n_1}$, so we see that the sequence $(f_{n_k})_{k\in\mathbb{N}}$ converges almost everywhere in $\Omega$. Moreover,

$$|f_{n_{k+1}}| = |g_1 + \ldots + g_k + f_{n_1}| \le g + |f_{n_1}|$$

and by 14.4 the sum $g + |f_{n_1}|$ is $p$th-power integrable. Thus the sequence $(f_{n_k})_{k\in\mathbb{N}}$ satisfies all the hypotheses of the dominated convergence theorem, according to which it therefore converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$ and

$$\lim_{k\to\infty} f_{n_k} = f \qquad \text{almost everywhere.}$$

Since $(f_n)$ is a Cauchy sequence, this subsequence behavior entails the convergence in $p$th mean of the whole sequence: Given $\varepsilon > 0$ there is an $m_\varepsilon \in \mathbb{N}$ such that

$$N_p(f_m - f_n) \le \varepsilon \qquad \text{for all } m, n \ge m_\varepsilon.$$

Then there is a $k \in \mathbb{N}$ with $n_k \ge m_\varepsilon$ such that

$$N_p(f_{n_k} - f) \le \varepsilon.$$

The triangle inequality then insures that

$$N_p(f_n - f) \le N_p(f_n - f_{n_k}) + N_p(f_{n_k} - f) \le 2\varepsilon$$

holds for any $n \ge m_\varepsilon$. □


Passage to a subsequence cannot generally be circumvented if one wants almost
everywhere convergence, as the next example illustrates.

Example. Consider $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$. Every natural number $n$ is representable as $n = 2^h + k$ for a unique pair of integers $h \ge 0$, $k \ge 0$ with $k < 2^h$. Set $A_n := [k2^{-h}, (k + 1)2^{-h}[$ and let $f_n$ denote its indicator function. Then $\int f_n^p\,d\mu = \int f_n\,d\mu = \mu(A_n) = 2^{-h} < 2/n$, so $(f_n)$ converges to 0 in $p$th mean (for any $1 \le p < +\infty$) and is therefore certainly a Cauchy sequence in $\mathscr{L}^p(\mu)$. But the sequence $(f_n(\omega))_{n\in\mathbb{N}}$ in $\{0, 1\}$ is not convergent for any $\omega \in \Omega$. Indeed, given $\omega \in \Omega$ and $h = 0, 1, \ldots$, there is exactly one $k = 0, \ldots, 2^h - 1$ such that $\omega \in [k2^{-h}, (k + 1)2^{-h}[$, that is, $\omega \in A_{2^h + k}$. In case $k < 2^h - 1$, $\omega \notin A_{2^h + k + 1}$. In case $k = 2^h - 1$ and $h \ge 1$, $\omega \notin A_{2^{h+1}}$.
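The sequence of this example can be generated directly. The sketch below assumes the indexing $n = 2^h + k$ from the text; the function names `interval` and `f` and the sample point `w` are illustrative choices.

```python
def interval(n):
    """Endpoints (a, b) of A_n = [k/2^h, (k+1)/2^h[ for n = 2^h + k, 0 <= k < 2^h."""
    h = n.bit_length() - 1          # largest h with 2^h <= n
    k = n - (1 << h)
    return k / 2 ** h, (k + 1) / 2 ** h

def f(n, w):
    """Indicator function of A_n evaluated at the point w."""
    a, b = interval(n)
    return 1 if a <= w < b else 0

# The measure of A_n is its length 2^-h, which is below 2/n and tends to 0.
lengths_ok = all(interval(n)[1] - interval(n)[0] < 2 / n for n in range(1, 2049))

# At a fixed point w, each block 2^h <= n < 2^(h+1) contains exactly one index
# with f_n(w) = 1 and (for h >= 1) at least one with f_n(w) = 0.
w = 0.3
blocks = [[f(n, w) for n in range(2 ** h, 2 ** (h + 1))] for h in range(1, 11)]
print([sum(block) for block in blocks])
```

Because both values 0 and 1 recur in every block, the sequence `f(n, w)` has no limit, although the lengths of the intervals shrink to zero.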
We record the following simple corollary to 15.7:

15.8 Corollary. If the Cauchy sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere to an $\mathscr{A}$-measurable real function $f$ on $\Omega$, then $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence converges to it in $p$th mean.

Proof. According to 15.7 there is an $f^* \in \mathscr{L}^p(\mu)$ to which $(f_n)$ converges in $p$th mean and to which a subsequence of $(f_n)$ converges almost everywhere. Outside the union of this exceptional nullset and that in the hypothesis the two limits $f$ and $f^*$ must agree. Hence $f = f^*$ almost everywhere. □


Corresponding to Theorem 14.6 and its corollary we have finally the following
two convergence assertions:

15.9 Theorem. Suppose the sequence $(f_n)$ in $\mathcal{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathcal{L}^p(\mu)$ and the sequence $(g_n)$ in $\mathcal{L}^q(\mu)$ converges in $q$th mean to $g \in \mathcal{L}^q(\mu)$. If $1 < p < +\infty$ and $p^{-1} + q^{-1} = 1$, then the sequence $(f_n g_n)$ of products converges in mean to $fg$.

Proof. The triangle inequality in $\mathbb{R}$ yields

\[ |f_n g_n - fg| \le |f_n - f|\,|g_n| + |f|\,|g_n - g| \qquad (n \in \mathbb{N}) \]

which the Hölder inequality (14.3) transforms into

\[ N_1(f_n g_n - fg) \le N_p(f_n - f)\,N_q(g_n) + N_p(f)\,N_q(g_n - g) \qquad (n \in \mathbb{N}). \]

Our claim follows from this when we recall from (15.2) or (15.6) that the sequence $(N_q(g_n))_{n\in\mathbb{N}}$ is convergent, hence bounded. $\Box$

15.10 Corollary. If the measure $\mu$ is finite, then every sequence $(f_n)_{n\in\mathbb{N}}$ in $\mathcal{L}^p(\mu)$ which converges in $p$th mean to an $f \in \mathcal{L}^p(\mu)$ for some $1 \le p < +\infty$, also converges to $f$ in mean.

Proof. For $p = 1$ there is nothing to prove. For $1 < p < +\infty$ the claim follows from the theorem upon taking every function $g_n$ there to be the constant function 1; because of the finiteness of $\mu$ the constant functions lie in $\mathcal{L}^q(\mu)$ for every $q \in [1, +\infty[$. $\Box$

The reader should convince himself via an example like that in the remark after 14.7 that the converse of the assertion in this corollary is not true. However, the conclusion of the corollary can be refined somewhat; namely, under its hypotheses there is $\mathcal{L}^{p'}$-convergence of $(f_n)$ to $f$ for every $p' \in [1, p]$. Cf. Exercise 2 below.

Remarks. 1. Because

\[ N_p : \mathcal{L}^p(\mu) \to \mathbb{R}_+ \]

is a semi-norm, the set

\[ \mathcal{N} := N_p^{-1}(0) \]

is a linear subspace of $\mathcal{L}^p(\mu)$. It is independent of $p$ because it consists of all measurable real functions on $\Omega$ which are almost everywhere equal to 0. The quotient vector space

\[ L^p(\mu) := \mathcal{L}^p(\mu)/\mathcal{N} \]

becomes a normed space in a natural way: Letting $f \mapsto \tilde f$ denote the canonical mapping of $\mathcal{L}^p(\mu)$ onto $L^p(\mu)$, we define

\[ \|\tilde f\|_p := N_p(f) \quad \text{for all } \tilde f \in L^p(\mu). \]

One checks effortlessly that $\tilde f \mapsto \|\tilde f\|_p$ is thereby well defined and provides a norm on $L^p(\mu)$. Theorem 15.7 says that $L^p(\mu)$ is complete with respect to this norm, that is, it is a Banach space (for $1 \le p < +\infty$).

$L^2(\mu)$ is even a Hilbert space. For the product $fg$ of two functions $f, g \in \mathcal{L}^2(\mu)$ is integrable, by 14.6, and it is clear that the integral $\int fg\,d\mu$ depends only on the canonical images $\tilde f, \tilde g$ of these functions, which means that

\[ (\tilde f, \tilde g) \mapsto \int fg\,d\mu \]

is a well-defined mapping. A short calculation suffices to confirm that it provides a scalar product in $L^2(\mu)$.

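2. $f \in \mathcal{L}^\infty(\mu)$ means that the set $W_f$ of all $\alpha \in \mathbb{R}_+$ such that $|f| \le \alpha$ almost everywhere is not empty. We can set

\[ N_\infty(f) := \inf W_f \]

and show easily that $N_\infty : \mathcal{L}^\infty(\mu) \to \mathbb{R}_+$ is a semi-norm on $\mathcal{L}^\infty(\mu)$. Also in this case $N_\infty^{-1}(0)$ coincides with the space $\mathcal{N}$ described in 1. In the quotient space

\[ L^\infty(\mu) := \mathcal{L}^\infty(\mu)/\mathcal{N} \]

a norm $\tilde f \mapsto \|\tilde f\|_\infty$ can be defined via $N_\infty$ just as before. One checks that $L^\infty(\mu)$ thus also becomes a Banach space.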

3. For every measure space $(\Omega, \mathcal{A}, \mu)$ and every $p \in\, ]0,1[$ the set $\mathcal{L}^p(\mu)$ (cf. Remark 2 in §14) turns out to be a vector space. $N_p$ is generally not a semi-norm (cf. Exercise 5, §14), but $d_p(f, g) := N_p^p(f - g)$ is a complete pseudometric, with, however, strange properties: The unit "ball" centered at 0 is generally not convex. For L-B measure on $[0,1]$, every $f \in \mathcal{L}^p$ is actually a convex combination of functions in this ball. See BOURBAKI [1965], chap. 4, §6, exer. 13.
Exercises.

1. Let $(f_n)$ be a sequence of numerical measurable functions on a measure space $(\Omega, \mathcal{A}, \mu)$. Under the hypothesis that a $\mu$-integrable function $g$ satisfying $|f_n| \le g$ for every $n \in \mathbb{N}$ exists, show that $\liminf f_n$ and $\limsup f_n$ are $\mu$-integrable functions and satisfy

\[ \int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu \le \limsup_{n\to\infty} \int f_n\,d\mu \le \int \limsup_{n\to\infty} f_n\,d\mu. \]

Show by an explicit example that this chain of inequalities can fail if there is no such majorizing function $g$. (To this end, cf. Exercise 6 in §21.)

2. Let $\mu$ be a finite measure, $1 \le p' \le p < +\infty$. Show that if a sequence in $\mathcal{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathcal{L}^p(\mu)$, then it also converges in $p'$th mean to $f$. (Cf. Exercise 3 in §14.)

3. Let $(\Omega, \mathcal{A}, \mu)$ be a finite measure space and on $\mathcal{A}$ consider the pseudometric $d_\mu(A, B) := \mu(A \triangle B) = \int |1_A - 1_B|\,d\mu$ introduced in Exercise 7 of §3. Show that the pseudometric space $(\mathcal{A}, d_\mu)$ is complete.


4. Show that if $1 \le p < +\infty$ and $f, f_n \in \mathcal{L}^p(\mu)$ satisfy


00

n=1

then the sequence $(f_n)$ converges almost everywhere to $f$.


5. Show that the conclusion of Theorem 15.9 remains valid for p = 1 and q = +oo.

16. Applications of the convergence theorems


We will now demonstrate the applicability of the convergence theorems by means

of three examples which will be important in the sequel. The first concerns the
behavior of parameter-dependent integrals, the second the connection between the
Riemann and the Lebesgue integral, and the third the calculation of the (Gaussian) integral

\[ G := \int_{-\infty}^{+\infty} e^{-x^2}\,dx. \tag{16.1} \]

I. Parameter-dependent Integrals. The question of the continuity and differentiability of functions which are defined by integrals will be answered in the following lemmas and corollary. Throughout, $(\Omega, \mathcal{A}, \mu)$ is an arbitrary measure space.

16.1 Lemma (Continuity lemma). Let $E$ be a metric space and $f : E \times \Omega \to \mathbb{R}$ a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for every $x \in E$;
(b) $x \mapsto f(x, \omega)$ is continuous at $x_0 \in E$ for every $\omega \in \Omega$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that

\[ |f(x, \omega)| \le h(\omega) \quad \text{for all } (x, \omega) \in E \times \Omega. \]

Then the function defined on $E$ by

\[ \varphi(x) := \int f(x, \omega)\,\mu(d\omega) \]

is continuous at $x_0$.
Proof. The continuity of $\varphi$ at $x_0$ is proved if we show that for every sequence $(x_n)$ in $E$ with $\lim x_n = x_0$,

\[ \lim_{n\to\infty} \varphi(x_n) = \varphi(x_0) \]

holds. To accomplish this, we introduce the sequence $(f_n)$ by

\[ f_n(\omega) := f(x_n, \omega) \qquad (n \in \mathbb{Z}_+,\ \omega \in \Omega). \]

By hypothesis these are integrable functions, each satisfies $|f_n| \le h$, and for every fixed $\omega \in \Omega$, $\lim_{n\to\infty} f_n(\omega) = f_0(\omega)$. From the theorem on dominated convergence it therefore follows that

\[ \lim_{n\to\infty} \int f_n\,d\mu = \int f_0\,d\mu = \int f(x_0, \omega)\,\mu(d\omega), \]

that is, indeed $\lim \varphi(x_n) = \varphi(x_0)$. $\Box$


In the following applications of this lemma the space $E$ will frequently be an interval in $\mathbb{R}$ or, more generally, a subset of $\mathbb{R}^d$.

16.2 Lemma (Differentiation lemma). Let $I$ be a non-degenerate (meaning, containing more than one point) interval in $\mathbb{R}$, and $f : I \times \Omega \to \mathbb{R}$ be a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for each $x \in I$;
(b) $x \mapsto f(x, \omega)$ is differentiable on $I$ for each $\omega \in \Omega$, the derivative at $x$ being denoted by $f'(x, \omega)$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that

\[ |f'(x, \omega)| \le h(\omega) \quad \text{for all } (x, \omega) \in I \times \Omega. \]

Then the function defined on $I$ by

\[ \varphi(x) := \int f(x, \omega)\,\mu(d\omega) \tag{16.2} \]

is differentiable, for each $x \in I$ the function $\omega \mapsto f'(x, \omega)$ is $\mu$-integrable, and

\[ \varphi'(x) = \int f'(x, \omega)\,\mu(d\omega) \tag{16.3} \]

for every $x \in I$.

In short, under the stated conditions (16.2) can be differentiated under the
integral sign.

Proof. Fix $x_0 \in I$ and consider any sequence $(x_n)_{n\in\mathbb{N}} \subset I \setminus \{x_0\}$ which converges to $x_0$. Then the function defined on $\Omega$ by

\[ g_n(\omega) := \frac{f(x_n, \omega) - f(x_0, \omega)}{x_n - x_0} \]

is $\mu$-integrable, for each $n \in \mathbb{N}$, and

\[ \lim_{n\to\infty} g_n(\omega) = f'(x_0, \omega) \quad \text{for all } \omega \in \Omega. \]

It is a consequence of hypothesis (c) that $|g_n| \le h$ for $n \in \mathbb{N}$, as we now confirm. It suffices to apply the mean-value theorem of differential calculus. According to it, for each $x \in I \setminus \{x_0\}$ and each fixed $\omega \in \Omega$ there is a point $t$ in the open interval whose endpoints are $x$ and $x_0$, such that

\[ \frac{f(x, \omega) - f(x_0, \omega)}{x - x_0} = f'(t, \omega) \]

and therefore by (c) this quotient is majorized in absolute value by $h(\omega)$. In particular,

\[ |g_n(\omega)| \le h(\omega) \quad \text{for all } \omega \in \Omega \text{ and every } n \in \mathbb{N}. \]

Now the dominated convergence theorem comes into play to insure that the function $\omega \mapsto f'(x_0, \omega)$ to which the $g_n$ converge is $\mu$-integrable and

\[ \lim_{n\to\infty} \int g_n\,d\mu = \int f'(x_0, \omega)\,\mu(d\omega). \]

Claim (16.3) follows from this because

\[ \int g_n\,d\mu = \frac{\varphi(x_n) - \varphi(x_0)}{x_n - x_0} \quad \text{for all } n \in \mathbb{N}. \qquad \Box \]

Passage to the multi-dimensional analog is painless:

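16.3 Corollary. Let $U$ be an open subset of $\mathbb{R}^d$, $i \in \{1, \dots, d\}$ and $f : U \times \Omega \to \mathbb{R}$ a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for each $x \in U$;
(b) $x \mapsto f(x, \omega)$ has an $i$th partial derivative at each point of $U$, for every $\omega \in \Omega$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that

\[ \left| \frac{\partial f}{\partial x_i}(x, \omega) \right| \le h(\omega) \quad \text{for all } (x, \omega) \in U \times \Omega. \]

Then the function defined on $U$ by

\[ \varphi(x) := \int f(x, \omega)\,\mu(d\omega) \]

has an $i$th partial derivative at every $x \in U$, the function $\omega \mapsto \frac{\partial f}{\partial x_i}(x, \omega)$ is $\mu$-integrable, and

\[ \frac{\partial \varphi}{\partial x_i}(x) = \int \frac{\partial f}{\partial x_i}(x, \omega)\,\mu(d\omega) \]

for every $x \in U$.

This follows at once from the differentiation lemma: Given $\bar x = (\bar x_1, \dots, \bar x_d) \in U$, there is an open interval $I \subset \mathbb{R}$ containing $\bar x_i$ such that for each $t \in I$ the point $(\bar x_1, \dots, \bar x_{i-1}, t, \bar x_{i+1}, \dots, \bar x_d)$ lies in $U$, and we can apply 16.2 to the function $(t, \omega) \mapsto f(\bar x_1, \dots, \bar x_{i-1}, t, \bar x_{i+1}, \dots, \bar x_d, \omega)$.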

II. Comparison of the Riemann and Lebesgue Integrals. For every $d$-dimensional Borel set $B \in \mathcal{B}^d$ and suitable Borel measurable numerical functions $f$ on $B$ the integral $\int_B f\,d\lambda^d$ was defined in §12 and identified with $\int f\,d\lambda^d_B$. This integral is called for short the Lebesgue integral of $f$ over $B$. A frequently encountered alternative way of writing it is

\[ \int_B f(x)\,dx := \int_B f\,d\lambda^d. \tag{16.4} \]

In case $d = 1$ and $B = [\alpha, \beta]$, or $]-\infty, \alpha]$, or $\mathbb{R}$, etc. the notations $\int_\alpha^\beta f(x)\,dx$, or $\int_{-\infty}^{\alpha} f(x)\,dx$, or $\int_{-\infty}^{+\infty} f(x)\,dx$, etc., are also common.

Since in basic analysis courses it is frequently only the Riemann integral that is dealt with, the following remarks relating it to what has been done here may be useful.

16.4 Theorem. Consider a Borel measurable real function $f$ defined on a compact interval $I := [\alpha, \beta]$ in $\mathbb{R}$. If $f$ is Riemann integrable (which in particular means it is bounded), then it is also Lebesgue integrable, and the values of the two integrals of $f$ coincide.
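Proof. To every finite subdivision

\[ \mathcal{Z} := \{\alpha = \alpha_0 < \alpha_1 < \dots < \alpha_n = \beta\} \]

of $I$ the Riemann theory associates the lower and upper sums

\[ L_{\mathcal{Z}} := \sum_{i=1}^n \gamma_i(\alpha_i - \alpha_{i-1}) \quad \text{and} \quad U_{\mathcal{Z}} := \sum_{i=1}^n \Gamma_i(\alpha_i - \alpha_{i-1}) \]

in which

\[ \gamma_i := \inf f([\alpha_{i-1}, \alpha_i]) \quad \text{and} \quad \Gamma_i := \sup f([\alpha_{i-1}, \alpha_i]), \qquad i = 1, \dots, n. \]

If we set $\mu := \lambda^1_I$ and $A_i := [\alpha_{i-1}, \alpha_i]$, $i = 1, \dots, n$, then

\[ l_{\mathcal{Z}} := \sum_{i=1}^n \gamma_i 1_{A_i} \quad \text{and} \quad u_{\mathcal{Z}} := \sum_{i=1}^n \Gamma_i 1_{A_i} \]

are $\mu$-integrable functions on $I$ for which

\[ L_{\mathcal{Z}} = \int l_{\mathcal{Z}}\,d\mu \quad \text{and} \quad U_{\mathcal{Z}} = \int u_{\mathcal{Z}}\,d\mu. \]

Riemann integrability of $f$ means, by definition, that there is a sequence $(\mathcal{Z}_n)$ of subdivisions of $I$ such that each $\mathcal{Z}_{n+1}$ refines its predecessor $\mathcal{Z}_n$ and the sequences $(L_{\mathcal{Z}_n})$ and $(U_{\mathcal{Z}_n})$ tend to the same real limit value, the Riemann integral $\rho$ of $f$ over $I$.

Because of the refinement feature of the sequence $(\mathcal{Z}_n)$, $(l_{\mathcal{Z}_n})$ is an isotone and $(u_{\mathcal{Z}_n})$ is an antitone sequence. Hence

\[ q := \lim_{n\to\infty} (u_{\mathcal{Z}_n} - l_{\mathcal{Z}_n}) \]

exists (in $\mathbb{R}$) on $I$. If therefore we apply Fatou's lemma 15.2 to the sequence of functions $u_{\mathcal{Z}_n} - l_{\mathcal{Z}_n} \ge 0$, there follows

\[ 0 \le \int q\,d\mu \le \lim_{n\to\infty} (U_{\mathcal{Z}_n} - L_{\mathcal{Z}_n}) = 0 \]

and so by 13.2, $q = 0$ $\mu$-almost everywhere. Since in addition for every $n$, $l_{\mathcal{Z}_n} \le f \le u_{\mathcal{Z}_n}$ holds $\mu$-almost everywhere (everywhere except possibly at the points of $\mathcal{Z}_n$), $q = 0$ almost everywhere entails that

\[ \lim_{n\to\infty} l_{\mathcal{Z}_n} = f \quad \mu\text{-almost everywhere on } I. \]

As has been noted, $f$ is bounded, say $|f| \le M \in \mathbb{R}$. The sequence $(|l_{\mathcal{Z}_n}|)$ is therefore majorized by the constant $M$, a $\mu$-integrable function, and so Theorem 15.6 on dominated convergence delivers the $\mu$-integrability of $f$ as well as the convergence of $(l_{\mathcal{Z}_n})$ to $f$ in mean. From 15.1 finally follows

\[ \int f\,d\mu = \lim_{n\to\infty} \int l_{\mathcal{Z}_n}\,d\mu = \lim_{n\to\infty} L_{\mathcal{Z}_n} = \rho, \]

which finishes the proof. $\Box$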
Remarks. 1. Consider once again Dirichlet's jump function $f$ on the unit interval (cf. Exercise 2 of §10). Being the indicator function of $\mathbb{Q} \cap [0,1]$, it is Borel measurable and almost everywhere 0 with respect to L-B measure $\lambda^1_{[0,1]}$. Consequently it is Lebesgue integrable and $\int_0^1 f(x)\,dx = 0$. But $f$ is not Riemann integrable. So the roles of Riemann and Lebesgue integration cannot be reversed in 16.4.
2. Borel measurability of $f$ need not be hypothesized: the above proof shows, even without it, that $\lim l_{\mathcal{Z}_n} = f$ $\mu$-almost everywhere and so $f$ is $\mu$-almost everywhere equal to the Borel function $\lim l_{\mathcal{Z}_n}$. However, in this case it can well happen that $f$ itself is not Borel measurable.
3. The ideas in the proof of Theorem 16.4 can be amplified into a non-trivial criterion for Riemann integrability. Namely, $f : [\alpha, \beta] \to \mathbb{R}$ is Riemann integrable if and only if it is bounded and is continuous at $\lambda^1$-almost every point of $[\alpha, \beta]$. See Theorem 2.5.1 of COHN [1980] or the multi-part Exercise 12.51 of HEWITT and STROMBERG [1965].

16.5 Corollary. Suppose the non-negative, real-valued, Borel measurable function $f$ on $\mathbb{R}$ is Riemann integrable over every compact interval. Then $f$ is Lebesgue integrable over $\mathbb{R}$ if and only if the improper Riemann integral

\[ \rho := \lim_{n\to\infty} \int_{-n}^{+n} f(x)\,dx \]

exists. In this case $\rho = \int f\,d\lambda^1$.

Proof. Denote by $\rho_n$ the Riemann integral of $f$ over $A_n := [-n, +n]$ for each $n \in \mathbb{N}$. According to the theorem just proved

\[ \rho_n = \int_{A_n} f\,d\lambda^1 = \int 1_{A_n} f\,d\lambda^1. \]

From 11.4 and the fact that $1_{A_n} f \uparrow f$ we get

\[ \sup_{n\in\mathbb{N}} \rho_n = \int f\,d\lambda^1. \]

The improper Riemann integral exists, by definition, just if this supremum is finite and in that case its value $\rho$ is that supremum. From these observations and the monotone convergence theorem our present result follows. $\Box$

Utilizing the decomposition $f = f^+ - f^-$ into positive and negative parts, it follows from 12.2 and 16.5 that every Borel measurable real function $f$ on $\mathbb{R}$ with absolutely convergent improper Riemann integral is also Lebesgue integrable and $\int f\,d\lambda^1$ coincides with the improper Riemann integral of $f$. Obviously too, any open or half-open interval $I \subset \mathbb{R}$ can take over the role of $\mathbb{R}$ in 16.5.

By contrast, from the existence of the improper Riemann integral of $f$ does not follow the Lebesgue integrability of $f$, even for continuous functions. Consider, for example, the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := (\sin x)/x$ when $x \ne 0$ and $f(0) := \lim_{x\to 0} (\sin x)/x = \sin'(0) = 1$. Of course, it is continuous. If for each $k \in \mathbb{N}$ we set

\[ a_k := \int_{k\pi}^{(k+1)\pi} \frac{\sin x}{x}\,dx = (-1)^k \int_0^{\pi} \frac{\sin t}{t + k\pi}\,dt, \]

we see that the signs of the $a_k$ alternate, their moduli decrease as $k$ increases, and

\[ |a_k| \le \int_{k\pi}^{(k+1)\pi} \frac{1}{x}\,dx = \log\frac{(k+1)\pi}{k\pi} = \log\Bigl(1 + \frac{1}{k}\Bigr) \to 0 \quad \text{as } k \to \infty. \]

Therefore the series $\sum_{k=1}^{\infty} a_k$ converges. Using this it is very easy to confirm that the improper Riemann integral

\[ \lim_{R\to+\infty} \int_0^R \frac{\sin x}{x}\,dx \]

exists. On the other hand,

\[ \int_{[k\pi,(k+1)\pi]} |f|\,d\lambda^1 = \int_0^{\pi} \frac{\sin t}{t + k\pi}\,dt \ge \frac{1}{(k+1)\pi} \int_0^{\pi} \sin t\,dt = \frac{2}{(k+1)\pi} \]

and so for every $n \in \mathbb{N}$

\[ \int_{\mathbb{R}_+} |f|\,d\lambda^1 \ge \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \frac{|\sin x|}{x}\,dx \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k+1}. \]

Since the harmonic series diverges, these inequalities show that $\int_{\mathbb{R}_+} |f|\,d\lambda^1 = +\infty$, and so by 12.2 $f$ is not Lebesgue integrable over $\mathbb{R}_+$.
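The contrast between the convergent alternating series and the divergent series of absolute values can be observed numerically. The following is a rough sketch in Python, assuming a composite Simpson rule is accurate enough on each interval $[k\pi, (k+1)\pi]$; the function names and the cut-off `n = 200` are illustrative choices, not part of the text.

```python
import math

def simpson(g, a, b, m=200):
    """Composite Simpson rule for g on [a, b] with m (even) subintervals."""
    step = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3

def sinc(x):
    """(sin x)/x, continuously extended by 1 at x = 0."""
    return math.sin(x) / x if x != 0 else 1.0

def a(k):
    """a_k = integral of (sin x)/x over [k*pi, (k+1)*pi]."""
    return simpson(sinc, k * math.pi, (k + 1) * math.pi)

n = 200
terms = [a(k) for k in range(1, n + 1)]
signed = sum(terms)                      # partial sum of the alternating series
absolute = sum(abs(t) for t in terms)    # partial sum of the moduli
# Lower bound (2/pi) * sum 1/(k+1) from the estimate above.
lower = (2 / math.pi) * sum(1 / (k + 1) for k in range(1, n + 1))
```

Increasing `n` leaves `signed` within a bounded range while `absolute` keeps growing together with the harmonic lower bound `lower`.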

III. Calculation of the integral G. The preceding considerations show that


integrals which the reader may already have encountered as Riemann integrals
can, in the stated circumstances, be immediately interpreted as Lebesgue integrals.
Known formulas and computational rules for the Riemann integral thereby become
available to the Lebesgue theory as well.


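As an illustration, consider the non-negative function

\[ f(x, \omega) := \frac{e^{-x(1+\omega^2)}}{1 + \omega^2}, \qquad (x, \omega) \in \mathbb{R} \times \mathbb{R}. \tag{16.5} \]

Both $f$ and the function $(x, \omega) \mapsto f'(x, \omega) := -e^{-x(1+\omega^2)}$ are continuous. For fixed $x_0 > 0$ form the auxiliary functions

\[ h_0(\omega) := e^{-2x_0|\omega|} \quad \text{and} \quad h(\omega) := (1 + \omega^2)^{-1}, \qquad \omega \in \mathbb{R}. \]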

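Their $\lambda^1$-integrability (over $\mathbb{R}$) follows from Corollary 16.5 and the fundamental theorem of calculus. For example,

\[ \int_{-\infty}^{+\infty} (1 + \omega^2)^{-1}\,d\omega = \lim_{n\to\infty} \bigl[\arctan(\omega)\bigr]_{-n}^{+n} = \pi. \]

Obviously $f(x, \omega) \le h(\omega)$ for all $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$. It follows from 12.2 that for each $x \in \mathbb{R}_+$ the function $\omega \mapsto f(x, \omega)$ is $\lambda^1$-integrable. And the real function defined by

\[ \varphi(x) := \int f(x, \omega)\,d\omega, \qquad x \in \mathbb{R}_+ \tag{16.6} \]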

is continuous by the continuity lemma 16.1. Note that $\varphi(0) = \pi$. Since $2|\omega| \le 1 + \omega^2$ for all $\omega \in \mathbb{R}$, we have $|f'(x, \omega)| \le h_0(\omega)$ for all $(x, \omega) \in [x_0, +\infty[ \times \mathbb{R}$. Consequently the differentiation lemma 16.2 insures that $\varphi$ is differentiable in $]x_0, +\infty[$, for every $x_0 > 0$, that is, differentiable in $]0, +\infty[$, and

\[ \varphi'(x) = -\int e^{-x(1+\omega^2)}\,\lambda^1(d\omega) \quad \text{for } x > 0 \tag{16.7} \]

and via the substitution $t = \omega\sqrt{x}$ this reads

\[ \varphi'(x) = -G\,x^{-1/2} e^{-x} \quad \text{for } x > 0 \tag{16.8} \]

where $G$ designates the integral (16.1) that we are trying to explicitly compute. Its existence is already part of the preceding analysis, but can also be inferred from the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the fundamental theorem of calculus

\[ \varphi(x) - \varphi(a) = G \int_x^a t^{-1/2} e^{-t}\,dt = 2G \int_{\sqrt{x}}^{\sqrt{a}} e^{-\omega^2}\,d\omega, \]


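for $x > 0$ and $a > 0$. Upon letting $a$ run to $+\infty$, we will get

\[ \varphi(x) = 2G \int_{\sqrt{x}}^{+\infty} e^{-\omega^2}\,d\omega \tag{16.9} \]

if we notice that $\varphi(a) \to 0$ as $a \to +\infty$, which in turn is a consequence of the inequalities

\[ 0 \le \varphi(a) \le e^{-a} \int (1 + \omega^2)^{-1}\,\lambda^1(d\omega) = \varphi(0)\,e^{-a} \quad \text{for all } a \ge 0. \]

Because $\varphi$ is continuous on $\mathbb{R}_+$ we can pass to the limit $x \to 0+$ in (16.9) and get

\[ \pi = \varphi(0) = 2G \int_0^{+\infty} e^{-\omega^2}\,d\omega = G^2, \]

using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^{0} e^{-\omega^2}\,d\omega = \int_0^{+\infty} e^{-\omega^2}\,d\omega$. Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,

\[ \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} \tag{16.10} \]

or equivalently, in the form seen in probability theory,

\[ \int_{-\infty}^{+\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \tag{16.10'} \]

This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A particularly short alternative one is made possible by Tonelli's theorem (cf. Exercise 4 in §23).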
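The result can be sanity-checked numerically. With $G = \sqrt{\pi}$, formula (16.9) gives $\varphi(x) = \pi\,\mathrm{erfc}(\sqrt{x})$, where erfc is the complementary error function. The sketch below, in Python, evaluates (16.6) after the substitution $\omega = \tan\theta$, which turns it into $\int_{-\pi/2}^{\pi/2} e^{-x/\cos^2\theta}\,d\theta$; the midpoint rule, the grid size `m` and the function names are assumptions made for this illustration.

```python
import math

def phi(x, m=20000):
    """Midpoint-rule approximation of phi(x) from (16.6), written as the
    integral of exp(-x / cos(theta)**2) over ]-pi/2, pi/2[ (w = tan(theta))."""
    step = math.pi / m
    total = 0.0
    for i in range(m):
        theta = -math.pi / 2 + (i + 0.5) * step   # midpoints avoid cos = 0
        total += math.exp(-x / math.cos(theta) ** 2)
    return total * step

def phi_closed(x):
    """Value implied by (16.9) with G = sqrt(pi)."""
    return math.pi * math.erfc(math.sqrt(x))

def phi_prime_closed(x):
    """Right-hand side of (16.8) with G = sqrt(pi)."""
    return -math.sqrt(math.pi / x) * math.exp(-x)
```

Comparing `phi(x)` with `phi_closed(x)` for a few values of $x \ge 0$, and a difference quotient of `phi` with `phi_prime_closed(x)` for $x > 0$, gives an independent check of (16.8), (16.9) and $\varphi(0) = \pi$.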

Exercises.
1. Which of the two functions below are integrable, which are square-integrable with respect to Lebesgue-Borel measure on the indicated intervals?
(a) $f(x) := x^{-1}$, $x \in I := [1, +\infty[$;
(b) $f(x) := x^{-1/2}$, $x \in I := ]0, 1]$.

2. Show that for every real number $\alpha > 0$ the function $x \mapsto e^{-\alpha x}$ is $\lambda^1$-integrable over $\mathbb{R}_+$.

3. Show that for every real number $\alpha \ge 0$ the function

\[ x \mapsto e^{-\alpha x} \Bigl[\frac{\sin x}{x}\Bigr]^3 \]

is $\lambda^1$-integrable over $]0, +\infty[$ and that

\[ \alpha \mapsto \int_0^{+\infty} e^{-\alpha x} \Bigl[\frac{\sin x}{x}\Bigr]^3 \lambda^1(dx) \]

is continuous on $[0, +\infty[$.

17. Measures with densities: the Radon-Nikodym theorem


Again let $(\Omega, \mathcal{A}, \mu)$ be an arbitrary measure space and $E^* = E^*(\Omega, \mathcal{A})$ the set of all $\mathcal{A}$-measurable, non-negative numerical functions on $\Omega$. In 12.4 we defined the integral of every function $f \in E^*$ over every set $A \in \mathcal{A}$. We are interested here in how this integral behaves with respect to $A$.

17.1 Theorem. For each function $f \in E^*$ the equation

\[ \nu(A) := \int_A f\,d\mu \tag{17.1} \]

defines a measure $\nu$ on $\mathcal{A}$.

Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets from $\mathcal{A}$ with $A := \bigcup_{n\in\mathbb{N}} A_n$,

\[ 1_A f = \sum_{n=1}^{\infty} 1_{A_n} f \]

and so by 11.5

\[ \nu(A) = \sum_{n=1}^{\infty} \nu(A_n), \]

the final property needing to be checked in confirming that $\nu$ is a measure on $\mathcal{A}$. $\Box$

17.2 Definition. If $f$ is a non-negative $\mathcal{A}$-measurable, numerical function on $\Omega$, then the measure $\nu$ defined on $\mathcal{A}$ by (17.1) is called the measure having density $f$ with respect to $\mu$. It will be denoted by

\[ \nu = f\mu. \tag{17.2} \]

Concerning the relationship between $\nu$- and $\mu$-integrals we will show

17.3 Theorem. Let $f, \varphi \in E^*$, $\nu := f\mu$. Then

\[ \int \varphi\,d\nu = \int \varphi f\,d\mu \tag{17.3} \]

or, written out,

\[ \int \varphi\,d(f\mu) = \int \varphi f\,d\mu. \tag{17.3'} \]

An $\mathcal{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is $\mu$-integrable. In this case (17.3) is again valid.
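Proof. First suppose $\varphi = \sum_{i=1}^n \alpha_i 1_{A_i}$ is an $\mathcal{A}$-elementary function. In this case (17.3) holds because

\[ \int \varphi\,d\nu = \sum_{i=1}^n \alpha_i \nu(A_i) = \sum_{i=1}^n \alpha_i \int 1_{A_i} f\,d\mu = \int \varphi f\,d\mu. \]

For an arbitrary $\varphi \in E^*$ there is a sequence $(u_n)$ in $E$ such that $u_n \uparrow \varphi$. Since then $u_n f \uparrow \varphi f$ as well, (17.3) follows from 11.4. Finally, consider any $\mathcal{A}$-measurable numerical function $\varphi$ on $\Omega$. By now we know that

\[ \int \varphi^+\,d\nu = \int \varphi^+ f\,d\mu = \int (\varphi f)^+\,d\mu \quad \text{and} \quad \int \varphi^-\,d\nu = \int \varphi^- f\,d\mu = \int (\varphi f)^-\,d\mu. \]

From these equations and the definition of integrability follows the second part of the theorem. $\Box$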
It now follows that the formation of measures with densities is transitive:

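17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\varrho := g\nu$. Then $\varrho = (gf)\mu$, that is,

\[ g(f\mu) = (gf)\mu. \tag{17.4} \]

Proof. For every $A \in \mathcal{A}$

\[ \varrho(A) = \int_A g\,d\nu = \int 1_A g\,d\nu \]

and furthermore, according to 17.3

\[ \int 1_A g\,d\nu = \int 1_A g f\,d\mu = \int_A (gf)\,d\mu. \]

We thus obtain $\varrho(A) = \int_A gf\,d\mu$ for all $A \in \mathcal{A}$, which is what had to be proved. $\Box$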

On the question of uniqueness of density functions we have

17.5 Theorem. For functions $f, g \in E^*$

\[ f = g \quad \mu\text{-almost everywhere} \implies f\mu = g\mu. \tag{17.5} \]

If either $f$ or $g$ is $\mu$-integrable, the converse implication holds as well.

Proof. If $f$ and $g$ coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each $A \in \mathcal{A}$, whence

\[ \int_A f\,d\mu = \int_A g\,d\mu \quad \text{for all } A \in \mathcal{A}, \]

which just says that $f\mu = g\mu$.

Now suppose that $f$ is $\mu$-integrable and that $f\mu = g\mu$. Since $g \ge 0$ and $\int g\,d\mu = \int f\,d\mu < +\infty$, $g$ is also $\mu$-integrable. Let us show that the set

\[ N := \{f > g\}, \]

which lies in $\mathcal{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and is positive, which means that the definition

\[ h := 1_N f - 1_N g \]

makes sense. The functions $1_N f$, $1_N g$, being majorized by the $\mu$-integrable functions $f$, $g$, are themselves integrable. Because $f\mu = g\mu$, they have the same $\mu$-integral. From this we get that

\[ \int h\,d\mu = \int_N f\,d\mu - \int_N g\,d\mu = 0. \]

Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles of $f$ and $g$ reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since $\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is obtained. $\Box$
The converse of implication (17.5) is not valid without some additional hypothesis on the densities f and g. The next example illustrates this.

Example. 1. As in Example 2 of §3 let $\Omega$ be an uncountable set, $\mathcal{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (see Example 2 in §1). But the measure $\mu$ will be defined on $\mathcal{A}$ by $\mu(A) := 0$ or $+\infty$, according as $A$ or $\complement A$ is countable. If $f$ and $g$ are the constant functions on $\Omega$ with the respective values 1 and 2, then indeed $f\mu = g\mu$, yet $f(\omega) = g(\omega)$ holds for no $\omega \in \Omega$. Of course, it then follows from 17.5 that neither $f$ nor $g$ is $\mu$-integrable.
Before turning to the principal problem of this section, we will examine another
characterization of a-finite measures which is important for what follows and is of
interest in its own right.

17.6 Lemma. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. The measure $\mu$ is $\sigma$-finite if and only if there exists a $\mu$-integrable function $h$ on $\Omega$ which satisfies

\[ 0 < h(\omega) < +\infty \quad \text{for every } \omega \in \Omega. \tag{17.6} \]

Proof. If $\mu$ is $\sigma$-finite, there is a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{A}$ such that $\mu(A_n) < +\infty$ for each $n \in \mathbb{N}$ and $A_n \uparrow \Omega$. Choose positive real numbers $\eta_n$ satisfying both $\eta_n \le 2^{-n}$ and $\eta_n \mu(A_n) \le 2^{-n}$, for each $n \in \mathbb{N}$. Then the function

\[ h := \sum_{n=1}^{\infty} \eta_n 1_{A_n} \]

does what is wanted. It is measurable, $0 < h(\omega) \le 1$ for each $\omega \in \Omega$, and $\int h\,d\mu \le 1$.

The converse implication is already known: it is contained in the second part of 13.6. $\Box$
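To make the construction in this proof concrete, consider the counting measure on $\mathbb{N}$ with $A_n := \{1, \dots, n\}$, so that $\mu(A_n) = n$. The following is a small sketch in Python under these assumptions; the particular choice $\eta_n := 2^{-n}/(n+1)$ and the truncation level `N` are illustrative and not taken from the text.

```python
from fractions import Fraction

N = 40  # truncation level, used only to keep the sums finite

def mu(n):
    """Counting measure of A_n = {1, ..., n}."""
    return n

# eta_n <= 2**-n and eta_n * mu(A_n) <= 2**-n, as required in the proof.
eta = {n: Fraction(1, 2**n * (n + 1)) for n in range(1, N + 1)}

def h(w):
    """Truncated h(w): sum of eta_n over all n <= N with w in A_n (w <= n)."""
    return sum(eta[n] for n in range(max(w, 1), N + 1))

# Integral of the truncated h with respect to counting measure,
# computed as sum over n of eta_n * mu(A_n).
integral = sum(eta[n] * mu(n) for n in range(1, N + 1))
```

Every value `h(w)` with `1 <= w <= N` is strictly positive and at most 1, and `integral` stays below 1 however large `N` is chosen, in agreement with $\int h\,d\mu \le 1$.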

In the light of 13.2 this lemma has another formulation: For each $\sigma$-finite measure $\mu$ there exists a real, measurable function $h > 0$ such that the measure $h\mu$ is finite and has the same nullsets as $\mu$.

We come now to the main problem, already alluded to: On the $\sigma$-algebra $\mathcal{A}$ of the measurable space $(\Omega, \mathcal{A})$ two measures $\nu$ and $\mu$ are given. We pose the question of how to decide whether $\nu$ has a density with respect to $\mu$, that is, whether there is an $\mathcal{A}$-measurable, non-negative, numerical function $f$ on $\Omega$ satisfying $\nu = f\mu$, satisfying in other words

\[ \nu(A) = \int_A f\,d\mu \quad \text{for all } A \in \mathcal{A}. \]

For an affirmative answer it is necessary, as 13.3 shows, that every $\mu$-nullset in $\mathcal{A}$ be a $\nu$-nullset as well.

17.7 Definition. A measure $\nu$ on $\mathcal{A}$ is called continuous with respect to a measure $\mu$ on $\mathcal{A}$, for short, $\mu$-continuous, if every $\mu$-nullset from $\mathcal{A}$ is also a $\nu$-nullset.

In the case of a finite measure $\nu$ there is a condition equivalent to $\mu$-continuity which clarifies and justifies the terminology:

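17.8 Theorem. A finite measure $\nu$ on $\mathcal{A}$ is $\mu$-continuous if and only if for every $\varepsilon > 0$ there exists $\delta > 0$ such that

\[ A \in \mathcal{A} \text{ and } \mu(A) \le \delta \implies \nu(A) \le \varepsilon. \tag{17.7} \]

Proof. From (17.7) it follows that $\nu(A) \le \varepsilon$ holds for every $\varepsilon > 0$ if $A$ is a $\mu$-nullset. Hence $\nu(A) = 0$ and $\nu$ is thus a $\mu$-continuous measure, even without the finiteness hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not $\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{A}$ with the properties

\[ \mu(A_n) \le 2^{-n} \quad \text{and} \quad \nu(A_n) > \varepsilon \quad \text{for each } n \in \mathbb{N}. \]

We set

\[ A := \limsup_{n\to\infty} A_n := \bigcap_{n\in\mathbb{N}} \bigcup_{m \ge n} A_m \]

and have a set in $\mathcal{A}$ which on the one hand satisfies

\[ \mu(A) \le \mu\Bigl(\bigcup_{m \ge n} A_m\Bigr) \le \sum_{m=n}^{\infty} \mu(A_m) \le \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1} \quad \text{for every } n \in \mathbb{N}, \]

whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3, satisfies

\[ \nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0, \]

which proves that $\nu$ is not $\mu$-continuous. $\Box$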


Examples. 2. Let $\Omega$ be an uncountable set, $\mathcal{A}$ the $\sigma$-algebra of countable and co-countable subsets of $\Omega$ (Example 2 in §1). As in the preceding Example, consider the measure $\nu$ on $\mathcal{A}$ which assigns to a set the value 0 or $+\infty$ according as the set or its complement is countable. Let $\mu$ denote the counting measure $\zeta$ on $\mathcal{A}$ (from Example 6, §3). Since $\emptyset$ is the only $\mu$-nullset, $\nu$ is trivially $\mu$-continuous. However, $\nu$ cannot have a density with respect to $\mu$. For from $\nu = f\mu$ with $f \in E^*$ it would follow that

\[ 0 = \nu(\{\omega\}) = \int_{\{\omega\}} f\,d\mu = f(\omega)\,\mu(\{\omega\}) = f(\omega) \]

for every $\omega \in \Omega$, making $f = 0$ and therefore $\nu = f\mu = 0$, which is not the case because $\Omega$ is uncountable.

3. Let $(\mathbb{R}, \mathcal{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$) and denote by $\mathcal{N}$ the system of all $\mu$-nullsets. Then $\mathcal{N}$ is an example of a $\sigma$-ideal in $\mathcal{B}^1$: The union of any sequence of its sets is another, as are the intersections of its sets with those of $\mathcal{B}^1$ (cf. Exercise 5, §3). These properties insure that

\[ \nu(A) := \begin{cases} 0 & \text{if } A \in \mathcal{N} \\ +\infty & \text{if } A \in \mathcal{B}^1 \setminus \mathcal{N} \end{cases} \]

defines a measure on $\mathcal{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$ is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$

\[ \mu([0, \delta]) = \delta \quad \text{and} \quad \nu([0, \delta]) = +\infty. \]

Thus the finiteness hypothesis on $\nu$ in 17.8 is not superfluous. Example 2 shows that for the existence of a density $f \in E^*$ with $\nu = f\mu$, the $\mu$-continuity of $\nu$, while necessary, is not sufficient. All the more noteworthy is the theorem of Radon and Nikodym which we will prove, after a preparatory lemma.

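17.9 Lemma. Let $\sigma$ and $\tau$ be finite measures on a $\sigma$-algebra $\mathcal{A}$ of subsets of $\Omega$ and let $\varrho := \tau - \sigma$ denote their difference. Then there is a set $\Omega_0 \in \mathcal{A}$ with the properties

\[ \varrho(\Omega_0) \ge \varrho(\Omega); \tag{17.8} \]
\[ \varrho(A) \ge 0 \quad \text{for all } A \in \Omega_0 \cap \mathcal{A}. \tag{17.9} \]

Proof. Let us first prove the weaker claim:

(*) For every $\varepsilon > 0$ there exists $\Omega_\varepsilon \in \mathcal{A}$ with the properties

\[ \varrho(\Omega_\varepsilon) \ge \varrho(\Omega); \tag{17.8'} \]
\[ \varrho(A) > -\varepsilon \quad \text{for all } A \in \Omega_\varepsilon \cap \mathcal{A}. \tag{17.9'} \]

We may obviously suppose that $\varrho(\Omega) \ge 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is wanted. If then $\varrho(A) > -\varepsilon$ for all $A \in \mathcal{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we consider the case that some $A_1 \in \mathcal{A}$ satisfies $\varrho(A_1) \le -\varepsilon$. From the definition of $\varrho$ and the subtractivity of the finite measures $\sigma$ and $\tau$,

\[ \varrho(\complement A_1) = \varrho(\Omega) - \varrho(A_1) \ge \varrho(\Omega) + \varepsilon > \varrho(\Omega). \]

Therefore, if $\varrho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathcal{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done. In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathcal{A}$ with $\varrho(A_2) \le -\varepsilon$. Then because $A_1$, $A_2$ are disjoint

\[ \varrho(\complement(A_1 \cup A_2)) = \varrho(\Omega) - \varrho(A_1) - \varrho(A_2) \ge \varrho(\Omega) + 2\varepsilon > \varrho(\Omega) \]

and the preceding dichotomy presents itself anew. If after finitely many repetitions of this procedure we have not reached our goal, then we will have generated a sequence $(A_n)_{n\in\mathbb{N}}$ of pairwise disjoint sets in $\mathcal{A}$ with

\[ \varrho(\Omega \setminus (A_1 \cup \dots \cup A_n)) > \varrho(\Omega) \quad \text{and} \quad \varrho(A_n) \le -\varepsilon \quad \text{for every } n \in \mathbb{N}. \]

Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that

\[ \varrho(A_1 \cup \dots \cup A_n) = \sum_{i=1}^{n} \varrho(A_i) \le -n\varepsilon \quad \text{for every } n \in \mathbb{N} \]

and entail the divergence of the series $\sum_{n=1}^{\infty} \varrho(A_n)$. But the latter is untenable, because when the $\sigma$-additivity of $\sigma$ and $\tau$ is applied to the disjoint union $A := \bigcup_{n\in\mathbb{N}} A_n$ it shows this series to be convergent:

\[ \sum_{n=1}^{\infty} \varrho(A_n) = \sum_{n=1}^{\infty} \bigl(\tau(A_n) - \sigma(A_n)\bigr) = \tau(A) - \sigma(A) \in \mathbb{R}. \]

This contradiction proves that the construction procedure must terminate after some finite number $n$ of steps, with the set $\Omega_\varepsilon := \complement(A_1 \cup \dots \cup A_n)$ then satisfying (17.8') and (17.9').

We now take $\varepsilon = 1/n$ in (*) for successive $n \in \mathbb{N}$. The sets $\Omega_{1/n}$ can be chosen with the additional property of isotoneity. For if $\Omega_1 \supset \Omega_{1/2} \supset \dots \supset \Omega_{1/n}$ has already been realized, we simply apply (*) to $\Omega_{1/n}$ as a new base space in the role of $\Omega$, that is, we consider the restriction of the measures $\sigma$ and $\tau$ to $\Omega_{1/n} \cap \mathcal{A}$. Finally, the set $\Omega_0 := \bigcap_{n\in\mathbb{N}} \Omega_{1/n}$ will be seen to do the desired job. For since $\Omega_{1/n} \downarrow \Omega_0$, (17.8) follows from (17.8'), and (17.9) follows from (17.9'), which insures that $\varrho(A) > -1/n$ for all $n \in \mathbb{N}$ and every $A \in \Omega_0 \cap \mathcal{A}$. $\Box$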
As indicated, this puts us in a position to answer the important question we
posed earlier.

17.10 Theorem (Radon-Nikodym). Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$. If $\mu$ is $\sigma$-finite, the following two assertions are equivalent:
(i) $\nu$ has a density with respect to $\mu$.
(ii) $\nu$ is $\mu$-continuous.

Proof. Only the implication (ii) $\Rightarrow$ (i) is still in need of proof. To that end we distinguish three cases.

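First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathcal{G}$ of all $\mathcal{A}$-measurable numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which satisfy

\[ \int_A g\,d\mu \le \nu(A) \quad \text{for all } A \in \mathcal{A}. \]

The constant function $g = 0$ lies in $\mathcal{G}$, so $\mathcal{G}$ is not empty. $\mathcal{G}$ is moreover sup-stable, that is, $g \vee h \in \mathcal{G}$ whenever $g, h \in \mathcal{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$, every $A \in \mathcal{A}$ satisfies

\[ \int_A g \vee h\,d\mu = \int_{A \cap A_1} g\,d\mu + \int_{A \cap A_2} h\,d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A). \]

Since $\int g\,d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathcal{G}$, the number

\[ \gamma := \sup\Bigl\{ \int g\,d\mu : g \in \mathcal{G} \Bigr\} \]

is finite and there is a sequence $(g'_n)$ in $\mathcal{G}$ such that $\lim \int g'_n\,d\mu = \gamma$. Due to sup-stability the functions $g_n := g'_1 \vee \dots \vee g'_n$ lie in $\mathcal{G}$, and consequently $\gamma \ge \int g_n\,d\mu \ge \int g'_n\,d\mu$ (since $g_n \ge g'_n$) for all $n \in \mathbb{N}$. Which shows that $\lim \int g_n\,d\mu = \gamma$. As the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied, assuring that $f := \sup g_n$ is a function in $\mathcal{G}$ and that $\int f\,d\mu = \gamma$. All this proves that the function $g \mapsto \int g\,d\mu$ on $\mathcal{G}$ assumes its maximum value at $f$.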
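Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathcal{G}$, and so

\[ \tau := \nu - f\mu \]

is a finite measure on $\mathcal{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the $\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real number

\[ \beta := \frac{1}{2}\,\frac{\tau(\Omega)}{\mu(\Omega)} > 0, \]

which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and $\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathcal{A}$ which satisfies

\[ \tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0 \quad \text{and} \quad \tau(A) \ge \beta\mu(A) \text{ for all } A \in \Omega_0 \cap \mathcal{A}. \]

The $\mathcal{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the property

\[ \int_A f_0\,d\mu = \int_A f\,d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f\,d\mu + \tau(A) = \nu(A) \]

for every $A \in \mathcal{A}$. These inequalities put $f_0$ in $\mathcal{G}$. Since $\tau$ is $\mu$-continuous and $\tau(\Omega_0) > \beta\mu(\Omega_0)$, we must have $\mu(\Omega_0) > 0$, leading to

\[ \int f_0\,d\mu = \int f\,d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma, \]

an inequality which is incompatible with the definition of $\gamma$ and the fact that $f_0 \in \mathcal{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.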
Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into pairwise disjoint sets from $\mathcal{A}$ with the following properties
(a) $A \in \Omega_0 \cap \mathcal{A} \implies$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
(b) $\nu(\Omega_n) < +\infty$ for all $n \in \mathbb{N}$.

To this end let $\mathcal{Q}$ denote the system of all $Q \in \mathcal{A}$ with $\nu(Q) < +\infty$ and define

\[ \alpha := \sup\{\mu(Q) : Q \in \mathcal{Q}\}. \]

This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m\in\mathbb{N}}$ in $\mathcal{Q}$ with $\lim \mu(Q_m) = \alpha$. Since $\mathcal{Q}$ is evidently closed under finite unions, $(Q_m)$ may be assumed to be isotone. $Q_0 := \bigcup_{m\in\mathbb{N}} Q_m$ is then a set from $\mathcal{A}$ satisfying $\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathcal{A}$ with $\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted already, $\mathcal{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathcal{Q}$, so that $\mu(Q_m \cup A) \le \alpha$, and consequently

\[ \mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha. \]

Since $A$ is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers $m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.
Now let An, vn denote the restrictions of , v to the trace a-algebra On fl 8d,
for n = 0, 1.... and note that each vn is a n-continuous measure. Moreover, for
all n > 1 both An and vn are finite. Case 1 therefore supplies Cl,, n 0-measurable
functions fn > 0 on Cl,, with vn = fnn Taking fo to be the constant function +oo
on Sto, vo = foo also holds, thanks to (a). Finally, "putting all the pieces together"
gives our result in this second case. Namely, the function f on Cl defined to coincide
on each Cl,, with fn (n = 0, 1, ...) is non-negative, sad-measurable and satisfies

v=fp.

Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There
is according to 17.6 a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\mu$ is
therefore finite and possesses exactly the same nullsets as does $\mu$. Consequently
$\nu$ is also $(h\mu)$-continuous. By what has already been proved there is then an
$\mathscr{A}$-measurable function $f \ge 0$ on $\Omega$ with $\nu = f(h\mu)$. According to 17.4 $\nu$ then
has the density $fh$ with respect to $\mu$. □
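On a finite set the Radon-Nikodym density can be written down pointwise, which may help make the statement concrete. The following is only a minimal sketch under that assumption: measures are encoded as dictionaries of point masses, and the names `density` and `integrate` are hypothetical helpers introduced for this illustration, not notation from the text.

```python
from fractions import Fraction

def density(mu, nu):
    """Return a density f with nu = f*mu on a finite set.

    mu, nu: dicts mapping each point to a non-negative point mass
    (an illustrative encoding of measures on the power set).
    Raises ValueError if nu charges a mu-nullset, i.e. nu is not mu-continuous.
    """
    f = {}
    for w in mu:
        if mu[w] == 0:
            if nu[w] != 0:
                raise ValueError("nu is not mu-continuous")
            f[w] = Fraction(0)  # the value on a mu-nullset is arbitrary
        else:
            f[w] = Fraction(nu[w]) / Fraction(mu[w])
    return f

def integrate(f, mu, A):
    """Integral of f over the set A with respect to mu."""
    return sum(f[w] * mu[w] for w in A)

mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(0)}
nu = {"a": Fraction(2), "b": Fraction(1), "c": Fraction(0)}
f = density(mu, nu)
```

Integrating `f` against `mu` over any subset reproduces the `nu`-mass of that subset; the freedom in choosing `f["c"]` reflects that a density is determined only up to a nullset.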
The question arises whether, in the situation of Theorem 17.10, the density $f$
of $\nu$ is $\mu$-almost everywhere uniquely determined. From 17.5 we at least get a positive answer when $f$ is $\mu$-integrable, that is, when $\nu$ is a finite measure. But more
is true:

17.11 Theorem. Let $\nu = f\mu$ be a measure having a density $f$ with respect to
a $\sigma$-finite measure $\mu$ on $\mathscr{A}$. Then $f$ is $\mu$-almost everywhere uniquely determined.
The measure $\nu$ is $\sigma$-finite exactly when $f$ is $\mu$-almost everywhere real-valued.
Proof. First we show that $f$ is $\mu$-almost everywhere uniquely determined if the measure $\mu$ is finite. In proving this we may assume that $\nu(\Omega) = +\infty$, since its truth is
otherwise a consequence of the second part of 17.5. Furthermore, as we now find
ourselves in case 2 of the preceding proof, the decomposition of $\Omega$ into $\Omega_0, \Omega_1, \ldots$
employed there lets us confine our attention to $\Omega_0$, as 17.5 takes care of the remaining $\Omega_n$ ($n \in \mathbb{N}$). So it suffices to treat the case $\Omega = \Omega_0$, that is, to assume
that $\mu$ and $\nu$ are linked by the alternative:

$$A \in \mathscr{A} \implies \text{either } \mu(A) = \nu(A) = 0 \ \text{ or } \ 0 < \mu(A) < \nu(A) = +\infty.$$

The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has
to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for
that it suffices to show that

$$\mu(\{f \le n\}) = 0 \quad\text{for each } n \in \mathbb{N},$$

which in turn is a consequence of the above alternative and the inequalities

$$\nu(\{f \le n\}) = \int_{\{f \le n\}} f\,d\mu \le n\,\mu(\{f \le n\}) < +\infty$$

coming from the finiteness of $\mu$.


We will use 17.6 to reduce the general case of $\sigma$-finite $\mu$ to the case just treated.
That lemma supplies a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\nu =
h(f\mu) = f(h\mu)$ has the density $f$ with respect to the finite measure $h\mu$, so $f$ is
$(h\mu)$-almost everywhere uniquely determined. Since the measures $\mu$ and $h\mu$ have
the same nullsets, $f$ is therefore also uniquely determined $\mu$-almost everywhere.

Next, suppose $\nu$ is $\sigma$-finite. From 17.6 once again we get a strictly positive
function $k \in \mathscr{L}^1(\nu)$. Then $k\nu = (fk)\mu$ is a finite measure, that is, $fk$ is $\mu$-integrable, consequently also $\mu$-almost everywhere real-valued. Because $k$ takes
only non-zero real values, this means that $f$ itself is real $\mu$-almost everywhere.
Conversely, suppose that $f$ is $\mu$-almost everywhere real-valued. We want to
see that $\nu$ is $\sigma$-finite. First of all, there is a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$
into a sequence of pairwise disjoint sets from $\mathscr{A}$ each of finite $\mu$-measure. Set
$A_n := \{n - 1 \le f < n\}$ for each $n \in \mathbb{N}$ and $A_0 := \{f = +\infty\}$, the present
hypothesis being just that $\mu(A_0) = 0$. $\Omega = \bigcup_{i,j=0}^{\infty} (\Omega_i \cap A_j)$ is a decomposition of $\Omega$
into a (doubly-indexed) sequence of pairwise disjoint sets from $\mathscr{A}$. If each has finite
$\nu$-measure, this proves that $\nu$ is $\sigma$-finite. Consider any $i \in \mathbb{Z}_+$. Because $\mu(A_0) = 0$
and $\nu = f\mu$, we have $\nu(\Omega_i \cap A_0) \le \nu(A_0) = 0$. Because $\nu = f\mu$ and $f < j$ in $A_j$,
we have $\nu(\Omega_i \cap A_j) \le j\mu(\Omega_i) < +\infty$ for all $j \in \mathbb{N}$ as well. Thus all is proven. □
In the generality presented here Theorem 17.10 was proved in 1930 by O.M. NIKODYM (1888-1974). H. Lebesgue proved the theorem in 1910 for the case where
$\mu$ is the L-B measure $\lambda^1$. J. RADON (1887-1956) pushed things further in a fundamental work which appeared in 1913. So 17.10 is often also called the theorem of
Lebesgue-Radon-Nikodym. The uniquely determined density $f$ in 17.11 is called
the Radon-Nikodym density or the Radon-Nikodym integrand (of $\nu$ with respect
to $\mu$). A beautiful proof of 17.10 by elementary Hilbert-space methods was discovered in 1940 by J. VON NEUMANN (1903-1957) and appears in many textbooks,
e.g., in RUDIN [1987], pp. 130-131.

The history of the result to be presented next, the Lebesgue decomposition
theorem, runs somewhat parallel, Radon and Nikodym having also made significant contributions. We need a concept complementary to $\mu$-continuity, namely
$\mu$-singularity:

17.12 Definition. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mu$ and $\nu$ measures defined
on $\mathscr{A}$. Let us write $\nu \ll \mu$ if $\nu$ is $\mu$-continuous. $\nu$ is said to be singular with respect
to $\mu$ (or $\mu$-singular), written $\nu \perp \mu$, if a set $N \in \mathscr{A}$ exists with $\mu(N) = 0 = \nu(\complement N)$.


It is obvious that the relation $\nu \perp \mu$ is symmetric in $\mu$ and $\nu$, so it is also expressed as: $\mu$ and $\nu$ are singular to each other (or mutually singular). The definition
of $\nu \perp \mu$ expresses the fact that for a suitable $\mu$-nullset $N \in \mathscr{A}$

$$\nu(A) = \nu(A \cap N) \quad\text{for all } A \in \mathscr{A}, \tag{17.10}$$

as follows from $\nu(A) = \nu(A \cap N) + \nu(A \cap \complement N)$ and $\nu(\complement N) = 0$. The condition
that $\nu \perp \mu$ thus says that the measure $\nu$ is "carried by a $\mu$-nullset". From $\nu \ll \mu$
and $\nu \perp \mu$ together follows that $\nu(N) = 0$, and so $\nu = 0$. In this sense the
concepts $\mu$-continuity and $\mu$-singularity are diametral or antipodal. Relative to
L-B measure $\lambda^d$ every Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ obviously satisfies $\lambda^d \perp \varepsilon_x$.

17.13 Theorem (Lebesgue's decomposition theorem). If $\mu$ and $\nu$ are $\sigma$-finite
measures on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$, then $\nu$ can be decomposed in just one way
as $\nu = \nu_c + \nu_s$ with measures $\nu_c$, $\nu_s$ on $\mathscr{A}$ that satisfy $\nu_c \ll \mu$ and $\nu_s \perp \mu$.

$\nu_c$ is called the continuous part of $\nu$ with respect to $\mu$, $\nu_s$ the singular part. The
Radon-Nikodym theorem is applicable to the part $\nu_c$.

Proof. We will carry out the proof in detail only for finite $\mu$ and $\nu$ and indicate in
Exercise 4 how the reader can then handle the general case himself.


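Existence of a decomposition: Let $\mathscr{N}_\mu$ designate the system of all $\mu$-nullsets
in $\mathscr{A}$. Since $\nu(A) \le \nu(\Omega) < +\infty$ for every $A \in \mathscr{A}$,

$$\alpha := \sup\{\nu(A) : A \in \mathscr{N}_\mu\} \tag{17.11}$$

is a real number. Since $\mathscr{N}_\mu$ is closed under countable unions, there exists an isotone
sequence $(A_n)$ in $\mathscr{N}_\mu$ with $\nu(A_n) \uparrow \alpha$. Since $\nu$ is continuous from below, it follows
that

$$\nu(N) = \alpha$$

for the set $N := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{N}_\mu$. We will show that via

$$\nu_c(A) := \nu(A \cap \complement N) \quad\text{and}\quad \nu_s(A) := \nu(A \cap N)$$

two measures are defined on $\mathscr{A}$ that do what is wanted. Evidently $\nu = \nu_c + \nu_s$,
and $\nu_s \perp \mu$ since $N \in \mathscr{N}_\mu$. To prove that $\nu_c \ll \mu$, it must be shown that $\nu(A') = 0$
whenever $A \in \mathscr{N}_\mu$ and $A' := A \cap \complement N$. As a subset of $A \in \mathscr{N}_\mu$ the set $A'$, and then
also the set $A' \cup N$, is $\mu$-null. Therefore $\nu(A' \cup N) \le \alpha$ by definition of $\alpha$. But
$A' \cap N = \emptyset$ and $\nu(N) = \alpha$. Hence

$$\alpha + \nu(A') = \nu(N) + \nu(A') = \nu(N \cup A') \le \alpha,$$

from which follows $\nu(A') = 0$ as desired, since $\alpha$ is finite.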
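Uniqueness of the decomposition: Suppose

$$\nu = \nu_c + \nu_s = \nu_c' + \nu_s' \tag{17.12}$$

are two decompositions of the kind described in the theorem. The measures $\nu_s$, $\nu_s'$
are carried by $\mu$-nullsets $N$, $N'$ in the sense of (17.10), which means that

$$\nu_s(A) = \nu_s(A \cap N) \quad\text{and}\quad \nu_s'(A) = \nu_s'(A \cap N') \quad\text{for all } A \in \mathscr{A}. \tag{17.13}$$

Setting $N_0 := N \cup N'$ gives a set in $\mathscr{N}_\mu$, so that from $\nu_c, \nu_c' \ll \mu$ follows

$$\nu_c(A \cap N_0) = \nu_c'(A \cap N_0) = 0 \quad\text{for every } A \in \mathscr{A}.$$

Therefore (17.12) and (17.13) give

$$\nu(A \cap N_0) = \nu_c(A \cap N_0) + \nu_s(A \cap N_0) = \nu_s(A \cap N_0) = \nu_s(A \cap N_0 \cap N) = \nu_s(A \cap N) = \nu_s(A)$$

for every $A \in \mathscr{A}$. Analogously of course, $\nu(A \cap N_0) = \nu_s'(A)$ for every $A \in \mathscr{A}$. Thus we have $\nu_s = \nu_s'$.
A return to (17.12), recalling that all measures are finite, gives $\nu_c = \nu_c'$ as well. □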
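For measures on a finite set the construction in the existence proof becomes completely explicit: the maximizing nullset $N$ can be taken to be the set of all points of $\mu$-mass zero. The sketch below illustrates this under that finiteness assumption; the dictionary encoding of measures and the function name `lebesgue_decomposition` are hypothetical choices made for the example.

```python
from fractions import Fraction

def lebesgue_decomposition(mu, nu):
    """Split nu into a mu-continuous and a mu-singular part on a finite set.

    mu, nu: dicts of non-negative point masses over the same points
    (an illustrative encoding). Returns (N, nu_c, nu_s), where N is the
    largest mu-nullset, nu_c(A) = nu(A minus N) and nu_s(A) = nu(A meet N).
    """
    N = {w for w in mu if mu[w] == 0}
    nu_c = {w: (Fraction(0) if w in N else nu[w]) for w in nu}
    nu_s = {w: (nu[w] if w in N else Fraction(0)) for w in nu}
    return N, nu_c, nu_s

mu = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(2)}
nu = {"a": Fraction(3), "b": Fraction(5), "c": Fraction(0)}
N, nu_c, nu_s = lebesgue_decomposition(mu, nu)
```

Here `nu_s` lives entirely on the nullset `N = {"b"}`, while `nu_c` vanishes wherever `mu` does, matching the two requirements of the theorem.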

There is a short, elementary proof of 17.13 that does not make use of the
Radon-Nikodym theorem; see Woo [1971].

Exercises.
1. Show that the Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ has no density with respect to $\lambda^d$,
for any $x \in \mathbb{R}^d$. (Physicists occasionally work with such a "symbolic" density $\delta_x$,
calling it the Dirac function at the point $x$. The correct mathematical object is
nevertheless the Dirac measure $\varepsilon_x$.)


2. Show that the relation $\ll$ on the set of measures on a $\sigma$-algebra $\mathscr{A}$ is reflexive
and transitive. The relation $\mu \sim \nu$ defined as $\mu \ll \nu$ and $\nu \ll \mu$ is then an
equivalence relation. Two measures $\mu$ and $\nu$ stand in this relation just when they
have the same nullsets. For $\sigma$-finite measures $\mu$ and $\nu$ on $\mathscr{A}$ show that $\mu \sim \nu$ is
equivalent to $\nu = f\mu$ for a density $f$ which satisfies $0 < f(\omega) < +\infty$ for $\mu$-almost
all (or even for all) $\omega \in \Omega$.

3. On a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ two measures $\mu$ and $\nu$ are related by $\nu \le \mu$. Show
that if further $\mu$ is $\sigma$-finite, then there is an $\mathscr{A}$-measurable function $f$ satisfying
$0 \le f \le 1$ such that $\nu = f\mu$.

4. Lebesgue's decomposition theorem was proved for finite measures $\mu$ and $\nu$. Show
how to infer its validity for $\sigma$-finite measures from this. [Hint: For the existence
proof use 17.6. For the uniqueness proof choose a sequence $(A_n)$ in $\mathscr{A}$ with $A_n \uparrow \Omega$
and $\mu(A_n)$, $\nu(A_n)$ finite for each $n$, and consider the measures $\nu_n(A) := \nu(A \cap A_n)$,
$A \in \mathscr{A}$, $n \in \mathbb{N}$.]

5. Let $\nu = \nu_c + \nu_s$ be the Lebesgue decomposition of a $\sigma$-finite measure $\nu$ on $\mathscr{A}$ with
respect to a $\sigma$-finite measure $\mu$. The singular part $\nu_s$ has the form $\nu_s(A) = \nu(A \cap N)$
for all $A \in \mathscr{A}$ and a suitable $\mu$-nullset $N \in \mathscr{A}$. Show that if $N'$ is any other $\mu$-nullset with this property, then $\mu(N \mathbin{\triangle} N') = \nu(N \mathbin{\triangle} N') = 0$.

6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $\nu = f\mu$ a $\sigma$-finite measure on $\mathscr{A}$ having
density $f$ with respect to $\mu$. Show that this density function is $\mu$-almost everywhere uniquely determined and is $\mu$-almost everywhere real-valued. Show that if $f$
is strictly positive, then $\mu$ itself is $\sigma$-finite.

7. Let $(\Omega, \mathscr{A})$ be a measurable space. For every measure $\mu$ on $\mathscr{A}$ let $\mathscr{N}_\mu$ denote
the $\sigma$-ideal of its nullsets. Show that for any sequence $(\mu_n)_{n \in \mathbb{N}}$ of $\sigma$-finite measures
on $\mathscr{A}$ there is a finite measure $\mu$ on $\mathscr{A}$ for which $\mathscr{N}_\mu = \bigcap_{n \in \mathbb{N}} \mathscr{N}_{\mu_n}$.

8. The set $\Omega := \,]0, +\infty[$ is a group with respect to multiplication. Show that the
measure on $\Omega \cap \mathscr{B}^1$ defined by $\mu := h\lambda^1_\Omega$ with density function $h(x) := 1/x$ is
invariant under each self-mapping $x \mapsto ax$ of $\Omega$ ($a \in \Omega$). $\mu$ is thus the Haar
measure of the group $\Omega$ in the sense of the remark immediately following 8.2.

18*. Signed measures


It is worthwhile turning our attention back to Lemma 17.9. The measure concept
in this book is that formulated in Definition 3.3: Measures are premeasures $\mu$ on
a $\sigma$-algebra $\mathscr{A}$, and so are non-negative $\sigma$-additive functions on $\mathscr{A}$ satisfying the
additional condition $\mu(\emptyset) = 0$. In Lemma 17.9 we encountered a real-valued, $\sigma$-additive function $\varrho$ which is the difference of two finite measures. Similarly for any
$f \in \mathscr{L}^1(\mu)$ the function $A \mapsto \int_A f\,d\mu$ on $\mathscr{A}$ is the difference of two finite measures,
for example $f^+\mu$ and $f^-\mu$.
We will call a real function $\varrho : \mathscr{A} \to \mathbb{R}$ on a $\sigma$-algebra a finite signed measure
if it is $\sigma$-additive in the sense of (3.2), non-negativity not being required. From
$\sigma$-additivity applied to the constant sequence $\emptyset, \emptyset, \ldots$ follows immediately that
(3.1) is also satisfied, that is, $\varrho(\emptyset) = 0$, because $\varrho$ is only allowed to take real
values. A second pass through the proof of Lemma 17.9 will convince the reader
that this lemma is in fact valid for every finite signed measure. As a corollary
we immediately get the following theorem on the existence of a Hahn decomposition
of $\varrho$, a theorem that goes back to H. HAHN (1879-1934).

18.1 Theorem. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$.
Then there are sets $\Omega^+, \Omega^- \in \mathscr{A}$ with $\Omega = \Omega^+ \cup \Omega^-$, $\Omega^+ \cap \Omega^- = \emptyset$, and $\varrho(A) \ge 0$
for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.
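Proof. Set

$$\gamma := \sup\{\varrho(A) : A \in \mathscr{A}\}$$

and choose a sequence $(A_n)$ in $\mathscr{A}$ with $\lim \varrho(A_n) = \gamma$. By applying 17.9 to the
restriction of $\varrho$ to $A_n \cap \mathscr{A}$, we may replace $A_n$ by a set $P_n \in \mathscr{A}$ satisfying $\varrho(P_n) \ge
\varrho(A_n)$ and $\varrho(A) \ge 0$ for all $A \in P_n \cap \mathscr{A}$. We will then have

$$\gamma = \sup\{\varrho(P_n) : n \in \mathbb{N}\}. \tag{18.1}$$

The decomposition of $\Omega$ that is sought can be realized by

$$\Omega^+ := \bigcup_{n \in \mathbb{N}} P_n, \qquad \Omega^- := \Omega \setminus \Omega^+.$$

Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\varrho(A) \ge 0$ because such an $A$ has the form

$$A = \bigcup_{n \in \mathbb{N}} B_n$$

with pairwise disjoint sets $B_n \in P_n \cap \mathscr{A}$ (by the disjointification procedure used
in the verification of (3.10)). From this representation of $A$ and the $\sigma$-additivity
follows $\varrho(A) = \sum_{n=1}^{\infty} \varrho(B_n) \ge 0$. Thus $\varrho$ assumes only non-negative real values
on $\Omega^+ \cap \mathscr{A}$, that is, the restriction of $\varrho$ to $\Omega^+ \cap \mathscr{A}$ is a finite measure. Moreover,
because of $\varrho(P_n) \le \varrho(\Omega^+) \le \gamma$ and (18.1) this measure satisfies

$$\gamma = \varrho(\Omega^+).$$

In particular, $\gamma < +\infty$ since $\varrho$ assumes only real values. $\varrho(A) > 0$ cannot hold for
any $A \in \Omega^- \cap \mathscr{A}$, for otherwise $\varrho(\Omega^+ \cup A) = \varrho(\Omega^+) + \varrho(A) > \gamma$. Thus, $\varrho(A) \le 0$
for all $A \in \Omega^- \cap \mathscr{A}$.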
Measures (in the sense of Definition 3.3) have occasionally been interpreted
as mass distributions on the underlying set $\Omega$. A finite signed measure can be
analogously interpreted as an (electric) charge distribution smeared over $\Omega$. The
foregoing theorem justifies this metaphor by showing that, as with charge in electrostatics, there are two disjoint sets, one carrying all the positive charge, the other
all the negative charge.


From this theorem another important feature of signed measures becomes evident: The difference $\varrho$ in Lemma 17.9 is more than an illustrative example of
a signed measure - it is the typical signed measure:

18.2 Corollary. Every finite signed measure $\varrho$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is the
difference of two finite measures on $\mathscr{A}$.

Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then
evidently

$$\varrho^+(A) := \varrho(A \cap \Omega^+) \quad\text{and}\quad \varrho^-(A) := -\varrho(A \cap \Omega^-), \qquad A \in \mathscr{A},$$

define measures on $\mathscr{A}$, which satisfy $\varrho = \varrho^+ - \varrho^-$, since each $A \in \mathscr{A}$ is the disjoint
union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □
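On a finite set a Hahn decomposition, and with it the two measures of the corollary, can be read off from the signs of the point charges. The following sketch assumes exactly that setting; the dictionary encoding and the names `hahn_decomposition` and `jordan_parts` are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def hahn_decomposition(rho):
    """Return (pos, neg): points of non-negative and of negative charge.

    rho: dict mapping points to real point charges, an illustrative
    encoding of a finite signed measure on a finite set.
    """
    pos = {w for w, charge in rho.items() if charge >= 0}
    neg = set(rho) - pos
    return pos, neg

def jordan_parts(rho):
    """Return (rho_plus, rho_minus) with rho = rho_plus - rho_minus."""
    pos, neg = hahn_decomposition(rho)
    rho_plus = {w: (rho[w] if w in pos else Fraction(0)) for w in rho}
    rho_minus = {w: (-rho[w] if w in neg else Fraction(0)) for w in rho}
    return rho_plus, rho_minus

rho = {"a": Fraction(3, 2), "b": Fraction(-2), "c": Fraction(0), "d": Fraction(-1, 2)}
pos, neg = hahn_decomposition(rho)
rho_plus, rho_minus = jordan_parts(rho)
```

Points of charge zero (such as `"c"`) could be assigned to either side, which illustrates why a Hahn decomposition is unique only up to sets on which the signed measure vanishes identically.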
With this result the circle closes: finite signed measures are nothing more than
the differences of finite measures. It is however possible to dispense with the finiteness hypothesis if $\sigma$-additivity is handled with sufficient care, but we will not go
into this further.
In the final analysis it is because of the preceding corollary that we only consider
measures with non-negative values in this book. Often, to emphasize the distinction
with signed measures, what we call simply measures are called positive measures.

Exercises.
1. Show that every finite signed measure on a $\sigma$-algebra is bounded and assumes
a largest and a smallest value.

2. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\Omega = \Omega_1^+ \cup \Omega_1^-$,
$\Omega = \Omega_2^+ \cup \Omega_2^-$ be two Hahn decompositions for it. Show that $\Omega_1^+ \mathbin{\triangle} \Omega_2^+$ and $\Omega_1^- \mathbin{\triangle} \Omega_2^-$
are totally $\varrho$-nullsets, meaning that $\varrho(N) = 0$ for every $N \in \mathscr{A}$ which is a subset of
either of them. Conclude that to within such totally $\varrho$-nullsets there is only one
Hahn decomposition for $\varrho$.

3. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. Show that the specific
representation $\varrho = \varrho^+ - \varrho^-$ of $\varrho$ as the difference of the two measures on $\mathscr{A}$ which
was produced in the proof of 18.2 is characterized by the following minimality
property: In every representation $\varrho = \varrho_1 - \varrho_2$ as the difference of measures $\varrho_1$, $\varrho_2$
on $\mathscr{A}$, $\varrho_1 = \varrho^+ + \delta$ and $\varrho_2 = \varrho^- + \delta$ for an appropriate finite measure $\delta$ on $\mathscr{A}$,
and indeed if $\Omega = \Omega^+ \cup \Omega^-$ is any Hahn decomposition of $\Omega$ corresponding to $\varrho$,
$\delta = (1_{\Omega^+})\varrho_2 + (1_{\Omega^-})\varrho_1$. (Conversely, of course, every finite non-zero measure $\delta$
on $\mathscr{A}$ generates in this way a different representation of $\varrho$.) Infer that the only
measure $\nu$ on $\mathscr{A}$ which satisfies $\nu(A) \le \min\{\varrho^+(A), \varrho^-(A)\}$ for every $A \in \mathscr{A}$ is
the identically 0 measure. [Remark: The representation $\varrho = \varrho^+ - \varrho^-$ uniquely
determined by this minimality condition is called the Jordan decomposition of the
finite signed measure $\varrho$. As with functions, $\varrho^+$ and $\varrho^-$ are called the positive part
and the negative part of $\varrho$.]


19. Integration with respect to an image measure


Along with the measure space $(\Omega, \mathscr{A}, \mu)$ a measurable space $(\Omega', \mathscr{A}')$ and an
$\mathscr{A}$-$\mathscr{A}'$-measurable mapping

$$T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$$

are given. Then the image measure

$$\mu' := T(\mu)$$

is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:

19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f' \ge 0$ on $\Omega'$

$$\int f'\,dT(\mu) = \int f' \circ T\,d\mu. \tag{19.1}$$

Proof. The non-negative function $f' \circ T$ is $\mathscr{A}$-measurable, by 7.3. The integral on
the right-hand side of (19.1) is therefore defined. To prove the equality there we
first consider only $\mathscr{A}'$-elementary $f'$:

$$f' = \sum_{i=1}^{n} \alpha_i 1_{A_i'}$$

(with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i' \in \mathscr{A}'$). For such $f'$

$$f' \circ T = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$

with $A_i := T^{-1}(A_i')$, so this composite is an $\mathscr{A}$-elementary function. Since

$$T(\mu)(A_i') = \mu(A_i) \qquad (i = 1, \ldots, n)$$

holds by definition of image measures, (19.1) follows in this case. For an arbitrary
$\mathscr{A}'$-measurable $f' \ge 0$ there is an isotone sequence $(u_n')$ of $\mathscr{A}'$-elementary functions
for which $u_n' \uparrow f'$. Then $(u_n' \circ T)$ is a sequence of $\mathscr{A}$-elementary functions for which
$u_n' \circ T \uparrow f' \circ T$. From the validity of (19.1) for the $u_n'$ and Definition 11.3 of the
integral in general, we get (19.1) for $f'$.
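For a measure with finitely many atoms both sides of (19.1) are finite sums, so the identity can be checked directly. The sketch below does this for one example; the dictionary encoding of the measure, the particular mapping `T`, the function `g` (playing the role of $f'$) and the helper names are all assumptions made for the illustration.

```python
from fractions import Fraction
from collections import defaultdict

def image_measure(mu, T):
    """Push the point masses of mu forward along T: T(mu)(A') = mu(T^{-1}(A'))."""
    out = defaultdict(Fraction)  # Fraction() is the exact zero
    for w, mass in mu.items():
        out[T(w)] += mass
    return dict(out)

def integral(f, mu):
    """Integral of f with respect to a measure given by finitely many point masses."""
    return sum(f(w) * mass for w, mass in mu.items())

# Uniform probability measure on four points, mapped by squaring.
mu = {-2: Fraction(1, 4), -1: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4)}
T = lambda w: w * w
g = lambda y: y + 1  # non-negative function on the image space

lhs = integral(g, image_measure(mu, T))    # integral of g with respect to T(mu)
rhs = integral(lambda w: g(T(w)), mu)      # integral of g o T with respect to mu
```

Both `lhs` and `rhs` are computed exactly with `Fraction`, so they can be compared for equality without rounding concerns.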

19.2 Corollary 1. Let $f'$ be an $\mathscr{A}'$-measurable numerical function on $\Omega'$. Then
the $T(\mu)$-integrability of $f'$ entails the $\mu$-integrability of $f' \circ T$, and conversely. In
case of integrability

$$\int f'\,dT(\mu) = \int f' \circ T\,d\mu. \tag{19.2}$$

Proof. From 19.1

$$\int (f')^+\,dT(\mu) = \int (f')^+ \circ T\,d\mu \quad\text{and}\quad \int (f')^-\,dT(\mu) = \int (f')^- \circ T\,d\mu,$$

and of course

$$(f' \circ T)^+ = (f')^+ \circ T \quad\text{and}\quad (f' \circ T)^- = (f')^- \circ T.$$

Both claims therefore follow from the definition of the integral 12.1.

19.3 Corollary 2. Suppose the mapping $T : \Omega \to \Omega'$ is bijective and $\mathscr{A}$-$\mathscr{A}'$-measurable,
with $\mathscr{A}'$-$\mathscr{A}$-measurable inverse $T^{-1}$. Further let $f'$ be a numerical function on $\Omega'$.
Then the $T(\mu)$-integrability of $f'$ is equivalent to the $\mu$-integrability of $f' \circ T$, and
in its presence equality (19.2) prevails.

One has only to note that the integrability of $f' \circ T$ entails the measurability
of $f' \circ T$ and therewith that of $f' = f' \circ T \circ T^{-1}$.
The content of 19.1-19.3 constitutes what is called the "general transformation
theorem for integrals".

As the behavior of the L-B measure with respect to $C^1$-diffeomorphisms is
known from (8.16'), the transformation theorem for Lebesgue integrals follows at
once:

19.4 Theorem. Let $G$, $G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G \to G'$ a $C^1$-diffeomorphism
of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the
function $f' \circ \varphi\,|\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case

$$\int_{G'} f'\,d\lambda^d = \int_G f' \circ \varphi\,|\det D\varphi|\,d\lambda^d. \tag{19.3}$$

Proof. The $\lambda^d$-integrability of $f'$ over $G'$ and that of $f' \circ \varphi\,|\det D\varphi|$ over $G$ means
the $\lambda^d_{G'}$-integrability and the $\lambda^d_G$-integrability of those functions, respectively. According to (8.16')

$$\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi|\,\lambda^d_G;$$

furthermore, the Borel measurability of $f'$ is equivalent to that of $f' \circ \varphi$. According
to 17.3 therefore $f' \circ \varphi$ is integrable with respect to the measure $\nu := |\det D\varphi|\,\lambda^d_G$ if
and only if $f' \circ \varphi\,|\det D\varphi|$ is integrable with respect to $\lambda^d_G$. Consequently the present
claim follows from Corollary 19.3 applied to $T := \varphi^{-1}$, because $f' = f' \circ \varphi \circ \varphi^{-1}$ and

$$\int f'\,d\lambda^d_{G'} = \int f' \circ \varphi\,d\nu = \int f' \circ \varphi\,|\det D\varphi|\,d\lambda^d_G.$$

Because of Theorem 19.1, equality (19.3) holds as well for all non-negative,
Borel measurable, numerical functions on G'.


Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $T : \Omega \to \Omega$ a mapping which together with
its inverse is an $\mathscr{A}$-$\mathscr{A}$-measurable bijection. Show that for every $f \in E(\Omega, \mathscr{A})$ the
image measure $T(f\mu)$ has a density with respect to $T(\mu)$, namely $f \circ T^{-1}$.

2. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space, $T : \Omega \to \Omega$ an $\mathscr{A}$-$\mathscr{A}$-measurable
mapping such that $T^{-1}(A)$ is a $\mu$-nullset whenever $A$ is. Prove the existence of
a measurable function $q \ge 0$ such that

$$\int_{T^{-1}(A)} f \circ T\,d\mu = \int_A f q\,d\mu$$

for all $\mathscr{A}$-measurable numerical functions $f \ge 0$ on $\Omega$, and all $A \in \mathscr{A}$.

20. Stochastic convergence


Let us return to the study of $p$-fold integrable functions begun in §14. Our goal will
be to replace the almost-everywhere convergence concept that underlies the theorems proved there with a weaker convergence concept. It is suggested by a simple
but very useful inequality.

The setting is once again an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$.

20.1 Lemma. For every measurable numerical function $f$ on $\Omega$ and every pair of
real numbers $p > 0$ and $\alpha > 0$ the Chebyshev-Markov inequality

$$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha^p} \int |f|^p\,d\mu \tag{20.1}$$

holds.

For $p = 2$ this is also known simply as Chebyshev's inequality.

Proof. The set $A_\alpha := \{|f| \ge \alpha\}$ lies in $\mathscr{A}$ and

$$\int |f|^p\,d\mu \ge \int_{A_\alpha} |f|^p\,d\mu \ge \int_{A_\alpha} \alpha^p\,d\mu = \alpha^p \mu(A_\alpha),$$

which is what (20.1) claims.
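The inequality is easy to check by hand on a measure with finitely many atoms. The sketch below computes both sides of (20.1) exactly for one such example; the dictionary encoding, the helper name `chebyshev_markov_sides` and the restriction to an integer exponent `p` (so that all arithmetic stays rational) are assumptions made for the illustration.

```python
from fractions import Fraction

def chebyshev_markov_sides(f, mu, p, alpha):
    """Return (lhs, rhs) of mu({|f| >= alpha}) <= alpha**(-p) * integral of |f|**p.

    f, mu: dicts over the same finite set (function values and point masses);
    p: a positive integer exponent, alpha: a positive rational threshold.
    """
    lhs = sum(mass for w, mass in mu.items() if abs(f[w]) >= alpha)
    rhs = sum(abs(f[w]) ** p * mass for w, mass in mu.items()) / alpha ** p
    return lhs, rhs

mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
f = {0: Fraction(0), 1: Fraction(1), 2: Fraction(3)}
lhs, rhs = chebyshev_markov_sides(f, mu, p=2, alpha=Fraction(2))
```

In this example only the point `2` satisfies $|f| \ge 2$, so the left side is the mass of that single atom, while the right side also picks up the contribution of the point `1`.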

Therefore if $\int |f|^p\,d\mu$ is finite, which when $p \ge 1$ means just that $f$ is $p$-fold
integrable, it follows from (20.1) that

$$\lim_{\alpha \to +\infty} \mu(\{|f| \ge \alpha\}) = 0. \tag{20.2}$$

One can also study the dependence on $n \in \mathbb{N}$ of the measures of the sets
$\{|f_n - f| \ge \alpha\}$ when $f, f_1, f_2, \ldots$ are measurable real functions. That leads to
the aforementioned new convergence concept.


20.2 Definition. A sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ is said to
be ($\mu$-)stochastically convergent (or to be convergent in $\mu$-measure) to a measurable
real function $f$ on $\Omega$, if for each real number $\alpha > 0$ and each $A \in \mathscr{A}$ of finite
measure

$$\lim_{n \to \infty} \mu(\{|f_n - f| \ge \alpha\} \cap A) = 0. \tag{20.3}$$

In this case we also write

$$\mu\text{-}\lim_{n \to \infty} f_n = f \tag{20.4}$$

and call $f$ a ($\mu$-)stochastic limit of the sequence $(f_n)$.

Remarks. 1. For a finite measure $\mu$ we may take $A = \Omega$ in (20.3) and in this case
stochastic convergence of $(f_n)$ to $f$ is equivalent to the requirement

$$\lim_{n \to \infty} \mu(\{|f_n - f| \ge \alpha\}) = 0 \quad\text{for every } \alpha > 0. \tag{20.5}$$

The more complicated condition (20.3) is dictated by the desire to treat infinite,
and especially $\sigma$-finite, measures as well as finite ones.
2. For $\sigma$-finite measures $\mu$ the stochastic convergence of a sequence $(f_n)$ to $f$ is
generally not equivalent to (20.5), as the next example illustrates.

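Example. 1. Let $\Omega := \mathbb{N}$, $\mathscr{A} := \mathscr{P}(\mathbb{N})$, $\mu$ the measure (obviously $\sigma$-finite) defined
on $\mathscr{A}$ by the equations

$$\mu(\{n\}) = \frac{1}{n} \quad\text{for every } n \in \mathbb{N}$$

and the requirement of $\sigma$-additivity. With $A_n := \{n, n + 1, \ldots\}$ and $f_n := 1_{A_n}$, for
each $n \in \mathbb{N}$, the sequence $(f_n)$ converges stochastically to 0: For every $\alpha \in \,]0, 1[$,
$\{f_n \ge \alpha\} = A_n$, and since $A_n \downarrow \emptyset$, it follows from 3.2 that $\lim \mu(A_n \cap A) = 0$
for every $A \in \mathscr{A}$ having finite measure. On the other hand, $\mu(A_n) = +\infty$ for
every $n \in \mathbb{N}$.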

Remark. 3. Let $f$ be a stochastic limit of a sequence $(f_n)$ and consider any
measurable real function $f^*$ on $\Omega$. If $f^* = f$ $\mu$-almost everywhere in every $A \in \mathscr{A}$
which has finite measure, then $f^*$ is also a stochastic limit of the sequence $(f_n)$.
This is because the sets

$$\{|f_n - f^*| \ge \alpha\} \cap A \quad\text{and}\quad \{|f_n - f| \ge \alpha\} \cap A$$

differ from each other only in an ($n$-independent) nullset.
The converse of this is important:

20.3 Theorem. For every $\sigma$-finite measure $\mu$, any two stochastic limits of a sequence of measurable real functions are $\mu$-almost everywhere equal to each other.


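Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle
inequality in $\mathbb{R}$

$$\{|f - f^*| \ge \alpha\} \subset \{|f_n - f| \ge \alpha/2\} \cup \{|f_n - f^*| \ge \alpha/2\},$$

whence

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha/2\} \cap A) + \mu(\{|f_n - f^*| \ge \alpha/2\} \cap A)$$

for every $n \in \mathbb{N}$ and every $A \in \mathscr{A}$. Letting $n \to \infty$ shows that

$$\mu(\{|f - f^*| \ge \alpha\} \cap A) = 0$$

for every $\alpha > 0$ and every $A \in \mathscr{A}$ of finite measure. Then however, $f = f^*$ $\mu$-almost
everywhere in every such set $A$, since

$$\{f \ne f^*\} \cap A = \bigcup_{k \in \mathbb{N}} \{|f - f^*| \ge 1/k\} \cap A$$

is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies
$\mu(A_n) < +\infty$ for all $n$ and $A_n \uparrow \Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$
follows. □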
To supplement this fact we mention:

Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost
everywhere equal without any hypotheses on the measure itself if both functions
are $p$-fold integrable for some $p \in [1, +\infty[$. This is because for every real $\alpha > 0$ the
set $\{|f - f^*| \ge \alpha\}$ has finite measure, by (20.1), and so $f = f^*$ $\mu$-almost everywhere in this set, whence $\{|f - f^*| > 0\} = \bigcup_{n \in \mathbb{N}} \{|f - f^*| \ge 1/n\}$ is a countable
union of $\mu$-nullsets. This just says that $f = f^*$ $\mu$-almost everywhere in $\Omega$. But the
next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.

Example. 2. Consider the measure space $(\Omega, \mathscr{P}(\Omega), \mu)$, where $\Omega$ consists of exactly
two elements $\omega_0, \omega_1$ and $\mu(\{\omega_0\}) = 0$, $\mu(\{\omega_1\}) = +\infty$, $f_n = f = 0$ for every $n \in \mathbb{N}$.
These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically
to $f$, as well as to every real-valued function $f^*$ on $\Omega$. Every such $f^*$ which is
non-zero at $\omega_1$, however, lies in no $\mathscr{L}^p(\mu)$ with $1 \le p < +\infty$ and fails to coincide
$\mu$-almost everywhere in $\Omega$ with $f$.
The considerations with which we began this section lead to an important class
of stochastically convergent sequences:

20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function
$f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, then it also converges to $f$ $\mu$-stochastically.

Proof. The Chebyshev-Markov inequality tells us that

$$\mu(\{|f_n - f| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha\}) \le \alpha^{-p} \int |f_n - f|^p\,d\mu$$

holds for every $n \in \mathbb{N}$, every $\alpha > 0$ and every $A \in \mathscr{A}$. The claimed stochastic
convergence, that is, the convergence to 0 of the left end of this chain as $n \to \infty$,
follows because $\int |f_n - f|^p\,d\mu \to 0$ as $n \to \infty$ is the definition of convergence
in $p$th mean. □
The proof shows that convergence in $p$th mean actually entails the stronger
form of stochastic convergence in (20.5). The situation is different when the given
sequence is almost everywhere convergent. (On this point cf. also Remark 5.)

20.5 Theorem. If a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ converges
$\mu$-almost everywhere in $\Omega$ - or even just $\mu$-almost everywhere in each set $A \in \mathscr{A}$
of finite measure - to a measurable real function $f$ on $\Omega$, then this sequence also
converges $\mu$-stochastically to $f$.

Proof. For every $\alpha > 0$,

$$\{|f_n - f| \ge \alpha\} \subset \Bigl\{\sup_{m \ge n} |f_m - f| \ge \alpha\Bigr\}$$

and so

$$\mu(\{|f_n - f| \ge \alpha\} \cap A) \le \mu\Bigl(\Bigl\{\sup_{m \ge n} |f_m - f| \ge \alpha\Bigr\} \cap A\Bigr)$$

for every $A \in \mathscr{A}$. The present claim therefore follows from our next lemma, applied
to the restriction of $\mu$ to $A \cap \mathscr{A}$ for each $A$ of finite measure. □


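20.6 Lemma. If the measure $\mu$ is finite, then each of the following three conditions
on a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions is equivalent to $(f_n)$ converging
$\mu$-almost everywhere to 0:

$$\lim_{n \to \infty} \mu\Bigl(\Bigl\{\sup_{m \ge n} |f_m| \ge \alpha\Bigr\}\Bigr) = 0 \quad\text{for every } \alpha > 0, \tag{20.6}$$

$$\lim_{n \to \infty} \mu\Bigl(\Bigl\{\sup_{m \ge n} |f_m| > \alpha\Bigr\}\Bigr) = 0 \quad\text{for every } \alpha > 0, \tag{20.6'}$$

$$\mu\Bigl(\limsup_{n \to \infty} \{|f_n| > \alpha\}\Bigr) = 0 \quad\text{for every } \alpha > 0. \tag{20.7}$$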

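Proof. To prove the equivalence of (20.6) with the almost everywhere convergence
of $(f_n)$ to 0, we set, for each $\alpha > 0$ and each $n \in \mathbb{N}$,

$$A_n^\alpha := \Bigl\{\sup_{m \ge n} |f_m| \ge \alpha\Bigr\}.$$

Obviously both $n \mapsto A_n^\alpha$ and $\alpha \mapsto A_n^\alpha$ are antitone mappings; then $k \mapsto A_n^{1/k}$ is
isotone on $\mathbb{N}$. If we also set

$$A := \{\omega \in \Omega : \lim_{n \to \infty} f_n(\omega) = 0\} = \{\omega \in \Omega : \limsup_{n \to \infty} |f_n|(\omega) = 0\},$$

then this set lies in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha \in \mathscr{A}$ and

$$A = \bigcap_{k \in \mathbb{N}} \bigcup_{n \in \mathbb{N}} \complement A_n^{1/k}.$$

Passing to complements,

$$\complement A = \bigcup_{k \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} A_n^{1/k}$$

and so

$$\bigcap_{n \in \mathbb{N}} A_n^{1/k} \uparrow \complement A \ \text{ as } k \to \infty, \quad\text{and}\quad A_n^{1/k} \downarrow \bigcap_{m \in \mathbb{N}} A_m^{1/k} \ \text{ as } n \to \infty.$$

Consequently,

$$\mu(\complement A) = \sup_{k \in \mathbb{N}} \mu\Bigl(\bigcap_{n \in \mathbb{N}} A_n^{1/k}\Bigr) = \sup_{k \in \mathbb{N}} \inf_{n \in \mathbb{N}} \mu(A_n^{1/k}) \tag{20.8}$$

because the finite measure $\mu$ is both continuous from above and continuous from
below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number
defined by (20.8) is 0. In turn, the latter occurs exactly in case

$$\inf_{n \in \mathbb{N}} \mu(A_n^{1/k}) = \lim_{n \to \infty} \mu(A_n^{1/k}) = 0$$

for every $k \in \mathbb{N}$. The first equivalence follows from this. The equivalence of (20.6)
with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$

$$\{g > \alpha\} \subset \{g \ge \alpha\} \subset \{g > \alpha'\}$$

whenever $0 < \alpha' < \alpha$.
Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every
$\alpha > 0$, of the equality

$$\lim_{n \to \infty} \mu\Bigl(\Bigl\{\sup_{m \ge n} |f_m| > \alpha\Bigr\}\Bigr) = \mu\Bigl(\limsup_{n \to \infty} \{|f_n| > \alpha\}\Bigr), \tag{20.9}$$

for the proof of which we introduce

$$B_n := \bigcup_{m \ge n} \{|f_m| > \alpha\} \quad\text{and}\quad B := \limsup_{n \to \infty} \{|f_n| > \alpha\}.$$

On the one hand, $B_n \downarrow B$ and consequently $\lim \mu(B_n) = \mu(B)$. On the other hand,
however,

$$B_n = \bigcup_{m \ge n} \{|f_m| > \alpha\} = \Bigl\{\sup_{m \ge n} |f_m| > \alpha\Bigr\}.$$

From this finally we get the needed (20.9). □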


The conditions involved in Theorems 20.4 and 20.5 are indeed sufficient to
insure stochastic convergence, but they are not necessary for it, as the following
examples show.


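Examples. 3. Let $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$, a finite measure. With
$A_n := \,]0, 1/n[ \,\in \mathscr{A}$, the sequence $(n 1_{A_n})_{n \in \mathbb{N}}$ converges to 0 at every point of $\Omega$
and so, either by appeal to 20.5 or by virtue of

$$\mu(\{n 1_{A_n} \ge \alpha\}) = \mu(A_n) = \frac{1}{n} \quad\text{whenever } 0 < \alpha \le n \in \mathbb{N},$$

this sequence also converges stochastically to 0. By contrast

$$\int (n 1_{A_n})^p\,d\mu = n^p \mu(A_n) = n^{p-1}$$

shows that the sequence does not converge to 0 in $p$th mean for any $p \ge 1$.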
4. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of the preceding example. Write each $n \in \mathbb{N}$
as $n = 2^h + k$ with non-negative integers $h$ and $k$ satisfying $0 \le k < 2^h$ (which
uniquely determines them) and set

$$A_n := [k 2^{-h}, (k + 1) 2^{-h}[, \qquad f_n := 1_{A_n}, \qquad n \in \mathbb{N}.$$

It was shown in the example in §15 that the sequence $(f_n(\omega))_{n \in \mathbb{N}}$ converges for
no $\omega \in \Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since
for every $\alpha > 0$ and $n \in \mathbb{N}$

$$\mu(\{|f_n| \ge \alpha\}) \le 2^{-h} < \frac{2}{n}.$$

In this example stochastic convergence can also be inferred from 20.4, since the
example in §15 showed that $(f_n)$ converges to 0 in $p$th mean for every $p \in [1, +\infty[$.
The connection between stochastic convergence and almost-everywhere convergence is nevertheless closer than one would be led to suspect on the basis of the
last example.
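The dyadic intervals of Example 4 are easy to generate, which makes both features of the example visible: the interval lengths shrink below $2/n$, yet every point is covered once in each dyadic generation. The following is a small sketch of this; the function names `dyadic_interval`, `interval_length` and `f` are hypothetical helpers, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def dyadic_interval(n):
    """Endpoints (a, b) of A_n = [k 2^-h, (k+1) 2^-h[ where n = 2^h + k, 0 <= k < 2^h."""
    h = n.bit_length() - 1  # the largest h with 2**h <= n
    k = n - 2 ** h
    return Fraction(k, 2 ** h), Fraction(k + 1, 2 ** h)

def interval_length(n):
    """Lebesgue measure of A_n, namely 2^-h."""
    a, b = dyadic_interval(n)
    return b - a

def f(n, w):
    """Indicator function of A_n evaluated at the point w."""
    a, b = dyadic_interval(n)
    return 1 if a <= w < b else 0

# Within each generation h the intervals A_n (2^h <= n < 2^(h+1)) partition [0, 1[,
# so f(n, w) equals 1 for exactly one n per generation and 0 for the others.
w = Fraction(1, 3)
hits_per_generation = [sum(f(n, w) for n in range(2 ** h, 2 ** (h + 1))) for h in range(8)]
```

The list `hits_per_generation` records, for each generation, how often the point `w` is hit; a value of 1 in every generation means the sequence of values at `w` takes both 0 and 1 infinitely often and so cannot converge.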

20.7 Theorem. If a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions converges
$\mu$-stochastically to a measurable real function $f$, then for every $A \in \mathscr{A}$ of finite
$\mu$-measure some subsequence of $(f_n)$ converges to $f$ $\mu$-almost everywhere in $A$.
Proof. For $A \in \mathscr{A}$ with $\mu(A) < +\infty$, the measure $\mu_A$, which is the restriction of $\mu$
to $A \cap \mathscr{A}$, is finite. It therefore suffices to deal with the case of a finite measure $\mu$;
moreover, in that case we can simply take $A$ to be $\Omega$ itself.
For $\alpha > 0$ and $m, n \in \mathbb{N}$ the triangle inequality shows that

$$\{|f_m - f_n| \ge \alpha\} \subset \{|f_m - f| \ge \alpha/2\} \cup \{|f_n - f| \ge \alpha/2\};$$

thus by hypothesis $\mu(\{|f_m - f_n| \ge \alpha\})$ can be made arbitrarily small by taking $m$
and $n$ sufficiently large. If therefore $(\eta_k)_{k \in \mathbb{N}}$ is a sequence of positive real numbers
with

$$\sum_{k=1}^{\infty} \eta_k < +\infty,$$

then for each $k \in \mathbb{N}$ there is an $n_k \in \mathbb{N}$ such that

$$\mu(\{|f_m - f_{n_k}| \ge \eta_k\}) \le \eta_k \quad\text{for all } m \ge n_k.$$

Clearly the sequence $(n_k)_{k \in \mathbb{N}}$ can be chosen strictly isotone: $n_k < n_{k+1}$ for every
$k \in \mathbb{N}$. If now we set

$$A_k := \{|f_{n_{k+1}} - f_{n_k}| \ge \eta_k\}, \qquad k \in \mathbb{N},$$

then

$$\sum_{k=1}^{\infty} \mu(A_k) \le \sum_{k=1}^{\infty} \eta_k < +\infty,$$

and consequently,

$$\lim_{n \to \infty} \sum_{k=n}^{\infty} \mu(A_k) = 0.$$

From this it follows that the set $A := \limsup_{n \to \infty} A_n$ satisfies

$$\mu(A) = 0,$$

because $A \subset \bigcup_{k \ge n} A_k$ for every $n \in \mathbb{N}$, entailing that $\mu(A) \le \sum_{k=n}^{\infty} \mu(A_k)$ for every $n$.
The definition of $A$ shows that if $\omega \in \complement A$, then the inequality

$$|f_{n_{k+1}}(\omega) - f_{n_k}(\omega)| \ge \eta_k$$

prevails for at most finitely many $k \in \mathbb{N}$. Therefore, along with the series $\sum \eta_k$,
the series

$$\sum_{k=1}^{\infty} \bigl(f_{n_{k+1}}(\omega) - f_{n_k}(\omega)\bigr)$$

converges (absolutely); that is, the sequence $(f_{n_k}(\omega))_{k \in \mathbb{N}}$ converges in $\mathbb{R}$. In summary, the sequence $(f_{n_k})$ converges almost everywhere to a measurable real function $f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k \in \mathbb{N}}$. But, as a subsequence of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence
by 20.3, $f = f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k \in \mathbb{N}}$ converges almost everywhere to $f$. □


In terms of almost-everywhere convergence we can now even characterize stochastic convergence by a subsequence principle.

20.8 Corollary. A sequence $(f_n)$ of measurable real functions on $\Omega$ converges $\mu$-stochastically to a measurable real function $f$ on $\Omega$ if and only if for each $A \in \mathscr{A}$ of
finite measure, each subsequence $(f_{n_k})_{k \in \mathbb{N}}$ of $(f_n)$ contains a further subsequence
which converges to $f$ $\mu$-almost everywhere in $A$.

Proof. The preceding theorem establishes that the subsequence condition is necessary for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$ likewise converges stochastically to $f$. Let us now assume that the subsequence condition is fulfilled, and fix an $A \in \mathscr{A}$ of finite measure. Since every subsequence $(f_{n_k})$ contains another which converges almost everywhere in $A$ to $f$ and by 20.5 this latter subsequence must also converge (in $A$) stochastically to $f$, we see that in the sequence of numbers

$$\mu(\{|f_{n_k} - f| \ge \alpha\} \cap A) \qquad (k \in \mathbb{N}),$$

in which $\alpha > 0$ is fixed, a subsequence exists which converges to 0. But, as an easy argument confirms, a sequence of real numbers all of whose subsequences have this property must itself converge to 0. That is, the sequence of real numbers

$$\mu(\{|f_n - f| \ge \alpha\} \cap A) \qquad (n \in \mathbb{N})$$

converges to 0. As this is true of every $A \in \mathscr{A}$ having finite measure and every $\alpha > 0$, the stochastic convergence of $(f_n)$ to $f$ is thereby confirmed. □
Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the
finite-measure set $A \in \mathscr{A}$ can be stricken. This is already illustrated by Example 2

if one replaces the sequence (fn) there with the sequence (f) defined by f,, :_
nl(,,,, ), n E N. This new sequence also converges stochastically to f := 0. See
however Exercise 5.

6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable real function on $\Omega$ is the condition

$$\lim_{m,n \to \infty} \mu(\{|f_m - f_n| \ge \alpha\}) = 0 \qquad \text{for every } \alpha > 0.$$

7. The sequence formed by alternately taking terms from each of two stochastically convergent sequences whose limit functions do not coincide almost everywhere shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some subsequence of the full sequence $(f_n)$ converge almost everywhere.
A particularly useful consequence of 20.8 is:

20.9 Theorem. If the sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$, and $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then the sequence $(\varphi \circ f_n)_{n \in \mathbb{N}}$ converges stochastically to $\varphi \circ f$.

Proof. One exploits both directions of 20.8, noting that from the almost everywhere convergence of a subsequence $(f_{n_k})$ to $f$ on an $A \in \mathscr{A}$ follows the almost everywhere convergence of $(\varphi \circ f_{n_k})$ to $\varphi \circ f$ on $A$. □

The general question of functions $\varphi : \mathbb{R} \to \mathbb{R}$ which preserve convergence, in the sense that $(\varphi \circ f_n)_{n \in \mathbb{N}}$ inherits the kind of convergence $(f_n)_{n \in \mathbb{N}}$ has, is investigated by BARTLE and JOICHI [1961]. They show how Theorem 20.9 can fail if the more restrictive definition (20.5) is adopted for stochastic convergence.


Exercises.
1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions, having limit functions $f$ and $g$, respectively. Show that for all $\alpha, \beta \in \mathbb{R}$ the sequence $(\alpha f_n + \beta g_n)$ converges stochastically to $\alpha f + \beta g$, and the sequences $(f_n \wedge g_n)$, $(f_n \vee g_n)$ converge stochastically to $f \wedge g$, $f \vee g$, respectively.
2. For a measure space $(\Omega, \mathscr{A}, \mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudometric on $\mathscr{A}$ constructed in Exercise 7 of §3. Show that a sequence $(A_n)$ in $\mathscr{A}$ is $d_\mu$-convergent to $A \in \mathscr{A}$ if and only if the sequence $(1_{A_n})$ of indicator functions converges stochastically to the indicator function $1_A$.
3. For every pair of measurable real functions $f$ and $g$ on a measure space $(\Omega, \mathscr{A}, \mu)$ with finite measure define

$$D_\mu(f, g) := \inf\{\varepsilon > 0 : \mu(\{|f - g| \ge \varepsilon\}) \le \varepsilon\}$$

and then prove that
(a) $D_\mu$ is a pseudometric on the set $M(\mathscr{A})$ of all measurable real functions.
(b) A sequence $(f_n)$ in $M(\mathscr{A})$ converges stochastically to $f \in M(\mathscr{A})$ if and only if $\lim_{n \to \infty} D_\mu(f_n, f) = 0$.
(c) $M(\mathscr{A})$ is $D_\mu$-complete, that is, every $D_\mu$-Cauchy sequence in $M(\mathscr{A})$ converges with respect to $D_\mu$ to some function in $M(\mathscr{A})$.
What is the relation of $D_\mu$ to the $d_\mu$ of Exercise 2?
4. In the context of Exercise 3 define

If - gi

dp,

for every pair of functions $f, g \in M(\mathscr{A})$. Show that $D$ also enjoys the properties (a)-(c) proved for $D_\mu$ in the preceding exercise.
5. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space. Show that a sequence $(f_n)$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is stochastically convergent. Choose a sequence $(A_k)$ from $\mathscr{A}$ with $\mu(A_k) < +\infty$ for each $k$ and $A_k \uparrow \Omega$, and consider the finite measures $\mu_k(A) := \mu(A \cap A_k)$ on $\mathscr{A}$. The claim is true of each measure $\mu_k$. Given a subsequence $\Phi$ of $(f_n)$, there is for each $k \in \mathbb{N}$ a subsequence $(g_n^{(k)})_{n \in \mathbb{N}}$ of $\Phi$ which converges $\mu_k$-almost everywhere to $f$. It can be arranged that $(g_n^{(k+1)})$ is a subsequence of $(g_n^{(k)})$ for each $k$. Then the diagonal subsequence $(g_n^{(n)})_{n \in \mathbb{N}}$ does what is wanted.]
6. Give an "elementary" proof of 20.9 based directly on the relevant definition 20.2. To this end, show that for each $\varepsilon \in \,]0, 1[$ there exists $\delta > 0$ such that

$$\{|f| \le 1/\varepsilon\} \cap \{|f_n - f| \le \delta\} \subset \{|\varphi \circ f_n - \varphi \circ f| \le \varepsilon\} \qquad \text{for all } n \in \mathbb{N}.$$


7. (Theorem of D.F. EGOROV (1869-1931)) Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with finite measure $\mu$. Show that: For every sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ its convergence almost everywhere to a measurable real function $f$ is equivalent to its so-called almost-uniform convergence to $f$. The latter means that for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ such that $\mu(A_\delta) < \delta$ and $(f_n)$ converges to $f$ uniformly on $\complement A_\delta$. [Hint: Exercise 2 of §11.]
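
For readers who want to experiment with the distance of Exercise 3, here is a minimal Python sketch on a finite measure space given by point masses. It assumes the definition is read as $D_\mu(f, g) = \inf\{\varepsilon > 0 : \mu(\{|f - g| \ge \varepsilon\}) \le \varepsilon\}$; the function names and the bisection approach are illustrative choices, not part of the text. Bisection is legitimate because the set of admissible $\varepsilon$ is an up-set: if $\varepsilon$ qualifies, so does every larger value.

```python
def exceedance(diffs, weights, eps):
    """mu({|f - g| >= eps}) on a finite space with point masses `weights`."""
    return sum(w for d, w in zip(diffs, weights) if d >= eps)

def ky_fan_distance(f, g, weights, tol=1e-9):
    """Approximate inf{eps > 0 : mu({|f - g| >= eps}) <= eps} by bisection."""
    diffs = [abs(a - b) for a, b in zip(f, g)]
    lo, hi = 0.0, max(sum(weights), tol)  # eps = mu(Omega) always qualifies
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if exceedance(diffs, weights, mid) <= mid:
            hi = mid   # mid qualifies, so the infimum is at most mid
        else:
            lo = mid   # mid fails, so the infimum is at least mid
    return hi

weights = [0.25, 0.25, 0.25, 0.25]          # uniform probability on four points
d_example = ky_fan_distance([0, 1, 0.1, 2], [0, 1, 0, 0], weights)
d_small = ky_fan_distance([0.1] * 4, [0.0] * 4, weights)
```

In the first example the differences are 0, 0, 0.1 and 2, so the exceedance measure is 0.5 up to $\varepsilon = 0.1$ and 0.25 up to $\varepsilon = 2$; the infimum is therefore 0.25. In the second, a uniform difference of 0.1 gives a distance of 0.1, in line with part (b).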

21. Equi-integrability
The sufficient condition for convergence in $p$th mean which is set out in Lebesgue's dominated convergence theorem can be transformed into a necessary as well as sufficient condition with the help of stochastic convergence. But we need the concept of equi-integrability, which is of fundamental significance.

In the following $(\Omega, \mathscr{A}, \mu)$ will again be an arbitrary measure space, and $p$ is always a real number satisfying $1 \le p < +\infty$.
The point of departure is a simple observation. A measurable numerical function $f$ on $\Omega$ is integrable if and only if for every $\varepsilon > 0$ there is a non-negative integrable function $g = g_\varepsilon$ such that

$$\int_{\{|f| \ge g\}} |f| \, d\mu \le \varepsilon. \tag{21.1}$$

For if $f$ is integrable and we take, as we then may, $g$ to be $2|f|$, then $\{|f| \ge g\} = \{f = 0\} \cup \{|f| = +\infty\}$ and thanks to 13.6 the integral in (21.1) is actually equal to 0. Conversely, if we have (21.1) even for just one real $\varepsilon > 0$, then

$$\int |f| \, d\mu = \int_{\{|f| \ge g\}} |f| \, d\mu + \int_{\{|f| < g\}} |f| \, d\mu \le \varepsilon + \int g \, d\mu < +\infty$$

and hence $f$ is integrable.


This observation induces us to make

21.1 Definition. A set $M$ of $\mathscr{A}$-measurable numerical functions on $\Omega$ is called ($\mu$-)equi-integrable if for every $\varepsilon > 0$ there exists a $\mu$-integrable function $g = g_\varepsilon \ge 0$ on $\Omega$ such that every $f \in M$ satisfies

$$\int_{\{|f| \ge g\}} |f| \, d\mu \le \varepsilon. \tag{21.2}$$

Correspondingly a family $(f_i)_{i \in I}$ of measurable numerical functions on $\Omega$ is called equi-integrable if the set $\{f_i : i \in I\}$ is equi-integrable. Equi-integrable sets and families are sometimes also called "uniformly integrable".
From now on, any function $g_\varepsilon$ as described in Definition 21.1 will be called an $\varepsilon$-bound for the given set of functions. Obviously, along with an $\varepsilon$-bound $g$ for a set of functions, any integrable $g' \ge g$ is also an $\varepsilon$-bound.

Examples. 1. If $M_1, \ldots, M_n$ are finitely many $\mu$-equi-integrable sets of measurable functions on $\Omega$, then their union is also $\mu$-equi-integrable, because whenever $g_j$ is an $\varepsilon$-bound for $M_j$ ($j = 1, \ldots, n$), then $g_1 \vee \ldots \vee g_n$ is an $\varepsilon$-bound for $M_1 \cup \ldots \cup M_n$.

2. Every finite set of $\mu$-integrable functions is $\mu$-equi-integrable. This follows from Example 1 and the fact, demonstrated in the course of proving (21.1), that any set consisting of just one integrable function $f$ is equi-integrable, the function $2|f|$ being an $\varepsilon$-bound for every $\varepsilon > 0$.

3. Suppose $M$ is a set of measurable numerical functions on $\Omega$, $1 \le p < +\infty$, and there is a $p$-fold $\mu$-integrable majorant $g$ for $M$, that is, every $f \in M$ satisfies

$$|f| \le g \qquad \mu\text{-almost everywhere}.$$

Then the set

$$M^p := \{|f|^p : f \in M\}$$

is equi-integrable. Indeed, as in Example 2, the single integrable function $h := 2g^p$ is an $\varepsilon$-bound for every $\varepsilon > 0$, since by 13.6

$$\int_{\{|f|^p \ge h\}} |f|^p \, d\mu \le \int_{\{g^p \ge h\}} g^p \, d\mu = \int_{\{g = +\infty\}} g^p \, d\mu = 0.$$

This example shows that Theorem 15.6 on dominated convergence is really about an equi-integrable set of functions. Of course, one cannot expect that conversely from the equi-integrability of a subset of $\mathscr{L}^1(\mu)$ there should follow the existence of a single integrable majorant for the set. The following example confirms this.

4. Consider the probability space $(\mathbb{N}, \mathscr{P}(\mathbb{N}), \mu)$, the finite measure being specified by $\mu(\{n\}) = 2^{-n}$ for each $n \in \mathbb{N}$. The sequence of functions $f_n := 2^n n^{-1} 1_{\{n\}}$ ($n \in \mathbb{N}$) is equi-integrable: For the constant function $1 \in \mathscr{L}^1(\mu)$ the inequality

fn d <

holds for all n E N.

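However, the smallest function $g$ which majorizes every $f_n$ is the non-$\mu$-integrable function $n \mapsto 2^n n^{-1}$ on $\mathbb{N}$.

5. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of Example 3, §20, and $(f_n)_{n \in \mathbb{N}}$ the sequence of functions considered there: $A_n := [0, \tfrac{1}{n}[$ and $f_n := n 1_{A_n}$ for each $n \in \mathbb{N}$. This sequence is not equi-integrable, which we see as follows: for every integrable $g \ge 0$ and every $n \in \mathbb{N}$

$$\int_{\{|f_n| \ge g\}} |f_n| \, d\mu = \int_{A_n \cap \{g \le n\}} n \, d\mu = \int_{A_n} n \, d\mu - \int_{A_n \cap \{g > n\}} n \, d\mu \ge 1 - \int_{A_n} g \, d\mu.$$

From the finiteness of the measure $g\mu$ and the fact that $A_n \downarrow \{0\}$, it follows that

$$\liminf_{n \to +\infty} \int_{\{|f_n| \ge g\}} |f_n| \, d\mu \ge 1,$$

showing that $g$ cannot be an $\varepsilon$-bound for any $\varepsilon \in \,]0, 1[$.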
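
Example 4 lends itself to exact computation. The Python sketch below is only an illustration under stated assumptions: $\mathbb{N}$ is taken to start at 1, the $\varepsilon$-bounds are sought among constant functions, and all function names are hypothetical. It relies on the observation that $\int f_n \, d\mu = 1/n$, so only the finitely many $n \le 1/\varepsilon$ have to be dominated outright.

```python
import math
from fractions import Fraction

def mu_point(n):
    """mu({n}) = 2**-n for n = 1, 2, ..."""
    return Fraction(1, 2**n)

def peak(n):
    """The only non-zero value of f_n = 2**n / n * 1_{n}, taken at the point n."""
    return Fraction(2**n, n)

def tail_integral(n, a):
    """Integral of f_n over {f_n >= a} for a constant bound a > 0."""
    return peak(n) * mu_point(n) if peak(n) >= a else Fraction(0)

def constant_bound(eps):
    """A constant a lying above f_n everywhere for every n <= ceil(1/eps)."""
    n_max = math.ceil(1 / eps)
    return max(peak(n) for n in range(1, n_max + 1)) + 1

eps = Fraction(1, 10)
a = constant_bound(eps)
# Largest tail integral among the first 199 functions; for larger n it is at most 1/n.
worst_tail = max(tail_integral(n, a) for n in range(1, 200))
# Partial integrals of the envelope n -> 2**n / n are harmonic sums, hence unbounded.
envelope_partial_integral = sum(peak(n) * mu_point(n) for n in range(1, 101))
```

The constant function $a$ is integrable because $\mu$ is finite, and the check above covers only finitely many $n$; the bound for all remaining $n$ follows from the tail integral being at most $1/n$.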


Here is a useful characterization of equi-integrability, which, for $\sigma$-finite measures, will be improved upon in 21.8.

21.2 Theorem. A set $M$ of measurable numerical functions on $\Omega$ is equi-integrable if and only if the following two conditions are satisfied:

(21.3) $\displaystyle \sup_{f \in M} \int |f| \, d\mu < +\infty$.

(21.4) For every $\varepsilon > 0$ there exists a $\mu$-integrable function $h \ge 0$ and a number $\delta > 0$ such that

$$\int_A h \, d\mu \le \delta \;\Longrightarrow\; \int_A |f| \, d\mu \le \varepsilon \qquad \text{for all } f \in M \text{ and } A \in \mathscr{A}.$$


Proof. For every $A \in \mathscr{A}$, every measurable numerical function $f$ on $\Omega$, and every integrable function $g \ge 0$

$$\int_A |f| \, d\mu = \int_{A \cap \{|f| \ge g\}} |f| \, d\mu + \int_{A \cap \{|f| < g\}} |f| \, d\mu \le \int_{\{|f| \ge g\}} |f| \, d\mu + \int_A g \, d\mu$$

and in particular for $A := \Omega$

$$\int |f| \, d\mu \le \int_{\{|f| \ge g\}} |f| \, d\mu + \int g \, d\mu.$$

Assuming that the set $M$ is equi-integrable, let us choose for $g$ an $\frac{\varepsilon}{2}$-bound for it and then set $h := g$, $\delta := \frac{\varepsilon}{2}$. Then conditions (21.3) and (21.4) follow from the preceding inequalities.
Conversely, assume the two conditions are fulfilled and let $\varepsilon > 0$ be given. Let $h$ and $\delta > 0$ be as furnished by (21.4). For each $f \in M$ and real $\alpha > 0$, consider the obviously valid inequality

$$\int |f| \, d\mu \ge \int_{\{|f| \ge \alpha h\}} |f| \, d\mu \ge \int_{\{|f| \ge \alpha h\}} \alpha h \, d\mu$$

or its equivalent

$$\int_{\{|f| \ge \alpha h\}} h \, d\mu \le \frac{1}{\alpha} \int |f| \, d\mu.$$

The integrals $\int |f| \, d\mu$ here are bounded as $f$ ranges over $M$, by (21.3). Therefore $\alpha > 0$ can be chosen so large that

$$\int_{\{|f| \ge \alpha h\}} h \, d\mu \le \delta \qquad \text{for all } f \in M.$$

(21.4) then insures that $g := \alpha h$ is an $\varepsilon$-bound for $M$, which proves that this set is equi-integrable. □
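
The choice of $\alpha$ in the last step can be made explicit. In the following worked line the symbol $S$ is notation introduced here for illustration only; it abbreviates the supremum in (21.3).

```latex
% S is an illustrative abbreviation, not notation from the text.
S := \sup_{f \in M} \int |f| \, d\mu < +\infty ,
\qquad
\alpha > \frac{S}{\delta}
\;\Longrightarrow\;
\int_{\{|f| \ge \alpha h\}} h \, d\mu
\;\le\; \frac{1}{\alpha} \int |f| \, d\mu
\;\le\; \frac{S}{\alpha}
\;<\; \delta
\qquad \text{for all } f \in M .
```

Applying (21.4) to the set $A := \{|f| \ge \alpha h\}$ then yields $\int_{\{|f| \ge \alpha h\}} |f| \, d\mu \le \varepsilon$ for every $f \in M$, which is exactly the statement that $\alpha h$ is an $\varepsilon$-bound.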

21.3 Corollary. Let $M \subset \mathscr{L}^p(\mu)$ and the set $M^p := \{|f|^p : f \in M\}$ be equi-integrable, where $1 \le p < +\infty$. Then the set

$$M^p_* := \{|\alpha f + \beta g|^p : f, g \in M,\ \alpha, \beta \in \mathbb{R},\ |\alpha| \le 1,\ |\beta| \le 1\}$$

is equi-integrable.

Proof. For every $f \in \mathscr{L}^p(\mu)$ and every $A \in \mathscr{A}$, $|1_A f| \le |f|$ shows that $1_A f \in \mathscr{L}^p(\mu)$ too, and so for all $f_1, f_2 \in \mathscr{L}^p(\mu)$ Minkowski's inequality (14.4) gives

$$N_p(1_A f_1 + 1_A f_2) \le N_p(1_A f_1) + N_p(1_A f_2),$$

whence

$$\Bigl(\int_A |f_1 + f_2|^p \, d\mu\Bigr)^{1/p} \le \Bigl(\int_A |f_1|^p \, d\mu\Bigr)^{1/p} + \Bigl(\int_A |f_2|^p \, d\mu\Bigr)^{1/p}.$$

Applying this inequality to $f_1 = \alpha f$, $f_2 = \beta g$ with $f, g \in M$, $\alpha, \beta \in \mathbb{R}$ and $|\alpha| \le 1$, $|\beta| \le 1$, and bearing in mind that 21.2 is (by hypothesis) valid for the set $M^p$, one realizes that conditions (21.3) and (21.4) are fulfilled by $M^p_*$ as well as by $M^p$, with the same function $h$ in both cases. □
We are now in a position to deliver the sharpened version of the dominated
convergence theorem mentioned in the introduction to this section. That we really

have to do with a sharpening here is attested to on the one hand by Example 3
and Theorem 20.4, according to which stochastic convergence follows from almost-everywhere convergence, and on the other by Example 4 of §20, which shows that
there are situations in which the dominated convergence theorem is not applicable
but the following theorem is.

21.4 Theorem. For every sequence $(f_n)_{n \in \mathbb{N}}$ of $p$-fold $\mu$-integrable real functions on a measure space $(\Omega, \mathscr{A}, \mu)$ the following two assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean.
(ii) The sequence $(f_n)$ converges $\mu$-stochastically, and the sequence $(|f_n|^p)$ is $\mu$-equi-integrable.

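Proof. (i)$\Rightarrow$(ii): Suppose $(f_n)$ converges in $p$th mean, to $f \in \mathscr{L}^p(\mu)$; thus

$$\lim_{n \to \infty} N_p(f_n - f) = 0.$$

In the light of 20.4 only the equi-integrability of the sequence $(|f_n|^p)$ has to be proved. By (15.2) the sequence $(N_p(f_n))_{n \in \mathbb{N}}$ converges to $N_p(f)$ and is therefore bounded, so the set $M := \{|f_n|^p : n \in \mathbb{N}\}$ satisfies (21.3).

For every $A \in \mathscr{A}$ and every $n \in \mathbb{N}$ we have by (15.4)

$$\Bigl(\int_A |f_n|^p \, d\mu\Bigr)^{1/p} \le N_p(f_n - f) + \Bigl(\int_A |f|^p \, d\mu\Bigr)^{1/p}.$$

To every $\varepsilon > 0$ corresponds an $n_\varepsilon \in \mathbb{N}$ such that $N_p(f_n - f) \le 2^{-1} \varepsilon^{1/p}$ for all $n \ge n_\varepsilon$. Therefore, if we set $\delta := 2^{-p} \varepsilon$ and

$$h := |f_1|^p \vee \ldots \vee |f_{n_\varepsilon}|^p \vee |f|^p,$$

condition (21.4) is also satisfied by $M$.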
(ii)$\Rightarrow$(i): From the stochastic convergence of the sequence $(f_n)$ and Remark 6 in §20 it follows that

$$\lim_{n,m \to \infty} \mu(\{|f_m - f_n| \ge \alpha\} \cap A) = 0 \tag{21.5}$$


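for every $A \in \mathscr{A}$ of finite measure and every real $\alpha > 0$. We have to show that $(f_n)$ is a Cauchy sequence in $\mathscr{L}^p(\mu)$, that is, that the doubly-indexed sequence of functions $f_{mn} := f_m - f_n$ satisfies

$$\lim_{m,n \to \infty} \int |f_{mn}|^p \, d\mu = 0.$$

According to 21.3, along with the set $\{|f_n|^p : n \in \mathbb{N}\}$ the set $M_0 := \{|f_{mn}|^p : m, n \in \mathbb{N}\}$ is also equi-integrable. Hence to every $\varepsilon > 0$ corresponds an integrable function $g_\varepsilon \ge 0$ such that $\int_{\{f \ge g_\varepsilon\}} f \, d\mu \le \varepsilon$ holds for all $f \in M_0$. If we set $g := g_\varepsilon^{1/p}$ then $g$ is $p$-fold integrable and the preceding inequality can be written

$$\int_{\{|f_{mn}| \ge g\}} |f_{mn}|^p \, d\mu \le \varepsilon \qquad \text{for all } m, n \in \mathbb{N}.$$

Because

$$\int |f_{mn}|^p \, d\mu = \int_{\{|f_{mn}| \ge g\}} |f_{mn}|^p \, d\mu + \int_{\{|f_{mn}| < g\}} |f_{mn}|^p \, d\mu$$

it suffices to show that

$$\int_{\{|f_{mn}| < g\}} |f_{mn}|^p \, d\mu \le 3\varepsilon \tag{21.6}$$

holds for all sufficiently large $m, n \in \mathbb{N}$. Now $g^p \mu$, being a finite measure on $\mathscr{A}$, is continuous from above. Since $\bigcap_{k \in \mathbb{N}} \{g \le k^{-1}\} = \{g = 0\}$, $\eta > 0$ can therefore be chosen small enough that

$$\int_{\{g \le \eta\}} g^p \, d\mu \le \varepsilon.$$

Consequently we also have

$$\int_{\{|f_{mn}| < g\} \cap \{g \le \eta\}} |f_{mn}|^p \, d\mu \le \int_{\{g \le \eta\}} g^p \, d\mu \le \varepsilon \qquad \text{for all } m, n \in \mathbb{N}. \tag{21.7}$$

The Chebyshev-Markov inequality insures that the set $\{g > \eta\}$ has finite $\mu$-measure. According to (21.5) therefore the doubly-indexed sequence of sets

$$A_{mn} := \{|f_{mn}| \ge \alpha\} \cap \{g > \eta\}$$

satisfies, whatever $\alpha > 0$ is involved,

$$\lim_{m,n \to \infty} \mu(A_{mn}) = 0.$$

We choose the positive number $\alpha$ so as to have

$$\Bigl(\frac{\alpha}{\eta}\Bigr)^p \int g^p \, d\mu \le \varepsilon.$$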


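The $\mu$-continuity of the finite measure $g^p \mu$ and 17.8 provide for an $n_0 \in \mathbb{N}$ such that

$$\int_{A_{mn}} g^p \, d\mu \le \varepsilon \qquad \text{for all } m, n \ge n_0.$$

Hence

$$\int_{\{|f_{mn}| < g\} \cap A_{mn}} |f_{mn}|^p \, d\mu \le \int_{A_{mn}} g^p \, d\mu \le \varepsilon \qquad \text{for all } m, n \ge n_0. \tag{21.8}$$

A second application of the Chebyshev-Markov inequality furnishes the estimate

$$\int_{\{|f_{mn}| < g\} \cap A'_{mn}} |f_{mn}|^p \, d\mu \le \alpha^p \mu(\{g > \eta\}) \le \Bigl(\frac{\alpha}{\eta}\Bigr)^p \int g^p \, d\mu \le \varepsilon \qquad \text{for all } m, n \in \mathbb{N}, \tag{21.9}$$

where

$$A'_{mn} := \{|f_{mn}| < \alpha\} \cap \{g > \eta\}.$$

By adding the inequalities (21.7)-(21.9) we get finally inequality (21.6), whose confirmation was the last outstanding claim in the proof that (ii) implies (i). □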
Remark. 1. Theorem 21.4 does not claim that from the stochastic convergence of a sequence $(f_n)$ to a measurable real function $f$, the $p$-fold integrability of $f$ and the convergence of $(f_n)$ to $f$ in $p$th mean follow as soon as the sequence $(|f_n|^p)$ is equi-integrable. Rather the theorem guarantees the existence of a $p$-fold integrable function among the possible stochastic limits of the sequence $(f_n)$. The sequence $(f_n)$ does converge in $p$th mean to every such stochastic limit, as follows from the proof of the theorem in the light of Remark 4 of §20, according to which any two $p$-fold integrable stochastic limits must in fact coincide almost everywhere.
But stochastic limits that are not $p$-fold integrable do exist, a fact that can be demonstrated with the aid of the Example in §20: For the sequence $(f_n)$ there, $(|f_n|^p)$ is equi-integrable. But among the stochastic limits $f^*$ that occur there, $f^* \in \mathscr{L}^p(\mu)$ for some $p \in [1, +\infty[$ if and only if $f^*(\omega_1) = 0$.

However, the phenomenon discussed above does not occur for $\sigma$-finite measures. By 20.3 in that case any two stochastic limits are almost everywhere equal. Therefore we have

21.5 Corollary. Suppose the measure $\mu$ is $\sigma$-finite. If a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ converges stochastically to a (measurable, real) function $f$, and if the sequence $(|f_n|^p)$ is equi-integrable, then $f \in \mathscr{L}^p(\mu)$ and $(f_n)$ converges in $p$th mean to $f$.

Theorem 21.4 can be sharpened by bringing in a further condition equivalent


to (i) and (ii) which is suggested by F. Riesz' Theorem 15.3. En route to this
sharpening the following lemma plays a key role. On the other hand, from the
sharpening that we are aiming for, the lemma can in turn be deduced, as can
the theorem of F. Riesz, even with its almost-everywhere convergence hypothesis
weakened to stochastic convergence.


21.6 Lemma. Suppose the sequence $(f_n)$ of functions $f_n \ge 0$ from $\mathscr{L}^1(\mu)$ converges stochastically to a function $f \ge 0$ from $\mathscr{L}^1(\mu)$. If in addition

$$\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu,$$

then the sequence $(f_n)$ converges to $f$ in mean.

Proof. We consider the sequence $(f \wedge f_n)_{n \in \mathbb{N}}$. The inequalities

$$0 \le f \wedge f_n \le f$$

and Example 3 show that it is equi-integrable. Since

$$0 \le f - f \wedge f_n \le |f_n - f| \qquad (\text{for all } n \in \mathbb{N}),$$

stochastic convergence of $(f_n)$ to $f$ entails that of $(f \wedge f_n)$ to $f$. From Theorem 21.4 this new sequence then converges to $f$ in mean. We therefore also have

$$\lim_{n \to \infty} \int f \wedge f_n \, d\mu = \int f \, d\mu. \tag{21.10}$$

From this, the decomposition $f + f_n = f \vee f_n + f \wedge f_n$, and the convergence hypothesis follows the companion result

$$\lim_{n \to \infty} \int f \vee f_n \, d\mu = \int f \, d\mu. \tag{21.10'}$$

But then the decomposition

$$|f_n - f| = f \vee f_n - f \wedge f_n$$

shows that the claimed mean convergence ensues upon subtracting (21.10) from (21.10'). □

Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:

21.7 Theorem. For every sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ which converges $\mu$-stochastically to a function $f \in \mathscr{L}^p(\mu)$ the following three assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\displaystyle \lim_{n \to \infty} \int |f_n|^p \, d\mu = \int |f|^p \, d\mu$.

Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need therefore establish only two implications:
(i)$\Rightarrow$(iii): Assertion (15.6) in Theorem 15.1 affirms this.
(iii)$\Rightarrow$(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$ to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally, Theorem 21.4 - with the $p$ there chosen to be 1 - shows that the convergence in mean of this sequence entails its equi-integrability. □


For $\sigma$-finite measures $\mu$, equi-integrability can be characterized in a way that is particularly convenient for applications. The $\sigma$-finiteness will be exploited in the form expressed by 17.6, that there is a strictly positive function $h$ in $\mathscr{L}^1(\mu)$.

21.8 Theorem. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space and $h$ a strictly positive function from $\mathscr{L}^1(\mu)$. Then for any set $M$ of $\mathscr{A}$-measurable numerical functions on $\Omega$ the following three assertions are equivalent:

(i) $M$ is equi-integrable.
(ii) For every $\varepsilon > 0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies

$$\sup_{f \in M} \int |f| \, d\mu < +\infty \tag{21.11}$$

as well as the following: Given $\varepsilon > 0$ there exists $\delta > 0$ such that

$$\int_A h \, d\mu \le \delta \;\Longrightarrow\; \int_A |f| \, d\mu \le \varepsilon \qquad \text{for all } A \in \mathscr{A},\ f \in M. \tag{21.12}$$

Statement (ii) simply says that

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha h\}} |f| \, d\mu = 0 \tag{21.13}$$

holds uniformly for $f \in M$. Condition (21.12) is for obvious reasons (cf. 17.8) called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f \in M$.
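Proof. (i)$\Rightarrow$(ii): Let $g$ be an $\frac{\varepsilon}{2}$-bound for $M$. Then for all $f \in M$ and all $\alpha > 0$

$$\int_{\{|f| \ge \alpha h\}} |f| \, d\mu = \int_{\{|f| \ge \alpha h\} \cap \{|f| \ge g\}} |f| \, d\mu + \int_{\{|f| \ge \alpha h\} \cap \{|f| < g\}} |f| \, d\mu \le \int_{\{|f| \ge g\}} |f| \, d\mu + \int_{\{g > \alpha h\}} g \, d\mu \le \frac{\varepsilon}{2} + \int_{\{g > \alpha h\}} g \, d\mu.$$

According to 13.6, $\mu(\{g = +\infty\}) = 0$. Since $g\mu$ is a finite measure on $\mathscr{A}$, it is continuous from above. Hence the fact that

$$\bigcap_{\alpha > 0} \{g > \alpha h\} = \bigcap_{n \in \mathbb{N}} \{g > nh\} = \{g = +\infty\}$$

is a set of $(g\mu)$-measure 0 means that

$$\int_{\{g > \alpha h\}} g \, d\mu \le \frac{\varepsilon}{2}$$

for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.

(ii)$\Rightarrow$(iii): This can be gleaned from the inequality derived at the beginning of the proof of 21.2, $\alpha h$ being now eligible for the function $g$ there:

$$\int_A |f| \, d\mu \le \int_{\{|f| \ge \alpha h\}} |f| \, d\mu + \alpha \int_A h \, d\mu \qquad \text{for all } f \in M.$$

(iii)$\Rightarrow$(i): 21.2 affirms this. □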


Theorem 21.8 is of special significance for finite measures $\mu$. Then it is often expedient to choose for $h$ the constant function 1. When one does, (21.13) assumes the equivalent form

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha\}} |f| \, d\mu = 0 \qquad \text{uniformly for } f \in M. \tag{21.13'}$$

This condition is thus - just as (21.13) for $\sigma$-finite measures - necessary and sufficient for equi-integrability of $M$.
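
Criterion (21.13') is easy to test on spike-shaped functions. The Python sketch below assumes Lebesgue measure on $[0, 1[$ as the underlying finite measure space and compares the sequence $f_n = n 1_{[0, 1/n[}$ of Example 5 with a milder family $g_n = \sqrt{n}\, 1_{[0, 1/n[}$; the second family and all function names are hypothetical, chosen only for contrast. The supremum over the whole sequence is replaced by a maximum over finitely many indices, so this is an illustration rather than a proof.

```python
import math

def tail_integral(height, n, a):
    """Integral of height(n) * 1_[0, 1/n[ over the set where it is >= a > 0."""
    h = height(n)
    return h / n if h >= a else 0.0

def sup_tail(height, a, n_max=20000):
    """Maximum of the tail integrals over 1 <= n <= n_max (a finite stand-in for sup)."""
    return max(tail_integral(height, n, a) for n in range(1, n_max + 1))

def spike(n):
    """Height of f_n = n * 1_[0, 1/n[ (Example 5)."""
    return float(n)

def mild(n):
    """Height of the comparison function g_n = sqrt(n) * 1_[0, 1/n[."""
    return math.sqrt(n)

levels = [10.0, 100.0]
spike_sups = [sup_tail(spike, a) for a in levels]   # stays at 1 for every level
mild_sups = [sup_tail(mild, a) for a in levels]     # decays like 1 / a
```

For the spikes the tail integral equals 1 as soon as $n \ge \alpha$, so the uniform limit in (21.13') fails; for the milder family it equals $n^{-1/2} \le 1/\alpha$ whenever it is non-zero, so the limit is 0 uniformly.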

Remark. 2. In part (iii) of Theorem 21.8 the $\mathscr{L}^1$-boundedness of $M$ expressed by (21.11) cannot in general be dropped from the hypotheses. It suffices to consider the measure space $(\{a\}, \mathscr{P}(\{a\}), \varepsilon_a)$ consisting of a single point and the sequence of functions $f_n := n \cdot 1$. This sequence is not equi-integrable, although for every $\varepsilon > 0$ and every strictly positive $h$, (21.12) holds whenever $0 < \delta < h(a)$.
Let us close by deriving a sufficient condition for equi-integrability in the finite-measure case which generalizes the introductory Example 3.

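21.9 Lemma. Let $\mu$ be a finite measure and $M \subset \mathscr{L}^1(\mu)$. Suppose that there is a $\mu$-integrable function $g \ge 0$ such that

$$\int_{\{|f| \ge \alpha\}} |f| \, d\mu \le \int_{\{|f| \ge \alpha\}} g \, d\mu \tag{21.14}$$

for all $f \in M$ and all $\alpha \in \mathbb{R}_+$. Then $M$ is equi-integrable.

Proof. The case $\alpha := 0$ of (21.14) says that $\int |f| \, d\mu \le \int g \, d\mu < +\infty$ for all $f \in M$. Then Chebyshev's inequality tells us that

$$\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha} \int |f| \, d\mu \le \frac{1}{\alpha} \int g \, d\mu \qquad \text{for all } \alpha > 0,\ f \in M.$$

It follows from this that

$$\lim_{\alpha \to +\infty} \mu(\{|f| \ge \alpha\}) = 0 \qquad \text{uniformly in } f \in M. \tag{21.15}$$

For each $\varepsilon > 0$, 17.8 supplies a $\delta > 0$ such that

$$A \in \mathscr{A} \text{ and } \mu(A) \le \delta \;\Longrightarrow\; \int_A g \, d\mu \le \varepsilon.$$

Putting this together with (21.14) and (21.15) gives us

$$\lim_{\alpha \to +\infty} \int_{\{|f| \ge \alpha\}} |f| \, d\mu = 0 \qquad \text{uniformly for } f \in M,$$

that is, (21.13'), which we have seen entails equi-integrability of $M$. □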
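
The final "putting together" can be spelled out in one line. In the sketch below $\alpha_0$ is a threshold introduced here for illustration; it denotes any value beyond which (21.15) gives $\mu(\{|f| \ge \alpha\}) \le \delta$ for all $f \in M$ simultaneously, with $\delta$ the number supplied by 17.8 for the given $\varepsilon$.

```latex
% alpha_0 is illustrative notation: the uniform threshold provided by (21.15).
\alpha \ge \alpha_0
\;\Longrightarrow\;
\mu(\{|f| \ge \alpha\}) \le \delta
\;\Longrightarrow\;
\int_{\{|f| \ge \alpha\}} |f| \, d\mu
\;\overset{(21.14)}{\le}\;
\int_{\{|f| \ge \alpha\}} g \, d\mu
\;\le\; \varepsilon
\qquad \text{for all } f \in M .
```

The middle implication is the choice $A := \{|f| \ge \alpha\}$ in the $\delta$-statement, and the threshold $\alpha_0$ does not depend on $f$, which is what "uniformly" requires.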

Exercises.
1. Show that for any measure space $(\Omega, \mathscr{A}, \mu)$ a set $M$ of measurable numerical functions is equi-integrable if and only if for every $\varepsilon > 0$ there is an integrable function $h = h_\varepsilon \ge 0$ such that $\int (|f| - h)^+ \, d\mu \le \varepsilon$ for all $f \in M$. [Hint: For sufficiently large $\eta > 0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]
2. Let $(\Omega, \mathscr{A}, \mu)$ be an arbitrary measure space, $1 \le p < +\infty$. Suppose the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real function $f$. Show that $f$ lies in $\mathscr{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the sequence $(|f_n|^p)$ is equi-integrable.
3. Show that from the $\mathscr{L}^p$-convergence of a sequence $(f_n)$ to a function $f \in \mathscr{L}^p(\mu)$ follows the $\mathscr{L}^1$-convergence of the sequence $(|f_n|^p)$ to $|f|^p$, for any $1 \le p < +\infty$.
4. Consider a finite measure $\mu$ and an $M \subset \mathscr{L}^1(\mu)$. For each $n \in \mathbb{N}$, $f \in M$ set

$$a_n(f) := n \mu(\{n \le |f| < n + 1\}).$$

Show that $M$ is equi-integrable if and only if the series $\sum_{n=1}^{\infty} a_n(f)$ converges uniformly in $f \in M$. [Cf. Theorem 3.4 and its proof in BAUER [1996].]
5. Consider a finite measure $\mu$ and an $M \subset \mathscr{L}^1(\mu)$. Show that $M$ is equi-integrable if there is a function $q : \mathbb{R}_+ \to \mathbb{R}_+$ with the properties

$$\lim_{t \to +\infty} \frac{q(t)}{t} = +\infty \qquad \text{and} \qquad \sup_{f \in M} \int q \circ |f| \, d\mu < +\infty.$$

(In fact we have to do here with a necessary as well as a sufficient condition, which goes back to CH. DE LA VALLÉE POUSSIN (1866-1962). Moreover, $q$ can always be chosen to be convex and isotone. Cf. MEYER [1976], p. 19 or DELLACHERIE and MEYER [1975], p. 38.)

6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$, $(f_n)_{n \in \mathbb{N}}$ a sequence of measurable numerical functions $f_n \ge 0$, and set $f^* := \limsup_{n \to \infty} f_n$. Show that:
(a) If the sequence $(f_n)$ is equi-integrable (or at least satisfies condition (21.12)), then the following "dual version" of Fatou's lemma is valid:

$$\limsup_{n \to \infty} \int_A f_n \, d\mu \le \int_A f^* \, d\mu \qquad \text{for all } A \in \mathscr{A}. \tag{$*$}$$

How does the corresponding result in Exercise 1 of §15 fit in? [Hint: Exercise 2 of §11.]
(b) Under the hypothesis $\int f^* \, d\mu < +\infty$, the sequence $(f_n)$ is equi-integrable if and only if ($*$) holds. [In proving the "if" direction, argue indirectly.]
(c) Result (b) can fail in case $\int f^* \, d\mu = +\infty$. Try to corroborate this with a sequence $(\alpha_n f_n)$ derived by appropriate choice of (sufficiently large) numbers $\alpha_n > 0$ from the sequence $(f_n)$ in the Example from §15.
7. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$, and let $(\nu_i)_{i \in I}$ be a family of finite and $\mu$-continuous measures on $\mathscr{A}$. Suppose this family is equi-continuous at $\emptyset$, meaning that to every sequence $(A_n)_{n \in \mathbb{N}}$ in $\mathscr{A}$ with $A_n \downarrow \emptyset$ and to every $\varepsilon > 0$ there is an $n_\varepsilon \in \mathbb{N}$ such that $\nu_i(A_n) \le \varepsilon$ for all $n \ge n_\varepsilon$ and all $i \in I$. Show that then this family is equi-$\mu$-continuous in the following sense (cf. (21.12)): To every $\varepsilon > 0$ there corresponds a $\delta = \delta_\varepsilon > 0$ such that

$$A \in \mathscr{A} \text{ and } \mu(A) \le \delta \;\Longrightarrow\; \nu_i(A) \le \varepsilon \qquad \text{for all } i \in I.$$

What does this result say in view of Theorem 21.8? [Hint: Review the proof of Theorem 17.8.]

Chapter III

Product Measures

In this short chapter we will investigate whether and how one can associate a product with finitely many measure spaces. And for the product measures thus gotten we will want to see about how to integrate with respect to them in terms of their factors. We will recognize the L-B measure $\lambda^d$ as being a special product measure when $d \ge 2$. One important application of product measures is the introduction of the concept of convolution for measures and functions.

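22. Products of $\sigma$-algebras and measures

Finitely many measurable spaces $(\Omega_j, \mathscr{A}_j)$, $j = 1, \ldots, n \in \mathbb{N}$, are given. We consider the product set

$$\Omega := \mathop{\times}_{j=1}^{n} \Omega_j = \Omega_1 \times \ldots \times \Omega_n$$

and for each $j$ the projection mapping

$$p_j : \Omega \to \Omega_j$$

which assigns to each point $(\omega_1, \ldots, \omega_n) \in \Omega$ its $j$th coordinate $\omega_j$. The $\sigma$-algebra in $\Omega$ generated by the mappings $p_1, \ldots, p_n$ is designated

$$\bigotimes_{j=1}^{n} \mathscr{A}_j \qquad \text{or} \qquad \mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$$

and called the product of the $\sigma$-algebras $\mathscr{A}_1, \ldots, \mathscr{A}_n$. According to (7.3) we have to do here with the smallest $\sigma$-algebra $\mathscr{A}$ in $\Omega$ such that each $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable.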

The reader may recall that the product of finitely many topological spaces is
defined in a very similar way.
An important principle of generation for such products is immediately at hand:

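22.1 Theorem. For each $j = 1, \ldots, n$ let $\mathscr{E}_j$ be a generator of the $\sigma$-algebra $\mathscr{A}_j$ in $\Omega_j$ which contains a sequence $(E_{jk})_{k \in \mathbb{N}}$ of sets with $E_{jk} \uparrow \Omega_j$. Then the $\sigma$-algebra $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ is generated by the system of all sets

$$E_1 \times \ldots \times E_n$$

with $E_j \in \mathscr{E}_j$ for each $j = 1, \ldots, n$.

Proof. Let $\mathscr{A}$ be any $\sigma$-algebra in $\Omega$. What we have to show is that the mappings $p_j$ are all $\mathscr{A}$-$\mathscr{A}_j$-measurable ($j = 1, \ldots, n$) if and only if $\mathscr{A}$ contains each of the sets $E_1 \times \ldots \times E_n$ described above. According to 7.2 $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable just exactly if $p_j^{-1}(E_j) \in \mathscr{A}$ for every $E_j \in \mathscr{E}_j$. If this condition is fulfilled for each $j \in \{1, \ldots, n\}$, then the sets

$$E_1 \times \ldots \times E_n = p_1^{-1}(E_1) \cap \ldots \cap p_n^{-1}(E_n)$$

all lie in $\mathscr{A}$. If conversely, $E_1 \times \ldots \times E_n \in \mathscr{A}$ for every possible choice of $E_j \in \mathscr{E}_j$ and $j \in \{1, \ldots, n\}$, then upon fixing $E_j \in \mathscr{E}_j$, the sets

$$F_k := E_{1k} \times \ldots \times E_{j-1,k} \times E_j \times E_{j+1,k} \times \ldots \times E_{nk}, \qquad k \in \mathbb{N},$$

all lie in $\mathscr{A}$. Since the sequence $(F_k)_{k \in \mathbb{N}}$ increases to

$$\Omega_1 \times \ldots \times \Omega_{j-1} \times E_j \times \Omega_{j+1} \times \ldots \times \Omega_n = p_j^{-1}(E_j),$$

this set too lies in $\mathscr{A}$, for each $j$. The claim is therewith proven. □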

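Remark. 1. The restriction imposed on the generators $\mathscr{E}_j$ cannot generally be dispensed with. Take, for example, $n := 2$, $\mathscr{A}_1 := \{\emptyset, \Omega_1\}$, $\mathscr{E}_1 := \{\emptyset\}$ and $\mathscr{E}_2 := \mathscr{A}_2$, in which $\mathscr{A}_2$ contains at least four sets.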
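
On finite sets the counterexample of Remark 1 can be checked mechanically. The Python sketch below is an illustration under assumptions made here: the two-point sets and the helper names are hypothetical, and it uses the elementary fact that on a finite set a generated $\sigma$-algebra is determined by its atoms, two points lying in the same atom exactly when they belong to the same generating sets. A $\sigma$-algebra with $m$ atoms has $2^m$ members.

```python
from itertools import product

def atoms(points, generators):
    """Atoms of the sigma-algebra generated by `generators` on a finite set:
    two points share an atom iff they lie in exactly the same generator sets."""
    classes = {}
    for w in points:
        signature = tuple(w in G for G in generators)
        classes.setdefault(signature, set()).add(w)
    return [frozenset(c) for c in classes.values()]

def rectangles(system_1, system_2):
    """All product sets E1 x E2 with E1 from system_1 and E2 from system_2."""
    return [frozenset(product(E1, E2)) for E1 in system_1 for E2 in system_2]

omega_1, omega_2 = frozenset({0, 1}), frozenset({"a", "b"})
points = list(product(omega_1, omega_2))

algebra_1 = [frozenset(), omega_1]   # the trivial sigma-algebra on omega_1
algebra_2 = [frozenset(), frozenset({"a"}), frozenset({"b"}), omega_2]

# Product sigma-algebra, generated by all rectangles A1 x A2.
product_atoms = atoms(points, rectangles(algebra_1, algebra_2))
# Generator {emptyset} of algebra_1: every rectangle is empty.
bad_atoms = atoms(points, rectangles([frozenset()], algebra_2))
# Generator {omega_1} of algebra_1: it contains a sequence increasing to omega_1.
good_atoms = atoms(points, rectangles([omega_1], algebra_2))
```

With the generator $\{\emptyset\}$ the rectangles generate only the trivial $\sigma$-algebra (one atom), whereas the product $\sigma$-algebra has two atoms and hence four sets; the generator $\{\Omega_1\}$, which meets the hypothesis of 22.1, recovers it.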

A particular case of this theorem is the fact that the product $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ is generated by all the sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathscr{A}_j$. Our further course will be guided by the following example:

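Example. For each $j \in \{1, \ldots, n\}$ let $\Omega_j := \mathbb{R}$, $\mathscr{A}_j := \mathscr{B}^1$ and $\mathscr{E}_j := \mathscr{I}^1$. The system of all sets $E_1 \times \ldots \times E_n$ with each $E_j \in \mathscr{I}^1$ is evidently just the system $\mathscr{I}^n$ of all right half-open intervals in $\mathbb{R}^n$. According to 6.1, $\mathscr{I}^n$ generates the $\sigma$-algebra $\mathscr{B}^n$ of $n$-dimensional Borel sets. Taken together with 22.1 - whose hypotheses are clearly satisfied here - this reveals that

$$\mathscr{B}^n = \mathscr{B}^1 \otimes \ldots \otimes \mathscr{B}^1 \qquad (n \text{ factors on the right}). \tag{22.2}$$

By 6.2, $\lambda^n$ is the only measure on $\mathscr{B}^n$ which satisfies

$$\lambda^n(I_1 \times \ldots \times I_n) = \lambda^1(I_1) \cdot \ldots \cdot \lambda^1(I_n)$$

for all $I_1, \ldots, I_n \in \mathscr{I}^1$. This remark and the example preceding it leads to the following question.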

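Measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$ are given, $1 \le j \le n$ with $n \ge 2$, and for each $\mathscr{A}_j$ a generator $\mathscr{E}_j$. Under what hypotheses can the existence of a measure $\pi$ on $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ satisfying

$$\pi(E_1 \times \ldots \times E_n) = \mu_1(E_1) \cdot \ldots \cdot \mu_n(E_n) \qquad \text{for all } E_j \in \mathscr{E}_j,\ 1 \le j \le n \tag{22.3}$$

be proven?
The accompanying uniqueness question can be settled at once: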


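22.2 Theorem. Suppose that for each $j = 1, \ldots, n$ $\mathscr{E}_j$ is an $\cap$-stable generator of $\mathscr{A}_j$ which contains a sequence $(E_{jk})_{k \in \mathbb{N}}$ of sets of finite $\mu_j$-measure satisfying $E_{jk} \uparrow \Omega_j$. Then there is at most one measure $\pi$ on $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ enjoying property (22.3).

Proof. Let $\mathscr{E}$ denote the system of all sets $E_1 \times \ldots \times E_n$, where $E_j \in \mathscr{E}_j$ for each $j$. According to 22.1, $\mathscr{E}$ generates the $\sigma$-algebra $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$. Since each $\mathscr{E}_j$ is $\cap$-stable, so is $\mathscr{E}$, as the identity

$$\Bigl(\mathop{\times}_{j=1}^{n} E_j\Bigr) \cap \Bigl(\mathop{\times}_{j=1}^{n} F_j\Bigr) = \mathop{\times}_{j=1}^{n} (E_j \cap F_j)$$

makes clear. Moreover $E_k := E_{1k} \times \ldots \times E_{nk}$ ($k \in \mathbb{N}$) defines a sequence in $\mathscr{E}$ that evidently satisfies

$$E_k \uparrow \Omega_1 \times \ldots \times \Omega_n.$$

Recalling that $\mu_j(E_{jk}) < +\infty$ for all (relevant) $j$ and $k$, we see that the uniqueness claim therefore follows from 5.4. (Obviously it would suffice if $\bigcup_{k \in \mathbb{N}} E_{jk} = \Omega_j$ instead of $E_{jk} \uparrow \Omega_j$ were satisfied for each $j$.) □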

Under the hypotheses of 22.2, which obviously entail the $\sigma$-finiteness of each measure $\mu_j$, the existence of the desired measure $\pi$ can also be proven. This proof will be carried out in the next section, first for $n = 2$, then for arbitrary $n \ge 2$.

Remark. 2. In closing it should again be mentioned that a mapping

$$f : \Omega_0 \to \Omega_1 \times \ldots \times \Omega_n$$

of a measurable space $(\Omega_0, \mathscr{A}_0)$ into a product of measurable spaces $(\Omega_j, \mathscr{A}_j)$ is measurable with respect to the $\sigma$-algebra $\mathscr{A}_1 \otimes \ldots \otimes \mathscr{A}_n$ if and only if each component mapping $f_j := p_j \circ f$ of $f$ is $\mathscr{A}_0$-$\mathscr{A}_j$-measurable - a fact which is immediate from Theorem 7.4.

Exercise.
Finitely many measurable spaces $(\Omega_j, \mathscr{A}_j)$ are given, $j = 1, \ldots, n$. Show that the algebra in $\Omega_1 \times \ldots \times \Omega_n$ generated by all sets $A_1 \times \ldots \times A_n$ with each $A_j \in \mathscr{A}_j$ consists of all finite unions of such product sets.

23. Product measures and Fubini's theorem


Initially measure spaces $(\Omega_1, \mathscr{A}_1, \mu_1)$, $(\Omega_2, \mathscr{A}_2, \mu_2)$ are given. For every $Q \subset \Omega_1 \times \Omega_2$ the sets

$$Q_{\omega_1} := \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in Q\}, \qquad Q_{\omega_2} := \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in Q\} \tag{23.1}$$

are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1 \in \Omega_1$) and the $\omega_2$-section of $Q$ ($\omega_2 \in \Omega_2$).
This notation is chosen for typographic simplicity and will see us through §23, after which it is not needed. In case $\Omega_1 = \Omega_2$, however, it presents obvious problems, to circumvent which, alternative notations like ${}_{\omega_1}Q$ or $Q^{\omega_1}$ for $Q_{\omega_1}$ are also popular in the literature.
About these sets we claim:

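23.1 Lemma. If $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, then its $\omega_1$-section lies in $\mathscr{A}_2$ for every $\omega_1 \in \Omega_1$, and its $\omega_2$-section lies in $\mathscr{A}_1$ for every $\omega_2 \in \Omega_2$.

Proof. For arbitrary subsets $Q, Q_1, Q_2, \ldots$ of $\Omega := \Omega_1 \times \Omega_2$, and points $\omega_1 \in \Omega_1$

$$(\Omega \setminus Q)_{\omega_1} = \Omega_2 \setminus Q_{\omega_1}$$

and

$$\Bigl(\bigcup_{n \in \mathbb{N}} Q_n\Bigr)_{\omega_1} = \bigcup_{n \in \mathbb{N}} (Q_n)_{\omega_1}.$$

Furthermore $\Omega_{\omega_1} = \Omega_2$, and more generally for $A_1 \subset \Omega_1$, $A_2 \subset \Omega_2$ we have

$$(A_1 \times A_2)_{\omega_1} = \begin{cases} A_2 & \text{if } \omega_1 \in A_1, \\ \emptyset & \text{if } \omega_1 \in \Omega_1 \setminus A_1. \end{cases}$$

For each $\omega_1 \in \Omega_1$, therefore, the system of all sets $Q \subset \Omega$ having section $Q_{\omega_1} \in \mathscr{A}_2$ is a $\sigma$-algebra in $\Omega$ which contains every product set $A_1 \times A_2$ with $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$. But according to 22.1 $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the smallest $\sigma$-algebra which contains all such product sets. This proves the part of the lemma dealing with $\omega_1$-sections. Of course, $\omega_2$-sections are treated the same way. □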
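
The three set identities used in this proof can be verified by brute force on small finite sets. The following Python sketch does so; the concrete sets and the helper name `section_1` are hypothetical choices made for illustration.

```python
from itertools import product

def section_1(Q, w1):
    """The w1-section {w2 : (w1, w2) in Q} of a set Q of pairs."""
    return frozenset(w2 for (v1, w2) in Q if v1 == w1)

omega_1, omega_2 = frozenset({0, 1, 2}), frozenset({"a", "b"})
omega = frozenset(product(omega_1, omega_2))

A1, A2 = frozenset({0, 2}), frozenset({"b"})
rectangle = frozenset(product(A1, A2))
Q = frozenset({(0, "a"), (1, "a"), (1, "b")})
R = frozenset({(2, "b"), (1, "a")})

# Sections commute with complements and unions, and sections of a
# rectangle A1 x A2 are either A2 or empty.
complement_rule = all(
    section_1(omega - Q, w) == omega_2 - section_1(Q, w) for w in omega_1
)
union_rule = all(
    section_1(Q | R, w) == section_1(Q, w) | section_1(R, w) for w in omega_1
)
rectangle_rule = all(
    section_1(rectangle, w) == (A2 if w in A1 else frozenset()) for w in omega_1
)
```

These are exactly the closure properties that make the system of sets with measurable sections a $\sigma$-algebra containing all rectangles.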
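Since now $\mu_2(Q_{\omega_1})$ and $\mu_1(Q_{\omega_2})$ make sense for all $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$, $\omega_1 \in \Omega_1$ and $\omega_2 \in \Omega_2$, we are in a position to take the next step:

23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are $\sigma$-finite. Then for every $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$ the functions

$$\omega_1 \mapsto \mu_2(Q_{\omega_1}) \qquad \text{and} \qquad \omega_2 \mapsto \mu_1(Q_{\omega_2})$$

on $\Omega_1$ and $\Omega_2$, respectively, are $\mathscr{A}_1$-measurable and $\mathscr{A}_2$-measurable, respectively.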


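Proof. The function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the
$\mathscr{A}_1$-measurability of $s_Q$ for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. The other function can be treated
analogously.

First suppose that $\mu_2(\Omega_2) < +\infty$. In this case the set $\mathscr{D}$ of all $D \in \mathscr{A}_1 \otimes \mathscr{A}_2$
whose function $s_D$ is $\mathscr{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1 \times \Omega_2$.
This involves the following easily checked assertions:

$s_\Omega = \mu_2(\Omega_2)$;
$s_{\Omega \setminus D} = s_\Omega - s_D$ for every $D \in \mathscr{D}$;
$s_{\bigcup D_n} = \sum s_{D_n}$ for every sequence $(D_n)$ of pairwise disjoint sets in $\mathscr{D}$.

Furthermore $\mathscr{D}$ contains $A_1 \times A_2$ for every $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$, since

$s_{A_1 \times A_2} = \mu_2(A_2) \cdot 1_{A_1}$.

The system $\mathscr{E}$ of all such $A_1 \times A_2$ is ∩-stable and generates $\mathscr{A}_1 \otimes \mathscr{A}_2$, by 22.1.
Therefore 2.4 insures that $\mathscr{A}_1 \otimes \mathscr{A}_2$ is the Dynkin system generated by it. From
$\mathscr{E} \subset \mathscr{D} \subset \mathscr{A}_1 \otimes \mathscr{A}_2$ therefore follows that $\mathscr{D} = \mathscr{A}_1 \otimes \mathscr{A}_2$, which is what is being
claimed.

If $\mu_2$ is only σ-finite, then there is a sequence $(B_n)$ of sets from $\mathscr{A}_2$, each of
finite $\mu_2$-measure, with $B_n \uparrow \Omega_2$. For each $n$, $A_2 \mapsto \mu_2(A_2 \cap B_n)$ is therefore a finite
measure $\mu_{2,n}$ on $\mathscr{A}_2$, to which the already proven result can be applied, showing
that $\omega_1 \mapsto \mu_{2,n}(Q_{\omega_1})$ is $\mathscr{A}_1$-measurable for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$. Now

$\mu_2(Q_{\omega_1}) = \sup_{n \in \mathbb{N}} \mu_{2,n}(Q_{\omega_1})$

because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then
the mapping $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ is indeed $\mathscr{A}_1$-measurable. □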

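It is now rather simple to construct the measure $\pi$ that we seek:

23.3 Theorem. Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite measure spaces, $j = 1, 2$. Then there
is exactly one measure $\pi$ on $\mathscr{A}_1 \otimes \mathscr{A}_2$ which satisfies

(23.2)  $\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)$  for all $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$.

In addition this measure satisfies

(23.3)  $\pi(Q) = \int \mu_2(Q_{\omega_1})\,\mu_1(d\omega_1) = \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$  for all $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$

and is σ-finite.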
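Proof. As before, for each $Q \in \mathscr{A}_1 \otimes \mathscr{A}_2$ let $s_Q$ denote the $\mathscr{A}_1$-measurable function
$\omega_1 \mapsto \mu_2(Q_{\omega_1})$ on $\Omega_1$; it is of course non-negative. Consequently via

$\pi(Q) := \int s_Q \, d\mu_1$

a non-negative function $\pi$ is well defined on $\mathscr{A}_1 \otimes \mathscr{A}_2$. For every sequence $(Q_n)_{n \in \mathbb{N}}$
of pairwise disjoint sets from $\mathscr{A}_1 \otimes \mathscr{A}_2$ the equality $s_{\bigcup Q_n} = \sum s_{Q_n}$ and 11.5 insure
that

$\pi\Big(\bigcup_{n \in \mathbb{N}} Q_n\Big) = \sum_{n=1}^{\infty} \pi(Q_n)$.

Since $s_\emptyset = 0$ we have $\pi(\emptyset) = 0$. This proves that $\pi$ is indeed a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$.
It has property (23.2) because
$s_{A_1 \times A_2} = \mu_2(A_2)\,1_{A_1}$, whence integration yields
$\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)$.

Proceeding analogously, we confirm that

$\pi'(Q) := \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$

also defines a measure on $\mathscr{A}_1 \otimes \mathscr{A}_2$ having this property. But when Theorem 22.2
is applied to $\mathscr{E}_1 := \mathscr{A}_1$ and $\mathscr{E}_2 := \mathscr{A}_2$ it affirms that there is at most one such
measure. Thus $\pi = \pi'$ and (23.3) is confirmed. There is a sequence $(A_{jn})_{n \in \mathbb{N}}$ of
sets from $\mathscr{A}_j$, each of finite $\mu_j$-measure, with $A_{jn} \uparrow \Omega_j$, for $j = 1$ and $j = 2$. Using
these as the $A_1$, $A_2$, respectively, in (23.2) proves the σ-finiteness of $\pi$ because

$\pi(A_{1n} \times A_{2n}) < +\infty$ and $A_{1n} \times A_{2n} \uparrow \Omega_1 \times \Omega_2$. □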
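
For measures with finitely many atoms the section formula (23.3) and the product property (23.2) reduce to finite sums, which makes them easy to check by hand or by machine. The following is only a sketch under that assumption; the weights `mu1`, `mu2` and the sets `Q`, `rect` are invented for illustration and do not come from the text.

```python
from fractions import Fraction as Fr

# Hypothetical finite example: point masses of mu1 on {0, 1, 2} and of mu2 on {0, 1}.
mu1 = {0: Fr(1, 2), 1: Fr(2), 2: Fr(3)}
mu2 = {0: Fr(1, 3), 1: Fr(5)}

def pi_via_omega1_sections(Q):
    """Integrate w1 -> mu2(Q_{w1}) against mu1 (first integral in (23.3))."""
    return sum(mu1[w1] * sum(mu2[w2] for w2 in mu2 if (w1, w2) in Q) for w1 in mu1)

def pi_via_omega2_sections(Q):
    """Integrate w2 -> mu1(Q_{w2}) against mu2 (second integral in (23.3))."""
    return sum(mu2[w2] * sum(mu1[w1] for w1 in mu1 if (w1, w2) in Q) for w2 in mu2)

# A set that is not a product set, and a product set A1 x A2 with A1 = {0, 2}, A2 = {1}.
Q = {(0, 0), (1, 1), (2, 0), (2, 1)}
rect = {(w1, w2) for w1 in (0, 2) for w2 in (1,)}

print(pi_via_omega1_sections(Q), pi_via_omega2_sections(Q))
print(pi_via_omega1_sections(rect), (mu1[0] + mu1[2]) * mu2[1])
```

Exact rational arithmetic is used so that the two ways of slicing `Q` can be compared with `==` rather than up to rounding.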


23.4 Definition. The measure $\pi$ on $\mathscr{A}_1 \otimes \mathscr{A}_2$ which is uniquely specified by (23.2)
whenever $(\Omega_1, \mathscr{A}_1, \mu_1)$ and $(\Omega_2, \mathscr{A}_2, \mu_2)$ are σ-finite measure spaces is called the
product of the measures $\mu_1$ and $\mu_2$ and is denoted by $\mu_1 \otimes \mu_2$.

Thus also the question posed in §22 is answered for σ-finite measures $\mu_1$, $\mu_2$.
If namely $\mathscr{E}_j$ is a generator of $\mathscr{A}_j$ ($j = 1, 2$) with the properties formulated in
Theorem 22.2, then according to 22.2 and 23.3, $\mu_1 \otimes \mu_2$ is the only measure $\pi$ on
$\mathscr{A}_1 \otimes \mathscr{A}_2$ which satisfies (22.3).
The Example in §22 therefore entails that $\lambda^2 = \lambda^1 \otimes \lambda^1$. Similar considerations
lead to the validity of

$\lambda^{m+n} = \lambda^m \otimes \lambda^n$

for any $m, n \in \mathbb{N}$, once the appropriate identification of $\mathbb{R}^{m+n}$ with $\mathbb{R}^m \times \mathbb{R}^n$ has
been made.
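We turn now to integrating with respect to the product measure $\mu_1 \otimes \mu_2$. Our
notation for sections can be usefully extended to functions for this purpose. If
$f : \Omega_1 \times \Omega_2 \to \Omega'$ is any mapping, we define its sections $f_{\omega_1}$ for each $\omega_1 \in \Omega_1$ and
$f_{\omega_2}$ for each $\omega_2 \in \Omega_2$ as mappings of $\Omega_2$ and $\Omega_1$, respectively, into $\Omega'$ by

(23.4)  $f_{\omega_1}(\omega_2) := f(\omega_1, \omega_2)$ for all $\omega_2 \in \Omega_2$,  $f_{\omega_2}(\omega_1) := f(\omega_1, \omega_2)$ for all $\omega_1 \in \Omega_1$.

Notice that if $Q \subset \Omega_1 \times \Omega_2$ and $f := 1_Q$, then these functions satisfy

(23.5)  $(1_Q)_{\omega_1} = 1_{Q_{\omega_1}}$  and  $(1_Q)_{\omega_2} = 1_{Q_{\omega_2}}$.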


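Note, of course, that these indicator functions have different domains, and, just as
with (23.1), further caution is called for with (23.4) in case $\Omega_1 = \Omega_2$. Equations
(23.4) and (23.5) lead us to call the mapping $f_{\omega_j}$ the $\omega_j$-section of $f$. It enjoys
the expected properties:

23.5 Lemma. For every measurable space $(\Omega', \mathscr{A}')$ and every measurable mapping

$f : (\Omega_1 \times \Omega_2, \mathscr{A}_1 \otimes \mathscr{A}_2) \to (\Omega', \mathscr{A}')$,

$f_{\omega_1}$ is $\mathscr{A}_2$-$\mathscr{A}'$-measurable and $f_{\omega_2}$ is $\mathscr{A}_1$-$\mathscr{A}'$-measurable for every $\omega_1 \in \Omega_1$, $\omega_2 \in \Omega_2$.

Proof. For every $A' \in \mathscr{A}'$, $\omega_1 \in \Omega_1$

$f_{\omega_1}^{-1}(A') = \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in f^{-1}(A')\} = (f^{-1}(A'))_{\omega_1}$

and similarly for every $\omega_2 \in \Omega_2$

$f_{\omega_2}^{-1}(A') = (f^{-1}(A'))_{\omega_2}$,

so the measurability claims follow from Lemma 23.1. □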

Decisive is the following theorem which extends formula (23.3) from indicator
functions to non-negative measurable functions. It goes back to L. TONELLI (1885-1946), its corollary to G. FUBINI (1879-1943). Both statements are often combined
under the single designation the theorem of Fubini.

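23.6 Theorem (of Tonelli). Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite measure spaces ($j = 1, 2$),
and let

$f : \Omega_1 \times \Omega_2 \to \overline{\mathbb{R}}_+$

be $\mathscr{A}_1 \otimes \mathscr{A}_2$-measurable. Then the functions

$\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$  and  $\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$

are $\mathscr{A}_2$-measurable and $\mathscr{A}_1$-measurable, respectively. Moreover,

(23.6)  $\int f \, d(\mu_1 \otimes \mu_2) = \int \Big(\int f_{\omega_2} \, d\mu_1\Big)\mu_2(d\omega_2) = \int \Big(\int f_{\omega_1} \, d\mu_2\Big)\mu_1(d\omega_1)$.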

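Proof. Set $\Omega := \Omega_1 \times \Omega_2$, $\mathscr{A} := \mathscr{A}_1 \otimes \mathscr{A}_2$ and $\pi := \mu_1 \otimes \mu_2$. Consider first an
$\mathscr{A}$-elementary function $f$:

$f := \sum_{j=1}^{n} \alpha_j 1_{Q^j}$  ($\alpha_j \ge 0$, $Q^j \in \mathscr{A}$, $n \in \mathbb{N}$).

Then a glance at (23.5) reveals that for each $\omega_2 \in \Omega_2$

$f_{\omega_2} = \sum_{j=1}^{n} \alpha_j 1_{Q^j_{\omega_2}}$  and so  $\int f_{\omega_2} \, d\mu_1 = \sum_{j=1}^{n} \alpha_j \mu_1(Q^j_{\omega_2})$,

an $\mathscr{A}_2$-measurable function on $\Omega_2$ thanks to 23.2. Its integration is therefore accomplished by (23.3) thus:

$\int \Big(\int f_{\omega_2} \, d\mu_1\Big)\mu_2(d\omega_2) = \sum_{j=1}^{n} \alpha_j \pi(Q^j) = \int f \, d\pi$,

which confirms the first equation in (23.6) for elementary $f$.

For an arbitrary $\mathscr{A}$-measurable numerical function $f \ge 0$ let $(u^{(n)})$ be a sequence of $\mathscr{A}$-elementary functions such that $u^{(n)} \uparrow f$. Then, as was noted in the
first part of the proof, $(u^{(n)}_{\omega_2})$ is a sequence of $\mathscr{A}_1$-elementary functions, which
obviously satisfy $u^{(n)}_{\omega_2} \uparrow f_{\omega_2}$ (for each $\omega_2 \in \Omega_2$). Consequently, the functions

$\varphi^{(n)}(\omega_2) := \int u^{(n)}_{\omega_2} \, d\mu_1$,  $\omega_2 \in \Omega_2$,

which are $\mathscr{A}_2$-measurable by what has already been proven, increase to the function

$\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$,

by 11.3. This function is therefore also $\mathscr{A}_2$-measurable and the monotone convergence theorem 11.4 says that

$\int \Big(\int f_{\omega_2} \, d\mu_1\Big)\mu_2(d\omega_2) = \sup_{n \in \mathbb{N}} \int \varphi^{(n)} \, d\mu_2$.

Again, by what has already been proved,

$\int \varphi^{(n)} \, d\mu_2 = \int u^{(n)} \, d\pi$  for each $n \in \mathbb{N}$.

By the choice of the sequence $(u^{(n)})$ and definition 11.3

$\int f \, d\pi = \sup_{n \in \mathbb{N}} \int u^{(n)} \, d\pi$.

Combining the last three equations gives the desired

$\int \Big(\int f_{\omega_2} \, d\mu_1\Big)\mu_2(d\omega_2) = \int f \, d\pi$,

and wholly analogous arguments establish the claims about the functions $f_{\omega_1}$. □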
Having disposed of non-negative functions, the next step in integration theory
is to pass over to integrable functions. For them we get

23.7 Corollary (Theorem of Fubini). For $j = 1, 2$ let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite
measure spaces, $f$ a $\mu_1 \otimes \mu_2$-integrable numerical function on $\Omega_1 \times \Omega_2$. Then for
$\mu_1$-almost every $\omega_1$ the function $f_{\omega_1}$ is $\mu_2$-integrable and for $\mu_2$-almost every $\omega_2$
the function $f_{\omega_2}$ is $\mu_1$-integrable. The functions

$\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$  and  $\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$

thus defined $\mu_1$-almost everywhere on $\Omega_1$ and $\mu_2$-almost everywhere on $\Omega_2$, respectively, are $\mu_1$-integrable and $\mu_2$-integrable, respectively, and equations (23.6) are
valid.

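Proof. Evidently for all $\omega_j \in \Omega_j$ ($j = 1, 2$),

$|f|_{\omega_j} = |f_{\omega_j}|$,  $(f^+)_{\omega_j} = (f_{\omega_j})^+$  and  $(f^-)_{\omega_j} = (f_{\omega_j})^-$,

so we will employ parenthesis-free notation. According to (23.6) the product measure $\pi := \mu_1 \otimes \mu_2$ satisfies

$\int \Big(\int |f_{\omega_1}| \, d\mu_2\Big)\mu_1(d\omega_1) = \int \Big(\int |f_{\omega_2}| \, d\mu_1\Big)\mu_2(d\omega_2) = \int |f| \, d\pi < +\infty$.

In particular, the $\mathscr{A}_1$-measurable numerical function $\omega_1 \mapsto \int |f_{\omega_1}| \, d\mu_2$ is $\mu_1$-integrable and so by 13.6 it is $\mu_1$-almost everywhere finite. That is (by 12.1),
for $\mu_1$-almost every $\omega_1$ the section $f_{\omega_1}$ is $\mu_2$-integrable. Consequently,

$\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2 = \int f^+_{\omega_1} \, d\mu_2 - \int f^-_{\omega_1} \, d\mu_2$

is a $\mu_1$-almost everywhere defined function, which is $\mathscr{A}_1$-measurable because that
is assured of each integral on the right by Theorem 23.6. In turn each of these
integrals is $\mu_1$-integrable by 23.6. So our $\mu_1$-almost everywhere defined function
$\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$ is $\mu_1$-integrable and

$\int \Big(\int f_{\omega_1} \, d\mu_2\Big)\mu_1(d\omega_1) = \int \Big(\int f^+_{\omega_1} \, d\mu_2\Big)\mu_1(d\omega_1) - \int \Big(\int f^-_{\omega_1} \, d\mu_2\Big)\mu_1(d\omega_1)$
$= \int f^+ \, d\pi - \int f^- \, d\pi = \int f \, d\pi$.

Of course, the roles of $\omega_1$ and $\omega_2$ can be interchanged in this argument and we
thereby secure the rest of what is being claimed. □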
The theorems of Tonelli and Fubini insure, in particular, that under the stated
hypotheses the order of repeated integrations is immaterial. We can emphasize this
by writing the equation (23.6) in the form

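(23.6')  $\int f \, d(\mu_1 \otimes \mu_2) = \int\!\!\int f(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2) = \int\!\!\int f(\omega_1, \omega_2)\,\mu_2(d\omega_2)\,\mu_1(d\omega_1)$.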
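
That the integrability hypothesis in 23.7 cannot simply be dropped can be seen already for counting measure on the non-negative integers (which is σ-finite), where iterated integrals are iterated series. The sketch below uses a standard textbook-style kernel `a(m, n)`, chosen here for illustration and not taken from the text: every inner sum has at most two non-zero terms and is therefore computed exactly, yet the two orders of summation give different values, while the sum of absolute values grows without bound.

```python
def a(m, n):
    """Hypothetical kernel on N x N: +1 on the diagonal, -1 just to the right of it."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

def row_sum(m):
    # For fixed m only n = m and n = m + 1 contribute, so this inner sum is exact.
    return a(m, m) + a(m, m + 1)

def col_sum(n):
    # For fixed n only m = n and (if n >= 1) m = n - 1 contribute.
    return a(n, n) + (a(n - 1, n) if n >= 1 else 0)

N = 1000  # truncation of the outer sums only
sum_rows_first = sum(row_sum(m) for m in range(N))  # partial sum of sum_m (sum_n a(m, n))
sum_cols_first = sum(col_sum(n) for n in range(N))  # partial sum of sum_n (sum_m a(m, n))
abs_mass_partial = sum(abs(a(m, n)) for m in range(N) for n in (m, m + 1))

print(sum_rows_first, sum_cols_first, abs_mass_partial)
```

The outer partial sums do not depend on `N`, so the discrepancy between the two orders persists in the limit; only the absolute mass keeps growing with `N`.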
That exceptional sets of measure zero cannot generally be ignored in the conclusions of Fubini's theorem is illustrated by the following example.
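Example. 1. Consider L-B measure $\lambda^2 = \lambda^1 \otimes \lambda^1$ on $\mathbb{R}^2$, the set $A := \mathbb{Q} \times \mathbb{R} \in \mathscr{B}^2$,
and its indicator function $f := 1_A$. According to 23.3 or 23.6 we have $\lambda^2(A) = \int f \, d\lambda^2 = 0$, so $f$ is $\lambda^2$-integrable. Nevertheless, for every $\omega_1 \in \mathbb{Q}$, the section
$f_{\omega_1} = 1_{\mathbb{R}}$ is not $\lambda^1$-integrable.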


Remark. 1. For certain measures $\mu_1$, $\mu_2$ which are not σ-finite the existence but
usually not the uniqueness of a product measure can be proved by other methods.
See, e.g., BERBERIAN [1962]. Even if just one of $\mu_1$ or $\mu_2$ fails to be σ-finite, the
second equality in (23.3) can fail. Cf. Exercise 1, p. 145 of HALMOS [1974], as
well as chapter IV, §16 of HAHN and ROSENTHAL [1948]. Moreover, there exist
$f : \Omega_1 \times \Omega_2 \to \mathbb{R}_+$ which are not $\mathscr{A}_1 \otimes \mathscr{A}_2$-measurable yet the "iterated integrals"
on the right side of (23.6) make sense (and are finite). For an abundance of illuminating but elementary counterexamples related to this famous theorem, see
CHATTERJI [1985-86] and MATTNER [1999].

A useful and at the same time surprising consequence of Tonelli's theorem is


that it permits $\mu$-integrals to be expressed by means of $\lambda^1$-integrals.

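23.8 Theorem. Let $(\Omega, \mathscr{A}, \mu)$ be a σ-finite measure space and $f : \Omega \to \mathbb{R}_+$ a measurable, non-negative, real function. Further, let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous
isotone function which is continuously differentiable at least on $\mathbb{R}_+^* := \,]0, +\infty[$
and satisfies $\varphi(0) = 0$. Then

(23.7)  $\int \varphi \circ f \, d\mu = \int_{\mathbb{R}_+^*} \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^1(dt) = \int_0^{+\infty} \varphi'(t)\,\mu(\{f \ge t\}) \, dt$.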

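Proof. Consider the L-B measure $\lambda^* := \lambda^1_{\mathbb{R}_+^*}$ on the σ-algebra $\mathscr{B}^* := \mathbb{R}_+^* \cap \mathscr{B}^1$. The
function $F : \Omega \times \mathbb{R}_+^* \to \mathbb{R}^2$ defined by

$F(\omega, t) := (f(\omega), t)$

is, according to Remark 2 in §22, $\mathscr{A} \otimes \mathscr{B}^*$-measurable, because each of its component functions is. Therefore the $F$-preimage of the closed half-plane $\{(x, y) \in \mathbb{R}^2 : x \ge y\}$, namely

$E := \{(\omega, t) \in \Omega \times \mathbb{R}_+^* : f(\omega) \ge t\}$,

lies in $\mathscr{A} \otimes \mathscr{B}^*$. Theorem 23.6 for the product measure $\mu \otimes \lambda^*$ consequently supplies
the equalities

(23.8)  $\int\!\!\int \varphi'(t)\,1_E(\omega, t)\,\lambda^*(dt)\,\mu(d\omega) = \int\!\!\int \varphi'(t)\,1_E(\omega, t)\,\mu(d\omega)\,\lambda^*(dt)$
$= \int \varphi'(t)\,\mu(E_t)\,\lambda^*(dt) = \int \varphi'(t)\,\mu(\{f \ge t\})\,\lambda^*(dt)$,

since the $t$-section of $E$ is just the set of all $\omega \in \Omega$ which satisfy $f(\omega) \ge t$. As $\varphi$
is isotone, $\varphi'(t) \ge 0$ for all $t > 0$. The continuous function $\varphi'$ is integrable over
$[1/n, a]$ whenever $1/n < a < +\infty$, and since $[1/n, a] \uparrow \,]0, a]$, and

$\int_{]0,a]} \varphi'(t)\,\lambda^*(dt) = \lim_{n \to \infty} \int_{1/n}^{a} \varphi'(t) \, dt = \varphi(a) - \lim_{n \to \infty} \varphi(1/n) = \varphi(a)$

($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$
for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that

$\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^*(dt) = \varphi(f(\omega))$  for every $\omega \in \Omega$,

both expressions being 0 whenever $f(\omega) = 0$. We thus get

$\int \varphi \circ f \, d\mu = \int \Big(\int_{]0, f(\omega)]} \varphi'(t)\,\lambda^*(dt)\Big)\mu(d\omega)$
$= \int\!\!\int \varphi'(t)\,1_{]0, f(\omega)]}(t)\,\lambda^*(dt)\,\mu(d\omega)$
$= \int\!\!\int \varphi'(t)\,1_E(\omega, t)\,\lambda^*(dt)\,\mu(d\omega)$,

which combined with (23.8) concludes the proof. □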

Example. 2. The relevant hypotheses are certainly fulfilled by the functions
$\varphi(t) := t^p$ with $p > 0$. Thus for every $\mathscr{A}$-measurable real function $f \ge 0$ on $\Omega$

(23.9)  $\int f^p \, d\mu = p \int_0^{+\infty} t^{p-1}\,\mu(\{f \ge t\}) \, dt$.

When $p = 1$ we get the especially important formula

(23.10)  $\int f \, d\mu = \int_{\mathbb{R}_+^*} \mu(\{f \ge t\})\,\lambda^1(dt) = \int_0^{+\infty} \mu(\{f \ge t\}) \, dt$.

The reader should not overlook the geometric significance of this, which is that
the integral $\int f \, d\mu$ is formed "vertically", while the integral on the right-hand side
of (23.10) is formed "horizontally".
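
The "vertical" and "horizontal" ways of integrating can be compared exactly when the measure has finitely many atoms, because $t \mapsto \mu(\{f \ge t\})$ is then a step function. The following sketch assumes such a finite situation; the weights `mu`, the values `f`, and the helper names are invented for illustration. For integer exponents `p` it evaluates both sides of (23.9) in exact rational arithmetic.

```python
from fractions import Fraction as Fr

# Hypothetical data: a measure on a four-point set and a non-negative real function on it.
mu = {"a": Fr(1, 2), "b": Fr(3), "c": Fr(1, 4), "d": Fr(2)}
f = {"a": Fr(0), "b": Fr(1, 2), "c": Fr(3), "d": Fr(1, 2)}

def vertical(p):
    """Left side of (23.9): the integral of f^p with respect to mu."""
    return sum(f[w] ** p * mu[w] for w in mu)

def horizontal(p):
    """Right side of (23.9), integrated exactly over each interval between levels."""
    levels = sorted({v for v in f.values() if v > 0})
    total, previous = Fr(0), Fr(0)
    for v in levels:
        # mu({f >= t}) is constant for previous < t <= v.
        tail = sum(mu[w] for w in mu if f[w] >= v)
        # p * integral of t^(p-1) over ]previous, v] equals v^p - previous^p.
        total += (v ** p - previous ** p) * tail
        previous = v
    return total

for p in (1, 2, 3):
    print(p, vertical(p), horizontal(p))
```

With `p = 1` this is formula (23.10): the left side sums value times weight point by point, the right side sums level height times the measure of the corresponding super-level set.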

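Now at last we turn back to the general case of §22 and consider finitely many
σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $j = 1, \dots, n$ and $n \ge 2$.
The two product sets $(\Omega_1 \times \dots \times \Omega_{n-1}) \times \Omega_n$ and $\Omega_1 \times \dots \times \Omega_{n-1} \times \Omega_n$ will
be identified via the bijection

$((\omega_1, \dots, \omega_{n-1}), \omega_n) \mapsto (\omega_1, \dots, \omega_{n-1}, \omega_n)$.

The agreed-upon equality of these sets leads at once to the equality of the corresponding products of σ-algebras:

(23.11)  $(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n = \mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n$.

In fact, by 22.1 the sets $A_1 \times \dots \times A_{n-1}$ with each $A_j \in \mathscr{A}_j$ generate $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}$,
and by the same theorem the sets

$(A_1 \times \dots \times A_{n-1}) \times A_n = A_1 \times \dots \times A_n$  ($A_j \in \mathscr{A}_j$)

then generate $(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ as well as $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1} \otimes \mathscr{A}_n$.

In a completely analogous fashion one confirms a general associativity in the
formation of products of σ-algebras:

(23.12)  $\Big(\bigotimes_{j=1}^{m} \mathscr{A}_j\Big) \otimes \Big(\bigotimes_{j=m+1}^{n} \mathscr{A}_j\Big) = \bigotimes_{j=1}^{n} \mathscr{A}_j$  ($1 \le m < n \in \mathbb{N}$).

The convention (23.11) opens up the possibility of proving the existence of product
measures on any finite number $n \ge 2$ of factors via induction on $n$.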

23.9 Theorem. σ-finite measures $\mu_1, \dots, \mu_n$ on σ-algebras $\mathscr{A}_1, \dots, \mathscr{A}_n$ uniquely
determine a measure $\pi$ on $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$ such that

(23.13)  $\pi(A_1 \times \dots \times A_n) = \mu_1(A_1) \cdot \ldots \cdot \mu_n(A_n)$  for all $A_j \in \mathscr{A}_j$, $1 \le j \le n$.

This measure $\pi$ is σ-finite.

Corresponding to Definition 23.4, $\pi$ is called the product of the measures
$\mu_1, \dots, \mu_n$ and is denoted by

$\bigotimes_{j=1}^{n} \mu_j$  or  $\mu_1 \otimes \dots \otimes \mu_n$.

The question posed in §22 is finally answered in full by this theorem.

Proof. In 22.2 take for the various generators $\mathscr{E}_j$ the σ-algebra $\mathscr{A}_j$ itself, and learn
that there is at most one measure $\pi$ which satisfies (23.13). The existence question
has already been settled for $n = 2$, in 23.3. We make the inductive assumption
that $\pi' := \mu_1 \otimes \dots \otimes \mu_{n-1}$ exists for some $n > 2$ and show how that leads to the
existence of $\mu_1 \otimes \dots \otimes \mu_n$. Evidently the σ-finiteness of $\mu_1, \dots, \mu_{n-1}$ entails that
of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with
a measure $\pi := \pi' \otimes \mu_n$ on $(\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}) \otimes \mathscr{A}_n$ which satisfies

$\pi(Q' \times A_n) = \pi'(Q')\,\mu_n(A_n)$

for all $Q' \in \mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_{n-1}$ and all $A_n \in \mathscr{A}_n$. Because of (23.11) this measure
does what is wanted at level $n$, completing the induction. Again, σ-finiteness of $\pi$
is confirmed exactly as in the proof of 23.3. □
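This inductive construction of the $n$-fold product measure builds in the equality

(23.14)  $(\mu_1 \otimes \dots \otimes \mu_{n-1}) \otimes \mu_n = \mu_1 \otimes \dots \otimes \mu_{n-1} \otimes \mu_n$.

By now familiar considerations show that in fact a general associativity prevails
in the formation of product measures:

(23.15)  $\Big(\bigotimes_{j=1}^{m} \mu_j\Big) \otimes \Big(\bigotimes_{j=m+1}^{n} \mu_j\Big) = \bigotimes_{j=1}^{n} \mu_j$  ($1 \le m < n \in \mathbb{N}$).

In particular

$\lambda^d = \lambda^1 \otimes \dots \otimes \lambda^1$  with $d$ factors.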


In view of (23.15) induction can also be used to extend the theorems of Tonelli
and Fubini to multiple factors. We will formulate only the analog of 23.6:
Let $f \ge 0$ be an $\mathscr{A}_1 \otimes \dots \otimes \mathscr{A}_n$-measurable numerical function on $\Omega_1 \times \dots \times \Omega_n$.
Then for every permutation $j_1, \dots, j_n$ of $1, \dots, n$

(23.16)  $\int f \, d(\mu_1 \otimes \dots \otimes \mu_n) = \int \Big(\dots \Big(\int \Big(\int f(\omega_1, \dots, \omega_n)\,\mu_{j_1}(d\omega_{j_1})\Big)\mu_{j_2}(d\omega_{j_2})\Big) \dots \Big)\mu_{j_n}(d\omega_{j_n})$.

Every integral that occurs on the right-hand side is measurable with respect to
the product of the appropriate $\mathscr{A}_j$, namely those corresponding to the coordinates
in which integration has not yet occurred. This right-hand side is often written in
the shorter fashion

$\int \dots \int f(\omega_1, \dots, \omega_n)\,\mu_{j_1}(d\omega_{j_1}) \dots \mu_{j_n}(d\omega_{j_n})$.

The simple proof of this theorem (involving induction), as well as the formulation and proof of the analog of 23.7, will be left to the reader.
One more piece of notation is convenient:

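23.10 Definition. For finitely many σ-finite measure spaces $(\Omega_j, \mathscr{A}_j, \mu_j)$, $1 \le j \le n$,
the triple $\Big(\prod_{j=1}^{n} \Omega_j, \bigotimes_{j=1}^{n} \mathscr{A}_j, \bigotimes_{j=1}^{n} \mu_j\Big)$ is called the product of these measure spaces
and is denoted by

$\bigotimes_{j=1}^{n} (\Omega_j, \mathscr{A}_j, \mu_j)$.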

Remark. 2. Throughout the preceding the index set was finite. But there is
also a theory of products of (finite) measures indexed by arbitrary sets, which
is particularly important in probability theory; it is treated in detail by BAUER
[1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For
p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.

In closing we will consider the case where each measure $\mu_j$ comes with a real
density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j \mu_j$ is then a σ-finite measure
too.

23.11 Theorem. Let $(\Omega_j, \mathscr{A}_j, \mu_j)$ be σ-finite measure spaces and $f_j \ge 0$ real-valued,
$\mathscr{A}_j$-measurable functions on $\Omega_j$. Set

$\nu_j := f_j \mu_j$,  $j = 1, \dots, n$.

Then the product of these measures is defined and satisfies

(23.17)  $\bigotimes_{j=1}^{n} \nu_j = F \cdot \Big(\bigotimes_{j=1}^{n} \mu_j\Big)$

with the density function

(23.18)  $F(\omega_1, \dots, \omega_n) := \prod_{j=1}^{n} f_j(\omega_j)$.

The function $F$ is the so-called tensor product of the densities $f_1, \dots, f_n$.


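Proof. As already noted, 17.11 insures that each measure $\nu_j$ is σ-finite, guaranteeing
that their product is defined. It suffices to treat the case $n = 2$ and refer the
general case to induction. For sets $A_1 \in \mathscr{A}_1$ and $A_2 \in \mathscr{A}_2$

$\nu_1(A_1)\,\nu_2(A_2) = \Big(\int_{A_1} f_1 \, d\mu_1\Big)\Big(\int_{A_2} f_2 \, d\mu_2\Big)$
$= \int\!\!\int 1_{A_1}(\omega_1) f_1(\omega_1)\,1_{A_2}(\omega_2) f_2(\omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)$
$= \int\!\!\int 1_{A_1 \times A_2}(\omega_1, \omega_2)\,F(\omega_1, \omega_2)\,\mu_1(d\omega_1)\,\mu_2(d\omega_2)$.

From 23.6 therefore

$\nu_1(A_1)\,\nu_2(A_2) = \int_{A_1 \times A_2} F \, d(\mu_1 \otimes \mu_2)$  for all $A_1 \in \mathscr{A}_1$, $A_2 \in \mathscr{A}_2$.

But then according to 23.3, $\nu_1 \otimes \nu_2$ coincides with the measure $F \cdot (\mu_1 \otimes \mu_2)$. □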

Exercises.
1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathscr{A}_1 = \mathscr{A}_2 := \mathscr{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-σ-finite
counting measure on $\mathscr{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to
hold for $Q := D$, the diagonal $\{(\omega, \omega) : \omega \in \mathbb{R}\}$ in $\Omega_1 \times \Omega_2$. Why does $D$ lie in
$\mathscr{A}_1 \otimes \mathscr{A}_2 = \mathscr{B}^2$?
2. Show that the function
$(x, y) \mapsto 2e^{-2xy} - e^{-xy}$
is not $\lambda^2$-integrable over the set $[1, +\infty[ \, \times \, [0, 1]$.

3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the
following lines: If $\mu$ is a translation-invariant measure on $\mathscr{B}^d$, $\mu([0, 1[) = 1$, and $f \ge 0$,
$g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals

$\int\!\!\int g(y) f(x + y)\,\mu(dx)\,\lambda^d(dy)$  and  $\int\!\!\int g(y - x) f(y)\,\mu(dx)\,\lambda^d(dy)$

and, finally, take $f$ to be any indicator function, $g$ the indicator function of $[0, 1[$.
4. Compute

$I := \int_0^{\infty} e^{-x^2} \, dx$,

and thereby evaluate anew the important integral $G = 2I$ in (16.1), in the following
simple way: $\int_0^{\infty} e^{-t^2} \, dt = \int_0^{\infty} y\,e^{-y^2 x^2} \, dx$ for every $y > 0$ and therefore
$I^2 = \int_0^{\infty} \Big(\int_0^{\infty} f(x, y) \, dx\Big) dy$ for the function $f$ on $\mathbb{R}_+ \times \mathbb{R}_+$ defined by $f(x, y) := y\,e^{-y^2(1 + x^2)}$.
Applying Tonelli's theorem leads to $I = \frac{1}{2}\sqrt{\pi}$.

5. Let $|x| := (x_1^2 + \dots + x_d^2)^{1/2}$ denote the usual euclidean norm of the vector
$x := (x_1, \dots, x_d) \in \mathbb{R}^d$. Show that the function $x \mapsto e^{-|x|^{\alpha}}$ is $\lambda^d$-integrable for
every $\alpha > 0$. (Recall Exercise 2 of §16.) In case $\alpha = 2$, show that the $\lambda^d$-integral
of this function is $G^d$.

6. $\overline{K}_r(x_0)$ will denote the closed ball in $\mathbb{R}^d$ with center $x_0$ and radius $r > 0$. Set
$\alpha_d := \lambda^d(\overline{K}_1(0))$ and prove that

$\lambda^d(\overline{K}_r(x_0)) = \alpha_d r^d$.

Show also that the numbers $\alpha_d$ can be calculated by

$\alpha_{2q} = \dfrac{1}{q!}\,\pi^q$  and  $\alpha_{2q-1} = \dfrac{2^q}{1 \cdot 3 \cdot \ldots \cdot (2q - 1)}\,\pi^{q-1}$  ($q \in \mathbb{N}$).

[Hint: Use (7.10) and note that every $x_d$-section of $\overline{K}_r(0)$ is either empty or is
a $(d-1)$-dimensional closed ball. Tonelli's theorem then leads to a recursion formula
for the $\alpha_d$. Here, of course, $\pi$ has its customary geometric meaning.]

How do these relations change if we replace $\overline{K}_r(x_0)$ by the open ball $K_r(x_0)$
in $\mathbb{R}^d$ of radius $r$ and center $x_0$? [Cf. Exercise 3 in §7.]
7. For every compact interval $[\alpha, \beta] \subset \mathbb{R}_+$ designate by $R(\alpha, \beta)$ the spherical shell

$\overline{K}_\beta(0) \setminus \overline{K}_\alpha(0) = \{x \in \mathbb{R}^d : \alpha < |x| \le \beta\}$.

Show that for every continuous real function $h$ on such an interval $[\alpha, \beta] \subset \mathbb{R}_+$

$\int_{R(\alpha, \beta)} h(|x|)\,\lambda^d(dx) = d\,\alpha_d \int_{\alpha}^{\beta} h(t)\,t^{d-1} \, dt$,

$\alpha_d$ being the number $\lambda^d(\overline{K}_1(0))$ from the preceding exercise. [Hint: The function $H$
defined on $[\alpha, \beta]$ by

$H(t) := \int_{R(\alpha, t)} h(|x|)\,\lambda^d(dx)$,

is differentiable with $H'(t) = d\,\alpha_d\,h(t)\,t^{d-1}$ for all such $t$.]

8. Apply the result of Exercise 7 to the case $d = 2$ and $h(t) := e^{-t^2}$, $t \in \mathbb{R}_+$, in order to
show, using Exercise 5, once again that $G = \sqrt{\pi}$.

9. Let $(\Omega, \mathscr{A}, \mu)$ be a σ-finite measure space, $f : \Omega \to \mathbb{R}_+$ measurable. Show that
the set of all $t > 0$ such that $\mu(\{f = t\}) \ne 0$, as well as the set of all $t > 0$ such
that $\mu(\{f \ge t\}) \ne \mu(\{f > t\})$ is countable. Therefore in the equalities (23.8),
(23.9) and (23.10), $\mu(\{f \ge t\})$ can always be replaced by $\mu(\{f > t\})$.

24. Convolution of finite Borel measures



Consider the $d$-dimensional Borel measurable space $(\mathbb{R}^d, \mathscr{B}^d)$. Every finite measure on $\mathscr{B}^d$ will be called a finite or also a bounded Borel measure, and the set
of all of them will be designated by $\mathscr{M}_+^b(\mathbb{R}^d)$. For every such $\mu$ the number

(24.1)  $\|\mu\| := \mu(\mathbb{R}^d)$

is called the total mass of $\mu$.

Making critical use of the group structure of $(\mathbb{R}^d, +)$ a so-called convolution
product can be assigned to any finitely many measures $\mu_1, \dots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$;
in contrast to the previously studied product measure, it is again a measure on
the original σ-algebra $\mathscr{B}^d$, even an element of $\mathscr{M}_+^b(\mathbb{R}^d)$. What we do below can
be carried out in every (abelian) locally compact group. We cannot, however, go
into this generalization, but must instead refer interested readers to the excellent
monographs of HEWITT and ROSS [1979] and RUDIN [1962]. Initially we consider
the product measure $\mu_1 \otimes \dots \otimes \mu_n$ defined in §23. Since $\mathscr{B}^{nd} = \mathscr{B}^d \otimes \dots \otimes \mathscr{B}^d$,
this measure is an element of $\mathscr{M}_+^b(\mathbb{R}^{nd})$. The mapping $A_n : \mathbb{R}^{nd} \to \mathbb{R}^d$ defined by

$A_n(x_1, \dots, x_n) := x_1 + \dots + x_n$

is continuous, and so $\mathscr{B}^{nd}$-$\mathscr{B}^d$-measurable. The following definition accordingly
makes sense:

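24.1 Definition. The image under the mapping $A_n$ of the product measure
$\mu_1 \otimes \dots \otimes \mu_n$ is called the convolution product of the measures $\mu_1, \dots, \mu_n \in \mathscr{M}_+^b(\mathbb{R}^d)$,
in symbols

(24.2)  $\mu_1 * \dots * \mu_n := A_n(\mu_1 \otimes \dots \otimes \mu_n)$.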

The theorems on product and image measures combine to yield the most important properties of the convolution operation $*$. First of all, $\mu_1 * \dots * \mu_n$ is again
an element of $\mathscr{M}_+^b(\mathbb{R}^d)$ and

$\mu_1 * \dots * \mu_n(\mathbb{R}^d) = \mu_1 \otimes \dots \otimes \mu_n(\mathbb{R}^{nd}) = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$,

so that in fact

(24.3)  $\|\mu_1 * \dots * \mu_n\| = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$.

In studying the convolution product it suffices to deal with $n = 2$, because

(24.4)  $\mu_1 * \dots * \mu_n * \mu_{n+1} = (\mu_1 * \dots * \mu_n) * \mu_{n+1}$

for every $n + 1$ measures from $\mathscr{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous
mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by

$B_{n+1}(x_1, \dots, x_n, x_{n+1}) := (x_1 + \dots + x_n, x_{n+1})$

and have $A_{n+1} = A_2 \circ B_{n+1}$. Checking that

$B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1}) = A_n(\mu_1 \otimes \dots \otimes \mu_n) \otimes \mu_{n+1}$,

and remembering that the formation of image measures is transitive, we get

$\mu_1 * \dots * \mu_n * \mu_{n+1} = A_2(B_{n+1}(\mu_1 \otimes \dots \otimes \mu_n \otimes \mu_{n+1}))$
$= A_2((\mu_1 * \dots * \mu_n) \otimes \mu_{n+1})$,

which confirms (24.4). Henceforth therefore $n = 2$.
For any measures $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any $\mathscr{B}^d$-measurable numerical function
$f \ge 0$ it follows from 19.1 and 23.6 that

(24.5)  $\int f \, d(\mu * \nu) = \int f \circ A_2 \, d(\mu \otimes \nu) = \int\!\!\int f(x + y)\,\mu(dx)\,\nu(dy) = \int\!\!\int f(x + y)\,\nu(dy)\,\mu(dx)$.

As this holds for $f := 1_B$, the indicator function of any set $B \in \mathscr{B}^d$, we have

(24.6)  $\mu * \nu(B) = \int \mu(B - y)\,\nu(dy) = \int \nu(B - x)\,\mu(dx)$.

(Recall (7.8) that $B - x = -x + B$.) Consequently $*$ is a commutative, and by (24.4)
also an associative operation in $\mathscr{M}_+^b(\mathbb{R}^d)$.
Due to 19.2 and 23.7, (24.5) are valid as well for every $\mu * \nu$-integrable numerical
function $f$ on $\mathbb{R}^d$. Equality (24.6) is frequently taken as the definition of $\mu * \nu$.
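Evidently $\mathscr{M}_+^b(\mathbb{R}^d)$ is closed with respect to addition and under multiplication
by numbers in $\mathbb{R}_+$. From (24.6) we immediately see the relation of convolution to
these two operations: For all $\mu, \nu, \nu_1, \nu_2 \in \mathscr{M}_+^b(\mathbb{R}^d)$, $\alpha \in \mathbb{R}_+$

(24.7)  $\mu * (\nu_1 + \nu_2) = \mu * \nu_1 + \mu * \nu_2$,
(24.8)  $\mu * (\alpha\nu) = (\alpha\mu) * \nu = \alpha(\mu * \nu)$.

The distributive law (24.7) even holds in the following generality: For every
sequence $(\nu_n)_{n \in \mathbb{N}}$ of measures from $\mathscr{M}_+^b(\mathbb{R}^d)$ satisfying $\sum_{n=1}^{\infty} \|\nu_n\| < +\infty$, the sum
$\sum_{n=1}^{\infty} \nu_n$ is also a measure in $\mathscr{M}_+^b(\mathbb{R}^d)$ (cf. Example 4 of §3). Taking account of 11.5,
it therefore follows from (24.6) that

(24.9)  $\mu * \Big(\sum_{n=1}^{\infty} \nu_n\Big) = \sum_{n=1}^{\infty} \mu * \nu_n$  for every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$.

Let us now compute $\mu * \nu$ in some special cases.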


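1. We again denote by $T_a$ the translation mapping $x \mapsto x + a$ of $\mathbb{R}^d$ onto itself via
$a \in \mathbb{R}^d$, and by $\varepsilon_a$ the (Dirac-)measure on $\mathscr{B}^d$ defined by unit mass at the point $a$.
Of course, $\varepsilon_a \in \mathscr{M}_+^b(\mathbb{R}^d)$ and $\|\varepsilon_a\| = 1$. From (24.6) follows that $\varepsilon_a * \mu(B) = \mu(B - a) = \mu(T_a^{-1}(B))$ for all $B \in \mathscr{B}^d$, and so

(24.10)  $\varepsilon_a * \mu = T_a(\mu)$  for all $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, $a \in \mathbb{R}^d$.

Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a - and obviously the only - unit with
respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for
every $\mu \in \mathscr{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$.
For the special choice $\mu := \varepsilon_b$, (24.10) says that

(24.10')  $\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}$  for all $a, b \in \mathbb{R}^d$.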
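
For finite combinations of Dirac measures the convolution is the image of a finite product measure under addition, so the rules above can be checked mechanically. The sketch below assumes measures on the real line supported on finitely many integer points and stores them as dictionaries `{point: mass}`; this representation and the names `convolve`, `dirac`, `mass` are illustrative choices, not notation from the text.

```python
from collections import defaultdict
from fractions import Fraction as Fr

def convolve(mu, nu):
    """Convolution of two finitely supported measures given as {point: mass}."""
    out = defaultdict(Fr)  # missing points start with mass 0
    for x, p in mu.items():
        for y, q in nu.items():
            # The atom (x, y) of the product measure is carried to x + y.
            out[x + y] += p * q
    return dict(out)

def dirac(a):
    """Unit mass at the point a."""
    return {a: Fr(1)}

def mass(mu):
    """Total mass of a finitely supported measure."""
    return sum(mu.values())

# Hypothetical example measures.
mu = {0: Fr(1, 2), 3: Fr(2)}
nu = {-1: Fr(1, 3), 1: Fr(1), 3: Fr(5)}

print(convolve(dirac(2), dirac(5)))   # unit mass at 7, as in (24.10')
print(convolve(dirac(4), mu))         # mu translated by 4, as in (24.10)
print(mass(convolve(mu, nu)), mass(mu) * mass(nu))
```

Commutativity and the multiplicativity of total mass (24.3) hold here for the same reason as in the general case: both are inherited from the product measure.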

2. Let $f \ge 0$ be a $\lambda^d$-integrable numerical function on $\mathbb{R}^d$ and $\mu := f\lambda^d$. Since
$\|\mu\| = \int f \, d\lambda^d < +\infty$, $\mu$ also lies in $\mathscr{M}_+^b(\mathbb{R}^d)$. Let us compute $\mu * \nu$ for an arbitrary
$\nu \in \mathscr{M}_+^b(\mathbb{R}^d)$. From 17.3 using the translation-invariance of $\lambda^d$ and the general
transformation theorem 19.1, we get

$\mu * \nu(B) = \int\!\!\int 1_B(x + y) f(x)\,\lambda^d(dx)\,\nu(dy)$
$= \int\!\!\int 1_B(x + y) f(x)\,T_{-y}(\lambda^d)(dx)\,\nu(dy)$
$= \int\!\!\int 1_B(x) f(x - y)\,\lambda^d(dx)\,\nu(dy)$

for every $B \in \mathscr{B}^d$. With the help of Tonelli's theorem it further follows that

$\mu * \nu(B) = \int 1_B(x)\,q(x)\,\lambda^d(dx) = \int_B q \, d\lambda^d$,

where $q$ is the non-negative $\mathscr{B}^d$-measurable function $x \mapsto \int f(x - y)\,\nu(dy)$. This
function is also $\lambda^d$-integrable, since $\int q \, d\lambda^d = \|\mu * \nu\| < +\infty$. Thus whenever $\mu$ has
a density with respect to $\lambda^d$, so does $\mu * \nu$. We set $f * \nu := q$, that is, we make the
definition

(24.11)  $f * \nu(x) := \int f(x - y)\,\nu(dy)$  for $x \in \mathbb{R}^d$.

The preceding result now assumes the more suggestive form

(24.12)  $(f\lambda^d) * \nu = (f * \nu)\lambda^d$.

Naturally $f * \nu$ is called the convolution of $f$ and $\nu$.

3. Besides $\mu = f\lambda^d$, let now $\nu = g\lambda^d$ also have a $\lambda^d$-integrable density $g \ge 0$.
According to 17.3 and the preceding

$f * (g\lambda^d)(x) = \int f(x - y)\,g(y)\,\lambda^d(dy)$  ($x \in \mathbb{R}^d$)

is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is,
we set

(24.13)  $f * g(x) := \int f(x - y)\,g(y)\,\lambda^d(dy)$  ($x \in \mathbb{R}^d$)

and get

(24.14)  $(f\lambda^d) * (g\lambda^d) = (f * g)\lambda^d$.

Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might
not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13)
and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$

$f * g(x) = \int f(x - y)\,g(y)\,\lambda^d(dy) = \int f(x + y)\,g(-y)\,\lambda^d(dy)$
$= \int f(y)\,g(x - y)\,\lambda^d(dy) = g * f(x)$.

That is, the $*$ operation between functions is also commutative:

(24.15)  $f * g = g * f$.

Similar calculations confirm its associativity; that is,

(24.16)  $(f * g) * h = f * (g * h)$

for all $\lambda^d$-integrable, non-negative functions $f, g, h$.

The distributive law

(24.17)  $f * (g + h) = f * g + f * h$

and the homogeneity property

(24.18)  $f * (\alpha g) = (\alpha f) * g = \alpha(f * g)$  ($\alpha \in \mathbb{R}_+$)

for such functions hold as well and follow immediately from (24.13).

4. For arbitrary functions $f, g \in \mathscr{L}^1(\lambda^d)$ decomposition into their positive and
negative parts and appeal to the results secured in 3. show that

$x \mapsto \int f(x - y)\,g(y)\,\lambda^d(dy)$,

while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always
$\lambda^d$-integrable. One can therefore define $f * g$ by

$f * g(x) := \int f(x - y)\,g(y)\,\lambda^d(dy)$

but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution
is used for this $f * g$.


Remarks. 1. For real-valued, non-negative functions $f, g \in \mathscr{L}^1(\lambda^d)$ the function $f * g$ need not be finite everywhere. It suffices to consider any real-valued,
non-negative, even function $f$ which lies in $\mathscr{L}^1(\lambda^d)$ but not in $\mathscr{L}^2(\lambda^d)$ and to take
$g = f$. Then $f * g(0) = +\infty$. In case $d = 1$, such a function is

$f(x) := 0$ for $|x| \ge 1$ or $x = 0$,  $f(x) := |x|^{-1/2}$ for $0 < |x| < 1$.

2. In passing to $L^1(\lambda^d)$ - cf. Remark 1 in §15 - the difficulties highlighted
above with the definition of $f * g$ disappear. Indeed, let $f \mapsto \hat{f}$ be the canonical
mapping of $\mathscr{L}^1(\lambda^d)$ onto $L^1(\lambda^d)$. One defines $\hat{f} * \hat{g}$ for arbitrary $\hat{f}, \hat{g} \in L^1(\lambda^d)$ as
the image $\hat{h}$ of a function $h \in \mathscr{L}^1(\lambda^d)$ which coincides $\lambda^d$-almost everywhere with
$f * g$. This definition is independent of the special choice of representing functions
$f$, $g$ and $h$ from $\mathscr{L}^1(\lambda^d)$. The new operation $*$ renders the vector space $L^1(\lambda^d)$ an
algebra over $\mathbb{R}$.

Exercises.

1. Show that for any $\mu, \nu \in \mathscr{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$,
$T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T \circ A_2 = A_2 \circ (T \otimes T)$,
where $T \otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d \times \mathbb{R}^d$ into itself.
2. Compute the $n$th convolution power of the function $f$ defined on $\mathbb{R}$ by $f(x) := e^{\ldots}$,
that is, the convolution $f * \dots * f$ with $n$ ($\in \mathbb{N}$) factors. Is it true that for
every $n \in \mathbb{N}$, $f$ has an "$n$th convolution root"? That is, is $f$ the $n$th convolution
power of some $\lambda^1$-integrable function $g \ge 0$?
3. If we set $N_1(f) := \int |f| \, d\lambda^d$ (this is (14.1) for $\mu := \lambda^d$), then

$N_1(f * g) \le N_1(f)\,N_1(g)$

holds for all $f, g \in \mathscr{L}^1(\lambda^d)$, and for non-negative functions equality prevails.
4. Write out the details of Remark 2 and show that

$\|\hat{f} * \hat{g}\|_1 \le \|\hat{f}\|_1\,\|\hat{g}\|_1$

holds for all elements $\hat{f}$ and $\hat{g}$ of the Banach space $L^1(\lambda^d)$. The latter is therefore
a Banach algebra.

Chapter IV

Measures on Topological Spaces

In view of many applications in analysis, geometry and probability theory it turns
out to be unavoidable to subject the Borel measures on $\mathbb{R}^d$ to more precise analysis. These measures possess a host of remarkable properties involving the topology
of $\mathbb{R}^d$. Up to this point topology has only entered the picture in the generation
of the Borel sets. We will see that the completeness of the euclidean metric, respectively, the local compactness of $\mathbb{R}^d$ were responsible for the aforementioned
properties. But there are more general topological spaces, important in their own
right, which share these properties with $\mathbb{R}^d$.
Therefore from the start we will set our exposition in an essentially more general framework: Instead of Borel measures on $\mathbb{R}^d$ we will study Radon measures
on Polish and locally compact spaces. In the process new facts, even in the $\mathbb{R}^d$
environment, about the nature of the integral and the measurability concept will
emerge. A natural and useful convergence concept will play a role.
In what follows some simple things from general (point-set) topology will be
pre-supposed. The textbooks of KELLEY [1955] and WILLARD [1970] are good
sources for these, and explicit references to them will be given at the appropriate
points in the text.

25. Borel sets, Borel and Radon measures


Initially $E$ will be an arbitrary topological space. The system of its open subsets
which defines the topology will be denoted $\mathscr{O}$. In the case of $\mathbb{R}^d$ we had determined (cf. 6.4) that the σ-algebra of Borel sets is generated by the open sets.
Consonant with this we now make the general

25.1 Definition. The σ-algebra in $E$ generated by $\mathscr{O}$ is denoted by $\mathscr{B}(E)$ and
called the Borel σ-algebra in $E$:

(25.1)  $\mathscr{B}(E) := \sigma(\mathscr{O})$.

The closed sets being the complements of the open ones, $\mathscr{B}(E)$ is also generated
by the system of all closed subsets of $E$. In this respect the analogy with 6.4 extends
a bit farther. The intersection of a sequence of open sets is called a $G_\delta$-set, and
the dual, the union of a sequence of closed sets is called an $F_\sigma$-set. All such sets
are clearly Borel.


From now on E will be a Hausdorff space. Then every compact subset of E is


closed, hence Borel. The second example below will show, however, that generally

the a-algebra generated by the class Xfof all compact subsets of E is strictly
smaller that _4(E). So at this point the analogy with 6.4 falters.
Examples. 1. From 6.4, as has already been mentioned,
-4 (Rd) = .Vd,

(25.2)

E := Rd here carrying its euclidean topology.


2.

Let E be a discrete space, meaning that 6 = 9(E). Then the system

' of

compact sets consists just of the finite subsets of E. Consequently (cf. Examples 2
and 7 in 1) o(..iE') is the countable and co-countable a-algebra, comprised of all
countable subsets of E and their complements, and so o ,(X) = -V(E) if and only
if E is countable.

3. Let $Q$ be a subspace of the Hausdorff space $E$. Then $\mathcal{B}(Q)$ is the trace of $\mathcal{B}(E)$ on $Q$:
$\mathcal{B}(Q) = Q \cap \mathcal{B}(E)$.
In fact, by definition the subspace topology of $Q$ consists of the sets $\{Q \cap G : G \in \mathcal{O}\}$, so this system generates $\mathcal{B}(Q)$. Since $Q \cap \mathcal{B}(E)$ is a $\sigma$-algebra in $Q$ which contains this system, it follows that $\mathcal{B}(Q) \subset Q \cap \mathcal{B}(E)$. On the other hand, the system $\{A \subset E : Q \cap A \in \mathcal{B}(Q)\}$ is obviously a $\sigma$-algebra in $E$ which contains all the open subsets of $E$, a generating system for $\mathcal{B}(E)$. Hence $Q \cap \mathcal{B}(E) \subset \mathcal{B}(Q)$.
If $Q$ itself is a Borel set in $E$, then $\mathcal{B}(Q)$ just consists of all the Borel sets in $E$ which are subsets of $Q$.
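4. The compactified number line $\bar{\mathbb{R}}$ is a topological space which is homeomorphic to the compact interval $[-1,+1]$. For it
(25.3)    $\mathcal{B}(\bar{\mathbb{R}}) = \bar{\mathcal{B}}^1$.
In fact, $\mathbb{R} \cap \mathcal{B}(\bar{\mathbb{R}}) = \mathcal{B}(\mathbb{R}) = \mathcal{B}^1$ by Examples 1 and 3 above. The subsets $\{-\infty\}$ and $\{+\infty\}$ are closed in $\bar{\mathbb{R}}$ and the subset $\mathbb{R}$ is open in $\bar{\mathbb{R}}$, hence all three are Borel sets in $\bar{\mathbb{R}}$. Equality (25.3) therefore follows from the definition of $\bar{\mathcal{B}}^1$, given in § 9.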
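To make Example 2 concrete, the following worked instance takes for $E$ the real line with the discrete topology. This particular choice of $E$ and of the test set $[0,1]$ is an illustration added here, not taken from the text; $\mathcal{K}$ denotes the system of compact sets and $\mathcal{P}$ the power set.

```latex
% Assumption for illustration: E = \mathbb{R} carrying the discrete topology.
\begin{align*}
\mathcal{K} &= \{K \subset \mathbb{R} : K \text{ finite}\},\\
\sigma(\mathcal{K}) &= \{A \subset \mathbb{R} : A \text{ countable or }
                        \mathbb{R} \setminus A \text{ countable}\},\\
\mathcal{B}(E) &= \sigma(\mathcal{P}(\mathbb{R})) = \mathcal{P}(\mathbb{R}).
\end{align*}
% [0,1] and its complement are both uncountable, hence
\[
[0,1] \in \mathcal{B}(E) \setminus \sigma(\mathcal{K}),
\qquad\text{so}\qquad \sigma(\mathcal{K}) \subsetneq \mathcal{B}(E).
\]
```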

In the sequel we will be studying measures on $\mathcal{B}(E)$ for two important classes of spaces $E$. In preparation for this we make

25.2 Definition. Let $E$ be a Hausdorff space. A measure $\mu$ on the $\sigma$-algebra $\mathcal{B}(E)$ is called:
(i) a Borel measure on $E$ if $\mu(K) < +\infty$ for every compact $K \subset E$;
(ii) locally finite if every point of $E$ has an open neighborhood of finite $\mu$-measure;
(iii) inner regular if for every $B \in \mathcal{B}(E)$
(25.4)    $\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\}$;
(iv) outer regular if for every $B \in \mathcal{B}(E)$
(25.5)    $\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}$;
(v) regular if it is both inner regular and outer regular.

Note that a Borel measure is more than just a measure defined on $\mathcal{B}(E)$: in addition finiteness on the system of compact sets is demanded. The inner and outer regularity conditions say that the measure is determined on every Borel set by its values on the compact, resp., the open sets. The Borel measures on $E = \mathbb{R}^d$ are already familiar to us from § 6.
Every finite measure on $\mathcal{B}(E)$ is obviously a Borel measure; as in § 24 where $E = \mathbb{R}^d$, we naturally call it a finite Borel measure on $E$. The notation introduced there for the total mass of a finite Borel measure will be carried over to this more general setting: For every finite Borel measure $\mu$ on a Hausdorff space $E$
(25.6)    $\|\mu\| := \mu(E)$
is called the total mass of $\mu$.


Already at this point we can observe that every locally finite measure $\mu$ on $\mathcal{B}(E)$ is a Borel measure, that is, that
(25.7)    (ii) $\Rightarrow$ (i).
Indeed, each point $x$ in the compact set $K$ has an open neighborhood $V_x$ with $\mu(V_x) < +\infty$, and compactness means that finitely many of these, say those corresponding to $x_1, \ldots, x_n$, cover $K$. Then
$\mu(K) \le \mu(V_{x_1} \cup \ldots \cup V_{x_n}) \le \sum_{j=1}^{n} \mu(V_{x_j}) < +\infty$.
The converse of (25.7) is, however, not generally valid. Exercise 2 below furnishes an example.

Because of the implication (25.7), instead of locally finite measures defined on $\mathcal{B}(E)$, we will henceforth say simply locally finite Borel measures.

For the moment we will be content to illustrate the regularity concept with
some examples.

Examples. 5. Let $E$ be an arbitrary Hausdorff space, $a$ a point in $E$. The measure $\varepsilon_a$ on $\mathcal{B}(E)$ defined by unit mass at $a$:
(25.8)    $\varepsilon_a(A) = 1_A(a)$ for $A \in \mathcal{B}(E)$
is both inner and outer regular on $E$. Henceforth it will be called the Dirac measure on $E$ at $a$.
6. As in Example 2, let $E$ be a discrete space, so that $\mathcal{O} = \mathcal{P}(E)$. The compact sets are just the finite ones. The measure defined on $\mathcal{B}(E) = \mathcal{P}(E)$ by
$\mu(A) := \begin{cases} 0 & \text{if } A \text{ is countable} \\ +\infty & \text{otherwise} \end{cases}$
is a locally finite Borel measure which is obviously outer regular. It is, however, inner regular if and only if the set $E$ is countable.
7. On $\mathcal{B}^1 = \mathcal{B}(\mathbb{R})$ consider the counting measure. It is not a Borel measure; it is, however, inner regular, but not outer regular. In fact, equality (25.5) fails even for one-point sets $B$.

8. L-B measure $\lambda^d$ on $\mathcal{B}^d = \mathcal{B}(\mathbb{R}^d)$ is a (locally finite) Borel measure. In § 26 we will see that it - and indeed every Borel measure on $\mathbb{R}^d$ - is regular.
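The failure of outer regularity claimed in Example 7 can be checked directly. In the sketch below $\zeta$ is our own notation for the counting measure on $\mathcal{B}(\mathbb{R})$, and $x$ is an arbitrary real number; neither name comes from the text.

```latex
% \zeta denotes the counting measure (notation assumed here, not from the text).
\begin{align*}
\zeta(\{x\}) &= 1,\\
\zeta(U) &= +\infty \quad\text{for every open } U \ni x,
  \text{ since } U \supset \,]x-\delta,\, x+\delta[\, \text{ for some } \delta > 0,\\
\inf\{\zeta(U) : \{x\} \subset U \text{ open}\} &= +\infty \;\neq\; 1 = \zeta(\{x\}).
\end{align*}
% The same count shows that \zeta is not a Borel measure:
\[
\zeta([0,1]) = +\infty \quad\text{although } [0,1] \text{ is compact.}
\]
```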

Developments stretching over decades attest to the fact that on a Hausdorff space those Borel measures which are locally finite and inner regular play a distinguished role. Such measures are nowadays named after J. RADON (1887-1956). A work of his from the year 1913 (cf. the bibliography), which has since become classical, set this development in motion.
25.3 Definition. A measure defined on the Borel $\sigma$-algebra $\mathcal{B}(E)$ of a Hausdorff space $E$ is called a Radon measure on $E$ if it is both locally finite and inner regular.

More precisely the term used is "positive" Radon measure, but in this book we dispense with that adjective because non-negativity is built into our definition of measure, that is, we consider only measures with values in $[0, +\infty]$. Example 5 says that the Dirac measure at any point $a \in E$ is always a Radon measure on $E$.
We have already noted that Borel measures are not automatically locally finite. Nevertheless for many spaces Radon measures can be defined simply as the inner regular Borel measures. That is the import of

25.4 Lemma. On a Hausdorff space $E$ in which every point has a countable neighborhood basis, every inner regular Borel measure $\mu$ on $E$ is also locally finite and hence a Radon measure.
Proof. We argue by contradiction: Suppose that $\mu$ is not locally finite, which means there is a point $x \in E$ such that $\mu(V) = +\infty$ for every open neighborhood $V$ of $x$. By hypothesis $x$ has a neighborhood basis consisting of a sequence $(V_n)$ of open sets, and by replacing each $V_n$ with $V_1 \cap \ldots \cap V_n$, we may suppose that $V_n \downarrow \{x\}$. Since $\mu(V_n) = +\infty$ and $\mu$ is inner regular, there exists a compact subset $K_n \subset V_n$ such that $\mu(K_n) > n$, and this is true of each $n \in \mathbb{N}$. Now the set
$K := \{x\} \cup \bigcup_{n \in \mathbb{N}} K_n$
is compact. For if $\mathcal{U}$ is an open cover of $K$, then some $U \in \mathcal{U}$ contains $x$ and since $(V_n)$ is a neighborhood basis at $x$, $V_{n_0} \subset U$ for some $n_0 \in \mathbb{N}$. It follows that $K_n \subset V_n \subset V_{n_0} \subset U$ for all $n \ge n_0$. Since $K_1 \cup \ldots \cup K_{n_0}$ is a compact subset of $K$, it is covered by finitely many sets in $\mathcal{U}$. These together with $U$ then furnish the desired finite covering of $K$. On the one hand then $\mu(K) < +\infty$, since $\mu$ is a Borel measure, and on the other hand, since $K_n \subset K$,
$\mu(K) \ge \mu(K_n) > n$ for all $n \in \mathbb{N}$.
This is the contradiction sought. □

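Exercises.
1. Let $(\Omega, \mathcal{A})$ be a measurable space, $\mathcal{E}$ a generator of $\mathcal{A}$ and $\Omega'$ a subset of $\Omega$. Consider the traces $\mathcal{A}'$ and $\mathcal{E}'$ of $\mathcal{A}$ and $\mathcal{E}$, resp., on $\Omega'$ and show that $\mathcal{E}'$ is a generator of the $\sigma$-algebra $\mathcal{A}'$ in $\Omega'$. Example 3 above is a special case.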
2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes named after SORGENFREY [1947]) whose system $\mathcal{O}_r$ of open sets is defined as follows: A subset $U \subset \mathbb{R}$ lies in $\mathcal{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$ such that $[x, x + \varepsilon[ \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$. Establish, one after another, the following claims:
(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The right-sided topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular, $\mathbb{R}_r$ is a Hausdorff space.
(b) $\mathcal{B}(\mathbb{R}_r) = \mathcal{B}^1$.
(c) Suppose $(x_n)$ is a strictly isotone sequence of real numbers possessing the supremum $b \in \mathbb{R}$. Then the set $\{x_n : n \in \mathbb{N}\} \cup \{b\}$ is closed but not compact in $\mathbb{R}_r$. By contrast, if $(y_n)$ is a strictly antitone sequence of real numbers possessing the infimum $a \in \mathbb{R}$, then $\{a\} \cup \{y_n : n \in \mathbb{N}\}$ is compact in $\mathbb{R}_r$.
(d) Let $K$ be compact in $\mathbb{R}_r$. Then there exists (from the first part of (c)) for every $x \in K$ a $y \in \mathbb{Q}$ with $y < x$ and $[y, x[ \cap K = \emptyset$. If for each $x \in K$, $\rho(x)$ designates such a rational number $y$, then a mapping $\rho : K \to \mathbb{Q}$ materializes which is strictly isotone, and hence injective.
(e) Every compact subset of $\mathbb{R}_r$ is countable. (But (c) shows that the converse is not true.)
(f) Consider on $\mathcal{B}(\mathbb{R}_r) = \mathcal{B}^1$ the measure $\mu$ which assigns to every countable set the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then $\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of finite measure. In particular, the measure $\mu$ is not locally finite and is neither inner regular nor outer regular.
(g) Consider the measure $\nu := f\lambda^1$ with density
$f(x) := x^{-1} \, 1_{]0,+\infty[}(x)$    ($x \in \mathbb{R}$)
and show that it too is a non-locally-finite Borel measure on $\mathbb{R}_r$.
(h) Investigate the L-B measure $\lambda^1$, thought of as a Borel measure on $\mathbb{R}_r$, in respect to its inner and outer regularity.


26. Radon measures on Polish spaces


For two extensive classes of Hausdorff spaces Borel measures come up very naturally. The first of these classes will be discussed in this section, beginning of course with its

26.1 Definition. A topological space $E$ is called Polish when its topology has a countable base and can be defined by a complete metric.
The terminology is due to N. BOURBAKI and commemorates the achievements of Polish topologists in the development of general topology.
A metric is called complete when the associated metric space is complete: every Cauchy sequence in it converges. A countable base or basis for the topology is a countable system of open sets such that every open set is the union of those from the system which are subsets of it. For a metrizable space $E$ the existence of such a basis is equivalent to the existence of a countable dense subset.

Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the ordinary euclidean metric being complete.
2. The product $E' \times E''$ of two Polish spaces is another, when given the product topology. For if $d', d''$ are complete metrics generating the topologies of $E'$ and $E''$, resp., then the product topology of $E' \times E''$ is generated by the metric
$d(x, y) := d'(x', y') + d''(x'', y'')$,    $x := (x', x'')$, $y := (y', y'')$,
which moreover is complete. If $\mathcal{G}', \mathcal{G}''$ are countable bases for $E', E''$, resp., then $\{G' \times G'' : G' \in \mathcal{G}', G'' \in \mathcal{G}''\}$ is a countable basis for $E' \times E''$.
3. Every closed subspace $F$ of a Polish space $E$ is Polish. Just restrict to $F$ any complete metric that generates the topology of $E$.
4. Every open subspace $G$ of a Polish space $E$ is Polish.

Proof. We may suppose $G \ne E$. By 1. and 2. $\mathbb{R} \times E$ is Polish. Let $d$ be a complete metric giving the topology of $E$, and consider the set $F$ of all $(\lambda, x) \in \mathbb{R} \times E$ satisfying
$\lambda \cdot d(x, E \setminus G) = 1$.
Here, as usual, for $\emptyset \ne A \subset E$, $d(x, A) := \inf\{d(x, a) : a \in A\}$ is the distance from the point $x \in E$ to $A$. The mapping $x \mapsto d(x, A)$ is continuous on $E$, in fact, as the reader can easily check, $|d(x, A) - d(y, A)| \le d(x, y)$ for all $x, y \in E$. Consequently, $(\lambda, x) \mapsto \lambda \cdot d(x, E \setminus G)$ is a continuous real function on $\mathbb{R} \times E$, and $F$ is a closed subset of $\mathbb{R} \times E$, hence itself a Polish space, by 3. Finally, $(\lambda, x) \mapsto x$ maps $F$ homeomorphically onto $G$. To see surjectivity, we only have to notice that, because $E \setminus G$ is closed, $G$ coincides with the set $\{x \in E : d(x, E \setminus G) > 0\}$.
5. More generally it is true (cf. COHN [1980], Theorem 8.1.4 or WILLARD [1970], Theorem 24.12) that a subspace $A$ of a Polish space $E$ is Polish if $A$ is a $G_\delta$-set in $E$, that is, $A$ is the intersection of a sequence of open subsets of $E$. Thus, for example, the set $J$ of all irrational numbers with its topology as a subspace of $\mathbb{R}$ is Polish, since
$J = \bigcap_{x \in \mathbb{Q}} (\mathbb{R} \setminus \{x\})$.

6. Every compact space $E$ with a countable basis is Polish. For a famous theorem of P.S. URYSOHN (1898-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970], Theorem 23.1) guarantees that $E$ is metrizable, and in Remark 3 of § 31 we shall even give a proof of this. The compactness of $E$ easily entails that every metric defining its topology is complete.
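As an illustration of the construction in Example 4, consider the special case $E = \mathbb{R}$ with $d(x,y) = |x-y|$ and $G = \,]0,+\infty[$; this choice is ours and serves only as a sketch. The set $F$ then turns out to be one branch of a hyperbola, and pulling the product metric of Example 2 back to $G$ gives an explicit complete metric, called $d_G$ here (a name introduced for this illustration).

```latex
% Assumed special case: E = \mathbb{R}, G = ]0,+\infty[, so E \setminus G = ]-\infty, 0].
\begin{align*}
d(x, E \setminus G) &= \max(x, 0) \qquad (x \in \mathbb{R}),\\
F &= \{(\lambda, x) \in \mathbb{R} \times \mathbb{R} : \lambda \cdot \max(x, 0) = 1\}
   = \{(1/x,\, x) : x > 0\},\\
d_G(x, y) &:= |1/x - 1/y| + |x - y| \qquad (x, y \in G).
\end{align*}
% d_G generates the usual topology of G. The sequence (1/n) is Cauchy for |x - y|
% but not for d_G, because |n - m| \ge 1 whenever n \ne m.
```

The point of the extra coordinate $\lambda$ is visible in the last line: sequences in $G$ that approach the boundary point 0 are no longer Cauchy, so they do not obstruct completeness.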

The key to the further discussion is the following lemma, which is here just
a preliminary to the big theorem that follows it, but nevertheless is significant in
its own right. In it we encounter our first extensive class of Radon measures.
26.2 Lemma. Every finite Borel measure $\mu$ on a Polish space $E$ is regular.
Proof. We consider the system $\mathcal{D}$ of all $B \in \mathcal{B}(E)$ which satisfy both
(26.1)    $\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\}$
and
(26.2)    $\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}$.
The goal of course is to show that $\mathcal{D} = \mathcal{B}(E)$. We block off the work into five sections. Let $d$ be a complete metric defining the topology of $E$.
1. $E \in \mathcal{D}$: Only (26.1) needs proof when $B = E$. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence which is dense in $E$, and for $x \in E$, real $r > 0$ let $K_r(x)$ denote the open ball of center $x$ and $d$-radius $r$. For every $r$ then $E = \bigcup_{n \in \mathbb{N}} K_r(x_n)$, because in every ball $K_r(x)$ lies some $x_n$, so that $x \in K_r(x_n)$. Since $\mu$ is continuous from below
$\mu(E) = \lim_{k \to \infty} \mu\bigl(\bigcup_{j=1}^{k} K_r(x_j)\bigr)$.
Therefore, for each $\varepsilon > 0$ and $n \in \mathbb{N}$ there exists $k_n \in \mathbb{N}$ such that
$\mu\bigl(\bigcup_{j=1}^{k_n} K_{1/n}(x_j)\bigr) \ge \mu(E) - \varepsilon 2^{-n}$.
Each set $B_n := \bigcup_{j=1}^{k_n} \overline{K_{1/n}(x_j)}$, hence also their intersection $K := \bigcap_{n \in \mathbb{N}} B_n$, is closed, and we have
$\mu(E) - \mu(K) = \mu(E \setminus K) = \mu\bigl(\bigcup_{n \in \mathbb{N}} (E \setminus B_n)\bigr) \le \sum_{n=1}^{\infty} \mu(E \setminus B_n) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon$.
This will prove (26.1) if we can confirm that the closed set $K$ is actually compact. For every $n \in \mathbb{N}$
$K \subset B_n = \bigcup_{j=1}^{k_n} \overline{K_{1/n}(x_j)}$
and each set in this union has diameter no greater than $2/n$. This shows that $K$ is pre-compact (= totally bounded) and in a complete metric space that is equivalent to compactness, by very easy arguments (cf. WILLARD [1970], Theorem 39.9 or KELLEY [1955], p. 198).

2. Every closed set $C$ lies in $\mathcal{D}$: Let $\varepsilon > 0$ be given. We already know that there is a compact set $K$ with
$\mu(E) - \mu(K) < \varepsilon$.
According to 3.5 however
$\mu(C) - \mu(C \cap K) = \mu(C \cup K) - \mu(K) \le \mu(E) - \mu(K) < \varepsilon$
and this proves (26.1) for $B := C$, because $C \cap K$ is compact. As a closed subset of a metric space, $C$ is a $G_\delta$-set, that is, there are open sets $G_n \downarrow C$. To see this we may assume $C \ne \emptyset$, so that $G := E \setminus C$ is an open proper subset of $E$. Consequently, $x \mapsto d(x, C)$ is a continuous mapping whose zero-set is $C$, as was shown in treating Example 4. The sets $G_n := \{x \in E : d(x, C) < 1/n\}$ are therefore open and decrease to $C$. From the finiteness of $\mu$ and 3.2(c) we then have that $\mu(G_n) \downarrow \mu(C)$, showing that (26.2) is also satisfied by $B := C$.
3. Whenever $B$ lies in $\mathcal{D}$ so does $\complement B$: First note that for every compact $K \subset B$
$\mu(\complement K) - \mu(\complement B) = \mu(B) - \mu(K)$,
and so $\complement B$ satisfies (26.2) whenever $B$ satisfies (26.1). Moreover, if $G$ is an open superset of $B$, then $\complement G$ is a closed subset of $\complement B$ with
$\mu(\complement B) - \mu(\complement G) = \mu(G) - \mu(B)$,
showing, at least, that $\complement B$ satisfies (26.1) weakened by replacing "compact" there by "closed". But then application of step 2 to these closed sets gives us the full (26.1) for $\complement B$.

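4. Whenever pairwise disjoint sets $D_n$ lie in $\mathcal{D}$ ($n \in \mathbb{N}$), their union $D$ also lies in $\mathcal{D}$: First of all
$\mu(D) = \sum_{n=1}^{\infty} \mu(D_n)$.
Letting $\varepsilon > 0$ be given, we therefore have an $n_\varepsilon \in \mathbb{N}$ such that
(26.3)    $\mu(D) - \sum_{n=1}^{n_\varepsilon} \mu(D_n) < \varepsilon/2$.
Every $D_j$ contains a compact $K_j$ such that
$\mu(D_j) - \mu(K_j) < \dfrac{\varepsilon}{2 n_\varepsilon}$    ($j = 1, \ldots, n_\varepsilon$)
since each $D_j \in \mathcal{D}$. Then $K := K_1 \cup \ldots \cup K_{n_\varepsilon}$ is a compact subset of $D_1 \cup \ldots \cup D_{n_\varepsilon} \subset D$ which satisfies
$\mu(D_1 \cup \ldots \cup D_{n_\varepsilon}) - \mu(K) \le \mu\bigl(\bigcup_{j=1}^{n_\varepsilon} (D_j \setminus K_j)\bigr) \le \sum_{j=1}^{n_\varepsilon} \mu(D_j \setminus K_j) < \varepsilon/2$
from which, in view of (26.3),
$\mu(D) - \mu(K) < \varepsilon$.
Again, $D_n \in \mathcal{D}$ means there exists open $U_n \supset D_n$ such that
$\mu(U_n \setminus D_n) < \varepsilon/2^n$ for each $n \in \mathbb{N}$.
Then the open set $U := \bigcup_{n \in \mathbb{N}} U_n$ contains $D$ and satisfies
$\mu(U) - \mu(D) \le \mu\bigl(\bigcup_{n \in \mathbb{N}} (U_n \setminus D_n)\bigr) \le \sum_{n=1}^{\infty} \mu(U_n \setminus D_n) < \varepsilon$.
In summary, we have shown that (26.1) and (26.2) hold for $B := D = \bigcup D_n$.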
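5. The result of the first four steps is that $\mathcal{D}$ is a Dynkin system which contains the system $\mathcal{F}$ of all closed sets. The claim, namely that $\mathcal{D} = \mathcal{B}(E)$, now follows in the familiar way: Because $\mathcal{F}$ is $\cap$-stable, $\delta(\mathcal{F}) = \sigma(\mathcal{F}) = \mathcal{B}(E)$. From $\mathcal{F} \subset \mathcal{D} \subset \mathcal{B}(E)$ follows $\mathcal{B}(E) = \delta(\mathcal{F}) \subset \mathcal{D} \subset \mathcal{B}(E)$, and thus the equality sought. □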
We come now to the principal result of this section. It generalizes the foregoing
lemma.

26.3 Theorem. On a Polish space $E$ every locally finite Borel measure $\mu$ is a $\sigma$-finite Radon measure.

Proof. The hypothesis is that every point $x \in E$ has an open neighborhood $U_x$ of finite $\mu$-measure. The family $(U_x)_{x \in E}$ is an open cover of $E$. Because the topology of $E$ has a countable basis, a theorem of E. LINDELÖF (1870-1946) insures that this cover contains a countable subcover. That is, there is a sequence $(x_n)_{n \in \mathbb{N}}$ in $E$ such that the sequence $(U_{x_n})_{n \in \mathbb{N}}$ already covers $E$. [It is easy enough to prove Lindelöf's result right here: Let $\mathcal{U}$ be any open cover of $E$, $\mathcal{A}$ a countable basis for the topology of $E$, and define $\mathcal{A}'$ to be the system of all $A \in \mathcal{A}$ such that $A \subset U$ for some $U \in \mathcal{U}$ and let $U(A)$ be one such member of $\mathcal{U}$. The subset $\mathcal{A}'$ of $\mathcal{A}$, and therewith the system of all these $U(A)$, is countable. This system covers $E$. For if $x \in E$, then there is some $U \in \mathcal{U}$ that contains $x$, and since $\mathcal{A}$ is a basis there is some $A \in \mathcal{A}$ such that $x \in A \subset U$. Thus $A \in \mathcal{A}'$ and $x \in A \subset U(A)$.]
The system of sets $G_n := U_{x_1} \cup \ldots \cup U_{x_n}$, $n \in \mathbb{N}$, satisfies
(26.4)    $\mu(G_n) < +\infty$ for every $n \in \mathbb{N}$, and $G_n \uparrow E$.
Via
$\mu_n(A) := \mu(A \cap G_n)$,    $A \in \mathcal{B}(E)$,
a finite Borel measure $\mu_n$ is defined on $E$ for every $n \in \mathbb{N}$. Each such measure is inner regular by the preceding lemma. It follows that for each $A \in \mathcal{B}(E)$
$\mu(A) = \sup_{n \in \mathbb{N}} \mu(A \cap G_n) = \sup_{n \in \mathbb{N}} \mu_n(A) = \sup_{n \in \mathbb{N}} \sup_{K \in \mathcal{K},\, K \subset A} \mu_n(K)$,
where $\mathcal{K}$ denotes the system of compact subsets of $E$. After commuting the two suprema this reads
$\mu(A) = \sup_{K \in \mathcal{K},\, K \subset A} \sup_{n \in \mathbb{N}} \mu_n(K) = \sup_{K \in \mathcal{K},\, K \subset A} \mu(K)$,
proving the inner regularity of $\mu$. The $\sigma$-finiteness of $\mu$ is affirmed by (26.4), so the proof is complete.
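The interchange of the two suprema at the end of the proof of Theorem 26.3 rests on two elementary facts, spelled out below as a sketch: a double supremum may be taken in either order, and $\mu_n(K) = \mu(K \cap G_n)$ increases to $\mu(K)$ because $G_n \uparrow E$. Here $\mathcal{K}$ denotes the system of compact subsets of $E$.

```latex
\begin{align*}
\sup_{n \in \mathbb{N}} \; \sup_{\substack{K \in \mathcal{K} \\ K \subset A}} \mu_n(K)
  &= \sup\{\mu_n(K) : n \in \mathbb{N},\ K \in \mathcal{K},\ K \subset A\}
   = \sup_{\substack{K \in \mathcal{K} \\ K \subset A}} \; \sup_{n \in \mathbb{N}} \mu_n(K),\\
\sup_{n \in \mathbb{N}} \mu_n(K)
  &= \lim_{n \to \infty} \mu(K \cap G_n) = \mu(K)
  \qquad\text{(continuity from below, } K \cap G_n \uparrow K\text{)}.
\end{align*}
```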

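The question now suggests itself whether - in analogy with 26.2 - the outer regularity of $\mu$ can be proved. This is in fact the case.
26.4 Corollary. Every Radon measure on a Polish space is outer regular.

Proof. We have to show that every $B \in \mathcal{B}(E)$ satisfies (25.5). So let $B \in \mathcal{B}(E)$ and $\varepsilon > 0$ be given. Consider the open sets $G_n$ and the finite measures $\mu_n$ created in the preceding proof. Lemma 26.2 furnishes open sets $U_n \supset B$ such that
(26.5)    $\mu((U_n \setminus B) \cap G_n) = \mu_n(U_n \setminus B) < \varepsilon/2^n$ for each $n \in \mathbb{N}$.
Let $U := \bigcup_{n \in \mathbb{N}} (U_n \cap G_n)$, an open set. Since
$B = B \cap E = B \cap \bigcup_{n \in \mathbb{N}} G_n = \bigcup_{n \in \mathbb{N}} (B \cap G_n)$,
it follows from $B \subset U_n$ for every $n$, that $B \subset U$. Moreover, this representation of $B$ shows that
$U \setminus B = \bigcup_{n \in \mathbb{N}} (U_n \cap G_n) \setminus \bigcup_{n \in \mathbb{N}} (B \cap G_n) \subset \bigcup_{n \in \mathbb{N}} \bigl((U_n \cap G_n) \setminus (B \cap G_n)\bigr) = \bigcup_{n \in \mathbb{N}} \bigl((U_n \setminus B) \cap G_n\bigr)$
and consequently
$\mu(U \setminus B) \le \sum_{n=1}^{\infty} \mu((U_n \setminus B) \cap G_n) \le \sum_{n=1}^{\infty} \varepsilon/2^n = \varepsilon$
by (26.5). It follows finally that
$\mu(U) = \mu(B) + \mu(U \setminus B) \le \mu(B) + \varepsilon$,
which confirms (25.5).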

The regularity conditions (25.4), (25.5) make sense for outer measures $\mu^*$ and together with one other minimal demand on $\mu^*$ they assure that all Borel sets are $\mu^*$-measurable. In fact, these conditions on an outer measure come up naturally in the course of proving the famous Riesz representation theorem in § 29; cf. also 28.3.

26.5 Lemma. Let $E$ be a Hausdorff space and $\mu^*$ an outer measure on $E$ with the following three properties:
(i) for every set $A \subset E$
$\mu^*(A) = \inf\{\mu^*(U) : A \subset U \text{ open}\}$;
(ii) for every open set $U \subset E$
$\mu^*(U) = \sup\{\mu^*(K) : K \text{ compact} \subset U\}$;
(iii) for any two disjoint compact sets $K_1, K_2 \subset E$
$\mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.
Then the restriction of $\mu^*$ to $\mathcal{B}(E)$ is a measure.

Proof. We consider the $\sigma$-algebra $\mathcal{A}^*$ of all $\mu^*$-measurable sets, that is, according to (5.6) the set of all $A \in \mathcal{P}(E)$ which satisfy
(26.6)    $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathcal{P}(E)$.
First note that it suffices that this hold for all open sets $Q$ in order that it hold for all $Q$ whatsoever. In other words, what we need to check for an $A$ to be in $\mathcal{A}^*$ is that
(26.6')    $\mu^*(U) \ge \mu^*(U \cap A) + \mu^*(U \setminus A)$ for all $U \in \mathcal{O}$.
Indeed from (26.6') it follows for any $Q \subset E$ that
$\mu^*(U) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$
whenever $U$ is an open set containing $Q$; then (26.6) itself follows by taking the infimum over such $U$ and invoking (i). So now let $A = G$ be an open set; we will use criterion (26.6') to show that $G$ lies in $\mathcal{A}^*$. To this end consider any open $U \subset E$; further, consider any compact $K_1 \subset U \cap G$ and any compact $K_2 \subset U \setminus K_1$. Since then $K_1 \cap K_2 = \emptyset$ and $K_1 \cup K_2 \subset U$, it follows from (iii) that
$\mu^*(U) \ge \mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.
The set $U \setminus K_1$ is open, so if we take the supremum over all such $K_2$ in the preceding inequality and appeal to (ii), we get
$\mu^*(U) \ge \mu^*(K_1) + \mu^*(U \setminus K_1) \ge \mu^*(K_1) + \mu^*(U \setminus G)$,
the last inequality because $U \setminus K_1 \supset U \setminus G$. This holds for all compact $K_1 \subset U \cap G$, and so after a second appeal to (ii) it yields
$\mu^*(U) \ge \mu^*(U \cap G) + \mu^*(U \setminus G)$,
holding for all $U \in \mathcal{O}$. That is, (26.6') holds for $A = G$, and consequently $G \in \mathcal{A}^*$.
This all proves that $\mathcal{O} \subset \mathcal{A}^*$. But then $\mathcal{B}(E) = \sigma(\mathcal{O}) \subset \mathcal{A}^*$ because the latter is a $\sigma$-algebra, by Theorem 5.3. That theorem further affirms that the restriction of $\mu^*$ to $\mathcal{A}^*$ is a measure.

The foregoing Theorem 26.3 and its corollary show in particular that the L-B measure $\lambda^d$ is a regular Borel measure on $\mathbb{R}^d$ in each dimension $d = 1, 2, \ldots$. In fact every Borel measure on $\mathbb{R}^d$ is regular (cf. also Theorem 29.12). Following STROMBERG [1972] we derive from the regularity of $\lambda^d$ a purely topological result of H. STEINHAUS (1887-1972). It shows, incidentally, that every set of positive L-B measure has the cardinality of $\mathbb{R}$.

26.6 Theorem (of Steinhaus). Let $A \in \mathcal{B}^d$ be a Borel set in $\mathbb{R}^d$ of positive $d$-dimensional Lebesgue measure. Then 0 is an interior point of the set $A - A$ of differences of elements of $A$.

Proof. The inner regularity of $\lambda^d$ means that $A$ contains a compact subset $K$ with $\lambda^d(K)$ positive. It suffices to prove the claim with $K$ in place of $A$. Outer regularity furnishes an open set $U \supset K$ with $\lambda^d(U) < 2\lambda^d(K)$. There is an open ball $V$ centered at 0 of positive radius such that the sum set satisfies $K + V \subset U$. One only has to choose the radius less than the (positive) distance between the compact set $K$ and the closed set $\complement U$ from which it is disjoint. We will show that $V \subset K - K$, which makes 0 an interior point of this difference set. Consider any $v \in V$. The translated set $v + K$ cannot be disjoint from $K$, for otherwise from $K \cup (v + K) \subset K + V \subset U$ and translation-invariance of $\lambda^d$ would follow that
$2\lambda^d(K) = \lambda^d(K) + \lambda^d(v + K) = \lambda^d(K \cup (v + K)) \le \lambda^d(U)$,
contrary to the choice of $U$. But $K \cap (v + K) \ne \emptyset$ means that for some $x, y \in K$, $x = v + y$; which says that the given point $v = x - y$ lies in $K - K$. □
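Three standard examples in dimension $d = 1$ may help to gauge Theorem 26.6; they are added here for illustration and are not part of the text. The first shows the conclusion in the simplest case, the second shows that positivity of the measure cannot simply be dropped, and the third (with $C$ the Cantor ternary set, a classical fact quoted without proof) shows that positive measure is not necessary for the conclusion.

```latex
\begin{align*}
A = [0,1]:\quad & \lambda^1(A) = 1 > 0, & A - A &= [-1,1] \ni 0 \text{ as an interior point};\\
A = \mathbb{Q}:\quad & \lambda^1(A) = 0, & A - A &= \mathbb{Q} \text{ has empty interior};\\
A = C:\quad & \lambda^1(A) = 0, & A - A &= [-1,1].
\end{align*}
```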
In closing we turn to a remarkable consequence of Theorem 26.3 and its Corollary 26.4. It concerns the analogy, pointed out in § 7 as measurable mappings were being introduced, between the notions of measurability and continuity. Initially this is merely an analogy. Namely, if $f : E \to E'$ is a mapping of one topological space into another, then $f$ is Borel measurable (i.e., $\mathcal{B}(E)$-$\mathcal{B}(E')$-measurable) just if the pre-image $f^{-1}(G')$ of every open set $G' \subset E'$ is a Borel set in $E$. This follows from Theorem 7.2 and the fact that the Borel $\sigma$-algebra $\mathcal{B}(E')$ is generated by the open subsets of $E'$. By contrast, $f$ is continuous just if $f^{-1}(G')$ is open in $E$ for every open set $G' \subset E'$. What is quite remarkable is that for Polish spaces $E$ a much closer connection between those two concepts exists. This is brought out by the following theorem, discovered in its definitive form by N. LUSIN (1883-1950).

26.7 Theorem (of Lusin). Let $\mu$ be a locally finite Borel measure, thus a Radon measure, on a Polish space $E$, and $E'$ be a topological space with a countable basis. Then for every mapping $f : E \to E'$ the following are equivalent:
(a) $f$ coincides $\mu$-almost everywhere with a Borel measurable mapping of $E$ into $E'$.
(b) There is a decomposition of $E$ into a $\mu$-nullset $N \in \mathcal{B}(E)$ and a sequence $(K_n)_{n \in \mathbb{N}}$ of compact sets, such that the restriction of $f$ to each $K_n$ is continuous.

If the measure $\mu$ is finite, (a) and (b) are further equivalent to:
(c) For every $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and the restriction of $f$ to $K_\varepsilon$ is continuous.
Proof. Let us first suppose that $\mu$ is finite. Let $\mathcal{G}'$ be a countable base for the topology of $E'$ and $(G'_n)_{n \in \mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathcal{G}'$ is a generator of the Borel $\sigma$-algebra $\mathcal{B}(E')$ because every open subset of $E'$ is a (countable) union of sets from $\mathcal{G}'$.

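(a)⇒(c): By hypothesis there is a Borel measurable mapping $g : E \to E'$ and a $\mu$-nullset $N \in \mathcal{B}(E)$ with
(26.7)    $f(x) = g(x)$ for all $x \in \complement N$.
For every set $G'_n$, $g^{-1}(G'_n) \in \mathcal{B}(E)$. Because every Radon measure on $E$ is regular, given $\varepsilon > 0$, there exist compact sets $K_n$ and open sets $U_n$ such that
(26.8)    $K_n \subset g^{-1}(G'_n) \subset U_n$ and $\mu(U_n \setminus K_n) < 2^{-n}\varepsilon$ for each $n \in \mathbb{N}$.
The set $A := \bigcup_{n \in \mathbb{N}} (U_n \setminus K_n)$ is open, being a union of open sets. For its measure we have the obvious inequality
$\mu(A) \le \sum_{n=1}^{\infty} \mu(U_n \setminus K_n) < \varepsilon$.
Using once more the (inner) regularity of $\mu$, we find a compact $K \subset \complement(A \cup N) = \complement A \cap \complement N$ such that
$\mu(\complement A \cap \complement N \cap \complement K) < \varepsilon - \mu(A)$,
thus (since $A \cup N \subset \complement K$ and $A \cup N \cup (\complement A \cap \complement N) = E$) such that
$\mu(\complement K) = \mu(A \cup N \cup [\complement A \cap \complement N \cap \complement K]) < \mu(A) + \mu(N) + \varepsilon - \mu(A) = \varepsilon$.
This set $K$ does what is wanted in (c), because by (26.7) $f$ and $g$ coincide in $K$ and because the restriction $g_0$ of $g$ to $\complement A$ is continuous, as we now confirm. For each set $G'_n$,
$g_0^{-1}(G'_n) = g^{-1}(G'_n) \cap \complement A$;
from (26.8) and the fact $U_n \setminus K_n \subset A$ follows therefore
$U_n \cap \complement A = K_n \cap \complement A \subset g^{-1}(G'_n) \cap \complement A \subset U_n \cap \complement A$,
which means that
$g_0^{-1}(G'_n) = U_n \cap \complement A = K_n \cap \complement A$,
showing that the $g_0$-pre-image of $G'_n$ is open (as well as closed) in $\complement A$. Since $(G'_n)_{n \in \mathbb{N}}$ is a base for the topology of $E'$, this is enough to guarantee the continuity of $g_0 = g \,|\, \complement A$.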
(c)⇒(b): It suffices to find pairwise disjoint compact subsets $K_n$ of $E$ such that $f \,|\, K_n$ is continuous and
$\mu\bigl(\complement \bigcup_{j=1}^{n} K_j\bigr) < \dfrac{1}{n}$
for each $n \in \mathbb{N}$. For then
$N := \complement \bigcup_{n \in \mathbb{N}} K_n = \bigcap_{n \in \mathbb{N}} \complement K_n$
is a Borel set disjoint from each $K_n$ and satisfying $\mu(N) < 1/n$ for every $n \in \mathbb{N}$, i.e., $\mu(N) = 0$. The sequence $(K_n)$ is gotten inductively from (c) as follows: To start off, there is a compact $K_1 \subset E$ such that $\mu(\complement K_1) < 1$ and $f \,|\, K_1$ is continuous. If $K_1, \ldots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$ from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that
$\mu(\complement K') < (2n + 2)^{-1}$
and $f \,|\, K'$ is continuous. With $L := K_1 \cup \ldots \cup K_n$ the inner regularity of $\mu$ supplies a compact $K_{n+1} \subset K' \setminus L$ such that
$\mu(K' \setminus L) - \mu(K_{n+1}) = \mu(K' \cap \complement L \cap \complement K_{n+1}) < (2n + 2)^{-1}$.
Because
$\mu(\complement(L \cup K_{n+1})) = \mu(\complement K' \cap \complement L \cap \complement K_{n+1}) + \mu(K' \cap \complement L \cap \complement K_{n+1}) \le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) < (n + 1)^{-1}$,
with this set $K_{n+1}$ the inductive construction is complete.
(b)⇒(a): If $E = N \cup K_1 \cup K_2 \cup \ldots$ is the given decomposition, one defines a mapping $g : E \to E'$ as follows. In case $N = \emptyset$, let $g := f$. In case $N \ne \emptyset$, choose $y_0 \in f(N)$ arbitrarily and set
$g(x) := f(x)$ for $x \in E \setminus N$,    $g(x) := y_0$ for $x \in N$.
What has to be shown is that $g$ is Borel measurable, which is done as follows: For every open $G' \subset E'$
$g^{-1}(G') = (g^{-1}(G') \cap N) \cup \bigcup_{n \in \mathbb{N}} (g^{-1}(G') \cap K_n) = N_0 \cup \bigcup_{n \in \mathbb{N}} g_n^{-1}(G')$
where $N_0 := g^{-1}(G') \cap N$ and $g_n := g \,|\, K_n$. Now $N_0$ is either $N$ or $\emptyset$, according as $y_0 \in G'$ or $y_0 \notin G'$. Moreover, $g_n$ coincides with the restriction of $f$ to $K_n$, so that by hypothesis $g_n^{-1}(G')$ is open in $K_n$, that is, of the form $K_n \cap U_n$ for some open subset $U_n$ of $E$. Therefore only Borel sets occur in the above decomposition of $g^{-1}(G')$ and we conclude that $g^{-1}(G')$ is a Borel set. This being true of every open $G' \subset E'$, the Borel measurability of $g$ follows from 7.2.
Now consider an arbitrary locally finite measure $\mu$ on $\mathcal{B}(E)$. According to 26.3, $\mu$ is $\sigma$-finite. Lemma 17.6 therefore furnishes a strictly positive $\mu$-integrable real function $h$ on $E$. The measure $\nu := h\mu$ is then a finite Borel measure on $E$ which has exactly the same nullsets as $\mu$. The proven equivalence of (a) and (b) for the measure $\nu$ therefore entails the validity of this equivalence for the measure $\mu$. Thus the whole theorem is proved.
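The function $h$ used in the last step is supplied by Lemma 17.6. One explicit choice, given here only as a sketch and not necessarily the construction used in that lemma, starts from the open sets $G_n \uparrow E$ with $\mu(G_n) < +\infty$ obtained in the proof of Theorem 26.3:

```latex
% A possible choice of h (an assumption for illustration, not quoted from Lemma 17.6).
\begin{align*}
h &:= \sum_{n=1}^{\infty} \frac{2^{-n}}{1 + \mu(G_n)}\, 1_{G_n},\\
h(x) &> 0 \quad\text{for every } x \in E, \text{ since } x \in G_n \text{ for some } n,\\
\nu(E) = \int h \, d\mu &= \sum_{n=1}^{\infty} \frac{2^{-n}\,\mu(G_n)}{1 + \mu(G_n)} \le 1,\\
\nu(N) = \int_N h \, d\mu = 0 &\iff \mu(N) = 0 \qquad (N \in \mathcal{B}(E)).
\end{align*}
% The last equivalence uses only that h is strictly positive everywhere.
```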

Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is strengthened to the $\mathcal{B}(E)$-$\mathcal{B}(E')$-measurability of $f$. It suffices to take for $E$ the compact set $[0,1] \times [0,1]$ and for $\mu$ the L-B measure $\lambda^2_E$. As was noted in the second part of Remark 4, § 8, $E$ contains a $\mu$-nullset $N$ which contains a non-Borel subset. If $M$ is such a set, its indicator function $f = 1_M$ is not Borel measurable, although $f$ is $\mu$-almost everywhere equal to the Borel measurable function $1_N$. On the other hand, if $f$ is $\mathcal{B}(E)$-$\mathcal{B}(E')$-measurable, there is a Polish topology $\tau$ on $E$, stronger than the original but generating exactly the same Borel sets, such that $f$ is $\tau$-continuous. See 3.2.6 of SRIVASTAVA [1998] for the proof, which is not difficult.

2. The Dirichlet jump function (cf. Remark 1 of § 16) is continuous at no point of its domain of definition $[0,1]$, yet it is Borel measurable. This shows that in assertion (c) of Lusin's theorem one cannot hope to be able to replace the continuity of the function $f \,|\, K_\varepsilon$ by the continuity of $f$ at each point of $K_\varepsilon$.
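For the Dirichlet jump function the compact sets of assertion (c) can be written down explicitly. The sketch below assumes that the function in question is the indicator $f = 1_{\mathbb{Q} \cap [0,1]}$ on $E = [0,1]$, that $\mu$ is the L-B measure on $[0,1]$, and that $(q_k)_{k \in \mathbb{N}}$ is some enumeration of $\mathbb{Q} \cap [0,1]$; these names are introduced here for illustration.

```latex
% Assumptions: f = 1_{\mathbb{Q} \cap [0,1]}, (q_k) an enumeration of \mathbb{Q} \cap [0,1].
\begin{align*}
K_\varepsilon &:= [0,1] \setminus \bigcup_{k=1}^{\infty}
   \left] q_k - \varepsilon 2^{-k-2},\; q_k + \varepsilon 2^{-k-2} \right[ ,\\
\mu([0,1] \setminus K_\varepsilon) &\le \sum_{k=1}^{\infty} 2 \cdot \varepsilon 2^{-k-2}
   = \frac{\varepsilon}{2} < \varepsilon,\\
f \,|\, K_\varepsilon &\equiv 0 \quad\text{(continuous), because }
   K_\varepsilon \cap \mathbb{Q} = \emptyset.
\end{align*}
% K_\varepsilon is closed and bounded, hence compact, yet f is discontinuous
% at every point of K_\varepsilon when regarded as a function on [0,1].
```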

Exercises.
1. Show that every inner regular finite Borel measure on a Hausdorff space is outer regular.

2. Show that in a Polish space $E$ the Dirac measures are the only non-zero Borel measures $\mu$ which take only the values 0 and 1. [Hint: Show that the system of all compact $K \subset E$ such that $\mu(K) = 1$ is $\cap$-stable and investigate the intersection of all its sets.]

3. Show that $\mathcal{B}(E \times E') = \mathcal{B}(E) \otimes \mathcal{B}(E')$ for any Polish spaces $E, E'$.
4. Consider $K$ compact $\subset U$ open $\subset \mathbb{R}^d$, and for each $n \in \mathbb{N}$ let $V_n$ denote the open ball of radius $1/n$ and center 0. Show that $K + V_n \subset U$ for some $n$. [Hint: If $(K + V_n) \cap \complement U \ne \emptyset$ for every $n \in \mathbb{N}$, find $x_n \in K$, $v_n \in V_n$, $z_n \in \complement U$ such that $x_n + v_n = z_n$ for every $n \in \mathbb{N}$. Some subsequence of $(x_n)$ converges to a point $x_0 \in K$ and because $\complement U$ is closed we even have $x_0 \in K \cap \complement U$, which contradicts the fact that $K \subset U$.]

5. Let $\mu$ be a locally finite Borel measure on a Polish space $E$ and $f : E \to E'$ a mapping into a topological space $E'$ with a countable base. Show that assertions (a) and (b) in Lusin's theorem are equivalent to (c'): For every $\varepsilon > 0$ and every compact $K \subset E$ there is a further compact $K_\varepsilon \subset K$ such that $\mu(K \setminus K_\varepsilon) < \varepsilon$ and $f \,|\, K_\varepsilon$ is continuous.

27. Properties of locally compact spaces


A topological space is called locally compact if it is Hausdorff and if each of its
points has at least one compact neighborhood. Examples of such spaces are the
euclidean space Rd, every manifold (i.e., every locally euclidean Hausdorff space),
every discrete space, and every compact space.
When an arbitrary point is removed from a compact space the remainder is
a locally compact space. Actually every locally compact space is of this form. For
if $\mathcal{O}$ is the system of all open subsets of the locally compact space $E$ and $\omega_0$ is any (so-called ideal) point not in $E$, then a topology can be defined on $E' := E \cup \{\omega_0\}$ as follows: The system $\mathcal{O}'$ of open sets in $E'$ shall consist of $\mathcal{O}$ together with the sets $E' \setminus K$ for all the compact subsets $K$ of $E$. This defines a compact topology on $E'$, $E$ is an open subset of $E'$ and the topology that $E$ inherits from $\mathcal{O}'$ is its original topology. $E$ was compact to start with if and only if $\omega_0$ is an isolated point in $E'$. If $E$ is not compact, then it is dense in $E'$. These claims are easily confirmed, or the reader can consult KELLEY [1955], p. 150, or WILLARD [1970], 19.2. The space $E'$ is called, after its creator P.S. ALEXANDROFF (1896-1982), the (Alexandroff) one-point compactification of $E$ and $\omega_0$ its infinitely remote point.
We will pursue the further theory of locally compact spaces via this compactification. First we study some distinguished continuous functions in this environment.
For an arbitrary topological space E we denote by
C(E) and

Ct(E)

the vector space of all, respectively all bounded, continuous real functions on E.

27.1 Definition. Let f : E -> JR be a real function on a topological space E. The


set
(27.1)
supp(f) := If 34 0}
is called the support of f.

The complement of supp(f) is thus the largest open set at every point of
which f takes the value zero. If E is locally compact. we will designate by
CA(E)

the set of all f E C(E) with compact support supp(f). A function f E C(E) lies
in CA(E) just if there is some compact subset of E in the complement of which f is
identically zero.
Clearly
(27.2)

C (E) C Cb(E) C C(E),

since an f E CA(E) is bounded on its compact support, hence throughout E.


$C_c(E)$ is a vector subspace of $C_b(E)$. More generally for any $n \in \mathbb{N}$, $\varphi \in C(\mathbb{R}^n)$
with $\varphi(0) = 0$ and $f_1, \ldots, f_n \in C_c(E)$, the composition $\varphi(f_1, \ldots, f_n)$ lies in $C_c(E)$,
and indeed its support is a subset of $\bigcup_{j=1}^{n} \operatorname{supp}(f_j)$. In particular, whenever
$u, v \in C_c(E)$ the functions $|u|$, $u \vee v$, $u \wedge v$, and therewith $u^+$ and $u^-$, all lie in $C_c(E)$.
The needed continuity of $\varphi(x, y) := x \vee y$ on $\mathbb{R}^2$ follows from the identity

    $x \vee y = \tfrac{1}{2}\,(x + y + |x - y|)$.
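The identity for $x \vee y$, together with its companion $x \wedge y = \tfrac{1}{2}(x + y - |x - y|)$, can be checked mechanically. The following is a minimal sketch in exact rational arithmetic; the helper names `sup2`, `inf2`, `pos` and `neg` are illustrative only and do not come from the text.

```python
from fractions import Fraction

def sup2(x, y):
    """x v y computed through the identity (x + y + |x - y|) / 2."""
    return (x + y + abs(x - y)) / 2

def inf2(x, y):
    """x ^ y computed through the companion identity (x + y - |x - y|) / 2."""
    return (x + y - abs(x - y)) / 2

def pos(x):
    """Positive part x^+ = x v 0."""
    return sup2(x, Fraction(0))

def neg(x):
    """Negative part x^- = (-x) v 0."""
    return sup2(-x, Fraction(0))

# Rational sample points, so that every comparison below is exact.
samples = [Fraction(p, 7) for p in range(-21, 22)]

identity_holds = all(
    sup2(x, y) == max(x, y) and inf2(x, y) == min(x, y)
    for x in samples for y in samples
)
decomposition_holds = all(
    pos(x) - neg(x) == x and pos(x) + neg(x) == abs(x) for x in samples
)
```

Since each of these operations is built from sums and absolute values, which are continuous and vanish at the origin, they preserve $C_c(E)$ as stated above.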
In the special case of a compact space E, all three function spaces in (27.2)
coincide.

A fundamental property of the space $C_c(E)$ is the following:

27.2 Theorem (on partitions of unity). Suppose that the compact subset $K$ of
the locally compact space $E$ is covered by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then
there are functions $f_1, \ldots, f_n \in C_c(E)$ with the following properties:

(27.3)    $f_j \ge 0$    for $j = 1, \ldots, n$;

(27.4)    $\operatorname{supp}(f_j) \subset U_j$    for $j = 1, \ldots, n$;

(27.5)    $\sum_{j=1}^{n} f_j(x) \le 1$    for all $x \in E$;

(27.6)    $\sum_{j=1}^{n} f_j(x) = 1$    for all $x \in K$.

Proof. We work in the one-point compactification $E' := E \cup \{\omega_0\}$ of $E$. The
given open sets together with $U_0 := E' \setminus K$ constitute an open cover of $E'$. Because
compact spaces are normal topological spaces (cf. KELLEY [1955], p. 141
or WILLARD [1970], Theorem 17.10), this covering can be "shrunk" to an open
covering $U'_0, \ldots, U'_n$ of $E'$ satisfying

    $\overline{U'_j} \subset U_j$    for each $j = 0, \ldots, n$,

where of course the bar denotes closure in $E'$. The theorem on partitions of unity in
normal spaces (KELLEY [1955], p. 171 or WILLARD [1970], 20 C) provides functions
$f'_0, \ldots, f'_n \in C(E')$ such that

(i)    $f'_j \ge 0$, $\operatorname{supp}(f'_j) \subset \overline{U'_j}$,    for $j = 0, \ldots, n$;

(ii)    $\sum_{j=0}^{n} f'_j(x) = 1$    for all $x \in E'$.

The restrictions $f_1, \ldots, f_n$ to $E$ of $f'_1, \ldots, f'_n$ lie in $C(E)$ and it will be easy to show
that they have all the properties wanted. From (i) and (ii) properties (27.3)-(27.5)
follow almost immediately. One only has to notice that for each $j = 1, \ldots, n$

    $\operatorname{supp}(f_j) = \operatorname{supp}(f'_j) \cap E \subset \overline{U'_j} \cap E = \overline{U'_j} \subset U_j$,

since $\overline{U'_j} \subset U_j \subset E$. In particular, $\overline{U'_j}$, being a closed subset of the compact space $E'$,
is a compact subset of $E$. From $\operatorname{supp}(f_j) \subset \overline{U'_j}$ therefore follows the compactness
of this support. Thus $f_1, \ldots, f_n$ all lie in $C_c(E)$. The remaining property (27.6)
likewise follows from (ii) because $\operatorname{supp}(f'_0) \subset U_0 = E' \setminus K$ entails that $f'_0(x) = 0$
for all $x \in K$. □
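To see the shape of such a partition of unity in the simplest setting, here is a hedged sketch on the real line. The choices $E = \mathbb{R}$, $K = [0, 1]$, $U_1 = (-1, 0.6)$, $U_2 = (0.4, 2)$ and the trapezoid helper `trap` are assumptions made for illustration only; the construction mirrors the role of the auxiliary function $f'_0$ in the proof, but it normalizes explicit bump functions instead of invoking normality.

```python
from fractions import Fraction as F

def trap(x, a, b, c, d):
    """Continuous trapezoid: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return F(0)
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return F(1)
    return (d - x) / (d - c)

# g1, g2 have compact support inside U1 = (-1, 0.6) and U2 = (0.4, 2).
def g1(x):
    return trap(x, F(-9, 10), F(-1, 2), F(45, 100), F(55, 100))

def g2(x):
    return trap(x, F(45, 100), F(55, 100), F(3, 2), F(19, 10))

# g0 plays the part of f'_0: it vanishes on K = [0, 1] and equals 1
# outside (-1/2, 3/2), so g0 + g1 + g2 is strictly positive everywhere.
def g0(x):
    return 1 - trap(x, F(-1, 2), F(0), F(1), F(3, 2))

def partition(x):
    """Return (f1(x), f2(x)) with f_j = g_j / (g0 + g1 + g2)."""
    total = g0(x) + g1(x) + g2(x)
    return g1(x) / total, g2(x) / total

# Exact rational grid on [-3, 3] for spot checks of (27.3)-(27.6).
grid = [F(k, 100) for k in range(-300, 301)]
sums_to_one_on_K = all(sum(partition(x)) == 1 for x in grid if 0 <= x <= 1)
```

Dividing by the total is what forces (27.5) and (27.6) at once: the quotient sum is at most 1 everywhere and equals 1 exactly where `g0` vanishes.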
Two consequences of the foregoing will turn out to be especially useful. The
first - known as Urysohn's lemma - often serves as the starting point for inductive
constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can
also be proven directly, as indicated in Exercise 1 below.

27.3 Corollary 1. In the locally compact space $E$, let $U$ be an open neighborhood of
the compact subset $K$. Then $C_c(E)$ contains a function $f$ which satisfies

(27.7)    $0 \le f \le 1$, $f(K) = \{1\}$, and $\operatorname{supp}(f) \subset U$.

In particular, $\operatorname{supp}(f)$ is a compact neighborhood of $K$.

Proof. We have only to apply 27.2 for $n = 1$. Since $K \subset \{f_1 > 0\} \subset \operatorname{supp}(f_1)$, the
fact that $\{f_1 > 0\}$ is open means that $\operatorname{supp}(f_1)$ is indeed a neighborhood of $K$. □

27.4 Corollary 2. In the locally compact space $E$ the compact subset $K$ is covered
by the $n$ open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then $K$ can be decomposed as
$K = K_1 \cup \ldots \cup K_n$ with $K_j$ a compact subset of $U_j$ for each $j = 1, \ldots, n$.

Proof. Let $f_1, \ldots, f_n \in C_c(E)$ be as provided by 27.2. The compact sets

    $K_j := K \cap \operatorname{supp}(f_j)$,    $j = 1, \ldots, n$

do what is wanted; for if $x \in K$, then $1 = f_1(x) + \ldots + f_n(x)$ means that $f_j(x) \neq 0$
for some $j$, and therefore $x \in K_j$. □

For a locally compact space $E$ there is another function space besides $C_c(E)$
that is of importance. To define it we assign to every bounded real function $f$ on
an arbitrary space $E$ its supremum norm, also called its uniform norm, via

    $\|f\| := \sup_{x \in E} |f(x)|$.

The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$ - more generally even the vector
space of all bounded real functions on $E$ - into a metric space. One speaks of the
metric of uniform convergence (on $E$). That a sequence $(f_n)$ of bounded real functions
on $E$ converges uniformly on $E$ to a bounded function $f$ just means that

    $\lim_{n \to \infty} \|f_n - f\| = 0$.

27.5 Definition. A continuous real function $f$ on a locally compact space $E$ is
said to vanish at infinity if it lies in the closure $C_0(E)$ of $C_c(E)$ in $C_b(E)$ with
respect to the metric of uniform convergence. Denoting closure in this metric by
a bar, we thus have

    $C_0(E) := \overline{C_c(E)} \subset C_b(E)$.

The terminology "vanishing at infinity" is both clarified and justified by

27.6 Theorem. For a real function $f$ on a locally compact space $E$ the following
statements are equivalent:

(a) $f \in C_0(E)$;
(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;
(c) the function

    $f'(x) := f(x)$ for $x \in E$,    $f'(x) := 0$ for $x = \omega_0$,

is continuous on the one-point compactification $E'$ of $E$.

Proof. (a)$\Rightarrow$(b): Given $\varepsilon > 0$, there is by definition of $f \in C_0(E)$ a $g \in C_c(E)$ with
$\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \|f - g\|$, so
we see that

    $\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \operatorname{supp}(g)$.

This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity
of $f$, it is also closed. Hence it is compact.
(b)$\Rightarrow$(c): Since the subspace topology of $E$ in $E'$ is its original topology and $E$ is
an open subset of $E'$, continuity of $f'$ at each point of $E$ is assured by $f \in C(E)$. As
to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$
for all $x$ in the set $E' \setminus \{|f| \ge \varepsilon\}$, which by definition of $\mathscr{O}'$ is a neighborhood
of $\omega_0$, since $\{|f| \ge \varepsilon\}$ is a compact subset of $E$.
(c)$\Rightarrow$(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean
that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$
for all $x \in E \setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$.
Then $fg \in C_c(E)$ and satisfies

    $|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) < \varepsilon$

for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that
$f \in \overline{C_c(E)} = C_0(E)$. □
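The cut-off argument in (c)$\Rightarrow$(a) can be watched at work numerically. The sketch below assumes $E = \mathbb{R}$ and the sample function $f(x) = 1/(1 + x^2)$, neither of which is taken from the text; `cutoff` is a hypothetical piecewise linear choice of $g$ that equals 1 on $K = [-R, R]$ and vanishes outside $(-R - 1, R + 1)$.

```python
def f(x):
    """Sample function vanishing at infinity (an assumption for illustration)."""
    return 1.0 / (1.0 + x * x)

def cutoff(x, R):
    """g with 0 <= g <= 1, g = 1 on [-R, R], support contained in [-R-1, R+1]."""
    return min(1.0, max(0.0, R + 1.0 - abs(x)))

def sup_error(R, half_width=50.0, step=0.01):
    """Largest value of |f*g - f| over a finite grid: a lower estimate of ||fg - f||."""
    n = int(half_width / step)
    return max(
        abs(f(k * step) * cutoff(k * step, R) - f(k * step))
        for k in range(-n, n + 1)
    )

# |f g - f| = |f| (1 - g) vanishes on [-R, R] and is at most f(R) outside it,
# so the uniform distance between f and its truncation is at most 1 / (1 + R^2).
errors = {R: sup_error(R) for R in (1.0, 5.0, 10.0, 20.0)}
```

The grid maximum only estimates the supremum from below; the bound $1/(1 + R^2)$ stated in the comment is the analytic fact that places $f$ in the closure of $C_c(\mathbb{R})$.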

Exercises.
1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for
the case $n = 2$: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open
neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]
2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact
space $E$. Describe the Borel sets in $E'$ by means of the Borel sets in $E$. In particular,
see how your description fits into the following general picture: For a measurable
space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$
in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.

28. Construction of Radon measures on locally compact spaces

In what follows $E$ will be a locally compact space. We consider a Borel measure $\mu$
(defined on $\mathscr{B}(E)$). Here the requirement $\mu(K) < +\infty$ for every compact set $K$
is the same as the local finiteness requirement, because every point of $E$ has
a compact neighborhood and the implication (25.7) holds in general. So in the
present context the concepts of Borel measure and locally finite measure on $\mathscr{B}(E)$
coincide. The Radon measures on $E$ are thus (cf. 25.3) those Borel measures which
are inner regular.
For a Borel measure $\mu$ every $u \in C_c(E)$ turns out to be $\mu$-integrable. For, being
continuous, $u$ is Borel measurable. Denoting by $K$ the compact support of $u$, we
have $|u| \le \|u\|\,1_K$. Since $\mu$ is a Borel measure, $1_K$ is $\mu$-integrable, and the
$\mu$-integrability of $u$ follows. Therefore corresponding to the Borel measure $\mu$ is a linear
form $I_\mu$ on $C_c(E)$ defined by

(28.1)    $I_\mu(u) := \int u \, d\mu$.

This is an isotone linear form in the sense of (12.3): From $u \le v$ follows $I_\mu(u) \le I_\mu(v)$.
Because of the linearity of $I_\mu$ this is equivalent to

    $0 \le u \in C_c(E) \;\Rightarrow\; I_\mu(u) \ge 0$,

which is why $I_\mu$ is usually called a positive linear form.


This brings us to a key question for our further work: Is every positive linear
form on $C_c(E)$ an $I_\mu$ for some Borel measure $\mu$ on $E$, or are there possibly positive
linear forms of a completely different kind? Even for compact intervals $J := [a, b]$
on the number line, answering this question is by no means a trivial task. In this
case however, as early as 1909 F. Riesz showed (cf. RIESZ [1911]) that besides the
linear forms $I_\mu$ arising from Borel measures $\mu$ on $J$, there are no other positive
linear forms on $C_c(J) = C(J)$. One of our goals is to show that every locally
compact space $E$ shares this property with $J$. The result in question will, in view
of this pioneering work, be called the Riesz representation theorem. En route to it
we will naturally be led to the construction of Radon measures on $E$.
Besides the locally compact space $E$, let now a positive linear form

    $I : C_c(E) \to \mathbb{R}$

be given. What follows will prepare the way for the proof of the Riesz representation theorem.
For every compact $K \subset E$ we set

(28.2)    $\mu_*(K) := \inf\{I(u) : 1_K \le u \in C_c(E)\}$.

Such functions $u$ exist thanks to Corollary 27.3. Consequently,

(28.3)    $0 \le \mu_*(K) < +\infty$.

Moreover, the mapping $K \mapsto \mu_*(K)$ is obviously isotone on the system $\mathscr{K}$ of all
compact sets. For an arbitrary $A \in \mathscr{P}(E)$ we set

(28.4)    $\mu_*(A) := \sup\{\mu_*(K) : K \text{ compact} \subset A\}$.

Because of the above noted isotoneity of $\mu_*$ on $\mathscr{K}$, this new definition is consistent
with (28.2). Finally, for $A \in \mathscr{P}(E)$ we define

(28.5)    $\mu^*(A) := \inf\{\mu_*(U) : A \subset U \text{ open}\}$.

Then $\mu_*$ and $\mu^*$ are isotone functions on $\mathscr{P}(E)$. Moreover

(28.6)    $\mu_*(A) \le \mu^*(A)$    for all $A \in \mathscr{P}(E)$,

as follows from the obvious fact that $\mu_*(A) \le \mu_*(U)$ for every open $U \supset A$; and

(28.7)    $\mu_*(U) = \mu^*(U)$    for all open $U \in \mathscr{P}(E)$,

which follows from (28.5) and the isotoneity of $\mu_*$. Somewhat more effort is required
to check that

(28.8)    $\mu_*(K) = \mu^*(K)$    for all $K \in \mathscr{K}$.

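For every $\varepsilon > 0$ definition (28.2) supplies a $u \in C_c(E)$ with $u \ge 1_K$ and

    $I(u) - \mu_*(K) \le \varepsilon$.

For $0 < \alpha < 1$, $U_\alpha := \{u > \alpha\}$ is an open superset of $K$ and

    $1_{U_\alpha} \le \tfrac{1}{\alpha}\, u$.

If therefore $L$ is a compact subset of $U_\alpha$, then $1_L \le \tfrac{1}{\alpha}\, u$ and so from (28.2)
$\mu_*(L) \le \tfrac{1}{\alpha}\, I(u)$. From definition (28.4) therefore

    $\mu_*(U_\alpha) \le \tfrac{1}{\alpha}\, I(u)$

and so, since $K \subset U_\alpha$,

    $0 \le \mu_*(U_\alpha) - \mu_*(K) \le \tfrac{1}{\alpha}\, I(u) - \mu_*(K) \le \bigl(\tfrac{1}{\alpha} - 1\bigr)\mu_*(K) + \tfrac{\varepsilon}{\alpha}$.

As $\alpha \uparrow 1$ this majorant converges to $\varepsilon$, which shows that

    $\inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K) + \varepsilon$

holds for every $\varepsilon > 0$; that is,

    $\mu^*(K) = \inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K)$.

This confirms (28.8), the reverse inequality being part of (28.6).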
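Definitions (28.2)-(28.5) can be traced through in a familiar special case. The following worked computation is only a sketch under assumptions not made in the text: $E = \mathbb{R}$, $I(u) = \int u(x)\,dx$ (the Riemann integral on $C_c(\mathbb{R})$), and $K = [a, b]$; the trapezoid functions $u_\delta$ are introduced here purely for illustration.

```latex
% Assumed setting (illustration only): E = \mathbb{R}, I(u) = \int u(x)\,dx, K = [a,b].
\begin{align*}
u_\delta(x) &:= \max\bigl\{0,\; 1 - \delta^{-1}\operatorname{dist}(x,[a,b])\bigr\},
  \qquad \delta > 0, \quad\text{so } 1_K \le u_\delta \in C_c(\mathbb{R}),\\
I(u_\delta) &= (b-a) + \delta,\\
I(u) &\ge \int_a^b u(x)\,dx \ge b-a
  \qquad\text{whenever } 1_K \le u \in C_c(\mathbb{R}),\\
\mu_*(K) &= \inf\{I(u) : 1_K \le u \in C_c(\mathbb{R})\} = b-a
  && \text{by (28.2)},\\
\mu_*\bigl((a,b)\bigr) &= \sup\{\, d-c : a < c \le d < b \,\} = b-a
  && \text{by (28.4) and isotoneity},\\
\mu^*(K) &= \inf_{\eta > 0}\, \mu_*\bigl((a-\eta,\, b+\eta)\bigr)
  = \inf_{\eta > 0}\,(b-a+2\eta) = b-a
  && \text{by (28.5)}.
\end{align*}
```

In the fifth line every compact subset of $(a, b)$ lies in some $[c, d] \subset (a, b)$, which is why the supremum may be taken over intervals. The last line agrees with (28.8) in this example.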
Of critical importance is the following result:

28.1 Lemma. $\mu^*$ is an outer measure on $E$.

Proof. Obviously $\mu^*(\emptyset) = 0$, so what we have to prove is that

(28.9)    $\mu^*\bigl(\bigcup_{n \in \mathbb{N}} Q_n\bigr) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$

holds for every sequence $(Q_n)$ in $\mathscr{P}(E)$. We proceed in three steps.

First step: For any two compact sets $K_1$, $K_2$

    $\mu^*(K_1 \cup K_2) \le \mu^*(K_1) + \mu^*(K_2)$.

Consider any $u_j \in C_c(E)$ with $u_j \ge 1_{K_j}$ for $j = 1, 2$. Then $1_{K_1 \cup K_2} \le u_1 + u_2$,
so (28.2) says that

    $\mu_*(K_1 \cup K_2) \le I(u_1 + u_2) = I(u_1) + I(u_2)$.

The claimed inequality now follows from (28.2) and (28.8).

Second step: For any finitely many open sets $U_1, \ldots, U_n$

    $\mu^*(U_1 \cup \ldots \cup U_n) \le \mu^*(U_1) + \ldots + \mu^*(U_n)$.

It suffices to settle the case $n = 2$, as induction then takes care of the rest. If $K$ is
a compact subset of $U_1 \cup U_2$, then 27.4 provides compact $K_j \subset U_j$, $j = 1, 2$, such
that $K = K_1 \cup K_2$. Then by the result of our first step

    $\mu^*(K) \le \mu^*(K_1) + \mu^*(K_2) \le \mu^*(U_1) + \mu^*(U_2)$.

The claimed inequality (with $n = 2$) then follows from (28.8), (28.4) and (28.7).

Third step: Now we will prove (28.9). In doing so we may obviously assume that
$\mu^*(Q_n) < +\infty$ for every $n \in \mathbb{N}$. Given $\varepsilon > 0$, there then exist open $U_n \supset Q_n$ such
that

    $\mu^*(U_n) \le \mu^*(Q_n) + 2^{-n}\varepsilon$    for every $n \in \mathbb{N}$.

The open set $U := \bigcup_{n \in \mathbb{N}} U_n$ contains $Q := \bigcup_{n \in \mathbb{N}} Q_n$. If now $K$ is a compact subset
of $U$, then $K \subset U_1 \cup \ldots \cup U_n$ for sufficiently large $n$. From this it follows that

    $\mu_*(K) = \mu^*(K) \le \mu^*(U_1 \cup \ldots \cup U_n) \le \sum_{j=1}^{n} \mu^*(U_j) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,

where we used the second step. As this last inequality is satisfied by every compact
subset $K$ of $U$, definition (28.4) and equation (28.7) give

    $\mu_*(U) = \mu^*(U) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,

and since $Q \subset U$ we will then have as well

    $\mu^*(Q) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$.

Finally, $\varepsilon > 0$ being arbitrary here, (28.9) is proven. □


The next corollary sharpens the inequality proved in the first step above.

28.2 Corollary. For any two disjoint compact subsets $K_1$, $K_2$ of $E$

    $\mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.

Proof. Consider any $u \in C_c(E)$ satisfying

    $u \ge 1_{K_1 \cup K_2} = 1_{K_1} + 1_{K_2}$.

According to 27.3 there is a $v \in C_c(E)$ with $0 \le v \le 1$, $v(K_1) = \{1\}$, and
$\operatorname{supp}(v) \subset \complement K_2$, hence with $v(K_2) = \{0\}$. The functions $vu$ and $(1 - v)u$ lie
in $C_c(E)$ and satisfy

    $vu \ge 1_{K_1}$    and    $(1 - v)u \ge 1_{K_2}$.

Therefore

    $\mu_*(K_1) + \mu_*(K_2) \le I(vu) + I((1 - v)u) = I(u)$,

which, because of (28.2), has the consequence that

    $\mu_*(K_1) + \mu_*(K_2) \le \mu_*(K_1 \cup K_2)$.

In view of (28.8) this inequality is half of the equality being claimed. The other
half is simply the subadditivity of the outer measure $\mu^*$. □
The first important consequence of all this is:

28.3 Theorem. The restriction of $\mu^*$ to $\mathscr{B}(E)$ is a Borel measure.

The proof is immediate from Lemma 26.5 and the facts accumulated to this
point. Notice that (28.7) and (28.5) say that hypothesis (i) of 26.5 is fulfilled, while
(28.7), (28.8) and (28.4) insure that hypothesis (ii) of 26.5 is fulfilled.

The Borel measure $\mu^* \mid \mathscr{B}(E)$ has a series of further remarkable properties:

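28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies

    $\mu_*(A) = \mu^*(A)$.

Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that

    $\mu^*(U) - \mu^*(A) < \varepsilon/2$,

which, due to $\mu^*(A) < +\infty$ and $\mu^*$ being a measure on $\mathscr{B}(E)$, can be written as

    $\mu^*(U \setminus A) = \mu^*(U) - \mu^*(A) < \varepsilon/2$.

From (28.4) we get compact $L \subset U$ such that

    $\mu^*(U \setminus L) = \mu^*(U) - \mu^*(L) < \varepsilon/2$.

The set

    $Q := (U \setminus A) \cup (U \setminus L)$

then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that
$\mu^*(G) < \varepsilon$.

Now $K := L \setminus G$ is a (closed, hence) compact subset of $L$ with the properties

(28.10)    $K \subset A$ and $A \setminus K \subset G$.

In fact, on the one hand

    $K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A$,

since $L \subset U$, and on the other hand

    $A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G$,

since $U \setminus L \subset Q \subset G$. From (28.10) we get

    $\mu^*(A) - \mu^*(K) = \mu^*(A \setminus K) \le \mu^*(G) < \varepsilon$,

and so $\mu^*(A) < \mu^*(K) + \varepsilon \le \mu_*(A) + \varepsilon$. As $\varepsilon > 0$ was arbitrary, this says that
$\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof. □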
The finiteness hypothesis in the preceding theorem can be weakened. In doing so
we make use of the terminology introduced just before the proof of Theorem 13.6.

28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$
which has $\sigma$-finite $\mu^*$-measure.

Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite
$\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield

    $\mu^*(A_n) = \mu_*(A_n) \le \mu_*(A)$,

from which and the continuity of $\mu^*$ from below on $\mathscr{B}(E)$ follows

    $\mu^*(A) = \sup_n \mu^*(A_n) \le \mu_*(A)$.

Together with (28.6) this proves the claimed equality. □


Another central result, analogous to 28.3, emerges:

28.6 Theorem. The restriction of $\mu_*$ to $\mathscr{B}(E)$ is also a Borel measure.

Proof. Since all compact $K$ satisfy $\mu_*(K) = \mu^*(K) < +\infty$, all that has to be
proved is that $\mu_* \mid \mathscr{B}(E)$ is a measure, i.e., that $\mu_*$ is countably additive on $\mathscr{B}(E)$.
To that end, let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{B}(E)$, whose
union is $A$. For every compact $K \subset A$, $K = \bigcup_{n \in \mathbb{N}} (K \cap A_n)$, so from 28.3 and 28.4
we get

    $\mu_*(K) = \mu^*(K) = \sum_{n=1}^{\infty} \mu^*(K \cap A_n) = \sum_{n=1}^{\infty} \mu_*(K \cap A_n) \le \sum_{n=1}^{\infty} \mu_*(A_n)$.

Taking the supremum over such $K$ on the left, (28.4) gives

    $\mu_*(A) \le \sum_{n=1}^{\infty} \mu_*(A_n)$.

In proving the reverse inequality we may assume that $\mu_*(A) < +\infty$, and therefore
$\mu_*(A_n) < +\infty$ for every $n \in \mathbb{N}$. There is then, given $\varepsilon > 0$, a compact $K_n \subset A_n$
satisfying

    $\mu_*(A_n) - \mu_*(K_n) < 2^{-n}\varepsilon$    for each $n \in \mathbb{N}$.

Since the sets $K_j$ are pairwise disjoint,

    $\mu_*(A) \ge \mu_*\bigl(\bigcup_{j=1}^{n} K_j\bigr) = \mu^*\bigl(\bigcup_{j=1}^{n} K_j\bigr) = \sum_{j=1}^{n} \mu^*(K_j) = \sum_{j=1}^{n} \mu_*(K_j)$
    $\ge \sum_{j=1}^{n} \mu_*(A_j) - \sum_{j=1}^{n} 2^{-j}\varepsilon \ge \sum_{j=1}^{n} \mu_*(A_j) - \varepsilon$    for every $n \in \mathbb{N}$.

Letting $n \to \infty$ we infer that

    $\mu_*(A) \ge \sum_{j=1}^{\infty} \mu_*(A_j) - \varepsilon$,

holding for every $\varepsilon > 0$. That is, $\mu_*(A) \ge \sum_{n=1}^{\infty} \mu_*(A_n)$, the complementary inequality
we needed to finish the proof. □
We now set

(28.11)    $\mu_0 := \mu_* \mid \mathscr{B}(E)$    and    $\mu^0 := \mu^* \mid \mathscr{B}(E)$

and, inspired by COURRÈGE [1962], call these the essential measure determined
by $I$ and the principal measure determined by $I$, respectively. Each is a Borel
measure (28.3 and 28.6).

Obviously the essential measure $\mu_0$ is inner regular, hence is a Radon measure
on $E$. By contrast the principal measure $\mu^0$ is outer regular. It turns out that
$\mu_0$ is the more important of the two.
Thus to the given positive linear form $I$ on $C_c(E)$ we have associated two Borel
measures. The further relation of these measures to $I$ and the questions of whether
and when they coincide will be clarified in the next section. The closing lemma
of this section recasts definition (28.4), when $A$ is open, into an equivalent form. It
has a preparatory character.

28.7 Lemma. Every open set $U \subset E$ satisfies

(28.12)    $\mu_0(U) = \mu^0(U) = \sup\{I(u) : u \in C_c(E),\ \operatorname{supp}(u) \subset U,\ 0 \le u \le 1\}$.

Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and
consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with
$0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2)
$\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such $K$. It follows that
$\mu^0(U) = \mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^0(U)$ is derived as
follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set
$L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2)
of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_0(L) =
\mu^0(L) \le \mu^0(U)$. Taking the supremum over eligible $u$ gives finally the desired
complementary inequality $\gamma \le \mu^0(U)$. □

A sharpening of equality (28.12) will be presented in Exercise 2 of §29. The
special case $U = E$ of Lemma 28.7 furnishes the following useful description of the
total masses of $\mu_0$ and $\mu^0$:

(28.13)    $\|\mu_0\| = \|\mu^0\| = \sup\{I(u) : u \in C_c(E),\ 0 \le u \le 1\}$.


Exercises.

1. For a locally compact space $E$ and a measure $\mu$ defined on $\mathscr{B}(E)$, show that $\mu$
is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.
2. Let $\mu$ be a Radon measure on a locally compact space $E$ and $(G_i)_{i \in I}$ a family
of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such
that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i \in I} G_i$ satisfies

    $\mu(G) = \sup\{\mu(G_i) : i \in I\}$.

3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally
compact space $E$:

(a) There exists a largest open set $G$ with $\mu(G) = 0$. The set $\complement G$ is called the
support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.
(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of $x$ has
positive $\mu$-measure.
(c) For a non-negative $f \in C(E)$, $\int f \, d\mu = 0$ if and only if $f = 0$ throughout
$\operatorname{supp}(\mu)$.

Determine $\operatorname{supp}(\lambda^d)$ for L-B measure $\lambda^d$ on $\mathbb{R}^d$, and $\operatorname{supp}(\varepsilon_a)$ for every Dirac
measure $\varepsilon_a$ on $E$.
4. Let $\mu$ be a Borel measure on a locally compact space $E$. Show that every set $A$
from the $\sigma$-ring $\rho_\sigma(\mathscr{K})$ generated by the system $\mathscr{K}$ of compact subsets of $E$ is
a Borel set which satisfies $\mu_*(A) = \mu(A)$. Here a ring $\mathscr{R}$ in a set $\Omega$ is called a
$\sigma$-ring if the union of every sequence of sets in $\mathscr{R}$ is itself a set in $\mathscr{R}$. In complete
analogy with $\sigma$-algebras, every subset of $\mathscr{P}(\Omega)$ is contained in a smallest $\sigma$-ring.
Sometimes it is only the sets in $\rho_\sigma(\mathscr{K})$ which get called "Borel sets"; this is the
case, e.g., in the classic exposition of HALMOS [1974]. Why is it generally the case
that $\rho_\sigma(\mathscr{K}) \neq \mathscr{B}(E)$?

29. Riesz representation theorem

Again let $E$ be a locally compact space. Every Borel measure $\mu$ on $E$ defines
a positive linear form

    $I_\mu(u) := \int u \, d\mu$

on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear
form $I$ on $C_c(E)$ there is a Borel measure $\mu$ on $E$ such that $I = I_\mu$, that is, such
that

    $I(u) = \int u \, d\mu$    for all $u \in C_c(E)$?

Any such Borel measure $\mu$ will be called a representing measure for $I$. The answer,
leaked earlier, to this question reads:


29.1 Riesz representation theorem. If $E$ is a locally compact space, every
positive linear form $I$ on $C_c(E)$ has at least one representing measure. In fact, both
the essential measure $\mu_0$ determined by $I$ and the principal measure $\mu^0$ determined
by $I$ are representing measures for $I$.

Proof. $\mu_0$ and $\mu^0$ are Borel measures. It must be shown that

(29.1)    $I(u) = \int u \, d\mu^0 = \int u \, d\mu_0$    for all $u \in C_c(E)$,

and because of linearity and the fact that the positive and negative parts of each
$u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative $u$. So let such
a $u \in C_c(E)$ be given and let the real number $b > 0$ be an upper bound for $u$. For
a given $\varepsilon > 0$ choose real numbers $y_0, \ldots, y_n$ with

    $0 = y_0 < y_1 < \ldots < y_n = b$    and    $y_j - y_{j-1} < \varepsilon$ for each $j = 1, \ldots, n$.

We set

    $u_j := (u - y_{j-1})^+ \wedge (y_j - y_{j-1})$    ($j = 1, \ldots, n$)

and get non-negative continuous functions, each having its support in $\operatorname{supp}(u)$,
which satisfy

(29.2)    $u = \sum_{j=1}^{n} u_j$,

as the following deliberations will confirm. If $x \in E$ and $u(x) = 0$, then $u_j(x) = 0$
for each $j = 1, \ldots, n$. If $x \in E$ and $u(x) > 0$, then there is a unique $j \in \{1, \ldots, n\}$
such that $y_{j-1} < u(x) \le y_j$. In that case $u_j(x) = u(x) - y_{j-1}$ and $u_k(x) = y_k - y_{k-1}$
for $k < j$ and $u_k(x) = 0$ for $k > j$. Equality (29.2) follows. Next we set

    $K_0 := \operatorname{supp}(u)$ and $K_j := \{u \ge y_j\}$    for $j = 1, \ldots, n$

and have

(29.3)    $(y_j - y_{j-1})\,1_{K_j} \le u_j \le (y_j - y_{j-1})\,1_{K_{j-1}}$    for $j = 1, \ldots, n$,

which becomes clear from considering the three properties

(29.4)    $0 \le u_j \le y_j - y_{j-1}$,

(29.5)    $\complement K_{j-1} \subset \{u_j = 0\}$,

(29.6)    $K_j \subset \{u_j = y_j - y_{j-1}\}$,

valid for $j = 1, \ldots, n$. Integrating in (29.3) with respect to $\mu^0$ gives

(29.7')    $(y_j - y_{j-1})\,\mu^0(K_j) \le \int u_j \, d\mu^0 \le (y_j - y_{j-1})\,\mu^0(K_{j-1})$,

and from (29.3) we will - momentarily - infer the analogous inequalities

(29.7'')    $(y_j - y_{j-1})\,\mu^0(K_j) \le I(u_j) \le (y_j - y_{j-1})\,\mu^0(K_{j-1})$,

valid for all $j \in \{1, \ldots, n\}$. The left half of (29.7'') follows from the left half of (29.3)
when account is taken of (28.2) and the fact that $\mu_*(K_j) = \mu^*(K_j) = \mu^0(K_j)$.
From (29.5) we have $\operatorname{supp}(u_j) \subset K_{j-1}$. For every open $U \supset K_{j-1}$, the function
$v := (y_j - y_{j-1})^{-1} u_j$ is therefore an element of $C_c(E)$ with $\operatorname{supp}(v) \subset U$ and
satisfying, by (29.4), $0 \le v \le 1$. From Lemma 28.7 then $I(v) \le \mu^0(U)$ and hence

    $I(u_j) \le (y_j - y_{j-1})\,\mu^0(U)$.

According to (28.7) $\mu^0(U) = \mu_*(U)$ and therefore from (28.5) and the arbitrariness
of $U$ we have confirmation of the right-hand side of (29.7''). Upon adding up the
inequalities in (29.7') and those in (29.7'') and recalling (29.2), we find that both
of the numbers $\int u \, d\mu^0$ and $I(u)$ lie between

    $\sum_{j=1}^{n} (y_j - y_{j-1})\,\mu^0(K_j)$    and    $\sum_{j=1}^{n} (y_j - y_{j-1})\,\mu^0(K_{j-1})$

and consequently

    $\bigl|\int u \, d\mu^0 - I(u)\bigr| \le \sum_{j=1}^{n} (y_j - y_{j-1})\,\mu^0(K_{j-1} \setminus K_j)$,

since $K_n \subset K_{n-1} \subset \ldots \subset K_0$. Due to the choice of the $y_j$ it follows that

    $\bigl|\int u \, d\mu^0 - I(u)\bigr| \le \varepsilon \sum_{j=1}^{n} \mu^0(K_{j-1} \setminus K_j) = \varepsilon\,\mu^0(K_0 \setminus K_n) \le \varepsilon\,\mu^0(K_0)$.

The extreme inequality being valid for every $\varepsilon > 0$ and $\mu^0(K_0)$ being finite, the
desired equality

(29.8)    $I(u) = \int u \, d\mu^0$

emerges.

The measures of the compact sets $K_j$, $j = 0, \ldots, n$ do not change, thanks
to (28.8), when $\mu^0$ is replaced by $\mu_0$. Another pass through the preceding derivation
therefore leads to the conclusion that $\mu_0$ is also a representing measure for $I$. □
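The slicing of $u$ into the layers $u_j$ used in this proof is easy to test pointwise. The sketch below works with a single value $u(x)$ at a time and with an assumed sample subdivision $0 = y_0 < \ldots < y_8 = 2$ of mesh $1/4$; the function name `slices` and the sample data are illustrative and not part of the text.

```python
from fractions import Fraction as F

def slices(value, levels):
    """Return [u_1(x), ..., u_n(x)] for u(x) = value, where
    u_j = (u - y_{j-1})^+ ^ (y_j - y_{j-1}) and levels = [y_0, ..., y_n]."""
    return [
        min(max(value - lo, F(0)), hi - lo)
        for lo, hi in zip(levels, levels[1:])
    ]

# Assumed sample data: b = 2 and a subdivision of [0, b] with mesh 1/4.
levels = [F(k, 4) for k in range(0, 9)]
values = [F(k, 10) for k in range(0, 21)]   # sample values of u in [0, b]

# (29.2): the layers add up to u.
layers_sum_to_u = all(sum(slices(v, levels)) == v for v in values)

# (29.5) and (29.6) in pointwise form: a layer is empty while u(x) <= y_{j-1}
# and full as soon as u(x) >= y_j.
layers_fill_in_order = all(
    (s == hi - lo if v >= hi else True) and (s == 0 if v <= lo else True)
    for v in values
    for s, lo, hi in zip(slices(v, levels), levels, levels[1:])
)
```

Because every layer has height below $\varepsilon$, the two sums enclosing $I(u)$ and $\int u \, d\mu^0$ differ by at most $\varepsilon\,\mu^0(K_0)$, which is the estimate the proof exploits.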

These two representing measures can be characterized by extremality properties:

29.2 Lemma. Every representing measure $\mu$ for $I$ satisfies

    $\mu(K) \le \mu_0(K)$ and $\mu(U) \ge \mu^0(U)$

for all compact subsets $K$ and all open subsets $U$ of $E$.

Proof. Given $K$ and $U$, consider functions $u, v \in C_c(E)$ with $1_K \le v$, $0 \le u \le 1$,
and $\operatorname{supp}(u) \subset U$. Integrating these inequalities,

    $\mu(K) \le \int v \, d\mu = I(v)$ and $I(u) = \int u \, d\mu \le \mu(U)$.

From (28.2) and Lemma 28.7 therefore the claimed inequalities follow. □
After this preparation we can enhance the statement of the Riesz representation
theorem by characterizing the measures $\mu_0$ and $\mu^0$, thereby putting into relief the
role of Radon measures.

29.3 Theorem. For every positive linear form $I$ on $C_c(E)$ the associated essential
measure $\mu_0$ is the unique Radon measure among the representing measures of $I$.

Proof. Let $\mu$ be a representing measure for $I$ which is inner regular, thus a Radon
measure. Since $\mu_0$ is also inner regular, it follows from the first part of the preceding
lemma that

    $\mu(A) \le \mu_0(A)$    for every $A \in \mathscr{B}(E)$.

In particular then all open $U \subset E$ satisfy $\mu(U) \le \mu_0(U) = \mu^0(U)$ and when this
is combined with the second part of 29.2 we have

(29.9)    $\mu(U) = \mu_0(U)$    for every open $U \subset E$.

If compact $K \subset E$ is given and $U$ is an open, relatively compact neighborhood
of $K$, then $U \setminus K$ is open, so that (29.9) is applicable and

    $\mu(U) - \mu(K) = \mu(U \setminus K) = \mu_0(U \setminus K) = \mu_0(U) - \mu_0(K)$.

Another appeal to (29.9), remembering that $\mu_0(U) < +\infty$, gives the equality

    $\mu(K) = \mu_0(K)$,

valid for every compact $K \subset E$. This fact and the inner regularity of both measures
results in their equality. □


29.4 Theorem. Among all representing measures for a positive linear form $I$
on $C_c(E)$ the principal representing measure $\mu^0$ is characterized by each of the
following two properties:
(i) $\mu^0$ is the smallest among all outer regular representing measures.
(ii) $\mu^0$ is the unique outer regular representing measure $\mu$ which is inner regular
on open sets, that is, satisfies

(29.10)    $\mu(U) = \sup\{\mu(K) : K \text{ compact} \subset U\}$    for every open $U$.

Proof. Let $\mu$ be an outer regular representing measure. By Lemma 29.2,
$\mu^0(U) \le \mu(U)$ holds for all open sets $U$. Since, however, $\mu^0$ is also outer regular, that
inequality passes over to Borel sets generally:

    $\mu^0(B) \le \mu(B)$    for all $B \in \mathscr{B}(E)$,

which confirms (i). If $K$ is a compact set

    $\mu(K) \le \mu_0(K) = \mu^0(K)$

by Lemma 29.2 and (28.8), so by what has already been proven equality prevails
here. That is, $\mu$ and $\mu^0$ coincide on the system $\mathscr{K}$ of all compact sets. Now $\mu^0$ satisfies
the inner regularity condition for open sets in (29.10), as we know from (28.4),
(28.7) and (28.8). If $\mu$ also satisfies these conditions, then for every open set $U$

    $\mu^0(U) = \sup\{\mu^0(K) : U \supset K \in \mathscr{K}\} = \sup\{\mu(K) : U \supset K \in \mathscr{K}\} = \mu(U)$,

an equality which passes over to all Borel sets via the outer regularity of both
measures; i.e., $\mu = \mu^0$ on $\mathscr{B}(E)$. □
Remark. 1. Some authors (cf. HEWITT and STROMBERG [1965] and COHN [1980])
employ the adjective "regular" for just those outer regular Borel measures $\mu$ that
have property (29.10), in contrast to our usage.
The following example shows that in general $\mu^0$ is not the only outer regular
representing measure.
Example. 1. Let $E$ be an uncountable set and equip it with the discrete topology.
For $I$ take the identically 0 form. Then from the last two theorems it follows that
$\mu_0 = \mu^0 = 0$. However the measure $\mu$ from Example 6 of §25 is an outer regular
representing measure which is not identically 0.

Example 1 - there $\mu_0$ and $\mu^0$ are identical - leads to the important question
whether the essential and the principal measures coincide in general, or under
appropriate supplemental conditions. Although according to 28.5 $\mu_0(A) = \mu^0(A)$
for all $A \in \mathscr{B}(E)$ having $\sigma$-finite $\mu^0$-measure, generally $\mu_0 \neq \mu^0$. An example due
to C.H. DOWKER (cf. the reference in EDWARDS [1953], p. 160) will be presented
in Exercise 7 below. Nevertheless in many important situations these measures do
coincide and we are going to look into this now.
We will encounter two types of supplemental hypotheses which will entail the
equality $\mu_0 = \mu^0$ on $\mathscr{B}(E)$. The first imposes conditions on the space $E$, but none
on the linear form $I$.
We already know, for example, that for a compact space $E$ the representing
measures $\mu_0$ and $\mu^0$ determined by a given positive linear form $I$ on $C_c(E)$ coincide.
This follows immediately from Theorem 28.4. The reasons that underlie this need
to be examined more closely.

29.5 Definition. A locally compact space is called countable at infinity (also
sometimes $\sigma$-compact) when it can be covered by a sequence of compact subsets.

Examples. 2. The following spaces are countable at infinity:

(i) every compact space;
(ii) the euclidean spaces $\mathbb{R}^d$, $d \in \mathbb{N}$: The closed balls with any fixed center and
integer radii provide a countable covering by compact sets.
(iii) every locally compact space with a countable basis $\mathscr{G}$. For
$\mathscr{G}_0 := \{\overline{G} : G \in \mathscr{G},\ G \text{ relatively compact}\}$ is a countable system of compact sets
which covers $E$. Indeed, each $x \in E$ possesses by definition a compact neighborhood $V$,
and since $\mathscr{G}$ is a basis, $x \in G \subset V$ for some $G \in \mathscr{G}$. Of course then $\overline{G} \in \mathscr{G}_0$.

3. A discrete space is countable at infinity just if it is a countable set.

Every subset $A$ of a space $E$ which is countable at infinity is of course covered
by a sequence of compact subsets of $E$, so from 28.5 we immediately get:

29.6 Theorem. If the locally compact space $E$ is countable at infinity, then the
representing measures $\mu_0$ and $\mu^0$ determined by any positive linear form $I$ on $C_c(E)$
coincide.
A simple consequence is:

29.7 Corollary. On a locally compact space $E$ which is countable at infinity every
Radon measure (inner regular by definition) is also outer regular.

Proof. Every Radon measure $\mu$ on $E$ defines a positive linear form $I_\mu$ on $C_c(E)$
of which it is a representing measure. According to 29.3 $\mu$ must coincide with
the essential measure $\mu_0$ determined by $I_\mu$. Since $\mu_0 = \mu^0$ and the latter is outer
regular, so must be $\mu$. □

To justify the terminology "countable at infinity" we sharpen the covering
condition featuring in Definition 29.5.

29.8 Lemma. Let $E$ be a locally compact space which is countable at infinity.
Then $E$ can be covered by a sequence $(L_n)_{n \in \mathbb{N}}$ of compact subsets each contained
in the interior of its successor. Every compact subset of $E$ is therefore a subset of
some (hence of all but finitely many) $L_n$.

Proof. First of all there is a sequence $(K_n)$ of compact sets such that $K_n \uparrow E$.
Using Corollary 27.3 we find $0 \le u_n \in C_c(E)$ with $u_n \uparrow 1_E$. But then the sets

    $L_n := \{u_n \ge 1/n\}$,    $n \in \mathbb{N}$,

do what is wanted: Each is closed and, since $L_n \subset \operatorname{supp}(u_n)$, it is compact. Because
$(u_n)$ is isotone

    $L_n \subset \{u_{n+1} \ge 1/n\} \subset \{u_{n+1} > 1/(n+1)\} \subset L_{n+1}$,

and the set $\{u_{n+1} > 1/(n+1)\}$ is open, whence $L_n \subset L_{n+1}^{\circ}$, where $A^{\circ}$ denotes the
interior of a set $A$. As a result, $(L_n^{\circ})_{n \in \mathbb{N}}$ is an open covering of $E$, so finitely many
of its sets suffice to cover any given compact subset of $E$. □
A simple interpretation of countability at infinity now emerges: A locally compact
space $E$ is countable at infinity if and only if the infinitely remote point $\omega_0$
in the one-point compactification $E'$ has a countable base of neighborhoods. Such
a countable neighborhood basis is furnished by the complements $E' \setminus L_n$ of any
sequence $(L_n)$ with the properties described in 29.8.

We come now to the second type of supplemental hypotheses. Here $E$ is an
arbitrary locally compact space and conditions will be imposed on the positive
linear form $I$ on $C_c(E)$.

29.9 Definition. A positive linear form $I$ on $C_c(E)$ is called bounded if there is
a real number $M$ such that

(29.11)    $|I(u)| \le M\,\|u\|$    for all $u \in C_c(E)$.

Here $\|f\|$ denotes the supremum norm of any bounded real function $f$ on $E$.
The requirement (29.11) means that $I$ is continuous with respect to the metric (of
uniform convergence) in $C_c(E)$ derived from this norm.
Remark. 2. If the space $E$ is compact, then every positive linear form $I$ on $C_c(E)$
is bounded, because $C_c(E) = C(E)$ so the constant function 1 lies in $C_c(E)$.
Therefore from $-\|u\| \cdot 1 \le u \le \|u\| \cdot 1$ and the positivity of $I$ we infer that

    $-\|u\|\,I(1) \le I(u) \le \|u\|\,I(1)$,

so that (29.11) holds with $M := I(1)$.

The next theorem - like its predecessor - covers compact spaces as a special
case.

29.10 Theorem. If $I$ is a bounded positive linear form on $C_c(E)$ for a locally compact
space $E$, then its principal representing measure $\mu^0$ is finite and coincides with the
essential measure $\mu_0$.

Proof. According to (28.13)

    $\|\mu^0\| = \sup\{I(u) : 0 \le u \le 1,\ u \in C_c(E)\}$.

Since $0 \le u \le 1$ entails $\|u\| \le 1$, (29.11) says that $0 \le I(u) \le M\,\|u\| \le M$, and so

    $\|\mu^0\| \le M < +\infty$.

Thus $\mu^0$ is a finite measure and the rest follows from 28.4. □

Proceeding via $I_\mu$ as before (cf. 29.7) yields

29.11 Corollary. Every finite Radon measure $\mu$ on a locally compact space $E$ is
also outer regular.

Indeed, the positive linear form $I_\mu$ on $C_c(E)$ defined by $\mu$ is bounded, by
$M := \|\mu\| < +\infty$:

    $|I_\mu(u)| = \bigl|\int u \, d\mu\bigr| \le \int |u| \, d\mu \le \|u\|\,M$    for every $u \in C_c(E)$,

and we can conclude as in the proof of 29.7. □

Remarks. 3. From the proof of Theorem 29.10 it also follows that the total mass $\|\mu_0\|$ of $\mu_0$ is the smallest real number $M \ge 0$ that can serve in Definition 29.9.

4. It is not to be expected that in every locally compact space E which is countable at infinity every positive linear form on $C_c(E)$ will have exactly one representing measure with no further qualification. Still less is unqualified uniqueness of representing measures for bounded positive linear forms on $C_c(E)$, when E is only a locally compact space, to be expected. There is a counterexample to both in HALMOS [1974], p. 231 (DIEUDONNÉ [1939] is also cited there), in which the space E is even compact: It is the interval $[1, \Omega]$ of all ordinal numbers not greater than the first uncountable ordinal $\Omega$, equipped with the order topology. The positive linear form $I_{\varepsilon_\Omega}$ on $C([1, \Omega])$ defined by the Dirac measure $\varepsilon_\Omega$ has a representing measure $\mu$ which is neither inner regular nor outer regular. Thus $\int f\, d\varepsilon_\Omega = \int f\, d\mu$ for all $f \in C([1, \Omega])$ although $\mu \ne \varepsilon_\Omega$. Details can be found in PFEFFER [1977], p. 116.

In view of the last remark the following theorem is especially noteworthy, as


well as useful:

29.12 Theorem. If the locally compact space E has a countable base for its topology, then every Borel measure on E is regular, hence in particular a Radon measure.

Proof. Let $\mu$ be a Borel measure, $I_\mu$ the associated positive linear form on $C_c(E)$ and $\mu_0$ the principal representing measure for $I_\mu$. Along with E each of its open subspaces U also has a countable base. From Example 2 therefore U is countable at infinity; there exists a sequence $(K_n)$ of compact sets such that $K_n \uparrow U$. Since the measures $\mu$, $\mu_0$ are continuous from below, it follows that

$\mu(U) = \lim_{n\to\infty} \mu(K_n)$ and $\mu_0(U) = \lim_{n\to\infty} \mu_0(K_n)$.

But $\mu(K_n) \le \mu_0(K_n)$ for every $n \in N$, by Lemma 29.2. So we get $\mu(U) \le \mu_0(U)$, from which and a second appeal to 29.2

(29.12)    $\mu(U) = \mu_0(U)$    for every open $U \subset E$.

For an arbitrary Borel set A and open $U \supset A$ we then have $\mu(A) \le \mu(U) = \mu_0(U)$ and so, on account of the outer regularity of $\mu_0$,

(29.13)    $\mu(A) \le \mu_0(A)$    for every $A \in \mathcal{B}(E)$.

If $A \in \mathcal{B}(E)$ is relatively compact, we can choose an open relatively compact neighborhood U of A and apply the last inequality to $U \setminus A$, getting

$\mu(U) - \mu(A) = \mu(U \setminus A) \le \mu_0(U \setminus A) = \mu_0(U) - \mu_0(A)$.

Subtracting (29.12) from this gives us the reverse inequality to (29.13). In summary,

(29.14)    $\mu(A) = \mu_0(A)$    for every relatively compact $A \in \mathcal{B}(E)$.

Now, E is, as already noted, countable at infinity. So we have a sequence $(L_n)$ of compact sets which increase to E. (29.14) is applicable to $B \cap L_n$ for any Borel set B and any $n \in N$. We therefore get

$\mu(B) = \lim_{n\to\infty} \mu(B \cap L_n) = \lim_{n\to\infty} \mu_0(B \cap L_n) = \mu_0(B)$.

That is, $\mu$ and $\mu_0$ coincide throughout $\mathcal{B}(E)$. Since the essential measure $\mu^0$ is a representing measure for $I_\mu$, this fact insures (as does Theorem 29.6, for that matter) that $\mu^0 = \mu_0$. From the double equality $\mu = \mu_0 = \mu^0$ follows finally the regularity of $\mu$. □
In this situation the Riesz representation theorem can therefore be expressed
thus:

29.13 Corollary. For a locally compact space E whose topology has a countable base, every positive linear form I on $C_c(E)$ can be represented as

$I(u) = \int u\, d\mu$    $(u \in C_c(E))$

by exactly one Borel measure $\mu$ on E.

Example. 4. For each $u \in C_c(R)$ choose real numbers $\alpha < \beta$ such that $\mathrm{supp}(u) \subset [\alpha, \beta]$ and define

$L(u) := \int_\alpha^\beta u(x)\, dx$,

the integral being the usual Riemann integral: it is independent of the specific numbers $\alpha$ and $\beta$ used. Evidently L is a positive linear form on $C_c(R)$. According to 16.4 L-B measure $\lambda^1$ represents L, and by 29.13 it is the only representing measure.

Remark. 5. It is also possible to deduce Theorem 29.12 from Theorem 26.3 and its Corollary 26.4 because every locally compact space E whose topology has a countable basis is Polish. In fact along with E, its one-point compactification E' also has a countable base, as follows from Lemma 29.8 and the commentary after it. It will be shown in Remark 3 of §31 that E' is consequently metrizable, and completeness of the metric follows easily from compactness (cf. Example 6, §26). Thus E' is Polish and E is an open subset of it. Therefore according to Example 4, §26, E itself is Polish.


Summarizing, we can say that for every locally compact space E, the mapping that associates to each Radon measure $\mu$ on E the positive linear form $I_\mu$ on $C_c(E)$ is a bijection between the set of Radon measures on E and the set of positive linear forms on $C_c(E)$. That is the reason why in BOURBAKI [1965] the positive linear forms on $C_c(E)$ are themselves designated as (positive) Radon measures.

If the space E is countable at infinity as well, the Radon measures on E are


all outer regular. If moreover the topology of E has a countable base, the Radon
measures and the Borel measures on E coincide.
We give now an application to integration that is of fundamental importance.

29.14 Theorem. For any regular Borel measure $\mu$ on a locally compact space E and any $p \in [1, +\infty[$, the vector space $C_c(E)$ is dense in $\mathcal{L}^p(\mu)$ with respect to convergence in p-th mean.

Proof. First of all, $C_c(E) \subset \mathcal{L}^p(\mu)$, because $C_c(E) \subset \mathcal{L}^1(\mu)$ by (28.1) and $|u|^p \in C_c(E)$ whenever $u \in C_c(E)$. The denseness claim requires that for each $f \in \mathcal{L}^p(\mu)$ and each number $\varepsilon > 0$, a function $u \in C_c(E)$ be produced with

$N_p(f - u) := \left(\int |f - u|^p\, d\mu\right)^{1/p} < \varepsilon$.

We accomplish this by a stepwise simplification of the function f to be approximated. Since along with f, both $f^+$ and $f^-$ are in $\mathcal{L}^p(\mu)$, and $N_p$ is a semi-norm, we can assume that $f \ge 0$. By 11.3 and 11.6 there is an isotone sequence $(f_n)$ of $\mathcal{B}(E)$-elementary functions such that $f_n \uparrow f$. All these functions also lie in $\mathcal{L}^p(\mu)$, due to $0 \le f_n \le f$. Therefore from the dominated convergence theorem

$\lim_{n\to\infty} N_p(f - f_n) = 0$.

This makes it clear that only $\mathcal{B}(E)$-elementary functions need be approximated by $C_c(E)$, and because of the semi-norm properties of $N_p$ the matter even comes down to approximating the indicator functions $1_A$ of Borel sets A having $\mu(A) = [N_p(1_A)]^p < +\infty$. For such an A the outer regularity of $\mu$ supplies an open $U \supset A$ such that

$[\mu(U) - \mu(A)]^{1/p} = N_p(1_{U \setminus A}) = N_p(1_U - 1_A) < \varepsilon/2$.

In particular, $\mu(U) < +\infty$. Therefore the inner regularity of $\mu$ insures that for some compact $K \subset U$

$[\mu(U) - \mu(K)]^{1/p} < \varepsilon/2$,

that is,

$N_p(1_U - 1_K) < \varepsilon/2$.

Finally, we use 27.3 to select $u \in C_c(E)$ satisfying $1_K \le u \le 1_U$, whence

$0 \le 1_U - u \le 1_U - 1_K$

and so

$N_p(1_U - u) < \varepsilon/2$.

For the function $f = 1_A$ to be approximated we now have

$N_p(f - u) \le N_p(1_A - 1_U) + N_p(1_U - u) < \varepsilon$,

completing the proof. □
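The last step of the proof can be illustrated numerically in the simplest setting. The following is only a sketch under assumptions that are not in the text: E is the real line, the measure is Lebesgue measure, A = K = [0, 1], U = ]-delta, 1 + delta[, and the helper names (`ramp_approximation`, `midpoint_integral`, `lp_distance_to_indicator`) are hypothetical. For this piecewise linear u one computes by hand that $N_p(1_A - u) = (2\delta/(p+1))^{1/p}$, which tends to 0 with delta.

```python
def ramp_approximation(x, delta):
    """Continuous u with compact support and 1_[0,1] <= u <= 1_]-delta, 1+delta[."""
    if 0.0 <= x <= 1.0:
        return 1.0
    if -delta < x < 0.0:
        return 1.0 + x / delta
    if 1.0 < x < 1.0 + delta:
        return 1.0 - (x - 1.0) / delta
    return 0.0


def midpoint_integral(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def lp_distance_to_indicator(p, delta):
    """N_p(1_A - u) for A = [0, 1]; the difference is non-zero only on the two ramps."""
    g = lambda x: abs(ramp_approximation(x, delta)) ** p
    integral = (midpoint_integral(g, -delta, 0.0)
                + midpoint_integral(g, 1.0, 1.0 + delta))
    return integral ** (1.0 / p)


p = 2
deltas = (0.1, 0.01, 0.001)
distances = [lp_distance_to_indicator(p, d) for d in deltas]
closed_form = [(2.0 * d / (p + 1)) ** (1.0 / p) for d in deltas]
```

Shrinking delta plays the role of choosing U and K ever closer to A in the proof; the list `distances` decreases accordingly.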
The proof actually uses the inner regularity of $\mu$ only on open sets. So what is involved here are conditions which according to 29.4(ii) characterize the principal representing measure. We will not pursue this any further but interested readers can consult BOURBAKI [1965] and BAUER [1984], where this remark is placed in a more general framework.

Exercises.
1. Let E be an uncountable discrete space. Using the Borel measure from Example 6 in §25, show that every positive linear form on $C_c(E)$ has at least two different representing measures. This sharpens Example 1 of this section.
2. Let E be a locally compact space and I a positive linear form on $C_c(E)$. With the help of the Riesz representation theorem prove the following refinement of equality (28.12): For every open $U \subset E$

$\mu_0(U) = \sup\{I(u) : 0 \le u \le 1_U,\ u \in C_c(E)\}$.

3. A $K_\sigma$-set is a union of countably many compacta. Prove that in a locally compact space in which every open set is a $K_\sigma$-set, every Borel measure is regular. (Hint: Re-examine the proof of Theorem 29.12.)
4. Show that a locally compact space E is countable at infinity if and only if there exists a strictly positive function in $C_0(E)$.
5. Prove that for an arbitrary Borel measure $\mu$ on a locally compact space E the following two assertions are equivalent: (a) $\mu$ is finite. (b) $C_b(E) \subset \mathcal{L}^1(\mu)$. Show that if $\mu$ is a Radon measure, the assertion $C_0(E) \subset \mathcal{L}^1(\mu)$ is equivalent to each of (a) and (b).
6. Let E be a locally compact space, I a positive linear form on $C_0(E)$. Show that there is exactly one finite Radon measure $\mu$ on E such that $I(f) = \int f\, d\mu$ for every $f \in C_0(E)$. (Hints: Indirect proof. Or: For every $\varepsilon > 0$ and non-negative $f \in C_0(E)$ there is a $u \in C_c(E)$ with $|f - u| \le \varepsilon$.)
7. Let $E_1$, $E_2$ be the interval [0, 1] equipped with the discrete topology, respectively, the usual euclidean topology, and consider the product space $E := E_1 \times E_2$. Show that
(a) E is locally compact.
(b) Every product xE := {x} × [0, 1], $0 \le x \le 1$, is a compact subspace of E, which is also open in E.
(c) A set $U \subset E$ is open if and only if $U \cap xE$ is open for each $x \in [0, 1]$.
(d) Every compact subset of E is covered by finitely many of the sets xE.
Now consider $u \in C_c(E)$. By (d) u vanishes in the complement of the union of finitely many xE sets, and for each fixed x, $y \mapsto u(x, y)$ is a continuous function on the compact interval $E_2 = [0, 1]$. Therefore

$I(u) := \sum_{0 \le x \le 1} \int_0^1 u(x, y)\, dy$

is a well defined finite sum, evidently a positive linear form on $C_c(E)$. Show that
(e) The essential and the principal representing measures for I do not coincide. [Hint: Show that the set $A := E_1 \times \{0\}$ is closed and that $\mu^0(A) = 0$, while $\mu_0(A) = +\infty$.]
(f) In passing from $\mu_0$ to the Borel measure $1_B \mu_0$ for $B \in \mathcal{B}(E)$ outer regularity may be lost. [It suffices to consider $B := E \setminus A$, for the set A in the preceding hint.]

30. Convergence of Radon measures


For locally compact spaces E we will henceforth use the notation $\mathcal{M}_+(E)$ for the set of all (positive) Radon measures on E. The Riesz representation theorem furnishes a canonical bijection of $\mathcal{M}_+(E)$ onto the set of all positive linear forms on $C_c(E)$.

With $\mu, \nu \in \mathcal{M}_+(E)$ and real numbers $\alpha \ge 0$, $\beta \ge 0$ the measure $\alpha\mu + \beta\nu$ also lies in $\mathcal{M}_+(E)$, as is easily checked. That is, $\mathcal{M}_+(E)$ is what is called a convex cone. Besides $\mathcal{M}_+(E)$ we often consider the following subsets

$\mathcal{M}_+^b(E) := \{\mu \in \mathcal{M}_+(E) : \mu(E) < +\infty\}$,
$\mathcal{M}_+^1(E) := \{\mu \in \mathcal{M}_+(E) : \mu(E) = 1\}$,

the set of all finite (or bounded) Radon measures and the set of all Radon p-measures on E, respectively. Evidently

$\mathcal{M}_+^1(E) \subset \mathcal{M}_+^b(E) \subset \mathcal{M}_+(E)$.

In $\mathcal{M}_+^1(E)$ are to be found all the Dirac measures on E. And $\mathcal{M}_+^b(E)$ is a convex subcone of $\mathcal{M}_+(E)$.
In the special case $E = \mathbb{R}^d$ the set $\mathcal{M}_+^b(\mathbb{R}^d)$ is the set of all finite Borel measures on $\mathbb{R}^d$, already familiar to us from §24. That the definition there is equivalent to the present one is due to Theorem 29.12, according to which every Borel measure on $\mathbb{R}^d$ is a Radon measure.
Depending on whether one thinks of the elements of $\mathcal{M}_+(E)$ as measures on $\mathcal{B}(E)$ or as positive linear forms on $C_c(E)$, two notions of convergence suggest themselves: One can define the convergence of a sequence $(\mu_n)$ in $\mathcal{M}_+(E)$ to $\mu \in \mathcal{M}_+(E)$ by requiring either that

$\lim_{n\to\infty} \mu_n(A) = \mu(A)$    for all $A \in \mathcal{B}(E)$

or

$\lim_{n\to\infty} \int f\, d\mu_n = \int f\, d\mu$    for all $f \in C_c(E)$.

We will forthwith show that the first of these is of limited interest, while the second is of considerable significance.

30.1 Definition. A sequence $(\mu_n)_{n \in N}$ of Radon measures on E is said to be vaguely convergent to a Radon measure $\mu$ if

(30.1)    $\lim_{n\to\infty} \int f\, d\mu_n = \int f\, d\mu$    for all $f \in C_c(E)$.

A sequence $(\mu_n)$ in $\mathcal{M}_+(E)$ is vaguely convergent just when the sequence of real numbers $(\int f\, d\mu_n)$ converges in R for every $f \in C_c(E)$. For in this case $f \mapsto \lim_n \int f\, d\mu_n$ evidently defines a positive linear form on $C_c(E)$, so by the Riesz representation theorem together with Theorem 29.3 there is a unique Radon measure $\mu$ to which $(\mu_n)$ vaguely converges. At the same time we see that a sequence in $\mathcal{M}_+(E)$ can have at most one vague limit.

Examples. 1. Let $(x_n)$ be a sequence in E, $x \in E$. If $(x_n)$ converges to x, then $(\varepsilon_{x_n})$ converges vaguely to $\varepsilon_x$, for the latter just amounts to $\lim f(x_n) = f(x)$ for every $f \in C_c(E)$. In general however $\lim \varepsilon_{x_n}(A) = \varepsilon_x(A)$ does not hold for all $A \in \mathcal{B}(E)$; in fact, if all $x_n$ are distinct from x, A := {x} is such a set. Conversely, if $(\varepsilon_{x_n})$ vaguely converges to $\varepsilon_x$, then $(x_n)$ converges to x. For if this were not so, there would be a subsequence of $(x_n)$ which remains outside of some neighborhood U of x. 27.3 furnishes an $f \in C_c(E)$ with f(x) = 1 and $\mathrm{supp}(f) \subset U$. Evidently the sequence $(\int f\, d\varepsilon_{x_n}) = (f(x_n))$ does not converge to $\int f\, d\varepsilon_x$.

2. Let $(\alpha_n)$ be an arbitrary sequence of non-negative real numbers and $(x_n)$ a sequence in E with the property that $\{n \in N : x_n \in K\}$ is finite for every compact $K \subset E$. (In other words, E is not compact and $\lim x_n = \omega_0 \in E'$.) Then the sequence of measures $\mu_n := \alpha_n \varepsilon_{x_n}$ $(n \in N)$ is vaguely convergent to the zero measure $\mu := 0$. For $\int f\, d\mu_n = \alpha_n f(x_n) = 0$ for all n except the finitely many for which $x_n \in \mathrm{supp}(f)$, whenever $f \in C_c(E)$.
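Both examples can be traced on the real line with a few lines of code. This is a minimal sketch under assumptions not made in the text: E = R, the test function is a tent with support [-1, 1], and the helper names are hypothetical. It shows that integrals against Dirac measures at 1/n approach the integral against the Dirac measure at 0, that the masses of the single set {0} do not, and that unit masses at the points n eventually integrate every such test function to 0.

```python
def dirac_integral(f, x):
    """Integral of f with respect to the Dirac measure at x, which is f(x)."""
    return f(x)


def dirac_mass(indicator, x):
    """Mass that the Dirac measure at x gives to the set described by `indicator`."""
    return 1.0 if indicator(x) else 0.0


def tent(t):
    """A continuous function with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(t))


# Example 1 with x_n = 1/n -> 0.
xs = [1.0 / n for n in range(1, 10001)]
integrals = [dirac_integral(tent, x) for x in xs]
masses_at_zero = [dirac_mass(lambda t: t == 0.0, x) for x in xs]

# Example 2 with alpha_n = 1 and x_n = n: the mass escapes to infinity.
escaping = [1.0 * tent(float(n)) for n in range(1, 50)]
```

Here `integrals` approaches `tent(0.0) == 1.0`, while every entry of `masses_at_zero` is 0 although the limit measure gives the set {0} mass 1.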

The fact, illustrated by Example 1, that the vague convergence of $(\mu_n)$ to $\mu$ does not generally entail the convergence of $(\mu_n(A))$ to $\mu(A)$ for each $A \in \mathcal{B}(E)$, while, as 30.2 will show, the converse is true, seems to indicate that the first mode of convergence mentioned above is too restrictive to be of much use. Actually, vague convergence of $(\mu_n)$ to $\mu$ follows just from knowing that $(\mu_n(A))$ converges to $\mu(A)$ for certain special sets $A \in \mathcal{B}(E)$. Even more:


30.2 Theorem. A sequence $(\mu_n)$ of Radon measures on a locally compact space E converges vaguely to a Radon measure $\mu$ if and only if the following condition is fulfilled:

(30.2)    $\limsup_{n\to\infty} \mu_n(K) \le \mu(K)$ and $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$

for every compact $K \subset E$ and every relatively compact, open $G \subset E$.

Proof. Suppose $(\mu_n)$ converges vaguely to $\mu$ and that K and G are any compact and open sets, respectively. Consider functions $u, v \in C_c(E)$ with $u \ge 1_K$, $0 \le v \le 1$ and $\mathrm{supp}(v) \subset G$. Then for all $n \in N$

$\mu_n(K) \le \int u\, d\mu_n$ and $\int v\, d\mu_n \le \mu_n(G)$,

whence

$\limsup_{n\to\infty} \mu_n(K) \le \int u\, d\mu$ and $\int v\, d\mu \le \liminf_{n\to\infty} \mu_n(G)$.

From these inequalities (30.2) follows via (28.2) and (28.12). One only has to recall that the Radon measure $\mu$ coincides, thanks to Theorem 29.3, with the essential measure $\mu^0$ determined by the linear form $I_\mu$.

Now suppose conversely that condition (30.2) is fulfilled and that an $f \in C_c(E)$ has been given. Since our goal is to confirm (30.1), we lose no generality by assuming that $f \ge 0$. For a pre-assigned $\varepsilon > 0$ we choose finitely many numbers

$0 = y_0 < y_1 < \dots < y_k$

with $y_k > \|f\|$ and $y_j - y_{j-1} = \varepsilon$ for each j = 1, ..., k. Set

$K := \mathrm{supp}(f)$ and $A_j := \{y_{j-1} \le f < y_j\} \cap K$,    j = 1, ..., k.

Denoting the compact set $\{f \ge y_j\} \cap K$ by $K_j$ for j = 0, ..., k (so $K_k = \emptyset$ and $K_0 = K$), we have $K_{j-1} \supset K_j$ and

$A_j = K_{j-1} \setminus K_j$    (j = 1, ..., k).

Because of the obvious inequalities

$\sum_{j=1}^k y_{j-1} 1_{A_j} \le f \le \sum_{j=1}^k y_j 1_{A_j}$,

every Radon measure $\nu$ on E satisfies

$\sum_{j=1}^k y_{j-1}\, \nu(A_j) \le \int f\, d\nu \le \sum_{j=1}^k y_j\, \nu(A_j)$,

from which and a simple calculation using the facts $\nu(A_j) = \nu(K_{j-1}) - \nu(K_j)$ and $y_j - y_{j-1} = \varepsilon$, we get

$\varepsilon \sum_{j=0}^k \nu(K_j) - \varepsilon\, \nu(K) = \varepsilon \sum_{j=1}^k \nu(K_j) \le \int f\, d\nu \le \varepsilon \sum_{j=0}^k \nu(K_j)$.

For $\nu := \mu_n$ the right-hand inequality gives us

$\int f\, d\mu_n \le \varepsilon \sum_{j=0}^k \mu_n(K_j)$    for all $n \in N$,

and therefore from the first half of hypothesis (30.2)

$\limsup_{n\to\infty} \int f\, d\mu_n \le \varepsilon \sum_{j=0}^k \mu(K_j)$.

But this right-hand side can be estimated by using the left end of the earlier chain of inequalities, with $\nu := \mu$. We thereby get

$\limsup_{n\to\infty} \int f\, d\mu_n \le \int f\, d\mu + \varepsilon\, \mu(K)$,

valid for every $\varepsilon > 0$. Consequently,

$\limsup_{n\to\infty} \int f\, d\mu_n \le \int f\, d\mu$.

The complementary inequality that we need is

$\int f\, d\mu \le \liminf_{n\to\infty} \int f\, d\mu_n$

and we get it by an analogous procedure, using the second half of hypothesis (30.2). One sets $G_j := \{f > y_j\}$, j = 0, ..., k, which are open, relatively compact subsets of K with

$G_{j-1} \setminus G_j = \{y_{j-1} < f \le y_j\} = \{y_{j-1} < f \le y_j\} \cap K$.

These sets take over the role of the $K_j$. □
The second example above (for the case in which, say, all the $\alpha_n$ equal 1) shows that a vaguely convergent sequence of measures from $\mathcal{M}_+^1(E)$ need not converge to a measure in $\mathcal{M}_+^1(E)$: mass can be lost. This illustrates the following general phenomenon:

30.3 Lemma. If the sequence $(\mu_n)_{n \in N}$ of Radon measures on the locally compact space E converges vaguely to the measure $\mu \in \mathcal{M}_+(E)$, then the associated total masses satisfy

(30.3)    $\|\mu\| \le \liminf_{n\to\infty} \|\mu_n\|$.

Proof. For every $u \in C_c(E)$ with $0 \le u \le 1$

$\int u\, d\mu_n \le \|\mu_n\|$

holds for $n \in N$, so from (30.1) follows that

$\int u\, d\mu \le \liminf_{n\to\infty} \|\mu_n\|$.

Take the supremum of these integrals over all such u and you get, according to (28.13), the total mass $\mu(E) = \|\mu\|$ of $\mu$. The inequality persists after this operation. □

Vague convergence of sequences in $\mathcal{M}_+(E)$ is convergence in a certain topology on $\mathcal{M}_+(E)$, called, naturally, the vague topology. It is defined as the coarsest topology on $\mathcal{M}_+(E)$ with respect to which all the mappings

(30.4)    $\mu \mapsto \int f\, d\mu$    $(f \in C_c(E))$

are continuous. A fundamental system of neighborhoods of a typical $\mu_0 \in \mathcal{M}_+(E)$ consists of all sets of the form

(30.5)    $V_{f_1, \dots, f_n; \varepsilon}(\mu_0) := \{\mu \in \mathcal{M}_+(E) : |\int f_j\, d\mu - \int f_j\, d\mu_0| < \varepsilon,\ j = 1, \dots, n\}$

in which $n \in N$, $0 < \varepsilon \in R$ and $f_1, \dots, f_n \in C_c(E)$ are all arbitrary. The vague topology is Hausdorff because the uniqueness aspect of Riesz's theorem says that if $\mu$, $\nu$ are different Radon measures, then $I_\mu \ne I_\nu$, which just means that $\int f\, d\mu \ne \int f\, d\nu$ for some $f \in C_c(E)$.

In this context it is now clear too what should be understood by the vague convergence of a mapping $t \mapsto \mu_t$ of a subset A of a topological space T into $\mathcal{M}_+(E)$ when t converges to a point $t_0 \in \overline{A}$. With respect to the vague topology the convergence

$\lim_{t \to t_0,\ t \in A} \mu_t = \mu$

for some $\mu \in \mathcal{M}_+(E)$ just means that

(30.6)    $\lim_{t \to t_0,\ t \in A} \int f\, d\mu_t = \int f\, d\mu$    for every $f \in C_c(E)$.

Example. 3. Let K be a non-negative $\lambda^d$-integrable, real function on $E := \mathbb{R}^d$ with $\int K\, d\lambda^d = 1$ (for example, the indicator function of the unit cube $[0, 1]^d$). For every real r > 0 set

$K_r(x) := r^d K(rx)$    $(x \in \mathbb{R}^d)$.

Then $K_r$ is also non-negative and $\lambda^d$-integrable, and $\int K_r\, d\lambda^d = 1$ as well. To see this we only have to recall (7.10), according to which the homothety $H_r(x) := rx$ on $\mathbb{R}^d$ transforms L-B measure thus: $H_r(\lambda^d) = r^{-d}\lambda^d$. For from that it follows that

$\int K_r\, d\lambda^d = r^d \int K \circ H_r\, d\lambda^d = r^d \int K\, d(H_r(\lambda^d)) = \int K\, d\lambda^d = 1$.

Now $r \mapsto K_r \lambda^d$ is a mapping of $]0, +\infty[$ into $\mathcal{M}_+^1(\mathbb{R}^d)$, and in the sense of the vague topology it satisfies

(30.7)    $\lim_{r \to +\infty} K_r \lambda^d = \varepsilon_0$.

To confirm this, first notice that for every $f \in C_c(\mathbb{R}^d)$

$\int f K_r\, d\lambda^d = r^d \int f\, (K \circ H_r)\, d\lambda^d = r^d \int (f \circ H_r^{-1})\, K\, dH_r(\lambda^d) = \int (f \circ H_r^{-1})\, K\, d\lambda^d = \int f(r^{-1}x) K(x)\, \lambda^d(dx)$.

From this and the Lebesgue dominated convergence theorem the claim (30.7) follows upon checking that, on the one hand

$\lim_{r \to +\infty} f(r^{-1}x) K(x) = f(0) K(x)$    for every $x \in \mathbb{R}^d$,

and on the other hand for all real r > 0 and all $x \in \mathbb{R}^d$

$|f(r^{-1}x) K(x)| \le \|f\|\, K(x)$,

so that $\|f\| K$ is an integrable majorant for all functions. The "approximation of the identity" $\varepsilon_0$ expressed by (30.7) plays an important role in Fourier analysis (cf. the exercises in §23 of BAUER [1996]). For the algebra $L^1(\lambda^d)$ (cf. Remark 2, §24) has no identity element with respect to convolution, but it is not hard to show that $\|K_r * f - f\|_1 \to 0$ as $r \to +\infty$ for each $f \in L^1(\lambda^d)$, and in many situations this is almost as useful as having an identity.
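A one-dimensional numerical check of (30.7) may make the limit concrete. The sketch below rests on assumptions chosen only for illustration: d = 1, K is the indicator function of [0, 1], f is a tent function with support [-1, 1], and the function names are hypothetical. It uses the substitution from the example, $\int f K_r\, d\lambda = \int f(x/r) K(x)\, dx$; for this particular f and r ≥ 1 the integral equals 1 - 1/(2r) by direct computation, so it approaches f(0) = 1.

```python
def K(x):
    """Indicator function of the unit interval [0, 1]; its integral is 1."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def integral_against_K_r(f, r, steps=20000):
    """Midpoint-rule value of the integral of f(x / r) * K(x) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h / r) * K((i + 0.5) * h) for i in range(steps))


def f(x):
    """A test function with compact support: a tent on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


values = {r: integral_against_K_r(f, r) for r in (1.0, 10.0, 100.0, 1000.0)}
```

The dictionary `values` increases towards `f(0.0)` as r grows, in line with the vague convergence of $K_r \lambda^1$ to $\varepsilon_0$.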
To $\mathcal{M}_+^b(E)$ belong in particular all discrete Radon measures on E. These are the measures $\delta$ which can be represented in the form

$\delta = \sum_{i=1}^k \alpha_i \varepsilon_{x_i}$

for some finite number of points $x_1, \dots, x_k \in E$ and non-negative real numbers $\alpha_1, \dots, \alpha_k$. Every $\delta$ admits many such representations. Every Radon measure can be approximated, in the sense of the vague topology, by such $\delta$, as we next show.

30.4 Theorem. For every locally compact space E the set of discrete Radon measures on E is dense in $\mathcal{M}_+(E)$ in the vague topology.

Proof. Let a measure $\mu_0 \in \mathcal{M}_+(E)$ and a vague neighborhood V of $\mu_0$ be given. As noted after (30.5), we can suppose V is $V_{f_1, \dots, f_n; 1}(\mu_0)$ for some non-zero $f_1, \dots, f_n \in C_c(E)$. We have to find a discrete measure $\delta$ in V. To that end, consider the compact set

$K := \bigcup_{j=1}^n \mathrm{supp}(f_j)$

and $\eta > 0$ such that $\eta \mu_0(K) < 1$. Every $y \in K$ has an open neighborhood $U_y$ in E such that $|f_j(y') - f_j(y'')| \le \eta$ for all $y', y'' \in U_y$ and all $j \in \{1, \dots, n\}$. Finitely many $U_y$, say $U_{y_1}, \dots, U_{y_k}$, suffice to cover K. Set

$A_1 := K \cap U_{y_1}$, $A_2 := (K \cap U_{y_2}) \setminus A_1$, ..., $A_k := (K \cap U_{y_k}) \setminus (A_1 \cup \dots \cup A_{k-1})$.

These are pairwise disjoint, relatively compact Borel sets whose union is K, and for all $j \in \{1, \dots, n\}$, $i \in \{1, \dots, k\}$ and $y', y'' \in A_i$ the inequality $|f_j(y') - f_j(y'')| \le \eta$ holds. Since only these properties of the $A_i$ are used in the sequel, we can discard those that are empty (not all are because $\emptyset \ne K = A_1 \cup \dots \cup A_k$), and re-index the others. That is, we can suppose all the $A_i$ are non-empty and then select a point $x_i \in A_i$ for each i. The discrete measure

$\delta := \sum_{i=1}^k \mu_0(A_i)\, \varepsilon_{x_i}$

(notice that $\mu_0(A_i)$ is finite because $A_i$ is relatively compact) will be shown to lie in V and that will complete the proof:

$|\int f_j\, d\mu_0 - \int f_j\, d\delta| = |\sum_{i=1}^k \int_{A_i} f_j\, d\mu_0 - \sum_{i=1}^k \mu_0(A_i) f_j(x_i)| = |\sum_{i=1}^k \int_{A_i} (f_j - f_j(x_i))\, d\mu_0|$
$\le \sum_{i=1}^k \int_{A_i} |f_j - f_j(x_i)|\, d\mu_0 \le \eta \sum_{i=1}^k \mu_0(A_i) = \eta \mu_0(K)$,

using the fact that $|f_j(x) - f_j(x_i)| \le \eta$ for all $x \in A_i$, all $i \in \{1, \dots, k\}$. This holds for each $j \in \{1, \dots, n\}$, and $\eta \mu_0(K) < 1$ by choice of $\eta$. Therefore $\delta \in V_{f_1, \dots, f_n; 1}(\mu_0) = V$, as was to be shown. □
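The construction in this proof is easy to carry out by hand in a toy case. The following sketch makes assumptions that are not part of the text: the measure is Lebesgue measure restricted to [0, 1], there is a single test function f(x) = x(1 - x) on [0, 1] (extended by 0), the cells are $A_i = [i/k, (i+1)/k[$ with chosen points $x_i = i/k$, and all names are hypothetical. Since |f'| ≤ 1, the oscillation of f on each cell is at most eta = 1/k, so the proof's estimate predicts an error of at most eta times the total mass 1.

```python
def f(x):
    """Continuous test function supported in [0, 1]."""
    return x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def discrete_measure(k):
    """Atoms (x_i, mass of A_i) for the cells A_i = [i/k, (i+1)/k[ of [0, 1]."""
    return [(i / k, 1.0 / k) for i in range(k)]


def integrate_discrete(g, atoms):
    """Integral of g with respect to the discrete measure sum of w * Dirac(x)."""
    return sum(w * g(x) for x, w in atoms)


k = 100
eta = 1.0 / k            # oscillation bound of f on each cell, because |f'| <= 1
exact = 1.0 / 6.0        # integral of x * (1 - x) over [0, 1]
atoms = discrete_measure(k)
approx = integrate_discrete(f, atoms)
error = abs(approx - exact)
```

For k = 100 the observed `error` is far below the guaranteed bound `eta`; the bound in the proof is a worst-case estimate that only uses the oscillation of the test functions.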

30.5 Corollary. The discrete p-measures on E are dense in $\mathcal{M}_+^1(E)$ in the vague topology.

Proof. We take over the notation of the preceding proof. Now $\mu_0$ is a measure in $\mathcal{M}_+^1(E)$, but the discrete measure $\delta = \sum_{i=1}^k \mu_0(A_i)\, \varepsilon_{x_i}$ may not be a p-measure, so more work is required. Set $\alpha_i := \mu_0(A_i)$, i = 1, ..., k. If K = E (in which case E had to be a compact space), then $\alpha_1 + \dots + \alpha_k = \mu_0(K) = 1$ and $\delta$ actually is a p-measure. In general what we have is

$\alpha_1 + \dots + \alpha_k = \mu_0(K) \le \mu_0(E) = 1$

and if $K \ne E$ we can choose another point, $x_{k+1} \in E \setminus K$, and set

$\alpha_{k+1} := 1 - (\alpha_1 + \dots + \alpha_k)$,

which is non-negative. Then

$\delta' := \sum_{i=1}^{k+1} \alpha_i \varepsilon_{x_i}$

is a discrete p-measure with $\int f_j\, d\delta = \int f_j\, d\delta'$ for each j = 1, ..., n, since $x_{k+1}$ lies outside the supports of all these functions. Consequently, $\delta \in V = V_{f_1, \dots, f_n; 1}(\mu_0)$ yields that also $\delta' \in V$. □


Next we will investigate whether the equality (30.1) and the continuity assertion (30.4) remain valid for classes of continuous functions more general than $C_c(E)$.

Recall in this connection that for a measure $\mu \in \mathcal{M}_+^b(E)$, every $f \in C_b(E)$ is $\mu$-integrable: it is $\mathcal{B}(E)$-measurable and its modulus is majorized by a real constant, hence $\mu$-integrable, function. We will formulate the relevant results for sequences only; their extensions to mappings $t \mapsto \mu_t$ are routine.

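30.6 Theorem. If a sequence $(\mu_n)_{n \in N}$ in $\mathcal{M}_+^b(E)$ is vaguely convergent to $\mu \in \mathcal{M}_+(E)$ and if the sequence $(\|\mu_n\|)_{n \in N}$ of total masses is bounded, then along with all the $\mu_n$ the measure $\mu$ is also finite, and for every $f \in C_0(E)$

$\lim_{n\to\infty} \int f\, d\mu_n = \int f\, d\mu$.

Proof. If we set $\alpha := \sup\{\|\mu_n\| : n \in N\}$, which is finite, then $\|\mu\| \le \alpha$, by (30.3), so $\mu$ is a finite measure. Definition 27.5 says that for each $\varepsilon > 0$ there is a $g = g_\varepsilon \in C_c(E)$ such that $\|f - g\| \le \varepsilon$. Therefore

$|\int f\, d\mu_n - \int g\, d\mu_n| \le \alpha\varepsilon$    for each $n \in N$

and

$|\int f\, d\mu - \int g\, d\mu| \le \alpha\varepsilon$,

so that via the triangle inequality

$|\int f\, d\mu_n - \int f\, d\mu| \le 2\alpha\varepsilon + |\int g\, d\mu_n - \int g\, d\mu|$    for all $n \in N$.

Since the hypothesis of vague convergence means that $\int g\, d\mu_n \to \int g\, d\mu$, we get

$\limsup_{n\to\infty} |\int f\, d\mu_n - \int f\, d\mu| \le 2\alpha\varepsilon$,

valid for every $\varepsilon > 0$. That is, the limit exists and is 0. □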
Remarks. 1. If one considers measures $\mu_n$ and $\mu \in \mathcal{M}_+^b(E)$ without the hypothesis $\sup \|\mu_n\| < +\infty$, the above conclusion can fail. The special case of Example 2 in which E := R, $x_n := n$ and $\alpha_n := n$ for all $n \in N$ illustrates this. For the function f defined by

$f(x) := \min\{1, |x|^{-1}\}$ for $x \ne 0$,    f(0) := 1

lies in $C_0(R)$. But $\int f\, d\mu_n = 1$ for every $n \in N$, while $\int f\, d\mu = 0$, because here the vague limit $\mu$ is the 0-measure.

2. Example 2, again with E := R and $x_n := n$ for all n, considered earlier, but this time with the constant sequence $\alpha_n := 1$, shows that indeed $\lim \int f\, d\varepsilon_{x_n} = \int f\, d\mu$ for the measure $\mu := 0$ and all $f \in C_0(R)$, but this equality is already false for the constant function $f := 1_E$ in $C_b(R)$.
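The arithmetic behind these two remarks can be replayed directly. In this small sketch (the helper names are hypothetical) the integral of g with respect to the measure "weight times Dirac measure at a point" is simply weight * g(point), which is all the code uses.

```python
def f(x):
    """f(x) = min(1, 1/|x|) for x != 0 and f(0) = 1; f vanishes at infinity."""
    return 1.0 if x == 0 else min(1.0, 1.0 / abs(x))


def integral_weighted_dirac(g, weight, point):
    """Integral of g with respect to weight * (Dirac measure at point)."""
    return weight * g(point)


ns = range(1, 200)

# Remark 1: alpha_n = n, x_n = n. The integrals stay at 1, the vague limit gives 0.
remark1 = [integral_weighted_dirac(f, float(n), float(n)) for n in ns]

# Remark 2: alpha_n = 1, x_n = n. Fine for f in C_0(R), false for the constant 1.
remark2_c0 = [integral_weighted_dirac(f, 1.0, float(n)) for n in ns]
remark2_const = [integral_weighted_dirac(lambda x: 1.0, 1.0, float(n)) for n in ns]
```

The list `remark1` is constant (up to rounding) although the total masses are unbounded, `remark2_c0` tends to 0, and `remark2_const` stays at 1 while the limit measure 0 integrates the constant function to 0.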

The passage from $C_0(E)$ to $C_b(E)$ therefore calls for a special investigation, which we stress by introducing a new definition:

30.7 Definition. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathcal{M}_+^b(E)$. The sequence $(\mu_n)_{n \in N}$ is said to be weakly convergent to $\mu$ if

(30.8)    $\lim_{n\to\infty} \int f\, d\mu_n = \int f\, d\mu$    for all $f \in C_b(E)$.

30.8 Theorem. Suppose the sequence $(\mu_n)_{n \in N}$ in $\mathcal{M}_+^b(E)$ converges vaguely to the measure $\mu \in \mathcal{M}_+^b(E)$. Then the following statements are equivalent:
(i) The sequence $(\mu_n)$ converges weakly to $\mu$.
(ii) $\lim_{n\to\infty} \|\mu_n\| = \|\mu\|$.
(iii) For every $\varepsilon > 0$ there exists a compact subset $K = K_\varepsilon$ of E such that

$\mu_n(E \setminus K) \le \varepsilon$    for all $n \in N$.

Proof. (i)⇒(ii) is obvious because $1 \in C_b(E)$.

(ii)⇒(iii): Let $\varepsilon > 0$ be given. The inner regularity and finiteness of $\mu$ yield that there is a compact subset L of E such that $\mu(E \setminus L) < \varepsilon$. According to 27.3, L has a compact neighborhood $K_0$, so there is an open set G with $L \subset G \subset K_0$. By (30.2)

$\liminf_{n\to\infty} \mu_n(G) \ge \mu(G) \ge \mu(L) > \|\mu\| - \varepsilon$,

so if we choose $\alpha \in\ ]\|\mu\| - \varepsilon, \mu(L)[$ there will be an $n_0 \in N$ such that $\mu_n(G) > \alpha$ for all $n \ge n_0$. Moreover, in view of (ii) this $n_0$ may be supposed large enough that $\|\mu_n\| < \alpha + \varepsilon$ for all $n \ge n_0$. Consequently, $\mu_n(K_0) \ge \mu_n(G) > \alpha > \|\mu_n\| - \varepsilon$, so that $\mu_n(E \setminus K_0) < \varepsilon$, for all $n \ge n_0$. For each $n \in \{1, \dots, n_0\}$ inner regularity and finiteness of $\mu_n$ give us a compact $K_n \subset E$ such that $\mu_n(E \setminus K_n) < \varepsilon$. The compact set $K := K_0 \cup K_1 \cup \dots \cup K_{n_0}$ then satisfies (iii).

(iii)⇒(i): Given $\varepsilon > 0$, let $K = K_\varepsilon$ be as described. Again from (30.2) we have

$\mu(E \setminus K) \le \liminf_{n\to\infty} \mu_n(E \setminus K) \le \varepsilon$.

There is a function $u \in C_c(E)$ with $0 \le u \le 1$ and u(K) = {1}. It satisfies $0 \le 1 - u \le 1_{\complement K}$ and so for each $f \in C_b(E)$

$|\int f\, d\mu_n - \int u f\, d\mu_n| \le \|f\| \int (1 - u)\, d\mu_n \le \|f\|\, \mu_n(\complement K) \le \|f\|\, \varepsilon$    for all $n \in N$

and by the same argument

$|\int (1 - u) f\, d\mu| \le \|f\|\, \varepsilon$.

As in the preceding proof, the triangle inequality then gives

$|\int f\, d\mu_n - \int f\, d\mu| \le 2\|f\|\, \varepsilon + |\int u f\, d\mu_n - \int u f\, d\mu|$    for all $n \in N$.

Since $uf \in C_c(E)$, the hypothesis of vague convergence insures that $(\int u f\, d\mu_n)$ converges to $\int u f\, d\mu$, so the preceding inequality yields

$\limsup_{n\to\infty} |\int f\, d\mu_n - \int f\, d\mu| \le 2\|f\|\, \varepsilon$,

valid for every $\varepsilon > 0$. That is, this limit exists and equals 0, for every $f \in C_b(E)$. Which proves (i). □

30.9 Corollary. A sequence $(\mu_n)_{n \in N}$ in $\mathcal{M}_+^1(E)$ is vaguely convergent to $\mu \in \mathcal{M}_+^1(E)$ if and only if it is weakly convergent to $\mu$.

Remark. 3. A sequence $(\mu_n)$ in $\mathcal{M}_+^b(E)$ which satisfies condition (iii) is called tight, whether or not any convergence is going on. If a tight sequence from $\mathcal{M}_+^1(E)$ vaguely converges to a measure $\mu \in \mathcal{M}_+(E)$, then first of all, $\|\mu\| \le 1$ by (30.3), so that $\mu \in \mathcal{M}_+^b(E)$. The preceding theorem then guarantees the weak convergence of $(\mu_n)$ to $\mu$ and therewith $\mu \in \mathcal{M}_+^1(E)$. In particular, with vaguely convergent tight sequences in $\mathcal{M}_+^b(E)$ no mass is lost (cf. the remark preliminary to Lemma 30.3). Consequences like these constitute the real significance of the tightness concept.

At this point it is worth returning once more to Theorem 30.2. If the measures $\mu$, $\mu_n$ there are all finite and of the same total mass, e.g., if they are all p-measures, then the two components of the compound condition (30.2) become equivalent. The result is the following portmanteau-theorem:

30.10 Theorem. Let $\mu, \mu_1, \mu_2, \dots$ be measures in $\mathcal{M}_+^1(E)$. Then the following three assertions are equivalent:
(i) The sequence $(\mu_n)_{n \in N}$ converges vaguely (and therefore also weakly) to $\mu$.
(ii) For every closed $F \subset E$

(30.9)    $\limsup_{n\to\infty} \mu_n(F) \le \mu(F)$.

(iii) For every open $G \subset E$

(30.9')    $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$.

Proof. The first paragraph of the proof of 30.2 actually established that (i)⇒(iii), under the less restrictive hypotheses prevailing there. Since that theorem further shows that the conjunction of (ii) and (iii) implies (i), it only remains to establish the equivalence of (ii) and (iii). That follows from the trivial observation that

$\nu(\complement A) = \nu(E) - \nu(A) = 1 - \nu(A)$

holds for all $A \in \mathcal{B}(E)$ and all $\nu \in \mathcal{M}_+^1(E)$. □
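That the inequalities in (30.9) and (30.9') can be strict is already visible for Dirac measures. The sketch below is an illustration under assumptions of its own (E = R, unit masses at 1/n converging to the unit mass at 0, hypothetical helper `dirac`); because both sequences of set masses are constant, their maximum and minimum over the computed range serve as stand-ins for lim sup and lim inf.

```python
def dirac(point):
    """Dirac measure at `point`, acting on sets given by indicator predicates."""
    return lambda indicator: 1.0 if indicator(point) else 0.0


mu = dirac(0.0)
mus = [dirac(1.0 / n) for n in range(1, 1001)]

closed_F = lambda x: x == 0.0          # the closed set F = {0}
open_G = lambda x: 0.0 < x < 2.0       # the open set G = ]0, 2[

limsup_F = max(m(closed_F) for m in mus)   # constant sequence: every term is 0
liminf_G = min(m(open_G) for m in mus)     # constant sequence: every term is 1
```

Here `limsup_F` is 0 while `mu(closed_F)` is 1, and `liminf_G` is 1 while `mu(open_G)` is 0, so both (30.9) and (30.9') hold with strict inequality. Both sets have the point 0 on their boundary, which carries all the mass of the limit.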

Example 1 in this section shows that the weak convergence of a sequence $(\mu_n)$ in $\mathcal{M}_+^b(E)$ to a $\mu \in \mathcal{M}_+^b(E)$ does not imply the convergence of $(\int f\, d\mu_n)$ to $\int f\, d\mu$ for every bounded Borel measurable function f. Nevertheless the continuity of the functions f which define weak convergence can be relaxed somewhat. To this end, we consider bounded, real-valued, Borel measurable functions f on E which are $\mu$-almost everywhere continuous for a $\mu \in \mathcal{M}_+^b(E)$: After excision of a $\mu$-nullset $N \in \mathcal{B}(E)$, f is continuous at each point of $E \setminus N$. Important examples of such are the indicator functions of boundaryless Borel sets. The latter are defined as follows:

30.11 Definition. A Borel subset Q of a locally compact space E is called boundaryless with respect to a measure $\mu \in \mathcal{M}_+^b(E)$, $\mu$-boundaryless (or $\mu$-quadrable) for short, if the boundary $Q^* := \overline{Q} \setminus \mathring{Q}$ of Q is a $\mu$-nullset:

(30.10)    $\mu(Q^*) = 0$.

Examples. 4. Every interval of the number line R is $\lambda^1$-boundaryless.

5. A set $Q \in \mathcal{B}(E)$ is boundaryless with respect to a Dirac measure $\varepsilon_a$ if and only if $a \in E \setminus Q^*$. Look back at Example 1 with this observation and the following theorem in mind.

30.12 Theorem. Suppose the sequence $(\mu_n)_{n \in N}$ in $\mathcal{M}_+^b(E)$ converges weakly to $\mu \in \mathcal{M}_+^b(E)$. Then

(30.11)    $\lim_{n\to\infty} \int f\, d\mu_n = \int f\, d\mu$

holds for every bounded Borel measurable function f that is $\mu$-almost everywhere continuous on E. In particular,

(30.12)    $\lim_{n\to\infty} \mu_n(Q) = \mu(Q)$

holds for every $\mu$-boundaryless set $Q \in \mathcal{B}(E)$.

Proof. By hypothesis there is a Borel set $E_0 \subset E$ with $\mu(E \setminus E_0) = 0$ such that f is continuous at each point of $E_0$. Let $\varepsilon > 0$ be given. Since $\mu$ is a Radon measure, there is a compact $K \subset E_0$ with

$\mu(E_0 \setminus K) < \varepsilon$.

Every $x \in K$ has an open neighborhood $U_x$ on which the oscillation of f is at most $\varepsilon$, meaning that

$|f(y_1) - f(y_2)| \le \varepsilon$    for all $y_1, y_2 \in U_x$.

Choose a compact neighborhood $V_x$ of x with $V_x \subset U_x$ and then use the compactness of K to find finitely many points $x_1, \dots, x_n \in K$ such that $V_{x_1}, \dots, V_{x_n}$ cover K. If we now set

$\alpha := \inf f(E)$,    $\beta := \sup f(E)$,
$\alpha_j := \inf f(U_{x_j})$,    $\beta_j := \sup f(U_{x_j})$

for j = 1, ..., n, then for each such j there exist functions $g_j, h_j \in C_b(E)$ satisfying

$g_j(x) = \alpha_j$ if $x \in V_{x_j}$, $g_j(x) = \alpha$ if $x \in \complement U_{x_j}$, and $h_j(x) = \beta_j$ if $x \in V_{x_j}$, $h_j(x) = \beta$ if $x \in \complement U_{x_j}$,

as well as

$\alpha \le g_j \le \alpha_j \le \beta_j \le h_j \le \beta$.

This follows at once from 27.3 and the application of an appropriate affine transformation in the range space R. From these properties and definitions it follows in particular that $g_j \le f \le h_j$ for all j. Therefore if we set

$g := g_1 \vee \dots \vee g_n$ and $h := h_1 \wedge \dots \wedge h_n$,

then both these functions lie in $C_b(E)$ and they satisfy $\alpha \le g \le f \le h \le \beta$. Moreover,

$0 \le h(x) - g(x) \le \varepsilon$    for all $x \in K$.

For each $x \in K$ lies in some $V_{x_j} \subset U_{x_j}$, and because of the way $U_{x_j}$ was chosen with respect to the oscillation of f, it follows that $h(x) - g(x) \le h_j(x) - g_j(x) = \beta_j - \alpha_j \le \varepsilon$. We are now in a position to finish the proof, as follows:

$\int (h - g)\, d\mu = \int_K (h - g)\, d\mu + \int_{E \setminus K} (h - g)\, d\mu \le \varepsilon\, \mu(K) + (\beta - \alpha)\, \mu(E \setminus K) \le \varepsilon\, (\mu(E) + \beta - \alpha)$;

and, because $g \le f \le h$ and $g, h \in C_b(E)$, the weak convergence hypothesis gives

$\int g\, d\mu = \lim_{n\to\infty} \int g\, d\mu_n \le \liminf_{n\to\infty} \int f\, d\mu_n \le \limsup_{n\to\infty} \int f\, d\mu_n \le \lim_{n\to\infty} \int h\, d\mu_n = \int h\, d\mu$.

Of course we also have $\int g\, d\mu \le \int f\, d\mu \le \int h\, d\mu$. Putting all this together shows that any pair of the numbers $\int f\, d\mu$, $\liminf \int f\, d\mu_n$ and $\limsup \int f\, d\mu_n$ differ by at most $\varepsilon\, (\mu(E) + \beta - \alpha)$. Since $\varepsilon > 0$ is arbitrary, (30.11) holds. □

Let us now look at an application of this theorem which relates the vague
convergence of p-measures on the number line to their Theorem 6.6 description
in terms of distribution functions. This is the way that weak (and hence vague)
convergence made its original historical appearance.
30.13 Theorem. Let , Al, A2.... be measures in 4+1(R), that is, probability measures on .41, and F, F1, F2 ... their distribution functions. If the sequence (n)nEN

200

IV. Measures on Topological Spaces

converges weakly to p., then


(30.13)

limo F,,(x) = F(x)

n +0

holds for every x E R at which F is continuous. If F is continuous throughout R,


then this convergence is uniform on R.
Proof. According to Theorem 30.12, $\lim \mu_n(Q) = \mu(Q)$ for every $\mu$-boundaryless
set $Q \in \mathcal{B}^1$ and thus, after (6.11), $\lim F_n(x) = F(x)$ for every $x \in \mathbb{R}$ such that the
interval $Q_x := \,]-\infty, x[$ is $\mu$-boundaryless. We have

$$]-\infty, x] = \overline{Q}_x = \bigcap_{k\in\mathbb{N}} Q_{x+1/k}$$

and therefore

$$\mu(\overline{Q}_x) = \lim_{k\to\infty} \mu(Q_{x+1/k}) = \lim_{k\to\infty} F(x + 1/k).$$

Consequently, $Q_x$ is $\mu$-boundaryless just if the (isotone) function $F$ is right-continuous at $x$, that is (since distribution functions are everywhere left-continuous),
just if $x$ is a point of continuity of $F$. This proves the first assertion.
Let us now hypothesize that $F$ is continuous on the whole line, and let $\varepsilon > 0$
be given. First of all, (6.13) supplies numbers $a < b$ such that $F(a) < \varepsilon$ and
$1 - F(b) < \varepsilon$. The uniform continuity of $F$ on the compact interval $[a,b]$ ensures
that points $a = x_0 < x_1 < \dots < x_k = b$ exist such that

$$F(x_j) - F(x_{j-1}) < \varepsilon \quad\text{for } j = 1, \dots, k.$$

From what has already been proven we know that there exists $n_\varepsilon \in \mathbb{N}$ such that

$$|F_n(x_j) - F(x_j)| < \varepsilon \quad\text{for each } j \in \{0, \dots, k\} \text{ and all } n \ge n_\varepsilon.$$

But then, as we will show, the inequality $|F_n(x) - F(x)| < 2\varepsilon$ prevails for every
$x \in \mathbb{R}$ and all $n \ge n_\varepsilon$, which proves the uniform convergence of $(F_n)$ to $F$. For if
$x < x_0$, then

$$0 \le F(x) \le F(x_0) < \varepsilon \quad\text{and}\quad 0 \le F_n(x) \le F_n(x_0) < F(x_0) + \varepsilon < 2\varepsilon,$$

that is, $|F_n(x) - F(x)| < 2\varepsilon$. And a similar argument works if $x \ge x_k$. The remaining $x$ fall into $[x_{j-1}, x_j[$ for an appropriate $j \in \{1, \dots, k\}$, so

$$F(x_{j-1}) \le F(x) \le F(x_j) < F(x_{j-1}) + \varepsilon$$

and

$$F(x_{j-1}) - \varepsilon < F_n(x_{j-1}) \le F_n(x) \le F_n(x_j) < F(x_j) + \varepsilon < F(x_{j-1}) + 2\varepsilon,$$

confirming that in this case too $|F_n(x) - F(x)| < 2\varepsilon$.
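The uniform convergence asserted in Theorem 30.13 can be observed numerically. The following is only a sketch under stated assumptions: the measures $\mu_n$ (uniform distribution on the points $1/n, \dots, n/n$) and the limit $\mu$ (Lebesgue measure restricted to $[0,1]$) are an illustrative choice not taken from the text, and the names `F_n`, `F` and `sup_distance` are hypothetical. Distribution functions are taken left-continuous, $F(x) = \mu(]-\infty, x[)$, as in the proof above.

```python
def F_n(x, n):
    """Left-continuous distribution function of the uniform p-measure on {1/n, ..., n/n}."""
    # F_n(x) = mu_n(]-inf, x[) = (number of k in 1..n with k/n < x) / n
    return sum(1 for k in range(1, n + 1) if k / n < x) / n


def F(x):
    """Distribution function of Lebesgue measure restricted to [0, 1] (continuous)."""
    return min(1.0, max(0.0, x))


def sup_distance(n, grid_size=2001):
    """Largest value of |F_n - F| over an equally spaced grid in [-0.5, 1.5]."""
    xs = [-0.5 + 2.0 * i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(F_n(x, n) - F(x)) for x in xs)


# The grid supremum is bounded by 1/n, so it shrinks as n grows.
for n in (5, 50, 200):
    print(n, sup_distance(n))
```

For this particular choice one checks by hand that $0 \le F(x) - F_n(x) \le 1/n$ for all $x$, which is the kind of uniform bound the proof produces in general.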
Remarks. 4. At a point x E R of discontinuity of F limit relation (30.13) generally
fails, as the example Ee :=
n E N, confirms.
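A failure of (30.13) at a discontinuity point can be made concrete as follows. This is a sketch with an assumed example, which need not be the one intended in Remark 4: take $\mu_n := \varepsilon_{-1/n}$ and $\mu := \varepsilon_0$, with the left-continuous convention $F(x) = \mu(]-\infty, x[)$. The helper name `dirac_cdf` is hypothetical.

```python
def dirac_cdf(point):
    """Left-continuous distribution function x -> eps_point(]-inf, x[)."""
    return lambda x: 1.0 if point < x else 0.0


F = dirac_cdf(0.0)  # limit measure eps_0; F jumps at x = 0

# At the discontinuity x = 0 every F_n equals 1 while F(0) = 0.
values_at_zero = [dirac_cdf(-1.0 / n)(0.0) for n in range(1, 6)]

# At a continuity point such as x = 0.5 the values agree with F(0.5) = 1.
values_at_half = [dirac_cdf(-1.0 / n)(0.5) for n in range(1, 6)]

print(values_at_zero, F(0.0))
print(values_at_half, F(0.5))
```

The sequence $(\varepsilon_{-1/n})$ converges weakly to $\varepsilon_0$ because $f(-1/n) \to f(0)$ for every continuous $f$, so the hypothesis of Theorem 30.13 is met while the limit relation breaks down exactly at the jump of $F$.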


5. Condition (30.12) for every $\mu$-boundaryless set $Q \in \mathcal{B}(E)$ is also sufficient
for the weak convergence of the sequence $(\mu_n)$ to $\mu$ (cf. Exercise 6 below). The
same is true of condition (30.13) (cf. Exercise 7).
The concept of weak convergence (with the same definition) is also meaningful
if $E$ is a Polish space (or even just a metric space) if the measures involved in
Definition 30.7 are all finite Borel measures on $E$. Only the uniqueness of limits
calls for discussion:

30.14 Lemma. Finite Borel measures $\mu$ and $\nu$ on a metric space $E$ are equal if
$\int f\,d\mu = \int f\,d\nu$ for all $f \in C_b(E)$.
Proof. Let $d$ be a metric giving the topology of $E$ and consider closed subsets
$F \subset E$. Suppose we can always find a sequence $(f_n)$ in $C_b(E)$ with $f_n \downarrow 1_F$. Then
it would follow from the hypothesis and from Lebesgue's dominated convergence
theorem that $\mu(F) = \nu(F)$. The system of closed subsets $F$ of $E$ is an $\cap$-stable
generator of the Borel $\sigma$-algebra $\mathcal{B}(E)$ and it contains the whole space $E$. The
equality $\mu = \nu$ would thus follow from the uniqueness theorem 5.4.

It remains therefore to prove the existence of such sequences $(f_n)$ and we
can suppose $F \ne \emptyset$. For this purpose we use the (uniformly) continuous antitone function $h : \mathbb{R} \to \mathbb{R}$ which is constantly 1 on $]-\infty, 0]$, constantly 0
on $[1, +\infty[$ and defined by $h(t) := 1 - t$ on $[0, 1]$, together with the function
$x \mapsto d(x, F) := \inf\{d(x, y) : y \in F\}$. The latter is a (uniformly) continuous function on $E$, as we showed in the proof of Example 4, §26. Moreover, its zero-set
is exactly $F$, because $F$ is closed. Evidently then the sequence of (uniformly)
continuous functions

$$f_n(x) := h(n\,d(x, F)), \quad x \in E,\ n \in \mathbb{N},$$

does what is wanted. $\Box$
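The approximating functions in the proof of Lemma 30.14 are easy to write down explicitly. The following sketch assumes, purely for illustration, $E = \mathbb{R}$ with its usual metric and the closed set $F = [0,1]$; the function names are hypothetical.

```python
def h(t):
    """The antitone cut-off: 1 on ]-inf, 0], 0 on [1, +inf[, and 1 - t in between."""
    if t <= 0.0:
        return 1.0
    if t >= 1.0:
        return 0.0
    return 1.0 - t


def dist_to_F(x, a=0.0, b=1.0):
    """Distance d(x, F) from x to the closed interval F = [a, b]."""
    if x < a:
        return a - x
    if x > b:
        return x - b
    return 0.0


def f(n, x):
    """f_n(x) := h(n * d(x, F)); these functions decrease to the indicator of F."""
    return h(n * dist_to_F(x))


# On F every f_n equals 1; off F the values decrease to 0 as n grows.
print([f(n, 0.5) for n in (1, 2, 4, 8)])
print([f(n, 1.25) for n in (1, 2, 4, 8)])
```

At the point $x = 1.25$, which has distance $0.25$ from $F$, the values are $0.75, 0.5, 0, 0$, illustrating $f_n \downarrow 1_F$.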


Remarks. 6. The concept of $\mu$-boundaryless sets is also meaningful for finite Borel
measures $\mu$ on Polish spaces. One easily convinces oneself that Theorem 30.12
remains valid in this new situation. In the proof one merely has to secure the
existence of the needed functions $g_j$ and $h_j$ somewhat differently: To this end one
engages Urysohn's lemma (WILLARD [1970], p. 102 or KELLEY [1955], p. 115).
7. Weak convergence in the set of finite Radon measures on a Polish or a locally
compact space $E$ derives from a topology in the same way that vague convergence
does. It is called, naturally, the weak topology and it is defined by letting $C_b(E)$
take over the role of $C_c(E)$ in (30.4).
Weak convergence in (non-locally compact) Polish spaces plays only a marginal
role in this book, but is thoroughly investigated in BILLINGSLEY [1968] and PARTHASARATHY [1967].


Exercises.
1. Let E be a locally compact space, (n)fEN a sequence in ..Wb(E) which is
vaguely convergent to E . +(E). If 11I.11 !5 1 for every n E N, then R
o.D
exists and equals 1.
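2. Let $(a_n)_{n\in\mathbb{N}}$ be a convergent sequence of real numbers, with $\lim_{n\to\infty} a_n = a \in \mathbb{R}$.
Further, let $(\alpha_n)_{n\in\mathbb{N}}$ be a sequence of non-negative real numbers such that $\alpha_1 > 0$
and the series $\sum \alpha_n$ is divergent. Then

$$\lim_{n\to\infty} \frac{\alpha_1 a_1 + \dots + \alpha_n a_n}{\alpha_1 + \dots + \alpha_n} = a,$$

the case in which all $\alpha_n = 1$ being the best known instance. Here is an outline for
a measure-theoretic proof: The equations

$$\mu_n := \frac{\alpha_1\varepsilon_1 + \dots + \alpha_n\varepsilon_n}{\alpha_1 + \dots + \alpha_n}, \quad n \in \mathbb{N},$$

define a sequence of measures in $\mathcal{M}_+^1(\mathbb{N})$ which vaguely converges to 0. Therefore
according to 30.6, $\lim \int f\,d\mu_n = 0$ holds for every $f \in C_0(\mathbb{N})$. The relevant $f$ is
the one defined by $f(n) := a_n - a$.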
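The statement of Exercise 2 can be checked numerically before attempting the proof. The sketch below uses assumed sample data, $a_k = 2 + 1/k$ and weights $\alpha_k = k$ (a divergent series with $\alpha_1 > 0$); the function names are hypothetical. It also prints the mass that $\mu_n$ gives to a fixed point, which is what vague convergence to 0 on the discrete space $\mathbb{N}$ amounts to.

```python
def weighted_mean(a, alpha, n):
    """(alpha_1 a_1 + ... + alpha_n a_n) / (alpha_1 + ... + alpha_n)."""
    numerator = sum(alpha(k) * a(k) for k in range(1, n + 1))
    denominator = sum(alpha(k) for k in range(1, n + 1))
    return numerator / denominator


def mass_at(j, alpha, n):
    """Mass mu_n({j}) of the normalized discrete measure mu_n, for j <= n."""
    return alpha(j) / sum(alpha(k) for k in range(1, n + 1))


def a(k):
    return 2.0 + 1.0 / k  # converges to 2


def alpha(k):
    return float(k)  # non-negative, alpha_1 > 0, divergent series


for n in (10, 100, 1000):
    print(n, weighted_mean(a, alpha, n), mass_at(3, alpha, n))
```

For these data the weighted mean equals $2 + 2/(n+1)$ exactly, so the convergence to $a = 2$ is visible already for moderate $n$.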
3. Let $E$ be a locally compact space and $T$ a subset of $C_c(E)$ with the following
properties: Each compact $K \subset E$ has a relatively compact neighborhood $U$ such
that every $f \in C_c(E)$ with $\mathrm{supp}(f) \subset K$ is uniformly approximable on $E$ by
functions $t \in T$ whose supports lie in $U$; and further, there exists a $t \in T$ with
$0 \le t \le 1$ and $t(K) = \{1\}$. Show that:
(a) A sequence $(\mu_n)$ in $\mathcal{M}_+(E)$ is vaguely convergent if and only if the sequence
$(\int t\,d\mu_n)$ is convergent in $\mathbb{R}$ for every $t \in T$.
(b) For $E := \mathbb{R}$, the set of all continuously differentiable real-valued functions with
compact support is a $T$ with the above properties.

4. With the help of Exercise 3 show that for the functions $f_n(x) := 1 - \sin(nx)$
on $\mathbb{R}$, the sequence $(f_n\lambda^1)_{n\in\mathbb{N}}$ converges vaguely to $\lambda^1$, and deduce from this the
Riemann-Lebesgue lemma:

$$\lim_{n\to\infty} \int f(x)\sin(nx)\,dx = 0$$

for every f E
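The decay of the integrals in Exercise 4 is easy to observe for a concrete test function. The sketch below assumes, for illustration only, the tent function $f(x) = \max(0, 1 - |x - 1|)$ with support $[0, 2]$ and approximates the integral by a midpoint rule; the names `tent` and `oscillatory_integral` are hypothetical.

```python
import math


def tent(x):
    """Continuous test function with compact support [0, 2] and integral 1."""
    return max(0.0, 1.0 - abs(x - 1.0))


def oscillatory_integral(n, subintervals=20000):
    """Midpoint-rule approximation of the integral of tent(x) * sin(n x) over [0, 2]."""
    width = 2.0 / subintervals
    total = 0.0
    for i in range(subintervals):
        x = (i + 0.5) * width
        total += tent(x) * math.sin(n * x)
    return total * width


for n in (1, 10, 50):
    value = oscillatory_integral(n)
    # The integral of tent against 1 - sin(n x) is 1 minus the value above.
    print(n, value, 1.0 - value)
```

For this $f$ the integral can be evaluated in closed form and is bounded in modulus by $4/n^2$, so the integrals of $f$ against $f_n\lambda^1$ approach $\int f\,d\lambda^1 = 1$.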

5. Let $\mu$ be a finite Radon measure on a locally compact space $E$. Prove that:
(a) The system $\mathcal{R}_\mu$ of all $\mu$-boundaryless sets is an algebra in $E$.
(b) For every $f \in C_b(E)$ there is a countable set $A_f \subset \mathbb{R}$ such that $\{f > \alpha\} \in \mathcal{R}_\mu$
for every $\alpha \in \mathbb{R} \setminus A_f$. [Hint: For every finite set $\{\alpha_1, \dots, \alpha_n\}$ of real numbers

$$\sum_{j=1}^{n} \mu(\{f = \alpha_j\}) \le \mu(E) < +\infty.]$$

6. $\mu, \mu_1, \mu_2, \dots$ are finite Radon measures on the locally compact space $E$. Show
that condition (30.12) is also sufficient for weak convergence; that is, from
$\lim \mu_n(Q) = \mu(Q)$ for every $\mu$-boundaryless set $Q \subset E$ follows the weak convergence of
$(\mu_n)$ to $\mu$. This is also true if $E$ is a Polish space. [Hints: Imitate the proof
of Theorem 11.6 and show with the help of Exercise 5 that every $0 \le f \in C_b(E)$
is the uniform limit on $E$ of an isotone sequence $(u_n)$ in the vector space spanned
by the indicator functions of the sets in $\mathcal{R}_\mu$.]
7. As an application of Exercise 6 show that in the context of Theorem 30.13
condition (30.13) there is also sufficient for the weak convergence of $(\mu_n)$ to $\mu$.
8. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers in $]0, 1[$. From $[0,1]$ delete the open
interval $I_{11}$ centered at 1/2 having length $a_1$. There remain two disjoint closed
intervals $J_{11}, J_{12}$. From $J_{1j}$ delete the open interval $I_{2j}$ of length $a_2\lambda^1(J_{1j})$ whose
midpoint is that of $J_{1j}$ ($j = 1,2$). Then there remain four pairwise disjoint closed
intervals $J_{21}, J_{22}, J_{23}, J_{24}$. From $J_{2j}$ delete the open interval $I_{3j}$ of length $a_3\lambda^1(J_{2j})$
whose midpoint is that of $J_{2j}$ ($j = 1,2,3,4$). Then there remain $8 = 2^3$ pairwise
disjoint closed intervals $J_{3j}$, $j = 1, \dots, 8$. Continuing in this way one gets for each
$n \in \mathbb{N}$ pairwise disjoint closed intervals $J_{nj}$, $j = 1, \dots, 2^n$. The set

$$C := \bigcap_{n\in\mathbb{N}} (J_{n1} \cup \dots \cup J_{n2^n})$$

is called a generalized Cantor discontinuum, and if all $a_n = 1/3$ it is simply called
the Cantor discontinuum. Prove that:
(a) $C$ is compact and non-void, but $C$ has void interior.
(b) $\lambda^1(C) = \lim_{n\to\infty} \prod_{j=1}^{n} (1 - a_j)$.
(c) $\lambda^1(C) = 0 \iff \sum_{n=1}^{\infty} a_n = +\infty$.
[Hint: Recall the inequalities $1 + \alpha \le (1 - \alpha)^{-1}$ and $1 - \alpha \le e^{-\alpha}$ for $0 \le \alpha < 1$.]
(d) In case $\sum_{n=1}^{\infty} a_n < +\infty$, $U := \,]0,1[\, \setminus C$ is an open subset of $\mathbb{R}$ whose boundary
$U^* := \overline{U} \setminus U$ is not a $\lambda^1$-nullset.
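The construction in Exercise 8 can be carried out step by step on a computer, which makes parts (b) and (c) plausible before they are proved. The sketch below is an illustration only; the function names are hypothetical, and the two sample sequences ($a_n = 1/3$ and $a_n = 1/(n+1)^2$) are assumed choices.

```python
def cantor_intervals(a, n):
    """Closed intervals J_n1, ..., J_n2^n remaining after n deletion steps."""
    current = [(0.0, 1.0)]
    for step in range(1, n + 1):
        remaining = []
        for lo, hi in current:
            removed = a(step) * (hi - lo)  # length of the deleted middle interval
            mid = (lo + hi) / 2.0
            remaining.append((lo, mid - removed / 2.0))
            remaining.append((mid + removed / 2.0, hi))
        current = remaining
    return current


def remaining_length(a, n):
    """Product (1 - a_1) ... (1 - a_n): total length left after n steps."""
    total = 1.0
    for j in range(1, n + 1):
        total *= 1.0 - a(j)
    return total


def classic(j):
    return 1.0 / 3.0  # divergent series: the length tends to 0


def summable(j):
    return 1.0 / (j + 1) ** 2  # convergent series: positive length in the limit


print(remaining_length(classic, 20))
print(remaining_length(summable, 1000))
```

For $a_n = 1/(n+1)^2$ the product telescopes to $(n+2)/(2(n+1))$, so the limiting set has $\lambda^1$-measure $1/2$ although it has void interior.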


9. Construct an open subset of $]0,1[ \times ]0,1[$ whose boundary has positive
$\lambda^2$-measure.

10. Let $E$ be a metric space, with metric $d$, and let $\mu, \mu_1, \mu_2, \dots$ be p-measures
on $\mathcal{B}(E)$. Show that each of the following is necessary and sufficient for the weak
convergence of the sequence $(\mu_n)$ to $\mu$:
(a) $\lim \int f\,d\mu_n = \int f\,d\mu$ for all bounded functions $f$ which are uniformly continuous on $E$.
(b) $\limsup \mu_n(F) \le \mu(F)$ for all closed $F \subset E$.
(c) $\liminf \mu_n(G) \ge \mu(G)$ for all open $G \subset E$.
[Hints for (a)$\Rightarrow$(b): Re-examine the proof of 30.14. There it was shown how,
for a closed non-empty $F \subset E$, to construct uniformly continuous functions $f_n$
satisfying $f_n \downarrow 1_F$.]


31. Vague compactness and metrizability questions


We again consider a locally compact space $E$ along with its space $\mathcal{M}_+ = \mathcal{M}_+(E)$
of Radon measures, equipped with the vague topology. Our interest here is in the
subsets of $\mathcal{M}_+$ which are compact or relatively compact in this topology. They are
naturally called vaguely compact, resp., vaguely relatively compact.
A necessary condition for the vague relative compactness of a set $H \subset \mathcal{M}_+$
can be inferred at once from the very definition of the vague topology. According
to it, for each $f \in C_c(E)$ the real function $\mu \mapsto \int f\,d\mu$ is continuous on $\mathcal{M}_+$.
Therefore the image of any relatively compact $H$ under each such mapping must
be a relatively compact subset of $\mathbb{R}$, that is, a bounded set. This observation leads
to the following definition:
31.1 Definition. A set $H \subset \mathcal{M}_+(E)$ is called vaguely bounded (sometimes simply
bounded) if

$$\sup_{\mu\in H} \Bigl|\int f\,d\mu\Bigr| < +\infty \quad\text{for every } f \in C_c(E). \tag{31.1}$$

Thus vague boundedness of a set $H \subset \mathcal{M}_+$ is a necessary condition for its vague
relative compactness. We want to show that it is also sufficient:

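31.2 Theorem. A set $H \subset \mathcal{M}_+(E)$ is vaguely relatively compact if and only if it
is vaguely bounded.

Proof. In view of the preceding, all that has to be shown is that vague relative
compactness follows from the vague boundedness of $H$. To this end, let $\alpha_f$ denote the real number in (31.1), for each $f \in C_c(E)$, and $J_f$ the compact interval
$[-\alpha_f, \alpha_f]$ in $\mathbb{R}$. Also denote the (vague) closure of $H$ in $\mathcal{M}_+$ by $\overline{H}$. First observe
that

$$\int f\,d\mu \in J_f$$

for all $f \in C_c(E)$ and all $\mu \in \overline{H}$. In fact, if $f \in C_c(E)$ and $\varepsilon > 0$ are given,

$$V_{f;\varepsilon}(\mu) := \Bigl\{\nu \in \mathcal{M}_+ : \Bigl|\int f\,d\nu - \int f\,d\mu\Bigr| < \varepsilon\Bigr\}$$

is a vague neighborhood of $\mu$, so if $\mu \in \overline{H}$ then $H \cap V_{f;\varepsilon}(\mu) \ne \emptyset$. For any $\nu$ in this
intersection, $\int f\,d\nu \in J_f$ and therefore

$$\Bigl|\int f\,d\mu\Bigr| \le \Bigl|\int f\,d\nu\Bigr| + \Bigl|\int f\,d\mu - \int f\,d\nu\Bigr| < \alpha_f + \varepsilon.$$

As the extreme inequality holds for every $\varepsilon > 0$, we see that $|\int f\,d\mu| \le \alpha_f$, that
is, $\int f\,d\mu \in J_f$.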


Now consider the product space

$$P := \mathbb{R}^{C_c} = \prod_{f\in C_c} \mathbb{R}_f$$

in which for each $f \in C_c = C_c(E)$ a copy $\mathbb{R}_f := \mathbb{R}$ of the number line appears as
a factor. The product

$$J := \prod_{f\in C_c} J_f$$

is a subspace of $P$ which, as a product of compact spaces, is compact, by the
famous Tychonoff theorem (KELLEY [1955], p. 139 or WRIGHT [1994]). To each
Radon measure $\mu \in \mathcal{M}_+$ we assign the mapping $f \mapsto \int f\,d\mu$ of $C_c(E)$ into $\mathbb{R}$. This
is a point in $P$. In this way a mapping

$$\Phi : \mathcal{M}_+ \to P$$

is defined which is injective by the Riesz representation theorem. On the basis of
what was shown in the opening campaign

$$\Phi(\overline{H}) \subset J.$$

Our goal will be realized if we can show that
(a) $\Phi$ maps $\mathcal{M}_+$ homeomorphically onto $\Phi(\mathcal{M}_+)$
and
(b) $\Phi(\mathcal{M}_+)$ is closed in $P$.
For then $\Phi(\overline{H})$, as a closed subset of $\Phi(\mathcal{M}_+)$, is also closed in $P$. From $\Phi(\overline{H})$ lying
in the compact set $J$ it therefore follows that $\Phi(\overline{H})$ is compact, hence too its
homeomorphic image $\overline{H}$.
As to (a): Continuity of a mapping $\Phi$ into a product means continuity of every
"component" of $\Phi$, that is, of each mapping $\mu \mapsto \int f\,d\mu$ ($f \in C_c(E)$). But this is
true right from the definition of the vague topology. Continuity of the mapping $\Psi$
inverse to $\Phi$ means continuity of each mapping

$$\Phi(\mu) \mapsto \int f\,d(\Psi(\Phi(\mu))) = \int f\,d\mu$$

of $\Phi(\mathcal{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathcal{M}_+)$
of the projection of $P = \mathbb{R}^{C_c}$ onto its coordinate specified by $f$.

As to (b): Let $I \in P$ be a point in the closure of $\Phi(\mathcal{M}_+)$ in $P$. Then $I$ is
a positive linear form on $C_c(E)$. To see its additivity, for example, let $f, g \in C_c(E)$
and $\varepsilon > 0$ be given. The set of all $I' \in P$ which satisfy

$$|I'(u) - I(u)| < \varepsilon \quad\text{for } u \in \{f, g, f+g\}$$

is a neighborhood of $I$ in $P$, and therefore contains a point $I' = \Phi(\mu)$ from
$\Phi(\mathcal{M}_+)$. $I'$ is thus the positive linear form

$$u \mapsto I'(u) = \int u\,d\mu$$

on $C_c(E)$. That means that we have

$$|I(f+g) - I(f) - I(g)| \le |I(f+g) - I'(f+g)| + |I'(f+g) - I(f) - I(g)|$$
$$= |I(f+g) - I'(f+g)| + |I'(f) - I(f) + I'(g) - I(g)|$$
$$\le \varepsilon + |I'(f) - I(f)| + |I'(g) - I(g)| < 3\varepsilon,$$

and because $\varepsilon > 0$ is arbitrary, the extreme inequality means that its left-hand
side must be 0. In a completely analogous way one proves that $I(\alpha f) = \alpha I(f)$ for
every $\alpha \in \mathbb{R}$, $f \in C_c(E)$, and $I(g) \ge 0$ for every non-negative $g \in C_c(E)$. With
the linearity of $I$ confirmed, the Riesz representation theorem supplies a Radon
measure $\nu \in \mathcal{M}_+$ such that $\Phi(\nu) = I$. That is, $I$ lies in $\Phi(\mathcal{M}_+)$, confirming that
the latter is closed in $P$. $\Box$
31.3 Corollary. For every real number $\alpha \ge 0$ the set

$$\mathcal{M}_\alpha := \{\mu \in \mathcal{M}_+(E) : \|\mu\| \le \alpha\}$$

is vaguely compact.

Proof. For every $f \in C_c(E)$ and $\mu \in \mathcal{M}_\alpha$, $|\int f\,d\mu| \le \int |f|\,d\mu \le \alpha\|f\|$. Consequently, $\mathcal{M}_\alpha$ is vaguely bounded, hence vaguely relatively compact. What therefore
remains to be confirmed is the closedness of $\mathcal{M}_\alpha$ in $\mathcal{M}_+$. According to (28.13)
$\mathcal{M}_\alpha$ is just the set of all $\mu \in \mathcal{M}_+$ such that $\int u\,d\mu \le \alpha$ holds for all $[0,1]$-valued
$u \in C_c(E)$. Because the mapping $\mu \mapsto \int u\,d\mu$ of $\mathcal{M}_+$ into $\mathbb{R}$ is continuous, the
set $\{\mu \in \mathcal{M}_+ : \int u\,d\mu \le \alpha\}$ is closed, for each $u \in C_c(E)$, and by the preceding
observation $\mathcal{M}_\alpha$ is an intersection of such sets, those for which $u(E) \subset [0, 1]$. Thus
$\mathcal{M}_\alpha$ is indeed (vaguely) closed. $\Box$
Remark. 1. The set of all measures $\mu \in \mathcal{M}_+(E)$ with $\|\mu\|$ equal to a fixed positive
number $\alpha$ is vaguely closed if $E$ is compact (because in that case $1_E \in C_c(E)$).
Example 2 of §30, with all the $\alpha_n$ there equal to $\alpha$, illustrates this.
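Why compactness of $E$ matters in Remark 1 can be seen from the standard escape-of-mass phenomenon. The sketch below assumes $E = \mathbb{R}$ and the Dirac measures $\varepsilon_n$, each of total mass 1; whether this coincides with Example 2 of §30 is an assumption, and the helper names are hypothetical.

```python
def tent(center, radius):
    """A function in C_c(R): peak 1 at `center`, support [center - radius, center + radius]."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / radius)


def integrate_dirac(f, point):
    """Integral of f with respect to the Dirac measure at `point`."""
    return f(point)


f = tent(0.0, 5.0)  # compact support [-5, 5]

# Each eps_n has norm 1, yet the integrals of a fixed f in C_c(R) vanish for large n.
integrals = [integrate_dirac(f, float(n)) for n in range(1, 11)]
print(integrals)
```

Since every $f \in C_c(\mathbb{R})$ vanishes outside some bounded interval, the sequence $(\varepsilon_n)$ converges vaguely to the zero measure. The limit stays in the set $\{\|\mu\| \le 1\}$, in agreement with Corollary 31.3, but leaves the set $\{\|\mu\| = 1\}$, which is therefore not vaguely closed on this non-compact $E$.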

For a variety of applications it is important to know when, in terms of $E$,
the vague topology of $\mathcal{M}_+(E)$ is metrizable. One reason is that sequences suffice
for dealing with metric topologies, but generally not for non-metric ones. The
following remark will prove useful in answering this question.
Remark. 2. For every locally compact space $E$ the, obviously injective, mapping

$$\varphi : E \to \mathcal{M}_+(E) \tag{31.2}$$

defined by $\varphi(x) := \varepsilon_x$ is a homeomorphism of $E$ with $\varphi(E) = \{\varepsilon_x : x \in E\}$. For
every point $x \in E$ the (open) sets

$$W_{f_1,\dots,f_n;\eta}(x) = \{y \in E : |f_j(x) - f_j(y)| < \eta,\ j = 1, \dots, n\}$$

form a neighborhood basis at $x$ as the $f_j$ run through all finite subsets of $C_c(E)$
and $\eta$ through all positive real numbers. In fact, if $U$ is a neighborhood of some
$x \in E$, 27.3 furnishes a $u \in C_c(E)$ with $0 \le u \le 1$, $u(x) = 1$ and $\mathrm{supp}(u) \subset U$,
which implies that $W_{u;1}(x) \subset U$. Using the notation (30.5) it is obvious that

$$\varphi(W_{f_1,\dots,f_n;\eta}(x)) = \varphi(E) \cap V_{f_1,\dots,f_n;\eta}(\varepsilon_x)$$

for all relevant functions, $\eta \in \mathbb{R}_+$ and $x \in E$. Together with the injectivity this
clearly shows that $\varphi$ is a homeomorphism.

As a result of the foregoing, the metrizability of the locally compact space $E$ is
clearly a necessary condition for the metrizability of the vague topology on $\mathcal{M}_+(E)$.
For the former the existence of a countable basis in $E$ is sufficient, as was noted
in Remark 5 of §29. It is useful to formulate this in terms of $C_c(E)$:
31.4 Lemma. For any locally compact space $E$ the following assertions are equivalent:

(a) $E$ has a countable basis.
(b) There is a countable subset of $C_c(E)$ which is dense with respect to uniform
convergence.

Proof. (a)$\Rightarrow$(b): Let $\mathcal{G}$ be a countable base for (the topology of) $E$, $\mathcal{I}$ the set of
all open intervals in $\mathbb{R}$ with rational endpoints. For every natural number $n$ let
us say that an $n$-tuple $(G_1, \dots, G_n) \in \mathcal{G}^n$ and an $n$-tuple $(I_1, \dots, I_n) \in \mathcal{I}^n$ are
compatible with each other if a function $f \in C_c(E)$ exists such that $f(G_j) \subset I_j$
for each $j = 1, \dots, n$ and $\mathrm{supp}(f) \subset G_1 \cup \dots \cup G_n$. Any such $f$ will be called
a compatibility function for the pair of $n$-tuples. Obviously, the set

$$\bigcup_{n\in\mathbb{N}} (\mathcal{G}^n \times \mathcal{I}^n)$$

is countable; there are therefore only countably many such pairs of $n$-tuples ($n \in \mathbb{N}$)
that are compatible with each other. We choose a compatibility function for each
such pair and designate by $F$ the set of functions chosen. It suffices to prove that
$F$ is a countable dense subset of $C_c(E)$. To prove its denseness, let $u \in C_c(E)$
and $\varepsilon > 0$ be given. Denote the support of $u$ by $K$. Every $x \in K$ lies in an open
neighborhood from $\mathcal{G}$ each point $y$ of which satisfies $|u(x) - u(y)| < \varepsilon$. The compact set $K$ is covered by finitely many such neighborhoods, say by $G_1, \dots, G_n$.
The diameter of each image set $u(G_j)$ is at most $2\varepsilon$. Consequently there are intervals $I_j \in \mathcal{I}$ of length less than $3\varepsilon$ such that $u(G_j) \subset I_j$, for $j = 1, \dots, n$. Thus
$u$ is a compatibility function for the pair of $n$-tuples $(G_1, \dots, G_n)$, $(I_1, \dots, I_n)$.
Hence there must also be such a compatibility function $f$ in the representative
set $F$. Every $x \in G_j$ therefore satisfies $|u(x) - f(x)| \le \lambda^1(I_j) < 3\varepsilon$; that is,
$|u(x) - f(x)| < 3\varepsilon$ for all $x \in G_1 \cup \dots \cup G_n$. But this latter inequality prevails as
well for all $x \in E \setminus (G_1 \cup \dots \cup G_n)$ for the simple reason that both $f$ and $u$ vanish
identically in this complement. In summary, $\|u - f\| \le 3\varepsilon$. This proves that $F$ is
dense in $C_c(E)$.
(b)$\Rightarrow$(a): Let $D$ be a dense subset of $C_c(E)$. We will show that the system $\mathcal{G}$
of all sets $\{u > 1/2\}$ with $u \in D$ is a base for the topology of $E$. For every open
$U \subset E$ and every point $x \in U$ Corollary 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$
and $\mathrm{supp}(f) \subset U$. Since $D$ is dense, there is a $u \in D$ with $\|u - f\| < 1/2$. Then

$$x \in \{u > 1/2\} \subset \{f > 0\} \subset \mathrm{supp}(f) \subset U.$$

If $D$ is countable, so is $\mathcal{G}$. $\Box$

Remark. 3. It is easy to show directly that (b) implies the metrizability of $E$.
To this end, let $D$ be a countable dense subset of $C_c(E)$. Now (cf. Corollary 27.3)
$C_c(E)$ separates the points of $E$, so $D$ must also; that is, for any two distinct
points $x, y \in E$ there is a $u \in D$ with $u(x) \ne u(y)$. The functions in $D \setminus \{0\}$ may
be organized into a sequence $u_1, u_2, \dots$ and we may then define

$$d(x, y) := \sum_{n=1}^{\infty} \frac{|u_n(x) - u_n(y)|}{2^n \|u_n\|}, \quad x, y \in E. \tag{31.3}$$

Point-separation by $D$ means that $d(x, y) > 0$ whenever $x \ne y$. All the other properties of a metric on $E$ are obvious for $d$. This function $d$ on $E \times E$ is a uniform
limit of continuous functions and is consequently continuous. Therefore the topology generated by $d$, which we will call the $d$-topology, is coarser than the original
topology of $E$. For any given point $x \in E$ and neighborhood $U$ of $x$ in the original
topology of $E$ there is, as was shown in the "(b)$\Rightarrow$(a)" part of the preceding proof,
a $u \in D$ with

$$x \in V := \{u > 1/2\} \subset U.$$

This function $u$ is however a $u_n$, so that by (31.3) $u$ is $d$-continuous and $V$ is $d$-open.
Therefore the $d$-topology is finer than the original topology of $E$. Consequently
the two topologies in fact coincide.
Now we can provide the final answer to the question posed after Remark 1.

31.5 Theorem. The following assertions about a locally compact space $E$ are
equivalent:

(a) $\mathcal{M}_+(E)$ is a Polish space in its vague topology.
(b) The vague topology of $\mathcal{M}_+(E)$ is metrizable and has a countable base.
(c) The topology of $E$ has a countable base.
(d) $E$ is a Polish space.
Proof. (a)$\Rightarrow$(b): This follows from Definition 26.1 of a Polish space.
(b)$\Rightarrow$(c): In Remark 2 we learned that $x \mapsto \varepsilon_x$ is a homeomorphic mapping of $E$
onto the subspace $\{\varepsilon_x : x \in E\}$ of all Dirac measures in $\mathcal{M}_+(E)$. Since the property
of having a countable basis clearly passes to subspaces, (c) follows.
(c)$\Rightarrow$(d): This was shown in Remark 5 of §29.
(d)$\Rightarrow$(a): Lemma 31.4 provides a countable $D_0 \subset C_c(E)$ which is dense in $C_c(E)$
with respect to uniform convergence. Furthermore, according to Example 2 of §29,
$E$ is countable at infinity, so that by 29.8 there is a sequence $(L_n)$ of compact
sets such that $L_n \uparrow E$ and every compact subset $K$ of $E$ satisfies $K \subset L_n$ for all
but finitely many $n$. For each $n \in \mathbb{N}$ choose an $e_n \in C_c(E)$ satisfying $0 \le e_n \le 1$,
$e_n(L_n) = \{1\}$. The subset

$$D := D_0 \cup \{u e_n : u \in D_0,\ n \in \mathbb{N}\} \cup \{e_n : n \in \mathbb{N}\}$$

of $C_c(E)$ is still only countable and, of course, is dense in $C_c(E)$. Let $d_1, d_2, \dots$ be
an enumeration of its elements:

$$D = \{d_n : n \in \mathbb{N}\}.$$

Using this enumeration we define a mapping

$$\varrho : \mathcal{M}_+ \times \mathcal{M}_+ \to \mathbb{R}_+$$

by

$$\varrho(\mu, \nu) := \sum_{n=1}^{\infty} 2^{-n} \min\Bigl\{1, \Bigl|\int d_n\,d\mu - \int d_n\,d\nu\Bigr|\Bigr\}, \quad \mu, \nu \in \mathcal{M}_+. \tag{31.4}$$

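All the properties of a metric save perhaps one are obvious for $\varrho$. What needs
checking is that $\mu = \nu$ follows from $\varrho(\mu, \nu) = 0$. In view of the uniqueness part of
the Riesz representation theorem this amounts to showing that from

$$\int d_n\,d\mu = \int d_n\,d\nu \quad\text{for all } n \in \mathbb{N}$$

follows the equality

$$\int f\,d\mu = \int f\,d\nu \quad\text{for every } f \in C_c(E).$$

So let us show this. Given $f \in C_c(E)$ there is $k \in \mathbb{N}$ such that

$$\mathrm{supp}(f) \subset L_k \subset \{e_k = 1\}.$$

Further, given $\varepsilon > 0$ there is $u \in D_0$ with $\|f - u\| < \varepsilon$, whence, since $f = f e_k$,

$$|f - u e_k| \le \varepsilon e_k. \tag{31.5}$$

Integration yields

$$\Bigl|\int f\,d\mu - \int u e_k\,d\mu\Bigr| \le \varepsilon \int e_k\,d\mu, \tag{31.6}$$

$$\Bigl|\int f\,d\nu - \int u e_k\,d\nu\Bigr| \le \varepsilon \int e_k\,d\nu. \tag{31.6'}$$

As the functions $e_k$ and $u e_k$ are in $D$, the assumption that $\varrho(\mu, \nu) = 0$ entails that
their $\mu$- and $\nu$-integrals coincide, and it follows that

$$\Bigl|\int f\,d\mu - \int f\,d\nu\Bigr| \le 2\varepsilon \int e_k\,d\mu,$$

holding for every $\varepsilon > 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold.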
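To get a feeling for the metric (31.4), one can evaluate a truncated version of the series for discrete measures on $E = \mathbb{R}$. The sketch below is only an illustration under strong assumptions: the six tent functions stand in for the first terms $d_1, \dots, d_6$ of an enumeration of $D$ (the actual set $D$ of the proof is not constructed here), a measure is represented as a list of (weight, point) pairs, and all names are hypothetical. A finite truncation is merely a pseudometric.

```python
def tent(center, radius):
    """A function in C_c(R) with peak 1 at `center` and support of half-width `radius`."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / radius)


# Hypothetical stand-ins for d_1, ..., d_6; not the set D of the proof.
TEST_FUNCTIONS = [tent(c, r) for r in (1.0, 2.0) for c in (-1.0, 0.0, 1.0)]


def integral(f, measure):
    """Integral of f against the discrete measure sum of w * eps_x over (w, x) in `measure`."""
    return sum(w * f(x) for w, x in measure)


def rho_truncated(mu, nu, functions=TEST_FUNCTIONS):
    """Finite partial sum of (31.4): sum over n of 2^-n * min(1, |int d_n dmu - int d_n dnu|)."""
    return sum(
        2.0 ** -n * min(1.0, abs(integral(d, mu) - integral(d, nu)))
        for n, d in enumerate(functions, start=1)
    )


eps_0 = [(1.0, 0.0)]
for k in (10, 100, 1000):
    eps_k = [(1.0, 1.0 / k)]
    print(k, rho_truncated(eps_k, eps_0))
```

The printed values shrink as $1/k \to 0$, reflecting the vague convergence $\varepsilon_{1/k} \to \varepsilon_0$. The cap $\min\{1, \cdot\}$ keeps each term bounded even for measures of large total mass.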
The next step is to show that the topology determined by $\varrho$ is none other
than the vague topology. We will, to that end, make use of the fact that the sets
$V_{f_1,\dots,f_n;\varepsilon}(\nu)$ defined in (30.5) are a neighborhood base at $\nu \in \mathcal{M}_+$ in the vague
topology, when all possible finite subsets $\{f_1, \dots, f_n\}$ of $C_c(E)$ and all numbers
$\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$
with respect to the metric $\varrho$.

1. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that

$$V_{d_1,\dots,d_m;\varepsilon/2}(\nu) \subset U_\varepsilon(\nu) \quad\text{for every } \nu \in \mathcal{M}_+. \tag{31.7}$$

Indeed, one may take any $m \in \mathbb{N}$ such that

$$\sum_{n=m+1}^{\infty} 2^{-n} < \varepsilon/2$$

and every $\mu \in V_{d_1,\dots,d_m;\varepsilon/2}(\nu)$ will then satisfy

$$\varrho(\mu, \nu) < \sum_{n=1}^{m} 2^{-n}\,\frac{\varepsilon}{2} + \frac{\varepsilon}{2} < \varepsilon$$

and consequently lie in $U_\varepsilon(\nu)$.

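2. For finitely many $f_1, \dots, f_n \in C_c(E)$, for every number $\varepsilon > 0$ and every
$\nu \in \mathcal{M}_+$, there is a number $\eta > 0$ such that

$$U_\eta(\nu) \subset V_{f_1,\dots,f_n;\varepsilon}(\nu). \tag{31.8}$$

First of all, choose $k \in \mathbb{N}$ so that

$$\bigcup_{j=1}^{n} \mathrm{supp}(f_j) \subset L_k \subset \{e_k = 1\}.$$

We can find a number $\delta$, dependent on $\nu$, so that

$$0 < \delta < 1 \quad\text{and}\quad \delta^2 + \Bigl(1 + 2\int e_k\,d\nu\Bigr)\delta < \varepsilon.$$

For each $j$ there is a function $u_j \in D_0$ with $\|f_j - u_j\| \le \delta$, hence with

$$|f_j - u_j e_k| \le \delta e_k \quad (j = 1, \dots, n).$$

Integration with respect to $\nu$ and any $\mu \in \mathcal{M}_+$ gives

$$\Bigl|\int f_j\,d\mu - \int u_j e_k\,d\mu\Bigr| \le \delta \int e_k\,d\mu, \tag{31.9}$$

$$\Bigl|\int f_j\,d\nu - \int u_j e_k\,d\nu\Bigr| \le \delta \int e_k\,d\nu \tag{31.9'}$$

for $j = 1, \dots, n$. Choose $m$ so large that all the functions $e_k, u_1 e_k, \dots, u_n e_k$ show
up among the first $m$ functions $d_1, \dots, d_m$ in the enumeration of $D$, to which they
all belong. Finally, set

$$\eta := \delta\,2^{-m}$$

and consider any $\mu \in U_\eta(\nu)$. It satisfies

$$2^{-i} \min\Bigl\{1, \Bigl|\int d_i\,d\mu - \int d_i\,d\nu\Bigr|\Bigr\} \le \varrho(\mu, \nu) < \eta \le \delta\,2^{-i},$$

whence, since $\delta < 1$,

$$\Bigl|\int d_i\,d\mu - \int d_i\,d\nu\Bigr| < \delta \quad\text{for } i = 1, \dots, m.$$

Because of the way $m$ was chosen

$$\Bigl|\int u_j e_k\,d\mu - \int u_j e_k\,d\nu\Bigr| < \delta \quad\text{for } j = 1, \dots, n \tag{31.10}$$

and

$$\Bigl|\int e_k\,d\mu - \int e_k\,d\nu\Bigr| < \delta. \tag{31.10'}$$

From (31.9) and (31.9'), as well as from (31.10) it follows, via the triangle inequality
that

$$\Bigl|\int f_j\,d\mu - \int f_j\,d\nu\Bigr| < \Bigl(1 + \int e_k\,d\mu + \int e_k\,d\nu\Bigr)\delta;$$

while from (31.10')

$$\int e_k\,d\mu < \delta + \int e_k\,d\nu,$$

so the preceding implies that

$$\Bigl|\int f_j\,d\mu - \int f_j\,d\nu\Bigr| < \delta^2 + \Bigl(1 + 2\int e_k\,d\nu\Bigr)\delta < \varepsilon.$$

As this holds for every $j \in \{1, \dots, n\}$, it asserts that $\mu \in V_{f_1,\dots,f_n;\varepsilon}(\nu)$ and confirms (31.8). Together (31.7) and (31.8) assert the equality of the vague and the
$\varrho$-topologies.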

The next step will be to prove the completeness of the metric $\varrho$, and we can
do that via slight modifications in the foregoing arguments. Let $(\mu_n)_{n\in\mathbb{N}}$ be a $\varrho$-Cauchy sequence in $\mathcal{M}_+$. Instead of the functions $f_1, \dots, f_n$ and the number $\varepsilon > 0$
in 2. above, let an $f \in C_c(E)$ and a number $\delta \in \,]0, 1[$ be given. We aim first
to prove that the numerical sequence $(\int f\,d\mu_n)_{n\in\mathbb{N}}$ converges in $\mathbb{R}$. Choose $k \in \mathbb{N}$
with $\mathrm{supp}(f) \subset \{e_k = 1\}$ and $u \in D_0$ with $\|f - u\| \le \delta$. Then choose $m \in \mathbb{N}$ large
enough that the two functions $e_k$ and $u e_k$ are among $d_1, \dots, d_m$ and set $\eta := \delta\,2^{-m}$.
Since $(\mu_n)$ is a $\varrho$-Cauchy sequence, there is a natural number $N$, dependent on $\eta$,
thus on $f$ and $\delta$, such that

$$\varrho(\mu_r, \mu_s) < \eta \quad\text{for all } r, s \ge N.$$

Just as in the earlier deduction scheme, we get that for such $r, s$

$$\Bigl|\int d_i\,d\mu_r - \int d_i\,d\mu_s\Bigr| < \delta \quad\text{for all } i \in \{1, \dots, m\},$$

which contains in particular the inequalities

$$\Bigl|\int u e_k\,d\mu_r - \int u e_k\,d\mu_s\Bigr| < \delta \quad\text{and}\quad \Bigl|\int e_k\,d\mu_r - \int e_k\,d\mu_s\Bigr| < \delta. \tag{31.11}$$

Of course we also have the $f$-analogs of (31.9) and (31.9'), so that reasoning similar
to that used earlier delivers the inequality

$$\Bigl|\int f\,d\mu_r - \int f\,d\mu_s\Bigr| < \delta^2 + \Bigl(1 + 2\int e_k\,d\mu_s\Bigr)\delta \quad\text{for all } r, s \ge N.$$

The second of the (valid for all $r, s \ge N$) inequalities in (31.11) shows that the
numerical sequence $(\int e_k\,d\mu_n)_{n\in\mathbb{N}}$ is bounded, say by $M \in \mathbb{R}_+$:

$$\int e_k\,d\mu_n \le M \quad\text{for all } n \in \mathbb{N}.$$

The earlier inequality therefore yields

$$\Bigl|\int f\,d\mu_r - \int f\,d\mu_s\Bigr| < \delta^2 + (1 + 2M)\delta \quad\text{for all } r, s \ge N.$$

Notice that $M$ depends only on $k$, hence only on $f$. Furthermore $N$ depends only
on $\delta$ and $f$. Therefore this last inequality affirms that $(\int f\,d\mu_n)_{n\in\mathbb{N}}$ is a Cauchy
sequence in $\mathbb{R}$. According to the remark following Definition 30.1 the sequence $(\mu_n)$
is therefore vaguely convergent to some $\mu_0 \in \mathcal{M}_+$. Since the vague topology coincides with the $\varrho$-topology, as we have already confirmed, this means that the
sequence $(\mu_n)$ converges to $\mu_0$ in the $\varrho$-metric.
We finally need to prove that, like the topology of $E$, the vague topology of $\mathcal{M}_+$
has a countable base. Since the vague topology is generated by the metric $\varrho$, it
is enough to find a countable set $\mathscr{D}_0$ which is dense in $\mathcal{M}_+$; because it is obvious
that the set of all open balls with respect to the metric $\varrho$ centered at points of $\mathscr{D}_0$
and having rational radii is then a countable base for the $\varrho$-topology of $\mathcal{M}_+$. Our
candidate for $\mathscr{D}_0$ is the set of all discrete measures

$$\delta := \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}$$

with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which
is dense in $E$. We get such a set $E_0$ simply by taking a point from each set
in a countable base for the topology of $E$. Evidently, this $\mathscr{D}_0$ is countable. We
have to show that for every $\mu \in \mathcal{M}_+$, every real $\varepsilon > 0$, and every finite set
$F := \{f_1, \dots, f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1,\dots,f_n;\varepsilon}(\mu)$ contains
a measure from $\mathscr{D}_0$. At least, according to 30.4, this neighborhood contains a
discrete measure

$$\bar\delta := \sum_{i=1}^{k} \bar\alpha_i \varepsilon_{\bar x_i}$$

with positive real $\bar\alpha_i$ and $\bar x_i \in E$. Thus

$$\Bigl|\int f\,d\mu - \int f\,d\bar\delta\Bigr| < \varepsilon \quad\text{for all } f \in F. \tag{31.12}$$

Now for such $f$ and $\delta$ as above

$$\Bigl|\int f\,d\mu - \int f\,d\delta\Bigr| \le \Bigl|\int f\,d\mu - \int f\,d\bar\delta\Bigr| + \Bigl|\int f\,d\bar\delta - \int f\,d\delta\Bigr|$$
$$\le \Bigl|\int f\,d\mu - \int f\,d\bar\delta\Bigr| + \sum_{i=1}^{k} \bar\alpha_i\,|f(\bar x_i) - f(x_i)| + \sum_{i=1}^{k} |\bar\alpha_i - \alpha_i|\,\|f\|. \tag{31.13}$$

Inequality (31.12) says that the number

$$\varepsilon - \Bigl|\int f\,d\mu - \int f\,d\bar\delta\Bigr|$$

is positive. If we choose $\alpha_i$ from $\mathbb{Q}_+$ sufficiently close to $\bar\alpha_i$ and $x_i$ from the dense
set $E_0$ sufficiently close to $\bar x_i$ ($i = 1, \dots, k$), then because of the continuity of the
(finitely many) functions $f$, we can obviously see to it that the two sums in (31.13)
together are less than this, so that the right side of inequality (31.13) is less than $\varepsilon$,
for each $f \in F$. But that means that $\delta \in \mathscr{D}_0 \cap V_{f_1,\dots,f_n;\varepsilon}(\mu)$. $\Box$

Remarks. 4. The reader should recall the rather elementary fact that for a metric space compactness and sequential compactness are equivalent (see (6.37) in
HEWITT and STROMBERG [1965]). In view of this, a very useful consequence of
Theorems 31.2 and 31.5 for a locally compact space $E$ with a countable base is
that every vaguely bounded sequence in $\mathcal{M}_+(E)$ contains a vaguely convergent
subsequence.

In particular, for such $E$ every sequence $(\mu_n)$ in $\mathcal{M}_+^1(E)$, that is, every sequence
of p-measures, contains a vaguely convergent subsequence. Moreover, in case all
convergent subsequences have the same limit $\mu$, the original sequence $(\mu_n)$ itself
converges vaguely to $\mu$: Otherwise there would be an $f \in C_c(E)$ for which $(\int f\,d\mu_n)$
does not converge to $\int f\,d\mu$, and so an $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \dots$ such
that $|\int f\,d\mu_{n_j} - \int f\,d\mu| \ge \varepsilon$ for all $j \in \mathbb{N}$. The sequence $(\mu_{n_j})_{j\in\mathbb{N}}$ would have
a vaguely convergent subsequence and its vague limit could not be $\mu$. If we further
hypothesize of $(\mu_n)$ that it is tight, then with the aid of Remark 3 in §30 we can
conclude that $\mu \in \mathcal{M}_+^1(E)$ as well, and that $(\mu_n)$ even converges weakly to $\mu$.

5. The foregoing deliberations show (for locally compact $E$ with a countable
base) that tight sequences in $\mathcal{M}_+^1(E)$ always contain weakly convergent subsequences. Explicitly formulated this says: A set $H \subset \mathcal{M}_+^1(E)$ is relatively compact
(= relatively sequentially compact) in the weak topology if it is tight, meaning
that for every $\varepsilon > 0$ a compact $K_\varepsilon \subset E$ exists such that $\mu(E \setminus K_\varepsilon) \le \varepsilon$ for every $\mu \in H$. A theorem of Yu. V. PROHOROV asserts that the tightness of $H$ is
even equivalent to its weak relative compactness. More is true: This equivalence
prevails as well whenever $E$ is any Polish space. For details the reader can consult
BILLINGSLEY [1968].


The ideas employed in the proofs of Lemma 31.4 and Theorem 31.5, slightly modified,
lead to a further interesting result. It concerns the space

$$C := C(\mathbb{R}_+, E)$$

of all continuous mappings $f$ of $\mathbb{R}_+ := [0, +\infty[$ into a Polish space $E$, for example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets
of $\mathbb{R}_+$.

31.6 Theorem. Along with $E$, the space $C(\mathbb{R}_+, E)$ is also Polish.

Proof. Consider any complete metric $\varrho$ which generates the topology of $E$. Another
such metric is given by $(x, y) \mapsto \min\{1, \varrho(x, y)\}$, and using it if need be, we can
simply assume that $\varrho \le 1$. This lets us define $d_n$ in $C$ for each $n \in \mathbb{N}$ by

$$d_n(f, g) := \sup\{\varrho(f(x), g(x)) : x \in [0, n]\}, \quad f, g \in C;$$

and

$$d(f, g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f, g), \quad f, g \in C. \tag{31.14}$$

Just as earlier (cf. (31.3) and (31.4)), one easily confirms that $d$ is a metric
on $C$ (with all its values in $[0,1]$) which satisfies

$$2^{-n} d_n(f, g) \le d(f, g) \le d_n(f, g) + 2^{-n} \quad\text{for all } n \in \mathbb{N}, \tag{31.15}$$

the right-most inequality following from the fact that $d_i \le d_{i+1}$ for all $i \in \mathbb{N}$,
resulting in

$$d(f, g) \le \sum_{i=1}^{n} 2^{-i} d_i(f, g) + \sum_{i=n+1}^{\infty} 2^{-i}.$$

It follows from (31.15) via by-now-familiar reasoning that the $d$-topology coincides
with the original topology of $C$, and moreover that $d$ is a complete metric.
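The two-sided estimate (31.15) can be checked numerically for concrete paths. The sketch below assumes $E = \mathbb{R}$ with the bounded metric $\min\{1, |x - y|\}$, replaces each supremum over $[0, n]$ by a maximum over a finite grid, and truncates the series (31.14) after 30 terms; the sample paths and all names are hypothetical choices for illustration.

```python
import math


def rho(x, y):
    """Bounded complete metric on R generating the usual topology."""
    return min(1.0, abs(x - y))


def d_n(f, g, n, steps_per_unit=100):
    """Grid approximation of sup{rho(f(x), g(x)) : x in [0, n]}."""
    return max(
        rho(f(k / steps_per_unit), g(k / steps_per_unit))
        for k in range(n * steps_per_unit + 1)
    )


def d(f, g, terms=30):
    """Partial sum of (31.14); the neglected tail is at most 2^-terms."""
    return sum(2.0 ** -n * d_n(f, g, n) for n in range(1, terms + 1))


def f(t):
    return math.sin(t)


def g(t):
    return math.sin(t) + t / 10.0  # drifts away from f as t grows


distance = d(f, g)
for n in (1, 5, 10):
    dn = d_n(f, g, n)
    print(n, 2.0 ** -n * dn, distance, dn + 2.0 ** -n)
```

For these paths $d_n(f, g) = \min\{1, n/10\}$, so two paths that differ only far out on the time axis are close in $d$. This is exactly what the topology of uniform convergence on compact subsets of $\mathbb{R}_+$ requires.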

So it only remains to prove that the topology of $C$ has a countable base.
As we showed in the very last phase of the proof of Theorem 31.5, the Polish
space $E$ contains a countable dense subset $E_0$. The system $\mathcal{G}$ of all open balls with
respect to the metric $\varrho$ with centers in $E_0$ and with positive rational radii is then
a countable base for $E$. Together with it we consider a countable base $\mathcal{O}$ for $\mathbb{R}_+$.
Thus $n$-tuples $(O_1, \dots, O_n) \in \mathcal{O}^n$ and $(G_1, \dots, G_n) \in \mathcal{G}^n$ are called compatible if
there is a function $f \in C$ such that $f(O_j) \subset G_j$ for each $j = 1, \dots, n$. And, as
before, any such $f$ will be called a compatibility function. Because

$$\bigcup_{n\in\mathbb{N}} (\mathcal{O}^n \times \mathcal{G}^n)$$

is countable, there is a countable set $F \subset C$ which contains a compatibility function
for each pair of compatible $n$-tuples, for each $n \in \mathbb{N}$. The open $d$-balls having
centers in $F$ and rational radii are a countable set, and it is easy to see that they
constitute a base for the $d$-topology of $C$ once we confirm that $F$ is dense in $C$.


So that is now our goal. Consider then an arbitrary $f_0 \in C$ and $N \in \mathbb{N}$. Set
$\varepsilon := 2^{-N-2}$. Since $f_0$ is continuous, every $x \in [0, N]$ lies in a set $O \in \mathscr{O}$ such that

$$\varrho(f_0(y), f_0(x)) < \varepsilon/2 \quad \text{for all } y \in O.$$

Finitely many such sets $O$ suffice to cover $[0, N]$, say $O_1, \ldots, O_n$. By the triangle
inequality

$$\varrho(f_0(y), f_0(x)) < \varepsilon \quad \text{for all } x, y \in O_j,\ j \in \{1, \ldots, n\}.$$

Choose a point $x_j$ from each $O_j$. Then

$$\varrho(f_0(x), f_0(x_j)) < \varepsilon \quad \text{for all } x \in O_j,\ j \in \{1, \ldots, n\}.$$

The open $\varrho$-ball of radius $\varepsilon$ centered at $f_0(x_j)$ meets the dense set $E_0$, say in the
point $z_j$. As $\varepsilon$ is rational, the open $\varrho$-ball of center $z_j$ and radius $2\varepsilon$ is a set $G_j \in \mathscr{G}$.
Then every $x \in O_j$ satisfies

$$\varrho(f_0(x), z_j) \le \varrho(f_0(x), f_0(x_j)) + \varrho(f_0(x_j), z_j) < 2\varepsilon,$$

which means that $f_0(O_j) \subset G_j$, all this for each $j \in \{1, \ldots, n\}$. This shows that
$f_0$ is a compatibility function for $(O_1, \ldots, O_n)$ and $(G_1, \ldots, G_n)$. Consequently,
this pair of $n$-tuples has a compatibility function $f \in F$, that is, $f \in F$ satisfies

$$f(O_j) \subset G_j \quad \text{for } j = 1, \ldots, n.$$

It follows that $f(x)$, $f_0(x)$ both lie in $G_j$ whenever $x \in O_j$ and so

$$\varrho(f(x), f_0(x)) < 4\varepsilon.$$

As the $O_j$ cover $[0, N]$, this inequality holds for every $x \in [0, N]$. It affirms that
$d_N(f, f_0) < 4\varepsilon$, and so thanks to (31.15) and the definition of $\varepsilon$, $d(f, f_0) < 4\varepsilon + 2^{-N} = 2^{-N+1}$. As $N \in \mathbb{N}$ is arbitrary, this shows that $F$ is $d$-dense in $C$, which,
as noted earlier, completes the proof.
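The two numerical steps at the end of the proof can be checked by direct arithmetic. The sketch below writes $\varepsilon$ for the quantity $2^{-N-2}$ and $\varrho$ for the metric of $E$. It assumes that (31.15), which is not restated in this passage, is the bound $d(f, g) \le d_N(f, g) + 2^{-N}$; that reading is inferred only from how the inequality is used here.

```latex
% Step 1: both f(x) and f_0(x) lie in G_j, the open ball of radius 2\varepsilon about z_j.
\begin{align*}
\varrho(f(x), f_0(x)) &\le \varrho(f(x), z_j) + \varrho(z_j, f_0(x))
                       < 2\varepsilon + 2\varepsilon = 4\varepsilon .
\end{align*}
% Step 2: insert \varepsilon = 2^{-N-2} into the assumed form of (31.15).
\begin{align*}
4\varepsilon &= 4 \cdot 2^{-N-2} = 2^{-N}, \\
d(f, f_0) &\le d_N(f, f_0) + 2^{-N}
           < 4\varepsilon + 2^{-N} = 2^{-N} + 2^{-N} = 2^{-N+1} .
\end{align*}
```

Since $2^{-N+1} \to 0$ as $N \to \infty$, every $d$-ball around $f_0$ contains a point of $F$, which is the density claimed.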
The significance of Theorem 31.5 lies partly in the fact that, for a locally compact space $E$ whose topology has a countable base, the space $\mathscr{M}_+(E)$ of all (positive) Radon measures (which according to 29.12 is the set of all Borel measures
on $E$), being also a Polish space, is itself an environment in which measure theory
can be pursued. And this happens in convex analysis, in integral geometry, and
in stochastic geometry, a meeting point between geometry and probability theory.

The path-space $C(\mathbb{R}_+, E)$ of all continuous paths or curves $t \mapsto f(t)$, $t \in \mathbb{R}_+$,
in a Polish space $E$ (Theorem 31.6) plays a fundamental role in the theory of
stochastic processes. For example, the Polish space $C(\mathbb{R}_+, \mathbb{R}^d)$ carries the famous
Wiener measure; it is the steering mechanism of the Brownian motion in $\mathbb{R}^d$ (cf.
BAUER [1996]).

Exercises.
1. Let $E$ be a locally compact space, $\nu \in \mathscr{M}_+(E)$. Show that the set of all $\mu \in
\mathscr{M}_+(E)$ which satisfy $0 \le \int u \, d\mu \le \int u \, d\nu$ for every non-negative $u \in C_c(E)$ is
vaguely compact.


2. Let E be a locally compact space with a countable base. Prove that there is
a countable subset of C0(E) that has the properties of the set T in Exercise 3, §30.
[Hint: Try the set D that featured in the proof of Theorem 31.5.]
3. (Selection theorem of E. HELLY (1884-1943)). Prove the original form of Corollary 31.3: To every sequence $(F_n)_{n \in \mathbb{N}}$ of distribution functions on $\mathbb{R}$ corresponds
a measure-generating function $F : \mathbb{R} \to \mathbb{R}$ and a subsequence $(F_{n_k})_{k \in \mathbb{N}}$ of the
original sequence such that $\lim_{k \to \infty} F_{n_k}(x) = F(x)$ for every continuity point $x$ of $F$.
Why is $F$ generally not a distribution function? How does one recover 31.3 (for
the case $E := \mathbb{R}$) from Helly's theorem?
4. For a Polish space E consider the topology (introduced in Remark 7 of §30) of
weak convergence on the set of finite Borel measures (the finite Radon measures
- cf. 26.2) on E. By adapting the ideas in the proof of Theorem 31.5, show that
this topology is metrizable.
5. For what more general spaces taking over the role of R+ in the definition
of C(R+, E) does Theorem 31.6 remain valid?

Bibliography

U. ANONYME [1889]: "Sur l'intégrale $\int_0^\infty e^{-x^2}\,dx$", Bull. Sci. Math. (2) 13, 84.
G. AUMANN [1969]: Reelle Funktionen. Grundlehren Math. Wiss. 68 (2nd edition),
Springer-Verlag, Berlin-Heidelberg-New York.
S. BANACH [1923]: "Sur le problème de la mesure", Fund. Math. 4, 7-33.
R.G. BARTLE and J.T. JOICHI [1961]: "The preservation of convergence of measurable functions", Proc. Amer. Math. Soc. 12, 122-126.
H. BAUER [1984]: Maße auf topologischen Räumen, Kurs der Fernuniversität-Gesamthochschule-Hagen.
- [1996]: Probability Theory, de Gruyter Stud. Math. 23. Walter de Gruyter,
Berlin-New York.
S.K. BERBERIAN [1962]: "The product of two measures", Amer. Math. Monthly
69, 961-968.
P. BILLINGSLEY [1968]: Convergence of Probability Measures. John Wiley & Sons,
Inc., New York-London-Sydney-Toronto.


G. BIRKHOFF and S. MACLANE [1965]: A Survey of Modern Algebra (3rd edition).
The Macmillan Co., New York.

N. BOURBAKI [1965]: Integration, Chap. 1-4. Hermann, Paris.


A. BROUGHTON and B.W. HUFF [1977]: "A comment on unions of σ-fields",
Amer. Math. Monthly 84, 553-554.
S.D. CHATTERJI [1985-86]: "Elementary counter-examples in the theory of double
integrals", Atti Sem. Mat. Fis. Univ. Modena 34, 363-384.
G. CHOQUET [1969]: Lectures on Analysis. Vol. 1. W.A. Benjamin, New York-Amsterdam.
J.P.R. CHRISTENSEN [1974]: Topology and Borel Structure. Mathematical Studies
10. North-Holland Publ. Co., Amsterdam-London.
D.L. COHN [1980]: Measure Theory. Birkhäuser Verlag, Basel-Boston-Stuttgart.
P. COURRÈGE [1962]: Théorie de la mesure. Les cours de Sorbonne. Centre de
Documentation Universitaire, Paris 5e.
C. DELLACHERIE et P.-A. MEYER [1975]: Probabilités et potentiel, Chap. I à IV.
Hermann, Paris.
P. DIEROLF and V. SCHMIDT [1998]: "A proof of the change of variable formula
for d-dimensional integrals", Amer. Math. Monthly 105, 654-656.
J. DIEUDONNÉ [1939]: "Un exemple d'espace normal non susceptible d'une structure
uniforme d'espace complet", C. R. Acad. Sci. Paris Sér. I Math. 209,
145-147.


- [1978]: Abrégé d'Histoire des Mathématiques, 1700-1900, tome II. Hermann,
Paris.
E.B. DYNKIN [1965]: Markov Processes, I, II. Grundlehren Math. Wiss. 121, 122.
Springer-Verlag, Berlin-Heidelberg-New York.
R.E. EDWARDS [1953]: "A theory of Radon measures on locally compact spaces",
Acta Math. 89, 133-164.
B.W. GNEDENKO [1988]: The Theory of Probability (translated from Russian by
G. Yankovsky) 6th printing. Mir Publishers, Moscow.
C. GOFFMAN and G. PEDRICK [1975]: "A proof of the homeomorphism of Lebesgue-Stieltjes
measure with Lebesgue measure", Proc. Amer. Math. Soc. 52,
196-198.

H. HAHN and A. ROSENTHAL [1948]: Set Functions. The University of New
Mexico Press, Albuquerque.
P.R. HALMOS [1974]: Naive Set Theory. Undergrad. Texts Math., Springer-Verlag, New York-Heidelberg.
- [1974]: Measure Theory. Grad. Texts in Math. 18, Springer-Verlag, New York-Heidelberg-Berlin.
F. HAUSDORFF [1914]: Grundzüge der Mengenlehre. Verlag von Veit and Comp.,
Leipzig; reprinted (1949), Chelsea Publishing Comp., New York.
T. HAWKINS [1970]: Lebesgue's Theory of Integration. University of Wisconsin
Press, Madison-Milwaukee-London.
J. HENLE and S. WAGON [1983]: "A translation-invariant measure", Amer. Math.
Monthly 90, 62-63.
E. HEWITT and K.A. ROSS [1979]: Abstract Harmonic Analysis I. Grundlehren
Math. Wiss. 115 (2nd edition). Springer-Verlag, Berlin-Heidelberg-New York.
E. HEWITT and K. STROMBERG [1965]: Real and Abstract Analysis. Grad. Texts
in Math. 25. Springer-Verlag, New York-Heidelberg-Berlin.
J.L. KELLEY [1955]: General Topology, Grad. Texts in Math. 27. D. Van Nostrand
Co., Inc. Princeton; reprinted (1975), Springer-Verlag, New York-Heidelberg-Berlin.
L. MATTNER [1999]: "Product measurability, parameter integrals, and a Fubini
counterexample", Enseign. Math. (2) 45, 271-279.
P.-A. MEYER [1966]: Probability and Potentials. Blaisdell Publ. Comp., Waltham,
Massachusetts-Toronto-London.
L. NACHBIN [1965]: The Haar Integral. The University Series in Higher Mathematics. (Translated from Portuguese by L. Bechtolsheim.) D. Van Nostrand Co.,
Inc. Princeton; reprinted (1976), R.E. Krieger Publ. Comp., Huntington, New
York.
J. VON NEUMANN [1929]: "Zur allgemeinen Theorie des Maßes", Fund. Math. 13,
73-116+333.
W.P. NOVINGER [1972]: "Mean convergence in Lp-space", Proc. Amer. Math.
Soc. 34, 627-628.


D.A. OVERDIJK, F.H. SIMONS and J.G.F. THIEMANN [1979]: "A comment on
unions of rings", Indag. Math. 41, 439-441.
J.C. OXTOBY and S. ULAM [1941]: "Measure-preserving homeomorphisms and
metrical transitivity", Ann. of Math. (2) 42, 874-920.
K.R. PARTHASARATHY [1967]: Probability Measures on Metric Spaces, Academic
Press, New York-London.
W.F. PFEFFER [1977]: Integrals and Measures. Marcel Dekker, New York-Basel.
J. RADON [1913]: "Theorie und Anwendungen der absolut additiven Mengenfunktionen", Sitzungsber. Kaiserl. Akad. Wiss. Wien, Math.-Naturwiss. Kl. 122,
1295-1438.
H. RICHTER [1966]: Wahrscheinlichkeitstheorie. Grundlehren Math. Wiss. 86
(2nd edition). Springer-Verlag, Berlin-Heidelberg-New York.
F. RIESZ [1911]: "Sur certains systèmes singuliers d'équations intégrales", Ann.
Sci. École Norm. Sup. (3) 28, 33-62.
J.B. ROBERTSON [1967]: "Uniqueness of measures", Amer. Math. Monthly 74,
50-53.
W. RUDIN [1962]: Fourier Analysis on Groups. Interscience Tracts in Pure Appl.
Math. 12. John Wiley & Sons, New York-London.
- [1987]: Real and Complex Analysis (3rd edition). McGraw-Hill Book Comp.,
New York-Hamburg-Tokyo-Toronto.
S. SAEKI [1996]: "A proof of the existence of infinite product probability measures", Amer. Math. Monthly 103, 682-683.
W. SIERPIŃSKI [1928]: "Un théorème général sur les familles d'ensembles", Fund.
Math. 12, 206-210.

R.M. SOLOVAY [1970]: "A model of set-theory in which every set of reals is
Lebesgue measurable", Ann. of Math. (2) 92, 1-56.
R.H. SORGENFREY [1947]: "On the topological product of paracompact spaces",
Bull. Amer. Math. Soc. 53, 631-632.
S.M. SRIVASTAVA [1998]: A Course on Borel Sets. Grad. Texts in Math. 180.
Springer-Verlag, New York-Berlin.
K. STROMBERG [1972]: "An elementary proof of Steinhaus's theorem", Proc.
Amer. Math. Soc. 36, 308.
- [1979]: "The Banach-Tarski paradox", Amer. Math. Monthly 86, 151-161.
- [1981]: An Introduction to Classical Real Analysis. Wadsworth International,
Belmont, California.

H.G. TUCKER [1967]: A Graduate Course in Probability. Academic Press, New
York-San Francisco-London.
J. VAN YZEREN [1979]: "Moivre's and Fresnel's integrals by simple integration",
Amer. Math. Monthly 86, 691-693.
D.E. VARBERG [1971]: "Change of variables in multiple integrals", Amer. Math.
Monthly 78, 42-45.


S. WAGON [1985]: The Banach-Tarski Paradox. Encyclopedia Math. Appl. 24.
Cambridge University Press, Cambridge.
S. WILLARD [1970]: General Topology. Addison-Wesley Publishing Co., Reading,
Massachusetts.
J. YAM TING WOO [1971]: "An elementary proof of the Lebesgue decomposition
theorem", Amer. Math. Monthly 78, 783.


D.G. WRIGHT [1994]: "Tychonoff's theorem", Proc. Amer. Math. Soc. 120,
985-987.

Symbol Index

The numbers beside the symbols refer to the pages where the symbol in question
is defined.

C, u,n u, n, c, \, xii
0,33
-00, (+)oo, xi

f * v (convolution of a function and a


measure), 149

If < g}, If < g}, If = g}, { f 76 g},

If >g}, If > 0, 50

IR, xi

N, Z, Q, R, xi,
Z+, Q+, R+, R+, xi

f f du, f f (w)A(dw), f f (w) da(w),


f u d, 55 58 64

R+, 141

f f dF, 65

R" (multiplicative group R \ {0}), 44

fA f d, fB f

R,., 156
Qd, 45.

T (unit circle [torus]), 38

(X)

dx, fa f dAd, f f d4

67 90

f If, 6D
fnTf,
00
F fn, E fn, xiii

n=1

a<b,a<b,14
[a, b1, a, b , a& a, b , xi, 14 28, 29

0,11 0,i, 0,1, 38 39


avb, aA xi
a-, a+, xi

lim sup An, lim inf An, 61


n-+00

n-4o0

limsup fn, liminf fn, xiii


lim fn, xiii
n--,oo
n = (n,.. . , n) E Rd (usually n E Z),
23,32

(an)nEN, (a.).=j.2...,, Xii


an I a, an T a, xiii

91

(ai)iEJ, Xli
d(x, A), 157, 201

sup fn, inf f, xiii


supp(f) (support of a function), 167

det T, 43

supp(p) (support of a Borel measure),

f: A -. B, x H f (x) (mapping), xii


f I A' (restriction of a mapping), xii
f-1(B') (pre-image), xii
fA, 62

8Q, 1,36

177

ix - yl (euclidean norm), 146


(x, y) (euclidean scalar product), 41

f";, 137
F,, 31

u 12

Ilf il (supremum norm), 169. 183

t (topological interior), 182


A* (topological boundary), 198

Ilflip, Ilflloo, 86, 8-2

f-, f+, [LL, 53


f, 86
fog (composition of mappings), xii
f = g (ti-)almost everywhere, 70

f +g, fg, af, xii, 66


f * g (convolution of functions), 150

A (topological closure), 17

An, 147
1A (indicator function), 49

a+A, A+a, 36
A:= B, B =: A, xi
AC B, xii
A\ B, xii


A - A (algebraic difference), 1.63

.2l'(), 1 < p < +00, 71

AD B,5

Y(), 78

C(E), Cb(E), Co(E), C(E), 167,169


C(R+, E), 214

.4V+ (Rd), L47

Dye, 44

..41+(E), .4l+(E), _W+1 (E) (spaces of


Radon measures), 188

E = E(1l, sd), 53

,,,Y (negligible sets), 86 87 100


.N,, 13,106, 107

E`(1,d), 58

(open sets), 152

E, 10
E. T E.
G:= f e-` )'(dx), 88. 93, 145. 146

..(St) (power set), xii

Dai), 31

GL(d,R), 43
H,. (homothety), 37
Ij,,171,
K,.(xo) (closed ball), 146
K,.(xo) (open ball), 158
L'(Ad), 151, 123
L(), LP(), 86, 81
L3 (lower sum), 91

M(sd), 120
Mot(] d), 42

N,(f),Np(f),87,74
Q., Q.,, 135
S(i'k) (skewing transformation), 30
S,,(0) (euclidean sphere), 37
SL(d, R), 43
Ta (translation), 36. 149
T() (image measure), 36

T-'(sd), 3

(x,27
o(cia), 7
Odd

Qa

...

5:,

F, 32

e,, eZ (unit point mass, Dirac-measure), 8 154


(counting measure), 12, 13
Ad (d-dimensional L-B measure), 18,
26, 27
Ad (d-dimensional L-B measure on C),
27

(total mass), 14L 1.54


., 171

9 171
(principal measure), 176
O (essential measure), 176
A (restriction of ), 68
F, 30
- lim (stochastic limit), 113

U3 (upper sum), 91

-v,1.0Z
IKv,P Lv,1455

Sv-, 170

1 2, 1M
n 11+j, 143
v convolution of measures), 147

a(-s.P-measurable, 34

.di = e... dd, (product of


i=1

a-algebras), 132
.Vd (Borel a-algebra in Rd), 27

4'

= V(K), 49, 1553

.i(E) (Borel a-algebra in E), 152


(systems of closed, open,
compact subsets of Rd), 27

`6'd, Cd,

9,,, 206

jd

14

.,1E', 13. 171

.2'(), 6fi

**n,147

P(S), 4
P."W ), 171

e(x, y) (euclidean metric), 41

P+,r, M2
o'(8), Q(T), a(T1,... ,T,),

o(Ti:iEI),3 35 62

(Q, d) (measurable space), 34


(S1,.,, ) (measure space), 34

0 7-119,*,i), 1.44

S 1' 0 sd (trace of a-algebra), 2

Name Index

Alexandroff, Pavel Sergeevich (1896-1982), 167
Anonyme, U., 25
Aumann, Georg (1906-1980), 46
Banach, Stefan (1892-1945), 46
Bartle, R.G., 119
Bauer, Heinz, 130, 144, 187, 193, 215
Berberian, S.K., 141
Billingsley, P., 201, 213
Birkhoff, Garrett (1911-1996), 44
Borel, Emile (1871-1956), ix, 18
Bourbaki, Nicholas, 87, 157, 186, 187
Broughton, A., 5

Caratheodory, Constantin
(1873-1950), 20
Cauchy, Augustin Louis (1789-1857),
ix

Chatterji, S.D., 141


Choquet, G., 47
Christensen, J.P.R., 47
Cohn, D.L., 92, 157, 181
Courrège, P., 116
Dellacherie, C., 130
Dierolf, P., 43
Dieudonné, Jean (1906-1992), 18, 184

Dirichlet, Peter Gustav Lejeune (1805-1859), ix
Doob, J.L., 62
Dowker, Clifford Hugh (1912-1982),
181

Dynkin, E.B., 5

Edwards, Robert Edmund (1926-2000), 181

Fatou, Pierre (1878-1929), 80


Fischer, Ernst (1875-1956), 84
Fubini, Guido (1879-1943), 138
Gnedenko, Boris W. (1912-1995), 33
Goffman, C., 47
Hahn, Hans (1879-1934), 108, 141
Halmos, Paul R., 45, 141, 177, 184
Hausdorff, Felix (1868-1942), 46
Hawkins, T., 18
Helly, Eduard (1884-1943), 216
Henle, J., 39
Hewitt, Edwin (1920-1999), 46, 92,
144, 147, 181, 213
Holder, Otto (1859-1939), 75, 78
Huff, B.W., 5

Joichi, J.T., 119


Jordan, Camille (1838-1922), ix
Kelley, John Leroy (1916-1999), 17,
152, 158, 159, 166, 168, 201, 205

La Vallée Poussin, Charles de (1866-1962), 130
Lebesgue, Henri Léon (1875-1941), ix, 18

Levi, Beppo (1875-1961), 59


Lindelof, Ernst Leonhard (1870-1946),
160

Lusin, Nikolai Nikolaevich (1883-1950), 163
MacLane, Saunders, 44
Mattner, L., 141
Meyer, P.-A., 130
Minkowski, Hermann (1864-1909), 75, 83

Egorov, Dmitrii Fedorovich (1869-1931), 120

Nachbin, Leopoldo (1922-1993), 39


Neumann, John von (1903-1957), 46, 105
Nikodym, Otto Martin (1888-1974), 105

Novinger, W.P., 82

Overdijk, D.A., 5
Oxtoby, John Corning (1910-1991), 47

Parthasarathy, K.R., 201


Peano, Giuseppe (1858-1932), ix
Pedrick, G., 47
Pfeffer, W.F., 184
Prohorov, Yurii Vasil'evich, 213

Solovay, R.M., 45

Sorgenfrey, Robert Henry (1915-1996), 156
Srivastava, S.M., 47, 165
Steinhaus, Hugo (1887-1972), 162
Stieltjes, Thomas Jan (1856-1894), 32
Stromberg, Karl R. (1931-1994), 46, 92, 144, 162, 181, 213

Thiemann, J.G.F., 5
Tonelli, Leonida (1885-1946), 95, 138,
144

Tucker, H.G., 33
Ulam, Stanislaw Marcin (1909-1984), 47

Radon, Johann (1887-1956), 105, 155
Richter, H., 33
Riemann, Georg Friedrich Bernhard (1826-1866), ix
Riesz, Frigyes (1880-1956), 82, 84, 171
Robertson, J.B., 23
Rosenthal, Arthur (1887-1959), 141
Ross, Kenneth A., 46, 147
Rudin, Walter, 105, 147, 168
Saeki, S., 144
Schmidt, V., 43
Sierpinski, Waclaw (1882-1969), 5
Simons, F.H., 5

Urysohn, Pavel Samuilowich (1898-1924), 158
Varberg, D.E., 45
Wagon, S., 39, 46
Willard, S., 17, 152, 157-159, 166, 168,
201

Woo, J. Yam Ting, 105


Wright, D.G., 205

Yzeren, J. van, 95

Subject Index

fl-stable system, 7
U-stable system, 2
p-fold (p-)integrable, 7f
p-measure, 31
p-space, 34

pth-power integrable, 76

p -measurable, 20
a-additivity, 8
a-algebra, 2
a-algebra of Borel sets in Rd, 27

-R 42

- topological space, 152


a-algebra generated by mappings, 35

Cl-diffeomorphism, 44 111

F,-set, 152
Ga-set, 47 152, 157, 1.59

K,-set, 187

29-convergence, 72
."P-functions, 77
.`gyp-pseudometric, 79

2-semi-norm, 79
e-bound, 121

p-almost all points, 7Q


p-almost everywhere, 70

- continuous, 128
- defined measurable function, 73
p-boundaryless, 19-8
p-completion, 26
p-continuous measure, 99
p-essentially bounded, 78

p-integrable over a set, 62


p-integrable function, 64
p-integrable set, 68
p-integral of function, 55, 5$, 64

-- by a set, 3
a-compact, 181
a-finite content, 23

a-finite measure, 23 72 28
a-finite measure space, 34
a-ideal, 13, 100, 1117

a-ring, 177
-- generated, 177
absolutely continuous (see p-continuous)

absolute value of function, 52


additivity, finite, 8

-,a-,8

- , sub-, 9
Alexandroff compactification (see one-point compactification)
algebra, 1512 193

-,a-,2

- of sets, 4
almost everywhere, 70

- bounded,71

- over a set, 67, 62

- defined function, 73

p-negligible, 13
p-nullset, 13, 70
p-quadrable, 198
p-singular, 105
p-stochastically convergent, 113

- equal, 70
- finite, 74
analytic set, 47
antitone, xiii
strictly, xiii


approximation by discrete measures,

continuous with respect to a measure,

193, 194

of the identity, 193


property, 24

99

continuous part with respect to u, 105


convergence almost everywhere, 70
almost uniform, 120

Banach algebra, L51


Banach space, 87
basis, base (topological), 157
Bernoulli inequality, 75
Borel Q-algebra, 27, 152
Borel (measurable) function, 50 1523
Borel measure, 311, 153

- , mean square, 80

- , regular, 154, 158, 184

convex cone, 188


convolution of functions, 151
functions and measures, 150

locally-finite, 154
bounded, 147
Borel set in Rd, 26

in mean, 80

in eh mean, 71, 114 124


in measure, 113
- , stochastic, 113
- , vague, 189

, weak, 196, 2011

- measures, 147

- - topological space, 152

convolution power, 151

boundaryless, 198
bounded
Borel measure, 147

- root, 151

(z)-essentially, 78

Cantor discontinuum, 203


- , generalized 2113
carried by a set, 105
Cauchy criterion for stochastic convergence, 114
Cauchy sequence in Y P, 84 85
Cauchy-Schwarz inequality, 78
characteristic function, 44
charge distribution, 108
Chebyshev-Markov inequality, 112
Chebyshev inequality, 112
compatibility function, 207
complete measure, 26, 46
completeness of LP, 82
completion of a measure, 24 56
composition of functions, xii

- measurable mappings, 35
content, 8
content-problem, 411

continuity at
111
continuity from above, 10
- from below, 10
continuity lemma, 88

unit, 142
countable additivity (see Q-additivity)
countable at infinity, 181
countable and co-countable o'-algebra,
2

countable (neighborhood) base, 157


countable set, xii
counting measure, 12

density of a measure, 96

denseness of C, in 2, 186
denseness of discrete measures, 1194
diffeomorphism, 44, 111
difference
set-theoretic, xii

- , symmetric, 5, 14 24, 87
differentiation lemma, 82
Dirac function, 146
Dirac measure, 12. 154
Dirichlet jump function, 57, 92, 166
disjoint sets, xii
distribution function, 31, 201
dominated convergence theorem, 83
--- , sharpened version, 124
Doob's factorization lemma, 62
Dynkin system, 6
--- generated by ', 7


elementary content, d-dimensional, U.


16,27
elementary function, 53
envelope, lower, xiii

- , upper, xiii
equi-(h)-continuity, 128
equi-p-continuous, 131
equi-continuous at 0, 131
equi-integrable, 121 ff.
essential measure, 176
extension theorem, 19
Factorization lemma, 62
family, xii

Fatou's lemma, 81

- , dual version, 130


figure, d-dimensional, 14

finite (or bounded) Radon measure,


188

finite additivity, 8
finite Borel measure, 3 147, 154
finite signed measure, 101

finite-co-finite algebra, 4 8 U
Fubini's theorem, 13(1
function, additive, 59

- , antitone, xiii
- , integrable of order p, 76
- , isotone, xiii
- , Lebesgue integrable, 65
- , Lebesgue-Stieltjes integrable, 65
- , measurable, 34, 49
- , measure-generating, Q. 32
- , numerical, 49
- , positively homogeneous, 59

- , real, xii
- , Riemann integrable, 91, 92

- , step, 53
- , with compact support, 167
Gaussian integral, 88, 93
general linear group, 43
generator, of a a-algebra, 3
-- , of a product a-algebra, 132
Haar measure, 39, 107


Hahn decomposition, 108, 109


Hilbert space, 87
Holder inequality, 75

- - , generalized, 78
- , reversed, 72
homothety, 37

hull, measurable, 25
ideal, 5
-- of 1A-null sets (see a-ideal)
image measure, 3366 110

indicator function, 49
input-output formula, 13
integrable, 64
, equi-, 121

- , Lebesgue, 65
- , Lebesgue-Stieltjes, 65
-- quasi-, 64
- of order p, 76
- over a set, 69
integral of f exists, 65
integral over a set, 67
intervals in Rd, 14
isotone, xiii, 59 170
-- , strictly, xiii
isotoneity, 9
Jordan decomposition, 109
L-B measure, 27
L-B measure space, 34
L-B-nullset, 28, 29, 33
Lebesgue decomposition, 105
Lebesgue integrable function, 65
Lebesgue integral, 65
Lebesgue measure, 46
Lebesgue-Borel measure (see L-B
measure)
Lebesgue-Stieltjes integral, 65
Lebesgue's convergence theorem, 83

- , sharpened version, 124


Lebesgue's decomposition theorem,
105,143
left half-open interval, 29
left-continuous, 30
lemma of Doob, 62


- Fatou's, 81
- Urysohn's, 168

- - , reversed, 79

- Riemann-Lebesgue, 202


--- on differentiation of integrals, 89
linear form, 66, 68

- , positive (isotone), 66, 171


Lusin's theorem, 10
mapping, xii
mass distribution, 12, 108
measurable mapping, 34
measurable numerical function, 49
measurable sets, 34
measurable space, 34
measurable, Borel, 34, 103

- , Lebesgue, 46
with respect to an outer measure,
21)

negative part of a function, 53


- of a signed measure, 109
non-Borel set, 45 47
non-denumerable, xii
norm of uniform convergence, 169
normal representation, 54
nullset, 13

- , L-B, 28 20 33, 43
- , Lebesgue, 46
totally, 1119

number line, xi

- , compactified, xi
- , extended, xi

measure, 11
Borel, 31L 153

- , carried by a set, 105

one-point compactification, 167


outer measure, 20

finite, U
finite signed, 11)2
inner regular, 1.54

point, ideal, 106


point, infinitely remote, 167
point mass (see Dirac measure)
Polish space, 157, 208, 214
portmanteau-theorem, 197
positive part of a function, 53

L-B,27

motion, 41
motion group, 42
motion-invariance of ad, 42
motion-invariant content, 46
mutually singular (measures), 1.05

, Lebesgue, 46

- , locally finite, 153

-, of a set, 11
outer regular, 153
positive, 1519

- of a signed measure, 109

- , regular, 15.4
- , u-continuous, 99
- , a-finite, 23, 72, 98
- , signed, 102

positively-homogeneous function, 59
power set, xii, 2
pre-image, xii
premeasure, 8

with density, 96
measure-defining function, 311

measure-extension theorem, Q. 21
measure-generating function, Q. 32
measure space, 26. 34

- , a-finite, 34
metric of uniform convergence, 169
metrizability of locally compact spaces,

- , Lebesgue, 18
principal measure, 176
probability measure, 31
probability space, 34
product measure, 137, 143
product of measure spaces, 144

- of a-algebras, 132, 142


pseudometric, 79

208

- of vague topology, 208


Minkowski inequality,

70 83

Radon measure, 155


--- , bounded, 188


- , finite, 188
- , p-measure, 188
- , regularity of, 156, 161, 183
Radon-Nikodym density, 105
- integrand, 105
- theorem, 101

signed measure, 107


singular part of a measure, 105
singular, mutually, 103

- , to each other, 105


Sorgenfrey topology, 156

- integral, 91

Souslin (analytic) subset, 4Z'


space, locally compact, 186
- , Polish, 157, 208, 214
special linear group, 43
square-integrability, 76
Steinhaus' theorem, 163
step function, 53
Stieltjes integrable, 65
Stieltjes measure function, 32
stochastic convergence, 113
stochastic limit, 112
subadditivity, 9
subsequence principle, 118, 120
subtractivity, 9
support of a function, 167

- - , improper, 92

- of a measure, 173

Riemann-Lebesgue lemma, 202


Riesz representation theorem, 171,

supremum norm, 189


symmetric difference, , 14, 24 87

reflection-invariance, 37
regular, inner, 183

- , outer, 154, 181


regularity of Borel measures, 184

- of L-B measure, 162


relatively compact, vaguely, 204

- , weakly, 213
representing measure, 173

- , essential 176
- , principal, L76
restriction of p, 19, 68
restriction of f, xii
Riemann integrable, 91, 92


178, 185

right half-open interval, 14


ring (of sets), 4

- generated by intervals, 14

tensor product, 145


theorem of Caratheodory on outer
measures 21

section of a function, 138

- Egorov, 120
- Fubini, 139

- of a set, 135

-Helly,218

semi-norm, 79

- Lebesgue, 83, 103


- Lebesgue-Radon-Nikodym, 1125

- of convergence in pth mean, 79

- Levi, 59
- Lusin, 183

-,2'-,79

sequence, xii
sequentially compact, 213

- , relatively, 213
set, analytic, 47
- , Borel, 26, 49, 152, 172
- , difference, 183
- , Lebesgue measurable, 47
- , non-Borel, 45, 47
- of a-finite measure, 72, 175
- , (partially) ordered, xiii
- , quadrable, 198
Souslin, 47

- Prohorov, 21.3
Radon-Nikodym, 194
F. Riesz, 82

- Riesz-Fischer, 84
- Steinhaus, 163
Tonelli, 13$
theorem on dominated convergence 83

- monotone convergence, 59
- partitions of unity, 167
tight, 197, 213
topological basis (base), 157


-- , countable, 157, 2174 208


topology, right sided, 157
vague, 192
weak, 211
total mass, 147. 154, L41
in vague convergence, 191, 195,19%

in weak convergence, 19&


trace, 3

transformation theorem for general


integrals, 111
for Lebesgue integrals, 111
transitivity of image measures, 36
translation-invariance of Ad, 36

translation-invariant measure, 4 39
ultimately all, xii
uncountable, xii
uniformly integrable, 122
uniqueness theorem, 22
unit mass at w, 8. 12 (see also Dirac
measure)

Urysohn's lemma, 168

vague density of discrete measures,


193, 194

vague limit, 189


vague topology, 192
vaguely bounded, 204
vaguely compact, 204
vaguely convergent, 189

vanish at infinity, 10
vector space, 66,7,778
Z8
weak convergence, 196, 211
--- and distribution functions, 2(111
weak relative compactness, 213
weak topology, 201
Wiener measure, 216

zero-measure, 11

This book gives a straightforward introduction to the field as it is
nowadays required in many branches of analysis and especially in
probability theory. The first three chapters (Measure Theory,
Integration Theory, Product Measures) basically follow the clear
and approved exposition given in the author's earlier book on
'Probability Theory and Measure Theory'. Special emphasis is
laid on a complete discussion of the transformation of measures
and integration with respect to the product measure, convergence
theorems, parameter depending integrals, as well as the Radon-Nikodym
theorem.
The final chapter, essentially new and written in a clear and concise
style, deals with the theory of Radon measures on Polish or locally
compact spaces. With the main results being Luzin's theorem, the
Riesz representation theorem, the Portmanteau theorem, and a
characterization of locally compact spaces which are Polish, this
chapter is a true invitation to study topological measure theory.

The text addresses graduate students, who wish to learn the
fundamentals in measure and integration theory as needed in
modern analysis and probability theory. It will also be an important
source for anyone teaching such a course.