Measure theory class notes - key concepts

Measure theory class notes - 2 August 2010, class 1
Introduction
We would like to define length for subsets of R. Is it possible to do this for all subsets of R? We
want length to have the following properties:
The length of any interval $(a, b]$ (with $a \le b$) is $b - a$.
The length of an at most countable disjoint union of sets is the sum of the individual lengths.
Unfortunately we cannot define a length function on $2^{\mathbb{R}}$ satisfying these properties (proof omitted).
So we do the best that we can.
By default, intervals will mean intervals in $\mathbb{R}$, and $(a, \infty]$ will mean $(a, \infty)$, $[-\infty, a)$ will mean $(-\infty, a)$, and so on. If we want to talk of intervals in $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$, we will use a $\bar{\mathbb{R}}$ subscript, like $[0, \infty]_{\bar{\mathbb{R}}} = [0, \infty) \cup \{\infty\}$.
Intervals, semi-fields
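Let $\mathcal{S}$ be the collection of all intervals $(a, b]$, where $-\infty \le a \le b \le \infty$. This is a possible starting point for defining lengths. What properties does it have?
Definition. Let $\Omega$ be a set. A semi-field over $\Omega$ is a subset $\mathcal{S}$ of $2^{\Omega}$ satisfying the following:
- $\Omega \in \mathcal{S}$
- $A, B \in \mathcal{S} \implies A \cap B \in \mathcal{S}$
- $A \in \mathcal{S} \implies \Omega \setminus A$ is a finite disjoint union of elements of $\mathcal{S}$
$\mathcal{S}$ as defined above is a semi-field over $\mathbb{R}$. In fact, the complement of any element of $\mathcal{S}$ can be expressed as a disjoint union of at most two elements of $\mathcal{S}$.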
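The two closure properties can be checked concretely for intervals. The following is a minimal sketch, assuming an interval $(a, b]$ is stored as a pair `(a, b)` and that `math.inf` stands for $\infty$; the function names are illustrative and not part of the notes.

```python
import math

INF = math.inf

def intersect(i, j):
    """Intersection of (a, b] and (c, d]; the result is again of the form (lo, hi]."""
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    # An empty intersection is normalised to the degenerate interval (0, 0].
    return (lo, hi) if lo < hi else (0.0, 0.0)

def complement(i):
    """R minus (a, b], as a list of at most two disjoint intervals of the same type."""
    a, b = i
    if a >= b:                      # the empty interval: its complement is all of R
        return [(-INF, INF)]
    pieces = []
    if a > -INF:
        pieces.append((-INF, a))    # (-inf, a]
    if b < INF:
        pieces.append((b, INF))     # (b, inf], read as (b, inf)
    return pieces
```

For example, `complement((0, 1))` returns the two pieces `(-inf, 0]` and `(1, inf)`, while the complement of `(-inf, inf)` is the empty list, that is, an empty union.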
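Definition. A measure on a semi-field $\mathcal{S}$ over $\Omega$ is a function $\mu : \mathcal{S} \to [0, \infty]_{\bar{\mathbb{R}}}$ such that
- $\mu(\emptyset) = 0$
- If $\mathcal{A} \subseteq \mathcal{S}$ is an at most countable collection of disjoint sets, and $\bigcup_{A \in \mathcal{A}} A \in \mathcal{S}$, then
\[ \mu\Big(\bigcup_{A \in \mathcal{A}} A\Big) = \sum_{A \in \mathcal{A}} \mu(A) \]
This property is known as countable additivity (on the semi-field).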

Theorem. The natural length function on the semi-field of intervals S is a measure.
(proof omitted)
As another example, consider an $F : \mathbb{R} \to [0, 1]$ which is right-continuous and nondecreasing. Take the same semi-field of intervals, but define the length of $(a, b]$ as $F(b) - F(a)$. This also gives a measure (proof omitted).
The first extension : fields

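Definition. Let $\Omega$ be a set. A field over $\Omega$ is a subset $\mathcal{F}$ of $2^{\Omega}$ satisfying the following:
- $\Omega \in \mathcal{F}$
- $A, B \in \mathcal{F} \implies A \cap B \in \mathcal{F}$
- $A \in \mathcal{F} \implies \Omega \setminus A \in \mathcal{F}$
Theorem. Let $\mathcal{S}$ be a semi-field over $\Omega$. Let $\mathcal{F}$ be the collection of all finite disjoint unions of elements of $\mathcal{S}$. Then $\mathcal{F}$ is a field over $\Omega$. This is known as the field generated by $\mathcal{S}$.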
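Proof. By definition of a semi-field, $\Omega \in \mathcal{S}$. So $\Omega \in \mathcal{F}$. Let $A, B \in \mathcal{F}$. Then we have $\mathcal{A} \subseteq \mathcal{S}$, a finite collection of disjoint sets such that $A = \bigcup \mathcal{A}$. Similarly we have $\mathcal{B} \subseteq \mathcal{S}$, a finite collection of disjoint sets such that $B = \bigcup \mathcal{B}$. By the distributive law,
\[ A \cap B = \bigcup_{A' \in \mathcal{A},\ B' \in \mathcal{B}} (A' \cap B') \]
Since $\mathcal{A}$ and $\mathcal{B}$ are each a collection of disjoint sets, so is $\{A' \cap B' : A' \in \mathcal{A}, B' \in \mathcal{B}\}$. By definition of a semi-field, each of these sets also belongs to $\mathcal{S}$. $A \cap B$ is a finite disjoint union of elements of $\mathcal{S}$, and so belongs to $\mathcal{F}$.
Let $A \in \mathcal{F}$. We want to show that $\Omega \setminus A \in \mathcal{F}$. As above we have $\mathcal{A} \subseteq \mathcal{S}$, a finite collection of disjoint sets whose union is $A$. By De Morgan's laws,
\[ \Omega \setminus A = \bigcap_{A' \in \mathcal{A}} (\Omega \setminus A') \]
By the definition of a semi-field, each $\Omega \setminus A'$ is a finite disjoint union of elements of $\mathcal{S}$, and so belongs to $\mathcal{F}$. By repeated application of the previous part, a finite intersection of elements of $\mathcal{F}$ belongs to $\mathcal{F}$. So $\Omega \setminus A \in \mathcal{F}$.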
Note that by De Morgan's laws, since a field is closed under complementation and finite intersections, it is also closed under finite unions.
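Theorem. Suppose $\mu$ is a measure on a semi-field $\mathcal{S}$ on the set $\Omega$. Let $\mathcal{F}$ be the field generated by $\mathcal{S}$. Define $\bar{\mu} : \mathcal{F} \to [0, \infty]_{\bar{\mathbb{R}}}$ as follows: for $A \in \mathcal{F}$, write $A = \bigcup \mathcal{A}$ where $\mathcal{A}$ is a finite collection of disjoint elements of $\mathcal{S}$. Define
\[ \bar{\mu}(A) = \sum_{A' \in \mathcal{A}} \mu(A') \]
$\bar{\mu}$ is well-defined and agrees with $\mu$ on $\mathcal{S}$. Also, $\bar{\mu}$ is countably additive on the field: if $\mathcal{B} \subseteq \mathcal{F}$ is a countable collection of disjoint sets whose union belongs to $\mathcal{F}$, then
\[ \bar{\mu}\Big(\bigcup_{B \in \mathcal{B}} B\Big) = \sum_{B \in \mathcal{B}} \bar{\mu}(B) \]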
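Proof. Suppose we have two ways of expressing $A$ as a finite disjoint union of elements of $\mathcal{S}$, $A = \bigcup \mathcal{A} = \bigcup \mathcal{B}$. For each $A' \in \mathcal{A}$, we have (since $A' \subseteq \bigcup \mathcal{B}$)
\[ A' = \bigcup_{B' \in \mathcal{B}} (A' \cap B') \]
The sets on the RHS are disjoint and all in $\mathcal{S}$, so
\[ \mu(A') = \sum_{B' \in \mathcal{B}} \mu(A' \cap B') \]
Similarly, for each $B' \in \mathcal{B}$,
\[ \mu(B') = \sum_{A' \in \mathcal{A}} \mu(A' \cap B') \]
So
\[ \sum_{A' \in \mathcal{A}} \mu(A') = \sum_{A' \in \mathcal{A}} \sum_{B' \in \mathcal{B}} \mu(A' \cap B') = \sum_{B' \in \mathcal{B}} \sum_{A' \in \mathcal{A}} \mu(A' \cap B') = \sum_{B' \in \mathcal{B}} \mu(B') \]
This shows that $\bar{\mu}$ is well-defined. Clearly $\bar{\mu}$ agrees with $\mu$ on $\mathcal{S}$.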
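We now show countable additivity. Let $\mathcal{A} \subseteq \mathcal{F}$ be an at most countable collection of disjoint sets whose union $A$ belongs to $\mathcal{F}$.
\[ A = \bigcup_{C \in \mathcal{A}} C \]
For each $C \in \mathcal{A}$, we have $\mathcal{B}_C$, a finite collection of disjoint elements of $\mathcal{S}$ whose union is $C$. Also, we have $\mathcal{B}$, a finite collection of disjoint elements of $\mathcal{S}$ whose union is $A$.
\[ \bar{\mu}(A) = \sum_{D \in \mathcal{B}} \mu(D), \qquad \bar{\mu}(C) = \sum_{E \in \mathcal{B}_C} \mu(E) \quad \text{(for all } C \in \mathcal{A}\text{)} \]
Take any particular $D \in \mathcal{B}$: it is a subset of $A = \bigcup_{C \in \mathcal{A}} \bigcup_{E \in \mathcal{B}_C} E$ (and all these $E$'s are disjoint), so $D$ is the disjoint union of the collection
\[ \{D \cap E : E \in \mathcal{B}_C,\ C \in \mathcal{A}\} \]
All these sets are in $\mathcal{S}$ and $\mu$ is countably additive on $\mathcal{S}$, so
\[ \mu(D) = \sum_{C \in \mathcal{A},\ E \in \mathcal{B}_C} \mu(D \cap E) \]
Hence
\[ \bar{\mu}(A) = \sum_{D \in \mathcal{B}} \mu(D) = \sum_{D \in \mathcal{B}} \sum_{C \in \mathcal{A},\ E \in \mathcal{B}_C} \mu(D \cap E) \]
\[ = \sum_{C \in \mathcal{A}} \sum_{E \in \mathcal{B}_C} \sum_{D \in \mathcal{B}} \mu(D \cap E) \quad \text{(nonnegative values, can rearrange)} \]
\[ = \sum_{C \in \mathcal{A}} \sum_{E \in \mathcal{B}_C} \mu(E) \quad \text{(similar argument as before, since } E \subseteq \bigcup \mathcal{B} = A\text{)} \]
\[ = \sum_{C \in \mathcal{A}} \bar{\mu}(C) \]
This completes the proof.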
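Arithmetic in $\bar{\mathbb{R}}$
Recall that $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. We fix some conventions for arithmetic in $\bar{\mathbb{R}}$:
- $-(\infty) = -\infty$, $-(-\infty) = \infty$
- $\infty + a = a + \infty = \infty$, for $a \in \bar{\mathbb{R}} \setminus \{-\infty\}$
- $(-\infty) + a = a + (-\infty) = -\infty$, for $a \in \bar{\mathbb{R}} \setminus \{\infty\}$
- For $a \in \bar{\mathbb{R}}$,
\[ \infty \cdot a = a \cdot \infty = \begin{cases} \infty & \text{if } a > 0 \\ 0 & \text{if } a = 0 \\ -\infty & \text{if } a < 0 \end{cases} \]
- A similar rule for $(-\infty) \cdot a$
Other operations are not defined: in particular, $\infty - \infty$ is not defined.
$\infty \cdot 0 = 0$ may look suspicious but is convenient for our purposes. For example, when we define integration, we'll want the area of a rectangle with width $\infty$ and height 0 to be 0.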
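These conventions differ from IEEE floating-point arithmetic, where multiplying infinity by zero gives NaN. A minimal sketch of the conventions above, assuming Python floats with `math.inf` as a stand-in for $\infty$ (the helper names `ext_mul` and `ext_add` are illustrative only):

```python
import math

INF = math.inf

def ext_mul(a, b):
    """Product in the extended reals with the convention inf * 0 = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b

def ext_add(a, b):
    """Sum in the extended reals; inf + (-inf) is deliberately left undefined."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("inf + (-inf) is not defined")
    return a + b
```

With these helpers, `ext_mul(INF, 0)` is 0 (the "rectangle of infinite width and zero height"), whereas the plain float product `INF * 0` is NaN.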
Countable additivity for the semi-field of intervals

Recall our semi-field of intervals,
\[ \mathcal{S} = \{(a, b] : -\infty \le a \le b \le \infty\} \]
In the last class we stated but did not prove that the natural length function on $\mathcal{S}$ is a measure (particularly, that it is countably additive). We prove that now. This is harder than it looks, since an interval can be expressed as a countable disjoint union of other intervals in all kinds of convoluted ways.
In fact, we prove something more. Fix an $F : \mathbb{R} \to \mathbb{R}$ which is right-continuous and non-decreasing. Extend to $F : \bar{\mathbb{R}} \to \bar{\mathbb{R}}$ by taking limits:
\[ F(\infty) = \sup\{F(x) : x \in \mathbb{R}\}, \qquad F(-\infty) = \inf\{F(x) : x \in \mathbb{R}\} \]
Define $\mu : \mathcal{S} \to [0, \infty]_{\bar{\mathbb{R}}}$ by
\[ \mu((a, b]) = F(b) - F(a) \quad \text{(see footnote 1)} \]
We will show that $\mu$ is a measure on $\mathcal{S}$. First we prove some lemmas. Note that these lemmas do not depend on $F$ being right-continuous, but do need $F$ to be non-decreasing.
We have already shown that if $\mu$ is a measure on $\mathcal{S}$, then its extension to the field generated by $\mathcal{S}$ is also a measure.
Footnote 1: If we represent $\emptyset \in \mathcal{S}$ as $(\infty, \infty]$, and $F(\infty) = \infty$, this definition will not be meaningful. In such cases, we avoid $(\infty, \infty]$ and $(-\infty, -\infty]$, and represent $\emptyset$ by $(a, a]$ for any $a \in \mathbb{R}$.
Lemma 1. Let $(a_1, b_1], \ldots, (a_k, b_k]$ be disjoint, and all a subset of $(a, b]$, where each $a_i, b_i$ and $a, b \in \bar{\mathbb{R}}$. Then
\[ \sum_{i=1}^{k} \mu((a_i, b_i]) \le \mu((a, b]) \]
Proof. If $\mu((a, b]) = \infty$ the inequality holds trivially. Assume $\mu((a, b]) \in \mathbb{R}$. This means that $F(a), F(b) \in \mathbb{R}$ and so (by monotonicity of $F$) each $F(a_i), F(b_i) \in \mathbb{R}$.
Without loss of generality we can remove empty intervals from consideration and assume that each $a_i < b_i$ and $a < b$. Also, we may reorder the intervals so that $a_1 \le a_2 \le \ldots \le a_k$. Since the intervals are disjoint and are all a subset of $(a, b]$, we have that
\[ a \le a_1 < b_1 \le a_2 < b_2 \le \ldots \le a_k < b_k \le b \]
Since $F$ is non-decreasing,
\[ F(a) \le F(a_1) \le F(b_1) \le F(a_2) \le F(b_2) \le \ldots \le F(a_k) \le F(b_k) \le F(b) \]
We have the telescoping sum
\[ F(b) - F(a) = (F(b) - F(b_k)) + (F(b_k) - F(a_k)) + \ldots + (F(b_2) - F(a_2)) + (F(a_2) - F(b_1)) + (F(b_1) - F(a_1)) + (F(a_1) - F(a)) \]
Dropping some of the terms from the RHS (all of which are nonnegative) we get
\[ F(b) - F(a) \ge \sum_{i=1}^{k} (F(b_i) - F(a_i)) \]
Lemma 2. Let $(a_1, b_1), \ldots, (a_k, b_k)$ be intervals whose union includes $[a, b]$, where each $a_i < b_i$, $a < b$, and $a, b, a_i, b_i \in \mathbb{R}$. Then
\[ \sum_{i=1}^{k} (F(b_i) - F(a_i)) \ge F(b) - F(a) \]
Proof. By induction. The $k = 1$ case follows since $F$ is non-decreasing. Suppose the result is true for $k = n$, and we want to prove it for $k = n + 1$. Consider intervals $(a_1, b_1), \ldots, (a_{n+1}, b_{n+1})$ whose union covers $[a, b]$. We may reorder the intervals such that $b \in (a_{n+1}, b_{n+1})$. If $a_{n+1} \le a$, then
\[ \sum_{i=1}^{n+1} (F(b_i) - F(a_i)) \ge F(b_{n+1}) - F(a_{n+1}) \ge F(b) - F(a) \]
So assume $a_{n+1} > a$. The intervals $(a_1, b_1), \ldots, (a_n, b_n)$ cover $[a, a_{n+1}]$, so by the induction hypothesis
\[ \sum_{i=1}^{n} (F(b_i) - F(a_i)) \ge F(a_{n+1}) - F(a) \]
Also,
\[ F(b_{n+1}) - F(a_{n+1}) \ge F(b) - F(a_{n+1}) \]
All quantities involved are real numbers, so we can add the above inequalities and cancel $F(a_{n+1})$:
\[ \sum_{i=1}^{n+1} (F(b_i) - F(a_i)) \ge F(b) - F(a) \]
The proof is completed in the next class.
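Countable additivity for the semi-field of intervals (contd)

We were proving that our $\mu : \mathcal{S} \to [0, \infty]_{\bar{\mathbb{R}}}$ is a measure. We prove another lemma from analysis that helps us to deal with infinite intervals.
Lemma 3. For $i, n \in \mathbb{N}$, let $\alpha_{i,n}$ be in $[0, \infty)$. Suppose for each $i$, $\alpha_{i,n} \uparrow \alpha_i$ as $n \to \infty$. Then
\[ \sum_{i \in \mathbb{N}} \alpha_{i,n} \uparrow \sum_{i \in \mathbb{N}} \alpha_i \quad \text{as } n \to \infty \]
Proof. Note that any $\alpha_i$ or $\sum_{i \in \mathbb{N}} \alpha_i$ could be $\infty$.
It is clear that as a sequence in $n$, $\sum_{i \in \mathbb{N}} \alpha_{i,n}$ is non-decreasing. Also, since $\alpha_{i,n} \le \alpha_i$,
\[ \sum_{i \in \mathbb{N}} \alpha_{i,n} \le \sum_{i \in \mathbb{N}} \alpha_i \]
So to show the result, it suffices to show that for all $L \in \mathbb{R}$ such that $L < \sum_{i \in \mathbb{N}} \alpha_i$, there is an $N \in \mathbb{N}$ such that
\[ \sum_{i \in \mathbb{N}} \alpha_{i,N} > L \]
Note that this holds whether $\sum_{i \in \mathbb{N}} \alpha_i$ is finite or not.
Let $L$ be given, $L < \sum_{i \in \mathbb{N}} \alpha_i$. We have a $k$ such that $\sum_{i=1}^{k} \alpha_i > L$. Since limits commute with finite sums, we have
\[ \sum_{i=1}^{k} \alpha_{i,n} \uparrow \sum_{i=1}^{k} \alpha_i \quad \text{as } n \to \infty \]
Since $L < \lim_{n} \sum_{i=1}^{k} \alpha_{i,n}$, there is $N \in \mathbb{N}$ such that
\[ \sum_{i=1}^{k} \alpha_{i,N} > L \]
Hence
\[ \sum_{i \in \mathbb{N}} \alpha_{i,N} > L \]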
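Theorem. $\mu$ as defined above is a measure on $\mathcal{S}$.

Proof. It is clear that $\mu(\emptyset) = 0$, so we basically have to show countable additivity.
Let $(a, b] \in \mathcal{S}$ be written as a countable disjoint union of the collection $\{(a_i, b_i]\}_{i \in \mathbb{N}}$. We have by Lemma 1 that for all $k \in \mathbb{N}$
\[ \sum_{i=1}^{k} \mu((a_i, b_i]) \le \mu((a, b]) \]
So
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) \le \mu((a, b]) \]
We now show the reverse inequality, and then we are done.
Case 1: $a, b \in \mathbb{R}$. This implies that all $a_i, b_i \in \mathbb{R}$. It is sufficient to show that for all $\varepsilon > 0$, the following holds:
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) + 2\varepsilon \ge \mu((a, b]) \]
Let $\varepsilon > 0$ be given. We shrink $(a, b]$ slightly: by right-continuity of $F$, choose an $a'$ such that $a < a' < b$ and $F(a') < F(a) + \varepsilon$. We expand each $(a_i, b_i]$ slightly: for each $i$ choose a $b_i' > b_i$ such that $F(b_i') < F(b_i) + \varepsilon / 2^i$.
The compact interval $[a', b]$ is covered by the collection of open intervals $\{(a_i, b_i')\}_{i \in \mathbb{N}}$. By compactness, there is a finite subset $J \subseteq \mathbb{N}$ such that $[a', b]$ is covered by $\{(a_i, b_i')\}_{i \in J}$. Lemma 2 applies and we have
\[ \sum_{i \in J} (F(b_i') - F(a_i)) \ge F(b) - F(a') \]
So
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) + \varepsilon = \sum_{i \in \mathbb{N}} \Big(F(b_i) - F(a_i) + \frac{\varepsilon}{2^i}\Big) \ge \sum_{i \in \mathbb{N}} (F(b_i') - F(a_i)) \ge \sum_{i \in J} (F(b_i') - F(a_i)) \]
\[ \ge F(b) - F(a') \ge F(b) - F(a) - \varepsilon = \mu((a, b]) - \varepsilon \]
Since this holds for all $\varepsilon > 0$, we have
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) \ge \mu((a, b]) \]
As we have already shown the reverse inequality,
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) = \mu((a, b]) \]
Case 2: $a \in \mathbb{R}$ and $b = \infty$. For any integer $n > a$, let us intersect all our intervals with $(-\infty, n]$. $\{(a_i, b_i] \cap (-\infty, n]\}_{i \in \mathbb{N}}$ are disjoint sets whose union is $(a, b] \cap (-\infty, n] = (a, n]$. For any interval $(c, d]$, we have
\[ \mu((c, d] \cap (-\infty, n]) \uparrow \mu((c, d]) \quad \text{as } n \to \infty \]
(if $d \in \mathbb{R}$ we can take $n > d$, otherwise use the definition of $F(\infty)$). $n, a \in \mathbb{R}$, so by case 1
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i] \cap (-\infty, n]) = \mu((a, n]) \]
As $n \to \infty$, by Lemma 3, the LHS tends to $\sum_{i \in \mathbb{N}} \mu((a_i, b_i])$. The RHS tends to $\mu((a, b])$. So
\[ \sum_{i \in \mathbb{N}} \mu((a_i, b_i]) = \mu((a, b]) \]
Case 3: $a = -\infty$. Intersect with $(-n, \infty)$ and use a similar argument as before, using case 1 or case 2 as applicable depending on whether $b \in \mathbb{R}$ or $b = \infty$.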
The final extension : $\sigma$-fields

We have now defined lengths for the field of finite disjoint unions of intervals. We now extend this to a bigger and more useful class:
Definition. Let $\Omega$ be a set. A $\sigma$-field over $\Omega$ is a subset $\mathcal{A}$ of $2^{\Omega}$ which satisfies the following:
- $\Omega \in \mathcal{A}$
- If $\mathcal{B} \subseteq \mathcal{A}$ is countable, then $\bigcup_{B \in \mathcal{B}} B \in \mathcal{A}$
- If $A \in \mathcal{A}$, then $\Omega \setminus A \in \mathcal{A}$.
Since $\sigma$-fields are closed under complements and countable unions, they are also closed under countable intersections.
Here are some examples of $\sigma$-fields. Let $\Omega$ be any set. Each of the following is a $\sigma$-field over $\Omega$:
- $\{\emptyset, \Omega\}$
- For a fixed $A \subseteq \Omega$, $\{\emptyset, A, \Omega \setminus A, \Omega\}$
- $\{A \subseteq \Omega : A \text{ or } \Omega \setminus A \text{ is countable}\}$
The only nontrivial thing to verify is that the last set is closed under countable unions. A countable union of countable sets is countable, and a countable (or even arbitrary) union of sets at least one of which has a countable complement has a countable complement.
Just like we can talk of the subgroup generated by some elements of a group, the vector subspace spanned by some elements of a vector space, and so on, we can do the same for $\sigma$-fields: It is easy to see that an arbitrary intersection of $\sigma$-fields over $\Omega$ is a $\sigma$-field. $2^{\Omega}$ is a $\sigma$-field over $\Omega$.
Definition. For any $\mathcal{B} \subseteq 2^{\Omega}$, the $\sigma$-field generated by $\mathcal{B}$, denoted by $\sigma(\mathcal{B})$, is the intersection of all $\sigma$-fields over $\Omega$ which have $\mathcal{B}$ as a subset.
Clearly $\sigma(\mathcal{B})$ is a $\sigma$-field, and it is the smallest $\sigma$-field which is a superset of $\mathcal{B}$. More precisely, if $\mathcal{C}$ is any $\sigma$-field over $\Omega$ with $\mathcal{B} \subseteq \mathcal{C}$, then $\mathcal{B} \subseteq \sigma(\mathcal{B}) \subseteq \mathcal{C}$.
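When $\Omega$ is finite, the generated $\sigma$-field can be computed by brute force, since countable unions reduce to finite ones. The following is a rough sketch under that assumption; the function name `generated_sigma_field` is illustrative, and the closure loop is one simple way to compute the intersection described above, not the only one.

```python
from itertools import combinations

def generated_sigma_field(omega, generators):
    """Smallest family containing the generators that is closed under
    complement and union (finite unions suffice because omega is finite)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:                      # close under complements
            if omega - a not in family:
                family.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):  # close under pairwise unions
            if a | b not in family:
                family.add(a | b)
                changed = True
    return family
```

For instance, on $\Omega = \{1, 2, 3, 4\}$ with generators $\{1\}$ and $\{2\}$, the result consists of all unions of the three "atoms" $\{1\}$, $\{2\}$, $\{3, 4\}$; the set $\{3\}$ is not in it.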
More on $\sigma$-fields
Recall our field of finite disjoint unions of intervals of the type $(a, b]$, denoted by $\mathcal{F}$. Let $\mathcal{B} = \sigma(\mathcal{F})$ be the $\sigma$-field generated by $\mathcal{F}$. This is known as the Borel $\sigma$-field on $\mathbb{R}$, and sets in it are called Borel sets. Any subset of $\mathbb{R}$ one can think of is likely to belong to $\mathcal{B}$; it is quite hard to come up with a set not in $\mathcal{B}$, though such sets exist.
One can also define $\mathcal{B}$ as the smallest $\sigma$-field containing all intervals in $\mathbb{R}$. For example, for $a, b \in \mathbb{R}$, $(a, b) \in \mathcal{B}$ because
\[ (a, b) = \bigcup_{n \in \mathbb{N}} \Big(a, b - \frac{1}{n}\Big] \]
Similarly it can be shown that all intervals belong to $\mathcal{B}$, and so $\mathcal{B}$ is also the $\sigma$-field generated by all intervals.
Here are some examples of Borel sets:
- Any singleton, $\{a\}$: $\{a\} = \bigcap_{n \in \mathbb{N}} \big(a - \frac{1}{n}, a\big]$
- Any countable set, say $\mathbb{Q}$. A countable set is a countable union of singletons.
- The set of transcendental numbers. Its complement, the set of algebraic numbers, is countable.
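Definition. Let $\Omega$ be a set, and $\mathcal{A}$ a $\sigma$-field on it. A measure on $(\Omega, \mathcal{A})$ is a function
\[ \mu : \mathcal{A} \to [0, \infty]_{\bar{\mathbb{R}}} \]
such that
- $\mu(\emptyset) = 0$
- If $\mathcal{C} \subseteq \mathcal{A}$ is a countable collection of disjoint sets, then
\[ \mu\Big(\bigcup_{C \in \mathcal{C}} C\Big) = \sum_{C \in \mathcal{C}} \mu(C) \]
This property is called countable additivity.
Note that unlike before we do not have to put a separate condition for $\bigcup \mathcal{C} \in \mathcal{A}$; this follows from the definition of a $\sigma$-field.
Definition. $\mu$ is called a probability measure if $\mu(\Omega) = 1$.
Definition. $\mu$ is called a finite measure if $\mu(\Omega) < \infty$.
Definition. $\mu$ is called a $\sigma$-finite measure if there exists a countable $\mathcal{C} \subseteq \mathcal{A}$ such that $\bigcup \mathcal{C} = \Omega$ and $\mu(C) < \infty$ for all $C \in \mathcal{C}$. In other words, $\mu$ is $\sigma$-finite if $\Omega$ can be written as a union of countably many sets of finite measure.
In the above definition, the sets in $\mathcal{C}$ can be made disjoint if needed: enumerate them as $C_1, C_2, \ldots$, and define
\[ D_i = C_i \setminus \Big[\bigcup_{1 \le j < i} C_j\Big] \]
$D_i$ and $D_j$ are disjoint for $i \ne j$, and $\bigcup_{i \in \mathbb{N}} D_i = \Omega$.
\[ \mu(C_i) = \mu(D_i) + \mu(C_i \setminus D_i) \]
So $\mu(D_i) \le \mu(C_i) < \infty$.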
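The disjointification step $D_i = C_i \setminus \bigcup_{j < i} C_j$ is easy to carry out on a finite list of finite sets. A minimal sketch (the function name `disjointify` is illustrative, and a finite list stands in for the countable enumeration):

```python
def disjointify(sets):
    """Given C_1, C_2, ..., return D_i = C_i minus the union of the earlier C_j."""
    seen = set()          # running union of C_1, ..., C_{i-1}
    result = []
    for c in sets:
        c = set(c)
        result.append(c - seen)
        seen |= c
    return result
```

For example, `disjointify([{1, 2}, {2, 3}, {1, 3, 4}])` gives `[{1, 2}, {3}, {4}]`: the pieces are pairwise disjoint, each $D_i$ is a subset of $C_i$, and the overall union is unchanged.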
Extending the measure from the field to the $\sigma$-field

We will prove a major result which allows us to extend a measure from a field to a $\sigma$-field uniquely.
Theorem (Carathéodory extension theorem). Let $\Omega$ be a set and $\mathcal{F}$ a field on it. Let $\mu$ be a measure on $\mathcal{F}$. Assume $\mu$ is $\sigma$-finite, that is, $\Omega$ can be expressed as a countable union of sets in $\mathcal{F}$ with finite measure (see footnote 1). Then there is a unique measure $\tilde{\mu}$ on $\sigma(\mathcal{F})$ such that $\mu$ and $\tilde{\mu}$ agree on $\mathcal{F}$. Clearly $\tilde{\mu}$ is $\sigma$-finite.
(Proof done later.)
Corollary. Let $F : \mathbb{R} \to \mathbb{R}$ be non-decreasing and right-continuous. Define $F$ at $\infty$ and $-\infty$ as before by taking limits. There is a unique measure $\mu$ on the Borel $\sigma$-field of $\mathbb{R}$ such that for any $a < b$ in $\bar{\mathbb{R}}$,
\[ \mu((a, b]) = F(b) - F(a) \]
Definition. If we take $F$ to be the identity function in the above, the measure so obtained is called the Lebesgue measure.
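Lemma. Let $\Omega$ be a set, $\mathcal{A}$ a field on it, and $\mu$ a measure on $(\Omega, \mathcal{A})$. In what follows below, all sets are from $\mathcal{A}$:
1. If $A \subseteq B$, then $\mu(A) \le \mu(B)$.
2. If $\{A_n\}_{n=1}^{\infty}$ is a sequence of sets increasing to $A$, then $\mu(A_n) \uparrow \mu(A)$ as $n \to \infty$.
3. If $\{A_n\}_{n=1}^{\infty}$ is a sequence of sets decreasing to $A$ and $\mu(A_1) < \infty$, then $\mu(A_n) \downarrow \mu(A)$ as $n \to \infty$.
(Note that we have to require that $A \in \mathcal{A}$. If $\mathcal{A}$ is a $\sigma$-field, then this follows from each $A_n$ being in $\mathcal{A}$.)
Proof.
1. $\mu(B) = \mu(A) + \mu(B \setminus A)$. So $\mu(B) \ge \mu(A)$.
2. By the above, $\{\mu(A_n)\}_{n=1}^{\infty}$ is an increasing sequence. Let $A_0 = \emptyset$. For $n \in \mathbb{N}$, let $B_n = A_n \setminus A_{n-1}$. Then $A$ is the disjoint union of $\{B_n\}_{n=1}^{\infty}$, and so
\[ \mu(A) = \sum_{i \in \mathbb{N}} \mu(B_i) \]
$\bigcup_{i=1}^{k} B_i = A_k$, so $\sum_{i=1}^{k} \mu(B_i) = \mu(A_k)$. Taking limits, $\lim_{k \to \infty} \mu(A_k) = \mu(A)$.
3. $\{\mu(A_n)\}_{n=1}^{\infty}$ is a decreasing sequence. $A_1 \setminus A$ is a disjoint union of the collection $\{A_i \setminus A_{i+1} : i \in \mathbb{N}\}$. So
\[ \mu(A_1 \setminus A) = \sum_{i \in \mathbb{N}} \mu(A_i \setminus A_{i+1}) \]
Since $\mu(A_1 \setminus A) < \infty$, the tail sums of the above converge to 0.
\[ \sum_{i=n}^{\infty} \mu(A_i \setminus A_{i+1}) = \mu(A_n \setminus A) \]
So $\lim_{n \to \infty} \mu(A_n \setminus A) = 0$. Adding $\mu(A)$ to both sides, $\lim_{n \to \infty} \mu(A_n) = \mu(A)$.
The third part of this lemma needs $\mu(A_n)$ to be finite for some $n$. For example, consider $(\mathbb{R}, \mathcal{F}, \mu)$, where $\mu((a, b]) = b - a$. $\mu((n, \infty)) = \infty$ for all $n$. These sets decrease to the empty set, but $\mu(\emptyset) = 0$.
Footnote 1 (to the statement of the Carathéodory extension theorem): we need to state the definition of $\sigma$-finite there, since the earlier definition applies only to measures on $\sigma$-fields.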
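We now proceed towards the proof of the Carathéodory extension theorem by showing that proving it for finite measures suffices.
Lemma. Suppose we have proven the Carathéodory extension theorem for finite measures. Then the theorem for general measures follows.
Proof. Let $\Omega, \mathcal{F}, \mu$ be as in the statement of the theorem, with $\mu$ $\sigma$-finite. Take a collection of sets of $\mathcal{F}$, $\{\Omega_i\}_{i \in \mathbb{N}}$, such that $\bigcup_{i \in \mathbb{N}} \Omega_i = \Omega$ and each $\mu(\Omega_i) < \infty$. As observed before, we may assume that $\{\Omega_i\}_{i \in \mathbb{N}}$ is a collection of disjoint sets (the argument also works for a field). For each $i \in \mathbb{N}$, define $\mu_i : \mathcal{F} \to [0, \infty]_{\bar{\mathbb{R}}}$ by
\[ \mu_i(A) = \mu(A \cap \Omega_i) \]
Note that $A \cap \Omega_i \in \mathcal{F}$. It is easy to check that $\mu_i$ is a measure, using $\mu$ being a measure. Moreover, each $\mu_i$ is a finite measure, since $\mu_i(\Omega) = \mu(\Omega_i) < \infty$. Applying the finite case of the theorem to all these measures, we get measures on the $\sigma$-field
\[ \tilde{\mu}_i : \sigma(\mathcal{F}) \to [0, \infty]_{\bar{\mathbb{R}}} \]
extending $\mu_i$.
Define $\tilde{\mu} : \sigma(\mathcal{F}) \to [0, \infty]_{\bar{\mathbb{R}}}$ by
\[ \tilde{\mu}(A) = \sum_{i \in \mathbb{N}} \tilde{\mu}_i(A) \]
Clearly $\tilde{\mu}(\emptyset) = 0$. We now show that $\tilde{\mu}$ is countably additive. This follows from countable additivity of $\tilde{\mu}_i$ and being able to interchange the order of summation for nonnegative quantities. Let $\mathcal{B} \subseteq \sigma(\mathcal{F})$ be a countable collection of disjoint sets.
\[ \tilde{\mu}\Big(\bigcup_{B \in \mathcal{B}} B\Big) = \sum_{i \in \mathbb{N}} \tilde{\mu}_i\Big(\bigcup_{B \in \mathcal{B}} B\Big) \quad \text{(definition of } \tilde{\mu}\text{)} \]
\[ = \sum_{i \in \mathbb{N}} \sum_{B \in \mathcal{B}} \tilde{\mu}_i(B) \quad \text{(by countable additivity of } \tilde{\mu}_i\text{)} \]
\[ = \sum_{B \in \mathcal{B}} \sum_{i \in \mathbb{N}} \tilde{\mu}_i(B) \quad \text{(can interchange for nonnegative numbers)} \]
\[ = \sum_{B \in \mathcal{B}} \tilde{\mu}(B) \quad \text{(definition of } \tilde{\mu}\text{)} \]
We now show that $\tilde{\mu}$ extends $\mu$. Let $A \in \mathcal{F}$.
\[ \tilde{\mu}(A) = \sum_{i \in \mathbb{N}} \tilde{\mu}_i(A) \quad \text{(definition of } \tilde{\mu}\text{)} \]
\[ = \sum_{i \in \mathbb{N}} \mu_i(A) \quad (\tilde{\mu}_i \text{ extends } \mu_i) \]
\[ = \sum_{i \in \mathbb{N}} \mu(A \cap \Omega_i) \quad \text{(definition of } \mu_i\text{)} \]
\[ = \mu(A) \quad \text{(countable additivity of } \mu \text{ on } \mathcal{F}\text{)} \]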
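We have shown that $\mu$ has an extension to $\sigma(\mathcal{F})$. We now have to show that this extension is unique.
Let $\nu : \sigma(\mathcal{F}) \to [0, \infty]_{\bar{\mathbb{R}}}$ be any measure extending $\mu$. For each $i \in \mathbb{N}$, define $\nu_i : \sigma(\mathcal{F}) \to [0, \infty]_{\bar{\mathbb{R}}}$ by
\[ \nu_i(A) = \nu(A \cap \Omega_i) \]
$\nu_i$ is a measure on $(\Omega, \sigma(\mathcal{F}))$. For $A \in \mathcal{F}$,
\[ \nu_i(A) = \nu(A \cap \Omega_i) = \mu(A \cap \Omega_i) = \mu_i(A) \]
Note that $A \cap \Omega_i \in \mathcal{F}$. $\nu_i$ and $\tilde{\mu}_i$ are both extensions of $\mu_i$ to $\sigma(\mathcal{F})$. Using the uniqueness in the finite-measure part of the theorem, we get that $\nu_i = \tilde{\mu}_i$.
For any $A \in \sigma(\mathcal{F})$,
\[ \nu(A) = \sum_{i \in \mathbb{N}} \nu(A \cap \Omega_i) = \sum_{i \in \mathbb{N}} \nu_i(A) = \sum_{i \in \mathbb{N}} \tilde{\mu}_i(A) = \tilde{\mu}(A) \]
So $\nu = \tilde{\mu}$. This proves uniqueness.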
Carathéodory extension theorem - proof outline
We have shown that proving the Carathéodory extension theorem for finite measures suffices to prove it for measures which are $\sigma$-finite on the field. Consider our $(\Omega, \mathcal{F}, \mu)$. Now we have to prove the theorem for the case when $\mu(\Omega) < \infty$. If $\mu(\Omega) = 0$, then the zero measure (which assigns a measure of 0 to all sets in $\sigma(\mathcal{F})$) is an extension, and monotonicity of measures implies that this is the only extension. Otherwise, if $0 < \mu(\Omega) < \infty$, we can scale by $\mu(\Omega)$ and so assume $\mu(\Omega) = 1$. We will prove:
Theorem (Carathéodory extension theorem for probability measures). Let $\Omega$ be a set and $\mathcal{F}$ a field on it. Let $\mu$ be a probability measure on $\mathcal{F}$. Then there is a unique probability measure $\tilde{\mu}$ on $\sigma(\mathcal{F})$ such that $\mu$ and $\tilde{\mu}$ agree on $\mathcal{F}$.
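We first describe the outline of the proof:
Step 1: Let
\[ \bar{\mathcal{F}} = \{A \subseteq \Omega : \exists \{A_n\}_{n=1}^{\infty},\ \text{each } A_n \in \mathcal{F},\ A_n \uparrow A \text{ as } n \to \infty\} \]
Define $\bar{\mu} : \bar{\mathcal{F}} \to [0, \infty]_{\bar{\mathbb{R}}}$ as follows: if $A_n \uparrow A$ as $n \to \infty$, then $\bar{\mu}(A) = \lim_{n \to \infty} \mu(A_n)$. We will show the following:
- $\bar{\mu}$ is well-defined and extends $\mu$.
- $\bar{\mu}$ is monotonic: If $A, B \in \bar{\mathcal{F}}$ and $A \subseteq B$, then $\bar{\mu}(A) \le \bar{\mu}(B)$.
- $\bar{\mu}$ is strongly finitely additive: If $A, B \in \bar{\mathcal{F}}$, then $A \cup B, A \cap B \in \bar{\mathcal{F}}$ and $\bar{\mu}(A) + \bar{\mu}(B) = \bar{\mu}(A \cup B) + \bar{\mu}(A \cap B)$.
- $\bar{\mu}$ respects increasing limits: If $\{A_n\}_{n=1}^{\infty}$ is an increasing sequence of sets from $\bar{\mathcal{F}}$ whose union is $A$, then $A \in \bar{\mathcal{F}}$ and $\lim_{n \to \infty} \bar{\mu}(A_n) = \bar{\mu}(A)$.
If we want to assign a measure to a set in $\bar{\mathcal{F}}$ at all, it had better be the one given by $\bar{\mu}$, as we have shown earlier that measures respect increasing unions.
Step 2: Define $\mu^* : 2^{\Omega} \to [0, \infty]$ by
\[ \mu^*(B) = \inf\{\bar{\mu}(A) : A \in \bar{\mathcal{F}},\ A \supseteq B\} \]
The inf is well-defined since $\Omega \in \bar{\mathcal{F}}$. $\mu^*$ is called the outer measure. Note that it is defined for all subsets of $\Omega$. We will show the following:
- $\mu^*$ extends $\bar{\mu}$.
- If $B_1 \subseteq B_2$, then $\mu^*(B_1) \le \mu^*(B_2)$
- $\mu^*(B_1) + \mu^*(B_2) \ge \mu^*(B_1 \cup B_2) + \mu^*(B_1 \cap B_2)$
- If $\{B_n\}_{n=1}^{\infty}$ increases to $B$ as $n \to \infty$, then $\mu^*(B_n) \uparrow \mu^*(B)$ as $n \to \infty$
If we want to assign a measure to a set $B \subseteq \Omega$ at all, because of monotonicity, it must be at most $\mu^*(B)$. Also, it must be at least $1 - \mu^*(\Omega \setminus B)$ (since the measure assigned to $\Omega \setminus B$ must be at most its outer measure too). Let us define the measure for the sets for which the upper bound and lower bound agree.
Step 3: Let
\[ \mathcal{C} = \{B \subseteq \Omega : \mu^*(B) + \mu^*(\Omega \setminus B) = 1\} \]
We will show that
- $\mathcal{C}$ is a $\sigma$-field.
- $\mu^*$ is a probability measure on $\mathcal{C}$.
- $\mathcal{F} \subseteq \mathcal{C}$.
Clearly $\sigma(\mathcal{F}) \subseteq \mathcal{C}$, but in general equality will not hold. This is not an issue: $\mu^*$ restricted to $\sigma(\mathcal{F})$ is of course a probability measure on $\sigma(\mathcal{F})$ as required.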
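On a finite $\Omega$ the three steps can be carried out by enumeration, which makes the role of the class of sets in step 3 visible. The sketch below rests on several assumptions: $\Omega = \{0, 1, 2, 3\}$, the field is the one generated by the hypothetical partition $\{0, 1\}, \{2, 3\}$, and each block gets mass $1/2$. Since the field is finite, increasing limits of its sets add nothing new, so step 1 is trivial here and the outer measure is a minimum over the field itself. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def field_from_partition(blocks):
    """All unions of blocks: the field generated by a finite partition."""
    blocks = [frozenset(b) for b in blocks]
    return {frozenset().union(*choice) for choice in powerset(blocks)}

def outer_measure(b, field, mu):
    """mu*(B): the minimum of mu(A) over sets A in the field that contain B."""
    return min(mu[a] for a in field if b <= a)

omega = frozenset({0, 1, 2, 3})
field = field_from_partition([{0, 1}, {2, 3}])
# Hypothetical probability measure on the field: each block has mass 1/2.
mu = {a: Fraction(len(a), 4) for a in field}

# Step 3: keep the sets whose outer measure and that of the complement sum to 1.
measurable = {b for b in powerset(omega)
              if outer_measure(b, field, mu) + outer_measure(omega - b, field, mu) == 1}
```

In this toy case `measurable` coincides with the field: for $B = \{0\}$ the outer measures of $B$ and of its complement are $1/2$ and $1$, which sum to more than 1, so $\{0\}$ is excluded.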
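Carathéodory extension theorem - proof details
We now execute the steps 1, 2, and 3, independently of each other.
Details of step 3
We will show that $\mathcal{C}$ is a $\sigma$-field. Since $\mu^*$ extends $\bar{\mu}$ which extends $\mu$, we have $\mu^*(\emptyset) = 0$ and $\mu^*(\Omega) = 1$. So $\emptyset, \Omega \in \mathcal{C}$. It is clear that $\mathcal{C}$ is closed under complements.
If $D \subseteq \Omega$, then from step 2 we know that
\[ \mu^*(D) + \mu^*(\Omega \setminus D) \ge \mu^*(D \cup (\Omega \setminus D)) + \mu^*(D \cap (\Omega \setminus D)) = 1 \]
Let $A, B \in \mathcal{C}$. We will show that $A \cup B$ and $A \cap B$ also belong to $\mathcal{C}$. We have
\[ \mu^*(A) + \mu^*(\Omega \setminus A) = 1 \quad (1) \]
\[ \mu^*(B) + \mu^*(\Omega \setminus B) = 1 \quad (2) \]
\[ \mu^*(A \cup B) + \mu^*(\Omega \setminus (A \cup B)) \ge 1 \quad (3) \]
\[ \mu^*(A \cap B) + \mu^*(\Omega \setminus (A \cap B)) \ge 1 \quad (4) \]
\[ \mu^*(A) + \mu^*(B) \ge \mu^*(A \cup B) + \mu^*(A \cap B) \quad (5) \]
\[ \mu^*(\Omega \setminus A) + \mu^*(\Omega \setminus B) \ge \mu^*(\Omega \setminus (A \cap B)) + \mu^*(\Omega \setminus (A \cup B)) \quad (6) \]
Adding (5) and (6) and using (1) and (2), we get
\[ \mu^*(A \cup B) + \mu^*(A \cap B) + \mu^*(\Omega \setminus (A \cup B)) + \mu^*(\Omega \setminus (A \cap B)) \le 2 \quad (7) \]
If either of (3) or (4) was a strict inequality, this would not be possible. So both (3) and (4) are equalities. This shows that $A \cup B$ and $A \cap B$ belong to $\mathcal{C}$. $\mathcal{C}$ is a field. Adding (3) and (4), we get that equality holds in (7). Since (7) is also obtained by adding (5) and (6), equality must hold in (5) and (6) too. (5) implies that $\mu^*$ is finitely additive on $\mathcal{C}$.
Let $\{C_n\}_{n \in \mathbb{N}}$ be a countable collection of sets in $\mathcal{C}$. We need to show that $B := \bigcup_{n \in \mathbb{N}} C_n \in \mathcal{C}$. For all $n \in \mathbb{N}$, let
\[ B_n = \bigcup_{1 \le m \le n} C_m \]
$\{B_n\}_{n=1}^{\infty}$ increases to $B$. Since each $B_n \in \mathcal{C}$ ($\mathcal{C}$ is a field),
\[ \mu^*(B_n) + \mu^*(\Omega \setminus B_n) = 1 \]
Using the monotonicity of $\mu^*$ from step 2,
\[ \mu^*(B_n) + \mu^*(\Omega \setminus B) \le 1 \]
From step 2 we know that $\mu^*$ respects increasing unions, so taking the limit as $n \to \infty$,
\[ \mu^*(B) + \mu^*(\Omega \setminus B) \le 1 \]
We know that
\[ \mu^*(B) + \mu^*(\Omega \setminus B) \ge 1 \]
(this holds for any set $B$, in fact). So equality holds, and $B \in \mathcal{C}$. $\mathcal{C}$ is a $\sigma$-field.
We will show that $\mu^*$ is countably additive on $\mathcal{C}$: Take $\{B_n\}_{n=1}^{\infty}$, a countable collection of disjoint sets in $\mathcal{C}$ whose union is $B$. We have shown that $\mu^*$ is finitely additive on $\mathcal{C}$, so for all $k \in \mathbb{N}$,
\[ \mu^*\Big(\bigcup_{n=1}^{k} B_n\Big) = \sum_{n=1}^{k} \mu^*(B_n) \]
Taking the limit as $k \to \infty$ and using that $\mu^*$ respects increasing unions, we get
\[ \mu^*(B) = \sum_{n=1}^{\infty} \mu^*(B_n) \]
This shows that $\mu^*$ is countably additive.
Since $\mu^*$ extends $\bar{\mu}$ which extends $\mu$, it is clear that $\mathcal{F} \subseteq \mathcal{C}$: for $A \in \mathcal{F}$, $\mu^*(A) + \mu^*(\Omega \setminus A) = \mu(A) + \mu(\Omega \setminus A) = 1$.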
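Details of step 1
Let $\{A_n\}_{n=1}^{\infty}$ and $\{B_n\}_{n=1}^{\infty}$ be two increasing sequences of sets from $\mathcal{F}$, each having union $A$. To show that $\bar{\mu}$ is well-defined, we need to show that
\[ \lim_{n \to \infty} \mu(A_n) = \lim_{n \to \infty} \mu(B_n) \]
Fix $n \in \mathbb{N}$. $\{A_n \cap B_m\}_{m=1}^{\infty}$ is an increasing sequence of sets from $\mathcal{F}$, increasing to $A_n \cap A = A_n$. All these sets are in $\mathcal{F}$, so
\[ \lim_{m \to \infty} \mu(A_n \cap B_m) = \mu(A_n) \]
Since $A_n \cap B_m \subseteq B_m$,
\[ \lim_{m \to \infty} \mu(B_m) \ge \mu(A_n) \]
Taking the limit as $n \to \infty$,
\[ \lim_{m \to \infty} \mu(B_m) \ge \lim_{n \to \infty} \mu(A_n) \]
By a symmetric argument,
\[ \lim_{n \to \infty} \mu(A_n) \ge \lim_{m \to \infty} \mu(B_m) \]
Equality holds, and $\bar{\mu}$ is well-defined. Clearly $\bar{\mu}$ extends $\mu$, since for $A \in \mathcal{F}$ we can take $A_n = A$ for all $n$, and so $\bar{\mu}(A) = \mu(A)$.
We will show that $\bar{\mu}$ is monotonic. Let $A, B \in \bar{\mathcal{F}}$, $A \subseteq B$. Let $\{A_n\}_{n=1}^{\infty}$ and $\{B_n\}_{n=1}^{\infty}$ be increasing sequences of sets in $\mathcal{F}$ whose union is $A$ and $B$ respectively. $\{A_n \cap B_n\}_{n=1}^{\infty}$ is also an increasing sequence, and increases to $A \cap B = A$.
\[ \bar{\mu}(A) = \lim_{n \to \infty} \mu(A_n \cap B_n) \le \lim_{n \to \infty} \mu(B_n) = \bar{\mu}(B) \]
We now show strong finite additivity. Let $A, B \in \bar{\mathcal{F}}$. Let $\{A_n\}_{n=1}^{\infty}$ and $\{B_n\}_{n=1}^{\infty}$ be increasing sequences of sets in $\mathcal{F}$ whose union is $A$ and $B$ respectively. Then $\{A_n \cup B_n\}_{n=1}^{\infty}$ and $\{A_n \cap B_n\}_{n=1}^{\infty}$ are increasing sequences of sets in $\mathcal{F}$ whose union is $A \cup B$ and $A \cap B$ respectively. So $A \cup B$ and $A \cap B$ belong to $\bar{\mathcal{F}}$. By finite additivity on $\mathcal{F}$, we have
\[ \mu(A_n) + \mu(B_n) = \mu(A_n \setminus B_n) + \mu(A_n \cap B_n) + \mu(B_n \setminus A_n) + \mu(B_n \cap A_n) \]
\[ = [\mu(A_n \setminus B_n) + \mu(B_n \setminus A_n) + \mu(A_n \cap B_n)] + \mu(A_n \cap B_n) \]
\[ = \mu(A_n \cup B_n) + \mu(A_n \cap B_n) \]
Taking the limit as $n \to \infty$, we get
\[ \bar{\mu}(A) + \bar{\mu}(B) = \bar{\mu}(A \cup B) + \bar{\mu}(A \cap B) \]
We now show that $\bar{\mu}$ respects increasing unions. Let $\{A_n\}_{n=1}^{\infty}$ be an increasing sequence of sets in $\bar{\mathcal{F}}$ with union $A$. For each $n$, let $\{A_{n,m}\}_{m=1}^{\infty}$ be an increasing sequence of sets in $\mathcal{F}$ whose union is $A_n$. For each $m \in \mathbb{N}$, let
\[ B_m = \bigcup_{1 \le i \le m,\ 1 \le j \le m} A_{i,j} \]
$\{B_m\}_{m=1}^{\infty}$ is an increasing sequence of sets in $\mathcal{F}$ whose union is $A$. So $A \in \bar{\mathcal{F}}$ and $\bar{\mu}(A) = \lim_{m \to \infty} \mu(B_m)$.
Since $A_n \subseteq A$, $\bar{\mu}(A_n) \le \bar{\mu}(A)$. Taking limits,
\[ \lim_{n \to \infty} \bar{\mu}(A_n) \le \bar{\mu}(A) \]
$B_n \subseteq A_n$, so $\bar{\mu}(B_n) \le \bar{\mu}(A_n)$. Taking limits as $n \to \infty$,
\[ \bar{\mu}(A) = \lim_{n \to \infty} \mu(B_n) = \lim_{n \to \infty} \bar{\mu}(B_n) \le \lim_{n \to \infty} \bar{\mu}(A_n) \]
So
\[ \bar{\mu}(A) = \lim_{n \to \infty} \bar{\mu}(A_n) \]
This completes step 1.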
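Carathéodory extension theorem - proof outline (contd)
In the proof of the Carathéodory extension theorem, we have finished step 1 and step 3. Step 2 and showing uniqueness remain.
Details of step 2
For $B \subseteq \Omega$, we have
\[ \mu^*(B) = \inf\{\bar{\mu}(A) : A \in \bar{\mathcal{F}},\ A \supseteq B\} \]
For $B \in \bar{\mathcal{F}}$, we have $\mu^*(B) \le \bar{\mu}(B)$ (since $B \supseteq B$) and $\mu^*(B) \ge \bar{\mu}(B)$ (by monotonicity of $\bar{\mu}$). So $\mu^*$ extends $\bar{\mu}$.
Monotonicity of $\mu^*$ is clear: if $B_1 \subseteq B_2$ then
\[ \{\bar{\mu}(A) : A \in \bar{\mathcal{F}},\ A \supseteq B_2\} \subseteq \{\bar{\mu}(A) : A \in \bar{\mathcal{F}},\ A \supseteq B_1\} \]
and so $\mu^*(B_1) \le \mu^*(B_2)$.
Let $B_1, B_2 \subseteq \Omega$. We will show that
\[ \mu^*(B_1) + \mu^*(B_2) \ge \mu^*(B_1 \cup B_2) + \mu^*(B_1 \cap B_2) \]
Let $\varepsilon > 0$ be given. Choose $A_1, A_2 \in \bar{\mathcal{F}}$ such that $A_1 \supseteq B_1$, $A_2 \supseteq B_2$, and
\[ \bar{\mu}(A_1) \le \mu^*(B_1) + \frac{\varepsilon}{2}, \qquad \bar{\mu}(A_2) \le \mu^*(B_2) + \frac{\varepsilon}{2} \]
We have $A_1 \cup A_2 \supseteq B_1 \cup B_2$ and $A_1 \cap A_2 \supseteq B_1 \cap B_2$. $A_1 \cup A_2, A_1 \cap A_2 \in \bar{\mathcal{F}}$. So
\[ \mu^*(B_1 \cup B_2) + \mu^*(B_1 \cap B_2) \le \bar{\mu}(A_1 \cup A_2) + \bar{\mu}(A_1 \cap A_2) \]
\[ = \bar{\mu}(A_1) + \bar{\mu}(A_2) \quad \text{(strong finite additivity in step 1)} \]
\[ \le \mu^*(B_1) + \mu^*(B_2) + \varepsilon \]
As this holds for all $\varepsilon > 0$,
\[ \mu^*(B_1 \cup B_2) + \mu^*(B_1 \cap B_2) \le \mu^*(B_1) + \mu^*(B_2) \]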
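We now show that $\mu^*$ respects increasing unions. Let $\{B_n\}_{n=1}^{\infty}$ be an increasing sequence of subsets of $\Omega$ whose union is $B$. By monotonicity of $\mu^*$, $\{\mu^*(B_n)\}_{n=1}^{\infty}$ is an increasing sequence, and each $\mu^*(B_n) \le \mu^*(B)$. So
\[ \lim_{n \to \infty} \mu^*(B_n) \le \mu^*(B) \]
We now show the reverse inequality. Let $\varepsilon > 0$ be given. For each $n \in \mathbb{N}$, choose $A_n \in \bar{\mathcal{F}}$ such that $A_n \supseteq B_n$ and
\[ \bar{\mu}(A_n) \le \mu^*(B_n) + \frac{\varepsilon}{2^n} \]
$\{A_n\}_{n=1}^{\infty}$ may not be an increasing sequence, so we define a sequence of sets $\{C_n\}_{n=1}^{\infty}$,
\[ C_n = \bigcup_{m=1}^{n} A_m \]
$\{C_n\}_{n=1}^{\infty}$ is an increasing sequence of sets in $\bar{\mathcal{F}}$. Let its union be $C$. $C \in \bar{\mathcal{F}}$.
We will show by induction on $n$ that
\[ \bar{\mu}(C_n) \le \mu^*(B_n) + \sum_{m=1}^{n} \frac{\varepsilon}{2^m} \]
This is clearly true for $n = 1$, by definition of $A_1$ (note that $C_1 = A_1$). Suppose it is true for $n = k$.
\[ \bar{\mu}(C_{k+1}) = \bar{\mu}(C_k \cup A_{k+1}) \]
\[ = \bar{\mu}(C_k) + \bar{\mu}(A_{k+1}) - \bar{\mu}(C_k \cap A_{k+1}) \quad \text{(strong finite additivity)} \]
\[ \le \bar{\mu}(C_k) + \mu^*(B_{k+1}) + \frac{\varepsilon}{2^{k+1}} - \bar{\mu}(C_k \cap A_{k+1}) \quad \text{(definition of } A_{k+1}\text{)} \]
\[ \le \mu^*(B_k) + \sum_{m=1}^{k} \frac{\varepsilon}{2^m} + \mu^*(B_{k+1}) + \frac{\varepsilon}{2^{k+1}} - \bar{\mu}(C_k \cap A_{k+1}) \quad \text{(induction hypothesis)} \]
\[ = \mu^*(B_{k+1}) + \sum_{m=1}^{k+1} \frac{\varepsilon}{2^m} - \big(\bar{\mu}(C_k \cap A_{k+1}) - \mu^*(B_k)\big) \]
\[ \le \mu^*(B_{k+1}) + \sum_{m=1}^{k+1} \frac{\varepsilon}{2^m} \quad \text{(see below)} \]
The last inequality holds because $C_k \cap A_{k+1} \in \bar{\mathcal{F}}$, and $B_k \subseteq C_k$ and $B_k \subseteq B_{k+1} \subseteq A_{k+1}$, so that $B_k \subseteq C_k \cap A_{k+1}$ and hence $\mu^*(B_k) \le \bar{\mu}(C_k \cap A_{k+1})$. So
\[ \bar{\mu}(C_n) \le \mu^*(B_n) + \sum_{m=1}^{n} \frac{\varepsilon}{2^m} \]
holds for all $n \in \mathbb{N}$. Taking limits as $n \to \infty$ (note that all limits exist), we get
\[ \bar{\mu}(C) \le \lim_{n \to \infty} \mu^*(B_n) + \varepsilon \]
\[ C = \bigcup_{n \in \mathbb{N}} C_n = \bigcup_{n \in \mathbb{N}} A_n \supseteq \bigcup_{n \in \mathbb{N}} B_n = B \]
So $\mu^*(B) \le \bar{\mu}(C)$, and
\[ \mu^*(B) \le \lim_{n \to \infty} \mu^*(B_n) + \varepsilon \]
Since this holds for all $\varepsilon > 0$,
\[ \mu^*(B) \le \lim_{n \to \infty} \mu^*(B_n) \]
As we have already established the reverse inequality,
\[ \mu^*(B) = \lim_{n \to \infty} \mu^*(B_n) \]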
This completes the details of step 2. We have established the existence part of the Carathéodory extension theorem. The $\mathcal{C}$, on which we have defined a measure, will in general be larger than $\sigma(\mathcal{F})$ (the measure on $\sigma(\mathcal{F})$ is obtained simply by restriction). Even if we take all possible measures on $(\Omega, \mathcal{F})$, and intersect all the $\mathcal{C}$ we get, we may still get a set larger than $\sigma(\mathcal{F})$.
For the uniqueness, we could follow this proof, arguing that any other extension must agree with $\bar{\mu}$ on $\bar{\mathcal{F}}$ by continuity of the measure and then with $\mu^*$ on $\sigma(\mathcal{F})$ by monotonicity. However, we will give a different argument which does not depend on the details of this proof.
Definition. $(\Omega, \mathcal{A}, \mu)$ is said to be a measure space if
- $\Omega$ is a nonempty set.
- $\mathcal{A}$ is a $\sigma$-field on $\Omega$.
- $\mu$ is a measure on $(\Omega, \mathcal{A})$.
$(\Omega, \mathcal{A}, \mu)$ is said to be finite or $\sigma$-finite accordingly as $\mu$ is finite or $\sigma$-finite.
Monotone classes
Definition. Let $\Omega$ be a set. $\mathcal{M} \subseteq 2^{\Omega}$ is said to be a monotone class if
- If $\{A_n\}_{n=1}^{\infty}$ is an increasing sequence of sets in $\mathcal{M}$, then $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{M}$.
- If $\{A_n\}_{n=1}^{\infty}$ is a decreasing sequence of sets in $\mathcal{M}$, then $\bigcap_{n \in \mathbb{N}} A_n \in \mathcal{M}$.
In other words, a collection of subsets of $\Omega$ is a monotone class if it is closed under countable increasing unions and countable decreasing intersections. Every $\sigma$-field is of course a monotone class; the set of all intervals in $\mathbb{R}$ is an example of a monotone class which is not a $\sigma$-field (or even a field).
An arbitrary intersection of monotone classes over $\Omega$ is also a monotone class over $\Omega$, and $2^{\Omega}$ is a monotone class. So for any $\mathcal{C} \subseteq 2^{\Omega}$, we can define the monotone class generated by $\mathcal{C}$ as the intersection of all monotone classes over $\Omega$ which include $\mathcal{C}$. This is denoted by $\mathcal{M}(\mathcal{C})$.
Theorem (Monotone class theorem). Let $\mathcal{F}$ be a field of subsets of $\Omega$. Then $\mathcal{M}(\mathcal{F}) = \sigma(\mathcal{F})$.
(Proof done later.)
Theorem. Let $\mathcal{A}$ be a $\sigma$-field on $\Omega$, and let $\mu$ and $\nu$ be finite measures on $(\Omega, \mathcal{A})$. Let $\mathcal{F} \subseteq \mathcal{A}$ be a field such that $\mathcal{A} = \sigma(\mathcal{F})$. Suppose $\mu$ and $\nu$ agree on $\mathcal{F}$. Then $\mu = \nu$.
Proof. Consider $\{A : A \in \mathcal{A},\ \mu(A) = \nu(A)\}$. This includes $\mathcal{F}$, and by continuity of the measure, is a monotone class. So it includes $\mathcal{M}(\mathcal{F})$, which by the monotone class theorem is equal to $\sigma(\mathcal{F}) = \mathcal{A}$.
This proves the uniqueness part of the Carathéodory extension theorem (modulo the proof of the monotone class theorem).
Measure theory class notes - 1 September 2010, class 7
The monotone class theorem

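Recall that a monotone class over $\Omega$ is a collection of subsets of $\Omega$ closed under countable increasing unions and countable decreasing intersections. $\mathcal{M}(\mathcal{F})$ denotes the smallest monotone class which includes $\mathcal{F}$.
Theorem (Monotone class theorem). Let $\mathcal{F}$ be a field of subsets of $\Omega$. Then $\mathcal{M}(\mathcal{F}) = \sigma(\mathcal{F})$.
Proof. Clearly $\mathcal{M}(\mathcal{F}) \subseteq \sigma(\mathcal{F})$, since $\sigma(\mathcal{F})$ is a monotone class. To show $\sigma(\mathcal{F}) \subseteq \mathcal{M}(\mathcal{F})$, we need to show that $\mathcal{M}(\mathcal{F})$ is a $\sigma$-field.
We first show that $\mathcal{M}(\mathcal{F})$ is a field. Since $\mathcal{F} \subseteq \mathcal{M}(\mathcal{F})$, $\Omega \in \mathcal{M}(\mathcal{F})$. To show that $\mathcal{M}(\mathcal{F})$ is closed under complementation, let
\[ \mathcal{M}' = \{A \in \mathcal{M}(\mathcal{F}) : \Omega \setminus A \in \mathcal{M}(\mathcal{F})\} \]
$\mathcal{F} \subseteq \mathcal{M}'$ because $\mathcal{F} \subseteq \mathcal{M}(\mathcal{F})$ and $\mathcal{F}$ is a field. Let $\{A_n\}_{n=1}^{\infty}$ be an increasing sequence of sets in $\mathcal{M}'$ with union $A$. Since $\mathcal{M}(\mathcal{F})$ is a monotone class, $A \in \mathcal{M}(\mathcal{F})$. Since each $A_n$ belongs to $\mathcal{M}'$, $\Omega \setminus A_n \in \mathcal{M}(\mathcal{F})$.
\[ \Omega \setminus A = \bigcap_{n \in \mathbb{N}} (\Omega \setminus A_n) \]
and $\{\Omega \setminus A_n\}_{n=1}^{\infty}$ is a decreasing sequence, so $\Omega \setminus A \in \mathcal{M}(\mathcal{F})$. Since $A \in \mathcal{M}(\mathcal{F})$ and $\Omega \setminus A \in \mathcal{M}(\mathcal{F})$, we have $A \in \mathcal{M}'$. $\mathcal{M}'$ is closed under increasing unions, and a similar argument shows that it is closed under decreasing intersections. $\mathcal{M}'$ is a monotone class and $\mathcal{F} \subseteq \mathcal{M}' \subseteq \mathcal{M}(\mathcal{F})$, so $\mathcal{M}' = \mathcal{M}(\mathcal{F})$. So $\mathcal{M}(\mathcal{F})$ is closed under complementation.
We show that $\mathcal{M}(\mathcal{F})$ is closed under finite intersections in steps. First, fix $A \in \mathcal{F}$ (only in $\mathcal{F}$, for now!), and we will show that for all $B \in \mathcal{M}(\mathcal{F})$, $A \cap B \in \mathcal{M}(\mathcal{F})$. As usual, let
\[ \mathcal{M}' = \{B \in \mathcal{M}(\mathcal{F}) : A \cap B \in \mathcal{M}(\mathcal{F})\} \]
Clearly $\mathcal{F} \subseteq \mathcal{M}'$, since $\mathcal{F} \subseteq \mathcal{M}(\mathcal{F})$ and $\mathcal{F}$ is a field. Let $\{B_n\}_{n=1}^{\infty}$ be an increasing sequence of sets in $\mathcal{M}'$ whose union is $B$. Clearly $B \in \mathcal{M}(\mathcal{F})$, since $\mathcal{M}(\mathcal{F})$ is a monotone class. We have that $A \cap B_n \in \mathcal{M}(\mathcal{F})$ for all $n$, and since
\[ A \cap B = \bigcup_{n \in \mathbb{N}} (A \cap B_n) \]
and $\{A \cap B_n\}_{n=1}^{\infty}$ is an increasing sequence, we have $A \cap B \in \mathcal{M}(\mathcal{F})$. $B \in \mathcal{M}(\mathcal{F})$ and $A \cap B \in \mathcal{M}(\mathcal{F})$, so $B \in \mathcal{M}'$. $\mathcal{M}'$ is closed under increasing unions and a similar argument shows that it is closed under decreasing intersections. So $\mathcal{M}'$ is a monotone class. Since $\mathcal{F} \subseteq \mathcal{M}' \subseteq \mathcal{M}(\mathcal{F})$, we have $\mathcal{M}' = \mathcal{M}(\mathcal{F})$.
We have that for all $A \in \mathcal{F}$ and $B \in \mathcal{M}(\mathcal{F})$, $A \cap B \in \mathcal{M}(\mathcal{F})$. To complete the argument, now fix $C \in \mathcal{M}(\mathcal{F})$. Let
\[ \mathcal{M}' = \{D \in \mathcal{M}(\mathcal{F}) : C \cap D \in \mathcal{M}(\mathcal{F})\} \]
By the previous step we know that $\mathcal{F} \subseteq \mathcal{M}'$. By the same argument as before we have that $\mathcal{M}'$ is a monotone class. So $\mathcal{M}' = \mathcal{M}(\mathcal{F})$. Since $C$ was an arbitrary set in $\mathcal{M}(\mathcal{F})$, we have shown that if $C, D \in \mathcal{M}(\mathcal{F})$, then $C \cap D \in \mathcal{M}(\mathcal{F})$.
So $\mathcal{M}(\mathcal{F})$ is a field. To show that it is a $\sigma$-field, we need to show it is closed under countable unions. Let $\{A_n\}_{n=1}^{\infty}$ be a sequence of sets in $\mathcal{M}(\mathcal{F})$. Define $\{B_n\}_{n=1}^{\infty}$,
\[ B_n = \bigcup_{m=1}^{n} A_m \]
$\{B_n\}_{n=1}^{\infty}$ is an increasing sequence of sets and each of them is in $\mathcal{M}(\mathcal{F})$ since $\mathcal{M}(\mathcal{F})$ is a field. Since $\mathcal{M}(\mathcal{F})$ is a monotone class, their union, which is $\bigcup_{n \in \mathbb{N}} A_n$, belongs to $\mathcal{M}(\mathcal{F})$.
$\mathcal{M}(\mathcal{F})$ is a $\sigma$-field, and $\mathcal{M}(\mathcal{F}) = \sigma(\mathcal{F})$.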
An application of the monotone class theorem

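Theorem. Let $\Omega$ be a set and $\mathcal{A}$ a $\sigma$-field on it. Suppose $\mu$ and $\nu$ are measures on $(\Omega, \mathcal{A})$. Let $\mathcal{F}$ be a field on $\Omega$ such that $\sigma(\mathcal{F}) = \mathcal{A}$. Suppose $\mu$ and $\nu$ are $\sigma$-finite on $\mathcal{F}$ and agree on $\mathcal{F}$. Then $\mu = \nu$ (see footnote 1).
Proof. Let $\{\Omega_n\}_{n \in \mathbb{N}}$ be a disjoint collection of sets in $\mathcal{F}$ whose union is $\Omega$ and such that each $\mu(\Omega_n)$ is finite. Since $\mu(\Omega_n) = \nu(\Omega_n)$, $\nu(\Omega_n)$ is also finite. We will show that for all $A \in \mathcal{A}$ and $n \in \mathbb{N}$,
\[ \mu(A \cap \Omega_n) = \nu(A \cap \Omega_n) \]
Fix $n$ and consider the set
\[ \{A \in \mathcal{A} : \mu(A \cap \Omega_n) = \nu(A \cap \Omega_n)\} \]
This includes $\mathcal{F}$ (since $A \cap \Omega_n \in \mathcal{F}$) and is a monotone class (since finite measures respect increasing unions and decreasing intersections), and so includes $\mathcal{M}(\mathcal{F})$ which equals $\sigma(\mathcal{F}) = \mathcal{A}$.
For any $A \in \mathcal{A}$,
\[ \mu(A) = \sum_{n \in \mathbb{N}} \mu(A \cap \Omega_n) = \sum_{n \in \mathbb{N}} \nu(A \cap \Omega_n) = \nu(A) \]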
Directly dealing with sets in a $\sigma$-field is often hard - we may not have a nice description of all sets in the $\sigma$-field. The above theorem tells us that to show that two measures are equal on the $\sigma$-field, it suffices to check equality only for sets in a field which generates it (provided the measures are $\sigma$-finite on the field), which is likely to be easier since we may have an explicit description of the sets in the field.
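Note that the measures being $\sigma$-finite on the field is important. It is possible to have two $\sigma$-finite measures on $(\Omega, \mathcal{A})$ which agree on $\mathcal{F}$ but are not equal: for example, consider $(\mathbb{R}, \mathcal{B})$, and let $D_1$ and $D_2$ be countable dense subsets of $\mathbb{R}$ such that $D_1 \cap D_2 = \emptyset$. Consider $\mu_1, \mu_2 : \mathcal{B} \to [0, \infty]_{\overline{\mathbb{R}}}$,
\[ \mu_1(A) = |A \cap D_1| \]
\[ \mu_2(A) = |A \cap D_2| \]
It is easy to see that $\mu_1$ and $\mu_2$ are measures. $\mu_1$ is $\sigma$-finite; we can take the cover by finite-measure sets to be
\[ \{\{x\} : x \in D_1\} \cup \{\mathbb{R} \setminus D_1\} \]
Similarly, $\mu_2$ is also $\sigma$-finite. We can take our field $\mathcal{F}$ to be all finite disjoint unions of intervals of the type $(a, b]$. $\sigma(\mathcal{F}) = \mathcal{B}$. For $A \in \mathcal{F}$, we have $\mu_1(A) = \mu_2(A) = \infty$ if $A$ is nonempty (a nonempty $A \in \mathcal{F}$ contains an interval, which contains infinitely many points of each dense set) and $\mu_1(\emptyset) = \mu_2(\emptyset) = 0$. So $\mu_1$ and $\mu_2$ agree on $\mathcal{F}$ but are not equal (since, for example, $\mu_1(D_2) = 0$ and $\mu_2(D_2) = \infty$).
[1] This follows from the uniqueness part of the Carathéodory extension theorem, but we give a proof anyway.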
Integration
To define the Riemann integral of a function $f : [a, b] \to [c, d]$, we divided $[a, b]$ into small intervals, and approximated the region under the graph of $f$ on each of these small intervals by a rectangle. This way we can approximate the area under the graph of $f$ by the sum of the areas of these thin rectangles.
If this quantity tends to a number $z$ as the partition of $[a, b]$ becomes finer, we say that $\int_a^b f(x)\,dx = z$. To be able to approximate the value of $f$ on a small interval by a single value, we would like that $f$ does not vary too much on that interval. Riemann integration works well for continuous functions (and also some other functions like piecewise continuous ones).
Lebesgue integration takes a different approach. Instead of dividing the domain, we divide the range (say $[c, d]$ as above) into intervals $A_1, A_2, \ldots, A_n$, where $n$ is large, so that each interval is small. We can pick a point $a_i$ in each $A_i$, and approximate the integral of $f$ by
\[ \sum_{i=1}^{n} a_i \cdot (\text{length of } f^{-1}(A_i)) \]
Note that unlike in the case of Riemann integration, here taking $a_i$ as an approximation for any value in $A_i$ is reasonable, since $A_i$ is a small interval, regardless of what $f$ is! So this approach is likely to work better, provided we can assign a length to $f^{-1}(A_i)$. This is one of the motivations for developing measure theory. We now define the class of functions for which the $f^{-1}(A_i)$ will have length defined, in a more general setting.
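The two approaches can be compared numerically. The following is a minimal sketch, assuming the hypothetical example $f(x) = x^2$ on $[0, 1]$ (not from the notes), chosen because the length of $f^{-1}([c, d))$ is simply $\sqrt{d} - \sqrt{c}$. Both functions pick the lowest value on each piece, so both approximate $1/3$ from below.

```python
import math

def riemann_style_sum(n):
    # Divide the DOMAIN [0, 1] into n intervals of length 1/n and use
    # the value of f(x) = x^2 at the left endpoint of each interval.
    return sum((k / n) ** 2 * (1 / n) for k in range(n))

def lebesgue_style_sum(n):
    # Divide the RANGE [0, 1) into n intervals A_k = [k/n, (k+1)/n).
    # For f(x) = x^2 on [0, 1], f^{-1}(A_k) = [sqrt(k/n), sqrt((k+1)/n)),
    # whose length is sqrt((k+1)/n) - sqrt(k/n). Use a_k = k/n.
    total = 0.0
    for k in range(n):
        lo, hi = k / n, (k + 1) / n
        length_of_preimage = math.sqrt(hi) - math.sqrt(lo)
        total += lo * length_of_preimage
    return total

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, riemann_style_sum(n), lebesgue_style_sum(n))
```

In the range-partition sum the error is at most $1/n$ times the total length of the domain, whatever $f$ is; only the computation of the preimage lengths depends on $f$.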
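Definition. Let $\Omega$ be a nonempty set and $\mathcal{A}$ a $\sigma$-field on it. $f : \Omega \to \mathbb{R}$ is said to be measurable if for any interval $I \subseteq \mathbb{R}$, $f^{-1}(I) \in \mathcal{A}$.
It follows that for a measurable function $f$, for any Borel set $B \subseteq \mathbb{R}$, $f^{-1}(B) \in \mathcal{A}$ (as usual, since the collection of Borel sets for which the inverse image belongs to $\mathcal{A}$ has all intervals and is closed under complements and countable unions).
When $(\Omega, \mathcal{A}) = (\mathbb{R}, \mathcal{B})$, every continuous function is measurable: the inverse image of an open interval under a continuous function is open, hence Borel, and the collection of all open intervals in $\mathbb{R}$ generates $\mathcal{B}$ too.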
Measurable functions
Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. Recall that a real measurable function is an $f : \Omega \to \mathbb{R}$ such that for all intervals $I \subseteq \mathbb{R}$, $f^{-1}(I) \in \mathcal{A}$. This is equivalent to the inverse image of every Borel set being a member of $\mathcal{A}$. It is easy to see that if $\mathcal{C} \subseteq \mathcal{B}$ ($\mathcal{B}$ is the Borel $\sigma$-field on $\mathbb{R}$) with $\sigma(\mathcal{C}) = \mathcal{B}$, then $f$ is measurable if and only if the inverse image of every set in $\mathcal{C}$ is in $\mathcal{A}$. This allows us to choose $\mathcal{C}$ as per our convenience, for example $\{(-\infty, a) : a \in \mathbb{R}\}$, $\{[a, \infty) : a \in \mathbb{R}\}$, etc.
We will study properties of the collection of measurable functions. These will allow us various natural operations on measurable functions while guaranteeing that we do not encounter a nonmeasurable function. Note that $\mu$ plays no role here.
Let $\mathcal{L}$ ($= \mathcal{L}(\Omega, \mathcal{A}, \mu)$) be the collection of measurable functions $\Omega \to \mathbb{R}$.
Theorem. $\mathcal{L}$ has the following properties:
1. $\mathcal{L}$ has all constant functions.
2. $\mathcal{L}$ is a vector subspace of the vector space of all $\Omega \to \mathbb{R}$ functions with the usual operations.
3. $\mathcal{L}$ is a subalgebra of the algebra of all $\Omega \to \mathbb{R}$ functions with the usual operations.
4. $\mathcal{L}$ is a sublattice of the lattice of all $\Omega \to \mathbb{R}$ functions with the usual operations.
5. Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence in $\mathcal{L}$ and $f : \Omega \to \mathbb{R}$. Then
If $f = \limsup_{n \to \infty} f_n$, then $f \in \mathcal{L}$.
If $f = \liminf_{n \to \infty} f_n$, then $f \in \mathcal{L}$.
As a special case of either of the above, if $f = \lim_{n \to \infty} f_n$, then $f \in \mathcal{L}$.
(here all limits are pointwise)
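Proof.
1. The inverse image of any set under any constant function is $\emptyset$ or $\Omega$, so all constant functions are in $\mathcal{L}$.
2. Suppose $f, g \in \mathcal{L}$. We need to show that $f + g \in \mathcal{L}$. Note that for any real number $a$ and $x \in \Omega$,
\[ f(x) + g(x) < a \iff f(x) < a - g(x) \iff \exists r \in \mathbb{Q}\,[f(x) < r < a - g(x)] \]
This allows us to write
\[ \{x : f(x) + g(x) < a\} = \bigcup_{r \in \mathbb{Q}} \big( \{x : f(x) < r\} \cap \{x : g(x) < a - r\} \big) \]
$\{x : f(x) < r\}$ and $\{x : g(x) < a - r\}$ belong to $\mathcal{A}$, and since $\mathcal{A}$ is a $\sigma$-field and $\mathbb{Q}$ is countable,
\[ \{x : f(x) + g(x) < a\} \in \mathcal{A} \]
Since this holds for all $a \in \mathbb{R}$, $f + g \in \mathcal{L}$.
Let $f \in \mathcal{L}$ and $c \in \mathbb{R}$. We need to show that $cf \in \mathcal{L}$. If $c = 0$, then $cf$ is the constant 0 function, and so is in $\mathcal{L}$. If $c > 0$, then for all $a \in \mathbb{R}$,
\[ \{x : (cf)(x) < a\} = \{x : c \cdot f(x) < a\} = \{x : f(x) < a/c\} \in \mathcal{A} \]
and so $cf \in \mathcal{L}$. If $c < 0$,
\[ \{x : (cf)(x) < a\} = \{x : f(x) > a/c\} \in \mathcal{A} \]
and so $cf \in \mathcal{L}$. If $f \in \mathcal{L}$ then $-f \in \mathcal{L}$ (take $c = -1$).
This shows that $\mathcal{L}$ is a vector subspace of $\mathbb{R}^{\Omega}$.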
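3. We need to show that if $f, g \in \mathcal{L}$ then $fg \in \mathcal{L}$. We first show that if $f \in \mathcal{L}$ then $f^2 \in \mathcal{L}$. For $a \le 0$,
\[ \{x : f^2(x) < a\} = \emptyset \in \mathcal{A} \]
and for $a > 0$,
\[ \{x : f^2(x) < a\} = \{x : -\sqrt{a} < f(x) < \sqrt{a}\} \in \mathcal{A} \]
So $f^2 \in \mathcal{L}$.
Now for $f, g \in \mathcal{L}$,
\[ fg = \frac{(f + g)^2 - f^2 - g^2}{2} \in \mathcal{L} \]
(since $\mathcal{L}$ is a vector space)
4. Let $f, g \in \mathcal{L}$ and consider $f \vee g$ (the pointwise max of $f$ and $g$). For all $a \in \mathbb{R}$,
\[ \{x : (f \vee g)(x) < a\} = \{x : f(x) < a\} \cap \{x : g(x) < a\} \in \mathcal{A} \]
So $f \vee g \in \mathcal{L}$. Similarly, for $f \wedge g$ (the pointwise min of $f$ and $g$),
\[ \{x : (f \wedge g)(x) < a\} = \{x : f(x) < a\} \cup \{x : g(x) < a\} \in \mathcal{A} \]
So $f \wedge g \in \mathcal{L}$.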
5. Note that here since we are dealing only with real-valued functions, we have to require that the limits be real-valued functions. Later we will also deal with functions which could take the values $\infty$ and $-\infty$.
We first show that if $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence in $\mathcal{L}$ which converges pointwise to an $f : \Omega \to \mathbb{R}$, then $f \in \mathcal{L}$.
\[ \{x : f(x) > a\} = \bigcup_{n \in \mathbb{N}} \{x : f_n(x) > a\} \in \mathcal{A} \]
So $f \in \mathcal{L}$. Similarly, if $\{f_n\}_{n=1}^{\infty}$ is a decreasing sequence in $\mathcal{L}$ which converges pointwise to an $f : \Omega \to \mathbb{R}$, then
\[ \{x : f(x) < a\} = \bigcup_{n \in \mathbb{N}} \{x : f_n(x) < a\} \in \mathcal{A} \]
Now we come to lim sup. Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence in $\mathcal{L}$ and $f : \Omega \to \mathbb{R}$ is such that
\[ f = \limsup_{n \to \infty} f_n \]
For any $x \in \Omega$, since the sequence $\{f_n(x)\}_{n=1}^{\infty}$ has a finite lim sup, it is bounded above. For all $m \in \mathbb{N}$, let $g_m : \Omega \to \mathbb{R}$ be defined as
\[ g_m(x) = \sup_{n \ge m} f_n(x) \]
By the above remark the supremum indeed belongs to $\mathbb{R}$. $\{g_m\}_{m=1}^{\infty}$ is a decreasing sequence and by the definition of lim sup, it converges pointwise to $f$. For all $m, l \in \mathbb{N}$, let $h_{m,l}$ be defined as
\[ h_{m,l}(x) = \max\{f_m(x), f_{m+1}(x), \ldots, f_{m+l}(x)\} \]
For any fixed $m$, $\{h_{m,l}\}_{l=1}^{\infty}$ is an increasing sequence and converges to $g_m$ pointwise. Since $\mathcal{L}$ is a lattice, each $h_{m,l}$ is in $\mathcal{L}$. So their increasing limit $g_m$ is in $\mathcal{L}$. The decreasing limit of $\{g_m\}_{m=1}^{\infty}$, which is $f$, is also in $\mathcal{L}$.
Now if $f = \liminf_{n \to \infty} f_n$, then $-f = \limsup_{n \to \infty} (-f_n)$. If each $f_n$ is in $\mathcal{L}$, so is $-f_n$, and hence $-f$, and hence $f$. Of course we could also give a direct argument analogous to the one for lim sup.
If $f = \lim_{n \to \infty} f_n$ (pointwise), then $f = \limsup_{n \to \infty} f_n$, and the required result holds.
Simple functions
We will identify certain measurable functions which are simple to study. As before, we have an $(\Omega, \mathcal{A}, \mu)$ and we are looking at $\Omega \to \mathbb{R}$ functions.
Definition. If $\Omega$ is a set and $A \subseteq \Omega$, the indicator function of $A$ is the function $1_A : \Omega \to \mathbb{R}$, given by $1_A(a) = 1$ if $a \in A$ and $1_A(a) = 0$ otherwise.
For a measure space $(\Omega, \mathcal{A}, \mu)$, clearly $1_A \in \mathcal{L}$ if and only if $A \in \mathcal{A}$. So if we are given only $\mathcal{L}$, the collection of all measurable functions, we can recover $\mathcal{A}$ from it: $\mathcal{A} = \{A \subseteq \Omega : 1_A \in \mathcal{L}\}$.
Definition. $f \in \mathcal{L}$ is called a simple function if the image of $f$ is finite.
Let $\mathcal{E} \subseteq \mathcal{L}$ be the set of all simple functions. Suppose $f \in \mathcal{E}$ takes $m$ values, $g \in \mathcal{E}$ takes $n$ values, and $c \in \mathbb{R}$. Then clearly $f + g$ and $fg$ take at most $mn$ values, $cf$ takes at most $m$ values, and $f \vee g$ and $f \wedge g$ take at most $m + n$ values. So $\mathcal{E}$ is a vector subspace of, subalgebra of, and sublattice of $\mathcal{L}$.
Any simple function can be written as a finite linear combination of indicator functions (in fact, indicator functions of disjoint sets). Suppose $f$ is a simple function which takes the values $a_1, \ldots, a_n$. Then
\[ f = \sum_{i=1}^{n} a_i \, 1_{f^{-1}(\{a_i\})} \]
In the next class we will prove the following two theorems, which say that any measurable function can be approximated by simple functions.
Theorem. Suppose $f \in \mathcal{L}$ is bounded. Then there exists an increasing sequence $\{f_n\}_{n=1}^{\infty}$ in $\mathcal{E}$ whose uniform limit is $f$.
Theorem. Suppose $f \in \mathcal{L}$ is nonnegative (that is, $f(x) \ge 0$ for all $x \in \Omega$). Then there exists an increasing sequence $\{f_n\}_{n=1}^{\infty}$ in $\mathcal{E}$ whose pointwise limit is $f$.
Measurable functions and simple functions

The class of all real measurable functions on $(\Omega, \mathcal{A})$ is too vast to study directly. We identify ways to study them via simpler functions or collections of functions. Recall that $\mathcal{L}$ is the set of all measurable functions from $(\Omega, \mathcal{A})$ to $\mathbb{R}$ and $\mathcal{E} \subseteq \mathcal{L}$ are all the simple functions.
Theorem. Suppose $f \in \mathcal{L}$ is bounded. Then there exists an increasing sequence $\{f_n\}_{n=1}^{\infty}$ in $\mathcal{E}$ whose uniform limit is $f$.
Proof. First assume that $f$ takes values in $[0, 1)$. We divide $[0, 1)$ into $2^n$ intervals and use this to construct $f_n$:
\[ f_n = \sum_{k=0}^{2^n - 1} \frac{k}{2^n} \, 1_{f^{-1}\left(\left[\frac{k}{2^n}, \frac{k+1}{2^n}\right)\right)} \]
Whenever $f$ takes a value in $\left[\frac{k}{2^n}, \frac{k+1}{2^n}\right)$, $f_n$ takes the value $\frac{k}{2^n}$. We have
For all $n$, $f_n \le f$.
For all $n, x$, $|f_n(x) - f(x)| \le \frac{1}{2^n}$. This is clear from the construction.
$f_n \in \mathcal{E}$, since $f_n$ is a finite linear combination of indicator functions of sets in $\mathcal{A}$.
$f_n \le f_{n+1}$: For any $x$, if $f_n(x) = \frac{k}{2^n}$, then $f_{n+1}(x) \in \left\{\frac{2k}{2^{n+1}}, \frac{2k+1}{2^{n+1}}\right\}$.
So $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence in $\mathcal{E}$ converging uniformly to $f$.
Now for a general $f$, if the image of $f$ lies in $[a, b)$, then let
\[ g = \frac{f - a1}{b - a} \]
(note that $a1$ is the constant function $a$)
$\operatorname{Im} g \subseteq [0, 1)$, so by the above we have $\{g_n\}_{n=1}^{\infty}$ from $\mathcal{E}$ increasing uniformly to $g$. Let
\[ f_n = a1 + (b - a) g_n \]
Then $\{f_n\}_{n=1}^{\infty}$ is a sequence from $\mathcal{E}$ increasing uniformly to $f$.
Theorem. Suppose $f \in \mathcal{L}$ is nonnegative (that is, $f(x) \ge 0$ for all $x \in \Omega$). Then there exists an increasing sequence $\{f_n\}_{n=1}^{\infty}$ in $\mathcal{E}$ whose pointwise limit is $f$.
Proof. As in the previous proof, we divide the range of $f$ into intervals of length $\frac{1}{2^n}$. However, $f$ could be unbounded and we may need infinitely many such intervals, which will not help in defining a simple function. So at the $n$th stage, we divide only $[0, n)$, and not the entire $[0, \infty)$. More precisely, let
\[ f_n = \sum_{k=0}^{n 2^n - 1} \frac{k}{2^n} \, 1_{f^{-1}\left(\left[\frac{k}{2^n}, \frac{k+1}{2^n}\right)\right)} \]
For all $n$, $f_n \le f$. This is clear from the construction of $f_n$ (note that $f \ge 0$).
For all $n, x$, if $f(x) < n$, then $|f_n(x) - f(x)| < \frac{1}{2^n}$.
$f_n \in \mathcal{E}$, since $f_n$ is a finite linear combination of indicator functions of sets in $\mathcal{A}$.
$f_n \le f_{n+1}$: if $f_n(x) = 0$ then clearly $f_n(x) \le f_{n+1}(x)$, otherwise $f(x) < n < n + 1$ and from the construction it is easy to see that $f_n(x) \le f_{n+1}(x)$.
Now for any $x \in \Omega$, there is some integer $m$ such that $m > f(x)$. For all $n \ge m$, $|f_n(x) - f(x)| < \frac{1}{2^n}$, and so $\lim_{n \to \infty} f_n(x) = f(x)$.
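The construction in this proof can be written down directly. The following is a minimal sketch; the helper name `dyadic_approx` and the sample function in the usage lines are illustrative assumptions, not part of the notes. It evaluates $f_n(x)$ as $\lfloor 2^n f(x) \rfloor / 2^n$ when $f(x) < n$ and as $0$ otherwise, which is the value the sum above takes at $x$.

```python
import math

def dyadic_approx(f, n):
    """Return the n-th approximating function f_n of a nonnegative f.

    f_n(x) = k / 2^n  when f(x) lies in [k/2^n, (k+1)/2^n) and f(x) < n,
    f_n(x) = 0        when f(x) >= n.
    """
    def f_n(x):
        y = f(x)
        if y >= n:
            return 0.0
        return math.floor(y * 2**n) / 2**n
    return f_n

if __name__ == "__main__":
    square = lambda x: x * x          # hypothetical nonnegative function
    for n in (1, 2, 4, 8):
        print(n, dyadic_approx(square, n)(1.7))
```

At the point $x = 1.7$ the printed values increase with $n$ and approach $f(1.7) = 2.89$; for $n \le 2$ the value is $0$ because $f(x) \ge n$ there.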
For $f : \Omega \to \mathbb{R}$, define $f^+, f^- : \Omega \to \mathbb{R}$ by
\[ f^+(x) = \max\{f(x), 0\} \]
\[ f^-(x) = -\min\{f(x), 0\} \]
Both $f^+$ and $f^-$ are nonnegative, and $f = f^+ - f^-$. If $f$ is measurable, so are both $f^+$ and $f^-$. Every measurable function is the difference of two nonnegative measurable functions. This is useful because by the above theorem, nonnegative measurable functions are easier to deal with than general measurable functions.
Theorem. Suppose $\Omega$ is a nonempty set and $\mathcal{A}$ is a $\sigma$-field on it. $\mathcal{F}$ is a field such that $\sigma(\mathcal{F}) = \mathcal{A}$. Suppose $\mathbb{F}$ is a collection of real measurable functions on $\Omega$ which is a vector space under the natural operations and closed under monotone pointwise limits. Suppose for all $A \in \mathcal{F}$, $1_A \in \mathbb{F}$. Then $\mathbb{F} = \mathcal{L}$, the set of all real measurable functions.
Proof. We first show that indicator functions of all sets in $\mathcal{A}$ belong to $\mathbb{F}$. Let
\[ \mathcal{B} = \{A \in \mathcal{A} : 1_A \in \mathbb{F}\} \]
We are given that $\mathcal{F} \subseteq \mathcal{B}$. Since $\mathbb{F}$ is closed under monotone pointwise limits, $\mathcal{B}$ is a monotone class. By the monotone class theorem, $\mathcal{B}$ contains $\sigma(\mathcal{F})$, and so $\mathcal{B} = \mathcal{A}$. So $\mathbb{F}$ has:
$1_A$ for all $A \in \mathcal{A}$.
By taking finite linear combinations of the above (since $\mathbb{F}$ is a vector space), all simple functions.
By taking increasing pointwise limits of simple functions, all nonnegative measurable functions.
By taking differences of nonnegative measurable functions, all measurable functions.
Sometimes we are concerned only about bounded measurable functions. We have a similar theorem
for that case. We need a definition first.
Definition. A sequence of $\Omega \to \mathbb{R}$ functions $\{f_n\}_{n=1}^{\infty}$ is said to converge bounded pointwise (abbreviated as bp) to $f : \Omega \to \mathbb{R}$ if
$\{f_n\}_{n=1}^{\infty}$ converges to $f$ pointwise.
There is a $C \in \mathbb{R}$ such that for all $n$ and $x$, $|f_n(x)| \le C$.
In other words, bounded pointwise convergence is just pointwise convergence along with uniform boundedness of the sequence of functions. Here are some examples of $[0, 1] \to \mathbb{R}$ functions which illustrate various modes of convergence.
[Figure: graphs of $f_n$, $g_n$ and $h_n$ on $[0, 1]$, with axis marks at $n$, $1$ and $\frac{1}{n}$.]
$\{f_n\}_{n=1}^{\infty}$ converges to the zero function pointwise, but not bounded pointwise or uniformly. $\{g_n\}_{n=1}^{\infty}$ converges to the zero function uniformly and bounded pointwise. $\{h_n\}_{n=1}^{\infty}$ converges to the zero function bounded pointwise but not uniformly.
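The three behaviours can be checked on concrete functions. The following is a minimal sketch; the formulas for `f`, `g` and `h` are assumptions chosen to match the behaviour described (the original figure is not reproduced here), and `sup_on_grid` only approximates the supremum by sampling.

```python
def f(n):
    # Hypothetical spike: height n on (0, 1/n), zero elsewhere.
    # Pointwise limit 0, but sup |f_n| = n is unbounded.
    return lambda x: float(n) if 0 < x < 1 / n else 0.0

def g(n):
    # Hypothetical constant function 1/n.
    # sup |g_n| = 1/n -> 0, so the convergence is uniform.
    return lambda x: 1 / n

def h(n):
    # Hypothetical bump: height 1 on (0, 1/n), zero elsewhere.
    # Bounded by 1 and pointwise limit 0, but sup |h_n| = 1 for every n.
    return lambda x: 1.0 if 0 < x < 1 / n else 0.0

def sup_on_grid(func, points=10_000):
    # Largest absolute value on a grid in [0, 1]; a stand-in for the supremum.
    return max(abs(func(k / points)) for k in range(points + 1))

if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, sup_on_grid(f(n)), sup_on_grid(g(n)), sup_on_grid(h(n)))
```

For each fixed $x > 0$, all three sequences are eventually $0$ or tend to $0$ at $x$; the difference between the modes of convergence shows up only in the suprema.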
Theorem. Let $\Omega$, $\mathcal{A}$ and the field $\mathcal{F}$ with $\sigma(\mathcal{F}) = \mathcal{A}$ be as in the previous theorem. Suppose $\mathbb{F}$ is a collection of bounded real measurable functions on $\Omega$ which is a vector space under the natural operations and closed under bounded pointwise limits. Suppose for all $A \in \mathcal{F}$, $1_A \in \mathbb{F}$. Then $\mathbb{F}$ is the set of all bounded real measurable functions.
Proof. As before, we first show that indicator functions of all sets in $\mathcal{A}$ belong to $\mathbb{F}$. Let
\[ \mathcal{B} = \{A \in \mathcal{A} : 1_A \in \mathbb{F}\} \]
We are given that $\mathcal{F} \subseteq \mathcal{B}$. Since $\mathbb{F}$ is closed under bounded pointwise limits, $\mathcal{B}$ is a monotone class (all indicator functions are bounded by 1). By the monotone class theorem, $\mathcal{B}$ contains $\sigma(\mathcal{F})$, and so $\mathcal{B} = \mathcal{A}$. So $\mathbb{F}$ has:
$1_A$ for all $A \in \mathcal{A}$.
By taking finite linear combinations of the above (since $\mathbb{F}$ is a vector space), all simple functions.
By taking increasing bounded pointwise limits of simple functions, all nonnegative bounded measurable functions (if $0 \le f(x) \le C$ for all $x$, we can get an increasing sequence of simple functions converging to it pointwise (even uniformly, in fact), all of which take values between 0 and $C$).
By taking differences of nonnegative bounded measurable functions, all bounded measurable functions (if $f$ is bounded, so are $f^+$ and $f^-$).
Defining integration
We have studied the properties of the class of measurable functions; now we define their integrals. Suppose $(\Omega, \mathcal{A}, \mu)$ is a measure space, and $f : \Omega \to \mathbb{R}$ is measurable. How can we define $\int f \, d\mu$? If we could define the integral for $f^+$ and $f^-$, we could define
\[ \int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu \]
because we want integration to be a linear operation. So we have reduced the problem from general measurable functions to nonnegative measurable functions. If $f$ is nonnegative, how can we define $\int f \, d\mu$? If $s$ is a nonnegative simple function, say taking values $a_1, \ldots, a_k$, then we can define
\[ \int s \, d\mu = \sum_{i=1}^{k} a_i \, \mu(s^{-1}(\{a_i\})) \]
(this corresponds to our notion of area under the curve)
Since we want integration to be monotonic, we should have
\[ \int f \, d\mu \ge \sup\left\{ \int s \, d\mu : s \text{ simple}, \ s \le f \right\} \]
Since $f$ is nonnegative, we know that nonnegative simple functions $s$ with $s \le f$ exist; in fact we can get a sequence of them converging to $f$. Though it is not immediately clear, it makes sense to define the integral of $f$ by taking the above inequality to be an equality. We will see this in the next class.
Uniform convergence vs bp convergence - an example

In the last class we proved the following theorem:
Suppose $\mathbb{F}$ is a collection of bounded real measurable functions on $\Omega$ which is a vector space under the natural operations and closed under bounded pointwise limits. Suppose for all $A \in \mathcal{F}$, $1_A \in \mathbb{F}$. Then $\mathbb{F}$ is the set of all bounded real measurable functions.
This theorem does not hold if bp convergence is replaced by uniform convergence in the statement of the theorem. Here is a counterexample: Let $(\Omega, \mathcal{A}) = (\mathbb{R}, \mathcal{B})$ and let
\[ \mathbb{F} = \{f \in \mathbb{R}^{\mathbb{R}} : f \text{ is measurable, bounded, left-continuous}\} \]
Let $\mathcal{F}$ be the field of all finite disjoint unions of intervals of the type $(a, b]$. It is easy to see that $1_{(a,b]}$ is left-continuous, and hence $1_A$ is left-continuous for all $A \in \mathcal{F}$. $\mathbb{F}$ has $1_A$ for all $A \in \mathcal{F}$. If $f, g$ are left-continuous and bounded, and $c \in \mathbb{R}$, then $f + g$ and $cf$ are also left-continuous and bounded. $\mathbb{F}$ is a vector subspace of $\mathbb{R}^{\mathbb{R}}$ with the natural operations. Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence from $\mathbb{F}$ which converges uniformly to $f$. Pointwise convergence alone tells us that $f$ is measurable. By uniform convergence, there is an $N$ such that for all $x \in \mathbb{R}$, $|f_N(x) - f(x)| < 1$. Since $f_N$ is bounded, so is $f$. We show that $f$ is left-continuous: fix $x \in \mathbb{R}$ and let $\epsilon > 0$ be given. Choose $M \in \mathbb{N}$ such that for all $y \in \mathbb{R}$, $|f_M(y) - f(y)| < \epsilon/3$. By left-continuity of $f_M$, choose $\delta > 0$ such that for all $y \in (x - \delta, x]$, $|f_M(y) - f_M(x)| < \epsilon/3$. For all $y \in (x - \delta, x]$,
\[ |f(y) - f(x)| \le |f(y) - f_M(y)| + |f_M(y) - f_M(x)| + |f_M(x) - f(x)| < 3 \cdot \frac{\epsilon}{3} = \epsilon \]
$f$ is left-continuous, and so $\mathbb{F}$ is closed under uniform limits. However, $\mathbb{F}$ does not have all bounded measurable functions (for example, it does not have $1_{(0,1)}$, which is not left-continuous at 1).
Defining integration (contd)

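We now define integration.
Definition. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. All functions considered are $\Omega \to \mathbb{R}$ measurable functions.
Clause 1 If $f = \sum_{i=1}^{k} a_i 1_{A_i}$, each $a_i \ge 0$ and $A_i \in \mathcal{A}$, put
\[ \int f \, d\mu = \sum_{i=1}^{k} a_i \, \mu(A_i) \]
($a_i \, \mu(A_i)$ is the area of a rectangle with height $a_i$ and length $\mu(A_i)$)
Clause 2 If $f$ is nonnegative, put
\[ \int f \, d\mu = \sup\left\{ \int g \, d\mu : g \text{ nonnegative simple}, \ g \le f \right\} \]
Clause 3 For any $f : \Omega \to \mathbb{R}$ measurable, if at least one of $\int f^+ \, d\mu$ and $\int f^- \, d\mu$ is finite, put
\[ \int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu \]
If both $\int f^+ \, d\mu$ and $\int f^- \, d\mu$ are $\infty$, then we say that $\int f \, d\mu$ does not exist. $f$ is said to be integrable if $\int f \, d\mu$ exists and is finite.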
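On a finite set the three clauses can be carried out by hand. The following is a minimal sketch under the assumption that $\Omega$ is finite and the measure is given by point masses (possibly infinite); the function names are illustrative. On such a space every function is simple, and the clause 1 sum over the values of $f$ is the same as the sum of $f(\omega)\,\mu(\{\omega\})$ over the points.

```python
import math

def times(a, m):
    # Product a * m with the convention 0 * infinity = 0.
    return 0.0 if a == 0 or m == 0 else a * m

def integral_nonneg(f, mu):
    # Clauses 1 and 2 on a finite set: f and mu are dicts keyed by the
    # points of Omega; f is nonnegative, mu gives the mass of each point.
    return sum(times(f[w], mu[w]) for w in mu)

def integral(f, mu):
    # Clause 3: integrate the positive and negative parts separately.
    pos = integral_nonneg({w: max(f[w], 0.0) for w in mu}, mu)
    neg = integral_nonneg({w: max(-f[w], 0.0) for w in mu}, mu)
    if math.isinf(pos) and math.isinf(neg):
        return None  # the integral does not exist
    return pos - neg

if __name__ == "__main__":
    mu = {"a": 1.0, "b": math.inf, "c": 2.0}
    print(integral({"a": 3.0, "b": 0.0, "c": -1.0}, mu))  # finite: integrable
    print(integral({"a": 1.0, "b": 1.0, "c": 0.0}, mu))   # exists, but infinite
```

Returning `None` for the undefined case is a design choice of this sketch; it keeps "does not exist" distinct from "exists and is infinite".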
We will analyse these definitions, but first some remarks:
We have defined $0 \cdot \infty = 0$. However, we need to be careful! We cannot expand $0$ or take limits. For example, it is not correct to say
\[ \infty - \infty = (1 - 1) \cdot \infty = 0 \cdot \infty = 0 \]
\[ \lim_{n \to \infty} \left( \frac{1}{n} \cdot n \right) = 0 \cdot \infty = 0 \]
In clause 1, we did not say "for any nonnegative simple function $f$, we define $\int f \, d\mu$ as ...", although in clause 1 we defined the integral precisely for nonnegative simple functions. The reason is that nonnegative simple functions may be expressed in ways which are not suitable for the definition we gave. For example, $1_{(8,\infty)} = 2 \cdot 1_{(8,\infty)} - 1_{(8,\infty)}$, but it is not correct to say that
\[ \int 1_{(8,\infty)} \, d\mu = 2\mu((8, \infty)) - \mu((8, \infty)) = 2 \cdot \infty - \infty \]
So we explicitly required the coefficients to be nonnegative.
We now look at some properties of the definition:
Clause 1
1. The definition in clause 1 is a good definition. If
\[ \sum_{i=1}^{k} a_i 1_{A_i} = \sum_{i=1}^{l} b_i 1_{B_i} \]
with all $a_i, b_i \ge 0$ and $A_i, B_i \in \mathcal{A}$, then we must have
\[ \sum_{i=1}^{k} a_i \, \mu(A_i) = \sum_{i=1}^{l} b_i \, \mu(B_i) \]
This involves a tedious combinatorial argument which we will look at later. For now we assume this.
2. If $f, g$ are nonnegative simple, then $\int (f + g) = \int f + \int g$: Write
\[ f = \sum_{i=1}^{k} a_i 1_{A_i}, \quad g = \sum_{i=1}^{l} b_i 1_{B_i} \]
with all $a_i, b_i \ge 0$ and $A_i, B_i \in \mathcal{A}$. A convenient way of expressing $f + g$ as in the definition in clause 1 is simply
\[ f + g = \sum_{i=1}^{k} a_i 1_{A_i} + \sum_{i=1}^{l} b_i 1_{B_i} \]
So
\[ \int (f + g) = \sum_{i=1}^{k} a_i \, \mu(A_i) + \sum_{i=1}^{l} b_i \, \mu(B_i) = \int f + \int g \]
3. If $c \in \mathbb{R}$, $c \ge 0$, and $f$ is nonnegative simple, then $\int cf = c \int f$: If $f = \sum_{i=1}^{k} a_i 1_{A_i}$, then $cf = \sum_{i=1}^{k} c a_i 1_{A_i}$, and
\[ \int cf = \sum_{i=1}^{k} c a_i \, \mu(A_i) = c \sum_{i=1}^{k} a_i \, \mu(A_i) = c \int f \]
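4. If $f$, $g$ are nonnegative simple with $f \le g$, then $\int f \le \int g$: Suppose $f$ takes $k$ values, $a_1, \ldots, a_k$ and $g$ takes $l$ values, $b_1, \ldots, b_l$. Let $A_i = f^{-1}(\{a_i\})$ and $B_j = g^{-1}(\{b_j\})$. $\{A_i : 1 \le i \le k\}$ is a partition of $\Omega$, as is $\{B_j : 1 \le j \le l\}$. Note that $1_{A_i} = \sum_j 1_{A_i \cap B_j}$, and a similar expression holds for $1_{B_j}$. Using these,
\[ f = \sum_{i,j} a_i 1_{A_i \cap B_j} = \sum_{i,j \,:\, A_i \cap B_j \ne \emptyset} a_i 1_{A_i \cap B_j} \]
\[ g = \sum_{i,j} b_j 1_{A_i \cap B_j} = \sum_{i,j \,:\, A_i \cap B_j \ne \emptyset} b_j 1_{A_i \cap B_j} \]
If $A_i \cap B_j$ is nonempty, say has a point $x$, then $a_i = f(x) \le g(x) = b_j$. So
\[ \int f = \sum_{i,j \,:\, A_i \cap B_j \ne \emptyset} a_i \, \mu(A_i \cap B_j) \le \sum_{i,j \,:\, A_i \cap B_j \ne \emptyset} b_j \, \mu(A_i \cap B_j) = \int g \]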
Clause 2
5. If $f$ is nonnegative simple, then $\int f$ as defined in clause 1 agrees with $\int f$ defined in clause 2: By monotonicity of the integral from clause 1 and using that $f$ is simple,
\[ \underbrace{\int f}_{\text{clause 2}} = \sup\Big\{ \underbrace{\int g}_{\text{clause 1}} : g \text{ nonnegative simple}, \ g \le f \Big\} = \underbrace{\int f}_{\text{clause 1}} \]
The definition in clause 2 is an extension of the definition in clause 1, as it should be. For simple functions $f$, we can talk of $\int f$ unambiguously.
6. If $0 \le f_1 \le f_2$, then $\int f_1 \le \int f_2$: If $g$ is simple and $g \le f_1$, then $g \le f_2$.
\[ \left\{ \int g : g \text{ nonnegative simple}, \ g \le f_1 \right\} \subseteq \left\{ \int g : g \text{ nonnegative simple}, \ g \le f_2 \right\} \]
So $\int f_1 \le \int f_2$.
7. The following is a very important theorem which relates integrals to convergence of functions.
Theorem (Monotone convergence theorem). Suppose $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions which converges pointwise to $f$. Then
\[ \int f_n \, d\mu \uparrow \int f \, d\mu \quad \text{as } n \to \infty \]
We will prove this after an application:
8. If $f, g$ are nonnegative measurable, then $\int (f + g) = \int f + \int g$: Let $\{s_n\}_{n=1}^{\infty}$ and $\{t_n\}_{n=1}^{\infty}$ be increasing sequences of nonnegative simple functions which converge pointwise to $f$ and $g$ respectively (we know that such sequences exist). Then $\{s_n + t_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative functions which converges to $f + g$. For each $n$,
\[ \int s_n + \int t_n = \int (s_n + t_n) \]
from clause 1. Taking the limit as $n \to \infty$ and using the monotone convergence theorem,
\[ \int f + \int g = \int (f + g) \]
9. If $c \ge 0$ and $f$ is nonnegative measurable, then $\int cf = c \int f$: If $c = 0$ then both sides are 0 (note that $\int f$ could be $\infty$, so we are using our convention that $0 \cdot \infty = 0$ here). If $c > 0$, let
\[ A = \left\{ \int g : g \text{ nonnegative simple}, \ g \le f \right\}, \quad B = \left\{ \int g : g \text{ nonnegative simple}, \ g \le cf \right\} \]
It is easy to see that for any $x \in [0, \infty]$, $x \in A$ if and only if $cx \in B$. So
\[ c \int f = c \sup A = \sup B = \int cf \]
Proof of the monotone convergence theorem

Suppose $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions which converges pointwise to $f$. We know that $f$ is measurable. By the monotonicity of the integral, we know that $\left\{ \int f_n \right\}_{n=1}^{\infty}$ is an increasing sequence and that for all $n$, $\int f_n \le \int f$. So
\[ \lim_{n \to \infty} \int f_n \le \int f \]
We will show the reverse inequality:
\[ \int f \le \lim_{n \to \infty} \int f_n \]
By the supremum definition of $\int f$, it suffices to show that for all nonnegative simple functions $g$ with $g \le f$, we have
\[ \int g \le \lim_{n \to \infty} \int f_n \]
It suffices to show that for all $\alpha \in (0, 1)$,
\[ \int \alpha g = \alpha \int g \le \lim_{n \to \infty} \int f_n \]
We will show this.
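Lemma. If $g$ is nonnegative simple, and $\{\Omega_n\}_{n=1}^{\infty}$ is an increasing sequence in $\mathcal{A}$ increasing to $\Omega$, then
\[ \int g 1_{\Omega_n} \uparrow \int g \quad \text{as } n \to \infty \]
Proof. If $m \le n$, then $g 1_{\Omega_m} \le g 1_{\Omega_n}$ and so $\int g 1_{\Omega_m} \le \int g 1_{\Omega_n}$. Write
\[ g = \sum_{i=1}^{k} a_i 1_{A_i} \]
with each $a_i \ge 0$ and $A_i \in \mathcal{A}$. Then
\[ g 1_{\Omega_n} = \sum_{i=1}^{k} a_i 1_{A_i} 1_{\Omega_n} = \sum_{i=1}^{k} a_i 1_{A_i \cap \Omega_n} \]
\[ \int g 1_{\Omega_n} = \sum_{i=1}^{k} a_i \, \mu(A_i \cap \Omega_n) \]
As $n \to \infty$, $\mu(A_i \cap \Omega_n) \uparrow \mu(A_i \cap \Omega) = \mu(A_i)$. Since the above is a finite sum,
\[ \lim_{n \to \infty} \int g 1_{\Omega_n} = \sum_{i=1}^{k} a_i \, \mu(A_i) = \int g \]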
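Now let
\[ \Omega_n = \{\omega \in \Omega : \alpha g(\omega) \le f_n(\omega)\} \]
$\Omega_n \in \mathcal{A}$. Since $\{f_n\}_{n=1}^{\infty}$ is increasing, so is $\{\Omega_n\}_{n=1}^{\infty}$. Take any $\omega \in \Omega$. If $g(\omega) = 0$, then $\omega \in \Omega_1$. Otherwise, $g(\omega) \le f(\omega)$, so $\alpha g(\omega) < f(\omega)$, and so there is an integer $n$ such that $\alpha g(\omega) < f_n(\omega)$. $\omega \in \Omega_n$. This shows that
\[ \bigcup_{n \in \mathbb{N}} \Omega_n = \Omega \]
From the lemma above (applied to the nonnegative simple function $\alpha g$),
\[ \lim_{n \to \infty} \int \alpha g 1_{\Omega_n} = \int \alpha g \]
Note that for all $n$,
\[ \alpha g 1_{\Omega_n} \le f_n \]
since for $\omega \in \Omega_n$, $(\alpha g 1_{\Omega_n})(\omega) \le f_n(\omega)$ holds by definition of $\Omega_n$; and for $\omega \notin \Omega_n$, $(\alpha g 1_{\Omega_n})(\omega) = 0$. So
\[ \int \alpha g 1_{\Omega_n} \le \int f_n \]
Taking the limit as $n \to \infty$,
\[ \alpha \int g \le \lim_{n \to \infty} \int f_n \]
As observed earlier, this completes the proof.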

We had defined integration, in 3 steps, and had analysed clauses 1 and 2.
Clause 3
10. The definition in clause 3 extends that in clause 2: If $f$ is nonnegative, then $f^+ = f$ and $f^- = 0$, and so
\[ \underbrace{\int f}_{\text{clause 3}} = \underbrace{\int f^+}_{\text{clause 2}} - \underbrace{\int f^-}_{\text{clause 2}} = \underbrace{\int f}_{\text{clause 2}} \]
Let $L^1(\Omega, \mathcal{A}, \mu) \subseteq \mathcal{L}(\Omega, \mathcal{A}, \mu)$ be the set of integrable functions. Recall that $f$ is said to be integrable if $\int f$ exists and is finite (equivalently, if both $\int f^+$ and $\int f^-$ are finite).
11. $f \in L^1 \iff |f| \in L^1$: if $f \in L^1$, then $\int f^+$ and $\int f^-$ are finite. $|f| = f^+ + f^-$. From clause 2, we know that $\int |f| = \int f^+ + \int f^-$, and so $\int |f|$ is finite. $|f| \in L^1$.
If $|f| \in L^1$, then $\int |f|$ is finite. Since $f^+ \le |f|$ and $f^- \le |f|$, we know from clause 2 that $\int f^+$ and $\int f^-$ are finite. So $f \in L^1$.
12. If $f, g \in L^1$ then $f + g \in L^1$ and $\int (f + g) = \int f + \int g$:
\[ (f + g)^+, \ (f + g)^- \le |f + g| \le |f| + |g| = f^+ + f^- + g^+ + g^- \]
By clause 2, the nonnegative function $f^+ + f^- + g^+ + g^-$ has finite integral. So $\int (f + g)^+$ and $\int (f + g)^-$ are finite. $f + g \in L^1$. We have
\[ (f + g)^+ - (f + g)^- = f + g = f^+ - f^- + g^+ - g^- \]
Rearranging,
\[ (f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+ \]
This function is nonnegative; using clause 2 we get
\[ \int (f + g)^+ + \int f^- + \int g^- = \int (f + g)^- + \int f^+ + \int g^+ \]
Since $f + g$, $f$, and $g$ are in $L^1$, all these quantities are finite, and we can rearrange them,
\[ \int (f + g)^+ - \int (f + g)^- = \int f^+ - \int f^- + \int g^+ - \int g^- \]
By definition of the integral in clause 3,
\[ \int (f + g) = \int f + \int g \]
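13. If $f \in L^1$ and $c \in \mathbb{R}$, then $cf \in L^1$ and $\int cf = c \int f$: if $c = 0$ then $cf$ is the constant 0 function, and $\int cf = 0 = c \int f$. If $c > 0$, then $(cf)^+ = cf^+$ and $(cf)^- = cf^-$ and from clause 2 and the definition of the integral in clause 3 we have
\[ \int cf = \int (cf)^+ - \int (cf)^- = \int cf^+ - \int cf^- = c \int f^+ - c \int f^- = c \left( \int f^+ - \int f^- \right) = c \int f \]
If $c < 0$, then $(cf)^+ = (-c)f^-$ and $(cf)^- = (-c)f^+$, and so
\[ \int cf = \int (cf)^+ - \int (cf)^- = \int (-c)f^- - \int (-c)f^+ = (-c) \int f^- - (-c) \int f^+ = c \left( \int f^+ - \int f^- \right) = c \int f \]
14. If $f \in L^1$, then $\left| \int f \right| \le \int |f|$:[1]
\[ \left| \int f \right| = \left| \int f^+ - \int f^- \right| \le \int f^+ + \int f^- = \int |f| \]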
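15. Theorem (Fatou's Lemma). Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence of nonnegative measurable functions, and $f$ is nonnegative measurable such that $f = \liminf_{n \to \infty} f_n$ (pointwise), then
\[ \int \liminf_{n \to \infty} f_n \le \liminf_{n \to \infty} \int f_n \]
Proof. For all $n \in \mathbb{N}$, let $g_n = \inf\{f_n, f_{n+1}, \ldots\}$. Each $g_n$ is measurable, and by definition of lim inf, $\{g_n\}_{n=1}^{\infty}$ increases to $f$. By the monotone convergence theorem, we have
\[ \int g_n \uparrow \int f \quad \text{as } n \to \infty \]
Each $g_n \le f_n$, so
\[ \int g_n \le \int f_n \]
Taking lim inf on both sides (which on the LHS is the same as taking the limit),
\[ \lim_{n \to \infty} \int g_n \le \liminf_{n \to \infty} \int f_n \]
Since $\lim_{n \to \infty} \int g_n = \int f = \int \liminf_{n \to \infty} f_n$, the proof is complete.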
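The inequality in Fatou's lemma can be strict. The following is a minimal sketch of a standard example, assuming the measure space is $\mathbb{N}$ with counting measure and $f_n = 1_{\{n\}}$ (this example is not from the notes). Every $\int f_n$ equals 1, while at each fixed point $k$ we have $f_n(k) = 0$ for all $n > k$, so the pointwise lim inf is the zero function. The code can only inspect finitely many $n$, so `tail_inf` is a finite stand-in for the infimum over all $n \ge m$.

```python
def f(n):
    # f_n = indicator of {n}, stored as a finitely supported function on N
    # (a dict mapping points to nonzero values).
    return {n: 1}

def integral(func):
    # Integral against counting measure of a finitely supported
    # nonnegative function: the sum of its values.
    return sum(func.values())

def tail_inf(k, m, N):
    # inf over m <= n <= N of f_n(k); a finite stand-in for inf over n >= m.
    return min(f(n).get(k, 0) for n in range(m, N + 1))

if __name__ == "__main__":
    print([integral(f(n)) for n in range(1, 6)])         # every integral is 1
    print([tail_inf(k, k + 1, 200) for k in range(1, 6)])  # tails vanish at each k
```

Here $\int \liminf f_n = 0 < 1 = \liminf \int f_n$: the mass of $f_n$ escapes to infinity.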
16. We now prove a theorem which is useful in many contexts when we deal with limits of functions. For example, it is useful when we want to commute differentiation with integration:
\[ \frac{d}{d\theta} \int f(x, \theta) \, dx = \int \frac{\partial}{\partial\theta} f(x, \theta) \, dx \]
Theorem (Dominated convergence theorem). Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence of functions and $g$ is a function, all in $L^1$, with $|f_n| \le g$ for all $n$. Suppose $f : \Omega \to \mathbb{R}$ is such that $\lim_{n \to \infty} f_n = f$ (pointwise). Then $f \in L^1$ and
\[ \lim_{n \to \infty} \int f_n = \int f, \quad \lim_{n \to \infty} \int |f_n - f| = 0 \]
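Proof. Since $|f_n| \le g$ for all $n$, taking the pointwise limit gives $|f| \le g$. By the monotonicity of the integral, $\int |f| \le \int g < \infty$. $f \in L^1$.
\[ |f_n - f| \le |f_n| + |f| \le 2g \]
For all $n$, $2g - |f_n - f|$ is a nonnegative function. By Fatou's lemma,
\[ \int \liminf_{n \to \infty} (2g - |f_n - f|) \le \liminf_{n \to \infty} \int (2g - |f_n - f|) \]
Since $\lim_{n \to \infty} |f_n - f| = 0$, the lim inf in the LHS is actually a limit, and the LHS is $\int 2g$. So
\[ \int 2g \le \int 2g + \liminf_{n \to \infty} \left( - \int |f_n - f| \right) = \int 2g - \limsup_{n \to \infty} \int |f_n - f| \]
Since $\int 2g$ is finite,
\[ \limsup_{n \to \infty} \int |f_n - f| \le 0 \]
Since $|f_n - f|$ is nonnegative,
\[ 0 \le \liminf_{n \to \infty} \int |f_n - f| \le \limsup_{n \to \infty} \int |f_n - f| \le 0 \]
Equality holds throughout, and so
\[ \lim_{n \to \infty} \int |f_n - f| = 0 \]
Since
\[ 0 \le \left| \int f_n - \int f \right| = \left| \int (f_n - f) \right| \le \int |f_n - f| \]
we have
\[ \lim_{n \to \infty} \int f_n = \int f \]
[1] Many of the familiar properties of summation carry over to integration - after all, integration is a big sum!
We have not yet shown that the definition of the integral for nonnegative simple functions in clause 1 is well-defined. We will show that in the next class.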
We have not yet shown that the definition of the integral for nonnegative simple functions in clause
1 is well-defined. We will show that in the next class.
The notion of Lebesgue integral will not be very useful if for some function, its Riemann and
Lebesgue integrals do not agree! Fortunately, this is not the case, as we will show later. The theory
we have built is very general, and includes as special cases Riemann integration and summation
of series.

One part in our program for defining integration is still missing: showing that the definition of the integral for a nonnegative simple function is well-defined. We want to show that if $a_1, \ldots, a_k$ and $b_1, \ldots, b_l$ are nonnegative real numbers, and $A_1, \ldots, A_k$ and $B_1, \ldots, B_l$ are sets in $\mathcal{A}$, such that
\[ \sum_{i=1}^{k} a_i 1_{A_i} = \sum_{j=1}^{l} b_j 1_{B_j} \]
then
\[ \sum_{i=1}^{k} a_i \, \mu(A_i) = \sum_{j=1}^{l} b_j \, \mu(B_j) \]
Suppose $f$ is a nonnegative simple function, taking values $\alpha_1, \ldots, \alpha_k$. For $1 \le i \le k$, let $C_i = f^{-1}(\{\alpha_i\})$. $\{C_i : 1 \le i \le k\}$ is a partition of $\Omega$ by nonempty sets. We will say that the canonical way of expressing $f$ as a nonnegative linear combination of indicator functions is
\[ f = \sum_{i=1}^{k} \alpha_i 1_{C_i} \]
We will compare any other way of expressing $f$ as a nonnegative linear combination of indicator functions with the canonical way. To begin with, let us handle an easy case first: suppose we have
\[ f = \sum_{j=1}^{l} d_j 1_{D_j} \]
where $\{D_j : 1 \le j \le l\}$ is a partition of $\Omega$ and each $D_j$ is nonempty. $f$ takes the value $d_j$ on $D_j$ (since if $j' \ne j$, $D_j$ and $D_{j'}$ are disjoint), and since $D_j$ is nonempty, $d_j$ is $\alpha_i$ for some $i$. $f$ takes the value $d_j = \alpha_i$ on $D_j$, so $D_j \subseteq C_i$. For all $j$, there is a unique $i$ such that $D_j \subseteq C_i$ and $d_j = \alpha_i$. For all $i$,
\[ C_i = \bigcup_{j \,:\, D_j \subseteq C_i} D_j, \quad \mu(C_i) = \sum_{j \,:\, D_j \subseteq C_i} \mu(D_j) \]
So we have
\[ \sum_{j=1}^{l} d_j \, \mu(D_j) = \sum_{i=1}^{k} \sum_{j \,:\, D_j \subseteq C_i} d_j \, \mu(D_j) = \sum_{i=1}^{k} \alpha_i \sum_{j \,:\, D_j \subseteq C_i} \mu(D_j) = \sum_{i=1}^{k} \alpha_i \, \mu(C_i) \]
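Now consider the general case, say $f = \sum_{i=1}^{n} a_i 1_{A_i}$ where each $a_i \ge 0$ and $A_i \in \mathcal{A}$. These $A_i$'s may intersect each other in various ways. So consider the $2^n$ sets obtained as $B_1 \cap B_2 \cap \ldots \cap B_n$ where each $B_i$ is $A_i$ or $\Omega \setminus A_i$. Some of these might be empty. Enumerate the nonempty ones amongst these as $D_1, \ldots, D_l$. Here are some properties of $\{D_j : 1 \le j \le l\}$:
1. $\{D_j : 1 \le j \le l\}$ is a pairwise disjoint collection: For $j \ne j'$, if $D_j = \bigcap_{i=1}^{n} B_i$ and $D_{j'} = \bigcap_{i=1}^{n} B'_i$, then there must be some $i$ such that $B_i \ne B'_i$. $B'_i = \Omega \setminus B_i$, hence $D_j \cap D_{j'} = \emptyset$.
2. $\bigcup_{j=1}^{l} D_j = \Omega$: For $x \in \Omega$, pick $B_i = A_i$ or $B_i = \Omega \setminus A_i$ so that $x \in B_i$. Then $x \in \bigcap_{i=1}^{n} B_i$. So $\{D_j : 1 \le j \le l\}$ is a partition of $\Omega$.
3. For all $i$ and $j$, $D_j \subseteq A_i$ or $D_j \cap A_i = \emptyset$: $D_j$ is the intersection of various sets which includes $A_i$ or $\Omega \setminus A_i$, so $D_j \subseteq A_i$ or $D_j \subseteq \Omega \setminus A_i$.
4. For all $j$, put
\[ d_j = \sum_{i \,:\, D_j \subseteq A_i} a_i \]
Then $f$ takes the value $d_j$ on $D_j$: if $x \in D_j$, then
\[ f(x) = \sum_{i=1}^{n} a_i 1_{A_i}(x) = \sum_{i \,:\, x \in A_i} a_i = \sum_{i \,:\, D_j \subseteq A_i} a_i \]
(since $x \in A_i \implies D_j \cap A_i \ne \emptyset \implies D_j \subseteq A_i \implies x \in A_i$)
5. For all $i$,
\[ A_i = \bigcup_{j \,:\, D_j \subseteq A_i} D_j \]
The RHS is clearly a subset of the LHS. For the other direction, if $x \in A_i$, then $x \in D_j$ for a unique $j$. This $D_j$ intersects $A_i$, and so is a subset of $A_i$. $x$ belongs to the RHS.
Aside: note the interesting duality between 4 and 5 above:
\[ d_j = \sum_{i \,:\, D_j \subseteq A_i} a_i, \quad A_i = \bigcup_{j \,:\, D_j \subseteq A_i} D_j \]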
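Now we will show that $\sum_{i=1}^{n} a_i \, \mu(A_i) = \sum_{i=1}^{k} \alpha_i \, \mu(C_i)$:
\[ \sum_{i=1}^{n} a_i \, \mu(A_i) = \sum_{i=1}^{n} a_i \sum_{j \,:\, D_j \subseteq A_i} \mu(D_j) \quad \text{(additivity of measure and 5 from above)} \]
\[ = \sum_{j=1}^{l} \sum_{i \,:\, D_j \subseteq A_i} a_i \, \mu(D_j) \]
\[ = \sum_{j=1}^{l} \mu(D_j) \sum_{i \,:\, D_j \subseteq A_i} a_i \]
\[ = \sum_{j=1}^{l} d_j \, \mu(D_j) \quad \text{(4 from above)} \]
\[ = \sum_{j=1}^{k} \alpha_j \, \mu(C_j) \quad \text{($\{D_j : 1 \le j \le l\}$ partitions $\Omega$; this case already considered)} \]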
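The argument above can be checked on a small finite example. The following is a minimal sketch, assuming $\Omega$ is a finite set with counting measure; the function names and the sample sets are illustrative. It builds the nonempty sets $D_j$ from overlapping $A_i$, computes $d_j$ as in property 4, and compares $\sum_i a_i \mu(A_i)$ with $\sum_j d_j \mu(D_j)$.

```python
from itertools import product

def atoms(omega, sets):
    # Nonempty intersections B_1 & ... & B_n, where each B_i is either
    # A_i or its complement in omega. These are the sets D_1, ..., D_l.
    result = []
    for choice in product([True, False], repeat=len(sets)):
        d = set(omega)
        for a, keep in zip(sets, choice):
            d &= a if keep else (omega - a)
        if d:
            result.append(frozenset(d))
    return result

def coefficient_on_atom(atom, sets, coeffs):
    # d_j = sum of a_i over those i with D_j contained in A_i.
    return sum(a for a, s in zip(coeffs, sets) if atom <= s)

if __name__ == "__main__":
    omega = set(range(6))
    sets = [{0, 1, 2}, {2, 3}, {1, 2, 3, 4}]   # overlapping A_1, A_2, A_3
    coeffs = [2, 3, 5]                         # a_1, a_2, a_3
    ds = atoms(omega, sets)
    direct = sum(a * len(s) for a, s in zip(coeffs, sets))
    via_atoms = sum(coefficient_on_atom(d, sets, coeffs) * len(d) for d in ds)
    print(direct, via_atoms)
```

Both sums agree, as the chain of equalities predicts; the atom that lies outside every $A_i$ gets coefficient 0.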
Examples
Finite sums
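Let
$\Omega = \{1, \ldots, n\}$.
$\mathcal{A} = 2^{\Omega}$.
$\mu(\{k\}) = 1$, for $1 \le k \le n$. $\mu(A) = |A|$.
It is easy to see that $\mu$ is a measure on $(\Omega, \mathcal{A})$. Every $f : \Omega \to \mathbb{R}$ is simple, and $\int f^+$ and $\int f^-$ are finite, so every $f : \Omega \to \mathbb{R}$ is integrable. $\int f = \sum_{k=1}^{n} f(k)$.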
Infinite sums
Let
$\Omega = \mathbb{N}$.
$\mathcal{A} = 2^{\Omega}$.
$\mu(\{k\}) = 1$, for $k \in \mathbb{N}$. $\mu(A) = |A|$.
$\mu$ is a measure on $(\Omega, \mathcal{A})$. Every $f : \Omega \to \mathbb{R}$ is measurable. We can think of $f$ as a sequence $\{f(n)\}_{n=1}^{\infty}$.
Suppose $f \ge 0$ with $f(n) = 0$ for all $n > N$, for some $N$. Then $f$ is a nonnegative simple function (though not all nonnegative simple functions look like this!), and
\[ \int f = \sum_{i=1}^{N} f(i) \]
Now suppose $f \ge 0$, without assuming other conditions. For all $k \in \mathbb{N}$, define $g_k : \mathbb{N} \to \mathbb{R}$ by
\[ g_k(n) = \begin{cases} f(n), & \text{if } n \le k \\ 0, & \text{otherwise} \end{cases} \]
$g_k$ is nonnegative, and $\{g_k\}_{k=1}^{\infty}$ increases to $f$. So by the monotone convergence theorem:
\[ \lim_{k \to \infty} \int g_k = \int f \]
But $\int g_k = \sum_{i=1}^{k} f(i)$. So $\int f = \sum_{i=1}^{\infty} f(i)$.
The monotone convergence theorem made our job easier - we did not have to take all nonnegative simple functions below $f$, calculate their integrals, and take the supremum to get $\int f$.
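The truncations $g_k$ can be computed exactly for a concrete sequence. The following is a minimal sketch, assuming the hypothetical example $f(n) = 2^{-n}$ (not from the notes), for which $\int g_k = 1 - 2^{-k}$ increases to $\int f = 1$. Exact rational arithmetic is used so that the values can be compared without rounding.

```python
from fractions import Fraction

def f(n):
    # Hypothetical nonnegative sequence f(n) = 2^(-n), n = 1, 2, 3, ...
    return Fraction(1, 2**n)

def integral_g(k):
    # Integral of g_k against counting measure: g_k agrees with f on
    # {1, ..., k} and is 0 elsewhere, so the integral is a finite sum.
    return sum(f(i) for i in range(1, k + 1))

if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, integral_g(k), float(integral_g(k)))
```

The printed integrals increase with $k$ and approach 1, the sum of the series.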
Now suppose $f : \Omega \to \mathbb{R}$ is any function. $f$ is measurable. $f$ is integrable if and only if $|f|$ is integrable if and only if $\int |f| = \sum_{n=1}^{\infty} |f(n)| < \infty$. Suppose $f$ is integrable. Define $g_k$ as above. By the dominated convergence theorem (taking $|f|$ as the dominating function), we have $\lim_{k \to \infty} \int g_k = \int f$, and so
\[ \int f = \sum_{n=1}^{\infty} f(n) \]
$f$ is integrable if and only if the series corresponding to the sequence $\{f(n)\}_{n=1}^{\infty}$ is absolutely convergent. Consider $f$ given by $f(n) = (-1)^{n-1} \frac{1}{n}$. We say that $\sum_{n=1}^{\infty} f(n)$ exists. But $f$ is not integrable, since $\sum_{n=1}^{\infty} \frac{1}{n} = \infty$. This is not a problem, since in $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n}$, the specific order in which the summation was performed is important, while in general we do not have any notion of order on $\Omega$.
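The dependence on the order of summation can be seen numerically. The following is a minimal sketch; the particular rearrangement (one positive term followed by two negative terms) is a standard example and not taken from the notes. In the natural order the partial sums approach $\ln 2$, while the rearranged series approaches $\frac{1}{2} \ln 2$, because each block $\frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k}$ equals $\frac{1}{2}\left(\frac{1}{2k-1} - \frac{1}{2k}\right)$.

```python
import math

def natural_order_sum(n_terms):
    # Partial sum 1 - 1/2 + 1/3 - 1/4 + ... in the natural order.
    return sum((-1) ** (n - 1) / n for n in range(1, n_terms + 1))

def rearranged_sum(n_blocks):
    # Same terms, different order: one positive term, then two negative ones:
    # 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + 1/5 - 1/10 - 1/12 + ...
    total = 0.0
    for k in range(1, n_blocks + 1):
        total += 1 / (2 * k - 1) - 1 / (4 * k - 2) - 1 / (4 * k)
    return total

if __name__ == "__main__":
    print(natural_order_sum(10_000), math.log(2))
    print(rearranged_sum(10_000), math.log(2) / 2)
```

Every term of the original series appears exactly once in the rearrangement, yet the limits differ; this cannot happen for an integrable $f$.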
Infinite sums with weights

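Fix a sequence of nonnegative real numbers $\{w_n\}_{n=1}^{\infty}$.
$\Omega = \mathbb{N}$.
$\mathcal{A} = 2^{\Omega}$.
$\mu(\{n\}) = w_n$, for $n \in \mathbb{N}$. $\mu(A) = \sum_{n \in A} w_n$.
$\mu$ is a measure on $(\Omega, \mathcal{A})$. This is similar to the previous case. $f : \Omega \to \mathbb{R}$ is integrable if and only if $\sum_{n=1}^{\infty} w_n |f(n)| < \infty$, and in this case $\int f = \sum_{n=1}^{\infty} w_n f(n)$.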
Riemann integration
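Consider $(\mathbb{R}, \mathcal{B}, \lambda)$, where $\mathcal{B}$ is the Borel $\sigma$-field on $\mathbb{R}$ and $\lambda$ is the Lebesgue measure. Suppose
$f : \mathbb{R} \to \mathbb{R}$ is continuous with compact support. Then $f$ is integrable, has a Riemann integral, and
$$R\!\int_{-\infty}^{\infty} f(x)\, dx = \int f \, d\lambda$$
where $R\!\int$ denotes Riemann integration. To see this, let $[a, b]$ be an interval (with $a, b \in \mathbb{R}$) such that $f(x) = 0$
for $x \notin [a, b]$ (this implies that $f(a) = f(b) = 0$). $f$ is bounded; let $M > 0$ be such that $|f(x)| \le M$
for all $x \in \mathbb{R}$. For each $n \in \mathbb{N}$, define
$$f_n(x) = \begin{cases} 0, & \text{if } x \notin (a, b] \\ f\left(a + k\frac{b-a}{n}\right), & \text{if } x \in \left(a + k\frac{b-a}{n},\ a + (k+1)\frac{b-a}{n}\right],\ 0 \le k \le n-1 \end{cases}$$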
We show that $\{f_n\}_{n=1}^{\infty}$ converges to $f$ uniformly. $f$ is uniformly continuous. Given $\epsilon > 0$, let $\delta > 0$ be such
that if $|x - y| < \delta$, then $|f(x) - f(y)| < \epsilon$. Let $N$ be such that $(b - a)/N < \delta$. Let $n \ge N$. For
$x \notin (a, b]$, $f_n(x) = f(x) = 0$. For any $x \in (a, b]$, let $k$ be such that $x \in \left(a + k\frac{b-a}{n},\ a + (k+1)\frac{b-a}{n}\right]$.
Let $y = a + k\frac{b-a}{n}$. Then
$$|x - y| \le \frac{b-a}{n} \le \frac{b-a}{N} < \delta$$
So
$$|f_n(x) - f(x)| = |f(y) - f(x)| < \epsilon$$
So $\{f_n\}_{n=1}^{\infty}$ converges to $f$ uniformly. Pointwise convergence suffices for us. For all $n$, $|f_n| \le M 1_{[a,b]}$.
$M 1_{[a,b]}$ is integrable, and so by the dominated convergence theorem, $f$ is integrable and
$$\lim_{n \to \infty} \int f_n \, d\lambda = \int f \, d\lambda$$
Note that
$$\int f_n \, d\lambda = \sum_{k=0}^{n-1} f\left(a + k\frac{b-a}{n}\right) \frac{b-a}{n}$$
is a particular Riemann sum considered in the definition of $R\!\int_a^b f(x)\, dx$. We know a priori that $f$,
being a continuous function on a bounded interval (the part of $f$ outside $[a, b]$ is not relevant), is
Riemann integrable. Since this particular sequence of Riemann sums converges, it must converge
to the Riemann integral. So
$$R\!\int_{-\infty}^{\infty} f(x)\, dx = R\!\int_a^b f(x)\, dx = \int f \, d\lambda$$
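The Riemann sums $\int f_n \, d\lambda$ used in this argument are easy to compute. The sketch below is an illustration only: the tent function and the interval $[-2, 3]$ are hypothetical choices, not taken from the notes. It evaluates the integral of the step function $f_n$ for a continuous function with compact support and shows the values approaching the integral of $f$, which is 1 for this tent.

```python
def tent(x):
    """Hypothetical test function: continuous, zero outside [-1, 1], integral 1."""
    return max(0.0, 1.0 - abs(x))

def step_integral(f, a, b, n):
    """Integral of the step function f_n:
    the sum over k = 0..n-1 of f(a + k(b-a)/n) * (b-a)/n."""
    h = (b - a) / n
    return sum(f(a + k * h) for k in range(n)) * h

# [a, b] = [-2, 3] contains the support of the tent, so f vanishes outside (a, b].
for n in (7, 70, 700, 7000):
    print(n, step_integral(tent, -2.0, 3.0, n))
```

Each printed value is the Lebesgue integral of a simple function $f_n$; the dominated convergence theorem is what justifies passing to the limit.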
Here are some problematic examples. Consider $f : \mathbb{R} \to \mathbb{R}$ such that
On $(-\infty, 0]$, $f$ is 0.
For each integer $n$, $f(n) = 0$.
For $n \in \mathbb{N}$ with $n$ odd, $f|_{[n-1,n]}$ is piecewise linear and nonnegative, and looks like a tent
with height $2/n$.
For $n \in \mathbb{N}$ with $n$ even, $f|_{[n-1,n]}$ is piecewise linear and nonpositive, and looks like an
inverted tent with height $2/n$.
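The graph of $f|_{[0,6]}$ is given above. For $n \in \mathbb{N}$, we have
$$\int f 1_{[0,n]} \, d\lambda = R\!\int_0^n f(x)\, dx = \sum_{i=1}^{n} (-1)^{i-1} \frac{1}{i}$$
since the tent over $[i-1, i]$ has base 1 and height $2/i$, and hence area $1/i$. The limit of this quantity
as $n \to \infty$ exists, and is called the improper Riemann integral
$R\!\int_0^{\infty} f(x)\, dx$. However, $\int f^+ \, d\lambda$ and $\int f^- \, d\lambda$ are both $\infty$:
$$\begin{aligned}
\int f^+ \, d\lambda &\ge \int f^+ 1_{[0,2n-1]} \, d\lambda && \text{(monotonicity of integrals)} \\
&= \sum_{i=1}^{n} \frac{1}{2i-1} && \text{(by Riemann integration)} \\
&\ge \sum_{i=1}^{n} \frac{1}{2i} \\
&= \frac{1}{2} \sum_{i=1}^{n} \frac{1}{i}
\end{aligned}$$
As this holds for all natural numbers $n$, $\int f^+ \, d\lambda \ge \frac{1}{2} \sum_{i=1}^{\infty} \frac{1}{i} = \infty$. A similar argument shows
that $\int f^- \, d\lambda = \infty$. So $f$ is not integrable, and $\int f \, d\lambda$ does not exist. As before, this is not a
problem: in the improper Riemann integral we used the order structure on $\mathbb{R}$ crucially, while
in our general theory we have no such order on $\Omega$.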
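As another example, consider $1_{[0,1] \setminus \mathbb{Q}}$. This is a nonnegative simple function. Since any countable
set has Lebesgue measure 0,
$$\int 1_{[0,1] \setminus \mathbb{Q}} \, d\lambda = \lambda([0, 1] \setminus \mathbb{Q}) = \lambda([0, 1]) - \lambda([0, 1] \cap \mathbb{Q}) = 1$$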
However, $1_{[0,1] \setminus \mathbb{Q}}$ is not Riemann integrable: every interval of nonzero length has both rationals
and irrationals, so for any partition of $[0, 1]$, the lower Riemann sum is always 0 and the upper
Riemann sum is always 1.
Properties of integration
We return to generalities. Our context is again a nonempty set $\Omega$, a $\sigma$-field $\mathcal{A}$ on it, and a measure
$\mu$ on $(\Omega, \mathcal{A})$. All functions considered will be measurable.
If $f \ge 0$, then $\int f \ge 0$. If $f, g \in L^1$ and $f \le g$, then $\int f \le \int g$: this is because $g - f$ is in $L^1$ and
nonnegative, and so
$$\int g = \int f + \int (g - f) \ge \int f$$
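Lemma. If $f \ge 0$ and $\int f = 0$, then
$$\mu(\{x : f(x) \ne 0\}) = 0$$
Proof. For all $n \in \mathbb{N}$, let
$$A_n = \left\{x : f(x) \ge \tfrac{1}{n}\right\}$$
$\frac{1}{n} 1_{A_n} \le f$, and so $\frac{1}{n} \mu(A_n) = \int \frac{1}{n} 1_{A_n} \le \int f = 0$. So $\mu(A_n) = 0$.
Note that $\{x : f(x) \ne 0\} = \bigcup_{n \in \mathbb{N}} A_n$. Hence
$$\mu(\{x : f(x) \ne 0\}) = \mu\left(\bigcup_{n \in \mathbb{N}} A_n\right) \le \sum_{n \in \mathbb{N}} \mu(A_n) = 0$$
We will see in the next class that if $\mu(\{x : f(x) \ne 0\}) = 0$, then $f \in L^1$ and $\int f = 0$.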
Properties of integration (contd)

Theorem. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space, and $f : \Omega \to \mathbb{R}$ measurable. Suppose $\mu(\{\omega : f(\omega) \ne 0\}) = 0$.
Then $f \in L^1$ and $\int f = 0$.
Proof. Suppose $s$ is simple with $0 \le s \le |f|$. We will show $\int s = 0$. Let the canonical expression
of $s$ as a nonnegative linear combination of indicators be
$$s = \sum_{i=1}^{n} a_i 1_{A_i}$$
If $a_i \ne 0$, then $a_i > 0$. $|f|$ takes value at least $a_i$ on $A_i$, so
$$A_i \subseteq \{\omega : |f(\omega)| \ne 0\} = \{\omega : f(\omega) \ne 0\}$$
and $\mu(A_i) = 0$. For all $i$, $a_i \mu(A_i) = 0$. This shows that $\int s = \sum_{i=1}^{n} a_i \mu(A_i) = 0$. Since this holds for
all simple functions $s$ with $0 \le s \le |f|$, we have $\int |f| = 0$. $|f| \in L^1$, hence $f \in L^1$.
$\left|\int f\right| \le \int |f|$, so $\int f = 0$.
Suppose $f \in L^1$ and $g$ is measurable such that $\mu(\{\omega : f(\omega) \ne g(\omega)\}) = 0$. Then $g \in L^1$ and
$\int g = \int f$: this holds because by the above theorem, $g - f \in L^1$ and $\int (g - f) = 0$. Instead of
$\mu(\{\omega : f(\omega) \ne g(\omega)\}) = 0$, we may say $f = g$ almost everywhere. When we say a property holds
almost everywhere, we mean that the set of points where it does not hold has measure 0.
Areas and product measure spaces

Is area a new concept, or is it something which comes from the concept of length? The area
of a rectangle is simply the product of the lengths of its sides, and we will see that area can indeed
be defined in terms of length. When one studies groups or vector spaces or other structures, one
studies product groups, product vector spaces, etc. We will do the same here, for measure spaces.
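Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are two finite measure spaces. Define
$\Omega = \Omega_1 \times \Omega_2$.
$\mathcal{R} = \{A_1 \times A_2 : A_1 \in \mathcal{A}_1, A_2 \in \mathcal{A}_2\}$.
$\mathcal{A} = \sigma(\mathcal{R})$.
What is left is defining the product measure:
Theorem. There exists a unique measure $\mu$ on $\mathcal{A}$ such that for $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$,
$$\mu(A_1 \times A_2) = \mu_1(A_1)\mu_2(A_2)$$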
Proof.
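Fact 1: $\mathcal{R}$ is a semi-field. $\Omega = \Omega_1 \times \Omega_2 \in \mathcal{R}$. For $A_1 \times A_2, B_1 \times B_2 \in \mathcal{R}$,
$$(A_1 \times A_2) \cap (B_1 \times B_2) = (A_1 \cap B_1) \times (A_2 \cap B_2) \in \mathcal{R}$$
and
$$\Omega \setminus (A_1 \times A_2) = \big((\Omega_1 \setminus A_1) \times \Omega_2\big) \cup \big(A_1 \times (\Omega_2 \setminus A_2)\big)$$
is a finite disjoint union of elements of $\mathcal{R}$.
At this stage we could proceed using the Caratheodory extension theorem to obtain a measure on
$(\Omega, \mathcal{A})$ with the required property. However, we define the measure more explicitly.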
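Fact 2: Let $A \in \mathcal{A}$.
For all $\omega_1 \in \Omega_1$, define
$$A^{\omega_1} = \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in A\}$$
This is called the vertical section of $A$ at $\omega_1$.
For all $\omega_2 \in \Omega_2$, define
$$A_{\omega_2} = \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in A\}$$
This is called the horizontal section of $A$ at $\omega_2$.
We can picture these sections as follows:
[Figure: a set $A$ in the plane with axes $\Omega_1$ (horizontal) and $\Omega_2$ (vertical), showing a vertical section $A^{\omega_1}$ and a horizontal section $A_{\omega_2}$.]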
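For any $A \in \mathcal{A}$ and $\omega_1 \in \Omega_1$ and $\omega_2 \in \Omega_2$, we have $A^{\omega_1} \in \mathcal{A}_2$ and $A_{\omega_2} \in \mathcal{A}_1$. To show this, let
$$\mathcal{B} = \{A \in \mathcal{A} : \forall \omega_1 \in \Omega_1\ (A^{\omega_1} \in \mathcal{A}_2)\}$$
$\mathcal{R} \subseteq \mathcal{B}$, since $(A_1 \times A_2)^{\omega_1}$ is $A_2$ if $\omega_1 \in A_1$ and $\emptyset$ otherwise. Finite disjoint unions of sets in $\mathcal{R}$ also
belong to $\mathcal{B}$, since $(A \cup B)^{\omega_1} = A^{\omega_1} \cup B^{\omega_1}$.
$\mathcal{B}$ is a monotone class, because for any countable $\mathcal{C} \subseteq \mathcal{B}$,
$$\left(\bigcup_{C \in \mathcal{C}} C\right)^{\omega_1} = \bigcup_{C \in \mathcal{C}} C^{\omega_1} \in \mathcal{A}_2$$
$$\left(\bigcap_{C \in \mathcal{C}} C\right)^{\omega_1} = \bigcap_{C \in \mathcal{C}} C^{\omega_1} \in \mathcal{A}_2$$
The $\sigma$-field generated by $\mathcal{R}$ is the same as the $\sigma$-field generated by the finite disjoint unions of sets
in $\mathcal{R}$. By the monotone class theorem, $\mathcal{B} = \sigma(\mathcal{R})$.
We have shown that for all $A \in \mathcal{A}$ and $\omega_1 \in \Omega_1$, we have $A^{\omega_1} \in \mathcal{A}_2$. A symmetric argument shows
that for all $A \in \mathcal{A}$ and $\omega_2 \in \Omega_2$, we have $A_{\omega_2} \in \mathcal{A}_1$.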
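Fact 3: Let $A \in \mathcal{A}$. Let
$f_A : \Omega_1 \to \mathbb{R}$ be defined as $f_A(\omega_1) = \mu_2(A^{\omega_1})$.
$g_A : \Omega_2 \to \mathbb{R}$ be defined as $g_A(\omega_2) = \mu_1(A_{\omega_2})$.
$f_A$ and $g_A$ are well-defined because $A^{\omega_1}$ and $A_{\omega_2}$ belong to the respective $\sigma$-fields. We claim that
$f_A$ and $g_A$ are measurable. Let
$$\mathcal{B} = \{A \in \mathcal{A} : f_A \text{ is measurable}\}$$
$\mathcal{R} \subseteq \mathcal{B}$, because $f_{A_1 \times A_2} = \mu_2(A_2) 1_{A_1}$. Finite disjoint unions of sets in $\mathcal{R}$ also belong to $\mathcal{B}$, because
if $A, B \in \mathcal{R}$ are disjoint, then for all $\omega_1 \in \Omega_1$, $A^{\omega_1}$ and $B^{\omega_1}$ are disjoint and their union is $(A \cup B)^{\omega_1}$.
So $f_{A \cup B} = f_A + f_B$, being the sum of measurable functions, is measurable.
We now show that $\mathcal{B}$ is a monotone class. Suppose $\{C_n\}_{n=1}^{\infty}$ is a sequence in $\mathcal{B}$, increasing to $C$.
We know $C \in \mathcal{A}$. For any $\omega_1 \in \Omega_1$, $\{C_n^{\omega_1}\}_{n=1}^{\infty}$ increases to $C^{\omega_1}$. So $\{\mu_2(C_n^{\omega_1})\}_{n=1}^{\infty}$ increases to
$\mu_2(C^{\omega_1})$. $\{f_{C_n}\}_{n=1}^{\infty}$ increases to $f_C$ pointwise, so $f_C$ is measurable. The argument for decreasing
intersections is similar (note that all measures are finite here). As before, $\mathcal{B}$ is a monotone class
having all finite disjoint unions of sets in $\mathcal{R}$, and so equals $\mathcal{A} = \sigma(\mathcal{R})$.
A symmetric argument shows that for all $A \in \mathcal{A}$, $g_A$ is measurable. Note that we do not have to
actually do much; we are checking the properties we want only on rectangles, which are easy to
deal with!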
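Fact 4: Define $\mu : \mathcal{A} \to [0, \infty)$,
$$\mu(A) = \int f_A \, d\mu_1$$
The integral can never be $\infty$, since it is at most $\mu_1(\Omega_1)\mu_2(\Omega_2)$, and is nonnegative because $f_A$ is
nonnegative. We show that $\mu$ is a measure. $\mu(\emptyset) = 0$, since $\emptyset^{\omega_1} = \emptyset$ and $f_{\emptyset}$ is the zero function.
We now show countable additivity. Suppose $\{C_n\}_{n=1}^{\infty}$ is a collection of disjoint sets from $\mathcal{A}$. By
countable additivity of $\mu_2$, we have
$$f_{\left(\bigcup_{n \in \mathbb{N}} C_n\right)} = \sum_{n \in \mathbb{N}} f_{C_n}$$
Then
$$\begin{aligned}
\sum_{n=1}^{\infty} \mu(C_n) &= \lim_{k \to \infty} \sum_{n=1}^{k} \mu(C_n) \\
&= \lim_{k \to \infty} \sum_{n=1}^{k} \int f_{C_n} \, d\mu_1 && \text{(by definition of } \mu\text{)} \\
&= \lim_{k \to \infty} \int \sum_{n=1}^{k} f_{C_n} \, d\mu_1 && \text{(additivity of integrals)} \\
&= \int \lim_{k \to \infty} \sum_{n=1}^{k} f_{C_n} \, d\mu_1 && \text{(monotone convergence theorem; existence of } \lim \textstyle\sum f_{C_n}\text{)} \\
&= \int \sum_{n=1}^{\infty} f_{C_n} \, d\mu_1 \\
&= \int f_{\left(\bigcup_{n \in \mathbb{N}} C_n\right)} \, d\mu_1 && \text{(from above)} \\
&= \mu\left(\bigcup_{n \in \mathbb{N}} C_n\right) && \text{(by definition of } \mu\text{)}
\end{aligned}$$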
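So $\mu$ is a measure on $\mathcal{A}$. For any $A_1 \times A_2 \in \mathcal{R}$,
$$\mu(A_1 \times A_2) = \int f_{A_1 \times A_2} \, d\mu_1 = \int \mu_2(A_2) 1_{A_1} \, d\mu_1 = \mu_1(A_1)\mu_2(A_2)$$
We could have started with $g$, and defined $\mu'(A) = \int g_A \, d\mu_2$. As we may expect, $\mu = \mu'$.
In fact, let $\nu : \mathcal{A} \to [0, \infty]_{\bar{\mathbb{R}}}$ be any measure such that $\mu$ and $\nu$ agree on $\mathcal{R}$. Since $\mu$ and $\nu$ agree
on $\Omega$, $\nu$ is also finite. They agree on the finite disjoint unions of sets in $\mathcal{R}$, and by an application
of the monotone class theorem we know that $\mu = \nu$.
Definition. The measure space so obtained, $(\Omega, \mathcal{A}, \mu)$, is called the product of the measure spaces
$(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$, and is denoted by $(\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$.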
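For finite sets with point masses, the construction of the product measure through sections can be carried out directly. The sketch below is an illustration under assumed data: the spaces `mu1`, `mu2` and the set `A` are hypothetical. It computes the measure of a subset of the product by integrating the measure of its vertical sections, then by integrating the measure of its horizontal sections, and checks that the two agree and give the product formula on a rectangle.

```python
from itertools import product

# Hypothetical finite measure spaces: point masses on small finite sets.
mu1 = {"a": 0.5, "b": 1.5}
mu2 = {0: 2.0, 1: 1.0, 2: 0.25}

def measure(mu, subset):
    """Measure of a subset under point masses mu."""
    return sum(mu[x] for x in subset)

def vertical_section(A, w1):
    """All w2 with (w1, w2) in A."""
    return {w2 for (u, w2) in A if u == w1}

def horizontal_section(A, w2):
    """All w1 with (w1, w2) in A."""
    return {w1 for (w1, v) in A if v == w2}

def product_measure_vertical(A):
    """Integral over the first space of w1 -> mu2(vertical section of A at w1)."""
    return sum(mu1[w1] * measure(mu2, vertical_section(A, w1)) for w1 in mu1)

def product_measure_horizontal(A):
    """Integral over the second space of w2 -> mu1(horizontal section of A at w2)."""
    return sum(mu2[w2] * measure(mu1, horizontal_section(A, w2)) for w2 in mu2)

A = {("a", 0), ("a", 2), ("b", 1)}   # not a rectangle
rect = set(product({"a"}, {0, 1}))   # the rectangle {a} x {0, 1}

print(product_measure_vertical(A), product_measure_horizontal(A))
print(product_measure_vertical(rect), measure(mu1, {"a"}) * measure(mu2, {0, 1}))
```

In this finite setting every subset is measurable, so none of the monotone class machinery is visible; the code only illustrates the definitions of $f_A$, $g_A$ and the two candidate measures.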
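Fubini's Theorem
In calculus we often considered $\int\!\int f(x, y)\, dx\, dy$ and $\int\!\int f(x, y)\, dy\, dx$, and the equality of the two
(when true) was useful in calculations. To avoid introducing too many symbols, notation borrowed
from $\lambda$-calculus will be used: $\lambda x.\text{expression}$ is a function which, when evaluated on $x$, is expression.
Theorem (Fubini). Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are finite measure spaces. Let
$$(\Omega, \mathcal{A}, \mu) = (\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$$
be their product. Let $f : \Omega \to \mathbb{R}$ be nonnegative measurable. Then
For all $\omega_1 \in \Omega_1$, $\lambda\omega_2.f(\omega_1, \omega_2)$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
For all $\omega_2 \in \Omega_2$, $\lambda\omega_1.f(\omega_1, \omega_2)$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_1.\int \big(\lambda\omega_2.f(\omega_1, \omega_2)\big)\, d\mu_2$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_2.\int \big(\lambda\omega_1.f(\omega_1, \omega_2)\big)\, d\mu_1$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
The above facts are needed to ensure that the main assertion is meaningful:
$$\int f \, d\mu = \int \left(\lambda\omega_1.\int \big(\lambda\omega_2.f(\omega_1, \omega_2)\big)\, d\mu_2\right) d\mu_1 = \int \left(\lambda\omega_2.\int \big(\lambda\omega_1.f(\omega_1, \omega_2)\big)\, d\mu_1\right) d\mu_2$$
This says that for nonnegative functions, we may interchange the order of integration, and in
either case the result is the same as integration on the product measure space. However, there
is one issue with this statement of Fubini's theorem: some of the functions described in it may
take the value $\infty$, while we have been considering only real-valued functions up to now. We must
account for functions taking values $\infty$ and $-\infty$ too. We will do this in the next class.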
Product measures (contd)

Consider $([0, 1], \mathcal{B}, \lambda) \times ([0, 1], \mathcal{B}, \lambda)$ (which is just $[0, 1] \times [0, 1]$ with the usual measure on it),
and consider $f : [0, 1] \times [0, 1] \to \mathbb{R}$,
$$f(x, y) = \begin{cases} 0, & \text{if } (x, y) = (0, 0) \text{ or } x > 0 \\ \frac{1}{y}, & \text{if } x = 0 \text{ and } y > 0 \end{cases}$$
We have
$$\int f(x, y)\, d\lambda(y) = \begin{cases} \infty, & \text{if } x = 0 \\ 0, & \text{otherwise} \end{cases}$$
We would like to call the above (as a function of $x$) some $\varphi$, and say that $\int \varphi = 0$. However, up to
now we have only defined measurability, integration, and so on for real-valued functions. Some
modifications are needed to allow infinities. We will do this in the next class.
Extended real valued functions

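As seen before, to deal with repeated integration nicely, we need to allow functions which take the
values $\infty$ and $-\infty$, and their integrals.
$\bar{\mathbb{R}} = \mathbb{R} \cup \{\infty, -\infty\}$, and $-\infty < x < \infty$ for all $x \in \mathbb{R}$. We do not define $\infty + (-\infty)$, but all other
addition and multiplication operations are defined naturally (recall that we define $0 \cdot \infty = 0$).
Definition. Let $(\Omega, \mathcal{A})$ be a set with a $\sigma$-field. $f : \Omega \to \bar{\mathbb{R}}$ is said to be measurable if for all $a \in \mathbb{R}$,
$$\{\omega : f(\omega) \le a\} \in \mathcal{A}$$
or equivalently, for all $a \in \mathbb{R}$,
$$\{\omega : f(\omega) < a\} \in \mathcal{A}$$
Such an $f$ is called an extended real valued measurable function. Note that the above definition
refers to all $a \in \mathbb{R}$, not all $a \in \bar{\mathbb{R}}$.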

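1. $f$ extended real valued on $(\Omega, \mathcal{A})$ is measurable if and only if the following hold:
$f^{-1}(\{\infty\}) \in \mathcal{A}$.
$f^{-1}(\{-\infty\}) \in \mathcal{A}$.
For any interval $I \subseteq \mathbb{R}$, $f^{-1}(I) \in \mathcal{A}$.
To show this, let $f$ be measurable. Then
$$f^{-1}(\{-\infty\}) = \bigcap_{n \in \mathbb{Z}} \{\omega : f(\omega) \le n\} \in \mathcal{A}$$
$$f^{-1}(\{\infty\}) = \bigcap_{n \in \mathbb{Z}} \big(\Omega \setminus \{\omega : f(\omega) \le n\}\big) \in \mathcal{A}$$
$$f^{-1}((a, b]) = \{\omega : f(\omega) \le b\} \setminus \{\omega : f(\omega) \le a\} \in \mathcal{A}$$
Using the above, we can as usual obtain that $f^{-1}(I) \in \mathcal{A}$ for all intervals $I$.
Conversely, suppose the above three conditions hold:
$$\{\omega : f(\omega) \le a\} = f^{-1}(\{-\infty\}) \cup f^{-1}((-\infty, a]) \in \mathcal{A}$$
So $f$ is measurable. We need not have mentioned $f^{-1}(\{\infty\})$ explicitly; it follows from the
other two conditions.
Any extended real measurable function $f$ can be written as
$$f = \infty 1_A + (-\infty) 1_B + \tilde{f} 1_C$$
where each $A, B, C \in \mathcal{A}$ and $\{A, B, C\}$ is a partition of $\Omega$ and $\tilde{f}$ is a good old real measurable
function on $\Omega$. Conversely, any such $f$ is extended real measurable.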
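2. If $f$ and $g$ are extended real measurable, so are $f \vee g$ and $f \wedge g$:
$$\{\omega : (f \vee g)(\omega) \le a\} = \{\omega : f(\omega) \le a\} \cap \{\omega : g(\omega) \le a\} \in \mathcal{A}$$
$$\{\omega : (f \wedge g)(\omega) \le a\} = \{\omega : f(\omega) \le a\} \cup \{\omega : g(\omega) \le a\} \in \mathcal{A}$$
For a countable collection of extended real measurable functions $\mathcal{C}$, $\sup \mathcal{C}$ and $\inf \mathcal{C}$ are also
extended real measurable (where as usual the sup and inf are interpreted pointwise):
$$\{\omega : (\inf \mathcal{C})(\omega) \le a\} = \bigcap_{k \in \mathbb{N}} \bigcup_{f \in \mathcal{C}} \left\{\omega : f(\omega) \le a + \tfrac{1}{k}\right\} \in \mathcal{A}$$
$$\{\omega : (\sup \mathcal{C})(\omega) \le a\} = \bigcap_{f \in \mathcal{C}} \{\omega : f(\omega) \le a\} \in \mathcal{A}$$
Now we need not worry about $\inf \mathcal{C}$ and $\sup \mathcal{C}$ existing: since we allow the values $\infty$ and
$-\infty$, they always exist!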
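3. If $\{f_n\}_{n=1}^{\infty}$ is a sequence of extended real measurable functions, then $\liminf_{n \to \infty} f_n$ and
$\limsup_{n \to \infty} f_n$ are also extended real measurable (again, the lim sup and lim inf always exist,
no conditions!). This follows from the above, since
$$\limsup_{n \to \infty} f_n = \inf_{m \in \mathbb{N}}\ \sup_{n \in \mathbb{N},\, n \ge m} f_n$$
$$\liminf_{n \to \infty} f_n = \sup_{m \in \mathbb{N}}\ \inf_{n \in \mathbb{N},\, n \ge m} f_n$$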
4. If $f$ and $g$ are extended real measurable, then if $f + g$ is well-defined, it is extended real
measurable. By being well-defined we mean there is no $\omega$ such that $\{f(\omega), g(\omega)\} = \{\infty, -\infty\}$.
In the real measurable case we did not need this condition; $f + g$ was always well-defined
there. As in the case of real measurable functions, we have for all $a \in \mathbb{R}$,
$$\{\omega : (f + g)(\omega) < a\} = \bigcup_{r \in \mathbb{Q}} \big(\{\omega : f(\omega) < r\} \cap \{\omega : g(\omega) < a - r\}\big) \in \mathcal{A}$$
To check this, we can take cases:
If $f(\omega)$ and $g(\omega)$ are both finite, then we know that the sets on the LHS and RHS agree
on $\omega$.
If one of $f(\omega)$ and $g(\omega)$ is $\infty$, then the other cannot be $-\infty$, and $\omega$ belongs to neither
of the LHS or RHS.
If one of $f(\omega)$ and $g(\omega)$ is $-\infty$, then the other cannot be $\infty$, and $\omega$ belongs to both
LHS and RHS.
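5. If $f$ and $g$ are extended real measurable, so is $fg$. Let $A = (fg)^{-1}(\{\infty\})$, $B = (fg)^{-1}(\{-\infty\})$,
and $C = (fg)^{-1}(\mathbb{R})$.
$$\begin{aligned}
A = {} & \big(f^{-1}(\{\infty\}) \cap g^{-1}((0, \infty]_{\bar{\mathbb{R}}})\big) \cup \big(f^{-1}(\{-\infty\}) \cap g^{-1}([-\infty, 0)_{\bar{\mathbb{R}}})\big) \\
& \cup \big(g^{-1}(\{\infty\}) \cap f^{-1}((0, \infty]_{\bar{\mathbb{R}}})\big) \cup \big(g^{-1}(\{-\infty\}) \cap f^{-1}([-\infty, 0)_{\bar{\mathbb{R}}})\big) \in \mathcal{A}
\end{aligned}$$
This just says that the product of two numbers is $\infty$ if and only if one is $\infty$ and the other is
positive, or one is $-\infty$ and the other is negative. A similar expression tells us that $B \in \mathcal{A}$.
So $C = \Omega \setminus (A \cup B) \in \mathcal{A}$. Let
$$\tilde{f} = f 1_{\{\omega : f(\omega) \in \mathbb{R}\}}, \qquad \tilde{g} = g 1_{\{\omega : g(\omega) \in \mathbb{R}\}}$$
We know that $\tilde{f}$ and $\tilde{g}$ are real measurable. For $\omega \in C$, exactly one of the following holds:
$f(\omega) \in \mathbb{R}$ and $g(\omega) \in \mathbb{R}$. In this case, $(fg)(\omega) = f(\omega)g(\omega) = \tilde{f}(\omega)\tilde{g}(\omega)$.
One of $f(\omega)$ and $g(\omega)$ is not finite and the other is 0. In this case, $(fg)(\omega) = 0$.
$\tilde{f}(\omega) = \tilde{g}(\omega) = 0$, so again $(fg)(\omega) = \tilde{f}(\omega)\tilde{g}(\omega)$.
$\{A, B, C\}$ is a partition of $\Omega$ by sets in $\mathcal{A}$, and
$$fg = \infty 1_A + (-\infty) 1_B + (\tilde{f}\tilde{g}) 1_C$$
with $\tilde{f}\tilde{g}$ real measurable. So $fg$ is extended real measurable.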
6. We define simple functions as we did before. They do not take values $\infty$ or $-\infty$. If $f$ is a
nonnegative extended real measurable function, then there exists a sequence of nonnegative
simple functions $\{f_n\}_{n=1}^{\infty}$ which converges to it pointwise. The construction is similar to the
one for real measurable functions:
$$f_n = \left[\sum_{k=0}^{n2^n - 1} \frac{k}{2^n} 1_{f^{-1}\left(\left[\frac{k}{2^n}, \frac{k+1}{2^n}\right)\right)}\right] + n 1_{f^{-1}([n, \infty]_{\bar{\mathbb{R}}})}$$
As before, $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence and each $f_n \le f$. If $f(\omega) \in \mathbb{R}$, then for all
$n > f(\omega)$, $|f_n(\omega) - f(\omega)| < \frac{1}{2^n}$. If $f(\omega) = \infty$, $f_n(\omega) = n$. In either case,
$$\lim_{n \to \infty} f_n(\omega) = f(\omega)$$
7. Integration clauses 1, 2, 3 go through as before (clause 1 deals only with nonnegative simple
functions, so nothing changes there), including the monotone convergence theorem, Fatou's
lemma, and the dominated convergence theorem.
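The dyadic construction in item 6 can be evaluated pointwise. The following sketch is an illustration, with `simple_approx` a hypothetical helper name. Given the value $f(\omega) \in [0, \infty]$ it returns $f_n(\omega)$: the value $n$ when $f(\omega) \ge n$ (including $f(\omega) = \infty$), and otherwise $f(\omega)$ rounded down to a multiple of $2^{-n}$.

```python
import math

def simple_approx(value, n):
    """f_n at a point where f takes `value` in [0, inf].

    Returns n on the set where f >= n (this covers value == math.inf),
    and k / 2**n on the set where k / 2**n <= f < (k + 1) / 2**n.
    """
    if value >= n:
        return float(n)
    return math.floor(value * 2 ** n) / 2 ** n

for v in (0.3, math.pi, math.inf):
    # Each row is nondecreasing in n; it approaches v, or grows without bound if v is infinite.
    print(v, [simple_approx(v, n) for n in range(1, 8)])
```

For a finite value the error is below $2^{-n}$ as soon as $n$ exceeds the value, and at a point where $f$ is infinite the approximations are $1, 2, 3, \dots$, matching the two cases in the text.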
Measure theory class notes - 4 October 2010, class 16
Extended real valued functions and integration

Clause 1 in the definition of integration talked only about nonnegative simple functions. The
notion of a nonnegative simple function does not change when one moves to extended real valued
functions from real valued functions, because we required simple functions by definition to take
only real values.
In the context of extended real valued functions, in clause 2 we define the integral of a nonnegative
measurable function in the same way as we did for real valued functions. This extends the definition
in clause 1, is monotone, and respects linear operations (addition, and scalar multiplication by a
nonnegative number). The monotone convergence theorem continues to hold. We make some more
observations which will be useful later:
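Lemma. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space, and let $f, g : \Omega \to [0, \infty]_{\bar{\mathbb{R}}}$ be nonnegative measurable.
If $\mu(\{\omega : f(\omega) \ne 0\}) = 0$, then $\int f = 0$.
If $\int f < \infty$, then $\mu(\{\omega : f(\omega) = \infty\}) = 0$.[1]
If $\mu(\{\omega : f(\omega) \ne g(\omega)\}) = 0$, then $\int f = \int g$.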
Proof.
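Let $s$ be a nonnegative simple function with $0 \le s \le f$. Write $s$ as follows:
$$s = \sum_{i=1}^{n} a_i 1_{A_i}$$
where $\{A_i : 1 \le i \le n\}$ is a partition of $\Omega$. For all $i$, if $a_i \ne 0$, then $A_i \subseteq \{\omega : f(\omega) \ne 0\}$ and
so $\mu(A_i) = 0$. $a_i \mu(A_i) = 0$ for all $i$, so $\int s = 0$. Since this holds for all $s$, $\int f = 0$.
Let $A = \{\omega : f(\omega) = \infty\}$. For all $n \in \mathbb{N}$, $n 1_A \le f$. So
$$\int n 1_A = n\mu(A) \le \int f$$
$$\mu(A) \le \frac{1}{n} \int f$$
Since $\int f$ is finite and this holds for all $n$, $\mu(A) = 0$.
Let $N = \{\omega : f(\omega) \ne g(\omega)\}$. We are given that $\mu(N) = 0$. Applying the first bullet point to
$f 1_N$, we have $\int f 1_N = 0$.
$$\int f = \int (f 1_N + f 1_{\Omega \setminus N}) = \int f 1_N + \int f 1_{\Omega \setminus N} = \int f 1_{\Omega \setminus N}$$
Similarly, $\int g = \int g 1_{\Omega \setminus N}$. From the definition of $N$, $f 1_{\Omega \setminus N} = g 1_{\Omega \setminus N}$.
$$\int f = \int f 1_{\Omega \setminus N} = \int g 1_{\Omega \setminus N} = \int g$$
[1] The converse is not true!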
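We now come to clause 3. For $a, b \in [0, \infty]_{\bar{\mathbb{R}}}$, $a + b < \infty$ if and only if $a < \infty$ and $b < \infty$. Let
$f : \Omega \to \bar{\mathbb{R}}$ be measurable. Applying the above to $\int f^+$ and $\int f^-$, we get that $f$ is integrable if
and only if $|f|$ is. As with real-valued functions, for integrable $f$, we define $\int f = \int f^+ - \int f^-$.
For $f, g : \Omega \to \bar{\mathbb{R}}$, define $f + g : \Omega \to \bar{\mathbb{R}}$ by
$$(f + g)(\omega) = \begin{cases} 0, & \text{if } \{f(\omega), g(\omega)\} = \{\infty, -\infty\} \\ f(\omega) + g(\omega), & \text{otherwise} \end{cases}$$
Defining $(f + g)(\omega)$ to be 0 in the first case is an ad-hoc definition; we could have put anything
sensible (like, say, 25) there. We are not saying $\infty - \infty = 0$.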
If $f$ and $g$ are integrable, so is $f + g$, and $\int (f + g) = \int f + \int g$: let
$$N = \{\omega : f(\omega) \notin \mathbb{R} \text{ or } g(\omega) \notin \mathbb{R}\}$$
Since $f$ and $g$ are integrable, $N$ is a subset of $\{\omega : \infty \in \{f^+(\omega), f^-(\omega), g^+(\omega), g^-(\omega)\}\}$, which we
know from the lemma above has measure 0. So $\mu(N) = 0$. Let $M = \Omega \setminus N$.
$$0 \le |f + g| \le f^+ + f^- + g^+ + g^-$$
and so $|f + g|$ is integrable. $f + g$ is integrable. As before, we have for all $h \in \{f^+, f^-, g^+, g^-, (f + g)^+, (f + g)^-\}$,
$\int h = \int h 1_M$.
$$(f + g)^+ 1_M - (f + g)^- 1_M = f^+ 1_M - f^- 1_M + g^+ 1_M - g^- 1_M$$
Since these are all real-valued integrable functions,
$$\int (f + g)^+ 1_M - \int (f + g)^- 1_M = \int f^+ 1_M - \int f^- 1_M + \int g^+ 1_M - \int g^- 1_M$$
As seen above, we can drop the $1_M$ multipliers,
$$\int (f + g)^+ - \int (f + g)^- = \int f^+ - \int f^- + \int g^+ - \int g^-$$
$$\int (f + g) = \int f + \int g$$
If $f$ is integrable and $c \in \mathbb{R}$, then $cf$ is integrable and $\int cf = c \int f$; the proof is the same as for
the real-valued case.
The art of handling null sets

Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. $A \in \mathcal{A}$ is called a null set if $\mu(A) = 0$.
We say $f = g$ almost everywhere if $\mu(\{\omega : f(\omega) \ne g(\omega)\}) = 0$.
We say $f \le g$ almost everywhere if $\mu(\{\omega : f(\omega) > g(\omega)\}) = 0$.
We say $\{f_n\}_{n=1}^{\infty}$ increases almost everywhere if for all $n$, $f_n \le f_{n+1}$ almost everywhere.
Equivalently,
$$\mu(\{\omega : \exists n\, (f_{n+1}(\omega) < f_n(\omega))\}) = 0$$
since a countable union of measure 0 sets has measure 0.
We say that $f_n \to f$ as $n \to \infty$ almost everywhere if $\mu(\{\omega : f_n(\omega) \not\to f(\omega) \text{ as } n \to \infty\}) = 0$.
In general, a property is said to hold almost everywhere (abbreviated as a.e.) if the set of points
where it does not hold has measure 0. This can be applied to sets too: $A \subseteq B$ a.e. if $\mu(A \setminus B) = 0$.
Note that $A \subseteq B$ iff $1_A \le 1_B$, and $A \subseteq B$ a.e. if and only if $1_A \le 1_B$ a.e.
We now look at stronger, almost everywhere versions of MCT and DCT.
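Theorem (Monotone convergence theorem; almost everywhere version). Let $(\Omega, \mathcal{A}, \mu)$ be a measure space, and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of measurable functions which are nonnegative a.e., and
which increase to $f$ a.e. Then $\int f_n \to \int f$ as $n \to \infty$.
Proof. Let
$N_1 = \{\omega : \exists n \in \mathbb{N}\, (f_n(\omega) < 0)\}$.
$N_2 = \{\omega : \exists n \in \mathbb{N}\, (f_{n+1}(\omega) < f_n(\omega))\}$.
$N_3 = \{\omega : f_n(\omega) \not\to f(\omega) \text{ as } n \to \infty\}$.
Let $N = N_1 \cup N_2 \cup N_3$. By the hypothesis, $\mu(N) = 0$. For all $n \in \mathbb{N}$, let $g_n = f_n 1_{\Omega \setminus N}$. Let
$g = f 1_{\Omega \setminus N}$. Since $g_n = f_n$ a.e., and $g = f$ a.e., $\int g_n = \int f_n$ and $\int g = \int f$.
The sequence $\{g_n\}_{n=1}^{\infty}$ is a sequence of nonnegative functions increasing to $g$: for $\omega \in N$, $g_n(\omega) =
g(\omega) = 0$, and for $\omega \notin N$, this is true by the definition of $N_1, N_2, N_3$. So by the original monotone
convergence theorem, $\int g_n \to \int g$ as $n \to \infty$. So $\int f_n \to \int f$ as $n \to \infty$.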
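Theorem (Dominated convergence theorem; almost everywhere version). Let $(\Omega, \mathcal{A}, \mu)$ be a measure space, and let $g \in L^1(\Omega, \mathcal{A}, \mu)$. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of measurable functions, and $f$ a
measurable function, such that $|f_n| \le g$ a.e., and $f_n \to f$ as $n \to \infty$ a.e. Then $f_n, f \in L^1$, and
$\int |f_n - f| \to 0$ and $\int f_n \to \int f$ as $n \to \infty$.
Proof. Let
$N_1 = \{\omega : g(\omega) < 0\}$.
$N_2 = \{\omega : \exists n\, (|f_n(\omega)| > g(\omega))\}$.
$N_3 = \{\omega : f_n(\omega) \not\to f(\omega) \text{ as } n \to \infty\}$.
Let $N = N_1 \cup N_2 \cup N_3$. By the hypothesis, $\mu(N) = 0$. Let $\tilde{f}_n = f_n 1_{\Omega \setminus N}$, $\tilde{f} = f 1_{\Omega \setminus N}$, $\tilde{g} = g 1_{\Omega \setminus N}$.
As before, because of almost everywhere equality, $\int \tilde{f}_n = \int f_n$, $\int \tilde{f} = \int f$, $\int \tilde{g} = \int g$.
$|\tilde{g}| \le |g|$, so $\tilde{g} \in L^1$. $|\tilde{f}_n| \le \tilde{g}$ for all $n \in \mathbb{N}$. $\tilde{f}_n \to \tilde{f}$ as $n \to \infty$. By the original dominated
convergence theorem, $\int |\tilde{f}_n - \tilde{f}| \to 0$ as $n \to \infty$. Since $|\tilde{f}_n - \tilde{f}| = |f_n - f|$ a.e. (they differ at most
on $N$), $\int |f_n - f| \to 0$. Also, since $\int \tilde{f}_n \to \int \tilde{f}$, $\int f_n \to \int f$.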
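Fubini's Theorem
Theorem (Fubini's theorem; nonnegative functions). Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are
finite measure spaces. Let
$$(\Omega, \mathcal{A}, \mu) = (\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$$
be their product. Let $f : \Omega \to \bar{\mathbb{R}}$ be nonnegative measurable. Then
For all $\omega_1 \in \Omega_1$, $f^{\omega_1} := \lambda\omega_2.f(\omega_1, \omega_2)$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
For all $\omega_2 \in \Omega_2$, $f_{\omega_2} := \lambda\omega_1.f(\omega_1, \omega_2)$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_2.\int f_{\omega_2} \, d\mu_1$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
The above facts are needed to ensure that the main assertion is meaningful:
$$\int f \, d\mu = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 = \int \left(\lambda\omega_2.\int f_{\omega_2} \, d\mu_1\right) d\mu_2$$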
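Proof. We will prove the following:
For all $\omega_1 \in \Omega_1$, $f^{\omega_1}$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
$\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\int f \, d\mu = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1$.
Let these assertions together be denoted by $\star$. The rest of the assertions to be proven are obtained
by a similar proof, interchanging the roles of the indices 1 and 2. We will go step by step:
Indicator functions: Suppose $f = 1_S$, for $S \in \mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2$. For all $\omega_1 \in \Omega_1$, $f^{\omega_1} = 1_{S^{\omega_1}}$. From
the definition of the product measure, we know that $S^{\omega_1} \in \mathcal{A}_2$ and so $f^{\omega_1}$ is measurable.
$$\lambda\omega_1.\int f^{\omega_1} \, d\mu_2 = \lambda\omega_1.\int 1_{S^{\omega_1}} \, d\mu_2 = \lambda\omega_1.\mu_2(S^{\omega_1})$$
is measurable, as shown in the definition of product measure.
$$\int f \, d\mu = \mu(S) = \int \big(\lambda\omega_1.\mu_2(S^{\omega_1})\big)\, d\mu_1 = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1$$
again by the definition of product measure. $\star$ holds for indicator functions.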
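Sum: Suppose $\star$ holds for $f$ and $g$. For all $\omega_1 \in \Omega_1$, $(f + g)^{\omega_1} = f^{\omega_1} + g^{\omega_1}$, so $(f + g)^{\omega_1}$ is
measurable.
$$\begin{aligned}
\lambda\omega_1.\int (f + g)^{\omega_1} \, d\mu_2 &= \lambda\omega_1.\int (f^{\omega_1} + g^{\omega_1}) \, d\mu_2 \\
&= \lambda\omega_1.\left(\int f^{\omega_1} \, d\mu_2 + \int g^{\omega_1} \, d\mu_2\right) && \text{(additivity of } \mu_2 \text{ integrals)} \\
&= \lambda\omega_1.\int f^{\omega_1} \, d\mu_2 + \lambda\omega_1.\int g^{\omega_1} \, d\mu_2
\end{aligned}$$
being a sum of measurable functions, is measurable.
$$\begin{aligned}
\int (f + g) \, d\mu &= \int f \, d\mu + \int g \, d\mu && \text{(additivity of } \mu \text{ integrals)} \\
&= \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 + \int \left(\lambda\omega_1.\int g^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(} \star \text{ holds for } f \text{ and } g\text{)} \\
&= \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2 + \lambda\omega_1.\int g^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(additivity of } \mu_1 \text{ integrals)} \\
&= \int \left(\lambda\omega_1.\left(\int f^{\omega_1} \, d\mu_2 + \int g^{\omega_1} \, d\mu_2\right)\right) d\mu_1 \\
&= \int \left(\lambda\omega_1.\int (f^{\omega_1} + g^{\omega_1}) \, d\mu_2\right) d\mu_1 && \text{(additivity of } \mu_2 \text{ integrals)} \\
&= \int \left(\lambda\omega_1.\int (f + g)^{\omega_1} \, d\mu_2\right) d\mu_1
\end{aligned}$$
If $\star$ holds for $f$ and $g$, then $\star$ holds for $f + g$.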

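Scalar multiplication: Suppose $\star$ holds for $f$ and $\alpha \in [0, \infty)$. For all $\omega_1 \in \Omega_1$, $(\alpha f)^{\omega_1} = \alpha f^{\omega_1}$ is
measurable.
$$\lambda\omega_1.\int (\alpha f)^{\omega_1} \, d\mu_2 = \lambda\omega_1.\int \alpha f^{\omega_1} \, d\mu_2 = \lambda\omega_1.\alpha \int f^{\omega_1} \, d\mu_2 = \alpha\, \lambda\omega_1.\int f^{\omega_1} \, d\mu_2$$
is a measurable function scaled by $\alpha$, and so is a measurable function.
$$\begin{aligned}
\int \alpha f \, d\mu &= \alpha \int f \, d\mu && \text{(linearity of } \mu \text{ integrals)} \\
&= \alpha \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(} \star \text{ holds for } f\text{)} \\
&= \int \alpha \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(linearity of } \mu_1 \text{ integrals)} \\
&= \int \left(\lambda\omega_1.\alpha \int f^{\omega_1} \, d\mu_2\right) d\mu_1 \\
&= \int \left(\lambda\omega_1.\int \alpha f^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(linearity of } \mu_2 \text{ integrals)} \\
&= \int \left(\lambda\omega_1.\int (\alpha f)^{\omega_1} \, d\mu_2\right) d\mu_1
\end{aligned}$$
If $\star$ holds for $f$ and $\alpha \in [0, \infty)$, then $\star$ holds for $\alpha f$.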
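Nonnegative simple functions: Since these are linear combinations of indicator functions with
nonnegative coefficients, using the above we have that $\star$ holds for all nonnegative simple functions.
Increasing limits: Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence of increasing measurable functions for which $\star$
holds. Let $f$ be the pointwise supremum (equivalently limit) of $\{f_n\}_{n=1}^{\infty}$. For all $\omega_1 \in \Omega_1$, $f^{\omega_1}$ is the
supremum of $\{f_n^{\omega_1}\}_{n=1}^{\infty}$, and so is measurable. By the monotone convergence theorem, $\int f_n^{\omega_1} \, d\mu_2$
increases to $\int f^{\omega_1} \, d\mu_2$.
$\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$ is the pointwise supremum of $\left\{\lambda\omega_1.\int f_n^{\omega_1} \, d\mu_2\right\}_{n=1}^{\infty}$, and hence measurable.
$$\begin{aligned}
\int f \, d\mu &= \lim_{n \to \infty} \int f_n \, d\mu && \text{(MCT for } \mu\text{)} \\
&= \lim_{n \to \infty} \int \left(\lambda\omega_1.\int f_n^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(} \star \text{ holds for } f_n\text{)} \\
&= \int \lim_{n \to \infty} \left(\lambda\omega_1.\int f_n^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(MCT for } \mu_1\text{)} \\
&= \int \left(\lambda\omega_1.\lim_{n \to \infty} \int f_n^{\omega_1} \, d\mu_2\right) d\mu_1 \\
&= \int \left(\lambda\omega_1.\int \lim_{n \to \infty} f_n^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(MCT for } \mu_2\text{)} \\
&= \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1
\end{aligned}$$
If $\star$ holds for all members of an increasing sequence of nonnegative measurable functions, it holds
for their supremum too.
All nonnegative measurable functions: Since any nonnegative measurable function is an
increasing limit of nonnegative simple functions, $\star$ holds for all nonnegative measurable functions.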
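On finite spaces with point masses every integral is a finite weighted sum, so the conclusion of the theorem can be checked by direct computation. The sketch below uses hypothetical weights and a hypothetical nonnegative function `f`; it compares the integral against the product measure with the two iterated integrals.

```python
# Hypothetical finite measure spaces given by point masses.
mu1 = {0: 1.0, 1: 2.0, 2: 0.5}
mu2 = {"x": 3.0, "y": 0.25}

def f(w1, w2):
    """Hypothetical nonnegative function on the product space."""
    return (w1 + 1) * (2.0 if w2 == "x" else 4.0)

def integral_product():
    """Integral of f against the product measure, whose point masses are mu1 * mu2."""
    return sum(f(w1, w2) * mu1[w1] * mu2[w2] for w1 in mu1 for w2 in mu2)

def iterated_inner_2_outer_1():
    """Integrate each vertical slice w2 -> f(w1, w2) over mu2, then the result over mu1."""
    inner = {w1: sum(f(w1, w2) * mu2[w2] for w2 in mu2) for w1 in mu1}
    return sum(inner[w1] * mu1[w1] for w1 in mu1)

def iterated_inner_1_outer_2():
    """Integrate each horizontal slice w1 -> f(w1, w2) over mu1, then the result over mu2."""
    inner = {w2: sum(f(w1, w2) * mu1[w1] for w1 in mu1) for w2 in mu2}
    return sum(inner[w2] * mu2[w2] for w2 in mu2)

print(integral_product(), iterated_inner_2_outer_1(), iterated_inner_1_outer_2())
```

Here the equality is just a rearrangement of a finite double sum; the content of the theorem is that the same identity survives for general measure spaces, where measurability of the slice integrals has to be proved first.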
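Dealing with nonnegative functions was easy, because we could always integrate, without conditions. We now look at Fubini's theorem for integrable functions, where more work will be needed
to deal with integrability.
Theorem (Fubini's theorem; integrable functions). Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are finite
measure spaces. Let
$$(\Omega, \mathcal{A}, \mu) = (\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$$
Suppose $f \in L^1(\Omega, \mathcal{A}, \mu)$. Then
For all $\omega_1 \in \Omega_1$, $f^{\omega_1}$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
For all $\omega_2 \in \Omega_2$, $f_{\omega_2}$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
For almost every $\omega_1 \in \Omega_1$ (w.r.t. $\mu_1$), $f^{\omega_1}$ is $\mu_2$-integrable. For $\omega_1$ for which $f^{\omega_1}$ is not
integrable, put $\int f^{\omega_1} \, d\mu_2 = 0$. $\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$, defined with this convention, is measurable
and $\mu_1$-integrable.
The corresponding mirror statement: For almost every $\omega_2 \in \Omega_2$ (w.r.t. $\mu_2$), $f_{\omega_2}$ is $\mu_1$-integrable. For $\omega_2$ for which $f_{\omega_2}$ is not integrable, put $\int f_{\omega_2} \, d\mu_1 = 0$. $\lambda\omega_2.\int f_{\omega_2} \, d\mu_1$,
defined with this convention, is measurable and $\mu_2$-integrable.
Finally,
$$\int f \, d\mu = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 = \int \left(\lambda\omega_2.\int f_{\omega_2} \, d\mu_1\right) d\mu_2$$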
Before we look at the proof, consider how we may apply this theorem. The hypothesis requires $f$
to be integrable with respect to the product measure $\mu$, while in a typical application we might be
interested only in interchanging the order of integration; and directly integrating on the product
space may not be easy. We would like to work only with iterated integrals and apply this theorem.
Fortunately, we can: $f$ is integrable if and only if $|f|$ is integrable. $|f|$ is nonnegative, so we know
from Fubini's theorem for nonnegative functions that the integral of $|f|$ with respect to the
product measure is equal to the repeated integral in either order. So for $f$ to be in $L^1(\Omega, \mathcal{A}, \mu)$, it
suffices to check any one of
$$\int \left(\lambda\omega_1.\int |f|^{\omega_1} \, d\mu_2\right) d\mu_1 < \infty, \qquad \int \left(\lambda\omega_2.\int |f|_{\omega_2} \, d\mu_1\right) d\mu_2 < \infty$$
The proof of the theorem:
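Proof. As in the nonnegative case of the theorem, we prove one half of the theorem; the other half
has an analogous proof which interchanges the roles of the indices 1 and 2.
Note that for all $\omega_1 \in \Omega_1$, and all $h : \Omega \to \bar{\mathbb{R}}$, $|h|^{\omega_1} = |h^{\omega_1}|$, $(h^+)^{\omega_1} = (h^{\omega_1})^+$, and $(h^-)^{\omega_1} = (h^{\omega_1})^-$.
$f^{\omega_1} = (f^+)^{\omega_1} - (f^-)^{\omega_1}$. Both functions on the RHS are measurable by the nonnegative version of
Fubini's theorem, so their difference $f^{\omega_1}$ is measurable.
$f$ is integrable, and so $|f|$ is integrable. Applying the nonnegative version of Fubini's theorem,
$$\int \left(\lambda\omega_1.\int |f|^{\omega_1} \, d\mu_2\right) d\mu_1 < \infty$$
If the integral of a nonnegative function is finite, the function is finite almost everywhere. Let
$$N_1 = \left\{\omega_1 \in \Omega_1 : \int |f|^{\omega_1} \, d\mu_2 = \infty\right\}$$
$N_1$ is the set where some measurable function takes the value $\infty$, and so $N_1 \in \mathcal{A}_1$ and $\mu_1(N_1) = 0$.
For $\omega_1 \in \Omega_1 \setminus N_1$, $f^{\omega_1}$ is $\mu_2$-integrable.
Define $\varphi : \Omega_1 \to \mathbb{R}$,
$$\varphi(\omega_1) = \begin{cases} \int f^{\omega_1} \, d\mu_2, & \text{if } \omega_1 \in \Omega_1 \setminus N_1 \\ 0, & \text{if } \omega_1 \in N_1 \end{cases}$$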
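We want to show that $\varphi$ is measurable and $\mu_1$-integrable (this is the third bullet point in the
statement of the theorem). Let $g : \Omega \to \bar{\mathbb{R}}$, $g = f 1_{(\Omega_1 \setminus N_1) \times \Omega_2}$. $g$ is measurable, and since $|g| \le |f|$,
$g$ is integrable. For all $\omega_1 \in \Omega_1$, $g^{\omega_1}$ is integrable: if $\omega_1 \in N_1$, then $g^{\omega_1}$ is the zero function, and
hence integrable. Otherwise, $g^{\omega_1} = f^{\omega_1}$, which is integrable by definition of $N_1$. For all $\omega_1 \in \Omega_1$,
$$\begin{aligned}
\varphi(\omega_1) &= \int g^{\omega_1} \, d\mu_2 && \text{(easy to see for } \omega_1 \in N_1 \text{, and } \omega_1 \notin N_1\text{)} \\
&= \int (g^+)^{\omega_1} \, d\mu_2 - \int (g^-)^{\omega_1} \, d\mu_2
\end{aligned}$$
is a difference of two nonnegative measurable functions (we know they are measurable from the
nonnegative version of Fubini's theorem), and so is measurable.
$$\begin{aligned}
\int |\varphi| \, d\mu_1 &= \int \left(\lambda\omega_1.\left|\int g^{\omega_1} \, d\mu_2\right|\right) d\mu_1 \\
&\le \int \left(\lambda\omega_1.\int |g^{\omega_1}| \, d\mu_2\right) d\mu_1 \\
&= \int \left(\lambda\omega_1.\int |g|^{\omega_1} \, d\mu_2\right) d\mu_1 \\
&\le \int \left(\lambda\omega_1.\int |f|^{\omega_1} \, d\mu_2\right) d\mu_1 \\
&= \int |f| \, d\mu && \text{(Fubini's theorem for nonnegative functions)} \\
&< \infty && \text{(} f \text{ is integrable)}
\end{aligned}$$
So $\varphi$ is integrable. We will complete the proof in the next class.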
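Fubini's Theorem (contd)

We continue with the proof of Fubini's theorem for integrable functions. We wanted to show that
$\int f \, d\mu = \int \varphi \, d\mu_1$. $f$ and $g$ differ at most on $N_1 \times \Omega_2$, and $\mu(N_1 \times \Omega_2) = \mu_1(N_1)\mu_2(\Omega_2) = 0$. Since
$f = g$ almost everywhere (w.r.t. $\mu$), $\int f \, d\mu = \int g \, d\mu$. It suffices to show that $\int g \, d\mu = \int \varphi \, d\mu_1$.
Recall that $g$ is $\mu$-integrable, and for all $\omega_1 \in \Omega_1$, $g^{\omega_1}$ is $\mu_2$-integrable. $\varphi = \lambda\omega_1.\int g^{\omega_1} \, d\mu_2$.
$$\begin{aligned}
\int g \, d\mu &= \int g^+ \, d\mu - \int g^- \, d\mu && \text{(linearity of } \mu \text{ integral)} \\
&= \int \left(\lambda\omega_1.\int (g^+)^{\omega_1} \, d\mu_2\right) d\mu_1 - \int \left(\lambda\omega_1.\int (g^-)^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(nonnegative functions)} \\
&= \int \left(\lambda\omega_1.\int (g^+)^{\omega_1} \, d\mu_2 - \lambda\omega_1.\int (g^-)^{\omega_1} \, d\mu_2\right) d\mu_1 && \text{(linearity of } \mu_1 \text{ integral)} \\
&= \int \left(\lambda\omega_1.\left(\int (g^+)^{\omega_1} \, d\mu_2 - \int (g^-)^{\omega_1} \, d\mu_2\right)\right) d\mu_1 \\
&= \int \left(\lambda\omega_1.\int \big((g^+)^{\omega_1} - (g^-)^{\omega_1}\big) \, d\mu_2\right) d\mu_1 && \text{(linearity of } \mu_2 \text{ integral)} \\
&= \int \left(\lambda\omega_1.\int g^{\omega_1} \, d\mu_2\right) d\mu_1 \\
&= \int \varphi \, d\mu_1
\end{aligned}$$
This completes the proof.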
Product of $\sigma$-finite measures

The definition of product measure spaces so far is unsatisfactory. We have defined products of
only finite measure spaces, while even $\mathbb{R}$ with the Borel $\sigma$-field and Lebesgue measure is not
a finite measure space. We now define product measures for $\sigma$-finite measures in a way analogous
to finite measures.
Theorem. Let $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ be $\sigma$-finite measure spaces. Let
$\Omega = \Omega_1 \times \Omega_2$.
$\mathcal{R} = \{A_1 \times A_2 : A_1 \in \mathcal{A}_1, A_2 \in \mathcal{A}_2\}$.
$\mathcal{A} = \sigma(\mathcal{R})$.
There is a unique measure $\mu$ on $(\Omega, \mathcal{A})$ such that for all $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$, $\mu(A_1 \times A_2) =
\mu_1(A_1)\mu_2(A_2)$. This measure is $\sigma$-finite.
Proof.
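Step 1: $\mathcal{R}$ is a semi-field. For any $S \in \mathcal{A}$ and $\omega_1 \in \Omega_1$, the section $S^{\omega_1}$ belongs to $\mathcal{A}_2$. The
proof is the same as in the case of finite measures, since this statement makes no reference to the
measures $\mu_1$ and $\mu_2$.
Step 2: For all $S \in \mathcal{A}$, define $g_S : \Omega_1 \to [0, \infty]_{\bar{\mathbb{R}}}$ as $g_S(\omega_1) = \mu_2(S^{\omega_1})$. We will show that $g_S$
is measurable on $(\Omega_1, \mathcal{A}_1)$. Let $\mathcal{Y} \subseteq \mathcal{A}_2$ be a countable partition of $\Omega_2$ such that for all $Y \in \mathcal{Y}$,
$\mu_2(Y) < \infty$. Let
$$\mathcal{B} = \{S \in \mathcal{A} : \forall Y \in \mathcal{Y}\ (g_{S \cap (\Omega_1 \times Y)} \text{ is measurable})\}$$
We will show that $\mathcal{B} = \mathcal{A}$. For $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$,
$$g_{(A_1 \times A_2) \cap (\Omega_1 \times Y)} = g_{A_1 \times (A_2 \cap Y)} = \mu_2(A_2 \cap Y) 1_{A_1}$$
is measurable on $(\Omega_1, \mathcal{A}_1)$. So $\mathcal{R} \subseteq \mathcal{B}$. If $S_1, S_2 \in \mathcal{A}$ are disjoint, then for all $\omega_1$, $S_1^{\omega_1}$ and $S_2^{\omega_1}$ are
disjoint, and so $g_{S_1 \cup S_2} = g_{S_1} + g_{S_2}$. If $S_1$ and $S_2$ are disjoint, so are $S_1 \cap (\Omega_1 \times Y)$ and $S_2 \cap (\Omega_1 \times Y)$.
Since the sum of measurable functions is measurable, all finite disjoint unions of sets in $\mathcal{R}$ belong
to $\mathcal{B}$.
We now show that $\mathcal{B}$ is a monotone class. Suppose $\{S_n\}_{n=1}^{\infty}$ is a decreasing sequence of sets in $\mathcal{B}$,
with intersection $S$. $S \in \mathcal{A}$. For any $\omega_1 \in \Omega_1$, $Y \in \mathcal{Y}$, and $n \in \mathbb{N}$,
$$g_{S_n \cap (\Omega_1 \times Y)}(\omega_1) = \mu_2\big((S_n \cap (\Omega_1 \times Y))^{\omega_1}\big) \le \mu_2(Y) < \infty$$
$\{(S_n \cap (\Omega_1 \times Y))^{\omega_1}\}_{n=1}^{\infty}$ is a decreasing sequence of sets of finite $\mu_2$ measure which decreases to
$(S \cap (\Omega_1 \times Y))^{\omega_1}$. So $\{\mu_2((S_n \cap (\Omega_1 \times Y))^{\omega_1})\}_{n=1}^{\infty}$ decreases to $\mu_2((S \cap (\Omega_1 \times Y))^{\omega_1})$. $g_{S \cap (\Omega_1 \times Y)}$
is the decreasing limit of $\{g_{S_n \cap (\Omega_1 \times Y)}\}_{n=1}^{\infty}$, and hence is measurable. $S \in \mathcal{B}$. A similar argument
holds for increasing unions.
$\mathcal{B}$ is a monotone class. The field of finite disjoint unions of sets in $\mathcal{R}$, and hence the monotone
class generated by this field, is a subset of $\mathcal{B}$. By the monotone class theorem, $\sigma(\mathcal{R}) \subseteq \mathcal{B}$. $\mathcal{B} = \mathcal{A}$.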
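For any $S \in \mathcal{A}$ and $\omega_1 \in \Omega_1$,
$$\begin{aligned}
g_S(\omega_1) &= \mu_2(S^{\omega_1}) \\
&= \mu_2\left(\bigcup_{Y \in \mathcal{Y}} (S^{\omega_1} \cap Y)\right) \\
&= \mu_2\left(\bigcup_{Y \in \mathcal{Y}} (S \cap (\Omega_1 \times Y))^{\omega_1}\right) \\
&= \sum_{Y \in \mathcal{Y}} \mu_2\big((S \cap (\Omega_1 \times Y))^{\omega_1}\big) \\
&= \sum_{Y \in \mathcal{Y}} g_{S \cap (\Omega_1 \times Y)}(\omega_1)
\end{aligned}$$
A countable sum of nonnegative measurable functions is measurable (since it is the increasing limit
of the partial sums, which are measurable). For all $S \in \mathcal{A}$, $g_S$ is measurable.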
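Step 3: Define $\mu : \mathcal{A} \to [0, \infty]_{\bar{\mathbb{R}}}$,
$$\mu(S) = \int g_S \, d\mu_1$$
$g_{\emptyset}$ is the zero function, so $\mu(\emptyset) = 0$. Suppose $\mathcal{S} \subseteq \mathcal{A}$ is a countable collection of disjoint sets.
$$\begin{aligned}
g_{\bigcup \mathcal{S}}(\omega_1) &= \mu_2\left(\left(\bigcup_{S \in \mathcal{S}} S\right)^{\omega_1}\right) \\
&= \mu_2\left(\bigcup_{S \in \mathcal{S}} S^{\omega_1}\right) \\
&= \sum_{S \in \mathcal{S}} \mu_2(S^{\omega_1}) \\
&= \sum_{S \in \mathcal{S}} g_S(\omega_1)
\end{aligned}$$
Using that $g_{\bigcup \mathcal{S}} = \sum_{S \in \mathcal{S}} g_S$,
$$\begin{aligned}
\mu\left(\bigcup \mathcal{S}\right) &= \int g_{\bigcup \mathcal{S}} \, d\mu_1 \\
&= \int \sum_{S \in \mathcal{S}} g_S \, d\mu_1 \\
&= \sum_{S \in \mathcal{S}} \int g_S \, d\mu_1 && \text{(write as limit of partial sums, use MCT)} \\
&= \sum_{S \in \mathcal{S}} \mu(S)
\end{aligned}$$
So $\mu$ is a measure on $(\Omega, \mathcal{A})$.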
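Step 4: For $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$,
$$\mu(A_1 \times A_2) = \int g_{A_1 \times A_2} \, d\mu_1 = \int \mu_2(A_2) 1_{A_1} \, d\mu_1 = \mu_1(A_1)\mu_2(A_2)$$
Let $\mathcal{X} \subseteq \mathcal{A}_1$ and $\mathcal{Y} \subseteq \mathcal{A}_2$ be countable, such that $\bigcup \mathcal{X} = \Omega_1$, $\bigcup \mathcal{Y} = \Omega_2$, and the $\mu_1$ measure of
each set in $\mathcal{X}$ as well as the $\mu_2$ measure of each set in $\mathcal{Y}$ is finite.
$$\{X \times Y : X \in \mathcal{X}, Y \in \mathcal{Y}\}$$
is a witness to $(\Omega, \mathcal{A}, \mu)$ being $\sigma$-finite.
Step 5: Let $\nu$ be any measure on $(\Omega, \mathcal{A})$ such that for all $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$, $\nu(A_1 \times A_2) =
\mu_1(A_1)\mu_2(A_2)$. $\mu$ and $\nu$ agree on $\mathcal{R}$, and hence on the field of finite disjoint unions of sets in $\mathcal{R}$.
$\mu$ (and necessarily also $\nu$) are $\sigma$-finite on this field. So by a theorem proven earlier, $\mu$ and $\nu$ agree
on $\sigma(\mathcal{R}) = \mathcal{A}$.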
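Fubini's theorem for $\sigma$-finite measures

Fubini's theorem for the nonnegative and integrable cases continues to hold for $\sigma$-finite measures:
Theorem (Fubini's theorem; nonnegative functions). Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are
$\sigma$-finite measure spaces. Let
$$(\Omega, \mathcal{A}, \mu) = (\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$$
be their product. Let $f : \Omega \to \bar{\mathbb{R}}$ be nonnegative measurable. Then
For all $\omega_1 \in \Omega_1$, $f^{\omega_1} := \lambda\omega_2.f(\omega_1, \omega_2)$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
For all $\omega_2 \in \Omega_2$, $f_{\omega_2} := \lambda\omega_1.f(\omega_1, \omega_2)$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
$\lambda\omega_2.\int f_{\omega_2} \, d\mu_1$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
Finally,
$$\int f \, d\mu = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 = \int \left(\lambda\omega_2.\int f_{\omega_2} \, d\mu_1\right) d\mu_2$$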
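Theorem (Fubini's theorem; integrable functions). Suppose $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are $\sigma$-finite measure spaces. Let
$$(\Omega, \mathcal{A}, \mu) = (\Omega_1, \mathcal{A}_1, \mu_1) \times (\Omega_2, \mathcal{A}_2, \mu_2)$$
Suppose $f \in L^1(\Omega, \mathcal{A}, \mu)$. Then
For all $\omega_1 \in \Omega_1$, $f^{\omega_1}$ is measurable on $(\Omega_2, \mathcal{A}_2)$.
For all $\omega_2 \in \Omega_2$, $f_{\omega_2}$ is measurable on $(\Omega_1, \mathcal{A}_1)$.
For almost every $\omega_1 \in \Omega_1$ (w.r.t. $\mu_1$), $f^{\omega_1}$ is $\mu_2$-integrable. For $\omega_1$ for which $f^{\omega_1}$ is not
integrable, put $\int f^{\omega_1} \, d\mu_2 = 0$. $\lambda\omega_1.\int f^{\omega_1} \, d\mu_2$, defined with this convention, is measurable
and $\mu_1$-integrable.
The corresponding mirror statement: For almost every $\omega_2 \in \Omega_2$ (w.r.t. $\mu_2$), $f_{\omega_2}$ is $\mu_1$-integrable. For $\omega_2$ for which $f_{\omega_2}$ is not integrable, put $\int f_{\omega_2} \, d\mu_1 = 0$. $\lambda\omega_2.\int f_{\omega_2} \, d\mu_1$,
defined with this convention, is measurable and $\mu_2$-integrable.
Finally,
$$\int f \, d\mu = \int \left(\lambda\omega_1.\int f^{\omega_1} \, d\mu_2\right) d\mu_1 = \int \left(\lambda\omega_2.\int f_{\omega_2} \, d\mu_1\right) d\mu_2$$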
The statements are the same as before, with finite measure spaces replaced by -finite measure
spaces. The proofs we gave earlier hold for the -finite case too; we never used the measure being
finite. For example, we did not consider decreasing sequences of sets.
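As a sanity check on the statement, here is a minimal sketch on finite measure spaces, where every integral is a finite weighted sum. The point masses `mu1`, `mu2` and the function `f` are made-up illustrations, not taken from the notes.

```python
from fractions import Fraction as F

# Hypothetical finite measure spaces: point masses on {0, 1, 2} and {0, 1}.
mu1 = {0: F(1, 2), 1: F(2), 2: F(1, 3)}
mu2 = {0: F(3), 1: F(1, 4)}

def f(w1, w2):
    # Any nonnegative function on the product will do.
    return F(w1 + 2 * w2 + 1)

def product_integral():
    # Integral w.r.t. the product measure, which gives {(w1, w2)} mass mu1 * mu2.
    return sum(f(a, b) * mu1[a] * mu2[b] for a in mu1 for b in mu2)

def iterated_12():
    # Inner integral over w2 (the section with w1 fixed), outer integral over w1.
    return sum(mu1[a] * sum(f(a, b) * mu2[b] for b in mu2) for a in mu1)

def iterated_21():
    # Inner integral over w1 (the section with w2 fixed), outer integral over w2.
    return sum(mu2[b] * sum(f(a, b) * mu1[a] for a in mu1) for b in mu2)
```

Exact rational arithmetic is used so that the three values can be compared with `==` rather than up to rounding.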
Integration is area under the curve

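We now apply our knowledge of product measures and Fubini's theorem to say that integration corresponds to calculating the area under the curve, the intuition we originally started with.
The following theorem is about general measure spaces; the case of $(\mathbb{R}, \mathcal{B}, \lambda)$, where $\mathcal{B}$ is the Borel $\sigma$-field on $\mathbb{R}$ and $\lambda$ is the Lebesgue measure, corresponds to our usual notion of area.
Theorem. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space. Let $f : \Omega \to [0, \infty)$ be measurable. Consider the product space $(\Omega, \mathcal{A}, \mu) \times (\mathbb{R}, \mathcal{B}, \lambda)$. Let
$$G = \{(x, y) \in \Omega \times \mathbb{R} : 0 \le y \le f(x)\}$$
Then $G \in \mathcal{A} \otimes \mathcal{B}$ and
$$\int f \, d\mu = (\mu \times \lambda)(G)$$
Here $G$ is the region under the curve, and $(\mu \times \lambda)(G)$ is its area.
Proof. Let $\varphi_1, \varphi_2, \varphi : \Omega \times \mathbb{R} \to \mathbb{R}$,
$$\varphi_1((x, y)) = f(x), \quad \varphi_2((x, y)) = y, \quad \varphi = \varphi_1 - \varphi_2$$
For any interval $I \subseteq \mathbb{R}$, $\varphi_1^{-1}(I) = f^{-1}(I) \times \mathbb{R} \in \mathcal{A} \otimes \mathcal{B}$ and $\varphi_2^{-1}(I) = \Omega \times I \in \mathcal{A} \otimes \mathcal{B}$. So $\varphi_1$ and $\varphi_2$ are measurable. $\varphi$, being their difference, is also measurable.
$$G = \{(x, y) \in \Omega \times \mathbb{R} : 0 \le y \le f(x)\} = \{a \in \Omega \times \mathbb{R} : \varphi(a) \ge 0\} \cap (\Omega \times [0, \infty))$$
So $G \in \mathcal{A} \otimes \mathcal{B}$.
$$\begin{aligned}
(\mu \times \lambda)(G) &= \int 1_G \, d(\mu \times \lambda) \\
&= \int \Big(x \mapsto \int (1_G)^x \, d\lambda\Big) \, d\mu && \text{(Fubini's theorem for nonnegative functions)} \\
&= \int \Big(x \mapsto \int 1_{G^x} \, d\lambda\Big) \, d\mu \\
&= \int \Big(x \mapsto \int 1_{\{y : (x, y) \in G\}} \, d\lambda\Big) \, d\mu \\
&= \int \Big(x \mapsto \int 1_{[0, f(x)]} \, d\lambda\Big) \, d\mu && \text{(definition of } G\text{)} \\
&= \int \big(x \mapsto \lambda([0, f(x)])\big) \, d\mu \\
&= \int f \, d\mu
\end{aligned}$$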
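The theorem can be checked numerically in the familiar case where both factors are Lebesgue measure. The sketch below is only an illustration under assumptions of my own: the base space is restricted to $[0, 1]$, the curve is $f(x) = x^2$, and the area of $G$ is approximated by counting grid midpoints; none of the function names come from the notes.

```python
def area_under_curve(f, a, b, y_max, nx=400, ny=400):
    # Approximate the planar measure of G = {(x, y): a <= x <= b, 0 <= y <= f(x)}
    # by counting midpoints of an nx-by-ny grid on [a, b] x [0, y_max] that lie in G.
    dx, dy = (b - a) / nx, y_max / ny
    hits = 0
    for i in range(nx):
        fx = f(a + (i + 0.5) * dx)
        for j in range(ny):
            if (j + 0.5) * dy <= fx:
                hits += 1
    return hits * dx * dy

def midpoint_integral(f, a, b, n=400):
    # One-dimensional midpoint rule, standing in for the integral of f.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def square(x):
    return x * x

area = area_under_curve(square, 0.0, 1.0, 1.0)
integral = midpoint_integral(square, 0.0, 1.0)
```

Both numbers approximate $1/3$; the two-dimensional count mirrors $(\mu \times \lambda)(G)$ and the one-dimensional sum mirrors $\int f \, d\mu$.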
Product of finitely many measure spaces

Just like we took the product of two measure spaces, we can take the product of any finite number
of measure spaces. Here the complete details involved are a bit boring and conceptually there is
nothing new, so we will not look at this in detail.
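Theorem. Let $(\Omega_i, \mathcal{A}_i, \mu_i)$, $1 \le i \le n$, be $\sigma$-finite measure spaces. Let
$\Omega = \Omega_1 \times \dots \times \Omega_n$.
$\mathcal{R} = \{A_1 \times \dots \times A_n : \text{each } A_i \in \mathcal{A}_i\}$.
$\mathcal{A} = \sigma(\mathcal{R})$.
There is a unique measure $\mu : \mathcal{A} \to [0, \infty]_{\overline{\mathbb{R}}}$, such that for $A_1 \times \dots \times A_n \in \mathcal{R}$,
$$\mu(A_1 \times \dots \times A_n) = \prod_{i=1}^{n} \mu_i(A_i)$$
This $\mu$ is $\sigma$-finite.
Fubini's theorem holds for products of $n$ measure spaces. This will be stated in terms of a permutation $\pi : \{1, \dots, n\} \to \{1, \dots, n\}$, by saying that the integral of a function with respect to the product measure is the same as repeated integration performed in the order $\pi(1), \dots, \pi(n)$, for any permutation $\pi$.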
Theorem. Suppose $(\Omega, \mathcal{A}, \mu)$ is a measure space, and $T : \Omega \to \Omega$ is an isomorphism of measure spaces (that is, $T$ is bijective, for all $A \in \mathcal{A}$, $T(A)$ and $T^{-1}(A)$ belong to $\mathcal{A}$, and $\mu(A) = \mu(T(A))$). Then for all measurable $f$, $f \circ T$ is measurable. For $f$ nonnegative or integrable, $\int f \circ T = \int f$.
Proof. For any interval $I \subseteq \mathbb{R}$, $(f \circ T)^{-1}(I) = T^{-1}(f^{-1}(I))$. Since $f^{-1}(I) \in \mathcal{A}$, $T^{-1}(f^{-1}(I)) \in \mathcal{A}$. So if $f$ is measurable, so is $f \circ T$.
For $A \in \mathcal{A}$, $\mu(T^{-1}(A)) = \mu(T(T^{-1}(A))) = \mu(A) = \mu(T(A))$. The proof is as usual: $1_A \circ T = 1_{T^{-1}(A)}$, and so $\int 1_A = \int 1_A \circ T$. By linearity, for any nonnegative simple function $s$, $\int s = \int s \circ T$. For any nonnegative measurable function $f$, by taking a sequence of simple functions increasing to it and using the monotone convergence theorem, $\int f = \int f \circ T$. For any integrable $f$, by looking at $f^+$ and $f^-$, we have that $f \circ T$ is integrable and $\int f = \int f \circ T$.
This can be used to deal with product spaces: for example, if all the $n$ measure spaces are the same, then the transposition map which interchanges the $i$th component with the $j$th component is an example of such a $T$.
Towards Radon-Nikodym theorem

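We have many ways of obtaining new measure spaces from old: restricting the basic set, restricting the $\sigma$-field, adding measures on the same set and $\sigma$-field, taking products, etc. We will give another way now. First, a matter of notation: for a measure space $(\Omega, \mathcal{A}, \mu)$, $f : \Omega \to \overline{\mathbb{R}}$, and $A \in \mathcal{A}$, by $\int_A f \, d\mu$ we mean $\int f 1_A \, d\mu$.
Theorem. Suppose $(\Omega, \mathcal{A}, \mu)$ is a $\sigma$-finite measure space, and $f : \Omega \to [0, \infty)$ is measurable. Define $\nu : \mathcal{A} \to [0, \infty]_{\overline{\mathbb{R}}}$,
$$\nu(A) = \int_A f \, d\mu$$
Then $\nu$ is a $\sigma$-finite measure on $(\Omega, \mathcal{A})$.
Proof. Clearly $\nu(\emptyset) = 0$. Suppose $\mathcal{C} \subseteq \mathcal{A}$ is a countable collection of disjoint sets.
$$\begin{aligned}
\nu\Big(\bigcup \mathcal{C}\Big) &= \int f 1_{\bigcup \mathcal{C}} \, d\mu \\
&= \int f \sum_{C \in \mathcal{C}} 1_C \, d\mu \\
&= \int \sum_{C \in \mathcal{C}} f 1_C \, d\mu \\
&= \sum_{C \in \mathcal{C}} \int f 1_C \, d\mu && \text{(use partial sums, linearity of integral, MCT)} \\
&= \sum_{C \in \mathcal{C}} \nu(C)
\end{aligned}$$
So $\nu$ is a measure.
Let $\mathcal{B} \subseteq \mathcal{A}$ be a countable collection of sets with finite $\mu$ measure whose union is $\Omega$. Consider the countable collection of sets
$$\big\{B \cap \{x : f(x) \in [-n, n]\} : B \in \mathcal{B}, n \in \mathbb{N}\big\}$$
$|f 1_{B \cap \{x : f(x) \in [-n, n]\}}| \le n 1_B$, so $\nu(B \cap \{x : f(x) \in [-n, n]\}) \le n\mu(B)$. This countable collection of sets, each with finite $\nu$ measure, clearly covers $\Omega$; so $\nu$ is $\sigma$-finite.
Also, note that the above $\nu$ is finite if and only if $f$ is integrable.
$\nu$ also has the property that for all $A \in \mathcal{A}$, if $\mu(A) = 0$, then $\nu(A) = 0$ (because if $\mu(A) = 0$, then $f 1_A$ is the zero function almost everywhere (w.r.t. $\mu$), and so has integral 0). The above construction is the only way such measures can arise!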
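On a finite set with point masses the construction above becomes a weighted sum, which makes additivity and the null-set property easy to check by brute force. This is a sketch under made-up data: the names `mu`, `f`, `nu`, `mu_of` and `subsets` are illustrative only.

```python
from fractions import Fraction as F
from itertools import chain, combinations

# Hypothetical finite measure space: Omega = {a, b, c} with point masses mu.
mu = {"a": F(1), "b": F(0), "c": F(5, 2)}
f = {"a": F(3), "b": F(7), "c": F(1, 5)}   # a nonnegative density

def mu_of(A):
    return sum(mu[x] for x in A)

def nu(A):
    # nu(A) = integral over A of f d(mu); a finite sum in this setting.
    return sum(f[x] * mu[x] for x in A)

def subsets(points):
    # All subsets of a finite set, as tuples.
    pts = list(points)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))
```

Note that `f["b"]` is large but contributes nothing to `nu`, because the point `b` carries no `mu`-mass.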
Theorem (Radon-Nikodym). Suppose $\Omega$ is a nonempty set and $\mathcal{A}$ a $\sigma$-field on it. Suppose $\mu$ and $\nu$ are $\sigma$-finite measures on $(\Omega, \mathcal{A})$ such that for all $A \in \mathcal{A}$, $\mu(A) = 0$ implies $\nu(A) = 0$. Then
There exists $f : \Omega \to \mathbb{R}$ nonnegative measurable such that for all $A \in \mathcal{A}$, $\nu(A) = \int_A f \, d\mu$.
Such an $f$ is unique up to a.e. equality (w.r.t. $\mu$).
$f$ is integrable w.r.t. $\mu$ if and only if $\nu$ is a finite measure.
We will prove the theorem in the next class.
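In the special case of a finite set with point masses, the density promised by the theorem can be written down directly as a ratio. The sketch below assumes that setting; the dictionaries `mu`, `nu` and the helper names are hypothetical and only illustrate the statement, not the proof given in the notes.

```python
from fractions import Fraction as F

# Hypothetical point masses on Omega = {0, 1, 2, 3}; nu vanishes wherever mu does.
mu = {0: F(2), 1: F(0), 2: F(1, 3), 3: F(4)}
nu = {0: F(1), 1: F(0), 2: F(5), 3: F(0)}

def radon_nikodym(mu, nu):
    # Where mu({x}) > 0 the density is the ratio of point masses.
    # Where mu({x}) = 0 any value works (uniqueness is only up to mu-a.e.); we pick 0.
    if any(mu[x] == 0 and nu[x] != 0 for x in mu):
        raise ValueError("nu is not absolutely continuous with respect to mu")
    return {x: (nu[x] / mu[x] if mu[x] != 0 else F(0)) for x in mu}

def integral_over(A, z, mu):
    # Integral of z over A with respect to mu.
    return sum(z[x] * mu[x] for x in A)

z = radon_nikodym(mu, nu)
```

The freedom in choosing `z[1]` is exactly the "unique up to a.e. equality" clause: the point `1` is a `mu`-null set.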
Proof of Radon-Nikodym theorem

Theorem (Radon-Nikodym). Suppose $\Omega$ is a nonempty set and $\mathcal{A}$ a $\sigma$-field on it. Suppose $\mu$ and $\nu$ are $\sigma$-finite measures on $(\Omega, \mathcal{A})$ such that for all $A \in \mathcal{A}$, $\mu(A) = 0$ implies $\nu(A) = 0$. Then
There exists $z : \Omega \to [0, \infty)$ measurable such that for all $A \in \mathcal{A}$, $\nu(A) = \int_A z \, d\mu$.
Such a $z$ is unique up to a.e. equality (w.r.t. $\mu$).
$z$ is integrable w.r.t. $\mu$ if and only if $\nu$ is a finite measure.
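Proof. Assuming we have obtained the $z$, it is easy to see that $z$ is integrable if and only if $\nu$ is a finite measure, since $\nu(\Omega) = \int z \, d\mu$.
Uniqueness
We show uniqueness first. Suppose there are two $z_1, z_2 : \Omega \to [0, \infty)$ such that for all $A \in \mathcal{A}$, $\int_A z_1 \, d\mu = \nu(A) = \int_A z_2 \, d\mu$. We would like to say $\int_A z_1 \, d\mu - \int_A z_2 \, d\mu = 0$, but both the integrals might be $\infty$. Both $\mu$ and $\nu$ are $\sigma$-finite. Let $\mathcal{C}$ be a countable partition of $\Omega$ such that for all $C \in \mathcal{C}$, $\mu(C) < \infty$ and $\nu(C) < \infty$ (to obtain this we can take such a partition for $\mu$, and such a partition for $\nu$, and take pairwise intersections). Fix a $C \in \mathcal{C}$. Let
$$B = C \cap \{\omega : z_1(\omega) - z_2(\omega) > 0\}$$
Note that $1_B(z_1 - z_2) = 1_C(z_1 - z_2)^+$ (evaluate both sides at $\omega$: for $\omega \notin C$, both are 0; for $\omega \in C \setminus B$, the LHS is 0 and since $(z_1 - z_2)(\omega) \le 0$, the RHS is 0; for $\omega \in B$ both sides are $z_1(\omega) - z_2(\omega)$). $\nu(B) = \int 1_B z_1 \, d\mu = \int 1_B z_2 \, d\mu$. Since $\nu(B) \le \nu(C) < \infty$, it is meaningful to consider $\int 1_B z_1 \, d\mu - \int 1_B z_2 \, d\mu$.
$$\begin{aligned}
0 &= \int 1_B z_1 \, d\mu - \int 1_B z_2 \, d\mu \\
&= \int 1_B (z_1 - z_2) \, d\mu \\
&= \int 1_C (z_1 - z_2)^+ \, d\mu
\end{aligned}$$
If the integral of a nonnegative function is 0, it must be 0 almost everywhere. Applying this to the above:
$$\mu(\{\omega : \omega \in C \text{ and } z_1(\omega) - z_2(\omega) > 0\}) = 0$$
We can interchange the roles of $z_1$ and $z_2$ in the above argument. Combining the two, we get
$$\mu(\{\omega : \omega \in C \text{ and } z_1(\omega) - z_2(\omega) \ne 0\}) = 0$$
This holds for all $C \in \mathcal{C}$. Since $\mathcal{C}$ is countable, and a countable union of null sets is a null set,
$$\mu(\{\omega : z_1(\omega) \ne z_2(\omega)\}) = 0$$
This proves that the $z$ promised in the theorem (if it exists) is unique up to a.e. (w.r.t. $\mu$) equality.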
Existence
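We now show the existence of such a $z$. We reduce the general case to the case when both $\mu$ and $\nu$ are finite, and then show existence for the finite case.
Reduction to the finite measure case
Suppose we know the Radon-Nikodym theorem holds for the case when the measures involved are finite. In general, assume $\mu$ and $\nu$ are $\sigma$-finite, and let $\mathcal{C}$ be a countable partition of $\Omega$ such that for all $C \in \mathcal{C}$, $\mu(C)$ and $\nu(C)$ are finite. Define finite measures $\mu_C$ and $\nu_C$ for all $C \in \mathcal{C}$,
$$\mu_C(A) = \mu(C \cap A), \quad \nu_C(A) = \nu(C \cap A)$$
If $\mu_C(A) = 0$, then $\nu_C(A) = 0$. Using the finite-measure case of the theorem, let $z_C : \Omega \to [0, \infty)$ be such that for all $A \in \mathcal{A}$ and $C \in \mathcal{C}$,
$$\nu_C(A) = \int_A z_C \, d\mu_C$$
$z_C$ and $z_C 1_C$ differ at most on $\Omega \setminus C$, which has 0 $\mu_C$ measure. So we can replace $z_C$ by $z_C 1_C$, and so assume that for $\omega \notin C$, $z_C(\omega) = 0$. Define $z : \Omega \to [0, \infty)$ by requiring that $z|_C = (z_C)|_C$ (that is, for $\omega \in C$, $z(\omega) = z_C(\omega)$). $z$ can also be described as $\sum_{C \in \mathcal{C}} z_C$; this shows that $z$ is measurable.
If $g : \Omega \to [0, \infty)$ is 0 outside $C$, then $\int g \, d\mu = \int g \, d\mu_C$. This can be proven as usual by proving it for simple functions and then for all nonnegative measurable functions. This fact is applied to $z_C 1_A$ in what follows. For any $A \in \mathcal{A}$,
$$\begin{aligned}
\nu(A) &= \sum_{C \in \mathcal{C}} \nu(A \cap C) \\
&= \sum_{C \in \mathcal{C}} \nu_C(A) \\
&= \sum_{C \in \mathcal{C}} \int_A z_C \, d\mu_C \\
&= \sum_{C \in \mathcal{C}} \int_A z_C \, d\mu \\
&= \int_A \sum_{C \in \mathcal{C}} z_C \, d\mu && \text{(by taking partial sums; MCT)} \\
&= \int_A z \, d\mu
\end{aligned}$$
It remains to prove the theorem for the finite measure case: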

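Finite measures - getting hold of the z:
Now assume $\mu$ and $\nu$ are finite measures. Let
$$\mathcal{C} = \Big\{f : (f : \Omega \to [0, \infty]_{\overline{\mathbb{R}}}),\ \forall A \in \mathcal{A}\ \Big(\int_A f \, d\mu \le \nu(A)\Big)\Big\}$$
We want to identify $f$ such that equality holds, that is, for all $A \in \mathcal{A}$, $\int_A f \, d\mu = \nu(A)$.
$\mathcal{C}$ is nonempty, because the zero function belongs to $\mathcal{C}$. Intuitively, we want to choose the largest function in $\mathcal{C}$ as a candidate for our $z$. Does such a largest function exist? We observe some properties of $\mathcal{C}$ first:
If $f, g \in \mathcal{C}$, then $f \vee g$ (which is the pointwise maximum of $f$ and $g$) belongs to $\mathcal{C}$. Take any $A \in \mathcal{A}$. We apply the condition defining $\mathcal{C}$ to the sets $A_1 := A \cap \{\omega : f(\omega) \ge g(\omega)\}$ and $A_2 := A \cap \{\omega : f(\omega) < g(\omega)\}$:
$$\int_{A_1} f \, d\mu \le \nu(A_1), \quad \int_{A_2} g \, d\mu \le \nu(A_2)$$
Note that $f 1_{A_1} = (f \vee g) 1_{A_1}$ and $g 1_{A_2} = (f \vee g) 1_{A_2}$. Substituting in the above inequalities and adding them, we get
$$\int_A (f \vee g) \, d\mu \le \nu(A)$$
Suppose $\{f_n\}_{n=1}^\infty$ is an increasing sequence in $\mathcal{C}$. Let $f$ be their supremum (equivalently limit). Then $f \in \mathcal{C}$: take any $A \in \mathcal{A}$. $\{f_n 1_A\}_{n=1}^\infty$ increases to $f 1_A$, and so by the monotone convergence theorem the corresponding limit with integrals holds. Since each $\int f_n 1_A \, d\mu \le \nu(A)$, we have $\int f 1_A \, d\mu \le \nu(A)$.
We identify the largest function in $\mathcal{C}$ by considering the function with the largest integral. For all $f \in \mathcal{C}$, $\int f \, d\mu \le \nu(\Omega) < \infty$. Let
$$\alpha = \sup\Big\{\int f \, d\mu : f \in \mathcal{C}\Big\}$$
$0 \le \alpha \le \nu(\Omega)$. In particular, $\alpha \ne \infty$. Let $\{f_n\}_{n=1}^\infty$ be a sequence from $\mathcal{C}$ such that $\{\int f_n \, d\mu\}_{n=1}^\infty$ converges to $\alpha$ (this can be done because of the supremum definition of $\alpha$). Consider the partial maxima of $\{f_n\}_{n=1}^\infty$, that is, define $\{g_n\}_{n=1}^\infty$ by
$$g_n = \max\{f_1, f_2, \dots, f_n\}$$
We know that $g_n \in \mathcal{C}$, so $\int g_n \, d\mu \le \alpha$. Also, $f_n \le g_n$, so $\int f_n \, d\mu \le \int g_n \, d\mu \le \alpha$. Thus $\lim_{n \to \infty} \int g_n \, d\mu = \alpha$. $\{g_n\}_{n=1}^\infty$ is an increasing sequence; let $z = \lim_{n \to \infty} g_n$. By the monotone convergence theorem, $\int z \, d\mu = \alpha$. Since $z$ is integrable, $z$ is finite almost everywhere. We modify $z$ on a set of measure 0 suitably so that $z$ never takes the value $\infty$. This $z$ is our largest function in $\mathcal{C}$, as we show now:
Let $g \in \mathcal{C}$. We will show that $g \le z$ a.e. (w.r.t. $\mu$). Let $A = \{\omega : g(\omega) > z(\omega)\}$. Let $h = z \vee g$. $h \in \mathcal{C}$. $A = \{\omega : h(\omega) > z(\omega)\} = \{\omega : h(\omega) - z(\omega) > 0\}$. $h \ge z$; $\alpha \ge \int h \, d\mu \ge \int z \, d\mu = \alpha$, so $\int h \, d\mu = \alpha$. $h - z$ is a nonnegative function whose integral is 0, so $h - z$ is 0 almost everywhere. $\mu(A) = 0$. $g \le z$ almost everywhere (w.r.t. $\mu$).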
Finite measures - showing that the z works

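Define $\eta : \mathcal{A} \to [0, \infty)$,
$$\eta(A) = \nu(A) - \int_A z \, d\mu$$
Note that this is well-defined, that is, the expression on the RHS indeed belongs to $[0, \infty)$. $\eta$ is a measure: clearly $\eta(\emptyset) = 0$. Let $\mathcal{C} \subseteq \mathcal{A}$ be a countable collection of disjoint sets.
$$\sum_{C \in \mathcal{C}} \nu(C) + \sum_{C \in \mathcal{C}} \int_C z \, d\mu = \nu\Big(\bigcup \mathcal{C}\Big) + \int_{\bigcup \mathcal{C}} z \, d\mu < \infty$$
Since the relevant series is absolutely convergent, we can rearrange terms, and
$$\eta\Big(\bigcup \mathcal{C}\Big) = \nu\Big(\bigcup \mathcal{C}\Big) - \int_{\bigcup \mathcal{C}} z \, d\mu = \sum_{C \in \mathcal{C}} \nu(C) - \sum_{C \in \mathcal{C}} \int_C z \, d\mu = \sum_{C \in \mathcal{C}} \Big(\nu(C) - \int_C z \, d\mu\Big) = \sum_{C \in \mathcal{C}} \eta(C)$$
$\eta$ is a measure. We want to show that $\eta$ is the zero measure, and then we will be done.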
If for all $A \in \mathcal{A}$ and all $k \in \mathbb{N}$, $\eta(A) - \frac{1}{k}\mu(A) \le 0$, then since $\mu(A)$ is finite, $\eta(A) = 0$, and we are done. So suppose there exist $A \in \mathcal{A}$ and $k \in \mathbb{N}$ such that $\eta(A) - \frac{1}{k}\mu(A) > 0$. Fix such an $A$ and $k$. We will derive a contradiction.
Call a set $Z \in \mathcal{A}$ good if the following two conditions hold:
$\eta(Z) - \frac{1}{k}\mu(Z) > 0$.
For all $B \subseteq Z$ with $B \in \mathcal{A}$, $\eta(B) - \frac{1}{k}\mu(B) \ge 0$.
The first condition implies that a good set $Z$ must have $\mu(Z) > 0$: otherwise if $\mu(Z) = 0$, then by the hypothesis, $\nu(Z) = 0$, and so $\eta(Z) = 0$. $A$ satisfies the first condition, but need not satisfy the second condition.
We will identify a good set. $A$ itself may not be a good set, but we can scrape off some parts of it to get a good set. Once we identify a good set $Z$, we will use the function $z + \frac{1}{k} 1_Z$ to get a contradiction.
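Let $A_0 = A$. Let $\beta_0 = \inf\{\eta(B) - \frac{1}{k}\mu(B) : B \subseteq A, B \in \mathcal{A}\}$. If $\beta_0 \ge 0$, then $A$ itself is a good set. So assume $\beta_0 < 0$. $\beta_0$ cannot be $-\infty$, since it is at least $-(\nu(\Omega) + \int z \, d\mu + \frac{1}{k}\mu(\Omega))$. Pick $B_0 \subseteq A_0$ such that
$$\eta(B_0) - \frac{1}{k}\mu(B_0) < \frac{1}{2}\beta_0$$
Put $A_1 = A \setminus B_0$.
$$\eta(A_1) - \tfrac{1}{k}\mu(A_1) = \big(\eta(A) - \eta(B_0)\big) - \tfrac{1}{k}\big(\mu(A) - \mu(B_0)\big) \ge \eta(A) - \tfrac{1}{k}\mu(A) > 0$$
(because $\eta(B_0) - \frac{1}{k}\mu(B_0) < \frac{1}{2}\beta_0 < 0$.)
An important observation is this: if $B \subseteq A_1$, then $\eta(B) - \frac{1}{k}\mu(B) \ge \frac{1}{2}\beta_0$. Otherwise, we could take a $B$ which violates this, and $B \cup B_0$ would violate the infimum definition of $\beta_0$.
If $A_1$ is a good set, we have found a good set. Otherwise, repeat with $A_1$ what we did with $A_0$. The above observation says that the $\beta_1$ so obtained will satisfy $\beta_1 \ge \frac{1}{2}\beta_0$. Inductively, we construct sequences of sets $\{A_n\}_{n=0}^\infty$ and $\{B_n\}_{n=0}^\infty$, and a sequence of real numbers $\{\beta_n\}_{n=0}^\infty$, such that the following holds: some $A_n$ is good, or for all $n$,
$B_n \subseteq A_n$.
$A_{n+1} = A_n \setminus B_n$.
$\beta_n = \inf\{\eta(B) - \frac{1}{k}\mu(B) : B \subseteq A_n, B \in \mathcal{A}\}$, $\beta_n < 0$.
$\eta(A_{n+1}) - \frac{1}{k}\mu(A_{n+1}) \ge \eta(A_n) - \frac{1}{k}\mu(A_n)$.
$\beta_{n+1} \ge \frac{1}{2}\beta_n$.
This immediately implies that for all $n$, $\beta_n \ge \frac{1}{2^n}\beta_0$ and $\eta(A_n) - \frac{1}{k}\mu(A_n) \ge \eta(A) - \frac{1}{k}\mu(A) > 0$.
If no $A_n$ was good, define $A_\infty = \bigcap_{n} A_n$. Since $\eta$ and $\mu$ are finite measures,
$$\eta(A_\infty) = \lim_{n \to \infty} \eta(A_n) \quad \text{and} \quad \mu(A_\infty) = \lim_{n \to \infty} \mu(A_n)$$
So $\eta(A_\infty) - \frac{1}{k}\mu(A_\infty) \ge \eta(A) - \frac{1}{k}\mu(A) > 0$. If $B \subseteq A_\infty$ and $B \in \mathcal{A}$, then since $B \subseteq A_n$, $\eta(B) - \frac{1}{k}\mu(B) \ge \beta_n \ge \frac{1}{2^n}\beta_0$. Since this holds for all $n$, $\eta(B) - \frac{1}{k}\mu(B) \ge 0$. $A_\infty$ is a good set! If some $A_n$ was good, let $A_\infty$ be that good set.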
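We will show that $z + \frac{1}{k} 1_{A_\infty} \in \mathcal{C}$. For any $S \in \mathcal{A}$, if $S \subseteq \Omega \setminus A_\infty$, then
$$\int_S \Big(z + \frac{1}{k} 1_{A_\infty}\Big) \, d\mu = \int_S z \, d\mu \le \nu(S)$$
If $S \subseteq A_\infty$, then
$$\begin{aligned}
\int_S \Big(z + \frac{1}{k} 1_{A_\infty}\Big) \, d\mu &= \int_S z \, d\mu + \frac{1}{k}\mu(S) \\
&\le \int_S z \, d\mu + \eta(S) && \text{(since } A_\infty \text{ is a good set)} \\
&= \nu(S) && \text{(by definition of } \eta\text{)}
\end{aligned}$$
For general $S$, we just combine the part of $S$ in $A_\infty$ and the part in $\Omega \setminus A_\infty$:
$$\int_S \Big(z + \frac{1}{k} 1_{A_\infty}\Big) \, d\mu = \int_{S \cap A_\infty} \Big(z + \frac{1}{k} 1_{A_\infty}\Big) \, d\mu + \int_{S \setminus A_\infty} \Big(z + \frac{1}{k} 1_{A_\infty}\Big) \, d\mu \le \nu(S \cap A_\infty) + \nu(S \setminus A_\infty) = \nu(S)$$
We have shown that $z + \frac{1}{k} 1_{A_\infty} \in \mathcal{C}$. Since $A_\infty$ is a good set, $\mu(A_\infty) > 0$. It is not the case that $z + \frac{1}{k} 1_{A_\infty} \le z$ a.e. (w.r.t. $\mu$), which contradicts what we have shown earlier.
$\eta$ must be the zero measure, and so for all $D \in \mathcal{A}$,
$$\nu(D) = \int_D z \, d\mu$$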
Absolute continuity and singularity

Suppose $\Omega$ is a nonempty set, and $\mathcal{A}$ is a $\sigma$-field on it. Let $\mu$ and $\nu$ be measures on $(\Omega, \mathcal{A})$.
Definition. $\nu$ is said to be absolutely continuous with respect to $\mu$, denoted as $\nu \ll \mu$, if for all $A \in \mathcal{A}$, $\mu(A) = 0 \implies \nu(A) = 0$.
This was the situation we encountered in the Radon-Nikodym theorem. At the other end of the spectrum, we have the following:
Definition. $\nu$ is said to be singular with respect to $\mu$, denoted as $\nu \perp \mu$, if there exists $A \in \mathcal{A}$ such that $\mu(A) = 0 = \nu(\Omega \setminus A)$.
This says that $\mu$ and $\nu$ are supported on disjoint sets.
Now suppose $\mu$ and $\nu$ are both $\sigma$-finite. Then we can write $\nu$ as $\nu_1 + \nu_2$, where $\nu_1$ is absolutely continuous with respect to $\mu$, and $\nu_2$ is singular with respect to $\mu$. We will not prove this now.
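For point masses on a finite set the decomposition can be computed by hand: split $\nu$ according to whether a point carries $\mu$-mass. The sketch below assumes that toy setting; `mu`, `nu` and `lebesgue_decomposition` are illustrative names, and this is not the general proof that the notes defer.

```python
from fractions import Fraction as F

# Hypothetical point masses on Omega = {0, 1, 2}.
mu = {0: F(1), 1: F(0), 2: F(3)}
nu = {0: F(2), 1: F(5), 2: F(0)}

def lebesgue_decomposition(mu, nu):
    # nu1 keeps the mass of nu on points that mu charges (absolutely continuous part);
    # nu2 keeps the mass of nu on mu-null points (singular part).
    nu1 = {x: (nu[x] if mu[x] != 0 else F(0)) for x in nu}
    nu2 = {x: (nu[x] if mu[x] == 0 else F(0)) for x in nu}
    return nu1, nu2

nu1, nu2 = lebesgue_decomposition(mu, nu)
null_set = [x for x in mu if mu[x] == 0]   # the set A with mu(A) = 0
rest = [x for x in mu if mu[x] != 0]       # its complement, which nu2 does not charge
```

Here `null_set` plays the role of the set $A$ in the definition of singularity: it has `mu`-measure 0 and its complement has `nu2`-measure 0.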
Function spaces
We now do some analysis, using the theory we have developed. Studying limits is important, since
that is often the only way to obtain certain objects. For example, irrational real numbers can often
be obtained only via a limiting operation.
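For a measure space $(\Omega, \mathcal{A}, \mu)$, for $p \in \mathbb{R}$, $p \ge 1$, define
$$L^p(\Omega, \mathcal{A}, \mu) = \Big\{f : f \text{ real measurable}, \int |f|^p \, d\mu < \infty\Big\}$$
$L^p(\Omega, \mathcal{A}, \mu)$ will be denoted simply by $L^p$ if there is no ambiguity. For $f \in L^p$, define
$$\|f\|_p = \Big(\int |f|^p \, d\mu\Big)^{\frac{1}{p}}$$
Suppose we take $\Omega = \{1, \dots, n\}$, $\mathcal{A} = 2^\Omega$, and $\mu(A) = |A|$, the counting measure. Then the set of all real functions on $\Omega$ is just $\mathbb{R}^n$. Taking $p = 2$, we get the usual Euclidean norm on $\mathbb{R}^n$:
$$\|f\|_2 = \Big(\int f^2 \, d\mu\Big)^{\frac{1}{2}} = \sqrt{\sum_{i=1}^{n} (f(i))^2}$$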
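We now look at some modes of convergence of functions: suppose $\{f_n\}_{n=1}^\infty$ is a sequence of measurable functions and $f$ is a measurable function.
1. $f_n \to f$ as $n \to \infty$ almost everywhere if $\mu(\{\omega : f_n(\omega) \not\to f(\omega)\}) = 0$.
2. $f_n \to f$ as $n \to \infty$ in measure if for all $\epsilon > 0$,
$$\mu(\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\}) \to 0 \text{ as } n \to \infty$$
3. $f_n \to f$ as $n \to \infty$ in $L^p$ (also denoted as $f_n \xrightarrow{L^p} f$) if each $f_n - f$, $f_n$, and $f$ are in $L^p$ and $\|f_n - f\|_p \to 0$.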
Here are some examples. Consider our usual $(\mathbb{R}, \mathcal{B}, \lambda)$.
1. Let $f_n = 1_{[n, \infty)}$, $f = 0$. Then $f_n \to f$ almost everywhere (in fact, everywhere); however, for all $n \in \mathbb{N}$,
$$\lambda\big(\big\{\omega : |f_n(\omega) - f(\omega)| > \tfrac{1}{2}\big\}\big) = \lambda([n, \infty)) = \infty$$
So $\{f_n\}_{n=1}^\infty$ does not converge to $f$ in measure.
2. Define a sequence of indicator functions $\{f_n\}_{n=1}^\infty$ as follows: the first element is $1_{[0,1)}$. The next two elements are $1_{[0, \frac{1}{2})}$ and $1_{[\frac{1}{2}, 1)}$. The next four elements are $1_{[0, \frac{1}{4})}$, $1_{[\frac{1}{4}, \frac{2}{4})}$, $1_{[\frac{2}{4}, \frac{3}{4})}$, $1_{[\frac{3}{4}, 1)}$; and so on. In general, the elements from positions $2^n$ to $2^{n+1} - 1$ are indicator functions of sets obtained by dividing $[0, 1)$ into $2^n$ pieces in the obvious way. As before, let $f$ be the zero function.
Now consider the sequence $\{\lambda(\{\omega : f_n(\omega) - f(\omega) > \epsilon\})\}_{n=1}^\infty$. For $\epsilon \ge 1$ this is the zero sequence; for $0 < \epsilon < 1$, this sequence consists of a sequence of blocks, the $n$th block having $2^{n-1}$ numbers, each number being $\frac{1}{2^{n-1}}$. So this sequence converges to 0, and $f_n \to f$ in measure as $n \to \infty$.
However, $f_n \not\to f$ almost everywhere. In fact, for every $\omega \in [0, 1)$, $\{f_n(\omega)\}_{n=1}^\infty$ does not converge to $f(\omega) = 0$. This is because, for a fixed $\omega$, in the block of indices from $2^n$ to $2^{n+1} - 1$, there is some index $k$ such that $f_k(\omega) = 1$. $\{f_n(\omega)\}_{n=1}^\infty$ has a subsequence consisting of all 1s, and so does not converge to 0.
Here the set where $f_n$ differs from $f$ becomes small, but keeps moving around all over $[0, 1)$, preventing almost everywhere convergence.
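The second example can be made concrete with a few lines of code. This is a minimal sketch; the indexing helper `typewriter_interval` and the sample point `w` are my own choices for illustration.

```python
from fractions import Fraction as F

def typewriter_interval(n):
    # Position n (1-based) lies in the block 2^k <= n < 2^(k+1); that block splits
    # [0, 1) into 2^k pieces, and position n picks piece number n - 2^k.
    k = n.bit_length() - 1
    j = n - 2 ** k
    return F(j, 2 ** k), F(j + 1, 2 ** k)

def f_n(n, w):
    # Indicator function of the n-th interval, evaluated at w.
    lo, hi = typewriter_interval(n)
    return 1 if lo <= w < hi else 0

def measure_of_support(n):
    # Lebesgue measure of {w : f_n(w) > eps} for any 0 < eps < 1.
    lo, hi = typewriter_interval(n)
    return hi - lo

w = F(1, 3)
hits = [n for n in range(1, 2 ** 10) if f_n(n, w) == 1]
```

The list `hits` contains exactly one index from each of the ten blocks below $2^{10}$, so $f_n(w) = 1$ keeps recurring, while `measure_of_support(n)` shrinks to 0.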
We will soon see that Lp is a vector space under the usual operations.
More on convergence of functions

We investigate more properties of convergence of a sequence of functions. In what follows, $(\Omega, \mathcal{A}, \mu)$ is a measure space, and all functions are real-valued measurable functions.
If $\{f_n\}_{n=1}^\infty$ converges to $f$ a.e. and to $g$ a.e. then $f = g$ almost everywhere. This says that a.e. limits are unique (up to a.e. equality). Let $N_1$ be the set of points where $\{f_n\}_{n=1}^\infty$ does not converge to $f$, and $N_2$ the points where it does not converge to $g$. Both are in $\mathcal{A}$ and have measure 0. Outside $N_1 \cup N_2$, $f$ and $g$ agree, and so $f = g$ almost everywhere.
If $\{f_n\}_{n=1}^\infty$ converges to $f$ in measure and to $g$ in measure then $f = g$ almost everywhere: fix $\epsilon > 0$. For all $n$,
$$\{\omega : |f(\omega) - g(\omega)| > \epsilon\} \subseteq \{\omega : |f(\omega) - f_n(\omega)| > \epsilon/2\} \cup \{\omega : |f_n(\omega) - g(\omega)| > \epsilon/2\}$$
$$\mu(\{\omega : |f(\omega) - g(\omega)| > \epsilon\}) \le \mu(\{\omega : |f(\omega) - f_n(\omega)| > \epsilon/2\}) + \mu(\{\omega : |f_n(\omega) - g(\omega)| > \epsilon/2\})$$
By hypothesis, the RHS converges to 0 as $n \to \infty$. So $\mu(\{\omega : |f(\omega) - g(\omega)| > \epsilon\}) = 0$.
$$\{\omega : f(\omega) \ne g(\omega)\} = \bigcup_{k \in \mathbb{N}} \{\omega : |f(\omega) - g(\omega)| > 1/k\}$$
Since a countable union of null sets is a null set, $f = g$ almost everywhere.

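What if a sequence converges to some function almost everywhere, and converges to some function in measure? Do the two limits have to be equal almost everywhere? Yes. For that we will prove something else first.
Suppose $\{f_n\}_{n=1}^\infty$ converges to $f$ in measure. Then there exists a subsequence which converges to $f$ almost everywhere: for all $k \in \mathbb{N}$, pick $n_k$ such that
$$\mu\big(\big\{\omega : |f_m(\omega) - f(\omega)| > \tfrac{1}{2^k}\big\}\big) < \tfrac{1}{2^k} \text{ for all } m \ge n_k$$
and such that $\{n_k\}_{k=1}^\infty$ is a strictly increasing sequence of natural numbers. We will show that the subsequence $\{f_{n_k}\}_{k=1}^\infty$ converges to $f$ almost everywhere. For all $k \in \mathbb{N}$, let
$$A_k = \big\{\omega : |f_{n_k}(\omega) - f(\omega)| > \tfrac{1}{2^k}\big\}$$
We know $\mu(A_k) < \frac{1}{2^k}$, and so for any $p \in \mathbb{N}$,
$$\mu\Big(\bigcup_{k=p}^{\infty} A_k\Big) \le \sum_{k=p}^{\infty} \mu(A_k) \le \sum_{k=p}^{\infty} \frac{1}{2^k} = \frac{1}{2^{p-1}}$$
So let
$$B = \bigcap_{p=1}^{\infty} \bigcup_{k=p}^{\infty} A_k$$
$\mu(B) \le \frac{1}{2^{p-1}}$ for all $p$, and so $\mu(B) = 0$. $B$ is exactly the set of $\omega$ which appear in infinitely many elements of the sequence $\{A_k\}_{k=1}^\infty$. We had also described $B$ as $\limsup_{k \to \infty} A_k$.
$$\Omega \setminus B = \bigcup_{p=1}^{\infty} \bigcap_{k=p}^{\infty} (\Omega \setminus A_k)$$
$\Omega \setminus B$ is the set of all points which appear in $A_k$ for only finitely many $k$. For $\omega \notin B$, there is a $K_0$ such that for $k \ge K_0$, $\omega \notin A_k$, and so $|f_{n_k}(\omega) - f(\omega)| \le \frac{1}{2^k}$. So $\lim_{k \to \infty} f_{n_k}(\omega) = f(\omega)$. Since $B$ is a null set, $\{f_{n_k}\}_{k=1}^\infty$ converges to $f$ almost everywhere.
Suppose $\{f_n\}_{n=1}^\infty$ converges to $f$ in measure and to $g$ almost everywhere. Then $f = g$ a.e.: get a subsequence of $\{f_n\}_{n=1}^\infty$ which converges to $f$ a.e.; this also converges to $g$ a.e. and so $f = g$ a.e.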
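If $\mu$ is a finite measure, and $\{f_n\}_{n=1}^\infty$ converges to $f$ a.e., then it also converges in measure. This is not true if $\mu$ is not finite (as we saw in the previous class), and the converse is not true (even for finite measures, as seen in the previous class). Let $\epsilon > 0$ be given. Let
$$A_n = \{\omega : \exists m\,(m \ge n \text{ and } |f_m(\omega) - f(\omega)| > \epsilon)\}$$
Clearly $\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\} \subseteq A_n$. $\{A_n\}_{n=1}^\infty$ is a decreasing sequence. Let $A_\infty = \bigcap_{n \in \mathbb{N}} A_n$.
Observe that if $\omega \in A_\infty$, then there is a subsequence of $\{f_n(\omega)\}_{n=1}^\infty$ which stays at least $\epsilon$-away from $f(\omega)$, and so $\{f_n(\omega)\}_{n=1}^\infty$ does not converge to $f(\omega)$. $\{f_n\}_{n=1}^\infty$ converges to $f$ a.e., so we must have $\mu(A_\infty) = 0$.
Since $\mu$ is finite, $\lim_{n \to \infty} \mu(A_n) = \mu(A_\infty) = 0$. Since $\mu(\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\}) \le \mu(A_n)$,
$$\lim_{n \to \infty} \mu(\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\}) = 0$$
So $\{f_n\}_{n=1}^\infty$ converges to $f$ in measure.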
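For $1 \le p < \infty$, if $\{f_n\}_{n=1}^\infty$ converges to $f$ in $L^p$, then it also converges to $f$ in measure: first note that if $g$ is nonnegative, then for any $\epsilon > 0$,
$$\epsilon 1_{\{\omega : g(\omega) > \epsilon\}} \le g$$
and so by integrating both sides, $\epsilon \mu(\{\omega : g(\omega) > \epsilon\}) \le \int g$, or equivalently
$$\mu(\{\omega : g(\omega) > \epsilon\}) \le \frac{\int g}{\epsilon}$$
This trivial but profound inequality is called Markov's inequality.
Let $\epsilon > 0$ be given.
$$\begin{aligned}
\mu(\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\}) &= \mu(\{\omega : |f_n(\omega) - f(\omega)|^p > \epsilon^p\}) \\
&\le \frac{1}{\epsilon^p} \int |f_n - f|^p \, d\mu && \text{(Markov's inequality)}
\end{aligned}$$
Since $\frac{1}{\epsilon^p} \int |f_n - f|^p \, d\mu$ goes to 0 as $n \to \infty$, so does $\mu(\{\omega : |f_n(\omega) - f(\omega)| > \epsilon\})$, and so $\{f_n\}_{n=1}^\infty$ converges to $f$ in measure.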
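Markov's inequality is easy to test on a finite measure space. The following is a small sketch with invented point masses `mu` and an invented nonnegative function `g`; exact fractions avoid rounding issues in the comparison.

```python
from fractions import Fraction as F

# Hypothetical finite measure space with point masses mu and a nonnegative g.
mu = {0: F(1, 2), 1: F(1), 2: F(3), 3: F(1, 4)}
g = {0: F(4), 1: F(0), 2: F(1, 2), 3: F(10)}

def integral(h):
    # Integral of h with respect to mu.
    return sum(h[x] * mu[x] for x in mu)

def tail_measure(h, eps):
    # mu({x : h(x) > eps})
    return sum(mu[x] for x in mu if h[x] > eps)

def markov_bound(h, eps):
    # Right-hand side of Markov's inequality.
    return integral(h) / eps
```

For instance, `tail_measure(g, F(5))` only sees the point `3`, while `markov_bound(g, F(5))` divides the whole integral by 5.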
Lp and inequalities
Consider $L^p$, for $1 \le p < \infty$. This is a vector space under the usual operations. If $f \in L^p$ and $c \in \mathbb{R}$, then
$$\int |cf|^p \, d\mu = |c|^p \int |f|^p \, d\mu < \infty$$
So $cf \in L^p$. If $a$ and $b$ are nonnegative real numbers, then $(a + b)^p \le 2^p(a^p + b^p)$: the proof is surprisingly simple. If $b \ge a$, then $(a + b)^p \le (2b)^p = 2^p b^p \le 2^p(a^p + b^p)$. If $a \ge b$, a similar argument applies. If $f, g \in L^p$, then
$$\int |f + g|^p \, d\mu \le \int (|f| + |g|)^p \, d\mu \le \int 2^p(|f|^p + |g|^p) \, d\mu = 2^p \int |f|^p \, d\mu + 2^p \int |g|^p \, d\mu < \infty$$
So $f + g \in L^p$.
We now look at some profound inequalities:
Theorem (Hölder's inequality). Let $p \in (1, \infty)$, and $q = \frac{p}{p-1}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. If $f$ and $g$ are nonnegative real-valued functions, then
$$\int fg \le \Big(\int f^p\Big)^{\frac{1}{p}} \Big(\int g^q\Big)^{\frac{1}{q}}$$
If $p = 2$, then $q = 2$, and we get the Cauchy-Schwarz inequality.
Theorem (Minkowski's inequality). Let $p \in [1, \infty)$. If $f$ and $g$ are nonnegative real-valued functions, then
$$\Big(\int (f + g)^p\Big)^{\frac{1}{p}} \le \Big(\int f^p\Big)^{\frac{1}{p}} + \Big(\int g^p\Big)^{\frac{1}{p}}$$
This tells us that the $\|\cdot\|_p$ defined on $L^p$ ($\|f\|_p = (\int |f|^p \, d\mu)^{\frac{1}{p}}$) is a norm.
We will prove these inequalities in the next class.
Convex functions
Let $I \subseteq \mathbb{R}$ be an interval. $f : I \to \mathbb{R}$ is said to be convex if for all $x, y \in I$ and $0 < c < 1$,
$$f(cx + (1 - c)y) \le cf(x) + (1 - c)f(y)$$
This says that the graph of the function between $x$ and $y$ lies on or below the line segment joining $(x, f(x))$ and $(y, f(y))$.
Lemma. Let $f : I \to \mathbb{R}$ be as above. If $f''$ exists and is nonnegative, or if $f'$ exists and is nondecreasing, then $f$ is convex.
Proof. Let $x < y$ in $I$. Let $0 < c < 1$, and $z = cx + (1 - c)y$. We want to show that $f(z) \le cf(x) + (1 - c)f(y)$. This is equivalent to showing
$$c(f(x) - f(z)) + (1 - c)(f(y) - f(z)) \ge 0$$
By the mean value theorem, there exist $\xi_1 \in [x, z]$ and $\xi_2 \in [z, y]$ such that
$$\begin{aligned}
c(f(x) - f(z)) + (1 - c)(f(y) - f(z)) &= -c(z - x)f'(\xi_1) + (1 - c)(y - z)f'(\xi_2) \\
&= -c(1 - c)(y - x)f'(\xi_1) + (1 - c)(c)(y - x)f'(\xi_2) \\
&= c(1 - c)(y - x)(f'(\xi_2) - f'(\xi_1))
\end{aligned}$$
which is nonnegative if $f'$ is nondecreasing. If $f''$ exists and is nonnegative, then $f'$ exists and is nondecreasing.
If $I$ is not open, then a convex function $f$ need not be continuous. For example, let $I = [0, 1]$, and $f = 1_{\{0, 1\}}$. $f$ is convex but not continuous. However, if $I$ is an open interval, then every convex $I \to \mathbb{R}$ function is continuous (not proven here).
The exponential function ($x \mapsto e^x$ from $\mathbb{R}$ to $(0, \infty)$) has nonnegative second derivative, and so is convex.
Hölder's and Minkowski's inequalities
Lemma. Suppose $a, b \in [0, \infty)$, and $p, q \in (1, \infty)$ are such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}$$
Proof. If $a = 0$ or $b = 0$, then the inequality holds. So assume $a > 0$ and $b > 0$. Let $s$ and $t$ be such that $a = e^{s/p}$ and $b = e^{t/q}$. Then, using the convexity of the exponential function,
$$ab = e^{\frac{1}{p}s + \frac{1}{q}t} \le \frac{1}{p}e^s + \frac{1}{q}e^t = \frac{a^p}{p} + \frac{b^q}{q}$$
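Theorem (Hölder's inequality). Suppose $f, g$ are nonnegative measurable functions, and $p, q \in (1, \infty)$ are such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$\int fg \le \Big(\int f^p\Big)^{\frac{1}{p}} \Big(\int g^q\Big)^{\frac{1}{q}}$$
Proof. Let $\alpha = \int f^p$ and $\beta = \int g^q$. If $\alpha = 0$, then $f^p$ is zero a.e., so $f$ is zero a.e., so $fg$ is zero a.e., and both sides of the inequality are zero. If $\beta = 0$, then again both sides of the inequality are zero. So assume $\alpha > 0$ and $\beta > 0$. If $\alpha = \infty$ or $\beta = \infty$, then the RHS is $\infty$, and the inequality is satisfied trivially. So assume $\alpha < \infty$ and $\beta < \infty$.
Suppose first that $\alpha = \beta = 1$. Then the RHS of the inequality is 1. For any $\omega$,
$$f(\omega)g(\omega) \le \frac{f(\omega)^p}{p} + \frac{g(\omega)^q}{q}$$
So $fg \le \frac{f^p}{p} + \frac{g^q}{q}$. Integrating both sides gives
$$\int fg \le \frac{1}{p} + \frac{1}{q} = 1$$
In general, let $f_0 = \frac{f}{\alpha^{1/p}}$ and $g_0 = \frac{g}{\beta^{1/q}}$. $f_0$ and $g_0$ are nonnegative, and $\int f_0^p = \int g_0^q = 1$. Applying the above to $f_0$ and $g_0$, we get $\int f_0 g_0 \le 1$. So
$$\int fg \le \alpha^{1/p} \beta^{1/q} = \Big(\int f^p\Big)^{\frac{1}{p}} \Big(\int g^q\Big)^{\frac{1}{q}}$$
The special case of the above with $p = q = 2$ is the Cauchy-Schwarz inequality:
$$\int fg \le \sqrt{\int f^2 \int g^2}$$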
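For the counting measure on a finite set, integrals are sums and Hölder's inequality becomes a statement about vectors. The sketch below checks it on random nonnegative vectors; the helper names and the choice of random test data are assumptions made for illustration only.

```python
import random

def p_norm(v, p):
    # (sum |v_i|^p)^(1/p): the L^p norm for the counting measure on {0, ..., len(v) - 1}.
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def holder_gap(f, g, p):
    # Right-hand side minus left-hand side of Holder's inequality; should be >= 0.
    q = p / (p - 1.0)
    lhs = sum(a * b for a, b in zip(f, g))
    return p_norm(f, p) * p_norm(g, q) - lhs

rng = random.Random(0)
gaps = []
for _ in range(200):
    f = [rng.uniform(0, 5) for _ in range(6)]
    g = [rng.uniform(0, 5) for _ in range(6)]
    p = rng.uniform(1.1, 4.0)
    gaps.append(holder_gap(f, g, p))
```

With $p = 2$ and $g$ a multiple of $f$, for example `holder_gap([1.0, 2.0], [2.0, 4.0], 2.0)`, the gap vanishes up to rounding, which is the equality case of Cauchy-Schwarz.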
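Theorem (Minkowski's inequality). Suppose $f, g$ are nonnegative measurable functions, and $p \ge 1$. Then
$$\Big(\int (f + g)^p\Big)^{1/p} \le \Big(\int f^p\Big)^{1/p} + \Big(\int g^p\Big)^{1/p}$$
Proof. For $p = 1$ this is an equality, so assume $p > 1$. Let $q = \frac{p}{p-1}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. Assume $\int f^p$ and $\int g^p$ are finite, otherwise the RHS is $\infty$ and the inequality holds trivially. Also assume that $\int (f + g)^p > 0$, otherwise the LHS is 0 and the inequality holds trivially. Applying Hölder's inequality to the functions $f$ and $(f + g)^{p-1}$ with $p$ and $q$, and then to $g$ and $(f + g)^{p-1}$,
$$\int f(f + g)^{p-1} \le \Big(\int f^p\Big)^{\frac{1}{p}} \Big(\int (f + g)^{(p-1)q}\Big)^{\frac{1}{q}}$$
$$\int g(f + g)^{p-1} \le \Big(\int g^p\Big)^{\frac{1}{p}} \Big(\int (f + g)^{(p-1)q}\Big)^{\frac{1}{q}}$$
Adding the two and using that $(p - 1)q = p$, we get
$$\int (f + g)^p \le \Big[\Big(\int f^p\Big)^{\frac{1}{p}} + \Big(\int g^p\Big)^{\frac{1}{p}}\Big] \Big(\int (f + g)^p\Big)^{\frac{1}{q}}$$
Since $f, g \in L^p$, $f + g \in L^p$, and so $\int (f + g)^p$ is finite. We have assumed it is nonzero, so we can divide by $(\int (f + g)^p)^{\frac{1}{q}}$, to get
$$\Big(\int (f + g)^p\Big)^{1 - \frac{1}{q}} \le \Big(\int f^p\Big)^{1/p} + \Big(\int g^p\Big)^{1/p}$$
$1 - \frac{1}{q} = \frac{1}{p}$, so the proof is complete.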
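This allows us to define a norm (almost, but not quite, because of almost everywhere equality in point 2) on $L^p$: For $f \in L^p$, let
$$\|f\|_p = \Big(\int |f|^p\Big)^{\frac{1}{p}}$$
Let $d : L^p \times L^p \to \mathbb{R}$, $d(f, g) = \|f - g\|_p$.
1. $d$ is nonnegative and finite (because for $f, g \in L^p$, $f - g \in L^p$).
2. $d(f, g) = 0$ if and only if $f = g$ almost everywhere.
3. $d(f, h) \le d(f, g) + d(g, h)$. Proof:
$$\begin{aligned}
d(f, h) &= \Big(\int |f - h|^p\Big)^{\frac{1}{p}} \\
&\le \Big(\int (|f - g| + |g - h|)^p\Big)^{\frac{1}{p}} && \text{(exponentiation and integration are monotonic)} \\
&\le \Big(\int |f - g|^p\Big)^{\frac{1}{p}} + \Big(\int |g - h|^p\Big)^{\frac{1}{p}} && \text{(Minkowski's inequality)} \\
&= d(f, g) + d(g, h)
\end{aligned}$$
As a special case, consider $p = 2$. If $f$ and $g$ are in $L^2$, then by Hölder's inequality applied to $|f|$ and $|g|$ with $p = q = 2$, we get $fg \in L^1$. Define $\langle \cdot, \cdot \rangle : L^2 \times L^2 \to \mathbb{R}$,
$$\langle f, g \rangle = \int fg$$
This is an inner product and gives the norm $\|\cdot\|_2$, $\|f\|_2 = \sqrt{\langle f, f \rangle}$.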
Integration of complex-valued functions

In many ways complex numbers are more interesting than real numbers: every complex polynomial of degree $n$ has $n$ (possibly repeated) roots. Every differentiable $\mathbb{C} \to \mathbb{C}$ function is infinitely differentiable. A uniform limit of $C^\infty$ $\mathbb{C} \to \mathbb{C}$ functions is always $C^\infty$. None of these statements is true for $\mathbb{R} \to \mathbb{R}$ functions.
Suppose $(\Omega, \mathcal{A}, \mu)$ is a measure space, and $f : \Omega \to \mathbb{C}$. Let $\mathrm{Re} : \mathbb{C} \to \mathbb{R}$ and $\mathrm{Im} : \mathbb{C} \to \mathbb{R}$ be the real and imaginary part functions. Let $u = \mathrm{Re} \circ f$ and $v = \mathrm{Im} \circ f$ (we will henceforth denote these simply as $\mathrm{Re}\, f$ and $\mathrm{Im}\, f$). $f$ is said to be measurable if $u$ and $v$ are measurable. $f$ is said to be integrable if $u$ and $v$ are. We don't allow functions to take values $\infty$ or $-\infty$, or have integrals $\infty$ or $-\infty$ here. If $f$ is integrable, we define (as one would expect)
$$\int f = \int u + i \int v$$
It is easy to see that if $f$ and $g$ are integrable, so is $f + g$, and $\int (f + g) = \int f + \int g$. It is not so obvious that if $f$ is integrable and $c \in \mathbb{C}$, then $cf$ is integrable and $\int cf = c \int f$: let $c = a + ib$, with $a, b \in \mathbb{R}$, and $f = u + iv$, with $u, v$ real-valued functions. $u$ and $v$ are integrable.
$$\mathrm{Re}(cf) = au - bv$$
$$\mathrm{Im}(cf) = bu + av$$
This shows that $\mathrm{Re}(cf)$ and $\mathrm{Im}(cf)$ are integrable, and so by definition $cf$ is integrable.
$$\begin{aligned}
\int cf &= \int (au - bv) + i \int (bu + av) \\
&= (a + ib) \int u + (-b + ia) \int v \\
&= (a + ib) \int u + (a + ib)i \int v \\
&= c \int f
\end{aligned}$$
$f$ is integrable if and only if $|f|$ is integrable: let $u = \mathrm{Re}\, f$ and $v = \mathrm{Im}\, f$ as before. Suppose $f$ is integrable. Then $u$ and $v$ are integrable, and hence $|u| + |v|$ is integrable. $\sqrt{u^2 + v^2} \le |u| + |v|$ (square both sides to see why this is true), and so $|f| = \sqrt{u^2 + v^2}$ is integrable. Conversely, suppose $|f|$ is integrable. $|u| \le |f|$ and $|v| \le |f|$, so $|u|$ and $|v|$ are integrable. $u$ and $v$ are integrable, so by definition $f$ is integrable.
$|\int f| \le \int |f|$: $|\int f|$ is a rather complicated object, being $\sqrt{(\int u)^2 + (\int v)^2}$. We don't tackle this directly but use a clever trick. If $\int f = 0$, then the inequality is trivially true. Otherwise, let $c = \int f$, $c \ne 0$. Let
$$\alpha = \frac{\bar{c}}{|c|}$$
so that $|c| = \alpha c$. In what follows, note that we start with a real number $|\int f|$.
$$\begin{aligned}
\Big|\int f\Big| &= \alpha \int f \\
&= \int \alpha f \\
&= \int \mathrm{Re}(\alpha f) + i \int \mathrm{Im}(\alpha f) \\
&= \int \mathrm{Re}(\alpha f) && \text{(we started with a real number)} \\
&\le \int |\mathrm{Re}(\alpha f)| \\
&\le \int |\alpha f| && (|\mathrm{Re}\, g| \le |g|) \\
&= \int |f| && (|\alpha| = 1)
\end{aligned}$$
DCT holds here as well:

Theorem. Suppose $\{f_n\}_{n=1}^\infty$ is a sequence of complex measurable functions converging to $f$ pointwise. Suppose $g$ is a nonnegative real-valued integrable function such that $|f_n| \le |g|$ for all $n$. Then each $f_n$ and $f$ are integrable, $\int f_n \to \int f$, and $\int |f_n - f| \to 0$ as $n \to \infty$.
Proof. Clearly $|f| \le |g|$. $|f_n|$ and $|f|$ are integrable, and so $f_n$ and $f$ are integrable. Let $u_n, v_n, u, v$ be real-valued functions such that $f_n = u_n + iv_n$ and $f = u + iv$. $\{u_n\}_{n=1}^\infty$ converges to $u$ pointwise and is dominated by $g$, so by DCT for real-valued functions, $\lim_{n \to \infty} \int |u_n - u| = 0$. Similarly, $\lim_{n \to \infty} \int |v_n - v| = 0$. $|f_n - f| \le |u_n - u| + |v_n - v|$, so
$$\lim_{n \to \infty} \int |f_n - f| = 0$$
$|\int f_n - \int f| = |\int (f_n - f)| \le \int |f_n - f|$, so $\lim_{n \to \infty} \int f_n = \int f$.
As before, it suffices to require that $|f_n| \le g$ a.e., and that $f_n \to f$ a.e. We can take $N$ to be the union of all the (countably many) null sets involved, and restrict our attention to $\Omega \setminus N$.
Complex Lp spaces
Let $1 \le p < \infty$. Let
$$L^p(\Omega, \mathcal{A}, \mu) = \Big\{f : (f : \Omega \to \mathbb{C}),\ f \text{ measurable},\ \int |f|^p \, d\mu < \infty\Big\}$$
From now on, by $L^p$, we mean this space. We will qualify it as "real $L^p$" when we are talking about real-valued functions.
From the definition, it is immediate that $f \in L^p \iff |f| \in L^p$. $L^p$ is a vector space under the usual operations: if $f, g \in L^p$, then $|f + g|^p \le (|f| + |g|)^p \le 2^p(|f|^p + |g|^p)$. So $\int |f + g|^p < \infty$. If $f \in L^p$ and $c \in \mathbb{C}$, $\int |cf|^p = |c|^p \int |f|^p < \infty$.
For $f \in L^p$, define
$$\|f\|_p = \Big(\int |f|^p\Big)^{\frac{1}{p}}$$
$\|f\|_p = 0$ if and only if $f$ is zero almost everywhere. For $c \in \mathbb{C}$,
$$\|cf\|_p = \Big(\int |cf|^p\Big)^{\frac{1}{p}} = \Big(\int |c|^p |f|^p\Big)^{\frac{1}{p}} = \Big(|c|^p \int |f|^p\Big)^{\frac{1}{p}} = |c| \Big(\int |f|^p\Big)^{\frac{1}{p}} = |c| \|f\|_p$$
If $f, g \in L^p$, using Minkowski's inequality,
$$\|f + g\|_p = \Big(\int |f + g|^p\Big)^{\frac{1}{p}} \le \Big(\int (|f| + |g|)^p\Big)^{\frac{1}{p}} \le \Big(\int |f|^p\Big)^{\frac{1}{p}} + \Big(\int |g|^p\Big)^{\frac{1}{p}} = \|f\|_p + \|g\|_p$$
Define $d : L^p \times L^p \to [0, \infty)$,
$$d(f, g) = \|f - g\|_p$$
$d$ is symmetric, satisfies the triangle inequality, and $d(f, g) = 0$ if and only if $f = g$ almost everywhere. This is almost a metric space, but not quite. To make it a metric space, we have to identify functions which are equal almost everywhere. Define a relation $\sim$ on $L^p$: $f \sim g$ if $f = g$ a.e. It is easy to see that this is an equivalence relation. If $f \sim f'$ and $g \sim g'$, then $f - g = f' - g'$ a.e., and $\|f - g\|_p = \|f' - g'\|_p$. So $d$ naturally defines a function on the quotient $L^p / \sim$, and this is a metric.
However, to keep things simple we will not take this approach, and accept that $d$ is only a pseudometric.
For $p = 2$, we can define an inner product on $L^2$, $\langle f, g \rangle = \int f \bar{g}$ (Hölder's / Cauchy-Schwarz inequality gives us that $|f \bar{g}| = |f| |g|$ is integrable). This gives us $\|\cdot\|_2$, $\|f\|_2 = \sqrt{\langle f, f \rangle}$.
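For a finite measure space, $L^p$ decreases as $p$ increases. That is, if $p > q$, then $L^p \subseteq L^q$: let $f \in L^p$. Then
$$|f|^q \le 1 + |f|^p$$
because if $|f(\omega)| \le 1$, then $|f(\omega)|^q \le 1$, and otherwise $|f(\omega)|^q \le |f(\omega)|^p$. So $\int |f|^q \le \mu(\Omega) + \int |f|^p < \infty$.
If $\mu$ is not finite, there is no relation between $L^p$ and $L^q$ in general. We say "in general" because in some simple cases all $L^p$ might even be equal: let $\Omega = \{1, 2\}$, $\mathcal{A} = 2^\Omega$, $\mu(\{1\}) = 1$, $\mu(\{2\}) = \infty$. For all $1 \le p < \infty$, $L^p$ is just the set of all functions $f$ such that $f(2) = 0$.
Now consider $(\mathbb{R}, \mathcal{B}, \lambda)$. Let $f(x) = \frac{1}{x} 1_{[1, \infty)}(x)$. Using what we know from Riemann integration, we have that
$$\int_{[1, n]} f = \log n - \log 1 = \log n$$
Letting $n \to \infty$ and using MCT, we get that $\int |f| = \int f = \infty$. Using a similar approach,
$$\int_{[1, n]} f^2 = -\Big(\frac{1}{n} - \frac{1}{1}\Big) = 1 - \frac{1}{n}$$
Again using MCT, $\int |f|^2 = 1$. $f \in L^2$ and $f \notin L^1$.
Let $g(x) = \frac{1}{\sqrt{x}} 1_{(0, 1)}(x)$. As before,
$$\int_{[\frac{1}{n}, 1]} g = \frac{1^{\frac{1}{2}}}{\frac{1}{2}} - \frac{(\frac{1}{n})^{\frac{1}{2}}}{\frac{1}{2}} = 2\Big(1 - \sqrt{\frac{1}{n}}\Big)$$
and so by MCT $\int |g| = 2$.
$$\int_{[\frac{1}{n}, 1]} g^2 = \log(1) - \log(1/n) = \log n$$
and so by MCT $\int |g|^2 = \infty$. $g \in L^1$ and $g \notin L^2$.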
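The closed forms above make it easy to tabulate how the truncated integrals behave as $n$ grows. This is just a numerical illustration of those four formulas; the function names are mine.

```python
import math

def int_f(n):
    # Integral of 1/x over [1, n]; grows without bound.
    return math.log(n)

def int_f_sq(n):
    # Integral of 1/x^2 over [1, n]; increases to 1.
    return 1.0 - 1.0 / n

def int_g(n):
    # Integral of x^(-1/2) over [1/n, 1]; increases to 2.
    return 2.0 * (1.0 - math.sqrt(1.0 / n))

def int_g_sq(n):
    # Integral of 1/x over [1/n, 1]; grows without bound.
    return math.log(n)

rows = [(n, int_f(n), int_f_sq(n), int_g(n), int_g_sq(n)) for n in (10, 10**3, 10**6)]
```

Each tuple in `rows` lists $n$ followed by the four truncated integrals, so the bounded columns (second and third functions) can be compared against the unbounded ones.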
Differentiating under the integral sign

Suppose I is an open interval in R, and B is the Borel -field on it, and the Lebesgue measure.
Suppose (, A , ) is a measure space. Let g : I R be a measurable function defined on
the product space, such that for all y , t.g(t, y) is differentiable. Let this derivative (in t) be
g1 : I R. Suppose that for all t I, y.g(t, y) is integrable on . Define u : I R,
Z
u(t) = g(t, y) dy
R
Then can we say that u0 (t) = g1 (t, y) dy? That is, can we push the differentiation inside the
integration? Suppose the following condition holds:
For all t0 > 0, there exists > 0, and : [0, ) integrable, such that (t0 , t0 +
) I, and for all t (t0 , t0 + ), |g1 (t, y)| (y).
Then it is true that u is differentiable and $u'(t) = \int g_1(t, y)\, dy$. The proof is an application of DCT and is as follows: Let $t_0 \in I$. Let $\delta$ and $\varphi$ be as in the hypothesis above, and let $\{h_n\}_{n=1}^\infty$ be any sequence of nonzero real numbers converging to 0 such that each $h_n \in (-\delta, \delta)$. We want to show that
$$\lim_{n \to \infty} \frac{u(t_0 + h_n) - u(t_0)}{h_n} = \int g_1(t_0, y)\, dy$$
We have
$$\frac{u(t_0 + h_n) - u(t_0)}{h_n} = \int \frac{g(t_0 + h_n, y) - g(t_0, y)}{h_n}\, dy$$
By the mean value theorem, $\frac{g(t_0 + h_n, y) - g(t_0, y)}{h_n} = g_1(q, y)$ for some q between $t_0$ and $t_0 + h_n$, and that is bounded in absolute value by $\varphi(y)$. Also, we know that $\lim_{n \to \infty} \frac{g(t_0 + h_n, y) - g(t_0, y)}{h_n} = g_1(t_0, y)$. So by the dominated convergence theorem using $\varphi$ as the dominating function,
$$\lim_{n \to \infty} \frac{u(t_0 + h_n) - u(t_0)}{h_n} = \int g_1(t_0, y)\, dy$$
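As a worked instance of the condition above, here is a sketch with choices that are assumptions made for illustration and are not in the notes: the interval I = (0, ∞), the space Ω = (0, ∞) with Lebesgue measure, and g(t, y) = e^{-ty}.

```latex
% Assumed example: I = (0,\infty), \Omega = (0,\infty) with Lebesgue measure,
% g(t,y) = e^{-ty}, so u(t) = \int_0^\infty e^{-ty}\,dy = 1/t.
\[
g_1(t,y) = -y\,e^{-ty}.
\]
% Given t_0 \in I, take \delta = t_0/2 and \varphi(y) = y\,e^{-t_0 y/2}.
% For t \in (t_0/2,\, 3t_0/2):
\[
|g_1(t,y)| = y\,e^{-ty} \le y\,e^{-t_0 y/2} = \varphi(y),
\qquad
\int_0^\infty \varphi(y)\,dy = \frac{4}{t_0^{2}} < \infty .
\]
% The result then gives
\[
u'(t) = \int_0^\infty -y\,e^{-ty}\,dy = -\frac{1}{t^{2}},
\]
% which agrees with differentiating u(t) = 1/t directly.
```

Note that no single dominating function works on all of I here, since y e^{-ty} is not uniformly dominated as t tends to 0. This is why the condition is stated locally around each point.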
Completeness of Lp
We will show that $L^p(\Omega, \mathcal{A}, \mu)$ as a (pseudo)metric space is complete. But before that, we make some general remarks about showing completeness.
1. If a subsequence of a Cauchy sequence converges to a point, so does the sequence itself.
2. To show completeness of a (pseudo)metric space (X, d), it suffices to consider Cauchy sequences $\{x_n\}_{n=1}^\infty$ where $d(x_{n+1}, x_n) < \frac{1}{2^n}$. This is because every Cauchy sequence has such a subsequence, and the above remark.
Theorem. $L^p$ is complete.
Proof. By the above remarks, it suffices to take a Cauchy sequence $\{f_n\}_{n=1}^\infty$ in $L^p$ such that $\|f_{n+1} - f_n\|_p < \frac{1}{2^n}$ and show that it has a limit in $L^p$. We claim that $g := \sum_{n=1}^\infty |f_{n+1} - f_n|$ is finite almost everywhere: let $g_k = \sum_{n=1}^k |f_{n+1} - f_n|$. By Minkowski's inequality,
$$\|g_k\|_p \le \sum_{n=1}^k \|f_{n+1} - f_n\|_p \le \sum_{n=1}^k \frac{1}{2^n} \le 1$$
So $\int g_k^p \le 1$. $g_k$ increases to g as $k \to \infty$, and so $g_k^p \uparrow g^p$. By MCT, $\lim_{k \to \infty} \int g_k^p = \int g^p$. So $\int g^p \le 1$, and $g^p$ (and g) is finite almost everywhere.
Let $h = \sum_{n=1}^\infty (f_{n+1} - f_n)$: this converges absolutely almost everywhere; at the points where it does not, as usual we can take h to be 0. We have written h as a telescoping sum: $\sum_{n=1}^k (f_{n+1} - f_n) = f_{k+1} - f_1$. $\{f_{k+1} - f_1\}_{k=1}^\infty$ converges to h almost everywhere, so $\{f_k\}_{k=1}^\infty$ converges to $h + f_1$ almost everywhere. So it is reasonable to define $f := h + f_1$ and try to prove that $\{f_k\}_{k=1}^\infty$ converges to f in $L^p$. $|h| \le g$, so $|h|^p \le g^p$. $h \in L^p$, and $f \in L^p$.
For m > n,
$$\|f_m - f_n\|_p \le \sum_{k=n}^{m-1} \|f_{k+1} - f_k\|_p \le \sum_{k=n}^{m-1} \frac{1}{2^k} \le \sum_{k=n}^{\infty} \frac{1}{2^k} = \frac{1}{2^{n-1}}$$
This says that
$$\int |f_n - f_m|^p \le \frac{1}{2^{(n-1)p}}$$
For a fixed n, we can let $m \to \infty$ and apply Fatou's lemma:
$$\int \liminf_{m \to \infty} |f_n - f_m|^p \le \liminf_{m \to \infty} \int |f_n - f_m|^p$$
In the LHS, the lim inf is actually an almost everywhere limit, $|f_n - f|^p$. The RHS is bounded by $\frac{1}{2^{(n-1)p}}$. So we get
$$\int |f_n - f|^p \le \frac{1}{2^{(n-1)p}}, \qquad \|f_n - f\|_p \le \frac{1}{2^{n-1}}$$
$\{f_n\}_{n=1}^\infty$ converges to f in $L^p$.
More on Lp
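We saw that on $(\mathbb{R}, \mathcal{B}, \lambda)$, neither of $L^1$ or $L^2$ is a subset of the other. However, can we say that if $f \in L^1$ and $f \in L^3$, then $f \in L^2$? Yes:
Lemma. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. Let $1 \le \alpha < \beta < \gamma$. Then $L^\alpha \cap L^\gamma \subseteq L^\beta$.
Proof. Let $f \in L^\alpha \cap L^\gamma$.
$$|f|^\beta \le |f|^\alpha + |f|^\gamma$$
This holds because for any $\omega$, if $|f(\omega)| \le 1$, then $|f(\omega)|^\beta \le |f(\omega)|^\alpha$, and otherwise $|f(\omega)|^\beta \le |f(\omega)|^\gamma$. Since $|f|^\alpha$ and $|f|^\gamma$ are integrable, so is $|f|^\beta$.
We can also use Hölder's inequality to prove this, which gives us something more. Since $\alpha < \beta < \gamma$, we have $c \in (0, 1)$ such that $\beta = c\alpha + (1 - c)\gamma$. Let $p = \frac{1}{c}$ and $q = \frac{1}{1-c}$ so that $\frac{1}{p} + \frac{1}{q} = 1$. Let $\varphi = |f|^{c\alpha}$ and $\psi = |f|^{(1-c)\gamma}$. Hölder's inequality says that
$$\int \varphi \psi \, d\mu \le \left( \int \varphi^p \, d\mu \right)^{\frac{1}{p}} \left( \int \psi^q \, d\mu \right)^{\frac{1}{q}}$$
So we get
$$\int |f|^\beta \, d\mu = \int |f|^{c\alpha + (1-c)\gamma} \, d\mu \le \left( \int |f|^\alpha \, d\mu \right)^{c} \left( \int |f|^\gamma \, d\mu \right)^{1-c}$$
Taking logs,
$$\log \int |f|^{c\alpha + (1-c)\gamma} \, d\mu \le c \log \int |f|^\alpha \, d\mu + (1 - c) \log \int |f|^\gamma \, d\mu$$
For a fixed f, define $h : [1, \infty) \to \mathbb{R}$,
$$h(p) = \log \int |f|^p \, d\mu$$
The equation above says
$$h(c\alpha + (1 - c)\gamma) \le c\,h(\alpha) + (1 - c)\,h(\gamma)$$
In other words, h is convex.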
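For the motivating question (is a function in L^1 and L^3 also in L^2?), the Hölder argument can be made explicit. This is a sketch that only specialises the exponents to 1, 2 and 3, for which the convex-combination weight is c = 1/2.

```latex
% Specialisation of the interpolation bound to exponents 1 < 2 < 3.
% 2 = c \cdot 1 + (1-c) \cdot 3 forces c = 1/2, so p = q = 2 and
% Holder's inequality reduces to Cauchy-Schwarz applied to |f|^{1/2} and |f|^{3/2}.
\[
\int |f|^{2}\,d\mu
= \int |f|^{1/2}\,|f|^{3/2}\,d\mu
\le \left(\int |f|\,d\mu\right)^{1/2}\left(\int |f|^{3}\,d\mu\right)^{1/2},
\]
% that is,
\[
\|f\|_2^{2} \le \|f\|_1^{1/2}\,\|f\|_3^{3/2}.
\]
```

Unlike the pointwise bound, this gives a quantitative estimate of the $L^2$ norm in terms of the $L^1$ and $L^3$ norms.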
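Jensen's inequality
Suppose $I \subseteq \mathbb{R}$ is an open interval, and $\varphi : I \to \mathbb{R}$ is convex. Then for $x_1, x_2 \in I$,
$$\varphi\left(\frac{x_1 + x_2}{2}\right) \le \frac{\varphi(x_1) + \varphi(x_2)}{2}$$
Similarly, it can be shown that for $x_1, \ldots, x_n \in I$,
$$\varphi\left(\frac{x_1 + \ldots + x_n}{n}\right) \le \frac{\varphi(x_1) + \ldots + \varphi(x_n)}{n}$$
This can be viewed as follows: let $\Omega = \{1, \ldots, n\}$ and $\mathcal{A} = 2^\Omega$. Let $\mu$ be the uniform probability measure on $(\Omega, \mathcal{A})$, that is, $\mu(A) = |A|/n$. Then for any $f : \Omega \to I$, denoting $f(k)$ by $x_k$,
$$\varphi\left(\int f \, d\mu\right) = \varphi\left(\frac{x_1 + \ldots + x_n}{n}\right) \le \frac{\varphi(x_1) + \ldots + \varphi(x_n)}{n} = \int \varphi \circ f \, d\mu$$
Can we generalise this inequality to arbitrary probability spaces? It makes sense to restrict to probability spaces (as opposed to other measure spaces); after all, it need not even be true that $\varphi(x_1 + 2x_2) \le \varphi(x_1) + 2\varphi(x_2)$. Yes, we can generalise to probability spaces:
Theorem (Jensen's inequality). Suppose $(\Omega, \mathcal{A}, \mu)$ is a probability space. Let $I \subseteq \mathbb{R}$ be an open interval, and $f : \Omega \to I$ be real-valued and integrable. Let $\varphi : I \to \mathbb{R}$ be convex. Then:
- $\int f \, d\mu \in I$.
- $\int \varphi \circ f \, d\mu$ is defined.
and finally,
$$\varphi\left(\int f \, d\mu\right) \le \int \varphi \circ f \, d\mu$$
We look at some facts about convex functions. Suppose $I \subseteq \mathbb{R}$ is an open interval, and $\varphi : I \to \mathbb{R}$ is convex. Then:
- $\varphi$ is continuous.
- For all $x_0 \in I$, there exist $\alpha, \beta \in \mathbb{R}$, such that if we define $\ell : I \to \mathbb{R}$ by $\ell(x) = \alpha x + \beta$, then $\ell \le \varphi$ and $\ell(x_0) = \varphi(x_0)$.
The second fact says that we have a line through the point $(x_0, \varphi(x_0))$ of the graph of $\varphi$ (a tangent, when $\varphi$ is differentiable at $x_0$) which lies below the graph of $\varphi$.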
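We use these to prove Jensen's inequality and then prove these claims.
Proof. (Jensen's inequality) Let $a = \inf I$, $b = \sup I$. We need to show that $a < \int f \, d\mu < b$. If $a = -\infty$, then $\int f \, d\mu > a$, since f is integrable. Otherwise, $f - a\mathbf{1}$ (where $\mathbf{1}$ is the constant function 1) is a function which is positive at all points, and so its integral is positive. But its integral is $\int f \, d\mu - a$, and so $\int f \, d\mu > a$. A similar argument, which uses $b\mathbf{1} - f$, shows that $\int f \, d\mu < b$.
Let $x_0 = \int f \, d\mu$. $x_0 \in I$, and so it is meaningful to talk of $\varphi(x_0)$. Let $\alpha$ and $\beta$ be such that for all $x \in I$, $\alpha x + \beta \le \varphi(x)$ and $\alpha x_0 + \beta = \varphi(x_0)$. For all $\omega \in \Omega$, $f(\omega) \in I$, and so $\alpha f(\omega) + \beta \le \varphi(f(\omega))$.
$$\alpha f + \beta \mathbf{1} \le \varphi \circ f$$
It is easy to check that if $g_1, g_2 : \Omega \to \mathbb{R}$ with $g_1 \le g_2$, then $g_2^- \le g_1^-$. Applying this here, $(\varphi \circ f)^- \le (\alpha f + \beta \mathbf{1})^-$. Since f and $\mathbf{1}$ are integrable, so is $\alpha f + \beta \mathbf{1}$. $\int (\varphi \circ f)^- < \infty$, and so $\int \varphi \circ f$ is defined (and is finite or $+\infty$).
We have
$$\int \varphi \circ f \, d\mu \ge \int (\alpha f + \beta \mathbf{1}) \, d\mu = \alpha \int f \, d\mu + \beta = \alpha x_0 + \beta = \varphi(x_0) = \varphi\left(\int f \, d\mu\right)$$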
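A few standard special cases illustrate the theorem. These are a sketch: the particular convex functions below are chosen here for illustration and are not taken from the notes. In each case f is assumed integrable on a probability space and takes values in the stated interval I.

```latex
% Illustrative special cases of Jensen's inequality (choices of \varphi are assumptions).
% \varphi(x) = x^2 on I = \mathbb{R}:
\[
\left(\int f\,d\mu\right)^{2} \le \int f^{2}\,d\mu .
\]
% \varphi(x) = e^{x} on I = \mathbb{R}:
\[
\exp\left(\int f\,d\mu\right) \le \int e^{f}\,d\mu .
\]
% \varphi(x) = -\log x on I = (0,\infty), for f > 0:
\[
\int \log f\,d\mu \le \log\left(\int f\,d\mu\right).
\]
```

In the first two cases the right-hand side may be $+\infty$, which is consistent with the theorem's claim that $\int \varphi \circ f \, d\mu$ is defined and is finite or $+\infty$.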
We now prove the facts about convex functions used:

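Lemma. Suppose $I \subseteq \mathbb{R}$ is an open interval and $\varphi : I \to \mathbb{R}$ is convex. If $x_1 < x_2 < x_3$ are in I, then
$$\frac{\varphi(x_2) - \varphi(x_1)}{x_2 - x_1} \le \frac{\varphi(x_3) - \varphi(x_2)}{x_3 - x_2}$$
Proof. There is essentially only one way to go about the proof: write $x_2$ as a convex combination of $x_1$ and $x_3$, and apply the definition of convexity. Let $c \in (0, 1)$ be such that $x_2 = c x_1 + (1 - c) x_3$. Then $c = \frac{x_3 - x_2}{x_3 - x_1}$. By convexity of $\varphi$,
$$\varphi(x_2) \le c\,\varphi(x_1) + (1 - c)\,\varphi(x_3) = \frac{x_3 - x_2}{x_3 - x_1} \varphi(x_1) + \frac{x_2 - x_1}{x_3 - x_1} \varphi(x_3)$$
$$((x_3 - x_2) + (x_2 - x_1))\,\varphi(x_2) \le (x_3 - x_2)\,\varphi(x_1) + (x_2 - x_1)\,\varphi(x_3)$$
$$(x_3 - x_2)(\varphi(x_2) - \varphi(x_1)) \le (x_2 - x_1)(\varphi(x_3) - \varphi(x_2))$$
$$\frac{\varphi(x_2) - \varphi(x_1)}{x_2 - x_1} \le \frac{\varphi(x_3) - \varphi(x_2)}{x_3 - x_2}$$
Corollary. Suppose $I \subseteq \mathbb{R}$ is an open interval and $\varphi : I \to \mathbb{R}$ is convex. If $x_1 < x_2 < x_3 < x_4$ are in I, then
$$\frac{\varphi(x_2) - \varphi(x_1)}{x_2 - x_1} \le \frac{\varphi(x_4) - \varphi(x_3)}{x_4 - x_3}$$
Proof.
$$\frac{\varphi(x_2) - \varphi(x_1)}{x_2 - x_1} \le \frac{\varphi(x_3) - \varphi(x_2)}{x_3 - x_2} \le \frac{\varphi(x_4) - \varphi(x_3)}{x_4 - x_3}$$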
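Lemma. Suppose $I \subseteq \mathbb{R}$ is an open interval and $\varphi : I \to \mathbb{R}$ is convex. Then $\varphi$ is continuous.

Proof. Let $b, c \in I$ with b < c be arbitrary. We will show that $\varphi$ is continuous at every point of (b, c). Fix $a, d \in I$ such that a < b and d > c. Now for any $x, y \in (b, c)$ with x < y, using the above corollary,
$$\frac{\varphi(b) - \varphi(a)}{b - a} \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(d) - \varphi(c)}{d - c}$$
Let $M = \max\left\{ \left| \frac{\varphi(b) - \varphi(a)}{b - a} \right|, \left| \frac{\varphi(d) - \varphi(c)}{d - c} \right| \right\}$. Then $|\varphi(x) - \varphi(y)| \le M |x - y|$ for all $x, y \in (b, c)$. So $\varphi$ is continuous in (b, c). Since every point of I has a neighbourhood (b, c), $\varphi$ is continuous on I.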
Measure theory class notes - 1 November 2010, class 28
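Jensen's inequality (continued)

To complete the proof of Jensen's inequality we needed to prove the following:
Lemma. Suppose $I \subseteq \mathbb{R}$ is an open interval, and $\varphi : I \to \mathbb{R}$ is convex. For all $x_0 \in I$, there exist $\alpha, \beta \in \mathbb{R}$, such that if we define $\ell : I \to \mathbb{R}$ by $\ell(x) = \alpha x + \beta$, then $\ell \le \varphi$ and $\ell(x_0) = \varphi(x_0)$.
Proof. Let $x_0 \in I$ be given. For any $a < x_0$ and $b > x_0$ ($a, b \in I$), we know that
$$\frac{\varphi(x_0) - \varphi(a)}{x_0 - a} \le \frac{\varphi(b) - \varphi(x_0)}{b - x_0}$$
Therefore
$$\sup_{a < x_0} \frac{\varphi(x_0) - \varphi(a)}{x_0 - a} \le \inf_{b > x_0} \frac{\varphi(b) - \varphi(x_0)}{b - x_0}$$
And both the quantities are finite. Pick any $\alpha \in \mathbb{R}$ such that
$$\sup_{a < x_0} \frac{\varphi(x_0) - \varphi(a)}{x_0 - a} \le \alpha \le \inf_{b > x_0} \frac{\varphi(b) - \varphi(x_0)}{b - x_0}$$
This choice reflects that we want the slope $\alpha$ of our line to be at least as large as any difference quotient on the left of $x_0$ and at most as large as any difference quotient on the right of $x_0$. The choice of $\beta$ is now forced: since we want $\ell(x_0) = \varphi(x_0)$, we define $\beta = \varphi(x_0) - \alpha x_0$. Letting $\ell(x) = \alpha x + \beta$, clearly $\ell(x_0) = \varphi(x_0)$. For $x > x_0$,
$$\alpha \le \frac{\varphi(x) - \varphi(x_0)}{x - x_0}$$
$$\alpha x - \alpha x_0 \le \varphi(x) - \varphi(x_0)$$
$$\alpha x + (\varphi(x_0) - \alpha x_0) \le \varphi(x)$$
$$\alpha x + \beta \le \varphi(x)$$
For $x < x_0$,
$$\frac{\varphi(x_0) - \varphi(x)}{x_0 - x} \le \alpha$$
$$\varphi(x_0) - \varphi(x) \le \alpha x_0 - \alpha x$$
$$\alpha x + (\varphi(x_0) - \alpha x_0) \le \varphi(x)$$
$$\alpha x + \beta \le \varphi(x)$$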
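The slope chosen in the proof need not be unique. As a sketch (the function |x| is an example picked here for illustration, not one from the notes), consider the absolute value at its corner:

```latex
% Assumed example: \varphi(x) = |x| on I = \mathbb{R}, with x_0 = 0.
\[
\sup_{a<0}\frac{\varphi(0)-\varphi(a)}{0-a} = \sup_{a<0}\frac{-|a|}{-a} = -1,
\qquad
\inf_{b>0}\frac{\varphi(b)-\varphi(0)}{b-0} = \inf_{b>0}\frac{b}{b} = 1 .
\]
% Any \alpha \in [-1,1] is admissible, and \beta = \varphi(0) - \alpha\cdot 0 = 0, so
\[
\ell(x) = \alpha x \le |x| \quad\text{for all } x\in\mathbb{R},\qquad \ell(0)=\varphi(0)=0 .
\]
```

When the convex function is differentiable at the chosen point, the supremum and infimum coincide and the line is the usual tangent.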
Lebesgue decomposition theorem

In a suitable inner product space, if one has a subspace, one can write any vector as a sum of
something that belongs to the subspace, and something that is orthogonal to the subspace. Here
is a similar situation:
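Theorem (Lebesgue decomposition theorem). Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space. Let $\nu$ be a $\sigma$-finite measure on $(\Omega, \mathcal{A})$. Then $\nu = \nu_1 + \nu_2$, where $\nu_1$ and $\nu_2$ are $\sigma$-finite measures, and $\nu_1 \ll \mu$ ($\nu_1$ is absolutely continuous w.r.t. $\mu$) and $\nu_2 \perp \mu$ ($\nu_2$ is singular w.r.t. $\mu$).
Proof. We prove the theorem for the case when $\mu$ and $\nu$ are finite. Proving the $\sigma$-finite case using the finite case is routine.
Clearly $\nu$ is absolutely continuous w.r.t. $\mu + \nu$. By the Radon-Nikodym theorem, let $f \ge 0$ be such that for all $A \in \mathcal{A}$,
$$\nu(A) = \int_A f \, d(\mu + \nu)$$
f is $(\mu + \nu)$-integrable, and so is $1 - f$. For any $A \in \mathcal{A}$,
$$\int_A (1 - f) \, d(\mu + \nu) = \mu(A) + \nu(A) - \nu(A) = \mu(A) \ge 0$$
Since this holds for all $A \in \mathcal{A}$, $1 - f$ is nonnegative almost everywhere w.r.t. $\mu + \nu$ (and so also w.r.t. $\mu$ and $\nu$). We may modify f so that $f(\omega) \le 1$ for all $\omega$, without violating $\nu(A) = \int_A f \, d(\mu + \nu)$.
Let
$$\Omega_1 = \{\omega : 0 \le f(\omega) < 1\}$$
$$\Omega_2 = \{\omega : f(\omega) = 1\}$$
$\{\Omega_1, \Omega_2\}$ is a partition of $\Omega$. Define $\nu_1, \nu_2 : \mathcal{A} \to [0, \infty)$ by
$$\nu_1(A) = \nu(A \cap \Omega_1)$$
$$\nu_2(A) = \nu(A \cap \Omega_2)$$
It is easy to see that $\nu_1$ and $\nu_2$ are finite measures and $\nu = \nu_1 + \nu_2$. We will show that $\nu_2(\Omega_1) = \mu(\Omega_2) = 0$, which shows that $\nu_2$ is singular w.r.t. $\mu$. $\nu_2(\Omega_1) = 0$ is immediate from the definition of $\nu_2$. Since f is 1 on $\Omega_2$,
$$\nu(\Omega_2) = \int_{\Omega_2} f \, d(\mu + \nu) = \mu(\Omega_2) + \nu(\Omega_2)$$
Since the measures are finite, $\mu(\Omega_2) = 0$. So $\nu_2$ is singular w.r.t. $\mu$.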

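Now suppose $A \in \mathcal{A}$ is such that $\mu(A) = 0$. Then
$$\nu_1(A) = \nu(A \cap \Omega_1) = \int_{A \cap \Omega_1} f \, d(\mu + \nu) = \int_{A \cap \Omega_1} f \, d\mu + \int_{A \cap \Omega_1} f \, d\nu = \int_{A \cap \Omega_1} f \, d\nu$$
$\int f 1_{A \cap \Omega_1} \, d\mu$ is zero because $f 1_{A \cap \Omega_1}$ is zero almost everywhere w.r.t. $\mu$. We have
$$\int_{A \cap \Omega_1} (1 - f) \, d\nu = \nu(A \cap \Omega_1) - \nu(A \cap \Omega_1) = 0$$
$(1 - f) 1_{A \cap \Omega_1}$, being nonnegative, must be zero almost everywhere w.r.t. $\nu$. But f is strictly less than 1 on $\Omega_1$, and so $\nu(A \cap \Omega_1) = 0$. $\nu_1(A) = 0$. $\nu_1$ is absolutely continuous w.r.t. $\mu$.
This shows existence. To show uniqueness, suppose $\nu = \nu_1 + \nu_2 = \tilde\nu_1 + \tilde\nu_2$, with $\nu_1$ and $\tilde\nu_1$ absolutely continuous w.r.t. $\mu$, and $\nu_2$ and $\tilde\nu_2$ singular w.r.t. $\mu$. $\nu_1 - \tilde\nu_1 = \tilde\nu_2 - \nu_2$.
For $A \in \mathcal{A}$, if $\mu(A) = 0$, then $\nu_1(A) = \tilde\nu_1(A) = 0$, so
$$\tilde\nu_2(A) - \nu_2(A) = 0$$
Since $\nu_2$ and $\tilde\nu_2$ are singular w.r.t. $\mu$, let $\Omega_2, \tilde\Omega_2 \in \mathcal{A}$ be such that
$$\nu_2(\Omega \setminus \Omega_2) = 0, \quad \mu(\Omega_2) = 0$$
$$\tilde\nu_2(\Omega \setminus \tilde\Omega_2) = 0, \quad \mu(\tilde\Omega_2) = 0$$
If $A \in \mathcal{A}$ and:
- $A \subseteq \Omega_2 \cup \tilde\Omega_2$: then $\mu(A) = 0$, so $\tilde\nu_2(A) - \nu_2(A) = 0$ as shown above.
- $A \subseteq \Omega \setminus (\Omega_2 \cup \tilde\Omega_2)$: then $A \subseteq (\Omega \setminus \Omega_2) \cap (\Omega \setminus \tilde\Omega_2)$, so $\nu_2(A) = \tilde\nu_2(A) = 0$, and $\tilde\nu_2(A) - \nu_2(A) = 0$.
- Otherwise, write A as the disjoint union of its intersection with $\Omega_2 \cup \tilde\Omega_2$ and with $\Omega \setminus (\Omega_2 \cup \tilde\Omega_2)$, and apply both the above to get $\tilde\nu_2(A) - \nu_2(A) = 0$.
This shows that $\nu_2 = \tilde\nu_2$. It follows that $\nu_1 = \tilde\nu_1$.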
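The construction in the existence proof can be traced on a small example. The following is a sketch under assumptions made here for illustration only: the space is [0, 1] with its Borel sets, the reference measure is Lebesgue measure, and the measure to be decomposed is Lebesgue measure plus a unit point mass at 0.

```latex
% Assumed example: \Omega = [0,1], \mu = \lambda (Lebesgue), \nu = \lambda + \delta_0.
% Then \mu + \nu = 2\lambda + \delta_0, and a Radon-Nikodym density of \nu w.r.t. \mu+\nu is
\[
f(\omega) =
\begin{cases}
1/2, & \omega \in (0,1],\\
1,   & \omega = 0,
\end{cases}
\qquad
\int_A f\,d(2\lambda+\delta_0) = \lambda(A) + \delta_0(A) = \nu(A).
\]
% The partition used in the proof is
\[
\Omega_1 = \{0 \le f < 1\} = (0,1], \qquad \Omega_2 = \{f = 1\} = \{0\},
\]
% and the two parts are
\[
\nu_1(A) = \nu(A\cap(0,1]) = \lambda(A) \ll \lambda,
\qquad
\nu_2(A) = \nu(A\cap\{0\}) = \delta_0(A) \perp \lambda .
\]
```

As expected, the set where f equals 1 is exactly where the singular part lives, and it is a null set for the reference measure.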
Infinite products
We looked at products of finitely many measure spaces. Now we consider products of countably many measure spaces. We restrict to probability spaces (otherwise, we may not get a $\sigma$-finite measure on the product space even if all the component measures are finite). Further, to keep things simpler, we restrict to probability measures on $(\mathbb{R}, \mathcal{B})$. The structure of the real line (the topology, the order, the notion of compactness, etc) helps us. Our aim is the following:
Suppose for each $n \in \mathbb{N}$, we have a probability measure $\mu_n$ on $(\mathbb{R}, \mathcal{B})$. We want to define a probability measure $\mu$ on $\mathbb{R}^\infty$ (that is, $\prod_{n \in \mathbb{N}} \mathbb{R}$) with a suitable $\sigma$-field (yet to be determined), so that if $\{B_n\}_{n=1}^\infty$ is a sequence of sets in $\mathcal{B}$, then
$$\mu\left( \prod_{n \in \mathbb{N}} B_n \right) = \prod_{n \in \mathbb{N}} \mu_n(B_n)$$
For this to make sense, our $\sigma$-field on $\mathbb{R}^\infty$ must have at least all the infinite-dimensional boxes $\prod_{n \in \mathbb{N}} B_n$, where each $B_n \in \mathcal{B}$. With this in mind, let us define $\mathcal{B}^\infty$ to be the $\sigma$-field generated by these sets:
$$\mathcal{B}^\infty = \sigma\left( \left\{ \prod_{n \in \mathbb{N}} B_n : \text{each } B_n \in \mathcal{B} \right\} \right)$$
Note that the convergence of $\prod_{n \in \mathbb{N}} \mu_n(B_n)$ is guaranteed since each $\mu_n$ is a probability measure: the sequence $\left\{ \prod_{n=1}^N \mu_n(B_n) \right\}_{N=1}^\infty$ is nonincreasing and each element of it is in [0, 1], so the limit exists and is in [0, 1].
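To see that the infinite product can be strictly between 0 and 1, here is a worked sketch. The particular choice of measures and sets is an assumption made for illustration: each coordinate measure is the uniform distribution on [0, 1], and the n-th set is the interval from 0 to 1 - 1/(n+1)^2.

```latex
% Assumed example: \mu_n = uniform on [0,1], B_n = [0,\,1 - 1/(n+1)^2],
% so \mu_n(B_n) = 1 - 1/(n+1)^2 = \frac{n(n+2)}{(n+1)^2}.
\[
\prod_{n=1}^{N}\mu_n(B_n)
= \prod_{n=1}^{N}\frac{n}{n+1}\cdot\prod_{n=1}^{N}\frac{n+2}{n+1}
= \frac{1}{N+1}\cdot\frac{N+2}{2}
= \frac{N+2}{2(N+1)} .
\]
% The partial products are nonincreasing (3/4, 2/3, 5/8, ...) and
\[
\prod_{n\in\mathbb{N}}\mu_n(B_n) = \lim_{N\to\infty}\frac{N+2}{2(N+1)} = \frac12 .
\]
% By contrast, B_n = [0,1/2] for every n gives partial products 2^{-N} \to 0.
```

So the box in the first case should receive measure 1/2, while the box in the second case should be a null set even though every finite-dimensional face has positive measure.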
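Let us see what we can say about $\mathcal{B}^\infty$, and on which sets we can define the product measure without much effort. For each $n \in \mathbb{N}$, let $\mathcal{B}^n$ be the Borel $\sigma$-field on $\mathbb{R}^n$, and let
$$\mathcal{F}_n = \{C \times \mathbb{R}^\infty : C \in \mathcal{B}^n\} = \left\{ \{(x_k)_{k=1}^\infty : (x_1, \ldots, x_n) \in C\} : C \in \mathcal{B}^n \right\}$$
A set in $\mathcal{F}_n$ is just like a set in $\mathcal{B}^n$, but with more components tagged on, which could take any values. It is easy to see that $\mathcal{F}_n$ is a $\sigma$-field, using $\mathcal{B}^n$ being a $\sigma$-field. $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$. For any $C \subseteq \mathbb{R}^n$, we denote by $C^*$ the set of all elements of $\mathbb{R}^\infty$ whose first n components, taken as a tuple, belong to C:
$$C^* = C \times \mathbb{R}^\infty = \{(x_k)_{k=1}^\infty : (x_1, \ldots, x_n) \in C\}$$
So $\mathcal{F}_n = \{C^* : C \in \mathcal{B}^n\}$. Define $\mu^{(n)} : \mathcal{F}_n \to [0, 1]$, $\mu^{(n)}(C^*) = (\mu_1 \times \ldots \times \mu_n)(C)$, for $C \in \mathcal{B}^n$.
1. $\mu^{(n)}$ is well-defined: every element of $\mathcal{F}_n$ is $C^*$ for a unique $C \in \mathcal{B}^n$ (it is easy to see that if $C \ne D$, with $C, D \subseteq \mathbb{R}^n$, then $C^* \ne D^*$).
2. $\mu^{(n+1)}$ restricted to $\mathcal{F}_n$ equals $\mu^{(n)}$: let $C \in \mathcal{B}^n$, so that $C^* \in \mathcal{F}_n$. $C \times \mathbb{R} \in \mathcal{B}^{n+1}$ and $(C \times \mathbb{R})^* = C^*$. (This does not contradict the previous point: here $C \subseteq \mathbb{R}^n$ and $C \times \mathbb{R} \subseteq \mathbb{R}^{n+1}$.)
$$\mu^{(n+1)}(C^*) = \mu^{(n+1)}((C \times \mathbb{R})^*) = (\mu_1 \times \ldots \times \mu_{n+1})(C \times \mathbb{R}) = (\mu_1 \times \ldots \times \mu_n)(C)\,\mu_{n+1}(\mathbb{R}) = (\mu_1 \times \ldots \times \mu_n)(C) = \mu^{(n)}(C^*)$$
This implies that if m > n, then $\mu^{(m)}$ restricted to $\mathcal{F}_n$ equals $\mu^{(n)}$.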
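Now let $\mathcal{F} = \bigcup_{n \in \mathbb{N}} \mathcal{F}_n$. What kind of an object is $\mathcal{F}$? $\mathcal{F}$ is a field: $\emptyset \in \mathcal{F}_1 \subseteq \mathcal{F}$. If $A \in \mathcal{F}_n$, since $\mathcal{F}_n$ is a $\sigma$-field, $\mathbb{R}^\infty \setminus A \in \mathcal{F}_n \subseteq \mathcal{F}$. If $A \in \mathcal{F}_n$ and $B \in \mathcal{F}_m$, then since $\mathcal{F}_{\max\{m,n\}}$ is a $\sigma$-field, $A \cup B \in \mathcal{F}_{\max\{m,n\}} \subseteq \mathcal{F}$.
Define $\mu : \mathcal{F} \to [0, 1]$, $\mu(A) = \mu^{(n)}(A)$ if $A \in \mathcal{F}_n$. This is well-defined. What properties does $\mu$ have? $\mu$ is a finitely additive probability: clearly $\mu(\mathbb{R}^\infty) = \mu^{(1)}(\mathbb{R}^\infty) = 1$. $\mu(\emptyset) = \mu^{(1)}(\emptyset) = 0$. Suppose $A_1, A_2 \in \mathcal{F}$ are disjoint. Pick n such that $A_1, A_2 \in \mathcal{F}_n$. Let $C_1, C_2 \in \mathcal{B}^n$ be such that $C_1^* = A_1$ and $C_2^* = A_2$. Then $C_1$ and $C_2$ are also disjoint. $A_1 \cup A_2 = (C_1 \cup C_2)^*$.
$$\mu(A_1 \cup A_2) = \mu^{(n)}(A_1 \cup A_2) = (\mu_1 \times \ldots \times \mu_n)(C_1 \cup C_2)$$
$$= (\mu_1 \times \ldots \times \mu_n)(C_1) + (\mu_1 \times \ldots \times \mu_n)(C_2)$$
$$= \mu^{(n)}(A_1) + \mu^{(n)}(A_2) = \mu(A_1) + \mu(A_2)$$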
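Is $\mu$ countably additive? If yes, we can use the Caratheodory extension theorem to extend $\mu$ to $\sigma(\mathcal{F})$. We will show that $\sigma(\mathcal{F}) = \mathcal{B}^\infty$: Consider an infinite-dimensional box, $\prod_{n \in \mathbb{N}} B_n$, one of the generators for $\mathcal{B}^\infty$:
$$\prod_{n \in \mathbb{N}} B_n = \bigcap_{N \in \mathbb{N}} \left( \prod_{n=1}^N B_n \right)^*$$
$\left( \prod_{n=1}^N B_n \right)^* \in \mathcal{F}_N \subseteq \mathcal{F}$, and so $\prod_{n \in \mathbb{N}} B_n \in \sigma(\mathcal{F})$. Since $\sigma(\mathcal{F})$ is a $\sigma$-field, $\mathcal{B}^\infty \subseteq \sigma(\mathcal{F})$. To establish the reverse inclusion, we will show $\mathcal{F}_n \subseteq \mathcal{B}^\infty$, for each n. For a fixed $n \in \mathbb{N}$, let
$$\mathcal{C} = \{Q : Q \in \mathcal{B}^n, Q^* \in \mathcal{B}^\infty\}$$
For $B_1, \ldots, B_n \in \mathcal{B}$, $B_1 \times \ldots \times B_n \in \mathcal{B}^n$ and $(B_1 \times \ldots \times B_n)^*$ is an infinite-dimensional box and so belongs to $\mathcal{B}^\infty$. $B_1 \times \ldots \times B_n \in \mathcal{C}$. $\mathcal{C}$ is a $\sigma$-field because
- $\mathcal{B}^n$ is a $\sigma$-field.
- $\mathcal{B}^\infty$ is a $\sigma$-field.
- $*$ is well-behaved: $\emptyset^* = \emptyset$; $(\mathbb{R}^n \setminus Q)^* = \mathbb{R}^\infty \setminus Q^*$; $\left( \bigcup_{Q \in \mathcal{Q}} Q \right)^* = \bigcup_{Q \in \mathcal{Q}} Q^*$.
$\mathcal{C} \subseteq \mathcal{B}^n$ is a $\sigma$-field and contains rectangles (which generate $\mathcal{B}^n$), and so equals $\mathcal{B}^n$. For all $Q \in \mathcal{B}^n$, $Q^* \in \mathcal{B}^\infty$. This says that $\mathcal{F}_n \subseteq \mathcal{B}^\infty$. Since this holds for all n, $\mathcal{F} \subseteq \mathcal{B}^\infty$. Since $\mathcal{B}^\infty$ is a $\sigma$-field, $\sigma(\mathcal{F}) \subseteq \mathcal{B}^\infty$. We have shown that $\sigma(\mathcal{F}) = \mathcal{B}^\infty$.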
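If we extend $\mu$ to a measure on $\mathcal{B}^\infty$, will the condition we asked for hold? Yes! Suppose we have $B_n \in \mathcal{B}$ for each $n \in \mathbb{N}$. Let $C_n = \prod_{i=1}^n B_i$. $\prod_{n \in \mathbb{N}} B_n$ is the decreasing intersection of $\{C_n^*\}_{n=1}^\infty$, and hence
$$\mu\left( \prod_{n \in \mathbb{N}} B_n \right) = \lim_{N \to \infty} \mu(C_N^*) = \lim_{N \to \infty} \mu^{(N)}(C_N^*) = \lim_{N \to \infty} \prod_{n=1}^N \mu_n(B_n) = \prod_{n \in \mathbb{N}} \mu_n(B_n)$$
What remains is to show that $\mu$ is countably additive on $\mathcal{F}$.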

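Lemma. Suppose $\mathcal{F}$ is a field on a nonempty set $\Omega$, and $P : \mathcal{F} \to [0, 1]$ is a finitely additive probability. Suppose whenever $\{B_n\}_{n=1}^\infty$ is a sequence of sets from $\mathcal{F}$ decreasing to the empty set, it is the case that $\{P(B_n)\}_{n=1}^\infty$ decreases to 0. Then P is countably additive.
Proof. Suppose $\{C_n\}_{n=1}^\infty$ is a sequence of disjoint sets in $\mathcal{F}$ such that $\bigcup_{i \in \mathbb{N}} C_i \in \mathcal{F}$. For any $n \in \mathbb{N}$, we have
$$P\left( \bigcup_{i=1}^\infty C_i \right) = \sum_{i=1}^n P(C_i) + P\left( \bigcup_{i=n+1}^\infty C_i \right)$$
Note that we are only using finite additivity here. $\bigcup_{i=n+1}^\infty C_i$ belongs to $\mathcal{F}$ because
$$\bigcup_{i=n+1}^\infty C_i = \left( \bigcup_{i=1}^\infty C_i \right) \setminus \left( \bigcup_{i=1}^n C_i \right)$$
As $n \to \infty$, $\bigcup_{i=n+1}^\infty C_i$ decreases to the empty set (because any $\omega$ belongs to at most one $C_i$). By hypothesis, the measures go to 0. So
$$P\left( \bigcup_{i=1}^\infty C_i \right) = \lim_{n \to \infty} \sum_{i=1}^n P(C_i) = \sum_{i=1}^\infty P(C_i)$$
We will use this to show that $\mu$ is countably additive on $\mathcal{F}$ in the next class.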
Infinite products (contd)

We will show that the $\mu$ as defined in the previous lecture is countably additive on $\mathcal{F}$. We need some results from analysis.
Any bounded sequence of real numbers has a convergent subsequence. If we have two bounded sequences of real numbers, there is a common convergent subsequence, that is, a strictly increasing sequence of natural numbers $\{n_i\}_{i=1}^\infty$ along which both sequences converge: just take a convergent subsequence $t_1$ of the first sequence $s_1$, and restrict the second sequence $s_2$ to those indices (to get $t_2$), and choose a convergent subsequence of $t_2$, say $u_2$. The same indices give a subsequence of $t_1$, which converges because $t_1$ does. This argument can be extended to finitely many sequences. What about countably many sequences? This argument does not generalise directly, but the result is true:
Lemma. Suppose for each $k \in \mathbb{N}$, $\{x^k_n\}_{n=1}^\infty$ is a bounded sequence of real numbers. There exists a strictly increasing sequence of natural numbers $\{n_i\}_{i=1}^\infty$ such that for all $k \in \mathbb{N}$, $\{x^k_{n_i}\}_{i=1}^\infty$ converges.
Proof. We use a diagonal argument.
For an infinite subset $S \subseteq \mathbb{N}$, say that a sequence of real numbers converges along S if the subsequence obtained by restricting the indices to S converges. Inductively define $S_1 \supseteq S_2 \supseteq S_3 \supseteq \ldots$, infinite subsets of $\mathbb{N}$, as follows: let $S_1$ be such that $\{x^1_n\}_{n=1}^\infty$ converges along $S_1$. Having chosen $S_1, \ldots, S_{k-1}$, consider the sequence $\{x^k_n\}_{n=1}^\infty$ with indices restricted to $S_{k-1}$: this is a bounded sequence, and so has a convergent subsequence. This gives an $S_k \subseteq S_{k-1}$ such that $\{x^k_n\}_{n=1}^\infty$ converges along $S_k$.
For an infinite subset $T \subseteq \mathbb{N}$ and $q \in \mathbb{N}$, let $\#q(T)$ denote the qth smallest element of T. Let
$$S = \{\#k(S_k) : k \in \mathbb{N}\}$$
Since $S_{k+1} \subseteq S_k$, $\#(k+1)(S_{k+1}) > \#k(S_{k+1}) \ge \#k(S_k)$. So S is infinite. Each $\{x^k_n\}_{n=1}^\infty$ converges along S, since all elements of S except possibly the first k - 1 elements are also elements of $S_k$.
We know that if $\{K_n\}_{n=1}^\infty$ is a sequence of compact subsets of $\mathbb{R}^m$ such that the intersection of any finitely many of them is nonempty, then $\bigcap_{n \in \mathbb{N}} K_n$ is nonempty. We generalise this:
Lemma. Suppose $\{K_n\}_{n=1}^\infty$ is a sequence such that $K_n$ is a compact subset of $\mathbb{R}^{l_n}$, for some $l_n$. Suppose for all n,
$$\bigcap_{i=1}^n K_i^* \ne \emptyset$$
Then
$$\bigcap_{i \in \mathbb{N}} K_i^* \ne \emptyset$$
Proof. Here different $K_n$ may have different dimensions (the $l_n$), but we're pushing them all into $\mathbb{R}^\infty$ by applying $*$.
For each n, choose $x^n \in \bigcap_{i=1}^n K_i^*$. This $x^n$ has several components $x^n_k$; for $k > \max\{l_1, \ldots, l_n\}$, $x^n_k$ can always be taken to be 0, so assume this is the case.
For all $k \in \mathbb{N}$, we will show that $\{x^n_k\}_{n=1}^\infty$ is bounded:
- Suppose every $l_n$ is less than k. Then $\{x^n_k\}_{n=1}^\infty$ has only zeroes, and so is bounded.
- Suppose some $l_{n_0} \ge k$. Let Z be the projection of $K_{n_0}$ onto its kth component ($K_{n_0} \subseteq \mathbb{R}^{l_{n_0}}$, so this is well-defined). Z is a compact subset of $\mathbb{R}$, and so is bounded. For all $n \ge n_0$, $x^n \in K_{n_0}^*$, and $x^n_k \in Z$. So $\{x^n_k\}_{n=1}^\infty$ is bounded.
Using an earlier lemma, let $S \subseteq \mathbb{N}$ be infinite such that $\{x^n_k\}_{n=1}^\infty$ converges along S, for every k. Let the limit be $x_k$, and consider the point $x \in \mathbb{R}^\infty$ whose kth coordinate is $x_k$, for all k. We will show that
$$x \in \bigcap_{i \in \mathbb{N}} K_i^*$$
To show that $x \in K_i^*$, consider $x^n$ for $n \ge i$ and $n \in S$. The first $l_i$ components of each of these, as a tuple, is an element of the compact set $K_i$, and these converge to the first $l_i$ components of x (along S), so $x \in K_i^*$.
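We now show that $\mu$ is countably additive on the field $\mathcal{F}$.
As shown earlier, it suffices to show that if a decreasing sequence of sets decreases to the empty set, then their $\mu$ values decrease to 0. We show the contrapositive: Suppose $B_1^* \supseteq B_2^* \supseteq B_3^* \supseteq \ldots$ is a sequence of sets in $\mathcal{F}$ whose $\mu$ values do not decrease to 0. $\mu$ being finitely additive implies that $\mu$ is monotone, so $\{\mu(B_n^*)\}_{n=1}^\infty$ decreases, say to $\epsilon$, with $\epsilon > 0$. We need to show that $\bigcap_{n \in \mathbb{N}} B_n^* \ne \emptyset$.
For each n, let $l_n$ be such that $B_n \subseteq \mathbb{R}^{l_n}$. Using regularity (Littlewood's principle), choose $L_n \subseteq B_n$ compact such that
$$(\mu_1 \times \ldots \times \mu_{l_n})(B_n \setminus L_n) < \frac{\epsilon}{2^{n+1}}$$
This is possible as the measure is finite (in fact a probability measure).
$$B_n^* \setminus \bigcap_{k=1}^n L_k^* \subseteq \bigcup_{k=1}^n (B_k^* \setminus L_k^*)$$
To see that this is true, take x belonging to the LHS. $x \in B_n^*$, and for some k, $1 \le k \le n$, $x \notin L_k^*$. $B_k^* \supseteq B_n^*$, so $x \in B_k^*$ and $x \in B_k^* \setminus L_k^* \subseteq$ RHS. So we have
$$\mu\left( B_n^* \setminus \bigcap_{k=1}^n L_k^* \right) \le \sum_{k=1}^n \mu(B_k^* \setminus L_k^*) \le \sum_{k=1}^n \frac{\epsilon}{2^{k+1}} \le \frac{\epsilon}{2}$$
This only uses $\mathcal{F}$ being a field and $\mu$ being finitely additive. $\mu(B_n^*) \ge \epsilon$, and so
$$\mu\left( \bigcap_{k=1}^n L_k^* \right) \ge \frac{\epsilon}{2} > 0$$
In particular, $\bigcap_{k=1}^n L_k^*$ is nonempty, for every n. We have shown that this implies $\bigcap_{k=1}^\infty L_k^*$ is nonempty. $\bigcap_{k=1}^\infty B_k^*$, being a superset, is also nonempty. $\mu$ is countably additive on $\mathcal{F}$. By the Caratheodory extension theorem, $\mu$ extends uniquely to a probability measure on $\sigma(\mathcal{F}) = \mathcal{B}^\infty$.
Although we looked at probability measures only on $(\mathbb{R}, \mathcal{B})$, the result is true for any sequence of probability spaces: we have the infinite product space, with $\mathcal{B}^\infty$, $\mathcal{F}_n$, $\mu^{(n)}$, etc as defined above. $\mu$ is countably additive on $\mathcal{F}$ and so after extending, on $\sigma(\mathcal{F})$. However, the proof of countable additivity needs more work in the general case: here it was simpler because we could use compactness.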
Haar measure
In the homework problems we have seen several examples of groups and translation-invariant
measures on them. We will now consider the existence of such measures in general.
Definition. A topological space X is said to be locally compact if for every $x \in X$ and U open such that $x \in U$, there exists V open such that $x \in V \subseteq \overline{V} \subseteq U$, with $\overline{V}$ compact.
Definition. A topological group is a group G with a topology on it, such that the group multiplication and group inverse are continuous functions from $G \times G$ to G and from G to G respectively.
We will restrict our attention to locally compact metrizable topological groups.
For a group G, $g \in G$, and $B \subseteq G$, we make the following definitions:
- $gB = \{gb : b \in B\}$.
- $Bg = \{bg : b \in B\}$.
- $B^{-1} = \{b^{-1} : b \in B\}$.
Definition. Suppose G is a topological group. A measure $\mu$ on its Borel $\sigma$-field is said to be a left-invariant Haar measure on G if
- For $K \subseteq G$ compact, $\mu(K) < \infty$.
- For $U \subseteq G$ open, if $U \ne \emptyset$ then $\mu(U) > 0$.
- For every Borel set $B \subseteq G$ and $g \in G$, $\mu(gB) = \mu(B)$.
Left-multiplication by g is a homeomorphism, so gB above is a Borel set if B is Borel. A right-invariant Haar measure is similar, with $\mu(gB)$ replaced by $\mu(Bg)$. The definition rules out trivial cases, like the zero measure, or the measure which is $\infty$ on all nonempty sets.
There is a natural one-to-one correspondence between left-invariant and right-invariant Haar measures on a given topological group: if $\mu$ is left-invariant (or right-invariant, respectively), then $\mu'$ defined by $\mu'(B) = \mu(B^{-1})$ is right-invariant (or left-invariant, respectively). This is so because for $a, b \in G$, $(ab)^{-1} = b^{-1} a^{-1}$.
We will show that every locally compact metrizable topological group has a left-invariant Haar measure.
Haar measure (contd)

Definition. Suppose G is a topological group. A measure $\mu$ on its Borel $\sigma$-field is said to be a left-invariant Haar measure on G if
- For $K \subseteq G$ compact, $\mu(K) < \infty$.
- For $U \subseteq G$ open, if $U \ne \emptyset$ then $\mu(U) > 0$.
- For every Borel set $B \subseteq G$ and $g \in G$, $\mu(gB) = \mu(B)$.
We will show that every locally compact metrizable topological group has a left-invariant Haar measure.
How would one construct such a $\mu$? Let us look at the topological group $(\mathbb{R}, +)$, for which we know a Haar measure - the usual Lebesgue measure $\lambda$. How can we describe the length of an interval [a, b] without subtracting a from b? In a general group, we do not have any notion of intervals; and even if we did manage to subtract, we will get only an element of the group, not a real number.

Here is an approach: consider translates of the interval $\left(-\frac{1}{2}, \frac{1}{2}\right)$. Some finitely many of them cover [a, b] (all translates of $\left(-\frac{1}{2}, \frac{1}{2}\right)$ certainly cover [a, b]; by compactness, some finitely many of them do). Let $n_1$ be the minimum number of translates needed to cover [a, b]. Then it can be shown that
$$n_1 - 1 \le \lambda([a, b]) \le n_1$$
Let $n_2$ be the minimum number of translates of $\left(-\frac{1}{4}, \frac{1}{4}\right)$ needed to cover [a, b]. Then it can be shown that
$$\frac{n_2 - 1}{2} \le \lambda([a, b]) \le \frac{n_2}{2}$$
We may continue this way to get better and better approximations, but the problem here is that it uses the measures of the sets $\left(-\frac{1}{2}, \frac{1}{2}\right)$, $\left(-\frac{1}{4}, \frac{1}{4}\right)$, etc. In general we have to define $\mu$ for all sets, nothing is given to us.
To work around this, choose a reference interval, say [0, 1] (the exact choice doesn't matter). For every $k \in \mathbb{N}$, let $n_k$ be the minimum number of translates of $\left(-\frac{1}{2^k}, \frac{1}{2^k}\right)$ needed to cover [a, b], and let $m_k$ be the minimum number needed to cover [0, 1]. Then the measure of [a, b] is roughly $n_k / m_k$. It can be shown that as k goes to infinity, $n_k / m_k$ converges to $\lambda([a, b])$. If we had chosen some other reference interval, we would have got a scalar multiple of the Lebesgue measure. This is not a problem - after all, a scalar multiple (by a positive real number) of a left-invariant Haar measure is also a left-invariant Haar measure.
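The claim that the ratio of covering numbers converges to the length can be checked by hand. The following is a sketch: the formula for the covering number is derived here (it is not stated in the notes) from the observation that n open intervals of length l cover a compact interval of length L exactly when n l > L, and the interval [0, 3] is an example chosen for illustration.

```latex
% Covering a compact interval of length L by translates of (-2^{-k}, 2^{-k}),
% each of length 2^{1-k}: n translates suffice iff n \cdot 2^{1-k} > L. Hence
\[
n_k = \left\lfloor 2^{k-1}(b-a) \right\rfloor + 1,
\qquad
m_k = 2^{k-1} + 1 .
\]
% Assumed example: [a,b] = [0,3].
\[
\frac{n_k}{m_k} = \frac{3\cdot 2^{k-1} + 1}{2^{k-1} + 1}:
\qquad
\frac{4}{2},\ \frac{7}{3},\ \frac{13}{5},\ \frac{25}{9},\ \ldots \longrightarrow 3 = \lambda([0,3]).
\]
% Consistency with the bounds in the text for k = 1: n_1 - 1 = 3 \le 3 \le 4 = n_1.
```

The ratio never uses the length of the small intervals, only counts of translates, which is what makes the idea transferable to a general group.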
We can try to extend this to a general metrizable locally compact topological group. We can get a decreasing sequence of nonempty open sets $\{I_k\}_{k=1}^\infty$ decreasing to the identity, and a reference nonempty open set $U_0$, such that the closures of all these open sets are compact. As before, for any open set U with compact closure, we look at $n_k / m_k$, where $n_k$ and $m_k$ are the minimum number of translates of $I_k$ needed to cover U and $U_0$ respectively. We can show that $\{n_k / m_k\}_{k=1}^\infty$ is a bounded sequence. But will it converge? It need not. To deal with this, we generalise our notion of convergence.
Ultrafilters and convergence

Definition. $\mathcal{U} \subseteq 2^{\mathbb{N}}$ is said to be an ultrafilter if
1. For all $i \in \mathbb{N}$, $\mathbb{N} \setminus \{i\} \in \mathcal{U}$.
2. If $A \in \mathcal{U}$, then every superset of A is also in $\mathcal{U}$.
3. $A, B \in \mathcal{U} \implies A \cap B \in \mathcal{U}$.
4. For all $A \subseteq \mathbb{N}$, exactly one of A and $\mathbb{N} \setminus A$ belongs to $\mathcal{U}$.
If $\mathcal{U}$ is an ultrafilter, then it contains all co-finite sets (because it contains complements of singletons, and is closed under finite intersections). By 4, it does not contain any finite sets. Every element of $\mathcal{U}$ is infinite, and in particular $\emptyset \notin \mathcal{U}$. But does such a $\mathcal{U}$ exist at all?
Theorem. There exists $\mathcal{U} \subseteq 2^{\mathbb{N}}$ which is an ultrafilter.
The proof depends on the axiom of choice and is non-constructive. It is hard to explicitly describe an ultrafilter. Fortunately for our purposes it does not matter which ultrafilter we use, so we fix some ultrafilter $\mathcal{U}$ arbitrarily. Think of the sets in $\mathcal{U}$ as being "large" subsets of $\mathbb{N}$, and the ones not in $\mathcal{U}$ as being "small". The first three conditions in the definition say that all co-finite sets are large, every superset of a large set is large, and that the intersection of two large sets is large. The last condition says that the complement of a large set is small, and vice versa.
How does an ultrafilter help us?
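Lemma. Suppose $\{x_n\}_{n=1}^\infty$ is a bounded sequence of real numbers. Then
$$\bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$$
has exactly one element.

Proof. Since the given sequence is bounded, each set $\overline{\{x_n : n \in U\}}$ is compact. These sets have the finite intersection property (the intersection of any finitely many of them is nonempty): If $U_1, \ldots, U_k \in \mathcal{U}$, we know that their intersection belongs to $\mathcal{U}$ and so is nonempty - let p be an element in $U_1 \cap \ldots \cap U_k$. Then
$$x_p \in \bigcap_{i=1}^k \{x_n : n \in U_i\} \subseteq \bigcap_{i=1}^k \overline{\{x_n : n \in U_i\}}$$
Since this collection of compact sets has the finite intersection property, the grand intersection is nonempty:
$$\bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}} \ne \emptyset$$
We now show that this intersection has exactly one element: Let $\alpha < \beta$ be any two real numbers. Let
$$S = \left\{ n : x_n > \frac{\alpha + \beta}{2} \right\}, \qquad \mathbb{N} \setminus S = \left\{ n : x_n \le \frac{\alpha + \beta}{2} \right\}$$
Exactly one of S and $\mathbb{N} \setminus S$ belongs to $\mathcal{U}$. If $S \in \mathcal{U}$, then $\alpha \notin \overline{\{x_n : n \in S\}}$, otherwise $\beta \notin \overline{\{x_n : n \in \mathbb{N} \setminus S\}}$. In either case, both $\alpha$ and $\beta$ cannot belong to $\bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$.
For a bounded sequence $\{x_n\}_{n=1}^\infty$ of reals, let x be the unique element of $\bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$. We say that the sequence converges along the ultrafilter $\mathcal{U}$ to x. How does this compare with the usual notion of convergence? Let $\mathcal{F} \subseteq 2^{\mathbb{N}}$ consist of all co-finite subsets of $\mathbb{N}$. $\mathcal{F}$ satisfies the first three conditions in the definition of an ultrafilter, and $\mathcal{F} \subseteq \mathcal{U}$.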
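Lemma. Suppose $\{x_n\}_{n=1}^\infty$ is a bounded sequence of reals, and x a real number.
- $\{x_n\}_{n=1}^\infty$ converges to x in the usual analysis sense if and only if for every $\epsilon > 0$, there exists $A \in \mathcal{F}$ such that $\{x_n : n \in A\} \subseteq (x - \epsilon, x + \epsilon)$.
- $\{x_n\}_{n=1}^\infty$ converges to x along $\mathcal{U}$ if and only if for every $\epsilon > 0$, there exists $A \in \mathcal{U}$ such that $\{x_n : n \in A\} \subseteq (x - \epsilon, x + \epsilon)$.
Proof. The first part is easy. For the second part, assume $\{x_n\}_{n=1}^\infty$ converges to x along $\mathcal{U}$. Then
$$\{x\} = \bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$$
Let $\epsilon > 0$ be given. Let
$$A = \{n : x_n \in (x - \epsilon, x + \epsilon)\}$$
Exactly one of A and $\mathbb{N} \setminus A$ belongs to $\mathcal{U}$. $\mathbb{N} \setminus A$ cannot belong to $\mathcal{U}$, because
$$\overline{\{x_n : n \in \mathbb{N} \setminus A\}} \subseteq \mathbb{R} \setminus (x - \epsilon, x + \epsilon) \subseteq \mathbb{R} \setminus \{x\}$$
So $A \in \mathcal{U}$, and $\{x_n : n \in A\} \subseteq (x - \epsilon, x + \epsilon)$.
Conversely, suppose for every $\epsilon > 0$, there exists $A \in \mathcal{U}$ such that $\{x_n : n \in A\} \subseteq (x - \epsilon, x + \epsilon)$. Let $U \in \mathcal{U}$; we have to show that $x \in \overline{\{x_n : n \in U\}}$. This is the same as showing that every neighbourhood of x intersects $\{x_n : n \in U\}$. Let $\epsilon > 0$ be given. Let $A \in \mathcal{U}$ be such that $\{x_n : n \in A\} \subseteq (x - \epsilon, x + \epsilon)$. $A \cap U \in \mathcal{U}$, so it is nonempty. For any $m \in A \cap U$, $x_m \in (x - \epsilon, x + \epsilon) \cap \{x_n : n \in U\}$. Since this holds for every $\epsilon > 0$, $x \in \bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$. So
$$\{x\} = \bigcap_{U \in \mathcal{U}} \overline{\{x_n : n \in U\}}$$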
Note that $\mathcal{F} \subseteq \mathcal{U}$. So if a sequence converges to a number in the usual sense, it converges to the same number along $\mathcal{U}$. Convergence along $\mathcal{U}$ has many nice properties that the usual notion of convergence has. If $\{x_n\}_{n=1}^\infty$ converges to x along $\mathcal{U}$ and $\{y_n\}_{n=1}^\infty$ converges to y along $\mathcal{U}$, then $\{x_n + y_n\}_{n=1}^\infty$ converges to x + y and $\{x_n y_n\}_{n=1}^\infty$ converges to xy along $\mathcal{U}$. If $x_n \le y_n$ for all n, then $x \le y$. We will use these properties in the next class.
Consider the sequence (0, 1, 0, 1, 0, 1, . . .). This does not converge in the usual sense, but an ultrafilter $\mathcal{U}$ forces it to converge: if the set of even numbers belongs to $\mathcal{U}$, it converges to 1; otherwise the set of odd numbers belongs to $\mathcal{U}$, and it converges to 0.
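The additivity property can be checked against the alternating example. This is a sketch: the companion sequence y below is introduced here for illustration and is not in the notes.

```latex
% Assumed example: x = (0,1,0,1,\ldots) and y = (1,0,1,0,\ldots), indexed from n = 1,
% so x_n + y_n = 1 for every n. Let E be the set of even numbers.
\[
E \in \mathcal{U}:\quad \lim_{\mathcal{U}} x_n = 1,\ \ \lim_{\mathcal{U}} y_n = 0;
\qquad
E \notin \mathcal{U}:\quad \lim_{\mathcal{U}} x_n = 0,\ \ \lim_{\mathcal{U}} y_n = 1 .
\]
% In either case
\[
\lim_{\mathcal{U}} x_n + \lim_{\mathcal{U}} y_n = 1 = \lim_{\mathcal{U}} (x_n + y_n),
\qquad
\lim_{\mathcal{U}} x_n \cdot \lim_{\mathcal{U}} y_n = 0 = \lim_{\mathcal{U}} (x_n y_n).
\]
```

The individual limits depend on which ultrafilter was fixed, but the algebraic relations between limits hold for every choice.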
We now show that an ultrafilter exists (this wasn't proved in class):
Theorem. An ultrafilter $\mathcal{U} \subseteq 2^{\mathbb{N}}$ exists.
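Proof. The idea is simple: start with $\mathcal{F}$ and keep adding sets to it till it is saturated.
We call a subset of $2^{\mathbb{N}}$ which satisfies the first three conditions in the definition of an ultrafilter and which does not have the empty set as an element a filter. In other words, $\mathcal{S} \subseteq 2^{\mathbb{N}}$ is a filter if
1. For all $i \in \mathbb{N}$, $\mathbb{N} \setminus \{i\} \in \mathcal{S}$.
2. If $A \in \mathcal{S}$, then every superset of A is also in $\mathcal{S}$.
3. $A, B \in \mathcal{S} \implies A \cap B \in \mathcal{S}$.
4. $\emptyset \notin \mathcal{S}$.
The condition $\emptyset \notin \mathcal{S}$ only rules out $\mathcal{S}$ being equal to $2^{\mathbb{N}}$. $\mathcal{F}$, the collection of co-finite subsets of $\mathbb{N}$, is a filter. Let
$$\mathscr{A} = \{\mathcal{L} : \mathcal{L} \subseteq 2^{\mathbb{N}}, \mathcal{F} \subseteq \mathcal{L}, \mathcal{L} \text{ is a filter}\}$$
$\mathscr{A}$ consists of all possible ways of adding more sets to $\mathcal{F}$ while retaining the property of being a filter. Partially order $\mathscr{A}$ by inclusion. $\mathcal{F} \in \mathscr{A}$. We will use Zorn's Lemma. Every chain is bounded: If $\mathscr{B} \subseteq \mathscr{A}$ is a nonempty chain, then $\bigcup \mathscr{B}$ is a filter (this is easy to see using $\mathscr{B}$ being totally ordered) and $\mathcal{F} \subseteq \bigcup \mathscr{B}$. $\bigcup \mathscr{B} \in \mathscr{A}$ and is an upper bound for $\mathscr{B}$. Every chain is bounded, so by Zorn's lemma, $\mathscr{A}$ has a maximal element.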
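We will look at a maximal element, after we examine what happens when we try to extend a filter by throwing one more set in it: suppose $\mathcal{S}$ is a filter, and $B \subseteq \mathbb{N}$, $B \notin \mathcal{S}$. If B is disjoint from some set in $\mathcal{S}$, there is no hope of extending $\mathcal{S}$ to a filter which has B, because of properties 3 and 4 of a filter. So suppose this is not the case, that is, B intersects every element of $\mathcal{S}$. Let
$$\mathcal{S}' = \{A \cap C : A \in \mathcal{S}, B \subseteq C \subseteq \mathbb{N}\}$$
If at all we can extend $\mathcal{S}$ to a filter which has B, it must have at least all the sets in $\mathcal{S}'$. Clearly $\mathcal{S} \subseteq \mathcal{S}'$, since we can take $C = \mathbb{N}$. So $\mathcal{S}'$ has the complements of all singletons. Suppose $A \cap C \in \mathcal{S}'$, and D is a superset of it. $A \cup D \in \mathcal{S}$ and $C \cup D$ is a superset of C, so $(A \cup D) \cap (C \cup D) \in \mathcal{S}'$.
$$(A \cup D) \cap (C \cup D) = (A \cap C) \cup D = D$$
So $D \in \mathcal{S}'$. $\mathcal{S}'$ is closed under taking supersets. If $A_1 \cap C_1, A_2 \cap C_2 \in \mathcal{S}'$, then
$$(A_1 \cap C_1) \cap (A_2 \cap C_2) = (A_1 \cap A_2) \cap (C_1 \cap C_2) \in \mathcal{S}'$$
since $A_1 \cap A_2 \in \mathcal{S}$ and $C_1 \cap C_2$ is a superset of B. $\mathcal{S}'$ is closed under finite intersections. Since we have assumed B intersects every element of $\mathcal{S}$, we have that $\emptyset \notin \mathcal{S}'$. $\mathcal{S}'$ is a filter, $\mathcal{S} \subseteq \mathcal{S}'$, and $B \in \mathcal{S}'$ (because $B = \mathbb{N} \cap B$). We have shown the following:
Suppose $\mathcal{S}$ is a filter, and $B \subseteq \mathbb{N}$ with $B \notin \mathcal{S}$. There exists a filter $\mathcal{S}'$ with $\mathcal{S} \subseteq \mathcal{S}'$ and $B \in \mathcal{S}'$ if and only if B intersects every element of $\mathcal{S}$.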
Now we go back to $\mathscr{A}$. Let $\mathcal{U}$ be a maximal element of $\mathscr{A}$. It is a filter, so it satisfies the first three conditions of being an ultrafilter. We want to show that for every $A \subseteq \mathbb{N}$, exactly one of A and $\mathbb{N} \setminus A$ belongs to $\mathcal{U}$. Assume the contrary: suppose neither of A and $\mathbb{N} \setminus A$ belongs to $\mathcal{U}$ (of course both cannot belong). $\mathcal{U}$ is maximal, so it cannot be extended using A. As we have shown, this means there is a $D_1 \in \mathcal{U}$ such that A is disjoint from $D_1$. Similarly, $\mathcal{U}$ cannot be extended by $\mathbb{N} \setminus A$, so there is a $D_2 \in \mathcal{U}$ such that $\mathbb{N} \setminus A$ is disjoint from $D_2$. $D_1 \subseteq \mathbb{N} \setminus A$ and $D_2 \subseteq A$, so $D_1 \cap D_2 = \emptyset$. $\mathcal{U}$ cannot have two disjoint sets, so this is a contradiction. $\mathcal{U}$ is an ultrafilter.
Construction of Haar measure

Let G be a metrizable locally compact topological group. We will construct a left-invariant Haar measure on it. Let $\mathcal{U}_0$ be the set of nonempty open subsets of G whose closure is compact. Since G is locally compact, there are lots of sets in $\mathcal{U}_0$: every $x \in G$ has a neighbourhood whose closure is compact.

For $U, V \in \mathcal{U}_0$, define $\left[\frac{U}{V}\right]$ as the least number of left-translates of V needed to cover $\overline{U}$. A left-translate of V is any set of the form gV, for $g \in G$. From now on when we say translate we mean left-translate. All translates of V certainly cover $\overline{U}$. Each translate is open and $\overline{U}$ is compact, so some finitely many translates cover $\overline{U}$. So $\left[\frac{U}{V}\right]$ is well-defined.
Fix a metric d on G. Fix a $U_0 \in \mathcal{U}_0$. Fix a sequence of sets $\{B_n\}_{n=1}^\infty$ decreasing to $\{e\}$ (where e is the identity of the group) such that each $B_n \in \mathcal{U}_0$ and such that $\{\mathrm{diam}(B_n)\}_{n=1}^\infty$ decreases to 0 (this can be done easily using the metric d and the fact that G is locally compact). Fix an ultrafilter on $\mathbb{N}$, which shall remain anonymous.
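Step 1: Define $\lambda_n : \mathcal{U}_0 \to [0, \infty]_{\overline{\mathbb{R}}}$ and $\lambda : \mathcal{U}_0 \to [0, \infty]_{\overline{\mathbb{R}}}$ as follows:
\[ \lambda_n(U) = \left[\frac{U}{B_n}\right] \Big/ \left[\frac{U_0}{B_n}\right], \qquad \lambda(U) = \lim_{n} \lambda_n(U) \]
where the limit is along the chosen ultrafilter. We will show later that the sequence $\{\lambda_n(U)\}_{n=1}^{\infty}$
is bounded, so the limit is well-defined. Note that $G \in \mathcal{U}_0$ if and only if $G$ is compact. Define
$\mathcal{U} = \mathcal{U}_0 \cup \{\emptyset, G\}$ [1]. Extend $\lambda$ to $\mathcal{U}$ by setting $\lambda(\emptyset) = 0$, and if $G$ is not compact, $\lambda(G) = \infty$.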
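Step 2: We will define a $\mu^* : 2^G \to [0, \infty]_{\overline{\mathbb{R}}}$, analogous to the outer measure we had in the
proof of the Caratheodory extension theorem. For any $E \subseteq G$, we cover $E$ by at most countably
many sets from $\mathcal{U}$ (this is always possible, since $G \in \mathcal{U}$), and take the sum of their $\lambda$ values. The
infimum of this quantity over all possible covers is $\mu^*(E)$:
\[ \mu^*(E) = \inf\left\{ \sum_{S \in \mathcal{S}} \lambda(S) : \mathcal{S} \subseteq \mathcal{U},\ \mathcal{S} \text{ is at most countable},\ E \subseteq \bigcup_{S \in \mathcal{S}} S \right\} \]
It is $\mu^*$ that will ultimately give us the left-invariant Haar measure we are looking for, not $\lambda$. It is
possible that for some $U \in \mathcal{U}_0$, $\mu^*(U) \neq \lambda(U)$ [2].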
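The infimum over covers can be seen in a toy setting. The sketch below is an assumption-laden illustration, not part of the construction: the family of sets is a small finite family of subsets of {0, 1, 2, 3}, and the dictionary `weights` holds made-up values standing in for $\lambda$. The function name `outer_measure` is hypothetical.

```python
from itertools import combinations

def outer_measure(E, weights):
    """Least total weight of a subfamily of `weights` whose union contains E.

    `weights` maps frozensets (the covering family) to nonnegative numbers.
    Returns infinity if no subfamily covers E.
    """
    E = frozenset(E)
    family = list(weights)
    best = float("inf")
    for k in range(len(family) + 1):
        for cover in combinations(family, k):
            if E <= frozenset().union(*cover):
                best = min(best, sum(weights[S] for S in cover))
    return best

# Made-up weights: the whole space is deliberately "overpriced".
weights = {
    frozenset(): 0,
    frozenset({0, 1}): 1,
    frozenset({2, 3}): 1,
    frozenset({0, 1, 2, 3}): 3,
}
whole = frozenset({0, 1, 2, 3})
```

Here the whole space has weight 3 but can be covered by two pieces of weight 1 each, so its outer measure is 2. This is the same phenomenon as $\mu^*(U) \neq \lambda(U)$, in a setting where the weights are not required to be subadditive.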
Step 3: In the proof of the Caratheodory extension theorem, we took all those sets $E$ which split
the whole space correctly, that is, those sets $E$ for which $\mu^*(E) + \mu^*(G \setminus E) = 1$. This worked
because we were dealing with probability measures. Here we need not have a finite measure, so we
cannot use this approach directly. Instead, we take those sets $E$ which split every set correctly:
\[ \mathcal{A} = \{ E : E \subseteq G,\ \forall A \subseteq G,\ \mu^*(A) = \mu^*(A \cap E) + \mu^*(A \setminus E) \} \]
We will show later that $\mathcal{A}$ is a $\sigma$-field and includes all open sets (and hence includes the Borel
$\sigma$-field of $G$). We will show that $\mu^*$ is countably additive on $\mathcal{A}$.
Step 4: We will show that the measure $\mu^*$ on the Borel $\sigma$-field of $G$ has the properties we want:
it is finite on compact sets, positive on nonempty open sets, and left-invariant.
We will now examine the four steps more closely.
[1] Do not confuse this with the ultrafilter $\mathcal{U}$ from the previous class.
[2] This can happen with complicated sets like the complement of a Cantor-like set.
Step 1
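1. For all $U, V, W \in \mathcal{U}_0$,
\[ 1 \le \left[\frac{U}{V}\right] \le \left[\frac{U}{W}\right] \left[\frac{W}{V}\right] \]
Clearly $1 \le \left[\frac{U}{V}\right]$, since $U$ is nonempty. To prove the other inequality, let $k = \left[\frac{W}{V}\right]$ and $l = \left[\frac{U}{W}\right]$.
Let $g_1, \ldots, g_k$ be such that $g_1 V, \ldots, g_k V$ cover $\overline{W}$. Let $h_1, \ldots, h_l$ be such that $h_1 W, \ldots, h_l W$ cover
$\overline{U}$. We will show that $\{h_i g_j V : 1 \le i \le l,\ 1 \le j \le k\}$ cover $\overline{U}$: let $x \in \overline{U}$. Pick $i$ such that $x \in h_i W$.
Then $h_i^{-1} x \in W \subseteq \overline{W}$, so pick $j$ such that $h_i^{-1} x \in g_j V$. Then $x \in h_i g_j V$. So $\left[\frac{U}{V}\right] \le kl$.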
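A finite group makes the covering numbers computable by brute force. The sketch below is an illustration only, under assumptions that are not in the notes: the group is the cyclic group Z_12 written additively (every subset is open and closed, so closures play no role), and the helper `covering_number` is a made-up name.

```python
from itertools import combinations

def covering_number(U, V, n):
    """Least number of translates g + V (mod n) whose union contains U."""
    U, V = frozenset(U), frozenset(V)
    if not U or not V:
        raise ValueError("U and V must be nonempty")
    # Distinct translates of V that meet U; the others never help.
    translates = {frozenset((g + v) % n for v in V) for g in range(n)}
    useful = [T for T in translates if T & U]
    # |U| translates always suffice, so the search below terminates.
    for k in range(1, len(U) + 1):
        for choice in combinations(useful, k):
            if U <= frozenset().union(*choice):
                return k

n = 12
U = range(6)          # {0, ..., 5}
W = {0, 1, 2}
V = {0, 1}
k = covering_number(W, V, n)    # plays the role of [W/V]
l = covering_number(U, W, n)    # plays the role of [U/W]
uv = covering_number(U, V, n)   # plays the role of [U/V]
shifted = covering_number([(u + 5) % n for u in U], V, n)  # translate of U by g = 5
```

In this example the three covering numbers are 2, 2 and 3, so the chain $1 \le 3 \le 2 \cdot 2$ holds, and translating $U$ leaves the count unchanged.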
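2. For all $n \in \mathbb{N}$, for all $U \in \mathcal{U}_0$,
\[ 1 \Big/ \left[\frac{U_0}{U}\right] \le \lambda_n(U) \le \left[\frac{U}{U_0}\right] \]
From the previous inequality, we have $\left[\frac{U}{B_n}\right] \le \left[\frac{U}{U_0}\right] \left[\frac{U_0}{B_n}\right]$. This gives $\lambda_n(U) \le \left[\frac{U}{U_0}\right]$. Similarly,
$\left[\frac{U_0}{B_n}\right] \le \left[\frac{U_0}{U}\right] \left[\frac{U}{B_n}\right]$. This gives $1 \big/ \left[\frac{U_0}{U}\right] \le \lambda_n(U)$. The sequence $\{\lambda_n(U)\}_{n=1}^{\infty}$ is bounded, so $\lambda(U)$ is
well-defined.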
3. For all $U \in \mathcal{U}_0$,
\[ 1 \Big/ \left[\frac{U_0}{U}\right] \le \lambda(U) \le \left[\frac{U}{U_0}\right] \]
This follows from the above, and that the limit along an ultrafilter of a bounded sequence also lies
within the bounds of the sequence.
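4. For all $U_1, U_2 \in \mathcal{U}$, $\lambda(U_1 \cup U_2) \le \lambda(U_1) + \lambda(U_2)$. If any of $U_1$ and $U_2$ is $\emptyset$, then the inequality holds
trivially. If any of them is $G$, then $U_1 \cup U_2$ is $G$, and the inequality holds trivially. Otherwise,
$U_1, U_2 \in \mathcal{U}_0$ and so $U_1 \cup U_2 \in \mathcal{U}_0$. It is easy to see that $\left[\frac{U_1 \cup U_2}{B_n}\right] \le \left[\frac{U_1}{B_n}\right] + \left[\frac{U_2}{B_n}\right]$. Dividing both sides
by $\left[\frac{U_0}{B_n}\right]$ gives $\lambda_n(U_1 \cup U_2) \le \lambda_n(U_1) + \lambda_n(U_2)$. Using the observations made about limits along an
ultrafilter earlier, $\lambda(U_1 \cup U_2) \le \lambda(U_1) + \lambda(U_2)$.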
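5. For $U_1, U_2 \in \mathcal{U}$, if $d(U_1, U_2) > 0$, then $\lambda(U_1 \cup U_2) = \lambda(U_1) + \lambda(U_2)$. By $d(U_1, U_2)$ we mean
$\inf_{x \in U_1,\, y \in U_2} d(x, y)$. We will prove this later.
6. For all $g \in G$ and $U \in \mathcal{U}$, $\lambda(gU) = \lambda(U)$. This holds trivially if $U = \emptyset$ or $U = G$; otherwise
$\left[\frac{gU}{B_n}\right] = \left[\frac{U}{B_n}\right]$, since translates of $B_n$ covering $\overline{U}$ can be left-multiplied by $g$ to cover $g\overline{U} = \overline{gU}$, and
similarly those covering $\overline{gU}$ can be left-multiplied by $g^{-1}$ to cover $\overline{U}$. So $\lambda_n(gU) = \lambda_n(U)$. This
holds for all $n$, so $\lambda(gU) = \lambda(U)$.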
Step 2
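7. $\mu^*(\emptyset) = 0$, $\mu^*$ is monotonically increasing, and $\mu^*$ is countably subadditive (these three properties
say that $\mu^*$ is an outer measure). $\mu^*(\emptyset)$ is clearly 0, since $\{\emptyset\}$ covers $\emptyset$. If $E_1 \subseteq E_2$, then any collection
of sets that covers $E_2$ also covers $E_1$, so $\mu^*(E_1)$ is an infimum over a larger set than $\mu^*(E_2)$ is. So
$\mu^*(E_1) \le \mu^*(E_2)$. Countable subadditivity says that for $\{E_n : n \in \mathbb{N}\}$ a countable collection of
subsets of $G$,
\[ \mu^*\left( \bigcup_{n \in \mathbb{N}} E_n \right) \le \sum_{n \in \mathbb{N}} \mu^*(E_n) \]
To prove this, recall the definition of $\mu^*$:
\[ \mu^*(E) = \inf\left\{ \sum_{S \in \mathcal{S}} \lambda(S) : \mathcal{S} \subseteq \mathcal{U},\ \mathcal{S} \text{ is at most countable},\ E \subseteq \bigcup_{S \in \mathcal{S}} S \right\} \]
Let $\varepsilon > 0$ be fixed. Each $\mu^*(E_n)$ is the infimum of some set; choose something in that set which is
at most $\varepsilon / 2^n$ away from the infimum. In other words, for each $n$, choose $\mathcal{S}_n$ such that $\mathcal{S}_n \subseteq \mathcal{U}$, $\mathcal{S}_n$
is at most countable, $E_n \subseteq \bigcup_{S \in \mathcal{S}_n} S$, and $\sum_{S \in \mathcal{S}_n} \lambda(S) \le \mu^*(E_n) + \varepsilon / 2^n$. Each $E_n$ is covered by $\mathcal{S}_n$; we
just put all these covers together to cover $E := \bigcup_{n \in \mathbb{N}} E_n$. Let $\mathcal{S} = \bigcup_{n \in \mathbb{N}} \mathcal{S}_n$. Then $\mathcal{S} \subseteq \mathcal{U}$, $\mathcal{S}$ is at
most countable, and $E \subseteq \bigcup_{S \in \mathcal{S}} S$.
\begin{align*}
\mu^*(E) &\le \sum_{S \in \mathcal{S}} \lambda(S) && \text{(by the last sentence above and definition of $\mu^*$)} \\
&\le \sum_{n \in \mathbb{N}} \sum_{S \in \mathcal{S}_n} \lambda(S) && \text{(nonnegative, can rearrange)} \\
&\le \sum_{n \in \mathbb{N}} \left( \mu^*(E_n) + \frac{\varepsilon}{2^n} \right) \\
&= \left( \sum_{n \in \mathbb{N}} \mu^*(E_n) \right) + \varepsilon
\end{align*}
Since this holds for all $\varepsilon > 0$, $\mu^*(E) \le \sum_{n \in \mathbb{N}} \mu^*(E_n)$.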
We have $\mu^*(gE) = \mu^*(E)$ for all $g \in G$ and $E \subseteq G$. This is because the corresponding property
holds for $\lambda$, and any cover of $E$ can be left-multiplied by $g$ to obtain a cover of $gE$, and vice versa,
any cover of $gE$ can be left-multiplied by $g^{-1}$ to obtain a cover of $E$.
8. For $E_1, E_2 \subseteq G$, if $d(E_1, E_2) > 0$, then $\mu^*(E_1 \cup E_2) = \mu^*(E_1) + \mu^*(E_2)$. We will prove this later.
Step 3
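Recall that
\[ \mathcal{A} = \{ E : E \subseteq G,\ \forall A \subseteq G,\ \mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap (G \setminus E)) \} \]
From the definition, it is immediate that if $E \in \mathcal{A}$, then $G \setminus E \in \mathcal{A}$. $\mathcal{A}$ is closed under
complements. Since $\mu^*(\emptyset) = 0$, for all $A$, $\mu^*(A) = \mu^*(A \cap \emptyset) + \mu^*(A \cap G)$. So $\emptyset, G \in \mathcal{A}$. Subadditivity
tells us that for any $A, E \subseteq G$,
\[ \mu^*(A) \le \mu^*(A \cap E) + \mu^*(A \setminus E) \]
So
\[ \mathcal{A} = \{ E : E \subseteq G,\ \forall A \subseteq G,\ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \setminus E) \} \]
To show that some set belongs to $\mathcal{A}$, we have to show that the required inequality holds for all
subsets $A$ of $G$. When we are using the fact that a certain set belongs to $\mathcal{A}$, we are free to apply
the given equality for any set $A$ of our choice; often making a clever choice helps.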
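We will first show that $\mathcal{A}$ is a field. Let $E_1, E_2 \in \mathcal{A}$. We will show that $E_1 \cup E_2 \in \mathcal{A}$. Let $A \subseteq G$
be arbitrary. Then
\begin{align*}
\mu^*(A) &= \mu^*(A \cap E_1) + \mu^*(A \setminus E_1) && \text{($E_1 \in \mathcal{A}$, use with $A$)} \\
&= \mu^*(A \cap E_1) + \mu^*((A \setminus E_1) \cap E_2) + \mu^*((A \setminus E_1) \setminus E_2) && \text{($E_2 \in \mathcal{A}$, use with $A \setminus E_1$)} \\
&\ge \mu^*(A \cap (E_1 \cup E_2)) + \mu^*(A \setminus (E_1 \cup E_2)) && \text{(subadditivity)}
\end{align*}
Note in the above that the union of $A \cap E_1$ and $(A \setminus E_1) \cap E_2$ is $A \cap (E_1 \cup E_2)$. So $E_1 \cup E_2 \in \mathcal{A}$.

$\mu^*$ is finitely additive on $\mathcal{A}$: let $E_1, E_2 \in \mathcal{A}$ be disjoint. Using that $E_1 \in \mathcal{A}$ and $A = E_1 \cup E_2$, we
get
\[ \mu^*(E_1 \cup E_2) = \mu^*((E_1 \cup E_2) \cap E_1) + \mu^*((E_1 \cup E_2) \setminus E_1) = \mu^*(E_1) + \mu^*(E_2) \]
We will continue with the proof in the next class.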
Construction of Haar measure (contd)

We were analysing step 3 of the construction of the Haar measure.
Step 3 (contd)
We had shown that $\mathcal{A}$ is a field, and that $\mu^*$ is finitely additive on $\mathcal{A}$. We will now show that $\mathcal{A}$
is a $\sigma$-field and $\mu^*$ is countably additive. Let $\{E_n : n \in \mathbb{N}\}$ be a collection of sets in $\mathcal{A}$. We can
disjointify them as usual, and the resulting sets will still be in $\mathcal{A}$ (since it is a field), and their
union will not change. Hence without loss of generality we may assume that $\{E_n : n \in \mathbb{N}\}$ is a
collection of disjoint sets. Let $E = \bigcup_{n \in \mathbb{N}} E_n$. Fix $A \subseteq G$. We need to show that
\[ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \setminus E) \]
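For all $n$, $\bigcup_{i=1}^{n} E_i \in \mathcal{A}$, so
\begin{align*}
\mu^*(A) &= \mu^*\Big( A \cap \bigcup_{i=1}^{n} E_i \Big) + \mu^*\Big( A \setminus \bigcup_{i=1}^{n} E_i \Big) \\
&\ge \mu^*\Big( A \cap \bigcup_{i=1}^{n} E_i \Big) + \mu^*(A \setminus E) && \text{(monotonicity of $\mu^*$)} \\
&= \Big[ \mu^*\Big( \Big( A \cap \bigcup_{i=1}^{n} E_i \Big) \cap E_1 \Big) + \mu^*\Big( \Big( A \cap \bigcup_{i=1}^{n} E_i \Big) \setminus E_1 \Big) \Big] + \mu^*(A \setminus E) && \text{($E_1 \in \mathcal{A}$)} \\
&= \mu^*(A \cap E_1) + \mu^*\Big( A \cap \bigcup_{i=2}^{n} E_i \Big) + \mu^*(A \setminus E) \\
&= \Big[ \sum_{i=1}^{n} \mu^*(A \cap E_i) \Big] + \mu^*(A \setminus E) && \text{(repeating the above)}
\end{align*}
Since this holds for all $n \in \mathbb{N}$,
\begin{align*}
\mu^*(A) &\ge \Big[ \sum_{i=1}^{\infty} \mu^*(A \cap E_i) \Big] + \mu^*(A \setminus E) \\
&\ge \mu^*\Big( \bigcup_{i=1}^{\infty} (A \cap E_i) \Big) + \mu^*(A \setminus E) && \text{(subadditivity)} \\
&= \mu^*(A \cap E) + \mu^*(A \setminus E)
\end{align*}
This holds for all $A \subseteq G$, so $E = \bigcup_{i=1}^{\infty} E_i \in \mathcal{A}$. We had shown $\mathcal{A}$ is a field; we now have that $\mathcal{A}$
is a $\sigma$-field.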
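From subadditivity we know $\mu^*(E) \le \sum_{i \in \mathbb{N}} \mu^*(E_i)$. To establish countable additivity, we need to
show the reverse inequality. Using that $\bigcup_{i=1}^{n} E_i \in \mathcal{A}$ and taking $A = E$, we get
\[ \mu^*(E) = \mu^*\Big( E \cap \bigcup_{i=1}^{n} E_i \Big) + \mu^*\Big( E \setminus \bigcup_{i=1}^{n} E_i \Big) \ge \mu^*\Big( E \cap \bigcup_{i=1}^{n} E_i \Big) = \mu^*\Big( \bigcup_{i=1}^{n} E_i \Big) = \sum_{i=1}^{n} \mu^*(E_i) \]
where the last equality follows by finite additivity, which we have shown earlier. Since the above
equation holds for all $n$, $\mu^*(E) \ge \sum_{i=1}^{\infty} \mu^*(E_i)$.
$\mathcal{A}$ is a $\sigma$-field, and $\mu^*$ is a measure on it. However, this in itself doesn't say much: for all we know
$\mathcal{A}$ might be $\{\emptyset, G\}$! We don't yet know if $\mathcal{A}$ has any other set at all.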
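That the splitting condition can fail for every nontrivial set is easy to see in a toy case. The sketch below is an illustration under assumptions not in the notes: the space is the three-point set {1, 2, 3}, and `crude` is a made-up outer measure that assigns 1 to every nonempty proper subset and 2 to the whole space. The names `powerset`, `measurable_sets`, `crude` and `counting` are hypothetical.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground`, as frozensets, ordered by size."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def measurable_sets(ground, outer):
    """Sets E that split every test set A: outer(A) == outer(A & E) + outer(A - E)."""
    subsets = powerset(ground)
    return [E for E in subsets
            if all(outer(A) == outer(A & E) + outer(A - E) for A in subsets)]

ground = frozenset({1, 2, 3})

def crude(A):
    """Monotone and subadditive, but far from additive."""
    if not A:
        return 0
    return 2 if A == ground else 1

def counting(A):
    """Counting measure, viewed as an outer measure."""
    return len(A)

only_trivial = measurable_sets(ground, crude)
everything = measurable_sets(ground, counting)
```

For `crude` only the empty set and the whole space pass the test (take $A$ to be a two-point set meeting $E$ and its complement), while for counting measure all eight subsets pass. So the definition of $\mathcal{A}$ alone guarantees nothing beyond $\{\emptyset, G\}$; the argument for open sets is genuinely needed.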
We will now show that all open subsets of $G$ belong to $\mathcal{A}$. We will need to use the fact that
$\mu^*$ is finitely additive on sets which are far apart (two sets are far apart from each other if the
distance between them as defined earlier is positive). We have stated but not proved this.
Let $U \subseteq G$ be a nonempty open set. Fix $A \subseteq G$. We need to show that $\mu^*(A) \ge \mu^*(A \cap U) + \mu^*(A \setminus U)$.
For all $n \in \mathbb{N}$, let
\[ A_n = \left\{ x \in A \cap U : d(x, G \setminus U) \ge \tfrac{1}{n} \right\} \]
$\{A_n\}_{n=1}^{\infty}$ is an increasing sequence, and increases to $A \cap U$: for any $x \in A \cap U$, there is an $n$ such
that $\mathrm{Ball}(x, 1/n) \subseteq U$, so $x \in A_n$.
$A_n \cup (A \setminus U) \subseteq A$, so
\[ \mu^*(A) \ge \mu^*(A_n \cup (A \setminus U)) = \mu^*(A_n) + \mu^*(A \setminus U) \]
The last equality holds because $d(A_n, A \setminus U) \ge 1/n > 0$. We will show that $\mu^*(A_n)$ increases to
$\mu^*(A \cap U)$ as $n \to \infty$. Using the above inequality, this will show that $U \in \mathcal{A}$.
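By monotonicity of $\mu^*$, $\{\mu^*(A_n)\}_{n=1}^{\infty}$ increases. Showing that it increases to $\mu^*(A \cap U)$ requires
a delicate argument. Consider the sequence of sets $A_1, A_2 \setminus A_1, A_3 \setminus A_2, \ldots$. They are disjoint,
but the distance between adjacent sets might be 0. However, the other pairwise distances are positive.
That is, for $n \ge m + 2$, $d(A_{n+1} \setminus A_n, A_{m+1} \setminus A_m) > 0$. To show this, take $x \in A_{n+1} \setminus A_n$, and
$y \in A_{m+1} \setminus A_m$. For any $u \in G \setminus U$,
\[ d(y, u) \le d(x, y) + d(x, u), \quad \text{so} \quad d(x, y) \ge d(y, u) - d(x, u) \ge \frac{1}{m+1} - d(x, u) \]
using $y \in A_{m+1}$. Since $x \notin A_n$, we have $d(x, G \setminus U) < 1/n$, and taking the infimum over $u$ gives
\[ d(x, y) \ge \frac{1}{m+1} - \frac{1}{n} \qquad \text{($y \in A_{m+1}$, $x \notin A_n$)} \]
Since this holds for all $x$ and $y$ in the relevant sets, we have $d(A_{n+1} \setminus A_n, A_{m+1} \setminus A_m) \ge \frac{1}{m+1} - \frac{1}{n} > 0$.
\begin{align*}
\mu^*(A_{2n}) &\ge \mu^*\Big( \bigcup_{i=1}^{n} (A_{2i} \setminus A_{2i-1}) \Big) && \text{(monotonicity)} \\
&= \sum_{i=1}^{n} \mu^*(A_{2i} \setminus A_{2i-1}) && \text{(far apart additivity)}
\end{align*}
By a similar argument, $\mu^*(A_{2n+1}) \ge \sum_{i=1}^{n} \mu^*(A_{2i+1} \setminus A_{2i})$. If any of these two sums diverges to
$\infty$ as $n \to \infty$, then $\{\mu^*(A_n)\}_{n=1}^{\infty}$ increases to $\infty$, and by monotonicity $\mu^*(A \cap U) = \infty$, which
establishes that $\lim_{n \to \infty} \mu^*(A_n) = \mu^*(A \cap U)$. Otherwise, both sums converge to finite numbers,
and so their tail sums converge to 0.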
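\begin{align*}
\mu^*(A \cap U) &= \mu^*\Big( A_{2n} \cup \bigcup_{i=2n}^{\infty} (A_{i+1} \setminus A_i) \Big) \\
&\le \mu^*(A_{2n}) + \sum_{i=2n}^{\infty} \mu^*(A_{i+1} \setminus A_i) && \text{(subadditivity)} \\
&= \mu^*(A_{2n}) + \sum_{i=n}^{\infty} \mu^*(A_{2i+1} \setminus A_{2i}) + \sum_{i=n+1}^{\infty} \mu^*(A_{2i} \setminus A_{2i-1}) && \text{(rearranging)}
\end{align*}
Since the tail sum terms converge to 0, $\mu^*(A \cap U) \le \lim_{n \to \infty} \mu^*(A_{2n})$. It follows that $\mu^*(A \cap U) \le
\lim_{n \to \infty} \mu^*(A_n)$. The reverse inequality holds by monotonicity of $\mu^*$. We have shown that all open
sets belong to $\mathcal{A}$ (subject to two claims which will be proven below), and so the Borel $\sigma$-field of
$G$ is a subset of $\mathcal{A}$.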
The remaining claims

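We will now show that if $d(E, F) > 0$, then $\mu^*(E \cup F) = \mu^*(E) + \mu^*(F)$, which we have used
above. We need to show $\mu^*(E) + \mu^*(F) \le \mu^*(E \cup F)$. If $\mu^*(E \cup F) = \infty$, we are done. Otherwise,
$\mu^*(E \cup F) < \infty$. Let $\varepsilon > 0$ be given. Using the definition of $\mu^*$, pick a cover for $E \cup F$ within $\varepsilon$ of the
infimum. That is, pick $\mathcal{S} \subseteq \mathcal{U}$ countable, such that $E \cup F \subseteq \bigcup_{S \in \mathcal{S}} S$, and $\sum_{S \in \mathcal{S}} \lambda(S) < \mu^*(E \cup F) + \varepsilon$.
We will use $\mathcal{S}$ to get covers for $E$ and $F$.
Let $a = d(E, F)$. Let
\[ V = \bigcup_{x \in E} \mathrm{Ball}\left(x, \tfrac{a}{4}\right), \qquad W = \bigcup_{x \in F} \mathrm{Ball}\left(x, \tfrac{a}{4}\right) \]
Triangle inequality will give us that $d(V, W) \ge \frac{a}{2}$. $V$ and $W$ are open. Let
\[ \mathcal{S}_1 = \{ S \cap V : S \in \mathcal{S} \}, \qquad \mathcal{S}_2 = \{ S \cap W : S \in \mathcal{S} \} \]
$\mathcal{S}_1, \mathcal{S}_2 \subseteq \mathcal{U}$. $E \subseteq \bigcup_{S \in \mathcal{S}_1} S$ and $F \subseteq \bigcup_{S \in \mathcal{S}_2} S$. $\mathcal{S}_1$ and $\mathcal{S}_2$ are countable. So they are valid covers for $E$ and
$F$ respectively. For any $S \in \mathcal{S}$, $d(S \cap V, S \cap W) \ge d(V, W) > 0$, so
\[ \lambda(S \cap V) + \lambda(S \cap W) = \lambda(S \cap (V \cup W)) \le \lambda(S) \]
This gives
\begin{align*}
\mu^*(E) + \mu^*(F) &\le \sum_{S \in \mathcal{S}_1} \lambda(S) + \sum_{S \in \mathcal{S}_2} \lambda(S) \\
&\le \sum_{S \in \mathcal{S}} \lambda(S \cap V) + \sum_{S \in \mathcal{S}} \lambda(S \cap W) \\
&\le \sum_{S \in \mathcal{S}} \lambda(S) \\
&< \mu^*(E \cup F) + \varepsilon
\end{align*}
Since this holds for every $\varepsilon > 0$, $\mu^*(E) + \mu^*(F) \le \mu^*(E \cup F)$.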
We will now show that if $d(U_1, U_2) > 0$, then $\lambda(U_1 \cup U_2) = \lambda(U_1) + \lambda(U_2)$. We have used this above.
We may assume $U_1, U_2 \in \mathcal{U}_0$, the remaining cases being trivial.
We will first show that for large enough $n$ (that is, for all but finitely many $n$), no translate of $B_n$
can intersect both $\overline{U_1}$ and $\overline{U_2}$. $\mathrm{diam}(B_n)$ goes to 0 as $n \to \infty$, so this would have been easy if the
metric $d$ was translation invariant (that is, for all $x, y, z \in G$, $d(xy, xz) = d(y, z)$). But this need
not be so.
We prove the claim by contradiction. If it is not true that for all $n$ large enough, no translate of $B_n$
intersects both $\overline{U_1}$ and $\overline{U_2}$, then there is a strictly increasing sequence of natural numbers, $\{n_k\}_{k=1}^{\infty}$,
and a sequence of elements of $G$, $\{g_k\}_{k=1}^{\infty}$, such that for all $k$, $g_k B_{n_k}$ intersects both $\overline{U_1}$ and $\overline{U_2}$.
Pick $x_k \in g_k B_{n_k} \cap \overline{U_1}$ and $y_k \in g_k B_{n_k} \cap \overline{U_2}$. $\{x_k\}_{k=1}^{\infty}$ and $\{y_k\}_{k=1}^{\infty}$ are sequences in the compact sets
$\overline{U_1}$ and $\overline{U_2}$, so we can choose a common subsequence along which both converge. We can restrict
to this subsequence, and so without loss of generality, assume that the two sequences converge to
$x \in \overline{U_1}$ and $y \in \overline{U_2}$ respectively. $g_k^{-1} x_k$ and $g_k^{-1} y_k$ both belong to $B_{n_k}$. Since the diameters go to 0
and $B_{n_k}$ decreases to $\{e\}$ as $k \to \infty$, both $\{g_k^{-1} x_k\}_{k=1}^{\infty}$ and $\{g_k^{-1} y_k\}_{k=1}^{\infty}$ converge to $e$.
The sequence $\{(g_k^{-1} x_k) x_k^{-1}\}_{k=1}^{\infty}$, which is just $\{g_k^{-1}\}_{k=1}^{\infty}$, converges to $e x^{-1} = x^{-1}$. So $\{g_k\}_{k=1}^{\infty}$ converges to $x$. By a similar
argument, it also converges to $y$. So $x = y$. $\overline{U_1}$ and $\overline{U_2}$ intersect, which contradicts $d(U_1, U_2) > 0$.
For large enough $n$, no translate of $B_n$ intersects both $\overline{U_1}$ and $\overline{U_2}$. This gives
\[ \left[\frac{U_1 \cup U_2}{B_n}\right] = \left[\frac{U_1}{B_n}\right] + \left[\frac{U_2}{B_n}\right] \]
This holds because the inequality $\le$ holds in general, and the inequality $\ge$ is obtained by using
that when we cover $\overline{U_1 \cup U_2}$ by the minimum number of translates of $B_n$ needed, every translate
intersects exactly one of $\overline{U_1}$ and $\overline{U_2}$. Dividing by $\left[\frac{U_0}{B_n}\right]$ gives $\lambda_n(U_1 \cup U_2) = \lambda_n(U_1) + \lambda_n(U_2)$ for large
enough $n$. This implies $\lambda(U_1 \cup U_2) = \lambda(U_1) + \lambda(U_2)$ (this works because our ultrafilter contains all
co-finite sets).
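The additivity of covering numbers for far apart sets can be checked by brute force in a finite group. The sketch below is an illustration only, under assumptions not in the notes: the group is the cyclic group Z_30 written additively, the set B = {0, 1, 2} stands in for a small $B_n$, and "far apart" is modelled by a gap wider than B. The helper `covering_number` is a made-up name.

```python
from itertools import combinations

def covering_number(U, V, n):
    """Least number of translates g + V (mod n) whose union contains U."""
    U, V = frozenset(U), frozenset(V)
    if not U or not V:
        raise ValueError("U and V must be nonempty")
    # Distinct translates of V that meet U; the others never help.
    translates = {frozenset((g + v) % n for v in V) for g in range(n)}
    useful = [T for T in translates if T & U]
    for k in range(1, len(U) + 1):
        for choice in combinations(useful, k):
            if U <= frozenset().union(*choice):
                return k

n = 30
B = {0, 1, 2}                      # small set containing the identity 0
U1 = set(range(0, 5))              # {0, ..., 4}
U2 = set(range(15, 20))            # {15, ..., 19}, no translate of B meets both
far_sum = covering_number(U1, B, n) + covering_number(U2, B, n)
far_union = covering_number(U1 | U2, B, n)

# Two sets that a single translate of B can meet simultaneously.
near_sum = covering_number({0}, B, n) + covering_number({1}, B, n)
near_union = covering_number({0, 1}, B, n)
```

For the far apart pair the covering number of the union equals the sum, while for the adjacent pair it is strictly smaller. This is the finite shadow of the argument above: equality needs every translate in a minimal cover to meet exactly one of the two sets.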
We have defined a left-invariant measure $\mu^*$ on the Borel $\sigma$-field of $G$. We will observe that it has
the other properties we want (regarding measures of compact sets and open sets) in the next class,
and make some comments about uniqueness.
Construction of Haar measure (contd)

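We had shown that $\mu^*$ is a left-invariant measure on $\mathcal{A}$. We will now show that $\mu^*$ gives a finite
measure to every compact set and a positive measure to every nonempty open set.
Suppose $K \subseteq G$ is compact. $\mathcal{U}_0$ covers $K$, so some finitely many sets in $\mathcal{U}_0$, say $U_1, \ldots, U_l$, cover
$K$. Since $\lambda$ of every set in $\mathcal{U}_0$ is finite,
\[ \mu^*(K) \le \sum_{i=1}^{l} \lambda(U_i) < \infty \]
Suppose $U \subseteq G$ is open, and $U \neq \emptyset$. Using local compactness, get nonempty open $H$ such that
$H \subseteq \overline{H} \subseteq U$ and $\overline{H}$ is compact. For calculating $\mu^*(\overline{H})$, it suffices to consider only finite covers, since
any countably infinite cover has a finite subcover, which gives only a better (lower) potential value
for $\mu^*(\overline{H})$. Let $\varepsilon > 0$ be given. Get $U_1, \ldots, U_l \in \mathcal{U}$ covering $\overline{H}$ such that $\sum_{i=1}^{l} \lambda(U_i) < \mu^*(\overline{H}) + \varepsilon$.
Note that $\lambda$ is defined only on $\mathcal{U}$.
\begin{align*}
\lambda(H) &\le \lambda\Big( \bigcup_{i=1}^{l} U_i \Big) && \text{(monotonicity)} \\
&\le \sum_{i=1}^{l} \lambda(U_i) && \text{(finite subadditivity)} \\
&< \mu^*(\overline{H}) + \varepsilon \\
&\le \mu^*(U) + \varepsilon && \text{(monotonicity)}
\end{align*}
Since this holds for all $\varepsilon > 0$, $\mu^*(U) \ge \lambda(H) > 0$.

$\mu^*$ is a left-invariant Haar measure on the Borel $\sigma$-field of $G$.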
Uniqueness
In our current setup (a locally compact metrizable topological group $G$), we will show that if $\mu$ and
$\nu$ are two left-invariant Haar measures, then $\nu$ can be scaled so that for all continuous functions
with compact support $f : G \to \mathbb{C}$, we have $\int f \, d\mu = \int f \, d\nu$. A positive finite scalar multiple of a
left-invariant Haar measure is also a left-invariant Haar measure.
The proof we give is a tricky proof due to von Neumann, though more straightforward proofs
exist.
Suppose $\mu, \nu$ are left-invariant Haar measures on $G$. Fix $\varphi : G \to [0, \infty)$, a continuous function with
compact support which is not the identically zero function. This can be done using the metric and
local compactness. $\varphi$ is necessarily bounded. If $\varphi$ takes the value $a$ somewhere, with $a > 0$, then
$\{x : \varphi(x) > a/2\}$ is nonempty open, so has positive measure. So $\int \varphi \, d\mu > 0$, and $\int \varphi \, d\nu > 0$. We
may scale $\mu$ and $\nu$ such that
\[ \int \varphi \, d\mu = \int \varphi \, d\nu = 1 \]
Define $\psi : G \to [0, \infty)$,
\[ \psi(g) = \int \varphi(x g^{-1}) \, d\mu(x) \]
Lemma. $\psi$ is well-defined and is a continuous function.

Proof. To show that $\psi$ is well-defined, we need to show that for each $g \in G$, the integral used
to define $\psi(g)$ is finite. Let $K \subseteq G$ be a compact set such that $\varphi$ is zero outside $K$. $Kg$ is
also compact, and so has finite $\mu$ measure (but $\mu(Kg)$ need not be equal to $\mu(K)$; $\mu$ is only
left-invariant). To obtain $\psi(g)$ we are integrating a bounded nonnegative function which is zero
outside $Kg$, and so $\psi(g)$ is finite.
To show continuity of $\psi$, let $\{g_n\}_{n=1}^{\infty}$ be a sequence converging to $g$. We will show that
\[ \widetilde{K} = Kg \cup \bigcup_{n \in \mathbb{N}} K g_n \]
is compact. Take any sequence in $\widetilde{K}$. We need to find a convergent subsequence. If infinitely many
terms occur in $Kg$ or in some $Kg_n$, then we are done. Otherwise, only finitely many terms are in
$Kg$, and only finitely many terms are in $Kg_n$, for each $n$. Then we can find a subsequence whose
$k$th term is in $Kg_{n_k}$, where $\{n_k\}_{k=1}^{\infty}$ is a strictly increasing sequence. From this we can extract a
further subsequence which converges to a point in $Kg$.
Now we can show that $\{\psi(g_n)\}_{n=1}^{\infty}$ converges to $\psi(g)$ using the usual DCT argument, using $M 1_{\widetilde{K}}$
as the integrable bound, where $M$ is a bound on $\varphi$.
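Lemma. For all $f : G \to \mathbb{C}$ continuous functions with compact support,
1. $\int f(x) \, d\mu(x) = \int f(x^{-1}) \psi(x^{-1}) \, d\mu(x)$.
2. $\int f(x) \, d\mu(x) = \int f(x^{-1}) \psi(x^{-1}) \, d\nu(x)$.
3. $\int f(x) \, d\mu(x) = \int f(x) \psi(x) \psi(x^{-1}) \, d\mu(x)$.
4. $\int f(x) \, d\mu(x) = \int f(x) \psi(x) \psi(x^{-1}) \, d\nu(x)$.
5. $\int f(x) \, d\mu(x) = \int f(x) \, d\nu(x)$.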
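Proof. We will prove 1 and 2 simultaneously. Fubini's theorem will be used freely. Everything is
bounded and has compact support, so is integrable. In what follows, $\lambda$ can be taken to be either
$\mu$ or $\nu$. $\lambda = \mu$ proves 1 and $\lambda = \nu$ proves 2.
\begin{align*}
\int f(x) \, d\mu(x) &= \int \varphi(y) \, d\lambda(y) \int f(x) \, d\mu(x) \\
&= \int \left[ \int f(x) \, d\mu(x) \right] \varphi(y) \, d\lambda(y) \\
&= \int \left[ \int f(y^{-1} x) \, d\mu(x) \right] \varphi(y) \, d\lambda(y) && \text{(replace $x$ by $y^{-1}x$, use left-invariance of $\mu$)} \\
&= \int \left[ \int f(y^{-1} x) \varphi(y) \, d\lambda(y) \right] d\mu(x) \\
&= \int \left[ \int f(y^{-1}) \varphi(xy) \, d\lambda(y) \right] d\mu(x) && \text{(replace $y$ by $xy$, use left-invariance of $\lambda$)} \\
&= \int f(y^{-1}) \left[ \int \varphi(xy) \, d\mu(x) \right] d\lambda(y) \\
&= \int f(y^{-1}) \psi(y^{-1}) \, d\lambda(y)
\end{align*}
We now prove 3 and 4, using 1 and 2. Again, $\lambda$ can be taken to be either of $\mu$ or $\nu$. $\lambda = \mu$ proves
3 and $\lambda = \nu$ proves 4.
\begin{align*}
\int f(x) \psi(x) \psi(x^{-1}) \, d\lambda(x) &= \int g(x^{-1}) \psi(x^{-1}) \, d\lambda(x) && \text{(where $g(x) = f(x^{-1}) \psi(x^{-1})$)} \\
&= \int g(x) \, d\mu(x) && \text{(using 1 or 2)} \\
&= \int f(x^{-1}) \psi(x^{-1}) \, d\mu(x) \\
&= \int f(x) \, d\mu(x) && \text{(using 1)}
\end{align*}
3 gives $\psi(x) \psi(x^{-1}) = 1$ for all $x \in G$. If not, suppose for some $x$, $\psi(x) \psi(x^{-1}) = a$ with $a > 1$
($a < 1$ is similar). Let $U = \left\{ x : \psi(x) \psi(x^{-1}) > \frac{1+a}{2} \right\}$. $U$ is nonempty open. By choosing a suitable
continuous function $f$ with compact support contained in $U$, we can get a contradiction to 3.
Since $\psi(x) \psi(x^{-1}) = 1$ for all $x \in G$, 4 gives 5.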
Suppose we know that $\mu$ and $\nu$ are regular from below, that is,
\[ \lambda(A) = \sup\{ \lambda(K) : K \text{ compact},\ K \subseteq A \} \]
for $\lambda \in \{\mu, \nu\}$. For example, this is true if $(G, d)$ is complete and separable. If $K$ is a compact set,
we can get a sequence of continuous functions of compact support decreasing to $1_K$, each bounded
between 0 and 1. DCT will give $\mu(K) = \nu(K)$. Then regularity will give $\mu = \nu$.
