Real Analysis

Lecture Notes in Real Analysis
Eric T. Sawyer
McMaster University, Hamilton, Ontario
E-mail address: sawyer@mcmaster.ca
Abstract. Beginning with the ordered field of real numbers, these lecture
notes examine the theory of real functions with applications to differential
equations and fractals. The main thread begins with the least upper bound
property of the real numbers, and follows through to compactness and
completeness in Euclidean spaces. Standard results on continuity, differentiation
and integration are established, culminating in two applications of the
Contraction Lemma: fractals are characterized using the completeness of the
metric space of compact subsets of Euclidean space; existence and uniqueness
of solutions to first order nonlinear initial value problems are proved using
completeness of the space of real continuous functions on a closed bounded
interval.
Contents
Preface
Part 1. Differentiation
Chapter 1. The fields of analysis
1. A model of a vibrating string
2. Deficiencies of the rational numbers
3. The real field
4. The complex field
5. Dedekind's construction of the real numbers
6. Exercises
Chapter 2. Cardinality of sets
1. Exercises
Chapter 3. Metric spaces
1. Topology of metric spaces
2. Compact sets
3. Fractal sets
4. Exercises
Chapter 4. Sequences and Series
1. Sequences in a metric space
2. Numerical sequences and series
3. Power series
4. Exercises
Chapter 5. Continuity and Differentiability
1. Continuous functions
2. Differentiable functions
3. Exercises
Part 2. Integration
Chapter 6. Riemann and Riemann-Stieltjes integration
1. Simple properties of the Riemann-Stieltjes integral
2. Fundamental Theorem of Calculus
3. Exercises
Chapter 7. Function spaces
1. Sequences and series of functions
2. The metric space $C_{\mathbb{R}}(X)$
Chapter 8. Lebesgue measure theory
1. Lebesgue measure on the real line
2. Measurable functions and integration
Appendix. Bibliography
Preface
These notes grew out of lectures given three times a week in a third year
undergraduate course in real analysis at McMaster University from September to
December 2009. The topics include the real and complex number systems and their
function theory; continuity, differentiability, and compactness. Applications
include existence of solutions to differential equations, and construction of
fractals such as the Cantor set, the von Koch snowflake and Peano's
space-filling curve. Sources include books by Rudin [3] and [4], books by Stein
and Shakarchi [5] and [6], and the history book by Boyer [2].
Part 1
Differentiation
We begin Part 1 with a chapter discussing the field of real numbers $\mathbb{R}$, in
particular its status as the unique ordered field with the least upper bound property.
We show that the field of real numbers $\mathbb{R}$ can be constructed either from Dedekind
cuts of rational numbers $\mathbb{Q}$, or from Weierstrass' Cauchy sequences of rational
numbers. Finally, we comment briefly on the arithmetic properties of $\mathbb{R}$ that can
be derived from its definition, and also point out a false start in the construction
of $\mathbb{R}$.
Then in the short Chapter 2 we introduce Cantor's cardinal numbers and show
that the rational numbers are countable and that the real numbers are uncountable.
Chapter 3 follows Rudin [3] in part and introduces the concept of a metric space
with a distance function that is sufficient for developing a rich theory of limits,
yet general enough to include the real and complex numbers, Euclidean spaces and
the various function spaces we use later. We also construct our first fractal set,
the famous Cantor middle thirds set, which provides an example of a perfect set
that is large in cardinality (uncountable) yet small in length (measure zero). We
end by following Stein and Shakarchi [6] to establish a one-to-one correspondence
between finite collections of contractive similarities and fractal sets, thus illustrating
Mandelbrot's observation that much of the apparent chaotic form in nature has an
extremely simple underlying structure.
Chapter 4 develops the standard theory of sequences and series in a metric
space, including convergence tests, Cauchy sequences and the completeness of
Euclidean spaces. We also introduce the useful contraction lemma as a unifying
approach to fractals and, later, to solutions to differential equations.
Chapter 5 introduces the concepts of continuity and differentiability, including
uniform continuity and four mean value theorems of increasing generality.
CHAPTER 1
The fields of analysis
If one is not careful in defining the concepts used in analysis, confusion can
result. In particular, we need a clear definition of
(1) function,
(2) the set of real numbers, and
(3) convergence of series of real numbers and functions.
In the 18th century each of these concepts suffered shortcomings. Early
formulations of the notion of function involved the idea of a specific formula. Later, in
1837, Lejeune Dirichlet suggested a broader definition of function, still falling short
of the modern notion:
If a variable $y$ is so related to a variable $x$ that whenever a numerical value
is assigned to $x$, there is a rule according to which a unique value of $y$ is
determined, then $y$ is said to be a function of the independent variable $x$.
Real numbers were thought of as points on a line, but the identification of
their crucial properties, such as the absence of gaps as reflected in the least upper
bound property, had to await Dedekind's construction of the real numbers from the
rational numbers.
In 1725 Varignon, one of the first French scholars to appreciate the calculus,
warned that infinite series were not to be used without investigation of the
remainder term. It was not until 1872, however, that Heine, influenced by Weierstrass'
lectures, defined the limit of the function $f$ at $x_0$ in virtually modern terms as
follows:
If, given any $\varepsilon$, there is an $\eta_0$ such that for $0 < \eta < \eta_0$ the difference
$f(x_0 \pm \eta) - L$ is less in absolute value than $\varepsilon$, then $L$ is the limit of $f(x)$
for $x = x_0$.
Historically, the following example was pivotal in the development of the rig-
orous analysis that addressed the above shortcomings, and also in the foundations
of set theory. We are referring here to a simple mathematical model of the motion
of a string vibrating in the plane.
1. A model of a vibrating string
Consider a vibrating string stretched along that portion of the $x$-axis in the
plane that joins the points $(0,0)$ and $(1,0)$, and suppose the string is wiggling up
and down (not very violently) in the $y$-direction. Suppose that at time $t$ and just
above (or below) the point $(x,0)$ on the $x$-axis, the $y$-coordinate of the string is
given by $y(x,t)$. This defines a function mapping the infinite strip $[0,1] \times \mathbb{R}$ into
the real numbers $\mathbb{R}$, i.e. $y(x,t)$ is defined for
\[ 0 \le x \le 1 \text{ and } t \in \mathbb{R}, \]
and we are to think of the real number $y(x,t)$ as measuring the displacement from
the $x$-axis of the vibrating string at position $x$ and time $t$. We assume the endpoints
of the string are attached to the points $(0,0)$ and $(1,0)$ for all time and so we have
the boundary conditions
\[ y(0,t) = 0 \text{ and } y(1,t) = 0 \text{ for all } t \in \mathbb{R}. \tag{1.1} \]
Moreover, we can suppose that at time $t = 0$ the shape of the string is specified by
the graph of a given function $f$ that maps $[0,1]$ to $\mathbb{R}$;
\[ y(x,0) = f(x) \text{ for } 0 \le x \le 1. \tag{1.2} \]
Finally, we can suppose that at time $t = 0$ the vertical velocity of the string is
specified by a given function $g$ that maps $[0,1]$ to $\mathbb{R}$;
\[ \frac{\partial}{\partial t} y(x,0) = g(x) \text{ for } 0 \le x \le 1. \tag{1.3} \]
Now provided the displacements are not too violent, it can be shown (and we
are not interested here in exactly how this is done) that the function $y(x,t)$ satisfies
a partial differential equation of the form
\[ \frac{\partial^2}{\partial t^2} y = c^2 \frac{\partial^2}{\partial x^2} y, \quad 0 < x < 1 \text{ and } t \in \mathbb{R}, \]
where $c$ is a positive constant determined by the physical properties of the string,
and is interpreted as the speed of propagation. This is the so-called wave equation,
and together with the boundary conditions (1.1) and the initial conditions (1.2)
and (1.3), it constitutes the initial boundary value problem for the vibrating string:
\[
\begin{aligned}
\left( \frac{\partial^2}{\partial t^2} - c^2 \frac{\partial^2}{\partial x^2} \right) y(x,t) &= 0, && 0 < x < 1 \text{ and } t \in \mathbb{R}, \\
y(0,t) = y(1,t) &= 0, && t \in \mathbb{R}, \\
\begin{pmatrix} y(x,0) \\ \frac{\partial}{\partial t} y(x,0) \end{pmatrix} &= \begin{pmatrix} f(x) \\ g(x) \end{pmatrix}, && 0 \le x \le 1.
\end{aligned}
\tag{1.4}
\]
On the one hand, Daniel Bernoulli noted around the middle of the 18th century
that for each positive integer $n \in \mathbb{N}$ the function
\[ y_n(x,t) = (\sin n\pi x)(\cos n\pi c t) \]
is a solution to (1.4) with initial conditions
\[ f(x) = \sin n\pi x, \quad 0 \le x \le 1, \]
\[ g(x) = 0, \quad 0 \le x \le 1. \]
Example 1. $y(x,t) = (\sin 3\pi x)(\cos 3\pi t)$
[Figure: surface plot of the function in Example 1, with axes labeled x, y and z.]
Since the equations involved are linear we then have that
\[ y(x,t) = \sum_{n=1}^{N} a_n (\sin n\pi x)(\cos n\pi c t) \]
is a solution to (1.4) with initial conditions
\[ f(x) = \sum_{n=1}^{N} a_n \sin n\pi x, \quad 0 \le x \le 1, \]
\[ g(x) = 0, \quad 0 \le x \le 1. \]
Presuming that we can take infinite sums, we finally obtain that the solution $y(x,t)$
to the initial boundary value problem (1.4) with initial conditions
\[ f(x) = \sum_{n=1}^{\infty} a_n \sin n\pi x, \quad 0 \le x \le 1, \]
\[ g(x) = 0, \quad 0 \le x \le 1, \]
is given by the infinite series of functions
\[ y(x,t) = \sum_{n=1}^{\infty} a_n (\sin n\pi x)(\cos n\pi c t). \tag{1.5} \]
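As a numerical sanity check (not a proof), the following Python sketch evaluates a finite Bernoulli sum and estimates $y_{tt} - c^2 y_{xx}$ by central differences. The amplitudes, the speed $c$, the sample point and the step size $h$ are hypothetical values chosen only for illustration.

```python
import math

def bernoulli_sum(coeffs, c, x, t):
    """Partial sum y(x,t) = sum_{n=1}^{N} a_n sin(n pi x) cos(n pi c t)."""
    return sum(a * math.sin(n * math.pi * x) * math.cos(n * math.pi * c * t)
               for n, a in enumerate(coeffs, start=1))

def wave_residual(coeffs, c, x, t, h=1e-3):
    """Central-difference estimate of y_tt - c^2 y_xx at the point (x, t)."""
    y = lambda xx, tt: bernoulli_sum(coeffs, c, xx, tt)
    y_tt = (y(x, t + h) - 2 * y(x, t) + y(x, t - h)) / h ** 2
    y_xx = (y(x + h, t) - 2 * y(x, t) + y(x - h, t)) / h ** 2
    return y_tt - c ** 2 * y_xx

coeffs = [1.0, 0.5, 0.25]   # hypothetical amplitudes a_1, a_2, a_3
c = 2.0                     # hypothetical speed of propagation
residual = wave_residual(coeffs, c, x=0.3, t=0.7)
left_end = bernoulli_sum(coeffs, c, 0.0, 0.7)    # boundary condition at x = 0
right_end = bernoulli_sum(coeffs, c, 1.0, 0.7)   # boundary condition at x = 1
```

The residual is small compared with the individual second derivatives, and both endpoint values vanish up to rounding, which is consistent with (1.1) and (1.4). Nothing here addresses convergence of the infinite series (1.5).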
Remark 1. The Bernoulli decomposition is motivated for example by plucking
a guitar string. The fundamental note heard is that corresponding to $n = 1$, the
standing sine wave having one node that oscillates with frequency $\frac{c}{2}$ and amplitude
$a_1$. Corresponding to higher values of $n$ are the harmonics having $n$ nodes with
frequency $\frac{nc}{2}$ and amplitude $a_n$. See Example 1 above where the standing wave
having 3 nodes has graph $\sin 3\pi x$ with frequency $\frac{3}{2}$ and amplitude 1.
On the other hand, a much simpler solution to (1.4) with initial condition
$g(x) = 0$ for $0 \le x \le 1$ was given by Jean Le Rond d'Alembert in 1747, namely
the travelling wave solution,
\[ y(x,t) = \frac{f(x + ct) + f(x - ct)}{2}, \quad 0 \le x \le 1 \text{ and } t \in \mathbb{R}, \tag{1.6} \]
where we define $f$ outside the interval $[0,1]$ by requiring that it be odd on the
interval $[-1,1]$ and periodic with period 2 on the real line.
Example 2.
\[ y(x,t) = \frac{1}{2} \frac{1}{1 + (x+t)^2} + \frac{1}{2} \frac{1}{1 + (x-t)^2}, \quad -\infty < x < \infty, \ t \ge 0 \]
[Figure: surface plot of the function in Example 2, with axes labeled x, y and z.]
Exercise 1. Verify that the function $y(x,t)$ in (1.6) satisfies (1.4) with $g = 0$.
Remark 2. The travelling wave solution is motivated for example by snapping
a skipping rope that is lying in a line on the ground. A hump is produced that
travels like a wave along the rope with speed $c$. See Example 2 above where two
humps move off in opposite directions with speed 1.
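The extension of $f$ used in (1.6), odd on $[-1,1]$ and periodic with period 2, can be sketched in Python as follows. The initial shape and the sample times are hypothetical; the sketch only checks the initial and boundary conditions numerically and is not a substitute for the verification asked for in Exercise 1.

```python
def odd_periodic_extension(f):
    """Extend f from [0, 1] to the real line: odd on [-1, 1], period 2."""
    def f_ext(x):
        x = (x + 1.0) % 2.0 - 1.0          # reduce x to the interval [-1, 1)
        return f(x) if x >= 0 else -f(-x)  # oddness on [-1, 1]
    return f_ext

def dalembert(f, c):
    """Travelling wave y(x,t) = (f(x + ct) + f(x - ct)) / 2 built from the extension."""
    f_ext = odd_periodic_extension(f)
    return lambda x, t: 0.5 * (f_ext(x + c * t) + f_ext(x - c * t))

f = lambda x: x * (1.0 - x)   # hypothetical initial shape with f(0) = f(1) = 0
y = dalembert(f, c=1.0)

initial_error = max(abs(y(k / 10, 0.0) - f(k / 10)) for k in range(11))
boundary_error = max(max(abs(y(0.0, t)), abs(y(1.0, t))) for t in (0.3, 1.7, 4.2))
```

Oddness forces $y(0,t) = 0$, and oddness combined with period 2 forces $y(1,t) = 0$, which is why both quantities above come out at rounding level.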
Based on physical experience, such as plucking a guitar string and snapping a
skipping rope, we expect that
(1) every solution to the initial boundary value problem (1.4) has the Fourier
harmonic form (1.5), and
(2) every solution to the initial boundary value problem (1.4) has the d'Alembert
travelling wave form (1.6), and
(3) the solution to the initial boundary value problem (1.4) is uniquely deter-
mined by the boundary conditions (1.1) and the initial conditions (1.2)
and (1.3).
From these expectations it follows that for any function $f(x)$ we have
\[ f(x) = y(x,0) = \sum_{n=1}^{\infty} a_n \sin n\pi x, \quad 0 \le x \le 1, \tag{1.7} \]
for a suitable choice of constants $a_n$, $n \ge 1$. In fact the coefficients $a_n$ are determined
at least formally by the function $f$ via the following formula:
\[ \int_0^1 f(x) \sin k\pi x \, dx = \sum_{n=1}^{\infty} a_n \int_0^1 \sin n\pi x \, \sin k\pi x \, dx = \frac{1}{2} a_k, \tag{1.8} \]
where we have used
\[ \sin A \sin B = \frac{1}{2} \left\{ \cos(A - B) - \cos(A + B) \right\}; \]
\[ \int_0^1 \cos m\pi x \, dx = \begin{cases} 1 & \text{if } m = 0, \\ \left. \dfrac{\sin m\pi x}{m\pi} \right|_0^1 = 0 & \text{if } m \in \mathbb{Z} \setminus \{0\}. \end{cases} \]
The precise meaning to be attached to formulas such as (1.7) and (1.8) involves many
difficulties! In particular,
when does the series on the right side of (1.7) converge?
and for what values of $x$?
or more generally in what sense?
when does the sum equal $f(x)$ in some sense?
when does the integral on the left side of (1.8) exist?
and when can integration and infinite sum be interchanged in (1.8)?
We will introduce concepts and develop tools to answer such questions. In
particular we note that it was Joseph Fourier in 1824 who first proved that (1.7)
holds under certain conditions, and this is the reason that the name of Fourier, and
not Bernoulli, is associated with such a decomposition of a function $f(x)$ into a
series of trigonometric functions $\sin n\pi x$.
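Formula (1.8) reads $a_k = 2 \int_0^1 f(x) \sin k\pi x \, dx$, and this can be tried out numerically. The Python sketch below is only an illustration under stated assumptions: the test function $f(x) = x(1-x)$, the midpoint rule, and the cutoff of 25 terms are all hypothetical choices, and the computation says nothing about the convergence questions listed above.

```python
import math

def sine_coefficient(f, k, panels=2000):
    """a_k = 2 * integral_0^1 f(x) sin(k pi x) dx, via the composite midpoint rule."""
    h = 1.0 / panels
    total = sum(f((j + 0.5) * h) * math.sin(k * math.pi * (j + 0.5) * h)
                for j in range(panels))
    return 2.0 * h * total

def partial_sum(coeffs, x):
    """Evaluate sum_{n=1}^{N} a_n sin(n pi x) for the given list of coefficients."""
    return sum(a * math.sin(n * math.pi * x)
               for n, a in enumerate(coeffs, start=1))

f = lambda x: x * (1.0 - x)   # hypothetical test function with f(0) = f(1) = 0
coeffs = [sine_coefficient(f, k) for k in range(1, 26)]
error_at_midpoint = abs(partial_sum(coeffs, 0.5) - f(0.5))
```

For this particular $f$ the exact coefficients are $8/(n\pi)^3$ for odd $n$ and $0$ for even $n$, so the computed values can be compared against a closed form.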
One question that springs to mind immediately is whether or not the ordered
field of rational numbers
\[ \mathbb{Q} = \left\{ \frac{m}{n} : m \in \mathbb{Z} \text{ and } n \in \mathbb{N} \right\} \]
can suffice as the domain for $x$ in answering these questions. As it happens, the
rational numbers suffer a fatal deficiency that, as we show in the next section, can
take several different forms, rendering the rationals unsuitable for this purpose. It is
convenient at this point to introduce the concept of an order $<$ on a set $S$.
Definition 1. An order $<$ on a set $S$ is a relation (among ordered pairs $(x,y)$
of elements $x, y \in S$) satisfying the following three properties:
(1) (nonreflexive) If $x \in S$, then it is not true that $x < x$.
(2) (antisymmetric) If $x, y \in S$ and $x \ne y$, then one and only one of the
following two possibilities holds:
\[ x < y, \quad y < x. \]
(3) (transitive) If $x, y, z \in S$, and $x < y$ and $y < z$, then $x < z$.
For example, the usual order on either $\mathbb{Z}$ or $\mathbb{Q}$ satisfies Definition 1.
2. Deficiencies of the rational numbers
The rational numbers $\mathbb{Q}$ form an ordered field, but there are difficulties
associated with
(1) nonsolvability of algebraic equations,
(2) gaps in the order,
(3) and nonexistence of solutions to simple differential equations.
Because of these problems with the rational numbers, we will be led to construct
the set of real numbers $\mathbb{R}$, which form an ordered field with the least upper bound
property. This last property reflects the absence of gaps in the order of the real
numbers and accounts for the privileged position of $\mathbb{R}$ in analysis.
2.1. Nonsolvability of algebraic equations. The polynomial equation
\[ x^2 - 2 = 0 \]
has no solution $x \in \mathbb{Q}$. Indeed, if it did then we would have $\left( \frac{m}{n} \right)^2 = 2$ where $m$ and
$n$ are integers with no factors in common. Then
$m^2 = 2n^2$ is even,
hence so is $m$, say $m = 2k$ for an integer $k$,
hence $n^2 = 2k^2$ is even,
and hence $n$ is even.
This contradicts our assumption that $m$ and $n$ have no factors in common, and
completes the proof that $\sqrt{2}$ is not rational.
Alternatively, one can avoid divisibility and argue with inequalities to derive a
contradiction as Fermat did:
\[ \sqrt{2} = \frac{m}{n} \text{ where } 0 < n < m < 2n \text{ follows from } 1 < \sqrt{2} < 2, \]
\[ 1 = 2 - 1 = \left( \sqrt{2} - 1 \right) \left( \sqrt{2} + 1 \right) = \left( \frac{m}{n} - 1 \right) \left( \sqrt{2} + 1 \right), \]
\[ \sqrt{2} = \frac{1}{\frac{m}{n} - 1} - 1 = \frac{2n - m}{m - n} = \frac{m_1}{n_1} \text{ where } n_1 = m - n < n. \]
Thus we have shown that if $\sqrt{2}$ can be represented as a quotient of positive
integers $\frac{m}{n}$, then it can also be represented as a quotient of positive integers $\frac{m_1}{n_1}$
with $n_1$ strictly smaller than $n$. This can be repeated as often as we wish, leading
to the contradiction that there are infinitely many positive integers between 0 and
$n$. This technique is known as Fermat's method of infinite descent.
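The descent step $\frac{m}{n} \mapsto \frac{2n - m}{m - n}$ can be explored in exact integer arithmetic. The Python sketch below is an illustration, not part of the proof: the function names and the starting pair $(17, 12)$ are hypothetical, and the brute-force search only covers denominators up to a chosen bound.

```python
def descent_step(m, n):
    """One step of the descent: m/n -> (2n - m)/(m - n)."""
    return 2 * n - m, m - n

def no_rational_sqrt2(max_denominator):
    """Brute-force check that m^2 = 2 n^2 has no solution with 1 <= n <= max_denominator."""
    for n in range(1, max_denominator + 1):
        for m in range(n + 1, 2 * n):      # 1 < sqrt(2) < 2 forces n < m < 2n
            if m * m == 2 * n * n:
                return False
    return True

m, n = 17, 12                  # hypothetical start: 17/12 is close to, but not equal to, sqrt(2)
m1, n1 = descent_step(m, n)
defect = m * m - 2 * n * n     # zero would mean (m/n)^2 = 2
defect1 = m1 * m1 - 2 * n1 * n1
searched = no_rational_sqrt2(200)
```

Expanding $(2n - m)^2 - 2(m - n)^2 = -(m^2 - 2n^2)$ shows that the step only changes the sign of the defect, so an exact representation of $\sqrt{2}$ (defect zero) would be carried to another exact one with a smaller denominator, as the argument above requires.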
Remark 3. The equation $x^2 + 2 = 0$ has no solution in $\mathbb{Q}$ either; in fact it has
no solution in the real numbers $\mathbb{R}$. This prompts introduction of the set of complex
numbers $\mathbb{C}$, which turns out to be an algebraically closed field containing the reals,
i.e. every nonconstant polynomial with real (even complex) coefficients has a root in $\mathbb{C}$. On the
other hand, $\mathbb{C}$ is not an ordered field, which explains why so much of analysis begins
with the real field $\mathbb{R}$.
2.2. Gaps in the order. The rational numbers can be decomposed into two
disjoint sets $A$ and $B$ with the properties that $A$ has no largest element and $B$ has
no smallest element, and such that every element in $A$ is less than every element
in $B$, thus leaving a gap in the order. By this we mean that we could insert a
new element labelled $X$, @ or even $\sqrt{2}$ into $\mathbb{Q}$ and extend the order on $\mathbb{Q}$ to the
larger set $\mathbb{Q} \cup \{X\}$ by declaring $p < X < q$ for all $p \in A$ and $q \in B$. Because this
extended order on $\mathbb{Q} \cup \{X\}$ satisfies Definition 1, and has the properties that $X$ is
the smallest element that is equal to or greater than everything in $A$, and $X$ is the
largest element that is equal to or less than everything in $B$, we say that the sets
$A$ and $B$ create a gap in the order of $\mathbb{Q}$.
For example, we can set
\[
\begin{aligned}
A &= \left\{ p \in \mathbb{Q} : \text{either } p \le 0 \text{ or } p^2 < 2 \right\}, \\
B &= \left\{ q \in \mathbb{Q} : q > 0 \text{ and } q^2 > 2 \right\}.
\end{aligned}
\tag{2.1}
\]
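To see that $A$ has no largest element, pick $p \in A$. We may assume that $p > 0$, and
since every $p$ in $A$ is less than 2 we have $0 < p < 2$. Set $\alpha = \frac{2 - p^2}{8}$ so that $0 < \alpha < \frac{1}{4}$.
Then
\[
\begin{aligned}
(p + \alpha)^2 &= p^2 + 2p\alpha + \alpha^2 < p^2 + 4\alpha + \frac{1}{4}\alpha \\
&= p^2 + \frac{1}{2}\left( 2 - p^2 \right) + \frac{1}{32}\left( 2 - p^2 \right) \\
&< p^2 + \left( 2 - p^2 \right) = 2.
\end{aligned}
\]
Thus $p + \alpha > p$ and $p + \alpha \in A$. The proof that $B$ has no smallest element is similar.
Finally, if $p \in A$ and $q \in B$ are both positive, we obtain $p < q$ from
\[ (q - p)(q + p) = q^2 - p^2 = \left( q^2 - 2 \right) + \left( 2 - p^2 \right) > 0, \]
by the definitions of $p \in A$ and $q \in B$, which gives $q - p > 0$. This completes the
demonstration that $A$ and $B$ create a gap in $\mathbb{Q}$.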
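The step $p \mapsto p + \alpha$ with $\alpha = \frac{2 - p^2}{8}$ can be iterated in exact rational arithmetic to watch elements of $A$ climb without ever leaving $A$. This is a minimal sketch: the function names, the starting value $p = 1$ and the number of iterations are hypothetical choices.

```python
from fractions import Fraction

def in_A(p):
    """Membership test for the set A in (2.1)."""
    return p <= 0 or p * p < 2

def bigger_element_of_A(p):
    """Given p in A with p > 0, return p + alpha where alpha = (2 - p^2) / 8."""
    alpha = (2 - p * p) / 8
    return p + alpha

p = Fraction(1)      # hypothetical starting element of A
chain = [p]
for _ in range(5):
    p = bigger_element_of_A(p)
    chain.append(p)
```

Every entry of `chain` is a rational number in $A$ and each is strictly larger than the one before, so no element of the chain can be the largest element of $A$.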
2.3. Nonexistence of solutions to differential equations. The differential
equation
\[ y' + x y^3 = 0 \]
has no solution on any open interval of rational numbers. Indeed, we can solve the
equation in the real line by separating variables;
\[ -\frac{1}{2} d\left( \frac{1}{y^2} \right) = \frac{dy}{y^3} = -x \, dx = -\frac{1}{2} d\left( x^2 \right), \]
\[ \frac{1}{y^2} = x^2 + C, \]
\[ y = \frac{1}{\sqrt{x^2 + C}}. \]
No matter what choice of integration constant $C \ne 0$ is made, and what choice
of interval $(a,b)$ with rational numbers $a < b$, there are lots of rational numbers
$x \in (a,b)$ for which $y = \frac{1}{\sqrt{x^2 + C}}$ is not rational.
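To get a feel for how rarely $\frac{1}{\sqrt{x^2 + C}}$ is rational at rational $x$, one can test a grid of rationals exactly. In the Python sketch below the constant $C = 1$, the interval $(0,1)$ and the grid spacing $\frac{1}{100}$ are hypothetical choices; the test uses the fact that a positive fraction in lowest terms is a rational square exactly when its numerator and denominator are both perfect squares.

```python
from fractions import Fraction
import math

def is_rational_square(q):
    """True if the nonnegative rational q (in lowest terms) is the square of a rational."""
    num, den = q.numerator, q.denominator
    return math.isqrt(num) ** 2 == num and math.isqrt(den) ** 2 == den

C = Fraction(1)                                    # hypothetical integration constant
grid = [Fraction(k, 100) for k in range(1, 100)]   # rationals k/100 in (0, 1)
rational_values = [x for x in grid if is_rational_square(x * x + C)]
```

On this grid the only surviving point is $x = \frac{3}{4}$, where $x^2 + 1 = \frac{25}{16}$ and $y = \frac{4}{5}$; at the other 98 grid points $y$ is irrational.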
3. The real field
With regard to the problem of describing what is meant by the continuity of a line
segment, J. W. R. Dedekind published his famous construction of the real numbers
using Dedekind cuts in 1872. Some years earlier he had described his seminal idea
in the following way: "By this commonplace remark the secret of continuity is to be
revealed", the idea in question being
In any division of the points of the segment into two parts such that each
point belongs to one and only one class, and such that every point of the
one class is to the left of every point in the other, there is one and only
one point that brings about the division.
We present here a modification of this idea due to Bertrand Russell (born 1872,
the year of Dedekind's publication). Heuristically, following Russell, a Dedekind
cut $\alpha \subset \mathbb{Q}$ is a "left infinite interval open on the right" of rational numbers that
is associated with the "real number" on the number line that marks its right hand
endpoint. More precisely, a cut $\alpha$ is a subset of $\mathbb{Q}$ satisfying (here $p$ and $q$ denote
rational numbers)
\[
\begin{aligned}
&\alpha \ne \emptyset \text{ and } \alpha \ne \mathbb{Q}, \\
&p \in \alpha \text{ and } q < p \text{ implies } q \in \alpha, \\
&p \in \alpha \text{ implies there is } q \in \alpha \text{ with } p < q.
\end{aligned}
\tag{3.1}
\]
One can define an ordered field structure on the set of cuts, which we identify
as the field $\mathbb{R}$ of real numbers, and prove that this ordered field has the famous
Least Upper Bound Property defined below. It is this property that evolves into
the critical Heine-Borel property of Euclidean space, namely that every closed and
bounded subset is compact, and this property in turn ultimately permits the familiar
existence theorems for ordinary and partial differential equations. We remark that
a copy of the rational number field $\mathbb{Q}$ can be identified inside the real field $\mathbb{R}$ of
Dedekind cuts by associating to each $r \in \mathbb{Q}$ the cut
\[ \alpha = (-\infty, r) = \{ p \in \mathbb{Q} : p < r \}. \]
Alternatively, one can define an ordered field structure on the set of equivalence
classes of Cauchy sequences in $\mathbb{Q}$, and this produces an ordered field isomorphic to
$\mathbb{R}$. We will construct the real numbers using Dedekind cuts at the end of this
chapter, and leave the construction with Cauchy sequences to a later chapter. But
first we study some of the consequences of an ordered field with the least upper
bound property. For this we introduce precise definitions of these concepts.
Definition 2. A field $\mathbb{F}$ is a set with two binary operations, called addition
and multiplication, that satisfy the following three sets of axioms. We often write
$\mathbb{F}$ for the underlying set, $x + y$ for the operation of addition applied to $x, y \in \mathbb{F}$, and
juxtaposition $xy$ for the operation of multiplication applied to $x, y \in \mathbb{F}$.
(1) Addition Axioms
(a) (closure) $x + y \in \mathbb{F}$ for all $x, y \in \mathbb{F}$,
(b) (commutativity) $x + y = y + x$ for all $x, y \in \mathbb{F}$,
(c) (associativity) $(x + y) + z = x + (y + z)$ for all $x, y, z \in \mathbb{F}$,
(d) (additive identity) There is an element $0 \in \mathbb{F}$ such that
$0 + x = x$ for all $x \in \mathbb{F}$,
(e) (inverses) For each $x \in \mathbb{F}$ there is an element $-x \in \mathbb{F}$ such that
$x + (-x) = 0$.
(2) Multiplication Axioms
(a) (closure) $xy \in \mathbb{F}$ for all $x, y \in \mathbb{F}$,
(b) (commutativity) $xy = yx$ for all $x, y \in \mathbb{F}$,
(c) (associativity) $(xy)z = x(yz)$ for all $x, y, z \in \mathbb{F}$,
(d) (multiplicative identity) There is an element $1 \in \mathbb{F}$ such that
$1x = x$ for all $x \in \mathbb{F}$,
(e) (inverses) For each $x \in \mathbb{F} \setminus \{0\}$ there is an element $\frac{1}{x} \in \mathbb{F}$ such that
$x \left( \frac{1}{x} \right) = 1$.
(3) Distributive Law
\[ x(y + z) = xy + xz \]
for all $x, y, z \in \mathbb{F}$.
Example 3. The set of rational numbers $\mathbb{Q}$ is a field with the usual operations
of addition and multiplication. Another example is given by the finite set of integers
\[ \mathbb{F}_p = \{ 0, 1, 2, \ldots, p - 1 \}, \]
with addition and multiplication defined modulo $p$. This turns out to be a field if
and only if $p$ is a prime number. Details are left to the reader.
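The claim in Example 3 can be checked by brute force for small moduli. The integers modulo $p$ always satisfy the addition axioms, the distributive law, and every multiplication axiom except possibly the existence of inverses, so the Python sketch below tests only that one axiom. The function names and the range of moduli are hypothetical choices, and a finite check is of course not a proof.

```python
def has_all_inverses_mod(p):
    """True if every nonzero residue modulo p has a multiplicative inverse."""
    return p >= 2 and all(any((a * b) % p == 1 for b in range(1, p))
                          for a in range(1, p))

def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    return p >= 2 and all(p % d != 0 for d in range(2, int(p ** 0.5) + 1))

moduli = range(2, 60)   # hypothetical range of moduli to test
agree = all(has_all_inverses_mod(p) == is_prime(p) for p in moduli)
```

For a composite modulus such as $6$, the residue $2$ has no inverse because $2 \cdot 3 = 0$ modulo $6$, which is the obstruction the check detects.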
All of the familiar algebraic identities that hold for the rational numbers hold
also in any field. We state the most common such algebraic identities below, leaving
for the reader some of the routine proofs.
Proposition 1. Let $\mathbb{F}$ be a set on which there are defined binary operations of
addition and multiplication.
(1) The addition axioms imply
(a) $x + y = x + z \Longrightarrow y = z$,
(b) $x + y = x \Longrightarrow y = 0$,
(c) $x + y = 0 \Longrightarrow y = -x$,
(d) $-(-x) = x$.
(2) The multiplication axioms imply
(a) $x \ne 0$ and $xy = xz \Longrightarrow y = z$,
(b) $x \ne 0$ and $xy = x \Longrightarrow y = 1$,
(c) $x \ne 0$ and $xy = 1 \Longrightarrow y = \frac{1}{x}$,
(d) $x \ne 0 \Longrightarrow \frac{1}{1/x} = x$.
(3) The field axioms imply
(a) $0x = 0$,
(b) $x \ne 0$ and $y \ne 0 \Longrightarrow xy \ne 0$,
(c) $(-x)y = -(xy) = x(-y)$,
(d) $(-x)(-y) = xy$.
By way of illustration we prove the final equality $(-x)(-y) = xy$ by a method
that also establishes (1) (a) (c) (d) and (3) (a) (c) along the way (much shorter proofs
also exist). For this we begin with the additive cancellation property (1) (a): if
$x + y = x + z$ then
\[
\begin{aligned}
y &= 0 + y = (-x + x) + y = -x + (x + y) \\
&= -x + (x + z) \quad \text{by assumption} \\
&= (-x + x) + z = 0 + z = z.
\end{aligned}
\]
Taking $z = -x$ this gives (1) (c) (uniqueness of additive inverses), and since $(-x) + x = 0$,
(1) (c) then gives $x = -(-x)$, which is (1) (d). Next we note that
\[ (-x)y + xy = (-x + x)y = 0y = 0, \tag{3.2} \]
where the final equality follows from applying additive cancellation (1) (a) to
\[ 0y + 0y = (0 + 0)y = 0y = 0y + 0. \]
By applying (1) (c) to (3.2) we obtain
\[ xy = -((-x)y). \tag{3.3} \]
If we interchange $x$ and $y$ in (3.3) and use multiplicative commutativity, we also
obtain
\[ xy = yx = -((-y)x) = -(x(-y)). \tag{3.4} \]
Finally, with $x$ replaced by $-x$ and $y$ replaced by $-y$ in (3.3) we have
\[ (-x)(-y) = -((-(-x))(-y)) = -(x(-y)), \]
which when combined with (3.4) yields $(-x)(-y) = xy$ as required.
Now we combine the field and order properties. By $x > y$ we mean $y < x$.
Definition 3. An ordered field is a field $\mathbb{F}$ together with an order $<$ on the set
$\mathbb{F}$ where the field and order structures are connected by the following two additional
axioms:
(1) $x + y < x + z$ if $x, y, z \in \mathbb{F}$ and $y < z$,
(2) $xy > 0$ if $x, y \in \mathbb{F}$ and both $x > 0$ and $y > 0$.
Example 4. The field of rational numbers $\mathbb{Q}$ is an ordered field with the usual
order, but for $p$ a prime, there is no order on the field $\mathbb{F}_p$ that satisfies
Definition 3.
All of the customary rules for manipulating inequalities in the rational numbers
hold also in any ordered field. We state the most common such properties below,
without giving the routine proofs.
Proposition 2. The following hold in any ordered field.
(1) $x > 0$ if and only if $-x < 0$,
(2) $xy < xz$ if $x > 0$ and $y < z$,
(3) $xy > xz$ if $x < 0$ and $y < z$,
(4) $x^2 > 0$ if $x \ne 0$,
(5) $1 > 0$,
(6) $0 < \frac{1}{y} < \frac{1}{x}$ if $0 < x < y$.
Now we come to the most important property an ordered field can have, one
that is essential for the success of analysis, but is not satisfied in the ordered field
of rational numbers $\mathbb{Q}$.
Definition 4. Let $<$ be an order on a set $S$.
(1) We say that $x \in S$ is an upper bound for a subset $E$ of $S$ if
$y \le x$ for all $y \in E$.
(2) We say that a subset $E$ is bounded above if it has at least one upper
bound.
(3) We say that $x \in S$ is the least upper bound for a subset $E$ of $S$ if $x$ is an
upper bound for $E$ and if $z$ is any other upper bound for $E$, then $x \le z$.
In this case we write
\[ x = \sup E. \]
Clearly the least upper bound of a subset $E$, if it exists, is unique. Consider
the ordered set of rational numbers $\mathbb{Q}$. Then 3 is an upper bound for the interval
$E = [0,3] = \{ x \in \mathbb{Q} : 0 \le x \le 3 \}$, and so are 4 and $2^{100}$. In fact it is easy to see
that 3 is the least upper bound for $[0,3]$. An example of a subset that has no least
upper bound is the semi-infinite interval $[0, \infty) = \{ x \in \mathbb{Q} : 0 \le x < \infty \}$, since it has
no upper bounds at all! A more substantial example of a bounded set that has no
least upper bound is the set $A$ defined in (2.1).
There are corresponding definitions of lower bound, bounded below, greatest
lower bound and $\inf E$, whose formulations we leave to the reader.
Definition 5. An ordered set $S$ has the Least Upper Bound Property if every
subset $E$ of $S$ that is bounded above has a least upper bound.
The ordered set of rational numbers $\mathbb{Q}$ fails to have this crucial property, as
evidenced by the existence of the set $A$ in (2.1). An example of a nontrivial ordered
set with the Least Upper Bound Property is the set of all ordinal numbers equal to
or less than the first uncountable ordinal.
Remark 4. If $S$ has the Least Upper Bound Property, it also has the Greatest
Lower Bound Property: every subset $E$ of $S$ that is bounded below has a greatest
lower bound. To see this, suppose $E$ is bounded below and let $L$ be the nonempty set
of lower bounds. Then $L$ is bounded above by every element of $E$ and in particular
$\alpha = \sup L$ exists. Now $\alpha = \inf E$ follows from the following two facts:
(1) If $x \in E$, then $x$ is an upper bound for $L$ and since $\alpha$ is the least of the
upper bounds for $L$, we have $\alpha \le x$. Thus $\alpha$ is a lower bound for $E$.
(2) If $\beta > \alpha$, then $\beta \notin L$ since $\alpha$ is an upper bound of $L$. It follows that $\alpha$ is
the greatest of the lower bounds for $E$.
It turns out that the only ordered field that has the Least Upper Bound
Property is (up to isomorphism) the ordered field of real numbers $\mathbb{R}$, which we have not
yet constructed. Before embarking on the construction of the real numbers using
Dedekind cuts, it will be useful to derive some consequences of the Least Upper
Bound Property in an ordered field. Just so we can be certain we are not working
in a vacuum, we state the basic existence theorem whose proof is deferred to the
end of this chapter.
Theorem 1. There exists an ordered field $\mathbb{R}$ having the Least Upper Bound
Property. Moreover, such a field is uniquely determined up to isomorphism (of
ordered fields) and contains (an isomorphic copy of) the rational field $\mathbb{Q}$ as a subfield.
Assuming this existence theorem for the moment we derive some properties of
ordered fields with the Least Upper Bound Property. We note that we could also
prove these properties by appealing to the explicit construction of the real numbers
by Dedekind cuts below, but the approach used here is more streamlined in that
it avoids the complexities inherent in the construction of the reals. We begin with
two familiar properties shared by the field of rational numbers.
Proposition 3. Let $x, y \in \mathbb{R}$.
(1) (Archimedean property) If $x > 0$, then there is a positive integer $n$ such
that $nx > y$.
(2) (density of rationals) If $x < y$ then there is $p \in \mathbb{Q}$ such that $x < p < y$.
Proof: To prove assertion (1) by contradiction, let $E = \{ nx : n \in \mathbb{N} \}$. If (1)
were false, then $y$ would be an upper bound for $E$ and consequently $\alpha = \sup E$
would exist. Since $x > 0$, we would have $\alpha - x < \alpha$ and thus that $\alpha - x$ could not
be an upper bound for $E$. But then there would be some $nx$ greater than $\alpha - x$
and this gives
\[
\begin{aligned}
\alpha &= (\alpha - x) + x \\
&< nx + x \\
&= (n + 1)x \in E,
\end{aligned}
\]
which contradicts the assumption that $\alpha$ is an upper bound for $E$.
Remark 5. The above proof shows that for every $x > 0$, the set $E_x = \{ nx : n \in \mathbb{N} \}$
is not bounded above. In fact, this statement is equivalent to the Archimedean property
(1).
Remark 6. A simple consequence of the Archimedean property is that
$n = \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}}$ is not 0. Thus we can embed the natural numbers $\mathbb{N}$ inside $\mathbb{R}$, and
hence also the integers $\mathbb{Z}$ and the rational numbers $\mathbb{Q}$. It is with respect to this
embedding of $\mathbb{Q}$ into $\mathbb{R}$ that the density of rationals refers in assertion (2) of
Proposition 3.
To prove assertion (2), use assertion (1) to choose $n \in \mathbb{N}$ such that $n(y - x) > 1$.
Use assertion (1) twice more to obtain integers $m_1$ and $m_2$ satisfying $m_1 > nx$ and
$m_2 > -nx$. Thus we have both
\[ n(y - x) > 1 \text{ and } -m_2 < nx < m_1. \]
Because $m_1 - (-m_2) > nx + (-nx) = 0$, i.e. $m_1 - (-m_2) \ge 1$, it follows that there
is an integer $m$ lying between $-m_2$ and $m_1$ such that
\[ m - 1 \le nx < m. \]
Combining inequalities yields
\[ nx < m \le 1 + nx < ny, \]
and since $n > 0$ we obtain
\[ x < \frac{m}{n} < y. \]
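The proof of assertion (2) is constructive, and its two choices (first $n$ with $n(y - x) > 1$, then the integer $m$ with $m - 1 \le nx < m$) translate directly into code. The Python sketch below is an illustration under stated assumptions: the function name and the sample inputs are hypothetical, and with floating-point inputs the result is only as reliable as the float arithmetic.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a rational m/n with x < m/n < y, following the proof of Proposition 3 (2)."""
    if not x < y:
        raise ValueError("need x < y")
    n = math.floor(1 / (y - x)) + 1   # guarantees n * (y - x) > 1
    m = math.floor(n * x) + 1         # guarantees m - 1 <= n * x < m
    return Fraction(m, n)

p_exact = rational_between(Fraction(1, 3), Fraction(1, 2))   # exact rational inputs
p_float = rational_between(math.sqrt(2), 1.415)              # floating-point inputs
```

Using `math.floor` replaces the appeal to the Archimedean property: it produces the integers whose existence the proof asserts.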
Similar reasoning can be used to obtain the existence of positive $n$-th roots of
positive numbers in an ordered field with the least upper bound property. This
property is not shared by the field of rational numbers.
Proposition 4. (existence of $n$-th roots) If $x$ is a positive real number and $n$
is a positive integer, then there exists a unique positive real number $y$ satisfying
$y^n = x$.
Proof. Let $E = \{ z \in \mathbb{R} : 0 \le z \text{ and } z^n < x \}$. Now $E$ is nonempty since $0 \in E$.
Also, $E$ is bounded above by $\max\{x, 1\}$, since if $z > \max\{x, 1\}$, then
\[ z^n > \max\{x^n, 1^n\} = \max\{x^n, 1\} \ge x \]
implies that $z \notin E$. The final inequality in the display above follows by considering
two cases separately: if $x \le 1$ the inequality holds trivially; while if $x > 1$, then
$x^n \ge x$ follows by induction on $n$. Hence $y = \sup E$ exists.
Using an argument similar to that following (2.1) one can now show that each
of the inequalities $y^n < x$ and $y^n > x$ leads to a contradiction, leaving only the
possibility that $y^n = x$. Indeed, suppose in order to derive a contradiction, that
$y^n < x$ so that $y \in E$. If we take $\alpha = \frac{x - y^n}{N}$ where $N \ge 2$ is a large integer to be
chosen below, then we have both $y \le \max\{x, 1\}$ and $\alpha \le \frac{x}{N}$, so that
\[
\begin{aligned}
(y + \alpha)^n &= y^n + \sum_{k=1}^{n} \binom{n}{k} y^{n-k} \alpha^k = y^n + \alpha \sum_{k=1}^{n} \binom{n}{k} y^{n-k} \alpha^{k-1} \\
&\le y^n + \frac{x - y^n}{N} \sum_{k=1}^{n} \binom{n}{k} \left( \max\{x, 1\} \right)^{n-k} \left( \frac{x}{N} \right)^{k-1} \\
&\le y^n + \frac{x - y^n}{N} \sum_{k=0}^{n} \binom{n}{k} (x + 1)^{n-k} (x + 1)^k \\
&= y^n + \frac{x - y^n}{N} (2x + 2)^n.
\end{aligned}
\]
Now use the Archimedean property to choose $N > (2x + 2)^n$, and obtain that
\[ (y + \alpha)^n \le y^n + (x - y^n) \frac{(2x + 2)^n}{N} < y^n + (x - y^n) = x. \]
This shows that $y + \alpha \in E$, contradicting $y = \sup E$. Similarly, the inequality $y^n > x$
leads to a contradiction, and this completes the proof that $y^n = x$. Uniqueness of
such a positive solution $y$ is obvious. See page 10 of [3] for a somewhat different
presentation of this argument.
Note that $\sup A = \sqrt{2}$ where $A$ is the set in (2.1).
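The description of the root as $y = \sup E$ with $E = \{ z \ge 0 : z^n < x \}$ suggests a simple approximation scheme: bisect the interval $[0, \max\{x, 1\}]$, keeping the left endpoint in $E$ and the right endpoint an upper bound for $E$. This Python sketch uses bisection as a stand-in for the supremum, which is not the argument given in the proof, and the iteration count is a hypothetical choice suited to double precision.

```python
def nth_root(x, n, iterations=100):
    """Approximate sup{z >= 0 : z^n < x} for x > 0 by bisection on [0, max(x, 1)]."""
    lo, hi = 0.0, max(x, 1.0)   # lo lies in E; hi is an upper bound for E
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid            # mid is in E, so sup E >= mid
        else:
            hi = mid            # mid is an upper bound for E
    return lo

root2 = nth_root(2.0, 2)        # approximates sup A for the set A in (2.1)
root27 = nth_root(27.0, 3)
root_quarter = nth_root(0.25, 2)
```

The starting upper bound $\max\{x, 1\}$ is the same one used in the proof of Proposition 4, and it matters for $x < 1$, where the root is larger than $x$.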
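Corollary 1. If $x$ and $y$ are positive real numbers and $n$ is a positive integer,
then $x^{\frac{1}{n}} y^{\frac{1}{n}} = (xy)^{\frac{1}{n}}$.
Proof: By the commutativity of multiplication we have
\[
\begin{aligned}
\left( x^{\frac{1}{n}} y^{\frac{1}{n}} \right)^n &= \left( x^{\frac{1}{n}} y^{\frac{1}{n}} \right) \left( x^{\frac{1}{n}} y^{\frac{1}{n}} \right) \cdots \left( x^{\frac{1}{n}} y^{\frac{1}{n}} \right) \\
&= \left( x^{\frac{1}{n}} \right) \left( x^{\frac{1}{n}} \right) \cdots \left( x^{\frac{1}{n}} \right) \left( y^{\frac{1}{n}} \right) \left( y^{\frac{1}{n}} \right) \cdots \left( y^{\frac{1}{n}} \right) \\
&= \left( x^{\frac{1}{n}} \right)^n \left( y^{\frac{1}{n}} \right)^n = xy.
\end{aligned}
\]
By the uniqueness assertion of Proposition 4 we then conclude that $x^{\frac{1}{n}} y^{\frac{1}{n}} = (xy)^{\frac{1}{n}}$.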
4. The complex field
Property (4) of Proposition 2 on ordered fields shows that there is no real
number $x$ satisfying the equation $x^2 = -1$. To remedy this situation, we define
the complex field $\mathbb{C}$ to be the field obtained from the real field $\mathbb{R}$ by adjoining an
abstract symbol $i$ that is declared to satisfy the equation
\[ i^2 = -1. \tag{4.1} \]
Thus $\mathbb{C}$ consists of all expressions of the form
\[ z = x + iy, \quad x, y \in \mathbb{R}, \]
which can be identified with the "points in the plane" by associating $z = x + iy \in \mathbb{C}$
with $(x, y) \in \mathbb{R} \times \mathbb{R}$ in the plane. The field structure on $\mathbb{C}$ uses the multiplication
rule derived from (4.1) by
\[
\begin{aligned}
zw &= (x + iy)(u + iv) \\
&= \left( xu + i^2 yv \right) + i(xv + yu) \\
&= (xu - yv) + i(xv + yu),
\end{aligned}
\tag{4.2}
\]
where $z = x + iy$ and $w = u + iv$. For the most part, straightforward calculations
show that this multiplication and the usual addition derived from vectors in the
plane $\mathbb{R} \times \mathbb{R}$,
\[ (x + iy) + (u + iv) = (x + u) + i(y + v), \]
satisfy the addition axioms, the multiplication axioms and the distributive law of
a field. Only the existence of a multiplicative inverse needs some elaboration. For
this we need the following definition.
Definition 6. Suppose $z = x + iy \in \mathbb{C}$. The complex conjugate $\bar{z}$ of $z$ is defined
to be
\[ \bar{z} = x - iy. \]
Now
\[
z\bar{z} = (x + iy)(x - iy) = x^2 - (iy)^2 + i\{yx - xy\} = x^2 + y^2,
\]
and by Proposition 4, the nonnegative real number $\sqrt{x^2 + y^2}$ exists and is unique.
By Pythagoras' theorem,
\[
\sqrt{x^2 + y^2} = \sqrt{z\bar{z}}
\]
is the distance between the complex numbers $0$ and $z$ when they are viewed as the
points $(0, 0)$ and $(x, y)$ in the plane. We define
\[
|z| = \sqrt{z\bar{z}}, \qquad z \in \mathbb{C},
\]
called the absolute value of $z$, and note that for $z \in \mathbb{C} \setminus \{0\}$, the multiplicative
inverse of $z$ is given by $z^{-1} = \bar{z}/|z|^2$ since
\[
z\left(z^{-1}\right) = z\,\frac{\bar{z}}{|z|^2} = \frac{z\bar{z}}{|z|^2} = \frac{|z|^2}{|z|^2} = 1.
\]
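As a quick numerical sanity check of the inverse formula $z^{-1} = \bar{z}/|z|^2$, the following sketch uses Python's built-in complex type. The helper name `inverse` and the sample value are illustrative assumptions, not part of the notes.

```python
def inverse(z: complex) -> complex:
    """Multiplicative inverse via the conjugate: conj(z) / |z|^2."""
    if z == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    # |z|^2 = x^2 + y^2, computed without taking a square root.
    return z.conjugate() / (z.real ** 2 + z.imag ** 2)

z = complex(3, -4)          # hypothetical sample point with |z| = 5
product = z * inverse(z)    # should equal 1 up to rounding error
```

The check relies only on the identity $z\bar{z} = x^2 + y^2$ derived above, so no square root is ever needed to invert $z$.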
We now make three observations.
(1) An immediate consequence of property (4) of Proposition 2 is that there is
no order on $\mathbb{C}$ that makes it into an ordered field with this field structure.
(2) It is a fundamental theorem in algebra, in fact it is called the fundamental
theorem of algebra, that we do not need to adjoin any further solutions
of polynomial equations: every polynomial equation
\[
z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0 = 0
\]
has a solution $z$ in the complex field $\mathbb{C}$. Here the coefficients $a_0, a_1, \dots, a_{n-1}$
are complex numbers.
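(3) If we associate $z = x + iy$ to the matrix $\begin{bmatrix} x & -y \\ y & x \end{bmatrix}$, then this multiplication
corresponds to matrix multiplication:
\[
\begin{aligned}
[z][w] &= \begin{bmatrix} x & -y \\ y & x \end{bmatrix} \begin{bmatrix} u & -v \\ v & u \end{bmatrix} \\
&= \begin{bmatrix} xu - yv & -xv - yu \\ yu + xv & -yv + xu \end{bmatrix} = [zw].
\end{aligned} \tag{4.3}
\]
Since the matrix
\[
\begin{bmatrix} x & -y \\ y & x \end{bmatrix} = r \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\]
is dilation by the nonnegative number $r = \sqrt{x^2 + y^2} = |z|$ and rotation
by the angle $\theta = \tan^{-1}\frac{y}{x}$ (for $x > 0$) in the counterclockwise direction, we see that if
$z$ has polar coordinates $(r, \theta)$ and $w$ has polar coordinates $(s, \varphi)$, then $zw$
has polar coordinates $(rs, \theta + \varphi)$. Finally we note that the inverse of the
matrix $M = \begin{bmatrix} x & -y \\ y & x \end{bmatrix}$ is given by
\[
M^{-1} = \frac{1}{\det M}\,[\operatorname{co} M]^{t}
= \frac{1}{x^2 + y^2} \begin{bmatrix} x & y \\ -y & x \end{bmatrix}
= \begin{bmatrix} \frac{x}{x^2 + y^2} & \frac{y}{x^2 + y^2} \\ \frac{-y}{x^2 + y^2} & \frac{x}{x^2 + y^2} \end{bmatrix},
\]
which agrees with $z^{-1} = \frac{\bar{z}}{|z|^2} = \frac{x}{x^2 + y^2} - i\,\frac{y}{x^2 + y^2}$ ($M$ is the matrix representation
of the real linear map induced on $\mathbb{R}^2$ by the map of complex
multiplication on $\mathbb{C} = \mathbb{R}^2$ by $z = x + iy$).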
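The correspondence between complex multiplication and multiplication of the matrices $\begin{bmatrix} x & -y \\ y & x \end{bmatrix}$ can be spot-checked numerically. This is a minimal sketch with hand-rolled 2x2 helpers; the names `as_matrix` and `matmul` and the sample values are assumptions made for illustration.

```python
def as_matrix(z: complex):
    """Represent z = x + iy as the 2x2 real matrix [[x, -y], [y, x]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z, w = complex(1, 2), complex(3, -1)       # hypothetical sample values
lhs = matmul(as_matrix(z), as_matrix(w))   # [z][w]
rhs = as_matrix(z * w)                     # [zw]
```

For these integer-valued samples the two matrices agree exactly; for general floating-point inputs one would compare entries up to a small tolerance instead.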
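Finally we give some simple properties of the complex conjugate and absolute
value functions. If $z = x + iy$ we write $\operatorname{Re} z = x$ and $\operatorname{Im} z = y$.
Proposition 5. Let $\bar{z}$ and $|z|$ denote the complex conjugate and absolute value
of $z$.
(1) Suppose $z, w \in \mathbb{C}$. Then
(a) $\overline{z + w} = \bar{z} + \bar{w}$, $\overline{(zw)} = (\bar{z})(\bar{w})$ and $z + \bar{z} = 2\operatorname{Re} z$,
(b) $|0| = 0$ and $|z| > 0$ unless $z = 0$,
(c) $|\bar{z}| = |z|$,
(d) $|zw| = |z|\,|w|$,
(e) $|\operatorname{Re} z| \le |z|$,
(f) $|z + w| \le |z| + |w|$.
(2) (Cauchy-Schwarz inequality) Suppose $z_1, \dots, z_n \in \mathbb{C}$ and $w_1, \dots, w_n \in \mathbb{C}$.
Then
\[
\left|\sum_{j=1}^{n} z_j \overline{w_j}\right|^2 = \left|z_1 \overline{w_1} + \dots + z_n \overline{w_n}\right|^2 \le \left(\sum_{j=1}^{n} |z_j|^2\right)\left(\sum_{j=1}^{n} |w_j|^2\right).
\]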
Proof: Assertions (1) (a) (b) (c) (e) are easy. If $z = x + iy$ and $w = u + iv$ then
from (4.2),
\[
\begin{aligned}
|zw|^2 &= |(xu - yv) + i(xv + yu)|^2 = (xu - yv)^2 + (xv + yu)^2 \\
&= x^2u^2 - 2xuyv + y^2v^2 + x^2v^2 + 2xvyu + y^2u^2 \\
&= \left(x^2 + y^2\right)\left(u^2 + v^2\right) = |z|^2\,|w|^2,
\end{aligned}
\]
and now the uniqueness assertion of Proposition 4 proves (1) (d).
Next we compute
\[
\begin{aligned}
|z + w|^2 &= (z + w)\overline{(z + w)} = (z + w)(\bar{z} + \bar{w}) \\
&= z\bar{z} + z\bar{w} + w\bar{z} + w\bar{w} \\
&= |z|^2 + 2\operatorname{Re}(z\bar{w}) + |w|^2 \\
&\le |z|^2 + 2|z\bar{w}| + |w|^2 \\
&= |z|^2 + 2|z|\,|w| + |w|^2 = (|z| + |w|)^2,
\end{aligned}
\]
and the uniqueness assertion of Proposition 4 now proves (1) (f).
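Finally, to obtain (2), set
\[
Z = \sum_{j=1}^{n} |z_j|^2 \quad\text{and}\quad W = \sum_{j=1}^{n} |w_j|^2 \quad\text{and}\quad D = \sum_{j=1}^{n} z_j \overline{w_j},
\]
so that we must prove
\[
|D|^2 \le ZW. \tag{4.4}
\]
If $W = 0$ then both sides of (4.4) vanish. Otherwise, we have
\[
\begin{aligned}
\sum_{j=1}^{n} |W z_j - D w_j|^2 &= \sum_{j=1}^{n} (W z_j - D w_j)\left(W \overline{z_j} - \overline{D}\,\overline{w_j}\right) \\
&= W^2 \sum_{j=1}^{n} |z_j|^2 - W\overline{D} \sum_{j=1}^{n} z_j \overline{w_j} - DW \sum_{j=1}^{n} \overline{z_j} w_j + |D|^2 \sum_{j=1}^{n} |w_j|^2 \\
&= W^2 Z - W\overline{D}D - DW\overline{D} + |D|^2 W \\
&= W^2 Z - W|D|^2 = W\left(WZ - |D|^2\right),
\end{aligned}
\]
and since $W > 0$ we obtain
\[
WZ - |D|^2 = \frac{1}{W} \sum_{j=1}^{n} |W z_j - D w_j|^2 \ge 0.
\]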
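The identity that drives this proof, $\sum_j |W z_j - D w_j|^2 = W(WZ - |D|^2)$ with $Z = \sum_j |z_j|^2$, $W = \sum_j |w_j|^2$ and $D = \sum_j z_j \overline{w_j}$, can be checked numerically. The sketch below assumes randomly generated sample vectors with a fixed seed; these are illustrative choices only.

```python
import random

random.seed(0)
n = 5
# Hypothetical sample vectors in C^n.
z = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
w = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

Z = sum(abs(a) ** 2 for a in z)
W = sum(abs(b) ** 2 for b in w)
D = sum(a * b.conjugate() for a, b in zip(z, w))

# Identity behind the proof: sum |W z_j - D w_j|^2 = W (W Z - |D|^2).
lhs = sum(abs(W * a - D * b) ** 2 for a, b in zip(z, w))
rhs = W * (W * Z - abs(D) ** 2)
```

Since the left side is a sum of squares, it is nonnegative, which is exactly why $|D|^2 \le ZW$ follows once $W > 0$.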
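4.1. Euclidean spaces. For $\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R} = \mathbb{R}^n$, we
define
\[
\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2},
\]
and interpret $\|\mathbf{x}\|$ as the distance from the point $\mathbf{x}$ to the origin $\mathbf{0} = (0, 0, \dots, 0)$,
which is reasonable since it agrees with Pythagoras' theorem. We call $\mathbb{R}^n$ the
Euclidean space of dimension $n$. For $\mathbf{z}, \mathbf{w} \in \mathbb{R}^n$, we define the dot product of $\mathbf{z}$ and
$\mathbf{w}$ by
\[
\mathbf{z} \cdot \mathbf{w} = z_1 w_1 + z_2 w_2 + \dots + z_n w_n = \sum_{j=1}^{n} z_j w_j.
\]
The Cauchy-Schwarz inequality, when restricted to real numbers, says that
\[
|\mathbf{z} \cdot \mathbf{w}| \le \|\mathbf{z}\|\,\|\mathbf{w}\|, \qquad \mathbf{z}, \mathbf{w} \in \mathbb{R}^n.
\]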
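Remark 7. The proof of the Cauchy-Schwarz inequality given above is motivated
by the fact that in a Euclidean space, the point on the line through $\mathbf{0}$ and $\mathbf{w}$
that is closest to $\mathbf{z}$ is the projection $P\mathbf{z}$ of $\mathbf{z}$ onto the line through $\mathbf{0}$ and $\mathbf{w}$ given by
\[
P\mathbf{z} = \left(\mathbf{z} \cdot \frac{\mathbf{w}}{\|\mathbf{w}\|}\right) \frac{\mathbf{w}}{\|\mathbf{w}\|} = \frac{\mathbf{z} \cdot \mathbf{w}}{\|\mathbf{w}\|^2}\,\mathbf{w}.
\]
Then, with $W = \|\mathbf{w}\|^2$ and $D = \mathbf{z} \cdot \mathbf{w}$,
\[
\|\mathbf{z} - P\mathbf{z}\|^2 = \sum_{j=1}^{n} \left| z_j - \frac{\mathbf{z} \cdot \mathbf{w}}{\|\mathbf{w}\|^2}\, w_j \right|^2
= \frac{1}{\|\mathbf{w}\|^4} \sum_{j=1}^{n} \left| \|\mathbf{w}\|^2 z_j - (\mathbf{z} \cdot \mathbf{w})\, w_j \right|^2
= \frac{1}{\|\mathbf{w}\|^4} \sum_{j=1}^{n} |W z_j - D w_j|^2.
\]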
5. Dedekind's construction of the real numbers
Recall that a Dedekind cut $\alpha$ is a subset of $\mathbb{Q}$ satisfying (3.1):
$\alpha \neq \emptyset$ and $\alpha \neq \mathbb{Q}$,
$p \in \alpha$ and $q < p$ implies $q \in \alpha$,
$\alpha$ has no largest element.
We set
\[
\mathbb{R} = \{\alpha : \alpha \text{ is a cut}\},
\]
and define an order $<$ and two binary operations, addition $+$ and multiplication $\cdot$,
on the set $\mathbb{R}$ and then demonstrate that $\mathbb{R}$ satisfies the axioms for an ordered field
with the Least Upper Bound Property. We proceed in six steps, giving proofs only
when there is some trick involved, or the result is especially important. The letters
$p, q, r, s, t$ always denote rational numbers and the Greek letters $\alpha, \beta, \gamma, \delta$ always
denote cuts. See pages 17-21 of [3] for the details.
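Step 1: Define $\alpha < \beta$ if $\alpha$ is a proper subset of $\beta$. Then $(\mathbb{R}, <)$ is an ordered
set.
Step 2: $(\mathbb{R}, <)$ has the Least Upper Bound Property.
Proof: To see this, suppose that $E$ is a nonempty subset of $\mathbb{R}$ that is bounded
above by $\beta \in \mathbb{R}$. Define
\[
\gamma = \bigcup_{\alpha \in E} \alpha.
\]
One can now show that $\gamma$ is a cut ($\gamma \neq \emptyset$ since there exists $\alpha\,(\neq \emptyset) \in E$ and then
$\alpha \subset \gamma$; $\gamma \neq \mathbb{Q}$ since $\gamma \subset \beta$ and $\beta \neq \mathbb{Q}$; if $p \in \gamma$, then there is $\alpha \in E$ with $p \in \alpha$, and
it follows that every $q$ less than $p$ is in $\alpha \subset \gamma$ and there is $r$ in $\alpha \subset \gamma$ that is larger
than $p$), and clearly $\gamma$ is then an upper bound for $E$ since $\alpha \subset \gamma$ for all $\alpha \in E$.
Moreover, $\gamma$ is the least upper bound, written $\gamma = \sup E$, since any upper bound
must contain at least each set $\alpha \in E$. Note how easily we obtained the Least Upper
Bound Property by this construction!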
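Step 3: If $\alpha, \beta \in \mathbb{R}$, define
\[
\alpha + \beta = \{p + q : p \in \alpha \text{ and } q \in \beta\}.
\]
Also set
\[
0^* = \{p \in \mathbb{Q} : p < 0\}.
\]
Then $\alpha + \beta$ and $0^*$ are cuts and using $0^*$ as the additive identity $0$, the Addition
Axioms for a field hold. In fact more is true: if $\alpha$ is a cut and $\beta$ is any nonempty
set that is bounded above, then $\alpha + \beta$ is a cut.
Proof: If $p = r + s \in \alpha + \beta$ and $q < p$, then $q = (q - p + r) + s \in \alpha + \beta$
since $q - p + r < r$ and $\alpha$ is a cut. Furthermore, there is $t \in \alpha$ with $t > r$ and so
$t + s \in \alpha + \beta$ with $t + s > r + s = p$. Obviously $0^*$ is a cut. Next, $\alpha + 0^* \subset \alpha$ and
if $p \in \alpha$, then there is $r \in \alpha$ with $r > p$ and so $p = r + (p - r) \in \alpha + 0^*$, and this
shows that $\alpha + 0^* = \alpha$ for all $\alpha \in \mathbb{R}$. It requires only a bit more effort to show that
the inverse of $\alpha \in \mathbb{R}$ is given by the set
\[
-\alpha = \{p \in \mathbb{Q} : \text{there exists } r > 0 \text{ such that } -p - r \notin \alpha\}.
\]
Indeed, it is not too hard to show that $-\alpha$ is a cut. To see the more delicate fact
that
\[
\alpha + (-\alpha) = 0^*, \tag{5.1}
\]
we first note that $\alpha + (-\alpha) \subset 0^*$ since if $q \in \alpha$ and $r \in -\alpha$, then $-r \notin \alpha$, hence
$q < -r$, hence $q + r < 0$. Conversely, pick $s > 0$ and set $t = \frac{s}{2} > 0$. By the
Archimedean property of the rational numbers $\mathbb{Q}$, there is $n \in \mathbb{N}$ such that
\[
nt \in \alpha \quad\text{but}\quad (n + 1)t \notin \alpha.
\]
Set $p = -(n + 2)t$.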
Remark 8. It is helpful at this point to suppose that $\alpha$ corresponds to a point
on the line to the right of $0$, and to draw the players in the proof from left to right
on the line:
\[
p < -(n + 1)t < -\alpha < -nt < -t < 0 < t < nt < \alpha < (n + 1)t < -p.
\]
Now $p \in -\alpha$ since $-p - t = (n + 1)t \notin \alpha$. Since $nt \in \alpha$ we thus have
\[
-s = -2t = nt + p \in \alpha + (-\alpha).
\]
This proves that $0^* \subset \alpha + (-\alpha)$ and completes the proof of (5.1).
Step 4: If $\alpha, \beta, \gamma \in \mathbb{R}$ and $\beta < \gamma$, then $\alpha + \beta < \alpha + \gamma$.
Proof: This is easy to prove using the cancellation law for addition in Proposition 1
(1) (a). Indeed, when cuts are considered as subsets of rational numbers,
we clearly have $\alpha + \beta \subset \alpha + \gamma$. If we had equality $\alpha + \beta = \alpha + \gamma$, then Proposition 1
(1) (a) shows that $\beta = \gamma$, a contradiction. Note that Proposition 1 (1) applies here
since we have shown in Step 3 that the addition axioms hold.
Step 5: If $\alpha, \beta > 0^*$, define
\[
\alpha \cdot \beta = \{p \in \mathbb{Q} : p \le qr \text{ for some choice of } q \in \alpha \text{ with } q > 0 \text{ and } r \in \beta \text{ with } r > 0\}.
\]
For general $\alpha, \beta \in \mathbb{R}$, define $\alpha \cdot \beta$ appropriately. Then $(\mathbb{R}, <, +, \cdot)$ is an ordered field
with the Least Upper Bound Property.
Proof: The proof of the multiplication axioms is somewhat bothersome due
to the different definitions of product $\alpha \cdot \beta$ according to the signs of $\alpha$ and $\beta$. We
omit the remaining tedious details in the proof of Step 5.
Step 6: To each $q \in \mathbb{Q}$ we associate the set
\[
q^* = \{p \in \mathbb{Q} : p < q\}.
\]
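Then $q^*$ is a cut and
\[
\begin{aligned}
(r + s)^* &= r^* + s^*, \\
(rs)^* &= r^* \cdot s^*, \\
r^* < s^* &\iff r < s.
\end{aligned}
\]
Thus the map $q \mapsto q^*$ from $\mathbb{Q}$ to $\mathbb{R}$ is an ordered field isomorphism from the rational numbers
$\mathbb{Q}$ into the real numbers $\mathbb{R}$, and this is the sense in which we mean that the real
numbers $\mathbb{R}$ contain a copy of the rational numbers $\mathbb{Q}$.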
Remark 9. One might reasonably ask why in the definition of cut (3.1) we had
to include the third condition requiring the cut to have no largest element.
However, without this condition, there are additional cuts, namely those with a
largest rational element:
\[
r^+ = \{p \in \mathbb{Q} : p \le r\}, \qquad \text{for } r \in \mathbb{Q}.
\]
We refer to these additional cuts as closed cuts, and to the original cuts as open
cuts. A cut that is either closed or open is said to be a generalized cut. Suppose we
extend the definition of addition to generalized cuts in the standard way by taking
all possible sums of pairs, one element from each cut. The key property to observe
then is that $\alpha + \beta$ is an open cut provided at least one of $\alpha$ and $\beta$ is open (see Step
3 above). Thus the usual zero element $0^*$ can no longer serve as the additive identity
for the set of generalized cuts. It is not hard to see however that the closed cut
\[
0^+ = \{p \in \mathbb{Q} : p \le 0\}
\]
has the required additive identity property $0^+ + \alpha = \alpha$ for all generalized cuts $\alpha$; in
fact $0^+$ is the only generalized cut with this property. Now comes the problem. An
open cut $\alpha$ cannot have an additive inverse since the result of adding any generalized
cut to $\alpha$ must also be open, and in particular cannot equal the closed cut $0^+$.
6. Exercises
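Exercise 2. Use the trig formulas
\[
\begin{aligned}
\cos(A + B) &= \cos A \cos B - \sin A \sin B, \\
\sin(A + B) &= \sin A \cos B + \cos A \sin B,
\end{aligned}
\]
to prove De Moivre's Theorem by induction on $n$:
\[
(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \qquad n \in \mathbb{N}.
\]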
Exercise 3. Prove that if $f(x) = \sum_{n=1}^{N} a_n \sin n\pi x$, where $a_n$ is a constant for
$1 \le n \le N$, then
\[
2\int_0^1 f(x) \sin k\pi x\,dx = a_k, \qquad 1 \le k \le N.
\]
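Exercise 4. (Assuming first year calculus theorems) Suppose that $\{\lambda_n\}_{n=1}^{N}$
is a finite sequence of distinct numbers, that $a(x)$ is a continuously differentiable
function on $[0, 1]$, and that $\{\varphi_n(x)\}_{n=1}^{N}$ is a finite sequence of twice continuously
differentiable functions that satisfy the boundary value problems
\[
\begin{aligned}
\left[a(x)\,\varphi_n'(x)\right]' &= \lambda_n \varphi_n(x), \qquad 0 \le x \le 1, \\
\varphi_n(0) &= \varphi_n(1) = 0,
\end{aligned}
\]
for each $1 \le n \le N$. Prove that if $f(x) = \sum_{n=1}^{N} a_n \varphi_n(x)$, then
\[
\frac{1}{\int_0^1 \varphi_k(x)^2\,dx} \int_0^1 f(x)\,\varphi_k(x)\,dx = a_k, \qquad 1 \le k \le N.
\]
Hint: Use integration by parts, together with the boundary conditions, to obtain
\[
\begin{aligned}
\lambda_n \int_0^1 \varphi_n \varphi_k &= \int_0^1 \left[a\varphi_n'\right]' \varphi_k = a\varphi_n'\varphi_k \Big|_0^1 - \int_0^1 a\varphi_n'\varphi_k' = -\int_0^1 a\varphi_n'\varphi_k'; \\
\lambda_k \int_0^1 \varphi_n \varphi_k &= \int_0^1 \varphi_n \left[a\varphi_k'\right]' = \varphi_n a\varphi_k' \Big|_0^1 - \int_0^1 \varphi_n' a\varphi_k' = -\int_0^1 a\varphi_n'\varphi_k';
\end{aligned}
\]
and subtract to obtain
\[
\int_0^1 \varphi_n \varphi_k = 0 \quad\text{if } n \neq k.
\]
Show how the previous exercise is a special case of this one.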
Exercise 5. Prove that $x^2 = 12$ has no solution in the rational field $\mathbb{Q}$.
Exercise 6. Use induction to prove Bernoulli's inequality
\[
(1 + x)^n \ge 1 + nx, \qquad x > -1 \text{ and } n \in \mathbb{N}.
\]
Exercise 7. Use induction to prove that $5^n - 4n - 1$ is divisible by 16 for all
$n \in \mathbb{N}$.
Exercise 8. Suppose $s_1 < s_2$. Use induction to show that if $s_{n+2} = \frac{s_{n+1} + s_n}{2}$
for all $n \ge 1$, then
\[
s_1 < s_n < s_2, \qquad n \ge 3.
\]
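Exercise 9. The Fibonacci sequence $\{f_n\}_{n=0}^{\infty}$ is defined recursively by
\[
\begin{aligned}
f_0 &= f_1 = 1, \\
f_{n+2} &= f_{n+1} + f_n, \qquad n \ge 0.
\end{aligned}
\]
Use induction to prove that
\[
f_n = \frac{\tau^{n+1} - (-\tau)^{-(n+1)}}{\sqrt{5}}, \qquad n \in \mathbb{N},
\]
where $\tau = \frac{1 + \sqrt{5}}{2}$ is the larger root of the polynomial equation $x^2 = x + 1$.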
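Before attempting the induction in Exercise 9, it can help to confirm numerically that the closed form matches the recursion with $f_0 = f_1 = 1$. The following is only a floating-point sanity check, not a proof; the function names are hypothetical.

```python
from math import sqrt

def fib(n: int) -> int:
    """f_n from the recursion f_0 = f_1 = 1, f_{n+2} = f_{n+1} + f_n."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

tau = (1 + sqrt(5)) / 2   # larger root of x^2 = x + 1

def closed_form(n: int) -> float:
    """(tau^(n+1) - (-tau)^(-(n+1))) / sqrt(5), evaluated in floating point."""
    return (tau ** (n + 1) - (-tau) ** (-(n + 1))) / sqrt(5)

# Pair each exact value with the rounded closed-form value for small n.
checks = [(fib(n), round(closed_form(n))) for n in range(20)]
```

Rounding is needed because the closed form is evaluated with floating-point arithmetic; for large $n$ the rounding error would eventually exceed one half and the comparison would fail.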
Exercise 10. Let $f : S \to T$ be a function. Prove that for any subsets $S_1, S_2$
of $S$ and $T_1, T_2$ of $T$, we have
(1) $f^{-1}(T_1 \cup T_2) = f^{-1}(T_1) \cup f^{-1}(T_2)$,
(2) $f(S_1 \cup S_2) = f(S_1) \cup f(S_2)$,
(3) $f^{-1}(T_1 \cap T_2) = f^{-1}(T_1) \cap f^{-1}(T_2)$,
(4) $f(S_1 \cap S_2) \subset f(S_1) \cap f(S_2)$,
(5) Equality may fail in property (4).
Exercise 11. Suppose that $f : S \to T$ and $g : T \to S$ with $g \circ f(x) = x$ for all
$x \in S$. Prove that $f$ is one-to-one and that $g$ is onto.
Exercise 12. Use the fact that $-1$ is a square in the complex field $\mathbb{C}$ to show
that there is no order $\prec$ on $\mathbb{C}$ that makes $(\mathbb{C}, \prec)$ into an ordered field.
Exercise 13. Define the dictionary relation $\prec$ in the complex numbers $\mathbb{C}$ by
$a + ib \prec c + id$ if either $a < c$ or $a = c$ and $b < d$. Prove that $\prec$ satisfies the axioms
for an order on $\mathbb{C}$. Does the ordered set $(\mathbb{C}, \prec)$ have the least upper bound property?
Exercise 14. Prove the parallelogram law in Euclidean space $\mathbb{R}^n$:
\[
\|u + v\|^2 + \|u - v\|^2 = 2\left(\|u\|^2 + \|v\|^2\right), \qquad u, v \in \mathbb{R}^n.
\]
Exercise 15. Fill in the details in the argument sketched in Remark 9 that
shows that, in the definition of cut (3.1), it is necessary to include the condition
that the cut have no largest element.
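Exercise 16. Let $A$ and $B$ be the subsets of the rational numbers defined in
(2.1),
\[
\begin{aligned}
A &= \left\{p \in \mathbb{Q} : \text{either } p \le 0 \text{ or } p^2 < 2\right\}, \\
B &= \left\{q \in \mathbb{Q} : q > 0 \text{ and } q^2 > 2\right\}.
\end{aligned}
\]
(1) Show that the set $B$ has no smallest element.
(2) Show that the set $A$ fails to have a least upper bound.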
CHAPTER 2
Cardinality of sets
Dedekind was the first to define an infinite set as one to which the paradoxes
of Galileo and Bolzano applied (there are as many perfect squares as there are
integers; there are as many even integers as there are integers; and there are as
many points in the interval $[0, 1]$ as there are in $[0, 2]$):
A system $S$ is said to be infinite if it is similar to a proper part of itself;
in the contrary case $S$ is said to be a finite system.
In other words, a set $S$ was defined to be infinite by Dedekind if there existed
a one-to-one correspondence between $S$ and a proper subset of itself. However,
Dedekind's definition gave no hint that there might be different "sizes" of infinity,
and the creation of this revolutionary concept had to await the imagination of Georg
Cantor.
Definition 7. Two sets $A$ and $B$ are said to have the same cardinality or are
said to be equivalent, written $A \sim B$, if there is a one-to-one onto map $\varphi : A \to B$.
Let $n \in \mathbb{N}$. A set $E$ is said to have cardinality $n$ if it is equivalent to the set
\[
J_n = \{1, 2, 3, \dots, n - 1, n\},
\]
in which case it is said to be finite. A set $E$ is said to be countable if it is equivalent
to the set of natural numbers $\mathbb{N}$. If a set is neither finite nor countable, it is said
to be uncountable.
The relation $\sim$ of having the same cardinality is an equivalence relation, meaning
that it satisfies
(1) (reflexivity) $A \sim A$,
(2) (symmetry) $A \sim B \implies B \sim A$,
(3) (transitivity) $A \sim B$ and $B \sim C \implies A \sim C$.
These equivalence classes are called cardinal numbers since they measure the
"size" of sets up to bijections. Cantor showed at least two surprising results regarding
cardinality: first, that the set of rational numbers is countable and second, that
the set of real numbers is uncountable. Both demonstrations involved a notion of
diagonalization.
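To show that the rational numbers $\mathbb{Q}$ are countable, Cantor arranged the positive
rational numbers $\mathbb{Q}^+$ in an infinite matrix $\left\{\frac{m}{n}\right\}_{m,n=1}^{\infty}$:
\[
\begin{bmatrix}
\frac{1}{1} & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \cdots \\
\frac{2}{1} & \frac{2}{2} & \frac{2}{3} & \frac{2}{4} & \cdots \\
\frac{3}{1} & \frac{3}{2} & \frac{3}{3} & \frac{3}{4} & \cdots \\
\frac{4}{1} & \frac{4}{2} & \frac{4}{3} & \frac{4}{4} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix},
\]
and then defined a map $s : \mathbb{N} \to \mathbb{Q}^+$ by following the upward sloping diagonals in
succession, taking only those fractions that have not yet appeared:
\[
\begin{aligned}
&s(1) = \tfrac{1}{1}; \\
&s(2) = \tfrac{2}{1};\ s(3) = \tfrac{1}{2}; \\
&s(4) = \tfrac{3}{1};\ s(5) = \tfrac{1}{3} \quad (\tfrac{2}{2} = s(1) \text{ was skipped}); \\
&s(6) = \tfrac{4}{1};\ s(7) = \tfrac{3}{2};\ s(8) = \tfrac{2}{3};\ s(9) = \tfrac{1}{4}; \\
&s(10) = \tfrac{5}{1};\ s(11) = \tfrac{1}{5} \quad (\tfrac{4}{2} = s(2),\ \tfrac{3}{3} = s(1),\ \tfrac{2}{4} = s(3) \text{ were all skipped}); \\
&\ \vdots
\end{aligned}
\]
Clearly the map $s$ is one-to-one and onto, thus demonstrating that $\mathbb{N} \sim \mathbb{Q}^+$. It
is now a simple matter to use $s$ to construct a one-to-one onto map $t : \mathbb{N} \to \mathbb{Q}$
(exercise: do this!) that shows $\mathbb{N} \sim \mathbb{Q}$.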
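The diagonal enumeration of the positive rationals can be sketched as a generator. The sketch assumes that "not yet appeared" is tested by checking that the fraction $m/n$ is in lowest terms, which is equivalent because every positive rational occurs in lowest terms on an earlier or the same diagonal walk; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def positive_rationals():
    """Yield s(1), s(2), ... by walking the upward sloping diagonals m + n = d."""
    for d in count(2):
        # Walk from the bottom-left entry (d-1)/1 up to the top-right entry 1/(d-1).
        for m in range(d - 1, 0, -1):
            n = d - m
            # m/n has already appeared exactly when it is not in lowest terms.
            if gcd(m, n) == 1:
                yield Fraction(m, n)

first = list(islice(positive_rationals(), 11))
```

The first eleven values produced are 1, 2, 1/2, 3, 1/3, 4, 3/2, 2/3, 1/4, 5, 1/5, matching the values of the enumeration listed in the text.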
To show that the real numbers are uncountable, we begin with a famous paradox
of Russell. Define a set $S$ by the rule
\[
a \in S \iff a \notin a,
\]
i.e. $S$ consists of all sets $a$ that are not members of themselves. Then we have the
following paradox:
If $S \in S$, then by the very definition of $S$ it must be the case that $S \notin S$,
a contradiction.
On the other hand if $S \notin S$, then by the very definition of $S$ it must be
the case that $S \in S$, again a contradiction.
One way out of this paradox is to note that we have never seen a set $a$ that is a
member of itself. Thus we expect that $S$ is actually the collection of all sets. If we
simply disallow the collection of all sets as a set, Russell's paradox dissolves. This
type of thinking eventually led to the Zermelo-Fraenkel set theory in use today.
Russell's paradox suggests the following proof that the power set
\[
\mathcal{P}(\mathbb{N}) = \{E : E \subset \mathbb{N}\}
\]
of the natural numbers, i.e. the set of all subsets of $\mathbb{N}$, is uncountable. Indeed,
assume in order to derive a contradiction, that $\mathcal{P}(\mathbb{N})$ is countable. Then we can
list all the elements of $\mathcal{P}(\mathbb{N}) = \{E_n\}_{n=1}^{\infty}$ in a vertical column:
\[
\begin{bmatrix} E_1 \\ E_2 \\ E_3 \\ \vdots \end{bmatrix}.
\]
Now each subset $E_n$ is uniquely determined by its characteristic function, i.e. the
sequence $\{s^n_m\}_{m=1}^{\infty} = s^n_1, s^n_2, s^n_3, \dots$ of 0's and 1's defined by
\[
s^n_m = \begin{cases} 0 & \text{if } m \notin E_n, \\ 1 & \text{if } m \in E_n. \end{cases}
\]
Replace each subset $E_n$ in the vertical column by the infinite row of 0's and 1's
determined by $\{s^n_m\}_{m=1}^{\infty}$ to get an infinite matrix of 0's and 1's:
\[
\begin{bmatrix}
s^1_1 & s^1_2 & s^1_3 & \cdots \\
s^2_1 & s^2_2 & s^2_3 & \cdots \\
s^3_1 & s^3_2 & s^3_3 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}.
\]
Now consider the anti-diagonal or Russell sequence $\{r_n\}_{n=1}^{\infty}$ given by
\[
r_n = 1 - s^n_n. \tag{0.1}
\]
This is a sequence of 0's and 1's that is not included in the list
\[
\begin{bmatrix}
\{s^1_m\}_{m=1}^{\infty} \\
\{s^2_m\}_{m=1}^{\infty} \\
\{s^3_m\}_{m=1}^{\infty} \\
\vdots
\end{bmatrix},
\]
since for each $n$, the sequences $\{s^n_m\}_{m=1}^{\infty}$ and $\{r_m\}_{m=1}^{\infty}$ differ in the $n$th entry:
$s^n_n \neq r_n$ by (0.1). Thus the set $E = \{n : r_n = 1\}$ whose characteristic function is
the sequence $\{r_n\}_{n=1}^{\infty}$ satisfies
\[
n \in E \iff r_n = 1 \iff s^n_n = 0 \iff n \notin E_n,
\]
and hence is the set of $n$ such that $n$ is not a member of $E_n$ (reminiscent of Russell's
paradox). It follows that $E$ is not included in the list $\{E_n\}_{n=1}^{\infty}$. This contradiction
shows that the power set $\mathcal{P}(\mathbb{N})$ is uncountable. Equivalently, this shows that the
set of all sequences consisting of 0's and 1's is uncountable.
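The anti-diagonal construction can be illustrated on a finite list of subsets of the natural numbers. This is only a finite sketch of the idea: given any listed sets $E_1, \dots, E_N$, the set of indices $n$ with $n \notin E_n$ disagrees with each $E_n$ about the membership of $n$. The sample list and the function name are assumptions.

```python
def diagonal_set(listed_sets):
    """Return {n : n not in E_n} for the listed sets E_1, ..., E_N (1-indexed)."""
    return {n for n, E_n in enumerate(listed_sets, start=1) if n not in E_n}

# Hypothetical finite list E_1, ..., E_5 of subsets of the natural numbers.
listed = [{1, 2, 3}, {1, 3}, set(), {4, 7}, {2, 5}]
E = diagonal_set(listed)

# E and E_n disagree about whether n is a member, for every n in the list.
differs_everywhere = all((n in E) != (n in E_n)
                         for n, E_n in enumerate(listed, start=1))
```

Here `E` comes out as `{2, 3}`, and it cannot equal any of the five listed sets, because it differs from the $n$th set at the element $n$.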
To see from this that the real numbers are uncountable, express each real
number $s$ in the interval $(0, 1]$ as a binary fraction
\[
s = \frac{s_1}{2} + \frac{s_2}{2^2} + \dots + \frac{s_n}{2^n} + \dots = 0.s_1 s_2 \dots s_n \dots
\]
where the sequence $\{s_n\}_{n=1}^{\infty}$ does not end in an infinite string of 0's. Since the set
of such fractions is uncountable (in fact its equivalence with $\mathcal{P}(\mathbb{N})$ follows from the
argument above with just a little extra work), we conclude that the interval $(0, 1]$
is uncountable, and then so is $\mathbb{R}$. We will return to this argument later to show
that not only is $\mathbb{R}$ uncountable, but in fact $(0, 1] \sim \mathcal{P}(\mathbb{N})$.
We now turn to the task of making the previous arguments more rigorous. We
begin with a careful definition of sequence.
Definition 8. A sequence is a function $f$ defined on the natural numbers $\mathbb{N}$.
If $f(n) = s_n$ for all $n \in \mathbb{N}$, the values $s_n$ are called the terms of the sequence, and
we often denote the sequence $f$ by $\{s_n\}_{n=1}^{\infty}$ or even $s_1, s_2, s_3, \dots$.
Thus we may regard a countable set as the range of a sequence of distinct
terms, and in fact we used this point of view when we assumed above that $\mathcal{P}(\mathbb{N})$
was countable and then listed the elements of $\mathcal{P}(\mathbb{N})$ in a vertical column. The next
lemma proves the intuitive fact that "countable is the smallest infinity".
Lemma 1. Every infinite subset of a countable set is countable.
Proof: Suppose $A$ is countable and $E$ is an infinite subset of $A$. Represent $A$
as the range of a sequence $\{a_n\}_{n=1}^{\infty}$ of distinct terms, and define a sequence of
integers $\{n_k\}_{k=1}^{\infty}$ as follows:
\[
\begin{aligned}
n_1 &= \min\{n \in \mathbb{N} : a_n \in E\}, \\
n_2 &= \min\{n > n_1 : a_n \in E\}, \\
n_3 &= \min\{n > n_2 : a_n \in E\}, \\
&\ \vdots \\
n_k &= \min\{n > n_{k-1} : a_n \in E\}, \qquad k \ge 4, \\
&\ \vdots
\end{aligned}
\]
Since $E$ is infinite, $n_k$ is defined for all $k \in \mathbb{N}$. It is now clear that $E = \{a_{n_k}\}_{k=1}^{\infty}$,
and so $E$ is countable.
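The selection of the indices $n_1 < n_2 < \dots$ in this proof is a simple search, which can be sketched as a generator. The enumeration `a`, the membership test and the choice of the perfect squares as the infinite subset are all hypothetical examples.

```python
from itertools import count, islice
from math import isqrt

def subsequence_indices(a, in_E):
    """Yield n_1 < n_2 < ...: the successive indices n with a(n) in E."""
    for n in count(1):
        if in_E(a(n)):
            yield n

def a(n: int) -> int:
    """Hypothetical enumeration of A = N by a_n = n."""
    return n

def is_square(m: int) -> bool:
    """Membership test for the hypothetical infinite subset E of perfect squares."""
    return isqrt(m) ** 2 == m

first_indices = list(islice(subsequence_indices(a, is_square), 5))
```

The generator never terminates on its own, which mirrors the role of the hypothesis that the subset is infinite: each $n_k$ is found after finitely many steps, so `islice` can always take the next index.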
Corollary 2. A subset of a countable set is at most countable, i.e. it is either
countable or finite.
The next two theorems generalize the countability of the rational numbers and
the uncountability of the real numbers respectively. They are proved by the same
diagonalization procedures used above, and their proofs are left to the reader.
Theorem 2. Let $\{E_n\}_{n=1}^{\infty}$ be a sequence of countable sets. Then $S = \bigcup_{n=1}^{\infty} E_n$
is countable.
The above theorem says that a countable union of countable sets is countable.
Note that the sets $E_n$ may overlap, but not so much as to make the union finite,
since their union $S$ contains $E_1$, and hence $S$ is not finite. As an immediate corollary
we may replace countable with at most countable.
Corollary 3. An at most countable union of at most countable sets is at most
countable.
Theorem 3. Let $A$ be the set of all sequences whose terms are either 0 or 1.
Then $A$ is uncountable.
Here is one more result on countable sets that is easily proved by induction.
Proposition 6. Let $A$ be countable and consider the $n$-fold product set
$A^n = A \times \dots \times A$ defined by
\[
A^n = \{(a_1, a_2, \dots, a_n) : a_i \in A \text{ for } 1 \le i \le n\}.
\]
Then $A^n$ is countable.
Proof: Clearly $A^1 \sim A$ is countable. We now proceed by induction on $n$ and
assume that $A^{n-1}$ is countable. Assuming that $n > 1$ we have
\[
A^n = \left\{(b, c) : b \in A^{n-1} \text{ and } c \in A\right\}.
\]
Now for each fixed $c \in A$, the set of pairs $\left\{(b, c) : b \in A^{n-1}\right\}$ is equivalent to $A^{n-1}$
which is countable by our induction assumption. Since $A$ is countable, we thus see
that $A^n$ is a countable union of countable sets, hence countable by Theorem 2.
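Now we return to the assertion that $(0, 1] \sim \mathcal{P}(\mathbb{N})$. It is clear from the construction
above, that there is a one-to-one map $f : (0, 1] \to \mathcal{P}(\mathbb{N})$. We now construct
a one-to-one map $g : \mathcal{P}(\mathbb{N}) \to (0, 1]$ going in the opposite direction. Let
$\mathcal{P}_{\mathrm{fin}}(\mathbb{N})$ be the collection of all finite subsets of $\mathbb{N}$, so that the corresponding
indicator functions end in an infinite string of 0's. Then the construction above
shows that $f : (0, 1] \to \mathcal{P}(\mathbb{N}) \setminus \mathcal{P}_{\mathrm{fin}}(\mathbb{N})$ is one-to-one and onto. Now we define
$h : \mathcal{P}(\mathbb{N}) \setminus \mathcal{P}_{\mathrm{fin}}(\mathbb{N}) \to \left(0, \frac{1}{2}\right]$ by $h(E) = \frac{1}{2} f^{-1}(E)$ for $E \in \mathcal{P}(\mathbb{N}) \setminus \mathcal{P}_{\mathrm{fin}}(\mathbb{N})$. Finally
the set $\mathcal{P}_{\mathrm{fin}}(\mathbb{N})$ is countable by the theorems above, and so it is easy to construct a
map $i : \mathcal{P}_{\mathrm{fin}}(\mathbb{N}) \to \left(\frac{1}{2}, 1\right]$ that is one-to-one. Then we can define $g : \mathcal{P}(\mathbb{N}) \to (0, 1]$
by
\[
g(E) = \begin{cases} h(E) & \text{if } E \in \mathcal{P}(\mathbb{N}) \setminus \mathcal{P}_{\mathrm{fin}}(\mathbb{N}), \\ i(E) & \text{if } E \in \mathcal{P}_{\mathrm{fin}}(\mathbb{N}), \end{cases}
\]
and it is easy to see that $g$ is one-to-one. Now we can invoke the Schröder-Bernstein
Theorem from the exercises below to conclude that there is a bijection from $(0, 1]$
to $\mathcal{P}(\mathbb{N})$, i.e. $(0, 1] \sim \mathcal{P}(\mathbb{N})$. We remark that with some extra work it can be shown
that $\mathbb{R} \sim \mathcal{P}(\mathbb{N})$.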
1. Exercises
Exercise 17. A complex number $z$ is algebraic if it satisfies a polynomial
equation
\[
a_0 z^n + a_1 z^{n-1} + \dots + a_{n-1} z + a_n = 0,
\]
where the coefficients $a_k$ are integers and not all zero. Prove that the set of algebraic
numbers is countable.
Exercise 18. Prove that the interval $(0, 1]$ of real numbers is in one-to-one
correspondence with a certain subset, namely $\mathcal{P}(\mathbb{N}) \setminus \mathcal{P}_{\mathrm{fin}}(\mathbb{N})$ where $\mathcal{P}_{\mathrm{fin}}(\mathbb{N})$ is the collection of finite subsets of $\mathbb{N}$, of the power set
$\mathcal{P}(\mathbb{N})$ of the positive integers.
Exercise 19. Show that for any set $A$, there is no bijection $f : A \to \mathcal{P}(A)$
from $A$ to its power set $\mathcal{P}(A)$.
Exercise 20. (Schröder-Bernstein Theorem) Suppose that $f : A \to B$ and
$g : B \to A$ are both one-to-one maps. Prove that there is a bijection $h : A \to B$
from $A$ to $B$. Hint: Define $B_1 = B \setminus f(A)$ and then set
\[
\begin{aligned}
A_1 &= g(B_1), & B_2 &= f(A_1), \\
A_2 &= g(B_2), & B_3 &= f(A_2), \\
&\ \vdots \\
A_n &= g(B_n), & B_{n+1} &= f(A_n), \\
&\ \vdots
\end{aligned}
\]
Then define $A_0 = A \setminus \bigcup_{n=1}^{\infty} A_n$ and $B_0 = B \setminus \bigcup_{n=1}^{\infty} B_n$. Now prove that $A = \bigcup_{n=0}^{\infty} A_n$
and $B = \bigcup_{n=0}^{\infty} B_n$ are pairwise disjoint unions, and then that for each $n \ge 1$, $g^{-1}$ is a bijection from
$A_n$ to $B_n$, and finally that $f$ is a bijection from $A_0$ to $B_0$.
CHAPTER 3
Metric spaces
There is a notion of distance between numbers in both the rational field $\mathbb{Q}$ and
in the real field $\mathbb{R}$ given by the absolute value of the difference of the numbers:
\[
\begin{aligned}
\operatorname{dist}(p, q) &= |p - q|, \qquad p, q \in \mathbb{Q}, \\
\operatorname{dist}(x, y) &= |x - y|, \qquad x, y \in \mathbb{R}.
\end{aligned}
\]
Motivated by Pythagoras' theorem, this can be extended to complex numbers $\mathbb{C}$ by
\[
\operatorname{dist}(z, w) = |z - w| = \sqrt{(x - u)^2 + (y - v)^2},
\qquad \text{for } z = x + iy \text{ and } w = u + iv \text{ in } \mathbb{C},
\]
and even to points or vectors in Euclidean space:
\[
\operatorname{dist}(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2},
\qquad \text{for } \mathbf{x} = (x_1, \dots, x_n) \text{ and } \mathbf{y} = (y_1, \dots, y_n) \text{ in } \mathbb{R}^n.
\]
It will eventually be important to define a notion of distance between functions, for
example if $f$ and $g$ are continuous functions on the unit interval $[0, 1]$, then we will
define
\[
\operatorname{dist}(f, g) = \sup\{|f(x) - g(x)| : 0 \le x \le 1\}.
\]
Of course at this point we don't even know if this supremum is finite, i.e. if the
set in braces is bounded above, or if it is, whether or not this definition satisfies
properties that we would expect of a distance function. Thus we begin by setting
down in as abstract a setting as possible the properties we expect of a distance
function.
Definition 9. A set $X$ together with a function $d : X \times X \to [0, \infty)$ is said to
be a metric space, and $d$ is called a metric or distance function on $X$, provided:
(1) $d(x, x) = 0$,
(2) $d(x, y) > 0$ if $x \neq y$,
(3) $d(x, y) = d(y, x)$ for all $x, y \in X$,
(4) (triangle inequality) $d(x, z) \le d(x, y) + d(y, z)$ for all $x, y, z \in X$.
To be precise we often write a metric space as a pair $(X, d)$. Examples of metric
spaces include $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{R}^n$ with the distance functions given above. The triangle
inequality holds in $\mathbb{C}$ by Proposition 5 (1) (f). To prove that the triangle inequality
holds in $\mathbb{R}^n$ we can use the Cauchy-Schwarz inequality just as we did in the proof
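of Proposition 5 (1) (f):
\[
\begin{aligned}
\operatorname{dist}(x, z)^2 &= \|x - z\|^2 = \sum_{k=1}^{n} (x_k - z_k)^2 = \sum_{k=1}^{n} (x_k - y_k + y_k - z_k)^2 \\
&= \sum_{k=1}^{n} (x_k - y_k)^2 + 2\sum_{k=1}^{n} (x_k - y_k)(y_k - z_k) + \sum_{k=1}^{n} (y_k - z_k)^2 \\
&\le \|x - y\|^2 + 2\|x - y\|\,\|y - z\| + \|y - z\|^2 \\
&= (\|x - y\| + \|y - z\|)^2 = (\operatorname{dist}(x, y) + \operatorname{dist}(y, z))^2.
\end{aligned}
\]
Taking square roots we obtain
\[
\operatorname{dist}(x, z) = \|x - z\| \le \|x - y\| + \|y - z\| = \operatorname{dist}(x, y) + \operatorname{dist}(y, z). \tag{0.1}
\]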
We can also consider different metrics on $\mathbb{R}^n$ such as taxicab distance:
\[
d_{\mathrm{taxi}}(x, y) = \sum_{k=1}^{n} |x_k - y_k|.
\]
This is the shortest distance a taxi must travel to get from $x$ to $y$ if the taxi is
restricted to proceed only vertically or horizontally, as is the case in most cities
built around a rectangular grid of streets. It is not too hard an exercise to prove
that $(\mathbb{R}^n, d_{\mathrm{taxi}})$ is a metric space, i.e. that $d_{\mathrm{taxi}}$ satisfies the axioms in Definition 9
on the set $\mathbb{R}^n$.
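A brute-force check of the four metric axioms for a grid-travel distance can build some intuition before writing the proof. The sketch below takes the taxicab distance to be the sum of the absolute coordinate differences, which is the distance travelled along a rectangular grid; the sample points, seed and function name are assumptions, and passing the check on finitely many points is of course not a proof.

```python
import itertools
import random

def d_taxi(x, y):
    """Sum of absolute coordinate differences (distance travelled along a grid)."""
    return sum(abs(a - b) for a, b in zip(x, y))

random.seed(1)
# Hypothetical sample of integer points in R^3, so all arithmetic is exact.
points = [tuple(random.randint(-5, 5) for _ in range(3)) for _ in range(8)]

axioms_hold = all(
    d_taxi(x, x) == 0                                  # axiom (1)
    and (x == y or d_taxi(x, y) > 0)                   # axiom (2)
    and d_taxi(x, y) == d_taxi(y, x)                   # axiom (3)
    and d_taxi(x, z) <= d_taxi(x, y) + d_taxi(y, z)    # axiom (4)
    for x, y, z in itertools.product(points, repeat=3)
)
```

The triangle inequality here reduces, coordinate by coordinate, to $|x_k - z_k| \le |x_k - y_k| + |y_k - z_k|$, which suggests how the general proof goes.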
An important method of constructing new metric spaces from known metric
spaces is to consider subsets. Indeed, if $(X, d)$ is a metric space and $Y$ is any subset
of $X$, then $(Y, d)$ is also a metric space, as is immediately verified by restricting the
points $x, y, z$ in Definition 9 to lie in the subset $Y$. For example the open unit disk
\[
\mathbb{D} = \{z \in \mathbb{C} : \operatorname{dist}(0, z) < 1\} = \left\{(x, y) \in \mathbb{R}^2 : \sqrt{x^2 + y^2} < 1\right\}
\]
is a metric space with the metric $d(z, w) = |z - w|$. Note that the open unit disk
in the complex plane $\mathbb{C}$ coincides with the open unit disk in the Euclidean plane
$\mathbb{R}^2$.
The concept of a ball in a metric space is central to the further development of
the theory of metric spaces.
Definition 10. Let $(X, d)$ be a metric space and suppose $x \in X$ and $r > 0$.
The ball $B(x, r)$ with center $x$ and radius $r$ is defined to be the set of all points
$y \in X$ at a distance less than $r$ from $x$:
\[
B(x, r) = \{y \in X : d(x, y) < r\}.
\]
One can easily verify that the collection of balls $\{B(x, r)\}_{x \in X,\, r > 0}$ in a metric
space $(X, d)$ satisfies the following six properties for all $x, y \in X$:
(1) $\bigcap_{r > 0} B(x, r) = \{x\}$,
(2) $\bigcup_{r > 0} B(x, r) = X$,
(3) If $0 < r \le s$, then $B(x, r) \subset B(x, s)$,
(4) If $y \in B(x, r)$, then $x \in B(y, r)$,
(5) The set $\{r > 0 : y \in B(x, r)\}$ has no least element,
(6) If $B(x, r) \cap B(y, s) \neq \emptyset$, then $y \in B(x, r + s)$.
While we will not need to know this, the six properties above characterize a
metric space in the following sense. Suppose that $\{B(x, r)\}_{x \in X,\, r > 0}$ is a collection
of subsets of a set $X$ that satisfy the six properties listed above. Define
\[
d(x, y) = \inf\{r > 0 : y \in B(x, r)\}, \qquad \text{for all } x, y \in X.
\]
Then it is not too hard to show that $d$ maps $X \times X$ into $[0, \infty)$ and satisfies the
four properties in Definition 9, i.e. $d$ defines a metric or distance function on $X$.
Moreover, one can prove that $B(x, r) = \{y \in X : d(x, y) < r\}$ for all $x \in X$ and
$r > 0$, so that the initial collection of subsets $\{B(x, r)\}_{x \in X,\, r > 0}$ is precisely the
collection of balls corresponding to the metric $d$.
1. Topology of metric spaces
The notion of an open set is at the center of the subject of topology.
Definition 11. Let $(X, d)$ be a metric space and suppose $G$ is a subset of $X$.
Then $G$ is open if for every point $x$ in $G$ there is a positive radius $r$ such that the
ball $B(x, r)$ is contained in $G$:
\[
B(x, r) \subset G.
\]
We see that the empty set $\emptyset$ is open by default (there is nothing to check). The
set $X$ is open since
\[
B(x, 1) \subset X, \qquad \text{for all } x \in X.
\]
Any positive number would do in place of 1 as the radius above. One suspects that
balls themselves are open sets, but this needs a proof which relies heavily on the
triangle inequality.
Lemma 2. Let $B$ be a ball in a metric space $(X, d)$. Then $B$ is open.
Proof: Suppose that $B = B(y, s)$ and that $x \in B$. Then by Definition 10 we
have $d(y, x) < s$. Set
\[
r = s - d(x, y) > 0.
\]
We claim that the ball $B(x, r)$ with center $x$ and radius $r$ is contained in $B(y, s)$.
Draw a picture before proceeding! Indeed, if $z \in B(x, r)$ then by Definition 10
we have $d(x, z) < r$. Now we use the fact that the metric $d$ satisfies the triangle
inequality in Definition 9 to compute that
\[
d(y, z) \le d(y, x) + d(x, z) < d(x, y) + r = s.
\]
This shows that $z \in B(y, s)$ for every $z \in B(x, r)$, i.e.
\[
B(x, r) \subset B(y, s).
\]
Thus we have verified the condition that for every point $x$ in $B(y, s)$ there is a
positive radius $r = r_x$ (depending on the point $x$ we chose in $B(y, s)$) such that the
ball $B(x, r_x)$ is contained in $B(y, s)$. This proves that $B(y, s)$ is an open set.
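The choice of radius $r = s - d(x, y)$ in this proof can be illustrated in the Euclidean plane by sampling points of $B(x, r)$ and confirming they land in $B(y, s)$. This is a sketch under assumed sample values (center, radius, seed and sample size are all illustrative), and it is an experiment rather than a proof.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

random.seed(2)
y, s = (0.0, 0.0), 1.0     # the ball B(y, s); hypothetical choice
x = (0.3, 0.4)             # a point of B(y, s), since d(x, y) = 0.5 < s
r = s - dist(x, y)         # the radius chosen in the proof of Lemma 2

# Rejection-sample points z of the ball B(x, r).
samples = []
while len(samples) < 1000:
    z = (x[0] + random.uniform(-r, r), x[1] + random.uniform(-r, r))
    if dist(x, z) < r:
        samples.append(z)

# By the triangle inequality each sample should also lie in B(y, s).
all_inside = all(dist(y, z) < s for z in samples)
```

Enlarging `r` even slightly beyond $s - d(x, y)$ would eventually produce sample points outside $B(y, s)$, which is why the proof takes exactly this radius.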
Exercise 21. Consider the Euclidean space $\mathbb{R}^2$.
(1) Show that the inside of the ellipse,
\[
G = \left\{(x, y) \in \mathbb{R}^2 : 4x^2 + y^2 < 1\right\},
\]
is open. Hint: If $P = (x, y) \in G$, then the ball $B(P, r)$ is contained in $G$
if
\[
r = \frac{1}{2}\left(1 - \sqrt{4x^2 + y^2}\right).
\]
Indeed, if $Q = (u, v) \in B(P, r)$, then (0.1) yields
\[
\begin{aligned}
\sqrt{(2u)^2 + v^2} &\le \sqrt{(2u - 2x)^2 + (v - y)^2} + \sqrt{(2x)^2 + y^2} \\
&\le 2\sqrt{(u - x)^2 + (v - y)^2} + \sqrt{(2x)^2 + y^2} \\
&< 2r + \sqrt{(2x)^2 + y^2} = 1.
\end{aligned}
\]
(2) On the other hand, show that the corresponding set
\[
F = \left\{(x, y) \in \mathbb{R}^2 : 4x^2 + y^2 \le 1\right\},
\]
defined with $\le$ in place of $<$, is not an open set. Hint: The point $P = (0, 1) \in F$
but for every $r > 0$ the ball $B(P, r)$ contains the point
$\left(0, 1 + \frac{r}{2}\right)$ which is not in $F$.
We declare a subset $F$ of a metric space $X$ to be closed if the complement
$F^c = X \setminus F$ of $F$ is an open set. For example, the set $F$ in Exercise 21 (2) is closed,
but the set $G$ in Exercise 21 (1) is not closed.
Caution: A set may be neither open nor closed, such as the subset $[0, 1)$ of
$\mathbb{R}$. Moreover, a set may be simultaneously open and closed, such as both
the empty set $\emptyset$ and the entire set $X$ in any metric space $X$.
Proposition 7. Let $X$ be a metric space.
(1) If $\{G_\alpha\}_{\alpha \in A}$ is a collection of open subsets, then $\bigcup_{\alpha \in A} G_\alpha$ is open,
(2) If $\{F_\alpha\}_{\alpha \in A}$ is a collection of closed subsets, then $\bigcap_{\alpha \in A} F_\alpha$ is closed,
(3) If $\{G_k\}_{k=1}^{n}$ is a finite collection of open subsets, then $\bigcap_{k=1}^{n} G_k$ is open,
(4) If $\{F_k\}_{k=1}^{n}$ is a finite collection of closed subsets, then $\bigcup_{k=1}^{n} F_k$ is closed.
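Proof: Suppose that $G_\alpha$ is open for each $\alpha$ and let $x \in \bigcup_{\alpha \in A} G_\alpha$. Then
$x \in G_\beta$ for some $\beta$, and since $G_\beta$ is open, there is a ball $B(x, r) \subset G_\beta \subset \bigcup_{\alpha \in A} G_\alpha$,
which shows that $\bigcup_{\alpha \in A} G_\alpha$ is open. Next suppose that $F_\alpha$ is closed for each $\alpha$ and
note that if $G_\alpha = (F_\alpha)^c$, then $G_\alpha$ is open for each $\alpha$ and so $\bigcup_{\alpha \in A} G_\alpha$ is open by
part (1). Thus from de Morgan's laws we have that
\[
\left(\bigcap_{\alpha \in A} F_\alpha\right)^c = \bigcup_{\alpha \in A} (F_\alpha)^c = \bigcup_{\alpha \in A} G_\alpha
\]
is open, so $\bigcap_{\alpha \in A} F_\alpha$ is closed by definition.
Now suppose that $G_k$ is open for $1 \le k \le n$ and that $x \in \bigcap_{k=1}^{n} G_k$. Then there
is $r_k > 0$ such that $B(x, r_k) \subset G_k$ for $1 \le k \le n$. It follows that if we set
\[
r = \min\{r_k\}_{k=1}^{n},
\]
then $r > 0$ (this is where we use that the collection $\{G_k\}_{k=1}^{n}$ is finite) and
\[
B(x, r) \subset B(x, r_k) \subset G_k, \qquad 1 \le k \le n.
\]
Thus $B(x, r) \subset \bigcap_{k=1}^{n} G_k$ and this shows that $\bigcap_{k=1}^{n} G_k$ is open. Finally, if $F_k$ is
closed for $1 \le k \le n$, then $G_k = (F_k)^c$ is open and so
\[
\left(\bigcup_{k=1}^{n} F_k\right)^c = \bigcap_{k=1}^{n} (F_k)^c = \bigcap_{k=1}^{n} G_k
\]
is open by part (3). Thus $\bigcup_{k=1}^{n} F_k$ is closed by definition.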
1.1. Subspaces. Recall that if $Y$ is a subset of a metric space $X$, then we
may view $Y$ as a metric space in its own right, with metric given by that of $X$
restricted to $Y \times Y$. The metric space $(Y, d)$ is then called a subspace of $(X, d)$,
even though there is no linear structure on $X$. Note that if $p \in Y$ and $r > 0$, then
the ball $B_Y(p, r)$ in the metric space $Y$ satisfies
\[
B_Y(p, r) = \{z \in Y : d(p, z) < r\} = B_X(p, r) \cap Y, \tag{1.1}
\]
where $B_X(p, r)$ is the ball centered at $p$ with radius $r$ in the metric space $X$. Thus
if $E$ is a subset of $Y$, it can be considered as a subset of either the metric space $Y$
or the metric space $X$. Clearly the notions of $E$ being open or closed depend on
which space is considered the ambient space. For example, if
\[
E = \left\{(x, y) \in \mathbb{R}^2 : \operatorname{dist}\left(\left(\tfrac{1}{2}, 0\right), (x, y)\right) \le \tfrac{1}{2}\right\} \setminus \{(1, 0)\}
\]
is the ball with center $\left(\frac{1}{2}, 0\right)$ and radius $\frac{1}{2}$ together with its "boundary" except for the
point $(1, 0)$, then we have
\[
E \subset \mathbb{D} \subset \mathbb{R}^2.
\]
Now one can show that $E$ is a closed subset relative to the metric space $\mathbb{D}$, but it is
neither open nor closed as a subset relative to the metric space $\mathbb{R}^2$. Exercise: prove
this!
On the other hand, (1.1) provides the following simple connection between the open subsets relative to $X$ and the open subsets relative to $Y$.
Theorem 4. Let $Y$ be a subset of a metric space $X$. Then a subset $E$ of $Y$ is open relative to $Y$ if and only if there exists a set $G$ open relative to $X$ such that $E = G \cap Y$.
Proof: Suppose that $E$ is open relative to $Y$. Then for each $y \in E$ there is a positive radius $r_y$ such that $B_Y(y, r_y) \subset E$. Now set
$G = \bigcup_{y \in E} B_X(y, r_y)$,
where we note that we are using balls $B_X$ relative to $X$. Clearly $G$ is open relative to $X$ by Lemma 2 and Proposition 7 (1). From (1.1) we obtain
$G \cap Y = \bigcup_{y \in E} \left( B_X(y, r_y) \cap Y \right) = \bigcup_{y \in E} B_Y(y, r_y)$,
and the final set is equal to $E$ since $y \in B_Y(y, r_y) \subset E$ for each $y \in E$.
Conversely, suppose $G$ is open relative to $X$ and $E = G \cap Y$. Then given $y \in E$, there is $r_y > 0$ such that $B_X(y, r_y) \subset G$. From (1.1) we thus obtain
$B_Y(y, r_y) = B_X(y, r_y) \cap Y \subset G \cap Y = E$,
which shows that $E$ is open relative to $Y$.
1.2. Limit points. In order to define the notion of limit of a function later on, we will need the idea of a limit point of a set. A deleted ball $B'(y, r)$ in a metric space is the ball $B(y, r)$ minus its center $y$, i.e. $B'(y, r) = B(y, r) \setminus \{y\}$.
Definition 12. Suppose $(X, d)$ is a metric space and that $E$ is a subset of $X$. We say that $y \in X$ is a limit point of $E$ if every deleted ball centered at $y$ contains a point of $E$:
$B'(y, r) \cap E \ne \emptyset$ for all $r > 0$.
Note the following immediate consequence of this definition:
if $y$ is a limit point of $E$ then every deleted ball $B'(y, r)$ contains infinitely many points of $E$,
and so in particular $E$ must be infinite in order to have any limit points at all. Indeed, if $B'(y, r) \cap E = \{x_j\}_{j=1}^n$ contains only $n$ points, let $s = \min \{d(y, x_j)\}_{j=1}^n$. Then $s > 0$ and $B(y, s)$ doesn't contain any of the points $\{x_j\}_{j=1}^n$. Thus we have the contradiction $B'(y, s) \cap E = \emptyset$.
Limit points are closely related to the notion of a closed set.
Proposition 8. A set $F$ is closed in a metric space if and only if it contains all of its limit points.
Proof: Suppose first that $x$ is a limit point of $F$. Then in particular, $B(x, r) \cap F$ is nonempty for all $r > 0$, and so no ball $B(x, r)$ centered at $x$ is contained in $F^c$. If $F$ is closed, then $F^c$ is open and it then follows that $x \notin F^c$. Thus $x \in F$ and we have shown that a closed set $F$ contains all of its limit points.
Conversely, suppose that $F$ contains all of its limit points. Pick $x \in F^c$. Since $x$ is not a limit point of $F$, there is a deleted ball $B'(x, r)$ that does not intersect $F$. But $x \notin F$ as well so that $B(x, r)$ does not intersect $F$. Hence $B(x, r) \subset F^c$ and this shows that $F^c$ is open, and thus that $F$ is closed.
Definition 13. If $E$ is a subset of a metric space $X$, we define $E'$ (the derived set of $E$) to be the set of all limit points of $E$, and we define $\overline{E}$ (the closure of $E$) to be $E \cup E'$, the union of $E$ and all of its limit points.
As a corollary to Proposition 8 we obtain the following basic theorem for the metric space $\mathbb{R}$.
Theorem 5. Suppose that $E$ is a nonempty subset of the real numbers $\mathbb{R}$ that is bounded above, and let $\sup E$ be the least upper bound of $E$. Then $\sup E$ is in $\overline{E}$, and $\sup E \in E$ if $E$ is closed.
Proof: Since the real numbers $\mathbb{R}$ have the Least Upper Bound Property, $z = \sup E$ exists and satisfies the property that if $y < z$, then $y$ is not an upper bound of $E$, hence there exists $x \in E$ with $y < x \le z$. It follows that $B(z, r) \cap E \ne \emptyset$ for all $r > 0$ upon taking $y = z - r$ in the previous argument. Thus either $z \in E \subset \overline{E}$ or if not, then
$B'(z, r) \cap E \ne \emptyset$ for all $r > 0$,
in which case $z$ is a limit point of $E$, hence $z \in E' \subset \overline{E}$. Finally, Proposition 8 shows that $z \in E$ if $E$ is closed.
One might wonder if the set $\overline{E}$ contains limit points not in $\overline{E}$, or roughly speaking, if taking limit points of limit points yields new points. The answer is no, and in fact not only is $\overline{E}$ closed, it is the smallest closed set containing $E$.
Proposition 9. If $E$ is a subset of a metric space $X$, then
(1.2) $\overline{E} = \bigcap \{F \subset X : F \text{ is closed and } E \subset F\}$,
and $\overline{E}$ is the smallest closed set containing $E$.
Proof: Denote the right hand side of (1.2) by $\mathcal{E}$. Then $\mathcal{E}$ is a closed set by Proposition 7 (2). Thus by its very definition, it is the smallest closed set containing $E$, i.e. every other closed set $F$ containing $E$ contains $\mathcal{E}$. Now $\overline{E} \subset \mathcal{E}$ since by Proposition 8, every closed set $F$ containing $E$ also contains all the limit points $E'$ of $E$.
On the other hand, if $x \notin \overline{E}$, then there exists some $r > 0$ such that
$B(x, r) \cap E = \emptyset$.
Now $B(x, r)^c$ is closed since $B(x, r)$ is open by Lemma 2. Moreover $B(x, r)^c$ contains $E$ and so is a candidate for the intersection defining $\mathcal{E}$. This shows that $\mathcal{E} \subset B(x, r)^c$ and in particular that $x \notin \mathcal{E}$. This proves that $\mathcal{E} \subset \overline{E}$ and completes the proof of Proposition 9.
Lemma 3. $E'$ is closed.
Proof: Suppose that $z \in (E')'$ and $r > 0$. Then there is $y \in B'(z, r) \cap E'$. Let $s = \min \{d(z, y), r - d(z, y)\}$. Then $s > 0$ and there is $x \in B'(y, s) \cap E$. Now $x \ne z$ since otherwise $s \le d(z, y) = d(x, y) < s$, a contradiction. Also,
$d(z, x) \le d(z, y) + d(y, x) < d(z, y) + s \le r$.
Thus $x \in B'(z, r) \cap E$ and this shows that $z \in E'$. Hence $E'$ contains all of its limit points and is closed by Proposition 8, as required.
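A set $F$ is called a subset of $E$ if $F \subset E$ and a superset of $E$ if $E \subset F$. Proposition 9 says every set $E$ has a smallest closed superset, namely the closure $\overline{E}$ of $E$. It should come as no surprise that there is also a largest open subset of $E$, namely the interior $\mathring{E}$ of $E$ defined by
(1.3) $\mathring{E} = \{x \in E : \text{there exists } r > 0 \text{ with } B(x, r) \subset E\}$.
It is easy to see that $\mathring{E} = \bigcup \{B : B \text{ is a ball contained in } E\}$ and that $\mathring{E} = \left( \overline{E^c} \right)^c$, the complement of the closure of the complement of $E$. Indeed, since balls are open, their complements are closed, and we thus have
$x \in \mathring{E} \Longleftrightarrow B(x, r) \subset E$ for some $r > 0$
$\Longleftrightarrow E^c \subset B(x, r)^c$ for some $r > 0$
$\Longleftrightarrow \overline{E^c} \subset B(x, r)^c$ for some $r > 0$ (since $B(x, r)^c$ is closed)
$\Longleftrightarrow B(x, r) \subset \left( \overline{E^c} \right)^c$ for some $r > 0$
$\Longleftrightarrow x \in \left( \overline{E^c} \right)^c$ (since $\left( \overline{E^c} \right)^c$ is open).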
2. Compact sets
Now we come to the single most important property that a subset of a metric space can have, namely compactness. In a sense, compact subsets share the most important topological properties enjoyed by finite sets. It turns out that the most basic of these properties is rather abstract looking at first sight, but arises so often in applications and subsequent theory that we will use it as the definition of compactness. But first we introduce some needed terminology.
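Let $E$ be a subset of a metric space $X$. A collection $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ of subsets $G_\alpha$ of $X$ is said to be an open cover of $E$ if
each $G_\alpha$ is open and $E \subset \bigcup_{\alpha \in A} G_\alpha$.
A finite subcover (relative to the open cover $\mathcal{G}$ of $E$) is a finite collection $\{G_{\alpha_k}\}_{k=1}^n$ of the open sets $G_\alpha$ that still covers $E$:
$E \subset \bigcup_{k=1}^n G_{\alpha_k}$.
For example, the collection $\mathcal{G} = \left\{ \left( \tfrac{1}{n}, 1 + \tfrac{1}{n} \right) \right\}_{n=1}^\infty$ of open intervals in $\mathbb{R}$ forms an open cover of the interval $E = \left( \tfrac{1}{8}, 2 \right)$, and $\left\{ \left( \tfrac{1}{n}, 1 + \tfrac{1}{n} \right) \right\}_{n=1}^8$ is a finite subcover. Draw a picture! However, $\mathcal{G}$ is also an open cover of the interval $E = (0, 2)$ for which there is no finite subcover since
$\tfrac{1}{m} \notin \left( \tfrac{1}{n}, 1 + \tfrac{1}{n} \right)$ for all $1 \le n \le m$.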
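Both claims about this cover can be checked with exact rational arithmetic. The following Python sketch is a hedged illustration only: the helper names in_interval and covered_by_first are ours, and sampling finitely many points of $\left( \tfrac{1}{8}, 2 \right)$ is evidence for the covering claim, not a proof of it.

```python
from fractions import Fraction

def in_interval(x, n):
    """True if x lies in the open interval (1/n, 1 + 1/n)."""
    return Fraction(1, n) < x < 1 + Fraction(1, n)

def covered_by_first(x, m):
    """True if x lies in one of the first m intervals of the cover."""
    return any(in_interval(x, n) for n in range(1, m + 1))

# Sample points strictly inside (1/8, 2): each lies in one of the first 8 intervals.
samples = [Fraction(1, 8) + Fraction(15 * j, 8000) for j in range(1, 1000)]
print(all(covered_by_first(x, 8) for x in samples))

# The point 1/m of (0, 2) is missed by the first m intervals, for every m tried.
print(any(covered_by_first(Fraction(1, m), m) for m in range(1, 51)))
```

The second check reflects the displayed inequality: any finite subfamily is contained in the first m intervals for some m, and those miss the point 1/m.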
Definition 14. A subset $E$ of a metric space $X$ is compact if every open cover of $E$ has a finite subcover.
Example 5. Clearly every finite set is compact. On the other hand, the interval $(0, 2)$ is not compact since $\mathcal{G} = \left\{ \left( \tfrac{1}{n}, 1 + \tfrac{1}{n} \right) \right\}_{n=1}^\infty$ is an open cover of $(0, 2)$ that does not have a finite subcover.
The above example makes it clear that all we need is one "bad" cover as witness to the failure of a set to be compact. On the other hand, in order to show that an infinite set is compact, we must often work much harder, namely we must show that given any open cover, there is always a finite subcover. It will obviously be of great advantage if we can find simpler criteria for a set to be compact, and this will be carried out below in various situations, see e.g. Remark 10 below. For now we will content ourselves with giving one simple example of an infinite compact subset of the real numbers (even of the rational numbers).
Example 6. The set $K = \{0\} \cup \left\{ \tfrac{1}{k} \right\}_{k=1}^\infty$ is compact in $\mathbb{R}$ or $\mathbb{Q}$. Indeed, suppose that $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ is an open cover of $K$. Then at least one of the open sets in $\mathcal{G}$ contains $0$, say $G_{\alpha_0}$. Since $G_{\alpha_0}$ is open, there is $r > 0$ such that
$B(0, r) \subset G_{\alpha_0}$.
Now comes the crux of the argument: there are only finitely many points $\tfrac{1}{k}$ that lie outside $B(0, r)$, i.e. $\tfrac{1}{k} \notin B(0, r)$ if and only if $k \le \left\lfloor \tfrac{1}{r} \right\rfloor \equiv n$. Now choose $G_{\alpha_k}$ to contain $\tfrac{1}{k}$ for each $k$ between $1$ and $n$ inclusive (with possible repetitions). Then the finite collection of open sets $\{G_{\alpha_0}, G_{\alpha_1}, G_{\alpha_2}, ..., G_{\alpha_n}\}$ (after removing repetitions) constitutes a finite subcover relative to the open cover $\mathcal{G}$ of $K$. Thus we have shown that every open cover of $K$ has a finite subcover.
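The counting step in Example 6 can be made concrete. The sketch below (Python, exact arithmetic) lists the points $\tfrac{1}{k}$ lying outside $B(0, r)$; the function name points_outside_ball and the sample radius are assumptions made for illustration.

```python
from fractions import Fraction
from math import floor

def points_outside_ball(r):
    """Return the points 1/k of K that do not lie in B(0, r) = (-r, r)."""
    n = floor(1 / r)  # 1/k >= r exactly when k <= floor(1/r)
    return [Fraction(1, k) for k in range(1, n + 1)]

r = Fraction(3, 10)  # assumed sample radius
outside = points_outside_ball(r)
print(len(outside), outside)
```

A finite subcover then needs at most one open set for 0 (which also covers every point inside the ball) plus one open set for each listed point.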
It is instructive to observe that $K = \overline{E}$ where $E = \left\{ \tfrac{1}{k} \right\}_{k=1}^\infty$ is not compact (since the pairwise disjoint balls $B\left( \tfrac{1}{k}, \tfrac{1}{4k^2} \right) = \left( \tfrac{1}{k} - \tfrac{1}{4k^2}, \tfrac{1}{k} + \tfrac{1}{4k^2} \right)$ cover $E$ "one point at a time"). Thus the addition of the single limit point $0$ to the set $E$ resulted in making the union compact. The argument given as proof in the above example serves to illustrate the sense in which the set $K$ is topologically "almost" a finite set.
As a final example to illustrate the concept of compactness, we show that any unbounded set in a metric space fails to be compact. We say that a subset $E$ of a metric space $X$ is bounded if there is some ball $B(x, r)$ in $X$ that contains $E$. So now suppose that $E$ is unbounded. Fix a point $x \in X$ and consider the open cover $\{B(x, n)\}_{n=1}^\infty$ of $E$ (this is actually an open cover of the entire metric space $X$). Now if there were a finite subcover, say $\{B(x, n_k)\}_{k=1}^N$ where $n_1 < n_2 < ... < n_N$, then because the balls are increasing,
$E \subset \bigcup_{k=1}^N B(x, n_k) = B(x, n_N)$,
which contradicts the assumption that $E$ is unbounded. We record this fact in the following lemma.
Lemma 4. A compact subset of a metric space is bounded.
Remark 10. We can now preview one of the major themes in our development of analysis. The Least Upper Bound Property of the real numbers will lead directly to the following beautiful characterization of compactness in the metric space $\mathbb{R}$ of real numbers, the Heine-Borel theorem: a subset $K$ of $\mathbb{R}$ is compact if and only if $K$ is closed and bounded.
Before proceeding to develop further properties of compact subsets, and their relationship to open and closed subsets, we establish a truly surprising aspect of the definition, namely that compactness is an intrinsic property of a set $K$. By this we mean:
Lemma 5. If $K \subset Y \subset X$ where $X$ is a metric space, then $K$ is compact relative to the metric space $X$ if and only if it is a compact subset relative to the subspace $Y$.
In particular, we can take $Y = K$ here and obtain that
$K$ is a compact subset of a metric space $X$ if and only if it is compact when considered as a metric space in its own right, i.e. if and only if every cover of $K$ by subsets of $K$ that are open in $K$ has a finite subcover.
This means that it makes sense to talk of a compact set $K$ without reference to a larger metric space in which it is a proper subset, compare Example 6 above. Note how this contrasts with the property of a set $G$ being open or closed, which depends heavily on the ambient metric space, see Subsection 1.1 on subspaces above.
Proof (of Lemma 5): Suppose that $K$ is compact relative to $X$. We now show $K$ is compact relative to $Y$. So let $\mathcal{E} = \{E_\alpha\}_{\alpha \in A}$ be an open cover of $K$ in the metric space $Y$. By Theorem 4 there are open sets $G_\alpha$ in $X$ so that
$E_\alpha = G_\alpha \cap Y$.
Then $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ is an open cover of $K$ relative to $X$, and since $K$ is compact relative to $X$, there is a finite subcover $\{G_{\alpha_k}\}_{k=1}^n$,
$K \subset \bigcup_{k=1}^n G_{\alpha_k}$.
But $K \subset Y$ so that
$K \subset \bigcup_{k=1}^n \left( G_{\alpha_k} \cap Y \right) = \bigcup_{k=1}^n E_{\alpha_k}$,
which shows that $\{E_{\alpha_k}\}_{k=1}^n$ is a finite subcover of the open cover $\mathcal{E} = \{E_\alpha\}_{\alpha \in A}$.
Conversely, suppose that $K$ is compact relative to $Y$. We now show that $K$ is compact relative to $X$. So let $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ be an open cover of $K$ relative to $X$. If $E_\alpha = G_\alpha \cap Y$, then $\mathcal{E} = \{E_\alpha\}_{\alpha \in A}$ is an open cover of $K$ in the metric space $Y$. Since $K$ is compact relative to $Y$, there is a finite subcover $\{E_{\alpha_k}\}_{k=1}^n$. But then
$K \subset \bigcup_{k=1}^n E_{\alpha_k} \subset \bigcup_{k=1}^n G_{\alpha_k}$,
and so $\{G_{\alpha_k}\}_{k=1}^n$ is a finite subcover of the open cover $\mathcal{G}$.
2.1. Properties of compact sets. We now prove a number of properties
that hold for general compact sets. In the next subsection we will restrict attention
to compact subsets of the real numbers and Euclidean spaces.
Lemma 6. If $K$ is a compact subset of a metric space $X$, then $K$ is a closed subset of $X$.
Proof: We show that $K^c$ is open. So fix a point $x \in K^c$. For each point $y \in K$, consider the ball $B(y, r_y)$ with
(2.1) $r_y = \tfrac{1}{2} d(x, y)$.
Since $\{B(y, r_y)\}_{y \in K}$ is an open cover of the compact set $K$, there is a finite subcover $\{B(y_k, r_{y_k})\}_{k=1}^n$ with of course $y_k \in K$ for $1 \le k \le n$. Now by the triangle inequality and (2.1) it follows that
(2.2) $B(x, r_{y_k}) \cap B(y_k, r_{y_k}) = \emptyset$, $1 \le k \le n$.
Indeed, if the intersection on the left side of (2.2) contained a point $z$ then we would have the contradiction
$d(x, y_k) \le d(x, z) + d(z, y_k) < r_{y_k} + r_{y_k} = d(x, y_k)$.
Now we simply take $r = \min \{r_{y_k}\}_{k=1}^n > 0$ and note that $B(x, r) \subset B(x, r_{y_k})$ so that
$B(x, r) \cap K \subset B(x, r) \cap \left( \bigcup_{k=1}^n B(y_k, r_{y_k}) \right) = \bigcup_{k=1}^n \left( B(x, r) \cap B(y_k, r_{y_k}) \right) \subset \bigcup_{k=1}^n \left( B(x, r_{y_k}) \cap B(y_k, r_{y_k}) \right) = \bigcup_{k=1}^n \emptyset = \emptyset$,
by (2.2). This shows that $B(x, r) \subset K^c$ and completes the proof that $K^c$ is open. Draw a picture of this proof!
Lemma 7. If $F \subset K \subset X$ where $F$ is closed in the metric space $X$ and $K$ is compact, then $F$ is compact.
Proof: Let $\mathcal{G} = \{G_\alpha\}_{\alpha \in A}$ be an open cover (relative to $X$) of $F$. We must construct a finite subcover $\mathcal{S}$ of $F$. Now $\mathcal{G}^* = \{F^c\} \cup \mathcal{G}$ is an open cover of $K$. By compactness of $K$ there is a finite subcover $\mathcal{S}^*$ of $\mathcal{G}^*$ that consists of sets from $\mathcal{G}$ and possibly the set $F^c$. However, if we drop the set $F^c$ from the subcover $\mathcal{S}^*$ the resulting finite collection of sets $\mathcal{S}$ from $\mathcal{G}$ is still a cover of $F$ (although not necessarily of $K$), and provides the required finite subcover of $F$.
Corollary 4. If $F$ is closed and $K$ is compact, then $F \cap K$ is compact.
Proof: We have that $K$ is closed by Lemma 6, and then $F \cap K$ is closed by Proposition 7 (2). Now $F \cap K \subset K$ and so Lemma 7 shows that $F \cap K$ is compact.
Remark 11. With respect to unions, compact sets behave like finite sets, namely the union of finitely many compact sets is compact. Indeed, suppose $K$ and $L$ are compact subsets of a metric space, and let $\{G_\alpha\}_{\alpha \in A}$ be an open cover of $K \cup L$. Then there is a finite subcover $\{G_\alpha\}_{\alpha \in I}$ of $K$ and also a (usually different) finite subcover $\{G_\alpha\}_{\alpha \in J}$ of $L$ (here $I$ and $J$ are finite subsets of $A$). But then the union of these covers $\{G_\alpha\}_{\alpha \in I \cup J} = \{G_\alpha\}_{\alpha \in I} \cup \{G_\alpha\}_{\alpha \in J}$ is a finite subcover of $K \cup L$, which shows that $K \cup L$ is compact.
Now we come to one of the most useful consequences of compactness in applications. A family of sets $\{E_\alpha\}_{\alpha \in A}$ is said to have the finite intersection property if
$\bigcap_{\alpha \in F} E_\alpha \ne \emptyset$
for every finite subset $F$ of the index set $A$. For example the family of open intervals $\left\{ \left( 0, \tfrac{1}{n} \right) \right\}_{n=1}^\infty$ has the finite intersection property despite the fact that the sets have no element in common: $\bigcap_{n=1}^\infty \left( 0, \tfrac{1}{n} \right) = \emptyset$. The useful consequence of compactness referred to above is that this cannot happen for compact subsets!
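The example of the intervals $\left( 0, \tfrac{1}{n} \right)$ can be explored with a short Python sketch. It is an illustration only; the function names are ours, and intervals are represented by their endpoint pairs as an assumed convention.

```python
from fractions import Fraction

def finite_intersection(indices):
    """Endpoints (a, b) of the intersection of the intervals (0, 1/n), n in indices."""
    return (Fraction(0), Fraction(1, max(indices)))

def first_interval_missing(x):
    """Smallest n such that x > 0 does not lie in (0, 1/n)."""
    n = 1
    while x < Fraction(1, n):
        n += 1
    return n

a, b = finite_intersection([2, 5, 9])
print(a < b)                                     # nonempty finite intersection
print(first_interval_missing(Fraction(1, 100)))  # some interval excludes the point
```

Every finite intersection is again an interval $\left( 0, \tfrac{1}{N} \right)$ with $N$ the largest index, hence nonempty, while each candidate point $x > 0$ is excluded by some interval in the family.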
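Theorem 6. Suppose that $\{K_\alpha\}_{\alpha \in A}$ is a family of compact sets with the finite intersection property. Then
$\bigcap_{\alpha \in A} K_\alpha \ne \emptyset$.
Proof: Fix a member $K_{\alpha_0}$ of the family $\{K_\alpha\}_{\alpha \in A}$. Assume in order to derive a contradiction that no point of $K_{\alpha_0}$ belongs to every $K_\alpha$. Then the open sets $\{K_\alpha^c\}_{\alpha \in A \setminus \{\alpha_0\}}$ form an open cover of $K_{\alpha_0}$. By compactness, there is a finite subcover $\{K_\alpha^c\}_{\alpha \in F \setminus \{\alpha_0\}}$ with $F$ finite, so that
$K_{\alpha_0} \subset \bigcup_{\alpha \in F \setminus \{\alpha_0\}} K_\alpha^c$,
i.e.
$K_{\alpha_0} \cap \bigcap_{\alpha \in F \setminus \{\alpha_0\}} K_\alpha = \emptyset$,
which contradicts our assumption that the finite intersection property holds.
Corollary 5. If $\{K_n\}_{n=1}^\infty$ is a nonincreasing sequence of nonempty compact sets, i.e. $K_{n+1} \subset K_n$ for all $n \ge 1$, then $\bigcap_{n=1}^\infty K_n \ne \emptyset$.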
Theorem 7. If $E$ is an infinite subset of a compact set $K$, then $E$ has a limit point in $K$.
Proof: Suppose, in order to derive a contradiction, that no point of $K$ is a limit point of $E$. Then for each $z \in K$, there is a ball $B(z, r_z)$ that contains at most one point of $E$ (namely $z$ if $z$ is in $E$). Thus it is not possible for a finite number of these balls $B(z, r_z)$ to cover the infinite set $E$. Thus $\{B(z, r_z)\}_{z \in K}$ is an open cover of $K$ that has no finite subcover (since a finite subcover cannot cover even the subset $E$ of $K$). This contradicts the assumption that $K$ is compact.
There is a converse to this theorem that leads to the following characterization of compactness in a general metric space.
Theorem 8. A metric space $(X, d)$ is compact if and only if every infinite subset of $X$ has a limit point in $X$.
Proof: The "only if" statement is Theorem 7. The proof of the "if" statement is a bit delicate, and we content ourselves with a mere sketch here. First we note that $X$ has a countable dense subset $E$, i.e. every nonempty open subset $G$ contains a point of $E$. Indeed, for each $n \in \mathbb{N}$ there exists a finite set of balls $\left\{ B\left( x_k^n, \tfrac{1}{n} \right) \right\}_{k=1}^{N_n}$ that cover $X$. To see this we inductively define $x_k^n$ so that $d(x_i^n, x_k^n) \ge \tfrac{1}{n}$ for all $1 \le i < k$, and note that the process must terminate since otherwise $\{x_i^n\}_{i=1}^\infty$ would be an infinite subset of $X$ with no limit point, a contradiction. The set $E = \bigcup_{n=1}^\infty \{x_k^n\}_{k=1}^{N_n}$ is then countable and dense in $X$. Second we use this to construct a countable base for $X$, i.e. a countable collection of open sets $\mathcal{B} = \{B_n\}_{n=1}^\infty$ such that for every open set $G$ and $z \in G$ there is $n \ge 1$ such that $z \in B_n \subset G$. Indeed, if $E$ is a countable dense subset, then $\mathcal{B} = \{B(x, r) : x \in E, r \in \mathbb{Q} \cap (0, 1)\}$ is a countable base.
Now suppose that $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $X$. For each $x \in X$ there is an index $\alpha$ and a ball $B_x \in \mathcal{B}$ such that
(2.3) $x \in B_x \subset G_\alpha$.
Note that the axiom of choice is not needed here since $\mathcal{B}$ is countable, hence well-ordered. If we can show that the open cover $\widetilde{\mathcal{B}} = \{B_x : x \in X\}$ has a finite subcover, then (2.3) shows that $\{G_\alpha\}_{\alpha \in A}$ has a finite subcover as well. So it remains to show that $\widetilde{\mathcal{B}}$ has a finite subcover. Relabel the open cover $\widetilde{\mathcal{B}}$ as $\widetilde{\mathcal{B}} = \{B_n\}_{n=1}^\infty$. Assume, in order to derive a contradiction, that $\widetilde{\mathcal{B}}$ has no finite subcover. Then the sets
$F_N = X \setminus \left( \bigcup_{k=1}^N B_k \right)$
are nonempty closed sets that are decreasing, i.e. $F_{N+1} \subset F_N$, and that have empty intersection. Thus if we choose $x_N \in F_N$ for each $N$, the set $E = \bigcup_{N=1}^\infty \{x_N\}$ must be an infinite set, and so has a limit point $x \in X$. But then the fact that the $F_N$ are closed and decreasing implies that $x \in F_N$ for all $N$, the desired contradiction.

2.2. Compact subsets of Euclidean space. The Least Upper Bound Property of the real numbers plays a crucial role in the proof that closed bounded intervals are compact.
Theorem 9. The closed interval $[a, b]$ is compact (with the usual metric) for all $a < b$.
We give two proofs of this basic theorem. The second proof will be generalized to prove that closed bounded rectangles in $\mathbb{R}^n$ are compact.
Proof #1: Assume for convenience that the interval is the closed unit interval $[0, 1]$, and suppose that $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $[0, 1]$. Now $1 \in G_\alpha$ for some $\alpha \in A$, and thus there is $r > 0$ such that $(1 - r, 1 + r) \subset G_\alpha$. With $a = 1 + \tfrac{r}{2} > 1$ it follows that $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $[0, a]$. Now define
$E = \{x \in [0, a] : \text{the interval } [0, x] \text{ has a finite subcover}\}$.
We have $E$ is nonempty ($0 \in E$) and bounded above (by $a$). Thus $\lambda = \sup E$ exists. We claim that $\lambda > 1$. Suppose for the moment that this has been proved. Then $1$ cannot be an upper bound of $E$ and so there is some $\beta \in E$ satisfying
$1 < \beta \le \lambda$.
Thus by the definition of the set $E$ it follows that $[0, \beta]$ has a finite subcover, and hence so does $[0, 1]$, which completes the proof of the theorem.
Now suppose, in order to derive a contradiction, that $\lambda \le 1$. Then there is some open set $G_\gamma$ with $\gamma \in A$ and also some $s > 0$ such that
$(\lambda - s, \lambda + s) \subset G_\gamma$.
Now by the definition of least upper bound, there is some $x \in E$ satisfying $\lambda - s < x \le \lambda$, and by taking $s$ less than $a - 1$ we can also arrange to have
$\lambda + s \le 1 + s < a$.
Thus there is a finite subcover $\{G_{\alpha_k}\}_{k=1}^n$ of $[0, x]$, and if we include the set $G_\gamma$ with this subcover we get a finite subcover of $\left[ 0, \lambda + \tfrac{s}{2} \right]$. This shows that $\lambda + \tfrac{s}{2} \in E$, which contradicts our assumption that $\lambda$ is an upper bound of $E$, and completes the proof of the theorem.
Proof #2: Suppose, in order to derive a contradiction, that there is an open cover $\{G_\alpha\}_{\alpha \in A}$ of $[a, b]$ that has no finite subcover. Then at least one of the two intervals $\left[ a, \tfrac{a+b}{2} \right]$ and $\left[ \tfrac{a+b}{2}, b \right]$ fails to have a finite subcover. Label it $[a_1, b_1]$ so that
$a \le a_1 < b_1 \le b$,
$b_1 - a_1 = \tfrac{1}{2} \delta$,
where $\delta = b - a$. Next we note that at least one of the two intervals $\left[ a_1, \tfrac{a_1+b_1}{2} \right]$ and $\left[ \tfrac{a_1+b_1}{2}, b_1 \right]$ fails to have a finite subcover. Label it $[a_2, b_2]$ so that
$a \le a_1 \le a_2 < b_2 \le b_1 \le b$,
$b_2 - a_2 = \tfrac{1}{4} \delta$.
Continuing in this way we obtain for each $n \ge 2$ an interval $[a_n, b_n]$ such that
(2.4) $a \le a_1 \le ... \le a_{n-1} \le a_n < b_n \le b_{n-1} \le ... \le b_1 \le b$,
$b_n - a_n = \tfrac{1}{2^n} \delta$, and $[a_n, b_n]$ fails to have a finite subcover.
Now let $E = \{a_n : n \ge 1\}$ and set $x = \sup E$. From (2.4) we obtain that each $b_n$ is an upper bound for $E$, hence $x \le b_n$ and we have
$a \le a_n \le x \le b_n \le b$, for all $n \ge 1$,
i.e. $x \in [a_n, b_n]$ for all $n \ge 1$. Now $x \in [a, b]$ and so there is $\alpha \in A$ and $r > 0$ such that
$(x - r, x + r) \subset G_\alpha$.
By the Archimedean property of $\mathbb{R}$ we can choose $n \in \mathbb{N}$ so large that $\tfrac{\delta}{r} < n < 2^n$ (it is easy to prove $n < 2^n$ for all $n \in \mathbb{N}$ by induction), so that $b_n - a_n = \tfrac{\delta}{2^n} < r$ and hence
$[a_n, b_n] \subset (x - r, x + r) \subset G_\alpha$.
But this contradicts our construction that $[a_n, b_n]$ has no finite subcover, and completes the proof of the theorem.
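The bookkeeping in Proof #2 (nested intervals, halved lengths, and the choice of $n$) can be traced in code. The Python sketch below is an illustration under stated assumptions: the property "fails to have a finite subcover" cannot be computed, so a hypothetical rule keep_left stands in for it, and all function names are ours.

```python
from fractions import Fraction

def bisect_intervals(a, b, keep_left, steps):
    """Return [(a_1, b_1), ..., (a_steps, b_steps)] obtained by repeated halving.

    keep_left(lo, mid, hi) is a hypothetical stand-in for the statement
    'the left half [lo, mid] fails to have a finite subcover'.
    """
    out = []
    lo, hi = Fraction(a), Fraction(b)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if keep_left(lo, mid, hi):
            hi = mid
        else:
            lo = mid
        out.append((lo, hi))
    return out

def first_n_inside(delta, r):
    """Smallest n >= 1 with delta / 2**n < r, so that [a_n, b_n] fits in (x - r, x + r)."""
    n = 1
    while delta / 2 ** n >= r:
        n += 1
    return n

# Assumed rule: always keep the half containing a fixed target point.
target = Fraction(1, 3)
intervals = bisect_intervals(0, 1, lambda lo, mid, hi: target <= mid, 10)
print(intervals[-1], first_n_inside(Fraction(1), Fraction(1, 100)))
```

With this rule the left endpoints increase, the right endpoints decrease, the n-th interval has length $\delta / 2^n$, and every interval contains the target, mirroring (2.4) and the role of $x = \sup E$.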
Corollary 6. A subset $K$ of the real numbers $\mathbb{R}$ is compact if and only if $K$ is closed and bounded.
Proof: Suppose that $K$ is compact. Then $K$ is bounded by Lemma 4 and is closed by Lemma 6. Conversely if $K$ is bounded, then $K \subset [-a, a]$ for some $a > 0$. Now $[-a, a]$ is compact by Theorem 9, and if $K$ is closed, then Lemma 7 shows that $K$ is compact.
Proof #2 of Theorem 9 is easily adapted to prove that closed rectangles
$R = \prod_{k=1}^n [a_k, b_k] = [a_1, b_1] \times ... \times [a_n, b_n]$
in $\mathbb{R}^n$ are compact.
Theorem 10. The closed rectangle $R = \prod_{k=1}^n [a_k, b_k]$ is compact (with the usual metric) for all $a_k < b_k$, $1 \le k \le n$.
Proof: Here is a brief sketch of the proof. Suppose, in order to derive a contradiction, that there is an open cover $\{G_\alpha\}_{\alpha \in A}$ of $R$ that has no finite subcover. It is convenient to write $R$ as a product of closed intervals with superscripts instead of subscripts: $R = \prod_{k=1}^n \left[ a^k, b^k \right]$. Now divide $R$ into $2^n$ congruent closed rectangles. At least one of them fails to have a finite subcover. Label it $R_1 = \prod_{k=1}^n \left[ a_1^k, b_1^k \right]$, and repeat the process to obtain a sequence of decreasing rectangles $R_m = \prod_{k=1}^n \left[ a_m^k, b_m^k \right]$ with
$a^k \le a_1^k \le ... \le a_{m-1}^k \le a_m^k < b_m^k \le b_{m-1}^k \le ... \le b_1^k \le b^k$,
$b_m^k - a_m^k = \tfrac{1}{2^m} \delta^k$,
where $\delta^k = b^k - a^k$, $1 \le k \le n$. Then if we set $x^k = \sup \left\{ a_m^k : m \ge 1 \right\}$ we obtain that $x = \left( x^1, ..., x^n \right) \in R_m \subset R$ for all $m$. Thus there is $\alpha \in A$, $r > 0$ and $m \ge 1$ such that
$R_m \subset B(x, r) \subset G_\alpha$,
contradicting our construction that $R_m$ has no finite subcover.
Theorem 11. Let $K$ be a subset of Euclidean space $\mathbb{R}^n$. Then the following three conditions are equivalent:
(1) $K$ is closed and bounded;
(2) $K$ is compact;
(3) every infinite subset of $K$ has a limit point in $K$.
Proof: We prove that (1) implies (2) implies (3) implies (1). If $K$ is closed and bounded, then it is contained in a closed rectangle $R$, and is thus compact by Theorem 10 and Lemma 7. If $K$ is compact, then every infinite subset of $K$ has a limit point in $K$ by Theorem 7. Finally suppose that every infinite subset of $K$ has a limit point in $K$. Of course Theorem 8 implies that $K$ is compact, hence closed and bounded by Lemmas 6 and 4, but in Euclidean space there is a much simpler proof that avoids the use of Theorem 8.
Suppose first, in order to derive a contradiction, that $K$ is not bounded. Then there is a sequence $\{x_k\}_{k=1}^\infty$ of points in $K$ with $|x_k| \ge k$ for all $k$. Clearly the set of points in $\{x_k\}_{k=1}^\infty$ is an infinite subset $E$ of $K$ but has no limit point in $\mathbb{R}^n$, hence not in $K$ either. Suppose next, in order to derive a contradiction, that $K$ is not closed. Then there is a limit point $x$ of $K$ that is not in $K$. Thus each deleted ball $B'\left( x, \tfrac{1}{k} \right)$ contains some point $x_k$ from $K$. Again it is clear that the set of points in the sequence $\{x_k\}_{k=1}^\infty$ is an infinite subset of $K$ but contains no limit point in $K$ since its only limit point is $x$ and this is not in $K$.
Corollary 7. Every bounded infinite subset of $\mathbb{R}^n$ has a limit point in $\mathbb{R}^n$.
3. Fractal sets
We say that a subset $E$ of a Euclidean space $\mathbb{R}^n$ is a fractal set if it replicates under dilation and translation/rotation in the following way: there are positive integers $k$ and $m$ such that
(3.1) $kE = E_1 \cup E_2 \cup ... \cup E_m$,
where $kE$ is a dilation of $E$ by factor $k$,
$kE = \{kx : x \in E\}$,
each $E_j$ is a translation and rotation of $E$ by some vector $a_j$ and rotation matrix $M_j$,
$E_j = \{M_j (x + a_j) : x \in E\}$,
and finally where the sets $E_j$ are pairwise disjoint (sometimes we will relax this condition somewhat to require some notion of "essentially" pairwise disjoint). We will refer to the number $\alpha = \tfrac{\ln m}{\ln k}$ as the fractal dimension of $E$. This terminology is explained below.
The simplest example of a fractal is the unit half open half closed cube $I^n$ in $\mathbb{R}^n$:
$I^1 = [0, 1)$,
$I^2 = [0, 1) \times [0, 1)$,
$I^n = [0, 1)^n = \prod_{j=1}^n [0, 1)$.
With $E = I^n$, $k = 2$ and $m = 2^n$ we have,
$kE = 2I^n = [0, 2)^n = \bigcup_{(\ell_1, ..., \ell_n) \in \{0, 1\}^n} \left( [0, 1)^n + (\ell_1, ..., \ell_n) \right) = \bigcup_{j=1}^{2^n} (I^n + a_j) = \bigcup_{j=1}^m E_j$,
where $\{a_j\}_{j=1}^{2^n}$ is an enumeration of the $2^n$ sequences $(\ell_1, ..., \ell_n)$ of $0$'s and $1$'s having length $n$. Note that if we let $k$ denote an integer larger than $2$, then we would have
$kE = \bigcup_{j=1}^m E_j$
with $m = k^n$. Thus the quantity which remains invariant in these calculations is the exponent $n$ satisfying $m = k^n$ or
$n = \log_k m = \tfrac{\ln m}{\ln k}$.
Note that the compact set $\overline{I^n} = [0, 1]^n$ also satisfies (3.1) with the same translations, but where the $E_j$ overlap on edges. As $n$ is the dimension of the cube $I^n$, we will more generally refer to the quantity
$\alpha = \log_k m = \tfrac{\ln m}{\ln k}$
associated to a fractal set $E$ as the fractal dimension of $E$. It can be shown that if $E$ satisfies two different pairwise disjoint replications $k_1 E = \bigcup_{j=1}^{m_1} E_j^1$ and $k_2 E = \bigcup_{j=1}^{m_2} E_j^2$, then $\alpha = \log_{k_1} m_1 = \log_{k_2} m_2$ is independent of the replication and depends only on $E$.
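The invariance of the exponent for the cube can be checked numerically. In the following Python sketch the function name fractal_dimension is our own, and floating point logarithms are used, so the comparison is made up to rounding.

```python
from math import isclose, log

def fractal_dimension(k, m):
    """alpha = ln m / ln k for a replication kE = E_1 u ... u E_m."""
    return log(m) / log(k)

# For the cube I^n, dilation by k produces m = k^n translated copies,
# so the computed exponent should equal n whatever k is used.
for n in (1, 2, 3):
    for k in (2, 3, 5):
        print(n, k, fractal_dimension(k, k ** n))
```

Each printed value agrees with $n$ up to rounding error, independently of the dilation factor $k$.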
3.1. The Cantor set. We now construct our first nontrivial fractal, the Cantor middle thirds set (1883). It turns out to have fractional dimension. We start with the closed unit interval $I = I^0 = [0, 1]$. Now remove the open middle third $\left( \tfrac{1}{3}, \tfrac{2}{3} \right)$ of length $\tfrac{1}{3}$ and denote the two remaining closed intervals of length $\tfrac{1}{3}$ by $I_1^1 = \left[ 0, \tfrac{1}{3} \right]$ and $I_2^1 = \left[ \tfrac{2}{3}, 1 \right]$. Then remove the open middle third $\left( \tfrac{1}{9}, \tfrac{2}{9} \right)$ of length $\tfrac{1}{3^2}$ from $I_1^1 = \left[ 0, \tfrac{1}{3} \right]$ and denote the two remaining closed intervals of length $\tfrac{1}{3^2}$ by $I_1^2$ and $I_2^2$. Do the same for $I_2^1$ and denote the two remaining closed intervals by $I_3^2$ and $I_4^2$.
Continuing in this way, we obtain at the $k$-th generation, a collection $\left\{ I_j^k \right\}_{j=1}^{2^k}$ of $2^k$ pairwise disjoint closed intervals of length $\tfrac{1}{3^k}$. Let $K_k = \bigcup_{j=1}^{2^k} I_j^k$ and set
$E = \bigcap_{k=1}^\infty K_k = \bigcap_{k=1}^\infty \left( \bigcup_{j=1}^{2^k} I_j^k \right)$.
Now by Proposition 7 each set $K_k$ is closed, and hence so is the intersection $E$. Then $E$ is compact by Corollary 6. It also follows from Corollary 5 that $E$ is nonempty. Next we observe that by its very construction, $E$ is a fractal satisfying the replication identity
$3E = E \cup (E + 2) = E_1 \cup E_2$.
Thus the fractal dimension $\alpha$ of the Cantor set $E$ is $\tfrac{\ln 2}{\ln 3}$. Moreover, $E$ has the property of being perfect.
Definition 15. A subset $E$ of a metric space $X$ is perfect if $E$ is closed and every point in $E$ is a limit point of $E$.
To see that the Cantor set is perfect, pick $x \in E$. For each $k \ge 1$ the point $x$ lies in exactly one of the closed intervals $I_j^k$ for some $j$ between $1$ and $2^k$. Since the length of $I_j^k$ is positive, in fact $\tfrac{1}{3^k} > 0$, it is possible to choose a point $x_k \in E \cap I_j^k \setminus \{x\}$ (for instance an endpoint of $I_j^k$ different from $x$, since the endpoints of the intervals $I_j^k$ are never removed and so belong to $E$). Now the set of points in the sequence $\{x_k\}_{k=1}^\infty$ is an infinite subset of $E$ and clearly has $x$ as a limit point. This completes the proof that the Cantor set $E$ is perfect.
By summing the lengths of the removed open middle thirds, we obtain
$\mathrm{length}\left( [0, 1] \setminus E \right) = \tfrac{1}{3} + \tfrac{2}{3^2} + \tfrac{2^2}{3^3} + ... = 1$,
and it follows that $E$ is nonempty, compact and has "length" $1 - 1 = 0$. Another way to exhibit the same phenomenon is to note that for each $k \ge 1$ the Cantor set $E$ is a subset of the closed set $K_k$ which is a union of $2^k$ intervals each having length $\tfrac{1}{3^k}$. Thus the length of $K_k$ is $2^k \tfrac{1}{3^k} = \left( \tfrac{2}{3} \right)^k$, and the length of $E$ is at most
$\inf \left\{ \left( \tfrac{2}{3} \right)^k : k \ge 1 \right\} = 0$.
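The two length computations can be compared generation by generation. The following Python sketch uses exact fractions; the function names removed_length and remaining_length are ours and purely illustrative.

```python
from fractions import Fraction

def removed_length(k):
    """Total length removed in the first k generations: sum of 2^(i-1) / 3^i."""
    return sum(Fraction(2 ** (i - 1), 3 ** i) for i in range(1, k + 1))

def remaining_length(k):
    """Length of K_k, a union of 2^k intervals of length 3^(-k)."""
    return Fraction(2, 3) ** k

for k in (1, 2, 5, 20):
    print(k, removed_length(k), remaining_length(k))
```

At every generation the removed and remaining lengths add up to exactly 1, so the partial sums of the series above equal $1 - \left( \tfrac{2}{3} \right)^k$ and tend to 1.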
In contrast to this phenomenon that the "length" of $E$ is quite small, the cardinality of $E$ is quite large, namely $E$ is uncountable, as is every nonempty perfect subset of a metric space with the Heine-Borel property: every closed and bounded subset is compact. We will need the following easily proved fact:
In any metric space $X$, the closure $\overline{B(x, r)}$ of the ball $B(x, r)$ satisfies
$\overline{B(x, r)} \subset \{y \in X : d(x, y) \le r\}$.
Theorem 12. Suppose $X$ is a metric space in which every closed and bounded subset is compact. Then every nonempty perfect subset of $X$ is uncountable.
Proof: Suppose that $E$ is a nonempty perfect subset of $X$. Since $E$ has a limit point it must be infinite. Now assume, in order to derive a contradiction, that $E$ is countable, say $E = \{x_n\}_{n=1}^\infty$. Start with any point $y_1 \in E$ that is not $x_1$ and the ball $B_1 = B(y_1, r_1)$ where $r_1 = \tfrac{d(x_1, y_1)}{2}$. We have
$B_1 \cap E \ne \emptyset$ and $x_1 \notin \overline{B_1}$.
Then there is a point $y_2 \in B_1' \cap E$ that is not $x_2$ and so we can choose a ball $B_2$ such that
$B_2 \cap E \ne \emptyset$ and $x_2 \notin \overline{B_2}$ and $\overline{B_2} \subset B_1$.
Indeed, we can take $B_2 = B(y_2, r_2)$ where $r_2 = \tfrac{\min \{d(x_2, y_2), r_1 - d(y_1, y_2)\}}{2}$. Continuing in this way we obtain balls $B_k$ satisfying
$B_k \cap E \ne \emptyset$ and $x_k \notin \overline{B_k}$ and $\overline{B_k} \subset B_{k-1}$, $k \ge 2$.
Now we use the hypothesis that every closed and bounded set in $X$ is compact. It follows that each closed set $\overline{B_k} \cap E$ is nonempty and compact, and so by Corollary 5 we have
$\bigcap_{k=1}^\infty \left( \overline{B_k} \cap E \right) \ne \emptyset$, say $x \in \left( \bigcap_{k=1}^\infty \overline{B_k} \right) \cap E$.
However, by construction we have $x_n \notin \overline{B_n}$ for all $n$ and since the sets $\overline{B_n}$ are decreasing, we see that $x_n \notin \bigcap_{k=1}^\infty \overline{B_k}$ for all $n$; hence $x \ne x_n$ for all $n$. This contradicts $E = \{x_n\}_{n=1}^\infty$ and completes the proof of the theorem.
3.2. The Sierpinski triangle, Cantor dust and von Koch snowflake. The Sierpinski triangle is a plane version of the Cantor set. Begin with the unit solid equilateral triangle $T = T^0 = \triangle\left( (0, 0), (1, 0), \left( \tfrac{1}{2}, \tfrac{\sqrt{3}}{2} \right) \right)$ whose edges of length $1$ join the three points $(0, 0)$, $(1, 0)$, $\left( \tfrac{1}{2}, \tfrac{\sqrt{3}}{2} \right)$ in the plane. Divide $T^0$ into four congruent equilateral triangles with edge length $\tfrac{1}{2}$ by joining the midpoints of the three edges of $T^0$. Remove the center (upside down) open equilateral triangle to leave three closed equilateral triangles $T_1^1, T_2^1, T_3^1$ of edge length $\tfrac{1}{2}$. Repeat this construction to obtain at the $k$-th generation, a collection $\left\{ T_j^k \right\}_{j=1}^{3^k}$ of $3^k$ closed solid equilateral triangles of edge length $\tfrac{1}{2^k}$ that are pairwise disjoint apart from shared vertices. Let $K_k = \bigcup_{j=1}^{3^k} T_j^k$ and set
$S = \bigcap_{k=1}^\infty K_k = \bigcap_{k=1}^\infty \left( \bigcup_{j=1}^{3^k} T_j^k \right)$.
Then the Sierpinski triangle $S$ is a nonempty compact perfect subset of $\mathbb{R}^2$ that has "area" equal to $0$. Moreover $S$ is a fractal satisfying the replication identity
$2S = S \cup (S + (1, 0)) \cup \left( S + \left( \tfrac{1}{2}, \tfrac{\sqrt{3}}{2} \right) \right) = S_1 \cup S_2 \cup S_3$,
and so has fractal dimension $\tfrac{\ln 3}{\ln 2}$.
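The claim that the Sierpinski triangle has "area" zero follows the same pattern as the length computation for the Cantor set. The Python sketch below is an illustration with names of our own choosing; it uses only the fact that the area of an equilateral triangle scales with the square of its edge length.

```python
from fractions import Fraction
from math import log

def triangle_count(k):
    """Number of closed triangles in the k-th generation."""
    return 3 ** k

def edge_length(k):
    """Edge length of each k-th generation triangle."""
    return Fraction(1, 2 ** k)

def area_fraction(k):
    """Area of K_k as a fraction of the area of the starting triangle T^0."""
    return triangle_count(k) * edge_length(k) ** 2

for k in (1, 2, 10):
    print(k, area_fraction(k))
print(log(3) / log(2))  # fractal dimension ln 3 / ln 2
```

The fraction of area remaining after k generations is $\left( \tfrac{3}{4} \right)^k$, which tends to 0, while the dimension $\tfrac{\ln 3}{\ln 2}$ lies strictly between 1 and 2.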
The Cantor dust is another plane version of the Cantor set, this time with fractal dimension $1$. From the unit closed square $[0, 1]^2$ remove everything but the four closed squares of side length $\tfrac{1}{4}$ at the corners of $[0, 1]^2$, i.e. the squares $\left[ 0, \tfrac{1}{4} \right]^2$, $\left[ \tfrac{3}{4}, 1 \right] \times \left[ 0, \tfrac{1}{4} \right]$, $\left[ \tfrac{3}{4}, 1 \right]^2$ and $\left[ 0, \tfrac{1}{4} \right] \times \left[ \tfrac{3}{4}, 1 \right]$. Then repeat this procedure with these four smaller squares and continue ad infinitum. The "dust" $D$ that remains is a nonempty perfect compact subset of the plane satisfying the replication formula
$D = \tfrac{1}{4} D \cup \tfrac{1}{4} (D + (3, 0)) \cup \tfrac{1}{4} (D + (0, 3)) \cup \tfrac{1}{4} (D + (3, 3))$.
Thus $D$ has fractal dimension $\tfrac{\ln 4}{\ln 4} = 1$. The set $D$ is in stark contrast to the segment $\left\{ (x, 0) \in \mathbb{R}^2 : 0 \le x \le 1 \right\}$ in the plane that also has fractal dimension $1$.
Finally, the von Koch snowflake (1904) is a bit harder to construct rigorously at this stage, although we will return to it later on after we have studied the concept of uniform convergence. For now we simply describe the snowflake-shaped curve informally. Begin with the line segment $K^0$ joining the points $(0, 0)$ and $(1, 0)$ along the $x$-axis. It is a segment of length $1$ that looks like ___. Now divide the segment $K^0$ into three congruent closed line segments of length $\tfrac{1}{3}$ that each look like _, and denote the first and last of these by $K_1^1$ and $K_4^1$ respectively. Now replace the middle segment with the two segments $K_2^1$ joining $\left( \tfrac{1}{3}, 0 \right)$ to $\left( \tfrac{1}{2}, \tfrac{1}{2\sqrt{3}} \right)$ and $K_3^1$ joining $\left( \tfrac{1}{2}, \tfrac{1}{2\sqrt{3}} \right)$ to $\left( \tfrac{2}{3}, 0 \right)$. Thus the middle third segment _ has been replaced with a "hat" that looks like ∧, which together with the removed middle third makes an equilateral triangle of side length $\tfrac{1}{3}$. The four segments $\left\{ K_j^1 \right\}_{j=1}^4$ form a connected polygonal path that looks like _∧_ where each of the four segments has length $\tfrac{1}{3}$. Now we continue by replacing each of the four segments $K_j^1$ of length $\tfrac{1}{3}$ by the polygonal path of four segments of length $\tfrac{1}{3^2}$ obtained by removing the middle third of $K_j^1$ and replacing it by two equal length segments as above. Repeat this construction to obtain at the $k$-th generation, a polygonal path consisting of $4^k$ closed segments $\left\{ K_j^k \right\}_{j=1}^{4^k}$ of length $\tfrac{1}{3^k}$ each. Denote this polygonal snowflake-shaped path by $K^k$.
We now dene the von Koch snowake 1 to be the limit of the polygonal
paths 1
|
as / . A more precise denition is this:
1 consists of all (r, j) R
2
such that for every - 0 there is satisfying
1((r, j) , -) 1
|
,= O, for all / _ .
In other words, 1 is the set of points in the plane such that every ball centered
at the point intersects all of the polygonal paths from some index on. One can
show that 1 is a compact subset of the plane that satises the replication identity
31 = 1
1
' 1
2
' 1
3
' 1
4
,
where each 1
is a translation and rotation of 1; moreover two dierent 1
inter-
sect in at most one point. It follows that 1 has fractal dimension
ln 4
ln 3
. Later we
will show that 1 is the image of a continuous curve with no tangent at any point,
and innite length between any two distinct points on it.
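The informal construction of the polygonal paths can be sketched in a few lines, treating points of the plane as complex numbers. This is only an illustration under our own naming (`koch_path` is not from the notes): each segment from $a$ to $b$ is replaced by four segments through $a + \tfrac{1}{3}(b-a)$, the tip of the hat, and $a + \tfrac{2}{3}(b-a)$, where the tip is obtained by rotating the middle third through $60$ degrees.

```python
import cmath
import math

def koch_path(k):
    """Vertices of the k-th generation polygonal path, as complex numbers."""
    pts = [0 + 0j, 1 + 0j]                    # generation 0: segment from (0,0) to (1,0)
    rot = cmath.exp(1j * math.pi / 3)         # rotation through 60 degrees
    for _ in range(k):
        new_pts = [pts[0]]
        for a, b in zip(pts, pts[1:]):
            third = (b - a) / 3
            p1 = a + third                    # end of the first third
            apex = p1 + third * rot           # tip of the hat over the middle third
            p2 = a + 2 * third                # start of the last third
            new_pts.extend([p1, apex, p2, b])
        pts = new_pts
    return pts

path = koch_path(4)
segments = len(path) - 1
length = sum(abs(b - a) for a, b in zip(path, path[1:]))
print(segments, length)   # 4**4 segments, each of length 3**-4
```

The total length of generation $k$ is $4^k \cdot 3^{-k} = \left(\tfrac{4}{3}\right)^k$, which grows without bound, in line with the infinite length of the limiting curve.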
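Here is a table of some of the fractals we constructed above. The matrices $M_2$ and $M_3$ are plane rotations through angles of $\tfrac{\pi}{3}$ and $-\tfrac{\pi}{3}$ respectively.

Fractal set $K$ | Replication formula | Dimension | Approx.
Cantor set | $K = \tfrac{1}{3}K \cup \tfrac{1}{3}(K+2)$ | $\frac{\ln 2}{\ln 3}$ | 0.63093
$[0,1]$ | $K = \tfrac{1}{2}K \cup \tfrac{1}{2}(K+1)$ | $1$ | 1
Cantor dust $D$ | $K = \tfrac{1}{4}K \cup \tfrac{1}{4}(K+(3,0)) \cup \tfrac{1}{4}(K+(0,3)) \cup \tfrac{1}{4}(K+(3,3))$ | $1$ | 1
von Koch snowflake $K$ | $K = \tfrac{1}{3}K \cup \tfrac{1}{3}(M_2 K+(1,0)) \cup \tfrac{1}{3}\left(M_3 K+\left(\tfrac{3}{2},\tfrac{\sqrt{3}}{2}\right)\right) \cup \tfrac{1}{3}(K+(2,0))$ | $\frac{\ln 4}{\ln 3}$ | 1.2619
Sierpinski triangle $S$ | $K = \tfrac{1}{2}K \cup \tfrac{1}{2}(K+(1,0)) \cup \tfrac{1}{2}\left(K+\left(\tfrac{1}{2},\tfrac{\sqrt{3}}{2}\right)\right)$ | $\frac{\ln 3}{\ln 2}$ | 1.5850
$[0,1]^2$ | $K = \tfrac{1}{2}K \cup \tfrac{1}{2}(K+(1,0)) \cup \tfrac{1}{2}(K+(0,1)) \cup \tfrac{1}{2}(K+(1,1))$ | $2$ | 2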
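Every dimension in the table has the form $\frac{\ln m}{\ln (1/r)}$, where $m$ is the number of pieces in the replication formula and $r$ is their common dilation ratio. The sketch below recomputes the decimal column from that formula; the helper name `similarity_dimension` and the dictionary layout are ours, chosen only for illustration.

```python
import math

def similarity_dimension(pieces, ratio):
    """ln(pieces) / ln(1/ratio) for a set made of `pieces` copies scaled by `ratio`."""
    return math.log(pieces) / math.log(1.0 / ratio)

# (number of pieces, common dilation ratio) read off from each replication formula
table = {
    "Cantor set": (2, 1 / 3),
    "[0,1]": (2, 1 / 2),
    "Cantor dust": (4, 1 / 4),
    "von Koch snowflake": (4, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
    "[0,1]^2": (4, 1 / 2),
}

dims = {name: similarity_dimension(m, r) for name, (m, r) in table.items()}
for name, dim in dims.items():
    print(f"{name:22s} {dim:.5f}")
```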
3.3. Similarities: A fixed point theorem. Each of the fractals $K$ considered in the previous subsection satisfies a replication formula of the form
\[ K = S_1(K) \cup S_2(K) \cup \dots \cup S_m(K), \tag{3.2} \]
where $m \geq 2$ and each $S_j$ is a similarity transformation in $\mathbb{R}^n$, i.e. a composition of a translation, rotation and a dilation with ratio $0 < r_j < 1$. Moreover, in all of our examples each $S_j$ is a dilation with the same ratio $0 < r < 1$. Our next theorem shows that no matter what similarities we consider with positive dilation ratios less than $1$, there is always a nonempty compact set $K$ that satisfies (3.2), and furthermore $K$ is uniquely determined by (3.2). Note that we are not requiring that the sets $S_j(K)$ be pairwise disjoint here. We call a nonempty set $K$ satisfying (3.2) a self-similar set. If all the dilations have the same ratio, we say that $K$ is a fractal set. The sets listed in the table above are all compact fractal sets.
In order to prove uniqueness in our theorem on self-similarity we will use a special metric space whose elements are the nonempty compact subsets of $\mathbb{R}^n$. For $n \in \mathbb{N}$ let
\[ X^n = \left\{ E \subset \mathbb{R}^n : E \text{ is nonempty and compact} \right\}. \]
Given a pair of compact sets $E, F$ in $X^n$ we define a distance between them by
\[ d(E,F) = \inf \left\{ \delta > 0 : E \subset F_\delta \text{ and } F \subset E_\delta \right\}, \tag{3.3} \]
where $E_\delta = \left\{ x \in \mathbb{R}^n : \operatorname{dist}(x,E) < \delta \right\}$ and $\operatorname{dist}(x,E) = \inf_{y \in E} |x - y|$ is the usual distance between a point $x$ and a set $E$. It is a straightforward exercise to prove that $d : X^n \times X^n \to [0,\infty)$ satisfies the properties of a metric as in Definition 9.
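For finite sets the infimum in (3.3) can be evaluated directly: $E \subset F_\delta$ holds exactly when $\max_{x \in E} \operatorname{dist}(x,F) < \delta$, so $d(E,F)$ is the larger of the two one-sided quantities $\max_{x \in E} \operatorname{dist}(x,F)$ and $\max_{y \in F} \operatorname{dist}(y,E)$. The following sketch computes this for finite point sets given as lists of coordinate tuples; the function names are ours, and `math.dist` is the standard Euclidean distance from the Python library.

```python
import math

def dist_point_set(x, E):
    """dist(x, E) = min over y in E of |x - y| (E finite and nonempty)."""
    return min(math.dist(x, y) for y in E)

def hausdorff(E, F):
    """The distance (3.3) between two finite nonempty point sets."""
    e_to_f = max(dist_point_set(x, F) for x in E)   # how far E sticks out of F
    f_to_e = max(dist_point_set(y, E) for y in F)   # how far F sticks out of E
    return max(e_to_f, f_to_e)

E = [(0.0, 0.0), (1.0, 0.0)]
F = [(0.0, 0.0)]
print(hausdorff(E, F))                        # E is not contained in a small neighbourhood of F
print(hausdorff([(0.0, 0.0)], [(3.0, 4.0)]))  # singletons: the usual Euclidean distance
```

Both one-sided quantities are needed: in the first example every point of $F$ lies in $E$, yet the two sets are at distance $1$.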
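Exercise 22. Prove that $d$ is a metric on $X^n$. Why can't we allow $\emptyset \in X^n$?
Hint: To see that $d(E,F) > 0$ if $E \neq F$, we may suppose that $x \in E \setminus F$. Then the open cover $\left\{ B\left(y, \tfrac{d(x,y)}{2}\right) \right\}_{y \in F}$ of the compact set $F$ has a finite subcover $\left\{ B\left(y_k, \tfrac{d(x,y_k)}{2}\right) \right\}_{k=1}^{N}$. If $r = \min_{1 \leq k \leq N} \tfrac{d(x,y_k)}{2}$, then $r > 0$ and $B(x,r) \cap F = \emptyset$. It follows that $d(E,F) \geq \operatorname{dist}(x,F) \geq r > 0$. To see why we can't allow $\emptyset \in X^n$, show that $d(\emptyset, \{x\}) = \infty$ for any $x \in \mathbb{R}^n$.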
The space $X^n$ can also be viewed as an extension of $\mathbb{R}^n$ via the map that takes $x$ in $\mathbb{R}^n$ to the compact set $\{x\}$ in $X^n$. This map is actually an isometry, meaning that it preserves distances:
\[ \operatorname{dist}_{\mathbb{R}^n}(x,y) = |x - y| = d\left(\{x\}, \{y\}\right). \]
We will construct a solution to (3.2) using the finite intersection property of compact sets, and then prove uniqueness using a fixed point argument in the metric space $(X^n, d)$. To see the connection with a fixed point, define for any set $E$,
\[ \widetilde{S}(E) = \bigcup_{j=1}^{m} S_j(E) \tag{3.4} \]
to be the right hand side of (3.2). Note that $S_j$ takes balls to balls, hence bounded sets to bounded sets and open sets to open sets, hence also closed sets to closed sets. By Theorem 11 it follows that $S_j$ takes compact sets to compact sets, and hence so does $\widetilde{S}$. Thus $\widetilde{S}$ maps the metric space $X^n$ into itself, and moreover, a set $K \in X^n$ is self-similar if and only if $K$ is a fixed point of $\widetilde{S}$, i.e. $\widetilde{S}(K) = K$.
Here is the theorem on existence of self-similar sets, which exhibits a simple classification, in terms of similarity transformations, of these very complex looking sets. It was B. Mandelbrot (1977) who brought the world's attention to the fact that much of the seeming complexity in nature is closely related to self-similarity: plants, trees, shells, rivers, coastlines, mountain ranges, clouds, lightning, etc.
Theorem 13. For $1 \leq j \leq m$ suppose that $S_j$ is a similarity transformation on $\mathbb{R}^n$ with dilation ratio $0 < r_j < 1$. Then there is a unique nonempty compact subset $K$ of $\mathbb{R}^n$ satisfying (3.2).
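Proof: We begin by choosing a closed ball $B = B_R = \left\{ x \in \mathbb{R}^n : |x| \leq R \right\}$ so large that
\[ S_j(B) \subset B, \quad 1 \leq j \leq m. \tag{3.5} \]
Since
\[ |S_j(x)| \leq |S_j(x) - S_j(0)| + |S_j(0)| \leq r_j |x| + |S_j(0)|, \]
it suffices to take
\[ R = \frac{\max_{1 \leq j \leq m} |S_j(0)|}{1 - \max_{1 \leq j \leq m} r_j}, \]
so that if $x \in B_R$, then
\[ |S_j(x)| \leq \left( \max_{1 \leq j \leq m} r_j \right) R + \max_{1 \leq j \leq m} |S_j(0)| = \left( \max_{1 \leq j \leq m} r_j \right) R + \left( 1 - \max_{1 \leq j \leq m} r_j \right) R = R. \]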
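A trivial property of the set mapping $\widetilde{S}$ is monotonicity:
\[ \widetilde{S}(E) \subset \widetilde{S}(F) \quad \text{if } E \subset F. \tag{3.6} \]
A less obvious property, which will be used to prove the uniqueness assertion in Theorem 13, is a contractive inequality relative to the distance $d$ introduced above for the metric space $X^n$:
\[ d\left( \widetilde{S}(A), \widetilde{S}(B) \right) \leq r\, d(A,B), \quad A, B \in X^n, \tag{3.7} \]
where $r = \max_{1 \leq j \leq m} r_j$. To see (3.7), it suffices by symmetry to show that
\[ \widetilde{S}(A) \subset \left( \widetilde{S}(B) \right)_{r(d(A,B)+\varepsilon)}, \quad \text{for all } \varepsilon > 0. \tag{3.8} \]
So pick $\xi \in \widetilde{S}(A)$, i.e. $\xi = S_j(x) \in S_j(A)$ for some $x \in A$ and $1 \leq j \leq m$. Now for any $\varepsilon > 0$ we know that $x \in B_{d(A,B)+\varepsilon}$, the open neighbourhood of the set $B$ of radius $d(A,B)+\varepsilon$, so that there is $y \in B$ satisfying $|x - y| < d(A,B) + \varepsilon$. Then
\[ \eta = S_j(y) \in S_j(B) \subset \widetilde{S}(B), \]
and since $S_j$ is a similarity with dilation ratio $r_j$, we have that $M_j = S_j - S_j(0)$ is a rotation and dilation of ratio $r_j$ and thus
\[ |\xi - \eta| = |S_j(x) - S_j(y)| = |M_j(x - y)| \leq r_j |x - y| < r \left( d(A,B) + \varepsilon \right), \]
which shows that $\xi \in \left( \widetilde{S}(B) \right)_{r(d(A,B)+\varepsilon)}$, i.e. (3.8) holds. This completes the proof of the contractive inequality (3.7).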
Now let $B = B_R$ be the closed ball as above. The closed ball $B$ is compact by Theorem 11. Set
\[
\begin{aligned}
K_1 &= \widetilde{S}(B), \\
K_2 &= \widetilde{S}(K_1) = \widetilde{S}^2(B), \\
K_3 &= \widetilde{S}(K_2) = \widetilde{S}^3(B), \\
&\ \ \vdots \\
K_k &= \widetilde{S}(K_{k-1}) = \widetilde{S}^k(B), \\
&\ \ \vdots
\end{aligned}
\]
and note that each $K_k$ is a nonempty compact subset of the closed ball $B$. Indeed, since a similarity maps closed balls to closed balls, each $K_k$ is actually a finite union of closed balls, hence closed by Proposition 7 (4). Moreover, by (3.5), (3.6) and induction we have
\[
\begin{aligned}
K_1 &= \widetilde{S}(B) \subset B, \\
K_2 &= \widetilde{S}(K_1) \subset \widetilde{S}(B) = K_1, \\
K_3 &= \widetilde{S}(K_2) \subset \widetilde{S}(K_1) = K_2, \\
&\ \ \vdots \\
K_k &= \widetilde{S}(K_{k-1}) \subset \widetilde{S}(K_{k-2}) = K_{k-1}, \\
&\ \ \vdots
\end{aligned}
\]
and so the sequence of nonempty compact sets $\{K_k\}_{k=1}^{\infty}$ is nonincreasing. By Corollary 5 we conclude that
\[ K = \bigcap_{k=1}^{\infty} K_k \]
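is nonempty and compact. Applying $\widetilde{S}$ to $K$ we claim that
\[ \widetilde{S}(K) = \bigcap_{k=1}^{\infty} \widetilde{S}(K_k) = \bigcap_{k=1}^{\infty} K_{k+1} = \bigcap_{k=2}^{\infty} K_k = K, \tag{3.9} \]
which proves the existence of a self-similar set satisfying (3.2). The only equality requiring proof in (3.9) is the first. If $\xi \in \widetilde{S}(K)$ then there is some $j$ and $x \in K$ such that $\xi = S_j(x)$. Since $K \subset K_k$ we get $\xi \in S_j(K_k) \subset \widetilde{S}(K_k)$ for all $k \geq 1$, which shows that $\widetilde{S}(K) \subset \bigcap_{k=1}^{\infty} \widetilde{S}(K_k)$. Conversely, suppose that $\xi \in \bigcap_{k=1}^{\infty} \widetilde{S}(K_k)$. Then for each $k$ there is some $j_k$ and $x_k \in K_k$ such that $\xi = S_{j_k}(x_k)$. Now there is some $j$ that occurs infinitely often among the $j_k$. With such a $j$ fixed let $A = \{ k \in \mathbb{N} : j_k = j \}$. Then $\xi = S_j(x_k)$ for all $k \in A$ and since $S_j$ is one-to-one we conclude that $x = S_j^{-1}\xi$ satisfies $x = x_k \in K_k$ for all $k \in A$. Since $A$ is infinite and $\{K_k\}_{k=1}^{\infty}$ is nonincreasing, we see that $x \in \bigcap_{k=1}^{\infty} K_k = K$. Thus $\xi = S_j(x) \in S_j(K) \subset \widetilde{S}(K)$, which proves $\bigcap_{k=1}^{\infty} \widetilde{S}(K_k) \subset \widetilde{S}(K)$.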
Finally, we use the contractive inequality (3.7) to prove uniqueness. Indeed, suppose that $G$ is another nonempty compact set satisfying $\widetilde{S}(G) = G$. Then from (3.7) we have
\[ 0 \leq d(K,G) = d\left( \widetilde{S}(K), \widetilde{S}(G) \right) \leq r\, d(K,G), \]
which implies $d(K,G) = 0$ since $0 < r < 1$. It follows that $K = G$ since $d$ is a metric.
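The contractive inequality (3.7) can be observed numerically on finite sets, which are nonempty and compact. The sketch below is an illustration only, under our own naming: it uses the three maps $S_j(x) = \tfrac{1}{2}x + c_j$ with $c_j \in \left\{ (0,0), \left(\tfrac{1}{2},0\right), \left(\tfrac{1}{4},\tfrac{\sqrt{3}}{4}\right) \right\}$, which correspond to the Sierpinski triangle, starts from two unrelated finite sets, and records the distance (3.3) between their images under repeated application of the union map.

```python
import math

RATIO = 0.5
SHIFTS = [(0.0, 0.0), (0.5, 0.0), (0.25, math.sqrt(3) / 4)]

def s_tilde(E):
    """Union of the images S_j(E), where S_j(x) = RATIO * x + shift_j."""
    return [(RATIO * x + a, RATIO * y + b) for (a, b) in SHIFTS for (x, y) in E]

def hausdorff(E, F):
    """The distance (3.3) for finite nonempty point sets."""
    e_to_f = max(min(math.dist(p, q) for q in F) for p in E)
    f_to_e = max(min(math.dist(p, q) for q in E) for p in F)
    return max(e_to_f, f_to_e)

A = [(0.0, 0.0)]
B = [(1.0, 1.0), (2.0, -1.0)]
dists = []
for _ in range(6):
    dists.append(hausdorff(A, B))
    A, B = s_tilde(A), s_tilde(B)

for k, d in enumerate(dists):
    print(k, d)   # each distance is at most half of the previous one
```

Because the distances shrink at least geometrically, the iterates of any two starting sets approach one another, which is the mechanism behind the uniqueness argument.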
3.4. A paradoxical set. A similarity $S$ with dilation ratio $r = 1$ is said to be a rigid motion, i.e. $S$ is a rigid motion if it is a composition of a translation and a rotation. (Note that the very first step in the proof of Theorem 13 breaks down for a rigid motion.) A subset $E$ of Euclidean space $\mathbb{R}^n$ is said to be paradoxical if there are subsets $A_i$, $B_j$ of $E$, $1 \leq i \leq k$, $1 \leq j \leq m$, and rigid motions $S_i$, $T_j$ such that
\[
\begin{aligned}
E &= \left( \dot{\bigcup}_{i=1}^{k} A_i \right) \dot{\cup} \left( \dot{\bigcup}_{j=1}^{m} B_j \right), \\
E &= \dot{\bigcup}_{i=1}^{k} S_i A_i = \dot{\bigcup}_{j=1}^{m} T_j B_j.
\end{aligned}
\tag{3.10}
\]
The notation $\dot{\cup}$ asserts that the indicated union is pairwise disjoint. The paradox here is that (3.10) says that $E$ can be decomposed into finitely many pairwise disjoint pieces, which can then be rearranged by rigid motions into two copies of $E$.
A famous paradox of Banach and Tarski asserts that the unit ball $B = B(0,1)$ in $\mathbb{R}^3$ is paradoxical, and moreover needs only $5$ pieces to witness the paradox: there is a decomposition
\[ B = B_1 \,\dot{\cup}\, B_2 \,\dot{\cup}\, B_3 \,\dot{\cup}\, B_4 \,\dot{\cup}\, B_5 \]
of $B$ into five pairwise disjoint sets, and there are rigid motions $S_1, \dots, S_5$ such that
\[ B = S_1(B_1) \,\dot{\cup}\, S_2(B_2) = S_3(B_3) \,\dot{\cup}\, S_4(B_4) \,\dot{\cup}\, S_5(B_5). \]
In other words we can break the ball $B$ into five pieces and then, using rigid motions, we can rearrange the first two pieces into $B$ itself and rearrange the other three pieces into a separate copy of $B$. This creates two distinct balls of radius one out of a single ball of radius one using only a decomposition into five pieces and rigid motions. In fact the paradox can be extended to show that if $A$ and $B$ are any two bounded subsets of $\mathbb{R}^3$, each containing some ball, then $A$ can be broken into finitely many pieces that can be rearranged to form $B$. However, the Banach-Tarski paradox requires the axiom of choice. See e.g. [7] for details.
It is somewhat surprising that there exists a paradoxical subset $E$ of the plane $\mathbb{R}^2 = \mathbb{C}$ that does not require the axiom of choice for its construction, namely the Sierpinski-Mazurkiewicz Paradox: let $e^{i\theta}$ be a transcendental complex number and define sets of complex numbers by
\[
\begin{aligned}
E &= \left\{ x = \sum_{n=0}^{\infty} x_n e^{in\theta} \in \mathbb{C} : x_n \in \mathbb{Z}_+ \text{ and } x_n = 0 \text{ for all but finitely many } n \right\}, \\
E_1 &= \{ x \in E : x_0 = 0 \}, \\
E_2 &= \{ x \in E : x_0 > 0 \}.
\end{aligned}
\]
Then $E = E_1 \,\dot{\cup}\, E_2 = e^{-i\theta} E_1 = E_2 - 1$. Thus $E$ satisfies the replication formula (3.1) using only rigid motions with $k = 1$ and $m = 2$,
\[ E = \left( e^{i\theta} E \right) \dot{\cup} \left( E + 1 \right), \]
and so is paradoxical. The set $E$ has fractal dimension
\[ \frac{\ln m}{\ln k} = \frac{\ln 2}{\ln 1} = \frac{\ln 2}{0} = \infty, \]
while on the other hand, $E$ is a countable subset of the complex plane.
4. Exercises
Exercise 23. Let $(X,d)$ be a metric space. Show that both $d_1(x,y) = \min\{1, d(x,y)\}$ and $d_2(x,y) = \frac{d(x,y)}{1 + d(x,y)}$ define metrics on $X$.
Exercise 24. Let $E$ be a subset of a metric space $X$, and let $E'$ denote the set of all limit points of $E$. Prove that $E'$ is closed.
Exercise 25. Construct a compact set of real numbers whose limit points form a countable set.
Exercise 26. Let $E$ be the set of all $x$ in $[0,1]$ whose decimal expansion consists of only $4$'s and $7$'s, e.g. $x = .74474777474447\ldots$. Is $E$ compact? perfect? countable? Prove your answers.
Exercise 27. Suppose that $\{F_k\}_{k=1}^{\infty}$ is a sequence of closed subsets of $\mathbb{R}^n$. If $\mathbb{R}^n = \bigcup_{k=1}^{\infty} F_k$, prove that at least one of the sets $F_k$ has nonempty interior (see (1.3) for the definition of interior). Hint: Try to mimic the proof of Theorem 12 with $\mathbb{R}^n$ in place of $P$, with $F_n$ in place of $x_n$, and assuming all the $F_k$ have empty interior.
Exercise 28. Suppose that $S_1, S_2, \dots, S_m$ are nontrivial similarity transformations in $\mathbb{R}^n$ and let
\[ \widetilde{S}(E) = \bigcup_{k=1}^{m} S_k(E) \]
for any subset $E$ of $\mathbb{R}^n$. Show that $\widetilde{S}(E') = \widetilde{S}(E)'$.
Exercise 29. Suppose that $S_1, S_2, \dots, S_m$ are similarity transformations in $\mathbb{R}^n$ with dilation ratios $0 < r_j < 1$ for $1 \leq j \leq m$. Let $K$ be the unique nonempty compact subset of $\mathbb{R}^n$ satisfying $K = \widetilde{S}(K)$. Show that if $x \in K$, then the countable set of points
\[ E = \bigcup_{k=1}^{\infty} \widetilde{S}^k(\{x\}) = \bigcup_{\ell=1}^{\infty} \left\{ S_{i_1} S_{i_2} \cdots S_{i_\ell} x \right\}_{\text{all } i_1, i_2, \dots, i_\ell} \]
is dense in $K$. Hint: Show that if $E' = \emptyset$ then $E$ is finite and there exists $L$ such that $\widetilde{S}^{L+1} E = \widetilde{S}^{L} E$. Conclude from uniqueness of $K$ that $\widetilde{S}^{L} E = K$. If $E' \neq \emptyset$ use the previous exercise to prove that $\widetilde{S}(E') = E'$ and again use uniqueness of $K$ to conclude that $E' = K$.
Exercise 30. Suppose that $S_1, S_2, \dots, S_m$ are similarity transformations in $\mathbb{R}^n$ with common dilation ratio $0 < r < 1$. Let $K$ be the unique nonempty compact subset of $\mathbb{R}^n$ satisfying $K = \widetilde{S}(K)$. Show that $K$ is homogeneous in the following sense: if $x_0 \in K$ and $B$ is any open ball centered at $x_0$, then $K \cap B$ contains a set $G$ similar to $K$, i.e. there is a similarity transformation $T$ such that $TK = G$.
CHAPTER 4
Sequences and Series
Our main focus in this chapter will be on sequences $\{s_n\}_{n=1}^{\infty}$ whose terms $s_n$ are numbers, either rational, real or complex, i.e. on functions from the natural numbers $\mathbb{N}$ to either $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$. A key definition is that of limit of such a sequence.
Definition 16. A complex number $L$ is the limit of a complex-valued sequence $\{s_n\}_{n=1}^{\infty}$ provided that for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ (depending on $\varepsilon$) such that
\[ |s_n - L| < \varepsilon, \quad \text{for all } n \geq N. \tag{0.1} \]
In this case we write
\[ \lim_{n \to \infty} s_n = L. \]
Of course this definition applies equally well to the subsets $\mathbb{Q}$ and $\mathbb{R}$ of $\mathbb{C}$. It turns out that the Least Upper Bound Property of the real numbers $\mathbb{R}$ plays a crucial role in the theory of limits, both in $\mathbb{R}$ and in the complex numbers $\mathbb{C}$. For example, if $\{s_n\}_{n=1}^{\infty}$ is a nondecreasing sequence of real numbers, i.e.
\[ s_{n+1} \geq s_n \quad \text{for all } n \in \mathbb{N}, \]
that is bounded above, i.e. there is a real number $M$ such that
\[ s_n \leq M \quad \text{for all } n \in \mathbb{N}, \]
then the limit of the sequence $\{s_n\}_{n=1}^{\infty}$ exists, and is given by
\[ \lim_{n \to \infty} s_n = \sup \{ s_n : n \geq 1 \}, \]
where in taking the supremum we are viewing $\{ s_n : n \geq 1 \}$ as a set of real numbers, rather than as the real-valued function on the natural numbers $\mathbb{N}$ that is denoted by $\{s_n\}_{n=1}^{\infty}$.
To see this, let $E = \{ s_n : n \geq 1 \}$ and $\alpha = \sup E$. Given $\varepsilon > 0$, the number $\alpha - \varepsilon$ is not an upper bound for $E$ and it follows that there is a term $s_N$ such that
\[ \alpha - \varepsilon < s_N. \]
Since the sequence $\{s_n\}_{n=1}^{\infty}$ is nondecreasing and bounded above by $\alpha$, we have
\[ \alpha - \varepsilon < s_N \leq s_n \leq \alpha \]
for all $n \geq N$. But this implies that (0.1) holds with $L = \alpha$. We have thus proved the following lemma.
Lemma 8. If $\{s_n\}_{n=1}^{\infty}$ is a nondecreasing sequence of real numbers that is bounded above, then $\lim_{n \to \infty} s_n = \sup \{s_n\}_{n=1}^{\infty}$. Similarly, if $\{s_n\}_{n=1}^{\infty}$ is a nonincreasing sequence of real numbers that is bounded below, then $\lim_{n \to \infty} s_n = \inf \{s_n\}_{n=1}^{\infty}$.
However, later applications of analysis to existence of fractals and solutions to differential equations will require the notion of sequences in certain metric spaces, such as spaces of sets and of functions. Thus we will now develop the critical concepts of limit, subsequence and Cauchy sequence in the broader context of a general metric space.
1. Sequences in a metric space
Recall from Definition 8 that a sequence $\{s_n\}_{n=1}^{\infty}$ is a function $f$ defined on the natural numbers $\mathbb{N}$ with $f(n) = s_n$ for all $n \in \mathbb{N}$. We begin with the general definition of limit.
Definition 17. Let $(X,d)$ be a metric space. An element $L$ in $X$ is the limit of an $X$-valued sequence $\{s_n\}_{n=1}^{\infty}$ provided that for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ (depending on $\varepsilon$) such that
\[ d(s_n, L) < \varepsilon, \quad \text{for all } n \geq N. \tag{1.1} \]
In this case we write
\[ \lim_{n \to \infty} s_n = L, \]
and say that the sequence $\{s_n\}_{n=1}^{\infty}$ converges to $L$; otherwise we say that $\{s_n\}_{n=1}^{\infty}$ diverges.
Note that limits, if they exist, are unique! Indeed, if both $L$ and $L'$ in $X$ satisfy (1.1), then given $\varepsilon > 0$, there is $N$ so that (1.1) holds for both $L$ and $L'$. Thus the triangle inequality yields
\[ 0 \leq d(L, L') \leq d(L, s_N) + d(s_N, L') < \varepsilon + \varepsilon = 2\varepsilon. \]
Since $\varepsilon$ can be made arbitrarily small, it follows that $d(L, L') = 0$, hence $L = L'$.
Here are three more properties of limits that follow easily from Definition 17.
Proposition 10. Let $\{s_n\}_{n=1}^{\infty}$ be a sequence in a metric space $(X,d)$.
(1) $\lim_{n \to \infty} s_n = L \in X$ if and only if every ball $B(L,r)$, $r > 0$, contains all of the terms $s_n$ except for finitely many $n \in \mathbb{N}$.
(2) $\lim_{n \to \infty} s_n = L \in X$ implies that the set $\{s_n\}_{n=1}^{\infty}$ is bounded.
(3) If $E \subset X$ and if $p \in X$ is a limit point of $E$, then there is a sequence $\{s_n\}_{n=1}^{\infty}$ in $E$ such that $p = \lim_{n \to \infty} s_n$.
Proof: (1) Suppose that $\lim_{n \to \infty} s_n = L \in X$ and that $r > 0$. Then there is $N$ such that (1.1) holds with $\varepsilon = r$. Thus $s_n \in B(L,r)$ for all $n \geq N$, and so the only terms $s_n$ not contained in $B(L,r)$ are among the finitely many terms $s_1, \dots, s_{N-1}$. Conversely, suppose that every ball $B(L,r)$, $r > 0$, contains all of the terms $s_n$ except for finitely many $n \in \mathbb{N}$. Let $\varepsilon > 0$ be given. Then $B(L,\varepsilon)$ contains all but finitely many of the terms $s_n$. Let $M$ be the largest subscript among these finitely many terms $s_n$. Then with $N = M + 1$ we have $s_n \in B(L,\varepsilon)$ for all $n \geq N$, which is (1.1). Note: Uniqueness of limits follows from (1) as well. Why?
(2) There is $N$ such that (1.1) holds with $\varepsilon = 1$. Now set
\[ r = \max \left\{ 1, d(L,s_1), d(L,s_2), \dots, d(L,s_N) \right\}. \]
Then $d(L,s_n) \leq r < r + 1$ for all $n \in \mathbb{N}$ and it follows that $\{s_n\}_{n=1}^{\infty} \subset B(L, r+1)$, i.e. $\{s_n\}_{n=1}^{\infty}$ is bounded in $X$.
(3) For each $n \in \mathbb{N}$ choose $s_n \in B'\left(p, \tfrac{1}{n}\right) \cap E$. We claim that $\lim_{n \to \infty} s_n = p$. Indeed, given $\varepsilon > 0$, choose $N \geq \tfrac{1}{\varepsilon}$. Then for $n \geq N$ we have
\[ d(p, s_n) < \frac{1}{n} \leq \frac{1}{N} \leq \varepsilon, \]
as required.
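1.1. Subsequences. A key construct associated with a sequence $s = \{s_n\}_{n=1}^{\infty}$ is that of a subsequence. A subsequence is defined by viewing $s$ as a map defined on the natural numbers $\mathbb{N}$ and composing it with a strictly increasing map $k \to n_k$ from $\mathbb{N}$ to $\mathbb{N}$, to get a map
\[ k \to n_k \to s_{n_k} \]
defined on $\mathbb{N}$. In other words we consider a sequence $\{n_k\}_{k=1}^{\infty}$ of strictly increasing positive integers and define the composition of sequences $\{s_{n_k}\}_{k=1}^{\infty}$ to be a subsequence of $\{s_n\}_{n=1}^{\infty}$. For example let $\{s_n\}_{n=1}^{\infty}$ be the sequence
\[ \{s_n\}_{n=1}^{\infty} = \left\{ \frac{\sqrt{n} - 1}{\sqrt{n} + 1} \right\}_{n=1}^{\infty} = \left\{ 0, \frac{\sqrt{2} - 1}{\sqrt{2} + 1}, \frac{\sqrt{3} - 1}{\sqrt{3} + 1}, \frac{2 - 1}{2 + 1}, \frac{\sqrt{5} - 1}{\sqrt{5} + 1}, \dots \right\}. \]
If we take $\{n_k\}_{k=1}^{\infty} = \{k^2\}_{k=1}^{\infty}$ to be the increasing sequence of square numbers, the corresponding subsequence $\{s_{n_k}\}_{k=1}^{\infty}$ of $\{s_n\}_{n=1}^{\infty}$ is given by
\[ \{s_{n_k}\}_{k=1}^{\infty} = \left\{ \frac{\sqrt{k^2} - 1}{\sqrt{k^2} + 1} \right\}_{k=1}^{\infty} = \left\{ \frac{k - 1}{k + 1} \right\}_{k=1}^{\infty} = \left\{ 0, \frac{2 - 1}{2 + 1}, \frac{3 - 1}{3 + 1}, \dots \right\}. \]
Note that the terms $\frac{k-1}{k+1}$ in $\{s_{n_k}\}_{k=1}^{\infty}$ appear in the same order as they do among the terms $\frac{\sqrt{n}-1}{\sqrt{n}+1}$ of $\{s_n\}_{n=1}^{\infty}$.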
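The view of a subsequence as a composition $k \to n_k \to s_{n_k}$ translates directly into code. The sketch below, with function names of our own choosing, composes the sequence $s_n = \frac{\sqrt{n}-1}{\sqrt{n}+1}$ with the index map $n_k = k^2$ and compares the result with the closed form $\frac{k-1}{k+1}$.

```python
import math

def s(n):
    """The n-th term (sqrt(n) - 1) / (sqrt(n) + 1) of the example sequence."""
    root = math.sqrt(n)
    return (root - 1) / (root + 1)

def n_k(k):
    """Strictly increasing index map k -> k**2."""
    return k * k

subsequence = [s(n_k(k)) for k in range(1, 8)]           # the composition k -> s(n_k)
closed_form = [(k - 1) / (k + 1) for k in range(1, 8)]

for k, (a, b) in enumerate(zip(subsequence, closed_form), start=1):
    print(k, n_k(k), a, b)
```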
Exercise 31. A sequence $s = \{s_n\}_{n=1}^{\infty}$ converges to $L$ if and only if every subsequence $\{s_{n_k}\}_{k=1}^{\infty}$ of $s$ converges to $L$. This is an easy consequence of definition chasing.
Theorem 14. Suppose that $s = \{s_n\}_{n=1}^{\infty}$ is a sequence in a metric space $(X,d)$.
(1) If $X$ is compact, then some subsequence of $s$ converges to a point in $X$.
(2) If $X$ is Euclidean space $\mathbb{R}^n$ and $s$ is bounded, then some subsequence of $s$ converges to a point in $\mathbb{R}^n$.
We often abbreviate the expression "then some subsequence of $s$ converges to a point in $X$" to simply "$s$ has a convergent subsequence in $X$".
Proof: (1) Let $E$ be the set of points $\{s_n : n \in \mathbb{N}\}$. If $E$ is finite, then one of its members, say $p$, occurs infinitely often in the sequence $s = \{s_1, s_2, s_3, \dots\}$. Thus there is a strictly increasing sequence of positive integers
\[ n_1 < n_2 < n_3 < \dots < n_k < \dots \]
such that
\[ p = s_{n_1} = s_{n_2} = s_{n_3} = \dots = s_{n_k} = \dots \]
for all $k \geq 1$. The subsequence $\{s_{n_k}\}_{k=1}^{\infty} = \{p, p, p, \dots\}$ clearly converges to $p \in X$.
On the other hand, if $E$ is infinite, then since $X$ is compact, Theorem 7 shows that $E$ has a limit point $p \in X$.
Remark 12. Proposition 10 (3) shows there is a sequence $\{t_n\}_{n=1}^{\infty}$ in $E$ that converges to $p$, but this sequence need not be a subsequence of $\{s_n\}_{n=1}^{\infty}$.
So instead of using Proposition 10 (3), we construct a subsequence of $s$ converging to $p$ as follows: pick $n_1$ such that $d(p, s_{n_1}) < 1$. Then since $B'\left(p, \tfrac{1}{2}\right)$ contains infinitely many points from $E$, there is $n_2 > n_1$ such that $d(p, s_{n_2}) < \tfrac{1}{2}$. Continuing in this way we obtain for every $k \geq 1$ a positive integer $n_k$ such that
\[ n_1 < n_2 < \dots < n_k < n_{k+1} < \dots \]
and $d(p, s_{n_k}) < \tfrac{1}{k}$ for all $k \geq 1$. Thus the subsequence $\{s_{n_k}\}_{k=1}^{\infty}$ converges to $p$.
(2) Since $E = \{s_n : n \in \mathbb{N}\}$ is bounded, its closure $\overline{E}$ is closed and bounded in $\mathbb{R}^n$ (bounded since if $E \subset B(x,R)$ then $\overline{E} \subset \overline{B(x,R)} \subset B(x,R+1)$). By Theorem 11 it follows that $\overline{E}$ is compact. Now we can apply part (1) of the theorem, which we just finished proving, with $X = \overline{E}$. This completes the proof of part (2).
In Lemma 3 we proved that the derived set $E'$ of a set $E$ is always closed. We have the following variant for sequences $s = \{s_n\}_{n=1}^{\infty}$ in a metric space $X$. A point $p \in X$ is said to be a subsequential limit of $s$ if $\lim_{k \to \infty} s_{n_k} = p$ for some subsequence $\{s_{n_k}\}_{k=1}^{\infty}$ of $s$.
Theorem 15. The subsequential limits of a sequence $s = \{s_n\}_{n=1}^{\infty}$ in a metric space $X$ form a closed subset of $X$.
Proof: Let $E^*$ be the set of subsequential limits of $s$, i.e. all limits of subsequences of $s$. Suppose that $z \in (E^*)'$. We must show that $z \in E^*$. Now there is $y_1 \in B'\left(z, \tfrac{1}{2}\right) \cap E^*$ and also $n_1$ such that $d(y_1, s_{n_1}) < \tfrac{1}{2}$. Thus we have
\[ d(z, s_{n_1}) \leq d(z, y_1) + d(y_1, s_{n_1}) < \frac{1}{2} + \frac{1}{2} = 1. \]
In similar fashion we can choose $n_2 > n_1$ such that $d(z, s_{n_2}) < \tfrac{1}{2}$. Continuing we can choose $n_1 < n_2 < n_3 < \dots$ so that
\[ d(z, s_{n_k}) < \frac{1}{k}, \quad k \geq 1. \]
This shows that the subsequence $\{s_{n_k}\}_{k=1}^{\infty}$ of $s$ converges to $z$, and hence $z \in E^*$ as required.
1.2. Cauchy sequences. Sequences $\{s_n\}_{n=1}^{\infty}$ of rational numbers $\mathbb{Q}$ can diverge for two qualitatively quite different reasons:
(1) The sequences $\{n\}_{n=1}^{\infty}$ and $\{(-1)^n\}_{n=1}^{\infty}$ fail to converge because the terms $s_m$ and $s_n$ don't even get close to each other, much less close to a limiting value $L$, as $m$ and $n$ get large.
(2) The sequence $\{s_n\}_{n=1}^{\infty} = \{1.4, 1.41, 1.414, 1.4142, \dots\}$ of decimal approximations to the real number $\sqrt{2}$ has no limit in $\mathbb{Q}$ because the rational numbers have a gap where $\sqrt{2}$ ought to be. This is so despite the fact that $|s_m - s_n| \leq \frac{1}{10^m}$ for all $m < n$, which shows that the terms $s_m$ and $s_n$ get rapidly close to each other as $m$ and $n$ get large.
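The second type of behaviour can be checked in exact rational arithmetic. In the sketch below (our own illustration; the function name `s` mirrors the notation of the text) the $n$-digit truncation of $\sqrt{2}$ is computed as $\lfloor \sqrt{2 \cdot 10^{2n}} \rfloor / 10^n$ with the integer square root, so every term is an honest rational number and no floating point rounding enters.

```python
from fractions import Fraction
from math import isqrt

def s(n):
    """n-digit decimal truncation of sqrt(2), as an exact rational number."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

terms = [s(n) for n in range(1, 9)]
print([str(t) for t in terms[:4]])     # exact fractions in lowest terms

# Cauchy-type estimate: |s_m - s_n| <= 10**(-m) whenever m < n.
cauchy_ok = all(
    abs(s(m) - s(n)) <= Fraction(1, 10 ** m)
    for m in range(1, 9)
    for n in range(m + 1, 10)
)

# No term is a rational square root of 2.
never_exact = all(t * t != 2 for t in terms)
print(cauchy_ok, never_exact)
```

The terms cluster ever more tightly, yet none of them squares to $2$, which is the gap in $\mathbb{Q}$ described above.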
The first type of divergence above occurs for natural reasons, but the second type of divergence occurs only because of a defect in the metric space $\mathbb{Q}$. The real numbers $\mathbb{R}$ do not share this defect, and Dedekind's construction of the real numbers using cuts keyed on the fact that the defect in $\mathbb{Q}$ was a gap in the order. We now wish to investigate to what extent this defect can be realized in the metric space structure associated with $\mathbb{Q}$ and $\mathbb{R}$, rather than in the order structure. As a byproduct of this investigation, we will be led to Weierstrass' construction of the real numbers using Cauchy sequences of rational numbers. Our first definition captures the notion of a sequence $\{s_n\}_{n=1}^{\infty}$ of the second type above in which the terms $s_m$ and $s_n$ get close to each other as $m$ and $n$ get large, and so ought to have a limit in a nondefective metric space.
Definition 18. Let $(X,d)$ be a metric space. A sequence $\{s_n\}_{n=1}^{\infty}$ in $X$ is a Cauchy sequence if for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that
\[ d(s_m, s_n) < \varepsilon, \quad \text{for all } m, n \geq N. \tag{1.2} \]
Lemma 9. Convergent sequences in a metric space are Cauchy sequences.
Proof: Suppose $\{s_n\}_{n=1}^{\infty}$ is a convergent sequence in a metric space $(X,d)$, i.e. $\lim_{n \to \infty} s_n = L$ for some $L \in X$. Let $\varepsilon > 0$ be given. Choose $N$ as in Definition 17 so that $d(s_n, L) < \tfrac{\varepsilon}{2}$ for all $n \geq N$. Then if $m, n \geq N$, the triangle inequality yields
\[ d(s_m, s_n) \leq d(s_m, L) + d(L, s_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
There is a partial converse to this lemma.
Lemma 10. Let $s = \{s_n\}_{n=1}^{\infty}$ be a Cauchy sequence in a metric space $X$. Then $s$ converges if and only if it has a convergent subsequence in $X$.
Proof: If $\{s_n\}_{n=1}^{\infty}$ converges in a metric space $X$ to a limit $L$, then every subsequence converges to $L$ as well. Conversely suppose that $s = \{s_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $X$ and that $\lim_{k \to \infty} s_{n_k} = L \in X$ for some subsequence $\{s_{n_k}\}_{k=1}^{\infty}$. Given $\varepsilon > 0$ the Cauchy criterion (1.2) yields $N$ so that
\[ d(s_m, s_n) < \frac{\varepsilon}{2}, \quad m, n \geq N, \]
and then the definition of limit yields $K$ satisfying
\[ d(s_{n_k}, L) < \frac{\varepsilon}{2}, \quad k \geq K. \]
We may also take $K$ so large that $n_K \geq N$. Then for $n \geq N$ we have
\[ d(s_n, L) \leq d(s_n, s_{n_K}) + d(s_{n_K}, L) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
which shows that $\lim_{n \to \infty} s_n = L$.
Now comes our definition of a nondefective metric space, which we call complete.
Definition 19. A metric space $X$ is complete if every Cauchy sequence in $X$ converges to a point in $X$.
Roughly speaking, a complete metric space $X$ has the property that any sequence which ought to converge, i.e. one that satisfies the Cauchy criterion, actually does converge in $X$. In a complete metric space, the condition that for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ satisfying (1.2), is often called the Cauchy criterion for convergence of the sequence $\{s_n\}_{n=1}^{\infty}$.
The crucial difference between the rational and real numbers can now be expressed in metric terms: the space $\mathbb{Q}$ is not complete whereas the space $\mathbb{R}$ is complete. In order to prove our theorem on completeness it is convenient to introduce the concept of diameter of a set. If $A$ is a subset of real numbers, we extend the definition of $\sup A$ to sets that are not bounded above by defining
\[ \sup A = \infty, \quad \text{if } A \text{ is not bounded above.} \]
Definition 20. If $E$ is a subset of a metric space $(X,d)$, we define the diameter of $E$ to be
\[ \operatorname{diam}(E) = \sup \left\{ d(x,y) : x, y \in E \right\}. \]
The connection with Cauchy sequences is this. Suppose $s = \{s_n\}_{n=1}^{\infty}$ is a sequence in a metric space $(X,d)$. Let $T_N = \{ s_n : n \geq N \}$ be the set of points in the tail of the sequence from $N$ on. Then $s$ is a Cauchy sequence if and only if
\[ \operatorname{diam}(T_N) \to 0 \quad \text{as } N \to \infty. \tag{1.3} \]
The reader can easily verify this by chasing definitions.
Lemma 11. $\operatorname{diam}(E) = \operatorname{diam}\left(\overline{E}\right)$.
Proof: Clearly $\operatorname{diam}(E) \leq \operatorname{diam}\left(\overline{E}\right)$ holds since $E \subset \overline{E}$. Conversely pick $\varepsilon > 0$ and two points $p, q$ in $\overline{E}$. There are points $x, y \in E$ such that $d(p,x) < \tfrac{\varepsilon}{2}$ and $d(q,y) < \tfrac{\varepsilon}{2}$. Thus we have
\[ d(p,q) \leq d(p,x) + d(x,y) + d(y,q) \leq \frac{\varepsilon}{2} + \operatorname{diam}(E) + \frac{\varepsilon}{2} = \operatorname{diam}(E) + \varepsilon, \]
even in the case that $\operatorname{diam}(E) = \infty$. Now take the infimum over $\varepsilon > 0$ to obtain $d(p,q) \leq \operatorname{diam}(E)$ for all $p, q \in \overline{E}$, and then take the supremum over all such $p, q$ to obtain $\operatorname{diam}\left(\overline{E}\right) \leq \operatorname{diam}(E)$ as required.
Theorem 16. Let $X$ be a metric space.
(1) If $X$ is compact, then $X$ is complete.
(2) Euclidean space $\mathbb{R}^n$ is complete.
Proof: (1) Suppose that $\{s_n\}_{n=1}^{\infty}$ is a Cauchy sequence in a compact metric space $X$. Let $T_N = \{ s_n : n \geq N \}$ be the set of points in the tail of the sequence from $N$ on. By (1.3) the Cauchy criterion says that $\operatorname{diam}(T_N) \to 0$ as $N \to \infty$. Lemma 11 then gives
\[ \operatorname{diam}\left(\overline{T_N}\right) \to 0 \quad \text{as } N \to \infty. \tag{1.4} \]
Now $\overline{T_N}$ is nonempty and compact for each $N$, and clearly $\overline{T_{N+1}} \subset \overline{T_N}$ for all $N$. Corollary 5 thus shows that
\[ E = \bigcap_{N=1}^{\infty} \overline{T_N} \neq \emptyset. \]
Since $E \subset \overline{T_N}$, (1.4) gives $\operatorname{diam}(E) = 0$, from which we conclude that $E$ consists of exactly one point, say $L \in X$.
We now claim that $\lim_{n \to \infty} s_n = L$. Indeed, given $\varepsilon > 0$, choose $N$ so large that $\operatorname{diam}\left(\overline{T_N}\right) < \varepsilon$. Then for all $n \geq N$ we have that both $s_n$ and $L$ belong to $\overline{T_N}$, and so
\[ d(s_n, L) \leq \operatorname{diam}\left(\overline{T_N}\right) < \varepsilon, \]
as required.
(2) Suppose that $s = \{s_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $\mathbb{R}^n$. There is $N$ so large that the tail $T_N$ has diameter at most $1$. Since there are only finitely many points $s_n$ outside the tail $T_N$, it follows that the set of points $\{ s_n : n \geq 1 \}$ in the sequence is bounded. The closure of this set is also bounded, and thus $s$ is contained in a closed and bounded subset $X$ of $\mathbb{R}^n$. By Theorem 11 the set $X$ is compact and we can now apply part (1) of the theorem proved above.
1.3. Weierstrass' construction of the real numbers. Recall that the gap in the rational numbers where the irrational number $\sqrt{2}$ lives can be detected either by one of Dedekind's cuts (the cut in (2.1) of Chapter 1), or by the Cauchy sequence of decimal approximations $\{1.4, 1.41, 1.414, \dots\}$. While Dedekind used cuts to construct the real numbers, Weierstrass instead used such Cauchy sequences in $\mathbb{Q}$ to construct the real numbers by filling in these gaps in the rationals as follows.
Denote by $\mathcal{C}$ the set of all Cauchy sequences $s = \{s_n\}_{n=1}^{\infty}$ in $\mathbb{Q}$. Define an equivalence relation on $\mathcal{C}$ by $s \sim t$ if the intertwined sequence
\[ \{ s_1, t_1, s_2, t_2, s_3, t_3, \dots \} \]
is also a Cauchy sequence (intuitively this says that the limits that $s$ and $t$ ought to have should coincide). Once we have proved this relation is indeed an equivalence relation, then we can define the equivalence class $[s]$ of a Cauchy sequence $s$ in $\mathcal{C}$, and we can define the real numbers $\mathbb{R}$ to be the set of equivalence classes:
\[ \mathbb{R} = \left\{ [s] : s \in \mathcal{C} \right\}. \]
At this point the construction becomes as tedious as that of Dedekind, and we omit the details, only mentioning that one defines the sum of two classes $[s]$ and $[t]$ where $s, t \in \mathcal{C}$, by proving that the sequence $s + t = \{s_n + t_n\}_{n=1}^{\infty}$ is Cauchy, and then defining
\[ [s] + [t] = [s + t]. \]
It is a long process to define the remaining relations and verify that $\mathbb{R}$ satisfies the axioms of an ordered field with the least upper bound property.
This method of Weierstrass for constructing the real numbers has an advantage the method of Dedekind lacks. Namely it can be used to construct an extension of an arbitrary metric space $X$ to a (usually larger) space $\widehat{X}$ that is complete, and called the completion of $X$. More precisely, but without much detail, define $\widehat{X}$ to be the set of equivalence classes $[s]$ in the set $\mathcal{C}$ of Cauchy sequences $s$ in $X$, where $s \sim t$ if $\{ s_1, t_1, s_2, t_2, \dots \}$ is Cauchy in $X$. Define a function $\widehat{d}$ on $\widehat{X} \times \widehat{X}$ by
\[ \widehat{d}\left( [s], [t] \right) = \lim_{n \to \infty} d(s_n, t_n). \]
After showing that the limit above exists, and that $\left( \widehat{X}, \widehat{d} \right)$ satisfies the axioms for a metric space, one can prove that the space $\left( \widehat{X}, \widehat{d} \right)$ is complete. We can view $X$ as a subspace of $\widehat{X}$ via the map that sends $x$ in $X$ to the equivalence class containing the constant Cauchy sequence $\{x, x, x, \dots\}$. One can verify that this map is an isometry, and moreover that under this identification of $X$ with a subspace of $\widehat{X}$, the set $X$ is dense in $\widehat{X}$. This shows that $\widehat{X}$ is, up to an isometry, the smallest complete space containing $X$, and this is the reason that $\widehat{X}$ is called the completion of $X$.
On the other hand, the idea of a Dedekind cut can only be used to construct an extension of a linearly ordered set to one with the least upper bound property, a concept that has not been nearly so useful in applications of analysis as is the concept of a complete metric space. For example, the next subsection describes one of the most useful results in the theory of abstract metric spaces, one that can be used to simplify the ideas behind the proof of Theorem 13, and to prove many existence theorems for differential equations, as we illustrate in a later chapter.
1.4. A contraction lemma. It is possible to recast the proof of Theorem 13 on the existence and uniqueness of nonempty compact fractals entirely within the context of the metric space $X_n$ of compact subsets of $\mathbb{R}^n$ that was introduced above. This is achieved by using the fact that the map of $X_n$ into itself defined in (3.4) is a strict contraction, i.e. satisfies (3.7) for some $0 < r < 1$, on the complete metric space $X_n$.
Of course we haven't yet shown that $X_n$ is complete, and we defer the proof of this to the end of this subsection. The main idea is to use the finite intersection property of compact sets much as we did in the proof of Theorem 13.
Once we know that $X_n$ is complete, the following Contraction Lemma immediately proves Theorem 13 on the existence and uniqueness of fractals.
Lemma 12. Suppose that $(X, d)$ is a complete metric space and that $\varphi : X \to X$ is a strict contraction on $X$, i.e. there is $0 < r < 1$ such that
\[ d(\varphi(x), \varphi(y)) \le r\, d(x, y), \qquad \text{for all } x, y \in X. \]
Then $\varphi$ has a unique fixed point $z$ in $X$, i.e. there is $z \in X$ such that $\varphi(z) = z$, and if $w \in X$ is another point satisfying $\varphi(w) = w$, then $z = w$.
Proof : The uniqueness assertion is immediate from
\[ 0 \le d(z, w) = d(\varphi(z), \varphi(w)) \le r\, d(z, w), \]
which forces $d(z, w) = 0$ since $0 < r < 1$. To establish the existence assertion, pick any point $s_0 \in X$. Consider the sequence of iterates $\{s_n\}_{n=1}^\infty$ given by
\[ s_1 = \varphi(s_0), \quad s_2 = \varphi(s_1) = \varphi(\varphi(s_0)) = \varphi^2(s_0), \quad s_3 = \varphi(s_2) = \varphi(\varphi^2(s_0)) = \varphi^3(s_0), \quad \dots, \quad s_n = \varphi(s_{n-1}) = \varphi(\varphi^{n-1}(s_0)) = \varphi^n(s_0), \quad \dots \]
We claim that the sequence $\{s_n\}_{n=1}^\infty$ is Cauchy. To see this first note that
\[ d(s_k, s_{k+1}) = d(\varphi(s_{k-1}), \varphi(s_k)) \le r\, d(s_{k-1}, s_k), \qquad k \ge 1, \]
and then use induction to prove that
\[ d(s_k, s_{k+1}) \le r\, d(s_{k-1}, s_k) \le r^2 d(s_{k-2}, s_{k-1}) \le \dots \le r^k d(s_0, s_1). \]
Now for $m < n$, the triangle inequality yields
\[ d(s_m, s_n) \le d(s_m, s_{m+1}) + d(s_{m+1}, s_{m+2}) + \dots + d(s_{n-1}, s_n) = \sum_{j=0}^{n-m-1} d(s_{m+j}, s_{m+j+1}) \le \sum_{j=0}^{n-m-1} r^{m+j} d(s_0, s_1) \le \frac{r^m}{1-r}\, d(s_0, \varphi(s_0)). \]
Thus given $\varepsilon > 0$, if we choose $N$ so large that $\frac{r^N}{1-r}\, d(s_0, \varphi(s_0)) < \varepsilon$, then we have $d(s_m, s_n) < \varepsilon$ for all $m, n \ge N$, which proves that $\{s_n\}_{n=1}^\infty$ is Cauchy.
Now we use the important hypothesis that $X$ is complete. Thus $\{s_n\}_{n=1}^\infty$ is convergent and there is a limit
\[ z = \lim_{n\to\infty} s_n \in X. \]
The triangle inequality gives
\[ d(\varphi(z), z) \le d(\varphi(z), \varphi(s_n)) + d(\varphi(s_n), s_{n+1}) + d(s_{n+1}, z) \le r\, d(z, s_n) + 0 + d(z, s_{n+1}) \le d(z, s_n) + d(z, s_{n+1}), \]
which tends to $0$ as $n \to \infty$. It follows that $d(\varphi(z), z) = 0$ and hence $\varphi(z) = z$.
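The iteration in the proof can be tried numerically. The following is a minimal sketch, not part of the original notes: it assumes the contraction $\varphi = \cos$ on the complete space $[0,1]$ (which $\cos$ maps into itself, with contraction constant $r = \sin 1$ by the mean value theorem), and the helper name `iterate_to_fixed_point` is hypothetical. The stopping rule is the a priori bound $\frac{r^n}{1-r}\, d(s_0, \varphi(s_0))$ from the Cauchy estimate above.

```python
import math

def iterate_to_fixed_point(phi, s0, r, tol=1e-12, max_iter=10_000):
    """Iterate s_{n+1} = phi(s_n) until the bound r^n/(1-r) * d(s0, phi(s0)) is below tol."""
    d01 = abs(phi(s0) - s0)              # d(s_0, phi(s_0)) on the real line
    s, n = s0, 0
    while (r ** n) / (1.0 - r) * d01 >= tol and n < max_iter:
        s = phi(s)                       # s is now s_{n+1}
        n += 1
    return s, n

# Assumption for illustration: cos maps [0, 1] into itself and is a
# strict contraction there with constant r = sin(1) < 1.
r = math.sin(1.0)
z, steps = iterate_to_fixed_point(math.cos, 0.5, r)
print(z, steps, abs(math.cos(z) - z))
```

The number of steps is governed only by $r$ and the first displacement $d(s_0, \varphi(s_0))$, exactly as in the proof; no knowledge of the fixed point is needed to decide when to stop.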
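Lemma 13. The metric space $X_n$ is complete.
Proof : Suppose that $\{E_j\}_{j=1}^\infty$ is a Cauchy sequence in $X_n$. For each $k \ge 1$ there is, by the definition (3.3) of the metric in $X_n$ together with the Cauchy criterion (1.2), a positive integer $j_k$ such that
\[ E_j \subset (E_{j_k})_{1/2^{k+1}} \quad \text{and} \quad E_{j_k} \subset (E_j)_{1/2^{k+1}}, \qquad \text{for all } j \ge j_k, \]
and moreover we can choose the $j_k$ to be strictly increasing, i.e. $j_k < j_{k+1}$ for all $k \ge 1$. Using $j = j_{k+1} > j_k$ we then also have the following inclusions:
\[ (E_{j_{k+1}})_{1/2^{k+1}} \subset \left( (E_{j_k})_{1/2^{k+1}} \right)_{1/2^{k+1}} \subset (E_{j_k})_{1/2^{k}}, \qquad \text{for all } k \ge 1. \]
Thus the sequence of closed bounded nonempty sets $\left\{ \overline{(E_{j_k})_{1/2^k}} \right\}_{k=1}^\infty$ is nonincreasing, and by Theorem 11 consists of compact sets. By Corollary 6 we then conclude that
\[ K = \bigcap_{k=1}^\infty \overline{(E_{j_k})_{1/2^k}} \]
is a nonempty compact set, so $K \in X_n$.
We now claim that $\lim_{j\to\infty} E_j = K$. Since $\{E_j\}_{j=1}^\infty$ is Cauchy it suffices by Lemma 10 to prove that $\lim_{k\to\infty} E_{j_k} = K$. Let $\delta > 0$ be given. We trivially have
\[ K \subset \overline{(E_{j_k})_{1/2^k}} \subset (E_{j_k})_{1/2^{k-1}} \subset (E_{j_k})_\delta \tag{1.5} \]
for $k$ so large that $2^{k-1} \ge \frac{1}{\delta}$. In the other direction, $\left\{ \left( \overline{(E_{j_k})_{1/2^k}} \right)^c \right\}_{k=1}^\infty$ is an open cover of the compact set $\overline{(E_{j_1})_{1/2}} \cap ((K)_\delta)^c$, and if $\left\{ \left( \overline{(E_{j_k})_{1/2^k}} \right)^c \right\}_{k=1}^J$ is a finite subcover, then
\[ \overline{(E_{j_1})_{1/2}} \cap ((K)_\delta)^c \subset \bigcup_{k=1}^J \left( \overline{(E_{j_k})_{1/2^k}} \right)^c, \]
equivalently
\[ \overline{(E_{j_J})_{1/2^J}} = \bigcap_{k=1}^J \overline{(E_{j_k})_{1/2^k}} \subset \left( \overline{(E_{j_1})_{1/2}} \right)^c \cup (K)_\delta, \]
which, since $\overline{(E_{j_J})_{1/2^J}} \subset \overline{(E_{j_1})_{1/2}}$, implies
\[ E_{j_k} \subset \overline{(E_{j_k})_{1/2^k}} \subset (K)_\delta, \qquad \text{for all } k \ge J. \tag{1.6} \]
Altogether (1.5) and (1.6) show that $d(K, E_{j_k}) < \delta$ for $k$ sufficiently large, as required.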
2. Numerical sequences and series
At the beginning of this chapter we proved in Lemma 8 that bounded monotonic sequences $s = \{s_n\}_{n=1}^\infty$ of real numbers converge, and moreover we identified the limit $L$ as either the least upper bound or the greatest lower bound of the set of terms $E = \{s_n : n \ge 1\}$:
\[ \lim_{n\to\infty} s_n = L = \begin{cases} \sup E & \text{if } s \text{ is nondecreasing}, \\ \inf E & \text{if } s \text{ is nonincreasing}. \end{cases} \]
Here are some examples of sequences for which we can further identify the limit as a specific real number:
(1) $\lim_{n\to\infty} \frac{1}{n^p} = 0$ if $p > 0$.
(2) $\lim_{n\to\infty} \sqrt[n]{p} = 1$ if $p > 0$.
(3) $\lim_{n\to\infty} \sqrt[n]{n} = 1$.
(4) $\lim_{n\to\infty} \frac{n^\alpha}{(1+p)^n} = 0$ if $p > 0$ and $\alpha \in \mathbb{R}$.
(5) $\lim_{n\to\infty} x^n = 0$ if $-1 < x < 1$.
To prove limit (1), let $\varepsilon > 0$ be given and use the Archimedean property of the real numbers to choose $N > (1/\varepsilon)^{1/p}$. Then $0 < \frac{1}{n^p} < \varepsilon$ for all $n \ge N$.
The limit in (2) is trivial if $p = 1$. If $p > 1$ then $\sqrt[n]{p} > 1$ and the binomial theorem for $n \ge 1$ yields
\[ p = (\sqrt[n]{p})^n = [1 + (\sqrt[n]{p} - 1)]^n = 1 + n(\sqrt[n]{p} - 1) + \frac{n(n-1)}{2} (\sqrt[n]{p} - 1)^2 + \dots \ge 1 + n(\sqrt[n]{p} - 1), \]
so that
\[ 0 < \sqrt[n]{p} - 1 \le \frac{p-1}{n}, \qquad n \ge 1, \]
and so $\lim_{n\to\infty} (\sqrt[n]{p} - 1) = 0$ by limit (1), hence $\lim_{n\to\infty} \sqrt[n]{p} = 1$. Finally, if $0 < p < 1$, apply the result just proved to the number $\frac{1}{p} > 1$ to get $\lim_{n\to\infty} \sqrt[n]{1/p} = 1$, which gives the desired result upon taking reciprocals.

To see limit (3) we argue as in the proof for (2), but keep the quadratic term in the binomial expansion for $n \ge 2$ instead of the linear term:
\[ n = (\sqrt[n]{n})^n = (1 + (\sqrt[n]{n} - 1))^n = 1 + n(\sqrt[n]{n} - 1) + \frac{n(n-1)}{2} (\sqrt[n]{n} - 1)^2 + \dots > 1 + \frac{n(n-1)}{2} (\sqrt[n]{n} - 1)^2, \]
so that
\[ 0 < \sqrt[n]{n} - 1 < \sqrt{\frac{n-1}{n(n-1)/2}} = \sqrt{\frac{2}{n}} = \frac{\sqrt{2}}{\sqrt{n}}, \qquad n \ge 2, \]
and so $\lim_{n\to\infty} (\sqrt[n]{n} - 1) = 0$ by limit (1), hence $\lim_{n\to\infty} \sqrt[n]{n} = 1$.
To see (4) let $k$ be a positive integer greater than $\alpha$. Then for $n > 2k$ we have
\[ (1+p)^n > \binom{n}{k} p^k = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1}\, p^k > \left( \frac{n}{2} \right)^k \frac{p^k}{k!}, \]
so that
\[ 0 < \frac{n^\alpha}{(1+p)^n} < n^\alpha \left( \frac{2}{n} \right)^k \frac{k!}{p^k} = n^{\alpha - k} \left( \frac{2}{p} \right)^k k!, \qquad n > 2k, \]
and so $\lim_{n\to\infty} \frac{n^\alpha}{(1+p)^n} = 0$ since $\lim_{n\to\infty} n^{\alpha-k} = 0$ if $\alpha - k < 0$ by limit (1).
Limit (5) is the special case $\alpha = 0$ of limit (4).
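The squeeze used for limit (3) can be checked numerically. This is only an illustrative sketch in floating point, not part of the original notes; the helper name `nth_root_gap` is hypothetical. It prints $\sqrt[n]{n} - 1$ next to the bound $\sqrt{2/n}$ obtained from the quadratic term of the binomial expansion.

```python
import math

def nth_root_gap(n):
    """Return (n**(1/n) - 1, sqrt(2/n)): the squeezed quantity and its upper bound."""
    return n ** (1.0 / n) - 1.0, math.sqrt(2.0 / n)

for n in (2, 10, 100, 10_000, 1_000_000):
    gap, bound = nth_root_gap(n)
    # The proof of limit (3) gives 0 < gap < bound for every n >= 2.
    print(n, gap, bound)
```

The bound $\sqrt{2/n}$ is far from sharp (the gap actually decays like $\frac{\ln n}{n}$), but it is all that limit (1) needs to force the gap to $0$.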
2.1. Series of complex numbers. Given a sequence $\{a_n\}_{n=1}^\infty$ of complex numbers, we can use the field structure on $\mathbb{C}$ to define the corresponding sequence of partial sums
\[ s_N = a_1 + a_2 + \dots + a_N = \sum_{n=1}^N a_n \]
for all $N \ge 1$. Now if there were only finitely many nonzero terms $a_n$ in the original sequence, then the sequence of partial sums $\{s_N\}_{N=1}^\infty$ would eventually be constant and that constant would be the sum of the nonzero terms $a_n$. Thus in this case we have
\[ \sum_{n : a_n \ne 0} a_n = \lim_{N\to\infty} s_N. \]
This motivates the definition of the infinite sum $\sum_{n=1}^\infty a_n$ as the limit $\lim_{N\to\infty} s_N$ of the partial sums, provided that limit exists.

Definition 21. Suppose that $\{a_n\}_{n=1}^\infty$ is a sequence of complex numbers. If the sequence of partial sums $\{s_N\}_{N=1}^\infty$, $s_N = \sum_{n=1}^N a_n$, converges to a complex number $L$, we say that the (infinite) series $\sum_{n=1}^\infty a_n$ converges to $L$, and write
\[ \sum_{n=1}^\infty a_n = L. \]
If the sequence of partial sums $\{s_N\}_{N=1}^\infty$ diverges, we say that the series $\sum_{n=1}^\infty a_n$ diverges.
Recall that as a metric space, the complex numbers $\mathbb{C}$ are isomorphic to $\mathbb{R}^2$, and hence complete by Theorem 16. The Cauchy criterion thus takes the following form for series:
The series $\sum_{k=1}^\infty a_k$ converges in $\mathbb{C}$ if and only if for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that
\[ \left| \sum_{k=m}^n a_k \right| < \varepsilon, \qquad \text{for all } n \ge m \ge N. \]
This is easily seen using the Cauchy criterion for the sequence of partial sums $\{s_N\}_{N=1}^\infty$, together with the fact that $s_n - s_{m-1} = \sum_{k=m}^n a_k$. Note that this provides a simple necessary condition for convergence of $\sum_{n=1}^\infty a_n$, namely
\[ |a_n| \to 0 \quad \text{as } n \to \infty. \tag{2.1} \]
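The reader is cautioned however, that (2.1) is not in general sufficient for convergence of the series $\sum_{n=1}^\infty a_n$. For example, if $a_n = \frac{1}{n}$ then (2.1) holds but the harmonic series $\sum_{n=1}^\infty \frac{1}{n}$ diverges since the partial sums of order $N = 2^k$ satisfy
\[ s_N = \sum_{n=1}^{2^k} \frac{1}{n} = \frac{1}{1} + \left( \frac{1}{2} \right) + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right) + \dots + \left( \frac{1}{2^{k-1}+1} + \dots + \frac{1}{2^k} \right) \]
\[ \ge \frac{1}{1} + \left( \frac{1}{2} \right) + \left( \frac{1}{4} + \frac{1}{4} \right) + \left( \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} \right) + \dots + \left( \frac{1}{2^k} + \dots + \frac{1}{2^k} \right) = 1 + \frac{k}{2}, \]
which is unbounded, and hence the sequence $\{s_N\}_{N=1}^\infty$ cannot converge.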
We also note the following sufficient condition for the convergence of $\sum_{n=1}^\infty a_n$:
\[ \sum_{n=1}^\infty |a_n| \text{ converges.} \tag{2.2} \]
Indeed, if $\sum_{n=1}^\infty |a_n|$ converges, say to $L \ge 0$, then for $M \le N$ we have
\[ \left| \sum_{n=M}^N a_n \right| \le \sum_{n=M}^N |a_n| = t_N - t_{M-1}, \tag{2.3} \]
where $t_N = \sum_{n=1}^N |a_n|$ is the $N$th partial sum of the series $\sum_{n=1}^\infty |a_n|$. Now (2.2) implies that $\{t_N\}_{N=1}^\infty$ satisfies the Cauchy criterion for sequences, and together with (2.3), this proves the Cauchy criterion for the series $\sum_{n=1}^\infty a_n$. Thus the series $\sum_{n=1}^\infty a_n$ converges. Note that the same argument proves the convergence of $\sum_{n=1}^\infty a_n$ if $|a_n| \le b_n$ for all sufficiently large $n$ where $\sum_{n=1}^\infty b_n$ converges. We have just proved the first half of the versatile Comparison Test. The second half is a trivial consequence of the first.
Theorem 17. Suppose that $\{a_n\}_{n=1}^\infty$ is a sequence of complex numbers.
(1) If $|a_n| \le b_n$ for all sufficiently large $n$, and if $\sum_{n=1}^\infty b_n$ converges, then so does $\sum_{n=1}^\infty a_n$.
(2) If $a_n \ge b_n \ge 0$ for all sufficiently large $n$, and if $\sum_{n=1}^\infty b_n$ diverges, then so does $\sum_{n=1}^\infty a_n$.
Probably the most used fact about series of complex numbers is the geometric series formula.
Lemma 14. If $|z| < 1$, then
\[ \sum_{n=0}^\infty z^n = \frac{1}{1-z}. \]
If $|z| \ge 1$, then $\sum_{n=0}^\infty z^n$ diverges.
Proof : For $z \ne 1$ the partial sums are given by $s_N = \sum_{n=0}^N z^n = \frac{1 - z^{N+1}}{1 - z}$ for $N \ge 1$. Now $|z^{N+1}| = |z|^{N+1} \to 0$ as $N \to \infty$ if $|z| < 1$ by limit (5) in the previous subsection, and so
\[ \sum_{n=0}^\infty z^n = \lim_{N\to\infty} s_N = \lim_{N\to\infty} \frac{1 - z^{N+1}}{1 - z} = \frac{1}{1-z}, \qquad \text{for } |z| < 1. \]
If on the other hand $|z| \ge 1$, then $|z^n| = |z|^n$ does not tend to $0$ as $n \to \infty$, and hence the series $\sum_{n=0}^\infty z^n$ can't converge by (2.1).
Example 7. The series $\sum_{n=1}^\infty \frac{\sin(n\theta)}{n^n}$ converges for every real $\theta$ since $\left| \frac{\sin(n\theta)}{n^n} \right| \le \frac{1}{2^n}$ for all $n \ge 2$. Indeed, $\sum_{n=1}^\infty \frac{1}{2^n}$ converges by Lemma 14, and the comparison test Theorem 17 then shows that $\sum_{n=1}^\infty \frac{\sin(n\theta)}{n^n}$ converges.
In order to take advantage of the comparison test as we did in the example above, we must have available a large supply of series $\sum_{n=1}^\infty b_n$ with nonnegative terms $b_n$, for which we already know whether or not $\sum_{n=1}^\infty b_n$ converges. So we now turn to the investigation of series with nonnegative terms.
2.2. Series of nonnegative terms. Lemma 8 on the convergence of increasing sequences has the following useful reformulation for series with nonnegative terms.
Lemma 15. Suppose that $\sum_{n=1}^\infty a_n$ is a series of nonnegative terms $a_n$, and let $s_N = \sum_{n=1}^N a_n$ be the $N$th partial sum. Then the series $\sum_{n=1}^\infty a_n$ converges if and only if the sequence of partial sums $\{s_N\}_{N=1}^\infty$ is bounded.
Proof : We simply chase the definitions with Lemma 8 as follows. The series $\sum_{n=1}^\infty a_n$ converges if and only if the sequence $\{s_N\}_{N=1}^\infty$ has a limit. But $s_N - s_{N-1} = a_N \ge 0$ shows that the sequence $\{s_N\}_{N=1}^\infty$ is nondecreasing. Thus Lemma 8 shows that $\{s_N\}_{N=1}^\infty$ has a limit if and only if the sequence is bounded.
Our first main result in this subsection is the Cauchy condensation test that applies to a series $\sum_{n=1}^\infty a_n$ of nonincreasing positive terms $a_n$ and says that the series
\[ a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 + a_8 + a_9 + \dots \]
converges if and only if the condensed series
\[ a_1 + (a_2 + a_2) + (a_4 + a_4 + a_4 + a_4) + (a_8 + a_8 + \dots + a_8) + \dots = a_1 + 2a_2 + 4a_4 + 8a_8 + \dots \tag{2.4} \]
converges. Note that the definition of the condensed series is motivated by regrouping the terms in $\sum_{n=1}^\infty a_n$ as
\[ a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + (a_8 + a_9 + \dots + a_{15}) + \dots \tag{2.5} \]
Theorem 18. Suppose that $\sum_{n=1}^\infty a_n$ is a series of nonincreasing positive terms $a_n$. Then the series $\sum_{n=1}^\infty a_n$ converges if and only if its condensed series $\sum_{k=0}^\infty 2^k a_{2^k}$ converges.
Proof : Let $s_N = \sum_{n=1}^N a_n$ be the partial sums of the series $\sum_{n=1}^\infty a_n$ and let $t_K = \sum_{k=0}^K 2^k a_{2^k}$ be the partial sums of the condensed series $\sum_{k=0}^\infty 2^k a_{2^k}$ in the second line of (2.4). Suppose first that $\sum_{k=0}^\infty 2^k a_{2^k}$ converges. We will use the grouping of terms indicated in (2.5). For $N = 2^{K+1} - 1$ we have
\[ s_N = \sum_{n=1}^{2^{K+1}-1} a_n = \sum_{k=0}^K \left( \sum_{n=2^k}^{2^{k+1}-1} a_n \right) \le \sum_{k=0}^K \left( \sum_{n=2^k}^{2^{k+1}-1} a_{2^k} \right) = \sum_{k=0}^K 2^k a_{2^k} = t_K, \tag{2.6} \]
where the inequality follows from the assumption that the terms $a_n$ are positive and nonincreasing. The convergence of $\sum_{k=0}^\infty 2^k a_{2^k}$ shows that the partial sums $\{t_K\}_{K=0}^\infty$ are bounded, and (2.6) now shows that the subsequence of partial sums $\{s_{2^{K+1}-1}\}_{K=0}^\infty$ is bounded. Since the full sequence of partial sums $\{s_N\}_{N=1}^\infty$ is nondecreasing, we conclude that it is bounded as well. Then Lemma 15 shows that the series $\sum_{n=1}^\infty a_n$ converges.
Conversely we use an inequality opposite to (2.6) that is suggested by the alternate grouping of terms in the series $\sum_{n=1}^\infty a_n$ given by (compare with (2.5)),
\[ a_1 + (a_2) + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \dots \]
The inequality is that for $N = 2^K$ we have
\[ s_N = \sum_{n=1}^{2^K} a_n = a_1 + \sum_{k=1}^K \left( \sum_{n=2^{k-1}+1}^{2^k} a_n \right) \ge a_1 + \sum_{k=1}^K \left( \sum_{n=2^{k-1}+1}^{2^k} a_{2^k} \right) = a_1 + \sum_{k=1}^K 2^{k-1} a_{2^k} = \frac{1}{2} (a_1 + t_K), \tag{2.7} \]
where again the inequality follows from the assumption that the terms $a_n$ are positive and nonincreasing. If $\sum_{n=1}^\infty a_n$ converges, then the sequence of partial sums $\{s_N\}_{N=1}^\infty$ is bounded, and (2.7) shows that the sequence of partial sums $\{t_K\}_{K=0}^\infty$ is bounded, hence $\sum_{k=0}^\infty 2^k a_{2^k}$ converges by Lemma 15.
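The two grouping inequalities in the proof, $s_{2^{K+1}-1} \le t_K$ and $s_{2^K} \ge \frac12 (a_1 + t_K)$, can be verified exactly for a concrete series. The sketch below assumes the sample terms $a_n = 1/n^2$ purely for illustration; the function names `a`, `s` and `t` mirror the notation of the proof but are otherwise hypothetical.

```python
from fractions import Fraction

def a(n):
    """Sample nonincreasing positive terms (an assumption for illustration): a_n = 1/n^2."""
    return Fraction(1, n * n)

def s(N):
    """Partial sum s_N of the original series."""
    return sum(a(n) for n in range(1, N + 1))

def t(K):
    """Partial sum t_K of the condensed series sum_k 2^k a_{2^k}."""
    return sum(2 ** k * a(2 ** k) for k in range(0, K + 1))

for K in range(0, 8):
    assert s(2 ** (K + 1) - 1) <= t(K)                     # inequality (2.6)
    assert s(2 ** K) >= Fraction(1, 2) * (a(1) + t(K))     # inequality (2.7)
    print(K, float(s(2 ** K)), float(t(K)))
```

For this choice the condensed series is the geometric series $\sum_k 2^{-k}$, so $t_K$ stays below $2$ and the partial sums $s_N$ are trapped below $2$ as well.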
Corollary 8. Let $p \in \mathbb{R}$. The $p$-series $\sum_{n=1}^\infty \frac{1}{n^p}$ converges if and only if $p > 1$.
Proof : For $p \le 0$ the series diverges since $\frac{1}{n^p}$ does not go to zero as $n \to \infty$. If $p > 0$ then the terms $\frac{1}{n^p}$ are nonincreasing and so the Cauchy condensation test shows that $\sum_{n=1}^\infty \frac{1}{n^p}$ converges if and only if its condensed series
\[ \sum_{k=0}^\infty 2^k \frac{1}{2^{kp}} = \sum_{k=0}^\infty \left( \frac{1}{2^{p-1}} \right)^k \]
converges. But the condensed series is a geometric series and Lemma 14 shows that it converges if and only if $\frac{1}{2^{p-1}} < 1$, i.e. $p > 1$.
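The series of reciprocals of factorials,
\[ \sum_{n=0}^\infty \frac{1}{n!} = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!} + \dots = 1 + 1 + \frac{1}{2 \cdot 1} + \frac{1}{3 \cdot 2 \cdot 1} + \dots + \frac{1}{n(n-1) \cdots 3 \cdot 2 \cdot 1} + \dots \]
plays a very distinguished role in analysis. First we note that this series converges by the comparison test and the geometric series formula. Indeed,
\[ \frac{1}{n!} = \frac{1}{n(n-1) \cdots 3 \cdot 2 \cdot 1} \le \frac{1}{2 \cdot 2 \cdots 2 \cdot 2 \cdot 1} = \left( \frac{1}{2} \right)^{n-1} \]
for all $n \ge 2$, and
\[ \sum_{n=2}^\infty \left( \frac{1}{2} \right)^{n-1} = \frac{1}{2} + \left( \frac{1}{2} \right)^2 + \dots = \sum_{n=0}^\infty \left( \frac{1}{2} \right)^n - 1 = 2 - 1 = 1 \]
by Lemma 14. Thus $\sum_{n=0}^\infty \frac{1}{n!}$ converges by Theorem 17 (1), and in fact
\[ 2 < 1 + 1 + \sum_{n=2}^\infty \frac{1}{n!} = \sum_{n=0}^\infty \frac{1}{n!} < 1 + 1 + \sum_{n=2}^\infty \left( \frac{1}{2} \right)^{n-1} = 3. \]
Definition 22. $e = \sum_{n=0}^\infty \frac{1}{n!}$.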
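The series for Euler's number $e$ converges so rapidly that it forces $e$ to be irrational. Indeed, if $s_N = \sum_{n=0}^N \frac{1}{n!}$ is the $N$th partial sum, then
\[ e - s_N = \sum_{n=N+1}^\infty \frac{1}{n!} = \frac{1}{(N+1)!} + \frac{1}{(N+2)!} + \frac{1}{(N+3)!} + \dots = \frac{1}{(N+1)!} \left\{ 1 + \frac{1}{N+2} + \frac{1}{(N+3)(N+2)} + \dots \right\} \]
\[ < \frac{1}{(N+1)!} \left\{ 1 + \frac{1}{N+2} + \left( \frac{1}{N+2} \right)^2 + \dots \right\} = \frac{1}{(N+1)!} \cdot \frac{1}{1 - \frac{1}{N+2}} = \frac{1}{(N+1)!} \cdot \frac{N+2}{N+1}, \]
by Lemma 14. Now suppose that $e$ is rational, say $e = \frac{p}{q}$ where $p, q \in \mathbb{N}$. Since $n!$ divides $q!$ for $n \le q$ we conclude that
\[ q!\, e - q!\, s_q = q!\, \frac{p}{q} - q! \sum_{n=0}^q \frac{1}{n!} = (q-1)!\, p - \sum_{n=0}^q \frac{q!}{n!} \]
is a positive integer satisfying
\[ q!\, e - q!\, s_q < q! \cdot \frac{1}{(q+1)!} \cdot \frac{q+2}{q+1} = \frac{q+2}{(q+1)^2} < 1, \]
a contradiction. Thus we have proved:
Theorem 19. $e$ is an irrational number lying strictly between 2 and 3.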
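The tail estimate $0 < e - s_N < \frac{1}{(N+1)!} \cdot \frac{N+2}{N+1}$ that drives the irrationality proof can be checked with exact rational arithmetic. In the sketch below (an illustration, not part of the original notes) the partial sum $s_{40}$ stands in for $e$; this is an assumption that is harmless here because $s_{40} < e$, so any upper bound on $e - s_N$ also bounds $s_{40} - s_N$. The names `s` and `tail_bound` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def s(N):
    """Exact partial sum s_N = sum_{n=0}^{N} 1/n!."""
    return sum(Fraction(1, factorial(n)) for n in range(0, N + 1))

def tail_bound(N):
    """The bound (N+2) / ((N+1) * (N+1)!) on e - s_N."""
    return Fraction(N + 2, (N + 1) * factorial(N + 1))

reference = s(40)          # stand-in for e; s_40 < e
for N in range(1, 11):
    gap = reference - s(N)
    assert 0 < gap < tail_bound(N)
    print(N, float(gap), float(tail_bound(N)))
```

The bound shrinks factorially, which is the rapid convergence that makes $q!\,(e - s_q)$ smaller than $1$ in the proof.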
To prove the next familiar theorem on Euler's number $e$, it is convenient to introduce the limit superior and limit inferior of a real-valued sequence $\{s_n\}_{n=1}^\infty$.
Definition 23. Suppose that $s = \{s_n\}_{n=1}^\infty$ is a real-valued sequence and let $E^*$ be the set of subsequential limits of $s$. Define
\[ \limsup_{n\to\infty} s_n = \sup E^* \quad \text{and} \quad \liminf_{n\to\infty} s_n = \inf E^*, \]
called the limit superior and limit inferior of $s$ respectively.
Since $E^*$ is closed we have either $\limsup_{n\to\infty} s_n = \pm\infty$ or $\limsup_{n\to\infty} s_n \in E^*$. In the latter case $\limsup_{n\to\infty} s_n$ is the largest subsequential limit of $s$. A similar comment applies to $\liminf_{n\to\infty} s_n$. Here are some easily verified properties of limit superior and limit inferior:
\[ \begin{aligned} & \liminf_{n\to\infty} s_n \le \limsup_{n\to\infty} s_n, \\ & \lim_{n\to\infty} s_n = L \text{ if and only if } \limsup_{n\to\infty} s_n = \liminf_{n\to\infty} s_n = L, \\ & \limsup_{n\to\infty} t_n \le \limsup_{n\to\infty} s_n \text{ if } t_n \le s_n \text{ for all sufficiently large } n, \\ & \liminf_{n\to\infty} t_n \ge \liminf_{n\to\infty} s_n \text{ if } t_n \ge s_n \text{ for all sufficiently large } n. \end{aligned} \tag{2.8} \]
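Theorem 20. $\lim_{n\to\infty} \left( 1 + \frac{1}{n} \right)^n = e$.
Proof : Let $s_n = \sum_{k=0}^n \frac{1}{k!}$ and $t_n = \left( 1 + \frac{1}{n} \right)^n$ for $n \ge 1$. By the binomial theorem
\[ t_n = \left( 1 + \frac{1}{n} \right)^n = \sum_{k=0}^n \frac{n!}{(n-k)!\,k!} \left( \frac{1}{n} \right)^k = 1 + \frac{n}{1!} \left( \frac{1}{n} \right) + \frac{n(n-1)}{2!} \left( \frac{1}{n} \right)^2 + \frac{n(n-1)(n-2)}{3!} \left( \frac{1}{n} \right)^3 + \dots + \left( \frac{1}{n} \right)^n \]
\[ = 1 + 1 + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \frac{1}{3!} \left( 1 - \frac{1}{n} \right) \left( 1 - \frac{2}{n} \right) + \dots + \frac{1}{n!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{n-1}{n} \right), \]
and so $t_n \le 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!} = s_n$. Thus from the third line in (2.8) we have
\[ \limsup_{n\to\infty} t_n \le \limsup_{n\to\infty} s_n = e. \]
Conversely, fix $m > 1$. For $n > m$ we have
\[ t_n = 1 + 1 + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \frac{1}{3!} \left( 1 - \frac{1}{n} \right) \left( 1 - \frac{2}{n} \right) + \dots + \frac{1}{n!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{n-1}{n} \right) \]
\[ \ge 1 + 1 + \frac{1}{2!} \left( 1 - \frac{1}{n} \right) + \frac{1}{3!} \left( 1 - \frac{1}{n} \right) \left( 1 - \frac{2}{n} \right) + \dots + \frac{1}{m!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{m-1}{n} \right). \]
Now the limit as $n \to \infty$ of the last sum (remember that $m$ is kept fixed) is
\[ 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{m!} = s_m. \]
Thus from the fourth line in (2.8) we have
\[ \liminf_{n\to\infty} t_n \ge s_m \]
for all $m > 1$. Now take the limit as $m \to \infty$ to obtain $\liminf_{n\to\infty} t_n \ge e$. Altogether, using the first line in (2.8), we now have
\[ e \le \liminf_{n\to\infty} t_n \le \limsup_{n\to\infty} t_n \le e, \]
which implies that $\liminf_{n\to\infty} t_n = \limsup_{n\to\infty} t_n = e$. The second line in (2.8) now yields that $\lim_{n\to\infty} t_n = e$ as required.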
3. Power series
There is a very special class of series that turn out to define complex-valued functions on balls in the complex plane. These are the so-called power series that have the form
\[ \sum_{n=0}^\infty a_n z^n, \]
where $\{a_n\}_{n=0}^\infty$ is a sequence in $\mathbb{C}$ whose terms $a_n$ are called coefficients, and where $z \in \mathbb{C}$ is called the variable. The first question of interest is: For what values of $z$ in the complex plane does the series $\sum_{n=0}^\infty a_n z^n$ converge? The second question is: Of what use are these functions? The answer to the first question is initially surprising: the set of convergence $E$ is either $\{0\}$, $\mathbb{C}$, or there is a ball $B(0, R)$ centered at the origin $0$ with positive radius $R$ such that
\[ B(0, R) \subset E \subset \overline{B(0, R)}. \]
The answer to the second question is that these power series functions have many special properties, and moreover, every complex-valued function $f$ defined on a ball $B(0, R)$ in $\mathbb{C}$ that has a derivative everywhere in $B(0, R)$ (i.e. $\lim_{w \to z} \frac{f(w) - f(z)}{w - z}$ exists for all $z \in B(0, R)$) turns out to be one of these power series functions! In other words
\[ f(z) = \sum_{n=0}^\infty a_n z^n, \qquad z \in B(0, R), \]
for some sequence of coefficients $\{a_n\}_{n=0}^\infty$. It turns out such $f$ are infinitely differentiable and the coefficients are given by $a_n = \frac{f^{(n)}(0)}{n!}$. Many more magical properties of these so-called analytic functions are usually investigated in a course on complex analysis.
We content ourselves here with answering just the first question. This will require a new convergence test, the root test. We will also prove a close cousin, the ratio test.
Theorem 21. (Root Test) Let $\{a_n\}_{n=0}^\infty$ be a sequence of complex numbers and set
\[ L = \limsup_{n\to\infty} \sqrt[n]{|a_n|}. \]
(1) If $L < 1$ then $\sum_{n=0}^\infty a_n$ converges,
(2) If $L > 1$ then $\sum_{n=0}^\infty a_n$ diverges,
(3) If $L = 1$ then there is no information.
Proof : (1) Pick $L < R < 1$. Then there are only finitely many $n$ satisfying $\sqrt[n]{|a_n|} \ge R$ (otherwise we would have $\limsup_{n\to\infty} \sqrt[n]{|a_n|} \ge R$), so there is $N$ such that
\[ \sqrt[n]{|a_n|} \le R, \quad \text{i.e. } |a_n| \le R^n, \qquad \text{for all } n \ge N. \]
Since $\sum_{n=0}^\infty R^n = \frac{1}{1-R}$ converges by Lemma 14, the comparison test Theorem 17 (1) shows that $\sum_{n=0}^\infty a_n$ converges.
(2) Since $L > 1$, there are infinitely many $n$ satisfying $\sqrt[n]{|a_n|} \ge 1$ (otherwise we would have $\limsup_{n\to\infty} \sqrt[n]{|a_n|} \le 1$), i.e. $|a_n| \ge 1$ for infinitely many $n$. Thus we cannot have $|a_n| \to 0$ as $n \to \infty$, and it follows that $\sum_{n=0}^\infty a_n$ diverges.
(3) The class of $p$-series shows that the root test gives no information on convergence when $L = 1$. Indeed, if $p > 0$, then
\[ \limsup_{n\to\infty} \sqrt[n]{\frac{1}{n^p}} = \limsup_{n\to\infty} \left( \frac{1}{\sqrt[n]{n}} \right)^p = 1 \]
by limit (3) at the beginning of the previous section. Yet for $p < 1$ the series $\sum_{n=1}^\infty \frac{1}{n^p}$ diverges and for $p > 1$ the series $\sum_{n=1}^\infty \frac{1}{n^p}$ converges.
Corollary 9. Let $\{a_n\}_{n=0}^\infty$ be a sequence in $\mathbb{C}$ and set $L = \limsup_{n\to\infty} \sqrt[n]{|a_n|}$. Let $R = \frac{1}{L}$ (where $R = 0$ if $L = \infty$ and $R = \infty$ if $L = 0$). Then the set of convergence
\[ E = \left\{ z \in \mathbb{C} : \sum_{n=0}^\infty a_n z^n \text{ converges} \right\} \]
satisfies one of the following:
(1) $E = \{0\}$ if $R = 0$,
(2) $E = \mathbb{C}$ if $R = \infty$,
(3) $B(0, R) \subset E \subset \overline{B(0, R)}$ if $0 < R < \infty$.
The extended real number $R$ is called the radius of convergence of the power series $\sum_{n=0}^\infty a_n z^n$.
Proof : Apply the root test to the series $\sum_{n=0}^\infty a_n z^n$ for $z \in \mathbb{C}$. We have
\[ L_z = \limsup_{n\to\infty} \sqrt[n]{|a_n z^n|} = |z| \limsup_{n\to\infty} \sqrt[n]{|a_n|} = \frac{|z|}{R}. \]
Thus if $z \in B(0, R)$, then $L_z < 1$ and the series $\sum_{n=0}^\infty a_n z^n$ converges, i.e. $z \in E$. If $z \notin \overline{B(0, R)}$, then $L_z > 1$ and the series $\sum_{n=0}^\infty a_n z^n$ diverges, i.e. $z \notin E$. This proves assertion (3), and the first two assertions are proved in similar fashion.
There is another test, the ratio test, that is often simpler to apply than the root test, but fails to have as wide a scope as the root test.
Theorem 22. (Ratio Test) Let $\{a_n\}_{n=0}^\infty$ be a sequence of complex numbers.
(1) If $\limsup_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| < 1$ then $\sum_{n=0}^\infty a_n$ converges.
(2) If there is $N$ such that $\left| \frac{a_{n+1}}{a_n} \right| \ge 1$ for all $n \ge N$, then $\sum_{n=0}^\infty a_n$ diverges.
Remark 13. If $L = \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right|$ exists, then $\sum_{n=0}^\infty a_n$ converges if $L < 1$, and $\sum_{n=0}^\infty a_n$ diverges if $L > 1$.
Proof : (1) Pick $\limsup_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| < R < 1$. Then there are only finitely many $n$ satisfying $\left| \frac{a_{n+1}}{a_n} \right| \ge R$ (otherwise we would have $\limsup_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| \ge R$), so there is $N$ such that
\[ \left| \frac{a_{n+1}}{a_n} \right| \le R, \quad \text{i.e. } |a_{n+1}| \le R\, |a_n|, \qquad \text{for all } n \ge N. \]
By induction we obtain
\[ |a_{N+k}| \le R\, |a_{N+k-1}| \le R^2 |a_{N+k-2}| \le \dots \le R^k |a_N|, \qquad k \ge 0. \]
Now $\sum_{k=0}^\infty R^k |a_N| = \frac{|a_N|}{1-R}$ by Lemma 14, and so the comparison test Theorem 17 (1) shows that $\sum_{k=0}^\infty a_{N+k}$ converges, hence also $\sum_{n=0}^\infty a_n$.
(2) By induction we have
\[ |a_{N+k}| \ge |a_{N+k-1}| \ge \dots \ge |a_N|, \qquad k \ge 0. \]
Thus we cannot have $|a_n| \to 0$ as $n \to \infty$ and so $\sum_{n=0}^\infty a_n$ diverges.
Problem 1. What is the radius of convergence of the power series
\[ \sum_{n=0}^\infty \binom{2n}{n} z^n = \sum_{n=0}^\infty \frac{(2n)!}{n!\,n!}\, z^n ? \]
The root test is very hard to apply here without Stirling's formula $n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^n$. On the other hand the ratio test applies easily:
\[ \lim_{n\to\infty} \left| \frac{a_{n+1} z^{n+1}}{a_n z^n} \right| = \lim_{n\to\infty} \left| \frac{\frac{(2n+2)!}{(n+1)!\,(n+1)!}\, z^{n+1}}{\frac{(2n)!}{n!\,n!}\, z^n} \right| = \lim_{n\to\infty} \frac{(2n+2)(2n+1)}{(n+1)(n+1)}\, |z| = 4\,|z|. \]
By the remark following the ratio test, the power series converges if $|z| < \frac{1}{4}$ and diverges if $|z| > \frac{1}{4}$. Thus the radius of convergence is $\frac{1}{4}$.
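The coefficient ratio in Problem 1 can be computed exactly, which makes the limit $4$ visible without any asymptotics. A minimal sketch, assuming Python 3.8 or later for `math.comb`; the helper name `ratio` is hypothetical.

```python
from fractions import Fraction
from math import comb

def ratio(n):
    """Exact value of a_{n+1} / a_n for the coefficients a_n = C(2n, n)."""
    return Fraction(comb(2 * n + 2, n + 1), comb(2 * n, n))

for n in (1, 10, 100, 1000):
    # (2n+2)(2n+1) / (n+1)^2 simplifies to 4 - 2/(n+1)
    print(n, ratio(n), float(ratio(n)))
```

Since the ratios increase to $4$, the ratio test gives convergence of $\sum \binom{2n}{n} z^n$ exactly when $4|z| < 1$, in agreement with the radius $\frac14$ found above.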
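Problem 2. What is the radius of convergence of the power series $\sum_{n=0}^\infty \frac{z^n}{n!}$? Since
\[ \lim_{n\to\infty} \left| \frac{a_{n+1} z^{n+1}}{a_n z^n} \right| = \lim_{n\to\infty} \left| \frac{\frac{z^{n+1}}{(n+1)!}}{\frac{z^n}{n!}} \right| = \lim_{n\to\infty} \frac{|z|}{n+1} = 0, \]
we see that the radius of convergence is $\infty$. This is the exponential function
\[ \operatorname{Exp}(z) = \sum_{n=0}^\infty \frac{z^n}{n!}, \qquad z \in \mathbb{C}. \tag{3.1} \]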
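Finally, we note the sense in which the scope of the ratio test is not as wide as that of the root test.
Proposition 11. For any sequence $\{a_n\}_{n=1}^\infty$ of positive numbers we have
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]
Thus the root test gives convergence of the series $\sum_{n=1}^\infty a_n$ whenever the ratio test does.
Proof : Suppose $L = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n} < \infty$ and choose $L < R < \infty$. Then there is $N$ such that
\[ a_{n+1} \le R\, a_n, \qquad n \ge N. \]
By induction we have
\[ a_{N+k} \le R\, a_{N+k-1} \le R^2 a_{N+k-2} \le \dots \le R^k a_N, \qquad k \ge 0, \]
and so with $n = N + k$,
\[ \sqrt[n]{a_n} = (a_{N+k})^{\frac{1}{n}} \le \left( R^k a_N \right)^{\frac{1}{n}} = R^{\frac{k}{N+k}}\, a_N^{\frac{1}{n}} = R^{1 - \frac{N}{N+k}}\, a_N^{\frac{1}{n}} = R^{1 - \frac{N}{n}}\, a_N^{\frac{1}{n}}. \]
Now we take the limit superior as $n \to \infty$ to obtain
\[ \limsup_{n\to\infty} \sqrt[n]{a_n} \le R \limsup_{n\to\infty} R^{-\frac{N}{n}} a_N^{\frac{1}{n}} = R \limsup_{n\to\infty} \left( \frac{a_N}{R^N} \right)^{\frac{1}{n}} = R \]
by limit (2) at the beginning of the previous subsection. Since $R > L$ was arbitrary we conclude that $\limsup_{n\to\infty} \sqrt[n]{a_n} \le L = \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}$ as required.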
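The inequality of Proposition 11 can be strict, and then the root test succeeds where the ratio test says nothing. The following sketch uses a sequence that is not taken from the notes, chosen here as an assumption for illustration: $a_n = 2^{-n + (-1)^n}$, whose consecutive ratios alternate between $\frac18$ and $2$ while $\sqrt[n]{a_n} \to \frac12$.

```python
def a(n):
    """Illustrative positive sequence (an assumption, not from the notes): a_n = 2^(-n + (-1)^n)."""
    return 2.0 ** (-n + (-1) ** n)

ratios = [a(n + 1) / a(n) for n in range(1, 200)]
roots = [a(n) ** (1.0 / n) for n in range(1, 200)]

print(min(ratios), max(ratios))   # the ratios only take the values 1/8 and 2
print(roots[-1])                  # the n-th roots approach 1/2
```

Here $\limsup \frac{a_{n+1}}{a_n} = 2 > 1$, so part (1) of the ratio test does not apply, and part (2) does not apply either since the ratios are not eventually at least $1$. The root test gives $\limsup \sqrt[n]{a_n} = \frac12 < 1$ and hence convergence.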
4. Exercises
Exercise 32. Let $p$ be an integer greater than 1.
(1) If $0 < x < 1$, show that there is a sequence $\{a_n\}_{n=1}^\infty$ of integers with $0 \le a_n < p$ such that
\[ x = \sum_{n=1}^\infty \frac{a_n}{p^n}. \]
(2) Show that given $0 < x < 1$, the sequence $\{a_n\}_{n=1}^\infty$ in part (1) is unique except when $x$ is of the form $\frac{m}{p^n}$, in which case there are exactly two such sequences.
(3) Conversely, show that if $\{a_n\}_{n=1}^\infty$ is any sequence of integers with $0 \le a_n < p$, then the series $\sum_{n=1}^\infty \frac{a_n}{p^n}$ converges to a real number $x$ with $0 \le x \le 1$. By $\sum_{n=1}^\infty \frac{a_n}{p^n}$ we mean
\[ \sup \left\{ \sum_{n=1}^N \frac{a_n}{p^n} : N \in \mathbb{N} \right\}. \]
Exercise 33. Let $s_0 > 0$ and for $n \in \mathbb{N}$ define $s_{n+1} = s_n + \frac{1}{s_n}$. Prove that the sequence $\{s_n\}_{n=0}^\infty$ diverges.
Exercise 34. Find $\lim_{n\to\infty} \left( \sqrt{n^2 + n} - n \right)$.
Exercise 35. For $n \in \mathbb{N}$ define
\[ s_n = \frac{1}{n+1} + \frac{1}{n+2} + \dots + \frac{1}{2n}. \]
Prove that this sequence converges. Can you guess what the limit might be by comparing $s_n$ to an integral?
Exercise 36. Let $s_1 = \sqrt{2}$ and define
\[ s_{n+1} = \sqrt{2 + \sqrt{s_n}}, \qquad n \in \mathbb{N}. \]
Prove that $s_n < 2$ for all $n \ge 1$ and prove that $\{s_n\}_{n=1}^\infty$ converges.
Exercise 37. Suppose that $a_n \ge 0$ for $n \in \mathbb{N}$. Prove that $\sum_{n=1}^\infty \frac{\sqrt{a_n}}{n}$ converges if $\sum_{n=1}^\infty a_n$ converges.
Exercise 38. (there is no convergent series with largest terms) Suppose that $a_n > 0$ and $\sum_{n=1}^\infty a_n$ converges. Define $r_n = \sum_{k=n}^\infty a_k$.
(1) Prove that $\sum_{n=1}^\infty \frac{a_n}{r_n}$ diverges by showing that for all $m < n$:
\[ \frac{a_m}{r_m} + \frac{a_{m+1}}{r_{m+1}} + \dots + \frac{a_n}{r_n} > 1 - \frac{r_n}{r_m}. \]
(2) Prove that $\sum_{n=1}^\infty \frac{a_n}{\sqrt{r_n}}$ converges by showing that for all $n$:
\[ \frac{a_n}{\sqrt{r_n}} < 2 \left( \sqrt{r_n} - \sqrt{r_{n+1}} \right). \]
CHAPTER 5
Continuity and Differentiability
The notion of a continuous function $f : X \to Y$ makes sense when the function is defined from one metric space $X$ to another $Y$. We will initially examine the connection between continuity and sequences, and after that between continuity and open sets. The notion of a differentiable function $f : X \to Y$ requires that $X$ and $Y$ be Euclidean spaces, usually the real or complex numbers. Central to all of this is the concept of limit of a function.
Definition 24. Suppose that $(X, d_X)$ and $(Y, d_Y)$ are metric spaces. Let $E$ be a subset of $X$ and suppose that $f : E \to Y$ is a function from $E$ to $Y$. Let $p \in X$ be a limit point of $E$ and suppose that $q \in Y$. Then
\[ \lim_{x \to p} f(x) = q \]
if for every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ d_Y(f(x), q) < \varepsilon \quad \text{whenever } x \in E \setminus \{p\} \text{ and } d_X(x, p) < \delta. \tag{0.1} \]
Note that the concept of a limit of $f$ at a point $p$ is only defined when $p$ is a limit point of the set $E$ on which $f$ is defined. Do not confuse this notion with the definition of limit of a sequence $s = \{s_n\}_{n=1}^\infty$ in a metric space $Y$. In this latter definition, $s$ is a function from the natural numbers $\mathbb{N}$ into the metric space $Y$, but the limit point $p$ is replaced by the symbol $\infty$. Here is a characterization of limit of a function in terms of limits of sequences.
Theorem 23. Suppose that $(X, d_X)$ and $(Y, d_Y)$ are metric spaces. Let $E$ be a subset of $X$ and suppose that $f : E \to Y$ is a function from $E$ to $Y$. Let $p \in X$ be a limit point of $E$ and suppose that $q \in Y$. Then $\lim_{x \to p} f(x) = q$ if and only if
\[ \lim_{n\to\infty} f(s_n) = q \]
for all sequences $\{s_n\}_{n=1}^\infty$ in $E \setminus \{p\}$ such that
\[ \lim_{n\to\infty} s_n = p. \]
Proof : Suppose first that $\lim_{x \to p} f(x) = q$. Now assume that $\{s_n\}_{n=1}^\infty$ is a sequence in $E \setminus \{p\}$ such that $\lim_{n\to\infty} s_n = p$. Then given $\varepsilon > 0$ there is $\delta > 0$ such that (0.1) holds. Furthermore we can find $N$ so large that $d_X(s_n, p) < \delta$ whenever $n \ge N$. Combining inequalities with the fact that $s_n \in E \setminus \{p\}$ gives
\[ d_Y(f(s_n), q) < \varepsilon \quad \text{whenever } n \ge N, \]
which proves $\lim_{n\to\infty} f(s_n) = q$.
Suppose next that $\lim_{x \to p} f(x) = q$ fails. The negation of Definition 24 is that there exists an $\varepsilon > 0$ such that for every $\delta > 0$ we have
\[ d_Y(f(x), q) \ge \varepsilon \quad \text{for some } x \in E \setminus \{p\} \text{ with } d_X(x, p) < \delta. \tag{0.2} \]
So fix such an $\varepsilon > 0$ and for each $\delta = \frac{1}{n} > 0$ choose a point $s_n \in E \setminus \{p\}$ with $d_X(s_n, p) < \frac{1}{n}$ and $d_Y(f(s_n), q) \ge \varepsilon$. Then $\{s_n\}_{n=1}^\infty$ is a sequence in $E \setminus \{p\}$ converging to $p$ such that the sequence $\{f(s_n)\}_{n=1}^\infty$ does not converge to $q$; indeed, $d_Y(f(s_n), q) \ge \varepsilon > 0$ for all $n \ge 1$.
As a corollary of the theorem we immediately obtain that limits are unique if they exist. In addition, if $Y = \mathbb{C}$ is the space of complex numbers, then limits behave as expected with regard to addition and multiplication.
Proposition 12. Suppose that $(X, d)$ is a metric space. Let $E$ be a subset of $X$ and suppose that $f, g : E \to \mathbb{C}$ are complex-valued functions on $E$. Let $p \in X$ be a limit point of $E$ and suppose that $A, B \in \mathbb{C}$ satisfy
\[ \lim_{x \to p} f(x) = A \quad \text{and} \quad \lim_{x \to p} g(x) = B. \]
Then
(1) $\lim_{x \to p} \{ f(x) + g(x) \} = A + B$.
(2) $\lim_{x \to p} f(x)\, g(x) = AB$.
(3) $\lim_{x \to p} \frac{f(x)}{g(x)} = \frac{A}{B}$ provided $B \ne 0$.
1. Continuous functions
A function $f : X \to Y$ from one metric space $X$ to another $Y$ is said to be continuous if it is continuous at each point $p$ in $X$. We thus turn first to the definition of continuity at a point, which we give initially in a more general setting.
Definition 25. Suppose that $(X, d_X)$ and $(Y, d_Y)$ are metric spaces. Let $E$ be a subset of $X$ and suppose that $f : E \to Y$ is a function from $E$ to $Y$. Let $p \in E$. Then $f$ is continuous at $p$ if for every $\varepsilon > 0$ there is $\delta > 0$ such that
\[ d_Y(f(x), f(p)) < \varepsilon \quad \text{whenever } x \in E \text{ and } d_X(x, p) < \delta. \tag{1.1} \]
Note that (1.1) says
\[ f(B(p, \delta) \cap E) \subset B(f(p), \varepsilon). \tag{1.2} \]
There are only two possibilities for the point $p \in E$; either $p$ is a limit point of $E$ or $p$ is isolated in $E$ (a point $x$ in $E$ is isolated in $E$ if there is a deleted ball $B'(x, r)$ that has empty intersection with $E$). In the case that $p$ is a limit point of $E$, then $f$ is continuous at $p$ if and only if $\lim_{x \to p} f(x)$ exists and the limit is $f(p)$, i.e.
\[ \lim_{x \to p} f(x) = f(p). \tag{1.3} \]
On the other hand, if $p$ is an isolated point of $E$, then $f$ is automatically continuous at $p$ since (1.1) holds for all $\varepsilon > 0$ with $\delta = r$ where $B'(p, r) \cap E = \emptyset$. From these remarks together with Theorem 23, we immediately obtain the following characterization of continuity in terms of sequences.
Theorem 24. Suppose that $X$ and $Y$ are metric spaces. Let $E$ be a subset of $X$ and suppose that $f : E \to Y$ is a function from $E$ to $Y$. Let $p \in E$. Then $f$ is continuous at $p$ if and only if
\[ \lim_{n\to\infty} f(s_n) = f(p) \]
for all sequences $\{s_n\}_{n=1}^\infty$ in $E \setminus \{p\}$ such that
\[ \lim_{n\to\infty} s_n = p. \]
Remark 14. The theorem remains true if we permit the sequences $\{s_n\}_{n=1}^\infty$ to lie in $E$ rather than in $E \setminus \{p\}$.
Before continuing any further, we point out that our definition of continuity of $f : E \to Y$ at a point $p \in E \subset X$ has absolutely nothing to do with the complement $X \setminus E$ of the set $E$ in the ambient space $X$. Thus the definition of continuity at a point is intrinsic in the sense that it doesn't matter what ambient space $X$ we choose to contain $E$, and in fact we can just restrict attention to the case where $X = E$ is a metric space in its own right. Note that the definition of limit in Definition 24 is not intrinsic since the limit point $p$ may not lie in the set $E$.
Definition 26. A function $f : X \to Y$ is said to be continuous on $X$ if $f$ is continuous at each point $p \in X$.
The previous theorem says that $f : X \to Y$ is continuous if and only if $\lim_{n\to\infty} f(s_n) = f(p)$ for all sequences $\{s_n\}_{n=1}^\infty$ in $X$ such that $\lim_{n\to\infty} s_n = p$. There is an alternate characterization of continuity of $f : X \to Y$ in terms of open sets which is particularly useful in connection with compact sets and continuity of inverse functions.
Theorem 25. Suppose that $f : X \to Y$ is a function from a metric space $X$ to a metric space $Y$. Then $f$ is continuous on $X$ if and only if
\[ f^{-1}(G) \text{ is open in } X \text{ for every } G \text{ that is open in } Y. \tag{1.4} \]
Corollary 10. Suppose that $f : X \to Y$ is a continuous function from a compact metric space $X$ to a metric space $Y$. Then $f(X)$ is compact.
Corollary 11. Suppose that $f : X \to Y$ is a continuous function from a compact metric space $X$ to a metric space $Y$. If $f$ is both one-to-one and onto, then the inverse function $f^{-1} : Y \to X$ defined by
\[ f^{-1}(y) = x \text{ where } x \text{ is the unique point in } X \text{ satisfying } f(x) = y, \]
is a continuous map.
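Proof (of Corollary 10): If $\{G_\alpha\}_{\alpha \in A}$ is an open cover of $f(X)$, then $\{f^{-1}(G_\alpha)\}_{\alpha \in A}$ is an open cover of $X$, hence has a finite subcover $\{f^{-1}(G_{\alpha_k})\}_{k=1}^N$. But then $\{G_{\alpha_k}\}_{k=1}^N$ is a finite subcover of $f(X)$ since
\[ f(X) \subset f\left( \bigcup_{k=1}^N f^{-1}(G_{\alpha_k}) \right) \subset \bigcup_{k=1}^N f\left( f^{-1}(G_{\alpha_k}) \right) \subset \bigcup_{k=1}^N G_{\alpha_k}. \]
Note that it is not in general true that $f^{-1}(f(G)) \subset G$.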
Proof (of Corollary 11): Let $G$ be an open subset of $X$. We must show that $(f^{-1})^{-1}(G)$ is open in $Y$. Note that since $f$ is one-to-one and onto, we have $(f^{-1})^{-1}(G) = f(G)$. Now $G^c = X \setminus G$ is closed in $X$, hence compact, and so Corollary 10 shows that $f(G^c)$ is compact, hence closed in $Y$, so $f(G^c)^c$ is open in $Y$. But again using that $f$ is one-to-one and onto shows that $f(G) = f(G^c)^c$, and so we are done.
Remark 15. Compactness is essential in this corollary since the map
\[ f : [0, 2\pi) \to \mathbb{T} = \{ z \in \mathbb{C} : |z| = 1 \} \quad \text{defined by } f(\theta) = e^{i\theta} = (\cos\theta, \sin\theta), \]
takes $[0, 2\pi)$ one-to-one and onto $\mathbb{T}$, yet the inverse map fails to be continuous at $z = 1$. Indeed, for points $z$ on the circle just below $1$, $f^{-1}(z)$ is close to $2\pi$, while $f^{-1}(1) = 0$.
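The jump of the inverse map at $z = 1$ is easy to see numerically. The sketch below is illustrative only: it implements $f^{-1}$ through `math.atan2` reduced modulo $2\pi$, which is one convenient way (an implementation choice made here, not something from the notes) to pick the angle in $[0, 2\pi)$; the names `f` and `f_inverse` are hypothetical.

```python
import cmath
import math

def f(theta):
    """The map theta -> e^{i theta} from [0, 2*pi) onto the unit circle."""
    return cmath.exp(1j * theta)

def f_inverse(z):
    """The unique theta in [0, 2*pi) with e^{i theta} = z, for |z| = 1."""
    return math.atan2(z.imag, z.real) % (2.0 * math.pi)

for h in (1e-1, 1e-3, 1e-6):
    z = f(2.0 * math.pi - h)             # a point on the circle just below 1
    print(abs(z - 1), f_inverse(z))      # z is close to 1, yet the angle is close to 2*pi

print(f_inverse(complex(1.0, 0.0)))      # the angle of z = 1 itself
```

Points arbitrarily close to $1$ on the circle are sent to angles near $2\pi$, far from $f^{-1}(1) = 0$, which is the failure of continuity described in the remark. The domain $[0, 2\pi)$ is not compact, so Corollary 11 does not apply.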
Proof (of Theorem 25): Suppose first that $f$ is continuous on $X$. We must show that (1.4) holds. So let $G$ be an open subset of $Y$. We must now show that for every $p \in f^{-1}(G)$ there is $r > 0$ (depending on $p$) such that $B(p, r) \subset f^{-1}(G)$. Fix $p \in f^{-1}(G)$. Since $G$ is open and $f(p) \in G$ we can pick $\varepsilon > 0$ such that $B(f(p), \varepsilon) \subset G$. But then by the continuity of $f$ there is $\delta > 0$ such that (1.2) holds, i.e. $f(B(p, \delta)) \subset B(f(p), \varepsilon) \subset G$. It follows that
\[ B(p, \delta) \subset f^{-1}(f(B(p, \delta))) \subset f^{-1}(G). \]
Conversely suppose that (1.4) holds. We must show that $f$ is continuous at every $p \in X$. So fix $p \in X$. We must now show that for every $\varepsilon > 0$ there is $\delta > 0$ such that (1.2) holds, i.e. $f(B(p, \delta)) \subset B(f(p), \varepsilon)$. Fix $\varepsilon > 0$. Since $B(f(p), \varepsilon)$ is open, we have that $f^{-1}(B(f(p), \varepsilon))$ is open by (1.4). Since $p \in f^{-1}(B(f(p), \varepsilon))$ there is thus $\delta > 0$ such that $B(p, \delta) \subset f^{-1}(B(f(p), \varepsilon))$. It follows that
\[ f(B(p, \delta)) \subset f\left( f^{-1}(B(f(p), \varepsilon)) \right) \subset B(f(p), \varepsilon). \]
Before specializing to the case where $Y$ is the space of real or complex numbers, we show that continuity is stable under composition of maps. Continuity on a metric space is easily handled with the help of Theorem 25.
Theorem 26. Suppose that $X, Y, Z$ are metric spaces. If $f : X \to Y$ and $g : Y \to Z$ are both continuous maps, then so is the composition $h = g \circ f : X \to Z$ defined by
\[ h(x) = g(f(x)), \qquad x \in X. \]
Proof: If $G$ is open in $Z$, then
\[ h^{-1}(G) = f^{-1}\left( g^{-1}(G) \right) \]
is open since $g$ continuous implies $g^{-1}(G)$ is open by Theorem 25, and then $f$ continuous implies $f^{-1}(g^{-1}(G))$ is open by Theorem 25. Thus $h$ is continuous by Theorem 25.
Continuity at a point is also easily handled using Definition 25. We leave the proof of the following theorem to the reader.
Theorem 27. Suppose that $X, Y, Z$ are metric spaces. If $p \in E \subset X$ and $f : E \to Y$ is continuous at $p$ and $g : f(E) \to Z$ is continuous at $f(p)$, then the composition $h = g \circ f : E \to Z$ is continuous at $p$.
1.1. Real and complex-valued continuous functions. Proposition 12 established limit properties for sums and products of complex-valued functions, and some definition chasing easily leads to the following analogous result for continuous maps.
Proposition 13. If $f$ and $g$ are continuous complex-valued functions on a metric space $X$, then so are the functions $f + g$ and $fg$. If in addition $g$ never vanishes, then $f/g$ is also continuous on $X$.
Here is an extremely useful consequence of Corollary 10 when the target space $Y$ is the real numbers.
Theorem 28. Suppose that $X$ is a compact metric space and $f : X \to \mathbb{R}$ is continuous. Then there exist points $p, q \in X$ satisfying
$$f(p) = \sup f(X) \quad \text{and} \quad f(q) = \inf f(X).$$
Remark 16. Compactness of $X$ is essential here as evidenced by the following example. If $X$ is the open interval $(0, 1)$ and $f : (0, 1) \to (0, 1)$ is the identity map defined by $f(x) = x$, then $f$ is continuous and
$$\sup f((0, 1)) = \sup (0, 1) = 1, \qquad \inf f((0, 1)) = \inf (0, 1) = 0.$$
However, there are no points $p, q \in (0, 1)$ satisfying either $f(p) = 1$ or $f(q) = 0$.
Proof (of Theorem 28): Corollary 10 shows that $f(X)$ is compact. Lemmas 4 and 6 now show that $f(X)$ is a closed and bounded subset of $\mathbb{R}$. Finally, Theorem 5 shows that $\sup f(X)$ exists and that $\sup f(X) \in f(X)$, i.e. there is $p \in X$ such that $\sup f(X) = f(p)$. Similarly there is $q \in X$ satisfying $\inf f(X) = f(q)$.
Now consider a complex-valued function $f : X \to \mathbb{C}$ on a metric space $X$, and let $u : X \to \mathbb{R}$ and $v : X \to \mathbb{R}$ be the real and imaginary parts of $f$ defined by
$$u(x) = \operatorname{Re} f(x) = \frac{f(x) + \overline{f(x)}}{2}, \qquad v(x) = \operatorname{Im} f(x) = \frac{f(x) - \overline{f(x)}}{2i},$$
for $x \in X$. It is easy to see that $f$ is continuous at a point $p \in X$ if and only if each of $u$ and $v$ is continuous at $p$. Indeed, the inequalities
$$\max\{|a|, |b|\} \le \sqrt{|a|^2 + |b|^2} \le |a| + |b|$$
show that if (1.1) holds for $f$ (with $E = X$), i.e.
$$d_{\mathbb{C}}(f(x), f(p)) < \varepsilon \quad \text{whenever } d_X(x, p) < \delta,$$
then it also holds with $f$ replaced by $u$ or by $v$:
$$d_{\mathbb{R}}(u(x), u(p)) = |u(x) - u(p)| \le \sqrt{|u(x) - u(p)|^2 + |v(x) - v(p)|^2} = d_{\mathbb{C}}(f(x), f(p)) < \varepsilon$$
whenever $d_X(x, p) < \delta$. Similarly, if (1.1) holds for both $u$ and $v$ then it holds for $f$ but with $\varepsilon$ replaced by $2\varepsilon$:
$$d_{\mathbb{C}}(f(x), f(p)) = \sqrt{|u(x) - u(p)|^2 + |v(x) - v(p)|^2} \le |u(x) - u(p)| + |v(x) - v(p)| = d_{\mathbb{R}}(u(x), u(p)) + d_{\mathbb{R}}(v(x), v(p)) < 2\varepsilon$$
whenever $d_X(x, p) < \delta$.
The same considerations apply equally well to Euclidean space $\mathbb{R}^n$ (recall that $\mathbb{C} = \mathbb{R}^2$ as metric spaces) and we have the following theorem. Recall that the dot product of two vectors $\mathbf{z} = (z_1, ..., z_n)$ and $\mathbf{w} = (w_1, ..., w_n)$ in $\mathbb{R}^n$ is given by $\mathbf{z} \cdot \mathbf{w} = \sum_{k=1}^{n} z_k w_k$.
Theorem 29. Let $X$ be a metric space and suppose $\mathbf{f} : X \to \mathbb{R}^n$. Let $f_k(x)$ be the component functions defined by $\mathbf{f}(x) = (f_1(x), ..., f_n(x))$ for $1 \le k \le n$.
(1) The vector-valued function $\mathbf{f} : X \to \mathbb{R}^n$ is continuous at a point $p \in X$ if and only if each component function $f_k : X \to \mathbb{R}$ is continuous at $p$.
(2) If both $\mathbf{f} : X \to \mathbb{R}^n$ and $\mathbf{g} : X \to \mathbb{R}^n$ are continuous at $p$ then so are $\mathbf{f} + \mathbf{g} : X \to \mathbb{R}^n$ and $\mathbf{f} \cdot \mathbf{g} : X \to \mathbb{R}$.
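Here are some simple facts associated with the component functions on Euclidean space.
- For each $1 \le j \le n$, the component function $\mathbf{w} = (w_1, ..., w_n) \to w_j$ is continuous from $\mathbb{R}^n$ to $\mathbb{R}$.
- The length function $\mathbf{w} = (w_1, ..., w_n) \to |\mathbf{w}|$ is continuous from $\mathbb{R}^n$ to $[0, \infty)$; in fact we have the so-called reverse triangle inequality:
$$\big| |\mathbf{z}| - |\mathbf{w}| \big| \le |\mathbf{z} - \mathbf{w}|, \quad \mathbf{z}, \mathbf{w} \in \mathbb{R}^n.$$
- Every monomial function $\mathbf{w} = (w_1, ..., w_n) \to w_1^{k_1} w_2^{k_2} \cdots w_n^{k_n}$ is continuous from $\mathbb{R}^n$ to $\mathbb{R}$.
- Every polynomial $P(\mathbf{w}) = \sum_{k_1 + ... + k_n \le N} a_{k_1, ..., k_n} w_1^{k_1} w_2^{k_2} \cdots w_n^{k_n}$ is continuous from $\mathbb{R}^n$ to $\mathbb{R}$.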
1.2. Uniform continuity. A function $f : X \to Y$ that is continuous from a metric space $X$ to another metric space $Y$ satisfies Definition 25 at each point $p$ in $X$, namely for every $p \in X$ and $\varepsilon > 0$ there is $\delta_p > 0$ (note the dependence on $p$) such that (1.1) holds with $E = X$:
$$d_Y(f(x), f(p)) < \varepsilon \quad \text{whenever } d_X(x, p) < \delta_p. \tag{1.5}$$
In general we cannot choose $\delta > 0$ to be independent of $p$. For example, the function $f(x) = \frac{1}{x}$ is continuous on the open interval $(0, 1)$, but if we want
$$\varepsilon > d_Y(f(x), f(p)) = \left| \frac{1}{x} - \frac{1}{p} \right| \quad \text{whenever } |p - x| < \delta,$$
we cannot take $p = \delta$ since then $x$ could be arbitrarily close to $0$, and so $\frac{1}{x}$ could be arbitrarily large. In this example, $X = (0, 1)$ is not compact and this turns out to be the reason we cannot choose $\delta > 0$ to be independent of $p$. The surprising property that continuous functions $f$ on a compact metric space $X$ have is that we can indeed choose $\delta > 0$ to be independent of $p$ in (1.5). We first give a name to this surprising property; we call it uniform continuity on $X$.
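To make the dependence of $\delta$ on $p$ concrete for $f(x) = 1/x$, here is a small sketch. It is an illustration under stated assumptions, not part of the notes: the helper name `largest_delta` is hypothetical, and the closed form comes from the worst case $x = p - \delta$, where $\frac{1}{x} - \frac{1}{p} = \frac{\delta}{p(p - \delta)}$.

```python
def largest_delta(p, eps):
    """Supremum of the delta that work at p in (0, 1) for f(x) = 1/x.

    The worst case is x = p - delta, where 1/x - 1/p = delta / (p * (p - delta));
    setting this equal to eps and solving gives eps * p**2 / (1 + eps * p).
    """
    return eps * p * p / (1 + eps * p)

eps = 1.0
for p in (0.5, 0.1, 0.01, 0.001):
    print(p, largest_delta(p, eps))   # the admissible delta shrinks as p -> 0
```

Since the admissible values of $\delta$ tend to $0$ as $p \to 0$, no single $\delta > 0$ serves every $p \in (0, 1)$.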
Definition 27. Suppose that $f : X \to Y$ maps a metric space $X$ into a metric space $Y$. We say that $f$ is uniformly continuous on $X$ if for every $\varepsilon > 0$ there is $\delta > 0$ such that
$$d_Y(f(x), f(p)) < \varepsilon \quad \text{whenever } d_X(x, p) < \delta.$$
The next theorem plays a crucial role in the theory of integration and its application to existence and uniqueness of solutions to differential equations.
Theorem 30. Suppose that $f : X \to Y$ is a continuous map from a compact metric space $X$ into a metric space $Y$. Then $f$ is uniformly continuous on $X$.
Proof: Suppose $\varepsilon > 0$. Since $f$ is continuous on $X$, (1.2) shows that for each point $p \in X$, there is $\delta_p > 0$ such that
$$f(B(p, \delta_p)) \subset B\left(f(p), \frac{\varepsilon}{2}\right). \tag{1.6}$$
Since $X$ is compact, the open cover $\left\{ B\left(p, \frac{\delta_p}{2}\right) \right\}_{p \in X}$ has a finite subcover $\left\{ B\left(p_k, \frac{\delta_{p_k}}{2}\right) \right\}_{k=1}^{N}$. Now define
$$\delta = \min\left\{ \frac{\delta_{p_k}}{2} \right\}_{k=1}^{N}.$$
Since the minimum is taken over finitely many positive numbers (thanks to the finite subcover, which in turn owes its existence to the compactness of $X$), we have $\delta > 0$.
Now suppose that $x, p \in X$ satisfy $d_X(x, p) < \delta$. We will show that $d_Y(f(x), f(p)) < \varepsilon$. Choose $k$ so that $p \in B\left(p_k, \frac{\delta_{p_k}}{2}\right)$. Then we have using the triangle inequality in $X$ that
$$d_X(x, p_k) \le d_X(x, p) + d_X(p, p_k) < \delta + \frac{\delta_{p_k}}{2} \le \frac{\delta_{p_k}}{2} + \frac{\delta_{p_k}}{2} = \delta_{p_k},$$
so that both $p$ and $x$ lie in the ball $B(p_k, \delta_{p_k})$. It follows from (1.6) that both $f(p)$ and $f(x)$ lie in
$$f(B(p_k, \delta_{p_k})) \subset B\left(f(p_k), \frac{\varepsilon}{2}\right).$$
Finally an application of the triangle inequality in $Y$ shows that
$$d_Y(f(x), f(p)) \le d_Y(f(x), f(p_k)) + d_Y(f(p_k), f(p)) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
1.3. Connectedness.
Definition 28. A metric space $X$ is said to be connected if it is not possible to write $X = E \mathbin{\dot{\cup}} F$ where $E$ and $F$ are disjoint nonempty open subsets of $X$. A subset $Y$ of a metric space $X$ is connected if it is connected when considered as a metric space in its own right. A set that is not connected is said to be disconnected.
Equivalently, $X$ is disconnected if it has a nonempty proper clopen subset (a clopen subset of $X$ is one that is simultaneously open and closed in $X$).
Lemma 16. A subset $Y$ of $X$ is disconnected if and only if there are nonempty subsets $E$ and $F$ of $X$ with $Y = E \mathbin{\dot{\cup}} F$ and
$$\overline{E} \cap F = \emptyset \quad \text{and} \quad E \cap \overline{F} = \emptyset, \tag{1.7}$$
where the closures refer to the ambient metric space $X$.
Proof: Theorem 4 shows that $E$ is an open subset of the metric space $Y$ if and only if $E \cap \overline{F} = \emptyset$. Similarly, $F$ is open in $Y$ if and only if $\overline{E} \cap F = \emptyset$. Finally, $E$ is clopen in $Y$ if and only if both $E$ and $F = Y \setminus E$ are open in $Y$.
The connected subsets of the real line are especially simple - they are precisely the intervals
$$[a, b], \quad (a, b), \quad [a, b), \quad (a, b]$$
lying in $\mathbb{R}$ with $-\infty \le a \le b \le \infty$ (we do not consider any case where $a$ or $b$ is $\pm\infty$ and lies next to either $[$ or $]$).
Theorem 31. The connected subsets of the real numbers $\mathbb{R}$ are precisely the intervals.
Proof: Consider first a nonempty connected subset $Y$ of $\mathbb{R}$. If $a, b \in Y$, and $a < c < b$, then we must also have $c \in Y$ since otherwise $Y \cap (-\infty, c)$ is clopen in $Y$. Thus the set $Y$ has the intermediate value property ($a, b \in Y$ and $a < c < b$ implies $c \in Y$), and it is now easy to see using the Least Upper Bound Property of $\mathbb{R}$, that $Y$ is an interval. Conversely, if $Y$ is a disconnected subset of $\mathbb{R}$, then $Y$ has a nonempty proper clopen subset $E$. We can then find two points $a, b \in Y$ with $a \in E$ and $b \in F = Y \setminus E$ and (without loss of generality) $a < b$. Set
$$c = \sup(E \cap [a, b]).$$
By Theorem 5 we have $c \in \overline{E}$, and so $c \notin F$ by (1.7). If also $c \notin E$, then $Y$ fails the intermediate value property and so cannot be an interval. On the other hand, if $c \in E$ then $c \notin \overline{F}$ (the closure of $F$), and so there is $d \in (c, b) \setminus F$. But then $d \notin E$ since $d > c$ and so lies in $(a, b) \setminus Y$, which again shows that $Y$ fails the intermediate value property and so cannot be an interval.
Connected sets behave the same way as compact sets under pushforward by a continuous map.
Theorem 32. Suppose $f : X \to Y$ is a continuous map from a metric space $X$ to another metric space $Y$, and suppose that $A$ is a subset of $X$. If $A$ is connected, then $f(A)$ is connected.
Proof: We may suppose that $A = X$ and $f(A) = Y$. If $Y$ is disconnected, there are disjoint nonempty open subsets $E$ and $F$ with $Y = E \mathbin{\dot{\cup}} F$. But then $X = f^{-1}(E) \mathbin{\dot{\cup}} f^{-1}(F)$ where both $f^{-1}(E)$ and $f^{-1}(F)$ are open in $X$ by Theorem 25, and both are nonempty since $f$ maps onto $Y$. This shows that $X$ is disconnected as well, and completes the proof of the (contrapositive of the) theorem.
Corollary 12. If $f : \mathbb{R} \to \mathbb{R}$ is continuous, then $f$ takes intervals to intervals, and in particular, $f$ takes closed bounded intervals to closed bounded intervals.
Note that this corollary yields two familiar theorems from first year calculus, the Intermediate Value Theorem (real continuous functions on an interval attain their intermediate values) and the Extreme Value Theorem (real continuous functions on a closed bounded interval attain their extreme values).
Proof: Apply Theorems 32, 11 and 10.
Finally we have the following simple description of open subsets of the real numbers.
Proposition 14. Every open subset $G$ of the real numbers $\mathbb{R}$ can be uniquely written as an at most countable pairwise disjoint union of open intervals $\{I_n\}_{n \ge 1}$:
$$G = \dot{\bigcup_{n \ge 1}} I_n.$$
Proof: For $x \in G$ let
$$I_x = \bigcup \{\text{all open intervals containing } x \text{ that are contained in } G\}.$$
It is easy to see that $I_x$ is an open interval and that if $x, y \in G$ then
$$\text{either } I_x = I_y \text{ or } I_x \cap I_y = \emptyset.$$
This shows that $G$ is a union $\dot{\bigcup}_{\alpha \in A} I_\alpha$ of pairwise disjoint open intervals. To see that this union is at most countable, simply use (2) of Proposition 3 to pick a rational number $r_\alpha$ in each $I_\alpha$. The uniqueness is left as an exercise for the reader.
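For an open set that happens to be given as a finite union of open intervals, the decomposition of Proposition 14 can be computed directly. The sketch below is only an illustration of that special case (the proposition itself covers arbitrary open sets); the function name `components` and the pair representation of intervals are assumptions made for the example.

```python
def components(intervals):
    """Disjoint open intervals whose union equals the union of the inputs.

    Each input is a pair (a, b) with a < b standing for the open interval (a, b).
    """
    result = []
    for a, b in sorted(intervals):
        # Open intervals merge only when they genuinely overlap: (0, 1) and
        # (1, 2) stay separate because the point 1 belongs to neither.
        if result and a < result[-1][1]:
            result[-1] = (result[-1][0], max(result[-1][1], b))
        else:
            result.append((a, b))
    return result

print(components([(0, 1), (1, 2), (3, 5), (4, 6)]))
```

Processing the intervals in order of left endpoint mirrors the construction of $I_x$: each output interval is the largest open interval inside the union that contains any one of its points.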
2. Differentiable functions
We can define the derivative of a real-valued function $f$ at a point $p$ provided $f$ is defined on an interval $I$ containing $p$. We give the definition when $I$ is a closed interval, the remaining cases being similar.
Definition 29. Suppose $f : [a, b] \to \mathbb{R}$ and that $p \in [a, b]$. Then $p$ is a limit point of $E = [a, b] \setminus \{p\}$ and the function $Q(x) = \frac{f(x) - f(p)}{x - p}$ of Difference Quotients is defined on $E$. We say that $f$ is differentiable at $p$ if there is $\ell \in \mathbb{R}$ such that
$$\lim_{x \to p} Q(x) = \ell$$
in accordance with Definition 24. In this case we say that $\ell$ is the derivative of $f$ at $p$ and we write
$$f'(p) = \ell = \lim_{x \to p} Q(x) = \lim_{x \to p} \frac{f(x) - f(p)}{x - p}. \tag{2.1}$$
In the case $p = a$, we say that $f'(a)$ defined as above is a right hand derivative of $f$ at $a$, while if $p = b$, we say that $f'(b)$ is a left hand derivative of $f$ at $b$. We can of course define left and right hand derivatives of $f$ at $p \in (a, b)$ by restricting the domain of $f$ to $[a, p]$ and $[p, b]$ respectively. If $f$ is differentiable at every point in a subset $E$ of $[a, b]$, then we say that $f$ is differentiable on $E$.
Remark 17. The Difference Quotient $\frac{f(x) - f(p)}{x - p}$ is the slope of the line segment joining the points $(p, f(p))$ and $(x, f(x))$ on the graph of $f$. Thus if $f'(p)$ exists, it is the limiting value of the slopes of the line segments $\overline{(p, f(p))\,(x, f(x))}$ as $x \to p$, and so we define the line $L$ through the point $(p, f(p))$ having this limiting slope $f'(p)$ to be the tangent line to the graph of $f$ at the point $(p, f(p))$. The equation of the tangent line $L$ is
$$y = f(p) + f'(p)(x - p), \quad x \in \mathbb{R}. \tag{2.2}$$
Lemma 17. Suppose $f : [a, b] \to \mathbb{R}$ and that $p \in [a, b]$. If $f$ is differentiable at $p$, then $f$ is continuous at $p$.
Proof: We have
$$\lim_{x \to p} (f(x) - f(p)) = \lim_{x \to p} \left\{ \frac{f(x) - f(p)}{x - p} \right\} (x - p) = \left\{ \lim_{x \to p} \frac{f(x) - f(p)}{x - p} \right\} \left\{ \lim_{x \to p} (x - p) \right\} = f'(p) \cdot 0 = 0,$$
which implies $\lim_{x \to p} f(x) = f(p)$. Thus $f$ is continuous at $p$ by (1.3).
Now we investigate the calculus of derivatives. First we have the derivative calculus of the field operations. To state the formulas we revert to the more common notation of using $x$ in place of $p$ as the point at which we compute derivatives.
Proposition 15. Suppose that $f, g : (a, b) \to \mathbb{R}$ are functions differentiable at a point $x \in (a, b)$, and suppose that $c \in \mathbb{R}$ represents the constant function. Then we have
(1) $(f + g)'(x) = f'(x) + g'(x)$,
(2) $(cf)'(x) = c f'(x)$,
(3) $(fg)'(x) = f'(x) g(x) + f(x) g'(x)$,
(4) $\left( \frac{f}{g} \right)'(x) = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}$ provided $g(x) \ne 0$.
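Proof: For example, to prove (3) we use (2.1) and the corresponding properties of limits to obtain
$$(fg)'(x) = \lim_{y \to x} \frac{(fg)(y) - (fg)(x)}{y - x} = \lim_{y \to x} \left\{ \frac{f(y) g(y) - f(x) g(y)}{y - x} + \frac{f(x) g(y) - f(x) g(x)}{y - x} \right\}$$
$$= \lim_{y \to x} \frac{f(y) - f(x)}{y - x} \lim_{y \to x} g(y) + f(x) \lim_{y \to x} \frac{g(y) - g(x)}{y - x} = f'(x) g(x) + f(x) g'(x),$$
where $\lim_{y \to x} g(y) = g(x)$ because $g$ is continuous at $x$ by Lemma 17. The other formulas are proved similarly.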
Second we have the calculus of composition of functions, the so-called "chain rule". This is most easily proved using an equivalent formulation of differentiability due to Landau. We begin by rewriting (2.1) in the alternate form
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
Then we rewrite this latter expression using Landau's "little oh" notation as
$$f(x + h) = f(x) + f'(x) h + o(h), \tag{2.3}$$
where $o(h) = h \eta(h)$ and $\eta(h)$ denotes a function of $h$ satisfying $\eta(0) = 0$ and $\eta(h) \to 0$ as $h \to 0$.
Proposition 16. Suppose that $f$ is differentiable at $x$ and that $g$ is differentiable at $y = f(x)$. Then
$$(g \circ f)'(x) = g'(y) f'(x) = g'(f(x)) f'(x).$$
Proof: We use the Landau formulation (2.3) of derivative and the corresponding properties of limits as follows. Write
$$f(x + h_1) = f(x) + f'(x) h_1 + o_1(h_1), \qquad o_1(h_1) = h_1 \eta_1(h_1),$$
$$g(y + h_2) = g(y) + g'(y) h_2 + o_2(h_2), \qquad o_2(h_2) = h_2 \eta_2(h_2),$$
and then with
$$h_2 = f(x + h_1) - f(x) = f'(x) h_1 + o_1(h_1),$$
we have
$$(g \circ f)(x + h_1) = g(f(x + h_1)) = g(f(x) + f'(x) h_1 + o_1(h_1)) = g(y + h_2)$$
$$= g(y) + g'(y) h_2 + o_2(h_2) = (g \circ f)(x) + g'(y) \{ f'(x) h_1 + o_1(h_1) \} + o_2(h_2)$$
$$= (g \circ f)(x) + g'(y) f'(x) h_1 + o_3(h_1),$$
where using $\lim_{h_1 \to 0} h_2 = 0$, we conclude that as $h_1 \to 0$,
$$\frac{o_3(h_1)}{h_1} = g'(y) \frac{o_1(h_1)}{h_1} + \eta_2(h_2) \frac{f'(x) h_1 + o_1(h_1)}{h_1} \to g'(y) \cdot 0 + 0 \cdot f'(x) = 0.$$
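As a numerical sanity check of the chain rule, the difference quotient of a composition can be compared with $g'(f(x)) f'(x)$. This is a sketch with illustrative choices only: $f(x) = x^2$, $g = \sin$, the point $x = 0.7$ and the helper name `difference_quotient` are assumptions, not taken from the notes.

```python
import math

def difference_quotient(func, x, h):
    """The quotient (func(x + h) - func(x)) / h from the alternate form of (2.1)."""
    return (func(x + h) - func(x)) / h

f = lambda x: x * x          # f'(x) = 2x
g = math.sin                 # g'(y) = cos(y)
x = 0.7
chain_rule_value = math.cos(f(x)) * 2 * x    # g'(f(x)) f'(x)

for h in (1e-2, 1e-4, 1e-6):
    q = difference_quotient(lambda t: g(f(t)), x, h)
    print(h, q, abs(q - chain_rule_value))
```

The printed gap shrinks roughly in proportion to $h$, consistent with the remainder $o_3(h_1)$ being small compared with $h_1$.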
Example 8. There is a function $f : \mathbb{R} \to \mathbb{R}$ whose derivative $f' : \mathbb{R} \to \mathbb{R}$ exists everywhere on the real line, but the derivative function $f'$ is not itself differentiable at $0$, not even continuous at $0$. For example
$$f(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$
has these properties. Indeed,
$$f'(x) = \begin{cases} 2x \sin\frac{1}{x} - \cos\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$
fails to be continuous at the origin.
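The failure of continuity in Example 8 can be observed by evaluating $f'$ along the sequence $x_n = \frac{1}{n\pi} \to 0$, where $\cos\frac{1}{x_n} = \cos(n\pi)$ alternates in sign. A minimal sketch (the name `f_prime` and the choice of sequence are illustrative):

```python
import math

def f_prime(x):
    """Derivative of x^2 sin(1/x), extended by the value 0 at x = 0."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Along x_n = 1/(n*pi) -> 0 the term cos(1/x_n) = cos(n*pi) alternates in sign.
for n in (10, 11, 1000, 1001):
    x_n = 1 / (n * math.pi)
    print(n, x_n, f_prime(x_n))
```

The values stay near $-1$ for even $n$ and near $+1$ for odd $n$, so $f'(x_n)$ does not converge to $f'(0) = 0$.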
Proposition 17. Suppose that $f : [a, b] \to \mathbb{R}$ is continuous and strictly increasing. Let $x \in (a, b)$ and set $y = f(x)$. If $f$ is differentiable at $x$ and $f'(x) \ne 0$, then $f^{-1}$ is differentiable at $y$ and
$$\left( f^{-1} \right)'(y) = \frac{1}{f'(x)} = \frac{1}{f'(f^{-1}(y))}.$$
Proof: We first note that by Corollary 12, $f : [a, b] \to [f(a), f(b)]$ is continuous, one-to-one and onto. Thus Corollary 11 shows that $f^{-1}$ is continuous. Then with
$$h = f^{-1}(y + k) - f^{-1}(y) = f^{-1}(y + k) - x,$$
we have
$$f(x + h) = f\left( f^{-1}(y + k) \right) = y + k,$$
and so
$$\frac{f^{-1}(y + k) - f^{-1}(y)}{k} = \frac{h}{f(x + h) - f(x)} \to \frac{1}{f'(x)}$$
as $k \to 0$ since $f'(x) \ne 0$ and
$$\lim_{k \to 0} h = \lim_{k \to 0} \left\{ f^{-1}(y + k) - f^{-1}(y) \right\} = 0$$
by the continuity of $f^{-1}$ at $y$.
2.1. Mean value theorems. We will present four mean value theorems in order of increasing generality. They all depend on the following theorem of Fermat. If $f : X \to \mathbb{R}$ where $X$ is any metric space, we say that $f$ has a relative maximum at a point $p$ in $X$ if there is $\delta > 0$ such that
$$f(p) \ge f(x) \quad \text{for all } x \in B(p, \delta).$$
A relative minimum is defined similarly.
Theorem 33. Suppose $f : [a, b] \to \mathbb{R}$ and $p \in (a, b)$. If $f$ has either a relative maximum or a relative minimum at $p$, and if $f$ is differentiable at $p$, then $f'(p) = 0$.
Proof: Suppose $f$ has a relative maximum at $p$. Then there is $\delta > 0$ such that $f(x) - f(p) \le 0$ for $x \in (p - \delta, p + \delta)$. It follows that
$$\frac{f(x) - f(p)}{x - p} \le 0 \text{ for } x \in (p, p + \delta), \qquad \frac{f(x) - f(p)}{x - p} \ge 0 \text{ for } x \in (p - \delta, p).$$
If we take a sequence $\{x_n\}_{n=1}^{\infty}$ in $(p, p + \delta)$ converging to $p$, we see that
$$f'(p) = \lim_{n \to \infty} \frac{f(x_n) - f(p)}{x_n - p} \le 0,$$
and if we take a sequence $\{x_n\}_{n=1}^{\infty}$ in $(p - \delta, p)$ converging to $p$, we see that
$$f'(p) = \lim_{n \to \infty} \frac{f(x_n) - f(p)}{x_n - p} \ge 0.$$
Combining these inequalities proves that $f'(p) = 0$. The proof is similar if $f$ has a relative minimum at $p$.
Theorem 34. (First Mean Value) Suppose that $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$. If $f(a) = f(b) = 0$, then there is $c \in (a, b)$ such that $f'(c) = 0$.
Proof: If $f \equiv 0$ then any $c \in (a, b)$ works. Otherwise we may suppose without loss of generality that $f(x) > 0$ for some $x$. Then by Theorem 28 there is $c \in [a, b]$ such that
$$\sup f([a, b]) = f(c).$$
Since $f(c) \ge f(x) > 0$ we must have $c \in (a, b)$, and so $f$ has a relative maximum at $c$. Theorem 33 now implies $f'(c) = 0$.
Theorem 35. (Second Mean Value) Suppose that $f : [a, b] \to \mathbb{R}$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Then there is $c \in (a, b)$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
Proof: Define $g : [a, b] \to \mathbb{R}$ by
$$g(x) = f(x) - \left\{ f(a) + \frac{f(b) - f(a)}{b - a} (x - a) \right\}, \quad a \le x \le b,$$
so that $g(x)$ is the signed vertical distance from the graph of $f$ at $x$ to the graph of the line joining $(a, f(a))$ to $(b, f(b))$ at $x$. Then $g$ satisfies the hypotheses of Theorem 34 and so there is a point $c \in (a, b)$ satisfying
$$0 = g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}.$$
Note that the conclusion of the second mean value theorem can be rewritten as
$$f(b) = f(a) + f'(c)(b - a). \tag{2.4}$$
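The mean value theorem only asserts that a point $c$ exists, but for a concrete function it can be located numerically. The sketch below is an illustration under assumptions: it takes $f(x) = x^3$ on $[0, 2]$, and the helper `mean_value_point` is a hypothetical name for a plain bisection that presumes $f'$ is continuous and that $f'(x)$ minus the chord slope changes sign on the interval.

```python
def mean_value_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection for a root of f'(x) - slope on [lo, hi].

    Assumes f' is continuous and f'(x) - slope changes sign between lo and hi.
    """
    g_lo = f_prime(lo) - slope
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f_prime(mid) - slope
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda x: x ** 3
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)                  # slope of the chord, equal to 4
c = mean_value_point(lambda x: 3 * x * x, slope, a, b)
print(c, f(a) + 3 * c * c * (b - a), f(b))       # checks the form (2.4)
```

Here $3c^2 = 4$ gives $c = 2/\sqrt{3}$, which lies strictly between the endpoints as the theorem predicts.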
Theorem 36. (Third Mean Value) Suppose that $f, g : [a, b] \to \mathbb{R}$ are each continuous on $[a, b]$ and differentiable on $(a, b)$. Then there is $c \in (a, b)$ such that
$$[g(b) - g(a)] f'(c) = [f(b) - f(a)] g'(c).$$
Proof: Define $h : [a, b] \to \mathbb{R}$ by
$$h(x) = [g(b) - g(a)] f(x) - [f(b) - f(a)] g(x), \quad a \le x \le b.$$
Then $h$ satisfies the hypotheses of Theorem 35 and a small calculation shows that $h(a) = h(b)$. So there is a point $c \in (a, b)$ satisfying
$$0 = \frac{h(b) - h(a)}{b - a} = h'(c) = [g(b) - g(a)] f'(c) - [f(b) - f(a)] g'(c).$$
Definition 30. If $f : [a, b] \to \mathbb{R}$ is differentiable on $[a, b]$, and if $f' : [a, b] \to \mathbb{R}$ is differentiable on a subset $E$ of $[a, b]$, then we define $f'' = (f')'$ on $E$, and call $f''$ the second derivative of $f$ on $E$. More generally, for $n \ge 2$ we define $f^{(n)} = \left( f^{(n-1)} \right)'$ on $E$ if $f^{(n-1)}$ is defined on an interval containing $E$.
The form (2.4) can be generalized to higher order derivatives.
Theorem 37. (Fourth Mean Value) Suppose that $f : [a, b] \to \mathbb{R}$ is $n - 1$ times continuously differentiable on $[a, b]$, i.e. $f, f', ..., f^{(n-1)}$ are each defined and continuous on $[a, b]$, and suppose that $f^{(n-1)}$ is differentiable on $(a, b)$, i.e. $f^{(n)}$ exists on $(a, b)$. Then there is $c \in (a, b)$ such that
$$f(b) = f(a) + f'(a)(b - a) + f''(a) \frac{(b - a)^2}{2!} + ... + f^{(n-1)}(a) \frac{(b - a)^{n-1}}{(n - 1)!} + f^{(n)}(c) \frac{(b - a)^n}{n!}$$
$$= \sum_{k=0}^{n-1} f^{(k)}(a) \frac{(b - a)^k}{k!} + f^{(n)}(c) \frac{(b - a)^n}{n!}.$$
Proof: Define $g : [a, b] \to \mathbb{R}$ by
$$g(x) = f(x) - \sum_{k=0}^{n-1} f^{(k)}(a) \frac{(x - a)^k}{k!} + M (x - a)^n, \quad a \le x \le b,$$
where $M$ is the number uniquely defined by requiring $g(b) = 0$, i.e.
$$M (b - a)^n = \sum_{k=0}^{n-1} f^{(k)}(a) \frac{(b - a)^k}{k!} - f(b).$$
Calculations show that
$$g'(x) = f'(x) - \sum_{k=1}^{n-1} f^{(k)}(a) \frac{(x - a)^{k-1}}{(k - 1)!} + n M (x - a)^{n-1}, \tag{2.5}$$
$$g''(x) = f''(x) - \sum_{k=2}^{n-1} f^{(k)}(a) \frac{(x - a)^{k-2}}{(k - 2)!} + n(n - 1) M (x - a)^{n-2},$$
$$\vdots$$
$$g^{(n-1)}(x) = f^{(n-1)}(x) - f^{(n-1)}(a) + n(n - 1) \cdots (3)(2)(1)\, M (x - a),$$
$$g^{(n)}(x) = f^{(n)}(x) - 0 + n! M.$$
Now the conclusion of the theorem is that
$$f^{(n)}(c) \frac{(b - a)^n}{n!} = f(b) - \sum_{k=0}^{n-1} f^{(k)}(a) \frac{(b - a)^k}{k!} = -M (b - a)^n,$$
i.e. $f^{(n)}(c) + n! M = 0$. Thus using the last line in (2.5) we see that we must show $g^{(n)}(c) = 0$ for some $c \in (a, b)$.
Now the $k^{th}$ line of (2.5) shows that
$$g^{(k)}(a) = f^{(k)}(a) - f^{(k)}(a) + 0 = 0, \quad 0 \le k \le n - 1. \tag{2.6}$$
Since $g(a) = g(b) = 0$, the first mean value theorem shows that there is $c_1 \in (a, b)$ satisfying $g'(c_1) = 0$. Using (2.6) we see that $g'(a) = g'(c_1) = 0$, and so the first mean value theorem shows that there is $c_2 \in (a, c_1)$ satisfying $g''(c_2) = 0$. Continuing in this way we obtain $c_k \in (a, c_{k-1})$ satisfying
$$g^{(k)}(c_k) = 0,$$
for each $1 \le k \le n$. The number $c = c_n \in (a, b)$ satisfies $g^{(n)}(c) = 0$ and this completes the proof of the fourth mean value theorem.
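For a function whose derivatives are known in closed form, the point $c$ in the fourth mean value theorem can be solved for explicitly. The following sketch uses $f = \exp$ on $[0, 1]$ with $n = 3$; these choices and the helper name `taylor_polynomial` are illustrative assumptions.

```python
import math

def taylor_polynomial(derivs_at_a, a, b):
    """Sum of f^(k)(a) (b - a)^k / k! over the supplied derivatives k = 0..n-1."""
    return sum(d * (b - a) ** k / math.factorial(k)
               for k, d in enumerate(derivs_at_a))

a, b, n = 0.0, 1.0, 3
# For f = exp every derivative at a = 0 equals 1.
remainder = math.exp(b) - taylor_polynomial([1.0] * n, a, b)
# Solve f^(n)(c) (b - a)^n / n! = remainder for c, using f^(n) = exp.
c = math.log(remainder * math.factorial(n) / (b - a) ** n)
print(remainder, c)
```

The computed $c$ lies strictly between $a$ and $b$, in agreement with the theorem.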
Remark 18. The first three mean value theorems can each be interpreted as saying that there is a point on a curve whose tangent is parallel to the line segment joining the endpoints of the curve. For example, in the second theorem, $f'(c)$ is the slope of the tangent line to the graph of $f$ at $(c, f(c))$, while $\frac{f(b) - f(a)}{b - a}$ is the slope of the line joining the endpoints $(a, f(a))$ and $(b, f(b))$ of the graph. In the third theorem, $\frac{f'(c)}{g'(c)}$ is the slope of the parametric curve $x \to (g(x), f(x))$ at the point $(g(c), f(c))$, while $\frac{f(b) - f(a)}{g(b) - g(a)}$ is the slope of the line joining the endpoints $(g(a), f(a))$ and $(g(b), f(b))$. On the other hand, the second and fourth theorems can each be interpreted as saying that a function can be approximated by a polynomial.
2.2. Some consequences of the mean value theorems.
Theorem 38. (monotone functions) Suppose $f : (a, b) \to \mathbb{R}$ is differentiable.
(1) If $f'(x) = 0$ for all $x \in (a, b)$, then $f$ is constant on $(a, b)$.
(2) If $f'(x) \ge 0$ (respectively $f'(x) \le 0$) for all $x \in (a, b)$, then $f$ is monotonically increasing (respectively decreasing) on $(a, b)$.
Proof: Apply (2.4) of the second mean value theorem to the interval $[\alpha, \beta]$ for any $a < \alpha < \beta < b$ to obtain
$$f(\beta) = f(\alpha) + f'(c)(\beta - \alpha),$$
for some $c \in (\alpha, \beta)$.
(1) If $f'(c) = 0$ for all $c \in (\alpha, \beta)$, then $f(\beta) = f(\alpha)$ for all $a < \alpha < \beta < b$.
(2) If $f'(c) \ge 0$ for all $c \in (\alpha, \beta)$, then $f(\beta) \ge f(\alpha)$ for all $a < \alpha < \beta < b$. If $f'(c) \le 0$ for all $c \in (\alpha, \beta)$, then $f(\beta) \le f(\alpha)$ for all $a < \alpha < \beta < b$.
Recall from Corollary 12 that continuous functions have the Intermediate Value Property. The next theorem shows that derivatives also have the Intermediate Value Property, despite the fact that they need not be continuous functions - see Example 8. This is often referred to as a continuity property of derivatives.
Theorem 39. (continuity of derivatives) Suppose $f : [a, b] \to \mathbb{R}$ is differentiable. If $f'(a) < \lambda < f'(b)$, then there is $c \in (a, b)$ such that $f'(c) = \lambda$.
Proof: We effectively reduce matters to the case $\lambda = 0$ by considering $g : [a, b] \to \mathbb{R}$ defined by
$$g(x) = f(x) - \lambda x, \quad x \in [a, b].$$
By Theorem 28 there is a point $p \in [a, b]$ such that
$$\inf g([a, b]) = g(p).$$
We claim that $p \in (a, b)$, i.e. that $p$ cannot be either of the endpoints $a$ or $b$. Indeed,
$$g'(x) = f'(x) - \lambda \tag{2.7}$$
and so
$$g'(a) = f'(a) - \lambda < 0, \qquad g'(b) = f'(b) - \lambda > 0.$$
Since $0 > g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$, there is some $x_1 \in (a, b)$ such that
$$g(x_1) - g(a) < 0,$$
and this shows that $p \ne a$. Since $0 < g'(b) = \lim_{x \to b} \frac{g(x) - g(b)}{x - b}$, there is some $x_2 \in (a, b)$ such that
$$g(x_2) - g(b) < 0,$$
and this shows that $p \ne b$. Thus $g$ has a relative minimum at $p$ and by Theorem 33 we conclude that $g'(p) = 0$. Hence $f'(p) = \lambda$ by (2.7).
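Theorem 40. (l'Hôpital's rule) Suppose $f, g : (a, b) \to \mathbb{R}$ are each differentiable, and that $g'(x) \ne 0$ for all $a < x < b$. If $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$ and
$$\lim_{x \to a} \frac{f'(x)}{g'(x)} = L,$$
then
$$\lim_{x \to a} \frac{f(x)}{g(x)} = L.$$
Proof: Given $\varepsilon > 0$ there is $\delta > 0$ such that
$$\left| \frac{f'(x)}{g'(x)} - L \right| < \varepsilon \quad \text{for all } a < x < a + \delta.$$
Now for $a < \alpha < \beta < a + \delta$, the third mean value theorem gives a point $c \in (\alpha, \beta)$ such that
$$[g(\beta) - g(\alpha)] f'(c) = [f(\beta) - f(\alpha)] g'(c).$$
Here $g(\beta) \ne g(\alpha)$, since otherwise the second mean value theorem would produce a point in $(\alpha, \beta)$ at which $g'$ vanishes. Since also $g'(c) \ne 0$ we can write
$$\frac{f'(c)}{g'(c)} = \frac{f(\beta) - f(\alpha)}{g(\beta) - g(\alpha)}.$$
Thus we have
$$\left| \frac{f(\beta) - f(\alpha)}{g(\beta) - g(\alpha)} - L \right| < \varepsilon \quad \text{for all } a < \alpha < \beta < a + \delta.$$
Now let $\alpha \to a$ and use $\lim_{\alpha \to a} f(\alpha) = \lim_{\alpha \to a} g(\alpha) = 0$ to get
$$\left| \frac{f(\beta)}{g(\beta)} - L \right| \le \varepsilon \quad \text{for all } a < \beta < a + \delta.$$
This completes the proof that $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.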
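A quick numerical illustration of l'Hôpital's rule: with the illustrative choices $f(x) = 1 - \cos x$ and $g(x) = x^2$ on $(0, 1)$ (not taken from the notes), both functions tend to $0$ as $x \to 0$, $g'(x) = 2x$ never vanishes there, and both quotients approach $\frac{1}{2}$.

```python
import math

f = lambda x: 1 - math.cos(x)      # f(x) -> 0 as x -> 0
g = lambda x: x * x                # g(x) -> 0 as x -> 0, g'(x) = 2x != 0 on (0, 1)
f_prime = math.sin
g_prime = lambda x: 2 * x

for x in (1e-1, 1e-2, 1e-3):
    print(x, f(x) / g(x), f_prime(x) / g_prime(x))   # both approach 1/2
```

Much smaller values of $x$ are avoided on purpose, because the subtraction $1 - \cos x$ loses floating point accuracy there.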
3. Exercises
Exercise 39. Let $f : K \to \mathbb{R}$ where $K$ is a compact subset of $\mathbb{R}$. Show that $f$ is continuous if and only if the graph $G(f)$ of $f$ is compact. The graph of $f$ is defined to be the subset of the plane given by
$$G(f) = \{(x, f(x)) : x \in K\}.$$
Exercise 40. Let $(X, d)$ be a metric space. Suppose $E$ and $F$ are disjoint closed subsets of $X$.
(1) If in addition $E$ is compact, prove that there is a positive number $\delta > 0$ such that
$$d(x, y) \ge \delta, \quad \text{for all } x \in E \text{ and } y \in F. \tag{3.1}$$
(2) Give an example to show that (3.1) can fail without the additional hypothesis that either $E$ or $F$ is compact.
Exercise 41. Let $f$ and $g$ be real-valued functions on $\mathbb{R}$, and let $x \in \mathbb{R}$. Suppose that $f'(x)$ and $g'(x)$ both exist, with $g'(x) \ne 0$ and $f(x) = g(x) = 0$. Prove that
$$\lim_{t \to x} \frac{f(t)}{g(t)} = \frac{f'(x)}{g'(x)}.$$
Exercise 42. Suppose that $f$ is differentiable on $[a, b]$, and there is $M > 0$ such that $f$ satisfies the differential inequality
$$|f'(x)| \le M |f(x)|, \quad a \le x \le b.$$
If $f(a) = 0$, prove that $f(x) = 0$ for all $a \le x \le b$.
Hint: Fix $x_0 \in [a, b]$ with $M(x_0 - a) < 1$. Pick $x \in [a, x_0]$ and show that with $M_1 = \sup_{x \in [a, x_0]} |f'(x)|$,
$$|f(x)| \le M_1 |x - a| \le M_1 |x_0 - a| \quad \text{for all } x \in [a, x_0].$$
Then
$$M_0 = \sup_{x \in [a, x_0]} |f(x)| \le M_1 |x_0 - a|$$
and
$$M_1 = \sup_{x \in [a, x_0]} |f'(x)| \le M M_0$$
imply $M_0 \le M M_0 |x_0 - a|$, hence $M_0$ vanishes. Continue forward.
Part 2
Integration
In the second part of these notes we consider the problem of describing the inverse operation to that of differentiation, commonly called integration. There are four widely recognized theories of integration:
- Riemann integration - the workhorse of integration theory that provides us with the most basic form of the fundamental theorem of calculus;
- Riemann-Stieltjes integration - that extends the idea of integrating the infinitesimal $dx$ to that of the more general infinitesimal $d\alpha(x)$ for an increasing function $\alpha$.
- Lebesgue integration - that overcomes a shortcoming of the Riemann theory by permitting a robust theory of limits of functions, all at the expense of a complicated theory of measure of a set.
- Henstock-Kurzweil integration - that includes the Riemann and Lebesgue theories and has the advantages that it is quite similar in spirit to the intuitive Riemann theory, and avoids much of the complication of measurability of sets in the Lebesgue theory. However, it has the drawback of limited scope for generalization.
In Chapter 6 we follow Rudin [3] and use uniform continuity to develop the standard theory of the Riemann and Riemann-Stieltjes integrals. A short detour is taken to introduce the more powerful Henstock-Kurzweil integral, and we use compactness to prove its uniqueness and extension properties.
In Chapter 7 we prove the familiar theorems on uniform convergence of functions and apply this to prove that the metric space $C_{\mathbb{R}}(X)$ of real-valued continuous functions on a compact metric space $X$ is complete. We then use integration theory and the Contraction Lemma from Chapter 4 to produce an elegant proof of existence and uniqueness of solutions to certain initial value problems for differential equations. We also construct a space-filling curve and the von Koch snowflake.
Chapter 8 draws on Stein and Shakarchi [5] to provide a rapid introduction to the theory of the Lebesgue integral.
CHAPTER 6
Riemann and Riemann-Stieltjes integration
Let $f : [0, 1] \to \mathbb{R}$ be a bounded function on the closed unit interval $[0, 1]$. In Riemann's theory of integration, we partition the domain $[0, 1]$ of the function into finitely many nonoverlapping subintervals
$$[0, 1] = \bigcup_{n=1}^{N} [x_{n-1}, x_n],$$
and denote the partition by $P = \{0 = x_0 < x_1 < ... < x_N = 1\}$ and the length of the subinterval $[x_{n-1}, x_n]$ by $\Delta x_n = x_n - x_{n-1} > 0$. Then we define upper and lower Riemann sums associated with the partition $P$ by
$$U(f; P) = \sum_{n=1}^{N} \Big( \sup_{[x_{n-1}, x_n]} f \Big) \Delta x_n, \qquad L(f; P) = \sum_{n=1}^{N} \Big( \inf_{[x_{n-1}, x_n]} f \Big) \Delta x_n.$$
Note that the suprema and infima are finite since $f$ is bounded by assumption. Next we define the upper and lower Riemann integrals of $f$ on $[0, 1]$ by
$$\mathcal{U}(f) = \inf_{P} U(f; P), \qquad \mathcal{L}(f) = \sup_{P} L(f; P).$$
Thus the upper Riemann integral $\mathcal{U}(f)$ is the "smallest" of all the upper sums, and the lower Riemann integral is the "largest" of all the lower sums.
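Upper and lower sums are easy to compute when the suprema and infima are known. The sketch below makes a simplifying assumption that is not part of the definition: the function is increasing, so the supremum on each subinterval is attained at the right endpoint and the infimum at the left endpoint. The name `riemann_sums_increasing` and the choice $f(x) = x^2$ with uniform partitions are illustrative.

```python
def riemann_sums_increasing(f, N):
    """Upper and lower sums of an increasing f on the uniform partition of [0, 1].

    For increasing f the sup on [x_{n-1}, x_n] is f(x_n) and the inf is f(x_{n-1}).
    """
    xs = [n / N for n in range(N + 1)]
    upper = sum(f(xs[n]) * (xs[n] - xs[n - 1]) for n in range(1, N + 1))
    lower = sum(f(xs[n - 1]) * (xs[n] - xs[n - 1]) for n in range(1, N + 1))
    return upper, lower

for N in (4, 40, 400):
    U, L = riemann_sums_increasing(lambda x: x * x, N)
    print(N, L, U)    # the gap U - L equals (f(1) - f(0)) / N here
```

As $N$ grows, the lower sums increase and the upper sums decrease toward the common value $\frac{1}{3}$.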
We can show that any upper sum is always larger than any lower sum by considering the refinement of two partitions $P_1$ and $P_2$: $P_1 \cup P_2$ denotes the partition whose points consist of the union of the points in $P_1$ and $P_2$, ordered to be strictly increasing.
Lemma 18. Suppose $f : [0, 1] \to \mathbb{R}$ is bounded. If $P_1$ and $P_2$ are any two partitions of $[0, 1]$, then
$$U(f; P_1) \ge U(f; P_1 \cup P_2) \ge L(f; P_1 \cup P_2) \ge L(f; P_2). \tag{0.2}$$
Proof: Let
$$P_1 = \{0 = x_0 < x_1 < ... < x_M = 1\},$$
$$P_2 = \{0 = y_0 < y_1 < ... < y_N = 1\},$$
$$P_1 \cup P_2 = \{0 = z_0 < z_1 < ... < z_P = 1\}.$$
Fix a subinterval $[x_{n-1}, x_n]$ of the partition $P_1$. Suppose that $[x_{n-1}, x_n]$ contains exactly the following increasing sequence of points in the partition $P_1 \cup P_2$:
$$z_{\ell_n} < z_{\ell_n + 1} < ... < z_{\ell_n + m_n},$$
i.e. $z_{\ell_n} = x_{n-1}$ and $z_{\ell_n + m_n} = x_n$. Then we have
$$\Big( \sup_{[x_{n-1}, x_n]} f \Big) \Delta x_n = \Big( \sup_{[x_{n-1}, x_n]} f \Big) \Big( \sum_{j=1}^{m_n} \Delta z_{\ell_n + j} \Big) \ge \sum_{j=1}^{m_n} \Big( \sup_{[z_{\ell_n + j - 1}, z_{\ell_n + j}]} f \Big) \Delta z_{\ell_n + j},$$
since $\sup_{[z_{\ell_n + j - 1}, z_{\ell_n + j}]} f \le \sup_{[x_{n-1}, x_n]} f$ when $[z_{\ell_n + j - 1}, z_{\ell_n + j}] \subset [x_{n-1}, x_n]$. If we now sum over $1 \le n \le M$ we get
$$U(f; P_1) = \sum_{n=1}^{M} \Big( \sup_{[x_{n-1}, x_n]} f \Big) \Delta x_n \ge \sum_{n=1}^{M} \sum_{j=1}^{m_n} \Big( \sup_{[z_{\ell_n + j - 1}, z_{\ell_n + j}]} f \Big) \Delta z_{\ell_n + j} = \sum_{p=1}^{P} \Big( \sup_{[z_{p-1}, z_p]} f \Big) \Delta z_p = U(f; P_1 \cup P_2).$$
Similarly we can prove that
$$L(f; P_2) \le L(f; P_1 \cup P_2).$$
Since we trivially have $L(f; P_1 \cup P_2) \le U(f; P_1 \cup P_2)$, the proof of the lemma is complete.
Now in (0.2) take the infimum over $P_1$ and the supremum over $P_2$ to obtain that
$$\mathcal{U}(f) \ge \mathcal{L}(f),$$
which says that the upper Riemann integral of $f$ is always equal to or greater than the lower Riemann integral of $f$. Finally we say that $f$ is Riemann integrable on $[0, 1]$, written $f \in \mathcal{R}[0, 1]$, if $\mathcal{U}(f) = \mathcal{L}(f)$, and we denote the common value by
$$\int_0^1 f \quad \text{or} \quad \int_0^1 f(x)\, dx.$$
We can of course repeat this line of definition and reasoning for any bounded closed interval $[a, b]$ in place of the closed unit interval $[0, 1]$. We summarize matters in the following definition.
Definition 31. Let $f : [a, b] \to \mathbb{R}$ be a bounded function. For any partition $P = \{a = x_0 < x_1 < ... < x_N = b\}$ of $[a, b]$ we define upper and lower Riemann sums by
$$U(f; P) = \sum_{n=1}^{N} \Big( \sup_{[x_{n-1}, x_n]} f \Big) \Delta x_n, \qquad L(f; P) = \sum_{n=1}^{N} \Big( \inf_{[x_{n-1}, x_n]} f \Big) \Delta x_n.$$
Set
$$\mathcal{U}(f) = \inf_{P} U(f; P), \qquad \mathcal{L}(f) = \sup_{P} L(f; P),$$
where the infimum and supremum are taken over all partitions $P$ of $[a, b]$. We say that $f$ is Riemann integrable on $[a, b]$, written $f \in \mathcal{R}[a, b]$, if $\mathcal{U}(f) = \mathcal{L}(f)$, and we denote the common value by
$$\int_a^b f \quad \text{or} \quad \int_a^b f(x)\, dx.$$
A more substantial generalization of the line of definition and reasoning above can be obtained on a closed interval $[a, b]$ by considering in place of the positive quantities $\Delta x_n = x_n - x_{n-1}$ associated with a partition
$$P = \{a = x_0 < x_1 < ... < x_N = b\}$$
of $[a, b]$, the more general nonnegative quantities
$$\Delta \alpha_n = \alpha(x_n) - \alpha(x_{n-1}), \quad 1 \le n \le N,$$
where $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. This leads to the notion of the Riemann-Stieltjes integral associated with a nondecreasing function $\alpha : [a, b] \to \mathbb{R}$.
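Definition 32. Let $f : [a, b] \to \mathbb{R}$ be a bounded function and suppose $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. For any partition $P = \{a = x_0 < x_1 < ... < x_N = b\}$ of $[a, b]$ we define upper and lower Riemann sums by
$$U(f; P, \alpha) = \sum_{n=1}^{N} \Big( \sup_{[x_{n-1}, x_n]} f \Big) \Delta \alpha_n, \qquad L(f; P, \alpha) = \sum_{n=1}^{N} \Big( \inf_{[x_{n-1}, x_n]} f \Big) \Delta \alpha_n.$$
Set
$$\mathcal{U}(f, \alpha) = \inf_{P} U(f; P, \alpha), \qquad \mathcal{L}(f, \alpha) = \sup_{P} L(f; P, \alpha),$$
where the infimum and supremum are taken over all partitions $P$ of $[a, b]$. We say that $f$ is Riemann-Stieltjes integrable on $[a, b]$, written $f \in \mathcal{R}_\alpha[a, b]$, if $\mathcal{U}(f, \alpha) = \mathcal{L}(f, \alpha)$, and we denote the common value by
$$\int_a^b f\, d\alpha \quad \text{or} \quad \int_a^b f(x)\, d\alpha(x).$$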
The lemma on partitions above generalizes immediately to the setting of the Riemann-Stieltjes integral.
Lemma 19. Suppose $f : [a, b] \to \mathbb{R}$ is bounded and $\alpha : [a, b] \to \mathbb{R}$ is nondecreasing. If $P_1$ and $P_2$ are any two partitions of $[a, b]$, then
$$U(f; P_1, \alpha) \ge U(f; P_1 \cup P_2, \alpha) \ge L(f; P_1 \cup P_2, \alpha) \ge L(f; P_2, \alpha). \tag{0.3}$$
0.1. Existence of the Riemann-Stieltjes integral. The difficult question
now arises as to exactly which bounded functions $f$ are Riemann-Stieltjes integrable
with respect to a given nondecreasing $\alpha$ on $[a,b]$. We will content ourselves with
showing two results. Suppose $f$ is bounded on $[a,b]$ and $\alpha$ is nondecreasing on $[a,b]$.
Then
(1) $f\in\mathcal{R}_\alpha[a,b]$ if in addition $f$ is continuous on $[a,b]$;
(2) $f\in\mathcal{R}_\alpha[a,b]$ if in addition $f$ is monotonic on $[a,b]$ and $\alpha$ is continuous on $[a,b]$.
Both proofs will use the Cauchy criterion for existence of the integral $\int_a^b f\,d\alpha$
when $f:[a,b]\to\mathbb{R}$ is bounded and $\alpha:[a,b]\to\mathbb{R}$ is nondecreasing:
\[
(0.4)\qquad \text{For every } \varepsilon>0 \text{ there is a partition } \mathcal{P} \text{ of } [a,b] \text{ such that } U(f;\mathcal{P},\alpha)-L(f;\mathcal{P},\alpha)<\varepsilon.
\]
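Clearly, if (0.4) holds, then for each $\varepsilon>0$ there is a partition
$\mathcal{P}_\varepsilon$ satisfying
\[
\mathcal{U}(f,\alpha)-\mathcal{L}(f,\alpha)=\inf_{\mathcal{P}}U(f;\mathcal{P},\alpha)-\sup_{\mathcal{P}}L(f;\mathcal{P},\alpha)
\le U(f;\mathcal{P}_\varepsilon,\alpha)-L(f;\mathcal{P}_\varepsilon,\alpha)<\varepsilon.
\]
Since (0.3) gives $\mathcal{U}(f,\alpha)\ge\mathcal{L}(f,\alpha)$, it follows that
$\mathcal{U}(f,\alpha)=\mathcal{L}(f,\alpha)$ and so $\int_a^b f\,d\alpha$ exists. Conversely, given $\varepsilon>0$
there are partitions $\mathcal{P}_1$ and $\mathcal{P}_2$ satisfying
\[
\mathcal{U}(f,\alpha)=\inf_{\mathcal{P}}U(f;\mathcal{P},\alpha)>U(f;\mathcal{P}_1,\alpha)-\frac{\varepsilon}{2},\qquad
\mathcal{L}(f,\alpha)=\sup_{\mathcal{P}}L(f;\mathcal{P},\alpha)<L(f;\mathcal{P}_2,\alpha)+\frac{\varepsilon}{2}.
\]
Inequality (0.3) now shows that
\[
U(f;\mathcal{P}_1\cup\mathcal{P}_2,\alpha)-L(f;\mathcal{P}_1\cup\mathcal{P}_2,\alpha)
\le U(f;\mathcal{P}_1,\alpha)-L(f;\mathcal{P}_2,\alpha)
<\Big(\mathcal{U}(f,\alpha)+\frac{\varepsilon}{2}\Big)-\Big(\mathcal{L}(f,\alpha)-\frac{\varepsilon}{2}\Big)=\varepsilon,
\]
since $\mathcal{U}(f,\alpha)=\mathcal{L}(f,\alpha)$ if $\int_a^b f\,d\alpha$ exists. Thus we can take
$\mathcal{P}=\mathcal{P}_1\cup\mathcal{P}_2$ in (0.4).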
The existence of $\int_a^b f\,d\alpha$ when $f$ is continuous will use Theorem 30 on uniform
continuity in a crucial way.
Theorem 41. Suppose that $f:[a,b]\to\mathbb{R}$ is continuous and $\alpha:[a,b]\to\mathbb{R}$ is
nondecreasing. Then $f\in\mathcal{R}_\alpha[a,b]$.
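Proof: We will show that the Cauchy criterion (0.4) holds. Fix $\varepsilon>0$. We may assume
$\alpha(b)>\alpha(a)$, since otherwise every $\Delta\alpha_n$ vanishes and (0.4) is trivial. By
Theorem 30, $f$ is uniformly continuous on the compact set $[a,b]$, so there is $\delta>0$
such that
\[
|f(x)-f(x')|\le\frac{\varepsilon}{\alpha(b)-\alpha(a)}\qquad\text{whenever } |x-x'|\le\delta.
\]
Let $\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}$ be any partition of $[a,b]$ for which
\[
\max_{1\le n\le N}\Delta x_n<\delta.
\]
Then we have
\[
\sup_{[x_{n-1},x_n]}f-\inf_{[x_{n-1},x_n]}f\le\sup_{x,x'\in[x_{n-1},x_n]}|f(x)-f(x')|\le\frac{\varepsilon}{\alpha(b)-\alpha(a)},
\]
since $|x-x'|\le\Delta x_n<\delta$ when $x,x'\in[x_{n-1},x_n]$ by our choice of $\mathcal{P}$. Now we
compute that
\[
U(f;\mathcal{P},\alpha)-L(f;\mathcal{P},\alpha)=\sum_{n=1}^{N}\Big(\sup_{[x_{n-1},x_n]}f-\inf_{[x_{n-1},x_n]}f\Big)\Delta\alpha_n
\le\sum_{n=1}^{N}\frac{\varepsilon}{\alpha(b)-\alpha(a)}\,\Delta\alpha_n=\varepsilon,
\]
which is (0.4) as required.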
Remark 19. Observe that it makes no logical difference if we replace strict
inequality $<$ with $\le$ in $\varepsilon$-$\delta$ type definitions. We have used this observation twice
in the above proof, and will continue to use it without further comment in the sequel.
The proof of the next existence result uses the intermediate value theorem for
continuous functions.
Theorem 42. Suppose that $f:[a,b]\to\mathbb{R}$ is monotone and $\alpha:[a,b]\to\mathbb{R}$ is
nondecreasing and continuous. Then $f\in\mathcal{R}_\alpha[a,b]$.
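Proof: We will show that the Cauchy criterion (0.4) holds. Fix $\varepsilon>0$ and
suppose without loss of generality that $f$ is nondecreasing on $[a,b]$. We may also assume
that $f(b)>f(a)$ and $\alpha(b)>\alpha(a)$, since otherwise the upper and lower sums coincide
for every partition. Let $N\ge 2$ be a positive integer. Since $\alpha$ is continuous we can use
the intermediate value theorem to find points $x_n\in(a,b)$ such that $x_0=a$, $x_N=b$ and
\[
\alpha(x_n)=\alpha(a)+\frac{n}{N}\big(\alpha(b)-\alpha(a)\big),\qquad 1\le n\le N-1.
\]
Since $\alpha$ is nondecreasing we have $x_{n-1}<x_n$ for all $1\le n\le N$, and it follows that
\[
\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}
\]
is a partition of $[a,b]$ satisfying
\[
\Delta\alpha_n=\alpha(x_n)-\alpha(x_{n-1})=\frac{\alpha(b)-\alpha(a)}{N}<\frac{\varepsilon}{f(b)-f(a)},
\]
provided we take $N$ large enough. With such a partition $\mathcal{P}$ we compute
\[
U(f;\mathcal{P},\alpha)-L(f;\mathcal{P},\alpha)=\sum_{n=1}^{N}\Big(\sup_{[x_{n-1},x_n]}f-\inf_{[x_{n-1},x_n]}f\Big)\Delta\alpha_n
\le\frac{\varepsilon}{f(b)-f(a)}\sum_{n=1}^{N}\Big(\sup_{[x_{n-1},x_n]}f-\inf_{[x_{n-1},x_n]}f\Big)
\]
\[
=\frac{\varepsilon}{f(b)-f(a)}\sum_{n=1}^{N}\big(f(x_n)-f(x_{n-1})\big)=\varepsilon.
\]
This proves (0.4) as required.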
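The partition used in this proof can be sketched numerically. The example below is only an illustration under stated assumptions: it takes $\alpha(x)=x^3$ and $f(x)=x^2$ on $[0,1]$, and uses the closed-form inverse of $\alpha$ in place of the intermediate value theorem. The helper names are hypothetical.

```python
def equal_alpha_partition(alpha_inverse, alpha_a, alpha_b, N):
    """Points x_n with alpha(x_n) = alpha(a) + (n/N)(alpha(b) - alpha(a)).

    Assumes alpha is strictly increasing with a known inverse; the proof
    itself only needs continuity and the intermediate value theorem.
    """
    return [alpha_inverse(alpha_a + (n / N) * (alpha_b - alpha_a))
            for n in range(N + 1)]


def upper_minus_lower(f, alpha, xs):
    """U - L for nondecreasing f: sup and inf are attained at the endpoints."""
    return sum((f(x1) - f(x0)) * (alpha(x1) - alpha(x0))
               for x0, x1 in zip(xs, xs[1:]))


N = 50
xs = equal_alpha_partition(lambda u: u ** (1.0 / 3.0), 0.0, 1.0, N)
gap = upper_minus_lower(lambda x: x * x, lambda x: x ** 3, xs)
print(gap)
```

Because every increment of $\alpha$ equals $(\alpha(b)-\alpha(a))/N$, the sum telescopes to $(f(b)-f(a))(\alpha(b)-\alpha(a))/N$, which is the quantity the proof makes small by taking $N$ large.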
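0.2. A stronger form of the definition of the Riemann integral. For the
Riemann integral there is another formulation of the definition of $\int_a^b f$ that appears
at first sight to be much stronger (and which doesn't work for general nondecreasing
$\alpha$ in the Riemann-Stieltjes integral). For any partition $\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}$,
set $\|\mathcal{P}\|=\max_{1\le n\le N}\Delta x_n$, called the norm of $\mathcal{P}$. Now if $\int_a^b f$ exists, then for every
$\varepsilon>0$ there is by the Cauchy criterion (0.4) a partition $\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}$
such that
\[
U(f;\mathcal{P})-L(f;\mathcal{P})<\frac{\varepsilon}{2}.
\]
We may assume that $\operatorname{diam}f([a,b])>0$, since otherwise $f$ is constant and there is nothing
to prove. Now define $\delta$ to be the smaller of the two positive numbers
\[
\min_{1\le n\le N}\Delta x_n\qquad\text{and}\qquad\frac{\varepsilon}{2(N+1)\operatorname{diam}f([a,b])}.
\]
Claim 1. If $\mathcal{Q}=\{a=y_0<y_1<\dots<y_M=b\}$ is any partition with
\[
\|\mathcal{Q}\|=\max_{1\le m\le M}\Delta y_m<\delta,
\]
then
\[
U(f;\mathcal{Q})-L(f;\mathcal{Q})<\varepsilon.
\]
Indeed, since $\Delta y_m<\delta\le\Delta x_n$ for all $m$ and $n$ by choice of $\delta$, no subinterval
of $\mathcal{Q}$ contains two of the points $x_n$. Thus for each $0\le n\le N$ we can choose a subinterval
$J_n=[y_{m_n-1},y_{m_n}]$ of $\mathcal{Q}$ containing $x_n$, and these are distinct with $m_0<m_1<\dots<m_N$.
The other subintervals $[y_{m-1},y_m]$ of $\mathcal{Q}$, with $m$ not equal to any of the $m_n$, each
lie in one of the separating intervals $K_n=[y_{m_n},y_{m_{n+1}-1}]$, $0\le n\le N-1$, that are formed by the
spaces between the intervals $J_n$. Each nondegenerate $K_n$ is the union of one or more
consecutive subintervals of $\mathcal{Q}$, and $K_n\subset[x_n,x_{n+1}]$. We have for each $n$ that
\[
\sum_{m:[y_{m-1},y_m]\subset K_n}\Big(\sup_{[y_{m-1},y_m]}f-\inf_{[y_{m-1},y_m]}f\Big)\Delta y_m
\le\Big(\sup_{K_n}f-\inf_{K_n}f\Big)\sum_{m:[y_{m-1},y_m]\subset K_n}\Delta y_m
\]
\[
\le\Big(\sup_{[x_n,x_{n+1}]}f-\inf_{[x_n,x_{n+1}]}f\Big)\big(y_{m_{n+1}-1}-y_{m_n}\big)
\le\Big(\sup_{[x_n,x_{n+1}]}f-\inf_{[x_n,x_{n+1}]}f\Big)(x_{n+1}-x_n).
\]
Summing this in $n$ yields
\[
(0.5)\qquad\sum_{n=0}^{N-1}\ \sum_{m:[y_{m-1},y_m]\subset K_n}\Big(\sup_{[y_{m-1},y_m]}f-\inf_{[y_{m-1},y_m]}f\Big)\Delta y_m
\le\sum_{n=0}^{N-1}\Big(\sup_{[x_n,x_{n+1}]}f-\inf_{[x_n,x_{n+1}]}f\Big)(x_{n+1}-x_n)=U(f;\mathcal{P})-L(f;\mathcal{P}).
\]
Now we compute
\[
U(f;\mathcal{Q})-L(f;\mathcal{Q})=\sum_{m=1}^{M}\Big(\sup_{[y_{m-1},y_m]}f-\inf_{[y_{m-1},y_m]}f\Big)\Delta y_m
\]
\[
=\sum_{n=0}^{N}\Big(\sup_{J_n}f-\inf_{J_n}f\Big)\big(y_{m_n}-y_{m_n-1}\big)
+\sum_{n=0}^{N-1}\ \sum_{m:[y_{m-1},y_m]\subset K_n}\Big(\sup_{[y_{m-1},y_m]}f-\inf_{[y_{m-1},y_m]}f\Big)\Delta y_m,
\]
which by (0.5) and choice of $\delta$ is dominated by
\[
\operatorname{diam}f([a,b])\sum_{n=0}^{N}\big(y_{m_n}-y_{m_n-1}\big)+U(f;\mathcal{P})-L(f;\mathcal{P})
<(N+1)\operatorname{diam}f([a,b])\,\delta+\frac{\varepsilon}{2}\le\frac{\varepsilon}{2}+\frac{\varepsilon}{2}=\varepsilon,
\]
and this proves the claim.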
Conversely, if
\[
(0.6)\qquad\text{For every }\varepsilon>0\text{ there is }\delta>0\text{ such that }U(f;\mathcal{Q})-L(f;\mathcal{Q})<\varepsilon\text{ whenever }\|\mathcal{Q}\|<\delta,
\]
then the Cauchy criterion (0.4) holds with $\mathcal{P}$ equal to any such $\mathcal{Q}$. Thus (0.6)
provides another equivalent definition of the Riemann integral $\int_a^b f$ that is more
like the $\varepsilon$-$\delta$ definition of continuity at a point (compare Definition 25).
1. Simple properties of the Riemann-Stieltjes integral
The Riemann-Stieltjes integral $\int_a^b f\,d\alpha$ is a function of the closed interval $[a,b]$,
the bounded function $f$ on $[a,b]$, and the nondecreasing function $\alpha$ on $[a,b]$. With
respect to each of these three variables, the integral has natural properties related
to monotonicity, sums and scalar multiplication. In fact we have the following
lemmas dealing with each variable separately, beginning with $f$, then $\alpha$ and ending
with $[a,b]$.
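Lemma 20. Fix $[a,b]\subset\mathbb{R}$ and $\alpha:[a,b]\to\mathbb{R}$ nondecreasing. The set $\mathcal{R}_\alpha[a,b]$
is a real vector space and the integral $\int_a^b f\,d\alpha$ is a linear function of $f\in\mathcal{R}_\alpha[a,b]$:
if $f_j\in\mathcal{R}_\alpha[a,b]$ and $\lambda_j\in\mathbb{R}$, then
\[
f=\lambda_1f_1+\lambda_2f_2\in\mathcal{R}_\alpha[a,b]\quad\text{and}\quad
\int_a^b f\,d\alpha=\lambda_1\int_a^b f_1\,d\alpha+\lambda_2\int_a^b f_2\,d\alpha.
\]
Furthermore, $\mathcal{R}_\alpha[a,b]$ is partially ordered by declaring $f\le g$ if $f(x)\le g(x)$ for
$x\in[a,b]$, and the integral $\int_a^b f\,d\alpha$ is a nondecreasing function of $f$ with respect to
this order: if $f,g\in\mathcal{R}_\alpha[a,b]$ and $f\le g$, then $\int_a^b f\,d\alpha\le\int_a^b g\,d\alpha$.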
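Lemma 21. Fix $[a,b]\subset\mathbb{R}$ and $f:[a,b]\to\mathbb{R}$ bounded. Then
\[
\mathcal{C}_f[a,b]=\{\alpha:[a,b]\to\mathbb{R}:\ \alpha\text{ is nondecreasing and }f\in\mathcal{R}_\alpha[a,b]\}
\]
is a cone and the integral $\int_a^b f\,d\alpha$ is a positive linear function of $\alpha$: if $\alpha_j\in\mathcal{C}_f[a,b]$
and $c_j\in[0,\infty)$, then
\[
\alpha=c_1\alpha_1+c_2\alpha_2\in\mathcal{C}_f[a,b]\quad\text{and}\quad
\int_a^b f\,d\alpha=c_1\int_a^b f\,d\alpha_1+c_2\int_a^b f\,d\alpha_2.
\]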
Lemma 22. Fix $[a,b]\subset\mathbb{R}$, $\alpha:[a,b]\to\mathbb{R}$ nondecreasing and $f\in\mathcal{R}_\alpha[a,b]$.
If $a<c<b$, then $\alpha:[a,c]\to\mathbb{R}$ and $\alpha:[c,b]\to\mathbb{R}$ are each nondecreasing,
$f\in\mathcal{R}_\alpha[a,c]$, $f\in\mathcal{R}_\alpha[c,b]$ and
\[
\int_a^b f\,d\alpha=\int_a^c f\,d\alpha+\int_c^b f\,d\alpha.
\]
These three lemmas are easy to prove, and are left to the reader. Properties
regarding multiplication of functions in $\mathcal{R}_\alpha[a,b]$ and composition of functions are
more delicate.
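Theorem 43. Suppose that $f:[a,b]\to[m,M]$ and $f\in\mathcal{R}_\alpha[a,b]$. If
$\varphi:[m,M]\to\mathbb{R}$ is continuous, then $\varphi\circ f\in\mathcal{R}_\alpha[a,b]$.
Corollary 13. If $f,g\in\mathcal{R}_\alpha[a,b]$, then $fg\in\mathcal{R}_\alpha[a,b]$, $|f|\in\mathcal{R}_\alpha[a,b]$ and
\[
\Big|\int_a^b f\,d\alpha\Big|\le\int_a^b|f|\,d\alpha.
\]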
Proof: Since $\varphi(x)=x^2$ is continuous, Lemma 20 and Theorem 43 yield
\[
fg=\frac{1}{2}\big\{(f+g)^2-f^2-g^2\big\}\in\mathcal{R}_\alpha[a,b].
\]
Since $\varphi(x)=|x|$ is continuous, Theorem 43 yields $|f|\in\mathcal{R}_\alpha[a,b]$. Now choose
$c=\pm1$ so that $c\int_a^b f\,d\alpha\ge0$. Then the lemmas imply
\[
\Big|\int_a^b f\,d\alpha\Big|=c\int_a^b f\,d\alpha=\int_a^b(cf)\,d\alpha\le\int_a^b|f|\,d\alpha.
\]
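Proof (of Theorem 43): Let $h=\varphi\circ f$. We will show that $h\in\mathcal{R}_\alpha[a,b]$ by
verifying the Cauchy criterion for integrals (0.4). Fix $\varepsilon>0$. Since $\varphi$ is continuous
on the compact interval $[m,M]$, it is uniformly continuous on $[m,M]$ by Theorem 30.
Thus we can choose $0<\delta<\varepsilon$ such that
\[
|\varphi(s)-\varphi(t)|<\varepsilon\qquad\text{whenever }|s-t|<\delta.
\]
Since $f\in\mathcal{R}_\alpha[a,b]$, there is by the Cauchy criterion a partition
\[
\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}
\]
such that
\[
(1.1)\qquad U(f;\mathcal{P},\alpha)-L(f;\mathcal{P},\alpha)<\delta^2.
\]
Let
\[
M_n=\sup_{[x_{n-1},x_n]}f,\quad m_n=\inf_{[x_{n-1},x_n]}f,\qquad
M_n^*=\sup_{[x_{n-1},x_n]}h,\quad m_n^*=\inf_{[x_{n-1},x_n]}h,
\]
and set
\[
A=\{n:\ M_n-m_n<\delta\}\quad\text{and}\quad B=\{n:\ M_n-m_n\ge\delta\}.
\]
The point of the index set $A$ is that for each $n\in A$ we have
\[
M_n^*-m_n^*=\sup_{x,y\in[x_{n-1},x_n]}|\varphi(f(x))-\varphi(f(y))|\le\sup_{|s-t|\le M_n-m_n}|\varphi(s)-\varphi(t)|
\le\sup_{|s-t|<\delta}|\varphi(s)-\varphi(t)|\le\varepsilon,\qquad n\in A.
\]
As for $n$ in the index set $B$, we have $\delta\le M_n-m_n$ and the inequality (1.1) then
gives
\[
\delta\sum_{n\in B}\Delta\alpha_n\le\sum_{n\in B}(M_n-m_n)\Delta\alpha_n<\delta^2.
\]
Dividing by $\delta>0$ we obtain
\[
\sum_{n\in B}\Delta\alpha_n<\delta.
\]
Now we use the trivial bound
\[
M_n^*-m_n^*\le\operatorname{diam}\varphi([m,M])
\]
to compute that
\[
U(h;\mathcal{P},\alpha)-L(h;\mathcal{P},\alpha)=\Big\{\sum_{n\in A}+\sum_{n\in B}\Big\}(M_n^*-m_n^*)\Delta\alpha_n
\le\sum_{n\in A}\varepsilon\,\Delta\alpha_n+\sum_{n\in B}\operatorname{diam}\varphi([m,M])\,\Delta\alpha_n
\]
\[
\le\varepsilon\big(\alpha(b)-\alpha(a)\big)+\delta\operatorname{diam}\varphi([m,M])
\le\varepsilon\big[\alpha(b)-\alpha(a)+\operatorname{diam}\varphi([m,M])\big],
\]
which verifies (0.4) for the existence of $\int_a^b h\,d\alpha$ as required.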
1.1. The Henstock-Kurzweil integral. We can reformulate the $\varepsilon$-$\delta$ definition
of the Riemann integral $\int_a^b f$ in (0.6) using a more general notion of partition,
that of a tagged partition. If $\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}$ is a partition of
$[a,b]$ and we choose points $t_n\in[x_{n-1},x_n]$ in each subinterval of $\mathcal{P}$, then
\[
\mathcal{P}^*=\{a=x_0\le t_1\le x_1\le\dots\le x_{N-1}\le t_N\le x_N=b\},\qquad\text{where }x_0<x_1<\dots<x_N,
\]
is called a tagged partition $\mathcal{P}^*$ with underlying partition $\mathcal{P}$. Thus a tagged partition
consists of two finite intertwined sequences $\{x_n\}_{n=0}^{N}$ and $\{t_n\}_{n=1}^{N}$, where the
sequence $\{x_n\}_{n=0}^{N}$ is strictly increasing and the sequence $\{t_n\}_{n=1}^{N}$ need not be. For
every tagged partition $\mathcal{P}^*$ of $[a,b]$, define the corresponding Riemann sum $S(f;\mathcal{P}^*)$
by
\[
S(f;\mathcal{P}^*)=\sum_{n=1}^{N}f(t_n)\,\Delta x_n.
\]
Note that $\inf_{[x_{n-1},x_n]}f\le f(t_n)\le\sup_{[x_{n-1},x_n]}f$ implies that
\[
L(f;\mathcal{P})\le S(f;\mathcal{P}^*)\le U(f;\mathcal{P})
\]
for all tagged partitions $\mathcal{P}^*$ with underlying partition $\mathcal{P}$.
Now observe that if $f\in\mathcal{R}[a,b]$, $\varepsilon>0$ and the partition $\mathcal{P}$ satisfies
\[
U(f;\mathcal{P})-L(f;\mathcal{P})<\varepsilon,
\]
then every tagged partition $\mathcal{P}^*$ with underlying partition $\mathcal{P}$ satisfies
\[
(1.2)\qquad\Big|S(f;\mathcal{P}^*)-\int_a^b f\Big|\le U(f;\mathcal{P})-L(f;\mathcal{P})<\varepsilon.
\]
Conversely, if for each $\varepsilon>0$ there is a partition $\mathcal{P}$ such that every tagged partition
$\mathcal{P}^*$ with underlying partition $\mathcal{P}$ satisfies (1.2), then (0.4) holds and so $f\in\mathcal{R}[a,b]$.
However, we can also formulate this approach using the $\varepsilon$-$\delta$ form (0.6) of the
definition of $\int_a^b f$. The result is that $f\in\mathcal{R}[a,b]$ if and only if
\[
(1.3)\qquad\text{There is }I\in\mathbb{R}\text{ such that for every }\varepsilon>0\text{ there is }\delta>0\text{ such that }
|S(f;\mathcal{P}^*)-I|<\varepsilon\text{ whenever }\|\mathcal{P}^*\|<\delta.
\]
Of course if such a number $I$ exists we write $I=\int_a^b f$ and call it the Riemann
integral of $f$ on $[a,b]$. Here we define $\|\mathcal{P}^*\|$ to be $\|\mathcal{P}\|$ where $\mathcal{P}$ is the underlying
partition of $\mathcal{P}^*$. The reader can easily verify that $f\in\mathcal{R}[a,b]$ if and only if the
above condition (1.3) holds.
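The sandwich between lower sum, tagged Riemann sum and upper sum can be checked numerically. The following sketch is an illustration only: it assumes the nondecreasing integrand $f(x)=x^2$ on $[0,1]$ (so the lower and upper sums are the left- and right-endpoint sums) and picks tags at random; the function name `riemann_sum` is hypothetical.

```python
import random


def riemann_sum(f, xs, tags):
    """Riemann sum: sum of f(t_n) * (x_n - x_{n-1}) over the tagged partition."""
    return sum(f(t) * (x1 - x0) for t, x0, x1 in zip(tags, xs, xs[1:]))


random.seed(0)                      # fixed seed so the run is reproducible
N = 200
xs = [n / N for n in range(N + 1)]  # uniform partition of [0, 1]
tags = [random.uniform(x0, x1) for x0, x1 in zip(xs, xs[1:])]


def f(x):
    return x * x                    # nondecreasing on [0, 1]


S = riemann_sum(f, xs, tags)
lower = riemann_sum(f, xs, xs[:-1])  # left endpoints: the lower sum for this f
upper = riemann_sum(f, xs, xs[1:])   # right endpoints: the upper sum for this f
print(lower, S, upper)
```

Whatever tags are chosen, the Riemann sum stays within `upper - lower` of the integral $1/3$, which is the content of (1.2).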
Now comes the clever insight of Henstock and Kurzweil. We view the positive
constant $\delta$ in (1.3) as a function on the interval $[a,b]$, and replace it with an arbitrary
(not necessarily constant) positive function $\delta:[a,b]\to(0,\infty)$. We refer to such
an arbitrary positive function $\delta:[a,b]\to(0,\infty)$ as a gauge on $[a,b]$. Then for any
gauge $\delta$ on $[a,b]$, we say that a tagged partition $\mathcal{P}^*$ on $[a,b]$ is $\delta$-fine provided
\[
(1.4)\qquad[x_{n-1},x_n]\subset\big(t_n-\delta(t_n),\ t_n+\delta(t_n)\big),\qquad1\le n\le N.
\]
Thus $\mathcal{P}^*$ is $\delta$-fine if each tag $t_n\in[x_{n-1},x_n]$ has its associated gauge value $\delta(t_n)$
sufficiently large that the open interval centered at $t_n$ with radius $\delta(t_n)$ contains
the $n$-th subinterval $[x_{n-1},x_n]$ of the partition $\mathcal{P}$. Now we can give the definition of
the Henstock-Kurzweil integral.
Definition 33. A function $f:[a,b]\to\mathbb{R}$ is Henstock-Kurzweil integrable on
$[a,b]$, written $f\in\mathcal{HK}[a,b]$, if there is $I\in\mathbb{R}$ such that for every $\varepsilon>0$ there is a
gauge $\delta_\varepsilon:[a,b]\to(0,\infty)$ on $[a,b]$ such that
\[
|S(f;\mathcal{P}^*)-I|<\varepsilon\qquad\text{whenever }\mathcal{P}^*\text{ is }\delta_\varepsilon\text{-fine}.
\]
It is clear that if $f\in\mathcal{R}[a,b]$ is Riemann integrable, then $f$ satisfies Definition
33 with $I=\int_a^b f$: simply take $\delta_\varepsilon$ to be the constant gauge $\delta$ in (1.3). However,
for this new definition to have any value it is necessary that such an $I$ is uniquely
determined by Definition 33. This is indeed the case and relies crucially on the fact
that $[a,b]$ is compact. Here are the details.
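Suppose that Definition 33 holds with both $I$ and $I'$. Let $\varepsilon>0$. Then there
are gauges $\delta_\varepsilon$ and $\delta'_\varepsilon$ on $[a,b]$ such that
\[
|S(f;\mathcal{P}^*)-I|<\varepsilon\ \text{ whenever }\mathcal{P}^*\text{ is }\delta_\varepsilon\text{-fine},\qquad
|S(f;\mathcal{P}^*)-I'|<\varepsilon\ \text{ whenever }\mathcal{P}^*\text{ is }\delta'_\varepsilon\text{-fine}.
\]
Now define
\[
\eta_\varepsilon(x)=\min\{\delta_\varepsilon(x),\delta'_\varepsilon(x)\},\qquad a\le x\le b.
\]
Then $\eta_\varepsilon$ is a gauge on $[a,b]$. Here is the critical point: we would like to produce
a tagged partition $\mathcal{P}^*_\varepsilon$ that is $\eta_\varepsilon$-fine! Indeed, if such a tagged partition $\mathcal{P}^*_\varepsilon$ exists,
then $\mathcal{P}^*_\varepsilon$ would also be $\delta_\varepsilon$-fine and $\delta'_\varepsilon$-fine (since $\eta_\varepsilon\le\delta_\varepsilon$ and $\eta_\varepsilon\le\delta'_\varepsilon$) and hence
\[
|I-I'|\le|S(f;\mathcal{P}^*_\varepsilon)-I|+|S(f;\mathcal{P}^*_\varepsilon)-I'|<2\varepsilon
\]
for all $\varepsilon>0$, which forces $I=I'$.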
However, if $\eta$ is any gauge on $[a,b]$, let
\[
B(x,\eta(x))=\big(x-\eta(x),\ x+\eta(x)\big)\quad\text{and}\quad
B\Big(x,\frac{\eta(x)}{2}\Big)=\Big(x-\frac{\eta(x)}{2},\ x+\frac{\eta(x)}{2}\Big).
\]
Then $\big\{B\big(x,\frac{\eta(x)}{2}\big)\big\}_{x\in[a,b]}$ is an open cover of the compact set $[a,b]$, hence there
is a finite subcover $\big\{B\big(x_n,\frac{\eta(x_n)}{2}\big)\big\}_{n=0}^{N}$. We may assume that every interval
$B\big(x_n,\frac{\eta(x_n)}{2}\big)$ is needed to cover $[a,b]$ by discarding any in turn which are included
in the union of the others. We may also assume that $a=x_0<x_1<\dots<x_N=b$.
It follows that $B\big(x_{n-1},\frac{\eta(x_{n-1})}{2}\big)\cap B\big(x_n,\frac{\eta(x_n)}{2}\big)\ne\emptyset$, so the triangle inequality
yields
\[
|x_n-x_{n-1}|<\frac{\eta(x_{n-1})+\eta(x_n)}{2},\qquad1\le n\le N.
\]
If $\eta(x_n)\ge\eta(x_{n-1})$ then
\[
[x_{n-1},x_n]\subset B(x_n,\eta(x_n)),
\]
and so we define $t_n=x_n$. Otherwise, we have $\eta(x_{n-1})>\eta(x_n)$ and then
\[
[x_{n-1},x_n]\subset B(x_{n-1},\eta(x_{n-1})),
\]
and so we define $t_n=x_{n-1}$. The tagged partition
\[
\mathcal{P}^*=\{a=x_0\le t_1\le x_1\le\dots\le x_{N-1}\le t_N\le x_N=b\}
\]
is then $\eta$-fine.
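For experimenting with gauges it can help to build a fine tagged partition explicitly. The sketch below does not follow the covering argument above; it swaps in a simple bisection search (in the spirit of what is often called Cousin's lemma), trying the endpoints and midpoint of each interval as tags. The function name, the depth cap and the sample gauge are all assumptions made for illustration, and the search is only guaranteed to stop for gauges that are bounded below, as the sample one is.

```python
def fine_tagged_partition(gauge, a, b, max_depth=60):
    """Return triples (x_left, tag, x_right) forming a gauge-fine tagged partition of [a, b].

    Bisection search: accept [u, v] as soon as one candidate tag t satisfies
    [u, v] inside (t - gauge(t), t + gauge(t)); otherwise split at the midpoint.
    """
    def is_fine(u, t, v):
        return t - gauge(t) < u and v < t + gauge(t)

    def build(u, v, depth):
        for t in (u, (u + v) / 2.0, v):        # candidate tags in [u, v]
            if is_fine(u, t, v):
                return [(u, t, v)]
        if depth == 0:
            raise RuntimeError("bisection depth exhausted for this gauge")
        mid = (u + v) / 2.0
        return build(u, mid, depth - 1) + build(mid, v, depth - 1)

    return build(a, b, max_depth)


def sample_gauge(t):
    """Hypothetical gauge: small near t = 1/2, larger away from it."""
    return 0.01 + abs(t - 0.5) / 2.0


pieces = fine_tagged_partition(sample_gauge, 0.0, 1.0)
print(len(pieces), pieces[0], pieces[-1])
```

A gauge that shrinks near a point forces short subintervals there, which is exactly the extra flexibility the Henstock-Kurzweil definition exploits.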
With the uniqueness of the Henstock-Kurzweil integral in hand, and the fact
that it extends the definition of the Riemann integral, we can without fear of confusion
denote the Henstock-Kurzweil integral by $\int_a^b f$ when $f\in\mathcal{HK}[a,b]$. It is now
possible to develop the standard properties of these integrals as in Theorem 43 and
the lemmas above for Riemann integrals. The proofs are typically very similar to
those commonly used for Riemann integration. One exception is the Fundamental
Theorem of Calculus for the Henstock-Kurzweil integral, which requires a more
complicated proof. In fact, it turns out that the theory of the Henstock-Kurzweil
integral is sufficiently rich to include the theory of the Lebesgue integral, which we
consider in detail in a later chapter. For further development of the theory of the
Henstock-Kurzweil integral we refer the reader to Bartle and Sherbert [1] and the
references given there.
2. Fundamental Theorem of Calculus
The operations of integration and differentiation are inverse to each other in a
certain sense which we make precise in this section. We consider only the Riemann
integral. Our first theorem proves a sense in which
\[
\text{Differentiation}\circ\text{Integration}=\text{Identity},
\]
and the second theorem proves a sense in which
\[
\text{Integration}\circ\text{Differentiation}=\text{Identity}.
\]
The second theorem is often called the Fundamental Theorem of Calculus, while
the two together are sometimes referred to in this way. As an application we derive
an integration by parts formula in the third theorem below.
Theorem 44. Suppose $f\in\mathcal{R}[a,b]$. Define
\[
F(x)=\int_a^x f(t)\,dt,\qquad\text{for }a\le x\le b.
\]
Then $F$ is continuous on $[a,b]$, and $F'(x)$ exists and equals $f(x)$
at every point $x\in[a,b]$ at which $f$ is continuous.
Proof: First we show that $F$ is continuous on $[a,b]$. Since $f$ is bounded there
is a positive $M$ such that $|f(x)|\le M$ for $a\le x\le b$. Then Lemma 22 yields
\[
|F(y)-F(x)|=\Big|\int_a^y f(t)\,dt-\int_a^x f(t)\,dt\Big|=\Big|\int_x^y f(t)\,dt\Big|,
\]
and if we apply Corollary 13 we obtain for $a\le x<y\le b$,
\[
|F(x)-F(y)|\le\int_x^y|f(t)|\,dt\le\int_x^y M\,dt=M(y-x)=M|x-y|.
\]
This easily gives the continuity of $F$ on $[a,b]$; in fact it implies the uniform continuity
of $F$ on $[a,b]$: $|F(x)-F(y)|<\varepsilon$ whenever $|x-y|<\delta=\frac{\varepsilon}{M}$.
Now suppose that $f$ is continuous at a fixed $x_0\in[a,b]$. Given $\varepsilon>0$ choose
$\delta>0$ so that
\[
|f(x)-f(x_0)|<\varepsilon\qquad\text{if }|x-x_0|<\delta\text{ and }x\in[a,b].
\]
Then if $t\in(x_0,x_0+\delta)\cap[a,b]$ we have
\[
\Big|\frac{F(t)-F(x_0)}{t-x_0}-f(x_0)\Big|=\Big|\frac{\int_{x_0}^{t}f(x)\,dx}{t-x_0}-f(x_0)\Big|
=\Big|\frac{1}{t-x_0}\int_{x_0}^{t}[f(x)-f(x_0)]\,dx\Big|
\le\frac{1}{t-x_0}\int_{x_0}^{t}|f(x)-f(x_0)|\,dx\le\varepsilon.
\]
Similarly if $t\in(x_0-\delta,x_0)\cap[a,b]$ we have
\[
\Big|\frac{F(x_0)-F(t)}{x_0-t}-f(x_0)\Big|\le\varepsilon.
\]
This proves that $\lim_{t\to x_0}\frac{F(x_0)-F(t)}{x_0-t}=f(x_0)$ as required.
Theorem 45. Suppose $f\in\mathcal{R}[a,b]$. If there is a continuous function $F$ on
$[a,b]$ that is differentiable on $(a,b)$ and satisfies
\[
F'(x)=f(x),\qquad x\in(a,b),
\]
then
\[
(2.1)\qquad\int_a^b f(x)\,dx=F(b)-F(a).
\]
Proof: Given $\varepsilon>0$ use the Cauchy criterion for integrals (0.4) to choose a
partition
\[
\mathcal{P}=\{a=x_0<x_1<\dots<x_N=b\}
\]
of $[a,b]$ satisfying
\[
U(f;\mathcal{P})-L(f;\mathcal{P})<\varepsilon.
\]
Now apply the second mean value Theorem 35 to $F$ on the subinterval $[x_{n-1},x_n]$
to obtain points $t_n\in(x_{n-1},x_n)$ such that
\[
F'(t_n)=\frac{F(x_n)-F(x_{n-1})}{x_n-x_{n-1}},
\]
so that
\[
F(x_n)-F(x_{n-1})=F'(t_n)\,\Delta x_n=f(t_n)\,\Delta x_n.
\]
Thus we have
\[
F(b)-F(a)=\sum_{n=1}^{N}\big(F(x_n)-F(x_{n-1})\big)=\sum_{n=1}^{N}f(t_n)\,\Delta x_n.
\]
But (1.2) implies that
\[
\Big|\int_a^b f-\sum_{n=1}^{N}f(t_n)\,\Delta x_n\Big|\le U(f;\mathcal{P})-L(f;\mathcal{P})<\varepsilon,
\]
and we conclude that
\[
\Big|F(b)-F(a)-\int_a^b f\Big|<\varepsilon
\]
for every $\varepsilon>0$, hence (2.1) holds.
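The telescoping step in this proof can be seen concretely. In the sketch below, which is an illustration under assumptions not taken from the notes, $F(x)=x^3$ and $f(x)=3x^2$ on $[0,2]$; for this $F$ the mean value point on $[x_{n-1},x_n]$ has the closed form $t_n=\sqrt{(x_n^2+x_nx_{n-1}+x_{n-1}^2)/3}$, so the tagged Riemann sum can be compared with $F(b)-F(a)$ directly.

```python
import math


def mvt_tag(x0, x1):
    """Mean value point for F(x) = x**3 on [x0, x1], assuming 0 <= x0 < x1.

    Solves 3 t**2 = (x1**3 - x0**3) / (x1 - x0) = x1**2 + x1*x0 + x0**2.
    """
    return math.sqrt((x1 * x1 + x1 * x0 + x0 * x0) / 3.0)


xs = [0.0, 0.1, 0.35, 0.6, 1.2, 2.0]          # an arbitrary partition of [0, 2]
tags = [mvt_tag(x0, x1) for x0, x1 in zip(xs, xs[1:])]

# Riemann sum of f(t) = 3 t**2 with the mean value tags.
riemann = sum(3.0 * t * t * (x1 - x0) for t, x0, x1 in zip(tags, xs, xs[1:]))
print(riemann, 2.0 ** 3 - 0.0 ** 3)
```

With these particular tags the Riemann sum equals $F(b)-F(a)$ for every partition, coarse or fine, up to rounding; the Cauchy criterion is what ties that fixed number to the integral.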
Theorem 46. (Integration by parts) Suppose that $F,G$ are differentiable functions
on $[a,b]$ with $F',G'\in\mathcal{R}[a,b]$. Then
\[
\int_a^b F'(x)G(x)\,dx+\int_a^b F(x)G'(x)\,dx=F(b)G(b)-F(a)G(a).
\]
Proof: By Proposition 15 the function $H(x)=F(x)G(x)$ has derivative
\[
H'(x)=F'(x)G(x)+F(x)G'(x),
\]
and by Lemma 17, Theorem 41 and Corollary 13 we have
\[
H'\in\mathcal{R}[a,b].
\]
Now we apply (2.1) to $H$ and $h=H'$ to obtain
\[
H(b)-H(a)=\int_a^b h=\int_a^b(F'G+FG')=\int_a^b F'G+\int_a^b FG'.
\]
3. Exercises
Exercise 43. Suppose $f:[a,b]\to[0,\infty)$ is continuous, and that $\int_a^b f(x)\,dx=0$.
Prove that $f(x)=0$ for all $x\in[a,b]$.
Exercise 44. Suppose $f:[0,1]\to[0,1]$ is continuous at every point
$x\in[0,1]\setminus E$ where $E$ is the Cantor set. Prove that $f\in\mathcal{R}[0,1]$.
Exercise 45. Let $p,q\in(1,\infty)$ satisfy $\frac{1}{p}+\frac{1}{q}=1$, i.e. $q=\frac{p}{p-1}$.
(1) Show that
\[
uv\le\frac{u^p}{p}+\frac{v^q}{q},\qquad\text{for }u,v\ge0,
\]
and show that equality holds if and only if $u^p=v^q$. Hint: Compute the
areas of the plane regions
\[
A=\{(s,t):\ 0\le s\le u,\ 0\le t\le s^{p-1}\},\qquad
B=\{(s,t):\ 0\le t\le v,\ 0\le s\le t^{q-1}\},
\]
and relate these regions to the rectangle $R=[0,u]\times[0,v]$. Draw a picture
to see what is going on here!
(2) Suppose that $f,g\in\mathcal{R}[a,b]$, that $f,g\ge0$, and that
\[
\int_a^b f^p=\int_a^b g^q=1.
\]
Prove that $\int_a^b fg\le1$.
(3) Show that if $f,g\in\mathcal{R}[a,b]$, then
\[
\Big|\int_a^b fg\Big|\le\Big(\int_a^b|f|^p\Big)^{1/p}\Big(\int_a^b|g|^q\Big)^{1/q}.
\]
CHAPTER 7
Function spaces
A very powerful abstract idea in analysis is to consider metric spaces whose
points consist of functions defined on yet another metric space. A prime example
is the metric space of functions $C_{\mathbb{R}}(X)$, which we now define. Suppose $X$ is a
compact metric space and let
\[
C_{\mathbb{R}}(X)=\{f:X\to\mathbb{R}:\ f\text{ is continuous}\}
\]
be the set of all continuous functions $f$ mapping $X$ into the real numbers $\mathbb{R}$. Clearly
$C_{\mathbb{R}}(X)$ is a real vector space with the usual notion of addition of functions and
scalar multiplication. However, we can also define a metric structure on $C_{\mathbb{R}}(X)$ as
follows. For $f,g\in C_{\mathbb{R}}(X)$, define
\[
(0.1)\qquad d(f,g)=d_{C_{\mathbb{R}}(X)}(f,g)=\sup_{x\in X}|f(x)-g(x)|.
\]
Since $f-g\in C_{\mathbb{R}}(X)$ is continuous on a compact set $X$, and the absolute value
function is continuous, it follows from Theorem 28 that the supremum defining
$d(f,g)$ is a finite nonnegative real number (and is even achieved as $|f(x)-g(x)|$
for some $x\in X$). Note that in the case where $X=[a,b]$ is a closed interval on the real line,
the quantity $d(f,g)$ is the largest vertical distance between points on the graphs
of $f$ and $g$. It is an easy exercise to verify that $d:C_{\mathbb{R}}(X)\times C_{\mathbb{R}}(X)\to[0,\infty)$ satisfies the
axioms of a metric. In particular, if $f,g,h\in C_{\mathbb{R}}(X)$, then
\[
d(f,h)=\sup_{x\in X}|f(x)-h(x)|=\sup_{x\in X}\big|[f(x)-g(x)]+[g(x)-h(x)]\big|
\le\sup_{x\in X}|f(x)-g(x)|+\sup_{x\in X}|g(x)-h(x)|=d(f,g)+d(g,h).
\]
Thus $(C_{\mathbb{R}}(X),d)$ is a metric space whose elements are continuous real-valued functions
on $X$. The single most important result of this chapter is that this particular
metric space is complete, i.e. every Cauchy sequence in $C_{\mathbb{R}}(X)$ converges. A crucial
role is played here by an investigation of limits of sequences in $C_{\mathbb{R}}(X)$, namely
limits of sequences of continuous functions on $X$.
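To get a feel for the sup metric when $X=[a,b]$, one can approximate the supremum by a maximum over a fine grid. This is only a sketch: the grid maximum is a lower estimate of the true supremum, and the function name `sup_distance`, the grid size and the three sample functions are assumptions chosen for the example.

```python
import math


def sup_distance(f, g, a, b, samples=10001):
    """Approximate d(f, g) = sup |f - g| on [a, b] by the maximum over a uniform grid."""
    pts = [a + (b - a) * k / (samples - 1) for k in range(samples)]
    return max(abs(f(x) - g(x)) for x in pts)


def identity(x):
    return x


d_fg = sup_distance(math.sin, math.cos, 0.0, math.pi)
d_gh = sup_distance(math.cos, identity, 0.0, math.pi)
d_fh = sup_distance(math.sin, identity, 0.0, math.pi)
print(d_fg, d_gh, d_fh)
```

The grid version obeys the same triangle inequality as $d$, for the same pointwise reason as in the displayed computation.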
1. Sequences and series of functions
We begin by examining more carefully the notion of convergence of a sequence
of functions in the metric space $C_{\mathbb{R}}(X)$. We start with a general definition of
uniform convergence.
Definition 34. Suppose $X$ and $Y$ are metric spaces and $E\subset X$. Suppose that
$\{f_n\}_{n=1}^{\infty}$ is a sequence of functions $f_n:E\to Y$ and that $f:E\to Y$. We say that
the sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f$ on $E$ if for every $\varepsilon>0$ there is a
positive integer $N$ such that
\[
(1.1)\qquad d_Y(f_n(x),f(x))\le\varepsilon\qquad\text{for all }n\ge N\text{ and all }x\in E.
\]
In this case we write $f_n\to f$ uniformly on $E$.
Note in particular that if $f_n\to f$ uniformly on $E$ then the sequence $\{f_n\}_{n=1}^{\infty}$
converges pointwise to $f$ on $E$, written $f_n\to f$ pointwise on $E$, by which we mean
\[
\lim_{n\to\infty}f_n(x)=f(x)
\]
for each $x\in E$. The point of uniform convergence of the sequence $\{f_n\}_{n=1}^{\infty}$ is that
there is a positive integer $N$ that depends only on $\varepsilon$ and not on $x\in E$, that works
in (1.1).
Example 9. Let $f_n:[0,1]\to[0,1]$ by $f_n(x)=x^n$. Let
\[
f(x)=\begin{cases}0&\text{if }0\le x<1,\\1&\text{if }x=1.\end{cases}
\]
Then $f_n\to f$ pointwise on $[0,1]$ but the convergence is not uniform. Indeed, for
any $n\ge1$ there is a point $x\in[0,1)$ such that
\[
|f_n(x)-f(x)|=|x^n-0|=x^n\ge\frac{1}{2}.
\]
This is because the monomial $x^n$ is continuous and so $\lim_{x\to1}x^n=1^n=1$.
An important feature of this example is that the functions $f_n$ are each continuous
on the set $[0,1]$ (which also happens to be compact), yet their pointwise limit is
not continuous on $[0,1]$. The next theorem shows that the reason can be attributed
to the failure of uniform convergence here.
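The point $x\in[0,1)$ in this example can be written down explicitly: $x=(1/2)^{1/n}$ satisfies $x^n=1/2$. The short sketch below, with the hypothetical helper `witness`, tabulates these points for a few values of $n$.

```python
def witness(n):
    """A point x in [0, 1) with x**n = 1/2, so |f_n(x) - f(x)| = 1/2 in this example."""
    return 0.5 ** (1.0 / n)


for n in (1, 10, 100, 1000):
    x = witness(n)
    print(n, x, x ** n)
```

The witnesses drift toward $1$ as $n$ grows, which is why no single $N$ can work for all $x$ at once.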
Theorem 47. Suppose that $X$ and $Y$ are metric spaces and $E\subset X$. Suppose
also that $\{f_n\}_{n=1}^{\infty}$ is a sequence of continuous functions from $E$ to $Y$ and that
$f:E\to Y$. If $f_n\to f$ uniformly on $E$, then $f$ is continuous on $E$.
Proof: Fix a point $y\in E$ and let $\varepsilon>0$. We must show that there is $\delta>0$
such that
\[
d_Y(f(x),f(y))<\varepsilon\qquad\text{whenever }d_X(x,y)<\delta\text{ and }x\in E.
\]
Since $f_n\to f$ uniformly on $E$ we can choose $N$ so large that (1.1) holds with $\frac{\varepsilon}{3}$ in
place of $\varepsilon$:
\[
(1.2)\qquad d_Y(f_n(x),f(x))\le\frac{\varepsilon}{3}\qquad\text{for all }n\ge N\text{ and all }x\in E.
\]
Now use the continuity of $f_N$ on $E$ at the point $y$ to find $\delta>0$ satisfying
\[
(1.3)\qquad d_Y(f_N(x),f_N(y))<\frac{\varepsilon}{3}\qquad\text{whenever }d_X(x,y)<\delta\text{ and }x\in E.
\]
Finally the triangle inequality yields
\[
d_Y(f(x),f(y))\le d_Y(f(x),f_N(x))+d_Y(f_N(x),f_N(y))+d_Y(f_N(y),f(y))
<\frac{\varepsilon}{3}+\frac{\varepsilon}{3}+\frac{\varepsilon}{3}=\varepsilon,
\]
whenever $d_X(x,y)<\delta$ and $x\in E$, upon applying (1.2) with $n=N$ to the first and
third terms on the right, and applying (1.3) to the middle term on the right.
2. The metric space $C_{\mathbb{R}}(X)$
We can now prove the main result of this chapter, namely that the metric space
$C_{\mathbb{R}}(X)$ is complete. Recall that $X$ is compact now. The connection with uniform
convergence is this: a sequence $\{f_n\}_{n=1}^{\infty}$ in $C_{\mathbb{R}}(X)$ converges to $f\in C_{\mathbb{R}}(X)$ in the
metric $d$ of $C_{\mathbb{R}}(X)$ given in (0.1), if and only if $f_n\to f$ uniformly on $X$. This is in
fact a definition chaser, as in the case $E=X$ and $Y=\mathbb{R}$, (1.1) says precisely that
\[
d(f_n,f)=\sup_{x\in X}|f_n(x)-f(x)|\le\varepsilon\qquad\text{for all }n\ge N.
\]
It follows immediately that $f_n\to f$ in $C_{\mathbb{R}}(X)$ if and only if $f_n\to f$ uniformly on $X$.
Theorem 48. Let $X$ be a compact metric space. Then the metric space $C_{\mathbb{R}}(X)$
is complete.
Proof: Let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $C_{\mathbb{R}}(X)$. We must show that
$\{f_n\}_{n=1}^{\infty}$ converges to some $f\in C_{\mathbb{R}}(X)$. Now for every $\varepsilon>0$ there is $N$ such that
\[
\sup_{x\in X}|f_n(x)-f_m(x)|=d(f_n,f_m)\le\varepsilon\qquad\text{for all }m,n\ge N.
\]
In particular for each $x\in X$ the sequence $\{f_n(x)\}_{n=1}^{\infty}$ is Cauchy in $\mathbb{R}$. Since the
real numbers $\mathbb{R}$ are complete, there is for each $x\in X$ a real number $f(x)$ such that
\[
\lim_{n\to\infty}f_n(x)=f(x).
\]
Moreover for $n\ge N$ and $x\in X$ we have
\[
|f_n(x)-f(x)|=\Big|f_n(x)-\lim_{m\to\infty}f_m(x)\Big|=\lim_{m\to\infty}|f_n(x)-f_m(x)|\le\lim_{m\to\infty}\varepsilon=\varepsilon.
\]
This shows that $f_n\to f$ uniformly on $X$. Now we apply Theorem 47 to conclude
that $f$ is continuous on $X$, i.e. $f\in C_{\mathbb{R}}(X)$. We've already noted that in the metric
space $C_{\mathbb{R}}(X)$, $f_n\to f$ in $C_{\mathbb{R}}(X)$ is equivalent to $f_n\to f$ uniformly on $X$. Thus
we've shown that $\{f_n\}_{n=1}^{\infty}$ converges to $f$ in $C_{\mathbb{R}}(X)$ as required.
Now that we know the metric space $C_{\mathbb{R}}(X)$ is complete we can apply the
Contraction Lemma 12 to $C_{\mathbb{R}}(X)$:
Lemma 23. Suppose that $T:C_{\mathbb{R}}(X)\to C_{\mathbb{R}}(X)$ is a strict contraction on
$C_{\mathbb{R}}(X)$, i.e. there is $0<r<1$ such that
\[
d(Tf,Tg)\le r\,d(f,g),\qquad\text{for all }f,g\in C_{\mathbb{R}}(X).
\]
Then $T$ has a unique fixed point $h$ in $C_{\mathbb{R}}(X)$, i.e. there is a unique $h\in C_{\mathbb{R}}(X)$
such that $Th=h$.
2.1. Existence and uniqueness of solutions to initial value problems.
We can use Lemma 23 in the case where $X$ is a closed bounded interval in $\mathbb{R}$ to give an
elegant proof of a standard existence and uniqueness theorem for solutions to the
nonlinear first order initial value problem
\[
(2.1)\qquad\begin{cases}y'=f(x,y)\\ y(x_0)=y_0\end{cases},\qquad a\le x\le b,
\]
where $a<x_0<b$, $y_0\in\mathbb{R}$ and $f:[a,b]\times\mathbb{R}\to\mathbb{R}$ is continuous and satisfies a
Lipschitz condition in the second variable. A function $f:[\alpha,\beta]\times\mathbb{R}\to\mathbb{R}$ is said to
satisfy a Lipschitz condition in the second variable if there is a positive number $L$
such that
\[
(2.2)\qquad|f(x,y)-f(x,y')|\le L|y-y'|,\qquad\text{for all }x\in[\alpha,\beta]\text{ and }y,y'\in\mathbb{R}.
\]
Definition 35. A differentiable function $y:[a,b]\to\mathbb{R}$ is defined to be a
solution to (2.1) if
\[
(2.3)\qquad y'(x)=f(x,y(x))\ \text{ for all }x\in[a,b],\qquad\text{and}\qquad y(x_0)=y_0.
\]
Theorem 49. Suppose that $\alpha<x_0<\beta$, $y_0\in\mathbb{R}$ and $f:[\alpha,\beta]\times\mathbb{R}\to\mathbb{R}$ is
continuous and satisfies the Lipschitz condition (2.2). Then there are $a,b$ satisfying
$\alpha\le a<x_0<b\le\beta$ such that there is a unique solution $y:[a,b]\to\mathbb{R}$ to the initial
value problem (2.1).
Proof: Our strategy is to first use the Fundamental Theorem of Calculus to
replace the initial value problem (2.1) with an equivalent integral equation (2.4).
Then we observe that a solution to the integral equation (2.4) is a fixed point of
a certain map $T:C_{\mathbb{R}}([\alpha,\beta])\to C_{\mathbb{R}}([\alpha,\beta])$. Then we will choose $a<x_0<b$
sufficiently close to $x_0$ that the map $T$ is a strict contraction when viewed as a
map on the metric space $C_{\mathbb{R}}([a,b])$. The existence of a unique solution to the
integral equation (2.4) then follows immediately from Lemma 23, and this proves
the theorem. Here are the details.
We claim that $y:[a,b]\to\mathbb{R}$ is differentiable and a solution to (2.1) if and only
if $y$ is continuous and satisfies the integral equation
\[
(2.4)\qquad y(x)=y_0+\int_{x_0}^{x}f(t,y(t))\,dt,\qquad a\le x\le b.
\]
This equivalence will use only the continuity of $f$ and not the Lipschitz condition
(2.2). Note that if $y$ is continuous, then the map $t\to(t,y(t))\in\mathbb{R}^2$ is continuous,
and hence so is the map $t\to f(t,y(t))\in\mathbb{R}$. Theorem 41 thus shows that the
integrals on the right side of (2.4) all exist when $y$ is continuous.
Suppose first that $y:[a,b]\to\mathbb{R}$ is a solution to (2.1). This means that $y'(t)$
exists on $[a,b]$ and satisfies (2.3). However, $y(t)$ is then also continuous and hence so
is $f(t,y(t))$ by the above comments. Thus (2.3) shows that $y'$ is actually continuous
on $[a,b]$, hence Riemann integrable on the interval between $x_0$ and $x$ for all $a\le x\le b$.
Now apply the Fundamental Theorem of Calculus (Theorem 45) to (2.3) to obtain
\[
y(x)-y(x_0)=\int_{x_0}^{x}y'(t)\,dt=\int_{x_0}^{x}f(t,y(t))\,dt,
\]
which is (2.4) since $y(x_0)=y_0$ by the second line in (2.3).
Conversely, suppose that $y:[a,b]\to\mathbb{R}$ is a continuous solution to (2.4). Then
the integrand $f(t,y(t))$ is continuous and by Theorem 44 we have
\[
y'(x)=\frac{d}{dx}\int_{x_0}^{x}f(t,y(t))\,dt=f(x,y(x)),\qquad a\le x\le b,
\]
which is the first line in (2.3). The second line in (2.3) is immediate upon setting
$x=x_0$ in (2.4).
Now we observe that $y$ is a solution to the integral equation (2.4) if and only
if $y\in C_{\mathbb{R}}([a,b])$ is a fixed point of the map $T:C_{\mathbb{R}}([a,b])\to C_{\mathbb{R}}([a,b])$ defined by
\[
T\varphi(x)=y_0+\int_{x_0}^{x}f(t,\varphi(t))\,dt,\qquad a\le x\le b,\quad\varphi\in C_{\mathbb{R}}([a,b]).
\]
Note that $T$ maps $C_{\mathbb{R}}([a,b])$ to itself since if $\varphi\in C_{\mathbb{R}}([a,b])$ then $f(t,\varphi(t))$ is
continuous on $[a,b]$ and Theorems 41 and 44 show that $T\varphi\in C_{\mathbb{R}}([a,b])$. In order
to apply Lemma 23 we will need to choose $a<x_0<b$ sufficiently close to $x_0$ that
the map $T$ is a strict contraction on $C_{\mathbb{R}}([a,b])$. We begin by estimating the distance
in $C_{\mathbb{R}}([a,b])$ between $T\varphi$ and $T\psi$ for any pair $\varphi,\psi\in C_{\mathbb{R}}([a,b])$:
\[
d_{C_{\mathbb{R}}([a,b])}(T\varphi,T\psi)=\sup_{a\le x\le b}|T\varphi(x)-T\psi(x)|
=\sup_{a\le x\le b}\Big|\int_{x_0}^{x}\big[f(t,\varphi(t))-f(t,\psi(t))\big]\,dt\Big|
\]
\[
\le\sup_{a\le x\le b}\Big|\int_{x_0}^{x}|f(t,\varphi(t))-f(t,\psi(t))|\,dt\Big|
\le\sup_{a\le x\le b}\Big|\int_{x_0}^{x}L\,|\varphi(t)-\psi(t)|\,dt\Big|,
\]
where the final line uses the Lipschitz condition (2.2). But with
\[
m=\max\{b-x_0,\ x_0-a\},
\]
we can dominate the final expression by
\[
L\sup_{a\le x\le b}\Big|\int_{x_0}^{x}|\varphi(t)-\psi(t)|\,dt\Big|
\le Lm\sup_{a\le t\le b}|\varphi(t)-\psi(t)|=Lm\,d_{C_{\mathbb{R}}([a,b])}(\varphi,\psi).
\]
Thus if we choose $a$ and $b$ so close to $x_0$ that $m<\frac{1}{L}$, then $r=Lm<1$ and we
have
\[
d_{C_{\mathbb{R}}([a,b])}(T\varphi,T\psi)\le r\,d_{C_{\mathbb{R}}([a,b])}(\varphi,\psi)
\]
for all $\varphi,\psi\in C_{\mathbb{R}}([a,b])$, which shows that $T$ is a strict contraction on $C_{\mathbb{R}}([a,b])$ since
$r<1$. Lemma 23 now shows that $T$ has a unique fixed point $y\in C_{\mathbb{R}}([a,b])$, and
by what we proved above, this function $y$ is the unique solution to the initial value
problem (2.1).
2.1.1. An example. Let $f:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ by $f(x,y)=y$. Then $f$ is continuous
and satisfies the Lipschitz condition (2.2) with $L=1$. Theorem 49 then yields a
unique solution $E:[a,b]\to\mathbb{R}$ to the initial value problem
\[
\begin{cases}y'=y\\ y(0)=1\end{cases},\qquad a\le x\le b,
\]
for some $a<0<b$. An examination of the proof of Theorem 49 shows that we
only need $a$ and $b$ to satisfy $m=\max\{b-0,\ 0-a\}<\frac{1}{L}=1$, so that we have a
unique solution $E_\lambda:[-\lambda,\lambda]\to\mathbb{R}$ for any $0<\lambda<1$. By uniqueness, all of these
solutions $E_\lambda$ coincide on common intervals of definition. Thus we have a function
$E:(-1,1)\to\mathbb{R}$ satisfying
\[
\begin{cases}E'=E\\ E(0)=1\end{cases},\qquad-1<x<1.
\]
But much more is true. If $-1<x_0<1$ and $0<\lambda<1$ then the above reasoning
shows that the initial value problem
\[
\begin{cases}y'=y\\ y(x_0)=E(x_0)\end{cases},\qquad-\lambda\le x-x_0\le\lambda,
\]
has a unique solution $F_\lambda:[x_0-\lambda,x_0+\lambda]\to\mathbb{R}$. But $F_\lambda(x_0)=E(x_0)$ and
so by uniqueness we must have $F_\lambda=E$ on their common interval of definition.
Repeating this type of argument it follows that there is a unique extension of $E$ to
a function $E$ defined on all of $\mathbb{R}$ that satisfies
\[
E'(x)=E(x),\quad x\in\mathbb{R},\qquad E(0)=1.
\]
Thus $E$ is infinitely differentiable with $E^{(n)}=E$, and is of course the exponential function
$\operatorname{Exp}(x)$ in (3.1), as can be easily seen using Taylor's formula, Theorem 37:
\[
E(x)=E(0)+E'(0)x+\dots+E^{(n)}(0)\frac{x^n}{n!}+E^{(n+1)}(c)\frac{x^{n+1}}{(n+1)!}
=1+x+\dots+\frac{x^n}{n!}+E^{(n+1)}(c)\frac{x^{n+1}}{(n+1)!},
\]
for some $c$ between $0$ and $x$. Indeed,
\[
\Big|E(c)\frac{x^{n+1}}{(n+1)!}\Big|\le\sup_{|c|\le|x|}|E(c)|\,\frac{|x|^{n+1}}{(n+1)!}\to0
\]
as $n\to\infty$, so that
\[
E(x)=1+x+\dots+\frac{x^n}{n!}+\dots=\sum_{n=0}^{\infty}\frac{x^n}{n!}=\operatorname{Exp}(x).
\]
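The fixed point in this example can also be approached by iterating the map $T\varphi(x)=1+\int_0^x\varphi(t)\,dt$ from the proof of Theorem 49. The sketch below is an illustration, not part of the notes: it starts from the constant function $1$, represents each iterate as a list of polynomial coefficients, and uses exact rational arithmetic. The function name `picard_step` is hypothetical.

```python
from fractions import Fraction


def picard_step(coeffs):
    """Apply (T phi)(x) = 1 + integral from 0 to x of phi(t) dt to a polynomial.

    coeffs[k] is the coefficient of x**k; integrating term by term shifts the
    coefficients up one degree and divides by the new exponent.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] = Fraction(1)      # the initial value y(0) = 1
    return integrated


phi = [Fraction(1)]                  # starting guess: the constant function 1
for _ in range(5):
    phi = picard_step(phi)
print(phi)
```

Each application of $T$ adds one more term of the exponential series, so the iterates are exactly the Taylor partial sums $1+x+\dots+x^n/n!$.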
Remark 20. In most applications it is not the case that $f : [\alpha, \beta] \times \mathbb{R} \to \mathbb{R}$ satisfies a Lipschitz condition for all $y, y' \in \mathbb{R}$ as in (2.2), but more likely that the Lipschitz condition is restricted to a finite interval $y, y' \in [\gamma, \delta]$, or even that $f$ is only defined on a bounded rectangle $[\alpha, \beta] \times [\gamma, \delta]$ with $y_0 \in (\gamma, \delta)$. Theorem 49 can still be profitably applied however if we simply redefine $f(x, y)$ to be constant in $y$ outside an interval $[\gamma, \delta]$ that contains $y_0$ in its interior (see (1.3) for the definition of interior). More precisely, set
\[
\tilde{f}(x, y) =
\begin{cases}
f(x, \gamma) & \text{if } \alpha \le x \le \beta, \ y \le \gamma, \\
f(x, y) & \text{if } \alpha \le x \le \beta, \ \gamma \le y \le \delta, \\
f(x, \delta) & \text{if } \alpha \le x \le \beta, \ \delta \le y.
\end{cases}
\]
Then if $f : [\alpha, \beta] \times [\gamma, \delta] \to \mathbb{R}$ is continuous and satisfies the local Lipschitz condition
\[
|f(x, y) - f(x, y')| \le L\,|y - y'|, \quad \text{for all } x \in [\alpha, \beta] \text{ and } y, y' \in [\gamma, \delta],
\]
the function $\tilde{f} : [\alpha, \beta] \times \mathbb{R} \to \mathbb{R}$ is continuous and satisfies the Lipschitz condition (2.2). Thus Theorem 49 produces $a < x_0 < b$ and a solution $y : [a, b] \to \mathbb{R}$ to the initial value problem
\[
\begin{cases} y' = \tilde{f}(x, y) \\ y(x_0) = y_0 \end{cases}, \qquad a \le x \le b.
\]
Since $y$ is continuous and
\[
(x_0, y(x_0)) = (x_0, y_0) \in (a, b) \times (\gamma, \delta),
\]
there exist $a \le A < x_0 < B \le b$ such that $(x, y(x)) \in [A, B] \times [\gamma, \delta]$ for $A \le x \le B$. But then $\tilde{f}(x, y(x)) = f(x, y(x))$ for such $x$ and we see from (2.3) that the restriction of $y$ to $[A, B]$ solves the initial value problem
\[
\begin{cases} y' = f(x, y) \\ y(x_0) = y_0 \end{cases}, \qquad A \le x \le B.
\]
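The extension used in Remark 20 amounts to clamping the second argument to the interval where the Lipschitz condition holds. A minimal sketch, in which the helper name `extend_constant` and the sample function $f(x, y) = y^2$ are illustrative assumptions rather than anything taken from the notes:

```python
def extend_constant(f, gamma, delta):
    """Return the function that equals f(x, y) for gamma <= y <= delta
    and is constant in y outside that interval (y is clamped)."""
    def f_tilde(x, y):
        return f(x, min(max(y, gamma), delta))
    return f_tilde

if __name__ == "__main__":
    # y**2 is Lipschitz with constant 2 on [-1, 1] but not on all of R.
    f_tilde = extend_constant(lambda x, y: y * y, -1.0, 1.0)
    print(f_tilde(0.0, 0.5), f_tilde(0.0, 5.0), f_tilde(0.0, -3.0))
```

Because clamping never increases the distance between two values of $y$, the extended function inherits the Lipschitz constant of the original on the interval.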
2.2. Space-filling curves and snowflake curves. We first use the completeness of $C_{\mathbb{R}}([0,1])$ to construct two continuous maps $\varphi, \psi \in C_{\mathbb{R}}([0,1])$ with the property that
\[
\{(\varphi(t), \psi(t)) : 0 \le t \le 1\} = [0,1] \times [0,1].
\]
Thus if we define $\gamma(t) = (\varphi(t), \psi(t))$ for $0 \le t \le 1$, then $\gamma : [0,1] \to [0,1]^2$ takes the closed unit interval continuously onto the closed unit square! This is the simplest example of a space-filling curve. Note that it is impossible for a space-filling curve to be one-to-one:
Lemma 24. If $\gamma : [0,1] \to [0,1]^2$ is both continuous and onto, then $\gamma$ is not one-to-one.
Proof: Suppose, in order to derive a contradiction, that $\gamma$ is continuous, one-to-one and onto. Since $[0,1]$ is compact, Corollary 11 then shows that the inverse map $\gamma^{-1} : [0,1]^2 \to [0,1]$ is continuous. Now consider the distinct points $P = \gamma(0)$ and $Q = \gamma(1)$ in the unit square. Pick any two continuous curves $\sigma_j(t) : [0,1] \to [0,1]^2$, $j = 1, 2$, for which
\begin{align*}
\sigma_1(0) &= \sigma_2(1) = P, \tag{2.5} \\
\sigma_1(1) &= \sigma_2(0) = Q, \\
\sigma_1(t) &\ne \sigma_2(t), \quad 0 \le t \le 1.
\end{align*}
Thus $\sigma_1$ takes $P$ to $Q$ continuously and $\sigma_2$ takes $Q$ to $P$ continuously, and the images $\sigma_1(t)$ and $\sigma_2(t)$ of the two curves in the square are distinct for each $t$.
Now consider the difference of the compositions of these two curves with the continuous map $\gamma^{-1}$:
\[
\eta(t) = \gamma^{-1}(\sigma_1(t)) - \gamma^{-1}(\sigma_2(t)), \quad 0 \le t \le 1.
\]
Thus $\eta : [0,1] \to [-1,1]$ is continuous and
\begin{align*}
\eta(0) &= \gamma^{-1}(P) - \gamma^{-1}(Q) = -1, \\
\eta(1) &= \gamma^{-1}(Q) - \gamma^{-1}(P) = 1.
\end{align*}
Since $0$ is an intermediate value, the Intermediate Value Theorem shows that there is $c \in (0,1)$ such that
\[
0 = \eta(c) = \gamma^{-1}(\sigma_1(c)) - \gamma^{-1}(\sigma_2(c)),
\]
which implies
\[
\sigma_1(c) = \gamma\left(\gamma^{-1}(\sigma_1(c))\right) = \gamma\left(\gamma^{-1}(\sigma_2(c))\right) = \sigma_2(c),
\]
contradicting the third line in (2.5).
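To construct our space-filling curve $\gamma(t) = (\varphi(t), \psi(t))$, we begin with a continuous function $f : \mathbb{R} \to [0,1]$ of period 2, i.e. $f(t+2) = f(t)$ for all $t \in \mathbb{R}$, that satisfies
\[
f(t) =
\begin{cases}
0 & \text{if } 0 \le t \le \frac{1}{3}, \\
1 & \text{if } \frac{2}{3} \le t \le 1.
\end{cases}
\]
Then for $N \in \mathbb{N}$ define
\[
\varphi_N(t) = \sum_{n=1}^{N} \frac{1}{2^n} f\left(3^{2n-1} t\right) \quad \text{and} \quad \psi_N(t) = \sum_{n=1}^{N} \frac{1}{2^n} f\left(3^{2n} t\right), \quad 0 \le t \le 1.
\]
Each of the sequences $\{\varphi_N\}_{N=1}^{\infty}$ and $\{\psi_N\}_{N=1}^{\infty}$ is Cauchy in the metric space $C_{\mathbb{R}}([0,1])$ since if $M < N$,
\begin{align*}
d(\varphi_M, \varphi_N) &= \sup_{0 \le t \le 1} |\varphi_M(t) - \varphi_N(t)| \\
&= \sup_{0 \le t \le 1} \left| \sum_{n=M+1}^{N} \frac{1}{2^n} f\left(3^{2n-1} t\right) \right| \le \sum_{n=M+1}^{N} \frac{1}{2^n} < \frac{1}{2^M}
\end{align*}
tends to $0$ as $M \to \infty$, and similarly $d(\psi_M, \psi_N) \to 0$ as $M \to \infty$. Since $C_{\mathbb{R}}([0,1])$ is complete, there are continuous functions $\varphi$ and $\psi$ on $[0,1]$ such that
\[
\varphi = \lim_{N \to \infty} \varphi_N \quad \text{and} \quad \psi = \lim_{N \to \infty} \psi_N \quad \text{in } C_{\mathbb{R}}([0,1]).
\]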
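Then $\gamma(t) = (\varphi(t), \psi(t))$, $0 \le t \le 1$, defines a continuous map from $[0,1]$ into the unit square $[0,1]^2$ since $0 \le \varphi(t), \psi(t) \le 1$ for $0 \le t \le 1$. We claim that given $(x_0, y_0) \in [0,1]^2$ there is $t_0 \in [0,1]$ such that $\gamma(t_0) = (x_0, y_0)$. To see this expand both $x_0$ and $y_0$ in binary series:
\[
x_0 = \sum_{n=1}^{\infty} a_{2n-1} \left(\frac{1}{2}\right)^n \quad \text{and} \quad y_0 = \sum_{n=1}^{\infty} a_{2n} \left(\frac{1}{2}\right)^n,
\]
where each coefficient $a_{2n-1}$ and $a_{2n}$ is either $0$ or $1$. Now set
\[
t_0 = \sum_{\ell=1}^{\infty} \frac{2}{3^{\ell+1}}\, a_\ell.
\]
For $k \in \mathbb{N}$ consider the number
\[
3^k t_0 = \sum_{\ell=1}^{k-1} 3^{k-\ell-1}\, 2a_\ell + \frac{2}{3} a_k + \sum_{\ell=k+1}^{\infty} 3^{k-\ell-1}\, 2a_\ell = A_k + \frac{2}{3} a_k + B_k.
\]
Now $A_k = \sum_{\ell=1}^{k-1} 3^{k-\ell-1}\, 2a_\ell$ is an even integer and
\[
0 \le B_k = \sum_{\ell=k+1}^{\infty} 3^{k-\ell-1}\, 2a_\ell \le \sum_{\ell=k+1}^{\infty} 3^{k-\ell-1}\, 2 = 2\left(\frac{1}{9} + \frac{1}{27} + \dots\right) = 2 \cdot \frac{1}{9} \cdot \frac{1}{1 - \frac{1}{3}} = \frac{1}{3}.
\]
It follows from the fact that $f$ has period 2 that
\[
f\left(3^k t_0\right) = f\left(A_k + \frac{2}{3} a_k + B_k\right) = f\left(\frac{2}{3} a_k + B_k\right),
\]
and then from the fact that $f$ is constant on $\left[0, \frac{1}{3}\right]$ and $\left[\frac{2}{3}, 1\right]$ that
\[
f\left(3^k t_0\right) = f\left(\frac{2}{3} a_k\right),
\]
and finally that
\[
f\left(3^k t_0\right) = a_k, \tag{2.6}
\]
since $f(0) = 0$ and $f\left(\frac{2}{3}\right) = 1$.
Armed with (2.6) we obtain
\begin{align*}
\varphi(t_0) &= \lim_{N \to \infty} \varphi_N(t_0) = \sum_{n=1}^{\infty} \frac{1}{2^n} f\left(3^{2n-1} t_0\right) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, a_{2n-1} = x_0, \\
\psi(t_0) &= \lim_{N \to \infty} \psi_N(t_0) = \sum_{n=1}^{\infty} \frac{1}{2^n} f\left(3^{2n} t_0\right) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, a_{2n} = y_0,
\end{align*}
which implies $\gamma(t_0) = (\varphi(t_0), \psi(t_0)) = (x_0, y_0)$, and completes the proof that $\gamma$ maps $[0,1]$ onto $[0,1]^2$.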
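The digit-interleaving argument can be tested in exact rational arithmetic. The sketch below makes one assumption beyond the text: on $\left[\frac{1}{3}, \frac{2}{3}\right]$ the function $f$ is taken to be linear, and it is extended to an even function of period 2 (any continuous choice would do). The names `f`, `t0_from_bits`, `phi_N` and `psi_N` are illustrative.

```python
from fractions import Fraction

def f(t):
    """A continuous function of period 2 with f = 0 on [0, 1/3] and f = 1 on [2/3, 1].

    Assumption: linear on [1/3, 2/3] and even, which the text leaves unspecified.
    """
    t = t % 2
    if t > 1:
        t = 2 - t                      # even reflection keeps f continuous
    if t <= Fraction(1, 3):
        return Fraction(0)
    if t >= Fraction(2, 3):
        return Fraction(1)
    return 3 * t - 1

def t0_from_bits(a):
    """t_0 = sum over l >= 1 of 2 a_l / 3^(l+1) for a finite list of bits a_1, a_2, ..."""
    return sum(Fraction(2 * a_l, 3 ** (l + 1)) for l, a_l in enumerate(a, start=1))

def phi_N(t, N):
    """Partial sum of the first coordinate: sum of f(3^(2n-1) t) / 2^n."""
    return sum(f(3 ** (2 * n - 1) * t) / 2 ** n for n in range(1, N + 1))

def psi_N(t, N):
    """Partial sum of the second coordinate: sum of f(3^(2n) t) / 2^n."""
    return sum(f(3 ** (2 * n) * t) / 2 ** n for n in range(1, N + 1))

if __name__ == "__main__":
    # x_0 = 0.101 (binary) = 5/8 and y_0 = 0.11 (binary) = 3/4, interleaved.
    bits = [1, 1, 0, 1, 1, 0]
    t0 = t0_from_bits(bits)
    print(t0, phi_N(t0, 3), psi_N(t0, 3))
```

Since all bits beyond the sixth are zero in this example, the partial sums with $N = 3$ already equal the limits, so the target point is hit exactly.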
Now we return to the von Koch snowflake $K$ constructed in Subsection 3.2 of Chapter 3. Recall that we constructed the snowflake in a sequence of steps that we called generations. At the $k^{th}$ generation, we had constructed a polygonal path consisting of $4^k$ closed segments $\left\{L_j^k\right\}_{j=1}^{4^k}$, each of length $\frac{1}{3^k}$. We denoted this polygonal snowflake-shaped path by $K_k$. We now parameterize this polygonal path $K_k$ with a constant speed parameterization on the unit interval $[0,1]$. Since the length of $K_k$ is
\[
\operatorname{length}(K_k) = 4^k \cdot \frac{1}{3^k} = \left(\frac{4}{3}\right)^k,
\]
this will result in a curve
\[
\gamma_k(t) = (\alpha_k(t), \beta_k(t)), \quad 0 \le t \le 1,
\]
that traces out the polygonal path $K_k$ in such a way that
\[
\left\| \left(\alpha_k'(t), \beta_k'(t)\right) \right\| = \sqrt{|\alpha_k'(t)|^2 + |\beta_k'(t)|^2} = \left(\frac{4}{3}\right)^k,
\]
at all $t$ except those corresponding to the vertices of $K_k$.
We now observe that the vertices of $K_k$ are precisely the points $\gamma_k\left(\frac{j}{4^k}\right)$, $0 \le j \le 4^k$, and moreover that
\[
\gamma_{k'}\left(\frac{j}{4^k}\right) = \gamma_k\left(\frac{j}{4^k}\right) \quad \text{whenever } k' \ge k.
\]
Thus the vertices in the constructions remain fixed once they appear, and are thereafter achieved by each $\gamma_{k'}$ with the same parameter value. In fact we can prove the following estimate for the difference between consecutive curves by induction:
\[
\left\| \gamma_{k+1}(t) - \gamma_k(t) \right\| \le \frac{1}{3^k}, \quad 0 \le t \le 1, \ k \ge 1.
\]
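As a consequence we see that each of the sequences $\{\alpha_k\}_{k=1}^{\infty}$ and $\{\beta_k\}_{k=1}^{\infty}$ of continuous functions on $[0,1]$ is a Cauchy sequence in the metric space $C_{\mathbb{R}}([0,1])$. Indeed, if $m < n$ then the triangle inequality gives
\begin{align*}
d(\alpha_m, \alpha_n) &\le \sum_{k=m}^{n-1} d(\alpha_k, \alpha_{k+1}) \le \sum_{k=m}^{n-1} \sup_{0 \le t \le 1} |\alpha_{k+1}(t) - \alpha_k(t)| \\
&\le \sum_{k=m}^{n-1} \sup_{0 \le t \le 1} \left\| \gamma_{k+1}(t) - \gamma_k(t) \right\| \le \sum_{k=m}^{\infty} \frac{1}{3^k} = \frac{1}{3^m} \cdot \frac{1}{1 - \frac{1}{3}},
\end{align*}
which tends to $0$ as $m \to \infty$, and similarly for $d(\beta_m, \beta_n)$. Thus there are continuous functions $\alpha, \beta \in C_{\mathbb{R}}([0,1])$ such that the curve
\[
\gamma(t) = \lim_{k \to \infty} \gamma_k(t) = \lim_{k \to \infty} (\alpha_k(t), \beta_k(t)) = \left( \lim_{k \to \infty} \alpha_k(t), \lim_{k \to \infty} \beta_k(t) \right) = (\alpha(t), \beta(t))
\]
maps onto the von Koch snowflake $K$.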
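The three facts used above (segment count and length, persistence of vertices, and the bound on consecutive curves) can be checked for small generations. This is a sketch under stated assumptions: points of the plane are represented as complex numbers, generation $0$ is taken to be the segment from $0$ to $1$, and the names `koch_vertices` and `polyline_point` are hypothetical.

```python
import math

ROT60 = complex(0.5, math.sqrt(3) / 2)   # multiplying by this rotates by 60 degrees

def koch_vertices(k):
    """Vertices of the k-th generation polygonal path, in order, as complex numbers."""
    pts = [0j, 1 + 0j]
    for _ in range(k):
        new = [pts[0]]
        for p, q in zip(pts, pts[1:]):
            d = (q - p) / 3
            # replace the segment pq by four segments with a 60 degree bump
            new += [p + d, p + d + d * ROT60, p + 2 * d, q]
        pts = new
    return pts

def polyline_point(pts, t):
    """Constant speed parameterization on [0, 1] of a path with equal-length segments."""
    n = len(pts) - 1
    s = t * n
    j = min(int(s), n - 1)
    return pts[j] + (s - j) * (pts[j + 1] - pts[j])

if __name__ == "__main__":
    k = 3
    cur, nxt = koch_vertices(k), koch_vertices(k + 1)
    length = sum(abs(q - p) for p, q in zip(cur, cur[1:]))
    # both paths are linear between multiples of 4^-(k+1), so checking those suffices
    gap = max(abs(polyline_point(nxt, j / 4 ** (k + 1)) - polyline_point(cur, j / 4 ** (k + 1)))
              for j in range(4 ** (k + 1) + 1))
    print(len(cur) - 1, length, gap, 3.0 ** -k)
```

The observed gap is attained at the tips of the new bumps and is well below $3^{-k}$, consistent with the inductive estimate.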
We now sketch a proof that $\gamma : [0,1] \to K$ is one-to-one, thus demonstrating that the fractal $K$ is a closed Jordan arc, namely a continuous one-to-one image of the closed unit interval $[0,1]$. Indeed, let $S_1$, $S_2$, $S_3$ and $S_4$ be the similarities characterizing $K$ in Theorem 13. These are given in the table in Subsection 3.2 of Chapter 3: for $x = (x_1, x_2) \in \mathbb{R}^2$,
\begin{align*}
S_1 x &= \frac{1}{3}\, x, \\
S_2 x &= \frac{1}{3} \left( M_2 x + (1, 0) \right), \\
S_3 x &= \frac{1}{3} \left( M_3 x + \left( \frac{3}{2}, \frac{\sqrt{3}}{2} \right) \right), \\
S_4 x &= \frac{1}{3} \left( x + (2, 0) \right).
\end{align*}
Now define $T$ to be the open triangle with vertices
\[
(0, 0), \quad \left( \frac{1}{2}, \frac{1}{2\sqrt{3}} \right), \quad (1, 0),
\]
and for $0 \le t \le 1$, expand $t$ in a series
\[
t = \sum_{n=1}^{\infty} a_n \left( \frac{1}{4} \right)^n, \quad a_n \in \{0, 1, 2, 3\}, \tag{2.7}
\]
where the sequence $\{a_n\}_{n=1}^{\infty}$ does not end in an infinite string of consecutive 3's, except for the case where all the $a_n$ are 3. With this restriction, the series representation (2.7) of $t \in [0,1]$ is unique.
One can now show (we leave this to the reader) that the intersection
\begin{align*}
&\bigcap_{j=1}^{\infty} S_{a_j+1}\left( \dots S_{a_2+1}\left( S_{a_1+1}\left( \overline{T} \right) \right) \right) \\
&\quad = S_{a_1+1}\left( \overline{T} \right) \cap S_{a_2+1}\left( S_{a_1+1}\left( \overline{T} \right) \right) \cap \dots \cap S_{a_j+1}\left( \dots S_{a_2+1}\left( S_{a_1+1}\left( \overline{T} \right) \right) \right) \cap \dots
\end{align*}
consists of exactly the single point $\gamma(t)$. Moreover:
- The four triangles $S_1(T)$, $S_2(T)$, $S_3(T)$, $S_4(T)$ are pairwise disjoint, as well as the four triangles
\[
S_1(S(T)), \quad S_2(S(T)), \quad S_3(S(T)), \quad S_4(S(T)),
\]
where $S$ is any finite composition of the similarities $S_1$, $S_2$, $S_3$ and $S_4$.
It now follows easily that $\gamma(t) \ne \gamma(t')$ for $t \ne t'$ upon expanding
\[
t = \sum_{n=1}^{\infty} a_n \left( \frac{1}{4} \right)^n \quad \text{and} \quad t' = \sum_{n=1}^{\infty} a_n' \left( \frac{1}{4} \right)^n
\]
as in (2.7) above, considering the smallest $n$ for which $a_n \ne a_n'$, and then applying the observation in the bullet item to $S_{a_n}(S(T))$ and $S_{a_n'}(S(T))$ where $S = S_{a_{n-1}+1} \circ \dots \circ S_{a_2+1} \circ S_{a_1+1}$. This shows that $S_{a_n}(S(T)) \cap S_{a_n'}(S(T)) = \emptyset$, and in order to obtain $S_{a_n}\left( S\left( \overline{T} \right) \right) \cap S_{a_n'}\left( S\left( \overline{T} \right) \right) = \emptyset$, we use the assumption that the coefficients in the series representation (2.7) do not end in an infinite string of consecutive 3's.
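Finally, we show that the curve $\gamma(t)$ is nowhere differentiable. For each $k$ there is $j$ such that
\[
\frac{j}{4^k} \le t < \frac{j+1}{4^k}.
\]
Let $A_k = \frac{j}{4^k}$ and $B_k = \frac{j+1}{4^k}$. Suppose, in order to derive a contradiction, that $\gamma'(t)$ exists. Then we would have
\[
\lim_{k \to \infty} \left\| \frac{\gamma(B_k) - \gamma(A_k)}{B_k - A_k} \right\| = \left\| \lim_{k \to \infty} \frac{\gamma(B_k) - \gamma(A_k)}{B_k - A_k} \right\| = \left\| \gamma'(t) \right\|. \tag{2.8}
\]
However, the length of the line segment $\gamma(B_k) - \gamma(A_k)$ is $\frac{1}{3^k}$, and $B_k - A_k = \frac{1}{4^k}$, so
\[
\left\| \frac{\gamma(B_k) - \gamma(A_k)}{B_k - A_k} \right\| = \frac{\frac{1}{3^k}}{\frac{1}{4^k}} = \left( \frac{4}{3} \right)^k,
\]
which tends to $\infty$ as $k \to \infty$, the desired contradiction.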
CHAPTER 8
Lebesgue measure theory
Recall that $f$ is Riemann integrable on $[0,1)$, written $f \in \mathcal{R}[0,1)$, if $U(f) = L(f)$, and we denote the common value by $\int_0^1 f$ or $\int_0^1 f(x)\,dx$. Here $U(f)$ and $L(f)$ are the upper and lower Riemann integrals of $f$ on $[0,1)$ respectively given by
\begin{align*}
U(f) &= \inf_{\mathcal{P}} \sum_{n=1}^{N} \left( \sup_{[x_{n-1}, x_n)} f \right) \Delta x_n, \\
L(f) &= \sup_{\mathcal{P}} \sum_{n=1}^{N} \left( \inf_{[x_{n-1}, x_n)} f \right) \Delta x_n,
\end{align*}
where $\mathcal{P} = \{0 = x_0 < x_1 < \dots < x_N = 1\}$ is any partition of $[0,1)$ and $\Delta x_n = x_n - x_{n-1} > 0$. For convenience we work with $[0,1)$ in place of $[0,1]$ for now.
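For a concrete feel for these definitions, the upper and lower sums over a uniform partition can be computed directly when $f$ is continuous and increasing, since then the supremum over $[x_{n-1}, x_n)$ is $f(x_n)$ and the infimum is $f(x_{n-1})$. The function name `upper_lower_sums` and the choice $f(x) = x^2$ are illustrative assumptions.

```python
def upper_lower_sums(f, N):
    """Upper and lower sums of a continuous increasing f on [0, 1)
    for the uniform partition x_n = n / N, n = 0, ..., N."""
    xs = [n / N for n in range(N + 1)]
    upper = sum(f(xs[n]) * (xs[n] - xs[n - 1]) for n in range(1, N + 1))
    lower = sum(f(xs[n - 1]) * (xs[n] - xs[n - 1]) for n in range(1, N + 1))
    return upper, lower

if __name__ == "__main__":
    for N in (10, 100, 1000):
        upper, lower = upper_lower_sums(lambda x: x * x, N)
        print(N, lower, upper, upper - lower)
```

For increasing $f$ the gap between the two sums on this partition telescopes to $(f(1) - f(0))/N$, so it can be made as small as desired.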
This definition is simple and easy to work with and applies in particular to bounded continuous functions $f$ on $[0,1)$ since it is not too hard to prove that $f \in \mathcal{R}[0,1)$ for such $f$. However, if we consider the vector space $L^2_{\mathcal{R}}([0,1))$ of Riemann integrable functions $f \in \mathcal{R}[0,1)$ endowed with the metric
\[
d(f, g) = \left( \int_0^1 |f(x) - g(x)|^2\,dx \right)^{\frac{1}{2}},
\]
it turns out that while $L^2_{\mathcal{R}}([0,1))$ can indeed be proved a metric space, it fails to be complete. This is a serious shortfall of Riemann's theory of integration, and is our main motivation for considering the more complicated theory of Lebesgue below. We note that the immediate reason for the lack of completeness of $L^2_{\mathcal{R}}([0,1))$ is the inability of Riemann's theory to handle general unbounded functions. However, even locally there are problems. For example, once we have Lebesgue's theory in hand, we can construct a famous example of a Lebesgue measurable subset $E$ of $[0,1)$ with the (somewhat surprising) property that
\[
0 < |E \cap (a, b)| < b - a, \quad 0 \le a < b \le 1,
\]
where $|F|$ denotes the Lebesgue measure of a measurable set $F$ (see Problem 5 below). It follows that the characteristic function $\chi_E$ is bounded and Lebesgue measurable, but that there is no Riemann integrable function $f$ such that $f = \chi_E$ almost everywhere, since such an $f$ would satisfy $U(f) = 1$ and $L(f) = 0$. Nevertheless, by Lusin's Theorem (see page 34 in [5] or page 55 in [4]) there is a sequence of compactly supported continuous functions (hence Riemann integrable) converging to $\chi_E$ almost everywhere.
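On the other hand, in Lebesgue's theory of integration, we partition the range $[0, M)$ of the function into a homogeneous partition,
\[
[0, M) = \bigcup_{n=1}^{N} \left[ (n-1)\frac{M}{N}, \, n\frac{M}{N} \right) = \bigcup_{n=1}^{N} I_n,
\]
and we consider the associated upper and lower Lebesgue sums of $f$ on $[0,1)$ defined by
\begin{align*}
U^*(f; \mathcal{P}) &= \sum_{n=1}^{N} \left( n\frac{M}{N} \right) \left| f^{-1}(I_n) \right|, \\
L^*(f; \mathcal{P}) &= \sum_{n=1}^{N} \left( (n-1)\frac{M}{N} \right) \left| f^{-1}(I_n) \right|,
\end{align*}
where of course
\[
f^{-1}(I_n) = \left\{ x \in [0,1) : f(x) \in I_n = \left[ (n-1)\frac{M}{N}, \, n\frac{M}{N} \right) \right\},
\]
and $|E|$ denotes the "measure" or "length" of the subset $E$ of $[0,1)$.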
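When the preimages $f^{-1}(I_n)$ happen to be intervals, their "length" is unambiguous and the Lebesgue sums can be computed directly. The sketch below assumes $f(x) = x^2$ on $[0,1)$ with $M = 1$, for which $f^{-1}([a, b)) = [\sqrt{a}, \sqrt{b})$; the names `lebesgue_sums` and `sq_preimage` are hypothetical.

```python
import math

def lebesgue_sums(preimage_measure, M, N):
    """Upper and lower Lebesgue sums for the homogeneous partition of [0, M) into N pieces.

    preimage_measure(a, b) must return the length of {x in [0, 1) : a <= f(x) < b}.
    """
    upper = lower = 0.0
    for n in range(1, N + 1):
        a, b = (n - 1) * M / N, n * M / N
        m = preimage_measure(a, b)
        upper += b * m
        lower += a * m
    return upper, lower

def sq_preimage(a, b):
    """Length of {x in [0, 1) : a <= x^2 < b} for 0 <= a < b <= 1."""
    return math.sqrt(b) - math.sqrt(a)

if __name__ == "__main__":
    for N in (10, 100, 1000):
        upper, lower = lebesgue_sums(sq_preimage, 1.0, N)
        print(N, lower, upper, upper - lower)
```

The difference of the two sums is $\frac{M}{N}$ times the total length of the preimages, here $\frac{1}{N}$, regardless of how irregular $f$ is; all of the difficulty is hidden in assigning a length to $f^{-1}(I_n)$.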
Here there will be no problem obtaining that $U^*(f; \mathcal{P}) - L^*(f; \mathcal{P})$ is small provided we can make sense of $\left| f^{-1}(I_n) \right|$. But this is precisely the difficulty with Lebesgue's approach: we need to define a notion of "measure" or "length" for subsets $E$ of $[0,1)$. That this is not going to be as easy as we might hope is evidenced by the following negative result. Let $\mathcal{P}([0,1))$ denote the power set of $[0,1)$, i.e. the set of all subsets of $[0,1)$. For $x \in [0,1)$ and $E \in \mathcal{P}([0,1))$ we define the translation $E \oplus x$ of $E$ by $x$ to be the set in $\mathcal{P}([0,1))$ defined by
\begin{align*}
E \oplus x &= E + x \pmod 1 \\
&= \{ z \in [0,1) : \text{there is } y \in E \text{ with } y + x - z \in \mathbb{Z} \}.
\end{align*}
Theorem 50. There is no map $\mu : \mathcal{P}([0,1)) \to [0, \infty)$ satisfying the following three properties:
(1) $\mu([0,1)) = 1$,
(2) $\mu\left( \dot{\bigcup}_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{P}([0,1))$,
(3) $\mu(E \oplus x) = \mu(E)$ for all $E \in \mathcal{P}([0,1))$.
Remark 21. All three of these properties are desirable for any notion of measure or length of subsets of $[0,1)$. The theorem suggests then that we should not demand that every subset of $[0,1)$ be "measurable". This will then restrict the functions $f$ that we can integrate to those for which $f^{-1}([a, b))$ is "measurable" for all $-\infty < a < b < \infty$.
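Proof: Let $\{r_n\}_{n=1}^{\infty} = \mathbb{Q} \cap [0,1)$ be an enumeration of the rational numbers in $[0,1)$. Define an equivalence relation on $[0,1)$ by declaring that $x \sim y$ if $x - y \in \mathbb{Q}$. Let $\mathcal{A}$ be the set of equivalence classes. Use the axiom of choice to pick a representative $a = \langle A \rangle$ from each equivalence class $A$ in $\mathcal{A}$. Finally, let $E = \{ \langle A \rangle : A \in \mathcal{A} \}$ be the set consisting of these representatives $a$, one from each equivalence class $A$ in $\mathcal{A}$.
Then we have
\[
[0,1) = \dot{\bigcup}_{n=1}^{\infty} E \oplus r_n.
\]
Indeed, if $x \in [0,1)$, then $x \in A$ for some $A \in \mathcal{A}$, and thus $x \sim a = \langle A \rangle$, i.e. $x - a \in \mathbb{Q}$. If $x \ge a$ then $x - a \in \mathbb{Q} \cap [0,1)$ and $x = a + r_n$ where $a \in E$ and $r_n \in \{r_n\}_{n=1}^{\infty}$. If $x < a$ then $x - a + 1 \in \mathbb{Q} \cap [0,1)$ and $x = a + (r_n - 1)$ where $a \in E$ and $r_n \in \{r_n\}_{n=1}^{\infty}$. In either case $x \in E \oplus r_n$. Finally, if $a \oplus r_n = b \oplus r_m$ with $a, b \in E$, then $a - b$ differs from $r_m - r_n$ by an integer, so $a - b \in \mathbb{Q}$, which implies that $a \sim b$, hence $a = b$, and then $r_n = r_m$.
Now by properties (1), (2) and (3) in succession we have
\[
1 = \mu([0,1)) = \mu\left( \dot{\bigcup}_{n=1}^{\infty} E \oplus r_n \right) = \sum_{n=1}^{\infty} \mu(E \oplus r_n) = \sum_{n=1}^{\infty} \mu(E),
\]
which is impossible since the infinite series $\sum_{n=1}^{\infty} \mu(E)$ is either $\infty$ if $\mu(E) > 0$ or $0$ if $\mu(E) = 0$.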
1. Lebesgue measure on the real line
In order to define a "measure" satisfying the three properties in Theorem 50, we must restrict the domain of definition of the set functional $\mu$ to a "suitable" proper subset of the power set $\mathcal{P}([0,1))$. A good notion of "suitable" is captured by the following definition where we expand our quest for measure to the entire real line.
Definition 36. A collection $\mathcal{A} \subset \mathcal{P}(\mathbb{R})$ of subsets of real numbers $\mathbb{R}$ is called a $\sigma$-algebra if the following properties are satisfied:
(1) $\emptyset \in \mathcal{A}$,
(2) $A^c \in \mathcal{A}$ whenever $A \in \mathcal{A}$,
(3) $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$ whenever $A_n \in \mathcal{A}$ for all $n$.
Here is the theorem asserting the existence of "Lebesgue measure" on the real line.
Theorem 51. There is a $\sigma$-algebra $\mathcal{L} \subset \mathcal{P}(\mathbb{R})$ and a function $\mu : \mathcal{L} \to [0, \infty]$ such that
(1) $[a, b) \in \mathcal{L}$ and $\mu([a, b)) = b - a$ for all $-\infty < a < b < \infty$,
(2) $\dot{\bigcup}_{n=1}^{\infty} E_n \in \mathcal{L}$ and $\mu\left( \dot{\bigcup}_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n)$ whenever $\{E_n\}_{n=1}^{\infty}$ is a pairwise disjoint sequence of sets in $\mathcal{L}$,
(3) $E + x \in \mathcal{L}$ and $\mu(E + x) = \mu(E)$ for all $E \in \mathcal{L}$,
(4) $F \in \mathcal{L}$ and $\mu(F) = 0$ whenever $F \subset E$ and $E \in \mathcal{L}$ with $\mu(E) = 0$.
It turns out that both the $\sigma$-algebra $\mathcal{L}$ and the function $\mu$ are uniquely determined by these four properties, but we will only need the existence of such $\mathcal{L}$ and $\mu$. The sets in the $\sigma$-algebra $\mathcal{L}$ are called Lebesgue measurable sets.
A pair $(\mathcal{L}, \mu)$ satisfying only property (2) is called a measure space. Property (1) says that the measure $\mu$ is an extension of the usual length function on intervals. Property (3) says that the measure is translation invariant, while property (4) says that the measure is complete.
From property (2) and the fact that $\mu$ is nonnegative, we easily obtain the following elementary consequences (where membership in $\mathcal{L}$ is implied by context):
\begin{align*}
&\emptyset \in \mathcal{L} \text{ and } \mu(\emptyset) = 0, \tag{1.1} \\
&E \in \mathcal{L} \text{ for every open set } E \text{ in } \mathbb{R}, \\
&\mu(I) = b - a \text{ for any interval } I \text{ with endpoints } a \text{ and } b, \\
&\mu(E) = \sup_n \mu(E_n) = \lim_{n \to \infty} \mu(E_n) \text{ if } E_n \nearrow E, \\
&\mu(E) = \inf_n \mu(E_n) = \lim_{n \to \infty} \mu(E_n) \text{ if } E_n \searrow E \text{ and } \mu(E_1) < \infty.
\end{align*}
For example, the fourth line follows from writing
\[
E = E_1 \,\dot{\cup}\, \left( \dot{\bigcup}_{n=1}^{\infty} E_{n+1} \cap (E_n)^c \right)
\]
and then using property (2) of $\mu$.
To prove Theorem 51 we follow the treatment in [5] with simplifications due to the fact that Theorem 31 implies the connected open subsets of the real numbers $\mathbb{R}$ are just the open intervals $(a, b)$. Define for any $E \in \mathcal{P}(\mathbb{R})$, the outer Lebesgue measure $\mu^*(E)$ of $E$ by
\[
\mu^*(E) = \inf \left\{ \sum_{n=1}^{\infty} (b_n - a_n) : E \subset \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n) \text{ and } -\infty \le a_n < b_n \le \infty \right\}.
\]
It is immediate that $\mu^*$ is monotone,
\[
\mu^*(E) \le \mu^*(F) \quad \text{if } E \subset F.
\]
A little less obvious is countable subadditivity of $\mu^*$.
Lemma 25. $\mu^*$ is countably subadditive:
\[
\mu^*\left( \bigcup_{n=1}^{\infty} E_n \right) \le \sum_{n=1}^{\infty} \mu^*(E_n), \quad \{E_n\}_{n=1}^{\infty} \subset \mathcal{P}(\mathbb{R}).
\]
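Proof: Given $0 < \varepsilon < 1$, we have $E_n \subset \dot{\bigcup}_{k=1}^{\infty} (a_{k,n}, b_{k,n})$ with
\[
\sum_{k=1}^{\infty} (b_{k,n} - a_{k,n}) < \mu^*(E_n) + \frac{\varepsilon}{2^n}, \quad n \ge 1.
\]
Now let
\[
\bigcup_{n=1}^{\infty} \left( \dot{\bigcup}_{k=1}^{\infty} (a_{k,n}, b_{k,n}) \right) = \dot{\bigcup}_{m=1}^{M^*} (c_m, d_m),
\]
where $M^* \in \mathbb{N} \cup \{\infty\}$. Then define disjoint sets of indices
\[
\mathcal{I}_m = \{ (k, n) : (a_{k,n}, b_{k,n}) \subset (c_m, d_m) \}.
\]
In the case $c_m, d_m \in \mathbb{R}$, we can choose by compactness a finite subset $\mathcal{F}_m$ of $\mathcal{I}_m$ such that
\[
\left[ c_m + \frac{\varepsilon}{2}\delta_m, \ d_m - \frac{\varepsilon}{2}\delta_m \right] \subset \bigcup_{(k,n) \in \mathcal{F}_m} (a_{k,n}, b_{k,n}), \tag{1.2}
\]
where $\delta_m = d_m - c_m$. Fix $m$ and arrange the left endpoints $\{a_{k,n}\}_{(k,n) \in \mathcal{F}_m}$ in strictly increasing order $\{a_i\}_{i=1}^{I}$ and denote the corresponding right endpoints by $b_i$ (if there is more than one interval $(a_i, b_i)$ with the same left endpoint $a_i$, discard all but one of the largest of them). From (1.2) it now follows that $a_{i+1} \in (a_i, b_i)$ for $i < I$ since otherwise $b_i$ would be in the left side of (1.2), but not in the right side, a contradiction. Thus $a_{i+1} - a_i \le b_i - a_i$ for $1 \le i < I$ and we have the inequality
\begin{align*}
(1 - \varepsilon)\,\delta_m &= \left( d_m - \frac{\varepsilon}{2}\delta_m \right) - \left( c_m + \frac{\varepsilon}{2}\delta_m \right) \\
&\le b_I - a_1 = (b_I - a_I) + \sum_{i=1}^{I-1} (a_{i+1} - a_i) \\
&\le \sum_{i=1}^{I} (b_i - a_i) = \sum_{(k,n) \in \mathcal{F}_m} (b_{k,n} - a_{k,n}) \\
&\le \sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n}).
\end{align*}
We also observe that a similar argument shows that $\sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n}) = \infty$ if $\delta_m = \infty$. Then we have
\begin{align*}
\mu^*\left( \bigcup_{n=1}^{\infty} E_n \right) &\le \sum_{m=1}^{M^*} \delta_m \le \frac{1}{1 - \varepsilon} \sum_{m=1}^{M^*} \sum_{(k,n) \in \mathcal{I}_m} (b_{k,n} - a_{k,n}) \\
&\le \frac{1}{1 - \varepsilon} \sum_{k,n} (b_{k,n} - a_{k,n}) = \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} (b_{k,n} - a_{k,n}) \\
&< \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \left( \mu^*(E_n) + \frac{\varepsilon}{2^n} \right) = \frac{1}{1 - \varepsilon} \sum_{n=1}^{\infty} \mu^*(E_n) + \frac{\varepsilon}{1 - \varepsilon}.
\end{align*}
Let $\varepsilon \to 0$ to obtain the countable subadditivity of $\mu^*$.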
Now define the subset $\mathcal{L}$ of $\mathcal{P}(\mathbb{R})$ to consist of all subsets $A$ of the real line such that for every $\varepsilon > 0$, there is an open set $G \supset A$ satisfying
\[
\mu^*(G \setminus A) < \varepsilon. \tag{1.3}
\]
Remark 22. Condition (1.3) says that $A$ can be well approximated from the outside by open sets. The most difficult task we will face below in using this definition of $\mathcal{L}$ is to prove that such sets $A$ can also be well approximated from the inside by closed sets.
Set
\[
\mu(A) = \mu^*(A), \quad A \in \mathcal{L}.
\]
Trivially, every open set and every interval is in $\mathcal{L}$. We will use the following two claims in the proof of Theorem 51.
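Claim 2. If $G$ is open and $G = \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$ (where $N^* \in \mathbb{N} \cup \{\infty\}$) is the decomposition of $G$ into its connected components $(a_n, b_n)$ (Proposition 14 of Chapter 5), then
\[
\mu(G) = \mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n).
\]
We first prove Claim 2 when $N^* < \infty$. If $G \subset \dot{\bigcup}_{m=1}^{\infty} (c_m, d_m)$, then for each $1 \le n \le N^*$, $(a_n, b_n) \subset (c_m, d_m)$ for some $m$ since $(a_n, b_n)$ is connected. If
\[
\mathcal{I}_m = \{ n : (a_n, b_n) \subset (c_m, d_m) \},
\]
it follows upon arranging the $a_n$ in increasing order that
\[
\sum_{n \in \mathcal{I}_m} (b_n - a_n) \le d_m - c_m,
\]
since the intervals $(a_n, b_n)$ are pairwise disjoint. We now conclude that
\begin{align*}
\mu^*(G) &= \inf \left\{ \sum_{m=1}^{\infty} (d_m - c_m) : G \subset \dot{\bigcup}_{m=1}^{\infty} (c_m, d_m) \right\} \\
&\ge \sum_{m=1}^{\infty} \sum_{n \in \mathcal{I}_m} (b_n - a_n) = \sum_{n=1}^{N^*} (b_n - a_n),
\end{align*}
and hence that $\mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n)$ by definition since $G \subset \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$.
Finally, if $N^* = \infty$, then from what we just proved and monotonicity, we have
\[
\mu^*(G) \ge \mu^*\left( \dot{\bigcup}_{n=1}^{N} (a_n, b_n) \right) = \sum_{n=1}^{N} (b_n - a_n)
\]
for each $1 \le N < \infty$. Taking the supremum over $N$ gives $\mu^*(G) \ge \sum_{n=1}^{\infty} (b_n - a_n)$, and then equality follows by definition since $G \subset \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n)$.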
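Claim 3. If $A$ and $B$ are disjoint compact subsets of $\mathbb{R}$, then
\[
\mu^*(A) + \mu^*(B) = \mu^*(A \cup B).
\]
First note that
\[
\delta = \operatorname{dist}(A, B) = \inf \{ |x - y| : x \in A, \ y \in B \} > 0,
\]
since the function $f(x, y) = |x - y|$ is positive and continuous on the closed and bounded (hence compact) subset $A \times B$ of the plane; Theorem 28 shows that $f$ achieves its infimum $\operatorname{dist}(A, B)$, which is thus positive. So we can find open sets $U$ and $V$ such that
\[
A \subset U \quad \text{and} \quad B \subset V \quad \text{and} \quad U \cap V = \emptyset.
\]
For example, $U = \bigcup_{x \in A} B\left( x, \frac{\delta}{2} \right)$ and $V = \bigcup_{x \in B} B\left( x, \frac{\delta}{2} \right)$ work. Now suppose that
\[
A \cup B \subset G = \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n).
\]
Then we have
\[
A \subset U \cap G = \dot{\bigcup}_{k=1}^{K} (e_k, f_k) \quad \text{and} \quad B \subset V \cap G = \dot{\bigcup}_{\ell=1}^{J} (g_\ell, h_\ell),
\]
and then from Claim 2 and monotonicity of $\mu^*$ we obtain
\begin{align*}
\mu^*(A) + \mu^*(B) &\le \sum_{k=1}^{K} (f_k - e_k) + \sum_{\ell=1}^{J} (h_\ell - g_\ell) \\
&= \mu^*\left( \left( \dot{\bigcup}_{k=1}^{K} (e_k, f_k) \right) \dot{\cup} \left( \dot{\bigcup}_{\ell=1}^{J} (g_\ell, h_\ell) \right) \right) \\
&\le \mu^*(G) = \sum_{n=1}^{\infty} (b_n - a_n).
\end{align*}
Taking the infimum over such $G$ gives $\mu^*(A) + \mu^*(B) \le \mu^*(A \cup B)$, and subadditivity of $\mu^*$ now proves equality.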
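Proof (of Theorem 51): We now prove that $\mathcal{L}$ is a $\sigma$-algebra and that $\mathcal{L}$ and $\mu$ satisfy the four properties in the statement of Theorem 51. First we establish that $\mathcal{L}$ is a $\sigma$-algebra in four steps.
Step 1: $A \in \mathcal{L}$ if $\mu^*(A) = 0$.
Given $\varepsilon > 0$, there is an open $G \supset A$ with $\mu^*(G) < \varepsilon$. But then $\mu^*(G \setminus A) \le \mu^*(G) < \varepsilon$ by monotonicity.
Step 2: $\bigcup_{n=1}^{\infty} A_n \in \mathcal{L}$ whenever $A_n \in \mathcal{L}$ for all $n$.
Given $\varepsilon > 0$, there is an open $G_n \supset A_n$ with $\mu^*(G_n \setminus A_n) < \frac{\varepsilon}{2^n}$. Then $A = \bigcup_{n=1}^{\infty} A_n$ is contained in the open set $G = \bigcup_{n=1}^{\infty} G_n$, and since $G \setminus A$ is contained in $\bigcup_{n=1}^{\infty} (G_n \setminus A_n)$, monotonicity and subadditivity of $\mu^*$ yield
\[
\mu^*(G \setminus A) \le \mu^*\left( \bigcup_{n=1}^{\infty} (G_n \setminus A_n) \right) \le \sum_{n=1}^{\infty} \mu^*(G_n \setminus A_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon.
\]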
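Step 3: $A \in \mathcal{L}$ if $A$ is closed.
Suppose first that $A$ is compact, and let $\varepsilon > 0$. Then using Claim 2 there is $G = \dot{\bigcup}_{n=1}^{N^*} (a_n, b_n)$ containing $A$ with
\[
\mu^*(G) = \sum_{n=1}^{N^*} (b_n - a_n) \le \mu^*(A) + \varepsilon < \infty.
\]
Now $G \setminus A$ is open and so $G \setminus A = \dot{\bigcup}_{m=1}^{M^*} (c_m, d_m)$ by Proposition 14. We want to show that $\mu^*(G \setminus A) \le \varepsilon$. Fix a finite $M \le M^*$ and
\[
0 < \eta < \frac{1}{2} \min_{1 \le m \le M} (d_m - c_m).
\]
Then the compact set
\[
K_\eta = \dot{\bigcup}_{m=1}^{M} [c_m + \eta, \ d_m - \eta]
\]
is disjoint from $A$, so by Claim 3 we have
\[
\mu^*(A) + \mu^*(K_\eta) = \mu^*(A \cup K_\eta).
\]
We conclude from subadditivity and $A \cup K_\eta \subset G$ that
\begin{align*}
\mu^*(A) + \sum_{m=1}^{M} (d_m - c_m - 2\eta) &= \mu^*(A) + \mu^*\left( \dot{\bigcup}_{m=1}^{M} (c_m + \eta, \ d_m - \eta) \right) \\
&\le \mu^*(A) + \mu^*(K_\eta) \\
&= \mu^*(A \cup K_\eta) \\
&\le \mu^*(G) \le \mu^*(A) + \varepsilon.
\end{align*}
Since $\mu^*(A) < \infty$ for $A$ compact, we thus have
\[
\sum_{m=1}^{M} (d_m - c_m) \le \varepsilon + 2M\eta
\]
for all $0 < \eta < \frac{1}{2} \min_{1 \le m \le M} (d_m - c_m)$. Hence $\sum_{m=1}^{M} (d_m - c_m) \le \varepsilon$ and taking the supremum in $M \le M^*$ we obtain from Claim 2 that
\[
\mu^*(G \setminus A) = \sum_{m=1}^{M^*} (d_m - c_m) \le \varepsilon.
\]
Finally, if $A$ is closed, it is a countable union of compact sets $A = \bigcup_{n=1}^{\infty} ([-n, n] \cap A)$, and hence $A \in \mathcal{L}$ by Step 2.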
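Step 4: $A^c \in \mathcal{L}$ if $A \in \mathcal{L}$.
For each $n \ge 1$ there is by Claim 2 an open set $G_n \supset A$ such that $\mu^*(G_n \setminus A) < \frac{1}{n}$. Then $F_n = G_n^c$ is closed and hence $F_n \in \mathcal{L}$ by Step 3. Thus
\[
S = \bigcup_{n=1}^{\infty} F_n \in \mathcal{L}, \quad S \subset A^c,
\]
and $A^c \setminus S \subset G_n \setminus A$ for all $n$ implies that
\[
\mu^*(A^c \setminus S) \le \mu^*(G_n \setminus A) < \frac{1}{n}, \quad n \ge 1.
\]
Thus $\mu^*(A^c \setminus S) = 0$ and by Step 1 we have $A^c \setminus S \in \mathcal{L}$. Finally, Step 2 shows that
\[
A^c = S \cup (A^c \setminus S) \in \mathcal{L}.
\]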
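Thus far we have shown that $\mathcal{L}$ is a $\sigma$-algebra, and we now turn to proving that $\mathcal{L}$ and $\mu$ satisfy the four properties in Theorem 51. Property (1) is an easy exercise. Property (2) is the main event. Let $\{E_n\}_{n=1}^{\infty}$ be a pairwise disjoint sequence of sets in $\mathcal{L}$, and let $E = \dot{\bigcup}_{n=1}^{\infty} E_n$.
We will consider first the case where each of the sets $E_n$ is bounded. Let $\varepsilon > 0$ be given. Then $E_n^c \in \mathcal{L}$ and so there are open sets $G_n \supset E_n^c$ such that
\[
\mu^*(G_n \setminus E_n^c) < \frac{\varepsilon}{2^n}, \quad n \ge 1.
\]
Equivalently, with $F_n = G_n^c$, we have $F_n$ closed, contained in $E_n$, and
\[
\mu^*(E_n \setminus F_n) < \frac{\varepsilon}{2^n}, \quad n \ge 1.
\]
Thus the sets $\{F_n\}_{n=1}^{\infty}$ are compact and pairwise disjoint. Claim 3 and induction show that
\[
\sum_{n=1}^{N} \mu^*(F_n) = \mu^*\left( \dot{\bigcup}_{n=1}^{N} F_n \right) \le \mu^*(E), \quad N \ge 1,
\]
and taking the supremum over $N$ yields
\[
\sum_{n=1}^{\infty} \mu^*(F_n) \le \mu^*(E).
\]
Thus we have
\begin{align*}
\sum_{n=1}^{\infty} \mu^*(E_n) &\le \sum_{n=1}^{\infty} \left\{ \mu^*(E_n \setminus F_n) + \mu^*(F_n) \right\} \\
&\le \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} + \sum_{n=1}^{\infty} \mu^*(F_n) \le \varepsilon + \mu^*(E).
\end{align*}
Since $\varepsilon > 0$ is arbitrary we conclude that $\sum_{n=1}^{\infty} \mu^*(E_n) \le \mu^*(E)$, and subadditivity of $\mu^*$ then proves equality.
In general, define $E_{n,k} = E_n \cap \{ (-k, -k+1] \cup [k-1, k) \}$ for $k, n \ge 1$, so that the sets $E_{n,k}$ are bounded, pairwise disjoint, and
\[
E = \dot{\bigcup}_{n=1}^{\infty} E_n = \dot{\bigcup}_{n,k=1}^{\infty} E_{n,k}.
\]
Then from what we just proved we have
\[
\mu^*(E) = \sum_{n,k=1}^{\infty} \mu^*(E_{n,k}) = \sum_{n=1}^{\infty} \left( \sum_{k=1}^{\infty} \mu^*(E_{n,k}) \right) = \sum_{n=1}^{\infty} \mu^*(E_n).
\]
Finally, property (3) follows from the observation that $E \subset \dot{\bigcup}_{n=1}^{\infty} (a_n, b_n)$ if and only if $E + x \subset \dot{\bigcup}_{n=1}^{\infty} (a_n + x, b_n + x)$. It is then obvious that $\mu^*(E + x) = \mu^*(E)$ and that $E + x \in \mathcal{L}$ if $E \in \mathcal{L}$. Property (4) is immediate from Step 1 above. This completes the proof of Theorem 51.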
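2. Measurable functions and integration
Let $[-\infty, \infty] = \mathbb{R} \cup \{-\infty, \infty\}$ be the extended real numbers with order and (some) algebra operations defined by
\begin{align*}
-\infty &< x < \infty, & x &\in \mathbb{R}, \\
x + \infty &= \infty, & x &\in \mathbb{R}, \\
x - \infty &= -\infty, & x &\in \mathbb{R}, \\
x \cdot \infty &= \infty, & x &> 0, \\
x \cdot \infty &= -\infty, & x &< 0, \\
0 \cdot \infty &= 0.
\end{align*}
The final assertion $0 \cdot \infty = 0$ is dictated by $\sum_{n=1}^{\infty} a_n = 0$ if all the $a_n = 0$. It turns out that these definitions give rise to a consistent theory of measure and integration of functions with values in the extended real number system.
Let $f : \mathbb{R} \to [-\infty, \infty]$. We say that $f$ is (Lebesgue) measurable if
\[
f^{-1}([-\infty, x)) \in \mathcal{L}, \quad x \in \mathbb{R}.
\]
The simplest examples of measurable functions are the characteristic functions $\chi_E$ of measurable sets $E$. Indeed,
\[
(\chi_E)^{-1}([-\infty, x)) =
\begin{cases}
\emptyset & \text{if } x \le 0, \\
E^c & \text{if } 0 < x \le 1, \\
\mathbb{R} & \text{if } x > 1.
\end{cases}
\]
It is then easy to see that finite linear combinations $s = \sum_{n=1}^{N} a_n \chi_{E_n}$ of such characteristic functions $\chi_{E_n}$, called simple functions, are also measurable. Here $a_n \in \mathbb{R}$ and $E_n$ is a measurable subset of $\mathbb{R}$. It turns out that if we define the integral of a simple function $s = \sum_{n=1}^{N} a_n \chi_{E_n}$ by
\[
\int_{\mathbb{R}} s = \sum_{n=1}^{N} a_n\, \mu(E_n),
\]
the value is independent of the representation of $s$ as a simple function. Armed with this fact we can then extend the definition of the integral $\int_{\mathbb{R}} f$ to functions $f$ that are nonnegative on $\mathbb{R}$, and then to functions $f$ such that $\int_{\mathbb{R}} |f| < \infty$.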
At each stage one establishes the relevant properties of the integral along with the most useful theorems. For the most part these extensions are rather routine, the cleverness inherent in the theory being in the overarching organization of the concepts rather than in the details of the demonstrations. As a result, we will merely state the main results in logical order and sketch proofs when not simply routine. We will however give fairly detailed proofs of the three famous convergence theorems: the Monotone Convergence Theorem, Fatou's Lemma, and the Dominated Convergence Theorem. The reader is referred to the excellent exposition in [5] for the complete story including many additional fascinating insights.
2.1. Properties of measurable functions. From now on we denote the Lebesgue measure of a measurable subset $E$ of $\mathbb{R}$ by $|E|$ rather than by $\mu(E)$ as in the previous sections. We say that two measurable functions $f, g : \mathbb{R} \to [-\infty, \infty]$ are equal almost everywhere (often abbreviated a.e.) if
\[
|\{ x \in \mathbb{R} : f(x) \ne g(x) \}| = 0.
\]
We say that $f$ is finite-valued if $f : \mathbb{R} \to \mathbb{R}$. We now collect a number of elementary properties of measurable functions.
Lemma 26. Suppose that $f, f_n, g : \mathbb{R} \to [-\infty, \infty]$ for $n \in \mathbb{N}$.
(1) If $f$ is finite-valued, then $f$ is measurable if and only if $f^{-1}(G) \in \mathcal{L}$ for all open sets $G \subset \mathbb{R}$ if and only if $f^{-1}(F) \in \mathcal{L}$ for all closed sets $F \subset \mathbb{R}$.
(2) If $f$ is finite-valued and continuous, then $f$ is measurable.
(3) If $f$ is finite-valued and measurable and $\Phi : \mathbb{R} \to \mathbb{R}$ is continuous, then $\Phi \circ f$ is measurable.
(4) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions, then the following functions are all measurable:
\[
\sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n \to \infty} f_n(x), \quad \liminf_{n \to \infty} f_n(x).
\]
(5) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions and $f(x) = \lim_{n \to \infty} f_n(x)$, then $f$ is measurable.
(6) If $f$ is measurable, so is $f^n$ for $n \in \mathbb{N}$.
(7) If $f$ and $g$ are finite-valued and measurable, then so are $f + g$ and $fg$.
(8) If $f$ is measurable and $f = g$ almost everywhere, then $g$ is measurable.
Comments: For property (1), first show that $f$ is measurable if and only if $f^{-1}((a, b)) \in \mathcal{L}$ for all $-\infty < a < b < \infty$. For property (3) use $(\Phi \circ f)^{-1}(G) = f^{-1}\left( \Phi^{-1}(G) \right)$ and note that $\Phi^{-1}(G)$ is open if $G$ is open. For property (7), use
\begin{align*}
\{ f + g > a \} &= \bigcup_{r \in \mathbb{Q}} \left[ \{ f > a - r \} \cap \{ g > r \} \right], \quad a \in \mathbb{R}, \\
fg &= \frac{1}{4} \left[ (f + g)^2 - (f - g)^2 \right].
\end{align*}
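Recall that a measurable simple function $\varphi$ (i.e. the range of $\varphi$ is finite) has the form
\[
\varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k}, \quad \alpha_k \in \mathbb{R}, \ E_k \in \mathcal{L}.
\]
Next we collect two approximation properties of simple functions.
Proposition 18. Let $f : \mathbb{R} \to [-\infty, \infty]$ be measurable.
(1) If $f$ is nonnegative there is an increasing sequence of nonnegative simple functions $\{\varphi_k\}_{k=1}^{\infty}$ that converges pointwise and monotonically to $f$:
\[
\varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x \in \mathbb{R}.
\]
(2) There is a sequence of simple functions $\{\varphi_k\}_{k=1}^{\infty}$ satisfying
\[
|\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x \in \mathbb{R}.
\]
Comments: To prove (1) let $f_M = \min\{f, M\}$, and for $0 \le n < NM$ define
\[
E_{n,N,M} = \left\{ x \in \mathbb{R} : \frac{n}{N} < f_M(x) \le \frac{n+1}{N} \right\}.
\]
Then $\varphi_k(x) = \sum_{n=1}^{2^k k} \frac{n}{2^k}\, \chi_{E_{n,2^k,k}}(x)$ works. Property (2) is routine given (1).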
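The construction in the comments depends only on the value $y = f(x)$, so it can be explored pointwise. The sketch below is an illustration under that reading: `phi_k(y, k)` is a hypothetical helper returning the value $\varphi_k(x)$ at any point where $f(x) = y \ge 0$, with exact rational arithmetic for finite $y$.

```python
import math
from fractions import Fraction

def phi_k(y, k):
    """Value of the k-th simple approximation at a point where f equals y >= 0.

    With v = min(y, k), this is n / 2^k for the integer n with
    n / 2^k < v <= (n + 1) / 2^k, and 0 when v = 0.
    """
    v = min(y, k)                      # the truncation f_M with M = k
    if v <= 0:
        return Fraction(0)
    n = math.ceil(v * 2 ** k) - 1
    return Fraction(n, 2 ** k)

if __name__ == "__main__":
    y = Fraction(22, 7)
    for k in range(1, 8):
        print(k, phi_k(y, k), float(y - phi_k(y, k)))
```

Once $k \ge f(x)$ the error is at most $2^{-k}$, and where $f(x) = \infty$ the values $k - 2^{-k}$ increase to $\infty$.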
2.2. Properties of integration and convergence theorems. If $\varphi$ is a measurable simple function (i.e. its range is a finite set), then $\varphi$ has a unique canonical representation
\[
\varphi = \sum_{k=1}^{N} \alpha_k \chi_{E_k},
\]
where the real constants $\alpha_k$ are distinct and nonzero, and the measurable sets $E_k$ are pairwise disjoint. We define the Lebesgue integral of $\varphi$ by
\[
\int \varphi(x)\,dx = \sum_{k=1}^{N} \alpha_k |E_k|.
\]
If $E$ is a measurable subset of $\mathbb{R}$ and $\varphi$ is a measurable simple function, then so is $\chi_E \varphi$, and we define
\[
\int_E \varphi(x)\,dx = \int (\chi_E \varphi)(x)\,dx.
\]
Lemma 27. Suppose that $\varphi$ and $\psi$ are measurable simple functions and that $E, F \in \mathcal{L}$.
(1) If $\varphi = \sum_{k=1}^{M} \beta_k \chi_{F_k}$ (not necessarily the canonical representation), then
\[
\int \varphi(x)\,dx = \sum_{k=1}^{M} \beta_k |F_k|.
\]
(2) $\int (a\varphi + b\psi) = a \int \varphi + b \int \psi$ for $a, b \in \mathbb{C}$,
(3) $\int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi$ if $E \cap F = \emptyset$,
(4) $\int \varphi \le \int \psi$ if $\varphi \le \psi$,
(5) $\left| \int \varphi \right| \le \int |\varphi|$.
Properties (2) - (5) are usually referred to as linearity, additivity, monotonicity and the triangle inequality respectively. The proofs are routine.
Now we turn to defining the integral of a nonnegative measurable function $f : \mathbb{R} \to [0, \infty]$. For such $f$ we define
\[
\int f(x)\,dx = \sup \left\{ \int \varphi(x)\,dx : 0 \le \varphi \le f \text{ and } \varphi \text{ is simple} \right\}.
\]
It is essential here that $f$ be permitted to take on the value $\infty$, and that the supremum may be $\infty$ as well. We say that $f$ is (Lebesgue) integrable if $\int f(x)\,dx < \infty$. For $E$ measurable define
\[
\int_E f(x)\,dx = \int (\chi_E f)(x)\,dx.
\]
Here is an analogue of Lemma 27 whose proof is again routine.
Lemma 28. Suppose that $f, g : \mathbb{R} \to [0, \infty]$ are nonnegative measurable functions and that $E, F \in \mathcal{L}$.
(1) $\int (af + bg) = a \int f + b \int g$ for $a, b \in (0, \infty)$,
(2) $\int_{E \cup F} f = \int_E f + \int_F f$ if $E \cap F = \emptyset$,
(3) $\int f \le \int g$ if $0 \le f \le g$,
(4) If $\int f < \infty$, then $f(x) < \infty$ for a.e. $x$,
(5) If $\int f = 0$, then $f(x) = 0$ for a.e. $x$.
Note that convergence of integrals does not always follow from pointwise convergence of the integrands. For example,
\[
\lim_{n \to \infty} \int \chi_{[n, n+1]}(x)\,dx = 1 \ne 0 = \int \lim_{n \to \infty} \chi_{[n, n+1]}(x)\,dx,
\]
and
\[
\lim_{n \to \infty} \int n \chi_{\left(0, \frac{1}{n}\right)}(x)\,dx = 1 \ne 0 = \int \lim_{n \to \infty} n \chi_{\left[0, \frac{1}{n}\right]}(x)\,dx.
\]
In each of these examples, the mass of the integrands "disappears" in the limit; at "infinity" in the first example and at the origin in the second example. Here are our first two classical convergence theorems giving conditions under which convergence does hold.
Theorem 52. (Monotone Convergence Theorem) Suppose that $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions, i.e. $f_n(x) \le f_{n+1}(x)$, and let
\[
f(x) = \sup_n f_n(x) = \lim_{n \to \infty} f_n(x).
\]
Then $f$ is nonnegative and measurable and
\[
\int f(x)\,dx = \lim_{n \to \infty} \int f_n(x)\,dx.
\]
Proof: Since $\int f_n \le \int f_{n+1}$ we have $\lim_{n \to \infty} \int f_n = L \in [0, \infty]$. Now $f$ is measurable and $f_n \le f$ implies $\int f_n \le \int f$ so that
\[
L \le \int f.
\]
To prove the opposite inequality, momentarily fix a simple function $\varphi$ such that $0 \le \varphi \le f$. Choose $c < 1$ and define
\[
E_n = \{ x \in \mathbb{R} : f_n(x) \ge c\,\varphi(x) \}, \quad n \ge 1.
\]
Then $E_n$ is an increasing sequence of measurable sets with $\bigcup_{n=1}^{\infty} E_n = \mathbb{R}$. We have
\[
\int f_n \ge \int_{E_n} f_n \ge c \int_{E_n} \varphi, \quad n \ge 1.
\]
Now let $\varphi = \sum_{k=1}^{N} \alpha_k \chi_{F_k}$ be the canonical representation of $\varphi$. Then
\[
\int_{E_n} \varphi = \sum_{k=1}^{N} \alpha_k |E_n \cap F_k|,
\]
and since $\lim_{n \to \infty} |E_n \cap F_k| = |F_k|$ by the fourth line in (1.1), we obtain that
\[
\int_{E_n} \varphi = \sum_{k=1}^{N} \alpha_k |E_n \cap F_k| \to \sum_{k=1}^{N} \alpha_k |F_k| = \int \varphi
\]
as $n \to \infty$. Altogether then we have
\[
L = \lim_{n \to \infty} \int f_n \ge c \int \varphi
\]
for all $c < 1$, which implies $L \ge \int \varphi$ for all simple $\varphi$ with $0 \le \varphi \le f$, which implies $L \ge \int f$ as required.
Corollary 14. Suppose that $a_k(x) \ge 0$ is measurable for $k \ge 1$. Then
\[
\int \sum_{k=1}^{\infty} a_k(x)\,dx = \sum_{k=1}^{\infty} \int a_k(x)\,dx.
\]
To prove the corollary apply the Monotone Convergence Theorem to the sequence of partial sums $f_n(x) = \sum_{k=1}^{n} a_k(x)$.
Lemma 29. (Fatou's Lemma) If $\{f_n\}_{n=1}^{\infty}$ is a sequence of nonnegative measurable functions, then
\[
\int \liminf_{n \to \infty} f_n(x)\,dx \le \liminf_{n \to \infty} \int f_n(x)\,dx.
\]
Proof: Let $g_n(x) = \inf_{k \ge n} f_k(x)$ so that $g_n \le f_n$ and $\int g_n \le \int f_n$. Then $\{g_n\}_{n=1}^{\infty}$ is an increasing sequence of nonnegative measurable functions that converges pointwise to $\liminf_{n \to \infty} f_n(x)$. So the Monotone Convergence Theorem yields
\[
\int \liminf_{n \to \infty} f_n(x)\,dx = \lim_{n \to \infty} \int g_n(x)\,dx \le \liminf_{n \to \infty} \int f_n(x)\,dx.
\]
Finally, we can give an unambiguous meaning to the integral $\int f(x)\,dx$ in the case when $f$ is integrable, by which we mean that $f$ is measurable and $\int |f(x)|\,dx < \infty$. To do this we introduce the positive and negative parts of $f$:
\[
f^+(x) = \max\{ f(x), 0 \} \quad \text{and} \quad f^-(x) = \max\{ -f(x), 0 \}.
\]
Then both $f^+$ and $f^-$ are nonnegative measurable functions with finite integral. We define
\[
\int f(x)\,dx = \int f^+(x)\,dx - \int f^-(x)\,dx.
\]
With this definition we have the usual elementary properties of linearity, additivity, monotonicity and the triangle inequality.
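Lemma 30. Suppose that $f$, $g$ are integrable and that $E$, $F$ are Lebesgue measurable sets.
(1) $\int (af + bg) = a \int f + b \int g$ for $a, b \in \mathbb{R}$,
(2) $\int_{E \cup F} f = \int_{E} f + \int_{F} f$ if $E \cap F = \emptyset$,
(3) $\int f \le \int g$ if $f \le g$,
(4) $\left| \int f \right| \le \int |f|$.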
Our final convergence theorem is one of the most useful in analysis.
Theorem 53. (Dominated Convergence Theorem) Let $g$ be a nonnegative integrable function. Suppose that $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions satisfying
\[ \lim_{n\to\infty} f_n(x) = f(x), \qquad \text{a.e. } x, \]
and
\[ |f_n(x)| \le g(x), \qquad \text{a.e. } x. \]
Then
\[ \lim_{n\to\infty} \int |f(x) - f_n(x)| \, dx = 0, \]
and hence
\[ \int f(x) \, dx = \lim_{n\to\infty} \int f_n(x) \, dx . \]
Proof: Since $|f| \le g$ and $f$ is measurable, $f$ is integrable. Since $|f - f_n| \le 2g$, Fatou's Lemma can be applied to the sequence of functions $2g - |f - f_n|$ to obtain
\[ \int 2g \le \liminf_{n\to\infty} \int \left( 2g - |f - f_n| \right) = \int 2g + \liminf_{n\to\infty} \left( - \int |f - f_n| \right) = \int 2g - \limsup_{n\to\infty} \int |f - f_n| . \]
Since $\int 2g < \infty$, we can subtract it from both sides to obtain
\[ \limsup_{n\to\infty} \int |f - f_n| \le 0, \]
which implies $\lim_{n\to\infty} \int |f - f_n| = 0$. Then $\int f = \lim_{n\to\infty} \int f_n$ follows from the triangle inequality $\left| \int (f - f_n) \right| \le \int |f - f_n|$.
Finally, if $f(x) = u(x) + i\,v(x)$ is complex-valued where $u(x)$ and $v(x)$ are real-valued measurable functions such that
\[ \int |f(x)| \, dx = \int \sqrt{ u(x)^2 + v(x)^2 } \, dx < \infty, \]
then we define
\[ \int f(x) \, dx = \int u(x) \, dx + i \int v(x) \, dx . \]
The usual properties of linearity, additivity and the triangle inequality all hold for this definition as well; monotonicity is meaningful only for real-valued functions.
2.3. Three famous measure problems. The following three problems are listed in order of increasing difficulty.
Problem 3. Suppose that $E_1, \dots, E_n$ are $n$ Lebesgue measurable subsets of $[0,1]$ such that each point $x$ in $[0,1]$ lies in some $k$ of these subsets. Prove that there is at least one set $E_j$ with $|E_j| \ge \frac{k}{n}$.
Problem 4. Suppose that $E$ is a Lebesgue measurable set of positive measure. Prove that
\[ E - E = \{ x - y : x, y \in E \} \]
contains a nontrivial open interval.
Problem 5. Construct a Lebesgue measurable subset $E$ of the real line such that
\[ 0 < \frac{|E \cap I|}{|I|} < 1 \]
for all nontrivial open intervals $I$.
To solve Problem 3, note that the hypothesis implies $k \le \sum_{j=1}^{n} \chi_{E_j}(x)$ for $x \in [0,1]$. Now integrate to obtain
\[ k = \int_0^1 k \, dx \le \int_0^1 \left( \sum_{j=1}^{n} \chi_{E_j}(x) \right) dx = \sum_{j=1}^{n} \int_0^1 \chi_{E_j}(x) \, dx = \sum_{j=1}^{n} |E_j| , \]
which implies that $|E_j| \ge \frac{k}{n}$ for some $j$. The solution is much less elegant without recourse to integration.
To solve Problem 4, choose $K$ compact contained in $E$ such that $|K| > 0$. Then choose $G$ open containing $K$ such that $|G \setminus K| < |K|$. Let $\delta = \operatorname{dist}(K, G^{c}) > 0$. It follows that $(-\delta, \delta) \subset K - K \subset E - E$. Indeed, if $x \in (-\delta, \delta)$ then $K - x \subset G$ and $K \cap (K - x) \ne \emptyset$ since otherwise we have a contradiction:
\[ 2|K| = |K| + |K - x| \le |G| \le |G \setminus K| + |K| < 2|K| . \]
Thus there are $k_1$ and $k_2$ in $K$ such that $k_1 = k_2 - x$ and so
\[ x = k_2 - k_1 \in K - K . \]
Problem 5 is most easily solved using generalized Cantor sets $E_\alpha$. Let $0 < \alpha \le 1$ and set $I^0_1 = [0,1]$. Remove the open interval of length $\frac{1}{3}\alpha$ centered in $I^0_1$ and denote the two remaining closed intervals by $I^1_1$ and $I^1_2$. Then remove the open interval of length $\frac{1}{3^2}\alpha$ centered in $I^1_1$ and denote the two remaining closed intervals by $I^2_1$ and $I^2_2$. Do the same for $I^1_2$ to obtain $I^2_3$ and $I^2_4$. Continuing in this way, we obtain at the $k$-th generation a collection $\{ I^k_j \}_{j=1}^{2^k}$ of $2^k$ pairwise disjoint closed intervals of equal length. Let
\[ E_\alpha = \bigcap_{k=1}^{\infty} \left( \bigcup_{j=1}^{2^k} I^k_j \right) . \]
Then by summing the lengths of the removed open intervals, we obtain
\[ \left| [0,1] \setminus E_\alpha \right| = \frac{1}{3}\alpha + \frac{2}{3^2}\alpha + \frac{2^2}{3^3}\alpha + \dots = \alpha , \]
and it follows that $E_\alpha$ is compact and has Lebesgue measure $1 - \alpha$. It is not hard to show that $E_\alpha$ is also nowhere dense. The case $\alpha = 1$ is particularly striking: $E_1$ is a compact, perfect and uncountable subset of $[0,1]$ having Lebesgue measure 0. This is the classical Cantor set introduced as a fractal in Subsection 3.1 of Chapter 3.
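The generalized Cantor construction can be checked with exact rational arithmetic. The sketch below is an illustration, not part of the original notes; the function names `cantor_generation` and `total_length` are hypothetical. It builds the $2^k$ closed intervals of generation $k$ by removing a centered open interval of length $\alpha / 3^{g}$ from each interval at step $g$, and compares the remaining length with the partial geometric sum $1 - \alpha\left(1 - (2/3)^k\right)$, which tends to $1 - \alpha$.

```python
from fractions import Fraction

def cantor_generation(alpha, k):
    """Return the 2**k closed intervals of generation k as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for gen in range(1, k + 1):
        # Length of the open interval removed from each interval at this step.
        gap = Fraction(alpha) / 3**gen
        next_intervals = []
        for left, right in intervals:
            mid = (left + right) / 2
            next_intervals.append((left, mid - gap / 2))
            next_intervals.append((mid + gap / 2, right))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum(right - left for left, right in intervals)

alpha = Fraction(1, 2)
for k in (1, 2, 5, 10):
    remaining = total_length(cantor_generation(alpha, k))
    predicted = 1 - alpha * (1 - Fraction(2, 3) ** k)
    print(k, remaining == predicted, float(remaining))
```

With $\alpha = \frac12$ the remaining length decreases toward $\frac12$; with $\alpha = 1$ it equals $(2/3)^k$, the familiar middle-thirds count.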
In order to construct the set $E$ in Problem 5, it suffices, by taking unions of translates by integers, to construct a subset $E$ of $[0,1]$ satisfying
\[ 0 < \frac{|E \cap I|}{|I|} < 1, \qquad \text{for all intervals } I \subset [0,1] \text{ of positive length.} \tag{2.1} \]
Fix $0 < \alpha_1 < 1$ and start by taking $E^1 = E_{\alpha_1}$. It is not hard to see that $\frac{|E^1 \cap I|}{|I|} < 1$ for all $I$, but the left hand inequality in (2.1) fails for $E = E^1$ whenever $I$ is a subset of one of the component intervals in the open complement $[0,1] \setminus E^1$. To remedy this fix $0 < \alpha_2 < 1$ and for each component interval $J$ of $[0,1] \setminus E^1$, translate and dilate $E_{\alpha_2}$ to fit snugly in the closure $\overline{J}$ of the component, and let $E^2$ be the union of $E^1$ and all these translates and dilates of $E_{\alpha_2}$. Then again, $\frac{|E^2 \cap I|}{|I|} < 1$ for all $I$ but the left hand inequality in (2.1) fails for $E = E^2$ whenever $I$ is a subset of one of the component intervals in the open complement $[0,1] \setminus E^2$. Continue this process indefinitely with a sequence of numbers $\{\alpha_n\}_{n=1}^{\infty} \subset (0,1)$. We claim that $E = \bigcup_{n=1}^{\infty} E^n$ satisfies (2.1) if and only if
\[ \sum_{n=1}^{\infty} (1 - \alpha_n) < \infty . \tag{2.2} \]
To see this, first note that no matter what sequence of numbers $\alpha_n$ less than one is used, we obtain that $0 < \frac{|E \cap I|}{|I|}$ for all intervals $I$ of positive length. Indeed, each set $E^n$ is easily seen to be compact and nowhere dense, and each component interval in the complement $[0,1] \setminus E^n$ has length at most
\[ \frac{\alpha_1}{3} \cdot \frac{\alpha_2}{3} \cdots \frac{\alpha_n}{3} \le 3^{-n} . \]
Thus given an interval $I$ of positive length, there is $n$ large enough such that $I$ will contain one of the component intervals $J$ of $[0,1] \setminus E^n$, and hence will contain the translated and dilated copy $\mathcal{C}(E_{\alpha_{n+1}})$ of $E_{\alpha_{n+1}}$ that is fitted into $J$ by construction. Since the dilation factor is the length $|J|$ of $J$, we have
\[ |E \cap I| \ge \left| \mathcal{C}(E_{\alpha_{n+1}}) \right| = |J| \, \left| E_{\alpha_{n+1}} \right| = |J| \, (1 - \alpha_{n+1}) > 0, \]
since $\alpha_{n+1} < 1$.
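It remains to show that $|E \cap I| < |I|$ for all intervals $I$ of positive length in $[0,1]$, and it is here that we must use (2.2). Indeed, fix $I$ and let $J$ be a component interval of $[0,1] \setminus E^n$ (with $n$ large) that is contained in $I$. Let $\mathcal{C}(E_{\alpha_{n+1}})$ be the translated and dilated copy of $E_{\alpha_{n+1}}$ that is fitted into $J$ by construction. We compute that
\[ \begin{aligned} |E \cap J| &= \left| \mathcal{C}(E_{\alpha_{n+1}}) \right| + (1 - \alpha_{n+2}) \left| J \setminus \mathcal{C}(E_{\alpha_{n+1}}) \right| + \dots \\ &= (1 - \alpha_{n+1}) |J| + (1 - \alpha_{n+2}) \left( 1 - (1 - \alpha_{n+1}) \right) |J| \\ &\quad + (1 - \alpha_{n+3}) \left( 1 - (1 - \alpha_{n+1}) - (1 - \alpha_{n+2}) \left( 1 - (1 - \alpha_{n+1}) \right) \right) |J| + \dots \\ &= \sum_{k=1}^{\infty} \beta^n_k \, |J| , \end{aligned} \]
where by induction,
\[ \beta^n_k = (1 - \alpha_{n+k}) \, \alpha_{n+k-1} \cdots \alpha_{n+1}, \qquad k \ge 1. \]
Then we have
\[ |E \cap J| = \left( \sum_{k=1}^{\infty} \beta^n_k \right) |J| < |J| , \]
and hence also $\frac{|E \cap I|}{|I|} < 1$, if we choose $\{\alpha_n\}_{n=1}^{\infty}$ so that $\sum_{k=1}^{\infty} \beta^n_k < 1$ for all $n$. Now we have
\[ \sum_{k=1}^{\infty} \beta^n_k = \sum_{k=1}^{\infty} (1 - \alpha_{n+k}) \, \alpha_{n+k-1} \cdots \alpha_{n+1} = 1 - \prod_{k=1}^{\infty} \alpha_{n+k} , \]
and by the first line in (2.3) below, this is strictly less than 1 if and only if $\sum_{k=1}^{\infty} (1 - \alpha_{n+k}) < \infty$ for all $n$. Thus the set $E$ constructed above satisfies (2.1) if and only if (2.2) holds.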
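The telescoping identity $\sum_{k=1}^{K} \beta^n_k = 1 - \prod_{k=1}^{K} \alpha_{n+k}$ behind the last display can be checked exactly for finite $K$. The sketch below is an illustration only: the helper name `beta_sum` and the two sample sequences (a summable choice $\alpha_m = 1 - 1/(m+1)^2$ and the constant choice $\alpha_m = \frac12$) are assumptions made for this example, not taken from the notes.

```python
from fractions import Fraction

def beta_sum(alphas, n, K):
    """Return (sum of beta_k^n for k = 1..K, alpha_{n+1} * ... * alpha_{n+K}).

    alphas[m - 1] holds alpha_m, so alpha_{n+k} is alphas[n + k - 1].
    """
    total = Fraction(0)
    prod = Fraction(1)  # running product alpha_{n+1} * ... * alpha_{n+k-1}
    for k in range(1, K + 1):
        a = alphas[n + k - 1]
        total += (1 - a) * prod  # this term is beta_k^n
        prod *= a
    return total, prod

# Summable case: sum of (1 - alpha_m) converges, so the product stays positive.
summable = [1 - Fraction(1, (m + 1) ** 2) for m in range(1, 61)]
total_s, prod_s = beta_sum(summable, n=0, K=50)

# Non-summable case: alpha_m = 1/2, so the product tends to 0 and the sum to 1.
constant = [Fraction(1, 2)] * 60
total_c, prod_c = beta_sum(constant, n=0, K=50)

print(total_s, prod_s)        # 25/51 26/51
print(float(total_c))         # very close to 1
```

In the summable case the partial sums stay bounded away from 1 (the infinite product here converges to $\frac12$), which is the situation required by (2.2); in the constant case the sums approach 1, so the upper inequality in (2.1) would fail.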
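2.3.1. Infinite products. If $0 \le u_n < 1$ and $0 \le v_n < \infty$ then
\[ \begin{aligned} \prod_{n=1}^{\infty} (1 - u_n) > 0 \quad &\text{if and only if} \quad \sum_{n=1}^{\infty} u_n < \infty, \\ \prod_{n=1}^{\infty} (1 + v_n) < \infty \quad &\text{if and only if} \quad \sum_{n=1}^{\infty} v_n < \infty. \end{aligned} \tag{2.3} \]
To see (2.3) we may assume $0 \le u_n, v_n \le \frac{1}{2}$, so that $e^{-u_n} \ge 1 - u_n \ge e^{-2u_n}$ and $e^{\frac{1}{2} v_n} \le 1 + v_n \le e^{v_n}$. For example, when $0 \le x \le \frac{1}{2}$, the alternating series estimate yields
\[ e^{-2x} \le 1 - 2x + \frac{(2x)^2}{2!} \le 1 - x, \]
while the geometric series estimate yields
\[ e^{\frac{1}{2} x} \le 1 + \left( \frac{1}{2} x \right) \left( 1 + x + x^2 + \dots \right) \le 1 + x . \]
Thus we have
\[ \begin{aligned} \exp\left( - \sum_{n=1}^{\infty} u_n \right) &\ge \prod_{n=1}^{\infty} (1 - u_n) \ge \exp\left( -2 \sum_{n=1}^{\infty} u_n \right), \\ \exp\left( \frac{1}{2} \sum_{n=1}^{\infty} v_n \right) &\le \prod_{n=1}^{\infty} (1 + v_n) \le \exp\left( \sum_{n=1}^{\infty} v_n \right). \end{aligned} \tag{2.4} \]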
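The two-sided bounds in (2.4) can be sanity-checked numerically on a finite truncation. The sketch below is illustrative only; the function names and the sample sequence $u_n = v_n = 1/(n+1)^2$ (which lies in $[0, \frac12]$) are assumptions made for this example, and floating-point arithmetic is used, so it demonstrates the inequalities rather than proving them.

```python
import math

def one_minus_bounds(u):
    """Return (exp(-2*sum u), prod(1 - u_n), exp(-sum u)) for a finite list u."""
    s = sum(u)
    p = math.prod(1 - x for x in u)
    return math.exp(-2 * s), p, math.exp(-s)

def one_plus_bounds(v):
    """Return (exp(sum v / 2), prod(1 + v_n), exp(sum v)) for a finite list v."""
    s = sum(v)
    p = math.prod(1 + x for x in v)
    return math.exp(s / 2), p, math.exp(s)

# Hypothetical sample sequence with every term in [0, 1/2].
terms = [1 / (n + 1) ** 2 for n in range(1, 2001)]

low_u, prod_u, high_u = one_minus_bounds(terms)
low_v, prod_v, high_v = one_plus_bounds(terms)

print(low_u, prod_u, high_u)
print(low_v, prod_v, high_v)
```

For this sequence the truncated product $\prod (1 - u_n)$ is close to $\frac12$ and sits between the two exponential bounds, as does $\prod (1 + v_n)$.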
Bibliography
[1] R. G. Bartle and D. R. Sherbert, Introduction to Real Analysis, John Wiley and Sons, Inc.
3rd edition, 2000.
[2] C. B. Boyer, A history of mathematics, John Wiley & Sons, Inc., 1968.
[3] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, 3rd edition, 1976.
[4] W. Rudin, Real and Complex Analysis, McGraw-Hill, 3rd edition, 1987.
[5] E. M. Stein and R. Shakarchi, Complex Analysis, Princeton Lectures in Analysis II, Prince-
ton University Press, Princeton and Oxford, 2003.
[6] E. M. Stein and R. Shakarchi, Real Analysis, Princeton Lectures in Analysis III, Princeton
University Press, Princeton and Oxford, 2003.
[7] S. Wagon, The Banach-Tarski Paradox, Cambridge University Press, 1985.