Jonathan M. Kane

Writing Proofs in Analysis

Jonathan M. Kane
Department of Mathematics
University of Wisconsin - Madison
Madison, WI, USA

ISBN 978-3-319-30965-1
ISBN 978-3-319-30967-5 (eBook)
DOI 10.1007/978-3-319-30967-5
Library of Congress Control Number: 2016936668
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

To the memory of Sylvan Burgstaller, Duane
E. Anderson, and especially James L. Nelson
who, at the University of Minnesota Duluth,
taught me the fundamentals of writing proofs
in analysis.

Acknowledgments

I wish to thank Natalya St. Clair for her excellent work creating the illustrations
appearing in this textbook. She took my crude sketches and vague ideas and turned
them into pleasing artwork and instructive diagrams. I also wish to thank Daniel M.
Kane, Alan Gluchoff, Thomas Drucker, and Walter Stromquist for their insightful
comments about the presentation, content, and correctness of the text.

Preface

After learning to solve many types of problems such as those found in the first
courses in Algebra, Geometry, Trigonometry, and Calculus, mathematics students
are usually exposed to a transition course where they are expected to write proofs
of various theorems. I taught such a course for a dozen years and was never satisfied
with the textbooks available for that course. Although such textbooks often teach
the fundamentals of logic (conditionals, biconditionals, negations, truth tables) and
give some common proof strategies such as mathematical induction, the textbooks
failed to teach what a student needs to be thinking about when trying to construct a
proof. Many of these books present a great number of well-written proofs and then
ask students to write proofs of similar statements in the hope that the students will
be able to mimic what they have seen. Some of these books are also designed to be
used as an introductory textbook in Analysis, Abstract Algebra, Topology, Number
Theory, or Discrete Mathematics, and, as such, they concentrate more on explaining
the fundamentals of those topic areas than on the fundamentals of writing good
proofs.
This Book Is Not Your Traditional Transition Textbook
The goal of this book
is to give the student precise training in the writing of proofs by explaining what
elements make up a correct proof, by teaching how to construct an acceptable proof,
by explaining what the student is supposed to be thinking about when trying to write
a proof, and by warning about pitfalls that result in incorrect proofs. In particular,
this book was written with the following directives:
• Unlike many transition books that do not give enough instruction about how
to write proofs, most of the proofs presented in this text are preceded by detailed
explanations describing the thought process one goes through when constructing
the proof. Then a good proof is given that incorporates the elements of that
discussion.
• For proofs that share the same general structure, such as the proof of
$\lim_{x \to a} f(x) = L$ for various functions, proof templates are provided that
give a generic approach to writing that type of proof.
• Many transition books begin with several chapters covering an introduction to
logic, set theory, cardinal numbers, and an axiomatic construction of the real
numbers. I find that students do not appreciate the details of these discussions
when these concepts are presented before they are needed to write a specific
proof. For example, truth tables are very helpful in verifying the truth of a
complex logical statement, but it is hard for students at that level to see the
connection between the truth value of a complex statement and the formation of
a proof. Therefore, I introduce many of these ideas as needed within the contexts
of writing Analysis proofs and have kept the introductory material to a minimum.
• Many books that propose to teach students to write proofs in Analysis get carried
away with covering those great topics in Analysis and cut back on the proof
writing instruction. The books may start out teaching about proofs, but after
a few chapters of introduction, they assume that the students now understand
everything they need to know about writing proofs, and the books concentrate
entirely on the concepts of Analysis. This book covers plenty of Analysis and
can be used as a textbook for a typical beginning Real Analysis course, but it
never loses sight of the fact that its primary focus is proof-writing skills.
Certainly, one can use this book for a beginning course in Real Analysis because
it thoroughly covers the standard theorems, but as a first course in proof writing,
it will succeed where others fail.
If the students using this book have already had a thorough background in
writing proofs, then this book could be used as a standard one-semester course
in Real Analysis. These students might begin in Sect. 2.5 and, depending on their
background, be expected to cover the material through Chaps. 6, 7, or 8. On the other
hand, if the students are using this book both as an introduction to proof writing and
an introduction to Analysis, then the textbook can be used for a two-semester course
in Real Analysis and proof writing. The first semester might aim to cover the first
five or six chapters, while the second semester aims to complete the book. For most
of the topics, it is important that the chapters be covered in their prescribed order.
Elements of later chapters do depend on the material covered in earlier chapters.
Madison, Wisconsin, USA
2016

Jonathan M. Kane

Contents
Acknowledgments
Preface
List of Figures
List of Proof Templates
1 What Are Proofs, and Why Do We Write Them?
1.1 What Is a Proof?
1.2 Why We Write Proofs
2 The Basics of Proofs
2.1 The Language of Proofs
2.1.1 Conditional Statements
2.1.2 Negation of a Statement
2.1.3 Proofs of Conditional Statements
2.1.4 Exercises
2.2 Template for Proofs
2.2.1 Exercises
2.3 Proofs About Sets
2.3.1 Set Notation
2.3.2 Exercises
2.3.3 Proofs About Subsets
2.3.4 Exercises
2.3.5 Proofs About Set Equality
2.3.6 Exercises
2.4 Proofs About Even and Odd Integers
2.4.1 Definitions of Even and Odd Integers
2.4.2 Proofs About Even and Odd Integers
2.4.3 Exercises
2.5 Basic Facts About Real Numbers
2.5.1 Ordered Fields
2.5.2 The Completeness Axiom and the Real Numbers
2.5.3 Absolute Value, the Triangle Inequality, and Intervals
2.5.4 Exercises
2.6 Functions
2.6.1 Function, Domain, Codomain
2.6.2 Surjection
2.6.3 Injection
2.6.4 Composition
2.6.5 Exercises
3 Limits
3.1 The Definition of Limit
3.1.1 Exercises
3.2 Proving $\lim_{x \to a} f(x) = L$
3.2.1 Exercises
3.3 One-Sided Limits
3.3.1 Exercises
3.4 Limits at Infinity
3.4.1 Exercises
3.5 Limit of a Sequence
3.5.1 Definition of Sequence
3.5.2 Arithmetic with Sequences
3.5.3 Monotone Sequences
3.5.4 Subsequences
3.5.5 Limit of a Sequence
3.5.6 Limits of Monotone Sequences and Mathematical Induction
3.5.7 Cauchy Sequences
3.5.8 Exercises
3.6 Proving That a Limit Does Not Exist
3.6.1 Why a Limit Might Not Exist
3.6.2 Quantifiers and Negations
3.6.3 Proving No Limit Exists
3.6.4 Exercises
3.7 Accumulation Points
3.7.1 Exercises
3.8 Infinite Limits
3.8.1 Exercises
3.9 The Arithmetic of Limits
3.9.1 Limit of a Sum
3.9.2 Limit of a Product
3.9.3 Limit of a Quotient
3.9.4 Limit of Rational Functions
3.9.5 Other Types of Limits
3.9.6 Exercises
3.10 Other Limit Theorems
3.10.1 The Limit of a Positive Function
3.10.2 Uniqueness of Limits
3.10.3 The Squeezing Theorem
3.10.4 Limits of Subsequences
3.10.5 Exercises
3.11 Liminf and Limsup
3.11.1 Exercises
4 Continuity
4.1 The Definition of Continuity
4.2 Proving the Continuity of a Function
4.2.1 Exercises
4.3 Uniform Continuity
4.3.1 Exercises
4.4 Compactness and the Heine–Borel Theorem
4.4.1 Open Covers and Subcovers
4.4.2 Proofs of the Heine–Borel Theorem
4.4.3 Uniform Continuity on Closed Bounded Intervals
4.4.4 Exercises
4.5 The Arithmetic of Continuous Functions
4.5.1 Exercises
4.6 Composition, Absolute Value, Maximum, and Minimum
4.6.1 Exercises
4.7 Other Continuity Theorems
4.7.1 Boundedness of Continuous Functions
4.7.2 Obtaining Extreme Values
4.7.3 The Intermediate Value Property
4.7.4 Exercises
4.8 Discontinuity
5 Derivatives
5.1 The Definition of Derivative
5.2 Differentiation and Continuity
5.3 Calculating Derivatives
5.4 The Arithmetic of Derivatives
5.4.1 Exercises
5.5 Chain Rule and Inverse Functions
5.6 Increasing Functions, Decreasing Functions, and Critical Points
5.6.1 Exercises
5.7 The Mean Value Theorem
5.7.1 Exercises
5.8 L'Hôpital's Rule
5.8.1 Exercises
5.9 Intermediate Value Property and Limits of Derivatives
6 Riemann Integrals
6.1 Area
6.2 Cardinality of Sets
6.2.1 Exercises
6.3 Measure Zero
6.3.1 Exercises
6.4 Areas in the Plane
6.4.1 Exercises
6.5 Definition of Riemann Integral
6.5.1 Exercises
6.6 Properties of Integrals
6.6.1 Exercises
6.7 Integrable Functions
6.7.1 Exercises
6.8 Step Functions
6.8.1 Exercises
6.9 Integrals of Continuous Functions
6.9.1 Exercises
6.10 Characterization of Integrable Functions
6.10.1 Exercises
7 Infinite Series
7.1 Convergence of Infinite Series
7.1.1 Exercises
7.2 Absolute and Conditional Convergence
7.3 The Arithmetic of Series
7.3.1 Exercises
7.4 Tests for Absolute Convergence
7.4.1 Comparison Test
7.4.2 Ratio Test
7.4.3 Root Test
7.4.4 Integral Test
7.4.5 Exercises
7.5 Alternating Series Test
7.5.1 Exercises
7.6 The Smallest Divergent Series
7.7 Rearrangement of Terms
7.7.1 Addition of Parentheses
7.7.2 Order of Terms
7.7.3 Exercises
7.8 Cauchy Products
7.8.1 Exercises
8 Sequences of Functions
8.1 Pointwise Convergence
8.1.1 Exercises
8.2 Uniform Convergence
8.2.1 Exercises
8.3 Monotone Convergence
8.4 Series of Functions
8.5 Power Series
8.5.1 Absolute Convergence
8.5.2 Interval of Convergence
8.5.3 Differentiability
8.5.4 Taylor's Theorem
8.5.5 Arithmetic of Power Series
8.5.6 Exercises
8.6 Fundamental Question of Analysis
9 Topology of the Real Line
9.1 Interior, Exterior, and Boundary
9.1.1 Exercises
9.2 Open and Closed Sets
9.2.1 Exercises
9.3 Unions and Intersections
9.3.1 Exercises
9.4 Continuous Functions Applied to Sets
9.4.1 Exercises
9.5 Closure
9.5.1 Exercises
9.6 Compactness
9.6.1 Exercises
9.7 Connectedness
9.7.1 Exercises
10 Metric Spaces
10.1 Definition of Metric Space
10.2 Inequalities
10.2.1 Cauchy–Schwarz Inequality
10.2.2 Minkowski Inequality
10.2.3 Exercises
10.3 Examples of Metric Spaces
10.3.1 Exercises
10.4 Topology of Metric Spaces
10.4.1 Exercises
10.5 Limits in Metric Spaces
10.5.1 Exercises
10.6 Continuous Functions on Metric Spaces
10.6.1 Exercises
10.7 Homeomorphism
10.7.1 Exercises
10.8 Connected Metric Spaces
10.8.1 Exercises
10.9 Compact Metric Spaces
10.9.1 Exercises
10.10 Complete Metric Spaces
10.10.1 Exercises
10.11 Contraction Mappings
10.11.1 Contraction Mapping Theorem
10.11.2 Picard's Theorem
10.11.3 Fractals
10.11.4 Exercises
Books for Further Reading
Index

List of Figures

Fig. 1.1 Dividing the disk with the chords from n points
Fig. 2.1 List of implications for $P \to Q$
Fig. 2.2 $(A \cup B)^c$ is equal to $A^c \cap B^c$
Fig. 2.3 Showing the least upper bound of $S$ is $s = \sqrt{r}$
Fig. 2.4 Triangle inequality
Fig. 2.5 Composition $(f \circ g)(x) = z$
Fig. 3.1 $\lim_{x \to a} f(x) = L$
Fig. 3.2 $\lim_{x \to a} f(x) = L$
Fig. 3.3 $\lim_{x \to a} f(x) = L$
Fig. 3.4 Graph of $f(x)$
Fig. 3.5 Approaching a limit as $x \to \infty$
Fig. 3.6 Proving bounded monotone sequences converge
Fig. 3.7 $f$ has no limit at $x = 2$
Fig. 3.8 Graph of $\sin \frac{1}{x}$
Fig. 3.9 Set with accumulation point $a$ and isolated point $b$
Fig. 3.10 Sequences approaching the lim sup and lim inf
Fig. 4.1 Continuity of a function
Fig. 4.2 A function equal to $2x$ for rational $x$ and $x + 1$ for irrational $x$
Fig. 4.3 $f(x) = \frac{1}{x}$ is not uniformly continuous
Fig. 4.4 Heine-Borel Theorem first proof
Fig. 4.5 Heine-Borel Theorem second proof
Fig. 4.6 $y$ and $z$ straddle one endpoint but remain in an interval of the open cover
Fig. 4.7 Proving that a continuous function on $[a, b]$ is bounded
Fig. 4.8 The maximum and minimum of a function $f(x)$ on an interval
Fig. 4.9 $f$ passing through each $y$ between $f(c)$ and $f(d)$
Fig. 4.10 A function with a jump discontinuity
Fig. 4.11 Graph of $\sin \frac{1}{x}$
Fig. 4.12 Graphs of $\mathrm{sgn}(x)$ and $\lfloor x \rfloor$
Fig. 4.13 Graphs of functions with discontinuities
Fig. 4.14 Graph of Thomae's function
Fig. 5.1 Slope of a Secant Line
Fig. 5.2 Tangent Line
Fig. 5.3 Restricting $\sin(x)$ to get $\sin^{-1}(x)$
Fig. 5.4 Graph showing maxima and minima
Fig. 5.5 The proof of Rolle's Theorem
Fig. 5.6 Point $c$ where the tangent line is parallel to the secant line
Fig. 5.7 $x^2 \sin \frac{1}{x^2}$ and its derivative
Fig. 6.1 The union of countably many countable sets is countable
Fig. 6.2 Determining $y$ using a diagonalization argument
Fig. 6.3 Construction of the Cantor set
Fig. 6.4 Covering a line segment with smaller and smaller squares
Fig. 6.5 An $8 \times 8$ grid of rectangles overlaying a triangle
Fig. 6.6 Approximating the area under a curve with narrowing rectangles
Fig. 6.7 The step function $s(x)$
Fig. 6.8 Choosing j on $(x_{j-1}, x_j)$
Fig. 6.9 The mean value theorem for integration
Fig. 7.1 Comparing the series with the integral in the Integral Test
Fig. 7.2 Converging to $\ln 2$ with an alternating series
Fig. 7.3 Rearranging terms to converge to $L$
Fig. 8.1 The sequence of functions $x^n$ converging to a discontinuous function
Fig. 8.2 The sequence of functions $|x|^{\frac{n+1}{n}}$ converging to the function $f(x) = |x|$
Fig. 8.3 A sequence of functions with integral 1 converging to the function $f(x) = 0$
Fig. 8.4 A sequence of functions converging uniformly
Fig. 8.5 If continuous function $f_n$ is close to $f$, then $f(x)$ is close to $f(a)$
Fig. 9.1 Interior, boundary, and exterior of a set
Fig. 9.2 $x$ in $\partial(\partial S)$, $y$ in $\partial S$
Fig. 9.3 An open set $S$, its boundary, and its complement $S^c$
Fig. 9.4 Showing boundaries of sets
Fig. 9.5 The union of open sets is an open set
Fig. 9.6 Mapping sets
Fig. 9.7 The closure of a set
Fig. 9.8 The sets $[0, 1]$ and $(4, 5)$ are disconnected
Fig. 9.9 The set $C$ is a connected set. The set $N$ is not a connected set.
Fig. 9.10 Graph of $\sin \frac{1}{x}$ with the $y$-axis
Fig. 10.1 Metric distances in the plane
Fig. 10.2 Euclidean distance in $\mathbb{R}^2$
Fig. 10.3 $N(0, 1)$ in the Euclidean, taxicab, and supremum metrics
Fig. 10.4 Some functions in $C[0, 1]$
Fig. 10.5 Proving that the union of open sets is open
Fig. 10.6 Limit of $f : X \to Y$ as $x$ approaches $a$ is $L$
Fig. 10.7 Limit of a sequence in a metric space
Fig. 10.8 A compact set of a metric space is closed and bounded
Fig. 10.9 Extrema of a continuous real-valued function on a compact set
Fig. 10.10 Continuous bijection on a compact metric space
Fig. 10.11 Enclosing a closed bounded set in a grid
Fig. 10.12 Contraction mapping
Fig. 10.13 The Mandelbrot set
Fig. 10.14 Stages in the forming of the Sierpinski triangle
Fig. 10.15 Generation of a fractal fern

List of Proof Templates

Proof
Proving $A \subseteq B$ for sets $A$ and $B$
Proving $A = B$ for sets $A$ and $B$
Proving a function $f$ is surjective
Proving a function $f$ is injective
Proving $\lim_{x \to a} f(x) = L$
Proving a result using mathematical induction
Proving $\lim_{x \to a} f(x)$ does not exist
Proving the function $f$ is continuous at the point $a$
Proving the function $f$ is uniformly continuous on the set $A$
Proving $<X, d>$ is a metric space

Chapter 1

What Are Proofs, and Why Do We Write Them?

1.1 What Is a Proof?


A statement in Mathematics is just a sentence which could be designated as true
or false. The sentences "$1 + 1 = 2$" and "$x = 4$ implies $x^2 = 16$" are true
statements while "All rational numbers are positive" and "There is a real number
$x$ such that $x^2 + 5 = 2$" are false statements. Some sentences like "Green is
nice" or "Authenticity runs hot" are too ambiguous, a matter of opinion, or are just
plain nonsense and cannot be said to be true or false, so mathematicians would
not consider them to be statements. Mathematicians have a lot of words for kinds
of statements including many that you have heard: definition, axiom, postulate,
principle, conjecture, lemma, proposition, law, theorem, contradiction, and others.
You are certainly familiar with the numbers you use for counting items:
$1, 2, 3, 4$, and so forth. Suppose you wish to investigate statements about these
numbers to see which statements hold true for all of these numbers. This is an
admirable mathematical pursuit, so how would you get started? Mathematicians
know from experience that if you want to begin an investigation, you had better start
with definitions; that is, you had better make some clear statements about the objects
you are about to study. There are examples of mathematicians running off
to study something without first making clear what it is they are studying, and later
running into problems because they have not been consistent about how they are
treating these new objects. This happened, for example, when people investigated
the concept of limit before a precise definition of limit was in place. OK, so perhaps
you make some statements about the numbers with which you want to work so that
you are confident that you understand the collection $1, 2, 3, 4, 5, \ldots$. What are you
going to be able to do with these numbers? If you only know the names of these
numbers and have a symbolic representation for each, there is not a great deal you
can do with them. Perhaps you could get a collection of blocks and paint one number
on each block. Then you could have fun rearranging these numbers just as you have
seen done by countless children.



J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_1


But more likely you are interested in investigating some properties of these
numbers having to do with their order or how they behave when operated on
by addition or multiplication. This, of course, would mean that you will need to
make clear statements about addition and multiplication operations and a "less than"
relationship, again, so that you do not run into problems later because you were
being ambiguous. So, you might write definitions of addition, multiplication, and
less than, and then make statements about how these operations behave such as a
Commutative Law of Addition ($m + n = n + m$), a Distributive Law of Multiplication
over Addition ($a(b + c) = ab + ac$), and an Order Property of Addition ($r < s$
implies $r + t < s + t$). These statements about how these defined quantities work
are called axioms, postulates, or principles. They are statements that you accept
as the guiding rules for how your mathematical objects behave and go beyond the
definitions to describe and make precise just what the definitions are talking about.
Once you have made definitions and laid out your axioms, you should have the
tools necessary to begin an investigation of other properties. Suppose that someone
looks at a few examples and notices that $1 + 9 = 10$ and 10 is 2 times another
number, 5. They then notice that $4 + 12 = 16$, $3 + 147 = 150$, and $1002 + 6 = 1008$,
and all of these results are also numbers equal to 2 times another number. This might
lead them to make the statement that "if you add two natural numbers together,
the result is always 2 times another number." Such a statement would be called a
conjecture, a statement whose truth has not yet been determined. Of course, you
know that this statement is false and came about because the investigator had not
yet considered enough examples. Once they stumble upon $5 + 8 = 13$ and notice
that 13 cannot be represented as 2 times another number, they will know that the
statement does not hold true in every case.
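
The way a single counterexample defeats this conjecture can be illustrated with a short search. The following is only a sketch: the function name `first_counterexample` and the search bound are choices made here for illustration, not anything taken from the text.

```python
from itertools import product

def first_counterexample(limit):
    """Return the first pair (m, n) with 1 <= m, n <= limit whose sum is odd.

    Returns None if every sum in the searched range is even.
    """
    for m, n in product(range(1, limit + 1), repeat=2):
        if (m + n) % 2 != 0:
            return (m, n)
    return None

# The four sums considered by the investigator are all even,
# which is what made the conjecture look plausible.
examples = [(1, 9), (4, 12), (3, 147), (1002, 6)]
all_even = all((m + n) % 2 == 0 for m, n in examples)

print(all_even)                  # True: these examples support the conjecture
print(first_counterexample(10))  # a pair whose sum is odd refutes it
```

A search like this can refute a conjecture, but finding no counterexample in a finite range would not establish that the conjecture is true.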
Other conjectures such as "for every natural number $a$, the number $a^2 - 3a + 12$
is a multiple of 2" hold up to more scrutiny. At some point in your investigation you
might see a convincing argument that this conjecture is, in fact, a true statement.
Such a convincing argument is what is called a proof. Once it is known that a
statement has a proof, it is known as a theorem, lemma, corollary, proposition,
or law. So, a proof of a statement in mathematics is a convincing argument that
establishes the truth of that statement.
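
For the conjecture that $a^2 - 3a + 12$ is always a multiple of 2, one possible convincing argument is sketched below. It is offered only as an illustration, not necessarily the argument the author has in mind, and it assumes the familiar fact that of two integers differing by an odd number, exactly one is even.

```latex
\begin{align*}
a^2 - 3a + 12 &= a(a - 3) + 12. \\
\intertext{Since $a$ and $a - 3$ differ by $3$, one of them is even, so $a(a - 3) = 2k$ for some integer $k$. Hence}
a^2 - 3a + 12 &= 2k + 12 = 2(k + 6),
\end{align*}
```

which is 2 times the integer $k + 6$.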
Some statements are very easily proved, and certainly mathematicians often set
up axioms in order to make particular statements easy to prove. At first this may
appear to be cheating or, at best, unproductive and uninteresting because it seems
to defeat the purpose of establishing truth by dictating rules that make it trivial to
establish the truth. But this is certainly not the case. It is common for mathematicians
to have an intuitive idea about how a system should work before they feel that they
understand it enough to set down formal definitions and axioms. Perhaps you wanted
addition of all natural numbers $a$ and $b$ to satisfy $a + b = b + a$. Then it would make
sense to include this rule among your axioms. The axioms are written with the idea
of establishing enough structure so that the statements the mathematicians want to
hold true can easily be proved. The richness of mathematics is that after assuring
that the obvious can be proved from the axioms, there are many more results that
can be proved that are not immediately obvious from the definitions and axioms,
statements which might never have been apparent to those who set up the system in
the first place. For example, Fermat's Last Theorem (there are no natural numbers $a$,
$b$, $c$, and $n > 2$ such that $a^n + b^n = c^n$) is a statement about natural numbers which
could only be conjectured after investigating a large number of examples, and stood
as a conjecture for hundreds of years before a proof was provided.
Occasionally, it is shown that a conjecture is independent of the axioms; that
is, neither the truth nor the falseness of the statement follows from the axioms. Two
famous examples are the statements about sets known as the Axiom of Choice and
the Continuum Hypothesis which have been shown to be independent of the original
axioms of Zermelo-Fraenkel Set Theory. The independence of such statements
suggests that the axiom system is not rich enough in structure to establish the truth
of these statements, and that if one chose to do so, those statements could be added
to the list of axioms for the system. The Axiom of Choice or something equivalent
to it, for example, is now usually listed along with the Zermelo-Fraenkel axioms.
One certainly hopes that it is not possible to prove two contradictory statements
about objects in a system. Such an occurrence would say that the axioms of the
system were inconsistent, and this would require the axioms to be changed. After
the original ground rules for Set Theory were established by Georg Cantor in the
1870s and 1880s, Bertrand Russell pointed out in 1901 a paradox (contradiction)
that is a consequence of those rules. Now commonly known as Russell's paradox,
it stimulated a flurry of activity which resulted in the young field of Set Theory
being put on a firm foundation (we hope) with the creation and adoption of the
Zermelo-Fraenkel axioms.
The language of a proof can vary depending on who is writing the proof and
who is the intended reader. In other words, what makes a convincing argument may
well depend on who it is that needs to be convinced. For example, if two experts
in Functional Analysis are speaking to each other, one might prove a statement by
saying "Oh, that's just a consequence of the Hahn-Banach Theorem." That proof
might be sufficient since it completely describes the reasoning behind the statement
in question due to the shared knowledge of the two experts. On the other hand, if
one of these experts were speaking to a beginning mathematics graduate student,
the proof would need to include far more detail in order for it to be a convincing
argument. If the expert were speaking to a high school student, the proof might
need to be a complete book that both introduces the needed concepts and explains
many results needed to understand the proof.
It is important to understand that there is a difference between knowing why
a statement is true and knowing how to write a good proof of the statement. It is
quite possible to learn a great deal of mathematics, to be able to solve many types
of mathematical problems, and to understand why particular properties must hold
without being able to write coherent proofs of these properties. The situation is
analogous to that of a police detective who has gathered enough evidence to be
convinced which of the many suspects committed a particular crime; it is quite
another thing to have the criminal successfully prosecuted in a court of law, resulting
in the criminal's conviction and eventual punishment for the crime. A student in Analysis needs to
learn many strategies that can be brought to bear when writing proofs. Some of these
strategies are methods or tricks that enter a student's "bag of tricks" which can be
employed later when solving problems or writing proofs. A student of proof writing
needs to learn how to take those strategies and turn them into coherent proofs where
the ideas are presented in a logical order, fill in all necessary details, and make clear
to the proof reader exactly why the chosen strategies justify the needed result.
This book talks about how you should go about writing proofs of the kinds
of statements typically found in the branch of Mathematics called Analysis. The
branches of mathematics are not precisely defined. After a new branch arises, some
mathematicians begin to combine ideas from older branches with ideas from the new
branch to form even newer areas of study. For example, there are branches called
Algebra, Geometry, and Topology. During the twentieth century mathematicians
began talking about Algebraic Topology, Algebraic Geometry, and Geometric
Topology. Very roughly speaking, then, some of the branches of mathematics are:
Set Theory: the study of sets, set operations, functions between sets, orderings of sets, and sizes of sets
Algebra: the study of sets upon which there are binary operations defined (such as addition or multiplication), including Group Theory, Ring Theory, Field Theory, and Linear Algebra
Topology: the study of continuous functions and properties of sets that are preserved by continuous functions
Analysis: the study of sets for which there is a measure of distance allowing for the definition of various limiting processes such as those found in the subjects of Calculus, Differential Equations, Functional Analysis, Complex Variables, Measure Theory, and many other areas.
Other areas of study such as Applied Mathematics, Combinatorics, Geometry,
Logic, and Probability are considered by some mathematicians to be branches of
mathematics in their own right and by others to be parts of one or more of the above
four branches. The
exact designation is important to some mathematicians and not to others. Although
mathematicians learn to write proofs in each of these branches of mathematics, one
has to begin the learning process someplace. Many teachers feel that Analysis is a
good area to start because students who have completed a study of Calculus will
already be familiar with just about all of the theorems discussed in a beginning
course in Analysis, and may already have an intuitive feeling for why these results
hold. That does not mean that those same students can write convincing proofs of
these theorems. It is the goal of this book to provide the training necessary so that
a student can learn to write proofs of these and similar theorems. Undergraduate
courses in Topology, Group Theory, Advanced Calculus, Graph Theory, and so
forth generally present the beginning concepts in each of these fields and try to
give students a feel for why the major results in the fields are true. Sometimes
this involves having the students learn proofs of these results while other times it
only involves a presentation of definitions and known results with the idea that the
students will be able to take the "why it is true" and turn it into a proof themselves.
This book is much more interested in turning known strategies into proofs than in
introducing a wealth of new strategies.


Some arguments in Analysis follow a standard format or template. This book
will present several templates for proofs as a tool for teaching how one might
approach the writing of a proof. For example, one can learn to prove a statement
of the form $\lim_{x \to \infty} f(x) = L$ by following a standard pattern. This book will display
proof patterns by presenting proof templates, and for each template it will discuss
proof examples showing how to use the template and the thought process needed
to complete such proofs. After that, a student would be expected to produce similar
proofs. There are other theorems in Analysis whose proofs involve the introduction
of some clever idea which time has shown to be useful. Beginning students would
not be expected to produce proofs using these new ideas on their own, so some of
these proofs are presented in order to teach the new proof strategy. The experienced
mathematician will have seen a large number of these clever proof techniques and
can be expected to reuse these techniques when writing a proof of some new
statement. Beginning students do not have this catalog of proof techniques from
which to draw, so they are not expected to be able to write proofs for such a wide
variety of statements. But one must start someplace when building up this catalog,
and it is a goal of this book to get students started in the right direction.

1.2 Why We Write Proofs


There are many reasons why mathematicians put a lot of weight on the writing of
proofs. Here are some of the reasons.
Determining Truth Research mathematicians use proofs to determine what mathematical statements are true. Although many statements in mathematics are obviously true, many remain unproved conjectures for long periods of time before being
proved. When a conjecture stands unproved for many years, there is time for more
mathematicians to learn about the statement, and the conjecture may attract a great
deal of attention. When the conjecture is first stated, some may find it interesting,
but finding a suitable proof may not appear to be a difficult problem until many
people have tried unsuccessfully to find a proof. As this interesting statement
remains a conjecture for a longer and longer period of time, the mathematical
community realizes that the problem of finding a proof is much more involved than
originally expected. This is exciting partly because a wider community of experts
begins to wonder whether the statement under consideration is true and because
it becomes clear that new techniques will be needed to find a suitable proof if,
in fact, the statement can be proved at all. The problem of determining whether
or not the mathematical statement is true takes on the same sort of interest that
some people would take in the success of their favorite sports teams, sitting and
waiting to see how they will fare in the upcoming contest. When a longstanding
conjecture is finally proved, the announcement of the accomplishment will often be
covered by the lay press, giving mathematics an uncharacteristic, brief period of public
admiration. Perhaps you are familiar with some of these famous problems whose
resolution has eluded mathematicians for years (at least at the time of the writing of
this text in January 2016): the Riemann Hypothesis, the Goldbach Conjecture, the
Twin Prime Conjecture, the P versus NP Problem, and the Navier-Stokes Equations
Existence and Smoothness Problem. During the last 40 years resolutions have been
announced for several long-standing problems including the Four Color Theorem,
the Bieberbach Conjecture (now called de Branges's Theorem), Fermat's Last
Theorem, and the Poincaré Conjecture.

Fig. 1.1 Dividing the disk with the chords from n points (panels: 1 point, 1 region; 2 points, 2 regions; 3 points, 4 regions; 4 points, 8 regions; 5 points, 16 regions; 6 points, 31 regions)

Why do mathematicians expend so much effort trying to prove statements, some
of which may seem obvious from the start? One reason is that mathematicians
are very skeptical of statements that appear obvious, and rightly so. There is a
long history of mathematical statements that appear to be true but
are eventually shown to be false. Even very clear patterns can be deceptively
seductive. Take, for example, the following problem. Select a set of n points along
the circumference of a circle, draw the chords between each pair of points, and find
out the maximum number of regions into which these segments can divide the disk.
Figure 1.1 shows the results for the first few values of n.
Although from considering $n = 1, 2, 3, 4, 5$ it appears that the chords can divide
the disk into $2^{n-1}$ regions, this fails to be true when $n = 6$. With a bit more thinking
it is not hard to see that $2^{n-1}$ could not be the correct answer. With $n$ points there are
$\binom{n}{2}$ chords and at most $\binom{n}{4}$ intersections of two chords. This number of intersections
grows as a fourth-degree polynomial in $n$, suggesting that the number of regions will
also grow as a fourth-degree polynomial in $n$. It would, therefore, be suspicious for
the number of regions to grow at the exponential rate of $2^{n-1}$.
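
The breakdown of the pattern can be checked numerically. The sketch below relies on a standard closed form for the maximum number of regions, $1 + \binom{n}{2} + \binom{n}{4}$, which is not stated in the text above and is brought in here as an assumption; the function name `max_regions` is likewise chosen only for illustration.

```python
from math import comb

def max_regions(n):
    """Maximum number of regions cut from a disk by the chords joining n boundary points.

    Uses the closed form 1 + C(n, 2) + C(n, 4): one region to start, one more
    for each chord, and one more for each interior intersection of two chords.
    """
    return 1 + comb(n, 2) + comb(n, 4)

# Compare the true count with the tempting guess 2^(n-1).
for n in range(1, 11):
    print(n, max_regions(n), 2 ** (n - 1))
```

The two columns agree for the first five values of n and part ways at n = 6, where the count is 31 rather than 32. Because the closed form is a polynomial of degree four in n, it falls further and further behind the exponential guess as n grows.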
Another well-known example comes from Number Theory. The function $\pi(x)$
gives the number of positive prime integers less than or equal to the number $x$.
The growth rate of this $\pi$ function has long been of central importance in Number
Theory. The Prime Number Theorem says that the $\pi$ function grows at the same rate
as the logarithmic integral

$$\mathrm{li}(x) = \int_0^x \frac{dt}{\ln t}.$$

In fact, for many years it was thought that $\mathrm{li}(x) > \pi(x)$ for all $x > 0$ because this
holds for all small values of $x$ which can be practically checked, for example, all $x$
between 0 and $10^{24}$. It has now been shown that $\mathrm{li}(x) - \pi(x)$ switches sign infinitely
often, although only for extremely large values of $x$.
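
The kind of numerical evidence that made this pattern so persuasive can be reproduced on a small scale. The following is a rough sketch under stated assumptions: the logarithmic integral is approximated for $x \ge 2$ by adding the known constant value of li(2) to a Simpson's-rule estimate of the integral from 2 to $x$, and the prime count comes from a simple sieve. The names `li`, `prime_count`, and `LI_2` are choices made here, not notation from the text.

```python
from math import log

# Assumed constant: the value of li(2). Starting the numerical integration at 2
# avoids the singularity of 1/ln(t) at t = 1.
LI_2 = 1.0451637801174928

def li(x, steps=100_000):
    """Approximate li(x) for x >= 2 using Simpson's rule on [2, x] (steps must be even)."""
    if x == 2:
        return LI_2
    h = (x - 2) / steps
    total = 1 / log(2) + 1 / log(x)
    for i in range(1, steps):
        # Simpson weights: 4 at odd interior nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) / log(2 + i * h)
    return LI_2 + total * h / 3

def prime_count(x):
    """Count the primes less than or equal to the integer x with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            # Mark every multiple of p from p*p onward as composite.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

for x in (10, 100, 1000, 10_000, 100_000):
    print(x, prime_count(x), round(li(x), 2))
```

For each value of x tried here the approximation of li(x) comes out larger than the prime count, in line with the pattern described above. No computation of this kind can reach the extremely large values at which the difference changes sign.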
It is apparent that sometimes seemingly very obvious patterns do not hold
in every case, so mathematicians rely on proofs to convince themselves that the
patterns do indeed hold in the general case.
Testing Axiom Systems In the next chapter you will read about the writing of
proofs for some very elementary facts in mathematics, so elementary that you may
wonder why anyone would bother with these proofs. Clearly, it makes sense to
begin any training in the writing of proofs with some very simple results that are
easy to understand so that the student can feel confident about all the statements
being made in the proofs. But these proofs are not being presented just because they
are elementary. When one sets up a mathematical system by making definitions
and determining axioms, it is usually with a particular application or example in
mind. The desired result is that the new system will include the already partially
understood application so that any new discoveries will immediately tell something
new about the original application. Suppose someone sets up an axiom system for
the real numbers, for example, but is not able to prove that addition of real numbers
satisfies the commutative property. Since the commutative property is an important
aspect of addition of real numbers, it would appear that the new axiom system does
not have enough power to represent all that one would want to show about the real
numbers. Perhaps the axiom system will need to be expanded to include an axiom
about the commutativity of addition. Thus, if one cannot prove that the expected
simple properties hold, then it says that something is missing from the axioms. So
mathematicians write proofs to confirm that their axiom systems are representative
of the applications they are trying to describe.
Exhibiting Beauty There are no rules about what composers of music need to
write, but many composers try to write in standardly accepted formats such as
string quartets or symphonies because there are already organizations ready to
perform such works and groups of people happy to listen to such works. Scholars of
literature compare literary works by writing literary analysis, a form which holds
a lot of meaning for those who read and write in that field. Although painters


choose to make pictures of every sort of object or scene, real or imagined, most
painters eventually try their hand at painting some of the standard subjects (still life,
nudes, famous religious or historical depictions). Similarly, mathematicians write
proofs partly because that is what mathematicians enjoy doing. Although many
mathematicians make substantial contributions to the sciences, social sciences, and
arts through the application of their mathematical skills, others live in a world
of creating and discussing abstract concepts that have no immediate application
to real world problems, or at least no application apparent to the mathematicians
doing the research. To them, mathematics is studied as part of the humanities and
is appreciated for its beauty. And much of the beauty of mathematics lies in the
proofs of its theorems. One gets a great deal of pleasure reading a clever proof of
a complicated result when the proof can be stated in just a few lines, especially if
previous proofs of the same result were considerably longer and more difficult to
understand. Many mathematicians like reading articles and attending conferences
where they are exposed mainly to proofs of results, partly so that they can learn
about new results, but more importantly so they can appreciate the techniques
brought to bear to construct the proofs.
Testing Students One should not underestimate the need to educate future mathematicians. A good way to test whether a student understands a particular result is to
ask the student to present a proof of the result. The presentation of a proof shows a
deep understanding of why the result is true and shows an ability to discuss many
details about the objects involved. At the graduate school level in mathematics, most
test problems require the student to produce a proof of a particular result.
The student who has completed a study of Calculus is likely to have mastered
basic skills in Algebra, Geometry, Trigonometry, and Elementary Functions. This
is a good point in one's studies to begin writing proofs. It should not be assumed
that one can just begin writing proofs at this stage even if they have had years of
experience watching teachers and authors present proofs to them any more than
someone can be expected to sit down and begin playing the piano just because they
have watched many other people present concerts using the instrument. In this book
the reader will be taken through the construction of many proofs in a step-by-step
manner that presents the thought process used to write the proofs. Some incorrect
proofs are shown and explained so that the student can learn about common pitfalls
to avoid. Some students dread the transition to writing proofs because they feel
that they do not understand how to write proofs, and are leery of the day when
they will be expected to produce what they cannot now do. But the ability to write
good proofs is a skill no different from the ability to factor polynomials or integrate
rational functions. There is no expectation that the beginner can produce a good
proof, but every expectation that the beginner can learn.

Chapter 2

The Basics of Proofs

2.1 The Language of Proofs


2.1.1 Conditional Statements
Most theorems concern mathematical objects x that satisfy a set of properties P, that
is, $P(x)$ = "the properties P hold for object x." The theorem may say that if $P(x)$ is
true, then some additional properties $Q(x)$ must also be true. Such statements are
called conditional statements and can be written $P(x) \to Q(x)$. In the context
of proving theorems, the $P(x)$ portion of the statement is referred to as the
hypothesis of the statement, and the $Q(x)$ portion of the statement is referred to as
the conclusion of the statement. The hypothesis of a conditional statement is often
called the antecedent while the conclusion of the conditional statement is often
called the consequent. For example, a well-known theorem is that all functions
differentiable at a point are also continuous at that point. There are many equivalent
ways to express this fact:

• All functions differentiable at a point are also continuous at that point.
• If the function f is differentiable at a point, then f is continuous at that point.
• The function f is differentiable at a point only if f is continuous at that point.
• If the function f is not continuous at a point, then f is not differentiable at that point.
• There are no functions f such that f is both differentiable at a point and discontinuous at that point.
• The function f is differentiable at a point implies that f is continuous at that point.
• The function f is differentiable at x $\to$ f is continuous at x.

All of these statements assert that if a function f satisfies the hypothesis that it has
a derivative at a point x, then f must also satisfy the conclusion that f is continuous
at x. Note that the truth of a conditional statement, $P(x) \to Q(x)$, suggests nothing
about the truth of the statement $Q(x) \to P(x)$ which is known as the converse
Springer International Publishing Switzerland 2016
J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_2


of the conditional statement $P(x) \to Q(x)$. Indeed, the converse of this theorem
is the clearly false statement: "If the function f is continuous at a point, then f is
differentiable at that point." Certainly, there are functions f both continuous and
differentiable at a point, but knowing that a function is continuous at a point does
not allow one to conclude that it is differentiable at that point. The converse of a
conditional statement is not logically equivalent to the original statement, but since
the two statements are concerned with the same subject matter, mathematicians
are often interested in the converse of a given conditional. If someone succeeds
in proving a new theorem expressed as a conditional statement, you might wonder
whether the converse of the statement could also be true. Sometimes the truth of the
converse statement is a trivial matter because it is well known. But there are many
examples where the converse does not hold in every case; that is, there are many
known values of x where the converse statement $Q(x) \to P(x)$ is false. Other
times, the converse statement is something that has been previously established.
But very often, the truth of the converse statement remains an open question, and
the proof of the original conditional statement may generate research interest in its
converse.
One of the equivalent forms of a conditional statement $P(x) \to Q(x)$ is the
statement "if $Q(x)$ is false, then $P(x)$ must be false." This can be written as
$\neg Q(x) \to \neg P(x)$ using the negation symbol $\neg$. This form of the statement
is called the contrapositive of the original conditional statement. For example, the
contrapositive of the statement discussed above is "If the function f is not continuous
at a point, then f is not differentiable at that point." Although logically equivalent
to the original conditional statement, the contrapositive often gives you a different
way to think about the statement, and you will often see a proof which is a proof of
the contrapositive statement instead of a proof of the original conditional statement.
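The relationship between a conditional, its converse, and its contrapositive can be checked mechanically with a truth table. The following is a minimal sketch in which the helper name `implies` is chosen only for illustration; it treats the hypothesis and conclusion as plain truth values rather than statements about an object x.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

# All four combinations of truth values for the hypothesis and the conclusion.
rows = list(product([False, True], repeat=2))

# The contrapositive (not q) -> (not p) agrees with p -> q on every row.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# The converse q -> p disagrees with p -> q on at least one row.
converse_equivalent = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_equivalent)  # True
print(converse_equivalent)        # False
```

The row with a true hypothesis and a false conclusion is one where the conditional and its converse differ, which is why proving a conditional says nothing about its converse.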

2.1.2 Negation of a Statement


The negation of a statement is a statement with the opposite truth value of the
original statement, that is, a statement which is false exactly when the original
statement is true. For example, the negation of "n is an integer" is "n is not an
integer." The negation of the statement $P(x)$ is "not $P(x)$" or simply $\neg P(x)$. The
conditional statement $P(x) \to Q(x)$ says that every time $P(x)$ holds it must be the
case that $Q(x)$ also holds. The negation of this statement must, therefore, state that
for at least one value of x, $P(x)$ is true and $Q(x)$ is false, that is, "$P(x)$ and $\neg Q(x)$."
A proof by contradiction is a proof that assumes that both $P(x)$ and $\neg Q(x)$ are
true, and derives a statement that must be false (known as a contradiction), showing
that it is impossible to have $P(x)$ being true at the same time that $Q(x)$ is false.
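The form of the negation described above can be confirmed for a single object by comparing truth values. This is only a sketch: the helper `implies` is a name chosen for illustration, and the check concerns one fixed x, so the "for at least one value of x" part of the negation is not modeled.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

# For every combination of truth values, "not (p -> q)" matches "p and not q".
negation_matches = all(
    (not implies(p, q)) == (p and not q)
    for p, q in product([False, True], repeat=2)
)

print(negation_matches)  # True
```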
The well-known Pythagorean Theorem is a conditional statement: "If a right
triangle has legs with lengths a and b and a hypotenuse with length c, then
$a^2 + b^2 = c^2$." The converse of the Pythagorean Theorem is also true: "If a triangle
has sides with lengths a, b, and c satisfying $a^2 + b^2 = c^2$, then the triangle is
a right triangle." When a conditional statement $P(x) \to Q(x)$ and its converse
$Q(x) \to P(x)$ are both true, the two statements can be combined into one as
$P(x) \leftrightarrow Q(x)$. This can also be stated as "$P(x)$ if and only if $Q(x)$." Such
statements are called biconditional statements. Thus, the Pythagorean Theorem and
its converse could be combined into the single biconditional statement: "A triangle
is a right triangle if and only if the triangle has side lengths a, b, and c satisfying
$a^2 + b^2 = c^2$."

2.1.3 Proofs of Conditional Statements


Conditional statements often make assertions about a very large number of objects
or even an infinite set of objects. Indeed, the statement about differentiable functions
being continuous refers to infinitely many functions, and the Pythagorean Theorem
refers to an infinite number of triangles. How, then, are you supposed to prove
these results since you clearly cannot consider every case individually? A general
approach to proving the conditional statement $P(x) \to Q(x)$ is to select a generic
element x which could represent any object satisfying $P(x)$ and then to prove the
statement $Q(x)$. Since a generic object x satisfying $P(x)$ has been shown to satisfy
$Q(x)$, it follows that every object satisfying $P(x)$ must also satisfy $Q(x)$, and the
result has been proved. This will be the format of most of the proofs you will ever
write in analysis.
If the statement $P(x) \to Q(x)$ is not true, it means that there is at least one
value of x that makes $P(x) \to Q(x)$ a false statement. Such an x is called a
counterexample to the statement, and exhibiting such a counterexample would be
a way to prove that $P(x) \to Q(x)$ is false. A proof of $P(x) \to Q(x)$ is essentially
an argument showing that no counterexamples exist.
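A search for counterexamples can be sketched in a few lines. The statement tested here, "if n is prime, then n is odd," is a hypothetical example chosen for illustration, as are the helper names `is_prime`, `is_odd`, and `find_counterexample`.

```python
import math

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def is_odd(n):
    return n % 2 == 1

def find_counterexample(hypothesis, conclusion, candidates):
    """Return the first x with hypothesis(x) true and conclusion(x) false,
    or None if no candidate is a counterexample."""
    for x in candidates:
        if hypothesis(x) and not conclusion(x):
            return x
    return None

# "If n is prime, then n is odd" is false: n = 2 is a counterexample.
print(find_counterexample(is_prime, is_odd, range(1, 100)))  # 2

# Strengthening the hypothesis to "prime and greater than 2" leaves none in range.
print(find_counterexample(lambda n: is_prime(n) and n > 2, is_odd, range(1, 100)))  # None
```

Finding a counterexample disproves a conditional statement, but finding none in a finite range proves nothing about the infinitely many objects left untested. Only an argument about a generic element does that.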
There are many phrases that occur so frequently when writing proofs that
mathematicians have developed a shorthand notation for these phrases. There is
little need to use these abbreviations within a textbook such as this or even in a
journal article, but the shorthand can be useful when writing out a proof by hand on
paper or a blackboard. Here is a list of some of the commonly used symbols.
Shorthand Symbols for Proofs
$\exists$ there exists
$\exists!$ there exists exactly one
$\forall$ for all
S suppose (or assume)
$\ni$ such that
$\to$ implies
$\leftrightarrow$ if and only if


2.1.4 Exercises
Perform the following steps for each of the conditional statements in Exercises 1–6.
(A) identify the hypothesis and the conclusion.
(B) write the converse of the statement.
(C) decide whether or not the converse of the statement is true.
(D) write the contrapositive of the statement.
(E) write the negation of the statement.

1. If $x = 1$ and $y = 1$, then $xy = 1$.
2. If x is an integer, then $2x + 1$ is also an integer.
3. $f(x)$ and $g(x)$ are both continuous at $x = 0$ only if $f(x) + g(x)$ is continuous
at $x = 0$.
4. $xy = 0$ if $x = 0$ or $y = 0$.
5. If $xy - 9y = 0$ and $y > 0$, then $x = 9$.
6. A rectangle has area xy if two adjacent sides of the rectangle have lengths x and y.
7. Write the following without using shorthand symbols.
(a) $\exists x \in \mathbb{R} \ni x + 4 = 2$.
(b) $\forall x \in \mathbb{R}\ \exists y \in \mathbb{R} \ni x + y = 10$.

2.2 Template for Proofs


Many proofs can be written by following a simple formula or template that suggests
guidelines to follow when writing the proof. Mathematicians reading a proof that
follows a traditional template will find the proof easier to follow because there will
be an expectation about what will be presented in the proof. For example, many
proofs will follow the general template given here.
TEMPLATE followed by many proofs

SET THE CONTEXT


ASSERT THE HYPOTHESIS
LIST IMPLICATIONS
STATE THE CONCLUSION

To illustrate this template, consider this proof of a well-known theorem from
elementary Algebra.


PROOF (Quadratic Formula): For constants a, b, and c, the quadratic
polynomial $ax^2 + bx + c$ has roots given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
SET THE CONTEXT: Let a, b, and c be constants with $a \neq 0$.
ASSERT THE HYPOTHESIS: Suppose that x satisfies $ax^2 + bx + c = 0$.
LIST IMPLICATIONS: Since $a \neq 0$, it follows that $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$.
Then $x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$.
Factoring shows that $\left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} = \frac{b^2}{4a^2}$.
Then $\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a^2} - \frac{c}{a} = \frac{b^2 - 4ac}{4a^2}$.
This means that $x + \frac{b}{2a}$ must be one of the two square roots of $\frac{b^2 - 4ac}{4a^2}$.
So, $x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}} = \frac{\pm\sqrt{b^2 - 4ac}}{2a}$, and $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
STATE THE CONCLUSION: Thus, the roots of the quadratic polynomial
$ax^2 + bx + c$ are given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

The proof template begins with the suggestion to SET THE CONTEXT which
represents statements designed to tell the reader what is being assumed in the proof.
This is usually a sentence or two telling the reader about the properties of the objects
that will be encountered in the proof. It may also introduce which variables will
appear in the proof and what kinds of objects they represent. So, in the given proof
of the Quadratic Formula, the first line tells that the variables a, b, and c are going
to represent known constants with a not being 0. Clearly, the fact that a is not 0
needs to be stipulated because if $a = 0$, the polynomial $ax^2 + bx + c$ would not
be quadratic and would not have the proposed roots. Generally, you are not looking
for a lengthy narrative here, and, in fact, brevity is a particularly cherished attribute
of a proof. Saying what needs to be said, but only what needs to be said is usually
best. Some authors who state a theorem and immediately follow the statement of the
theorem with its proof will forgo setting the context at the beginning of the proof
because the reader will have just seen the statement of the theorem and may not need
to see a repeat of the context for that proof. For example, in the example proof, the
statement of the theorem does introduce the constants a, b, and c and polynomial
$ax^2 + bx + c$, so some authors might just skip the first line of the proof. On the
other hand, if the first line of the proof instead introduced the constants r, s, and
t, the proof could have proceeded using these variables instead of a, b, and c. The
same result would have been proved. So the SET THE CONTEXT part of the proof
makes the proof independent of the statement of the theorem being proved. Thus, for
completeness, it is good to establish the habit of including the setting of the context
at the beginning of each proof, at least until the student's experience in proof writing
has matured.
Your choices of variables used to represent particular objects in the proof are
not critically important to the structure or correctness of the proof, but there are


certain variables that mathematicians associate with various uses, and sticking to
these conventional choices simplifies the understanding of the proof because those
variable choices bring with them a history of context that the reader will recognize.
There are very few Algebra students who would recognize the Quadratic Formula
if you gave them $z = \frac{-s \pm \sqrt{s^2 - 4rt}}{2r}$. Proofs about limits usually refer to the
variables $\epsilon$ and $\delta$ which represent small positive real numbers used in specific ways
in the proof. Using these two variables in their traditional contexts makes the proofs
easier to understand because the reader will expect these variables to play specific
roles, just as they have in many other proofs the reader has seen. Seeing many
examples of proofs will familiarize the novice proof writer with these traditional
uses of variables.
Suppose that the statement being proved indicates that every object satisfying
the properties listed in the hypothesis of the theorem also satisfies some properties
listed in the conclusion of the theorem. One generally structures a proof of such
a statement by first selecting a generic object satisfying the properties listed in
the hypothesis. The ASSERT THE HYPOTHESIS part of the proof is where the
writer selects an arbitrary element satisfying the hypothesized properties. In the
Quadratic Formula proof, it was assumed that x satisfied the quadratic equation
$ax^2 + bx + c = 0$. Other examples would be statements such as

• Let n be any natural number bigger than 3.
• Let x be an element of set A.
• Let y be a root of the polynomial $p(x)$.
• Assume that the real valued function f has a zero at the point z.
• Suppose G and H are any two lines that intersect at a point P.
• Assume that the function $f(s) = \int_0^s g(x)\,dx$ is a differentiable function of s. In
  addition assume that $0 \le f(s) \le 10$ for all $s \ge 0$.

It is possible that there are infinitely many objects which could play the role of
the generically chosen object. But if an argument proves the result is true for this
generic object, then the theorem will have been shown to hold for any object that
could have played the role of the generic object, and, therefore, the theorem will
have been proved for all objects satisfying the hypothesis. The Quadratic Formula
proof addresses the one generic polynomial $ax^2 + bx + c$ and in doing so derives
a formula that works for all quadratic polynomials including $5x^2 - 17x + 126$ and
$rx^2 + sx + t$. Often the reader of a proof will form a mental picture of the generic
object being chosen. For example, after reading "Let n be any natural number bigger
than 3," the reader may think, "OK, how about $n = 7$?" As the proof progresses,
the reader may take each statement of the proof and verify that it is valid and makes
sense for their choice of $n = 7$. This helps the reader follow the logic of the proof
and verifies that they are understanding what the proof is saying.
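In the same spirit of testing a general statement against a specific choice, the Quadratic Formula can be tried on the polynomial $5x^2 - 17x + 126$ mentioned above. This is a minimal sketch under some assumptions: complex arithmetic is used because this polynomial has a negative discriminant, the second test polynomial $x^2 - 3x + 2$ is a hypothetical addition, and the function names are chosen only for illustration.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the two roots of a*x**2 + b*x + c given by the Quadratic Formula."""
    if a == 0:
        raise ValueError("a must be nonzero for the polynomial to be quadratic")
    root = cmath.sqrt(b * b - 4 * a * c)  # a complex square root handles any sign
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

def residual(a, b, c, x):
    """Value of the polynomial at x; it should be (nearly) zero at a root."""
    return a * x * x + b * x + c

for coeffs in [(5, -17, 126), (1, -3, 2)]:
    for x in quadratic_roots(*coeffs):
        print(coeffs, x, abs(residual(*coeffs, x)))
```

The residuals come out at or very near zero, up to floating-point rounding. A check like this can expose an algebra slip in a proposed formula, but only the generic argument in the proof shows that the formula works for every choice of a, b, and c with $a \neq 0$.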
The proof will be completed when it is shown that the generically chosen
element satisfying the hypothesis of the theorem is, in fact, an element satisfying
the conclusion of the theorem as stated in the STATE THE CONCLUSION part
of the template. There will certainly need to be some statements placed between


the original assertion of the hypothesis and the end of the proof that justify the
conclusion of the theorem. Those statements make up the LIST IMPLICATIONS
part of the template. In almost all cases, most of the body of the proof belongs to
this list of statements. Each statement in the list should follow from definitions or
be simple implications following from previous statements in the proof. In a well-written complete proof, the reader should easily see why each implication follows
logically from other statements made earlier in the proof (Fig. 2.1). If an implication
is not clear on its own, it will need some justification so the reader can follow the
logic. The justification may just be a reminder of a key point made earlier in the
proof ("as shown earlier, f is continuous at point a") or a reminder of a well-known
definition or theorem ("Since all continuous functions on the interval $[0, 4]$ are
integrable there, it follows that $\int_0^4 f(x)\,dx$ exists."). The given Quadratic Formula proof
contains six lines of implications. Each line follows easily from the line before using
standard rules of Algebra, and any student familiar with the algebraic manipulations
of equations will be able to understand these implications. In the fourth step of the
proof, the quantity $\frac{b^2}{4a^2}$ is added to both sides of an equation. Although this step
surely follows the rules of Algebra, it may not be clear to the reader of the proof
why the step is important. As it turns out, this "completing the square" operation
prepares for the factoring performed in the fifth step of the proof and is arguably the
most clever step of the proof. A proof will often require a clever step such as this.
The proof writer may have labored for years looking for the inspiration needed to
find such a step, but the proof itself need only make clear the justification for what
is being done and does not need to refer to the sweat that went into producing it.
Some implications will be easy for the reader to follow without having to justify
the step. Other statements may need some deeper explanation. Here is where the
proof writer will need to consider the expertise of the target audience for the proof
in order to decide how much detail to provide. How to make your proof easy to
follow is only clear when you know for whom it is meant to be easy. For example, it
made sense to follow the line $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$ with the statement
$x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$
because this just used the fact that you can add equal quantities to both sides of
an equation to get a new equation that is equivalent. On the other hand, suppose you
wish to combine a conditional statement on line 8 of a proof with the fact stated
on line 15 of the proof in order to show that the hypothesis of that conditional
is satisfied. This would allow the writer to state the conclusion of the conditional

Fig. 2.1 List of implications for $P \to Q$


statement to get line 16 of the proof, but the reader may have to be reminded about
which statements are being combined to get that conclusion.
Sometimes the writer of a long or complicated proof will need to make a new
definition or point out some new property that will be important later in the proof.
Depending on the complexity of the new idea, the proof writer may want to include
an example or two of objects satisfying the new definition or property. This will
serve to help the reader understand the new concept or to verify that the reader
is understanding the new concept. It is admirable to include such examples when they
make a complex proof clearer. But in most other contexts, the proof
should be kept short without the inclusion of unnecessary statements. If the intended
readers are able to easily construct these examples on their own, then the examples
should be left out of the proof.
The remainder of this chapter will discuss proofs that follow this general proof
template in contexts that the student should find easy to follow. It will also give
an opportunity to present some definitions and notation that will be used in later
chapters.

2.2.1 Exercises
1. If you were writing a proof of "All prime numbers greater than 2 are odd," which
of the following would be appropriate ways to begin the proof? (There may be
more than one correct answer.)
(a) Let n be an odd prime number.
(b) Assume that all odd prime numbers are greater than 2.
(c) Let n be a prime number greater than 2.
(d) Assume that 2 is a prime number.
(e) Assume that n and k are integers with $n > k > 2$.
(f) The numbers 3, 5, 7, and 11 are prime numbers greater than 2.
(g) Let n be a number greater than 2 which is not prime.

2. If you were writing a proof of "The diagonals of a parallelogram bisect each
other," which of the following would be appropriate ways to begin the proof?
(There may be more than one correct answer.)
(a) Let ABCD be a parallelogram.
(b) Let ABCD be a quadrilateral whose diagonals bisect each other.
(c) Let ABCD be a parallelogram whose diagonals bisect each other.
(d) All rectangles are parallelograms.
(e) Assume that the diagonals of a parallelogram bisect each other.
(f) Assume that if the diagonals of a quadrilateral bisect each other, then the
quadrilateral is a parallelogram.


3. If you were writing a proof of "Every cubic polynomial with real coefficients has
at least one real root," which of the following would be appropriate ways to begin
the proof? (There may be more than one correct answer.)
(a) Assume that every cubic polynomial with real coefficients has at least one
real root.
(b) Assume that $p(x)$ is a polynomial with at least one real root.
(c) Assume that a, b, c, and d are real numbers with $a \neq 0$, and let
$p(x) = ax^3 + bx^2 + cx + d$.
(d) The polynomial $x^3 - 8$ has exactly one real root at $x = 2$.
(e) Let $p(x)$ be a cubic polynomial with real coefficients and $q(x)$ be a cubic
polynomial with complex coefficients.
(f) Let $p(x)$ be a cubic polynomial with real coefficients with real root r.
Write an appropriate first sentence that would begin proofs of each of the following
statements.
4. If m and n are relatively prime integers, then there exist integers x and y such that
$mx + ny = 1$.
5. The three angle bisectors of any triangle intersect at a common point.
6. If a and b are real numbers with $a \le b$, and f is a function continuous on the
closed interval $[a, b]$, then there is a real number M such that $|f(x)| \le M$ for all
$x \in [a, b]$.
7. If $\vec{u}$, $\vec{v}$, and $\vec{w}$ are 3-dimensional vectors, then $(\vec{u} \times \vec{v}) \cdot \vec{w} = \vec{u} \cdot (\vec{v} \times \vec{w})$.

2.3 Proofs About Sets


2.3.1 Set Notation
Most courses in mathematics discuss sets: sets of numbers, sets of points, sets
of functions, sample spaces, and so forth. This should have given any Calculus
student an intuitive understanding of sets. Many theorems in mathematics are
statements about sets in disguise. For example, the statement that "If the function
f is differentiable at a point, then f is continuous at that point" is equivalent to the
statement "The set of functions differentiable at a point is a subset of the set of
functions continuous at that point."
For the purposes of this text, it will be enough to define a set as a collection of
elements. That is, elements are those objects that belong to sets, and the notation
$x \in A$ says that x is an element of the set A, and $x \notin A$ says that x is not an element
of the set A. The set A is a subset of the set B, or A is contained in the set B, if each
element of A is also an element of B in which case this fact is written as $A \subseteq B$. Two
sets, A and B, are equal if they have the same elements, that is, all the elements in the
set A are in the set B, and all the elements in the set B are in the set A. Notationally,
this says that $A = B$ if and only if both $A \subseteq B$ and $B \subseteq A$.


There are many ways to express the contents of a set. One is to list the elements
such as $A = \{a, b, c\}$ or $B = \{1, 3, 5, 7, \ldots\}$. Another way is to use set builder
notation which states that the set consists of all elements satisfying a given property
$P(x)$ and is written $\{x \mid P(x)\}$, or to emphasize that the elements of the set are also in
set A, it is often written $\{x \in A \mid P(x)\}$. Examples are $\{x \mid x > 0\}$, $\{y \mid y^2 + 3y^2 > 7\}$,
and $\{f \mid f \text{ is a function differentiable at } x = 3\}$. Note that a set is determined by the
elements that are in the set. Thus, $\{1, 2, 3\} = \{3, 2, 1\} = \{1, 2, 2, 3, 3, 3, 1, 2, 3\}$
because all three of these sets contain exactly the same three elements. In some
contexts, mathematicians will talk about a multiset which is an object similar to
a set but allows elements of the collection to appear with different multiplicities.
Thus, $\{1, 2, 3\}$ and $\{1, 2, 2, 3, 3, 3, 1, 2, 3\}$ would be different multisets even though,
in the notation of sets, they are the same set.
One special set is the empty set written as $\emptyset$ or $\{\}$ and is the set that has no
elements. In some contexts there is an understanding of a universal set, U, such
that all other sets under consideration are subsets of U. For example, the sets
$A = \{1, 2, 3\}$ and $B = \{2, 4, 6, 8, \ldots\}$ can be thought of as subsets of the universal
set $U = \{1, 2, 3, 4, \ldots\}$.
Take care not to confuse elements and subsets. Remember that sets are collections of elements and sets are subsets of other sets. It is possible that a set contains
other sets as elements, but this would need to be explicitly clear from the definition
of that set. It is correct to write $3 \in \{1, 2, 3, 4, 5\}$, $\{1, 3, 5\} \subseteq \{1, 2, 3, 4, 5\}$,
and $\{1, 2\} \in \{1, \{1, 2\}, \{1, \{1, 2\}\}\}$, but it is incorrect to write $3 \subseteq \{1, 2, 3, 4, 5\}$,
$\{1, 2\} \in \{1, 2, 3, 4, 5\}$, or $\emptyset \in \{1, 2, 3, 4, 5\}$.
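The difference between "is an element of" and "is a subset of" can be sketched with Python sets, where `in` tests membership and `<=` tests the subset relation. This sketch assumes `frozenset` as the way to place a set inside another set; the names `S` and `T` are chosen only for illustration.

```python
S = {1, 2, 3, 4, 5}
# The set {1, {1, 2}, {1, {1, 2}}}: inner sets must be frozensets to be elements.
T = {1, frozenset({1, 2}), frozenset({1, frozenset({1, 2})})}

print(3 in S)                    # True:  3 is an element of S
print({1, 3, 5} <= S)            # True:  {1, 3, 5} is a subset of S
print(frozenset({1, 2}) in T)    # True:  {1, 2} is an element of T
print(frozenset({1, 2}) in S)    # False: {1, 2} is a subset of S, not an element
print(frozenset() in S)          # False: the empty set is not an element of S
print(frozenset() <= S)          # True:  the empty set is a subset of every set

# Asking whether the number 3 is a subset of S is not even meaningful.
try:
    3 <= S
    subset_test_allowed = True
except TypeError:
    subset_test_allowed = False
print(subset_test_allowed)       # False
```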
The student should be familiar with the following standard set operations. The
union of sets A and B is $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$, and the intersection
of sets A and B is $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$. When there is an
understood universal set, U, it makes sense to refer to the complement of a set
$A^c = \{x \in U \mid x \notin A\}$. It does not make sense to discuss the complement of a set if
there is no understood universal set. For example, is $\{1, 2, 3\}^c = \{4, 5, 6, 7, \ldots\}$,
or is it $\{\ldots, -3, -2, -1, 0, 4, 5, 6, 7, \ldots\}$? For that matter, is your right shoe an
element of $\{1, 2, 3\}^c$? The difference of two sets is $A \setminus B = \{x \in A \mid x \notin B\}$,
and, equivalently, if there is an understood universal set, $A \setminus B = A \cap B^c$. For
example, if $A = \{1, 2, 3, 4, 5\}$ and $B = \{2, 4, 6, 8\}$, then $A \cup B = \{1, 2, 3, 4, 5, 6, 8\}$,
$A \cap B = \{2, 4\}$, $A \setminus B = \{1, 3, 5\}$, and $B \setminus A = \{6, 8\}$.
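The example above can be reproduced with Python's set operators. In this sketch the universal set U = {1, 2, ..., 10} is an assumption made only so that a complement can be formed; the text does not fix a universal set for this example.

```python
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8}
U = set(range(1, 11))  # assumed universal set {1, ..., 10}, for illustration only

union = A | B          # elements in A or in B
intersection = A & B   # elements in A and in B
a_minus_b = A - B      # elements of A that are not in B
b_minus_a = B - A      # elements of B that are not in A
b_complement = U - B   # complement of B relative to U

print(union)           # {1, 2, 3, 4, 5, 6, 8}
print(intersection)    # {2, 4}
print(a_minus_b)       # {1, 3, 5}
print(b_minus_a)       # {8, 6} or {6, 8}; sets are unordered
print(a_minus_b == A & b_complement)  # True
```

The last line checks the identity that the difference of A and B equals the intersection of A with the complement of B, for this one choice of sets and universal set.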

2.3.2 Exercises
1. Which of the following statements are true?
(a) $6 \in \{1, 2, 3, \ldots, 10\}$.
(b) $\{3, 5\} \in \{1, 2, 3, \ldots, 10\}$.
(c) $\emptyset \in \{1, 2, 3, \ldots, 10\}$.
(d) $\{6, 8\} \subseteq \{1, 2, 3, \ldots, 10\}$.
(e) $\{2, 5, 5, 6\} \subseteq \{1, 2, 3, \ldots, 10\}$.
(f) $\{1, 2, 3, \ldots, 10\} \subseteq \{1, 2, 3, \ldots, 10\}$.
2. Given $A = \{1, 3, 5, 7, 9, 11, 13\}$, $B = \{2, 3, 4, 5, 6\}$, and $C = \{1, 4, 7, 11, 14\}$
evaluate each of the following expressions.
(a) $A \cup A$
(b) $A \cap B$
(c) $(A \cup B) \cap C$
(d) $(B \cup C) \cap A$
(e) $(A \cap B) \setminus C$
(f) $(A \setminus B) \cup C$

2.3.3 Proofs About Subsets


There are many simple statements about sets which should be immediately obvious
to students reading this text, but learning to write proofs for these types of statements
will be instructive and useful in the proof writing discussed in the following
chapters. Here are some of those simple statements that apply to all sets A, B,
and C.
Some Statements About All Sets A, B, and C
• $A \subseteq A \cup B$.
• $A \cap B \subseteq A$.
• $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
• $(A \cup B) \cap C \subseteq A \cup (B \cap C)$.
• $A \cup B = B \cup A$, the Commutative Law of Union.
• $A \cap B = B \cap A$, the Commutative Law of Intersection.
• $(A \cup B) \cup C = A \cup (B \cup C)$, the Associative Law of Union.
• $(A \cap B) \cap C = A \cap (B \cap C)$, the Associative Law of Intersection.
• $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$, the Distributive Law of Union Over Intersection.
• $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, the Distributive Law of Intersection Over Union.
• $(A \cup B)^c = A^c \cap B^c$, DeMorgan's Laws.
• $(A \cap B)^c = A^c \cup B^c$, DeMorgan's Laws.
The first four of these statements propose that one set is a subset of a second set.
From the definition of subset, for $A \subseteq B$ to be true, it is required that for every
$x \in A$, x must also be in B. There is a standard template for proofs of statements of
this form:


TEMPLATE for proving $A \subseteq B$ for sets A and B

SET THE CONTEXT: State what is being assumed about the sets A and B.
ASSERT THE HYPOTHESIS: Let $x \in A$.
LIST IMPLICATIONS: Use the properties of set A to show x belongs to
set B.
STATE THE CONCLUSION: $x \in B$. Therefore, by the definition of subset,
$A \subseteq B$.
For example, how would one prove the statement "For all sets A and B, $A \subseteq A \cup B$"?
Because this proof is supposed to apply to any sets A and B regardless of
what properties they may possess, all that would be necessary for the SET THE
CONTEXT part of the proof is a statement introducing to the reader the fact that
the variables A and B will represent sets. Since $A \subseteq A \cup B$ exactly when every
element of A is also an element of $A \cup B$, the ASSERT THE HYPOTHESIS part
of the proof needs to select a generic element of the set A so that the proof can
conclude that the generic element is an element of set $A \cup B$. The first two lines of
the proof read:
Suppose that A and B are any two sets. Let $x \in A$.
The LIST IMPLICATIONS for this proof can be very short. It merely needs
to show that the definition of set union implies that x is in the union $A \cup B$. This
completes the proof.
PROOF: A  A [ B.

SET THE CONTEXT: Suppose that A and B are any two sets.
ASSERT THE HYPOTHESIS: Let x 2 A.
LIST IMPLICATIONS: Since x 2 A, it is true that x 2 A or x 2 B.
By the definition of set union x 2 A [ B.
STATE THE CONCLUSION: Therefore, by the definition of subset,
A  A [ B.

Do the statements of this proof have to appear in exactly this order using exactly
these words? Of course not. There can be many variations in what makes up a good
proof. But it does not hurt to review why these statements make a good proof. The
first line "Suppose that A and B are any two sets" just makes it clear to the reader
that the variables A and B can be used to represent any two sets. Here is where the
reader of the proof may well mentally choose two sets so that when reading the
remainder of the proof, the reader can verify that the statements make sense when
applied to those two sets. The second line "Let x ∈ A" is required because by the
definition of subset, one must show that each element of A is also an element of
A ∪ B, so selecting an arbitrary element of A is the natural way to do this. The next
line "Since x ∈ A, it is true that x ∈ A or x ∈ B" is just a statement of logic that says
if statement p is true, then statement "p or q" is also true. Of course, this particular
"p or q" statement is exactly the definition of x being a member of A ∪ B, which is
exactly what is needed to complete the proof.


Could one have interchanged the third and fourth lines of this proof? Well, yes;
the proof would be complete if that were done, but the fact that the definition of set
union is invoked right after its conditions are verified makes the statements of the
proof flow smoothly. The reader facing the definition of set union in line three might
wonder why that definition is being shown at that point. By placing that statement
as the fourth statement where the proof reader has just seen that x ∈ A or x ∈ B,
the proof reader will immediately see that the definition of set union applies. Note
that each of the five statements in the proof has been placed on a separate line in the
display box. This has been done merely to facilitate the discussion about that proof.
In practice, there is no requirement that these statements appear on separate lines.
The second statement about all sets is A ∩ B ⊆ A. This can be proved using the
same proof template as the first statement. Since this statement also applies to any
two sets A and B, the first line of this proof will be the same as the first line of the
previous proof. Because the assertion of the statement being proved is that A ∩ B is
a subset of another set, the ASSERT THE HYPOTHESIS line of the proof would
change to the assertion that x belongs to A ∩ B. After reading this second line, what
does the proof reader know about x? Only that x belongs to the intersection of two
sets. Thus, the only direction in which the proof can proceed is to invoke the definition
of set intersection to make the additional assertion that x ∈ A and x ∈ B. This is a
statement of the form "p and q," so logic allows the assertion that p is true, or, in this
case, that x ∈ A. This is the required STATE THE CONCLUSION statement, and
the complete proof would be
PROOF: A ∩ B ⊆ A.
SET THE CONTEXT: Suppose that A and B are any two sets.
ASSERT THE HYPOTHESIS: Let x ∈ A ∩ B.
LIST IMPLICATIONS: By the definition of set intersection, x ∈ A and
x ∈ B.
Thus, x ∈ A.
STATE THE CONCLUSION: Therefore, by the definition of subset,
A ∩ B ⊆ A.
For a more substantial example, consider the third of the list of statements about
sets, A \ (B ∪ C) ⊆ (A ∪ B) \ C. A proof of this statement will need to refer to the
definition of set difference as well as the definitions of set union and subset. Since
the statement being proved involves three sets, the SET THE CONTEXT part of
the proof will need to refer to all three sets. The ASSERT THE HYPOTHESIS
statement will need to select an arbitrary element from A \ (B ∪ C). To emphasize
that the choice of which variable to use is arbitrary, this time use y rather than x to
represent the arbitrarily chosen element. Once it is known that y ∈ A \ (B ∪ C),
the only property of y that can be used is the fact that y is a member of a set
difference. Thus, this would be a good time to invoke the definition of set difference.
That assures that y ∈ A and y ∉ (B ∪ C). At that point one can use the definition of
set union to conclude from y ∉ (B ∪ C) that y ∉ B and y ∉ C. Now these facts
can be combined to get the STATE THE CONCLUSION statement required by
the proof template. The complete proof would be
PROOF: A \ (B ∪ C) ⊆ (A ∪ B) \ C.
SET THE CONTEXT: Suppose that A, B, and C are any three sets.
ASSERT THE HYPOTHESIS: Let y ∈ A \ (B ∪ C).
LIST IMPLICATIONS: By the definition of set difference, y ∈ A and
y ∉ (B ∪ C).
By the definition of set union, y cannot be an element of either set B or set
C, or it would be in B ∪ C.
Also by the definition of set union, since y ∈ A, y is also a member of A ∪ B.
Now, y ∈ (A ∪ B) and y ∉ C, so by the definition of set difference,
y ∈ (A ∪ B) \ C.
STATE THE CONCLUSION: Therefore, by the definition of subset,
A \ (B ∪ C) ⊆ (A ∪ B) \ C.
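Before writing a proof, it can be reassuring to test a statement on small examples. The following Python sketch is an illustration added here, not part of the proofs themselves. It checks the four subset statements for every choice of A, B, and C drawn from a three-element universe. The helper names all_subsets, is_subset, and check_subset_statements are hypothetical, and the function is_subset deliberately mirrors the proof template: it takes each element x of the first set and checks that x belongs to the second set. A finite check of this kind is only evidence; it does not replace a proof that applies to all sets.

```python
from itertools import combinations


def all_subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def is_subset(left, right):
    """Follow the template: each x in `left` must also be in `right`."""
    for x in left:
        if x not in right:
            return False
    return True


def check_subset_statements(universe):
    """Test the four subset statements for all A, B, C inside `universe`."""
    subsets = all_subsets(universe)
    for A in subsets:
        for B in subsets:
            if not is_subset(A, A | B):          # A is a subset of A union B
                return False
            if not is_subset(A & B, A):          # A intersect B is a subset of A
                return False
            for C in subsets:
                # A \ (B union C) is a subset of (A union B) \ C
                if not is_subset(A - (B | C), (A | B) - C):
                    return False
                # (A union B) intersect C is a subset of A union (B intersect C)
                if not is_subset((A | B) & C, A | (B & C)):
                    return False
    return True


result = check_subset_statements({1, 2, 3})
print(result)
```

If any of the four statements failed for some choice of sets, the function would return False, and the failing sets would be a counterexample worth studying.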

2.3.4 Exercises
Write proofs for each of the following statements.
1. For all sets A, B, and C, (A ∩ B) ∩ C ⊆ A ∩ C.
2. For all sets A, B, and C, (A ∩ B) ∩ (A ∩ C) ⊆ B ∩ C.
3. For all sets A, B, and C, (A \ B) ∩ (A \ C) ⊆ A \ (B ∩ C).

2.3.5 Proofs About Set Equality


Let A and B be sets. From the definition of set equality it follows that one can prove
A = B by proving the two separate facts A ⊆ B and B ⊆ A. That suggests the
following proof template for proving that two sets are equal.


TEMPLATE for proving A = B for sets A and B

SET THE CONTEXT: Make statements about what is being assumed about
sets A and B.
PART 1: SHOW A ⊆ B.
ASSERT THE HYPOTHESIS: Let x ∈ A.
LIST IMPLICATIONS: Use the properties of set A to show x belongs to
set B.
CONCLUDE PART 1: x ∈ B. Therefore, by the definition of subset, A ⊆ B.
PART 2: SHOW B ⊆ A.
ASSERT THE HYPOTHESIS: Let x ∈ B.
LIST IMPLICATIONS: Use the properties of set B to show x belongs to
set A.
CONCLUDE PART 2: x ∈ A. Therefore, by the definition of subset, B ⊆ A.
STATE THE CONCLUSION: Therefore, because A and B are subsets of
each other, by the definition of set equality, A = B.
Is it correct to use the same variable x in both parts of the above proof template?
Yes, since the use of the variable x is only important in the context of showing A ⊆ B
or B ⊆ A, there is little chance that the reader will be confused by these two uses of
the same variable. On the other hand, there would be nothing incorrect about using
the variable x to represent the element of set A in the first part of the proof and to
use the variable y to represent the element of set B in the second part of the proof.
Using the same variable has the advantage that it is used the same way in both parts
of the proof, that is, to represent an element of a set that is being shown to also be
an element of a second set.
Is it correct that the variables A and B are used to represent the sets in both parts
of the proof? Could, for example, the first part of the proof use sets A and B, and
the second part of the proof use sets C and D? Here the answer is that it is very
important to use the same variables in both parts of the proof. To show A = B it
must be shown that A ⊆ B and B ⊆ A for the same pair of sets A and B. Showing
A ⊆ B and C ⊆ D does not let one conclude that A = B. After introducing A and B
in the SET THE CONTEXT part of the proof, it would be wrong to change the use
of these variables later in the proof or to change which variables were representing
the two sets.
Consider how to write proofs of three of the example statements:
A ∪ B = B ∪ A, the Commutative Law of Union.
(A ∩ B) ∩ C = A ∩ (B ∩ C), the Associative Law of Intersection.
(A ∪ B)^c = A^c ∩ B^c, DeMorgan's Law.
The first proof follows easily from the fact that in logic the statements "p or q" and
"q or p" are equivalent. This leads to the proof


PROOF: A ∪ B = B ∪ A.
SET THE CONTEXT: Suppose that A and B are any two sets.

PART 1 A ∪ B ⊆ B ∪ A
ASSERT THE HYPOTHESIS: Let x ∈ A ∪ B.
LIST IMPLICATIONS: By the definition of set union, x ∈ A or x ∈ B.
Thus, x ∈ B or x ∈ A.
By the definition of set union, x ∈ B ∪ A.
CONCLUDE PART 1: Hence, from the definition of subset, it follows that
A ∪ B ⊆ B ∪ A.

PART 2 B ∪ A ⊆ A ∪ B
ASSERT THE HYPOTHESIS: Now suppose that x ∈ B ∪ A.
LIST IMPLICATIONS: By the definition of set union, x ∈ B or x ∈ A.
Thus, x ∈ A or x ∈ B.
By the definition of set union, x ∈ A ∪ B.
CONCLUDE PART 2: Hence, from the definition of subset, it follows that
B ∪ A ⊆ A ∪ B.

STATE THE CONCLUSION: Therefore, because A ∪ B and B ∪ A are
subsets of each other, by the definition of set equality, A ∪ B = B ∪ A.
Note that the PART 1 and PART 2 labels have been included in the above
display as guides to the student, but they are not required elements of the proof
itself. This proof can be shortened. Since the second part of the proof is identical to
the first part of the proof except that the roles of the sets A and B are interchanged,
one might save the reader from having to think through the details of the second half
of the proof which are identical to the details of the first half. The proof could be
written as
PROOF: A ∪ B = B ∪ A.

Suppose that A and B are any two sets.
Let x ∈ A ∪ B.
By the definition of set union, x ∈ A or x ∈ B.
Thus, x ∈ B or x ∈ A.
By the definition of set union, x ∈ B ∪ A.
Hence, from the definition of subset, it follows that A ∪ B ⊆ B ∪ A.
Similarly, one can conclude that B ∪ A ⊆ A ∪ B.
Therefore, since A ∪ B and B ∪ A are subsets of each other, by the definition
of set equality, A ∪ B = B ∪ A.

In fact, the first half of the proof is the second half of the proof. The first half of
the proof shows that A ∪ B ⊆ B ∪ A for any two sets A and B. In particular, that proof
applies when the roles of the two sets are interchanged; just let the variable A in the
first part of the proof represent the set B from the second part of the proof, and let
the variable B in the first part of the proof represent the set A from the second part
of the proof.
The Associative Law of Intersection refers to three sets and requires repeated
use of the definition of set intersection. The definition is used to break down the
statement x ∈ (A ∩ B) ∩ C into the three simple statements x ∈ A, x ∈ B, and x ∈ C,
and then these facts are put back together to form the needed x ∈ A ∩ (B ∩ C). Again,
the proof needs two parts. The result is
PROOF: (A ∩ B) ∩ C = A ∩ (B ∩ C).
Suppose that A, B, and C are any three sets.

PART 1 (A ∩ B) ∩ C ⊆ A ∩ (B ∩ C)
Let x ∈ (A ∩ B) ∩ C.
By the definition of set intersection, x ∈ (A ∩ B) and x ∈ C.
Also, by the definition of set intersection, x ∈ A and x ∈ B.
Thus, x ∈ A, x ∈ B, and x ∈ C.
Since x ∈ B and x ∈ C, by the definition of set intersection, x ∈ B ∩ C.
Since x ∈ A and x ∈ B ∩ C, by the definition of set intersection,
x ∈ A ∩ (B ∩ C).
Hence, from the definition of subset, it follows that (A ∩ B) ∩ C ⊆
A ∩ (B ∩ C).

PART 2 A ∩ (B ∩ C) ⊆ (A ∩ B) ∩ C
Now, let x ∈ A ∩ (B ∩ C).
By the definition of set intersection, x ∈ A and x ∈ B ∩ C.
Also, by the definition of set intersection, x ∈ B and x ∈ C.
Thus, x ∈ A, x ∈ B, and x ∈ C.
Since x ∈ A and x ∈ B, by the definition of set intersection, x ∈ A ∩ B.
Since x ∈ A ∩ B and x ∈ C, by the definition of set intersection,
x ∈ (A ∩ B) ∩ C.
Hence, from the definition of subset, it follows that A ∩ (B ∩ C) ⊆ (A ∩ B) ∩ C.
Therefore, because (A ∩ B) ∩ C and A ∩ (B ∩ C) are subsets of each other,
by the definition of set equality, (A ∩ B) ∩ C = A ∩ (B ∩ C).

The two DeMorgan's Laws are useful because they tell how to simplify the
complement of a set formed by a combination of unions and intersections of sets.
Proving these laws can follow the template for showing set equality. The proofs will
need to refer to the definitions of set union, set intersection, and set complement.
The order in which these definitions are invoked follows from what is known at that
point of the proof. For example, if you know that x ∈ (A ∪ B)^c, then the only way
to make progress in the proof is to apply the definition of set complement because
the only attribute known about the set is that it is the complement of some other set.

Yes, that other set is a union of two sets, but there is no way to use that information
at this point of the proof because complementation was performed after the union
was taken (Fig. 2.2).

Fig. 2.2 (A ∪ B)^c is equal to A^c ∩ B^c

PROOF: (A ∪ B)^c = A^c ∩ B^c.
Suppose that A and B are any two sets.

PART 1 (A ∪ B)^c ⊆ A^c ∩ B^c
Let x ∈ (A ∪ B)^c.
By the definition of set complement, x ∉ (A ∪ B).
If x ∈ A or x ∈ B, then x ∈ A ∪ B which is false.
Thus, x ∉ A and x ∉ B, so by the definition of set complement, x ∈ A^c and
x ∈ B^c.
By the definition of set intersection, x ∈ A^c ∩ B^c.
Hence, from the definition of subset, it follows that (A ∪ B)^c ⊆ A^c ∩ B^c.

PART 2 A^c ∩ B^c ⊆ (A ∪ B)^c
Now, let x ∈ A^c ∩ B^c.
By the definition of set intersection, x ∈ A^c and x ∈ B^c.
Thus, by the definition of set complement, x ∉ A and x ∉ B.
If x ∈ A ∪ B, then by the definition of union, it would follow that x ∈ A or
x ∈ B which is false.
Thus, x ∉ A ∪ B, and, by the definition of set complement, x ∈ (A ∪ B)^c.
Hence, from the definition of subset, it follows that A^c ∩ B^c ⊆ (A ∪ B)^c.
Therefore, because A^c ∩ B^c and (A ∪ B)^c are subsets of each other, by the
definition of set equality, (A ∪ B)^c = A^c ∩ B^c.
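The two-part template for set equality can also be mirrored in a short Python sketch, given here only as an illustration. The function sets_equal decides equality by checking the two inclusions separately, just as PART 1 and PART 2 do, and the two DeMorgan's Laws are then tested on one concrete pair of sets. The names sets_equal, complement, and universe are hypothetical, and the universe {0, 1, 2, 3, 4, 5} is an assumption: complements only make sense relative to some fixed universal set.

```python
def sets_equal(A, B):
    """Decide A = B the way the template does: A ⊆ B and B ⊆ A."""
    part_1 = all(x in B for x in A)   # PART 1: every x in A is in B
    part_2 = all(x in A for x in B)   # PART 2: every x in B is in A
    return part_1 and part_2


def complement(S, universe):
    """Complement of S relative to the assumed universal set."""
    return universe - S


universe = frozenset(range(6))
A = frozenset({0, 1, 2})
B = frozenset({2, 3})

# (A ∪ B)^c = A^c ∩ B^c
first_law = sets_equal(complement(A | B, universe),
                       complement(A, universe) & complement(B, universe))

# (A ∩ B)^c = A^c ∪ B^c
second_law = sets_equal(complement(A & B, universe),
                        complement(A, universe) | complement(B, universe))

print(first_law, second_law)
```

Checking one pair of sets proves nothing in general, but working a concrete case by hand (here both sides of the first law come out to {4, 5}) is a good way to see which definition must be invoked at each step of the proof.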

2.3.6 Exercises
Given that A, B, and C are sets, write proofs for each of the following statements.
1. A ∩ B = B ∩ A.
2. A ∩ (B \ A) = ∅.
3. (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B).
4. (A ∪ B) ∪ C = A ∪ (B ∪ C).
5. (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C).
6. (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
7. A \ (B ∪ C) = (A \ B) ∩ (A \ C).

2.4 Proofs About Even and Odd Integers


2.4.1 Definitions of Even and Odd Integers
You are already very familiar with the natural numbers, N = {1, 2, 3, 4, ...},
which are sometimes called the counting numbers or whole numbers. By adding
zero and the negative natural numbers to this set, one obtains the integers, Z =
{..., −3, −2, −1, 0, 1, 2, 3, ...}. The natural numbers are often referred to as the
positive integers. Much of a student's first study of mathematics is concerned
with these two sets of numbers. By a very young age most people are already
familiar with even and odd integers and some of their properties. This section will
construct proofs of some of these properties both because the student will feel very
comfortable with the concepts and because it allows for the introduction of some
basics about how to write proofs.
Before proceeding with proofs, though, it is necessary that there is agreement on
the definitions of even and odd integers. Indeed, there are many possible definitions
of even integers:
n ∈ Z is an even integer if
the decimal representation of n has a ones digit equal to 0, 2, 4, 6, or 8.
n is either 0 or the prime factorization of n contains a factor of 2.
there is an integer k such that n = 2k.
i^n is a real number, where i = √(−1).
the number (−1)^n is positive.
9^n ≡ 1 (mod 10).
sin(nπ/2) = 0.
n/2 ∈ Z.
Which of these definitions should be used when writing proofs about even and
odd integers? Actually, since all the definitions are equivalent, one could adopt
any one of these definitions and then prove theorems that show that all the other
definitions are equivalent to the chosen definition. This is not an unusual situation
in mathematics, especially for a concept as elementary as even integers. But it turns
out that one of these definitions is particularly well suited for writing proofs, and
that is, "n ∈ Z is even if there is a k ∈ Z such that n = 2k." This makes a useful
definition because it provides a fairly easy way to check whether a given integer is
even, and because knowing that a number n is even immediately gives you a number
k for which n = 2k, and that is a powerful tool for proving facts about even integers.
For this reason, this chosen definition is called the working definition, that is, it
is the definition easiest to apply in a wide variety of contexts. It is the definition
chosen from which all other properties of even numbers can be derived.
A similar discussion could take place about how to define odd integers. The
working definition is that n ∈ Z is odd if there is a k ∈ Z such that n = 2k + 1.
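To see the working definitions side by side with two of the alternative definitions, here is a small Python sketch offered as an illustration only. The function names are hypothetical. The working-definition tests search for the integer k directly: if n = 2k for an integer k, then k can only be the floor of n/2, so that single candidate is tried. The sign test uses (−1)^|n| in place of (−1)^n so that the computation stays within the integers; that substitution is an assumption made for convenience and does not change the sign.

```python
def is_even_working(n):
    """Working definition: there is an integer k with n == 2*k."""
    k = n // 2                 # the only possible candidate for k
    return n == 2 * k


def is_odd_working(n):
    """Working definition: there is an integer k with n == 2*k + 1."""
    k = (n - 1) // 2           # the only possible candidate for k
    return n == 2 * k + 1


def is_even_ones_digit(n):
    """Alternative definition: the ones digit is 0, 2, 4, 6, or 8."""
    return abs(n) % 10 in (0, 2, 4, 6, 8)


def is_even_sign_power(n):
    """Alternative definition: (-1)^n is positive (computed as (-1)^|n|)."""
    return (-1) ** abs(n) > 0


# Compare the three tests on a range of integers, negative ones included.
definitions_agree = all(
    is_even_working(n) == is_even_ones_digit(n) == is_even_sign_power(n)
    for n in range(-50, 51)
)
print(definitions_agree)
```

Agreement on a range of integers is, of course, not a proof that the definitions are equivalent; each equivalence would need its own theorem, as the text points out.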
There is a long list of facts you could prove about even and odd numbers.
Facts About Even and Odd Integers
Every integer is either even or odd.
No integer is both even and odd.
n ∈ Z is even if and only if n + 1 ∈ Z is odd.
The sum of any two even integers is even.
The sum of any two odd integers is even.
The sum of an even integer and an odd integer is odd.
The product of two odd integers is odd.
The product of two integers is odd only if both of the factors are odd.

Together, the first two of these facts say that each integer is either even or odd
but not both. This says that the sets of even and odd integers form a partition of
Z, that is, the sets are disjoint and the union of the sets is all of Z. Some authors
require that all the sets of a partition be nonempty as in the case with even and
odd integers. So why is it that every integer is either even or odd? This depends
on the Division Algorithm that states that if m, n ∈ Z with n > 0, then there are
unique q, r ∈ Z with 0 ≤ r < n such that m = nq + r. In this case q is called
the quotient of the division, and r is called the remainder of the division. Using
the Division Algorithm, any integer m can be divided by 2 giving a quotient and
remainder where the remainder is either 0 or 1. If the remainder is 0, then m = 2q
for integer q implying that m is even, and if the remainder is 1, then m = 2q + 1 for
integer q implying that m is odd.
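The Division Algorithm can be tried out numerically. In the following Python sketch, included as an illustration, the function names division_algorithm and parity are hypothetical. The sketch relies on the fact that Python's built-in divmod uses floor division, so for a positive divisor n the remainder it returns always satisfies 0 ≤ r < n, even when m is negative.

```python
def division_algorithm(m, n):
    """Return (q, r) with m == n*q + r and 0 <= r < n, for integers m and n > 0."""
    if n <= 0:
        raise ValueError("the divisor n must be positive")
    q, r = divmod(m, n)        # floor division keeps r in the range 0 <= r < n
    return q, r


def parity(m):
    """Classify m as even or odd from its remainder on division by 2."""
    q, r = division_algorithm(m, 2)
    return "even" if r == 0 else "odd"


for m in (17, -7, 0):
    q, r = division_algorithm(m, 2)
    print(m, "= 2 *", q, "+", r, "so", m, "is", parity(m))
```

Note the negative case: −7 = 2 · (−4) + 1, so the quotient is −4 rather than −3, which is what keeps the remainder nonnegative and makes −7 odd by the working definition.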

2.4.2 Proofs About Even and Odd Integers


How can these ideas be used to write a good proof of "Every integer is either even
or odd"? First it is easier to reword the statement as "If m ∈ Z, then either m is even
or m is odd." This is a conditional statement, so the natural way to begin a proof
is to assume that the hypothesis of the statement is satisfied, that is, that m is an
integer. Now apply the Division Algorithm to get the quotient q and remainder r
guaranteed by the algorithm. Finally, the value of r shows that m either satisfies the
definition of being an even integer or the definition of being an odd integer. The
result would be
PROOF: Every integer is either even or odd.
Let m be an integer.
By the Division Algorithm there are integers q and r with 0 ≤ r < 2 such
that m = 2q + r.
If r = 0, then m = 2q for integer q which means that m satisfies the
definition for being even.
If r = 1, then m = 2q + 1 for integer q which means that m satisfies the
definition for being odd.
Since r must be either 0 or 1, it follows that every integer is either even
or odd.
Next consider how to prove the statement "The sum of any two odd integers is
even." The statement concerns the sum of any two odd integers, so the proof reader
would expect the proof to consider two arbitrarily chosen odd integers. Once two
odd integers are chosen, the definition of odd integer should be invoked because, at
that point, that is the only information that is known about the two integers. Finally,
a little algebra will help to show that the sum of these two odd integers satisfies the
definition of even integer. Here is an attempt to write such a proof that makes several
common proof writing errors.
PROOF ATTEMPT: The sum of any two odd integers is even.
The two integers are odd, so each has the form 2k + 1.
The sum of these two integers is (2k + 1) + (2k + 1) = 4k + 2.
k could be even or odd.
The number 2 is even since it is 2 · 1.
4k is even since it is 2 · 2k.
The sum of two even numbers is even, so the sum of 4k and 2 is an even
number.
Therefore, the sum of two odd integers is always even.

Here are some complaints about the above proof attempt.


The proof begins talking about two integers, but the proof reader has not yet
been introduced to these integers and does not know what two integers are being
discussed. The proof is missing a SET THE CONTEXT sentence to introduce
the idea of starting with any two odd integers.

The proof uses the variable k without introducing what that variable represents.
The proof requires that k be an integer, but the fact that k is an integer is not stated
anywhere. As far as the proof reader knows, k could be any complex number.
Later, the proof claims that 2k is an integer which is needed to show 4k is an
even integer. Without knowing that k is an integer, it does not follow that 2k is
also an integer.

The definition of odd integer allows you to take an odd integer and represent it as
2k + 1, where k is another integer. To apply this definition, then, the proof should
start with an odd integer, say m, and then represent it as 2k + 1 rather than starting
with 2k + 1. The subtle point is that one should start with an odd integer and use
its definition to move on to 2k + 1 rather than starting with 2k + 1 which jumps
the gun. The reader of the proof could wonder whether 2k + 1 could represent
a generic odd integer. Well, it can, but this takes some thought which can be
avoided by starting with an odd integer m and then using the definition of odd to
select the integer k such that m = 2k + 1.
The definition of odd integer does refer to 2k + 1, but it is more precise. It
does not say "has the form." It says that there is an integer k such that the odd
number equals 2k + 1.

It is a major error to allow both odd integers to equal 2k + 1 for the same number
k. The only way this can happen is for the two odd integers themselves to be
equal. Thus, this proof only applies to a small subset of cases where one adds
two identical odd integers together such as 3 + 3 or 117 + 117.

The statement "k could be even or odd" is certainly correct, but it does not
contribute to the proof. It is a statement about items in the proof that is not part
of the proof. Occasionally, one will make a definition as part of a long proof,
and then give some examples to help the reader understand that definition. But
if a statement is not needed either as a critical step in a proof or an important
illustration to aid the understanding of the proof, then the statement should be
left out of the proof because it distracts from the proof and complicates it.

The statement "The sum of two even numbers is even" is correct, but it has not
been proved yet, at least in this text, and is equivalent in difficulty to proving the
corresponding statement about the sum of odd integers. Thus, it is not appropriate
to use the result about sums of even integers to prove one about the sum of odd
integers.

Considering these ideas, one can construct a better proof.


PROOF: The sum of any two odd integers is even.
Let m and n be two odd integers.
From the definition of odd integer, there is an integer k_1 such that m =
2k_1 + 1 and an integer k_2 such that n = 2k_2 + 1.
Then m + n = (2k_1 + 1) + (2k_2 + 1) = 2(k_1 + k_2 + 1).
Since k_1 and k_2 are integers, so is k_1 + k_2 + 1.
Thus, the sum m + n is equal to twice an integer, so by the definition of even
integer, m + n is even.
Therefore, the sum of any two odd integers is always even.
The form of this proof can be copied almost word for word to get a similar proof
of the statement "The product of two odd integers is odd."


PROOF: The product of any two odd integers is odd.
Let m and n be two odd integers.
From the definition of odd integer, there is an integer k_1 such that m =
2k_1 + 1 and an integer k_2 such that n = 2k_2 + 1.
Then mn = (2k_1 + 1)(2k_2 + 1) = 4k_1 k_2 + 2k_1 + 2k_2 + 1 =
2(2k_1 k_2 + k_1 + k_2) + 1.
Since k_1 and k_2 are integers, so is 2k_1 k_2 + k_1 + k_2.
Thus, the product mn is equal to one more than twice an integer, so by the
definition of odd integer, mn is odd.
Therefore, the product of any two odd integers is always odd.
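The algebra in the two proofs above produces explicit witnesses: the sum of two odd integers is twice k_1 + k_2 + 1, and their product is one more than twice 2k_1 k_2 + k_1 + k_2. The Python sketch below, added as an illustration with hypothetical function names, checks both formulas over a range of values of k_1 and k_2. It also shows why the earlier proof attempt fails: using one k for both odd integers only ever produces pairs of equal integers.

```python
def odd_from(k):
    """The odd integer 2k + 1 determined by the integer k."""
    return 2 * k + 1


def sum_witness(k1, k2):
    """Integer j with (2*k1 + 1) + (2*k2 + 1) == 2*j."""
    return k1 + k2 + 1


def product_witness(k1, k2):
    """Integer j with (2*k1 + 1) * (2*k2 + 1) == 2*j + 1."""
    return 2 * k1 * k2 + k1 + k2


def witnesses_work(limit):
    """Check both witness formulas for all k1, k2 between -limit and limit."""
    for k1 in range(-limit, limit + 1):
        for k2 in range(-limit, limit + 1):
            m, n = odd_from(k1), odd_from(k2)
            if m + n != 2 * sum_witness(k1, k2):
                return False
            if m * n != 2 * product_witness(k1, k2) + 1:
                return False
    return True


# The flawed attempt uses the same k twice, so it only covers pairs (m, m).
flawed_pairs = {(odd_from(k), odd_from(k)) for k in range(-10, 11)}
covers_3_and_5 = (3, 5) in flawed_pairs

print(witnesses_work(10), covers_3_and_5)
```

The pair (3, 5) never appears among the pairs generated with a single k, which is exactly the "major error" described in the complaints about the proof attempt.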

2.4.3 Exercises
Write proofs for each of the following statements.
1. The sum of any two even integers is even.
2. The product of an even integer and an odd integer is even.
3. The difference of an even integer and an odd integer is odd.
4. If the product of two integers is odd, then both of the integers must have been odd.
5. The sum of any four consecutive integers is even.

2.5 Basic Facts About Real Numbers


2.5.1 Ordered Fields
Many of the theorems of Calculus involve properties of the real numbers. Some
of these properties are subtle, so it is essential to understand this important set of
numbers. Already introduced are the sets of natural numbers, N, and the integers, Z.
Also of importance is the set of rational numbers, Q = {m/n | m, n ∈ Z, n ≠ 0}. This
definition comes with the understanding that the two rational numbers m/n and a/b are
equal whenever mb = na. Thus, there are always infinitely many representations for
each rational number. For all rational numbers r ≠ 0, one can always find relatively
prime integers m and n with n > 0 such that r = m/n. Together with an agreement to
write the rational number 0 as 0/1, each rational number has a unique lowest terms
representation.
The set of rational numbers is more than a set of fractions with integers for
numerators and denominators. It also comes with the two binary operations of
addition (+) and multiplication (·) and with the order relation less than (<).


The binary operations satisfy conditions which make Q into a field. A field F
is a set with operations of addition and multiplication that satisfies the following
axioms.
Axioms for a Field F
A set F together with the binary operations of addition (+) and multiplication
(·) form a field if F contains the two elements 0 and 1 with 0 ≠ 1 such that for
every r, s, t ∈ F

the Closure Properties:
    r + s ∈ F and r = s → r + t = s + t
    r · s ∈ F and r = s → r · t = s · t
the Associative Properties:
    (r + s) + t = r + (s + t)
    (r · s) · t = r · (s · t)
the Commutative Properties:
    r + s = s + r
    r · s = s · r
the Identity Properties:
    r + 0 = r
    r · 1 = r
the Inverse Properties:
    There exists −r ∈ F such that r + (−r) = 0
    If r ≠ 0, there exists 1/r ∈ F such that r · (1/r) = 1
the Distributive Law of Multiplication Over Addition:
    r · (s + t) = r · s + r · t


Notice that the rational numbers do satisfy the eleven field axioms. One defines the
operation subtraction (−) by r − s = r + (−s) and the operation division (÷) for
s ≠ 0 by r ÷ s = r · (1/s) = r/s. Moreover, the field Q together with the less than order
relation is an ordered field that obeys the following axioms.
Axioms for an Ordered Field F
A field F is an ordered field with order relation < if for every r, s, t ∈ F

the Trichotomy Property:
    exactly one of the following holds: r < s, r = s, s < r
the Transitive Property:
    r < s and s < t imply r < t
the Addition Property of Less Than:
    r < s implies r + t < s + t
the Multiplication Property of Less Than:
    r < s and 0 < t imply r · t < s · t


Notice that the rational numbers do satisfy the four ordered field axioms. One
defines the other order relations of greater than (>), greater than or equal to
(≥), and less than or equal to (≤) in the obvious ways, that is, r > s whenever
s < r, r ≥ s whenever either r > s or r = s, and r ≤ s whenever either r < s or
r = s.
There are many other ordered fields, and it is constructive to consider how to
justify the fifteen ordered field axioms for a different ordered field. For example,
the set T = {r + s√2 | r, s ∈ Q} is an ordered field using the usual addition and
multiplication operations. For two elements a + b√2 and c + d√2 in T, define
addition as (a + b√2) + (c + d√2) = (a + c) + (b + d)√2 and multiplication
as (a + b√2) · (c + d√2) = (ac + 2bd) + (ad + bc)√2. To define the less than
relation you would want (a + b√2) < (c + d√2) whenever a − c < (d − b)√2
which can be checked by squaring both a − c and (d − b)√2, although you will
need to also consider the signs of a − c and d − b. Thus, the definition becomes
(a + b√2) < (c + d√2) if one of the following holds:
a − c ≤ 0 and 0 ≤ d − b, with a − c and d − b not both zero,
0 < a − c, 0 < d − b, and (a − c)^2 < 2(d − b)^2, or
a − c < 0, d − b < 0, and (a − c)^2 > 2(d − b)^2.
It is fairly easy to check that T is an ordered field. The only field axiom which
does not follow immediately from the properties of rational numbers is the inverse
axiom for multiplication. You should verify that for a + b√2 ≠ 0, its multiplicative
inverse is
a/(a^2 − 2b^2) + (−b/(a^2 − 2b^2))√2
which is in T. The order axioms take more work to verify due to the complicated
definition of less than. For example, to verify the less than relation works correctly
with addition, one would begin with three elements of T, a + b√2, c + d√2, and
e + f√2 where it is given that a + b√2 < c + d√2. One needs to compare
(a + b√2) + (e + f√2) with (c + d√2) + (e + f√2). To do this, one compares the
values of (a + e) − (c + e) = a − c and (d + f) − (b + f) = d − b. But this reduces to
comparing a − c and d − b which are known to satisfy the correct conditions because
a + b√2 < c + d√2 was given.
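The arithmetic of T can be carried out exactly on a computer because only the rational coefficients a and b need to be stored. The following Python sketch is an illustration under stated assumptions, not a definitive implementation: the class name TElement and its method names are hypothetical, the coefficients are held as exact fractions using Python's standard Fraction type, and the comparison handles the boundary cases in which a − c or d − b is zero explicitly, which is an assumption about how the definition of less than is meant to treat those cases. No square roots are ever computed; the comparison works by comparing squares, as described above.

```python
from fractions import Fraction


class TElement:
    """An element a + b*sqrt(2) of T, with rational coefficients a and b."""

    def __init__(self, a, b=0):
        self.a = Fraction(a)
        self.b = Fraction(b)

    def __eq__(self, other):
        return self.a == other.a and self.b == other.b

    def __add__(self, other):
        # (a + b√2) + (c + d√2) = (a + c) + (b + d)√2
        return TElement(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return TElement(self.a * other.a + 2 * self.b * other.b,
                        self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b√2) = a/(a² − 2b²) + (−b/(a² − 2b²))√2
        denom = self.a ** 2 - 2 * self.b ** 2
        if denom == 0:          # for rational a, b this happens only when a = b = 0
            raise ZeroDivisionError("zero has no multiplicative inverse")
        return TElement(self.a / denom, -self.b / denom)

    def __lt__(self, other):
        # self < other exactly when p < q√2, where p = a − c and q = d − b.
        p = self.a - other.a
        q = other.b - self.b
        if p <= 0 and q >= 0:
            return p != 0 or q != 0      # false only when the elements are equal
        if p > 0 and q > 0:
            return p * p < 2 * q * q
        if p < 0 and q < 0:
            return p * p > 2 * q * q
        return False                     # p >= 0 and q <= 0, not covered above


x = TElement(1, 1)                       # 1 + √2
one = x * x.inverse()
print(one == TElement(1, 0))
print(TElement(1, 0) < TElement(0, 1))   # is 1 < √2 ?
print(TElement(3, 0) < TElement(0, 2))   # is 3 < 2√2 ?
```

As a spot check of the Addition Property of Less Than, one can pick any third element z and confirm that x < y and x + z < y + z give the same answer; the reason they must agree is that adding z changes neither a − c nor d − b.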
Every ordered field satisfies a long list of simple properties that you will
associate with facts learned in Arithmetic and Algebra. Here are some of those
properties.


Some Properties Obeyed By All Ordered Fields

Let r, s, t all be elements of ordered field F. Then
1. r · 0 = 0.
2. If r + t = s + t, then r = s.
3. If r · t = s · t and t ≠ 0, then r = s.
4. −(−r) = r.
5. If r ≠ 0, then 1/(1/r) = r.
6. −r = −s if and only if r = s.
7. −r = (−1) · r.
8. (−r) + (−s) = −(r + s).
9. (−r) · (−s) = r · s.
10. If r < s and t < 0, then s · t < r · t.
11. If r ≠ 0, then r^2 > 0.
12. 0 < 1.
13. If 0 < r, then 0 < 1/r.
14. If 0 < r < s, then 0 < 1/s < 1/r.
15. If n is any natural number and r > 1, then r^n < r^(n+1).

The reader may wish to prove some of these properties by applying the axioms.
This book will not dwell on these proofs since the techniques used in proving them
are not essential for writing most proofs in Analysis. Two simple proofs are given
here as examples.
PROOF: If r is any element of the field F, then r · 0 = 0.
Let r be an element of field F.
Since 0 is the additive identity of F, 0 = 0 + 0.
Then r · 0 = r · (0 + 0).
By the Distributive Law, r · 0 = r · 0 + r · 0.
By adding −(r · 0) to each side of this equality, one gets 0 = r · 0 − r · 0 =
(r · 0 + r · 0) − r · 0 = r · 0 + (r · 0 − r · 0) = r · 0 + 0 = r · 0.
Therefore, for any r ∈ F, r · 0 = 0.

The next theorem essentially says that if (−1) · r has the same properties as −r, it
must equal −r.


PROOF: If r is any element of the field F, then −r = (−1) · r.

Let r be an element of field F. Then
(−1) · r = (−1) · r + 0                   Additive Identity
         = (−1) · r + (r + (−r))          Additive Inverses
         = ((−1) · r + r) + (−r)          Associative Law of Addition
         = ((−1) · r + 1 · r) + (−r)      Multiplicative Identity
         = ((−1) + 1) · r + (−r)          Distributive Law
         = 0 · r + (−r)                   Additive Inverses
         = 0 + (−r)                       r · 0 = 0
         = −r                             Additive Identity

Therefore, (−1) · r = −r for every r ∈ F.


Note that every ordered field F will contain a copy of Q. This follows since
0, 1 ∈ F, and if n is a natural number in F, then n + 1 ∈ F. Thus, it follows by
mathematical induction that n ∈ F for all n ∈ N. Moreover, since 0 < 1, it follows
for each n ∈ N that n = n + 0 < n + 1 showing that all natural numbers are
distinct elements of F. The existence of the negatives of all numbers in F implies
that the integers form a subset of F, and the existence of reciprocals implies that all of
Q lies in F. There are fields which are not ordered fields, and some of them do not
contain copies of Q. Indeed, there are finite fields as well as infinite fields that do
not contain Q or even N.

2.5.2 The Completeness Axiom and the Real Numbers


There are infinitely many ordered fields. The set of real numbers, $\mathbb{R}$, is special because
it includes every number that is considered a possible distance from zero, either
positive, negative, or zero. An easy way to ensure that $\mathbb{R}$ contains every possible
distance is to require it to satisfy the Completeness Axiom. This axiom considers
nonempty subsets of an ordered field $F$ (actually, any ordered set would do). A
subset $S \subseteq F$ is said to be bounded above if there is an $M \in F$ such that all $x \in S$
satisfy $x \le M$. In this case, $M$ is called an upper bound of $S$. Similarly, $S \subseteq F$ is
bounded below by lower bound $K \in F$ if all $x \in S$ satisfy $x \ge K$. If $S \subseteq F$ is both
bounded above and bounded below, then $S$ is said to be bounded. If $M$ is an upper
bound for a set $S$, and it is less than or equal to every upper bound of $S$, then $M$ is
the least upper bound of $S$. Similarly, if $K$ is a lower bound for a set $S$, and it is
greater than or equal to every lower bound of $S$, then $K$ is the greatest lower bound
of $S$. For example, if $S$ is the interval $[1, 5) = \{x \mid 1 \le x < 5\}$, then $10$, $6$, and $\sqrt{30}$
are all upper bounds of $S$, but $5$ is the least upper bound of $S$. Also, $-2$, $0$, and $\frac{1}{2}$
are all lower bounds of $S$, but $1$ is the greatest lower bound of $S$. One often uses the
notation l.u.b.$(S)$ or $\sup(S)$ to represent the least upper bound or supremum of $S$
and g.l.b.$(S)$ or $\inf(S)$ to represent the greatest lower bound or infimum of $S$.
Axioms for the Real Numbers
The real numbers, $\mathbb{R}$, is an ordered field that satisfies The Completeness
Axiom:
Every nonempty set $S \subseteq \mathbb{R}$ which is bounded above has a least upper bound
in $\mathbb{R}$.
Note, for example, that the set $S = \{x \in \mathbb{Q} \mid x^2 < 7\}$ is a nonempty subset of $\mathbb{Q}$
which is bounded above by $4$, $3$, and $2.7$, but there is no element of $\mathbb{Q}$ which is a
least upper bound of $S$. The set of real numbers, though, does contain a least upper
bound of $S$, namely $\sqrt{7}$. The Completeness Axiom is sometimes called the Least
Upper Bound Principle. The Completeness Axiom comes up frequently in proofs
about the real numbers to show that numbers with particular properties exist. For
example, consider the two theorems, the Archimedian Principle and the Existence
of Square Roots. Both of these theorems are easily understood, but they cannot be
proved without using the Completeness Axiom.
The Archimedian Principle states that for every real number r there is a natural
number greater than r. It can be proved using a proof by contradiction. The proof
makes the assumption that there is a real number greater than every natural number
and uses this to derive a contradiction, a statement that is false. Because one cannot
derive a false statement from a true statement, the assumption most recently made
in the proof must be a false statement, and you can conclude that no real number
exists that is greater than every natural number.
PROOF (Archimedian Principle): If $r \in \mathbb{R}$, then there exists $n \in \mathbb{N}$ such
that $r < n$.
Suppose that there is an $r \in \mathbb{R}$ such that $r > n$ for every $n \in \mathbb{N}$.
Then the set $\mathbb{N}$ is a nonempty subset of $\mathbb{R}$ with an upper bound, so by the
Completeness Axiom, $\mathbb{N}$ has least upper bound $M$.
Then $M - 1 < M$, so $M - 1$ is not an upper bound for $\mathbb{N}$.
Thus, there is a $k \in \mathbb{N}$ with the property that $k > M - 1$.
But then $k + 1$ is also in $\mathbb{N}$, yet $k + 1 > (M - 1) + 1 = M$ where $M$ is an
upper bound for $\mathbb{N}$.
This is a contradiction since no element of a set can be greater than an upper
bound for that set.
Therefore, the assumption that $r > n$ for every $n \in \mathbb{N}$ must be false, and for
every $r \in \mathbb{R}$ there must be at least one $n \in \mathbb{N}$ with $n > r$.


You may not have ever doubted that every nonnegative real number has a square
root, but this is a fact that can be proved using the axioms for the real numbers. It is
a nice application of both the Trichotomy Property and the Completeness Axiom.
Given a positive real number, $r$, the proof constructs the set $S = \{x \in \mathbb{R} \mid x^2 \le r\}$
and then uses the Completeness Axiom to exhibit a value, $s$, equal to the least upper
bound of $S$. Then it shows that $s^2$ cannot be greater than $r$ and cannot be less than $r$,
so by the Trichotomy Property, $s^2$ must equal $r$.
In particular, the proof first assumes that $s^2 > r$ and shows that there is a number
$y > 0$ such that the square of $s - y$ is also greater than $r$. This shows that $s - y$ is an
upper bound for $S$ which contradicts the fact that $s$ is the least upper bound of $S$. The
proof magically suggests that $y = \frac{s^2 - r}{4s}$ works. Where did this magical expression
for $y$ come from? It came from considering what property you would want such a $y$
to have. If you want $(s - y)^2 > r$, this suggests that you want $s^2 - 2sy + y^2 > r$. This
inequality is quadratic in $y$ and has an unnecessarily messy solution. But one of the
most important lessons about writing proofs in Analysis is that one can often be a
little sloppy when trying to show that an inequality holds. Here, for example, rather
than finding a $y$ such that $s^2 - 2sy + y^2 > r$, it would be sufficient to find a $y$ such
that $s^2 - 2sy > r$, because if $s^2 - 2sy > r$, then certainly the needed $s^2 - 2sy + y^2 > r$
also holds. The advantage of making this change is that the inequality $s^2 - 2sy > r$
is very easy to solve for $y$ yielding $y < \frac{s^2 - r}{2s}$. Thus, the value $y = \frac{s^2 - r}{4s}$ ought to work
fine, and, hence, the magic is demystified. Of course, there are many other possible
values of $y$ that would also have worked in this proof, but only one value for $y$ is
needed.
After showing that $s^2 > r$ cannot be true, the proof assumes that $s^2 < r$ and
shows that there is a number $y > 0$ such that the square of $s + y$ is less than $r$.
This shows that $s + y$ is in $S$ which contradicts the fact that $s$ is an upper bound of
$S$. Again, the proof just suggests setting $y = \frac{r - s^2}{4s}$. Can you figure out where this
expression for $y$ came from? Indeed, the calculation is similar to the one above. You
need $(s + y)^2 \le r$, so $s^2 + 2sy + y^2 \le r$. It is simpler if $y^2$ could be replaced by
$2sy$. If you assume $y \le 2s$, it allows you to conclude $y^2 \le 2sy$ so that $(s + y)^2 =
s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy$. You then want a $y$ that satisfies
$s^2 + 4sy \le r$. Thus, the value $y = \frac{r - s^2}{4s}$ gives the needed value of $y$ (Fig. 2.3). Putting
these ideas together gives the following proof.

Fig. 2.3 Showing the least upper bound of $S$ is $s = \sqrt{r}$

PROOF (Existence of Square Roots): If $r \in \mathbb{R}$ and $r \ge 0$, then there exists
an $s \in \mathbb{R}$ such that $s^2 = r$.
Let $r \ge 0$ be a real number.
If $r = 0$, then $0^2 = r$ and $0$ satisfies the needed condition.
So assume that $r > 0$.
Let $S = \{x \in \mathbb{R} \mid x^2 \le r\}$.
$S$ is nonempty since $0 \in S$.
$S$ is bounded above by $r + 1$ since $x > (r + 1)$ implies $x^2 > r^2 + 2r + 1 > r$
so $x \notin S$.
Thus, by the Completeness Axiom, $S$ has a least upper bound $s$.
Note that $s > 0$: the number $x = \min(1, r)$ is positive and satisfies $x^2 \le x \le r$,
so $x \in S$ and $s \ge x > 0$.
By the Trichotomy Property, either $s^2 > r$, $s^2 < r$, or $s^2 = r$.

If $s^2 > r$, note that $y = \frac{s^2 - r}{4s} > 0$, and $(s - y)^2 = s^2 - 2sy + y^2 > s^2 - 2sy =
s^2 - 2s \cdot \frac{s^2 - r}{4s} = \frac{s^2 + r}{2} > r$. Because $(s - y)^2 > r$, it follows that $s - y < s$ is an
upper bound of $S$. This contradicts the fact that $s$ is the least upper bound of
$S$. Therefore, $s^2 > r$ must be false.

If $s^2 < r$, let $y$ be the smaller of $2s$ and $\frac{r - s^2}{4s}$. Then $y > 0$, and
$(s + y)^2 = s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy \le s^2 + 4s \cdot \frac{r - s^2}{4s} = r$.
Because $(s + y)^2 \le r$, it follows that $s + y \in S$ and $s + y > s$. This contradicts
the fact that $s$ is an upper bound of $S$. Therefore, $s^2 < r$ must be false.
Thus, it must be true that $s^2 = r$ which proves that for every real number
$r \ge 0$ there is an $s \in \mathbb{R}$ with $s^2 = r$.
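The two adjustment steps used in this proof can be spot-checked with exact arithmetic. The following is a minimal Python sketch and not part of the original text: the names `step_down` and `step_up` are hypothetical, and `fractions.Fraction` is assumed so that the inequalities are tested without rounding error.

```python
from fractions import Fraction

def step_down(s, r):
    """Case s^2 > r: subtract y = (s^2 - r) / (4s) from s."""
    y = (s * s - r) / (4 * s)
    return s - y

def step_up(s, r):
    """Case s^2 < r: add y = min(2s, (r - s^2) / (4s)) to s."""
    y = min(2 * s, (r - s * s) / (4 * s))
    return s + y

r = Fraction(7)
too_big = Fraction(3)      # 3^2 = 9 > 7
too_small = Fraction(2)    # 2^2 = 4 < 7

smaller_bound = step_down(too_big, r)   # still has square greater than r
larger_member = step_up(too_small, r)   # still has square at most r
```

With $r = 7$, the candidate $3$ is lowered to $\frac{17}{6}$, whose square still exceeds $7$, and the candidate $2$ is raised to $\frac{19}{8}$, whose square is still at most $7$. This mirrors why neither $s^2 > r$ nor $s^2 < r$ can hold for the least upper bound.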

2.5.3 Absolute Value, the Triangle Inequality, and Intervals


The concept that separates the area of Mathematics known as Analysis from other
branches such as Algebra, Topology, Set Theory, and Combinatorics is the idea
of distance. In the real numbers, one can measure distance by using the absolute
value function which is defined as
\[ |x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 \end{cases} \]
For a real number $x$, the
absolute value of $x$ can be thought of as the distance that $x$ is from the real number
$0$. Note that for all $x \in \mathbb{R}$ it holds that $-|x| \le x \le |x|$. If $k > 0$, then the set
$\{x \mid |x| < k\}$ is the same as the set $\{x \mid -k < x < k\}$. Similarly, the set $\{x \mid |x| > k\}$
is the same as the set $\{x \mid -k > x \text{ or } x > k\}$.
The distance between two real numbers $x$ and $y$ can be defined as $|x - y|$. Note
that this distance is positive unless $x = y$.
One property of the absolute value function used frequently in proofs in Analysis
is the triangle inequality which states that for all $x, y \in \mathbb{R}$, $|x + y| \le |x| + |y|$. The
name of this inequality comes from geometry where it is known that the sum of the
lengths of two sides of a triangle always exceeds the length of the third side of the
triangle (Fig. 2.4).

Fig. 2.4 Triangle inequality

One simple proof of the triangle inequality is
PROOF (Triangle Inequality): $|x + y| \le |x| + |y|$
Let $x$ and $y$ be elements of $\mathbb{R}$.
Then $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$.
Adding these inequalities yields $-(|x| + |y|) \le x + y \le (|x| + |y|)$.
This last inequality is equivalent to $|x + y| \le |x| + |y|$.
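The sandwich step in this proof can be sanity-checked on a small grid of rational values. This is only an illustrative Python sketch under stated assumptions (the helper name `sandwich_holds` and the sample grid are hypothetical), and a finite check of this kind does not replace the proof.

```python
from fractions import Fraction
from itertools import product

def sandwich_holds(x, y):
    """Check -(|x| + |y|) <= x + y <= |x| + |y| for one pair (x, y)."""
    bound = abs(x) + abs(y)
    return -bound <= x + y <= bound

# Sample grid of thirds between -3 and 3 (an arbitrary choice for illustration).
samples = [Fraction(n, 3) for n in range(-9, 10)]

all_pairs_ok = all(sandwich_holds(x, y) for x, y in product(samples, repeat=2))
triangle_ok = all(abs(x + y) <= abs(x) + abs(y) for x, y in product(samples, repeat=2))
```

Both checks agree on every sampled pair, which reflects the equivalence used in the last line of the proof.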
A subset $S$ contained in $\mathbb{R}$ is called connected if it has the property that for any
two numbers in $S$, all the numbers between those two numbers are also in $S$. More
precisely, $S$ is connected if for all $x, y \in S$ with $x < y$, it follows that $z \in S$ for all $z$
with $x < z < y$. Informally, this means that there are no holes in the set $S$. Another
word for a connected set of real numbers is an interval. If $a < b$ are real numbers,
all of the following sets are intervals.
Intervals of Real Numbers

$\emptyset$    empty set
$\{a\} = [a, a]$    single point
$\{x \mid a < x < b\} = (a, b)$    open bounded interval
$\{x \mid a \le x \le b\} = [a, b]$    closed bounded interval
$\{x \mid a \le x < b\} = [a, b)$    bounded interval open on the right
$\{x \mid a < x \le b\} = (a, b]$    bounded interval open on the left
$\{x \mid a < x\} = (a, \infty)$    open right infinite interval
$\{x \mid x < b\} = (-\infty, b)$    open left infinite interval
$\{x \mid a \le x\} = [a, \infty)$    closed right infinite interval
$\{x \mid x \le b\} = (-\infty, b]$    closed left infinite interval
$\mathbb{R}$    entire real line


2.5.4 Exercises
1. Show that for any real number $x$ it follows that $|x| + |x - 6| \ge 6$.
2. Show that for any real number $x$, $|x - 1| + |x - 3| \ge 2$.
3. Show that for any real numbers $x$ and $y$ it follows that $|x^2 + 3x - 4y| + |x - 1 - 4y| \ge (x + 1)^2$.
4. Show that for any real numbers $x$ and $y$, $|x + y| + |x - y - 2| + |2x + 8| \ge 10$.
5. Show that for any real numbers $x$ and $y$ it follows that $|x| + |3x - 5y| + |5x - 4y| \ge |x + y|$.
6. Show that the intersection of any two intervals is always an interval.
7. Under what conditions is the union of two intervals an interval?

2.6 Functions
2.6.1 Function, Domain, Codomain
Intuitively, a function is a mapping that assigns to each point of some domain $A$ a
value that resides in some codomain $B$. This is usually written $f : A \to B$. More
precisely, the function $f$ is defined as a set of ordered pairs $(x, y)$ where each $x$ resides
in the domain $A$ of $f$ and each $y$ resides in the codomain $B$ of $f$, and for each $x \in A$
there is exactly one $y \in B$ such that $(x, y) \in f$. Since there is a unique ordered pair
$(x, y) \in f$ for each $x \in A$, $f$ associates or links the value of $y$ to the value of $x$ and
allows one to write $f(x) = y$.

2.6.2 Surjection
The domain of the function $f$ is exactly the set of all $x$ that are first coordinates of
the ordered pairs in $f$, that is, the domain is $A = \{x \mid (x, y) \in f\}$. The range of $f$ is
defined as the image of $f$, that is, the range is $\{y \mid (x, y) \in f\}$. Clearly, the codomain
of $f$ can be any set that contains the range of $f$. This can lead to some confusion
since the codomain of $f$ is not precisely defined. It is simply a convenience. When
one defines a function $f : \mathbb{R} \to \mathbb{R}$, one means that $f$ is defined for every real number,
and that for any $x \in \mathbb{R}$, the value $f(x)$ also lies in $\mathbb{R}$. This is the case whether
$\mathbb{R}$ is the range of $f$ or the range of $f$ is actually some proper subset of $\mathbb{R}$. It could
be difficult and unnecessary to calculate exactly which subset of $\mathbb{R}$ is the range of
$f$, so it might be easier to just give the codomain as $\mathbb{R}$ and avoid the technicalities
of figuring out just what values of $\mathbb{R}$ are in the range of $f$. For example, the function
$f(x) = 3x^6 - 15x^4 + 12x^3 + 25x^2 - 32x + 14$ maps the real numbers into the real
numbers, but to find the range of $f$, you would need to find the minimum value
of $f$. This minimum exists, but it may not be possible to give its value explicitly.


If the range of $f : A \to B$ is actually all of $B$, we say that $f$ maps $A$ onto $B$; in this case
$f$ is called surjective, and $f$ is called a surjection. Thus, one can prove that a function
is surjective by showing that each element of the codomain is in the range.
TEMPLATE for proving a function $f$ is surjective
SET THE CONTEXT: Make a statement introducing $f$, its domain $A$, and
its codomain $B$.
Select an arbitrary value $y \in B$.
Exhibit a value $x \in A$ such that $y = f(x)$.
STATE THE CONCLUSION: Therefore, $f$ is a surjection.
Note that the crucial step in proving that a function is surjective is showing the
existence of an $x$ with $f(x) = y$ and verifying that the $x$ is in the domain $A$ of the
function. For example, the function $f(x) = 5x^2 + 1$ is a surjection from the negative
real numbers onto the interval $(1, \infty)$. To prove this you would need to show that for
each real number $y > 1$ there is a negative real number $x$ for which $f(x) = y$. But this
just involves a simple algebraic manipulation. That is, if you need $5x^2 + 1 = y$, then
you can solve to get $5x^2 = y - 1$ and $x^2 = \frac{y - 1}{5}$. Here one needs to be careful because
it is easy to continue by writing $x = \sqrt{\frac{y - 1}{5}}$ which always results in a positive value
for $x$. The proof needs to exhibit a negative value for $x$, so it is important to set
$x = -\sqrt{\frac{y - 1}{5}}$. There is no need for the proof to display the steps of solving the
equation for $x$. The goal is to produce a value of $x \in A$ such that $f(x) = y$; how you
arrived at that $x$ is not important. It may be interesting, but it is not an essential part
of the proof, and, therefore, it should not be part of the proof.
PROOF: The function $f(x) = 5x^2 + 1$ is a surjection from the negative
real numbers onto the interval $(1, \infty)$.
Let $f(x) = 5x^2 + 1$.
For any $y > 1$ let $x = -\sqrt{\frac{y - 1}{5}}$.
Because $y > 1$, $\frac{y - 1}{5} > 0$, so $\sqrt{\frac{y - 1}{5}}$ is a positive real number and $x$ is a
negative real number.
Moreover, $f(x) = 5x^2 + 1 = 5\left(-\sqrt{\frac{y - 1}{5}}\right)^2 + 1 = 5 \cdot \frac{y - 1}{5} + 1 = y$.
Therefore, $f$ is a surjection.
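The choice of $x$ exhibited in this proof can be tried out numerically. The following is a minimal Python sketch, not part of the original text; the function name `preimage` is hypothetical, and floating-point arithmetic is assumed, so the round trip is only approximate for most inputs.

```python
import math

def f(x):
    """The function from the proof, f(x) = 5x^2 + 1."""
    return 5 * x * x + 1

def preimage(y):
    """Return the negative x with f(x) = y, for y in the interval (1, infinity)."""
    if y <= 1:
        raise ValueError("y must be greater than 1")
    # The minus sign matters: the domain is the negative real numbers.
    return -math.sqrt((y - 1) / 5)

checks = []
for y in (1.5, 6.0, 21.0, 501.0):
    x = preimage(y)
    checks.append(x < 0 and math.isclose(f(x), y))
```

Dropping the minus sign in `preimage` would still give `f(x) == y`, but the value would lie outside the domain, which is exactly the pitfall described before the proof.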

2.6.3 Injection
The definition of function requires that each value $x$ in the domain of $f$ is found in
exactly one ordered pair $(x, y) \in f$. The same does not have to hold for values in
the codomain, that is, one value $y$ in the codomain could appear in many ordered pairs
$(x, y) \in f$. For example, for the constant function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = 1$ for
all $x \in \mathbb{R}$, the value $1$ appears as the second coordinate in all the ordered pairs of
the function. If a function has the property that no value of $y$ appears as the second
coordinate of more than one ordered pair in $f$, then $f$ is said to be injective or, less
formally, one-to-one. In this case the function $f$ is called an injection. In
such a case, one sees that $f(x_1) = f(x_2)$ only if $x_1 = x_2$. This gives a procedure for
proving that a function is injective.
TEMPLATE for proving a function $f$ is injective
SET THE CONTEXT: Make a statement introducing $f$, its domain $A$, and
its codomain $B$.
Assume that for two values $x_1$ and $x_2$ in $A$ that $f(x_1) = f(x_2)$.
Show that $x_1 = x_2$.
STATE THE CONCLUSION: Therefore, $f$ is an injection.
For example, the function $f(x) = \sqrt{4x + 7}$ maps the positive real numbers to the
positive real numbers. It is not a surjection, but it is an injection. The proof would
require that you show that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$. Again, this is just an
algebraic manipulation.
PROOF: The function $f(x) = \sqrt{4x + 7}$ is an injection from the positive
real numbers to the positive real numbers.
Let $f(x) = \sqrt{4x + 7}$.
Assume that for positive real numbers $x_1$ and $x_2$, $f(x_1) = f(x_2)$.
Then $\sqrt{4x_1 + 7} = \sqrt{4x_2 + 7}$.
Squaring yields $4x_1 + 7 = 4x_2 + 7$, so $4x_1 = 4x_2$, and $x_1 = x_2$.
Therefore, $f$ is an injection.
If a function $f : A \to B$ is both surjective and injective, that is, if $f$ is both one-to-one
and onto, then $f$ is bijective, and $f$ is called a bijection. In this case, $f$ exhibits
a one-to-one correspondence between the set $A$ and the set $B$.
Two functions $f$ and $g$ whose ranges are in the real numbers can be combined
arithmetically. Specifically, one can define $f + g$, $f - g$, $fg$, and $\frac{f}{g}$ in natural ways:
$(f + g)(x) = f(x) + g(x)$, $(f - g)(x) = f(x) - g(x)$, $(fg)(x) = f(x) \cdot g(x)$, and,
for $x$ such that $g(x) \neq 0$, $\frac{f}{g}(x) = \frac{f(x)}{g(x)}$. When functions $f$ and $g$ are combined
in this way, the domain of the sum, difference, product, or quotient is assumed to
be the intersection of the domain of $f$ and the domain of $g$ with the exception that
the domain of $\frac{f}{g}$ also excludes values of $x$ for which $g(x) = 0$. Thus, the function
$f(x) = \sqrt{x - 4}$ is defined for all $x \ge 4$, and the function $g(x) = \sqrt{5 - x}$ is defined
for all $x \le 5$. It follows that the function $\sqrt{x - 4} + \sqrt{5 - x}$ is defined only for those
$x$ satisfying $4 \le x \le 5$. Similarly, the function $\frac{f(x)}{f(x)}$ is only defined for $x > 4$ even
though it is identically $1$ for those $x$. That function has a natural extension to all real
numbers.

Fig. 2.5 Composition $(f \circ g)(x) = z$

2.6.4 Composition
If $g$ is a function assigning values in its domain $A$ to values in its range contained
in the set $B$, and if $f$ is a function assigning values in its domain $B$ to values in
its range contained in the set $C$, then the composition of $f$ with $g$ is the function
$(f \circ g)(x) = f(g(x))$ which assigns to values in its domain $A$ values in its range
contained in set $C$ (Fig. 2.5). The main reason for considering compositions is that
it is often easiest to represent complicated functions as compositions of simpler
functions. For example, the function $f(x) = \frac{\sin^2 x}{\sqrt{x + 4}}$ is clearly a quotient where the
numerator is the composition of the function $x^2$ with the function $\sin x$, and the
denominator is the composition of the function $\sqrt{x}$ with the function $x + 4$.
It is easily shown that if $g : A \to B$ and $f : B \to C$ are both surjective functions,
then their composition, $f \circ g : A \to C$, is also surjective. To prove this, you would
follow the template for proving that a function is surjective. That requires that you
select an arbitrary $z \in C$ and show that there is an $x \in A$ such that $(f \circ g)(x) = z$.
Why might you use the variable $z$ here rather than the variable $y$? Well, that allows
you to think of $g$ as mapping $x$ to $y$, and $f$, in turn, mapping $y$ to $z$. Faced with the
statement $f(g(x)) = z$, there is little you can do except to apply what you know
about the function $f$, that is, that $f$ is surjective. Because $f$ is surjective, and $z$ is in
the codomain of $f$, you know that there is a $y$ in the domain of $f$ such that $f(y) = z$.
Can you find an $x$ such that $g(x) = y$? Of course $y$ is in the domain of $f$ which is the
codomain of $g$. The function $g$ is surjective, so there must be an $x$ in the domain of
$g$ that maps onto $y$. These ideas give the following proof.


PROOF: If $g : A \to B$ and $f : B \to C$ are both surjective functions, then
their composition $f \circ g : A \to C$ is also surjective.
Let $g : A \to B$ and $f : B \to C$ be two surjective functions.
Let $z \in C$.
Then since $f$ is a surjection from $B$ to $C$, there is a $y \in B$ such that $f(y) = z$.
Since $g$ is a surjection from $A$ to $B$, there is an $x \in A$ such that $g(x) = y$.
Therefore, $(f \circ g)(x) = f(g(x)) = f(y) = z$.
It follows that $f \circ g : A \to C$ is surjective.

It is also true that if $g : A \to B$ and $f : B \to C$ are both injective functions,
then their composition, $f \circ g : A \to C$, is also injective. You would prove this by
following the template for proving that a function is injective. That is, you would
assume that $(f \circ g)(x_1) = (f \circ g)(x_2)$ for some $x_1$ and $x_2$ in $A$. Again, what can you
say if you know $f(g(x_1)) = f(g(x_2))$? All that you can do is apply what you know
about the function $f$, that is, that $f$ is injective. Since $f$ is injective, you can conclude
that $g(x_1) = g(x_2)$. Then because $g$ is injective, you can conclude $x_1 = x_2$, and you
are done.
PROOF: If $g : A \to B$ and $f : B \to C$ are both injective functions, then
their composition $f \circ g : A \to C$ is also injective.
Let $g : A \to B$ and $f : B \to C$ be two injective functions.
Assume that for some $x_1$ and $x_2$ in $A$, $(f \circ g)(x_1) = (f \circ g)(x_2)$.
By the definition of composition $f(g(x_1)) = f(g(x_2))$.
Then since $f$ is an injection, it follows that $g(x_1) = g(x_2)$.
Since $g$ is an injection, it follows that $x_1 = x_2$.
It follows that $f \circ g : A \to C$ is injective.
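For functions on finite sets, the two composition results can be observed directly by storing each function as a set of ordered pairs. The following is a minimal Python sketch, not part of the original text: the helper names `compose`, `is_surjective`, and `is_injective` and the small example sets are all hypothetical, and a function is assumed to be stored as a dictionary mapping each domain element to its value.

```python
def compose(f, g):
    """Return the composition f after g as a dictionary x -> f[g[x]]."""
    return {x: f[g[x]] for x in g}

def is_surjective(func, codomain):
    """Every element of the codomain appears as a value."""
    return set(func.values()) == set(codomain)

def is_injective(func):
    """No value appears as the second coordinate of more than one pair."""
    values = list(func.values())
    return len(values) == len(set(values))

# Two surjections g: A -> B and f: B -> C (neither is injective here).
A, B, C = {1, 2, 3, 4}, {"a", "b", "c"}, {"X", "Y"}
g = {1: "a", 2: "b", 3: "c", 4: "c"}
f = {"a": "X", "b": "Y", "c": "Y"}
fg_onto = compose(f, g)

# Two injections g2: A2 -> B2 and f2: B2 -> C2 (neither is surjective here).
g2 = {1: "a", 2: "b"}
f2 = {"a": "X", "b": "Y", "c": "Z"}
fg_one_to_one = compose(f2, g2)
```

In the first example the composition is onto `C`, and in the second the composition is one-to-one, as the two proofs predict.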

2.6.5 Exercises
Write a proof for each of the following statements.
1. For each real number $r$ there is a real number $x$ such that $x^3 = r$.
2. For each real number $r \ge 0$ there is a real number $x \ge 0$ such that $x^4 = r$.
3. If $n$ is an odd positive integer, then for each real number $r$ there is a real number
$x$ such that $x^n = r$.
4. If $n$ is an even positive integer, then for each real number $r \ge 0$ there is a real
number $x \ge 0$ such that $x^n = r$.
5. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three functions, then $(f \circ g) \circ h =
f \circ (g \circ h)$. In other words, function composition is associative. (Hint: Show that
both functions $(f \circ g) \circ h$ and $f \circ (g \circ h)$ give the same result when applied to an
$x \in A$.)
6. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three surjective functions, then
their composition $f \circ g \circ h$ is surjective.
7. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three injective functions, then their
composition $f \circ g \circ h$ is injective.

Chapter 3

Limits

3.1 The Definition of Limit


In a typical Calculus course students develop an intuitive understanding of the
concept of limit which, of course, is the central concept of Calculus and, indeed,
the central concept of Analysis. In particular, if $f$ is a function defined on an open
interval containing $a \in \mathbb{R}$, then $f$ has limit $L$ at $a$ if the values of $f(x)$ get closer and
closer to $L$ as $x$ approaches $a$. In order to prove theorems about limits, one needs a
rigorous definition of limit which makes clear what is meant by "closer and closer"
and "approaches." In Analysis the distance between two real numbers is measured
by the absolute value of the difference of the two numbers. Thus, the ideas of "closer
and closer" and "approaches" naturally involve statements about the absolute values
of differences of two quantities.
Consider the definition of $\lim_{x \to a} f(x) = L$, where the function $f$ is defined in an
open interval in $\mathbb{R}$ containing the point $a$. This limit should give you a mental image
similar to Fig. 3.1 where the graph of the function gets close to $L$ as $x$ approaches $a$.
So, how can you quantify what "$f(x)$ is getting close to $L$" means? Is within $1$
close? Is within $\frac{1}{4}$ close? Is within $\frac{1}{1000}$ close? Clearly, there needs to be a way to say
"arbitrarily close" or "as close as one likes." Analysts have found that a good way
to express $f(x)$ getting arbitrarily close to $L$ is to say that for any positive distance,
$|f(x) - L|$ can be made to be less than that distance. Of course, $|f(x) - L|$ cannot be
made to be negative, and it is not reasonable to require it to be zero since that would
require $f(x)$ to actually equal $L$. Hence, one usually says that for any $\varepsilon > 0$, one can
achieve $|f(x) - L| < \varepsilon$. The use of the Greek letter $\varepsilon$ (epsilon) is arbitrary, but the
tradition of using $\varepsilon$ in this context has been universal since Cauchy introduced its
use in the early 1800s. Figure 3.2 shows a tolerance of a small $\varepsilon$ around the limit
value $L$. The goal is to show that the function $f(x)$ stays within that tolerance when
$x$ is close to $a$.
In the figure you can see that for the values of $x$ near $a$, the function $f(x)$ falls
within the prescribed tolerance of $L$. You could find a small interval centered at $a$
such that $|f(x) - L| < \varepsilon$ for all $x$ in that interval. That is, there is a value $\delta > 0$
such that every $x$ satisfying $|x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$ as seen in
Fig. 3.3. Again, the choice of the Greek letter $\delta$ (delta) is completely arbitrary, but
the tradition of using $\delta$ in this context is universal.

Fig. 3.1 $\lim_{x \to a} f(x) = L$
Fig. 3.2 $\lim_{x \to a} f(x) = L$
Fig. 3.3 $\lim_{x \to a} f(x) = L$

Springer International Publishing Switzerland 2016
J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_3

Note that for the function whose graph appears in Fig. 3.3 the value $L$ is the limit
of the function as $x$ approaches $a$, and $L$ also happens to be the value of $f(x)$ at $x = a$.
You should recall that this sometimes happens (specifically when $f$ is continuous at
$x = a$), but that this is not a requirement. Indeed, one reason for discussing limits
in the first place is that there is a need to evaluate the behavior of a function
as $x$ approaches a value $a$ when the function fails to be defined at $x = a$. Thus, in
general, one does not want to require $|f(x) - L| < \varepsilon$ for all $x$ with $|x - a|$ less than
some positive value $\delta$ since this would require $|f(x) - L| < \varepsilon$ at $x = a$. Instead,
one excludes the need for the function to satisfy any conditions at all at $x = a$ by
saying that there is a positive $\delta$ such that $f$ is within the desired tolerance of $L$ for all
$x$ with $0 < |x - a| < \delta$. Clearly, the value of $\delta$ must be chosen to be positive since
no negative value would represent a distance, and $\delta = 0$ would not result in a region
around the number $a$ satisfying $|x - a| < \delta$.


Combining these ideas results in the following definition. Suppose that the
function $f$ is defined for all $x$ in an open interval containing $a \in \mathbb{R}$ except perhaps
at $x = a$. Then the limit of $f$ as $x$ approaches $a$ is $L$, $\lim_{x \to a} f(x) = L$, means that for
every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it
follows that $|f(x) - L| < \varepsilon$. The power of this definition is the fact that the $\varepsilon$ and
$\delta$ are arbitrary positive numbers. For example, what if you knew that for every $\varepsilon > 0$
there were a $\delta > 0$ such that whenever $0 < |x - a| < \delta$, then $|f(x) - L| < 2\varepsilon$?
Would this be sufficient for showing $\lim_{x \to a} f(x) = L$? The answer is yes, because the
$\varepsilon$ is arbitrary. Suppose that for any $\varepsilon > 0$ you can find a $\delta > 0$ that will ensure that
$|f(x) - L| < 2\varepsilon$. Then since $\frac{\varepsilon}{2}$ is also a positive number, you can find a $\delta' > 0$, likely
smaller than $\delta$, that will ensure that $|f(x) - L| < 2 \cdot \frac{\varepsilon}{2} = \varepsilon$. The point here is that
since $\varepsilon$ was arbitrary, you can replace it with any positive number, including $\frac{\varepsilon}{2}$.

3.1.1 Exercises
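Which of the following definitions is equivalent to the definition of $\lim_{x \to a} f(x) = L$?

1. For all $\varepsilon \ge 0$ there is a $\delta \ge 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
2. For all $\varepsilon > 0$ there is a $\delta > 0$ such that if $0 < |x - a| < \frac{\delta}{4}$, then $|f(x) - L| < 7\varepsilon$.
3. For all $\varepsilon > 0.001$ there is a $\delta > 0.001$ such that if $0 < |x - a| < \delta$, then
$|f(x) - L| < \varepsilon$.
4. For all $\delta > 0$ there is an $\varepsilon > 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.
5. There exists a $\delta > 0$ such that for all $\varepsilon > 0$, if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.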
6. For all $\delta > 0$ there is an $\varepsilon > 0$, such that if $0 < |x - a| < \varepsilon$, then $|f(x) - L| < \delta$.
7. For all $\varepsilon > 1$ there is a $\delta > 1$, such that if $1 < |x - a| + 1 < \delta$, then
$|f(x) - L| + 1 < \varepsilon$.

3.2 Proving $\lim_{x \to a} f(x) = L$

The definition of limit provides a formula by which one can construct a proof that
a particular function $f$ has a limit of $L$ at the point $a$. The definition requires that
for every $\varepsilon > 0$ there is a $\delta > 0$ that satisfies certain properties. Thus, a proof of a
limit must show that for every $\varepsilon > 0$ you can exhibit a $\delta > 0$ which has the needed
property. As with other proofs that some property holds for all elements of a set, the
proof begins by selecting an arbitrary element of that set. In this case, one would
select an arbitrary $\varepsilon > 0$. The goal is to present a value for $\delta > 0$ such that every $x$
satisfying $0 < |x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$. That suggests the following
proof template.


TEMPLATE for proving $\lim_{x \to a} f(x) = L$
SET THE CONTEXT: Make statements telling what is known about the
function $f$ and the numbers $a$ and $L$.
SELECT AN ARBITRARY $\varepsilon$: Given $\varepsilon > 0$,
PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an
appropriate value for $\delta$.
SELECT AN ARBITRARY $x$: Select $x$ such that $0 < |x - a| < \delta$.
LIST IMPLICATIONS: Derive the result $|f(x) - L| < \varepsilon$.
STATE THE CONCLUSION: Therefore, $\lim_{x \to a} f(x) = L$.

For example, consider proving that $\lim_{x \to 5} 2x - 3 = 7$. As stated in the template,
the proof would begin with "Let $f(x) = 2x - 3$. Given $\varepsilon > 0$, ...". Your task is to
determine a value of $\delta > 0$ that ensures the inequality $|f(x) - L| < \varepsilon$ holds for all
$x$ with $0 < |x - a| < \delta$. Since the function $f$ is not constant, the choice of $\delta$ will
surely depend on the value of $\varepsilon$. But how is this value of $\delta$ determined? A common
approach is to work backwards from the final conclusion $|f(x) - L| < \varepsilon$ to see what
value of $\delta$ is needed.
In this example, $f(x) = 2x - 3$, $a = 5$, and $L = 7$. The value of $|f(x) - L| =
|(2x - 3) - 7| = |2x - 10| = 2|x - 5|$. Note that this expression has a factor of
$x - a$, where $a = 5$. When finding the limit of a polynomial where $L = f(a)$, this
will always be the case. For more complicated functions $f$, other properties of $f$ will
often allow you to write $f(x) - L$ as an expression where $x - a$ is a factor. This makes
it easier to determine a value of $\delta$ since the choice of $\delta$ restricts the size of $|x - a|$
which, in turn, will make $|f(x) - L|$ small, the desired result.
So here, $|f(x) - L| = 2|x - 5|$. To force $|f(x) - L|$ to be less than some arbitrary
$\varepsilon > 0$, it is, therefore, sufficient for $2|x - 5|$ to be made less than $\varepsilon$. This is done by
making $|x - 5| < \frac{\varepsilon}{2}$, and the needed value of $\delta$ is $\frac{\varepsilon}{2}$. Note that it is stipulated that $\varepsilon$ is
positive, so $\delta = \frac{\varepsilon}{2}$ is also greater than zero, a requirement of the definition of limit.
Now a complete proof can be written by following the template.
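PROOF: $\lim_{x \to 5} 2x - 3 = 7$
Let $f(x) = 2x - 3$.
Given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{2} > 0$.
Select $x$ such that $0 < |x - 5| < \delta = \frac{\varepsilon}{2}$.
Then $\delta > |x - 5|$ implies $\varepsilon > 2|x - 5| = |2x - 10| = |(2x - 3) - 7| = |f(x) - 7|$.
Therefore, $\lim_{x \to 5} 2x - 3 = 7$.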
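The choice of one half of the tolerance as the allowed distance from 5 can be spot-checked at finitely many points. The following is a minimal Python sketch, not part of the original text: the helper name `delta_works`, the sample count, and the test tolerance of one tenth are hypothetical choices, and a finite check of this kind only illustrates the proof rather than replacing it.

```python
from fractions import Fraction

def f(x):
    """The function from the proof, f(x) = 2x - 3."""
    return 2 * x - 3

def delta_works(eps, delta, a=5, limit=7, samples=200):
    """Test |f(x) - limit| < eps at sample points with 0 < |x - a| < delta."""
    for k in range(1, samples):
        offset = delta * Fraction(k, samples)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - limit) < eps:
                return False
    return True

eps = Fraction(1, 10)
proof_choice_ok = delta_works(eps, eps / 2)  # delta equal to eps / 2, as in the proof
too_large_ok = delta_works(eps, eps)         # delta equal to eps, which is too large
```

The first check passes at every sample point, while the second fails because points at distance at least half the tolerance from 5 already move the function value by the full tolerance.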

Identifying an appropriate value for $\delta$ is very easy when $f$ is linear as in
the example above, but it can be trickier for other functions. Consider proving
$\lim_{x \to 4} 3x^2 = 48$. In this example, $f(x) = 3x^2$, $a = 4$, and $L = 48$. The value of
$|f(x) - L| = |3x^2 - 48| = 3|x^2 - 16| = 3|x + 4| \cdot |x - 4|$. As suggested above,
it should not be surprising that this expression includes a factor of $x - a = x - 4$
which can be forced to be small by selecting a small value for $\delta$. In particular, the
proof will need to justify $|f(x) - L| < \varepsilon$ which means $3|x + 4| \cdot |x - 4| < \varepsilon$ and
$|x - 4| < \frac{\varepsilon}{3|x + 4|}$. Here is an attempt at a proof using this idea, but it falls short of
being correct.
PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$
Let $f(x) = 3x^2$.
For any $x$, let $\delta = \frac{\varepsilon}{3|x + 4|}$.
Then $0 < |x - 4| < \delta = \frac{\varepsilon}{3|x + 4|}$ implies that $\varepsilon > |x - 4| \cdot 3|x + 4| = 3|x^2 - 16| = |f(x) - 48|$.
Therefore, $\lim_{x \to 4} 3x^2 = 48$.

Can you spot some of the errors in this proof?


The second line of the proof refers to the variable $\varepsilon$ which has not yet been
introduced in the proof. In particular, without having specified that $\varepsilon > 0$, one
does not know that $\delta > 0$ which is required by the definition of limit. The proof
should include the phrase "Given $\varepsilon > 0$."

The value of $\delta$ in the second line of the proof is undefined when $x = -4$.

The most serious error here is that the value of $\delta$ depends on the value chosen for
$x$. The definition of limit requires that for every $\varepsilon > 0$ there is a $\delta > 0$. That value
of $\delta$ can depend on the value of $\varepsilon$ but certainly cannot depend on $x$ which has not
yet been introduced in the definition. After $\delta$ is specified, the definition requires
that a condition hold for all $x$ satisfying $0 < |x - a| < \delta$, and only then does the
definition refer to values of $x$.

Still one needs a value of $\delta$ which will be less than $\frac{\varepsilon}{3|x + 4|}$ for all the values of $x$
considered in the proof. One way around this would be to find a value for $\delta$ which is
less than $\frac{\varepsilon}{3|x + 4|}$ for every value of $x$. But this cannot be done because the expression
$\frac{\varepsilon}{3|x + 4|}$ gets arbitrarily small as $x$ gets large. On the other hand, the value of $x$ will be
restricted so that $0 < |x - 4| < \delta$. Thus, unless $\delta$ is very large, $x$ cannot wander too
far away from $a = 4$, and $\frac{\varepsilon}{3|x + 4|}$ cannot get arbitrarily small.
So how does one choose a which both ensures that jx C 4j does not grow too
large and also makes jx  4j small? The technique is to select in two stages. First,
to ensure that jx C 4j does not grow too large, restrict the value of so that x cannot
wander too far from a D 4. Almost any restriction in the size of will work, so
how about suggesting that not exceed 1? If  1, then when you choose an x with
0 < jx4j < , you will know that jx4j < 1 which is equivalent to 1 < x4 < 1
and, thus, 1 C 8 < .x  4/ C 8 < 1 C 8. That is, 7 < x C 4 < 9, and it follows that
jx C 4j < 9. Here is another attempt at a proof that uses this idea. Unfortunately, it
too has problems.


PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$
Let $f(x) = 3x^2$.
Given $\epsilon > 0$, let $\delta = 1$.
Then for any $x$ such that $0 < |x - 4| < \delta = 1$, it follows that $-1 < x - 4 < 1$ so $-1 + 8 < (x - 4) + 8 < 1 + 8$ and $|x + 4| < 9$.
Now let $\delta = \frac{\epsilon}{27} > 0$.
Then $0 < |x - 4| < \delta = \frac{\epsilon}{27}$ implies that $\epsilon > |x - 4| \cdot 27 > |x - 4| \cdot 3|x + 4| = |3x^2 - 48| = |f(x) - 48|$.
Therefore, $\lim_{x \to 4} 3x^2 = 48$.

The only problem with the above proof is in its use of the variable $\delta$. In the second line of the proof $\delta$ is set to 1, and in the fourth line it is set to $\frac{\epsilon}{27}$. It does not make sense to set the value of $\delta$ equal to both of these values because, except in the rare case that $\epsilon = 27$, the value of $\delta$ cannot be equal to both values at the same time. The solution is to choose one value for $\delta$ that satisfies two separate conditions. For example, you can first require that $\delta < 1$. Then a choice of $x$ with $0 < |x - 4| < \delta$ will guarantee that $|x + 4| < 9$. Then $\frac{\epsilon}{3|x+4|} > \frac{\epsilon}{3 \cdot 9} = \frac{\epsilon}{27}$. This suggests that you should select $\delta = \frac{\epsilon}{27}$. But you also need $\delta \le 1$. What happens if someone suggests that $\epsilon$ be some rather large number such as $\epsilon = 100$? Then $\delta = \frac{\epsilon}{27}$ would not satisfy $\delta < 1$. This is not a problem since one can always get away with selecting a positive value for $\delta$ that is smaller than needed. Thus, you can select $\delta$ to be the lesser of 1 and $\frac{\epsilon}{27}$. This choice is usually written as $\delta = \min\left(\frac{\epsilon}{27}, 1\right)$. Now you can put this all together to get a formal proof that is completely correct.
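PROOF: $\lim_{x \to 4} 3x^2 = 48$
Let $f(x) = 3x^2$.
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{27}, 1\right)$.
Select $x$ such that $0 < |x - 4| < \delta$.
Since $\delta \le 1$, it follows that $|x - 4| < 1$ and $-1 < x - 4 < 1$, so $7 < x + 4 < 9$. Thus, $|x + 4| < 9$.
Since $\delta \le \frac{\epsilon}{27}$, it follows that $|x - 4| < \frac{\epsilon}{27} = \frac{\epsilon}{3 \cdot 9} < \frac{\epsilon}{3|x+4|}$.
Then $\delta > |x - 4|$ implies $\epsilon > 3|x + 4| \cdot |x - 4| = |3x^2 - 48| = |f(x) - 48|$.
Therefore, $\lim_{x \to 4} 3x^2 = 48$.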
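The two-stage choice of $\delta$ can be spot-checked numerically. The following is a minimal Python sketch, not part of the proof: the function names are invented for this illustration, and sampling a finite number of points can only fail to find a counterexample, it cannot replace the argument above. It also tries the uncapped choice $\delta = \frac{\epsilon}{27}$ with $\epsilon = 100$ to show why the cap at 1 matters.

```python
def limit_condition_holds(eps, delta, samples=10_000):
    """Return True if |3x^2 - 48| < eps at every sampled x with 0 < |x - 4| < delta.

    This only samples points; it is an illustration, not a proof.
    """
    for k in range(1, samples):
        offset = delta * k / samples          # 0 < offset < delta
        for x in (4 - offset, 4 + offset):
            if not abs(3 * x * x - 48) < eps:
                return False
    return True


def delta_from_proof(eps):
    """The choice made in the proof: the lesser of eps/27 and 1."""
    return min(eps / 27, 1.0)


for eps in (100.0, 1.0, 0.001):
    delta = delta_from_proof(eps)
    print(eps, delta, limit_condition_holds(eps, delta))

# Without the cap at 1, a large eps lets x wander too far from 4.
print(limit_condition_holds(100.0, 100.0 / 27))
```

The first three checks succeed, while the last one reports a failure: with $\delta = \frac{100}{27}$ the value $x = 7.5$ is allowed, and there $|3x^2 - 48|$ exceeds 100.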

As a third example, consider proving the limit $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$. In this example $f(x) = \frac{x+2}{x^2+3x+2}$, $a = -2$, and $L = -1$. Note that $f(-2)$ is not defined even though the limit as $x$ approaches $-2$ exists. The proof of this limit must conclude with the inequality $\epsilon > |f(x) - L| = \left|\frac{x+2}{x^2+3x+2} - (-1)\right|$. As in the previous examples, it would be convenient if the expression $\frac{x+2}{x^2+3x+2} - (-1)$ would contain a factor of $x + 2$ so that it could be made small by requiring $x - (-2)$ to be less than some $\delta$. But this follows with some fairly straightforward algebra. Assuming that $x \ne -2$,

$$\frac{x+2}{x^2+3x+2} - (-1) = \frac{x+2}{(x+2)(x+1)} + 1 = \frac{1}{x+1} + 1 = \frac{1 + (x+1)}{x+1} = \frac{x+2}{x+1}.$$

The needed inequality $\epsilon > \left|\frac{x+2}{x+1}\right| = |x + 2| \cdot \frac{1}{|x+1|}$ would follow if $|x + 2|$ never exceeds $\epsilon|x + 1|$ which, in turn, would happen if $\delta < \epsilon|x + 1|$. Again, there is a problem because the choice of $\delta > 0$ cannot depend on the value of $x$, yet $\epsilon|x + 1|$ can get arbitrarily close to zero as $x$ gets close to $-1$. The strategy, then, would be to restrict the value of $\delta$ so that $x$ could not get close to $-1$. If $x$ is supposed to be close to $-2$, $\delta$ could be chosen so that it does not exceed $\frac{1}{2}$. Then, $|x + 1|$ could not get smaller than $1 - \frac{1}{2} = \frac{1}{2}$, and $\epsilon|x + 1| > \frac{\epsilon}{2}$. You would not want $\delta$ to exceed either $\frac{\epsilon}{2}$ or $\frac{1}{2}$. Thus, one can select $\delta = \min\left(\frac{\epsilon}{2}, \frac{1}{2}\right)$. The complete proof follows.
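PROOF: $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$
Let $f(x) = \frac{x+2}{x^2+3x+2}$.
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{2}, \frac{1}{2}\right)$.
Select $x$ such that $0 < |x - (-2)| < \delta$.
Since $\delta \le \frac{1}{2}$, it follows that $|x + 2| < \frac{1}{2}$ and $-\frac{1}{2} < x + 2 < \frac{1}{2}$, so $-\frac{3}{2} < x + 1 < -\frac{1}{2}$. Thus, $|x + 1| > \frac{1}{2}$.
Since $\delta \le \frac{\epsilon}{2}$, it follows that $|x + 2| < \frac{\epsilon}{2} < \epsilon \cdot |x + 1|$.
Then $\delta > |x - (-2)| > 0$ implies $\frac{\epsilon}{2} > |x + 2|$ and
$\epsilon > 2|x + 2| > \frac{1}{|x+1|} \cdot |x + 2| = \frac{1}{|x+1|} \cdot |1 + (x + 1)| = \left|\frac{1}{x+1} + 1\right| = \left|\frac{x+2}{x^2+3x+2} - (-1)\right| = |f(x) - (-1)|$.
Therefore, $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$.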

Clearly, at the point that you stipulate that $\delta$ should be less than $\frac{1}{2}$, you are making a rather arbitrary decision. What would have happened if you had chosen some other reasonable bound on the size of $\delta$? For example, what if instead you only require $\delta < \frac{3}{4}$? This would also work, although that decision would affect the final choice of $\delta$ for now $|x + 1|$ can get as small as $\frac{1}{4}$, and $\frac{|x+2|}{|x+1|}$ could be as large as $4|x + 2|$. This suggests that you then select $\delta = \min\left(\frac{\epsilon}{4}, \frac{3}{4}\right)$. This choice is no better or worse than the $\delta$ chosen earlier. When one makes such arbitrary decisions, it is good form to make a selection that does not lead to unnecessary arithmetic or algebraic complications because one does not want to make the proof any harder to read than necessary. Thus, it would be perfectly adequate but enormously awkward to select the bound on $\delta$ to be $\frac{\sqrt{5}}{1+\sqrt{5}}$. As long as the bound is less than 1, it will do the job of keeping $|x + 1|$ bounded away from zero, but $\frac{\sqrt{5}}{1+\sqrt{5}}$ would certainly not be an optimal choice.
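To see that both choices of $\delta$ behave as claimed, here is a small Python sketch that samples points near $-2$. The helper names are made up for this illustration, the uncapped choice $\delta = \epsilon$ is a hypothetical bad choice included only for contrast, and a passing sample is evidence rather than proof.

```python
def f(x):
    """The function from the example; it is undefined at x = -2 and x = -1."""
    return (x + 2) / (x * x + 3 * x + 2)


def limit_condition_holds(eps, delta, samples=2000):
    """Return True if |f(x) - (-1)| < eps at every sampled x with 0 < |x + 2| < delta."""
    for k in range(1, samples):
        offset = delta * k / samples          # 0 < offset < delta
        for x in (-2 - offset, -2 + offset):
            if not abs(f(x) - (-1)) < eps:
                return False
    return True


for eps in (3.0, 1.0, 0.01):
    first = min(eps / 2, 1 / 2)     # the choice used in the proof
    second = min(eps / 4, 3 / 4)    # the alternative bound of 3/4
    print(eps, limit_condition_holds(eps, first), limit_condition_holds(eps, second))

# A delta that is allowed to reach 1 lets x approach -1, where f blows up.
print(limit_condition_holds(1.0, 1.0))
```

Both capped choices pass for each sampled $\epsilon$, while the last check fails because $x = -1.4$ is within 1 of $-2$ and $|f(-1.4) + 1| = 1.5$.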


3.2.1 Exercises
Write a proof of each of the following limits.
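1. $\lim_{x \to 5} \frac{3}{5}x + 1 = 4$
2. $\lim_{x \to 3} 5x - 8 = 7$
3. $\lim_{x \to 3} 2x^2 = 18$
4. $\lim_{x \to \frac{2}{3}} 9x^2 = 4$
5. $\lim_{x \to -1} 3x^2 - 5x - 7 = 1$
6. $\lim_{x \to 4} x^2 + 3x + 1 = 29$
7. $\lim_{x \to 2} 2x^3 = 16$
8. $\lim_{x \to -1} \frac{6}{2x+5} = 2$
9. $\lim_{x \to 8} \frac{x+4}{x^2-10x+10} = -2$
10. $\lim_{x \to a} mx + b = ma + b$
11. $\lim_{x \to u} ax^2 = au^2$
12. $\lim_{x \to u} ax^2 + bx + c = au^2 + bu + c$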

3.3 One-Sided Limits


The one-sided limits $\lim_{x \to a^+} f(x) = L$ and $\lim_{x \to a^-} f(x) = L$ are very similar to two-sided limits except that the value of $x$ is only allowed to approach the real number $a$ from one side. As a result, the definitions of these one-sided limits are very similar to the definition of limit with minor alterations that force $x$ to stay on one side of $a$. The definition of limit states that for a function $f$ defined in a neighborhood of $a$, but not necessarily at $a$, the limit $\lim_{x \to a} f(x) = L$ means for every $\epsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \epsilon$. What is it about this definition that allows $x$ to approach $a$ from two sides? It is the inequality $0 < |x - a| < \delta$ that allows $x$ to be either greater than or less than $a$ since $|x - a|$ is positive in either case. By removing the absolute value function in this inequality and writing instead $0 < x - a < \delta$, the choice of $x$ becomes restricted to being a value greater than $a$, or writing instead $0 < a - x < \delta$, the choice of $x$ becomes restricted to being a value less than $a$. Thus, if $f$ is a function defined for all $x$ in an open interval with right end at $a$, then the limit of $f$ at $a$ from the left is $L$, $\lim_{x \to a^-} f(x) = L$, means that for every $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < a - x < \delta$, it follows that $|f(x) - L| < \epsilon$. Similarly, if $f$ is a function defined for all $x$ in an open interval with left end at $a$, then the limit of $f$ at $a$ from the right is $L$, $\lim_{x \to a^+} f(x) = L$, means that for every $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < x - a < \delta$, it follows that $|f(x) - L| < \epsilon$.


One-sided limits are particularly useful in cases where the function $f$ behaves differently on one side of $a$ than on the other side. For example, $e^{1/x}$ behaves quite differently as $x$ approaches 0 from the right, where $\frac{1}{x}$ is positive, from how it behaves as $x$ approaches 0 from the left, where $\frac{1}{x}$ is negative. Similarly, the derivative of $f(x) = |x|$ has different limits as $x$ approaches 0 from the right and from the left. There are also cases where a function is not even defined for $x$ on one side of $a$ such as $f(x) = \sqrt{x}$ which is not defined for $x < 0$.
Proving the existence of one-sided limits is very similar to proving two-sided limits except that care must be taken to ensure that the value of $x$ remains on one side of $a$. Take, for example, the limit $\lim_{x \to 3^+} 2x^2 - 5x = 3$. Here $f(x) = 2x^2 - 5x$, $a = 3$, and $L = 3$. As with a proof of other limits earlier in the chapter, the proof needs to give a value for $\delta > 0$ which will ensure $\epsilon > |f(x) - L| = |(2x^2 - 5x) - 3| = |(2x + 1)(x - 3)|$. This will follow if $|x - 3| < \frac{\epsilon}{|2x+1|}$ for all suitable values of $x$. What is needed is the largest possible value of $2x + 1$, but $2x + 1$ is not bounded unless $x$ is restricted to be close to 3. Thus, stipulate that $\delta$ be less than 1 which will ensure that $x - 3$ will be less than 1, $x$ will not exceed 4, and $2x + 1$ will not exceed 9. Then $\delta$ can be chosen to be $\min\left(\frac{\epsilon}{9}, 1\right)$, and the proof can be written as follows.
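PROOF: $\lim_{x \to 3^+} 2x^2 - 5x = 3$
Let $f(x) = 2x^2 - 5x$.
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{9}, 1\right)$.
Select $x$ such that $0 < x - 3 < \delta$.
Since $\delta \le 1$, it follows that $0 < x - 3 < 1$, $3 < x < 4$, and $|2x + 1| < 9$.
Then $0 < x - 3 < \delta$ implies $\frac{\epsilon}{9} > x - 3$ and $\epsilon > 9(x - 3) > (2x + 1)(x - 3) = 2x^2 - 5x - 3 = |f(x) - 3|$.
Therefore, $\lim_{x \to 3^+} 2x^2 - 5x = 3$.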

Consider a function whose left limit differs from its right limit such as the function

$$f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1. \end{cases}$$

Then $\lim_{x \to 1^-} f(x) = -2$ while $\lim_{x \to 1^+} f(x) = 1$. Thus, while proving $\lim_{x \to 1^-} f(x) = -2$, it is important to use the fact that $x < 1$ as part of the proof since the required inequalities will not hold for $x > 1$ (Fig. 3.4). The following shows one possible proof.


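Fig. 3.4 Graph of $f(x)$

PROOF: $\lim_{x \to 1^-} \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases} = -2$
Let $f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases}$.
Given $\epsilon > 0$, let $\delta = \frac{\epsilon}{7}$.
Select $x$ such that $0 < 1 - x < \delta = \frac{\epsilon}{7}$. Then $x < 1$, so $f(x) = 5 - 7x$.
It follows that $\epsilon > 7(1 - x) = 5 - 7x - (-2) = |f(x) - (-2)|$.
Therefore, $\lim_{x \to 1^-} f(x) = -2$.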

In the third line of the proof, $0 < 1 - x < \delta$ ensures that $x < 1$ which, in turn, is needed to conclude that $f(x) = 5 - 7x$ and not $f(x) = x$. The fact that $x < 1$ is also used in the fourth line of the proof to conclude that $5 - 7x - (-2) = |f(x) - (-2)|$ which follows because $5 - 7x - (-2)$ is positive for all $x < 1$.
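The role of the restriction $x < 1$ can be illustrated numerically. In this minimal Python sketch (the function names are invented for the illustration), points just to the left of 1 satisfy the required inequality with $\delta = \frac{\epsilon}{7}$, while a point just to the right does not, because there $f(x) = x$ is near 1 rather than near $-2$.

```python
def f(x):
    """Piecewise function from the example."""
    return 5 - 7 * x if x < 1 else x


def left_limit_condition_holds(eps, samples=1000):
    """Sample x with 0 < 1 - x < eps/7 and test |f(x) - (-2)| < eps."""
    delta = eps / 7
    return all(
        abs(f(1 - delta * k / samples) - (-2)) < eps
        for k in range(1, samples)
    )


def right_side_violates(eps):
    """Test one point with 0 < x - 1 < eps/7; it should NOT satisfy the inequality."""
    x = 1 + eps / 14
    return abs(f(x) - (-2)) >= eps


for eps in (0.5, 0.01, 1e-6):
    print(eps, left_limit_condition_holds(eps), right_side_violates(eps))
```

For every $\epsilon$ below 3 the right-hand check reports a violation, which matches the observation that the inequalities in the proof hold only on the left side of 1.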

3.3.1 Exercises
Write a proof of each of the following one-sided limits.
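1. $\lim_{x \to 3^+} x^2 + 4x = 21$
2. $\lim_{x \to 3^-} 8 - 3x = -1$
3. $\lim_{x \to 2^-} \frac{x^2-4}{x^2-3x+2} = 4$
4. $\lim_{x \to 4^+} \frac{x^2-4x}{2x^2-7x-4} = \frac{4}{9}$
5. $\lim_{x \to 2^-} \frac{8|x-2|}{x^2-4} = -2$
6. $\lim_{x \to 2^+} \frac{8|x-2|}{x^2-4} = 2$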


3.4 Limits at Infinity


The definitions given in the last two sections do not make sense when the real number that $x$ approaches, $a$, is replaced by infinity. Infinity, of course, is not an element of the real numbers, $\mathbb{R}$, but it does make sense to ask whether a function approaches a limit when $x$ increases without bound, that is, as $x$ approaches infinity. When one writes $\lim_{x \to \infty} f(x) = L$, one is thinking that $f(x)$ is getting close to the real number $L$ as $x$ increases without bound. But it does not make sense to measure how close $x$ is to infinity by choosing a $\delta > 0$ so that when $x$ is within $\delta$ of infinity, $f(x)$ is close to $L$. Since infinity is not a real number, one cannot measure the distance from the real number, $x$, to infinity, even less expect $x$ to get within $\delta$ of infinity. So how does one quantify getting closer to infinity? The answer lies in the phrase "increases without bound" which suggests that for any bound, $N$, you could place on the size of $x$, the value of $x$ can be made to be greater than that bound. Thus, instead of selecting a $\delta > 0$ and requiring $0 < |x - a| < \delta$, one chooses a number $N \in \mathbb{R}$ and requires $x > N$. This allows the following definition. Suppose that the function $f$ is defined for all $x > K$ for some real number $K$. Then the limit of $f$ as $x$ approaches infinity is $L$, $\lim_{x \to \infty} f(x) = L$, means that for every $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that for every $x > N$, it follows that $|f(x) - L| < \epsilon$ (Fig. 3.5). Now consider how one might write a proof of a limit at infinity. For example, consider proving the limit $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$. Here $f(x) = \frac{x}{x^2+6}$ and $L = 0$. As with other limit proofs, the goal is to arrange that $|f(x) - L| < \epsilon$ for an arbitrarily chosen $\epsilon > 0$. Again, you can work backwards. Since $|f(x) - L| = \left|\frac{x}{x^2+6}\right|$, as long as $x > 0$, it would follow that $\frac{x}{x^2+6} < \frac{x}{x^2} = \frac{1}{x}$. Thus, there is an expression, $\frac{1}{x}$, which is larger than $|f(x) - L|$ for all suitably large values of $x$. This will help because if you can assure that $\frac{1}{x}$ is less than $\epsilon$, it will follow that $|f(x) - L|$ is also less than $\epsilon$. It would not have been helpful to exhibit an expression that was always less than $|f(x) - L|$ because making that expression small would not imply that $|f(x) - L|$ is small. Now, if $x > \frac{1}{\epsilon}$, it follows that $\frac{1}{x} < \epsilon$ suggesting that $\frac{1}{\epsilon}$ is a suitable value for $N$.
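PROOF: $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$
Let $f(x) = \frac{x}{x^2+6}$.
Given $\epsilon > 0$, let $N = \frac{1}{\epsilon}$.
Select $x$ such that $x > N > 0$.
Then $x > \frac{1}{\epsilon}$ implies $\epsilon > \frac{1}{x} = \frac{x}{x^2} > \frac{x}{x^2+6} = \left|\frac{x}{x^2+6} - 0\right| = |f(x) - 0|$.
Therefore, $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$.

Fig. 3.5 Approaching a limit as $x \to \infty$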


Note that it is important that the third step of the proof pointed out that $N$ is positive. It is used in the fourth step when $\frac{1}{x}$ is calculated, and this would not have been allowed if the value of $x$ could have been zero.
For a second example, consider proving $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$. Again, you can work backwards to get $\epsilon > |f(x) - L| = \left|\frac{2x+5}{x-7} - 2\right| = \left|\frac{(2x+5) - 2(x-7)}{x-7}\right| = \left|\frac{19}{x-7}\right|$. From here there are a number of ways to proceed. You can solve for $x$ in the previous inequality to get $x > 7 + \frac{19}{\epsilon}$ which gives a reasonable value for $N$. Another way would be to say that if $x > 14$, then $x - 7 > x - \frac{x}{2} = \frac{x}{2}$. In this case the fraction $\frac{19}{x-7}$ is less than $\frac{19}{x/2} = \frac{38}{x}$, and it becomes clear that $x > \frac{38}{\epsilon}$ is sufficient.

This is an example demonstrating the enormous flexibility one sometimes has in
writing proofs in analysis where you often need to prove an inequality which can
be done in many ways. It is usually easier to prove an inequality involving a simple
fraction rather than a complicated fraction, so you can use the strategy of replacing
a fraction with a simpler fraction that is clearly larger, or in some cases, clearly
smaller. Keep in mind that a ratio of positive values gets larger if its numerator gets
larger or its denominator gets smaller.
A complete proof can be written as follows.
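PROOF: $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$
Let $f(x) = \frac{2x+5}{x-7}$.
Given $\epsilon > 0$, let $N = 7 + \frac{19}{\epsilon}$.
Select $x$ such that $x > N > 7$.
Then $x > 7 + \frac{19}{\epsilon}$ implies $x - 7 > \frac{19}{\epsilon}$ and $\epsilon > \frac{19}{x-7} = \left|\frac{2x+5}{x-7} - 2\right| = |f(x) - 2|$.
Therefore, $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$.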

As in the previous proof it is important that $x > 7$ is pointed out in the third step of the proof because that fact is needed both to ensure that $f(x)$ is defined by assuring $x - 7 \ne 0$ and that $x - 7$ is positive allowing the absolute value function to be introduced in the fifth step of the proof.
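The flexibility in choosing $N$ can be made concrete. The Python sketch below compares the value $N = 7 + \frac{19}{\epsilon}$ used in the proof with the simpler alternative $N = \max\left(14, \frac{38}{\epsilon}\right)$ that comes from replacing $\frac{19}{x-7}$ by $\frac{38}{x}$. The function names are invented here, the maximum with 14 is this sketch's way of recording the assumption $x > 14$, and the check samples only a handful of points beyond $N$.

```python
def f(x):
    """The function from the example; it is undefined at x = 7."""
    return (2 * x + 5) / (x - 7)


def n_solved(eps):
    """N obtained by solving 19/(x - 7) < eps for x."""
    return 7 + 19 / eps


def n_simplified(eps):
    """N obtained from the cruder bound 19/(x - 7) < 38/x, which needs x > 14."""
    return max(14.0, 38 / eps)


def holds_beyond(eps, n):
    """Test |f(x) - 2| < eps at x = n + 1, n + 10, ..., n + 10**6."""
    return all(abs(f(n + 10.0 ** e) - 2) < eps for e in range(0, 7))


for eps in (5.0, 1.0, 0.01):
    print(eps, n_solved(eps), holds_beyond(eps, n_solved(eps)),
          n_simplified(eps), holds_beyond(eps, n_simplified(eps)))
```

Both choices pass the sampled checks. The simplified $N$ is always at least as large as the solved one, which is the price paid for the easier algebra.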
With a slight adjustment of the definition of $\lim_{x \to \infty} f(x) = L$, one gets a definition of $\lim_{x \to -\infty} f(x) = L$. This time rather than choosing an $N$ and requiring $|f(x) - L| < \epsilon$ for all $x > N$, one instead needs $f(x)$ to be within $\epsilon$ of $L$ for those $x < N$. Thus, $\lim_{x \to -\infty} f(x) = L$ means that for every $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that for all $x < N$ it follows that $|f(x) - L| < \epsilon$.
To prove $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$, one can identify an $N$ such that $x < N$ implies that $\left|\frac{6x^2+5x}{2x^2-7} - 3\right| < \epsilon$ by working backwards. That is, $\left|\frac{6x^2+5x}{2x^2-7} - 3\right| = \left|\frac{(6x^2+5x) - 3(2x^2-7)}{2x^2-7}\right| = \left|\frac{5x+21}{2x^2-7}\right|$. It would be nice to simplify this rather messy expression; something you can do as long as you do not introduce changes that prevent the final inequality from holding. In this case, the $-7$ term in the denominator of $\frac{5x+21}{2x^2-7}$ is an inconvenience, and it would be nice to remove it. Simply removing this negative term would make the absolute value of the fraction smaller when what is needed is to make the fraction larger. A strategy that does work is to take part of the $2x^2$ term, which grows very large as $x$ goes to $-\infty$, and pair it with the $-7$ term. For example, $2x^2 - 7$ can be written as $x^2 + (x^2 - 7)$. Because $x^2 - 7$ is a positive value for all $x < -\sqrt{7}$, removing it from the denominator makes the absolute value of the fraction greater. Also note that when $x < -\frac{21}{10}$, the numerator $|5x + 21| < 5|x|$, and this happens for all $x < -\sqrt{7}$. It would then be sufficient for $\epsilon > \frac{5}{|x|} = \frac{5|x|}{x^2} > \left|\frac{5x+21}{2x^2-7}\right|$ or that $x < -\frac{5}{\epsilon}$ as long as $x < -\sqrt{7}$. A proof would be
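PROOF: $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$
Let $f(x) = \frac{6x^2+5x}{2x^2-7}$.
Given $\epsilon > 0$, let $N = \min\left(-\sqrt{7}, -\frac{5}{\epsilon}\right)$.
Select $x$ such that $x < N \le -\sqrt{7}$.
Then $x < N \le -\frac{5}{\epsilon}$ implies $\epsilon > \left|\frac{5}{x}\right| = \left|\frac{5x}{x^2}\right| > \left|\frac{5x+21}{x^2+(x^2-7)}\right| = \left|\frac{(6x^2+5x) - 3(2x^2-7)}{2x^2-7}\right| = \left|\frac{6x^2+5x}{2x^2-7} - 3\right| = |f(x) - 3|$.
Therefore, $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$.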
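As a sanity check on the choice $N = \min\left(-\sqrt{7}, -\frac{5}{\epsilon}\right)$, the following Python sketch evaluates $|f(x) - 3|$ at several points to the left of $N$ and also confirms the intermediate bound $\frac{5}{|x|}$ used in the proof. The function names are hypothetical, and the check covers only the sampled points.

```python
import math


def f(x):
    """The function from the example."""
    return (6 * x * x + 5 * x) / (2 * x * x - 7)


def n_from_proof(eps):
    """The choice made in the proof: the lesser of -sqrt(7) and -5/eps."""
    return min(-math.sqrt(7), -5 / eps)


def holds_below(eps):
    """Test |f(x) - 3| < 5/|x| < eps at x = N - 1, N - 10, ..., N - 10**6."""
    n = n_from_proof(eps)
    for e in range(0, 7):
        x = n - 10.0 ** e
        if not (abs(f(x) - 3) < 5 / abs(x) < eps):
            return False
    return True


for eps in (10.0, 1.0, 0.01):
    print(eps, n_from_proof(eps), holds_below(eps))
```

For $\epsilon = 10$ the first entry of the minimum is the binding one, which shows why $N$ must be kept below $-\sqrt{7}$ even when $\epsilon$ is generous.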

3.4.1 Exercises
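Find ways to justify each of the following inequalities that hold for large values of $x$.
1. $\frac{3x-5}{2x^2} < \frac{2}{x}$
2. $\frac{4x+7}{2x^2-6} < \frac{5}{x}$
3. $\frac{5x^2+3x+1}{x^3-x^2-1} < \frac{10}{x}$

Write proofs for each of the following limits.
4. $\lim_{x \to \infty} \frac{4}{x+4} = 0$
5. $\lim_{x \to \infty} \frac{3x-9}{3x+4} = 1$
6. $\lim_{x \to \infty} \frac{9x^2}{3x^2-10} = 3$
7. $\lim_{x \to \infty} \frac{x^3}{5x^3-2x^2-4} = \frac{1}{5}$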


3.5 Limit of a Sequence


3.5.1 Definition of Sequence
A sequence is just a function whose domain is the set of natural numbers, $\mathbb{N}$. In this chapter the codomain of a sequence will be the real numbers, $\mathbb{R}$, but you can have a sequence with any set serving as the codomain. Functions are usually referenced using the notation $f(x)$. But for sequences it is traditional to place the argument of a sequence in a subscript rather than within parentheses as in $a_1, a_2, a_3, \ldots$. The entire sequence is notated with angle brackets as in $\langle a_n \rangle$. Note that this is not the same as the set $\{a_1, a_2, a_3, \ldots\}$ which is just the collection of the values taken on by the sequence, that is, the range of the function $a : \mathbb{N} \to \mathbb{R}$. For each $n \in \mathbb{N}$, $a_n$ is called a term of the sequence, or specifically, the $n$th term of the sequence.

3.5.2 Arithmetic with Sequences


As with any real-valued function, you can add, subtract, multiply, and divide sequences. The sum of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $\langle c_n \rangle$ where, for each $n \in \mathbb{N}$, $c_n = a_n + b_n$. Similarly, one can define the difference of sequences and product of sequences as $c_n = a_n - b_n$ and $c_n = a_n \cdot b_n$, respectively. If the sequence $\langle b_n \rangle$ has no terms equal to zero, then the quotient of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $c_n = \frac{a_n}{b_n}$.
Other arithmetic operations can be similarly defined. If $f$ is any real-valued function with a domain that includes the range of the sequence $\langle a_n \rangle$, then it makes sense to define the sequence $c_n = f(a_n)$. For example, if $\langle a_n \rangle$ is the sequence $1, 3, 5, 7, \ldots$, then the sequence $\langle \sqrt{a_n} \rangle$ is the sequence $\sqrt{1}, \sqrt{3}, \sqrt{5}, \sqrt{7}, \ldots$.

3.5.3 Monotone Sequences


A sequence $\langle a_n \rangle$ is a monotone increasing sequence if $a_1 \le a_2 \le a_3 \le \ldots$, or in other words, for natural numbers $i < j$ it follows that $a_i \le a_j$. Similarly, $\langle a_n \rangle$ is a monotone decreasing sequence if $a_1 \ge a_2 \ge a_3 \ge \ldots$, or for natural numbers $i < j$ it follows that $a_i \ge a_j$. A monotone sequence is a sequence that is either monotone increasing or monotone decreasing. If a monotone increasing sequence $\langle a_n \rangle$ satisfies $a_i < a_j$ for all natural numbers $i < j$, then it is a strictly monotone increasing sequence. Similarly, $\langle a_n \rangle$ is strictly monotone decreasing if $a_i > a_j$ for all natural numbers $i < j$. For example, the following sequences are monotone increasing:

$1, 2, 3, \ldots$
$1, 1, 2, 3, 3, 4, 5, 5, \ldots$
$\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \ldots$
$1^3, 2^3, 3^3, 4^3, \ldots$

whereas the following sequences are monotone decreasing:

$1, 0, -1, -2, -3, \ldots$
$8, 4, 2, 1, \frac{1}{2}, \frac{1}{4}, \ldots$
$0, 0, 0, -\frac{1}{2}, -\frac{1}{2}, -\frac{1}{2}, -1, -1, -1, -\frac{3}{2}, \ldots$
$\frac{1}{4^4}, \frac{1}{5^5}, \frac{1}{6^6}, \ldots$

It is interesting to notice that every sequence of real numbers can be written as a sum of a monotone increasing sequence and a monotone decreasing sequence. In particular, if $\langle c_n \rangle$ is a sequence of real numbers, define an increasing sequence $\langle a_n \rangle$ and a decreasing sequence $\langle b_n \rangle$ as follows. Let $a_1 = c_1$ and $b_1 = 0$. Then for all $n \in \mathbb{N}$ if $c_n \le c_{n+1}$, define $a_{n+1} = c_{n+1} - b_n$ and $b_{n+1} = b_n$, and if $c_n > c_{n+1}$, define $a_{n+1} = a_n$ and $b_{n+1} = c_{n+1} - a_n$. These definitions make it clear that $c_n = a_n + b_n$ for each $n \in \mathbb{N}$. The sequence $\langle a_n \rangle$ is increasing because $c_n \le c_{n+1}$ implies that $a_{n+1} - a_n = (c_{n+1} - b_n) - (c_n - b_n) = c_{n+1} - c_n \ge 0$, and $c_n > c_{n+1}$ implies $a_n = a_{n+1}$. Similarly, $\langle b_n \rangle$ is decreasing because $c_n > c_{n+1}$ implies that $b_{n+1} - b_n = (c_{n+1} - a_n) - (c_n - a_n) = c_{n+1} - c_n < 0$, and $c_n \le c_{n+1}$ implies $b_n = b_{n+1}$. Thus, $1, -1, 2, -2, 3, -3, \ldots$ can be written as the sum of the two sequences $1, 1, 4, 4, 9, 9, \ldots$ and $0, -2, -2, -6, -6, -12, \ldots$.
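The recursive construction is easy to carry out mechanically. Here is a short Python sketch that applies it to a finite list of terms; the function name `split_monotone` is made up for this illustration, and the lists are indexed from 0 rather than from 1.

```python
def split_monotone(c):
    """Split the terms c[0], c[1], ... into an increasing list a and a
    decreasing list b with c[n] == a[n] + b[n], following the construction
    in the text (a_1 = c_1, b_1 = 0)."""
    a, b = [c[0]], [0]
    for n in range(len(c) - 1):
        if c[n] <= c[n + 1]:
            # The terms went up: a absorbs the increase, b stays put.
            a.append(c[n + 1] - b[n])
            b.append(b[n])
        else:
            # The terms went down: a stays put, b absorbs the decrease.
            a.append(a[n])
            b.append(c[n + 1] - a[n])
    return a, b


c = [1, -1, 2, -2, 3, -3]
a, b = split_monotone(c)
print(a)  # [1, 1, 4, 4, 9, 9]
print(b)  # [0, -2, -2, -6, -6, -12]
```

The printed lists agree with the two sequences given in the text for $1, -1, 2, -2, 3, -3, \ldots$.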

3.5.4 Subsequences
Intuitively, a subsequence of a sequence $\langle a_n \rangle$ is a sequence whose terms include some of the terms of the sequence $\langle a_n \rangle$ in the same order as they appear in the original sequence. Formally, if there is a strictly increasing sequence of natural numbers $i : \mathbb{N} \to \mathbb{N}$, then $\langle a_{i_n} \rangle$ is a subsequence of the sequence $\langle a_n \rangle$. Thus, the sequence $1, -1, 2, -2, 3, -3, \ldots$ has the following subsequences

$1, 2, 3, \ldots$
$1, -1, 3, -3, 5, -5, \ldots$
$2, 3, 5, 7, 11, \ldots$

The sequence $1, 2, 2, 3, 3, 3, 4, 4, 4, 4, \ldots$ is not a subsequence of $1, -1, 2, -2, 3, -3, \ldots$ since there are no repeated values in the original sequence, so there can be no repeated values in any of its subsequences.
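The formal definition says that a subsequence is determined by a strictly increasing index sequence $i_1 < i_2 < i_3 < \ldots$. The Python sketch below illustrates this for the sequence $1, -1, 2, -2, 3, -3, \ldots$; the function names and the two index formulas are choices made for this illustration only.

```python
def original(n):
    """The n-th term (n = 1, 2, 3, ...) of the sequence 1, -1, 2, -2, 3, -3, ..."""
    k = (n + 1) // 2
    return k if n % 2 == 1 else -k


def subsequence(term, index, count):
    """Return the first `count` terms of the subsequence picked out by `index`.

    `index` must be strictly increasing on 1, 2, ..., count.
    """
    picks = [index(n) for n in range(1, count + 1)]
    if any(i >= j for i, j in zip(picks, picks[1:])):
        raise ValueError("index sequence must be strictly increasing")
    return [term(i) for i in picks]


# Indices 1, 3, 5, ... pick out the positive terms.
print(subsequence(original, lambda n: 2 * n - 1, 5))  # [1, 2, 3, 4, 5]

# Indices 1, 2, 5, 6, 9, 10, ... pick out 1, -1, 3, -3, 5, -5, ...
print(subsequence(original, lambda n: 4 * ((n - 1) // 2) + 1 + (n - 1) % 2, 6))
```

An index function that repeats or decreases is rejected, which mirrors the requirement that the terms of a subsequence appear in their original order.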


3.5.5 Limit of a Sequence


The definition of the limit of a sequence is similar to that of the limit of a function as $x \to \infty$ except that the function is only defined on the natural numbers. Thus, if $\langle a_n \rangle$ is a sequence of real numbers, then the limit of the sequence is $L$, $\lim_{n \to \infty} a_n = L$, means that for all $\epsilon > 0$ there is an $N$ such that for every natural number $n > N$ it follows that $|a_n - L| < \epsilon$. A sequence that has limit $L$ is said to converge to $L$ and is said to be a convergent sequence. A sequence that does not converge is said to diverge and is said to be a divergent sequence.
Except for slight notational changes, proving that a sequence has a particular limit involves the same type of work as proving that a function has a particular limit as its variable approaches infinity. For example, the sequence $a_n = \frac{4n^2+n+2}{2n^2-7}$ has limit 2. To prove this, given an $\epsilon > 0$, you would need to exhibit a number $N$ such that $|a_n - 2| < \epsilon$ for all $n > N$. As with writing proofs about functions, one can work backwards from $|a_n - 2| = \left|\frac{4n^2+n+2}{2n^2-7} - 2\right| = \left|\frac{n+16}{2n^2-7}\right| \le \left|\frac{n+16n}{n^2+(n^2-7)}\right|$. If you stipulate that $n \ge 3$, then $n^2 - 7 \ge 9 - 7 = 2 > 0$ allowing you to conclude that $|a_n - 2| < \frac{17n}{n^2} = \frac{17}{n}$ which can easily be made less than $\epsilon$ by requiring $n > \frac{17}{\epsilon}$. This is what is needed for the proof.
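PROOF: $\lim_{n \to \infty} \frac{4n^2+n+2}{2n^2-7} = 2$
Let $a_n = \frac{4n^2+n+2}{2n^2-7}$.
Given $\epsilon > 0$, let $N = \max\left(3, \frac{17}{\epsilon}\right)$.
Select an $n > N$.
Since $N \ge 3$, it follows that $n^2 > 9$.
Also, $n > N$ gives $n \ge \frac{17}{\epsilon}$. Thus, $\epsilon \ge \frac{17}{n} = \frac{17n}{n^2} > \left|\frac{n+16n}{n^2+(n^2-7)}\right| > \left|\frac{n+16}{2n^2-7}\right| = \left|\frac{4n^2+n+2}{2n^2-7} - 2\right| = |a_n - 2|$.
Therefore, $\lim_{n \to \infty} \frac{4n^2+n+2}{2n^2-7} = 2$.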
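Because the terms of this sequence are rational, the inequality $|a_n - 2| < \epsilon$ can be checked exactly for a stretch of indices beyond $N$. The Python sketch below does so with the standard `fractions` module; the function names and the decision to test 200 consecutive indices are choices made for this illustration, and a finite check does not replace the proof.

```python
import math
from fractions import Fraction


def a(n):
    """The n-th term of the sequence, as an exact fraction."""
    return Fraction(4 * n * n + n + 2, 2 * n * n - 7)


def n_from_proof(eps):
    """The choice made in the proof: the greater of 3 and 17/eps."""
    return max(Fraction(3), Fraction(17) / eps)


def holds_beyond_n(eps, how_many=200):
    """Test |a_n - 2| < eps exactly for the first `how_many` natural numbers n > N."""
    start = math.floor(n_from_proof(eps)) + 1
    return all(abs(a(n) - 2) < eps for n in range(start, start + how_many))


for eps in (Fraction(10), Fraction(1), Fraction(1, 100)):
    print(eps, n_from_proof(eps), holds_beyond_n(eps))
```

The bound $\frac{17}{n}$ is far from tight, so the inequality in fact holds for many indices smaller than $N$ as well; the proof only needs some $N$ that works, not the smallest one.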

3.5.6 Limits of Monotone Sequences and Mathematical Induction
A function $f : A \to \mathbb{R}$ is said to be bounded above if the set $\{f(x) \mid x \in A\}$ is bounded above, that is, if there exists an $M \in \mathbb{R}$ such that $f(x) \le M$ for all $x$ in the domain $A$ of $f$. In this case $M$ is an upper bound of $f$. Similarly, the function is said to be bounded below if the set $\{f(x) \mid x \in A\}$ is bounded below. A function that is both bounded above and bounded below is said to be bounded. Because a real-valued sequence $\langle a_n \rangle$ is just a real-valued function whose domain is the natural numbers, $\mathbb{N}$, these definitions apply to sequences as well.


One of the most important properties of monotone sequences is that monotone increasing sequences that are bounded above must converge and monotone decreasing sequences that are bounded below must converge. Thus, bounded monotone sequences converge. If a monotone sequence does not converge, then its terms must continue to grow without bound and approach plus or minus infinity.
So how would you prove that a monotone increasing sequence that is bounded above converges? When proving a limit of the form $\lim_{n \to \infty} a_n = L$, you can work with the inequality $\epsilon > |a_n - L|$ in order to find an appropriate value of $N$ that allows you to use the definition of limit to complete the proof. But in this case, you do not have a general expression for the terms $a_n$, and you have not been given a value for $L$. Somehow you need to use the only known facts about $\langle a_n \rangle$, that is, the fact that the sequence is both monotone increasing and bounded, to come up with a candidate to serve as the limit, $L$, in the proof.
The definition of a sequence being bounded above holds the key. That definition says that the sequence $\langle a_n \rangle$ is bounded above if the set $\{a_n \mid n \in \mathbb{N}\}$ is bounded above, so there is a real number $M$ which is greater than or equal to each term of the sequence. Will this $M$ be the limit of the sequence? Well, not usually. If $M$ is an upper bound for the sequence, then so are $M + 1$, $M + 100$, and $M + 20{,}000$. They are all upper bounds, but they cannot all be limits of the sequence. You should recognize that the terms of the sequence must get close to the limit, and the only upper bound of the set $\{a_n \mid n \in \mathbb{N}\}$ that the terms could get close to is the least upper bound of the set. Since $\{a_n \mid n \in \mathbb{N}\}$ is both nonempty and bounded above, the Completeness Axiom for the real numbers guarantees that such a least upper bound exists. This gives you a candidate for $L$.
The proof will require you to show that for all $n$ greater than some $N$, the terms of the sequence, $\langle a_n \rangle$, are within $\epsilon$ of $L$. How can this be arranged? Here is where you can use the fact that the sequence is monotone increasing because once you find a single term, $a_n$, that gets within $\epsilon$ of $L$, all the terms that come after this term in the sequence will necessarily have to be between $a_n$ and $L$, so they also will be within $\epsilon$ of $L$. How do you find one term, $a_n$, within $\epsilon$ of $L$? This follows from the fact that $L$ is a least upper bound of $\{a_n \mid n \in \mathbb{N}\}$. Because $L$ is the least upper bound, $L - \epsilon$ being less than the least upper bound, $L$, is not an upper bound, so there must be an element of the set $\{a_n \mid n \in \mathbb{N}\}$ greater than $L - \epsilon$. This gives all the tools needed for the proof (Fig. 3.6).

Fig. 3.6 Proving bounded monotone sequences converge


So how would you write the proof? Certainly the proof would begin with selecting a generic sequence and making a statement about the properties the sequence is assumed to have, that is, its being monotone increasing and bounded above. Then, the proof would proceed to justify the existence of the least upper bound for the set of terms of the sequence; that will give you the target value of $L$. Then, as with most proofs about limits, it would select a value for $\epsilon > 0$. Unlike the limit proofs earlier in this chapter, one cannot immediately state a value for $N$. The existence of $N$ must be proved as discussed in the previous paragraph. Finally, the properties of the sequence can be brought together to show $|a_n - L| < \epsilon$ for all $n > N$. Here is one possible proof.
PROOF: A monotone increasing sequence that is bounded above converges.
Let $\langle a_j \rangle$ be a monotone increasing sequence of real numbers that is bounded above.
Since the set of terms $A = \{a_j \mid j \in \mathbb{N}\}$ contains $a_1$, it is nonempty, and since it is bounded above, the Completeness Axiom guarantees that $A$ has a least upper bound, $L$.
Given $\epsilon > 0$, the number $L - \epsilon$ is less than $L$. Since $L$ is the least upper bound of $A$, $L - \epsilon$ is not an upper bound of $A$. Thus, there is an $N \in \mathbb{N}$ such that the term $a_N$ is in $A$ and is larger than $L - \epsilon$.
Select an $n > N$.
Because $\langle a_j \rangle$ is monotone increasing, $a_n \ge a_N$. Because $L$ is an upper bound for $A$, $a_n \le L$. Therefore, $L - \epsilon < a_N \le a_n \le L$, and $|a_n - L| < |(L - \epsilon) - L| = \epsilon$.
This proves that the sequence $\langle a_j \rangle$ has limit $L$ and that $\langle a_j \rangle$ converges.
Note that the proof needs to refer to the sequence $\langle a_n \rangle$ as well as a particular element of the sequence $a_n$. It could be confusing to the proof reader to use the variable $n$ in both contexts here, especially since the sequence notation $\langle a_n \rangle$ is used after the choice of a specific value of $n$ is made. That is the reason the proof changed to using the variable $j$ to refer to a generic term index. Then, it could refer to a specific term using index $n$ without confusing the two uses.
There is also a theorem stating that a monotone decreasing sequence that is
bounded below converges. The proof of this is left as an exercise.
As an illustration of the usefulness of the above result, consider a sequence defined recursively by $a_1 = 2$, and for $n \ge 1$, $a_{n+1} = \sqrt{a_n + 12}$. That is, $a_1 = 2$, $a_2 = \sqrt{a_1 + 12} = \sqrt{14}$, $a_3 = \sqrt{\sqrt{14} + 12}$, and so forth. One can prove that this sequence converges by showing that the sequence is both monotone increasing and bounded above. Indeed, both of these facts can be established by mathematical induction. The reader is likely already familiar with proofs by mathematical induction, but this is an appropriate opportunity to review the method and its merits.


Suppose the variable $n$ represents any natural number, and there is a statement $S(n)$ that includes this variable as part of the statement. For example, the statement could be $\lim_{x \to a} x^n = a^n$. Mathematical induction is a proof technique that uses the following proof template to show that $S(n)$ is true for all $n$ greater than or equal to some base value $b \in \mathbb{N}$.
TEMPLATE for using mathematical induction to prove the statement $S(n)$ is true for all natural numbers $n \ge b$.
SET THE CONTEXT: The statement will be proved by mathematical induction on $n$ for all $n \ge b$.
PROVE $S(b)$: Prove that the statement is true when the variable $n$ is equal to the base value, $b$.
STATE THE INDUCTION HYPOTHESIS: Assume that $S(n)$ is true for some natural number $n = k \ge b$.
PERFORM THE INDUCTION STEP: Using the fact that $S(k)$ is true, prove that $S(k + 1)$ is true.
STATE THE CONCLUSION: Therefore, by mathematical induction, $S(n)$ is true for all natural numbers $n \ge b$.
It is important to understand that the technique of mathematical induction works. That is, if the statement $S(b)$ is true, and if the statement $S(k) \to S(k + 1)$ is true, then, in fact, $S(n)$ must be true for all natural numbers $n \ge b$. Certainly, $S(b)$ is true. Because $S(b)$ is true, and $S(k) \to S(k + 1)$ is true for all $k \ge b$, it follows that $S(b) \to S(b + 1)$, so $S(b + 1)$ is true. Then $S(b + 1) \to S(b + 2)$, $S(b + 2) \to S(b + 3)$, and so forth, so the fact that $S(n)$ is true for all $n \ge b$ follows.
The strength of mathematical induction is that it is often much easier to provide a proof for the one step $S(k) \to S(k + 1)$ than it is to prove $S(n)$ in the general case. The reader has likely seen many statements proved by mathematical induction while studying Algebra, Calculus, or just about any other branch of mathematics.
Mathematical induction is an excellent tool for proving that the previously introduced recursive sequence is both monotone increasing and bounded above. Clearly, $a_2 = \sqrt{14} > \sqrt{4} = 2 = a_1$ so $a_1 < a_2$. Suppose that for some $k \ge 1$ one has $a_k < a_{k+1}$. Then it follows that $a_k + 12 < a_{k+1} + 12$ so $\sqrt{a_k + 12} < \sqrt{a_{k+1} + 12}$ which shows that $a_{k+1} < a_{k+2}$. Thus, by mathematical induction it follows that $a_n < a_{n+1}$ for all $n$, and the sequence is monotone increasing. Also clear is that $a_1 = 2 < 4$. Suppose that for some $k \ge 1$ that $a_k < 4$. Then $a_{k+1} = \sqrt{a_k + 12} < \sqrt{4 + 12} = \sqrt{16} = 4$. Thus, by mathematical induction it follows that $a_n < 4$ for all $n$, and the sequence is bounded above. The limit of this sequence can be shown to be 4. In particular, if the limit is $L$, one can conclude that $\sqrt{a_n + 12}$ should be converging to $\sqrt{L + 12}$ which should equal the limit of $a_{n+1}$ which is also $L$. Thus, one would expect that $L = \sqrt{L + 12}$. This equation has only one positive real solution, $L = 4$.
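The behavior established by induction can be observed by computing the first few terms. The Python sketch below uses floating-point arithmetic, so it is limited to the first dozen terms; beyond that the terms are so close to 4 that rounding would hide the strict increase. The function name is invented for this illustration.

```python
import math


def first_terms(count):
    """Return a_1, ..., a_count where a_1 = 2 and a_(n+1) = sqrt(a_n + 12)."""
    terms = [2.0]
    while len(terms) < count:
        terms.append(math.sqrt(terms[-1] + 12))
    return terms


terms = first_terms(12)
is_increasing = all(x < y for x, y in zip(terms, terms[1:]))
is_bounded_by_4 = all(x < 4 for x in terms)

for n, value in enumerate(terms, start=1):
    print(n, value)
print(is_increasing, is_bounded_by_4)
```

The computed terms increase toward 4 without reaching it, and the gap to 4 shrinks by roughly a factor of 8 at each step, which is consistent with the limit $L = 4$ found from $L = \sqrt{L + 12}$.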


3.5.7 Cauchy Sequences


A Cauchy sequence is a sequence whose terms get close together. As with the
definition of limit, the concept of "close" needs to be made precise. As with the
definition of limit, "close" means that given any tolerance $\varepsilon > 0$, one can go out far
enough in the sequence to ensure that all terms of the sequence beyond that point
are within $\varepsilon$ of each other. Thus, a sequence is Cauchy if for every $\varepsilon > 0$ there is an
$N$ such that if natural numbers $m$ and $n$ are both greater than $N$, then $|a_m - a_n| < \varepsilon$.
If a sequence of real numbers converges, then the sequence is Cauchy. The proof
of this fact uses a strategy employed repeatedly in Analysis, that is, if two quantities
are very close to the same value, then they must be very close to each other. This
standard technique for proving that two quantities are close to each other involves
the use of the triangle inequality. In particular, if $\lim_{j \to \infty} a_j = L$, then for every
$\varepsilon > 0$ there is an $N$ such that if natural number $n > N$, then $|a_n - L| < \varepsilon$. Well then,
certainly if $m$ and $n$ are both natural numbers greater than $N$, then both $|a_m - L| < \varepsilon$
and $|a_n - L| < \varepsilon$. Adding these two inequalities together shows that
$|a_m - L| + |a_n - L| < \varepsilon + \varepsilon$. The triangle inequality states that for any real numbers
$x$ and $y$, $|x| + |y| \ge |x + y|$. Thus,
$2\varepsilon > |a_m - L| + |a_n - L| = |a_m - L| + |L - a_n| \ge |(a_m - L) + (L - a_n)| = |a_m - a_n|$.
Of course, the definition of Cauchy sequence requires you to show that $|a_m - a_n|$ is less
than $\varepsilon$, not $2\varepsilon$. But you have an enormous amount of flexibility when working with these
types of inequalities, so you could have asked instead for an $N$ such that for all natural
numbers $n$ greater than $N$, you have $|a_n - L|$ less than $\frac{\varepsilon}{2}$ rather than less than $\varepsilon$.
Thus, the proof could be as follows.
PROOF: Every convergent sequence is Cauchy.
Let $\langle a_j \rangle$ be a sequence of real numbers with $\lim_{j \to \infty} a_j = L$.
Let $\varepsilon > 0$ be given.
From the definition of limit, there is a number $N$ such that for all natural
numbers $j > N$, it follows that $|a_j - L| < \frac{\varepsilon}{2}$.
Then for all natural numbers $m$ and $n$ greater than $N$, $|a_m - L| < \frac{\varepsilon}{2}$ and
$|a_n - L| < \frac{\varepsilon}{2}$, so
$\varepsilon = \frac{\varepsilon}{2} + \frac{\varepsilon}{2} > |a_m - L| + |a_n - L| = |a_m - L| + |L - a_n| \ge |(a_m - L) + (L - a_n)| = |a_m - a_n|$.
This shows that the convergent sequence $\langle a_j \rangle$ is Cauchy.
Note that the converse of this theorem also holds. That is, any sequence of
real numbers that is Cauchy is a convergent sequence. This result will be proved
in Sect. 3.7. An important and useful consequence of the above theorem is its
contrapositive: If a sequence is not Cauchy, then it does not converge. Often when
one wants to show that a sequence does not converge, one shows that there is some
$\varepsilon > 0$ such that for every $N$ there are natural numbers $m$ and $n$ greater than $N$ for
which $|a_m - a_n| \ge \varepsilon$.
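The Cauchy condition and its failure can be illustrated on a finite window of terms. This is only a sketch under stated assumptions: the helper `max_gap` and the two sample sequences $a_n = \frac{1}{n}$ and $a_n = (-1)^n$ are chosen here for illustration, and inspecting finitely many terms can suggest, but never prove, that a sequence is Cauchy.

```python
def max_gap(term, N, window):
    """Largest |a_m - a_n| over N < m, n <= N + window (a finite spot check only)."""
    values = [term(n) for n in range(N + 1, N + window + 1)]
    return max(values) - min(values)

def reciprocal(n):
    return 1 / n            # a_n = 1/n, which converges to 0

def alternating(n):
    return (-1) ** n        # a_n = (-1)^n, which does not converge

eps = 0.01
# For a_n = 1/n, any N >= 2/eps forces |a_n - 0| < eps/2 for all n > N.
print(max_gap(reciprocal, 200, 1000) < eps)
# For a_n = (-1)^n, consecutive terms beyond any N differ by 2, so eps = 2 witnesses non-Cauchy.
print([max_gap(alternating, N, 2) for N in (1, 10, 100, 1000)])
```

For the alternating sequence, taking $m = N + 1$ and $n = N + 2$ gives $|a_m - a_n| = 2$ no matter how large $N$ is, which is exactly the kind of witness the contrapositive asks for.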
Another important property of Cauchy sequences is that all Cauchy sequences are
bounded. If the sequence $\langle a_n \rangle$ is Cauchy, then there is a natural number $N$ such
that whenever $m, n \ge N$, the difference $|a_m - a_n| < 1$. The set $\{a_1, a_2, a_3, \ldots, a_N\}$
is a finite set, so it is bounded by some number, $K$. That is, $|a_n| \le K$ for all $n \le N$.
If $m > N$, then, since both $N$ and $m$ are greater than or equal to $N$, it follows that
$|a_m - a_N| < 1$ from which it follows that $|a_m| < |a_N| + 1 \le K + 1$. Then the
sequence $\langle a_n \rangle$ is necessarily bounded above by $K + 1$ and below by $-(K + 1)$, and
the sequence is bounded. A complete proof follows.
PROOF: All Cauchy sequences are bounded.
Let $\langle a_n \rangle$ be a Cauchy sequence.
Then there is a natural number $N$ such that for all $m, n \ge N$, $|a_m - a_n| < 1$.
The set $\{a_1, a_2, a_3, \ldots, a_N\}$ is a finite set, so there is a $K$ such that the set is
bounded above by $K$ and bounded below by $-K$.
Let $m$ be any natural number. If $m \le N$, then $|a_m| \le K$. If $m > N$, then
$|a_m - a_N| < 1$, so $|a_m| = |a_m - a_N + a_N| \le |a_m - a_N| + |a_N| < 1 + K$.
It follows that all terms of the sequence lie between $-(K + 1)$ and $K + 1$,
and, thus, the sequence is bounded.
One consequence of the last two results is that since all convergent sequences are
Cauchy, all convergent sequences are bounded. The concept of a Cauchy sequence is
not only applied to sequences of numbers but also to much more general sequences
such as sequences of vectors, sequences of functions, and sequences of linear
operators. Of course, one would need a way to discuss distances between the terms
of a sequence in these other contexts, but when that makes sense, the concept of a
Cauchy sequence becomes important.

3.5.8 Exercises
1. Which of the following sequences are monotone? Which of them are bounded
above? Which of them are bounded below? Which of them are bounded?
(a) $a_n = (-1)^n$
(b) $a_n = \frac{n}{n+1}$
(c) $a_n = 5n$
(d) $a_n = \frac{(-1)^n}{5n}$
(e) $a_n = \frac{1 + (-1)^n}{n + n^{-1}}$
(f) $a_n = 5 - n(-1)^n$
(g) $a_n = 1 - \frac{1}{2} - \frac{1}{3} - \cdots - \frac{1}{n}$

2. Write proofs of each of the following limits.
(a) $\lim_{n \to \infty} \frac{6n}{3n+1} = 2$
(b) $\lim_{n \to \infty} \frac{4n-1}{n+6} = 4$
(c) $\lim_{n \to \infty} \frac{n^2+2n+1}{n^2-2n-5} = 1$

3. If $a_1 = 3$ and $a_n$ is defined recursively by $a_{n+1} = \sqrt{3a_n + 10}$, show that the
sequence $\langle a_n \rangle$ converges.
4. If $a_1 = 7$ and $a_n$ is defined recursively by $a_{n+1} = \sqrt{3a_n + 4}$, show that the
sequence $\langle a_n \rangle$ converges.
5. Prove that a monotone decreasing sequence that is bounded below converges.
6. Let $\langle a_n \rangle$ be any sequence. Prove that $\langle a_n \rangle$ has a monotone subsequence.
7. Prove that if $\langle a_n \rangle$ is a sequence such that $L = \lim_{n \to \infty} a_{2n} = \lim_{n \to \infty} a_{2n+1}$, then the
sequence converges to $L$.
8. Prove that if $\langle a_n \rangle$ is a sequence that converges to $L$, then the sequence
$a_1, a_1, a_2, a_2, a_3, a_3, \ldots$ also converges to $L$.
9. Prove that if $\langle a_n \rangle$ is a sequence that converges to $L$, then the sequence
$a_1, a_2, a_2, a_3, a_3, a_3, a_4, a_4, a_4, a_4, \ldots$ also converges to $L$.

3.6 Proving That a Limit Does Not Exist


3.6.1 Why a Limit Might Not Exist
$\lim_{x \to a} f(x) = L$ means that if $x$ is required to stay close to $a$, then $f(x)$ will stay close
to $L$. So what does it mean for $\lim_{x \to a} f(x)$ not to exist? Intuitively, it could mean that
in every neighborhood of $a$ there are values of $x$ for which $f(x)$ is close to one value
$L_1$ and other values of $x$ for which $f(x)$ is close to another value $L_2$. That is what
happens with the function
$f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$
as $x$ approaches 2. For some values of $x$ near 2, $f(x)$ is close to 3, and for some values
of $x$ near 2, $f(x)$ is close to 6. Thus, the limit does not exist. Another well-known
example is $f(x) = \sin\left(\frac{1}{x}\right)$ which oscillates wildly as $x$ approaches zero, and in every
neighborhood of 0, the function takes on all values in the interval $[-1, 1]$ infinitely often.
Another way for the limit not to exist is for the values of $f(x)$ to grow without bound and
approach infinity or negative infinity such as what happens to $f(x) = \frac{x+3}{(x-5)^2}$ as $x$
approaches 5.
One can write a proof showing that a particular function has no limit at $x = a$,
but before discussing how to do this, it is worth taking a close look at the definition
of limit.

3.6.2 Quantifiers and Negations


To say that a function $f$ has a limit at $x = a$ is to say that there exists a real number
$L$ such that for all $\varepsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$
implies $|f(x) - L| < \varepsilon$. This definition is actually a fairly complicated statement. At
the heart of it is the conditional statement "$0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$."
But this is an open statement, that is, even though the function $f$ and the limit point
$a$ are supposedly known, the statement contains variables $x$, $L$, $\varepsilon$, and $\delta$, all of which
are unknown. Thus, this open statement does not have any truth value until these
four variables have been stipulated. They are stipulated with four phrases: "there is
a real number $L$," "for all $\varepsilon > 0$," "there is a $\delta > 0$," and "for every $x$." These four
phrases are called quantifications of the variables because they indicate for which
values of the variables the following statement must hold. Two of the phrases use
the existential quantifier "there exists." It indicates that there is at least one value
of the variable that will make the following statement true. The other two phrases
use the universal quantifier "for all." It indicates that every possible value of that
variable will make the following statement true. So:
The statement "there exists a real number $L$ such that for all $\varepsilon > 0$ there is a $\delta > 0$
such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$" begins with the
existential quantifier "there exists a real number $L$," and the entire statement is
true if, in fact, there is a value of the variable $L$ that makes the following statement
true, that is, "for all $\varepsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$
implies $|f(x) - L| < \varepsilon$."
The statement "for all $\varepsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$
implies $|f(x) - L| < \varepsilon$" begins with the universal quantifier "for all $\varepsilon > 0$," and
the entire statement is true if, in fact, every possible positive value of the variable
$\varepsilon$ makes the following statement true, that is, "there is a $\delta > 0$ such that for every
$x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$."
The statement "there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies
$|f(x) - L| < \varepsilon$" begins with the existential quantifier "there is a $\delta > 0$," and the
entire statement is true if, in fact, there is a positive value of the variable $\delta$ that
makes the following statement true, that is, "for every $x$, $0 < |x - a| < \delta$ implies
$|f(x) - L| < \varepsilon$."
The statement "for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$" begins with
the universal quantifier "for every $x$," and the entire statement is true if, in fact,
every possible value of the variable $x$ makes the following statement true, that is,
"$0 < |x - a| < \delta$ implies $|f(x) - L| < \varepsilon$."
A proof that no limit exists must prove the negation of the statement that says that
a limit does exist, so it is important that one can generate the negation of a statement
that contains quantifiers such as this one does. The logic of doing this is not hard
to follow. Suppose that $P(y)$ is a statement that depends on the value of a variable $y$.
Then the universally quantified statement "for every $y$, $P(y)$" says that $P(y)$ is true
for every possible value of $y$. The negation of "for every $y$, $P(y)$" must be that it
is false that every value of $y$ makes $P(y)$ true, so there must be at least one $y$ that
makes $P(y)$ a false statement. This means that the negation of "for every $y$, $P(y)$"
is the statement "there is a $y$ such that $\neg P(y)$." To negate a universally quantified
statement, change the universal quantifier to an existential quantifier and negate the
statement that follows.
What if the original statement is an existentially quantified statement such as
"there is a $y$ such that $P(y)$"? This statement says that some value of $y$ makes
$P(y)$ true. The negation of this statement must be that no value of $y$ makes $P(y)$
true, which is to say that every value of $y$ makes $P(y)$ a false statement. This means
that the negation of "there is a $y$ such that $P(y)$" is the statement "for all $y$, $\neg P(y)$."
To negate an existentially quantified statement, change the existential quantifier to a
universal quantifier and negate the statement that follows.
The statement that $f$ has a limit at $x = a$ is a statement that has an existential
quantifier followed by a universal quantifier followed by an existential quantifier
followed by a universal quantifier followed by a conditional statement. To prove
that $f$ does not have a limit at $x = a$ requires a proof of the negation of that
statement. From the previous discussion it is now clear that to get the negation of
the statement that $f$ has a limit at $a$, you must flip the two existential quantifiers to
universal quantifiers, flip the two universal quantifiers to existential quantifiers, and
end with the negation of the conditional statement. The result is "for all real numbers
$L$ there is an $\varepsilon > 0$ such that for all $\delta > 0$ there is an $x$ such that $0 < |x - a| < \delta$ and
$|f(x) - L| \ge \varepsilon$."
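The quantifier flipping can be mirrored with Python's `any` (existential) and `all` (universal) over finite grids of candidate values. This is a hedged sketch only: the function names, the grids, and the sample function `line` are assumptions made for illustration, and a finite grid cannot decide whether a real limit exists. What it does show is that the flipped statement is always the exact opposite of the original.

```python
def has_limit_on_grid(f, a, Ls, epsilons, deltas, xs):
    """Finite-grid analogue of: there exists L such that for all eps there is a
    delta such that for every x, 0 < |x - a| < delta implies |f(x) - L| < eps."""
    return any(
        all(
            any(
                all(not (0 < abs(x - a) < d) or abs(f(x) - L) < e for x in xs)
                for d in deltas)
            for e in epsilons)
        for L in Ls)

def no_limit_on_grid(f, a, Ls, epsilons, deltas, xs):
    """Negation: every quantifier is flipped and the implication becomes
    '0 < |x - a| < delta and |f(x) - L| >= eps'."""
    return all(
        any(
            all(
                any(0 < abs(x - a) < d and abs(f(x) - L) >= e for x in xs)
                for d in deltas)
            for e in epsilons)
        for L in Ls)

def step(x):
    return 4 * x - 5 if x < 2 else 10 - 2 * x   # the piecewise example from Sect. 3.6.1

def line(x):
    return 4 * x - 5                             # hypothetical comparison function

grid = dict(
    a=2,
    Ls=[k / 2 for k in range(17)],               # candidate limits 0, 0.5, ..., 8
    epsilons=[1.0, 0.5],
    deltas=[0.5, 0.1, 0.02],
    xs=[2 + k / 100 for k in range(-50, 51)],
)
for f in (step, line):
    print(f.__name__, has_limit_on_grid(f, **grid), no_limit_on_grid(f, **grid))
```

For each function the two printed truth values are opposite, as the negation rules require.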

3.6.3 Proving No Limit Exists


Getting back to writing a proof that a limit does not exist, the proof would need to
show that for every real number $L$ there is an $\varepsilon > 0$ such that for every $\delta > 0$ there
is an $x$ within $\delta$ of $a$ such that $|f(x) - L| \ge \varepsilon$. This is often done by exhibiting an $x_1$
and an $x_2$ within $\delta$ of $a$ such that $f(x_1)$ and $f(x_2)$ are so far apart that they could not
both be within $\varepsilon$ of any $L$. That suggests the following template for proving that a
particular limit does not exist.
TEMPLATE for proving $\lim_{x \to a} f(x)$ does not exist
SET THE CONTEXT: Make statements about what is known about the
function $f$ and the number $a$.
SELECT AN ARBITRARY LIMIT $L$: Given $L \in \mathbb{R}$,
PROPOSE A VALUE FOR $\varepsilon$: let $\varepsilon = $ ____. Here you would insert a value for
$\varepsilon$.
SELECT AN ARBITRARY $\delta > 0$: Select $\delta > 0$.
SELECT VALUES FOR $x_1$ AND $x_2$: Let $x_1 = $ ____ and $x_2 = $ ____. Note that
$0 < |x_1 - a| < \delta$, $0 < |x_2 - a| < \delta$, and $|f(x_1) - f(x_2)| \ge 2\varepsilon$. You would have
selected appropriate $x_1$ and $x_2$ in such a way that $|f(x_1) - f(x_2)|$ exceeds $2\varepsilon$.
LIST IMPLICATIONS: Assume that $|f(x_1) - L| < \varepsilon$ and $|f(x_2) - L| < \varepsilon$.
Then $2\varepsilon = \varepsilon + \varepsilon > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge
|f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)|$.
STATE THE CONTRADICTION: This shows that $2\varepsilon > |f(x_1) - f(x_2)|$
which is a contradiction.
STATE THE CONCLUSION: Thus, it cannot hold that both $|f(x_1) - L| < \varepsilon$
and $|f(x_2) - L| < \varepsilon$, and the limit does not exist.


For example, consider the limit of
$f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$
as $x$ approaches 2. Here the limit from the left is 3, and the limit from the right is 6.
Thus, no matter how close $x$ is supposed to be to 2, there will be values $x_1$ and $x_2$ within
that required tolerance where $f(x_1)$ is close to 3 and $f(x_2)$ is close to 6. If $f(x_1)$ and
$f(x_2)$ are both supposed to be within $\varepsilon$ of some limit $L$, then it will follow that $f(x_1)$ and
$f(x_2)$ will have to be within $2\varepsilon$ of each other. Again, you employ the technique of showing
that two quantities close to the same value must be close to each other. In particular, if $x_1$
is chosen to be less than 2, $f(x_1)$ will be less than 3. If $x_2$ is chosen to be between 2
and $2\frac{1}{2}$, $f(x_2)$ will be greater than 5. In this case it would be impossible to have $f(x_1)$
and $f(x_2)$ within 2 of each other, and, therefore, it would be impossible to have them
both within $\varepsilon = 1$ of some limit $L$. This suggests that you will get a contradiction if
you set $\varepsilon = 1$. Indeed, if a $\delta > 0$ is chosen, you can let $x_1 = 2 - \frac{\delta}{2}$ (that is, less than 2
but within $\delta$ of 2), and let $x_2 = \min\left(2 + \frac{\delta}{2}, 2 + \frac{1}{2}\right)$ (that is, greater than 2 but within
$\delta$ of 2 and not so large that $f(x)$ is less than 5). The point of all of this is that now,
no matter what value is chosen for $L$, $f(x_1)$ and $f(x_2)$ are more than 2 apart, so how
could they both be within 1 of $L$? Specifically, if $|f(x_1) - L| < 1$ and $|f(x_2) - L| < 1$,
it follows from the triangle inequality that $2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| =
|f(x_1) - L| + |L - f(x_2)| \ge |f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)|$ showing
$2 > |f(x_1) - f(x_2)|$ which cannot hold. Here is the complete proof (Fig. 3.7).
Fig. 3.7 $f$ has no limit at $x = 2$


PROOF: The function $\begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$ has no limit as $x \to 2$.
Let $f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$.
Given any value for $L$, let $\varepsilon = 1$, and let $\delta > 0$ be given.
Let $x_1 = 2 - \frac{\delta}{2}$ and $x_2 = \min\left(2 + \frac{\delta}{2}, 2 + \frac{1}{4}\right)$.
Note that $0 < |x_1 - 2| < \delta$ and $0 < |x_2 - 2| < \delta$.
Since $x_1 < 2$, it follows that $f(x_1) < 3$. Since $x_2 > 2$ and $x_2 < 2\frac{1}{2}$, it follows
that $f(x_2) > 5$. As a consequence $|f(x_1) - f(x_2)| = f(x_2) - f(x_1) > 5 - 3 = 2$.
If $|f(x_1) - L| < \varepsilon = 1$ and $|f(x_2) - L| < \varepsilon = 1$, it would follow that
$2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge
|f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)| > 2$. This shows that $2 > 2$ which
is a contradiction.
Thus, it cannot hold that both $|f(x_1) - L| < \varepsilon$ and $|f(x_2) - L| < \varepsilon$, and the
limit does not exist.
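The choice of $x_1$ and $x_2$ in the proof can be checked for several values of $\delta$. The sketch below assumes the witnesses $x_1 = 2 - \frac{\delta}{2}$ and $x_2 = \min\left(2 + \frac{\delta}{2}, 2 + \frac{1}{4}\right)$ from the proof; the helper names `witnesses` and `gap` and the sample values of $\delta$ are illustrative assumptions.

```python
def f(x):
    """The piecewise function from the proof."""
    return 4 * x - 5 if x < 2 else 10 - 2 * x

def witnesses(delta):
    """The points x1 < 2 < x2 chosen in the proof for a given delta > 0."""
    x1 = 2 - delta / 2
    x2 = min(2 + delta / 2, 2 + 1 / 4)
    return x1, x2

def gap(delta):
    """f(x2) - f(x1), which the proof shows is greater than 2."""
    x1, x2 = witnesses(delta)
    return f(x2) - f(x1)

for delta in (3.0, 1.0, 0.1, 1e-6):
    x1, x2 = witnesses(delta)
    # Both points lie within delta of 2 and differ from 2.
    assert 0 < abs(x1 - 2) < delta and 0 < abs(x2 - 2) < delta
    print(delta, x1, x2, gap(delta))
```

Because the gap exceeds 2 for every $\delta$ tried, no single $L$ can be within 1 of both function values, which is the contradiction the proof reaches.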

It is even easier to show that the function $f(x) = \sin\left(\frac{1}{x}\right)$ has no limit as $x$
approaches 0. This is because for every $\delta > 0$ it is easy to find $x_1$ and $x_2$ between
0 and $\delta$ such that $f(x_1) = 1$ and $f(x_2) = -1$. This makes it impossible to find an
$L$ where $|f(x_1) - L| < 1$ and $|f(x_2) - L| < 1$. Thus, the proof follows the given
template for proving that a limit does not exist (Fig. 3.8).
Fig. 3.8 Graph of $\sin\left(\frac{1}{x}\right)$


PROOF: The function $\sin\left(\frac{1}{x}\right)$ has no limit as $x \to 0$.
Let $f(x) = \sin\left(\frac{1}{x}\right)$.
Given any value for $L$, let $\varepsilon = 1$, and let $\delta > 0$ be given.
Select integer $k > \frac{1}{2\pi\delta}$. Let $x_1 = \frac{2}{(4k+1)\pi}$ and $x_2 = \frac{2}{(4k+3)\pi}$.
Note that both $x_1$ and $x_2$ are positive and less than $\frac{2}{4k\pi} = \frac{1}{2k\pi} < \delta$,
$f(x_1) = \sin\left(\left(2k + \frac{1}{2}\right)\pi\right) = 1$, and $f(x_2) = \sin\left(\left(2k + \frac{3}{2}\right)\pi\right) = -1$.
If $|f(x_1) - L| < \varepsilon = 1$ and $|f(x_2) - L| < \varepsilon = 1$, it would follow that
$2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge
|f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)| = 2$. This shows that $2 > 2$ which
is a contradiction.
Thus, it cannot hold that both $|f(x_1) - L| < \varepsilon$ and $|f(x_2) - L| < \varepsilon$, and the
limit does not exist.
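The selection of $k$, $x_1$, and $x_2$ can be carried out numerically. The following sketch assumes the formulas in the proof above; the helper name `witnesses` and the tolerance used to compare floating-point sines with $\pm 1$ are illustrative choices, not part of the proof.

```python
import math

def witnesses(delta):
    """Return (k, x1, x2) with integer k > 1/(2*pi*delta),
    x1 = 2/((4k+1)*pi) and x2 = 2/((4k+3)*pi), as in the proof."""
    k = math.floor(1 / (2 * math.pi * delta)) + 1
    x1 = 2 / ((4 * k + 1) * math.pi)
    x2 = 2 / ((4 * k + 3) * math.pi)
    return k, x1, x2

for delta in (1.0, 0.01, 1e-6):
    k, x1, x2 = witnesses(delta)
    # Both points lie strictly between 0 and delta.
    assert 0 < x2 < x1 < delta
    # Up to rounding error, sin(1/x1) = 1 and sin(1/x2) = -1.
    print(delta, k, math.sin(1 / x1), math.sin(1 / x2))
```

However small $\delta$ is made, the two function values stay 2 apart, so they cannot both be within $\varepsilon = 1$ of a common $L$.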

If the function $f(x)$ is unbounded as $x$ approaches $a$, then there is an even easier
template to use for the proof that $f(x)$ has no limit. The idea is that since $f(x)$ is
unbounded, for any proposed limit $L$ one can find an $x$ close to $a$ such that $|f(x)| >
|L| + 1$. Then the difference $|f(x) - L|$ will be forced to be greater than 1. Consider,
for example, the function $f(x) = \frac{x+3}{(x-5)^2}$ as $x$ approaches 5. Given $L$, you will want
an $x$ with $\frac{x+3}{(x-5)^2} > L + 1$. But with $x$ within 1 of 5, you could claim that
$\frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{|x-5|}$, so by making $|x - 5| < \frac{1}{|L|+1}$ you will have the
inequality that you need. Note that the absolute value function was introduced in $|L| + 1$
to take care of the embarrassing circumstance that $L$ is negative, and in particular, when
$L = -1$. The proof is as follows.
PROOF: The function $\frac{x+3}{(x-5)^2}$ has no limit as $x \to 5$.
Let $f(x) = \frac{x+3}{(x-5)^2}$.
Given any value for $L$, let $\varepsilon = 1$, and let $\delta > 0$ be given.
Select a value of $x$ between 5 and $5 + \min\left(1, \delta, \frac{1}{|L|+1}\right)$.
Note that $0 < |x - 5| < \delta$ and $f(x) = \frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{x-5} > |L| + 1$.
It follows that $|f(x) - L| > |L| + 1 - L \ge 1$.
Thus, it cannot hold that $|f(x) - L| < \varepsilon$, and the limit does not exist.
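The same check can be run for the unbounded example. This is a sketch under an assumption: the proof allows any $x$ between 5 and $5 + \min\left(1, \delta, \frac{1}{|L|+1}\right)$, and the hypothetical helper `witness` simply picks the midpoint of that interval.

```python
def f(x):
    return (x + 3) / (x - 5) ** 2

def witness(L, delta):
    """Midpoint of the interval (5, 5 + min(1, delta, 1/(|L|+1))) used in the proof."""
    return 5 + min(1, delta, 1 / (abs(L) + 1)) / 2

for L in (-1, 0, 3.5, 1000):
    for delta in (2.0, 0.1, 1e-4):
        x = witness(L, delta)
        # x is within delta of 5, x != 5, and f(x) exceeds |L| + 1.
        assert 0 < abs(x - 5) < delta
        assert f(x) > abs(L) + 1
        print(L, delta, x, abs(f(x) - L))
```

In every case the printed distance $|f(x) - L|$ is greater than 1, so $\varepsilon = 1$ defeats each proposed limit $L$.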

3.6.4 Exercises
Write the negation of each of the following statements.
1. There exists $x$ such that $x^2 = A$.
2. For all $x$ there is a $y$ such that $g(x) = f(y)$.

3. There is an integer k such that f .x/  f .k/ for all x between k and k C 1.
4. For all x > 0 and all y > 0 there exists a z < 0 such that f .z/  xf .y/.
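Prove that the following limits do not exist.
5. $f(x) = \frac{x}{|x|}$ as $x \to 0$
6. $f(x) = x \sin\left(\frac{1}{x-1}\right)$ as $x \to 1$
7. $f(x) = \begin{cases} 5x & \text{if } x < 3 \\ 4x & \text{if } x \ge 3 \end{cases}$ as $x \to 3$
8. $f(x) = \frac{4}{x^2 - 4}$ as $x \to 2$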

3.7 Accumulation Points


A set $A$ has an accumulation point $p$ if for every $\varepsilon > 0$ there is an $x \in A$ with $x \ne p$
and $|x - p| < \varepsilon$. Informally, $p$ is an accumulation point of $A$ if there are points of $A$
that are arbitrarily close to $p$. Note that the fact that $p$ is an accumulation point of the
set $A$ has nothing to do with whether $p$ is actually an element of $A$. For example, the
set $A = \left\{\frac{1}{n} \mid n \in \mathbb{N}\right\}$ has one accumulation point, 0, because for every $\varepsilon > 0$ there is
an $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. Here the accumulation point 0 is not an element of the set $A$.
The set $B = [0, 4]$ (the closed interval from 0 to 4) has infinitely many accumulation
points. Indeed, every element of the interval $B$ is an accumulation point of $B$ because
for each $x \in [0, 4]$ and each $\varepsilon > 0$ there are infinitely many points in $B$ within $\varepsilon$ of
$x$. Here all of the accumulation points of $B$ are in $B$. Each point $x \in [0, 4]$ is also an
accumulation point of the set $C = (0, 4) \cap \mathbb{Q}$, the set of rational numbers between 0
and 4. Here, some of the accumulation points are in $C$, and some are not. The set of
natural numbers, $\mathbb{N}$, has no accumulation points. An element $a$ of a set that is not an
accumulation point of that set is called an isolated point of the set. For any isolated
point $a$, there is an $\varepsilon > 0$ such that $a$ is the only element of the set in the interval
$(a - \varepsilon, a + \varepsilon)$ (Fig. 3.9).
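The first and last examples can be made concrete. The sketch below uses exact rational arithmetic; the helper names `point_of_A_within` and `naturals_within`, and the finite search bound, are assumptions made for illustration only.

```python
import math
from fractions import Fraction

def point_of_A_within(eps):
    """Return an element 1/n of A = {1/n : n in N} with 0 < |1/n - 0| < eps."""
    n = math.floor(1 / eps) + 1      # n > 1/eps, hence 1/n < eps
    return Fraction(1, n)

def naturals_within(a, eps, search_up_to=1000):
    """Natural numbers other than a within eps of a (finite search, illustration only)."""
    return [k for k in range(1, search_up_to + 1) if k != a and abs(k - a) < eps]

for eps in (Fraction(1, 2), Fraction(3, 7), Fraction(1, 1000)):
    print(eps, point_of_A_within(eps))

# With eps = 1/2 no other natural number is near 7, so 7 is an isolated point of N.
print(naturals_within(7, Fraction(1, 2)))
```

The first loop exhibits, for each tolerance, a point of $A$ other than 0 within that tolerance of 0, while the empty list in the last line reflects that each natural number is isolated.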
A word of warning is needed here. The term accumulation point is not used the
same way by all authors. Many texts, especially those in Topology, will use the terms
limit point or cluster point instead of accumulation point. Even more confusing is
that some texts use the term accumulation point for something different.

Fig. 3.9 Set with accumulation point a and isolated point b

The first observation to make about accumulation points is that if $p$ is an
accumulation point of set $A$, then for every $\varepsilon > 0$ there is not only one point of
$A$ within $\varepsilon$ of $p$ but infinitely many points of $A$ within $\varepsilon$ of $p$. The definition of
accumulation point guarantees at least one point of $A$ within $\varepsilon$ of $p$, but once one
point, $x \in A$, is found to be within $\varepsilon$ of $p$, the definition also says that there must be
another point $y \in A$ with $0 < |y - p| < |x - p|$. Since for each $x \in A$ close to $p$
there must be another point $y \in A$ even closer to $p$, it follows that there are infinitely
many points of $A$ within $\varepsilon$ of $p$.
Perhaps the most used fact about accumulation points is known as the
Bolzano-Weierstrass Theorem which states that every infinite bounded set of real numbers
has an accumulation point. As pointed out earlier, $\mathbb{N}$ has no accumulation points,
and it is an infinite set. But $\mathbb{N}$ is not a bounded set. Intuitively, one cannot have a
bounded infinite set without an accumulation point because one runs out of places
to put the infinite number of points. If the points of a set are not allowed to bunch
up anywhere, then one will not be able to find room for infinitely many of the points
within a bounded interval.
There are several good strategies used to prove the Bolzano-Weierstrass Theorem, and
two of those strategies are presented here. Of course, one only needs one
good strategy to prove a theorem, but these proofs are instructive and use techniques
commonly employed in Analysis proofs. One begins each proof with a statement
about the set $A$ being an infinite bounded set. Since $A$ is a bounded set, it will have
a lower bound, $a$, and an upper bound, $b$, showing that $A \subseteq [a, b]$. The first strategy
is to construct the set $S = \{x \ge a \mid [a, x] \cap A \text{ is finite}\}$, that is, a value $x \ge a$ is in
the set $S$ if there are finitely many elements of $A$ which fall in the interval $[a, x]$. First
observe that the set $S$ is an interval. This follows because if $y \in S$, then $[a, y] \cap A$ is
finite, so if $x$ is between $a$ and $y$, then $[a, x] \cap A \subseteq [a, y] \cap A$ must also be finite, and
$x \in S$. The next observation is that $S$ is not empty because the point $a$, whether or
not it is in $A$, is in $S$ since $[a, a] \cap A$ contains at most one point, so it is finite. Since
$[a, b] \cap A = A$ is an infinite set, the set $S$ is bounded above by $b$. The Completeness
Axiom now shows that $S$ must have a least upper bound, $p$. It will follow that $p$ is an
accumulation point of $A$ because for all $\varepsilon > 0$, the set $A$ will have only finitely many
elements less than $p - \varepsilon$ but infinitely many elements less than $p + \varepsilon$ implying that
there are infinitely many elements of $A$ within $\varepsilon$ of $p$. Here is the complete proof.

PROOF (Bolzano-Weierstrass Theorem): Every infinite bounded set of
real numbers has an accumulation point.
Let $A$ be an infinite bounded set of real numbers.
Because $A$ is bounded, it has a lower bound, $a$, and an upper bound, $b$,
showing that $A \subseteq [a, b]$.
Define set $S = \{x \ge a \mid [a, x] \cap A \text{ is finite}\}$.
Note that $a \in S$ since $[a, a] \cap A$ is finite, so $S$ is nonempty.
Note that if $z \ge b$, then $[a, z] \cap A = A$ is an infinite set, so $z \notin S$ showing
that $S$ is bounded above by $b$.
By the Completeness Axiom, $S$ has a least upper bound, $p$.
Given $\varepsilon > 0$, $p - \varepsilon < p$ so $p - \varepsilon$ is not an upper bound of $S$. Hence, there is
a $y \in S$ with $y > p - \varepsilon$. It follows that there are only finitely many elements
of $A$ less than or equal to $y$.
Also, $p + \varepsilon > p$, so $p + \varepsilon \notin S$. It follows that $[a, p + \varepsilon] \cap A$ is infinite.
Thus, there must be infinitely many elements of $A$ between $p - \varepsilon$ and $p + \varepsilon$,
and there must be an element of $A$ not equal to $p$ within $\varepsilon$ of $p$.
This shows that $p$ is an accumulation point of $A$.
The second strategy also begins with the interval $[a, b]$ that contains the infinite
bounded set, $A$. One can rename the end points of this interval to be $a_1 = a$ and
$b_1 = b$. Since $[a_1, b_1] \cap A = A$ is infinite, it follows that either $\left[a_1, \frac{a_1+b_1}{2}\right] \cap A$ or
$\left[\frac{a_1+b_1}{2}, b_1\right] \cap A$ is an infinite set. If $\left[a_1, \frac{a_1+b_1}{2}\right] \cap A$ is infinite, define $a_2 = a_1$ and
$b_2 = \frac{a_1+b_1}{2}$. Otherwise, define $a_2 = \frac{a_1+b_1}{2}$ and $b_2 = b_1$. In either case, $[a_2, b_2] \cap A$
is an infinite set. This procedure can be repeated so that for every $n \in \mathbb{N}$, one gets an
interval $[a_n, b_n]$ where $[a_n, b_n] \cap A$ is infinite, and each interval is half the length of
the previous interval. Also, the sequence of left endpoints, $\langle a_n \rangle$, is a monotone
increasing sequence bounded above by $b$, and the sequence of right endpoints,
$\langle b_n \rangle$, is a monotone decreasing sequence bounded below by $a$. Thus, both of these
sequences converge. In fact, both of these sequences must converge to the same
limit, $p$. This follows because the distances between the terms of the sequences,
$b_n - a_n$, keep getting smaller and converge to 0. Given an $\varepsilon > 0$, it will follow that
there is an $n$ such that $a_n$ and $b_n$ are both within $\varepsilon$ of $p$. Thus, $(p - \varepsilon, p + \varepsilon) \cap A$
contains $[a_n, b_n] \cap A$ which is infinite. Here is the complete proof.

PROOF (Bolzano-Weierstrass Theorem): Every infinite bounded set of
real numbers has an accumulation point.
Let $A$ be an infinite bounded set of real numbers.
Because $A$ is bounded, it has a lower bound, $a_1$, and an upper bound, $b_1$,
showing that $A \subseteq [a_1, b_1]$ and $[a_1, b_1] \cap A$ is an infinite set.
Define sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ recursively as follows.
Suppose, for natural number $n$, $a_n$ and $b_n$ have been defined so that $[a_n, b_n] \cap A$
is an infinite set. If $\left[a_n, \frac{a_n+b_n}{2}\right] \cap A$ is infinite, then define $a_{n+1} = a_n$ and
$b_{n+1} = \frac{a_n+b_n}{2}$. Otherwise, define $a_{n+1} = \frac{a_n+b_n}{2}$ and $b_{n+1} = b_n$. In either
case, $[a_{n+1}, b_{n+1}] \cap A$ is an infinite set.
By the way the sequences are constructed, for each $n$ it follows that $a_n \le
a_{n+1} < b_{n+1} \le b_n$ showing that $\langle a_n \rangle$ is a monotone increasing sequence
bounded above by each $b_i$, and $\langle b_n \rangle$ is a monotone decreasing sequence
bounded below by each $a_i$.
Also, by the way the sequences are constructed, for each $n$, $b_n - a_n = \frac{b_1 - a_1}{2^{n-1}}$.
Thus, the bounded monotone sequence $\langle a_n \rangle$ must converge to a number
$p_a$, and the bounded monotone sequence $\langle b_n \rangle$ must converge to a number
$p_b$. But $p_b - p_a \le b_n - a_n = \frac{b_1 - a_1}{2^{n-1}}$ and, therefore, $p_b - p_a$ must be zero. Let
$p = p_a = p_b$, and note that for each $n$, $p \in [a_n, b_n]$.
Given $\varepsilon > 0$, select a natural number $n$ such that $\frac{b_1 - a_1}{2^{n-1}} < \varepsilon$. Then $p - \varepsilon <
a_n \le p \le b_n < p + \varepsilon$. Hence, $[a_n, b_n] \cap A \subseteq (p - \varepsilon, p + \varepsilon) \cap A$ is infinite
showing that there is an element of $A$ not equal to $p$ but within $\varepsilon$ of $p$.
This shows that $p$ is an accumulation point of $A$.
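The halving construction can be traced on a concrete set. A program cannot test an arbitrary infinite set for "infinitely many points in an interval," so the sketch below assumes a hand-written oracle for the particular set $A = \left\{\frac{1}{n} \mid n \in \mathbb{N}\right\}$; the names `infinitely_many_reciprocals` and `bisect`, and the starting bounds $-1$ and $1$, are illustrative assumptions.

```python
from fractions import Fraction

def infinitely_many_reciprocals(lo, hi):
    """Assumed oracle for A = {1/n : n in N}: [lo, hi] holds infinitely many
    points of A exactly when lo <= 0 < hi, since the points 1/n crowd only toward 0."""
    return lo <= 0 < hi

def bisect(a, b, has_infinitely_many, steps):
    """Repeat the halving step of the proof, keeping a half that meets A infinitely often."""
    for _ in range(steps):
        mid = (a + b) / 2
        if has_infinitely_many(a, mid):
            b = mid          # keep the left half [a, mid]
        else:
            a = mid          # otherwise the right half [mid, b] must be infinite
    return a, b

a, b = bisect(Fraction(-1), Fraction(1), infinitely_many_reciprocals, 20)
print(a, b, b - a)
```

After 20 halvings the interval has length $\frac{2}{2^{20}}$ and still contains 0, the accumulation point that the nested intervals close in on.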
You now have the machinery necessary to prove the result mentioned in Sect. 3.5
that all Cauchy sequences converge. The difficulty in proving this result earlier was
that given a Cauchy sequence $\langle a_n \rangle$, it was not clear what real number would play
the role of the limit $L$ of the sequence. Now, the Bolzano-Weierstrass Theorem can
provide an accumulation point to serve as this limit. There are two cases to consider.
If the set of values in the sequence, $\{a_n\}$, is a finite set, then for the sequence to be
Cauchy, the sequence will necessarily need to be constant from some point on, and,
therefore, the sequence will converge. If the set of values in the sequence is infinite,
then since all Cauchy sequences are bounded, the set of values in the sequence will
be bounded and will have to have an accumulation point. It is then straightforward
to show that the sequence converges to this accumulation point.

PROOF: All Cauchy sequences converge.
Let $\langle a_n \rangle$ be a Cauchy sequence. Let $A$ be the set of terms of the
sequence, $\{a_n\}$.
CASE 1: The set $A$ is finite. If $A$ contains only one value, then the sequence
is constant and converges to that constant. If $A$ contains more than one
value, then, since the range of the sequence is finite, so is the set of
differences $a_n - a_m$ of values in the sequence. Let $d$ be the smallest positive
difference between any two values in the sequence, and let $\varepsilon = \frac{d}{2}$. Because
the sequence is Cauchy, there is an $N$ such that whenever $m, n > N$, the
difference $|a_m - a_n| < \varepsilon$. But the smallest positive difference between any
two terms of the sequence is $d > \varepsilon$, so it follows that $a_m - a_n = 0$. Thus, the
sequence is constant for all terms $a_n$ with $n > N$, and, again, the sequence
must converge.
CASE 2: The set $A$ is infinite. Since all Cauchy sequences are bounded, $A$ is
a bounded infinite set, and thus, by the Bolzano-Weierstrass Theorem, $A$ has
an accumulation point, $p$. Because $\langle a_n \rangle$ is Cauchy, given $\varepsilon > 0$ there is an
$N$ such that for all $m, n > N$, $|a_m - a_n| < \frac{\varepsilon}{2}$. Also, since $p$ is an accumulation
point of $A$, there are infinitely many values of $A$ within $\frac{\varepsilon}{2}$ of $p$. Surely there
is a natural number $k > N$ such that $|a_k - p| < \frac{\varepsilon}{2}$. Then, for all $n > N$, it
follows that $|a_n - p| = |(a_n - a_k) - (p - a_k)| \le |a_n - a_k| + |a_k - p| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.
Thus, the sequence converges to $p$.
Therefore, all Cauchy sequences must converge.
Up to this point, the discussion of the limit $\lim_{x \to a} f(x)$ took place only for those
functions defined for all $x \ne a$ in an open interval containing $a$. The definition of
limit can now be extended. It should not be required that the function $f$ be defined
for all $x$ in an open interval containing $a$ but that $f$ be defined at enough points so
that it makes sense to allow $x$ to approach $a$. In other words, $a$ only needs to be
an accumulation point of the domain of $f$. Note that if $a$ is not an accumulation
point of the domain of $f$, then there will be an open interval containing $a$ where $f$
is not defined (except perhaps at $a$ itself). Thus, no sense could be made out of "$x$
approaches $a$." On the other hand, if $a$ is an accumulation point of the domain of $f$, it
makes sense to define the limit of $f$ at $a$ to be $L$ or $\lim_{x \to a} f(x) = L$ to mean that for all
$\varepsilon > 0$ there is a $\delta > 0$ such that for all $x$ in the domain of $f$, $0 < |x - a| < \delta$ implies
$|f(x) - L| < \varepsilon$.
Similarly, to define $\lim_{x \to \infty} f(x) = L$ one does not need $f$ to be defined in an
entire interval stretching to positive infinity. It is sufficient that $f(x)$ is defined for
arbitrarily large values of $x$ so that $x$ can be allowed to approach infinity. One way
of saying this is that the domain of $f$ should be unbounded above. This is what
was done, for example, when defining the limit of a sequence which is the limit of
a function defined for the natural numbers only. Similarly, $\lim_{x \to -\infty} f(x) = L$ can be
defined for $f$ when the domain of $f$ is unbounded below.


3.7.1 Exercises
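1. Write a definition for $\lim_{x \to a^+} f(x)$ where $a$ is an accumulation point of the domain
of $f$.
Identify the accumulation points, if any, of the following sets.
2. $\left\{\frac{n}{n+2} \mid n \in \mathbb{N}\right\}$
3. $\left\{x \in \mathbb{Q} \mid x^2 < 2\right\}$
4. $\left\{\frac{1}{2}, 1, \frac{3}{2}, 2, \frac{5}{2}, \ldots\right\}$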

m 2
5. 2n m; n 2 N
6. $\left\{\frac{2n(-1)^n + 4}{3n + 5} \mid n \in \mathbb{N}\right\}$

3.8 Infinite Limits


In Sect. 3.6 it was shown that $\lim_{x \to 5} \frac{x+3}{(x-5)^2}$ does not exist. But more can be said about
this limit. The reason that the limit does not exist is that the function grows without
bound and, therefore, does not approach any real number value. This behavior can
be quantified by saying that the limit of the function is infinity. Of course, it does not
make sense to say that the function is getting close to infinity, since no real number
is very close to infinity. In the definition of $\lim_{x \to \infty} f(x)$ where it had to be made clear
what $x$ approaching infinity meant, it was said that there was a number $N$ such that
$|f(x) - L|$ was small whenever $x > N$. Similarly, to say that $f(x)$ approaches infinity,
one needs to say that for any real number $M$, $f(x)$ can be made larger than $M$. Thus, if
$a$ is an accumulation point of the domain of function $f(x)$, the following two similar
definitions can be given.
The limit of $f$ at $a$ is infinity or $\lim_{x \to a} f(x) = \infty$ means that for every $M \in \mathbb{R}$ there
is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, then $f(x) > M$.
The limit of $f$ at $a$ is negative infinity or $\lim_{x \to a} f(x) = -\infty$ means that for every
$M \in \mathbb{R}$ there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$,
then $f(x) < M$.
What if f .x/ approaches infinity or negative infinity as x is allowed to approach
either infinity or negative infinity? Each of these ideas can be accommodated
resulting in four similar definitions. Remember that the limit of f as x approaches
infinity makes sense only if the domain of f is unbounded above, and as x approaches
negative infinity only if the domain of f is unbounded below. Here are the four
definitions.
The limit of $f$ as $x$ approaches infinity is infinity or $\lim_{x \to \infty} f(x) = \infty$ means that
for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of $f$ with $x > N$,
then $f(x) > M$.


The limit of $f$ as $x$ approaches infinity is negative infinity or $\lim_{x \to \infty} f(x) = -\infty$
means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of
$f$ with $x > N$, then $f(x) < M$.
The limit of $f$ as $x$ approaches negative infinity is infinity or $\lim_{x \to -\infty} f(x) = \infty$
means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of
$f$ with $x < N$, then $f(x) > M$.
The limit of $f$ as $x$ approaches negative infinity is negative infinity or
$\lim_{x \to -\infty} f(x) = -\infty$ means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that
if $x$ is in the domain of $f$ with $x < N$, then $f(x) < M$.
For example, how would you prove that $\lim_{x \to 5} \frac{x+3}{(x-5)^2} = \infty$? You would need to
show that for every $M \in \mathbb{R}$ there is a $\delta > 0$ such that $\frac{x+3}{(x-5)^2} > M$ when $x$ is within
$\delta$ of $5$ with $x \ne 5$. Working backwards, you would start with $\frac{x+3}{(x-5)^2} > M$. This is a
complicated inequality with which to work, so it would be more convenient to work
with an inequality that is easier to solve. If you want $f(x) = \frac{x+3}{(x-5)^2}$ to be bigger than
$M$, it would be sufficient to make some fraction smaller than $f(x)$ bigger than $M$.
For example, for all $x > -2$ the fraction $\frac{1}{(x-5)^2}$ is smaller than $f(x)$. Moreover, for
$x$ within $1$ of $5$, $\frac{1}{|x-5|}$ is smaller than $\frac{1}{(x-5)^2}$. Thus, it would be sufficient to make
$\frac{1}{|x-5|} > M$ which, under the condition of $M > 0$, happens when $|x - 5| < \frac{1}{M}$.


A proof would need to take care of the embarrassing case of $M \le 0$, perhaps by
making $\delta = \frac{1}{|M|+1}$ since $|M| + 1$ is always bigger than $M$ and is always positive.
Another way to handle this is to write a proof that assumes that $M$ is positive. In fact,
one could just stipulate that $M > 1$ by inserting the often used phrase "without loss of
generality". This phrase means that even though a restriction is being placed on one
of the assumptions in the proof, if one can complete the proof using this restriction,
then it would be very easy to give a proof without the restriction. In this case, if it is
assumed that $M > 1$, one could just as easily handle cases where $M \le 1$ by finding
a $\delta > 0$ that ensured $f(x) > 1 \ge M$, so being able to produce a proof that works for
$M > 1$ does provide a proof for $M \le 1$. The phrase without loss of generality is used so
frequently that many authors abbreviate it as WLOG. These ideas give the following
proof.
PROOF: $\lim_{x \to 5} \frac{x+3}{(x-5)^2} = \infty$

Let $f(x) = \frac{x+3}{(x-5)^2}$.
Let $M \in \mathbb{R}$ be given. Without loss of generality, assume that $M > 1$.
Let $\delta = \frac{1}{M} > 0$. Note that $\delta < 1$.
If $0 < |x - 5| < \delta$, then since $|x - 5| < 1$, it follows that $|x - 5| > (x - 5)^2$.
Also, since $x > 4$, it follows that $x + 3 > 7 > 1$.
Then $f(x) = \frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{|x-5|} > \frac{1}{\delta} = M$.
This shows that $\lim_{x \to 5} \frac{x+3}{(x-5)^2} = \infty$.
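
The choice of $\delta$ in this proof can be spot-checked numerically. The following Python sketch is only an illustration and proves nothing: the helper names `f`, `delta_for`, and `check` are hypothetical, and replacing any $M \le 1$ by $2$ is just one assumed way to carry out the "without loss of generality" step.

```python
def f(x):
    """The function from the example: (x + 3) / (x - 5)^2."""
    return (x + 3) / (x - 5) ** 2


def delta_for(M):
    """Return delta = 1/M as in the proof.

    Assumption for illustration: any M <= 1 is first replaced by 2,
    since a delta that forces f(x) > 2 also forces f(x) > M.
    """
    if M <= 1:
        M = 2.0
    return 1.0 / M


def check(M, samples=1000):
    """Test f(x) > M at sample points with 0 < |x - 5| < delta."""
    delta = delta_for(M)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (5 - offset, 5 + offset):
            if not f(x) > M:
                return False
    return True


print(check(10), check(-3), check(1e6))
```

A finite sample cannot replace the argument above, but a failed check would signal an error in the choice of $\delta$.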


3.8.1 Exercises
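Write a proof of each of the following infinite limits.
1. $\lim_{x \to 4} \frac{x}{(x-4)^2} = \infty$
2. $\lim_{x \to \infty} x^2 - 5x = \infty$
3. $\lim_{x \to 0} \frac{x-2}{|x|} = -\infty$
4. $\lim_{x \to -\infty} 10 - \sqrt{4 - x} = -\infty$
5. $\lim_{x \to 2^+} \frac{x}{x-2} = \infty$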

3.9 The Arithmetic of Limits


The fact that the limits of some functions are easy to prove hides the fact that there
are some limits whose validity is considerably more difficult to prove. Fortunately,
the limits of most arithmetic combinations of functions work as expected due to the
behavior of the arithmetic operations of addition, subtraction, multiplication, and
division. In the words of the next chapter, these operations behave well because
they are themselves continuous functions of their arguments. That is, for example,
the function of two variables $f(x, y) = x + y$ is a continuous function of $x$ and $y$.
That continuity allows you to prove the following theorem.
THEOREM: Suppose that $f$ and $g$ are functions both defined on a set with
accumulation point $a$. Let $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$. Then

1. $\lim_{x \to a} (f(x) + g(x)) = L + H$.
2. $\lim_{x \to a} (f(x) - g(x)) = L - H$.
3. $\lim_{x \to a} f(x)g(x) = LH$.
4. if $H \ne 0$, $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.

Consider how to prove each part of the above theorem. In each case you will need
to prove the validity of a limit, so the proof can follow the usual proof template for
establishing a limit. These proofs differ from limit proofs found earlier in the chapter
in that you know less about the functions whose limits you are trying to establish.
On the other hand, you do know that the limits of the functions f and g exist, and
that gives you a lot of tools with which to work.


3.9.1 Limit of a Sum


So what needs to be done to prove that the limit of the sum of two functions is the
sum of their respective limits? As with all limit proofs, the proof will begin with a
statement about what is being assumed about two functions f and g. In this case,
that would essentially be a restatement of the hypothesis of the theorem that says
that the limits of f and g at a are L and H, respectively. The second step of the
proof would be to say "Let $\epsilon > 0$ be given" which sets the tolerance to be met by
the proof. You know that the end of the proof will need to show that the function
in question, $f(x) + g(x)$, needs to be within $\epsilon$ of the proposed limit, $L + H$. In
other words, you will need to establish $|(f(x) + g(x)) - (L + H)| < \epsilon$. Clearly,
this inequality will depend on properties of the functions $f$ and $g$. But you know
very little about these functions. Actually, knowing very little about the functions
makes your job easier. All you know about these functions is that $f$ has $L$ for a limit,
and $g$ has $H$ for a limit. This means that your proof can only use these two facts.
Because these two limits exist, you will be able to set up conditions that ensure that
$|f(x) - L|$ and $|g(x) - H|$ are small. How does this help? It helps because the triangle
inequality will allow you to show that the expression $|(f(x) + g(x)) - (L + H)|$ is
no bigger than the sum of the two small quantities $|f(x) - L|$ and $|g(x) - H|$. That is,
$|(f(x) + g(x)) - (L + H)| = |(f(x) - L) + (g(x) - H)| \le |f(x) - L| + |g(x) - H|$. For
example, if both $|f(x) - L|$ and $|g(x) - H|$ can be made less than $\frac{\epsilon}{2}$, then their sum
will be less than $\epsilon$, and the value of $|(f(x) + g(x)) - (L + H)|$ will, in turn, be less
than $\epsilon$, as desired. How can you arrange for $|f(x) - L|$ and $|g(x) - H|$ both to be less
than $\frac{\epsilon}{2}$? You are given that the limits of $f$ and $g$ are $L$ and $H$, respectively, so, by the
definition of limit, you can arrange for each of these quantities to be smaller than any
given positive value, such as $\frac{\epsilon}{2}$, with appropriate choices of $\delta > 0$. The only subtlety
here is that the value of $\delta > 0$ needed to assure that $|f(x) - L|$ is less than $\frac{\epsilon}{2}$ cannot
be assumed to be the same value as the $\delta > 0$ needed to assure that $|g(x) - H|$ is less
than $\frac{\epsilon}{2}$. Thus, two different values of $\delta$ should be chosen, and then the minimum of
those two will be small enough to guarantee both of the needed inequalities.
Thus, after the proof proposes a given $\epsilon > 0$, it can produce a $\delta_1 > 0$ small
enough so that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L|$ will
be less than $\frac{\epsilon}{2}$. The existence of this $\delta_1$ comes from the definition of $\lim_{x \to a} f(x) = L$.
Similarly, the proof can produce a $\delta_2 > 0$ coming from the definition of $\lim_{x \to a} g(x) = H$
such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_2$, then $|g(x) - H|$ will be
less than $\frac{\epsilon}{2}$. The proof then easily follows as described above.


PROOF: Suppose that $f$ and $g$ are functions both defined on a set
with accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$, then
$\lim_{x \to a} (f(x) + g(x)) = L + H$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$
with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$.
Let $\epsilon > 0$ be given.
By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in the domain of
$f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\epsilon}{2}$.
Similarly, there is a $\delta_2 > 0$ such that if $x$ is in the domain of $g$ and
$0 < |x - a| < \delta_2$, then $|g(x) - H| < \frac{\epsilon}{2}$.
Let $\delta = \min(\delta_1, \delta_2) > 0$.
Then if $x$ is in the domain of $f + g$ with $0 < |x - a| < \delta$,
$|(f(x) + g(x)) - (L + H)| = |(f(x) - L) + (g(x) - H)| \le |f(x) - L| + |g(x) - H| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $\lim_{x \to a} (f(x) + g(x)) = L + H$.
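
The step $\delta = \min(\delta_1, \delta_2)$ can be seen at work on a concrete pair of functions. The sketch below is an illustration under assumptions that are not in the text: it takes $f(x) = 2x + 1$ and $g(x) = x^2$ at $a = 1$ (so $L = 3$ and $H = 1$), and the helper names `delta_f`, `delta_g`, `delta_sum`, and `check_sum` are hypothetical.

```python
def delta_f(tol):
    """For f(x) = 2x + 1 at a = 1 (L = 3): |f(x) - 3| = 2|x - 1| < tol."""
    return tol / 2


def delta_g(tol):
    """For g(x) = x^2 at a = 1 (H = 1): if |x - 1| < 1 then |x + 1| < 3,
    so |g(x) - 1| = |x - 1| |x + 1| < 3|x - 1| <= tol."""
    return min(1.0, tol / 3)


def delta_sum(eps):
    """Delta for f + g: each summand is given the tolerance eps / 2."""
    return min(delta_f(eps / 2), delta_g(eps / 2))


def check_sum(eps, samples=1000):
    """Test |(f(x) + g(x)) - (L + H)| < eps for 0 < |x - 1| < delta."""
    delta = delta_sum(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (1 - offset, 1 + offset):
            if not abs((2 * x + 1) + x * x - 4) < eps:
                return False
    return True


print(check_sum(0.1), check_sum(1e-6))
```

Neither `delta_f` nor `delta_g` needs to know about the other function, which mirrors how the proof obtains $\delta_1$ and $\delta_2$ independently before taking their minimum.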

A proof that the limit of the difference $f(x) - g(x)$ equals the difference of the
individual limits, $L - H$, is very similar to the above proof and is left as an exercise.

3.9.2 Limit of a Product


Proving that the limit of the product $f(x)g(x)$ equals the product of the individual
limits, $LH$, uses the same techniques as the proof for the limit of a sum but has
an added complexity requiring the use of a commonly used trick. The proof of
$\lim_{x \to a} f(x)g(x) = LH$ follows the usual template for proving the existence of a limit.
Its goal is to establish the inequality $|f(x)g(x) - LH| < \epsilon$. Again, you can use the
definition of limit to make $|f(x) - L|$ and $|g(x) - H|$ as small as you need, but how
small these have to be to ensure that $|f(x)g(x) - LH|$ is less than $\epsilon$ is not immediately
obvious. The problem is that it is difficult to gauge how close $f(x)g(x)$ is to $LH$ when
you know that $f(x)$ is close to $L$, and $g(x)$ is close to $H$. The difficulty stems from
having to move from $f(x)g(x)$ to $LH$, where $f(x)$ changes to $L$ and $g(x)$ changes to $H$
at the same time. If only one of these two changes were made, then it might be easier
to make the needed estimate. That is, it would be easier to work with an expression
like $f(x)g(x) - f(x)H$ than with $f(x)g(x) - LH$.
Of course, $f(x)g(x) - LH$ is not the same as $f(x)g(x) - f(x)H$, so one cannot
just use $f(x)g(x) - f(x)H$ in place of $f(x)g(x) - LH$. Sometimes, though, it is
worth replacing one expression with another expression that is easier to handle,


and then adjusting the second expression to make it equivalent to the first. In this
case, the change can be accomplished by employing one of the oldest tricks used in
mathematical proofs, that of adding and subtracting the same quantity. In particular,
you can rewrite $|f(x)g(x) - LH|$ as $|f(x)g(x) - f(x)H + f(x)H - LH|$. The advantage
of doing this is that now you can see how the difference between $f(x)g(x)$ and $LH$
depends on the differences between $f(x)$ and $L$ and $g(x)$ and $H$. Indeed,
$|f(x)g(x) - LH| = |f(x)g(x) - f(x)H + f(x)H - LH| = |f(x)(g(x) - H) + H(f(x) - L)| \le |f(x)| \cdot |g(x) - H| + |H| \cdot |f(x) - L|$.
If each of the two terms, $|f(x)| \cdot |g(x) - H|$
and $|H| \cdot |f(x) - L|$, can be made smaller than $\frac{\epsilon}{2}$, then it will have been shown that
$|f(x)g(x) - LH|$ is less than $\epsilon$ as needed.
So how small does $|f(x) - L|$ need to be to ensure that $|H| \cdot |f(x) - L|$ is less
than $\frac{\epsilon}{2}$? Less than $\frac{\epsilon}{2|H|}$ appears to be small enough, although one needs to handle
the embarrassing situation where $H = 0$. You could handle $H = 0$ and $H \ne 0$ as
two separate cases, or you can take care of both cases at once by making $|f(x) - L|$
less than $\frac{\epsilon}{2(|H|+1)}$ since $|H| + 1$ is larger than $|H|$ and can never be $0$. Thus, you can
select a $\delta_1 > 0$ so that if $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\epsilon}{2(|H|+1)}$.
How small does $|g(x) - H|$ need to be to ensure that $|f(x)| \cdot |g(x) - H|$ is less than $\frac{\epsilon}{2}$?
It would be nice to say that $|g(x) - H| < \frac{\epsilon}{2|f(x)|}$ suggesting that you set $\delta$ small
enough to ensure $|g(x) - H| < \frac{\epsilon}{2|f(x)|}$, but there is a problem here. The definition of
limit requires that the choice of $\delta$ come before the choice of $x$, so you cannot have
the value of $\delta$ depending on $x$. What is needed is an upper bound for $|f(x)|$ because,
if $|f(x)| \le M$, the value of $\delta$ can be found to ensure $|g(x) - H| < \frac{\epsilon}{2M}$ which will
always be small enough to guarantee $|f(x)| \cdot |g(x) - H| < \frac{\epsilon}{2}$. You can find such
an upper bound for $|f(x)|$ because the limit of $f(x)$ exists as $x$ approaches $a$, and so
$|f(x)|$ can be restricted to being not much larger than $|L|$. You could, for example,
find $\delta_2 > 0$ so that if $0 < |x - a| < \delta_2$, then $|f(x) - L| < 1$. This would ensure that
$f(x)$ is a distance of no more than $1$ from $L$ so that $|f(x)| < |L| + 1$. Then you would
only need $|g(x) - H| < \frac{\epsilon}{2(|L|+1)}$ to get $|f(x)| \cdot |g(x) - H| < \frac{\epsilon}{2}$. This gives you all the
pieces necessary to complete the proof as follows.


PROOF: Suppose that $f$ and $g$ are functions both defined on a set
with accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$, then
$\lim_{x \to a} f(x)g(x) = LH$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$
with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$.
Let $\epsilon > 0$ be given.
By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in the domain of
$f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\epsilon}{2(|H|+1)}$.
By the definition of limit, there is a $\delta_2 > 0$ such that if $x$ is in the domain of
$f$ and $0 < |x - a| < \delta_2$, then $|f(x) - L| < 1$. Then $|L| + 1 > |L| + |f(x) - L| \ge |L + (f(x) - L)| = |f(x)|$.
By the definition of limit, there is a $\delta_3 > 0$ such that if $x$ is in the domain of
$g$ and $0 < |x - a| < \delta_3$, then $|g(x) - H| < \frac{\epsilon}{2(|L|+1)}$.
Let $\delta = \min(\delta_1, \delta_2, \delta_3) > 0$.
Then if $x$ is in the domain of $f \cdot g$ with $0 < |x - a| < \delta$,
$|f(x)g(x) - LH| = |f(x)g(x) - f(x)H + f(x)H - LH| = |f(x)(g(x) - H) + H(f(x) - L)| \le |f(x)| \cdot |g(x) - H| + |H| \cdot |f(x) - L| \le (|L| + 1) \cdot \frac{\epsilon}{2(|L|+1)} + |H| \cdot \frac{\epsilon}{2(|H|+1)} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $\lim_{x \to a} f(x)g(x) = LH$.
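
The three tolerances chosen in this proof can be tested with exact rational arithmetic. The sketch below is an illustration only: the function names `product_tolerances` and `product_within_eps` are hypothetical, and the test values stand in for $f(x)$ and $g(x)$ rather than coming from particular functions.

```python
from fractions import Fraction


def product_tolerances(eps, L, H):
    """Bounds the proof places on |f(x) - L| and |g(x) - H|."""
    tol_f = min(eps / (2 * (abs(H) + 1)), Fraction(1))  # from delta_1 and delta_2
    tol_g = eps / (2 * (abs(L) + 1))                    # from delta_3
    return tol_f, tol_g


def product_within_eps(eps, L, H, steps=20):
    """Check |f(x)g(x) - LH| < eps on a grid of values inside the bounds."""
    tol_f, tol_g = product_tolerances(eps, L, H)
    for i in range(-steps + 1, steps):
        for j in range(-steps + 1, steps):
            fx = L + tol_f * Fraction(i, steps)  # |fx - L| < tol_f
            gx = H + tol_g * Fraction(j, steps)  # |gx - H| < tol_g
            if not abs(fx * gx - L * H) < eps:
                return False
    return True


print(product_within_eps(Fraction(1, 10), Fraction(2), Fraction(-3)))
```

Using `Fraction` avoids rounding error, so a passing check reflects the inequalities themselves rather than floating-point luck. The case $L = H = 0$ also passes, which is the situation the $|H| + 1$ and $|L| + 1$ denominators were introduced to handle.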

3.9.3 Limit of a Quotient


Finally, the proof that the limit of a quotient is the quotient of the individual
limits is much like the proof about the product of limits, although the algebra
is more complicated. As in the preceding proof, you can start with the needed
inequality which, in this case, is $\left| \frac{f(x)}{g(x)} - \frac{L}{H} \right| < \epsilon$. Using the trick of adding and
subtracting the same quantity, the left side of the inequality can be written as
$\left| \frac{f(x)}{g(x)} - \frac{L}{H} \right| = \left| \frac{f(x)H - Lg(x)}{g(x)H} \right| = \left| \frac{f(x)H - LH + LH - Lg(x)}{g(x)H} \right| = \left| \frac{(f(x) - L)H + L(H - g(x))}{g(x)H} \right| \le \frac{|f(x) - L|}{|g(x)|} + \left| \frac{L(g(x) - H)}{g(x)H} \right|$.
Again, the goal will be to make each of the terms $\frac{|f(x) - L|}{|g(x)|}$
and $\left| \frac{L(g(x) - H)}{g(x)H} \right|$ less than $\frac{\epsilon}{2}$ by selecting an appropriate sequence of $\delta$s.
Both of these terms have a factor of $|g(x)|$ in the denominator. To make the
fractions small, you will need to know that $|g(x)|$ does not get too close to zero.
What you do know is that $\lim_{x \to a} g(x) = H$ is not zero because the hypothesis of the
theorem will make that assumption. How far away from zero can you require $|g(x)|$
to be? Certainly, this will depend on the value of $H$. If $H$ is close to zero, then $|g(x)|$


will be close to zero as $x$ approaches $a$. The best you can do is require that $g(x)$ be
so close to $H$ that it will keep a known distance from zero. For example, you could
require that $|g(x) - H|$ be less than $\frac{|H|}{2}$. That will ensure that $|g(x)|$ is at least $\frac{|H|}{2}$
which keeps it a known distance away from zero. So, select a $\delta_1 > 0$ such that if $x$
is in the domain of $g$ with $0 < |x - a| < \delta_1$, then $|g(x) - H| < \frac{|H|}{2}$, and $|g(x)|$ will
be greater than $\frac{|H|}{2}$.
Now for these values of $x$ you will have $\frac{|f(x) - L|}{|g(x)|} < |f(x) - L| \cdot \frac{2}{|H|}$. Thus, it would
be sufficient if $|f(x) - L|$ were to be less than $\frac{\epsilon |H|}{4}$ which will ensure that $\frac{|f(x) - L|}{|g(x)|} < \frac{\epsilon}{2}$.
This can be done by choosing $\delta_2 > 0$ small enough so that $|f(x) - L|$ is less than $\frac{\epsilon |H|}{4}$.
To make the $\left| \frac{L(g(x) - H)}{g(x)H} \right|$ term less than $\frac{\epsilon}{2}$, you can select a $\delta_3 > 0$ so that if
$x$ is within $\delta_3$ of $a$ you will have $|g(x) - H|$ less than $\frac{\epsilon H^2}{4|L|}$ because that will give
$\left| \frac{L(g(x) - H)}{g(x)H} \right| < \frac{|L| \cdot \frac{\epsilon H^2}{4|L|}}{|g(x)| \, |H|} < \frac{\frac{\epsilon H^2}{4}}{\frac{H^2}{2}} = \frac{\epsilon}{2}$. Well, OK, did you catch that the preceding does
not work if $L = 0$? To avoid this problem it would be better to make $|g(x) - H|$ less
than $\frac{\epsilon H^2}{4(|L|+1)}$. Putting all of these ideas together gives the following proof.
4 jLjC1

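PROOF: Suppose that $f$ and $g$ are functions both defined on a set with
accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$ with $H \ne 0$, then
$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$
with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H \ne 0$.
Let $\epsilon > 0$ be given.
By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in
the domain of $g$ and $0 < |x - a| < \delta_1$, then $|g(x) - H| < \frac{|H|}{2}$.
For these $x$ it follows that $|g(x)| + \frac{|H|}{2} > |g(x)| + |g(x) - H| = |g(x)| + |H - g(x)| \ge |g(x) + (H - g(x))| = |H|$ which implies that
$|g(x)| > |H| - \frac{|H|}{2} = \frac{|H|}{2}$.
By the definition of limit, there is a $\delta_2 > 0$ such that if $x$ is in the domain of
$f$ and $0 < |x - a| < \delta_2$, then $|f(x) - L| < \frac{\epsilon |H|}{4}$.
By the definition of limit, there is a $\delta_3 > 0$ such that if $x$ is in the domain of
$g$ and $0 < |x - a| < \delta_3$, then $|g(x) - H| < \frac{\epsilon H^2}{4(|L|+1)}$.
Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.
Then if $x$ is in the domain of $\frac{f}{g}$ with $0 < |x - a| < \delta$,
$\left| \frac{f(x)}{g(x)} - \frac{L}{H} \right| = \left| \frac{f(x)H - Lg(x)}{g(x)H} \right| = \left| \frac{f(x)H - LH + LH - Lg(x)}{g(x)H} \right| = \left| \frac{(f(x) - L)H + L(H - g(x))}{g(x)H} \right| \le \frac{|f(x) - L|}{|g(x)|} + \left| \frac{L(g(x) - H)}{g(x)H} \right| < \frac{\epsilon |H|}{4} \cdot \frac{2}{|H|} + \frac{\epsilon H^2}{4(|L|+1)} \cdot \frac{2|L|}{H^2} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.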
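
As with the product, the bounds selected in the quotient proof can be checked with exact rational arithmetic. This is a sketch under stated assumptions, not part of the proof: the names `quotient_tolerances` and `quotient_within_eps` are hypothetical, and the arguments are expected to be `Fraction` values with $H \ne 0$.

```python
from fractions import Fraction


def quotient_tolerances(eps, L, H):
    """Bounds the proof places on |f(x) - L| and |g(x) - H| (H must be nonzero)."""
    tol_f = eps * abs(H) / 4                                     # from delta_2
    tol_g = min(abs(H) / 2, eps * H * H / (4 * (abs(L) + 1)))    # from delta_1 and delta_3
    return tol_f, tol_g


def quotient_within_eps(eps, L, H, steps=20):
    """Check |f(x)/g(x) - L/H| < eps on a grid of values inside the bounds."""
    tol_f, tol_g = quotient_tolerances(eps, L, H)
    for i in range(-steps + 1, steps):
        for j in range(-steps + 1, steps):
            fx = L + tol_f * Fraction(i, steps)  # |fx - L| < tol_f
            gx = H + tol_g * Fraction(j, steps)  # |gx - H| < |H|/2, so gx != 0
            if not abs(fx / gx - L / H) < eps:
                return False
    return True


print(quotient_within_eps(Fraction(1, 10), Fraction(2), Fraction(-3)))
```

The bound $|g(x) - H| < \frac{|H|}{2}$ is what keeps the division `fx / gx` safe here, just as it keeps $|g(x)|$ away from zero in the proof.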


3.9.4 Limit of Rational Functions


As a demonstration of the power of these results about the arithmetic of limits, you
can now easily prove the following list of results which will allow you to easily
calculate limits of polynomials and rational functions of x.
For any constant $c$ in the real numbers, $\lim_{x \to a} c = c$.
$\lim_{x \to a} x = a$.
For any $n$ in the natural numbers, $\lim_{x \to a} x^n = a^n$.
For any polynomial $p(x)$, $\lim_{x \to a} p(x) = p(a)$.
For any polynomials $p(x)$ and $q(x)$ with $q(a) \ne 0$, $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$.

The first two results are very easy to prove, and are left as exercises. The next
two results can be proved by using mathematical induction which is often the first
technique one considers using when trying to prove statements such as these that
depend on a natural number. Here, mathematical induction will be employed to
prove statements about the limits of polynomials, and the degree of the polynomial
provides a natural number to use as the induction variable.
To begin with, try using mathematical induction to prove that $\lim_{x \to a} x^n = a^n$ for
any natural number $n$. In this mathematical induction argument, the base case is
$\lim_{x \to a} x = a$, that is, when $n = b = 1$. The proofs of statements similar to this base
case were covered earlier. The induction step in the proof will need to show that if
$\lim_{x \to a} x^k = a^k$ for some natural number $k$, then $\lim_{x \to a} x^{k+1} = a^{k+1}$. But $x^{k+1}$ is just the
product $x^k \cdot x$, so this result follows immediately from the theorem about the limits
of products. That leads to the following proof that uses the template for proofs by
mathematical induction.
PROOF: $\lim_{x \to a} x^n = a^n$ for any natural number $n$.

SET THE CONTEXT: The statement will be proved for all natural numbers
$n$ by mathematical induction on $n$.
PROVE S(b): When $n = 1$, the statement says that $\lim_{x \to a} x = a$ which has
already been established.
STATE THE INDUCTION HYPOTHESIS: Assume that for some natural
number $k$, $\lim_{x \to a} x^k = a^k$.
PERFORM THE INDUCTION STEP: Then since the limit of a product
of two functions is the product of the two individual limits, it follows that
$\lim_{x \to a} x^{k+1} = \lim_{x \to a} x^k \cdot x = (\lim_{x \to a} x^k)(\lim_{x \to a} x) = a^k \cdot a = a^{k+1}$. So the statement
is true for $n = k + 1$.
STATE THE CONCLUSION: Therefore, by mathematical induction,
$\lim_{x \to a} x^n = a^n$ is true for all natural numbers $n$.


Mathematical induction can again be employed to prove that for every polynomial,
$p(x)$, $\lim_{x \to a} p(x) = p(a)$. As a reminder, a polynomial of degree $n$ is a function,
$p(x) = c_n x^n + c_{n-1} x^{n-1} + c_{n-2} x^{n-2} + \cdots + c_1 x + c_0$ where $c_0, c_1, c_2, \ldots, c_n$
are constants with $c_n \ne 0$. Previously it has been proved that $\lim_{x \to a} c_j = c_j$ and
$\lim_{x \to a} x^j = a^j$, from which one gets that the limit of a monomial is $\lim_{x \to a} c_j x^j = c_j a^j$.
A polynomial is just a sum of such monomials, so mathematical induction is a
convenient tool for showing that this sum of an arbitrary number of monomials
has the desired limit.
PROOF: For any constants $c_0, c_1, c_2, \ldots, c_n$ and $a \in \mathbb{R}$, the polynomial
$p(x) = c_n x^n + c_{n-1} x^{n-1} + c_{n-2} x^{n-2} + \cdots + c_1 x + c_0$ satisfies
$\lim_{x \to a} p(x) = p(a)$.

SET THE CONTEXT: The statement will be proved by mathematical
induction on the degree of the polynomial $n$.
PROVE S(b): $\lim_{x \to a} (c_1 x + c_0) = (\lim_{x \to a} c_1)(\lim_{x \to a} x) + \lim_{x \to a} c_0 = (c_1)(a) + c_0$ which
shows that the statement is true for $n = 1$.
STATE THE INDUCTION HYPOTHESIS: Assume that for some natural
number $k$, if $p(x) = c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0$, then $\lim_{x \to a} p(x) = p(a)$.
PERFORM THE INDUCTION STEP: If $p(x) = c_{k+1} x^{k+1} + c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0$, then
$\lim_{x \to a} (c_{k+1} x^{k+1} + c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0) = (\lim_{x \to a} c_{k+1})(\lim_{x \to a} x^{k+1}) + \lim_{x \to a} (c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0) = (c_{k+1} a^{k+1}) + (c_k a^k + c_{k-1} a^{k-1} + \cdots + c_1 a + c_0) = p(a)$. This
shows that the statement is true for $n = k + 1$.
STATE THE CONCLUSION: Therefore, by mathematical induction,
$\lim_{x \to a} p(x) = p(a)$ is true for all polynomials $p(x)$.

Recall that a rational function is just a ratio of polynomials, that is, if $p(x)$ and
$q(x)$ are polynomials, then $\frac{p(x)}{q(x)}$ is a rational function. It is only a simple step to get
the following theorem.
PROOF: For any polynomials $p$ and $q$ and $a \in \mathbb{R}$ where $q(a) \ne 0$, it follows
that $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$.

1. Let $p$ and $q$ be polynomials, and $a \in \mathbb{R}$ such that $q(a) \ne 0$.
2. Because $p$ and $q$ are polynomials, $\lim_{x \to a} p(x) = p(a)$ and $\lim_{x \to a} q(x) = q(a)$.
3. Since the limit of the quotient is equal to the quotient of the individual limits
when the limit of the denominator is not zero, it follows that
$\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{\lim_{x \to a} p(x)}{\lim_{x \to a} q(x)} = \frac{p(a)}{q(a)}$.
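
These results say that the limit of a rational function at a point where the denominator is nonzero is found by substitution. The sketch below illustrates this numerically with exact arithmetic. The example $p(x) = x^2 + 1$, $q(x) = x - 3$, $a = 2$ and the helper name `evaluate` are assumptions chosen for illustration, not taken from the text.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Evaluate c_n x^n + ... + c_1 x + c_0 by Horner's rule.

    coeffs lists the coefficients from c_n down to c_0.
    """
    total = Fraction(0)
    for c in coeffs:
        total = total * x + c
    return total


p = [1, 0, 1]  # p(x) = x^2 + 1
q = [1, -3]    # q(x) = x - 3, so q(2) = -1 is not zero
a = Fraction(2)
limit = evaluate(p, a) / evaluate(q, a)  # p(a) / q(a)

# Distance from p(x)/q(x) to p(a)/q(a) as x approaches a from the right.
errors = []
for k in range(1, 5):
    x = a + Fraction(1, 10 ** k)
    errors.append(abs(evaluate(p, x) / evaluate(q, x) - limit))

print(limit, [float(e) for e in errors])
```

The recorded distances shrink as $x$ moves toward $a$, which is what the theorem predicts. Sampling a few points only illustrates the statement and does not establish it.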


3.9.5 Other Types of Limits


It is time to note that even though all of these limit theorems concerned limits as
x approaches a, most can be extended to cover limits as x approaches a from the
left, as x approaches a from the right, as x approaches infinity, and as x approaches
negative infinity. In particular, most of the theorems apply to the limits of sequences.
Many of these statements can be found in the exercises.

3.9.6 Exercises
Write proofs of each of the following statements.
1. If $f$ and $g$ are defined in an open interval containing $a$, and if $\lim_{x \to a} f(x) = L$ and
$\lim_{x \to a} g(x) = H$, then $\lim_{x \to a} f(x) - g(x) = L - H$.
2. For any constant $c$ in the real numbers, $\lim_{x \to a} c = c$.
3. $\lim_{x \to a} x = a$.
4. If $f$ and $g$ have a common domain with $\lim_{x \to a^-} f(x) = L$ and $\lim_{x \to a^-} g(x) = H$, then
$\lim_{x \to a^-} f(x) + g(x) = L + H$.
5. If $f$ and $g$ have a common domain with $\lim_{x \to a^+} f(x) = L$ and $\lim_{x \to a^+} g(x) = H$, then
$\lim_{x \to a^+} f(x) - g(x) = L - H$.
6. If $f$ and $g$ have a common domain with $\lim_{x \to \infty} f(x) = L$ and $\lim_{x \to \infty} g(x) = H$, then
$\lim_{x \to \infty} f(x)g(x) = LH$.
7. If $\langle a_n \rangle$ and $\langle b_n \rangle$ are sequences with $\lim_{n \to \infty} a_n = L$ and $\lim_{n \to \infty} b_n = H \ne 0$, then
$\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{L}{H}$.
8. $\lim_{x \to \infty} f(x) = L$ if and only if $\lim_{x \to 0^+} f\left(\frac{1}{x}\right) = L$.
9. If $f(x) > L$ for all $x \ne a$, then $\lim_{x \to a} f(x) = L$ if and only if $\lim_{x \to a} \frac{1}{f(x) - L} = \infty$.
10. If $\lim_{x \to a} f(x) = \infty$, then for any constants $M$ and $N$, $\lim_{x \to a} \frac{f(x) + M}{f(x) + N} = 1$.

3.10 Other Limit Theorems


This section discusses a few other useful results about limits. They provide an
interesting variety of proof strategies to consider.


3.10.1 The Limit of a Positive Function


What can you say about $\lim_{x \to a} f(x) = L$ if you know that $f(x) > 0$ for all $x$, or at
least for all $x$ in an open interval containing $a$? Assuming that this limit exists, it
is clear that the limit cannot be negative because, from the definition of limit, you
know that $|f(x) - L|$ can be made as small as you like which would not be possible
if $f(x)$ were always positive and $L$ were negative. But how would you prove this?
The key lies in the inequality $|f(x) - L| < \epsilon$ since, if $L$ were negative, you could
choose $\epsilon$ to be so small that the inequality could not hold. How small would $\epsilon$ need
to be? Well, the only thing you know about $f(x)$ is that it is positive, or, in other
words, cannot be smaller than $0$. At the same time, $L$ is negative which means that
$f(x)$ and $L$ must be at least $|L|$ apart, noting that $|L| > 0$. So set $\epsilon = |L|$. Then
$|L| = \epsilon > |f(x) - L| = f(x) + |L|$ which implies $f(x) < 0$ which is a contradiction.
This leads to the following proof.
PROOF: Let $f$ be a function such that $f(x) > 0$ for all $x$ in the domain of
$f$. If $\lim_{x \to a} f(x) = L$, then $L \ge 0$.

Suppose that $\lim_{x \to a} f(x) = L$ and that for all $x$, $f(x) > 0$.
Assume that $L < 0$.
By the definition of limit, there is a $\delta > 0$ such that for all $x$ in the domain
of $f$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < -L$.
For these values of $x$ it must be that $-L > |f(x) - L| = f(x) - L$ implying
that $0 > f(x)$ which contradicts the fact that $f(x)$ is always positive.
Therefore, it must hold that $L \ge 0$.
Similar statements can be made about the limits of functions $f$ satisfying $f(x) > b$
or $f(x) < b$ for all $x$ where $b$ is a constant real number. One can also extend this to
limits from the left, limits from the right, and limits to infinity and negative infinity.
Several of these possibilities have been left for the exercises.

3.10.2 Uniqueness of Limits


There is nothing in the definition of $\lim_{x \to a} f(x) = L$ that a priori precludes $\lim_{x \to a} f(x) = M$
for some $M \ne L$. But, in fact, limits are unique, that is, the only way for the limit
to be $L$ and the limit to be $M$ is for $L$ and $M$ to be equal. Intuitively, this should make
sense. If the values of $f(x)$ are getting close to $L$, then they should not also be able
to get close to a value distinct from $L$. So how can you prove this using nothing but
the definition of limit as a tool?
The result can be proved by contradiction, that is, if you assume that the function
$f$ has two distinct limits, $L$ and $M$, as $x$ approaches $a$, then this leads to a statement
which must be false. Assuming that both limits exist, the definition of limit will


allow you to force both $|f(x) - L| < \epsilon$ and $|f(x) - M| < \epsilon$ for any positive $\epsilon$
that you choose. Why can't this happen? Well, if it did, you could get
$\epsilon + \epsilon > |f(x) - L| + |f(x) - M| = |f(x) - L| + |M - f(x)| \ge |f(x) - L + M - f(x)| = |M - L|$.
If $M \ne L$, then $|M - L|$ is a positive number, so if $\epsilon$ is chosen less than or equal to
$\frac{|M - L|}{2}$, it will be impossible to have $|M - L| < 2\epsilon$ as guaranteed by the definition of
limit. That gives you the following proof.
PROOF: If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} f(x) = M$, then $L = M$.

Suppose that $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} f(x) = M$.
Assume that $L \ne M$ which implies that $|M - L| > 0$.
By the definition of limit, there is a $\delta_1 > 0$ such that for all $x$ in the domain
of $f$ satisfying $0 < |x - a| < \delta_1$, it follows that $|f(x) - L| < \frac{|M - L|}{2}$.
By the definition of limit, there is a $\delta_2 > 0$ such that for all $x$ in the domain
of $f$ satisfying $0 < |x - a| < \delta_2$, it follows that $|f(x) - M| < \frac{|M - L|}{2}$.
Let $x$ be in the domain of $f$ with $0 < |x - a| < \min(\delta_1, \delta_2)$.
Then $|M - L| = \frac{|M - L|}{2} + \frac{|M - L|}{2} > |f(x) - L| + |f(x) - M| = |f(x) - L| + |M - f(x)| \ge |f(x) - L + M - f(x)| = |M - L|$ showing that
$|M - L| > |M - L|$ which is a contradiction.
Thus, $L \ne M$ must be false, and $L = M$.

3.10.3 The Squeezing Theorem


The Squeezing Theorem, also known as the Sandwich Theorem or the Scrunch
Theorem, says that if the values of $f(x)$ are always between $g(x)$ and $h(x)$, and
if $g$ and $h$ both have the same limit, $L$, at $x = a$, then $f$ must also have limit $L$ at $a$.
The proof of this is not hard once you write down everything that you know about
the functions $f$, $g$, and $h$. So what do you know? You can assume that for every $x$,
$g(x) \le f(x) \le h(x)$, and you can assume that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$. This means
that for every $\epsilon > 0$ there is a $\delta_1 > 0$ such that when $x$ satisfies $0 < |x - a| < \delta_1$, then
$|g(x) - L| < \epsilon$. Similarly, for that same $\epsilon$, there is a $\delta_2 > 0$ such that when $x$ satisfies
$0 < |x - a| < \delta_2$, then $|h(x) - L| < \epsilon$. Thus, you can show for values of $x$ near $a$ that
$g(x) \le f(x) \le h(x)$, $-\epsilon < g(x) - L < \epsilon$, and $-\epsilon < h(x) - L < \epsilon$. Putting these three
sets of inequalities together shows that $-\epsilon < g(x) - L \le f(x) - L \le h(x) - L < \epsilon$
from which $|f(x) - L| < \epsilon$ follows. This gives the following proof.


PROOF (Squeezing Theorem): Let $f$, $g$, and $h$ be three functions with
the same domain, and let $a$ be an accumulation point of that domain.
Assume that for all $x$ in that domain $g(x) \le f(x) \le h(x)$, and that
$\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$. Then $\lim_{x \to a} f(x) = L$.

Assume that $a$ is an accumulation point of the domain shared by the three
functions $f$, $g$, and $h$.
Also assume that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$.
Finally, assume that $g(x) \le f(x) \le h(x)$ for all $x$ in the common domain of
$f$, $g$, and $h$.
Let $\epsilon > 0$ be given.
By the definition of limit, there is a $\delta_1 > 0$ such that for all $x$ in the domain
of $g$ that satisfy $0 < |x - a| < \delta_1$, it follows that $|g(x) - L| < \epsilon$.
By the definition of limit, there is a $\delta_2 > 0$ such that for all $x$ in the domain
of $h$ that satisfy $0 < |x - a| < \delta_2$, it follows that $|h(x) - L| < \epsilon$.
Then for all $x$ in the common domain of $f$, $g$, and $h$ satisfying $0 < |x - a| < \min(\delta_1, \delta_2)$,
$|g(x) - L| < \epsilon$ and $|h(x) - L| < \epsilon$.
Thus, for those $x$, $-\epsilon < g(x) - L \le f(x) - L \le h(x) - L < \epsilon$ from which it
follows that $|f(x) - L| < \epsilon$.
Therefore, $\lim_{x \to a} f(x) = L$.
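
A standard instance of the Squeezing Theorem, not taken from the text, is $f(x) = x^2 \sin\frac{1}{x}$ squeezed between $g(x) = -x^2$ and $h(x) = x^2$ at $a = 0$, where $L = 0$. For the bounding functions, $\delta_1 = \delta_2 = \sqrt{\epsilon}$ works, and the sketch below checks on sample points that the same $\delta$ then works for $f$. The function names are hypothetical.

```python
import math


def f(x):
    """x^2 * sin(1/x), defined for x != 0."""
    return x * x * math.sin(1 / x)


def delta_for(eps):
    """A delta that works for both g(x) = -x^2 and h(x) = x^2 at a = 0."""
    return math.sqrt(eps)


def check_squeeze(eps, samples=1000):
    """Test g(x) <= f(x) <= h(x) and |f(x) - 0| < eps for 0 < |x| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (-offset, offset):
            if not (-x * x <= f(x) <= x * x and abs(f(x)) < eps):
                return False
    return True


print(check_squeeze(0.1), check_squeeze(1e-8))
```

Note that the code never needs a $\delta$ computed from $f$ itself. As in the proof, the tolerance for $f$ is inherited from the two functions that bound it.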

3.10.4 Limits of Subsequences


If the sequence $\langle a_n \rangle$ converges to $L$, it means that the terms of the sequence are
getting close to $L$. This should mean that the terms of any subsequence should also
be getting close to $L$, and it is not hard to prove that every subsequence $\langle a_{n_j} \rangle$ of
$\langle a_n \rangle$ has the same limit.
Given the fact that $\lim_{n \to \infty} a_n = L$ and given a subsequence $\langle a_{n_j} \rangle$, how do you
use this to prove that the subsequence converges to $L$? What do you know about this
subsequence? Only that there is a strictly increasing sequence of natural numbers,
$\langle n_j \rangle$, that tells which terms of $\langle a_n \rangle$ are found in the subsequence. A nice property
of a strictly increasing sequence of natural numbers, $\langle n_j \rangle$, is that for any natural
number $j$, $n_j \ge j$. This can easily be proved by mathematical induction on $j$.
Certainly, $n_1 \ge 1$ since $n_1$ is a natural number, so the claim is true for $j = 1$. If
$n_k \ge k$ for some $k$, then because $\langle n_j \rangle$ is strictly increasing, $n_{k+1} \ge n_k + 1 \ge k + 1$
showing that if the claim is true for $k$, then it is true for $k + 1$. This proves the claim.
The definition of limit gives you that for any $\epsilon > 0$ there is an $N$ such that if
$j > N$, then $|a_j - L| < \epsilon$. But since $n_j \ge j$, it follows that for all $j > N$, $n_j$ is also
greater than $N$, so $|a_{n_j} - L| < \epsilon$ as needed.


PROOF: Let $\langle a_n \rangle$ be a sequence with $\lim_{n \to \infty} a_n = L$, and let $\langle a_{n_j} \rangle$ be any
subsequence. Then $\lim_{j \to \infty} a_{n_j} = L$.

Let $\langle a_n \rangle$ be a sequence with $\lim_{n \to \infty} a_n = L$, and let $\langle a_{n_j} \rangle$ be any
subsequence.
Let $\epsilon > 0$ be given.
By the definition of limit, there is an $N$ such that for all $n > N$, $|a_n - L| < \epsilon$.
By the definition of subsequence, $\langle n_j \rangle$ is a strictly increasing sequence of
natural numbers and, as such, satisfies $n_j \ge j$ for all natural numbers $j$.
Thus, for all $j > N$, $n_j \ge j > N$ implies $|a_{n_j} - L| < \epsilon$.
This proves that $\lim_{j \to \infty} a_{n_j} = L$.
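
The key fact $n_j \ge j$ and its use in the proof can be illustrated with a small example. The sketch below assumes the sequence $a_n = \frac{1}{n}$ (so $L = 0$) and the index sequence $n_j = j^2$, neither of which comes from the text, and the function names are hypothetical.

```python
import math


def a(n):
    """The sequence a_n = 1/n, which converges to L = 0."""
    return 1 / n


def index(j):
    """A strictly increasing sequence of natural numbers: n_j = j^2."""
    return j * j


def N_for(eps):
    """An N such that |a_n - 0| < eps whenever n > N."""
    return math.ceil(1 / eps)


eps = 0.01
N = N_for(eps)

# n_j >= j for every j tested, as the induction argument shows in general.
indices_ok = all(index(j) >= j for j in range(1, 1001))

# The same N works for the subsequence: j > N forces n_j > N.
subsequence_ok = all(abs(a(index(j)) - 0) < eps for j in range(N + 1, N + 501))

print(N, indices_ok, subsequence_ok)
```

The point of the example is that no new $N$ has to be found for the subsequence. The $N$ supplied by the original sequence is reused unchanged.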

Of course the converse of this theorem is trivially true. That is, if all subsequences
of a given sequence converge, then the original sequence converges. This is trivial
since the original sequence is one of its subsequences.

3.10.5 Exercises
Write proofs of each of the following statements.
1. If $\lim_{x \to a} f(x) = L$ and $f(x) < b$ for all $x$, then $L \le b$.
2. If $f(x) > b$ for all $x$ and $\lim_{x \to \infty} f(x) = L$, then $L \ge b$.
3. If $f(x) > 0$ for all $x$, then $\lim_{x \to a} f(x)$ cannot equal negative infinity.
4. Suppose that sequences $\langle a_n \rangle$, $\langle b_n \rangle$, and $\langle c_n \rangle$ satisfy $a_n \le b_n \le c_n$ for every
natural number $n$. If $\lim_{n \to \infty} a_n = \lim_{n \to \infty} c_n = L$, then $\lim_{n \to \infty} b_n = L$.

3.11 Liminf and Limsup


Even when a limit does not exist, there is often something that can be said
about the values that the function approaches. Consider, for example, the sequence
$1, -1, 0, 1, -1, 0, 1, -1, 0, \ldots$ which just oscillates among the numbers $1$, $-1$,
and $0$. This sequence does not have a limit, but it has subsequences that do converge.
Some of its subsequences converge to $1$, some converge to $-1$, and some converge
to $0$.
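
The three kinds of convergent subsequence mentioned above can be picked out explicitly by taking every third term. The following sketch is only an illustration, and the function names `a` and `subsequence` are hypothetical.

```python
def a(n):
    """n-th term (n = 1, 2, 3, ...) of the sequence 1, -1, 0, 1, -1, 0, ..."""
    return (1, -1, 0)[(n - 1) % 3]


def subsequence(start, count=5):
    """The terms a_start, a_(start+3), a_(start+6), ... (count of them)."""
    return [a(start + 3 * j) for j in range(count)]


print(subsequence(1))  # constant subsequence of 1s
print(subsequence(2))  # constant subsequence of -1s
print(subsequence(3))  # constant subsequence of 0s
```

Each of these subsequences is constant, so it converges trivially, while the full sequence keeps visiting all three values and therefore has no limit.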
Now consider the function $f(x) = \frac{2x^2 \sin x}{x^2 + 1}$. The function $\frac{2x^2}{x^2 + 1}$ has a limit of $2$
as $x$ goes to infinity, but $f(x)$ oscillates without approaching a limit. Some of its


values do approach $2$, but other values approach $-2$ and every value in between.
More precisely, for each $L \in [-2, 2]$, you can find sequences $\langle x_n \rangle$ where
$\lim_{n \to \infty} x_n = \infty$ and $\lim_{n \to \infty} f(x_n) = L$.
So suppose that the function $f$ is defined for positive real numbers. How might
$f(x)$ behave as $x$ goes to infinity? $f$ might diverge to infinity or minus infinity as
do $f(x) = x^2$ and $f(x) = \frac{1-x^3}{x^2+4}$. It might have a finite limit as does $\frac{2x^2}{x^2+1}$. It might
oscillate among values within some bounded range such as $\frac{(3x+100)\sin x}{x+10}$. Finally, it
might oscillate and be unbounded like $x \cdot |\sin x|$.
Even when $f$ oscillates so that it does not have a finite or infinite limit, it
is helpful to quantify which values the function $f(x)$ approaches repeatedly as $x$
grows. This can be done by considering the range of $f(x)$ when $x$ is restricted to
an interval $(M, \infty)$, and then watching what happens to that range as $M$ gets large.
For example, consider the function $f(x) = \frac{(3x+100)\sin x}{x+10}$ whose graph is shown in
Fig. 3.10. The function $\frac{3x+100}{x+10} = 3 + \frac{70}{x+10}$ is a decreasing function of $x$ for $x > 0$,
so on the interval $(M, \infty)$, the function $f$ oscillates in a range bounded between
$\frac{3M+100}{M+10}$ and $-\frac{3M+100}{M+10}$. What can be said about the sequence $\langle f(x_n) \rangle$ where $\langle x_n \rangle$ is
a sequence with $\lim_{n\to\infty} x_n = \infty$? The values $f(x_n)$ are between $\frac{3x_n+100}{x_n+10}$ and $-\frac{3x_n+100}{x_n+10}$,
so as $x_n$ gets large, $f(x_n)$ is forced to be inside or very near the interval $[-3, 3]$.
Clearly, for no sequence $\langle x_n \rangle$ can $f(x_n)$ approach a limit outside of the interval
$[-3, 3]$, but there are sequences for which $f(x_n)$ approaches $3$ and others for which
$f(x_n)$ approaches $-3$ as shown in the figure. Finding the greatest and least values
to which $f(x_n)$ could converge is the idea behind the limit superior and limit
inferior, often referred to simply as the lim sup and lim inf, respectively. In the
example of $f(x) = \frac{(3x+100)\sin x}{x+10}$, the values of $3$ and $-3$ came from looking at the
least upper bound and greatest lower bound of the set $\{f(x) \mid x > M\}$ and then
letting $M$ go to infinity. In general, let $f$ be a function whose domain is unbounded
above. For each real number $M$ let $A_M$ be the range of $f$ for $x > M$, that is,
$A_M = \{f(x) \mid x \text{ is in the domain of } f \text{ with } x > M\}$. Then define $\limsup_{x\to\infty} f(x)$ to be
$\lim_{M\to\infty} \sup A_M$. Similarly, define $\liminf_{x\to\infty} f(x)$ to be $\lim_{M\to\infty} \inf A_M$. Some books use the
notation $\overline{\lim}$ and $\underline{\lim}$ for lim sup and lim inf, respectively.
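The claim that $f(x_n)$ can approach $3$ along one sequence and $-3$ along another can be checked numerically. The following Python sketch is only an illustration and is not part of the text's argument; the helper names `envelope`, `peak`, and `trough` are chosen here for convenience. It evaluates $f$ at points where $\sin x = 1$ and at points where $\sin x = -1$, up to floating-point rounding.

```python
import math

def envelope(x):
    """The bounding factor (3x + 100)/(x + 10), which decreases toward 3."""
    return (3 * x + 100) / (x + 10)

def f(x):
    """f(x) = (3x + 100) sin(x) / (x + 10)."""
    return envelope(x) * math.sin(x)

def peak(n):
    """A point where sin x = 1 (up to floating-point rounding)."""
    return (2 * n + 0.5) * math.pi

def trough(n):
    """A point where sin x = -1 (up to floating-point rounding)."""
    return (2 * n - 0.5) * math.pi

if __name__ == "__main__":
    # Along peak(n) the values decrease toward 3; along trough(n) they
    # increase toward -3.
    for n in (1, 10, 100, 1000, 10000):
        print(n, f(peak(n)), f(trough(n)))
```

A finite table of values proves nothing about a limit, but it does make the two extreme sequences concrete.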
Fig. 3.10 Sequences approaching the lim sup and lim inf of $f(x) = \frac{(3x+100)\sin x}{x+10}$

If $f(x)$ is unbounded above as $x$ gets large, then $\sup A_M = \infty$ for each $M$,
so $\limsup_{x\to\infty} f(x) = \infty$. If $\lim_{x\to\infty} f(x) = -\infty$, then $\sup A_M$ will also approach $-\infty$,
so $\limsup_{x\to\infty} f(x)$ will be $-\infty$. Analogously, if $f(x)$ is unbounded below as $x$ gets
large, $\liminf_{x\to\infty} f(x) = -\infty$, and if $\lim_{x\to\infty} f(x) = \infty$, then $\liminf_{x\to\infty} f(x) = \infty$. Note
that since $\sup A_M$ and $\inf A_M$ are both monotone functions of $M$, their limits always
exist, although they might be infinite limits. Thus, unlike $\lim_{x\to\infty} f(x)$, the values of
$\limsup_{x\to\infty} f(x)$ and $\liminf_{x\to\infty} f(x)$ always exist.

If $f(x)$ remains bounded as $x$ gets large, $\limsup_{x\to\infty} f(x)$ and $\liminf_{x\to\infty} f(x)$ are finite
values. This means $\lim_{M\to\infty} \sup A_M$ is finite. For each natural number $n$, there must
be an $x_n > n$ such that $f(x_n)$ is within, say, $\frac{1}{n}$ of $\sup A_n$. Then, $\langle x_n \rangle$ is a sequence
that diverges to infinity such that for each $n$, $\sup A_n - \frac{1}{n} < f(x_n) \le \sup A_n$. By
the Squeezing Theorem, $\lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} \sup A_n = \limsup_{x\to\infty} f(x)$. Similarly, there
must be a sequence $\langle x_n \rangle$ diverging to infinity with $\lim_{n\to\infty} f(x_n) = \liminf_{x\to\infty} f(x)$. This
means that there is a sequence such that $f$ converges to its limit superior on that
sequence and another sequence such that $f$ converges to its limit inferior on it.
Consider the three examples: $\sin x$, $x \cdot |\sin x|$, and $x \cdot \sin x$. None of these functions
has a limit as $x$ approaches infinity because each function oscillates and does not
approach one particular value. On the other hand, in each case it is easy to see upper
and lower bounds to the oscillations. The values of the function $\sin x$ clearly stay in
the interval $[-1, 1]$. For each integer $n$, when $x = \left(2n + \frac{1}{2}\right)\pi$, the function $\sin x =
1 = \limsup_{x\to\infty} \sin x$ and when $x = \left(2n - \frac{1}{2}\right)\pi$, the function $\sin x = -1 = \liminf_{x\to\infty} \sin x$.
The function $x \cdot |\sin x|$ is unbounded above but is nonnegative for positive $x$. Again,
for integer $n$, when $x = \left(2n + \frac{1}{2}\right)\pi$, the function $x \cdot |\sin x| = x$ which goes to
infinity, the lim sup of $x \cdot |\sin x|$, and when $x = n\pi$, the function $x \cdot |\sin x| =
0 = \liminf_{x\to\infty} x \cdot |\sin x|$. The function $x \cdot \sin x$ behaves similarly except, now, when
$x = \left(2n - \frac{1}{2}\right)\pi$, the function $x \cdot \sin x$ is $-x$ which goes to negative infinity, the
lim inf of $x \cdot \sin x$.
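The behavior of $\sup A_M$ and $\inf A_M$ for these three examples can be explored with a rough numerical experiment. The sketch below is an illustration only: the helper `sampled_sup_inf` is a hypothetical name introduced here, and it samples a finite window $(M, M + \text{width}]$ rather than all $x > M$, so its output is merely suggestive of the true supremum and infimum.

```python
import math

def sampled_sup_inf(f, M, width=20 * math.pi, steps=100_000):
    """Approximate sup and inf of f on (M, M + width] by sampling.

    This only samples a finite window, so it is a rough stand-in for
    sup A_M and inf A_M, which range over all x > M.
    """
    values = [f(M + width * k / steps) for k in range(1, steps + 1)]
    return max(values), min(values)

examples = {
    "sin x": math.sin,
    "x * |sin x|": lambda x: x * abs(math.sin(x)),
    "x * sin x": lambda x: x * math.sin(x),
}

if __name__ == "__main__":
    for name, g in examples.items():
        for M in (10, 100, 1000):
            hi, lo = sampled_sup_inf(g, M)
            print(f"{name:12s} M={M:5d}  sup~{hi:10.3f}  inf~{lo:10.3f}")
```

For $\sin x$ the sampled values stay near $1$ and $-1$ for every $M$, while for the other two functions the sampled supremum (and, for $x \cdot \sin x$, the magnitude of the sampled infimum) grows with $M$, in line with the discussion above.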
The limit superior and limit inferior can be defined for limits at points other
than infinity. For example, if $a$ is an accumulation point of the domain of $f$, one
can define the limit superior and limit inferior of $f(x)$ as $x$ approaches $a$. Rather
than defining $A_M$ to be the values of $f(x)$ for $x > M$, which essentially contains the
values of $f$ for $x$ restricted to an interval ending at infinity, one can define $A_\delta$ for
any $\delta > 0$ as $A_\delta = \{f(x) \mid x \text{ is in the domain of } f \text{ with } 0 < |x - a| < \delta\}$ which
contains the values of $f$ for $x$ restricted to an open interval containing $a$ with the
point $a$ removed. Then $\limsup_{x\to a} f(x) = \lim_{\delta\to 0^+} \sup A_\delta$ and $\liminf_{x\to a} f(x) = \lim_{\delta\to 0^+} \inf A_\delta$.
These definitions of lim sup and lim inf have properties similar to the definitions
of lim sup and lim inf at infinity. That is, $\sup A_\delta$ and $\inf A_\delta$ are both monotone in
$\delta$, so their limits as $\delta$ goes to $0$ always exist. Moreover, there is a sequence $\langle x_n \rangle$
where $\lim_{n\to\infty} x_n = a$ such that $\lim_{n\to\infty} f(x_n) = \limsup_{x\to a} f(x)$ and another such sequence
such that $\lim_{n\to\infty} f(x_n) = \liminf_{x\to a} f(x)$. Similar definitions can be given for $\liminf_{x\to a^+} f(x)$,
$\limsup_{x\to a^+} f(x)$, $\liminf_{x\to a^-} f(x)$, $\limsup_{x\to a^-} f(x)$, $\liminf_{x\to -\infty} f(x)$, and $\limsup_{x\to -\infty} f(x)$.

The most important theorem concerning lim inf and lim sup is that $\lim_{x\to a} f(x) = L$
if and only if $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$. Notice first that this is a biconditional
statement; that is, an "if and only if" statement. This requires that its proof have two
parts: one that assumes $\lim_{x\to a} f(x) = L$ and proves $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$
and another that assumes $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$ and proves $\lim_{x\to a} f(x) = L$.
So, given $\lim_{x\to a} f(x) = L$, how can you conclude that $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$?
What you know is that given $\epsilon > 0$, there is a $\delta > 0$ such that for all $x$ in the
domain of $f$ for which $0 < |x - a| < \delta$, you have $|f(x) - L| < \epsilon$. But this means that
for small $\delta > 0$, the supremum $\sup A_\delta$ and infimum $\inf A_\delta$ are both within $\epsilon$ of $L$ and,
therefore, $\sup A_\delta$ and $\inf A_\delta$ must both approach $L$ as $\delta$ decreases to $0$.
Conversely, suppose that $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$. Note that for any $x \ne a$ in
the domain of $f$, it follows that $f(x) \in A_{2|x-a|}$. Thus, $\inf A_{2|x-a|} \le f(x) \le \sup A_{2|x-a|}$
which implies that $\lim_{x\to a} \inf A_{2|x-a|} \le \lim_{x\to a} f(x) \le \lim_{x\to a} \sup A_{2|x-a|}$ from which it
follows that $\lim_{x\to a} f(x) = L$.

PROOF: Let $a$ be an accumulation point of the domain of the function $f$.
Then $\lim_{x\to a} f(x) = L$ if and only if $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$.

Assume that $a$ is an accumulation point of the domain of the function $f$.
For any $\delta > 0$, define $A_\delta = \{f(x) \mid x \text{ is in the domain of } f \text{ with } 0 < |x - a| < \delta\}$.
PART I: the limit equals $L$ implies lim inf and lim sup equal $L$
Assume that $\lim_{x\to a} f(x) = L$.
Let $\epsilon > 0$ be given.
Then there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$,
then $|f(x) - L| < \epsilon$.
This says that $\inf A_\delta \ge L - \epsilon$ and $\sup A_\delta \le L + \epsilon$.
Because $\inf A_\delta$ does not decrease and $\sup A_\delta$ does not increase as $\delta$ decreases to $0$,
and because $\epsilon$ was arbitrary, it follows that $\liminf_{x\to a} f(x) \ge L$ and $\limsup_{x\to a} f(x) \le L$.
Since $\liminf_{x\to a} f(x) \le \limsup_{x\to a} f(x)$, this implies that
$\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$, completing the first part of the proof.
PART II: lim inf and lim sup equal $L$ implies that the limit equals $L$
Assume that $\liminf_{x\to a} f(x) = \limsup_{x\to a} f(x) = L$.
For any $x$ in the domain of $f$ with $x \ne a$, it follows that $\inf A_{2|x-a|} \le f(x) \le \sup A_{2|x-a|}$.
Because $\lim_{x\to a} \inf A_{2|x-a|} = \liminf_{x\to a} f(x) = L$, and $\lim_{x\to a} \sup A_{2|x-a|} =
\limsup_{x\to a} f(x) = L$, the Squeezing Theorem shows that $\lim_{x\to a} f(x) = L$, which
completes the second part of the proof.


As discussed earlier, this theorem holds even when $L = \infty$ or $-\infty$. It also holds for
limits at $\pm\infty$ and for one-sided limits.
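As a brief worked illustration that is not taken from the text, the theorem can be applied to the function $f(x) = \sin(1/x)$ at $a = 0$. For every $\delta > 0$ the quantity $1/x$ runs through complete periods of the sine function as $x$ ranges over $0 < |x| < \delta$, so the set $A_\delta$ is the whole interval $[-1, 1]$.

```latex
\[
A_\delta = \{\sin(1/x) \mid 0 < |x| < \delta\} = [-1, 1]
\quad\text{for every } \delta > 0,
\]
\[
\limsup_{x\to 0} \sin(1/x) = \lim_{\delta\to 0^+} \sup A_\delta = 1,
\qquad
\liminf_{x\to 0} \sin(1/x) = \lim_{\delta\to 0^+} \inf A_\delta = -1 .
\]
```

Because the lim sup and lim inf differ, the theorem shows that $\lim_{x\to 0} \sin(1/x)$ does not exist.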

3.11.1 Exercises
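1. Write definitions for each of the following.
(a) $\liminf_{x\to a^+} f(x)$
(b) $\limsup_{x\to a^-} f(x)$
(c) $\liminf_{x\to -\infty} f(x)$
2. Determine each of the following.
(a) $\liminf_{x\to 2} \begin{cases} x & x < 2 \\ \frac{1}{x-2} & x > 2 \end{cases}$
(b) $\limsup_{x\to 2} \begin{cases} x & x < 2 \\ \frac{1}{x-2} & x > 2 \end{cases}$
(c) $\liminf_{x\to 2} \begin{cases} x & x \text{ is rational} \\ 5 & x \text{ is irrational} \end{cases}$
(d) $\limsup_{n\to\infty} \frac{5}{4 + (-1)^n}$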
(e) $\limsup_{n\to\infty} \frac{1 + \frac{1}{n}}{1 + n^{(-1)^n}}$


3. Prove that if $a$ is any accumulation point of the domain of $f$, then $\liminf_{x\to a} f(x) \le \limsup_{x\to a} f(x)$.
4. Prove that $\lim_{x\to a} f(x) = \infty$ if and only if $\liminf_{x\to a} f(x) = \infty$.
5. Suppose that $\liminf_{x\to a} f(x) = L$ and $\liminf_{x\to a} g(x) = M$. What can you say about
$\liminf_{x\to a} (f + g)(x)$?
6. Suppose that $f$ is a positive-valued function with $\limsup_{x\to a} f(x) = L > 0$. Prove
that $\liminf_{x\to a} \frac{1}{f(x)} = \frac{1}{L}$.

Chapter 4

Continuity

4.1 The Definition of Continuity


As with the definition of limit, most Calculus students will develop an intuitive feel
for what it means for a function to be continuous. This usually involves knowing
that a function is continuous on an interval if the graph of that function over that
interval can be drawn without lifting one's pencil from the page. The important
property here is that as the pencil is tracing out the graph of the function, and the
pencil is approaching the point where $x = a$, the points on the graph are getting
close to their destination at the point $(a, f(a))$. In particular, it does not happen
that as the points on the graph are getting close to $(a, L)$ that the graph suddenly
jumps to a different point $(a, f(a))$ where $f(a) \ne L$, a situation where the pencil
would have to be lifted from the page to get from $(a, L)$ to $(a, f(a))$. This intuitive
understanding leads directly to the key property of $f$ being continuous at $a$ which is
that $\lim_{x\to a} f(x) = f(a)$.
How can one state a definition for continuity that embodies this intuitive feel for
the function having its own value as its limit? Clearly, the definition of a function $f$
being continuous at a point $x = a$ must be similar to a definition of the limit of $f$ as
$x$ approaches $a$. As a reminder, here is the definition of limit.
Suppose that the point $a$ is an accumulation point of the domain of the function $f$.
Then $\lim_{x\to a} f(x) = L$ means that for every $\epsilon > 0$ there exists a $\delta > 0$ such that for
every $x$ in the domain of $f$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \epsilon$.
The definition of continuity of $f$ at point $a$ needs to include the fact that the
function is defined at the point $a$, so references to the limit $L$ in the definition of
limit can be replaced by references to $f(a)$. Thus, the definition of continuity will
contain the conclusion $|f(x) - f(a)| < \epsilon$. In the definition of limit, it was not required
that the function $f$ be defined at $x = a$, and if it were defined, $f(a)$ did not need to
be equal to the limit $L$. For this reason, the definition of limit took care to ensure
that even though $|f(x) - L| < \epsilon$ was required to hold for $x$ values near $x = a$, this
inequality did not need to hold at $x = a$. The definition of limit excluded $x = a$ by
only requiring the inequality to hold for those $x$ values satisfying $0 < |x - a| < \delta$.
This restriction is not necessary in the definition of continuity
of a function at a point.
Suppose that the point $a$ is in the domain of the function $f$. Then $f$ is continuous at $a$
means that for every $\epsilon > 0$ there exists a $\delta > 0$ such that for every $x$ in the domain
of $f$ satisfying $|x - a| < \delta$, it follows that $|f(x) - f(a)| < \epsilon$.

Fig. 4.1 Continuity of a function
Notice that the requirement that the point $a$ be an accumulation point of the
domain of $f$ has been dropped. As a result, if the function $f$ is defined at an isolated
point $a$, then $f$ is continuous at that point. A function that is not continuous at the
point $a$ is discontinuous at the point $a$.
A function $f$ is continuous on a set $A$ if it is continuous at each point $a \in A$. The
function whose graph appears in Fig. 4.1 is discontinuous at $x = b$ because its limit
at $x = b$ does not exist. Similarly, it is discontinuous at $x = c$. It is discontinuous
at $x = d$ because it is not defined at that point even though the function has a limit
there. The function is continuous on the intervals $[a, b)$, $(b, c)$, and $(c, d)$, and at
the points $x = e$ and $x = f$. The function is not continuous on the intervals $[a, b]$
or $[c, d]$.
It is a direct consequence of the definition of continuity that if $f$ is continuous
at a point $a$, and if $a$ is an accumulation point of the domain of $f$, then the limit of
$f(x)$ at $a$ exists and is, in fact, $f(a)$. To prove this you would just need to show that
if $f$ satisfies the definition of continuity at $a$, then $f$ also satisfies the definition of
$\lim_{x\to a} f(x) = f(a)$. Writing down the definition of continuity gives you that for every
$\epsilon > 0$ there is a $\delta > 0$ such that $|x - a| < \delta$ implies $|f(x) - f(a)| < \epsilon$. But if this is
true, then certainly $0 < |x - a| < \delta$ implies $|f(x) - f(a)| < \epsilon$, so the definition of
limit is satisfied.


PROOF: If the function $f$ is continuous at $a$, and $a$ is an accumulation
point of the domain of $f$, then $\lim_{x\to a} f(x) = f(a)$.

Let $f$ be a function continuous at $a$ where $a$ is an accumulation point of the
domain of $f$.
Given $\epsilon > 0$,
the definition of continuity says that there is a $\delta > 0$ such that if $x$ is in the
domain of $f$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \epsilon$.
But then if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, it follows that
$|f(x) - f(a)| < \epsilon$, satisfying the definition of $\lim_{x\to a} f(x) = f(a)$.
Therefore, $\lim_{x\to a} f(x) = f(a)$.

Similarly, if $f$ is defined at $a$ and $\lim_{x\to a} f(x) = f(a)$, then $f$ is continuous at $a$.
Again, the proof of this follows directly from the definitions.

PROOF: If the function $f$ is defined at $a$ and $\lim_{x\to a} f(x) = f(a)$, then $f$ is
continuous at $a$.

Let $f$ be a function defined at $a$ where $\lim_{x\to a} f(x) = f(a)$.
Given $\epsilon > 0$,
the definition of limit says that there is a $\delta > 0$ such that if $x$ is in the domain
of $f$ with $0 < |x - a| < \delta$, then $|f(x) - f(a)| < \epsilon$.
Certainly, if $x = a$, then $|f(x) - f(a)| = |f(a) - f(a)| = 0 < \epsilon$.
Thus, it follows that $|x - a| < \delta$ implies $|f(x) - f(a)| < \epsilon$, satisfying the
definition of $f$ being continuous at $a$.
Therefore, $f$ is continuous at $a$.

4.2 Proving the Continuity of a Function


The template for proofs of $\lim_{x\to a} f(x) = L$ followed directly from the definition of
limit. Similarly, a template for proofs of the continuity of a function $f$ at a point
$a$ will follow directly from the definition of continuity. Indeed, the definition of
continuity requires that for every $\epsilon > 0$ there exist a $\delta > 0$ which satisfies
a particular condition. This suggests that a proof of continuity should select an
arbitrary $\epsilon > 0$ and proceed to display a value of $\delta > 0$ that causes the needed
condition to be satisfied. This is similar to the procedure taken for a limit proof
except that the needed condition is slightly different. Thus, here is a template for
proofs about the continuity of a function at a point.


TEMPLATE for proving the function $f$ is continuous at the point $a$

SET THE CONTEXT: Make statements about what is known about the
function $f$ and the numbers $a$ and $f(a)$.
SELECT AN ARBITRARY $\epsilon$: Given $\epsilon > 0$,
PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an
appropriate value for $\delta$.
SELECT AN ARBITRARY $x$: Select $x$ in the domain of $f$ such that
$|x - a| < \delta$.
LIST IMPLICATIONS: Derive the result $|f(x) - f(a)| < \epsilon$.
STATE THE CONCLUSION: Therefore, $f$ is continuous at the point $a$.
As a start, consider how to prove that the function defined for all real numbers $x$ as
$f(x) = 5x - 3$ is continuous at $x = 4$. The proof would begin with "Let $f(x) = 5x - 3$.
Given $\epsilon > 0$, ...". The task is then to find a $\delta > 0$ so that $|f(x) - f(4)| < \epsilon$ for every
$x$ satisfying $|x - 4| < \delta$. Working backwards, to get $|f(x) - f(4)| < \epsilon$ one needs
$\epsilon > |(5x - 3) - (5 \cdot 4 - 3)| = 5|x - 4|$. Therefore, it seems clear that $|x - 4|$
needs to be less than $\frac{\epsilon}{5}$, so letting $\delta = \frac{\epsilon}{5}$ will work. Note that because $\epsilon > 0$, $\delta$ is
also greater than 0 as required by the definition of continuity. Putting this into the
template results in the following proof.
PROOF: The function $f(x) = 5x - 3$ is continuous at $x = 4$.
Let $f(x) = 5x - 3$.
Given $\epsilon > 0$,
let $\delta = \frac{\epsilon}{5}$ which is greater than 0 since $\epsilon > 0$.
Select $x$ such that $|x - 4| < \delta = \frac{\epsilon}{5}$.
Then $\delta > |x - 4|$ implies $|f(x) - f(4)| = |(5x - 3) - (5 \cdot 4 - 3)| = |5x - 20| =
5|x - 4| < 5\delta = \epsilon$.
Therefore, the function $f$ is continuous at 4.
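The choice $\delta = \epsilon/5$ can also be spot-checked numerically. The Python sketch below is only a sanity check and is no substitute for the proof; the function names `delta_for` and `check` are hypothetical helpers introduced here, and the factor `0.999` is an arbitrary safety margin that keeps the sampled points strictly inside the interval $|x - 4| < \delta$.

```python
import random

def f(x):
    return 5 * x - 3

def delta_for(eps):
    """The delta proposed in the proof: eps / 5."""
    return eps / 5

def check(eps, a=4.0, trials=10_000, seed=0):
    """Sample x with |x - a| < delta and confirm |f(x) - f(a)| < eps."""
    rng = random.Random(seed)
    delta = delta_for(eps)
    for _ in range(trials):
        # The factor 0.999 keeps x strictly inside (a - delta, a + delta).
        x = a + delta * rng.uniform(-1, 1) * 0.999
        if not abs(f(x) - f(a)) < eps:
            return False
    return True

if __name__ == "__main__":
    for eps in (1.0, 0.1, 1e-3, 1e-6):
        print(eps, delta_for(eps), check(eps))
```

Sampling can only fail to find a counterexample; it is the algebra in the proof that shows none exists.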

For a more challenging example, consider proving that the function $f(x) =
2x^3 - 4x + 1$ is continuous for all real numbers. This proof not only tackles a
more complicated function than the one in the previous example, it is supposed to
demonstrate the continuity of the function at the general real number $a$ rather than
at a specific value such as $a = 4$. This requires the proof to select an arbitrary $a$ and
prove the continuity of $f$ at the point $a$. By showing that the function is continuous
at any arbitrarily chosen $a$, it shows that the function is continuous at every point $a$.
Again, the proof will select an arbitrary $\epsilon > 0$ and needs to produce a $\delta > 0$ such
that $|f(x) - f(a)| < \epsilon$ for all $x$ satisfying $|x - a| < \delta$. The proof needs to select an
arbitrary $a$ and an arbitrary $\epsilon > 0$. Does it matter which it does first? In this case
where the choice of $a$ does not depend on which $\epsilon$ is chosen, and the choice of $\epsilon$
does not depend on which $a$ is chosen, the order is not critical. It makes sense to
select the $a$ first because you are then challenged to prove that $f$ is continuous at
$a$ for which you should choose an $\epsilon > 0$. But since both quantifiers are universal
quantifiers (for all $a \in \mathbb{R}$ and for all $\epsilon > 0$), the order does not matter. If it had been
a universal quantifier and an existential quantifier such as for all $\epsilon > 0$ there exists
a $\delta > 0$, then the order would matter a great deal.
Working backwards from $\epsilon > |f(x) - f(a)|$ you can see that you need
$\epsilon > |(2x^3 - 4x + 1) - (2a^3 - 4a + 1)| = |2(x^3 - a^3) - 4(x - a)| =
|2(x - a)(x^2 + xa + a^2) - 4(x - a)| = |x - a| \cdot |2(x^2 + xa + a^2) - 4|$.
You should not be surprised and, in fact, be quite pleased
to see that this last expression contains a factor of $|x - a|$ because this will facilitate
making the expression small when $|x - a|$ is made small. One only needs to control
the size of the other factor $|2(x^2 + xa + a^2) - 4|$. Of course, if $x$ is allowed to wander
too far from $a$, this other factor could get arbitrarily large, so care must be taken to
restrict how far $x$ gets from $a$. This can be done by requiring that $\delta$ not be larger than
some conveniently selected value such as 1. That means that $|x - a| < \delta \le 1$ would
imply, for example, that $|x| < |a| + 1$. Given this, there are many ways to find an
upper bound for the quantity $|2(x^2 + xa + a^2) - 4|$ where the upper bound does not
depend on $x$. For example, $|2(x^2 + xa + a^2) - 4| \le 2x^2 + 2|x||a| + 2a^2 + 4 \le
2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4$. One can afford to be sloppy here and get a
simpler looking upper bound by saying $2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4 \le
2(|a| + 1)^2 + 2(|a| + 1)(|a| + 1) + 2(|a| + 1)^2 + 4(|a| + 1)^2 = 10(|a| + 1)^2$. All you
need is an upper bound that depends only on $a$. This leads to the following proof.
PROOF: The function $f(x) = 2x^3 - 4x + 1$ is continuous on the real
numbers.
Let $f(x) = 2x^3 - 4x + 1$, and let $a \in \mathbb{R}$.
Given $\epsilon > 0$,
let $\delta = \min\left(1, \frac{\epsilon}{10(|a|+1)^2}\right)$ which is greater than 0 since $1$, $\epsilon$, and $10(|a|+1)^2$
are all positive.
Select $x$ such that $|x - a| < \delta$. Then $\delta \le 1$ implies $|x| < |a| + 1$.
Also, $\delta \le \frac{\epsilon}{10(|a|+1)^2}$ implies that
$|f(x) - f(a)| = |(2x^3 - 4x + 1) - (2a^3 - 4a + 1)| = |2(x^3 - a^3) - 4(x - a)| =
|2(x - a)(x^2 + xa + a^2) - 4(x - a)| = |x - a| \cdot |2(x^2 + xa + a^2) - 4| \le
|x - a| \cdot \left[2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4\right] \le
|x - a| \cdot \left[2(|a| + 1)^2 + 2(|a| + 1)(|a| + 1) + 2(|a| + 1)^2 + 4(|a| + 1)^2\right] =
|x - a| \cdot 10(|a| + 1)^2 < \frac{\epsilon}{10(|a|+1)^2} \cdot 10(|a| + 1)^2 = \epsilon$.
Therefore, the function $f$ is continuous at every real number $a$.
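The formula $\delta = \min\left(1, \frac{\epsilon}{10(|a|+1)^2}\right)$ depends on the point $a$, and a small numerical experiment makes that dependence visible. The Python sketch below is only an illustration; `delta_for` and `check` are hypothetical helper names, and the sampled points are a spot check rather than a proof.

```python
import random

def f(x):
    return 2 * x**3 - 4 * x + 1

def delta_for(a, eps):
    """The delta from the proof: min(1, eps / (10 (|a| + 1)^2))."""
    return min(1.0, eps / (10 * (abs(a) + 1) ** 2))

def check(a, eps, trials=10_000, seed=0):
    """Sample x with |x - a| < delta and confirm |f(x) - f(a)| < eps."""
    rng = random.Random(seed)
    delta = delta_for(a, eps)
    for _ in range(trials):
        # The factor 0.999 keeps x strictly inside (a - delta, a + delta).
        x = a + delta * rng.uniform(-1, 1) * 0.999
        if not abs(f(x) - f(a)) < eps:
            return False
    return True

if __name__ == "__main__":
    eps = 0.1
    for a in (0, 1, 10, 100):
        print(a, delta_for(a, eps), check(a, eps))
```

The printed values of $\delta$ shrink as $|a|$ grows, which reflects the cubic's increasingly rapid growth away from the origin.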
Not all functions can be expressed with nice formulas. Take, for example, the
function $f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$ which behaves differently on the rational
numbers than it does on the irrational numbers. Such functions that are defined
one way on the rational numbers and another way on the irrational numbers make
interesting examples because both the rational and the irrational numbers are dense
in the real numbers; that is, in every nonempty open interval $(a, b)$, you can find
both rational and irrational numbers. For the given function, in every nonempty
open interval $(a, b)$ there are values of $x$ where $f(x) = 2x$ and other values of $x$
where $f(x) = x + 1$. Indeed, for every real number $a \ne 1$, $\lim_{x\to a} f(x)$ does not exist. Only
at $x = 1$, where $2x$ and $x + 1$ coincide, does this limit exist, and, in fact, at that point
$f(x)$ is continuous (Fig. 4.2).

Fig. 4.2 A function equal to $2x$ for rational $x$ (blue) and $x + 1$ for irrational $x$ (red).
The blue and red lines are not solid. The lines cross at the labeled point $(1, 2)$.
A proof that $f$ is continuous at $x = 1$ would be similar to the two preceding
proofs, but you need to be careful to handle $f(x)$ differently depending on whether
$x$ is rational or irrational. As in other continuity proofs, given an $\epsilon > 0$ you are
faced with producing a value for $\delta > 0$ which will ensure that $|f(x) - f(1)| < \epsilon$
whenever $|x - 1| < \delta$. If the function in the proof were equal to $x + 1$ for every
value of $x$, then the value $\delta = \epsilon$ would work because $|x - 1| < \epsilon$ shows that
$|f(x) - f(1)| = |(x + 1) - (1 + 1)| = |x - 1| < \epsilon$. If the function in the proof
were equal to $2x$ for every value of $x$, then the value $\delta = \frac{\epsilon}{2}$ would work because
$|x - 1| < \frac{\epsilon}{2}$ shows that $|f(x) - f(1)| = |(2x) - (2 \cdot 1)| = 2|x - 1| < \epsilon$. In this proof,
then, you can choose $\delta = \min\left(\epsilon, \frac{\epsilon}{2}\right) = \frac{\epsilon}{2}$. After selecting an $x$ with $|x - 1| < \delta$,
you merely consider two separate cases, one where $x$ is rational, and one where $x$ is
irrational. These ideas allow you to produce the following proof.

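PROOF: The function $f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$ is continuous at
$x = 1$.
Let $f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$.
Given $\epsilon > 0$,
let $\delta = \frac{\epsilon}{2}$ which is greater than 0 since $\epsilon > 0$.
Select $x$ such that $|x - 1| < \delta = \frac{\epsilon}{2}$.
If $x$ is a rational number, then $|f(x) - f(1)| = |2x - 2| = 2|x - 1| < 2\delta = \epsilon$.
If $x$ is an irrational number, then $|f(x) - f(1)| = |(x + 1) - 2| = |x - 1| <
\delta < \epsilon$.
In either case, $|f(x) - f(1)| < \epsilon$.
Therefore, the function $f$ is continuous at 1.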


4.2.1 Exercises
Write proofs of each of the following statements.
1. $f(x) = 4x + 7$ is continuous at $x = 2$.
2. $f(x) = 5x^2 + 3x - 2$ is continuous at $x = 8$.
3. $f(x) = 10x^3 + 25$ is continuous for all real numbers $x$.
4. $f(x) = |x|$ is continuous at $x = 0$.
5. $f(x) = \sqrt{|x^2 - 9|}$ is continuous for all real numbers $x$.
6. $f(x) = \sqrt{x}$ is continuous for all $x \ge 0$.
7. $f(x) = |x^2 - 4|$ is continuous for all real numbers $x$.

4.3 Uniform Continuity


Continuity of a function is a local property, that is, whether or not a function $f$ is
continuous at a point $x = a$ depends only on how $f$ behaves close to $a$. In fact, $f$
can be continuous at $a$ and yet have very erratic behavior at points just $\frac{1}{10}$ unit from
$a$ or $\frac{1}{100}$ or even $\frac{1}{1{,}000{,}000}$ from $a$. The last example in the previous section shows a
function continuous at $x = 1$ which is continuous for no other value of $x$. Even if
$f$ is continuous at all points of a set $A$, it could be that proofs of the continuity of $f$
at two points $x = a$ and $x = b$ might need to be quite different. Certainly, there is
no reason to believe that, given an $\epsilon > 0$, a value of $\delta > 0$ that works in a proof of
the continuity of $f$ at the point $a$ would also work in a proof of the continuity of $f$ at
point $b$.
Consider, for example, the function $f(x) = \frac{1}{x}$ which is continuous for all $x \ne 0$.
To prove that $f$ is continuous at $x = 2$, given $\epsilon > 0$ one can use $\delta = \min(1, \epsilon)$ or
even be as generous as to let $\delta = \min(1, 2\epsilon)$. But to prove that $f$ is continuous at
$x = \frac{1}{2}$ where the function $f$ changes much more rapidly, for the same $\epsilon > 0$, one
might need to use $\delta = \min\left(\frac{1}{4}, \frac{\epsilon}{8}\right)$. You can easily see from the graph of $f(x) = \frac{1}{x}$
that as $a$ gets closer to 0, the $\delta > 0$ chosen for a particular $\epsilon > 0$ will need to get
smaller (Fig. 4.3).
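How quickly the usable $\delta$ shrinks can be made precise for this particular function. The following calculation is worked out here and is not from the text: for $a > 0$ and $0 < \delta < a$, the supremum of $\left|\frac{1}{x} - \frac{1}{a}\right|$ over $|x - a| < \delta$ is approached at $x = a - \delta$ and equals $\frac{\delta}{a(a - \delta)}$, and requiring this to be at most $\epsilon$ gives $\delta \le \frac{\epsilon a^2}{1 + \epsilon a}$. The Python sketch below tabulates that bound; `largest_delta` is a hypothetical helper name.

```python
def largest_delta(a, eps):
    """Largest delta that works for f(x) = 1/x at a > 0 and a given eps > 0.

    For 0 < delta < a the supremum of |1/x - 1/a| over |x - a| < delta is
    approached at x = a - delta and equals delta / (a * (a - delta)).
    Requiring this to be at most eps gives delta <= eps * a**2 / (1 + eps * a).
    """
    return eps * a * a / (1 + eps * a)

if __name__ == "__main__":
    eps = 0.1
    for a in (2, 1, 0.5, 0.1, 0.01):
        print(f"a = {a:5}:  largest delta = {largest_delta(a, eps):.6g}")
```

The bound behaves like $\epsilon a^2$ for small $a$, so no single positive $\delta$ can serve every $a > 0$ at once. The choices $\min(1, 2\epsilon)$ at $a = 2$ and $\min\left(\frac{1}{4}, \frac{\epsilon}{8}\right)$ at $a = \frac{1}{2}$ mentioned above both fall below this bound.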
Fig. 4.3 $f(x) = \frac{1}{x}$ is not uniformly continuous

Suppose that you wanted to prove that a particular function $f$ was continuous
at every $a$ in the domain of $f$. Such a proof was discussed in the previous section
using $f(x) = 2x^3 - 4x + 1$. In that proof, the formula for the $\delta > 0$ chosen for a
given $\epsilon > 0$ depended on the point $a$ where $f$ was being shown to be continuous.
Clearly, this would have to be the case because $f$ is a cubic function of $x$ which
grows increasingly more rapidly as $x$ gets large. But it is not true that every function
behaves this way. Some functions change at a constant rate like $f(x) = 6x - 13$ or
change at a rate that does not continue to grow such as $f(x) = \frac{1}{x^2+1}$. When writing
a proof of the continuity of such functions, it is possible to pick a single value for
$\delta > 0$ that depends on $\epsilon > 0$ (as it certainly would have to unless $f$ were constant
on each interval in its domain), but where the choice of $\delta > 0$ does not depend on
the point $a$ where the continuity needs to be shown. These functions are special and
satisfy the following definition. A function $f$ is uniformly continuous on the set
$A$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ for every $x$ and $y$
in $A$ satisfying $|x - y| < \delta$. You should compare this definition to the definition of
continuity at a point. The difference centers on when the value of $\delta > 0$ needs to
be determined. For continuity at a single point, given $\epsilon > 0$, one must specify the
value of $\delta > 0$ after being given the value of $a$ but before being given a value for $x$.
Thus, the value of $\delta > 0$ can depend on the value of $a$ even though it cannot depend
on the value of $x$. On the other hand, for uniform continuity, given $\epsilon > 0$, one must
specify the value of $\delta > 0$ before learning the values of either $x$ or $y$, and, therefore,
its value cannot depend on either $x$ or $y$.
The definition of uniform continuity suggests a template for how to prove that a
given function $f$ is uniformly continuous on a set $A$. As in the proof for continuity
at a point, you would say that a value for $\epsilon > 0$ has been given. Then you would
present a value for $\delta > 0$. Once these two values have been specified, you would
need to show that any $x$ and $y$ in $A$ that satisfy $|x - y| < \delta$ also satisfy $|f(x) - f(y)| < \epsilon$.
This suggests the following.

TEMPLATE for proving the function $f$ is uniformly continuous on the
set $A$
SET THE CONTEXT: Make statements about what is known about the
function $f$.
SELECT AN ARBITRARY $\epsilon$: Given $\epsilon > 0$,
PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an
appropriate value for $\delta$.
SELECT ARBITRARY $x$ and $y$ in $A$ with $|x - y| < \delta$: Let $x$ and $y$ be in $A$
such that $|x - y| < \delta$.
LIST IMPLICATIONS: Derive the result $|f(x) - f(y)| < \epsilon$.
STATE THE CONCLUSION: Therefore, $f$ is uniformly continuous on the
set $A$.
Proving the function $f(x) = 6x - 13$ is uniformly continuous on the entire real
line is straightforward since the function $f$ changes at a constant rate. This allows
you to select a value for $\delta > 0$ based on that rate of change, 6.

PROOF: The function $f(x) = 6x - 13$ is uniformly continuous on the real
numbers.

Let $f(x) = 6x - 13$.
Given $\epsilon > 0$,
let $\delta = \frac{\epsilon}{6}$ which is greater than 0 since $\epsilon > 0$.
Let $x$ and $y$ be real numbers such that $|x - y| < \delta = \frac{\epsilon}{6}$.
Then $|f(x) - f(y)| = |(6x - 13) - (6y - 13)| = 6|x - y| < 6\delta = \epsilon$.
Therefore, the function $f$ is uniformly continuous on the real numbers.

Less clear is how to choose a value for $\delta > 0$ when proving $f(x) = \frac{1}{x^2+1}$
is uniformly continuous on the real numbers. To do this, you need to find a
way to show $|f(x) - f(y)| < \epsilon$. You would try to find an upper bound for
$|f(x) - f(y)| = \left|\frac{1}{x^2+1} - \frac{1}{y^2+1}\right| = \frac{|(y^2+1) - (x^2+1)|}{(x^2+1)(y^2+1)} = \frac{|x+y|}{(x^2+1)(y^2+1)} |x - y|$. This expression
is complicated, so it is convenient to find ways to simplify it. The nice thing about
working with inequalities rather than equalities is that you are not prevented from
making changes that increase the value of your expression. That is, if you can
simplify an expression by substituting an expression that is a little larger, that might
not be a problem. The numerator in the previous expression is $|x + y|$ which does
not simplify algebraically, but it does suggest a possible application of the triangle
inequality, $|x + y| \le |x| + |y|$. Changing $|x + y|$ to $|x| + |y|$ allows the fraction to
be broken into two simpler fractions. It allows you to continue with
$|f(x) - f(y)| = \frac{|x+y|}{(x^2+1)(y^2+1)} |x - y| \le \left(\frac{|x|}{(x^2+1)(y^2+1)} + \frac{|y|}{(x^2+1)(y^2+1)}\right) |x - y| \le \left(\frac{|x|}{x^2+1} + \frac{|y|}{y^2+1}\right) |x - y|$.
When $|x| < 1$, you can conclude that $|x| < 1 \le x^2 + 1$. When $|x| \ge 1$, you can
conclude that $|x| \le x^2 < x^2 + 1$. In either case $\frac{|x|}{x^2+1} \le \frac{x^2+1}{x^2+1} = 1$. This lets you
state that $|f(x) - f(y)| = \frac{|x+y|}{(x^2+1)(y^2+1)} |x - y| \le \left(\frac{|x|}{(x^2+1)(y^2+1)} + \frac{|y|}{(x^2+1)(y^2+1)}\right) |x - y| \le
2|x - y|$. This suggests that $\delta = \frac{\epsilon}{2}$ will work in the proof.

PROOF: The function $f(x) = \frac{1}{x^2+1}$ is uniformly continuous on the real
numbers.

Let $f(x) = \frac{1}{x^2+1}$.
Given $\epsilon > 0$,
let $\delta = \frac{\epsilon}{2}$ which is greater than 0 since $\epsilon > 0$.
Let $x$ and $y$ be real numbers such that $|x - y| < \delta = \frac{\epsilon}{2}$.
Then $|f(x) - f(y)| = \left|\frac{1}{x^2+1} - \frac{1}{y^2+1}\right| = \frac{|(y^2+1) - (x^2+1)|}{(x^2+1)(y^2+1)} =
\frac{|x+y|}{(x^2+1)(y^2+1)} |x - y| \le \left(\frac{|x|}{(x^2+1)(y^2+1)} + \frac{|y|}{(x^2+1)(y^2+1)}\right) |x - y| \le
\left(\frac{|x|}{x^2+1} + \frac{|y|}{y^2+1}\right) |x - y|$.
Note that if $|x| < 1$, then $|x| < x^2 + 1$, and if $|x| \ge 1$, then $|x| \le x^2 < x^2 + 1$.
In either case, $|x| < x^2 + 1$, so $\frac{|x|}{x^2+1} < 1$, and similarly, $\frac{|y|}{y^2+1} < 1$.
It follows that $|f(x) - f(y)| \le \left(\frac{|x|}{x^2+1} + \frac{|y|}{y^2+1}\right) |x - y| \le 2|x - y| < 2\delta = \epsilon$.
Therefore, the function $f$ is uniformly continuous on the real numbers.
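The defining feature of this proof is that $\delta = \epsilon/2$ is fixed before $x$ and $y$ are chosen. The Python sketch below spot-checks that single $\delta$ against pairs scattered over a wide range; it is an illustration only, `check_uniform` is a hypothetical helper name, and the sampling range $[-1000, 1000]$ is an arbitrary choice made here.

```python
import random

def f(x):
    return 1 / (x * x + 1)

def check_uniform(eps, trials=20_000, seed=0):
    """Sample pairs with |x - y| < eps/2 and confirm |f(x) - f(y)| < eps.

    The same delta = eps / 2 is used no matter where x lies.
    """
    rng = random.Random(seed)
    delta = eps / 2
    for _ in range(trials):
        x = rng.uniform(-1000, 1000)
        # The factor 0.999 keeps y strictly within delta of x.
        y = x + delta * rng.uniform(-1, 1) * 0.999
        if not abs(f(x) - f(y)) < eps:
            return False
    return True

if __name__ == "__main__":
    for eps in (1, 0.1, 1e-4):
        print(eps, check_uniform(eps))
```

Contrast this with the continuity checks for the cubic, where the helper computing $\delta$ had to take the point $a$ as an argument.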
One of the most memorable theorems from Calculus is the Mean Value
Theorem which states that if the function $f$ is continuous on the interval $[a, b]$
and differentiable on the interval $(a, b)$, then there is a $c \in (a, b)$ such that
$f'(c) = \frac{f(b) - f(a)}{b - a}$. If the function $f$ has a bounded derivative on the interval
$[a, b]$, that is, if there is a positive real number $M$ such that $|f'(x)| \le M$ for all
values of $x \in [a, b]$, then one can easily see that $f$ is uniformly continuous on that
interval. Indeed, if $x$ and $y$ are in $[a, b]$, then there is a $c$ between $x$ and $y$ such that
$|f(x) - f(y)| = |f'(c)| \cdot |x - y| \le M \cdot |x - y|$. This implies that given $\epsilon > 0$, the value
$\delta = \frac{\epsilon}{M} > 0$ can be used in a proof that $f$ is uniformly continuous on $[a, b]$ for then
$|x - y| < \delta$ implies $|f(x) - f(y)| = |f'(c)| \cdot |x - y| \le M \cdot |x - y| < M\delta = \epsilon$. This
is summarized by saying that a function with a bounded derivative on an interval is
uniformly continuous there.
Whenever you learn of the truth of a conditional statement such as the one at the
end of the previous paragraph (bounded derivative implies uniform continuity), it is
natural to ask whether the converse of the statement is also true (uniform continuity
implies bounded derivative). The answer to this particular question is no, not all
functions uniformly continuous on an interval have bounded derivatives there. In
particular, the function $f(x) = |x|$ is an example of a function uniformly continuous
on the entire real line, yet it fails to be differentiable at $x = 0$. The function $f(x) =
\sqrt{x}$ is uniformly continuous for $x \ge 0$, but its derivative is unbounded near $x = 0$.
A more complex example is the function defined by $f(x) = x^2 \sin\left(\frac{1}{x^2}\right)$ when $x \ne
0$ and $f(0) = 0$. This function is uniformly continuous on the interval $[-10, 10]$
even though its derivative, which exists on the entire real line, is not bounded as $x$
approaches 0.
Because the function $f(x) = \sqrt{x}$ has an increasingly large rate of change as $x$
approaches 0, proving that the function is uniformly continuous for $x \ge 0$ provides
an interesting challenge. The proof will need to conclude that $\epsilon > |f(x) - f(y)| =
|\sqrt{x} - \sqrt{y}| = \frac{|\sqrt{x} - \sqrt{y}|(\sqrt{x} + \sqrt{y})}{\sqrt{x} + \sqrt{y}} = \frac{|x - y|}{\sqrt{x} + \sqrt{y}}$. As expected, there is a factor of $|x - y|$ in
this expression, so that you can try to make the expression small by restricting the
size of $|x - y|$. This is easy if the denominator of the expression, $\sqrt{x} + \sqrt{y}$, does
not get too small. The problem is if $x$ and $y$ get close to 0, the denominator of the
expression will also get close to 0. At first this seems like a significant roadblock.
But this roadblock presents its own resolution for if $\sqrt{x} + \sqrt{y}$ is very small, it must
certainly be that $|\sqrt{x} - \sqrt{y}|$ is even smaller which is the conclusion that you want.
In other words, there are two cases: either $\sqrt{x} + \sqrt{y}$ is small which would imply that
$|f(x) - f(y)|$ is small, or $\sqrt{x} + \sqrt{y}$ is large which would imply that $|f(x) - f(y)| =
\frac{|x - y|}{\sqrt{x} + \sqrt{y}}$ is small. You only need to decide what to use as the dividing line between
large and small. A natural choice would be $\epsilon$ itself because $\sqrt{x} + \sqrt{y} < \epsilon$
implies $|\sqrt{x} - \sqrt{y}| < \epsilon$. If $\sqrt{x} + \sqrt{y} \ge \epsilon$, then $|f(x) - f(y)| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \le \frac{|x - y|}{\epsilon}$
which suggests letting $\delta = \epsilon^2$ so that $|x - y| < \delta$ gives $|f(x) - f(y)| < \frac{\epsilon^2}{\epsilon} = \epsilon$. The
complete proof follows.
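PROOF: The function $f(x) = \sqrt{x}$ is uniformly continuous on the interval $x \ge 0$.
Let $f(x) = \sqrt{x}$.
Given $\epsilon > 0$, let $\delta = \epsilon^2$ which is greater than 0 since $\epsilon \ne 0$.
Let $x$ and $y$ be nonnegative real numbers such that $|x - y| < \delta$.
In the case that $\sqrt{x} + \sqrt{y} < \epsilon$, it follows that
$|f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| \le \sqrt{x} + \sqrt{y} < \epsilon$.
In the case that $\sqrt{x} + \sqrt{y} \ge \epsilon$, it follows that
$|f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| = \frac{|\sqrt{x} - \sqrt{y}| (\sqrt{x} + \sqrt{y})}{\sqrt{x} + \sqrt{y}} = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \le \frac{|x - y|}{\epsilon} < \frac{\epsilon^2}{\epsilon} = \epsilon$.
In either case, $|x - y| < \delta$ implies that $|f(x) - f(y)| < \epsilon$, so the function $f$ is
uniformly continuous on the interval $x \ge 0$.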
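
The choice $\delta = \epsilon^2$ in the proof above can be spot-checked numerically. The following is only a sketch, not part of the proof: the helper names `sqrt_delta` and `worst_gap` are made up for this illustration, and random sampling can support the claim but cannot establish it.

```python
import math
import random

def sqrt_delta(eps):
    """The delta chosen in the proof: delta = eps**2."""
    return eps ** 2

def worst_gap(eps, trials=20_000, seed=1):
    """Largest |sqrt(x) - sqrt(y)| seen over sampled pairs with |x - y| < delta."""
    rng = random.Random(seed)
    delta = sqrt_delta(eps)
    worst = 0.0
    for _ in range(trials):
        # Mix tiny and large x so that both cases of the proof are exercised.
        x = rng.choice([rng.uniform(0, 2 * delta), rng.uniform(0, 100)])
        # Keep y nonnegative and strictly within delta of x.
        y = max(0.0, x + 0.999 * rng.uniform(-delta, delta))
        worst = max(worst, abs(math.sqrt(x) - math.sqrt(y)))
    return worst

eps = 0.05
gap = worst_gap(eps)
print(gap < eps)
```

Samples drawn near 0 correspond to the case where $\sqrt{x} + \sqrt{y}$ is small, and samples far from 0 correspond to the case where it is large.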
There is an important lesson to be learned from this example. When planning how
to write a proof, you can pursue one line of thinking which may solve the problem
in most but not all cases. Sometimes the special cases where the argument does not
work are enough to cause you to abandon your original line of reasoning altogether.
But often you can just break your argument into two or more cases and find other
techniques to handle the special cases where the original argument does not work.

4.3.1 Exercises
Write proofs of each of the following statements.
1. $f(x) = 3x + 11$ is uniformly continuous on the set of real numbers.
2. $f(x) = 14x + 5$ is uniformly continuous on the set of real numbers.
3. $f(x) = |x|$ is uniformly continuous on the set of real numbers.
4. $f(x) = 8x^2$ is uniformly continuous on the interval $[-6, 6]$.
5. $f(x) = \frac{4}{5x + 1}$ is uniformly continuous for $x \ge 0$.
6. $f(x) = \sqrt[3]{x}$ is uniformly continuous on the set of real numbers.
7. $f(x) = x^2$ is not uniformly continuous on the set of real numbers.
8. $f(x) = \frac{1}{x^2}$ is not uniformly continuous on the set $(0, 1)$.

4.4 Compactness and the Heine-Borel Theorem


4.4.1 Open Covers and Subcovers
Let $a$ and $b$ be real numbers with $a < b$. It turns out that if a function $f$ is continuous
on the closed interval $[a, b]$, then $f$ is uniformly continuous on that interval. How
might you prove this result? As a first try, you might say that for each $\epsilon > 0$ and
for each $y \in [a, b]$ there is a $\delta > 0$ such that if $x \in [a, b]$ with $|x - y| < \delta$, then
$|f(x) - f(y)| < \epsilon$. Then, having produced a value for $\delta$ for each $y \in [a, b]$, you might
want to pick the smallest of all of those $\delta$s and hope that this minimum would be
sufficiently small to work for every $y \in [a, b]$. Unfortunately, you started out with
an infinite collection of $\delta$s, each greater than 0. Such an infinite set might not have
a minimum value. The set of such $\delta$s is certainly nonempty and bounded below, so
the collection does have a greatest lower bound, but that greatest lower bound could
be 0, too small to use for the $\delta$ in the proof. A finite set of positive numbers always
has a minimum value that is positive, but an infinite set of positive numbers might
have a greatest lower bound of 0.
Suppose that $T$ is a collection of open intervals, and $A \subseteq \mathbb{R}$. If the set $A$ is
contained in the union of the open intervals in $T$, that is, if $A \subseteq \bigcup_{(s,t) \in T} (s, t)$, then
$T$ is called an open cover of $A$. A subset $T' \subseteq T$ which is also an open cover of
$A$ is called a subcover of $A$. In the above suggested proof that the continuity of $f$
on $[a, b]$ implies the uniform continuity of $f$ on $[a, b]$, the definition of continuity
at each point $y \in [a, b]$ produced a collection of open intervals which form an
open cover $T$ of $[a, b]$. If that open cover had a finite subcover $T'$, then you would be
dealing with only a finite number of $\delta > 0$ values, and you could expect to produce
a smallest such $\delta > 0$. Whether such a finite subcover exists has nothing to do with
the continuous function $f$ that motivated this discussion. A closed bounded interval
$[a, b]$ in the real numbers is compact which means that every open cover of
$[a, b]$ contains a finite subcover. The fact that every closed bounded interval in the
real numbers is compact is known as the Heine-Borel Theorem, and it is central
to proving the above result about continuous functions on closed bounded intervals
being uniformly continuous there. In fact, the Heine-Borel Theorem is an important
tool for proving many results in analysis.
Suppose that for every rational number in $[0, 1]$ you represent the rational
number in lowest terms as $\frac{p}{q}$. Then for each of these rational numbers you
associate the open interval $\left(\frac{4p - 1}{4q}, \frac{4p + 1}{4q}\right)$. For example, the number $\frac{2}{7}$ would be
associated with the open interval $\left(\frac{7}{28}, \frac{9}{28}\right)$. Since the set of rational numbers in
$[0, 1]$ is infinite, this collection of open intervals is also infinite. The collection
forms an open cover of $[0, 1]$. One possible finite subcover is the collection of
intervals associated with rational numbers $\frac{0}{1}, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4},$ and $\frac{1}{1}$ giving the intervals
$\left(-\frac{1}{4}, \frac{1}{4}\right), \left(\frac{3}{16}, \frac{5}{16}\right), \left(\frac{3}{12}, \frac{5}{12}\right), \left(\frac{3}{8}, \frac{5}{8}\right), \left(\frac{7}{12}, \frac{9}{12}\right), \left(\frac{11}{16}, \frac{13}{16}\right),$ and $\left(\frac{3}{4}, \frac{5}{4}\right)$.
You should verify that these intervals are in the original open cover and do produce the
claimed finite subcover. On the other hand, if you associate with each natural number $n > 1$
the open interval $\left(\frac{1}{n}, 1\right)$, you get an open cover of the set $(0, 1)$, yet no finite subset
of this collection of intervals can cover the entire interval $(0, 1)$. Indeed, any finite
collection will only cover the interval $\left(\frac{1}{m}, 1\right)$ for some natural number $m > 1$. Since
these intervals form an open cover of $(0, 1)$ which does not have a finite subcover,
the set $(0, 1)$ is not a compact set.
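
The verification requested above can also be carried out mechanically with exact rational arithmetic. The sketch below is one way to do it; the function names `interval_for` and `covers` are invented for this illustration. The coverage test sweeps from the left endpoint, repeatedly jumping to the largest right endpoint among the open intervals that contain the current point.

```python
from fractions import Fraction as F

def interval_for(r):
    """Open interval ((4p-1)/(4q), (4p+1)/(4q)) attached to r = p/q in lowest terms."""
    p, q = r.numerator, r.denominator  # Fraction always stores lowest terms
    return (F(4 * p - 1, 4 * q), F(4 * p + 1, 4 * q))

def covers(intervals, a, b):
    """True if the finitely many open intervals together contain all of [a, b]."""
    point = a  # every point of [a, point) is already known to be covered
    while point <= b:
        ends = [t for (s, t) in intervals if s < point < t]
        if not ends:
            return False  # point lies in none of the open intervals
        point = max(ends)  # strictly larger than before, so the loop terminates
    return True

rationals = [F(0, 1), F(1, 4), F(1, 3), F(1, 2), F(2, 3), F(3, 4), F(1, 1)]
subcover = [interval_for(r) for r in rationals]
print(covers(subcover, F(0), F(1)))
```

Dropping the last interval, the one attached to $\frac{1}{1}$, leaves the point $\frac{13}{16}$ uncovered, so the same function reports that the remaining six intervals do not cover $[0, 1]$.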

4.4.2 Proofs of the Heine-Borel Theorem


Presented next are two quite different proofs of the Heine-Borel Theorem. The
techniques used in both proofs are instructive, and it is interesting to see how a
single result can be proved using two completely different strategies. Given in each
case are real numbers $a < b$ and a set of open intervals $T$ that forms an open cover
of the closed bounded interval $[a, b]$. Both proofs seek to show that there must be
a finite subset of $T$ that covers $[a, b]$. The strategy in the first proof suggests that,
whether or not you can cover $[a, b]$ with a finite number of open intervals, you can
certainly cover some of the interval starting at $a$ and working at least part of the way
toward $b$. The proof proposes looking at the set
$S = \{x \in [a, b] \mid T \text{ has a finite subcover that covers the interval } [a, x]\}.$
The proof first shows that $S$ is not empty because it contains the point $a$. The set $S$
is bounded above by $b$, so $S$ has a least upper bound, $r$. This is not to say that $r \in S$,
but if $r$ is not in $S$, there must be values in $S$ that are arbitrarily close to $r$. Certainly
$r$ is in $[a, b]$, so there is an open interval from $T$ that covers $r$. Since there are values
of $S$ arbitrarily close to $r$, there are some inside this open interval containing $r$. This
open interval then extends the finite subcover to values greater than $r$. One can only
conclude that $r$ must be $b$, and, in fact, $b \in S$. Thus, $[a, b]$ has a finite subcover, and
the proof is complete (Fig. 4.4).


PROOF (Heine-Borel Theorem): Let $a < b$ be two real numbers, and let
$T$ be an open cover of $[a, b]$. Then $T$ contains a finite subcover of $[a, b]$.
Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a, b]$.
Define set $S = \{x \in [a, b] \mid T \text{ has a finite subcover that covers the interval } [a, x]\}$.
The set $T$ is an open cover of $[a, b]$, and $a \in [a, b]$, so $T$ must contain at
least one open interval, $(p, q)$ which contains the point $a$, that is, $p < a < q$.
Since the interval $[a, a]$ is covered by $(p, q) \in T$, the point $a \in S$, and $S$ is
not an empty set.
The set $S$ is bounded above by $b$.
Since $S$ is nonempty and bounded above, it has a least upper bound $r$.
Since $r$ must be at least $a$ and cannot be greater than $b$, $r \in [a, b]$, so there
is an interval $(p, q)$ in $T$ which contains the point $r$, that is, $p < r < q$.
Since $p < r$ and $r$ is the least upper bound of $S$, $p$ is not an upper bound of
$S$. Thus, there is a point $y \in S$ with $p < y$. This means that there is a finite
set of intervals in $T$ that covers $[a, y]$.
Let $z = \min\left(\frac{r + q}{2}, b\right)$. Since $z \ge r$ and $z \in (p, q)$, adding the interval $(p, q)$
to the finite set of intervals of $T$ that covers $[a, y]$ produces a finite set of
intervals in $T$ that covers $[a, z]$, and $z \in S$.
But $r$ is the least upper bound for $S$, implying that $z \le r$. Because
$z = \min\left(\frac{r + q}{2}, b\right)$ and $\frac{r + q}{2} > r$, it must be that $z = b$.
Because $z \in S$, it follows that $b \in S$ which completes the proof of the
theorem.
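
The key step of this first proof, using an interval that contains $r$ to push the covered region past $r$, can be imitated in code when $T$ happens to be a finite list. That restriction is an assumption of the sketch, since a program cannot enumerate an arbitrary infinite cover, and the function name `greedy_subcover` is made up for this illustration. It shows the sweep from $a$ toward $b$ and is not a substitute for the least upper bound argument.

```python
def greedy_subcover(T, a, b):
    """Pick intervals from the finite list T that cover [a, b], sweeping left to right.

    Returns the chosen intervals in order, or None if some point of [a, b]
    lies in no interval of T.
    """
    chosen = []
    r = a  # every point of [a, r) is covered by the intervals chosen so far
    while r <= b:
        candidates = [(p, q) for (p, q) in T if p < r < q]
        if not candidates:
            return None  # r itself is not covered by T
        best = max(candidates, key=lambda interval: interval[1])
        chosen.append(best)
        r = best[1]  # coverage now extends up to, but not including, q
    return chosen

T = [(-1, 2), (0, 1), (1, 4), (1.5, 3), (3, 6), (3.5, 5.5), (5, 7)]
sub = greedy_subcover(T, 0, 6)
print(sub)
```

Each pass through the loop plays the role of the interval $(p, q)$ containing $r$ in the proof: the covered region grows from $[a, r)$ to $[a, q)$, and the sweep stops once it has passed $b$.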
The second proof of the Heine-Borel Theorem is a proof by contradiction. It
begins as the first proof by assuming that $a < b$ are real numbers, and that the
interval $[a, b]$ has an open cover $T$. Then it makes the additional assumption that no
finite collection of intervals in $T$ can cover $[a, b]$. This will lead to a contradiction.
This proof is not one that the beginning student is likely to invent on their own
unless they have seen the technique before.
First, the proof sets $a_0 = a$ and $b_0 = b$ so that the interval $[a_0, b_0] = [a, b]$. Let
$m_0 = \frac{a_0 + b_0}{2}$ be the midpoint of $[a_0, b_0]$. It must be the case that at least one of the
intervals $[a_0, m_0]$ or $[m_0, b_0]$ cannot be covered by a finite number of intervals in $T$
because, if both can be covered by a finite number of intervals, putting those two
collections together would give a finite collection of intervals that covered the entire
interval $[a_0, b_0] = [a, b]$ contradicting the assumption that this could not be done.

Fig. 4.4 Heine-Borel Theorem first proof

So, if it is the case that $[a_0, m_0]$ cannot be covered by a finite number of intervals
in $T$, let $a_1 = a_0$ and $b_1 = m_0$. Otherwise, if $[m_0, b_0]$ cannot be covered by a finite
number of intervals in $T$, let $a_1 = m_0$ and $b_1 = b_0$. In either case, the new interval
$[a_1, b_1] \subseteq [a, b]$ cannot be covered by a finite collection of intervals in $T$.
Now the proof continues iteratively. If for some $j > 0$, there is an interval $[a_j, b_j]$
contained in $[a, b]$ which cannot be covered by any finite collection of intervals in
$T$, let $m_j = \frac{a_j + b_j}{2}$ be the midpoint of the interval. Either $[a_j, m_j]$ or $[m_j, b_j]$ cannot be
covered by a finite collection of intervals from $T$, so if $[a_j, m_j]$ cannot be covered by a
finite collection of intervals, let $a_{j+1} = a_j$ and $b_{j+1} = m_j$. Otherwise, let $a_{j+1} = m_j$
and $b_{j+1} = b_j$. In either case $[a_{j+1}, b_{j+1}]$ cannot be covered by a finite collection
of intervals from $T$. Notice that this process constructs a sequence of intervals
$[a_0, b_0], [a_1, b_1], [a_2, b_2], \ldots$ contained in $[a, b]$, none of which can be covered by
a finite collection of intervals in $T$. Also note that $a = a_0 \le a_1 \le a_2 \le \ldots$
while $b = b_0 \ge b_1 \ge b_2 \ge \ldots$, and for each $j$, the length of the $j$th interval
is $b_j - a_j = \frac{b - a}{2^j}$. Since each $a_j$ term is less than all of the $b_k$ terms, both of the
monotone sequences are bounded and, therefore, converge. Moreover, since for each
$k$, $\lim_{j \to \infty} b_j - \lim_{j \to \infty} a_j \le b_k - a_k = \frac{b - a}{2^k}$, it follows that $\lim_{j \to \infty} a_j = \lim_{j \to \infty} b_j = r \in [a, b]$.
Note that since the sequence of $a_j$s increases to $r$, and the sequence of $b_j$s decreases
to $r$, the limit $r \in [a_j, b_j]$ for each $j$. Because the limit, $r$, is in $[a, b]$, there is an open
interval $(p, q) \in T$ such that $r \in (p, q)$. The distance the limit $r$ is from the boundary
of the interval $(p, q)$ is $\epsilon = \min(r - p, q - r) > 0$. Since $\lim_{j \to \infty} \frac{b - a}{2^j} = 0$, you can
select a $j$ so that $\frac{b - a}{2^j} < \epsilon$. Then it follows that $p \le r - \epsilon < a_j \le r \le b_j < r + \epsilon \le q$,
and, so, $[a_j, b_j] \subset (p, q)$. But this shows that $[a_j, b_j]$ is covered by the single open
interval $(p, q) \in T$ contradicting the fact that $[a_j, b_j]$ could not be covered by a finite
collection of intervals in $T$. Thus, you must conclude that the assumption that $[a, b]$
cannot be covered by a finite number of intervals is false. A formal proof follows
(Fig. 4.5).

Fig. 4.5 Heine-Borel Theorem second proof

PROOF (Heine-Borel Theorem): Let $a < b$ be two real numbers, and let
$T$ be an open cover of $[a, b]$. Then $T$ contains a finite subcover of $[a, b]$.
Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a, b]$.
Assume that $T$ contains no finite subcover of $[a, b]$.
Let $a_0 = a$ and $b_0 = b$ so that the interval $[a_0, b_0] = [a, b]$, and note that no
finite collection of intervals in $T$ will cover $[a_0, b_0]$.
Define sequences $\langle a_j \rangle$ and $\langle b_j \rangle$ inductively. For $j \ge 0$, let $[a_j, b_j] \subseteq [a, b]$
be an interval which cannot be covered by a finite collection of open
intervals in $T$, and where $b_j - a_j = \frac{b - a}{2^j}$.
Let $m_j = \frac{a_j + b_j}{2}$ be the midpoint of $[a_j, b_j]$.
It must be the case that at least one of the intervals $[a_j, m_j]$ or $[m_j, b_j]$
cannot be covered by a finite number of intervals in $T$ because, if both can
be covered by a finite number of intervals, putting those two collections
together would give a finite collection of intervals that covered the entire
interval $[a_j, b_j]$.
If $[a_j, m_j]$ cannot be covered by a finite collection of intervals, let $a_{j+1} = a_j$
and $b_{j+1} = m_j$. Otherwise, let $a_{j+1} = m_j$ and $b_{j+1} = b_j$. In either case
$[a_{j+1}, b_{j+1}]$ cannot be covered by a finite collection of intervals from $T$, and
$b_{j+1} - a_{j+1} = \frac{1}{2} \cdot \frac{b - a}{2^j} = \frac{b - a}{2^{j+1}}$.
Thus, there are monotone sequences $a = a_0 \le a_1 \le a_2 \le \ldots$ and
$b = b_0 \ge b_1 \ge b_2 \ge \ldots$, and for each $j$, the length of the $[a_j, b_j]$ interval is
$b_j - a_j = \frac{b - a}{2^j}$.
Since each $a_j$ term is less than all of the $b_k$ terms, both of the monotone
sequences are bounded and, therefore, converge. The fact that
$\lim_{j \to \infty} a_j \le \lim_{j \to \infty} b_j \le \lim_{j \to \infty} \left(a_j + \frac{b - a}{2^j}\right)$
shows that $\lim_{j \to \infty} a_j = \lim_{j \to \infty} b_j = r \in [a, b]$.
Because the limit, $r$, is in $[a, b]$, there is an open interval $(p, q) \in T$ such
that $r \in (p, q)$.
The distance the limit $r$ is from the boundary of the interval $(p, q)$ is
$\epsilon = \min(r - p, q - r) > 0$. Since $\lim_{j \to \infty} \frac{b - a}{2^j} = 0$, there is a $j$ such that $\frac{b - a}{2^j} < \epsilon$.
It follows that $p \le r - \epsilon < a_j \le r \le b_j < r + \epsilon \le q$, and, so, $[a_j, b_j] \subset (p, q)$.
But then $[a_j, b_j]$ is covered by the single open interval $(p, q) \in T$ contradicting
the fact that $[a_j, b_j]$ could not be covered by a finite collection of
intervals in $T$.
Thus, the assumption that $[a, b]$ cannot be covered by a finite number of
intervals is false, and the theorem is proved.
The fact that the interval $[a, b]$ in the Heine-Borel Theorem is both closed and
bounded is crucial. The interval $[1, \infty)$ is covered by the collection of open intervals
$(j, j + 2)$ for $j = 0, 1, 2, 3, \ldots$, but no finite collection of these open intervals
can cover $[1, \infty)$. The interval $(0, 5)$ is covered by the collection $\left(\frac{1}{j}, 5\right)$ for
$j = 1, 2, 3, 4, \ldots$, but, again, no finite collection of these open intervals can cover $(0, 5)$.


4.4.3 Uniform Continuity on Closed Bounded Intervals


With the Heine-Borel Theorem, it can now be shown that every continuous function
on a closed bounded interval is uniformly continuous on that interval. The idea is
simple enough: if $f$ is continuous on the closed bounded interval $[a, b]$, then, given
$\epsilon > 0$, at each point $x \in [a, b]$ there is a $\delta > 0$ such that for any $y \in (x - \delta, x + \delta)$, it
follows that $|f(x) - f(y)| < \epsilon$. Thus, there is an open interval around each $x \in [a, b]$
that has the desired property, and the Heine-Borel Theorem shows that $[a, b]$ can be
covered by just a finite number of these open intervals. Since each of these finitely
many open intervals is associated with a positive $\delta$, you can select the smallest to
serve as the $\delta > 0$ needed in your proof of uniform continuity.
There are, though, a couple of subtleties that get in the way of this simple
argument. First of all, for any $y$ in one of the open intervals $(x - \delta, x + \delta)$ you
can conclude that $|f(y) - f(x)| < \epsilon$, but the proof will require that $|f(y) - f(z)| < \epsilon$
for any $y$ and $z$ that are within the chosen $\delta$ of each other, not just for $z = x$, the
middle point of the interval. One can get around this problem by arranging that
$|f(y) - f(x)| < \frac{\epsilon}{2}$ for all $y \in (x - \delta, x + \delta)$. This is a common trick in analysis
proofs. The definition of continuity allows you to find a $\delta > 0$ that works for
any given $\epsilon > 0$, so why not for $\frac{\epsilon}{2}$ which is also greater than 0? Then for any
$y$ and $z$ in $(x - \delta, x + \delta)$, you can use the triangle inequality to conclude that
$|f(y) - f(z)| = |(f(y) - f(x)) - (f(z) - f(x))| \le |f(y) - f(x)| + |f(z) - f(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
There is a second problem with this strategy. If you select $y$ and $z$ within $\delta$ of
each other, how do you know that they both lie within the same interval $(x - \delta, x + \delta)$?
The interval $[a, b]$ is covered by a finite number of such intervals, but just because
the two numbers $y$ and $z$ are close to each other does not mean that they will both
fall within the same interval in your finite collection of open intervals. There are a
couple of ways to get around this problem. One method is to consider the endpoints
of the intervals in your finite collection of open intervals. Since the number of open
intervals is finite, there are only finitely many endpoints to these intervals. You could
select the $\delta$ in the proof not to be the least of the $\delta$s used for any of the intervals
but to be the least distance between any two distinct elements of the collection of
endpoints of these intervals. That ensures that if $y$ and $z$ are closer together than $\delta$,
there can be at most one endpoint between $y$ and $z$. That will guarantee that $y$ and
$z$ will both be within one of the finitely many open intervals. This follows from the
fact that intervals in an open cover must overlap, so that each endpoint of one of
the open intervals must be a member of one of the other open intervals in the open
cover as seen in the following diagram (Fig. 4.6).

Fig. 4.6 y and z straddle one endpoint but remain in an interval of the open cover

A cleaner way to ensure that any $y$ and $z$ within $\delta$ of each other are in one of
the finite number of intervals in the open cover of $[a, b]$ is to be more clever about
choosing the original open intervals. Suppose that for all $y \in (x - \delta, x + \delta)$, it
follows that $|f(y) - f(x)| < \frac{\epsilon}{2}$. You can be very conservative and use the open
interval $\left(x - \frac{\delta}{2}, x + \frac{\delta}{2}\right)$ as the interval chosen to cover $x$ in the open cover of $[a, b]$.
Then if $y$ and $z$ are closer together than $\frac{\delta}{2}$, and $y \in \left(x - \frac{\delta}{2}, x + \frac{\delta}{2}\right)$ for some $x$, it
follows that both $y$ and $z$ will be in $(x - \delta, x + \delta)$, and the result will follow.
The following proof uses the first strategy.
PROOF: A function continuous on a closed bounded interval is uniformly
continuous on that interval.
Let $a < b$ be two real numbers, and let $f$ be continuous on the interval $[a, b]$.
Let $\epsilon > 0$ be given.
By the definition of continuity, for each $x \in [a, b]$ there is a $\delta_x > 0$ such that
for all $y$ in $[a, b]$ with $|y - x| < \delta_x$ it follows that $|f(y) - f(x)| < \frac{\epsilon}{2}$.
Because for each $x \in [a, b]$, the point $x \in \left(x - \frac{\delta_x}{2}, x + \frac{\delta_x}{2}\right)$, this collection of
open intervals covers $[a, b]$.
By the Heine-Borel Theorem, there is a finite set $\{x_1, x_2, x_3, \ldots, x_n\} \subseteq [a, b]$
such that the collection $I = \{(x_j - \delta_{x_j}, x_j + \delta_{x_j}) \mid j = 1, 2, 3, \ldots, n\}$ forms
an open cover of $[a, b]$.
The set of endpoints of these intervals, $E = \{x_j \pm \delta_{x_j} \mid j = 1, 2, 3, \ldots, n\}$, is
a finite set, so let $\delta$ be the smallest positive difference between two elements
of $E$.
Let $y$ and $z$ be elements of $[a, b]$ with $|y - z| < \delta$.
Because the collection of intervals $I$ is an open cover of $[a, b]$, there are $j$
and $k$ such that $y \in (x_j - \delta_{x_j}, x_j + \delta_{x_j})$ and $z \in (x_k - \delta_{x_k}, x_k + \delta_{x_k})$. If $j = k$,
then $y$ and $z$ are in the same interval of $I$. If $j \ne k$, then because $|y - z| < \delta$,
there is at most one endpoint in $E$ between $y$ and $z$. Thus, there is at
most one endpoint of $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ or $(x_k - \delta_{x_k}, x_k + \delta_{x_k})$ between
$y$ and $z$. This implies that either $y$ and $z$ are both in $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$
or both in $(x_k - \delta_{x_k}, x_k + \delta_{x_k})$. In either case, there is a single interval
$(x_m - \delta_{x_m}, x_m + \delta_{x_m}) \in I$ such that $y, z \in (x_m - \delta_{x_m}, x_m + \delta_{x_m})$.
Now it follows that $|f(y) - f(z)| = |(f(y) - f(x_m)) - (f(z) - f(x_m))| \le
|f(y) - f(x_m)| + |f(z) - f(x_m)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $y, z \in [a, b]$
with $|y - z| < \delta$, then $|f(y) - f(z)| < \epsilon$. This completes the proof that $f$ is
uniformly continuous on $[a, b]$.
Note that the fact that $[a, b]$ is both closed and bounded is crucial. The function
$f(x) = x^2$ is continuous on the unbounded interval $[0, \infty)$, but $f$ is not uniformly
continuous on this interval. Similarly, the function $f(x) = \frac{1}{x}$ is continuous on the
open interval $(0, 1)$, but $f$ is not uniformly continuous on this interval.
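
Both failures can be seen numerically. In the sketch below, the sample points are choices made for this illustration and are not taken from the text: for $x^2$ the pair $n$ and $n + \frac{1}{n}$, and for $\frac{1}{x}$ the pair $\frac{1}{n}$ and $\frac{1}{n+1}$. In each case the inputs get arbitrarily close together while the outputs stay at least a fixed distance apart, so no single $\delta$ can work for, say, $\epsilon = 1$.

```python
def square_gap(n):
    """Return (|x - y|, |x**2 - y**2|) for x = n and y = n + 1/n."""
    x = float(n)
    y = x + 1.0 / n
    return abs(x - y), abs(x * x - y * y)

def reciprocal_gap(n):
    """Return (|x - y|, |1/x - 1/y|) for x = 1/n and y = 1/(n + 1)."""
    x = 1.0 / n
    y = 1.0 / (n + 1)
    return abs(x - y), abs(1.0 / x - 1.0 / y)

for n in (2, 10, 100, 1000):
    print(n, square_gap(n), reciprocal_gap(n))
```

Algebraically, the output gap for $x^2$ is $2 + \frac{1}{n^2}$ and the output gap for $\frac{1}{x}$ is exactly 1, while the input gaps $\frac{1}{n}$ and $\frac{1}{n(n+1)}$ shrink to 0.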


4.4.4 Exercises
1. Determine which of the following sets of real numbers are compact.
(a) $[0, 12]$
(b) $(-2, 2]$
(c) $[1, 4] \cup [8, 15]$
(d) $\{1, 3, 5, 7, 9\}$
(e) $\mathbb{R}$
(f) $\emptyset$
(g) $[2, 6) \cup (6, 11]$
(h)
S
h
i
1
1
1
f0g [
jD1 2jC1 ; 2j

2. For each of the following open covers, find a finite subcover.




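(a) $[1, 10]$ is covered by the collection of open intervals $\left(r - \frac{1}{4}, r + \frac{1}{4}\right)$ where
$r \in \mathbb{Q}$.
(b) $[0, 3]$ is covered by the collection of open intervals $\left(\frac{1}{j}, 4\right)$ for $j = 1, 2, 3, \ldots$
along with the open interval $\left(-\frac{1}{10}, \frac{1}{10}\right)$.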


(c) 0; 1 is covered by the collection of open intervals 2j1 ; 2j for j D 1; 2; 3; : : :
 1 1
along with  10
; 10 .
Write proofs of each of the following statements.
3. The intersection of two compact sets is a compact set.
4. The union of two compact sets is a compact set.
5. If $C$ is a compact set and $(a, b)$ is an open interval, then the set difference $C \setminus (a, b)$
is a compact set.
6. If the function $f$ is uniformly continuous on the interval $[a, b]$ and uniformly
continuous on the interval $[b, c]$ for $a < b < c$, then $f$ is uniformly continuous on
the interval $[a, c]$.
7. If a set $A$ has a cover consisting of a finite number of open intervals, then $A$ has
a subcover such that for each $x \in A$, $x$ is an element of at most two of the open
intervals in the subcover.

4.5 The Arithmetic of Continuous Functions


Chapter 3 discusses several theorems about how one can calculate limits when faced
with the addition, subtraction, multiplication, or division of functions whose limits
are known. As one might expect, since continuity and limits are closely related, the
proofs of the corresponding theorems about functions continuous at a point are, in
fact, very similar. Before starting, it is worth pointing out that if $f$ and $g$ are two
functions, then you can define the new functions $f + g$, $f - g$, $f \cdot g$, and $\frac{f}{g}$ at all
points in the intersection of the domain of $f$ and the domain of $g$ and, in the case
of $\frac{f}{g}$, only where $g$ is not 0. Generally, one is interested in functions that have a
common domain, but sometimes this is not the case. Pathological examples do exist.
It could be, for example, that $f$ is only defined for positive real numbers, and $g$ is
only defined for negative real numbers as with $f(x) = \frac{1}{\sqrt{x}}$ and $g(x) = \frac{1}{\sqrt{-x}}$. Then
$f + g$ has an empty domain and is the empty function, one that contains no ordered
pairs, $(x, f(x))$. Oddly, the definition of continuity says that the empty function is
continuous because it satisfies the definition at each point of its empty domain.
Suppose that functions $f$ and $g$ have a common domain where the point $a$ is an
accumulation point of that domain. Also suppose that $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$.
Recall that when proving that the limit of $f + g$ is $L + H$, you are given an $\epsilon > 0$
and can use the definition of limit to conclude that there are $\delta_1 > 0$ and $\delta_2 > 0$
such that if $x$ is in the common domain of $f$ and $g$ with $0 < |x - a| < \delta_1$, then
$|f(x) - L| < \frac{\epsilon}{2}$, and if $0 < |x - a| < \delta_2$, then $|g(x) - H| < \frac{\epsilon}{2}$. Then the triangle
inequality allows you to conclude that for all $x$ with $0 < |x - a| < \min(\delta_1, \delta_2)$ that
$|(f(x) + g(x)) - (L + H)| = |(f(x) - L) + (g(x) - H)| \le |f(x) - L| + |g(x) - H| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
The same method works for the proof about continuity of $f + g$ at $a$
with minor changes made to match the template for writing proofs about continuity
of a function at a point. Of course, the same logic works for proving the continuity
of $f - g$, so the two results might as well be combined as follows.
PROOF: Suppose that $f$ and $g$ are functions with common domain
containing the point $a$. If both $f$ and $g$ are continuous at the point $a$, then
so are the functions $f + g$ and $f - g$.
Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and
assume that $f$ and $g$ are both continuous at $a$.
Let $\epsilon > 0$ be given.
By the definition of continuity, there is a $\delta_1 > 0$ such that if $x \in A$ and
$|x - a| < \delta_1$, then $|f(x) - f(a)| < \frac{\epsilon}{2}$.
Similarly, there is a $\delta_2 > 0$ such that if $x \in A$ and $|x - a| < \delta_2$, then
$|g(x) - g(a)| < \frac{\epsilon}{2}$.
Let $\delta = \min(\delta_1, \delta_2)$.
Then if $x \in A$ with $|x - a| < \delta$,
$|(f(x) \pm g(x)) - (f(a) \pm g(a))| = |(f(x) - f(a)) \pm (g(x) - g(a))| \le
|f(x) - f(a)| + |g(x) - g(a)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $f + g$ and $f - g$ are continuous at $a$.
Now suppose that $f$ and $g$ are functions as discussed above with $\lim_{x \to a} f(x) = L$
and $\lim_{x \to a} g(x) = H$. Recall how you prove that $\lim_{x \to a} f(x)g(x) = LH$. Again, as with
the proof for the sum of the limits, given $\epsilon > 0$ you find $\delta > 0$ so that both $|f(x) - L|$
and $|g(x) - H|$ are small when $0 < |x - a| < \delta$. How small do these need to
be? The idea was to write $|f(x)g(x) - LH|$ as $|f(x)g(x) - f(x)H + f(x)H - LH| \le
|f(x)| \cdot |g(x) - H| + |H| \cdot |f(x) - L|$. Thus, $\delta_1 > 0$ can be chosen to ensure that
$|f(x) - L|$ is less than 1, $\delta_2 > 0$ so that $|f(x) - L|$ is less than $\frac{\epsilon}{2(|H| + 1)}$, and $\delta_3$ so that
$|g(x) - H|$ is less than $\frac{\epsilon}{2(|L| + 1)}$. Then $\delta$ can be set to the least of $\delta_1$, $\delta_2$, and $\delta_3$. The
proof for continuity of $fg$ at the point $a$ follows this same strategy.

PROOF: Suppose that $f$ and $g$ are functions with common domain
containing the point $a$. If both $f$ and $g$ are continuous at the point $a$, then
so is the function $fg$.
Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and
assume that $f$ and $g$ are both continuous at $a$.
Let $\epsilon > 0$ be given.
By the definition of continuity, there is a $\delta_1 > 0$ such that if $x \in A$ and
$|x - a| < \delta_1$, then $|f(x) - f(a)| < 1$, and thus, $|f(x)| < |f(a)| + 1$.
There is a $\delta_2 > 0$ such that if $x \in A$ and $|x - a| < \delta_2$, then
$|f(x) - f(a)| < \frac{\epsilon}{2(|g(a)| + 1)}$.
There is a $\delta_3 > 0$ such that if $x \in A$ and $|x - a| < \delta_3$, then
$|g(x) - g(a)| < \frac{\epsilon}{2(|f(a)| + 1)}$.
Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.
Then if $x \in A$ with $|x - a| < \delta$,
$|f(x)g(x) - f(a)g(a)| = |f(x)g(x) - f(x)g(a) + f(x)g(a) - f(a)g(a)| \le
|f(x)| \cdot |g(x) - g(a)| + |g(a)| \cdot |f(x) - f(a)| <
(|f(a)| + 1) \cdot \frac{\epsilon}{2(|f(a)| + 1)} + |g(a)| \cdot \frac{\epsilon}{2(|g(a)| + 1)} \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $fg$ is continuous at $a$.
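
The choice of $\delta$ in this proof can be read as a recipe: given rules that produce a suitable $\delta$ for $f$ and for $g$ at $a$, the minimum of three values works for $fg$. The sketch below follows that recipe for one concrete pair of functions. The functions $f(x) = 2x$ and $g(x) = x + 3$, the point $a = 1$, and all of the names in the code are choices made for this illustration only.

```python
def product_delta(delta_f, delta_g, f_a, g_a, eps):
    """Delta for fg at a, following the three choices made in the proof.

    delta_f(e) and delta_g(e) must return a delta that keeps f (respectively g)
    within e of its value at a.
    """
    d1 = delta_f(1.0)                          # forces |f(x)| < |f(a)| + 1
    d2 = delta_f(eps / (2 * (abs(g_a) + 1)))   # controls |g(a)| * |f(x) - f(a)|
    d3 = delta_g(eps / (2 * (abs(f_a) + 1)))   # controls |f(x)| * |g(x) - g(a)|
    return min(d1, d2, d3)

def f(x):
    return 2 * x

def g(x):
    return x + 3

def delta_f(e):
    return e / 2   # |2x - 2a| < e whenever |x - a| < e / 2

def delta_g(e):
    return e       # |(x + 3) - (a + 3)| < e whenever |x - a| < e

a, eps = 1.0, 0.1
delta = product_delta(delta_f, delta_g, f(a), g(a), eps)
print(delta)
```

For these particular functions the binding constraint is the second one, which involves $|g(a)| + 1 = 5$.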
Finally, suppose that $f$ and $g$ are functions as discussed above with $\lim_{x \to a} f(x) = L$
and $\lim_{x \to a} g(x) = H$ and $H \ne 0$. This time recall how you prove that $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.
The idea is the same as with the proof for products, but the algebra took
a few more steps.
$\left|\frac{f(x)}{g(x)} - \frac{L}{H}\right| = \left|\frac{f(x)H - Lg(x)}{g(x)H}\right| = \left|\frac{f(x)H - LH + LH - Lg(x)}{g(x)H}\right| = \left|\frac{(f(x) - L)H + L(H - g(x))}{g(x)H}\right| \le \frac{|f(x) - L|}{|g(x)|} + \frac{|L| \cdot |g(x) - H|}{|g(x)| \cdot |H|}$.
Then, given an $\epsilon > 0$, you can
choose $\delta_1 > 0$ so that $|x - a| < \delta_1$ would ensure $|g(x) - H| < \frac{|H|}{2}$ which, in
turn, implies that $|g(x)| > \frac{|H|}{2}$. Then you choose a $\delta_2 > 0$ so that $|x - a| < \delta_2$
gives $|f(x) - L| < \frac{\epsilon |H|}{4}$. Lastly, you choose a $\delta_3 > 0$ so that $|x - a| < \delta_3$ gives
$|g(x) - H| < \frac{\epsilon H^2}{4(|L| + 1)}$. This allowed you to conclude $\left|\frac{f(x)H - Lg(x)}{g(x)H}\right| < \epsilon$. Again, the
proof for continuity can be constructed by changing the limit $L$ to $f(a)$, the limit
$H$ to $g(a)$, and making some other minor wording changes.


PROOF: Suppose that $f$ and $g$ are functions with common domain
containing the point $a$ with $g(a) \ne 0$. If both $f$ and $g$ are continuous at
the point $a$, then so is the function $\frac{f}{g}$.
Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and
assume that $f$ and $g$ are both continuous at $a$ with $g(a) \ne 0$.
Let $\epsilon > 0$ be given.
Note that $|g(a)| > 0$. By the definition of continuity, there is a $\delta_1 > 0$
such that if $x \in A$ and $|x - a| < \delta_1$, then $|g(x) - g(a)| < \frac{|g(a)|}{2}$. For these $x$ it
follows that $|g(x)| + \frac{|g(a)|}{2} > |g(x)| + |g(x) - g(a)| = |g(x)| + |g(a) - g(x)| \ge
|g(x) + g(a) - g(x)| = |g(a)|$ which implies that $|g(x)| > |g(a)| - \frac{|g(a)|}{2} = \frac{|g(a)|}{2}$.
By the definition of continuity, there is a $\delta_2 > 0$ such that if $x \in A$ and
$|x - a| < \delta_2$, then $|f(x) - f(a)| < \frac{\epsilon |g(a)|}{4}$.
By the definition of continuity, there is a $\delta_3 > 0$ such that if $x \in A$ and
$|x - a| < \delta_3$, then $|g(x) - g(a)| < \frac{\epsilon \, g(a)^2}{4(|f(a)| + 1)}$.
Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.
Then if $x \in A$ with $|x - a| < \delta$,
$\left|\frac{f(x)}{g(x)} - \frac{f(a)}{g(a)}\right| = \left|\frac{f(x)g(a) - f(a)g(x)}{g(x)g(a)}\right| = \left|\frac{f(x)g(a) - f(a)g(a) + f(a)g(a) - f(a)g(x)}{g(x)g(a)}\right| =
\left|\frac{(f(x) - f(a))g(a) + f(a)(g(a) - g(x))}{g(x)g(a)}\right| \le \frac{|f(x) - f(a)|}{|g(x)|} + \left|\frac{f(a)(g(x) - g(a))}{g(x)g(a)}\right| <
\frac{\epsilon |g(a)|}{4} \cdot \frac{2}{|g(a)|} + \frac{\epsilon \, g(a)^2}{4(|f(a)| + 1)} \cdot \frac{2|f(a)|}{|g(a)|^2} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
This shows that $\frac{f}{g}$ is continuous at $a$.

4.5.1 Exercises
1. Suppose that $f$ and $g$ are functions that are both uniformly continuous on a set
$A$. Find an example showing that their product need not be uniformly continuous
on $A$.
Write proofs for each of the following statements.
2. The function $f(x) = x^{5/2}$ is continuous for $x \ge 0$.
3. All polynomials are continuous on $\mathbb{R}$.
4. All rational functions are continuous on $\mathbb{R}$ except at points where their denominators are 0.
5. If $f$ and $g$ are uniformly continuous on the set $A$, then $f + g$ and $f - g$ are also
uniformly continuous on $A$.


6. Suppose $f$ and $g$ have common domain $A$ and $f + g$ is continuous at $a \in A$. If $f$
is discontinuous at $a$, then $g$ is discontinuous at $a$.
7. Suppose $f$ and $g$ have common domain $A$ and $fg$ is continuous at $a \in A$. If $f$ is
discontinuous at $a$, then $g$ is either discontinuous at $a$ or $g(a) = 0$.

4.6 Composition, Absolute Value, Maximum, and Minimum


Recall that the two functions $g : A \to B$ and $f : B \to C$ can be composed to
obtain $f \circ g : A \to C$. An important property of composition is that if the function
$g$ is continuous at $a \in A$ and the function $f$ is continuous at $g(a) \in B$, then the
composition $f \circ g$ is continuous at $a$. The proof of this result can follow the template
for proofs of continuity at a point. Such a proof would introduce an $\epsilon > 0$ and end
with concluding that $|(f \circ g)(x) - (f \circ g)(a)| = |f(g(x)) - f(g(a))| < \epsilon$. The continuity
of $f$ at $g(a)$ allows you to claim that $|f(g(x)) - f(g(a))|$ is small if $g(x)$ is close to
$g(a)$. But it is easy to ensure that $g(x)$ is close to $g(a)$ because $g$ is continuous at $a$.
So, you can choose a $\delta > 0$ to make $|g(x) - g(a)|$ as small as necessary. How small
is that? The continuity of $f$ tells you how small. So, given $\epsilon > 0$, choose a $\delta' > 0$
so that $|y - g(a)| < \delta'$ implies $|f(y) - f(g(a))| < \epsilon$. Then choose a $\delta > 0$ so that
$|x - a| < \delta$ implies $|g(x) - g(a)| < \delta'$. This gives $|f(g(x)) - f(g(a))| < \epsilon$. The
complete proof can be written as follows.
PROOF: Suppose that function $g$ has domain $A$ with its range contained
in the set $B$, and that function $f$ has domain $B$. If $g$ is continuous at $a \in A$
and $f$ is continuous at $g(a)$, then the composition $f \circ g$ is continuous at $a$.
Let $g$ be a function with domain $A$ with its range contained in the set $B$, and
let $f$ be a function with domain $B$. Assume $g$ is continuous at $a \in A$, and $f$
is continuous at $g(a)$.
Let $\epsilon > 0$ be given.
Because $f$ is continuous at $g(a)$, there is a $\delta' > 0$ such that if $y$ is in the
domain of $f$ with $|y - g(a)| < \delta'$, then $|f(y) - f(g(a))| < \epsilon$.
Because $g$ is continuous at $a$, there is a $\delta > 0$ such that if $x \in A$ with
$|x - a| < \delta$, then $|g(x) - g(a)| < \delta'$.
If $x \in A$, then $|x - a| < \delta$ implies $|g(x) - g(a)| < \delta'$ which, in turn, implies
$|f(g(x)) - f(g(a))| < \epsilon$.
This shows that $f \circ g$ is continuous at $a$.
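
The two-stage choice in this proof, first $\delta'$ from $f$ and then $\delta$ from $g$, amounts to feeding one rule into another. The sketch below makes that explicit for a concrete pair of functions. The functions $g(x) = 3x$ and $f(y) = 2y + 1$, the point $a = 2$, and the names in the code are assumptions made for this illustration.

```python
def composed_delta(delta_f_at_ga, delta_g_at_a, eps):
    """Delta for f∘g at a: first find delta' for f at g(a), then delta for g at a."""
    delta_prime = delta_f_at_ga(eps)      # |y - g(a)| < delta' gives |f(y) - f(g(a))| < eps
    return delta_g_at_a(delta_prime)      # |x - a| < delta gives |g(x) - g(a)| < delta'

def g(x):
    return 3 * x

def f(y):
    return 2 * y + 1

def delta_f(e):
    return e / 2   # works for f(y) = 2y + 1 at every point

def delta_g(e):
    return e / 3   # works for g(x) = 3x at every point

a, eps = 2.0, 0.6
delta = composed_delta(delta_f, delta_g, eps)
print(delta)
```

Here $(f \circ g)(x) = 6x + 1$, so the result agrees, up to floating-point rounding, with the value $\frac{\epsilon}{6}$ that a direct argument would give.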
As an example of how useful this theorem is consider the function $|x|$. One can
prove that this function is continuous fairly easily by following the template for
proofs that a function $f$ is continuous at a point $a$. Indeed, such a proof must end
with $\big||x| - |a|\big| < \epsilon$, but by considering all the possible cases for $x$ and $a$ being
negative or nonnegative, it can be seen that $\big||x| - |a|\big| \le |x - a|$, so if $|x - a|$ is
made less than $\epsilon$, then $\big||x| - |a|\big| < \epsilon$. This, in fact, shows that $|x|$ is uniformly
continuous.


PROOF: The function $|x|$ is uniformly continuous.
Let $f(x) = |x|$.
Let $\epsilon > 0$ be given, and set $\delta = \epsilon$.
Let a be any real number.
Note that if x and a are either both greater than or equal to 0, or both less
than or equal to 0, then $\bigl||x| - |a|\bigr| = |x - a|$, but that if x and a have opposite
signs, then $|x - a| = |x| + |a| > \bigl||x| - |a|\bigr|$. In either case, $\bigl||x| - |a|\bigr| \le |x - a|$.
So, if $|x - a| < \delta$, it follows that $\bigl||x| - |a|\bigr| \le |x - a| < \delta = \epsilon$.
Thus, $|x|$ is uniformly continuous.

As easy as this proof is, the continuity of $|x|$ can more easily be proved as
follows.
PROOF: The function $|x|$ is continuous.
Let $g(x) = x^2$, and $f(x) = \sqrt{x}$.
Let a be any real number.
Then g is continuous at a, and, since $g(a) = a^2 \ge 0$, f is continuous at $g(a)$.
Because the function $|x| = \sqrt{x^2} = (f \circ g)(x)$, it follows that $|x|$ is continuous
at a.
In turn, this result can be used to show that if f and g are functions with domain A,
and f and g are both continuous at $a \in A$, then the functions $\min(f, g)$ and $\max(f, g)$
are both continuous at a. This is because the functions $\min(f, g)$ and $\max(f, g)$ can
be expressed in terms of absolute value.
PROOF: If f and g are functions with the same domain A, and both functions
are continuous at $a \in A$, then the function $\min(f, g)$ is continuous at a.
Let f and g be functions with the same domain A, and assume that both
functions are continuous at $a \in A$.
Note that for any two real numbers y and z, if $y > z$, then $y + z - |y - z| =
y + z - (y - z) = 2z$, but if $y \le z$, then $y + z - |y - z| = y + z - (z - y) = 2y$.
In either case, $y + z - |y - z| = 2 \min(y, z)$.
Thus, for any x, $\min\bigl(f(x), g(x)\bigr) = \frac{f(x) + g(x) - |f(x) - g(x)|}{2}$.
Since f and g are continuous at a, so is $f - g$.
Since $f - g$ is continuous at a, so is $|f - g|$.
It then follows that the combination $\min(f, g) = \frac{f + g - |f - g|}{2}$ is continuous
at a.
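The identity at the heart of this proof is easy to spot-check by machine. The sketch below, written in Python with hypothetical helper names, compares $\frac{y + z - |y - z|}{2}$ with the built-in minimum on a grid of rational numbers. It also checks the companion identity with the sign of the absolute value reversed, which is a standard fact offered here as an assumption rather than something taken from the text.

```python
from fractions import Fraction
from itertools import product

def min_via_abs(y, z):
    # Formula used in the proof: y + z - |y - z| = 2 min(y, z).
    return (y + z - abs(y - z)) / 2

def max_via_abs(y, z):
    # Companion identity (assumed, not stated in the text): y + z + |y - z| = 2 max(y, z).
    return (y + z + abs(y - z)) / 2

# Exact rational test values from -3 to 3 in steps of 1/4.
values = [Fraction(n, 4) for n in range(-12, 13)]
all_match = all(
    min_via_abs(y, z) == min(y, z) and max_via_abs(y, z) == max(y, z)
    for y, z in product(values, repeat=2)
)
print(all_match)  # prints True
```

A finite check like this is no substitute for the two-case argument in the proof, but it is a quick way to catch a sign error in the formula.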


4.6.1 Exercises
1. Find examples of functions f and g defined on $\mathbb{R}$ with $\lim_{x \to a} f(x) = L$ and
$\lim_{y \to L} g(y) = M$ such that $\lim_{x \to a} g(f(x)) \ne M$.

Write proofs for each of the following statements.

2. If g is uniformly continuous on its domain A, and f is uniformly continuous on
the range of g, then $f \circ g$ is uniformly continuous on A.
3. If f and g are functions with the same domain A, and both f and g are continuous
at $a \in A$, then $\max(f, g)$ is continuous at a.
4. If f and g are functions uniformly continuous on the same domain A, then
$\min(f, g)$ is uniformly continuous on A.
5. If $f_1, f_2, f_3, \ldots, f_n$ are all uniformly continuous on the same domain A, then so is
$\max(f_1, f_2, f_3, \ldots, f_n)$.

4.7 Other Continuity Theorems


4.7.1 Boundedness of Continuous Functions
A function that is continuous on a closed bounded interval $[a, b]$ satisfies some
important properties. In particular, there are points u and v in $[a, b]$ such that for
all $x \in [a, b]$, $f(u) \le f(x) \le f(v)$. In this case, $f(u)$ is the minimum value of f
on $[a, b]$ and $f(v)$ is the maximum value of f on $[a, b]$. This result is often stated
as "a function continuous on a closed bounded interval obtains its minimum and its
maximum value." Note that the function $f(x) = x$ is continuous on the open interval
$(1, 2)$, but it does not take on a minimum or maximum value there. Clearly, $f(x) < 2$
for each value of $x \in (1, 2)$, and 2 is the least upper bound of all values achieved by
f on that interval, but the least upper bound is never achieved.
Before showing that the function f continuous on the closed bounded interval
$[a, b]$ obtains its minimum and maximum values, it is convenient to first show that
such a function must be bounded, that is, there is a real number M such that for all
$x \in [a, b]$, $|f(x)| \le M$. This can be proved by contradiction by assuming that f is
continuous on $[a, b]$, but that no bound exists. How do you quantify the assumption
"f is not bounded"? You cannot just assume that f takes on an infinitely large value
because $f(x)$ is a real number for each value of $x \in [a, b]$ and, hence, $f(x)$ cannot be
infinitely large for any value of x. You need to construct the negation of the statement
"there is an M such that for all $x \in [a, b]$, $|f(x)| \le M$." This is a statement with two
quantifiers: "there is an M" and "for all $x \in [a, b]$." The two quantifiers are followed
by the proposition (statement) that $|f(x)| \le M$. The first quantifier is an existential
quantifier, and the second quantifier is a universal quantifier. Thus, the statement
"there is an M such that for all $x \in [a, b]$, $|f(x)| \le M$" has an existential quantifier
stating that there exists a number M satisfying a property. This is followed by a
universal quantifier stating that all x in the interval $[a, b]$ satisfy a property. Finally,
the property is given as $|f(x)| \le M$.
The rule of thumb for constructing the negation of statements with quantifiers
is to replace each existential quantifier with a corresponding universal quantifier,
replace each universal quantifier with a corresponding existential quantifier, and
replace the property with the negation of that property. In this example, the
existential quantifier "there exists a number M" would be replaced by the universal
quantifier "for all numbers M." Then the universal quantifier "for all $x \in [a, b]$"
would be replaced by the existential quantifier "there exists an $x \in [a, b]$." Finally,
the property $|f(x)| \le M$ would be replaced by its negation $|f(x)| > M$. The
resulting negation is "for all numbers M there is an $x \in [a, b]$ such that $|f(x)| > M$."
Your proof of the boundedness of f would begin by introducing f and the interval
$[a, b]$. Then it would assume the negation just discussed. The remainder of the proof
would be to derive a contradiction, and that would show that the assumption made
at the outset of the proof is false, so its negation, the statement you were trying to
prove, must be true.
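Written symbolically, the rule of thumb turns the boundedness statement into its negation one quantifier at a time. The display below is a sketch in standard logical notation, which the text itself does not use; the symbols $\exists$, $\forall$, and $\neg$ are introduced here only for illustration.

```latex
\[
\text{bounded:}\qquad
\exists M \in \mathbb{R}\;\; \forall x \in [a,b] :\; |f(x)| \le M
\]
\[
\text{not bounded:}\qquad
\neg\bigl(\exists M \in \mathbb{R}\;\; \forall x \in [a,b] :\; |f(x)| \le M\bigr)
\;\Longleftrightarrow\;
\forall M \in \mathbb{R}\;\; \exists x \in [a,b] :\; |f(x)| > M
\]
```

Each quantifier flips and the inequality $\le$ becomes $>$, while the domain restrictions $M \in \mathbb{R}$ and $x \in [a, b]$ are left unchanged.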
Thus, the proof would begin with a statement about f being a continuous function
on the closed bounded interval $[a, b]$ which would be followed by the negation
of the statement you want to prove. So how do you use this negation to reach a
contradiction? Well, just see where this assumption leads you. If for each M you
can find an $x \in [a, b]$ where $|f(x)| > M$, it means that there is an $x_1$ such that
$|f(x_1)| > 1$. Similarly, there is an $x_2$ such that $|f(x_2)| > 2$. In this way, you can
assert that there is a sequence $x_1, x_2, x_3, \ldots$ such that for each $n \ge 1$, $|f(x_n)| > n$.
Note that this gives you an infinite sequence of values in the closed bounded interval
$[a, b]$. The Bolzano-Weierstrass Theorem states that every infinite bounded set has
an accumulation point. Does the sequence $x_1, x_2, x_3, \ldots$ produce such an infinite
bounded set? Well, it is certainly bounded because each $x_n$ is in the interval $[a, b]$.
Is it possible that the sequence does not give an infinite collection of points? For
that to happen, it would have to be the case that infinitely many of the values in
the sequence were equal to each other. Actually, just because you choose $x_1$ so that
$|f(x_1)| > 1$ does not preclude having $|f(x_1)| > 100$, so the value $x_1$ could appear
in the sequence many times. This is awkward. It would be easier if you chose a
sequence of distinct values. This is actually not hard to do. Rather than choosing $x_n$
so that $|f(x_n)| > n$, why not choose $x_1$ as above, and for each $n \ge 1$ choose $x_{n+1}$ so
that $|f(x_{n+1})| > |f(x_n)| + 1$. This would not only imply that for each n, $|f(x_n)| > n$
but also that $x_n$ could not equal any of the values that appear earlier in the sequence.
So what can you do with the infinite sequence of $x_n$ values with its guaranteed
accumulation point, y? First note that the accumulation point y is also in $[a, b]$
because all of the $x_n$ values satisfy both $a \le x_n$ and $x_n \le b$, so the accumulation
point y must also satisfy $a \le y \le b$. Otherwise, there would be an interval around y
that did not share any points with $[a, b]$, so it would not contain any of the $x_n$ values.
This means that f is defined and continuous at y. That implies that there is a $\delta > 0$
such that for all $x \in [a, b]$ satisfying $|x - y| < \delta$, it follows that $|f(x) - f(y)| < 1$.
But that means $|f(x)| < |f(y)| + 1$. But y is an accumulation point for the sequence
of $x_n$ values, so there are infinitely many of the $x_n$ within $\delta$ of y, and some of them
will necessarily have the property that $|f(x_n)| > |f(y)| + 1$. This gives the needed
contradiction (Fig. 4.7).

Fig. 4.7 Proving that a continuous function on $[a, b]$ is bounded

PROOF: A function continuous on a closed bounded interval is bounded.
Let $a \le b$ be real numbers, and let f be a function continuous on the interval
$[a, b]$.
Assume that f on $[a, b]$ is not bounded. That is, for every real number M,
there is an $x \in [a, b]$ such that $|f(x)| > M$.
Select $x_1 \in [a, b]$ so that $|f(x_1)| > 1$.
Construct a sequence inductively as follows. Assume that for some $n \ge 1$
the sequence $x_1, x_2, x_3, \ldots, x_n$ has been selected. Choose $x_{n+1} \in [a, b]$ such
that $|f(x_{n+1})| > |f(x_n)| + 1$.
Note that for each n the $x_n$ chosen in this manner must be distinct from all
values of $x_j$ chosen before it in the sequence, and that $|f(x_n)| > n$.
The terms of the sequence $x_1, x_2, x_3, \ldots$ form an infinite set contained in the
interval $[a, b]$, so it is an infinite bounded set. By the Bolzano-Weierstrass
Theorem the set has an accumulation point y.
If y lies outside of $[a, b]$, then there is an open interval containing y that
contains no points of $[a, b]$. Thus, that open interval would not contain any
terms of the sequence which cannot happen if y is an accumulation point of
the sequence. Therefore, $y \in [a, b]$.
The function f is continuous at y, so there is a $\delta > 0$ such that for all
$x \in [a, b]$ with $|x - y| < \delta$, it follows that $|f(x) - f(y)| < 1$, and thus,
$|f(x)| < |f(y)| + 1$.
Since y is an accumulation point of the sequence $x_1, x_2, x_3, \ldots$, there must
be infinitely many terms of the sequence within $\delta$ of y. Thus, there must
be an n such that $n > |f(y)| + 1$ and $x_n$ is within $\delta$ of y. This implies that
$|f(x_n)| > n > |f(y)| + 1$ which contradicts the fact that $|f(x_n) - f(y)| < 1$.
Therefore, the assumption that f is not bounded must be false, and the
theorem is proved.

4.7.2 Obtaining Extreme Values


Using the fact that continuous functions on closed bounded intervals are bounded,
there is a nice trick to show that a function f continuous on the closed bounded
interval $[a, b]$ must achieve its extreme values, that is, its minimum and maximum.
The fact that the set of values that f takes on is a bounded set implies that the set
of values has a least upper bound, M. If f is never equal to M, then the function
$M - f(x)$ is positive for all $x \in [a, b]$ because M is an upper bound, and $f(x)$ is never
equal to M. This implies that the function $\frac{1}{M - f(x)}$ is also continuous on the interval
$[a, b]$. But then you can again apply the previous theorem to show that there is a
number K such that for all $x \in [a, b]$, $\frac{1}{M - f(x)} \le K$. Taking reciprocals one more time
shows $M - f(x) \ge \frac{1}{K}$ which implies that $f(x) \le M - \frac{1}{K}$. This shows that $M - \frac{1}{K} < M$
is an upper bound for f on $[a, b]$ when M was assumed to be the least upper bound.
This is a contradiction, and you must conclude that $f(x) = M$ for at least one value
of $x \in [a, b]$. The formal proof can be written as follows (Fig. 4.8).

Fig. 4.8 The maximum and minimum of a function $f(x)$ on an interval

PROOF (Extreme Value Theorem): A function continuous on a closed
bounded interval obtains its maximum value and its minimum value at
some points in the interval.
Let $a \le b$ be real numbers, and let f be a function continuous on the interval
$[a, b]$.
The set $B = \{f(x) \mid x \in [a, b]\}$ is not empty because it contains $f(a)$, and
B is bounded above because all functions continuous on a closed bounded
interval are bounded.
Let M be the least upper bound of set B.
Assume that for all $x \in [a, b]$, $f(x) \ne M$.
The function $M - f(x)$ is continuous on $[a, b]$ and is never equal to 0. Hence,
$M - f(x) > 0$ on $[a, b]$.
It follows that the function $\frac{1}{M - f(x)}$ is continuous on $[a, b]$.
Because all functions continuous on a closed bounded interval are bounded,
there is a real number $K > 0$ such that $\frac{1}{M - f(x)} \le K$ on $[a, b]$.
But then $M - f(x) \ge \frac{1}{K}$ on $[a, b]$, so $f(x) \le M - \frac{1}{K}$ on $[a, b]$.
Since $K > 0$, the set B is bounded above by $M - \frac{1}{K} < M$. This means that
$M - \frac{1}{K}$ is an upper bound for B which contradicts the fact that M was the
least upper bound of B.
Therefore, the assumption that f was never equal to M is false, and there
must be a value $x \in [a, b]$ such that $f(x) = M$.
Applying the preceding argument to the function $-f$, which is also
continuous on $[a, b]$, shows that there is an $x \in [a, b]$ such that $-f(x)$ is
equal to the maximum of $-f$ on $[a, b]$. But then $f(x)$ is the minimum value
of f on $[a, b]$. This completes the proof of the theorem.

4.7.3 The Intermediate Value Property


Suppose the function f is defined on an interval containing c and d, and the graph of
f passes through the points $(c, f(c))$ and $(d, f(d))$. It might be that the graph of the
function passes through every value of y between $f(c)$ and $f(d)$ as it moves between
the points $(c, f(c))$ and $(d, f(d))$ as shown in the figure (Fig. 4.9). For example, the
function $f(x) = 2x^2 - 3$ is defined for all real numbers with $f(1) = -1$ and $f(2) = 5$.
For each y between $-1$ and 5, the value $x = \sqrt{\frac{y + 3}{2}}$ lies between 1 and 2 and $f(x) = y$.
Formally, a function defined on an interval $[a, b]$ is said to have the intermediate
value property on that interval if for each choice of c and d with $a \le c \le d \le b$
and each y between $f(c)$ and $f(d)$, there is an $x \in [c, d]$ such that $f(x) = y$.
The Intermediate Value Theorem states that any function continuous on an interval
has the intermediate value property there. If you consider the intuitive notion of
continuity where you say that f is continuous on $[a, b]$ if you can draw the graph of
f without lifting your pencil from the paper, then this intermediate value property
becomes clear because in going from $f(c)$ to $f(d)$, your pencil will necessarily cross
over all the y values between $f(c)$ and $f(d)$.

Fig. 4.9 f passing through each y between $f(c)$ and $f(d)$

To prove the Intermediate Value Theorem you would begin by setting the context
by introducing a function f continuous on an interval $[a, b]$ and points c and d with
$a \le c \le d \le b$. Then you would select an arbitrary y between $f(c)$ and $f(d)$.
The proof would have to demonstrate the existence of an x between c and d with
$f(x) = y$. How is this to be done? As with many other proofs in Analysis, one shows
the existence of a real number by constructing a set for which that number is a least
upper bound. Consider, for example, the case where $f(c) < y < f(d)$. You could
construct the set $S = \{x \in [c, d] \mid f(x) \le y\}$. This set is not an empty set because
$c \in S$, and S is certainly bounded above by d. Thus, the Completeness Axiom says
that the set has a least upper bound, s. Now you can refer to the continuity of f
which will show that if $f(s) < y$, then there is a $\delta > 0$ such that $|x - s| < \delta$
implies that $f(x) < y$ showing that there are values greater than s for which $f(x) < y$
contradicting the fact that s is an upper bound of S. If $f(s) > y$, then there is a $\delta > 0$
such that $|x - s| < \delta$ implies that $f(x) > y$ showing that $s - \delta < s$ is an upper bound
for S contradicting the fact that s is the least upper bound of S. The only remaining
conclusion is that $f(s) = y$ which provides the example, $x = s$, needed to
prove the theorem.
Note that the above argument did not cover the general case where $f(c)$ and $f(d)$
can be in any order. The argument so far only covers the specific case where $f(c) <
f(d)$. So is there more proof to write? It is easy to see that the case $f(c) > f(d)$ can be
proved with an argument virtually identical to the one given above by changing the
sense of some of the inequalities. The case of $f(c) = f(d)$ is even easier because the
only possible y between $f(c)$ and $f(d)$ is $f(c)$, so the value $x = c$ gives the needed
$f(x) = y$. Thus, giving the argument for $f(c) < f(d)$ essentially covers all the
needed cases, and it would be very easy for the reader to add the needed arguments
to complete the proof for the missing cases. In this situation it is common for the
proof to cover only the specific condition $f(c) < f(d)$ and introduce it with the
phrase "without loss of generality." In this case the phrase means that although the
following assumption looks like it only covers some of the necessary cases, in order
to make the argument completely general, the omitted cases are either very easy or
virtually identical to the case being considered. With this in mind, the following is
a proof of the Intermediate Value Theorem.
PROOF (Intermediate Value Theorem): Let the function f be continuous
on the interval $[a, b]$ containing c and d. If y is any value between $f(c)$ and
$f(d)$, then there exists x between c and d such that $f(x) = y$.
Let f be a function continuous on $[a, b]$, and let c and d be in $[a, b]$.
Let y be any value between $f(c)$ and $f(d)$.
Without loss of generality, assume that $c \le d$ and $f(c) \le y \le f(d)$.
Let set $S = \{x \in [c, d] \mid f(x) \le y\}$.
S is not empty because $f(c) \le y$ implying $c \in S$.
S is bounded above by d.
By the Completeness Axiom S has a least upper bound s which will be an
element of $[a, b]$.
If $f(s) < y$, then by the continuity of f, there is a $\delta > 0$ such that if $x \in [a, b]$
with $|x - s| < \delta$, then $|f(x) - f(s)| < \frac{y - f(s)}{2}$, and, in particular, $f(x) < y$.
This shows that there is an $x > s$ with $f(x) < y$, so $x \in S$ contradicting the
fact that s is an upper bound of S.
If $f(s) > y$, then by the continuity of f, there is a $\delta > 0$ such that if $x \in [a, b]$
with $|x - s| < \delta$, then $|f(x) - f(s)| < \frac{f(s) - y}{2}$, and, in particular, $f(x) > y$.
This shows that $f(x) > y$ for all x between $s - \delta$ and s, so $s - \delta$ is an
upper bound of S contradicting the fact that s is the least upper bound of S.
It follows that $f(s)$ must equal y which completes the proof of the theorem.
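The proof locates the required point as a least upper bound, which is not something a computer can evaluate directly. A different technique, bisection, also rests on the intermediate value property and does yield a numerical approximation. The Python sketch below applies it to the example $f(x) = 2x^2 - 3$ on $[1, 2]$ discussed earlier. The function name `bisect_for_value` and the tolerance are hypothetical choices made for illustration, and the method is not the one used in the proof.

```python
import math

def f(x):
    # Example function from the discussion of the intermediate value property.
    return 2 * x**2 - 3

def bisect_for_value(func, c, d, y, tol=1e-12):
    """Approximate an x in [c, d] with func(x) = y.

    Assumes func is continuous on [c, d] and func(c) <= y <= func(d).
    """
    lo, hi = c, d
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) <= y:
            lo = mid   # keeps func(lo) <= y
        else:
            hi = mid   # keeps func(hi) > y (or hi == d)
    return (lo + hi) / 2

y = 2.0
x = bisect_for_value(f, 1.0, 2.0, y)
# Compare with the closed form x = sqrt((y + 3) / 2) given in the text.
print(abs(x - math.sqrt((y + 3) / 2)) < 1e-9)  # prints True
```

Each pass halves an interval on which f passes from values at most y to values at least y, so continuity guarantees that a solution remains trapped inside it.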

In the above proof the steps which begin "If $f(s) < y$" and "If $f(s) > y$" are written
in exactly the same style using almost identical words. If you were writing a short
story, you would avoid writing in this style because it might sound monotonous to
the reader. In creative writing, you would want to be more creative, and you would
reach for your thesaurus to find alternate words to enhance your writing. But in a
mathematical proof, using such parallel construction of sentences actually makes
the proof easier to read. A reader only needs to parse the first of the two steps in
order to have a good idea of what is going to be done in the second of the two steps.
This gives the reader a head start on processing the second step. What is passed off
as boring in creative writing can be applauded in the writing of proofs because of
the way it simplifies the understanding. In fact, one often begins the second of two
such steps with the word "similarly" to indicate that the argument to follow looks a lot
like the one just completed, again alerting the reader to the parallel construction.
The Intermediate Value Theorem says that functions continuous on an interval
have the intermediate value property there. But a function need not be continuous
for it to have the intermediate value property. Clearly, if a function has a jump
discontinuity at a point a, that is, if $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ both exist but are
different as shown in Fig. 4.10, then there could well be values of y that the function
misses as it passes from $(c, f(c))$ to $(d, f(d))$.

Fig. 4.10 A function with a jump discontinuity

Fig. 4.11 Graph of $\sin\frac{1}{x}$

For a discontinuous function to have the intermediate value property, the function
must necessarily oscillate wildly (Fig. 4.11). A typical example is the function
$f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$.
4.7.4 Exercises
Write proofs for each of the following statements. Each statement can be proved
using one or more of the theorems in this section.
1. Let $A \subset \mathbb{R}$ be a bounded set, and let f be a function defined on A. If f is
unbounded on A, then for every $\epsilon > 0$, there exist a and b in $\mathbb{R}$ with $b - a < \epsilon$
such that f is unbounded on $A \cap (a, b)$.
2. If $a < b$ and f is a continuous function on $[a, b]$ with $f(a) = f(b)$, then there is a
$c \in (a, b)$ such that f obtains an extreme value (either a minimum or maximum)
at c.
3. Suppose that f is a continuous function defined on $\mathbb{R}$ such that $\lim_{x \to \infty} f(x) =
\lim_{x \to -\infty} f(x) = \infty$. Then f obtains its minimum value for some $x \in \mathbb{R}$.
4. If p is an odd degree polynomial with real coefficients, then p has at least one
real root.
5. Suppose that a plane contains a polygon G and a line L. Then there is a line
$L'$ in the plane parallel to L such that exactly half the area of G lies on each side
of $L'$.
6. There is a value of x between 0 and 1 such that $x^2$ equals $\sqrt{\frac{1}{1 + x^2}}$.

4.8 Discontinuity
In Calculus students learn about a great many continuous functions. These include
the elementary functions: polynomials, rational functions, algebraic functions,
exponential functions, logarithmic functions, and circular and hyperbolic trigonometric
functions and their inverses. How badly can a function be discontinuous? A
function can be discontinuous at a single point such as the signum or sign function
$\operatorname{sgn}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$
or at a sequence of points such as the floor or greatest
integer function $\lfloor x \rfloor = n$ if n is the integer satisfying $n \le x < n + 1$ (Fig. 4.12).
A function can be discontinuous at a sequence of points that converge such as with
$f(x) = \begin{cases} \frac{1}{n} & \text{if } \frac{1}{n+1} < x \le \frac{1}{n} \text{ for positive integer } n \\ 0 & \text{otherwise} \end{cases}$.
This function is discontinuous at each $x = \frac{1}{n}$ for positive integers n, but it is continuous
everywhere else including at $x = 0$ (Fig. 4.13). A function can be discontinuous at every x such as
with $f(x) = \begin{cases} 0 & \text{if } x \text{ is rational} \\ 1 & \text{if } x \text{ is irrational} \end{cases}$.
Fig. 4.12 Graphs of $\operatorname{sgn}(x)$ and $\lfloor x \rfloor$

Fig. 4.13 Graphs of functions with discontinuities

But one of the most surprising examples is the following, often called Thomae's
function but also known as the popcorn function, the raindrop function,
or the modified Dirichlet function. It is defined on the interval $(0, 1)$ by
$f(x) = \begin{cases} \frac{1}{n} & \text{if } x \text{ is rational written in lowest terms as } \frac{m}{n} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$.
Its graph is shown in Fig. 4.14. It is not hard to see that this function is discontinuous
at each rational number $\frac{m}{n} \in (0, 1)$. Indeed if $\frac{m}{n}$ is in lowest terms, then
$f\left(\frac{m}{n}\right) = \frac{1}{n}$. If $\epsilon$ is set at $\frac{1}{2n}$, then for every $\delta > 0$ there will be irrational numbers
$x \in (0, 1)$ satisfying $\left|x - \frac{m}{n}\right| < \delta$ for which $\left|f(x) - f\left(\frac{m}{n}\right)\right| = \left|0 - \frac{1}{n}\right| > \epsilon$.
On the other hand, at each
irrational number a in $(0, 1)$, the function is continuous. To see this, given an
$\epsilon > 0$, notice that there are only finitely many rational numbers $r \in (0, 1)$ such that
$f(r) \ge \epsilon$. If there are such rational numbers, there is one, $r_0$, closest to a, so choose
$\delta = |r_0 - a|$. If there are no such rational numbers, you can choose $\delta = 1$. In either
case, for all $x \in (0, 1)$ with $|x - a| < \delta$, it follows that $|f(x) - f(a)| < \epsilon$, showing
that f is continuous at a.

Fig. 4.14 Graph of Thomae's function
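The finiteness claim that drives the continuity argument can be made concrete. The Python sketch below lists every rational r in $(0, 1)$ with $f(r) \ge \epsilon$ and then picks $\delta$ as the distance from a to the nearest one. The function names are hypothetical, and because an irrational number cannot be stored exactly, the point a is only a floating-point stand-in for $\frac{\sqrt{2}}{2}$; treat the computed $\delta$ as an approximation.

```python
from fractions import Fraction
from math import gcd

def thomae(x):
    """Value of the function at a rational x in (0, 1), given as a Fraction.

    Fraction stores m/n in lowest terms, so the value is 1/n.
    """
    return Fraction(1, x.denominator)

def rationals_with_value_at_least(eps):
    """All rationals r in (0, 1) with thomae(r) >= eps, for eps > 0.

    thomae(m/n) = 1/n >= eps forces n <= 1/eps, so the search is finite.
    """
    max_n = int(1 / eps)  # floor of 1/eps for positive eps
    return sorted(
        Fraction(m, n)
        for n in range(2, max_n + 1)
        for m in range(1, n)
        if gcd(m, n) == 1
    )

eps = Fraction(1, 4)
big = rationals_with_value_at_least(eps)
print(big)  # the five fractions 1/4, 1/3, 1/2, 2/3, 3/4

a = 2 ** 0.5 / 2  # floating-point stand-in for the irrational sqrt(2)/2
delta = min(abs(float(r) - a) for r in big)
print(delta)
```

Every x within this $\delta$ of a is either irrational, where the function is 0, or a rational with denominator greater than 4, where the function is less than $\frac{1}{4}$.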

Chapter 5

Derivatives

5.1 The Definition of Derivative


Anybody who was even half paying attention in their first course in Calculus got
the strong impression that the differentiation of functions has an enormous number
of applications. Not only does it provide a great tool for understanding the behavior
of functions, but it also has applications to a very wide range of other fields, most
notably Physics, Engineering, Chemistry, Biology, and Economics. In particular,
being able to use the derivative to determine where a function is increasing and
decreasing in itself justifies this reputation. Merely knowing the average rate of
change of a function over an interval is valuable. But the limit concept allows you to
refine this idea to get the instantaneous rate of change of the function at a point. This
allows for more precise information about the function as well as providing what is
often a simpler expression than that of the average rate of change from which it
is derived. This chapter will discuss the theorems needed to calculate derivatives
efficiently as well as theorems highlighting some of the important properties and
applications of the derivative.
Let f be a function defined on an open interval containing the point a.
Then for values of x near but not equal to a one can calculate the slope
of the secant line passing through the two points on the graph of the
function $(a, f(a))$ and $(x, f(x))$. As shown in Fig. 5.1, the slope of this secant
line is given by the difference quotient $\frac{f(x) - f(a)}{x - a}$. If f is continuous, as x
approaches a, the point $(x, f(x))$ approaches the point $(a, f(a))$, and the secant
line may approach a tangent line, the line that passes through $(a, f(a))$
and most closely approximates the graph of the function near a (Fig. 5.2).
The derivative of f at a is the slope of this tangent line. More formally, if a is
an accumulation point of the domain of the function f, and f is defined at a, then
the derivative of f at a is $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$. The derivative is said to exist if this
limit exists. When the limit exists, f is said to be differentiable at a. Equivalently,
the limit can be written $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$.

Springer International Publishing Switzerland 2016


J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_5

Fig. 5.1 Slope of a Secant Line (through the points $(a, f(a))$ and $(x, f(x))$, with rise $f(x) - f(a)$ and run $x - a$)

Fig. 5.2 Tangent Line (at the point $(a, f(a))$)

5.2 Differentiation and Continuity


The first important consequence of the definition of the derivative is that if a function
f has a derivative at a point, then f is also continuous at that point. As part of the
definition of derivative, f needs to be defined at the point a for it to have a derivative
at a. It remains to show that $\lim_{x \to a} f(x) = f(a)$ whenever the limit $\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ exists.
For this difference quotient to have a finite limit when the denominator is clearly
approaching 0, the numerator must also be approaching 0. This last statement is
intuitively true, so you would hope that it has an easy justification. Consider what
sort of algebraic operations you could apply to the difference quotient $\frac{f(x) - f(a)}{x - a}$ in
order to produce the numerator $f(x) - f(a)$. It should be clear that if the difference
quotient is multiplied by $x - a$, the product will be the desired difference $f(x) - f(a)$.
This suggests the method that works in the following simple proof.


PROOF: If the function f has a derivative at a point a, then f is continuous
at a.
Suppose that f has a derivative at the point a.
It follows from the definition of derivative that f is defined at a, and that a
is an accumulation point of the domain of f.
Also from the definition of derivative it follows that $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$
exists.
Then $\lim_{x \to a} \bigl[f(x) - f(a)\bigr] = \lim_{x \to a} \left[(x - a) \cdot \frac{f(x) - f(a)}{x - a}\right] = \lim_{x \to a} (x - a) \cdot \lim_{x \to a} \frac{f(x) - f(a)}{x - a} =
0 \cdot f'(a) = 0$.
Thus, f is defined at $x = a$, and $\lim_{x \to a} \bigl[f(x) - f(a)\bigr] = 0$, or
$\lim_{x \to a} f(x) = f(a)$.
It follows that f is continuous at $x = a$.

So, f differentiable at a implies that f is continuous at a. Is the converse true? You
should know several counterexamples that show that the converse is false, that is,
there are functions f continuous at a point a that are not differentiable at a. First of
all, f can be continuous at a where a is an isolated point of the domain of f, and at
such points, the derivative of f is not defined. But even if f is continuous for all real
numbers, f need not have a derivative at a particular a. The best known example is
the absolute value function, $f(x) = |x|$, which is continuous for all real numbers but
fails to have a derivative at $x = 0$. This is because the difference quotient $\frac{f(x) - f(0)}{x - 0}$ is
equal to 1 for all $x > 0$ and $-1$ for all $x < 0$, so the limit of the difference quotient
does not exist at $x = 0$. Of course, the absolute value function has a derivative at all
$x \ne 0$. There is a well-known example known as the Weierstrass function that is
continuous for all real numbers x but does not have a derivative at any point.
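The behavior of the difference quotient of $|x|$ at 0 can be tabulated directly. The Python sketch below evaluates it at points approaching 0 from each side; the helper name `difference_quotient` and the sample points are hypothetical choices, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def difference_quotient(func, a, x):
    """Slope of the secant line through (a, func(a)) and (x, func(x)), for x != a."""
    return (func(x) - func(a)) / (x - a)

# Points 1/10, 1/100, ..., 1/100000 approaching 0.
hs = [Fraction(1, 10**k) for k in range(1, 6)]
right = [difference_quotient(abs, 0, h) for h in hs]
left = [difference_quotient(abs, 0, -h) for h in hs]
print(right)  # every entry is 1
print(left)   # every entry is -1
```

Since the quotient is constantly 1 on one side and constantly $-1$ on the other, the one-sided limits exist but disagree, which is exactly why the two-sided limit fails to exist.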

5.3 Calculating Derivatives


The proof that a function f has a particular derivative at a point a is just a proof about
the limit of a difference quotient, and as such, is no different than a proof of any other
limit. On the other hand, there are some similarities among the proofs of derivatives,
so it is worth working through a few examples. The key observation is that whenever
you need to calculate a derivative directly from the definition, you must calculate the
limit of a difference quotient which, by design, is a fraction whose numerator and
denominator are both approaching zero. In such a case, one would expect to be able
to perform some algebraic manipulation that would result in the $x - a$ expression in
the denominator canceling with an equivalent factor in the numerator. This allows
you to use other limit theorems to complete the evaluation.


For example, consider the function $f(x) = 3x^2 - 8x$. To calculate the derivative
of f at $a = 4$, one needs to evaluate the limit
\[
\lim_{x \to 4} \frac{f(x) - f(4)}{x - 4}
= \lim_{x \to 4} \frac{3x^2 - 8x - (3 \cdot 4^2 - 8 \cdot 4)}{x - 4}
= \lim_{x \to 4} \frac{3x^2 - 8x - 16}{x - 4}
= \lim_{x \to 4} \frac{(3x + 4)(x - 4)}{x - 4}
= \lim_{x \to 4} (3x + 4) = 16.
\]
Since each step of this derivation follows either from rules of algebra or from
the theorems about calculating the limits of various arithmetic combinations of
functions, the calculation given is a complete proof that the derivative of f at $x = 4$
is 16.
In a more general setting, consider proving that the derivative of $f(x) = 5x^4$ at
the point $x = a$ is $f'(a) = 20a^3$. Here you would calculate
\[
\lim_{x \to a} \frac{f(x) - f(a)}{x - a}
= \lim_{x \to a} \frac{5x^4 - 5a^4}{x - a}
= \lim_{x \to a} \frac{5(x - a)\left(x^3 + x^2 a + x a^2 + a^3\right)}{x - a}
\]
\[
= \lim_{x \to a} 5\left(x^3 + x^2 a + x a^2 + a^3\right)
= 5\left(a^3 + a^2 a + a a^2 + a^3\right) = 20a^3.
\]
Again, finding a factor of $x - a$ in the numerator of the difference quotient is the key
to evaluating the needed limit.
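Both cancellations can be double-checked with exact arithmetic. In the Python sketch below, the helper `difference_quotient`, the function names `f` and `p`, and the choice $a = \frac{3}{2}$ for the second example are hypothetical and serve only as an illustration. The code confirms that the difference quotient agrees with the simplified expression at several points near a and shows the gap to the claimed derivative shrinking.

```python
from fractions import Fraction

def difference_quotient(func, a, x):
    """Slope of the secant line through (a, func(a)) and (x, func(x)), for x != a."""
    return (func(x) - func(a)) / (x - a)

def f(x):
    return 3 * x**2 - 8 * x

def p(x):
    return 5 * x**4

# First example: after cancelling x - 4 the quotient is exactly 3x + 4.
a = Fraction(4)
for k in range(1, 5):
    x = a + Fraction(1, 10**k)
    q = difference_quotient(f, a, x)
    print(q == 3 * x + 4, float(q - 16))

# Second example: the quotient is exactly 5(x^3 + x^2 a + x a^2 + a^3).
a = Fraction(3, 2)
for k in range(1, 5):
    x = a + Fraction(1, 10**k)
    q = difference_quotient(p, a, x)
    print(q == 5 * (x**3 + x**2 * a + x * a**2 + a**3), float(q - 20 * a**3))
```

The first column of output confirms the algebra, and the second column shows the difference between the quotient and the limit value tending to 0 as x approaches a.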

5.4 The Arithmetic of Derivatives


One quickly learns in Calculus that although the derivative is defined as a limit
of a difference quotient, there is a small collection of algorithms that reduce the
finding of the derivative of any combination of elementary functions to a fairly
mechanical exercise. The algorithms show you how to take the derivatives of the
sum, difference, product, and quotient of two differentiable functions as well as a
constant multiple of a differentiable function, the inverse of a differentiable function,
and the composition of two differentiable functions. Those rules along with the
knowledge of how to differentiate the elementary functions, $x^n$, $a^x$, $\log_a x$, $\sin x$,
and $\cos x$ give you all the tools necessary to differentiate virtually any function you
are likely to see in a lifetime of applications. This section and the next discuss
the proofs of the theorems that provide these needed algorithms.
The simplest of these results is the theorem that states that if f is a function
differentiable at a and c is any constant, then the function cf is also differentiable at
a with $(cf)'(a) = cf'(a)$. In the proof of this theorem, you would assume that $f'(a)$
exists. That provides for you the limit $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a)$. Since the limit needed
to show that $(cf)'(a) = cf'(a)$ is just a multiple of a known limit, the needed result
follows immediately from the fact that the limit of a constant times a function is the
constant times the limit of the function.


PROOF: If $f'(a)$ exists, and $c$ is a constant, then $(cf)'(a) = cf'(a)$.

Suppose that $f$ has a derivative at the point $a$.
From the definition of derivative $f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a}$.
Then $(cf)'(a) = \lim_{x\to a}\frac{cf(x)-cf(a)}{x-a} = \lim_{x\to a} c\cdot\frac{f(x)-f(a)}{x-a} = c\lim_{x\to a}\frac{f(x)-f(a)}{x-a} = cf'(a)$.
Thus, $(cf)'(a) = cf'(a)$.

To show that the derivative of the sum or difference of two differentiable
functions is the sum or difference of their derivatives, one is faced with finding the
limit of a difference quotient which can easily be written as the sum or difference of
two difference quotients whose limits are already known. Thus, if $f$ and $g$ are two
functions defined on the same domain and both differentiable at $a$, calculating the
derivative of $f + g$ at $a$ requires the limit

$$\lim_{x\to a}\frac{(f+g)(x)-(f+g)(a)}{x-a} = \lim_{x\to a}\frac{[f(x)-f(a)]+[g(x)-g(a)]}{x-a}$$
$$= \lim_{x\to a}\frac{f(x)-f(a)}{x-a} + \lim_{x\to a}\frac{g(x)-g(a)}{x-a} = f'(a)+g'(a)$$

as needed.
PROOF: Let $f$ and $g$ be functions defined on a common domain, and let
$f$ and $g$ both be differentiable at $a$. Then $(f+g)'(a) = f'(a)+g'(a)$ and
$(f-g)'(a) = f'(a)-g'(a)$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$
and $g$ are both differentiable at $a$.
From the definition of derivative $f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a}$ and
$g'(a) = \lim_{x\to a}\frac{g(x)-g(a)}{x-a}$.
Then $(f+g)'(a) = \lim_{x\to a}\frac{(f+g)(x)-(f+g)(a)}{x-a} = \lim_{x\to a}\frac{[f(x)-f(a)]+[g(x)-g(a)]}{x-a} = \lim_{x\to a}\frac{f(x)-f(a)}{x-a} + \lim_{x\to a}\frac{g(x)-g(a)}{x-a} = f'(a)+g'(a)$.
Thus, $(f+g)'(a) = f'(a)+g'(a)$.
Because $(f-g)(x) = f(x)-g(x) = f(x)+(-1)g(x)$, and the derivative
of $(-1)g(x)$ is $-1$ times the derivative of $g(x)$, it follows that $(f-g)'(a) =
\big(f+(-1)g\big)'(a) = f'(a)+(-1)g'(a) = f'(a)-g'(a)$ completing the proof
of the theorem.
Why does the first step in this proof make the assumption that $f$ and $g$ are defined
on the same domain? This is to avoid the embarrassing situation that the intersection
of the domains of $f$ and $g$ isolates the point $a$. For example, if $f$ is defined for all
$x \le 1$ and $g$ is defined for all $x \ge 1$, it could be that both $f'(1)$ and $g'(1)$ are defined,
but the function $f+g$ is defined only at 1, so its derivative cannot be defined. Another
example would be for $f$ to be defined at all rational numbers, and $g$ to be defined at
all rational multiples of $\sqrt{2}$. Each function could be differentiable at each point of
its domain, but $f+g$ is only defined at 0, so its derivative cannot be defined.
It is certainly worth noting here that the theorems discussed so far show that for
functions $f$ and $g$ and constants $a$ and $b$, the derivative of the linear combination of
functions $af(x)+bg(x)$ is the linear combination of the derivatives $af'(x)+bg'(x)$.
In the words of Linear Algebra, this says that the derivative is a linear operator. This
fact alone has a long list of ramifications in Differential Equations and other fields.
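
The linearity of the derivative can be illustrated with a short worked computation. This is only an illustrative sketch: $f(x) = 5x^4$ is the function whose derivative $20a^3$ was computed earlier in the chapter, while the function $g(x) = x^2$ and the constants $3$ and $-2$ are assumptions chosen just for this example.

```latex
% Assumed example: f(x) = 5x^4, g(x) = x^2, constants 3 and -2.
\begin{align*}
g'(a) &= \lim_{x\to a}\frac{x^2-a^2}{x-a} = \lim_{x\to a}(x+a) = 2a,\\
(3f-2g)'(a) &= 3f'(a) - 2g'(a) = 3\cdot 20a^3 - 2\cdot 2a = 60a^3 - 4a.
\end{align*}
```

No new limit has to be evaluated for the combination $3f - 2g$; the two known derivatives are simply combined.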
It is important for the beginning Calculus student to learn that even though the
derivative behaves in an intuitive way with respect to addition and subtraction,
this intuition ceases when discussing the derivative of a product or quotient. The
proof that $(fg)' = fg' + f'g$ involves one trick reminiscent of the proof that the limit
of a product is the product of the limits. That is, one adds and subtracts the same
quantity so that rather than making a change in two different factors at the same
time, one makes a change in one factor at a time. Indeed, the difference quotient
you obtain for the function $fg$ is

$$\frac{f(x)g(x)-f(a)g(a)}{x-a} = \frac{f(x)g(x)-f(x)g(a)+f(x)g(a)-f(a)g(a)}{x-a}$$
$$= f(x)\cdot\frac{g(x)-g(a)}{x-a} + \frac{f(x)-f(a)}{x-a}\cdot g(a).$$

Taking the limits at each step produces the following proof.
PROOF (Product Rule): Let $f$ and $g$ be functions defined on a
common domain, and let $f$ and $g$ both be differentiable at $a$. Then
$(fg)'(a) = f(a)g'(a) + f'(a)g(a)$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$
and $g$ are both differentiable at $a$.
From the definition of derivative $f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a}$ and
$g'(a) = \lim_{x\to a}\frac{g(x)-g(a)}{x-a}$.
Because $f$ is differentiable at $a$, it is continuous at $a$. This implies that
$\lim_{x\to a} f(x) = f(a)$.
Then $(fg)'(a) = \lim_{x\to a}\frac{(fg)(x)-(fg)(a)}{x-a} = \lim_{x\to a}\frac{f(x)g(x)-f(a)g(a)}{x-a} = \lim_{x\to a}\frac{f(x)g(x)-f(x)g(a)+f(x)g(a)-f(a)g(a)}{x-a}$
$= \lim_{x\to a}\left[f(x)\cdot\frac{g(x)-g(a)}{x-a} + \frac{f(x)-f(a)}{x-a}\cdot g(a)\right]$
$= \lim_{x\to a} f(x)\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a} + \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\cdot\lim_{x\to a} g(a) = f(a)g'(a) + f'(a)g(a)$.
Thus, $(fg)'(a) = f(a)g'(a) + f'(a)g(a)$.
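
As a sanity check, the Product Rule can be tested against the limit computed directly for $5x^4$ earlier in the chapter. The factorization $5x^4 = (5x^2)(x^2)$ used below is an assumption made only for this sketch, and the derivatives $10a$ and $2a$ are taken from the constant multiple rule together with the elementary limit $\lim_{x\to a}\frac{x^2-a^2}{x-a} = 2a$.

```latex
% Assumed factorization: f(x) = 5x^2, g(x) = x^2, so (fg)(x) = 5x^4.
\begin{align*}
f'(a) &= 10a, \qquad g'(a) = 2a,\\
(fg)'(a) &= f(a)g'(a) + f'(a)g(a) = 5a^2\cdot 2a + 10a\cdot a^2 = 20a^3.
\end{align*}
```

The result agrees with the value $20a^3$ obtained from the difference quotient of $5x^4$.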


 0
The proof that $\left(\frac{f}{g}\right)' = \frac{gf' - fg'}{g^2}$ involves the same strategy along with the extra
assumption that $g(a) \ne 0$.


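PROOF (Quotient Rule): Let $f$ and $g$ be functions defined on a common
domain, and let $f$ and $g$ both be differentiable at $a$. If $g(a) \ne 0$, then
$\left(\frac{f}{g}\right)'(a) = \frac{g(a)f'(a) - f(a)g'(a)}{\big(g(a)\big)^2}$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$
and $g$ are both differentiable at $a$ with $g(a) \ne 0$.
From the definition of derivative $f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a}$ and
$g'(a) = \lim_{x\to a}\frac{g(x)-g(a)}{x-a}$.
Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that
$\lim_{x\to a} g(x) = g(a)$.
Then
$$\left(\frac{f}{g}\right)'(a) = \lim_{x\to a}\frac{\frac{f}{g}(x)-\frac{f}{g}(a)}{x-a} = \lim_{x\to a}\frac{\frac{f(x)}{g(x)}-\frac{f(a)}{g(a)}}{x-a} = \lim_{x\to a}\frac{\frac{f(x)}{g(x)}-\frac{f(a)}{g(x)}+\frac{f(a)}{g(x)}-\frac{f(a)}{g(a)}}{x-a}$$
$$= \lim_{x\to a}\left[\frac{f(x)-f(a)}{x-a}\cdot\frac{1}{g(x)} + f(a)\cdot\frac{\frac{1}{g(x)}-\frac{1}{g(a)}}{x-a}\right]$$
$$= \lim_{x\to a}\left[\frac{f(x)-f(a)}{x-a}\cdot\frac{1}{g(x)} + \frac{f(a)}{g(x)g(a)}\cdot\frac{g(a)-g(x)}{x-a}\right]$$
$$= \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\cdot\lim_{x\to a}\frac{1}{g(x)} + \lim_{x\to a}\frac{f(a)}{g(x)g(a)}\cdot\lim_{x\to a}\frac{g(a)-g(x)}{x-a}$$
$$= f'(a)\cdot\frac{1}{g(a)} - \frac{f(a)}{\big(g(a)\big)^2}\cdot g'(a) = \frac{f'(a)g(a)-f(a)g'(a)}{\big(g(a)\big)^2}.$$
Thus, $\left(\frac{f}{g}\right)'(a) = \frac{f'(a)g(a)-f(a)g'(a)}{\big(g(a)\big)^2}$.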
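
A brief worked instance may help fix the pattern of the Quotient Rule. The functions $f(x) = x$ and $g(x) = x^2 + 1$ are assumptions chosen for illustration; $g$ is never zero, so the rule applies at every point $a$.

```latex
% Assumed example: f(x) = x, g(x) = x^2 + 1, so f'(a) = 1 and g'(a) = 2a.
\begin{align*}
\left(\frac{f}{g}\right)'(a)
  &= \frac{g(a)f'(a) - f(a)g'(a)}{\big(g(a)\big)^2}
   = \frac{(a^2+1)\cdot 1 - a\cdot 2a}{(a^2+1)^2}
   = \frac{1-a^2}{(a^2+1)^2}.
\end{align*}
```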

5.4.1 Exercises
Write proofs for each of the following statements.
1. For any constant $c$, the function $f(x) = c$ has derivative $f'(x) = 0$.
2. The function $f(x) = x$ has derivative $f'(x) = 1$.
3. For any positive integer $n$, the function $f(x) = x^n$ has derivative $f'(x) = nx^{n-1}$.
4. Any polynomial function $f(x) = a_nx^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0$ has
derivative $f'(x) = na_nx^{n-1} + (n-1)a_{n-1}x^{n-2} + (n-2)a_{n-2}x^{n-3} + \cdots + a_1$.
5. For any positive integer $n$, the function $f(x) = \frac{1}{x^n}$ has derivative $f'(x) = -\frac{n}{x^{n+1}}$.
6. Given the collection of functions $f_1, f_2, f_3, \ldots, f_n$ each defined on the same
domain and each differentiable at $a$, and given constants $c_1, c_2, c_3, \ldots, c_n$, the
derivative $\big(c_1f_1 + c_2f_2 + c_3f_3 + \cdots + c_nf_n\big)'(a) = c_1f_1'(a) + c_2f_2'(a) + c_3f_3'(a) +
\cdots + c_nf_n'(a)$.
7. Given the collection of functions $f_1, f_2, f_3, \ldots, f_n$ each defined on the same
domain and each differentiable at $a$, the derivative $\big(f_1f_2f_3\cdots f_n\big)'(a) =
f_1'(a)f_2(a)f_3(a)\cdots f_n(a) + f_1(a)f_2'(a)f_3(a)\cdots f_n(a) + f_1(a)f_2(a)f_3'(a)\cdots f_n(a) +
\cdots + f_1(a)f_2(a)f_3(a)\cdots f_n'(a)$.

5.5 Chain Rule and Inverse Functions


The Chain Rule shows how to differentiate a function that is a composition of
other functions. Since composition is an invaluable tool for constructing functions,
the Chain Rule deserves its place among the important algorithms for calculating
derivatives. It states that if $g$ is a function differentiable at $a$, and if $f$ is a
function defined on the range of $g$ and differentiable at $g(a)$, then the function
$(f\circ g)(x)$ is differentiable at $a$, and $(f\circ g)'(a) = f'\big(g(a)\big)g'(a)$. To prove this,
you would need to find the limit of the difference quotient $D = \frac{f(g(x)) - f(g(a))}{x-a}$.
Expecting to see the expression $\frac{g(x)-g(a)}{x-a}$ which has limit $g'(a)$, you might try both
multiplying and dividing the difference quotient $D$ by the factor $g(x) - g(a)$ to get
$$\frac{f\big(g(x)\big) - f\big(g(a)\big)}{g(x)-g(a)}\cdot\frac{g(x)-g(a)}{x-a}.$$
This idea leads to the following almost correct proof.

PROOF ATTEMPT: Let $g$ be a function differentiable at $a$, and $f$ be
a function defined on the range of $g$ and differentiable at $g(a)$. Then
$(f\circ g)'(a) = f'\big(g(a)\big)g'(a)$.

Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the
range of $g$ and differentiable at $g(a)$.
From the definition of derivative, $g'(a) = \lim_{x\to a}\frac{g(x)-g(a)}{x-a}$ and
$f'(g(a)) = \lim_{y\to g(a)}\frac{f(y)-f(g(a))}{y-g(a)}$.
Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that
$\lim_{x\to a} g(x) = g(a)$.
Then $(f\circ g)'(a) = \lim_{x\to a}\frac{(f\circ g)(x)-(f\circ g)(a)}{x-a} = \lim_{x\to a}\frac{f(g(x))-f(g(a))}{x-a} =
\lim_{x\to a}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\cdot\frac{g(x)-g(a)}{x-a} = \lim_{x\to a}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a}$.
Therefore, $(f\circ g)'(a) = \lim_{x\to a}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a} = \lim_{y\to g(a)}\frac{f(y)-f(g(a))}{y-g(a)}\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a} = f'(g(a))\cdot g'(a)$.


This proof attempt does include the intuitive reasoning behind why the Chain Rule
works, but the proof is not correct. Can you spot the error? The problem is that even
though $g(x)$ is approaching $g(a)$ as $x$ approaches $a$, there is no guarantee that $g(x)$
is different from $g(a)$. In fact, it is quite easy to construct functions $g(x)$ which are
differentiable at $a$ for which $g(x)$ is equal to $g(a)$ for infinitely many values of $x$
as $x$ approaches $a$. The simplest example is when $g$ is a constant function. A more
complicated example is
$$g(x) = \begin{cases} x^2\sin\frac{1}{x} & \text{if } x \ne 0\\ 0 & \text{if } x = 0\end{cases}$$
which has a derivative of 0 at $x = 0$ and is equal to $g(0) = 0$ at $\frac{1}{n\pi}$ for all nonzero integers $n$. Clearly,
when $g(x) = g(a)$, one cannot both multiply and divide the difference quotient
by $g(x) - g(a)$ and expect to get anything except nonsense. This problem does not
present an enormous hurdle because, in the cases where $g(x) = g(a)$, the difference
quotient $\frac{f(g(x)) - f(g(a))}{x-a}$ is itself equal to 0. A typical way around the problem is to
introduce the function
$$h(y) = \begin{cases} \dfrac{f(y)-f\big(g(a)\big)}{y-g(a)} & \text{if } y \ne g(a)\\[2ex] f'\big(g(a)\big) & \text{if } y = g(a).\end{cases}$$
This function has the nice property that it is equal to the desired difference quotient when $g(x)$ differs
from $g(a)$, and it is continuous at $g(a)$. Introducing this function into the proof gets
around the technical difficulties of the previously attempted proof.
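
The two claims made about the example $g(x) = x^2\sin\frac{1}{x}$ can be checked directly. The following is a sketch of that check; it assumes the standard facts that $|\sin t| \le 1$ for all $t$ and that $\sin t = 0$ whenever $t$ is an integer multiple of $\pi$, neither of which is established in this section.

```latex
% Sketch: g(x) = x^2 sin(1/x) for x \ne 0, and g(0) = 0.
\begin{align*}
\left|\frac{g(x)-g(0)}{x-0}\right| &= \left|x\sin\tfrac{1}{x}\right| \le |x| \to 0
  \text{ as } x\to 0, \text{ so } g'(0) = 0,\\
g\!\left(\tfrac{1}{n\pi}\right) &= \tfrac{1}{n^2\pi^2}\sin(n\pi) = 0 = g(0)
  \text{ for every nonzero integer } n.
\end{align*}
```

Since the points $\frac{1}{n\pi}$ approach 0, every interval around 0 contains values of $x \ne 0$ with $g(x) = g(0)$, which is exactly the situation in which dividing by $g(x) - g(a)$ fails.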

PROOF (Chain Rule): Let $g$ be a function differentiable at $a$, and $f$ be
a function defined on the range of $g$ and differentiable at $g(a)$. Then
$(f\circ g)'(a) = f'\big(g(a)\big)g'(a)$.

Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the
range of $g$ and differentiable at $g(a)$.
From the definition of derivative, $g'(a) = \lim_{x\to a}\frac{g(x)-g(a)}{x-a}$ and $f'(g(a)) =
\lim_{y\to g(a)}\frac{f(y)-f(g(a))}{y-g(a)}$.
Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that
$\lim_{x\to a} g(x) = g(a)$.
Define
$$h(y) = \begin{cases} \dfrac{f(y)-f\big(g(a)\big)}{y-g(a)} & \text{if } y \ne g(a)\\[2ex] f'\big(g(a)\big) & \text{if } y = g(a),\end{cases}$$
and note that $h$ is continuous at $g(a)$.
Then $(f\circ g)'(a) = \lim_{x\to a}\frac{(f\circ g)(x)-(f\circ g)(a)}{x-a} = \lim_{x\to a}\frac{f(g(x))-f(g(a))}{x-a}
= \lim_{x\to a} h\big(g(x)\big)\cdot\frac{g(x)-g(a)}{x-a} = \lim_{x\to a} h\big(g(x)\big)\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a}
= h\big(\lim_{x\to a} g(x)\big)\cdot\lim_{x\to a}\frac{g(x)-g(a)}{x-a} = f'\big(g(a)\big)\cdot g'(a)$.
Therefore, $(f\circ g)'(a) = f'\big(g(a)\big)\cdot g'(a)$.
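
A small worked example shows the Chain Rule agreeing with a direct computation. The functions $f(y) = y^3$ and $g(x) = x^2 + 1$ are assumptions chosen for this sketch, and the derivatives of powers of $x$ are taken as known from Calculus.

```latex
% Assumed example: f(y) = y^3, g(x) = x^2 + 1, so f'(y) = 3y^2 and g'(x) = 2x.
\begin{align*}
(f\circ g)'(a) &= f'\big(g(a)\big)\,g'(a) = 3(a^2+1)^2\cdot 2a = 6a(a^2+1)^2,\\
\text{while directly } (f\circ g)(x) &= x^6 + 3x^4 + 3x^2 + 1, \text{ with derivative}\\
6a^5 + 12a^3 + 6a &= 6a(a^4 + 2a^2 + 1) = 6a(a^2+1)^2.
\end{align*}
```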


Recall that a function $f : A \to B$ is called a bijection if it is both surjective
and injective; that is, for each $y \in B$ there is one and only one $x \in A$ such that
$f(x) = y$. In this case $f$ is a one-to-one correspondence between the points of $A$
and the points of $B$. When $f$ is a bijection, one can define the inverse of $f$ to be the
function $f^{-1} : B \to A$ by letting $f^{-1}(y)$ be the unique value of $x$ such that $f(x) = y$.
In other words $f^{-1}$ is the set of ordered pairs $\{(y, x) \mid (x, y) \in f\}$. From this definition
it is clear that for all $x \in A$, $f^{-1}\big(f(x)\big) = x$, and for all $y \in B$, $f\big(f^{-1}(y)\big) = y$. This
says that $f^{-1}\circ f$ is the identity function on $A$, and $f\circ f^{-1}$ is the identity function
on $B$.
Note that if $f : A \to B$ is not a bijection, then one cannot define $f^{-1}$ as a mapping
from $B$ to $A$. If $f$ is not surjective, then there is a $y \in B$ which is not in the range
of $f$, so there is no way to define $f^{-1}(y)$. If $f$ is not injective, then there is a $y \in B$
such that $f(x) = y$ for more than one value of $x$, and there may be no natural way
to select which $x$ should be $f^{-1}(y)$. For example, the function $f(x) = x^2$ maps the
real numbers into the real numbers. The function is neither surjective (its range is
the nonnegative real numbers) nor injective since $f(2) = f(-2)$. One can restrict the
codomain to the nonnegative real numbers. Then $f$ becomes a surjective function,
but it is still not injective. To get an inverse to $f$ you can substitute a different
function for $f$ which restricts the domain of $f$ to the nonnegative real numbers. If
$f$ is thought of as a function from the nonnegative real numbers to the nonnegative
real numbers, then $f$ is a bijection, and it has the inverse function $\sqrt{x}$.
The same procedure is done to obtain inverses of the trigonometric functions.
For example, the function $f(x) = \sin x$ maps the real numbers to the interval $[-1, 1]$.
The function is surjective, but it is not injective. To obtain an injective function, the
domain of $\sin x$ is restricted to the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. On this interval $\sin x$ is both
injective and surjective and has the inverse $\sin^{-1} y$, sometimes written as $\arcsin y$
(Fig. 5.3).
Fig. 5.3 Restricting $\sin(x)$ to get $\sin^{-1}(x)$

Now suppose that $f$ is bijective and has inverse function $f^{-1}$. If $f$ has a nonzero
derivative at the point $a$, the Chain Rule can be used to find the derivative of $f^{-1}$
at $f(a)$. Indeed, one has that $(f\circ f^{-1})(x) = x$, so the Chain Rule implies that
$f'\big(f^{-1}(x)\big)\cdot(f^{-1})'(x) = 1$, or $(f^{-1})'(x) = \frac{1}{f'\left(f^{-1}(x)\right)}$. Is this conclusion valid? That
is, can you justify taking the derivative of $(f\circ f^{-1})$ using the Chain Rule before you
know that the derivative of $f^{-1}$ exists? The answer is yes, the use of the Chain
Rule is justified here. The proof of the Chain Rule includes taking the limit of
$h\big(g(x)\big)\cdot\frac{g(x)-g(a)}{x-a}$ which is broken into the product of the limit of $h\big(g(x)\big)$ and the
limit of $\frac{g(x)-g(a)}{x-a}$. In its application to the equation $(f\circ f^{-1})(x) = x$, where $g = f^{-1}$, one can rewrite
the difference quotient as $\frac{g(x)-g(a)}{x-a} = \frac{\;\frac{x-a}{x-a}\;}{h\left(g(x)\right)}$. Now there is no a priori assumption
that the limit of $\frac{g(x)-g(a)}{x-a}$ exists; its limit is just the limit of the quotient $\frac{\;\frac{x-a}{x-a}\;}{h\left(g(x)\right)}$ which
exists as the quotient of limits.
As an application of differentiating an inverse function, consider finding the
derivative of $\sqrt[n]{x}$ for integer values of $n \ne 0$. It is known that, for integer values of
$n$, the derivative of $f(x) = x^n$ is $f'(x) = nx^{n-1}$. For $n \ne 0$ and $x > 0$, the inverse of
the function $f$ is $f^{-1}(x) = \sqrt[n]{x}$, so its derivative must be $\frac{1}{f'\left(f^{-1}(x)\right)} = \frac{1}{n\left(\sqrt[n]{x}\right)^{n-1}}$.
5.6 Increasing Functions, Decreasing Functions, and Critical Points

Perhaps the most important property of the derivative is its ability to determine
where a function is increasing or decreasing. Let $f$ be a function defined on an
interval $I$. If for all $x$ and $y$ in $I$, $x < y$ implies that $f(x) \le f(y)$, then $f$ is said to be
increasing on $I$, and if $x < y$ implies that $f(x) < f(y)$, then $f$ is said to be strictly
increasing on $I$. Similarly, if $x < y$ implies that $f(x) \ge f(y)$, then $f$ is said to be
decreasing on $I$, and if $x < y$ implies that $f(x) > f(y)$, then $f$ is said to be strictly
decreasing on $I$.
So what can be said if it is known that function $f$ has a positive derivative at $a$?
What is known is that the difference quotient $\frac{f(x)-f(a)}{x-a}$ has a positive limit, so it is
positive when $x$ is close to $a$. How close to $a$ does $x$ have to be? What the limit
definition gives you is that for any $\epsilon > 0$, you can find a $\delta > 0$ so that $\frac{f(x)-f(a)}{x-a}$ is
within $\epsilon$ of its limit, $f'(a)$, which is positive. So, if $\epsilon > 0$ is chosen to be $f'(a)$, then
the difference quotient which has to be within $f'(a)$ of $f'(a)$ will have to be positive.
Thus, for $x$ within $\delta$ of $a$ (and not equal to $a$), the difference quotient $\frac{f(x)-f(a)}{x-a}$ is
positive. Then if $x > a$, it follows that $f(x) > f(a)$, and if $x < a$, it follows that
$f(x) < f(a)$. Does this mean that $f$ is increasing? The answer is no. There are
functions with a positive derivative at $a$ which are not increasing over any open
interval containing $a$. An example of such a function is given in the last section of
this chapter. All one can say is the following.


PROOF: Let $f$ be a function with a positive derivative at $a$. Then there is a
$\delta > 0$ such that for all $x$ in the domain of $f$ with $|x - a| < \delta$, if $x > a$, then
$f(x) > f(a)$, and if $x < a$, then $f(x) < f(a)$. Similarly, if $f$ has a negative
derivative at $a$, there is a $\delta > 0$ such that for all $x$ in the domain of $f$ with
$|x - a| < \delta$, if $x > a$, then $f(x) < f(a)$, and if $x < a$, then $f(x) > f(a)$.

Let $f$ be a function with a positive derivative at $a$.
Then $\lim_{x\to a}\frac{f(x)-f(a)}{x-a} = f'(a) > 0$.
From the definition of limit, there is a $\delta > 0$ such that if $x$ is in the domain
of $f$ and $0 < |x - a| < \delta$, then $\left|\frac{f(x)-f(a)}{x-a} - f'(a)\right| < f'(a)$ implying that
$\frac{f(x)-f(a)}{x-a} > 0$.
If $x$ satisfies $a < x < a + \delta$, then $x - a > 0$ and $\frac{f(x)-f(a)}{x-a} > 0$, so $f(x) - f(a) >
0$, and $f(x) > f(a)$.
If $x$ satisfies $a > x > a - \delta$, then $x - a < 0$ and $\frac{f(x)-f(a)}{x-a} > 0$, so $f(x) - f(a) <
0$, and $f(x) < f(a)$.
This proves the first part of the theorem.
If instead $f'(a) < 0$, apply the above argument to the function $-f$ to obtain
the analogous result.

A function $f$ defined at the point $a$ is said to have a relative maximum
(sometimes called a local maximum) at $a$ if there is a $\delta > 0$ such that for all $x$
in the domain of $f$ satisfying $|x - a| < \delta$ the value of $f(a) \ge f(x)$. Similarly, one can
define relative minimum (or local minimum) where, in this case, $f(a) \le f(x)$. If $f$
has a relative maximum or relative minimum at $a$, one can say that it has a relative
extremum (sometimes called a local extremum) at $a$.
Another very important property of the derivative is its ability to identify points
where a function has relative extrema. This ability follows immediately from the
previous theorem.
PROOF: Let $f$ be a function defined on an open interval containing $a$, and
let $f$ be differentiable at $a$. Then if $f$ has a relative extremum at $a$, the value
of $f'(a)$ is 0.

Let $f$ be a function defined on an open interval $I$ containing the point $a$.
Assume that $f$ is differentiable at $a$ and has a relative maximum at $a$.
If $f'(a)$ is positive, there is a $\delta > 0$ such that for all $x \in I$ with $a < x < a + \delta$,
$f(x) > f(a)$. This contradicts the fact that $f$ has a relative maximum at $a$.
If $f'(a)$ is negative, there is a $\delta > 0$ such that for all $x \in I$ with $a - \delta < x < a$,
$f(x) > f(a)$. This contradicts the fact that $f$ has a relative maximum at $a$.
Thus, $f'(a)$ must be 0.
Applying this argument to the function $-f$ shows that if $f$ has a relative
minimum at $a$, it must be that $f'(a) = 0$.

Fig. 5.4 This graph of $f(x)$ on the interval $[a, h]$ shows relative maxima at $b$, $d$, and $g$, relative
minima at $a$, $c$, $e$, and $h$, an absolute maximum at $b$, and an absolute minimum at $h$. The derivative
$f'(x)$ does not exist at $x = d$

Any student of Calculus will see applications of this result where one is asked
to identify relative extrema for a particular function, and applications to what are
fondly called Max/Min problems where one is first asked to construct an appropriate
function to fit the application and then find a particular extremum of that function.
One defines a critical point of $f$ to be a value $a$ where either $f'(a) = 0$ or $f'(a)$ does
not exist. Not all of these points will end up being relative extrema, for some may just
be saddle points of $f$ where $f'(a) = 0$ but $f$ has no relative extremum at that point.
For example, the function $f(x) = x^3$ has a saddle point at $x = 0$ where $f'$ is 0, but $f$
is a strictly increasing function over the entire real line. A function is said to have an
absolute maximum (sometimes called a global maximum) at $a$ if $f$ is defined at
$a$, and for all other $x$ in the domain of $f$, $f(x) \le f(a)$. The term absolute minimum
(sometimes called a global minimum) is defined in the analogous way with $f(x) \ge
f(a)$, and an absolute extremum (sometimes called a global extremum) is either an
absolute maximum or absolute minimum. The theorem about relative extrema shows
that if $f$ is defined on any interval $I$, then the only places $f$ can have relative extrema
or absolute extrema are critical points or at endpoints of $I$. You should be able to
identify example functions where each of these criteria gives extrema (Fig. 5.4).

5.6.1 Exercises
Identify the relative extrema and absolute extrema of the given functions on the
given intervals.
1. $f(x) = x^3 - 8x$ on the interval $[-3, 2]$
2. $f(x) = 3x + \frac{3}{x}$ on the interval $[1, \infty)$
3. $f(x) = |x^2 - 16|$ on the interval $[-5, 6]$
4. $f(x) = 3 - 2\sqrt[3]{x^2}$ on the interval $[-2, 2]$


5.7 The Mean Value Theorem


The Mean Value Theorem is one of the better known results about derivatives, and
for good reason. It is invoked frequently when one needs to estimate the maximum
possible change between the values of a function at two different points. This can
be a valuable tool when finding approximations to functions or when it is necessary
to know how much variation is exhibited by a particular function. The theorem
states that the average rate of change of a function between two points $a$ and $b$
given by $\frac{f(b)-f(a)}{b-a}$ is equal to the value of the derivative $f'(c)$ for some $c$ between $a$
and $b$. This allows you to use information about the derivative to make statements
about the change $f(b) - f(a)$. The theorem is usually proved in two steps by first
proving Rolle's Theorem which is a simpler version of the Mean Value Theorem.
Rolle's Theorem states that if $a < b$, and if $f$ is a function continuous on the interval
$[a, b]$, differentiable on the interval $(a, b)$, and satisfying $f(a) = f(b)$, then there is a
$c \in (a, b)$ for which $f'(c) = 0$.
What tools do you have to prove this result? Your proof needs to conclude that
$f'(c) = 0$. Think through what you know about derivatives, and see if any of the
results conclude that the derivative is equal to 0. The only results that come to mind
are the result that the derivative of any constant function is 0, and the result that
if $f$ reaches a relative extremum at a point where the function is differentiable,
then its derivative at that point must be 0. It is unlikely that the first of these two
results will be of much help except in the very special case where $f$ is a constant
function. So how can you use the result about extreme values to show that there
is a place where the function has a derivative of 0? What you do know is that $f$ is
continuous on a closed interval $[a, b]$, and the Extreme Value Theorem states that
such a function obtains its maximum and minimum values on this interval. You also
know that these extreme values can only occur at places where the derivative is 0,
where the derivative does not exist, or at the endpoints of the interval. There are
no places on $(a, b)$ where the derivative does not exist, but could both the maximum
and minimum occur at endpoints of the interval? The hypothesis of Rolle's Theorem
says that $f(a) = f(b)$, so the only way that the two endpoints can be both maximum
and minimum values of $f$ on the interval is for $f$ to be constant on the interval. In
the case of a constant function, the theorem is clearly true. In any other case, it
could be that $f(a)$ and $f(b)$ are maximum values for $f$ or minimum values for $f$, but
they cannot be both. If $f$ is not constant, its maximum and minimum values must
be different. That guarantees that $f$ must have either an absolute maximum or an
absolute minimum (possibly both) between $a$ and $b$. That gives the result (Fig. 5.5).

Fig. 5.5 The proof of Rolle's Theorem finds an extreme point $c$ between $a$ and $b$ for which
$f'(c) = 0$

PROOF (Rolle's Theorem): For $a < b$, let $f$ be a function continuous
on the interval $[a, b]$ and differentiable on the interval $(a, b)$ satisfying
$f(a) = f(b)$. Then there is a $c \in (a, b)$ for which $f'(c) = 0$.

For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and
differentiable on the interval $(a, b)$ satisfying $f(a) = f(b)$.
Because $f$ is continuous on the closed bounded interval $[a, b]$, it obtains a
maximum and a minimum value there.
If both the maximum and minimum values of $f$ occur at endpoints of the
interval, then, since $f(a) = f(b)$, the maximum and minimum values of $f$
are equal, and $f$ is constant on the interval $[a, b]$.
In this case, $f'(c) = 0$ for each $c \in (a, b)$, and the conclusion of the theorem
holds.
If the maximum and minimum values of $f$ do not both occur at endpoints of
the interval, then there must be a $c \in (a, b)$ such that $f$ reaches a maximum
or a minimum value at $c$.
In this case, $f'(c) = 0$, and the conclusion of the theorem holds.
In either case, the conclusion of the theorem holds which completes the
proof.
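
Rolle's Theorem can be seen at work in a simple case. The function $f(x) = x^2 - 4x$ on the interval $[0, 4]$ is an assumption chosen for this sketch, and the derivative $f'(x) = 2x - 4$ is taken from the polynomial rule stated in the exercises of Section 5.4.

```latex
% Assumed example: f(x) = x^2 - 4x on [0, 4].
\begin{align*}
f(0) &= 0 = f(4), \text{ so the hypothesis } f(a) = f(b) \text{ holds},\\
f'(c) &= 2c - 4 = 0 \iff c = 2 \in (0, 4),\\
f(2) &= -4, \text{ which is the minimum value of } f \text{ on } [0, 4].
\end{align*}
```

Here the point $c$ promised by the theorem is the interior point where $f$ reaches its minimum, exactly as in the proof.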
Rolle's Theorem takes care of the case where $f(a) = f(b)$. To prove the Mean
Value Theorem in the more general case where $f(a)$ need not equal $f(b)$, you would
want to reduce this general case to the previously proved case where $f(a)$ and $f(b)$
are equal. An easy way to do this is to subtract a linear function from $f$ to get a
new function $h$ which does satisfy the hypothesis of Rolle's Theorem. This linear
function can be any linear function that takes on a value at $b$ which differs by
$f(b) - f(a)$ from the value it takes on at $a$. One such function is $\big(f(b) - f(a)\big)\cdot\frac{x-a}{b-a}$
because it takes on the value 0 at $a$ and $f(b) - f(a)$ at $b$ (Fig. 5.6).

Fig. 5.6 Point $c$ between $a$ and $b$ where the tangent line is parallel to the secant line
from $a$ to $b$

PROOF (Mean Value Theorem): For $a < b$, let $f$ be a function continuous
on the interval $[a, b]$ and differentiable on the interval $(a, b)$. Then there
is a $c \in (a, b)$ for which $f'(c) = \frac{f(b)-f(a)}{b-a}$.

For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and
differentiable on the interval $(a, b)$.
Let $h(x) = f(x) - \big(f(b) - f(a)\big)\cdot\frac{x-a}{b-a}$.
Since both $f(x)$ and $\big(f(b) - f(a)\big)\cdot\frac{x-a}{b-a}$ are continuous on $[a, b]$ and
differentiable on $(a, b)$, $h$ is also continuous on $[a, b]$ and differentiable on
$(a, b)$.
$h(b) = f(b) - \big(f(b) - f(a)\big)\cdot\frac{b-a}{b-a} = f(b) - \big(f(b) - f(a)\big) = f(a) = h(a)$.
Thus, $h$ satisfies the hypothesis of Rolle's Theorem, so there is a $c \in (a, b)$
such that $h'(c) = 0$.
Then $0 = h'(c) = f'(c) - \big(f(b) - f(a)\big)\cdot\frac{1}{b-a}$, so $f'(c) = \frac{f(b)-f(a)}{b-a}$.
Therefore, there is a $c \in (a, b)$ with $f'(c) = \frac{f(b)-f(a)}{b-a}$ which completes the
proof.
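
For a concrete function the point $c$ in the Mean Value Theorem can sometimes be found explicitly. The sketch below uses the assumed example $f(x) = x^2$ on an arbitrary interval $[a, b]$ with $a < b$, with $f'(x) = 2x$ taken as known.

```latex
% Assumed example: f(x) = x^2 on [a, b] with a < b.
\begin{align*}
\frac{f(b)-f(a)}{b-a} &= \frac{b^2-a^2}{b-a} = a + b,\\
f'(c) &= 2c = a + b \iff c = \frac{a+b}{2} \in (a, b).
\end{align*}
```

For this particular function the tangent line is parallel to the secant line exactly at the midpoint of the interval; for other functions $c$ need not be the midpoint, and the theorem only guarantees that some such $c$ exists.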
The following are two instructive applications of the Mean Value Theorem. First,
if you know that a function f is differentiable on an interval, and its derivative is
nonnegative on that interval, then the function must be increasing on the interval. To
show that a function is increasing, you need to show that if x and y are in the interval
with $x < y$, then $f(x) \le f(y)$. This would follow from knowing that if $y - x > 0$,
then the difference quotient $\frac{f(y)-f(x)}{y-x} \ge 0$. What the Mean Value Theorem gives you
is that this difference quotient is equal to the derivative of $f$ at some point $c$ between
$x$ and $y$. So, if you know that the derivative on the interval is always nonnegative,
then the difference quotient must be nonnegative as needed.


PROOF: Let $f$ be a function whose derivative is nonnegative at every point
of an interval. Then $f$ is an increasing function on that interval.

Let $f$ be a function whose derivative is nonnegative at each point of the
interval $I$.
Let $x$ and $y$ be in $I$ with $x < y$.
Then by the Mean Value Theorem, there is a $c$ between $x$ and $y$ such that
$\frac{f(y)-f(x)}{y-x} = f'(c)$.
Since $I$ is an interval and $x$ and $y$ are in $I$, $c$ is also in $I$, implying that
$f'(c) \ge 0$.
Thus, $\frac{f(y)-f(x)}{y-x} \ge 0$ so $(y - x)\cdot\frac{f(y)-f(x)}{y-x} = f(y) - f(x) \ge 0$.
Therefore, $f$ is increasing on $I$.
Clearly, if $f'$ is strictly positive on an interval, then you can prove that $f$ is strictly
increasing on the interval. This can be done by altering the above proof by changing
the greater than or equal signs to greater than signs where needed. Is the converse of
the above theorem true? Well, one cannot conclude that a function is differentiable
on an interval by just knowing that the function is increasing there. But what if
you are given a differentiable function that is increasing? What can you conclude
about the derivative? If a function is increasing, it does mean that every difference
quotient $\frac{f(y)-f(x)}{y-x}$ will be greater than or equal to 0, and, thus, the derivative which
is the limit of such difference quotients will have to be greater than or equal to 0. If
$f$ is strictly increasing, can you conclude that its derivative is positive? In this case
you cannot. You can conclude that all difference quotients will be positive, but the
limit of positive difference quotients can be 0. For example, $f(x) = x^3$ is a function
differentiable on the entire real line, and it is strictly increasing, but its derivative is
0 at $x = 0$.
Another important consequence of the Mean Value Theorem is that if a function
has a derivative equal to 0 at every point of an interval, then f is constant on that
interval. Again, this follows directly from what you can say about any difference
quotient.
PROOF: Let $f$ be a function whose derivative is 0 at every point of an
interval. Then $f$ is constant on that interval.

Let $f$ be a function whose derivative is 0 at each point of the interval $I$.
Let $x$ and $y$ be in $I$ with $x < y$.
Then by the Mean Value Theorem, there is a $c$ between $x$ and $y$ such that
$\frac{f(y)-f(x)}{y-x} = f'(c)$.
Since $I$ is an interval and $x$ and $y$ are in $I$, $c$ is also in $I$, implying that
$f'(c) = 0$.
Thus, $\frac{f(y)-f(x)}{y-x} = 0$ so $(y - x)\cdot\frac{f(y)-f(x)}{y-x} = f(y) - f(x) = 0$.
Therefore, $f(x) = f(y)$ for all $x$ and $y$ in the interval, and, thus, $f$ is constant.


How important is it that the set where $f'$ is 0 is an interval? The fact that the
set is an interval is crucial. For example, the function
$$f(x) = \begin{cases} 0 & \text{if } x < 0\\ 1 & \text{if } x > 0\end{cases}$$
is not defined at 0. The derivative, $f'$, is equal to 0 at each point of the domain
of $f$, but clearly, $f$ is not a constant function, although it is constant on each
interval contained in its domain. Looking back to the previous theorem, note that
the function $f(x) = -\frac{1}{x}$ has a strictly positive derivative at each point of its domain,
but, again, its domain does not include 0. This function is strictly increasing on
each interval contained in its domain, but it is not an increasing function because
$f(-1) > f(1)$.

5.7.1 Exercises
Write proofs for each of the following statements.
1. If f is a function whose derivative is negative for all points in an interval, then f
is a decreasing function on the interval.
2. If $f$ and $g$ are functions differentiable on an interval with $f'(x) = g'(x)$ for each
$x$ in the interval, then there is a constant $C$ such that $f(x) = g(x) + C$ for all $x$ in
the interval.
3. If $f(0) = g(0)$ and $f'(x) \le g'(x)$ for each $x \ge 0$, then $f(x) \le g(x)$ for each $x > 0$.

5.8 L'Hôpital's Rule

It seems like most students who take Calculus remember L'Hôpital's Rule. Even
those who do not remember what the rule states seem to remember its name.
Perhaps this is because it is so much fun to pronounce, but more students remember
L'Hôpital's Rule than some far more important results such as the Fundamental
Theorem of Calculus. L'Hôpital's Rule states that if $f$ and $g$ are differentiable
functions defined on an interval containing the point $a$ with $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) =
0$, then $\lim_{x\to a}\frac{f'(x)}{g'(x)} = L$ implies that $\lim_{x\to a}\frac{f(x)}{g(x)} = L$. This is very useful because the
theorem stating that the limit of a quotient is the quotient of the limits does not
apply in cases when the denominator has a limit of 0.
How would you prove L'Hôpital's Rule? You might try to prove it by using the
Mean Value Theorem because the quotient you are considering is

$$\frac{f(x)}{g(x)} = \frac{f(x)-f(a)}{g(x)-g(a)} = \frac{\;\dfrac{f(x)-f(a)}{x-a}\;}{\dfrac{g(x)-g(a)}{x-a}}.$$


This is not exactly correct because, as far as you know, $f(a)$ and $g(a)$ might not
even be defined, and if they are, they need not be equal to $\lim_{x\to a} f(x)$ and $\lim_{x\to a} g(x)$.
This is not a big stumbling block, because you can always redefine $f$ and $g$ at $a$ to
be equal to 0 without changing the result of the theorem. You also would need to
know that $g(x) \ne g(a)$ for $x$ near $a$ so that the needed quotient can be calculated.
Once the quotient of $f$ and $g$ is rewritten as the quotient of the difference quotient
of $f$ and the difference quotient of $g$, you can apply the Mean Value Theorem to
replace the difference quotients with derivatives, and then take the limit. It might
look something like the following.
PROOF ATTEMPT: Let both $f$ and $g$ be functions differentiable for
all $x \ne a$ in an interval which contains $a$. Assume that
$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \ne 0$ for all $x \ne a$ in the
interval. Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let $f$ and $g$ be functions differentiable for all $x \ne a$ in an interval which
contains $a$.

Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \ne 0$ for all $x$ in the
interval with $x \ne a$.

Assume that $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Without loss of generality, it can be assumed that $f(a) = g(a) = 0$ because
redefining the functions at $a$ does not change the limits at $a$ of $f$, $g$, $f'$, $g'$, or
their ratios. With $f(a) = g(a) = 0$, both $f$ and $g$ are continuous at $x = a$.

For $x$ in the given interval with $x \ne a$, both $f$ and $g$ are continuous on the
closed interval with endpoints at $x$ and $a$, and both $f$ and $g$ are differentiable
on the open interval with these endpoints.

Thus, by the Mean Value Theorem, there is a $c_f(x)$ between $x$ and $a$ such
that $f'\big(c_f(x)\big) = \frac{f(x) - f(a)}{x - a}$, and there is a $c_g(x)$ between $x$ and $a$ such that
$g'\big(c_g(x)\big) = \frac{g(x) - g(a)}{x - a}$.

Because $c_f(x)$ and $c_g(x)$ are both between $x$ and $a$, $\lim_{x \to a} c_f(x) = \lim_{x \to a} c_g(x) = a$.

Then because $f(a) = g(a) = 0$,

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)} = \lim_{x \to a} \frac{\dfrac{f(x) - f(a)}{x - a}}{\dfrac{g(x) - g(a)}{x - a}} = \lim_{x \to a} \frac{f'\big(c_f(x)\big)}{g'\big(c_g(x)\big)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.$$

This completes the proof.


There is a significant problem with this proof. The problem stems from the fact
that, although both functions $c_f(x)$ and $c_g(x)$ do approach $a$ as $x$ approaches $a$, the
two functions can approach $a$ at different rates. Why is this a problem? Consider
calculating $\lim_{x \to 0} \frac{x^2}{x}$, a limit which is clearly equal to 0. But what if $c_f(x) = x$
and $c_g(x) = x^2$? Even though it is true that $c_f(x)$ and $c_g(x)$ both approach 0 as
$x$ approaches 0, the limit $\lim_{x \to 0} \frac{c_f(x)^2}{c_g(x)} = 1$. Just knowing that $c_f(x)$ and $c_g(x)$ are
approaching $a$ does not allow you to use both of these expressions in place of $x$
when taking the limit. What the proof attempt does show is that

$$\lim_{x \to 0} \frac{f(x)}{g(x)} = \frac{\lim_{x \to 0} f'(x)}{\lim_{x \to 0} g'(x)},$$

a result that is not as useful as L'Hôpital's Rule.
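The difficulty with evaluating the numerator and denominator at different points can be seen numerically. The following Python sketch is only an illustration; the function names are hypothetical and chosen for this example. It compares $x^2/x$, where the same point is used in both places, with $c_f(x)^2/c_g(x)$ for $c_f(x) = x$ and $c_g(x) = x^2$.

```python
def same_point_ratio(x):
    """Evaluate x**2 / x, using the same point x in numerator and denominator."""
    return x**2 / x


def different_point_ratio(x):
    """Evaluate c_f(x)**2 / c_g(x) with c_f(x) = x and c_g(x) = x**2."""
    c_f = x
    c_g = x**2
    return c_f**2 / c_g


# Sample points approaching 0 from the right.
samples = [10.0 ** (-k) for k in range(1, 6)]
for x in samples:
    # The first ratio shrinks along with x; the second stays equal to 1.
    print(x, same_point_ratio(x), different_point_ratio(x))
```

Both $c_f(x)$ and $c_g(x)$ tend to 0 here, yet the two ratios have different limits, which is exactly the gap in the proof attempt.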


A second less crucial problem with this proof attempt is that it defines $c_f(x)$ to
be the value of $c$ such that $\frac{f(x) - f(a)}{x - a} = f'(c)$. But this condition may well be satisfied
by more than one value of $c$, so there is a problem with which of the possible values
of $c$ is chosen. One can get around this difficulty, but that still does not address the
previously stated problem.
A common way to correct the problem in the proof attempt is to use a more
powerful version of the Mean Value Theorem known as the Extended Mean Value
Theorem or sometimes as the Cauchy Mean Value Theorem. It allows you to select
one value of $c$ so that $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a)}{g(x) - g(a)}$, that is, it allows you to set the ratio of
derivatives equal to the ratio of the difference quotients at a single value of $c$ rather
than selecting one value of $c$ for the numerator and a possibly different value
of $c$ for the denominator. One can prove the Extended Mean Value Theorem by
manipulating the desired relation $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a)}{g(x) - g(a)}$. This equation can be rewritten as
$f'(c)[g(x) - g(a)] = g'(c)[f(x) - f(a)]$ and then $f'(c)[g(x) - g(a)] - g'(c)[f(x) - f(a)] = 0$.
This may be confusing because there are three variables involved, $x$, $a$, and $c$,
but you can make better sense of it by thinking of $x$ and $a$ as being fixed. That
is, if you define the function $h(t) = f(t)[g(x) - g(a)] - g(t)[f(x) - f(a)]$, then
$h'(c) = f'(c)[g(x) - g(a)] - g'(c)[f(x) - f(a)]$ as needed. How do you know that
there is a $c$ such that $h'(c) = 0$? That follows from Rolle's Theorem because it is
easy to verify that $h(x) = h(a)$.
PROOF (Extended Mean Value Theorem): For $a < b$, let both $f$ and $g$ be
functions continuous on $[a, b]$ and differentiable on $(a, b)$. Then there is a
$c \in (a, b)$ such that $f'(c)[g(b) - g(a)] = g'(c)[f(b) - f(a)]$.

Let $a < b$ and assume $f$ and $g$ are functions continuous on $[a, b]$ and
differentiable on $(a, b)$.

For $x \in [a, b]$ define $h(x) = f(x)[g(b) - g(a)] - g(x)[f(b) - f(a)]$.

Then $h$ is also continuous on $[a, b]$ and differentiable on $(a, b)$.

Note that $h(a) = f(a)[g(b) - g(a)] - g(a)[f(b) - f(a)] = f(a)g(b) - g(a)f(b)$,
and
$h(b) = f(b)[g(b) - g(a)] - g(b)[f(b) - f(a)] = f(a)g(b) - g(a)f(b) = h(a)$.

Thus, $h$ satisfies the hypothesis of Rolle's Theorem on the interval $[a, b]$.

It follows that there is a $c \in (a, b)$ such that $h'(c) = 0$, so
$f'(c)[g(b) - g(a)] - g'(c)[f(b) - f(a)] = 0$.

This is equivalent to $f'(c)[g(b) - g(a)] = g'(c)[f(b) - f(a)]$ which is the
conclusion of the theorem.
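As a concrete check of the theorem, the following Python sketch locates such a point $c$ numerically. The choices $f(t) = t^3$, $g(t) = t^2$, and $[a, b] = [1, 2]$ are assumptions made only for illustration, and the helper `bisect_root` is a hypothetical name. Bisection is usable here because $h'$ happens to change sign between the endpoints; Rolle's Theorem itself does not promise that.

```python
def f(t):
    return t**3


def g(t):
    return t**2


def f_prime(t):
    return 3 * t**2


def g_prime(t):
    return 2 * t


a, b = 1.0, 2.0


def h_prime(t):
    """Derivative of h(t) = f(t)[g(b) - g(a)] - g(t)[f(b) - f(a)]."""
    return f_prime(t) * (g(b) - g(a)) - g_prime(t) * (f(b) - f(a))


def bisect_root(func, lo, hi, tol=1e-12):
    """Locate a sign change of func in [lo, hi] by repeated halving."""
    value_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        value_mid = func(mid)
        if (value_mid < 0) == (value_lo < 0):
            lo, value_lo = mid, value_mid
        else:
            hi = mid
    return (lo + hi) / 2


c = bisect_root(h_prime, a, b)
print(c)  # close to 14/9, the root of 9t^2 - 14t in (1, 2)
```

For these functions $h'(t) = 9t^2 - 14t$, so the point found should agree with $c = 14/9$.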


Now the Extended Mean Value Theorem can be used to give a correct proof of
L'Hôpital's Rule.
PROOF (L'Hôpital's Rule, Part 1): Let $f$ and $g$ be functions differentiable
for all $x \ne a$ in an open interval which contains $a$. Assume that
$\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \ne 0$ for all $x \ne a$ in the interval.
Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let $f$ and $g$ be functions differentiable for all $x \ne a$ in an open interval
which contains $a$.

Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \ne 0$ for all $x$ in the interval
with $x \ne a$.

Assume that $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Without loss of generality, it can be assumed that $f(a) = g(a) = 0$ because
redefining the functions at $a$ does not change the limits at $a$ of $f$, $g$, $f'$, $g'$, or
their ratios. With $f(a) = g(a) = 0$, both $f$ and $g$ are continuous at $x = a$.

Let $\varepsilon > 0$ be given.

Since $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, there is a $\delta > 0$ such that $0 < |x - a| < \delta$ implies that
$\frac{f'(x)}{g'(x)}$ is within $\varepsilon$ of $L$.

Fix $x$ in the given interval with $0 < |x - a| < \delta$.

Since $f$ and $g$ are continuous on the closed interval from $a$ to $x$ and
differentiable on the open interval from $a$ to $x$, $f$ and $g$ satisfy the hypothesis
of the Extended Mean Value Theorem on the interval from $a$ to $x$.

It follows that there is a $c$ between $x$ and $a$ such that
$f'(c)[g(x) - g(a)] = g'(c)[f(x) - f(a)]$.

By assumption $g'$ is not 0, so $g'(c) \ne 0$. Also the Mean Value Theorem
shows that $g(x) - g(a) = g'(t)(x - a)$ for some $t$ between $x$ and $a$, and this
shows $g(x) - g(a) \ne 0$.

It follows that $\frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)}$.

Thus, $\left| \frac{f(x)}{g(x)} - L \right| = \left| \frac{f(x) - f(a)}{g(x) - g(a)} - L \right| = \left| \frac{f'(c)}{g'(c)} - L \right| < \varepsilon$.

This implies that $\lim_{x \to a} \frac{f(x)}{g(x)} = L$ which proves the theorem.
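A numerical illustration can make the statement concrete, although it proves nothing. The Python sketch below uses $f(x) = 1 - \cos x$ and $g(x) = x^2$ at $a = 0$, an example chosen here as an assumption rather than one taken from the text. Both functions tend to 0, $g'(x) = 2x$ is nonzero for $x \ne 0$, and both quotients should approach $\frac{1}{2}$.

```python
import math


def quotient(x):
    """f(x) / g(x) with f(x) = 1 - cos(x) and g(x) = x**2."""
    return (1 - math.cos(x)) / x**2


def derivative_quotient(x):
    """f'(x) / g'(x) = sin(x) / (2x)."""
    return math.sin(x) / (2 * x)


# Moderately small x only: for very small x the subtraction 1 - cos(x)
# loses accuracy in floating point.
for k in range(1, 4):
    x = 10.0 ** (-k)
    print(x, quotient(x), derivative_quotient(x))
```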

L'Hôpital's Rule also holds in cases where $\lim_{x \to a} g(x)$ is infinite rather than
zero.


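PROOF (L'Hôpital's Rule, Part 2): Let $f$ and $g$ be functions differentiable
at each $x \ne a$ in an open interval which contains $a$. Assume that $\lim_{x \to a} g(x)$ is either
positive or negative infinity, and $g'(x) \ne 0$ for all $x$ in the interval with
$x \ne a$. Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let $f$ and $g$ be functions differentiable at each $x \ne a$ in an open interval which contains $a$.

Assume that $\lim_{x \to a} g(x)$ is positive or negative infinity, $g'(x) \ne 0$ for all $x$ in
the interval with $x \ne a$, and $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Let $\varepsilon > 0$ be given.

Because $g'(x)$ is never 0, the Mean Value Theorem shows that if $x$ and
$y$ are distinct points of the interval on the same side of $a$, then $g(x) \ne g(y)$.

Since $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, there is a $\delta_0$ such that $0 < |x - a| < \delta_0$ implies that $\frac{f'(x)}{g'(x)}$
is within $\frac{\varepsilon}{2}$ of $L$.

Fix $x$ in the given interval with $0 < |x - a| < \delta_0$.

Since $f$ and $g$ are differentiable between $x$ and $a$ and continuous at $x$, for
any $y$ between $x$ and $a$ it follows from the Extended Mean Value Theorem
that there is a $c$ between $x$ and $y$ such that $\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{f'(c)}{g'(c)}$. Thus, $\frac{f(y) - f(x)}{g(y) - g(x)}$ is
within $\frac{\varepsilon}{2}$ of $L$.

Note that

$$\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{\dfrac{f(y)}{g(y)} - \dfrac{f(x)}{g(y)}}{1 - \dfrac{g(x)}{g(y)}}.$$

Because $\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{f'(c)}{g'(c)}$ is within $\frac{\varepsilon}{2}$ of $L$, it follows that

$$\left| \frac{\dfrac{f(y)}{g(y)} - \dfrac{f(x)}{g(y)}}{1 - \dfrac{g(x)}{g(y)}} - L \right| < \frac{\varepsilon}{2}.$$

Then $\left| \frac{f(y)}{g(y)} - \frac{f(x)}{g(y)} - L\left(1 - \frac{g(x)}{g(y)}\right) \right| < \frac{\varepsilon}{2} \left| 1 - \frac{g(x)}{g(y)} \right|$.

Because $g(y)$ approaches positive or negative infinity as $y$ approaches $a$,
there is a $\delta > 0$ with $\delta < \delta_0$ such that for all $y$ with $0 < |y - a| < \delta$, the
fraction $\frac{|f(x)| + |L g(x)| + |g(x)|}{|g(y)|} < \frac{\varepsilon}{2}$.

Then for $y$ with $0 < |y - a| < \delta$,
$\left| \frac{f(y)}{g(y)} - \frac{f(x)}{g(y)} - L\left(1 - \frac{g(x)}{g(y)}\right) \right| < \frac{\varepsilon}{2} \left| 1 - \frac{g(x)}{g(y)} \right|$ implies

$$\left| \frac{f(y)}{g(y)} - L \right| < \left| \frac{f(x)}{g(y)} \right| + \left| \frac{L g(x)}{g(y)} \right| + \frac{\varepsilon}{2} + \frac{\varepsilon |g(x)|}{2 |g(y)|} < \frac{\varepsilon}{2} + \frac{|f(x)| + |L g(x)| + |g(x)|}{|g(y)|} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This completes the proof.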


There are several variations of L'Hôpital's Rule covering the cases of one-sided
limits and limits at positive or negative infinity. These are covered in the following
exercises.


5.8.1 Exercises
Use L'Hôpital's Rule to calculate the following limits.
1. $\lim_{x \to 0} \frac{\sin^2(2x^3)}{x^6}$
2. $\lim_{x \to 0^+} \frac{\sqrt{x} - x}{\sqrt{x} + x}$
3. $\lim_{x \to 0^+} \sqrt{x} \ln x$
4. $\lim_{x \to \infty} \frac{\ln x}{\sqrt{x}}$
5. $\lim_{x \to 0^+} (\sin(2x))^x$
6. $\lim_{x \to 0} \frac{\tan^{-1} x}{\tan^{-1}(3x)}$

Write proofs of each of the following statements.
7. If $f$ and $g$ are differentiable functions for all $x > 0$, $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$,
and $g'(x) > 0$ for all $x > 0$, then $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.
8. If $f$ and $g$ are differentiable functions for all $x > 0$, $\lim_{x \to \infty} g(x) = \infty$, and $g'(x) > 0$
for all $x > 0$, then $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.
9. If $f$ and $g$ are functions differentiable for all $x > a$, $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$,
and $g'(x) \ne 0$ for all $x$ with $x > a$, then $\lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a^+} \frac{f(x)}{g(x)} = L$.

5.9 Intermediate Value Property and Limits of Derivatives


The Intermediate Value Theorem says that if a function is continuous on an interval,
then it has the intermediate value property on that interval. That is, if $f$ is continuous
on the interval $I$, and $a, b \in I$, then for any $K$ between $f(a)$ and $f(b)$, there is a
$c$ between $a$ and $b$ with $f(c) = K$. Suppose that $f$ is differentiable at each point
of an interval $I$. If $f'$ is continuous on $I$, then certainly it obeys the Intermediate
Value Theorem and has the intermediate value property on $I$. But $f'(x)$ can exist
for all $x \in I$ without $f'$ being a continuous function. One example is

$$f(x) = \begin{cases} x^2 \sin\left(\frac{1}{x^2}\right) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

This function is differentiable for all $x$. When $x \ne 0$, the
derivative is $f'(x) = 2x \sin\left(\frac{1}{x^2}\right) - \frac{2}{x} \cos\left(\frac{1}{x^2}\right)$, and $f'(0) = 0$. As $x$ approaches
0, $f'$ is not even bounded and, in fact, oscillates wildly. In spite of its discontinuity
at 0, $f'$ does have the intermediate value property. For example, for any $x \ne 0$, the
function $f'$ obtains every value between $f'(x)$ and $f'(0) = 0$ on the interval between
0 and $x$. Moreover, it obtains each of those values infinitely often. In fact, between
0 and $x$, the function $f'$ takes on every real number infinitely often (Fig. 5.7).

Fig. 5.7 $x^2 \sin\left(\frac{1}{x^2}\right)$ and its derivative
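The unbounded oscillation of $f'$ near 0 can be observed numerically. The Python sketch below is an illustration only; the sample points $x_n = 1/\sqrt{2\pi n}$ are an assumption chosen so that $\cos(1/x_n^2) = 1$ and $\sin(1/x_n^2) = 0$, which makes $f'(x_n)$ approximately $-2/x_n$. It also checks that the difference quotient $f(h)/h$ at 0 is bounded by $|h|$, which is why $f'(0) = 0$.

```python
import math


def f(x):
    """x^2 sin(1/x^2) for x != 0, and 0 at x = 0."""
    return x**2 * math.sin(1 / x**2) if x != 0 else 0.0


def f_prime(x):
    """Derivative of f: 2x sin(1/x^2) - (2/x) cos(1/x^2) for x != 0, and 0 at 0."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x**2) - (2 / x) * math.cos(1 / x**2)


# Points tending to 0 at which the derivative is close to -2/x.
spike_points = [1 / math.sqrt(2 * math.pi * n) for n in (1, 10, 100, 1000)]
spike_values = [f_prime(x) for x in spike_points]
for x, value in zip(spike_points, spike_values):
    print(x, value)

# Difference quotients (f(h) - f(0)) / h at 0 for a few small h.
small_steps = [1e-1, 1e-2, 1e-3]
difference_quotients = [f(h) / h for h in small_steps]
print(difference_quotients)
```

The printed derivative values grow in magnitude as the points approach 0, while the difference quotients at 0 shrink.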
Note that the function $f(x) + x$ has a derivative of 1 at $x = 0$. This is an example
of a function with a positive derivative at 0 which is not an increasing function over
any open interval containing 0. This can easily be seen by the fact that in every open
interval containing 0 there are intervals where the derivative of $f(x) + x$ is negative.
So, how can you prove that if a function $f$ has a derivative $f'$ on an interval $I$,
then $f'$ has the intermediate value property on $I$? The hypothesis suggests that you
start by taking a function $f$ differentiable on an interval $I$ and values $a, b \in I$. Then
you select a value $K$ between $f'(a)$ and $f'(b)$. Without loss of generality, you can
assume that $a < b$ and $f'(a) < K < f'(b)$. The goal would be to show that there
is a $c$ between $a$ and $b$ such that $f'(c) = K$. One simplification is to replace $f$ with
the function $g(x) = f(x) - Kx$. This function is also differentiable on $I$, and if
$f'(c) = K$, then $g'(c) = 0$. Which theorems about derivatives allow you to conclude
that a derivative is 0 at some point in an interval? First there is a theorem that states
that if a differentiable function reaches an extreme value at a point in an interval,
then the point is either a critical point of the function or an endpoint of the interval.
A second theorem is Rolle's Theorem which talks about a differentiable function
which takes on the same value at the endpoints $a$ and $b$. Since you do not have any
information about the values of $g$ at the endpoints of the interval, the theorem about
extreme values may be the more promising choice for this proof.
What is known about the function $g$? You know that $g$ is differentiable at each
point of the interval from $a$ to $b$. Additionally, $g'(a) = f'(a) - K < 0$ and
$g'(b) = f'(b) - K > 0$. Does this mean that the function $g$ is decreasing at $a$ and increasing
at $b$? Well, it would if you knew that $g'$ were continuous because then $g'$ would be
negative in an interval around $a$ and positive in an interval around $b$. But, as you now
know, $g'$ need not be continuous. On the other hand, there is a theorem that says that
if $g'(a)$ is negative, then there is a $\delta > 0$ such that if $x$ satisfies $a < x < a + \delta$,
then $g(x) < g(a)$. This does not show much, but you can use it to conclude that $g$
does not take on its minimum value on $[a, b]$ at $a$. A similar argument uses the fact
that $g'(b) > 0$ to show that $g$ does not take on its minimum value on $[a, b]$ at $b$.


Is $g$ a continuous function on $[a, b]$? It is differentiable at each point of $[a, b]$, so it is
continuous. All continuous functions on a closed bounded interval take on both their
minimum and maximum values on the interval. Thus, you know that $g$ takes on its
minimum value on $[a, b]$ at some point $c$ strictly between $a$ and $b$. Such a point must
be a critical point of $g$, so $g'(c) = 0$. This is the idea behind the following proof.
PROOF: The derivative of a function differentiable on an interval has the
intermediate value property on that interval.

Let $f$ be a function differentiable at each point of an interval $I$.

Let $a, b \in I$, and assume that $f'(a) \ne f'(b)$.

Without loss of generality assume that $a < b$ and $f'(a) < f'(b)$.

Let $K$ be a value satisfying $f'(a) < K < f'(b)$.

Let $g(x) = f(x) - Kx$.

Then $g'(x) = f'(x) - K$ for all $x \in [a, b]$, and $g'(a) < 0 < g'(b)$.

Since $g$ is differentiable at each point of $[a, b]$, it is continuous on $[a, b]$.

Since $g$ is continuous on $[a, b]$, it obtains a minimum value at some point
$c \in [a, b]$.

$g'(a) < 0$ implies that there is a $\delta_a > 0$ such that $g(x) < g(a)$ for all $x$
satisfying $a < x < a + \delta_a$. In particular, $g$ does not obtain its minimum at $a$.

$g'(b) > 0$ implies that there is a $\delta_b > 0$ such that $g(x) < g(b)$ for all $x$
satisfying $b - \delta_b < x < b$. In particular, $g$ does not obtain its minimum at $b$.

It follows that $g$ obtains its minimum on $[a, b]$ at a point $c$ strictly between
$a$ and $b$.

Since $c$ is not an endpoint of $[a, b]$, $g'(c) = 0$.

Thus, $f'(c) = g'(c) + K = K$ which shows that $f'$ has the intermediate value
property on $I$.

There are simple examples of functions that have discontinuous derivatives that
do not have the intermediate value property; functions such as $f(x) = |x|$. This
function's derivative is the constant 1 for all $x > 0$ and $-1$ for all $x < 0$. This
derivative is not continuous at $x = 0$ because it is not defined there. Clearly, $f'$ does
not have the intermediate value property on any interval containing both positive
and negative numbers, but then $f$ does not satisfy the hypothesis of the previous
theorem on any such interval because $f'(0)$ is not defined. Functions that have
discontinuous derivatives that are defined at all points will have to exhibit wild
oscillations in the neighborhoods of those discontinuities similar to the example

$$\begin{cases} x^2 \sin\left(\frac{1}{x^2}\right) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$
Suppose $f$ is a function whose derivative is defined at all points of an interval
except perhaps at some point $c$ in the interval. What can be said if $\lim_{x \to c} f'(x)$ exists?
Such a derivative does not exhibit wild oscillations near $c$, and, in fact, provided $f$ is
continuous at $c$, the function $f$ must have a continuous derivative at $c$. The proof is a
consequence of the Mean Value Theorem.


PROOF: Let $f$ be a function continuous on an interval $(a, b)$ and differentiable
at all points of $(a, b)$ except perhaps at some point $c \in (a, b)$. Suppose that the
limit $\lim_{x \to c} f'(x)$ exists. Then $f'$ is continuous at $c$.

Let $f$ be a function continuous on the interval $(a, b)$ and differentiable on $(a, b)$
except perhaps at $c \in (a, b)$.

Assume $\lim_{x \to c} f'(x) = L$.

From the definition of limit, given $\varepsilon > 0$, there is a $\delta > 0$ such that if
$y \in (a, b)$ with $0 < |y - c| < \delta$, then $|f'(y) - L| < \varepsilon$.

Let $x \in (a, b)$ with $0 < |x - c| < \delta$.

Because $f$ is continuous at $c$, the Mean Value Theorem applies on the interval between
$x$ and $c$, so there is a $y$ between $x$ and $c$ such that $\frac{f(x) - f(c)}{x - c} = f'(y)$.

Then $y \in (a, b)$ with $0 < |y - c| < \delta$, so $\left| \frac{f(x) - f(c)}{x - c} - L \right| = |f'(y) - L| < \varepsilon$.

Thus, $\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = L$, so $f'(c) = L$.

Because $f'(c) = \lim_{x \to c} f'(x)$, it follows that $f'(x)$ is continuous at $c$.

This completes the proof.

Chapter 6

Riemann Integrals

6.1 Area
The first application one usually sees of the Riemann Integral is that of finding
the area of a region in the plane bounded by the graph of a function and the
lines $x = a$, $x = b$, and the $x$-axis. Thus, before discussing integration, it makes
sense to review what is meant by the area of a region in the plane. Clearly, the
measure of area should be a way to assign a size to a region in a way that is
compatible with the well-established rules from Geometry for assigning areas to
regions such as rectangles, triangles, and circles. But there is a need to go beyond
these simple regions so that area can be calculated for far more complicated regions.
For example, consider the region in the coordinate plane
$\{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1,\ \text{at least one of } x \text{ or } y \text{ is rational}\}$. Regions such as these are not typically
considered in a Geometry course, but being able to calculate areas for such sets
is important in the more general discussion of integration. This chapter, therefore,
begins by considering two different measures of the sizes of sets which will aid the
understanding of integration.

6.2 Cardinality of Sets


What does the set $\{A, B, C, D, E\}$ have in common with the set $\{2, 4, 6, 8, 10\}$? One
thing they have in common is that the two sets have the same number of elements.
What does the set of positive integers have in common with the set of positive
multiples of 2? These sets are both infinite sets, and the second set is clearly a
proper subset of the first, but, here again, the two sets have the same number
of elements. To see this consider the function $f(n) = 2n$ which is a bijection,
that is, a map from the set of positive integers one-to-one and onto the set of positive multiples
of 2. This function provides a one-to-one matching of the elements of one set

Springer International Publishing Switzerland 2016


J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_6


to the elements of the other set. One says that two sets $A$ and $B$ have the same
cardinality if there is a bijection $f : A \to B$. The bijection demonstrates a one-to-one
correspondence between the elements of set $A$ and the elements of set $B$, so
one concludes that $A$ and $B$ are the same size. Some sets are finite, meaning that the
set is either empty (has cardinality 0) or, for some positive integer $n$, is in one-to-one
correspondence with the set $\{1, 2, 3, \dots, n\}$. A set is called denumerable if it
can be put in one-to-one correspondence with the set of positive integers. Thus, the
set of positive multiples of 2 is denumerable. So is the set of all integers since
the positive integers can be mapped onto the set of all integers using the bijection

$$f(x) = \begin{cases} \frac{x}{2} & \text{if } x \text{ is even} \\ 1 - \frac{x + 1}{2} & \text{if } x \text{ is odd.} \end{cases}$$

The verification that this map is a bijection is left
as an exercise. It shows that the integers and the positive integers have the same
cardinality. Sets that are either finite or denumerable are called countable because
they can be counted out by listing a first, second, third, and so forth. Thus, a good
way to think about a countable set is a set whose elements can be written down in
a finite or infinite sequence $x_1, x_2, x_3, \dots$ because this listing shows the one-to-one
correspondence between the set and the natural numbers or one of its finite subsets.
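The map from the positive integers onto the integers described above can be spot-checked on a finite range. The Python sketch below is not a proof of bijectivity; the function name `to_integer` and the cutoff `N` are assumptions made for this illustration. It confirms that the first $2N + 1$ positive integers are sent to distinct values that fill out $-N, \dots, N$.

```python
def to_integer(x):
    """Map a positive integer x to x/2 if x is even and 1 - (x+1)/2 if x is odd."""
    if x % 2 == 0:
        return x // 2
    return 1 - (x + 1) // 2


N = 50
# Images of 1, 2, ..., 2N + 1 under the map.
images = [to_integer(x) for x in range(1, 2 * N + 2)]
print(images[:7])  # the listing begins 0, 1, -1, 2, -2, 3, -3
```

Listing the images in order shows how the map alternates between the nonpositive and the positive integers.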
The union of two countable sets is also countable. This can be seen by representing
one set by the sequence $x_1, x_2, x_3, \dots$ and the other by $y_1, y_2, y_3, \dots$. Then
the elements of the union of the two sets can be written as $x_1, y_1, x_2, y_2, x_3, y_3, \dots$.
If there are elements that belong to both sets, then one can just leave the second
copies of those elements out of the listing. Clearly, this can be extended to the
union of any finite collection of countable sets, so the union of a finite number
of countable sets is countable. What might seem surprising is that the union of a
countable number of countable sets is still countable. That is, if $A_1, A_2, A_3, \dots$ is
a sequence of countable sets, then the union $\bigcup_{k=1}^{\infty} A_k$ is also countable. To see this,
suppose that the elements in each $A_k$ can be listed in a sequence $a_{k,1}, a_{k,2}, a_{k,3}, \dots$.
One can now list all the elements of $\bigcup_{k=1}^{\infty} A_k$ by listing the $a_{k,j}$ elements in increasing
order of $k + j$ resulting in
$a_{1,1}, a_{2,1}, a_{1,2}, a_{3,1}, a_{2,2}, a_{1,3}, a_{4,1}, a_{3,2}, a_{2,3}, a_{1,4}, a_{5,1}, \dots$.
As above, duplicate elements occurring because they belong to more than one set
can be left out of this listing. Figure 6.1 shows the order that the elements enter
the list.
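The listing order by increasing $k + j$ can be generated mechanically. The following Python sketch is a minimal illustration; the generator name `diagonal_order` is hypothetical. It yields the index pairs $(k, j)$ in the same order as the listing above, taking $k$ from largest to smallest within each value of $k + j$.

```python
from itertools import islice


def diagonal_order():
    """Yield index pairs (k, j), k, j >= 1, in increasing order of k + j."""
    total = 2
    while True:
        # Within one diagonal, k runs from total - 1 down to 1.
        for k in range(total - 1, 0, -1):
            yield (k, total - k)
        total += 1


first_pairs = list(islice(diagonal_order(), 11))
print(first_pairs)
```

Every pair $(k, j)$ is reached after finitely many steps, which is what makes the combined listing a sequence.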
Note that this result can be used to show that the set of rational numbers is
countable. Indeed, the rational numbers can be written as the union $R_1 \cup R_2 \cup R_3 \cup \cdots$
where $R_k$ are the rational numbers that can be written as a fraction with an integer
in the numerator and the positive integer $k$ in the denominator. For example,
$R_2 = \{\frac{0}{2}, \frac{1}{2}, -\frac{1}{2}, \frac{2}{2}, -\frac{2}{2}, \frac{3}{2}, -\frac{3}{2}, \dots\}$. Thus, the set of rational numbers is a countable union
of countable sets showing that it is countable. The cardinality of a denumerable set is
often written using the symbol $\aleph_0$ (read "aleph naught" or "aleph null"). The symbol
represents the size of the natural numbers and the size of any set that can be placed
in one-to-one correspondence with the natural numbers.
A set which is not a countable set is called uncountable. There is a standard
argument that shows that the set of real numbers in the interval $(0, 1)$ is not a
countable set. The method, known as a diagonalization argument, first assumes
that the real numbers between 0 and 1 can all be written down in a sequence
$x_1, x_2, x_3, x_4, \dots$. Then one constructs a real number $y$ between 0 and 1 where the
$k$th digit to the right of the decimal point in $y$ is chosen as follows. If the $k$th digit
to the right of the decimal point of $x_k$ is 7, then let the $k$th digit to the right of the
decimal point in $y$ be 3. Otherwise, if the $k$th digit to the right of the decimal point of
$x_k$ is not 7, then let the $k$th digit to the right of the decimal point in $y$ be 7. Figure 6.2
illustrates the process of determining $y$.

Fig. 6.1 The union of countably many countable sets is countable

a1,1  a1,2  a1,3  a1,4  a1,5  a1,6
a2,1  a2,2  a2,3  a2,4  a2,5  a2,6
a3,1  a3,2  a3,3  a3,4  a3,5  a3,6
a4,1  a4,2  a4,3  a4,4  a4,5  a4,6
a5,1  a5,2  a5,3  a5,4  a5,5  a5,6

Fig. 6.2 Determining y using a diagonalization argument

x1 = 0. 4 9 0 3 2 5 5 9 0 9 9 0
x2 = 0. 1 7 7 3 8 8 0 0 0 0 0 0
x3 = 0. 7 4 1 1 8 9 1 8 2 5 4 4
x4 = 0. 1 1 8 8 8 3 7 2 9 0 0 1
x5 = 0. 5 5 2 7 7 7 1 0 6 4 2 3
x6 = 0. 0 0 0 0 0 2 1 0 9 3 7 3
x7 = 0. 8 2 1 7 4 9 0 3 2 8 5 5
x8 = ...
y  = 0. 7 3 7 7 3 7 7 7 7 7 7 3
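The digit rule used to build $y$ can be applied mechanically to the numbers listed in Figure 6.2. The Python sketch below is an illustration only; it works with the first seven listed expansions as digit strings, and the function name `diagonal_digits` is hypothetical.

```python
# Digits to the right of the decimal point of x1, ..., x7 from Figure 6.2.
listed = [
    "490325590990",
    "177388000000",
    "741189182544",
    "118883729001",
    "552777106423",
    "000002109373",
    "821749032855",
]


def diagonal_digits(expansions):
    """Return the first len(expansions) digits of y: 3 where the diagonal digit is 7, else 7."""
    digits = []
    for k, expansion in enumerate(expansions):
        digits.append("3" if expansion[k] == "7" else "7")
    return "".join(digits)


y_digits = diagonal_digits(listed)
print("0." + y_digits)
```

By construction the $k$th digit of the result differs from the $k$th digit of the $k$th listed number, so the result cannot match any of them.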
The point of this construction is that the number $y$ is a real number in the interval
$(0, 1)$, but it cannot be one of the numbers in the sequence $x_1, x_2, x_3, x_4, \dots$. This
is because for each $k$, $y$ cannot equal $x_k$ because $y$ and $x_k$ differ in their $k$th digits.
This is a contradiction to the assumption that the sequence contained all of the real
numbers in $(0, 1)$ and shows that it is impossible to list all the elements of $(0, 1)$ in
a sequence. Thus, this interval is an uncountable set. If there is a bijection from the
set $(0, 1)$ to a set $B$, then it follows that $B$ will also be uncountable. You may wonder
whether all uncountable sets have the same cardinality. They do not, but that fact
will not be needed for the proofs discussed in this book. Refer to a standard text in
Set Theory for a far more in-depth look at the cardinality of sets.


6.2.1 Exercises
1. Determine whether each of the following sets is finite, denumerable, or
uncountable.
(a) the set of points in the coordinate plane where both $x$ and $y$ coordinates are
rational numbers
(b) the set of points in the coordinate plane where at least one of its $x$ and $y$
coordinates is a rational number
(c) the set of polynomials $p(x)$ with integer coefficients
(d) the set of real numbers whose decimal representation does not contain the
digit 5
(e) the set of functions $f : \{0, 1, 2, 3, 4, 5\} \to \{1, 2, 3, \dots, 100\}$
(f) the set of functions $f : \{2, 4, 6, 8, 10, \dots\} \to \{0, 1\}$


2. Show that the function $f(x) = \begin{cases} \frac{x}{2} & \text{if } x \text{ is even} \\ 1 - \frac{x + 1}{2} & \text{if } x \text{ is odd} \end{cases}$ is a bijection from the set of
natural numbers to the set of integers.
3. Show that $(0, 1)$ and $(1, 5)$ have the same cardinality.
4. Show that $(0, 1)$ and the entire set of real numbers, $\mathbb{R}$, have the same cardinality.
5. Show that $(0, 1)$ and the interval $[0, 1]$ have the same cardinality. (Hint: Find a
way to "bury" the endpoints of $[0, 1]$ inside of $(0, 1)$ by mapping a sequence
$x_1, x_2, x_3, \dots$ to $x_3, x_4, x_5, \dots$.)
6. Show that if $A_1, A_2, A_3, \dots$ are sets and $\bigcup_{n=1}^{\infty} A_n$ is an uncountable set, then for at
least one $n$, the set $A_n$ must be uncountable. This can be thought of as an infinite
form of the Pigeonhole Principle.
7. Show that the interval $[0, 1]$ on the real line and the unit square in the plane have the same
cardinality. (Hint: for a point in $[0, 1]$ split up its decimal digits between the $x$
and $y$ coordinates of a point in the unit square.)
8. Show that the equality of cardinality is an equivalence relation. That is, if A, B,
and C are any sets, then
A has the same cardinality as A.
If A has the same cardinality as B, then B has the same cardinality as A.
If A has the same cardinality as B, and B has the same cardinality as C, then A
has the same cardinality as C.
9. Suppose that you apply the diagonalization argument to the set of rational
numbers in the interval $(0, 1)$. That is, suppose you list all of the rational numbers
in a sequence $x_1, x_2, x_3, \dots$ and use the diagonalization argument to construct a
number $y$ in $(0, 1)$ that differs from each element of the sequence. Why is this
not a proof that the rational numbers are uncountable?


6.3 Measure Zero


Cardinality is used to compare the sizes of sets by considering how many elements
the sets have. But two sets such as $[0, 3]$ and $[0, 6]$ can have the same cardinality
and yet be quite different in what we traditionally think of as size in the geometric
sense. So there is a need to develop a different way to compare the sizes of sets that
embodies the notion of the length of a set of real numbers and of the area of a set in
the plane. A general theory of measure is not a topic that can be covered in a book
at this level, but it is helpful to introduce how one determines which sets should be
assigned a length or an area equal to 0.
If measure is to mean anything useful, you would want each finite interval $[a, b]$
to have measure equal to its length, $b - a$. How about the measure of the open interval
$(a, b)$? Likely, you would say that its measure should also be $b - a$. This suggests
that the set of endpoints $\{a, b\}$ should be assigned a measure of 0. More generally,
a set $S \subset \mathbb{R}$ is said to have measure zero if for each $\varepsilon > 0$ there is a sequence of
open intervals $(a_1, b_1), (a_2, b_2), (a_3, b_3), \dots$ such that $S$ is contained in the union of
the intervals, $S \subset \bigcup_{j=1}^{\infty} (a_j, b_j)$, and the total length of the intervals is less than $\varepsilon$, that
is, for every natural number $n$, $\sum_{j=1}^{n} (b_j - a_j) < \varepsilon$. In other words, a set has measure
zero if you can cover it with a sequence of intervals whose total length is as small
as you want.
In particular, any finite set consisting of $n$ real numbers has measure zero because
for any $\varepsilon > 0$, each point $x$ in the set can be covered by the interval
$\left(x - \frac{\varepsilon}{3n}, x + \frac{\varepsilon}{3n}\right)$,
and the total length of these intervals is $\frac{2}{3}\varepsilon$. Similarly, any countable set of real
numbers $\{x_1, x_2, x_3, \dots\}$ can be covered by intervals
$\left(x_j - \frac{\varepsilon}{3 \cdot 2^j}, x_j + \frac{\varepsilon}{3 \cdot 2^j}\right)$, and the total
length of these intervals is $\sum_{j=1}^{\infty} \frac{2\varepsilon}{3 \cdot 2^j} = \frac{2}{3}\varepsilon$. Thus, the set of rational numbers, which
is countable, has measure zero. A similar argument shows that if $A_1, A_2, A_3, \dots$ is
a sequence of sets all of which have measure zero, then the union $\bigcup_{j=1}^{\infty} A_j$ also has
measure zero. Indeed, given $\varepsilon > 0$, for each $j$ you can cover $A_j$ with a sequence
of open intervals whose total length is less than $\frac{\varepsilon}{2^j}$. Then the sequences of open
intervals can be combined into one sequence of intervals which cover $\bigcup_{j=1}^{\infty} A_j$ and has
total length less than $\sum_{j=1}^{\infty} \frac{\varepsilon}{2^j} = \varepsilon$.
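The total length of the intervals covering a countable set can be checked with exact arithmetic. The Python sketch below is an illustration under assumed values: the function name `cover_length` is hypothetical and $\varepsilon = \frac{1}{100}$ is an arbitrary choice. It sums the lengths $\frac{2\varepsilon}{3 \cdot 2^j}$ of the first $n$ covering intervals and compares the partial sums with $\frac{2}{3}\varepsilon$.

```python
from fractions import Fraction


def cover_length(eps, n):
    """Total length of the first n intervals (x_j - eps/(3*2**j), x_j + eps/(3*2**j))."""
    return sum(2 * eps / (3 * 2**j) for j in range(1, n + 1))


eps = Fraction(1, 100)
bound = Fraction(2, 3) * eps
partial_sums = [cover_length(eps, n) for n in (1, 5, 20)]
for total in partial_sums:
    # Each partial sum stays strictly below (2/3) * eps.
    print(total, total < bound)
```

Every partial sum equals $\frac{2}{3}\varepsilon\left(1 - 2^{-n}\right)$, so the total length never reaches $\varepsilon$ no matter how many intervals are used.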

Since any countable set of real numbers has measure zero, if a set does not
have measure zero, it must be an uncountable set. A natural question is whether
an uncountable set of real numbers can have measure zero. The answer to this
question is yes. The most famous example of this is known as the Cantor set which
is constructed as follows. The construction begins with the closed unit interval
$C_0 = [0, 1]$. At the first stage, the open interval of length $\frac{1}{3}$ is removed from the
middle of this set leaving two intervals each with length $\frac{1}{3}$ so that $C_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$.

Fig. 6.3 Construction of the Cantor set (Stages 0 through 5)

At the second stage, open intervals of length 19 are removed from the middle of each
of the two remaining intervals leaving four intervals each with length 19 so that
C2 D 0; 19  [ 29 ; 39  [ 69 ; 79  [ 89 ; 99 . This process is repeated so that at stage n,
open intervals of length 31n are removed from each of 2n1 closed intervals of length
1
leaving 2n closed intervals each with length 31n (Fig. 6.3). The Cantor set C
3n1
1

is then defined to be the intersection of all of these Cn sets, that is, C D \ Cn .


nD1

The Cantor set is sometimes called the Cantor middle thirds set, because, at each
stage, the middle thirds of the remaining intervals are removed. Other similar types
of Cantor-like sets can be constructed by removing other portions of each interval.
It is clear that the Cantor set has measure zero because it is contained in $C_n$ which
is made up of $2^n$ closed intervals each with length $\frac{1}{3^n}$. The total length of the closed
intervals in $C_n$ is $\frac{2^n}{3^n}$, a quantity that goes to 0 as $n$ gets large. $C_n$ can be covered by
a finite collection of open intervals whose total length is 10 percent larger than $\frac{2^n}{3^n}$
showing that the Cantor set can be covered by open intervals whose total length is
as small as you want. So how do you show that the Cantor set is uncountable? To
see this, consider writing each number in the unit interval $[0, 1]$ in base three. The
numbers in the interval $[0, \frac{1}{3}]$ are the numbers between 0 and 1 whose base-three
representation begins with 0.0, and the numbers in the interval $[\frac{2}{3}, 1]$ are the numbers
between 0 and 1 whose base-three representation begins with 0.2. The numbers in
the middle third of the interval that are removed at the first stage of the construction
process are the numbers between 0 and 1 whose base-three representation begins
with 0.1. Note that the numbers at the endpoints of the removed interval, $\frac{1}{3}$ and $\frac{2}{3}$, each
have two different representations. Indeed, in base three $\frac{1}{3} = 0.1 = 0.0222\cdots$ and
$\frac{2}{3} = 0.2 = 0.1222\cdots$. One could say that $C_1$ consists of all the numbers between 0
and 1 that can be represented in base three without a 1 in the first place to the right
of the decimal point, the one-third place. Similarly, $C_2$ consists of the numbers between 0
and 1 that have a base-three representation with no 1 in either of the first two places
to the right of the decimal point. The Cantor set $C$ is the set of numbers between 0
and 1 that have a base-three representation that contains no digit equal to 1. Then
consider the map that takes each element of the Cantor set and divides it by 2. This
is an injective map that maps the numbers in the Cantor set to the set of numbers in
the unit interval that have base-three representations that include only the digits of
0 and 1 because it takes numbers with representations that only included the digits
of 0 and 2 and divides each of the digits by 2. Now, the numbers between 0 and
1 with base-three representations that include only the digits of 0 and 1 are clearly
in one-to-one correspondence with base-two representations of numbers in between
0 and 1. But all the real numbers between 0 and 1 have base-two representations
containing only 0 and 1, so the numbers in the Cantor set are as numerous as the
real numbers between 0 and 1. Thus, the Cantor set must be uncountable since the
set of real numbers between 0 and 1 is an uncountable set.
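The stage-by-stage construction described above can be checked with exact arithmetic. The following is a minimal sketch, not part of the original text; the function names `cantor_stage` and `total_length` are illustrative. It builds the closed intervals of $C_n$ and confirms that there are $2^n$ of them with total length $\frac{2^n}{3^n}$.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals of C_n as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the two outer thirds; the open middle third is removed.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of a list of (left, right) intervals."""
    return sum(right - left for left, right in intervals)

for n in range(1, 6):
    stage = cantor_stage(n)
    # Each line shows n, the number of intervals, and their total length.
    print(n, len(stage), total_length(stage))
```

Because the total length shrinks by a factor of $\frac{2}{3}$ at each stage, it can be made smaller than any given positive number by taking $n$ large enough.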
The concept of measure zero can be extended to sets in the plane, although here,
rather than being interested in the length of a set, the interest is in the area of the set.
Thus, rather than trying to cover a set with intervals whose total length is small, in
the plane one would try to cover a set with a sequence of squares whose total area
is small. Just as on the real line, it was taken as given that the length of an interval
$[a, b]$ was $b - a$, in the plane it will be taken as given that the area of a square with
side length $x$ is $x^2$. Then, a region in the plane is said to have measure zero (or area
zero) if for each $\epsilon > 0$, the set is contained in the union of a sequence of squares
whose total area is less than $\epsilon$.
As it was with sets of real numbers, any countable set of points in the plane has
area zero because, for any $\epsilon > 0$, you can cover the sequence $x_1, x_2, x_3, \ldots$ with a
sequence of squares with total area less than $\epsilon$. Moreover, let $Y$ be a line segment
with length $y > 0$. Then $Y$ has area zero. How would you prove this? Certainly, this
line segment is contained in a square with side length $y$ which has area $y^2$, so the
square's area could be rather large and, in particular, the area of the square is not
zero. Notice, though, that $Y$ can also be covered by two side-by-side squares each
with side length $\frac{y}{2}$ and each with area $\frac{y^2}{4}$ giving a total area of the two squares equal
to $\frac{y^2}{2}$. This is the key to covering $Y$ with squares with very small total area. If $Y$ is
covered by a sequence of $n$ adjacent squares each with side length $\frac{y}{n}$, then the total
area of the $n$ squares is $\frac{y^2}{n}$. Since $\frac{y^2}{n}$ can be made arbitrarily small by choosing $n$
large, it follows that $Y$ has measure zero (Fig. 6.4).

Fig. 6.4 Covering a line segment with smaller and smaller squares
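The covering argument for a line segment can be made concrete. The sketch below is an illustration only, with hypothetical helper names `covering_area` and `squares_needed`. It computes the total area $\frac{y^2}{n}$ of $n$ adjacent squares of side $\frac{y}{n}$ and finds a value of $n$ that pushes that total below a given $\epsilon$.

```python
import math
from fractions import Fraction

def covering_area(y, n):
    """Total area of n adjacent squares of side y/n covering a segment of length y."""
    side = Fraction(y) / n
    return n * side ** 2

def squares_needed(y, eps):
    """Smallest n with y^2 / n < eps, that is, the smallest integer n > y^2 / eps."""
    return math.floor(Fraction(y) ** 2 / Fraction(eps)) + 1

y = 3
eps = Fraction(1, 100)
n = squares_needed(y, eps)
print(n, covering_area(y, n))
```

Any larger $n$ works equally well; the point is only that some $n$ exists for every $\epsilon > 0$.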


6.3.1 Exercises
1. Rather than constructing the Cantor set only on the interval $[0, 1]$, perform the
same construction on each interval $[n, n+1]$ for every integer $n$. Show that the
resulting set has measure zero.
2. Beginning with the interval $[0, 1]$ construct a Cantor-like set, but instead of
removing intervals of length $\frac{1}{3}$ at stage 1, $\frac{1}{9}$ at stage 2, and so forth removing
intervals of length $\frac{1}{3^n}$ at stage $n$, you remove an interval of length $\frac{1}{4}$ at stage 1,
intervals of length $\frac{1}{16}$ at stage 2, and so forth removing intervals of length $\frac{1}{4^n}$ at
stage $n$. Show that the total length of the intervals remaining after stage $n$ does
not approach zero as $n$ approaches infinity.
3. Which of the following sets of real numbers have measure zero?
(a) the integers
(b) the irrational numbers
(c) $\{a + b\sqrt{2} + c\sqrt{3} \mid a, b, c \text{ are integers}\}$
(d) the Cantor-like set where instead of removing the middle $\frac{1}{3}$ of each
remaining interval at stage $n$, you remove the middle $\frac{1}{4}$ of each remaining
interval
4. Show that if the set $A$ has measure zero and $B \subset A$, then the set $B$ has measure
zero.
5. Show that a line in the plane has area zero.
6. Show that the set in the plane $\{(x, y) \mid x \text{ is rational}\}$ has area zero.
7. Suppose that the set $A \subset \mathbb{R}$ has measure zero. Show that the set $\{(x, y) \mid x \in A, y \in [0, 1]\}$
has area zero.
8. Suppose that the set $A \subset \mathbb{R}$ has measure zero. Show that the set $\{(x, y) \mid x \in A\}$
has area zero.
9. Show that $\{(x, y) \mid 0 \le x \le 1, 0 \le y \le 1, \text{at least one of } x \text{ or } y \text{ is rational}\}$ has
area zero.
10. Show that the interval $[0, 1]$ does not have measure zero. (Hint: Use the Heine-Borel
Theorem to reduce any cover to a finite subcover.)

6.4 Areas in the Plane


When discussing area, it is not possible to avoid the limit concept, and this brings
a topic usually associated with Geometry into the field of Analysis. One could
even make a case for including much of Geometry as a subtopic of Analysis since
Geometry involves properties of distance, a distinguishing feature of Analysis.
What properties of area can be taken as given? One would hope that whatever
axioms are chosen, they would let you prove results about area that you know to be
true from Euclidean Geometry. The following axioms accomplish this.


Axioms for Area


1. The area of a set in the plane is a nonnegative real number.
2. A square with side length 1 has area equal to 1.
3. (Similarity) If sets $A$ and $B$ are similar in the geometric sense with lengths
in $B$ equal to $t$ times the corresponding lengths in $A$, then the area of $B$ is $t^2$
times the area of $A$.
4. (Area Zero) Let $A$ be a set. Suppose that for each $\epsilon > 0$ there is a sequence
of squares $S_1, S_2, S_3, \ldots$ with areas $s_1, s_2, s_3, \ldots$, respectively, such that the
set $A$ is contained in the union of the squares $\bigcup_{k=1}^{\infty} S_k$, and for every natural
number $n$, $\sum_{k=1}^{n} s_k < \epsilon$. Then $A$ has area 0.
5. (Union) If set $A$ has area $a$, set $B$ has area $b$, and their intersection $A \cap B$ has
area 0, then the union $A \cup B$ has area $a + b$.
6. (Exhaustion) Let $B$ be a set. If for each $\epsilon > 0$ there are sets $A$ and $C$ with
$A \subset B \subset C$ such that the area of $A$ is greater than $b - \epsilon$, and the area of $C$
is less than $b + \epsilon$, then the area of $B$ is $b$.
Axioms 1, 2, and 3 should agree with what you know about area from Geometry,
and they can be used to prove some simple results. For example, since a $1 \times 1$ square
has area 1, Axiom 3 can be used to show that an $s \times s$ square has area $s^2$.
The result from the previous section that a line segment has area 0 is particularly
useful because of the way it can be used in conjunction with the Union area axiom.
In particular, suppose $A$ and $B$ are two squares or other polygons set side-by-side so
that they only share an edge. Because the shared edge is a line segment, it has 0 for
its area, and the Union area axiom shows that $A \cup B$ has an area equal to the sum of
the area of $A$ and the area of $B$. By using mathematical induction, this result can be
extended to the union of many polygons that share borders. In particular, consider
finding the area of a rectangle with width $x$ and length $y$. If $\frac{x}{y}$ is a rational number
equal to $\frac{p}{q}$, where $p$ and $q$ are positive integers, then the $x \times y$ rectangle is the union
of $p \times q$ squares all with side length $\frac{x}{p}$. Indeed, the width of the rectangle which has
length $x$ is spanned by $p$ such squares, and the length of the rectangle which has
length $y$ is spanned by $q$ such squares showing that the entire rectangle can be tiled
by a $p \times q$ array of squares, each with area $\left(\frac{x}{p}\right)^2$. The Union axiom then shows that
the area of the $x \times y$ rectangle is $p \cdot q \cdot \left(\frac{x}{p}\right)^2 = x \cdot \frac{q}{p} \cdot x = x \cdot y$. It will require
the last of the area axioms to conclude that the area of any rectangle is equal to its
length times its width even when the length of the rectangle is an irrational multiple
of its width.
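The tiling computation for a rectangle with rational side ratio can be verified with exact fractions. This is a small sketch under the assumption that $x$ and $y$ are given as rational numbers; the name `tiling_area` is illustrative. It reduces $\frac{x}{y}$ to lowest terms $\frac{p}{q}$, forms squares of side $\frac{x}{p}$, and checks that the $p \times q$ array has total area $xy$.

```python
from fractions import Fraction

def tiling_area(x, y):
    """Area of an x-by-y rectangle computed by tiling with p*q equal squares."""
    x, y = Fraction(x), Fraction(y)
    ratio = x / y                      # equals p/q in lowest terms
    p, q = ratio.numerator, ratio.denominator
    side = x / p                       # side length of each square
    # q squares of this side must span the length y exactly.
    assert q * side == y
    return p * q * side ** 2

x, y = Fraction(3, 2), Fraction(5, 2)
print(tiling_area(x, y), x * y)
```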
The last area axiom is essentially the Method of Exhaustion used to some extent by
Euclid and much more extensively by Archimedes to calculate areas and volumes.
It is an example of a use of Calculus about 1800 years before the foundation of
Calculus was formally established by Newton and Leibniz. This axiom says that if
a region in the plane can be closely approximated by sets whose areas you know,
then you can figure out the area of the region. Take, for example, a rectangle $B$ with
width $x > 0$ and length $y > 0$ where the ratio $\frac{y}{x}$ is irrational. It is certainly
possible to find other rectangles close to the size of $B$ whose length to width ratios
are rational. To prove that $B$ has area $xy$, the axiom requires that for each $\epsilon > 0$ you
find a subset $A \subset B$ whose area is greater than $xy - \epsilon$ and a set $C$ containing $B$ whose
area is less than $xy + \epsilon$. Suppose you choose $A$ to be a rectangle with width $x$ and
length just a bit short of $y$, say $rx$, where $r$ is a rational number chosen to be less
than but suitably close to $\frac{y}{x}$. How close is suitably close? Well, you would need the
area of $A$, which is $x \cdot rx = rx^2$, to be within $\epsilon$ of $xy$, that is, $xy - rx^2 < \epsilon$. Solving
for $r$ shows that $r > \frac{y}{x} - \frac{\epsilon}{x^2}$. Is there such an $r$ which is rational and between $\frac{y}{x} - \frac{\epsilon}{x^2}$
and $\frac{y}{x}$? Of course there is. The rational numbers are dense in the real line; there are
rational numbers in every interval of positive length. Thus, you can select a rational
number $r$ between $\frac{y}{x} - \frac{\epsilon}{x^2}$ and $\frac{y}{x}$ and let $A$ be an $x \times rx$ rectangle. Then $A$ can be
placed inside of $B$, and the area of $A$ is within $\epsilon$ of $xy$. Similarly, you can choose a
rectangle $C$ with width $x$ and length $sx$, where $s$ is a rational number chosen to be
greater than but suitably close to $\frac{y}{x}$. You need the area of $C$ to be within $\epsilon$ of $xy$, so
choose $s$ so that $x \cdot sx - xy < \epsilon$. This happens if $\frac{y}{x} < s < \frac{y}{x} + \frac{\epsilon}{x^2}$. Since you have
found a rectangle $A$ contained inside $B$ and a rectangle $C$ containing $B$ with the areas
of $A$ and $C$ within $\epsilon$ of $xy$, the Exhaustion area axiom shows that $B$ has area $xy$.
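The choice of the rational numbers $r$ and $s$ can be illustrated numerically. The sketch below is an assumption-laden illustration: floating-point values stand in for real numbers (so $\sqrt{2}$ is only approximated), and the function name `rational_lengths` is hypothetical. It picks a denominator $q$ with $\frac{1}{q} < \frac{\epsilon}{x^2}$ and takes $r$ and $s$ to be the neighboring multiples of $\frac{1}{q}$ on either side of $\frac{y}{x}$.

```python
import math
from fractions import Fraction

def rational_lengths(x, y, eps):
    """Return rationals r <= y/x < s, each within eps / x^2 of y/x."""
    target = y / x
    q = math.ceil(x * x / eps) + 1     # guarantees 1/q < eps / x^2
    k = math.floor(target * q)         # k/q <= target < (k + 1)/q
    return Fraction(k, q), Fraction(k + 1, q)

x, y, eps = 1.0, math.sqrt(2), 0.001
r, s = rational_lengths(x, y, eps)
area_A = float(r) * x * x              # inner rectangle, x by r*x
area_C = float(s) * x * x              # outer rectangle, x by s*x
print(r, s, area_A, area_C)
```

Both areas land within $\epsilon$ of $xy$, which is what the Exhaustion axiom asks for.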
The familiar formula for the area of a triangle given as one half the base times the
height can be derived geometrically, but to prove this formula using the area axioms
requires more work. To begin, consider a right triangle with legs with lengths $x$ and
$y$. Place this triangle in a rectangle with side lengths $x$ and $y$. For any natural number
$n$, the rectangle can be overlaid with an $n \times n$ grid of rectangles with side lengths
$\frac{x}{n}$ and $\frac{y}{n}$. The hypotenuse of the triangle is the diagonal of the $x \times y$ rectangle and
spans the diagonals of $n$ of the smaller rectangles as shown in Fig. 6.5 exhibiting the
case where $n = 8$.
Because there are $n$ grid rectangles along the hypotenuse of the triangle, it
must be that there are $\frac{n^2 - n}{2}$ grid rectangles inside the triangle with a total area
of $\frac{n^2 - n}{2} \cdot \frac{xy}{n^2} = \left(1 - \frac{1}{n}\right)\frac{xy}{2}$. Similarly, the triangle is enclosed inside the union of
$\frac{n^2 + n}{2}$ grid rectangles with a total area of $\left(1 + \frac{1}{n}\right)\frac{xy}{2}$. Clearly, $n$ can be chosen large
enough to make both the total area of grid rectangles inside the triangle and the total
area of grid rectangles enclosing the triangle within a particular $\epsilon > 0$ of $\frac{xy}{2}$. Thus,
the Exhaustion axiom shows that the area of the triangle is $\frac{xy}{2}$ as expected. Since any
triangle can be partitioned into two right triangles, the well-known area formula for
the area of a triangle follows. Since any polygon can be partitioned into triangles,
the usual formulas from Geometry for the areas of polygons can be derived in the
same way they would be in Geometry.

Fig. 6.5 An $8 \times 8$ grid of rectangles overlaying a triangle
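The counts of grid rectangles inside and enclosing the triangle can be confirmed by brute force. In this sketch (names `grid_counts` and `area_bounds` are illustrative), grid cell $(i, j)$ occupies column $i$ and row $j$ of the $n \times n$ grid, and the triangle is taken to lie below the diagonal joining the ends of the two legs. A cell lies inside the triangle when its far corner is on or below the diagonal, which happens when $i + j \le n - 2$, and a cell is needed for the enclosing union when $i + j \le n - 1$.

```python
from fractions import Fraction

def grid_counts(n):
    """Number of grid cells inside the triangle and number needed to enclose it."""
    inside = sum(1 for i in range(n) for j in range(n) if i + j <= n - 2)
    covering = sum(1 for i in range(n) for j in range(n) if i + j <= n - 1)
    return inside, covering

def area_bounds(x, y, n):
    """Lower and upper area estimates for a right triangle with legs x and y."""
    inside, covering = grid_counts(n)
    cell = Fraction(x) * Fraction(y) / n ** 2   # area of one grid rectangle
    return inside * cell, covering * cell

for n in (8, 80, 800):
    print(n, grid_counts(n), area_bounds(3, 4, n))
```

For legs 3 and 4 the two bounds close in on 6 from below and above as $n$ grows.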
You may wonder whether these techniques can be used to find the area of any
region in the plane, or at least any bounded region in the plane. This is a really good
question with a very complicated answer. The Area Axioms listed in this section
are designed to give the reader a feel for proofs about areas that will be useful
in the upcoming discussion of proofs about Riemann integrals. The axiom list is
not complete enough to allow the calculation of the area of many of the sets that
one might encounter. The area of Analysis known as Measure Theory provides a
somewhat richer environment for this study, but the complexities of measure theory
go beyond the aim of this text. What can be said is that even with the use of measure
theory, there are sets in the plane complex enough that one cannot assign an area
measure to them.

6.4.1 Exercises
1. Show that a circle with radius $r$ has area $\pi r^2$.
2. Suppose the polygonal region $A$ in the coordinate plane has area $K$. Show that
the region $\{(x, y) \mid (x, \frac{y}{3}) \in A\}$ has area $3K$.

6.5 Definition of Riemann Integral


The definition of the Riemann Integral is motivated by the Method of Exhaustion
which attempts to approximate a planar region with sets, perhaps collections of
rectangles or other polygons, whose area can be easily calculated. If the region
whose area is to be calculated is not a polygon itself, then one needs to fill the
region with a sequence of smaller and smaller polygons until a limit is realized.
Such a region might be bounded by the horizontal x-axis, the vertical lines given
by $x = a$ and $x = b$ for some real numbers $a$ and $b$, and finally by the graph of
some nonnegative function $f$. Given this region, one attempts to fill the region with
rectangles whose sides are parallel to the axes, have one side along the x-axis,
and have a length determined in some way by the graph of the given function.


Fig. 6.6 Approximating the area under a curve with narrowing rectangles

If the function is in some sense well behaved, then as the widths of these rectangles
are chosen to be smaller and smaller, the total area of the rectangles will approach
the area of the region (Fig. 6.6). What is meant by well behaved will be a main focus
of the theorems presented in this chapter.
To make the definition of Riemann Integral precise, there needs to be a way to
talk about the placement of the vertical rectangles used to approximate the area
under a curve. This is done by designating the position of the vertical sides of the
rectangles with a collection of x values in the interval $[a, b]$. One defines a partition
of the interval $[a, b]$ to be a finite sequence of x values $\mathcal{P}: a = x_0 \le x_1 \le
x_2 \le \cdots \le x_n = b$ for some natural number $n$. These values of $x$ break the interval
$[a, b]$ into $n$ subintervals $[x_{j-1}, x_j]$. Note that the definition of partition does not say
anything about the lengths of the subintervals for the partition. Indeed, it could be
that the $j$th subinterval length $x_j - x_{j-1}$ could be 0 or could be as large as $b - a$.
In particular, there is no requirement that all the interval lengths be the same size.
Since the lengths of the subintervals $x_j - x_{j-1}$ are used frequently in the discussion
of Riemann Integrals, one often uses the shorthand notation $\Delta x_j = x_j - x_{j-1}$.
Given a partition, $\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$, one defines the
norm of the partition $\mathcal{P}$, $\|\mathcal{P}\|$, to be the maximum length of a subinterval of the
partition, that is, $\|\mathcal{P}\| = \max_{j \le n} \Delta x_j$. For example, if $[a, b] = [1, 4]$ has the partition
$1, 1\frac{1}{2}, 1\frac{3}{4}, 2\frac{1}{2}, 2\frac{2}{3}, 2\frac{2}{3}, 3\frac{1}{4}, 3\frac{1}{2}, 3\frac{5}{6}, 4$, then the norm of the partition is $\frac{3}{4} = 2\frac{1}{2} - 1\frac{3}{4}$, the
largest distance between any two of the adjacent points in the partition. As seen in
the previous section, one can get increasingly better approximations to the area of a
region by attempting to approximate the region by smaller and smaller polygons.
Thus, by requiring the norm of a partition to be smaller, the rectangles used to
approximate the area of a region bounded by a curve become smaller in width and
can give a better approximation.
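The norm of the example partition of $[1, 4]$ can be computed directly. This is a short sketch with an illustrative function name `norm`; exact fractions are used so that the repeated point $2\frac{2}{3}$, which produces a subinterval of length 0, is handled without rounding.

```python
from fractions import Fraction as F

def norm(partition):
    """Largest subinterval length of a partition given as a nondecreasing list."""
    return max(b - a for a, b in zip(partition, partition[1:]))

# The partition 1, 1 1/2, 1 3/4, 2 1/2, 2 2/3, 2 2/3, 3 1/4, 3 1/2, 3 5/6, 4.
P = [F(1), F(3, 2), F(7, 4), F(5, 2), F(8, 3), F(8, 3),
     F(13, 4), F(7, 2), F(23, 6), F(4)]
print(norm(P))  # prints 3/4
```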
For the Riemann Integral, a partition will determine the widths of the rectangles
used to approximate the area of a region. What will be used as the lengths of
those rectangles? Suppose a rectangle rests on the x-axis between $x_{j-1}$ and $x_j$. If
the rectangle is going to fit inside the region bounded by the curve $y = f(x)$, then
the length of the rectangle (its height above the x-axis) cannot exceed $\inf_{x_{j-1} \le x \le x_j} f(x)$.
If the rectangle is going to enclose the part of the region between $x_{j-1}$ and $x_j$, then the
length of the rectangle must be at least $\sup_{x_{j-1} \le x \le x_j} f(x)$. The definition of the Riemann
Integral uses a value between these two possible extremes. It requires the choice of
a sequence of x values $\xi_1, \xi_2, \xi_3, \ldots, \xi_n$ with $x_{j-1} \le \xi_j \le x_j$ for each $j$. Then the
rectangle on $[x_{j-1}, x_j]$ is given the length $f(\xi_j)$ so that it has area $f(\xi_j)\Delta x_j$. Clearly,
the choice of $\xi_j \in [x_{j-1}, x_j]$ results in the length of the rectangle being $f(\xi_j)$ which
is between the two extremes $\inf_{x_{j-1} \le x \le x_j} f(x)$ and $\sup_{x_{j-1} \le x \le x_j} f(x)$, so the rectangles that
result might neither be contained in the region bounded by the curve nor cover the
region. Instead, the lengths of the rectangles are allowed to be in between these two
extremes. The total area of all the rectangles is then given by the Riemann Sum
$\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
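A Riemann sum is simple to compute once a partition and the points $\xi_j$ are fixed. The following sketch is illustrative only; the function name `riemann_sum` and the variable `tags` (the list of chosen $\xi_j$ values) are not from the text. For the increasing function $f(x) = x^2$ on $[0, 1]$, choosing left endpoints gives the infimum on each subinterval and choosing right endpoints gives the supremum, so every other choice of $\xi_j$ lands between those two sums.

```python
from fractions import Fraction as F

def riemann_sum(f, partition, tags):
    """Sum of f(tag_j) * (x_j - x_{j-1}) with each tag_j in [x_{j-1}, x_j]."""
    total = 0
    for tag, left, right in zip(tags, partition, partition[1:]):
        if not (left <= tag <= right):
            raise ValueError("each tag must lie in its subinterval")
        total += f(tag) * (right - left)
    return total

f = lambda x: x * x
partition = [F(k, 4) for k in range(5)]            # 0, 1/4, 1/2, 3/4, 1
left_tags = partition[:-1]
right_tags = partition[1:]
mid_tags = [(a + b) / 2 for a, b in zip(partition, partition[1:])]

lower = riemann_sum(f, partition, left_tags)
middle = riemann_sum(f, partition, mid_tags)
upper = riemann_sum(f, partition, right_tags)
print(lower, middle, upper)
```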

Now, given a function $f$ defined on the interval $[a, b]$, one can define the Riemann
Integral of $f$ on $[a, b]$ to be $I = \int_a^b f(x)\,dx$ if for every $\epsilon > 0$ there is a $\delta > 0$
such that for every partition $\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ of $[a, b]$
with $\|\mathcal{P}\| < \delta$ and for every choice of $\xi_1, \xi_2, \xi_3, \ldots, \xi_n$ with $\xi_j \in [x_{j-1}, x_j]$, it
follows that $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \epsilon$. If $\int_a^b f(x)\,dx$ exists, then $f$ is said to be integrable
(or Riemann integrable) on the interval $[a, b]$. The function $f$ in the integral is
called the integrand. When the integrand $f$ is a nonnegative function, this definition
results in a value for $I$ that can be considered the area of the region bounded by the
x-axis, the lines $x = a$ and $x = b$, and the curve $y = f(x)$. When $f$ is allowed to take
on both positive and negative values, the value of $I$ can be thought of as the area of
the region lying above the x-axis minus the area of the region lying below the x-axis.
The power of the definition of Riemann Integral is that it need not be associated
with area at all. The student may well be familiar with other applications to the
determination of moments, work, force, speed, distances, interest rates, populations,
and many other examples. It is convenient to extend the definition of Riemann
Integral to $\int_a^b f(x)\,dx$ where $b < a$ with the convention $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$.

Note that the definition of Riemann Integral, similar to the definitions of limit and
derivative, states that the integral of $f$ between the numbers $a$ and $b$ is $I$ if for every
$\epsilon > 0$ there is a $\delta > 0$ such that a particular inequality holds. But unlike previous
kinds of limits, the inequality that must hold for Riemann sums is supposed to be
true for every choice of a partition $\mathcal{P}$ and every choice of $\xi_j$'s as long as $\|\mathcal{P}\| < \delta$.
Thus, it is not just that a region in the plane is being approximated by a sequence
of rectangles, but that the region must be closely approximated by every possible
sequence of rectangles that arise from the Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$. Also worth
noting is that the Riemann Integral is not the only way to define integration. Most
of the other definitions give the same value as the Riemann Integral for functions
where the Riemann Integral exists, but some of the other definitions give values to
integrals in situations where the Riemann Integral does not exist. Some examples of
other integration definitions include the Riemann-Stieltjes Integral, the Lebesgue
Integral, the Darboux Integral, and the Daniell Integral.
There are some fairly easy to describe functions that do not have a Riemann
integral. One simple example is the function
$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$
whose Riemann integral is not defined on any interval $[a, b]$ with $a < b$. To see why this is,
consider any Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$. Because both the rational numbers and the
irrational numbers are dense in the real numbers, in any subinterval of the partition
which has positive length, there are values of $\xi_j$ in the subinterval where $f(\xi_j) = 0$,
and other values of $\xi_j$ in the subinterval where $f(\xi_j) = 1$. Thus, for any partition,
there are choices of the $\xi_j$'s that make the Riemann sum equal to $\sum_{j=1}^{n} 0 \cdot \Delta x_j = 0$ and
other choices that make the Riemann sum equal to $\sum_{j=1}^{n} 1 \cdot \Delta x_j = b - a > 0$. Thus,
no limit can exist.

6.5.1 Exercises
1. Let $f(x) = x$. Partition the interval $[1, 3]$ into $n$ subintervals with $1 = x_0$ and
$x_j = x_{j-1} + \frac{2}{n}$ for $j = 1, 2, 3, \ldots, n$.
(a) Find the minimum and maximum possible values for an associated Riemann
sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
(b) Show that as $n$ gets large, the Riemann sum must approach 4.
2. Let $f(x) = x^2$. Partition the interval $[-1, 2]$ into $n$ subintervals with $-1 = x_0$ and
$x_j = x_{j-1} + \frac{3}{n}$ for $j = 1, 2, 3, \ldots, n$.
(a) Find the minimum and maximum possible values for an associated Riemann
sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
(b) Show that as $n$ gets large, the Riemann sum must approach 3.

6.6 Properties of Integrals



There are many theorems about the properties satisfied by Riemann integrals. Some
of the proofs of these theorems merely rely on properties of summations since
the definition of the Riemann Integral is based on the Riemann sum, $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
Consider the following results.

If $a$, $b$, and $c$ are constants, then $\int_a^b c\,dx = c(b - a)$.

If $f$ is an integrable function on the interval $[a, b]$ and $c$ is a constant, then
$\int_a^b c \cdot f(x)\,dx = c\int_a^b f(x)\,dx$.

If $f$ and $g$ are functions integrable on the interval $[a, b]$, then
$\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

If $f$ and $g$ are functions integrable on the interval $[a, b]$, and $f(x) \le g(x)$ for all
$x \in [a, b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

To prove that $\int_a^b c\,dx = c(b - a)$, one needs to find a $\delta > 0$ so that if the norm of a
partition is less than $\delta$, then the Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within some $\epsilon > 0$ of
$c(b - a)$. But in this case $f(\xi_j)$ is always equal to the constant $c$, so the Riemann sum
is always equal to the desired integral, $c(b - a)$. This makes the proof particularly
easy.
Note that the first four steps of this proof merely set up the assumptions required
by the definition of the Riemann Integral. That is, one needs to have constants
$a$ and $b$ and function $f$ defined on the interval $[a, b]$. Then one needs to take an
arbitrary $\epsilon > 0$, find an appropriate $\delta > 0$, and consider an arbitrary Riemann
sum which satisfies the needed condition on the norm of the partition. Although
straightforward, these steps are necessary in order to show that the definition of
Riemann Integral is being satisfied.


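PROOF: If $a$, $b$, and $c$ are constants, then $\int_a^b c\,dx = c(b - a)$.

Let constants $a$, $b$, and $c$ be given.
Without loss of generality, assume that $a \le b$, and let $f(x) = c$ for all $x$ in
$[a, b]$.
Let $\epsilon > 0$ be given.
Let $\delta = 1$, and let $\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ be a partition of
$[a, b]$ with $\|\mathcal{P}\| < 1$.
Then for any choices of $\xi_j \in [x_{j-1}, x_j]$, it follows that
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - c(b - a)\right| = \left|\sum_{j=1}^{n} c\,\Delta x_j - c(b - a)\right| = \left|c \cdot \sum_{j=1}^{n} (x_j - x_{j-1}) - c(b - a)\right| = |c| \cdot |(x_n - x_0) - (b - a)| = 0 < \epsilon$.
Thus, $\int_a^b c\,dx = c(b - a)$.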
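The telescoping step in this proof can be observed on an arbitrary partition. The sketch below is an illustration with hypothetical names (`riemann_sum`, `random_partition`); the random points are generated with a fixed seed and may repeat, which is allowed since subintervals of length 0 are permitted. Whatever partition results, the Riemann sum of the constant function equals $c(b - a)$ exactly.

```python
import random
from fractions import Fraction as F

def riemann_sum(f, partition, tags):
    """Sum of f(tag_j) * (x_j - x_{j-1}) over the subintervals of the partition."""
    return sum(f(t) * (right - left)
               for t, left, right in zip(tags, partition, partition[1:]))

def random_partition(a, b, n, rng):
    """A partition of [a, b] into n subintervals with randomly placed interior points."""
    interior = sorted(a + (b - a) * F(rng.randint(0, 1000), 1000)
                      for _ in range(n - 1))
    return [a] + interior + [b]

rng = random.Random(0)
a, b, c = F(-1), F(3), F(7, 2)
partition = random_partition(a, b, 12, rng)
tags = list(partition[:-1])            # left endpoints; any valid choice works
total = riemann_sum(lambda x: c, partition, tags)
print(total, c * (b - a))
```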

Now consider the next theorem which states that $\int_a^b c \cdot f(x)\,dx = c \cdot \int_a^b f(x)\,dx$.
In the proof of this result you will need to use the fact that $\int_a^b f(x)\,dx = I$ to
say something about the size of $\left|\sum_{j=1}^{n} c \cdot f(\xi_j)\Delta x_j - cI\right|$. But this expression equals
$|c| \cdot \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ suggesting that if you can arrange for $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be
small, then you can arrange for the product $|c| \cdot \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be small. You
will need the product to be less than some given $\epsilon > 0$, so it is tempting to require
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c|}$. This is fine except for the embarrassing case where $c = 0$.
One could handle this problem by breaking the proof into two cases: $c = 0$ and
$c \ne 0$. Easier, though, is to simply ask for $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be less than $\frac{\epsilon}{|c| + 1}$.
The use of $|c| + 1$ in the denominator is just a trick that takes care of the case where
$|c|$ is large and the case where $|c|$ is 0 both at the same time. Of course, you can
arrange $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c| + 1}$ because that follows from $\int_a^b f(x)\,dx = I$.
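PROOF: If $f$ is an integrable function on the interval $[a, b]$ and $c$ is a
constant, then $\int_a^b c \cdot f(x)\,dx = c\int_a^b f(x)\,dx$.

Let interval $[a, b]$ and constant $c$ be given.
Let $f$ be a function defined on $[a, b]$ such that $\int_a^b f(x)\,dx = I$.
Let $\epsilon > 0$ be given.
From the definition of Riemann Integral, there is a $\delta > 0$ such that if
$\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|\mathcal{P}\| < \delta$, then for
every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c| + 1}$.
Then $\left|\sum_{j=1}^{n} c \cdot f(\xi_j)\Delta x_j - cI\right| = |c| \cdot \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| \le |c| \cdot \frac{\epsilon}{|c| + 1} < \epsilon$.
Thus, $\int_a^b c \cdot f(x)\,dx = cI = c\int_a^b f(x)\,dx$.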

The third theorem in this section can be summarized by saying that the integral
of a sum is the sum of the integrals. Its proof is reminiscent of the proof of
the theorem stating that the limit of a sum is the sum of the limits, and of the
theorem stating that the derivative of a sum is the sum of the derivatives. In this
case, you are given that $\int_a^b f(x)\,dx = I$ and $\int_a^b g(x)\,dx = J$ and are then faced with
the distance that the Riemann sum for $f + g$ is from the value of the integral
$I + J$ given by $\left|\sum_{j=1}^{n} (f + g)(\xi_j)\Delta x_j - (I + J)\right|$. This easily breaks into the two
differences $\left|\left(\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right) + \left(\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right)\right|$. The existence of the two
given integrals then lets you choose a value of $\delta > 0$ that will ensure that the
two parts to this sum are both small.


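PROOF: If $f$ and $g$ are integrable functions on the interval $[a, b]$, then
$\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

Let interval $[a, b]$ be given, and let $f$ and $g$ be integrable functions on $[a, b]$
with $\int_a^b f(x)\,dx = I$ and $\int_a^b g(x)\,dx = J$.
Let $\epsilon > 0$ be given.
From the definition of Riemann Integral, there is a $\delta_1 > 0$ such that if
$\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|\mathcal{P}\| < \delta_1$, then
for every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{2}$.
Similarly, there is a $\delta_2 > 0$ such that if $\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le
x_n = b$ is a partition with $\|\mathcal{P}\| < \delta_2$, then for every choice of $\xi_j \in [x_{j-1}, x_j]$,
$\left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right| < \frac{\epsilon}{2}$.
Let $\delta = \min(\delta_1, \delta_2)$.
Let $\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ be a partition of $[a, b]$ with
$\|\mathcal{P}\| < \delta$, and let $\xi_j$'s be chosen with $\xi_j \in [x_{j-1}, x_j]$.
Then $\left|\sum_{j=1}^{n} (f + g)(\xi_j)\Delta x_j - (I + J)\right| = \left|\left(\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right) + \left(\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right)\right| \le \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| + \left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
Thus, $\int_a^b (f + g)(x)\,dx = I + J = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.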
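Both this proof and the preceding one rest on the fact that Riemann sums themselves are linear: for a fixed partition and fixed points $\xi_j$, the sum for $f + g$ is the sum for $f$ plus the sum for $g$, and the sum for $c \cdot f$ is $c$ times the sum for $f$. A minimal sketch of that identity follows; the functions, the partition, and the name `riemann_sum` are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

def riemann_sum(f, partition, tags):
    """Sum of f(tag_j) * (x_j - x_{j-1}) over the subintervals of the partition."""
    return sum(f(t) * (right - left)
               for t, left, right in zip(tags, partition, partition[1:]))

partition = [F(0), F(1, 3), F(1, 2), F(1)]
tags = [F(1, 4), F(1, 2), F(3, 4)]     # one point in each subinterval
f = lambda x: x * x
g = lambda x: 3 * x + 1
c = F(-5, 2)

sum_f = riemann_sum(f, partition, tags)
sum_g = riemann_sum(g, partition, tags)
sum_fg = riemann_sum(lambda x: f(x) + g(x), partition, tags)
sum_cf = riemann_sum(lambda x: c * f(x), partition, tags)
print(sum_fg == sum_f + sum_g, sum_cf == c * sum_f)
```

The proofs then only have to control how far each individual sum is from its integral.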

The final theorem in this section states that if $f(x) \le g(x)$ for all $x \in [a, b]$, then if
the functions are integrable, $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$. It is sufficient to prove this result
when $f$ is the identically 0 function, because if $h(x) \ge 0$ implies $\int_a^b h(x)\,dx \ge 0$, this
would imply that if $f(x) \le g(x)$, then $h(x) = g(x) - f(x) \ge 0$, so $\int_a^b h(x)\,dx \ge 0$.
From there $\int_a^b (g - f)(x)\,dx = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx \ge 0$ and the needed result
follows. With the assumption that $h(x) \ge 0$ for all $x \in [a, b]$, it is not hard to
show that $\int_a^b h(x)\,dx \ge 0$, because the value of every associated Riemann sum must
be nonnegative. How do you turn this into a proof? Recall how the proof went
when showing that if $f(x) \ge 0$, then $\lim_{x \to a} f(x)$ cannot be negative. If you assume that
the limit $L$ is negative, then you can choose an $\epsilon = -\frac{L}{2}$. If $f$ is never negative, it
follows that $|f(x) - L|$ is always greater than $\epsilon$ giving a contradiction. A very similar
argument works here where $f$ is replaced by the Riemann sum.
PROOF: If $f$ and $g$ are functions integrable on the interval $[a, b]$, and
$f(x) \le g(x)$ for all $x \in [a, b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

Let interval $[a, b]$ be given, and let $f$ and $g$ be integrable functions on $[a, b]$
with $f(x) \le g(x)$ for all $x \in [a, b]$.
Define $h(x) = g(x) - f(x)$ which is greater than or equal to 0 for all $x \in [a, b]$.
Since $f$ and $g$ are integrable, so is $h$, and $\int_a^b h(x)\,dx = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx$.
Thus, it suffices to prove that $\int_a^b h(x)\,dx \ge 0$.
Assume instead that $\int_a^b h(x)\,dx = I < 0$, and let $\epsilon = -\frac{I}{2} > 0$.
From the definition of Riemann Integral, there is a $\delta > 0$ such that if
$\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|\mathcal{P}\| < \delta$, then for
every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} h(\xi_j)\Delta x_j - I\right| < \epsilon$.
But for every choice of $\xi_j$, $h(\xi_j) \ge 0$, so $\left|\sum_{j=1}^{n} h(\xi_j)\Delta x_j - I\right| = \sum_{j=1}^{n} h(\xi_j)\Delta x_j - I \ge -I > \epsilon$.
This is a contradiction, so the assumption that $I < 0$ must be false, which completes the proof.

6.6.1 Exercises
Write proofs for each of the following statements.
1. If functions $f_1, f_2, f_3, \ldots, f_n$ are integrable on interval $[a, b]$, and $c_1, c_2, c_3, \ldots, c_n$
are constants, then $\int_a^b \left(c_1 f_1(x) + c_2 f_2(x) + c_3 f_3(x) + \cdots + c_n f_n(x)\right) dx =
c_1\int_a^b f_1(x)\,dx + c_2\int_a^b f_2(x)\,dx + c_3\int_a^b f_3(x)\,dx + \cdots + c_n\int_a^b f_n(x)\,dx$. (In the words
of Linear Algebra, this says that the Riemann integral is a linear operator.)
2. If $f$ is a function integrable on $[a, b]$ with $f(x) \le c$ for all $x \in [a, b]$, then
$\int_a^b f(x)\,dx \le c(b - a)$.
3. If $f$ is a function such that both $f$ and $|f|$ are integrable on $[a, b]$, then
$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$.

6.7 Integrable Functions


It is helpful to have a characterization of those functions which are Riemann
integrable. This section will discuss several theorems which establish some properties of functions that guarantee that they are integrable. Then the following three
sections present a series of results that give a complete characterization of Riemann
integrable functions.
Recall that $f$ is called bounded on $[a, b]$ if there is a number $M$ such that $|f(x)| \le
M$ for all $x \in [a, b]$. It is important to note that if $f$ is integrable on an interval, then $f$
must be bounded on that interval. The way to prove this result is reminiscent of the
way one proves that a function continuous on a closed bounded interval is bounded.
That is, one uses an indirect proof assuming that you have an integrable function
that is not bounded, and from that, you produce a contradiction. Think about what
can be done with a Riemann sum if the function $f$ is not bounded. Given a partition
$\mathcal{P}: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$, for some choice of the $\xi_j$'s the Riemann
sum is $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$. If $f$ is unbounded on $[a, b]$, then it must be unbounded on at
least one of the subintervals $[x_{j-1}, x_j]$; otherwise, if there is a bound for $f$ on each of
the $n$ subintervals, one merely needs to select the largest of those $n$ bounds to have
a bound for $f$ on the entire interval $[a, b]$. So what happens if $f$ is not bounded on
the $k$th subinterval $[x_{k-1}, x_k]$? It means that $\xi_k$ could be changed to be some other
value in the subinterval, say $\xi^*$, to make the term $f(\xi^*)\Delta x_k$ as large as you like.
Thus, you can make the entire Riemann sum as large as you like. So how large do
you want $f(\xi^*)\Delta x_k$ to be? You want $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be larger than $\epsilon$ for some
preassigned $\epsilon > 0$ such as $\epsilon = 1$. The proof below does this by selecting a value
$\xi^*$ to replace $\xi_k$ in such a way that the $k$th term of the Riemann sum, $f(\xi^*)\Delta x_k$,
is larger by at least 1 than the sum of the absolute values of all the other terms of
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ guaranteeing that the resulting expression will be bigger than 1.
This gives the needed contradiction.


PROOF: If $f$ is a function integrable on the interval $[a,b]$, then $f$ is bounded on $[a,b]$.

Let $f$ be an integrable function on the interval $[a,b]$.
Assume that $f$ is not bounded on $[a,b]$.
Let $\varepsilon = 1$, and $\int_a^b f(x)\,dx = I$.
Then from the definition of the Riemann integral, there is a $\delta > 0$ such that if $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|P\| < \delta$ and the $\xi_j$'s are chosen with $\xi_j \in [x_{j-1}, x_j]$, the Riemann sum satisfies $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \varepsilon = 1$.
Let a particular partition $P$ with $\|P\| < \delta$ and choices for $\xi_j \in [x_{j-1}, x_j]$ be given.
Because $f$ is not bounded on $[a,b]$, it follows that there is a $k$ between 1 and $n$ such that $f$ is not bounded on the interval $[x_{k-1}, x_k]$. Otherwise, $f$ is bounded on each of the subintervals of the partition, implying that it is bounded on the entire interval $[a,b]$. Note that $\Delta x_k > 0$ because a function cannot be unbounded on an interval of length 0.
Let $J = \sum_{j=1}^{n} |f(\xi_j)|\Delta x_j - |f(\xi_k)|\Delta x_k + |I|$.
Because $f$ is unbounded on $[x_{k-1}, x_k]$, there exists $\xi^* \in [x_{k-1}, x_k]$ such that $|f(\xi^*)| > \frac{J+1}{\Delta x_k}$.
Then the Riemann sum resulting from the partition $P$ with the choices of $\xi_j$ where $\xi_k$ is replaced by $\xi^*$ must satisfy
\[
1 > \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j + \bigl(f(\xi^*) - f(\xi_k)\bigr)\Delta x_k - I\right|
\ge |f(\xi^*)|\Delta x_k - \left(\sum_{j=1}^{n} |f(\xi_j)|\Delta x_j - |f(\xi_k)|\Delta x_k + |I|\right)
> \frac{J+1}{\Delta x_k}\,\Delta x_k - J = 1.
\]
This is a contradiction, so the assumption that $f$ is unbounded must be false.
This completes the proof.
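The mechanism of this proof can be watched numerically. The following is only a sketch under stated assumptions: the unbounded function $f(x) = 1/x$ for $x > 0$ with $f(0) = 0$ on $[0,1]$, the uniform partition, and the names `f` and `riemann_sum` are illustrative choices that do not come from the text. Replacing just the first sample point by a value $\xi^*$ closer and closer to 0 makes the Riemann sum as large as desired, so no number $I$ can stay within 1 of all such sums.

```python
def f(x):
    """Hypothetical unbounded function on [0, 1]: 1/x for x > 0, and 0 at x = 0."""
    return 1.0 / x if x > 0 else 0.0


def riemann_sum(func, points, samples):
    """Sum of func(samples[j]) * (points[j+1] - points[j]) over the partition."""
    return sum(func(s) * (right - left)
               for s, left, right in zip(samples, points, points[1:]))


n = 10
points = [j / n for j in range(n + 1)]   # uniform partition of [0, 1]
samples = points[1:]                     # right endpoints play the role of the xi_j

base = riemann_sum(f, points, samples)

# Replace only the first sample point by xi* closer and closer to 0.
sums = []
for xi_star in (1e-2, 1e-4, 1e-6):
    altered = [xi_star] + samples[1:]
    sums.append(riemann_sum(f, points, altered))

print(base, sums)
```

The partition never changes in this experiment; only one sample point moves, which is exactly the freedom the proof exploits.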
Knowing that integrable functions must be bounded is very helpful. If you can claim that $|f(x)| \le M$ for some constant $M$, then you know that any one term of a Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ can contribute at most $M \cdot \Delta x_j$ to the sum. By forcing the norm of the partition, $\|P\|$, to be very small, you can control the maximum size of $\Delta x_j$ and, thus, the maximum size of a term in the Riemann sum. This is the key idea behind the proof of the next theorem which states that if $f$ is integrable on $[a,b]$ and on $[b,c]$, then $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$. To prove this, it is natural to consider finding a $\delta_1 > 0$ so that Riemann sums arising from partitions of $[a,b]$ with norm less than $\delta_1$ are close to $I = \int_a^b f(x)\,dx$ and finding a $\delta_2 > 0$ so that Riemann sums arising from partitions of $[b,c]$ with norm less than $\delta_2$ are close to $J = \int_b^c f(x)\,dx$. You would consider allowing $\delta$ to equal the minimum of $\delta_1$ and $\delta_2$.


Then you could take a partition of $[a,c]$ with a norm less than $\delta$. Unfortunately, this partition of $[a,c]$ does not separate into a partition of $[a,b]$ and a partition of $[b,c]$ because there is no guarantee that the given partition of $[a,c]$ includes the point $b$ as one of the $x_j$ values in the partition. But if you change the Riemann sum by altering the interval of the partition containing the point $b$ by adding $b$ as an extra point to the partition, you are not making a large change in the total sum. More precisely, suppose the partition is $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = c$ with the point $b$ in the interval $[x_{k-1}, x_k]$. A resulting Riemann sum has the term $f(\xi_k)(x_k - x_{k-1})$. If this term is replaced by two terms $f(b)(b - x_{k-1}) + f(b)(x_k - b)$, how much does this change the Riemann sum? The change is exactly $f(b)(b - x_{k-1}) + f(b)(x_k - b) - f(\xi_k)(x_k - x_{k-1}) = \bigl(f(b) - f(\xi_k)\bigr)(x_k - x_{k-1})$. Given that $f$ is integrable on $[a,b]$ and on $[b,c]$, you know that there is a bound $M$ such that $|f(x)| \le M$ for all $x \in [a,c]$. An upper bound for the size of this change is, therefore, $2M(x_k - x_{k-1}) < 2M\delta$. This says that by choosing $\delta$ small enough, you can control the amount of change made in the Riemann sum by introducing $b$ as a point in the partition of $[a,c]$. If $\delta$ is also chosen small enough so that the resulting Riemann sums for $\int_a^b f(x)\,dx$ and $\int_b^c f(x)\,dx$ are close to the corresponding integrals, then the total difference between the original Riemann sum and the sum of the integrals $\int_a^b f(x)\,dx + \int_b^c f(x)\,dx$ is small enough. This is the idea behind the following proof.


PROOF: If $f$ is a function integrable on the interval $[a,b]$ and on the interval $[b,c]$, then $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$.

Without loss of generality assume that $a < b < c$, and let $f$ be a function integrable on the interval $[a,b]$ and on the interval $[b,c]$ with $I = \int_a^b f(x)\,dx$ and $J = \int_b^c f(x)\,dx$.
Because $f$ is integrable on $[a,b]$, $|f|$ is bounded on that interval by some value $M_1$. Because $f$ is integrable on $[b,c]$, $|f|$ is bounded on that interval by some value $M_2$. It follows that $|f|$ is bounded on the interval $[a,c]$ by $M = \max(M_1, M_2)$.
Let $\varepsilon > 0$ be given.
From the definition of Riemann integration, there is a $\delta_1 > 0$ such that for every partition $P$ of $[a,b]$ with $\|P\| < \delta_1$ and every choice of $\xi_j \in [x_{j-1}, x_j]$ on the intervals of the partition, the associated Riemann sum will be within $\frac{\varepsilon}{3}$ of the integral $I$.
Similarly, there is a $\delta_2 > 0$ such that for every partition $P$ of $[b,c]$ with $\|P\| < \delta_2$ and every choice of $\xi_j \in [x_{j-1}, x_j]$, the associated Riemann sum will be within $\frac{\varepsilon}{3}$ of the integral $J$.




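Let $\delta = \min\left(\delta_1, \delta_2, \frac{\varepsilon}{6M+1}\right)$.
Let $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = c$ be a partition of $[a,c]$ with $\|P\| < \delta$.
Let the $\xi_j$'s be chosen such that $\xi_j \in [x_{j-1}, x_j]$.
Since $b \in [a,c]$, there is a $k$ such that $b \in [x_{k-1}, x_k]$.
Then
\[
\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - (I + J)\right|
= \left|\left(\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1})\right) + \left(f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j\right) + \bigl(f(\xi_k) - f(b)\bigr)\Delta x_k - (I + J)\right|
\]
\[
\le \left|\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1}) - I\right| + \left|f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j - J\right| + |f(\xi_k) - f(b)|\,\Delta x_k .
\]
Since the partition $a = x_0 \le x_1 \le x_2 \le \cdots \le x_{k-1} \le b$ is a partition of $[a,b]$ with norm less than $\delta_1$, it follows that $\left|\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1}) - I\right| < \frac{\varepsilon}{3}$.
Similarly, since the partition $b \le x_k \le x_{k+1} \le \cdots \le x_n = c$ is a partition of $[b,c]$ with norm less than $\delta_2$, it follows that $\left|f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j - J\right| < \frac{\varepsilon}{3}$.
Also, $|f(\xi_k) - f(b)|\,\Delta x_k < 2M \cdot \frac{\varepsilon}{6M+1} < \frac{\varepsilon}{3}$.
Therefore, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - (I + J)\right| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$.
This proves that $\int_a^c f(x)\,dx = I + J$ and completes the proof of the theorem.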

Note that you can easily show that this theorem also holds if $a > b$ or $b > c$ by simply rearranging the order of the limits on one or more of the integrals.

The previous section discusses the theorem stating that if integrable functions satisfy $f \le g$ on $[a,b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$. Can this statement be made stronger? That is, if $f(x) \le g(x)$ for $x \in [a,b]$, with $f(x) < g(x)$ for some $x \in [a,b]$, can you conclude that $\int_a^b f(x)\,dx < \int_a^b g(x)\,dx$? The answer is no. For example, if $f$ and $g$ only differ for a finite number of $x$ values, then $f$ and $g$ will have identical integrals. To prove this, start with two integrable functions, $f$ and $g$, that are identical for all $x \in [a,b]$ except for some $t \in [a,b]$. How would you prove that $f$ and $g$ have


identical integrals? Again, you should consider the Riemann sums associated with $f$ and $g$, that is, consider a Riemann sum $\sum_{j=1}^{n} g(\xi_j)\Delta x_j$ for $g$ with a particular partition and choice of $\xi_j$'s, and compare it to the corresponding sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ for $f$. If $f(x) = g(x)$ at all points except $x = t$, how many of the corresponding terms in these two Riemann sums could be different? Well, only those terms for which the chosen $\xi_j = t$ and $\Delta x_j \ne 0$. This could happen at most twice (twice in the unusual case of $t = x_j = \xi_j = \xi_{j+1}$). Thus, the Riemann sum for $g$ is identical to the Riemann sum for $f$ plus at most two terms. By controlling the size of $\Delta x_j$, which you can do by limiting the norm of the partition, you can control the contribution of those at most two terms in the Riemann sum, thus ensuring that the sums for $f$ and $g$ are close. That is the idea behind the following proof.
PROOF: Suppose that $f$ and $g$ are functions integrable on the interval $[a,b]$, and that $f(x) = g(x)$ for all $x \in [a,b]$ except perhaps at $t \in [a,b]$. Then $\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.

Let $f$ and $g$ be functions integrable on the interval $[a,b]$, and suppose that $f(x) = g(x)$ for all $x \in [a,b]$ except perhaps at $t \in [a,b]$.
Let $\int_a^b f(x)\,dx = I$.
Let $M = \max(|f(t)|, |g(t)|) + 1$.
Let $\varepsilon > 0$ be given.
From the definition of Riemann integration, there is a $\delta_1 > 0$ such that for every partition $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ with norm less than $\delta_1$, and every choice of $\xi_j \in [x_{j-1}, x_j]$, the associated Riemann sum satisfies $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\varepsilon}{2}$.
Let $\delta_2 = \frac{\varepsilon}{8M}$, and set $\delta = \min(\delta_1, \delta_2)$.
Select any partition $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ with norm less than $\delta$, and select any sequence of $\xi_j \in [x_{j-1}, x_j]$. Then the associated Riemann sum for the function $g$ satisfies
\[
\left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - I\right| \le \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| + |g(t) - f(t)| \cdot 2\delta < \frac{\varepsilon}{2} + 2M \cdot 2 \cdot \frac{\varepsilon}{8M} = \varepsilon .
\]
Thus, $\int_a^b g(x)\,dx = I = \int_a^b f(x)\,dx$ which proves the theorem.

It is left as an exercise to extend this theorem to the case where $f$ and $g$ differ at a finite number of points. In fact, this can be extended to $f$ and $g$ which differ on an infinite sequence of points in $[a,b]$ as long as the sequence has a limit.
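A small numerical experiment can make the one-point argument concrete. This is a sketch under assumptions that are not in the text: $f(x) = x$ on $[0,1]$, a hypothetical $g$ that equals $f$ except for $g(0.5) = 10$, and a uniform partition in which the point $t = 0.5$ is deliberately chosen as the sample point of both subintervals that contain it (the unusual "twice" case). The two Riemann sums then differ by exactly two terms, and the difference shrinks with the norm of the partition.

```python
def riemann_sum(func, points, samples):
    """Sum of func(samples[j]) * (points[j+1] - points[j]) over the partition."""
    return sum(func(s) * (right - left)
               for s, left, right in zip(samples, points, points[1:]))


T = 0.5  # the single point where g differs from f (illustrative choice)


def f(x):
    return x


def g(x):
    """Equal to f everywhere except at x = T."""
    return 10.0 if x == T else f(x)


differences = {}
for n in (10, 100, 1000):
    points = [j / n for j in range(n + 1)]
    samples = points[1:]            # right endpoints as sample points
    samples[n // 2] = T             # the interval starting at T also samples T
    differences[n] = riemann_sum(g, points, samples) - riemann_sum(f, points, samples)

print(differences)
```

Each difference equals $2\,(g(t) - f(t))/n$ here, which is the quantity $|g(t) - f(t)| \cdot 2\|P\|$ bounded in the proof.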


6.7.1 Exercises
Write proofs for each of the following statements.
1. If $f$ and $g$ are functions integrable on the interval $[a,b]$, and $f(x) = g(x)$ for all $x \in [a,b]$ except perhaps at the finite set of points $\{t_1, t_2, t_3, \ldots, t_k\} \subset [a,b]$, then $\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.
2. If $f$ and $g$ are functions integrable on the interval $[a,b]$, and $f(x) = g(x)$ for all $x \in [a,b]$ except perhaps on a sequence of points $\{t_1, t_2, t_3, \ldots\} \subset [a,b]$ where $\lim_{j \to \infty} t_j = L$, then $\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.
3. If $f$ is defined on the interval $[0,1]$ by $f(0) = 0$ and for each natural number $n$, $f(x) = \frac{1}{2^n}$ for all $x$ with $\frac{1}{2^n} < x \le \frac{1}{2^{n-1}}$, then $\int_0^1 f(x)\,dx$ exists and is equal to $\frac{1}{3}$.
4. If $f$ is defined on the interval $[0,1]$ by $f(0) = 0$ and for each natural number $n$, $f(x) = \left(\frac{3}{2}\right)^n$ for all $x$ with $\frac{1}{2^n} < x \le \frac{1}{2^{n-1}}$, then $\int_0^1 f(x)\,dx$ does not exist but $\lim_{r \to 0^+} \int_r^1 f(x)\,dx = 3$.
5. If $f$ is integrable on $[a,b]$, then the function defined on $[a,b]$ by $F(x) = \int_a^x f(t)\,dt$ is continuous on $[a,b]$.

6.8 Step Functions


Step functions play an important role in the theory of Riemann integration. A step function $s$ on the interval $[a,b]$ is associated with a partition $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ of $[a,b]$ and has the property that $s$ is constant on each interval of the partition, $(x_{j-1}, x_j)$. For example, the function
\[
s(x) = \begin{cases}
3 & 0 \le x < 2 \\
1 & x = 2 \\
4 & 2 < x < 4 \\
0 & x = 4 \\
1 & 4 < x \le 5
\end{cases}
\]
is a step function defined on the interval $[0,5]$ (Fig. 6.7). It follows easily that a step function on an interval $[a,b]$ is integrable there. Indeed, suppose that $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$, and $s(x) = c_j$ for all $x$ satisfying $x_{j-1} < x < x_j$. Clearly, the constant function $c_j$ is integrable on the interval $[x_{j-1}, x_j]$, and the function $s(x)$ differs from this constant function at no more than the two endpoints, $x_{j-1}$ and $x_j$. Thus, by the previous theorem, $\int_{x_{j-1}}^{x_j} s(x)\,dx = c_j \cdot \Delta x_j$. Then,
\[
\int_a^b s(x)\,dx = \sum_{j=1}^{n} \int_{x_{j-1}}^{x_j} s(x)\,dx = \sum_{j=1}^{n} c_j \cdot \Delta x_j .
\]

Fig. 6.7 The step function $s(x)$
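The formula for the integral of a step function is easy to try out. The sketch below assumes the example step function on $[0,5]$ takes the values 3, 4, and 1 on $(0,2)$, $(2,4)$, and $(4,5)$, with the values 1 and 0 at the interior partition points; the helper names `step_integral` and `s` are illustrative only. It compares the sum of $c_j \cdot \Delta x_j$ with an ordinary Riemann sum on a fine uniform partition.

```python
def step_integral(points, values):
    """Integral of a step function equal to values[j] on (points[j], points[j+1])."""
    return sum(c * (right - left)
               for c, left, right in zip(values, points, points[1:]))


def s(x):
    """Example step function on [0, 5], including its values at partition points."""
    if 0 <= x < 2:
        return 3
    if x == 2:
        return 1
    if 2 < x < 4:
        return 4
    if x == 4:
        return 0
    return 1  # 4 < x <= 5


exact = step_integral([0, 2, 4, 5], [3, 4, 1])   # 3*2 + 4*2 + 1*1

# Riemann sum on a uniform partition of [0, 5] with right endpoints as sample points.
n = 1000
width = 5 / n
approx = sum(s(5 * j / n) * width for j in range(1, n + 1))

print(exact, approx)
```

The values of $s$ at the partition points 2 and 4 affect only two terms of the Riemann sum, each of size at most $4 \cdot \Delta x_j$, which is why they play no role in `step_integral`.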

The importance of step functions comes from the fact that a function $f$ is integrable on $[a,b]$ if and only if $f$ can be closely approximated by step functions. Precisely, $f$ has a Riemann integral on the interval $[a,b]$ if and only if for every $\varepsilon > 0$, there exist step functions $u(x)$ and $v(x)$ on $[a,b]$ with the property that for all $x \in [a,b]$, $v(x) \le f(x) \le u(x)$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \varepsilon$. That is, $f$ has an integral precisely when for every $\varepsilon > 0$ there is a lower step function $v$ that is always less than or equal to $f$ and an upper step function $u$ that is always greater than or equal to $f$ with the property that the integrals of $v$ and $u$ are within $\varepsilon$ of each other. This squeezes $f$ between two step functions whose integrals are as close as you want. This should remind you of the Exhaustion Area Axiom.

The statement of this theorem is a biconditional statement; that is, it is an if and only if statement. This means that the proof will have two distinct parts. One proof must show that if a function is integrable, then it can be approximated by very close upper and lower step functions. The other proof must show that if a function can be approximated by very close upper and lower step functions, then it is integrable. Consider how you would approach the proofs of each of these statements.

For the first part of the proof, you would consider a function, $f$, integrable on an interval $[a,b]$. Given an $\varepsilon > 0$, somehow you need to show that there are upper and lower step functions, $u$ and $v$, whose integrals are within $\varepsilon$ of each other. Where do you start? All you know about $f$ is that it has a Riemann integral on $[a,b]$; thus, all you have to go on is the definition of Riemann integration, which makes a statement about the properties of Riemann sums. The key observation here is that a Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is equal to the integral of a step function defined to be equal to the constant $f(\xi_j)$ on the interval $(x_{j-1}, x_j)$. Since the definition of the integral guarantees


that you can find Riemann sums that are very close to the value of the integral, this suggests how you might choose step functions whose integrals are close to each other. How can you assure that you choose a step function that is less than or equal to $f(x)$ for each $x \in [a,b]$? For each interval of the partition $(x_{j-1}, x_j)$ you could consider selecting $\xi_j$ so that $f(\xi_j)$ is the minimum value of $f$ on that interval. Unfortunately, $f$ might not achieve a minimum value on that interval. Certainly, if $f$ is continuous on $[x_{j-1}, x_j]$, then it attains its minimum on that interval, but there is nothing here indicating that $f$ is continuous. On the other hand, you do know that, because $f$ is integrable, it is bounded. Thus, there is a greatest lower bound $M_j = \inf_{x \in (x_{j-1}, x_j)} f(x)$. There may not be any $x \in (x_{j-1}, x_j)$ with the property that $f(x) = M_j$, but you know that there are values of $x$ in the interval such that $f(x)$ is as close as you like to $M_j$.

Getting specific, now, your goal is to find upper and lower step functions whose integrals are within some given $\varepsilon > 0$ of each other. It makes sense, therefore, to find upper and lower step functions whose integrals are both within $\frac{\varepsilon}{2}$ of the value of the integral of $f$ because then the integrals of the two step functions will be within $\varepsilon$ of each other. From the definition of the Riemann integral, you can find a partition of $[a,b]$ such that all associated Riemann sums are within $\frac{\varepsilon}{4}$ of the integral of $f$. Then you can define a lower step function, $v(x)$, that is equal to the infimum of $f$ on each interval of the chosen partition. On each interval of the partition you can find $\xi_j$ values so that $f(\xi_j)$ is within $\frac{\varepsilon}{4(b-a)}$ of $v(\xi_j)$. Then the integral of the lower step function will be within $\frac{\varepsilon}{4(b-a)} \cdot (b - a) = \frac{\varepsilon}{4}$ of a Riemann sum for $f$ which in turn is within $\frac{\varepsilon}{4}$ of the integral of $f$. This produces a lower step function with the properties you want. A similar construction will produce an upper step function whose integral is also within $\frac{\varepsilon}{2}$ of the integral of $f$, and that will complete the first part of the proof (Fig. 6.8).
Fig. 6.8 Choosing $\xi_j$ on $(x_{j-1}, x_j)$

For the second part of the proof, you consider a function, $f$, such that for each $\varepsilon > 0$ you can find a lower step function, $v(x)$, and an upper step function, $u(x)$, whose integrals are within $\varepsilon$ of each other. You must then show that $f$ has an integral. The first task is to figure out what value $I$ will serve as the integral of $f$. Your proof will need to show that Riemann sums for $f$ approach this value of $I$, so you first need a target $I$ for that purpose. To do this, consider the collection of all possible lower step functions, $v(x)$. That is, let $\mathcal{L} = \{v \mid v \text{ is a step function with } v(x) \le f(x) \text{ for all } x \in [a,b]\}$. Each $v \in \mathcal{L}$ has an integral, and each integral should be less than or equal to the needed value of $I$. How about taking the least upper bound of all of those integrals? Does the least upper bound exist? It does if the collection of integrals of elements of $\mathcal{L}$ is bounded above. To get that, all you need is one upper step function $u$. For each $v \in \mathcal{L}$ and for each $x \in [a,b]$, you know that $v(x) \le f(x) \le u(x)$. This ensures that for each $v \in \mathcal{L}$, $\int_a^b v(x)\,dx \le \int_a^b u(x)\,dx$, showing that the set of integrals of elements in $\mathcal{L}$ is bounded above. That allows you to set $I = \sup_{v \in \mathcal{L}} \int_a^b v(x)\,dx$. This makes sense because $I$ would then be greater than or equal to the integral of any lower step function. It would also have to be less than or equal to the integral of any upper step function. Since the assumption is that the integrals of lower step functions and upper step functions can be found arbitrarily close to each other, and each integral of an upper step function must be greater than or equal to any integral of a lower step function, you would expect that the least upper bound of the lower step function integrals would be equal to the greatest lower bound of the upper step function integrals, and this value is what you will choose for $I$.
After determining $I$, your proof can proceed naturally. You need to show that by restricting the norm of a partition of $[a,b]$, you can force an associated Riemann sum for $f$ to be close to $I$. What you have at your disposal is the ability to find upper and lower step functions whose integrals are close to each other. A helpful observation is that if you have a lower step function $v$ and an upper step function $u$, then for any partition and choice of $\xi_j$ in the intervals of the partition, you know that $\sum_{j=1}^{n} v(\xi_j)\Delta x_j \le \sum_{j=1}^{n} f(\xi_j)\Delta x_j \le \sum_{j=1}^{n} u(\xi_j)\Delta x_j$. So you can choose upper and lower step functions, $u$ and $v$, whose integrals are each within, say, $\frac{\varepsilon}{2}$ of $I$. Then you can choose a norm of a partition so that any Riemann sum for $v$ is within $\frac{\varepsilon}{2}$ of the integral of $v$, and any Riemann sum for $u$ is within $\frac{\varepsilon}{2}$ of the integral of $u$. That will force the corresponding Riemann sum for $f$ to be within $\varepsilon$ of $I$, completing the proof.
PROOF: The function $f$ is integrable on the interval $[a,b]$ if and only if for every $\varepsilon > 0$ there are step functions, $u$ and $v$, such that for each $x \in [a,b]$, $v(x) \le f(x) \le u(x)$ and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \varepsilon$.

Let the function $f$ and the interval $[a,b]$ be given.
Without loss of generality, assume that $a < b$, for if $a = b$, the result follows trivially.

PART I: Integrability implies close upper and lower step functions
Assume that $f$ is an integrable function with $\int_a^b f(x)\,dx = I$.
Let $\varepsilon > 0$ be given.
By the definition of Riemann integration, there is a $\delta > 0$ such that for any partition of $[a,b]$ with norm less than $\delta$ and any choice of $\xi_j$'s in the intervals of the partition, the associated Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within $\frac{\varepsilon}{4}$ of $I$.
Let $P : a = x_0 < x_1 < x_2 < \cdots < x_n = b$ be such a partition.
Note that since $f$ is integrable, $f$ is a bounded function on the interval $[a,b]$.
Because $f$ is bounded, for each $j = 1, 2, 3, \ldots, n$, the value of $\inf_{x_{j-1} < x < x_j} f(x)$ exists. Therefore, there exists $\xi_j \in (x_{j-1}, x_j)$ such that $f(\xi_j) < \inf_{x_{j-1} < x < x_j} f(x) + \frac{\varepsilon}{4(b-a)}$.
For each $j$, define $v(x_j) = f(x_j)$ and for $x \in (x_{j-1}, x_j)$, define $v(x) = \inf_{x_{j-1} < t < x_j} f(t) \ge f(\xi_j) - \frac{\varepsilon}{4(b-a)}$.
Then $v$ is a step function with the property that $v(x) \le f(x)$ for all $x \in [a,b]$, and
\[
\int_a^b v(x)\,dx \ge \sum_{j=1}^{n} \left(f(\xi_j) - \frac{\varepsilon}{4(b-a)}\right)\Delta x_j = \sum_{j=1}^{n} f(\xi_j)\Delta x_j - \frac{\varepsilon}{4} .
\]
Since the Riemann sum was chosen to be within $\frac{\varepsilon}{4}$ of $I$, the integral of $v$ is greater than $I - \frac{\varepsilon}{2}$.
Similarly, one can define an upper step function $u$ in the same way that $v$ was defined except that, in this case, the $\xi_j$ values are chosen to satisfy $f(\xi_j) > \sup_{x_{j-1} < x < x_j} f(x) - \frac{\varepsilon}{4(b-a)}$, and for $x \in (x_{j-1}, x_j)$ the function $u(x)$ is defined to be $\sup_{x_{j-1} < t < x_j} f(t) \le f(\xi_j) + \frac{\varepsilon}{4(b-a)}$.
Then $u$ is a step function with the property that $f(x) \le u(x)$ for all $x \in [a,b]$, and
\[
\int_a^b u(x)\,dx \le \sum_{j=1}^{n} \left(f(\xi_j) + \frac{\varepsilon}{4(b-a)}\right)\Delta x_j = \sum_{j=1}^{n} f(\xi_j)\Delta x_j + \frac{\varepsilon}{4} .
\]
Since the Riemann sum was chosen to be within $\frac{\varepsilon}{4}$ of $I$, the integral of $u$ is less than $I + \frac{\varepsilon}{2}$.
It follows that $u$ and $v$ are upper and lower step functions for $f$ and have the property that $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \left(I + \frac{\varepsilon}{2}\right) - \left(I - \frac{\varepsilon}{2}\right) = \varepsilon$.
This completes PART I of the proof.

PART II: Close upper and lower step functions implies integrability
Assume that for every $\varepsilon > 0$ there exist step functions $u$ and $v$ satisfying $v(x) \le f(x) \le u(x)$ for every $x \in [a,b]$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \varepsilon$.


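Let $u^*$ be any upper step function for $f$. Every lower step function, $v$, satisfies $v(x) \le f(x) \le u^*(x)$ for every $x \in [a,b]$, implying that $\int_a^b v(x)\,dx \le \int_a^b u^*(x)\,dx$ and that the set $\left\{\int_a^b v(x)\,dx \;\middle|\; v \text{ is a lower step function of } f\right\}$ is bounded above by $\int_a^b u^*(x)\,dx$.
The set of integrals of lower step functions of $f$ is nonempty and bounded above, so let $I = \sup\left\{\int_a^b v(x)\,dx \;\middle|\; v \text{ is a lower step function of } f\right\}$.
Let $\varepsilon > 0$ be given.
By assumption there are step functions $u$ and $v$ satisfying $v(x) \le f(x) \le u(x)$ for every $x \in [a,b]$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \frac{\varepsilon}{2}$.
Since the integral of any upper step function is an upper bound for the set of all integrals of lower step functions, it follows that $I \le \int_a^b u(x)\,dx < \int_a^b v(x)\,dx + \frac{\varepsilon}{2} \le I + \frac{\varepsilon}{2}$.
Also $\int_a^b v(x)\,dx > \int_a^b u(x)\,dx - \frac{\varepsilon}{2} \ge I - \frac{\varepsilon}{2}$.
Since $u$ is integrable, there is a $\delta_1 > 0$ such that for every partition of $[a,b]$ with norm less than $\delta_1$ and every choice of $\xi_j$'s in the intervals of the partition, the Riemann sum $\sum_{j=1}^{n} u(\xi_j)\Delta x_j$ is within $\frac{\varepsilon}{2}$ of $\int_a^b u(x)\,dx$.
Similarly, there is a $\delta_2 > 0$ such that for every partition of $[a,b]$ with norm less than $\delta_2$ and every choice of $\xi_j$'s in the intervals of the partition, the Riemann sum $\sum_{j=1}^{n} v(\xi_j)\Delta x_j$ is within $\frac{\varepsilon}{2}$ of $\int_a^b v(x)\,dx$.
Let $\delta = \min(\delta_1, \delta_2)$, and let $P : a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ be a partition of $[a,b]$ with $\|P\| < \delta$.
For each $j$, let $\xi_j$ be chosen in the interval $[x_{j-1}, x_j]$.
Then it follows that
\[
I - \varepsilon = I - \frac{\varepsilon}{2} - \frac{\varepsilon}{2} < \int_a^b v(x)\,dx - \frac{\varepsilon}{2} < \sum_{j=1}^{n} v(\xi_j)\Delta x_j \le \sum_{j=1}^{n} f(\xi_j)\Delta x_j \le \sum_{j=1}^{n} u(\xi_j)\Delta x_j < \int_a^b u(x)\,dx + \frac{\varepsilon}{2} < I + \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = I + \varepsilon .
\]
Thus, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \varepsilon$ which shows that $\int_a^b f(x)\,dx = I$ and completes the proof of PART II.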


This theorem provides a characterization of integrable functions, but it is not the easiest characterization to use when faced with determining whether or not a function is integrable. To use this criterion to determine if a given function $f$ is integrable, one needs to show that the function admits upper and lower step functions whose integrals are within $\varepsilon$ of each other. This is not the easiest criterion to apply. The next two sections will develop other criteria for integrability, but the results will be based closely on this theorem about step functions.

6.8.1 Exercises
Write proofs for each of the following statements.
1. If $s(x)$ and $t(x)$ are both step functions on the interval $[a,b]$, then so are
(a) $s(x) + t(x)$.
(b) $s(x)t(x)$.
(c) $\max\bigl(s(x), t(x)\bigr)$.
(d) $s^2(x) + t^2(x)$.
2. If $f$ and $g$ are integrable functions on the interval $[a,b]$, then so is $\max(f, g)$.
3. If $f$ is an integrable function on the interval $[a,b]$, then so is $|f|$.

6.9 Integrals of Continuous Functions


The previous theorem about step functions gives a straightforward way to prove that all continuous functions are integrable. Such a proof would take an arbitrary function $f$ that is continuous on the interval $[a,b]$ for some $a < b$ and an arbitrary $\varepsilon > 0$, and show that $f$ has upper and lower step functions whose integrals are within $\varepsilon$ of each other. What is it about such a continuous function, $f$, that allows the construction of these upper and lower step functions? The important result about continuous functions that comes into play here is that if $f$ is continuous on $[a,b]$, then it is uniformly continuous there. This has the consequence that there is a $\delta > 0$ such that if $x$ and $y$ are in $[a,b]$ with $|x - y| < \delta$, then $|f(x) - f(y)| < \frac{\varepsilon}{2(b-a)}$. This means that if $x_{j-1} < x_j$ with $x_j - x_{j-1} < \delta$, then $\sup_{x \in (x_{j-1}, x_j)} f(x) - \inf_{x \in (x_{j-1}, x_j)} f(x) \le \frac{\varepsilon}{2(b-a)}$. Defining upper and lower step functions to be equal to this supremum and infimum, respectively, on $(x_{j-1}, x_j)$ gives the step functions with the needed property.


PROOF: If the function $f$ is continuous on the interval $[a,b]$, then it is integrable there.

Let $f$ be a function continuous on the interval $[a,b]$.
Without loss of generality, assume that $a < b$.
Let $\varepsilon > 0$ be given.
Because $f$ is continuous on $[a,b]$, it is uniformly continuous there.
Thus, there is a $\delta > 0$ such that $|f(x) - f(y)| < \frac{\varepsilon}{2(b-a)}$ holds for every $x$ and $y$ in $[a,b]$ with $|x - y| < \delta$.
Let $n$ be a positive integer with $\frac{b-a}{n} < \delta$.
For each $j = 0, 1, 2, 3, \ldots, n$ let $x_j = a + j \cdot \frac{b-a}{n}$.
Define step function $v(x)$ by $v(x_j) = f(x_j)$ for each $j = 0, 1, 2, \ldots, n$ and $v(x) = \min_{y \in [x_{j-1}, x_j]} f(y)$ for $x \in (x_{j-1}, x_j)$ and each $j = 1, 2, 3, \ldots, n$. Thus, $v$ is a lower step function for $f$.
Similarly, define step function $u(x)$ by $u(x_j) = f(x_j)$ for each $j = 0, 1, 2, \ldots, n$ and $u(x) = \max_{y \in [x_{j-1}, x_j]} f(y)$ for $x \in (x_{j-1}, x_j)$ and each $j = 1, 2, 3, \ldots, n$. Thus, $u$ is an upper step function for $f$.
For each $j = 1, 2, 3, \ldots, n$, because $x_j - x_{j-1} < \delta$, it follows that $\max_{y \in [x_{j-1}, x_j]} f(y) - \min_{y \in [x_{j-1}, x_j]} f(y) \le \frac{\varepsilon}{2(b-a)}$, implying that for all $x \in [a,b]$, $u(x) - v(x) < \frac{\varepsilon}{b-a}$.
Thus, $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx = \int_a^b \bigl(u(x) - v(x)\bigr)\,dx < \int_a^b \frac{\varepsilon}{b-a}\,dx = \varepsilon$.
Therefore, $u$ and $v$ are upper and lower step functions for $f$ whose integrals on $[a,b]$ differ by less than $\varepsilon$, so it follows that $f$ is integrable on $[a,b]$ which completes the proof.
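The construction in this proof can be imitated numerically when the minimum and maximum on each subinterval are easy to locate. The sketch below makes a simplifying assumption that is not part of the proof: the function is increasing, so its minimum on $[x_{j-1}, x_j]$ is $f(x_{j-1})$ and its maximum is $f(x_j)$. The function $f(x) = x^2$ on $[0,2]$ and the name `lower_upper_integrals` are illustrative choices.

```python
def lower_upper_integrals(f, a, b, n):
    """Integrals of the lower and upper step functions for an increasing f
    on the uniform partition of [a, b] into n subintervals.

    Because f is assumed increasing, its minimum on [x_{j-1}, x_j] is
    f(x_{j-1}) and its maximum is f(x_j).
    """
    width = (b - a) / n
    xs = [a + j * width for j in range(n + 1)]
    lower = sum(f(left) * width for left in xs[:-1])
    upper = sum(f(right) * width for right in xs[1:])
    return lower, upper


def f(x):
    return x * x   # continuous and increasing on [0, 2]


for n in (10, 100, 1000):
    lower, upper = lower_upper_integrals(f, 0.0, 2.0, n)
    print(n, lower, upper, upper - lower)
```

For an increasing function the gap between the two integrals telescopes to $(f(b) - f(a))\,\frac{b-a}{n}$, so it can be made smaller than any given $\varepsilon$ by taking $n$ large, in line with the theorem.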
One nice thing about knowing that a function is integrable on an interval $[a,b]$ is that rather than having to consider all partitions of $[a,b]$, you can determine the value of the function's integral by using any collection of partitions of $[a,b]$ whose norms approach zero. Thus, if you know that $f$ is integrable on $[a,b]$, then for every natural number $n$ you could calculate $I(n) = \sum_{j=1}^{n} f\left(a + (b - a)\frac{j}{n}\right)\frac{b-a}{n}$ which is the Riemann sum for $f$ based on the very specific partition where $x_j = a + (b - a)\frac{j}{n}$ and with $\xi_j = x_j$. This is not the more general Riemann sum required by the definition of the integral, but if you already know that the integral exists, then it must be equal to $\lim_{n \to \infty} I(n)$.

As an example, consider the function $f(x) = x$ which is continuous on the interval $[0,4]$, so you know that it is integrable there. You can then consider
\[
I(n) = \sum_{j=1}^{n} f\left(a + (b - a)\frac{j}{n}\right)\frac{b-a}{n} = \sum_{j=1}^{n} \left(4\,\frac{j}{n}\right)\frac{4}{n} = \frac{16}{n^2}\sum_{j=1}^{n} j = \frac{16}{n^2} \cdot \frac{n(n+1)}{2} .
\]
Then $\lim_{n \to \infty} I(n)$ is easily seen to be 8 which is $\int_0^4 x\,dx$. On the other hand, if you try this with the function
\[
f(x) = \begin{cases} 1 & x \text{ is rational} \\ 0 & x \text{ is irrational} \end{cases}
\]
on the interval $[0,1]$, you obtain $I(n) = \sum_{j=1}^{n} f\left(\frac{j}{n}\right) \cdot \frac{1}{n} = 1$ because every sample point $\frac{j}{n}$ is rational. So $\lim_{n \to \infty} I(n) = 1$ which is not the integral of $f$. That integral does not exist.

Fig. 6.9 The mean value theorem for integration
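The uniform-partition sums $I(n)$ for $f(x) = x$ on $[0,4]$ are simple to compute directly. The following sketch (the function name `uniform_riemann_sum` is an illustrative assumption) evaluates $I(n)$ for a few values of $n$ and compares each with the closed form $\frac{16}{n^2} \cdot \frac{n(n+1)}{2}$, which tends to 8.

```python
def uniform_riemann_sum(f, a, b, n):
    """I(n): Riemann sum on the uniform partition with xi_j = x_j = a + (b - a) * j / n."""
    width = (b - a) / n
    return sum(f(a + (b - a) * j / n) * width for j in range(1, n + 1))


def identity(x):
    return x


for n in (10, 100, 1000):
    value = uniform_riemann_sum(identity, 0.0, 4.0, n)
    closed_form = 16 / n**2 * n * (n + 1) / 2
    print(n, value, closed_form)
```

This shortcut is only justified because $f$ is already known to be integrable; for a function that is not integrable, the limit of such sums can exist without being an integral.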


Now that it has been established that continuous functions are integrable, it is appropriate to investigate the properties of the integrals of continuous functions. The first of these properties is known as the Mean Value Theorem for Integration. It states that the integral of a continuous function, $f$, on an interval, $[a,b]$, is given by the length of the interval, $b - a$, times one of the values $f$ achieves on the interval. That is, there exists a $c \in [a,b]$ such that $\int_a^b f(x)\,dx = f(c) \cdot (b - a)$. This result has a nice visual interpretation showing that the area under a continuous curve is equal to the area of a rectangle with length $b - a$ and width $f(c)$ for some $c \in [a,b]$ as shown in Fig. 6.9. Another way to think about this is that there is a $c \in [a,b]$ such that $f(c)$ is the mean value of $f$ which could be defined as $\frac{1}{b-a}\int_a^b f(x)\,dx$.

The proof of this theorem follows easily from three earlier results: (1) the Intermediate Value Theorem, (2) a continuous function on a closed interval takes on its extreme values, and (3) if one integrable function is greater than or equal to a second integrable function, then the integral of the first is greater than or equal to the integral of the second. The proof starts with a function $f$ continuous on an interval $[a,b]$. That function achieves its minimum value $K$ and its maximum value $M$ on the interval. Thus, for all $x \in [a,b]$, it follows that $K \le f(x) \le M$ from which it follows that $(b - a)K \le \int_a^b f(x)\,dx \le (b - a)M$. Then, by the Intermediate Value Theorem, on the interval $[a,b]$ the function $f$ achieves every value between $K$ and $M$ including $\frac{1}{b-a}\int_a^b f(x)\,dx$.


PROOF (Mean Value Theorem for Integration): Assume the function $f$ is continuous on the interval $[a,b]$ with $a < b$. Then there is $c \in [a,b]$ satisfying $f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx$.

Let $f$ be a function continuous on the interval $[a,b]$ with $a < b$.
Because $f$ is continuous on the interval $[a,b]$ there are $s$ and $t$ in $[a,b]$ such that $f(s) = K$ is the minimum value for $f$ on $[a,b]$, and $f(t) = M$ is the maximum value for $f$ on $[a,b]$.
Since for all $x \in [a,b]$, $K \le f(x) \le M$, it follows that $K(b - a) = \int_a^b K\,dx \le \int_a^b f(x)\,dx \le \int_a^b M\,dx = M(b - a)$.
Because $f(s) = K \le \frac{1}{b-a}\int_a^b f(x)\,dx \le M = f(t)$, the Intermediate Value Theorem says that there is a $c$ between $s$ and $t$ such that $f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx$.
Thus, $c \in [a,b]$ satisfies the needed requirement and completes the proof.
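For a concrete function the point $c$ can be located numerically by following the proof: approximate the mean value, then search between a minimizer $s$ and a maximizer $t$ for a point where $f$ takes that value. The sketch below is illustrative only. It assumes $f(x) = x^2$ on $[0,3]$ (so $s = 0$, $t = 3$, and the mean value is 3), approximates the integral with a midpoint Riemann sum, and uses bisection as a stand-in for the Intermediate Value Theorem; the names `midpoint_integral` and `find_c` are hypothetical helpers.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Midpoint Riemann sum approximating the integral of f on [a, b]."""
    width = (b - a) / n
    return sum(f(a + (j + 0.5) * width) for j in range(n)) * width


def find_c(f, s, t, target, tol=1e-12):
    """Bisection for c between s and t with f(c) = target,
    assuming f is continuous and f(s) <= target <= f(t)."""
    lo, hi = s, t
    while abs(hi - lo) > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid      # keeps f(lo) <= target
        else:
            hi = mid      # keeps f(hi) >= target
    return (lo + hi) / 2


def f(x):
    return x * x


a, b = 0.0, 3.0
mean_value = midpoint_integral(f, a, b) / (b - a)   # approximately 3
c = find_c(f, a, b, mean_value)                     # f is smallest at a, largest at b
print(mean_value, c, math.sqrt(3))
```

Here the exact answer is $c = \sqrt{3}$, since $\frac{1}{3}\int_0^3 x^2\,dx = 3$.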


It can be very exciting to take a first course in Calculus. After learning what a
limit is, you learn about two very different-looking limit processes: the derivative
and the integral. Both differentiation and integration have important applications
which justify the amount of attention they receive. But then comes the seemingly
amazing revelation that these two processes, although they are defined in extremely
different ways, are, in fact, very closely related in that they are essentially inverse
operations of each other. This fact is the point of the Fundamental Theorem of
Calculus, often presented as the pinnacle of the first course in Calculus.
The Fundamental Theorem of Calculus starts with a function $f$ integrable on $[a,b]$. The result of the theorem is generally stated in two parts. The first part defines a new function $F(x) = \int_a^x f(t)\,dt$ and states that if $f$ is continuous at some point $c \in (a,b)$, then $F'(c) = f(c)$. The second part states that if $f$ is continuous on $[a,b]$, and if $F$ is any function satisfying $F'(x) = f(x)$ for all $x \in [a,b]$, then $\int_a^b f(x)\,dx = F(b) - F(a)$. It is fairly straightforward to prove the second part using the first part.


To prove the first part of the theorem, you would assume that a function $f$ is integrable on an interval $[a,b]$ and that $f$ is continuous at $c \in (a,b)$. To find the derivative of $F(x) = \int_a^x f(t)\,dt$ at $c$, you would just apply the definition of the derivative. That is, you would start with the difference quotient $\frac{F(x) - F(c)}{x - c} = \frac{1}{x - c}\left(\int_a^x f(t)\,dt - \int_a^c f(t)\,dt\right)$. This simplifies to $\frac{1}{x - c}\int_c^x f(t)\,dt$. Now if you knew that $f$


were continuous between $c$ and $x$, you could apply the just completed Mean Value Theorem for Integration to conclude that this difference quotient is equal to $f(y)$ for some $y$ between $c$ and $x$. Then by forcing $x$ to be close to $c$, you could force $f(y)$ to be close to $f(c)$ to complete the proof. But you do not know that $f$ is continuous between $c$ and $x$; only that $f$ is continuous at $c$. Still this is enough. You can use the continuity of $f$ at $c$ to say that for a given $\varepsilon > 0$ there is a $\delta > 0$ that ensures that if $t$ satisfies $|t - c| < \delta$, then $|f(t) - f(c)| < \varepsilon$. This shows that for $x$ within $\delta$ of $c$, $\frac{1}{x - c}\int_c^x \bigl(f(c) - \varepsilon\bigr)\,dt < \frac{1}{x - c}\int_c^x f(t)\,dt < \frac{1}{x - c}\int_c^x \bigl(f(c) + \varepsilon\bigr)\,dt$ which simplifies to $f(c) - \varepsilon < \frac{1}{x - c}\int_c^x f(t)\,dt < f(c) + \varepsilon$, and the result follows.

PROOF (Fundamental Theorem of Calculus: Part I): Assume the function $f$ is integrable on the interval $[a,b]$ and continuous at some $c \in (a,b)$. Then the function $F(x) = \int_a^x f(t)\,dt$ is differentiable at $c$ with $F'(c) = f(c)$.

Let $f$ be a function integrable on the interval $[a,b]$ and continuous at some $c \in (a,b)$.
For $x \in [a,b]$, define $F(x) = \int_a^x f(t)\,dt$.
Let $\varepsilon > 0$ be given.
From the definition of continuity, there is a $\delta > 0$ such that if $x \in [a,b]$ with $|x - c| < \delta$, then $|f(x) - f(c)| < \varepsilon$.
Select any $x \in [a,b]$ with $0 < |x - c| < \delta$.
Then $F(x) - F(c) = \int_a^x f(t)\,dt - \int_a^c f(t)\,dt = \int_c^x f(t)\,dt$.
Since $f(c) - \varepsilon < f(t) < f(c) + \varepsilon$ for all $t$ between $c$ and $x$, it follows that
\[
f(c) - \varepsilon = \frac{1}{x - c}\int_c^x \bigl(f(c) - \varepsilon\bigr)\,dt < \frac{1}{x - c}\int_c^x f(t)\,dt < \frac{1}{x - c}\int_c^x \bigl(f(c) + \varepsilon\bigr)\,dt = f(c) + \varepsilon .
\]
Thus, $\left|\frac{F(x) - F(c)}{x - c} - f(c)\right| = \left|\frac{1}{x - c}\int_c^x f(t)\,dt - f(c)\right| < \varepsilon$.
This proves that $F'(c) = \lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$, completing the proof of the theorem.
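Part I only requires continuity of $f$ at the single point $c$, and a quick numerical check can illustrate that. The sketch below assumes a hypothetical integrand on $[0,2]$ that equals $t$ for $t < 1$ and $t + 1$ for $t \ge 1$, so it has a jump at $t = 1$ but is continuous at $c = 0.5$. It uses the identity $F(c + h) - F(c) = \int_c^{c+h} f(t)\,dt$ and approximates that integral with a midpoint Riemann sum; the helper name `integral` is an assumption made for the example.

```python
def integral(f, lo, hi, n=20_000):
    """Midpoint Riemann sum approximating the integral of f from lo to hi.

    If hi < lo the width is negative, which gives the usual sign convention.
    """
    width = (hi - lo) / n
    return sum(f(lo + (j + 0.5) * width) for j in range(n)) * width


def f(t):
    """Hypothetical integrand on [0, 2]: continuous at c = 0.5, with a jump at t = 1."""
    return t if t < 1 else t + 1


c = 0.5
quotients = {}
for h in (0.1, 0.01, 0.001, -0.001):
    # F(c + h) - F(c) equals the integral of f from c to c + h.
    quotients[h] = integral(f, c, c + h) / h
    print(h, quotients[h], abs(quotients[h] - f(c)))
```

As $h$ shrinks, the difference quotients approach $f(c) = 0.5$ even though $f$ is discontinuous elsewhere on the interval.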
The second part of the Fundamental Theorem of Calculus now follows easily. Indeed, if $f$ is continuous on $[a,b]$, then the function $F(x) = \int_a^x f(t)\,dt$ is an antiderivative of $f$, that is, a function whose derivative is $f$. If $G(x)$ is any other antiderivative of $f$, then $G'(x) = F'(x)$ on $[a,b]$. It follows from the Mean Value Theorem (for derivatives) that $G$ and $F$ differ by a constant because $G - F$ has a derivative that is identically 0. Thus, $F(x) - F(a) = G(x) - G(a)$ for all $x \in [a,b]$, showing that $\int_a^b f(t)\,dt = G(b) - G(a)$ for any antiderivative $G$.


PROOF (Fundamental Theorem of Calculus: Part II): Assume the function $f$ is continuous on the interval $[a,b]$ and that $G$ is any antiderivative of $f$. Then $\int_a^b f(x)\,dx = G(b) - G(a)$.

Let $f$ be a function continuous on the interval $[a,b]$.
Without loss of generality, $a < b$.
Define $F(x) = \int_a^x f(t)\,dt$.
Since $f$ is continuous at each $x \in [a,b]$, it follows that $F'(x) = f(x)$.
Let $G$ be any antiderivative of $f$.
Then for all $x \in [a,b]$, the derivative of $G(x) - F(x)$ is $f(x) - f(x) = 0$.
By the Mean Value Theorem there is a $c \in (a,b)$ such that $\bigl(G(b) - F(b)\bigr) - \bigl(G(a) - F(a)\bigr) = \bigl(f(c) - f(c)\bigr)(b - a) = 0$.
Thus, $G(b) - G(a) = F(b) - F(a) = F(b) = \int_a^b f(x)\,dx$ which completes the proof.
The importance of the Fundamental Theorem of Calculus cannot be overstated. It turns the complex operation of finding limits of difficult-to-calculate Riemann sums into the somewhat more routine job of finding antiderivatives of functions.
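Both parts of the theorem can be checked numerically on a simple case. The following is a minimal sketch, not part of any proof: it assumes the sample function $f(t) = \cos t$ with antiderivative $G = \sin$, and the helper name `riemann_integral` is hypothetical. It approximates $F(x) = \int_0^x \cos t\,dt$ by midpoint Riemann sums, compares a difference quotient of $F$ at $c = 1$ with $f(c)$, and compares $\int_0^2 \cos t\,dt$ with $G(2) - G(0)$.

```python
import math

def riemann_integral(f, a, b, n=20_000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """F(x) = integral of cos(t) dt from 0 to x, approximated numerically."""
    return riemann_integral(math.cos, 0.0, x)

c = 1.0
h = 1e-3

# Part I: a difference quotient of F at c should be close to f(c) = cos(c).
difference_quotient = (F(c + h) - F(c - h)) / (2 * h)
part_one_error = abs(difference_quotient - math.cos(c))

# Part II: the integral over [0, 2] should match G(2) - G(0) for G = sin.
part_two_error = abs(
    riemann_integral(math.cos, 0.0, 2.0) - (math.sin(2.0) - math.sin(0.0))
)

print(part_one_error, part_two_error)
```

Both errors should be small, which is consistent with the theorem but of course proves nothing about general $f$.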

6.9.1 Exercises
1. If $F(x) = \int_{x^2}^{x^3} \frac{t}{1+t^2}\,dt$, find $F'(x)$.
2. Suppose $f$ has a jump discontinuity at $c \in [a,b]$ (that is, $\lim_{x \to c^+} f(x)$ and $\lim_{x \to c^-} f(x)$ both exist and are unequal). If $f$ is integrable on $[a,b]$, what is the behavior of $F(x) = \int_a^x f(t)\,dt$ at $c$?
3. Suppose $f$ is integrable on $[a,b]$. If the derivative of $F(x) = \int_a^x f(t)\,dt$ exists at $c \in [a,b]$, what can you say about $f$ at $c$?

6.10 Characterization of Integrable Functions


A function continuous on the closed interval $[a,b]$ is integrable there. Some functions which are not continuous are still integrable, so the question is, how badly can a function behave and still be integrable? If a continuous function is changed at one point, it is no longer continuous, but changing a function at a single point does not affect whether or not it is integrable. If a function has a jump discontinuity at a point (that is, it has a right limit and a left limit at the point, but those two limits are not equal) but is continuous elsewhere, then the function is still integrable. This is because if a function is integrable on $[a,b]$ and integrable on $[b,c]$, then it is integrable on $[a,c]$ whether or not the function is continuous at $b$. It follows that bounded piecewise continuous functions are integrable.
Let the function $f$ be defined on an interval $[a,b]$. Define the set of discontinuities of $f$, $D_f$, to be the subset of $[a,b]$ where $f$ fails to be continuous. For example, on the interval $[0,1]$ if
\[ f(x) = \begin{cases} 0 & x = \frac15, \frac25, \frac35, \frac45 \\ x & \text{otherwise,} \end{cases} \]
then $D_f = \{\frac15, \frac25, \frac35, \frac45\}$. If
\[ f(x) = \begin{cases} 0 & x \text{ is rational} \\ 1 & x \text{ is irrational,} \end{cases} \]
then $D_f$ is the entire set $[0,1]$ because $f$ is discontinuous everywhere. Finally, if
\[ f(x) = \begin{cases} 0 & x \text{ is in the Cantor set} \\ 1 & x \text{ is not in the Cantor set,} \end{cases} \]
then $D_f$ is equal to the Cantor set because for any $x$ not in the Cantor set, there is an open interval containing $x$ such that $f$ is identically 1 on that open interval, while every open interval around a point of the Cantor set contains points where $f$ is 1.
These examples suggest that a function defined on $[a,b]$ is integrable as long as its set of discontinuities does not get too large. In fact, a function defined on $[a,b]$ is Riemann integrable if and only if it is bounded and its associated set of discontinuities, $D_f$, has measure zero. Thus, any bounded function which is discontinuous only on a countable set of points must be Riemann integrable. The first function in the preceding paragraph with $D_f = \{\frac15, \frac25, \frac35, \frac45\}$ is, therefore, integrable. The second function which has $D_f = [0,1]$ is not integrable as seen earlier in this chapter. The third function which has $D_f$ equal to the Cantor set is interesting because its set of discontinuities is not countable, yet the function is integrable because the Cantor set does have measure zero.
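The claim that the Cantor set has measure zero can be made concrete. The sketch below assumes the usual middle-thirds construction (the function names `cantor_stage` and `total_length` are hypothetical). After $n$ removal steps the Cantor set is contained in $2^n$ closed intervals of total length $(2/3)^n$, and enlarging each slightly gives a cover by open intervals whose total length is still as small as you like.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals remaining after n middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the left and right thirds, discard the open middle third.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of a list of (left, right) intervals."""
    return sum(right - left for left, right in intervals)

# Total length of the stage-n cover for n = 0, 1, ..., 5; it equals (2/3)^n.
lengths = [total_length(cantor_stage(n)) for n in range(6)]
print(lengths)
```

Since $(2/3)^n \to 0$, for any $\epsilon > 0$ some stage provides a cover of total length less than $\epsilon$.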
The statement that the Riemann integrable functions on $[a,b]$ are exactly the bounded functions whose set of discontinuities has measure zero is a biconditional statement. It says both that if a function is Riemann integrable, then it is bounded with a set of discontinuities that has measure zero, and that if a function is bounded with a set of discontinuities that has measure zero, then the function is Riemann integrable. Thus, a proof of this statement will have two parts, one for each conditional. The proof of the theorem is somewhat longer than others seen in this book, but it requires only one new concept not yet discussed.
Assume first that the function $f$ is defined on the interval $[a,b]$, is bounded, and its set of discontinuities, $D_f$, has measure zero. You can prove that $f$ is integrable on $[a,b]$ if you can show that for every $\epsilon > 0$ the function $f$ has upper and lower step functions, $u$ and $v$, such that $\int_a^b (u(x) - v(x))\,dx < \epsilon$. The key point here is that $f$ is well behaved near points where it is continuous, and the set where it is not well behaved is very small (has measure zero). The strategy, then, is to construct step functions, $u$ and $v$, so that $u(x) - v(x)$ is very small near points where $f$ is continuous, and to limit the size of the intervals where $u(x) - v(x)$ is large. Suppose, for example, that near points where $f$ is continuous, you could limit $u(x) - v(x)$ to be less than $\frac{\epsilon}{2(b-a)}$. Then the total contribution to the integral of $u - v$ over those sections of the step functions would be at most $\frac{\epsilon}{2(b-a)} \cdot (b - a) = \frac{\epsilon}{2}$. The function $f$ is bounded, so there is an $M$ such that $|f(x)| < M$ for all $x \in [a,b]$. It is possible, therefore, to define upper and lower step functions that differ by at most $2M$ at points of $D_f$. If you can limit the regions where $u(x) - v(x)$ is large to intervals whose total length is at most $\frac{\epsilon}{4M}$, then the total contribution to the integral of $u - v$ over those sections of the step functions would be at most $2M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{2}$. Accomplishing both of these goals would then show that the integral of $u - v$ is less than $\frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$. Can this be accomplished? By the definition of continuity, for each point $x$ where $f$ is continuous there is a $\delta > 0$ such that if $y$ is in $[a,b]$ with $|y - x| < \delta$, then $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$. That would ensure that for any two values $y_1$ and $y_2$ in the interval $(x - \delta, x + \delta)$, the difference $|f(y_1) - f(y_2)| \le |f(y_1) - f(x)| + |f(x) - f(y_2)| < \frac{\epsilon}{4(b-a)} + \frac{\epsilon}{4(b-a)} = \frac{\epsilon}{2(b-a)}$. By the definition of measure zero, the set of discontinuities of $f$ can be covered by a collection of open intervals whose total length is less than the needed $\frac{\epsilon}{4M}$. Thus, each point of $[a,b]$ can be covered by one of the open intervals covering $D_f$ or by one of these $(x - \delta, x + \delta)$ intervals constructed at each point of continuity. The Heine-Borel Theorem then lets you reduce this covering of $[a,b]$ with open intervals to a finite subcovering, and from that subcovering, the appropriate upper and lower step functions can be constructed. That completes the strategy for the first part of the proof.
Assume, conversely, that the function $f$ defined on the interval $[a,b]$ is integrable. You already know that this implies that $|f|$ is bounded by some constant $M$, so all you need to prove is that the set of discontinuities of $f$, $D_f$, has measure zero. This can be done with a proof by contradiction. That is, by assuming that $D_f$ does not have measure zero, you can show that for any upper and lower step functions, $u$ and $v$, the integral of $u - v$ is bounded away from 0. To do this it is helpful to consider how much $f$ can vary near a particular value $x$. For a point $x \in [a,b]$ and a $\delta > 0$, you would like to know how much $f$ can change over the interval $(x - \delta, x + \delta)$. So define $W_\delta(x) = \sup f(y) - \inf f(y)$ where the supremum and infimum are calculated for $y$ varying over the interval $(x - \delta, x + \delta) \cap [a,b]$. Note that if $f$ had upper and lower step functions that were both constant on the interval $(x - \delta, x + \delta)$, then the two step functions would have to differ by at least $W_\delta(x)$ on that interval. Now define the variation of a function $f$ at a point $x$ to be $W(x) = \lim_{\delta \to 0^+} W_\delta(x)$. Since $0 \le W_\delta(x) \le 2M$ is nonincreasing as $\delta \to 0^+$, the limit $W(x)$ always exists and is equal to $\inf_{\delta > 0} W_\delta(x)$. The following lemma gives an important property of $W$.


PROOF: Let $f$ be any bounded function defined on the interval $[a,b]$. Then for any $x \in [a,b]$, the variation of $f$ at $x$ is 0 if and only if $f$ is continuous at $x$.
Let $f$ be a bounded function defined on the interval $[a,b]$.
PART I: Continuity implies $W = 0$
Assume that for some $x \in [a,b]$ the function $f$ is continuous at $x$.
Then for every $\epsilon > 0$ there is a $\delta > 0$ such that if $y \in (x - \delta, x + \delta) \cap [a,b]$, then $|f(y) - f(x)| < \frac{\epsilon}{2}$. Thus, $W_\delta(x) \le \left(f(x) + \frac{\epsilon}{2}\right) - \left(f(x) - \frac{\epsilon}{2}\right) = \epsilon$.
Thus, there are $\delta$ for which $W_\delta(x)$ is within $\epsilon$ of 0, implying that $W(x) = \inf W_\delta(x) = 0$.
PART II: $W = 0$ implies continuity
Assume that for some $x \in [a,b]$ the variation of $f$ at $x$ is $W(x) = 0$.
Since $\lim_{\delta \to 0^+} W_\delta(x) = 0$, for every $\epsilon > 0$, there is a $\delta_0 > 0$ such that for $0 < \delta < \delta_0$, $W_\delta(x) < \epsilon$.
Select $\delta > 0$ with $\delta < \delta_0$.
Then for any $y \in (x - \delta, x + \delta) \cap [a,b]$, it follows that $|f(y) - f(x)| \le \sup_{|z - x| < \delta} f(z) - \inf_{|z - x| < \delta} f(z) = W_\delta(x) < \epsilon$.
This shows that $f$ is continuous at $x$.
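The lemma can be illustrated numerically. The following is a rough sketch under stated assumptions: the function `oscillation` is a hypothetical helper that only estimates $W_\delta(x)$ by sampling finitely many points (including the endpoints of the interval), so it is an illustration and not a computation of the true supremum and infimum. For a function with a jump at $x = 0.5$ the estimates stay at the size of the jump as $\delta$ shrinks, while at a point of continuity they shrink toward 0.

```python
def oscillation(f, x, delta, a=0.0, b=1.0, samples=2001):
    """Sampled estimate of W_delta(x): sup f - inf f over the part of
    the interval from x - delta to x + delta that lies inside [a, b]."""
    lo = max(a, x - delta)
    hi = min(b, x + delta)
    values = [f(lo + (hi - lo) * i / (samples - 1)) for i in range(samples)]
    return max(values) - min(values)

def step(t):
    """A function with a jump discontinuity of size 1 at t = 0.5."""
    return 0.0 if t < 0.5 else 1.0

deltas = [0.1, 0.01, 0.001]

# At the jump the estimated variation does not shrink with delta.
at_jump = [oscillation(step, 0.5, d) for d in deltas]

# At a point of continuity of t^2 the estimates decrease toward 0.
at_continuity = [oscillation(lambda t: t * t, 0.3, d) for d in deltas]

print(at_jump)
print(at_continuity)
```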


As a consequence of this lemma, the set of discontinuities of $f$ is $D_f = \{x \in [a,b] \mid W(x) > 0\}$. Now for each natural number $n$ define $D_f^n = \{x \in [a,b] \mid W(x) > \frac{1}{n}\}$ to be the points of $[a,b]$ where the variation of $f$ at $x$ is greater than $\frac{1}{n}$. If the variation of $f$ at $x$ is positive, then it must be greater than $\frac{1}{n}$ for some $n$. Thus, the set of all discontinuities of $f$ must be the union of these $D_f^n$ sets, that is, $D_f = \bigcup_{n=1}^{\infty} D_f^n$. The key observation here is that if for each $n$ the $D_f^n$ set has measure zero, then the entire set of discontinuities, $D_f$, must have measure zero because it is just a countable union of sets with measure zero. So, if you assume that $D_f$ does not have measure zero, it requires that there is a natural number $n$ such that $D_f^n$ also does not have measure zero. What does it mean for $D_f^n$ not to have measure zero? It means that there is an $\epsilon > 0$ such that no collection of open intervals with total length less than $\epsilon$ can cover all of $D_f^n$. This will be the key to showing that upper and lower step functions for $f$ cannot have integrals that are arbitrarily close to each other, and thus, $f$ cannot be integrable. The result is known as Lebesgue's Theorem.


PROOF (Lebesgue's Theorem): The function $f$ defined on the interval $[a,b]$ is Riemann integrable if and only if $f$ is bounded and the set of points in $[a,b]$ where $f$ is discontinuous has measure zero.
Let $f$ be a function defined on the interval $[a,b]$ with $a < b$.
PART I: Boundedness and discontinuities with measure zero imply integrable
Assume that there is a real number $M$ such that $|f(x)| < M$ for all $x \in [a,b]$, and assume the set $D_f$, the set of $x \in [a,b]$ such that $f$ is discontinuous at $x$, has measure zero.
Let $\epsilon > 0$ be given.
By the definition of measure zero, there is a sequence of open intervals $I_1, I_2, I_3, \ldots$ with total length less than $\frac{\epsilon}{4M}$ such that $D_f$ is contained in the union of those intervals.
By the definition of continuity, for each $x \in [a,b]$ where $f$ is continuous, there is a $\delta_x > 0$ such that $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$ for all $y \in [a,b]$ with $|y - x| < \delta_x$. Let $J_x$ be the interval $(x - \delta_x, x + \delta_x)$.
Since each $x \in [a,b]$ is either a point of continuity of $f$ or a member of $D_f$, each $x \in [a,b]$ is either a member of one of the intervals $I_j$ that covers $D_f$ or in the interval $J_x$. Thus, the collection of open intervals consisting of $I_1, I_2, I_3, \ldots$ together with the $J_x$ intervals forms an open covering of $[a,b]$.
By the Heine-Borel Theorem, there exists a finite collection of these open intervals that covers $[a,b]$. Let $E = \{x_1, x_2, x_3, \ldots, x_n\}$ be the set of distinct endpoints for the intervals in this finite cover of $[a,b]$ where $x_1 < x_2 < x_3 < \cdots < x_n$.
Define step functions $u(x)$ and $v(x)$ as follows. If $x = x_j$ for one of the endpoints $x_j \in E$, then define $u(x) = v(x) = f(x)$.
For each $j$ the open interval $(x_{j-1}, x_j)$ must be a subset of one of the finite number of intervals that cover $[a,b]$. If $(x_{j-1}, x_j)$ is contained in one of the $I_k$ intervals that covers $D_f$, define $u(x) = M$ and $v(x) = -M$ for each $x \in (x_{j-1}, x_j)$. Since $|f|$ is bounded by $M$, $v(x) \le f(x) \le u(x)$ for each $x \in (x_{j-1}, x_j)$.
Otherwise, $(x_{j-1}, x_j)$ is contained in one of the $J_x$ intervals. In this case, define $u(y) = f(x) + \frac{\epsilon}{4(b-a)}$ and $v(y) = f(x) - \frac{\epsilon}{4(b-a)}$ for each $y \in (x_{j-1}, x_j)$. Since $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$ for all $y \in J_x$, it follows that $v(y) < f(y) < u(y)$ for each $y \in (x_{j-1}, x_j)$.
It follows that $v$ is a lower step function of $f$, and $u$ is an upper step function of $f$.
\[ \int_a^b (u(x) - v(x))\,dx = \sum_{j=2}^{n} \int_{x_{j-1}}^{x_j} (u(x) - v(x))\,dx. \]


Over the intervals that were subsets of the $I_j$ intervals, $u(x) - v(x) = 2M$. The total length of such intervals cannot exceed $\frac{\epsilon}{4M}$. As a result, the integral of $u(x) - v(x)$ over these intervals cannot exceed $2M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{2}$.
Over the intervals that were subsets of the $J_x$ intervals, $u(x) - v(x) = \frac{\epsilon}{2(b-a)}$. As a result, the integral of $u(x) - v(x)$ over these intervals cannot exceed $\int_a^b \frac{\epsilon}{2(b-a)}\,dx = \frac{\epsilon}{2}$.
Thus, $f$ has upper and lower step functions, $u$ and $v$, with the property that $\int_a^b (u(x) - v(x))\,dx < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
Therefore, $f$ is Riemann integrable on $[a,b]$.


PART II: Integrable implies bounded and discontinuities with measure zero
Let $f$ be Riemann integrable on $[a,b]$.
Since all integrable functions are bounded, there is an $M$ such that $|f(x)| < M$ for all $x \in [a,b]$.
For each $x \in [a,b]$ and $\delta > 0$, let $W_\delta(x) = \sup f(y) - \inf f(y)$ where the supremum and infimum are calculated for $y$ varying over the interval $(x - \delta, x + \delta) \cap [a,b]$.
For each $x \in [a,b]$ define the variation of $f$ at $x$ to be $W(x) = \lim_{\delta \to 0^+} W_\delta(x)$.
By the preceding lemma, $W(x)$ is 0 if and only if $f$ is continuous at $x \in [a,b]$.
For each natural number $n$ define $D_f^n = \{x \in [a,b] \mid W(x) > \frac{1}{n}\}$ to be the points of $[a,b]$ where the variation of $f$ at $x$ is greater than $\frac{1}{n}$.
The set of discontinuities of $f$ is then $D_f = \bigcup_{n=1}^{\infty} D_f^n$.
Assume that $D_f$ does not have measure zero.
A countable union of sets with measure zero is itself a set with measure zero. Since $D_f$ is the union of the $D_f^n$, there must exist a natural number $n$ such that $D_f^n$ does not have measure zero.
Since $D_f^n$ does not have measure zero, there is an $\epsilon > 0$ such that if $D_f^n$ is covered by a sequence of open intervals, the total length of those intervals must exceed $\epsilon$.
Let $u$ be an upper step function for $f$ and $v$ be a lower step function for $f$.
From the definition of step function, there is a sequence $a = x_0 < x_1 < x_2 < \cdots < x_k = b$ such that both $u$ and $v$ are constant on the open intervals $I_j = (x_{j-1}, x_j)$ for each $j = 1, 2, 3, \ldots, k$.
For each $x \in I_j$, $u(x)$ cannot be less than $\sup_{z \in I_j} f(z)$, and $v(x)$ cannot be greater than $\inf_{z \in I_j} f(z)$. As a consequence, $u(x) - v(x) \ge \sup_{z \in I_j} f(z) - \inf_{z \in I_j} f(z)$. Thus, if $D_f^n \cap I_j$ is not empty, then $u(x) - v(x) \ge \frac{1}{n}$ for all $x \in I_j$.

$D_f^n$ cannot be covered by open intervals whose total length is less than $\epsilon$. Thus, it follows that the total length of the intervals $I_j$ that contain points of $D_f^n$ must be at least $\epsilon$.
It follows that $\int_a^b (u(x) - v(x))\,dx \ge \frac{1}{n} \cdot \epsilon$.
Thus, $f$ cannot have upper and lower step functions whose integrals differ by less than $\frac{\epsilon}{n}$. This implies that $f$ is not integrable, which is a contradiction.
Therefore, the assumption that $D_f$ does not have measure zero is false, which completes the proof.
The last section of Chap. 4 introduced Thomae's function, a function defined on $[0,1]$ which is discontinuous at each rational number but is continuous at each irrational number. Since the rational numbers form a countable set, that set has measure zero. Thus, Thomae's function is bounded, and its set of discontinuities has measure zero, so Thomae's function is Riemann integrable. Compare this to the function that is equal to 1 for all rational numbers and equal to 0 for all irrational numbers. That function is discontinuous everywhere, so its set of discontinuities does not have measure zero, and it is not integrable as seen earlier.

6.10.1 Exercises
1. Suppose $f : [0,2] \to [5,9]$ is integrable and $g : [5,9] \to [0,2]$ is continuous. Show that $g \circ f$ is integrable.
Write proofs for each of the following statements.
2. If $f(x)$ is a function integrable on the interval $[0,10]$, then so is the function $f(x)f(10 - x)$.
3. If $f(x)$ is a function integrable on the interval $[a,b]$, and $p(x)$ is a polynomial, then $p \circ f(x)$ is also integrable on $[a,b]$.
4. If $f(x)$ and $g(x)$ are integrable functions on the interval $[a,b]$, then so is $f(x)g(x)$.

Chapter 7

Infinite Series

7.1 Convergence of Infinite Series


The axioms for the real numbers define addition as a binary operation and establish the rules for adding two real numbers together. One can use mathematical induction to extend axioms and theorems about addition to get theorems about the addition of any finite number of terms. But there is nothing in the axioms that suggests how to add an infinite number of terms together or what such a sum would mean. You need to make a separate definition in order to make sense out of adding infinitely many terms together. An infinite series $a_1 + a_2 + a_3 + \cdots = \sum_{n=1}^{\infty} a_n$ has a sequence of terms $a_1, a_2, a_3, \ldots$ which are written with plus signs or minus signs between the terms of the sequence. In this chapter, most series will begin with a first term $a_1$, although there is no problem with beginning the series at other subscript values such as the commonly seen $\sum_{n=0}^{\infty} a_n$. Also in this chapter the terms of the series will be real numbers, although it is possible to extend the definition to series of other kinds of terms such as complex numbers or matrices. This explains what an infinite series looks like, but it does not prescribe any meaning to the symbols.
In Abstract Algebra one can study formal power series, a study that looks at one type of infinite series and considers how to manipulate the series without regard to whether these series can be assigned any meaningful numerical values. But in Analysis, one is interested in the cases where it makes sense to assign a numerical value to the series. The difference in the two studies is in the interpretation of a series like $1 - 2 + 3 - 4 + \cdots$. If you ask what happens if you multiply this series by 2, a purely algebraic answer would be that you just use the Distributive Law and multiply each term of the series by 2 to get $2 - 4 + 6 - 8 + \cdots$. But an analytical answer to the question is that it makes little sense to assign a numerical value to the series, so multiplying the series by 2 cannot yield a meaningful result.

Springer International Publishing Switzerland 2016


J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_7


To each series $a_1 + a_2 + a_3 + \cdots$, one associates the sequence of partial sums
\[ \begin{aligned} s_1 &= a_1 \\ s_2 &= a_1 + a_2 \\ s_3 &= a_1 + a_2 + a_3 \\ &\;\;\vdots \\ s_k &= a_1 + a_2 + a_3 + \cdots + a_k. \end{aligned} \]
Since each of these sums is just the sum of a finite number of terms, they are easily defined. The series is said to converge to the real number $L$ if the sequence of partial sums converges to $L$, that is, if $\lim_{k \to \infty} s_k = L$. In this case one writes $\sum_{n=1}^{\infty} a_n = L$ and says that the series has limit $L$ or even that the series has value $L$. If the sequence of partial sums does not converge, then the series is said to diverge. If the sequence of partial sums tends to infinity or negative infinity, the series is said to diverge to infinity or negative infinity, respectively. In that case one could write $\sum_{n=1}^{\infty} a_n = \infty$ or $\sum_{n=1}^{\infty} a_n = -\infty$.
The definition of convergence suggests that for each series $\sum_{n=1}^{\infty} a_n$ one should derive a simple expression for its partial sums $s_k = \sum_{n=1}^{k} a_n$ and then calculate the limit of the partial sums $\lim_{k \to \infty} s_k$. Unfortunately, there are relatively few series that admit simple closed-form expressions for their partial sums, and this technique for finding the value of a series has limited use. Still, it is important to know about some of the cases when this technique does work. Perhaps the best known examples of series whose partial sums can be explicitly calculated are the geometric series. These are the series whose sequence of terms can be written in the form $a_n = ar^{n-1}$, where $a$ and $r$ are given real numbers. Then the first term of the series is $a$ and the common ratio of adjacent terms is $\frac{a_n}{a_{n-1}} = \frac{ar^{n-1}}{ar^{n-2}} = r$, at least in the interesting cases when $ar \neq 0$. When $r$ is not equal to 1, there is a simple algebraic trick that gives the expression for the partial sums.
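\[ \begin{aligned} s_k &= \sum_{n=1}^{k} ar^{n-1} \\ r \cdot s_k &= \sum_{n=1}^{k} ar^{n} \\ s_k - r \cdot s_k &= \sum_{n=1}^{k} ar^{n-1} - \sum_{n=1}^{k} ar^{n} = a - ar^{k} \\ s_k(1 - r) &= a(1 - r^{k}) \\ s_k &= a\,\frac{1 - r^{k}}{1 - r}. \end{aligned} \]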

(Of course, there is an even simpler trick for the case when $r = 1$.) Thus, except in the trivial case where $a = 0$, the sequence of partial sums diverges if $|r| \ge 1$. On the other hand, when $|r| < 1$, $\lim_{k \to \infty} r^k = 0$ so $\lim_{k \to \infty} s_k = \frac{a}{1 - r}$ which can easily be remembered as the first term divided by 1 minus the common ratio. The geometric series is particularly important because one can often compare other series to a geometric series to determine if the other series converges. It also gives a nice example showing that series that make a lot of sense when they converge can lead you to very strange and very incorrect conclusions when they do not converge. In particular, $\sum_{n=1}^{\infty} r^n = \frac{r}{1 - r}$ whenever $|r| < 1$. But when you take a limit as $r$ approaches $-1$, you get $\lim_{r \to -1} \sum_{n=1}^{\infty} r^n = \lim_{r \to -1} \frac{r}{1 - r} = -\frac{1}{2}$ which is not the same as the nonsensical series $\sum_{n=1}^{\infty} \lim_{r \to -1} r^n = (-1) + 1 + (-1) + 1 + (-1) + 1 + \cdots$.
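The closed form for the geometric partial sums is easy to check in exact arithmetic. The following sketch uses hypothetical function names and the sample values $a = 3$, $r = \frac12$, which are assumptions chosen only for illustration. It compares the term-by-term sum with $a\,\frac{1 - r^k}{1 - r}$ and measures how far $s_{20}$ is from the limit $\frac{a}{1 - r}$.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, k):
    """s_k = a + a r + ... + a r^(k-1), computed term by term."""
    return sum(a * r ** (n - 1) for n in range(1, k + 1))

def closed_form(a, r, k):
    """The formula s_k = a (1 - r^k) / (1 - r), valid for r != 1."""
    return a * (1 - r ** k) / (1 - r)

a, r = Fraction(3), Fraction(1, 2)

# The two expressions agree for every k tested.
matches = all(
    geometric_partial_sum(a, r, k) == closed_form(a, r, k) for k in range(1, 21)
)

limit = a / (1 - r)                   # first term divided by 1 minus the ratio
gap = limit - closed_form(a, r, 20)   # equals a r^20 / (1 - r)

print(matches, limit, gap)
```

The gap $a r^k / (1 - r)$ shrinks geometrically, which is exactly why $|r| < 1$ is needed for convergence.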

Another class of series whose partial sums can be calculated are the telescoping series. This is a class of series where each term $a_n$ can be written as a difference of two terms $a_n = b_n - b_{n+1}$. Then $s_k = (b_1 - b_2) + (b_2 - b_3) + (b_3 - b_4) + \cdots + (b_k - b_{k+1}) = b_1 + (-b_2 + b_2) + (-b_3 + b_3) + (-b_4 + b_4) + \cdots + (-b_k + b_k) - b_{k+1} = b_1 - b_{k+1}$. Hence, if $\lim_{k \to \infty} b_{k+1}$ exists, the series converges. The best known example of this type is the series $\sum_{n=1}^{\infty} \frac{1}{n^2 + n} = \sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n+1} \right) = 1 - \lim_{n \to \infty} \frac{1}{n+1} = 1$.
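As a quick check of the telescoping computation, the sketch below (the function name `partial_sum` is hypothetical) adds the terms $\frac{1}{n^2 + n}$ in exact arithmetic and confirms that the partial sums equal $b_1 - b_{k+1} = 1 - \frac{1}{k+1}$ for the first fifty values of $k$.

```python
from fractions import Fraction

def partial_sum(k):
    """s_k = sum of 1/(n^2 + n) for n = 1..k, in exact arithmetic."""
    return sum(Fraction(1, n * n + n) for n in range(1, k + 1))

# Compare with the telescoped value b_1 - b_(k+1) = 1 - 1/(k+1).
telescoped = all(partial_sum(k) == 1 - Fraction(1, k + 1) for k in range(1, 51))

print(telescoped, partial_sum(9))
```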

Fortunately, even though it is often difficult to determine the exact values for
the partial sums of a series, one can very often determine whether or not the series
converges and sometimes the value to which it converges even without knowing an
explicit formula for its partial sums. There are many tools that can be used to do this.
These tools consist of a large collection of convergence tests which can be applied
to determine if a particular series converges. Calculus students often get a great deal
of practice selecting appropriate convergence tests for series. This chapter will be
more interested in proving the theorems that provide these tests.
The simplest and possibly most important convergence test is the Limit of Terms Test which says that a series can converge only if its sequence of terms has a limit of 0. That is, if $\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n \to \infty} a_n = 0$. This is a direct consequence of the fact that if $\sum_{n=1}^{\infty} a_n$ converges, then its sequence of partial sums $\langle s_k \rangle$ converges. The point is, if $\lim_{k \to \infty} s_k$ exists, then $\langle s_k \rangle$ is a Cauchy sequence whose terms must get close to each other, and so $s_k - s_{k-1} = a_k$ must approach 0.


PROOF (Limit of Terms Test): The series $\sum_{n=1}^{\infty} a_n$ converges only if $\lim_{n \to \infty} a_n = 0$.
Assume that the series $\sum_{n=1}^{\infty} a_n$ converges to the limit $L$.
Then the sequence of partial sums $s_k = \sum_{n=1}^{k} a_n$ converges to $L$.
This implies $\lim_{n \to \infty} a_n = \lim_{n \to \infty} (s_n - s_{n-1}) = \lim_{n \to \infty} s_n - \lim_{n \to \infty} s_{n-1} = L - L = 0$, which completes the proof.
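The Limit of Terms Test gives only a necessary condition. As an illustration that goes slightly beyond this section, the sketch below uses the harmonic series $\sum \frac{1}{n}$, a standard example assumed here rather than taken from the text above. Its terms approach 0, yet grouping terms shows $s_{2^m} \ge 1 + \frac{m}{2}$, so the partial sums are unbounded. The code checks that bound in exact arithmetic for small $m$; the function name is hypothetical.

```python
from fractions import Fraction

def harmonic_partial_sum(k):
    """s_k = 1 + 1/2 + ... + 1/k, in exact arithmetic."""
    return sum(Fraction(1, n) for n in range(1, k + 1))

# The partial sum with 2^m terms is at least 1 + m/2 for each m tested,
# even though the terms 1/n tend to 0.
bounds_hold = all(
    harmonic_partial_sum(2 ** m) >= 1 + Fraction(m, 2) for m in range(0, 9)
)

print(bounds_hold)
```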
The convergence of one series can often be inferred from the convergence of a
similar series. For example, inserting extra terms equal to 0 into a series does not
affect whether the series converges, nor can inserting extra 0 terms affect the value
to which the series converges. This is because the insertion of terms equal to 0 into
a series does not change the sequence of partial sums for that series except to allow
some of the partial sums to be repeated, and that does not change the limit of the
sequence of partial sums.
Another useful observation is that if two series differ in only a finite number of terms, then either both series converge or both series diverge. Suppose, for example, that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are two series such that for some positive integer $N$, the terms $a_n = b_n$ for all $n > N$. Why would the convergence of one of the series imply the convergence of the other? It must depend on the convergence of their partial sums, so let $s_k = \sum_{n=1}^{k} a_n$ and $t_k = \sum_{n=1}^{k} b_n$ be the sequences of partial sums for the two series. The agreement of $a_n$ and $b_n$ for all $n > N$ shows that for $k > N$, $t_k = t_N + \sum_{n=N+1}^{k} b_n = t_N + \sum_{n=N+1}^{k} a_n = t_N + s_k - s_N$. Thus, $\lim_{k \to \infty} s_k$ exists if and only if $\lim_{k \to \infty} (s_k + t_N - s_N) = \lim_{k \to \infty} t_k$ exists.

7.1.1 Exercises
Find limits for the following series or show that the limit does not exist.
1. $\sum_{n=1}^{\infty} \frac{5}{3^n}$
2. $\sum_{n=1}^{\infty} \frac{4}{2^{2n+1}}$


3.
4.
5.

1
P
nD1
1
P
nD1
1
P
nD1

23


7
n2 C5n
1
n2 C9nC14

1
1 1
1
1
1
1
1
 
C 2 C 3  3  4 C 4 C 
2 3 22
3
2
3
2
3
1
1
1
1
7. C 0 C C 0 C 0 C C 0 C 0 C 0 C
C 0 C 0 C 0 C 0
2
4
8
16
1
1
1
1
1
5
C
C
C
C

8. 11 C 3 C C
9
12
23
34
45
56

6.

7.2 Absolute and Conditional Convergence


Before going on, it is necessary to distinguish between two types of convergent series. The series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if the series of the absolute value of its terms, $\sum_{n=1}^{\infty} |a_n|$, converges. All absolutely convergent series are convergent. That is, if $\sum_{n=1}^{\infty} |a_n|$ converges, then so does $\sum_{n=1}^{\infty} a_n$. This can be proved by using the fact that a series converges if and only if its sequence of partial sums is Cauchy. The proof can begin with the absolutely convergent series $\sum_{n=1}^{\infty} a_n$, so $\sum_{n=1}^{\infty} |a_n|$ converges. Then, knowing that $\sum_{n=1}^{\infty} |a_n|$ converges, the proof can conclude that its sequence of partial sums $S_k = \sum_{n=1}^{k} |a_n|$ converges. From this you need to reach the conclusion that the original series $\sum_{n=1}^{\infty} a_n$ converges, that is, you need to conclude that the sequence of partial sums $s_k = \sum_{n=1}^{k} a_n$ converges. But if $\langle S_k \rangle$ converges, it must be Cauchy, which means that for each $\epsilon > 0$ there is an $N$ such that for any $m > k > N$, $|S_m - S_k| < \epsilon$. What do you need to know for $\langle s_k \rangle$ to be Cauchy? You need to know that for each $\epsilon > 0$ there is an $N$ such that for any $m > k > N$, $|s_m - s_k| < \epsilon$. But $|s_m - s_k| = \left| \sum_{n=k+1}^{m} a_n \right| \le \sum_{n=k+1}^{m} |a_n| = |S_m - S_k|$ which you already know can be made small. It follows that the $\langle s_k \rangle$ sequence is Cauchy, so it converges.


PROOF: If the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent, then it converges.
Assume that the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent which means that the series $\sum_{n=1}^{\infty} |a_n|$ converges.
For each $k > 0$ let $S_k = \sum_{n=1}^{k} |a_n|$ and $s_k = \sum_{n=1}^{k} a_n$.
Then, since the series is absolutely convergent, the sequence $\langle S_k \rangle$ converges implying that $\langle S_k \rangle$ is a Cauchy sequence.
Given $\epsilon > 0$ there is an $N$ such that for all $m$ and $k$ greater than $N$, $|S_m - S_k| < \epsilon$.
Let $m > k > N$. Then $|s_m - s_k| = \left| \sum_{n=k+1}^{m} a_n \right| \le \sum_{n=k+1}^{m} |a_n| = |S_m - S_k| < \epsilon$.
Thus, the sequence $\langle s_k \rangle$ is a Cauchy sequence and is, therefore, a convergent sequence.
This shows that $\sum_{n=1}^{\infty} a_n$ is convergent which proves the theorem.

If the series $\sum_{n=1}^{\infty} a_n$ converges, but it is not absolutely convergent, then it is called conditionally convergent. An absolutely convergent series converges because its terms get small fast enough that its partial sums must rapidly get close to each other and to a limit. A conditionally convergent series converges because its negative terms balance the growth of its positive terms. For example, the series $1 - 1 + \frac12 - \frac12 + \frac13 - \frac13 + \frac14 - \frac14 + \cdots$ clearly converges to 0 due to this type of cancelation. Thus, every series can be categorized as either absolutely convergent, conditionally convergent, or divergent.
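The cancelation in the series $1 - 1 + \frac12 - \frac12 + \frac13 - \frac13 + \cdots$ can be seen directly in its partial sums. The sketch below (with hypothetical function names) computes them exactly: the even partial sums are all 0 and the odd ones are $1, \frac12, \frac13, \ldots$, so the sequence of partial sums tends to 0. The series of absolute values, by contrast, is twice the harmonic series, which is assumed here as a standard example of a divergent series, so the convergence is only conditional.

```python
from fractions import Fraction

def term(n):
    """n-th term of 1 - 1 + 1/2 - 1/2 + 1/3 - 1/3 + ... (n starts at 1)."""
    k = (n + 1) // 2   # both terms of the k-th pair have size 1/k
    return Fraction(1, k) if n % 2 == 1 else -Fraction(1, k)

def partial_sum(k, absolute=False):
    """Sum of the first k terms, or of their absolute values."""
    return sum(abs(term(n)) if absolute else term(n) for n in range(1, k + 1))

even_sums = [partial_sum(2 * j) for j in range(1, 6)]      # every pair cancels
odd_sums = [partial_sum(2 * j - 1) for j in range(1, 6)]   # one unpaired term left
absolute_sum = partial_sum(200, absolute=True)             # keeps growing with k

print(even_sums, odd_sums, float(absolute_sum))
```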

7.3 The Arithmetic of Series


Because the definition of the convergence of a series involves the limit of partial sums, many results that are true for finite sums are easily proved for infinite sums. For example, if $\sum_{n=1}^{\infty} a_n$ converges, and $c$ is any constant, then $\sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n$. To prove this you would have to consider the partial sums $\sum_{n=1}^{k} c \cdot a_n$. But the Distributive Law works for finite sums, so $\sum_{n=1}^{k} c a_n = c \sum_{n=1}^{k} a_n$, and the limit of this is the needed $c \sum_{n=1}^{\infty} a_n$.

PROOF: If the series $\sum_{n=1}^{\infty} a_n$ converges, and $c$ is any real number, then $\sum_{n=1}^{\infty} c \cdot a_n = c \cdot \sum_{n=1}^{\infty} a_n$.
Assume that $\sum_{n=1}^{\infty} a_n$ converges, and that $c$ is a real number.
Then $\sum_{n=1}^{\infty} c a_n = \lim_{k \to \infty} \sum_{n=1}^{k} c a_n = \lim_{k \to \infty} c \sum_{n=1}^{k} a_n = c \lim_{k \to \infty} \sum_{n=1}^{k} a_n = c \sum_{n=1}^{\infty} a_n$, proving the result.


Another easy result is that if $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are both convergent series, then $\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$. Again, this is easy because the result follows immediately from properties of finite sums.


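PROOF: If the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, then $\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$.
Assume that the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge.
Then $\sum_{n=1}^{\infty} (a_n + b_n) = \lim_{k \to \infty} \sum_{n=1}^{k} (a_n + b_n) = \lim_{k \to \infty} \left( \sum_{n=1}^{k} a_n + \sum_{n=1}^{k} b_n \right) = \lim_{k \to \infty} \sum_{n=1}^{k} a_n + \lim_{k \to \infty} \sum_{n=1}^{k} b_n = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$, proving the result.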

With these theorems you can often start with a series whose value you know and derive the values of other series. For example, what is the value of the series $1 + \frac12 - \frac14 + \frac18 + \frac{1}{16} - \frac{1}{32} + \frac{1}{64} + \frac{1}{128} - \frac{1}{256} \cdots$? This series looks something like the geometric series with first term 1 and common ratio $\frac12$ which is $1 + \frac12 + \frac14 + \frac18 + \cdots$. That series has limit $\frac{1}{1 - \frac12} = 2$. But the new series is clearly not a geometric series because the terms are not all the same sign, which would be the case for a geometric series with a positive common ratio, nor are the terms alternating in sign, which would be the case for a geometric series with a negative common ratio. The new series can be written, though, as $\left(1 + \frac12 + \frac14 + \frac18 + \cdots\right) - \left(0 + 0 + \frac24 + 0 + 0 + \frac{2}{32} + 0 + 0 + \frac{2}{256} + \cdots\right)$. This is the difference of two series: the geometric series with first term 1 and common ratio $\frac12$, and a series whose value is the same as a geometric series with first term $\frac24 = \frac12$ and common ratio $\frac18$. Thus, the new series has value $\frac{1}{1 - \frac12} - \frac{\frac12}{1 - \frac18} = \frac{10}{7}$.
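The value $\frac{10}{7}$ can be confirmed by summing the series directly. In this sketch (function names are hypothetical) the $n$th term has size $\frac{1}{2^{n-1}}$ and is negative exactly when $n$ is a multiple of 3, matching the pattern of signs in the example; the partial sum with 60 terms is then compared with the value obtained from the two geometric series.

```python
from fractions import Fraction

def term(n):
    """n-th term (n >= 1): size 1/2^(n-1), negative at every third position."""
    sign = -1 if n % 3 == 0 else 1
    return sign * Fraction(1, 2 ** (n - 1))

def partial_sum(k):
    """Exact sum of the first k terms."""
    return sum(term(n) for n in range(1, k + 1))

# Difference of two geometric series: (ratio 1/2) minus (first term 1/2, ratio 1/8).
value = Fraction(1) / (1 - Fraction(1, 2)) - Fraction(1, 2) / (1 - Fraction(1, 8))
error = abs(partial_sum(60) - value)

print(value, float(error))
```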


The series $1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots$ is a conditionally convergent series. In the next chapter it will be shown that this series converges to $\ln 2$. So how about the series $1 + \frac13 - \frac12 + \frac15 + \frac17 - \frac14 + \frac19 + \frac{1}{11} - \frac16 + \cdots$? At first it appears that this series is the same as the previous series because it includes the same terms rearranged in a different order. Indeed, both series include the terms $\frac{1}{2n-1}$ and $-\frac{1}{2n}$ for each positive integer $n$. But one can write
\[ \begin{aligned} 1 + \frac13 - \frac12 + \frac15 + \frac17 - \frac14 + \frac19 + \frac{1}{11} - \frac16 + \cdots &= 1 + 0 + \frac13 - \frac12 + \frac15 + 0 + \frac17 - \frac14 + \frac19 + 0 + \frac{1}{11} - \frac16 + \cdots \\ &= \left(1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots\right) + \left(0 + \frac12 + 0 - \frac14 + 0 + \frac16 + 0 - \frac18 + 0 + \cdots\right) \\ &= \left(1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots\right) + \frac12\left(1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots\right) \\ &= \ln 2 + \frac12 \ln 2 = \frac32 \ln 2. \end{aligned} \]
It is not unusual that rearranging the order of terms in a series results in the series converging to a different quantity. This is, in fact, a characteristic of all conditionally convergent series as will be shown later in this chapter.
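A floating-point experiment is consistent with this computation. The sketch below (hypothetical function names; the numbers of terms are arbitrary choices) sums the rearranged series in blocks of three terms, $\frac{1}{4j-3} + \frac{1}{4j-1} - \frac{1}{2j}$, and compares the result with $\frac32 \ln 2$, alongside a partial sum of the original alternating series compared with $\ln 2$.

```python
import math

def rearranged_partial_sum(blocks):
    """Sum the first `blocks` groups 1/(4j-3) + 1/(4j-1) - 1/(2j)."""
    return sum(
        1 / (4 * j - 3) + 1 / (4 * j - 1) - 1 / (2 * j)
        for j in range(1, blocks + 1)
    )

def alternating_partial_sum(k):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with k terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, k + 1))

rearranged = rearranged_partial_sum(100_000)
original = alternating_partial_sum(200_000)

print(rearranged, 1.5 * math.log(2))
print(original, math.log(2))
```

The two partial sums use the same kinds of terms but approach visibly different values.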

7.3.1 Exercises
1. Prove that if the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, and $c$ and $d$ are real numbers, then $\sum_{n=1}^{\infty} (c \cdot a_n + d \cdot b_n) = c \cdot \sum_{n=1}^{\infty} a_n + d \cdot \sum_{n=1}^{\infty} b_n$.
2. Prove that if the series $\sum_{n=1}^{\infty} a_n$ converges, then its sequence of tails $t_n = \sum_{m=n}^{\infty} a_m$ converges to 0.


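3. Find the value of each of the following series.
(a) $1 + \frac13 + \frac{1}{3^2} - \frac{1}{3^3} + \frac{1}{3^4} + \frac{1}{3^5} + \frac{1}{3^6} - \frac{1}{3^7} + \cdots$
(b) $\frac{1}{2 \cdot 3} - \frac{1}{3 \cdot 4} + \frac{1}{4 \cdot 5} - \frac{1}{5 \cdot 6} + \frac{1}{6 \cdot 7} - \frac{1}{7 \cdot 8} + \cdots$
(c) $\frac12 + \frac14 - \frac13 + \frac16 + \frac18 - \frac15 + \frac{1}{10} + \frac{1}{12} - \frac17 + \cdots$
(d) $1 + \frac13 + \frac15 + \frac17 - \frac12 + \frac19 + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} - \frac14 + \cdots$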

7.4 Tests for Absolute Convergence


There are many tests for the convergence of series. Presented here are four very useful tests that apply to series whose terms are all positive real numbers. Of course, since the convergence of $\sum_{n=1}^{\infty} |a_n|$ implies the convergence of $\sum_{n=1}^{\infty} a_n$, these tests can be thought of as tests for the absolute convergence of series.

7.4.1 Comparison Test


After the Limit of Terms Test, the Comparison Test is likely the most important convergence test because it is used to prove most of the other convergence tests. It states that if the terms of one series are less than or equal to the corresponding terms of a second series, then the convergence of the second series implies the convergence of the first series. Specifically, suppose there are two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$, and for each $n$, the terms satisfy $0 \le a_n \le b_n$. Then if $\sum_{n=1}^{\infty} b_n$ converges, it follows that $\sum_{n=1}^{\infty} a_n$ must converge. The contrapositive of this statement is then also true and states that if $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} b_n$ must also diverge.

Consider how you would prove that this test is valid. The proof would assume that
$0 \le a_n \le b_n$ for each $n$, and assume that $\sum_{n=1}^{\infty} b_n$ converges. Then it must show that
$\sum_{n=1}^{\infty} a_n$ converges. One shows that a series converges by showing that its sequence
of partial sums converges. You do know that the sequence of partial sums for $\sum_{n=1}^{\infty} b_n$
converges, so how can you use that to make a conclusion about the partial sums of
$\sum_{n=1}^{\infty} a_n$? One idea is to use the technique from the proof that absolutely convergent
series are convergent; that is, a series converges if and only if its sequence of partial
sums is Cauchy. If the partial sums of $\sum_{n=1}^{\infty} b_n$ form a Cauchy sequence, then $\sum_{n=k}^{m} b_n$
gets small whenever $k \le m$ are large. Now, the given fact that $a_n \le b_n$ lets you
conclude that $\sum_{n=k}^{m} a_n \le \sum_{n=k}^{m} b_n$ which implies that the partial sums of $\sum_{n=1}^{\infty} a_n$ are
Cauchy. Thus, $\sum_{n=1}^{\infty} a_n$ must converge.

PROOF (Comparison Test): Suppose that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are series
with nonnegative terms and $N$ is a real number such that for every
integer $n > N$, the terms satisfy $0 \le a_n \le b_n$. Then if $\sum_{n=1}^{\infty} b_n$ converges, so
does $\sum_{n=1}^{\infty} a_n$.

Assume that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are series with nonnegative terms.
Assume that there is an $N$ such that for every $n > N$, the terms of the series
satisfy $0 \le a_n \le b_n$.
Assume that the series $\sum_{n=1}^{\infty} b_n$ converges.
This means that the sequence of partial sums $\sum_{n=1}^{k} b_n$ converges and is,
therefore, a Cauchy sequence.
Thus, given $\varepsilon > 0$ there is an $M \ge N$ such that if $M < k \le m$, then
$\varepsilon > \sum_{n=1}^{m} b_n - \sum_{n=1}^{k} b_n = \sum_{n=k+1}^{m} b_n$.
But then whenever $M < k \le m$, it follows that
$\varepsilon > \sum_{n=k+1}^{m} b_n \ge \sum_{n=k+1}^{m} a_n = \sum_{n=1}^{m} a_n - \sum_{n=1}^{k} a_n$.
This implies that the sequence of partial sums of $\sum_{n=1}^{\infty} a_n$ is Cauchy.
Therefore, the sequence of partial sums of $\sum_{n=1}^{\infty} a_n$ converges, so the series
converges.
This proves that the Comparison Test is valid.

The Comparison Test can be used in many cases when you are faced with a series
which is similar to a series that you know converges. For example, you already know
that the series $\sum_{n=1}^{\infty} \frac{1}{n^2+n}$ converges because it forms a telescoping series. Can this fact
be used to show that the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges? Well, the Comparison Test cannot
be applied directly because for each $n$ you have $\frac{1}{n^2+n} < \frac{1}{n^2}$ which is not what you
need. You need to find a convergent series whose terms are greater than or equal to
$\frac{1}{n^2}$ or a divergent series whose terms are less than or equal to them. You have neither.
On the other hand for each positive integer $n$, it is true that $\frac{1}{n^2} = \frac{2}{n^2+n^2} \le \frac{2}{n^2+n}$.
The series $\sum_{n=1}^{\infty} \frac{2}{n^2+n}$ is twice a convergent series, so it is also convergent. Thus, the
Comparison Test shows that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.
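
The termwise comparison used here can be checked with exact rational arithmetic. The sketch below compares partial sums of $\sum \frac{1}{n^2}$ with those of the telescoping series $\sum \frac{2}{n^2+n}$, whose $k$th partial sum is $2\left(1 - \frac{1}{k+1}\right)$. The function names are illustrative, and a finite check like this does not replace the Comparison Test argument.

```python
from fractions import Fraction

def telescoping_partial(k):
    """Exact partial sum of sum_{n=1}^{k} 2/(n^2 + n), equal to 2*(1 - 1/(k+1))."""
    return sum(Fraction(2, n * n + n) for n in range(1, k + 1))

def p2_partial(k):
    """Exact partial sum of sum_{n=1}^{k} 1/n^2."""
    return sum(Fraction(1, n * n) for n in range(1, k + 1))

for k in (1, 10, 100):
    # Termwise 1/n^2 <= 2/(n^2 + n), so the partial sums are ordered as well,
    # and the telescoping partial sums never reach 2.
    assert p2_partial(k) <= telescoping_partial(k) < 2
    print(k, float(p2_partial(k)), float(telescoping_partial(k)))
```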

In this way the Comparison Test can be used to simplify the task of testing the
convergence of many complicated looking series. As another example, consider the
series $\sum_{n=1}^{\infty} \frac{2n+7}{n^3-5n+1}$. Note that the first two terms of this series are negative. Because
the convergence of a series does not depend on the value of any finite set of its
terms, it is sufficient to test the series by considering the terms where $n \ge 3$. In
the terms $\frac{2n+7}{n^3-5n+1}$ the degree of the polynomial in the denominator is 3 while the
degree of the polynomial in the numerator is 1. This suggests that the terms could be
compared to the terms $\frac{1}{n^2}$ of a known convergent series. The strategy is to compare
$\frac{2n+7}{n^3-5n+1}$ to a fraction that is greater but looks more like $\frac{1}{n^2}$. If the series with greater
fractions converges, the Comparison Test shows that the original series converges.
This can be done by attempting to eliminate lower degree terms of the numerator
and denominator polynomials, thus ending up with a simpler fraction greater than
the original. Clearly, when considering the numerator $2n + 7$, the constant term,
7, will be dwarfed by the size of the linear term $2n$ suggesting that you replace
$2n + 7$ by the larger quantity $2n + 7n = 9n$. This replacement will result in a larger
fraction, but it should not affect whether or not the series converges. Similarly, it
would be good to replace the denominator $n^3 - 5n + 1$ with a smaller polynomial of
the same degree which will result in obtaining a fraction larger than $\frac{2n+7}{n^3-5n+1}$. One
can drop the constant term altogether, but one cannot drop the $-5n$ term without
making the denominator polynomial larger. This can be handled by writing $n^3$ as
$\frac12 n^3 + \frac12 n^3$. For large enough values of $n$, the value of $\frac12 n^3$ will exceed $5n$ making
$\frac12 n^3 - 5n$ a positive quantity which could be removed from the polynomial to make
the polynomial smaller. Indeed, you need $\frac12 n^3 - 5n \ge 0$ implying $n^2 \ge 10$. Thus,
if $n \ge 4$, you can conclude that $n^3 - 5n + 1 > \frac12 n^3$. This shows that for $n \ge 4$, the
fraction $\frac{2n+7}{n^3-5n+1} < \frac{9n}{\frac12 n^3} = 18 \cdot \frac{1}{n^2}$. Since the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, so does $\sum_{n=1}^{\infty} \frac{18}{n^2}$.
Thus, by the Comparison Test, the series $\sum_{n=1}^{\infty} \frac{2n+7}{n^3-5n+1}$ converges.
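
The bound derived in this example is easy to spot-check. The following sketch evaluates the terms $\frac{2n+7}{n^3-5n+1}$ exactly and confirms, for a finite range of $n \ge 4$, that they are positive and smaller than $\frac{18}{n^2}$. The names `term` and `bound` are illustrative, and the finite range is an arbitrary choice.

```python
from fractions import Fraction

def term(n):
    """n-th term (2n + 7)/(n^3 - 5n + 1) as an exact fraction."""
    return Fraction(2 * n + 7, n ** 3 - 5 * n + 1)

def bound(n):
    """Comparison term 18/n^2."""
    return Fraction(18, n * n)

# The first two terms are negative; the bound is claimed only for n >= 4.
print([term(n) for n in (1, 2, 3)])
print(all(0 < term(n) < bound(n) for n in range(4, 5000)))
```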

As a final example of using the Comparison Test, consider the series
$\frac12 + \frac14 + \frac14 + \frac18 + \frac18 + \frac18 + \frac18 + \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} + \frac1{16} + \cdots$. For each
$k \ge 0$ this series has $2^k$ terms equal to $\frac{1}{2^{k+1}}$, and these $2^k$ terms add to $\frac12$. Thus, the
sequence of partial sums for this series contains the subsequence $\frac12, 1, \frac32, 2, \frac52, 3, \ldots$
which clearly diverges. Thus, the series diverges. Now, compare this series to the
harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ and note that each term of the harmonic series is greater than
or equal to the corresponding term of the first series. Thus, by the Comparison Test,
the harmonic series must diverge.
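
The block structure of this comparison can be sketched in a few lines. The code below builds partial sums of the series $\frac12 + \frac14 + \frac14 + \frac18 + \cdots$ and of the harmonic series, and checks that after the first $k$ complete blocks (that is, after $2^k - 1$ terms) the first sum is exactly $\frac{k}{2}$ while the harmonic sum is at least that large. The function names are illustrative.

```python
def harmonic_partial(num_terms):
    """Partial sum of the harmonic series with num_terms terms."""
    return sum(1 / n for n in range(1, num_terms + 1))

def block_series_partial(num_terms):
    """Partial sum of 1/2 + 1/4 + 1/4 + 1/8 + ... (2**k copies of 1/2**(k+1))."""
    total, produced, k = 0.0, 0, 0
    while produced < num_terms:
        count = min(2 ** k, num_terms - produced)  # terms taken from block k
        total += count / 2 ** (k + 1)
        produced += count
        k += 1
    return total

for k in range(1, 15):
    n = 2 ** k - 1  # number of terms in the first k complete blocks
    assert block_series_partial(n) == k / 2
    assert harmonic_partial(n) >= k / 2
```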

7.4.2 Ratio Test


The geometric series $\sum_{n=1}^{\infty} a \cdot r^{n-1}$ converges whenever $|r| < 1$. The Ratio Test
essentially is an application of the Comparison Test where one compares terms of
a series to the terms of an appropriate geometric series. Suppose $\sum_{n=1}^{\infty} a_n$ is a series
with positive terms, and suppose that the sequence of ratios of adjacent terms, $\frac{a_{n+1}}{a_n}$,
has limit $L$ as $n$ approaches infinity. If $L < 1$, then the series can be compared to
a convergent geometric series with a common ratio between $L$ and 1. If $L > 1$,
then the terms of the series increase in value and do not approach 0, so the series
diverges. When $L = 1$, the ratio test fails because there are series for which $L = 1$
that converge and other series for which $L = 1$ that diverge.
To prove that the Ratio Test is valid, you would start by assuming that $\sum_{n=1}^{\infty} a_n$
is a series with positive terms such that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L < 1$. Then you would
compare this series to a well-chosen geometric series known to converge so that
the Comparison Test can be used to conclude that $\sum_{n=1}^{\infty} a_n$ converges. To compare the
given series to the geometric series with $n$th term $ar^{n-1}$, you would need to know
that for all $n$ greater than some $N$, the terms $a_n$ are less than $ar^{n-1}$. If you know that
$\frac{a_{n+1}}{a_n}$ is always less than $r$, then the $a_n$ terms will grow more slowly than the $ar^{n-1}$
terms, and the Comparison Test can be used. In general, $r$ cannot be set equal to $L$
because knowing that $\frac{a_{n+1}}{a_n}$ approaches $L$ in the limit does not ensure that the ratio
is ever actually smaller than $L$ and certainly not that it is always less than $L$. But if
the limit of $\frac{a_{n+1}}{a_n}$ is $L$, then by the definition of limit, there is an $N$ such that for all
$n \ge N$, the ratio is less than some value greater than $L$ such as $\frac{L+1}{2}$, which is halfway
between $L$ and 1. Then, if the value of $a$ is chosen so that $a_N \le a$, you will have
$a_{N+k} \le ar^k$ for each $k \ge 0$, and the result follows.


PROOF (Ratio Test): Suppose that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such
that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L$. Then if $L < 1$, the series converges, if $L > 1$, the series
diverges, and if $L = 1$ the test fails.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L$.
CASE 1: $L < 1$
If $L < 1$, there is an integer $N$ such that $\frac{a_{n+1}}{a_n} < \frac{L+1}{2}$ for all $n \ge N$.
Let $a = a_N$, and $r = \frac{L+1}{2} < 1$.
Assume that for some $k \ge 0$, $a_{N+k} \le ar^k$. Then $\frac{a_{N+k+1}}{a_{N+k}} < r$, so
$a_{N+k+1} < a_{N+k}\, r \le \left(ar^k\right) r = ar^{k+1}$.
Therefore, by mathematical induction it follows that $a_{N+k} \le ar^k$ for all
$k \ge 0$.
Since $\sum_{n=0}^{\infty} ar^n$ is a convergent geometric series, it follows from the Comparison
Test that $\sum_{n=1}^{\infty} a_n$ is also convergent.
CASE 2: $L > 1$
If $L > 1$, there is an integer $N$ such that $\frac{a_{n+1}}{a_n} > 1$ for all $n \ge N$.
Then for all $n \ge N$, $a_{n+1} > a_n > 0$, so the sequence of terms increases from
$a_N$ and cannot have a limit of 0.
Therefore, the series diverges by the Limit of Terms Test.
CASE 3: $L = 1$
Note that the constant series $\sum_{n=1}^{\infty} 1$ diverges, and $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{1}{1} = 1$.
The series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, and $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{n^2}{(n+1)^2} = 1$.
Therefore, no conclusion can be drawn when $L = 1$, and the Ratio Test
fails.
The ratio test is not helpful for series where the $n$th term is a rational function of $n$
because the limit of $\frac{a_{n+1}}{a_n}$ will always be 1, and the test is inconclusive. The ratio test
is particularly useful for series whose $n$th terms involve powers or factorials. For
example, when you apply the ratio test to the series $\sum_{n=1}^{\infty} \frac{5^n}{n!}$, you get
$\lim_{n \to \infty} \frac{5^{n+1}/(n+1)!}{5^n/n!} = \lim_{n \to \infty} \frac{5}{n+1} = 0 < 1$, so the series converges.
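
As a quick numerical companion to this example, the sketch below computes the ratios $\frac{a_{n+1}}{a_n}$ for $a_n = \frac{5^n}{n!}$ exactly and prints a partial sum of the series. The comparison value $e^5 - 1$ comes from the standard exponential series $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, which is a known fact brought in here for illustration rather than something established in this section. The function names are illustrative.

```python
import math
from fractions import Fraction

def a(n):
    """n-th term 5^n / n! as an exact fraction."""
    return Fraction(5 ** n, math.factorial(n))

def ratio(n):
    """a_{n+1} / a_n, which simplifies to 5 / (n + 1)."""
    return a(n + 1) / a(n)

# The ratios shrink toward 0, well below the threshold 1 of the Ratio Test.
print([ratio(n) for n in (1, 4, 9, 99)])

partial = sum(a(n) for n in range(1, 60))
print(float(partial), math.exp(5) - 1)
```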


Note that rather than requiring $\lim_{n \to \infty} \frac{a_{n+1}}{a_n}$ to have a limit, it is enough to assume
that $\limsup_{n \to \infty} \frac{a_{n+1}}{a_n} < 1$ to assure that the series converges and $\liminf_{n \to \infty} \frac{a_{n+1}}{a_n} > 1$ to
assure that the series diverges. The proofs of these facts are left as exercises, but
they are important refinements of the Ratio Test since the lim inf and lim sup always
exist even if the limit does not. For example, consider the series
$1 + \frac23 + \frac13 + \frac{2}{3^2} + \frac{1}{3^2} + \frac{2}{3^3} + \frac{1}{3^3} + \cdots$. For this series, the ratio $\frac{a_{n+1}}{a_n}$ oscillates between $\frac23$ and $\frac12$, so the
limit of the ratio does not exist. But the lim sup of the ratio is $\frac23 < 1$ implying that
the series converges.
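
The oscillating ratios in this example can be listed explicitly. The sketch below generates the terms $1, \frac23, \frac13, \frac{2}{3^2}, \frac{1}{3^2}, \ldots$ as exact fractions, collects the distinct values of $\frac{a_{n+1}}{a_n}$, and prints a partial sum. The helper name `terms` is illustrative.

```python
from fractions import Fraction

def terms(count):
    """First `count` terms of 1 + 2/3 + 1/3 + 2/9 + 1/9 + 2/27 + 1/27 + ..."""
    out = [Fraction(1)]
    k = 1
    while len(out) < count:
        out.append(Fraction(2, 3 ** k))
        out.append(Fraction(1, 3 ** k))
        k += 1
    return out[:count]

seq = terms(41)
ratios = [seq[i + 1] / seq[i] for i in range(len(seq) - 1)]
print(sorted(set(ratios)))   # the distinct ratio values that occur
print(float(sum(seq)))       # the partial sums stay bounded
```

Since only two ratio values occur, the lim sup is the larger of them and the lim inf is the smaller.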
The Ratio Test will play a major role in the discussion of power series in the next
chapter.

7.4.3 Root Test


The Root Test is similar to the Ratio Test and can often be used for the same
series for which the Ratio Test can be used. This is because, like the Ratio Test,
it compares a series to a geometric series. For some series where the general term $a_n$
involves the $n$th powers of expressions, the Root Test can be easier to apply than the
Ratio Test. To test a series with positive terms $a_n$ with the Root Test, you calculate
the limit $\lim_{n \to \infty} \sqrt[n]{a_n} = L$. Then, as with the Ratio Test, if $L < 1$, the series converges,
if $L > 1$, the series diverges, and if $L = 1$, the test fails.
Proving that the Root Test is valid is very straightforward. Given that
$\lim_{n \to \infty} \sqrt[n]{a_n} = L < 1$, there is an integer $N$ such that for all $n \ge N$ the root $\sqrt[n]{a_n}$ is
less than $\frac{L+1}{2} < 1$. Then, for $n \ge N$, the terms $a_n$ are less than $\left(\frac{L+1}{2}\right)^n$, the terms
of a convergent geometric series. Thus, the series converges by the Comparison
Test.
PROOF (Root Test): Suppose that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such
that $\lim_{n \to \infty} \sqrt[n]{a_n} = L$. Then if $L < 1$, the series converges, if $L > 1$, the series
diverges, and if $L = 1$ the test fails.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such that $\lim_{n \to \infty} \sqrt[n]{a_n} = L$.
CASE 1: $L < 1$
If $L < 1$, there is an integer $N$ such that $\sqrt[n]{a_n} < \frac{L+1}{2}$ for all $n \ge N$.
Then, for $n \ge N$, each term $a_n < \left(\frac{L+1}{2}\right)^n$, the corresponding term of a
geometric series with common ratio $\frac{L+1}{2} < 1$.
Therefore, since the geometric series converges, the Comparison Test shows
that $\sum_{n=1}^{\infty} a_n$ converges.
CASE 2: $L > 1$
If $L > 1$, there is an integer $N$ such that $\sqrt[n]{a_n} > \frac{L+1}{2} > 1$ for all $n \ge N$.
Then, for all $n \ge N$, $a_n > \left(\frac{L+1}{2}\right)^n$ which diverges to infinity.
Therefore, the series diverges by the Limit of Terms Test.
CASE 3: $L = 1$
Note that the constant series $\sum_{n=1}^{\infty} 1$ diverges, and $\lim_{n \to \infty} \sqrt[n]{a_n} = \lim_{n \to \infty} 1 = 1$.
The series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, and $\lim_{n \to \infty} \sqrt[n]{a_n} = \lim_{n \to \infty} \frac{1}{\sqrt[n]{n^2}}$.
Since the natural logarithm function is continuous at 1, it follows that
$\ln\left(\lim_{n \to \infty} \sqrt[n]{a_n}\right) = \lim_{n \to \infty} \ln \sqrt[n]{a_n} = \lim_{n \to \infty} \frac{-2 \ln n}{n}$. Then by L'Hôpital's Rule,
this limit is $\lim_{n \to \infty} \frac{-2/n}{1} = 0$, from which it follows that $\lim_{n \to \infty} \sqrt[n]{a_n} = 1$.
Therefore, no conclusion can be drawn when $L = 1$, and the Root Test fails.
As with the Ratio Test, it is sufficient to know that $\limsup_{n \to \infty} \sqrt[n]{a_n} < 1$ to conclude
that the series converges, and that $\liminf_{n \to \infty} \sqrt[n]{a_n} > 1$ to conclude that the series
diverges. For example, the series $\frac12 + \frac13 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots$ has general term
$a_{2n} = \frac{1}{3^n}$ and $a_{2n-1} = \frac{1}{2^n}$. Thus, $\lim_{n \to \infty} \sqrt[n]{a_n}$ does not exist, but $\limsup_{n \to \infty} \sqrt[n]{a_n} = \frac{1}{\sqrt{2}}$
which is less than 1, so the series converges.
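
The two cluster values of $\sqrt[n]{a_n}$ in this example can be seen numerically. The sketch below works with $\ln a_n$ rather than $a_n$ itself, a choice made here only to avoid floating-point underflow for large $n$, and evaluates the $n$th roots along odd and even indices. The function names are illustrative.

```python
import math

def log_a(n):
    """ln(a_n), where a_{2m-1} = 1/2**m and a_{2m} = 1/3**m."""
    m = (n + 1) // 2
    return -m * math.log(2) if n % 2 == 1 else -m * math.log(3)

def nth_root(n):
    """The n-th root of a_n, computed as exp(ln(a_n) / n)."""
    return math.exp(log_a(n) / n)

# Odd indices cluster near 1/sqrt(2), even indices near 1/sqrt(3).
print([nth_root(n) for n in (101, 1001, 20001)], 1 / math.sqrt(2))
print([nth_root(n) for n in (100, 1000, 20000)], 1 / math.sqrt(3))
```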

7.4.4 Integral Test


The definition of the Riemann integral considers the integrals of functions over
closed bounded intervals, $[a, b]$. This definition can be extended to integrals on
an infinite interval. An improper Riemann integral of the first kind defines
integrals over intervals where one or both of the endpoints of the interval are infinite.
One defines $\int_a^{\infty} f(x)\,dx$ as $\lim_{b \to \infty} \int_a^b f(x)\,dx$. Similarly, $\int_{-\infty}^{b} f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx$
and $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to -\infty} \lim_{b \to \infty} \int_a^b f(x)\,dx$. For example,
$\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{b \to \infty} \int_1^b \frac{1}{x^2}\,dx = \lim_{b \to \infty} \left. -\frac{1}{x} \right|_1^b = 1$.

After seeing a definition of the improper Riemann integral of the first kind, the
reader may be curious whether there is also an improper Riemann integral of
the second kind. Although this text will not need to deal with improper Riemann
integrals of the second kind, the definition is given here for completeness. Recall
that Riemann integrals over an interval $[a, b]$ exist only if the integrand is bounded.
So, an improper integral of the second kind is an integral where the integrand is
unbounded in every neighborhood of a point $c \in [a, b]$. In this case, the Riemann
integral on $[a, b]$ can be calculated on a region that excludes $c$ and then the limit can
be taken as the region expands toward $c$. For example, one would define $\int_0^4 \frac{1}{\sqrt{x}}\,dx$ as
$\lim_{a \to 0^+} \int_a^4 \frac{1}{\sqrt{x}}\,dx = \lim_{a \to 0^+} \left. 2\sqrt{x} \right|_a^4 = 4$.

The Integral Test for the convergence of a series of positive terms involves
the comparison of an infinite series with an improper Riemann integral. It applies
to series whose terms are equal to a monotonically decreasing function $f$ defined
on an interval $[a, \infty)$ such that for all $n \ge a$, the $n$th term of the series $a_n$ is
equal to the function at the point $n$, that is, $a_n = f(n)$. The following figure
makes this comparison clear. Let $k$ be an integer greater than or equal to $a$. If $f$
is a monotonically decreasing function, then whenever $n \ge x > k$, the function
$f(x) \ge f(n) = a_n$ showing that $\int_{n-1}^{n} f(x)\,dx \ge \int_{n-1}^{n} f(n)\,dx = f(n) = a_n$. Thus, by the
Comparison Test, the series $\sum_{n=1}^{\infty} a_n$ converges if the series $\sum_{n=k}^{\infty} \int_n^{n+1} f(x)\,dx$ converges.
Then, because $f$ is a positive function, $\sum_{n=k}^{\infty} \int_n^{n+1} f(x)\,dx = \int_k^{\infty} f(x)\,dx$. Alternatively, if
$f$ is a monotonically decreasing function, then whenever $x \ge n > k$, the function
$f(x) \le f(n) = a_n$ showing that $\int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx = f(n) = a_n$. Thus,
again by the Comparison Test, the improper integral $\int_k^{\infty} f(x)\,dx = \sum_{n=k}^{\infty} \int_n^{n+1} f(x)\,dx$
converges if the series $\sum_{n=k}^{\infty} f(n)$ converges. Therefore, the series $\sum_{n=1}^{\infty} a_n$ converges
if and only if the improper integral $\int_k^{\infty} f(x)\,dx$ converges. Moreover, for any integer
$k \ge a$, $\int_{k+1}^{\infty} f(x)\,dx \le \sum_{n=k+1}^{\infty} a_n \le \int_k^{\infty} f(x)\,dx$ giving a fairly narrow range for the value
of the infinite series and a good way to obtain an approximation to the value of
the series. This is helpful because it is often easier to evaluate the integral than the
corresponding infinite series (Fig. 7.1).

Fig. 7.1 Comparing the series with the integral in the Integral Test


PROOF (Integral Test): Suppose $f$ is a positive monotonically decreasing
function on the interval $[a, \infty)$. Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that
$a_n = f(n)$ for all $n$ greater than or equal to an integer $k \ge a$. Then the
series $\sum_{n=1}^{\infty} a_n$ converges if and only if the improper integral $\int_a^{\infty} f(x)\,dx$
converges.

Suppose $f$ is a positive monotonically decreasing function on the interval
$[a, \infty)$.
Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that $a_n = f(n)$ for all $n$ greater than or equal
to an integer $k \ge a$.
Because $f$ is monotonically decreasing, for any $n > k$ it follows that $f(x) \ge f(n)$
for all $x \in [n-1, n]$.
Thus, for any $n > k$ it follows that $a_n = f(n) = \int_{n-1}^{n} f(n)\,dx \le \int_{n-1}^{n} f(x)\,dx$.
By the Comparison Test the series $\sum_{n=1}^{\infty} a_n$ converges if the series
$\sum_{n=k+1}^{\infty} \int_{n-1}^{n} f(x)\,dx = \int_k^{\infty} f(x)\,dx$ converges.
Because $f$ is monotonically decreasing, for any $n \ge k$ it follows that $f(n) \ge f(x)$
for all $x \in [n, n+1]$.
Thus, for any $n > k$ it follows that $\int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx = f(n) = a_n$.
By the Comparison Test the series $\sum_{n=k+1}^{\infty} \int_n^{n+1} f(x)\,dx = \int_{k+1}^{\infty} f(x)\,dx$ converges
if the series $\sum_{n=1}^{\infty} a_n$ converges.
Thus, the series $\sum_{n=1}^{\infty} a_n$ converges if and only if the improper integral
$\int_a^{\infty} f(x)\,dx$ converges, proving that the Integral Test is valid.

As an example, consider the collection of p-series which are the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$
where $p$ is some constant greater than 0. For which $p$ does the p-series converge?
You have already seen that it converges when $p = 2$ and diverges when $p = 1$,
the harmonic series. All the p-series can be handled at once using the Integral Test.
Indeed, since the function $f(x) = \frac{1}{x^p}$ is monotonically decreasing in $x$ for each $p > 0$,
the p-series converges exactly when the integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ converges. But the integral
is easy to calculate. When $p \ne 1$, $\int_1^{\infty} \frac{1}{x^p}\,dx = \left. \frac{1}{1-p} \cdot \frac{1}{x^{p-1}} \right|_1^{\infty}$. This is infinite when $p < 1$
but converges to $\frac{1}{p-1}$ when $p > 1$. When $p = 1$, the integral is $\left. \ln x \right|_1^{\infty}$ which is infinite.
Thus, by the Integral Test, the integral and the series converge exactly when $p > 1$.
Consider the p-series when $p = 2$. The value of this series can be estimated
using the integral estimate associated with the Integral Test. The estimate would be
$\int_1^{\infty} \frac{1}{x^2}\,dx > \sum_{n=2}^{\infty} \frac{1}{n^2} > \int_2^{\infty} \frac{1}{x^2}\,dx$ or $1 + 1 > a_1 + \sum_{n=2}^{\infty} \frac{1}{n^2} > 1 + \frac12$, so $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is between
1.5 and 2. This is not very precise, but one can apply this technique a few terms
farther down the series to get $\int_{10}^{\infty} \frac{1}{x^2}\,dx > \sum_{n=11}^{\infty} \frac{1}{n^2} > \int_{11}^{\infty} \frac{1}{x^2}\,dx$ which shows that $\sum_{n=1}^{\infty} \frac{1}{n^2}$
is between 1.6406 and 1.6498. In fact, the limit of the series is $\frac{\pi^2}{6} \approx 1.6449$.
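
The integral estimate for $p = 2$ is simple to reproduce. The sketch below adds the first $N$ terms of $\sum \frac{1}{n^2}$ and then brackets the tail between $\int_{N+1}^{\infty} \frac{1}{x^2}\,dx = \frac{1}{N+1}$ and $\int_{N}^{\infty} \frac{1}{x^2}\,dx = \frac{1}{N}$. The function name `p2_bounds` is an illustrative choice.

```python
import math

def p2_bounds(N):
    """Lower and upper bounds on sum_{n>=1} 1/n^2 from the first N terms.

    The tail sum_{n>N} 1/n^2 lies between 1/(N+1) and 1/N, the integrals
    of 1/x^2 over [N+1, oo) and [N, oo).
    """
    head = sum(1 / n ** 2 for n in range(1, N + 1))
    return head + 1 / (N + 1), head + 1 / N

for N in (1, 10, 100):
    print(N, p2_bounds(N), math.pi ** 2 / 6)
```

The width of the bracket is $\frac{1}{N} - \frac{1}{N+1} = \frac{1}{N(N+1)}$, so the estimate tightens quickly as more terms are summed directly.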

7.4.5 Exercises
1. Suppose that $\sum_{n=1}^{\infty} b_n$ is a convergent series, $\sum_{n=1}^{\infty} a_n$ is a series, and there are
constants $N$ and $K$ such that $0 \le a_n \le K b_n$ for all $n > N$. Prove that $\sum_{n=1}^{\infty} a_n$
converges.
2. Suppose that $\sum_{n=1}^{\infty} b_n$ is a convergent series with positive terms, $\sum_{n=1}^{\infty} a_n$ is a series
with positive terms, and $\lim_{n \to \infty} \frac{a_n}{b_n} = L$ for some real number $L$. Prove that $\sum_{n=1}^{\infty} a_n$
converges. This is sometimes called the Limit Comparison Test.
3. Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms that satisfies $\limsup_{n \to \infty} \frac{a_{n+1}}{a_n} = L < 1$.
Prove that $\sum_{n=1}^{\infty} a_n$ converges.
4. Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms that satisfies $\liminf_{n \to \infty} \frac{a_{n+1}}{a_n} = L > 1$.
Prove that $\sum_{n=1}^{\infty} a_n$ diverges.
5. For the series $\sum_{n=1}^{\infty} a_n = \frac{1}{2^1} + \frac{2}{2^1} + \frac{1}{2^2} + \frac{2}{2^2} + \frac{1}{2^3} + \frac{2}{2^3} + \frac{1}{2^4} + \frac{2}{2^4} + \cdots$ calculate
$\limsup_{n \to \infty} \frac{a_{n+1}}{a_n}$ and $\liminf_{n \to \infty} \frac{a_{n+1}}{a_n}$. What can you conclude about the convergence of
the series?
6. Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms that satisfies $\limsup_{n \to \infty} \sqrt[n]{a_n} = L < 1$.
Prove that $\sum_{n=1}^{\infty} a_n$ converges.
7. Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms that satisfies $\limsup_{n \to \infty} \sqrt[n]{a_n} = L > 1$.
Prove that $\sum_{n=1}^{\infty} a_n$ diverges.
8. For the series $\sum_{n=1}^{\infty} a_n = \frac{1}{2^1} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^4} + \frac{1}{2^5} + \frac{1}{3^6} + \frac{1}{2^7} + \frac{1}{3^8} + \cdots$ calculate
$\limsup_{n \to \infty} \sqrt[n]{a_n}$ and $\liminf_{n \to \infty} \sqrt[n]{a_n}$. What can you conclude about the convergence of
the series?
9. Use the integral estimate from the Integral Test to estimate the size of the series
$\sum_{n=1}^{\infty} \frac{1}{n^3}$.

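10. Determine which of the following series are absolutely convergent by applying
an appropriate convergence test.
(a) $\sum_{n=1}^{\infty} \frac{7n^6 - 18n^4 + 12n^2 - 183}{n^{10} - 5n^5 - 19}$
(b) $\sum_{n=1}^{\infty} \frac{n^2 + 5}{n^3 - 5}$
(c) $\sum_{n=1}^{\infty} \frac{3^n}{2^n + 5^n}$
(d) $\sum_{n=1}^{\infty} \left(\frac{[\ldots]}{3}\right)^n$
(e) $\sum_{n=1}^{\infty} \frac{n!}{(2n)!}$
(f) $\sum_{n=1}^{\infty} \frac{5^n}{n!}$
(g) $\sum_{n=1}^{\infty} \frac{n^n}{(n!)^2}$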

7.5 Alternating Series Test


So what can you do with a series which is not absolutely convergent? There are
fewer tools to handle conditionally convergent series. One tool that does help is
the Alternating Series Test which considers series whose terms alternate in sign.
Specifically, if the absolute values of the terms of the series are monotonically
decreasing to 0, and the signs of the terms alternate, then the series converges. For
example, the series seen earlier, $1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \cdots$, satisfies these conditions.
The series formed by the absolute values of these terms, $1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \cdots$,
is the harmonic series which does not converge, so the given series is not absolutely
convergent. Seeing how the partial sums of this series behave will give you an idea
how to prove that the Alternating Series Test is valid. In particular, the first few
partial sums of this series are

\[
\begin{aligned}
s_1 &= 1 = 1 \\
s_2 &= 1 - \tfrac12 = \tfrac12 \\
s_3 &= 1 - \tfrac12 + \tfrac13 = \tfrac56 \\
s_4 &= 1 - \tfrac12 + \tfrac13 - \tfrac14 = \tfrac{7}{12} \\
s_5 &= 1 - \tfrac12 + \tfrac13 - \tfrac14 + \tfrac15 = \tfrac{47}{60}.
\end{aligned}
\]

The progression can be seen graphically in Fig. 7.2.

Fig. 7.2 Converging to ln 2 with an alternating series


Notice that the partial sums of an odd number of terms are all greater than the
limit, $\ln 2$, while the partial sums of an even number of terms are all less than the
limit. Also, if $n$ is odd, then $s_{n+2} = s_n + \left(-\frac{1}{n+1} + \frac{1}{n+2}\right) = s_n - \frac{1}{(n+1)(n+2)} < s_n$,
showing that the partial sums of an odd number of terms form a decreasing
sequence. Similarly, if $n$ is even, then $s_{n+2} = s_n + \left(\frac{1}{n+1} - \frac{1}{n+2}\right) = s_n + \frac{1}{(n+1)(n+2)} > s_n$,
showing that the partial sums of an even number of terms form an increasing
sequence. Because the terms of the series $\frac{(-1)^{n+1}}{n}$ approach 0, the odd partial sums
and the even partial sums approach each other. They both form bounded monotonic
sequences which both converge to the common limit. This behavior is typical of all
series satisfying the hypothesis of the Alternating Series Test.
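
The behavior of the odd and even partial sums can be reproduced with exact fractions. The sketch below computes $s_1, \ldots, s_{40}$ for the series $1 - \frac12 + \frac13 - \cdots$, checks that the odd-numbered sums decrease and the even-numbered sums increase, and prints the last even and odd sums on either side of $\ln 2$. The function name is illustrative.

```python
import math
from fractions import Fraction

def partial_sums(count):
    """Exact partial sums s_1, ..., s_count of 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], Fraction(0)
    for n in range(1, count + 1):
        total += Fraction((-1) ** (n + 1), n)
        sums.append(total)
    return sums

s = partial_sums(40)
odd, even = s[0::2], s[1::2]   # s_1, s_3, ... and s_2, s_4, ...
print([str(x) for x in s[:5]])
print(all(x > y for x, y in zip(odd, odd[1:])),     # odd sums decrease
      all(x < y for x, y in zip(even, even[1:])))   # even sums increase
print(float(even[-1]), math.log(2), float(odd[-1]))
```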


PROOF (Alternating Series Test): Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that
$\lim_{n \to \infty} a_n = 0$, and for each $n \ge 1$, $a_n$ and $a_{n+1}$ have opposite signs, and
$|a_n| \ge |a_{n+1}|$. Then the series converges.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series such that $\lim_{n \to \infty} a_n = 0$, and for each $n \ge 1$, $a_n$
and $a_{n+1}$ have opposite signs, and $|a_n| \ge |a_{n+1}|$.
Without loss of generality, assume that $a_1 > 0$.
Let the series have partial sums $s_k = \sum_{n=1}^{k} a_n$.
Note that if $n \ge 1$ is odd, then $a_{n+1}$ is negative and $a_{n+2}$ is positive with
$|a_{n+1}| \ge |a_{n+2}|$ implying that $s_{n+2} = s_n + (a_{n+1} + a_{n+2}) \le s_n$.
Similarly, if $n \ge 1$ is even, then $a_{n+1}$ is positive and $a_{n+2}$ is negative with
$|a_{n+1}| \ge |a_{n+2}|$ implying that $s_{n+2} = s_n + (a_{n+1} + a_{n+2}) \ge s_n$.
Thus, the subsequence of odd numbered partial sums forms a monotonically
decreasing sequence while the subsequence of even numbered partial sums
forms a monotonically increasing sequence.
Because the subsequence of even numbered partial sums is increasing, when
$n$ is an odd positive integer it follows that $s_n > s_{n+1} \ge s_2$ showing the
subsequence of odd numbered partial sums is bounded below by $s_2$ implying
that that sequence converges to a limit $L_1$.
Similarly, the subsequence of even numbered partial sums is an increasing
sequence that is bounded above by $s_1$ implying that that sequence converges
to a limit $L_2$.
Then $L_1 - L_2 = \lim_{n \to \infty} (s_{2n+1} - s_{2n}) = \lim_{n \to \infty} a_{2n+1} = 0$ showing that $L_1 = L_2$
and that the odd numbered partial sums and the even numbered partial sums
both converge to the same limit.
Therefore, the sequence of partial sums converges and the series converges.
This proof not only says that the given alternating series converges; it gives a
way to estimate the limit of the series. For any series that satisfies the hypothesis
of the theorem, any two adjacent partial sums, $s_n$ and $s_{n+1}$, are on opposite sides of
the limit $L$ of the series. Thus, the distance that $s_n$ is from the limit of the series is
less than the distance $s_n$ is from $s_{n+1}$, and that distance is just $|a_{n+1}|$. Therefore, it
is easy to remember that for these series, the distance that a partial sum is from the
limit of the series is no more than the first term that is not part of the sum, $|a_{n+1}|$.
Note that the Alternating Series Test for convergence and this limit estimate apply to
series without regard to whether the series is absolutely convergent or conditionally
convergent.
For example, the number $\frac{1}{e} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots$. This is an absolutely
convergent series as seen by the ratio test. But it is also a series whose terms alternate
in sign, and the absolute values of the terms decrease monotonically to 0. Thus,
the partial sum of the series $\frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}$ is already within $\frac{1}{100}$ of $\frac{1}{e}$ because
the first neglected term is $-\frac{1}{5!} = -\frac{1}{120}$. This technique gives an easy proof that the
number $e$ is irrational. It goes like this: If $e$ were rational, then it could be expressed
as $\frac{p}{q}$, where $p$ and $q$ are positive integers. Then $\frac{1}{e} = \frac{q}{p} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots$.
Multiplying both sides of this equation by $p!$ yields
$q(p-1)! = p! - p! + \frac{p!}{2!} - \frac{p!}{3!} + \cdots \pm 1 \mp \frac{1}{p+1} \pm \frac{1}{(p+1)(p+2)} \mp \cdots$.
Thus, the integer $q(p-1)!$ would be an integer
plus (or minus) $\frac{1}{p+1} - \frac{1}{(p+1)(p+2)} + \cdots$. But this infinite series is an alternating series
where the absolute values of the terms decrease to 0, so its value is between $\frac{1}{p+1}$ and
$\frac{1}{p+1} - \frac{1}{(p+1)(p+2)}$. Both of these values lie strictly between 0 and 1, so a difference of
two integers would have to lie strictly between 0 and 1,
something clearly not possible. This is a contradiction, so the assumption that $e$ is
rational must be false.
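
The error estimate for alternating series can be illustrated with the series for $\frac{1}{e}$. The sketch below forms the exact partial sums $\frac{1}{0!} - \frac{1}{1!} + \cdots + \frac{(-1)^n}{n!}$ and compares their distance from $\frac{1}{e}$ with the size of the first neglected term, $\frac{1}{(n+1)!}$. The function name is illustrative, and `math.e` is used as the reference value.

```python
import math
from fractions import Fraction

def inv_e_partial(n):
    """Exact partial sum 1/0! - 1/1! + 1/2! - ... + (-1)^n / n!."""
    return sum(Fraction((-1) ** j, math.factorial(j)) for j in range(n + 1))

for n in range(1, 10):
    error = abs(1 / math.e - float(inv_e_partial(n)))
    first_neglected = 1 / math.factorial(n + 1)
    # The error should never exceed the first term left out of the sum.
    print(n, float(inv_e_partial(n)), error <= first_neglected)
```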

7.5.1 Exercises
Determine which of the following series are conditionally convergent, absolutely
convergent, or divergent.
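1. $1 - \frac{1}{\ln 2} + \frac{1}{\ln 3} - \frac{1}{\ln 4} + \cdots$
2. $1 - \frac{1}{2 \ln 2} + \frac{1}{3 \ln 3} - \frac{1}{4 \ln 4} + \cdots$
3. $1 - \frac{1}{2 (\ln 2)^2} + \frac{1}{3 (\ln 3)^2} - \frac{1}{4 (\ln 4)^2} + \cdots$
4. $1 + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{11}} - \frac{1}{\sqrt{6}} + \cdots$
5. $1 + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} - \frac{1}{\sqrt{8}} + \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{11}} - \frac{1}{\sqrt{12}} + \cdots$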

7.6 The Smallest Divergent Series


Recall that the p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges when $p > 1$ and diverges otherwise. This
raises a natural question about whether there is, in some sense, a largest series
that converges, or, perhaps a smallest series that diverges. If there were, that might
provide a good series to use in the Comparison Test because all series smaller would
converge, and all series larger would diverge. This turns out not to be the case. For
every series of positive terms, $\sum_{n=1}^{\infty} a_n$, that diverges, there is a sequence of positive
numbers $\langle b_n \rangle$ that converges to 0 such that the series $\sum_{n=1}^{\infty} a_n b_n$ also diverges. In
fact, one can take $b_n = \frac{1}{s_n}$ where $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$. Clearly, if the
series diverges, then $s_n$ goes to infinity, so $b_n = \frac{1}{s_n}$ goes to 0.


To prove this result you would begin with a divergent series with positive
terms, $\sum_{n=1}^{\infty} a_n$. Because the series is divergent, you know that the sequence of partial
sums must diverge to infinity. The strategy is to show that the partial sums of the new
series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ are not Cauchy. In particular, for every integer $m$, there is an integer $k$
such that $\sum_{n=m+1}^{k} \frac{a_n}{s_n} > \frac12$ showing that the $m$th and $k$th partial sums differ by at least
$\frac12$. Suppose you are given a positive integer $m$. Since the original series diverges,
there is a positive integer $k$ such that $s_k > 2s_m$. Then the difference between the $k$th
and the $m$th partial sums of the new series is
$\sum_{n=m+1}^{k} \frac{a_n}{s_n} \ge \sum_{n=m+1}^{k} \frac{a_n}{s_k} = \frac{\sum_{n=m+1}^{k} a_n}{s_k} = \frac{s_k - s_m}{s_k} > 1 - \frac12 = \frac12$.

PROOF: Let $\sum_{n=1}^{\infty} a_n$ be a divergent series with positive terms and partial
sums $s_k = \sum_{n=1}^{k} a_n$. Then the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ also diverges.

Assume that $\sum_{n=1}^{\infty} a_n$ is a divergent series with positive terms and partial sums
$s_k = \sum_{n=1}^{k} a_n$.
Let $m$ be any positive integer.
Since the partial sums $s_n$ diverge to infinity, there is a positive integer $k$ such
that $s_k > 2s_m$.
Then the difference between the $m$th and $k$th partial sums of the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$
is $\sum_{n=m+1}^{k} \frac{a_n}{s_n} \ge \sum_{n=m+1}^{k} \frac{a_n}{s_k} = \frac{\sum_{n=m+1}^{k} a_n}{s_k} = \frac{s_k - s_m}{s_k} > 1 - \frac12 = \frac12$.
This shows that the sequence of partial sums of the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ is not a
Cauchy sequence, so it cannot converge.
Therefore, the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ diverges.

For example, consider the harmonic series $1 + \frac12 + \frac13 + \frac14 + \frac15 + \cdots$ which
diverges. The Integral Test suggests that the $k$th partial sum of this series is close
to $\ln k$. If for all $n > 1$, the $n$th term, $\frac{1}{n}$, of the harmonic series is divided by $\ln n$,
the resulting series is $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$. The Integral Test shows that this series also diverges
since the integral $\int_2^{\infty} \frac{1}{x \ln x}\,dx = \left. \ln(\ln x) \right|_2^{\infty} = \infty$ diverges.
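
The key step of the divergence proof for $\sum \frac{a_n}{s_n}$ can be traced numerically for the harmonic series. In the sketch below, `find_block` (an illustrative helper, not from the text) takes an index $m$, searches for the first $k$ with $s_k > 2s_m$, and returns $k$ together with the block sum $\sum_{n=m+1}^{k} \frac{a_n}{s_n}$, which the proof shows must exceed $\frac12$.

```python
def harmonic(n):
    """n-th term of the harmonic series."""
    return 1 / n

def find_block(a, m):
    """Return (k, block) where k is the first index with s_k > 2*s_m and
    block = sum_{n=m+1}^{k} a(n)/s_n for the positive term function a."""
    s = sum(a(n) for n in range(1, m + 1))
    s_m, k, block = s, m, 0.0
    while s <= 2 * s_m:
        k += 1
        s += a(k)
        block += a(k) / s
    return k, block

for m in (1, 2, 3, 4):
    k, block = find_block(harmonic, m)
    print(m, k, block)   # each block sum exceeds 1/2
```

Because such a block can be found beyond every index $m$, the partial sums of $\sum \frac{a_n}{s_n}$ cannot form a Cauchy sequence.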

It is interesting to note that even though for positive termed divergent series
$\sum_{n=1}^{\infty} a_n$, the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ also diverges, for positive termed series $\sum_{n=1}^{\infty} a_n$, the series
$\sum_{n=1}^{\infty} \frac{a_n}{s_n^2}$ always converges. To see this, note that for $n > 1$ the term
$\frac{a_n}{s_n^2} = \frac{s_n - s_{n-1}}{s_n^2} < \frac{s_n - s_{n-1}}{s_n s_{n-1}} = \frac{1}{s_{n-1}} - \frac{1}{s_n}$.
Thus, the terms of the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n^2}$ are less than the terms of a
convergent telescoping series $\sum_{n=2}^{\infty} \left(\frac{1}{s_{n-1}} - \frac{1}{s_n}\right) = \frac{1}{s_1} - \lim_{n \to \infty} \frac{1}{s_n}$. Whether the original
series diverges so that $s_n$ goes to infinity, or it converges to a finite value $L$ so that $s_n$
goes to $L$, the telescoping series converges.
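
The telescoping bound gives an explicit ceiling for $\sum \frac{a_n}{s_n^2}$: the first term is $\frac{a_1}{s_1^2} = \frac{1}{a_1}$, and the remaining terms add to less than $\frac{1}{s_1} = \frac{1}{a_1}$, so every partial sum is below $\frac{2}{a_1}$. The sketch below checks this for two sample series; the function name and the sample term functions are illustrative assumptions.

```python
def squared_ratio_partial(a, count):
    """Return (sum_{n=1}^{count} a(n)/s_n^2, 2/a(1)) for a positive term function a.

    The second value is the ceiling a_1/s_1^2 + 1/s_1 = 2/a_1 that follows
    from the telescoping comparison.
    """
    s, total = 0.0, 0.0
    for n in range(1, count + 1):
        s += a(n)
        total += a(n) / s ** 2
    return total, 2 / a(1)

print(squared_ratio_partial(lambda n: 1 / n, 100_000))  # divergent harmonic series
print(squared_ratio_partial(lambda n: 1.0, 100_000))    # divergent constant series
```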

7.7 Rearrangement of Terms


7.7.1 Addition of Parentheses
The series $1 - 1 + 1 - 1 + 1 - 1 + \cdots$ does not converge. Yet, if you insert parentheses to
group some of the terms together, it can result in a convergent series such as
$(1 - 1) + (1 - 1) + (1 - 1) + \cdots$ which converges to 0 or $1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots$
which converges to 1. So, inserting parentheses can turn a divergent series into a
convergent series. Equivalently, removing parentheses from a convergent series can
turn it into a divergent series. What if the series $\sum_{n=1}^{\infty} a_n$ converges? Can inserting
parentheses change whether or not it converges or change the limit to which the
series converges? The answer to this is no. The point is, if $\sum_{n=1}^{\infty} a_n$ converges, it means
that its sequence of partial sums converges. By inserting parentheses into the series,
you are just removing some of the terms in the sequence of partial sums. You end up
with a new series whose sequence of partial sums is a subsequence of the sequence
of partial sums of $\sum_{n=1}^{\infty} a_n$, and any such subsequence will converge to the same limit
as the original series.


Slightly more can be said. Suppose $\sum_{n=1}^{\infty} a_n$ is a series whose terms approach 0. If
parentheses are inserted in such a way that the number of terms contained within
each set of parentheses is bounded, then the insertion of parentheses cannot affect
whether the series converges or the limit to which the series converges. To see this
assume that each set of parentheses encloses at most $K$ terms. If $\sum_{n=1}^{\infty} a_n$ converges
to $L$, then, as suggested above, no insertion of parentheses can affect the limit of
the series. So suppose that the series $\sum_{n=1}^{\infty} a_n$ diverges, and that its partial sums are
$s_k = \sum_{n=1}^{k} a_n$. Because $\lim_{n \to \infty} a_n = 0$, for each $\varepsilon > 0$, there is an $N$ such that for all
$n > N$, the size of the terms $|a_n|$ must be less than $\frac{\varepsilon}{K}$. Suppose that for some $m > N$
one term of the series with parentheses added is $(a_{m+1} + a_{m+2} + a_{m+3} + \cdots + a_{m+k})$.
Then $s_m$ and $s_{m+k}$ are both partial sums for the series with parentheses added. For
any $j = 1, 2, 3, \ldots, k$, $|a_{m+1} + a_{m+2} + a_{m+3} + \cdots + a_{m+j}| < \frac{\varepsilon}{K} \cdot j \le \varepsilon$, showing that for
any of those $j$, $|s_{m+j} - s_m| < \varepsilon$. The sequence of partial sums for the original series
does not converge either because the sequence is unbounded or because its lim sup
and lim inf approach distinct values. Because the subsequence of partial sums for
the original series remains within $\varepsilon$ of the subsequence corresponding to the series
with parentheses added, the subsequence must also either be unbounded or have
distinct lim sup and lim inf values. Thus, the series with parentheses added cannot
converge.
This observation can be very helpful. Consider again the series $1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots$. This series is not absolutely convergent, and it does not satisfy the hypothesis of the Alternating Series Test. Yet, if parentheses are inserted to group each set of three terms, $\left(1 + \frac{1}{3} - \frac{1}{2}\right) + \left(\frac{1}{5} + \frac{1}{7} - \frac{1}{4}\right) + \left(\frac{1}{9} + \frac{1}{11} - \frac{1}{6}\right) + \cdots$, one gets a general term equal to $\frac{1}{2n-3} + \frac{1}{2n-1} - \frac{1}{n} = \frac{4n-3}{n(2n-1)(2n-3)}$, where $n$ takes the even values $2, 4, 6, \ldots$. The series with terms $\frac{4n-3}{n(2n-1)(2n-3)}$ converges absolutely, as can be seen by comparing it to the $p$-series with $p = 2$ since, for $n \ge 3$, one has $\frac{4n-3}{n(2n-1)(2n-3)} < \frac{4n}{n(2n-n)(2n-n)} = \frac{4}{n^2}$. So, the series with parentheses added converges, and since each set of parentheses contains a maximum of three terms, and the terms of the original series approach 0, this means that the original series converges.
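The algebra in this example can be checked mechanically. The following is a minimal sketch, not part of the original argument: the function names `grouped_term` and `closed_form` are made up for illustration, and the value $\frac{3}{2}\ln 2$ used for comparison is the standard known limit of this rearrangement, which is an outside fact rather than something derived in this passage.

```python
import math
from fractions import Fraction

def grouped_term(n):
    """One parenthesized group, 1/(2n-3) + 1/(2n-1) - 1/n, as an exact fraction."""
    return Fraction(1, 2 * n - 3) + Fraction(1, 2 * n - 1) - Fraction(1, n)

def closed_form(n):
    """The combined fraction (4n-3) / (n(2n-1)(2n-3)) from the text."""
    return Fraction(4 * n - 3, n * (2 * n - 1) * (2 * n - 3))

# The groups of the series correspond to the even values n = 2, 4, 6, ...
even_n = range(2, 42, 2)
identity_holds = all(grouped_term(n) == closed_form(n) for n in even_n)
bound_holds = all(closed_form(n) < Fraction(4, n * n) for n in even_n if n >= 3)

# Floating-point partial sum of the first 100000 groups.
partial_sum = sum(1 / (2 * n - 3) + 1 / (2 * n - 1) - 1 / n
                  for n in range(2, 200002, 2))

# Known value of this rearrangement (assumed here, not proved in this passage).
known_limit = 1.5 * math.log(2)

print(identity_holds, bound_holds, partial_sum, known_limit)
```

Exact rational arithmetic is used for the identity and the comparison with $\frac{4}{n^2}$ so that no rounding can hide a mistake; the partial sum is computed in floating point only to give a feel for the size of the limit.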
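Of course, if the number of terms enclosed by sets of parentheses is not bounded, one cannot draw the same type of conclusions. The series $(1) + \left(\frac{1}{2} + \frac{1}{2} - \frac{1}{2} - \frac{1}{2}\right) + \left(\frac{1}{3} + \frac{1}{3} + \frac{1}{3} - \frac{1}{3} - \frac{1}{3} - \frac{1}{3}\right) + \left(\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4}\right) + \cdots$ converges, but if parentheses are removed, the series diverges even though its terms do approach 0. The partial sums oscillate between 1 and 2.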
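The oscillation is easy to see by listing the partial sums once the parentheses are removed. This is a small illustrative sketch; the helper names `ungrouped_terms` and `partial_sums` are hypothetical and chosen only for this example.

```python
from fractions import Fraction

def ungrouped_terms(num_groups):
    """Terms of (1) + (1/2+1/2-1/2-1/2) + (1/3+1/3+1/3-1/3-1/3-1/3) + ...
    listed one at a time, that is, with the parentheses removed."""
    yield Fraction(1)
    for k in range(2, num_groups + 1):
        for _ in range(k):
            yield Fraction(1, k)
        for _ in range(k):
            yield Fraction(-1, k)

def partial_sums(terms):
    """Running totals of a sequence of terms."""
    total = Fraction(0)
    for t in terms:
        total += t
        yield total

sums = list(partial_sums(ungrouped_terms(50)))
lowest, highest = min(sums), max(sums)
times_at_two = sums.count(2)

print(lowest, highest, times_at_two, sums[-1])
```

Every group of $2k$ terms pushes the running total from 1 up to 2 and back down to 1, so the ungrouped partial sums keep returning to both values and cannot converge, while the grouped partial sums (the totals at the end of each group) are all equal to 1.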

7.7.2 Order of Terms


It has already been shown that the terms of the series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$ can be rearranged as $1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots$ to get a series that converges to a different limit. This is typical for a conditionally convergent series. In fact, if $\sum_{n=1}^{\infty} a_n$ is a conditionally convergent series, and $L$ is any real number, then there is a rearrangement of the terms of this series that converges to $L$. To prove this, first note that a conditionally convergent series must have both positive and negative terms. Define two new series so that for each $n$, $b_n = a_n$ if $a_n \ge 0$ and $b_n = 0$ if $a_n < 0$, and $c_n = b_n - a_n$. Then, for each $n$, both $b_n \ge 0$ and $c_n \ge 0$, and $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (b_n - c_n)$. Since $\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} (b_n + c_n)$ diverges while $\sum_{n=1}^{\infty} (b_n - c_n)$ converges, it must be that both $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ diverge to infinity.


Suppose you are given a target limit $L$. You have isolated the positive terms of the series, the $b_n$ terms, and the negative terms of the series, the $c_n$ terms, so you can play a cute game by taking a few $b_n$ terms such that the sum of those terms exceeds $L$, and then subtract off a few $c_n$ terms until the sum decreases below $L$. You can then add on more $b_n$ terms to make the sum again exceed $L$, and subtract off a few $c_n$ terms until the sum decreases below $L$. Thus, by alternating between adding on $b_n$ terms and subtracting off $c_n$ terms, you can arrange for the resulting series to have limit $L$. More precisely, construct a new series inductively as follows: select $u_1$ so that $\sum_{n=1}^{u_1} b_n > L$. This is always possible because the series with $b_n$ terms diverges to infinity. Then select $v_1$ to be the least positive integer such that $\sum_{n=1}^{u_1} b_n - \sum_{n=1}^{v_1} c_n < L$. Then select $u_2$ to be the least positive integer such that $\sum_{n=1}^{u_2} b_n - \sum_{n=1}^{v_1} c_n > L$, and $v_2$ to be the least positive integer such that $\sum_{n=1}^{u_2} b_n - \sum_{n=1}^{v_2} c_n < L$. For $k \ge 2$, having selected $u_k$ and $v_k$, select $u_{k+1}$ and $v_{k+1}$ to be the least positive integers such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$, and $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_{k+1}} c_n < L$. It is then the case that the series $b_1 + b_2 + b_3 + \cdots + b_{u_1} - c_1 - c_2 - c_3 - \cdots - c_{v_1} + b_{u_1+1} + b_{u_1+2} + b_{u_1+3} + \cdots + b_{u_2} - c_{v_1+1} - c_{v_1+2} - c_{v_1+3} - \cdots - c_{v_2} + \cdots$ is a rearrangement of the terms of the original series with some extra 0 terms added. Since the terms of the series approach 0, the partial sums of the series approach $L$. This provides the desired rearrangement (Fig. 7.3).
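The inductive construction can be sketched in code for the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$. This is only an illustration under stated assumptions: the function name `rearrange_toward` is hypothetical, floating-point arithmetic stands in for exact sums, and the 0 terms that the text inserts into the $b_n$ and $c_n$ series are simply skipped, so the two generators list the positive terms and the sizes of the negative terms directly.

```python
from itertools import count

def rearrange_toward(target, num_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ... aimed at `target`.

    Positive terms are added (at least one per block) until the running
    total exceeds the target, then negative terms are subtracted (at least
    one per block) until the running total drops below it, and so on.
    """
    positives = (1 / (2 * k - 1) for k in count(1))  # 1, 1/3, 1/5, ...
    negatives = (1 / (2 * k) for k in count(1))      # sizes 1/2, 1/4, ...
    terms, total = [], 0.0
    while len(terms) < num_terms:
        # Block of positive terms: stop as soon as the total exceeds target.
        while len(terms) < num_terms:
            t = next(positives)
            total += t
            terms.append(t)
            if total > target:
                break
        # Block of negative terms: stop as soon as the total is below target.
        while len(terms) < num_terms:
            t = next(negatives)
            total -= t
            terms.append(-t)
            if total < target:
                break
    return terms, total

terms, total = rearrange_toward(2.0, 10000)
print(terms[:9])
print(total)
```

With target 2, the first block needs eight positive terms ($1 + \frac{1}{3} + \cdots + \frac{1}{15}$) before the total exceeds 2, and a single negative term $-\frac{1}{2}$ then brings it back below 2. Because each block stops at the first crossing, the running total never strays from the target by more than the size of the most recent crossing term, which is why the partial sums settle toward the target as the terms shrink.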

Fig. 7.3 Rearranging terms to converge to $L$

PROOF: Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series, and let $L$ be any real number. Then there is a rearrangement of the terms of the series which converges to $L$.

Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series, and let $L$ be any real number.
For each $n$, define $b_n = a_n$ if $a_n \ge 0$, and $b_n = 0$ if $a_n < 0$, and define $c_n = b_n - a_n$.
Thus, for each $n$, $b_n \ge 0$, $c_n \ge 0$, and $a_n = b_n - c_n$.
Because $\sum_{n=1}^{\infty} a_n$ is conditionally convergent, $\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} b_n + \sum_{n=1}^{\infty} c_n$ diverges. Thus, at least one of $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ must diverge to infinity, and because $\sum_{n=1}^{\infty} (b_n - c_n) = \sum_{n=1}^{\infty} a_n$ converges, both series must diverge to infinity.
The Limit of Terms Test shows that $\lim_{n\to\infty} a_n = 0$ and, thus, $\lim_{n\to\infty} b_n = \lim_{n\to\infty} c_n = 0$.
Because $\sum_{n=1}^{\infty} b_n$ is unbounded, there is a least positive integer, $u_1$, such that $\sum_{n=1}^{u_1} b_n > L$.
Because $\sum_{n=1}^{\infty} c_n$ is unbounded, there is a least positive integer, $v_1$, such that $\sum_{n=1}^{u_1} b_n - \sum_{n=1}^{v_1} c_n < L$.
Having selected $u_k$ and $v_k$ for some $k \ge 1$, let $u_{k+1} > u_k$ be the least positive integer such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$. Then let $v_{k+1} > v_k$ be the least positive integer such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_{k+1}} c_n < L$. Thus, by mathematical induction, the sequences $\langle u_k \rangle$ and $\langle v_k \rangle$ can be constructed so that for each $k$, $\sum_{n=1}^{u_k} b_n - \sum_{n=1}^{v_k} c_n < L$ and $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$.
Let the terms of the new series $\sum_{n=1}^{\infty} d_n$ be given by the terms $b_1, b_2, b_3, \ldots, b_{u_1}$ followed by the terms $-c_1, -c_2, -c_3, \ldots, -c_{v_1}$, followed by the terms $b_{u_1+1}, b_{u_1+2}, b_{u_1+3}, \ldots, b_{u_2}$, followed by the terms $-c_{v_1+1}, -c_{v_1+2}, -c_{v_1+3}, \ldots, -c_{v_2}$, and so forth, alternating between the sequence of $b_n$ terms for $u_k < n \le u_{k+1}$ and the sequence of $-c_n$ terms for $v_k < n \le v_{k+1}$.
The resulting series $\sum_{n=1}^{\infty} d_n$ is a rearrangement of the terms in the series $\sum_{n=1}^{\infty} a_n$ with some terms equal to 0 inserted.
Given $\epsilon > 0$ there is an $N_1 \ge u_1$ such that if $n > N_1$, then $b_n < \epsilon$, and there is an $N_2$ such that if $n > N_2$, then $c_n < \epsilon$.
Then there is a $k_1$ such that $u_{k_1} > N_1$ and a $k_2$ such that $v_{k_2} > N_2$.
Let $k = \max(k_1, k_2)$, and let $N = u_k + v_k$.
Then for all $m > N$, either there is an $r$ such that $d_m = b_p$ for some $p$ with $N_1 < u_r < p \le u_{r+1}$, or there is an $s$ such that $d_m = -c_q$ for some $q$ with $N_2 < v_s < q \le v_{s+1}$. Thus, $\left| \sum_{n=1}^{m} d_n - L \right|$ is bounded by either $\max\left(c_{v_r}, b_{u_{r+1}}\right) < \epsilon$ or by $\max\left(b_{u_{s+1}}, c_{v_{s+1}}\right) < \epsilon$.
This shows that $\sum_{n=1}^{\infty} d_n$ converges to $L$, implying that a rearrangement of the series $\sum_{n=1}^{\infty} a_n$ converges to $L$ as claimed.

nD1

This theorem takes care of the case of conditionally convergent series, but what happens when terms of an absolutely convergent series are rearranged? The answer is that nothing happens; that is, every rearrangement of an absolutely convergent series converges to the same limit. Suppose, for example, the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent with rearrangement $\sum_{n=1}^{\infty} b_n$. Because $\sum_{n=1}^{\infty} |a_n|$ converges, given $\epsilon > 0$ there is an integer $N$ such that for all $k \ge N$, $\sum_{n=1}^{k} |a_n|$ is within $\epsilon$ of its limit. Alternatively, $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$. Because $\sum_{n=1}^{\infty} b_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_n$, there is an integer $K$ such that all the terms $a_1, a_2, a_3, \ldots, a_N$ are among the terms $b_1, b_2, b_3, \ldots, b_K$. So, if $k \ge K$, by how much can $\sum_{n=1}^{k} a_n$ and $\sum_{n=1}^{k} b_n$ differ? Both sums contain the terms $a_1, a_2, a_3, \ldots, a_N$, so the two sums differ only by a finite number of the terms $a_{N+1}, a_{N+2}, a_{N+3}, \ldots$, which add to at most $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$. This shows that the series and its rearrangement have partial sums within $\epsilon$ of each other and completes the argument.
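A numerical comparison makes the contrast with the conditionally convergent case concrete. The sketch below is an illustration, not part of the argument: it uses the absolutely convergent series $\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2}$, whose value $\frac{\pi^2}{12}$ is a standard known fact assumed here, and rearranges it in the same "two positive terms, then one negative term" pattern used earlier for the alternating harmonic series. The function names are hypothetical.

```python
import math

def original_partial_sum(num_terms):
    """Partial sum of 1 - 1/4 + 1/9 - 1/16 + ... in its original order."""
    return sum((-1) ** (n + 1) / n ** 2 for n in range(1, num_terms + 1))

def rearranged_partial_sum(num_groups):
    """Same terms, taken as two odd-index (positive) terms followed by
    one even-index (negative) term, repeated num_groups times."""
    total = 0.0
    for g in range(1, num_groups + 1):
        total += 1 / (4 * g - 3) ** 2
        total += 1 / (4 * g - 1) ** 2
        total -= 1 / (2 * g) ** 2
    return total

s_original = original_partial_sum(300000)
s_rearranged = rearranged_partial_sum(100000)
known_value = math.pi ** 2 / 12  # assumed standard value of the series

print(s_original, s_rearranged, known_value)
```

Both partial sums agree with each other and with the known value to several decimal places, whereas the same reordering pattern changes the limit of the conditionally convergent alternating harmonic series.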

PROOF: Let $\sum_{n=1}^{\infty} a_n$ be a series that converges absolutely to $L$. Then every rearrangement of the series also converges to $L$.

Let $\sum_{n=1}^{\infty} a_n$ be a series that converges absolutely to $L$.
Let $\sum_{n=1}^{\infty} b_n$ be any rearrangement of the series $\sum_{n=1}^{\infty} a_n$.
Since $\sum_{n=1}^{\infty} a_n$ converges absolutely, given $\epsilon > 0$ there is an integer $N$ such that if $k \ge N$, $\sum_{n=1}^{k} |a_n|$ is within $\epsilon$ of its limit, $\sum_{n=1}^{\infty} |a_n|$. This means that $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$.
Since $\sum_{n=1}^{\infty} b_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_n$, there is an integer $K$ such that all of the terms $a_1, a_2, a_3, \ldots, a_N$ are among the terms $b_1, b_2, b_3, \ldots, b_K$.
For $k \ge K$, the difference between the $k$th partial sums of the two series is $\sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n$. This difference is a sum of the terms in $\sum_{n=1}^{k} a_n$ that are not in $\sum_{n=1}^{k} b_n$ minus the sum of the terms in $\sum_{n=1}^{k} b_n$ that are not in $\sum_{n=1}^{k} a_n$.
Neither sum contains any of the terms $a_1, a_2, a_3, \ldots, a_N$, nor are there any terms that appear in both sums. It follows that the difference of partial sums equals a sum minus another sum where each sum contains distinct terms from $a_{N+1}, a_{N+2}, a_{N+3}, \ldots$. Thus, $\left| \sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n \right|$ is bounded above by $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$.
Thus, given $\epsilon > 0$, there is a $K$ such that for all $k \ge K$, the $k$th partial sum of $\sum_{n=1}^{\infty} a_n$ and the $k$th partial sum of $\sum_{n=1}^{\infty} b_n$ are within $\epsilon$ of each other.
Because $\sum_{n=1}^{\infty} a_n$ converges to $L$, given $\epsilon > 0$, there is an $N_1$ such that if $k \ge N_1$, $\left| \sum_{n=1}^{k} a_n - L \right| < \frac{\epsilon}{2}$.
Also, there is an $N_2$ such that if $k \ge N_2$, $\left| \sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n \right| < \frac{\epsilon}{2}$.
Then for all $k \ge \max(N_1, N_2)$, $\left| \sum_{n=1}^{k} b_n - L \right| \le \left| \sum_{n=1}^{k} b_n - \sum_{n=1}^{k} a_n \right| + \left| \sum_{n=1}^{k} a_n - L \right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
Thus, the series $\sum_{n=1}^{\infty} a_n$ and its rearrangement $\sum_{n=1}^{\infty} b_n$ must both converge to $L$.

nD1

7.7.3 Exercises
1. In which of the following series can the parentheses be removed without affecting
the convergence of the series?


1

1

(a) 1  12

C

C 12  13  13
C 13 C 13  14  14  14
2
3

1
1
1
1
1
1
1
1
C



C
C
C




4
4
4  5  5
5  5
4
1
1
C 
 12
(b) 12  14 C 16  18 C 10
1 1 1 1 1 1
(c) 2  2 C 2  2 C 2  2 C   

1 1


1
1
1
1
C
C
 9 C 10
C 11
 12
 13
(d) 12  13 C 14 C 15  16  17
8

1
1
1
1
1
1
 C 16 C 17  18  19 C   
14 15
 
 
 

1
C  
(e) .1/C 1  12 C 1  12  14 C 1  12  14  18 C 1  12  14  18  16
2. Write a proof to show that if $\sum_{n=1}^{\infty} a_n$ is a conditionally convergent series, then there is a rearrangement of the terms of the series that diverges to infinity and a rearrangement that diverges to negative infinity.
3. Write a proof to show that if $a_1$, $a_2$, and $a_3$ are real numbers, the series $\frac{a_1}{1} + \frac{a_2}{2} + \frac{a_3}{3} + \frac{a_1}{4} + \frac{a_2}{5} + \frac{a_3}{6} + \cdots$ converges if and only if $a_1 + a_2 + a_3 = 0$.
4. Write a proof to show that if $\sum_{n=1}^{\infty} a_n$ is an absolutely convergent series, and $\sum_{n=1}^{\infty} b_n$ is a convergent series, then $\sum_{n=1}^{\infty} a_n b_n$ converges.
5. Give an example of convergent series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ where $\sum_{n=1}^{\infty} a_n b_n$ diverges.
6. Using the method described in this section, find the first 20 terms of the rearrangement of the series $1 - 1 + \frac{1}{2} - \frac{1}{2} + \frac{1}{3} - \frac{1}{3} + \frac{1}{4} - \frac{1}{4} + \cdots$ that converges to 1.

7.8 Cauchy Products


Earlier it was shown that if $\sum_{n=1}^{\infty} a_n$ converges to $L$ and $\sum_{n=1}^{\infty} b_n$ converges to $M$, then the sum of the series, $\sum_{n=1}^{\infty} (a_n + b_n)$, converges to $L + M$. What can be said about the product of the series $\left( \sum_{n=1}^{\infty} a_n \right) \left( \sum_{n=1}^{\infty} b_n \right)$? First of all, can this product even be written as an infinite series? One could, of course, write $\sum_{n=1}^{\infty} \sum_{p=1}^{\infty} a_n b_p$, and some sense can be made out of this expression. The notation suggests that for each $n$, one would calculate a limit of $\sum_{p=1}^{\infty} a_n b_p$, and then one would consider the series of those limits. This raises interesting questions about whether that limit, if it should exist, has anything to do with the similar looking $\sum_{p=1}^{\infty} \sum_{n=1}^{\infty} a_n b_p$. In fact, as seen in the exercises, there are examples where interchanging the order of summation in a double summation can result in a different limit.
A simpler approach is to group the terms of the product $\left( \sum_{n=1}^{\infty} a_n \right) \left( \sum_{n=1}^{\infty} b_n \right)$ in a way that might allow you to calculate the sum. One strategy is to group the terms $a_n b_p$ where $n + p$ is a given constant. For example, when the constant is 2, there is only one term $a_1 b_1$. When the constant is 3, there are two terms $a_1 b_2 + a_2 b_1$. In general, the grouping of the terms whose subscripts add to $n$ is $\sum_{p=1}^{n-1} a_p b_{n-p}$. This gives what is known as the Cauchy product of the two series, $\sum_{n=2}^{\infty} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right)$. Note that this definition is symmetric in $a$ and $b$, so the Cauchy product of $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ is the same as the Cauchy product of $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} a_n$.

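For example, what is the Cauchy product for the square of the geometric series $\sum_{n=1}^{\infty} \frac{1}{2^n}$? Here you have two identical series where $a_n = b_n = \frac{1}{2^n}$, so $\sum_{p=1}^{n-1} a_p b_{n-p} = \sum_{p=1}^{n-1} \frac{1}{2^p} \cdot \frac{1}{2^{n-p}} = \sum_{p=1}^{n-1} \frac{1}{2^n} = \frac{n-1}{2^n}$. Thus, the Cauchy product is $\sum_{n=2}^{\infty} \frac{n-1}{2^n}$. The Ratio Test shows that this series converges to some value $S$. So,

$S = \sum_{n=2}^{\infty} \frac{n-1}{2^n}$
$2S = \sum_{n=2}^{\infty} \frac{n-1}{2^{n-1}} = \sum_{n=1}^{\infty} \frac{n}{2^n}$
$2S - S = \frac{1}{2} + \sum_{n=2}^{\infty} \frac{1}{2^n} = 1$
$S = 1$

This Cauchy product converges to 1, which is the expected limit since $\sum_{n=1}^{\infty} \frac{1}{2^n} = 1$.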

But Cauchy products do not always behave so nicely. For example, find the Cauchy product of the two series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ and $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n+4}}$. The Alternating Series Test shows that both of these series converge, but the Integral Test shows that neither converges absolutely. The $n$th term of the Cauchy product of these two series is $\sum_{p=1}^{n-1} \frac{(-1)^p}{\sqrt{p}} \cdot \frac{(-1)^{n-p}}{\sqrt{n-p+4}} = (-1)^n \sum_{p=1}^{n-1} \frac{1}{\sqrt{p(n-p+4)}}$. For even values of $n$, this is a sum of $n - 1$ positive terms of the form $\frac{1}{\sqrt{p(n-p+4)}}$. Since the product $p(n-p+4)$ is maximum when $p = \frac{n+4}{2}$, each term of the sum is greater than or equal to $\frac{\sqrt{2}}{\sqrt{n+4}} \cdot \frac{\sqrt{2}}{\sqrt{n+4}} = \frac{2}{n+4}$. This means that the sum is greater than or equal to $\frac{2(n-1)}{n+4}$, which approaches 2 as $n$ gets large. Thus, the terms of the Cauchy product do not approach 0 as $n$ goes to infinity, and the Limit of Terms Test shows that the Cauchy product does not converge.
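Both behaviors can be observed numerically. The sketch below is illustrative only; the helper `cauchy_term` and the other function names are hypothetical. It computes Cauchy product terms straight from the definition $\sum_{p=1}^{n-1} a_p b_{n-p}$ for series indexed from 1. For the second pair of series it compares the size of each term with the lower bound $\frac{2(n-1)}{n+4}$, which follows from $p(n-p+4) \le \left(\frac{n+4}{2}\right)^2$.

```python
import math

def cauchy_term(a, b, n):
    """n-th term of the Cauchy product, sum_{p=1}^{n-1} a(p) * b(n-p),
    for two series indexed from 1 (so n starts at 2)."""
    return sum(a(p) * b(n - p) for p in range(1, n))

def geometric(n):
    return 0.5 ** n

def alt_root(n):
    return (-1) ** n / math.sqrt(n)

def alt_root_shifted(n):
    return (-1) ** n / math.sqrt(n + 4)

# Square of the geometric series: partial sums of the Cauchy product.
geometric_partial = sum(cauchy_term(geometric, geometric, n) for n in range(2, 61))

# Conditionally convergent pair: sizes of individual Cauchy product terms.
term_sizes = {n: abs(cauchy_term(alt_root, alt_root_shifted, n))
              for n in (10, 100, 1000)}
lower_bounds = {n: 2 * (n - 1) / (n + 4) for n in term_sizes}

print(geometric_partial)
print(term_sizes)
print(lower_bounds)
```

The geometric partial sum is essentially 1, while the term sizes for the second pair stay above their lower bounds and so stay well away from 0, in line with the conclusion that the second Cauchy product cannot converge.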
This last example shows what can go wrong with the Cauchy product of two conditionally convergent series, but the results are better when at least one of the series is absolutely convergent. For example, if both series are absolutely convergent, then the Cauchy product is absolutely convergent to the product of the series. To see why this is, just consider the difference between a partial sum of the Cauchy product of the two series and the product of two partial sums of the individual series. That is, let $k_1$ and $k_2$ be positive integers, and find the difference between the $(k_1 + k_2)$th partial sum of the Cauchy product, $\sum_{n=2}^{k_1+k_2} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right)$, and the product of the $k_1$th and $k_2$th partial sums, respectively, of the two series, $\sum_{m=1}^{k_1} a_m \cdot \sum_{n=1}^{k_2} b_n$. These are both just finite sums where the Cauchy product partial sum includes all the terms $a_m b_n$ where the sum of the subscripts $m + n$ is less than or equal to $k_1 + k_2$, and the other sum includes all the terms $a_m b_n$ where $m \le k_1$ and $n \le k_2$. Thus, the difference is the sum of the remaining terms $\sum_{m=k_1+1}^{k_1+k_2-1} a_m \sum_{n=1}^{k_1+k_2-m} b_n + \sum_{n=k_2+1}^{k_1+k_2-1} b_n \sum_{m=1}^{k_1+k_2-n} a_m$. So by choosing $k_1$ and $k_2$ large, you can ensure that both $\sum_{m=k_1+1}^{k_1+k_2-1} |a_m|$ and $\sum_{n=k_2+1}^{k_1+k_2-1} |b_n|$ are small, showing the necessary convergence.

PROOF: The Cauchy product of two absolutely convergent series converges absolutely to the product of the two series.

Let $\sum_{m=1}^{\infty} a_m$ and $\sum_{n=1}^{\infty} b_n$ be absolutely convergent series.
Let $\epsilon > 0$ be given.
Because $\sum_{m=1}^{\infty} a_m$ converges absolutely, there exists an integer $N_1$ such that $\sum_{m=N_1}^{\infty} |a_m| < \frac{\epsilon}{2 \left( 1 + \sum_{n=1}^{\infty} |b_n| \right)}$.
Similarly, because $\sum_{n=1}^{\infty} b_n$ converges absolutely, there exists an integer $N_2$ such that $\sum_{n=N_2}^{\infty} |b_n| < \frac{\epsilon}{2 \left( 1 + \sum_{m=1}^{\infty} |a_m| \right)}$.
Then for all $k_1 \ge N_1$ and $k_2 \ge N_2$, the difference between the $(k_1 + k_2)$th partial sum of the Cauchy product of the two series and the product of the $k_1$th and $k_2$th partial sums of the two series is
$\left| \sum_{n=2}^{k_1+k_2} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right) - \sum_{m=1}^{k_1} a_m \cdot \sum_{n=1}^{k_2} b_n \right|$
$= \left| \sum_{m=k_1+1}^{k_1+k_2-1} a_m \sum_{n=1}^{k_1+k_2-m} b_n + \sum_{n=k_2+1}^{k_1+k_2-1} b_n \sum_{m=1}^{k_1+k_2-n} a_m \right|$
$\le \sum_{m=k_1+1}^{k_1+k_2-1} |a_m| \sum_{n=1}^{k_1+k_2-m} |b_n| + \sum_{n=k_2+1}^{k_1+k_2-1} |b_n| \sum_{m=1}^{k_1+k_2-n} |a_m|$
$\le \sum_{m=k_1+1}^{\infty} |a_m| \sum_{n=1}^{\infty} |b_n| + \sum_{n=k_2+1}^{\infty} |b_n| \sum_{m=1}^{\infty} |a_m|$
$\le \frac{\epsilon}{2 \left( 1 + \sum_{n=1}^{\infty} |b_n| \right)} \sum_{n=1}^{\infty} |b_n| + \frac{\epsilon}{2 \left( 1 + \sum_{m=1}^{\infty} |a_m| \right)} \sum_{m=1}^{\infty} |a_m| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.
Therefore, since the $(k_1 + k_2)$th partial sum of the Cauchy product of the two series and the product of the $k_1$th and $k_2$th partial sums of the two series are within $\epsilon$ of each other when $k_1$ and $k_2$ are large, both expressions must converge to the same quantity when $k_1$ and $k_2$ approach infinity, and that limit is the product of the two series.
Also note that for $k_1 \ge N_1$ and $k_2 \ge N_2$, the sum $\sum_{n=k_1+k_2+1}^{\infty} \left| \sum_{p=1}^{n-1} a_p b_{n-p} \right| \le \sum_{m=1}^{\infty} |a_m| \sum_{n=k_2+1}^{\infty} |b_n| + \sum_{n=1}^{\infty} |b_n| \sum_{m=k_1+1}^{\infty} |a_m| < \epsilon$, showing that the Cauchy product converges absolutely.

Another way to think about this theorem is that the Cauchy product $\sum_{n=2}^{\infty} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right)$ and the product of the two series $\sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n$ are rearrangements of each other. Thus, if either converges absolutely, both converge absolutely to the same limit. Of course, to make this rigorous, one would need to find at least one rearrangement of the terms into a form $c_1 + c_2 + c_3 + \cdots$ and then show that that series converges absolutely.
If one series converges absolutely and the other only converges conditionally, then the Cauchy product of the two series still converges to the product of the two series, but absolute convergence is not guaranteed. The proof is similar to the proof of the previous theorem in that it carefully considers the difference between the partial sum of the Cauchy product and the product of the two series. This difference can be broken into three differences, each of which can be bounded. Specifically, assume that $\sum_{m=1}^{\infty} a_m$ is absolutely convergent and $\sum_{n=1}^{\infty} b_n$ is convergent. Then consider the difference between the $N$th partial sum of the Cauchy product and the product of the series. That difference is
$\sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n = \sum_{m=1}^{N-1} a_m \sum_{n=1}^{N-m} b_n - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n = \sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n$.
In the second term of this sum, the factor $\sum_{n=1}^{\infty} b_n$ is fixed, and the factor $\sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m$ can be made as small as necessary by choosing $N$ large. The first term of the sum is a little trickier to handle. In the terms $a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right)$, the $a_m$ factor can be made small by making $m$ large, and the $\sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n$ factor can be made small by making $N - m$ large, or by keeping $m$ small. Both of these can be done, but not at the same time. The technique one would use here would be to break the sum from $m = 1$ to $m = N - 1$ at some intermediate value $K < N - 1$, writing
$\sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) = \sum_{m=1}^{K} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \sum_{m=K+1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right)$.
You can now choose $K$ so that for $m > K$ the value of $a_m$ is small, and when $m \le K$, $N - m$ is large so that $\sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n$ will be small. This gives the following proof.

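PROOF: If $\sum_{m=1}^{\infty} a_m$ is an absolutely convergent series and $\sum_{n=1}^{\infty} b_n$ is a convergent series, then the Cauchy product of the two series converges to the product of the two series.

Let $\sum_{m=1}^{\infty} a_m$ be an absolutely convergent series, and let $\sum_{n=1}^{\infty} b_n$ be a convergent series.
Then for integers $N$ and $K$ with $1 < K < N - 1$, the difference between the $N$th partial sum of the Cauchy product of the two series and the product of the two series is
$\sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n$
$= \sum_{m=1}^{N-1} a_m \sum_{n=1}^{N-m} b_n - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n$
$= \sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n$
$= \sum_{m=1}^{K} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \sum_{m=K+1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n$.
Because $\sum_{n=1}^{T} b_n$ converges as $T$ goes to infinity, it remains bounded. Thus, there is a number $M$ such that for all $T$, $\left| \sum_{n=1}^{T} b_n - \sum_{n=1}^{\infty} b_n \right| < M$.
Let $\epsilon > 0$ be given.
Because $\sum_{m=1}^{\infty} a_m$ converges absolutely, there is an integer $K$ such that $\sum_{m=K+1}^{\infty} |a_m| < \frac{\epsilon}{3M}$.
Because $\sum_{n=1}^{N} b_n$ converges to $\sum_{n=1}^{\infty} b_n$, there is a positive integer $N_1$ such that for all $N \ge N_1$, $\left| \sum_{n=1}^{N} b_n - \sum_{n=1}^{\infty} b_n \right| < \frac{\epsilon}{3 \left( 1 + \sum_{m=1}^{\infty} |a_m| \right)}$.
Because $\sum_{m=1}^{N} a_m$ converges to $\sum_{m=1}^{\infty} a_m$, there is a positive integer $N_2$ such that for all $N \ge N_2$, $\left| \sum_{m=1}^{N} a_m - \sum_{m=1}^{\infty} a_m \right| < \frac{\epsilon}{3 \left( 1 + \left| \sum_{n=1}^{\infty} b_n \right| \right)}$.
Let $N \ge \max(N_1 + K, N_2 + 1)$.
Then
$\left| \sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n \right| = \left| \sum_{m=1}^{N-1} a_m \sum_{n=1}^{N-m} b_n - \sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n \right|$
$= \left| \sum_{m=1}^{K} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \sum_{m=K+1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n \right|$
$\le \sum_{m=1}^{K} |a_m| \left| \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right| + \sum_{m=K+1}^{N-1} |a_m| \left| \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right| + \left| \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right| \left| \sum_{n=1}^{\infty} b_n \right|$
$\le \sum_{m=1}^{K} |a_m| \cdot \frac{\epsilon}{3 \left( 1 + \sum_{m=1}^{\infty} |a_m| \right)} + \frac{\epsilon}{3M} \cdot M + \frac{\epsilon}{3 \left( 1 + \left| \sum_{n=1}^{\infty} b_n \right| \right)} \cdot \left| \sum_{n=1}^{\infty} b_n \right| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon$.
Therefore, the Cauchy product $\sum_{n=2}^{\infty} \sum_{p=1}^{n-1} a_p b_{n-p}$ converges to the product of the series $\sum_{m=1}^{\infty} a_m \cdot \sum_{n=1}^{\infty} b_n$.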
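The theorem can be illustrated numerically with one absolutely convergent and one conditionally convergent series. The sketch below is an assumption-laden illustration, not part of the proof: it takes $a_m = \frac{1}{2^m}$, which sums to 1, and $b_n = \frac{(-1)^{n+1}}{n}$, whose sum $\ln 2$ is a standard known value assumed here, so the product of the two series is $\ln 2$. The function names are hypothetical.

```python
import math

def a(m):
    """Absolutely convergent series terms; the series sums to 1."""
    return 0.5 ** m

def b(n):
    """Conditionally convergent series terms; the series sums to ln 2."""
    return (-1) ** (n + 1) / n

def cauchy_partial_sum(N):
    """N-th partial sum of the Cauchy product:
    sum over n = 2..N of sum over p = 1..n-1 of a(p) * b(n - p)."""
    return sum(a(p) * b(n - p) for n in range(2, N + 1) for p in range(1, n))

expected_product = 1.0 * math.log(2)
gaps = {N: abs(cauchy_partial_sum(N) - expected_product) for N in (10, 100, 1000)}

print(gaps)
```

The gap between the partial sums of the Cauchy product and $\ln 2$ shrinks as $N$ grows, which is the behavior the theorem guarantees. The double loop follows the definition directly and takes on the order of $N^2$ operations, which is acceptable for the small values of $N$ used here.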

Cauchy products play a particularly useful role in the study of power series, a
topic covered in the next chapter.

7.8.1 Exercises
1. Let am;n be the nth number in the mth row of the following table where m and n
both range from 1 to infinity.
1 1

0 

 12  12

0 

 14  14  14  14

0 

1
2

1
2

1
4

1
4

1
4

1
4

1
8

1
8

1
8

1
8



Show that

1 P
1
P
mD1 nD1

1
8

1
8

1
8

1
8

 18  18  18  18  18  18  18  18   

am;n is not equal to

1 P
1
P

am;n .

nD1 mD1

2. Show that the Cauchy product for the square of the conditionally convergent series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ converges.
3. Show that the Cauchy product for the square of the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ diverges.
4. Suppose you have two series whose indices begin with 0 rather than 1 as in $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$. Show that the Cauchy product of these two series is then $\sum_{n=0}^{\infty} \sum_{p=0}^{n} a_p b_{n-p}$.
5. In the next chapter it will be shown that for all real values of $x$, the exponential function has the series representation $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$. Use the Cauchy product of series to show that $e^a e^b = e^{a+b}$.

Chapter 8

Sequences of Functions

8.1 Pointwise Convergence


Chapter 3 introduces the idea of a sequence of real numbers $\langle a_n \rangle$ and discusses theorems related to the limit $\lim_{n\to\infty} a_n$, limit superior $\limsup_{n\to\infty} a_n$, limit inferior $\liminf_{n\to\infty} a_n$, and subsequences $\langle a_{n_j} \rangle$ of such a sequence. If instead of requiring the terms of the sequence $a_n$ to be constants, the $a_n$ were allowed to depend on the value of a variable as in $f_n(x)$, then the sequence is a sequence of functions. Thus, for each value of $x$, if all the functions $f_n(x)$ are defined at $x$, then there is a sequence of real numbers, $\langle f_n(x) \rangle$. This sequence changes as $x$ changes, and, indeed, there is a different sequence of real numbers for each choice of $x$. The limit of the sequence, if it exists, could be different for each $x$, and, therefore, the limit would also be a function, $f(x)$. The first question that arises is, what is meant by the convergence of such a sequence? In fact, there are many different definitions for the convergence of a sequence of functions, each with its own applications and properties. The next question is, what can one say about the properties of the limit of the sequence? For example, under what conditions can you know that the limit function is continuous, differentiable, or integrable? In particular, if the sequence of integrable functions $\langle f_n(x) \rangle$ converges to an integrable function $f(x)$, when can you conclude that $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$?

The simplest form of convergence of a sequence of functions is to say that the sequence of functions $\langle f_n(x) \rangle$ converges pointwise to the function $f(x)$ on a set $A$ if for each $x \in A$, $\lim_{n\to\infty} f_n(x) = f(x)$. This type of convergence is referred to as pointwise convergence. For example, the sequence of functions $f_n(x) = \frac{x}{n}$ converges pointwise to the function $f(x) = 0$ on the entire real line because for each $x \in \mathbb{R}$, $\lim_{n\to\infty} \frac{x}{n} = 0$. A more interesting example is the sequence $f_n(x) = x^n$, which converges pointwise on the interval $(-1, 1]$. When $|x| < 1$, the powers $x^n$ get small as $n$ gets large, so $\lim_{n\to\infty} x^n = 0$. But when $x = 1$, the powers $x^n = 1$, so the limit of the sequence is 1. Thus, the limit function is $f(x) = \begin{cases} 0 & \text{if } -1 < x < 1 \\ 1 & \text{if } x = 1 \end{cases}$. Note that this is a sequence of continuous functions that converges to a function that is not continuous. The sequence does not converge at $x = -1$ because the terms oscillate between $-1$ and 1 (Fig. 8.1).

Springer International Publishing Switzerland 2016
J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_8

Fig. 8.1 The sequence of functions $x^n$ converging to a discontinuous function

Fig. 8.2 The sequence of functions $|x|^{\frac{n+1}{n}}$ converging to the function $f(x) = |x|$
Continuity is not the only property not preserved by functions converging pointwise. The terms of the sequence $f_n(x) = |x|^{\frac{n+1}{n}}$ are differentiable functions for all real numbers, but the limit of the sequence is the function $f(x) = |x|$, which is not differentiable at $x = 0$ (Fig. 8.2). The terms of the sequence
$f_n(x) = \begin{cases} n^2 x & \text{if } 0 \le x \le \frac{1}{n} \\ n^2 \left( \frac{2}{n} - x \right) & \text{if } \frac{1}{n} < x < \frac{2}{n} \\ 0 & \text{if } \frac{2}{n} \le x \le 2 \end{cases}$
all have integral $\int_0^2 f_n(x)\,dx = 1$, yet the sequence converges pointwise on the interval $[0, 2]$ to the function $f(x) = 0$, which has integral equal to 0 (Fig. 8.3).

Fig. 8.3 A sequence of functions with integral 1 converging to the function $f(x) = 0$
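The triangular functions pictured in Fig. 8.3 can be explored numerically. The sketch below is illustrative only: the names `tent` and `integral_0_to_2` are hypothetical, and the integral is approximated with a midpoint Riemann sum rather than computed exactly.

```python
def tent(n, x):
    """A triangle of height n over [0, 2/n], and 0 on the rest of [0, 2]."""
    if 0 <= x <= 1 / n:
        return n * n * x
    if 1 / n < x < 2 / n:
        return n * n * (2 / n - x)
    return 0.0

def integral_0_to_2(f, steps=200000):
    """Midpoint Riemann sum approximating the integral of f over [0, 2]."""
    h = 2 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

# Values at the fixed point x = 0.3 as n grows: eventually 2/n < 0.3.
values_at_point = [tent(n, 0.3) for n in (1, 2, 5, 10, 100)]

# Approximate integrals for a few values of n.
integrals = [integral_0_to_2(lambda x, n=n: tent(n, x)) for n in (1, 5, 50)]

print(values_at_point)
print(integrals)
```

At any fixed $x > 0$ the values are 0 once $\frac{2}{n} \le x$, so the pointwise limit is 0, yet each approximate integral stays close to 1 because the triangle gets taller exactly as fast as it gets narrower.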

8.1.1 Exercises
Determine the pointwise limits of the following sequences of functions. For which sequences is the limit continuous? For which sequences is the limit of the integrals of the terms equal to the integral of the limit?
1. $f_n(x) = \sqrt[n]{x}$ for $x \in [0, 16]$.
2. Let $r_1, r_2, r_3, \ldots$ be a sequence consisting of all the rational numbers in the interval $[0, 1]$. Let $f_n(x) = \begin{cases} 1 & \text{if } x = r_k \text{ for some } k \le n \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.
3. $f_n(x) = n x^n$ for $x \in (0, 1)$.
4. $f_n(x) = \begin{cases} n & \text{if } \frac{n}{2n+4} < x < \frac{n+4}{2n+4} \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.
5. $f_n(x) = \begin{cases} (2 + (-1)^n)\, n^2 x & \text{if } 0 < x < \frac{1}{2n} \\ (2 + (-1)^n)\, n^2 \left( \frac{1}{n} - x \right) & \text{if } \frac{1}{2n} \le x < \frac{1}{n} \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.

8.2 Uniform Convergence


The sequence $\langle f_n \rangle$ converges pointwise to $f$ on the set $A$ if given $\epsilon > 0$, for each $x \in A$ there is an integer $N$ such that $|f_n(x) - f(x)| < \epsilon$ for all $n \ge N$. So, for each $x$ there is an integer $N$ that ensures the inequality. The value of $N$ can depend on the choice of $x$. If this dependence is dropped, and you are able to specify a value of $N$ that does not depend on the choice of $x$, then the speed of convergence becomes similar for all $x \in A$; that is, the rate of convergence is uniform for all $x \in A$. The sequence $\langle f_n \rangle$ converges uniformly to $f$ on the set $A$ if, given $\epsilon > 0$, there is an integer $N$ such that for each $x \in A$, $|f_n(x) - f(x)| < \epsilon$ for all $n \ge N$. The difference between a sequence of functions converging uniformly and converging pointwise is that with uniform convergence there can be no points of the set $A$ where convergence lags behind. For any region of width $\epsilon > 0$ around the limit function, all of the terms suitably far down the sequence enter that region. Compare, for example, the uniformly convergent sequence depicted in Fig. 8.4 with the pointwise convergent sequences depicted in Figs. 8.1 and 8.3. In Fig. 8.4 the functions of the sequence get close to the limit function for all the values of $x$, whereas for each function in Figs. 8.1 and 8.3 there is an $x$ for which the function is far from its limit. Clearly, if a sequence of functions converges uniformly, then it also converges pointwise. Thus, to converge uniformly is a stronger condition than to converge pointwise.

Fig. 8.4 A sequence of functions converging uniformly

As seen in the previous section, the terms of the sequence $\langle f_n \rangle$ can have many properties that are not automatically inherited by the limit of the sequence, $f$, when the convergence is pointwise. Under uniform convergence, more of the properties of the terms of the sequence are retained by the limit. This is because under uniform convergence there are no points $x \in A$ for which the values of $\langle f_n(x) \rangle$ lag behind as $n$ gets large. For all values of $x \in A$, the sequence of $\langle f_n(x) \rangle$ values gets close to the corresponding $f(x)$ at a rate at least as fast as some fixed rate.
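One way to see the difference between the two kinds of convergence is to track the largest gap $|f_n(x) - f(x)|$ over the whole set. The sketch below is only an approximation under stated assumptions: the name `sup_gap` is hypothetical, and a finite grid of sample points in $[0, 1)$ stands in for the supremum over the interval, so the reported values slightly underestimate the true supremum.

```python
def sup_gap(f_n, f, grid):
    """Largest sampled value of |f_n(x) - f(x)| over the grid points."""
    return max(abs(f_n(x) - f(x)) for x in grid)

def zero(x):
    return 0.0

grid = [i / 10000 for i in range(10000)]  # 0, 0.0001, ..., 0.9999 in [0, 1)
ns = (1, 10, 100, 1000)

# f_n(x) = x/n converges to 0 uniformly on [0, 1): the gap shrinks like 1/n.
uniform_gaps = [sup_gap(lambda x, n=n: x / n, zero, grid) for n in ns]

# f_n(x) = x^n converges to 0 pointwise on [0, 1), but not uniformly:
# points near 1 lag behind, so the gap stays close to 1.
power_gaps = [sup_gap(lambda x, n=n: x ** n, zero, grid) for n in ns]

print(uniform_gaps)
print(power_gaps)
```

For $\frac{x}{n}$ a single $N$ works for every $x$ at once, which is what the shrinking gap reflects. For $x^n$ on $[0, 1)$ the true supremum of the gap is 1 for every $n$; a finer grid would push the sampled values even closer to 1.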
For example, if $\langle f_n \rangle$ is a sequence of functions continuous on the set $A$ which converges uniformly to $f$ on $A$, then the limit is guaranteed to be continuous. Actually, a stronger statement can be made. If all the terms $f_n$ are continuous at some point $a \in A$, then the limit, $f$, will also be continuous at $a$. To prove that $f$ is continuous at $a$, you will need to show that for each $\epsilon > 0$ there is a $\delta > 0$ such that if $x$ is in $A$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \epsilon$. How can you arrange for $f(x)$ to be close to $f(a)$? What you know is that the functions $f_n$ get uniformly close to $f$, and that the $f_n$ functions are continuous at $a$. Since, for any particular $n$, the term $f_n$ is continuous at $a$, you can arrange for $f_n(x)$ to be close to $f_n(a)$. The uniform convergence allows you to choose an integer $n$ so that for every $x \in A$, $f_n(x)$ is close to $f(x)$. That is, $|f(x) - f(a)| = |f(x) - f_n(x) + f_n(x) - f_n(a) + f_n(a) - f(a)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(a)| + |f_n(a) - f(a)|$. Each of these three terms can be made small, say less than $\frac{\epsilon}{3}$, so that the sum is less than $\epsilon$. The key point here is that only one value of $n$ needs to be chosen so that $|f(x) - f_n(x)|$ can be made less than $\frac{\epsilon}{3}$ no matter which $x$ is chosen (Fig. 8.5).

Fig. 8.5 If continuous function $f_n$ is close to $f$, then $f(x)$ is close to $f(a)$
PROOF: If the sequence $\langle f_n \rangle$ converges uniformly to the limit $f$ on the set $A$ and if for each $n$, $f_n$ is continuous at $a \in A$, then $f$ is continuous at $a \in A$. In particular, if each $f_n$ is continuous on $A$, then $f$ is continuous on $A$.

Let $\langle f_n \rangle$ be a sequence of functions that converges uniformly to the function $f$ on a set $A$.
Assume that each $f_n$ is continuous at the point $a \in A$.
Let $\epsilon > 0$ be given.
Because the sequence converges uniformly, there is an integer $N$ such that $|f_n(x) - f(x)| < \frac{\epsilon}{3}$ for all $x \in A$ and all $n \ge N$.
Because $f_N$ is continuous at $a$, there is a $\delta > 0$ such that $|f_N(x) - f_N(a)| < \frac{\epsilon}{3}$ for all $x \in A$ satisfying $|x - a| < \delta$.
Then, for all $x \in A$ satisfying $|x - a| < \delta$, it follows that $|f(x) - f(a)| = |f(x) - f_N(x) + f_N(x) - f_N(a) + f_N(a) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon$.
Therefore, the function $f$ is continuous at $a \in A$.
Moreover, if each function $f_n$ is continuous at each $a \in A$, then $f$ is continuous at each $x \in A$, so $f$ is continuous on $A$.
It is worth considering where this proof breaks down if all you assume is that the sequence $\langle f_n \rangle$ converges pointwise to $f$. The problem comes in the fact that although $|f_N(a) - f(a)|$ and $|f_N(x) - f_N(a)|$ can be made smaller than $\frac{\epsilon}{3}$, there could be values of $x$ very close to $a$ for which $|f_N(x) - f(x)|$ is no longer small. Thus, the needed inequality $|f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \epsilon$ might not hold. Also consider the function $f$ defined on the interval $[0, 2]$ by $f(x) = x$ if $x \ne 1$ and $f(1) = 3$. If for each positive integer $n$ you let $f_n(x) = f(x) + \frac{1}{n}$, then it is clear that the sequence $\langle f_n \rangle$ converges uniformly to $f$. At the points where each $f_n$ is continuous, that is, for $x \ne 1$, the limit function $f$ is also continuous.
Suppose that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on an interval
$[a, b]$ and that this sequence converges to a limit $f$. Examples in the last section show
that if the convergence is pointwise, the limit $\lim_{n \to \infty} \int_a^b f_n(x)\,dx$ does not necessarily
equal $\int_a^b f(x)\,dx$. Moreover, the function $f$ does not even need to be Riemann
integrable, and the limit of the integrals of the $f_n$ might not exist. On the other
hand, if the convergence is uniform, then the limit function $f$ will be Riemann
integrable and the limit of the integrals of the $f_n$ will equal the integral of $f$. Showing
that the uniform limit of Riemann integrable functions is Riemann integrable is not
difficult and is based on the characterization of Riemann integrable functions given
by Lebesgue's Theorem. Recall that a function is Riemann integrable on an interval
if and only if it is bounded and the set of points where the function is discontinuous
has measure zero. If each term of the sequence, $f_n$, has these properties, then the limit
function, $f$, must also have them. By the definition of uniform convergence, there is
an integer $N$ such that $|f_N(x) - f(x)| < 1$ for all $x \in [a, b]$. So, if the function $f_N$ is
bounded by some constant $M$, then the function $f$ must be bounded by $M + 1$ since
for all $x \in [a, b]$ it follows that $-M - 1 \le f_N(x) - 1 < f(x) < f_N(x) + 1 \le M + 1$.
As for points of discontinuity of $f$, for each positive integer $n$, let $D_n$ be the set of
points in $[a, b]$ where the function $f_n$ is discontinuous. Because each $f_n$ is Riemann
integrable, each $D_n$ has measure zero. But then $D = \bigcup_{n=1}^{\infty} D_n$ is a countable union
of sets of measure zero, so it also has measure zero. The sequence $\langle f_n \rangle$ converges
uniformly on the set $A = [a, b] \setminus D$, and each term of the sequence is continuous at
each point of $A$, so, by the preceding theorem, the limit $f$ must be continuous
on $A$. Thus, the set of discontinuities of $f$ must be contained in $D$, so the set of
discontinuities of $f$ has measure zero. Therefore, $f$ is Riemann integrable.
So why does it follow that $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$? From the definition of
uniform convergence, for every $\epsilon > 0$ there is an integer $N$ such that $n \ge N$ implies
that $|f_n(x) - f(x)| < \epsilon$ for every $x \in [a, b]$. This means that for every $n \ge N$ and
every $x \in [a, b]$ it follows that $f(x) - \epsilon < f_n(x) < f(x) + \epsilon$, so
$$\int_a^b f(x)\,dx - \epsilon(b - a) = \int_a^b \big(f(x) - \epsilon\big)\,dx \le \int_a^b f_n(x)\,dx \le \int_a^b \big(f(x) + \epsilon\big)\,dx = \int_a^b f(x)\,dx + \epsilon(b - a).$$
Thus, by selecting a more appropriate value for $\epsilon$, this shows that
$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.
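The bound just derived can be checked numerically. The following is only a sketch under assumptions that are not from the text: the functions $f(x) = x^2$ and $f_n(x) = x^2 + \frac{\sin(nx)}{n}$ on $[0, 2]$ are chosen for illustration (so that $|f_n(x) - f(x)| \le \frac{1}{n}$ for every $x$), and the helper name `midpoint_integral` is hypothetical. The integrals are approximated with a midpoint rule rather than computed exactly.

```python
import math

def midpoint_integral(g, a, b, pieces=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / pieces
    return h * sum(g(a + (i + 0.5) * h) for i in range(pieces))

def f(x):
    # Illustrative limit function (an assumption, not from the text).
    return x * x

def f_n(n):
    # |f_n(x) - f(x)| <= 1/n for every x, so the convergence is uniform.
    return lambda x: x * x + math.sin(n * x) / n

a, b = 0.0, 2.0
target = midpoint_integral(f, a, b)
for n in (1, 10, 100):
    gap = abs(midpoint_integral(f_n(n), a, b) - target)
    # With epsilon = 1/n, the gap between the integrals is at most epsilon * (b - a).
    print(n, gap, (b - a) / n)
```

The second printed column shrinks along with the third, which is the pattern the inequality $\left|\int_a^b f_n - \int_a^b f\right| \le \epsilon(b - a)$ predicts.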


PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions that are Riemann
integrable on the interval $[a, b]$. If the sequence converges uniformly to $f$,
then $f$ is also Riemann integrable on $[a, b]$, and $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.
Assume that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the
interval $[a, b]$ which converges uniformly to the function $f$.
Because the sequence converges uniformly, there is an integer $N$ such that
for all $n \ge N$ and all $x \in [a, b]$ it follows that $|f_n(x) - f(x)| < 1$.
Because $f_N$ is Riemann integrable on $[a, b]$, $f_N$ is bounded, so there exists an
$M$ such that $|f_N(x)| < M$ for all $x \in [a, b]$.
Then for each $x \in [a, b]$ it follows that $-M - 1 \le f_N(x) - 1 < f(x) <
f_N(x) + 1 \le M + 1$. Thus, $|f(x)|$ is bounded by $M + 1$ and $f$ is a bounded
function.
For each positive integer $n$ let $D_n$ be the set of $x \in [a, b]$ where $f_n$ fails to
be continuous. Because each $f_n$ is Riemann integrable, Lebesgue's Theorem
shows that $D_n$ has measure zero.
Since the countable union of sets of measure zero is a set with measure zero,
the set $D = \bigcup_{n=1}^{\infty} D_n$ has measure zero.
For each $n$, the function $f_n$ is continuous on the set $A = [a, b] \setminus D$.
Because $\langle f_n \rangle$ converges uniformly to $f$ on $[a, b]$, the limit function, $f$, is
continuous at each point in $A$.
Thus, the set of points where $f$ is discontinuous is a subset of $D$, and, hence,
the set of discontinuities of $f$ has measure zero.
It follows from Lebesgue's Theorem that $f$ is integrable on the interval $[a, b]$.
Let $\epsilon > 0$ be given.
Because $\langle f_n \rangle$ converges uniformly to $f$ on $[a, b]$, there is an integer $N$ such
that for all $n \ge N$ and all $x \in [a, b]$, $|f_n(x) - f(x)| < \frac{\epsilon}{b - a + 1}$.
Thus, for each $n \ge N$ and each $x \in [a, b]$, $f(x) - \frac{\epsilon}{b - a + 1} < f_n(x) < f(x) + \frac{\epsilon}{b - a + 1}$.
Then,
$$\int_a^b f(x)\,dx - \epsilon < \int_a^b \Big(f(x) - \frac{\epsilon}{b - a + 1}\Big)\,dx \le \int_a^b f_n(x)\,dx \le \int_a^b \Big(f(x) + \frac{\epsilon}{b - a + 1}\Big)\,dx < \int_a^b f(x)\,dx + \epsilon.$$
This proves that $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$ and completes the proof of the
theorem.

Note the use of $b - a + 1$ rather than just $b - a$ in the denominator of $\frac{\epsilon}{b - a + 1}$. The
extra $+1$ avoids the embarrassing case of $a = b$. The addition of $+1$ allows the
proof to handle the easy special case without having to provide a separate argument
for it.


8.2.1 Exercises
1. Show that $\lim_{n \to \infty} x^n$ converges uniformly to 0 on any interval $[-a, a]$ where $0 < a < 1$.
2. Show that $\lim_{n \to \infty} \frac{1}{nx}$ converges uniformly to 0 on any interval $[a, \infty)$ for $a > 0$ but
not on the interval $(0, \infty)$.
3. Another way to show that the uniform limit of Riemann integrable functions is
Riemann integrable is to show that the limit function has upper and lower step
functions, $u$ and $v$, such that the integrals of $u$ and $v$ are within $\epsilon > 0$ of each
other. Write a proof that uses this strategy.
4. Suppose that the sequence of functions $\langle f_n \rangle$ converges uniformly to the function
$f$ and that $g$ is a uniformly continuous function defined on the range of $f$ and the
ranges of each of the $f_n$ functions. Prove that the functions $g\big(f_n(x)\big)$ converge
uniformly to $g\big(f(x)\big)$.

8.3 Monotone Convergence


Chapter 3 introduces monotonically increasing sequences of numbers $\langle a_n \rangle$ where
$a_n \le a_{n+1}$ for each $n \ge 1$ and monotonically decreasing sequences of numbers
where $a_n \ge a_{n+1}$ for each $n \ge 1$. One can similarly define $\langle f_n \rangle$ to be a
monotonically increasing sequence of functions or a monotonically decreasing
sequence of functions on a set $A$ if for each $x \in A$ the sequence of numbers
$\langle f_n(x) \rangle$ is always monotonically increasing or always monotonically decreasing,
respectively. Such a sequence is said to converge monotonically to the limit
function $f$ on $A$ if the monotone sequence converges to $f$. That convergence could
be a pointwise convergence or a uniform convergence. Even if convergence is
pointwise, sometimes knowing that the convergence is also monotone gives results
similar to knowing that the convergence is uniform.
In the previous section it was shown that if a sequence of continuous functions
converges uniformly to $f$, then $f$ is continuous. If the convergence is actually
monotone, then the converse holds, that is, if the terms of the sequence $\langle f_n \rangle$ are
continuous on an interval $[a, b]$ and the sequence converges monotonically to a limit
function $f$ that is also continuous on $[a, b]$, then the convergence is actually uniform.
To prove this you would need to take an $\epsilon > 0$ and show there was an integer $N$
such that for all $n \ge N$ and all $x \in [a, b]$ it was true that $|f_n(x) - f(x)| < \epsilon$. What do
you have working for you? What you have is that the limit function $f$ is continuous
at each point $x \in [a, b]$, each of the terms of the sequence $f_n$ is continuous, and the
values $f_n(x)$ are approaching $f(x)$ monotonically.
Let $x \in [a, b]$ be given. Using the continuity of $f$ you can find an interval around
$x$ such that for any $y$ in that interval, $f(y)$ is close to $f(x)$. Using the fact that the $f_n$
are converging to $f$ pointwise, you can find an integer $N$ such that for all $n \ge N$ the
value of $f_n(x)$ is close to the value of $f(x)$. Finally, using the continuity of $f_N$ you can
find an interval around $x$ such that for any $y$ in that interval $f_N(y)$ is close to $f_N(x)$.
Combining these you can show that for any $y$ in some interval around $x$, the value
of $f_N(y)$ is close to the value of $f(y)$. The crucial observation here is that once you
know that $f_N(y)$ and $f(y)$ are close, the monotonicity of the convergence gives you
that $f_n(y)$ is between $f_N(y)$ and $f(y)$ for all $n \ge N$, and, thus, $f_n(y)$ will be close to
$f(y)$ for all $n \ge N$. Note, though, that the value of $N$ can vary with the value of $x$.
Well, this means that for each $x \in [a, b]$ there is an interval around $x$ where
$f_n(y)$ is close to $f(y)$ for all $y$ in the interval and all $n \ge N$. Now you can use the
compactness of the interval $[a, b]$, that is, you can use the Heine-Borel Theorem to
show that there is a finite collection of these $x$ values, say $x_1, x_2, x_3, \ldots, x_k$, such that
the entire interval $[a, b]$ is covered by these intervals you constructed around each
of the $x_j$'s. Each of the $x_j$'s was associated with an $N_j$, and now one can select the
maximum of these $N_j$ values to get a single function $f_N$ which is uniformly close
to $f$. Again, because the convergence is monotone, once you know that $f_N$ is close to
$f$, you know that $f_n$ is close to $f$ for all $n \ge N$. This will complete the proof.
PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions continuous on the
interval $[a, b]$ that converges monotonically to the function $f$ that is also
continuous on $[a, b]$. Then the sequence converges uniformly to $f$ on $[a, b]$.
Assume that $\langle f_n \rangle$ is a sequence of functions continuous on the interval
$[a, b]$ that converges monotonically to the function $f$ that is also continuous
on $[a, b]$.
Let $\epsilon > 0$ be given.
Let $x \in [a, b]$.
Because the function $f$ is continuous at $x$, there is a $\delta_1 > 0$ such that if
$y \in [a, b]$ with $|y - x| < \delta_1$, then $|f(y) - f(x)| < \frac{\epsilon}{3}$.
Because $\lim_{n \to \infty} f_n(x) = f(x)$, there is an integer $N_x$ such that if $n \ge N_x$, then
$|f_n(x) - f(x)| < \frac{\epsilon}{3}$.
Because $f_{N_x}$ is continuous at $x$, there is a $\delta_2 > 0$ such that if $y \in [a, b]$ with
$|y - x| < \delta_2$, then $|f_{N_x}(y) - f_{N_x}(x)| < \frac{\epsilon}{3}$.
Let $\delta_x = \min(\delta_1, \delta_2)$.
Then, if $y \in [a, b]$ with $|y - x| < \delta_x$, it follows that $|f_{N_x}(y) - f(y)| =
|f_{N_x}(y) - f_{N_x}(x) + f_{N_x}(x) - f(x) + f(x) - f(y)| \le
|f_{N_x}(y) - f_{N_x}(x)| + |f_{N_x}(x) - f(x)| + |f(x) - f(y)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon$.
The interval $[a, b]$ is covered by the collection of open intervals
$(x - \delta_x, x + \delta_x)$ for $x \in [a, b]$.
By the Heine-Borel Theorem, there is a finite collection of these $x$
values, $x_1, x_2, x_3, \ldots, x_k$, such that the intervals $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ for $j =
1, 2, 3, \ldots, k$ cover the interval $[a, b]$.
Let $N = \max(N_{x_1}, N_{x_2}, N_{x_3}, \ldots, N_{x_k})$.
Let $y \in [a, b]$.
There is a value of $j$ between 1 and $k$ such that $y \in (x_j - \delta_{x_j}, x_j + \delta_{x_j})$.
Because the sequence $\langle f_n \rangle$ converges monotonically to $f$, for all $n \ge N \ge
N_{x_j}$, $f_n(y)$ is between $f_{N_{x_j}}(y)$ and $f(y)$, and $|f_{N_{x_j}}(y) - f(y)| < \epsilon$.
Hence $|f_n(y) - f(y)| < \epsilon$ for all $n \ge N$ and all $y \in [a, b]$.
This shows that the sequence $\langle f_n \rangle$ converges uniformly to $f$.
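A numerical illustration of the theorem may help. The example below is an assumption chosen for illustration and does not come from the text: it uses the standard fact that $f_n(x) = \left(1 + \frac{x}{n}\right)^n$ increases monotonically to $e^x$ for $x \ge 0$, with every function involved continuous on $[0, 1]$. The hypothetical helper `sup_gap` only samples a grid, so it estimates the largest gap rather than proving anything.

```python
import math

def sup_gap(n, samples=1_000):
    """Largest value of e**x - (1 + x/n)**n over an even grid on [0, 1]."""
    return max(math.exp(i / samples) - (1 + i / (samples * n)) ** n
               for i in range(samples + 1))

gaps = [sup_gap(n) for n in (1, 10, 100, 1000)]
print(gaps)   # each entry is smaller than the one before it
```

Because a single number bounds the gap for every sampled $x$ at once, the shrinking list is what uniform convergence looks like in practice.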

p

The sequence of functions fn .x/ D 2xC n converges monotonically to f .x/ D 2x


on the interval $[0, 4]$. Since each $f_n$ and $f$ is continuous on $[0, 4]$, you can conclude
that the convergence of the sequence is uniform. On the other hand, the sequence
of continuous functions $f_n(x) = x^n$ converges monotonically on the interval $[0, 1]$
to a function discontinuous at 1. Clearly, then, the sequence does not converge
uniformly.
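To see the failure of uniform convergence for $x^n$ numerically, one can compare the largest value of $x^n$ on an interval that stays well away from 1 with the largest value on an interval that nearly reaches 1. This is a rough sketch: the helper name `sup_error`, the grid size, and the right endpoints 0.9 and 0.9999 are all illustrative assumptions.

```python
def sup_error(n, right_end, samples=10_000):
    """Approximate sup |x**n - 0| over [0, right_end] on an evenly spaced grid."""
    return max((right_end * i / samples) ** n for i in range(samples + 1))

for n in (5, 50, 500):
    # On [0, 0.9] the largest gap dies off quickly; near 1 it stays large.
    print(n, sup_error(n, 0.9), sup_error(n, 0.9999))
```

On $[0, 0.9]$ a single $N$ works for every $x$, while close to 1 no single $N$ can make $x^n$ small for all $x$ simultaneously.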
Another important theorem about monotone convergence is that if $\langle f_n \rangle$ is a
sequence of functions Riemann integrable on the interval $[a, b]$ that converge
monotonically to the Riemann integrable function $f$, then
$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$. The
result is called the Monotone Convergence Theorem for Riemann Integrals. It is
generally not proved in a book of this type because it is an easy consequence of the
Monotone Convergence Theorem of Lebesgue which is covered in any beginning
course in measure theory, but that study requires the development of Lebesgue
measure, a topic which is beyond the scope of this book.
It does need to be pointed out that even if all the terms of a sequence are Riemann
integrable functions, and the sequence converges monotonically to a function $f$,
it may be that the limit, $f$, is not itself Riemann integrable. For example, let
$r_1, r_2, r_3, \ldots$ be a sequence consisting of all the rational numbers in the interval
$[0, 1]$. Let $f_n(x)$ be the function equal to 1 for $x = r_1, r_2, r_3, \ldots, r_n$ and equal
to 0 elsewhere. Then each $f_n$ has finitely many points of discontinuity so each
$f_n$ has a Riemann integral on $[0, 1]$ equal to 0. Yet the sequence $\langle f_n \rangle$ converges
monotonically to the function $f$ equal to 1 for rational values of $x$ and 0 for irrational
values of $x$, so $f$ is discontinuous everywhere, and, as a result, it is not Riemann
integrable.
So, suppose that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the
interval $[a, b]$ that converge monotonically to a limit function $f$ that is also Riemann
integrable on $[a, b]$. Without loss of generality one can assume that the sequence
is monotonically decreasing to $f$ because if the sequence were increasing, the same
argument could just be applied to the sequence $\langle -f_n \rangle$. Also, it can be assumed that
the function $f$ is identically 0 on $[a, b]$ because if that is not the case, the argument
could be applied to the sequence $\langle f_n - f \rangle$ which does decrease monotonically to 0,
and $\lim_{n \to \infty} \int_a^b \big(f_n(x) - f(x)\big)\,dx = 0$ is equivalent to $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.¹
A proof of the Monotone Convergence Theorem for Riemann Integrals would
start with an $\epsilon > 0$, and the goal of the proof would be to show that there is an
integer $N$ such that for all $n \ge N$, it follows that $\int_a^b f_n(x)\,dx < \epsilon$. The proof presented
here is based on the fact that for any Riemann integrable function, $f_n$, you can find
upper and lower step functions, $u_n(x)$ and $v_n(x)$, satisfying $v_n(x) \le f_n(x) \le u_n(x)$
for every $x \in [a, b]$ so that $\int_a^b \big(u_n(x) - v_n(x)\big)\,dx$ is as small as you like. Suppose
you select $u_n$ and $v_n$ so that $\int_a^b \big(u_n(x) - v_n(x)\big)\,dx < \frac{\epsilon}{2^n}$. That is, find upper and
lower step functions for each $f_n$ such that they give increasingly better and better
approximations to the integrals of $f_n$ as $n$ gets large. In particular, with the stated
precision, you would be able to know that the entire sum $\sum_{n=1}^{\infty} \int_a^b \big(u_n(x) - v_n(x)\big)\,dx$
would be less than $\sum_{n=1}^{\infty} \frac{\epsilon}{2^n} = \epsilon$. Actually, the bound $\epsilon$ is not small enough for this
proof, but later this value can be adjusted when you see just how small the bound
needs to be in order to make the proof work.

¹ This proof is based on ideas from the article "Monotone Convergence Theorem for the Riemann
Integral" by Brian S. Thomson from the American Mathematical Monthly, June-July 2010.

For each $n$ these $u_n$ and $v_n$ functions are step functions on the interval $[a, b]$, so
there must be a positive integer $k$ and a partition of $[a, b]$ given by $a = x_0 < x_1 <
x_2 < \cdots < x_k = b$ such that both the $u_n$ and the $v_n$ functions are constant on each of
the open intervals $(x_{j-1}, x_j)$ for $j = 1, 2, 3, \ldots, k$. Well, for this proof, there will be a
different $k$ and a different partition for each $f_n$ function, so it would be better to name
the positive integer $k_n$ and the partition $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots < x_{n,k_n} = b$.
For the purposes of this proof, it is important that the endpoints of the partition
associated with the $u_n$ and $v_n$, that is, $x_{n,1}, x_{n,2}, x_{n,3}, \ldots, x_{n,k_n - 1}$, do not match any
of the endpoints of the partition associated with the next case $u_{n+1}$ and $v_{n+1}$. This
is easy to arrange because an upper or lower step function for $f_n$ that is constant
on the two intervals $(x_{j-1}, x_j)$ and $(x_j, x_{j+1})$ can be altered to be constant on the
three intervals $(x_{j-1}, x_j - \delta)$, $(x_j - \delta, x_j + \delta)$, and $(x_j + \delta, x_{j+1})$ for some suitably
small $\delta > 0$ without significantly changing the value of the integral of the step
function and without destroying whether the step function is an upper or lower step
function of $f_n$. Indeed, if, for example, the upper step function $u_n$ were constant on
$(x_{j-1}, x_j)$ and $(x_j, x_{j+1})$, you could define the function $\tilde{u}_n$ by redefining $u_n$ on the
interval $(x_j - \delta, x_j + \delta)$ to equal $\max\big(u_n(x_j - \delta), u_n(x_j), u_n(x_j + \delta)\big)$. Then $\tilde{u}_n$ would
be slightly larger than $u_n$ on a small interval so that $\int_a^b \big(\tilde{u}_n(x) - u_n(x)\big)\,dx$ is less than
$2\delta\,\tilde{u}_n(x_j)$ which can be made arbitrarily small by selecting $\delta$ small. Because $\tilde{u}_n$ is
greater than or equal to $u_n$, it is also an upper step function of $f_n$.
For each $y \in [a, b]$, consider the sequence of numbers $f_1(y), f_2(y), f_3(y), \ldots$ which
decreases monotonically to 0. For each $y$, select a positive integer $n(y)$ such that
$f_{n(y)}(y) < \epsilon$. Thus, $n(y)$ associates a term of the sequence $f_{n(y)}$ with $y$. That term is
also associated with $u_n$ and $v_n$ and the partition $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots <
x_{n,k_n} = b$. As stated in the previous paragraph, it can be assumed that $y$ is not equal
to any of the endpoints $x_{n,1}, x_{n,2}, x_{n,3}, \ldots, x_{n,k_n - 1}$, so $y$ can be associated with an open
interval $(x_{n,j-1}, x_{n,j})$ containing $y$ unless $y$ is $a$ or $b$ in which case $y$ will be associated
with the open interval $(a - 1, x_{n,1})$ or $(x_{n,k_n - 1}, b + 1)$, respectively.
Each point $y \in [a, b]$ has been associated with an open interval that contains $y$.
Thus, these open intervals provide an open cover of the interval $[a, b]$. The Heine-Borel
Theorem says that there exists a finite subcover of $[a, b]$. That is, there is a
sequence of $y \in [a, b]$, say $y_1, y_2, y_3, \ldots, y_m$, such that the intervals associated with
these $y$ values cover $[a, b]$. Something stronger can be said. In this finite subcover
of open intervals, you can assume that there are no values of $y \in [a, b]$ that belong
to more than two of the open intervals in that subcover. Indeed, suppose $y$ is an
element of the three intervals of the subcover $(a_1, b_1)$, $(a_2, b_2)$, and $(a_3, b_3)$. Suppose
$a_1$ is the least of $a_1$, $a_2$, and $a_3$, and that $b_2$ is the greatest of $b_1$, $b_2$, and $b_3$. Then
$a_1 < a_3 < y < b_3 < b_2$, so $(a_3, b_3) \subset (a_1, b_1) \cup (a_2, b_2)$, and the interval $(a_3, b_3)$ can
be dropped from the subcover. Because the subcover contains only a finite number of
open intervals, all of these superfluous intervals can be dropped from the subcover.
Consider the intervals associated with each of the $y_j$ values. For simplicity, let the
interval associated with $y_j$ be renamed $(a_j, b_j)$. Note that the endpoints $a$ and $b$ will
be among the $y_j$ values because for every $n$ each of these endpoints was covered by
only one possible open interval. At this point the left endpoint associated with $a$ can
be set to $a$ and the right endpoint of the interval associated with $b$ can be set to $b$. It
is important to note that if $n(y_i) = n(y_j)$ for some distinct $i$ and $j$, then the intervals
associated with $y_i$ and $y_j$ do not overlap. This is because the intervals associated with
$y_i$ and $y_j$ are distinct intervals from $(a - 1, x_1), (x_1, x_2), (x_2, x_3), \ldots, (x_{k-1}, b + 1)$. Let
$N$ be the maximum of the finitely many $n(y_j)$ values for $j = 1, 2, 3, \ldots, m$. Because
no value of $y \in [a, b]$ appears in more than two of the open intervals associated with
the $y_j$, it can be concluded that
$$\begin{aligned}
\int_a^b f_N(x)\,dx &\le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_N(x)\,dx \le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_{n(y_j)}(x)\,dx \\
&= \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \big(f_{n(y_j)}(x) - f_{n(y_j)}(y_j)\big)\,dx + f_{n(y_j)}(y_j)(b_j - a_j) \right] \\
&\le \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \big(u_{n(y_j)}(x) - v_{n(y_j)}(y_j)\big)\,dx + \epsilon(b_j - a_j) \right] \\
&\le \sum_{p=1}^{N} \int_a^b \big(u_p(x) - v_p(x)\big)\,dx + 2\epsilon(b - a) \\
&\le \sum_{p=1}^{N} \frac{\epsilon}{2^p} + 2\epsilon(b - a) < \epsilon(2b - 2a + 1).
\end{aligned}$$
There were two places in the above argument where quantities were forced to be
less than the given value $\epsilon$. It can now be seen that those quantities should have been
made smaller than $\frac{\epsilon}{2b - 2a + 1}$ so that the final inequality would show $\int_a^b f_N(x)\,dx < \epsilon$ as
needed. It is also worth noting that there were two places in the argument that use
the fact that the sequence $\langle f_n \rangle$ converges monotonically. The first was to conclude
that when, for a particular $y \in [a, b]$, the value of $f_n(y)$ is small, then the values of
$f_m(y)$ are also small for all $m \ge n$. The second important use of monotonicity takes
the final result that $\int_a^b f_N(x)\,dx < \epsilon$ and concludes that $\int_a^b f_m(x)\,dx < \epsilon$ for all $m \ge N$.

PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions Riemann integrable
on the interval $[a, b]$ that converges monotonically to the function $f$ that is
also Riemann integrable on $[a, b]$. Then $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.
Assume that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the
interval $[a, b]$ that converges monotonically to the function $f$ that is also
Riemann integrable on $[a, b]$.
Without loss of generality assume that $\langle f_n \rangle$ decreases monotonically to
$f(x) \equiv 0$ on $[a, b]$. If this were not the case, the argument could be applied
to the sequence of functions $\langle |f - f_n| \rangle$.

Let $\epsilon > 0$ be given.
It is left to show that there is an integer $N$ such that for all $n \ge N$,
$\int_a^b f_n(x)\,dx < \epsilon$.
For each positive integer $n$ the function $f_n$ is Riemann integrable on $[a, b]$ so
there exist upper and lower step functions, $u_n$ and $v_n$, satisfying for each
$x \in [a, b]$, $v_n(x) \le f_n(x) \le u_n(x)$ and $\int_a^b \big(u_n(x) - v_n(x)\big)\,dx < \frac{\epsilon}{(2b - 2a + 1)\,2^n}$.
Because $u_n$ and $v_n$ are step functions, there exist a positive integer $k_n$ and a
partition of $[a, b]$ given by $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots < x_{n,k_n} = b$ such
that for each $j = 1, 2, 3, \ldots, k_n$, the functions $u_n$ and $v_n$ are constant on each
open interval $(x_{n,j-1}, x_{n,j})$.
Because there is flexibility in selecting the upper and lower step functions,
it can be assumed that for each positive integer $n$, except for $a$ and $b$, the
endpoints of the partition associated with the upper and lower step functions
for $f_n$ are distinct from the endpoints of the partition associated with the
upper and lower step functions for $f_{n+1}$.
For each $y \in [a, b]$ the sequence $f_1(y), f_2(y), f_3(y), \ldots$ decreases monotonically
to 0, so for each $y$ there is a positive integer $n(y)$ such that
$f_m(y) < \frac{\epsilon}{2b - 2a + 1}$ for all $m \ge n(y)$. In particular, it can be assumed that,
unless $y$ is $a$ or $b$, $y$ is not an endpoint of the partition of $[a, b]$ associated
with the upper and lower step functions of $f_{n(y)}$.
Associate with each $y \in [a, b]$ an open interval as follows. If $y = a$,
then let the open interval be $(a - 1, x_{n(a),1})$. If $y = b$, then let the open
interval be $(x_{n(b),k_{n(b)} - 1}, b + 1)$. Otherwise, associate $y$ with the open interval
$(x_{n(y),j-1}, x_{n(y),j})$ that contains $y$.
Thus, each $y \in [a, b]$ is associated with an open interval that contains $y$, so
this collection of open intervals provides an open cover of $[a, b]$.
By the Heine-Borel Theorem, there exists a finite subcovering of $[a, b]$
consisting of $m$ open intervals associated with $m$ values $y_1, y_2, y_3, \ldots, y_m$
in $[a, b]$. Note that since $a$ and $b$ are each covered by at most one of the
open intervals in the covering of $[a, b]$, both $a$ and $b$ appear in the list of $y_1$
through $y_m$.
Let the interval associated with $y_j$ be called $(a_j, b_j)$. Reset the interval
associated with $a$ so that its $a_j$ value is equal to $a$ rather than $a - 1$, and
reset the interval associated with $b$ so that its $b_j$ value is equal to $b$ rather
than $b + 1$. It can be assumed that no value of $y \in [a, b]$ belongs to more
than two of the open intervals of the subcovering.



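Let $N = \max\big(n(y_1), n(y_2), n(y_3), \ldots, n(y_m)\big)$.
Then
$$\begin{aligned}
\int_a^b f_N(x)\,dx &\le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_N(x)\,dx \le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_{n(y_j)}(x)\,dx \\
&= \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \big(f_{n(y_j)}(x) - f_{n(y_j)}(y_j)\big)\,dx + f_{n(y_j)}(y_j)(b_j - a_j) \right] \\
&\le \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \big(u_{n(y_j)}(x) - v_{n(y_j)}(y_j)\big)\,dx + \frac{\epsilon}{2b - 2a + 1}(b_j - a_j) \right] \\
&\le \sum_{p=1}^{N} \int_a^b \big(u_p(x) - v_p(x)\big)\,dx + \frac{\epsilon}{2b - 2a + 1}\,2(b - a) \\
&\le \sum_{p=1}^{N} \frac{\epsilon}{2b - 2a + 1}\,2^{-p} + \frac{\epsilon}{2b - 2a + 1}\,2(b - a) < \frac{\epsilon}{2b - 2a + 1}(2b - 2a + 1) = \epsilon.
\end{aligned}$$
Because the sequence $\langle f_n \rangle$ decreases monotonically to 0, it follows
for all $n \ge N$ that $\int_a^b f_n(x)\,dx \le \int_a^b f_N(x)\,dx < \epsilon$ which completes the proof.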

Pointwise convergence and uniform convergence are not the only methods of
convergence of sequences of functions. Another method suggested by the above
theorem is called convergence in mean or convergence in $L^1$. A sequence of
Riemann integrable functions $\langle f_n \rangle$ is said to converge in mean to the Riemann
integrable function $f$ on the interval $[a, b]$ if $\lim_{n \to \infty} \int_a^b |f_n(x) - f(x)|\,dx = 0$. For
example, consider the following sequence of functions defined on the interval $[0, 1]$.
Define $f(x; a, b)$ to be the function that is 1 for $x$ in the interval $[a, b]$ and 0 for all
other $x$. Then for positive integer $n$ and for integer $k$ with $2^{n-1} \le k < 2^n$, let
$f_k(x) = f\big(x; \frac{k - 2^{n-1}}{2^{n-1}}, \frac{k + 1 - 2^{n-1}}{2^{n-1}}\big)$. The integral of $f\big(x; \frac{k - 2^{n-1}}{2^{n-1}}, \frac{k + 1 - 2^{n-1}}{2^{n-1}}\big)$ from 0 to 1 is
$\frac{1}{2^{n-1}}$, so the integrals of $f_k(x)$ approach 0 as $k$ gets large. Thus, $f_k$ converges in mean
to the zero function. Yet this sequence of functions does not converge pointwise for
any single value of $x$.
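The sliding-interval sequence can be tabulated directly. The sketch below follows the definition of $f_k$ given above; the helper names `block`, `interval`, and `f_k`, and the sample point $x = \frac{3}{10}$, are illustrative assumptions. Exact rational arithmetic is used so that membership in an interval is never blurred by rounding.

```python
from fractions import Fraction

def block(k):
    """Return n with 2**(n-1) <= k < 2**n for a positive integer k."""
    return k.bit_length()

def interval(k):
    """Endpoints of the interval on which f_k equals 1."""
    width = Fraction(1, 2 ** (block(k) - 1))
    left = (k - 2 ** (block(k) - 1)) * width
    return left, left + width

def f_k(k, x):
    left, right = interval(k)
    return 1 if left <= x <= right else 0

x = Fraction(3, 10)
for n in range(2, 6):
    values = [f_k(k, x) for k in range(2 ** (n - 1), 2 ** n)]
    left, right = interval(2 ** (n - 1))
    # The interval length (the integral of f_k) is 1/2**(n-1),
    # while the values at x include both 0 and 1 in every block.
    print(n, right - left, values)
```

The interval lengths shrink to 0, which is the convergence in mean, while the values at the fixed point keep returning to both 0 and 1, so the sequence of numbers $f_k(x)$ has no limit.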

8.4 Series of Functions


The infinite series $f(x) = \sum_{n=1}^{\infty} a_n(x)$ is a function of $x$. For each value of $x$ where the
terms of the series are defined and the series converges, $f(x)$ is just an infinite series
of real numbers given by $a_n(x)$. For example, $f(x) = \sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ is defined for each $x$
that is not the negative of a perfect square. If $x$ is the negative of a perfect square,
then there is a term of the series that is not defined. Otherwise, the series converges
by the Comparison Test for if $x \ge 0$, then $\frac{1}{n^2 + x} \le \frac{1}{n^2}$, and if $x < 0$, then for any $n$
with $n^2 > 2|x|$ it follows that $\frac{1}{n^2 + x} = \frac{2}{2n^2 - 2|x|} \le \frac{2}{n^2}$. Then $\sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ converges since
$\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.
Another example is $f(x) = \sum_{n=1}^{\infty} x^n$ which is just a geometric series which
converges to $f(x) = \frac{x}{1 - x}$ for all $x$ satisfying $|x| < 1$. Note here that the function
$f(x) = \frac{x}{1 - x}$ is defined for all $x \ne 1$, but the infinite series is only defined for $|x| < 1$.
This is an example of a power series dealt with in considerably more detail in the
next section.
The results concerning the convergence of sequences of functions discussed
earlier in this chapter apply to the study of infinite series of functions because
an infinite series is just defined to be the sequence of its partial sums. Still other
questions arise such as, can one find the derivative or the integral of an infinite series
by simply differentiating or integrating the terms of the series and then finding the
limit of the resulting partial sums? The answer to this question is that sometimes one
gets a correct answer by differentiating or integrating a series term by term, but other
times this process results in nonsense. For example, consider again the function
$f(x) = \sum_{n=1}^{\infty} \frac{1}{n^2 + x}$. Here, the statement that
$\int f(x)\,dx = \int \sum_{n=1}^{\infty} \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \int \frac{1}{n^2 + x}\,dx$
is not valid since $\sum_{n=1}^{\infty} \int \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \big(\ln(n^2 + x) + C\big)$ which does not converge
for any value of $x$. Alternatively, for this particular series it is valid to use the
definite integral from 0 to $y$ and write
$$\int_0^y f(x)\,dx = \int_0^y \sum_{n=1}^{\infty} \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \int_0^y \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \ln\frac{n^2 + y}{n^2}$$
which does converge for each $y > -1$. The integral and derivative of
the series $f(x) = \sum_{n=1}^{\infty} x^n$ make perfectly good sense in the range $|x| < 1$.
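The contrast between the two term-by-term integrals can be seen numerically. This is a sketch only: the helper name `integrated_series` and the choice $y = 1$ are illustrative. The closed form used for comparison, $\sum_{n=1}^{\infty} \ln\left(1 + \frac{1}{n^2}\right) = \ln\frac{\sinh \pi}{\pi}$, comes from the classical product formula for $\sinh$ and is not part of the text.

```python
import math

def integrated_series(y, terms=10_000):
    """Partial sum of sum_{n>=1} ln((n**2 + y) / n**2), the term-by-term integral from 0 to y."""
    return sum(math.log1p(y / n**2) for n in range(1, terms + 1))

# Partial sums of the definite-integral version settle down ...
print(integrated_series(1.0))
# ... near the closed form of the limit for y = 1 (product formula for sinh).
print(math.log(math.sinh(math.pi) / math.pi))
# Terms of the indefinite version, ln(n**2 + x) with x = 1, keep growing,
# so that series cannot converge.
print([round(math.log(n**2 + 1.0), 2) for n in (1, 10, 100)])
```

The terms $\ln\frac{n^2 + y}{n^2}$ behave like $\frac{y}{n^2}$ for large $n$, which is why the definite-integral series converges while the series of antiderivatives does not.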

One simple observation about series $\sum_{n=1}^{\infty} a_n(x)$ is that if there is a convergent
series of positive numbers $\sum_{n=1}^{\infty} M_n$ such that for each $n$, the term $a_n$ is bounded by
$M_n$ for all $x$ in some set $A$, then the series $\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $A$. This
is known as the Weierstrass M-Test. Consider how the proof of this result would
go. First, of course, you would assume that you had a series of functions, $\sum_{n=1}^{\infty} a_n(x)$,
and a convergent series of positive numbers, $\sum_{n=1}^{\infty} M_n$, such that for each positive
integer $n$, $|a_n(x)| \le M_n$ for every $x \in A$. You should note that for each $x \in A$, the
series $\sum_{n=1}^{\infty} a_n(x)$ converges absolutely by the Comparison Test and, thus, $\sum_{n=1}^{\infty} a_n(x)$ converges
pointwise. You are to prove that the series of functions converges uniformly, so
you would need to take an $\epsilon > 0$ and show that there is an integer $N$ such that
whenever $m \ge N$ and $x \in A$, the partial sum $\sum_{n=1}^{m} a_n(x)$ is within $\epsilon$ of the limit
$\sum_{n=1}^{\infty} a_n(x)$. The difference between the $m$th partial sum of $\sum_{n=1}^{\infty} a_n(x)$ and its limit
satisfies $\left| \sum_{n=m+1}^{\infty} a_n(x) \right| \le \sum_{n=m+1}^{\infty} |a_n(x)| \le \sum_{n=m+1}^{\infty} M_n$ which can be made less than $\epsilon$ by
selecting $m$ large. The value of $m$ does not depend on $x$ showing that the convergence
is uniform. This gives the following proof.
PROOF (Weierstrass M-Test): Let $\sum_{n=1}^{\infty} a_n(x)$ be a series of functions
defined on the set $A$, and let $\sum_{n=1}^{\infty} M_n$ be a convergent series of positive
numbers. If for each $n$ and each $x \in A$ it holds that $|a_n(x)| \le M_n$, then
$\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $A$.
Let $\sum_{n=1}^{\infty} a_n(x)$ be a series of functions defined on the set $A$, and let $\sum_{n=1}^{\infty} M_n$ be
a convergent series of positive numbers.
Assume that for each $n$ and each $x \in A$ it holds that $|a_n(x)| \le M_n$.
Then for each $x \in A$ it follows from the Comparison Test that $\sum_{n=1}^{\infty} a_n(x)$
converges absolutely and, thus, the series converges.
Let $\epsilon > 0$ be given.
Because $\sum_{n=1}^{\infty} M_n$ converges, there is an integer $N$ such that $\sum_{n=m}^{\infty} M_n < \epsilon$ for
all $m \ge N$.
But, then, for each $x \in A$ and each $m \ge N$, the difference between the $m$th
partial sum of $\sum_{n=1}^{\infty} a_n(x)$ and its limit is
$\left| \sum_{n=m+1}^{\infty} a_n(x) \right| \le \sum_{n=m+1}^{\infty} |a_n(x)| \le \sum_{n=m+1}^{\infty} M_n < \epsilon$.
Thus, $\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $A$.

For example, the series $\sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ converges uniformly on the interval $[0, \infty)$
because for all $x \in [0, \infty)$, $\frac{1}{n^2 + x} \le \frac{1}{n^2}$, and the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges. Since
all the partial sums of the series are continuous functions, it follows from this
uniform convergence that the limit function is continuous on $[0, \infty)$. Similarly, the
series $\sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2}$ converges uniformly on the entire real line because $\left| \frac{\sin(n^2 x)}{n^2} \right| \le \frac{1}{n^2}$
for every positive integer $n$. Again, you can conclude that the limit function is
continuous because all the partial sums are continuous functions. Notice, though,
that if you differentiate each term of this series, you get $\sum_{n=1}^{\infty} \cos(n^2 x)$ which does not
converge for any value of $x$ because the terms do not approach 0.
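The key feature of the M-Test, a tail bound that does not depend on $x$, can be observed for the series $\sum \frac{\sin(n^2 x)}{n^2}$. The following is a sketch under illustrative assumptions: the helper names `partial_sum` and `tail_bound`, the cutoffs 10 and 200, and the sampling grid on $[-7, 7]$ are not from the text, and a grid check is evidence rather than proof.

```python
import math

def partial_sum(x, m):
    """m-th partial sum of sum sin(n^2 x) / n^2."""
    return sum(math.sin(n * n * x) / (n * n) for n in range(1, m + 1))

def tail_bound(m, upto):
    """Sum of M_n = 1/n^2 for m < n <= upto; it does not depend on x."""
    return sum(1.0 / (n * n) for n in range(m + 1, upto + 1))

m, upto = 10, 200
grid = [i * 0.01 for i in range(-700, 701)]
worst = max(abs(partial_sum(x, upto) - partial_sum(x, m)) for x in grid)
# The worst gap found on the grid stays below the x-free bound.
print(worst, tail_bound(m, upto))
```

Whatever point of the grid produces the worst gap, the same number $\sum_{n=m+1}^{200} \frac{1}{n^2}$ controls it, which is exactly how the proof obtains a single $N$ for all $x \in A$.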

8.5 Power Series


Power series form a class of infinite series of functions that stands out because of the
particularly nice properties they satisfy, the ease with which they can be produced, the
many well-known elementary functions they represent ($\sqrt{x}$, $e^x$, $\sin x$, $\cos x$, $\ln x$),
and the enormous number of applications they have. A power series is a series of
the form $\sum_{n=0}^{\infty} a_n(x - c)^n$, where the real number $a_n$ is the $n$th coefficient and $c$ is the
center of the power series. This book will consider such series where the variable,
coefficients, and center are real numbers, although most of what is said here holds
when these quantities are allowed to be complex numbers. In fact, such series play
a central role in Complex Analysis.

8.5.1 Absolute Convergence


The first important result about power series is that they converge in an interval
$(c - R, c + R)$ where $c$ is the center of the power series and $R$, called the radius of
convergence, is a nonnegative real number or possibly even infinity. In fact, if the
power series $\sum_{n=0}^{\infty} a_n(x - c)^n$ converges for a particular real number $y$, then it converges
absolutely for any $x$ satisfying $|x - c| < |y - c|$, that is, for any $x$ closer to $c$ than $y$.
The proof is based on the Weierstrass M-Test where the power series at the point $x$
is compared to a convergent geometric series with common ratio $\left| \frac{x - c}{y - c} \right|$ which is less
than 1.


PROOF: If the power series $\sum_{n=0}^{\infty} a_n (x-c)^n$ converges when $x = y$, then the
series converges absolutely for all $x$ satisfying $|x-c| < |y-c|$.

Let $\sum_{n=0}^{\infty} a_n (x-c)^n$ be a power series that converges at $x = y$.
Since $\sum_{n=0}^{\infty} a_n (y-c)^n$ converges, its terms must approach 0 by the Limit of
Terms Test.
Thus, the terms must be bounded, and there exists a real number $M$ such
that $|a_n (y-c)^n| \le M$ for every nonnegative integer $n$.
Let $x$ be any real number satisfying $|x-c| < |y-c|$.
Then $|a_n (x-c)^n| = |a_n (y-c)^n| \cdot \left|\frac{x-c}{y-c}\right|^n \le M \left|\frac{x-c}{y-c}\right|^n$.
The series $\sum_{n=0}^{\infty} M \left|\frac{x-c}{y-c}\right|^n$ is a convergent geometric series with common ratio
$\left|\frac{x-c}{y-c}\right| < 1$.
Thus, $\sum_{n=0}^{\infty} a_n (x-c)^n$ converges absolutely by the Weierstrass M-Test.
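The comparison used in this proof can be checked numerically. The following is a minimal Python sketch and not part of the original text: the function name `geometric_bound_holds` and the sample series $\sum_{n\ge 1} (-1)^{n+1} x^n/n$, which converges at $y = 1$, are assumptions chosen only for illustration, and the bound $M$ is taken over the sampled terms rather than over all $n$.

```python
def geometric_bound_holds(coeff, c, y, x, terms=200):
    """Check |a_n (x-c)^n| <= M * r^n for n < terms, where
    M = max |a_n (y-c)^n| over the same range and r = |x-c| / |y-c|."""
    if not abs(x - c) < abs(y - c):
        raise ValueError("x must be closer to c than y is")
    r = abs(x - c) / abs(y - c)
    at_y = [abs(coeff(n) * (y - c) ** n) for n in range(terms)]
    M = max(at_y)   # bound on the sampled terms at the point y
    tol = 1e-12     # allowance for floating-point rounding
    return all(
        abs(coeff(n) * (x - c) ** n) <= M * r ** n + tol
        for n in range(terms)
    )


def alt_harmonic_coeff(n):
    """Coefficient of x^n in x - x^2/2 + x^3/3 - ... (assumed sample series)."""
    return 0.0 if n == 0 else (-1) ** (n + 1) / n


ok = geometric_bound_holds(alt_harmonic_coeff, c=0.0, y=1.0, x=0.5)
# Geometric tail M r^50 / (1 - r) with M = 1 and r = 1/2 bounds the
# sum of |a_n x^n| over n >= 50.
tail_bound = 1.0 * 0.5 ** 50 / (1 - 0.5)
```

Here the series converges only conditionally at $y = 1$, yet the geometric comparison still forces absolute convergence at $x = 0.5$.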

8.5.2 Interval of Convergence


It follows immediately from the previous theorem that the radius of convergence for
a power series is $R = \sup\{|y-c| : \text{the series converges at } y\}$, and that the power
series converges absolutely for all $x \in (c-R, c+R)$. This does not say anything
about how the power series behaves at the endpoints $c-R$ and $c+R$. There are
examples of power series that converge at both endpoints, that converge at one of the
two endpoints, or that converge at neither endpoint. It also follows from the above proof
that if the power series converges absolutely at $y$, then it converges uniformly for all
$x$ satisfying $|x-c| \le |y-c|$. In particular, since all the partial sums of the series are
continuous functions, if the power series has radius of convergence $R > 0$ and $\delta$ is
any positive number less than $R$, then the series converges absolutely at $x = c+R-\delta$,
so the series converges absolutely and uniformly on $[c-R+\delta, c+R-\delta]$. As a
result, the function $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ is continuous on $[c-R+\delta, c+R-\delta]$ for
all small $\delta > 0$, so it is continuous on the open interval $(c-R, c+R)$. If the series
converges absolutely for $x = c+R$, then $f(x)$ is continuous on the closed interval
$[c-R, c+R]$. What if the series $\sum_{n=0}^{\infty} a_n (x-c)^n$ converges conditionally at $x = c+R$
or $x = c-R$? Does this mean that the function is continuous at that endpoint? The
answer is yes, but this takes some proof and is known as Abel's Theorem.

PROOF (Abel's Theorem): Suppose the power series $\sum_{n=0}^{\infty} a_n (x-c)^n$ has
positive radius of convergence $R < \infty$, and that the series converges at
one of the endpoints $c-R$ or $c+R$. Then the series is continuous on an
interval from $c-R$ to $c+R$ containing that endpoint.

Let $\sum_{n=0}^{\infty} a_n (x-c)^n$ be a power series with positive radius of convergence
$R < \infty$.
Assume that the series converges at one of the endpoints of the interval of
convergence, $c-R$ or $c+R$.
Without loss of generality $c = 0$ and $R = 1$ because the argument can be
applied to the series where $x$ is replaced by $Rx + c$. Thus, assume that the
series is $\sum_{n=0}^{\infty} a_n x^n$ with radius of convergence 1.
Also, it can be assumed that the series converges at 1, because if it converges
at $-1$, the argument can be applied to the series $\sum_{n=0}^{\infty} a_n (-1)^n x^n$ which
converges at 1.
Finally, by subtracting a constant from the constant term of the series, $a_0$, it
can be assumed that $\sum_{n=0}^{\infty} a_n = 0$.
Let this series have partial sums $s_k = \sum_{n=0}^{k} a_n$.
Because $\lim_{k\to\infty} s_k = 0$, the $s_k$ are bounded, and, in particular, $\sum_{n=1}^{\infty} s_n x^n$
converges for all $x$ with $|x| < 1$.
Then
$$\sum_{n=0}^{\infty} a_n x^n = a_0 + \sum_{n=1}^{\infty} (s_n - s_{n-1}) x^n = s_0 + \sum_{n=1}^{\infty} s_n x^n - \sum_{n=1}^{\infty} s_{n-1} x^n = s_0 + \sum_{n=1}^{\infty} s_n x^n (1-x) - s_0 x = (1-x) \sum_{n=0}^{\infty} s_n x^n .$$
Let $\epsilon > 0$ be given.
Because $\lim_{n\to\infty} s_n = 0$, there is an integer $N$ such that for all $n \ge N$, $|s_n| < \frac{\epsilon}{2}$.
Then, for $0 < x < 1$,
$$\begin{aligned}
\left| \sum_{n=0}^{\infty} a_n x^n \right| = \left| (1-x) \sum_{n=0}^{\infty} s_n x^n \right|
&\le \left| (1-x) \sum_{n=0}^{N} s_n x^n \right| + \left| (1-x) \sum_{n=N+1}^{\infty} s_n x^n \right| \\
&\le \left| (1-x) \sum_{n=0}^{N} s_n x^n \right| + (1-x) \frac{\epsilon}{2} \sum_{n=N+1}^{\infty} x^n \\
&= \left| (1-x) \sum_{n=0}^{N} s_n x^n \right| + \frac{\epsilon}{2} (1-x) \frac{x^{N+1}}{1-x}
= \left| (1-x) \sum_{n=0}^{N} s_n x^n \right| + \frac{\epsilon}{2} x^{N+1} .
\end{aligned}$$
Because the limit of this quantity as $x$ approaches 1 from the left is $\frac{\epsilon}{2}$, there
exists $\delta > 0$ such that for all $x$ between $1-\delta$ and 1, this expression is less
than $\epsilon$.
This shows that $\lim_{x\to 1^-} \sum_{n=0}^{\infty} a_n x^n = 0$ which completes the proof.
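Abel's Theorem can be watched at work on a series that converges only conditionally at its endpoint. The sketch below is an illustration added here, not the author's; it assumes the sample series $\sum_{n\ge 1} (-1)^{n+1} x^n / n$, whose sum is $\ln(1+x)$ for $|x| < 1$ and whose value at $x = 1$ is $\ln 2$, and the function name and the number of terms are arbitrary choices.

```python
import math


def alt_harmonic_series(x, terms=200_000):
    """Partial sum of sum_{n>=1} (-1)^(n+1) x^n / n for 0 <= x <= 1."""
    total = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= x                              # power == x**n
        sign = 1.0 if n % 2 == 1 else -1.0
        total += sign * power / n
    return total


# Distance between the series at x and its value ln 2 at the endpoint x = 1.
gaps = [abs(alt_harmonic_series(x) - math.log(2)) for x in (0.9, 0.99, 0.999)]
```

The gaps shrink as $x$ approaches 1 from the left, which is the continuity at the endpoint that the theorem asserts.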

The Root Test can be used to determine the radius of convergence of a
power series. The Root Test says that the series $\sum_{n=0}^{\infty} a_n (x-c)^n$ must converge if
$\limsup_{n\to\infty} \sqrt[n]{|a_n| \cdot |x-c|^n} < 1$. Equivalently, $|x-c| < \frac{1}{\limsup_{n\to\infty} \sqrt[n]{|a_n|}}$. Conversely, the
series diverges if $|x-c| > \frac{1}{\limsup_{n\to\infty} \sqrt[n]{|a_n|}}$. Thus, the radius of convergence of the series
must be $R = \frac{1}{\limsup_{n\to\infty} \sqrt[n]{|a_n|}}$, with the conventions that $R = \infty$ when the lim sup is 0
and $R = 0$ when the lim sup is infinite.
If you apply the Ratio Test to $\sum_{n=0}^{\infty} a_n (x-c)^n$, you see that the series will converge
if $\lim_{n\to\infty} \left| \frac{a_{n+1} (x-c)^{n+1}}{a_n (x-c)^n} \right| < 1$. Equivalently, $|x-c| < \lim_{n\to\infty} \frac{|a_n|}{|a_{n+1}|}$. The series diverges if
$|x-c| > \lim_{n\to\infty} \frac{|a_n|}{|a_{n+1}|}$ showing that $R = \lim_{n\to\infty} \frac{|a_n|}{|a_{n+1}|}$. This expression is fine as long
as the limit of the ratio of terms exists, but it is less helpful when it does not.
It is worth considering a few examples.

$\sum_{n=0}^{\infty} n x^n$
The center is $c = 0$, and the radius of convergence is $R = \lim_{n\to\infty} \frac{1}{\sqrt[n]{n}} = \lim_{n\to\infty} \frac{n}{n+1} = 1$.
Clearly, the series diverges at both endpoints $x = -1$ and $x = 1$
by the Limit of Terms Test.

$\sum_{n=1}^{\infty} (-1)^n \frac{(x-1)^n}{n}$
The center is $c = 1$, and the radius of convergence is $R = \lim_{n\to\infty} \frac{1}{\sqrt[n]{1/n}} = \lim_{n\to\infty} \frac{n+1}{n} = 1$.
At the left endpoint $x = 0$ the series is the harmonic series which
diverges to infinity, but at the right endpoint $x = 2$ the series is the alternating
harmonic series which converges conditionally.

$\sum_{n=1}^{\infty} \frac{(x+4)^n}{n^2 5^n}$
The center is $c = -4$, and the radius of convergence is
$R = \lim_{n\to\infty} \frac{1}{\sqrt[n]{\frac{1}{n^2 5^n}}} = \lim_{n\to\infty} \frac{(n+1)^2 5^{n+1}}{n^2 5^n} = 5$. At the right endpoint $x = 1$ the series converges absolutely
which means that the series will also converge at the left endpoint $x = -9$.

$\sum_{n=0}^{\infty} \frac{x^n}{(2n)!}$
The center is $c = 0$, and the radius of convergence is
$R = \lim_{n\to\infty} \frac{1}{\sqrt[n]{\frac{1}{(2n)!}}} = \lim_{n\to\infty} \frac{(2n+2)!}{(2n)!} = \infty$, so this series converges for all real numbers $x$.

$\sum_{n=0}^{\infty} n^n x^n$
The center is $c = 0$, and the radius of convergence is $R = \lim_{n\to\infty} \frac{1}{\sqrt[n]{n^n}} = \lim_{n\to\infty} \frac{n^n}{(n+1)^{n+1}} = 0$.
This series only converges at its center.

$\frac{1}{2^1} x + \frac{2^2}{3^2} x^2 + \frac{1}{2^3} x^3 + \frac{2^4}{3^4} x^4 + \frac{1}{2^5} x^5 + \frac{2^6}{3^6} x^6 + \cdots$
The center is $c = 0$. This is an example of a series where $\lim_{n\to\infty} \frac{1}{\sqrt[n]{a_n}}$ does not exist,
but $\liminf_{n\to\infty} \frac{1}{\sqrt[n]{a_n}} = \frac{3}{2} = R$. Note that $\liminf_{n\to\infty} \frac{a_n}{a_{n+1}} = 0$ and $\limsup_{n\to\infty} \frac{a_n}{a_{n+1}} = \infty$,
neither of which shed any light on the value of $R$. This series diverges at both
endpoints by the Limit of Terms Test.
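The root-test formula for $R$ lends itself to a quick numerical estimate. The sketch below is an added illustration under stated assumptions: the helper names are hypothetical, the coefficients are supplied through $\log |a_n|$ so that large powers do not overflow, and the oscillating sequence ($a_n = (2/3)^n$ for even $n$, $a_n = (1/2)^n$ for odd $n$) is an assumed stand-in for a series whose ratio limit fails to exist.

```python
import math


def root_test_radius(log_abs_coeff, n):
    """Estimate R by 1 / |a_n|^(1/n), computed from log|a_n| to avoid overflow."""
    return math.exp(-log_abs_coeff(n) / n)


def log_a_linear(n):
    """log|a_n| for a_n = n."""
    return math.log(n)


def log_a_quadratic_5(n):
    """log|a_n| for a_n = 1 / (n^2 5^n)."""
    return -2 * math.log(n) - n * math.log(5)


def log_a_oscillating(n):
    """log|a_n| for the assumed sequence (2/3)^n (n even), (1/2)^n (n odd)."""
    return n * math.log(2 / 3) if n % 2 == 0 else n * math.log(1 / 2)


r1 = root_test_radius(log_a_linear, 5000)            # close to 1
r2 = root_test_radius(log_a_quadratic_5, 5000)       # close to 5
even_value = root_test_radius(log_a_oscillating, 5000)   # value along even n
odd_value = root_test_radius(log_a_oscillating, 5001)    # value along odd n
# For this sequence 1/|a_n|^(1/n) is constant along each parity, so the
# smaller of the two values is the lim inf, which is the radius of convergence.
radius = min(even_value, odd_value)
```

For the oscillating sequence the two subsequential values differ, so no single limit exists, and the lim inf (equivalently, the reciprocal of the lim sup of $\sqrt[n]{|a_n|}$) is what determines $R$.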

8.5.3 Differentiability
A function represented by a power series in an interval with positive length is said to
be analytic in that interval. Perhaps the most unusual property of analytic functions
is that they are differentiable, and the derivative of a power series can be found by
differentiating the series term by term to get a new series which converges with the
same radius of convergence as the original series. That is, if $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$
for all $x$ satisfying $|x-c| < R$, then $f'(x) = \sum_{n=1}^{\infty} n \cdot a_n (x-c)^{n-1}$, where the new
series converges for these same values of $x$. It is easy to check that the radius of
convergence of the derivative series $\sum_{n=1}^{\infty} n \cdot a_n (x-c)^{n-1}$ is the same as the original
series. This follows from the fact that
$\limsup_{n\to\infty} \sqrt[n]{n \cdot |a_n|} = \lim_{n\to\infty} \sqrt[n]{n} \cdot \limsup_{n\to\infty} \sqrt[n]{|a_n|} = \frac{1}{R}$,
where $R$ is the radius of convergence of the original series.
It is, therefore, the case that $g(x) = \sum_{n=1}^{\infty} n \cdot a_n (x-c)^{n-1}$ is a power series with the
same radius of convergence as the power series $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$. The question is
whether this new power series is, in fact, the derivative of the original series. That is,
does $g(x) = f'(x)$ hold for all $x$ in the open interval where the two series converge?
This needs to be proved. The proof needs to show that $\lim_{h\to 0} \frac{f(x+h)-f(x)}{h} = g(x)$ for
each $x$ within the interval of convergence of the power series. To construct a proof
of this, you might express the difference $\left| \frac{f(x+h)-f(x)}{h} - g(x) \right|$ in terms of power series
and see if this simplifies to an expression that has a limit of 0 as $h$ approaches 0. The
calculation is simpler if you assume that $c = 0$. Then,

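$$\left| \frac{f(x+h)-f(x)}{h} - g(x) \right|
= \left| \frac{\sum_{n=0}^{\infty} a_n (x+h)^n - \sum_{n=0}^{\infty} a_n x^n}{h} - \sum_{n=1}^{\infty} n \cdot a_n x^{n-1} \right|
= \left| \frac{\sum_{n=0}^{\infty} a_n \sum_{p=0}^{n} \binom{n}{p} h^p x^{n-p} - \sum_{n=0}^{\infty} a_n x^n - \sum_{n=1}^{\infty} h n \cdot a_n x^{n-1}}{h} \right| .$$

A careful accounting of the terms in the numerator shows that all the terms of
$\sum_{n=0}^{\infty} a_n x^n$ and all of the terms of $\sum_{n=1}^{\infty} h n \cdot a_n x^{n-1}$ cancel leaving

$$\left| \frac{\sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^p x^{n-p}}{h} \right|
= |h| \cdot \left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right| .$$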

The factor jhj clearly goes to 0 as h goes to 0, but there is a question about what
happens to the other factor. This infinite sum will not be a problem if it remains
bounded as h gets small. Here is where you can use the fact that power series with
radius of convergence R converge absolutely at points less than a distance R from
the center of the series. Assume that jhj is smaller than some fixed value s > 0. Then
the second factor can be estimated as follows.

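$$\begin{aligned}
\left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right|
&\le \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} |h|^{p-2} |x|^{n-p}
\le \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} s^{p-2} |x|^{n-p} \\
&\le \sum_{n=2}^{\infty} \frac{|a_n|}{s^2} \sum_{p=0}^{n} \binom{n}{p} s^{p} |x|^{n-p}
= \frac{1}{s^2} \sum_{n=2}^{\infty} |a_n| (|x|+s)^n .
\end{aligned}$$

This last expression converges as long as $|x|+s$ is a point where the power series for
$f$ converges absolutely. But if $x$ were chosen so that $|x| < R$, then for any positive $s$
with $s < R - |x|$, this will happen. Because you are free to choose any $s > 0$, you
can choose one less than $R - |x|$ which will ensure that $\left| \frac{f(x+h)-f(x)}{h} - g(x) \right|$ is at most
a fixed multiple of $|h|$ whenever $0 < |h| < s$, so the proof can be completed.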

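PROOF: Suppose the function $f$ is defined by the power series
$f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ which has a positive radius of convergence $R \le \infty$.
Then for all $x$ satisfying $|x-c| < R$, the derivative of $f$ at $x$ is given by
$f'(x) = \sum_{n=1}^{\infty} n \cdot a_n (x-c)^{n-1}$.

Let $\sum_{n=0}^{\infty} a_n (x-c)^n$ be a power series with positive radius of convergence
$R \le \infty$.
The power series for $f$ and its derivative depend on $x-c$ and not on $c$, so
there is no loss of generality to assume that $c = 0$.
Note that $\limsup_{n\to\infty} \sqrt[n]{n \cdot |a_n|} = \lim_{n\to\infty} \sqrt[n]{n} \cdot \limsup_{n\to\infty} \sqrt[n]{|a_n|}$, so the two series
$\sum_{n=0}^{\infty} a_n (x-c)^n$ and $\sum_{n=1}^{\infty} n \cdot a_n (x-c)^{n-1}$ have the same radius of convergence,
so each is absolutely convergent for all $x$ with $|x| < R$.
Let $x$ be chosen with $|x| < R$.
Let $\epsilon > 0$ be given.
If $R < \infty$, let $s = \frac{R-|x|}{2}$, and if $R = \infty$, let $s = 1$.
Because $|x|+s < R$, the series $\sum_{n=0}^{\infty} a_n (|x|+s)^n$ converges absolutely.
Let $\delta = \min\left( s, \frac{\epsilon s^2}{1 + \sum_{n=2}^{\infty} |a_n| (|x|+s)^n} \right) > 0$.
Let $h$ be chosen with $0 < |h| < \delta$.
Then
$$\begin{aligned}
\left| \frac{f(x+h)-f(x)}{h} - \sum_{n=1}^{\infty} n a_n x^{n-1} \right|
&= \left| \frac{\sum_{n=0}^{\infty} a_n (x+h)^n - \sum_{n=0}^{\infty} a_n x^n}{h} - \sum_{n=1}^{\infty} n \cdot a_n x^{n-1} \right| \\
&= \left| \frac{\sum_{n=0}^{\infty} a_n \sum_{p=0}^{n} \binom{n}{p} h^p x^{n-p} - \sum_{n=0}^{\infty} a_n x^n - \sum_{n=1}^{\infty} h n \cdot a_n x^{n-1}}{h} \right|
= \left| \frac{\sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^p x^{n-p}}{h} \right| \\
&= |h| \cdot \left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right|
\le |h| \cdot \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} |h|^{p-2} |x|^{n-p} \\
&\le |h| \cdot \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} s^{p-2} |x|^{n-p}
\le |h| \cdot \sum_{n=2}^{\infty} \frac{|a_n|}{s^2} \sum_{p=0}^{n} \binom{n}{p} s^{p} |x|^{n-p} \\
&= |h| \cdot \frac{1}{s^2} \sum_{n=2}^{\infty} |a_n| (|x|+s)^n < \epsilon .
\end{aligned}$$
Thus, the derivative of $f$ at $x$ is $\sum_{n=1}^{\infty} n a_n x^{n-1}$.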
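The theorem can be sanity-checked numerically on a series whose sum is known in closed form. The following sketch is an added illustration, not the author's method: it assumes the geometric series $f(x) = \sum x^n = \frac{1}{1-x}$, so the term-by-term derivative should approach $\frac{1}{(1-x)^2}$, and the function names and truncation length are arbitrary.

```python
def geometric_partial(x, terms=200):
    """Partial sum of f(x) = sum_{n>=0} x^n (radius of convergence 1)."""
    return sum(x ** n for n in range(terms))


def termwise_derivative(x, terms=200):
    """Partial sum of the term-by-term derivative sum_{n>=1} n x^(n-1)."""
    return sum(n * x ** (n - 1) for n in range(1, terms))


x = 0.5
g = termwise_derivative(x)   # should be close to 1 / (1 - x)^2 = 4

# |(f(x+h) - f(x)) / h - g(x)| for shrinking h; the proof bounds this
# quantity by a constant multiple of |h|.
quotient_gaps = [
    abs((geometric_partial(x + h) - geometric_partial(x)) / h - g)
    for h in (1e-1, 1e-2, 1e-3)
]
```

The gaps decrease roughly in proportion to $h$, matching the factor $|h|$ that the proof pulls out of the difference.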

An immediate consequence of this theorem is that not only can you obtain the
first derivative of $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ by differentiating term by term, but you
can also get all the higher derivatives of $f$ by repeating the process. This follows
by induction because, if the $m$th derivative of $f$ is equal to the series formed by the
$m$th derivatives of the terms of the series for $f$, and if that series has the same radius
of convergence as the series for $f$, then the theorem says that the $(m+1)$st derivative of
$f$ can be obtained by differentiating the terms of the series for the $m$th derivative of
$f$, and the radius of convergence of that series will remain the same. Moreover, one
can find an antiderivative for $f$ by integrating each term of the series for $f$. That is, if
$f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ for all $x$ with $|x-c| < R$, then the series $\sum_{n=0}^{\infty} \frac{a_n}{n+1} (x-c)^{n+1}$ will
have the same radius of convergence as the series for $f$, and the theorem says that
the derivative of the new series is equal to $f$. It is important to note that if a function
is analytic by virtue of having a power series representation in an open interval of
radius $R$ around $c$, then that function is infinitely differentiable in that interval.
These results make it very simple to derive new series from previously known
series. For example, you already know that $\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n$ for all $x$ with $|x| < 1$.
From this one can get

by substituting $-x$ for $x$ in the series for $\frac{1}{1-x}$, $\frac{1}{1+x} = \sum_{n=0}^{\infty} (-1)^n x^n$.
by substituting $x^2$ for $x$ in the series for $\frac{1}{1+x}$, $\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}$.
by differentiating the series for $\frac{1}{1-x}$, $\frac{1}{(1-x)^2} = \sum_{n=1}^{\infty} n x^{n-1}$.
by integrating the series for $\frac{1}{1+x}$ and noting that $\ln 1 = 0$,
$\ln(1+x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}$. In particular, by Abel's Theorem,
$\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$.
by integrating the series for $\frac{1}{1+x^2}$ and noting that $\tan^{-1} 0 = 0$,
$\tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1}$. In particular, by Abel's Theorem, $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$.
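The series obtained by integration can be compared against library values. This is a small added sketch, assuming only Python's standard `math` module; the function names and the choice of 200 terms are illustrative, not taken from the text.

```python
import math


def ln1p_series(x, terms=200):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, terms + 1))


def arctan_series(x, terms=200):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n+1) / (2n+1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


# Well inside the interval of convergence the partial sums settle quickly.
ln_gap = abs(ln1p_series(0.5) - math.log(1.5))
atan_gap = abs(arctan_series(0.5) - math.atan(0.5))

# At the endpoint x = 1 the series still converges, but only slowly.
leibniz_gap = abs(4 * arctan_series(1.0, terms=1000) - math.pi)
```

The contrast between the interior point and the endpoint reflects that convergence at $x = 1$ is only conditional.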

8.5.4 Taylor's Theorem

If $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ for all $x$ with $|x-c| < R$, then $f(c) = a_0$, the
constant term of the series for $f$. Finding the $m$th derivative of the series for $f$ and
evaluating it at the center of the series, $c$, gives that $f^{(m)}(c) = m!\, a_m$. So, for all
integers $m \ge 0$, $a_m = \frac{f^{(m)}(c)}{m!}$. This gives a straightforward way to generate the
power series representing any analytic function. Moreover, even if $f$ is not infinitely
differentiable, if it is $m$ times differentiable, one can generate the $m$th degree Taylor
polynomial for $f$ centered at $c$ given by $g(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x-c)^n$. Then $g$ is an $m$th
degree polynomial that is equal to $f$ at $c$, and all of its derivatives up to order $m$ agree
with the corresponding derivatives of $f$ at $c$. In particular, the first degree Taylor
polynomial is just the familiar linear approximation to $f$ given by the line tangent
to the graph of $f$ at $c$. If $f$ is $m$-times differentiable at $c$, one can generate the $m$th
degree Taylor polynomial, $g(x)$, for $f$ centered at $c$, but this does not say whether
the value of $g(x)$ is even remotely related to the value of $f(x)$ when $x$ is different
from $c$. This issue is what is addressed by Taylor's Theorem which states that
$f(x) = g(x) + R_m(x)$ for some remainder function $R_m(x)$. Depending on various
characteristics of $f$, one can show that $R_m(x)$ is suitably small so that $g(x)$ is a good
approximation for $f(x)$.
There are many forms of Taylor's Theorem that express the remainder term,
$R_m(x)$, in different ways. The one discussed here is sometimes called Lagrange's
form. It says that if $f$ is $m+1$ times differentiable on the interval between $c$ and
$x$, then the difference between $f(x)$ and the $m$th degree Taylor polynomial for $f$
centered at $c$ can be expressed in terms of $f^{(m+1)}(\xi)$ for some $\xi$ strictly between $c$
and $x$. Its proof follows easily from the following generalization of Rolle's Theorem.
PROOF (Higher Order Rolle's Theorem): Let $f$ be an $m+1$ times
differentiable function on the open interval from $a$ to $b$ with $a \ne b$,
let $f$ be continuous on the closed interval from $a$ to $b$, and suppose
that $0 = f(a) = f'(a) = f''(a) = \cdots = f^{(m)}(a) = f(b)$. Then there is an $x$
strictly between $a$ and $b$ where $f^{(m+1)}(x) = 0$.
Let $f$ be an $m+1$ times differentiable function on the open interval from $a$
to $b$ with $a \ne b$, let $f$ be continuous on the closed interval from $a$ to $b$, and
suppose that $0 = f(a) = f'(a) = f''(a) = \cdots = f^{(m)}(a) = f(b)$.
Since $f(a) = f(b)$, $f$ is continuous on the closed interval between $a$ and $b$,
and $f$ is differentiable between $a$ and $b$, then by Rolle's Theorem there is an
$x_1$ strictly between $a$ and $b$ such that $f'(x_1) = 0$.
Assume for some $k$ with $1 \le k \le m$, that $f^{(k)}(a) = f^{(k)}(x_k) = 0$, $f^{(k)}$ is
continuous on the closed interval between $a$ and $x_k$, and $f^{(k)}$ is differentiable
between $a$ and $x_k$. Then by Rolle's Theorem there is an $x_{k+1}$ strictly between
$a$ and $x_k$ such that $f^{(k+1)}(x_{k+1}) = 0$.
Thus, by mathematical induction, there is an $x = x_{m+1}$ strictly between $a$
and $b$ such that $f^{(m+1)}(x) = 0$ completing the proof.

This Higher Order Rolle's Theorem can now be used to prove Taylor's Theorem.
If the function $f$ is $m+1$ times differentiable between $c$ and $x$, then $f$ has an
$m$th degree Taylor polynomial $g(y) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (y-c)^n$. Notice that the difference
$f(y) - g(y)$ has the property that this function and its first $m$ derivatives are all equal
to 0 at $c$. The remainder term $R_m(x)$ will include a factor of $f^{(m+1)}$ evaluated at
some $\xi$ between $c$ and $x$, and that value of $\xi$ will come from an application of Rolle's
Theorem. Of course, to apply Rolle's Theorem, $f(y) - g(y)$ would need to be 0 at $y = x$.
One needs to add a term to $f(y) - g(y)$ which will not affect the function and its
derivatives at $c$ but will make the function equal to 0 at $x$. The term that accomplishes
this is $-\left( f(x) - g(x) \right) \frac{(y-c)^{m+1}}{(x-c)^{m+1}}$ since this term equals $-\left( f(x) - g(x) \right)$ at $x$, and it and its
first $m$ derivatives are equal to 0 at $c$. But now Rolle's Theorem can be applied to the
function $h(y) = f(y) - g(y) - \left( f(x) - g(x) \right) \frac{(y-c)^{m+1}}{(x-c)^{m+1}}$ to find a value of $\xi$ between $c$ and $x$
such that $h^{(m+1)}(\xi) = 0$, or $0 = f^{(m+1)}(\xi) - g^{(m+1)}(\xi) - \left( f(x) - g(x) \right) \frac{(m+1)!}{(x-c)^{m+1}}$. Noting
that the $(m+1)$st derivative of $g$ is identically 0, because $g$ is a polynomial of degree at
most $m$, gives $f(x) = g(x) + f^{(m+1)}(\xi) \frac{(x-c)^{m+1}}{(m+1)!}$ as desired.

PROOF (Taylor's Theorem): Let $f$ be an $m+1$ times differentiable
function on the open interval from $c$ to $x$ with $c \ne x$, and let $f$ be
continuous on the closed interval from $c$ to $x$. Then there is a $\xi$ between
$c$ and $x$ such that $f(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x-c)^n + f^{(m+1)}(\xi) \frac{(x-c)^{m+1}}{(m+1)!}$.

Let $f$ be an $m+1$ times differentiable function on the open interval from $c$
to $x$ with $c \ne x$, and let $f$ be continuous on the closed interval from $c$ to $x$.
Let $g(y) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (y-c)^n$, and define the function
$h(y) = f(y) - g(y) - \left( f(x) - g(x) \right) \frac{(y-c)^{m+1}}{(x-c)^{m+1}}$.
Then $0 = h(c) = h'(c) = h''(c) = \cdots = h^{(m)}(c) = h(x)$.
Thus, by the Higher Order Rolle's Theorem, there exists $\xi$ between $c$ and $x$
such that $h^{(m+1)}(\xi) = 0$.
This implies that $f(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x-c)^n + f^{(m+1)}(\xi) \frac{(x-c)^{m+1}}{(m+1)!}$ which
completes the proof.


For example, the cosine function is analytic, and its power series which converges
for all real numbers is $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$. So, how accurate
of an approximation is $1 - \frac{x^2}{2!} + \frac{x^4}{4!}$ at $x = 2$? It is clear that the given Taylor
polynomial includes the terms for $n = 0$, 1, 2, 3, and 4, but it is beneficial to note
that it also includes the term for $n = 5$ which is 0. Therefore, Taylor's Theorem
says that the remainder at $x = 2$ is $f^{(6)}(\xi) \frac{(2-0)^6}{6!} = -\cos(\xi) \frac{2^6}{720}$. Since the cosine
function is bounded by 1, the error introduced by using the Taylor polynomial as an
approximation to $\cos 2$ is at most $\frac{2^6}{720} \approx 0.09$. In fact, at $x = 2$ the polynomial is $-\frac{1}{3}$
while $\cos 2 \approx -0.416146$ with a difference of $0.08281$.
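The numbers in this example are easy to reproduce. The sketch below is an added illustration using only Python's standard `math` module; `cos_taylor` is a hypothetical helper name, and the bound computed is the Lagrange remainder bound $\frac{|x|^6}{6!}$ that follows from $|\cos \xi| \le 1$.

```python
import math


def cos_taylor(x, degree):
    """Taylor polynomial of cos centered at 0, using even powers up to `degree`."""
    return sum(
        (-1) ** k * x ** (2 * k) / math.factorial(2 * k)
        for k in range(degree // 2 + 1)
    )


x = 2.0
approx = cos_taylor(x, 5)              # 1 - x^2/2! + x^4/4!  (the x^5 term is 0)
bound = x ** 6 / math.factorial(6)     # Lagrange bound with |cos| <= 1
actual_error = abs(approx - math.cos(x))
```

Treating the polynomial as degree 5 rather than degree 4 is what allows the sixth derivative, and hence the smaller bound, to be used.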

8.5.5 Arithmetic of Power Series


Given two analytic functions each represented by power series with common center
$c$ and positive radii of convergence, it is straightforward to find the power series
representing the sum, difference, product, and quotient of these series. Suppose
two functions have power series $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ and $g(x) = \sum_{n=0}^{\infty} b_n (x-c)^n$
which both converge when $|x-c| < R$ for some $R > 0$. Then theorems about
the sum and difference of series of real numbers ensure that the sum and difference,
$(f+g)(x) = \sum_{n=0}^{\infty} (a_n + b_n)(x-c)^n$ and $(f-g)(x) = \sum_{n=0}^{\infty} (a_n - b_n)(x-c)^n$, both converge
when $|x-c| < R$. Of course, it is possible that the new series converges in an even
larger interval. For example, the series $1 + x + x^2 + x^3 + \cdots$ and $2 - x - x^2 - x^3 - \cdots$
both have radius of convergence equal to 1, but the sum of the two series is the
constant function 3, and its power series converges for all $x$.
The product of two power series can be found by using the Cauchy product of
the two series. If $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ has radius of convergence $R_1 > 0$ and
$g(x) = \sum_{n=0}^{\infty} b_n (x-c)^n$ has radius of convergence $R_2 > 0$, then both series converge
absolutely when $|x-c| < \min(R_1, R_2)$ implying that their Cauchy product,
$(fg)(x) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} a_p (x-c)^p\, b_{n-p} (x-c)^{n-p} \right) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} a_p b_{n-p} \right) (x-c)^n$, converges for
$|x-c| < \min(R_1, R_2)$. Again, the radius of convergence can be larger as it is with
the product of $\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$ and $1 - x$ which converges for all $x$.
If $f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n$ has radius of convergence $R_1 > 0$ and $g(x) = \sum_{n=0}^{\infty} b_n (x-c)^n$
has radius of convergence $R_2 > 0$, and $g(c)$ is not zero, then one can find the power
series for the quotient $h(x) = \frac{f(x)}{g(x)}$ centered at $c$ by working backwards from the
Cauchy product of $h$ and $g$. That is, if you assume that $h(x) = \sum_{n=0}^{\infty} q_n (x-c)^n$, then
$f(x) = \sum_{n=0}^{\infty} a_n (x-c)^n = h(x) g(x) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} b_p q_{n-p} \right) (x-c)^n$. Because of the
assumption that $g(c) \ne 0$, it follows that $b_0 \ne 0$. Then equating like terms in the
product gives the sequence of equations
$a_0 = b_0 q_0$
$a_1 = b_0 q_1 + b_1 q_0$
$a_2 = b_0 q_2 + b_1 q_1 + b_2 q_0$
$a_3 = b_0 q_3 + b_1 q_2 + b_2 q_1 + b_3 q_0$
and so forth. The first equation can be solved to give $q_0$. Then the second equation
can be solved to give $q_1$, and so forth. The fact that $g(c) \ne 0$ says that the coefficient
$b_0 \ne 0$ which allows the equation for $a_m$ to be solved for $q_m$ for each $m \ge 0$. Often
this results in a recursive formula for $q_n$. For example, it is known that
$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$, so you can find the series centered at 0 for the quotient
$\frac{\ln(1+x)}{1+x} = \sum_{n=0}^{\infty} q_n x^n$ by writing $(1+x) \sum_{n=0}^{\infty} q_n x^n = \ln(1+x)$, giving $0 = 1 \cdot q_0$ so $q_0 = 0$. Then
for each $n > 0$, $\frac{(-1)^{n-1}}{n} = q_n + q_{n-1}$, so $q_1 = 1$, $q_2 = -\frac{3}{2}$, $q_3 = \frac{11}{6}$, and so forth
with $q_n = \frac{(-1)^{n-1}}{n} - q_{n-1}$.
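The triangular system of equations for the $q_n$ can be solved mechanically. This is an added sketch, not the author's procedure verbatim: `series_quotient` is a hypothetical name, exact rational arithmetic from Python's standard `fractions` module is an implementation choice, and the example computes the first few coefficients of $\frac{\ln(1+x)}{1+x}$.

```python
from fractions import Fraction


def series_quotient(a, b):
    """Solve a_n = sum_{p=0}^{n} b_p q_{n-p} for q_n, one n at a time.

    Requires b[0] != 0, which corresponds to g(c) != 0.
    """
    if b[0] == 0:
        raise ValueError("b[0] must be nonzero")
    q = []
    for n in range(min(len(a), len(b))):
        # Terms of the n-th equation that involve already-known q values.
        known = sum(b[p] * q[n - p] for p in range(1, n + 1))
        q.append((a[n] - known) / b[0])
    return q


terms = 5
# Coefficients of ln(1+x) = x - x^2/2 + x^3/3 - x^4/4 + ...
ln_coeffs = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, terms)]
# Coefficients of 1 + x.
one_plus_x = [Fraction(1), Fraction(1)] + [Fraction(0)] * (terms - 2)
q = series_quotient(ln_coeffs, one_plus_x)
```

Here `q` holds $q_0, \ldots, q_4$; each entry satisfies $q_n + q_{n-1} = \frac{(-1)^{n-1}}{n}$, the recursion obtained by equating coefficients.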

8.5.6 Exercises
Determine for which $x$ the following power series converge.
1. $\sum_{n=0}^{\infty} \frac{4^n (x+4)^n}{3^n + 5^n}$
2. $\sum_{n=0}^{\infty} \frac{n^5 (x-2)^n}{8^n}$
3. $\sum_{n=0}^{\infty} \frac{n x^n}{(2n)!}$
4. $\sum_{n=0}^{\infty} \frac{n x^n}{n^n}$

Determine power series representations for the following functions centered at
$c = 0$.
5. $e^x$
6. $e^{2x}$
7. $\sin(3x)$
8. $\frac{\sin x}{1+x}$
9. $\ln(\cos x)$
10. Using the fact that $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$ and $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$,
show that the power series satisfy the identity $\sin^2 x + \cos^2 x = 1$.
11. Find the first four nonzero terms of the series for $\tan x$ centered at 0 by finding
the quotient of the series for $\sin x$ and the series for $\cos x$. Then check your work
by generating those terms using Taylor's Theorem.
12. Let $f(x) = \begin{cases} e^{-\frac{1}{x^2}} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$. Prove that for each positive integer $n$, the
derivative $f^{(n)}(0) = 0$. Then show that the $m$th degree Taylor polynomial for
$f$ centered at 0 is $p(x) = 0$, and the remainder term is $R_m(x) = f(x)$.

8.6 Fundamental Question of Analysis


In a sense Analysis can be thought about as the study of limiting processes. So
far this book has discussed limits of functions and sequences, the continuity of
functions, differentiation of functions, integration of functions, the convergence of
infinite series, and now the convergence of sequences and series of functions. In each
of these studies one fundamental question recurs: when is it valid to interchange the
order of limiting processes? For example, the question of continuity is a question of
whether $\lim_{x\to a} f(x)$ is the same as $f\left( \lim_{x\to a} x \right)$. The Fundamental Theorem of Calculus
establishes when the derivative of an integral is equal to the integral of a derivative.
The discussion of convergence of sequences of functions included questions about
when $\lim_{n\to\infty} \int_a^b f_n(x)\, dx$ is the same as $\int_a^b \lim_{n\to\infty} f_n(x)\, dx$. Power series give an example
where $\frac{d}{dx} \sum_{n=0}^{\infty} a_n (x-c)^n$ is the same as $\sum_{n=0}^{\infty} \frac{d}{dx} a_n (x-c)^n$. Abel's Theorem discusses
whether it is valid to write $\lim_{x\to R} \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \lim_{x\to R} a_n x^n$. Thus, the fundamental
question of Analysis asks when can you interchange the order of two limiting
processes? It is instructive to watch for other occurrences of this question as your
study of Analysis continues.

Chapter 9

Topology of the Real Line

9.1 Interior, Exterior, and Boundary


In the field of Analysis the concepts of the limit and the continuity of a function $f$
at a point $x = a$ are defined in terms of open intervals. For example, the condition
$|f(x) - L| < \epsilon$ says that $f(x)$ is in an open interval centered at $L$, and the condition
$|x - a| < \delta$ says that $x$ is in an open interval centered at $a$. These intervals are
specified in terms of the distance between $x$ and $y$ given by $|x - y|$. Topology is
a branch of Mathematics where these concepts are extended to spaces where one
can discuss intervals without having to rely on a distance formula. As a result the
concepts of limit and continuity can be extended to such spaces, and it can be shown
that many of the properties associated with continuous functions defined on the
real line are shared by continuous functions defined on these more general spaces.
Although the theorems discussed in this chapter are presented in the context of sets
on the real line, virtually all of the theorems are true in the more general context of
any topological space. Many of the techniques used to prove these theorems are the
same techniques one would use for a general topological space, and, therefore, this
chapter can be thought of as an introduction to the field of Topology even though
general topological spaces are not discussed here.
A good way to begin is by taking a set $S \subseteq \mathbb{R}$ and identifying the points $s \in S$
that are not only inside of $S$ but are, in a sense, completely surrounded by points in
$S$. The point $s$ is said to be in $\operatorname{int}(S)$, called the interior of $S$, if there is an $\epsilon > 0$ such
that all $x$ within $\epsilon$ of $s$ are in $S$, that is, $|x - s| < \epsilon$ implies $x \in S$. You can think of
the interior of $S$ as those points which are a positive distance from the complement
of $S$, $S^c = \mathbb{R} \setminus S$. For example, if $S$ is the closed interval $[0, 4]$, then the open interval
$(0, 4)$ is the interior of $S$. This is because if $s \in (0, 4)$ and $\epsilon = \min(s, 4 - s)$, then
all $x$ satisfying $|x - s| < \epsilon$ are elements of $S$. The two endpoints of the interval
$[0, 4]$, 0 and 4, do not have this property. No open interval containing either 0 or
4 is completely contained inside of $S$. Clearly, if $x > 4$ or $x < 0$, then $x \notin S$, so
$x \notin \operatorname{int}(S)$. Thus, $\operatorname{int}(S) = (0, 4)$. The interior of the set $\mathbb{Q}$ of rational numbers is
the empty set because all nonempty open intervals contain irrational numbers, so
no nonempty open interval is contained in $\mathbb{Q}$. One sometimes says that $\mathbb{Q}$ has no
interior even though it does have an interior; it is just that its interior is the empty
set.

Springer International Publishing Switzerland 2016
J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_9

Fig. 9.1 The point x is in the interior of S. The point y is on the boundary of S. The point z
is in the exterior of S.

Then $\operatorname{ext}(S)$, called the exterior of $S$, is just defined to be the interior of $S^c$, that
is, $s \in \operatorname{ext}(S)$ if there is an $\epsilon > 0$ such that all $x$ satisfying $|x - s| < \epsilon$ are in
$S^c = \mathbb{R} \setminus S$. The exterior of $S$ is the set of points that are completely surrounded by
points outside of $S$. You can think of the exterior of $S$ as the collection of points
bounded away from $S$, that is, the points that are a positive distance from $S$. The
exterior of the set $[0, 4]$ is the union of two open intervals, $(-\infty, 0) \cup (4, \infty)$. The
exterior of the set $\mathbb{Q}$ is the empty set.
If a point $s$ is in neither $\operatorname{int}(S)$ nor $\operatorname{ext}(S)$, then it must be that no open interval
containing $s$ is completely inside of $S$ and no open interval containing $s$ is completely
outside of $S$. Thus, for every $\epsilon > 0$, the interval $(s - \epsilon, s + \epsilon)$ contains at least one
element of $S$ and at least one element of $S^c$. Such points are said to be in $\partial S$, called
the boundary of $S$ (Fig. 9.1). Note that the symbol used for boundary is $\partial$ which
is the same symbol used for partial derivatives in Calculus. There are connections
between derivatives and boundaries that justify the use of the same symbol for both
concepts. The boundary of $[0, 4]$ is the set $\{0, 4\}$. The boundary of $\mathbb{Q}$ is the entire
real line, $\mathbb{R}$.
It is important to note that for any set $S \subseteq \mathbb{R}$, the three sets $\operatorname{int}(S)$, $\operatorname{ext}(S)$, and $\partial S$
partition $\mathbb{R}$, that is, each real number $x$ is in exactly one of these three sets. A proof
of this fact must show two things about a set $S$: that $\mathbb{R} = \operatorname{int}(S) \cup \operatorname{ext}(S) \cup \partial S$, and
that no point $x$ belongs to more than one of these sets. To show that $\mathbb{R}$ is a union of
the three sets, you would take an arbitrary $x \in \mathbb{R}$ and show that it is in at least one of
these sets. One way to show that a point must be one of three things is to assume that
it is not one of the first two, and then prove that it must be the third. In this case, you
can assume that a point $x \in \mathbb{R}$ is not in $\operatorname{int}(S)$ or in $\operatorname{ext}(S)$. If $x$ is not in $\operatorname{int}(S)$, then
for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained in $S$, and if $x$ is
not in $\operatorname{ext}(S)$, then for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained
in $S^c$. The only alternative is that if $x$ is in neither $\operatorname{int}(S)$ nor $\operatorname{ext}(S)$, then for every
$\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains points in both $S$ and its complement.
This means that $x$ is in $\partial S$ implying that $x$ must be in at least one of $\operatorname{int}(S)$, $\operatorname{ext}(S)$,
or $\partial S$. To show that the three sets are disjoint, show that if $x$ belongs to one of the
three sets, it cannot belong to either of the other two sets. These inferences follow
directly from the definitions of the sets.
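PROOF: For every set $S \subseteq \mathbb{R}$, $\mathbb{R} = \operatorname{int}(S) \cup \operatorname{ext}(S) \cup \partial S$ and the three sets
$\operatorname{int}(S)$, $\operatorname{ext}(S)$, and $\partial S$ are mutually disjoint.
Let $S \subseteq \mathbb{R}$.
Assume that $x$ is a real number that is not a member of $\operatorname{int}(S)$ or $\operatorname{ext}(S)$.
Then, because $x \notin \operatorname{int}(S)$, for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is
not contained in $S$, so it contains points of $S^c$.
And because $x \notin \operatorname{ext}(S)$, for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is
not contained in $S^c$, so it contains points of $S$.
It follows that for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains
points in $S$ and points in $S^c$.
Thus, by the definition of boundary, $x \in \partial S$, and this shows that $x$ must be
in at least one of the three sets, $\operatorname{int}(S)$, $\operatorname{ext}(S)$, or $\partial S$.
If $x \in \operatorname{int}(S)$, then there is an $\epsilon > 0$ such that the open interval
$(x - \epsilon, x + \epsilon) \subseteq S$.
But then $x \in S$, so $x \notin \operatorname{ext}(S)$, and $(x - \epsilon, x + \epsilon) \subseteq S$ shows that $x \notin \partial S$.
Similarly, if $x \in \operatorname{ext}(S)$, then it cannot be in $\operatorname{int}(S)$ or $\partial S$.
Thus, no $x \in \mathbb{R}$ is a member of more than one of the three sets which
completes the proof.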
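For a single closed interval the partition can be made completely explicit. The sketch below is an added illustration and is far narrower than the theorem: it assumes $S = [a, b]$ with $a < b$, and the function name `classify` is hypothetical.

```python
def classify(x, a, b):
    """Place x in exactly one of int(S), ext(S), boundary(S) for S = [a, b], a < b."""
    if a < x < b:
        return "interior"   # some open interval around x lies inside S
    if x < a or x > b:
        return "exterior"   # some open interval around x lies inside the complement
    return "boundary"       # x == a or x == b: every interval meets S and its complement


labels = {x: classify(x, 0, 4) for x in (-1, 0, 2, 4, 5)}
```

Because the three branches are mutually exclusive and exhaustive, every real number receives exactly one label, mirroring the two halves of the proof.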
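There are many results that follow directly from the definitions of interior,
exterior, and boundary. For example, if $S$ and $T$ are any subsets of $\mathbb{R}$, then

$\operatorname{int}(\operatorname{int}(S)) = \operatorname{int}(S)$.
$\operatorname{int}(\operatorname{ext}(S)) = \operatorname{ext}(S)$.
$\operatorname{int}(S) \subseteq \operatorname{ext}(\operatorname{ext}(S))$.
$\operatorname{ext}(S) \subseteq \operatorname{ext}(\operatorname{int}(S))$.
$\partial(\partial S) \subseteq \partial S$.
$\partial(\operatorname{int}(S)) \subseteq \partial S$.
$\partial(\operatorname{ext}(S)) \subseteq \partial S$.
$\partial(S) = \partial(S^c)$.
$\operatorname{int}(S) \cup \operatorname{int}(T) \subseteq \operatorname{int}(S \cup T)$.
$\operatorname{ext}(S \cup T) \subseteq \operatorname{ext}(S) \cap \operatorname{ext}(T)$.
$\operatorname{int}(S \cap T) = \operatorname{int}(S) \cap \operatorname{int}(T)$.
$\partial(S \cup T) \subseteq \partial S \cup \partial T$.
if $S \subseteq T$, then $\operatorname{int}(S) \subseteq \operatorname{int}(T)$.
if $S \subseteq T$, then $\operatorname{ext}(T) \subseteq \operatorname{ext}(S)$.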

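Each of these results is a statement about either two sets being equal to each other
or one set being a subset of another. Thus, one would prove these results using the
techniques discussed in Chap. 2 for proving subset and set equality statements. For
example, how would you write a proof that for any set $S$, $\operatorname{int}(\operatorname{int}(S)) = \operatorname{int}(S)$?
This would be a proof that two sets are equal, so the proof would consist of two
parts: showing $\operatorname{int}(\operatorname{int}(S)) \subseteq \operatorname{int}(S)$ and showing $\operatorname{int}(S) \subseteq \operatorname{int}(\operatorname{int}(S))$. The fact that
$\operatorname{int}(\operatorname{int}(S)) \subseteq \operatorname{int}(S)$ is just a consequence of the definition of interior. For any set
$T$, $\operatorname{int}(T) \subseteq T$, so certainly $\operatorname{int}(\operatorname{int}(S)) \subseteq \operatorname{int}(S)$. Showing that $\operatorname{int}(S) \subseteq \operatorname{int}(\operatorname{int}(S))$
is showing that one set is a subset of another. So, you would let $x$ be an element of
$\operatorname{int}(S)$, and then show that $x$ is also an element of $\operatorname{int}(\operatorname{int}(S))$. By the definition of
interior, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$. Thus, you
need to show that $(x - \epsilon, x + \epsilon)$ is contained in $\operatorname{int}(S)$. That is, each $y \in (x - \epsilon, x + \epsilon)$
must be in the interior of $S$. But it is easy to find an open interval centered at $y$ that
is contained in $(x - \epsilon, x + \epsilon)$. Just let $\delta = \min(y - (x - \epsilon), x + \epsilon - y) > 0$ because
then $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon)$. This shows each point of $(x - \epsilon, x + \epsilon)$ is in
$\operatorname{int}(S)$ which completes the proof.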
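PROOF: For every set $S \subseteq \mathbb{R}$, $\operatorname{int}(\operatorname{int}(S)) = \operatorname{int}(S)$.

Let $S \subseteq \mathbb{R}$.
For any set $T$, $\operatorname{int}(T) \subseteq T$, so $\operatorname{int}(\operatorname{int}(S)) \subseteq \operatorname{int}(S)$.
So let $x \in \operatorname{int}(S)$.
By the definition of interior, there is an $\varepsilon > 0$ such that the open interval
$(x - \varepsilon, x + \varepsilon)$ is contained in $S$.
Let $y \in (x - \varepsilon, x + \varepsilon)$, and let $\delta = \min(y - (x - \varepsilon),\, x + \varepsilon - y) > 0$.
Then $(y - \delta, y + \delta) \subseteq (x - \varepsilon, x + \varepsilon) \subseteq S$.
This shows that $(x - \varepsilon, x + \varepsilon) \subseteq \operatorname{int}(S)$ implying that $x$ is in $\operatorname{int}(\operatorname{int}(S))$.
This proves that $\operatorname{int}(S) \subseteq \operatorname{int}(\operatorname{int}(S))$ and completes the proof of the
theorem.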
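
The proof above asserts without comment that the chosen radius keeps the small interval inside the large one. A short check of that step, offered only as a sketch and using the symbols $x$, $y$, $\varepsilon$, and $\delta$ from the proof, could be written as follows.

```latex
% Why \delta = \min(y - (x - \varepsilon),\; x + \varepsilon - y) works.
% Because x - \varepsilon < y < x + \varepsilon, both quantities inside
% the minimum are positive, so \delta > 0.
\begin{align*}
y - \delta &\ge y - \bigl(y - (x - \varepsilon)\bigr) = x - \varepsilon, \\
y + \delta &\le y + \bigl(x + \varepsilon - y\bigr) = x + \varepsilon.
\end{align*}
% Hence (y - \delta, y + \delta) \subseteq (x - \varepsilon, x + \varepsilon).
```

The same choice of $\delta$ can be reused whenever a proof needs an open interval around $y$ that fits inside a given open interval around $x$.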

For a more difficult challenge, consider writing a proof that for any set $S$, $\partial(\partial S) \subseteq \partial S$
which, in words, says that the boundary of the boundary of a set is contained in the
boundary of the set. For example, let $S$ be the set of rational numbers in the interval
$[0, 4]$. You should prove to yourself that the boundary of this set is the entire interval
$[0, 4]$. The boundary of that interval is just $\{0, 4\}$ which indeed is contained in
$\partial S = [0, 4]$. To show that $\partial(\partial S)$ is a subset of $\partial S$, you would take an arbitrary
point $x \in \partial(\partial S)$ and show that it is in $\partial S$. So what do you know if $x \in \partial(\partial S)$? The
only tool you have at your disposal here is the definition of the boundary of a set,
so you would proceed to use that definition. It says that for every $\varepsilon > 0$ the open
interval $(x - \varepsilon, x + \varepsilon)$ contains elements of $\partial S$ and elements of the complement
of $\partial S$. You want to show that $x$ is in $\partial S$, so you would need to show that the open
interval $(x - \varepsilon, x + \varepsilon)$ contains elements of $S$ and elements of $S^c$. Well, what is the
consequence of saying that the open interval $(x - \varepsilon, x + \varepsilon)$ contains elements of $\partial S$? It
must mean that there is a $y \in (x - \varepsilon, x + \varepsilon)$ such that $y \in \partial S$. What does it mean for $y$
to be in $\partial S$? It means that for every $\delta > 0$, the interval $(y - \delta, y + \delta)$ contains elements
of $S$ and elements of $S^c$. But this is sufficient if $(y - \delta, y + \delta) \subseteq (x - \varepsilon, x + \varepsilon)$ because
that would put elements of both $S$ and $S^c$ in $(x - \varepsilon, x + \varepsilon)$. This can be arranged by
selecting $\delta$ small enough (Fig. 9.2).

Fig. 9.2 $x$ in $\partial(\partial S)$, $y$ in $\partial S$
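PROOF: For every set $S \subseteq \mathbb{R}$, $\partial(\partial S) \subseteq \partial S$.
Let $S \subseteq \mathbb{R}$.
Let $x \in \partial(\partial S)$.
Then by the definition of boundary, for every $\varepsilon > 0$, the open interval
$(x - \varepsilon, x + \varepsilon)$ contains points of $\partial S$ and points of the complement of $\partial S$.
Let $\varepsilon > 0$ be given, and let $y \in (x - \varepsilon, x + \varepsilon)$ be such that $y \in \partial S$.
Let $\delta = \min(y - (x - \varepsilon),\, x + \varepsilon - y) > 0$ so that the open interval $(y - \delta, y + \delta) \subseteq
(x - \varepsilon, x + \varepsilon)$.
By the definition of boundary, the interval $(y - \delta, y + \delta)$ contains an element
of $S$ and an element of $S^c$.
But $(y - \delta, y + \delta) \subseteq (x - \varepsilon, x + \varepsilon)$ shows that $(x - \varepsilon, x + \varepsilon)$ contains an
element of $S$ and an element of $S^c$, so $x$ is in $\partial S$ which completes the proof.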
As a third example, consider proving that for any two sets $S$ and $T$,
$\operatorname{int}(S) \cup \operatorname{int}(T) \subseteq \operatorname{int}(S \cup T)$. Again, this is proving that one set is a subset of a
second set, so your proof would start by selecting an arbitrary element of the first
set and then proceed to show that that element belongs to the second set. Here the
first set is $\operatorname{int}(S) \cup \operatorname{int}(T)$. If you select an $x$ from this set, all you know about $x$ is that
it is in the union of the two sets $\operatorname{int}(S)$ and $\operatorname{int}(T)$. So, the only tool you can use is
the definition of union to say that $x$ must be either a member of $\operatorname{int}(S)$ or a member
of $\operatorname{int}(T)$. In the case that $x \in \operatorname{int}(S)$, you can then apply the definition of interior to
say that there is an $\varepsilon > 0$ such that the interval $(x - \varepsilon, x + \varepsilon) \subseteq S$. But this is all you
need since $S \subseteq S \cup T$ shows that $(x - \varepsilon, x + \varepsilon) \subseteq S \cup T$, proving that $x \in \operatorname{int}(S \cup T)$.
The case where $x \in \operatorname{int}(T)$ is analogous, completing the proof.
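PROOF: For any sets of real numbers $S$ and $T$, $\operatorname{int}(S) \cup \operatorname{int}(T) \subseteq \operatorname{int}(S \cup T)$.
Let $S$ and $T$ be sets of real numbers.
Let $x \in \operatorname{int}(S) \cup \operatorname{int}(T)$.
Then by the definition of the union of two sets, either $x \in \operatorname{int}(S)$ or $x \in
\operatorname{int}(T)$.
Without loss of generality, assume that $x \in \operatorname{int}(S)$.
Then there is an $\varepsilon > 0$ such that the open interval $(x - \varepsilon, x + \varepsilon) \subseteq S$.
But since $S \subseteq S \cup T$, it follows that $(x - \varepsilon, x + \varepsilon) \subseteq S \cup T$ showing that
$x \in \operatorname{int}(S \cup T)$, completing the proof.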
Can it be that $\operatorname{int}(S) \cup \operatorname{int}(T)$ is not equal to $\operatorname{int}(S \cup T)$? The answer is yes. See if
you can think of an example.


9.1.1 Exercises
For each of the following sets, find the interior, exterior, and boundary of the set.
1. $[0, 3) \cup (3, 6]$
2. $\bigcup_{n=1}^{\infty} \left( \frac{1}{2n}, \frac{1}{2n-1} \right)$

3. $[0, 4] \cap \mathbb{Q}$
Write proofs for each of the following statements. For exercises involving the subset
relation rather than the equality relation, give examples showing that the subset
relation in the statement cannot be replaced by an equality.
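4. If $S \subseteq T$, then $\operatorname{int}(S) \subseteq \operatorname{int}(T)$.
5. If $S \subseteq T$, then $\operatorname{ext}(T) \subseteq \operatorname{ext}(S)$.
6. $\operatorname{int}(\operatorname{ext}(S)) = \operatorname{ext}(S)$.
7. $\operatorname{ext}(S) \subseteq \operatorname{ext}(\operatorname{int}(S))$.
8. $\partial(\operatorname{int}(S)) \subseteq \partial S$.
9. $\partial(S) = \partial(S^c)$.
10. $\partial(\operatorname{ext}(S)) \subseteq \partial S$.
11. $\operatorname{int}(S \cap T) = \operatorname{int}(S) \cap \operatorname{int}(T)$.
12. $\operatorname{ext}(S \cup T) = \operatorname{ext}(S) \cap \operatorname{ext}(T)$.
13. $\partial(S \cup T) \subseteq \partial S \cup \partial T$.
14. $\operatorname{int}(S) \subseteq \operatorname{ext}(\operatorname{ext}(S))$.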

9.2 Open and Closed Sets


A set $S$ of real numbers is called open if for every $x \in S$ there is an $\varepsilon > 0$ such that
the open interval $(x - \varepsilon, x + \varepsilon) \subseteq S$. A set $S$ of real numbers is called closed if $\partial S \subseteq S$.
The intervals that are called open intervals are, in fact, open sets. In particular,
$(1, 7)$, $(2, \infty)$, and $\emptyset$ are all open sets as well as $(-3, 3) \cup (5, 9) \cup (10, 41)$ and
$\bigcup_{n=1}^{\infty} (2n, 2n + 1)$. The intervals that are called closed intervals are, in fact, closed sets.
In particular, $[-5, 3]$, $[4, \infty)$, and $\emptyset$ are all closed sets.
There are actually many equivalent ways to define open and closed sets, so
one usually begins this discussion by proving that all the different definitions are
equivalent. In particular, if $S \subseteq \mathbb{R}$, then the following are equivalent:
1. $S$ is an open set.
2. $S = \operatorname{int}(S)$.
3. $S \cap \partial S = \emptyset$.
4. $S^c$ is a closed set.

Many theorems in mathematics are statements of the form $p \Leftrightarrow q$, and the proof of
these statements is often broken into two steps: $p \Rightarrow q$ and $q \Rightarrow p$. Theorems of
that type state that two conditions are equivalent. But it is not uncommon to have
a theorem that states that several statements are equivalent, that is, $p_1 \Leftrightarrow p_2 \Leftrightarrow
p_3 \Leftrightarrow \cdots \Leftrightarrow p_k$. One way to prove theorems of this form is to show in a sequence
of steps that $p_1 \Rightarrow p_2$, $p_2 \Rightarrow p_3$, $p_3 \Rightarrow p_4$, ..., $p_{k-1} \Rightarrow p_k$, and then $p_k \Rightarrow p_1$. This
is the technique you can use to prove the list of statements about open sets. You
would begin by assuming condition 1, that a set $S$ is open and then prove condition
2, that $S = \operatorname{int}(S)$. This can be done by noting that for any set, elements of the set
are either in the interior of the set or on the boundary of the set. But if the set $S$
is open, it means that for each $x \in S$ there is an $\varepsilon > 0$ such that the open interval
$(x - \varepsilon, x + \varepsilon) \subseteq S$. Thus, $(x - \varepsilon, x + \varepsilon)$ contains no elements of $S^c$ showing that $x$
cannot be in $\partial S$, so it must be that $x \in \operatorname{int}(S)$ which proves that $S = \operatorname{int}(S)$.
Now, assuming condition 2 that $S = \operatorname{int}(S)$ it follows immediately that $S \cap \partial S =
\operatorname{int}(S) \cap \partial S = \emptyset$, which is condition 3. If you assume condition 3 that $S \cap \partial S = \emptyset$,
how can you conclude that $S^c$ is closed? Well, if $S$ contains no elements of $\partial S$, it
must be that all the elements of $\partial S$ (if there are any) belong to $S^c$. But as seen
in the exercises of the previous section, the boundary of $S$ and the boundary of $S^c$
are always the same. This follows from the fact that the definition of boundary is
symmetric in its references to $S$ and $S^c$. Therefore, $S^c$ contains its boundary proving
that $S^c$ is a closed set, which is condition 4.
Finally, assuming condition 4 that $S^c$ is a closed set, you know that $S^c$ contains
its boundary, so $S^c$ contains the boundary of $S$. You must show that for each $x \in S$,
there is an open interval centered at $x$ such that the entire interval is contained in $S$.
But if for every $\varepsilon > 0$ the open interval $(x - \varepsilon, x + \varepsilon)$ contains elements in $S^c$, then
$x$ would be in the boundary of $S$ which is false. Thus, there is an $\varepsilon > 0$ such that
the open interval $(x - \varepsilon, x + \varepsilon)$ is contained in $S$. This proves that $S$ is an open set,
which is condition 1 (Fig. 9.3).
Fig. 9.3 An open set $S$, its boundary, and its complement $S^c$
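
The economy of the cyclic technique is easier to see when the chain is written out. The display below is only an illustrative sketch; the indices $i$ and $j$ are introduced here for the illustration and do not appear in the discussion above.

```latex
% Once the cycle p_1 => p_2 => ... => p_k => p_1 is established,
% any p_i implies any p_j by walking around the cycle:
\[
p_i \Rightarrow p_{i+1} \Rightarrow \cdots \Rightarrow p_k \Rightarrow p_1
    \Rightarrow \cdots \Rightarrow p_j .
\]
% For the four conditions about open sets (k = 4), for instance,
% condition 3 implies condition 2 through
\[
p_3 \Rightarrow p_4 \Rightarrow p_1 \Rightarrow p_2 .
\]
```

Thus $k$ implications suffice to establish every one of the pairwise equivalences among the $k$ statements.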


PROOF: Let $S \subseteq \mathbb{R}$. Then the following statements are equivalent.

1. $S$ is an open set.
2. $S = \operatorname{int}(S)$.
3. $S \cap \partial S = \emptyset$.
4. $S^c$ is a closed set.

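Let $S \subseteq \mathbb{R}$.
Condition 1 $\Rightarrow$ Condition 2
Assume that $S$ is an open set.
If $x \in S$, then by the definition of open set, there exists an $\varepsilon > 0$ such that
$(x - \varepsilon, x + \varepsilon) \subseteq S$.
But $(x - \varepsilon, x + \varepsilon) \subseteq S$ shows that $x$ is not an element of $\partial S$.
Since $S \subseteq \operatorname{int}(S) \cup \partial S$, it can be concluded that $S \subseteq \operatorname{int}(S)$.
Because the interior of any set is contained in the set, it follows that $\operatorname{int}(S) \subseteq
S$ implying that $S = \operatorname{int}(S)$, which is condition 2.
Condition 2 $\Rightarrow$ Condition 3
Assume that $S = \operatorname{int}(S)$.
Then $S \cap \partial S = \operatorname{int}(S) \cap \partial S = \emptyset$ because the interior and the boundary of
any set are disjoint.
Thus, $S \cap \partial S = \emptyset$, which is condition 3.
Condition 3 $\Rightarrow$ Condition 4
Assume that $S \cap \partial S = \emptyset$.
Then $\partial S$ must be contained in $S^c$.
Because $\partial S = \partial(S^c)$, it follows that $\partial(S^c) \subseteq S^c$ implying that $S^c$ is a closed
set, which is condition 4.
Condition 4 $\Rightarrow$ Condition 1
Assume that $S^c$ is a closed set, which means that $S^c$ contains $\partial(S^c)$.
Let $x \in S$.
If for every $\varepsilon > 0$, the open interval $(x - \varepsilon, x + \varepsilon)$ contains elements of $S^c$,
then $x$ would be an element of $\partial S = \partial(S^c)$.
But all elements of $\partial(S^c)$ are contained in $S^c$, so there must be an $\varepsilon > 0$
such that the interval $(x - \varepsilon, x + \varepsilon)$ contains no elements of $S^c$ implying that
$(x - \varepsilon, x + \varepsilon) \subseteq S$.
This shows that $S$ is an open set, which is condition 1.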
A similar theorem can be proved concerning closed sets.


PROOF: Let $S \subseteq \mathbb{R}$. Then the following statements are equivalent.

1. $S$ is a closed set.
2. $S = \operatorname{int}(S) \cup \partial S$.
3. Every accumulation point of $S$ is an element of $S$.
4. $S^c$ is an open set.

Let $S \subseteq \mathbb{R}$.
Condition 1 $\Rightarrow$ Condition 2
Assume that $S$ is a closed set.
Because no point in the exterior of $S$ is a member of $S$, it is clear that $S \subseteq
\operatorname{int}(S) \cup \partial S$.
Because $S$ is closed, it contains $\partial S$, and because all sets contain their interior,
$S$ contains $\operatorname{int}(S)$.
Thus, $S$ contains $\operatorname{int}(S) \cup \partial S$ proving that $S = \operatorname{int}(S) \cup \partial S$, which is
condition 2.
Condition 2 $\Rightarrow$ Condition 3
Assume that $S = \operatorname{int}(S) \cup \partial S$.
Let $x$ be an accumulation point of $S$.
If $x \notin S$, then for all $\varepsilon > 0$, the open interval $(x - \varepsilon, x + \varepsilon)$ contains
elements of $S$ (since $x$ is an accumulation point of $S$) and elements of $S^c$
(in particular, $x$).
Thus, $x \in \partial S$ implying that $x \in S$, a contradiction.
Therefore, all accumulation points of $S$ must be elements of $S$, which is
condition 3.
Condition 3 $\Rightarrow$ Condition 4
Assume that every accumulation point of $S$ is an element of $S$.
Let $x \in S^c$.
Because $S$ contains all of its accumulation points, $x$ is not an accumulation
point of $S$.
Thus, there is an $\varepsilon > 0$ such that the open interval $(x - \varepsilon, x + \varepsilon)$ contains no
elements of $S$, and is, therefore, contained in $S^c$.
This shows that $S^c$ is an open set, which is condition 4.
Condition 4 $\Rightarrow$ Condition 1
Assume that $S^c$ is an open set.
Then $S^c \cap \partial(S^c) = \emptyset$ implying that $\partial(S^c) \subseteq S$, so $\partial S \subseteq S$.
Thus, $S$ contains $\partial S$, so $S$ is a closed set, which is condition 1.
Of course, a set need not be either open or closed as is seen by the interval $(0, 5]$
which contains one but not both of its boundary points (Fig. 9.4), so it is neither
open (because it contains a boundary point) nor closed (because it does not contain
all of its boundary points).


Fig. 9.4 Boundaries are drawn dotted to indicate open sets. Boundaries are drawn
solid to indicate closed sets

9.2.1 Exercises
Determine which of the following sets of real numbers are open and which are
closed.
1. $(-2, 2) \cup (2, 6) \cup (6, 10)$
2. $\mathbb{R}$
3. the irrational numbers
4. the real numbers that are not integers

Write proofs for each of the following statements.


5. If $S$ is an open set, and $T$ is a closed set, then $S \setminus T$ is an open set.
6. If $S$ is an open set, and $T$ is a closed set, then $T \setminus S$ is a closed set.
7. $\emptyset$ is both an open set and a closed set.
8. If $x$ is an accumulation point of a set $S$, and $x \notin S$, then $x \in \partial S$.
9. If $x \in \partial S$ and $x \notin S$, then $x$ is an accumulation point of $S$.
10. Let $\langle a_n \rangle$ be any sequence. Let $A$ be the set of all values $y$ such that there exists
a subsequence of $\langle a_n \rangle$ that converges to $y$. Then $A$ is a closed set.

9.3 Unions and Intersections


Perhaps the most important properties open sets have are that the union of any
collection of open sets is itself an open set and that the intersection of a finite
collection of open sets is itself an open set. In fact, these two properties of open sets
are the defining conditions required to hold in the more general setting of topological
spaces (Fig. 9.5).
In the context of the real numbers, it is not hard to show that the union of any
collection of open sets is itself an open set. But before this proof can be started,
there needs to be a convenient way to discuss an arbitrary collection of open sets.
If the open sets are listed as $A_1, A_2, A_3, \ldots, A_k$, then there is an implication that
this is a collection of a finite number of sets because the set of indices used to describe
the collection, $\{1, 2, 3, \ldots, k\}$, is a finite set. If the collection is listed as a sequence,
$A_1, A_2, A_3, \ldots$, then there is an implication that the collection of sets is denumerable,
that is, there is one set for each natural number. But to allow the collection of open
sets to be any size, even to be an uncountable collection of sets, one generally wants
to represent the collection as a collection of sets $A_i$ where the index $i$ is allowed to
range over a particular index set, $I$. That is, the collection is given by $\{A_i \mid i \in I\}$.
Thus, since there is no restriction on the size of the index set $I$, there is no restriction
on the size of the collection of open sets.

Fig. 9.5 The union of open sets is an open set
So assume that $\{A_i \mid i \in I\}$ is a collection of open sets. How would you prove
that its union, $A = \bigcup_{i \in I} A_i$, is an open set? Given the theorem about open sets in the
previous section, you could prove that the set $A$ is open by using the definition of
open set, by showing that $A = \operatorname{int}(A)$, by showing that $A \cap \partial A = \emptyset$, or by showing
that $A^c$ is a closed set. In this case, it is simple enough to use the definition of open
set. Thus, for each $x \in A$, you would need to show that there is an open interval
centered at $x$ such that this open interval is contained in $A$. All you know about $A$
is that it is a union of a collection of open sets, so the first thing you should try is
invoking the definition of union. That is, if $x \in A$, then there must be a $j \in I$ such
that $x \in A_j$. What do you know about $A_j$? Only that it is an open set. That means that
there is an $\varepsilon > 0$ such that the open interval $(x - \varepsilon, x + \varepsilon)$ is contained in $A_j$. But
by the definition of union, $A_j \subseteq A$ implying that $(x - \varepsilon, x + \varepsilon) \subseteq A$ which is what
you needed to prove.


PROOF: Assume that for each $i$ in the index set $I$, $A_i$ is an open set. Then
$\bigcup_{i \in I} A_i$ is an open set.

Assume that for each $i$ in the index set $I$, $A_i$ is an open set.
Let $x \in \bigcup_{i \in I} A_i$.
By the definition of set union, there is a $j \in I$ such that $x$ is an element of
the open set $A_j$.
By the definition of open set, there is an $\varepsilon > 0$ such that the open interval
$(x - \varepsilon, x + \varepsilon) \subseteq A_j$.
But by the definition of set union, $A_j \subseteq \bigcup_{i \in I} A_i$ showing that $(x - \varepsilon, x + \varepsilon) \subseteq
\bigcup_{i \in I} A_i$, which proves the theorem.

Now consider proving the result that the intersection of a finite collection of
open sets is itself an open set. This time there is no need to consider an arbitrarily
large collection of open sets; you can just use the finite collection of open sets
$A_1, A_2, A_3, \ldots, A_k$. Again you would take an arbitrary $x \in A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$.
You know from the definition of intersection that for each $j = 1, 2, 3, \ldots, k$, this
element $x$ must be in $A_j$. And you know that since $A_j$ is an open set, there must be
an $\varepsilon_j > 0$ such that the interval $(x - \varepsilon_j, x + \varepsilon_j) \subseteq A_j$. Now you have a collection
of $k$ open intervals each centered at $x$. By selecting $\varepsilon = \min(\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_k)$, you
will have the least of these $\varepsilon_j$ values which is a positive number. This is crucial. The
fact that you have a finite collection of open sets ensures that you can find a finite
number of open intervals centered at $x$ and can find the shortest of these intervals.
If the collection of open sets were infinite, there would be no guarantee that there
would be a minimum $\varepsilon_j$. The fact that there is a minimum $\varepsilon$ value that is greater than
0 allows you to claim that the interval $(x - \varepsilon, x + \varepsilon)$ is contained in each of the $A_j$
sets, and thus, $(x - \varepsilon, x + \varepsilon)$ is contained in the intersection of the $A_j$s.
PROOF: Assume that $A_1, A_2, A_3, \ldots, A_k$ are open sets. Then
$A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$ is an open set.

Assume that $A_1, A_2, A_3, \ldots, A_k$ are open sets.
Let $x \in A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$.
Then for each $j = 1, 2, 3, \ldots, k$, $x$ is an element of the open set $A_j$, and
because $A_j$ is an open set, there exists an $\varepsilon_j > 0$ such that the open interval
$(x - \varepsilon_j, x + \varepsilon_j) \subseteq A_j$.
Let $\varepsilon = \min(\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_k) > 0$.
Then the open interval $(x - \varepsilon, x + \varepsilon)$ is contained in $A_j$ for each $j =
1, 2, 3, \ldots, k$.
This shows that $(x - \varepsilon, x + \varepsilon) \subseteq A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$ proving that this
intersection is an open set.
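
To see why finiteness matters in this proof, it may help to look at one infinite collection where the argument breaks down. The sets $A_n$ below are a standard illustration chosen for this sketch; they are not taken from the text.

```latex
% A sequence of open intervals, each centered at 0:
\[
A_n = \left( -\tfrac{1}{n}, \tfrac{1}{n} \right), \qquad n = 1, 2, 3, \ldots
\]
% For x = 0, the largest radius that works for A_n is \varepsilon_n = 1/n,
% and these radii have no positive lower bound: \inf_n \varepsilon_n = 0.
\[
\bigcap_{n=1}^{\infty} \left( -\tfrac{1}{n}, \tfrac{1}{n} \right) = \{0\}
\]
% The intersection is not open: no interval (-\varepsilon, \varepsilon)
% with \varepsilon > 0 is contained in \{0\}.
```

Here every $\varepsilon_j$ is positive, yet there is no minimum among them, which is exactly the step of the proof that requires a finite collection.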
There are analogous results about the unions and intersections of closed sets. In
particular, the intersection of an arbitrary collection of closed sets is itself a closed
set, and the union of a finite number of closed sets is itself a closed set. One can
prove these results by relying on the definition of a closed set, but it is much easier
to use the results from the previous section that show that a set is a closed set if and
only if its complement is an open set. For example, to show that the union of
a finite number of closed sets is closed, let $A_1, A_2, A_3, \ldots, A_k$ be a finite collection
of closed sets. Then for each $j$, $A_j^c$ is the complement of a closed set, so it is open.
By the previous theorem, the intersection of a finite number of open sets is an open
set, so $A_1^c \cap A_2^c \cap A_3^c \cap \cdots \cap A_k^c$ is an open set. But De Morgan's Law says that
$A_1^c \cap A_2^c \cap A_3^c \cap \cdots \cap A_k^c = (A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_k)^c$ which is an open set, so its
complement, $A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_k$, is a closed set as desired.
Although not needed in this textbook about writing proofs in Analysis, for
completeness, it makes sense at this point to introduce the definition of a topological
space to be a set $S$ together with a collection $\mathcal{T}$ of subsets of $S$ satisfying the
conditions
Both $\emptyset$ and $S$ are in $\mathcal{T}$.
The union of any collection of sets in $\mathcal{T}$ is also a set in $\mathcal{T}$.
The intersection of any finite collection of sets in $\mathcal{T}$ is also a set in $\mathcal{T}$.
If these conditions are satisfied, then the set $\mathcal{T}$ is said to be a topology for the
topological space $S$. From the definitions and theorems presented so far in this
chapter it follows that the real numbers $\mathbb{R}$ along with its collection of open sets
forms a topological space. The advantage of introducing the more general concept
of a topological space is that many theorems about the real numbers extend to
all topological spaces, so once you justify the fact that you are dealing with a
topological space, you then know many theorems about your new space.
As an example of another topological space consider the set of integers, $\mathbb{Z}$, along
with the collection $\mathcal{T}$ of subsets of $\mathbb{Z}$ consisting of the empty set, $\emptyset$, and the sets
$A \subseteq \mathbb{Z}$ with the property that $A^c = \mathbb{Z} \setminus A$ is a finite set. It is easy to see that both
$\emptyset$ and $\mathbb{Z}$ are elements of $\mathcal{T}$. To show that $\mathcal{T}$ is closed under unions, suppose you
have a collection of sets in $\mathcal{T}$. There are two cases to consider: (1) all the sets in the
collection are the empty set, and (2) at least one of the sets in the collection is not
empty. In the first case, the union of all the sets in the collection is the empty set
which is in $\mathcal{T}$. In the second case, if the collection includes a nonempty set $A$, then the union
of the sets in the collection contains $A$, and because the complement of the union
lies inside the complement of $A$ which is finite, the union will have to have a finite
complement and be a set in $\mathcal{T}$. To show that $\mathcal{T}$ is closed under finite intersections,
suppose you have a finite collection of sets in $\mathcal{T}$. Again, there are two cases to
consider: (1) at least one set in the collection is the empty set, and (2) none of
the sets in the collection is the empty set. In the first case, the intersection of the
collection of sets is the empty set which is in $\mathcal{T}$. In the second case, the complement
of the intersection of the finite collection of sets is the union of the complements of
the sets. If all the complements are finite, then the union of the finite number of
complements is also finite, so the intersection is in $\mathcal{T}$. This verifies that $\mathcal{T}$ is a
topology for $\mathbb{Z}$. This is known as the finite complement topology for $\mathbb{Z}$. It is clearly
not the usual topology associated with the integers which is just the usual topology
of $\mathbb{R}$ restricted to $\mathbb{Z}$. Generally, a set can have many different topologies, each giving
rise to a different topological space. Most of these topologies are uninteresting and
have few if any applications.
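
The two case analyses for the finite complement topology can be traced in a small program. The following Python sketch is an illustration only: the class `CofiniteSet` and the helpers `cofinite`, `union`, and `intersection` are hypothetical names invented here, and because a program can only handle finitely many sets at a time, `union` stands in for the arbitrary unions of the text. Each nonempty member of the topology is stored by its finite complement.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class CofiniteSet:
    """A member of the finite complement topology on Z (hypothetical helper).

    `missing` is None for the empty set; otherwise it holds the finitely
    many integers that are NOT in the set, so the set is Z minus `missing`.
    """
    missing: Optional[FrozenSet[int]]

    def is_empty(self) -> bool:
        return self.missing is None

    def __contains__(self, n: int) -> bool:
        return self.missing is not None and n not in self.missing


EMPTY = CofiniteSet(None)          # the empty set
ALL_Z = CofiniteSet(frozenset())   # Z itself: nothing is missing


def cofinite(*missing: int) -> CofiniteSet:
    """Build the set Z minus the listed integers."""
    return CofiniteSet(frozenset(missing))


def union(sets: Iterable[CofiniteSet]) -> CofiniteSet:
    nonempty = [s for s in sets if not s.is_empty()]
    if not nonempty:
        return EMPTY  # case (1): every set in the collection is empty
    # case (2): the complement of the union is the intersection of the
    # complements, which lies inside each finite complement
    return CofiniteSet(frozenset.intersection(*(s.missing for s in nonempty)))


def intersection(sets: Iterable[CofiniteSet]) -> CofiniteSet:
    sets = list(sets)
    if any(s.is_empty() for s in sets):
        return EMPTY  # case (1): some set in the collection is empty
    # case (2): the complement of the intersection is the union of
    # finitely many finite complements
    return CofiniteSet(frozenset().union(*(s.missing for s in sets)))


a = cofinite(1, 2, 3)   # Z without 1, 2, 3
b = cofinite(3, 4)      # Z without 3, 4
print(sorted(union([a, b]).missing))         # [3]
print(sorted(intersection([a, b]).missing))  # [1, 2, 3, 4]
```

Storing the finite complement rather than the infinite set itself mirrors the argument in the text, where every step is a statement about complements.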


9.3.1 Exercises
1. Find a sequence of open sets whose intersection is neither an open nor a
closed set.
2. Find a sequence of closed sets whose union is neither an open nor a closed set.
3. Prove that the intersection of a collection of closed sets is a closed set.
4. Prove that an open set of real numbers is the union of all the open intervals
contained in the set.
5. Verify that if $S$ is any set, then the power set of $S$, $\mathcal{P}(S)$, consisting of all the
subsets of $S$, is a topology for $S$. This is called the discrete topology for $S$.
6. Let $S$ be the interval $[0, 5]$, and let $\mathcal{T}$ include the empty set, the set $S$, and any
interval of the form $[0, x)$ where $x \in (0, 5)$. Verify that $\mathcal{T}$ is a topology for $S$.

9.4 Continuous Functions Applied to Sets


Sometimes rather than focusing your attention on the entire real line, you are
interested in the open sets within a particular subset of the real numbers. For
example, if the real valued function $f$ has domain $A = [-4, 4]$, you might be
interested in the open sets contained in $A$. Moreover, you might want to consider
some new sets to be open which were not considered to be open sets in $\mathbb{R}$. For
example, within $A$ the interval $[-4, 0)$ should be considered open in the topological
space consisting just of the set $A$. This is because, within $A$, each point of $[-4, 0)$
is an interior point. The only controversial point here is $-4$, but it makes sense to
claim that $-4$ is in the interior of $A$ if your entire universe of interest is $A$. Certainly,
all the points of $A$ that are within a distance of $\frac{1}{2}$ of $-4$ are elements of $[-4, 0)$.
Generalizing this idea leads to the definition of the inherited topology in the set
$A \subseteq \mathbb{R}$. In the inherited topology, a set $B \subseteq A$ is said to be open in $A$ if $B$ is the
intersection of $A$ with some set that is open in $\mathbb{R}$. For example, if $A = [-4, 4]$
as above, then the set $[-4, 0)$ is open in $A$ because $[-4, 0) = (-5, 0) \cap A$,
and $(-5, 0)$ is an open set in $\mathbb{R}$. With this same reasoning, within $A$ the set
$[-4, -3] \cup [-2, 0) \cup \{1, 2\} \cup [3, 4)$ has interior $[-4, -3) \cup (-2, 0) \cup (3, 4)$ and
boundary $\{-3, -2, 0, 1, 2, 3, 4\}$.
Similarly, a set $B \subseteq A$ is said to be closed in $A$ if $B$ is the intersection of $A$ with
some set that is closed in R. Note that all of the properties proved earlier in this
chapter pertaining to open or closed sets in R hold equally well for sets open or
closed in A. In particular, the union of any collection of sets open in A is itself a set
that is open in A.
The motivation for developing the properties of open and closed sets and for
defining topological spaces is that one can now generalize the idea of a continuous
function. One defines $\mathcal{P}(X)$, the power set of a set $X$, to be the collection of all
subsets of the set $X$. For example, if $X$ is a finite set with $n$ elements, then $\mathcal{P}(X)$
contains the $2^n$ subsets of $X$. If $f : A \to B$ is a function which maps elements of the
set $A$ to elements of the set $B$, the function $f$ can be extended to $f : \mathcal{P}(A) \to \mathcal{P}(B)$
which maps subsets of the set $A$ to subsets of the set $B$. If $C \subseteq A$, then define $f(C)$
to be the set $\{y \in B \mid y = f(a) \text{ for some } a \in C\}$. Then $f(C)$ is called the image of
$C$ under $f$. The notation $f(C)$ could be confusing because $f$ was originally defined
for elements of $A$, not subsets of $A$. The application of $f$ to subsets of $A$ is really
defining a new function $f : \mathcal{P}(A) \to \mathcal{P}(B)$ whose domain is the power set of $A$ and
codomain is the power set of $B$. The confusion arises because the same name, $f$, is
given to both functions. The confusion is cleared up by recognizing the distinction
that if the argument of $f$ is an element $a \in A$, then $f(a)$ refers to an element of the
codomain, $B$, while if the argument of $f$ is a subset $C \subseteq A$, then $f(C)$ is a subset of
$B$, $f(C) \subseteq B$.
For example, the function $f(x) = x^2$ is defined to be a function with domain $\mathbb{R}$
and codomain $\mathbb{R}$. It is then easily understood that $f(-3) = 9$ and $f(2) = 4$. But
taking $C$ to be the interval $(-3, 2)$, the expression $f(C)$ now refers to the function
$f : \mathcal{P}(\mathbb{R}) \to \mathcal{P}(\mathbb{R})$, and $f(C)$ is the set of all elements of $\mathbb{R}$ that are images under $f$
of elements of $C$. That is, $f(C) = [0, 9)$.
If the function $f : A \to B$ is not a bijection mapping $A$ one-to-one and onto $B$,
then it is not possible to define the inverse function $f^{-1} : B \to A$. One problem is
that if $f$ is not surjective (mapping $A$ onto $B$), there might be an element $b \in B$ for
which there is no corresponding element $a$ satisfying $f(a) = b$, so $f^{-1}(b)$ cannot be
defined. Another problem is that if $f$ is not injective (mapping $A$ one-to-one to $B$),
then there will be an element $b \in B$ such that $f(x) = b$ is satisfied by more than one
value of $x$, so $f^{-1}(b)$ would not be unique. On the other hand, if $D \subseteq B$, it is always
possible to define the function $f^{-1} : \mathcal{P}(B) \to \mathcal{P}(A)$ mapping the power set of $B$ to
the power set of $A$. Indeed, one defines $f^{-1}(D) = \{x \in A \mid f(x) \in D\}$. In this case
$f^{-1}(D)$ is called the preimage of $D$ under $f$. For example, returning to $f(x) = x^2$,
it follows that $f^{-1}\bigl((4, 9)\bigr) = (-3, -2) \cup (2, 3)$ and $f^{-1}\bigl((-1, 16)\bigr) = (-4, 4)$.
Note that when the continuous function $f(x) = x^2$ was applied to an open set as in
$f\bigl((-3, 2)\bigr) = [0, 9)$, the image did not end up being an open set. But when $f^{-1}$ was
applied to an open set as in $f^{-1}\bigl((-1, 16)\bigr) = (-4, 4)$, the preimage was also an open
set. This is an important distinction. A continuous function need not map open
sets to open sets; functions that do map all open sets to open sets are called open
functions. But all continuous functions have the property that their inverses always
map open sets to open sets. Conversely, a function whose inverse always maps open
sets to open sets must be a continuous function. Of course, these statements require
proof, but the proofs follow directly from the definition of continuity and definition
of open set.
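
The definitions of image and preimage translate directly into set comprehensions. The Python sketch below is only an illustration on a finite stand-in for the domain; the helper names `image` and `preimage` and the sample sets `A`, `C`, and `D` are assumptions made for this example, not notation from the text.

```python
def image(f, C):
    """f(C) = { f(a) : a in C }, the image of C under f."""
    return {f(a) for a in C}


def preimage(f, domain, D):
    """f^{-1}(D) = { x in domain : f(x) in D }, the preimage of D under f.

    Note that f itself need not be invertible for this to make sense.
    """
    return {x for x in domain if f(x) in D}


def square(x):
    return x * x


A = set(range(-5, 6))   # finite stand-in for the domain: -5, ..., 5
C = {-2, -1, 0, 1}      # a subset of the domain
D = {4, 9, 10}          # a subset of the codomain; 10 is not a square

print(sorted(image(square, C)))        # [0, 1, 4]
print(sorted(preimage(square, A, D)))  # [-3, -2, 2, 3]
```

In this example `preimage(square, A, image(square, C))` is strictly larger than `C` because squaring is not injective, and `image(square, preimage(square, A, D))` is strictly smaller than `D` because squaring is not surjective, matching the two obstacles to defining an inverse on elements.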
Assume, for example, that $f : A \to B$ is a continuous function and $D \subseteq B$ is an
open set in $B$. You are challenged to show that $f^{-1}(D)$ is an open set in $A$. To show
that $f^{-1}(D)$ is open, you would need to show for every $a \in f^{-1}(D)$ there is a $\delta > 0$
such that $(a - \delta, a + \delta) \cap A \subseteq f^{-1}(D)$. From the definition of $f^{-1}(D)$, you know
that if $a \in f^{-1}(D)$, then $f(a) \in D$. Because $D$ is open, there is an $\varepsilon > 0$ such that
$(f(a) - \varepsilon, f(a) + \varepsilon) \cap B \subseteq D$. This means that if $y \in B$ such that $|y - f(a)| < \varepsilon$, then
$y \in D$. But now, by the definition of continuity, there is a $\delta > 0$ such that if $x \in A$
with $|x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$ implying that $f(x)$ is in $(f(a) - \varepsilon, f(a) + \varepsilon) \cap B$,
and thus, $f(x)$ is in $D$. This shows that $x \in f^{-1}(D)$ proving that $(a - \delta, a + \delta) \cap A \subseteq
f^{-1}(D)$, so $f^{-1}(D)$ is open.
Conversely, suppose that $f$ has the property that $f^{-1}(D)$ is an open set in $A$
whenever $D$ is an open set in $B$. Then let $a \in A$. This time you are challenged
to show that for every $\varepsilon > 0$, there is a $\delta > 0$ such that if $x \in A$ with $|x - a| < \delta$,
then $|f(x) - f(a)| < \varepsilon$. But the set $D$ of all $y \in B$ satisfying $|y - f(a)| < \varepsilon$ is an
open set in $B$ implying that $f^{-1}(D)$ is an open set in $A$ containing the point $a$. This
means that there is a $\delta > 0$ such that $(a - \delta, a + \delta) \cap A$ is contained in $f^{-1}(D)$.
In other words, if $x \in A$ with $|x - a| < \delta$, then $x$ is in $f^{-1}(D)$, so $f(x)$ is in $D$ and
$|f(x) - f(a)| < \varepsilon$, completing the proof that $f$ is continuous (Fig. 9.6).

Fig. 9.6 Mapping sets
PROOF: Let $A$ and $B$ be sets of real numbers, and let $f : A \to B$ be a
function from $A$ to $B$. Then $f$ is continuous on $A$ if and only if for every
open set $D \subseteq B$, its preimage under $f$, $f^{-1}(D)$, is an open set in $A$.
Let $A$ and $B$ be sets of real numbers, and let $f : A \to B$ be a function from $A$
to $B$.
Continuity implies that the preimages of open sets are open
Assume that $f : A \to B$ is a continuous function.
Let $D$ be an open set in $B$, and let $a \in f^{-1}(D)$.
Because $D$ is open in $B$, there is an $\varepsilon > 0$ such that
$(f(a) - \varepsilon, f(a) + \varepsilon) \cap B \subseteq D$.
Thus, if $y \in B$ with $|y - f(a)| < \varepsilon$, then $y \in D$.
Because $f$ is a continuous function, there is a $\delta > 0$ such that for all $x \in A$
with $|x - a| < \delta$ it follows that $|f(x) - f(a)| < \varepsilon$.
Thus, if $x \in (a - \delta, a + \delta) \cap A$, then $|f(x) - f(a)| < \varepsilon$, so $f(x) \in D$ and
$x \in f^{-1}(D)$.
Therefore, $(a - \delta, a + \delta) \cap A \subseteq f^{-1}(D)$ and $f^{-1}(D)$ is an open set in $A$. This
proves that the preimage under $f$ of any open set is open.

The preimages of open sets are open implies continuity
Assume that the preimage under $f$ of any set $D$ open in $B$ is an open set in $A$.
Let $a \in A$, and let $\varepsilon > 0$ be given.
The set $(f(a) - \varepsilon, f(a) + \varepsilon) \cap B$ is an open set in $B$, so its preimage, $C =
\{x \in A \mid |f(x) - f(a)| < \varepsilon\}$, is an open set in $A$.
Because $C$ is an open set containing $a$, there is a $\delta > 0$ such that
$(a - \delta, a + \delta) \cap A \subseteq C$.
Thus, if $x \in A$ with $|x - a| < \delta$, then $x \in (a - \delta, a + \delta) \cap A \subseteq C$, so
$f(x) \in f(C)$ implying that $|f(x) - f(a)| < \varepsilon$.
Therefore, $f$ is continuous which completes the proof of the theorem.
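
The first half of the proof can be followed with concrete numbers. The sketch below reuses $f(x) = x^2$ and the open set $D = (-1, 16)$ from earlier in the section; the point $a = 3$ and the particular values $\varepsilon = 7$ and $\delta = 1$ are choices made here for illustration, and other choices would work equally well.

```latex
% f(x) = x^2 with A = B = \mathbb{R}, D = (-1, 16), f^{-1}(D) = (-4, 4).
% Take a = 3, so f(a) = 9 \in D.
\begin{align*}
\varepsilon = 7 &: \quad (f(a) - \varepsilon,\, f(a) + \varepsilon) = (2, 16) \subseteq D, \\
\delta = 1 &: \quad |x - 3| < 1 \;\Rightarrow\; 2 < x < 4
    \;\Rightarrow\; 4 < x^2 < 16
    \;\Rightarrow\; |x^2 - 9| < 7 .
\end{align*}
% Hence (a - \delta, a + \delta) = (2, 4) \subseteq (-4, 4) = f^{-1}(D).
```

The interval $(2, 4)$ is the open interval around $a$ that the proof promises inside the preimage.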
There is a similar theorem that states that a function $f : A \to B$ is continuous if
and only if the preimage of every set closed in $B$ is a closed set in $A$. The proof is
left as an exercise. As it is with open sets, continuous functions do not always map
closed sets to closed sets. Functions that do map all closed sets to closed sets
are called closed functions.
In general, then, one can define what continuity means for any function from one
topological space into another topological space. If $A$ and $B$ are topological spaces,
the function $f : A \to B$ is continuous if the preimage under $f$ of every set open in $B$
is a set open in $A$. Note that this definition makes sense even in topological spaces
where there is no distance measure, and the definition does not involve the selection
of a $\delta > 0$ given an $\varepsilon > 0$.
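
Because the topological definition of continuity mentions only open sets, it can be checked mechanically when the spaces are finite. The Python sketch below is an illustration under assumptions made here: the spaces `X` and `Y`, their topologies `open_X` and `open_Y`, and the functions `f` and `g` (stored as dictionaries) are all invented for the example.

```python
def preimage(f, domain, D):
    """Preimage of D under f, where f is a dict from domain points to values."""
    return frozenset(x for x in domain if f[x] in D)


def is_continuous(f, X, open_X, open_Y):
    """f is continuous when the preimage of every open set of Y is open in X."""
    return all(preimage(f, X, D) in open_X for D in open_Y)


# A three-point space and a two-point space, each with a small topology.
X = frozenset({1, 2, 3})
open_X = {frozenset(), frozenset({1}), frozenset({1, 2}), X}

Y = frozenset({"a", "b"})
open_Y = {frozenset(), frozenset({"a"}), Y}

f = {1: "a", 2: "a", 3: "b"}   # the preimage of {"a"} is {1, 2}, which is open
g = {1: "b", 2: "a", 3: "a"}   # the preimage of {"a"} is {2, 3}, which is not open

print(is_continuous(f, X, open_X, open_Y))  # True
print(is_continuous(g, X, open_X, open_Y))  # False
```

No distance between points is ever computed, which is the sense in which the definition works without a distance measure.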

9.4.1 Exercises
Write proofs for each of the following statements.
1. The union of any collection of sets open in $A$ is itself a set open in $A$.
2. The intersection of any finite collection of sets open in $A$ is itself a set open in $A$.
3. If $f : A \to B$ and $C \subseteq A$, then $C \subseteq f^{-1}\bigl(f(C)\bigr)$.
4. If $f : A \to B$ and $D \subseteq B$, then $f\bigl(f^{-1}(D)\bigr) \subseteq D$.
5. If $f$ is a function from set $A$ into set $B$, then $f$ is continuous on $A$ if and only if the
preimage under $f$ of every set closed in $B$ is a set closed in $A$.

9.5 Closure
Recall that if $S$ is any subset of $\mathbb{R}$, then $a$ is an accumulation point of $S$ if for
every $\varepsilon > 0$ the open interval $(a - \varepsilon, a + \varepsilon)$ contains at least one point of $S$
other than $a$ itself. An important property of closed sets is that if a closed set,
$S$, has an accumulation point, $a$, then $a \in S$. You should be able to construct a
short proof of this fact that relies only on the definitions of accumulation point,
closed set, and boundary. Such a proof would start with the assumption that $a$ is
an accumulation point of the closed set $S$. One way to continue is to construct a
proof by contradiction, that is, to assume that $a \notin S$ and hope that this will lead to a
contradiction. Interestingly, you can proceed in more than one way. You could use
the fact that $S$ is a closed set which implies that, since $a \notin S$, then $a \in \operatorname{ext}(S)$.
This means that there is an $\varepsilon > 0$ such that the open interval $(a - \varepsilon, a + \varepsilon)$
is contained in $S^c$. But the definition of accumulation point says that every open
interval containing $a$ also contains points of $S$, so this contradicts the fact that
$a$ is an accumulation point of $S$. Alternatively, you could use the fact that $a$ is
an accumulation point of $S$. This means that for every $\varepsilon > 0$, the open interval
$(a - \varepsilon, a + \varepsilon)$ contains points in $S$. All of these open intervals also contain $a \notin S$
implying that each of these open intervals contains points in $S$ and points in $S^c$. Thus,
$a$ satisfies the definition of being an element of $\partial S$. From the definition of closed set,
$\partial S \subseteq S$. Thus, $a \in S$.
PROOF: If S is a closed set, then S contains all of its accumulation points.
Let S be a closed set, and let a be an accumulation point of S.
Assume that $a \notin S$.
From the definition of accumulation point, for every $\varepsilon > 0$ it follows that
the open interval $(a - \varepsilon, a + \varepsilon)$ contains elements in S.
Because $a \notin S$, it follows that for every $\varepsilon > 0$ the open interval $(a - \varepsilon, a + \varepsilon)$
contains elements of S and elements of $S^c$, so $a \in \partial S$.
From the definition of closed set, $\partial S \subseteq S$, so $a \in S$ which contradicts the
assumption that $a \notin S$.
Thus, every accumulation point of S must be contained in S.
The collection of all the accumulation points of S is called the derived set of S
which is written $S'$. The previous theorem shows that if S is closed, then $S' \subseteq S$.
The converse is also true, that is, if $S' \subseteq S$, then S must be closed. This follows from
the fact that if a is in the boundary of S but a is not an element of S, then a must be
an accumulation point of S. This should make sense to you. A boundary point is a
point close both to S and to $S^c$. An accumulation point is close to S, and if it is not
in S, it is close to $S^c$.
PROOF: If set S contains all of its accumulation points, then S is a closed
set.
Let S be a set that contains all of its accumulation points.
Assume that $a \in \partial S \setminus S$.
Because $a \in \partial S$, for every $\varepsilon > 0$, the open interval $(a - \varepsilon, a + \varepsilon)$ contains
elements of S and elements of $S^c$.
Thus, because a itself is not a member of S, $(a - \varepsilon, a + \varepsilon)$ contains an element
of S not equal to a.
It follows that $a \in S' \subseteq S$ which contradicts the assumption that $a \notin S$.
Therefore, $\partial S \subseteq S$ which proves that S is a closed set.


You can conclude from this result that for any set S, if $a \in \partial S \cap S^c$, it is an
accumulation point of S, and, by symmetry, if $a \in \partial S \cap S$, then it is an accumulation
point of $S^c$. The set S is closed if it contains its boundary, $\partial S$. But for any set S, the
elements of $\partial S$ that are not in S are accumulation points of S, so S is closed if and
only if it contains all of its accumulation points. It is important to recognize, though,
that the derived set $S'$ need not be contained in $\partial S$ since points in the interior of S are
accumulation points of S, and $\partial S$ need not be contained in $S'$ since isolated points
of S are in the boundary of S without being accumulation points of S. On the other
hand, $S \cup \partial S = S \cup S'$.
For any set S, define the closure of S or $\operatorname{cl}(S)$ to be $S \cup S' = S \cup \partial S$. Some books
use the notation $\overline{S}$ or $S^{-}$ for the closure of S. Intuitively, the closure of a set S takes
the elements of the boundary of S and adds them to the set so that you now have S
along with its boundary (Fig. 9.7). The closure also has the following properties.

• For any set S, the closure $\operatorname{cl}(S)$ is a closed set.
• The set S is closed if and only if $S = \operatorname{cl}(S)$.
• $\operatorname{cl}(S)$ is the intersection of every closed set that contains S.
• $\operatorname{cl}(S)$ is the smallest closed set that contains S.

All of these results have short proofs. For example, to get the first result, recall that
if x is in the boundary of the union of two sets, $S \cup T$, then x is either in the boundary
of S or the boundary of T. Thus, if $x \in \partial \operatorname{cl}(S)$, it means that $x \in \partial(S \cup \partial S)$ and,
therefore, $x \in \partial S$ or $x \in \partial(\partial S)$. It was shown in the first section of this chapter that
$\partial(\partial S) \subseteq \partial S$ implying that $x \in \partial S$ proving that x is in $\operatorname{cl}(S)$. Thus, $\operatorname{cl}(S)$ contains its
boundary, so it is closed.
For the second result, note that if S is closed, it contains its boundary so $\operatorname{cl}(S) = S \cup \partial S = S$.
Conversely, if $S = \operatorname{cl}(S)$, then S is closed because $\operatorname{cl}(S)$ is always a
closed set.
The third and fourth results follow quickly after noticing that any closed set
containing S must also contain the boundary of S.
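
To see the closure at work on a specific set, the following sketch checks numerically that 1 is an accumulation point of a set that does not contain it. This is an illustration only: the set S = {n/(n+1) : n = 1, 2, 3, ...} and the helper name `witness` are assumptions chosen for the example. For each ε tried, the function finds a point of S other than 1 inside (1 − ε, 1 + ε). Since 1 is not in S, the set S is not closed, and 1 belongs to cl(S).

```python
from fractions import Fraction


def witness(eps):
    """Return a point of S = {n/(n+1)} lying within eps of 1 (and not equal to 1)."""
    n = 1
    # The distance from n/(n+1) to 1 is 1/(n+1), which shrinks as n grows.
    while 1 - Fraction(n, n + 1) >= eps:
        n += 1
    return Fraction(n, n + 1)


for eps in [Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)]:
    w = witness(eps)
    print(eps, w, abs(1 - w) < eps)
```

Exact fractions are used so that the comparison with ε is not affected by rounding. A finite run of this kind is evidence, not a proof; the proof is the observation that 1/(n+1) can be made smaller than any ε.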

Fig. 9.7 The closure of a set


9.5.1 Exercises
For each of the following sets S, determine $\partial S$, $S'$, and $\operatorname{cl}(S)$.
1. The real numbers, $\mathbb{R}$.
2. The integers, $\mathbb{Z}$.
3. $\{\frac{1}{n} \mid n \in \mathbb{Z}\}$
4. $(0, 3) \cup (3, 5) \cup (5, 7)$
5. $(1, 3) \cap \mathbb{Q} \cup \{0, 4\}$

Write proofs of each of the following statements.
6. For any set S, the closure $\operatorname{cl}(S)$ is a closed set.
7. The set S is closed if and only if $S = \operatorname{cl}(S)$.
8. $\operatorname{cl}(S)$ is the intersection of every closed set that contains S.
9. $\operatorname{cl}(S)$ is the smallest closed set that contains S.
10. For any set S, its derived set, $S'$, is a closed set.

9.6 Compactness
The topics of open cover, finite subcover, compactness, and the Heine–Borel
Theorem were introduced in Chap. 4 because of their usefulness in proving that a
function continuous on a closed bounded interval is uniformly continuous on that
interval. Compactness also played an important role in showing that a continuous
function on a closed bounded interval is bounded, a continuous function on a
closed bounded interval attains its extreme values (maximum and minimum), and a
continuous function on a closed bounded interval has a Riemann integral. Recall
that an open cover of a set S was defined to be a collection of open intervals T
where for each $x \in S$ there is an open interval $(p, q) \in T$ such that $x \in (p, q)$.
After the introduction of the topological ideas in this chapter, that definition can be
generalized to allow T to be a collection of open sets rather than just open intervals,
that is, a collection of open sets, T, is called an open cover of S if for each $x \in S$
there is an open set $U \in T$ such that $x \in U$. Moreover, the Heine–Borel Theorem
can now be extended in two ways: the concept of an open cover by intervals can
be generalized to an open cover by open sets, and the concept of closed bounded
interval can be generalized to closed bounded set.
PROOF (Heine–Borel Theorem): Let S be any closed bounded set of real
numbers, and let T be a cover of S by open sets. Then T contains a finite
subcover of S.
Let S be a closed bounded set and T be a cover of S by open sets.
Because S is bounded, there are real numbers a and b with a < b such that
$S \subseteq [a, b]$.

Let $U = (a - 1, b + 1) \setminus S$ which is the intersection of the open interval
$(a - 1, b + 1)$ and the open set $S^c$, so U is an open set.
Then $T' = T \cup \{U\}$ is an open cover of $[a, b]$.
For each $x \in [a, b]$ there is an open set $V_x \in T'$ that contains x.
Because $V_x$ is open, there is an open interval $(p_x, q_x) \subseteq V_x$ that contains x.
Thus, the collection $T'' = \{(p_x, q_x) \mid x \in [a, b]\}$ is a cover of $[a, b]$ by open
intervals.
Now, by the previously proved version of the Heine–Borel Theorem, $T''$
has a finite subcover of $[a, b]$, say $\{(p_1, q_1), (p_2, q_2), (p_3, q_3), \ldots, (p_k, q_k)\}$
for some natural number k.
For each $j = 1, 2, 3, \ldots, k$, the open interval $(p_j, q_j)$ in the subcover
is contained in an open set $V_j \in T'$, so it is clear that the collection
$V_1, V_2, V_3, \ldots, V_k$ covers $[a, b]$ and, therefore, covers S.
If one of the open sets, $V_j$, happens to be the set U added to T, this set can
be discarded from the subcover of S because it contains no elements of S.
This gives a finite subcover of S which completes the proof.
So, this shows that all closed bounded sets of real numbers are compact. The
converse is also true, that is, all compact subsets of real numbers are both closed
and bounded. These two results together, then, completely characterize the compact
sets of real numbers.
PROOF: A subset of $\mathbb{R}$ is compact if and only if it is closed and bounded.
The Heine–Borel Theorem shows that closed bounded sets of real numbers
are compact.
Conversely, assume that S is a compact subset of $\mathbb{R}$.
The collection of open intervals $(-j, j)$ where j ranges over the natural
numbers is a collection of open sets that covers all of $\mathbb{R}$, so it certainly
covers S.
Because S is compact, S can be covered by a finite collection of the $(-j, j)$
intervals.
It follows that there exists a natural number k such that $S \subseteq (-k, k)$, and S
is a bounded set.
Suppose that there is a real number x in the boundary of S that is not an
element of S.
For each $\varepsilon > 0$, let $U_\varepsilon = (-\infty, x - \varepsilon) \cup (x + \varepsilon, \infty)$ which is an open set.
The collection of all such $U_\varepsilon$ covers all of $\mathbb{R} \setminus \{x\}$, and since x is not an
element of S, the collection is an open cover of S.
Because S is compact, it is covered by a finite collection of the $U_\varepsilon$ sets.
It follows that there is an $\varepsilon > 0$ such that $S \subseteq U_\varepsilon$.

Thus, the interval $(x - \varepsilon, x + \varepsilon)$ contains no elements of S which contradicts
the assumption that x is in the boundary of S.
Therefore, there are no elements x in the boundary of S that are not elements
of S implying $\partial S \subseteq S$ and S is closed.
This shows that all compact sets are closed and bounded completing the
proof of the theorem.
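
The closedness half of this argument can be made concrete with a set that is bounded but not closed. The sketch below is only an illustration under stated assumptions: it takes S = (0, 1] and the boundary point x = 0, which is not in S, and the function names `in_U` and `missed_point` are hypothetical helpers rather than notation from the text. Given any finite list of ε values, it produces a point of S that lies in none of the corresponding open sets, so no finite subcollection of that cover can cover S.

```python
from fractions import Fraction


def in_U(y, x, eps):
    """True when y lies in the open set (-inf, x - eps) union (x + eps, inf)."""
    return y < x - eps or y > x + eps


def missed_point(eps_values):
    """Return a point of S = (0, 1] lying in none of the open sets built from x = 0.

    Any y with 0 < y <= min(eps_values) is outside every such set, and
    capping at 1 keeps y inside S.
    """
    smallest = min(eps_values)
    return min(smallest, Fraction(1)) / 2


eps_values = [Fraction(1, 2), Fraction(1, 10), Fraction(1, 3)]
y = missed_point(eps_values)
print(y)                                        # 1/20
print(any(in_U(y, 0, e) for e in eps_values))   # False
```

This mirrors the contradiction in the proof: a finite subcollection has a smallest ε, and points of S within that distance of x are left uncovered.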
Continuous functions need not map bounded sets onto bounded sets as is seen by
$f(x) = \frac{1}{x}$ which maps the bounded interval $(0, 1)$ continuously onto the interval
$(1, \infty)$ which is not bounded. Continuous functions need not map closed sets
onto closed sets as seen by $f(x) = -\frac{1}{x}$ which maps the closed interval $[1, \infty)$
onto $[-1, 0)$ which is not closed. But continuous functions always map compact
sets onto compact sets. This is a result that is true in any topological space, so
its proof need not use any more than the properties of open sets, compact sets,
and continuous functions. To write the proof you would start by assuming that the
function $f : A \to B$ is continuous on A, and that $C \subseteq A$ is a compact set. You must
then show that the image of C, $f(C) \subseteq B$, is compact. How would you show this
set is compact? The definition of compact set suggests that you would take an open
cover of the set and proceed to show that that cover has a finite subcover. So let I be
an index set and assume that $\{U_i \mid i \in I\}$ is an open cover of $f(C)$. Somehow you
must show that this cover has a finite subcover. All you know is that f is a continuous
function and that the set C is compact. Since C is compact, you know that open
covers of C have finite subcovers, but you have an open cover of $f(C)$, not an open
cover of C. You need to use the fact that f is a continuous function which means that
for each $i \in I$, the preimage of the open set $U_i$, $f^{-1}(U_i)$, is an open set in A. Does
the collection of $f^{-1}(U_i)$ sets form a cover of C? Follow what happens: if $x \in C$,
then $f(x) \in f(C)$. Thus, there is at least one $i \in I$ such that $f(x) \in U_i$. Therefore,
$x \in f^{-1}(U_i)$. So, indeed, the collection of $f^{-1}(U_i)$ sets forms an open cover of C.
Hence, there is a finite subcover of C given (by renaming subscripts) as $f^{-1}(U_1)$,
$f^{-1}(U_2), f^{-1}(U_3), \ldots, f^{-1}(U_k)$, for some natural number k. For each $x \in C$, there is
a j between 1 and k such that $x \in f^{-1}(U_j)$, so $f(x) \in f(f^{-1}(U_j)) \subseteq U_j$. Because each
element of $f(C)$ is the image of at least one $x \in C$, and each $x \in C$ is an element
of at least one of the finite number of $f^{-1}(U_j)$, it follows that the finite collection of
open sets, $U_1, U_2, U_3, \ldots, U_k$, covers $f(C)$ proving that $f(C)$ is compact.
PROOF: If $f : A \to B$ is continuous on A, and if $C \subseteq A$ is a compact set,
then $f(C)$ is a compact set in B.
Assume that $f : A \to B$ is continuous on A, and $C \subseteq A$ is a compact set.
Let I be an index set, and $\{U_i \mid i \in I\}$ be a collection of open sets that cover
$f(C)$.
For each $x \in C$ there is an $i \in I$ such that $f(x) \in U_i$.
Since f is continuous, and, for each $i \in I$, $U_i$ is an open set in B, $f^{-1}(U_i)$ is
an open set in A.
Thus, $\{f^{-1}(U_i) \mid i \in I\}$ is an open cover of C.

Since C is compact, this open cover has a finite subcover.
By renaming subscripts, the subcover is given as $f^{-1}(U_1), f^{-1}(U_2), f^{-1}(U_3), \ldots, f^{-1}(U_k)$
for some natural number k.
Let y be any element of $f(C)$. Then $y = f(x)$ for some $x \in C$.
Since $x \in f^{-1}(U_j)$ for one of the $j = 1, 2, 3, \ldots, k$, it follows that
$y = f(x) \in U_j$ showing that the finite collection $U_1, U_2, U_3, \ldots, U_k$ covers $f(C)$.
Therefore, every open cover of $f(C)$ has a finite subcover, and $f(C)$ is
compact.
Notice that it is an immediate consequence of this theorem that a real-valued continuous
function on a closed bounded interval on the real line is bounded and attains
its maximum and minimum values. This is because every closed bounded interval on
the real line is a compact set, so its image under a continuous function is compact
which means the image is closed and bounded. The image being bounded is just
another way of saying that the function is bounded. The image being closed shows
that the image contains its boundary which includes the supremum and infimum of
the image, so these are the maximum and minimum values of the function.
The Heine–Borel Theorem can be extended to n-dimensional Euclidean
space $\mathbb{R}^n$. That is, the compact sets in $\mathbb{R}^n$ are the sets that are both closed and
bounded. One can use mathematical induction to show that a rectangular box that
is a cross product of n closed intervals is compact, and then, that can be extended to
any closed bounded set.

9.6.1 Exercises
1. Find an example of a function f and a set C such that $f^{-1}(f(C))$ is not equal to C.
2. Find an example of a continuous function f and a set D such that $f(f^{-1}(D))$ is
not equal to D.
3. Find an example of a continuous function $f : A \to B$ and a compact set $D \subseteq B$
such that $f^{-1}(D)$ is not compact.
4. Suppose that the continuous function f has domain $[0, 10]$ and codomain $(-4, 4)$.
Show that the function is not surjective.

9.7 Connectedness
The intervals on the real line were discussed in Chap. 2. A set of real numbers is an
interval if whenever x and y are elements of the interval, then all the real numbers
between x and y are also elements of the interval. The intervals are the connected
sets on the real line, but the concept of connectedness can be extended to any
topological space. In a general topological space, two nonempty sets A and B are


Fig. 9.8 The sets $[0, 1]$ and $(4, 5)$ are disconnected

disconnected if there are disjoint open sets U and V with $A \subseteq U$ and $B \subseteq V$. For
example, the sets $[0, 1]$ and $(4, 5)$ are disconnected because $[0, 1] \subseteq (-1, 2)$ and
$(4, 5) \subseteq (4, 5)$ where $(-1, 2)$ and $(4, 5)$ are disjoint open sets (Fig. 9.8). The sets
$[0, 3]$ and $(3, 5)$ are disjoint nonempty sets, but they are not disconnected because
any open set that contains $[0, 3]$ must contain an interval $(3 - \varepsilon, 3 + \varepsilon)$ for some
$\varepsilon > 0$, so it will necessarily share points with any open set containing $(3, 5)$. A set
is called connected if it is not the union of two disconnected nonempty sets. Even
though the connected sets of real numbers are just the intervals, the concept of
connectedness gets far more interesting in more general topological spaces.
If $f : A \to B$ is continuous, then it always maps connected sets to connected
sets, that is, if $C \subseteq A$ is a connected set, then so is $f(C)$. This is easy to see since,
if $f(C)$ is disconnected, then there are two disjoint open sets U and V in B and two
nonempty sets S and T in B such that $f(C) = S \cup T$ and $S \subseteq U$ and $T \subseteq V$. But then
$C \subseteq f^{-1}(U) \cup f^{-1}(V)$ where $f^{-1}(U)$ and $f^{-1}(V)$ are disjoint open sets in A. Because
S and T are nonempty, $C \cap f^{-1}(U)$ and $C \cap f^{-1}(V)$ are nonempty implying that C
is a disconnected set. Thus, if C is connected, $f(C)$ must also be connected.
PROOF: If $f : A \to B$ is continuous on A, and if $C \subseteq A$ is a connected set,
then $f(C)$ is a connected set in B.
Let $f : A \to B$ be a continuous function on A, and assume that $C \subseteq A$ such
that $f(C)$ is disconnected.
This means that there are disjoint open sets U and V in B, and nonempty
sets S and T in B with $S \subseteq U$ and $T \subseteq V$ such that $f(C) = S \cup T$.
Since f is continuous, $f^{-1}(U)$ and $f^{-1}(V)$ are open sets in A.
Since S and T are nonempty sets whose union is $f(C)$, both $C \cap f^{-1}(U)$ and
$C \cap f^{-1}(V)$ are nonempty.
This shows that C is a disconnected set.
Therefore, if C is a connected set, $f(C)$ must also be connected.
When this theorem is applied to functions from the real numbers to the real
numbers, the result is the Intermediate Value Theorem which states that if f is a
continuous real-valued function on the interval $[a, b]$, then for every c between $f(a)$ and $f(b)$
there is an $x \in [a, b]$ such that $f(x) = c$. This is because f must map the connected
set $[a, b]$ into a connected set which must include all the elements c between $f(a)$
and $f(b)$. Note that $f^{-1}$ need not bring connected sets to connected sets.
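
The Intermediate Value Theorem only asserts that a suitable x exists, but for a specific continuous function the point can be approximated by repeated halving. The sketch below uses bisection, which is a standard numerical technique and not something taken from the text; the function name `bisect_for_value`, the tolerance, and the sample function are assumptions made for illustration.

```python
def bisect_for_value(f, a, b, c, tol=1e-12):
    """Approximate an x in [a, b] with f(x) = c, for f continuous on [a, b].

    Assumes c lies between f(a) and f(b); raises ValueError otherwise.
    The default tolerance is meant for endpoints of moderate size.
    """
    fa, fb = f(a) - c, f(b) - c
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("c is not between f(a) and f(b)")
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fm = f(mid) - c
        if fm == 0:
            return mid
        # Keep the half whose endpoint values still lie on opposite sides of c.
        if fa * fm < 0:
            hi = mid
        else:
            lo, fa = mid, fm
    return (lo + hi) / 2


x = bisect_for_value(lambda t: t**3 - t, 1.0, 2.0, 3.0)
print(abs(x**3 - x - 3.0) < 1e-9)  # True
```

At every step the value c stays between the function values at the two endpoints, so the theorem guarantees that each smaller interval still contains a solution.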
In n-dimensional Euclidean space the concept of connectedness gets considerably
richer as the connected sets are not merely the cross products of intervals
(Fig. 9.9). In $\mathbb{R}^2$ one introduces what it means for a set to be path-connected which
makes precise the intuitive notion that a set is connected if you can draw a path

Fig. 9.9 The set C is a connected set. The set N is not a connected set
Fig. 9.10 Graph of $\sin\frac{1}{x}$ with the y-axis

between any two of its points where the path stays inside the set. On the real line,
this just means that for any two points in the set, the interval between the two points
stays in the set. But in $\mathbb{R}^2$ where paths need not be straight lines, the examples are
far more varied. In fact, in $\mathbb{R}^2$ there are examples of connected sets that are not path
connected, a phenomenon that cannot occur on the real line. A famous example is
the set consisting of the graph of the equation $y = \sin\frac{1}{x}$ along with the y-axis.
This is a connected set because any open set that contains the y-axis must intersect
parts of the graph of $y = \sin\frac{1}{x}$ both to the left and to the right of the y-axis. On
the other hand, this set is not path-connected because there is no way to construct a
path that stays inside the set and connects the points $\left(-\frac{1}{\pi}, 0\right)$ and $\left(\frac{1}{\pi}, 0\right)$ (Fig. 9.10).

9.7.1 Exercises
1. Find an example of a continuous function $f : A \to B$ and connected set $D \subseteq B$
such that $f^{-1}(D)$ is not connected.
2. Show that in any topological space A, if S and T are connected sets with $A = S \cup T$
and $S \cap T \ne \emptyset$, then A is connected.

Chapter 10

Metric Spaces

10.1 Definition of Metric Space


This book has discussed at length how one writes proofs about the limits and
continuity of functions whose domains and ranges are subsets of the real numbers, $\mathbb{R}$.
Although the set of real numbers is far simpler to study than many other
naturally arising sets in Analysis, the techniques learned while dealing with real-valued
functions of a real variable can be applied almost exactly to prove similar
theorems about functions defined on other domains with other types of ranges. It is
instructive to take note of the properties of the real numbers that play important
roles in these proofs. In particular, most of the proofs about limits and continuity
involve measuring the distance between two real numbers x and y. This is done
by calculating the absolute value of the difference between the numbers, $|x - y|$.
This distance measure has important properties that allow the proofs about limits
and continuity to proceed. Among the useful properties of this distance measure
is that if $|x - y| < \varepsilon$ for every $\varepsilon > 0$, then it follows that $x = y$, and if
$|x - y| > 0$, then x is surely different from y. Another property used repeatedly in
these proofs is the triangle inequality. For example, if f and g are two functions,
and x and y are both elements in the domains of these functions, then knowing
that $|f(x) - f(y)| < \frac{\varepsilon}{2}$ and $|g(x) - g(y)| < \frac{\varepsilon}{2}$ allows the proofs to conclude that

$$\bigl|\bigl(f(x) + g(x)\bigr) - \bigl(f(y) + g(y)\bigr)\bigr| = \bigl|\bigl(f(x) - f(y)\bigr) + \bigl(g(x) - g(y)\bigr)\bigr| \le |f(x) - f(y)| + |g(x) - g(y)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

The fact that the triangle inequality holds true for this
chosen measure of distance is crucial in the argument.
The conclusion is, then, that if there were a set, X, and a distance measure that
assigned to each x and y in X a real number, $d(x, y)$, that had many of the same
properties that the $|x - y|$ distance measure does in the real numbers, then it might
be possible to prove limit and continuity theorems for functions defined on X by
just adopting the same proof techniques used for the theorems about functions of

Springer International Publishing Switzerland 2016


J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_10


Fig. 10.1 Metric distances in the plane, showing $d(x, y)$, $d(x, z)$, and $d(y, z)$

real numbers. With this in mind a nonempty set X together with distance function
d is defined to be a metric space if, for all x, y, and z in X, this distance function
satisfies the following properties:

• $d(x, y) \in \mathbb{R}$ with $d(x, y) \ge 0$ (the distance function is a nonnegative real number).
• $d(x, y) = 0$ if and only if $x = y$ (the distance function separates points).
• $d(x, y) = d(y, x)$ (the distance function is symmetric).
• $d(x, y) + d(y, z) \ge d(x, z)$ (the distance function satisfies the triangle inequality).

The distance function defines a metric for the metric space, and the metric
space is designated as $\langle X, d \rangle$ (Fig. 10.1). This definition is a generalization of the
distance function defined on the real numbers, $d(x, y) = |x - y|$. Clearly, for all real
numbers x, y, and z,

• $d(x, y) = |x - y| \ge 0$
• $0 = d(x, y) = |x - y|$ if and only if $x = y$
• $d(x, y) = |x - y| = |y - x| = d(y, x)$
• $d(x, y) + d(y, z) = |x - y| + |y - z| \ge |(x - y) + (y - z)| = |x - z| = d(x, z)$

showing that $\langle \mathbb{R}, d \rangle$ where $d(x, y) = |x - y|$ is a metric space. In general, it is a
fairly straightforward process to construct a proof that $\langle X, d \rangle$ is a metric space.
Most proofs would follow this template:
TEMPLATE for proving $\langle X, d \rangle$ is a metric space
SET THE CONTEXT: Give the definitions of X and d.
METRIC DEFINITION: Show that d maps each $x, y \in X$ to a nonnegative
real number.
SEPARATION OF POINTS: Show that $d(x, y) = 0$ implies $x = y$.
ZERO DISTANCE: Show that for all $x \in X$ that $d(x, x) = 0$.
SYMMETRY: Show that for all $x, y \in X$ that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: Show that for all $x, y, z \in X$ that
$d(x, y) + d(y, z) \ge d(x, z)$.
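
Before writing out a proof along this template, it can help to test a candidate distance function on a handful of points, since a single failing triple shows that no proof exists. The following sketch does that by brute force. The function name `check_metric_axioms`, the sample points, and the tolerance are assumptions made for illustration, and passing the check on a finite sample is evidence only, not a proof.

```python
from itertools import product


def check_metric_axioms(points, d, tol=1e-12):
    """Return a list of axiom violations found on the finite sample `points`.

    An empty list means no counterexample was found on the sample; it is not
    a proof that d is a metric on the whole space.
    """
    failures = []
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            failures.append(("nonnegativity", x, y))
        if (d(x, y) == 0) != (x == y):
            failures.append(("separation / zero distance", x, y))
        if d(x, y) != d(y, x):
            failures.append(("symmetry", x, y))
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            failures.append(("triangle inequality", x, y, z))
    return failures


sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
print(check_metric_axioms(sample, lambda x, y: abs(x - y)))             # []
print(len(check_metric_axioms(sample, lambda x, y: (x - y) ** 2)) > 0)  # True
```

The squared difference fails only the triangle inequality, for example with x = -2, y = 0, z = 1, where 4 + 1 < 9. That is why the template treats the triangle inequality as a separate step that usually needs the most care.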
Given a metric space $\langle X, d \rangle$, an element $a \in X$, and a positive real number r,
define the neighborhood of a with radius r to be $N(a, r) = \{x \in X \mid d(a, x) < r\}$.
Sometimes, as in the definition of a limit at point a, one needs to exclude the point a
from the neighborhood of a. In this case, one can define the deleted neighborhood
of a with radius r to be $N(a, r) \setminus \{a\} = \{x \in X \mid 0 < d(a, x) < r\}$. These neighborhoods
play a central role in defining limits and continuity of functions defined on X and
in establishing a topology for the space X. It is not uncommon for there to be


several different distance functions defined on a particular set X that make X into a
metric space. Each new distance function results in different shaped neighborhoods.
Some give rise to the same topology of X while others may result in quite different
topologies. Many examples of these different distance functions will be explored in
the sections that follow.

10.2 Inequalities
Most proofs in Analysis involve establishing one or more inequalities. Some
inequalities seem to keep reappearing in different guises throughout Analysis, so
they provide great tools for writing proofs. This section presents two very common
inequalities that will be used later in the chapter to justify the triangle inequality for
some examples of metric spaces.

10.2.1 Cauchy–Schwarz Inequality

For natural number n let $a = (a_1, a_2, a_3, \ldots, a_n)$ and $b = (b_1, b_2, b_3, \ldots, b_n)$ be any
two points in n-dimensional Euclidean space. The Cauchy–Schwarz Inequality
states that

$$|a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n| \le \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}.$$

The student familiar with vectors and the dot product of two vectors will find
this inequality easy to remember. If $|a| = \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}$ refers to
the magnitude of vector a and the dot product
$a \cdot b = (a_1, a_2, a_3, \ldots, a_n) \cdot (b_1, b_2, b_3, \ldots, b_n) = a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n$,
then $a \cdot b = |a| \cdot |b| \cos\theta$
where $\theta$ is the angle between the two vectors. Then the Cauchy–Schwarz Inequality
is just the statement that $|a| \cdot |b| \ge |a \cdot b|$ which follows because $|\cos\theta| \le 1$.
To prove the Cauchy–Schwarz Inequality note that for given $a, b \in \mathbb{R}^n$ and every
real number x the quantity $\sum_{j=1}^{n} (a_j + x b_j)^2$ is a sum of squares of real numbers, so
it must be nonnegative. By expanding the squares one gets

$$\sum_{j=1}^{n} a_j^2 + 2x \sum_{j=1}^{n} a_j b_j + x^2 \sum_{j=1}^{n} b_j^2 \ge 0.$$

Thus, this is a quadratic polynomial in x that is nonnegative for every
real number x. Any quadratic polynomial $Ax^2 + Bx + C$ with $A > 0$ is nonnegative for
every x if and only if its discriminant $B^2 - 4AC$ is not positive. (If $b = 0$, so that the
coefficient of $x^2$ is 0, both sides of the Cauchy–Schwarz Inequality are 0 and there is
nothing to prove.) But the discriminant of the previous polynomial is

$$4\left[\left(\sum_{j=1}^{n} a_j b_j\right)^2 - \left(\sum_{j=1}^{n} a_j^2\right)\left(\sum_{j=1}^{n} b_j^2\right)\right].$$

The statement that this discriminant is less than or equal to 0 is exactly the statement of the
Cauchy–Schwarz Inequality. An even stronger statement can now be made. Equality
occurs in the Cauchy–Schwarz Inequality if and only if the given discriminant is 0
so that the underlying quadratic polynomial has exactly one root, meaning that the
sum $\sum_{j=1}^{n} (a_j + x b_j)^2$ is 0 for exactly one value of x. This happens if and only if a is
a multiple of b. Thus, the Cauchy–Schwarz Inequality always holds, and equality
holds exactly when one of the points $(a_1, a_2, a_3, \ldots, a_n)$ and $(b_1, b_2, b_3, \ldots, b_n)$ is a
scalar multiple of the other.
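
The discriminant argument can be checked numerically. The sketch below is only an illustration: the helper names `dot`, `norm`, and `discriminant`, the random test vectors, and the small tolerances that absorb floating-point rounding are all assumptions of this example. It confirms on random vectors that the discriminant is never positive, and that it is exactly 0 when one vector is a scalar multiple of the other.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def discriminant(a, b):
    """Discriminant B^2 - 4AC of sum((a_j + x*b_j)^2) viewed as a quadratic in x."""
    A, B, C = dot(b, b), 2 * dot(a, b), dot(a, a)
    return B * B - 4 * A * C


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    n = random.randint(1, 6)
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    assert abs(dot(a, b)) <= norm(a) * norm(b) + 1e-9
    assert discriminant(a, b) <= 1e-6

# Equality case: a is a scalar multiple of b, so the discriminant is 0.
print(discriminant([2, 4, 6], [1, 2, 3]))  # 0
```

With integer inputs the arithmetic is exact, which is why the equality case prints exactly 0 rather than a small rounding residue.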

10.2.2 Minkowski Inequality

Starting with the Cauchy–Schwarz Inequality

$$a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n \le \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2},$$

doubling it and adding $a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2 + b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2$ to
both sides yields

$$(a_1 + b_1)^2 + (a_2 + b_2)^2 + \cdots + (a_n + b_n)^2 \le \left(a_1^2 + a_2^2 + \cdots + a_n^2\right) + 2\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + \cdots + b_n^2} + \left(b_1^2 + b_2^2 + \cdots + b_n^2\right)$$

or, because the right side is the square of $\sqrt{a_1^2 + \cdots + a_n^2} + \sqrt{b_1^2 + \cdots + b_n^2}$, by taking square roots

$$\sqrt{(a_1 + b_1)^2 + (a_2 + b_2)^2 + \cdots + (a_n + b_n)^2} \le \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} + \sqrt{b_1^2 + b_2^2 + \cdots + b_n^2}$$

which is a special case of the Minkowski Inequality which can be restated $|a + b| \le |a| + |b|$.
Again, equality occurs only when one of the points is a nonnegative scalar multiple of
the other.
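
A few concrete vectors show both the inequality and its equality case. This is a small sketch under stated assumptions: the helper names `norm` and `minkowski_gap` and the sample vectors are chosen for illustration only.

```python
import math


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def minkowski_gap(a, b):
    """Return |a| + |b| - |a + b|, which the inequality says is never negative."""
    total = [x + y for x, y in zip(a, b)]
    return norm(a) + norm(b) - norm(total)


print(minkowski_gap([3, 4], [4, -3]) > 0)                 # True: perpendicular vectors
print(abs(minkowski_gap([1, 2, 2], [2, 4, 4])) < 1e-12)   # True: b = 2a gives equality
print(minkowski_gap([1, 2, 2], [-1, -2, -2]))             # 6.0: b = -a is not an equality case
```

The last line shows why the multiple must be nonnegative for equality: when b = -a the left side of the inequality is 0 while the right side is 2|a|.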

10.2.3 Exercises
1. Show that the Cauchy–Schwarz Inequality extends to infinite series. That is, if
$\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ are both convergent series, then
$\sum_{n=1}^{\infty} a_n b_n \le \sqrt{\sum_{n=1}^{\infty} a_n^2}\,\sqrt{\sum_{n=1}^{\infty} b_n^2}$.
2. Show that the Minkowski Inequality extends to infinite series. That is, if
$\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ are both convergent series, then
$\sqrt{\sum_{n=1}^{\infty} (a_n + b_n)^2} \le \sqrt{\sum_{n=1}^{\infty} a_n^2} + \sqrt{\sum_{n=1}^{\infty} b_n^2}$.
3. Show that for any real numbers $a_1, a_2, a_3, \ldots, a_n$ and positive real numbers
$b_1, b_2, b_3, \ldots, b_n$, the following inequality holds:
$\frac{a_1^2}{b_1} + \frac{a_2^2}{b_2} + \frac{a_3^2}{b_3} + \cdots + \frac{a_n^2}{b_n} \ge \frac{(a_1 + a_2 + a_3 + \cdots + a_n)^2}{b_1 + b_2 + b_3 + \cdots + b_n}$.
This can be shown by mathematical induction on n,
but can also be shown using the Cauchy–Schwarz Inequality.

10.3 Examples of Metric Spaces


For any natural number n one can define n-dimensional Euclidean space, $\mathbb{R}^n$,
with $\mathbb{R}^1$ being the real numbers, $\mathbb{R}^2$ being the Euclidean plane, $\mathbb{R}^3$ being
3-dimensional Euclidean space, and so forth. Elements of $\mathbb{R}^n$ can be represented
as ordered n-tuples of real numbers, $(x_1, x_2, x_3, \ldots, x_n)$. You should be familiar
with the Euclidean distance between two points in n-dimensional Euclidean space,
$x = (x_1, x_2, x_3, \ldots, x_n)$ and $y = (y_1, y_2, y_3, \ldots, y_n)$, given by the generalization of
the Pythagorean Theorem as

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}.$$

For any $x, y \in \mathbb{R}^n$, the distance $d(x, y)$ is a nonnegative real number since it is
a square root of the sum of squares of real numbers. Moreover, the distance is
0 exactly when the sum of squares is 0 which happens only when $x = y$. The
fact that d is symmetric follows from the fact that for all real numbers a and
b, $(a - b)^2 = (b - a)^2$. The fact that the Euclidean distance satisfies the triangle
inequality is just the statement of the Minkowski Inequality with
$a = (x_1 - y_1, x_2 - y_2, x_3 - y_3, \ldots, x_n - y_n)$ and $b = (y_1 - z_1, y_2 - z_2, y_3 - z_3, \ldots, y_n - z_n)$. Then

$$d(x, y) + d(y, z) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2} + \sqrt{(y_1 - z_1)^2 + \cdots + (y_n - z_n)^2} \ge \sqrt{(x_1 - z_1)^2 + \cdots + (x_n - z_n)^2} = d(x, z).$$

Together these facts show that $\mathbb{R}^n$ with Euclidean distance is a metric space
(Fig. 10.2).

Fig. 10.2 Euclidean distance in $\mathbb{R}^2$: $d(x, y) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ between the points $(x_1, y_1)$ and $(x_2, y_2)$, with legs $|x_2 - x_1|$ and $|y_2 - y_1|$

PROOF: For natural number n, n-dimensional Euclidean space with the
Euclidean distance function
$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}$ is a
metric space.
SET THE CONTEXT: For natural number n, let $x = (x_1, x_2, x_3, \ldots, x_n)$,
$y = (y_1, y_2, y_3, \ldots, y_n)$, and $z = (z_1, z_2, z_3, \ldots, z_n)$ be elements of $\mathbb{R}^n$.
METRIC DEFINITION: Define
$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}$ which is the
square root of a sum of squares of real numbers, so it is a nonnegative real
number.
SEPARATION OF POINTS: If $x \ne y$, then for some j between 1 and n,
$(x_j - y_j)^2$ must be positive implying that $d(x, y) > 0$.
ZERO DISTANCE: For each j between 1 and n, $(x_j - x_j)^2 = 0$, so
$d(x, x) = 0$.
SYMMETRY: Since for each j between 1 and n, $(x_j - y_j)^2 = (y_j - x_j)^2$, it
follows that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: The fact that $d(x, y) + d(y, z) \ge d(x, z)$ is just
a restatement of the Minkowski Inequality with $a_j = x_j - y_j$ and $b_j = y_j - z_j$
for each j between 1 and n.
This shows that $\mathbb{R}^n$ with the Euclidean distance is a metric space.
The Euclidean distance, sometimes called the Euclidean metric, may be the most
commonly seen distance function used for Euclidean space, but there are many other
distance functions which can make $\mathbb{R}^n$ into a metric space. One example is
$d(a, b) = |a_1 - b_1| + |a_2 - b_2| + |a_3 - b_3| + \cdots + |a_n - b_n|$. This is sometimes called the taxicab
metric because the distance $d(a, b)$ is the distance you would travel between the two
points a and b if you could only travel in directions parallel to one of the coordinate
axes as a taxicab would do on a rectangular grid of streets. Proving that this distance
function makes $\mathbb{R}^n$ into a metric space is quite easy.


PROOF: For natural number n, n-dimensional Euclidean space with the
taxicab distance function $d(x, y) = |x_1 - y_1| + |x_2 - y_2| + |x_3 - y_3| + \cdots + |x_n - y_n|$
is a metric space.
SET THE CONTEXT: For natural number n, let $x = (x_1, x_2, x_3, \ldots, x_n)$,
$y = (y_1, y_2, y_3, \ldots, y_n)$, and $z = (z_1, z_2, z_3, \ldots, z_n)$ be elements of $\mathbb{R}^n$.
METRIC DEFINITION: Define $d(x, y) = |x_1 - y_1| + |x_2 - y_2| + |x_3 - y_3| + \cdots + |x_n - y_n|$
which is the sum of nonnegative absolute values so it is a
nonnegative real number.
SEPARATION OF POINTS: If $x \ne y$, then for some j between 1 and n,
$|x_j - y_j|$ must be positive implying that $d(x, y) > 0$.
ZERO DISTANCE: For each j between 1 and n, $|x_j - x_j| = 0$, so $d(x, x) = 0$.
SYMMETRY: Since for each j between 1 and n, $|x_j - y_j| = |y_j - x_j|$, it
follows that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: Since for each j between 1 and n,
$|x_j - y_j| + |y_j - z_j| \ge |x_j - z_j|$, it follows that $d(x, y) + d(y, z) \ge d(x, z)$.
This shows that $\mathbb{R}^n$ with the d distance function is a metric space.
Still another distance function that can be used for Euclidean space is called
the supremum metric given by $d(a, b) = \max(|a_1 - b_1|, |a_2 - b_2|, |a_3 - b_3|, \ldots, |a_n - b_n|)$.
It is instructive to compare the shapes of the neighborhoods that you get
using the Euclidean metric, the taxicab metric, and the supremum metric as shown
in Fig. 10.3. Since the Euclidean distance is the familiar distance from Euclidean
Geometry, it is easy to see that if $a \in \mathbb{R}^n$ and $r > 0$, then $N(a, r)$ is an open ball with
center a and radius r. On the other hand, using the taxicab metric, $N(a, r)$ is a union
of $2^n$ n-dimensional triangular pyramids. That is, when $n = 2$, $N(a, r)$ is a diamond
made up of four isosceles right triangles, and when $n = 3$, $N(a, r)$ is a union of
8 tetrahedra, one in each octant, forming a regular octahedron. For the supremum
metric, $N(a, r)$ is an n-dimensional cube. Note that in the Euclidean metric, if the
coordinate axes are rotated (performing an orthogonal change of coordinates), there
is no change in the neighborhood whereas with the other two metrics, rotating the
axes changes the orientation of the neighborhoods. It turns out that all three of these

Fig. 10.3 N.0; 1/ in the Euclidean, taxicab, and supremum metrics in 2 and 3 dimensions

302

10 Metric Spaces

metrics give rise to the same topology on Rn because each metric gives the same
open sets even though the open neighborhoods are different in shape. But the three
metrics have different algebraic properties, and sometimes it is easier to prove a
particular theorem using one of these metrics rather than the others.
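One way to make the claim about equal topologies plausible is the chain of inequalities $d_{\sup}(a, b) \leq d_{\text{Euclid}}(a, b) \leq d_{\text{taxi}}(a, b) \leq n \cdot d_{\sup}(a, b)$. This chain is not stated in the text above; it is a standard fact brought in here as an assumption, and it implies that every neighborhood in one metric contains a neighborhood with the same center in each of the others. The Python sketch below spot-checks the chain on random integer points in $\mathbb{R}^4$. All function names are illustrative.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def taxicab(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def supremum(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def chain_holds(a, b, tol=1e-9):
    """Check sup <= Euclidean <= taxicab <= n * sup for one pair of points."""
    n = len(a)
    s, e, t = supremum(a, b), euclidean(a, b), taxicab(a, b)
    # tol absorbs floating-point rounding in the square root.
    return s <= e + tol and e <= t + tol and t <= n * s + tol


random.seed(1)
pairs = [
    (tuple(random.randint(-9, 9) for _ in range(4)),
     tuple(random.randint(-9, 9) for _ in range(4)))
    for _ in range(500)
]
print(all(chain_holds(a, b) for a, b in pairs))
```

For example, the chain gives the inclusion $N_{\text{taxi}}(a, r) \subseteq N_{\text{Euclid}}(a, r) \subseteq N_{\sup}(a, r) \subseteq N_{\text{taxi}}(a, nr)$, which matches the nesting of the diamond, ball, and cube in Fig. 10.3.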
Distance measures in metric spaces need not be complicated. For any set $X$ you
can define $d(x, x) = 0$ for all $x \in X$ and $d(x, y) = 1$ for all $x$ and $y$ in $X$ with $x \neq y$. It
is very easy to see that $d(x, y)$ is nonnegative, symmetric, and equal to 0 if and only
if $x = y$. Also, for any $x, y, z \in X$, if $d(x, z) = 1$, then $x \neq z$, so $y$ cannot equal both
$x$ and $z$, and at least one of $d(x, y)$ and $d(y, z)$ must be 1, which implies the triangle
inequality $d(x, y) + d(y, z) \geq d(x, z)$.
Thus, any set $X$ is a metric space with this metric, sometimes called the discrete
metric, and $\langle X, d \rangle$ is called a discrete metric space. Note that for this metric,
each neighborhood $N(a, r)$ is either all of $X$ or just the single point $\{a\}$ depending
on whether or not $r$ is greater than 1.
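The remark about neighborhoods can be illustrated with a few lines of Python. This is a minimal sketch on a small finite set; the set `X` and the helper name `neighborhood` are made up for the illustration, with $N(a, r)$ computed directly as the set of points at distance less than $r$ from $a$.

```python
def discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0 if x == y else 1


def neighborhood(a, r, space, d):
    """Return N(a, r), the set of y in space with d(a, y) < r."""
    return {y for y in space if d(a, y) < r}


X = {"red", "green", "blue"}
small = neighborhood("red", 0.5, X, discrete)    # radius below 1
border = neighborhood("red", 1, X, discrete)     # radius exactly 1
large = neighborhood("red", 1.5, X, discrete)    # radius above 1
print(small == {"red"}, border == {"red"}, large == X)
```

The radius 1 case shows why the dividing line is "greater than 1": a point at distance exactly 1 from $a$ does not satisfy $d(a, y) < 1$.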
Next, consider a space that looks much different than Euclidean space. Let $C[0, 1]$
be the set of all real-valued functions continuous on the interval $[0, 1]$. Certainly,
this set contains all the polynomials with real coefficients, but it also includes the
rational functions that are defined on $[0, 1]$, exponential functions, many elementary
functions, and a much larger class of functions continuous but not differentiable on
$[0, 1]$. This set is truly very large as compared, say, to the set of real numbers. There
are many ways you might try to measure the distance between two functions in this
set. For example, you could evaluate the functions at one or more points and measure
how much the functions differ at those points. That is, if $f$ and $g$ are in $C[0, 1]$, you
could define $d(f, g) = |f(0) - g(0)| + |f(\frac{1}{2}) - g(\frac{1}{2})| + |f(1) - g(1)|$. The only
problem with this definition is that there are continuous functions $f$ and $g$ which are
equal at 0, $\frac{1}{2}$, and 1 but not equal at other points such as $f(x) = x(2x - 1)(x - 1)$
and $g(x) = 2x(2x - 1)(x - 1)$. Because the given distance function gives a distance
of 0 between two unequal functions, it cannot serve as a metric for the space of
continuous functions on $[0, 1]$ (Fig. 10.4).
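The failure of this three-point distance can be confirmed by direct computation. The Python sketch below uses exact rational arithmetic; the name `three_point_distance` and the test point $x = \frac{1}{4}$ are choices made for this illustration only.

```python
from fractions import Fraction


def f(x):
    return x * (2 * x - 1) * (x - 1)


def g(x):
    return 2 * x * (2 * x - 1) * (x - 1)


def three_point_distance(f, g):
    """|f(0) - g(0)| + |f(1/2) - g(1/2)| + |f(1) - g(1)|, computed exactly."""
    points = [Fraction(0), Fraction(1, 2), Fraction(1)]
    return sum(abs(f(p) - g(p)) for p in points)


quarter = Fraction(1, 4)
print(three_point_distance(f, g))   # distance assigned to the pair f, g
print(f(quarter), g(quarter))       # values of f and g at x = 1/4
```

Both functions vanish at 0, $\frac{1}{2}$, and 1, so the computed distance is 0, while at $x = \frac{1}{4}$ the values are $\frac{3}{32}$ and $\frac{3}{16}$, so $f \neq g$.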
As a result, a distance function that makes $C[0, 1]$ into a metric space really
needs to take into account the values of the functions at all the points (or at least
a dense set of points) in $[0, 1]$. One distance measure that does this is called
the supremum metric or sup metric for short. It is defined for all $f$ and $g$ in
$C[0, 1]$ as $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$. It is clear that if $f \neq g$, then there are
values of $x \in [0, 1]$ where $f(x) \neq g(x)$, so $d(f, g)$ will be positive, yet when
$f \equiv g$, then $d(f, g) = 0$ as needed. It is necessary to check that this distance
function has a valid definition, that is, for every $f$ and $g$ in $C[0, 1]$ the distance
function gives a nonnegative real number. But if $f$ and $g$ are continuous functions
on $[0, 1]$, then so is $|f(x) - g(x)|$. Since all functions continuous on $[0, 1]$ are
bounded and $|f(x) - g(x)|$ is a continuous function, the needed supremum is defined.
The triangle inequality follows from the fact that the triangle inequality works for
real numbers. Since for any three continuous functions $f$, $g$, and $h$ and for each
$x \in [0, 1]$ it is true that $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$, it follows
that $\sup_{x \in [0,1]} \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \geq \sup_{x \in [0,1]} |f(x) - h(x)|$. Then, the inequality
$\sup(A + B) \leq \sup A + \sup B$ shows that
$\sup_{x \in [0,1]} |f(x) - g(x)| + \sup_{x \in [0,1]} |g(x) - h(x)| \geq \sup_{x \in [0,1]} \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \geq \sup_{x \in [0,1]} |f(x) - h(x)|$,
and $d(f, g) + d(g, h) \geq d(f, h)$.

Fig. 10.4 Some functions in $C[0, 1]$

PROOF: The set $C[0, 1]$ with distance function $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$
is a metric space.
SET THE CONTEXT: Let $C[0, 1]$ be the set of real-valued functions
continuous on the interval $[0, 1]$.
METRIC DEFINITION: For any $f$ and $g$ in $C[0, 1]$, the function
$|f(x) - g(x)|$ is also in $C[0, 1]$. Define $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$, which is
the supremum of a nonnegative continuous function on a closed bounded interval,
so it is a nonnegative real number.
SEPARATION OF POINTS: For $f, g \in C[0, 1]$, if $f \neq g$, then for some
$x \in [0, 1]$, $|f(x) - g(x)|$ must be positive, implying that $d(f, g) > 0$.
ZERO DISTANCE: For all $x \in [0, 1]$ and $f \in C[0, 1]$, $|f(x) - f(x)| = 0$, so
$\sup_{x \in [0,1]} |f(x) - f(x)| = 0$ and $d(f, f) = 0$.
SYMMETRY: Since for all $x \in [0, 1]$ and all $f, g \in C[0, 1]$, $|f(x) - g(x)| =
|g(x) - f(x)|$, it follows that $d(f, g) = d(g, f)$.
TRIANGLE INEQUALITY: Since for all $x \in [0, 1]$ and all $f, g, h \in C[0, 1]$,
it holds that $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$, it follows
that $\sup_{x \in [0,1]} |f(x) - g(x)| + \sup_{x \in [0,1]} |g(x) - h(x)| \geq \sup_{x \in [0,1]} \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \geq \sup_{x \in [0,1]} |f(x) - h(x)|$,
and $d(f, g) + d(g, h) \geq d(f, h)$.
This shows that $C[0, 1]$ with the supremum distance function is a metric
space.
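To get a feel for the supremum metric, you can approximate it on a computer by taking the maximum of $|f(x) - g(x)|$ over a finite grid of points in $[0, 1]$. The Python sketch below does this; the grid maximum is only a lower bound for the true supremum, and the function names and the three sample functions are assumptions chosen for illustration.

```python
def sup_distance(f, g, samples=1001):
    """Grid approximation (a lower bound) of sup |f(x) - g(x)| over [0, 1]."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in xs)


def identity(x):
    return x


def square(x):
    return x * x


def half(x):
    return 0.5


d_fg = sup_distance(identity, square)   # sup |x - x^2|, attained at x = 1/2
d_gh = sup_distance(square, half)       # sup |x^2 - 1/2|, attained at x = 0 and x = 1
d_fh = sup_distance(identity, half)     # sup |x - 1/2|, attained at x = 0 and x = 1
print(d_fg, d_gh, d_fh)
print(d_fg + d_gh >= d_fh)              # triangle inequality for this triple
```

Here the grid happens to contain the points where each supremum is attained, so the approximations agree with the exact values $\frac{1}{4}$, $\frac{1}{2}$, and $\frac{1}{2}$. For a function with a sharp peak between grid points, the approximation could fall well short of the true distance.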
The supremum metric provides only one of many possible distance functions
for the space $C[0, 1]$. Another example is called the $L^1$ metric and is defined by
$d(f, g) = \int_0^1 |f(x) - g(x)| \, dx$. Since all functions continuous on a closed interval are
integrable there, this distance function is defined. Moreover, since $|f(x) - g(x)| \geq 0$
for all $x \in [0, 1]$, its integral is also nonnegative. If $f \neq g$, then there is an $a \in [0, 1]$
where $f(a) \neq g(a)$. Because $|f(x) - g(x)|$ is continuous and positive at $x = a$, there
is a $\delta > 0$ such that for all $x \in [0, 1]$ with $|x - a| < \delta$, $|f(x) - g(x)| > \frac{1}{2} |f(a) - g(a)|$.
This implies that $d(f, g) = \int_0^1 |f(x) - g(x)| \, dx \geq \int_{a - \delta}^{a + \delta} |f(x) - g(x)| \, dx > 0$. Of course,
a rigorous proof will take care that the limits of integration in the previous sentence
are chosen in a way that the integral is guaranteed to be defined. The symmetry
of $d$ follows from its definition. For all $f, g, h \in C[0, 1]$ and each $x \in [0, 1]$,
the triangle inequality gives $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$. Thus,
$\int_0^1 |f(x) - g(x)| \, dx + \int_0^1 |g(x) - h(x)| \, dx = \int_0^1 \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \, dx \geq \int_0^1 |f(x) - h(x)| \, dx$,
so $d(f, g) + d(g, h) \geq d(f, h)$, and the needed triangle inequality
holds.
PROOF: The set $C[0, 1]$ with distance function $d(f, g) = \int_0^1 |f(x) - g(x)| \, dx$
is a metric space.
SET THE CONTEXT: Let $C[0, 1]$ be the set of real-valued functions
continuous on the interval $[0, 1]$.
METRIC DEFINITION: Define $d(f, g) = \int_0^1 |f(x) - g(x)| \, dx$, which is the
integral of a nonnegative continuous function, so it is a nonnegative real
number.
SEPARATION OF POINTS: For $f, g \in C[0, 1]$, if $f \neq g$, then for some
$a \in [0, 1]$, $|f(a) - g(a)|$ must be positive.
Because $|f(x) - g(x)|$ is a continuous function, there is a $\delta > 0$ such that
$|f(x) - g(x)| > \frac{1}{2} |f(a) - g(a)|$ for all $x \in [0, 1]$ satisfying $|x - a| < \delta$.
In particular, there are $\alpha$ and $\beta$ in $[0, 1]$ with $\alpha < \beta$ such that
$|f(x) - g(x)| > \frac{1}{2} |f(a) - g(a)|$ for all $x$ satisfying $\alpha < x < \beta$.
Then $d(f, g) = \int_0^1 |f(x) - g(x)| \, dx \geq \int_\alpha^\beta |f(x) - g(x)| \, dx > \frac{1}{2} |f(a) - g(a)| (\beta - \alpha) > 0$,
so $d(f, g) > 0$ whenever $f \neq g$.
ZERO DISTANCE: For all $x \in [0, 1]$ and $f \in C[0, 1]$, $|f(x) - f(x)| = 0$, so
$\int_0^1 |f(x) - f(x)| \, dx = \int_0^1 0 \, dx = 0$ and $d(f, f) = 0$.
SYMMETRY: Since for all $x \in [0, 1]$ and all $f, g \in C[0, 1]$, $|f(x) - g(x)| =
|g(x) - f(x)|$, it follows that $d(f, g) = d(g, f)$.
TRIANGLE INEQUALITY: Since for all $x \in [0, 1]$ and all $f, g, h \in C[0, 1]$,
it holds that $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$, it follows that
$\int_0^1 |f(x) - g(x)| \, dx + \int_0^1 |g(x) - h(x)| \, dx = \int_0^1 \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \, dx \geq \int_0^1 |f(x) - h(x)| \, dx$,
and $d(f, g) + d(g, h) \geq d(f, h)$.
This shows that $C[0, 1]$ with the $L^1$ distance function $d$ is a metric space.
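The $L^1$ distance can be approximated numerically as well. The following Python sketch uses the midpoint rule, which is one of several reasonable quadrature choices and is an assumption of this illustration rather than anything prescribed by the text; the function names and sample functions are likewise hypothetical.

```python
def l1_distance(f, g, subintervals=1000):
    """Midpoint-rule approximation of the integral over [0, 1] of |f(x) - g(x)|."""
    width = 1.0 / subintervals
    total = 0.0
    for i in range(subintervals):
        midpoint = (i + 0.5) * width
        total += abs(f(midpoint) - g(midpoint))
    return total * width


def identity(x):
    return x


def square(x):
    return x * x


def half(x):
    return 0.5


d_fg = l1_distance(identity, square)   # exact value is 1/2 - 1/3 = 1/6
d_fh = l1_distance(identity, half)     # exact value is 1/4
d_hg = l1_distance(half, square)
print(d_fg, d_fh, d_hg)
print(d_fh + d_hg >= d_fg)             # triangle inequality for this triple
```

Comparing with the supremum metric, the same pair $f(x) = x$ and $g(x) = x^2$ is at distance $\frac{1}{4}$ in the supremum metric but only $\frac{1}{6}$ in the $L^1$ metric.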


It is important to note that the supremum metric and the $L^1$ metric are
distinctly different. In particular, consider the sequence of functions

$$f_n(x) = \begin{cases} 0 & \text{if } 0 \leq x \leq \frac{1}{n+1} \\ n(n+1)x - n & \text{if } \frac{1}{n+1} < x \leq \frac{1}{n} \\ 1 & \text{if } \frac{1}{n} < x \leq 1 \end{cases}$$

for all natural numbers $n$. In the $L^1$ metric,
these functions converge to the function which is identically 1 on $[0, 1]$. On the
other hand, this sequence is not even a Cauchy sequence in the supremum metric
since $d(f_n, f_m) = 1$ for all $n \neq m$. All metrics for $C[0, 1]$ need to measure the
distance between two continuous functions. The supremum metric measures the
maximum distance between two functions whereas the $L^1$ metric measures a mean
distance between two functions.
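Both claims about the sequence $f_n$ can be checked with exact arithmetic. In the Python sketch below, the $L^1$ distance from $f_n$ to the constant function 1 is computed as the area of a rectangle of width $\frac{1}{n+1}$ plus a triangle of base $\frac{1}{n} - \frac{1}{n+1}$, and the supremum distance between $f_n$ and $f_m$ is bounded below by evaluating both functions at the single point $x = \frac{1}{\min(n, m) + 1}$. The function names are illustrative, and the choice of evaluation point is an observation made for this sketch.

```python
from fractions import Fraction


def f(n, x):
    """The piecewise linear function f_n evaluated at a rational x in [0, 1]."""
    low, high = Fraction(1, n + 1), Fraction(1, n)
    if x <= low:
        return Fraction(0)
    if x <= high:
        return n * (n + 1) * x - n
    return Fraction(1)


def l1_distance_to_one(n):
    """Exact integral over [0, 1] of |1 - f_n(x)|: a rectangle plus a triangle."""
    low, high = Fraction(1, n + 1), Fraction(1, n)
    return low + (high - low) / 2


def sup_gap(n, m):
    """|f_n(x) - f_m(x)| at x = 1/(min(n, m) + 1), a lower bound for the sup distance."""
    x = Fraction(1, min(n, m) + 1)
    return abs(f(n, x) - f(m, x))


print([l1_distance_to_one(n) for n in (1, 9, 99)])
print(sup_gap(3, 4), sup_gap(3, 7))
```

The $L^1$ distances simplify to $\frac{2n + 1}{2n(n + 1)}$, which tends to 0, while the gap at the chosen point is always 1. Since every $f_n$ takes values in $[0, 1]$, the supremum distance cannot exceed 1, so it equals 1 whenever $n \neq m$.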

10.3.1 Exercises
Write proofs for each of the following statements.
1. Let $C$ be a circle. For $x$ and $y$ in $C$, define $d(x, y)$ to be the number in $[0, \pi]$
equal to the measure of the central angle in $C$ of the arc bounded by $x$ and $y$.
Show that $C$ with this distance function is a metric space.
2. If $d$ is defined for points $(x_1, y_1)$ and $(x_2, y_2)$ in $\mathbb{R}^2$ by $2|x_1 - x_2| + 3|y_1 - y_2|$,
then $\mathbb{R}^2$ with distance function $d$ is a metric space.
3. Let $X$ be the set consisting of all integers plus one extra point $M$. For each
$x \in X$, let $d(x, x)$ be 0. For integers $x \neq y$, let $d(x, y) = \frac{1}{\min(|x|, |y|) + 1}$, and for
each integer $x$, let $d(x, M) = d(M, x) = \frac{1}{|x| + 1}$. Then $\langle X, d \rangle$ is a metric space.
4. Let $X$ be the collection of all sequences of real numbers $a_1, a_2, a_3, \ldots$ for which
there exists a natural number $n$ such that $a_j = a_k$ for all $j$ and $k$ greater than or
equal to $n$. In other words, $X$ is the collection of all sequences which are constant
from some point on, such as $1, 2, 3, 4, 3, 3, 3, 3, \ldots$ or $\frac{1}{2}, \frac{2}{3}, \frac{2}{3}, \frac{2}{3}, \frac{2}{3}, \ldots$. Define
the distance between two sequences $\langle a_j \rangle$ and $\langle b_j \rangle$ to be 0 if the sequences are
identical, and to be the least natural number $n$ for which the difference between
the two sequences $\langle a_j - b_j \rangle$ is constant for all terms $j \geq n$. Then $X$ is a metric
space with this metric.
5. Let $p$ be any prime number. Then for any two rational numbers $r$ and $s$, define
$d(r, s) = 0$ if $r = s$. Otherwise, if $r \neq s$, then $|r - s| = p^n \frac{a}{b}$ where $a$ and
$b$ are relatively prime natural numbers, $n$ is an integer, and neither $a$ nor $b$ is
divisible by $p$. Define $d(r, s) = p^{-n}$. Then the rational numbers with distance
function $d$ form a metric space.
6. If $\langle X, d \rangle$ is a metric space, then for any $c > 0$, $\langle X, c \cdot d \rangle$ is also a metric
space.
7. If $\langle X, d_1 \rangle$ and $\langle X, d_2 \rangle$ are both metric spaces, then $\langle X, d_1 + d_2 \rangle$ is also a
metric space.
8. If $\langle X, d_X \rangle$ and $\langle Y, d_Y \rangle$ are both metric spaces, then $X \times Y = \{(x, y) \mid x \in
X \text{ and } y \in Y\}$ with distance function $d\big((x_1, y_1), (x_2, y_2)\big) = d_X(x_1, x_2) +
d_Y(y_1, y_2)$ is a metric space.
9. $C[0, 1]$ with distance function $d(f, g) = \sqrt{\int_0^1 \big( f(x) - g(x) \big)^2 \, dx}$ is a metric space.
This distance function is called the $L^2$ metric.
10. The $L^1$ metric $d(f, g) = \int_0^1 |f(x) - g(x)| \, dx$ is not a metric for the space of all
Riemann integrable functions defined on the interval $[0, 1]$.
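The distance function in Exercise 5 is easier to experiment with once it is written as a procedure: factor every power of $p$ out of the numerator and denominator of $|r - s|$, record the net exponent $n$, and return $p^{-n}$. The Python sketch below is one possible rendering of that definition, offered only as an aid for exploring examples; the function name `p_adic_distance` is a label chosen here, and the code does not prove the exercise.

```python
from fractions import Fraction


def p_adic_distance(r, s, p):
    """Return p**(-n) where |r - s| = p**n * a / b and p divides neither a nor b."""
    r, s = Fraction(r), Fraction(s)
    if r == s:
        return Fraction(0)
    difference = abs(r - s)
    a, b = difference.numerator, difference.denominator  # already in lowest terms
    n = 0
    while a % p == 0:      # powers of p in the numerator raise n
        a //= p
        n += 1
    while b % p == 0:      # powers of p in the denominator lower n
        b //= p
        n -= 1
    return Fraction(p) ** (-n)


print(p_adic_distance(0, 9, 3))                 # |r - s| = 3**2, so n = 2
print(p_adic_distance(1, 3, 3))                 # |r - s| = 2, so n = 0
print(p_adic_distance(0, Fraction(1, 25), 5))   # |r - s| = 5**(-2), so n = -2
```

Notice that numbers differing by a high power of $p$ are close together in this metric, which is the opposite of what the ordinary absolute value would suggest.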

10.4 Topology of Metric Spaces


Recall that in the real numbers, $\mathbb{R}$, the interior of a set $S$ is defined to be the set of
points $x \in S$ such that there is an $\epsilon > 0$ for which the entire interval $(x - \epsilon, x + \epsilon)$
is contained in $S$. The exterior of a set is defined to be the set of points $x \notin S$ such
that there is an $\epsilon > 0$ for which the entire interval $(x - \epsilon, x + \epsilon)$ is contained in the
complement of $S$. The boundary of a set $S$ is defined to be the set of points neither in
the interior nor the exterior of the set, or the points $x$ such that for all $\epsilon > 0$ the set
$(x - \epsilon, x + \epsilon)$ contains elements of $S$ and elements of $S^c$. All three of these definitions
generalize in a natural way to all metric spaces. Indeed, one just has to replace the
role of the open interval $(x - \epsilon, x + \epsilon)$ with the neighborhood $N(x, \epsilon)$. That is, if
$\langle X, d \rangle$ is a metric space, and $S \subseteq X$, the interior of $S$, $\operatorname{int}(S)$, is the set of $x \in S$
such that there is an $\epsilon > 0$ for which $N(x, \epsilon) \subseteq S$, the exterior of $S$, $\operatorname{ext}(S)$, is the
set of $x \in S^c$ such that there is an $\epsilon > 0$ for which $N(x, \epsilon) \subseteq S^c$, and the boundary
of $S$, $\partial S$, is the set of $x \in X$ such that for every $\epsilon > 0$, the set $N(x, \epsilon)$ contains points
in $S$ and points in $S^c$.
The definitions of interior, exterior, and boundary, in turn, allow one to define
open and closed sets, accumulation point, derived set, and closure in ways analogous
to how they are defined for the set of real numbers. That is, if $S$ is a subset of a
metric space $X$, $S$ is an open set if $S = \operatorname{int}(S)$, $S$ is a closed set if $\partial S \subseteq S$, $S$ has
accumulation point $a$ if, for every $\epsilon > 0$, the neighborhood $N(a, \epsilon)$ with the point $a$
removed meets $S$, that is, $\big( N(a, \epsilon) \setminus \{a\} \big) \cap S \neq \emptyset$, the derived set of $S$, $S'$,
is the set of accumulation points of $S$, and the closure of $S$, $\operatorname{cl}(S)$, is $S \cup S'$. It is worth
noting that for every $x \in X$ and every $\epsilon > 0$, $N(x, \epsilon)$ is an open set. This is easy
to show by thinking about how you prove that an open interval in the real numbers
is an open set. In the real numbers, if $a < b$, then $(a, b)$ is open because if $y \in (a, b)$,
the interval $(y - \delta, y + \delta) \subseteq (a, b)$ when $\delta = \min(y - a, b - y)$. Similarly, then, in
metric space $\langle X, d \rangle$, if $a \in X$ and $\epsilon > 0$ are given, let $y \in N(a, \epsilon)$. It follows from
the definition of $N(a, \epsilon)$ that $\delta = \epsilon - d(a, y) > 0$. Then, if $x \in N(y, \delta)$, $d(x, y) <
\delta = \epsilon - d(a, y)$, so by the triangle inequality $d(a, x) \leq d(a, y) + d(y, x) < \epsilon$, and
$x \in N(a, \epsilon)$. Thus, you can conclude that $N(y, \delta) \subseteq N(a, \epsilon)$ when $\delta = \epsilon - d(a, y)$,
which proves that $N(a, \epsilon)$ is open.
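The choice $\delta = \epsilon - d(a, y)$ in this argument can be tried out concretely. The Python sketch below works in $\mathbb{R}^2$ with the taxicab metric and exact rational arithmetic, picks one point $y$ in $N(a, \epsilon)$, and tests randomly sampled points of $N(y, \delta)$ for membership in $N(a, \epsilon)$. The metric, the specific points, and the helper names are all assumptions made for this illustration; a random sample is evidence, not a proof.

```python
import random
from fractions import Fraction


def taxicab(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def inner_radius(a, eps, y, d):
    """delta = eps - d(a, y), which is positive exactly when y is in N(a, eps)."""
    return eps - d(a, y)


def sample_near(y, delta):
    """Random point x with each coordinate offset at most 0.495 * delta,
    so the taxicab distance d(x, y) is at most 0.99 * delta < delta."""
    scale = delta * Fraction(99, 200)
    return tuple(yi + scale * Fraction(random.randint(-100, 100), 100) for yi in y)


random.seed(2)
a, eps = (Fraction(0), Fraction(0)), Fraction(1)
y = (Fraction(1, 3), Fraction(1, 4))
delta = inner_radius(a, eps, y, taxicab)   # 1 - 7/12 = 5/12
samples = [sample_near(y, delta) for _ in range(200)]
print(delta)
print(all(taxicab(x, y) < delta and taxicab(a, x) < eps for x in samples))
```

The same check could be run with any other metric in place of `taxicab`, since the argument in the text uses nothing beyond the triangle inequality.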

Fig. 10.5 Proving that the union of open sets is open

Many of the theorems pertaining to the topology of the real numbers proved in the
preceding chapter can now be reproved in the context of metric spaces by merely
replacing references to open intervals $(x - \epsilon, x + \epsilon)$ with the new neighborhood
notation, $N(x, \epsilon)$. For example, consider the proof that the union of open sets is also
an open set (Fig. 10.5).
PROOF: In metric space $\langle X, d \rangle$ assume that for each $i$ in the index set $I$,
$A_i$ is an open set. Then $\bigcup_{i \in I} A_i$ is an open set.
In metric space $\langle X, d \rangle$ assume that for each $i$ in the index set $I$, $A_i$ is an
open set.
Let $x \in \bigcup_{i \in I} A_i$.
By the definition of set union, there is a $j \in I$ such that $x$ is an element of
the open set $A_j$.
By the definition of open set, there is an $\epsilon > 0$ such that $N(x, \epsilon) \subseteq A_j$.
But by the definition of set union, $A_j \subseteq \bigcup_{i \in I} A_i$ showing that $N(x, \epsilon) \subseteq \bigcup_{i \in I} A_i$.
Thus, every point of the union is an interior point of the union,
which proves the theorem.

Several other examples are left for the exercises. Note that any metric space $\langle X, d \rangle$
with the given definition of open set is a topological space as defined in Chap. 9.

10.4.1 Exercises
Write proofs for each of the following statements.



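1. For every subset, $S$, of metric space $\langle X, d \rangle$, $\operatorname{int}\big(\operatorname{int}(S)\big) = \operatorname{int}(S)$.
2. For every subset, $S$, of metric space $\langle X, d \rangle$, $\partial(\partial S) \subseteq \partial S$.
3. For subsets $S$ and $T$ of metric space $\langle X, d \rangle$, $\operatorname{ext}(S \cup T) \subseteq \operatorname{ext}(S) \cap \operatorname{ext}(T)$.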


4. A subset S of a metric spa