Aug 10, 2016

Writing Proofs in Analysis

Jonathan M. Kane

Jonathan M. Kane

Department of Mathematics

University of Wisconsin - Madison

Madison, WI, USA

ISBN 978-3-319-30965-1

ISBN 978-3-319-30967-5 (eBook)

DOI 10.1007/978-3-319-30967-5

Library of Congress Control Number: 2016936668

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG Switzerland

E. Anderson, and especially James L. Nelson who, at the University of Minnesota Duluth, taught me the fundamentals of writing proofs in analysis.

Acknowledgments

I wish to thank Natalya St. Clair for her excellent work creating the illustrations appearing in this textbook. She took my crude sketches and vague ideas and turned them into pleasing artwork and instructive diagrams. I also wish to thank Daniel M. Kane, Alan Gluchoff, Thomas Drucker, and Walter Stromquist for their insightful comments about the presentation, content, and correctness of the text.

Preface

After learning to solve many types of problems such as those found in the first courses in Algebra, Geometry, Trigonometry, and Calculus, mathematics students are usually exposed to a transition course where they are expected to write proofs of various theorems. I taught such a course for a dozen years and was never satisfied with the textbooks available for that course. Although such textbooks often teach the fundamentals of logic (conditionals, biconditionals, negations, truth tables) and give some common proof strategies such as mathematical induction, they fail to teach what a student needs to be thinking about when trying to construct a proof. Many of these books present a great number of well-written proofs and then ask students to write proofs of similar statements in the hope that the students will be able to mimic what they have seen. Some of these books are also designed to be used as an introductory textbook in Analysis, Abstract Algebra, Topology, Number Theory, or Discrete Mathematics, and, as such, they concentrate more on explaining the fundamentals of those topic areas than on the fundamentals of writing good proofs.

This Book Is Not Your Traditional Transition Textbook

The goal of this book is to give the student precise training in the writing of proofs by explaining what elements make up a correct proof, by teaching how to construct an acceptable proof, by explaining what the student is supposed to be thinking about when trying to write a proof, and by warning about pitfalls that result in incorrect proofs. In particular, this book was written with the following directives:

Unlike many transition books which do not give enough instruction about how to write proofs, most of the proofs presented in this text are preceded by detailed explanations describing the thought process one goes through when constructing the proof. Then a good proof is given that incorporates the elements of that discussion.

For proofs that share the same general structure, such as the proof of $\lim_{x \to a} f(x) = L$ for various functions, proof templates are provided that give a generic approach to writing that type of proof.


logic, set theory, cardinal numbers, and an axiomatic construction of the real numbers. I find that students do not appreciate the details of these discussions when these concepts are presented before they are needed to write a specific proof. For example, truth tables are very helpful in verifying the truth of a complex logical statement, but it is hard for students at that level to see the connection between the truth value of a complex statement and the formation of a proof. Therefore, I introduce many of these ideas as needed within the contexts of writing Analysis proofs and have kept the introductory material to a minimum.

Many books that propose to teach students to write proofs in Analysis get carried away with covering those great topics in Analysis and cut back on the proof writing instruction. The books may start out teaching about proofs, but after a few chapters of introduction, they assume that the students now understand everything they need to know about writing proofs, and the books concentrate entirely on the concepts of Analysis. This book covers plenty of Analysis and can be used as a textbook for a typical beginning Real Analysis course, but it never loses sight of the fact that its primary focus is on proof writing skills. Certainly, one can use this book for a beginning course in Real Analysis because it thoroughly covers the standard theorems, but as a first course in proof writing, it will succeed where others fail.

If the students using this book have already had a thorough background in writing proofs, then this book could be used as a standard one-semester course in Real Analysis. These students might begin in Sect. 2.5 and, depending on their background, be expected to cover the material through Chaps. 6, 7, or 8. On the other hand, if the students are using this book both as an introduction to proof writing and an introduction to Analysis, then the textbook can be used for a two-semester course in Real Analysis and proof writing. The first semester might aim to cover the first five or six chapters, while the second semester aims to complete the book. For most of the topics, it is important that the chapters be covered in their prescribed order. Elements of later chapters do depend on the material covered in earlier chapters.

Madison, Wisconsin, USA

2016

Jonathan M. Kane

Contents

Acknowledgments
Preface
List of Figures
List of Proof Templates

1 What Are Proofs, and Why Do We Write Them?
    1.1 What Is a Proof?
    1.2 Why We Write Proofs

2 The Basics of Proofs
    2.1 The Language of Proofs
        2.1.1 Conditional Statements
        2.1.2 Negation of a Statement
        2.1.3 Proofs of Conditional Statements
        2.1.4 Exercises
    2.2 Template for Proofs
        2.2.1 Exercises
    2.3 Proofs About Sets
        2.3.1 Set Notation
        2.3.2 Exercises
        2.3.3 Proofs About Subsets
        2.3.4 Exercises
        2.3.5 Proofs About Set Equality
        2.3.6 Exercises
    2.4 Proofs About Even and Odd Integers
        2.4.1 Definitions of Even and Odd Integers
        2.4.2 Proofs About Even and Odd Integers
        2.4.3 Exercises
    2.5 Basic Facts About Real Numbers
        2.5.1 Ordered Fields
        2.5.2 The Completeness Axiom and the Real Numbers
        2.5.3 Absolute Value, the Triangle Inequality, and Intervals
        2.5.4 Exercises
    2.6 Functions
        2.6.1 Function, Domain, Codomain
        2.6.2 Surjection
        2.6.3 Injection
        2.6.4 Composition
        2.6.5 Exercises

3 Limits
    3.1 The Definition of Limit
        3.1.1 Exercises
    3.2 Proving $\lim_{x \to a} f(x) = L$
        3.2.1 Exercises
    3.3 One-Sided Limits
        3.3.1 Exercises
    3.4 Limits at Infinity
        3.4.1 Exercises
    3.5 Limit of a Sequence
        3.5.1 Definition of Sequence
        3.5.2 Arithmetic with Sequences
        3.5.3 Monotone Sequences
        3.5.4 Subsequences
        3.5.5 Limit of a Sequence
        3.5.6 Limits of Monotone Sequences and Mathematical Induction
        3.5.7 Cauchy Sequences
        3.5.8 Exercises
    3.6 Proving That a Limit Does Not Exist
        3.6.1 Why a Limit Might Not Exist
        3.6.2 Quantifiers and Negations
        3.6.3 Proving No Limit Exists
        3.6.4 Exercises
    3.7 Accumulation Points
        3.7.1 Exercises
    3.8 Infinite Limits
        3.8.1 Exercises
    3.9 The Arithmetic of Limits
        3.9.1 Limit of a Sum
        3.9.2 Limit of a Product
        3.9.3 Limit of a Quotient
        3.9.4 Limit of Rational Functions
        3.9.5 Other Types of Limits
        3.9.6 Exercises
    3.10 Other Limit Theorems
        3.10.1 The Limit of a Positive Function
        3.10.2 Uniqueness of Limits
        3.10.3 The Squeezing Theorem
        3.10.4 Limits of Subsequences
        3.10.5 Exercises
    3.11
        3.11.1 Exercises

4 Continuity
    4.1 The Definition of Continuity
    4.2 Proving the Continuity of a Function
        4.2.1 Exercises
    4.3 Uniform Continuity
        4.3.1 Exercises
    4.4 Compactness and the Heine-Borel Theorem
        4.4.1 Open Covers and Subcovers
        4.4.2 Proofs of the Heine-Borel Theorem
        4.4.3 Uniform Continuity on Closed Bounded Intervals
        4.4.4 Exercises
    4.5 The Arithmetic of Continuous Functions
        4.5.1 Exercises
    4.6 Composition, Absolute Value, Maximum, and Minimum
        4.6.1 Exercises
    4.7 Other Continuity Theorems
        4.7.1 Boundedness of Continuous Functions
        4.7.2 Obtaining Extreme Values
        4.7.3 The Intermediate Value Property
        4.7.4 Exercises
    4.8 Discontinuity

5 Derivatives
    5.1 The Definition of Derivative
    5.2 Differentiation and Continuity
    5.3 Calculating Derivatives
    5.4 The Arithmetic of Derivatives
        5.4.1 Exercises
    5.5 Chain Rule and Inverse Functions
    5.6 Increasing Functions, Decreasing Functions, and Critical Points
        5.6.1 Exercises
    5.7 The Mean Value Theorem
        5.7.1 Exercises
    5.8 L'Hopital's Rule
        5.8.1 Exercises
    5.9 Intermediate Value Property and Limits of Derivatives

6 Riemann Integrals
    6.1 Area
    6.2 Cardinality of Sets
        6.2.1 Exercises
    6.3 Measure Zero
        6.3.1 Exercises
    6.4 Areas in the Plane
        6.4.1 Exercises
    6.5
        6.5.1 Exercises
    6.6 Properties of Integrals
        6.6.1 Exercises
    6.7 Integrable Functions
        6.7.1 Exercises
    6.8 Step Functions
        6.8.1 Exercises
    6.9 Integrals of Continuous Functions
        6.9.1 Exercises
    6.10 Characterization of Integrable Functions
        6.10.1 Exercises

7 Infinite Series
    7.1 Convergence of Infinite Series
        7.1.1 Exercises
    7.2 Absolute and Conditional Convergence
    7.3 The Arithmetic of Series
        7.3.1 Exercises
    7.4 Tests for Absolute Convergence
        7.4.1 Comparison Test
        7.4.2 Ratio Test
        7.4.3 Root Test
        7.4.4 Integral Test
        7.4.5 Exercises
    7.5 Alternating Series Test
        7.5.1 Exercises
    7.6 The Smallest Divergent Series
    7.7 Rearrangement of Terms
        7.7.1 Addition of Parentheses
        7.7.2 Order of Terms
        7.7.3 Exercises
    7.8 Cauchy Products
        7.8.1 Exercises

8 Sequences of Functions
    8.1 Pointwise Convergence
        8.1.1 Exercises
    8.2 Uniform Convergence
        8.2.1 Exercises
    8.3 Monotone Convergence
    8.4 Series of Functions
    8.5 Power Series
        8.5.1 Absolute Convergence
        8.5.2 Interval of Convergence
        8.5.3 Differentiability
        8.5.4 Taylor's Theorem
        8.5.5 Arithmetic of Power Series
        8.5.6 Exercises
    8.6 Fundamental Question of Analysis

9 Topology of the Real Line
    9.1 Interior, Exterior, and Boundary
        9.1.1 Exercises
    9.2 Open and Closed Sets
        9.2.1 Exercises
    9.3 Unions and Intersections
        9.3.1 Exercises
    9.4 Continuous Functions Applied to Sets
        9.4.1 Exercises
    9.5 Closure
        9.5.1 Exercises
    9.6 Compactness
        9.6.1 Exercises
    9.7 Connectedness
        9.7.1 Exercises

10 Metric Spaces
    10.1 Definition of Metric Space
    10.2 Inequalities
        10.2.1 Cauchy-Schwarz Inequality
        10.2.2 Minkowski Inequality
        10.2.3 Exercises
    10.3 Examples of Metric Spaces
        10.3.1 Exercises
    10.4 Topology of Metric Spaces
        10.4.1 Exercises
    10.5 Limits in Metric Spaces
        10.5.1 Exercises
    10.6 Continuous Functions on Metric Spaces
        10.6.1 Exercises
    10.7 Homeomorphism
        10.7.1 Exercises
    10.8 Connected Metric Spaces
        10.8.1 Exercises
    10.9 Compact Metric Spaces
        10.9.1 Exercises
    10.10 Complete Metric Spaces
        10.10.1 Exercises
    10.11 Contraction Mappings
        10.11.1 Contraction Mapping Theorem
        10.11.2 Picard's Theorem
        10.11.4 Exercises

Books for Further Reading
Index

List of Figures

Fig. 1.1

Figs. 2.1-2.5
  $(A \cup B)^c$ is equal to $A^c \cap B^c$
  Showing the least upper bound of $S$ is $s = \sqrt{r}$
  Triangle inequality
  Composition $(f \circ g)(x) = z$

Fig. 3.1 $\lim_{x \to a} f(x) = L$
Fig. 3.2 $\lim_{x \to a} f(x) = L$
Fig. 3.3 $\lim_{x \to a} f(x) = L$
Fig. 3.4 Graph of $f(x)$
Fig. 3.5 Approaching a limit as $x \to \infty$
Fig. 3.6 Proving bounded monotone sequences converge
Fig. 3.7 $f$ has no limit at $x = 2$
Fig. 3.8 Graph of $\sin \frac{1}{x}$
Fig. 3.9 Set with accumulation point $a$ and isolated point $b$
Fig. 3.10 Sequences approaching the lim sup and lim inf

Fig. 4.1 Continuity of a function
Fig. 4.2 A function equal to $2x$ for rational $x$ and $x + 1$ for irrational $x$
Fig. 4.3 $f(x) = \frac{1}{x}$ is not uniformly continuous
Fig. 4.4 Heine-Borel Theorem first proof
Fig. 4.5 Heine-Borel Theorem second proof
Fig. 4.6 $y$ and $z$ straddle one endpoint but remain in an interval of the open cover
Fig. 4.7 Proving that a continuous function on $[a, b]$ is bounded
Fig. 4.8 The maximum and minimum of a function $f(x)$ on an interval
Fig. 4.9 $f$ passing through each $y$ between $f(c)$ and $f(d)$
Fig. 4.10 A function with a jump discontinuity
Fig. 4.11 Graph of $\sin \frac{1}{x}$
Fig. 4.12 Graphs of $\mathrm{sgn}(x)$ and $\lfloor x \rfloor$
Fig. 4.13 Graphs of functions with discontinuities

Figs. 5.1-5.7
  Tangent Line
  Restricting $\sin(x)$ to get $\sin^{-1}(x)$
  Graph showing maxima and minima
  The proof of Rolle's Theorem
  Point $c$ where the tangent line is parallel to the secant line
  $x^2 \sin \frac{1}{x^2}$ and its derivative

Figs. 6.1-6.9
  Determining $y$ using a diagonalization argument
  Construction of the Cantor set
  Covering a line segment with smaller and smaller squares
  An $8 \times 8$ grid of rectangles overlaying a triangle
  Approximating the area under a curve with narrowing rectangles
  The step function $s(x)$
  Choosing j on $(x_{j-1}, x_j)$
  The mean value theorem for integration

Fig. 7.1 Comparing the series with the integral in the Integral Test
Fig. 7.2 Converging to $\ln 2$ with an alternating series
Fig. 7.3 Rearranging terms to converge to $L$

Fig. 8.1 ... discontinuous function
Fig. 8.2 The sequence of functions $|x|^{\frac{n+1}{n}}$ converging to the function $f(x) = |x|$
Fig. 8.3 A sequence of functions with integral 1 converging to the function $f(x) = 0$
Fig. 8.4 A sequence of functions converging uniformly
Fig. 8.5 If continuous function $f_n$ is close to $f$, then $f(x)$ is close to $f(a)$

Figs. 9.1-9.9
  $x$ in $\partial(\partial S)$, $y$ in $\partial S$
  An open set $S$, its boundary, and its complement $S^c$
  Showing boundaries of sets
  The union of open sets is an open set
  Mapping sets
  The closure of a set
  The sets $[0, 1]$ and $(4, 5)$ are disconnected
  The set $C$ is a connected set. The set $N$ is not a connected set.
Fig. 9.10 Graph of $\sin \frac{1}{x}$ with the $y$-axis

Fig. 10.2 Euclidean distance in $\mathbb{R}^2$
Fig. 10.3 $N(0, 1)$ in the Euclidean, taxicab, and supremum metrics
Fig. 10.4 Some functions in $C[0, 1]$
Fig. 10.5 Proving that the union of open sets is open
Fig. 10.6 Limit of $f : X \to Y$ as $x$ approaches $a$ is $L$
Fig. 10.7 Limit of a sequence in a metric space
Fig. 10.8 A compact set of a metric space is closed and bounded
Fig. 10.9 Extrema of a continuous real-valued function on a compact set
Fig. 10.10 Continuous bijection on a compact metric space
Fig. 10.11 Enclosing a closed bounded set in a grid
Fig. 10.12 Contraction mapping
Fig. 10.13 The Mandelbrot set
Fig. 10.14 Stages in the forming of the Sierpinski triangle
Fig. 10.15 Generation of a fractal fern

Proof
Proving $A \subseteq B$ for sets $A$ and $B$
Proving $A = B$ for sets $A$ and $B$
Proving a function $f$ is surjective
Proving a function $f$ is injective
Proving $\lim_{x \to a} f(x) = L$
Proving a result using mathematical induction
Proving $\lim_{x \to a} f(x)$ does not exist
Proving the function $f$ is continuous at the point $a$
Proving the function $f$ is uniformly continuous on the set $A$
Proving $\langle X, d \rangle$ is a metric space

Chapter 1

A statement in Mathematics is just a sentence which could be designated as true

or false. The sentences "$1 + 1 = 2$" and "$x = 4$ implies $x^2 = 16$" are true

statements while "All rational numbers are positive" and "There is a real number

$x$ such that $x^2 + 5 = 2$" are false statements. Some sentences like "Green is

nice" or "Authenticity runs hot" are too ambiguous, a matter of opinion, or are just

plain nonsense and cannot be said to be true or false, so mathematicians would

not consider them to be statements. Mathematicians have a lot of words for kinds

of statements including many that you have heard: definition, axiom, postulate,

principle, conjecture, lemma, proposition, law, theorem, contradiction, and others.

You are certainly familiar with the numbers you use for counting items:

1, 2, 3, 4, and so forth. Suppose you wish to investigate statements about these

numbers to see which statements hold true for all of these numbers. This is an

admirable mathematical pursuit, so how would you get started? Mathematicians

know from experience that if you want to begin an investigation, you better start

with definitions, that is, you better make some clear statements about the objects

you are about to study, because there are examples of mathematicians running off

to study something without first making clear what it is they are studying, and later

running into problems because they have not been consistent about how they are

treating these new objects. This happened, for example, when people investigated

the concept of limit before a precise definition of limit was in place. OK, so perhaps

you make some statements about the numbers with which you want to work so that

you are confident that you understand the collection $1, 2, 3, 4, 5, \ldots$. What are you

going to be able to do with these numbers? If you only know the names of these

numbers and have a symbolic representation for each, there is not a great deal you

can do with them. Perhaps you could get a collection of blocks and paint one number

on each block. Then you could have fun rearranging these numbers just as you have

seen done by countless children.

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_1

But more likely you are interested in investigating some properties of these

numbers having to do with their order or how they behave when operated on

by addition or multiplication. This, of course, would mean that you will need to

make clear statements about addition and multiplication operations and a "less than"

relationship, again, so that you do not run into problems later because you were

being ambiguous. So, you might write definitions of addition, multiplication, and

"less than," and then make statements about how these operations behave such as a

Commutative Law of Addition ($m + n = n + m$), a Distributive Law of Multiplication

over Addition ($a(b + c) = ab + ac$), and an Order Property of Addition ($r < s$

implies $r + t < s + t$). These statements about how these defined quantities work

are called axioms, postulates, or principles. They are statements that you accept

as the guiding rules for how your mathematical objects behave and go beyond the

definitions to describe and make precise just what the definitions are talking about.

Once you have made definitions and laid out your axioms, you should have the

tools necessary to begin an investigation of other properties. Suppose that someone

looks at a few examples and notices that $1 + 9 = 10$ and 10 is 2 times another

number, 5. They then notice that $4 + 12 = 16$, $3 + 147 = 150$, and $1002 + 6 = 1008$,

and all of these results are also numbers equal to 2 times another number. This might

lead them to make the statement that "if you add two natural numbers together,

the result is always 2 times another number." Such a statement would be called a

conjecture, a statement whose truth has not yet been determined. Of course, you

know that this statement is false and came about because the investigator had not

yet considered enough examples. Once they stumble upon $5 + 8 = 13$ and notice

that 13 cannot be represented as 2 times another number, they will know that the

statement does not hold true in every case.
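The search for a counterexample described above can be sketched in a few lines of Python. This is only an illustration: the helper names `is_even` and `first_counterexample` and the search bound are assumptions made here, not part of the text.

```python
from itertools import product


def is_even(n: int) -> bool:
    """Return True when n equals 2 times another integer."""
    return n % 2 == 0


def first_counterexample(limit: int):
    """Return the first pair (a, b) with 1 <= a, b <= limit whose sum is odd.

    Returns None if every pair in the range has an even sum.
    """
    for a, b in product(range(1, limit + 1), repeat=2):
        if not is_even(a + b):
            return (a, b)
    return None


# The four examples from the text all happen to support the conjecture.
examples = [(1, 9), (4, 12), (3, 147), (1002, 6)]
print(all(is_even(a + b) for a, b in examples))

# A systematic search finds a counterexample almost immediately.
print(first_counterexample(10))
```

A handful of supporting examples proves nothing, while one counterexample settles the matter, which is why the conjecture is discarded as soon as a pair such as 5 and 8 turns up.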

Other conjectures such as "for every natural number $a$, the number $a^2 - 3a + 12$

is a multiple of 2" hold up to more scrutiny. At some point in your investigation you

might see a convincing argument that this conjecture is, in fact, a true statement.

Such a convincing argument is what is called a proof. Once it is known that a

statement has a proof, it is known as a theorem, lemma, corollary, proposition,

or law. So, a proof of a statement in mathematics is a convincing argument that

establishes the truth of that statement.
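One possible convincing argument for the conjecture above can be sketched as follows, reading the expression as $a^2 - 3a + 12$. The factoring step and the split into even and odd cases are an illustration chosen here, not an argument taken from the text.

```latex
\begin{align*}
a^2 - 3a + 12 &= a(a - 3) + 12. \\
\text{If } a = 2k:\quad a(a - 3) &= 2k(2k - 3) = 2\bigl(k(2k - 3)\bigr). \\
\text{If } a = 2k + 1:\quad a(a - 3) &= (2k + 1)(2k - 2) = 2\bigl((2k + 1)(k - 1)\bigr).
\end{align*}
```

In either case $a(a - 3)$ is 2 times an integer, and $12 = 2 \cdot 6$, so the sum $a(a - 3) + 12$ is a multiple of 2.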

Some statements are very easily proved, and certainly mathematicians often set

up axioms in order to make particular statements easy to prove. At first this may

appear to be cheating or, at best, unproductive and uninteresting because it seems

to defeat the purpose of establishing truth by dictating rules that make it trivial to

establish the truth. But this is certainly not the case. It is common for mathematicians

to have an intuitive idea about how a system should work before they feel that they

understand it enough to set down formal definitions and axioms. Perhaps you wanted

addition of all natural numbers $a$ and $b$ to satisfy $a + b = b + a$. Then it would make

sense to include this rule among your axioms. The axioms are written with the idea

of establishing enough structure so that the statements the mathematicians want to

hold true can easily be proved. The richness of mathematics is that after assuring

that the obvious can be proved from the axioms, there are many more results that

can be proved that are not immediately obvious from the definitions and axioms,

statements which might never have been apparent to those who set up the system in

the first place. For example, Fermat's Last Theorem (there are no natural numbers $a$,

$b$, $c$, and $n > 2$ such that $a^n + b^n = c^n$) is a statement about natural numbers which

could only be conjectured after investigating a large number of examples, and stood

as a conjecture for hundreds of years before a proof was provided.

Occasionally, it is shown that a conjecture is independent of the axioms; that

is, neither the truth nor the falseness of the statement follows from the axioms. Two

famous examples are the statements about sets known as the Axiom of Choice and

the Continuum Hypothesis which have been shown to be independent of the original

axioms of Zermelo-Fraenkel Set Theory. The independence of such statements

suggests that the axiom system is not rich enough in structure to establish the truth

of these statements, and that if one chose to do so, those statements could be added

to the list of axioms for the system. The Axiom of Choice or something equivalent

to it, for example, is now usually listed along with the Zermelo-Fraenkel axioms.

One certainly hopes that it is not possible to prove two contradictory statements

about objects in a system. Such an occurrence would say that the axioms of the

system were inconsistent, and this would require the axioms to be changed. After

the original ground rules for Set Theory were established by Georg Cantor in the

1870s and 1880s, Bertrand Russell pointed out in 1901 a paradox (contradiction)

that is a consequence of those rules. Now commonly known as Russell's paradox,

it stimulated a flurry of activity which resulted in the young field of Set Theory

being put on a firm foundation (we hope) with the creation and adoption of the

Zermelo-Fraenkel axioms.

The language of a proof can vary depending on who is writing the proof and

who is the intended reader. In other words, what makes a convincing argument may

well depend on who it is that needs to be convinced. For example, if two experts

in Functional Analysis are speaking to each other, one might prove a statement by

saying "Oh, that's just a consequence of the Hahn-Banach Theorem." That proof

might be sufficient since it completely describes the reasoning behind the statement

in question due to the shared knowledge of the two experts. On the other hand, if

one of these experts were speaking to a beginning mathematics graduate student,

the proof would need to include far more detail in order for it to be a convincing

argument. If the expert were speaking to a high school student, the proof might

need to be a complete book that both introduces the needed concepts and explains

many results needed to understand the proof.

It is important to understand that there is a difference between knowing why

a statement is true and knowing how to write a good proof of the statement. It is

quite possible to learn a great deal of mathematics, to be able to solve many types

of mathematical problems, and to understand why particular properties must hold

without being able to write coherent proofs of these properties. The situation is analogous to

that of a police detective who has gathered enough evidence to be convinced which of the

many suspects has committed a particular crime; it is quite another thing to have

the criminal successfully prosecuted in a court of law, resulting in the criminal's

conviction and eventual punishment for the crime. A student in Analysis needs to

learn many strategies that can be brought to bear when writing proofs. Some of these

strategies are methods or tricks that enter a student's "bag of tricks" which can be

employed later when solving problems or writing proofs. A student of proof writing

needs to learn how to take those strategies and turn them into coherent proofs where

the ideas are presented in a logical order, fill in all necessary details, and make clear

to the proof reader exactly why the chosen strategies justify the needed result.

This book talks about how you should go about writing proofs of the kinds

of statements typically found in the branch of Mathematics called Analysis. The

branches of mathematics are not precisely defined. After a new branch arises, some

mathematicians begin to combine ideas from older branches with ideas from the new

branch to form even newer areas of study. For example, there are branches called

Algebra, Geometry, and Topology. During the twentieth century mathematicians

began talking about Algebraic Topology, Algebraic Geometry, and Geometric

Topology. Very roughly speaking, then, some of the branches of mathematics are

Set Theory: the study of sets, set operations, functions between sets, orderings

of sets, and sizes of sets

Algebra: the study of sets upon which there are binary operations defined (such

as addition or multiplication) and includes Group Theory, Ring Theory, Field

Theory, and Linear Algebra

Topology: the study of continuous functions and properties of sets that are

preserved by continuous functions

Analysis: the study of sets for which there is a measure of distance allowing for

the definition of various limiting processes such as those found in the subjects

of Calculus, Differential Equations, Functional Analysis, Complex Variables,

Measure Theory, and many other areas.

Other areas of study such as Applied Mathematics, Combinatorics, Geometry,

Logic, and Probability are considered by some mathematicians to be their own branch

of mathematics or just as part of one or more of the above four branches. The

exact designation is important to some mathematicians and not to others. Although

mathematicians learn to write proofs in each of these branches of mathematics, one

has to begin the learning process someplace. Many teachers feel that Analysis is a

good area to start because students who have completed a study of Calculus will

already be familiar with just about all of the theorems discussed in a beginning

course in Analysis, and may already have an intuitive feeling for why these results

hold. That does not mean that those same students can write convincing proofs of

these theorems. It is the goal of this book to provide the training necessary so that

a student can learn to write proofs of these and similar theorems. Undergraduate

courses in Topology, Group Theory, Advanced Calculus, Graph Theory, and so

forth generally present the beginning concepts in each of these fields and try to

give students a feel for why the major results in the fields are true. Sometimes

this involves having the students learn proofs of these results while other times it

only involves a presentation of definitions and known results with the idea that the

students will be able to take the "why it is true" and turn it into a proof themselves.

This book is much more interested in turning known strategies into proofs than in

introducing a wealth of new strategies.

This book will present several templates for proofs as a tool for teaching how one might

approach the writing of a proof. For example, one can learn to prove a statement

of the form $\lim_{x \to \infty} f(x) = L$ by following a standard pattern. This book will display

proof patterns by presenting proof templates, and for each template it will discuss

proof examples showing how to use the template and the thought process needed

to complete such proofs. After that, a student would be expected to produce similar

proofs. There are other theorems in Analysis whose proofs involve the introduction

of some clever idea which time has shown to be useful. Beginning students would

not be expected to produce proofs using these new ideas on their own, so some of

these proofs are presented in order to teach the new proof strategy. The experienced

mathematician will have seen a large number of these clever proof techniques and

can be expected to reuse these techniques when writing a proof of some new

statement. Beginning students do not have this catalog of proof techniques from

which to draw, so they are not expected to be able to write proofs for such a wide

variety of statements. But one must start someplace when building up this catalog,

and it is a goal of this book to get students started in the right direction.

There are many reasons why mathematicians put a lot of weight on the writing of

proofs. Here are some of the reasons.

Determining Truth. Research mathematicians use proofs to determine what mathematical statements are true. Although many statements in mathematics are obviously true, many remain unproved conjectures for long periods of time before being

proved. When a conjecture stands unproved for many years, there is time for more

mathematicians to learn about the statement, and the conjecture may attract a great

deal of attention. When the conjecture is first stated, some may find it interesting,

but finding a suitable proof may not appear to be a difficult problem until many

people have tried unsuccessfully to find a proof. As this interesting statement

remains a conjecture for a longer and longer period of time, the mathematical

community realizes that the problem of finding a proof is much more involved than

originally expected. This is exciting partly because a wider community of experts

begins to wonder whether the statement under consideration is true and because

it becomes clear that new techniques will be needed to find a suitable proof if,

in fact, the statement can be proved at all. The problem of determining whether

or not the mathematical statement is true takes on the same sort of interest that

some people would take in the success of their favorite sports teams, sitting and

waiting to see how they will fare in the upcoming contest. When a longstanding

conjecture is finally proved, the announcement of the accomplishment will often be

covered by the lay press, giving mathematics an uncharacteristic brief period of public

admiration. Perhaps you are familiar with some of these famous problems whose

resolution has eluded mathematicians for years (at least at the time of the writing of

this text in January 2016): the Riemann Hypothesis, the Goldbach Conjecture, the

Twin Prime Conjecture, the P versus NP Problem, and the Navier-Stokes Equations

Existence and Smoothness Problem. During the last 40 years resolutions have been

announced for several long-standing problems including the Four Color Theorem,

the Bieberbach Conjecture (now called de Branges's Theorem), Fermat's Last

Theorem, and the Poincaré Conjecture.

Fig. 1.1 ... with the chords from $n$ points (1 point, 1 region; 2 points, 2 regions; 3 points, 4 regions; 4 points, 8 regions; 5 points, 16 regions; 6 points, 31 regions)

Why do mathematicians expend so much effort trying to prove statements, some

of which may seem obvious from the start? One reason is that mathematicians

are very skeptical of statements that appear obvious, and rightfully so. There is a

long history that includes mathematical statements which appear to be true which

are eventually shown to be false. Even very clear patterns can be deceptively

seductive. Take, for example, the following problem. Select a set of n points along

the circumference of a circle, draw the chords between each pair of points, and find

out the maximum number of regions into which these segments can divide the disk.

Figure 1.1 shows the results for the first few values of n.

Although from considering $n = 1, 2, 3, 4, 5$ it appears that the chords can divide

the disk into $2^{n-1}$ regions, this fails to be true when $n = 6$. With a bit more thinking

it is not hard to see that $2^{n-1}$ could not be the correct answer. With $n$ points there are

$\binom{n}{2}$ chords and at most $\binom{n}{4}$ intersections of two chords. This number of intersections

grows as a fourth-degree polynomial in $n$, suggesting that the number of regions will

also grow no faster than a polynomial in $n$, so one should not expect

the number of regions to grow at the exponential rate of $2^{n-1}$.
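The breakdown of the pattern can be checked numerically. The sketch below assumes the standard closed form $1 + \binom{n}{2} + \binom{n}{4}$ for the maximum number of regions, a known result that the text does not state. The function name `max_regions` is chosen here for illustration.

```python
from math import comb


def max_regions(n: int) -> int:
    """Maximum number of regions cut from a disk by all chords joining n boundary points.

    Assumes the standard count 1 + C(n, 2) + C(n, 4): one region to start,
    one more for each chord, and one more for each interior crossing of two chords.
    """
    return 1 + comb(n, 2) + comb(n, 4)


# Compare the true count with the tempting guess 2^(n-1).
for n in range(1, 9):
    guess = 2 ** (n - 1)
    verdict = "agree" if max_regions(n) == guess else "differ"
    print(n, max_regions(n), guess, verdict)
```

The two columns agree for $n = 1$ through $5$ and first differ at $n = 6$, where the count is 31 rather than 32. That is the polynomial growth catching up with the exponential guess.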

Another well-known example comes from Number Theory. The function $\pi(x)$

gives the number of positive prime integers less than or equal to the number $x$.

The growth rate of this function has long been of central importance in Number

Theory. The Prime Number Theorem says that the function grows at the same rate

as the logarithmic integral

$$\mathrm{li}(x) = \int_0^x \frac{dt}{\ln t}.$$

In fact, for many years it was thought that $\mathrm{li}(x) > \pi(x)$ for all $x > 0$ because this

holds for all small values of $x$ which can be practically checked, for example, all $x$

between 0 and $10^{24}$. It has now been shown that $\mathrm{li}(x) - \pi(x)$ switches sign infinitely

often, although only for extremely large values of $x$.
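The kind of small-scale checking that made the inequality look safe can be sketched as follows. Several things here are assumptions made for illustration rather than anything from the text: the prime count uses a simple sieve, the logarithmic integral is approximated for $x \ge 2$ by Simpson's rule starting from the known value of the logarithmic integral at 2 (about 1.0451637801), and the names `prime_count` and `li` are hypothetical.

```python
import math

LI_2 = 1.045163780117493  # value of the logarithmic integral at 2, taken as known


def prime_count(x: int) -> int:
    """Count the primes less than or equal to x with a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Mark every multiple of p from p*p upward as composite.
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


def li(x: float, steps: int = 100_000) -> float:
    """Approximate the logarithmic integral for x >= 2.

    Uses li(2) plus the integral of 1/ln(t) from 2 to x (composite Simpson's rule),
    which avoids the singularity of the integrand at t = 1.
    """
    if x < 2:
        raise ValueError("this sketch only handles x >= 2")
    if x == 2:
        return LI_2
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight / math.log(2 + i * h)
    return LI_2 + total * h / 3


for k in range(1, 6):
    x = 10 ** k
    print(x, prime_count(x), round(li(x), 2), li(x) > prime_count(x))
```

Every row of such a table supports the inequality, and no computation at this scale could reveal that it eventually fails. This is exactly why a check of finitely many cases is not accepted as a proof.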

It is apparent that sometimes seemingly very obvious patterns do not hold

in every case, so mathematicians rely on proofs to convince themselves that the

patterns do indeed hold in the general case.

Testing Axiom Systems. In the next chapter you will read about the writing of

proofs for some very elementary facts in mathematics, so elementary that you may

wonder why anyone would bother with these proofs. Clearly, it makes sense to

begin any training in the writing of proofs with some very simple results that are

easy to understand so that the student can feel confident about all the statements

being made in the proofs. But these proofs are not being presented just because they

are elementary. When one sets up a mathematical system by making definitions

and determining axioms, it is usually with a particular application or example in

mind. The desired result is that the new system will include the already partially

understood application so that any new discoveries will immediately tell something

new about the original application. Suppose someone sets up an axiom system for

the real numbers, for example, but is not able to prove that addition of real numbers

satisfies the commutative property. Since the commutative property is an important

aspect of addition of real numbers, it would appear that the new axiom system does

not have enough power to represent all that one would want to show about the real

numbers. Perhaps the axiom system will need to be expanded to include an axiom

about the commutativity of addition. Thus, if one cannot prove that the expected

simple properties hold, then it says that something is missing from the axioms. So

mathematicians write proofs to confirm that their axiom systems are representative

of the applications they are trying to describe.

Exhibiting Beauty There are no rules about what composers of music need to

write, but many composers try to write in standardly accepted formats such as

string quartets or symphonies because there are already organizations ready to

perform such works and groups of people happy to listen to such works. Scholars of

literature compare literary works by writing literary analysis, a form which holds

a lot of meaning for those who read and write in that field. Although painters

choose to make pictures of every sort of object or scene, real or imagined, most

painters eventually try their hand at painting some of the standard subjects (still life,

nudes, famous religious or historical depictions). Similarly, mathematicians write

proofs partly because that is what mathematicians enjoy doing. Although many

mathematicians make substantial contributions to the sciences, social sciences, and

arts through the application of their mathematical skills, others live in a world

of creating and discussing abstract concepts that have no immediate application

to real world problems, or at least no application apparent to the mathematicians

doing the research. To them, mathematics is studied as part of the humanities and

is appreciated for its beauty. And much of the beauty of mathematics lies in the

proofs of its theorems. One gets a great deal of pleasure reading a clever proof of

a complicated result when the proof can be stated in just a few lines, especially if

previous proofs of the same result were considerably longer and more difficult to

understand. Many mathematicians like reading articles and attending conferences

where they are exposed mainly to proofs of results, partly so that they can learn

about new results, but more importantly so they can appreciate the techniques

brought to bear to construct the proofs.

Testing Students One should not underestimate the need to educate future mathematicians. A good way to test whether a student understands a particular result is to

ask the student to present a proof of the result. The presentation of a proof shows a

deep understanding of why the result is true and shows an ability to discuss many

details about the objects involved. At the graduate school level in mathematics, most

test problems require the student to produce a proof of a particular result.

The student who has completed a study of Calculus is likely to have mastered

basic skills in Algebra, Geometry, Trigonometry, and Elementary Functions. This

is a good point in one's studies to begin writing proofs. It should not be assumed

that one can just begin writing proofs at this stage even if they have had years of

experience watching teachers and authors present proofs to them any more than

someone can be expected to sit down and begin playing the piano just because they

have watched many other people present concerts using the instrument. In this book

the reader will be taken through the construction of many proofs in a step-by-step

manner that presents the thought process used to write the proofs. Some incorrect

proofs are shown and explained so that the student can learn about common pitfalls

to avoid. Some students dread the transition to writing proofs because they feel

that they do not understand how to write proofs, and are leery of the day when

they will be expected to produce what they cannot now do. But the ability to write

good proofs is a skill no different from the ability to factor polynomials or integrate

rational functions. There is no expectation that the beginner can produce a good

proof, but every expectation that the beginner can learn.

Chapter 2

2.1.1 Conditional Statements

Most theorems concern mathematical objects x that satisfy a set of properties P, that is, $P(x)$ = "the properties P hold for object x." The theorem may say that if $P(x)$ is true, then some additional properties $Q(x)$ must also be true. Such statements are called conditional statements and can be written $P(x) \to Q(x)$. In the context of proving theorems, the $P(x)$ portion of the statement is referred to as the hypothesis of the statement, and the $Q(x)$ portion of the statement is referred to as the conclusion of the statement. The hypothesis of a conditional statement is often called the antecedent while the conclusion of the conditional statement is often called the consequent. For example, a well-known theorem is that all functions differentiable at a point are also continuous at that point. There are many equivalent ways to express this fact:

• If the function f is differentiable at a point, then f is continuous at that point.
• The function f is differentiable at a point only if f is continuous at that point.
• If the function f is not continuous at a point, then f is not differentiable at that point.
• There are no functions f such that f is both differentiable at a point and discontinuous at that point.
• The function f is differentiable at a point implies that f is continuous at that point.
• The function f is differentiable at x $\to$ f is continuous at x.

All of these statements assert that if a function f satisfies the hypothesis that it has a derivative at a point x, then f must also satisfy the conclusion that f is continuous at x. Note that the truth of a conditional statement, $P(x) \to Q(x)$, suggests nothing about the truth of the statement $Q(x) \to P(x)$ which is known as the converse

of the conditional statement $P(x) \to Q(x)$. Indeed, the converse of this theorem is the clearly false statement: "If the function f is continuous at a point, then f is differentiable at that point." Certainly, there are functions f both continuous and

differentiable at a point, but knowing that a function is continuous at a point does

not allow one to conclude that it is differentiable at that point. The converse of a

conditional statement is not logically equivalent to the original statement, but since

the two statements are concerned with the same subject matter, mathematicians

are often interested in the converse of a given conditional. If someone succeeds

in proving a new theorem expressed as a conditional statement, you might wonder

whether the converse of the statement could also be true. Sometimes the truth of the

converse statement is a trivial matter because it is well known. But there are many

examples where the converse does not hold in every case; that is, there are many known values of x where the converse statement $Q(x) \to P(x)$ is false. Other

times, the converse statement is something that has been previously established.

But very often, the truth of the converse statement remains an open question, and

the proof of the original conditional statement may generate research interest in its

converse.

One of the equivalent forms of a conditional statement $P(x) \to Q(x)$ is the statement "if $Q(x)$ is false, then $P(x)$ must be false." This can be written as $\neg Q(x) \to \neg P(x)$ using the negation symbol $\neg$. This form of the statement is called the contrapositive of the original conditional statement. For example, the contrapositive of the statement discussed above is "If the function f is not continuous at a point, then f is not differentiable at that point." Although logically equivalent to the original conditional statement, the contrapositive often gives you a different way to think about the statement, and you will often see a proof which is a proof of the contrapositive statement instead of a proof of the original conditional statement.

The negation of a statement is a statement with the opposite truth value of the original statement, that is, a statement which is false exactly when the original statement is true. For example, the negation of "n is an integer" is "n is not an integer." The negation of the statement $P(x)$ is "not $P(x)$" or simply $\neg P(x)$. The conditional statement $P(x) \to Q(x)$ says that every time $P(x)$ holds it must be the case that $Q(x)$ also holds. The negation of this statement must, therefore, state that for at least one value of x, $P(x)$ is true and $Q(x)$ is false, or "$P(x)$ and $\neg Q(x)$."

A proof by contradiction is a proof that assumes that both $P(x)$ and $\neg Q(x)$ are true, and derives a statement that must be false (known as a contradiction) showing that it is impossible to have $P(x)$ being true at the same time that $Q(x)$ is false.
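The relationships among a conditional, its converse, its contrapositive, and its negation can be checked mechanically by running through every combination of truth values. The following is only a sketch in Python; the helper name `implies` is chosen here for illustration and does not come from the text.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of the conditional 'p -> q'."""
    return (not p) or q

# Every combination of truth values for P(x) and Q(x).
rows = list(product([True, False], repeat=2))

# The contrapositive 'not Q -> not P' agrees with 'P -> Q' in every row.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# The converse 'Q -> P' disagrees with 'P -> Q' in at least one row.
converse_equivalent = all(implies(p, q) == implies(q, p) for p, q in rows)

# The negation of 'P -> Q' agrees with 'P and not Q' in every row.
negation_matches = all(
    (not implies(p, q)) == (p and not q) for p, q in rows
)

print(contrapositive_equivalent, converse_equivalent, negation_matches)
# True False True
```

With only four rows to examine, this exhaustive check is a complete verification of the three logical facts, which is rarely the case for statements about infinitely many objects.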

The well-known Pythagorean Theorem is a conditional statement: "If a right triangle has legs with lengths a and b and a hypotenuse with length c, then $a^2 + b^2 = c^2$." The converse of the Pythagorean Theorem is also true: "If a triangle has sides with lengths a, b, and c satisfying $a^2 + b^2 = c^2$, then the triangle is a right triangle." When a conditional statement $P(x) \to Q(x)$ and its converse $Q(x) \to P(x)$ are both true, the two statements can be combined into one as $P(x) \leftrightarrow Q(x)$. This can also be stated as "$P(x)$ if and only if $Q(x)$." Such statements are called biconditional statements. Thus, the Pythagorean Theorem and its converse could be combined into the single biconditional statement: "A triangle is a right triangle if and only if the triangle has side lengths a, b, and c satisfying $a^2 + b^2 = c^2$."

Conditional statements often make assertions about a very large number of objects

or even an infinite set of objects. Indeed, the statement about differentiable functions

being continuous refers to infinitely many functions, and the Pythagorean Theorem

refers to an infinite number of triangles. How, then, are you supposed to prove

these results since you clearly cannot consider every case individually? A general

approach to proving the conditional statement $P(x) \to Q(x)$ is to select a generic element x which could represent any object satisfying $P(x)$ and then to prove the statement $Q(x)$. Since a generic object x satisfying $P(x)$ has been shown to satisfy $Q(x)$, it follows that every object satisfying $P(x)$ must also satisfy $Q(x)$, and the

result has been proved. This will be the format of most of the proofs you will ever

write in analysis.

If the statement $P(x) \to Q(x)$ is not true, it means that there is at least one value of x that makes $P(x) \to Q(x)$ a false statement. Such an x is called a counterexample to the statement, and exhibiting such a counterexample would be a way to prove that $P(x) \to Q(x)$ is false. A proof of $P(x) \to Q(x)$ is essentially an argument showing that no counterexamples exist.
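The idea of a counterexample can be illustrated with a short search over a finite list of candidates. This is a minimal sketch in Python; the function names and the two sample claims about primes are chosen here for illustration and are not taken from the text.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_odd(n: int) -> bool:
    return n % 2 == 1

def counterexamples(P, Q, candidates):
    """Return every x among the candidates with P(x) true and Q(x) false."""
    return [x for x in candidates if P(x) and not Q(x)]

# Claim (false): "If n is prime, then n is odd."
found = counterexamples(is_prime, is_odd, range(1, 100))
print(found)  # [2]

# Claim (true): "If n is prime and n > 2, then n is odd."
none_found = counterexamples(
    lambda n: is_prime(n) and n > 2, is_odd, range(1, 100)
)
print(none_found)  # []
```

A single counterexample settles that the first claim is false. The empty list for the second claim shows only that no counterexample lies between 1 and 99; establishing that none exists anywhere still requires a proof.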

There are many phrases that occur so frequently when writing proofs that mathematicians have developed a shorthand notation for these phrases. There is little need to use these abbreviations within a textbook such as this or even in a journal article, but the shorthand can be useful when writing out a proof by hand on paper or a blackboard. Here is a list of some of the commonly used symbols.

Shorthand Symbols for Proofs

$\exists$   there exists
$\forall$   for all
$\ni$   such that
$\to$   implies
$\leftrightarrow$   if and only if

2.1.4 Exercises

Perform the following steps for each of the conditional statements in Exercises 1–6.

A

B

C

D

E

write the converse of the statement.

decide whether or not the converse of the statement is true.

write the contrapositive of the statement.

write the negation of the statement.

1. If $x = 1$ and $y = 1$, then $xy = 1$.
2. If x is an integer, then $2x + 1$ is also an integer.
3. $f(x)$ and $g(x)$ are both continuous at $x = 0$ only if $f(x) + g(x)$ is continuous at $x = 0$.
4. $xy = 0$ if $x = 0$ or $y = 0$.
5. If $xy - 9y = 0$ and $y > 0$, then $x = 9$.
6. A rectangle has area xy if two adjacent sides of the rectangle have lengths x and y.
7. Write the following without using shorthand symbols.
(a) $\exists x \in \mathbb{R} \ni x + 4 = 2$.
(b) $\forall x \in \mathbb{R}\ \exists y \in \mathbb{R} \ni x + y = 10$.

Many proofs can be written by following a simple formula or template that suggests

guidelines to follow when writing the proof. Mathematicians reading a proof that

follows a traditional template will find the proof easier to follow because there will

be an expectation about what will be presented in the proof. For example, many

proofs will follow the general template given here.

TEMPLATE followed by many proofs

SET THE CONTEXT

ASSERT THE HYPOTHESIS

LIST IMPLICATIONS

STATE THE CONCLUSION

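For example, here is a proof of the Quadratic Formula from elementary Algebra.

PROOF: For constants $a \neq 0$, b, and c, the quadratic polynomial $ax^2 + bx + c$ has roots given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.

SET THE CONTEXT: Let a, b, and c be constants with $a \neq 0$.
ASSERT THE HYPOTHESIS: Suppose that x satisfies $ax^2 + bx + c = 0$.
LIST IMPLICATIONS: Since $a \neq 0$, it follows that $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$.
Then $x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$.
Factoring shows that $\left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} = \frac{b^2}{4a^2}$.
Then $\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a^2} - \frac{c}{a} = \frac{b^2 - 4ac}{4a^2}$.
This means that $x + \frac{b}{2a}$ must be one of the two square roots of $\frac{b^2 - 4ac}{4a^2}$.
So, $x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}} = \frac{\pm\sqrt{b^2 - 4ac}}{2a}$, and $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
STATE THE CONCLUSION: Thus, the roots of the quadratic polynomial $ax^2 + bx + c$ are given by $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.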
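The formula derived in this proof can be spot-checked numerically. The sketch below is written in Python with names chosen for illustration; it uses `cmath.sqrt` so that a negative discriminant is handled as well, which goes slightly beyond what the proof itself discusses. Substituting the computed roots back into the polynomial is a sanity check on particular polynomials, not a substitute for the proof.

```python
import cmath

def quadratic_roots(a: complex, b: complex, c: complex):
    """Roots of a*x**2 + b*x + c via the Quadratic Formula; requires a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero for the polynomial to be quadratic")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

def residual(a, b, c, x):
    """Value of a*x**2 + b*x + c at x; zero (up to rounding) at a root."""
    return a * x * x + b * x + c

r1, r2 = quadratic_roots(1, -5, 6)
print(r1, r2)  # (3+0j) (2+0j)

# With a negative discriminant the same formula yields complex roots.
for x in quadratic_roots(1, 2, 5):
    print(abs(residual(1, 2, 5, x)) < 1e-9)  # True, then True
```

The guard against a = 0 mirrors the stipulation made when setting the context of the proof: without it the division by 2a would be meaningless.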

The proof template begins with the suggestion to SET THE CONTEXT which

represents statements designed to tell the reader what is being assumed in the proof.

This is usually a sentence or two telling the reader about the properties of the objects

that will be encountered in the proof. It may also introduce which variables will

appear in the proof and what kinds of objects they represent. So, in the given proof

of the Quadratic Formula, the first line tells that the variables a, b, and c are going

to represent known constants with a not being 0. Clearly, the fact that a is not 0 needs to be stipulated because if $a = 0$, the polynomial $ax^2 + bx + c$ would not be quadratic and would not have the proposed roots. Generally, you are not looking

for a lengthy narrative here, and, in fact, brevity is a particularly cherished attribute

of a proof. Saying what needs to be said, but only what needs to be said is usually

best. Some authors who state a theorem and immediately follow the statement of the

theorem with its proof will forgo setting the context at the beginning of the proof

because the reader will have just seen the statement of the theorem and may not need

to see a repeat of the context for that proof. For example, in the example proof, the

statement of the theorem does introduce the constants a, b, and c and polynomial

$ax^2 + bx + c$, so some authors might just skip the first line of the proof. On the other hand, if the first line of the proof instead introduced the constants r, s, and t, the proof could have proceeded using these variables instead of a, b, and c. The same result would have been proved. So the SET THE CONTEXT part of the proof makes the proof independent of the statement of the theorem being proved. Thus, for completeness, it is good to establish the habit of including the setting of the context at the beginning of each proof, at least until the student's experience in proof writing

has matured.

Your choices of variables used to represent particular objects in the proof are

not critically important to the structure or correctness of the proof, but there are

certain variables that mathematicians associate with various uses, and sticking to

these conventional choices simplifies the understanding of the proof because those

variable choices bring with them a history of context that the reader will recognize.

There are very few Algebra students who would recognize the Quadratic Formula if you gave them $z = \frac{-s \pm \sqrt{s^2 - 4rt}}{2r}$. Proofs about limits usually refer to the variables $\epsilon$ and $\delta$ which represent small positive real numbers used in specific ways

in the proof. Using these two variables in their traditional contexts makes the proofs

easier to understand because the reader will expect these variables to play specific

roles, just as they have in many other proofs the reader has seen. Seeing many

examples of proofs will familiarize the novice proof writer with these traditional

uses of variables.

Suppose that the statement being proved indicates that every object satisfying

the properties listed in the hypothesis of the theorem also satisfies some properties

listed in the conclusion of the theorem. One generally structures a proof of such

a statement by first selecting a generic object satisfying the properties listed in

the hypothesis. The ASSERT THE HYPOTHESIS part of the proof is where the

writer selects an arbitrary element satisfying the hypothesized properties. In the

Quadratic Formula proof, it was assumed that x satisfied the quadratic equation $ax^2 + bx + c = 0$. Other examples would be statements such as

• Let x be an element of set A.
• Let y be a root of the polynomial $p(x)$.
• Assume that the real valued function f has a zero at the point z.
• Suppose G and H are any two lines that intersect at a point P.
• Assume that the function $f(s) = \int_0^s g(x)\,dx$ is a differentiable function of s. In addition assume that $0 \le f(s) \le 10$ for all $s \ge 0$.

It is possible that there are infinitely many objects which could play the role of

the generically chosen object. But if an argument proves the result is true for this

generic object, then the theorem will have been shown to hold for any object that

could have played the role of the generic object, and, therefore, the theorem will

have been proved for all objects satisfying the hypothesis. The Quadratic Formula

proof addresses the one generic polynomial $ax^2 + bx + c$ and in doing so derives a formula that works for all quadratic polynomials including $5x^2 - 17x + 126$ and $rx^2 + sx + t$. Often the reader of a proof will form a mental picture of the generic object being chosen. For example, after reading "Let n be any natural number bigger than 3," the reader may think, "OK, how about $n = 7$?" As the proof progresses, the reader may take each statement of the proof and verify that it is valid and makes sense for their choice of $n = 7$. This helps the reader follow the logic of the proof

and verifies that they are understanding what the proof is saying.

The proof will be completed when it is shown that the generically chosen

element satisfying the hypothesis of the theorem is, in fact, an element satisfying

the conclusion of the theorem as stated in the STATE THE CONCLUSION part

of the template. There will certainly need to be some statements placed between

the original assertion of the hypothesis and the end of the proof that justify the

conclusion of the theorem. Those statements make up the LIST IMPLICATIONS

part of the template. In almost all cases, most of the body of the proof belongs to

this list of statements. Each statement in the list should follow from definitions or

be simple implications following from previous statements in the proof. In a well-written complete proof, the reader should easily see why each implication follows logically from other statements made earlier in the proof (Fig. 2.1). If an implication is not clear on its own, it will need some justification so the reader can follow the logic. The justification may just be a reminder of a key point made earlier in the proof ("as shown earlier, f is continuous at point a") or a reminder of a well-known definition or theorem ("Since all continuous functions on the interval $[0, 4]$ are integrable there, it follows that $\int_0^4 f(x)\,dx$ exists."). The given Quadratic Formula proof contains six lines of implications. Each line follows easily from the line before using

standard rules of Algebra, and any student familiar with the algebraic manipulations

of equations will be able to understand these implications. In the fourth step of the proof, the quantity $\frac{b^2}{4a^2}$ is added to both sides of an equation. Although this step surely follows the rules of Algebra, it may not be clear to the reader of the proof why the step is important. As it turns out, this "completing the square" operation

prepares for the factoring performed in the fifth step of the proof and is arguably the

most clever step of the proof. A proof will often require a clever step such as this.

The proof writer may have labored for years looking for the inspiration needed to

find such a step, but the proof itself need only make clear the justification for what

is being done and does not need to refer to the sweat that went into producing it.

Some implications will be easy for the reader to follow without having to justify

the step. Other statements may need some deeper explanation. Here is where the

proof writer will need to consider the expertise of the target audience for the proof

in order to decide how much detail to provide. How to make your proof easy to

follow is only clear when you know for whom it is meant to be easy. For example, it made sense to follow the line $x^2 + \frac{b}{a}x + \frac{c}{a} = 0$ with the statement $x^2 + \frac{b}{a}x + \frac{c}{a} + \frac{b^2}{4a^2} = \frac{b^2}{4a^2}$ because this just used the fact that you can add equal quantities to both sides of

an equation to get a new equation that is equivalent. On the other hand, suppose you

wish to combine a conditional statement on line 8 of a proof with the fact stated

on line 15 of the proof in order to show that the hypothesis of that conditional

is satisfied. This would allow the writer to state the conclusion of the conditional

statement to get line 16 of the proof, but the reader may have to be reminded about

which statements are being combined to get that conclusion.

Sometimes the writer of a long or complicated proof will need to make a new

definition or point out some new property that will be important later in the proof.

Depending on the complexity of the new idea, the proof writer may want to include

an example or two of objects satisfying the new definition or property. This will

serve to help the reader understand the new concept or to verify that the reader

is understanding the new concept. It is admirable to include such examples if the

complexity of the proof can be made clearer. But in most other contexts, the proof

should be kept short without the inclusion of unnecessary statements. If the intended

readers are able to easily construct these examples on their own, then the examples

should be left out of the proof.

The remainder of this chapter will discuss proofs that follow this general proof

template in contexts that the student should find easy to follow. It will also give

an opportunity to present some definitions and notation that will be used in later

chapters.

2.2.1 Exercises

1. If you were writing a proof of "All prime numbers greater than 2 are odd," which of the following would be appropriate ways to begin the proof? (There may be more than one correct answer.)

(a)

(b)

(c)

(d)

(e)

(f)

(g)

Assume that all odd prime numbers are greater than 2.

Let n be a prime number greater than 2.

Assume that 2 is a prime number.

Assume that n and k are integers with n > k > 2.

The numbers 3, 5, 7, and 11 are prime numbers greater than 2.

Let n be a number greater than 2 which is not prime.

other, which of the following would be appropriate ways to begin the proof.

(There may be more than one correct answer.)

(a)

(b)

(c)

(d)

(e)

(f)

Let ABCD be a quadrilateral whose diagonals bisect each other.

Let ABCD be a parallelogram whose diagonals bisect each other.

All rectangles are parallelograms.

Assume that the diagonals of a parallelogram bisect each other.

Assume that if the diagonals of a quadrilateral bisect each other, then the

quadrilateral is a parallelogram.

3. If you were writing a proof of "Every cubic polynomial with real coefficients has at least one real root," which of the following would be appropriate ways to begin the proof? (There may be more than one correct answer.)
(a) Assume that every cubic polynomial with real coefficients has at least one real root.
(b) Assume that $p(x)$ is a polynomial with at least one real root.
(c) Assume that a, b, c, and d are real numbers with $a \neq 0$, and let $p(x) = ax^3 + bx^2 + cx + d$.
(d) The polynomial $x^3 - 8$ has exactly one real root at $x = 2$.
(e) Let $p(x)$ be a cubic polynomial with real coefficients and $q(x)$ be a cubic polynomial with complex coefficients.
(f) Let $p(x)$ be a cubic polynomial with real coefficients with real root r.

Write an appropriate first sentence that would begin proofs of each of the following

statements.

4. If m and n are relatively prime integers, then there exist integers x and y such that $mx + ny = 1$.
5. The three angle bisectors of any triangle intersect at a common point.
6. If a and b are real numbers with $a < b$, and f is a function continuous on the closed interval $[a, b]$, then there is a real number M such that $|f(x)| \le M$ for all $x \in [a, b]$.

7. If $\vec{u}$, $\vec{v}$, and $\vec{w}$ are 3-dimensional vectors, then $(\vec{u} \times \vec{v}) \cdot \vec{w} = \vec{u} \cdot (\vec{v} \times \vec{w})$.

2.3.1 Set Notation

Most courses in mathematics discuss sets: sets of numbers, sets of points, sets

of functions, sample spaces, and so forth. This should have given any Calculus

student an intuitive understanding of sets. Many theorems in mathematics are

statements about sets in disguise. For example, the statement that "If the function f is differentiable at a point, then f is continuous at that point" is equivalent to the statement "The set of functions differentiable at a point is a subset of the set of functions continuous at that point."

For the purposes of this text, it will be enough to define a set as a collection of elements. That is, elements are those objects that belong to sets, and the notation $x \in A$ says that x is an element of the set A, and $x \notin A$ says that x is not an element of the set A. The set A is a subset of the set B, or A is contained in the set B, if each element of A is also an element of B, in which case this fact is written as $A \subseteq B$. Two sets, A and B, are equal if they have the same elements, that is, all the elements in the set A are in the set B, and all the elements in the set B are in the set A. Notationally, this says that $A = B$ if and only if both $A \subseteq B$ and $B \subseteq A$.
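For finite sets, the definitions of subset and of set equality translate directly into code. The following Python sketch uses helper names chosen for illustration; it spells out the definitions element by element rather than relying on the built-in comparisons, and then confirms that the two approaches agree on the sample sets.

```python
def is_subset(A, B) -> bool:
    """A is a subset of B when every element of A is also an element of B."""
    return all(x in B for x in A)

def sets_equal(A, B) -> bool:
    """A = B exactly when A is a subset of B and B is a subset of A."""
    return is_subset(A, B) and is_subset(B, A)

A = {1, 3, 5}
B = {1, 2, 3, 4, 5}

print(is_subset(A, B))           # True
print(is_subset(B, A))           # False
print(sets_equal(A, B))          # False
print(sets_equal(A, {5, 3, 1}))  # True

# The hand-written checks agree with Python's built-in set comparisons.
print(is_subset(A, B) == (A <= B), sets_equal(A, B) == (A == B))  # True True
```

Checking equality through two subset tests is exactly the pattern that proofs of set equality follow: show containment in one direction, then in the other.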

There are many ways to express the contents of a set. One is to list the elements such as $A = \{a, b, c\}$ or $B = \{1, 3, 5, 7, \dots\}$. Another way is to use set builder notation which states that the set consists of all elements satisfying a given property $P(x)$ and is written $\{x \mid P(x)\}$, or to emphasize that the elements of the set are also in set A, it is often written $\{x \in A \mid P(x)\}$. Examples are $\{x \mid x > 0\}$, $\{y \mid y^2 + 3y - 2 > 7\}$, and $\{f \mid f \text{ is a function differentiable at } x = 3\}$. Note that a set is determined by the elements that are in the set. Thus, $\{1, 2, 3\} = \{3, 2, 1\} = \{1, 2, 2, 3, 3, 3, 1, 2, 3\}$ because all three of these sets contain exactly the same three elements. In some contexts, mathematicians will talk about a multiset which is an object similar to a set but allows elements of the collection to appear with different multiplicities. Thus, $\{1, 2, 3\}$ and $\{1, 2, 2, 3, 3, 3, 1, 2, 3\}$ would be different multisets even though, in the notation of sets, they are the same set.
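The distinction between sets and multisets, and the idea behind set builder notation, can be seen in a few lines of Python. This is only an illustration: `collections.Counter` is used here as a stand-in for a multiset, and the finite range `A` is an assumption made so that the set comprehension terminates.

```python
from collections import Counter

# As sets, repeated listings and reordering make no difference.
print({1, 2, 3} == {3, 2, 1} == {1, 2, 2, 3, 3, 3, 1, 2, 3})  # True

# Counter records how many times each element appears, so it can stand in
# for a multiset in this sketch.
m1 = Counter([1, 2, 3])
m2 = Counter([1, 2, 2, 3, 3, 3, 1, 2, 3])
print(m1 == m2)             # False
print(m2[1], m2[2], m2[3])  # 2 3 4

# Set builder notation {x in A | P(x)} corresponds to a set comprehension.
A = range(-5, 6)
positives = {x for x in A if x > 0}
print(sorted(positives))    # [1, 2, 3, 4, 5]
```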

One special set is the empty set written as $\emptyset$ or $\{\}$ and is the set that has no elements. In some contexts there is an understanding of a universal set, U, such that all other sets under consideration are subsets of U. For example, the sets $A = \{1, 2, 3\}$ and $B = \{2, 4, 6, 8, \dots\}$ can be thought of as subsets of the universal set $U = \{1, 2, 3, 4, \dots\}$.

Take care not to confuse elements and subsets. Remember that sets are collections of elements and sets are subsets of other sets. It is possible that a set contains other sets as elements, but this would need to be explicitly clear from the definition of that set. It is correct to write $3 \in \{1, 2, 3, 4, 5\}$, $\{1, 3, 5\} \subseteq \{1, 2, 3, 4, 5\}$, and $\{1, 2\} \in \{1, \{1, 2\}, \{1, \{1, 2\}\}\}$, but it is incorrect to write $3 \subseteq \{1, 2, 3, 4, 5\}$, $\{1, 2\} \in \{1, 2, 3, 4, 5\}$, or $\emptyset \in \{1, 2, 3, 4, 5\}$.

The student should be familiar with the following standard set operations. The union of sets A and B is $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$, and the intersection of sets A and B is $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$. When there is an understood universal set, U, it makes sense to refer to the complement of a set $A^c = \{x \in U \mid x \notin A\}$. It does not make sense to discuss the complement of a set if there is no understood universal set. For example, is $\{1, 2, 3\}^c = \{4, 5, 6, 7, \dots\}$, or is it $\{\dots, -3, -2, -1, 0, 4, 5, 6, 7, \dots\}$? For that matter, is your right shoe an element of $\{1, 2, 3\}^c$? The difference of two sets is $A \setminus B = \{x \in A \mid x \notin B\}$, and, equivalently, if there is an understood universal set, $A \setminus B = A \cap B^c$. For example, if $A = \{1, 2, 3, 4, 5\}$ and $B = \{2, 4, 6, 8\}$, then $A \cup B = \{1, 2, 3, 4, 5, 6, 8\}$, $A \cap B = \{2, 4\}$, $A \setminus B = \{1, 3, 5\}$, and $B \setminus A = \{6, 8\}$.
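The worked example with A = {1, 2, 3, 4, 5} and B = {2, 4, 6, 8} can be reproduced with Python's built-in set type. In this sketch the universal set `U` and the helper `complement` are assumptions introduced for illustration; Python has no complement operation of its own, which reflects the point that a complement is meaningful only relative to an understood universal set.

```python
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8}

print(sorted(A | B))  # union:        [1, 2, 3, 4, 5, 6, 8]
print(sorted(A & B))  # intersection: [2, 4]
print(sorted(A - B))  # difference:   [1, 3, 5]
print(sorted(B - A))  # difference:   [6, 8]

# A complement only makes sense relative to a universal set. U here is a
# finite stand-in chosen for illustration.
U = set(range(1, 11))

def complement(S, universe):
    """Elements of the universe that are not in S."""
    return universe - S

print(sorted(complement({1, 2, 3}, U)))  # [4, 5, 6, 7, 8, 9, 10]
print(A - B == A & complement(B, U))     # True
```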

2.3.2 Exercises

1. Which of the following statements are true?
(a) $6 \in \{1, 2, 3, \dots, 10\}$.
(b) $\{3, 5\} \in \{1, 2, 3, \dots, 10\}$.
(c) $\emptyset \in \{1, 2, 3, \dots, 10\}$.
(d) $\{6, 8\} \subseteq \{1, 2, 3, \dots, 10\}$.
(f) $\{1, 2, 3, \dots, 10\} \subseteq \{1, 2, 3, \dots, 10\}$.

2. Given $A = \{1, 3, 5, 7, 9, 11, 13\}$, $B = \{2, 3, 4, 5, 6\}$, and $C = \{1, 4, 7, 11, 14\}$, evaluate each of the following expressions.
(a) $A \cup A$
(b) $A \cap B$
(c) $(A \cup B) \cap C$
(d) $(B \cup C) \cap A$
(e) $(A \cap B) \setminus C$
(f) $(A \setminus B) \cup C$

There are many simple statements about sets which should be immediately obvious

to students reading this text, but learning to write proofs for these types of statements

will be instructive and useful in the proof writing discussed in the following

chapters. Here are some of those simple statements that apply to all sets A, B,

and C.

Some Statements About All Sets A, B, and C

• $A \subseteq A \cup B$.
• $A \cap B \subseteq A$.
• $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
• $(A \cup B) \cap C \subseteq A \cup (B \cap C)$.
• $A \cup B = B \cup A$, the Commutative Law of Union.
• $A \cap B = B \cap A$, the Commutative Law of Intersection.
• $(A \cup B) \cup C = A \cup (B \cup C)$, the Associative Law of Union.
• $(A \cap B) \cap C = A \cap (B \cap C)$, the Associative Law of Intersection.
• $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$, the Distributive Law of Union Over Intersection.
• $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, the Distributive Law of Intersection Over Union.
• $(A \cup B)^c = A^c \cap B^c$, De Morgan's Laws.
• $(A \cap B)^c = A^c \cup B^c$, De Morgan's Laws.
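Before attempting to prove statements like these, it can be reassuring to test them on every choice of A, B, and C drawn from a small universal set. The Python sketch below does this for several of the listed statements; the four-element universe `U` and all function names are assumptions made for illustration. As the earlier discussion of misleading patterns warns, passing such a test is evidence, not proof.

```python
from itertools import combinations, product

U = frozenset(range(4))  # small universal set, assumed for the check

def all_subsets(universe):
    """Every subset of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(S):
    """Complement of S relative to U."""
    return U - S

laws = {
    "A subset of A union B": lambda A, B, C: A <= A | B,
    "A intersect B subset of A": lambda A, B, C: A & B <= A,
    "A minus (B union C) subset of (A union B) minus C":
        lambda A, B, C: A - (B | C) <= (A | B) - C,
    "(A union B) intersect C subset of A union (B intersect C)":
        lambda A, B, C: (A | B) & C <= A | (B & C),
    "distributive, union over intersection":
        lambda A, B, C: A | (B & C) == (A | B) & (A | C),
    "distributive, intersection over union":
        lambda A, B, C: A & (B | C) == (A & B) | (A & C),
    "De Morgan, complement of union":
        lambda A, B, C: comp(A | B) == comp(A) & comp(B),
    "De Morgan, complement of intersection":
        lambda A, B, C: comp(A & B) == comp(A) | comp(B),
}

subsets = all_subsets(U)
results = {
    name: all(law(A, B, C) for A, B, C in product(subsets, repeat=3))
    for name, law in laws.items()
}
for name, holds in results.items():
    print(f"{name}: {holds}")
```

Every line of output should report True. The check covers only the 16 subsets of one small universe, whereas the statements claim to hold for all sets, so the proofs discussed next are still needed.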

The first four of these statements propose that one set is a subset of a second set. From the definition of subset, for $A \subseteq B$ to be true, it is required that for every $x \in A$, x must also be in B. There is a standard template for proofs of statements of this form:

SET THE CONTEXT: State what is being assumed about the sets A and B.
ASSERT THE HYPOTHESIS: Let $x \in A$.
LIST IMPLICATIONS: Use the properties of set A to show x belongs to set B.
STATE THE CONCLUSION: $x \in B$. Therefore, by the definition of subset, $A \subseteq B$.

For example, how would one prove the statement "For all sets A and B, $A \subseteq A \cup B$"? Because this proof is supposed to apply to any sets A and B regardless of what properties they may possess, all that would be necessary for the SET THE CONTEXT part of the proof is a statement introducing to the reader the fact that the variables A and B will represent sets. Since $A \subseteq A \cup B$ exactly when every element of A is also an element of $A \cup B$, the ASSERT THE HYPOTHESIS part of the proof needs to select a generic element of the set A so that the proof can conclude that the generic element is an element of set $A \cup B$. The first two lines of the proof read:

Suppose that A and B are any two sets. Let $x \in A$.

The LIST IMPLICATIONS for this proof can be very short. It merely needs to show that the definition of set union implies that x is in the union $A \cup B$. This completes the proof.

PROOF: $A \subseteq A \cup B$.

SET THE CONTEXT: Suppose that A and B are any two sets.

ASSERT THE HYPOTHESIS: Let $x \in A$.

LIST IMPLICATIONS: Since $x \in A$, it is true that $x \in A$ or $x \in B$.

By the definition of set union $x \in A \cup B$.

STATE THE CONCLUSION: Therefore, by the definition of subset, $A \subseteq A \cup B$.
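The four subset statements in the list can also be sanity-checked by brute force. The sketch below is an illustration only, not a substitute for the proofs: it tests each claim on every triple of subsets of a small universe. The three-element universe and the names `all_subsets`, `claims`, and `results` are assumptions made for this example.

```python
from itertools import combinations, product

def all_subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

# Hypothetical three-element universe, chosen only to keep the search small.
SUBSETS = all_subsets({0, 1, 2})

# Python set operators: | is union, & is intersection, - is difference,
# and <= tests "is a subset of".
claims = {
    "A <= A | B": lambda A, B, C: A <= A | B,
    "A & B <= A": lambda A, B, C: A & B <= A,
    "A - (B | C) <= (A | B) - C": lambda A, B, C: A - (B | C) <= (A | B) - C,
    "(A | B) & C <= A | (B & C)": lambda A, B, C: (A | B) & C <= A | (B & C),
}

# Evaluate each claim on every triple (A, B, C) of subsets.
results = {
    name: all(claim(A, B, C) for A, B, C in product(SUBSETS, repeat=3))
    for name, claim in claims.items()
}

for name, ok in results.items():
    print(f"{name}: {'holds' if ok else 'fails'} on all {len(SUBSETS) ** 3} triples")
```

A finite search like this can expose a false claim quickly, but only a proof built on the definitions covers all sets.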

Do the statements of this proof have to appear in exactly this order using exactly these words? Of course not. There can be many variations in what makes up a good proof. But it does not hurt to review why these statements make a good proof. The first line "Suppose that A and B are any two sets" just makes it clear to the reader that the variables A and B can be used to represent any two sets. Here is where the reader of the proof may well mentally choose two sets so that when reading the remainder of the proof, the reader can verify that the statements make sense when applied to those two sets. The second line "Let $x \in A$" is required because by the definition of subset, one must show that each element of A is also an element of $A \cup B$, so selecting an arbitrary element of A is the natural way to do this. The next line "Since $x \in A$, it is true that $x \in A$ or $x \in B$" is just a statement of logic that says if statement p is true, then statement "p or q" is also true. Of course, this particular "p or q" statement is exactly the definition of x being a member of $A \cup B$, which is exactly what is needed to complete the proof.


Could one have interchanged the third and fourth lines of this proof? Well, yes; the proof would be complete if that were done, but the fact that the definition of set union is invoked right after its conditions are verified makes the statements of the proof flow smoothly. The reader facing the definition of set union in line three might wonder why that definition is being shown at that point. By placing that statement as the fourth statement where the proof reader has just seen that $x \in A$ or $x \in B$, the proof reader will immediately see that the definition of set union applies. Note that each of the five statements in the proof has been placed on a separate line in the display box. This has been done merely to facilitate the discussion about that proof. In practice, there is no requirement that these statements appear on separate lines.

The second statement about all sets is $A \cap B \subseteq A$. This can be proved using the same proof template as the first statement. Since this statement also applies to any two sets A and B, the first line of this proof will be the same as the first line of the previous proof. Because the assertion of the statement being proved is that $A \cap B$ is a subset of another set, the ASSERT THE HYPOTHESIS line of the proof would change to the assertion that x belongs to $A \cap B$. After reading this second line, what does the proof reader know about x? Only that x belongs to the intersection of two sets. Thus, the only direction that the proof can proceed is to invoke the definition of set intersection to make the additional assertion that $x \in A$ and $x \in B$. This is a statement of the form "p and q," so logic allows the assertion that p is true, or, in this case, that $x \in A$. This is the required STATE THE CONCLUSION statement, and the complete proof would be

PROOF: $A \cap B \subseteq A$.

SET THE CONTEXT: Suppose that A and B are any two sets.

ASSERT THE HYPOTHESIS: Let $x \in A \cap B$.

LIST IMPLICATIONS: By the definition of set intersection, $x \in A$ and $x \in B$.

Thus, $x \in A$.

STATE THE CONCLUSION: Therefore, by the definition of subset, $A \cap B \subseteq A$.

For a more substantial example, consider the third of the list of statements about sets, $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$. A proof of this statement will need to refer to the definition of set difference as well as the definitions of set union and subset. Since the statement being proved involves three sets, the SET THE CONTEXT part of the proof will need to refer to all three sets. The ASSERT THE HYPOTHESIS statement will need to select an arbitrary element from $A \setminus (B \cup C)$. To emphasize that the choice of which variable to use is arbitrary, this time use y rather than x to represent the arbitrarily chosen element. Once it is known that $y \in A \setminus (B \cup C)$, the only property of y that can be used is the fact that y is a member of a set difference. Thus, this would be a good time to invoke the definition of set difference. That assures that $y \in A$ and $y \notin (B \cup C)$. At that point one can use the definition of set union to conclude that, since $y \notin (B \cup C)$, both $y \notin B$ and $y \notin C$. Now these facts can be combined to get the STATE THE CONCLUSION statement required by the proof template. The complete proof would be

PROOF: $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.

SET THE CONTEXT: Suppose that A, B, and C are any three sets.

ASSERT THE HYPOTHESIS: Let $y \in A \setminus (B \cup C)$.

LIST IMPLICATIONS: By the definition of set difference, $y \in A$ and $y \notin (B \cup C)$.

By the definition of set union y cannot be an element of either set B or set C, or it would be in $B \cup C$.

Also by the definition of set union, since $y \in A$, y is also a member of $A \cup B$.

Now, $y \in (A \cup B)$ and $y \notin C$, so by the definition of set difference, $y \in (A \cup B) \setminus C$.

STATE THE CONCLUSION: Therefore, by the definition of subset, $A \setminus (B \cup C) \subseteq (A \cup B) \setminus C$.
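To see the steps of this proof on concrete data, the following sketch re-traces each line for every element of $A \setminus (B \cup C)$. It also shows that the reverse inclusion can fail, which is why the statement is a subset claim and not an equality. The example sets and the helper name `follow_proof` are assumptions chosen for illustration.

```python
def follow_proof(y, A, B, C):
    """Re-trace each line of the proof for one element y of A - (B | C)."""
    assert y in A - (B | C)                 # ASSERT THE HYPOTHESIS
    assert y in A and y not in (B | C)      # definition of set difference
    assert y not in B and y not in C        # definition of set union
    assert y in (A | B)                     # definition of set union, since y is in A
    assert y in (A | B) - C                 # definition of set difference
    return True

# Hypothetical example sets.
A, B, C = {1, 2, 3}, {3, 4}, {2, 5}
left = A - (B | C)        # {1}
right = (A | B) - C       # {1, 3, 4}

assert all(follow_proof(y, A, B, C) for y in left)
print(left <= right)      # the inclusion that was proved
print(right <= left)      # the reverse inclusion, which fails for these sets
```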

2.3.4 Exercises

Write proofs for each of the following statements.

1. For all sets A, B, and C, $(A \cap B) \cap C \subseteq A \cap C$.

2. For all sets A, B, and C, $(A \cap B) \cap (A \cap C) \subseteq B \cap C$.

3. For all sets A, B, and C, $(A \setminus B) \cap (A \setminus C) \subseteq A \setminus (B \cap C)$.

Let A and B be sets. From the definition of set equality it follows that one can prove $A = B$ by proving the two separate facts $A \subseteq B$ and $B \subseteq A$. That suggests the following proof template for proving that two sets are equal.


SET THE CONTEXT: Make statements about what is being assumed about sets A and B.

PART 1: SHOW $A \subseteq B$.

ASSERT THE HYPOTHESIS: Let $x \in A$.

LIST IMPLICATIONS: Use the properties of set A to show x belongs to set B.

CONCLUDE PART 1: $x \in B$. Therefore, by the definition of subset, $A \subseteq B$.

PART 2: SHOW $B \subseteq A$.

ASSERT THE HYPOTHESIS: Let $x \in B$.

LIST IMPLICATIONS: Use the properties of set B to show x belongs to set A.

CONCLUDE PART 2: $x \in A$. Therefore, by the definition of subset, $B \subseteq A$.

STATE THE CONCLUSION: Therefore, because A and B are subsets of each other, by the definition of set equality, $A = B$.

Is it correct to use the same variable x in both parts of the above proof template?

Yes, since the use of the variable x is only important in the context of showing $A \subseteq B$ or $B \subseteq A$, there is little chance that the reader will be confused by these two uses of

the same variable. On the other hand, there would be nothing incorrect about using

the variable x to represent the element of set A in the first part of the proof and to

use the variable y to represent the element of set B in the second part of the proof.

Using the same variable has the advantage that it is used the same way in both parts

of the proof, that is, to represent an element of a set that is being shown to also be

an element of a second set.

Is it correct that the variables A and B are used to represent the sets in both parts

of the proof? Could, for example, the first part of the proof use sets A and B, and

the second part of the proof use sets C and D? Here the answer is that it is very

important to use the same variables in both parts of the proof. To show $A = B$ it must be shown that $A \subseteq B$ and $B \subseteq A$ for the same pair of sets A and B. Showing $A \subseteq B$ and $C \subseteq D$ does not let one conclude that $A = B$. After introducing A and B

in the SET THE CONTEXT part of the proof, it would be wrong to change the use

of these variables later in the proof or to change which variables were representing

the two sets.

Consider how to write proofs of three of the example statements:

$A \cup B = B \cup A$, the Commutative Law of Union.

$(A \cap B) \cap C = A \cap (B \cap C)$, the Associative Law of Intersection.

$(A \cup B)^c = A^c \cap B^c$, DeMorgan's Law.

The first proof follows easily from the fact that in logic the statements "p or q" and "q or p" are equivalent. This leads to the proof


PROOF: $A \cup B = B \cup A$.

SET THE CONTEXT: Suppose that A and B are any two sets.

PART 1 $A \cup B \subseteq B \cup A$

ASSERT THE HYPOTHESIS: Let $x \in A \cup B$.

LIST IMPLICATIONS: By the definition of set union, $x \in A$ or $x \in B$.

Thus, $x \in B$ or $x \in A$.

By the definition of set union $x \in B \cup A$.

CONCLUDE PART 1: Hence, from the definition of subset, it follows that $A \cup B \subseteq B \cup A$.

PART 2 $B \cup A \subseteq A \cup B$

ASSERT THE HYPOTHESIS: Let $x \in B \cup A$.

LIST IMPLICATIONS: By the definition of set union, $x \in B$ or $x \in A$.

Thus, $x \in A$ or $x \in B$.

By the definition of set union $x \in A \cup B$.

CONCLUDE PART 2: Hence, from the definition of subset, it follows that $B \cup A \subseteq A \cup B$.

STATE THE CONCLUSION: Therefore, since $A \cup B$ and $B \cup A$ are subsets of each other, by the definition of set equality $A \cup B = B \cup A$.

Note that the PART 1 and PART 2 labels have been included in the above

display as guides to the student, but they are not required elements of the proof

itself. This proof can be shortened. Since the second part of the proof is identical to

the first part of the proof except that the roles of the sets A and B are interchanged,

one might save the reader from having to think through the details of the second half

of the proof which are identical to the details of the first half. The proof could be

written as

PROOF: $A \cup B = B \cup A$.

Let $x \in A \cup B$.

By the definition of set union, $x \in A$ or $x \in B$.

Thus, $x \in B$ or $x \in A$.

By the definition of set union $x \in B \cup A$.

Hence, from the definition of subset, it follows that $A \cup B \subseteq B \cup A$.

Similarly, one can conclude that $B \cup A \subseteq A \cup B$.

Therefore, since $A \cup B$ and $B \cup A$ are subsets of each other, by the definition of set equality $A \cup B = B \cup A$.

In fact, the first half of the proof is the second half of the proof. The first half of the proof shows that $A \cup B \subseteq B \cup A$ for any two sets A and B. In particular, that proof applies when the roles of the two sets are interchanged; just let the variable A in the first part of the proof represent the set B from the second part of the proof, and let the variable B in the first part of the proof represent the set A from the second part of the proof.

The Associative Law of Intersection refers to three sets and requires repeated

use of the definition of set intersection. The definition is used to break down the

statement $x \in (A \cap B) \cap C$ into the three simple statements $x \in A$, $x \in B$, and $x \in C$ and then these facts are put back together to form the needed $x \in A \cap (B \cap C)$. Again,

the proof needs two parts. The result is

PROOF: $(A \cap B) \cap C = A \cap (B \cap C)$.

Suppose that A, B, and C are any three sets.

PART 1 $(A \cap B) \cap C \subseteq A \cap (B \cap C)$

Let $x \in (A \cap B) \cap C$.

By the definition of set intersection, $x \in (A \cap B)$ and $x \in C$.

Also, by the definition of set intersection, $x \in A$ and $x \in B$.

Thus, $x \in A$, $x \in B$, and $x \in C$.

Since $x \in B$ and $x \in C$, by the definition of set intersection $x \in B \cap C$.

Since $x \in A$ and $x \in B \cap C$, by the definition of set intersection $x \in A \cap (B \cap C)$.

Hence, from the definition of subset, it follows that $(A \cap B) \cap C \subseteq A \cap (B \cap C)$.

PART 2 $A \cap (B \cap C) \subseteq (A \cap B) \cap C$

Now, let $x \in A \cap (B \cap C)$.

By the definition of set intersection, $x \in A$ and $x \in B \cap C$.

Also, by the definition of set intersection, $x \in B$ and $x \in C$.

Thus, $x \in A$, $x \in B$, and $x \in C$.

Since $x \in A$ and $x \in B$, by the definition of set intersection $x \in A \cap B$.

Since $x \in A \cap B$ and $x \in C$, by the definition of set intersection $x \in (A \cap B) \cap C$.

Hence, from the definition of subset, it follows that $A \cap (B \cap C) \subseteq (A \cap B) \cap C$.

Therefore, because $(A \cap B) \cap C$ and $A \cap (B \cap C)$ are subsets of each other, by the definition of set equality $(A \cap B) \cap C = A \cap (B \cap C)$.

The two DeMorgan's Laws are useful because they tell how to simplify the

complement of a set formed by a combination of unions and intersections of sets.

Proving these laws can follow the template for showing set equality. The proofs will

need to refer to the definitions of set union, set intersection, and set complement.

The order in which these definitions are invoked follows from what is known at that

point of the proof. For example, if you know that $x \in (A \cup B)^c$, then the only way

to make progress in the proof is to apply the definition of set complement because

the only attribute known about the set is that it is the complement of some other set.

[Fig. 2.2: diagram of two sets A and B with labels for $(A \cup B)^c$ and $A^c \cap B^c$]

Yes, that other set is a union of two sets, but there is no way to use that information

at this point of the proof because complementation was performed after the union

was taken (Fig. 2.2).

PROOF: $(A \cup B)^c = A^c \cap B^c$.

Suppose that A and B are any two sets.

PART 1 $(A \cup B)^c \subseteq A^c \cap B^c$

Let $x \in (A \cup B)^c$.

By the definition of set complement, $x \notin (A \cup B)$.

If $x \in A$ or $x \in B$, then $x \in A \cup B$ which is false.

Thus, $x \notin A$ and $x \notin B$, so by the definition of set complement, $x \in A^c$ and $x \in B^c$.

By the definition of set intersection $x \in A^c \cap B^c$.

Hence, from the definition of subset, it follows that $(A \cup B)^c \subseteq A^c \cap B^c$.

PART 2 $A^c \cap B^c \subseteq (A \cup B)^c$

Now, let $x \in A^c \cap B^c$.

By the definition of set intersection, $x \in A^c$ and $x \in B^c$.

Thus, by the definition of set complement, $x \notin A$ and $x \notin B$.

If $x \in A \cup B$, then by the definition of union, it would follow that $x \in A$ or $x \in B$ which is false.

Thus, $x \notin A \cup B$, and, by the definition of set complement, $x \in (A \cup B)^c$.

Hence, from the definition of subset, it follows that $A^c \cap B^c \subseteq (A \cup B)^c$.

Therefore, because $A^c \cap B^c$ and $(A \cup B)^c$ are subsets of each other, by the definition of set equality $(A \cup B)^c = A^c \cap B^c$.
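The two-part template for set equality can be mirrored directly in code. The sketch below is an illustration under stated assumptions, not part of the text: complements are taken relative to a small hypothetical universe, and `equal_by_double_inclusion` is a name invented here. It checks both DeMorgan's Laws on every pair of subsets by testing the two inclusions separately.

```python
from itertools import combinations

# Hypothetical universe; complements are taken relative to it.
UNIVERSE = frozenset(range(4))

def all_subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def complement(S):
    """Elements of the universe that are not in S."""
    return UNIVERSE - S

def equal_by_double_inclusion(X, Y):
    """Mirror the proof template: show X is in Y, then Y is in X."""
    part1 = all(x in Y for x in X)   # PART 1: X is a subset of Y
    part2 = all(x in X for x in Y)   # PART 2: Y is a subset of X
    return part1 and part2

SUBSETS = all_subsets(UNIVERSE)

demorgan_union_ok = all(
    equal_by_double_inclusion(complement(A | B), complement(A) & complement(B))
    for A in SUBSETS for B in SUBSETS
)
demorgan_intersection_ok = all(
    equal_by_double_inclusion(complement(A & B), complement(A) | complement(B))
    for A in SUBSETS for B in SUBSETS
)
print(demorgan_union_ok, demorgan_intersection_ok)
```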

2.3.6 Exercises

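Given that A, B, and C are sets, write proofs for each of the following statements.

1. $A \cap B = B \cap A$.

2. $A \cap (B \setminus A) = \emptyset$.

3. $(A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B)$.

4. $(A \cup B) \cup C = A \cup (B \cup C)$.

5. $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$.

6. $(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$.

7. $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$.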

2.4.1 Definitions of Even and Odd Integers

You are already very familiar with the natural numbers, $\mathbb{N} = \{1, 2, 3, 4, \ldots\}$, which are sometimes called the counting numbers or whole numbers. By adding zero and the negative natural numbers to this set, one obtains the integers, $\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$. The natural numbers are often referred to as the positive integers. Much of a student's first study of mathematics is concerned

with these two sets of numbers. By a very young age most people are already

familiar with even and odd integers and some of their properties. This section will

construct proofs of some of these properties both because the student will feel very

comfortable with the concepts and because it allows for the introduction of some

basics about how to write proofs.

Before proceeding with proofs, though, it is necessary that there is agreement on

the definitions of even and odd integers. Indeed, there are many possible definitions

of even integers:

$n \in \mathbb{Z}$ is an even integer if

the decimal representation of n has a ones digit equal to 0, 2, 4, 6, or 8.

n is either 0 or the prime factorization of n contains a factor of 2.

there is an integer k such that $n = 2k$.

$i^n$ is a real number, where $i = \sqrt{-1}$.

the number $(-1)^n$ is positive.

$9^n \equiv 1 \pmod{10}$.

$\sin\left(\frac{n\pi}{2}\right) = 0$.

$\frac{n}{2} \in \mathbb{Z}$.
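As a numerical illustration that several of these definitions agree, the sketch below compares them with the condition "there is an integer k such that n = 2k" for a range of non-negative integers. The function names are invented for this example, the range is restricted to $n \ge 0$ so that every power stays an integer, and the sine test uses a floating-point tolerance, so this is evidence of agreement and not a proof of equivalence.

```python
import math
from fractions import Fraction

def working_definition(n):
    """There is an integer k with n == 2*k (searched over 0..n, for n >= 0)."""
    return any(n == 2 * k for k in range(n + 1))

def ones_digit_even(n):
    return n % 10 in (0, 2, 4, 6, 8)

def power_of_i_is_real(n):
    re, im = 1, 0                    # i**0 = 1 + 0i
    for _ in range(n):
        re, im = -im, re             # multiply by i: (re + im*i) * i = -im + re*i
    return im == 0

def minus_one_power_positive(n):
    return (-1) ** n > 0

def nine_power_congruent(n):
    return pow(9, n, 10) == 1        # 9**n reduced modulo 10

def sine_vanishes(n):
    # Floating-point check with a tolerance, since pi is only approximated.
    return math.isclose(math.sin(n * math.pi / 2), 0.0, abs_tol=1e-9)

def half_is_integer(n):
    return Fraction(n, 2).denominator == 1

alternatives = [ones_digit_even, power_of_i_is_real, minus_one_power_positive,
                nine_power_congruent, sine_vanishes, half_is_integer]

agree = all(test(n) == working_definition(n)
            for n in range(60) for test in alternatives)
print(agree)
```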

Which of these definitions should be used when writing proofs about even and

odd integers? Actually, since all the definitions are equivalent, one could adopt

any one of these definitions and then prove theorems that show that all the other

definitions are equivalent to the chosen definition. This is not an unusual situation

in mathematics, especially for a concept as elementary as even integers. But it turns

out that one of these definitions is particularly well suited for writing proofs, namely: $n \in \mathbb{Z}$ is even if there is a $k \in \mathbb{Z}$ such that $n = 2k$. This makes a useful definition because it provides a fairly easy way to check whether a given integer is even, and because knowing that a number n is even immediately gives you a number k for which $n = 2k$, and that is a powerful tool for proving facts about even integers. For this reason, this chosen definition is called the working definition, that is, it is the definition easiest to apply in a wide variety of contexts. It is the definition chosen from which all other properties of even numbers can be derived.

A similar discussion could take place about how to define odd integers. The

working definition is that $n \in \mathbb{Z}$ is odd if there is a $k \in \mathbb{Z}$ such that $n = 2k + 1$.

There is a long list of facts you could prove about even and odd numbers.

Facts About Even and Odd Integers

No integer is both even and odd.

$n \in \mathbb{Z}$ is even if and only if $n + 1 \in \mathbb{Z}$ is odd.

The sum of any two even integers is even.

The sum of any two odd integers is even.

The sum of an even integer and an odd integer is odd.

The product of two odd integers is odd.

The product of two integers is odd only if both of the factors are odd.

Together, the first two of these facts say that each integer is either even or odd

but not both. This says that the sets of even and odd integers form a partition of

Z, that is, the sets are disjoint and the union of the sets is all of Z. Some authors

require that all the sets of a partition be nonempty as in the case with even and

odd integers. So why is it that every integer is either even or odd? This depends

on the Division Algorithm, which states that if $m, n \in \mathbb{Z}$ with $n > 0$, then there are unique $q, r \in \mathbb{Z}$ with $0 \le r < n$ such that $m = nq + r$. In this case q is called the quotient of the division, and r is called the remainder of the division. Using the Division Algorithm, any integer m can be divided by 2 giving a quotient and remainder where the remainder is either 0 or 1. If the remainder is 0, then $m = 2q$ for integer q implying that m is even, and if the remainder is 1, then $m = 2q + 1$ for integer q implying that m is odd.

How can these ideas be used to write a good proof of "Every integer is either even or odd"? First it is easier to reword the statement as "If $m \in \mathbb{Z}$, then either m is even or m is odd." This is a conditional statement, so the natural way to begin a proof is to assume that the hypothesis of the statement is satisfied, that is, that m is an integer. Now apply the Division Algorithm to get the quotient q and remainder r guaranteed by the algorithm. Finally, the value of r shows that m either satisfies the definition of being an even integer or the definition of being an odd integer. The result would be

PROOF: Every integer is either even or odd.

Let m be an integer.

By the Division Algorithm there are integers q and r with $0 \le r < 2$ such that $m = 2q + r$.

If $r = 0$, then $m = 2q$ for integer q which means that m satisfies the definition for being even.

If $r = 1$, then $m = 2q + 1$ for integer q which means that m satisfies the definition for being odd.

Since r must be either 0 or 1, it follows that every integer is either even or odd.
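The case split in this proof can be sketched with Python's built-in `divmod`, which for a positive divisor returns a quotient and a remainder satisfying the same conditions as the Division Algorithm, including for negative m. The function name `parity_by_division` is an assumption made for this example.

```python
def parity_by_division(m):
    """Classify m as even or odd and return the quotient q used as witness."""
    q, r = divmod(m, 2)                      # m == 2*q + r with 0 <= r < 2
    assert m == 2 * q + r and 0 <= r < 2     # the Division Algorithm's guarantee
    if r == 0:
        return "even", q                     # m = 2q
    return "odd", q                          # m = 2q + 1

for m in (-7, -4, 0, 9, 10):
    kind, q = parity_by_division(m)
    print(m, kind, q)
```

Note that the negative case works as the text requires: for $m = -7$ the quotient is $-4$ and the remainder is 1, since $-7 = 2 \cdot (-4) + 1$.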

Next consider how to prove the statement "The sum of any two odd integers is even." The statement concerns the sum of any two odd integers, so the proof reader

would expect the proof to consider two arbitrarily chosen odd integers. Once two

odd integers are chosen, the definition of odd integer should be invoked because, at

that point, that is the only information that is known about the two integers. Finally,

a little algebra will help to show that the sum of these two odd integers satisfies the

definition of even integer. Here is an attempt to write such a proof that makes several

common proof writing errors.

PROOF ATTEMPT: The sum of any two odd integers is even.

The two integers are odd, so each has the form $2k + 1$.

The sum of these two integers is $(2k + 1) + (2k + 1) = 4k + 2$.

k could be even or odd.

The number 2 is even since it is $2 \cdot 1$.

$4k$ is even since it is $2 \cdot 2k$.

The sum of two even numbers is even, so the sum of $4k$ and 2 is an even number.

Therefore, the sum of two odd integers is always even.

The proof begins talking about two integers, but the proof reader has not yet

been introduced to these integers and does not know what two integers are being

discussed. The proof is missing a SET THE CONTEXT sentence to introduce

the idea of starting with any two odd integers.

The proof uses the variable k without introducing what that variable represents.

The proof requires that k be an integer, but the fact that k is an integer is not stated

anywhere. As far as the proof reader knows, k could be any complex number.

Later, the proof claims that 2k is an integer which is needed to show 4k is an

even integer. Without knowing that k is an integer, it does not follow that 2k is

also an integer.

The definition of odd integer allows you to take an odd integer and represent it as $2k + 1$, where k is another integer. To apply this definition, then, the proof should start with an odd integer, say m, and then represent it as $2k + 1$ rather than starting with $2k + 1$. The subtle point is that one should start with "odd integer" and use its definition to move on to $2k + 1$ rather than starting with $2k + 1$, which jumps the gun. The reader of the proof could wonder whether $2k + 1$ could represent a generic odd integer. Well, it can, but this takes some thought which can be avoided by starting with an odd integer m and then using the definition of odd to select the integer k such that $m = 2k + 1$.

The definition of odd integer does refer to $2k + 1$, but it is more precise. It does not say "has the form." It says that there is an integer k such that the odd number equals $2k + 1$.

It is a major error to allow both odd integers to equal $2k + 1$ for the same number k. The only way this can happen is for the two odd integers themselves to be equal. Thus, this proof only applies to a small subset of cases where one adds two identical odd integers together such as $3 + 3$ or $117 + 117$.

The statement "k could be even or odd" is certainly correct, but it does not contribute to the proof. It is a statement about items in the proof that is not part of the proof. Occasionally, one will make a definition as part of a long proof, and then give some examples to help the reader understand that definition. But if a statement is not needed either as a critical step in a proof or an important illustration to aid the understanding of the proof, then the statement should be left out of the proof because it distracts from the proof and complicates it.

The statement "The sum of two even numbers is even" is correct, but it has not been proved yet, at least in this text, and is equivalent in difficulty to proving the corresponding statement about the sum of odd integers. Thus, it is not appropriate to use the result about sums of even integers to prove one about the sum of odd integers.

PROOF: The sum of any two odd integers is even.

Let m and n be two odd integers.

From the definition of odd integer, there is an integer $k_1$ such that $m = 2k_1 + 1$ and an integer $k_2$ such that $n = 2k_2 + 1$.

Then $m + n = (2k_1 + 1) + (2k_2 + 1) = 2(k_1 + k_2 + 1)$.

Since $k_1$ and $k_2$ are integers, so is $k_1 + k_2 + 1$.

Thus, the sum $m + n$ is equal to twice an integer, so by the definition of even integer, $m + n$ is even.

Therefore, the sum of any two odd integers is always even.
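The difference between the flawed attempt and the corrected proof can be made concrete. In the sketch below (function and variable names are assumptions for illustration), `even_witness_for_sum` follows the corrected proof by extracting separate integers $k_1$ and $k_2$, while `flawed_reach` collects the sums the flawed attempt actually covers, namely those of the form $(2k + 1) + (2k + 1)$ with a single k.

```python
def odd_witness(m):
    """Return the integer k with m == 2*k + 1, or raise if m is not odd."""
    k, r = divmod(m, 2)
    if r != 1:
        raise ValueError(f"{m} is not odd")
    return k

def even_witness_for_sum(m, n):
    """Follow the corrected proof: return j with m + n == 2*j."""
    k1, k2 = odd_witness(m), odd_witness(n)   # separate witnesses for m and n
    j = k1 + k2 + 1
    assert m + n == 2 * j                     # m + n is twice an integer
    return j

# Sums covered by the flawed attempt, which reuses one k for both integers.
flawed_reach = {(2 * k + 1) + (2 * k + 1) for k in range(-20, 21)}

print(even_witness_for_sum(3, 5))   # witness for 3 + 5 = 8
print(3 + 5 in flawed_reach)        # 8 is not of the form 4k + 2
print(3 + 3 in flawed_reach)        # 6 is, because both summands are equal
```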

The form of this proof can be copied almost word for word to get a similar proof of the statement "The product of two odd integers is odd."

PROOF: The product of two odd integers is odd.

Let m and n be two odd integers.

From the definition of odd integer, there is an integer $k_1$ such that $m = 2k_1 + 1$ and an integer $k_2$ such that $n = 2k_2 + 1$.

Then $mn = (2k_1 + 1)(2k_2 + 1) = 4k_1 k_2 + 2k_1 + 2k_2 + 1 = 2(2k_1 k_2 + k_1 + k_2) + 1$.

Since $k_1$ and $k_2$ are integers, so is $2k_1 k_2 + k_1 + k_2$.

Thus, the product $mn$ is equal to one more than twice an integer, so by the definition of odd integer, $mn$ is odd.

Therefore, the product of any two odd integers is always odd.

2.4.3 Exercises

Write proofs for each of the following statements.

The product of an even integer and an odd integer is even.

The difference of an even integer and an odd integer is odd.

If the product of two integers is odd, then both of the integers must have been odd.

The sum of any four consecutive integers is even.

2.5.1 Ordered Fields

Many of the theorems of Calculus involve properties of the real numbers. Some

of these properties are subtle, so it is essential to understand this important set of

numbers. Already introduced are the sets of natural numbers, N, and the integers, Z.

Also of importance is the set of rational numbers, $\mathbb{Q} = \{\frac{m}{n} \mid m, n \in \mathbb{Z}, n \ne 0\}$. This definition comes with the understanding that the two rational numbers $\frac{m}{n}$ and $\frac{a}{b}$ are equal whenever $mb = na$. Thus, there are always infinitely many representations for each rational number. For all rational numbers $r \ne 0$, one can always find relatively prime integers m and n with $n > 0$ such that $r = \frac{m}{n}$. Together with an agreement to write the rational number 0 as $\frac{0}{1}$, each rational number has a unique lowest terms representation.

The set of rational numbers is more than a set of fractions with integers for numerators and denominators. It also comes with the two binary operations of addition ($+$) and multiplication ($\cdot$) and with the order relation less than ($<$).


The binary operations satisfy conditions which make Q into a field. A field F

is a set with operations of addition and multiplication that satisfies the following

axioms.

Axioms for a Field F

A set F together with the binary operations of addition ($+$) and multiplication ($\cdot$) forms a field if F contains the two elements 0 and 1 with $0 \ne 1$ such that for every $r, s, t \in F$:

The Closure Properties: $r + s \in F$ and $r = s \rightarrow r + t = s + t$; $r \cdot s \in F$ and $r = s \rightarrow r \cdot t = s \cdot t$.

The Associative Properties: $(r + s) + t = r + (s + t)$; $(r \cdot s) \cdot t = r \cdot (s \cdot t)$.

The Commutative Properties: $r + s = s + r$; $r \cdot s = s \cdot r$.

The Identity Properties: $r + 0 = r$; $r \cdot 1 = r$.

The Inverse Properties: there exists $-r \in F$ such that $r + (-r) = 0$; if $r \ne 0$, there exists $\frac{1}{r} \in F$ such that $r \cdot \frac{1}{r} = 1$.

The Distributive Law of Multiplication Over Addition: $r \cdot (s + t) = r \cdot s + r \cdot t$.

Notice that the rational numbers do satisfy the eleven field axioms. One defines the operation subtraction ($-$) by $r - s = r + (-s)$ and the operation division ($\div$) for $s \ne 0$ by $r \div s = r \cdot \frac{1}{s} = \frac{r}{s}$. Moreover, the field $\mathbb{Q}$ together with the less than order relation is an ordered field that obeys the following axioms.

Axioms for an Ordered Field F

A field F is an ordered field with order relation $<$ if for every $r, s, t \in F$:

Exactly one of the following holds: $r < s$, $r = s$, $s < r$.

The Transitive Property: $r < s$ and $s < t$ imply $r < t$.

The Addition Property of Less Than: $r < s$ implies $r + t < s + t$.

The Multiplication Property of Less Than: $r < s$ and $0 < t$ imply $r \cdot t < s \cdot t$.


Notice that the rational numbers do satisfy the four ordered field axioms. One

defines the other order relations of greater than ($>$), greater than or equal to ($\ge$), and less than or equal to ($\le$) in the obvious ways, that is, $r > s$ whenever $s < r$, $r \ge s$ whenever either $r > s$ or $r = s$, and $r \le s$ whenever either $r < s$ or $r = s$.

There are many other ordered fields, and it is constructive to consider how to justify the fifteen ordered field axioms for a different ordered field. For example, the set $T = \{r + s\sqrt{2} \mid r, s \in \mathbb{Q}\}$ is an ordered field using the usual addition and multiplication operations. For two elements $a + b\sqrt{2}$ and $c + d\sqrt{2}$ in T, define addition as $(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2}$ and multiplication as $(a + b\sqrt{2}) \cdot (c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$. To define the less than relation you would want $(a + b\sqrt{2}) < (c + d\sqrt{2})$ whenever $a - c < (d - b)\sqrt{2}$, which can be checked by squaring both $a - c$ and $(d - b)\sqrt{2}$, although you will need to also consider the signs of $a - c$ and $d - b$. Thus, the definition becomes $(a + b\sqrt{2}) < (c + d\sqrt{2})$ if one of the following holds:

$a - c \le 0$ and $0 \le d - b$, with at least one of these two inequalities strict,

$0 < a - c$, $0 < d - b$, and $(a - c)^2 < 2(d - b)^2$, or

$a - c < 0$, $d - b < 0$, and $(a - c)^2 > 2(d - b)^2$.
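The case analysis for the order on T can be sketched with exact rational arithmetic, so that $\sqrt{2}$ never has to be approximated. The function below decides whether $a - c < (d - b)\sqrt{2}$ by looking at signs first and comparing squares only when both differences have the same sign. It also handles the boundary cases in which one of the two differences is zero. The name `less_than` and the test grid are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

def less_than(a, b, c, d):
    """Decide whether a + b*sqrt(2) < c + d*sqrt(2) for rationals a, b, c, d."""
    p = Fraction(a) - Fraction(c)      # a - c
    q = Fraction(d) - Fraction(b)      # d - b
    # The goal is to decide p < q*sqrt(2) without computing sqrt(2).
    if p <= 0 and q >= 0:
        return p < 0 or q > 0          # fails only when both differences are zero
    if p > 0 and q > 0:
        return p * p < 2 * q * q       # both sides positive: compare squares
    if p < 0 and q < 0:
        return p * p > 2 * q * q       # both sides negative: squaring reverses
    return False                       # here p >= 0 and q <= 0, not both zero

# Trichotomy check on a small grid of rationals: for any two elements of T,
# exactly one of x < y, y < x, x = y should hold.
grid = [Fraction(n, 2) for n in range(-4, 5)]
trichotomy_ok = all(
    less_than(a, b, c, d) + less_than(c, d, a, b) + ((a, b) == (c, d)) == 1
    for a, b, c, d in product(grid, repeat=4)
)
print(trichotomy_ok)
```

For example, `less_than(1, 1, 3, 0)` asks whether $1 + \sqrt{2} < 3$; here $a - c = -2$ and $d - b = -1$, and the comparison $4 > 2$ settles it.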

It is fairly easy to check that T is an ordered field. The only field axiom which does not follow immediately from the properties of rational numbers is the inverse axiom for multiplication. You should verify that for $a + b\sqrt{2} \ne 0$, its multiplicative inverse is

$\frac{a}{a^2 - 2b^2} + \frac{-b}{a^2 - 2b^2}\sqrt{2}$

which is in T. The order axioms take more work to verify due to the complicated definition of less than. For example, to verify the less than relation works correctly with addition, one would begin with three elements of T, $a + b\sqrt{2}$, $c + d\sqrt{2}$, and $e + f\sqrt{2}$ where it is given that $a + b\sqrt{2} < c + d\sqrt{2}$. One needs to compare $(a + b\sqrt{2}) + (e + f\sqrt{2})$ with $(c + d\sqrt{2}) + (e + f\sqrt{2})$. To do this, one compares the values of $(a + e) - (c + e) = a - c$ and $(d + f) - (b + f) = d - b$. But this reduces to comparing $a - c$ and $d - b$ which are known to satisfy the correct conditions because $a + b\sqrt{2} < c + d\sqrt{2}$ was given.
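The inverse formula can be checked on examples with exact arithmetic. In this sketch an element $a + b\sqrt{2}$ of T is represented as the pair `(a, b)` of rationals, which is a representation chosen here for illustration, and `multiply` and `inverse` are hypothetical helper names implementing the multiplication rule and the inverse formula stated in the text.

```python
from fractions import Fraction

def multiply(x, y):
    """(a + b*sqrt2) * (c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(x):
    """Multiplicative inverse of a + b*sqrt2, for (a, b) != (0, 0)."""
    a, b = Fraction(x[0]), Fraction(x[1])
    denom = a * a - 2 * b * b          # zero only for a = b = 0, since sqrt2 is irrational
    if denom == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / denom, -b / denom)

samples = [(Fraction(3), Fraction(-2)),
           (Fraction(1, 2), Fraction(1, 3)),
           (Fraction(0), Fraction(5))]

for x in samples:
    print(x, inverse(x), multiply(x, inverse(x)))   # each product is the pair for 1
```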

Every ordered field satisfies a long list of simple properties that you will

associate with facts learned in Arithmetic and Algebra. Here are some of those

properties.


Let r, s, t all be elements of ordered field F. Then

1. $r \cdot 0 = 0$.

2. If $r + t = s + t$, then $r = s$.

3. If $r \cdot t = s \cdot t$ and $t \ne 0$, then $r = s$.

4. $-(-r) = r$.

5. If $r \ne 0$, then $\frac{1}{\frac{1}{r}} = r$.

6. $-r = -s$ if and only if $r = s$.

7. $-r = (-1) \cdot r$.

8. $(-r) + (-s) = -(r + s)$.

9. $(-r) \cdot (-s) = r \cdot s$.

10. If $r < s$ and $t < 0$, then $s \cdot t < r \cdot t$.

11. If $r \ne 0$, then $r^2 > 0$.

12. $0 < 1$.

13. If $0 < r$, then $0 < \frac{1}{r}$.

14. If $0 < r < s$, then $0 < \frac{1}{s} < \frac{1}{r}$.

15. If n is any natural number and $r > 1$, then $r^n < r^{n+1}$.

The reader may wish to prove some of these properties by applying the axioms.

This book will not dwell on these proofs since the techniques used in proving them

are not essential for writing most proofs in Analysis. Two simple proofs are given

here as examples.

PROOF: If r is any element of the field F, then $r \cdot 0 = 0$.

Let r be an element of field F.

Since 0 is the additive identity of F, $0 = 0 + 0$.

Then $r \cdot 0 = r \cdot (0 + 0)$.

By the Distributive Law, $r \cdot 0 = r \cdot 0 + r \cdot 0$.

By adding $-(r \cdot 0)$ to each side of this equality, one gets $0 = r \cdot 0 - r \cdot 0 = (r \cdot 0 + r \cdot 0) - r \cdot 0 = r \cdot 0 + (r \cdot 0 - r \cdot 0) = r \cdot 0 + 0 = r \cdot 0$.

Therefore, for any $r \in F$, $r \cdot 0 = 0$.

The next theorem essentially says that if $(-1) \cdot r$ has the same properties as $-r$, it must equal $-r$.


Let r be an element of field F. Then

$(-1) \cdot r = (-1) \cdot r + 0$ (Additive Identity)

$= (-1) \cdot r + (r + (-r))$ (Additive Inverses)

$= ((-1) \cdot r + r) + (-r)$ (Associative Law of Addition)

$= ((-1) \cdot r + 1 \cdot r) + (-r)$ (Multiplicative Identity)

$= ((-1) + 1) \cdot r + (-r)$ (Distributive Law)

$= 0 \cdot r + (-r)$ (Additive Inverses)

$= 0 + (-r)$ ($r \cdot 0 = 0$)

$= -r$ (Additive Identity)

Note that every ordered field F will contain a copy of $\mathbb{Q}$. This follows since $0, 1 \in F$, and if n is a natural number in F, then $n + 1 \in F$. Thus, it follows by mathematical induction that $n \in F$ for all $n \in \mathbb{N}$. Moreover, since $0 < 1$, it follows for each $n \in \mathbb{N}$ that $n = n + 0 < n + 1$ showing that all natural numbers are distinct elements of F. The existence of the negatives of all numbers in F implies that the integers form a subset of F, and the existence of reciprocals implies that all of $\mathbb{Q}$ lies in F. There are fields which are not ordered fields, and some of them do not contain copies of $\mathbb{Q}$. Indeed, there are finite fields as well as infinite fields that do not contain $\mathbb{Q}$ or even $\mathbb{N}$.
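A standard example of a finite field, not taken from the text, is the set $\{0, 1, 2, 3, 4\}$ with addition and multiplication taken modulo 5. The sketch below uses it to illustrate two points: the properties $r \cdot 0 = 0$ and $(-1) \cdot r = -r$ hold there, as they must in any field, and adding 1 to itself five times returns to 0, so the natural numbers do not appear as distinct elements. The modulus and all function names are assumptions chosen for this example.

```python
P = 5                      # hypothetical choice of prime; arithmetic is modulo P
ELEMENTS = range(P)

def add(r, s):
    return (r + s) % P

def mul(r, s):
    return (r * s) % P

def neg(r):
    return (-r) % P        # additive inverse: add(r, neg(r)) == 0

# Every nonzero element has a reciprocal, found here by direct search.
reciprocals = {r: next(s for s in ELEMENTS if mul(r, s) == 1)
               for r in ELEMENTS if r != 0}

# The two properties proved from the field axioms.
zero_product_ok = all(mul(r, 0) == 0 for r in ELEMENTS)
minus_one_ok = all(mul(neg(1), r) == neg(r) for r in ELEMENTS)

# Adding 1 to itself P times returns to 0, so 1, 1+1, 1+1+1, ... are not
# all distinct, and no order relation can satisfy n < n + 1 for all n.
total = 0
for _ in range(P):
    total = add(total, 1)

print(reciprocals, zero_product_ok, minus_one_ok, total)
```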

There are infinitely many ordered fields. The set of real numbers, $\mathbb{R}$, is special because it includes every number that is considered a possible distance from zero, either positive, negative, or zero. An easy way to ensure that $\mathbb{R}$ contains every possible distance is to require it to satisfy the Completeness Axiom. This axiom considers nonempty subsets of an ordered field, F (actually, any ordered set would do). A subset $S \subseteq F$ is said to be bounded above if there is an $M \in F$ such that all $x \in S$ satisfy $x \le M$. In this case, M is called an upper bound of S. Similarly, $S \subseteq F$ is bounded below by lower bound $K \in F$ if all $x \in S$ satisfy $x \ge K$. If $S \subseteq F$ is both bounded above and bounded below, then S is said to be bounded. If M is an upper bound for a set S, and it is less than or equal to every upper bound of S, then M is the least upper bound of S. Similarly, if K is a lower bound for a set S, and it is greater than or equal to every lower bound of S, then K is the greatest lower bound of S. For example, if S is the interval $[1, 5) = \{x \mid 1 \le x < 5\}$, then 10, 6, and $\sqrt{30}$ are all upper bounds of S, but 5 is the least upper bound of S. Also, $-2$, 0, and $\frac{1}{2}$ are all lower bounds of S, but 1 is the greatest lower bound of S. One often uses the notation l.u.b.$(S)$ or $\sup(S)$ to represent the least upper bound or supremum of S and g.l.b.$(S)$ or $\inf(S)$ to represent the greatest lower bound or infimum of S.

Axioms for the Real Numbers

The real numbers, $\mathbb{R}$, is an ordered field that satisfies The Completeness Axiom:

Every nonempty set $S \subseteq \mathbb{R}$ which is bounded above has a least upper bound in $\mathbb{R}$.

Note, for example, that the set $S = \{x \in \mathbb{Q} \mid x^2 < 7\}$ is a nonempty subset of $\mathbb{Q}$ which is bounded above by 4, 3, and 2.7, but there is no element of $\mathbb{Q}$ which is a least upper bound of $S$. The set of real numbers, though, does contain a least upper bound of $S$, namely $\sqrt{7}$. The Completeness Axiom is sometimes called the Least Upper Bound Principle. The Completeness Axiom comes up frequently in proofs about the real numbers to show that numbers with particular properties exist. For example, consider the two theorems, the Archimedean Principle and the Existence of Square Roots. Both of these theorems are easily understood, but they cannot be proved without using the Completeness Axiom.

The Archimedean Principle states that for every real number $r$ there is a natural number greater than $r$. It can be proved using a proof by contradiction. The proof

makes the assumption that there is a real number greater than every natural number

and uses this to derive a contradiction, a statement that is false. Because one cannot

derive a false statement from a true statement, the assumption most recently made

in the proof must be a false statement, and you can conclude that no real number

exists that is greater than every natural number.

PROOF (Archimedean Principle): If $r \in \mathbb{R}$, then there exists $n \in \mathbb{N}$ such that $r < n$.

Suppose that there is an $r \in \mathbb{R}$ such that $r > n$ for every $n \in \mathbb{N}$.

Then the set $\mathbb{N}$ is a nonempty subset of $\mathbb{R}$ with an upper bound, so by the Completeness Axiom, $\mathbb{N}$ has least upper bound $M$.

Then $M - 1 < M$, so $M - 1$ is not an upper bound for $\mathbb{N}$.

Thus, there is a $k \in \mathbb{N}$ with the property that $k > M - 1$.

But then $k + 1$ is also in $\mathbb{N}$, yet $k + 1 > (M - 1) + 1 = M$ where $M$ is an upper bound for $\mathbb{N}$.

This is a contradiction since no element of a set can be greater than an upper bound for that set.

Therefore, the assumption that $r > n$ for every $n \in \mathbb{N}$ must be false, and for every $r \in \mathbb{R}$ there must be at least one $n \in \mathbb{N}$ with $n > r$.

You may not have ever doubted that every nonnegative real number has a square

root, but this is a fact that can be proved using the axioms for the real numbers. It is

a nice application of both the Trichotomy Property and the Completeness Axiom.

Given a positive real number, $r$, the proof constructs the set $S = \{x \in \mathbb{R} \mid x^2 \le r\}$ and then uses the Completeness Axiom to exhibit a value, $s$, equal to the least upper bound of $S$. Then it shows that $s^2$ cannot be greater than $r$ and cannot be less than $r$, so by the Trichotomy Property, $s^2$ must equal $r$.

In particular, the proof first assumes that $s^2 > r$ and shows that there is a number $y > 0$ such that the square of $s - y$ is also greater than $r$. This shows that $s - y$ is an upper bound for $S$ which contradicts the fact that $s$ is the least upper bound of $S$. The proof magically suggests that $y = \frac{s^2 - r}{4s}$ works. Where did this magical expression for $y$ come from? It came from considering what property you would want such a $y$ to have. If you want $(s - y)^2 > r$, this suggests that you want $s^2 - 2sy + y^2 > r$. This inequality is quadratic in $y$ and has an unnecessarily messy solution. But one of the most important lessons about writing proofs in Analysis is that one can often be a little sloppy when trying to show that an inequality holds. Here, for example, rather than finding a $y$ such that $s^2 - 2sy + y^2 > r$, it would be sufficient to find a $y$ such that $s^2 - 2sy > r$, because if $s^2 - 2sy > r$, then certainly the needed $s^2 - 2sy + y^2 > r$ also holds. The advantage of making this change is that the inequality $s^2 - 2sy > r$ is very easy to solve for $y$, yielding $y < \frac{s^2 - r}{2s}$. Thus, the value $y = \frac{s^2 - r}{4s}$ ought to work fine, and, hence, the magic is demystified. Of course, there are many other possible values of $y$ that would also have worked in this proof, but only one value for $y$ is needed.

After showing that $s^2 > r$ cannot be true, the proof assumes that $s^2 < r$ and shows that there is a number $y > 0$ such that the square of $s + y$ is no greater than $r$. This shows that $s + y$ is in $S$ which contradicts the fact that $s$ is an upper bound of $S$. Again, the proof just suggests setting $y = \frac{r - s^2}{4s}$. Can you figure out where this expression for $y$ came from? Indeed, the calculation is similar to the one above. You need $(s + y)^2 \le r$, so $s^2 + 2sy + y^2 \le r$. It is simpler if $y^2$ could be replaced by $2sy$. If you assume $y \le 2s$, it allows you to conclude $y^2 \le 2sy$ so that $(s + y)^2 = s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy$. You then want a $y$ that satisfies $s^2 + 4sy \le r$. Thus, the value $y = \frac{r - s^2}{4s}$ gives the needed value of $y$, provided it is capped at $2s$ so that the assumption $y \le 2s$ holds (Fig. 2.3). Putting these ideas together gives the following proof.

[Fig. 2.3: labels $S$, $\sqrt{r}$, $s - \frac{s^2 - r}{4s}$, and $s + \frac{r - s^2}{4s}$]

PROOF (Existence of Square Roots): For every real number $r \ge 0$ there is an $s \in \mathbb{R}$ such that $s^2 = r$.

Let $r \ge 0$ be a real number.

If $r = 0$, then $0^2 = r$ and 0 satisfies the needed condition.

So assume that $r > 0$.

Let $S = \{x \in \mathbb{R} \mid x^2 \le r\}$.

$S$ is nonempty since $0 \in S$.

$S$ is bounded above by $r + 1$ since $x > r + 1$ implies $x^2 > r^2 + 2r + 1 > r$ so $x \notin S$.

Thus, by the Completeness Axiom, $S$ has a least upper bound $s$.

Moreover, $s > 0$ because the positive number $\min\{1, r\}$ belongs to $S$.

By the Trichotomy Property, either $s^2 > r$, $s^2 < r$, or $s^2 = r$.

If $s^2 > r$, note that $y = \frac{s^2 - r}{4s} > 0$, and $(s - y)^2 = s^2 - 2sy + y^2 > s^2 - 2sy = s^2 - 2s \cdot \frac{s^2 - r}{4s} = \frac{s^2 + r}{2} > r$. Because $(s - y)^2 > r$, it follows that $s - y < s$ is an upper bound of $S$. This contradicts the fact that $s$ is the least upper bound of $S$. Therefore, $s^2 > r$ must be false.

If $s^2 < r$, let $y = \min\left\{\frac{r - s^2}{4s}, 2s\right\}$. Then $y > 0$, and $(s + y)^2 = s^2 + 2sy + y^2 \le s^2 + 2sy + 2sy = s^2 + 4sy \le s^2 + 4s \cdot \frac{r - s^2}{4s} = r$. Because $(s + y)^2 \le r$, it follows that $s + y \in S$ and $s + y > s$. This contradicts the fact that $s$ is an upper bound of $S$. Therefore, $s^2 < r$ must be false.

Thus, it must be true that $s^2 = r$ which proves that for every real number $r \ge 0$ there is an $s \in \mathbb{R}$ with $s^2 = r$.
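The step used in the case $s^2 > r$ can be checked with exact arithmetic. The following is a minimal sketch rather than part of the proof: the helper name `step_down` and the sample values $r = 2$, $s = 3$ are assumptions chosen for illustration, and Python's `fractions` module stands in for the real numbers.

```python
from fractions import Fraction

def step_down(s, r):
    """Given s > 0 with s*s > r, return s - y where y = (s*s - r) / (4*s)."""
    y = (s * s - r) / (4 * s)
    return s - y

r = Fraction(2)
s = Fraction(3)          # an upper bound of S, since 3*3 > 2
for _ in range(5):
    t = step_down(s, r)
    # The inequality from the proof: (s - y)^2 > (s^2 + r) / 2 > r, with s - y < s.
    assert t < s and t * t > (s * s + r) / 2 > r
    s = t
print(float(s))
```

Each pass produces a strictly smaller number whose square still exceeds $r$, which is why $s^2 > r$ is incompatible with $s$ being the least upper bound.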

The concept that separates the area of Mathematics known as Analysis from other

branches such as Algebra, Topology, Set Theory, and Combinatorics is the idea

of distance. In the real numbers, one can measure distance by using the absolute value function which is defined as

$$|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0. \end{cases}$$

For a real number $x$, the absolute value of $x$ can be thought of as the distance that $x$ is from the real number 0. Note that for all $x \in \mathbb{R}$ it holds that $-|x| \le x \le |x|$. If $k > 0$, then the set $\{x \mid |x| < k\}$ is the same as the set $\{x \mid -k < x < k\}$. Similarly, the set $\{x \mid |x| > k\}$ is the same as the set $\{x \mid -k > x \text{ or } x > k\}$.

The distance between two real numbers $x$ and $y$ can be defined as $|x - y|$. Note that this distance is positive unless $x = y$.

One property of the absolute value function used frequently in proofs in Analysis is the triangle inequality which states that for all $x, y \in \mathbb{R}$, $|x + y| \le |x| + |y|$. The name of this inequality comes from geometry where it is known that the sum of the lengths of two sides of a triangle always exceeds the length of the third side of the triangle (Fig. 2.4). One simple proof of the triangle inequality is

[Fig. 2.4: figure labels $x$ and $x + y$]

PROOF (Triangle Inequality): $|x + y| \le |x| + |y|$

Let $x$ and $y$ be elements of $\mathbb{R}$.

Then $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$.

Adding these inequalities yields $-(|x| + |y|) \le x + y \le (|x| + |y|)$.

This last inequality is equivalent to $|x + y| \le |x| + |y|$.
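As a sanity check (not a substitute for the proof), the two-sided bound used above can be tested on sample values. This is a small sketch: the function name `triangle_holds`, the random seed, and the sample range are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

def triangle_holds(x, y):
    # Mirrors the proof: -(|x| + |y|) <= x + y <= |x| + |y|.
    bound = abs(x) + abs(y)
    return -bound <= x + y <= bound and abs(x + y) <= bound

random.seed(0)
samples = [(Fraction(random.randint(-50, 50), random.randint(1, 9)),
            Fraction(random.randint(-50, 50), random.randint(1, 9)))
           for _ in range(1000)]
all_ok = all(triangle_holds(x, y) for x, y in samples)
```

Equality occurs exactly when $x$ and $y$ do not have opposite signs; for example, $|3 + (-5)| = 2$ is strictly less than $|3| + |-5| = 8$.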

A subset $S$ contained in $\mathbb{R}$ is called connected if it has the property that for any two numbers in $S$, all the numbers between those two numbers are also in $S$. More precisely, $S$ is connected if for all $x, y \in S$ with $x < y$, it follows that $z \in S$ for all $z$ with $x < z < y$. Informally, this means that there are no holes in the set $S$. Another word for a connected set of real numbers is an interval. If $a < b$ are real numbers, all of the following sets are intervals.

Intervals of Real Numbers

$\emptyset$ : empty set

$\{a\} = [a, a]$ : single point

$\{x \mid a < x < b\} = (a, b)$ : open bounded interval

$\{x \mid a \le x \le b\} = [a, b]$ : closed bounded interval

$\{x \mid a \le x < b\} = [a, b)$ : bounded interval open on the right

$\{x \mid a < x \le b\} = (a, b]$ : bounded interval open on the left

$\{x \mid a < x\} = (a, \infty)$ : open right infinite interval

$\{x \mid x < b\} = (-\infty, b)$ : open left infinite interval

$\{x \mid a \le x\} = [a, \infty)$ : closed right infinite interval

$\{x \mid x \le b\} = (-\infty, b]$ : closed left infinite interval

$\mathbb{R}$ : entire real line

2.5.4 Exercises

1. Show that for any real number $x$ it follows that $|x| + |x - 6| \ge 6$.

2. Show that for any real number $x$, $|x - 1| + |x - 3| \ge 2$.

3. Show that for any real numbers $x$ and $y$ it follows that $|x^2 + 3x - 4y| + |x - 1 - 4y| \ge (x + 1)^2$.

4. Show that for any real numbers $x$ and $y$, $|x + y| + |x - y - 2| + |2x + 8| \ge 10$.

5. Show that for any real numbers $x$ and $y$ it follows that $|x| + |3x - 5y| + |5x - 4y| \ge |x + y|$.

6. Show that the intersection of any two intervals is always an interval.

7. Under what conditions is the union of two intervals an interval?

2.6 Functions

2.6.1 Function, Domain, Codomain

Intuitively, a function is a mapping that assigns to each point of some domain $A$ a value that resides in some codomain $B$. This is usually written $f : A \to B$. More precisely, the function $f$ is defined as a set of ordered pairs $(x, y)$ where each $x$ resides in the domain $A$ of $f$ and each $y$ resides in the codomain $B$ of $f$, and for each $x \in A$ there is exactly one $y \in B$ such that $(x, y) \in f$. Since there is a unique ordered pair $(x, y) \in f$ for each $x \in A$, $f$ associates or links the value of $y$ to the value of $x$ and allows one to write $f(x) = y$.

2.6.2 Surjection

The domain of the function $f$ is exactly the set of all $x$ that are first coordinates of the ordered pairs in $f$, that is, the domain is $A = \{x \mid (x, y) \in f\}$. The range of $f$ is defined as the image of $f$, that is, the range is $\{y \mid (x, y) \in f\}$. Clearly, the codomain of $f$ can be any set that contains the range of $f$. This can lead to some confusion since the codomain of $f$ is not precisely defined. It is simply a convenience. When one defines a function $f : \mathbb{R} \to \mathbb{R}$, one means that $f$ is defined for every real number, and that for any $x \in \mathbb{R}$, the value $f(x)$ also lies in $\mathbb{R}$. This is the case whether $\mathbb{R}$ is the range of $f$ or the range of $f$ is actually some proper subset of $\mathbb{R}$. It could be difficult and unnecessary to calculate exactly which subset of $\mathbb{R}$ is the range of $f$, so it might be easier to just give the codomain as $\mathbb{R}$ and avoid the technicalities of figuring out just what values of $\mathbb{R}$ are in the range of $f$. For example, the function $f(x) = 3x^6 - 15x^4 + 12x^3 + 25x^2 - 32x + 14$ maps the real numbers into the real numbers, but to find the range of $f$, you would need to find the minimum value of $f$. This minimum exists, but it may not be possible to give its value explicitly.

If the range of $f$ is all of its codomain, then $f$ is called surjective, and $f$ is called a surjection. Thus, one can prove that a function is surjective by showing that each element of the codomain is in the range.

TEMPLATE for proving a function f is surjective

SET THE CONTEXT: Make a statement introducing f , its domain A, and

its codomain B.

Select an arbitrary value $y \in B$.

Exhibit a value $x \in A$ such that $y = f(x)$.

STATE THE CONCLUSION: Therefore, f is a surjection.

Note that the crucial step in proving that a function is surjective is showing the existence of an $x$ with $f(x) = y$ and verifying that the $x$ is in the domain $A$ of the function. For example, the function $f(x) = 5x^2 + 1$ is a surjection from the negative real numbers onto the interval $(1, \infty)$. To prove this you would need to show that for each real number $y > 1$ there is a negative real number $x$ for which $f(x) = y$. But this just involves a simple algebraic manipulation. That is, if you need $5x^2 + 1 = y$, then you can solve to get $5x^2 = y - 1$ and $x^2 = \frac{y - 1}{5}$. Here one needs to be careful because it is easy to continue by writing $x = \sqrt{\frac{y - 1}{5}}$ which always results in a positive value for $x$. The proof needs to exhibit a negative value for $x$, so it is important to set $x = -\sqrt{\frac{y - 1}{5}}$. There is no need for the proof to display the steps of solving the equation for $x$. The goal is to produce a value of $x \in A$ such that $f(x) = y$; how you arrived at that $x$ is not important. It may be interesting, but it is not an essential part of the proof, and, therefore, it should not be part of the proof.

PROOF: The function $f(x) = 5x^2 + 1$ is a surjection from the negative real numbers onto the interval $(1, \infty)$.

Let $f(x) = 5x^2 + 1$.

For any $y > 1$ let $x = -\sqrt{\frac{y - 1}{5}}$.

Because $y > 1$, $\frac{y - 1}{5} > 0$, so $-\sqrt{\frac{y - 1}{5}}$ is a negative real number.

Moreover, $f(x) = 5x^2 + 1 = 5\left(-\sqrt{\frac{y - 1}{5}}\right)^2 + 1 = 5 \cdot \frac{y - 1}{5} + 1 = y$.

Therefore, $f$ is a surjection.
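The "exhibit an $x$" step of the template can be mirrored in code. The sketch below is illustrative only: the name `preimage` is hypothetical, and floating-point arithmetic approximates the real numbers, so the check uses `math.isclose` rather than exact equality.

```python
import math

def f(x):
    return 5 * x**2 + 1

def preimage(y):
    """Return the negative x with f(x) = y, valid for y > 1."""
    if y <= 1:
        raise ValueError("y must lie in the interval (1, infinity)")
    # The negative root is required because the domain is the negative reals.
    return -math.sqrt((y - 1) / 5)

for y in [1.5, 6.0, 21.0, 1e6]:
    x = preimage(y)
    assert x < 0 and math.isclose(f(x), y)
```

Choosing the positive root instead would still satisfy $f(x) = y$, but the resulting $x$ would lie outside the stated domain, which is the pitfall the text warns about.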

2.6.3 Injection

The definition of function requires that each value $x$ in the domain of $f$ is found in exactly one ordered pair $(x, y) \in f$. The same does not have to hold for values in the codomain, that is, one value $y$ in the codomain could appear in many ordered pairs $(x, y) \in f$. For example, for the constant function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = 1$ for all $x \in \mathbb{R}$, the value 1 appears as the second coordinate in all the ordered pairs of the function. If a function has the property that no value of $y$ appears as the second coordinate of more than one ordered pair in $f$, then $f$ is said to be injective or, less formally, $f$ is said to be one-to-one. In this case the function $f$ is called an injection. In such a case, one sees that $f(x_1) = f(x_2)$ only if $x_1 = x_2$. This gives a procedure for proving that a function is injective.

TEMPLATE for proving a function f is injective

SET THE CONTEXT: Make a statement introducing f , its domain A, and

its codomain B.

Assume that $f(x_1) = f(x_2)$ for two values $x_1$ and $x_2$ in $A$.

Show that $x_1 = x_2$.

STATE THE CONCLUSION: Therefore, f is an injection.

For example, the function $f(x) = \sqrt{4x + 7}$ maps the positive real numbers to the positive real numbers. It is not a surjection, but it is an injection. The proof would require that you show that $f(x_1) = f(x_2)$ implies that $x_1 = x_2$. Again, this is just an algebraic manipulation.

PROOF: The function $f(x) = \sqrt{4x + 7}$ is an injection from the positive real numbers to the positive real numbers.

Let $f(x) = \sqrt{4x + 7}$.

Assume that for positive real numbers $x_1$ and $x_2$, $f(x_1) = f(x_2)$.

Then $\sqrt{4x_1 + 7} = \sqrt{4x_2 + 7}$.

Squaring yields $4x_1 + 7 = 4x_2 + 7$, so $4x_1 = 4x_2$, and $x_1 = x_2$.

Therefore, $f$ is an injection.

If a function $f : A \to B$ is both surjective and injective, that is, if $f$ is both one-to-one and onto, then $f$ is bijective, and $f$ is called a bijection. In this case, $f$ exhibits a one-to-one correspondence between the set $A$ and the set $B$.

Two functions $f$ and $g$ whose ranges are in the real numbers can be combined arithmetically. Specifically, one can define $f + g$, $f - g$, $fg$, and $\frac{f}{g}$ in natural ways: $(f + g)(x) = f(x) + g(x)$, $(f - g)(x) = f(x) - g(x)$, $(fg)(x) = f(x) \cdot g(x)$, and, for $x$ such that $g(x) \ne 0$, $\frac{f}{g}(x) = \frac{f(x)}{g(x)}$. When functions $f$ and $g$ are combined in this way, the domain of the sum, difference, product, or quotient is assumed to be the intersection of the domain of $f$ and the domain of $g$ with the exception that the domain of $\frac{f}{g}$ also excludes values of $x$ for which $g(x) = 0$. Thus, the function $f(x) = \sqrt{x - 4}$ is defined for all $x \ge 4$, and the function $g(x) = \sqrt{5 - x}$ is defined for all $x \le 5$. It follows that the function $\sqrt{x - 4} + \sqrt{5 - x}$ is defined only for those $x$ satisfying $4 \le x \le 5$. Similarly, the function $\frac{f(x)}{f(x)}$ is only defined for $x > 4$ even though it is identically 1 for those $x$. That function has a natural extension to all real numbers.

[Fig. 2.5: composition $f \circ g$; figure labels $A$, $B$, $g(x)$, $f(x)$, $y$, $z$]

2.6.4 Composition

If $g$ is a function assigning values in its domain $A$ to values in its range contained in the set $B$, and if $f$ is a function assigning values in its domain $B$ to values in its range contained in the set $C$, then the composition of $f$ with $g$ is the function $(f \circ g)(x) = f\bigl(g(x)\bigr)$ which assigns to values in its domain $A$ values in its range contained in set $C$ (Fig. 2.5). The main reason for considering compositions is that it is often easiest to represent complicated functions as compositions of simpler functions. For example, the function $f(x) = \frac{\sin^2 x}{\sqrt{x + 4}}$ is clearly a quotient where the numerator is the composition of the function $x^2$ with the function $\sin x$, and the denominator is the composition of the function $\sqrt{x}$ with the function $x + 4$.

It is easily shown that if $g : A \to B$ and $f : B \to C$ are both surjective functions, then their composition, $f \circ g : A \to C$, is also surjective. To prove this, you would follow the template for proving that a function is surjective. That requires that you select an arbitrary $z \in C$ and show that there is an $x \in A$ such that $(f \circ g)(x) = z$. Why might you use the variable $z$ here rather than the variable $y$? Well, that allows you to think of $g$ as mapping $x$ to $y$, and $f$, in turn, mapping $y$ to $z$. Faced with the statement $f\bigl(g(x)\bigr) = z$, there is little you can do except to apply what you know about the function $f$, that is, that $f$ is surjective. Because $f$ is surjective, and $z$ is in the codomain of $f$, you know that there is a $y$ in the domain of $f$ such that $f(y) = z$. Can you find an $x$ such that $g(x) = y$? Of course $y$ is in the domain of $f$ which is the codomain of $g$. The function $g$ is surjective, so there must be an $x$ in the domain of $g$ that maps onto $y$. These ideas give the following proof.

PROOF: If $g : A \to B$ and $f : B \to C$ are both surjective functions, then their composition $f \circ g : A \to C$ is also surjective.

Let $z \in C$.

Then since $f$ is a surjection from $B$ to $C$, there is a $y \in B$ such that $f(y) = z$.

Since $g$ is a surjection from $A$ to $B$, there is an $x \in A$ such that $g(x) = y$.

Therefore, $(f \circ g)(x) = f\bigl(g(x)\bigr) = f(y) = z$.

It follows that $f \circ g : A \to C$ is surjective.

Similarly, if $g : A \to B$ and $f : B \to C$ are both injective functions, then their composition, $f \circ g : A \to C$, is also injective. You would prove this by following the template for proving that a function is injective. That is, you would assume that $(f \circ g)(x_1) = (f \circ g)(x_2)$ for some $x_1$ and $x_2$ in $A$. Again, what can you say if you know $f\bigl(g(x_1)\bigr) = f\bigl(g(x_2)\bigr)$? All that you can do is apply what you know about the function $f$, that is, that $f$ is injective. Since $f$ is injective, you can conclude that $g(x_1) = g(x_2)$. Then because $g$ is injective, you can conclude $x_1 = x_2$, and you are done.

PROOF: If $g : A \to B$ and $f : B \to C$ are both injective functions, then their composition $f \circ g : A \to C$ is also injective.

Assume that for some $x_1$ and $x_2$ in $A$, $(f \circ g)(x_1) = (f \circ g)(x_2)$.

By the definition of composition, $f\bigl(g(x_1)\bigr) = f\bigl(g(x_2)\bigr)$.

Then since $f$ is an injection, it follows that $g(x_1) = g(x_2)$.

Since $g$ is an injection, it follows that $x_1 = x_2$.

It follows that $f \circ g : A \to C$ is injective.
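For functions on finite sets, both composition results can be observed directly. The following sketch is only an illustration under stated assumptions: functions are modeled as Python dictionaries, and the helper names `compose`, `is_injective`, and `is_surjective` as well as the sample sets are hypothetical choices, not notation from the text.

```python
def compose(f, g):
    """Return the composition of f with g as a dict mapping x to f[g[x]]."""
    return {x: f[g[x]] for x in g}

def is_injective(func):
    # No value appears as the second coordinate of more than one pair.
    values = list(func.values())
    return len(values) == len(set(values))

def is_surjective(func, codomain):
    # Every element of the codomain is in the range.
    return set(func.values()) == set(codomain)

A, B, C = {1, 2, 3}, {"a", "b", "c"}, {10, 20, 30}
g = {1: "a", 2: "b", 3: "c"}        # g : A -> B, a bijection
f = {"a": 10, "b": 20, "c": 30}     # f : B -> C, a bijection
h = compose(f, g)                   # h : A -> C
```

Here `h` inherits both properties from `f` and `g`, in line with the two proofs; replacing `g` by a non-injective map such as `{1: "a", 2: "a", 3: "c"}` makes the composition fail both checks.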

2.6.5 Exercises

Write a proof for each of the following statements.

1. For each real number $r$ there is a real number $x$ such that $x^3 = r$.

2. For each real number $r \ge 0$ there is a real number $x \ge 0$ such that $x^4 = r$.

3. If $n$ is an odd positive integer, then for each real number $r$ there is a real number $x$ such that $x^n = r$.

4. If $n$ is an even positive integer, then for each real number $r \ge 0$ there is a real number $x \ge 0$ such that $x^n = r$.

5. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three functions, then $(f \circ g) \circ h = f \circ (g \circ h)$. In other words, function composition is associative. (Hint: Show that both functions $(f \circ g) \circ h$ and $f \circ (g \circ h)$ give the same result when applied to an $x \in A$.)

6. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three surjective functions, then their composition $f \circ g \circ h$ is surjective.

7. If $h : A \to B$, $g : B \to C$, and $f : C \to D$ are three injective functions, then their composition $f \circ g \circ h$ is injective.

Chapter 3

Limits

In a typical Calculus course students develop an intuitive understanding of the

concept of limit which, of course, is the central concept of Calculus and, indeed,

the central concept of Analysis. In particular, if $f$ is a function defined on an open interval containing $a \in \mathbb{R}$, then $f$ has limit $L$ at $a$ if the values of $f(x)$ get closer and closer to $L$ as $x$ approaches $a$. In order to prove theorems about limits, one needs a rigorous definition of limit which makes clear what is meant by "closer and closer" and "approaches." In Analysis the distance between two real numbers is measured by the absolute value of the difference of the two numbers. Thus, the ideas of "closer and closer" and "approaches" naturally involve statements about the absolute values of differences of two quantities.

Consider the definition of $\lim_{x \to a} f(x) = L$, where the function $f$ is defined in an open interval in $\mathbb{R}$ containing the point $a$. This limit should give you a mental image similar to Fig. 3.1 where the graph of the function gets close to $L$ as $x$ approaches $a$. So, how can you quantify what "$f(x)$ is getting close to $L$" means? Is within 1 close? Is within $\frac{1}{4}$ close? Is within $\frac{1}{1000}$ close? Clearly, there needs to be a way to say "arbitrarily close" or "as close as one likes." Analysts have found that a good way to express $f(x)$ getting arbitrarily close to $L$ is to say that for any positive distance, $|f(x) - L|$ can be made to be less than that distance. Of course, $|f(x) - L|$ cannot be made to be negative, and it is not reasonable to require it to be zero since that would require $f(x)$ to actually equal $L$. Hence, one usually says that for any $\varepsilon > 0$, one can achieve $|f(x) - L| < \varepsilon$. The use of the Greek letter $\varepsilon$ (epsilon) is arbitrary, but the tradition of using $\varepsilon$ in this context has been universal since Cauchy introduced its use in the early 1800s. Figure 3.2 shows a tolerance of a small $\varepsilon$ around the limit value $L$. The goal is to show that the function $f(x)$ stays within that tolerance when $x$ is close to $a$.

In the figure you can see that for the values of $x$ near $a$, the function $f(x)$ falls within the prescribed tolerance of $L$. You could find a small interval centered at $a$ such that $|f(x) - L| < \varepsilon$ for all $x$ in that interval. That is, there is a value $\delta > 0$ such that every $x$ satisfying $|x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$ as seen in Fig. 3.3. Again, the choice of the Greek letter $\delta$ (delta) is completely arbitrary, but the tradition of using $\delta$ in this context is universal.

[Figs. 3.1 to 3.3: graphs of $y = f(x)$ as $x \to a$, with the levels $L$ and $L + \varepsilon$ and the point $a + \delta$ marked]

Springer International Publishing Switzerland 2016

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_3

Note that for the function whose graph appears in Fig. 3.3 the value $L$ is the limit of the function as $x$ approaches $a$, and $L$ also happens to be the value of $f(x)$ at $x = a$. You should recall that this sometimes happens (specifically when $f$ is continuous at $x = a$), but that this is not a requirement. Indeed, one reason for discussing limits in the first place is that there is a need to evaluate the behavior of a function as $x$ approaches a value $a$ when the function fails to be defined at $x = a$. Thus, in general, one does not want to require $|f(x) - L| < \varepsilon$ for all $x$ with $|x - a|$ less than some positive value $\delta$ since this would require $|f(x) - L| < \varepsilon$ at $x = a$. Instead, one excludes the need for the function to satisfy any conditions at all at $x = a$ by saying that there is a positive $\delta$ such that $f$ is within the desired tolerance of $L$ for all $x$ with $0 < |x - a| < \delta$. Clearly, the value of $\delta$ must be chosen to be positive since no negative value would represent a distance, and $\delta = 0$ would not result in a region around the number $a$ satisfying $|x - a| < \delta$.

Combining these ideas results in the following definition. Suppose that the function $f$ is defined for all $x$ in an open interval containing $a \in \mathbb{R}$ except perhaps at $x = a$. Then the limit of $f$ as $x$ approaches $a$ is $L$, $\lim_{x \to a} f(x) = L$, means that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \varepsilon$. The power of this definition is the fact that the $\varepsilon$ and $\delta$ are arbitrary positive numbers. For example, what if you knew that for every $\varepsilon > 0$ there were a $\delta > 0$ such that whenever $0 < |x - a| < \delta$, then $|f(x) - L| < 2\varepsilon$? Would this be sufficient for showing $\lim_{x \to a} f(x) = L$? The answer is yes, because the $\varepsilon$ is arbitrary. Suppose that for any $\varepsilon > 0$ you can find a $\delta > 0$ that will ensure that $|f(x) - L| < 2\varepsilon$. Then since $\frac{\varepsilon}{2}$ is also a positive number, you can find a $\delta' > 0$, likely smaller than $\delta$, that will ensure that $|f(x) - L| < 2 \cdot \frac{\varepsilon}{2} = \varepsilon$. The point here is that since $\varepsilon$ was arbitrary, you can replace it with any positive number, including $\frac{\varepsilon}{2}$.

3.1.1 Exercises

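Which of the following definitions is equivalent to the definition of $\lim_{x \to a} f(x) = L$?

1. For all $\varepsilon \ge 0$ there is a $\delta \ge 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.

2. For all $\varepsilon > 0$ there is a $\delta > 0$ such that if $0 < |x - a| < \frac{\delta}{4}$, then $|f(x) - L| < 7\varepsilon$.

3. For all $\varepsilon > 0.001$ there is a $\delta > 0.001$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.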

4. For all > 0 there is an > 0 such that if 0 < jx aj < , then jf .x/ Lj < .

5. There exists a $\delta > 0$ such that for all $\varepsilon > 0$, if $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.

6. For all > 0 there is an > 0, such that if 0 < jx aj < , then jf .x/ Lj < .

7. For all $\varepsilon > 1$ there is a $\delta > 1$, such that if $1 < |x - a| + 1 < \delta$, then $|f(x) - L| + 1 < \varepsilon$.

The definition of limit provides a formula by which one can construct a proof that a particular function $f$ has a limit of $L$ at the point $a$. The definition requires that for every $\varepsilon > 0$ there is a $\delta > 0$ that satisfies certain properties. Thus, a proof of a limit must show that for every $\varepsilon > 0$ you can exhibit a $\delta > 0$ which has the needed property. As with other proofs that some property holds for all elements of a set, the proof begins by selecting an arbitrary element of that set. In this case, one would select an arbitrary $\varepsilon > 0$. The goal is to present a value for $\delta > 0$ such that every $x$ satisfying $0 < |x - a| < \delta$ also satisfies $|f(x) - L| < \varepsilon$. That suggests the following proof template.

TEMPLATE for proving $\lim_{x \to a} f(x) = L$

SET THE CONTEXT: Make statements telling what is known about the function $f$ and the numbers $a$ and $L$.

SELECT AN ARBITRARY $\varepsilon$: Given $\varepsilon > 0$,

PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an appropriate value for $\delta$.

SELECT AN ARBITRARY $x$: Select $x$ such that $0 < |x - a| < \delta$.

LIST IMPLICATIONS: Derive the result $|f(x) - L| < \varepsilon$.

STATE THE CONCLUSION: Therefore, $\lim_{x \to a} f(x) = L$.

For example, to prove that $\lim_{x \to 5} 2x - 3 = 7$, the proof would begin with "Let $f(x) = 2x - 3$. Given $\varepsilon > 0$, ...". Your task is to determine a value of $\delta > 0$ that ensures the inequality $|f(x) - L| < \varepsilon$ holds for all $x$ with $0 < |x - a| < \delta$. Since the function $f$ is not constant, the choice of $\delta$ will surely depend on the value of $\varepsilon$. But how is this value of $\delta$ determined? A common approach is to work backwards from the final conclusion $|f(x) - L| < \varepsilon$ to see what value of $\delta$ is needed.

In this example, $f(x) = 2x - 3$, $a = 5$, and $L = 7$. The value of $|f(x) - L| = |(2x - 3) - 7| = |2x - 10| = 2|x - 5|$. Note that this expression has a factor of $x - a$, where $a = 5$. When finding the limit of a polynomial where $L = f(a)$, this will always be the case. For more complicated functions $f$, other properties of $f$ will often allow you to write $f(x) - L$ as an expression where $x - a$ is a factor. This makes it easier to determine a value of $\delta$ since the choice of $\delta$ restricts the size of $|x - a|$ which, in turn, will make $|f(x) - L|$ small, the desired result.

So here, $|f(x) - L| = 2|x - 5|$. To force $|f(x) - L|$ to be less than some arbitrary $\varepsilon > 0$, it is, therefore, sufficient for $2|x - 5|$ to be made less than $\varepsilon$. This is done by making $|x - 5| < \frac{\varepsilon}{2}$, and the needed value of $\delta$ is $\frac{\varepsilon}{2}$. Note that it is stipulated that $\varepsilon$ is positive, so $\delta = \frac{\varepsilon}{2}$ is also greater than zero, a requirement of the definition of limit. Now a complete proof can be written by following the template.

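PROOF: $\lim_{x \to 5} 2x - 3 = 7$

Let $f(x) = 2x - 3$.

Given $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{2} > 0$.

Select $x$ such that $0 < |x - 5| < \delta = \frac{\varepsilon}{2}$.

Then $\frac{\varepsilon}{2} > |x - 5|$ implies $\varepsilon > 2|x - 5| = |2x - 10| = |(2x - 3) - 7| = |f(x) - 7|$.

Therefore, $\lim_{x \to 5} 2x - 3 = 7$.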
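The choice $\delta = \frac{\varepsilon}{2}$ can be spot-checked on sample points. This is a hedged sketch, not a proof: it tests only finitely many $x$, the names `delta_for` and `within_tolerance` are hypothetical, and exact rational arithmetic is used so that no rounding error affects the strict inequality.

```python
from fractions import Fraction

def f(x):
    return 2 * x - 3

def delta_for(eps):
    # The choice made in the proof: delta = eps / 2.
    return eps / 2

def within_tolerance(eps, samples=500):
    """Check |f(x) - 7| < eps at sample points with 0 < |x - 5| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * Fraction(k, samples + 1)   # 0 < offset < delta
        for x in (5 - offset, 5 + offset):
            if abs(f(x) - 7) >= eps:
                return False
    return True

results = [within_tolerance(Fraction(e)) for e in ("1/1000", "1/2", "100")]
```

Every entry of `results` is true because $|f(x) - 7| = 2|x - 5| < 2\delta = \varepsilon$ at each sampled point, which is exactly the chain of inequalities in the proof.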

Finding a suitable $\delta$ was straightforward in the example above, but it can be trickier for other functions. Consider proving $\lim_{x \to 4} 3x^2 = 48$. In this example, $f(x) = 3x^2$, $a = 4$, and $L = 48$. The value of $|f(x) - L| = |3x^2 - 48| = 3|x + 4|\,|x - 4|$ contains the factor $|x - 4|$ which can be forced to be small by selecting a small value for $\delta$. In particular, the proof will need to justify $|f(x) - L| < \varepsilon$ which means $3|x + 4|\,|x - 4| < \varepsilon$ and $|x - 4| < \frac{\varepsilon}{3|x + 4|}$. Here is an attempt at a proof using this idea, but it falls short of being correct.

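PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$

For any $x$, let $\delta = \frac{\varepsilon}{3|x + 4|}$.

Then $0 < |x - 4| < \delta = \frac{\varepsilon}{3|x + 4|}$, so $\varepsilon > 3|x + 4|\,|x - 4| = 3|x^2 - 16| = |f(x) - 48|$.

Therefore, $\lim_{x \to 4} 3x^2 = 48$.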

The second line of the proof refers to the variable $\varepsilon$ which has not yet been introduced in the proof. In particular, without having specified that $\varepsilon > 0$, one does not know that $\delta > 0$ which is required by the definition of limit. The proof should include the phrase "Given $\varepsilon > 0$."

The value of $\delta$ in the second line of the proof is undefined when $x = -4$.

The most serious error here is that the value of $\delta$ depends on the value chosen for $x$. The definition of limit requires that for every $\varepsilon > 0$ there is a $\delta > 0$. That value of $\delta$ can depend on the value of $\varepsilon$ but certainly cannot depend on $x$ which has not yet been introduced in the definition. After $\delta$ is specified, the definition requires that a condition hold for all $x$ satisfying $0 < |x - a| < \delta$, and only then does the definition refer to values of $x$.

Still one needs a value of $\delta$ which will be less than $\frac{\varepsilon}{3|x + 4|}$ for all the values of $x$ considered in the proof. One way around this would be to find a value for $\delta$ which is less than $\frac{\varepsilon}{3|x + 4|}$ for every value of $x$. But this cannot be done because the expression $\frac{\varepsilon}{3|x + 4|}$ gets arbitrarily small as $x$ gets large. On the other hand, the value of $x$ will be restricted so that $0 < |x - 4| < \delta$. Thus, unless $\delta$ is very large, $x$ cannot wander too far away from $a = 4$, and $\frac{\varepsilon}{3|x + 4|}$ cannot get arbitrarily small.

So how does one choose a $\delta$ which both ensures that $|x + 4|$ does not grow too large and also makes $|x - 4|$ small? The technique is to select $\delta$ in two stages. First, to ensure that $|x + 4|$ does not grow too large, restrict the value of $\delta$ so that $x$ cannot wander too far from $a = 4$. Almost any restriction in the size of $\delta$ will work, so how about suggesting that $\delta$ not exceed 1? If $\delta \le 1$, then when you choose an $x$ with $0 < |x - 4| < \delta$, you will know that $|x - 4| < 1$ which is equivalent to $-1 < x - 4 < 1$ and, thus, $-1 + 8 < (x - 4) + 8 < 1 + 8$. That is, $7 < x + 4 < 9$, and it follows that $|x + 4| < 9$. Here is another attempt at a proof that uses this idea. Unfortunately, it too has problems.

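PROOF ATTEMPT: $\lim_{x \to 4} 3x^2 = 48$

Given $\varepsilon > 0$, let $\delta = 1$.

Then for any $x$ such that $0 < |x - 4| < \delta = 1$, it follows that $-1 < x - 4 < 1$ so $-1 + 8 < (x - 4) + 8 < 1 + 8$ and $|x + 4| < 9$.

Now let $\delta = \frac{\varepsilon}{27} > 0$.

Then $0 < |x - 4| < \delta = \frac{\varepsilon}{27}$ implies that $\varepsilon > |x - 4| \cdot 27 > |x - 4| \cdot 3|x + 4| = |3x^2 - 48| = |f(x) - 48|$.

Therefore, $\lim_{x \to 4} 3x^2 = 48$.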

The only problem with the above proof is in its use of the variable $\delta$. In the second line of the proof $\delta$ is set to 1, and in the fourth line it is set to $\frac{\varepsilon}{27}$. It does not make sense to set the value of $\delta$ equal to both of these values because, except in the rare case that $\varepsilon = 27$, the value of $\delta$ cannot be equal to both values at the same time. The solution is to choose one value for $\delta$ that satisfies two separate conditions. For example, you can first require that $\delta \le 1$. Then a choice of $x$ with $0 < |x - 4| < \delta$ will guarantee that $|x + 4| < 9$. Then $\frac{\varepsilon}{3|x + 4|} > \frac{\varepsilon}{3 \cdot 9} = \frac{\varepsilon}{27}$. This suggests that you should select $\delta = \frac{\varepsilon}{27}$. But you also need $\delta \le 1$. What happens if someone suggests that $\varepsilon$ be some rather large number such as $\varepsilon = 100$? Then $\delta = \frac{\varepsilon}{27}$ would not satisfy $\delta \le 1$. This is not a problem since one can always get away with selecting a positive value for $\delta$ that is smaller than needed. Thus, you can select $\delta$ to be the lesser of 1 and $\frac{\varepsilon}{27}$. This choice is usually written as $\delta = \min\left(\frac{\varepsilon}{27}, 1\right)$. Now you can put this all together to get a formal proof that is completely correct.

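PROOF: $\lim_{x \to 4} 3x^2 = 48$
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{27}, 1\right)$.
Select $x$ such that $0 < |x - 4| < \delta$.
Since $\delta \le 1$, it follows that $|x - 4| < 1$ and $-1 < x - 4 < 1$, so $7 < x + 4 < 9$. Thus, $|x + 4| < 9$.
Since $\delta \le \frac{\epsilon}{27}$, it follows that $|x - 4| < \frac{\epsilon}{27} = \frac{\epsilon}{3 \cdot 9} < \frac{\epsilon}{3|x + 4|}$.
Then $\frac{\epsilon}{3|x + 4|} > |x - 4|$ implies $\epsilon > 3|x + 4| \cdot |x - 4| = |3x^2 - 48| = |f(x) - 48|$.
Therefore, $\lim_{x \to 4} 3x^2 = 48$.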
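The choice of the lesser of 1 and one twenty-seventh of the tolerance can be spot-checked numerically. The following is a minimal Python sketch and not part of the proof: the helper names `delta_for` and `check_limit` are hypothetical, and testing finitely many sample points can only fail to find a counterexample, never establish the limit.

```python
def delta_for(eps):
    """Delta from the proof above: the lesser of eps/27 and 1."""
    return min(eps / 27, 1.0)

def check_limit(f, a, L, delta_rule, eps, samples=1000):
    """Return True if |f(x) - L| < eps at sampled x with 0 < |x - a| < delta."""
    delta = delta_rule(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)   # strictly between 0 and delta
        for x in (a - offset, a + offset):   # sample both sides of a
            if not abs(f(x) - L) < eps:
                return False
    return True

f = lambda x: 3 * x**2
results = [check_limit(f, 4, 48, delta_for, eps) for eps in (100, 1, 0.01)]
print(results)

# Using delta = 1 regardless of eps (the first stage alone) fails for eps = 1.
print(check_limit(f, 4, 48, lambda eps: 1.0, 1))
```

The second call illustrates why the bound of 1 by itself is not enough: it only controls the size of the factor involving x + 4, not the size of the whole error.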

Consider next the limit $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$. In this example $f(x) = \frac{x+2}{x^2+3x+2}$, $a = -2$, and $L = -1$. Note that $f(-2)$ is not defined even though the limit as $x$ approaches $-2$ exists. The proof of this limit must conclude with the inequality $\epsilon > |f(x) - L| = \left|\frac{x+2}{x^2+3x+2} - (-1)\right|$. As in the previous examples, it would be convenient if the expression $\frac{x+2}{x^2+3x+2} - (-1)$ would contain a factor of $x + 2$ so that it could be made small by requiring $|x - (-2)|$ to be less than some $\delta$.


But this follows with some fairly straightforward algebra. Assuming that $x \ne -2$,

$$\frac{x+2}{x^2+3x+2} - (-1) = \frac{x+2}{(x+2)(x+1)} + 1 = \frac{1}{x+1} + 1 = \frac{1 + (x+1)}{x+1} = \frac{x+2}{x+1}.$$

The needed inequality $\epsilon > \left|\frac{x+2}{x+1}\right|$ holds whenever $\epsilon|x + 1|$ exceeds $|x + 2|$ which, in turn, would happen if $\delta < \epsilon|x + 1|$. Again, there is a problem because the choice of $\delta > 0$ cannot depend on the value of $x$, yet $|x + 1|$ can get arbitrarily close to zero as $x$ gets close to $-1$. The strategy, then, would be to restrict the value of $\delta$ so that $x$ could not get close to $-1$. If $x$ is supposed to be close to $-2$, $\delta$ could be chosen so that it does not exceed $\frac{1}{2}$. Then, $|x + 1|$ could not get smaller than $1 - \frac{1}{2} = \frac{1}{2}$, and $\epsilon|x + 1| > \frac{\epsilon}{2}$. You would not want $\delta$ to exceed either $\frac{\epsilon}{2}$ or $\frac{1}{2}$. Thus, one can select $\delta = \min\left(\frac{\epsilon}{2}, \frac{1}{2}\right)$. The complete proof follows.

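PROOF: $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$
Let $f(x) = \frac{x+2}{x^2+3x+2}$.
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{2}, \frac{1}{2}\right)$.
Select $x$ such that $0 < |x - (-2)| < \delta$.
Since $\delta \le \frac{1}{2}$, it follows that $|x + 2| < \frac{1}{2}$ and $-\frac{1}{2} < x + 2 < \frac{1}{2}$, so $-\frac{3}{2} < x + 1 < -\frac{1}{2}$. Thus, $|x + 1| > \frac{1}{2}$.
Since $\delta \le \frac{\epsilon}{2}$, it follows that $|x + 2| < \frac{\epsilon}{2} < \epsilon|x + 1|$.
Then $\delta > |x - (-2)| > 0$ implies $\frac{\epsilon}{2} > |x + 2|$ and
$\epsilon > 2|x + 2| > \frac{1}{|x+1|}|x + 2| = \frac{1}{|x+1|}|1 + (x + 1)| = \left|\frac{1}{x+1} + 1\right| = \left|\frac{x+2}{x^2+3x+2} - (-1)\right| = |f(x) - (-1)|$.
Therefore, $\lim_{x \to -2} \frac{x+2}{x^2+3x+2} = -1$.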
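As an informal companion to this proof, the sketch below samples points within the chosen distance of $-2$ and records the largest error found. It assumes the choice of the lesser of half the tolerance and one half, uses exact rational arithmetic from Python's `fractions` module to avoid rounding questions, and the helper name `worst_error` is hypothetical. It is a spot check, not a substitute for the argument.

```python
from fractions import Fraction

def f(x):
    return (x + 2) / (x**2 + 3 * x + 2)   # undefined at x = -2 and x = -1

def worst_error(eps, samples=500):
    """Largest |f(x) - (-1)| over sampled x with 0 < |x + 2| < delta."""
    delta = min(eps / 2, Fraction(1, 2))
    worst = Fraction(0)
    for k in range(1, samples + 1):
        offset = delta * Fraction(k, samples + 1)   # strictly between 0 and delta
        for x in (-2 - offset, -2 + offset):        # x = -2 itself is never sampled
            worst = max(worst, abs(f(x) + 1))
    return worst

for eps in (Fraction(5), Fraction(1), Fraction(1, 100)):
    print(eps, worst_error(eps) < eps)
```

Because the distance from $-2$ never exceeds one half, the sampled points stay away from $-1$, where the function is also undefined.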

Clearly, at the point that you stipulate that $\delta$ should be less than $\frac{1}{2}$, you are making a rather arbitrary decision. What would have happened if you had chosen some other reasonable bound on the size of $\delta$? For example, what if instead you only require $\delta < \frac{3}{4}$? This would also work, although that decision would affect the final choice of $\delta$, for now $|x + 1|$ can get as small as $\frac{1}{4}$, and $\frac{|x+2|}{|x+1|}$ could be as large as $4|x + 2|$. This suggests that you then select $\delta = \min\left(\frac{\epsilon}{4}, \frac{3}{4}\right)$. This choice is no better or worse than the $\delta$ chosen earlier. When one makes such arbitrary decisions, it is good form to make a selection that does not lead to unnecessary arithmetic or algebraic complications because one does not want to make the proof any harder to read than necessary. Thus, it would be perfectly adequate but enormously awkward to select the bound on $\delta$ to be $\frac{\sqrt{5}}{1+\sqrt{5}}$. As long as the bound is less than 1, it will do the job; nothing is gained by searching for an optimal choice.


3.2.1 Exercises

Write a proof of each of the following limits.

1. $\lim_{x \to 5} \frac{3}{5}x + 1 = 4$
2. $\lim_{x \to 3} 5x - 8 = 7$
3. $\lim_{x \to 3} 2x^2 = 18$
4. $\lim_{x \to \frac{2}{3}} 9x^2 = 4$
5. $\lim_{x \to -1} 3x^2 - 5x - 7 = 1$
6. $\lim_{x \to 4} x^2 + 3x + 1 = 29$
7. $\lim_{x \to 2} 2x^3 = 16$
8. $\lim_{x \to -1} \frac{6}{2x+5} = 2$
9. $\lim_{x \to 8} \frac{x+4}{x^2-10x+10} = -2$
10. $\lim_{x \to a} mx + b = ma + b$


The one-sided limits $\lim_{x \to a^+} f(x) = L$ and $\lim_{x \to a^-} f(x) = L$ are very similar to two-sided limits except that the value of $x$ is only allowed to approach the real number $a$ from one side. As a result, the definitions of these one-sided limits are very similar to the definition of limit, with minor alterations that force $x$ to stay on one side of $a$. The definition of limit states that for a function $f$ defined in a neighborhood of $a$, but not necessarily at $a$, the limit $\lim_{x \to a} f(x) = L$ means for every $\epsilon > 0$ there exists a $\delta > 0$ such that for every $x$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \epsilon$. What is it about this definition that allows $x$ to approach $a$ from two sides? It is the inequality $0 < |x - a| < \delta$ that allows $x$ to be either greater than or less than $a$ since $|x - a|$ is positive in either case. By removing the absolute value function in this inequality and writing instead $0 < x - a < \delta$, the choice of $x$ becomes restricted to being a value greater than $a$, or writing instead $0 < a - x < \delta$, the choice of $x$ becomes restricted to being a value less than $a$. Thus, if $f$ is a function defined for all $x$ in an open interval with right end at $a$, then the limit of $f$ at $a$ from the left is $L$, $\lim_{x \to a^-} f(x) = L$, means that for every $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < a - x < \delta$, it follows that $|f(x) - L| < \epsilon$. Similarly, if $f$ is a function defined for all $x$ in an open interval with left end at $a$, then the limit of $f$ at $a$ from the right is $L$, $\lim_{x \to a^+} f(x) = L$, means that for every $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$ satisfying $0 < x - a < \delta$, it follows that $|f(x) - L| < \epsilon$.

One-sided limits are particularly useful in cases where the function $f$ behaves differently on one side of $a$ than on the other side, such as the way $e^{1/x}$ behaves quite differently as $x$ approaches 0 from the right, where $\frac{1}{x}$ is positive, from how $e^{1/x}$ behaves as $x$ approaches 0 from the left, where $\frac{1}{x}$ is negative. Similarly, the derivative of $f(x) = |x|$ has different limits as $x$ approaches 0 from the right and from the left. There are also cases where a function is not even defined for $x$ on one side of $a$, such as $f(x) = \sqrt{x}$, which is not defined for $x < 0$.

Proving the existence of one-sided limits is very similar to proving two-sided limits except that care must be taken to ensure that the value of $x$ remains on one side of $a$. Take, for example, the limit $\lim_{x \to 3^+} 2x^2 - 5x = 3$. Here $f(x) = 2x^2 - 5x$, $a = 3$, and $L = 3$. As with the proofs of other limits earlier in the chapter, the proof needs to give a value for $\delta > 0$ which will ensure $\epsilon > |f(x) - L| = |(2x^2 - 5x) - 3| = |(2x + 1)(x - 3)|$. This will follow if $|x - 3| < \frac{\epsilon}{|2x+1|}$ for all suitable values of $x$. What is needed is the largest possible value of $2x + 1$, but $2x + 1$ is not bounded unless $x$ is restricted to be close to 3. Thus, stipulate that $\delta$ be less than 1 which will ensure that $x - 3$ will be less than 1, $x$ will not exceed 4, and $2x + 1$ will not exceed 9. Then $\delta$ can be chosen to be $\min\left(\frac{\epsilon}{9}, 1\right)$, and the proof can be written as follows.

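PROOF: $\lim_{x \to 3^+} 2x^2 - 5x = 3$
Given $\epsilon > 0$, let $\delta = \min\left(\frac{\epsilon}{9}, 1\right)$.
Select $x$ such that $0 < x - 3 < \delta$.
Since $\delta \le 1$, it follows that $0 < x - 3 < 1$, $3 < x < 4$, and $|2x + 1| < 9$.
Then $0 < x - 3 < \delta$ implies $\frac{\epsilon}{9} > x - 3$ and $\epsilon > 9(x - 3) > (2x + 1)(x - 3) = 2x^2 - 5x - 3 = |f(x) - 3|$.
Therefore, $\lim_{x \to 3^+} 2x^2 - 5x = 3$.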

Consider a function whose left limit differs from its right limit, such as the function

$$f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1. \end{cases}$$

Then $\lim_{x \to 1^-} f(x) = -2$ while $\lim_{x \to 1^+} f(x) = 1$. Thus, while proving $\lim_{x \to 1^-} f(x) = -2$, it is important to use the fact that $x < 1$ as part of the proof since the required inequalities will not hold for $x > 1$ (Fig. 3.4). The following shows one possible proof.


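PROOF: $\lim_{x \to 1^-} \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases} = -2$
Let $f(x) = \begin{cases} 5 - 7x & \text{if } x < 1 \\ x & \text{if } x \ge 1 \end{cases}$.
Given $\epsilon > 0$, let $\delta = \frac{\epsilon}{7}$.
Select $x$ such that $0 < 1 - x < \delta = \frac{\epsilon}{7}$. Then $x < 1$, so $f(x) = 5 - 7x$.
It follows that $\epsilon > 7(1 - x) = 5 - 7x - (-2) = |f(x) - (-2)|$.
Therefore, $\lim_{x \to 1^-} f(x) = -2$.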

In the third line of the proof, $0 < 1 - x < \delta$ ensures that $x < 1$ which, in turn, is needed to conclude that $f(x) = 5 - 7x$ and not $f(x) = x$. The fact that $x < 1$ is also used in the fourth line of the proof to conclude that $5 - 7x - (-2) = |f(x) - (-2)|$ which follows because $5 - 7x - (-2)$ is positive for all $x < 1$.
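The role of the side restriction can be seen in a small numerical experiment. The Python sketch below is illustrative only; it assumes the piecewise function equal to 5 minus 7x to the left of 1 and equal to x from 1 onward, takes one seventh of the tolerance as the allowed distance, and uses the hypothetical helper name `within_eps`. Points just to the left of 1 satisfy the required inequality, while points just to the right do not.

```python
def f(x):
    return 5 - 7 * x if x < 1 else x

def within_eps(x, eps, L=-2):
    """True if f(x) is within eps of the proposed left limit L."""
    return abs(f(x) - L) < eps

eps = 0.5
delta = eps / 7
left = [1 - delta * k / 10 for k in range(1, 10)]    # 0 < 1 - x < delta
right = [1 + delta * k / 10 for k in range(1, 10)]   # x > 1, the wrong side

print(all(within_eps(x, eps) for x in left))    # True
print(any(within_eps(x, eps) for x in right))   # False
```

To the right of 1 the function values stay near 1, which is about 3 away from $-2$, so no choice of distance could rescue a two-sided version of this limit.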

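3.3.1 Exercises

Write a proof of each of the following one-sided limits.

1. $\lim_{x \to 3^+} x^2 + 4x = 21$
2. $\lim_{x \to 3^-} 8 - 3x = -1$
3. $\lim_{x \to 2^-} \frac{x^2-4}{x^2-3x+2} = 4$
4. $\lim_{x \to 4^+} \frac{x^2-4x}{2x^2-7x-4} = \frac{4}{9}$
5. $\lim_{x \to 2^-} \frac{8|x-2|}{x^2-4} = -2$
6. $\lim_{x \to 2^+} \frac{8|x-2|}{x^2-4} = 2$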


The definitions given in the last two sections do not make sense when the real number that $x$ approaches, $a$, is replaced by infinity. Infinity, of course, is not an element of the real numbers, $\mathbb{R}$, but it does make sense to ask whether a function approaches a limit when $x$ increases without bound, that is, as $x$ approaches infinity. When one writes $\lim_{x \to \infty} f(x) = L$, one is thinking that $f(x)$ is getting close to the real number $L$ as $x$ increases without bound. But it does not make sense to measure how close $x$ is to infinity by choosing a $\delta > 0$ so that when $x$ is within $\delta$ of infinity, $f(x)$ is close to $L$. Since infinity is not a real number, one cannot measure the distance from the real number, $x$, to infinity, even less expect $x$ to get within $\delta$ of infinity. So how does one quantify "getting closer to infinity"? The answer lies in the phrase "increases without bound" which suggests that for any bound, $N$, you could place on the size of $x$, the value of $x$ can be made to be greater than that bound. Thus, instead of selecting a $\delta > 0$ and requiring $0 < |x - a| < \delta$, one chooses a number $N \in \mathbb{R}$ and requires $x > N$. This allows the following definition. Suppose that the function $f$ is defined for all $x > K$ for some real number $K$. Then the limit of $f$ as $x$ approaches infinity is $L$, $\lim_{x \to \infty} f(x) = L$, means that for every $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that for every $x > N$, it follows that $|f(x) - L| < \epsilon$ (Fig. 3.5). Now consider how one might write a proof of a limit at infinity. For example, consider proving the limit $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$. Here $f(x) = \frac{x}{x^2+6}$ and $L = 0$. As with other limit proofs, the proof must produce a suitable $N$ for an arbitrarily chosen $\epsilon > 0$. Again, you can work backwards. Since $|f(x) - L| = \left|\frac{x}{x^2+6}\right|$, as long as $x > 0$, it would follow that $\frac{x}{x^2+6} < \frac{x}{x^2} = \frac{1}{x}$. Thus, there is an expression, $\frac{1}{x}$, which is larger than $|f(x) - L|$ for all suitably large values of $x$. This will help because if you can assure that $\frac{1}{x}$ is less than $\epsilon$, it will follow that $|f(x) - L|$ is also less than $\epsilon$. It would not have been helpful to exhibit an expression that was always less than $|f(x) - L|$ because making that expression small would not imply that $|f(x) - L|$ is small. Now, if $x > \frac{1}{\epsilon}$, it follows that $\frac{1}{x} < \epsilon$, suggesting that $\frac{1}{\epsilon}$ is a suitable value for $N$.

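PROOF: $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$
Let $f(x) = \frac{x}{x^2+6}$.
Given $\epsilon > 0$, let $N = \frac{1}{\epsilon}$.
Select $x$ such that $x > N > 0$.
Then $x > \frac{1}{\epsilon}$ implies $\epsilon > \frac{1}{x} = \frac{x}{x^2} > \frac{x}{x^2+6} = \left|\frac{x}{x^2+6} - 0\right| = |f(x) - 0|$.
Therefore, $\lim_{x \to \infty} \frac{x}{x^2+6} = 0$.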


Note that it is important that the third step of the proof pointed out that $N$ is positive. It is used in the fourth step when $\frac{1}{x}$ is calculated, and this would not have been allowed if the value of $x$ could have been zero.

For a second example, consider proving $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$. Again, you can work backwards to get $\epsilon > |f(x) - L| = \left|\frac{2x+5}{x-7} - 2\right| = \left|\frac{(2x+5) - 2(x-7)}{x-7}\right| = \left|\frac{19}{x-7}\right|$. From here there are a number of ways to proceed. You can solve for $x$ in the previous inequality to get $x > 7 + \frac{19}{\epsilon}$ which gives a reasonable value for $N$. Another way would be to say that if $x > 14$, then $x - 7 > x - \frac{x}{2} = \frac{x}{2}$. In this case the fraction $\frac{19}{x-7}$ is less than $\frac{19}{x/2} = \frac{38}{x}$, and it becomes clear that $x > \frac{38}{\epsilon}$ (together with $x > 14$) is sufficient.

This is an example demonstrating the enormous flexibility one sometimes has in

writing proofs in analysis where you often need to prove an inequality which can

be done in many ways. It is usually easier to prove an inequality involving a simple

fraction rather than a complicated fraction, so you can use the strategy of replacing

a fraction with a simpler fraction that is clearly larger, or in some cases, clearly

smaller. Keep in mind that a ratio of positive values gets larger if its numerator gets

larger or its denominator gets smaller.

A complete proof can be written as follows.

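PROOF: $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$
Let $f(x) = \frac{2x+5}{x-7}$.
Given $\epsilon > 0$, let $N = 7 + \frac{19}{\epsilon}$.
Select $x$ such that $x > N > 7$.
Then $x > 7 + \frac{19}{\epsilon}$ implies $x - 7 > \frac{19}{\epsilon}$
and $\epsilon > \frac{19}{x-7} = \frac{2x+5}{x-7} - 2 = |f(x) - 2|$.
Therefore, $\lim_{x \to \infty} \frac{2x+5}{x-7} = 2$.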

As in the previous proof it is important that $x > 7$ is pointed out in the third step of the proof because that fact is needed both to ensure that $f(x)$ is defined by assuring $x - 7 \ne 0$ and that $x - 7$ is positive, allowing the absolute value function to be introduced in the fifth step of the proof.
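The thresholds used in the two proofs at infinity can be compared numerically. The sketch below is an informal check under stated assumptions: the helper name `holds_beyond` is hypothetical, the sample points are spread from just past the threshold to roughly a hundred thousand beyond it, and the thresholds tested are the reciprocal of the tolerance for the first function, and both 7 plus 19 over the tolerance and the larger of 14 and 38 over the tolerance for the second. Sampling cannot prove a limit; it can only expose a threshold that is too small.

```python
def holds_beyond(f, L, N, eps, samples=200):
    """True if |f(x) - L| < eps at sampled x > N (a spot check, not a proof)."""
    xs = [N + 10 ** (k / 20 - 5) for k in range(samples)]  # N + 1e-5 up to roughly N + 1e5
    return all(abs(f(x) - L) < eps for x in xs)

f1 = lambda x: x / (x**2 + 6)          # limit 0, threshold 1/eps
f2 = lambda x: (2 * x + 5) / (x - 7)   # limit 2, threshold 7 + 19/eps

for eps in (1.0, 0.1, 0.001):
    print(eps,
          holds_beyond(f1, 0, 1 / eps, eps),
          holds_beyond(f2, 2, 7 + 19 / eps, eps),
          holds_beyond(f2, 2, max(14, 38 / eps), eps))   # the alternative bound

# A threshold that is too small is caught: just beyond 17 the error is about 1.9.
print(holds_beyond(f2, 2, 7 + 1 / 0.1, 0.1))
```

Both thresholds for the second function pass, which reflects the flexibility discussed above: any sufficiently large threshold is acceptable, not only the smallest one.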

With a slight adjustment of the definition of $\lim_{x \to \infty} f(x) = L$, one gets a definition of $\lim_{x \to -\infty} f(x) = L$. This time rather than choosing an $N$ and requiring $|f(x) - L| < \epsilon$ for all $x > N$, one instead needs $f(x)$ to be within $\epsilon$ of $L$ for those $x < N$. Thus, $\lim_{x \to -\infty} f(x) = L$ means that for every $\epsilon > 0$ there exists an $N \in \mathbb{R}$ such that for all $x < N$ it follows that $|f(x) - L| < \epsilon$.

To prove $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$, one can identify an $N$ such that $x < N$ implies that $\left|\frac{6x^2+5x}{2x^2-7} - 3\right| < \epsilon$ by working backwards. That is, $\left|\frac{6x^2+5x}{2x^2-7} - 3\right| = \left|\frac{(6x^2+5x) - 3(2x^2-7)}{2x^2-7}\right| = \left|\frac{5x+21}{2x^2-7}\right|$. From here there are many simplifications you can do as long as you do not introduce changes that prevent the final inequality from holding. In this case, the $-7$ term in the denominator of $\frac{5x+21}{2x^2-7}$ is an inconvenience, and it would be nice to remove it. Simply removing this negative term would make the absolute value of the fraction smaller when what is needed is to make the fraction larger. A strategy that does work is to take part of the $2x^2$ term, which grows very large as $x$ goes to $-\infty$, and pair it with the $-7$ term. For example, $2x^2 - 7$ can be written as $x^2 + (x^2 - 7)$. Because $x^2 - 7$ is a positive value for all $x < -\sqrt{7}$, removing it from the denominator makes the absolute value of the fraction greater. Also note that when $x < -\frac{21}{10}$, the numerator satisfies $|5x + 21| < 5|x|$, and this happens for all $x < -\sqrt{7}$. Thus, it is enough to ensure that $\epsilon > \frac{5}{|x|} = \frac{5|x|}{x^2} > \left|\frac{5x+21}{2x^2-7}\right|$, or that $x < -\frac{5}{\epsilon}$, as long as $x < -\sqrt{7}$. A proof would be

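PROOF: $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$
Let $f(x) = \frac{6x^2+5x}{2x^2-7}$.
Given $\epsilon > 0$, let $N = \min\left(-\sqrt{7}, -\frac{5}{\epsilon}\right)$.
Select $x$ such that $x < N \le -\sqrt{7}$.
Then $x < N \le -\frac{5}{\epsilon}$ implies $\epsilon > \frac{5}{|x|} = \frac{5|x|}{x^2} > \frac{|5x+21|}{x^2 + (x^2 - 7)} = \left|\frac{5x+21}{2x^2-7}\right| = \left|\frac{6x^2+5x}{2x^2-7} - 3\right| = |f(x) - 3|$.
Therefore, $\lim_{x \to -\infty} \frac{6x^2+5x}{2x^2-7} = 3$.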
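For readers who like to test a threshold before trusting it, here is a short Python sketch for this limit at negative infinity. It assumes the threshold is the lesser of the negative square root of 7 and negative 5 over the tolerance; the helper names `n_for` and `holds_below` are hypothetical, and a finite sample is only a sanity check.

```python
import math

def f(x):
    return (6 * x**2 + 5 * x) / (2 * x**2 - 7)

def n_for(eps):
    """Threshold from the proof above: the lesser of -sqrt(7) and -5/eps."""
    return min(-math.sqrt(7), -5 / eps)

def holds_below(eps, samples=200):
    """True if |f(x) - 3| < eps at sampled x < N (a spot check, not a proof)."""
    N = n_for(eps)
    xs = [N - 10 ** (k / 20 - 5) for k in range(samples)]  # just below N down to about N - 1e5
    return all(abs(f(x) - 3) < eps for x in xs)

print([holds_below(eps) for eps in (10.0, 1.0, 0.01)])
```

For a large tolerance the square-root bound is the active one, and for a small tolerance the bound depending on the tolerance takes over, mirroring the two conditions used in the proof.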

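3.4.1 Exercises

Find ways to justify each of the following inequalities that hold for large values of $x$.

1. $\frac{3x-5}{2x^2} < \frac{2}{x}$
2. $\frac{4x+7}{2x^2-6} < \frac{5}{x}$
3. $\frac{5x^2+3x+1}{x^3-x^2-1} < \frac{10}{x}$

Write a proof of each of the following limits.

4. $\lim_{x \to \infty} \frac{4}{x+4} = 0$
5. $\lim_{x \to \infty} \frac{3x-9}{3x+4} = 1$
6. $\lim_{x \to \infty} \frac{9x^2}{3x^2-10} = 3$
7. $\lim_{x \to \infty} \frac{x^3}{5x^3-2x^2-4} = \frac{1}{5}$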


3.5.1 Definition of Sequence

A sequence is just a function whose domain is the set of natural numbers, $\mathbb{N}$. In this chapter the codomain of a sequence will be the real numbers, $\mathbb{R}$, but you can have a sequence with any set serving as the codomain. Functions are usually referenced using the notation $f(x)$. But for sequences it is traditional to place the argument of a sequence in a subscript rather than within parentheses as in $a_1, a_2, a_3, \ldots$. The entire sequence is notated with angle brackets as in $\langle a_n \rangle$. Note that this is not the same as the set $\{a_1, a_2, a_3, \ldots\}$ which is just the collection of the values taken on by the sequence, that is, the range of the function $a : \mathbb{N} \to \mathbb{R}$. For each $n \in \mathbb{N}$, $a_n$ is called a term of the sequence, or specifically, the $n$th term of the sequence.

As with any real-valued function, you can add, subtract, multiply, and divide sequences. The sum of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $\langle c_n \rangle$ where, for each $n \in \mathbb{N}$, $c_n = a_n + b_n$. Similarly, one can define the difference of sequences and product of sequences as $c_n = a_n - b_n$ and $c_n = a_n \cdot b_n$, respectively. If the sequence $\langle b_n \rangle$ has no terms equal to zero, then the quotient of sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ is the sequence $c_n = \frac{a_n}{b_n}$.

Other arithmetic operations can be similarly defined. If $f$ is any real-valued function with a domain that includes the range of the sequence $\langle a_n \rangle$, then it makes sense to define the sequence $c_n = f(a_n)$. For example, if $\langle a_n \rangle$ is the sequence $1, 3, 5, 7, \ldots$, then the sequence $\langle \sqrt{a_n} \rangle$ is the sequence $\sqrt{1}, \sqrt{3}, \sqrt{5}, \sqrt{7}, \ldots$.

A sequence $\langle a_n \rangle$ is a monotone increasing sequence if $a_1 \le a_2 \le a_3 \le \cdots$, or in other words, for natural numbers $i < j$ it follows that $a_i \le a_j$. Similarly, $\langle a_n \rangle$ is a monotone decreasing sequence if $a_1 \ge a_2 \ge a_3 \ge \cdots$, or for natural numbers $i < j$ it follows that $a_i \ge a_j$. A monotone sequence is a sequence that is either monotone increasing or monotone decreasing. If a monotone increasing sequence $\langle a_n \rangle$ satisfies $a_i < a_j$ for all natural numbers $i < j$, then it is a strictly monotone increasing sequence. Similarly, $\langle a_n \rangle$ is strictly monotone decreasing if $a_i > a_j$ for all natural numbers $i < j$. For example, the following sequences are monotone increasing:

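$1, 2, 3, \ldots$
$1, 1, 2, 3, 3, 4, 5, 5, \ldots$
$\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \ldots$
$1^3, 2^3, 3^3, 4^3, \ldots$

whereas the following sequences are monotone decreasing:

$8, 4, 2, 1, \frac{1}{2}, \frac{1}{4}, \ldots$
$0, 0, 0, -\frac{1}{2}, -\frac{1}{2}, -\frac{1}{2}, -1, -1, -1, -\frac{3}{2}, \ldots$
$\frac{1}{4^4}, \frac{1}{5^5}, \frac{1}{6^6}, \ldots$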

Any sequence of real numbers can be written as a sum of a monotone increasing sequence and a monotone decreasing sequence. In particular, if $\langle c_n \rangle$ is a sequence of real numbers, define an increasing sequence $\langle a_n \rangle$ and a decreasing sequence $\langle b_n \rangle$ as follows. Let $a_1 = c_1$ and $b_1 = 0$. Then for all $n \in \mathbb{N}$, if $c_n \le c_{n+1}$, define $a_{n+1} = c_{n+1} - b_n$ and $b_{n+1} = b_n$, and if $c_n > c_{n+1}$, define $a_{n+1} = a_n$ and $b_{n+1} = c_{n+1} - a_n$. These definitions make it clear that $c_n = a_n + b_n$ for each $n \in \mathbb{N}$. The sequence $\langle a_n \rangle$ is increasing because $c_n \le c_{n+1}$ implies that $a_{n+1} - a_n = (c_{n+1} - b_n) - (c_n - b_n) = c_{n+1} - c_n \ge 0$, and $c_n > c_{n+1}$ implies $a_n = a_{n+1}$. Similarly, $\langle b_n \rangle$ is decreasing because $c_n > c_{n+1}$ implies that $b_{n+1} - b_n = (c_{n+1} - a_n) - (c_n - a_n) = c_{n+1} - c_n < 0$, and $c_n \le c_{n+1}$ implies $b_n = b_{n+1}$. Thus, $1, -1, 2, -2, 3, -3, \ldots$ can be written as the sum of the two sequences $1, 1, 4, 4, 9, 9, \ldots$ and $0, -2, -2, -6, -6, -12, \ldots$.
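The recursive construction just described translates directly into a short program. The Python sketch below applies it to a finite list of terms; the function name `decompose` is hypothetical, and list positions start at 0 rather than 1. It is run on the first six terms of the alternating sequence 1, -1, 2, -2, 3, -3.

```python
def decompose(c):
    """Split a finite list c into (a, b) with a nondecreasing, b nonincreasing,
    and c[n] == a[n] + b[n] for every index n."""
    a, b = [c[0]], [0]
    for n in range(len(c) - 1):
        if c[n] <= c[n + 1]:
            a.append(c[n + 1] - b[n])   # the increasing part absorbs the rise
            b.append(b[n])
        else:
            a.append(a[n])
            b.append(c[n + 1] - a[n])   # the decreasing part absorbs the drop
    return a, b

c = [1, -1, 2, -2, 3, -3]
a, b = decompose(c)
print(a)  # [1, 1, 4, 4, 9, 9]
print(b)  # [0, -2, -2, -6, -6, -12]
```

At each step exactly one of the two lists changes, which is why one of them can only rise and the other can only fall.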

3.5.4 Subsequences

Intuitively, a subsequence of a sequence $\langle a_n \rangle$ is a sequence whose terms include some of the terms of the sequence $\langle a_n \rangle$ in the same order as they appear in the original sequence. Formally, if there is a strictly increasing sequence of natural numbers $i : \mathbb{N} \to \mathbb{N}$, then $\langle a_{i_n} \rangle$ is a subsequence of the sequence $\langle a_n \rangle$. Thus, the sequence $1, -1, 2, -2, 3, -3, \ldots$ has the following subsequences

$1, 2, 3, \ldots$
$1, -1, 3, -3, 5, -5, \ldots$
$2, 3, 5, 7, 11, \ldots$

The sequence $1, 2, 2, 3, 3, 3, 4, 4, 4, 4, \ldots$ is not a subsequence of $1, -1, 2, -2, 3, -3, \ldots$ since there are no repeated values in the original sequence, so there can be no repeated values in any of its subsequences.


The definition of the limit of a sequence is similar to that of the limit of a function as $x \to \infty$ except that the function is only defined on the natural numbers. Thus, if $\langle a_n \rangle$ is a sequence of real numbers, then the limit of the sequence is $L$, $\lim_{n \to \infty} a_n = L$, means that for all $\epsilon > 0$ there is an $N$ such that for every natural number $n > N$ it follows that $|a_n - L| < \epsilon$. A sequence that has limit $L$ is said to converge to $L$ and is said to be a convergent sequence. A sequence that does not converge is said to diverge and is said to be a divergent sequence.

Except for slight notational changes, proving that a sequence has a particular limit involves the same type of work as proving that a function has a particular limit as its variable approaches infinity. For example, the sequence $a_n = \frac{4n^2+n+2}{2n^2-7}$ has limit 2. To prove this, given an $\epsilon > 0$, you would need to exhibit a number $N$ such that $|a_n - 2| < \epsilon$ for all $n > N$. As with writing proofs about functions, one can work backwards from $|a_n - 2| = \left|\frac{4n^2+n+2}{2n^2-7} - 2\right| = \left|\frac{n+16}{2n^2-7}\right|$. If you stipulate that $n \ge 3$, then $n^2 - 7 \ge 9 - 7 = 2 > 0$, allowing you to conclude that $|a_n - 2| < \frac{n+16n}{n^2} = \frac{17n}{n^2} = \frac{17}{n}$ which can easily be made less than $\epsilon$ by requiring $n > \frac{17}{\epsilon}$. This is what is needed for the proof.

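PROOF: $\lim_{n \to \infty} \frac{4n^2+n+2}{2n^2-7} = 2$
Let $a_n = \frac{4n^2+n+2}{2n^2-7}$.
Given $\epsilon > 0$, let $N = \max\left(3, \frac{17}{\epsilon}\right)$.
Select an $n > N$.
Since $N \ge 3$, it follows that $n^2 > 9$.
Also, $n > N$ gives $n > \frac{17}{\epsilon}$. Thus,
$\epsilon > \frac{17}{n} = \frac{17n}{n^2} = \frac{n+16n}{n^2} > \frac{n+16}{n^2} > \frac{n+16}{n^2 + (n^2 - 7)} = \frac{n+16}{2n^2-7} = \left|\frac{4n^2+n+2}{2n^2-7} - 2\right| = |a_n - 2|$.
Therefore, $\lim_{n \to \infty} \frac{4n^2+n+2}{2n^2-7} = 2$.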
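The threshold in this proof can also be examined term by term. The sketch below is an informal check that assumes the threshold is the larger of 3 and 17 over the tolerance. It uses exact rational arithmetic, the helper names `n_for` and `check` are hypothetical, and it inspects only a finite block of terms beyond the threshold.

```python
from fractions import Fraction

def a(n):
    """The n-th term (4n^2 + n + 2) / (2n^2 - 7) as an exact fraction."""
    return Fraction(4 * n * n + n + 2, 2 * n * n - 7)

def n_for(eps):
    """Threshold from the proof above: the larger of 3 and 17/eps."""
    return max(3, 17 / eps)

def check(eps, count=2000):
    """True if |a_n - 2| < eps for the first `count` integers n > N."""
    N = n_for(eps)
    first = int(N) + 1                  # smallest integer strictly greater than N
    return all(abs(a(n) - 2) < eps for n in range(first, first + count))

for eps in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    print(eps, n_for(eps), check(eps))
```

The threshold is far from the smallest one that works, which is fine: the definition asks only that some threshold exists.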

Induction

A function $f : A \to \mathbb{R}$ is said to be bounded above if the set $\{f(x) \mid x \in A\}$ is bounded above, that is, if there exists an $M \in \mathbb{R}$ such that $f(x) \le M$ for all $x$ in the domain $A$ of $f$. In this case $M$ is an upper bound of $f$. Similarly, the function is said to be bounded below if the set $\{f(x) \mid x \in A\}$ is bounded below. A function that is both bounded above and bounded below is said to be bounded. Because a real-valued sequence $\langle a_n \rangle$ is just a real-valued function whose domain is the natural numbers, $\mathbb{N}$, these definitions apply to sequences as well.

Monotone increasing sequences that are bounded above must converge, and monotone decreasing sequences that are bounded below must converge. Thus, bounded monotone sequences converge. If a monotone sequence does not converge, then its terms must continue to grow without bound and approach plus or minus infinity.

So how would you prove that a monotone increasing sequence that is bounded above converges? When proving a limit of the form $\lim_{n \to \infty} a_n = L$, you can work with the inequality $\epsilon > |a_n - L|$ in order to find an appropriate value of $N$ that allows you to use the definition of limit to complete the proof. But in this case, you do not have a general expression for the terms $a_n$, and you have not been given a value for $L$. Somehow you need to use the only known facts about $\langle a_n \rangle$, that is, the fact that the sequence is both monotone increasing and bounded, to come up with a candidate to serve as the limit, $L$, in the proof.

The definition of a sequence being bounded above holds the key. That definition says that the sequence $\langle a_n \rangle$ is bounded above if the set $\{a_n \mid n \in \mathbb{N}\}$ is bounded above, so there is a real number $M$ which is greater than or equal to each term of the sequence. Will this $M$ be the limit of the sequence? Well, not usually. If $M$ is an upper bound for the sequence, then so are $M + 1$, $M + 100$, and $M + 20{,}000$. They are all upper bounds, but they cannot all be limits of the sequence. You should recognize that the terms of the sequence must get close to the limit, and the only upper bound of the set $\{a_n \mid n \in \mathbb{N}\}$ that the terms could get close to is the least upper bound of the set. Since $\{a_n \mid n \in \mathbb{N}\}$ is both nonempty and bounded above, the Completeness Axiom for the real numbers guarantees that such a least upper bound exists. This gives you a candidate for $L$.

The proof will require you to show that for all $n$ greater than some $N$, the terms of the sequence, $\langle a_n \rangle$, are within $\epsilon$ of $L$. How can this be arranged? Here is where you can use the fact that the sequence is monotone increasing because once you find a single term, $a_n$, that gets within $\epsilon$ of $L$, all the terms that come after this term in the sequence will necessarily have to be between $a_n$ and $L$, so they also will be within $\epsilon$ of $L$. How do you find one term, $a_n$, within $\epsilon$ of $L$? This follows from the fact that $L$ is a least upper bound of $\{a_n \mid n \in \mathbb{N}\}$. Because $L$ is the least upper bound, $L - \epsilon$, being less than the least upper bound, $L$, is not an upper bound, so there must be an element of the set $\{a_n \mid n \in \mathbb{N}\}$ greater than $L - \epsilon$. This gives all the tools needed for the proof (Fig. 3.6).


So how would you write the proof? Certainly the proof would begin with

selecting a generic sequence and making a statement about the properties the

sequence is assumed to have, that is, its being monotone increasing and bounded

above. Then, the proof would proceed to justify the existence of the least upper

bound for the set of terms of the sequence; that will give you the target value of L.

Then, as with most proofs about limits, it would select a value for $\epsilon > 0$. Unlike the limit proofs earlier in this chapter, one cannot immediately state a value for $N$. The existence of $N$ must be proved as discussed in the previous paragraph. Finally, the properties of the sequence can be brought together to show $|a_n - L| < \epsilon$ for all $n > N$. Here is one possible proof.

PROOF: A monotone increasing sequence that is bounded above converges.
Let $\langle a_j \rangle$ be a monotone increasing sequence of real numbers that is bounded above.
Since the set of terms $A = \{a_j \mid j \in \mathbb{N}\}$ contains $a_1$, it is nonempty, and since it is bounded above, the Completeness Axiom guarantees that $A$ has a least upper bound, $L$.
Given $\epsilon > 0$, the number $L - \epsilon$ is less than $L$. Since $L$ is the least upper bound of $A$, $L - \epsilon$ is not an upper bound of $A$. Thus, there is an $N \in \mathbb{N}$ such that the term $a_N$ is in $A$ and is larger than $L - \epsilon$.
Select an $n > N$.
Because $\langle a_j \rangle$ is monotone increasing, $a_n \ge a_N$. Because $L$ is an upper bound for $A$, $a_n \le L$. Therefore, $L - \epsilon < a_N \le a_n \le L$, and $|a_n - L| < |(L - \epsilon) - L| = \epsilon$.
This proves that the sequence $\langle a_j \rangle$ has limit $L$ and that $\langle a_j \rangle$ converges.

Note that the proof needs to refer to the sequence $\langle a_n \rangle$ as well as a particular element of the sequence $a_n$. It could be confusing to the proof reader to use the variable $n$ in both contexts here, especially since the sequence notation $\langle a_n \rangle$ is used after the choice of a specific value of $n$ is made. That is the reason the proof changed to using the variable $j$ to refer to a generic term index. Then, it could refer to a specific term using index $n$ without confusing the two uses.

There is also a theorem stating that a monotone decreasing sequence that is

bounded below converges. The proof of this is left as an exercise.

As an illustration of the usefulness of the above result, consider a sequence defined recursively by $a_1 = 2$, and for $n \ge 1$, $a_{n+1} = \sqrt{a_n + 12}$. That is, $a_1 = 2$, $a_2 = \sqrt{a_1 + 12} = \sqrt{14}$, $a_3 = \sqrt{\sqrt{14} + 12}$, and so forth. One can prove that this sequence converges by showing that the sequence is both monotone increasing and bounded above. Indeed, both of these facts can be established by mathematical induction. The reader is likely already familiar with proofs by mathematical induction, but this is an appropriate opportunity to review the method and its merits.

Suppose the variable $n$ represents any natural number, and there is a statement $S(n)$ that includes this variable as part of the statement. For example, the statement could be $\lim_{x \to a} x^n = a^n$. Mathematical induction is a proof technique that uses the following proof template to show that $S(n)$ is true for all $n$ greater than or equal to some base value $b \in \mathbb{N}$.

TEMPLATE for using mathematical induction to prove the statement $S(n)$ is true for all natural numbers $n \ge b$.
SET THE CONTEXT: The statement will be proved by mathematical induction on $n$ for all $n \ge b$.
PROVE $S(b)$: Prove that the statement is true when the variable $n$ is equal to the base value, $b$.
STATE THE INDUCTION HYPOTHESIS: Assume that $S(n)$ is true for some natural number $n = k \ge b$.
PERFORM THE INDUCTION STEP: Using the fact that $S(k)$ is true, prove that $S(k + 1)$ is true.
STATE THE CONCLUSION: Therefore, by mathematical induction, $S(n)$ is true for all natural numbers $n \ge b$.

It is important to understand why the technique of mathematical induction works. That is, if the statement $S(b)$ is true, and if the statement $S(k) \to S(k + 1)$ is true for all $k \ge b$, then, in fact, $S(n)$ must be true for all natural numbers $n \ge b$. Certainly, $S(b)$ is true. Because $S(b)$ is true, and $S(k) \to S(k + 1)$ is true for all $k \ge b$, it follows that $S(b) \to S(b + 1)$, so $S(b + 1)$ is true. Then $S(b + 1) \to S(b + 2)$, $S(b + 2) \to S(b + 3)$, and so forth, so the fact that $S(n)$ is true for all $n \ge b$ follows.

The strength of mathematical induction is that it is often much easier to provide a proof for the one step $S(k) \to S(k + 1)$ than it is to prove $S(n)$ in the general case. The reader has likely seen many statements proved by mathematical induction while studying Algebra, Calculus, or just about any other branch of mathematics.

Mathematical induction is an excellent tool for proving that the previously introduced recursive sequence is both monotone increasing and bounded above. Clearly, $a_2 = \sqrt{14} > \sqrt{4} = 2 = a_1$ so $a_1 < a_2$. Suppose that for some $k \ge 1$ one has $a_k < a_{k+1}$. Then it follows that $a_k + 12 < a_{k+1} + 12$ so $\sqrt{a_k + 12} < \sqrt{a_{k+1} + 12}$ which shows that $a_{k+1} < a_{k+2}$. Thus, by mathematical induction it follows that $a_n < a_{n+1}$ for all $n$, and the sequence is monotone increasing. Also clear is that $a_1 = 2 < 4$. Suppose that for some $k \ge 1$ one has $a_k < 4$. Then $a_{k+1} = \sqrt{a_k + 12} < \sqrt{4 + 12} = \sqrt{16} = 4$. Thus, by mathematical induction it follows that $a_n < 4$ for all $n$, and the sequence is bounded above. The limit of this sequence can be shown to be 4. In particular, if the limit is $L$, one can conclude that $\sqrt{a_n + 12}$ should be converging to $\sqrt{L + 12}$ which should equal the limit of $a_{n+1}$, which is also $L$. Thus, one would expect that $L = \sqrt{L + 12}$. This equation has only one positive real solution, $L = 4$.
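The two facts established by induction, and the predicted limit, can be observed in the first few terms of the sequence. The Python sketch below computes twelve terms in floating point; the helper name `terms` is hypothetical, and the computation illustrates the claims rather than proving them. Only a modest number of terms is used because, once the terms are within rounding error of 4, floating-point arithmetic can no longer distinguish consecutive terms.

```python
import math

def terms(count, first=2.0):
    """First `count` terms of the sequence a_1 = 2, a_{n+1} = sqrt(a_n + 12)."""
    seq = [first]
    while len(seq) < count:
        seq.append(math.sqrt(seq[-1] + 12))
    return seq

seq = terms(12)
increasing = all(x < y for x, y in zip(seq, seq[1:]))   # a_n < a_{n+1}
bounded = all(x < 4 for x in seq)                       # a_n < 4
print(increasing, bounded)
print(seq[:4], seq[-1])
```

The gap between each term and 4 shrinks by a factor of roughly 8 at every step, so the terms approach 4 quickly while remaining below it.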


A Cauchy sequence is a sequence whose terms get close together. As with the definition of limit, the concept of close needs to be made precise. As with the definition of limit, close means that given any tolerance $\epsilon > 0$, one can go out far enough in the sequence to ensure that all terms of the sequence beyond that point are within $\epsilon$ of each other. Thus, a sequence is Cauchy if for every $\epsilon > 0$ there is an $N$ such that if natural numbers $m$ and $n$ are both greater than $N$, then $|a_m - a_n| < \epsilon$.

If a sequence of real numbers converges, then the sequence is Cauchy. The proof of this fact uses a strategy employed repeatedly in Analysis, that is, if two quantities are very close to the same value, then they must be very close to each other. This standard technique for proving that two quantities are close to each other involves the use of the triangle inequality. In particular, if $\lim_{j \to \infty} a_j = L$, then for every $\epsilon > 0$ there is an $N$ such that if natural number $n > N$, then $|a_n - L| < \epsilon$. Well then, certainly if $m$ and $n$ are both natural numbers greater than $N$, then both $|a_m - L| < \epsilon$ and $|a_n - L| < \epsilon$. Adding these two inequalities together shows that $|a_m - L| + |a_n - L| < \epsilon + \epsilon$. The triangle inequality states that for any real numbers $x$ and $y$, $|x| + |y| \ge |x + y|$. Thus, $2\epsilon > |a_m - L| + |a_n - L| = |a_m - L| + |L - a_n| \ge |(a_m - L) + (L - a_n)| = |a_m - a_n|$. Of course, the definition of Cauchy sequence requires you to show that $|a_m - a_n|$ is less than $\epsilon$, not $2\epsilon$. But you have an enormous amount of flexibility when working with these types of inequalities, so you could have asked instead for an $N$ such that for all natural numbers $n$ greater than $N$, you have $|a_n - L|$ less than $\frac{\epsilon}{2}$ rather than less than $\epsilon$. Thus, the proof could be as follows.

PROOF: Every convergent sequence is Cauchy.
Let $\langle a_j \rangle$ be a sequence of real numbers with $\lim_{j \to \infty} a_j = L$.
Given $\epsilon > 0$, from the definition of limit, there is a number $N$ such that for all natural numbers $j > N$, it follows that $|a_j - L| < \frac{\epsilon}{2}$.
Then for all natural numbers $m$ and $n$ greater than $N$, $|a_m - L| < \frac{\epsilon}{2}$ and $|a_n - L| < \frac{\epsilon}{2}$, so $\epsilon = \frac{\epsilon}{2} + \frac{\epsilon}{2} > |a_m - L| + |a_n - L| = |a_m - L| + |L - a_n| \ge |(a_m - L) + (L - a_n)| = |a_m - a_n|$.
This shows that the convergent sequence $\langle a_j \rangle$ is Cauchy.

Note that the converse of this theorem also holds. That is, any sequence of real numbers that is Cauchy is a convergent sequence. This result will be proved in Sect. 3.7. An important and useful consequence of the above theorem is its contrapositive: If a sequence is not Cauchy, then it does not converge. Often when one wants to show that a sequence does not converge, one shows that there is some $\epsilon > 0$ such that for every $N$ there are natural numbers $m$ and $n$ greater than $N$ for which $|a_m - a_n| \ge \epsilon$.
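The contrapositive can be illustrated with a small search for a pair of far-apart terms. In the Python sketch below, the function name `cauchy_witness` and the two example sequences are assumptions chosen for illustration: the alternating sequence of 1 and -1, which does not converge, and the sequence of reciprocals, which does. The search looks only at a finite window of indices past the given point, so finding no pair suggests, but does not prove, that none exists.

```python
def cauchy_witness(a, eps, N, window=50):
    """Return indices (m, n), both greater than N, with |a(m) - a(n)| >= eps,
    searching only the next `window` indices; return None if no pair is found."""
    indices = range(N + 1, N + 1 + window)
    for m in indices:
        for n in indices:
            if abs(a(m) - a(n)) >= eps:
                return m, n
    return None

alternating = lambda n: (-1) ** n   # terms -1, 1, -1, 1, ...
reciprocal = lambda n: 1 / n        # terms 1, 1/2, 1/3, ...

# With tolerance 1, a far-apart pair exists beyond every starting point tried.
print([cauchy_witness(alternating, 1, N) for N in (1, 10, 1000)])

# For the reciprocals, no pair beyond index 10 differs by 0.1 or more.
print(cauchy_witness(reciprocal, 0.1, 10))
```

Consecutive terms of the alternating sequence always differ by 2, so the tolerance 1 serves as the single fixed tolerance that defeats every choice of starting point.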

Another important property of Cauchy sequences is that all Cauchy sequences are bounded. If the sequence $\langle a_n \rangle$ is Cauchy, then there is a natural number $N$ such that whenever $m, n \ge N$, the difference $|a_m - a_n| < 1$. The set $\{a_1, a_2, a_3, \ldots, a_N\}$ is a finite set, so it is bounded by some number, $K$. That is, $|a_n| \le K$ for all $n \le N$. If $m > N$, then, since both $N$ and $m$ are greater than or equal to $N$, it follows that $|a_m - a_N| < 1$ from which it follows that $|a_m| < |a_N| + 1 \le K + 1$. Then the sequence $\langle a_n \rangle$ is necessarily bounded above by $K + 1$ and below by $-(K + 1)$, and the sequence is bounded. A complete proof follows.

PROOF: All Cauchy sequences are bounded.
Let $\langle a_n \rangle$ be a Cauchy sequence.
Then there is a natural number $N$ such that for all $m, n \ge N$, $|a_m - a_n| < 1$.
The set $\{a_1, a_2, a_3, \ldots, a_N\}$ is a finite set, so there is a $K$ such that the set is bounded above by $K$ and bounded below by $-K$.
Let $m$ be any natural number. If $m \le N$, then $|a_m| \le K$. If $m > N$, then $|a_m - a_N| < 1$, so $|a_m| = |a_m - a_N + a_N| \le |a_m - a_N| + |a_N| < 1 + K$.
It follows that all terms of the sequence lie between $-(K + 1)$ and $K + 1$, and, thus, the sequence is bounded.

One consequence of the last two results is that since all convergent sequences are

Cauchy, all convergent sequences are bounded. The concept of a Cauchy sequence is

not only applied to sequences of numbers but also to much more general sequences

such as sequences of vectors, sequences of functions, and sequences of linear

operators. Of course, one would need a way to discuss distances between the terms

of a sequence in these other contexts, but when that makes sense, the concept of a

Cauchy sequence becomes important.

3.5.8 Exercises

1. Which of the following sequences are monotone? Which of them are bounded

above? Which of them are bounded below? Which of them are bounded?

(a)

(b)

(c)

(d)

(e)

(f)

(g)

an

an

an

an

an

an

an

D .1/n

n

D nC1

D 5n

.1/n

D 5n

n

D 1C.1/

nCn1

D 5 n.1/n

D 1 12 13 1n

2. (a) $\lim_{n \to \infty} \frac{6n}{3n+1} = 2$

(b) $\lim_{n \to \infty} \frac{4n-1}{n+6} = 4$

(c) $\lim_{n \to \infty} \frac{n^2+2n+1}{n^2-2n-5} = 1$


3. If $a_1 = 3$ and $a_n$ is defined recursively by $a_{n+1} = \sqrt{3a_n + 10}$, show that the sequence $\langle a_n \rangle$ converges.

4. If $a_1 = 7$ and $a_n$ is defined recursively by $a_{n+1} = \sqrt{3a_n + 4}$, show that the sequence $\langle a_n \rangle$ converges.

5. Prove that a monotone decreasing sequence that is bounded below converges.

6. Let <an > be any sequence. Prove that <an > has a monotone subsequence.

7. Prove that if $\langle a_n \rangle$ is a sequence such that $L = \lim_{n \to \infty} a_{2n} = \lim_{n \to \infty} a_{2n+1}$, then the sequence converges to $L$.

8. Prove that if $\langle a_n \rangle$ is a sequence that converges to $L$, then the sequence $a_1, a_1, a_2, a_2, a_3, a_3, \ldots$ also converges to $L$.

9. Prove that if $\langle a_n \rangle$ is a sequence that converges to $L$, then the sequence $a_1, a_2, a_2, a_3, a_3, a_3, a_4, a_4, a_4, a_4, \ldots$ also converges to $L$.

3.6.1 Why a Limit Might Not Exist

$\lim_{x \to a} f(x) = L$ means that if $x$ is required to stay close to $a$, then $f(x)$ will stay close to $L$. So what does it mean for $\lim_{x \to a} f(x)$ not to exist? Intuitively, it could mean that in every neighborhood of $a$ there are values of $x$ for which $f(x)$ is close to one value $L_1$ and other values of $x$ for which $f(x)$ is close to another value $L_2$. That is what happens with the function $f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$ as $x$ approaches 2. For some values of $x$ near 2, $f(x)$ is close to 3, and for some values of $x$ near 2, $f(x)$ is close to 6. Thus, the limit does not exist. Another well-known example is $f(x) = \sin\frac{1}{x}$ which oscillates wildly as $x$ approaches zero, and in every neighborhood of 0, the function takes on all values in the interval $[-1, 1]$ infinitely often. Another way for the limit not to exist is for the values of $f(x)$ to grow without bound and approach infinity or negative infinity such as what happens to $f(x) = \frac{x+3}{(x-5)^2}$ as $x$ approaches 5.

One can write a proof showing that a particular function has no limit at $x = a$, but before discussing how to do this, it is worth taking a close look at the definition of limit.

To say that a function $f$ has a limit at $x = a$ is to say that there exists a real number $L$ such that for all $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$. This definition is actually a fairly complicated statement. At the heart of it is the conditional statement "$0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$." But this is an open statement, that is, even though the function $f$ and the limit point $a$ are supposedly known, the statement contains variables $x$, $L$, $\epsilon$, and $\delta$, all of which are unknown. Thus, this open statement does not have any truth value until these four variables have been stipulated. They are stipulated with four phrases: "there is a real number $L$," "for all $\epsilon > 0$," "there is a $\delta > 0$," and "for every $x$." These four phrases are called quantifications of the variables because they indicate for which values of the variables the following statement must hold. Two of the phrases use the existential quantifier "there exists." It indicates that there is at least one value of the variable that will make the following statement true. The other two phrases use the universal quantifier "for all." It indicates that every possible value of that variable will make the following statement true. So

The statement "there exists a real number $L$ such that for all $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$" begins with the existential quantifier "there exists a real number $L$," and the entire statement is true if, in fact, there is a value of the variable $L$ that makes the following statement true, that is, "for all $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$."

The statement "for all $\epsilon > 0$ there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$" begins with the universal quantifier "for all $\epsilon > 0$," and the entire statement is true if, in fact, every possible positive value of the variable $\epsilon$ makes the following statement true, that is, "there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$."

The statement "there is a $\delta > 0$ such that for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$" begins with the existential quantifier "there is a $\delta > 0$," and the entire statement is true if, in fact, there is a positive value of the variable $\delta$ that makes the following statement true, that is, "for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$."

The statement "for every $x$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$" begins with the universal quantifier "for every $x$," and the entire statement is true if, in fact, every possible value of the variable $x$ makes the following statement true, that is, "$0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$."

A proof that no limit exists must prove the negation of the statement that says that a limit does exist, so it is important that one can generate the negation of a statement that contains quantifiers such as this one does. The logic of doing this is not hard to follow. Suppose that $P(y)$ is a statement that depends on the value of a variable $y$. Then the universally quantified statement "for every $y$, $P(y)$" says that $P(y)$ is true for every possible value of $y$. The negation of "for every $y$, $P(y)$" must be that it is false that every value of $y$ makes $P(y)$ true, so there must be at least one $y$ that makes $P(y)$ a false statement. This means that the negation of "for every $y$, $P(y)$" is the statement "there is a $y$ such that $\neg P(y)$." To negate a universally quantified statement, change the universal quantifier to an existential quantifier and negate the statement that follows.

What if the original statement is an existentially quantified statement such as "there is a $y$ such that $P(y)$"? This statement says that some value of $y$ makes $P(y)$ true. The negation of this statement must be that no value of $y$ makes $P(y)$ true which is to say that every value of $y$ makes $P(y)$ a false statement. This means that the negation of "there is a $y$ such that $P(y)$" is the statement "for all $y$, $\neg P(y)$." To negate an existentially quantified statement, change the existential quantifier to a universal quantifier and negate the statement that follows.
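The two negation rules can be checked mechanically over a finite domain. Python's built-in `all` and `any` play the roles of "for every" and "there is". The sketch below is only an illustration: the domain and the three predicates are assumptions chosen for the example, and a finite check says nothing about infinite domains.

```python
def forall(pred, domain) -> bool:
    """'For every y in domain, pred(y)'."""
    return all(pred(y) for y in domain)


def exists(pred, domain) -> bool:
    """'There is a y in domain such that pred(y)'."""
    return any(pred(y) for y in domain)


DOMAIN = range(-5, 6)  # assumed finite domain: the integers -5, ..., 5


def square_at_least_self(y: int) -> bool:
    return y * y >= y   # true for every integer


def exceeds_three(y: int) -> bool:
    return y > 3        # true for some, but not all, y in DOMAIN


def exceeds_ten(y: int) -> bool:
    return y > 10       # true for no y in DOMAIN


for pred in (square_at_least_self, exceeds_three, exceeds_ten):
    def negated(y, p=pred):
        return not p(y)

    # not (for every y, P(y))   is the same as   there is a y with not P(y)
    assert (not forall(pred, DOMAIN)) == exists(negated, DOMAIN)
    # not (there is a y, P(y))  is the same as   for every y, not P(y)
    assert (not exists(pred, DOMAIN)) == forall(negated, DOMAIN)
```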

The statement that $f$ has a limit at $x = a$ is a statement that has an existential quantifier followed by a universal quantifier followed by an existential quantifier followed by a universal quantifier followed by a conditional statement. To prove that $f$ does not have a limit at $x = a$ requires a proof of the negation of that statement. From the previous discussion it is now clear that to get the negation of the statement that $f$ has a limit at $a$, you must flip the two existential quantifiers to universal quantifiers, flip the two universal quantifiers to existential quantifiers, and end with the negation of the conditional statement. The result is "for all real numbers $L$ there is an $\epsilon > 0$ such that for all $\delta > 0$ there is an $x$ such that $0 < |x - a| < \delta$ and $|f(x) - L| \ge \epsilon$."

Getting back to writing a proof that a limit does not exist, the proof would need to show that for every real number $L$ there is an $\epsilon > 0$ such that for every $\delta > 0$ there is an $x$ within $\delta$ of $a$ such that $|f(x) - L| \ge \epsilon$. This is often done by exhibiting an $x_1$ and an $x_2$ within $\delta$ of $a$ such that $f(x_1)$ and $f(x_2)$ are so far apart that they could not both be within $\epsilon$ of any $L$. That suggests the following template for proving that a particular limit does not exist.

TEMPLATE for proving $\lim_{x \to a} f(x)$ does not exist

SET THE CONTEXT: Make statements about what is known about the function $f$ and the number $a$.

SELECT AN ARBITRARY LIMIT $L$: Given $L \in \mathbb{R}$,

PROPOSE A VALUE FOR $\epsilon$: let $\epsilon = $ ___. Here you would insert a value for $\epsilon$.

SELECT AN ARBITRARY $\delta > 0$: Select $\delta > 0$.

SELECT VALUES FOR $x_1$ AND $x_2$: Let $x_1 = $ ___ and $x_2 = $ ___. Note that $0 < |x_1 - a| < \delta$, $0 < |x_2 - a| < \delta$, and $|f(x_1) - f(x_2)| \ge 2\epsilon$. You would have selected appropriate $x_1$ and $x_2$ in such a way that $|f(x_1) - f(x_2)|$ is at least $2\epsilon$.

LIST IMPLICATIONS: Assume that $|f(x_1) - L| < \epsilon$ and $|f(x_2) - L| < \epsilon$. Then $2\epsilon = \epsilon + \epsilon > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge |f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)|$.

STATE THE CONTRADICTION: This shows that $2\epsilon > |f(x_1) - f(x_2)|$ which is a contradiction.

STATE THE CONCLUSION: Thus, it cannot hold that both $|f(x_1) - L| < \epsilon$ and $|f(x_2) - L| < \epsilon$, and the limit does not exist.

For example, consider the limit of $f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$ as $x$ approaches 2. Here the limit from the left is 3, and the limit from the right is 6. Thus, no matter how close $x$ is supposed to be to 2, there will be values $x_1$ and $x_2$ within that required tolerance where $f(x_1)$ is close to 3 and $f(x_2)$ is close to 6. If $f(x_1)$ and $f(x_2)$ are both supposed to be within $\epsilon$ of some limit $L$, then it will follow that $f(x_1)$ and $f(x_2)$ will have to be within $2\epsilon$ of each other. Again, you employ the technique of showing that two quantities close to the same value must be close to each other. In particular, if $x_1$ is chosen to be less than 2, $f(x_1)$ will be less than 3. If $x_2$ is chosen to be between 2 and $2\frac{1}{2}$, $f(x_2)$ will be greater than 5. In this case it would be impossible to have $f(x_1)$ and $f(x_2)$ within 2 of each other, and, therefore, it would be impossible to have them both within $\epsilon = 1$ of some limit $L$. This suggests that you will get a contradiction if you set $\epsilon = 1$. Indeed, if a $\delta > 0$ is chosen, you can let $x_1 = 2 - \frac{\delta}{2}$ (that is, less than 2 but within $\delta$ of 2), and let $x_2 = \min\left(2 + \frac{\delta}{2}, 2 + \frac{1}{2}\right)$ (that is, greater than 2 but within $\delta$ of 2 and not so large that $f(x)$ is less than 5). The point of all of this is that now, no matter what value is chosen for $L$, $f(x_1)$ and $f(x_2)$ are more than 2 apart, so how could they both be within 1 of $L$? Specifically, if $|f(x_1) - L| < 1$ and $|f(x_2) - L| < 1$, it follows from the triangle inequality that $2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge |f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)|$ showing $2 > |f(x_1) - f(x_2)|$ which cannot hold. Here is the complete proof (Fig. 3.7).

Fig. 3.7 $f$ has no limit at $x = 2$

PROOF: The function $f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$ has no limit as $x \to 2$.

Let $f(x) = \begin{cases} 4x - 5 & \text{if } x < 2 \\ 10 - 2x & \text{if } x \ge 2 \end{cases}$.

Given any value for $L$, let $\epsilon = 1$, and let $\delta > 0$ be given.

Let $x_1 = 2 - \frac{\delta}{2}$ and $x_2 = \min\left(2 + \frac{\delta}{2}, 2 + \frac{1}{4}\right)$.

Note that $0 < |x_1 - 2| < \delta$ and $0 < |x_2 - 2| < \delta$.

Since $x_1 < 2$, it follows that $f(x_1) < 3$. Since $x_2 > 2$ and $x_2 \le 2\frac{1}{4}$, it follows that $f(x_2) > 5$. As a consequence $|f(x_1) - f(x_2)| = f(x_2) - f(x_1) > 5 - 3 = 2$.

If $|f(x_1) - L| < \epsilon = 1$ and $|f(x_2) - L| < \epsilon = 1$, it would follow that $2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge |f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)| > 2$. This shows that $2 > 2$ which is a contradiction.

Thus, it cannot hold that both $|f(x_1) - L| < \epsilon$ and $|f(x_2) - L| < \epsilon$, and the limit does not exist.
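The choice of $x_1$ and $x_2$ in this proof can be sanity-checked numerically. The following is a minimal sketch, assuming a handful of sample values of $\delta$ picked for illustration; the function name `witnesses` is hypothetical. For each sample it confirms that both points lie within $\delta$ of 2 and that the function values are more than $2\epsilon = 2$ apart.

```python
def f(x: float) -> float:
    """The piecewise function from the proof."""
    return 4 * x - 5 if x < 2 else 10 - 2 * x


def witnesses(delta: float) -> tuple[float, float]:
    """The points x1 < 2 < x2 chosen in the proof for a given delta > 0."""
    x1 = 2 - delta / 2
    x2 = min(2 + delta / 2, 2 + 0.25)
    return x1, x2


# Sample values of delta (assumed for illustration only).
for delta in (10.0, 1.0, 0.5, 1e-3, 1e-6):
    x1, x2 = witnesses(delta)
    assert 0 < abs(x1 - 2) < delta
    assert 0 < abs(x2 - 2) < delta
    assert f(x1) < 3 and f(x2) > 5
    assert abs(f(x1) - f(x2)) > 2   # too far apart to both be within 1 of any L
```

The check covers only the sampled values of $\delta$; the proof is what covers every $\delta > 0$.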

It is even easier to show that the function $f(x) = \sin\frac{1}{x}$ has no limit as $x$ approaches 0. This is because for every $\delta > 0$ it is easy to find $x_1$ and $x_2$ between 0 and $\delta$ such that $f(x_1) = 1$ and $f(x_2) = -1$. This makes it impossible to find an $L$ where $|f(x_1) - L| < 1$ and $|f(x_2) - L| < 1$. Thus, the proof follows the given template for proving that a limit does not exist (Fig. 3.8).

Fig. 3.8 Graph of $\sin\frac{1}{x}$

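PROOF: The function $\sin\frac{1}{x}$ has no limit as $x \to 0$.

Let $f(x) = \sin\frac{1}{x}$.

Given any value for $L$, let $\epsilon = 1$, and let $\delta > 0$ be given.

Select integer $k > \frac{1}{2\pi\delta}$. Let $x_1 = \frac{2}{(4k+1)\pi}$ and $x_2 = \frac{2}{(4k+3)\pi}$.

Note that both $x_1$ and $x_2$ are positive and less than $\frac{2}{4k\pi} = \frac{1}{2k\pi} < \delta$, $f(x_1) = \sin\left(\left(2k + \frac{1}{2}\right)\pi\right) = 1$, and $f(x_2) = \sin\left(\left(2k + \frac{3}{2}\right)\pi\right) = -1$.

If $|f(x_1) - L| < \epsilon = 1$ and $|f(x_2) - L| < \epsilon = 1$, it would follow that $2 = 1 + 1 > |f(x_1) - L| + |f(x_2) - L| = |f(x_1) - L| + |L - f(x_2)| \ge |f(x_1) - L + L - f(x_2)| = |f(x_1) - f(x_2)| = 2$. This shows that $2 > 2$ which is a contradiction.

Thus, it cannot hold that both $|f(x_1) - L| < \epsilon$ and $|f(x_2) - L| < \epsilon$, and the limit does not exist.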
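The selection of $k$, $x_1$, and $x_2$ can be reproduced in floating-point arithmetic. The following is a minimal sketch, assuming a few sample values of $\delta$; the function name `witnesses` and the tolerance `1e-9` are choices made for this illustration, and the equalities $f(x_1) = 1$, $f(x_2) = -1$ hold only approximately because of rounding.

```python
import math


def witnesses(delta: float) -> tuple[float, float]:
    """Points 0 < x2 < x1 < delta with sin(1/x1) = 1 and sin(1/x2) = -1."""
    k = math.floor(1 / (2 * math.pi * delta)) + 1   # an integer k > 1/(2*pi*delta)
    x1 = 2 / ((4 * k + 1) * math.pi)
    x2 = 2 / ((4 * k + 3) * math.pi)
    return x1, x2


# Sample values of delta (assumed for illustration only).
for delta in (1.0, 0.1, 1e-3, 1e-6):
    x1, x2 = witnesses(delta)
    assert 0 < x2 < x1 < delta
    assert math.isclose(math.sin(1 / x1), 1.0, abs_tol=1e-9)
    assert math.isclose(math.sin(1 / x2), -1.0, abs_tol=1e-9)
```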

template to use for the proof that $f(x)$ has no limit. The idea is that since $f(x)$ is unbounded, for any proposed limit $L$ one can find an $x$ close to $a$ such that $|f(x)| > |L| + 1$. Then the difference $|f(x) - L|$ will be forced to be greater than 1. Consider, for example, the function $f(x) = \frac{x+3}{(x-5)^2}$ as $x$ approaches 5. Given $L$, you will want an $x$ with $\frac{x+3}{(x-5)^2} > |L| + 1$. For $x$ within 1 of 5, $\frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{|x-5|}$, so by making $|x - 5| < \frac{1}{|L|+1}$ you will have the inequality that you need. Note that the absolute value function was introduced in $|L| + 1$ to take care of the embarrassing circumstance that $L$ is negative, and in particular, when $L = -1$. The proof is as follows.

PROOF: The function $\frac{x+3}{(x-5)^2}$ has no limit as $x \to 5$.

Let $f(x) = \frac{x+3}{(x-5)^2}$.

Given any value for $L$, let $\epsilon = 1$, and let $\delta > 0$ be given.

Select a value of $x$ between 5 and $5 + \min\left(1, \delta, \frac{1}{|L|+1}\right)$.

Note that $0 < |x - 5| < \delta$ and $f(x) = \frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{x-5} > |L| + 1$.

It follows that $|f(x) - L| > |L| + 1 - L \ge 1$.

Thus, it cannot hold that $|f(x) - L| < \epsilon$, and the limit does not exist.
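The unbounded case can be checked with exact rational arithmetic. The following is a minimal sketch, assuming a small grid of proposed limits $L$ and tolerances $\delta$ chosen for illustration; the helper `witness` is hypothetical and picks the midpoint of the interval from which the proof selects $x$.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    return (x + 3) / (x - 5) ** 2


def witness(L: Fraction, delta: Fraction) -> Fraction:
    """A point strictly between 5 and 5 + min(1, delta, 1/(|L|+1))."""
    bound = min(Fraction(1), Fraction(delta), Fraction(1) / (abs(L) + 1))
    return 5 + bound / 2   # midpoint: one admissible choice among many


# Sample proposed limits and tolerances (assumed for illustration only).
for L in (Fraction(-250), Fraction(-1), Fraction(0), Fraction(7)):
    for delta in (Fraction(3), Fraction(1, 10), Fraction(1, 1000)):
        x = witness(L, delta)
        assert 0 < abs(x - 5) < delta
        assert f(x) > abs(L) + 1
        assert abs(f(x) - L) >= 1   # so f(x) is not within epsilon = 1 of L
```

Using `Fraction` avoids rounding questions, so each assertion is an exact comparison.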

3.6.4 Exercises

Write the negation of each of the following statements.

1. There exists $x$ such that $x^2 = A$.

2. For all x there is a y such that g.x/ D f .y/.


3. There is an integer k such that f .x/ f .k/ for all x between k and k C 1.

4. For all x > 0 and all y > 0 there exists a z < 0 such that f .z/ xf .y/.

Prove that the following limits do not exist.

5. $f(x) = \frac{x}{|x|}$ as $x \to 0$

6. $f(x) = x \sin\frac{1}{x-1}$ as $x \to 1$

5x if x < 3

7. f .x/ D

as x ! 3

4x if x 3

8. f .x/ D x244 as x ! 2

A set $A$ has an accumulation point $p$ if for every $\epsilon > 0$ there is an $x \in A$ with $x \ne p$ and $|x - p| < \epsilon$. Informally, $p$ is an accumulation point of $A$ if there are points of $A$ that are arbitrarily close to $p$. Note that the fact that $p$ is an accumulation point of the set $A$ has nothing to do with whether $p$ is actually an element of $A$. For example, the set $A = \left\{\frac{1}{n} \mid n \in \mathbb{N}\right\}$ has one accumulation point, 0, because for every $\epsilon > 0$ there is an $n \in \mathbb{N}$ with $\frac{1}{n} < \epsilon$. Here the accumulation point 0 is not an element of the set $A$. The set $B = [0, 4]$ (the closed interval from 0 to 4) has infinitely many accumulation points. Indeed, every element of the interval $B$ is an accumulation point of $B$ because for each $x \in [0, 4]$ and each $\epsilon > 0$ there are infinitely many points in $B$ within $\epsilon$ of $x$. Here all of the accumulation points of $B$ are in $B$. Each point $x \in [0, 4]$ is also an accumulation point of the set $C = (0, 4) \cap \mathbb{Q}$, the set of rational numbers between 0 and 4. Here, some of the accumulation points are in $C$, and some are not. The set of natural numbers, $\mathbb{N}$, has no accumulation points. An element $a$ of a set that is not an accumulation point of that set is called an isolated point of the set. For any isolated point $a$, there is an $\epsilon > 0$ such that $a$ is the only element of the set in the interval $(a - \epsilon, a + \epsilon)$ (Fig. 3.9).

A word of warning is needed here. The term accumulation point is not used the

same way by all authors. Many texts, especially those in Topology, will use the terms

limit point or cluster point instead of accumulation point. Even more confusing is

that some texts use the term accumulation point for something different.

If $p$ is an accumulation point of set $A$, then for every $\epsilon > 0$ there is not only one point of $A$ within $\epsilon$ of $p$ but infinitely many points of $A$ within $\epsilon$ of $p$. The definition of accumulation point guarantees at least one point of $A$ within $\epsilon$ of $p$, but once one point, $x \in A$, is found to be within $\epsilon$ of $p$, the definition also says that there must be another point $y \in A$ with $0 < |y - p| < |x - p|$. Since for each $x \in A$ close to $p$ there must be another point $y \in A$ even closer to $p$, it follows that there are infinitely many points of $A$ within $\epsilon$ of $p$.

Perhaps the most used fact about accumulation points is known as the Bolzano-Weierstrass Theorem which states that every infinite bounded set of real numbers

has an accumulation point. As pointed out earlier, N has no accumulation points,

and it is an infinite set. But N is not a bounded set. Intuitively, one cannot have a

bounded infinite set without an accumulation point because one runs out of places

to put the infinite number of points. If the points of a set are not allowed to bunch

up anywhere, then one will not be able to find room for infinitely many of the points

within a bounded interval.

There are several good strategies used to prove the Bolzano-Weierstrass Theorem, and two of those strategies are presented here. Of course, one only needs one good strategy to prove a theorem, but these proofs are instructive and use techniques commonly employed in Analysis proofs. One begins each proof with a statement about the set $A$ being an infinite bounded set. Since $A$ is a bounded set, it will have a lower bound, $a$, and an upper bound, $b$, showing that $A \subseteq [a, b]$. The first strategy is to construct the set $S = \{x \ge a \mid [a, x] \cap A \text{ is finite}\}$, that is, a value $x \ge a$ is in the set $S$ if there are finitely many elements of $A$ which fall in the interval $[a, x]$. First observe that the set $S$ is an interval. This follows because if $y \in S$, then $[a, y] \cap A$ is finite, so if $x$ is between $a$ and $y$, then $[a, x] \cap A \subseteq [a, y] \cap A$ must also be finite, and $x \in S$. The next observation is that $S$ is not empty because the point $a$, whether or not it is in $A$, is in $S$ since $[a, a] \cap A$ contains at most one point, so it is finite. Since $[a, b] \cap A = A$ is an infinite set, the set $S$ is bounded above by $b$. The Completeness Axiom now shows that $S$ must have a least upper bound, $p$. It will follow that $p$ is an accumulation point of $A$ because for all $\epsilon > 0$, the set $A$ will have only finitely many elements less than $p - \epsilon$ but infinitely many elements less than $p + \epsilon$ implying that there are infinitely many elements of $A$ within $\epsilon$ of $p$. Here is the complete proof.

PROOF: Every infinite bounded set of real numbers has an accumulation point.

Let $A$ be an infinite bounded set of real numbers.

Because $A$ is bounded, it has a lower bound, $a$, and an upper bound, $b$, showing that $A \subseteq [a, b]$.

Define set $S = \{x \ge a \mid [a, x] \cap A \text{ is finite}\}$.

Note that $a \in S$ since $[a, a] \cap A$ is finite, so $S$ is nonempty.

Note that if $z \ge b$, then $[a, z] \cap A = A$ is an infinite set, so $z \notin S$ showing that $S$ is bounded above by $b$.

By the Completeness Axiom, $S$ has a least upper bound, $p$.

Given $\epsilon > 0$, $p - \epsilon < p$ so $p - \epsilon$ is not an upper bound of $S$. Hence, there is a $y \in S$ with $y > p - \epsilon$. It follows that there are only finitely many elements of $A$ less than or equal to $y$.

Also, $p + \epsilon > p$, so $p + \epsilon \notin S$. It follows that $[a, p + \epsilon] \cap A$ is infinite.

Thus, there must be infinitely many elements of $A$ between $p - \epsilon$ and $p + \epsilon$, and there must be an element of $A$ not equal to $p$ within $\epsilon$ of $p$.

This shows that $p$ is an accumulation point of $A$.

The second strategy also begins with the interval $[a, b]$ that contains the infinite bounded set, $A$. One can rename the end points of this interval to be $a_1 = a$ and $b_1 = b$. Since $[a_1, b_1] \cap A = A$ is infinite, it follows that either $\left[a_1, \frac{a_1+b_1}{2}\right] \cap A$ or $\left[\frac{a_1+b_1}{2}, b_1\right] \cap A$ is an infinite set. If $\left[a_1, \frac{a_1+b_1}{2}\right] \cap A$ is infinite, define $a_2 = a_1$ and $b_2 = \frac{a_1+b_1}{2}$. Otherwise, define $a_2 = \frac{a_1+b_1}{2}$ and $b_2 = b_1$. In either case, $[a_2, b_2] \cap A$ is an infinite set. This procedure can be repeated so that for every $n \in \mathbb{N}$, one gets an interval $[a_n, b_n]$ where $[a_n, b_n] \cap A$ is infinite, and each interval is half the length of the previous interval. Also, the sequence of left endpoints, $\langle a_n \rangle$, is a monotone increasing sequence bounded above by $b$, and the sequence of right endpoints, $\langle b_n \rangle$, is a monotone decreasing sequence bounded below by $a$. Thus, both of these sequences converge. In fact, both of these sequences must converge to the same limit, $p$. This follows because the distances between the terms of the sequences, $b_n - a_n$, keep getting smaller and converge to 0. Given an $\epsilon > 0$, it will follow that there is an $n$ such that $a_n$ and $b_n$ are both within $\epsilon$ of $p$. Thus, $(p - \epsilon, p + \epsilon) \cap A$ contains $[a_n, b_n] \cap A$ which is infinite. Here is the complete proof.

PROOF: Every infinite bounded set of real numbers has an accumulation point.

Let $A$ be an infinite bounded set of real numbers.

Because $A$ is bounded, it has a lower bound, $a_1$, and an upper bound, $b_1$, showing that $A \subseteq [a_1, b_1]$ and $[a_1, b_1] \cap A$ is an infinite set.

Define sequences $\langle a_n \rangle$ and $\langle b_n \rangle$ recursively as follows.

Suppose, for natural number $n$, $a_n$ and $b_n$ have been defined so that $[a_n, b_n] \cap A$ is an infinite set. If $\left[a_n, \frac{a_n+b_n}{2}\right] \cap A$ is infinite, then define $a_{n+1} = a_n$ and $b_{n+1} = \frac{a_n+b_n}{2}$. Otherwise, define $a_{n+1} = \frac{a_n+b_n}{2}$ and $b_{n+1} = b_n$. In either case, $[a_{n+1}, b_{n+1}] \cap A$ is an infinite set.

By the way the sequences are constructed, for each $n$ it follows that $a_n \le a_{n+1} < b_{n+1} \le b_n$ showing that $\langle a_n \rangle$ is a monotone increasing sequence bounded above by each $b_i$, and $\langle b_n \rangle$ is a monotone decreasing sequence bounded below by each $a_i$.

Also, by the way the sequences are constructed, for each $n$, $b_n - a_n = \frac{b_1 - a_1}{2^{n-1}}$.

Thus, the bounded monotone sequence $\langle a_n \rangle$ must converge to a number $p_a$, and the bounded monotone sequence $\langle b_n \rangle$ must converge to a number $p_b$. But $0 \le p_b - p_a \le b_n - a_n = \frac{b_1 - a_1}{2^{n-1}}$ for every $n$ and, therefore, $p_b - p_a$ must be zero. Let $p = p_a = p_b$, and note that for each $n$, $p \in [a_n, b_n]$.

Given $\epsilon > 0$, select a natural number $n$ such that $\frac{b_1 - a_1}{2^{n-1}} < \epsilon$. Then $p - \epsilon < a_n \le p \le b_n < p + \epsilon$. Hence, $(p - \epsilon, p + \epsilon) \cap A$ contains the infinite set $[a_n, b_n] \cap A$, showing that there is an element of $A$ not equal to $p$ but within $\epsilon$ of $p$.

This shows that $p$ is an accumulation point of $A$.
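The bisection construction can be traced on a concrete set. The following is a minimal sketch, assuming the example set $A = \{1/n \mid n \in \mathbb{N}\}$ from earlier in this section and the starting interval $[-1, 1]$. A program cannot test infiniteness of an arbitrary set, so the function `infinitely_many` is a hand-written oracle valid for this particular $A$ only: since $1/n$ decreases to 0, the interval $[lo, hi]$ contains infinitely many elements of $A$ exactly when $lo \le 0 < hi$.

```python
from fractions import Fraction


def infinitely_many(lo: Fraction, hi: Fraction) -> bool:
    """Oracle for A = {1/n}: is [lo, hi] ∩ A infinite? (valid for this A only)"""
    return lo <= 0 < hi


def bisect(a: Fraction, b: Fraction, steps: int) -> tuple[Fraction, Fraction]:
    """Halve [a, b] repeatedly, always keeping a half that meets A infinitely often."""
    assert infinitely_many(a, b)
    for _ in range(steps):
        mid = (a + b) / 2
        if infinitely_many(a, mid):
            b = mid          # keep the left half
        else:
            a = mid          # otherwise the right half must meet A infinitely often
    return a, b


a_n, b_n = bisect(Fraction(-1), Fraction(1), 11)
# After 11 halvings the interval has length (1 - (-1)) / 2**11 = 1/1024
# and still contains the accumulation point 0.
assert b_n - a_n == Fraction(2, 2 ** 11)
assert a_n <= 0 <= b_n
```

The endpoints close in on 0, which is the accumulation point of $A$ identified earlier.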

You now have the machinery necessary to prove the result mentioned in Sect. 3.6 that all Cauchy sequences converge. The difficulty in proving this result earlier was that given a Cauchy sequence $\langle a_n \rangle$, it was not clear what real number would play the role of the limit $L$ of the sequence. Now, the Bolzano-Weierstrass Theorem can provide an accumulation point to serve as this limit. There are two cases to consider. If the set of values in the sequence, $\{a_n\}$, is a finite set, then for the sequence to be Cauchy, the sequence will necessarily need to be constant from some point on, and, therefore, the sequence will converge. If the set of values in the sequence is infinite, then since all Cauchy sequences are bounded, the set of values in the sequence will be bounded and will have to have an accumulation point. It is then straightforward to show that the sequence converges to this accumulation point.

PROOF: All Cauchy sequences converge.

Let $\langle a_n \rangle$ be a Cauchy sequence. Let $A$ be the set of terms of the sequence, $\{a_n\}$.

CASE 1: The set $A$ is finite. If $A$ contains only one value, then the sequence is constant and converges to that constant. If $A$ contains more than one value, then, since the range of the sequence is finite, so is the set of differences $a_n - a_m$ of values in the sequence. Let $d$ be the smallest positive difference between any two values in the sequence, and let $\epsilon = \frac{d}{2}$. Because the sequence is Cauchy, there is an $N$ such that whenever $m, n > N$, the difference $|a_m - a_n| < \epsilon$. But the smallest positive difference between any two terms of the sequence is $d > \epsilon$, so it follows that $a_m - a_n = 0$. Thus, the sequence is constant for all terms $a_n$ with $n > N$, and, again, the sequence must converge.

CASE 2: The set $A$ is infinite. Since all Cauchy sequences are bounded, $A$ is a bounded infinite set, and thus, by the Bolzano-Weierstrass Theorem, $A$ has an accumulation point, $p$. Because $\langle a_n \rangle$ is Cauchy, given $\epsilon > 0$ there is an $N$ such that for all $m, n > N$, $|a_m - a_n| < \frac{\epsilon}{2}$. Also, since $p$ is an accumulation point of $A$, there are infinitely many values of $A$ within $\frac{\epsilon}{2}$ of $p$. Surely there is a natural number $k > N$ such that $|a_k - p| < \frac{\epsilon}{2}$. Then, for all $n > N$, it follows that $|a_n - p| = |(a_n - a_k) - (p - a_k)| \le |a_n - a_k| + |a_k - p| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$. Thus, the sequence converges to $p$.

Therefore, all Cauchy sequences must converge.

Up to this point, the discussion of the limit $\lim_{x \to a} f(x)$ took place only for those [...] limit can now be extended. It should not be required that the function $f$ be defined for all $x$ in an open interval containing $a$ but that $f$ be defined at enough points so that it makes sense to allow $x$ to approach $a$. In other words, $a$ only needs to be an accumulation point of the domain of $f$. Note that if $a$ is not an accumulation point of the domain of $f$, then there will be an open interval containing $a$ where $f$ is not defined (except perhaps at $a$ itself). Thus, no sense could be made out of "$x$ approaches $a$." On the other hand, if $a$ is an accumulation point of the domain of $f$, it makes sense to define the limit of $f$ at $a$ to be $L$ or $\lim_{x \to a} f(x) = L$ to mean that for all $\epsilon > 0$ there is a $\delta > 0$ such that for all $x$ in the domain of $f$, $0 < |x - a| < \delta$ implies $|f(x) - L| < \epsilon$.

Similarly, to define $\lim_{x \to \infty} f(x) = L$ one does not need $f$ to be defined in an entire interval stretching to positive infinity. It is sufficient that $f(x)$ is defined for arbitrarily large values of $x$ so that $x$ can be allowed to approach infinity. One way of saying this is that the domain of $f$ should be unbounded above. This is what was done, for example, when defining the limit of a sequence which is the limit of a function defined for the natural numbers only. Similarly, $\lim_{x \to -\infty} f(x) = L$ can be defined for $f$ when the domain of $f$ is unbounded below.

3.7.1 Exercises

1. Write a definition for $\lim_{x \to a^+} f(x)$ where $a$ is an accumulation point of the domain of $f$.

Identify the accumulation points, if any, of the following sets.

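2. $\left\{ \frac{n}{n+2} \mid n \in \mathbb{N} \right\}$

3. $\left\{ x \in \mathbb{Q} \mid x^2 < 2 \right\}$

4. $\left\{ \frac{1}{2}, 1, \frac{3}{2}, 2, \frac{5}{2}, \ldots \right\}$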

m 2

5. 2n m; n 2 N

n

o

n C4

n 2 N

6. 2n.1/

3nC5

$\lim_{x \to 5} \frac{x+3}{(x-5)^2}$ [...] this limit. The reason that the limit does not exist is that the function grows without bound and, therefore, does not approach any real number value. This behavior can be quantified by saying that the limit of the function is infinity. Of course, it does not make sense to say that the function is getting close to infinity, since no real number is very close to infinity. In the definition of $\lim_{x \to \infty} f(x)$ where it had to be made clear what $x$ approaching infinity meant, it was said that there was a number $N$ such that $|f(x) - L|$ was small whenever $x > N$. Similarly, to say that $f(x)$ approaches infinity, one needs to say that for any real number $M$, $f(x)$ can be made larger than $M$. Thus, if $a$ is an accumulation point of the domain of function $f(x)$, the following two similar definitions can be given.

The limit of $f$ at $a$ is infinity or $\lim_{x \to a} f(x) = \infty$ means that for every $M \in \mathbb{R}$ there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, then $f(x) > M$.

The limit of $f$ at $a$ is negative infinity or $\lim_{x \to a} f(x) = -\infty$ means that for every $M \in \mathbb{R}$ there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, then $f(x) < M$.

What if f .x/ approaches infinity or negative infinity as x is allowed to approach

either infinity or negative infinity? Each of these ideas can be accommodated

resulting in four similar definitions. Remember that the limit of f as x approaches

infinity makes sense only if the domain of f is unbounded above, and as x approaches

negative infinity only if the domain of f is unbounded below. Here are the four

definitions.

The limit of $f$ as $x$ approaches infinity is infinity or $\lim_{x \to \infty} f(x) = \infty$ means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of $f$ with $x > N$, then $f(x) > M$.

The limit of $f$ as $x$ approaches infinity is negative infinity or $\lim_{x \to \infty} f(x) = -\infty$ means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of $f$ with $x > N$, then $f(x) < M$.

The limit of $f$ as $x$ approaches negative infinity is infinity or $\lim_{x \to -\infty} f(x) = \infty$ means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of $f$ with $x < N$, then $f(x) > M$.

The limit of $f$ as $x$ approaches negative infinity is negative infinity or $\lim_{x \to -\infty} f(x) = -\infty$ means that for every $M \in \mathbb{R}$ there is an $N \in \mathbb{R}$ such that if $x$ is in the domain of $f$ with $x < N$, then $f(x) < M$.

[...] $\lim_{x \to 5} \frac{x+3}{(x-5)^2}$ [...] $\frac{x+3}{(x-5)^2}$ [...] of 5 with $x \ne 5$. Working backwards, you would start with $\frac{x+3}{(x-5)^2} > M$. This is a complicated inequality with which to work, so it would be more convenient to work with an inequality that is easier to solve. If you want $f(x) = \frac{x+3}{(x-5)^2}$ to be bigger than $M$, it would be sufficient to make some fraction smaller than $f(x)$ bigger than $M$. For example, for all $x > -2$ the fraction $\frac{1}{(x-5)^2}$ is smaller than $f(x)$. Moreover, for $x$ within 1 of 5, $\frac{1}{|x-5|}$ is smaller than $\frac{1}{(x-5)^2}$. Thus, for positive $M$ it suffices to make $\frac{1}{|x-5|} > M$, that is, $|x - 5| < \frac{1}{M}$.

A proof would need to take care of the embarrassing case of $M \le 0$, perhaps by making $\delta = \frac{1}{|M|+1}$ since $|M| + 1$ is always bigger than $M$ and is always positive. Another way to handle this is to write a proof that assumes that $M$ is positive. In fact, one could just stipulate that $M > 1$ by inserting the often used phrase "without loss of generality." This phrase means that even though a restriction is being placed on one of the assumptions in the proof, if one can complete the proof using this restriction, then it would be very easy to give a proof without the restriction. In this case, if it is assumed that $M > 1$, one could just as easily have handled cases where $M \le 1$ by finding a $\delta > 0$ that ensured $f(x) > 1 \ge M$, so being able to produce a proof that works for $M > 1$ does provide a proof for $M \le 1$. The phrase "without loss of generality" is used so frequently that many authors abbreviate it as WLOG. These ideas give the following proof.

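PROOF: $\lim_{x \to 5} \frac{x+3}{(x-5)^2} = \infty$

Let $f(x) = \frac{x+3}{(x-5)^2}$.

Let $M \in \mathbb{R}$ be given. Without loss of generality, assume that $M > 1$.

Let $\delta = \frac{1}{M} > 0$. Note that $\delta < 1$.

If $0 < |x - 5| < \delta$, then since $|x - 5| < 1$, it follows that $|x - 5| > (x - 5)^2$. Also, since $x > 4$, it follows that $x + 3 > 7 > 1$.

Then $f(x) = \frac{x+3}{(x-5)^2} > \frac{1}{(x-5)^2} > \frac{1}{|x-5|} > \frac{1}{\delta} = M$.

Therefore, $\lim_{x \to 5} \frac{x+3}{(x-5)^2} = \infty$.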
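The choice $\delta = \frac{1}{M}$ can be checked on sample points with exact arithmetic. The following is a minimal sketch, assuming a few sample values of $M$ and a few sample points on each side of 5. The helper `delta_for` is hypothetical; it handles $M \le 1$ by substituting the larger bound 2, which is one way to realize the "without loss of generality" step.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    return (x + 3) / (x - 5) ** 2


def delta_for(M: Fraction) -> Fraction:
    """Return delta = 1/M after replacing M by max(M, 2), so the bound used exceeds 1."""
    return 1 / max(Fraction(M), Fraction(2))


# Sample bounds M and sample points within delta of 5 (assumed for illustration).
for M in (Fraction(-7), Fraction(1, 2), Fraction(2), Fraction(100), Fraction(10 ** 6)):
    delta = delta_for(M)
    for t in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
        for sign in (1, -1):
            x = 5 + sign * t * delta
            assert 0 < abs(x - 5) < delta
            assert f(x) > M
```

Only finitely many points are tested here; the inequality chain in the proof is what covers every $x$ with $0 < |x - 5| < \delta$.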

3.8.1 Exercises

Write a proof of each of the following infinite limits.

x

2 D 1

x!4 .x4/

2

lim x 5x D 1

x!1

lim x2 D 1

x!0 jxj

1. lim

2.

3.

4.

lim 10

x!1

x

x!2C x2

5. lim

4 x D 1

D1

The fact that the limits of some functions are easy to prove hides the fact that there

are some limits whose validity is considerably more difficult to prove. Fortunately,

the limits of most arithmetic combinations of functions work as expected due to the

behavior of the arithmetic operations of addition, subtraction, multiplication, and

division. In the words of the next chapter, these operations behave well because

they are themselves continuous functions of their arguments. That is, for example,

the function of two variables f .x; y/ D x C y is a continuous function of x and y.

That continuity allows you to prove the following theorem.

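THEOREM: Suppose that $f$ and $g$ are functions both defined on a set with accumulation point $a$. Let $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$. Then

1. $\lim_{x \to a} \left(f(x) + g(x)\right) = L + H$,

2. $\lim_{x \to a} \left(f(x) - g(x)\right) = L - H$,

3. $\lim_{x \to a} f(x)g(x) = LH$, and

4. if $H \ne 0$, $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.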

Consider how to prove each part of the above theorem. In each case you will need

to prove the validity of a limit, so the proof can follow the usual proof template for

establishing a limit. These proofs differ from limit proofs found earlier in the chapter

in that you know less about the functions whose limits you are trying to establish.

On the other hand, you do know that the limits of the functions f and g exist, and

that gives you a lot of tools with which to work.


So what needs to be done to prove that the limit of the sum of two functions is the sum of their respective limits? As with all limit proofs, the proof will begin with a statement about what is being assumed about two functions $f$ and $g$. In this case, that would essentially be a restatement of the hypothesis of the theorem that says that the limits of $f$ and $g$ at $a$ are $L$ and $H$, respectively. The second step of the proof would be to say "Let $\varepsilon > 0$ be given" which sets the tolerance to be met by the proof. You know that the end of the proof will need to show that the function in question, $f(x) + g(x)$, is within $\varepsilon$ of the proposed limit, $L + H$. In other words, you will need to establish $\big|\big(f(x) + g(x)\big) - (L + H)\big| < \varepsilon$. Clearly, this inequality will depend on properties of the functions $f$ and $g$. But you know very little about these functions. Actually, knowing very little about the functions makes your job easier. All you know about these functions is that $f$ has $L$ for a limit, and $g$ has $H$ for a limit. This means that your proof can only use these two facts.

Because these two limits exist, you will be able to set up conditions that ensure that $|f(x) - L|$ and $|g(x) - H|$ are small. How does this help? It helps because the triangle inequality will allow you to show that the expression $\big|\big(f(x) + g(x)\big) - (L + H)\big|$ is no bigger than the sum of the two small quantities $|f(x) - L|$ and $|g(x) - H|$. That is, $\big|\big(f(x) + g(x)\big) - (L + H)\big| = \big|(f(x) - L) + (g(x) - H)\big| \le |f(x) - L| + |g(x) - H|$. For example, if both $|f(x) - L|$ and $|g(x) - H|$ can be made less than $\frac{\varepsilon}{2}$, then their sum will be less than $\varepsilon$, and the value of $\big|\big(f(x) + g(x)\big) - (L + H)\big|$ will, in turn, be less than $\varepsilon$, as desired.

How can you arrange for $|f(x) - L|$ and $|g(x) - H|$ both to be less than $\frac{\varepsilon}{2}$? You are given that the limits of $f$ and $g$ are $L$ and $H$, respectively, so, by the definition of limit, you can arrange for each of these quantities to be smaller than any given positive value, such as $\frac{\varepsilon}{2}$, with appropriate choices of $\delta > 0$. The only subtlety here is that the value of $\delta > 0$ needed to assure that $|f(x) - L|$ is less than $\frac{\varepsilon}{2}$ cannot be assumed to be the same value as the $\delta > 0$ needed to assure that $|g(x) - H|$ is less than $\frac{\varepsilon}{2}$. Thus, two different values of $\delta$ should be chosen, and then the minimum of those two will be small enough to guarantee both of the needed inequalities.

Thus, after the proof proposes a given $\varepsilon > 0$, it can produce a $\delta_1 > 0$ small enough so that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L|$ will be less than $\frac{\varepsilon}{2}$. The existence of this $\delta_1$ comes from the definition of $\lim_{x \to a} f(x) = L$. Similarly, the proof can produce a $\delta_2 > 0$ coming from the definition of $\lim_{x \to a} g(x) = H$ such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_2$, then $|g(x) - H|$ will be less than $\frac{\varepsilon}{2}$. The proof then easily follows as described above.

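PROOF: Suppose that $f$ and $g$ are functions both defined on a set with accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$, then $\lim_{x \to a} \big(f(x) + g(x)\big) = L + H$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$ with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$.

Let $\varepsilon > 0$ be given.

By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\varepsilon}{2}$.

Similarly, there is a $\delta_2 > 0$ such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_2$, then $|g(x) - H| < \frac{\varepsilon}{2}$.

Let $\delta = \min(\delta_1, \delta_2) > 0$.

Then if $x$ is in the domain of $f + g$ with $0 < |x - a| < \delta$, $\big|\big(f(x) + g(x)\big) - (L + H)\big| = \big|(f(x) - L) + (g(x) - H)\big| \le |f(x) - L| + |g(x) - H| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.

This shows that $\lim_{x \to a} \big(f(x) + g(x)\big) = L + H$.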
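To see the "minimum of two deltas" step in action, here is a small numerical sketch for one concrete pair of functions. The choices $f(x) = x^2$, $g(x) = 3x$, $a = 2$, and the helper names are assumptions made for illustration only; the deltas are worked out by hand for these particular functions and are not part of the general proof.

```python
A, L, H = 2.0, 4.0, 6.0  # a = 2, lim f = 4, lim g = 6


def f(x):
    return x * x


def g(x):
    return 3 * x


def delta_f(tol):
    """If |x - 2| < min(1, tol/5) then |x + 2| < 5, so |x^2 - 4| < tol."""
    return min(1.0, tol / 5.0)


def delta_g(tol):
    """If |x - 2| < tol/3 then |3x - 6| < tol."""
    return tol / 3.0


def delta_sum(eps):
    """The proof's choice: the smaller of the deltas for tolerance eps/2."""
    return min(delta_f(eps / 2), delta_g(eps / 2))


def check_sum(eps, samples=1000):
    """Sample points with 0 < |x - a| < delta and test the sum inequality."""
    d = delta_sum(eps)
    for i in range(1, samples + 1):
        h = d * i / (samples + 1)
        for x in (A + h, A - h):
            if not abs((f(x) + g(x)) - (L + H)) < eps:
                return False
    return True
```

Taking the minimum matters because either delta alone controls only one of the two functions.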

A proof that the limit of the difference $f(x) - g(x)$ equals the difference of the individual limits, $L - H$, is very similar to the above proof and is left as an exercise.

Proving that the limit of the product $f(x)g(x)$ equals the product of the individual limits, $LH$, uses the same techniques as the proof for the limit of a sum but has an added complexity requiring the use of a commonly used trick. The proof of $\lim_{x \to a} f(x)g(x) = LH$ follows the usual template for proving the existence of a limit. Its goal is to establish the inequality $|f(x)g(x) - LH| < \varepsilon$. Again, you can use the definition of limit to make $|f(x) - L|$ and $|g(x) - H|$ as small as you need, but how small these have to be to ensure that $|f(x)g(x) - LH|$ is less than $\varepsilon$ is not immediately obvious. The problem is that it is difficult to gauge how close $f(x)g(x)$ is to $LH$ when you know that $f(x)$ is close to $L$, and $g(x)$ is close to $H$. The difficulty stems from having to move from $f(x)g(x)$ to $LH$, where $f(x)$ changes to $L$ and $g(x)$ changes to $H$ at the same time. If only one of these two changes were made, then it might be easier to make the needed estimate. That is, it would be easier to work with an expression like $f(x)g(x) - f(x)H$ than with $f(x)g(x) - LH$.

Of course, $f(x)g(x) - LH$ is not the same as $f(x)g(x) - f(x)H$, so one cannot just use $f(x)g(x) - f(x)H$ in place of $f(x)g(x) - LH$. Sometimes, though, it is worth replacing one expression with another expression that is easier to handle, and then adjusting the second expression to make it equivalent to the first. In this case, the change can be accomplished by employing one of the oldest tricks used in mathematical proofs, that of adding and subtracting the same quantity. In particular, you can rewrite $|f(x)g(x) - LH|$ as $|f(x)g(x) - f(x)H + f(x)H - LH|$. The advantage of doing this is that now you can see how the difference between $f(x)g(x)$ and $LH$ depends on the differences between $f(x)$ and $L$ and $g(x)$ and $H$. Indeed, $|f(x)g(x) - LH| = |f(x)g(x) - f(x)H + f(x)H - LH| = \big|f(x)\big(g(x) - H\big) + H\big(f(x) - L\big)\big| \le |f(x)| \cdot |g(x) - H| + |H| \cdot |f(x) - L|$. If each of the two terms, $|f(x)| \cdot |g(x) - H|$ and $|H| \cdot |f(x) - L|$, can be made smaller than $\frac{\varepsilon}{2}$, then it will have been shown that $|f(x)g(x) - LH|$ is less than $\varepsilon$ as needed.

So how small does $|f(x) - L|$ need to be to ensure that $|H| \cdot |f(x) - L|$ is less than $\frac{\varepsilon}{2}$? Less than $\frac{\varepsilon}{2|H|}$ appears to be small enough, although one needs to handle the embarrassing situation where $H = 0$. You could handle $H = 0$ and $H \neq 0$ as two separate cases, or you can take care of both cases at once by making $|f(x) - L|$ less than $\frac{\varepsilon}{2(|H| + 1)}$ since $|H| + 1$ is larger than $|H|$ and can never be 0. Thus, you can find a $\delta_1 > 0$ so that if $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\varepsilon}{2(|H| + 1)}$.

How small does $|g(x) - H|$ need to be to ensure that $|f(x)| \cdot |g(x) - H|$ is less than $\frac{\varepsilon}{2}$? It would be nice to say that $|g(x) - H| < \frac{\varepsilon}{2|f(x)|}$ suggesting that you set $\delta$ small enough to ensure $|g(x) - H| < \frac{\varepsilon}{2|f(x)|}$, but there is a problem here. The definition of limit requires that the choice of $\delta$ come before the choice of $x$, so you cannot have the value of $\delta$ depending on $x$. What is needed is an upper bound for $|f(x)|$ because, if $|f(x)| \le M$, the value of $\delta$ can be found to ensure $|g(x) - H| < \frac{\varepsilon}{2M}$ which will always be small enough to guarantee $|f(x)| \cdot |g(x) - H| < \frac{\varepsilon}{2}$. You can find such an upper bound for $|f(x)|$ because the limit of $f(x)$ exists as $x$ approaches $a$, and so $|f(x)|$ can be restricted to being not much larger than $|L|$. You could, for example, find $\delta_2 > 0$ so that if $0 < |x - a| < \delta_2$, then $|f(x) - L| < 1$. This would ensure that $f(x)$ is a distance of no more than 1 from $L$ so that $|f(x)| < |L| + 1$. Then you would only need $|g(x) - H| < \frac{\varepsilon}{2(|L| + 1)}$ to get $|f(x)| \cdot |g(x) - H| < \frac{\varepsilon}{2}$. This gives you all the pieces needed to write the proof.

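PROOF: Suppose that $f$ and $g$ are functions both defined on a set with accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$, then $\lim_{x \to a} f(x)g(x) = LH$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$ with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$.

Let $\varepsilon > 0$ be given.

By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\varepsilon}{2(|H| + 1)}$.

By the definition of limit, there is a $\delta_2 > 0$ such that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_2$, then $|f(x) - L| < 1$. Then $|L| + 1 > |L| + |f(x) - L| \ge |L + (f(x) - L)| = |f(x)|$.

By the definition of limit, there is a $\delta_3 > 0$ such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_3$, then $|g(x) - H| < \frac{\varepsilon}{2(|L| + 1)}$.

Let $\delta = \min(\delta_1, \delta_2, \delta_3) > 0$.

Then if $x$ is in the domain of $fg$ with $0 < |x - a| < \delta$, $|f(x)g(x) - LH| = |f(x)g(x) - f(x)H + f(x)H - LH| = \big|f(x)\big(g(x) - H\big) + H\big(f(x) - L\big)\big| \le |f(x)| \cdot |g(x) - H| + |H| \cdot |f(x) - L| < (|L| + 1)\frac{\varepsilon}{2(|L| + 1)} + |H|\frac{\varepsilon}{2(|H| + 1)} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.

This shows that $\lim_{x \to a} f(x)g(x) = LH$.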
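The add-and-subtract estimate and the three tolerances used in the product proof can be exercised on random numbers. This is a hedged sketch: the function names are hypothetical, the values `fx` and `gx` stand in for $f(x)$ and $g(x)$ at a point already within $\delta$ of $a$, and a finite random sample illustrates the inequalities without proving them.

```python
import random


def product_bound_holds(fx, gx, L, H):
    """Check |fx*gx - L*H| <= |fx|*|gx - H| + |H|*|fx - L| for given numbers."""
    lhs = abs(fx * gx - L * H)
    rhs = abs(fx) * abs(gx - H) + abs(H) * abs(fx - L)
    return lhs <= rhs + 1e-12 * (1 + rhs)  # small slack for rounding error


def tolerances(eps, L, H):
    """Tolerances from the proof: for |f - L| (twice) and for |g - H|."""
    return eps / (2 * (abs(H) + 1)), 1.0, eps / (2 * (abs(L) + 1))


def simulate(eps, L, H, trials=10000, seed=0):
    """Draw fx, gx meeting all three tolerances and test |fx*gx - L*H| < eps."""
    rng = random.Random(seed)
    tol_f, tol_bound, tol_g = tolerances(eps, L, H)
    r_f = min(tol_f, tol_bound)
    for _ in range(trials):
        fx = L + 0.999 * rng.uniform(-r_f, r_f)
        gx = H + 0.999 * rng.uniform(-tol_g, tol_g)
        if not product_bound_holds(fx, gx, L, H):
            return False
        if not abs(fx * gx - L * H) < eps:
            return False
    return True
```

Using $|H| + 1$ and $|L| + 1$ in the denominators is what lets the same code handle `L = 0` or `H = 0` without a special case.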

Finally, the proof that the limit of a quotient is the quotient of the individual limits is much like the proof about the product of limits, although the algebra is more complicated. As in the preceding proof, you can start with the needed inequality which, in this case, is $\left|\frac{f(x)}{g(x)} - \frac{L}{H}\right| < \varepsilon$. Using the trick of adding and subtracting the same quantity, the left side of the inequality can be written as

$\left|\frac{f(x)}{g(x)} - \frac{L}{H}\right| = \left|\frac{f(x)H - Lg(x)}{g(x)H}\right| = \left|\frac{f(x)H - LH + LH - Lg(x)}{g(x)H}\right| = \left|\frac{\big(f(x) - L\big)H + L\big(H - g(x)\big)}{g(x)H}\right| \le \frac{|f(x) - L|}{|g(x)|} + \left|\frac{L\big(g(x) - H\big)}{g(x)H}\right|$.

Again, the goal will be to make each of the terms $\frac{|f(x) - L|}{|g(x)|}$ and $\left|\frac{L(g(x) - H)}{g(x)H}\right|$ less than $\frac{\varepsilon}{2}$.

Both of these terms have a factor of $|g(x)|$ in the denominator. To make the fractions small, you will need to know that $|g(x)|$ does not get too close to zero. What you do know is that $\lim_{x \to a} g(x) = H$ is not zero because the hypothesis of the theorem will make that assumption. How far away from zero can you require $|g(x)|$ to be? Certainly, this will depend on the value of $H$. If $H$ is close to zero, then $|g(x)|$ will be close to zero as $x$ approaches $a$. The best you can do is require that $g(x)$ be so close to $H$ that $|g(x)|$ will keep a known distance from zero. For example, you could require that $|g(x) - H|$ be less than $\frac{|H|}{2}$. That will ensure that $|g(x)|$ is at least $\frac{|H|}{2}$ which keeps it a known distance away from zero. So, select a $\delta_1 > 0$ such that if $x$ is in the domain of $g$ with $0 < |x - a| < \delta_1$, then $|g(x) - H| < \frac{|H|}{2}$, and $|g(x)|$ will be greater than $\frac{|H|}{2}$.

Now for these values of $x$ you will have $\frac{|f(x) - L|}{|g(x)|} < |f(x) - L| \cdot \frac{2}{|H|}$. Thus, it would be sufficient if $|f(x) - L|$ were to be less than $\frac{\varepsilon |H|}{4}$, since that gives $\frac{|f(x) - L|}{|g(x)|} < \frac{\varepsilon}{2}$. This can be done by choosing $\delta_2 > 0$ small enough so that $|f(x) - L|$ is less than $\frac{\varepsilon |H|}{4}$.

To make the $\left|\frac{L(g(x) - H)}{g(x)H}\right|$ term less than $\frac{\varepsilon}{2}$, it would appear sufficient to make $|g(x) - H|$ less than $\frac{\varepsilon H^2}{4|L|}$ because that will give $\left|\frac{L(g(x) - H)}{g(x)H}\right| < \frac{|L| \cdot \frac{\varepsilon H^2}{4|L|}}{|g(x)| \cdot |H|} < \frac{\varepsilon H^2}{4} \cdot \frac{2}{H^2} = \frac{\varepsilon}{2}$. Well, OK, did you catch that the preceding does not work if $L = 0$? To avoid this problem it would be better to make $|g(x) - H|$ less than $\frac{\varepsilon H^2}{4(|L| + 1)}$. Putting all of these ideas together gives the following proof.

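PROOF: Suppose that $f$ and $g$ are functions both defined on a set with accumulation point $a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H$ with $H \neq 0$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.

Let $f$ and $g$ be functions both defined on a set with accumulation point $a$ with $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = H \neq 0$.

Let $\varepsilon > 0$ be given.

By the definition of limit, there is a $\delta_1 > 0$ such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_1$, then $|g(x) - H| < \frac{|H|}{2}$.

For these $x$ it follows that $|g(x)| + \frac{|H|}{2} > |g(x)| + |g(x) - H| = |g(x)| + |H - g(x)| \ge \big|g(x) + \big(H - g(x)\big)\big| = |H|$ which implies that $|g(x)| > |H| - \frac{|H|}{2} = \frac{|H|}{2}$.

By the definition of limit, there is a $\delta_2 > 0$ such that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta_2$, then $|f(x) - L| < \frac{\varepsilon |H|}{4}$.

By the definition of limit, there is a $\delta_3 > 0$ such that if $x$ is in the domain of $g$ and $0 < |x - a| < \delta_3$, then $|g(x) - H| < \frac{\varepsilon H^2}{4(|L| + 1)}$.

Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.

Then if $x$ is in the domain of $\frac{f}{g}$ with $0 < |x - a| < \delta$,

$\left|\frac{f(x)}{g(x)} - \frac{L}{H}\right| = \left|\frac{f(x)H - Lg(x)}{g(x)H}\right| = \left|\frac{f(x)H - LH + LH - Lg(x)}{g(x)H}\right| = \left|\frac{\big(f(x) - L\big)H + L\big(H - g(x)\big)}{g(x)H}\right| \le \frac{|f(x) - L|}{|g(x)|} + \left|\frac{L\big(g(x) - H\big)}{g(x)H}\right| < \frac{\varepsilon |H|}{4} \cdot \frac{2}{|H|} + \frac{\varepsilon H^2}{4(|L| + 1)} \cdot \frac{2|L|}{H^2} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.

This shows that $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{H}$.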

As a demonstration of the power of these results about the arithmetic of limits, you can now prove the following list of results, which will allow you to easily calculate limits of polynomials and rational functions of $x$.

For any constant $c$ in the real numbers, $\lim_{x \to a} c = c$.

$\lim_{x \to a} x = a$.

For any $n$ in the natural numbers, $\lim_{x \to a} x^n = a^n$.

For any polynomial $p(x)$, $\lim_{x \to a} p(x) = p(a)$.

For any polynomials $p(x)$ and $q(x)$ with $q(a) \neq 0$, $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$.

The first two results are very easy to prove, and are left as exercises. The next two results can be proved by using mathematical induction, which is often the first technique one considers when trying to prove statements such as these that depend on a natural number. Here, mathematical induction will be employed to prove statements about the limits of polynomials, and the degree of the polynomial provides a natural number to use as the induction variable.

To begin with, try using mathematical induction to prove that $\lim_{x \to a} x^n = a^n$ for any natural number $n$. In this mathematical induction argument, the base case is $\lim_{x \to a} x = a$, that is, when $n = b = 1$. The proofs of statements similar to this base case were covered earlier. The induction step in the proof will need to show that if $\lim_{x \to a} x^k = a^k$ for some natural number $k$, then $\lim_{x \to a} x^{k+1} = a^{k+1}$. But $x^{k+1}$ is just the product $x^k \cdot x$, so this result follows immediately from the theorem about the limits of products. That leads to the following proof that uses the template for proofs by mathematical induction.

PROOF: $\lim_{x \to a} x^n = a^n$ for any natural number $n$.

SET THE CONTEXT: The statement will be proved for all natural numbers $n$ by mathematical induction on $n$.

PROVE S(b): When $n = 1$, the statement says that $\lim_{x \to a} x = a$ which has already been established.

STATE THE INDUCTION HYPOTHESIS: Assume that for some natural number $k$, $\lim_{x \to a} x^k = a^k$.

PERFORM THE INDUCTION STEP: Then since the limit of a product of two functions is the product of the two individual limits, it follows that $\lim_{x \to a} x^{k+1} = \lim_{x \to a} x^k \cdot x = \big(\lim_{x \to a} x^k\big)\big(\lim_{x \to a} x\big) = a^k \cdot a = a^{k+1}$. So the statement is true for $n = k + 1$.

STATE THE CONCLUSION: Therefore, by mathematical induction, $\lim_{x \to a} x^n = a^n$ is true for all natural numbers $n$.

Mathematical induction can again be employed to prove that for every polynomial, $p(x)$, $\lim_{x \to a} p(x) = p(a)$. As a reminder, a polynomial of degree $n$ is a function, $p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0$, where $c_0, c_1, \ldots, c_n$ are constants with $c_n \neq 0$. Previously it has been proved that $\lim_{x \to a} c_j = c_j$ and $\lim_{x \to a} x^j = a^j$, from which one gets that the limit of a monomial is $\lim_{x \to a} c_j x^j = c_j a^j$. A polynomial is just a sum of such monomials, so mathematical induction is a convenient tool for showing that this sum of an arbitrary number of monomials has the desired limit.

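PROOF: For any constants $c_0, c_1, c_2, \ldots, c_n$ and $a \in \mathbb{R}$, the polynomial $p(x) = c_n x^n + c_{n-1} x^{n-1} + c_{n-2} x^{n-2} + \cdots + c_1 x + c_0$ satisfies $\lim_{x \to a} p(x) = p(a)$.

SET THE CONTEXT: The statement will be proved by mathematical induction on the degree of the polynomial $n$.

PROVE S(b): $\lim_{x \to a} (c_1 x + c_0) = \big(\lim_{x \to a} c_1\big)\big(\lim_{x \to a} x\big) + \lim_{x \to a} c_0 = (c_1)(a) + c_0$ which shows that the statement is true for $n = 1$.

STATE THE INDUCTION HYPOTHESIS: Assume that for some natural number $k$, if $p(x) = c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0$, then $\lim_{x \to a} p(x) = p(a)$.

PERFORM THE INDUCTION STEP: Then for $p(x) = c_{k+1} x^{k+1} + c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0$, $\lim_{x \to a} \big(c_{k+1} x^{k+1} + c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0\big) = \big(\lim_{x \to a} c_{k+1}\big)\big(\lim_{x \to a} x^{k+1}\big) + \lim_{x \to a} \big(c_k x^k + c_{k-1} x^{k-1} + \cdots + c_1 x + c_0\big) = c_{k+1} a^{k+1} + c_k a^k + c_{k-1} a^{k-1} + \cdots + c_1 a + c_0 = p(a)$ which shows that the statement is true for $n = k + 1$.

STATE THE CONCLUSION: Therefore, by mathematical induction, $\lim_{x \to a} p(x) = p(a)$ is true for all polynomials $p(x)$.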

Recall that a rational function is just a ratio of polynomials, that is, if $p(x)$ and $q(x)$ are polynomials, then $\frac{p(x)}{q(x)}$ is a rational function. It is only a simple step to get the following theorem.

PROOF: For any polynomials $p$ and $q$ and $a \in \mathbb{R}$ where $q(a) \neq 0$, it follows that $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$.

1. Let $p$ and $q$ be polynomials, and let $a \in \mathbb{R}$ with $q(a) \neq 0$.

2. Because $p$ and $q$ are polynomials, $\lim_{x \to a} p(x) = p(a)$ and $\lim_{x \to a} q(x) = q(a)$.

3. Since the limit of the quotient is equal to the quotient of the individual limits when the limit of the denominator is not zero, it follows that $\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{\lim_{x \to a} p(x)}{\lim_{x \to a} q(x)} = \frac{p(a)}{q(a)}$.
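The practical upshot of the last two proofs is that limits of polynomials and rational functions can be found by substitution whenever $q(a) \neq 0$. The sketch below illustrates this numerically; the sample polynomials, the helper names `horner`, `rational`, and `gaps`, and the choice of test points $a + 10^{-k}$ are all assumptions made for the illustration.

```python
def horner(coeffs, x):
    """Evaluate c_n x^n + ... + c_1 x + c_0, with coeffs = [c_n, ..., c_1, c_0]."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def rational(p, q, x):
    """Evaluate p(x)/q(x); the caller must ensure q(x) != 0."""
    return horner(p, x) / horner(q, x)


def gaps(p, q, a, steps=6):
    """Return |p(x)/q(x) - p(a)/q(a)| for x = a + 10^-k, k = 1, ..., steps."""
    target = rational(p, q, a)
    return [abs(rational(p, q, a + 10.0 ** -k) - target)
            for k in range(1, steps + 1)]
```

For example, with `p = [1, 0, -2]` (that is, $x^2 - 2$), `q = [1, 1]` (that is, $x + 1$), and `a = 3`, the list returned by `gaps(p, q, 3.0)` shrinks toward zero as the sample points approach 3.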

It is time to note that even though all of these limit theorems concerned limits as

x approaches a, most can be extended to cover limits as x approaches a from the

left, as x approaches a from the right, as x approaches infinity, and as x approaches

negative infinity. In particular, most of the theorems apply to the limits of sequences.

Many of these statements can be found in the exercises.

3.9.6 Exercises

Write proofs of each of the following statements.

1. If f and g are defined in an open interval containing a, and if lim f .x/ D L and

x!a

x!aC

x!a

x!a

3. lim x D a.

x!a

4. If f and g have a common domain with lim f .x/ D L and lim g.x/ D H, then

x!a

x!a

x!a

5. If f and g have a common domain with lim f .x/ D L and lim g.x/ D H, then

x!aC

x!aC

x!aC

6. If f and g have a common domain with lim f .x/ D L and lim g.x/ D H, then

x!1

x!1

x!1

7. If <an > and <bn > are sequences with lim an D L and lim bn D H 0, then

an

n!1 bn

lim

n!1

L

.

H

x!1

x!0C

1

x

n!1

D L.

1

x!a f .x/L

lim f .x/CM D 1.

x!a f .x/CN

9. If f .x/ > L for all x a, then lim f .x/ D L if and only if lim

x!a

x!a

D 1.

This section discusses a few other useful results about limits. They provide an

interesting variety of proof strategies to consider.


What can you say about $\lim_{x \to a} f(x) = L$ if you know that $f(x) > 0$ for all $x$, or at least for all $x$ in an open interval containing $a$? Assuming that this limit exists, it is clear that the limit cannot be negative because, from the definition of limit, you know that $|f(x) - L|$ can be made as small as you like which would not be possible if $f(x)$ were always positive and $L$ were negative. But how would you prove this? The key lies in the inequality $|f(x) - L| < \varepsilon$ since, if $L$ were negative, you could choose $\varepsilon$ to be so small that the inequality could not hold. How small would $\varepsilon$ need to be? Well, the only thing you know about $f(x)$ is that it is positive, or, in other words, cannot be smaller than 0. At the same time, $L$ is negative which means that $f(x)$ and $L$ must be at least $|L|$ apart, noting that $|L| > 0$. So set $\varepsilon = |L|$. Then $|L| = \varepsilon > |f(x) - L| = f(x) + |L|$ which implies $f(x) < 0$ which is a contradiction. This leads to the following proof.

PROOF: Let $f$ be a function such that $f(x) > 0$ for all $x$ in the domain of $f$. If $\lim_{x \to a} f(x) = L$, then $L \ge 0$.

Suppose that $\lim_{x \to a} f(x) = L$ and that for all $x$, $f(x) > 0$.

Assume that $L < 0$.

By the definition of limit, there is a $\delta > 0$ such that for all $x$ in the domain of $f$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < -L$.

For these values of $x$ it must be that $-L > |f(x) - L| = f(x) - L$ implying that $0 > f(x)$ which contradicts the fact that $f(x)$ is always positive.

Therefore, it must hold that $L \ge 0$.

Similar statements can be made about the limits of functions f satisfying f .x/ > b

or f .x/ < b for all x where b is a constant real number. One can also extend this to

limits from the left, limits from the right, and limits to infinity and negative infinity.

Several of these possibilities have been left for the exercises.

There is nothing in the definition of $\lim_{x \to a} f(x) = L$ that a priori precludes $\lim_{x \to a} f(x) = M$ for some $M \neq L$. But, in fact, limits are unique, that is, the only way for the limit to be $L$ and the limit to be $M$ is for $L$ and $M$ to be equal. Intuitively, this should make sense. If the values of $f(x)$ are getting close to $L$, then they should not also be able to get close to a value distinct from $L$. So how can you prove this using nothing but the definition of limit as a tool?

The result can be proved by contradiction, that is, if you assume that the function $f$ has two distinct limits, $L$ and $M$, as $x$ approaches $a$, then this leads to a statement which must be false. Assuming that both limits exist, the definition of limit will allow you to force both $|f(x) - L| < \varepsilon$ and $|f(x) - M| < \varepsilon$ for any positive $\varepsilon$ that you choose. Why can't this happen? Well, if it did, you could get $\varepsilon + \varepsilon > |f(x) - L| + |f(x) - M| = |f(x) - L| + |M - f(x)| \ge |f(x) - L + M - f(x)| = |M - L|$. If $M \neq L$, then $|M - L|$ is a positive number, so if $\varepsilon$ is chosen less than or equal to $\frac{|M - L|}{2}$, it will be impossible to have $|M - L| < 2\varepsilon$ as guaranteed by the definition of limit. That gives you the following proof.

PROOF: If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} f(x) = M$, then $L = M$.

Suppose that $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} f(x) = M$.

Assume that $L \neq M$.

By the definition of limit, there is a $\delta_1 > 0$ such that for all $x$ in the domain of $f$ satisfying $0 < |x - a| < \delta_1$, it follows that $|f(x) - L| < \frac{|M - L|}{2}$.

By the definition of limit, there is a $\delta_2 > 0$ such that for all $x$ in the domain of $f$ satisfying $0 < |x - a| < \delta_2$, it follows that $|f(x) - M| < \frac{|M - L|}{2}$.

Let $x$ be in the domain of $f$ with $0 < |x - a| < \min(\delta_1, \delta_2)$.

Then $|M - L| = \frac{|M - L|}{2} + \frac{|M - L|}{2} > |f(x) - L| + |f(x) - M| = |f(x) - L| + |M - f(x)| \ge |f(x) - L + M - f(x)| = |M - L|$ showing that $|M - L| > |M - L|$ which is a contradiction.

Thus, $L \neq M$ must be false, and $L = M$.

The Squeezing Theorem, also known as the Sandwich Theorem or the Scrunch Theorem, says that if the values of $f(x)$ are always between $g(x)$ and $h(x)$, and if $g$ and $h$ both have the same limit, $L$, at $x = a$, then $f$ must also have limit $L$ at $a$. The proof of this is not hard once you write down everything that you know about the functions $f$, $g$, and $h$. So what do you know? You can assume that for every $x$, $g(x) \le f(x) \le h(x)$, and you can assume that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$. This means that for every $\varepsilon > 0$ there is a $\delta_1 > 0$ such that when $x$ satisfies $0 < |x - a| < \delta_1$, then $|g(x) - L| < \varepsilon$. Similarly, for that same $\varepsilon$, there is a $\delta_2 > 0$ such that when $x$ satisfies $0 < |x - a| < \delta_2$, then $|h(x) - L| < \varepsilon$. Thus, you can show for values of $x$ near $a$ that $g(x) \le f(x) \le h(x)$, $-\varepsilon < g(x) - L < \varepsilon$, and $-\varepsilon < h(x) - L < \varepsilon$. Putting these three sets of inequalities together shows that $-\varepsilon < g(x) - L \le f(x) - L \le h(x) - L < \varepsilon$ from which $|f(x) - L| < \varepsilon$ follows. This gives the following proof.

PROOF: Let $f$, $g$, and $h$ be functions with the same domain, and let $a$ be an accumulation point of that domain. Assume that for all $x$ in that domain $g(x) \le f(x) \le h(x)$, and that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$. Then $\lim_{x \to a} f(x) = L$.

Let $a$ be an accumulation point of the common domain of functions $f$, $g$, and $h$.

Also assume that $\lim_{x \to a} g(x) = \lim_{x \to a} h(x) = L$.

Finally, assume that $g(x) \le f(x) \le h(x)$ for all $x$ in the common domain of $f$, $g$, and $h$.

Let $\varepsilon > 0$ be given.

By the definition of limit, there is a $\delta_1 > 0$ such that for all $x$ in the domain of $g$ that satisfy $0 < |x - a| < \delta_1$, it follows that $|g(x) - L| < \varepsilon$.

By the definition of limit, there is a $\delta_2 > 0$ such that for all $x$ in the domain of $h$ that satisfy $0 < |x - a| < \delta_2$, it follows that $|h(x) - L| < \varepsilon$.

Then for all $x$ in the common domain of $f$, $g$, and $h$ satisfying $0 < |x - a| < \min(\delta_1, \delta_2)$, $|g(x) - L| < \varepsilon$ and $|h(x) - L| < \varepsilon$.

Thus, for those $x$, $-\varepsilon < g(x) - L \le f(x) - L \le h(x) - L < \varepsilon$ from which it follows that $|f(x) - L| < \varepsilon$.

Therefore, $\lim_{x \to a} f(x) = L$.
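As an illustration of the Squeezing Theorem, consider the standard example $f(x) = x^2 \sin(1/x)$ trapped between $g(x) = -x^2$ and $h(x) = x^2$ near $a = 0$, where both bounds have limit $L = 0$. This example and the helper names are assumptions chosen for the sketch; it samples finitely many points and so only illustrates the argument.

```python
import math


def g(x):
    return -x * x


def f(x):
    return x * x * math.sin(1.0 / x)


def h(x):
    return x * x


def delta_for(eps):
    """One delta serving both g and h: |x| < sqrt(eps) gives x^2 < eps."""
    return math.sqrt(eps)


def squeeze_check(eps, samples=1000):
    """Test g <= f <= h and |f(x) - 0| < eps at points with 0 < |x| < delta."""
    d = delta_for(eps)
    for i in range(1, samples + 1):
        t = d * i / (samples + 1)
        for x in (t, -t):
            if not (g(x) <= f(x) <= h(x)):
                return False
            if not abs(f(x)) < eps:
                return False
    return True
```

Here the same delta happens to work for both bounding functions; in general the proof takes the minimum of two separate deltas.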

If the sequence $\langle a_n \rangle$ converges to $L$, it means that the terms of the sequence are getting close to $L$. This should mean that the terms of any subsequence should also be getting close to $L$, and it is not hard to prove that every subsequence $\langle a_{n_j} \rangle$ of $\langle a_n \rangle$ has the same limit.

Given the fact that $\lim_{n \to \infty} a_n = L$ and given a subsequence $\langle a_{n_j} \rangle$, how do you use this to prove that the subsequence converges to $L$? What do you know about this subsequence? Only that there is a strictly increasing sequence of natural numbers, $\langle n_j \rangle$, that tells which terms of $\langle a_n \rangle$ are found in the subsequence. A nice property of a strictly increasing sequence of natural numbers, $\langle n_j \rangle$, is that for any natural number $j$, $n_j \ge j$. This can easily be proved by mathematical induction on $j$. Certainly, $n_1 \ge 1$ since $n_1$ is a natural number, so the claim is true for $j = 1$. If $n_k \ge k$ for some $k$, then because $\langle n_j \rangle$ is strictly increasing, $n_{k+1} \ge n_k + 1 \ge k + 1$ showing that if the claim is true for $k$, then it is true for $k + 1$. This proves the claim.

The definition of limit gives you that for any $\varepsilon > 0$ there is an $N$ such that if $j > N$, then $|a_j - L| < \varepsilon$. But since $n_j \ge j$, it follows that for all $j > N$, $n_j$ is also greater than $N$, so $|a_{n_j} - L| < \varepsilon$ as needed.

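PROOF: Let $\langle a_n \rangle$ be a sequence with $\lim_{n \to \infty} a_n = L$, and let $\langle a_{n_j} \rangle$ be any subsequence. Then $\lim_{j \to \infty} a_{n_j} = L$.

Let $\langle a_n \rangle$ be a sequence with $\lim_{n \to \infty} a_n = L$, and let $\langle a_{n_j} \rangle$ be any subsequence.

Let $\varepsilon > 0$ be given.

By the definition of limit, there is an $N$ such that for all $n > N$, $|a_n - L| < \varepsilon$.

By the definition of subsequence, $\langle n_j \rangle$ is a strictly increasing sequence of natural numbers and, as such, satisfies $n_j \ge j$ for all natural numbers $j$.

Thus, for all $j > N$, $n_j \ge j > N$ implies $|a_{n_j} - L| < \varepsilon$.

This proves that $\lim_{j \to \infty} a_{n_j} = L$.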
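The key fact $n_j \ge j$ and its consequence for subsequences can be checked on a concrete case. The sketch below assumes the sequence $a_n = 1/n$ with limit $L = 0$ and builds a random strictly increasing index sequence; all names are hypothetical and the check covers only finitely many terms.

```python
import math
import random


def random_increasing_indices(length, seed=0):
    """Build a strictly increasing list n_1 < n_2 < ... of natural numbers."""
    rng = random.Random(seed)
    indices, current = [], 0
    for _ in range(length):
        current += rng.randint(1, 5)  # each index exceeds the previous by >= 1
        indices.append(current)
    return indices


def indices_dominate_positions(indices):
    """Check n_j >= j for j = 1, 2, ..., using 1-based positions."""
    return all(n_j >= j for j, n_j in enumerate(indices, start=1))


def a(n):
    """Example sequence a_n = 1/n, which converges to 0."""
    return 1.0 / n


def subsequence_tail_ok(eps, indices):
    """With N = ceil(1/eps), check |a_{n_j} - 0| < eps for every j > N."""
    N = math.ceil(1.0 / eps)
    return all(abs(a(n_j)) < eps
               for j, n_j in enumerate(indices, start=1) if j > N)
```

The same $N$ that works for the full sequence works for the subsequence, which is exactly what the proof asserts.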

Of course the converse of this theorem is trivially true. That is, if all subsequences

of a given sequence converge, then the original sequence converges. This is trivial

since the original sequence is one of its subsequences.

3.10.5 Exercises

Write proofs of each of the following statements.

1. If $\lim_{x \to a} f(x) = L$ and $f(x) < b$ for all $x$, then $L \le b$.

x!1

3. If f .x/ > 0 for all x, then lim f .x/ cannot equal negative infinity.

x!a

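4. Suppose that sequences $\langle a_n \rangle$, $\langle b_n \rangle$, and $\langle c_n \rangle$ satisfy $a_n \le b_n \le c_n$ for every natural number $n$. If $\lim_{n \to \infty} a_n = \lim_{n \to \infty} c_n = L$, then $\lim_{n \to \infty} b_n = L$.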

Even when a limit does not exist, there is often something that can be said about the values that the function approaches. Consider, for example, the sequence $1, -1, 0, 1, -1, 0, 1, -1, 0, \ldots$ which just oscillates among the numbers $1$, $-1$, and $0$. This sequence does not have a limit, but it has subsequences that do converge. Some of its subsequences converge to $1$, some converge to $-1$, and some converge to $0$.

Now consider the function $f(x) = \frac{2x^2 \sin x}{x^2 + 1}$. The function $\frac{2x^2}{x^2 + 1}$ has a limit of 2 as $x$ goes to infinity, but $f(x)$ oscillates without approaching a limit. Some of its values do approach $2$, but other values approach $-2$ and every value in between. More precisely, for each $L \in [-2, 2]$, you can find sequences $\langle x_n \rangle$ where $\lim_{n \to \infty} x_n = \infty$ and $\lim_{n \to \infty} f(x_n) = L$.

So suppose that the function f is defined for positive real numbers. How might

f .x/ behave as x goes to infinity? f might diverge to infinity or minus infinity as

3

2x2

do f .x/ D x2 and f .x/ D x1x

2 C4 . It might have a finite limit as does x2 C1 . It might

sin x

. Finally, it

oscillate among values within some bounded range such as .3xC100/

xC10

might oscillate and be unbounded like x j sin xj.

Even when $f$ oscillates so that it does not have a finite or infinite limit, it is helpful to quantify which values the function $f(x)$ approaches repeatedly as $x$ grows. This can be done by considering the range of $f(x)$ when $x$ is restricted to an interval $(M, \infty)$, and then watching what happens to that range as $M$ gets large. For example, consider the function $f(x) = \frac{(3x + 100)\sin x}{x + 10}$ whose graph is shown in Fig. 3.10. The function $\frac{3x + 100}{x + 10} = 3 + \frac{70}{x + 10}$ is a decreasing function of $x$ for $x > 0$, so on the interval $(M, \infty)$, the function $f$ oscillates in a range bounded between $-\frac{3M + 100}{M + 10}$ and $\frac{3M + 100}{M + 10}$. What can be said about the sequence $\langle f(x_n) \rangle$ where $\langle x_n \rangle$ is a sequence with $\lim_{n \to \infty} x_n = \infty$? The values $f(x_n)$ are between $-\frac{3x_n + 100}{x_n + 10}$ and $\frac{3x_n + 100}{x_n + 10}$, so as $x_n$ gets large, $f(x_n)$ is forced to be inside or very near the interval $[-3, 3]$. Clearly, for no sequence $\langle x_n \rangle$ can $f(x_n)$ approach a limit outside of the interval $[-3, 3]$, but there are sequences for which $f(x_n)$ approaches $3$ and others for which $f(x_n)$ approaches $-3$ as shown in the figure.

Finding the greatest and least values to which $f(x_n)$ could converge is the idea behind the limit superior and limit inferior, often referred to simply as the lim sup and lim inf, respectively. In the example of $f(x) = \frac{(3x + 100)\sin x}{x + 10}$, the values of $-3$ and $3$ came from looking at the greatest lower bound and least upper bound of the set $\{f(x) \mid x > M\}$ and then letting $M$ go to infinity. In general, let $f$ be a function whose domain is unbounded above. For each real number $M$ let $A_M$ be the range of $f$ for $x > M$, that is, $A_M = \{f(x) \mid x \text{ is in the domain of } f \text{ with } x > M\}$. Then define $\limsup_{x \to \infty} f(x)$ to be $\lim_{M \to \infty} \sup A_M$. Similarly, define $\liminf_{x \to \infty} f(x)$ to be $\lim_{M \to \infty} \inf A_M$. Some books use the notation $\overline{\lim}$ and $\underline{\lim}$ for lim sup and lim inf, respectively.
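The behavior of $\sup A_M$ and $\inf A_M$ for the example $f(x) = \frac{(3x+100)\sin x}{x+10}$ can be approximated by sampling. The sketch below is only an approximation under stated assumptions: it replaces the unbounded interval $(M, \infty)$ by a finite window and a grid, which is reasonable here because the envelope $\frac{3x+100}{x+10}$ is decreasing, so the extreme values occur at the first oscillations past $M$. The function names, window width, and step size are illustrative choices.

```python
import math


def f(x):
    return (3 * x + 100) * math.sin(x) / (x + 10)


def envelope(M):
    """(3M + 100)/(M + 10), the bound on |f(x)| for x > M given in the text."""
    return (3 * M + 100) / (M + 10)


def sampled_sup_inf(M, width=50.0, step=0.001):
    """Approximate sup and inf of f on (M, M + width] by sampling a grid."""
    n = int(width / step)
    values = [f(M + step * i) for i in range(1, n + 1)]
    return max(values), min(values)
```

Evaluating `sampled_sup_inf` at increasing values of `M` shows the sampled supremum decreasing toward 3 and the sampled infimum increasing toward -3, consistent with a lim sup of 3 and a lim inf of -3.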

Fig. 3.10 Sequences approaching the lim sup and lim inf of $f(x) = \frac{(3x + 100)\sin x}{x + 10}$

so lim sup f .x/ D 1. If lim f .x/ D 1, then sup AM will also approach 1,

x!1

x!1

so lim sup f .x/ will be 1. Analogously, if f .x/ is unbounded below as x gets

x!1

large, lim inf f .x/ D 1, and if lim f .x/ D 1, then lim inf f .x/ D 1. Note

x!1

x!1

x!1

that since sup AM and inf AM are both monotone function of M, their limits always

exist, although they might be infinite limits. Thus, unlike lim f .x/, the values of

x!1

x!1

x!1

If $f(x)$ remains bounded as $x$ gets large, $\limsup_{x \to \infty} f(x)$ and $\liminf_{x \to \infty} f(x)$ are finite values. This means $\lim_{M \to \infty} \sup A_M$ is finite. For each natural number $n$, there must be an $x_n > n$ such that $f(x_n)$ is within, say, $\frac{1}{n}$ of $\sup A_n$. Then, $\langle x_n \rangle$ is a sequence that diverges to infinity such that for each $n$, $\sup A_n - \frac{1}{n} < f(x_n) \le \sup A_n$. By the Squeezing Theorem, $\lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} \sup A_n = \limsup_{x \to \infty} f(x)$. Similarly, there must be a sequence $\langle x_n \rangle$ diverging to infinity with $\lim_{n \to \infty} f(x_n) = \liminf_{x \to \infty} f(x)$. This means that there is a sequence such that $f$ converges to its limit superior on that sequence and another sequence such that $f$ converges to its limit inferior on it.

Consider the three examples: $\sin x$, $x|\sin x|$, and $x \sin x$. None of these functions has a limit as $x$ approaches infinity because each function oscillates and does not approach one particular value. On the other hand, in each case it is easy to see upper and lower bounds to the oscillations. The values of the function $\sin x$ clearly stay in the interval $[-1, 1]$. For each integer $n$, when $x = \left(2n + \frac{1}{2}\right)\pi$, the function $\sin x = 1 = \limsup_{x \to \infty} \sin x$ and when $x = \left(2n - \frac{1}{2}\right)\pi$, the function $\sin x = -1 = \liminf_{x \to \infty} \sin x$. The function $x|\sin x|$ is unbounded above but is nonnegative for positive $x$. Again, for integer $n$, when $x = \left(2n + \frac{1}{2}\right)\pi$, the function $x|\sin x| = x$ which goes to infinity, the lim sup of $x|\sin x|$, and when $x = n\pi$, the function $x|\sin x| = 0 = \liminf_{x \to \infty} x|\sin x|$. The function $x \sin x$ behaves similarly except, now, when $x = \left(2n - \frac{1}{2}\right)\pi$, the function $x \sin x$ is $-x$ which goes to negative infinity, the lim inf of $x \sin x$.

The limit superior and limit inferior can be defined for limits at points other than infinity. For example, if $a$ is an accumulation point of the domain of $f$, one can define the limit superior and limit inferior of $f(x)$ as $x$ approaches $a$. Rather than defining $A_M$ to be the values of $f(x)$ for $x > M$, which essentially contains the values of $f$ for $x$ restricted to an interval ending at infinity, one can define $A_\delta$ for any $\delta > 0$ as $A_\delta = \{f(x) \mid x \text{ is in the domain of } f \text{ with } 0 < |x - a| < \delta\}$ which contains the values of $f$ for $x$ restricted to an open interval containing $a$ with the point $a$ removed. Then $\limsup_{x \to a} f(x) = \lim_{\delta \to 0^+} \sup A_\delta$ and $\liminf_{x \to a} f(x) = \lim_{\delta \to 0^+} \inf A_\delta$.

These definitions of lim sup and lim inf have properties similar to the definitions of lim sup and lim inf at infinity. That is, $\sup A_\delta$ and $\inf A_\delta$ are both monotone in $\delta$, so their limits as $\delta$ goes to 0 always exist. Moreover, there is a sequence $\langle x_n \rangle$ where $\lim_{n \to \infty} x_n = a$ such that $\lim_{n \to \infty} f(x_n) = \limsup_{x \to a} f(x)$ and another such sequence such that $\lim_{n \to \infty} f(x_n) = \liminf_{x \to a} f(x)$. Similar definitions can be given for the limit inferior and limit superior as $x$ approaches $a$ from the right, as $x$ approaches $a$ from the left, and as $x$ approaches negative infinity.

The most important theorem concerning lim inf and lim sup is that $\lim_{x \to a} f(x) = L$ if and only if $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$. Notice first that this is a biconditional statement; that is, an "if and only if" statement. This requires that its proof have two parts: one that assumes $\lim_{x \to a} f(x) = L$ and proves $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$, and another that assumes $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$ and proves $\lim_{x \to a} f(x) = L$.

So, given $\lim_{x \to a} f(x) = L$, how can you conclude that $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$? What you know is that given $\varepsilon > 0$, there is a $\delta > 0$ such that for all $x$ in the domain of $f$ for which $0 < |x - a| < \delta$, you have $|f(x) - L| < \varepsilon$. But this means that for small $\delta > 0$, the supremum $\sup A_\delta$ and infimum $\inf A_\delta$ are both within $\varepsilon$ of $L$ and, therefore, $\sup A_\delta$ and $\inf A_\delta$ must both approach $L$ as $\delta$ decreases to 0.

Conversely, suppose that $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$. Note that for any $x \neq a$ in the domain of $f$, it follows that $f(x) \in A_{2|x - a|}$. Thus, $\inf A_{2|x - a|} \le f(x) \le \sup A_{2|x - a|}$ which implies that $\lim_{x \to a} \inf A_{2|x - a|} \le \lim_{x \to a} f(x) \le \lim_{x \to a} \sup A_{2|x - a|}$ from which it follows that $\lim_{x \to a} f(x) = L$.

PROOF: Let $a$ be an accumulation point of the domain of $f$. Then $\lim_{x \to a} f(x) = L$ if and only if $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$.

For any $\delta > 0$, define $A_\delta = \{ f(x) \mid x \text{ is in the domain of } f \text{ with } 0 < |x - a| < \delta \}$.

PART I: the limit equals L implies lim inf and lim sup equal L

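Assume that $\lim_{x \to a} f(x) = L$.

Let $\varepsilon > 0$ be given.

Then there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, then $|f(x) - L| < \varepsilon$.

This says that $\inf A_\delta \ge L - \varepsilon$ and $\sup A_\delta \le L + \varepsilon$.

Because $\varepsilon > 0$ was arbitrary, it follows that $\liminf_{x \to a} f(x) \ge L$ and $\limsup_{x \to a} f(x) \le L$. Since $\liminf_{x \to a} f(x) \le \limsup_{x \to a} f(x)$, this implies that $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$, completing the first part of the proof.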


PART II: lim inf and lim sup equal L implies that the limit equals L

Assume that $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$.

For any $x$ in the domain of $f$ with $x \ne a$, it follows that $\inf A_{2|x-a|} \le f(x) \le \sup A_{2|x-a|}$.

Because $\lim_{x \to a} \inf A_{2|x-a|} = \liminf_{x \to a} f(x) = L$, and $\lim_{x \to a} \sup A_{2|x-a|} = \limsup_{x \to a} f(x) = L$, the Squeezing Theorem shows that $\lim_{x \to a} f(x) = L$, which completes the proof.

As discussed earlier, this theorem holds even when $L = \infty$ or $-\infty$. It also holds for limits at $\infty$ and for one-sided limits.

3.11.1 Exercises

1. Write definitions for each of the following.

(a) $\liminf_{x \to a^+} f(x)$

x!a

x!1

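(a) $\liminf_{x \to 2} \begin{cases} x & x < 2 \\ \frac{1}{x-2} & x > 2 \end{cases}$

(b) $\limsup_{x \to 2} \begin{cases} x & x < 2 \\ \frac{1}{x-2} & x > 2 \end{cases}$

(c) $\liminf_{x \to 2} \begin{cases} x & x \text{ is rational} \\ 5 & x \text{ is irrational} \end{cases}$

(d) $\limsup_{n \to \infty} \frac{5}{4 + (-1)^n}$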

1

1C n

(e) lim sup

1

C

n.1/n

n!1


3. Prove that if a is any accumulation point of the domain of f , then lim inf f .x/

x!a

x!a

x!a

x!a

5. Suppose that lim inf f .x/ D L and lim inf g.x/ D M. What can you say about

x!a

x!a

x!a

6. Suppose that $f$ is a positive-valued function with $\limsup_{x \to a} f(x) = L > 0$. Prove that $\liminf_{x \to a} \frac{1}{f(x)} = \frac{1}{L}$.

Chapter 4

Continuity

As with the definition of limit, most Calculus students will develop an intuitive feel for what it means for a function to be continuous. This usually involves knowing that a function is continuous on an interval if the graph of that function over that interval can be drawn without lifting one's pencil from the page. The important property here is that as the pencil is tracing out the graph of the function, and the pencil is approaching the point where $x = a$, the points on the graph are getting close to their destination at the point $(a, f(a))$. In particular, it does not happen that as the points on the graph are getting close to $(a, L)$ the graph suddenly jumps to a different point $(a, f(a))$ where $f(a) \ne L$, a situation where the pencil would have to be lifted from the page to get from $(a, L)$ to $(a, f(a))$. This intuitive understanding leads directly to the key property of $f$ being continuous at $a$, which is that $\lim_{x \to a} f(x) = f(a)$.

How can one state a definition for continuity that embodies this intuitive feel for the function having its own value as its limit? Clearly, the definition of a function $f$ being continuous at a point $x = a$ must be similar to a definition of the limit of $f$ as $x$ approaches $a$. As a reminder, here is the definition of limit.

Suppose that the point $a$ is an accumulation point of the domain of the function $f$. Then $\lim_{x \to a} f(x) = L$ means that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ in the domain of $f$ satisfying $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \varepsilon$.

The definition of continuity of $f$ at point $a$ needs to include the fact that the function is defined at the point $a$, so references to the limit $L$ in the definition of limit can be replaced by references to $f(a)$. Thus, the definition of continuity will contain the conclusion $|f(x) - f(a)| < \varepsilon$. In the definition of limit, it was not required that the function $f$ be defined at $x = a$, and if it were defined, $f(a)$ did not need to be equal to the limit $L$. For this reason, the definition of limit took care to ensure that even though $|f(x) - L| < \varepsilon$ was required to hold for $x$ values near $x = a$, this inequality did not need to hold at $x = a$. The definition of limit excluded $x = a$ by only requiring the inequality to hold for those $x$ values satisfying $0 < |x - a| < \delta$ which excludes $x = a$. This restriction is not necessary in the definition of continuity of a function at a point.

Suppose that the point $a$ is in the domain of the function $f$. Then $f$ is continuous at $a$ means that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $x$ in the domain of $f$ satisfying $|x - a| < \delta$, it follows that $|f(x) - f(a)| < \varepsilon$.

Notice that the requirement that the point a be an accumulation point of the

domain of f has been dropped. As a result, if the function f is defined at an isolated

point a, then f is continuous at that point. A function that is not continuous at the

point a is discontinuous at the point a.

A function $f$ is continuous on a set $A$ if it is continuous at each point $a \in A$. The function whose graph appears in Fig. 4.1 is discontinuous at $x = b$ because its limit at $x = b$ does not exist. Similarly, it is discontinuous at $x = c$. It is discontinuous at $x = d$ because it is not defined at that point even though the function has a limit there. The function is continuous on the intervals $[a, b)$, $(b, c)$, and $(c, d)$, and at the points $x = e$ and $x = f$. The function is not continuous on the intervals $[a, b]$ or $[c, d]$.

It is a direct consequence of the definition of continuity that if $f$ is continuous at a point $a$, and if $a$ is an accumulation point of the domain of $f$, then the limit of $f(x)$ at $a$ exists and is, in fact, $f(a)$. To prove this you would just need to show that if $f$ satisfies the definition of continuity at $a$, then $f$ also satisfies the definition of $\lim_{x \to a} f(x) = f(a)$. Writing down the definition of continuity gives you that for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$. But if this is true, then certainly $0 < |x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$, so the definition of limit is satisfied.

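PROOF: If the function $f$ is continuous at $a$, and $a$ is an accumulation point of the domain of $f$, then $\lim_{x \to a} f(x) = f(a)$.

Let $f$ be continuous at $a$ where $a$ is an accumulation point of the domain of $f$.

Given $\varepsilon > 0$,

the definition of continuity says that there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$.

But then if $0 < |x - a| < \delta$, it follows that $|f(x) - f(a)| < \varepsilon$ satisfying the definition of $\lim_{x \to a} f(x) = f(a)$.

Therefore, $\lim_{x \to a} f(x) = f(a)$.

Conversely, if the function $f$ is defined at $a$ and $\lim_{x \to a} f(x) = f(a)$, then $f$ is continuous at $a$.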

Again, the proof of this follows directly from the definitions.

PROOF: If the function $f$ is defined at $a$ and $\lim_{x \to a} f(x) = f(a)$, then $f$ is continuous at $a$.

Let $f$ be a function defined at $a$ where $\lim_{x \to a} f(x) = f(a)$.

Given $\varepsilon > 0$,

the definition of limit says that there is a $\delta > 0$ such that if $x$ is in the domain of $f$ with $0 < |x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$.

Certainly, if $x = a$, then $|f(x) - f(a)| = |f(a) - f(a)| = 0 < \varepsilon$.

Thus, it follows that $|x - a| < \delta$ implies $|f(x) - f(a)| < \varepsilon$ satisfying the definition of $f$ being continuous at $a$.

Therefore, $f$ is continuous at $a$.

The template for proofs of $\lim_{x \to a} f(x) = L$ followed directly from the definition of limit. Similarly, a template for proofs of the continuity of a function $f$ at a point $a$ will follow directly from the definition of continuity. Indeed, the definition of continuity requires that for every $\varepsilon > 0$ there exist a $\delta > 0$ which satisfies a particular condition. This suggests that a proof of continuity should select an arbitrary $\varepsilon > 0$ and proceed to display a value of $\delta > 0$ that causes the needed condition to be satisfied. This is similar to the procedure taken for a limit proof except that the needed condition is slightly different. Thus, here is a template for proofs about the continuity of a function at a point.


SET THE CONTEXT: Make statements about what is known about the function $f$ and the numbers $a$ and $f(a)$.

SELECT AN ARBITRARY $\varepsilon$: Given $\varepsilon > 0$,

PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an appropriate value for $\delta$.

SELECT AN ARBITRARY $x$: Select $x$ in the domain of $f$ such that $|x - a| < \delta$.

LIST IMPLICATIONS: Derive the result $|f(x) - f(a)| < \varepsilon$.

STATE THE CONCLUSION: Therefore, $f$ is continuous at the point $a$.

As a start, consider how to prove that the function defined for all real numbers $x$ as $f(x) = 5x - 3$ is continuous at $x = 4$. The proof would begin with "Let $f(x) = 5x - 3$. Given $\varepsilon > 0$, ...". The task is then to find a $\delta > 0$ so that $|f(x) - f(4)| < \varepsilon$ for every $x$ satisfying $|x - 4| < \delta$. Working backwards, to get $|f(x) - f(4)| < \varepsilon$ one needs $\varepsilon > |(5x - 3) - (5 \cdot 4 - 3)| = 5|x - 4|$. Therefore, it seems clear that $|x - 4|$ needs to be less than $\frac{\varepsilon}{5}$, so letting $\delta = \frac{\varepsilon}{5}$ will work. Note that because $\varepsilon > 0$, $\delta$ is also greater than 0 as required by the definition of continuity. Putting this into the template results in the following proof.

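PROOF: The function $f(x) = 5x - 3$ is continuous at $x = 4$.

Let $f(x) = 5x - 3$.

Given $\varepsilon > 0$,

let $\delta = \frac{\varepsilon}{5}$ which is greater than 0 since $\varepsilon > 0$.

Select $x$ such that $|x - 4| < \delta = \frac{\varepsilon}{5}$.

Then $\delta > |x - 4|$ implies $|f(x) - f(4)| = |(5x - 3) - (5 \cdot 4 - 3)| = |5x - 20| = 5|x - 4| < 5\delta = \varepsilon$.

Therefore, the function $f$ is continuous at 4.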
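A numerical spot check can make the choice of $\delta$ in this proof more concrete, although it is no substitute for the proof itself. The following Python sketch samples points $x$ with $|x - 4| < \delta$ and tests whether $|f(x) - f(4)| < \varepsilon$. The helper name `check_delta` and the sampling scheme are assumptions made only for this illustration.

```python
def check_delta(f, a, eps, delta, samples=10_000):
    """Return True if |f(x) - f(a)| < eps at every sampled x with |x - a| < delta."""
    for k in range(1, samples):
        # Offsets t lie strictly inside the open interval (-delta, delta).
        t = delta * (2 * k / samples - 1)
        if abs(f(a + t) - f(a)) >= eps:
            return False
    return True


def f(x):
    return 5 * x - 3


for eps in (1.0, 0.1, 0.001):
    # delta = eps / 5 is the value proposed in the proof.
    print(eps, check_delta(f, 4.0, eps, eps / 5))
```

Passing a value of $\delta$ that is too generous, such as `check_delta(f, 4.0, 0.1, 1.0)`, makes the check fail, which mirrors why the proof needs $\delta$ to shrink with $\varepsilon$.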

For a more challenging example, consider proving that the function $f(x) = 2x^3 - 4x + 1$ is continuous for all real numbers. This proof not only tackles a more complicated function than the one in the previous example, it is supposed to demonstrate the continuity of the function at the general real number $a$ rather than at a specific value such as $a = 4$. This requires the proof to select an arbitrary $a$ and prove the continuity of $f$ at the point $a$. By showing that the function is continuous at any arbitrarily chosen $a$, it shows that the function is continuous at every point $a$. Again, the proof will select an arbitrary $\varepsilon > 0$ and needs to produce a $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ for all $x$ satisfying $|x - a| < \delta$. The proof needs to select an arbitrary $a$ and an arbitrary $\varepsilon > 0$. Does it matter which it does first? In this case where the choice of $a$ does not depend on which $\varepsilon$ is chosen, and the choice of $\varepsilon$ does not depend on which $a$ is chosen, the order is not critical. It makes sense to select the $a$ first because you are then challenged to prove that $f$ is continuous at $a$ for which you should choose an $\varepsilon > 0$. But since both quantifiers are universal quantifiers (for all $a \in \mathbb{R}$ and for all $\varepsilon > 0$), the order does not matter. If it had been a universal quantifier and an existential quantifier such as "for all $\varepsilon > 0$ there exists a $\delta > 0$", then the order would matter a great deal.

Working backwards from $\varepsilon > |f(x) - f(a)|$ you can see that you need $\varepsilon > |(2x^3 - 4x + 1) - (2a^3 - 4a + 1)| = |2(x^3 - a^3) - 4(x - a)| = |2(x - a)(x^2 + xa + a^2) - 4(x - a)| = |x - a| \, |2(x^2 + xa + a^2) - 4|$. You should not be surprised and, in fact, should be quite pleased to see that this last expression contains a factor of $|x - a|$ because this will facilitate making the expression small when $|x - a|$ is made small. One only needs to control the size of the other factor $|2(x^2 + xa + a^2) - 4|$. Of course, if $x$ is allowed to wander too far from $a$, this other factor could get arbitrarily large, so care must be taken to restrict how far $x$ gets from $a$. This can be done by requiring that $\delta$ not be larger than some conveniently selected value such as 1. That means that $|x - a| < \delta \le 1$ would imply, for example, that $|x| < |a| + 1$. Given this, there are many ways to find an upper bound for the quantity $|2(x^2 + xa + a^2) - 4|$ where the upper bound does not depend on $x$. For example, $|2(x^2 + xa + a^2) - 4| \le 2x^2 + 2|x||a| + 2a^2 + 4 \le 2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4$. One can afford to be sloppy here and get a simpler looking upper bound by saying $2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4 \le 2(|a| + 1)^2 + 2(|a| + 1)(|a| + 1) + 2(|a| + 1)^2 + 4(|a| + 1)^2 = 10(|a| + 1)^2$. All you need is an upper bound that depends only on $a$. This leads to the following proof.

PROOF: The function $f(x) = 2x^3 - 4x + 1$ is continuous on the real numbers.

Let $f(x) = 2x^3 - 4x + 1$, and let $a \in \mathbb{R}$.

Given $\varepsilon > 0$,

let $\delta = \min\left(1, \frac{\varepsilon}{10(|a| + 1)^2}\right)$ which is greater than 0 since 1, $\varepsilon$, and $10(|a| + 1)^2$ are all positive.

Select $x$ such that $|x - a| < \delta$. Then $\delta \le 1$ implies $|x| < |a| + 1$.

Also, $\delta \le \frac{\varepsilon}{10(|a| + 1)^2}$ implies that

$|f(x) - f(a)| = |(2x^3 - 4x + 1) - (2a^3 - 4a + 1)| = |2(x^3 - a^3) - 4(x - a)|$
$= |2(x - a)(x^2 + xa + a^2) - 4(x - a)| = |x - a| \, |2(x^2 + xa + a^2) - 4|$
$\le |x - a| \left( 2(|a| + 1)^2 + 2(|a| + 1)|a| + 2a^2 + 4 \right)$
$\le |x - a| \left( 2(|a| + 1)^2 + 2(|a| + 1)(|a| + 1) + 2(|a| + 1)^2 + 4(|a| + 1)^2 \right)$
$= |x - a| \cdot 10(|a| + 1)^2 < \frac{\varepsilon}{10(|a| + 1)^2} \cdot 10(|a| + 1)^2 = \varepsilon$.

Therefore, the function $f$ is continuous at every real number $a$.
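To see how the value of $\delta$ in this proof depends on both $\varepsilon$ and $a$, it can help to compute it for a few points and test it on sampled values of $x$. The Python sketch below is only an illustration under assumed names (`delta_for`, `holds`); sampling finitely many points suggests, but does not prove, that the bound works.

```python
def f(x):
    return 2 * x**3 - 4 * x + 1


def delta_for(a, eps):
    """The delta from the proof: min(1, eps / (10 (|a| + 1)^2))."""
    return min(1.0, eps / (10 * (abs(a) + 1) ** 2))


def holds(a, eps, samples=2_000):
    """Check |f(x) - f(a)| < eps at sampled x with |x - a| < delta_for(a, eps)."""
    d = delta_for(a, eps)
    for k in range(1, samples):
        x = a + d * (2 * k / samples - 1)  # strictly inside (a - d, a + d)
        if abs(f(x) - f(a)) >= eps:
            return False
    return True


for a in (-3.0, 0.0, 0.5, 10.0):
    print(a, delta_for(a, 0.01), holds(a, 0.01))
```

The printed values of `delta_for` get smaller as $|a|$ grows, which reflects the fact that the cubic changes more rapidly far from the origin.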

Not all functions can be expressed with nice formulas. Take, for example, the function

$f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$

which behaves differently on the rational numbers than it does on the irrational numbers. Such functions that are defined one way on the rational numbers and another way on the irrational numbers make interesting examples because both the rational and the irrational numbers are dense in the real numbers; that is, in every nonempty open interval $(a, b)$, you can find both rational and irrational numbers. For the given function, in every nonempty open interval $(a, b)$ there are values of $x$ where $f(x) = 2x$ and other values of $x$ where $f(x) = x + 1$. Indeed, for every real number $a \ne 1$, $\lim_{x \to a} f(x)$ does not exist. Only at $x = 1$, where $2x$ and $x + 1$ coincide, does this limit exist, and, in fact, at that point $f(x)$ is continuous (Fig. 4.2).

[Fig. 4.2: graph of $2x$ for rational $x$ (blue) and $x + 1$ for irrational $x$ (red), meeting at the point $(1, 2)$. The blue and red lines are not solid.]

A proof that $f$ is continuous at $x = 1$ would be similar to the two preceding proofs, but you need to be careful to handle $f(x)$ differently depending on whether $x$ is rational or irrational. As in other continuity proofs, given an $\varepsilon > 0$ you are faced with producing a value for $\delta > 0$ which will ensure that $|f(x) - f(1)| < \varepsilon$ whenever $|x - 1| < \delta$. If the function in the proof were equal to $x + 1$ for every value of $x$, then the value $\delta = \varepsilon$ would work because $|x - 1| < \varepsilon$ shows that $|f(x) - f(1)| = |(x + 1) - (1 + 1)| = |x - 1| < \varepsilon$. If the function in the proof were equal to $2x$ for every value of $x$, then the value $\delta = \frac{\varepsilon}{2}$ would work because $|x - 1| < \frac{\varepsilon}{2}$ shows that $|f(x) - f(1)| = |(2x) - (2 \cdot 1)| = 2|x - 1| < \varepsilon$. In this proof, then, you can choose $\delta = \min\left(\varepsilon, \frac{\varepsilon}{2}\right) = \frac{\varepsilon}{2}$. After selecting an $x$ with $|x - 1| < \delta$, you merely consider two separate cases, one where $x$ is rational, and one where $x$ is irrational. These ideas allow you to produce the following proof.

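PROOF: The function $f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$ is continuous at $x = 1$.

Let $f(x) = \begin{cases} 2x & \text{if } x \text{ is rational} \\ x + 1 & \text{if } x \text{ is irrational} \end{cases}$.

Given $\varepsilon > 0$,

let $\delta = \frac{\varepsilon}{2}$ which is greater than 0 since $\varepsilon > 0$.

Select $x$ such that $|x - 1| < \delta = \frac{\varepsilon}{2}$.

If $x$ is a rational number, then $|f(x) - f(1)| = |2x - 2| = 2|x - 1| < 2\delta = \varepsilon$.

If $x$ is an irrational number, then $|f(x) - f(1)| = |(x + 1) - 2| = |x - 1| < \frac{\varepsilon}{2} < \varepsilon$.

In either case, $|f(x) - f(1)| < \varepsilon$.

Therefore, the function $f$ is continuous at 1.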


4.2.1 Exercises

Write proofs of each of the following statements.

1.

2.

3.

4.

5.

6.

7.

$f(x) = 5x^2 + 3x - 2$ is continuous at $x = 8$.

$f(x) = 10x^3 + 25$ is continuous for all real numbers $x$.

$f(x) = |x|$ is continuous at $x = 0$.

$f(x) = \sqrt{|x^2 - 9|}$ is continuous for all real numbers $x$.

$f(x) = \sqrt{x}$ is continuous for all $x \ge 0$.

$f(x) = |x^2 - 4|$ is continuous for all real numbers $x$.

Continuity of a function is a local property, that is, whether or not a function $f$ is continuous at a point $x = a$ depends only on how $f$ behaves close to $a$. In fact, $f$ can be continuous at $a$ and yet have very erratic behavior at points just $\frac{1}{10}$ unit from $a$ or $\frac{1}{100}$ or even $\frac{1}{1{,}000{,}000}$ from $a$. The last example in the previous section shows a function continuous at $x = 1$ which is continuous for no other value of $x$. Even if $f$ is continuous at all points of a set $A$, it could be that proofs of the continuity of $f$ at two points $x = a$ and $x = b$ might need to be quite different. Certainly, there is no reason to believe that, given an $\varepsilon > 0$, a value of $\delta > 0$ that works in a proof of the continuity of $f$ at the point $a$ would also work in a proof of the continuity of $f$ at point $b$.

Consider, for example, the function $f(x) = \frac{1}{x}$ which is continuous for all $x \ne 0$. To prove that $f$ is continuous at $x = 2$, given $\varepsilon > 0$ one can use $\delta = \min(1, \varepsilon)$ or even be as generous as to let $\delta = \min(1, 2\varepsilon)$. But to prove that $f$ is continuous at $x = \frac{1}{2}$ where the function $f$ changes much more rapidly, for the same $\varepsilon > 0$, one might need to use $\delta = \min\left(\frac{1}{4}, \frac{\varepsilon}{8}\right)$. You can easily see from the graph of $f(x) = \frac{1}{x}$ that as $a$ gets closer to 0, the $\delta > 0$ chosen for a particular $\varepsilon > 0$ will need to get smaller (Fig. 4.3).
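The dependence of $\delta$ on the point $a$ for $f(x) = \frac{1}{x}$ can be explored numerically. The Python sketch below estimates the largest value of $|f(x) - f(a)|$ over sampled $x$ with $|x - a| < \delta$, using the values of $\delta$ mentioned above. The function name `worst_gap` and the choice $\varepsilon = 0.1$ are assumptions for illustration only.

```python
def f(x):
    return 1 / x


def worst_gap(a, delta, samples=10_000):
    """Largest sampled |f(x) - f(a)| over x with |x - a| < delta."""
    return max(
        abs(f(a + delta * (2 * k / samples - 1)) - f(a))
        for k in range(1, samples)
    )


eps = 0.1
# delta suited to a = 2
print(worst_gap(2.0, min(1, 2 * eps)) < eps)
# delta suited to a = 1/2
print(worst_gap(0.5, min(0.25, eps / 8)) < eps)
# reusing the a = 2 delta at a = 1/2, where f changes faster
print(worst_gap(0.5, min(1, 2 * eps)) < eps)
```

The third check is the instructive one: the $\delta$ that is adequate at $a = 2$ is too large at $a = \frac{1}{2}$, so it fails there.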

Suppose that you wanted to prove that a particular function $f$ was continuous at every $a$ in the domain of $f$. Such a proof was discussed in the previous section using $f(x) = 2x^3 - 4x + 1$. In that proof, the formula for the $\delta > 0$ chosen for a given $\varepsilon > 0$ depended on the point $a$ where $f$ was being shown to be continuous. Clearly, this would have to be the case because $f$ is a cubic function of $x$ which grows increasingly more rapidly as $x$ gets large. But it is not true that every function behaves this way. Some functions change at a constant rate like $f(x) = 6x - 13$ or change at a rate that does not continue to grow such as $f(x) = \frac{1}{x^2 + 1}$. When writing a proof of the continuity of such functions, it is possible to pick a single value for $\delta > 0$ that depends on $\varepsilon > 0$ (as it certainly would have to unless $f$ were constant on each interval in its domain), but where the choice of $\delta > 0$ does not depend on the point $a$ where the continuity needs to be shown. These functions are special and satisfy the following definition. A function $f$ is uniformly continuous on the set $A$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ for every $x$ and $y$ in $A$ satisfying $|x - y| < \delta$. You should compare this definition to the definition of continuity at a point. The difference centers on when the value of $\delta > 0$ needs to be determined. For continuity at a single point, given $\varepsilon > 0$, one must specify the value of $\delta > 0$ after being given the value of $a$ but before being given a value for $x$. Thus, the value of $\delta > 0$ can depend on the value of $a$ even though it cannot depend on the value of $x$. On the other hand, for uniform continuity, given $\varepsilon > 0$, one must specify the value of $\delta > 0$ before learning the values of either $x$ or $y$, and, therefore, its value cannot depend on either $x$ or $y$.

The definition of uniform continuity suggests a template for how to prove that a given function $f$ is uniformly continuous on a set $A$. As in the proof for continuity at a point, you would say that a value for $\varepsilon > 0$ has been given. Then you would present a value for $\delta > 0$. Once these two values have been specified, you would need to show that any $x$ and $y$ in $A$ that satisfy $|x - y| < \delta$ also satisfy $|f(x) - f(y)| < \varepsilon$. This suggests the following.

TEMPLATE for proving the function $f$ is uniformly continuous on the set $A$

SET THE CONTEXT: Make statements about what is known about the function $f$.

SELECT AN ARBITRARY $\varepsilon$: Given $\varepsilon > 0$,

PROPOSE A VALUE FOR $\delta$: let $\delta =$ ____. Here you would insert an appropriate value for $\delta$.

SELECT ARBITRARY $x$ and $y$ in $A$ with $|x - y| < \delta$: Let $x$ and $y$ be in $A$ such that $|x - y| < \delta$.

LIST IMPLICATIONS: Derive the result $|f(x) - f(y)| < \varepsilon$.

STATE THE CONCLUSION: Therefore, $f$ is uniformly continuous on the set $A$.

Proving the function $f(x) = 6x - 13$ is uniformly continuous on the entire real line is straightforward since the function $f$ changes at a constant rate. This allows you to select a value for $\delta > 0$ based on that rate of change, 6.

PROOF: The function $f(x) = 6x - 13$ is uniformly continuous on the real numbers.

Given $\varepsilon > 0$,

let $\delta = \frac{\varepsilon}{6}$ which is greater than 0 since $\varepsilon > 0$.

Let $x$ and $y$ be real numbers such that $|x - y| < \delta = \frac{\varepsilon}{6}$.

Then $|f(x) - f(y)| = |(6x - 13) - (6y - 13)| = 6|x - y| < 6\delta = \varepsilon$.

Therefore, the function $f$ is uniformly continuous on the real numbers.

Less clear is how to choose a value for $\delta > 0$ when proving $f(x) = \frac{1}{x^2 + 1}$ is uniformly continuous on the real numbers. To do this, you need to find a way to show $|f(x) - f(y)| < \varepsilon$. You would try to find an upper bound for

$|f(x) - f(y)| = \left| \frac{1}{x^2 + 1} - \frac{1}{y^2 + 1} \right| = \frac{|(y^2 + 1) - (x^2 + 1)|}{(x^2 + 1)(y^2 + 1)} = \frac{|x + y|}{(x^2 + 1)(y^2 + 1)} |x - y|$.

This expression is complicated, so it is convenient to find ways to simplify it. The nice thing about working with inequalities rather than equalities is that you are not prevented from making changes that increase the value of your expression. That is, if you can simplify an expression by substituting an expression that is a little larger, that might not be a problem. The numerator in the previous expression is $|x + y|$ which does not simplify algebraically, but it does suggest a possible application of the triangle inequality, $|x + y| \le |x| + |y|$. Changing $|x + y|$ to $|x| + |y|$ allows the fraction to be broken into two simpler fractions. It allows you to continue with

$|f(x) - f(y)| = \frac{|x + y|}{(x^2 + 1)(y^2 + 1)} |x - y| \le \left( \frac{|x|}{(x^2 + 1)(y^2 + 1)} + \frac{|y|}{(x^2 + 1)(y^2 + 1)} \right) |x - y| \le \left( \frac{|x|}{x^2 + 1} + \frac{|y|}{y^2 + 1} \right) |x - y|$.

When $|x| < 1$, you can conclude that $|x| < 1 \le x^2 + 1$. When $|x| \ge 1$, you can conclude that $|x| \le x^2 < x^2 + 1$. In either case $\frac{|x|}{x^2 + 1} < \frac{x^2 + 1}{x^2 + 1} = 1$. This lets you state that

$|f(x) - f(y)| = \frac{|x + y|}{(x^2 + 1)(y^2 + 1)} |x - y| \le \left( \frac{|x|}{x^2 + 1} + \frac{|y|}{y^2 + 1} \right) |x - y| \le 2|x - y|$.

This suggests that $\delta = \frac{\varepsilon}{2}$ will work in the proof.

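PROOF: The function $f(x) = \frac{1}{x^2 + 1}$ is uniformly continuous on the real numbers.

Given $\varepsilon > 0$,

let $\delta = \frac{\varepsilon}{2}$ which is greater than 0 since $\varepsilon > 0$.

Let $x$ and $y$ be real numbers such that $|x - y| < \delta = \frac{\varepsilon}{2}$.

Then $|f(x) - f(y)| = \left| \frac{1}{x^2 + 1} - \frac{1}{y^2 + 1} \right| = \frac{|(y^2 + 1) - (x^2 + 1)|}{(x^2 + 1)(y^2 + 1)} = \frac{|x + y|}{(x^2 + 1)(y^2 + 1)} |x - y| \le \left( \frac{|x|}{(x^2 + 1)(y^2 + 1)} + \frac{|y|}{(x^2 + 1)(y^2 + 1)} \right) |x - y| \le \left( \frac{|x|}{x^2 + 1} + \frac{|y|}{y^2 + 1} \right) |x - y|$.

Note that if $|x| < 1$, then $|x| < x^2 + 1$, and if $|x| \ge 1$, then $|x| \le x^2 < x^2 + 1$.

In either case, $|x| < x^2 + 1$, so $\frac{|x|}{x^2 + 1} < 1$, and similarly, $\frac{|y|}{y^2 + 1} < 1$.

It follows that $|f(x) - f(y)| \le \left( \frac{|x|}{x^2 + 1} + \frac{|y|}{y^2 + 1} \right) |x - y| \le 2|x - y| < 2\delta = \varepsilon$.

Therefore, the function $f$ is uniformly continuous on the real numbers.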
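What makes this proof a proof of uniform continuity is that the single value $\delta = \frac{\varepsilon}{2}$ serves every pair of points at once. As a rough illustration, the Python sketch below draws random pairs $x, y$ with $|x - y| < \delta$ from a wide range and tests $|f(x) - f(y)| < \varepsilon$. The function name `uniform_check`, the sampling range, and the fixed seed are assumptions of this sketch, and random testing can only support the claim, not establish it.

```python
import random


def f(x):
    return 1 / (x * x + 1)


def uniform_check(eps, trials=100_000, span=1_000.0, seed=0):
    """Test |f(x) - f(y)| < eps on random pairs with |x - y| < eps / 2."""
    rng = random.Random(seed)
    delta = eps / 2  # the same delta is used for every pair
    for _ in range(trials):
        x = rng.uniform(-span, span)
        # Scaling by 0.999 keeps |x - y| strictly below delta.
        y = x + 0.999 * rng.uniform(-delta, delta)
        if abs(f(x) - f(y)) >= eps:
            return False
    return True


print(uniform_check(0.5), uniform_check(1e-3))
```

Note that `delta` is computed before any point is drawn, which matches the order of quantifiers in the definition of uniform continuity.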

One of the most memorable theorems from Calculus is the Mean Value Theorem which states that if the function $f$ is continuous on the interval $[a, b]$ and differentiable on the interval $(a, b)$, then there is a $c \in (a, b)$ such that $f'(c) = \frac{f(b) - f(a)}{b - a}$. If the function $f$ has a bounded derivative on the interval $[a, b]$, that is, if there is a positive real number $M$ such that $|f'(x)| \le M$ for all values of $x \in [a, b]$, then one can easily see that $f$ is uniformly continuous on that interval. Indeed, if $x$ and $y$ are in $[a, b]$, then there is a $c$ between $x$ and $y$ such that $|f(x) - f(y)| = |f'(c)| \, |x - y| \le M |x - y|$. This implies that given $\varepsilon > 0$, the value $\delta = \frac{\varepsilon}{M} > 0$ can be used in a proof that $f$ is uniformly continuous on $[a, b]$ for then $|x - y| < \delta$ implies $|f(x) - f(y)| = |f'(c)| \, |x - y| \le M |x - y| < M\delta = \varepsilon$. This is summarized by saying that a function with a bounded derivative on an interval is uniformly continuous there.

Whenever you learn of the truth of a conditional statement such as the one at the end of the previous paragraph (bounded derivative implies uniform continuity), it is natural to ask whether the converse of the statement is also true (uniform continuity implies bounded derivative). The answer to this particular question is no, not all functions uniformly continuous on an interval have bounded derivatives there. In particular, the function $f(x) = |x|$ is an example of a function uniformly continuous on the entire real line, yet it fails to be differentiable at $x = 0$. The function $f(x) = \sqrt{x}$ is uniformly continuous for $x \ge 0$, but its derivative is unbounded near $x = 0$.

A more complex example is the function defined by $f(x) = x^2 \sin\frac{1}{x^2}$ when $x \ne 0$ and $f(0) = 0$. This function is uniformly continuous on the interval $[-10, 10]$ even though its derivative, which exists on the entire real line, is not bounded as $x$ approaches 0.

Because the function $f(x) = \sqrt{x}$ has an increasingly large rate of change as $x$ approaches 0, proving that the function is uniformly continuous for $x \ge 0$ provides an interesting challenge. The proof will need to conclude that

$\varepsilon > |f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| = \frac{|\sqrt{x} - \sqrt{y}| (\sqrt{x} + \sqrt{y})}{\sqrt{x} + \sqrt{y}} = \frac{|x - y|}{\sqrt{x} + \sqrt{y}}$.

As expected, there is a factor of $|x - y|$ in this expression, so that you can try to make the expression small by restricting the size of $|x - y|$. This is easy if the denominator of the expression, $\sqrt{x} + \sqrt{y}$, does not get too small. The problem is if $x$ and $y$ get close to 0, the denominator of the expression will also get close to 0. At first this seems like a significant roadblock. But this roadblock presents its own resolution, for if $\sqrt{x} + \sqrt{y}$ is very small, it must certainly be that $|\sqrt{x} - \sqrt{y}|$ is even smaller, which is the conclusion that you want. In other words, there are two cases: either $\sqrt{x} + \sqrt{y}$ is small which would imply that $|f(x) - f(y)|$ is small, or $\sqrt{x} + \sqrt{y}$ is large which would imply that $|f(x) - f(y)| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}}$ is small. You only need to decide what to use as the dividing line between large and small. A natural choice would be $\varepsilon$ itself because $\sqrt{x} + \sqrt{y} < \varepsilon$ implies $|\sqrt{x} - \sqrt{y}| < \varepsilon$. If $\sqrt{x} + \sqrt{y} \ge \varepsilon$, then $|f(x) - f(y)| = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \le \frac{|x - y|}{\varepsilon}$ which suggests letting $\delta = \varepsilon^2$ so that $|x - y| < \delta$ gives $|f(x) - f(y)| < \frac{\delta}{\varepsilon} = \varepsilon$. The complete proof follows.

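PROOF: The function $f(x) = \sqrt{x}$ is uniformly continuous on the interval $x \ge 0$.

Let $f(x) = \sqrt{x}$.

Given $\varepsilon > 0$,

let $\delta = \varepsilon^2$ which is greater than 0 since $\varepsilon > 0$.

Let $x$ and $y$ be nonnegative real numbers such that $|x - y| < \delta$.

In the case that $\sqrt{x} + \sqrt{y} < \varepsilon$, it follows that $|f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| \le \sqrt{x} + \sqrt{y} < \varepsilon$.

In the case that $\sqrt{x} + \sqrt{y} \ge \varepsilon$, it follows that $|f(x) - f(y)| = |\sqrt{x} - \sqrt{y}| = \frac{|\sqrt{x} - \sqrt{y}| (\sqrt{x} + \sqrt{y})}{\sqrt{x} + \sqrt{y}} = \frac{|x - y|}{\sqrt{x} + \sqrt{y}} \le \frac{|x - y|}{\varepsilon} < \frac{\varepsilon^2}{\varepsilon} = \varepsilon$.

In either case, $|x - y| < \delta$ implies that $|f(x) - f(y)| < \varepsilon$, so the function $f$ is uniformly continuous on the interval $x \ge 0$.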
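The choice $\delta = \varepsilon^2$ can be examined numerically, including near 0 where the square root changes most rapidly. The Python sketch below is an illustration under assumed names and sampling choices (`sqrt_check`, the range, the seed); it tests random pairs of nonnegative numbers with $|x - y| < \varepsilon^2$, half of them drawn close to 0.

```python
import math
import random


def sqrt_check(eps, trials=100_000, span=100.0, seed=0):
    """Test |sqrt(x) - sqrt(y)| < eps on random pairs with |x - y| < eps**2."""
    rng = random.Random(seed)
    delta = eps ** 2
    for i in range(trials):
        # Alternate between points spread over [0, span] and points near 0.
        x = rng.uniform(0.0, span) if i % 2 == 0 else rng.uniform(0.0, 2 * delta)
        # Scaling by 0.999 keeps |x - y| < delta; max() keeps y nonnegative.
        y = max(0.0, x + 0.999 * rng.uniform(-delta, delta))
        if abs(math.sqrt(x) - math.sqrt(y)) >= eps:
            return False
    return True


print(sqrt_check(0.5), sqrt_check(1e-3))
```

The pairs near 0 correspond to the first case of the proof, where $\sqrt{x} + \sqrt{y}$ may be smaller than $\varepsilon$, and the spread-out pairs mostly fall under the second case.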

There is an important lesson to be learned from this example. When planning how

to write a proof, you can pursue one line of thinking which may solve the problem

in most but not all cases. Sometimes the special cases where the argument does not

work are enough to cause you to abandon your original line of reasoning altogether.

But often you can just break your argument into two or more cases and find other

techniques to handle the special cases where the original argument does not work.

4.3.1 Exercises

Write proofs of each of the following statements.

1. $f(x) = 3x + 11$ is uniformly continuous on the set of real numbers.

2. $f(x) = 14x + 5$ is uniformly continuous on the set of real numbers.

3. $f(x) = |x|$ is uniformly continuous on the set of real numbers.


4.

5.

6.

7.

8.

$f(x) = \frac{4}{5x + 1}$ is uniformly continuous for $x \ge 0$.

$f(x) = \sqrt[3]{x}$ is uniformly continuous on the set of real numbers.

$f(x) = x^2$ is not uniformly continuous on the set of real numbers.

$f(x) = \frac{1}{x^2}$ is not uniformly continuous on the set $(0, 1)$.

4.4.1 Open Covers and Subcovers

Let $a$ and $b$ be real numbers with $a < b$. It turns out that if a function $f$ is continuous on the closed interval $[a, b]$, then $f$ is uniformly continuous on that interval. How might you prove this result? As a first try, you might say that for each $\varepsilon > 0$ and for each $y \in [a, b]$ there is a $\delta > 0$ such that if $x \in [a, b]$ with $|x - y| < \delta$, then $|f(x) - f(y)| < \varepsilon$. Then, having produced a value for $\delta$ for each $y \in [a, b]$, you might want to pick the smallest of all of those $\delta$s and hope that this minimum would be sufficiently small to work for every $y \in [a, b]$. Unfortunately, you started out with an infinite collection of $\delta$s, each greater than 0. Such an infinite set might not have a minimum value. The set of such $\delta$s is certainly nonempty and bounded below, so the collection does have a greatest lower bound, but that greatest lower bound could be 0, too small to use for the $\delta$ in the proof. A finite set of positive numbers always has a minimum value that is positive, but an infinite set of positive numbers might have a greatest lower bound of 0.

Suppose that $T$ is a collection of open intervals, and $A \subseteq \mathbb{R}$. If the set $A$ is contained in the union of the open intervals in $T$, that is, if $A \subseteq \bigcup_{(s,t) \in T} (s, t)$, then $T$ is called an open cover of $A$. A subset of $T$ that is itself an open cover of $A$ is called a subcover of $A$. In the above suggested proof that the continuity of $f$ on $[a, b]$ implies the uniform continuity of $f$ on $[a, b]$, the definition of continuity at each point $y \in [a, b]$ produced a collection of open intervals which form an open cover $T$ of $[a, b]$. If that open cover had a finite subcover $T'$, then you would be dealing with only a finite number of $\delta > 0$ values, and you could expect to produce a smallest such $\delta > 0$. Whether such a finite subcover exists has nothing to do with the continuous function $f$ that motivated this discussion. A closed bounded interval $[a, b]$ in the real numbers is compact which means that every open cover of $[a, b]$ contains a finite subcover. The fact that every closed bounded interval in the real numbers is compact is known as the Heine-Borel Theorem, and it is central to proving the above result about continuous functions on closed bounded intervals being uniformly continuous there. In fact, the Heine-Borel Theorem is an important tool for proving many results in analysis.

Suppose that for every rational number in $[0, 1]$ you represent the rational number in lowest terms as $\frac{p}{q}$. Then for each of these rational numbers you associate the open interval $\left( \frac{4p - 1}{4q}, \frac{4p + 1}{4q} \right)$. For example, the number $\frac{2}{7}$ would be associated with the open interval $\left( \frac{7}{28}, \frac{9}{28} \right)$. Since the set of rational numbers in $[0, 1]$ is infinite, this collection of open intervals is also infinite. The collection forms an open cover of $[0, 1]$. One possible finite subcover is the collection of intervals associated with rational numbers $\frac{0}{1}, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4},$ and $\frac{1}{1}$ giving the intervals $\left( -\frac{1}{4}, \frac{1}{4} \right)$, $\left( \frac{3}{16}, \frac{5}{16} \right)$, $\left( \frac{3}{12}, \frac{5}{12} \right)$, $\left( \frac{3}{8}, \frac{5}{8} \right)$, $\left( \frac{7}{12}, \frac{9}{12} \right)$, $\left( \frac{11}{16}, \frac{13}{16} \right)$, and $\left( \frac{3}{4}, \frac{5}{4} \right)$. You should verify that these intervals are in the original open cover and do produce the claimed finite subcover. On the other hand, if you associate with each natural number $n > 1$ the open interval $\left( \frac{1}{n}, 1 \right)$, you get an open cover of the set $(0, 1)$, yet no finite subset of this collection of intervals can cover the entire interval $(0, 1)$. Indeed, any finite collection will only cover the interval $\left( \frac{1}{m}, 1 \right)$ for some natural number $m > 1$. Since these intervals form an open cover of $(0, 1)$ which does not have a finite subcover, the set $(0, 1)$ is not a compact set.
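The verification that the seven listed intervals really do cover $[0, 1]$ can be carried out with exact rational arithmetic. The Python sketch below uses the standard `fractions` module; the helper names `interval_for` and `covers` are assumptions made for this illustration. The idea is to sweep from left to right: find the intervals containing the current point, jump to the largest of their right endpoints (which is itself not covered by those intervals, since they are open), and repeat.

```python
from fractions import Fraction as F


def interval_for(p, q):
    """Open interval ((4p-1)/(4q), (4p+1)/(4q)) attached to p/q in lowest terms."""
    return (F(4 * p - 1, 4 * q), F(4 * p + 1, 4 * q))


def covers(intervals, lo, hi):
    """True if the open intervals together contain every point of [lo, hi]."""
    point = lo
    while True:
        # Right endpoints of the intervals that contain the current point.
        reach = [t for (s, t) in intervals if s < point < t]
        if not reach:
            return False      # the current point is not covered
        if max(reach) > hi:
            return True       # everything from the current point to hi is covered
        point = max(reach)    # this endpoint is excluded, so it is tested next


rationals = [(0, 1), (1, 4), (1, 3), (1, 2), (2, 3), (3, 4), (1, 1)]
subcover = [interval_for(p, q) for p, q in rationals]
print(covers(subcover, F(0), F(1)))
```

Dropping any interval that is needed, for instance the last one, leaves a point uncovered, and `covers` then returns `False`.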

Presented next are two quite different proofs of the Heine-Borel Theorem. The techniques used in both proofs are instructive, and it is interesting to see how a single result can be proved using two completely different strategies. Given in each case are real numbers $a < b$ and a set of open intervals $T$ that forms an open cover of the closed bounded interval $[a, b]$. Both proofs seek to show that there must be a finite subset of $T$ that covers $[a, b]$. The strategy in the first proof suggests that, whether or not you can cover $[a, b]$ with a finite number of open intervals, you can certainly cover some of the interval starting at $a$ and working at least part of the way toward $b$. The proof proposes looking at the set

$S = \{ x \in [a, b] \mid T \text{ has a finite subcover that covers the interval } [a, x] \}$.

The proof first shows that $S$ is not empty because it contains the point $a$. The set $S$ is bounded above by $b$, so $S$ has a least upper bound, $r$. This is not to say that $r \in S$, but if $r$ is not in $S$, there must be values in $S$ that are arbitrarily close to $r$. Certainly $r$ is in $[a, b]$, so there is an open interval from $T$ that covers $r$. Since there are values of $S$ arbitrarily close to $r$, there are some inside this open interval containing $r$. This open interval then extends the finite subcover to values greater than $r$. One can only conclude that $r$ must be $b$, and, in fact, $b \in S$. Thus, $[a, b]$ has a finite subcover, and the proof is complete (Fig. 4.4).


PROOF (Heine–Borel Theorem): Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a,b]$. Then $T$ contains a finite subcover of $[a,b]$.

Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a,b]$.

Define the set $S = \{x \in [a,b] \mid T \text{ has a finite subcover that covers the interval } [a,x]\}$.

The set $T$ is an open cover of $[a,b]$, and $a \in [a,b]$, so $T$ must contain at least one open interval $(p,q)$ which contains the point $a$, that is, $p < a < q$.

Since the interval $[a,a]$ is covered by $(p,q) \in T$, the point $a \in S$, and $S$ is not an empty set.

The set $S$ is bounded above by $b$.

Since $S$ is nonempty and bounded above, it has a least upper bound $r$.

Since $r$ must be at least $a$ and cannot be greater than $b$, $r \in [a,b]$, so there is an interval $(p,q)$ in $T$ which contains the point $r$, that is, $p < r < q$.

Since $p < r$ and $r$ is the least upper bound of $S$, $p$ is not an upper bound of $S$. Thus, there is a point $y \in S$ with $p < y$. This means that there is a finite set of intervals in $T$ that covers $[a,y]$.

Let $z = \min\left(\frac{r+q}{2}, b\right)$. Since $z \ge r$ and $z \in (p,q)$, adding the interval $(p,q)$ to the finite set of intervals of $T$ that covers $[a,y]$ produces a finite set of intervals in $T$ that covers $[a,z]$, and $z \in S$.

But $r$ is the least upper bound for $S$, implying that $z \le r$. Because $z = \min\left(\frac{r+q}{2}, b\right)$ and $\frac{r+q}{2} > r$, it must be that $z = b$.

Because $z \in S$, it follows that $b \in S$ which completes the proof of the theorem.
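The "start at $a$ and keep extending to the right" idea of the first proof can be illustrated with a small sketch. It is not the proof itself: it assumes the cover is already given as a finite Python list of pairs `(p, q)`, and the function name `finite_subcover` is hypothetical. The variable `reached` plays the role of a point known to be the right end of the part covered so far.

```python
def finite_subcover(a, b, intervals):
    """Greedily pick open intervals (p, q) from a finite list so they cover [a, b].

    Returns the chosen intervals in order, or None if some point of [a, b]
    is not covered by any interval in the list.
    """
    chosen = []
    reached = a  # invariant: every point of [a, reached) is covered by `chosen`
    while True:
        # Intervals that contain the current point `reached`.
        candidates = [(p, q) for (p, q) in intervals if p < reached < q]
        if not candidates:
            return None  # the point `reached` is left uncovered
        best = max(candidates, key=lambda iv: iv[1])  # extends furthest right
        chosen.append(best)
        if best[1] > b:
            return chosen  # [reached, b] lies inside `best`, so [a, b] is covered
        reached = best[1]  # the right endpoint itself still needs an interval


cover = [(-1, 0.4), (0.3, 0.7), (0.35, 0.5), (0.6, 1.2)]
print(finite_subcover(0, 1, cover))
```

Each pass strictly increases `reached`, and `reached` only takes values among the finitely many right endpoints, so the loop terminates.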

The second proof of the Heine–Borel Theorem is a proof by contradiction. It begins as the first proof by assuming that $a < b$ are real numbers, and that the interval $[a,b]$ has an open cover $T$. Then it makes the additional assumption that no finite collection of intervals in $T$ can cover $[a,b]$. This will lead to a contradiction. This proof is not one that the beginning student is likely to invent on their own unless they have seen the technique before.

First, the proof sets $a_0 = a$ and $b_0 = b$ so that the interval $[a_0,b_0] = [a,b]$. Let $m_0 = \frac{a_0+b_0}{2}$ be the midpoint of $[a_0,b_0]$. It must be the case that at least one of the intervals $[a_0,m_0]$ or $[m_0,b_0]$ cannot be covered by a finite number of intervals in $T$ because, if both can be covered by a finite number of intervals, putting those two collections together would give a finite collection of intervals that covered the entire interval $[a_0,b_0] = [a,b]$, contradicting the assumption that this could not be done.


If $[a_0,m_0]$ cannot be covered by a finite number of intervals in $T$, let $a_1 = a_0$ and $b_1 = m_0$. Otherwise, if $[m_0,b_0]$ cannot be covered by a finite number of intervals in $T$, let $a_1 = m_0$ and $b_1 = b_0$. In either case, the new interval $[a_1,b_1] \subseteq [a,b]$ cannot be covered by a finite collection of intervals in $T$.

Now the proof continues iteratively. If for some $j > 0$, there is an interval $[a_j,b_j]$ contained in $[a,b]$ which cannot be covered by any finite collection of intervals in $T$, let $m_j = \frac{a_j+b_j}{2}$ be the midpoint of the interval. Either $[a_j,m_j]$ or $[m_j,b_j]$ cannot be covered by a finite collection of intervals from $T$, so if $[a_j,m_j]$ cannot be covered by a finite collection of intervals, let $a_{j+1} = a_j$ and $b_{j+1} = m_j$. Otherwise, let $a_{j+1} = m_j$ and $b_{j+1} = b_j$. In either case $[a_{j+1},b_{j+1}]$ cannot be covered by a finite collection of intervals from $T$. Notice that this process constructs a sequence of intervals $[a_0,b_0], [a_1,b_1], [a_2,b_2], \ldots$ contained in $[a,b]$, none of which can be covered by a finite collection of intervals in $T$. Also note that $a = a_0 \le a_1 \le a_2 \le \cdots$ while $b = b_0 \ge b_1 \ge b_2 \ge \cdots$, and for each $j$, the length of the $j$th interval is $b_j - a_j = \frac{b-a}{2^j}$. Since each $a_j$ term is less than all of the $b_k$ terms, both of the monotone sequences are bounded and, therefore, converge. Moreover, since for each $k$, $\lim_{j\to\infty} b_j - \lim_{j\to\infty} a_j \le b_k - a_k = \frac{b-a}{2^k}$, it follows that $\lim_{j\to\infty} a_j = \lim_{j\to\infty} b_j = r \in [a,b]$.

Note that since the sequence of $a_j$s increases to $r$, and the sequence of $b_j$s decreases to $r$, the limit $r \in [a_j,b_j]$ for each $j$. Because the limit, $r$, is in $[a,b]$, there is an open interval $(p,q) \in T$ such that $r \in (p,q)$. The distance the limit $r$ is from the boundary of the interval $(p,q)$ is $\epsilon = \min(r-p, q-r) > 0$. Since $\lim_{j\to\infty} \frac{b-a}{2^j} = 0$, you can select a $j$ so that $\frac{b-a}{2^j} < \epsilon$ and, so, $[a_j,b_j] \subseteq (p,q)$. But this shows that $[a_j,b_j]$ is covered by the single open interval $(p,q) \in T$, contradicting the fact that $[a_j,b_j]$ could not be covered by a finite collection of intervals in $T$. Thus, you must conclude that the assumption that $[a,b]$ cannot be covered by a finite number of intervals is false. A formal proof follows (Fig. 4.5).

Fig. 4.5 Heine–Borel Theorem, second proof

PROOF (Heine–Borel Theorem): Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a,b]$. Then $T$ contains a finite subcover of $[a,b]$.

Let $a < b$ be two real numbers, and let $T$ be an open cover of $[a,b]$.

Assume that $T$ contains no finite subcover of $[a,b]$.

Let $a_0 = a$ and $b_0 = b$ so that the interval $[a_0,b_0] = [a,b]$, and note that no finite collection of intervals in $T$ will cover $[a_0,b_0]$.

Define sequences $\langle a_j \rangle$ and $\langle b_j \rangle$ inductively. For $j \ge 0$, let $[a_j,b_j] \subseteq [a,b]$ be an interval which cannot be covered by a finite collection of open intervals in $T$, and where $b_j - a_j = \frac{b-a}{2^j}$.

Let $m_j = \frac{a_j+b_j}{2}$ be the midpoint of $[a_j,b_j]$.

It must be the case that at least one of the intervals $[a_j,m_j]$ or $[m_j,b_j]$ cannot be covered by a finite number of intervals in $T$ because, if both can be covered by a finite number of intervals, putting those two collections together would give a finite collection of intervals that covered the entire interval $[a_j,b_j]$.

If $[a_j,m_j]$ cannot be covered by a finite collection of intervals, let $a_{j+1} = a_j$ and $b_{j+1} = m_j$. Otherwise, let $a_{j+1} = m_j$ and $b_{j+1} = b_j$. In either case $[a_{j+1},b_{j+1}]$ cannot be covered by a finite collection of intervals from $T$, and $b_{j+1} - a_{j+1} = \frac{b-a}{2^{j+1}}$.

Thus, there are monotone sequences $a = a_0 \le a_1 \le a_2 \le \cdots$ and $b = b_0 \ge b_1 \ge b_2 \ge \cdots$, and for each $j$, the length of the $[a_j,b_j]$ interval is $b_j - a_j = \frac{b-a}{2^j}$.

Since each $a_j$ term is less than all of the $b_k$ terms, both of the monotone sequences are bounded and, therefore, converge. The fact that $\lim_{j\to\infty} b_j - \lim_{j\to\infty} a_j = \lim_{j\to\infty} (b_j - a_j) = \lim_{j\to\infty} \frac{b-a}{2^j} = 0$ shows that $\lim_{j\to\infty} a_j = \lim_{j\to\infty} b_j = r$, and $r \in [a_j,b_j]$ for each $j$.

Because $r \in [a,b]$, there is an open interval $(p,q) \in T$ such that $r \in (p,q)$.

The distance the limit $r$ is from the boundary of the interval $(p,q)$ is $\epsilon = \min(r-p, q-r) > 0$. Since $\lim_{j\to\infty} \frac{b-a}{2^j} = 0$, there is a $j$ such that $\frac{b-a}{2^j} < \epsilon$.

But then $[a_j,b_j]$ is covered by the single open interval $(p,q) \in T$, contradicting the fact that $[a_j,b_j]$ could not be covered by a finite collection of intervals in $T$.

Thus, the assumption that $[a,b]$ cannot be covered by a finite number of intervals is false, and the theorem is proved.

The fact that the interval $[a,b]$ in the Heine–Borel Theorem is both closed and bounded is crucial. The interval $[1,\infty)$ is covered by the collection of open intervals $(j, j+2)$ for $j = 0, 1, 2, 3, \ldots$, but no finite collection of these open intervals can cover $[1,\infty)$. The interval $(0,5)$ is covered by the collection $\left(\frac{1}{j}, 5\right)$ for $j = 1, 2, 3, 4, \ldots$, but, again, no finite collection of these open intervals can cover $(0,5)$.
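The second example can be checked numerically. The sketch below (function names are illustrative, not from the text) produces, for any finite family $\left(\frac{1}{j}, 5\right)$ with $j = 1, \ldots, j_{\max}$, a point of $(0,5)$ that the family misses. It only spot-checks the claim; it is not a proof.

```python
def is_covered(x, j_max):
    """True if x lies in one of the intervals (1/j, 5) for j = 1..j_max."""
    return any(1.0 / j < x < 5 for j in range(1, j_max + 1))


def uncovered_point(j_max):
    """A point of (0, 5) missed by the intervals (1/j, 5), j = 1..j_max.

    The union of these intervals is (1/j_max, 5), so any point of
    (0, 1/j_max] is left out; 1/(2*j_max) is one such point.
    """
    return 1.0 / (2 * j_max)


for j_max in (1, 10, 1000):
    x = uncovered_point(j_max)
    print(j_max, x, is_covered(x, j_max))
```

Enlarging the family does eventually cover any fixed point, but every finite family still leaves a new point near 0 uncovered.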


With the Heine–Borel Theorem, it can now be shown that every continuous function on a closed bounded interval is uniformly continuous on that interval. The idea is simple enough: if $f$ is continuous on the closed bounded interval $[a,b]$, then, given $\epsilon > 0$, at each point $x \in [a,b]$ there is a $\delta > 0$ such that for any $y \in (x-\delta, x+\delta)$, it follows that $|f(x) - f(y)| < \epsilon$. Thus, there is an open interval around each $x \in [a,b]$ that has the desired property, and the Heine–Borel Theorem shows that $[a,b]$ can be covered by just a finite number of these open intervals. Since each of these finitely many open intervals is associated with a positive $\delta$, you can select the smallest to serve as the $\delta > 0$ needed in your proof of uniform continuity.

There are, though, a couple of subtleties that get in the way of this simple argument. First of all, for any $y$ in one of the open intervals $(x-\delta, x+\delta)$ you can conclude that $|f(y) - f(x)| < \epsilon$, but the proof will require that $|f(y) - f(z)| < \epsilon$ for any $y$ and $z$ that are within the chosen $\delta$ of each other, not just for $z = x$, the middle point of the interval. One can get around this problem by arranging that $|f(y) - f(x)| < \frac{\epsilon}{2}$ for all $y \in (x-\delta, x+\delta)$. This is a common trick in analysis proofs. The definition of continuity allows you to find a $\delta > 0$ that works for any given $\epsilon > 0$, so why not for $\frac{\epsilon}{2}$ which is also greater than 0? Then for any $y$ and $z$ in $(x-\delta, x+\delta)$, you can use the triangle inequality to conclude that

$$|f(y) - f(z)| = \left|\big(f(y) - f(x)\big) - \big(f(z) - f(x)\big)\right| \le |f(y) - f(x)| + |f(z) - f(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

There is a second problem with this strategy. If you select $y$ and $z$ within $\delta$ of each other, how do you know that they both lie within the same interval $(x-\delta, x+\delta)$? The interval $[a,b]$ is covered by a finite number of such intervals, but just because the two numbers $y$ and $z$ are close to each other does not mean that they will both fall within the same interval in your finite collection of open intervals. There are a couple of ways to get around this problem. One method is to consider the endpoints of the intervals in your finite collection of open intervals. Since the number of open intervals is finite, there are only finitely many endpoints to these intervals. You could select the $\delta$ in the proof not to be the least of the $\delta$s used for any of the intervals but to be the least distance between any two distinct elements of the collection of endpoints of these intervals. That ensures that if $y$ and $z$ are closer together than $\delta$, there can be at most one endpoint between $y$ and $z$. That will guarantee that $y$ and $z$ will both be within one of the finitely many open intervals. This follows from the fact that intervals in an open cover must overlap, so that each endpoint of one of the open intervals must be a member of one of the other open intervals in the open cover as seen in the following diagram (Fig. 4.6).

Fig. 4.6 $y$ and $z$ straddle one endpoint but remain in an interval of the open cover

A cleaner way to ensure that any $y$ and $z$ within $\delta$ of each other are in one of the finite number of intervals in the open cover of $[a,b]$ is to be more clever about choosing the original open intervals. Suppose that for all $y \in (x-\delta, x+\delta)$, it follows that $|f(y) - f(x)| < \epsilon$. You can be very conservative and use the open interval $\left(x - \frac{\delta}{2}, x + \frac{\delta}{2}\right)$ as the interval chosen to cover $x$ in the open cover of $[a,b]$. Then if $y$ and $z$ are very close, and $y \in \left(x - \frac{\delta}{2}, x + \frac{\delta}{2}\right)$ for some $x$, requiring $y$ and $z$ to be closer together than $\frac{\delta}{2}$ guarantees that both $y$ and $z$ will be in $(x-\delta, x+\delta)$, and the result will follow. The following proof uses the first strategy.

PROOF: A function continuous on a closed bounded interval is uniformly continuous on that interval.

Let $a < b$ be two real numbers, and let $f$ be continuous on the interval $[a,b]$.

Let $\epsilon > 0$ be given.

By the definition of continuity, for each $x \in [a,b]$ there is a $\delta_x > 0$ such that for all $y$ in $[a,b]$ with $|y - x| < \delta_x$ it follows that $|f(y) - f(x)| < \frac{\epsilon}{2}$.

Because for each $x \in [a,b]$, the point $x \in \left(x - \frac{\delta_x}{2}, x + \frac{\delta_x}{2}\right)$, this collection of open intervals covers $[a,b]$.

By the Heine–Borel Theorem, there is a finite set $\{x_1, x_2, x_3, \ldots, x_n\} \subseteq [a,b]$ such that the collection $I = \{(x_j - \delta_{x_j}, x_j + \delta_{x_j}) \mid j = 1, 2, 3, \ldots, n\}$ forms an open cover of $[a,b]$.

The set of endpoints of these intervals, $E = \{x_j \pm \delta_{x_j} \mid j = 1, 2, 3, \ldots, n\}$, is a finite set, so let $\delta$ be the smallest positive difference between two elements of $E$.

Let $y$ and $z$ be elements of $[a,b]$ with $|y - z| < \delta$.

Because the collection of intervals $I$ is an open cover of $[a,b]$, there are $j$ and $k$ such that $y \in (x_j - \delta_{x_j}, x_j + \delta_{x_j})$ and $z \in (x_k - \delta_{x_k}, x_k + \delta_{x_k})$. If $j = k$, then $y$ and $z$ are in the same interval of $I$. If $j \ne k$, then because $|y - z| < \delta$, there is at most one endpoint in $E$ between $y$ and $z$. Thus, either there is at most one endpoint of $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ or $(x_k - \delta_{x_k}, x_k + \delta_{x_k})$ between $y$ and $z$. This implies that either $y$ and $z$ are both in $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ or both in $(x_k - \delta_{x_k}, x_k + \delta_{x_k})$. In either case, there is a single interval $(x_m - \delta_{x_m}, x_m + \delta_{x_m}) \in I$ such that $y, z \in (x_m - \delta_{x_m}, x_m + \delta_{x_m})$.

Now it follows that $|f(y) - f(z)| = \left|\big(f(y) - f(x_m)\big) - \big(f(z) - f(x_m)\big)\right| \le |f(y) - f(x_m)| + |f(z) - f(x_m)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.

This shows that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $y, z \in [a,b]$ with $|y - z| < \delta$, then $|f(y) - f(z)| < \epsilon$. This completes the proof that $f$ is uniformly continuous on $[a,b]$.

Note that the fact that $[a,b]$ is both closed and bounded is crucial. The function $f(x) = x^2$ is continuous on the unbounded interval $[0,\infty)$, but $f$ is not uniformly continuous on this interval. Similarly, the function $f(x) = \frac{1}{x}$ is continuous on the open interval $(0,1)$, but $f$ is not uniformly continuous on this interval.
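To see concretely why $f(x) = x^2$ fails to be uniformly continuous on $[0,\infty)$, one can exhibit, for any proposed $\delta > 0$, two points closer together than $\delta$ whose squares differ by at least 1. The choice $y = \frac{1}{\delta}$, $z = y + \frac{\delta}{2}$ below is one convenient witness (an illustrative choice, not taken from the text), since $z^2 - y^2 = (z-y)(z+y) = 1 + \frac{\delta^2}{4}$.

```python
def witness_pair(delta):
    """Return (y, z) in [0, inf) with |y - z| < delta but |y**2 - z**2| >= 1."""
    y = 1.0 / delta
    z = y + delta / 2.0
    return y, z


for delta in (0.5, 0.01, 0.001):
    y, z = witness_pair(delta)
    print(delta, abs(y - z), abs(y**2 - z**2))
```

So no single $\delta$ can work for $\epsilon = 1$ across the whole unbounded interval.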


4.4.4 Exercises

1. Determine which of the following sets of real numbers are compact.

(a) $[0,12]$

(b) $(-2,2]$

(c) $[1,4] \cup [8,15]$

(d) $\{1, 3, 5, 7, 9\}$

(e) $\mathbb{R}$

(f) $\emptyset$

(g) $[2,6) \cup (6,11]$

(h) $\{0\} \cup \bigcup_{j=1}^{\infty} \left[\frac{1}{2j+1}, \frac{1}{2j}\right]$

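(a) $[1,10]$ is covered by the collection of open intervals $\left(r - \frac{1}{4}, r + \frac{1}{4}\right)$ where $r \in \mathbb{Q}$.

(b) $[0,3]$ is covered by the collection of open intervals $\left(\frac{1}{j}, 4\right)$ for $j = 1, 2, 3, \ldots$ along with the open interval $\left(-\frac{1}{10}, \frac{1}{10}\right)$.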

(c) 0; 1 is covered by the collection of open intervals 2j1 ; 2j for j D 1; 2; 3; : : :

1 1

along with 10

; 10 .

Write proofs of each of the following statements.

3. The intersection of two compact sets is a compact set.

4. The union of two compact sets is a compact set.

5. If $C$ is a compact set and $(a,b)$ is an open interval, then the set difference $C \setminus (a,b)$ is a compact set.

6. If the function $f$ is uniformly continuous on the interval $[a,b]$ and uniformly continuous on the interval $[b,c]$ for $a < b < c$, then $f$ is uniformly continuous on the interval $[a,c]$.

7. If a set $A$ has a cover consisting of a finite number of open intervals, then $A$ has a subcover such that for each $x \in A$, $x$ is an element of at most two of the open intervals in the subcover.

Chapter 3 discusses several theorems about how one can calculate limits when faced with the addition, subtraction, multiplication, or division of functions whose limits are known. As one might expect, since continuity and limits are closely related, the proofs of the corresponding theorems about functions continuous at a point are, in fact, very similar. Before starting, it is worth pointing out that if $f$ and $g$ are two functions, then you can define the new functions $f + g$, $f - g$, $fg$, and $\frac{f}{g}$ at all points in the intersection of the domain of $f$ and the domain of $g$ and, in the case of $\frac{f}{g}$, only where $g$ is not 0. Generally, one is interested in functions that have a common domain, but sometimes this is not the case. Pathological examples do exist. It could be, for example, that $f$ is only defined for positive real numbers, and $g$ is only defined for negative real numbers as with $f(x) = \frac{1}{\sqrt{x}}$ and $g(x) = \frac{1}{\sqrt{-x}}$. Then $f + g$ has an empty domain and is the empty function, one that contains no ordered pairs, $\big(x, f(x)\big)$. Oddly, the definition of continuity says that the empty function is continuous because it satisfies the definition at each point of its empty domain.

Suppose that functions $f$ and $g$ have a common domain where the point $a$ is an accumulation point of that domain. Also suppose that $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = H$. Recall that when proving that the limit of $f + g$ is $L + H$, you are given an $\epsilon > 0$ and can use the definition of limit to conclude that there are $\delta_1 > 0$ and $\delta_2 > 0$ such that if $x$ is in the common domain of $f$ and $g$ with $0 < |x - a| < \delta_1$, then $|f(x) - L| < \frac{\epsilon}{2}$, and if $0 < |x - a| < \delta_2$, then $|g(x) - H| < \frac{\epsilon}{2}$. Then the triangle inequality allows you to conclude that for all $x$ with $0 < |x - a| < \min(\delta_1, \delta_2)$ that

$$\left|\big(f(x) + g(x)\big) - (L + H)\right| = \left|\big(f(x) - L\big) + \big(g(x) - H\big)\right| \le |f(x) - L| + |g(x) - H| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

The same method works for the proof about continuity of $f + g$ at $a$ with minor changes made to match the template for writing proofs about continuity of a function at a point. Of course, the same logic works for proving the continuity of $f - g$, so the two results might as well be combined as follows.

PROOF: Suppose that $f$ and $g$ are functions with common domain containing the point $a$. If both $f$ and $g$ are continuous at the point $a$, then so are the functions $f + g$ and $f - g$.

Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and assume that $f$ and $g$ are both continuous at $a$.

Let $\epsilon > 0$ be given.

By the definition of continuity, there is a $\delta_1 > 0$ such that if $x \in A$ and $|x - a| < \delta_1$, then $|f(x) - f(a)| < \frac{\epsilon}{2}$.

Similarly, there is a $\delta_2 > 0$ such that if $x \in A$ and $|x - a| < \delta_2$, then $|g(x) - g(a)| < \frac{\epsilon}{2}$.

Let $\delta = \min(\delta_1, \delta_2)$.

Then if $x \in A$ with $|x - a| < \delta$,

$$\left|\big(f(x) \pm g(x)\big) - \big(f(a) \pm g(a)\big)\right| = \left|\big(f(x) - f(a)\big) \pm \big(g(x) - g(a)\big)\right| \le |f(x) - f(a)| + |g(x) - g(a)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

This shows that $f + g$ and $f - g$ are continuous at $a$.

Now suppose that $f$ and $g$ are functions as discussed above with $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = H$. Recall how you prove that $\lim_{x\to a} f(x)g(x) = LH$. Again, as with the proof for the sum of the limits, given $\epsilon > 0$ you find $\delta > 0$ so that both $|f(x) - L|$ and $|g(x) - H|$ are small when $0 < |x - a| < \delta$. How small do these need to be? The idea was to write $|f(x)g(x) - LH|$ as $|f(x)g(x) - f(x)H + f(x)H - LH| \le |f(x)|\,|g(x) - H| + |H|\,|f(x) - L|$. Thus, $\delta_1 > 0$ can be chosen to ensure that $|f(x) - L|$ is less than 1, $\delta_2 > 0$ so that $|f(x) - L|$ is less than $\frac{\epsilon}{2(|H|+1)}$, and $\delta_3$ so that $|g(x) - H|$ is less than $\frac{\epsilon}{2(|L|+1)}$. Then $\delta$ can be set to the least of $\delta_1$, $\delta_2$, and $\delta_3$. The proof for continuity of $fg$ at the point $a$ follows this same strategy.

PROOF: Suppose that $f$ and $g$ are functions with common domain containing the point $a$. If both $f$ and $g$ are continuous at the point $a$, then so is the function $fg$.

Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and assume that $f$ and $g$ are both continuous at $a$.

Let $\epsilon > 0$ be given.

By the definition of continuity, there is a $\delta_1 > 0$ such that if $x \in A$ and $|x - a| < \delta_1$, then $|f(x) - f(a)| < 1$, and thus, $|f(x)| < |f(a)| + 1$.

There is a $\delta_2 > 0$ such that if $x \in A$ and $|x - a| < \delta_2$, then $|f(x) - f(a)| < \frac{\epsilon}{2(|g(a)|+1)}$.

There is a $\delta_3 > 0$ such that if $x \in A$ and $|x - a| < \delta_3$, then $|g(x) - g(a)| < \frac{\epsilon}{2(|f(a)|+1)}$.

Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.

Then if $x \in A$ with $|x - a| < \delta$,

$$\begin{aligned}
|f(x)g(x) - f(a)g(a)| &= |f(x)g(x) - f(x)g(a) + f(x)g(a) - f(a)g(a)| \\
&\le |f(x)|\,|g(x) - g(a)| + |g(a)|\,|f(x) - f(a)| \\
&< \big(|f(a)| + 1\big)\frac{\epsilon}{2(|f(a)|+1)} + |g(a)|\frac{\epsilon}{2(|g(a)|+1)} \\
&\le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{aligned}$$

This shows that $fg$ is continuous at $a$.

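Finally, suppose that $f$ and $g$ are functions as discussed above with $\lim_{x\to a} f(x) = L$ and $\lim_{x\to a} g(x) = H$ and $H \ne 0$. This time recall how you prove that $\lim_{x\to a} \frac{f(x)}{g(x)} = \frac{L}{H}$. The idea is the same as with the proof for products, but the algebra took a few more steps.

$$\left|\frac{f(x)}{g(x)} - \frac{L}{H}\right| = \left|\frac{f(x)H - Lg(x)}{g(x)H}\right| = \left|\frac{f(x)H - LH + LH - Lg(x)}{g(x)H}\right| = \left|\frac{\big(f(x) - L\big)H + L\big(H - g(x)\big)}{g(x)H}\right| \le \frac{|f(x) - L|}{|g(x)|} + \frac{|L|\,|H - g(x)|}{|g(x)|\,|H|}.$$

You first choose a $\delta_1 > 0$ so that $|x - a| < \delta_1$ gives $|g(x) - H| < \frac{|H|}{2}$ which, in turn, implies that $|g(x)| > \frac{|H|}{2}$. Then you choose a $\delta_2 > 0$ so that $|x - a| < \delta_2$ gives $|f(x) - L| < \frac{\epsilon |H|}{4}$. Lastly, you choose a $\delta_3 > 0$ so that $|x - a| < \delta_3$ gives $|g(x) - H| < \frac{\epsilon H^2}{4(|L|+1)}$. This allowed you to conclude $\left|\frac{f(x)H - Lg(x)}{g(x)H}\right| < \epsilon$. Again, the proof for continuity can be constructed by changing the limit $L$ to $f(a)$, the limit $H$ to $g(a)$, and making some other minor wording changes.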


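PROOF: Suppose that $f$ and $g$ are functions with common domain containing the point $a$ with $g(a) \ne 0$. If both $f$ and $g$ are continuous at the point $a$, then so is the function $\frac{f}{g}$.

Let $f$ and $g$ be functions both defined on a set $A$ containing the point $a$, and assume that $f$ and $g$ are both continuous at $a$ with $g(a) \ne 0$.

Let $\epsilon > 0$ be given.

Note that $|g(a)| > 0$. By the definition of continuity, there is a $\delta_1 > 0$ such that if $x \in A$ and $|x - a| < \delta_1$, then $|g(x) - g(a)| < \frac{|g(a)|}{2}$. For these $x$ it follows that $|g(x)| + \frac{|g(a)|}{2} > |g(x)| + |g(x) - g(a)| = |g(x)| + |g(a) - g(x)| \ge |g(x) + g(a) - g(x)| = |g(a)|$ which implies that $|g(x)| > |g(a)| - \frac{|g(a)|}{2} = \frac{|g(a)|}{2}$.

By the definition of continuity, there is a $\delta_2 > 0$ such that if $x \in A$ and $|x - a| < \delta_2$, then $|f(x) - f(a)| < \frac{\epsilon |g(a)|}{4}$.

By the definition of continuity, there is a $\delta_3 > 0$ such that if $x \in A$ and $|x - a| < \delta_3$, then $|g(x) - g(a)| < \frac{\epsilon\, g(a)^2}{4(|f(a)|+1)}$.

Let $\delta = \min(\delta_1, \delta_2, \delta_3)$.

Then if $x \in A$ with $|x - a| < \delta$,

$$\begin{aligned}
\left|\frac{f(x)}{g(x)} - \frac{f(a)}{g(a)}\right| &= \left|\frac{f(x)g(a) - f(a)g(x)}{g(x)g(a)}\right| = \left|\frac{f(x)g(a) - f(a)g(a) + f(a)g(a) - f(a)g(x)}{g(x)g(a)}\right| \\
&= \left|\frac{\big(f(x) - f(a)\big)g(a) + f(a)\big(g(a) - g(x)\big)}{g(x)g(a)}\right| \le \frac{|f(x) - f(a)|}{|g(x)|} + \frac{|f(a)|\,|g(a) - g(x)|}{|g(x)|\,|g(a)|} \\
&< \frac{\epsilon |g(a)|}{4} \cdot \frac{2}{|g(a)|} + \frac{\epsilon\, g(a)^2}{4(|f(a)|+1)} \cdot \frac{2|f(a)|}{|g(a)|^2} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{aligned}$$

This shows that $\frac{f}{g}$ is continuous at $a$.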

4.5.1 Exercises

1. Suppose that $f$ and $g$ are functions that are both uniformly continuous on a set $A$. Find an example showing that their product need not be uniformly continuous on $A$.

Write proofs for each of the following statements.

5

3. All polynomials are continuous on R.

4. All rational functions are continuous on R except at points where their denominators are 0.

5. If $f$ and $g$ are uniformly continuous on the set $A$, then $f + g$ and $f - g$ are also uniformly continuous on $A$.


is discontinuous at a, then g is discontinuous at a.

7. Suppose $f$ and $g$ have common domain $A$ and $fg$ is continuous at $a \in A$. If $f$ is discontinuous at $a$, then $g$ is either discontinuous at $a$ or $g(a) = 0$.

Recall that the two functions $g : A \to B$ and $f : B \to C$ can be composed to obtain $f \circ g : A \to C$. An important property of composition is that if the function $g$ is continuous at $a \in A$ and the function $f$ is continuous at $g(a) \in B$, then the composition $f \circ g$ is continuous at $a$. The proof of this result can follow the template for proofs of continuity at a point. Such a proof would introduce an $\epsilon > 0$ and end with concluding that $|f \circ g(x) - f \circ g(a)| = \left|f\big(g(x)\big) - f\big(g(a)\big)\right| < \epsilon$. The continuity of $f$ at $g(a)$ allows you to claim that $\left|f\big(g(x)\big) - f\big(g(a)\big)\right|$ is small if $g(x)$ is close to $g(a)$. But it is easy to ensure that $g(x)$ is close to $g(a)$ because $g$ is continuous at $a$. So, you can choose a $\delta > 0$ to make $|g(x) - g(a)|$ as small as necessary. How small is that? The continuity of $f$ tells you how small. So, given $\epsilon > 0$, choose a $\delta' > 0$ so that $|y - g(a)| < \delta'$ implies $\left|f(y) - f\big(g(a)\big)\right| < \epsilon$. Then choose a $\delta > 0$ so that $|x - a| < \delta$ implies $|g(x) - g(a)| < \delta'$. This gives $\left|f\big(g(x)\big) - f\big(g(a)\big)\right| < \epsilon$. The complete proof can be written as follows.

PROOF: Suppose that function $g$ has domain $A$ with its range contained in the set $B$, and that function $f$ has domain $B$. If $g$ is continuous at $a \in A$ and $f$ is continuous at $g(a)$, then the composition $f \circ g$ is continuous at $a$.

Let $g$ be a function with domain $A$ with its range contained in the set $B$, and let $f$ be a function with domain $B$. Assume $g$ is continuous at $a \in A$, and $f$ is continuous at $g(a)$.

Let $\epsilon > 0$ be given.

Because $f$ is continuous at $g(a)$, there is a $\delta' > 0$ such that if $y$ is in the domain of $f$ with $|y - g(a)| < \delta'$, then $\left|f(y) - f\big(g(a)\big)\right| < \epsilon$.

Because $g$ is continuous at $a$, there is a $\delta > 0$ such that if $x \in A$ with $|x - a| < \delta$, then $|g(x) - g(a)| < \delta'$.

If $x \in A$, then $|x - a| < \delta$ implies $|g(x) - g(a)| < \delta'$ which, in turn, implies $\left|f\big(g(x)\big) - f\big(g(a)\big)\right| < \epsilon$.

This shows that $f \circ g$ is continuous at $a$.
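The two-stage choice of $\delta'$ and then $\delta$ can be traced on a concrete pair of functions. The sketch below assumes the illustrative choices $f(y) = 3y + 1$ and $g(x) = 2x$ (not from the text), for which $\delta' = \frac{\epsilon}{3}$ works for $f$ and $\delta = \frac{\delta'}{2}$ works for $g$. It then spot-checks the conclusion at a few sample points.

```python
def f(y):
    return 3 * y + 1


def g(x):
    return 2 * x


def delta_for_composition(eps):
    """Chain the two continuity conditions as in the proof above."""
    delta_prime = eps / 3      # |y - g(a)| < delta_prime  =>  |f(y) - f(g(a))| < eps
    delta = delta_prime / 2    # |x - a| < delta  =>  |g(x) - g(a)| < delta_prime
    return delta


a, eps = 1.5, 0.01
delta = delta_for_composition(eps)
for t in (-0.99, -0.5, 0.0, 0.5, 0.99):
    x = a + t * delta          # sample points with |x - a| < delta
    print(x, abs(f(g(x)) - f(g(a))) < eps)
```

For these linear functions the $\delta$ found does not depend on $a$; for general continuous functions both $\delta'$ and $\delta$ may depend on the point.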

As an example of how useful this theorem is, consider the function $|x|$. One can prove that this function is continuous fairly easily by following the template for proofs that a function $f$ is continuous at a point $a$. Indeed, such a proof must end with the conclusion that $\big||x| - |a|\big| < \epsilon$. By considering cases for $x$ and $a$ being negative or nonnegative, it can be seen that $\big||x| - |a|\big| \le |x - a|$, so if $|x - a|$ is made less than $\epsilon$, then $\big||x| - |a|\big| < \epsilon$. This, in fact, shows that $|x|$ is uniformly continuous.

PROOF: The function $|x|$ is uniformly continuous.

Let $f(x) = |x|$.

Let $\epsilon > 0$ be given, and set $\delta = \epsilon$.

Let $a$ be any real number.

Note that if $x$ and $a$ are either both greater than or equal to 0, or both less than or equal to 0, then $\big||x| - |a|\big| = |x - a|$, but that if $x$ and $a$ have opposite signs, then $|x - a| = |x| + |a| > \big||x| - |a|\big|$. In either case, $\big||x| - |a|\big| \le |x - a|$.

Hence, if $|x - a| < \delta$, then $|f(x) - f(a)| \le |x - a| < \delta = \epsilon$.

Thus, $|x|$ is uniformly continuous.

As easy as this proof is, the continuity of $|x|$ can more easily be proved as follows.

PROOF: The function $|x|$ is continuous.

Let $g(x) = x^2$, and $f(x) = \sqrt{x}$.

Let $a$ be any real number.

Then $g$ is continuous at $a$, and, since $g(a) = a^2 \ge 0$, $f$ is continuous at $g(a)$.

Because the function $|x| = \sqrt{x^2} = (f \circ g)(x)$, it follows that $|x|$ is continuous at $a$.

In turn, this result can be used to show that if $f$ and $g$ are functions with domain $A$, and $f$ and $g$ are both continuous at $a \in A$, then the functions $\min(f,g)$ and $\max(f,g)$ are both continuous at $a$. This is because the functions $\min(f,g)$ and $\max(f,g)$ can be expressed in terms of absolute value.

PROOF: If $f$ and $g$ are functions with the same domain $A$, and both functions are continuous at $a \in A$, then the function $\min(f,g)$ is continuous at $a$.

Let $f$ and $g$ be functions with the same domain $A$, and assume that both functions are continuous at $a \in A$.

Note that for any two real numbers $y$ and $z$, if $y > z$, then $y + z - |y - z| = y + z - (y - z) = 2z$, but if $y \le z$, then $y + z - |y - z| = y + z - (z - y) = 2y$. In either case, $y + z - |y - z| = 2\min(y,z)$.

Thus, for any $x$, $\min\big(f(x), g(x)\big) = \frac{f(x) + g(x) - |f(x) - g(x)|}{2}$.

Since $f$ and $g$ are continuous at $a$, so is $f - g$.

Since $f - g$ is continuous at $a$, so is $|f - g|$.

It then follows that the combination $\min(f,g) = \frac{f + g - |f - g|}{2}$ is continuous at $a$.
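The identity $y + z - |y - z| = 2\min(y,z)$ at the heart of this proof is easy to spot-check numerically. The helper name `min_via_abs` below is illustrative only.

```python
def min_via_abs(y, z):
    """Compute min(y, z) using only addition, subtraction, and absolute value."""
    return (y + z - abs(y - z)) / 2


# Cover both cases of the proof (y > z and y <= z), plus negatives and ties.
for y, z in [(3, 7), (7, 3), (-2, -5), (4, 4), (0, -1)]:
    print(y, z, min_via_abs(y, z), min(y, z))
```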


4.6.1 Exercises

1. Find examples of functions $f$ and $g$ defined on $\mathbb{R}$ with $\lim_{x\to a} f(x) = L$ and $\lim_{y\to L} g(y) = M$ such that $\lim_{x\to a} g\big(f(x)\big) \ne M$.

2. If $g$ is uniformly continuous on its domain $A$, and $f$ is uniformly continuous on the range of $g$, then $f \circ g$ is uniformly continuous on $A$.

3. If $f$ and $g$ are functions with the same domain $A$, and both $f$ and $g$ are continuous at $a \in A$, then $\max(f,g)$ is continuous at $a$.

4. If $f$ and $g$ are functions uniformly continuous on the same domain $A$, then $\min(f,g)$ is uniformly continuous on $A$.

5. If $f_1, f_2, f_3, \ldots, f_n$ are all uniformly continuous on the same domain $A$, then so is $\max(f_1, f_2, f_3, \ldots, f_n)$.

4.7.1 Boundedness of Continuous Functions

A function that is continuous on a closed bounded interval $[a,b]$ satisfies some important properties. In particular, there are points $u$ and $v$ in $[a,b]$ such that for all $x \in [a,b]$, $f(u) \le f(x) \le f(v)$. In this case, $f(u)$ is the minimum value of $f$ on $[a,b]$ and $f(v)$ is the maximum value of $f$ on $[a,b]$. This result is often stated as "a function continuous on a closed bounded interval obtains its minimum and its maximum value." Note that the function $f(x) = x$ is continuous on the open interval $(1,2)$, but it does not take on a minimum or maximum value there. Clearly, $f(x) < 2$ for each value of $x \in (1,2)$, and 2 is the least upper bound of all values achieved by $f$ on that interval, but the least upper bound is never achieved.

Before showing that the function $f$ continuous on the closed bounded interval $[a,b]$ obtains its minimum and maximum values, it is convenient to first show that such a function must be bounded, that is, there is a real number $M$ such that for all $x \in [a,b]$, $|f(x)| \le M$. This can be proved by contradiction by assuming that $f$ is continuous on $[a,b]$, but that no bound exists. How do you quantify the assumption "$f$ is not bounded"? You cannot just assume that $f$ takes on an infinitely large value because $f(x)$ is a real number for each value of $x \in [a,b]$ and, hence, $f(x)$ cannot be infinitely large for any value of $x$. You need to construct the negation of the statement "there is an $M$ such that for all $x \in [a,b]$, $|f(x)| \le M$." This is a statement with two quantifiers: "there is an $M$" and "for all $x \in [a,b]$." The two quantifiers are followed by the proposition (statement) that $|f(x)| \le M$. The first quantifier is an existential quantifier, and the second quantifier is a universal quantifier. Thus, the statement "there is an $M$ such that for all $x \in [a,b]$, $|f(x)| \le M$" has an existential quantifier stating that there exists a number $M$ satisfying a property. This is followed by a universal quantifier stating that all $x$ in the interval $[a,b]$ satisfy a property. Finally, the property is given as $|f(x)| \le M$.

The rule of thumb for constructing the negation of statements with quantifiers is to replace each existential quantifier with a corresponding universal quantifier, replace each universal quantifier with a corresponding existential quantifier, and replace the property with the negation of that property. In this example, the existential quantifier "there exists a number $M$" would be replaced by the universal quantifier "for all numbers $M$." Then the universal quantifier "for all $x \in [a,b]$" would be replaced by the existential quantifier "there exists an $x \in [a,b]$." Finally, the property $|f(x)| \le M$ would be replaced by its negation $|f(x)| > M$. The resulting negation is "for all numbers $M$ there is an $x \in [a,b]$ such that $|f(x)| > M$."

Your proof of the boundedness of $f$ would begin by introducing $f$ and the interval $[a,b]$. Then it would assume the negation just discussed. The remainder of the proof would be to derive a contradiction, and that would show that the assumption made at the outset of the proof is false, so its negation, the statement you were trying to prove, must be true.
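The rule of thumb for negating quantifiers has a direct finite analogue in Python, where `any` plays the role of "there exists" and `all` plays the role of "for all". The sketch below is only a toy model: the real statement quantifies over infinitely many $M$ and $x$, whereas the function names and the finite lists of values and candidate bounds here are assumptions made for illustration.

```python
def bounded_by_some_candidate(values, candidates):
    """'There is an M such that for all x, |f(x)| <= M' over finite lists."""
    return any(all(abs(v) <= M for v in values) for M in candidates)


def negation(values, candidates):
    """'For all M there is an x such that |f(x)| > M' over the same lists."""
    return all(any(abs(v) > M for v in values) for M in candidates)


values = [1, -4, 2.5]           # stand-ins for the values f(x)
for candidates in ([1, 2, 3], [1, 5]):
    print(candidates,
          bounded_by_some_candidate(values, candidates),
          negation(values, candidates))
```

In each case exactly one of the two statements holds, which is what swapping the quantifiers and negating the property is meant to achieve.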

Thus, the proof would begin with a statement about $f$ being a continuous function on the closed bounded interval $[a,b]$ which would be followed by the negation of the statement you want to prove. So how do you use this negation to reach a contradiction? Well, just see where this assumption leads you. If for each $M$ you can find an $x \in [a,b]$ where $|f(x)| > M$, it means that there is an $x_1$ such that $|f(x_1)| > 1$. Similarly, there is an $x_2$ such that $|f(x_2)| > 2$. In this way, you can assert that there is a sequence $x_1, x_2, x_3, \ldots$ such that for each $n \ge 1$, $|f(x_n)| > n$. Note that this gives you an infinite sequence of values in the closed bounded interval $[a,b]$. The Bolzano–Weierstrass Theorem states that every infinite bounded set has an accumulation point. Does the sequence $x_1, x_2, x_3, \ldots$ produce such an infinite bounded set? Well, it is certainly bounded because each $x_n$ is in the interval $[a,b]$. Is it possible that the sequence does not give an infinite collection of points? For that to happen, it would have to be the case that infinitely many of the values in the sequence were equal to each other. Actually, just because you choose $x_1$ so that $|f(x_1)| > 1$ does not preclude having $|f(x_1)| > 100$, so the value $x_1$ could appear in the sequence many times. This is awkward. It would be easier if you chose a sequence of distinct values. This is actually not hard to do. Rather than choosing $x_n$ so that $|f(x_n)| > n$, why not choose $x_1$ as above, and for each $n \ge 1$ choose $x_{n+1}$ so that $|f(x_{n+1})| > |f(x_n)| + 1$. This would not only imply that for each $n$, $|f(x_n)| > n$ but also that $x_n$ could not equal any of the values that appear earlier in the sequence.

So what can you do with the infinite sequence of $x_n$ values with its guaranteed accumulation point, $y$? First note that the accumulation point $y$ is also in $[a,b]$ because all of the $x_n$ values satisfy both $a \le x_n$ and $x_n \le b$, so the accumulation point $y$ must also satisfy $a \le y \le b$. Otherwise, there would be an interval around $y$ that did not share any points with $[a,b]$, so it would not contain any of the $x_n$ values. This means that $f$ is defined and continuous at $y$. That implies that there is a $\delta > 0$ such that for all $x \in [a,b]$ satisfying $|x - y| < \delta$, it follows that $|f(x) - f(y)| < 1$. But that means $|f(x)| < |f(y)| + 1$. But $y$ is an accumulation point for the sequence of $x_n$ values, so there are infinitely many of the $x_n$ within $\delta$ of $y$, and some of them will necessarily have the property that $|f(x_n)| > |f(y)| + 1$. This gives the needed contradiction (Fig. 4.7).

Fig. 4.7 A continuous function on $[a,b]$ is bounded

PROOF: A function continuous on a closed bounded interval is bounded.

Let $a \le b$ be real numbers, and let $f$ be a function continuous on the interval $[a,b]$.

Assume that $f$ on $[a,b]$ is not bounded. That is, for every real number $M$, there is an $x \in [a,b]$ such that $|f(x)| > M$.

Select $x_1 \in [a,b]$ so that $|f(x_1)| > 1$.

Construct a sequence inductively as follows. Assume that for some $n \ge 1$ the sequence $x_1, x_2, x_3, \ldots, x_n$ has been selected. Choose $x_{n+1} \in [a,b]$ such that $|f(x_{n+1})| > |f(x_n)| + 1$.

Note that for each $n$ the $x_n$ chosen in this manner must be distinct from all values of $x_j$ chosen before it in the sequence, and that $|f(x_n)| > n$.

The terms of the sequence $x_1, x_2, x_3, \ldots$ form an infinite set contained in the interval $[a,b]$, so it is an infinite bounded set. By the Bolzano–Weierstrass Theorem the set has an accumulation point $y$.

If $y$ lies outside of $[a,b]$, then there is an open interval containing $y$ that contains no points of $[a,b]$. Thus, that open interval would not contain any terms of the sequence which cannot happen if $y$ is an accumulation point of the sequence. Therefore, $y \in [a,b]$.

Because $f$ is continuous at $y$, there is a $\delta > 0$ such that for all $x \in [a,b]$ with $|x - y| < \delta$, it follows that $|f(x) - f(y)| < 1$, and thus, $|f(x)| < |f(y)| + 1$.

Since $y$ is an accumulation point of the sequence $x_1, x_2, x_3, \ldots$, there must be infinitely many terms of the sequence within $\delta$ of $y$. Thus, there must be an $n$ such that $n > |f(y)| + 1$ and $x_n$ is within $\delta$ of $y$. This implies that $|f(x_n)| > n > |f(y)| + 1$ which contradicts the fact that $|f(x_n) - f(y)| < 1$.

Therefore, the assumption that $f$ is not bounded must be false, and the theorem is proved.

Using the fact that continuous functions on closed bounded intervals are bounded, there is a nice trick to show that a function $f$ continuous on the closed bounded interval $[a, b]$ must achieve its extreme values, that is, its minimum and maximum. The fact that the set of values that $f$ takes on is a bounded set implies that the set of values has a least upper bound, $M$. If $f$ is never equal to $M$, then the function $M - f(x)$ is positive for all $x \in [a, b]$ because $M$ is an upper bound, and $f(x)$ is never equal to $M$. This implies that the function $\frac{1}{M - f(x)}$ is also continuous on the interval $[a, b]$. But then you can again apply the previous theorem to show that there is a number $K$ such that for all $x \in [a, b]$, $\frac{1}{M - f(x)} \le K$. Taking reciprocals one more time shows $M - f(x) \ge \frac{1}{K}$ which implies that $f(x) \le M - \frac{1}{K}$. This shows that $M - \frac{1}{K} < M$ is an upper bound for $f$ on $[a, b]$ when $M$ was assumed to be the least upper bound. This is a contradiction, and you must conclude that $f(x) = M$ for at least one value of $x \in [a, b]$. The formal proof can be written as follows (Fig. 4.8).

Fig. 4.8 The maximum and minimum of a function $f(x)$ on an interval

PROOF: A function continuous on a closed bounded interval obtains its maximum value and its minimum value at some points in the interval.

Let $a \le b$ be real numbers, and let $f$ be a function continuous on the interval $[a, b]$.

The set $B = \{f(x) \mid x \in [a, b]\}$ is not empty because it contains $f(a)$, and $B$ is bounded above because all functions continuous on a closed bounded interval are bounded.

Let $M$ be the least upper bound of set $B$.

Assume that for all $x \in [a, b]$, $f(x) \ne M$.

The function $M - f(x)$ is continuous on $[a, b]$ and is never equal to 0. Hence, $M - f(x) > 0$ on $[a, b]$.

It follows that the function $\frac{1}{M - f(x)}$ is continuous on $[a, b]$.

Because all functions continuous on a closed bounded interval are bounded, there is a real number $K > 0$ such that $\frac{1}{M - f(x)} \le K$ on $[a, b]$.

But then $M - f(x) \ge \frac{1}{K}$ on $[a, b]$, so $f(x) \le M - \frac{1}{K}$ on $[a, b]$.

Since $K > 0$, the set $B$ is bounded above by $M - \frac{1}{K} < M$. This means that $M - \frac{1}{K}$ is an upper bound for $B$ which contradicts the fact that $M$ was the least upper bound of $B$.

Therefore, the assumption that $f$ was never equal to $M$ is false, and there must be a value $x \in [a, b]$ such that $f(x) = M$.

Applying the preceding argument to the function $-f$, which is also continuous on $[a, b]$, shows that there is an $x \in [a, b]$ such that $-f(x)$ is equal to the maximum of $-f$ on $[a, b]$. But then $f(x)$ is the minimum value of $f$ on $[a, b]$. This completes the proof of the theorem.

Suppose the function $f$ is defined on an interval containing $c$ and $d$, and the graph of $f$ passes through the points $(c, f(c))$ and $(d, f(d))$. It might be that the graph of the function passes through every value of $y$ between $f(c)$ and $f(d)$ as it moves between the points $(c, f(c))$ and $(d, f(d))$ as shown in the figure (Fig. 4.9). For example, the function $f(x) = 2x^2 - 3$ is defined for all real numbers with $f(1) = -1$ and $f(2) = 5$. For each $y$ between $-1$ and $5$, the value $x = \sqrt{\frac{y + 3}{2}}$ lies between 1 and 2 and satisfies $f(x) = y$. Formally, a function defined on an interval $[a, b]$ is said to have the intermediate value property on that interval if for each choice of $c$ and $d$ with $a \le c \le d \le b$ and each $y$ between $f(c)$ and $f(d)$, there is an $x \in [c, d]$ such that $f(x) = y$. The Intermediate Value Theorem states that any function continuous on an interval has the intermediate value property there. If you consider the intuitive notion of continuity where you say that $f$ is continuous on $[a, b]$ if you can draw the graph of $f$ without lifting your pencil from the paper, then this intermediate value property becomes clear because in going from $f(c)$ to $f(d)$, your pencil will necessarily cross over all the $y$ values between $f(c)$ and $f(d)$.

Fig. 4.9 $f$ passing through each $y$ between $f(c)$ and $f(d)$

To prove the Intermediate Value Theorem you would begin by setting the context by introducing a function $f$ continuous on an interval $[a, b]$ and points $c$ and $d$ with $a \le c \le d \le b$. Then you would select an arbitrary $y$ between $f(c)$ and $f(d)$. The proof would have to demonstrate the existence of an $x$ between $c$ and $d$ with $f(x) = y$. How is this to be done? As with many other proofs in Analysis, one shows the existence of a real number by constructing a set for which that number is a least upper bound. Consider, for example, the case where $f(c) < y < f(d)$. You could construct the set $S = \{x \in [c, d] \mid f(x) \le y\}$. This set is not an empty set because $c \in S$, and $S$ is certainly bounded above by $d$. Thus, the Completeness Axiom says that the set has a least upper bound, $s$. Now you can refer to the continuity of $f$ which will show that if $f(s) < y$, then there is a $\delta > 0$ such that $|x - s| < \delta$ implies that $f(x) < y$ showing that there are values greater than $s$ for which $f(x) < y$ contradicting the fact that $s$ is an upper bound of $S$. If $f(s) > y$, then there is a $\delta > 0$ such that $|x - s| < \delta$ implies that $f(x) > y$ showing that $s - \delta < s$ is an upper bound for $S$ contradicting the fact that $s$ is the least upper bound of $S$. The only remaining conclusion is that $f(s) = y$ which provides the example, $x = s$, needed to prove the theorem.

Note that the above argument did not cover the general case where $f(c)$ and $f(d)$ can be in any order. The argument so far only covers the specific case where $f(c) < f(d)$. So is there more proof to write? It is easy to see that the case $f(c) > f(d)$ can be proved with an argument virtually identical to the one given above by changing the sense of some of the inequalities. The case of $f(c) = f(d)$ is even easier because the only possible $y$ between $f(c)$ and $f(d)$ is $f(c)$, so the value $x = c$ gives the needed $f(x) = y$. Thus, giving the argument for $f(c) < f(d)$ essentially covers all the needed cases, and it would be very easy for the reader to add the needed arguments to complete the proof for the missing cases. In this situation it is common for the proof to cover only the specific condition $f(c) < f(d)$ and introduce it with the phrase "without loss of generality." In this case the phrase means that although the following assumption looks like it only covers some of the necessary cases, the omitted cases are either very easy or virtually identical to the case being considered, so the argument is in fact completely general. With this in mind, the following is a proof of the Intermediate Value Theorem.

PROOF (Intermediate Value Theorem): Let the function $f$ be continuous on the interval $[a, b]$ containing $c$ and $d$. If $y$ is any value between $f(c)$ and $f(d)$, then there exists $x$ between $c$ and $d$ such that $f(x) = y$.

Let $f$ be a function continuous on $[a, b]$, and let $c$ and $d$ be in $[a, b]$.

Let $y$ be any value between $f(c)$ and $f(d)$.

Without loss of generality, assume that $c \le d$ and $f(c) \le y \le f(d)$.

Let set $S = \{x \in [c, d] \mid f(x) \le y\}$.

$S$ is not empty because $f(c) \le y$ implying $c \in S$.

$S$ is bounded above by $d$.

By the Completeness Axiom $S$ has a least upper bound $s$ which will be an element of $[a, b]$.

If $f(s) < y$, then by the continuity of $f$, there is a $\delta > 0$ such that if $x \in [a, b]$ with $|x - s| < \delta$, then $|f(x) - f(s)| < \frac{y - f(s)}{2}$, and, in particular, $f(x) < y$. Since $f(d) \ge y > f(s)$, it must be that $s < d$. This shows that there is an $x$ with $s < x \le d$ and $f(x) < y$, so $x \in S$ contradicting the fact that $s$ is an upper bound of $S$.

If $f(s) > y$, then by the continuity of $f$, there is a $\delta > 0$ such that if $x \in [a, b]$ with $|x - s| < \delta$, then $|f(x) - f(s)| < \frac{f(s) - y}{2}$, and, in particular, $f(x) > y$. This shows that for all $x$ between $s - \delta$ and $s$, $f(x) > y$, so $s - \delta$ is an upper bound of $S$ contradicting the fact that $s$ is the least upper bound of $S$.

It follows that $f(s)$ must equal $y$ which completes the proof of the theorem.
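The least-upper-bound argument shows that a suitable $s$ exists but does not compute it. As a rough numerical companion (not part of the original text), the sketch below uses repeated bisection, a different technique that relies on the same hypothesis $f(c) \le y \le f(d)$. The name `bisect_ivt` and the tolerance are illustrative assumptions, and the example function is the $f(x) = 2x^2 - 3$ used earlier in this section.

```python
import math


def bisect_ivt(f, c, d, y, tol=1e-12):
    """Approximate an x in [c, d] with f(x) = y.

    Assumes f is continuous on [c, d] and f(c) <= y <= f(d).
    """
    if not (f(c) <= y <= f(d)):
        raise ValueError("need f(c) <= y <= f(d)")
    lo, hi = c, d  # invariant: f(lo) <= y <= f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) <= y:
            lo = mid  # mid lies in S = {x in [c, d] : f(x) <= y}
        else:
            hi = mid  # f(mid) > y, so continue in the left half
    return (lo + hi) / 2


def f(x):
    return 2 * x**2 - 3


x = bisect_ivt(f, 1.0, 2.0, 3.0)
print(x, math.sqrt((3.0 + 3) / 2))  # both approximate sqrt(3)
```

Bisection returns some point where $f$ crosses $y$, which need not be the least upper bound $s$ from the proof when $f$ is not monotone.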

In the above proof the steps which begin "If $f(s) < y$" and "If $f(s) > y$" are written

in exactly the same style using almost identical words. If you were writing a short

story, you would avoid writing in this style because it might sound monotonous to

the reader. In creative writing, you would want to be more creative, and you would

reach for your thesaurus to find alternate words to enhance your writing. But in a

mathematical proof, using such parallel construction of sentences actually makes

the proof easier to read. A reader only needs to parse the first of the two steps in

order to have a good idea of what is going to be done in the second of the two steps.

This gives the reader a head start on processing the second step. What is passed off

as boring in creative writing can be applauded in the writing of proofs because of

the way it simplifies the understanding. In fact, one often begins the second of two

such steps with the word "similarly" to indicate that the argument to follow looks a lot

like the one just completed, again alerting the reader to the parallel construction.

The Intermediate Value Theorem says that functions continuous on an interval have the intermediate value property there. But a function need not be continuous for it to have the intermediate value property. Clearly, if a function has a jump discontinuity at a point $a$, that is, if $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ both exist but are different as shown in Fig. 4.10, then there could well be values of $y$ that the function misses as it passes from $(c, f(c))$ to $(d, f(d))$.

[Fig. 4.10: a function with a jump discontinuity between $(c, f(c))$ and $(d, f(d))$]

For a discontinuous function to have the intermediate value property, the function must necessarily oscillate wildly (Fig. 4.11). A typical example is the function

$$f(x) = \begin{cases} \sin\frac{1}{x} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}.$$

4.7.4 Exercises

Write proofs for each of the following statements. Each statement can be proved

using one or more of the theorems in this section.

1. Let $A \subset \mathbb{R}$ be a bounded set, and let $f$ be a function defined on $A$. If $f$ is unbounded on $A$, then for every $\epsilon > 0$, there exist $a$ and $b$ in $\mathbb{R}$ with $b - a < \epsilon$ such that $f$ is unbounded on $A \cap (a, b)$.

2. If $a < b$ and $f$ is a continuous function on $[a, b]$ with $f(a) = f(b)$, then there is a $c \in (a, b)$ such that $f$ obtains an extreme value (either a minimum or maximum) at $c$.

3. Suppose that f is a continuous function defined on R such that lim f .x/ D

x!1

x!1


4. If p is an odd degree polynomial with real coefficients, then p has at least one

real root.

5. Suppose that a plane contains a polygon $G$ and a line $L$. Then there is a line $L'$ in the plane parallel to $L$ such that exactly half the area of $G$ lies on each side of $L'$.

r

1

2

6. There is a value of x between 0 and 1 such that x equals

.

1 C x2

4.8 Discontinuity

In Calculus students learn about a great many continuous functions. These include the elementary functions: polynomials, rational functions, algebraic functions, exponential functions, logarithmic functions, and circular and hyperbolic trigonometric functions and their inverses. How badly can a function be discontinuous? A function can be discontinuous at a single point such as the signum or sign function

$$\operatorname{sgn}(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$

or at a sequence of points such as the floor or greatest integer function $\lfloor x \rfloor = n$ if $n$ is the integer satisfying $n \le x < n + 1$ (Fig. 4.12). A function can be discontinuous at a sequence of points that converge such as with

$$f(x) = \begin{cases} \frac{1}{n} & \text{if } \frac{1}{n+1} < x \le \frac{1}{n} \text{ for positive integer } n \\ 0 & \text{otherwise} \end{cases}.$$

This function is discontinuous at each $x = \frac{1}{n}$ for positive integers $n$, but it is continuous everywhere else including at $x = 0$ (Fig. 4.13). A function can be discontinuous at every $x$ such as with

$$f(x) = \begin{cases} 0 & \text{if } x \text{ is rational} \\ 1 & \text{if } x \text{ is irrational} \end{cases}.$$

But one of the most surprising examples is the following, often called Thomae's function but also known as the popcorn function, the raindrop function, or the modified Dirichlet function. It is defined on the interval $(0, 1)$ by

$$f(x) = \begin{cases} \frac{1}{n} & \text{if } x \text{ is rational written in lowest terms as } \frac{m}{n} \\ 0 & \text{if } x \text{ is irrational} \end{cases}.$$

Its graph is shown in Fig. 4.14. It is not hard to see that this function is discontinuous at each rational number $\frac{m}{n} \in (0, 1)$. Indeed if $\frac{m}{n}$ is in lowest terms, then $f\left(\frac{m}{n}\right) = \frac{1}{n}$. If $\epsilon$ is set at $\frac{1}{2n}$, then for every $\delta > 0$ there will be irrational numbers $x \in (0, 1)$ satisfying $\left|x - \frac{m}{n}\right| < \delta$ for which $\left|f(x) - f\left(\frac{m}{n}\right)\right| = \left|0 - \frac{1}{n}\right| > \epsilon$. On the other hand, at each irrational number $a$ in $(0, 1)$, the function is continuous. To see this, given an $\epsilon > 0$, notice that there are only finitely many rational numbers $r \in (0, 1)$ such that $f(r) \ge \epsilon$. If there are such rational numbers, there is one, $r_0$, closest to $a$, so choose $\delta = |r_0 - a|$. If there are no such rational numbers, you can choose $\delta = 1$. In either case, for all $x \in (0, 1)$ with $|x - a| < \delta$, it follows that $|f(x) - f(a)| < \epsilon$, showing that $f$ is continuous at $a$.

Fig. 4.14 Graph of Thomae's function
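The choice of $\delta$ in the continuity argument can be explored with exact rational arithmetic. The following is only an illustrative sketch: the function names are hypothetical, $\epsilon$ is assumed to be supplied as a `Fraction`, and the irrational point $a$ is necessarily stood in for by a floating-point approximation.

```python
import math
from fractions import Fraction


def thomae(x):
    """Thomae's function at a rational x in (0, 1) given as a Fraction.

    Fraction stores m/n in lowest terms, so the value is 1/n.
    """
    return Fraction(1, x.denominator)


def large_value_rationals(eps):
    """All rationals r in (0, 1) with thomae(r) >= eps (denominator <= 1/eps)."""
    max_den = int(1 / Fraction(eps))
    return sorted({Fraction(m, n) for n in range(2, max_den + 1) for m in range(1, n)})


def delta_for(a, eps):
    """Distance from a to the nearest rational with thomae value >= eps (1 if none)."""
    return min((abs(r - a) for r in large_value_rationals(eps)), default=1)


a = math.sqrt(2) / 2          # float stand-in for an irrational point of (0, 1)
eps = Fraction(1, 5)
print(large_value_rationals(eps))  # finitely many "bad" rationals
print(delta_for(a, eps))           # every rational closer to a than this has f < eps
```

The set returned by `large_value_rationals` is finite for every positive `eps`, which is exactly the fact the continuity argument relies on.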

Chapter 5

Derivatives

Anybody who was even half paying attention in their first course in Calculus got

the strong impression that the differentiation of functions has an enormous number

of applications. Not only does it provide a great tool for understanding the behavior

of functions, but it also has applications to a very wide range of other fields, most

notably Physics, Engineering, Chemistry, Biology, and Economics. In particular,

being able to use the derivative to determine where a function is increasing and

decreasing in itself justifies this reputation. Merely knowing the average rate of

change of a function over an interval is valuable. But the limit concept allows you to

refine this idea to get the instantaneous rate of change of the function at a point. This

allows for more precise information about the function as well as providing what is

often a simpler expression than that of the average rate of change from which it

is derived. This chapter will discuss the theorems needed to calculate derivatives

efficiently as well as theorems highlighting some of the important properties and

applications of the derivative.

Let $f$ be a function defined on an open interval containing the point $a$. Then for values of $x$ near but not equal to $a$ one can calculate the slope of the secant line passing through the two points on the graph of the function $(a, f(a))$ and $(x, f(x))$. As shown in Fig. 5.1, the slope of this secant line is given by the difference quotient $\frac{f(x) - f(a)}{x - a}$. If $f$ is continuous, as $x$ approaches $a$, the point $(x, f(x))$ approaches the point $(a, f(a))$, and the secant line may approach a tangent line, the line that passes through $(a, f(a))$ and most closely approximates the graph of the function near $a$ (Fig. 5.2). The derivative of $f$ at $a$ is the slope of this tangent line. More formally, if $a$ is an accumulation point of the domain of the function $f$, and $f$ is defined at $a$, then the derivative of $f$ at $a$ is $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$. The derivative is said to exist if this limit exists. Writing $x = a + h$, the limit can be written $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$.

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_5

[Fig. 5.1: the secant line through $(a, f(a))$ and $(x, f(x))$, with slope $\frac{f(x) - f(a)}{x - a}$]

[Fig. 5.2: the tangent line at $(a, f(a))$]

The first important consequence of the definition of the derivative is that if a function $f$ has a derivative at a point, then $f$ is also continuous at that point. As part of the definition of derivative, $f$ needs to be defined at the point $a$ for it to have a derivative at $a$. It remains to show that $\lim_{x \to a} f(x) = f(a)$ whenever the limit $\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ exists. For this difference quotient to have a finite limit when the denominator is clearly approaching 0, the numerator must also be approaching 0. This last statement is intuitively true, so you would hope that it has an easy justification. Consider what sort of algebraic operations you could apply to the difference quotient $\frac{f(x) - f(a)}{x - a}$ in order to produce the numerator $f(x) - f(a)$. It should be clear that if the difference quotient is multiplied by $x - a$, the product will be the desired difference $f(x) - f(a)$. This suggests the method that works in the following simple proof.

PROOF: If the function $f$ has a derivative at the point $a$, then $f$ is continuous at $a$.

Suppose that $f$ has a derivative at the point $a$.

It follows from the definition of derivative that $f$ is defined at $a$, and that $a$ is an accumulation point of the domain of $f$.

Also from the definition of derivative it follows that $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ exists.

Then
$$\lim_{x \to a} \bigl[f(x) - f(a)\bigr] = \lim_{x \to a} \left[(x - a) \cdot \frac{f(x) - f(a)}{x - a}\right] = \lim_{x \to a} (x - a) \cdot \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = 0 \cdot f'(a) = 0.$$

Thus, $f$ is defined at $x = a$, and $\lim_{x \to a} \bigl[f(x) - f(a)\bigr] = 0$, or $\lim_{x \to a} f(x) = f(a)$.

It follows that $f$ is continuous at $x = a$.

You should know several counterexamples that show that the converse is false, that is, there are functions $f$ continuous at a point $a$ that are not differentiable at $a$. First of all, $f$ can be continuous at $a$ where $a$ is an isolated point of the domain of $f$, and at such points, the derivative of $f$ is not defined. But even if $f$ is continuous for all real numbers, $f$ need not have a derivative at a particular $a$. The best known example is the absolute value function, $f(x) = |x|$, which is continuous for all real numbers but fails to have a derivative at $x = 0$. This is because the difference quotient $\frac{f(x) - f(0)}{x - 0}$ is equal to $1$ for all $x > 0$ and $-1$ for all $x < 0$, so the limit of the difference quotient does not exist at $x = 0$. Of course, the absolute value function has a derivative at all $x \ne 0$. There is a well-known example known as the Weierstrass function that is continuous for all real numbers $x$ but does not have a derivative at any point.
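The behavior of the difference quotient of $|x|$ at $0$ can be checked directly. The sketch below is an illustration only; the helper name `difference_quotient` is an assumption, and exact `Fraction` arithmetic is used so that no rounding obscures the result.

```python
from fractions import Fraction


def difference_quotient(f, a, x):
    """Slope of the secant line through (a, f(a)) and (x, f(x)), for x != a."""
    return (f(x) - f(a)) / (x - a)


# Approach 0 from the right and from the left along x = +/- 10^(-k).
right = [difference_quotient(abs, 0, Fraction(1, 10**k)) for k in range(1, 6)]
left = [difference_quotient(abs, 0, Fraction(-1, 10**k)) for k in range(1, 6)]
print(right)  # every entry equals 1
print(left)   # every entry equals -1
```

Because the two one-sided families of quotients stay at different constants, no single limit can exist at $0$.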

The proof that a function f has a particular derivative at a point a is just a proof about

the limit of a difference quotient, and as such, is no different than a proof of any other

limit. On the other hand, there are some similarities among the proofs of derivatives,

so it is worth working through a few examples. The key observation is that whenever

you need to calculate a derivative directly from the definition, you must calculate the

limit of a difference quotient which, by design, is a fraction whose numerator and

denominator are both approaching zero. In such a case, one would expect to be able

to perform some algebraic manipulation that would result in the x a expression in

the denominator canceling with an equivalent factor in the numerator. This allows

you to use other limit theorems to complete the evaluation.

For example, consider the function $f(x) = 3x^2 - 8x$. To calculate the derivative of $f$ at $a = 4$, one needs to evaluate the limit

$$\begin{aligned}
\lim_{x \to 4} \frac{f(x) - f(4)}{x - 4} &= \lim_{x \to 4} \frac{3x^2 - 8x - (3 \cdot 4^2 - 8 \cdot 4)}{x - 4} = \lim_{x \to 4} \frac{3x^2 - 8x - 16}{x - 4} \\
&= \lim_{x \to 4} \frac{(3x + 4)(x - 4)}{x - 4} = \lim_{x \to 4} (3x + 4) = 16.
\end{aligned}$$

Since each step of this derivation follows either from rules of algebra or from the theorems about calculating the limits of various arithmetic combinations of functions, the calculation given is a complete proof that the derivative of $f$ at $x = 4$ is 16.

In a more general setting, consider proving that the derivative of $f(x) = 5x^4$ at the point $x = a$ is $f'(a) = 20a^3$. Here you would calculate

$$\begin{aligned}
\lim_{x \to a} \frac{f(x) - f(a)}{x - a} &= \lim_{x \to a} \frac{5x^4 - 5a^4}{x - a} = \lim_{x \to a} \frac{5(x - a)(x^3 + x^2 a + x a^2 + a^3)}{x - a} \\
&= \lim_{x \to a} 5(x^3 + x^2 a + x a^2 + a^3) = 5(a^3 + a^2 \cdot a + a \cdot a^2 + a^3) = 20a^3.
\end{aligned}$$

Again, finding a factor of $x - a$ in the numerator of the difference quotient is the key to evaluating the needed limit.
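Both cancellations can be sanity-checked with exact arithmetic. This is a small illustrative sketch rather than a proof; the names `difference_quotient`, `f`, and `q` are chosen here for convenience, and the point $a = 2$ for $5x^4$ is an arbitrary choice.

```python
from fractions import Fraction


def difference_quotient(func, a, x):
    """(func(x) - func(a)) / (x - a) for x != a."""
    return (func(x) - func(a)) / (x - a)


def f(x):
    return 3 * x**2 - 8 * x


def q(x):
    return 5 * x**4


for k in range(1, 6):
    h = Fraction(1, 10**k)
    # After cancelling x - 4, the first quotient is exactly 3x + 4 = 16 + 3h.
    # The second approaches 20 * 2**3 = 160 as h shrinks.
    print(float(h),
          float(difference_quotient(f, 4, 4 + h)),
          float(difference_quotient(q, 2, 2 + h)))
```

The printed columns approach 16 and 160, in agreement with the limits computed by hand.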

One quickly learns in Calculus that although the derivative is defined as a limit

of a difference quotient, there is a small collection of algorithms that reduce the

finding of the derivative of any combination of elementary functions to a fairly

mechanical exercise. The algorithms show you how to take the derivatives of the

sum, difference, product, and quotient of two differentiable functions as well as a

constant multiple of a differentiable function, the inverse of a differentiable function,

and the composition of two differentiable functions. Those rules along with the

knowledge of how to differentiate the elementary functions, $x^n$, $a^x$, $\log_a x$, $\sin x$,

and $\cos x$ give you all the tools necessary to differentiate virtually any function you

are likely to see in a lifetime of applications. This section and the next discuss

the proofs of the theorems that provide these needed algorithms.

The simplest of these results is the theorem that states that if $f$ is a function differentiable at $a$ and $c$ is any constant, then the function $cf$ is also differentiable at $a$ with $(cf)'(a) = cf'(a)$. In the proof of this theorem, you would assume that $f'(a)$ exists. That provides for you the limit $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a)$. Since the limit needed to show that $(cf)'(a) = cf'(a)$ is just a multiple of a known limit, the needed result follows immediately from the fact that the limit of a constant times a function is the constant times the limit of the function.

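PROOF: Let $f$ be a function differentiable at $a$, and let $c$ be a constant. Then $(cf)'(a) = cf'(a)$.

Suppose that $f$ has a derivative at the point $a$.

From the definition of derivative $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$.

Then
$$(cf)'(a) = \lim_{x \to a} \frac{cf(x) - cf(a)}{x - a} = \lim_{x \to a} c \cdot \frac{f(x) - f(a)}{x - a} = c \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = cf'(a).$$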

To prove that the derivative of the sum or difference of two functions is the sum or difference of their derivatives, one is faced with finding the limit of a difference quotient which can easily be written as the sum or difference of two difference quotients whose limits are already known. Thus, if $f$ and $g$ are two functions defined on the same domain and both differentiable at $a$, calculating the derivative of $f + g$ at $a$ requires the limit

$$\begin{aligned}
\lim_{x \to a} \frac{(f + g)(x) - (f + g)(a)}{x - a} &= \lim_{x \to a} \frac{f(x) - f(a) + g(x) - g(a)}{x - a} \\
&= \lim_{x \to a} \frac{f(x) - f(a)}{x - a} + \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'(a) + g'(a)
\end{aligned}$$

as needed.

PROOF: Let $f$ and $g$ be functions defined on a common domain, and let $f$ and $g$ both be differentiable at $a$. Then $(f + g)'(a) = f'(a) + g'(a)$ and $(f - g)'(a) = f'(a) - g'(a)$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$ and $g$ are both differentiable at $a$.

From the definition of derivative $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ and $g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$.

Then
$$(f + g)'(a) = \lim_{x \to a} \frac{(f + g)(x) - (f + g)(a)}{x - a} = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} + \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'(a) + g'(a).$$

Because $(f - g)(x) = f(x) - g(x) = f(x) + (-1)g(x)$, and the derivative of $(-1)g(x)$ is $-1$ times the derivative of $g(x)$, it follows that $(f - g)'(a) = \bigl(f + (-1)g\bigr)'(a) = f'(a) + (-1)g'(a) = f'(a) - g'(a)$ completing the proof of the theorem.

Why does the first step in this proof make the assumption that $f$ and $g$ are defined on the same domain? This is to avoid the embarrassing situation that the intersection of the domains of $f$ and $g$ isolates the point $a$. For example, if $f$ is defined for all $x \le 1$ and $g$ is defined for all $x \ge 1$, it could be that both $f'(1)$ and $g'(1)$ are defined, but the function $f + g$ is defined only at 1, so its derivative cannot be defined. Another example is for $f$ to be defined at all rational numbers, and $g$ to be defined at all rational multiples of $\sqrt{2}$. Each function could be differentiable at each point of its domain, but $f + g$ is only defined at 0, so its derivative cannot be defined.

It is certainly worth noting here that the theorems discussed so far show that for functions $f$ and $g$ and constants $a$ and $b$, the derivative of the linear combination of functions $af(x) + bg(x)$ is the linear combination of the derivatives $af'(x) + bg'(x)$. In the words of Linear Algebra, this says that the derivative is a linear operator. This fact alone has a long list of ramifications in Differential Equations and other fields.

It is important for the beginning Calculus student to learn that even though the derivative behaves in an intuitive way with respect to addition and subtraction, this intuition ceases when discussing the derivative of a product or quotient. The proof that $(fg)' = fg' + f'g$ involves one trick reminiscent of the proof that the limit of a product is the product of the limits. That is, one adds and subtracts the same quantity so that rather than making a change in two different factors at the same time, one makes a change in one factor at a time. Indeed, the difference quotient you obtain for the function $fg$ is

$$\begin{aligned}
\frac{f(x)g(x) - f(a)g(a)}{x - a} &= \frac{f(x)g(x) - f(x)g(a) + f(x)g(a) - f(a)g(a)}{x - a} \\
&= f(x) \cdot \frac{g(x) - g(a)}{x - a} + \frac{f(x) - f(a)}{x - a} \cdot g(a).
\end{aligned}$$

Taking the limits at each step produces the following proof.

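PROOF (Product Rule): Let $f$ and $g$ be functions defined on a common domain, and let $f$ and $g$ both be differentiable at $a$. Then $(fg)'(a) = f(a)g'(a) + f'(a)g(a)$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$ and $g$ are both differentiable at $a$.

From the definition of derivative $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ and $g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$.

Because $f$ is differentiable at $a$, it is continuous at $a$. This implies that $\lim_{x \to a} f(x) = f(a)$.

Then
$$\begin{aligned}
(fg)'(a) &= \lim_{x \to a} \frac{(fg)(x) - (fg)(a)}{x - a} = \lim_{x \to a} \frac{f(x)g(x) - f(a)g(a)}{x - a} \\
&= \lim_{x \to a} \frac{f(x)g(x) - f(x)g(a) + f(x)g(a) - f(a)g(a)}{x - a} \\
&= \lim_{x \to a} \left[f(x) \cdot \frac{g(x) - g(a)}{x - a} + \frac{f(x) - f(a)}{x - a} \cdot g(a)\right] \\
&= \lim_{x \to a} f(x) \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} + \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot g(a) = f(a)g'(a) + f'(a)g(a).
\end{aligned}$$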
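The add-and-subtract step is a purely algebraic identity, so it can be checked exactly before any limit is taken. The sketch below is illustrative only: the helper `dq` and the sample functions $f(x) = x^2$ and $g(x) = 3x + 1$ at $a = 1$ are assumptions chosen for the demonstration.

```python
from fractions import Fraction


def dq(func, a, x):
    """Difference quotient of func between a and x (x != a)."""
    return (func(x) - func(a)) / (x - a)


def f(x):
    return x * x


def g(x):
    return 3 * x + 1


def fg(x):
    return f(x) * g(x)


a = Fraction(1)
for k in range(1, 5):
    x = a + Fraction(1, 10**k)
    lhs = dq(fg, a, x)                              # quotient of the product
    rhs = f(x) * dq(g, a, x) + dq(f, a, x) * g(a)   # split form from the proof
    print(float(x), float(lhs), float(rhs))         # the two values agree exactly
```

For these sample functions the common value approaches $f(1)g'(1) + f'(1)g(1) = 1 \cdot 3 + 2 \cdot 4 = 11$.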

The proof that $\left(\frac{f}{g}\right)' = \frac{gf' - fg'}{g^2}$ uses the same idea together with the assumption that $g(a) \ne 0$.

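PROOF (Quotient Rule): Let $f$ and $g$ be functions defined on a common domain, and let $f$ and $g$ both be differentiable at $a$. If $g(a) \ne 0$, then $\left(\frac{f}{g}\right)'(a) = \frac{g(a)f'(a) - f(a)g'(a)}{\bigl(g(a)\bigr)^2}$.

Suppose that $f$ and $g$ are functions defined on a common domain, and that $f$ and $g$ are both differentiable at $a$ with $g(a) \ne 0$.

From the definition of derivative $f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$ and $g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$.

Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that $\lim_{x \to a} g(x) = g(a)$.

Then
$$\begin{aligned}
\left(\frac{f}{g}\right)'(a) &= \lim_{x \to a} \frac{\frac{f}{g}(x) - \frac{f}{g}(a)}{x - a} = \lim_{x \to a} \frac{\frac{f(x)}{g(x)} - \frac{f(a)}{g(a)}}{x - a} \\
&= \lim_{x \to a} \frac{\frac{f(x)}{g(x)} - \frac{f(a)}{g(x)} + \frac{f(a)}{g(x)} - \frac{f(a)}{g(a)}}{x - a} \\
&= \lim_{x \to a} \left[\frac{1}{g(x)} \cdot \frac{f(x) - f(a)}{x - a} + f(a) \cdot \frac{\frac{1}{g(x)} - \frac{1}{g(a)}}{x - a}\right] \\
&= \lim_{x \to a} \left[\frac{1}{g(x)} \cdot \frac{f(x) - f(a)}{x - a} + \frac{f(a)}{g(x)g(a)} \cdot \frac{g(a) - g(x)}{x - a}\right] \\
&= \lim_{x \to a} \frac{1}{g(x)} \cdot \lim_{x \to a} \frac{f(x) - f(a)}{x - a} + \lim_{x \to a} \frac{f(a)}{g(x)g(a)} \cdot \lim_{x \to a} \frac{g(a) - g(x)}{x - a} \\
&= \frac{1}{g(a)} f'(a) - \frac{f(a)}{\bigl(g(a)\bigr)^2} g'(a) = \frac{f'(a)g(a) - f(a)g'(a)}{\bigl(g(a)\bigr)^2}.
\end{aligned}$$

Thus, $\left(\frac{f}{g}\right)'(a) = \frac{f'(a)g(a) - f(a)g'(a)}{\bigl(g(a)\bigr)^2}$.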

5.4.1 Exercises

Write proofs for each of the following statements.

1. For any constant $c$, the function $f(x) = c$ has derivative $f'(x) = 0$.

2. The function $f(x) = x$ has derivative $f'(x) = 1$.

3. For any positive integer $n$, the function $f(x) = x^n$ has derivative $f'(x) = nx^{n-1}$.

4. Any polynomial function $f(x) = a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \cdots + a_1 x + a_0$ has derivative $f'(x) = n a_n x^{n-1} + (n-1) a_{n-1} x^{n-2} + (n-2) a_{n-2} x^{n-3} + \cdots + a_1$.

5. For any positive integer $n$, the function $f(x) = \frac{1}{x^n}$ has derivative $f'(x) = -\frac{n}{x^{n+1}}$.

6. Given the collection of functions $f_1, f_2, f_3, \ldots, f_n$ each defined on the same domain and each differentiable at $a$, and given constants $c_1, c_2, c_3, \ldots, c_n$, the derivative $\bigl(c_1 f_1 + c_2 f_2 + c_3 f_3 + \cdots + c_n f_n\bigr)'(a) = c_1 f_1'(a) + c_2 f_2'(a) + c_3 f_3'(a) + \cdots + c_n f_n'(a)$.

7. Given the collection of functions $f_1, f_2, f_3, \ldots, f_n$ each defined on the same domain and each differentiable at $a$, the derivative $\bigl(f_1 f_2 f_3 \cdots f_n\bigr)'(a) = f_1'(a) f_2(a) f_3(a) \cdots f_n(a) + f_1(a) f_2'(a) f_3(a) \cdots f_n(a) + f_1(a) f_2(a) f_3'(a) \cdots f_n(a) + \cdots + f_1(a) f_2(a) f_3(a) \cdots f_n'(a)$.

The Chain Rule shows how to differentiate a function that is a composition of other functions. Since composition is an invaluable tool for constructing functions, the Chain Rule deserves its place among the important algorithms for calculating derivatives. It states that if $g$ is a function differentiable at $a$, and if $f$ is a function defined on the range of $g$ and differentiable at $g(a)$, then the function $(f \circ g)(x)$ is differentiable at $a$, and $(f \circ g)'(a) = f'\bigl(g(a)\bigr) \cdot g'(a)$. To prove this, you would need to find the limit of the difference quotient $D = \frac{f(g(x)) - f(g(a))}{x - a}$. Expecting to see the expression $\frac{g(x) - g(a)}{x - a}$ which has limit $g'(a)$, you might try both multiplying and dividing the difference quotient $D$ by the factor $g(x) - g(a)$ to get

$$\frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a}.$$

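PROOF ATTEMPT (Chain Rule): Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the range of $g$ and differentiable at $g(a)$. Then $(f \circ g)'(a) = f'\bigl(g(a)\bigr) \cdot g'(a)$.

Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the range of $g$ and differentiable at $g(a)$.

From the definition of derivative, $g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$ and $f'(g(a)) = \lim_{y \to g(a)} \frac{f(y) - f(g(a))}{y - g(a)}$.

Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that $\lim_{x \to a} g(x) = g(a)$.

Then
$$\begin{aligned}
(f \circ g)'(a) &= \lim_{x \to a} \frac{(f \circ g)(x) - (f \circ g)(a)}{x - a} = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} \\
&= \lim_{x \to a} \left[\frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \frac{g(x) - g(a)}{x - a}\right] = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a}.
\end{aligned}$$

Therefore,
$$(f \circ g)'(a) = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = \lim_{y \to g(a)} \frac{f(y) - f(g(a))}{y - g(a)} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'(g(a)) \cdot g'(a).$$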

This proof attempt does include the intuitive reasoning behind why the Chain Rule works, but the proof is not correct. Can you spot the error? The problem is that even though $g(x)$ is approaching $g(a)$ as $x$ approaches $a$, there is no guarantee that $g(x)$ is different from $g(a)$. In fact, it is quite easy to construct functions $g(x)$ which are differentiable at $a$ for which $g(x)$ is equal to $g(a)$ for infinitely many values of $x$ as $x$ approaches $a$. The simplest example is when $g$ is a constant function. A more complicated example is

$$g(x) = \begin{cases} x^2 \sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

which has a derivative of 0 at $x = 0$ and is equal to $g(0) = 0$ at $x = \frac{1}{n\pi}$ for all nonzero integers $n$. Clearly, when $g(x) = g(a)$, one cannot both multiply and divide the difference quotient by $g(x) - g(a)$ and expect to get anything except nonsense. This problem does not present an enormous hurdle because, in the cases where $g(x) = g(a)$, the difference quotient $\frac{f(g(x)) - f(g(a))}{x - a}$ is equal to 0. To handle both cases at once, introduce the function

$$h(y) = \begin{cases} \frac{f(y) - f(g(a))}{y - g(a)} & \text{if } y \ne g(a) \\ f'\bigl(g(a)\bigr) & \text{if } y = g(a) \end{cases}.$$

This function has the nice property that it is equal to the desired difference quotient when $g(x)$ differs from $g(a)$, and it is continuous at $g(a)$. Introducing this function into the proof gets around the technical difficulties of the previously attempted proof.

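PROOF (Chain Rule): Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the range of $g$ and differentiable at $g(a)$. Then $(f \circ g)'(a) = f'\bigl(g(a)\bigr) \cdot g'(a)$.

Let $g$ be a function differentiable at $a$, and $f$ be a function defined on the range of $g$ and differentiable at $g(a)$.

From the definition of derivative, $g'(a) = \lim_{x \to a} \frac{g(x) - g(a)}{x - a}$ and $f'(g(a)) = \lim_{y \to g(a)} \frac{f(y) - f(g(a))}{y - g(a)}$.

Because $g$ is differentiable at $a$, it is continuous at $a$. This implies that $\lim_{x \to a} g(x) = g(a)$.

Define
$$h(y) = \begin{cases} \frac{f(y) - f(g(a))}{y - g(a)} & \text{if } y \ne g(a) \\ f'\bigl(g(a)\bigr) & \text{if } y = g(a) \end{cases},$$
and note that $h$ is continuous at $g(a)$.

Then
$$\begin{aligned}
(f \circ g)'(a) &= \lim_{x \to a} \frac{(f \circ g)(x) - (f \circ g)(a)}{x - a} = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{x - a} = \lim_{x \to a} \left[h\bigl(g(x)\bigr) \cdot \frac{g(x) - g(a)}{x - a}\right] \\
&= \lim_{x \to a} h\bigl(g(x)\bigr) \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = h\Bigl(\lim_{x \to a} g(x)\Bigr) \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} = f'\bigl(g(a)\bigr) \cdot g'(a).
\end{aligned}$$

Therefore, $(f \circ g)'(a) = f'\bigl(g(a)\bigr) \cdot g'(a)$.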
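The counterexample $g(x) = x^2 \sin\frac{1}{x}$ that motivated the function $h$ can be examined numerically. The following is a floating-point sketch under the assumption that rounding error is acceptable for illustration; the sample points are arbitrary choices.

```python
import math


def g(x):
    """g(x) = x^2 sin(1/x) for x != 0, and g(0) = 0."""
    return x * x * math.sin(1 / x) if x != 0 else 0.0


# g returns to the value g(0) = 0 at x = 1/(n*pi), arbitrarily close to 0.
zeros = [1 / (n * math.pi) for n in range(1, 6)]
for x in zeros:
    print(x, g(x))          # g(x) is 0 up to rounding error

# The difference quotient at 0 is g(h)/h = h*sin(1/h), bounded by |h|.
for k in range(1, 6):
    h = 10.0 ** -k
    print(h, g(h) / h)
```

At each point of `zeros` the factor $g(x) - g(0)$ vanishes, so dividing by it, as the first proof attempt does, is not allowed there.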

A function $f : A \to B$ is called a bijection if it is both surjective and injective; that is, for each $y \in B$ there is one and only one $x \in A$ such that $f(x) = y$. In this case $f$ is a one-to-one correspondence between the points of $A$ and the points of $B$. When $f$ is a bijection, one can define the inverse of $f$ to be the function $f^{-1} : B \to A$ by letting $f^{-1}(y)$ be the unique value of $x$ such that $f(x) = y$. In other words $f^{-1}$ is the set of ordered pairs $\{(y, x) \mid (x, y) \in f\}$. From this definition it is clear that for all $x \in A$, $f^{-1}\bigl(f(x)\bigr) = x$, and for all $y \in B$, $f\bigl(f^{-1}(y)\bigr) = y$. This says that $f^{-1} \circ f$ is the identity function on $A$, and $f \circ f^{-1}$ is the identity function on $B$.

Note that if $f : A \to B$ is not a bijection, then one cannot define $f^{-1}$ as a mapping from $B$ to $A$. If $f$ is not surjective, then there is a $y \in B$ which is not in the range of $f$, so there is no way to define $f^{-1}(y)$. If $f$ is not injective, then there is a $y \in B$ such that $f(x) = y$ for more than one value of $x$, and there may be no natural way to select which $x$ should be $f^{-1}(y)$. For example, the function $f(x) = x^2$ maps the real numbers into the real numbers. The function is neither surjective (its range is the nonnegative real numbers) nor injective, since $f(-2) = f(2)$. One can restrict the codomain to the nonnegative real numbers. Then $f$ becomes a surjective function, but it is still not injective. To get an inverse to $f$ you can substitute a different function for $f$ which restricts the domain of $f$ to the nonnegative real numbers. If $f$ is thought of as a function from the nonnegative real numbers to the nonnegative real numbers, then $f$ is a bijection, and it has the inverse function $\sqrt{x}$.

The same procedure is done to obtain inverses of the trigonometric functions. For example, the function $f(x) = \sin x$ maps the real numbers to the interval $[-1, 1]$. The function is surjective, but it is not injective. To obtain an injective function, the domain of $\sin x$ is restricted to the interval $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. On this interval $\sin x$ is both injective and surjective and has the inverse $\sin^{-1} y$, sometimes written as $\arcsin y$ (Fig. 5.3).

Now suppose that $f$ is bijective and has inverse function $f^{-1}$. If $f$ has a nonzero derivative at the point $a$, the Chain Rule can be used to find the derivative of $f^{-1}$ at $f(a)$. Indeed, one has that $(f \circ f^{-1})(x) = x$, so the Chain Rule implies that $f'(f^{-1}(x))\,(f^{-1})'(x) = 1$, or $(f^{-1})'(x) = \frac{1}{f'(f^{-1}(x))}$. Is this conclusion valid? That is, can you justify taking the derivative of $(f \circ f^{-1})$ using the Chain Rule before you know that the derivative of $f^{-1}$ exists? The answer is yes, the use of the Chain Rule is justified here. The proof of the Chain Rule includes taking the limit of $h(g(x))\,\frac{g(x) - g(a)}{x - a}$, which is broken into the product of the limit of $h(g(x))$ and the limit of $\frac{g(x) - g(a)}{x - a}$. Here $g = f^{-1}$ and $f \circ g$ is the identity, so that product equals $\frac{x - a}{x - a}$, and one can write the difference quotient as $\frac{g(x) - g(a)}{x - a} = \frac{\;\frac{x - a}{x - a}\;}{h(g(x))}$. Now there is no a priori assumption that the limit of $\frac{g(x) - g(a)}{x - a}$ exists; its limit is just the limit of the quotient $\frac{\;\frac{x - a}{x - a}\;}{h(g(x))}$, which exists provided that $h(g(x))$ has a nonzero limit.

As an application of differentiating an inverse function, consider finding the derivative of $\sqrt[n]{x}$ for integer values of $n > 0$. It is known that, for integer values of $n$, the derivative of $f(x) = x^n$ is $f'(x) = n x^{n-1}$. For $n > 0$ and $x > 0$, the inverse of the function $f$ is $f^{-1}(x) = \sqrt[n]{x}$, so its derivative must be $\frac{1}{f'(f^{-1}(x))} = \frac{1}{n\,(\sqrt[n]{x})^{n-1}}$.
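The formula for the derivative of the $n$th root can be compared against a numerical estimate. This is a minimal sketch, assuming the sample values $n = 3$ and $x = 8$ and a central difference with step $10^{-6}$; the function names are hypothetical and chosen only for illustration.

```python
def nth_root(x, n):
    return x ** (1.0 / n)

def root_derivative_formula(x, n):
    """1 / (n * (nth root of x)^(n-1)), from the inverse-function rule."""
    return 1.0 / (n * nth_root(x, n) ** (n - 1))

def central_difference(func, x, step=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (func(x + step) - func(x - step)) / (2 * step)

n, x = 3, 8.0
formula_value = root_derivative_formula(x, n)  # exact value is 1/12
numeric_value = central_difference(lambda t: nth_root(t, n), x)
print(formula_value, numeric_value)
```

Agreement of the two numbers is only a sanity check; it does not replace the argument that the inverse is differentiable.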

and Critical Points

Perhaps the most important property of the derivative is its ability to determine where a function is increasing or decreasing. Let $f$ be a function defined on an interval $I$. If for all $x$ and $y$ in $I$, $x < y$ implies that $f(x) \le f(y)$, then $f$ is said to be increasing on $I$, and if $x < y$ implies that $f(x) < f(y)$, then $f$ is said to be strictly increasing on $I$. Similarly, if $x < y$ implies that $f(x) \ge f(y)$, then $f$ is said to be decreasing on $I$, and if $x < y$ implies that $f(x) > f(y)$, then $f$ is said to be strictly decreasing on $I$.

So what can be said if it is known that a function $f$ has a positive derivative at $a$? What is known is that the difference quotient $\frac{f(x) - f(a)}{x - a}$ has a positive limit, so it is positive when $x$ is close to $a$. How close to $a$ does $x$ have to be? What the limit definition gives you is that for any $\epsilon > 0$, you can find a $\delta > 0$ so that $\frac{f(x) - f(a)}{x - a}$ is within $\epsilon$ of its limit, $f'(a)$, which is positive. So, if $\epsilon > 0$ is chosen to be $f'(a)$, then the difference quotient, which has to be within $f'(a)$ of $f'(a)$, will have to be positive. Thus, for $x$ within $\delta$ of $a$ (and not equal to $a$), the difference quotient $\frac{f(x) - f(a)}{x - a}$ is positive. Then if $x > a$, it follows that $f(x) > f(a)$, and if $x < a$, it follows that $f(x) < f(a)$. Does this mean that $f$ is increasing? The answer is no. There are functions with a positive derivative at $a$ which are not increasing over any open interval containing $a$. An example of such a function is given in the last section of this chapter. All one can say is the following.


PROOF: Let $f$ be a function with a positive derivative at $a$. Then there is a $\delta > 0$ such that for all $x$ in the domain of $f$ with $|x - a| < \delta$, if $x > a$, then $f(x) > f(a)$, and if $x < a$, then $f(x) < f(a)$. Similarly, if $f$ has a negative derivative at $a$, there is a $\delta > 0$ such that for all $x$ in the domain of $f$ with $|x - a| < \delta$, if $x > a$, then $f(x) < f(a)$, and if $x < a$, then $f(x) > f(a)$.

Let $f$ be a function with a positive derivative at $a$.

Then $\lim_{x \to a} \frac{f(x) - f(a)}{x - a} = f'(a) > 0$.

Taking $\epsilon = f'(a)$ in the definition of limit, there is a $\delta > 0$ such that if $x$ is in the domain of $f$ and $0 < |x - a| < \delta$, then $\left|\frac{f(x) - f(a)}{x - a} - f'(a)\right| < f'(a)$, implying that $\frac{f(x) - f(a)}{x - a} > 0$.

If $x$ satisfies $a < x < a + \delta$, then $x - a > 0$ and $\frac{f(x) - f(a)}{x - a} > 0$, so $f(x) - f(a) > 0$, and $f(x) > f(a)$.

If $x$ satisfies $a > x > a - \delta$, then $x - a < 0$ and $\frac{f(x) - f(a)}{x - a} > 0$, so $f(x) - f(a) < 0$, and $f(x) < f(a)$.

This proves the first part of the theorem.

If instead $f'(a) < 0$, apply the above argument to the function $-f$ to obtain the analogous result.

A function $f$ is said to have a relative maximum (sometimes called a local maximum) at $a$ if there is a $\delta > 0$ such that for all $x$ in the domain of $f$ satisfying $|x - a| < \delta$, the value of $f(a) \ge f(x)$. Similarly, one can define relative minimum (or local minimum) where, in this case, $f(a) \le f(x)$. If $f$ has a relative maximum or relative minimum at $a$, one can say that it has a relative extremum (sometimes called a local extremum) at $a$.

Another very important property of the derivative is its ability to identify points

where a function has relative extrema. This ability follows immediately from the

previous theorem.

PROOF: Let $f$ be a function defined on an open interval containing $a$, and let $f$ be differentiable at $a$. Then if $f$ has a relative extremum at $a$, the value of $f'(a)$ is 0.

Let $f$ be a function defined on an open interval $I$ containing the point $a$.

Assume that $f$ is differentiable at $a$ and has a relative maximum at $a$.

If $f'(a)$ is positive, there is a $\delta > 0$ such that for all $x \in I$ with $a < x < a + \delta$, $f(x) > f(a)$. This contradicts the fact that $f$ has a relative maximum at $a$.

If $f'(a)$ is negative, there is a $\delta > 0$ such that for all $x \in I$ with $a - \delta < x < a$, $f(x) > f(a)$. This contradicts the fact that $f$ has a relative maximum at $a$.

Thus, $f'(a)$ must be 0.

Applying this argument to the function $-f$ shows that if $f$ has a relative minimum at $a$, it must be that $f'(a) = 0$.

Fig. 5.4 This graph of $f(x)$ on the interval $[a, h]$ shows relative maxima at $b$, $d$, and $g$, relative minima at $a$, $c$, $e$, and $h$, an absolute maximum at $b$, and an absolute minimum at $h$. The derivative $f'(x)$ does not exist at $x = d$.

Any student of Calculus will see applications of this result where one is asked

to identify relative extrema for a particular function, and applications to what are

fondly called Max/Min problems where one is first asked to construct an appropriate

function to fit the application and then find a particular extremum of that function.

One defines a critical point of $f$ to be a value $a$ where either $f'(a) = 0$ or $f'(a)$ does not exist. Not all of these points will end up being relative extrema, for some may just be a saddle point of $f$ where $f'(a) = 0$ but $f$ has no relative extremum at that point. For example, the function $f(x) = x^3$ has a saddle point at $x = 0$ where $f'$ is 0, but $f$ is a strictly increasing function over the entire real line. A function is said to have an absolute maximum (sometimes called a global maximum) at $a$ if $f$ is defined at $a$, and for all other $x$ in the domain of $f$, $f(x) \le f(a)$. The term absolute minimum (sometimes called a global minimum) is defined in the analogous way with $f(x) \ge f(a)$, and an absolute extremum (sometimes called a global extremum) is either an absolute maximum or absolute minimum. The theorem about relative extrema shows that if $f$ is defined on any interval $I$, then the only places $f$ can have relative extrema or absolute extrema are critical points or endpoints of $I$. You should be able to identify example functions where each of these criteria gives extrema (Fig. 5.4).
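The observation that extrema on an interval can occur only at critical points or endpoints turns the search for absolute extrema into a finite comparison. The sketch below assumes the sample function $f(x) = x^3 - 3x$ on $[-2, 3]$, which is not taken from the text; its critical points $x = \pm 1$ were found by hand from $f'(x) = 3x^2 - 3$.

```python
def f(x):
    return x ** 3 - 3 * x

# f'(x) = 3x^2 - 3 vanishes at x = -1 and x = 1, and f' exists everywhere.
critical_points = [-1.0, 1.0]
endpoints = [-2.0, 3.0]

# Absolute extrema on the closed interval can only occur among these candidates.
candidates = sorted(endpoints + critical_points)
values = {x: f(x) for x in candidates}

absolute_max = max(values.values())
absolute_min = min(values.values())
max_locations = [x for x in candidates if values[x] == absolute_max]
min_locations = [x for x in candidates if values[x] == absolute_min]
print(values, max_locations, min_locations)
```

In this example the absolute maximum occurs at an endpoint, while the absolute minimum is attained both at an endpoint and at a critical point.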

5.6.1 Exercises

Identify the relative extrema and absolute extrema of the given functions on the

given intervals.

1.

2.

3.

4.

f .x/ D 3x C 3x on the interval 1; 1/

f .x/ D jx2 16j on the interval 5; 6

p 2

f .x/ D 3 2 3 x on the interval 2; 2


The Mean Value Theorem is one of the better known results about derivatives, and for good reason. It is invoked frequently when one needs to estimate the maximum possible change between the values of a function at two different points. This can be a valuable tool when finding approximations to functions or when it is necessary to know how much variation is exhibited by a particular function. The theorem states that the average rate of change of a function between two points $a$ and $b$, given by $\frac{f(b) - f(a)}{b - a}$, is equal to the value of the derivative $f'(c)$ for some $c$ between $a$ and $b$. This allows you to use information about the derivative to make statements about the change $f(b) - f(a)$. The theorem is usually proved in two steps by first proving Rolle's Theorem, which is a simpler version of the Mean Value Theorem. Rolle's Theorem states that if $a < b$, and if $f$ is a function continuous on the interval $[a, b]$, differentiable on the interval $(a, b)$, and satisfying $f(a) = f(b)$, then there is a $c \in (a, b)$ for which $f'(c) = 0$.

What tools do you have to prove this result? Your proof needs to conclude that $f'(c) = 0$. Think through what you know about derivatives, and see if any of the results conclude that the derivative is equal to 0. The only results that come to mind are the result that the derivative of any constant function is 0, and the result that if $f$ reaches a relative extremum at a point where the function is differentiable, then its derivative at that point must be 0. It is unlikely that the first of these two results will be of much help except in the very special case where $f$ is a constant function. So how can you use the result about extreme values to show that there is a place where the function has a derivative of 0? What you do know is that $f$ is continuous on a closed interval $[a, b]$, and the Extreme Value Theorem states that such a function obtains its maximum and minimum values on this interval. You also know that these extreme values can only occur at places where the derivative is 0, where the derivative does not exist, or at the endpoints of the interval. OK, there are no places on $(a, b)$ where the derivative does not exist, but could both the maximum and minimum occur at endpoints of the interval? The hypothesis of Rolle's Theorem says that $f(a) = f(b)$, so the only way that the two endpoints can be both maximum and minimum values of $f$ on the interval is for $f$ to be constant on the interval. In the case of a constant function, the theorem is clearly true. In any other case, it could be that $f(a)$ and $f(b)$ are maximum values for $f$ or minimum values for $f$, but they cannot be both. If $f$ is not constant, its maximum and minimum values must be different. That guarantees that $f$ must have either an absolute maximum or an absolute minimum (possibly both) between $a$ and $b$. That gives the result (Fig. 5.5).

Fig. 5.5 The proof of Rolle's Theorem finds an extreme point $c$ between $a$ and $b$ for which $f'(c) = 0$

PROOF (Rolle's Theorem): For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and differentiable on the interval $(a, b)$ satisfying $f(a) = f(b)$. Then there is a $c \in (a, b)$ for which $f'(c) = 0$.

For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and differentiable on the interval $(a, b)$ satisfying $f(a) = f(b)$.

Because $f$ is continuous on the closed bounded interval $[a, b]$, it obtains a maximum and a minimum value there.

If both the maximum and minimum values of $f$ occur at endpoints of the interval, then, since $f(a) = f(b)$, the maximum and minimum values of $f$ are equal, and $f$ is constant on the interval $[a, b]$.

In this case, $f'(c) = 0$ for each $c \in (a, b)$, and the conclusion of the theorem holds.

If the maximum and minimum values of $f$ do not both occur at endpoints of the interval, then there must be a $c \in (a, b)$ such that $f$ reaches a maximum or a minimum value at $c$.

In this case, $f'(c) = 0$, and the conclusion of the theorem holds.

In either case, the conclusion of the theorem holds, which completes the proof.

Rolle's Theorem takes care of the case where $f(a) = f(b)$. To prove the Mean Value Theorem in the more general case where $f(a)$ need not equal $f(b)$, you would want to reduce this general case to the previously proved case where $f(a)$ and $f(b)$ are equal. An easy way to do this is to subtract a linear function from $f$ to get a new function $h$ which does satisfy the hypothesis of Rolle's Theorem. This linear function can be any linear function that takes on a value at $b$ which differs by $f(b) - f(a)$ from the value it takes on at $a$. One such function is $\big(f(b) - f(a)\big)\frac{x - a}{b - a}$ because it takes on the value 0 at $a$ and $f(b) - f(a)$ at $b$ (Fig. 5.6).

Fig. 5.6 Point $c$ between $a$ and $b$ where the tangent line is parallel to the secant line from $a$ to $b$

PROOF (Mean Value Theorem): For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and differentiable on the interval $(a, b)$. Then there is a $c \in (a, b)$ for which $f'(c) = \frac{f(b) - f(a)}{b - a}$.

For $a < b$, let $f$ be a function continuous on the interval $[a, b]$ and differentiable on the interval $(a, b)$.

Let $h(x) = f(x) - \big(f(b) - f(a)\big)\frac{x - a}{b - a}$.

Since both $f(x)$ and $\big(f(b) - f(a)\big)\frac{x - a}{b - a}$ are continuous on $[a, b]$ and differentiable on $(a, b)$, $h$ is also continuous on $[a, b]$ and differentiable on $(a, b)$.

$h(b) = f(b) - \big(f(b) - f(a)\big)\frac{b - a}{b - a} = f(b) - \big(f(b) - f(a)\big) = f(a) = h(a)$.

Thus, $h$ satisfies the hypothesis of Rolle's Theorem, so there is a $c \in (a, b)$ such that $h'(c) = 0$.

Then $0 = h'(c) = f'(c) - \big(f(b) - f(a)\big)\frac{1}{b - a}$, so $f'(c) = \frac{f(b) - f(a)}{b - a}$.

Therefore, there is a $c \in (a, b)$ with $f'(c) = \frac{f(b) - f(a)}{b - a}$, which completes the proof.
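The theorem only asserts that some $c$ exists, but for a specific function one can often locate it. The following is a minimal sketch, assuming the sample function $f(x) = x^3$ on $[0, 2]$ (not from the text) and assuming that $f'$ is continuous so that bisection applies; the Mean Value Theorem itself does not require a continuous derivative. The name `mean_value_point` is hypothetical.

```python
import math

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

def mean_value_point(a, b, iterations=100):
    """Bisection for f'(c) = (f(b) - f(a)) / (b - a).

    Assumes f' is continuous and f' - slope changes sign on [a, b].
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    if (f_prime(lo) - slope) * (f_prime(hi) - slope) > 0:
        raise ValueError("bisection needs a sign change of f' - slope")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f_prime(lo) - slope) * (f_prime(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

c = mean_value_point(0.0, 2.0)
print(c, f_prime(c))  # c should be close to 2 / sqrt(3), where f'(c) = 4
```

Here the secant slope is $\frac{8 - 0}{2 - 0} = 4$, and solving $3c^2 = 4$ by hand gives $c = \frac{2}{\sqrt{3}}$, which the bisection approximates.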

The following are two instructive applications of the Mean Value Theorem. First, if you know that a function $f$ is differentiable on an interval, and its derivative is nonnegative on that interval, then the function must be increasing on the interval. To show that a function is increasing, you need to show that if $x$ and $y$ are in the interval with $x < y$, then $f(x) \le f(y)$. This would follow from knowing that if $y - x > 0$, then the difference quotient $\frac{f(y) - f(x)}{y - x} \ge 0$. What the Mean Value Theorem gives you is that this difference quotient is equal to the derivative of $f$ at some point $c$ between $x$ and $y$. So, if you know that the derivative on the interval is always nonnegative, then the difference quotient must be nonnegative as needed.

PROOF: Let $f$ be a function whose derivative is nonnegative at each point of an interval. Then $f$ is an increasing function on that interval.

Let $f$ be a function whose derivative is nonnegative at each point of the interval $I$.

Let $x$ and $y$ be in $I$ with $x < y$.

Then by the Mean Value Theorem, there is a $c$ between $x$ and $y$ such that $\frac{f(y) - f(x)}{y - x} = f'(c)$.

Since $I$ is an interval and $x$ and $y$ are in $I$, $c$ is also in $I$, implying that $f'(c) \ge 0$.

Thus, $\frac{f(y) - f(x)}{y - x} \ge 0$, so $(y - x)\frac{f(y) - f(x)}{y - x} = f(y) - f(x) \ge 0$.

Therefore, $f$ is increasing on $I$.

Clearly, if $f'$ is strictly positive on an interval, then you can prove that $f$ is strictly increasing on the interval. This can be done by altering the above proof by changing the greater than or equal signs to greater than signs where needed. Is the converse of the above theorem true? Well, one cannot conclude that a function is differentiable on an interval by just knowing that the function is increasing there. But what if you are given a differentiable function that is increasing? What can you conclude about the derivative? If a function is increasing, it does mean that every difference quotient $\frac{f(y) - f(x)}{y - x}$ will be greater than or equal to 0, and, thus, the derivative, which is the limit of such difference quotients, will have to be greater than or equal to 0. If $f$ is strictly increasing, can you conclude that its derivative is positive? In this case you cannot. You can conclude that all difference quotients will be positive, but the limit of positive difference quotients can be 0. For example, $f(x) = x^3$ is a function differentiable on the entire real line, and it is strictly increasing, but its derivative is 0 at $x = 0$.

Another important consequence of the Mean Value Theorem is that if a function

has a derivative equal to 0 at every point of an interval, then f is constant on that

interval. Again, this follows directly from what you can say about any difference

quotient.

PROOF: Let $f$ be a function whose derivative is 0 at every point of an interval. Then $f$ is constant on that interval.

Let $f$ be a function whose derivative is 0 at each point of the interval $I$.

Let $x$ and $y$ be in $I$ with $x < y$.

Then by the Mean Value Theorem, there is a $c$ between $x$ and $y$ such that $\frac{f(y) - f(x)}{y - x} = f'(c)$.

Since $I$ is an interval and $x$ and $y$ are in $I$, $c$ is also in $I$, implying that $f'(c) = 0$.

Thus, $\frac{f(y) - f(x)}{y - x} = 0$, so $(y - x)\frac{f(y) - f(x)}{y - x} = f(y) - f(x) = 0$.

Therefore, $f(x) = f(y)$ for all $x$ and $y$ in the interval, and, thus, $f$ is constant.

In this theorem, the fact that the set is an interval is crucial. For example, the function $f(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x > 0 \end{cases}$ is not defined at 0. The derivative, $f'$, is equal to 0 at each point of the domain of $f$, but clearly, $f$ is not a constant function, although it is constant on each interval contained in its domain. Looking back to the previous theorem, note that the function $f(x) = -\frac{1}{x}$ has a strictly positive derivative at each point of its domain, but, again, its domain does not include 0. This function is strictly increasing on each interval contained in its domain, but it is not an increasing function because $f(-1) > f(1)$.

5.7.1 Exercises

Write proofs for each of the following statements.

1. If $f$ is a function whose derivative is negative for all points in an interval, then $f$ is a decreasing function on the interval.

2. If $f$ and $g$ are functions differentiable on an interval with $f'(x) = g'(x)$ for each $x$ in the interval, then there is a constant $C$ such that $f(x) = g(x) + C$ for all $x$ in the interval.

3. If $f(0) = g(0)$ and $f'(x) \le g'(x)$ for each $x \ge 0$, then $f(x) \le g(x)$ for each $x > 0$.

It seems like most students who take Calculus remember L'Hôpital's Rule. Even those who do not remember what the rule states seem to remember its name. Perhaps this is because it is so much fun to pronounce, but more students remember L'Hôpital's Rule than some far more important results such as the Fundamental Theorem of Calculus. L'Hôpital's Rule states that if $f$ and $g$ are differentiable functions defined on an interval containing the point $a$ with $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies that $\lim_{x \to a} \frac{f(x)}{g(x)} = L$. This is very useful because the theorem stating that the limit of a quotient is the quotient of the limits does not apply in cases when the denominator has a limit of 0.

How would you prove L'Hôpital's Rule? You might try to prove it by using the Mean Value Theorem because the quotient you are considering is

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{\;\frac{f(x) - f(a)}{x - a}\;}{\;\frac{g(x) - g(a)}{x - a}\;}.$$

This is not exactly correct because, as far as you know, $f(a)$ and $g(a)$ might not even be defined, and if they are, they need not be equal to $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$. This is not a big stumbling block, because you can always redefine $f$ and $g$ at $a$ to be equal to 0 without changing the result of the theorem. You also would need to know that $g(x) \neq g(a)$ for $x$ near $a$ so that the needed quotient can be calculated. Once the quotient of $f$ and $g$ is rewritten as the quotient of the difference quotient of $f$ and the difference quotient of $g$, you can apply the Mean Value Theorem to replace the difference quotients with derivatives, and then take the limit. It might look something like the following.

PROOF ATTEMPT: Let both $f$ and $g$ be functions differentiable for all $x \neq a$ in an interval which contains $a$. Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \neq 0$ for all $x \neq a$ in the interval. Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let both $f$ and $g$ be functions differentiable for all $x \neq a$ in an interval which contains $a$.

Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \neq 0$ for all $x$ in the interval with $x \neq a$.

Assume that $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Define $f(a) = g(a) = 0$; redefining the functions at $a$ does not change the limits at $a$ of $f$, $g$, $f'$, $g'$, or their ratios. With $f(a) = g(a) = 0$, both $f$ and $g$ are continuous at $x = a$.

For $x$ in the given interval with $x \neq a$, both $f$ and $g$ are continuous on the closed interval with endpoints at $x$ and $a$, and both $f$ and $g$ are differentiable on the open interval with these endpoints.

Thus, by the Mean Value Theorem, there is a $c_f(x)$ between $x$ and $a$ such that $f'(c_f(x)) = \frac{f(x) - f(a)}{x - a}$, and there is a $c_g(x)$ between $x$ and $a$ such that $g'(c_g(x)) = \frac{g(x) - g(a)}{x - a}$.

Because $c_f(x)$ and $c_g(x)$ are both between $x$ and $a$, $\lim_{x \to a} c_f(x) = \lim_{x \to a} c_g(x) = a$.

Then

$$\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f(x) - f(a)}{g(x) - g(a)} = \lim_{x \to a} \frac{\;\frac{f(x) - f(a)}{x - a}\;}{\;\frac{g(x) - g(a)}{x - a}\;} = \lim_{x \to a} \frac{f'(c_f(x))}{g'(c_g(x))} = \lim_{x \to a} \frac{f'(x)}{g'(x)} = L.$$

There is a significant problem with this proof. The problem stems from the fact that, although both functions $c_f(x)$ and $c_g(x)$ do approach $a$ as $x$ approaches $a$, the two functions can approach $a$ at different rates. Why is this a problem? Consider calculating $\lim_{x \to 0} \frac{x^2}{x}$, a limit which is clearly equal to 0. But what if $c_f(x) = x$ and $c_g(x) = x^2$? Even though it is true that $c_f(x)$ and $c_g(x)$ both approach 0 as $x$ approaches 0, $\lim_{x \to 0} \frac{c_f(x)^2}{c_g(x)} = 1$. Knowing that $c_f(x)$ and $c_g(x)$ are approaching $a$ does not allow you to use both of these expressions in place of $x$ when taking the limit. What the proof attempt does show is that $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f'(x)}{\lim_{x \to a} g'(x)}$ when both limits on the right exist and the limit in the denominator is not 0.
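The danger of substituting two different intermediate points can be seen with a few numbers. This is a small sketch of the example in the text, $\frac{x^2}{x}$ with the hypothetical choices $c_f(x) = x$ and $c_g(x) = x^2$; the function names are illustrative only.

```python
def ratio_same_point(x):
    """x^2 / x, with numerator and denominator evaluated at the same point x."""
    return x ** 2 / x

def ratio_different_points(x):
    """c_f(x)^2 / c_g(x) with the hypothetical choices c_f(x) = x and c_g(x) = x^2."""
    c_f = x
    c_g = x ** 2
    return c_f ** 2 / c_g

# Sample points approaching 0 from the right.
samples = [10.0 ** (-k) for k in range(1, 6)]
same = [ratio_same_point(x) for x in samples]
different = [ratio_different_points(x) for x in samples]
print(same)       # shrinks toward 0
print(different)  # stays at 1
```

Both $c_f(x)$ and $c_g(x)$ tend to 0, yet the two ratios have different limits, which is exactly why the proof attempt cannot be repaired by a limit argument alone.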

A second, less crucial problem with this proof attempt is that it defines $c_f(x)$ to be the value of $c$ such that $\frac{f(x) - f(a)}{x - a} = f'(c)$. But this condition may well be satisfied by more than one value of $c$, so there is a problem with which of the possible values of $c$ is chosen. One can get around this difficulty, but that still does not address the previously stated problem.

A common way to correct the problem in the proof attempt is to use a more powerful version of the Mean Value Theorem known as the Extended Mean Value Theorem or sometimes as the Cauchy Mean Value Theorem. It allows you to select one value of $c$ so that $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a)}{g(x) - g(a)}$, that is, it allows you to make the ratio of derivatives equal the ratio of the difference quotients at a single value of $c$ rather than selecting one value of $c$ for the numerator and a possibly different value of $c$ for the denominator. One can prove the Extended Mean Value Theorem by manipulating the desired relation $\frac{f'(c)}{g'(c)} = \frac{f(x) - f(a)}{g(x) - g(a)}$. This equation can be rewritten as $f'(c)\big[g(x) - g(a)\big] = g'(c)\big[f(x) - f(a)\big]$ and then $f'(c)\big[g(x) - g(a)\big] - g'(c)\big[f(x) - f(a)\big] = 0$. This may be confusing because there are three variables involved, $x$, $a$, and $c$, but you can make better sense of it by thinking of $x$ and $a$ as being fixed. That is, if you define the function $h(t) = f(t)\big[g(x) - g(a)\big] - g(t)\big[f(x) - f(a)\big]$, then $h'(c) = f'(c)\big[g(x) - g(a)\big] - g'(c)\big[f(x) - f(a)\big]$ as needed. How do you know that there is a $c$ such that $h'(c) = 0$? That follows from Rolle's Theorem because it is easy to verify that $h(x) = h(a)$.

PROOF (Extended Mean Value Theorem): For $a < b$, let both $f$ and $g$ be functions continuous on $[a, b]$ and differentiable on $(a, b)$. Then there is a $c \in (a, b)$ such that $f'(c)\big[g(b) - g(a)\big] = g'(c)\big[f(b) - f(a)\big]$.

Let $a < b$ and assume $f$ and $g$ are functions continuous on $[a, b]$ and differentiable on $(a, b)$.

For $x \in [a, b]$ define $h(x) = f(x)\big[g(b) - g(a)\big] - g(x)\big[f(b) - f(a)\big]$.

Then $h$ is also continuous on $[a, b]$ and differentiable on $(a, b)$.

Note that $h(a) = f(a)\big[g(b) - g(a)\big] - g(a)\big[f(b) - f(a)\big] = f(a)g(b) - g(a)f(b)$, and $h(b) = f(b)\big[g(b) - g(a)\big] - g(b)\big[f(b) - f(a)\big] = f(a)g(b) - g(a)f(b) = h(a)$.

Thus, $h$ satisfies the hypothesis of Rolle's Theorem on the interval $[a, b]$.

It follows that there is a $c \in (a, b)$ such that $h'(c) = 0$, so $f'(c)\big[g(b) - g(a)\big] - g'(c)\big[f(b) - f(a)\big] = 0$.

This is equivalent to $f'(c)\big[g(b) - g(a)\big] = g'(c)\big[f(b) - f(a)\big]$, which is the conclusion of the theorem.

Now the Extended Mean Value Theorem can be used to give a correct proof of L'Hôpital's Rule.

PROOF (L'Hôpital's Rule, Part 1): Let $f$ and $g$ be functions differentiable for all $x \neq a$ in an open interval which contains $a$. Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \neq 0$ for all $x \neq a$ in the interval. Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let $f$ and $g$ be functions differentiable for all $x \neq a$ in an open interval which contains $a$.

Assume that $\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0$, and $g'(x) \neq 0$ for all $x$ in the interval with $x \neq a$.

Assume that $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Define $f(a) = g(a) = 0$; redefining the functions at $a$ does not change the limits at $a$ of $f$, $g$, $f'$, $g'$, or their ratios. With $f(a) = g(a) = 0$, both $f$ and $g$ are continuous at $x = a$.

Let $\epsilon > 0$ be given.

Since $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, there is a $\delta > 0$ such that $0 < |x - a| < \delta$ implies that $\frac{f'(x)}{g'(x)}$ is within $\epsilon$ of $L$.

Fix $x$ in the given interval with $0 < |x - a| < \delta$.

Since $f$ and $g$ are continuous on the closed interval from $a$ to $x$ and differentiable on the open interval from $a$ to $x$, $f$ and $g$ satisfy the hypothesis of the Extended Mean Value Theorem on the interval from $a$ to $x$.

It follows that there is a $c$ between $x$ and $a$ such that $f'(c)\big[g(x) - g(a)\big] = g'(c)\big[f(x) - f(a)\big]$.

By assumption $g'$ is not 0, so $g'(c) \neq 0$. Also the Mean Value Theorem shows that $g(x) - g(a) = g'(t)(x - a)$ for some $t$ between $x$ and $a$, and this shows $g(x) - g(a) \neq 0$.

It follows that $\frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)}$.

Thus, since $0 < |c - a| < \delta$, $\left|\frac{f(x)}{g(x)} - L\right| = \left|\frac{f(x) - f(a)}{g(x) - g(a)} - L\right| = \left|\frac{f'(c)}{g'(c)} - L\right| < \epsilon$.

Therefore, $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.
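A quick numerical comparison illustrates the statement just proved. This is a minimal sketch, assuming the sample pair $f(x) = 1 - \cos x$ and $g(x) = x^2$ at $a = 0$, which is not an example from the text; both functions tend to 0 and $g'(x) = 2x$ is nonzero for $x \neq 0$, so the hypotheses hold, and both ratios should approach $\frac{1}{2}$.

```python
import math

def f(x):
    return 1 - math.cos(x)

def g(x):
    return x ** 2

def f_prime(x):
    return math.sin(x)

def g_prime(x):
    return 2 * x

# Compare f/g with f'/g' at points approaching a = 0.
for x in [1e-1, 1e-2, 1e-3]:
    print(x, f(x) / g(x), f_prime(x) / g_prime(x))

ratio = f(1e-3) / g(1e-3)
derivative_ratio = f_prime(1e-3) / g_prime(1e-3)
```

Much smaller values of $x$ are avoided on purpose, because computing $1 - \cos x$ in floating point loses accuracy to cancellation there.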

L'Hôpital's Rule also holds in cases where $\lim_{x \to a} g(x)$ is infinite rather than zero.


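PROOF (L'Hôpital's Rule, Part 2): Let $f$ and $g$ be functions differentiable for all $x \neq a$ in an open interval which contains $a$. Assume that $\lim_{x \to a} g(x)$ is either positive or negative infinity, and $g'(x) \neq 0$ for all $x$ in the interval with $x \neq a$. Then $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a} \frac{f(x)}{g(x)} = L$.

Let $f$ and $g$ be functions differentiable for all $x \neq a$ in an open interval which contains $a$.

Assume that $\lim_{x \to a} g(x)$ is positive or negative infinity, $g'(x) \neq 0$ for all $x$ in the interval with $x \neq a$, and $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$.

Let $\epsilon > 0$ be given.

Because $g'(x)$ is never 0, the Mean Value Theorem shows that if $x$ and $y$ are distinct points in the interval and are both on the same side of $a$, then $g(x) \neq g(y)$.

Since $\lim_{x \to a} \frac{f'(x)}{g'(x)} = L$, there is a $\delta_0 > 0$ such that $0 < |x - a| < \delta_0$ implies that $\frac{f'(x)}{g'(x)}$ is within $\frac{\epsilon}{2}$ of $L$.

Fix $x$ in the given interval with $0 < |x - a| < \delta_0$.

Since $f$ and $g$ are differentiable between $x$ and $a$ and continuous at $x$, for any $y$ between $x$ and $a$ it follows from the Extended Mean Value Theorem that there is a $c$ between $x$ and $y$ such that $\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{f'(c)}{g'(c)}$. Thus, $\frac{f(y) - f(x)}{g(y) - g(x)}$ is within $\frac{\epsilon}{2}$ of $L$.

Note that $\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{\frac{f(y)}{g(y)} - \frac{f(x)}{g(y)}}{1 - \frac{g(x)}{g(y)}}$.

Because $\frac{f(y) - f(x)}{g(y) - g(x)} = \frac{f'(c)}{g'(c)}$ is within $\frac{\epsilon}{2}$ of $L$, it follows that $\left|\frac{\frac{f(y)}{g(y)} - \frac{f(x)}{g(y)}}{1 - \frac{g(x)}{g(y)}} - L\right| < \frac{\epsilon}{2}$.

Then $\left|\frac{f(y)}{g(y)} - \frac{f(x)}{g(y)} - L\left(1 - \frac{g(x)}{g(y)}\right)\right| < \frac{\epsilon}{2}\left|1 - \frac{g(x)}{g(y)}\right|$.

Because $g(y)$ approaches positive or negative infinity as $y$ approaches $a$, there is a $\delta > 0$ with $\delta < \delta_0$ such that for all $y$ with $0 < |y - a| < \delta$, the fraction $\frac{|f(x)| + |L g(x)| + |g(x)|}{|g(y)|} < \frac{\epsilon}{2}$.

Then for $y$ with $0 < |y - a| < \delta$, $\left|\frac{f(y)}{g(y)} - \frac{f(x)}{g(y)} - L\left(1 - \frac{g(x)}{g(y)}\right)\right| < \frac{\epsilon}{2}\left|1 - \frac{g(x)}{g(y)}\right|$ implies

$$\left|\frac{f(y)}{g(y)} - L\right| < \left|\frac{f(x)}{g(y)} - L\frac{g(x)}{g(y)}\right| + \frac{\epsilon}{2} + \frac{\epsilon\,|g(x)|}{2|g(y)|} < \frac{\epsilon}{2} + \frac{|f(x)| + |L g(x)| + |g(x)|}{|g(y)|} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

(The middle inequality uses $\epsilon \le 2$, which may be assumed without loss of generality.)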

There are several variations of L'Hôpital's Rule covering the cases of one-sided limits and limits at positive or negative infinity. These are covered in the following exercises.


5.8.1 Exercises

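Use L'Hôpital's Rule to calculate the following limits.

1. $\lim_{x \to 0} \frac{\sin^2(2x^3)}{x^6}$

2. $\lim_{x \to 0^+} \frac{\sqrt{x} - x}{\sqrt{x} + x}$

3. $\lim_{x \to 0^+} \sqrt{x}\,\ln x$

4. $\lim_{x \to \infty} \frac{\ln x}{\sqrt{x}}$

5. $\lim_{x \to 0^+} (\sin(2x))^x$

6. $\lim_{x \to 0} \frac{\tan^{-1} x}{\tan^{-1}(3x)}$

7. If $f$ and $g$ are differentiable functions for all $x > 0$, $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0$, and $g'(x) \neq 0$ for all $x > 0$, then $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.

8. If $f$ and $g$ are differentiable functions for all $x > 0$, $\lim_{x \to \infty} g(x) = \infty$, and $g'(x) > 0$ for all $x > 0$, then $\lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to \infty} \frac{f(x)}{g(x)} = L$.

9. If $f$ and $g$ are functions differentiable for all $x > a$, $\lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = 0$, and $g'(x) \neq 0$ for all $x$ with $x > a$, then $\lim_{x \to a^+} \frac{f'(x)}{g'(x)} = L$ implies $\lim_{x \to a^+} \frac{f(x)}{g(x)} = L$.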

The Intermediate Value Theorem says that if a function is continuous on an interval, then it has the intermediate value property on that interval. That is, if $f$ is continuous on the interval $I$, and $a, b \in I$, then for any $K$ between $f(a)$ and $f(b)$, there is a $c$ between $a$ and $b$ with $f(c) = K$. Suppose that $f$ is differentiable at each point of an interval $I$. If $f'$ is continuous on $I$, then certainly it obeys the Intermediate Value Theorem and has the intermediate value property on $I$. But $f'(x)$ can exist for all $x \in I$ without $f'$ being a continuous function. One example is $f(x) = \begin{cases} x^2 \sin\frac{1}{x^2} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$. This function is differentiable for all $x$. When $x \neq 0$, the derivative is $f'(x) = 2x \sin\frac{1}{x^2} - \frac{2}{x}\cos\frac{1}{x^2}$, and $f'(0) = 0$. As $x$ approaches 0, $f'$ is not even bounded and, in fact, oscillates wildly. In spite of its discontinuity at 0, $f'$ does have the intermediate value property. For example, for any $x \neq 0$, the function $f'$ obtains every value between $f'(x)$ and $f'(0) = 0$ on the interval between 0 and $x$. Moreover, it obtains each of those values infinitely often. In fact, between 0 and $x$, the function $f'$ takes on every real number infinitely often (Fig. 5.7).

Note that the function $f(x) + x$ has a derivative of 1 at $x = 0$. This is an example of a function with a positive derivative at 0 which is not an increasing function over any open interval containing 0. This can easily be seen by the fact that in every open interval containing 0 there are intervals where the derivative of $f(x) + x$ is negative.
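The claim that the derivative of $f(x) + x$ is negative arbitrarily close to 0 can be checked at explicit points. The sketch below assumes the sample points $x_k = \frac{1}{\sqrt{2\pi k}}$, chosen here for illustration because $\sin\frac{1}{x_k^2} = 0$ and $\cos\frac{1}{x_k^2} = 1$ there, so the derivative is approximately $1 - 2\sqrt{2\pi k}$.

```python
import math

def derivative_of_f_plus_x(x):
    """Derivative of x^2 sin(1/x^2) + x for x != 0."""
    return 2 * x * math.sin(1 / x ** 2) - (2 / x) * math.cos(1 / x ** 2) + 1

# Points x_k = 1/sqrt(2*pi*k) shrink toward 0 as k grows.
points = [1 / math.sqrt(2 * math.pi * k) for k in (1, 10, 100, 1000)]
slopes = [derivative_of_f_plus_x(x) for x in points]
for x, slope in zip(points, slopes):
    print(x, slope)
```

Every printed slope is negative even though the derivative at 0 itself equals 1, so no open interval around 0 can be an interval of increase.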

So, how can you prove that if a function $f$ has a derivative $f'$ on an interval $I$, then $f'$ has the intermediate value property on $I$? The hypothesis suggests that you start by taking a function $f$ differentiable on an interval $I$ and values $a, b \in I$. Then you select a value $K$ between $f'(a)$ and $f'(b)$. Without loss of generality, you can assume that $a < b$ and $f'(a) < K < f'(b)$. The goal would be to show that there is a $c$ between $a$ and $b$ such that $f'(c) = K$. One simplification is to replace $f$ with the function $g(x) = f(x) - Kx$. This function is also differentiable on $I$, and if $f'(c) = K$, then $g'(c) = 0$. Which theorems about derivatives allow you to conclude that a derivative is 0 at some point in an interval? First there is a theorem that states that if a differentiable function reaches an extreme value at a point in an interval, then the point is either a critical point of the function or an endpoint of the interval. A second theorem is Rolle's Theorem, which talks about a differentiable function which takes on the same value at the endpoints $a$ and $b$. Since you do not have any information about the values of $g$ at the endpoints of the interval, the theorem about extreme values may be the more promising choice for this proof.

What is known about the function $g$? You know that $g$ is differentiable at each point of the interval from $a$ to $b$. Additionally, $g'(a) = f'(a) - K < 0$ and $g'(b) = f'(b) - K > 0$. Does this mean that the function $g$ is decreasing at $a$ and increasing at $b$? Well, it would if you knew that $g'$ were continuous because then $g'$ would be negative in an interval around $a$ and positive in an interval around $b$. But, as you now know, $g'$ need not be continuous. On the other hand, there is a theorem that says that if $g'(a)$ is negative, then there is a $\delta > 0$ such that if $x$ satisfies $a < x < a + \delta$, then $g(x) < g(a)$. This does not show much, but you can use it to conclude that $g$ does not take on its minimum value on $[a, b]$ at $a$. A similar argument uses the fact that $g'(b) > 0$ to show that $g$ does not take on its minimum value on $[a, b]$ at $b$.

157

continuous. All continuous functions on a close bounded interval take on both their

minimum and maximum values on the interval. Thus, you know that g takes on its

minimum value on a; b at some point c strictly between a and b. Such a point must

be a critical point of g, so g0 .c/ D 0. This is the idea behind the following proof.

PROOF: The derivative of a function differentiable on an interval has the intermediate

value property on that interval.

Let $f$ be differentiable on the interval $I$.

Let $a, b \in I$, and assume that $f'(a) \neq f'(b)$.

Without loss of generality assume that $a < b$ and $f'(a) < f'(b)$.

Let $K$ be a value satisfying $f'(a) < K < f'(b)$.

Let $g(x) = f(x) - Kx$.

Then $g'(x) = f'(x) - K$ for all $x \in [a, b]$, and $g'(a) < 0 < g'(b)$.

Since $g$ is differentiable at each point of $[a, b]$, it is continuous on $[a, b]$.

Since $g$ is continuous on $[a, b]$, it obtains a minimum value at some point

$c \in [a, b]$.

$g'(a) < 0$ implies that there is a $\delta_a > 0$ such that $g(x) < g(a)$ for all $x$

satisfying $a < x < a + \delta_a$. In particular, $g$ does not obtain its minimum at $a$.

$g'(b) > 0$ implies that there is a $\delta_b > 0$ such that $g(x) < g(b)$ for all $x$

satisfying $b - \delta_b < x < b$. In particular, $g$ does not obtain its minimum at $b$.

It follows that $g$ obtains its minimum on $[a, b]$ at a point $c$ strictly between

$a$ and $b$.

Since $c$ is not an endpoint of $[a, b]$, $g'(c) = 0$.

Thus, $f'(c) = g'(c) + K = K$ which shows that $f'$ has the intermediate value

property on $I$.
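
The idea of the proof can be tried out numerically. The following is only a rough sketch under stated assumptions: the function $f(x) = x^3 - x$, the interval $[0, 2]$, the value $K = 2$, and the grid search are all illustrative choices that do not come from the text. The code minimizes $g(x) = f(x) - Kx$ over a fine grid and checks that the minimizer $c$ lies strictly inside the interval and satisfies $f'(c) \approx K$.

```python
# Hypothetical example: f(x) = x^3 - x on [a, b] = [0, 2], so f'(a) = -1 and f'(b) = 11.
def f(x):
    return x**3 - x

def f_prime(x):
    return 3 * x**2 - 1

a, b = 0.0, 2.0
K = 2.0                      # any value with f'(a) < K < f'(b)

def g(x):
    # The auxiliary function from the proof: g'(x) = f'(x) - K.
    return f(x) - K * x

# Approximate the point where g attains its minimum on [a, b] by a grid search.
N = 200_000
grid = [a + (b - a) * i / N for i in range(N + 1)]
c = min(grid, key=g)

print("approximate minimizer c =", c)
print("f'(c) =", f_prime(c), "compared with K =", K)
```

A grid search only approximates the minimizer, so this is an illustration of the argument and not a substitute for it.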

There are simple examples of functions that have discontinuous derivatives that

do not have the intermediate value property; functions such as $f(x) = |x|$. This

function's derivative is the constant 1 for all $x > 0$ and $-1$ for all $x < 0$. This

derivative is not continuous at $x = 0$ because it is not defined there. Clearly, $f'$ does

not have the intermediate value property on any interval containing both positive

and negative numbers, but then $f$ does not satisfy the hypothesis of the previous

theorem on any such interval because $f'(0)$ is not defined. Functions that have

discontinuous derivatives that are defined at all points will have to exhibit wild

oscillations in the neighborhoods of those discontinuities similar to the example

$$\begin{cases} x^2 \sin\frac{1}{x^2} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}.$$

Suppose $f$ is a function whose derivative is defined at all points of an interval

except perhaps at some point $c$ in the interval. What can be said if $\lim_{x \to c} f'(x)$ exists?

Such a derivative does not exhibit wild oscillations near $c$, and, in fact, provided $f$ is

continuous at $c$, the function $f$ must have a continuous derivative at $c$. The proof is a

consequence of the Mean Value Theorem.

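PROOF: Let $f$ be continuous on the interval $(a, b)$ and differentiable on $(a, b)$

except perhaps at some point $c \in (a, b)$. Suppose that the limit $\lim_{x \to c} f'(x)$

exists. Then $f'$ is continuous at $c$.

Let $f$ be a function continuous on the interval $(a, b)$ and differentiable on $(a, b)$ except perhaps at

$c \in (a, b)$.

Assume $\lim_{x \to c} f'(x) = L$.

From the definition of limit, given $\epsilon > 0$, there is a $\delta > 0$ such that if

$y \in (a, b)$ with $0 < |y - c| < \delta$, then $|f'(y) - L| < \epsilon$.

Let $x \in (a, b)$ with $0 < |x - c| < \delta$.

By the Mean Value Theorem, which applies because $f$ is continuous on the closed

interval with endpoints $x$ and $c$ and differentiable between them, there is a $y$ between $x$ and $c$ such that

$\frac{f(x) - f(c)}{x - c} = f'(y)$.

Then $y \in (a, b)$ with $0 < |y - c| < \delta$, so $\left| \frac{f(x) - f(c)}{x - c} - L \right| = |f'(y) - L| < \epsilon$.

Thus, $\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = L$, so $f'(c) = L$.

Because $f'(c) = \lim_{x \to c} f'(x)$, it follows that $f'(x)$ is continuous at $c$.

This completes the proof.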

Chapter 6

Riemann Integrals

6.1 Area

The first application one usually sees of the Riemann Integral is that of finding

the area of a region in the plane bounded by the graph of a function and the

lines $x = a$, $x = b$, and the $x$-axis. Thus, before discussing integration, it makes

sense to review what is meant by the area of a region in the plane. Clearly, the

measure of area should be a way to assign a size to a region in a way that is

compatible with the well-established rules from Geometry for assigning areas to

regions such as rectangles, triangles, and circles. But there is a need to go beyond

these simple regions so that area can be calculated for far more complicated regions.

For example, consider the region in the coordinate plane $\{(x, y) \mid 0 \le x \le 1,\ 0 \le

y \le 1,\ \text{at least one of } x \text{ or } y \text{ is rational}\}$. Regions such as these are not typically

considered in a Geometry course, but being able to calculate areas for such sets

is important in the more general discussion of integration. This chapter, therefore,

begins by considering two different measures of the sizes of sets which will aid the

understanding of integration.

What does the set $\{A, B, C, D, E\}$ have in common with the set $\{2, 4, 6, 8, 10\}$? One

thing they have in common is that the two sets have the same number of elements.

What does the set of positive integers have in common with the set of positive

multiples of 2? These sets are both infinite sets, and the second set is clearly a

proper subset of the first, but, here again, the two sets have the same number

of elements. To see this, consider the function $f(n) = 2n$, which is a bijection

(a one-to-one and onto map) from the set of positive integers to the set of positive multiples

of 2. This function provides a one-to-one matching of the elements of one set

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_6


to the elements of the other set. One says that two sets A and B have the same

cardinality if there is a bijection $f : A \to B$. The bijection demonstrates a one-to-one correspondence between the elements of set $A$ and the elements of set $B$, so

one concludes that $A$ and $B$ are the same size. Some sets are finite, meaning that the

set is either empty (has cardinality 0) or, for some positive integer $n$, is in one-to-one correspondence with the set $\{1, 2, 3, \ldots, n\}$. A set is called denumerable if it

can be put in one-to-one correspondence with the set of positive integers. Thus, the

set of positive multiples of 2 is denumerable. So is the set of all integers since

the positive integers can be mapped onto the set of all integers using the bijection

$$f(x) = \begin{cases} \frac{x}{2} & \text{if } x \text{ is even} \\ 1 - \frac{x+1}{2} & \text{if } x \text{ is odd} \end{cases}.$$

The verification that this map is a bijection is left

as an exercise. It shows that the integers and the positive integers have the same

cardinality. Sets that are either finite or denumerable are called countable because

they can be counted out by listing a first, second, third, and so forth. Thus, a good

way to think about a countable set is a set whose elements can be written down in

a finite or infinite sequence $x_1, x_2, x_3, \ldots$ because this listing shows the one-to-one

correspondence between the set and the natural numbers or one of its finite subsets.
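
The bijection from the positive integers onto the integers can be checked on an initial segment. This is a small sketch, with the cutoff `N` chosen arbitrarily for illustration; it confirms that the first $2N$ positive integers are sent to $2N$ distinct integers filling the range from $-(N-1)$ to $N$ without gaps.

```python
def f(x: int) -> int:
    """x/2 if x is even, 1 - (x + 1)/2 if x is odd (x a positive integer)."""
    if x % 2 == 0:
        return x // 2
    return 1 - (x + 1) // 2

N = 10                                   # illustrative cutoff, not from the text
values = [f(x) for x in range(1, 2 * N + 1)]

# No value repeats, and together they fill the block of integers -(N-1), ..., N.
print(values)
print(sorted(values) == list(range(-(N - 1), N + 1)))
```

A finite check of this kind is not a proof of bijectivity, but it shows how the even inputs cover the positive integers while the odd inputs cover zero and the negative integers.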

The union of two countable sets is also countable. This can be seen by representing one set by the sequence $x_1, x_2, x_3, \ldots$ and the other by $y_1, y_2, y_3, \ldots$. Then

the elements of the union of the two sets can be written as $x_1, y_1, x_2, y_2, x_3, y_3, \ldots$.

If there are elements that belong to both sets, then one can just leave the second

copies of those elements out of the listing. Clearly, this can be extended to the

union of any finite collection of countable sets, so the union of a finite number

of countable sets is countable. What might seem surprising is that the union of a

countable number of countable sets is still countable. That is, if $A_1, A_2, A_3, \ldots$ is

a sequence of countable sets, then the union $\bigcup_{k=1}^{\infty} A_k$ is also countable. To see this,

suppose that the elements in each $A_k$ can be listed in a sequence $a_{k,1}, a_{k,2}, a_{k,3}, \ldots$.

One can now list all the elements of $\bigcup_{k=1}^{\infty} A_k$ by listing the $a_{k,j}$ elements in increasing

order of $k + j$ resulting in

$a_{1,1}, a_{2,1}, a_{1,2}, a_{3,1}, a_{2,2}, a_{1,3}, a_{4,1}, a_{3,2}, a_{2,3}, a_{1,4}, a_{5,1}, \ldots$.

As above, duplicate elements occurring because they belong to more than one set

can be left out of this listing. Figure 6.1 shows the order that the elements enter

the list.
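
The order in which the elements $a_{k,j}$ enter the list can be generated mechanically. The sketch below is one way to do it; the generator name `diagonal_order` is made up for illustration. It yields the index pairs $(k, j)$ in increasing order of $k + j$, and within each value of $k + j$ it lets $k$ run downward, which reproduces the listing given above.

```python
from itertools import count, islice

def diagonal_order():
    """Yield index pairs (k, j), k, j >= 1, in increasing order of k + j."""
    for s in count(2):                  # s = k + j takes the values 2, 3, 4, ...
        for k in range(s - 1, 0, -1):   # k runs from s - 1 down to 1
            yield (k, s - k)

# The first eleven pairs correspond to a_{1,1}, a_{2,1}, a_{1,2}, ..., a_{5,1}.
first = list(islice(diagonal_order(), 11))
print(first)
```

Every pair $(k, j)$ is reached after finitely many steps, which is exactly what is needed for the union to be listed as a single sequence.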

Note that this result can be used to show that the set of rational numbers is

countable. Indeed, the rational numbers can be written as the union $R_1 \cup R_2 \cup R_3 \cup \cdots$

where $R_k$ are the rational numbers that can be written as a fraction with an integer

in the numerator and the positive integer $k$ in the denominator. For example,

$R_2 = \{\frac{0}{2}, \frac{1}{2}, -\frac{1}{2}, \frac{2}{2}, -\frac{2}{2}, \frac{3}{2}, -\frac{3}{2}, \ldots\}$. Thus, the set of rational numbers is a countable union

of countable sets showing that it is countable. The cardinality of a denumerable set is

often written using the symbol $\aleph_0$ (read "Aleph naught" or "Aleph null"). The symbol

represents the size of the natural numbers and the size of any set that can be placed

in one-to-one correspondence with the natural numbers.

A set which is not a countable set is called uncountable. There is a standard

argument that shows that the set of real numbers in the interval $(0, 1)$ is not a

countable set. The method, known as a diagonalization argument, first assumes

that the real numbers between 0 and 1 can all be written down in a sequence

$x_1, x_2, x_3, x_4, \ldots$. Then one constructs a real number $y$ between 0 and 1 where the

$k$th digit to the right of the decimal point in $y$ is chosen as follows. If the $k$th digit

to the right of the decimal point of $x_k$ is 7, then let the $k$th digit to the right of the

decimal point in $y$ be 3. Otherwise, if the $k$th digit to the right of the decimal point of

$x_k$ is not 7, then let the $k$th digit to the right of the decimal point in $y$ be 7. Figure 6.2

illustrates the process of determining $y$.

[Fig. 6.1 The union of countably many countable sets is countable (a grid of the elements $a_{k,j}$ with row index $k$ and column index $j$)]

[Fig. 6.2 A diagonalization argument]

x1 = 0. 4 9 0 3 2 5 5 9 0 9 9 0

x2 = 0. 1 7 7 3 8 8 0 0 0 0 0 0

x3 = 0. 7 4 1 1 8 9 1 8 2 5 4 4

x4 = 0. 1 1 8 8 8 3 7 2 9 0 0 1

x5 = 0. 5 5 2 7 7 7 1 0 6 4 2 3

x6 = 0. 0 0 0 0 0 2 1 0 9 3 7 3

x7 = 0. 8 2 1 7 4 9 0 3 2 8 5 5

x8 = ...

y = 0. 7 3 7 7 3 7 7 7 7 7 7 3

The point of this construction is that the number $y$ is a real number in the interval

$(0, 1)$, but it cannot be one of the numbers in the sequence $x_1, x_2, x_3, x_4, \ldots$. This

is because for each $k$, $y$ cannot equal $x_k$ because $y$ and $x_k$ differ in their $k$th digits.

This is a contradiction to the assumption that the sequence contained all of the real

numbers in $(0, 1)$ and shows that it is impossible to list all the elements of $(0, 1)$ in

a sequence. Thus, this interval is an uncountable set. If there is a bijection from the

set $(0, 1)$ to a set $B$, then it follows that $B$ will also be uncountable. You may wonder

whether all uncountable sets have the same cardinality. They do not, but that fact

will not be needed for the proofs discussed in this book. Refer to a standard text in

Set Theory for a far more in-depth look at the cardinality of sets.
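
The digit rule used in the diagonalization argument is easy to mechanize. The sketch below applies it to the seven decimal expansions shown in Fig. 6.2, stored here as digit strings; the function name `diagonal_digits` is an illustrative choice. It produces the first seven digits of $y$ and confirms that $y$ differs from each $x_k$ in the $k$th digit.

```python
# Digits to the right of the decimal point of x1, ..., x7 as shown in Fig. 6.2.
rows = [
    "490325590990",
    "177388000000",
    "741189182544",
    "118883729001",
    "552777106423",
    "000002109373",
    "821749032855",
]

def diagonal_digits(rows):
    """k-th digit of y is 3 if the k-th digit of x_k is 7, and 7 otherwise."""
    out = []
    for k, digits in enumerate(rows):       # k is 0-based here
        out.append("3" if digits[k] == "7" else "7")
    return "".join(out)

y_prefix = diagonal_digits(rows)
print("y = 0." + y_prefix + "...")
```

Only the first seven digits of $y$ are determined by the seven rows listed; the later digits of $y$ in the figure depend on rows that are not shown.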


6.2.1 Exercises

1. Determine whether each of the following sets is finite, denumerable, or

uncountable.

(a) the set of points in the coordinate plane where both x and y coordinates are

rational numbers

(b) the set of points in the coordinate plane where at least one of its x and y

coordinates is a rational number

(c) the set of polynomials p.x/ with integer coefficients

(d) the set of real numbers whose decimal representation does not contain the

digit 5

(e) the set of functions $f : \{0, 1, 2, 3, 4, 5\} \to \{1, 2, 3, \ldots, 100\}$

(f) the set of functions $f : \{2, 4, 6, 8, 10, \ldots\} \to \{0, 1\}$

2. Show that the function $f(x) = \begin{cases} \frac{x}{2} & \text{if } x \text{ is even} \\ 1 - \frac{x+1}{2} & \text{if } x \text{ is odd} \end{cases}$ is a bijection from the set of

natural numbers to the set of integers.

3. Show that $(0, 1)$ and $(1, 5)$ have the same cardinality.

4. Show that $(0, 1)$ and the entire set of real numbers, $\mathbb{R}$, have the same cardinality.

5. Show that $(0, 1)$ and the interval $[0, 1]$ have the same cardinality. (Hint: Find a

way to bury the endpoints of $[0, 1]$ inside of $(0, 1)$ by mapping a sequence

$x_1, x_2, x_3, \ldots$ to $x_3, x_4, x_5, \ldots$.)

6. Suppose that the union $\bigcup_{n=1}^{\infty} A_n$ is uncountable. Show that for at

least one $n$, the set $A_n$ must be uncountable. This can be thought of as an infinite

form of the Pigeonhole Principle.

7. Show that the interval $[0, 1]$ on the real line and the unit square in the plane have the same

cardinality. (Hint: for a point in $[0, 1]$ split up its decimal digits between the $x$

and $y$ coordinates of a point in the unit square.)

8. Show that the equality of cardinality is an equivalence relation. That is, if A, B,

and C are any sets, then

A has the same cardinality as A.

If A has the same cardinality as B, then B has the same cardinality as A.

If A has the same cardinality as B, and B has the same cardinality as C, then A

has the same cardinality as C.

9. Suppose that you apply the diagonalization argument to the set of rational

numbers in the interval $(0, 1)$. That is, suppose you list all of the rational numbers

in a sequence $x_1, x_2, x_3, \ldots$ and use the diagonalization argument to construct a

number $y$ in $(0, 1)$ that differs from each element of the sequence. Why is this

not a proof that the rational numbers are uncountable?


Cardinality is used to compare the sizes of sets by considering how many elements

the sets have. But two sets such as $[0, 3]$ and $[0, 6]$ can have the same cardinality

and yet be quite different in what we traditionally think of as size in the geometric

sense. So there is a need to develop a different way to compare the sizes of sets that

embodies the notion of the length of a set of real numbers and of the area of a set in

the plane. A general theory of measure is not a topic that can be covered in a book

at this level, but it is helpful to introduce how one determines which sets should be

assigned a length or an area equal to 0.

If measure is to mean anything useful, you would want each finite interval $[a, b]$

to have measure equal to its length, $b - a$. How about the measure of the open interval

$(a, b)$? Likely, you would say that its measure should also be $b - a$. This suggests

that the set of endpoints $\{a, b\}$ should be assigned a measure of 0. More generally,

a set $S \subset \mathbb{R}$ is said to have measure zero if for each $\epsilon > 0$ there is a sequence of

open intervals $(a_1, b_1), (a_2, b_2), (a_3, b_3), \ldots$ such that $S$ is contained in the union of

the intervals $S \subset \bigcup_{j=1}^{\infty} (a_j, b_j)$ and the total length of the intervals is less than $\epsilon$, that

is, $\sum_{j=1}^{n} (b_j - a_j) < \epsilon$ for every natural number $n$. In other words, a set has measure

zero if you can cover it with a sequence of intervals whose total length is as small

as you want.

In particular, any finite set consisting of $n$ real numbers has measure zero because

for any $\epsilon > 0$, each point $x$ in the set can be covered by the interval $(x - \frac{\epsilon}{3n}, x + \frac{\epsilon}{3n})$,

and the total length of these intervals is $\frac{2\epsilon}{3}$. Similarly, any countable set of real

numbers $\{x_1, x_2, x_3, \ldots\}$ can be covered by intervals $(x_j - \frac{\epsilon}{3 \cdot 2^j}, x_j + \frac{\epsilon}{3 \cdot 2^j})$, and the total

length of these intervals is $\sum_{j=1}^{\infty} \frac{2\epsilon}{3 \cdot 2^j} = \frac{2\epsilon}{3}$. Thus, the set of rational numbers, which

is countable, has measure zero. More generally, if $A_1, A_2, A_3, \ldots$ is

a sequence of sets all of which have measure zero, then the union $\bigcup_{j=1}^{\infty} A_j$ also has

measure zero. Indeed, given $\epsilon > 0$, for each $j$ you can cover $A_j$ with a sequence

of open intervals whose total length is less than $\frac{\epsilon}{2^j}$. Then the sequences of open

intervals can be combined into one sequence of intervals which covers $\bigcup_{j=1}^{\infty} A_j$ and has

total length less than $\sum_{j=1}^{\infty} \frac{\epsilon}{2^j} = \epsilon$.
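
The total length of the intervals covering a countable set can be tabulated exactly with rational arithmetic. The sketch below assumes the covering intervals have half-width $\epsilon / (3 \cdot 2^j)$ around the $j$th point, as in the paragraph above; the helper name `cover_length` and the sample value of $\epsilon$ are illustrative. The partial sums increase toward $2\epsilon/3$ and never reach $\epsilon$.

```python
from fractions import Fraction

def cover_length(eps, terms):
    """Total length of the first `terms` covering intervals, each of length 2*eps/(3*2^j)."""
    return sum(2 * eps / (3 * 2**j) for j in range(1, terms + 1))

eps = Fraction(1, 10)                    # illustrative choice of epsilon
for terms in (1, 5, 20):
    total = cover_length(eps, terms)
    print(terms, total, total < eps)
```

Because the lengths form a geometric series, the partial sum after $n$ intervals falls short of $2\epsilon/3$ by exactly $\frac{2\epsilon}{3} \cdot 2^{-n}$.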

Since any countable set of real numbers has measure zero, if a set does not

have measure zero, it must be an uncountable set. A natural question is whether

an uncountable set of real numbers can have measure zero. The answer to this

question is yes. The most famous example of this is known as the Cantor set which

is constructed as follows. The construction begins with the closed unit interval

$C_0 = [0, 1]$. At the first stage, the open interval of length $\frac{1}{3}$ is removed from the

middle of this set leaving two intervals each with length $\frac{1}{3}$ so that $C_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$.

At the second stage, open intervals of length $\frac{1}{9}$ are removed from the middle of each

of the two remaining intervals leaving four intervals each with length $\frac{1}{9}$ so that

$C_2 = [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{3}{9}] \cup [\frac{6}{9}, \frac{7}{9}] \cup [\frac{8}{9}, \frac{9}{9}]$. This process is repeated so that at stage $n$,

open intervals of length $\frac{1}{3^n}$ are removed from each of $2^{n-1}$ closed intervals of length

$\frac{1}{3^{n-1}}$ leaving $2^n$ closed intervals each with length $\frac{1}{3^n}$ (Fig. 6.3). The Cantor set $C$

is the intersection $\bigcap_{n=1}^{\infty} C_n$.

[Fig. 6.3 Stages 0 through 5 of the construction of the Cantor set]
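
The stages $C_n$ of the construction can be generated directly. This is a minimal sketch using exact fractions; the function name `cantor_stage` is an illustrative choice. Each closed interval is replaced by its left and right thirds, so stage $n$ consists of $2^n$ closed intervals of length $\frac{1}{3^n}$.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals making up C_n as a list of (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]          # C_0 = [0, 1]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))     # keep the left third
            next_intervals.append((right - third, right))   # keep the right third
        intervals = next_intervals
    return intervals

for n in range(4):
    stage = cantor_stage(n)
    total = sum(right - left for left, right in stage)
    print(n, len(stage), total)
```

The printed totals are the powers of $\frac{2}{3}$, which is the quantity that shrinks to 0 as the stage number grows.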

The Cantor set is sometimes called the Cantor middle thirds set, because, at each

stage, the middle thirds of the remaining intervals are removed. Other similar types

of Cantor-like sets can be constructed by removing other portions of each interval.

It is clear that the Cantor set has measure zero because it is contained in $C_n$ which

is made up of $2^n$ closed intervals each with length $\frac{1}{3^n}$. The total length of the closed

intervals in $C_n$ is $\frac{2^n}{3^n}$, a quantity that goes to 0 as $n$ gets large. $C_n$ can be covered by

a finite collection of open intervals whose total length is 10 percent larger than $\frac{2^n}{3^n}$

showing that the Cantor set can be covered by open intervals whose total length is

as small as you want. So how do you show that the Cantor set is uncountable? To

see this, consider writing each number in the unit interval $[0, 1]$ in base three. The

numbers in the interval $[0, \frac{1}{3}]$ are the numbers between 0 and 1 whose base-three

representation begins with 0.0, and the numbers in the interval $[\frac{2}{3}, 1]$ are the numbers

between 0 and 1 whose base-three representation begins with 0.2. The numbers in

the middle third of the interval that are removed at the first stage of the construction

process are the numbers between 0 and 1 whose base-three representation begins

with 0.1. Note that the numbers at the endpoints of the removed interval, $\frac{1}{3}$ and $\frac{2}{3}$, each

have two different representations. Indeed, in base three $\frac{1}{3} = 0.1 = 0.0222\ldots$ and

$\frac{2}{3} = 0.2 = 0.1222\ldots$. One could say that $C_1$ consists of all the numbers between 0

and 1 that can be represented in base three without a 1 in the first place to the right

of the decimal point, the one-third place. Similarly, $C_2$ are the numbers between 0

and 1 that have a base-three representation with no 1 in either of the first two places

to the right of the decimal point. The Cantor set $C$ is the set of numbers between 0

and 1 that have a base-three representation that contains no digit equal to 1. Then

consider the map that takes each element of the Cantor set and divides it by 2. This

is an injective map that maps the numbers in the Cantor set to the set of numbers in

the unit interval that have base-three representations that include only the digits of

0 and 1 because it takes numbers with representations that only included the digits

of 0 and 2 and divides each of the digits by 2. Now, the numbers between 0 and

1 with base-three representations that include only the digits of 0 and 1 are clearly

in one-to-one correspondence with base-two representations of numbers in between

0 and 1. But all the real numbers between 0 and 1 have base-two representations

containing only 0 and 1, so the numbers in the Cantor set are as numerous as the

real numbers between 0 and 1. Thus, the Cantor set must be uncountable since the

set of real numbers between 0 and 1 is an uncountable set.
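
The digit manipulation in this argument can be followed on a single point. The sketch below takes a hypothetical element of the Cantor set with the finite base-three expansion 0.202, halves each digit, and then reads the resulting string of 0s and 1s first in base three and then in base two. The helper `value` and the sample digits are illustrative choices.

```python
from fractions import Fraction

def value(digits, base):
    """Value of 0.d1 d2 d3 ... in the given base, for a finite list of digits."""
    return sum(Fraction(d, base ** (i + 1)) for i, d in enumerate(digits))

ternary_digits = [2, 0, 2]                   # 0.202 in base three: only digits 0 and 2
halved = [d // 2 for d in ternary_digits]    # 0.101: only digits 0 and 1

print(value(ternary_digits, 3))   # 20/27, the original point of the Cantor set
print(value(halved, 3))           # 10/27, the original point divided by 2
print(value(halved, 2))           # 5/8, the same digit string read in base two
```

Halving the digits halves the number itself, and rereading the halved digits in base two is the correspondence with the real numbers between 0 and 1 that the text describes.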

The concept of measure zero can be extended to sets in the plane, although here,

rather than being interested in the length of a set, the interest is in the area of the set.

Thus, rather than trying to cover a set with intervals whose total length is small, in

the plane one would try to cover a set with a sequence of squares whose total area

is small. Just as on the real line, it was taken as given that the length of an interval

$[a, b]$ was $b - a$, in the plane it will be taken as given that the area of a square with

side length $x$ is $x^2$. Then, a region in the plane is said to have measure zero (or area

zero) if for each $\epsilon > 0$, the set is contained in the union of a sequence of squares

whose total area is less than $\epsilon$.

As it was with sets of real numbers, any countable set of points in the plane has

area zero because, for any $\epsilon > 0$, you can cover the sequence $x_1, x_2, x_3, \ldots$ with a

sequence of squares with total area less than $\epsilon$. Moreover, let $Y$ be a line segment

with length $y > 0$. Then $Y$ has area zero. How would you prove this? Certainly, this

line segment is contained in a square with side length $y$ which has area $y^2$, so the

square's area could be rather large and, in particular, the area of the square is not

zero. Notice, though, that $Y$ can also be covered by two side-by-side squares each

with side length $\frac{y}{2}$ and each with area $\frac{y^2}{4}$ giving a total area of the two squares equal

to $\frac{y^2}{2}$. This is the key to covering $Y$ with squares with very small total area. If $Y$ is

covered by a sequence of $n$ adjacent squares each with side length $\frac{y}{n}$, then the total

area of the squares is $n \cdot \left(\frac{y}{n}\right)^2 = \frac{y^2}{n}$. Since this can be made as small as desired by choosing $n$

large, it follows that $Y$ has measure zero (Fig. 6.4).

Fig. 6.4 Covering a line segment with smaller and smaller squares
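
The shrinking total area of the covering squares can be tabulated. This is a small sketch, assuming a segment of length $y$ covered by $n$ adjacent squares of side $y/n$; the function name `covering_area` and the sample length are illustrative.

```python
from fractions import Fraction

def covering_area(y, n):
    """Total area of n adjacent squares of side y/n that cover a segment of length y."""
    side = Fraction(y) / n
    return n * side**2

y = 3                                    # illustrative segment length
for n in (1, 2, 10, 1000):
    print(n, covering_area(y, n))
```

The total area equals $y^2/n$, so for any $\epsilon > 0$ it drops below $\epsilon$ as soon as $n > y^2/\epsilon$.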


6.3.1 Exercises

1. Rather than constructing the Cantor set only on the interval $[0, 1]$, perform the

same construction on each interval $[n, n + 1]$ for every integer $n$. Show that the

resulting set has measure zero.

2. Beginning with the interval $[0, 1]$ construct a Cantor-like set, but instead of

removing intervals of length $\frac{1}{3}$ at stage 1, $\frac{1}{9}$ at stage 2, and so forth removing

intervals of length $\frac{1}{3^n}$ at stage $n$, you remove an interval of length $\frac{1}{4}$ at stage 1,

intervals of length $\frac{1}{16}$ at stage 2, and so forth removing intervals of length $\frac{1}{4^n}$ at

stage $n$. Show that the total length of the intervals remaining after stage $n$ does

not approach zero as $n$ approaches infinity.

3. Which of the following sets of real numbers have measure zero?

(a) the integers

(b) the irrational numbers

(c) $\{a + b\sqrt{2} + c\sqrt{3} \mid a, b, c \text{ are integers}\}$

(d) the Cantor-like set where instead of removing the middle $\frac{1}{3}$ of each

remaining interval at stage $n$, you remove the middle $\frac{1}{4}$ of each remaining

interval

4. Show that if the set $A$ has measure zero and $B \subset A$, then the set $B$ has measure

zero.

5. Show that a line in the plane has area zero.

6. Show that the set in the plane $\{(x, y) \mid x \text{ is rational}\}$ has area zero.

7. Suppose that the set $A \subset \mathbb{R}$ has measure zero. Show that the set $\{(x, y) \mid x \in

A,\ y \in [0, 1]\}$ has area zero.

8. Suppose that the set $A \subset \mathbb{R}$ has measure zero. Show that the set $\{(x, y) \mid x \in A\}$

has area zero.

9. Show that $\{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1,\ \text{at least one of } x \text{ or } y \text{ is rational}\}$ has

area zero.

10. Show that the interval $[0, 1]$ does not have measure zero. (Hint: Use the Heine-Borel

Theorem to reduce any cover to a finite subcover.)

When discussing area, it is not possible to avoid the limit concept, and this brings

a topic usually associated with Geometry into the field of Analysis. One could

even make a case for including much of Geometry as a subtopic of Analysis since

Geometry involves properties of distance, a distinguishing feature of Analysis.

What properties of area can be taken as given? One would hope that whatever

axioms are chosen, they would let you prove results about area that you know to be

true from Euclidean Geometry. The following axioms accomplish this.


1. The area of a set in the plane is a nonnegative real number.

2. A square with side length 1 has area equal to 1.

3. (Similarity) If sets $A$ and $B$ are similar in the geometric sense with lengths

in $B$ equal to $t$ times the corresponding lengths in $A$, then the area of $B$ is $t^2$

times the area of $A$.

4. (Area Zero) Let $A$ be a set. Suppose that for each $\epsilon > 0$ there is a sequence

of squares $S_1, S_2, S_3, \ldots$ with areas $s_1, s_2, s_3, \ldots$, respectively, such that the

set $A$ is contained in the union of the squares $\bigcup_{k=1}^{\infty} S_k$, and for every natural

number $n$, $\sum_{k=1}^{n} s_k < \epsilon$. Then $A$ has area 0.

5. (Union) If set $A$ has area $a$, set $B$ has area $b$, and their intersection $A \cap B$ has

area 0, then the union $A \cup B$ has area $a + b$.

6. (Exhaustion) Let $B$ be a set. If for each $\epsilon > 0$ there are sets $A$ and $C$ with

$A \subset B \subset C$ such that the area of $A$ is greater than $b - \epsilon$, and the area of $C$

is less than $b + \epsilon$, then the area of $B$ is $b$.

Axioms 1, 2, and 3 should agree with what you know about area from Geometry,

and they can be used to prove some simple results. For example, since a $1 \times 1$ square

has area 1, Axiom 3 can be used to show that an $s \times s$ square has area $s^2$.

The result from the previous section that a line segment has area 0 is particularly

useful because of the way it can be used in conjunction with the Union area axiom.

In particular, suppose A and B are two squares or other polygons set side-by-side so

that they only share an edge. Because the shared edge is a line segment, it has 0 for

its area, and the Union area axiom shows that A [ B has an area equal to the sum of

the area of A and the area of B. By using mathematical induction, this result can be

extended to the union of many polygons that share borders. In particular, consider

finding the area of a rectangle with width $x$ and length $y$. If $\frac{x}{y}$ is a rational number

equal to $\frac{p}{q}$, where $p$ and $q$ are positive integers, then the $x \times y$ rectangle is the union

of $p \cdot q$ squares all with side length $\frac{x}{p}$. Indeed, the width of the rectangle which has

length $x$ is spanned by $p$ such squares, and the length of the rectangle which has

length $y$ is spanned by $q$ such squares showing that the entire rectangle can be tiled

by a $p \times q$ array of squares, each with area $\left(\frac{x}{p}\right)^2$. The Union axiom then shows that

the area of the $x \times y$ rectangle is $p \cdot q \cdot \left(\frac{x}{p}\right)^2 = x \cdot \frac{q}{p} x = x \cdot y$. It will require

the last of the area axioms to conclude that the area of any rectangle is equal to its

length times its width even when the length of the rectangle is an irrational multiple

of its width.
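
The tiling computation can be checked with exact arithmetic. The sketch below picks a hypothetical width $x$ and integers $p$, $q$, sets $y = \frac{q}{p} x$ so that $\frac{x}{y} = \frac{p}{q}$, and compares the total area of the $p \cdot q$ squares with $x \cdot y$. The function name `tiled_area` and the sample values are illustrative.

```python
from fractions import Fraction

def tiled_area(x, p, q):
    """Total area of a p-by-q array of squares, each with side length x/p."""
    side = x / p
    return p * q * side**2

x = Fraction(3, 2)        # illustrative width
p, q = 3, 5               # x / y = p / q
y = x * q / p             # the corresponding length, here 5/2

print(tiled_area(x, p, q), x * y)
```

Both printed values agree, matching the identity $p \cdot q \cdot \left(\frac{x}{p}\right)^2 = x \cdot y$ derived above.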

The last area axiom is essentially the Method of Exhaustion used to some extent by

Euclid and much more extensively by Archimedes to calculate areas and volumes.

It is an example of a use of Calculus about 1800 years before the foundation of

Calculus was formally established by Newton and Leibniz. This axiom says that if

a region in the plane can be closely approximated by sets whose areas you know,

then you can figure out the area of the region. Take, for example, a rectangle $B$ with

width $x > 0$ and length $y > 0$ where the ratio $\frac{y}{x}$ is irrational. It is certainly

possible to find other rectangles close to the size of $B$ whose length to width ratios

are rational. To prove that $B$ has area $xy$, the axiom requires that for each $\epsilon > 0$ you

find a subset $A \subset B$ whose area is greater than $xy - \epsilon$ and a set $C$ containing $B$ whose

area is less than $xy + \epsilon$. Suppose you choose $A$ to be a rectangle with width $x$ and

length just a bit short of $y$, say $rx$, where $r$ is a rational number chosen to be less

than but suitably close to $\frac{y}{x}$. How close is suitably close? Well, you would need the

area of $A$, which is $x \cdot rx = rx^2$, to be within $\epsilon$ of $xy$, that is, $xy - rx^2 < \epsilon$. Solving

for $r$ shows that $r > \frac{y}{x} - \frac{\epsilon}{x^2}$. Is there such an $r$ which is rational and between $\frac{y}{x} - \frac{\epsilon}{x^2}$

and $\frac{y}{x}$? Of course there is. The rational numbers are dense in the real line; there are

rational numbers in every interval of positive length. Thus, you can select a rational

number $r$ between $\frac{y}{x} - \frac{\epsilon}{x^2}$ and $\frac{y}{x}$ and let $A$ be an $x \times rx$ rectangle. Then $A$ can be

placed inside of $B$, and the area of $A$ is within $\epsilon$ of $xy$. Similarly, you can choose a

rectangle $C$ with width $x$ and length $sx$, where $s$ is a rational number chosen to be

greater than but suitably close to $\frac{y}{x}$. You need the area of $C$ to be within $\epsilon$ of $xy$, so

choose $s$ so that $x \cdot sx - xy < \epsilon$. This happens if $\frac{y}{x} < s < \frac{y}{x} + \frac{\epsilon}{x^2}$. Since you have

found a rectangle $A$ contained inside $B$ and a rectangle $C$ containing $B$ with the areas

of $A$ and $C$ within $\epsilon$ of $xy$, the Exhaustion area axiom shows that $B$ has area $xy$.
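
The choice of the rational numbers $r$ and $s$ can be made concrete. The following is a rough numerical sketch, not the text's construction: it works with floating-point approximations, and the function name `bracketing_ratios`, the denominator rule, and the sample values $x = 1$, $y = \sqrt{2}$ are assumptions made for illustration. It picks a denominator $N$ large enough that $\frac{2}{N} < \frac{\epsilon}{x^2}$ and then takes $r$ and $s$ to be fractions with denominator $N$ just below and just above $\frac{y}{x}$.

```python
import math
from fractions import Fraction

def bracketing_ratios(x, y, eps):
    """Return rationals r < y/x < s, each within eps / x^2 of y/x."""
    t = y / x
    N = math.ceil(2 * x * x / eps) + 1   # guarantees 2/N < eps / x^2
    m = math.floor(t * N)
    r = Fraction(m - 1, N)               # r < t and t - r < 2/N
    s = Fraction(m + 1, N)               # s > t and s - t <= 1/N
    return r, s

x, y, eps = 1.0, math.sqrt(2), 1e-3      # illustrative values
r, s = bracketing_ratios(x, y, eps)
area_A = x * float(r) * x                # area of the inner x-by-rx rectangle
area_C = x * float(s) * x                # area of the outer x-by-sx rectangle

print(r, s)
print(area_A, x * y, area_C)
```

The inner and outer areas bracket $xy$ and each lies within $\epsilon$ of it, which is what the Exhaustion axiom requires.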

The familiar formula for the area of a triangle given as one half the base times the

height can be derived geometrically, but to prove this formula using the area axioms

requires more work. To begin, consider a right triangle with legs with lengths x and

y. Place this triangle in a rectangle with side lengths x and y. For any natural number

$n$, the rectangle can be overlaid with an $n \times n$ grid of rectangles with side lengths

$\frac{x}{n}$ and $\frac{y}{n}$. The hypotenuse of the triangle is the diagonal of the $x \times y$ rectangle and

spans the diagonals of $n$ of the smaller rectangles as shown in Fig. 6.5 exhibiting the

case where $n = 8$.

[Fig. 6.5 An $8 \times 8$ grid of rectangles overlaying a triangle]

Because there are $n$ grid rectangles along the hypotenuse of the triangle, it

must be that there are $\frac{n^2 - n}{2}$ grid rectangles inside the triangle with a total area

of $\frac{n^2 - n}{2} \cdot \frac{xy}{n^2} = \left(1 - \frac{1}{n}\right) \frac{xy}{2}$. Similarly, the triangle is enclosed inside the union of

$\frac{n^2 + n}{2}$ grid rectangles with a total area of $\left(1 + \frac{1}{n}\right) \frac{xy}{2}$. Clearly, $n$ can be chosen large

enough to make both the total area of grid rectangles inside the triangle and the total

area of grid rectangles enclosing the triangle within a particular $\epsilon > 0$ of $\frac{xy}{2}$. Thus,

the Exhaustion axiom shows that the area of the triangle is $\frac{xy}{2}$ as expected. Since any

triangle can be partitioned into two right triangles, the well-known area formula for

the area of a triangle follows. Since any polygon can be partitioned into triangles,

the usual formulas from Geometry for the areas of polygons can be derived in the

same way they would be in Geometry.
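
The two counts of grid rectangles can be confirmed by direct enumeration. The sketch below assumes, for illustration, that the right angle of the triangle sits at the origin and that lengths are measured in grid units, so the cell in column $i$ and row $j$ occupies $[i, i+1] \times [j, j+1]$ and the triangle is the region $u + v \le n$ with $u, v \ge 0$. The function name `grid_counts` is made up for this example.

```python
from fractions import Fraction

def grid_counts(n):
    """Count grid cells lying inside the triangle and cells needed to cover it."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # A cell lies inside the triangle when its far corner (i + 1, j + 1) does.
    inside = sum(1 for i, j in cells if (i + 1) + (j + 1) <= n)
    # A cell is needed for the cover when its near corner (i, j) is strictly inside.
    covering = sum(1 for i, j in cells if i + j < n)
    return inside, covering

n = 8
inside, covering = grid_counts(n)
print(inside, covering)
# Areas as fractions of x*y: each cell has area x*y / n^2.
print(Fraction(inside, n * n), Fraction(covering, n * n))
```

For $n = 8$ the counts are $\frac{n^2 - n}{2} = 28$ and $\frac{n^2 + n}{2} = 36$, giving areas $\left(1 - \frac{1}{8}\right)\frac{xy}{2}$ and $\left(1 + \frac{1}{8}\right)\frac{xy}{2}$.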

You may wonder whether these techniques can be used to find the area of any

region in the plane, or at least any bounded region in the plane. This is a really good

question with a very complicated answer. The Area Axioms listed in this section

are designed to give the reader a feel for proofs about areas that will be useful

in the upcoming discussion of proofs about Riemann integrals. The axiom list is

not complete enough to allow the calculation of the area of many of the sets that

one might encounter. The area of Analysis known as Measure Theory provides a

somewhat richer environment for this study, but the complexities of measure theory

go beyond the aim of this text. What can be said is that even with the use of measure

theory, there are sets in the plane complex enough that one cannot assign an area

measure to them.

6.4.1 Exercises

1. Show that a circle with radius $r$ has area $\pi r^2$.

2. Suppose the polygonal region $A$ in the coordinate plane has area $K$. Show that

the region $\{(x, y) \mid (x, \frac{y}{3}) \in A\}$ has area $3K$.

The definition of the Riemann Integral is motivated by the Method of Exhaustion

which attempts to approximate a planar region with sets, perhaps collections of

rectangles or other polygons, whose area can be easily calculated. If the region

whose area is to be calculated is not a polygon itself, then one needs to fill the

region with a sequence of smaller and smaller polygons until a limit is realized.

Such a region might be bounded by the horizontal x-axis, the vertical lines given

by $x = a$ and $x = b$ for some real numbers $a$ and $b$, and finally by the graph of

some nonnegative function f . Given this region, one attempts to fill the region with

rectangles whose sides are parallel to the axes, have one side along the x-axis,

and have a length determined in some way by the graph of the given function.


Fig. 6.6 Approximating the area under a curve with narrowing rectangles

If the function is in some sense well behaved, then as the widths of these rectangles

are chosen to be smaller and smaller, the total area of the rectangles will approach

the area of the region (Fig. 6.6). What is meant by well behaved will be a main focus

of the theorems presented in this chapter.

To make the definition of Riemann Integral precise, there needs to be a way to

talk about the placement of the vertical rectangles used to approximate the area

under a curve. This is done by designating the position of the vertical sides of the

rectangles with a collection of x values in the interval $[a, b]$. One defines a partition
of the interval $[a, b]$ to be a finite sequence of x values $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$
for some natural number n. These values of x break the interval
$[a, b]$ into n subintervals $[x_{j-1}, x_j]$. Note that the definition of partition does not say
anything about the lengths of the subintervals for the partition. Indeed, the jth
subinterval length $x_j - x_{j-1}$ could be 0 or could be as large as $b - a$.
In particular, there is no requirement that all the interval lengths be the same size.
Since the lengths of the subintervals $x_j - x_{j-1}$ are used frequently in the discussion
of Riemann Integrals, one often uses the shorthand notation $\Delta x_j = x_j - x_{j-1}$.

Given a partition, $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$, one defines the
norm of the partition P, $\|P\|$, to be the maximum length of a subinterval of the
partition, that is, $\|P\| = \max_{1 \le j \le n} \Delta x_j$. For example, if $[a, b] = [1, 4]$ has the partition [...]

largest distance between any two of the adjacent points in the partition. As seen in

the previous section, one can get increasingly better approximations to the area of a

region by attempting to approximate the region by smaller and smaller polygons.

Thus, by requiring the norm of a partition to be smaller, the rectangles used to

approximate the area of a region bounded by a curve become smaller in width and

can give a better approximation.

For the Riemann Integral, a partition will determine the widths of the rectangles

used to approximate the area of a region. What will be used as the lengths of
those rectangles? Suppose a rectangle rests on the x-axis between $x_{j-1}$ and $x_j$. If
the rectangle is going to fit inside the region bounded by the curve $y = f(x)$, then
the length of the rectangle (its height above the x-axis) cannot exceed
$\inf_{x_{j-1} \le x \le x_j} f(x)$.
If the rectangle is going to enclose the part of the region between $x_{j-1}$ and $x_j$, then the
length of the rectangle must be at least $\sup_{x_{j-1} \le x \le x_j} f(x)$. The definition of the Riemann
Integral uses a value between these two possible extremes. It requires the choice of
a sequence of x values $\xi_1, \xi_2, \xi_3, \ldots, \xi_n$ with $x_{j-1} \le \xi_j \le x_j$ for each j. Then the
rectangle on $[x_{j-1}, x_j]$ is given the length $f(\xi_j)$ so that it has area $f(\xi_j)\Delta x_j$. Clearly,
the choice of $\xi_j \in [x_{j-1}, x_j]$ results in the length of the rectangle being $f(\xi_j)$ which
is between the two extremes $\inf_{x_{j-1} \le x \le x_j} f(x)$ and $\sup_{x_{j-1} \le x \le x_j} f(x)$, so the rectangles that
result might neither be contained in the region bounded by the curve nor cover the
region. Instead, the lengths of the rectangles are allowed to be in between these two
extremes. The total area of all the rectangles is then given by the Riemann Sum
$\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.

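Now, given a function f defined on the interval $[a, b]$, one can define the Riemann
Integral of f on $[a, b]$ to be $I = \int_a^b f(x)\,dx$ if for every $\epsilon > 0$ there is a $\delta > 0$
such that for every partition $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ with $\|P\| < \delta$ and for every
choice of $\xi_1, \xi_2, \xi_3, \ldots, \xi_n$ with $\xi_j \in [x_{j-1}, x_j]$, it
follows that $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \epsilon$. If $\int_a^b f(x)\,dx$ exists, then f is said to be integrable
(or Riemann integrable) on the interval $[a, b]$. The function f in the integral is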

called the integrand. When the integrand f is a nonnegative function, this definition

results in a value for I that can be considered the area of the region bounded by the

x-axis, the lines $x = a$ and $x = b$, and the curve $y = f(x)$. When f is allowed to take

on both positive and negative values, the value of I can be thought of as the area of

the region lying above the x-axis minus the area of the region lying below the x-axis.
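The ingredients of the definition (a partition, its norm, a choice of one tag per subinterval, and the resulting Riemann sum) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `partition_norm`, `riemann_sum`, and `uniform_partition` are hypothetical, the tags play the role of the chosen points in each subinterval, and floating-point arithmetic stands in for the real numbers.

```python
from typing import Callable, Sequence


def partition_norm(points: Sequence[float]) -> float:
    """Length of the longest subinterval of a = x_0 <= x_1 <= ... <= x_n = b."""
    return max(right - left for left, right in zip(points, points[1:]))


def riemann_sum(f: Callable[[float], float],
                points: Sequence[float],
                tags: Sequence[float]) -> float:
    """Sum of f(tag_j) * (x_j - x_{j-1}) with one tag in each subinterval."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one tag per subinterval")
    total = 0.0
    for left, right, tag in zip(points, points[1:], tags):
        if not left <= tag <= right:
            raise ValueError("each tag must lie in its subinterval")
        total += f(tag) * (right - left)
    return total


def uniform_partition(a: float, b: float, n: int) -> list:
    """Partition of [a, b] into n subintervals of equal length (an assumption;
    the definition itself does not require equal lengths)."""
    return [a + (b - a) * j / n for j in range(n + 1)]


if __name__ == "__main__":
    square = lambda x: x * x
    for n in (10, 100, 1000):
        pts = uniform_partition(0.0, 1.0, n)
        left_sum = riemann_sum(square, pts, pts[:-1])   # tags at left endpoints
        right_sum = riemann_sum(square, pts, pts[1:])   # tags at right endpoints
        print(n, partition_norm(pts), left_sum, right_sum)
```

As the norm of the partition shrinks, both choices of tags give sums that close in on the same value, which is the behavior the definition demands of every choice of tags.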

The power of the definition of Riemann Integral is that it need not be associated

with area at all. The student may well be familiar with other applications to the

determination of moments, work, force, speed, distances, interest rates, populations,

and many other examples. It is convenient to extend the definition of Riemann
Integral to $\int_a^b f(x)\,dx$ where $b < a$ with the convention $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$.

Note that the definition of Riemann Integral, similar to the definitions of limit and

derivative, states that the integral of f between the numbers a and b is I if for every
$\epsilon > 0$ there is a $\delta > 0$ such that a particular inequality holds. But unlike previous
kinds of limits, the inequality that must hold for Riemann sums is supposed to be
true for every choice of a partition P and every choice of $\xi_j$'s as long as $\|P\| < \delta$.
Thus, it is not just that a region in the plane is being approximated by a sequence
of rectangles, but that the region must be closely approximated by every possible
[...] $\sum_{j=1}^{n}$ [...]
noting is that the Riemann Integral is not the only way to define integration. Most

of the other definitions give the same value as the Riemann Integral for functions

where the Riemann Integral exists, but some of the other definitions give values to

integrals in situations where the Riemann Integral does not exist. Some examples of

other integration definitions include the Riemann-Stieltjes Integral, the Lebesgue

Integral, the Darboux Integral, and the Daniell Integral.

There are some fairly easy to describe functions that do not have a Riemann
integral. One simple example is the function

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

whose Riemann integral is not defined on any interval $[a, b]$ with $a < b$. To see why this is,
consider any Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$. Because both the rational numbers and the
irrational numbers are dense in the real numbers, in any subinterval of the partition
which has positive length, there are values of $\xi_j$ in the subinterval where $f(\xi_j) = 0$,
and other values of $\xi_j$ in the subinterval where $f(\xi_j) = 1$. Thus, for any partition,
there are choices of the $\xi_j$'s that make the Riemann sum equal to $\sum_{j=1}^{n} 0 \cdot \Delta x_j = 0$ and
other choices that make the Riemann sum equal to $\sum_{j=1}^{n} 1 \cdot \Delta x_j = b - a$. Since $b - a > 0$,
no single value I can be arbitrarily close to both of these sums, so the integral does not exist.

6.5.1 Exercises

1. Let $f(x) = x$. Partition the interval $[1, 3]$ into n subintervals with $1 = x_0$ and
$x_j = x_{j-1} + \frac{2}{n}$ for $j = 1, 2, 3, \ldots, n$.
(a) Find the minimum and maximum possible values for an associated Riemann
sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
(b) Show that as n gets large, the Riemann sum must approach 4.

2. Let $f(x) = x^2$. Partition the interval $[-1, 2]$ into n subintervals with $-1 = x_0$ and
$x_j = x_{j-1} + \frac{3}{n}$ for $j = 1, 2, 3, \ldots, n$.
(a) Find the minimum and maximum possible values for an associated Riemann
sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.
(b) Show that as n gets large, the Riemann sum must approach 3.


There are many theorems about the properties satisfied by Riemann integrals. Some
of the proofs of these theorems merely rely on properties of summations since
the definition of the Riemann Integral is based on the Riemann sum, $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$.

If a, b, and c are constants, then $\int_a^b c\,dx = c(b - a)$.

If f is an integrable function on the interval $[a, b]$ and c is a constant, then $\int_a^b c\,f(x)\,dx = c\int_a^b f(x)\,dx$.

If f and g are functions integrable on the interval $[a, b]$, then $\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

If f and g are functions integrable on the interval $[a, b]$, and $f(x) \le g(x)$ for all
$x \in [a, b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

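To prove that $\int_a^b c\,dx = c(b - a)$, one needs to find a $\delta > 0$ so that if the norm of a
partition is less than $\delta$, then every associated Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within $\epsilon$ of
$c(b - a)$. But in this case $f(\xi_j)$ is always equal to the constant c, so the Riemann sum
is always equal to the desired integral, $c(b - a)$. This makes the proof particularly
easy.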

Note that the first four steps of this proof merely set up the assumptions required

by the definition of the Riemann Integral. That is, one needs to have constants

a and b and function f defined on the interval $[a, b]$. Then one needs to take an

arbitrary $\epsilon > 0$, find an appropriate $\delta > 0$, and consider an arbitrary Riemann

sum which satisfies the needed condition on the norm of the partition. Although

straightforward, these steps are necessary in order to show that the definition of

Riemann Integral is being satisfied.

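PROOF: If a, b, and c are constants, then $\int_a^b c\,dx = c(b - a)$.

Without loss of generality, assume that $a \le b$, and let $f(x) = c$ for all x in $[a, b]$.
Let $\epsilon > 0$ be given.
Let $\delta = 1$, and let $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ be a partition of $[a, b]$ with $\|P\| < 1$.
Then for any choices of $\xi_j \in [x_{j-1}, x_j]$, it follows that
$$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - c(b - a)\right| = \left|\sum_{j=1}^{n} c\,\Delta x_j - c(b - a)\right| = \left|c\sum_{j=1}^{n} \Delta x_j - c(b - a)\right| = 0 < \epsilon.$$
Thus, $\int_a^b c\,dx = c(b - a)$.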

$\int_a^b c\,f(x)\,dx = c\int_a^b f(x)\,dx$.

In the proof of this result you will need to use the fact that $\int_a^b f(x)\,dx = I$ to
say something about the size of $\left|\sum_{j=1}^{n} c\,f(\xi_j)\Delta x_j - cI\right|$. But this expression equals
$|c|\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ suggesting that if you can arrange for $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be
small, then you can arrange for the product $|c|\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be small. You
will need the product to be less than $\epsilon$, which suggests asking for
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c|}$. This is fine except for the embarrassing case where $c = 0$.
One could handle this problem by breaking the proof into two cases: $c = 0$ and
$c \ne 0$. Easier, though, is to simply ask for $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be less than $\frac{\epsilon}{|c| + 1}$.
The use of $|c| + 1$ in the denominator is just a trick that takes care of the case where
$|c|$ is large and the case where $|c|$ is 0 both at the same time. Of course, you can
arrange $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c| + 1}$ because that follows from $\int_a^b f(x)\,dx = I$.

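PROOF: If f is an integrable function on the interval $[a, b]$ and c is a
constant, then $\int_a^b c\,f(x)\,dx = c\int_a^b f(x)\,dx$.

Let f be a function defined on $[a, b]$ such that $\int_a^b f(x)\,dx = I$.
Let $\epsilon > 0$ be given.
From the definition of Riemann Integral, there is a $\delta > 0$ such that if
$P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|P\| < \delta$, then for
every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{|c| + 1}$.
Then $\left|\sum_{j=1}^{n} c\,f(\xi_j)\Delta x_j - cI\right| = |c|\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| \le |c|\,\frac{\epsilon}{|c| + 1} < \epsilon$.
Thus, $\int_a^b c\,f(x)\,dx = cI = c\int_a^b f(x)\,dx$.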
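The role of $|c| + 1$ in the denominator can be checked with exact rational arithmetic. The sketch below is illustrative only; the function names `tolerance_for_f` and `scaled_error_bound` are hypothetical, and the tolerance is the one requested of the Riemann sums for f so that the scaled sums land within $\epsilon$ of $cI$.

```python
from fractions import Fraction


def tolerance_for_f(eps: Fraction, c: Fraction) -> Fraction:
    """Tolerance eps / (|c| + 1) demanded of |sum f(tag) dx - I|."""
    return eps / (abs(c) + 1)


def scaled_error_bound(eps: Fraction, c: Fraction) -> Fraction:
    """Bound |c| * eps / (|c| + 1) on |sum c f(tag) dx - c I|."""
    return abs(c) * tolerance_for_f(eps, c)


if __name__ == "__main__":
    eps = Fraction(1, 100)
    for c in (Fraction(0), Fraction(1, 2), Fraction(-3), Fraction(1000)):
        bound = scaled_error_bound(eps, c)
        # The bound stays strictly below eps, including when c is 0,
        # where dividing by |c| alone would fail.
        print(c, tolerance_for_f(eps, c), bound, bound < eps)
```

Dividing by $|c|$ alone would raise an error for $c = 0$; adding 1 costs nothing for large $|c|$ and removes the need for a separate case.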

The third theorem in this section can be summarized by saying that the integral

of a sum is the sum of the integrals. Its proof is reminiscent of the proof of

the theorem stating that the limit of a sum is the sum of the limits, and of the

theorem stating that the derivative of a sum is the sum of the derivatives. In this

case, you are given that $\int_a^b f(x)\,dx = I$ and $\int_a^b g(x)\,dx = J$ and are then faced with
the distance that the Riemann sum for $f + g$ is from the value of the integral
$I + J$ given by $\left|\sum_{j=1}^{n} (f + g)(\xi_j)\Delta x_j - (I + J)\right|$. This easily breaks into the two
differences $\left(\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right) + \left(\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right)$. The existence of the two
given integrals then lets you choose a value of $\delta > 0$ that will ensure that the
two parts to this sum are both small.

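PROOF: If f and g are functions integrable on the interval $[a, b]$, then
$\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

Let f and g be functions integrable on the interval $[a, b]$
with $\int_a^b f(x)\,dx = I$ and $\int_a^b g(x)\,dx = J$.
Let $\epsilon > 0$ be given.
From the definition of Riemann Integral, there is a $\delta_1 > 0$ such that if
$P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|P\| < \delta_1$, then
for every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{2}$.
Similarly, there is a $\delta_2 > 0$ such that if $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition
with $\|P\| < \delta_2$, then for every choice of $\xi_j \in [x_{j-1}, x_j]$,
$\left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right| < \frac{\epsilon}{2}$.
Let $\delta = \min(\delta_1, \delta_2)$.
Let $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ be a partition of $[a, b]$ with
$\|P\| < \delta$, and let $\xi_j$'s be chosen with $\xi_j \in [x_{j-1}, x_j]$.
Then
$$\left|\sum_{j=1}^{n} (f + g)(\xi_j)\Delta x_j - (I + J)\right| = \left|\left(\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right) + \left(\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right)\right|$$
$$\le \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| + \left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - J\right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
Thus, $\int_a^b (f + g)(x)\,dx = I + J = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.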

The final theorem in this section states that if $f(x) \le g(x)$ for all $x \in [a, b]$, then if
the functions are integrable, $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$. It is sufficient to prove this result
in the special case of a function h with $h(x) \ge 0$ for all $x \in [a, b]$, that is, to show that
$\int_a^b h(x)\,dx \ge 0$. Applying this to $h = g - f$ gives
$\int_a^b (g - f)(x)\,dx = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx = \int_a^b h(x)\,dx \ge 0$, from which the theorem
follows. With the assumption that $h(x) \ge 0$ for all $x \in [a, b]$, it is not hard to
show that $\int_a^b h(x)\,dx \ge 0$, because the value of every associated Riemann sum must
be nonnegative. How do you turn this into a proof? Recall how the proof went
when showing that if $f(x) \ge 0$, then $\lim_{x \to a} f(x)$ cannot be negative. If you assume that
the limit L is negative, then for a suitably small $\epsilon > 0$ it
follows that $|f(x) - L|$ is always greater than $\epsilon$ giving a contradiction. A very similar
argument works here where f is replaced by the Riemann sum.

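PROOF: If f and g are functions integrable on the interval $[a, b]$, and
$f(x) \le g(x)$ for all $x \in [a, b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

Let f and g be functions integrable on the interval $[a, b]$
with $f(x) \le g(x)$ for all $x \in [a, b]$.
Define $h(x) = g(x) - f(x)$ which is greater than or equal to 0 for all $x \in [a, b]$.
Since f and g are integrable, so is h, and $\int_a^b h(x)\,dx = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx$.
Thus, it suffices to prove that $\int_a^b h(x)\,dx \ge 0$.
Assume instead that $\int_a^b h(x)\,dx = I < 0$.
Let $\epsilon = -\frac{I}{2}$. From the definition of Riemann Integral, there is a $\delta > 0$ such that if
$P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with $\|P\| < \delta$, then for
every choice of $\xi_j \in [x_{j-1}, x_j]$, $\left|\sum_{j=1}^{n} h(\xi_j)\Delta x_j - I\right| < \epsilon$.
But every term $h(\xi_j)\Delta x_j$ is nonnegative, so
$\left|\sum_{j=1}^{n} h(\xi_j)\Delta x_j - I\right| = \sum_{j=1}^{n} h(\xi_j)\Delta x_j - I \ge -I > \epsilon$.
This contradicts the assumption that $I < 0$ which completes the proof.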

6.6.1 Exercises

Write proofs for each of the following statements.

1. If functions $f_1, f_2, f_3, \ldots, f_n$ are integrable on interval $[a, b]$, and $c_1, c_2, c_3, \ldots, c_n$
are constants, then
$$\int_a^b \left(c_1 f_1(x) + c_2 f_2(x) + c_3 f_3(x) + \cdots + c_n f_n(x)\right) dx = c_1\int_a^b f_1(x)\,dx + c_2\int_a^b f_2(x)\,dx + c_3\int_a^b f_3(x)\,dx + \cdots + c_n\int_a^b f_n(x)\,dx.$$
(In the language of Linear Algebra, this says that the Riemann integral is a linear operator.)

2. [...] $\int_a^b f(x)\,dx$ [...] $c(b - a)$.

3. If f is a function such that both f and $|f|$ are integrable on $[a, b]$, then
$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$.

It is helpful to have a characterization of those functions which are Riemann

integrable. This section will discuss several theorems which establish some properties of functions that guarantee that they are integrable. Then the following three

sections present a series of results that give a complete characterization of Riemann

integrable functions.

Recall that f is called bounded on $[a, b]$ if there is a number M such that $|f(x)| \le M$

for all $x \in [a, b]$. It is important to note that if f is integrable on an interval, then f

must be bounded on that interval. The way to prove this result is reminiscent of the

way one proves that a function continuous on a closed bounded interval is bounded.

That is, one uses an indirect proof assuming that you have an integrable function

that is not bounded, and from that, you produce a contradiction. Think about what

can be done with a Riemann sum if the function f is not bounded. Given a partition
$P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$, for some choice of the $\xi_j$'s the Riemann
sum is $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$. If f is unbounded on $[a, b]$, then it must be unbounded on at
least one of the subintervals $[x_{j-1}, x_j]$; otherwise, if there is a bound for f on each of
the n subintervals, one merely needs to select the largest of those n bounds to have
a bound for f on the entire interval $[a, b]$. So what happens if f is not bounded on
the kth subinterval $[x_{k-1}, x_k]$? It means that $\xi_k$ could be changed to be some other
value in the subinterval, say $\eta$, to make the term $|f(\eta)|\Delta x_k$ as large as you like.
Thus, you can make the entire Riemann sum as large as you like. So how large do
you want $|f(\eta)|\Delta x_k$ to be? You want $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right|$ to be larger than $\epsilon$ for some
preassigned $\epsilon > 0$ such as $\epsilon = 1$. The proof below does this by selecting a value
$\eta$ to replace $\xi_k$ in such a way that the kth term of the Riemann sum, $f(\eta)\Delta x_k$,
is larger in absolute value by at least 1 than the sum of the absolute values of all the other terms of
$\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I$ guaranteeing that the resulting expression will be bigger than 1.

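PROOF: If f is integrable on the interval $[a, b]$, then f is
bounded on $[a, b]$.

Let f be an integrable function on the interval $[a, b]$.
Assume that f is not bounded on $[a, b]$.
Let $\epsilon = 1$, and $\int_a^b f(x)\,dx = I$.
Then from the definition of Riemann integral, there is a $\delta > 0$ such that if
$P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ is a partition with
$\|P\| < \delta$ and $\xi_j$'s are chosen with $\xi_j \in [x_{j-1}, x_j]$, the Riemann sum satisfies
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < 1$.
Let a particular partition P with $\|P\| < \delta$ and choices for $\xi_j \in [x_{j-1}, x_j]$ be
given.
Because f is not bounded on $[a, b]$, it follows that there is a k between 1
and n such that f is not bounded on the interval $[x_{k-1}, x_k]$. Otherwise, f
is bounded on each of the subintervals of the partition implying that it is
bounded on the entire interval $[a, b]$. Note that $\Delta x_k > 0$ because a function
cannot be unbounded on an interval of length 0.
Let $J = \sum_{j=1}^{n} |f(\xi_j)|\Delta x_j - |f(\xi_k)|\Delta x_k + |I|$.
Since f is not bounded on $[x_{k-1}, x_k]$, there is an $\eta \in [x_{k-1}, x_k]$ with
$|f(\eta)| > \frac{J + 1}{\Delta x_k}$.
Then the Riemann sum resulting from the partition P with the choices of $\xi_j$,
except that $\xi_k$ is replaced by $\eta$, satisfies
$$\left|\sum_{j \ne k} f(\xi_j)\Delta x_j + f(\eta)\Delta x_k - I\right| \ge |f(\eta)|\Delta x_k - \left(\sum_{j \ne k} |f(\xi_j)|\Delta x_j + |I|\right) > \frac{J + 1}{\Delta x_k}\,\Delta x_k - J = 1.$$
This contradicts the fact that every such Riemann sum is within 1 of I.
This completes the proof.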
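The mechanism of this proof can be watched numerically. The sketch below uses a hypothetical unbounded function on $[0, 1]$, namely $f(x) = 1/x$ for $x > 0$ with $f(0) = 0$ (this example is an assumption, not taken from the text), and a uniform partition. Only the tag in the first subinterval is moved; the helper names are illustrative.

```python
def f(x: float) -> float:
    """Hypothetical unbounded function on [0, 1]: 1/x for x > 0, and 0 at x = 0."""
    return 1.0 / x if x > 0 else 0.0


def riemann_sum_with_first_tag(n: int, eta: float) -> float:
    """Riemann sum on a uniform partition of [0, 1] into n pieces.

    The first subinterval [0, 1/n] is tagged with eta; every other
    subinterval is tagged with its right endpoint.
    """
    width = 1.0 / n
    if not 0.0 <= eta <= width:
        raise ValueError("eta must lie in the first subinterval")
    other_terms = sum(f(j * width) * width for j in range(2, n + 1))
    return f(eta) * width + other_terms


def tag_forcing_sum_above(n: int, target: float) -> float:
    """A tag eta in (0, 1/n] making f(eta) * (1/n) roughly target + 1."""
    return (1.0 / n) / (target + 1.0)


if __name__ == "__main__":
    n = 1000  # the norm of the partition is 1/1000
    for target in (10.0, 1e3, 1e6):
        eta = tag_forcing_sum_above(n, target)
        print(target, eta, riemann_sum_with_first_tag(n, eta))
```

No matter how small the norm of the partition is, one tag can be moved to push the sum past any target, so the sums cannot all stay within 1 of a single number I.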

Knowing that integrable functions must be bounded is very helpful. If you can
claim that $|f(x)| \le M$ for some constant M, then you know that any one term of a
Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ can contribute at most $M\Delta x_j$ to the sum. By forcing
the norm of the partition, $\|P\|$, to be very small, you can control the maximum
size of $\Delta x_j$ and, thus, the maximum size of a term in the Riemann sum. This is the
key idea behind the proof of the next theorem which states that if f is integrable
on $[a, b]$ and on $[b, c]$, then $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$. To prove this, it is
natural to consider finding a $\delta_1 > 0$ so that Riemann sums arising from partitions of
$[a, b]$ with norm less than $\delta_1$ are close to $I = \int_a^b f(x)\,dx$ and finding a $\delta_2 > 0$ so that
Riemann sums arising from partitions of $[b, c]$ with norm less than $\delta_2$ are close to
$J = \int_b^c f(x)\,dx$. You would consider allowing $\delta$ to equal the minimum of $\delta_1$ and $\delta_2$.

Then you could take a partition of $[a, c]$ with a norm less than $\delta$. Unfortunately, this
partition of $[a, c]$ does not separate into a partition of $[a, b]$ and a partition of $[b, c]$
because there is no guarantee that the given partition of $[a, c]$ includes the point b as
one of the $x_j$ values in the partition. But if you change the Riemann sum by altering
the interval of the partition containing the point b by adding b as an extra point to
the partition, you are not making a large change in the total sum. More precisely,
suppose the partition is $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = c$ with the point b in
the interval $[x_{k-1}, x_k]$. A resulting Riemann sum has the term $f(\xi_k)(x_k - x_{k-1})$. If this
term is replaced by two terms $f(b)(b - x_{k-1}) + f(b)(x_k - b)$, how much does this
change the Riemann sum? The change is exactly $f(b)(b - x_{k-1}) + f(b)(x_k - b) - f(\xi_k)(x_k - x_{k-1}) = (f(b) - f(\xi_k))(x_k - x_{k-1})$. Given that f is integrable on $[a, b]$ and
on $[b, c]$, you know that there is a bound M such that $|f(x)| \le M$ for all $x \in [a, c]$.
An upper bound for the size of this change is, therefore, $2M(x_k - x_{k-1}) < 2M\delta$. This
says that by choosing $\delta$ small enough, you can control the amount of change made
in the Riemann sum by introducing b as a point in the partition of $[a, c]$. If $\delta$ is also
chosen small enough so that the resulting Riemann sums for $\int_a^b f(x)\,dx$ and $\int_b^c f(x)\,dx$
are close to the corresponding integral, then the total difference between the original
Riemann sum and the sum of the integrals $\int_a^b f(x)\,dx + \int_b^c f(x)\,dx$ is small enough.
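The effect of adding b to a partition can be sketched directly. In the illustration below the names `riemann_sum` and `insert_point` are hypothetical, the tag list holds one chosen point per subinterval, and the sample function `math.cos` (bounded by $M = 1$) is an assumption made only for the demonstration.

```python
import bisect
import math


def riemann_sum(f, points, tags):
    """Sum of f(tag) * (subinterval length) over the partition."""
    return sum(f(t) * (r - l) for l, r, t in zip(points, points[1:], tags))


def insert_point(points, tags, b):
    """Add b to the partition and tag both new subintervals with b.

    Returns the refined points, the refined tags, and the index k of the
    original subinterval [x_{k-1}, x_k] that contains b.
    """
    k = max(bisect.bisect_left(points, b), 1)
    new_points = points[:k] + [b] + points[k:]
    new_tags = tags[:k - 1] + [b, b] + tags[k:]
    return new_points, new_tags, k


if __name__ == "__main__":
    f, M = math.cos, 1.0                      # |cos x| <= 1 everywhere
    points = [0.0, 0.4, 0.9, 1.7, 2.2, 3.0]   # a partition of [0, 3]
    tags = [0.1, 0.5, 1.6, 2.0, 2.9]          # one tag per subinterval
    b = 1.0
    new_points, new_tags, k = insert_point(points, tags, b)
    change = riemann_sum(f, new_points, new_tags) - riemann_sum(f, points, tags)
    width = points[k] - points[k - 1]
    # The change equals (f(b) - f(tag_k)) * width and is at most 2 * M * width.
    print(change, (f(b) - f(tags[k - 1])) * width, 2 * M * width)
```

Shrinking the norm of the partition shrinks the width of the subinterval containing b, and with it the largest possible change.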

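PROOF: If f is a function integrable on the interval $[a, b]$ and on the
interval $[b, c]$, then $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$.

Without loss of generality assume that $a < b < c$, and let f be a function
integrable on the interval $[a, b]$ and on the interval $[b, c]$ with $I = \int_a^b f(x)\,dx$
and $J = \int_b^c f(x)\,dx$.
Because f is integrable on $[a, b]$, $|f|$ is bounded on that interval by some
value $M_1$. Because f is integrable on $[b, c]$, $|f|$ is bounded on that interval
by some value $M_2$. It follows that $|f|$ is bounded on the interval $[a, c]$ by
$M = \max(M_1, M_2)$.
Let $\epsilon > 0$ be given.
From the definition of Riemann integration, there is a $\delta_1 > 0$ such that for
every partition P of $[a, b]$ with $\|P\| < \delta_1$ and every choice of $\xi_j \in [x_{j-1}, x_j]$
on the intervals of the partition, the associated Riemann sum will be within
$\frac{\epsilon}{3}$ of the integral I.
Similarly, there is a $\delta_2 > 0$ such that for every partition P of $[b, c]$ with
$\|P\| < \delta_2$ and every choice of $\xi_j \in [x_{j-1}, x_j]$, the associated Riemann sum
will be within $\frac{\epsilon}{3}$ of the integral J.
Let $\delta = \min\left(\delta_1, \delta_2, \frac{\epsilon}{6M + 1}\right)$.
Let $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = c$ be a partition of $[a, c]$ with
$\|P\| < \delta$.
Let $\xi_j$'s be chosen such that $\xi_j \in [x_{j-1}, x_j]$.
Since $b \in [a, c]$, there is a k such that $b \in [x_{k-1}, x_k]$.
Then
$$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - (I + J)\right| = \left|\left(\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1})\right) + \left(f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j\right) + (f(\xi_k) - f(b))\Delta x_k - (I + J)\right|$$
$$\le \left|\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1}) - I\right| + \left|f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j - J\right| + |f(\xi_k) - f(b)|\Delta x_k.$$
Since the partition $a = x_0 \le x_1 \le x_2 \le \cdots \le x_{k-1} \le b$
is a partition of $[a, b]$ with norm less than $\delta_1$, it follows that
$\left|\sum_{j=1}^{k-1} f(\xi_j)\Delta x_j + f(b)(b - x_{k-1}) - I\right| < \frac{\epsilon}{3}$.
Similarly, since $b \le x_k \le x_{k+1} \le \cdots \le x_n = c$ is a partition of $[b, c]$ with
norm less than $\delta_2$, it follows that
$\left|f(b)(x_k - b) + \sum_{j=k+1}^{n} f(\xi_j)\Delta x_j - J\right| < \frac{\epsilon}{3}$.
Also, $|f(\xi_k) - f(b)|\Delta x_k \le 2M\Delta x_k \le 2M\,\frac{\epsilon}{6M + 1} < \frac{\epsilon}{3}$.
Therefore, $\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - (I + J)\right| < \epsilon$.
This proves that $\int_a^c f(x)\,dx = I + J$ and completes the proof of the theorem.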

Note that you can easily show that this theorem also holds if a > b or b > c by

simply rearranging the order of the limits on one or more of the integrals.

The previous section discusses the theorem stating that if integrable functions
satisfy $f \le g$ on $[a, b]$, then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$. Can this statement be made
stronger? That is, if $f(x) \le g(x)$ for $x \in [a, b]$, with $f(x) < g(x)$ for some $x \in [a, b]$,
can you conclude that $\int_a^b f(x)\,dx < \int_a^b g(x)\,dx$? The answer is no. For example, if
f and g only differ for a finite number of x values, then f and g will have identical
integrals. To prove this, start with two integrable functions, f and g, that are identical
for all $x \in [a, b]$ except for some $t \in [a, b]$. How would you prove that f and g have
identical integrals? Again, you should consider the Riemann sums associated with f
and g, that is, consider a Riemann sum $\sum_{j=1}^{n} g(\xi_j)\Delta x_j$ for g with a particular partition
and choice of $\xi_j$'s, together with the corresponding Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ for f. Since
$f(x) = g(x)$ at all points except $x = t$, how many of the corresponding terms in
these two Riemann sums could be different? Well, only those terms for which the
chosen $\xi_j = t$ and $\Delta x_j \ne 0$. This could happen at most twice (twice in the unusual
case of $t = x_j = \xi_j = \xi_{j+1}$). Thus, the Riemann sum for g is identical to the
Riemann sum for f plus at most two terms. By controlling the size of $\Delta x_j$, which
you can do by limiting the norm of the partition, you can control the contribution of
those at most two terms in the Riemann sum, thus ensuring that the sums for f and
g are close. That is the idea behind the following proof.

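PROOF: Suppose that f and g are functions integrable on the interval
$[a, b]$, and that $f(x) = g(x)$ for all $x \in [a, b]$ except perhaps at $t \in [a, b]$.
Then $\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.

Let f and g be functions integrable on the interval $[a, b]$, and suppose that
$f(x) = g(x)$ for all $x \in [a, b]$ except perhaps at $t \in [a, b]$.
Let $\int_a^b f(x)\,dx = I$.
Because f and g are integrable on $[a, b]$, they are bounded there, so there is an
$M > 0$ such that $|f(x)| \le M$ and $|g(x)| \le M$ for all $x \in [a, b]$.
Let $\epsilon > 0$ be given.
From the definition of Riemann Integration, there is a $\delta_1 > 0$ such that for
every partition $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ with norm less than
$\delta_1$, and every choice of $\xi_j \in [x_{j-1}, x_j]$, the associated Riemann sum satisfies
$\left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| < \frac{\epsilon}{2}$.
Let $\delta_2 = \frac{\epsilon}{8M}$, and set $\delta = \min(\delta_1, \delta_2)$.
Select any partition $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$ with
norm less than $\delta$, and select any sequence of $\xi_j \in [x_{j-1}, x_j]$. Then the
Riemann sum for g satisfies
$$\left|\sum_{j=1}^{n} g(\xi_j)\Delta x_j - I\right| \le \left|\sum_{j=1}^{n} f(\xi_j)\Delta x_j - I\right| + |g(t) - f(t)| \cdot 2\delta < \frac{\epsilon}{2} + 2M \cdot 2 \cdot \frac{\epsilon}{8M} = \epsilon.$$
Thus, $\int_a^b g(x)\,dx = I = \int_a^b f(x)\,dx$ which proves the theorem.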

It is left as an exercise to extend this theorem to the case where f and g differ at a

finite number of points. In fact, this can be extended to f and g which differ on an

infinite sequence of points in $[a, b]$ as long as the sequence has a limit.


6.7.1 Exercises

Write proofs for each of the following statements.

1. If f and g are functions integrable on the interval $[a, b]$, and $f(x) = g(x)$ for all
$x \in [a, b]$ except perhaps at the finite set of points $\{t_1, t_2, t_3, \ldots, t_k\} \subset [a, b]$, then
$\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.

2. If f and g are functions integrable on the interval $[a, b]$, and $f(x) = g(x)$ for all
$x \in [a, b]$ except perhaps on a sequence of points $\{t_1, t_2, t_3, \ldots\} \subset [a, b]$ where
$\lim_{j \to \infty} t_j = L$, then $\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$.

3. [...] $f(x) = \frac{1}{2^n}$ for all x with $\frac{1}{2^n} < x \le \frac{1}{2^{n-1}}$, then $\int_0^1 f(x)\,dx$ exists and is equal to $\frac{1}{3}$.

4. [...] n, $f(x) = \left(\frac{3}{2}\right)^n$ for all x with $\frac{1}{2^n} < x \le \frac{1}{2^{n-1}}$, then $\int_0^1 f(x)\,dx$ does not exist but
$\lim_{r \to 0^+} \int_r^1 f(x)\,dx = 3$.

5. [...] $\int_a^x f(t)\,dt$ [...] is continuous on $[a, b]$.

Step functions play an important role in the theory of Riemann integration. A
step function s on the interval $[a, b]$ is associated with a partition $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$
of $[a, b]$ and has the property that s is constant on each interval
of the partition, $(x_{j-1}, x_j)$. For example, the function

$$s(x) = \begin{cases} 3 & 0 \le x < 2 \\ 1 & x = 2 \\ 4 & 2 < x < 4 \\ 0 & x = 4 \\ 1 & 4 < x \le 5 \end{cases}$$

is a step function defined on the interval $[0, 5]$ (Fig. 6.7). It follows easily that a step
function on an interval $[a, b]$ is integrable there. Indeed, suppose that $P: a = x_0 \le x_1 \le x_2 \le \cdots \le x_n = b$,
and $s(x) = c_j$ for all x satisfying $x_{j-1} < x < x_j$. Clearly,
the constant function $c_j$ is integrable on the interval $[x_{j-1}, x_j]$, and the function $s(x)$
differs from this constant function at at most the two endpoints, $x_{j-1}$ and $x_j$. Thus,
$\int_{x_{j-1}}^{x_j} s(x)\,dx = c_j\Delta x_j$, and
$$\int_a^b s(x)\,dx = \sum_{j=1}^{n} \int_{x_{j-1}}^{x_j} s(x)\,dx = \sum_{j=1}^{n} c_j\Delta x_j.$$

Fig. 6.7 [...] function $s(x)$
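The formula for the integral of a step function is easy to sketch in code. The example below uses a step function on $[0, 5]$ with values 3, 4, and 1 on the open intervals $(0, 2)$, $(2, 4)$, and $(4, 5)$; the values at the partition points, and the function names `step_integral` and `make_step_function`, are assumptions chosen for the illustration.

```python
def step_integral(points, values):
    """Sum of c_j * (x_j - x_{j-1}), where values[j-1] = c_j on (x_{j-1}, x_j)."""
    return sum(c * (r - l) for l, r, c in zip(points, points[1:], values))


def make_step_function(points, values, endpoint_values):
    """Build s with s = values[j-1] on (x_{j-1}, x_j) and s(x_j) = endpoint_values[j]."""
    def s(x):
        for p, e in zip(points, endpoint_values):
            if x == p:
                return e
        for l, r, c in zip(points, points[1:], values):
            if l < x < r:
                return c
        raise ValueError("x lies outside [a, b]")
    return s


def left_riemann_sum(f, a, b, n):
    """Riemann sum on a uniform partition with left-endpoint tags."""
    width = (b - a) / n
    return sum(f(a + (b - a) * j / n) * width for j in range(n))


if __name__ == "__main__":
    points = [0, 2, 4, 5]
    values = [3, 4, 1]                       # values on the open intervals
    s = make_step_function(points, values, [3, 1, 0, 1])
    print(step_integral(points, values))     # 3*2 + 4*2 + 1*1 = 15
    print(left_riemann_sum(s, 0.0, 5.0, 1000))
```

The values of s at the partition points affect at most a few terms of any Riemann sum, so they change the sum by an amount that shrinks with the norm of the partition and do not change the integral.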

The importance of step functions comes from the fact that a function f is

integrable on $[a, b]$ if and only if f can be closely approximated by step functions.
Precisely, f has a Riemann integral on the interval $[a, b]$ if and only if for every
$\epsilon > 0$, there exist step functions $u(x)$ and $v(x)$ on $[a, b]$ with the property that for
all $x \in [a, b]$, $v(x) \le f(x) \le u(x)$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \epsilon$. That is, f has
an integral precisely when for every $\epsilon > 0$ there is a lower step function v that is
always less than or equal to f and an upper step function u that is always greater
than or equal to f with the property that the integrals of v and u are within $\epsilon$ of each

other. This squeezes f between two step functions whose integrals are as close as

you want. This should remind you of the Exhaustion Area Axiom.

The statement of this theorem is a biconditional statement; that is, it is an if and

only if statement. This means that the proof will have two distinct parts. One proof

must show that if a function is integrable, then it can be approximated by very close

upper and lower step functions. The other proof must show that if a function can

be approximated by very close upper and lower step functions, then it is integrable.

Consider how you would approach the proofs of each of these statements.

For the first part of the proof, you would consider a function, f , integrable on an

interval $[a, b]$. Given an $\epsilon > 0$, somehow you need to show that there are upper and
lower step functions, u and v, whose integrals are within $\epsilon$ of each other. Where do
you start? All you know about f is that it has a Riemann integral on $[a, b]$, thus, all
you have to go on is the definition of Riemann integration which makes a statement
about the properties of Riemann sums. The key observation here is that a Riemann
sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is equal to the integral of a step function defined to be equal to the
constant $f(\xi_j)$ on the interval $(x_{j-1}, x_j)$. Since the definition of the integral guarantees
that you can find Riemann sums that are very close to the value of the integral, this
suggests how you might choose step functions whose integrals are close to each
other. How can you assure that you choose a step function that is less than $f(x)$
for each $x \in [a, b]$? For each interval of the partition $(x_{j-1}, x_j)$ you could consider
selecting $\xi_j$ so that $f(\xi_j)$ is the minimum value of f on that interval. Unfortunately,
f might not achieve a minimum value on that interval. Certainly, if f is continuous
on $[x_{j-1}, x_j]$, then it attains its minimum on that interval, but there is nothing here
indicating that f is continuous. On the other hand, you do know that, because f is
integrable, it is bounded. Thus, there is a greatest lower bound $M_j = \inf_{x \in (x_{j-1}, x_j)} f(x)$.
There may not be any $x \in (x_{j-1}, x_j)$ with the property that $f(x) = M_j$, but you know
that there are values of x in the interval such that $f(x)$ is as close as you like to $M_j$.

Getting specific, now, your goal is to find upper and lower step functions whose
integrals are within some given $\epsilon > 0$ of each other. It makes sense, therefore, to
find upper and lower step functions whose integrals are both within $\frac{\epsilon}{2}$ of the value of
the integral of f because then the two step functions will be within $\epsilon$ of each other.
From the definition of Riemann integral, you can find a partition of $[a, b]$ such that
all associated Riemann sums are within $\frac{\epsilon}{4}$ of the integral of f. Then you can define
a lower step function, $v(x)$, that is equal to the infimum of f on each interval of the
chosen partition. On each interval of the partition you can find $\xi_j$ values so that $f(\xi_j)$
is within $\frac{\epsilon}{4(b - a)}$ of $v(\xi_j)$. Then the integral of the lower step function will be within
$\frac{\epsilon}{4(b - a)}(b - a) = \frac{\epsilon}{4}$ of a Riemann sum for f which in turn is within $\frac{\epsilon}{4}$ of the integral
of f. This produces a lower step function with the properties you want. A similar
construction will produce an upper step function whose integral is also within $\frac{\epsilon}{2}$ of
the integral of f, and that will complete the first part of the proof (Fig. 6.8).

Fig. 6.8 Choosing $\xi_j$ on $(x_{j-1}, x_j)$

For the second part of the proof, you consider a function, f, such that for each
$\epsilon > 0$ you can find a lower step function, $v(x)$, and an upper step function, $u(x)$,
whose integrals are within $\epsilon$ of each other. You must then show that f has an integral.
The first task is to figure out what value I will serve as the integral of f. Your proof
will need to show that Riemann sums for f approach this value of I, so you first
need a target I for that purpose. To do this, consider the collection of all possible
lower step functions, $v(x)$. That is, let $L = \{v \mid v \text{ is a step function with } v(x) \le f(x) \text{ for all } x \in [a, b]\}$.
Each $v \in L$ has an integral, and each integral should be less
than or equal to the needed value of I. How about taking the least upper bound of
all of those integrals? Does the least upper bound exist? It does if the collection
of integrals of elements of L is bounded above. To get that, all you need is one
upper step function u. For each $v \in L$ and for each $x \in [a, b]$, you know that
$v(x) \le f(x) \le u(x)$. This ensures that for each $v \in L$, $\int_a^b v(x)\,dx \le \int_a^b u(x)\,dx$ showing
that the set of integrals of elements in L is bounded above. That allows you to set
$I = \sup_{v \in L} \int_a^b v(x)\,dx$. This makes sense because I would then be greater than or equal
to the integral of any lower step function. It would also have to be less than or equal
to the integral of any upper step function. Since the assumption is that the integrals
of lower step functions and upper step functions can be found arbitrarily close to
each other, and each integral of an upper step function must be greater than or equal
to any integral of a lower step function, you would expect that the least upper bound
of the lower step function integrals would be equal to the greatest lower bound of
the upper step function integrals, and this value is what you will choose for I.

After determining I, your proof can proceed naturally. You need to show that by

restricting the norm of a partition of $[a, b]$, you can force an associated Riemann
sum for f to be close to I. What you have at your disposal is the ability to find
upper and lower step functions whose integrals are close to each other. A helpful
observation is that if you have a lower step function v and an upper step function u,
then for any partition and choice of $\xi_j$ in the intervals of the partition, you know that
$\sum_{j=1}^{n} v(\xi_j)\Delta x_j \le \sum_{j=1}^{n} f(\xi_j)\Delta x_j \le \sum_{j=1}^{n} u(\xi_j)\Delta x_j$. So you can choose upper and lower step
functions, u and v, whose integrals are each within, say $\frac{\epsilon}{2}$, of I. Then you can choose
a norm of a partition so that any Riemann sum for v is within $\frac{\epsilon}{2}$ of the integral of
v, and any Riemann sum for u is within $\frac{\epsilon}{2}$ of the integral of u. That will force the
corresponding Riemann sum for f to be within $\epsilon$ of I completing the proof.
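For a concrete feel of the squeeze, consider the special case of an increasing function, where the infimum and supremum of f on each subinterval are bounded by the values at its left and right endpoints. The sketch below is an illustration under that assumption only (the theorem itself covers far more general functions); the names `step_bounds_for_increasing` and `subintervals_needed` are hypothetical.

```python
import math


def step_bounds_for_increasing(f, a, b, n):
    """Integrals of a lower and an upper step function for an increasing f.

    On each open subinterval of a uniform partition, the lower step function
    takes the value of f at the left endpoint and the upper step function
    takes the value of f at the right endpoint.
    """
    width = (b - a) / n
    points = [a + (b - a) * j / n for j in range(n + 1)]
    lower = sum(f(left) * width for left in points[:-1])
    upper = sum(f(right) * width for right in points[1:])
    return lower, upper


def subintervals_needed(f, a, b, eps):
    """An n with (f(b) - f(a)) * (b - a) / n < eps, valid for increasing f."""
    return math.floor((f(b) - f(a)) * (b - a) / eps) + 1


if __name__ == "__main__":
    square = lambda x: x * x          # increasing on [0, 1]
    for eps in (0.1, 0.01, 0.001):
        n = subintervals_needed(square, 0.0, 1.0, eps)
        lower, upper = step_bounds_for_increasing(square, 0.0, 1.0, n)
        print(eps, n, lower, upper, upper - lower)
```

The gap between the two integrals telescopes to $(f(b) - f(a))(b - a)/n$, so it can be driven below any $\epsilon$, and every Riemann sum for f on that partition is trapped between the two values.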

PROOF: The function $f$ is integrable on the interval $[a,b]$ if and only if for every $\epsilon > 0$ there are step functions, $u$ and $v$, such that for each $x \in [a,b]$, $v(x) \le f(x) \le u(x)$ and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \epsilon$.

Without loss of generality, assume that $a < b$, for if $a = b$, the result follows trivially.


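Let $\epsilon > 0$ be given, and assume that $f$ is an integrable function with $\int_a^b f(x)\,dx = I$.

By the definition of Riemann integration, there is a $\delta > 0$ such that for any partition of $[a,b]$ with norm less than $\delta$ and any choice of $\xi_j$'s in the intervals of the partition, the associated Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within $\frac{\epsilon}{4}$ of $I$. Fix one such partition $a = x_0 < x_1 < \cdots < x_n = b$.

Note that since $f$ is integrable, $f$ is a bounded function on the interval $[a,b]$.

Because $f$ is bounded, for each $j = 1, 2, 3, \dots, n$, the value of $\inf_{x_{j-1}<x<x_j} f(x)$ exists. Therefore, there exists $\xi_j \in (x_{j-1}, x_j)$ such that $f(\xi_j) < \inf_{x_{j-1}<x<x_j} f(x) + \frac{\epsilon}{4(b-a)}$.

For each $j$, define $v(x_j) = f(x_j)$ and for $x \in (x_{j-1}, x_j)$, define $v(x) = \inf_{x_{j-1}<x<x_j} f(x) > f(\xi_j) - \frac{\epsilon}{4(b-a)}$.

Then $v$ is a step function with the property that $v(x) \le f(x)$ for all $x \in [a,b]$, and $\int_a^b v(x)\,dx > \sum_{j=1}^{n} \left( f(\xi_j) - \frac{\epsilon}{4(b-a)} \right) \Delta x_j = \sum_{j=1}^{n} f(\xi_j)\Delta x_j - \frac{\epsilon}{4}$. Since the Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within $\frac{\epsilon}{4}$ of $I$, the integral $\int_a^b v(x)\,dx$ is greater than $I - \frac{\epsilon}{2}$.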

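Similarly, one can define an upper step function $u$ in the same way that $v$ was defined except that, in this case, the $\xi_j$ values are chosen to satisfy $f(\xi_j) > \sup_{x_{j-1}<x<x_j} f(x) - \frac{\epsilon}{4(b-a)}$, and for $x \in (x_{j-1}, x_j)$ the function $u(x)$ is defined to be $\sup_{x_{j-1}<x<x_j} f(x) < f(\xi_j) + \frac{\epsilon}{4(b-a)}$.

Then $u$ is a step function with the property that $f(x) \le u(x)$ for all $x \in [a,b]$, and $\int_a^b u(x)\,dx < \sum_{j=1}^{n} \left( f(\xi_j) + \frac{\epsilon}{4(b-a)} \right) \Delta x_j = \sum_{j=1}^{n} f(\xi_j)\Delta x_j + \frac{\epsilon}{4}$. Since the Riemann sum $\sum_{j=1}^{n} f(\xi_j)\Delta x_j$ is within $\frac{\epsilon}{4}$ of $I$, the integral $\int_a^b u(x)\,dx$ is less than $I + \frac{\epsilon}{2}$.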

It follows that $u$ and $v$ are upper and lower step functions for $f$ and have the property that $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \left(I + \frac{\epsilon}{2}\right) - \left(I - \frac{\epsilon}{2}\right) = \epsilon$.

PART II: Close upper and lower step functions implies integrability

Assume that for every $\epsilon > 0$ there exist step functions $u$ and $v$ satisfying $v(x) \le f(x) \le u(x)$ for every $x \in [a,b]$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \epsilon$.


Let $u^*$ be any upper step function for $f$. Every lower step function, $v$, satisfies $v(x) \le f(x) \le u^*(x)$ for every $x \in [a,b]$, implying that $\int_a^b v(x)\,dx \le \int_a^b u^*(x)\,dx$ and that the set $\left\{ \int_a^b v(x)\,dx \mid v \text{ is a lower step function of } f \right\}$ is bounded above by $\int_a^b u^*(x)\,dx$.

This set is nonempty and bounded above, so let $I = \sup \left\{ \int_a^b v(x)\,dx \mid v \text{ is a lower step function of } f \right\}$.

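Let $\epsilon > 0$ be given. By assumption there are step functions $u$ and $v$ satisfying $v(x) \le f(x) \le u(x)$ for every $x \in [a,b]$, and $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx < \frac{\epsilon}{2}$.

Since the integral of any upper step function is an upper bound for the set of all integrals of lower step functions, it follows that $I \le \int_a^b u(x)\,dx < \int_a^b v(x)\,dx + \frac{\epsilon}{2} \le I + \frac{\epsilon}{2}$.

Also $\int_a^b v(x)\,dx > \int_a^b u(x)\,dx - \frac{\epsilon}{2} \ge I - \frac{\epsilon}{2}$.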

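Since step functions are integrable, there is a $\delta_1 > 0$ such that for every partition of $[a,b]$ with norm less than $\delta_1$ and every choice of $\xi_j$'s in the intervals of the partition, the Riemann sum $\sum_{j=1}^{n} u(\xi_j)\Delta x_j$ is within $\frac{\epsilon}{2}$ of $\int_a^b u(x)\,dx$.

Similarly, there is a $\delta_2 > 0$ such that for every partition of $[a,b]$ with norm less than $\delta_2$ and every choice of $\xi_j$'s in the intervals of the partition, the Riemann sum $\sum_{j=1}^{n} v(\xi_j)\Delta x_j$ is within $\frac{\epsilon}{2}$ of $\int_a^b v(x)\,dx$.

Let $\delta = \min(\delta_1, \delta_2)$, and let $P = \{x_0, x_1, \dots, x_n\}$ be any partition of $[a,b]$ with $\|P\| < \delta$.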

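For each $j$, let $\xi_j$ be chosen in the interval $[x_{j-1}, x_j]$.

Then it follows that $I - \epsilon = I - \frac{\epsilon}{2} - \frac{\epsilon}{2} < \int_a^b v(x)\,dx - \frac{\epsilon}{2} < \sum_{j=1}^{n} v(\xi_j)\Delta x_j \le \sum_{j=1}^{n} f(\xi_j)\Delta x_j \le \sum_{j=1}^{n} u(\xi_j)\Delta x_j < \int_a^b u(x)\,dx + \frac{\epsilon}{2} < I + \frac{\epsilon}{2} + \frac{\epsilon}{2} = I + \epsilon$.

Thus, $\left| \sum_{j=1}^{n} f(\xi_j)\Delta x_j - I \right| < \epsilon$, which shows that $\int_a^b f(x)\,dx = I$ and completes the proof of PART II.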


the easiest characterization to use when faced with determining whether or not

a function is integrable. To use this criterion to determine if a given function $f$ is integrable, one needs to show that the function admits upper and lower step functions whose integrals are within $\epsilon$ of each other. This is not the easiest criterion

to apply. The next two sections will develop other criteria for integrability, but the

results will be based closely on this theorem about step functions.

6.8.1 Exercises

Write proofs for each of the following statements.

1. If $s(x)$ and $t(x)$ are both step functions on the interval $[a,b]$, then so are

(a) $s(x) + t(x)$.

(b) $s(x)t(x)$.

(c) $\max\left(s(x), t(x)\right)$.

(d) $s^2(x) + t^2(x)$.

2. If $f$ and $g$ are integrable functions on the interval $[a,b]$, then so is $\max(f, g)$.

3. If $f$ is an integrable function on the interval $[a,b]$, then so is $|f|$.

The previous theorem about step functions gives a straightforward way to prove that all continuous functions are integrable. Such a proof would take an arbitrary function $f$ that is continuous on the interval $[a,b]$ for some $a < b$ and an arbitrary $\epsilon > 0$, and show that $f$ has upper and lower step functions whose integrals are within $\epsilon$ of each other. What is it about such a continuous function, $f$, that allows the construction of these upper and lower step functions? The important result about continuous functions that comes into play here is that if $f$ is continuous on $[a,b]$, then it is uniformly continuous there. This has the consequence that there is a $\delta > 0$ such that if $x$ and $y$ are in $[a,b]$ with $|x - y| < \delta$, then $|f(x) - f(y)| < \frac{\epsilon}{2(b-a)}$. This means that if $x_{j-1} < x_j$ with $x_j - x_{j-1} < \delta$, then $\sup_{x \in (x_{j-1}, x_j)} f(x) - \inf_{x \in (x_{j-1}, x_j)} f(x) \le \frac{\epsilon}{2(b-a)}$. Defining upper and lower step functions to be equal to this supremum and infimum, respectively, on $(x_{j-1}, x_j)$ gives the step functions with the needed property.


PROOF: A function continuous on the closed interval $[a,b]$ is integrable there.

Let $f$ be a function continuous on the interval $[a,b]$.

Without loss of generality, assume that $a < b$.

Let $\epsilon > 0$ be given.

Because $f$ is continuous on $[a,b]$, it is uniformly continuous there.

Thus, there is a $\delta > 0$ such that $|f(x) - f(y)| < \frac{\epsilon}{2(b-a)}$ holds for every $x$ and $y$ in $[a,b]$ with $|x - y| < \delta$.

Let $n$ be a positive integer with $\frac{b-a}{n} < \delta$.

For each $j = 0, 1, 2, 3, \dots, n$ let $x_j = a + j\,\frac{b-a}{n}$.

Define step function $v(x)$ by $v(x_j) = f(x_j)$ for each $j = 0, 1, 2, \dots, n$ and $v(x) = \min_{y \in [x_{j-1}, x_j]} f(y)$ for $x \in (x_{j-1}, x_j)$ and each $j = 1, 2, 3, \dots, n$. Thus, $v$ is a lower step function for $f$.

Similarly, define step function $u(x)$ by $u(x_j) = f(x_j)$ for each $j = 0, 1, 2, \dots, n$ and $u(x) = \max_{y \in [x_{j-1}, x_j]} f(y)$ for $x \in (x_{j-1}, x_j)$ and each $j = 1, 2, 3, \dots, n$. Thus, $u$ is an upper step function for $f$.

For each $j = 1, 2, 3, \dots, n$, because $x_j - x_{j-1} < \delta$, it follows that $\max_{y \in [x_{j-1}, x_j]} f(y) - \min_{y \in [x_{j-1}, x_j]} f(y) \le \frac{\epsilon}{2(b-a)}$, implying that for all $x \in [a,b]$, $u(x) - v(x) < \frac{\epsilon}{b-a}$.

Thus, $\int_a^b u(x)\,dx - \int_a^b v(x)\,dx = \int_a^b \left( u(x) - v(x) \right) dx < \int_a^b \frac{\epsilon}{b-a}\,dx = \epsilon$.

Therefore, $u$ and $v$ are upper and lower step functions for $f$ whose integrals on $[a,b]$ differ by less than $\epsilon$, so it follows that $f$ is integrable on $[a,b]$, which completes the proof.
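The construction in this proof can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the function name `step_integrals` and the `samples` parameter are hypothetical, and the minimum and maximum of $f$ on each subinterval are estimated by sampling rather than found exactly as in the proof.

```python
import math

def step_integrals(f, a, b, n, samples=200):
    """Approximate the integrals of the lower and upper step functions
    built on n equal subintervals of [a, b].

    The min and max of f on each subinterval are estimated by sampling,
    which is an assumption of this sketch (the proof uses exact values).
    """
    width = (b - a) / n
    lower = upper = 0.0
    for j in range(1, n + 1):
        left = a + (j - 1) * width
        values = [f(left + width * i / samples) for i in range(samples + 1)]
        lower += min(values) * width
        upper += max(values) * width
    return lower, upper

# The gap between the two integrals shrinks as the partition is refined.
for n in (10, 100, 1000):
    lo, hi = step_integrals(math.sin, 0.0, math.pi, n)
    print(n, lo, hi, hi - lo)
```

For the sine function on $[0, \pi]$ the two values bracket the integral, which is 2, and the gap decreases roughly in proportion to $\frac{1}{n}$.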

One nice thing about knowing that a function is integrable on an interval $[a,b]$ is that rather than having to consider all partitions of $[a,b]$, you can determine the value of the function's integral by using any collection of partitions of $[a,b]$ whose norms approach zero. Thus, if you know that $f$ is integrable on $[a,b]$, then for every natural number $n$ you could calculate $I(n) = \sum_{j=1}^{n} f\left(a + (b-a)\frac{j}{n}\right)\frac{b-a}{n}$ which is the Riemann sum for $f$ based on the very specific partition where $x_j = a + (b-a)\frac{j}{n}$ and with $\xi_j = x_j$. This is not the more general Riemann sum required by the definition of the integral, but if you already know that the integral exists, then it must be equal to $\lim_{n \to \infty} I(n)$.

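For example, the function $f(x) = x$ is continuous on the interval $[0,4]$, so you know that it is integrable there. You can then consider $I(n) = \sum_{j=1}^{n} f\left(a + (b-a)\frac{j}{n}\right)\frac{b-a}{n} = \sum_{j=1}^{n} \left(4 \cdot \frac{j}{n}\right)\frac{4}{n} = \frac{16}{n^2} \sum_{j=1}^{n} j = \frac{16}{n^2} \cdot \frac{n(n+1)}{2}$. Then $\lim_{n \to \infty} I(n) = 8$, so $\int_0^4 x\,dx = 8$.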
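The sums $I(n)$ are easy to compute exactly. The Python sketch below is an illustration only; the name `right_endpoint_sum` is hypothetical, and it assumes the example function is $f(x) = x$ on $[0,4]$, for which the closed form works out to $I(n) = \frac{8(n+1)}{n}$.

```python
from fractions import Fraction

def right_endpoint_sum(f, a, b, n):
    """I(n): Riemann sum on n equal subintervals using right endpoints."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / n
    return sum(f(a + (b - a) * Fraction(j, n)) * width for j in range(1, n + 1))

# Exact values of I(n) for f(x) = x on [0, 4]; they decrease toward 8.
for n in (1, 10, 100, 1000):
    print(n, right_endpoint_sum(lambda x: x, 0, 4, n))
```

Using exact fractions avoids rounding error, so each printed value can be compared directly with $\frac{8(n+1)}{n}$.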


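If you try this with the function $f(x) = \begin{cases} 1 & x \text{ is rational} \\ 0 & x \text{ is irrational} \end{cases}$ on the interval $[0,1]$, you obtain $I(n) = \sum_{j=1}^{n} f\left(\frac{j}{n}\right)\frac{1}{n} = 1$. So $\lim_{n \to \infty} I(n) = 1$ which is not the integral of $f$. That is because $f$ is not integrable on $[0,1]$.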

Now that it has been established that continuous functions are integrable, it is

appropriate to investigate the properties of the integrals of continuous functions.

The first of these properties is known as the Mean Value Theorem for Integration. It

states that the integral of a continuous function, $f$, on an interval, $[a,b]$, is given by the length of the interval, $b - a$, times one of the values $f$ achieves on the interval. That is, there exists a $c \in [a,b]$ such that $\int_a^b f(x)\,dx = f(c)(b - a)$. This result has a nice visual interpretation showing that the area under a continuous curve is equal to the area of a rectangle with length $b - a$ and width $f(c)$ for some $c \in [a,b]$ as shown in Fig. 6.9. Another way to think about this is that there is a $c \in [a,b]$ such that $f(c)$ is the mean value of $f$ which could be defined as $\frac{1}{b-a}\int_a^b f(x)\,dx$.

The proof of this theorem follows easily from three earlier results: (1) the Intermediate Value Theorem, (2) a continuous function on a closed interval takes on its extreme values, and (3) if one integrable function is greater than or equal to a second integrable function, then the integral of the first is greater than or equal to the integral of the second. The proof starts with a function $f$ continuous on an interval $[a,b]$. That function achieves its minimum value $K$ and its maximum value $M$ on the interval. Thus, for all $x \in [a,b]$, it follows that $K \le f(x) \le M$ from which it follows that $(b - a)K \le \int_a^b f(x)\,dx \le (b - a)M$. Then, by the Intermediate Value Theorem, on the interval $[a,b]$ the function $f$ achieves every value between $K$ and $M$ including $\frac{1}{b-a}\int_a^b f(x)\,dx$.


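PROOF (Mean Value Theorem for Integration): Assume the function $f$ is continuous on the interval $[a,b]$ with $a < b$. Then there is $c \in [a,b]$ satisfying $f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx$.

Because $f$ is continuous on the interval $[a,b]$ there are $s$ and $t$ in $[a,b]$ such that $f(s) = K$ is the minimum value for $f$ on $[a,b]$, and $f(t) = M$ is the maximum value for $f$ on $[a,b]$.

Since for all $x \in [a,b]$, $K \le f(x) \le M$, it follows that $K(b - a) = \int_a^b K\,dx \le \int_a^b f(x)\,dx \le \int_a^b M\,dx = M(b - a)$.

Because $f(s) = K \le \frac{1}{b-a}\int_a^b f(x)\,dx \le M = f(t)$, the Intermediate Value Theorem shows that there is a $c$ between $s$ and $t$ with $f(c) = \frac{1}{b-a}\int_a^b f(x)\,dx$.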
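The theorem only asserts that such a $c$ exists, but for a specific function it can be located numerically. The Python sketch below is an illustration under stated assumptions: the name `mean_value_point` is hypothetical, the integral is approximated by a midpoint Riemann sum, and $f$ is assumed to be continuous and monotone on $[a,b]$ so that bisection applies.

```python
import math

def mean_value_point(f, a, b, n=10_000, tol=1e-10):
    """Return (c, mean) with f(c) approximately equal to the mean value of f.

    Assumes f is continuous and monotone on [a, b], so bisection applies.
    """
    width = (b - a) / n
    # Midpoint Riemann sum as a stand-in for the exact integral.
    mean = sum(f(a + (j - 0.5) * width) for j in range(1, n + 1)) * width / (b - a)
    lo, hi = a, b
    increasing = f(a) <= f(b)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Move the endpoint that lies on the same side of c as mid.
        if (f(mid) < mean) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, mean

# For f(x) = x^2 on [0, 3] the mean value is 3, attained at c = sqrt(3).
c, mean = mean_value_point(lambda x: x * x, 0.0, 3.0)
print(c, mean, math.sqrt(3))
```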

It can be very exciting to take a first course in Calculus. After learning what a

limit is, you learn about two very different-looking limit processes: the derivative

and the integral. Both differentiation and integration have important applications

which justify the amount of attention they receive. But then comes the seemingly

amazing revelation that these two processes, although they are defined in extremely

different ways, are, in fact, very closely related in that they are essentially inverse

operations of each other. This fact is the point of the Fundamental Theorem of

Calculus, often presented as the pinnacle of the first course in Calculus.

The Fundamental Theorem of Calculus starts with a function $f$ integrable on $[a,b]$. The result of the theorem is generally stated in two parts. The first part defines a new function $F(x) = \int_a^x f(t)\,dt$ and states that if $f$ is continuous at some point $c \in (a,b)$, then $F'(c) = f(c)$. The second part states that if $f$ is continuous on $[a,b]$, and if $F$ is any function satisfying $F'(x) = f(x)$ for all $x \in [a,b]$, then $\int_a^b f(x)\,dx = F(b) - F(a)$. It is fairly straightforward to prove the second part using the first part.

To prove the first part of the theorem, you would assume that a function $f$ is integrable on an interval $[a,b]$ and that $f$ is continuous at $c \in (a,b)$. To find the derivative of $F(x) = \int_a^x f(t)\,dt$ at $c$, you would just apply the definition of the derivative. That is, you would start with the difference quotient $\frac{F(x) - F(c)}{x - c} = \frac{1}{x-c}\left( \int_a^x f(t)\,dt - \int_a^c f(t)\,dt \right)$. This simplifies to $\frac{1}{x-c}\int_c^x f(t)\,dt$. Now if you knew that $f$ were continuous between $c$ and $x$, you could apply the just completed Mean Value Theorem for Integration to conclude that this difference quotient is equal to $f(y)$ for some $y$ between $c$ and $x$. Then by forcing $x$ to be close to $c$, you could force $f(y)$ to be close to $f(c)$ to complete the proof. But you do not know that $f$ is continuous between $c$ and $x$; only that $f$ is continuous at $c$. Still this is enough. You can use the continuity of $f$ at $c$ to say that for a given $\epsilon > 0$ there is a $\delta > 0$ that ensures that if $t$ satisfies $|t - c| < \delta$, then $|f(t) - f(c)| < \epsilon$. This shows that for $x$ within $\delta$ of $c$, $\frac{1}{x-c}\int_c^x (f(c) - \epsilon)\,dt < \frac{1}{x-c}\int_c^x f(t)\,dt < \frac{1}{x-c}\int_c^x (f(c) + \epsilon)\,dt$ which simplifies to $f(c) - \epsilon < \frac{1}{x-c}\int_c^x f(t)\,dt < f(c) + \epsilon$.

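PROOF (Fundamental Theorem of Calculus: Part I): Assume the function $f$ is integrable on the interval $[a,b]$ and continuous at some $c \in (a,b)$. Then the function $F(x) = \int_a^x f(t)\,dt$ is differentiable at $c$ with $F'(c) = f(c)$.

Let $f$ be integrable on $[a,b]$ and continuous at $c \in (a,b)$.

For $x \in [a,b]$, define $F(x) = \int_a^x f(t)\,dt$.

Let $\epsilon > 0$ be given.

From the definition of continuity, there is a $\delta > 0$ such that if $x \in [a,b]$ with $|x - c| < \delta$, then $|f(x) - f(c)| < \epsilon$.

Select any $x \in [a,b]$ with $0 < |x - c| < \delta$.

Then $F(x) - F(c) = \int_a^x f(t)\,dt - \int_a^c f(t)\,dt = \int_c^x f(t)\,dt$.

Since $f(c) - \epsilon < f(t) < f(c) + \epsilon$ for all $t$ between $c$ and $x$, it follows that $f(c) - \epsilon = \frac{1}{x-c}\int_c^x (f(c) - \epsilon)\,dt < \frac{1}{x-c}\int_c^x f(t)\,dt < \frac{1}{x-c}\int_c^x (f(c) + \epsilon)\,dt = f(c) + \epsilon$.

Thus, $\left| \frac{F(x) - F(c)}{x - c} - f(c) \right| = \left| \frac{1}{x-c}\int_c^x f(t)\,dt - f(c) \right| < \epsilon$.

This shows that $\lim_{x \to c} \frac{F(x) - F(c)}{x - c} = f(c)$, completing the proof of the theorem.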
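The conclusion $F'(c) = f(c)$ can be checked numerically for a particular function. The Python sketch below is an illustration only; the names `integral` and `difference_quotient` are hypothetical, and a midpoint Riemann sum stands in for the exact integral.

```python
import math

def integral(f, a, b, n=20_000):
    """Midpoint Riemann sum approximating the integral of f from a to b."""
    width = (b - a) / n
    return sum(f(a + (j + 0.5) * width) for j in range(n)) * width

def difference_quotient(f, c, h):
    """(F(c + h) - F(c)) / h, where F(x) is the integral of f up to x."""
    # F(c + h) - F(c) equals the integral of f from c to c + h.
    return integral(f, c, c + h) / h

# The difference quotients approach f(c) as h shrinks, from either side.
f, c = math.cos, 1.0
for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(f, c, h), f(c))
```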

The second part of the Fundamental Theorem of Calculus now follows easily. Indeed, if $f$ is continuous on $[a,b]$, then the function $F(x) = \int_a^x f(t)\,dt$ is an antiderivative of $f$ by the first part of the theorem. If $G$ is any other antiderivative of $f$, then $G'(x) = F'(x)$ on $[a,b]$. It follows from the Mean Value Theorem (for derivatives) that $G$ and $F$ differ by a constant because $G - F$ has a derivative that is identically 0. Thus, $F(x) - F(a) = G(x) - G(a)$ for all $x \in [a,b]$. Since $F(a) = 0$, this shows that $\int_a^b f(t)\,dt = G(b) - G(a)$ for any antiderivative $G$.


PROOF (Fundamental Theorem of Calculus: Part II): Assume the function $f$ is continuous on the interval $[a,b]$ and that $G$ is any antiderivative of $f$. Then $\int_a^b f(x)\,dx = G(b) - G(a)$.

Without loss of generality, $a < b$.

Define $F(x) = \int_a^x f(t)\,dt$.

Let $G$ be any antiderivative of $f$.

Then for all $x \in [a,b]$, the derivative of $G(x) - F(x)$ is $f(x) - f(x) = 0$.

By the Mean Value Theorem there is a $c \in (a,b)$ such that $\left( G(b) - F(b) \right) - \left( G(a) - F(a) \right) = \left( f(c) - f(c) \right)(b - a) = 0$.

Thus, $G(b) - G(a) = F(b) - F(a) = F(b) = \int_a^b f(x)\,dx$ which completes the proof.

The importance of the Fundamental Theorem of Calculus cannot be overstated. It

turns the complex operation of finding limits of difficult to calculate Riemann sums

into the somewhat more routine job of finding antiderivatives of functions.

6.9.1 Exercises

1. If $F(x) = \int_{x^2}^{x^3} \frac{t}{1+t^2}\,dt$, find $F'(x)$.

2. Suppose $f$ has a jump discontinuity at $c \in [a,b]$ (that is, $\lim_{x \to c^+} f(x)$ and $\lim_{x \to c^-} f(x)$ both exist and are unequal). If $f$ is integrable on $[a,b]$, what is the behavior of $F(x) = \int_a^x f(t)\,dt$ at $c$?

3. If $f$ is integrable on $[a,b]$ and the derivative of $F(x) = \int_a^x f(t)\,dt$ exists at $c \in [a,b]$, what can you say about $f$ at $c$?

A function continuous on the closed interval $[a,b]$ is integrable there. Some functions which are not continuous are still integrable, so the question is, how badly can a function behave and still be integrable? If a continuous function is changed at finitely many points, that change does not affect whether or not it is integrable. If a function has a jump discontinuity at a point (that is, it has a right limit and a left limit at the point, but those two limits are not equal) but is continuous elsewhere, then the function is still integrable. This is because if a function is integrable on $[a,b]$ and integrable on $[b,c]$, then it is integrable on $[a,c]$ whether or not the function is continuous at $b$. It follows that bounded piecewise continuous functions are integrable.

Let the function $f$ be defined on an interval $[a,b]$. Define the set of discontinuities of $f$, $D_f$, to be the subset of $[a,b]$ where $f$ fails to be continuous. For example, on the interval $[0,1]$ if $f(x) = \begin{cases} 0 & x = \frac15, \frac25, \frac35, \frac45 \\ x & \text{otherwise} \end{cases}$, then $D_f = \left\{ \frac15, \frac25, \frac35, \frac45 \right\}$. If $f(x) = \begin{cases} 1 & x \text{ is rational} \\ 0 & x \text{ is irrational} \end{cases}$, then $D_f$ is the entire set $[0,1]$ because $f$ is discontinuous everywhere. Finally, if $f(x) = \begin{cases} 1 & x \text{ is in the Cantor set} \\ 0 & x \text{ is not in the Cantor set} \end{cases}$, then $D_f$ is equal to the Cantor set because for any $x$ not in the Cantor set, there is an open interval containing $x$ such that $f$ is identically 0 on that open interval.

These examples suggest that a function defined on $[a,b]$ is integrable as long as its set of discontinuities does not get too large. In fact, a function defined on $[a,b]$ is Riemann integrable if and only if it is bounded and its associated set of discontinuities, $D_f$, has measure zero. Thus, any bounded function which is discontinuous only on a countable set of points must be Riemann integrable. The first function in the preceding paragraph with $D_f = \left\{ \frac15, \frac25, \frac35, \frac45 \right\}$ is, therefore, integrable. The second function which has $D_f = [0,1]$ is not integrable as seen earlier in this chapter. The third function which has $D_f$ equal to the Cantor set is interesting because its set of discontinuities is not countable, yet the function is integrable because the Cantor set does have measure zero.
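The claim that the Cantor set has measure zero can be made concrete with a short computation. The Python sketch below is an illustration only; the name `cantor_stage` is hypothetical. After $k$ middle-third removals, the Cantor set lies inside $2^k$ closed intervals of total length $\left(\frac23\right)^k$, and enlarging each interval slightly gives open intervals whose total length is still as small as desired.

```python
from fractions import Fraction

def cantor_stage(k):
    """Closed intervals remaining after k middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the left and right thirds, discard the middle third.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

# The number of intervals doubles while their total length tends to 0.
for k in (0, 1, 5, 12):
    stage = cantor_stage(k)
    total = sum(right - left for left, right in stage)
    print(k, len(stage), float(total))
```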

The statement that the Riemann integrable functions on $[a,b]$ are exactly the bounded functions whose set of discontinuities has measure zero is a biconditional statement. It says

both that if a function is Riemann integrable, then it is bounded with a set of

discontinuities that has measure zero, and that if a function is bounded with a set of

discontinuities that has measure zero, then the function is Riemann integrable. Thus,

a proof of this statement will have two parts, one for each conditional. The proof of

the theorem is somewhat longer than others seen in this book, but it requires only

one new concept not yet discussed.

Assume first that the function $f$ is defined on the interval $[a,b]$, is bounded, and its set of discontinuities, $D_f$, has measure zero. You can prove that $f$ is integrable on $[a,b]$ if you can show that for every $\epsilon > 0$ the function $f$ has upper and lower step functions, $u$ and $v$, such that $\int_a^b \left( u(x) - v(x) \right) dx < \epsilon$. The key point here is that $f$ is well behaved near points where it is continuous, and the set where it is not well behaved is very small (has measure zero). The strategy, then, is to construct step functions, $u$ and $v$, so that $u(x) - v(x)$ is very small near points where $f$ is continuous, and to limit the size of the intervals where $u(x) - v(x)$ is large. Suppose, for example, that near points where $f$ is continuous, you could limit $u(x) - v(x)$ to be less than $\frac{\epsilon}{2(b-a)}$. Then the total contribution to the integral of $u - v$ over those sections of the step functions would be at most $\frac{\epsilon}{2(b-a)}(b - a) = \frac{\epsilon}{2}$. The function $f$ is bounded, so there is an $M$ such that $|f(x)| < M$ for all $x \in [a,b]$. It is possible, therefore, to define upper and lower step functions that differ by at most $2M$ at points of $D_f$. If you can limit the regions where $u(x) - v(x)$ is large to intervals whose total length is at most $\frac{\epsilon}{4M}$, then the total contribution to the integral of $u - v$ over those sections of the step functions would be at most $2M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{2}$. Accomplishing both of these goals would then show that the integral of $u - v$ is less than $\frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$. Can this be accomplished? By the definition of continuity, for each point $x$ where $f$ is continuous there is a $\delta > 0$ such that if $y$ is in $[a,b]$ with $|y - x| < \delta$, then $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$. That would ensure that for any two values $y_1$ and $y_2$ in the interval $(x - \delta, x + \delta)$, the difference $|f(y_1) - f(y_2)| \le |f(y_1) - f(x)| + |f(x) - f(y_2)| < \frac{\epsilon}{4(b-a)} + \frac{\epsilon}{4(b-a)} = \frac{\epsilon}{2(b-a)}$. By the definition of measure zero, the set of discontinuities of $f$ can be covered by a collection of open intervals whose total length is less than the needed $\frac{\epsilon}{4M}$. Thus, each point of $[a,b]$ can be covered by one of the open intervals covering $D_f$ or by one of these $(x - \delta, x + \delta)$ intervals constructed at each point of continuity. The Heine-Borel Theorem then lets you reduce this covering of $[a,b]$ with open intervals to a finite subcovering, and from that subcovering, the appropriate upper and lower step functions can be constructed. That completes the strategy for the first part of the proof.

Assume, conversely, that the function $f$ defined on the interval $[a,b]$ is integrable. You already know that this implies that $|f|$ is bounded by some constant $M$, so all you need to prove is that the set of discontinuities of $f$, $D_f$, has measure zero. This can be done with a proof by contradiction. That is, by assuming that $D_f$ does not have measure zero, you can show that for any upper and lower step functions, $u$ and $v$, the integral of $u - v$ is bounded away from 0. To do this it is helpful to consider how much $f$ can vary near a particular value $x$. For a point $x \in [a,b]$ and a $\delta > 0$, you would like to know how much $f$ can change over the interval $(x - \delta, x + \delta)$. So define $W_\delta(x) = \sup f(y) - \inf f(y)$ where the supremum and infimum are calculated for $y$ varying over the interval $(x - \delta, x + \delta) \cap [a,b]$. Note that if $f$ had upper and lower step functions that were both constant on the interval $(x - \delta, x + \delta)$, then the two step functions would have to differ by at least $W_\delta(x)$ on that interval. Now define the variation of a function $f$ at a point $x$ to be $W(x) = \lim_{\delta \to 0^+} W_\delta(x)$. Since $W_\delta(x)$ is nonnegative and does not increase as $\delta$ decreases, this limit exists and is equal to $\inf_{\delta > 0} W_\delta(x)$. The following lemma gives an important property of $W$.


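PROOF: Let $f$ be a bounded function defined on the interval $[a,b]$. Then for any $x \in [a,b]$, the variation of $f$ at $x$ is 0 if and only if $f$ is continuous at $x$.

Let $f$ be a bounded function defined on the interval $[a,b]$.

PART I: Continuity implies $W = 0$

Assume that for some $x \in [a,b]$ the function $f$ is continuous at $x$.

Then for every $\epsilon > 0$ there is a $\delta > 0$ such that if $y \in (x - \delta, x + \delta) \cap [a,b]$, then $|f(y) - f(x)| < \frac{\epsilon}{2}$. Thus, $W_\delta(x) \le \left( f(x) + \frac{\epsilon}{2} \right) - \left( f(x) - \frac{\epsilon}{2} \right) = \epsilon$.

Thus, there are $\delta$ for which $W_\delta(x)$ is within $\epsilon$ of 0, implying that $W(x) = \inf_{\delta > 0} W_\delta(x) = 0$.

PART II: $W = 0$ implies continuity

Assume that for some $x \in [a,b]$ the variation of $f$ at $x$ is $W(x) = 0$.

Since $\lim_{\delta \to 0^+} W_\delta(x) = 0$, for every $\epsilon > 0$, there is a $\delta_0 > 0$ such that for $0 < \delta < \delta_0$, $W_\delta(x) < \epsilon$.

Select $\delta > 0$ with $\delta < \delta_0$.

Then for any $y \in (x - \delta, x + \delta) \cap [a,b]$, it follows that $|f(y) - f(x)| \le \sup_{|z - x| < \delta} f(z) - \inf_{|z - x| < \delta} f(z) = W_\delta(x) < \epsilon$.

Thus, $f$ is continuous at $x$, which completes the proof.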

As a consequence of this lemma, the set of discontinuities of $f$ is $D_f = \{ x \in [a,b] \mid W(x) > 0 \}$. Now for each natural number $n$ define $D_f^n = \{ x \in [a,b] \mid W(x) > \frac{1}{n} \}$ to be the points of $[a,b]$ where the variation of $f$ at $x$ is greater than $\frac{1}{n}$. If the variation of $f$ at $x$ is positive, then it must be greater than $\frac{1}{n}$ for some $n$. Thus, the set of all discontinuities of $f$ must be the union of these $D_f^n$ sets, that is, $D_f = \bigcup_{n=1}^{\infty} D_f^n$. The key observation here is that if for each $n$ the $D_f^n$ set has measure zero, then the entire set of discontinuities, $D_f$, must have measure zero because it is just a countable union of sets with measure zero. So, if you assume that $D_f$ does not have measure zero, it requires that there is a natural number $n$ such that $D_f^n$ also does not have measure zero. What does it mean for $D_f^n$ not to have measure zero? It means that there is an $\epsilon > 0$ such that no collection of open intervals with total length less than $\epsilon$ can cover all of $D_f^n$. This will be the key to showing that upper and lower step functions for $f$ cannot have integrals that are arbitrarily close to each other, and thus, $f$ cannot be integrable. The result is known as Lebesgue's Theorem.


PROOF (Lebesgue's Theorem): A function $f$ defined on the interval $[a,b]$ is Riemann integrable if and only if $f$ is bounded and the set of points in $[a,b]$ where $f$ is discontinuous has measure zero.

Let $f$ be a function defined on the interval $[a,b]$ with $a < b$.

PART I: Boundedness and discontinuities with measure zero imply integrable

Assume that there is a real number $M$ such that $|f(x)| < M$ for all $x \in [a,b]$, and assume the set $D_f$, the set of $x \in [a,b]$ such that $f$ is discontinuous at $x$, has measure zero.

Let $\epsilon > 0$ be given.

By the definition of measure zero, there is a sequence of open intervals $I_1, I_2, I_3, \dots$ with total length less than $\frac{\epsilon}{4M}$ such that $D_f$ is contained in the union of those intervals.

By the definition of continuity, for each $x \in [a,b]$ where $f$ is continuous, there is a $\delta_x > 0$ such that $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$ for all $y \in [a,b]$ with $|y - x| < \delta_x$. Let $J_x$ be the interval $(x - \delta_x, x + \delta_x)$.

Since each $x \in [a,b]$ is either a point of continuity of $f$ or a member of $D_f$, each $x \in [a,b]$ is either a member of one of the intervals $I_k$ that covers $D_f$ or in the interval $J_x$. Thus, the collection of open intervals consisting of $I_1, I_2, I_3, \dots$ together with the $J_x$ intervals forms an open covering of $[a,b]$.

By the Heine-Borel Theorem, there exists a finite collection of these open intervals that covers $[a,b]$. Let $E = \{x_1, x_2, x_3, \dots, x_n\}$ be the set of distinct endpoints for the intervals in this finite cover of $[a,b]$ where $x_1 < x_2 < x_3 < \cdots < x_n$.

Define step functions $u(x)$ and $v(x)$ as follows. If $x = x_j$ for one of the endpoints $x_j \in E$, then define $u(x) = v(x) = f(x)$.

For each $j$ the open interval $(x_{j-1}, x_j)$ must be a subset of one of the finite number of intervals that cover $[a,b]$. If $(x_{j-1}, x_j)$ is contained in one of the $I_k$ intervals that covers $D_f$, define $u(x) = M$ and $v(x) = -M$ for each $x \in (x_{j-1}, x_j)$. Since $|f|$ is bounded by $M$, $v(x) \le f(x) \le u(x)$ for each $x \in (x_{j-1}, x_j)$.

Otherwise, $(x_{j-1}, x_j)$ is contained in one of the $J_x$ intervals. In this case, define $u(y) = f(x) + \frac{\epsilon}{4(b-a)}$ and $v(y) = f(x) - \frac{\epsilon}{4(b-a)}$ for each $y \in (x_{j-1}, x_j)$. Since $|f(y) - f(x)| < \frac{\epsilon}{4(b-a)}$ for all $y \in J_x$, it follows that $v(y) < f(y) < u(y)$ for each $y \in (x_{j-1}, x_j)$.

It follows that $v$ is a lower step function of $f$, and $u$ is an upper step function of $f$.

Then $\int_a^b \left( u(x) - v(x) \right) dx = \sum_{j=2}^{n} \int_{x_{j-1}}^{x_j} \left( u(x) - v(x) \right) dx$.


Over the intervals that were subsets of the $I_k$ intervals, $u(x) - v(x) = 2M$. The total length of such intervals cannot exceed $\frac{\epsilon}{4M}$. As a result, the integral of $u(x) - v(x)$ over these intervals cannot exceed $2M \cdot \frac{\epsilon}{4M} = \frac{\epsilon}{2}$.

Over the intervals that were subsets of the $J_x$ intervals, $u(x) - v(x) = \frac{\epsilon}{2(b-a)}$. As a result, the integral of $u(x) - v(x)$ over these intervals cannot exceed $\int_a^b \frac{\epsilon}{2(b-a)}\,dx = \frac{\epsilon}{2}$.

Thus, $f$ has upper and lower step functions, $u$ and $v$, with the property that $\int_a^b \left( u(x) - v(x) \right) dx < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$.

PART II: Integrable implies bounded and discontinuities with measure zero

Let $f$ be Riemann integrable on $[a,b]$.

Since all integrable functions are bounded, there is an $M$ such that $|f(x)| < M$ for all $x \in [a,b]$.

For each $x \in [a,b]$ and $\delta > 0$, let $W_\delta(x) = \sup f(y) - \inf f(y)$ where the supremum and infimum are calculated for $y$ varying over the interval $(x - \delta, x + \delta) \cap [a,b]$.

For each $x \in [a,b]$ define the variation of $f$ at $x$ to be $W(x) = \lim_{\delta \to 0^+} W_\delta(x)$.

For natural number $n$ define $D_f^n = \{ x \in [a,b] \mid W(x) > \frac{1}{n} \}$ to be the points of $[a,b]$ where the variation of $f$ at $x$ is greater than $\frac{1}{n}$.

Then $D_f = \bigcup_{n=1}^{\infty} D_f^n$. Assume, for the sake of contradiction, that $D_f$ does not have measure zero.

A countable union of sets with measure zero is itself a set with measure zero. Since $D_f$ is the union of the $D_f^n$, there must exist a natural number $n$ such that $D_f^n$ does not have measure zero.

Since $D_f^n$ does not have measure zero, there is an $\epsilon > 0$ such that if $D_f^n$ is covered by a sequence of open intervals, the total length of those intervals must be at least $\epsilon$.

Let $u$ be an upper step function for $f$ and $v$ be a lower step function for $f$.

From the definition of step function, there is a sequence $a = x_0 < x_1 < x_2 < \cdots < x_k = b$ such that both $u$ and $v$ are constant on the open intervals $I_j = (x_{j-1}, x_j)$ for each $j = 1, 2, 3, \dots, k$.

For each $x \in I_j$, $u(x)$ cannot be less than $\sup_{z \in I_j} f(z)$, and $v(x)$ cannot be greater than $\inf_{z \in I_j} f(z)$. As a consequence, $u(x) - v(x) \ge \sup_{z \in I_j} f(z) - \inf_{z \in I_j} f(z)$. Thus, if $I_j$ contains a point of $D_f^n$, then $u(x) - v(x) \ge \sup_{z \in I_j} f(z) - \inf_{z \in I_j} f(z) > \frac{1}{n}$ for all $x \in I_j$.


$D_f^n$ cannot be covered by open intervals whose total length is less than $\epsilon$. Thus, it follows that the total length of the intervals $I_j$ that contain points of $D_f^n$ must be at least $\epsilon$.

It follows that $\int_a^b \left( u(x) - v(x) \right) dx \ge \frac{\epsilon}{n}$.

Thus, $f$ cannot have upper and lower step functions whose integrals differ by less than $\frac{\epsilon}{n}$. This implies that $f$ is not integrable, which is a contradiction.

Therefore, the assumption that $D_f$ does not have measure zero is false, which completes the proof.

The last section of Chap. 4 introduced Thomae's function, a function defined on $[0,1]$ which is discontinuous at each rational number but is continuous at each irrational number. Since the rational numbers form a countable set, that set has measure zero. Thus, Thomae's function is bounded, and its set of discontinuities has measure zero, so Thomae's function is Riemann integrable. Compare this to the function that is equal to 1 for all rational numbers and equal to 0 for all irrational numbers. That function is discontinuous everywhere, so its set of discontinuities does not have measure zero, and it is not integrable as seen earlier.

6.10.1 Exercises

1. Suppose $f : [0,2] \to [5,9]$ is integrable and $g : [5,9] \to [0,2]$ is continuous. Show that $g \circ f$ is integrable.

Write proofs for each of the following statements.

2. If $f(x)$ is a function integrable on the interval $[0,10]$, then so is the function $f(x)f(10 - x)$.

3. If $f(x)$ is a function integrable on the interval $[a,b]$, and $p(x)$ is a polynomial, then $p\left(f(x)\right)$ is also integrable on $[a,b]$.

4. If $f(x)$ and $g(x)$ are integrable functions on the interval $[a,b]$, then so is $f(x)g(x)$.

Chapter 7

Infinite Series

The axioms for the real numbers define addition as a binary operation and establish

the rules for adding two real numbers together. One can use mathematical induction

to extend axioms and theorems about addition to get theorems about the addition

of any finite number of terms. But there is nothing in the axioms that suggests how

to add an infinite number of terms together or what such a sum would mean. You

need to make a separate definition in order to make sense out of adding infinitely

1

P

many terms together. An infinite series $a_1 + a_2 + a_3 + \cdots = \sum_{n=1}^{\infty} a_n$ has a sequence of terms $a_1, a_2, a_3, \dots$ which are written with plus signs or minus signs between the terms of the sequence. In this chapter, most series will begin with a first term $a_1$, although there is no problem with beginning the series at other subscript values such as the commonly seen $\sum_{n=0}^{\infty} a_n$. Also in this chapter the terms of the series will be real numbers, although series can also be formed from other kinds of terms such as complex numbers or matrices. This explains what an infinite series looks like, but it does not assign any meaning to the symbols.

In Abstract Algebra one can study formal power series, a study that looks at one

type of infinite series and considers how to manipulate the series without regard

to whether these series can be assigned any meaningful numerical values. But in

Analysis, one is interested in the cases where it makes sense to assign a numerical

value to the series. The difference in the two studies is in the interpretation of a

series like $1 - 2 + 3 - 4 + \cdots$. If you ask what happens if you multiply this series by 2, a purely algebraic answer would be that you just use the Distributive Law and multiply each term of the series by 2 to get $2 - 4 + 6 - 8 + \cdots$. But an analytical

answer to the question is that it makes little sense to assign a numerical value to the

series, so multiplying the series by 2 cannot yield a meaningful result.

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_7


The partial sums of the series are

$s_1 = a_1$

$s_2 = a_1 + a_2$

$s_3 = a_1 + a_2 + a_3$

$\vdots$

$s_k = a_1 + a_2 + a_3 + \cdots + a_k.$

Since each of these sums is just the sum of a finite number of terms, they are easily defined. The series is said to converge to real number $L$ if the sequence of partial sums converges to $L$, that is, if $\lim_{k \to \infty} s_k = L$. In this case one writes $\sum_{n=1}^{\infty} a_n = L$ and says that the series has limit $L$ or even that the series has value $L$. If the sequence of partial sums does not converge, then the series is said to diverge. If the partial sums tend to infinity or negative infinity, the series is said to diverge to infinity or negative infinity, respectively. In that case one could write $\sum_{n=1}^{\infty} a_n = \infty$ or $\sum_{n=1}^{\infty} a_n = -\infty$.

Ideally, to find the value of a series one would derive a simple expression for its partial sums $s_k = \sum_{n=1}^{k} a_n$; the value of $\sum_{n=1}^{\infty} a_n$ is then the limit of the partial sums $\lim_{k \to \infty} s_k$. Unfortunately, there are relatively few series that admit simple closed-form expressions for their partial sums, and this technique for finding the value of a series has limited use. Still, it is important to know about some of the cases when this technique does work. Perhaps the best known examples of series whose partial sums can be explicitly calculated are the geometric series. These are the series whose sequence of terms can be written in the form $a_n = ar^{n-1}$, where $a$ and $r$ are given real numbers. Then the first term of the series is $a$ and the common ratio of adjacent terms is $\frac{a_n}{a_{n-1}} = \frac{ar^{n-1}}{ar^{n-2}} = r$, at least in the interesting cases when $ar \ne 0$. When $r$ is not equal to 1, there is a simple algebraic trick that gives the expression for the partial sums.

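\[
\begin{aligned}
s_k &= \sum_{n=1}^{k} ar^{n-1} \\
r s_k &= \sum_{n=1}^{k} ar^{n} \\
s_k - r s_k &= \sum_{n=1}^{k} ar^{n-1} - \sum_{n=1}^{k} ar^{n} = a - ar^{k} \\
s_k (1 - r) &= a(1 - r^{k}) \\
s_k &= a\,\frac{1 - r^{k}}{1 - r} .
\end{aligned}
\]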

(Of course, there is an even simpler trick for the case when $r = 1$.) Thus, except
in the trivial case where $a = 0$, the sequence of partial sums diverges if $|r| \ge 1$. On
the other hand, when $|r| < 1$, $\lim_{k\to\infty} r^k = 0$ so $\lim_{k\to\infty} s_k = \frac{a}{1-r}$, which can easily be
remembered as the first term divided by 1 minus the common ratio. The geometric

series is particularly important because one can often compare other series to a

geometric series to determine if the other series converges. It also gives a nice

example showing that series that make a lot of sense when they converge can lead

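you to very strange and very incorrect conclusions when they do not converge. In
particular, $\sum_{n=1}^{\infty} r^n = \frac{r}{1-r}$ whenever $|r| < 1$. But when you take a limit as r approaches
$-1$ through values with $|r| < 1$, the series gives
$\lim_{r\to -1} \sum_{n=1}^{\infty} r^n = \lim_{r\to -1} \frac{r}{1-r} = -\frac{1}{2}$,
even though the series $\sum_{n=1}^{\infty} \lim_{r\to -1} r^n = \sum_{n=1}^{\infty} (-1)^n$ does not converge.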
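The closed form for the geometric partial sums can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names are hypothetical, and exact rational arithmetic is used so the comparison involves no rounding.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, k):
    """Add the first k terms a, a*r, ..., a*r**(k-1) one by one."""
    return sum(a * r**n for n in range(k))

def geometric_closed_form(a, r, k):
    """Closed form a*(1 - r**k)/(1 - r); only valid when r != 1."""
    return a * (1 - r**k) / (1 - r)

a, r = Fraction(3), Fraction(1, 2)
for k in (1, 5, 20):
    # The term-by-term sum and the closed form agree exactly.
    assert geometric_partial_sum(a, r, k) == geometric_closed_form(a, r, k)

# With |r| < 1 the partial sums approach a / (1 - r).
print(float(geometric_closed_form(a, r, 50)), float(a / (1 - r)))
```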

Another class of series whose partial sums can be calculated are the telescoping

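series. This is a class of series where each term $a_n$ can be written as a difference of
two terms $a_n = b_n - b_{n+1}$. Then
\[
\begin{aligned}
s_k &= (b_1 - b_2) + (b_2 - b_3) + (b_3 - b_4) + \cdots + (b_k - b_{k+1}) \\
    &= b_1 + (-b_2 + b_2) + (-b_3 + b_3) + (-b_4 + b_4) + \cdots + (-b_k + b_k) - b_{k+1} \\
    &= b_1 - b_{k+1} .
\end{aligned}
\]
Hence, if $\lim_{k\to\infty} b_{k+1}$ exists, the series converges. The best known example
of this type is the series
\[
\sum_{n=1}^{\infty} \frac{1}{n^2+n} = \sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n+1} \right) = 1 - \lim_{n\to\infty} \frac{1}{n+1} = 1 .
\]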
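As a quick check of the telescoping example, the sketch below (hypothetical function name, exact fractions) confirms that the partial sums of $\sum \frac{1}{n^2+n}$ equal $b_1 - b_{k+1} = 1 - \frac{1}{k+1}$.

```python
from fractions import Fraction

def telescoping_partial_sum(k):
    """Sum of 1/(n^2 + n) for n = 1..k, computed term by term."""
    return sum(Fraction(1, n * n + n) for n in range(1, k + 1))

for k in (1, 2, 10, 100):
    # Telescoping predicts s_k = 1 - 1/(k + 1).
    assert telescoping_partial_sum(k) == 1 - Fraction(1, k + 1)
```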

Fortunately, even though it is often difficult to determine the exact values for

the partial sums of a series, one can very often determine whether or not the series

converges and sometimes the value to which it converges even without knowing an

explicit formula for its partial sums. There are many tools that can be used to do this.

These tools consist of a large collection of convergence tests which can be applied

to determine if a particular series converges. Calculus students often get a great deal

of practice selecting appropriate convergence tests for series. This chapter will be

more interested in proving the theorems that provide these tests.

The simplest and possibly most important convergence test is the Limit of Terms

Test which says that a series can converge only if its sequence of terms has a limit
of 0. That is, if $\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n\to\infty} a_n = 0$. This is a direct consequence of


The point is, if $\lim_{k\to\infty} s_k$ exists, then $\langle s_k \rangle$ is a Cauchy sequence whose terms must get

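PROOF (Limit of Terms Test): The series $\sum_{n=1}^{\infty} a_n$ converges only if
$\lim_{n\to\infty} a_n = 0$.

Assume that the series $\sum_{n=1}^{\infty} a_n$ with partial sums $s_k = \sum_{n=1}^{k} a_n$ converges to L.

Then $\lim_{n\to\infty} a_n = \lim_{n\to\infty} (s_n - s_{n-1}) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = L - L = 0$,

which completes the proof.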

The convergence of one series can often be inferred from the convergence of a

similar series. For example, inserting extra terms equal to 0 into a series does not

affect whether the series converges, nor can inserting extra 0 terms affect the value

to which the series converges. This is because the insertion of terms equal to 0 into

a series does not change the sequence of partial sums for that series except to allow

some of the partial sums to be repeated, and that does not change the limit of the

sequence of partial sums.

Another useful observation is that if two series differ in only a finite number of

terms, then either both series converge or both series diverge. Suppose, for example,
that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are two series such that for some positive integer N, the terms
$a_n = b_n$ for all $n > N$. Why would the convergence of one of the series imply
the convergence of the other? It must depend on the convergence of their partial
sums, so let $s_k = \sum_{n=1}^{k} a_n$ and $t_k = \sum_{n=1}^{k} b_n$ be the sequences of partial sums for
the two series. The agreement of $a_n$ and $b_n$ for all $n > N$ shows that for $k > N$,
\[
t_k = t_N + \sum_{n=N+1}^{k} b_n = t_N + \sum_{n=N+1}^{k} a_n = t_N + s_k - s_N .
\]
Thus, $\lim_{k\to\infty} s_k$ exists if and only if $\lim_{k\to\infty} t_k$ exists.

7.1.1 Exercises

Find limits for the following series or show that the limit does not exist.

1.

2.

1

P

nD1

1

P

nD1

5

3n

4

22nC1

3.

4.

5.

1

P

nD1

1

P

nD1

1

P

nD1

23


7

n2 C5n

1

n2 C9nC14

1

1 1

1

1

1

1

1

C 2 C 3 3 4 C 4 C

2 3 22

3

2

3

2

3

1

1

1

1

7. C 0 C C 0 C 0 C C 0 C 0 C 0 C

C 0 C 0 C 0 C 0

2

4

8

16

1

1

1

1

1

5

C

C

C

C

8. 11 C 3 C C

9

12

23

34

45

56

6.

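Before going on, it is necessary to distinguish between two types of convergent
series. The series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if the series of the
absolute values of its terms is convergent. That is, if $\sum_{n=1}^{\infty} |a_n|$ converges, then
$\sum_{n=1}^{\infty} a_n$ is absolutely convergent. Every absolutely convergent series is convergent, which can be proved
by using the fact that a series converges if and only if its sequence of partial sums is
Cauchy. The proof can begin with the absolutely convergent series $\sum_{n=1}^{\infty} a_n$, so $\sum_{n=1}^{\infty} |a_n|$
converges. Then, knowing that $\sum_{n=1}^{\infty} |a_n|$ converges, its
sequence of partial sums $S_k = \sum_{n=1}^{k} |a_n|$
must be Cauchy, which means that for each $\epsilon > 0$ there is an N such that for any
$m > k > N$, $|S_m - S_k| < \epsilon$. What do you need to know for $\langle s_k \rangle$, the sequence of
partial sums $s_k = \sum_{n=1}^{k} a_n$, to be Cauchy?
You need to know that for each $\epsilon > 0$ there is an N such that for any $m > k > N$,
$|s_m - s_k| < \epsilon$. But
\[
|s_m - s_k| = \left| \sum_{n=k+1}^{m} a_n \right| \le \sum_{n=k+1}^{m} |a_n| = |S_m - S_k| ,
\]
which you already know can be made small. It follows that the $\langle s_k \rangle$ sequence is Cauchy, so it
converges.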


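PROOF: If the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent, then the
series converges.

Assume that the series $\sum_{n=1}^{\infty} |a_n|$ converges.

Let $S_k = \sum_{n=1}^{k} |a_n|$ and $s_k = \sum_{n=1}^{k} a_n$.

Then, since the series is absolutely convergent, the sequence $\langle S_k \rangle$ converges implying that $\langle S_k \rangle$ is a Cauchy sequence.

Given $\epsilon > 0$ there is an N such that for all m and k greater than
N, $|S_m - S_k| < \epsilon$.

Let $m > k > N$. Then $|s_m - s_k| = \left| \sum_{n=k+1}^{m} a_n \right| \le \sum_{n=k+1}^{m} |a_n| = |S_m - S_k| < \epsilon$.

Thus, the sequence $\langle s_k \rangle$ is a Cauchy sequence and is, therefore, a
convergent sequence.

This shows that $\sum_{n=1}^{\infty} a_n$ is convergent which proves the theorem.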
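The key inequality of this proof, $|s_m - s_k| \le S_m - S_k$, can be observed numerically. The sketch below is only an illustration under an assumed example: the terms $a_n = \sin(n)/n^2$ are chosen here (they do not appear in the text) because their signs are irregular while $\sum |a_n|$ converges. All names are hypothetical.

```python
import math
from itertools import accumulate

# Assumed example terms with mixed signs: a_n = sin(n) / n^2 for n = 1..2000.
terms = [math.sin(n) / n**2 for n in range(1, 2001)]

s = list(accumulate(terms))                    # s_k, partial sums of a_n
S = list(accumulate(abs(t) for t in terms))    # S_k, partial sums of |a_n|

def cauchy_gap_ok(k, m, tol=1e-12):
    """Check |s_m - s_k| <= S_m - S_k for 1 <= k < m (1-based indices).

    The small tolerance only absorbs floating-point rounding.
    """
    return abs(s[m - 1] - s[k - 1]) <= (S[m - 1] - S[k - 1]) + tol

print(all(cauchy_gap_ok(k, 2000) for k in range(1, 2000)))
```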

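If the series $\sum_{n=1}^{\infty} a_n$ converges but is not absolutely convergent, it is said to be
conditionally convergent. An absolutely convergent series converges because its
terms get small fast enough that its partial sums must rapidly get close to each
other and to a limit. A conditionally convergent series converges because its negative
terms balance the growth of its positive terms. For example, the series
$1 - 1 + \frac{1}{2} - \frac{1}{2} + \frac{1}{3} - \frac{1}{3} + \frac{1}{4} - \frac{1}{4} + \cdots$ clearly converges to 0 due to this type of cancelation.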

Thus, every series can be categorized as either absolutely convergent, conditionally

convergent, or divergent.

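Because the definition of the convergence of a series involves the limit of partial
sums, many results that are true for finite sums are easily proved for infinite sums.
For example, if $\sum_{n=1}^{\infty} a_n$ converges, and c is any constant, then
$\sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n$. To see this, note that the Distributive
Law works for finite sums, so $\sum_{n=1}^{k} c a_n = c \sum_{n=1}^{k} a_n$.

PROOF: If $\sum_{n=1}^{\infty} a_n$ converges and c is a constant, then $\sum_{n=1}^{\infty} c a_n = c \sum_{n=1}^{\infty} a_n$.

Assume that $\sum_{n=1}^{\infty} a_n$ converges and that c is a constant.

Then
\[
\sum_{n=1}^{\infty} c a_n = \lim_{k\to\infty} \sum_{n=1}^{k} c a_n = \lim_{k\to\infty} c \sum_{n=1}^{k} a_n = c \lim_{k\to\infty} \sum_{n=1}^{k} a_n = c \sum_{n=1}^{\infty} a_n .
\]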

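Another easy result is that if $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, then
$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$.

PROOF: If the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, then
$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n$.

Assume that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge.

Then
\[
\sum_{n=1}^{\infty} (a_n + b_n) = \lim_{k\to\infty} \sum_{n=1}^{k} (a_n + b_n) = \lim_{k\to\infty} \left( \sum_{n=1}^{k} a_n + \sum_{n=1}^{k} b_n \right) = \lim_{k\to\infty} \sum_{n=1}^{k} a_n + \lim_{k\to\infty} \sum_{n=1}^{k} b_n = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n .
\]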

With these theorems you can often start with a series whose value you know

and derive the values of other series. For example, what is the value of the
series $1 + \frac{1}{2} - \frac{1}{4} + \frac{1}{8} + \frac{1}{16} - \frac{1}{32} + \frac{1}{64} + \frac{1}{128} - \frac{1}{256} + \cdots$? This series looks
something like the geometric series with first term 1 and common ratio $\frac{1}{2}$ which
is $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$. That series has limit $\frac{1}{1 - \frac{1}{2}} = 2$. But the new series
is clearly not a geometric series because the terms are not all the same sign,
which would be the case for a geometric series with a positive common ratio,
nor are the terms alternating in sign, which would be the case for a geometric
series with a negative common ratio. The new series can be written, though, as
\[
\left( 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots \right) - \left( 0 + 0 + \frac{2}{4} + 0 + 0 + \frac{2}{32} + 0 + 0 + \frac{2}{256} + \cdots \right) .
\]
This is the difference of two series: the geometric series with first term 1 and common ratio
$\frac{1}{2}$, and a series whose value is the same as a geometric series with first term $\frac{2}{4} = \frac{1}{2}$
and common ratio $\frac{1}{8}$. Thus, the new series has value
\[
\frac{1}{1 - \frac{1}{2}} - \frac{\frac{1}{2}}{1 - \frac{1}{8}} = \frac{10}{7} .
\]
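The value $\frac{10}{7}$ can be sanity-checked by summing the series directly. This is a small sketch with hypothetical function names; it assumes the sign pattern described in the text, where every third term is negative.

```python
from fractions import Fraction

def new_series_term(n):
    """n-th term (n >= 1): 1/2**(n-1), negated when n is a multiple of 3."""
    sign = -1 if n % 3 == 0 else 1
    return sign * Fraction(1, 2 ** (n - 1))

def new_series_partial_sum(k):
    """Exact sum of the first k terms."""
    return sum(new_series_term(n) for n in range(1, k + 1))

# The partial sums settle down to 10/7 = 1.428571...
print(float(new_series_partial_sum(60)), float(Fraction(10, 7)))
```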


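the next chapter it will be shown that this series converges to ln 2. So how about the
series $1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots$? At first it appears that this series
is the same as the previous series because it includes the same terms rearranged in a
different order. Indeed, both series include the terms $\frac{1}{2n-1}$ and $-\frac{1}{2n}$ for each positive
integer n. But one can write
\[
\begin{aligned}
1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots
&= 1 + 0 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + 0 + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + 0 + \frac{1}{11} - \frac{1}{6} + \cdots \\
&= \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \right)
 + \left( 0 + \frac{1}{2} + 0 - \frac{1}{4} + 0 + \frac{1}{6} + 0 - \frac{1}{8} + \cdots \right) \\
&= \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \right)
 + \frac{1}{2} \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \right) \\
&= \ln 2 + \frac{1}{2} \ln 2 = \frac{3}{2} \ln 2 .
\end{aligned}
\]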

It is not unusual that rearranging the order of terms in a series results in the series

converging to a different quantity. This is, in fact, a characteristic of all conditionally

convergent series as will be shown later in this chapter.
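The effect of the rearrangement can be seen numerically. The sketch below (hypothetical names) sums the rearranged series in blocks of three terms, $\frac{1}{4j-3} + \frac{1}{4j-1} - \frac{1}{2j}$, and compares the result with $\frac{3}{2}\ln 2$; the original alternating series is summed for comparison with $\ln 2$. Floating-point arithmetic is used, so agreement is only approximate.

```python
import math

def rearranged_partial_sum(blocks):
    """Sum `blocks` groups of the form 1/(4j-3) + 1/(4j-1) - 1/(2j)."""
    total = 0.0
    for j in range(1, blocks + 1):
        total += 1 / (4 * j - 3) + 1 / (4 * j - 1) - 1 / (2 * j)
    return total

def alternating_harmonic_partial_sum(k):
    """Sum of (-1)**(n+1) / n for n = 1..k."""
    return sum((-1) ** (n + 1) / n for n in range(1, k + 1))

print(rearranged_partial_sum(100_000), 1.5 * math.log(2))
print(alternating_harmonic_partial_sum(100_000), math.log(2))
```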

7.3.1 Exercises

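1. Prove that if the series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ converge and c and d are real
numbers, then $\sum_{n=1}^{\infty} (c a_n + d b_n) = c \sum_{n=1}^{\infty} a_n + d \sum_{n=1}^{\infty} b_n$.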

1

P

nD1

converges to 0.

nD1

1

P

mDn

am


1

1

1

1

1

1

1

(a) 1 C C 2 3 C 4 C 5 C 6 7 C

3

3

3

3

3

3

3

1

1

1

1

1

1

C

C

C

(b)

23 34

45 56

67 78

1 1

1

1 1

1

1

1

1

C C C C

C

C

(c)

2

4 3

6

8 5

10

12 7

1

1

1 1

1

1

1

1

1

(d) 1 C C C C C

C

C

C

3

5

7 2

9

11

13

15 4

There are many tests for the convergence of series. Presented here are four very

useful tests that apply to series whose terms are all positive real numbers. Of course,
since the convergence of $\sum_{n=1}^{\infty} |a_n|$ implies the convergence of $\sum_{n=1}^{\infty} a_n$, these tests can

After the Limit of Terms Test, the Comparison Test is likely the most important

convergence test because it is used to prove most of the other convergence tests. It

states that if the terms of one series are less than or equal to the corresponding terms

of a second series, then the convergence of the second series implies the convergence
of the first series. Specifically, suppose there are two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$, and for
each n, the terms satisfy $0 \le a_n \le b_n$. Then if $\sum_{n=1}^{\infty} b_n$ converges, $\sum_{n=1}^{\infty} a_n$
must converge. The contrapositive of this statement is then also true and states that
if $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} b_n$ must also diverge.

Consider how you would prove that this test is valid. The proof would assume that
$0 \le a_n \le b_n$ for each n, and assume that $\sum_{n=1}^{\infty} b_n$ converges. Then it must show that
$\sum_{n=1}^{\infty} a_n$ converges. One shows that a series converges by showing that its sequence
of partial sums converges. You do know that the sequence of partial sums for $\sum_{n=1}^{\infty} b_n$
converges, so how can you use that to make a conclusion about the partial sums of
$\sum_{n=1}^{\infty} a_n$? One idea is to use the technique from the proof that absolutely convergent
series are convergent; that is, a series converges if and only if its sequence of partial
sums is Cauchy. If the partial sums of $\sum_{n=1}^{\infty} b_n$ form a Cauchy sequence, then $\sum_{n=k}^{m} b_n$
gets small whenever $k \le m$ are large. Now, the given fact that $a_n \le b_n$ lets you
conclude that $\sum_{n=k}^{m} a_n \le \sum_{n=k}^{m} b_n$ which implies that the partial sums of $\sum_{n=1}^{\infty} a_n$ are
Cauchy. Thus, $\sum_{n=1}^{\infty} a_n$ must converge.

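PROOF (Comparison Test): Suppose that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are series
with nonnegative terms and N is a real number such that for every
integer $n > N$, the terms satisfy $0 \le a_n \le b_n$. Then if $\sum_{n=1}^{\infty} b_n$ converges, so
does $\sum_{n=1}^{\infty} a_n$.

Assume that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are series with nonnegative terms.

Assume that there is an N such that for every $n > N$, the terms of the series
satisfy $0 \le a_n \le b_n$.

Assume that the series $\sum_{n=1}^{\infty} b_n$ converges.

Then its sequence of partial sums $\sum_{n=1}^{k} b_n$ is a Cauchy sequence.

Thus, given $\epsilon > 0$ there is an $M \ge N$ such that if $M < k \le m$, then
$\epsilon > \left| \sum_{n=1}^{m} b_n - \sum_{n=1}^{k} b_n \right| = \sum_{n=k+1}^{m} b_n$.

But then whenever $M < k \le m$, it follows that
$\epsilon > \sum_{n=k+1}^{m} b_n \ge \sum_{n=k+1}^{m} a_n = \left| \sum_{n=1}^{m} a_n - \sum_{n=1}^{k} a_n \right|$.

Therefore, the sequence of partial sums of $\sum_{n=1}^{\infty} a_n$ is Cauchy, and $\sum_{n=1}^{\infty} a_n$
converges.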

This proves that the Comparison Test is valid.

The Comparison Test can be used in many cases when you are faced with a series

which is similar to a series that you know converges. For example, you already know
that the series $\sum_{n=1}^{\infty} \frac{1}{n^2+n}$ converges because it forms a telescoping series. Can this fact
be used to show that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges? The Comparison Test cannot
be applied directly because for each n you have $\frac{1}{n^2+n} < \frac{1}{n^2}$ which is not what you
need. You need to find a convergent series whose terms are greater than or equal to
$\frac{1}{n^2}$ or a divergent series whose terms are less than or equal to them. You have neither.
On the other hand for each positive integer n, it is true that $\frac{1}{n^2} = \frac{2}{n^2+n^2} \le \frac{2}{n^2+n}$.
The series $\sum_{n=1}^{\infty} \frac{2}{n^2+n}$ is twice a convergent series, so it is also convergent. Thus, the
Comparison Test shows that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.
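The termwise bound used in this comparison is easy to verify by machine. A minimal sketch with hypothetical names: it checks $\frac{1}{n^2} \le \frac{2}{n^2+n}$ and that the partial sums of $\sum \frac{1}{n^2}$ stay below the telescoping bound $2\left(1 - \frac{1}{k+1}\right)$.

```python
from fractions import Fraction

def check_bound(n):
    """True when 1/n^2 <= 2/(n^2 + n), the inequality used in the comparison."""
    return Fraction(1, n * n) <= Fraction(2, n * n + n)

def partial_sum_inverse_squares(k):
    """Exact sum of 1/n^2 for n = 1..k."""
    return sum(Fraction(1, n * n) for n in range(1, k + 1))

assert all(check_bound(n) for n in range(1, 1000))
# The partial sums of 2/(n^2 + n) telescope to 2 * (1 - 1/(k + 1)) < 2.
assert partial_sum_inverse_squares(200) <= 2 * (1 - Fraction(1, 201))
```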

In this way the Comparison Test can be used to simplify the task of testing the

convergence of many complicated looking series. As another example, consider the
series $\sum_{n=1}^{\infty} \frac{2n+7}{n^3-5n+1}$. Note that the first two terms of this series are negative. Because
the convergence of a series does not depend on the value of any finite set of its
terms, it is sufficient to test the series by considering the terms where $n \ge 3$. In
the terms $\frac{2n+7}{n^3-5n+1}$ the degree of the polynomial in the denominator is 3 while the
degree of the polynomial in the numerator is 1. This suggests that the terms could be
compared to the terms $\frac{1}{n^2}$ of a known convergent series. The strategy is to compare
$\frac{2n+7}{n^3-5n+1}$ to a fraction that is greater but looks more like $\frac{1}{n^2}$. If the series with greater
fractions converges, the Comparison Test shows that the original series converges.
This can be done by attempting to eliminate lower degree terms of the numerator
and denominator polynomials, thus, ending up with a simpler fraction greater than
the original. Clearly, when considering the numerator $2n + 7$, the constant term,
7, will be dwarfed by the size of the linear term $2n$ suggesting that you replace
$2n + 7$ by the larger quantity $2n + 7n = 9n$. This replacement will result in a larger
fraction, but it should not affect whether or not the series converges. Similarly, it
would be good to replace the denominator $n^3 - 5n + 1$ with a smaller polynomial of
the same degree which will result in obtaining a fraction larger than $\frac{2n+7}{n^3-5n+1}$. One
can drop the constant term altogether, but one cannot drop the $-5n$ term without
making the denominator polynomial larger. This can be handled by writing $n^3$ as
$\frac{1}{2} n^3 + \frac{1}{2} n^3$. For large enough values of n, the value of $\frac{1}{2} n^3$ will exceed $5n$ making
$\frac{1}{2} n^3 - 5n$ a positive quantity which could be removed from the polynomial to make
the polynomial smaller. Indeed, you need $\frac{1}{2} n^3 - 5n \ge 0$ implying $n^2 \ge 10$. Thus,
if $n \ge 4$, you can conclude that $n^3 - 5n + 1 > \frac{1}{2} n^3$. This shows that for $n \ge 4$, the
fraction $\frac{2n+7}{n^3-5n+1} < \frac{9n}{\frac{1}{2} n^3} = \frac{18}{n^2}$. Since the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, so does
$\sum_{n=1}^{\infty} \frac{18}{n^2}$. Thus, by the Comparison Test, $\sum_{n=1}^{\infty} \frac{2n+7}{n^3-5n+1}$
converges.
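The two numerical claims in this example, that the first two terms are negative and that the terms are below $\frac{18}{n^2}$ once $n \ge 4$, can be spot-checked exactly. The function names below are hypothetical.

```python
from fractions import Fraction

def original_term(n):
    """The term (2n + 7) / (n^3 - 5n + 1) as an exact fraction."""
    return Fraction(2 * n + 7, n**3 - 5 * n + 1)

def comparison_term(n):
    """The comparison term 18 / n^2."""
    return Fraction(18, n * n)

# The first two terms are negative, as noted in the text.
assert original_term(1) < 0 and original_term(2) < 0
# From n = 4 on, the terms are positive and below 18 / n^2.
assert all(0 < original_term(n) < comparison_term(n) for n in range(4, 2000))
```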

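$\cdots + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \cdots$. For each
$k \ge 0$ this series has $2^k$ terms equal to $\frac{1}{2^{k+1}}$, and these $2^k$ terms add to $\frac{1}{2}$. Thus, the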

which clearly diverges. Thus, the series diverges. Now, compare this series to the
harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ and note that each term of the harmonic series is greater than
or equal to the corresponding term of the first series. Thus, by the Comparison Test,

the harmonic series must diverge.
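The grouping argument implies that the sum of the first $2^j$ terms of the harmonic series is at least $1 + \frac{j}{2}$. The sketch below checks this standard consequence exactly; the function names are hypothetical, and the bound assumes the comparison series begins $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \cdots$, which is an assumption since its opening terms are cut off in the text above.

```python
from fractions import Fraction

def harmonic_partial_sum(k):
    """Exact sum of 1/n for n = 1..k."""
    return sum(Fraction(1, n) for n in range(1, k + 1))

def grouped_lower_bound(j):
    """Lower bound 1 + j/2 for the sum of the first 2**j harmonic terms."""
    return 1 + Fraction(j, 2)

for j in range(0, 11):
    # Each block of 2**(j-1) terms ending at 1/2**j contributes at least 1/2.
    assert harmonic_partial_sum(2**j) >= grouped_lower_bound(j)
```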

The geometric series

1

P

nD1

a series to the terms of an appropriate geometric series. Suppose $\sum_{n=1}^{\infty} a_n$ is a series
with positive terms, and suppose that the sequence of ratios of adjacent terms, $\frac{a_{n+1}}{a_n}$,
has limit L as n approaches infinity. If $L < 1$, then the series can be compared to
a convergent geometric series with a common ratio between L and 1. If $L > 1$,
then the terms of the series increase in value and do not approach 0, so the series
diverges. When $L = 1$, the ratio test fails because there are series for which $L = 1$
that converge and other series for which $L = 1$ that diverge.

To prove that the Ratio Test is valid, you would start by assuming that $\sum_{n=1}^{\infty} a_n$
is a series of positive terms with $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L < 1$. Then you would
compare this series to a well-chosen geometric series known to converge so that
the Comparison Test can be used to conclude that $\sum_{n=1}^{\infty} a_n$ converges. To compare the
given series to the geometric series with nth term $ar^{n-1}$, you would need to know
that for all n greater than some N, the terms $a_n$ are less than $ar^{n-1}$. If you know that
$\frac{a_{n+1}}{a_n}$ is always less than r, then the $a_n$ terms will grow more slowly than the $ar^{n-1}$
terms, and the Comparison Test can be used. In general, r cannot be set equal to L
because knowing that $\frac{a_{n+1}}{a_n}$ approaches L in the limit does not ensure that the ratio
is ever actually smaller than L and certainly not that it is always less than L. But if
the limit of $\frac{a_{n+1}}{a_n}$ is L, then by the definition of limit, there is an N such that for all
$n \ge N$, the ratio is less than some value greater than L such as $\frac{L+1}{2}$, which is half
way between L and 1. Then, if the value of a is chosen so that $a_N \le a$, you will have
$a_{N+k} \le ar^k$ for each $k \ge 0$, and the result follows.

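PROOF (Ratio Test): Suppose that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such
that $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L$. Then if $L < 1$, the series converges, if $L > 1$, the series
diverges, and if $L = 1$ the test fails.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms with $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L$.

CASE 1: $L < 1$

Then there is an N such that $\frac{a_{n+1}}{a_n} < \frac{L+1}{2}$ for all $n \ge N$.

Let $a = a_N$, and $r = \frac{L+1}{2} < 1$.

Assume that for some $k \ge 0$, $a_{N+k} \le ar^k$. Then $\frac{a_{N+k+1}}{a_{N+k}} < r$, so
$a_{N+k+1} < a_{N+k}\, r \le ar^k r = ar^{k+1}$.

Therefore, by mathematical induction it follows that $a_{N+k} \le ar^k$ for all
$k \ge 0$.

Since $\sum_{n=0}^{\infty} ar^n$ is a convergent geometric series, it follows from the Comparison
Test that $\sum_{n=1}^{\infty} a_n$ is also convergent.

CASE 2: $L > 1$

Then there is an N such that $\frac{a_{n+1}}{a_n} > 1$ for all $n \ge N$.

Then for all $n \ge N$, $a_{n+1} > a_n > 0$, so the sequence of terms increases from
$a_N$ and cannot have a limit of 0.

Therefore, the series diverges by the Limit of Terms Test.

CASE 3: $L = 1$

Note that the constant series $\sum_{n=1}^{\infty} 1$ diverges and has $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{1}{1} = 1$.

The series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges and has $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \lim_{n\to\infty} \frac{n^2}{(n+1)^2} = 1$.

Therefore, no conclusion can be drawn when $L = 1$, and the Ratio Test
fails.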

The ratio test is not helpful for series where the nth term is a rational function of n
because the limit of $\frac{a_{n+1}}{a_n}$ will always be 1, and the test is inconclusive. The ratio test
is particularly useful for series whose nth terms involve powers or factorials. For
example, when you apply the ratio test to the series $\sum_{n=1}^{\infty} \frac{5^n}{n!}$, you get
\[
\lim_{n\to\infty} \frac{5^{n+1}/(n+1)!}{5^n/n!} = \lim_{n\to\infty} \frac{5}{n+1} = 0 < 1 ,
\]
so the series converges.
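For the factorial example, the ratio of adjacent terms can be computed exactly. This is a brief sketch with hypothetical names, confirming that the ratio of adjacent terms of $\sum \frac{5^n}{n!}$ simplifies to $\frac{5}{n+1}$ and becomes small.

```python
from fractions import Fraction
from math import factorial

def term(n):
    """The term 5^n / n! as an exact fraction."""
    return Fraction(5**n, factorial(n))

def ratio(n):
    """Ratio of adjacent terms a_(n+1) / a_n."""
    return term(n + 1) / term(n)

# The ratio simplifies to 5 / (n + 1), which tends to 0.
assert all(ratio(n) == Fraction(5, n + 1) for n in range(1, 50))
print([float(ratio(n)) for n in (1, 10, 100)])
```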


It is sufficient to know that $\limsup_{n\to\infty} \frac{a_{n+1}}{a_n} < 1$ to assure that the series
converges, and that $\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} > 1$ to
assure that the series diverges. The proofs of these facts are left as exercises, but
they are important refinements of the Ratio Test since the lim inf and lim sup always
exist even if the limit does not. For example, consider the series
$1 + \frac{2}{3} + \frac{1}{3} + \frac{2}{3^2} + \frac{1}{3^2} + \frac{2}{3^3} + \frac{1}{3^3} + \cdots$. For this series, the ratio $\frac{a_{n+1}}{a_n}$
oscillates between $\frac{2}{3}$ and $\frac{1}{2}$, so the
limit of the ratio does not exist. But the lim sup of the ratio is $\frac{2}{3} < 1$ implying that

the series converges.
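The oscillating ratios in this example can be listed explicitly. In the sketch below the indexing of the terms is an assumption made for illustration ($a_1 = 1$, $a_{2m} = \frac{2}{3^m}$, $a_{2m+1} = \frac{1}{3^m}$), and the names are hypothetical. Under that indexing the partial sums approach $\frac{5}{2}$.

```python
from fractions import Fraction

def term(n):
    """Terms 1, 2/3, 1/3, 2/9, 1/9, ... indexed from n = 1."""
    m = n // 2
    return Fraction(2, 3**m) if n % 2 == 0 else Fraction(1, 3**m)

def ratio(n):
    """Ratio of adjacent terms a_(n+1) / a_n."""
    return term(n + 1) / term(n)

# Only two ratio values ever occur, so lim sup = 2/3 and lim inf = 1/2.
print(sorted({ratio(n) for n in range(1, 21)}))
print(float(sum(term(n) for n in range(1, 42))))
```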

The Ratio Test will play a major role in the discussion of power series in the next

chapter.

The Root Test is similar to the Ratio Test and can often be used for the same

series for which the Ratio Test can be used. This is because, like the Ratio Test,

it compares a series to a geometric series. For some series where the general term an

involves the nth powers of expressions, the Root Test can be easier to apply than the

Ratio Test. To test a series with positive terms $a_n$ with the Root Test, you calculate
the limit $\lim_{n\to\infty} \sqrt[n]{a_n} = L$. Then, as with the Ratio Test, if $L < 1$, the series converges,
if $L > 1$, the series diverges, and if $L = 1$, the test fails.

Proving that the Root Test is valid is very straightforward. Given that
$\lim_{n\to\infty} \sqrt[n]{a_n} = L < 1$, there is an integer N such that for all $n \ge N$ the root $\sqrt[n]{a_n}$ is
less than $\frac{L+1}{2} < 1$. Then, for $n \ge N$, the terms $a_n$ are less than $\left( \frac{L+1}{2} \right)^n$, the terms
of a convergent geometric series. Thus, the series converges by the Comparison
Test.

PROOF (Root Test): Suppose that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms such
that $\lim_{n\to\infty} \sqrt[n]{a_n} = L$. Then if $L < 1$, the series converges, if $L > 1$, the series
diverges, and if $L = 1$ the test fails.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series of positive terms with $\lim_{n\to\infty} \sqrt[n]{a_n} = L$.

CASE 1: $L < 1$

If $L < 1$, there is an integer N such that $\sqrt[n]{a_n} < \frac{L+1}{2}$ for all $n \ge N$.

Then, for $n \ge N$, each term $a_n < \left( \frac{L+1}{2} \right)^n$, the corresponding term of a
geometric series with common ratio $\frac{L+1}{2} < 1$.

Therefore, since the geometric series converges, the Comparison Test shows
that $\sum_{n=1}^{\infty} a_n$ converges.

CASE 2: $L > 1$

If $L > 1$, there is an integer N such that $\sqrt[n]{a_n} > \frac{L+1}{2} > 1$ for all $n \ge N$.

Then, for all $n \ge N$, $a_n > \left( \frac{L+1}{2} \right)^n$ which diverges to infinity.

Therefore, the series diverges by the Limit of Terms Test.

CASE 3: $L = 1$

Note that the constant series $\sum_{n=1}^{\infty} 1$ diverges and has $\lim_{n\to\infty} \sqrt[n]{a_n} = \lim_{n\to\infty} 1 = 1$.

The series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges and has $\sqrt[n]{a_n} = \frac{1}{\sqrt[n]{n^2}}$.

Since the natural logarithm function is continuous at 1, it follows that
$\ln \lim_{n\to\infty} \sqrt[n]{a_n} = \lim_{n\to\infty} \ln \sqrt[n]{a_n} = \lim_{n\to\infty} \frac{-2 \ln n}{n}$. Then by L'Hôpital's Rule,
this limit is $\lim_{n\to\infty} \frac{-2/n}{1} = 0$, from which it follows that $\lim_{n\to\infty} \sqrt[n]{a_n} = 1$.

Therefore, no conclusion can be drawn when $L = 1$, and the Root Test fails.

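As with the Ratio Test, it is sufficient to know that $\limsup_{n\to\infty} \sqrt[n]{a_n} < 1$ to conclude
that the series converges, and that $\liminf_{n\to\infty} \sqrt[n]{a_n} > 1$ to conclude that the series
diverges. For example, the series $\frac{1}{2} + \frac{1}{3} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^3} + \cdots$ has general term
$a_{2n} = \frac{1}{3^n}$ and $a_{2n-1} = \frac{1}{2^n}$. Thus, $\lim_{n\to\infty} \sqrt[n]{a_n}$ does not exist, but
$\limsup_{n\to\infty} \sqrt[n]{a_n} = \frac{1}{\sqrt{2}} < 1$, so the series converges.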
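The two limiting values of the nth roots in this example can be approximated numerically. The sketch below uses hypothetical names and computes $\sqrt[n]{a_n}$ directly from the exponents (rather than forming $a_n$ first) to avoid floating-point underflow for large n.

```python
import math

def nth_root_of_term(n):
    """n-th root of a_n, where a_(2m) = 3**(-m) and a_(2m-1) = 2**(-m)."""
    if n % 2 == 0:
        return 3.0 ** (-(n // 2) / n)
    return 2.0 ** (-((n + 1) // 2) / n)

# Odd indices approach 1/sqrt(2) (the lim sup); even indices equal 1/sqrt(3).
print(nth_root_of_term(2001), 1 / math.sqrt(2))
print(nth_root_of_term(2000), 1 / math.sqrt(3))
```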

The definition of the Riemann integral considers the integrals of functions over

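closed bounded intervals, $[a, b]$. This definition can be extended to integrals on
an infinite interval. An improper Riemann integral of the first kind defines
integrals over intervals where one or both of the endpoints of the interval are infinite.
One defines $\int_a^{\infty} f(x)\,dx$ as $\lim_{b\to\infty} \int_a^b f(x)\,dx$. Similarly,
$\int_{-\infty}^{b} f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx$
and $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a\to-\infty} \lim_{b\to\infty} \int_a^b f(x)\,dx$. For example,
\[
\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{b\to\infty} \int_1^b \frac{1}{x^2}\,dx = \lim_{b\to\infty} \left. -\frac{1}{x} \right|_1^b = 1 .
\]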

After seeing a definition of the improper Riemann integral of the first kind, the

reader may be curious whether there is also an improper Riemann integral of

the second kind. Although this text will not need to deal with improper Riemann

integrals of the second kind, the definition is given here for completeness. Recall

that Riemann integrals over an interval $[a, b]$ exist only if the integrand is bounded.
So, an improper integral of the second kind is an integral where the integrand is
unbounded in every neighborhood of a point $c \in [a, b]$. In this case, the Riemann
integral on $[a, b]$ can be calculated on a region that excludes c and then the limit can
be taken as the region expands toward c. For example, one would define $\int_0^4 \frac{1}{\sqrt{x}}\,dx$ as
\[
\lim_{a\to 0^+} \int_a^4 \frac{1}{\sqrt{x}}\,dx = \lim_{a\to 0^+} \left. 2\sqrt{x} \right|_a^4 = 4 .
\]

The Integral Test for the convergence of a series of positive terms involves

the comparison of an infinite series with an improper Riemann integral. It applies

to series whose terms are equal to a monotonically decreasing function f defined

on an interval $[a, \infty)$ such that for all $n \ge a$, the nth term of the series $a_n$ is
equal to the function at the point n, that is, $a_n = f(n)$. The following figure
makes this comparison clear. Let k be an integer greater than or equal to a. If f
is a monotonically decreasing function, then whenever $n \ge x > k$, the function
$f(x) \ge f(n) = a_n$ showing that $\int_{n-1}^{n} f(x)\,dx \ge \int_{n-1}^{n} f(n)\,dx = f(n) = a_n$. Thus, by the
Comparison Test, the series $\sum_{n=1}^{\infty} a_n$ converges if
$\sum_{n=k}^{\infty} \int_n^{n+1} f(x)\,dx = \int_k^{\infty} f(x)\,dx$ converges. Alternatively, if $x \ge n \ge k$, then
$f(x) \le f(n) = a_n$ showing that $\int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx = f(n) = a_n$. Thus,
$\int_{k+1}^{\infty} f(x)\,dx = \sum_{n=k+1}^{\infty} \int_n^{n+1} f(x)\,dx$
converges if the series $\sum_{n=1}^{\infty} a_n$ converges. Moreover, for $k \ge a$,
\[
\int_{k+1}^{\infty} f(x)\,dx \le \sum_{n=k+1}^{\infty} a_n \le \int_k^{\infty} f(x)\,dx ,
\]
so the integral provides both a test for the convergence
of the infinite series and a good way to obtain an approximation to the value of
the series. This is helpful because it is often easier to evaluate the integral than the
corresponding infinite series (Fig. 7.1).

Fig. 7.1 Comparing the series with the integral in the Integral Test

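PROOF (Integral Test): Suppose that f is a positive monotonically decreasing
function on the interval $[a, \infty)$. Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that
$a_n = f(n)$ for all n greater than or equal to an integer $k \ge a$. Then the
series $\sum_{n=1}^{\infty} a_n$ converges if and only if the improper integral $\int_a^{\infty} f(x)\,dx$
converges.

Suppose f is a positive monotonically decreasing function on $[a, \infty)$.

Suppose $\sum_{n=1}^{\infty} a_n$ is a series such that $a_n = f(n)$ for all n greater than or equal
to an integer $k \ge a$.

Because f is monotonically decreasing, for any $n > k$ it follows that $f(x) \ge f(n)$
for all $x \in [n-1, n]$.

Thus, for any $n > k$ it follows that $a_n = f(n) = \int_{n-1}^{n} f(n)\,dx \le \int_{n-1}^{n} f(x)\,dx$.

By the Comparison Test the series $\sum_{n=1}^{\infty} a_n$ converges if
$\sum_{n=k+1}^{\infty} \int_{n-1}^{n} f(x)\,dx = \int_k^{\infty} f(x)\,dx$ converges.

Similarly, for any $n > k$ it follows that $f(n) \ge f(x)$ for all $x \in [n, n+1]$.

Thus, for any $n > k$ it follows that $\int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx = f(n) = a_n$.

By the Comparison Test the series $\sum_{n=k+1}^{\infty} \int_n^{n+1} f(x)\,dx = \int_{k+1}^{\infty} f(x)\,dx$ converges
if the series $\sum_{n=1}^{\infty} a_n$ converges.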

The p-series are the series of the form $\sum_{n=1}^{\infty} \frac{1}{n^p}$
where p is some constant greater than 0. For which p does the p-series converge?
You have already seen that it converges when $p = 2$ and diverges when $p = 1$,
the harmonic series. All the p-series can be handled at once using the Integral Test.
Indeed, since the function $f(x) = \frac{1}{x^p}$ is monotonically decreasing in x for each $p > 0$,
the p-series converges exactly when the integral $\int_1^{\infty} \frac{1}{x^p}\,dx$ converges. But when $p \ne 1$ the integral
$\int_1^{\infty} \frac{1}{x^p}\,dx = \left. \frac{1}{1-p} \cdot \frac{1}{x^{p-1}} \right|_1^{\infty}$. This is infinite when $p < 1$
and finite when $p > 1$. When $p = 1$, the integral is $\left. \ln x \right|_1^{\infty}$ which is infinite.
Thus, by the Integral Test, the integral and the series converge exactly when $p > 1$.

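Consider the p-series when $p = 2$. The value of this series can be estimated
using the integral estimate associated with the Integral Test. The estimate would be
$\int_1^{\infty} \frac{1}{x^2}\,dx > \sum_{n=2}^{\infty} \frac{1}{n^2} > \int_2^{\infty} \frac{1}{x^2}\,dx$ or
$1 + 1 > a_1 + \sum_{n=2}^{\infty} \frac{1}{n^2} > 1 + \frac{1}{2}$, so $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is between
1.5 and 2. This is not very precise, but one can apply this technique a few terms
farther down the series to get $\int_{10}^{\infty} \frac{1}{x^2}\,dx > \sum_{n=11}^{\infty} \frac{1}{n^2} > \int_{11}^{\infty} \frac{1}{x^2}\,dx$ which shows that
$\sum_{n=1}^{\infty} \frac{1}{n^2}$ lies between $\sum_{n=1}^{10} \frac{1}{n^2} + \frac{1}{11}$ and $\sum_{n=1}^{10} \frac{1}{n^2} + \frac{1}{10}$,
consistent with the exact value $\frac{\pi^2}{6} \approx 1.6449$.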
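The integral estimate for $p = 2$ is simple to reproduce. The following sketch (hypothetical names) adds the first k terms and then brackets the tail between $\int_{k+1}^{\infty} x^{-2}\,dx = \frac{1}{k+1}$ and $\int_k^{\infty} x^{-2}\,dx = \frac{1}{k}$.

```python
import math

def head(k):
    """Sum of 1/n^2 for n = 1..k."""
    return sum(1.0 / n**2 for n in range(1, k + 1))

def integral_bounds(k):
    """Lower and upper bounds on the full series: head(k) + 1/(k+1) and head(k) + 1/k."""
    h = head(k)
    return h + 1.0 / (k + 1), h + 1.0 / k

for k in (1, 10, 100):
    low, high = integral_bounds(k)
    print(k, low, high, math.pi**2 / 6)
```

With k = 1 this gives the crude bracket from 1.5 to 2 mentioned in the text; larger k narrows the bracket to width $\frac{1}{k} - \frac{1}{k+1}$.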

7.4.5 Exercises

1. Suppose that

1

P

bn is a convergent series,

nD1

1

P

nD1

1

P

constants N and K such that 0 an Kbn for all n > N. Prove that

an

nD1

converges.

2. Suppose that

1

P

nD1

1

P

an is a series

nD1

an

n!1 bn

1

P

an

nD1

1

P

a

3. Assume that

an is a series of positive terms that satisfies lim sup nC1

an

nD1

4. Assume that

1

P

lim sup

n!1

anC1

an

an converges.

n!1

1

P

n!1

nD1

nD1

1

P

1

P

anC1

an

an diverges.

nD1

1

C 221 C 212 C 222 C 213 C 223 C 214 C 224

21

a

lim inf nC1

. What can you conclude about the

an

n!1

an D

C calculate

nD1

and

the series?

6. Assume that

1

P

convergence of

nD1

1

P

nD1

n!1

an converges.

p

n a

n

7. Assume that

1

P


nD1

1

P

p

n a

n

n!1

an diverges.

nD1

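8. For the series $\sum_{n=1}^{\infty} a_n = \frac{1}{2^1} + \frac{1}{3^2} + \frac{1}{2^3} + \frac{1}{3^4} + \frac{1}{2^5} + \frac{1}{3^6} + \frac{1}{2^7} + \frac{1}{3^8} + \cdots$ calculate
$\limsup_{n\to\infty} \sqrt[n]{a_n}$ and $\liminf_{n\to\infty} \sqrt[n]{a_n}$. What can you conclude about the convergence of
the series?

9. Use the integral estimate from the Integral Test to estimate the size of the series
$\sum_{n=1}^{\infty} \frac{1}{n^3}$.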

10. Determine which of the following series are absolutely convergent by applying

an appropriate convergence test.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

1

P

nD1

1

P

nD1

1

P

n10 5n5 19

n2 C5

n3 5

3n

2n C5n

nD1

1

P

nD1

1

P

nD1

1

P

nD1

1

P

nD1

n

3

n

.2n/

5n

n

nn

.n/2

So what can you do with a series which is not absolutely convergent? There are

fewer tools to handle conditionally convergent series. One tool that does help is

the Alternating Series Test which considers series whose terms alternate in sign.

Specifically, if the absolute values of the terms of the series are monotonically

decreasing to 0, and the signs of the terms alternate, then the series converges. For
example, the series seen earlier $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$ satisfies these conditions.
The series formed by the absolute values of these terms $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots$

is the harmonic series which does not converge, so the given series is not absolutely

convergent. Seeing how the partial sums of this series behave will give you an idea

how to prove that the Alternating Series Test is valid. In particular, the first few

partial sums of this series are


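\[
\begin{aligned}
s_1 &= 1 = 1 \\
s_2 &= 1 - \frac{1}{2} = \frac{1}{2} \\
s_3 &= 1 - \frac{1}{2} + \frac{1}{3} = \frac{5}{6} \\
s_4 &= 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} = \frac{7}{12} \\
s_5 &= 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} = \frac{47}{60} .
\end{aligned}
\]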

Notice that the partial sums of an odd number of terms are all greater than the

limit, ln 2, while the partial sums of an even number of terms are all less than the
limit. Also, if n is odd, then $s_{n+2} = s_n - \frac{1}{n+1} + \frac{1}{n+2} = s_n - \frac{1}{(n+1)(n+2)} < s_n$,
showing that the partial sums of an odd number of terms form a decreasing
sequence. Similarly, if n is even, then $s_{n+2} = s_n + \frac{1}{n+1} - \frac{1}{n+2} = s_n + \frac{1}{(n+1)(n+2)} > s_n$,
showing that the partial sums of an even number of terms form an increasing
sequence. Because the terms of the series $\frac{(-1)^{n+1}}{n}$ approach 0, the odd partial sums

and the even partial sums approach each other. They both form bounded monotonic

sequences which both converge to the common limit. This behavior is typical of all

series satisfying the hypothesis of the Alternating Series Test.
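The behavior of the odd and even partial sums can be checked directly. This sketch (hypothetical function name) computes the partial sums exactly and confirms that the odd-numbered ones decrease, the even-numbered ones increase, and ln 2 lies between them.

```python
import math
from fractions import Fraction

def alternating_harmonic_partial_sums(k):
    """Return [s_1, ..., s_k] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], Fraction(0)
    for n in range(1, k + 1):
        total += Fraction((-1) ** (n + 1), n)
        sums.append(total)
    return sums

s = alternating_harmonic_partial_sums(20)
odd, even = s[0::2], s[1::2]          # s_1, s_3, ... and s_2, s_4, ...
assert all(x > y for x, y in zip(odd, odd[1:]))     # odd sums decrease
assert all(x < y for x, y in zip(even, even[1:]))   # even sums increase
assert all(e < math.log(2) < o for o, e in zip(odd, even))
```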

PROOF (Alternating Series Test): Suppose that $\sum_{n=1}^{\infty} a_n$ is a series such that
$\lim_{n\to\infty} a_n = 0$, and for each $n \ge 1$, $a_n$ and $a_{n+1}$ have opposite signs, and
$|a_n| \ge |a_{n+1}|$. Then the series converges.

Assume that $\sum_{n=1}^{\infty} a_n$ is a series satisfying these conditions, so in particular $\lim_{n\to\infty} a_n = 0$.

Without loss of generality, assume that $a_1 > 0$.

Let the series have partial sums $s_k = \sum_{n=1}^{k} a_n$.

Note that if $n \ge 1$ is odd, then $a_{n+1}$ is negative and $a_{n+2}$ is positive with
$|a_{n+1}| \ge |a_{n+2}|$ implying that $s_{n+2} = s_n + (a_{n+1} + a_{n+2}) \le s_n$.

Similarly, if $n \ge 1$ is even, then $a_{n+1}$ is positive and $a_{n+2}$ is negative with
$|a_{n+1}| \ge |a_{n+2}|$ implying that $s_{n+2} = s_n + (a_{n+1} + a_{n+2}) \ge s_n$.

Thus, the subsequence of odd numbered partial sums forms a monotonically
decreasing sequence while the subsequence of even numbered partial sums
forms a monotonically increasing sequence.

Because the subsequence of even numbered partial sums is increasing, when
n is an odd positive integer it follows that $s_n > s_{n+1} \ge s_2$ showing the
subsequence of odd numbered partial sums is bounded below by $s_2$ implying
that that sequence converges to a limit $L_1$.

Similarly, the subsequence of even numbered partial sums is an increasing
sequence that is bounded above by $s_1$ implying that that sequence converges
to a limit $L_2$.

Then $L_1 - L_2 = \lim_{n\to\infty} (s_{2n+1} - s_{2n}) = \lim_{n\to\infty} a_{2n+1} = 0$ showing that $L_1 = L_2$

and that the odd numbered partial sums and the even numbered partial sums

both converge to the same limit.

Therefore, the sequence of partial sums converges and the series converges.

This proof not only says that the given alternating series converges; it gives a way to estimate the limit of the series. For any series that satisfies the hypothesis of the theorem, any two adjacent partial sums, $s_n$ and $s_{n+1}$, are on opposite sides of the limit $L$ of the series. Thus, the distance that $s_n$ is from the limit of the series is less than the distance $s_n$ is from $s_{n+1}$, and that distance is just $|a_{n+1}|$. Therefore, it is easy to remember that for these series, the distance that a partial sum is from the limit of the series is no more than the first term that is not part of the sum, $|a_{n+1}|$. Note that the Alternating Series Test for convergence and this limit estimate apply to series without regard to whether the series is absolutely convergent or conditionally convergent.

For example, the number $\frac{1}{e} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots$. This is an absolutely convergent series as seen by the ratio test. But it is also a series whose terms alternate in sign, and the absolute values of the terms decrease monotonically to 0. Thus, the partial sum of the series $\frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}$ is already within $\frac{1}{100}$ of $\frac{1}{e}$ because the first neglected term is $\frac{1}{5!} = \frac{1}{120}$. This technique gives an easy proof that the number $e$ is irrational. It goes like this: If $e$ were rational, then it could be expressed as $\frac{p}{q}$, where $p$ and $q$ are positive integers. Then $\frac{1}{e} = \frac{q}{p} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots$. Multiplying both sides of this equation by $p!$ yields

$$q(p-1)! = p! - p! + \frac{p!}{2!} - \frac{p!}{3!} + \cdots \pm 1 \mp \frac{1}{p+1} \pm \frac{1}{(p+1)(p+2)} \mp \cdots.$$

Thus, the integer $q(p-1)!$ would be an integer plus (or minus) $\frac{1}{p+1} - \frac{1}{(p+1)(p+2)} + \cdots$. But this infinite series is an alternating series where the absolute values of the terms decrease to 0, so its value is between $\frac{1}{p+1} - \frac{1}{(p+1)(p+2)}$ and $\frac{1}{p+1}$. Both of those values lie strictly between 0 and 1, so two integers would have to differ by an amount strictly between 0 and 1, something clearly not possible. This is a contradiction, so the assumption that $e$ is rational must be false.
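The error estimate above can be checked numerically. The following is a minimal sketch, not part of the original text; the function names `alternating_partial_sum` and `first_neglected_term` are illustrative choices. It compares the actual distance from each partial sum of $\frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \cdots$ to $\frac{1}{e}$ with the size of the first neglected term.

```python
import math


def alternating_partial_sum(num_terms):
    """Sum of the first num_terms terms of 1/0! - 1/1! + 1/2! - 1/3! + ..."""
    return sum((-1) ** n / math.factorial(n) for n in range(num_terms))


def first_neglected_term(num_terms):
    """Absolute value of the first term that is not part of the partial sum."""
    return 1 / math.factorial(num_terms)


for num_terms in range(1, 9):
    s = alternating_partial_sum(num_terms)
    error = abs(s - 1 / math.e)
    bound = first_neglected_term(num_terms)
    # The Alternating Series Test estimate says error <= bound.
    print(f"{num_terms} terms: sum={s:.6f}  error={error:.2e}  bound={bound:.2e}")
```

With five terms the bound is $\frac{1}{5!} = \frac{1}{120}$, which matches the estimate quoted in the text.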

7.5.1 Exercises

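Determine which of the following series are conditionally convergent, absolutely convergent, or divergent.

1. $1 - \frac{1}{\ln 2} + \frac{1}{\ln 3} - \frac{1}{\ln 4} + \cdots$

2. $1 - \frac{1}{2 \ln 2} + \frac{1}{3 \ln 3} - \frac{1}{4 \ln 4} + \cdots$

3. $1 - \frac{1}{2 (\ln 2)^2} + \frac{1}{3 (\ln 3)^2} - \frac{1}{4 (\ln 4)^2} + \cdots$

4. $1 + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{11}} - \frac{1}{\sqrt{6}} + \cdots$

5. $1 + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{5}} + \frac{1}{\sqrt{7}} - \frac{1}{\sqrt{8}} + \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{11}} - \frac{1}{\sqrt{12}} + \cdots$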

Recall that the p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges when $p > 1$ and diverges when $p \le 1$. This raises a natural question about whether there is, in some sense, a largest series that converges, or, perhaps a smallest series that diverges. If there were, that might provide a good series to use in the Comparison Test because all series smaller would converge, and all series larger would diverge. This turns out not to be the case. For every series of positive terms, $\sum_{n=1}^{\infty} a_n$, that diverges, there is a sequence of positive terms $b_n$ converging to 0 such that $\sum_{n=1}^{\infty} a_n b_n$ also diverges. In particular, one can take $b_n = \frac{1}{s_n}$ where $s_n$ is the $n$th partial sum of $\sum_{n=1}^{\infty} a_n$. Clearly, if the series diverges, $\frac{1}{s_n}$ goes to 0.

To prove this result you would begin with a divergent series with positive terms, $\sum_{n=1}^{\infty} a_n$. Because the series is divergent, you know that the sequence of partial sums must diverge to infinity. The strategy is to show that the partial sums of the new series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ are not Cauchy. In particular, for every integer $m$, there is an integer $k$ such that $\sum_{n=m+1}^{k} \frac{a_n}{s_n} > \frac{1}{2}$, showing that the $m$th and $k$th partial sums differ by at least $\frac{1}{2}$. Suppose you are given a positive integer $m$. Since the original series diverges, there is a positive integer $k$ such that $s_k > 2 s_m$. Then the difference between the $k$th and the $m$th partial sums of the new series is

$$\sum_{n=m+1}^{k} \frac{a_n}{s_n} \ge \sum_{n=m+1}^{k} \frac{a_n}{s_k} = \frac{s_k - s_m}{s_k} > 1 - \frac{1}{2} = \frac{1}{2}.$$

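PROOF: Let $\sum_{n=1}^{\infty} a_n$ be a divergent series of positive terms with partial sums $s_k = \sum_{n=1}^{k} a_n$. Then $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ also diverges.

Assume that $\sum_{n=1}^{\infty} a_n$ is a divergent series of positive terms with partial sums $s_k = \sum_{n=1}^{k} a_n$.

Let $m$ be a positive integer.

Since the partial sums $s_n$ diverge to infinity, there is a positive integer $k$ such that $s_k > 2 s_m$.

Then the difference between the $m$th and $k$th partial sums of the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ is

$$\sum_{n=m+1}^{k} \frac{a_n}{s_n} \ge \sum_{n=m+1}^{k} \frac{a_n}{s_k} = \frac{s_k - s_m}{s_k} > 1 - \frac{1}{2} = \frac{1}{2}.$$

Thus, the sequence of partial sums of $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ is not a Cauchy sequence.

Therefore, the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ diverges.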

For example, the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges. The Integral Test suggests that the $k$th partial sum of this series is close to $\ln k$. If for all $n > 1$, the $n$th term, $\frac{1}{n}$, of the harmonic series is divided by $\ln n$, the resulting series is $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$. The Integral Test shows that this series also diverges since the integral $\int_2^{\infty} \frac{1}{x \ln x}\,dx = \ln(\ln x)\Big|_2^{\infty} = \infty$ diverges.
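The non-Cauchy argument can be tried out on the harmonic series. The sketch below is an illustration only, under the assumption that $a_n = \frac{1}{n}$ and that the first 5000 terms are enough for the values of $m$ used; the helper name `block_sum` is hypothetical. For a given $m$ it finds the least $k$ with $s_k > 2 s_m$ and adds up $\frac{a_n}{s_n}$ for $m < n \le k$.

```python
from itertools import accumulate

N = 5000
a = [1.0 / n for n in range(1, N + 1)]   # a[i] holds the term a_{i+1} = 1/(i+1)
s = list(accumulate(a))                  # s[i] holds the partial sum s_{i+1}


def block_sum(m):
    """Return (k, sum of a_n/s_n for m < n <= k), with k least such that s_k > 2*s_m."""
    k = next(i + 1 for i in range(m, N) if s[i] > 2 * s[m - 1])
    total = sum(a[i] / s[i] for i in range(m, k))
    return k, total


for m in (1, 2, 3, 5, 10, 20):
    k, total = block_sum(m)
    # The argument in the text predicts that every such block sum exceeds 1/2.
    print(f"m={m:2d}  k={k:4d}  block sum={total:.4f}")
```

Each block adds more than $\frac{1}{2}$ no matter how far out it starts, which is why the partial sums of $\sum \frac{a_n}{s_n}$ cannot form a Cauchy sequence.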

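It is interesting to note that even though for positive termed divergent series $\sum_{n=1}^{\infty} a_n$, the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n}$ also diverges, for positive termed series $\sum_{n=1}^{\infty} a_n$, the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n^2}$ always converges. To see this, note that for $n > 1$ the term

$$\frac{a_n}{s_n^2} = \frac{s_n - s_{n-1}}{s_n^2} < \frac{s_n - s_{n-1}}{s_n s_{n-1}} = \frac{1}{s_{n-1}} - \frac{1}{s_n}.$$

Thus, the terms of the series $\sum_{n=1}^{\infty} \frac{a_n}{s_n^2}$ are less than the terms of a convergent telescoping series $\sum_{n=2}^{\infty} \left( \frac{1}{s_{n-1}} - \frac{1}{s_n} \right) = \frac{1}{s_1} - \lim_{n\to\infty} \frac{1}{s_n}$. Whether the original series diverges, so that $\frac{1}{s_n}$ goes to 0, or converges, so that $s_n$ goes to $L$, the telescoping series converges.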
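The telescoping comparison can also be checked numerically. This is a small sketch under the assumption that the terms are those of the harmonic series; the function name `compare_with_telescoping_bound` is illustrative. It accumulates $\frac{a_n}{s_n^2}$ for $n \ge 2$ alongside the telescoped bound $\frac{1}{s_1} - \frac{1}{s_n}$.

```python
from itertools import accumulate


def compare_with_telescoping_bound(terms):
    """Pair each partial sum of a_n/s_n**2 (n >= 2) with the bound 1/s_1 - 1/s_n."""
    s = list(accumulate(terms))
    total = 0.0
    rows = []
    for i in range(1, len(terms)):        # index i corresponds to term number n = i + 1
        total += terms[i] / s[i] ** 2
        rows.append((total, 1 / s[0] - 1 / s[i]))
    return rows


harmonic = [1.0 / n for n in range(1, 2001)]
rows = compare_with_telescoping_bound(harmonic)
for n in (2, 10, 100, 2000):
    total, bound = rows[n - 2]
    print(f"n={n:4d}  partial sum={total:.5f}  telescoping bound={bound:.5f}")
```

The partial sums stay below the bound, and the bound itself never exceeds $\frac{1}{s_1}$, so the series of $\frac{a_n}{s_n^2}$ terms is bounded and increasing.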

7.7.1 Addition of Parentheses

The series $1 - 1 + 1 - 1 + 1 - 1 + \cdots$ does not converge. Yet, if you insert parentheses to group some of the terms together, it can result in a convergent series such as $(1 - 1) + (1 - 1) + (1 - 1) + \cdots$ which converges to 0 or $1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots$ which converges to 1. So, inserting parentheses can turn a divergent series into a convergent series. Equivalently, removing parentheses from a convergent series can turn it into a divergent series. What if the series $\sum_{n=1}^{\infty} a_n$ converges? Can inserting parentheses change whether or not it converges or change the limit to which the series converges? The answer to this is no. The point is, if $\sum_{n=1}^{\infty} a_n$ converges, it means that its sequence of partial sums converges. By inserting parentheses into the series, you are just removing some of the terms in the sequence of partial sums. You end up with a new series whose sequence of partial sums is a subsequence of the sequence of partial sums of $\sum_{n=1}^{\infty} a_n$, and any such subsequence will converge to the same limit.

Slightly more can be said. Suppose $\sum_{n=1}^{\infty} a_n$ is a series whose terms approach 0. If parentheses are inserted in such a way that the number of terms contained within each set of parentheses is bounded, then the insertion of parentheses cannot affect whether the series converges or the limit to which the series converges. To see this assume that each set of parentheses encloses at most $K$ terms. If $\sum_{n=1}^{\infty} a_n$ converges, then, as already discussed, inserting parentheses does not change the convergence or the limit of the series. So suppose that the series $\sum_{n=1}^{\infty} a_n$ diverges, and that its partial sums are $s_k = \sum_{n=1}^{k} a_n$. Because $\lim_{n\to\infty} a_n = 0$, for each $\epsilon > 0$, there is an $N$ such that for all $n > N$, the size of the terms $|a_n|$ must be less than $\frac{\epsilon}{K}$. Suppose that for some $m > N$ one term of the series with parentheses added is $(a_{m+1} + a_{m+2} + a_{m+3} + \cdots + a_{m+k})$. Then $s_m$ and $s_{m+k}$ are both partial sums for the series with parentheses added. For any $j = 1, 2, 3, \ldots, k$, $|a_{m+1} + a_{m+2} + a_{m+3} + \cdots + a_{m+j}| < \frac{\epsilon}{K} j \le \epsilon$, showing that for any of those $j$, $|s_{m+j} - s_m| < \epsilon$. The sequence of partial sums for the original series does not converge either because the sequence is unbounded or because its lim sup and lim inf approach distinct values. Because the sequence of partial sums for the original series remains within $\epsilon$ of the subsequence corresponding to the series with parentheses added, the subsequence must also either be unbounded or have distinct lim sup and lim inf values. Thus, the series with parentheses added cannot converge.

This observation can be very helpful. Consider again the series $1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots$. This series is not absolutely convergent, and it does not satisfy the hypothesis of the Alternating Series Test. Yet, if parentheses are inserted to group each set of three terms: $\left(1 + \frac{1}{3} - \frac{1}{2}\right) + \left(\frac{1}{5} + \frac{1}{7} - \frac{1}{4}\right) + \left(\frac{1}{9} + \frac{1}{11} - \frac{1}{6}\right) + \cdots$, one gets a general term equal to $\frac{1}{2n-3} + \frac{1}{2n-1} - \frac{1}{n} = \frac{4n-3}{n(2n-1)(2n-3)}$, where $n = 2, 4, 6, \ldots$ indexes the groups. The series with terms $\frac{4n-3}{n(2n-1)(2n-3)}$ converges absolutely as can be seen by comparing it to the p-series with $p = 2$ since, for $n \ge 3$, one has $\frac{4n-3}{n(2n-1)(2n-3)} < \frac{4n}{n(2n-n)(2n-n)} = \frac{4}{n^2}$. So, the series with parentheses added converges, and since each set of parentheses contains a maximum of three terms, and the terms of the original series approach 0, this means that the original series converges.
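The algebra behind the grouped term can be verified with exact rational arithmetic. The following sketch is illustrative only; the names `grouped_term` and `closed_form` are not from the text. It assumes the reading that the $k$th group of three terms corresponds to $n = 2k$ in the formula $\frac{1}{2n-3} + \frac{1}{2n-1} - \frac{1}{n}$.

```python
from fractions import Fraction


def grouped_term(n):
    """The three grouped terms 1/(2n-3) + 1/(2n-1) - 1/n, as an exact fraction."""
    return Fraction(1, 2 * n - 3) + Fraction(1, 2 * n - 1) - Fraction(1, n)


def closed_form(n):
    """The single fraction (4n-3) / (n(2n-1)(2n-3))."""
    return Fraction(4 * n - 3, n * (2 * n - 1) * (2 * n - 3))


# The first three groups of the series: n = 2, 4, 6.
first_groups = [
    Fraction(1) + Fraction(1, 3) - Fraction(1, 2),
    Fraction(1, 5) + Fraction(1, 7) - Fraction(1, 4),
    Fraction(1, 9) + Fraction(1, 11) - Fraction(1, 6),
]

for k, group in enumerate(first_groups, start=1):
    n = 2 * k
    print(n, group, grouped_term(n), closed_form(n), closed_form(n) < Fraction(4, n * n))
```

Because every value is a `Fraction`, equality here is exact rather than a floating-point approximation.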

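Of course, if the number of terms enclosed by sets of parentheses is not bounded, one cannot draw the same type of conclusions. The series $(1) + \left(\frac{1}{2} + \frac{1}{2} - \frac{1}{2} - \frac{1}{2}\right) + \left(\frac{1}{3} + \frac{1}{3} + \frac{1}{3} - \frac{1}{3} - \frac{1}{3} - \frac{1}{3}\right) + \left(\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4}\right) + \cdots$ converges, but if parentheses are removed, the series diverges even though its terms do approach 0. The partial sums oscillate between 1 and 2.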

It has already been shown that the terms of the series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$ can be rearranged as $1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots$ to get a series that converges to a different limit. This is typical for a conditionally convergent series. In fact, if $\sum_{n=1}^{\infty} a_n$ is a conditionally convergent series, and $L$ is any real number, then there is a rearrangement of the terms of the series that converges to $L$. To see this, first note that a conditionally convergent series must have both positive and negative terms. Define two new series so that for each $n$, $b_n = a_n$ if $a_n \ge 0$ and $b_n = 0$ if $a_n < 0$, and $c_n = b_n - a_n$. Then, for each $n$, both $b_n \ge 0$ and $c_n \ge 0$, and $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (b_n - c_n)$. Since $\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} (b_n + c_n)$, it must be that both $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ diverge to infinity.

Suppose you are given a target limit $L$. You have isolated the positive terms of the series, the $b_n$ terms, and the negative terms of the series, the $c_n$ terms, so you can play a cute game by taking a few $b_n$ terms such that the sum of those terms exceeds $L$, and then subtract off a few $c_n$ terms until the sum decreases below $L$. You can then add on more $b_n$ terms to make the sum again exceed $L$, and subtract off a few $c_n$ terms until the sum decreases below $L$. Thus, by alternating between adding on $b_n$ terms and subtracting off $c_n$ terms, you can arrange for the resulting series to have limit $L$. More precisely, construct a new series inductively as follows: select $u_1$ so that $\sum_{n=1}^{u_1} b_n > L$. This is always possible because the series with $b_n$ terms diverges to infinity. Then select $v_1$ so that $\sum_{n=1}^{u_1} b_n - \sum_{n=1}^{v_1} c_n < L$. Next select $u_2$ and $v_2$ so that $\sum_{n=1}^{u_2} b_n - \sum_{n=1}^{v_1} c_n > L$, and $\sum_{n=1}^{u_2} b_n - \sum_{n=1}^{v_2} c_n < L$. In general, having selected $u_k$ and $v_k$, select $u_{k+1}$ and $v_{k+1}$ to be the least positive integers such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$, and $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_{k+1}} c_n < L$. It is then the case that the series

$$b_1 + b_2 + \cdots + b_{u_1} - c_1 - c_2 - \cdots - c_{v_1} + b_{u_1+1} + \cdots + b_{u_2} - c_{v_1+1} - c_{v_1+2} - c_{v_1+3} - \cdots - c_{v_2} + \cdots$$

is a rearrangement of the terms of the original series with some extra 0 terms added. Since the terms of the series approach 0, the partial sums of the series approach $L$. This provides the desired rearrangement (Fig. 7.3).
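The construction can be sketched in code for one concrete conditionally convergent series. The example below is an illustration, not the book's own program: it assumes the series is $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$, skips the extra 0 terms, and uses the hypothetical function name `rearrange_alternating_harmonic`. It adds positive terms until the running sum exceeds the target, then subtracts negative terms until the sum drops below the target, and repeats.

```python
from itertools import count


def rearrange_alternating_harmonic(target, num_terms):
    """First num_terms terms of a rearrangement of 1 - 1/2 + 1/3 - ... aimed at target."""
    positives = (1.0 / n for n in count(1, 2))    # the b terms: 1, 1/3, 1/5, ...
    negatives = (-1.0 / n for n in count(2, 2))   # the -c terms: -1/2, -1/4, ...
    terms, total, adding = [], 0.0, True
    while len(terms) < num_terms:
        term = next(positives) if adding else next(negatives)
        terms.append(term)
        total += term
        if adding and total > target:
            adding = False      # the sum just exceeded the target: start subtracting
        elif not adding and total < target:
            adding = True       # the sum just dropped below the target: start adding
    return terms


terms = rearrange_alternating_harmonic(1.5, 20000)
print(terms[:8])
print(sum(terms))
```

Each term of the original series is used at most once, in its original relative order within the positive terms and within the negative terms, so the output is the beginning of a genuine rearrangement.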

[Fig. 7.3: diagram of the terms $b_1, b_2, \ldots, b_9$ and $c_1, c_2, \ldots, c_8$ used in the rearrangement]

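PROOF: Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series, and let $L$ be a real number. Then there is a rearrangement of the series which converges to $L$.

Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series, and let $L$ be a real number. For each $n$, let $b_n = a_n$ if $a_n \ge 0$ and $b_n = 0$ if $a_n < 0$, and let $c_n = b_n - a_n$.

Thus, for each $n$, $b_n \ge 0$, $c_n \ge 0$, and $a_n = b_n - c_n$.

Because $\sum_{n=1}^{\infty} a_n$ is conditionally convergent, $\sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} b_n + \sum_{n=1}^{\infty} c_n$ diverges, and because $\sum_{n=1}^{\infty} (b_n - c_n) = \sum_{n=1}^{\infty} a_n$ converges, both $\sum_{n=1}^{\infty} b_n$ and $\sum_{n=1}^{\infty} c_n$ diverge to infinity.

The Limit of Terms Test shows that $\lim_{n\to\infty} a_n = 0$ and, thus, $\lim_{n\to\infty} b_n = \lim_{n\to\infty} c_n = 0$.

Because $\sum_{n=1}^{\infty} b_n$ is unbounded, there is a least positive integer, $u_1$, such that $\sum_{n=1}^{u_1} b_n > L$.

Because $\sum_{n=1}^{\infty} c_n$ is unbounded, there is a least positive integer, $v_1$, such that $\sum_{n=1}^{u_1} b_n - \sum_{n=1}^{v_1} c_n < L$.

Having selected $u_k$ and $v_k$ for some $k \ge 1$, let $u_{k+1} > u_k$ be the least positive integer such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$. Then let $v_{k+1} > v_k$ be the least positive integer such that $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_{k+1}} c_n < L$.

Thus, by induction, sequences $\langle u_k \rangle$ and $\langle v_k \rangle$ can be constructed so that for each $k$, $\sum_{n=1}^{u_k} b_n - \sum_{n=1}^{v_k} c_n < L$ and $\sum_{n=1}^{u_{k+1}} b_n - \sum_{n=1}^{v_k} c_n > L$.

Let $\sum_{n=1}^{\infty} d_n$ be given by the terms $b_1, b_2, b_3, \ldots, b_{u_1}$ followed by the terms $-c_1, -c_2, -c_3, \ldots, -c_{v_1}$, followed by the terms $b_{u_1+1}, b_{u_1+2}, b_{u_1+3}, \ldots, b_{u_2}$ followed by the terms $-c_{v_1+1}, -c_{v_1+2}, -c_{v_1+3}, \ldots, -c_{v_2}$, and so forth, alternating between the sequence of $b_n$ terms for $u_k < n \le u_{k+1}$ and the sequence of $-c_n$ terms for $v_k < n \le v_{k+1}$.

Then $\sum_{n=1}^{\infty} d_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_n$ with some extra 0 terms added.

Given $\epsilon > 0$ there is an $N_1 \ge u_1$ such that if $n > N_1$, then $b_n < \epsilon$, and there is an $N_2$ such that if $n > N_2$, then $c_n < \epsilon$.

Then there is a $k_1$ such that $u_{k_1} > N_1$ and a $k_2$ such that $v_{k_2} > N_2$.

Let $k = \max(k_1, k_2)$, and let $N = u_k + v_k$.

Then for all $m > N$, either there is an $r$ such that $d_m = b_p$ for some $p$ with $N_1 < u_r < p \le u_{r+1}$ or there is an $s$ such that $d_m = -c_q$ for some $q$ with $N_2 < v_s < q \le v_{s+1}$. Thus, $\left| \sum_{n=1}^{m} d_n - L \right|$ is bounded by either $\max\left(c_{v_r}, b_{u_{r+1}}\right) < \epsilon$ or by $\max\left(b_{u_s}, c_{v_s}\right) < \epsilon$.

This shows that $\sum_{n=1}^{\infty} d_n$ converges to $L$ implying that a rearrangement of the series $\sum_{n=1}^{\infty} a_n$ converges to $L$ as claimed.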

This theorem takes care of the case of conditionally convergent series, but what happens when terms of an absolutely convergent series are rearranged? The answer is that nothing happens; that is, every rearrangement of an absolutely convergent series converges to the same limit. Suppose, for example, the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent with rearrangement $\sum_{n=1}^{\infty} b_n$. Because $\sum_{n=1}^{\infty} |a_n|$ converges, for each $\epsilon > 0$ there is an integer $N$ such that $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$. Because $\sum_{n=1}^{\infty} b_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_n$, there is an integer $K$ such that all the terms $a_1, a_2, a_3, \ldots, a_N$ are among the terms $b_1, b_2, b_3, \ldots, b_K$. So, if $k \ge K$, by how much can $\sum_{n=1}^{k} a_n$ and $\sum_{n=1}^{k} b_n$ differ? Both sums contain the terms $a_1, a_2, a_3, \ldots, a_N$, so the two sums differ only by a finite number of the terms $a_{N+1}, a_{N+2}, a_{N+3}, \ldots$ which add to at most $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$. This shows that the series and its rearrangement have partial sums within $\epsilon$ of each other and completes the argument.

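PROOF: Let $\sum_{n=1}^{\infty} a_n$ be an absolutely convergent series with rearrangement $\sum_{n=1}^{\infty} b_n$. Then $\sum_{n=1}^{\infty} b_n$ converges to the same limit as $\sum_{n=1}^{\infty} a_n$.

Let $\sum_{n=1}^{\infty} a_n$ be an absolutely convergent series which converges to $L$.

Let $\sum_{n=1}^{\infty} b_n$ be a rearrangement of $\sum_{n=1}^{\infty} a_n$.

Let $\epsilon > 0$ be given.

Since $\sum_{n=1}^{\infty} |a_n|$ converges, there is an integer $N$ such that $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$.

Since $\sum_{n=1}^{\infty} b_n$ is a rearrangement of $\sum_{n=1}^{\infty} a_n$, there is an integer $K \ge N$ such that all the terms $a_1, a_2, a_3, \ldots, a_N$ are among the terms $b_1, b_2, b_3, \ldots, b_K$.

For $k \ge K$, the difference between the $k$th partial sums of the two series is $\sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n$. This difference is a sum of the terms in $\sum_{n=1}^{k} a_n$ that are not in $\sum_{n=1}^{k} b_n$ minus a sum of the terms in $\sum_{n=1}^{k} b_n$ that are not in $\sum_{n=1}^{k} a_n$. Neither of these sums contains any of the terms $a_1, a_2, a_3, \ldots, a_N$, nor are there any terms that appear in both sums. It follows that the difference of partial sums equals a sum minus another sum where each sum contains distinct terms from $a_{N+1}, a_{N+2}, a_{N+3}, \ldots$. Thus, $\left| \sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n \right|$ is bounded above by $\sum_{n=N+1}^{\infty} |a_n| < \epsilon$.

Thus, given $\epsilon > 0$, there is a $K$ such that for all $k \ge K$, the $k$th partial sum of $\sum_{n=1}^{\infty} a_n$ and the $k$th partial sum of $\sum_{n=1}^{\infty} b_n$ are within $\epsilon$ of each other.

Because $\sum_{n=1}^{\infty} a_n$ converges to $L$, there is an $N_1$ such that if $k \ge N_1$, $\left| \sum_{n=1}^{k} a_n - L \right| < \frac{\epsilon}{2}$.

Also, there is an $N_2$ such that if $k \ge N_2$, $\left| \sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n \right| < \frac{\epsilon}{2}$.

Then for $k \ge \max(N_1, N_2)$,

$$\left| \sum_{n=1}^{k} b_n - L \right| \le \left| \sum_{n=1}^{k} b_n - \sum_{n=1}^{k} a_n \right| + \left| \sum_{n=1}^{k} a_n - L \right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

Therefore, $\sum_{n=1}^{\infty} b_n$ converges to $L$, the same limit as $\sum_{n=1}^{\infty} a_n$.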

7.7.3 Exercises

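1. In which of the following series can the parentheses be removed without affecting the convergence of the series?

(a) $\left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} + \frac{1}{2} - \frac{1}{3} - \frac{1}{3}\right) + \left(\frac{1}{3} + \frac{1}{3} + \frac{1}{3} - \frac{1}{4} - \frac{1}{4} - \frac{1}{4}\right) + \left(\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} - \frac{1}{5} - \frac{1}{5} - \frac{1}{5} - \frac{1}{5}\right) + \cdots$

(b) $\left(\frac{1}{2} - \frac{1}{4}\right) + \left(\frac{1}{6} - \frac{1}{8}\right) + \left(\frac{1}{10} - \frac{1}{12}\right) + \cdots$

(c) $\left(\frac{1}{2} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{2}\right) + \cdots$

(d) $\left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{4} + \frac{1}{5} - \frac{1}{6} - \frac{1}{7}\right) + \left(\frac{1}{8} + \frac{1}{9} + \frac{1}{10} + \frac{1}{11} - \frac{1}{12} - \frac{1}{13} - \frac{1}{14} - \frac{1}{15}\right) + \cdots$

(e) $(1) + \left(1 - \frac{1}{2}\right) + \left(1 - \frac{1}{2} - \frac{1}{4}\right) + \left(1 - \frac{1}{2} - \frac{1}{4} - \frac{1}{8}\right) + \left(1 - \frac{1}{2} - \frac{1}{4} - \frac{1}{8} - \frac{1}{16}\right) + \cdots$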

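2. Write a proof to show that if $\sum_{n=1}^{\infty} a_n$ is a conditionally convergent series, then there is a rearrangement of the terms of the series that diverges to infinity and a rearrangement that diverges to negative infinity.

3. Write a proof to show that if $a_1$, $a_2$, and $a_3$ are real numbers, the series $\frac{a_1}{1} + \frac{a_2}{2} + \frac{a_3}{3} + \frac{a_1}{4} + \frac{a_2}{5} + \frac{a_3}{6} + \cdots$ converges if and only if $a_1 + a_2 + a_3 = 0$.

4. Write a proof to show that if $\sum_{n=1}^{\infty} a_n$ is an absolutely convergent series, and $\sum_{n=1}^{\infty} b_n$ is a convergent series, then $\sum_{n=1}^{\infty} a_n b_n$ converges.

5. Give an example of two convergent series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ where $\sum_{n=1}^{\infty} a_n b_n$ diverges.

6. Using the method described in this section find the first 20 terms of the rearrangement of the series $1 - 1 + \frac{1}{2} - \frac{1}{2} + \frac{1}{3} - \frac{1}{3} + \frac{1}{4} - \frac{1}{4} + \cdots$ that converges to 1.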

Earlier it was shown that if $\sum_{n=1}^{\infty} a_n$ converges to $L$ and $\sum_{n=1}^{\infty} b_n$ converges to $M$, then $\sum_{n=1}^{\infty} (a_n + b_n)$ converges to $L + M$, the sum of the series. What about the product of the series $\left(\sum_{n=1}^{\infty} a_n\right)\left(\sum_{n=1}^{\infty} b_n\right)$? First of all, can this product even be written as a series? One candidate is the double sum $\sum_{n=1}^{\infty} \sum_{p=1}^{\infty} a_n b_p$, and some sense can be made out of this expression. The notation suggests that for each $n$, one would calculate a limit of $\sum_{p=1}^{\infty} a_n b_p$, and then one would consider the series of those limits. This raises interesting questions about whether that limit, if it should exist, has anything to do with the similar looking $\sum_{p=1}^{\infty} \sum_{n=1}^{\infty} a_n b_p$. In fact, as seen in the exercises, there are examples where interchanging the order of summation in a double summation can result in a different limit.

A simpler approach is to group the terms of the product $\left(\sum_{n=1}^{\infty} a_n\right)\left(\sum_{n=1}^{\infty} b_n\right)$ in a way that might allow you to calculate the sum. One strategy is to group the terms $a_n b_p$ where $n + p$ is a given constant. For example, when the constant is 2, there is only one term $a_1 b_1$. When the constant is 3, there are two terms $a_1 b_2 + a_2 b_1$. In general, the grouping of the terms whose subscripts add to $n$ is $\sum_{p=1}^{n-1} a_p b_{n-p}$. This gives what is known as the Cauchy product of the two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$:

$$\sum_{n=2}^{\infty} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right).$$

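For example, what is the Cauchy product for the square of the geometric series $\sum_{n=1}^{\infty} \frac{1}{2^n}$? Here you have two identical series where $a_n = b_n = \frac{1}{2^n}$, so

$$\sum_{p=1}^{n-1} a_p b_{n-p} = \sum_{p=1}^{n-1} \frac{1}{2^p} \cdot \frac{1}{2^{n-p}} = \sum_{p=1}^{n-1} \frac{1}{2^n} = \frac{n-1}{2^n}.$$

Thus, the Cauchy product is $\sum_{n=2}^{\infty} \frac{n-1}{2^n}$. To find its value, let

$$S = \sum_{n=2}^{\infty} \frac{n-1}{2^n}.$$

Then

$$2S = \sum_{n=2}^{\infty} \frac{n-1}{2^{n-1}} = \sum_{n=1}^{\infty} \frac{n}{2^n},$$

$$2S - S = \frac{1}{2} + \sum_{n=2}^{\infty} \frac{1}{2^n} = 1,$$

$$S = 1.$$

This Cauchy product converges to 1 which is the expected limit since $\sum_{n=1}^{\infty} \frac{1}{2^n} = 1$.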
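The computation above can be reproduced with exact fractions. This is a minimal sketch; the helper names `cauchy_product_term`, `cauchy_partial_sum`, and `geometric` are illustrative and not from the text. It builds each Cauchy product term $\sum_{p=1}^{n-1} a_p b_{n-p}$ directly from the definition and then sums the terms from $n = 2$ to $N$.

```python
from fractions import Fraction


def cauchy_product_term(a, b, n):
    """The n-th Cauchy product term (n >= 2): sum of a(p) * b(n - p) for 1 <= p <= n - 1."""
    return sum(a(p) * b(n - p) for p in range(1, n))


def cauchy_partial_sum(a, b, N):
    """Sum of the Cauchy product terms for n = 2, ..., N."""
    return sum(cauchy_product_term(a, b, n) for n in range(2, N + 1))


def geometric(n):
    """Term 1/2**n of the geometric series, as an exact fraction."""
    return Fraction(1, 2 ** n)


for n in range(2, 7):
    print(n, cauchy_product_term(geometric, geometric, n))   # each equals (n-1)/2**n

for N in (5, 10, 20, 30):
    print(N, float(cauchy_partial_sum(geometric, geometric, N)))
```

The printed partial sums increase toward 1, in agreement with $S = 1$.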

But Cauchy products do not always behave so nicely. For example, find the Cauchy product of the two series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ and $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n+4}}$. The Alternating Series Test shows that both of these series converge, but the Integral Test shows that neither converges absolutely. The $n$th term of the Cauchy product of these two series is

$$\sum_{p=1}^{n-1} \frac{(-1)^p}{\sqrt{p}} \cdot \frac{(-1)^{n-p}}{\sqrt{n-p+4}} = (-1)^n \sum_{p=1}^{n-1} \frac{1}{\sqrt{p(n-p+4)}}.$$

For even values of $n$, this is a sum of $n-1$ positive terms. The product $p(n-p+4)$ is largest when $p = \frac{n+4}{2}$, so each term satisfies

$$\frac{1}{\sqrt{p(n-p+4)}} \ge \frac{1}{\sqrt{\left(\frac{n+4}{2}\right)^2}} = \frac{2}{n+4}.$$

This means that the sum is greater than or equal to $\frac{2(n-1)}{n+4}$ which approaches 2 as $n$ gets large. Thus, the terms of the Cauchy product do not approach 0 as $n$ goes to infinity, and the Limit of Terms Test shows that the Cauchy product does not converge.
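A quick numerical check makes the failure visible. The sketch below is illustrative; the names `a`, `b`, and `cauchy_term` are assumptions for this example. It computes the $n$th Cauchy product term of $\sum \frac{(-1)^n}{\sqrt{n}}$ and $\sum \frac{(-1)^n}{\sqrt{n+4}}$ and compares its size with the lower bound $\frac{2(n-1)}{n+4}$, which follows because $p(n-p+4) \le \left(\frac{n+4}{2}\right)^2$ for every $p$.

```python
import math


def a(n):
    """Term (-1)**n / sqrt(n) of the first series."""
    return (-1) ** n / math.sqrt(n)


def b(n):
    """Term (-1)**n / sqrt(n + 4) of the second series."""
    return (-1) ** n / math.sqrt(n + 4)


def cauchy_term(n):
    """The n-th Cauchy product term: sum of a(p) * b(n - p) for 1 <= p <= n - 1."""
    return sum(a(p) * b(n - p) for p in range(1, n))


for n in (2, 10, 100, 1000):
    term = cauchy_term(n)
    lower_bound = 2 * (n - 1) / (n + 4)
    print(f"n={n:4d}  term={term:+.4f}  |term| >= {lower_bound:.4f}")
```

The terms alternate in sign but their sizes stay bounded away from 0, so the Cauchy product cannot converge.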

This last example shows what can go wrong with the Cauchy product of two conditionally convergent series, but the results are better when at least one of the series is absolutely convergent. For example, if both series are absolutely convergent, then the Cauchy product is absolutely convergent to the product of the series. To see why this is, just consider the difference between a partial sum of the Cauchy product of the two series and the product of two partial sums of the individual series. That is, let $k_1$ and $k_2$ be positive integers, and find the difference between the $(k_1 + k_2)$th partial sum of the Cauchy product, $\sum_{n=2}^{k_1+k_2} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right)$, and the product of the $k_1$th and $k_2$th partial sums, respectively, of the two series, $\left( \sum_{m=1}^{k_1} a_m \right) \left( \sum_{n=1}^{k_2} b_n \right)$. These are both just finite sums where the Cauchy product partial sum includes all the terms $a_m b_n$ where the subscripts $m + n$ add to something less than or equal to $k_1 + k_2$ and the other sum includes all the terms $a_m b_n$ where $m \le k_1$ and $n \le k_2$. Thus, the difference is the sum of the remaining terms

$$\sum_{m=k_1+1}^{k_1+k_2-1} a_m \sum_{n=1}^{k_1+k_2-m} b_n + \sum_{n=k_2+1}^{k_1+k_2-1} b_n \sum_{m=1}^{k_1+k_2-n} a_m.$$

So by choosing $k_1$ and $k_2$ large, the sums $\sum_{m=k_1+1}^{k_1+k_2-1} |a_m|$ and $\sum_{n=k_2+1}^{k_1+k_2-1} |b_n|$ can be made small, which gives the necessary convergence.

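PROOF: If $\sum_{m=1}^{\infty} a_m$ and $\sum_{n=1}^{\infty} b_n$ are absolutely convergent series, then the Cauchy product of the two series converges absolutely to the product of the two series.

Let $\sum_{m=1}^{\infty} a_m$ and $\sum_{n=1}^{\infty} b_n$ be absolutely convergent series, and let $\epsilon > 0$ be given.

Because $\sum_{m=1}^{\infty} a_m$ converges absolutely, there exists an integer $N_1$ such that

$$\sum_{m=N_1}^{\infty} |a_m| < \frac{\epsilon}{2\left(1 + \sum_{n=1}^{\infty} |b_n|\right)}.$$

Similarly, because $\sum_{n=1}^{\infty} b_n$ converges absolutely, there exists an integer $N_2$ such that

$$\sum_{n=N_2}^{\infty} |b_n| < \frac{\epsilon}{2\left(1 + \sum_{m=1}^{\infty} |a_m|\right)}.$$

Then for $k_1 \ge N_1$ and $k_2 \ge N_2$, the difference between the $(k_1 + k_2)$th partial sum of the Cauchy product of the two series and the product of the $k_1$th and $k_2$th partial sums of the two series satisfies

$$\left| \sum_{n=2}^{k_1+k_2} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{k_1} a_m \sum_{n=1}^{k_2} b_n \right| = \left| \sum_{m=k_1+1}^{k_1+k_2-1} a_m \sum_{n=1}^{k_1+k_2-m} b_n + \sum_{n=k_2+1}^{k_1+k_2-1} b_n \sum_{m=1}^{k_1+k_2-n} a_m \right|$$

$$\le \sum_{m=k_1+1}^{k_1+k_2-1} |a_m| \sum_{n=1}^{k_1+k_2-m} |b_n| + \sum_{n=k_2+1}^{k_1+k_2-1} |b_n| \sum_{m=1}^{k_1+k_2-n} |a_m|$$

$$\le \sum_{m=k_1+1}^{\infty} |a_m| \sum_{n=1}^{\infty} |b_n| + \sum_{n=k_2+1}^{\infty} |b_n| \sum_{m=1}^{\infty} |a_m|$$

$$< \frac{\epsilon \sum_{n=1}^{\infty} |b_n|}{2\left(1 + \sum_{n=1}^{\infty} |b_n|\right)} + \frac{\epsilon \sum_{m=1}^{\infty} |a_m|}{2\left(1 + \sum_{m=1}^{\infty} |a_m|\right)} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

Therefore, since the $(k_1 + k_2)$th partial sum of the Cauchy product of the two series and the product of the $k_1$th and $k_2$th partial sums of the two series are within $\epsilon$ of each other when $k_1$ and $k_2$ are large, both expressions must converge to the same quantity when $k_1$ and $k_2$ approach infinity, and that limit is the product of the two series.

Moreover,

$$\sum_{n=k_1+k_2+1}^{\infty} \left| \sum_{p=1}^{n-1} a_p b_{n-p} \right| \le \sum_{m=1}^{\infty} |a_m| \sum_{n=k_2+1}^{\infty} |b_n| + \sum_{n=1}^{\infty} |b_n| \sum_{m=k_1+1}^{\infty} |a_m| < \epsilon$$

showing that the Cauchy product converges absolutely.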

Another way to think about this theorem is that the Cauchy product $\sum_{n=2}^{\infty} \left( \sum_{p=1}^{n-1} a_p b_{n-p} \right)$ and the product of the two series $\left( \sum_{m=1}^{\infty} a_m \right) \left( \sum_{n=1}^{\infty} b_n \right)$ are rearrangements of each other. Thus, if either converges absolutely, both converge absolutely to the same limit. Of course, to make this rigorous, one would need to find at least one rearrangement of the terms into a form $c_1 + c_2 + c_3 + \cdots$ and then show that that series converges absolutely.

If one series converges absolutely and the other only converges conditionally, then the Cauchy product of the two series still converges to the product of the two series, but absolute convergence is not guaranteed. The proof is similar to the proof of the previous theorem in that it carefully considers the difference between the partial sum of the Cauchy product and product of the two series. This difference can be broken into three differences each of which can be bounded. Specifically, assume that $\sum_{m=1}^{\infty} a_m$ is absolutely convergent and $\sum_{n=1}^{\infty} b_n$ is convergent. Then consider the difference between the $N$th partial sum of the Cauchy product and the product of the series. That difference is

$$\sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n = \sum_{m=1}^{N-1} a_m \sum_{n=1}^{N-m} b_n - \sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n$$

$$= \sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n.$$

In the second term of this sum, the factor $\sum_{n=1}^{\infty} b_n$ is fixed, and the factor $\left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right)$ can be made small by making $N$ large. The first term of the sum is a little trickier to handle. In the terms $a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right)$ the $a_m$ factor can be made small by making $m$ large, and the $\left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right)$ factor can be made small by making $N - m$ large, or by keeping $m$ small. Both of these can be done, but not at the same time. The technique one would use here would be to break the sum from $m = 1$ to $m = N - 1$ at some intermediate value $K < N - 1$ writing

$$\sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) = \sum_{m=1}^{K} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \sum_{m=K+1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right).$$

You can now choose $K$ so that for $m > K$ the value of $a_m$ is small, and when $m \le K$, the $N - m$ is large so that $\left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right)$ will be small. This gives the following proof.

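PROOF: If $\sum_{m=1}^{\infty} a_m$ is an absolutely convergent series and $\sum_{n=1}^{\infty} b_n$ is a convergent series, then the Cauchy product of the two series converges to the product of the two series.

Let $\sum_{m=1}^{\infty} a_m$ be an absolutely convergent series, and let $\sum_{n=1}^{\infty} b_n$ be a convergent series.

Then for integers $N$ and $K$ with $1 < K < N - 1$, the difference between the $N$th partial sum of the Cauchy product of the two series and the product of the two series is

$$\sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n = \sum_{m=1}^{N-1} a_m \sum_{n=1}^{N-m} b_n - \sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n$$

$$= \sum_{m=1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n$$

$$= \sum_{m=1}^{K} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \sum_{m=K+1}^{N-1} a_m \left( \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right) + \left( \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right) \sum_{n=1}^{\infty} b_n.$$

Because $\sum_{n=1}^{\infty} b_n$ converges, its partial sums are bounded, so there is a number $M$ such that for all $T$, $\left| \sum_{n=1}^{T} b_n - \sum_{n=1}^{\infty} b_n \right| < M$.

Let $\epsilon > 0$ be given.

Because $\sum_{m=1}^{\infty} a_m$ converges absolutely, there is an integer $K$ such that $\sum_{m=K+1}^{\infty} |a_m| < \frac{\epsilon}{3M}$.

Because $\sum_{n=1}^{N} b_n$ converges to $\sum_{n=1}^{\infty} b_n$, there is a positive integer $N_1$ such that for all $N \ge N_1$, $\left| \sum_{n=1}^{N} b_n - \sum_{n=1}^{\infty} b_n \right| < \frac{\epsilon}{3\left(1 + \sum_{m=1}^{\infty} |a_m|\right)}$.

Because $\sum_{m=1}^{N} a_m$ converges to $\sum_{m=1}^{\infty} a_m$, there is a positive integer $N_2$ such that for all $N \ge N_2$, $\left| \sum_{m=1}^{N} a_m - \sum_{m=1}^{\infty} a_m \right| < \frac{\epsilon}{3\left(1 + \left| \sum_{n=1}^{\infty} b_n \right|\right)}$.

Then for $N \ge \max(N_1 + K, N_2 + 1)$,

$$\left| \sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p} - \sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n \right| \le \sum_{m=1}^{K} |a_m| \left| \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right| + \sum_{m=K+1}^{N-1} |a_m| \left| \sum_{n=1}^{N-m} b_n - \sum_{n=1}^{\infty} b_n \right| + \left| \sum_{m=1}^{N-1} a_m - \sum_{m=1}^{\infty} a_m \right| \left| \sum_{n=1}^{\infty} b_n \right|$$

$$< \sum_{m=1}^{K} |a_m| \cdot \frac{\epsilon}{3\left(1 + \sum_{m=1}^{\infty} |a_m|\right)} + \frac{\epsilon}{3M} \cdot M + \frac{\epsilon}{3\left(1 + \left| \sum_{n=1}^{\infty} b_n \right|\right)} \left| \sum_{n=1}^{\infty} b_n \right| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

This shows that the partial sums of the Cauchy product, $\sum_{n=2}^{N} \sum_{p=1}^{n-1} a_p b_{n-p}$, converge to the product of the series $\sum_{m=1}^{\infty} a_m \sum_{n=1}^{\infty} b_n$.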

Cauchy products play a particularly useful role in the study of power series, a

topic covered in the next chapter.

7.8.1 Exercises

1. Let am;n be the nth number in the mth row of the following table where m and n

both range from 1 to infinity.

1 1

0

12 12

0

14 14 14 14

0

1

2

1

2

1

4

1

4

1

4

1

4

1

8

1

8

1

8

1

8

Show that

1 P

1

P

mD1 nD1

1

8

1

8

1

8

1

8

18 18 18 18 18 18 18 18

1 P

1

P

am;n .

nD1 mD1

2. Show that the Cauchy product for the square of the conditionally convergent series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ converges.

3. Show that the Cauchy product for the square of the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ diverges.

4. Suppose you have two series whose indices begin with 0 rather than 1 as in $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$. Show that the Cauchy product of these two series is then $\sum_{n=0}^{\infty} \sum_{p=0}^{n} a_p b_{n-p}$.

5. In the next chapter it will be shown that for all real values of $x$, the exponential function has the series representation $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$. Use the Cauchy product of series to show that $e^a e^b = e^{a+b}$.

Chapter 8

Sequences of Functions

Chapter 3 introduces the idea of a sequence of real numbers $\langle a_n \rangle$ and discusses theorems related to the limit $\lim_{n\to\infty} a_n$, limit superior $\limsup_{n\to\infty} a_n$, limit inferior $\liminf_{n\to\infty} a_n$, and subsequences $\langle a_{n_j} \rangle$ of such a sequence. If instead of requiring the terms of the sequence $a_n$ to be constants, the $a_n$ were allowed to depend on the value of a variable as in $f_n(x)$, then the sequence is a sequence of functions. Thus, for each value of $x$, if all the functions $f_n(x)$ are defined at $x$, then there is a sequence of real numbers, $\langle f_n(x) \rangle$. This sequence changes as $x$ changes, and, indeed, there is a different sequence of real numbers for each choice of $x$. The limit of the sequence, if it exists, could be different for each $x$, and, therefore, the limit would also be a function, $f(x)$. The first question that arises is, what is meant by the convergence of such a sequence? In fact, there are many different definitions for the convergence of a sequence of functions, each with its own applications and properties. The next question is, what can one say about the properties of the limit of the sequence? For example, under what conditions can you know that the limit function is continuous, differentiable, or integrable? In particular, if the sequence of integrable functions $\langle f_n(x) \rangle$ converges to an integrable function $f(x)$, when can you conclude that $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$?

A sequence of functions $\langle f_n(x) \rangle$ converges pointwise to the function $f(x)$ on a set $A$ if for each $x \in A$, $\lim_{n\to\infty} f_n(x) = f(x)$. This type of convergence is referred to as pointwise convergence. For example, the sequence $f_n(x) = \frac{x}{n}$ converges pointwise to the function $f(x) = 0$ on the entire real line because for each $x \in \mathbb{R}$, $\lim_{n\to\infty} \frac{x}{n} = 0$. A more interesting example is the sequence $f_n(x) = x^n$ which converges pointwise on the interval $(-1, 1]$. When $|x| < 1$, the powers $x^n$ get small as $n$ gets large so $\lim_{n\to\infty} x^n = 0$. But when $x = 1$, the powers $x^n = 1$, so the limit

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_8

of the sequence is 1. Thus, the limit function is $f(x) = \begin{cases} 0 & \text{if } -1 < x < 1 \\ 1 & \text{if } x = 1. \end{cases}$ Note that this is a sequence of continuous functions that converges to a function that is not continuous. The sequence does not converge at $x = -1$ because the terms oscillate between $-1$ and 1 (Fig. 8.1).

[Fig. 8.1: functions $x^n$ converging to a discontinuous function]

[Fig. 8.2: functions $|x|^{\frac{n+1}{n}}$ converging to the function $f(x) = |x|$]

Continuity is not the only property not preserved by functions converging pointwise. The terms of the sequence $f_n(x) = |x|^{\frac{n+1}{n}}$ are differentiable functions for all real numbers, but the limit of the sequence is the function $f(x) = |x|$ which is not differentiable at $x = 0$ (Fig. 8.2). The terms of the sequence

$$f_n(x) = \begin{cases} n^2 x & \text{if } 0 \le x \le \frac{1}{n} \\ n^2 \left( \frac{2}{n} - x \right) & \text{if } \frac{1}{n} < x < \frac{2}{n} \\ 0 & \text{if } \frac{2}{n} \le x \le 2 \end{cases}$$

all have integral $\int_0^2 f_n(x)\,dx = 1$, yet the sequence converges pointwise on the interval $[0, 2]$ to the function $f(x) = 0$ which has integral equal to 0 (Fig. 8.3).
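The behavior of these triangle-shaped functions can be explored numerically. The following is a rough sketch, assuming the piecewise definition with pieces $n^2 x$, $n^2\left(\frac{2}{n} - x\right)$, and 0 on $[0, 2]$; the function names `f` and `integral` and the midpoint rule are choices made for this illustration. It approximates each integral and evaluates the functions at a fixed point.

```python
def f(n, x):
    """The n-th triangle function on [0, 2]: rises to height n at x = 1/n, back to 0 at 2/n."""
    if 0 <= x <= 1 / n:
        return n * n * x
    if 1 / n < x < 2 / n:
        return n * n * (2 / n - x)
    return 0.0


def integral(n, steps=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 2]."""
    h = 2 / steps
    return sum(f(n, (i + 0.5) * h) for i in range(steps)) * h


for n in (1, 2, 5, 10):
    print(f"n={n:2d}  integral ~ {integral(n):.5f}  f_n(0.3) = {f(n, 0.3):.3f}")
```

At the fixed point $x = 0.3$ the values are eventually 0 because the triangles move left of that point, while every integral stays near 1.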

[Fig. 8.3: functions $f_1$, $f_2$, $f_3$ with integral 1 converging to the function $f(x) = 0$]

8.1.1 Exercises

Determine the pointwise limits of the following sequences of functions. For which sequences is the limit continuous? For which sequences is the limit of the integrals of the terms equal to the integral of the limit?

1. $f_n(x) = \sqrt[n]{x}$ for $x \in [0, 16]$.

2. Let $r_1, r_2, r_3, \ldots$ be a sequence consisting of all the rational numbers in the interval $[0, 1]$. Let $f_n(x) = \begin{cases} 1 & \text{if } x = r_k \text{ for some } k \le n \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.

3. $f_n(x) = n x^n$ for $x \in (0, 1)$.

4. $f_n(x) = \begin{cases} n & \text{if } \frac{n}{2n+4} < x < \frac{n+4}{2n+4} \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.

5. $f_n(x) = \begin{cases} (2 + (-1)^n)\, n^2 x & \text{if } 0 < x < \frac{1}{2n} \\ (2 + (-1)^n)\, n^2 \left( \frac{1}{n} - x \right) & \text{if } \frac{1}{2n} \le x < \frac{1}{n} \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0, 1]$.

The sequence <fn > converges pointwise to f on the set A if given > 0, for each

x 2 A there is an integer N such that jfn .x/ f .x/j < for all n N. So, for

each x there is an integer N that ensures the inequality. The value of N can depend

on the choice of x. If this dependence is dropped, and you are able to specify a

value of N that does not depend on the choice of x, then the speed of convergence

becomes similar for all x 2 A; that is, the rate of convergence is uniform for all

x 2 A. The sequence <fn > converges uniformly to f on the set A if, given

> 0, there is an integer N such that for each x 2 A, jfn .x/ f .x/j < for all

n N. The difference between a sequence of functions converging uniformly and

converging pointwise is that with uniform convergence there can be no points of

242

8 Sequences of Functions

functions converging

uniformly

the set A where convergence lags behind. For any region of width > 0 around

the limit function, all of the terms suitably far down the sequence enter that region.

Compare, for example, the uniformly convergent sequence depicted in Fig. 8.4 with

the pointwise convergent sequences depicted in Figs. 8.1 and 8.3. In Fig. 8.4 the

functions of the sequence get close to the limit function for all the values of x,

whereas for each function in Figs. 8.1 and 8.3 there is an x for which the function

is far from its limit. Clearly, if a sequence of functions converges uniformly, then it

also converges pointwise. Thus, to converge uniformly is a stronger condition than

to converge pointwise.
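The contrast between the two modes of convergence can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the sequence $f_n(x) = x^n$ with limit function 0, and it estimates $\sup |f_n(x) - f(x)|$ on a finite grid, which only approximates the true supremum. The helper names `sup_distance`, `grid`, and `power` are hypothetical.

```python
def sup_distance(f, g, points):
    """Largest |f(x) - g(x)| over the sample points (a grid estimate of the sup)."""
    return max(abs(f(x) - g(x)) for x in points)


def grid(left, right, count):
    """Return `count` evenly spaced points from left to right, inclusive."""
    step = (right - left) / (count - 1)
    return [left + i * step for i in range(count)]


def power(n):
    """The n-th term of the assumed sequence, f_n(x) = x**n."""
    return lambda x: x ** n


def zero(x):
    """The pointwise limit of x**n for 0 <= x < 1."""
    return 0.0


inner = grid(0.0, 0.5, 1001)          # samples of [0, 1/2]
near_one = grid(0.0, 0.999999, 1001)  # samples of [0, 1) reaching close to 1

for n in (1, 5, 25, 125):
    # First column shrinks like (1/2)**n; second column stays close to 1.
    print(n, sup_distance(power(n), zero, inner), sup_distance(power(n), zero, near_one))
```

On $[0, 1/2]$ the estimated distance shrinks to 0, which is what uniform convergence requires. On the samples of $[0, 1)$ it stays near 1 for every $n$ tried, because points close to 1 always lag behind.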

As seen in the previous section, the terms of the sequence $\langle f_n \rangle$ can have many properties that are not automatically inherited by the limit of the sequence, $f$, when the convergence is pointwise. Under uniform convergence, more of the properties of the terms of the sequence are retained by the limit. This is because under uniform convergence there are no points $x \in A$ for which the values of $\langle f_n(x) \rangle$ lag behind as $n$ gets large. For all values of $x \in A$, the sequence of values $\langle f_n(x) \rangle$ gets close to the corresponding $f(x)$ at a rate at least as fast as some fixed rate.

For example, if $\langle f_n \rangle$ is a sequence of functions continuous on the set $A$ which converges uniformly to $f$ on $A$, then the limit is guaranteed to be continuous. Actually, a stronger statement can be made. If all the terms $f_n$ are continuous at some point $a \in A$, then the limit, $f$, will also be continuous at $a$. To prove that $f$ is continuous at $a$, you will need to show that for each $\varepsilon > 0$ there is a $\delta > 0$ such that if $x$ is in $A$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \varepsilon$. How can you arrange for $f(x)$ to be close to $f(a)$? What you know is that the functions $f_n$ get uniformly close to $f$, and that the $f_n$ functions are continuous at $a$. Since, for any particular $n$, the term $f_n$ is continuous at $a$, you can arrange for $f_n(x)$ to be close to $f_n(a)$. The uniform convergence allows you to choose an integer $n$ so that for every $x \in A$, $f_n(x)$ is close to $f(x)$. That is, $|f(x) - f(a)| = |f(x) - f_n(x) + f_n(x) - f_n(a) + f_n(a) - f(a)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(a)| + |f_n(a) - f(a)|$. Each of these three terms can be made small, say less than $\frac{\varepsilon}{3}$, so that the sum is less than $\varepsilon$. The key point here is that only one value of $n$ needs to be chosen so that $|f(x) - f_n(x)|$ can be made less than $\frac{\varepsilon}{3}$ no matter which $x$ is chosen (Fig. 8.5).

[Fig. 8.5: graphs of $f(x)$ and $f_n(x)$ near the point $a$]

PROOF: If the sequence $\langle f_n \rangle$ converges uniformly to the limit $f$ on the set $A$ and if for each $n$, $f_n$ is continuous at $a \in A$, then $f$ is continuous at $a \in A$. In particular, if each $f_n$ is continuous on $A$, then $f$ is continuous on $A$.

Let $\langle f_n \rangle$ be a sequence of functions that converges uniformly to the function $f$ on a set $A$.

Assume that each $f_n$ is continuous at the point $a \in A$.

Let $\varepsilon > 0$ be given.

Because the sequence converges uniformly, there is an integer $N$ such that $|f_n(x) - f(x)| < \frac{\varepsilon}{3}$ for all $x \in A$ and all $n \ge N$.

Because $f_N$ is continuous at $a$, there is a $\delta > 0$ such that $|f_N(x) - f_N(a)| < \frac{\varepsilon}{3}$ for all $x \in A$ satisfying $|x - a| < \delta$.

Then, for all $x \in A$ satisfying $|x - a| < \delta$, it follows that $|f(x) - f(a)| = |f(x) - f_N(x) + f_N(x) - f_N(a) + f_N(a) - f(a)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$.

Therefore, the function $f$ is continuous at $a \in A$.

Moreover, if each function $f_n$ is continuous at each $a \in A$, then $f$ is continuous at each $x \in A$, so $f$ is continuous on $A$.

It is worth considering where this proof breaks down if all you assume is that the sequence $\langle f_n \rangle$ converges pointwise to $f$. The problem comes in the fact that although $|f_N(a) - f(a)|$ and $|f_N(x) - f_N(a)|$ can be made smaller than $\frac{\varepsilon}{3}$, there could be values of $x$ very close to $a$ for which $|f_N(x) - f(x)|$ is no longer small. Thus, the needed inequality $|f(x) - f_N(x)| + |f_N(x) - f_N(a)| + |f_N(a) - f(a)| < \varepsilon$ might not hold. Also consider the function $f$ defined on the interval $[0, 2]$ by $f(x) = x$ if $x \ne 1$ and $f(1) = 3$. If for each positive integer $n$ you let $f_n(x) = f(x) + \frac{1}{n}$, then it is clear that the sequence $\langle f_n \rangle$ converges uniformly to $f$. At the points where each $f_n$ is continuous, that is, for $x \ne 1$, the limit function $f$ is also continuous.

Suppose that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on an interval $[a, b]$ and that this sequence converges to a limit $f$. Examples in the last section show that if the convergence is pointwise, the limit $\lim_{n \to \infty} \int_a^b f_n(x)\,dx$ does not necessarily equal $\int_a^b f(x)\,dx$. Indeed, the limit function $f$ might not be Riemann integrable, and the limit of the integrals of the $f_n$ might not exist. On the other hand, if the convergence is uniform, then the limit function $f$ will be Riemann integrable and the limit of the integrals of the $f_n$ will equal the integral of $f$. Showing that the uniform limit of Riemann integrable functions is Riemann integrable is not difficult and is based on the characterization of Riemann integrable functions given by Lebesgue's Theorem. Recall that a function is Riemann integrable on an interval if and only if it is bounded and the set of points where the function is discontinuous has measure zero. If each term of the sequence, $f_n$, has these properties, then the limit function, $f$, must also have them. By the definition of uniform convergence, there is an integer $N$ such that $|f_N(x) - f(x)| < 1$ for all $x \in [a, b]$. So, if the function $f_N$ is bounded by some constant $M$, then the function $f$ must be bounded by $M + 1$ since for all $x \in [a, b]$ it follows that $-M - 1 \le f_N(x) - 1 < f(x) < f_N(x) + 1 \le M + 1$.

As for points of discontinuity of $f$, for each positive integer $n$, let $D_n$ be the set of points in $[a, b]$ where the function $f_n$ is discontinuous. Because each $f_n$ is Riemann integrable, each $D_n$ has measure zero. The set $D = \bigcup_{n=1}^{\infty} D_n$ is a countable union of sets of measure zero, so it also has measure zero. The sequence $\langle f_n \rangle$ converges uniformly on the set $A = [a, b] \setminus D$, and each term of the sequence is continuous at each point of $A$, so, by the preceding theorem, the limit $f$ must be continuous on $A$. Thus, the set of discontinuities of $f$ must be contained in $D$, so the set of discontinuities of $f$ has measure zero. Therefore, $f$ is Riemann integrable.

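So why does it follow that $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$? From the definition of uniform convergence, for every $\varepsilon > 0$ there is an integer $N$ such that $n \ge N$ implies that $|f_n(x) - f(x)| < \varepsilon$ for every $x \in [a, b]$. This means that for every $n \ge N$ and every $x \in [a, b]$ it follows that $f(x) - \varepsilon < f_n(x) < f(x) + \varepsilon$, so

\[
\int_a^b f(x)\,dx - \varepsilon (b - a) = \int_a^b \bigl( f(x) - \varepsilon \bigr)\,dx \le \int_a^b f_n(x)\,dx \le \int_a^b \bigl( f(x) + \varepsilon \bigr)\,dx = \int_a^b f(x)\,dx + \varepsilon (b - a).
\]

Because $\varepsilon$ can be chosen arbitrarily small, it follows that $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.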

PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions that are Riemann integrable on the interval $[a, b]$. If the sequence converges uniformly to $f$, then $f$ is also Riemann integrable on $[a, b]$, and $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.

Let $\langle f_n \rangle$ be a sequence of functions Riemann integrable on the interval $[a, b]$ which converges uniformly to the function $f$.

Because the sequence converges uniformly, there is an integer $N$ such that for all $n \ge N$ and all $x \in [a, b]$ it follows that $|f_n(x) - f(x)| < 1$.

Because $f_N$ is Riemann integrable on $[a, b]$, $f_N$ is bounded, so there exists an $M$ such that $|f_N(x)| < M$ for all $x \in [a, b]$.

Then for each $x \in [a, b]$ it follows that $-M - 1 \le f_N(x) - 1 < f(x) < f_N(x) + 1 \le M + 1$. Thus, $|f(x)|$ is bounded by $M + 1$ and $f$ is a bounded function.

For each positive integer $n$ let $D_n$ be the set of $x \in [a, b]$ where $f_n$ fails to be continuous. Because each $f_n$ is Riemann integrable, Lebesgue's Theorem shows that $D_n$ has measure zero.

Since the countable union of sets of measure zero is a set with measure zero, $D = \bigcup_{n=1}^{\infty} D_n$ has measure zero. Let $A = [a, b] \setminus D$.

Because $\langle f_n \rangle$ converges uniformly to $f$ on $[a, b]$, the limit function, $f$, is continuous at each point in $A$.

Thus, the set of points where $f$ is discontinuous is a subset of $D$, and, hence, the set of discontinuities of $f$ has measure zero.

It follows from Lebesgue's Theorem that $f$ is integrable on the interval $[a, b]$.

Let $\varepsilon > 0$ be given.

Because $\langle f_n \rangle$ converges uniformly to $f$ on $[a, b]$, there is an integer $N$ such that for all $n \ge N$ and all $x \in [a, b]$, $|f_n(x) - f(x)| < \frac{\varepsilon}{b - a + 1}$.

Thus, for each $x \in [a, b]$, $f(x) - \frac{\varepsilon}{b - a + 1} < f_n(x) < f(x) + \frac{\varepsilon}{b - a + 1}$.

Then, $\int_a^b f(x)\,dx - \varepsilon < \int_a^b \left( f(x) - \frac{\varepsilon}{b - a + 1} \right) dx \le \int_a^b f_n(x)\,dx \le \int_a^b \left( f(x) + \frac{\varepsilon}{b - a + 1} \right) dx < \int_a^b f(x)\,dx + \varepsilon$.

Therefore, $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$, which proves the theorem.

Note the use of $b - a + 1$ rather than just $b - a$ in the denominator of $\frac{\varepsilon}{b - a + 1}$. The extra $+1$ avoids the embarrassing case of $a = b$. The addition of $+1$ allows the proof to handle the easy special case without having to provide a separate argument for it.
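The estimate behind this theorem, $\left| \int_a^b f_n - \int_a^b f \right| \le (b - a) \sup |f_n - f|$, can be observed numerically. The sketch below is an illustration under assumptions that are not from the text: it uses the sequence $f_n(x) = \sqrt{x^2 + 1/n}$, which converges uniformly to $|x|$ on $[-1, 1]$, approximates the integrals with a midpoint rule, and estimates the supremum on a grid. All function names are hypothetical.

```python
import math

a, b = -1.0, 1.0


def limit(x):
    """Assumed uniform limit f(x) = |x|."""
    return abs(x)


def term(n):
    """Assumed n-th term f_n(x) = sqrt(x**2 + 1/n)."""
    return lambda x: math.sqrt(x * x + 1.0 / n)


def midpoint_integral(func, left, right, pieces=20000):
    """Midpoint-rule approximation of the integral of func over [left, right]."""
    width = (right - left) / pieces
    return width * sum(func(left + (i + 0.5) * width) for i in range(pieces))


def sup_gap(n, samples=2001):
    """Grid estimate of sup |f_n(x) - f(x)| over [a, b]."""
    pts = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(term(n)(x) - limit(x)) for x in pts)


def integral_gap(n):
    """Approximate |integral of f_n - integral of f| over [a, b]."""
    return abs(midpoint_integral(term(n), a, b) - midpoint_integral(limit, a, b))


for n in (1, 100, 10000):
    # The integral gap should not exceed (b - a) times the sup gap.
    print(n, integral_gap(n), (b - a) * sup_gap(n))
```

In this example the largest gap occurs at $x = 0$ and equals $1/\sqrt{n}$, so both columns shrink to 0 as $n$ grows.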


8.2.1 Exercises

1. Show that $\lim_{n \to \infty} x^n$ converges uniformly to 0 on any interval $[-a, a]$ where $0 < a < 1$.

2. Show that $\lim_{n \to \infty} \frac{1}{nx}$ converges uniformly to 0 on any interval $[a, \infty)$ for $a > 0$ but

3. Another way to show that the uniform limit of Riemann integrable functions is Riemann integrable is to show that the limit function has upper and lower step functions, $u$ and $v$, such that the integrals of $u$ and $v$ are within $\varepsilon > 0$ of each other. Write a proof that uses this strategy.

4. Suppose that the sequence of functions $\langle f_n \rangle$ converges uniformly to the function $f$ and that $g$ is a uniformly continuous function defined on the range of $f$ and the ranges of each of the $f_n$ functions. Prove that the functions $g\bigl(f_n(x)\bigr)$ converge uniformly to $g\bigl(f(x)\bigr)$.

Chapter 3 introduces monotonically increasing sequences of numbers $\langle a_n \rangle$ where $a_n \le a_{n+1}$ for each $n \ge 1$ and monotonically decreasing sequences of numbers where $a_n \ge a_{n+1}$ for each $n \ge 1$. One can similarly define $\langle f_n \rangle$ to be a monotonically increasing sequence of functions or a monotonically decreasing sequence of functions on a set $A$ if for each $x \in A$ the sequence of numbers $\langle f_n(x) \rangle$ is always monotonically increasing or always monotonically decreasing, respectively. Such a sequence is said to converge monotonically to the limit function $f$ on $A$ if the monotone sequence converges to $f$. That convergence could be a pointwise convergence or a uniform convergence. Even if convergence is pointwise, sometimes knowing that the convergence is also monotone gives results similar to knowing that the convergence is uniform.

In the previous section it was shown that if a sequence of continuous functions converges uniformly to $f$, then $f$ is continuous. If the convergence is actually monotone, then the converse holds, that is, if the terms of the sequence $\langle f_n \rangle$ are continuous on an interval $[a, b]$ and the sequence converges monotonically to a limit function $f$ that is also continuous on $[a, b]$, then the convergence is actually uniform. To prove this you would need to take an $\varepsilon > 0$ and show there was an integer $N$ such that for all $n \ge N$ and all $x \in [a, b]$ it was true that $|f_n(x) - f(x)| < \varepsilon$. What do you have working for you? What you have is that the limit function $f$ is continuous at each point $x \in [a, b]$, each of the terms of the sequence $f_n$ is continuous, and the values $f_n(x)$ are approaching $f(x)$ monotonically.

Let $x \in [a, b]$ be given. Using the continuity of $f$ you can find an interval around $x$ such that for any $y$ in that interval, $f(y)$ is close to $f(x)$. Using the fact that the $f_n$ are converging to $f$ pointwise, you can find an integer $N$ such that for all $n \ge N$ the value of $f_n(x)$ is close to the value of $f(x)$. Finally, using the continuity of $f_N$ you can find an interval around $x$ such that for any $y$ in that interval $f_N(y)$ is close to $f_N(x)$. Combining these you can show that for any $y$ in some interval around $x$, the value of $f_N(y)$ is close to the value of $f(y)$. The crucial observation here is that once you know that $f_N(y)$ and $f(y)$ are close, the monotonicity of the convergence gives you that $f_n(y)$ is between $f_N(y)$ and $f(y)$ for all $n \ge N$, and, thus, $f_n(y)$ will be close to $f(y)$ for all $n \ge N$. Note, though, that the value of $N$ can vary with the value of $x$.

Well, this means that for each $x \in [a, b]$ there is an interval around $x$ where $f_n(y)$ is close to $f(y)$ for all $y$ in the interval and all $n \ge N$. Now you can use the compactness of the interval $[a, b]$, that is, you can use the Heine-Borel Theorem to show that there is a finite collection of these $x$ values, say $x_1, x_2, x_3, \ldots, x_k$, such that the entire interval $[a, b]$ is covered by these intervals you constructed around each of the $x_j$'s. Each of the $x_j$'s was associated with an $N_j$, and now one can select the maximum of these $N_j$ values to get a single function $f_N$ which is uniformly close to $f$. Again, because the convergence is monotone, once you know that $f_N$ is close to $f$, you know that $f_n$ is close to $f$ for all $n \ge N$. This will complete the proof.

PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions continuous on the interval $[a, b]$ that converges monotonically to the function $f$ that is also continuous on $[a, b]$. Then the sequence converges uniformly to $f$ on $[a, b]$.

Assume that $\langle f_n \rangle$ is a sequence of functions continuous on the interval $[a, b]$ that converges monotonically to the function $f$ that is also continuous on $[a, b]$.

Let $\varepsilon > 0$ be given.

Let $x \in [a, b]$.

Because the function $f$ is continuous at $x$, there is a $\delta_1 > 0$ such that if $y \in [a, b]$ with $|y - x| < \delta_1$, then $|f(y) - f(x)| < \frac{\varepsilon}{3}$.

Because $\lim_{n \to \infty} f_n(x) = f(x)$, there is an integer $N_x$ such that if $n \ge N_x$, then $|f_n(x) - f(x)| < \frac{\varepsilon}{3}$.

Because $f_{N_x}$ is continuous at $x$, there is a $\delta_2 > 0$ such that if $y \in [a, b]$ with $|y - x| < \delta_2$, then $|f_{N_x}(y) - f_{N_x}(x)| < \frac{\varepsilon}{3}$.

Let $\delta_x = \min(\delta_1, \delta_2)$.

Then, if $y \in [a, b]$ with $|y - x| < \delta_x$, it follows that $|f_{N_x}(y) - f(y)| = |f_{N_x}(y) - f_{N_x}(x) + f_{N_x}(x) - f(x) + f(x) - f(y)| \le |f_{N_x}(y) - f_{N_x}(x)| + |f_{N_x}(x) - f(x)| + |f(x) - f(y)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$.

The interval $[a, b]$ is covered by the collection of open intervals $(x - \delta_x, x + \delta_x)$ for $x \in [a, b]$.

By the Heine-Borel Theorem, there is a finite collection of these $x$ values, $x_1, x_2, x_3, \ldots, x_k$, such that the intervals $(x_j - \delta_{x_j}, x_j + \delta_{x_j})$ for $j = 1, 2, 3, \ldots, k$ cover the interval $[a, b]$.

Let $N = \max(N_{x_1}, N_{x_2}, N_{x_3}, \ldots, N_{x_k})$.

Let $y \in [a, b]$.

There is a value of $j$ between 1 and $k$ such that $y \in (x_j - \delta_{x_j}, x_j + \delta_{x_j})$.

Because the sequence $\langle f_n \rangle$ converges monotonically to $f$, for all $n \ge N \ge N_{x_j}$, $f_n(y)$ is between $f_{N_{x_j}}(y)$ and $f(y)$, and $|f_{N_{x_j}}(y) - f(y)| < \varepsilon$, so $|f_n(y) - f(y)| < \varepsilon$.

This shows that the sequence $\langle f_n \rangle$ converges uniformly to $f$.


p

on the interval $[0, 4]$. Since each $f_n$ and $f$ is continuous on $[0, 4]$, you can conclude that the convergence of the sequence is uniform. On the other hand, the sequence of continuous functions $f_n(x) = x^n$ converges monotonically on the interval $[0, 1]$ to a function discontinuous at 1. Clearly, then, the sequence does not converge uniformly.
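The monotone case can also be explored numerically. As a sketch under assumptions not taken from the text, consider $f_n(x) = (1 + x/n)^n$ on $[0, 1]$, a sequence of continuous functions that increases monotonically to the continuous limit $e^x$. The theorem then predicts uniform convergence, so the grid estimate of $\sup |f_n - f|$ should shrink. The names `f`, `sup_gap`, `monotone`, and `gaps` are hypothetical.

```python
import math


def f(n, x):
    """Assumed n-th term f_n(x) = (1 + x/n)**n, increasing in n for x >= 0."""
    return (1.0 + x / n) ** n


# Sample points of the compact interval [0, 1].
points = [i / 200 for i in range(201)]


def sup_gap(n):
    """Grid estimate of sup over [0, 1] of e**x - f_n(x)."""
    return max(math.exp(x) - f(n, x) for x in points)


# Check monotonicity in n at every sample point for the first 50 terms.
monotone = all(f(n, x) <= f(n + 1, x) for n in range(1, 50) for x in points)

# The estimated sup gap for a few values of n; it should decrease toward 0.
gaps = [sup_gap(n) for n in (1, 10, 100, 1000)]

print(monotone)
print(gaps)
```

Replacing the example with $x^n$ on $[0, 1]$ would keep the monotonicity but lose the continuity of the limit, and the estimated gap would then stay near 1 instead of shrinking.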

Another important theorem about monotone convergence is that if $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the interval $[a, b]$ that converge monotonically to the Riemann integrable function $f$, then $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$. This theorem is generally not proved in a book of this type because it is an easy consequence of the Monotone Convergence Theorem of Lebesgue which is covered in any beginning course in measure theory, but that study requires the development of Lebesgue measure, a topic which is beyond the scope of this book.

It does need to be pointed out that even if all the terms of a sequence are Riemann integrable functions, and the sequence converges monotonically to a function $f$, it may be that the limit, $f$, is not itself Riemann integrable. For example, let $r_1, r_2, r_3, \ldots$ be a sequence consisting of all the rational numbers in the interval $[0, 1]$. Let $f_n(x)$ be the function equal to 1 for $x = r_1, r_2, r_3, \ldots, r_n$ and equal to 0 elsewhere. Then each $f_n$ has finitely many points of discontinuity so each $f_n$ has a Riemann integral on $[0, 1]$ equal to 0. Yet the sequence $\langle f_n \rangle$ converges monotonically to the function $f$ equal to 1 for rational values of $x$ and 0 for irrational values of $x$, so $f$ is discontinuous everywhere, and, as a result, it is not Riemann integrable.

So, suppose that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the interval $[a, b]$ that converge monotonically to a limit function $f$ that is also Riemann integrable on $[a, b]$. Without loss of generality one can assume that the sequence is monotonically decreasing to $f$ because if the sequence were increasing, the same argument could just be applied to the sequence $\langle -f_n \rangle$. Also, it can be assumed that the function $f$ is identically 0 on $[a, b]$ because if that is not the case, the argument could be applied to the sequence $\langle f_n - f \rangle$ which does decrease monotonically to 0, and $\lim_{n \to \infty} \int_a^b \bigl( f_n(x) - f(x) \bigr)\,dx = 0$ is equivalent to $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.¹ The proof would start with an $\varepsilon > 0$, and the goal of the proof would be to show that there is an integer $N$ such that for all $n \ge N$, it follows that $\int_a^b f_n(x)\,dx < \varepsilon$. The proof presented here is based on the fact that for any Riemann integrable function, $f_n$, you can find upper and lower step functions, $u_n(x)$ and $v_n(x)$, satisfying $v_n(x) \le f_n(x) \le u_n(x)$ for every $x \in [a, b]$ so that $\int_a^b \bigl( u_n(x) - v_n(x) \bigr)\,dx$ is as small as you like. Suppose you select $u_n$ and $v_n$ so that $\int_a^b \bigl( u_n(x) - v_n(x) \bigr)\,dx < \frac{\varepsilon}{2^n}$. That is, find upper and lower step functions for each $f_n$ such that they give increasingly better and better precision. Then you would be able to know that the entire sum $\sum_{n=1}^{\infty} \int_a^b \bigl( u_n(x) - v_n(x) \bigr)\,dx$ would be less than $\sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon$. This bound is enough to begin the proof, but later this value can be adjusted when you see just how small the bound needs to be in order to make the proof work.

¹ This proof is based on ideas from the article "Monotone Convergence Theorem for the Riemann Integral" by Brian S. Thomson from the American Mathematical Monthly, June-July 2010.

For each $n$ these $u_n$ and $v_n$ functions are step functions on the interval $[a, b]$, so there must be a positive integer $k$ and a partition of $[a, b]$ given by $a = x_0 < x_1 < x_2 < \cdots < x_k = b$ such that both the $u_n$ and the $v_n$ functions are constant on each of the open intervals $(x_{j-1}, x_j)$ for $j = 1, 2, 3, \ldots, k$. Well, for this proof, there will be a different $k$ and a different partition for each $f_n$ function, so it would be better to name the positive integer $k_n$ and the partition $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots < x_{n,k_n} = b$. For the purposes of this proof, it is important that the endpoints of the partition associated with the $u_n$ and $v_n$, that is, $x_{n,1}, x_{n,2}, x_{n,3}, \ldots, x_{n,k_n - 1}$, do not match any of the endpoints of the partition associated with the next case $u_{n+1}$ and $v_{n+1}$. This is easy to arrange because an upper or lower step function for $f_n$ that is constant on the two intervals $(x_{j-1}, x_j)$ and $(x_j, x_{j+1})$ can be altered to be constant on the three intervals $(x_{j-1}, x_j - \delta)$, $(x_j - \delta, x_j + \delta)$, and $(x_j + \delta, x_{j+1})$ for some suitably small $\delta > 0$ without significantly changing the value of the integral of the step function and without destroying whether the step function is an upper or lower step function of $f_n$. Indeed, if, for example, the upper step function $u_n$ were constant on $(x_{j-1}, x_j)$ and $(x_j, x_{j+1})$, you could define the function $\tilde{u}_n$ by redefining $u_n$ on the interval $(x_j - \delta, x_j + \delta)$ to equal $\max\bigl( u_n(x_j - \delta), u_n(x_j), u_n(x_j + \delta) \bigr)$. Then $\tilde{u}_n$ would be slightly larger than $u_n$ on a small interval so that $\int_a^b \bigl( \tilde{u}_n(x) - u_n(x) \bigr)\,dx$ is less than $2\delta\, \tilde{u}_n(x_j)$ which can be made arbitrarily small by selecting $\delta$ small. Because $\tilde{u}_n$ is greater than or equal to $u_n$, it is also an upper step function of $f_n$.

For each $y \in [a, b]$, consider the sequence of numbers $f_1(y), f_2(y), f_3(y), \ldots$ which decreases monotonically to 0. For each $y$, select a positive integer $n(y)$ such that $f_{n(y)}(y) < \varepsilon$. Thus, $n(y)$ associates a term of the sequence $f_{n(y)}$ with $y$. That term is also associated with $u_n$ and $v_n$ and the partition $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots < x_{n,k_n} = b$. As stated in the previous paragraph, it can be assumed that $y$ is not equal to any of the endpoints $x_{n,1}, x_{n,2}, x_{n,3}, \ldots, x_{n,k_n - 1}$, so $y$ can be associated with an open interval $(x_{n,j-1}, x_{n,j})$ containing $y$ unless $y$ is $a$ or $b$ in which case $y$ will be associated with the open interval $(a - 1, x_{n,1})$ or $(x_{n,k_n - 1}, b + 1)$, respectively.

Each point $y \in [a, b]$ has been associated with an open interval that contains $y$. Thus, these open intervals provide an open cover of the interval $[a, b]$. The Heine-Borel Theorem says that there exists a finite subcover of $[a, b]$. That is, there is a sequence of $y \in [a, b]$, say $y_1, y_2, y_3, \ldots, y_m$, such that the intervals associated with these $y$ values cover $[a, b]$. Something stronger can be said. In this finite subcover of open intervals, you can assume that there are no values of $y \in [a, b]$ that belong to more than two of the open intervals in that subcover. Indeed, suppose $y$ is an element of the three intervals of the subcover $(a_1, b_1)$, $(a_2, b_2)$, and $(a_3, b_3)$. Suppose $a_1$ is the least of $a_1$, $a_2$, and $a_3$, and that $b_2$ is the greatest of $b_1$, $b_2$, and $b_3$. Then $a_1 < a_3 < y < b_3 < b_2$, so $(a_3, b_3) \subset (a_1, b_1) \cup (a_2, b_2)$, and the interval $(a_3, b_3)$ can be dropped from the subcover. Because the subcover contains only a finite number of open intervals, all of these superfluous intervals can be dropped from the subcover.

Consider the intervals associated with each of the $y_j$ values. For simplicity, let the interval associated with $y_j$ be renamed $(a_j, b_j)$. Note that the endpoints $a$ and $b$ will be among the $y_j$ values because for every $n$ each of these endpoints was covered by only one possible open interval. At this point the left endpoint associated with $a$ can be set to $a$ and the right endpoint of the interval associated with $b$ can be set to $b$. It is important to note that if $n(y_i) = n(y_j)$ for some distinct $i$ and $j$, then the intervals associated with $y_i$ and $y_j$ do not overlap. This is because the intervals associated with $y_i$ and $y_j$ are distinct intervals from $(a - 1, x_1), (x_1, x_2), (x_2, x_3), \ldots, (x_{k-1}, b + 1)$. Let $N$ be the maximum of the finitely many $n(y_j)$ values for $j = 1, 2, 3, \ldots, m$. Because no value of $y \in [a, b]$ appears in more than two of the open intervals associated with the $y_j$, it can be concluded that

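\[
\begin{aligned}
\int_a^b f_N(x)\,dx &\le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_N(x)\,dx \le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_{n(y_j)}(x)\,dx \\
&= \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \bigl( f_{n(y_j)}(x) - f_{n(y_j)}(y_j) \bigr)\,dx + f_{n(y_j)}(y_j)(b_j - a_j) \right] \\
&\le \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \bigl( u_{n(y_j)}(x) - v_{n(y_j)}(y_j) \bigr)\,dx + \varepsilon (b_j - a_j) \right] \\
&\le \sum_{p=1}^{N} \int_a^b \bigl( u_p(x) - v_p(x) \bigr)\,dx + 2\varepsilon (b - a) \\
&\le \sum_{p=1}^{N} \frac{\varepsilon}{2^p} + 2\varepsilon (b - a) < \varepsilon (2b - 2a + 1).
\end{aligned}
\]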

There were two places in the above argument where quantities were forced to be less than the given value $\varepsilon$. It can now be seen that those quantities should have been made smaller than $\frac{\varepsilon}{2b - 2a + 1}$ so that the final inequality would show $\int_a^b f_N(x)\,dx < \varepsilon$ as needed. It is also worth noting that there were two places in the argument that use the fact that the sequence $\langle f_n \rangle$ converges monotonically. The first was to conclude that when, for a particular $y \in [a, b]$, the value of $f_n(y)$ is small, then the values of $f_m(y)$ are also small for all $m \ge n$. The second important use of monotonicity takes the final result that $\int_a^b f_N(x)\,dx < \varepsilon$ and concludes that $\int_a^b f_m(x)\,dx < \varepsilon$ for all $m \ge N$.

PROOF: Assume that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the interval $[a, b]$ that converges monotonically to the function $f$ that is also Riemann integrable on $[a, b]$. Then $\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx$.

Assume that $\langle f_n \rangle$ is a sequence of functions Riemann integrable on the interval $[a, b]$ that converges monotonically to the function $f$ that is also Riemann integrable on $[a, b]$.

Without loss of generality assume that $\langle f_n \rangle$ decreases monotonically to $f(x) \equiv 0$ on $[a, b]$. If this were not the case, the argument could be applied to the sequence of functions $\langle |f - f_n| \rangle$.

Let $\varepsilon > 0$ be given.

It is left to show that there is an integer $N$ such that for all $n \ge N$, $\int_a^b f_n(x)\,dx < \varepsilon$.

For each positive integer $n$, because $f_n$ is Riemann integrable, there exist upper and lower step functions, $u_n$ and $v_n$, satisfying for each $x \in [a, b]$, $v_n(x) \le f_n(x) \le u_n(x)$ and $\int_a^b \bigl( u_n(x) - v_n(x) \bigr)\,dx < \frac{\varepsilon}{(2b - 2a + 1)\, 2^n}$.

Because $u_n$ and $v_n$ are step functions, there exist a positive integer $k_n$ and a partition of $[a, b]$ given by $a = x_{n,0} < x_{n,1} < x_{n,2} < \cdots < x_{n,k_n} = b$ such that for each $j = 1, 2, 3, \ldots, k_n$, the functions $u_n$ and $v_n$ are constant on each open interval $(x_{n,j-1}, x_{n,j})$.

Because there is flexibility in selecting the upper and lower step functions, it can be assumed that for each positive integer $n$, except for $a$ and $b$, the endpoints of the partition associated with the upper and lower step functions for $f_n$ are distinct from the endpoints of the partition associated with the upper and lower step functions for $f_{n+1}$.

For each $y \in [a, b]$ the sequence $f_1(y), f_2(y), f_3(y), \ldots$ decreases monotonically to 0, so for each $y$ there is a positive integer $n(y)$ such that $f_m(y) < \frac{\varepsilon}{2b - 2a + 1}$ for all $m \ge n(y)$. In particular, it can be assumed that, unless $y$ is $a$ or $b$, $y$ is not an endpoint of the partition of $[a, b]$ associated with the upper and lower step functions of $f_{n(y)}$.

Associate with each $y \in [a, b]$ an open interval as follows. If $y = a$, then let the open interval be $(a - 1, x_{n(a),1})$. If $y = b$, then let the open interval be $(x_{n(b),k_{n(b)} - 1}, b + 1)$. Otherwise, associate $y$ with the open interval $(x_{n(y),j-1}, x_{n(y),j})$ that contains $y$.

Thus, each $y \in [a, b]$ is associated with an open interval that contains $y$, so this collection of open intervals provides an open cover of $[a, b]$.

By the Heine-Borel Theorem, there exists a finite subcovering of $[a, b]$ consisting of $m$ open intervals associated with $m$ values $y_1, y_2, y_3, \ldots, y_m$ in $[a, b]$. Note that since $a$ and $b$ are each covered by at most one of the open intervals in the covering of $[a, b]$, both $a$ and $b$ appear in the list of $y_1$ through $y_m$.

Let the interval associated with $y_j$ be called $(a_j, b_j)$. Reset the interval associated with $a$ so that its $a_j$ value is equal to $a$ rather than $a - 1$, and reset the interval associated with $b$ so that its $b_j$ value is equal to $b$ rather than $b + 1$. It can be assumed that no value of $y \in [a, b]$ belongs to more than two of the open intervals of the subcovering.

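Let $N = \max\bigl( n(y_1), n(y_2), n(y_3), \ldots, n(y_m) \bigr)$.

Then

\[
\begin{aligned}
\int_a^b f_N(x)\,dx &\le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_N(x)\,dx \le \sum_{j=1}^{m} \int_{a_j}^{b_j} f_{n(y_j)}(x)\,dx \\
&= \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \bigl( f_{n(y_j)}(x) - f_{n(y_j)}(y_j) \bigr)\,dx + f_{n(y_j)}(y_j)(b_j - a_j) \right] \\
&\le \sum_{j=1}^{m} \left[ \int_{a_j}^{b_j} \bigl( u_{n(y_j)}(x) - v_{n(y_j)}(y_j) \bigr)\,dx + \frac{\varepsilon}{2b - 2a + 1}(b_j - a_j) \right] \\
&\le \sum_{p=1}^{N} \int_a^b \bigl( u_p(x) - v_p(x) \bigr)\,dx + \frac{\varepsilon}{2b - 2a + 1}\, 2(b - a) \\
&\le \sum_{p=1}^{N} \frac{\varepsilon}{(2b - 2a + 1)\, 2^p} + \frac{\varepsilon}{2b - 2a + 1}\, 2(b - a) < \frac{\varepsilon}{2b - 2a + 1}(2b - 2a + 1) = \varepsilon.
\end{aligned}
\]

Because the sequence $\langle f_n \rangle$ is monotonically decreasing, it follows for all $n \ge N$ that $\int_a^b f_n(x)\,dx \le \int_a^b f_N(x)\,dx < \varepsilon$ which completes the proof.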

Pointwise convergence and uniform convergence are not the only methods of convergence of sequences of functions. Another method suggested by the above theorem is called convergence in mean or convergence in $L^1$. A sequence of Riemann integrable functions $\langle f_n \rangle$ is said to converge in mean to the Riemann integrable function $f$ on the interval $[a, b]$ if $\lim_{n \to \infty} \int_a^b |f_n(x) - f(x)|\,dx = 0$. For example, consider the following sequence of functions defined on the interval $[0, 1]$. Define $f(x; a, b)$ to be the function that is 1 for $x$ in the interval $[a, b]$ and 0 for all other $x$. Then for positive integer $n$ and for integer $k$ with $2^{n-1} \le k < 2^n$, let $f_k(x) = f\left( x; \frac{k - 2^{n-1}}{2^{n-1}}, \frac{k + 1 - 2^{n-1}}{2^{n-1}} \right)$. The integral of $f\left( x; \frac{k - 2^{n-1}}{2^{n-1}}, \frac{k + 1 - 2^{n-1}}{2^{n-1}} \right)$ from 0 to 1 is $\frac{1}{2^{n-1}}$, so the integrals of $f_k(x)$ approach 0 as $k$ gets large. Thus, $f_k$ converges in mean to the zero function. Yet this sequence of functions does not converge pointwise for any single value of $x$.
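The behavior of this sequence is easy to inspect directly. The following sketch is an illustration rather than part of the text: it builds the intervals described above with exact rational arithmetic and, for the arbitrarily chosen sample point $x = 1/3$, shows that $f_k(x)$ takes both the value 0 and the value 1 within every block $2^{n-1} \le k < 2^n$ with $n \ge 2$. The helper names `block`, `f`, and `integral` are hypothetical.

```python
from fractions import Fraction


def block(k):
    """Return (n, left, right): the level n with 2**(n-1) <= k < 2**n and the interval of f_k."""
    n = k.bit_length()
    width = Fraction(1, 2 ** (n - 1))
    left = (k - 2 ** (n - 1)) * width
    return n, left, left + width


def f(k, x):
    """Value of f_k at x: 1 on the closed interval of f_k, 0 elsewhere."""
    _, left, right = block(k)
    return 1 if left <= x <= right else 0


def integral(k):
    """Integral of f_k over [0, 1], which is the length of its interval."""
    _, left, right = block(k)
    return right - left


x = Fraction(1, 3)
for k in range(1, 16):
    # The integrals shrink, but the values at x keep switching between 0 and 1.
    print(k, integral(k), f(k, x))
```

Because the values at the sample point keep returning to 1 while also being 0 infinitely often, the sequence of numbers $f_k(x)$ cannot converge, even though the integrals tend to 0.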

The infinite series $f(x) = \sum_{n=1}^{\infty} a_n(x)$ is a series whose terms are functions of $x$. For each $x$ where the terms of the series are defined and the series converges, $f(x)$ is just an infinite series of real numbers given by $a_n(x)$. For example, $f(x) = \sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ is defined for each $x$ that is not the negative of a perfect square. If $x$ is the negative of a perfect square, then there is a term of the series that is not defined. Otherwise, the series converges because for each $n$ with $n^2 > 2|x|$ it follows that $\left| \frac{1}{n^2 + x} \right| \le \frac{2}{2n^2 - 2|x|} \le \frac{2}{n^2}$. Then $\sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ converges since $\sum_{n=1}^{\infty} \frac{2}{n^2}$ converges. Similarly, the series $\sum_{n=1}^{\infty} x^n$ converges to $f(x) = \frac{x}{1 - x}$ for all $x$ satisfying $|x| < 1$. Note here that the function $f(x) = \frac{x}{1 - x}$ is defined for all $x \ne 1$, but the infinite series is only defined for $|x| < 1$. This is an example of a power series dealt with in considerably more detail in the next section.

The results concerning the convergence of sequences of functions discussed earlier in this chapter apply to the study of infinite series of functions because an infinite series is just defined to be the sequence of its partial sums. Still other questions arise such as, can one find the derivative or the integral of an infinite series by simply differentiating or integrating the terms of the series and then finding the limit of the resulting partial sums? The answer to this question is that sometimes one gets a correct answer by differentiating or integrating a series term by term, but other times this process results in nonsense. For example, consider again the function $f(x) = \sum_{n=1}^{\infty} \frac{1}{n^2 + x}$. Here, the statement that $\int f(x)\,dx = \int \sum_{n=1}^{\infty} \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \int \frac{1}{n^2 + x}\,dx$ is meaningless, because term-by-term integration gives $\sum_{n=1}^{\infty} \int \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \ln(n^2 + x) + C$ which does not converge for any value of $x$. Alternatively, for this particular series it is valid to use the definite integral from 0 to $y$ and write $\int_0^y f(x)\,dx = \int_0^y \sum_{n=1}^{\infty} \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \int_0^y \frac{1}{n^2 + x}\,dx = \sum_{n=1}^{\infty} \ln \frac{n^2 + y}{n^2}$ which does converge for each $y > -1$. The integral and derivative of

1

P

nD1

series of positive numbers

1

P

1

P

nD1

nD1

1

P

nD1

is known as the Weierstrass M-Test. Consider how the proof of this result would go. First, of course, you would assume that you had a series of functions, $\sum_{n=1}^{\infty} a_n(x)$, and a convergent series of positive numbers, $\sum_{n=1}^{\infty} M_n$, such that for each positive integer $n$, $|a_n(x)| \le M_n$ for every $x \in A$. You should note that for each $x \in A$, the series $\sum_{n=1}^{\infty} |a_n(x)|$ converges by the Comparison Test, so $\sum_{n=1}^{\infty} a_n(x)$ converges pointwise. You are to prove that the sequence of partial sums converges uniformly, so you would need to take an $\varepsilon > 0$ and show that there is an integer $N$ such that whenever $m \ge N$ and $x \in A$, the partial sum $\sum_{n=1}^{m} a_n(x)$ is within $\varepsilon$ of the limit $\sum_{n=1}^{\infty} a_n(x)$. The difference between the $m$th partial sum of $\sum_{n=1}^{\infty} a_n(x)$ and its limit is $\left| \sum_{n=m+1}^{\infty} a_n(x) \right| \le \sum_{n=m+1}^{\infty} |a_n(x)| \le \sum_{n=m+1}^{\infty} M_n$ which can be made less than $\varepsilon$ by selecting $m$ large. The value of $m$ does not depend on $x$ showing that the convergence is uniform. This gives the following proof.

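PROOF (Weierstrass M-Test): Let $\sum_{n=1}^{\infty} a_n(x)$ be a series of functions defined on the set $A$, and let $\sum_{n=1}^{\infty} M_n$ be a convergent series of positive numbers. If for each $n$ and each $x \in A$ it holds that $|a_n(x)| \le M_n$, then $\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $A$.

Let $\sum_{n=1}^{\infty} a_n(x)$ be a series of functions defined on the set $A$, and let $\sum_{n=1}^{\infty} M_n$ be a convergent series of positive numbers.

Assume that for each $n$ and each $x \in A$ it holds that $|a_n(x)| \le M_n$.

Then for each $x \in A$ it follows from the Comparison Test that $\sum_{n=1}^{\infty} a_n(x)$ converges absolutely.

Let $\varepsilon > 0$ be given.

Because $\sum_{n=1}^{\infty} M_n$ converges, there is an integer $N$ such that $\sum_{n=m}^{\infty} M_n < \varepsilon$ for all $m \ge N$.

But, then, for each $x \in A$ and each $m \ge N$, the difference between the $m$th partial sum of $\sum_{n=1}^{\infty} a_n(x)$ and its limit is $\left| \sum_{n=m+1}^{\infty} a_n(x) \right| \le \sum_{n=m+1}^{\infty} |a_n(x)| \le \sum_{n=m+1}^{\infty} M_n < \varepsilon$.

Thus, $\sum_{n=1}^{\infty} a_n(x)$ converges uniformly on $A$.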

For example, the series $\sum_{n=1}^{\infty} \frac{1}{n^2 + x}$ converges uniformly on $[0, \infty)$ because $\left| \frac{1}{n^2 + x} \right| \le \frac{1}{n^2}$ for every $x \ge 0$, and $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges. Since all the partial sums of the series are continuous functions, it follows from this uniform convergence that the limit function is continuous on $[0, \infty)$. Similarly, the series $\sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2}$ converges uniformly on the entire real line because $\left| \frac{\sin(n^2 x)}{n^2} \right| \le \frac{1}{n^2}$ for every positive integer $n$. Again, you can conclude that the limit function is continuous because all the partial sums are continuous functions. Notice, though, that if you differentiate each term of this series, you get $\sum_{n=1}^{\infty} \cos(n^2 x)$ which does not converge.
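The uniform bound supplied by the Weierstrass M-Test can be checked numerically for the series $\sum \sin(n^2 x)/n^2$ with $M_n = 1/n^2$. The sketch below is an illustration with hypothetical helper names: it compares the gap between two partial sums at several arbitrarily chosen sample points with the corresponding piece of the tail $\sum M_n$, which does not depend on $x$.

```python
import math


def partial_sum(m, x):
    """m-th partial sum of the series with terms sin(n**2 * x) / n**2."""
    return sum(math.sin(n * n * x) / (n * n) for n in range(1, m + 1))


def tail_bound(m, upper):
    """Sum of M_n = 1/n**2 for m < n <= upper; it bounds the gap between partial sums."""
    return sum(1.0 / (n * n) for n in range(m + 1, upper + 1))


bound = tail_bound(50, 2000)
for x in (0.0, 0.3, 1.0, 2.5, -7.0):
    gap = abs(partial_sum(2000, x) - partial_sum(50, x))
    # The same bound works at every sample point, which is the point of the test.
    print(x, gap, bound)
```

The single number `bound` controls the gap at every sample point at once, which is the numerical counterpart of the statement that $N$ can be chosen independently of $x$.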

Power series form a class of infinite series of functions that stands out because of the

particularly nice properties they satisfy, the ease in which

pthey can be produced, the

many well-known elementary functions they represent ( x; ex ; sin x; cos x; ln x),

and the enormous number of applications they have. A power series is a series of

1

P

the form

an .x c/n , where the real number an is the nth coefficient and c is the

nD0

center of the power series. This book will consider such series where the variable,

coefficients, and center are real numbers, although most of what is said here holds

when these quantities are allowed to be complex numbers. In fact, such series play

a central role in Complex Analysis.

The first important result about power series is that they converge in an interval

$(c - R, c + R)$ where $c$ is the center of the power series and $R$, called the radius of convergence, is a nonnegative real number or possibly even infinity. In fact, if the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges for a particular real number $y$, then it converges absolutely for any $x$ satisfying $|x - c| < |y - c|$, that is, for any $x$ closer to $c$ than $y$. The proof is based on the Weierstrass M-Test where the power series at the point $x$ is compared to a convergent geometric series with common ratio $\left| \frac{x - c}{y - c} \right|$ which is less than 1.

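PROOF: If the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges at the real number $y$, then it converges absolutely at every $x$ satisfying $|x - c| < |y - c|$.

Let $\sum_{n=0}^{\infty} a_n (x - c)^n$ be a power series that converges at $y$.

Since $\sum_{n=0}^{\infty} a_n (y - c)^n$ converges, $\lim_{n \to \infty} a_n (y - c)^n = 0$ by the Limit of Terms Test.

Thus, the terms must be bounded, and there exists a real number $M$ such that $|a_n (y - c)^n| \le M$ for every nonnegative integer $n$.

Let $x$ be any real number satisfying $|x - c| < |y - c|$.

Then $|a_n (x - c)^n| = |a_n (y - c)^n| \left| \frac{x - c}{y - c} \right|^n \le M \left| \frac{x - c}{y - c} \right|^n$.

The series $\sum_{n=0}^{\infty} M \left| \frac{x - c}{y - c} \right|^n$ is a convergent geometric series with common ratio $\left| \frac{x - c}{y - c} \right| < 1$.

Thus, $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges absolutely by the Weierstrass M-Test.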

It follows immediately from the previous theorem that the radius of convergence for a power series is $R = \sup\{ |y - c| : \text{the series converges at } y \}$, and that the power series converges absolutely for all $x \in (c - R, c + R)$. This does not say anything about how the power series behaves at the endpoints $c - R$ and $c + R$. There are examples of power series that converge at both endpoints, that converge at one of the two endpoints, or converge at neither endpoint. It also follows from the above proof that if the power series converges absolutely at $y$, then it converges uniformly for all $x$ satisfying $|x - c| \le |y - c|$. In particular, if the power series has radius of convergence $R > 0$ and $\epsilon$ is any positive number less than $R$, then the series converges absolutely at $x = c + R - \epsilon$, so the series converges absolutely and uniformly on $[c - R + \epsilon, c + R - \epsilon]$. Since all the partial sums of the series are continuous functions, the function $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ is continuous on $[c - R + \epsilon, c + R - \epsilon]$ for all small $\epsilon > 0$, so it is continuous on the open interval $(c - R, c + R)$. If the series converges absolutely for $x = c + R$, then $f(x)$ is continuous on the closed interval $[c - R, c + R]$. What if the series $\sum_{n=0}^{\infty} a_n (x - c)^n$ converges conditionally at $x = c + R$ or $x = c - R$? Does this mean that the function is continuous at that endpoint? The answer is yes, but this takes some proof and is known as Abel's Theorem.

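PROOF (Abel's Theorem): Suppose the power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ has radius of convergence $R$ with $0 < R < \infty$ and converges at one of the endpoints $c - R$ or $c + R$. Then the series is continuous on an interval from $c - R$ to $c + R$ containing that endpoint.

Let $\sum_{n=0}^{\infty} a_n (x - c)^n$ be a power series with radius of convergence $R$ satisfying $0 < R < \infty$.

Assume that the series converges at one of the endpoints of the interval of convergence, $c - R$ or $c + R$.

Without loss of generality $c = 0$ and $R = 1$ because the argument can be applied to the series where $x$ is replaced by $Rx + c$. Thus, assume that the series is $\sum_{n=0}^{\infty} a_n x^n$ with radius of convergence 1.

Also without loss of generality the series converges at 1, because if it converges at $-1$, the argument can be applied to the series $\sum_{n=0}^{\infty} a_n (-1)^n x^n$ which converges at 1.

Finally, by subtracting a constant from the constant term of the series, $a_0$, it can be assumed that $\sum_{n=0}^{\infty} a_n = 0$.

For each $k \ge 0$ let $s_k = \sum_{n=0}^{k} a_n$, so that $\lim_{k \to \infty} s_k = 0$. Because the sequence $s_n$ is bounded, the series $\sum_{n=0}^{\infty} s_n x^n$ converges for $|x| < 1$.

Then for $|x| < 1$,

\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + \sum_{n=1}^{\infty} (s_n - s_{n-1}) x^n = s_0 + \sum_{n=1}^{\infty} s_n x^n - \sum_{n=1}^{\infty} s_{n-1} x^n = s_0 + (1 - x) \sum_{n=1}^{\infty} s_n x^n - s_0 x = (1 - x) \sum_{n=0}^{\infty} s_n x^n. \]

Let $\epsilon > 0$ be given.

Because $\lim_{n \to \infty} s_n = 0$, there is an integer $N$ such that for all $n \ge N$, $|s_n| < \frac{\epsilon}{2}$.

Then for $0 < x < 1$,

\[ \left| \sum_{n=0}^{\infty} a_n x^n \right| = \left| (1 - x) \sum_{n=0}^{\infty} s_n x^n \right| \le \left| (1 - x) \sum_{n=0}^{N} s_n x^n \right| + \left| (1 - x) \sum_{n=N+1}^{\infty} s_n x^n \right| \]

\[ \le \left| (1 - x) \sum_{n=0}^{N} s_n x^n \right| + \frac{\epsilon}{2} (1 - x) \sum_{n=N+1}^{\infty} x^n = \left| (1 - x) \sum_{n=0}^{N} s_n x^n \right| + \frac{\epsilon}{2} (1 - x) \frac{x^{N+1}}{1 - x} = \left| (1 - x) \sum_{n=0}^{N} s_n x^n \right| + \frac{\epsilon}{2} x^{N+1}. \]

Because the limit of this quantity as $x$ approaches 1 from the left is $\frac{\epsilon}{2}$, there exists $\delta > 0$ such that for all $x$ between $1 - \delta$ and 1, this expression is less than $\epsilon$.

This shows that $\lim_{x \to 1^-} \sum_{n=0}^{\infty} a_n x^n = 0$ which completes the proof.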

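The Root Test and the Ratio Test can be used to find the radius of convergence of a power series. The Root Test says that the series $\sum_{n=0}^{\infty} a_n (x - c)^n$ must converge if $\limsup_{n \to \infty} \sqrt[n]{|a_n| \, |x - c|^n} < 1$. Equivalently, $|x - c| < \frac{1}{\limsup_{n \to \infty} \sqrt[n]{|a_n|}}$. Conversely, the series must diverge if $|x - c| > \frac{1}{\limsup_{n \to \infty} \sqrt[n]{|a_n|}}$, so the radius of convergence must be $R = \frac{1}{\limsup_{n \to \infty} \sqrt[n]{|a_n|}}$.

If you apply the Ratio Test to $\sum_{n=0}^{\infty} a_n (x - c)^n$, you see that the series will converge if $\lim_{n \to \infty} \frac{|a_{n+1} (x - c)^{n+1}|}{|a_n (x - c)^n|} < 1$. Equivalently, $|x - c| < \lim_{n \to \infty} \frac{|a_n|}{|a_{n+1}|}$. The series diverges if $|x - c| > \lim_{n \to \infty} \frac{|a_n|}{|a_{n+1}|}$ showing that $R = \lim_{n \to \infty} \frac{|a_n|}{|a_{n+1}|}$. This expression is fine as long as the limit of the ratio of terms exists, but it is less helpful when it does not.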

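It is worth considering a few examples.

$\sum_{n=0}^{\infty} n x^n$. The center is $c = 0$, and the radius of convergence is $R = \lim_{n \to \infty} \frac{1}{\sqrt[n]{n}} = 1$. The series diverges at both endpoints by the Limit of Terms Test.

$\sum_{n=1}^{\infty} \frac{(-1)^n (x - 1)^n}{n}$. The center is $c = 1$, and the radius of convergence is $R = \lim_{n \to \infty} \frac{1}{\sqrt[n]{1/n}} = \lim_{n \to \infty} \frac{n + 1}{n} = 1$. At the left endpoint $x = 0$ the series is the harmonic series which diverges to infinity, but at the right endpoint $x = 2$ the series is the alternating harmonic series which converges conditionally.

$\sum_{n=1}^{\infty} \frac{(x + 4)^n}{n^2 5^n}$. The center is $c = -4$, and the radius of convergence is $R = \lim_{n \to \infty} \frac{(n + 1)^2 5^{n+1}}{n^2 5^n} = \lim_{n \to \infty} \frac{1}{\sqrt[n]{1/(n^2 5^n)}} = 5$. At the right endpoint $x = 1$ the series is $\sum \frac{1}{n^2}$, which converges absolutely, which means that the series will also converge at the left endpoint $x = -9$.

$\sum_{n=0}^{\infty} \frac{x^n}{(2n)!}$. The center is $c = 0$, and the radius of convergence is $R = \lim_{n \to \infty} \frac{(2n + 2)!}{(2n)!} = \lim_{n \to \infty} \frac{1}{\sqrt[n]{1/(2n)!}} = \infty$, so the series converges for all $x$.

$\sum_{n=1}^{\infty} n^n x^n$. The center is $c = 0$, and the radius of convergence is $R = \lim_{n \to \infty} \frac{n^n}{(n + 1)^{n+1}} = \lim_{n \to \infty} \frac{1}{\sqrt[n]{n^n}} = 0$, so the series converges only at its center.

$\frac{1}{2^1} x + \frac{2^2}{3^2} x^2 + \frac{1}{2^3} x^3 + \frac{2^4}{3^4} x^4 + \cdots$. The center is $c = 0$. This is an example of a series where $\lim_{n \to \infty} \frac{1}{\sqrt[n]{|a_n|}}$ does not exist, but $\liminf_{n \to \infty} \frac{1}{\sqrt[n]{|a_n|}} = \frac{3}{2} = R$. Note that $\liminf_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| = 0$ and $\limsup_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right| = \infty$, neither of which sheds any light on the value of $R$. This series diverges at both endpoints by the Limit of Terms Test.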
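As a rough numerical companion to the examples above, the following sketch evaluates $1/\sqrt[n]{|a_n|}$ at one large $n$. It assumes the limit exists, so it is not a substitute for the lim sup in the general formula, and it works with $\ln |a_n|$ to avoid overflow. The helper names are hypothetical choices made for this illustration.

```python
import math

def root_estimate(log_abs_coeff, n):
    """Estimate R by 1 / |a_n|**(1/n), given a function returning log|a_n|."""
    return math.exp(-log_abs_coeff(n) / n)

def log_coeff_n(n):
    """log|a_n| for a_n = n."""
    return math.log(n)

def log_coeff_n2_5n(n):
    """log|a_n| for a_n = 1 / (n^2 * 5^n)."""
    return -(2 * math.log(n) + n * math.log(5))

def log_coeff_nn(n):
    """log|a_n| for a_n = n^n."""
    return n * math.log(n)

n = 10_000
print(root_estimate(log_coeff_n, n))      # close to 1
print(root_estimate(log_coeff_n2_5n, n))  # close to 5
print(root_estimate(log_coeff_nn, n))     # equals 1/n, tending to 0
```

The estimates drift toward 1, 5, and 0 as `n` grows, matching the radii found by hand.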

8.5.3 Differentiability

A function represented by a power series in an interval with positive length is said to

be analytic in that interval. Perhaps the most unusual property of analytic functions

is that they are differentiable, and the derivative of a power series can be found by

differentiating the series term by term to get a new series which converges with the

1

P

same radius of convergence as the original series. That is, if $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ for all $x$ with $|x - c| < R$, then $f'(x) = \sum_{n=1}^{\infty} n a_n (x - c)^{n-1}$, and this series converges for these same values of $x$. It is easy to check that the radius of convergence of the derivative series $\sum_{n=1}^{\infty} n a_n (x - c)^{n-1}$ is the same as the original series. This follows from the fact that $\limsup_{n \to \infty} \sqrt[n]{n |a_n|} = \lim_{n \to \infty} \sqrt[n]{n} \cdot \limsup_{n \to \infty} \sqrt[n]{|a_n|} = \frac{1}{R}$, because $\lim_{n \to \infty} \sqrt[n]{n} = 1$. It is, therefore, the case that $g(x) = \sum_{n=1}^{\infty} n a_n (x - c)^{n-1}$ is a power series with the same radius of convergence as $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$. The question remains whether this new power series is, in fact, the derivative of the original series. That is, does $g(x) = f'(x)$ hold for all $x$ in the open interval where the two series converge? This needs to be proved. The proof needs to show that $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = g(x)$ for every $x$ inside the interval of convergence of the power series. To construct a proof of this, you might express the difference $\frac{f(x + h) - f(x)}{h} - g(x)$ in terms of power series and see if this simplifies to an expression that has a limit of 0 as $h$ approaches 0. The calculation is simpler if you assume that $c = 0$. Then,

\[ \left| \frac{f(x + h) - f(x)}{h} - g(x) \right| = \left| \frac{\sum_{n=0}^{\infty} a_n (x + h)^n - \sum_{n=0}^{\infty} a_n x^n}{h} - \sum_{n=1}^{\infty} n a_n x^{n-1} \right| = \left| \frac{\sum_{n=0}^{\infty} a_n \sum_{p=0}^{n} \binom{n}{p} h^p x^{n-p} - \sum_{n=0}^{\infty} a_n x^n - \sum_{n=1}^{\infty} h n a_n x^{n-1}}{h} \right|. \]

A careful accounting of the terms in the numerator shows that all the terms of $\sum_{n=0}^{\infty} a_n x^n$ and all of the terms of $\sum_{n=1}^{\infty} h n a_n x^{n-1}$ cancel leaving

\[ \left| \frac{\sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^p x^{n-p}}{h} \right| = |h| \left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right|. \]

The factor $|h|$ clearly goes to 0 as $h$ goes to 0, but there is a question about what happens to the other factor. This infinite sum will not be a problem if it remains bounded as $h$ gets small. Here is where you can use the fact that power series with radius of convergence $R$ converge absolutely at points less than a distance $R$ from the center of the series. Assume that $|h|$ is smaller than some fixed value $s > 0$. Then the second factor can be estimated as follows.

\[ \left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right| \le \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} |h|^{p-2} |x|^{n-p} \le \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} s^{p-2} |x|^{n-p} \]

\[ \le \sum_{n=2}^{\infty} \frac{|a_n|}{s^2} \sum_{p=0}^{n} \binom{n}{p} s^p |x|^{n-p} = \frac{1}{s^2} \sum_{n=2}^{\infty} |a_n| (|x| + s)^n. \]

This last expression converges as long as $|x| + s$ is a point where the power series for $f$ converges absolutely. But if $x$ were chosen so that $|x| < R$, then for any positive $s$ with $s < R - |x|$, this will happen. Because you are free to choose any $s > 0$, you can choose one less than $R - |x|$ which will ensure that $\left| \frac{f(x + h) - f(x)}{h} - g(x) \right|$ is small whenever $0 < |h| < s$ and $|h|$ is small enough, so the proof can be completed.

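PROOF: Let $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ which has a positive radius of convergence $R \le \infty$. Then for all $x$ with $|x - c| < R$, $f'(x) = \sum_{n=1}^{\infty} n a_n (x - c)^{n-1}$.

Let $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ be a power series with positive radius of convergence $R \le \infty$.

The power series for $f$ and its derivative depend on $x - c$ and not on $c$, so there is no loss of generality to assume that $c = 0$.

Note that $\limsup_{n \to \infty} \sqrt[n]{n |a_n|} = \lim_{n \to \infty} \sqrt[n]{n} \cdot \limsup_{n \to \infty} \sqrt[n]{|a_n|}$, so the two series $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=1}^{\infty} n a_n x^{n-1}$ have the same radius of convergence.

Let $x$ be chosen with $|x| < R$.

Let $\epsilon > 0$ be given.

If $R < \infty$, let $s = \frac{R - |x|}{2}$, and if $R = \infty$, let $s = 1$.

Because $|x| + s < R$, the series $\sum_{n=0}^{\infty} a_n (|x| + s)^n$ converges absolutely.

Let $\delta = \min\left( s, \frac{\epsilon s^2}{1 + \sum_{n=2}^{\infty} |a_n| (|x| + s)^n} \right) > 0$.

Then for all $h$ with $0 < |h| < \delta$,

\[ \left| \frac{f(x + h) - f(x)}{h} - \sum_{n=1}^{\infty} n a_n x^{n-1} \right| = \left| \frac{\sum_{n=0}^{\infty} a_n (x + h)^n - \sum_{n=0}^{\infty} a_n x^n}{h} - \sum_{n=1}^{\infty} n a_n x^{n-1} \right| \]

\[ = \left| \frac{\sum_{n=0}^{\infty} a_n \sum_{p=0}^{n} \binom{n}{p} h^p x^{n-p} - \sum_{n=0}^{\infty} a_n x^n - \sum_{n=1}^{\infty} h n a_n x^{n-1}}{h} \right| = \left| \frac{\sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^p x^{n-p}}{h} \right| \]

\[ = |h| \left| \sum_{n=2}^{\infty} a_n \sum_{p=2}^{n} \binom{n}{p} h^{p-2} x^{n-p} \right| \le |h| \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} |h|^{p-2} |x|^{n-p} \]

\[ \le |h| \sum_{n=2}^{\infty} |a_n| \sum_{p=2}^{n} \binom{n}{p} s^{p-2} |x|^{n-p} \le |h| \sum_{n=2}^{\infty} \frac{|a_n|}{s^2} \sum_{p=0}^{n} \binom{n}{p} s^p |x|^{n-p} = |h| \frac{1}{s^2} \sum_{n=2}^{\infty} |a_n| (|x| + s)^n < \epsilon. \]

This shows that $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$.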
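A quick numerical sanity check of this theorem, offered only as a sketch: for the geometric series $\sum x^n = \frac{1}{1-x}$, the term-by-term derivative $\sum n x^{n-1}$ should agree with a difference quotient of the series and with $\frac{1}{(1-x)^2}$. The truncation at 200 terms and the step `h` are arbitrary choices made for this illustration.

```python
def geometric(x, terms=200):
    """Truncated series sum_{n=0}^{terms-1} x^n, approximating 1/(1 - x)."""
    return sum(x ** n for n in range(terms))

def geometric_termwise_derivative(x, terms=200):
    """Truncated series sum_{n=1}^{terms-1} n x^(n-1), the term-by-term derivative."""
    return sum(n * x ** (n - 1) for n in range(1, terms))

def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h, the quantity whose limit defines f'(x)."""
    return (f(x + h) - f(x)) / h

x = 0.5
g = geometric_termwise_derivative(x)
dq = difference_quotient(geometric, x, 1e-6)
print(g, dq, 1.0 / (1.0 - x) ** 2)
```

The difference quotient approaches the term-by-term derivative as `h` shrinks, up to floating-point rounding.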

An immediate consequence of this theorem is that not only can you obtain the first derivative of $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ by differentiating term by term, but you can also get all the higher derivatives of $f$ by repeating the process. This follows by induction because, if the $m$th derivative of $f$ is equal to the series formed by the $m$th derivatives of the terms of the series for $f$, and if that series has the same radius of convergence as the series for $f$, then the theorem says that the $(m+1)$st derivative of $f$ can be obtained by differentiating the terms of the series for the $m$th derivative of $f$, and the radius of convergence of that series will remain the same. Moreover, one can find an antiderivative for $f$ by integrating each term of the series for $f$. That is, if $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ for all $x$ with $|x - c| < R$, then the series $\sum_{n=0}^{\infty} \frac{a_n}{n + 1} (x - c)^{n+1}$ will have the same radius of convergence as the series for $f$, and the theorem says that

the derivative of the new series is equal to f . It is important to note that if a function

is analytic by virtue of having a power series representation in an open interval of

radius R around c, then that function is infinitely differentiable in that interval.

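These results make it very simple to derive new series from previously known series. For example, you already know that $\frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n$ for all $x$ with $|x| < 1$. Then

by substituting $-x$ for $x$ in the series for $\frac{1}{1 - x}$, $\frac{1}{1 + x} = \sum_{n=0}^{\infty} (-1)^n x^n$.

by substituting $x^2$ for $x$ in the series for $\frac{1}{1 + x}$, $\frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}$.

by differentiating the series for $\frac{1}{1 - x}$, $\frac{1}{(1 - x)^2} = \sum_{n=1}^{\infty} n x^{n-1}$.

by integrating the series for $\frac{1}{1 + x}$ and noting that $\ln 1 = 0$, $\ln(1 + x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1} = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}$. In particular, by Abel's Theorem, $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$.

by integrating the series for $\frac{1}{1 + x^2}$ and noting that $\tan^{-1} 0 = 0$, $\tan^{-1} x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1}$. In particular, by Abel's Theorem, $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$.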
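The two endpoint values obtained from Abel's Theorem can be checked numerically. This is a minimal sketch using truncated sums; the function names and the cutoff of 20000 terms are choices made here, and the alternating series error bound (the first omitted term) is what justifies the tolerance.

```python
import math

def log1p_series(x, terms):
    """Partial sum of sum_{n=1}^{terms} (-1)^(n-1) x^n / n, the series for ln(1 + x)."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, terms + 1))

def arctan_series(x, terms):
    """Partial sum of sum_{n=0}^{terms-1} (-1)^n x^(2n+1) / (2n+1), the series for arctan x."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

terms = 20_000
print(log1p_series(1.0, terms), math.log(2.0))      # endpoint x = 1
print(arctan_series(1.0, terms), math.pi / 4.0)     # endpoint x = 1
print(log1p_series(0.999, terms), math.log(1.999))  # just inside the interval
```

Convergence at the endpoint is slow (the error after $N$ terms is about $1/N$), while just inside the interval the same number of terms is accurate to many digits.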

If $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$, then $f(c) = a_0$, the constant term of the series for $f$. Finding the $m$th derivative of the series for $f$ and evaluating it at the center of the series, $c$, gives that $f^{(m)}(c) = m! \, a_m$. So, for all integers $m \ge 0$, $a_m = \frac{f^{(m)}(c)}{m!}$. This gives a straightforward way to generate the power series representing any analytic function. Moreover, even if $f$ is not infinitely differentiable, if it is $m$ times differentiable, one can generate the $m$th degree Taylor polynomial for $f$ centered at $c$ given by $g(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x - c)^n$. Then $g$ is an $m$th degree polynomial that is equal to $f$ at $c$, and all of its derivatives up to order $m$ agree with the corresponding derivatives of $f$ at $c$. In particular, the first degree Taylor polynomial is just the familiar linear approximation to $f$ given by the line tangent to the graph of $f$ at $c$. If $f$ is $m$-times differentiable at $c$, one can generate the $m$th degree Taylor polynomial, $g(x)$, for $f$ centered at $c$, but this does not say whether the value of $g(x)$ is even remotely related to the value of $f(x)$ when $x$ is different from $c$. This issue is what is addressed by Taylor's Theorem which states that $f(x) = g(x) + R_m(x)$ for some remainder function $R_m(x)$. Depending on various characteristics of $f$, one can show that $R_m(x)$ is suitably small so that $g(x)$ is a good approximation for $f(x)$.

There are many forms of Taylor's Theorem that express the remainder term, $R_m(x)$, in different ways. The one discussed here is sometimes called Lagrange's form. It says that if $f$ is $m + 1$ times differentiable on the interval between $c$ and $x$, then the difference between $f(x)$ and the $m$th degree Taylor polynomial for $f$ centered at $c$ can be expressed in terms of $f^{(m+1)}(\xi)$ for some $\xi$ strictly between $c$ and $x$. Its proof follows easily from the following generalization of Rolle's Theorem.

PROOF (Higher Order Rolle's Theorem): Let $f$ be an $m + 1$ times differentiable function on the open interval from $a$ to $b$ with $a \ne b$, let $f$ be continuous on the closed interval from $a$ to $b$, and suppose that $0 = f(a) = f'(a) = f''(a) = \cdots = f^{(m)}(a) = f(b)$. Then there is an $x$ strictly between $a$ and $b$ where $f^{(m+1)}(x) = 0$.

Let $f$ be an $m + 1$ times differentiable function on the open interval from $a$ to $b$ with $a \ne b$, let $f$ be continuous on the closed interval from $a$ to $b$, and suppose that $0 = f(a) = f'(a) = f''(a) = \cdots = f^{(m)}(a) = f(b)$.

Since $f(a) = f(b)$, $f$ is continuous on the closed interval between $a$ and $b$, and $f$ is differentiable between $a$ and $b$, then by Rolle's Theorem there is an $x_1$ strictly between $a$ and $b$ such that $f'(x_1) = 0$.

Assume for some $k$ with $1 \le k \le m$, that $f^{(k)}(a) = f^{(k)}(x_k) = 0$, $f^{(k)}$ is continuous on the closed interval between $a$ and $x_k$, and $f^{(k)}$ is differentiable between $a$ and $x_k$. Then by Rolle's Theorem there is an $x_{k+1}$ strictly between $a$ and $x_k$ such that $f^{(k+1)}(x_{k+1}) = 0$.

Thus, by mathematical induction, there is an $x = x_{m+1}$ strictly between $a$ and $b$ such that $f^{(m+1)}(x) = 0$ completing the proof.

This Higher Order Rolle's Theorem can now be used to prove Taylor's Theorem. If the function $f$ is $m + 1$ times differentiable between $c$ and $x$, then $f$ has an $m$th degree Taylor polynomial $g(y) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (y - c)^n$. Notice that the difference $f(y) - g(y)$ has the property that this function and its first $m$ derivatives are all equal to 0 at $c$. The remainder term $R_m(x)$ will include a factor of $f^{(m+1)}$ evaluated at some $\xi$ between $c$ and $x$, and that value of $\xi$ will come from an application of Rolle's Theorem. Of course, to apply Rolle's Theorem, $f(y) - g(y)$ would need to be 0 at $y = x$. One needs to add a term to $f(y) - g(y)$ which will not affect the function and its derivatives at $c$ but will make the function equal to 0 at $x$. The term that accomplishes this is $\big( f(x) - g(x) \big) \frac{(y - c)^{m+1}}{(x - c)^{m+1}}$ since this term equals $f(x) - g(x)$ at $x$, and it and its first $m$ derivatives are equal to 0 at $c$. But now the Higher Order Rolle's Theorem can be applied to the function $h(y) = f(y) - g(y) - \big( f(x) - g(x) \big) \frac{(y - c)^{m+1}}{(x - c)^{m+1}}$ to find a value of $\xi$ between $c$ and $x$ such that $h^{(m+1)}(\xi) = 0$, or $0 = f^{(m+1)}(\xi) - g^{(m+1)}(\xi) - \big( f(x) - g(x) \big) \frac{(m + 1)!}{(x - c)^{m+1}}$. Noting that the $(m+1)$st derivative of the $m$th degree polynomial $g$ is identically 0 gives $f(x) = g(x) + f^{(m+1)}(\xi) \frac{(x - c)^{m+1}}{(m + 1)!}$ as desired.

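PROOF (Taylor's Theorem): Let $f$ be an $m + 1$ times differentiable function on the open interval from $c$ to $x$ with $c \ne x$, and let $f$ be continuous on the closed interval from $c$ to $x$. Then there is a $\xi$ between $c$ and $x$ such that $f(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x - c)^n + f^{(m+1)}(\xi) \frac{(x - c)^{m+1}}{(m + 1)!}$.

Let $f$ be an $m + 1$ times differentiable function on the open interval from $c$ to $x$ with $c \ne x$, and let $f$ be continuous on the closed interval from $c$ to $x$.

Let $g(y) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (y - c)^n$, and define the function $h(y) = f(y) - g(y) - \big( f(x) - g(x) \big) \frac{(y - c)^{m+1}}{(x - c)^{m+1}}$.

Then $0 = h(c) = h'(c) = h''(c) = \cdots = h^{(m)}(c) = h(x)$.

Thus, by the Higher Order Rolle's Theorem, there exists $\xi$ between $c$ and $x$ such that $h^{(m+1)}(\xi) = 0$.

This implies that $f(x) = \sum_{n=0}^{m} \frac{f^{(n)}(c)}{n!} (x - c)^n + f^{(m+1)}(\xi) \frac{(x - c)^{m+1}}{(m + 1)!}$ which completes the proof.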

For example, the cosine function is analytic, and its power series which converges for all real numbers is $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots$. So, how accurate an approximation is $1 - \frac{x^2}{2!} + \frac{x^4}{4!}$ at $x = 2$? It is clear that the given Taylor polynomial includes the terms for $n = 0$, 1, 2, 3, and 4, but it is beneficial to note that it also includes the term for $n = 5$ which is 0. Therefore, Taylor's Theorem says that the remainder at $x = 2$ is $f^{(6)}(\xi) \frac{(2 - 0)^6}{6!} = -\cos(\xi) \frac{2^6}{720}$. Since the cosine function is bounded by 1, the error introduced by using the Taylor polynomial as an approximation to $\cos 2$ is at most $\frac{2^6}{720} \approx 0.08889$. In fact, the Taylor polynomial at $x = 2$ equals $-\frac{1}{3} \approx -0.333333$ while $\cos 2 \approx -0.416146$ with a difference of $0.08281$.
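The arithmetic in this example is easy to reproduce. The sketch below evaluates the polynomial $1 - x^2/2! + x^4/4!$ at $x = 2$, compares it with `math.cos`, and checks the Lagrange bound $2^6/6!$; the function name is a label chosen here for illustration.

```python
import math

def cos_taylor_degree5(x):
    """Degree 5 Taylor polynomial of cos at 0; the odd-degree terms are zero."""
    return 1 - x ** 2 / math.factorial(2) + x ** 4 / math.factorial(4)

approx = cos_taylor_degree5(2.0)
error = abs(math.cos(2.0) - approx)
bound = 2.0 ** 6 / math.factorial(6)  # Lagrange bound using |cos| <= 1

print(approx)  # about -0.333333
print(error)   # about 0.08281
print(bound)   # about 0.08889
```

The actual error sits just under the bound, which shows the Lagrange estimate is fairly tight at this point.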

Given two analytic functions each represented by power series with common center $c$ and positive radii of convergence, it is straightforward to find the power series representing the sum, difference, product, and quotient of these functions. Suppose two functions have power series $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ and $g(x) = \sum_{n=0}^{\infty} b_n (x - c)^n$ which both converge when $|x - c| < R$ for some $R > 0$. Then theorems about the sum and difference of series of real numbers ensure that the sum and difference, $(f + g)(x) = \sum_{n=0}^{\infty} (a_n + b_n)(x - c)^n$ and $(f - g)(x) = \sum_{n=0}^{\infty} (a_n - b_n)(x - c)^n$, both converge when $|x - c| < R$. Of course, it is possible that the new series converges in an even larger interval. For example, the series $1 + x + x^2 + x^3 + \cdots$ and $2 - x - x^2 - x^3 - \cdots$ both have radius of convergence equal to 1, but the sum of the two series is the constant function 3, and its power series converges for all $x$.

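The product of two power series can be found by using the Cauchy product of the two series. If $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ has radius of convergence $R_1 > 0$ and $g(x) = \sum_{n=0}^{\infty} b_n (x - c)^n$ has radius of convergence $R_2 > 0$, then both series converge absolutely when $|x - c| < \min(R_1, R_2)$ implying that their Cauchy product,

\[ (fg)(x) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} a_p (x - c)^p \, b_{n-p} (x - c)^{n-p} \right) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} a_p b_{n-p} \right) (x - c)^n, \]

converges for $|x - c| < \min(R_1, R_2)$. Again, the product series may converge in a larger interval, as with the product of $\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots$ and $1 - x$ which converges for all $x$.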

If $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n$ has radius of convergence $R_1 > 0$ and $g(x) = \sum_{n=0}^{\infty} b_n (x - c)^n$ has radius of convergence $R_2 > 0$, and $g(c)$ is not zero, then one can find the power series for the quotient $h(x) = \frac{f(x)}{g(x)}$ centered at $c$ by working backwards from the Cauchy product of $h$ and $g$. That is, if you assume that $h(x) = \sum_{n=0}^{\infty} q_n (x - c)^n$, then $f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n = h(x) g(x) = \sum_{n=0}^{\infty} \left( \sum_{p=0}^{n} b_p q_{n-p} \right) (x - c)^n$. Because of the assumption that $g(c) \ne 0$, it follows that $b_0 \ne 0$. Then equating like terms in the product gives the sequence of equations

$a_0 = b_0 q_0$
$a_1 = b_0 q_1 + b_1 q_0$
$a_2 = b_0 q_2 + b_1 q_1 + b_2 q_0$
$a_3 = b_0 q_3 + b_1 q_2 + b_2 q_1 + b_3 q_0$

and so forth. The first equation can be solved to give $q_0$. Then the second equation can be solved to give $q_1$, and so forth. The fact that $g(c) \ne 0$ says that the coefficient $b_0 \ne 0$ which allows the equation for $a_m$ to be solved for $q_m$ for each $m \ge 0$. Often this results in a recursive formula for $q_n$. For example, it is known that $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$, so you can find the series centered at 0 for the quotient $\frac{\ln(1 + x)}{1 + x} = \sum_{n=0}^{\infty} q_n x^n$ by writing $\ln(1 + x) = (1 + x) \sum_{n=0}^{\infty} q_n x^n$ giving $0 = 1 \cdot q_0$ so $q_0 = 0$. Then for $n \ge 1$, $\frac{(-1)^{n-1}}{n} = q_n + q_{n-1}$, so $q_1 = 1$, $q_2 = -\frac{3}{2}$, $q_3 = \frac{11}{6}$, and so forth with $q_n = \frac{(-1)^{n-1}}{n} - q_{n-1}$.
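The recursive solution for the quotient coefficients can be sketched in code. The following is a minimal illustration using exact rational arithmetic on truncated coefficient lists; the function names and the truncation length `N` are choices made here, and the lists are assumed to hold the coefficients $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$ with $b_0 \ne 0$.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients of the product series, truncated to the shorter input length."""
    n_terms = min(len(a), len(b))
    return [sum(a[p] * b[n - p] for p in range(n + 1)) for n in range(n_terms)]

def quotient_coefficients(a, b):
    """Solve a_n = sum_{p=0}^{n} b_p q_{n-p} for q_0, q_1, ..., assuming b[0] != 0."""
    q = []
    for n in range(min(len(a), len(b))):
        # Everything in the n-th equation except the b_0 * q_n term is already known.
        known = sum(b[p] * q[n - p] for p in range(1, n + 1))
        q.append((a[n] - known) / b[0])
    return q

N = 6
# ln(1 + x) = x - x^2/2 + x^3/3 - ...
log_series = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, N)]
# 1 + x, padded with zeros to length N
one_plus_x = [Fraction(1), Fraction(1)] + [Fraction(0)] * (N - 2)

q = quotient_coefficients(log_series, one_plus_x)
print(q[:4])
```

Multiplying the result back by $1 + x$ with `cauchy_product` recovers the coefficients of $\ln(1 + x)$, which is a convenient check on the recursion.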

8.5.6 Exercises

Determine for which $x$ the following power series converge.

1. $\sum_{n=0}^{\infty} \frac{4^n (x + 4)^n}{3^n + 5^n}$
2. $\sum_{n=0}^{\infty} \frac{n^5 (x - 2)^n}{8^n}$
3. $\sum_{n=0}^{\infty} \frac{n x^n}{(2n)!}$
4. $\sum_{n=1}^{\infty} \frac{n x^n}{n^n}$

Find the power series for each of the following functions centered at $c = 0$.

5. $e^x$
6. $e^{2x}$
7. $\sin(3x)$
8. $\frac{\sin x}{1 + x}$
9. $\ln(\cos x)$
10. Using the fact that $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$ and $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$, show that the power series satisfy the identity $\sin^2 x + \cos^2 x = 1$.
11. Find the first four nonzero terms of the series for $\tan x$ centered at 0 by finding the quotient of the series for $\sin x$ and the series for $\cos x$. Then check your work by generating those terms using Taylor's Theorem.
12. Let $f(x) = \begin{cases} e^{-1/x^2} & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$. Prove that for each positive integer $n$, the derivative $f^{(n)}(0) = 0$. Then show that the $m$th degree Taylor polynomial for $f$ centered at 0 is $p(x) = 0$, and the remainder term is $R_m(x) = f(x)$.

In a sense Analysis can be thought about as the study of limiting processes. So far this book has discussed limits of functions and sequences, the continuity of functions, differentiation of functions, integration of functions, the convergence of infinite series, and now the convergence of sequences and series of functions. In each of these studies one fundamental question recurs: when is it valid to interchange the order of limiting processes? For example, the question of continuity is a question of whether $\lim_{x \to a} f(x)$ is the same as $f\left( \lim_{x \to a} x \right)$. The Fundamental Theorem of Calculus establishes when the derivative of an integral is equal to the integral of a derivative. The discussion of convergence of sequences of functions included questions about when $\int_a^b \lim_{n \to \infty} f_n(x) \, dx$ is the same as $\lim_{n \to \infty} \int_a^b f_n(x) \, dx$. Power series give an example where $\frac{d}{dx} \sum_{n=0}^{\infty} a_n (x - c)^n = \sum_{n=0}^{\infty} \frac{d}{dx} a_n (x - c)^n$, and Abel's Theorem gives an example where $\lim_{x \to R} \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \lim_{x \to R} a_n x^n$. In each case, the question of Analysis asks: when can you interchange the order of two limiting processes? It is instructive to watch for other occurrences of this question as your study of Analysis continues.

Chapter 9

In the field of Analysis the concepts of the limit and the continuity of a function $f$ at a point $x = a$ are defined in terms of open intervals. For example, the condition $|f(x) - L| < \epsilon$ says that $f(x)$ is in an open interval centered at $L$, and the condition $|x - a| < \delta$ says that $x$ is in an open interval centered at $a$. These intervals are specified in terms of the distance between $x$ and $y$ given by $|x - y|$. Topology is

a branch of Mathematics where these concepts are extended to spaces where one

can discuss intervals without having to rely on a distance formula. As a result the

concepts of limit and continuity can be extended to such spaces, and it can be shown

that many of the properties associated with continuous functions defined on the

real line are shared by continuous functions defined on these more general spaces.

Although the theorems discussed in this chapter are presented in the context of sets

on the real line, virtually all of the theorems are true in the more general context of

any topological space. Many of the techniques used to prove these theorems are the

same techniques one would use for a general topological space, and, therefore, this

chapter can be thought of as an introduction to the field of Topology even though

general topological spaces are not discussed here.

A good way to begin is by taking a set $S \subseteq \mathbb{R}$ and identifying the points $s \in S$ that are not only inside of $S$ but are, in a sense, completely surrounded by points in $S$. The point $s$ is said to be in $\mathrm{int}(S)$, called the interior of $S$, if there is an $\epsilon > 0$ such that all $x$ within $\epsilon$ of $s$ are in $S$, that is, $|x - s| < \epsilon$ implies $x \in S$. You can think of the interior of $S$ as those points which are a positive distance from the complement of $S$, $S^c = \mathbb{R} \setminus S$. For example, if $S$ is the closed interval $[0, 4]$, then the open interval $(0, 4)$ is the interior of $S$. This is because if $s \in (0, 4)$ and $\epsilon = \min(s, 4 - s)$, then all $x$ satisfying $|x - s| < \epsilon$ are elements of $S$. The two endpoints of the interval $[0, 4]$, 0 and 4, do not have this property. No open interval containing either 0 or 4 is completely contained inside of $S$. Clearly, if $x > 4$ or $x < 0$, then $x \notin S$, so $x \notin \mathrm{int}(S)$. Thus, $\mathrm{int}(S) = (0, 4)$. The interior of the set $\mathbb{Q}$ of rational numbers is the empty set because all nonempty open intervals contain irrational numbers, so no nonempty open interval is contained in $\mathbb{Q}$. One sometimes says that $\mathbb{Q}$ has no interior even though it does have an interior; it is just that its interior is the empty set.

Springer International Publishing Switzerland 2016
J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_9

[Fig. 9.1: The point $x$ is in the interior of $S$. The point $y$ is on the boundary of $S$. The point $z$ is in the exterior of $S$.]

Then $\mathrm{ext}(S)$, called the exterior of $S$, is just defined to be the interior of $S^c$, that is, $s \in \mathrm{ext}(S)$ if there is an $\epsilon > 0$ such that all $x$ satisfying $|x - s| < \epsilon$ are in $S^c = \mathbb{R} \setminus S$. The exterior of $S$ is the set of points that are completely surrounded by points outside of $S$. You can think of the exterior of $S$ as the collection of points bounded away from $S$, that is, the points that are a positive distance from $S$. The exterior of the set $[0, 4]$ is the union of two open intervals, $(-\infty, 0) \cup (4, \infty)$. The exterior of the set $\mathbb{Q}$ is the empty set.

If a point $s$ is in neither $\mathrm{int}(S)$ nor $\mathrm{ext}(S)$, then it must be that no open interval containing $s$ is completely inside of $S$ and no open interval containing $s$ is completely outside of $S$. Thus, for every $\epsilon > 0$, the interval $(s - \epsilon, s + \epsilon)$ contains at least one element of $S$ and at least one element of $S^c$. Such points are said to be in $\partial S$, called the boundary of $S$ (Fig. 9.1). Note that the symbol used for boundary is $\partial$ which is the same symbol used for partial derivatives in Calculus. There are connections between derivatives and boundaries that justify the use of the same symbol for both concepts. The boundary of $[0, 4]$ is the set $\{0, 4\}$. The boundary of $\mathbb{Q}$ is the entire real line, $\mathbb{R}$.

It is important to note that for any set $S \subseteq \mathbb{R}$, the three sets $\mathrm{int}(S)$, $\mathrm{ext}(S)$, and $\partial S$ partition $\mathbb{R}$, that is, each real number $x$ is in exactly one of these three sets. A proof of this fact must show two things about a set $S$: that $\mathbb{R} = \mathrm{int}(S) \cup \mathrm{ext}(S) \cup \partial S$, and that no point $x$ belongs to more than one of these sets. To show that $\mathbb{R}$ is a union of the three sets, you would take an arbitrary $x \in \mathbb{R}$ and show that it is in at least one of these sets. One way to show that a point must be one of three things is to assume that it is not one of the first two, and then prove that it must be the third. In this case, you can assume that a point $x \in \mathbb{R}$ is not in $\mathrm{int}(S)$ or in $\mathrm{ext}(S)$. If $x$ is not in $\mathrm{int}(S)$, then for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained in $S$. If $x$ is not in $\mathrm{ext}(S)$, then for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained in $S^c$. The only alternative is that if $x$ is in neither $\mathrm{int}(S)$ nor $\mathrm{ext}(S)$, then for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains points in both $S$ and its complement. This means that $x$ is in $\partial S$ implying that $x$ must be in at least one of $\mathrm{int}(S)$, $\mathrm{ext}(S)$, or $\partial S$. To show that the three sets are disjoint, show that if $x$ belongs to one of the three sets, it cannot belong to either of the other two sets. These inferences follow directly from the definitions of the sets.

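PROOF: For every set $S \subseteq \mathbb{R}$, $\mathbb{R} = \mathrm{int}(S) \cup \mathrm{ext}(S) \cup \partial S$ and the three sets $\mathrm{int}(S)$, $\mathrm{ext}(S)$, and $\partial S$ are mutually disjoint.

Let $S \subseteq \mathbb{R}$.

Assume that $x$ is a real number that is not a member of $\mathrm{int}(S)$ or $\mathrm{ext}(S)$.

Then, because $x \notin \mathrm{int}(S)$, for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained in $S$, so it contains points of $S^c$.

And because $x \notin \mathrm{ext}(S)$, for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ is not contained in $S^c$, so it contains points of $S$.

It follows that for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains points in $S$ and points in $S^c$.

Thus, by the definition of boundary, $x \in \partial S$, and this shows that $x$ must be in at least one of the three sets, $\mathrm{int}(S)$, $\mathrm{ext}(S)$, or $\partial S$.

If $x \in \mathrm{int}(S)$, then there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$.

But then $x \in S$, so $x \notin \mathrm{ext}(S)$, and $(x - \epsilon, x + \epsilon) \subseteq S$ shows that $x \notin \partial S$.

Similarly, if $x \in \mathrm{ext}(S)$, then it cannot be in $\mathrm{int}(S)$ or $\partial S$.

Thus, no $x \in \mathbb{R}$ is a member of more than one of the three sets which completes the proof.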
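For a concrete set the partition is easy to make explicit. The sketch below handles only the special case of a closed interval $S = [a, b]$ with $a < b$, such as the running example $[0, 4]$; it is an illustration of the three-way classification, not a general algorithm for arbitrary subsets of the real line, and the function name is chosen here for illustration.

```python
def classify(x, a, b):
    """Classify x relative to the closed interval S = [a, b], assuming a < b.

    Returns exactly one of "interior", "boundary", "exterior", mirroring
    the fact that int(S), ext(S), and the boundary partition the real line.
    """
    if a < x < b:
        # epsilon = min(x - a, b - x) keeps (x - epsilon, x + epsilon) inside S
        return "interior"
    if x == a or x == b:
        # every interval around an endpoint meets both S and its complement
        return "boundary"
    # epsilon = distance from x to [a, b] keeps the interval inside the complement
    return "exterior"

for point in (-1.0, 0.0, 2.0, 4.0, 5.5):
    print(point, classify(point, 0.0, 4.0))
```

Each point receives exactly one label, which is the content of the partition statement for this particular set.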

There are many results that follow directly from the definitions of interior,

exterior, and boundary. For example, if S and T are any subsets of R, then

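$\mathrm{int}(\mathrm{int}(S)) = \mathrm{int}(S)$.
$\mathrm{int}(\mathrm{ext}(S)) = \mathrm{ext}(S)$.
$\mathrm{int}(S) \subseteq \mathrm{ext}(\mathrm{ext}(S))$.
$\mathrm{ext}(S) \subseteq \mathrm{ext}(\mathrm{int}(S))$.
$\partial(\partial(S)) \subseteq \partial S$.
$\partial(\mathrm{int}(S)) \subseteq \partial S$.
$\partial(\mathrm{ext}(S)) \subseteq \partial S$.
$\partial(S) = \partial(S^c)$.
$\mathrm{int}(S) \cup \mathrm{int}(T) \subseteq \mathrm{int}(S \cup T)$.
$\mathrm{ext}(S \cup T) \subseteq \mathrm{ext}(S) \cap \mathrm{ext}(T)$.
$\mathrm{int}(S \cap T) = \mathrm{int}(S) \cap \mathrm{int}(T)$.
$\partial(S \cup T) \subseteq \partial S \cup \partial T$.
if $S \subseteq T$, then $\mathrm{int}(S) \subseteq \mathrm{int}(T)$.
if $S \subseteq T$, then $\mathrm{ext}(T) \subseteq \mathrm{ext}(S)$.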

Each of these results is a statement about either two sets being equal to each other or one set being a subset of another. Thus, one would prove these results using the techniques discussed in Chap. 2 for proving subset and set equality statements. For example, how would you write a proof that for any set $S$, $\mathrm{int}(\mathrm{int}(S)) = \mathrm{int}(S)$? This would be a proof that two sets are equal, so the proof would consist of two parts: showing $\mathrm{int}(\mathrm{int}(S)) \subseteq \mathrm{int}(S)$ and showing $\mathrm{int}(S) \subseteq \mathrm{int}(\mathrm{int}(S))$. The fact that $\mathrm{int}(\mathrm{int}(S)) \subseteq \mathrm{int}(S)$ is just a consequence of the definition of interior. For any set $T$, $\mathrm{int}(T) \subseteq T$, so certainly $\mathrm{int}(\mathrm{int}(S)) \subseteq \mathrm{int}(S)$. Showing that $\mathrm{int}(S) \subseteq \mathrm{int}(\mathrm{int}(S))$ is showing that one set is a subset of another. So, you would let $x$ be an element of $\mathrm{int}(S)$, and then show that $x$ is also an element of $\mathrm{int}(\mathrm{int}(S))$. By the definition of interior, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$. Thus, you need to show that $(x - \epsilon, x + \epsilon)$ is contained in $\mathrm{int}(S)$. That is, each $y \in (x - \epsilon, x + \epsilon)$ must be in the interior of $S$. But it is easy to find an open interval centered at $y$ that is contained in $(x - \epsilon, x + \epsilon)$. Just let $\delta = \min(y - (x - \epsilon), x + \epsilon - y) > 0$ because then $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon)$. This shows each point of $(x - \epsilon, x + \epsilon)$ is in $\mathrm{int}(S)$ which completes the proof.

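PROOF: For every set $S \subseteq \mathbb{R}$, $\mathrm{int}(\mathrm{int}(S)) = \mathrm{int}(S)$.

Let $S \subseteq \mathbb{R}$.

For any set $T$, $\mathrm{int}(T) \subseteq T$, so $\mathrm{int}(\mathrm{int}(S)) \subseteq \mathrm{int}(S)$.

So let $x \in \mathrm{int}(S)$.

By the definition of interior, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon)$ is contained in $S$.

Let $y \in (x - \epsilon, x + \epsilon)$, and let $\delta = \min(y - (x - \epsilon), x + \epsilon - y) > 0$.

Then $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon) \subseteq S$.

This shows that $(x - \epsilon, x + \epsilon) \subseteq \mathrm{int}(S)$ implying that $x$ is in $\mathrm{int}(\mathrm{int}(S))$.

This proves that $\mathrm{int}(S) \subseteq \mathrm{int}(\mathrm{int}(S))$ and completes the proof of the theorem.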

For a more difficult challenge, consider writing a proof that for any set $S$, $\partial(\partial S) \subseteq \partial S$ which, in words, says that the boundary of the boundary of a set is contained in the boundary of the set. For example, let $S$ be the set of rational numbers in the interval $[0, 4]$. You should prove to yourself that the boundary of this set is the entire interval $[0, 4]$. The boundary of that interval is just $\{0, 4\}$ which indeed is contained in $\partial S = [0, 4]$. To show that $\partial(\partial S)$ is a subset of $\partial S$, you would take an arbitrary point $x \in \partial(\partial S)$ and show that it is in $\partial S$. So what do you know if $x \in \partial(\partial S)$? The only tool you have at your disposal here is the definition of the boundary of a set, so you would proceed to use that definition. It says that for every $\epsilon > 0$ the open interval $(x - \epsilon, x + \epsilon)$ contains elements of $\partial S$ and elements of the complement of $\partial S$. You want to show that $x$ is in $\partial S$, so you would need to show that the open interval $(x - \epsilon, x + \epsilon)$ contains elements of $S$ and elements of $S^c$. Well, what is the consequence of saying that the open interval $(x - \epsilon, x + \epsilon)$ contains elements of $\partial S$? It must mean that there is a $y \in (x - \epsilon, x + \epsilon)$ such that $y \in \partial S$. What does it mean for $y$ to be in $\partial S$? It means that for every $\delta > 0$, the interval $(y - \delta, y + \delta)$ contains elements of $S$ and elements of $S^c$. But this is sufficient if $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon)$ because that would put elements of both $S$ and $S^c$ in $(x - \epsilon, x + \epsilon)$. This can be arranged by selecting $\delta$ small enough (Fig. 9.2).

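PROOF: For every set $S \subseteq \mathbb{R}$, $\partial(\partial S) \subseteq \partial S$.

Let $S \subseteq \mathbb{R}$.

Let $x \in \partial(\partial S)$.

Then by the definition of boundary, for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains points of $\partial S$ and points of the complement of $\partial S$.

Let $\epsilon > 0$ be given, and let $y \in (x - \epsilon, x + \epsilon)$ such that $y \in \partial S$.

Let $\delta = \min(y - (x - \epsilon),\ x + \epsilon - y) > 0$ so that the open interval $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon)$.

By the definition of boundary, the interval $(y - \delta, y + \delta)$ contains an element of $S$ and an element of $S^c$.

But $(y - \delta, y + \delta) \subseteq (x - \epsilon, x + \epsilon)$ shows that $(x - \epsilon, x + \epsilon)$ contains an element of $S$ and an element of $S^c$, so $x$ is in $\partial S$ which completes the proof.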

As a third example, consider proving that for any two sets $S$ and $T$, $\operatorname{int}(S) \cup \operatorname{int}(T) \subseteq \operatorname{int}(S \cup T)$. Again, this is proving that one set is a subset of a second set, so your proof would start by selecting an arbitrary element of the first set and then proceed to show that that element belongs to the second set. Here the first set is $\operatorname{int}(S) \cup \operatorname{int}(T)$. If you select an $x$ from this set, all you know about $x$ is that it is in the union of the two sets $\operatorname{int}(S)$ and $\operatorname{int}(T)$. So, the only tool you can use is the definition of union to say that $x$ must be either a member of $\operatorname{int}(S)$ or a member of $\operatorname{int}(T)$. In the case that $x \in \operatorname{int}(S)$, you can then apply the definition of interior to say that there is an $\epsilon > 0$ such that the interval $(x - \epsilon, x + \epsilon) \subseteq S$. But this is all you need since $S \subseteq S \cup T$ showing $(x - \epsilon, x + \epsilon) \subseteq S \cup T$ proving that $x \in \operatorname{int}(S \cup T)$. The case where $x \in \operatorname{int}(T)$ is analogous, completing the proof.

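PROOF: For any sets of real numbers $S$ and $T$, $\operatorname{int}(S) \cup \operatorname{int}(T) \subseteq \operatorname{int}(S \cup T)$.

Let $S$ and $T$ be sets of real numbers.

Let $x \in \operatorname{int}(S) \cup \operatorname{int}(T)$.

Then by the definition of the union of two sets, either $x \in \operatorname{int}(S)$ or $x \in \operatorname{int}(T)$.

Without loss of generality, assume that $x \in \operatorname{int}(S)$.

Then there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$.

But since $S \subseteq S \cup T$, it follows that $(x - \epsilon, x + \epsilon) \subseteq S \cup T$ showing that $x \in \operatorname{int}(S \cup T)$, completing the proof.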

Can it be that $\operatorname{int}(S) \cup \operatorname{int}(T)$ is not equal to $\operatorname{int}(S \cup T)$? The answer is yes. See if you can think of an example.

9.1.1 Exercises

For each of the following sets, find the interior, exterior, and boundary of the set.

1. $[0, 3) \cup (3, 6]$

2. $\bigcup_{n=1}^{\infty} \left(\frac{1}{2^n}, \frac{1}{2^{n-1}}\right]$

3. $[0, 4] \cap \mathbb{Q}$

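Write proofs for each of the following statements. For exercises involving the subset relation rather than the equality relation, give examples showing that the subset relation in the statement cannot be replaced by an equality.

4. If $S \subseteq T$, then $\operatorname{int}(S) \subseteq \operatorname{int}(T)$.

5. If $S \subseteq T$, then $\operatorname{ext}(T) \subseteq \operatorname{ext}(S)$.

6. $\operatorname{int}(\operatorname{ext}(S)) = \operatorname{ext}(S)$.

7. $\operatorname{ext}(S) \subseteq \operatorname{ext}(\operatorname{int}(S))$.

8. $\partial(\operatorname{int}(S)) \subseteq \partial S$.

9. $\partial(S) = \partial(S^c)$.

10. $\partial(\operatorname{ext}(S)) \subseteq \partial S$.

11. $\operatorname{int}(S \cap T) = \operatorname{int}(S) \cap \operatorname{int}(T)$.

12. $\operatorname{ext}(S \cup T) = \operatorname{ext}(S) \cap \operatorname{ext}(T)$.

13. $\partial(S \cup T) \subseteq \partial S \cup \partial T$.

14. $\operatorname{int}(S) \subseteq \operatorname{ext}(\operatorname{ext}(S))$.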

A set $S$ of real numbers is called open if for every $x \in S$ there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$. A set $S$ of real numbers is called closed if $\partial S \subseteq S$. The intervals that are called open intervals are, in fact, open sets. In particular, $(1, 7)$, $(2, \infty)$, and $\emptyset$ are all open sets as well as $(-3, 3) \cup (5, 9) \cup (10, 41)$ and $\bigcup_{n=1}^{\infty} (2n, 2n + 1)$. The intervals that are called closed intervals are, in fact, closed sets. In particular, $[-5, 3]$, $[4, \infty)$, and $\emptyset$ are all closed sets.

There are actually many equivalent ways to define open and closed sets, so one usually begins this discussion by proving that all the different definitions are equivalent. In particular, if $S \subseteq \mathbb{R}$, then the following are equivalent:

1. $S$ is an open set.

2. $S = \operatorname{int}(S)$.

3. $S \cap \partial S = \emptyset$.

4. $S^c$ is a closed set.

Many theorems in mathematics are statements of the form $p \Leftrightarrow q$, and the proof of these statements is often broken into two steps: $p \Rightarrow q$ and $q \Rightarrow p$. Theorems of that type state that two conditions are equivalent. But it is not uncommon to have a theorem that states that several statements are equivalent, that is, $p_1 \Leftrightarrow p_2 \Leftrightarrow p_3 \Leftrightarrow \cdots \Leftrightarrow p_k$. One way to prove theorems of this form is to show in a sequence of steps that $p_1 \Rightarrow p_2$, $p_2 \Rightarrow p_3$, $p_3 \Rightarrow p_4$, ..., $p_{k-1} \Rightarrow p_k$, and then $p_k \Rightarrow p_1$. This is the technique you can use to prove the list of statements about open sets. You would begin by assuming condition 1, that a set $S$ is open and then prove condition 2, that $S = \operatorname{int}(S)$. This can be done by noting that for any set, elements of the set are either in the interior of the set or on the boundary of the set. But if the set $S$ is open, it means that for each $x \in S$ there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq S$. Thus, $(x - \epsilon, x + \epsilon)$ contains no elements of $S^c$ showing that $x$ cannot be in $\partial S$, so it must be that $x \in \operatorname{int}(S)$ which proves that $S = \operatorname{int}(S)$.

Now, assuming condition 2 that $S = \operatorname{int}(S)$ it follows immediately that $S \cap \partial S = \operatorname{int}(S) \cap \partial S = \emptyset$, which is condition 3. If you assume condition 3 that $S \cap \partial S = \emptyset$, how can you conclude that $S^c$ is closed? Well, if $S$ contains no elements of $\partial S$, it must be that all the elements of $\partial S$ (if there are any) belong to $S^c$. But as seen in the exercises of the previous section, the boundary of $S$ and the boundary of $S^c$ are always the same. This follows from the fact that the definition of boundary is symmetric in its references to $S$ and $S^c$. Therefore, $S^c$ contains its boundary proving that $S^c$ is a closed set, which is condition 4.

Finally, assuming condition 4 that $S^c$ is a closed set, you know that $S^c$ contains its boundary, so $S^c$ contains the boundary of $S$. You must show that for each $x \in S$, there is an open interval centered at $x$ such that the entire interval is contained in $S$. But if for every $\epsilon > 0$ the open interval $(x - \epsilon, x + \epsilon)$ contains elements in $S^c$, then $x$ would be in the boundary of $S$ which is false. Thus, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon)$ is contained in $S$. This proves that $S$ is an open set, which is condition 1 (Fig. 9.3).

Fig. 9.3 An open set $S$, its boundary, and its complement $S^c$

1. $S$ is an open set.

2. $S = \operatorname{int}(S)$.

3. $S \cap \partial S = \emptyset$.

4. $S^c$ is a closed set.

Let $S \subseteq \mathbb{R}$.

Condition 1 $\Rightarrow$ Condition 2

Assume that $S$ is an open set.

If $x \in S$, then by the definition of open set, there exists an $\epsilon > 0$ such that $(x - \epsilon, x + \epsilon) \subseteq S$.

But $(x - \epsilon, x + \epsilon) \subseteq S$ shows that $x$ is not an element of $\partial S$.

Since $S \subseteq \operatorname{int}(S) \cup \partial S$, it can be concluded that $S \subseteq \operatorname{int}(S)$.

Because the interior of any set is contained in the set, it follows that $\operatorname{int}(S) \subseteq S$ implying that $S = \operatorname{int}(S)$, which is condition 2.

Condition 2 $\Rightarrow$ Condition 3

Assume that $S = \operatorname{int}(S)$.

Then $S \cap \partial S = \operatorname{int}(S) \cap \partial S = \emptyset$ because the interior and the boundary of any set are disjoint.

Thus, $S \cap \partial S = \emptyset$, which is condition 3.

Condition 3 $\Rightarrow$ Condition 4

Assume that $S \cap \partial S = \emptyset$.

Then $\partial S$ must be contained in $S^c$.

Because $\partial S = \partial(S^c)$, it follows that $\partial(S^c) \subseteq S^c$ implying that $S^c$ is a closed set, which is condition 4.

Condition 4 $\Rightarrow$ Condition 1

Assume that $S^c$ is a closed set, which means that $S^c$ contains $\partial(S^c)$.

Let $x \in S$.

If for every $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains elements of $S^c$, then $x$ would be an element of $\partial S = \partial(S^c)$.

But all elements of $\partial(S^c)$ are contained in $S^c$, so there must be an $\epsilon > 0$ such that the interval $(x - \epsilon, x + \epsilon)$ contains no elements of $S^c$ implying that $(x - \epsilon, x + \epsilon) \subseteq S$.

This shows that $S$ is an open set, which is condition 1.

A similar theorem can be proved concerning closed sets.

1. $S$ is a closed set.

2. $S = \operatorname{int}(S) \cup \partial S$.

3. Every accumulation point of $S$ is an element of $S$.

4. $S^c$ is an open set.

Let $S \subseteq \mathbb{R}$.

Condition 1 $\Rightarrow$ Condition 2

Assume that $S$ is a closed set.

Because no point in the exterior of $S$ is a member of $S$, it is clear that $S \subseteq \operatorname{int}(S) \cup \partial S$.

Because $S$ is closed, it contains $\partial S$, and because all sets contain their interior, $S$ contains $\operatorname{int}(S)$.

Thus, $S$ contains $\operatorname{int}(S) \cup \partial S$ proving that $S = \operatorname{int}(S) \cup \partial S$, which is condition 2.

Condition 2 $\Rightarrow$ Condition 3

Assume that $S = \operatorname{int}(S) \cup \partial S$.

Let $x$ be an accumulation point of $S$.

If $x \notin S$, then for all $\epsilon > 0$, the open interval $(x - \epsilon, x + \epsilon)$ contains elements of $S$ (since $x$ is an accumulation point of $S$) and elements of $S^c$ (in particular, $x$).

Thus, $x \in \partial S$ implying that $x \in S$, a contradiction.

Therefore, all accumulation points of $S$ must be elements of $S$, which is condition 3.

Condition 3 $\Rightarrow$ Condition 4

Assume that every accumulation point of $S$ is an element of $S$.

Let $x \in S^c$.

Because $S$ contains all of its accumulation points, $x$ is not an accumulation point of $S$.

Thus, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon)$ contains no elements of $S$, and is, therefore, contained in $S^c$.

This shows that $S^c$ is an open set, which is condition 4.

Condition 4 $\Rightarrow$ Condition 1

Assume that $S^c$ is an open set.

Then $S^c \cap \partial(S^c) = \emptyset$ implying that $\partial(S^c) \subseteq S$, so $\partial S \subseteq S$.

Thus, $S$ contains $\partial S$, so $S$ is a closed set, which is condition 1.

Of course, a set need not be either open or closed as is seen by the interval $(0, 5]$ which contains one but not both of its boundary points (Fig. 9.4), so it is neither open (because it contains a boundary point) nor closed (because it does not contain all of its boundary points).

Fig. 9.4 ... drawn dotted to indicate open sets. Boundaries are drawn solid to indicate closed sets (panels: Open, Closed)

9.2.1 Exercises

Determine which of the following sets of real numbers are open and which are

closed.

1.

2.

3.

4.

R

the irrational numbers

the real numbers that are not integers

5.

6.

7.

8.

9.

10.

If $S$ is an open set, and $T$ is a closed set, then $T \setminus S$ is a closed set.

$\emptyset$ is both an open set and a closed set.

If $x$ is an accumulation point of a set $S$, and $x \notin S$, then $x \in \partial S$.

If $x \in \partial S$ and $x \notin S$, then $x$ is an accumulation point of $S$.

Let $\langle a_n \rangle$ be any sequence. Let $A$ be the set of all values $y$ such that there exists a subsequence of $\langle a_n \rangle$ that converges to $y$. Then $A$ is a closed set.

Perhaps the most important properties open sets have are that the union of any

collection of open sets is itself an open set and that the intersection of a finite

collection of open sets is itself an open set. In fact, these two properties of open sets

are the defining conditions required to hold in the more general setting of topological

spaces (Fig. 9.5).

In the context of the real numbers, it is not hard to show that the union of any

collection of open sets is itself an open set. But before this proof can be started,

there needs to be a convenient way to discuss an arbitrary collection of open sets.

Fig. 9.5 ... sets is an open set

this is a collection of a finite number of sets because the indices used to describe the collection, $\{1, 2, 3, \ldots, k\}$, is a finite set. If the collection is listed as a sequence, $A_1, A_2, A_3, \ldots$, then there is an implication that the collection of sets is denumerable, that is, there is one set for each natural number. But to allow the collection of open sets to be any size, even to be an uncountable collection of sets, one generally wants to represent the collection as a collection of sets $A_i$ where the index $i$ is allowed to range over a particular index set, $I$. That is, the collection is given by $\{A_i \mid i \in I\}$. Thus, since there is no restriction on the size of the index set $I$, there is no restriction on the size of the collection of open sets.

So assume that $\{A_i \mid i \in I\}$ is a collection of open sets. How would you prove that its union, $A = \bigcup_{i \in I} A_i$, is an open set? Given the theorem about open sets in the previous section, you could prove that the set $A$ is open by using the definition of open set, by showing that $A = \operatorname{int}(A)$, by showing that $A \cap \partial A = \emptyset$, or by showing that $A^c$ is a closed set. In this case, it is simple enough to use the definition of open set. Thus, for each $x \in A$, you would need to show that there is an open interval centered at $x$ such that this open interval is contained in $A$. All you know about $A$ is that it is a union of a collection of open sets, so the first thing you should try is invoking the definition of union. That is, if $x \in A$, then there must be a $j \in I$ such that $x \in A_j$. What do you know about $A_j$? Only that it is an open set. That means that there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon)$ is contained in $A_j$. But by the definition of union, $A_j \subseteq A$ implying that $(x - \epsilon, x + \epsilon) \subseteq A$ which is what you needed to prove.

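PROOF: Assume that for each $i$ in the index set $I$, $A_i$ is an open set. Then $\bigcup_{i \in I} A_i$ is an open set.

Let $x \in \bigcup_{i \in I} A_i$.

Then by the definition of union, there is a $j \in I$ such that $x$ is an element of the open set $A_j$.

By the definition of open set, there is an $\epsilon > 0$ such that the open interval $(x - \epsilon, x + \epsilon) \subseteq A_j$.

But by the definition of set union, $A_j \subseteq \bigcup_{i \in I} A_i$ showing that $(x - \epsilon, x + \epsilon) \subseteq \bigcup_{i \in I} A_i$, which proves the theorem.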

Now consider proving the result that the intersection of a finite collection of open sets is itself an open set. This time there is no need to consider an arbitrarily large collection of open sets; you can just use the finite collection of open sets $A_1, A_2, A_3, \ldots, A_k$. Again you would take an arbitrary $x \in A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$. You know from the definition of intersection that for each $j = 1, 2, 3, \ldots, k$, this element $x$ must be in $A_j$. And you know that since $A_j$ is an open set, there must be an $\epsilon_j > 0$ such that the interval $(x - \epsilon_j, x + \epsilon_j) \subseteq A_j$. Now you have a collection of $k$ open intervals each centered at $x$. By selecting $\epsilon = \min(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_k)$, you will have the least of these $\epsilon_j$ values which is a positive number. This is crucial. The fact that you have a finite collection of open sets ensures that you can find a finite number of open intervals centered at $x$ and can find the shortest of these intervals. If the collection of open sets were infinite, there would be no guarantee that there would be a minimum $\epsilon_j$. The fact that there is a minimum value that is greater than 0 allows you to claim that the interval $(x - \epsilon, x + \epsilon)$ is contained in each of the $A_j$ sets, and thus, $(x - \epsilon, x + \epsilon)$ is contained in the intersection of the $A_j$s.

PROOF: Assume that $A_1, A_2, A_3, \ldots, A_k$ are open sets. Then $A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$ is an open set.

Let $x \in A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$.

Then for each $j = 1, 2, 3, \ldots, k$, $x$ is an element of the open set $A_j$, and because $A_j$ is an open set, there exists an $\epsilon_j > 0$ such that the open interval $(x - \epsilon_j, x + \epsilon_j) \subseteq A_j$.

Let $\epsilon = \min(\epsilon_1, \epsilon_2, \epsilon_3, \ldots, \epsilon_k) > 0$.

Then the open interval $(x - \epsilon, x + \epsilon)$ is contained in $A_j$ for each $j = 1, 2, 3, \ldots, k$.

This shows that $(x - \epsilon, x + \epsilon) \subseteq A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_k$ proving that this intersection is an open set.
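The role of finiteness in this proof can be illustrated with a short computation. This is a minimal sketch under assumptions of my own: the helper names `common_radius` and `radius_at_zero` are hypothetical, and the family $A_n = (-\frac{1}{n}, \frac{1}{n})$ is chosen only as an illustration (its intersection over all $n$ is $\{0\}$, which is not open).

```python
from fractions import Fraction


def common_radius(radii):
    """Return the minimum of finitely many positive radii, the epsilon
    used in the finite-intersection proof."""
    radii = list(radii)
    if not radii or any(r <= 0 for r in radii):
        raise ValueError("need a nonempty finite list of positive radii")
    return min(radii)


# Finite case: three radii around the same point x give a positive minimum.
eps = common_radius([Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)])
assert eps == Fraction(1, 5)


def radius_at_zero(n: int) -> Fraction:
    """Largest radius around x = 0 that fits inside A_n = (-1/n, 1/n)."""
    return Fraction(1, n)


# Taking more and more of the sets A_n drives the minimum toward 0,
# so no single positive radius works for the whole infinite family.
partial_minima = [
    common_radius(radius_at_zero(n) for n in range(1, k + 1))
    for k in (1, 10, 100)
]
assert partial_minima == [Fraction(1), Fraction(1, 10), Fraction(1, 100)]
```

Exact fractions are used instead of floats so that the comparisons are not affected by rounding.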

There are analogous results about the union and intersections of closed sets. In particular, the intersection of an arbitrary collection of closed sets is itself a closed set, and the union of a finite number of closed sets is itself a closed set. One can prove these results by relying on the definition of a closed set, but it is much easier to use the results from the previous section that show that a set is a closed set if and only if its complement is an open set. For example, to show that the union of a finite number of closed sets is closed, let $A_1, A_2, A_3, \ldots, A_k$ be a finite collection of closed sets. Then for each $j$, $A_j^c$ is the complement of a closed set, so it is open. By the previous theorem, the intersection of a finite number of open sets is an open set, so $A_1^c \cap A_2^c \cap A_3^c \cap \cdots \cap A_k^c$ is an open set. But De Morgan's Law says that $A_1^c \cap A_2^c \cap A_3^c \cap \cdots \cap A_k^c = (A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_k)^c$ which is an open set, so its complement, $A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_k$, is a closed set as desired.
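The set identity used in this argument can be checked on small finite sets. The following is an illustrative sketch only: the function name `de_morgan_holds` and the sample universe are assumptions, and a finite check is no substitute for the general law.

```python
from functools import reduce


def de_morgan_holds(universe, sets) -> bool:
    """Check that the intersection of the complements equals the
    complement of the union, for a nonempty finite list of subsets."""
    universe = frozenset(universe)
    sets = [frozenset(a) for a in sets]
    # Left side: A_1^c ∩ A_2^c ∩ ... ∩ A_k^c
    lhs = reduce(frozenset.intersection, (universe - a for a in sets))
    # Right side: (A_1 ∪ A_2 ∪ ... ∪ A_k)^c
    rhs = universe - frozenset().union(*sets)
    return lhs == rhs


universe = range(10)
sets = [{0, 1, 2}, {2, 3, 4}, {4, 5}]
assert de_morgan_holds(universe, sets)
```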

Although not needed in this textbook about writing proofs in Analysis, for completeness, it makes sense at this point to introduce the definition of a topological space to be a set $S$ together with a collection $\mathcal{T}$ of subsets of $S$ satisfying the conditions

Both $\emptyset$ and $S$ are in $\mathcal{T}$.

The union of any collection of sets in $\mathcal{T}$ is also a set in $\mathcal{T}$.

The intersection of any finite collection of sets in $\mathcal{T}$ is also a set in $\mathcal{T}$.

If these conditions are satisfied, then the set $\mathcal{T}$ is said to be a topology for the topological space $S$. From the definitions and theorems presented so far in this chapter it follows that the real numbers $\mathbb{R}$ along with its collection of open sets forms a topological space. The advantage of introducing the more general concept of a topological space is that many theorems about the real numbers extend to all topological spaces, so once you justify the fact that you are dealing with a topological space, you then know many theorems about your new space.
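For a finite set, the three conditions can be checked mechanically. The sketch below is an illustration rather than anything from the text: the name `is_topology` is hypothetical, and it relies on the fact that for a finite collection, closure under pairwise unions and intersections extends by induction to all finite unions and intersections.

```python
from itertools import combinations


def is_topology(space, collection) -> bool:
    """Check the topology conditions for a finite collection of subsets
    of a finite set."""
    space = frozenset(space)
    opens = {frozenset(u) for u in collection}
    # Both the empty set and the whole space must belong to the collection.
    if frozenset() not in opens or space not in opens:
        return False
    # Every member must be a subset of the space.
    if any(not u <= space for u in opens):
        return False
    # Closure under pairwise unions and pairwise intersections.
    for u, v in combinations(opens, 2):
        if (u | v) not in opens or (u & v) not in opens:
            return False
    return True


space = {1, 2, 3}
assert is_topology(space, [set(), {1}, {1, 2}, {1, 2, 3}])
# {1} and {2} are included but their union {1, 2} is missing.
assert not is_topology(space, [set(), {1}, {2}, {1, 2, 3}])
```

The finiteness assumption matters: for an infinite collection, arbitrary unions cannot be reduced to pairwise checks in this way.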

As an example of another topological space consider the set of integers, $\mathbb{Z}$, along with the collection $\mathcal{T}$ of subsets of $\mathbb{Z}$ consisting of the empty set, $\emptyset$, and the sets $A \subseteq \mathbb{Z}$ with the property that $A^c = \mathbb{Z} \setminus A$ is a finite set. It is easy to see that both $\emptyset$ and $\mathbb{Z}$ are elements of $\mathcal{T}$. To show that $\mathcal{T}$ is closed under unions, suppose you have a collection of sets in $\mathcal{T}$. There are two cases to consider: (1) all the sets in the collection are the empty set, and (2) at least one of the sets in the collection is not empty. In the first case, the union of all the sets in the collection is the empty set which is in $\mathcal{T}$. In the second case, if the collection includes a nonempty set $A$, then the union of the sets in the collection contains $A$, and because the complement of the union lies inside the complement of $A$ which is finite, the union will have to have a finite complement and be a set in $\mathcal{T}$. To show that $\mathcal{T}$ is closed under finite intersections, suppose you have a finite collection of sets in $\mathcal{T}$. Again, there are two cases to consider: (1) at least one set in the collection is the empty set, and (2) none of the sets in the collection is the empty set. In the first case, the intersection of the collection of sets is the empty set which is in $\mathcal{T}$. In the second case, the complement of the intersection of the finite collection of sets is the union of the complements of the sets. If all the complements are finite, then the union of the finite number of complements is also finite, so the intersection is in $\mathcal{T}$. This verifies that $\mathcal{T}$ is a topology for $\mathbb{Z}$. This is known as the finite complement topology for $\mathbb{Z}$. It is clearly not the usual topology associated with the integers which is just the usual topology of $\mathbb{R}$ restricted to $\mathbb{Z}$. Generally, a set can have many different topologies, each giving rise to a different topological space. Most of these topologies are uninteresting and have few if any applications.
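The case analysis above translates directly into a small data representation. In this sketch, which is my own illustration and not the book's, an open set in the finite complement topology is stored either as `None` (the empty set) or as the finite set of integers it omits; the names `union`, `intersection`, and `contains` are hypothetical.

```python
from typing import FrozenSet, Optional

# None stands for the empty set; otherwise the value is the finite
# complement Z \ A of the open set A. The whole space Z is frozenset().
Cofinite = Optional[FrozenSet[int]]


def union(a: Cofinite, b: Cofinite) -> Cofinite:
    """Union of two open sets: the complement of the union is the
    intersection of the complements, which is still finite."""
    if a is None:
        return b
    if b is None:
        return a
    return a & b


def intersection(a: Cofinite, b: Cofinite) -> Cofinite:
    """Intersection of two open sets: empty if either is empty, otherwise
    the complement is the (finite) union of the two complements."""
    if a is None or b is None:
        return None
    return a | b


def contains(a: Cofinite, n: int) -> bool:
    """Membership test for an integer n."""
    return a is not None and n not in a


A = frozenset({0, 1})  # the open set Z \ {0, 1}
B = frozenset({1, 2})  # the open set Z \ {1, 2}
assert union(A, B) == frozenset({1})
assert intersection(A, B) == frozenset({0, 1, 2})
assert contains(union(A, B), 0) and not contains(intersection(A, B), 0)
```

Storing the finite complement rather than the set itself is what makes an infinite open set representable at all.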


9.3.1 Exercises

1. Find a sequence of open sets whose intersection is neither an open nor a

closed set.

2. Find a sequence of closed sets whose union is neither an open nor a closed set.

3. Prove that the intersection of a collection of closed sets is a closed set.

4. Prove that an open set of real numbers is the union of all the open intervals

contained in the set.

5. Verify that if $S$ is any set, then the power set of $S$, $\mathcal{P}(S)$, consisting of all the subsets of $S$, is a topology for $S$. This is called the discrete topology for $S$.

6. Let $S$ be the interval $[0, 5]$, and let $\mathcal{T}$ include the empty set, the set $S$, and any interval of the form $[0, x)$ where $x \in (0, 5)$. Verify that $\mathcal{T}$ is a topology for $S$.

Sometimes rather than focusing your attention on the entire real line, you are interested in the open sets within a particular subset of the real numbers. For example, if the real-valued function $f$ has domain $A = [-4, 4]$, you might be interested in the open sets contained in $A$. Moreover, you might want to consider some new sets to be open which were not considered to be open sets in $\mathbb{R}$. For example, within $A$ the interval $[-4, 0)$ should be considered open in the topological space consisting just of the set $A$. This is because, within $A$, each point of $[-4, 0)$ is an interior point. The only controversial point here is $-4$, but it makes sense to claim that $-4$ is in the interior of $A$ if your entire universe of interest is $A$. Certainly, all the points of $A$ that are within a distance of $\frac{1}{2}$ of $-4$ are elements of $[-4, 0)$.

Generalizing this idea leads to the definition of the inherited topology in the set $A \subseteq \mathbb{R}$. In the inherited topology, a set $B \subseteq A$ is said to be open in $A$ if $B$ is the intersection of $A$ with some set that is open in $\mathbb{R}$. For example, if $A = [-4, 4]$ as above, then the set $[-4, 0)$ is open in $A$ because $[-4, 0) = (-5, 0) \cap A$, and $(-5, 0)$ is an open set in $\mathbb{R}$. With this same reasoning, within $A$ the set $[-4, -3] \cup [-2, 0) \cup \{1, 2\} \cup [3, 4)$ has interior $[-4, -3) \cup (-2, 0) \cup (3, 4)$ and boundary $\{-3, -2, 0, 1, 2, 3, 4\}$.

Similarly, a set $B \subseteq A$ is said to be closed in $A$ if $B$ is the intersection of $A$ with some set that is closed in $\mathbb{R}$. Note that all of the properties proved earlier in this chapter pertaining to open or closed sets in $\mathbb{R}$ hold equally well for sets open or closed in $A$. In particular, the union of any collection of sets open in $A$ is itself a set that is open in $A$.

The motivation for developing the properties of open and closed sets and for defining topological spaces is that one can now generalize the idea of a continuous function. One defines $\mathcal{P}(X)$, the power set of a set $X$, to be the collection of all subsets of the set $X$. For example, if $X$ is a finite set with $n$ elements, then $\mathcal{P}(X)$ contains the $2^n$ subsets of $X$. If $f: A \to B$ is a function which maps elements of the set $A$ to elements of the set $B$, the function $f$ can be extended to $f: \mathcal{P}(A) \to \mathcal{P}(B)$ which maps subsets of the set $A$ to subsets of the set $B$. If $C \subseteq A$, then define $f(C)$ to be the set $\{y \in B \mid y = f(a) \text{ for some } a \in C\}$. Then $f(C)$ is called the image of $C$ under $f$. The notation $f(C)$ could be confusing because $f$ was originally defined for elements of $A$, not subsets of $A$. The application of $f$ to subsets of $A$ is really defining a new function $f: \mathcal{P}(A) \to \mathcal{P}(B)$ whose domain is the power set of $A$ and codomain is the power set of $B$. The confusion arises because the same name, $f$, is given to both functions. The confusion is cleared up by recognizing the distinction that if the argument of $f$ is an element $a \in A$, then $f(a)$ refers to an element of the codomain, $B$, while if the argument of $f$ is a subset $C \subseteq A$, then $f(C)$ is a subset of $B$, $f(C) \subseteq B$.

For example, the function $f(x) = x^2$ is defined to be a function with domain $\mathbb{R}$ and codomain $\mathbb{R}$. It is then easily understood that $f(-3) = 9$ and $f(2) = 4$. But taking $C$ to be the interval $(-3, 2)$, the expression $f(C)$ now refers to the function $f: \mathcal{P}(\mathbb{R}) \to \mathcal{P}(\mathbb{R})$, and $f(C)$ is the set of all elements of $\mathbb{R}$ that are images under $f$ of elements of $C$. That is, $f(C) = [0, 9)$.

If the function $f: A \to B$ is not a bijection mapping $A$ one-to-one and onto $B$, then it is not possible to define the inverse function $f^{-1}: B \to A$. One problem is that if $f$ is not surjective (mapping $A$ onto $B$), there might be an element $b \in B$ for which there is no corresponding element $a$ satisfying $f(a) = b$, so $f^{-1}(b)$ cannot be defined. Another problem is that if $f$ is not injective (mapping $A$ one-to-one to $B$), then there will be an element $b \in B$ such that $f(x) = b$ is satisfied by more than one value of $x$, so $f^{-1}(b)$ would not be unique. On the other hand, if $D \subseteq B$, it is always possible to define the function $f^{-1}: \mathcal{P}(B) \to \mathcal{P}(A)$ mapping the power set of $B$ to the power set of $A$. Indeed, one defines $f^{-1}(D) = \{x \in A \mid f(x) \in D\}$. In this case $f^{-1}(D)$ is called the preimage of $D$ under $f$. For example, returning to $f(x) = x^2$, it follows that $f^{-1}\big((4, 9)\big) = (-3, -2) \cup (2, 3)$ and $f^{-1}\big((-1, 16)\big) = (-4, 4)$.

Note that when the continuous function $f(x) = x^2$ was applied to an open set as in $f\big((-3, 2)\big) = [0, 9)$, the image did not end up being an open set. But when $f^{-1}$ was applied to an open set as in $f^{-1}\big((-1, 16)\big) = (-4, 4)$, the preimage was also an open set. This is an important distinction. A continuous function need not map open sets to open sets; functions that do map all open sets to open sets are called open functions. But all continuous functions have the property that their inverses always map open sets to open sets. Conversely, a function whose inverse always maps open sets to open sets must be a continuous function. Of course, these statements require proof, but the proofs follow directly from the definition of continuity and definition of open set.
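The image and preimage of a set can be computed directly when the domain is finite. The following sketch is an illustration under assumptions of my own: the helper names `image` and `preimage` are hypothetical, and the domain is restricted to the integers from $-4$ to $4$ so that the sets can be enumerated.

```python
def image(f, subset):
    """Return f(C) = {f(a) : a in C}."""
    return {f(a) for a in subset}


def preimage(f, domain, target):
    """Return the preimage {x in domain : f(x) in target}. No inverse
    function is needed, only membership tests."""
    return {x for x in domain if f(x) in target}


def square(x: int) -> int:
    return x * x


domain = set(range(-4, 5))  # the integers -4, ..., 4
C = {-2, -1, 0, 1}

assert image(square, C) == {0, 1, 4}
# Two different inputs can share one output, so the preimage of {4, 9}
# picks up both signs.
assert preimage(square, domain, {4, 9}) == {-3, -2, 2, 3}
# Pulling the image of C back gives a set containing C, here strictly larger.
assert preimage(square, domain, image(square, C)) == {-2, -1, 0, 1, 2}
```

Because `square` is not injective on this domain, it has no inverse function, yet `preimage` is still well defined for every target set.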

Assume, for example, that $f: A \to B$ is a continuous function and $D \subseteq B$ is an open set in $B$. You are challenged to show that $f^{-1}(D)$ is an open set in $A$. To show that $f^{-1}(D)$ is open, you would need to show for every $a \in f^{-1}(D)$ there is a $\delta > 0$ such that $(a - \delta, a + \delta) \cap A \subseteq f^{-1}(D)$. From the definition of $f^{-1}(D)$, you know that if $a \in f^{-1}(D)$, then $f(a) \in D$. Because $D$ is open, there is an $\epsilon > 0$ such that $\big(f(a) - \epsilon, f(a) + \epsilon\big) \cap B \subseteq D$. This means that if $y \in B$ such that $|y - f(a)| < \epsilon$, then $y \in D$. But now, by the definition of continuity, there is a $\delta > 0$ such that if $x \in A$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \epsilon$ implying that $f(x)$ is in $\big(f(a) - \epsilon, f(a) + \epsilon\big) \cap B$, and thus, $f(x)$ is in $D$. This shows that $x \in f^{-1}(D)$ proving that $(a - \delta, a + \delta) \cap A \subseteq f^{-1}(D)$, so $f^{-1}(D)$ is open in $A$.

Conversely, suppose that $f$ has the property that $f^{-1}(D)$ is an open set in $A$ whenever $D$ is an open set in $B$. Then let $a \in A$. This time you are challenged to show that for every $\epsilon > 0$, there is a $\delta > 0$ such that if $x \in A$ with $|x - a| < \delta$, then $|f(x) - f(a)| < \epsilon$. But the set $D$ of all $y \in B$ satisfying $|y - f(a)| < \epsilon$ is an open set in $B$ implying that $f^{-1}(D)$ is an open set in $A$ containing the point $a$. This means that there is a $\delta > 0$ such that $(a - \delta, a + \delta) \cap A$ is contained in $f^{-1}(D)$. In other words, if $x \in A$ with $|x - a| < \delta$, then $x$ is in $f^{-1}(D)$, so $f(x)$ is in $D$ and $|f(x) - f(a)| < \epsilon$, completing the proof that $f$ is continuous (Fig. 9.6).

PROOF: Let $A$ and $B$ be sets of real numbers, and let $f: A \to B$ be a function from $A$ to $B$. Then $f$ is continuous on $A$ if and only if for every open set $D \subseteq B$, its preimage under $f$, $f^{-1}(D)$, is an open set in $A$.

Let $A$ and $B$ be sets of real numbers, and let $f: A \to B$ be a function from $A$ to $B$.

Continuity implies that the preimages of open sets are open

Assume that $f: A \to B$ is a continuous function.

Let $D$ be an open set in $B$, and let $a \in f^{-1}(D)$.

Because $D$ is open in $B$, there is an $\epsilon > 0$ such that $\big(f(a) - \epsilon, f(a) + \epsilon\big) \cap B \subseteq D$.

Thus, if $y \in B$ with $|y - f(a)| < \epsilon$, then $y \in D$.

Because $f$ is a continuous function, there is a $\delta > 0$ such that for all $x \in A$ with $|x - a| < \delta$ it follows that $|f(x) - f(a)| < \epsilon$.

Thus, if $x \in (a - \delta, a + \delta) \cap A$, then $|f(x) - f(a)| < \epsilon$, so $f(x) \in D$ and $x \in f^{-1}(D)$.

Therefore, $(a - \delta, a + \delta) \cap A \subseteq f^{-1}(D)$ and $f^{-1}(D)$ is an open set in $A$. This proves that the preimage under $f$ of any open set is open.


Assume that the preimage under $f$ of any set $D$ open in $B$ is an open set in $A$.

Let $a \in A$, and let $\epsilon > 0$ be given.

The set $\big(f(a) - \epsilon, f(a) + \epsilon\big) \cap B$ is an open set in $B$, so its preimage, $C = \{x \in A \mid |f(x) - f(a)| < \epsilon\}$, is an open set in $A$.

Because $C$ is an open set containing $a$, there is a $\delta > 0$ such that $(a - \delta, a + \delta) \cap A \subseteq C$.

Thus, if $x \in A$ with $|x - a| < \delta$, then $x \in (a - \delta, a + \delta) \cap A \subseteq C$, so $f(x) \in f(C)$ implying that $|f(x) - f(a)| < \epsilon$.

Therefore, $f$ is continuous which completes the proof of the theorem.

There is a similar theorem that states that a function $f: A \to B$ is continuous if and only if the preimage of every set closed in $B$ is a closed set in $A$. The proof is left as an exercise. As it is with open sets, continuous functions do not always map closed sets onto closed sets. Functions that do map all closed sets onto closed sets are called closed functions.

In general, then, one can define what continuity means for any function from one topological space into another topological space. If $A$ and $B$ are topological spaces, the function $f: A \to B$ is continuous if the preimage under $f$ of every set open in $B$ is a set open in $A$. Note that this definition makes sense even in topological spaces where there is no distance measure, and the definition does not involve the selection of a $\delta > 0$ given an $\epsilon > 0$.
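The preimage definition of continuity can be tested exhaustively when both spaces are finite. The sketch below is illustrative only: the name `is_continuous`, the dictionary encoding of the function, and the two small topologies are all assumptions made for the example.

```python
def is_continuous(f, domain, open_in_domain, open_in_codomain) -> bool:
    """Return True if the preimage of every open set of the codomain is
    open in the domain. The function f is given as a dict."""
    opens_a = {frozenset(u) for u in open_in_domain}
    for d in open_in_codomain:
        pre = frozenset(x for x in domain if f[x] in d)
        if pre not in opens_a:
            return False
    return True


A = {1, 2, 3}
opens_A = [set(), {1}, {1, 2}, {1, 2, 3}]
opens_B = [set(), {"p"}, {"p", "q"}]  # a topology on B = {"p", "q"}

f = {1: "p", 2: "p", 3: "q"}  # preimage of {"p"} is {1, 2}, which is open
g = {1: "q", 2: "p", 3: "p"}  # preimage of {"p"} is {2, 3}, which is not open

assert is_continuous(f, A, opens_A, opens_B)
assert not is_continuous(g, A, opens_A, opens_B)
```

No distances, $\epsilon$, or $\delta$ appear anywhere in the check, which is the point of the topological definition.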

9.4.1 Exercises

Write proofs for each of the following statements.

1.

2.

3.

4.

5.

The intersection of any finite collection of sets open in $A$ is itself a set open in $A$.

If $f: A \to B$ and $C \subseteq A$, then $C \subseteq f^{-1}\big(f(C)\big)$.

If $f: A \to B$ and $D \subseteq B$, then $f\big(f^{-1}(D)\big) \subseteq D$.

If $f$ is a function from set $A$ into set $B$, then $f$ is continuous on $A$ if and only if the preimage under $f$ of every set closed in $B$ is a set closed in $A$.

9.5 Closure

Recall that if $S$ is any subset of $\mathbb{R}$, then $a$ is an accumulation point of $S$ if for every $\epsilon > 0$ the open interval $(a - \epsilon, a + \epsilon)$ contains at least one point of $S$ other than $a$ itself. An important property of closed sets is that if a closed set, $S$, has an accumulation point, $a$, then $a \in S$. You should be able to construct a short proof of this fact that relies only on the definitions of accumulation point, closed set, and boundary. Such a proof would start with the assumption that $a$ is an accumulation point of the closed set $S$. One way to continue is to construct a proof by contradiction, that is, to assume that $a \notin S$ and hope that this will lead to a contradiction. Interestingly, you can proceed in more than one way. You could use the fact that $S$ is a closed set which implies that, since $a \notin S$, then $a \in \operatorname{ext}(S)$. This means that there is an $\epsilon > 0$ such that the open interval $(a - \epsilon, a + \epsilon)$ is contained in $S^c$. But the definition of accumulation point says that every open interval containing $a$ also contains points of $S$, so this contradicts the fact that $a$ is an accumulation point of $S$. Alternatively, you could use the fact that $a$ is an accumulation point of $S$. This means that for every $\epsilon > 0$, the open interval $(a - \epsilon, a + \epsilon)$ contains points in $S$. All of these open intervals also contain $a \notin S$ implying that each of these open intervals contains points in $S$ and points in $S^c$. Thus, $a$ satisfies the definition of being an element of $\partial S$. From the definition of closed set, $\partial S \subseteq S$. Thus, $a \in S$.

PROOF: If $S$ is a closed set, then $S$ contains all of its accumulation points.

Let $S$ be a closed set, and let $a$ be an accumulation point of $S$.

Assume that $a \notin S$.

From the definition of accumulation point, for every $\epsilon > 0$ it follows that the open interval $(a - \epsilon, a + \epsilon)$ contains elements in $S$.

Because $a \notin S$, it follows that for every $\epsilon > 0$ the open interval $(a - \epsilon, a + \epsilon)$ contains elements of $S$ and elements of $S^c$, so $a \in \partial S$.

From the definition of closed set, $\partial S \subseteq S$, so $a \in S$ which contradicts the assumption that $a \notin S$.

Thus, every accumulation point of $S$ must be contained in $S$.

The collection of all the accumulation points of $S$ is called the derived set of $S$ which is written $S'$. The previous theorem shows that if $S$ is closed, then $S' \subseteq S$. The converse is also true, that is, if $S' \subseteq S$, then $S$ must be closed. This follows from the fact that if $a$ is in the boundary of $S$ but $a$ is not an element of $S$, then $a$ must be an accumulation point of $S$. This should make sense to you. A boundary point is a point close both to $S$ and to $S^c$. An accumulation point is close to $S$, and if it is not in $S$, it is close to $S^c$.

PROOF: If set $S$ contains all of its accumulation points, then $S$ is a closed set.

Let $S$ be a set that contains all of its accumulation points.

Assume that $a \in \partial S \setminus S$.

Because $a \in \partial S$, for every $\epsilon > 0$, the open interval $(a - \epsilon, a + \epsilon)$ contains elements of $S$ and elements of $S^c$.

Thus, because $a$ itself is not a member of $S$, $(a - \epsilon, a + \epsilon)$ contains an element of $S$ not equal to $a$.

It follows that $a \in S' \subseteq S$ which contradicts the assumption that $a \notin S$.

Therefore, $\partial S \subseteq S$ which proves that $S$ is a closed set.


You can conclude from this result that for any set $S$, if $a \in \partial S \cap S^c$, it is an accumulation point of $S$, and, by symmetry, if $a \in \partial S \cap S$, then it is an accumulation point of $S^c$. The set $S$ is closed if it contains its boundary, $\partial S$. But for any set $S$, the elements of $\partial S$ that are not in $S$ are accumulation points of $S$, so $S$ is closed if and only if it contains all of its accumulation points. It is important to recognize, though, that the derived set $S'$ need not be contained in $\partial S$ since points in the interior of $S$ are accumulation points of $S$, and $\partial S$ need not be contained in $S'$ since isolated points of $S$ are in the boundary of $S$ without being accumulation points of $S$. On the other hand, $S \cup \partial S = S \cup S'$.

For any set S, define the closure of S or cl.S/ to be S [ S0 D S [ @S. Some books

use the notation S or S for the closure of S. Intuitively, the closure of a set S takes

the elements of the boundary of S and adds them to the set so that you now have S

along with its boundary (Fig. 9.7). The closure also has the following properties.

$\mathrm{cl}(S)$ is a closed set.
The set S is closed if and only if $S = \mathrm{cl}(S)$.
$\mathrm{cl}(S)$ is the intersection of every closed set that contains S.
$\mathrm{cl}(S)$ is the smallest closed set that contains S.

All of these results have short proofs. For example, to get the first result, recall that if x is in the boundary of the union of two sets, $S \cup T$, then x is either in the boundary of S or the boundary of T. Thus, if $x \in \partial\,\mathrm{cl}(S)$, it means that $x \in \partial(S \cup \partial S)$ and, therefore, $x \in \partial S$ or $x \in \partial(\partial S)$. It was shown in the first section of this chapter that $\partial(\partial S) \subseteq \partial S$, implying that $x \in \partial S$ and proving that x is in $\mathrm{cl}(S)$. Thus, $\mathrm{cl}(S)$ contains its boundary, so it is closed.

For the second result, note that if S is closed, it contains its boundary so $\mathrm{cl}(S) = S \cup \partial S = S$. Conversely, if $S = \mathrm{cl}(S)$, then S is closed because $\mathrm{cl}(S)$ is always a closed set.

The third and fourth results follow quickly after noticing that any closed set containing S must also contain the boundary of S.

[Fig. 9.7: diagram of a set S on the real line and its closure cl(S)]

9.5.1 Exercises

For each of the following sets S, determine $\partial S$, $S'$, and $\mathrm{cl}(S)$.

1.

2.

3.

4.

5.

The integers, Z.

$\{\frac{1}{n} \mid n \in \mathbb{Z}\}$

$(0, 3) \cup (3, 5) \cup (5, 7)$

$(1, 3) \cap \mathbb{Q} \cup \{0, 4\}$

4.

5.

6.

7.

8.

The set S is closed if and only if $S = \mathrm{cl}(S)$.

$\mathrm{cl}(S)$ is the intersection of every closed set that contains S.

$\mathrm{cl}(S)$ is the smallest closed set that contains S.

For any set S, its derived set, $S'$, is a closed set.

9.6 Compactness

The topics of open cover, finite subcover, compactness, and the Heine-Borel Theorem were introduced in Chap. 4 because of their usefulness in proving that a function continuous on a closed bounded interval is uniformly continuous on that interval. Compactness also played an important role in showing that a continuous function on a closed bounded interval is bounded, a continuous function on a closed bounded interval attains its extreme values (maximum and minimum), and a continuous function on a closed bounded interval has a Riemann integral. Recall that an open cover of a set S was defined to be a collection of open intervals T where for each $x \in S$ there is an open interval $(p, q) \in T$ such that $x \in (p, q)$. After the introduction of the topological ideas in this chapter, that definition can be generalized to allow T to be a collection of open sets rather than just open intervals, that is, a collection of open sets, T, is called an open cover of S if for each $x \in S$ there is an open set $U \in T$ such that $x \in U$. Moreover, the Heine-Borel Theorem can now be extended in two ways: the concept of an open cover by intervals can be generalized to an open cover by open sets, and the concept of closed bounded interval can be generalized to closed bounded set.

PROOF (Heine-Borel Theorem): Let S be any closed bounded set of real numbers, and let T be a cover of S by open sets. Then T contains a finite subcover of S.
Let S be a closed bounded set and T be a cover of S by open sets.
Because S is bounded, there are real numbers a and b with $a < b$ such that $S \subseteq [a, b]$.


Let $U = (a - 1, b + 1) \cap S^c$, the intersection of the open interval $(a - 1, b + 1)$ and the open set $S^c$, so U is an open set.
Then $T' = T \cup \{U\}$ is an open cover of $[a, b]$.
For each $x \in [a, b]$ there is an open set $V_x \in T'$ that contains x.
Because $V_x$ is open, there is an open interval $(p_x, q_x) \subseteq V_x$ that contains x.
Thus, the collection $T'' = \{(p_x, q_x) \mid x \in [a, b]\}$ is a cover of $[a, b]$ by open intervals.
Now, by the previously proved version of the Heine-Borel Theorem, $T''$ has a finite subcover of $[a, b]$, say $\{(p_1, q_1), (p_2, q_2), (p_3, q_3), \ldots, (p_k, q_k)\}$ for some natural number k.
For each $j = 1, 2, 3, \ldots, k$, the open interval $(p_j, q_j)$ in the subcover is contained in an open set $V_j \in T'$, so the collection $V_1, V_2, V_3, \ldots, V_k$ covers $[a, b]$ and, therefore, covers S.
If one of the open sets, $V_j$, happens to be the set U added to T, this set can be discarded from the subcover of S because it contains no elements of S.
This gives a finite subcover of S which completes the proof.

So, this shows that all closed bounded sets of real numbers are compact. The

converse is also true, that is, all compact subsets of real numbers are both closed

and bounded. These two results together, then, completely characterize the compact

sets of real numbers.

PROOF: A subset of $\mathbb{R}$ is compact if and only if it is closed and bounded.
The Heine-Borel Theorem shows that closed bounded sets of real numbers are compact.
Conversely, assume that S is a compact subset of $\mathbb{R}$.
The collection of open intervals $(-j, j)$ where j ranges over the natural numbers is a collection of open sets that covers all of $\mathbb{R}$, so it certainly covers S.
Because S is compact, S can be covered by a finite collection of the $(-j, j)$ intervals.
It follows that there exists a natural number k such that $S \subseteq (-k, k)$, and S is a bounded set.
Suppose that there is a real number x in the boundary of S that is not an element of S.
For each $\epsilon > 0$, let $U_\epsilon = (-\infty, x - \epsilon) \cup (x + \epsilon, \infty)$ which is an open set.
The collection of all such $U_\epsilon$ covers all of $\mathbb{R} \setminus \{x\}$, and since x is not an element of S, the collection is an open cover of S.
Because S is compact, it is covered by a finite collection of the $U_\epsilon$ sets.
It follows that there is an $\epsilon > 0$ such that $S \subseteq U_\epsilon$.
Then the open interval $(x - \epsilon, x + \epsilon)$ contains no elements of S, which contradicts the assumption that x is in the boundary of S.
Therefore, there are no elements x in the boundary of S that are not elements of S, implying $\partial S \subseteq S$ and S is closed.
This shows that all compact sets are closed and bounded, completing the proof of the theorem.

Continuous functions need not map bounded sets onto bounded sets as is seen by $f(x) = \frac{1}{x}$ which maps the bounded interval $(0, 1)$ continuously onto the interval $(1, \infty)$ which is not bounded. Continuous functions need not map closed sets onto closed sets as seen by $f(x) = -\frac{1}{x}$ which maps the closed interval $[1, \infty)$ onto $[-1, 0)$ which is not closed. But continuous functions always map compact sets onto compact sets. This is a result that is true in any topological space, so its proof need not use any more than the properties of open sets, compact sets, and continuous functions. To write the proof you would start by assuming that the function $f : A \to B$ is continuous on A, and that $C \subseteq A$ is a compact set. You must then show that the image of C, $f(C) \subseteq B$, is compact. How would you show this set is compact? The definition of compact set suggests that you would take an open cover of the set and proceed to show that that cover has a finite subcover. So let I be an index set and assume that $\{U_i \mid i \in I\}$ is an open cover of $f(C)$. Somehow you must show that this cover has a finite subcover. All you know is that f is a continuous function and that the set C is compact. Since C is compact, you know that open covers of C have finite subcovers, but you have an open cover of $f(C)$, not an open cover of C. You need to use the fact that f is a continuous function which means that for each $i \in I$, the preimage of the open set $U_i$, $f^{-1}(U_i)$, is an open set in A. Does the collection of $f^{-1}(U_i)$ sets form a cover of C? Follow what happens: if $x \in C$, then $f(x) \in f(C)$. Thus, there is at least one $i \in I$ such that $f(x) \in U_i$. Therefore, $x \in f^{-1}(U_i)$. So, indeed, the collection of $f^{-1}(U_i)$ sets forms an open cover of C. Hence, there is a finite subcover of C given (by renaming subscripts) as $f^{-1}(U_1), f^{-1}(U_2), f^{-1}(U_3), \ldots, f^{-1}(U_k)$, for some natural number k. For each $x \in C$, there is a j between 1 and k such that $x \in f^{-1}(U_j)$, so $f(x) \in f\big(f^{-1}(U_j)\big) \subseteq U_j$. Because each element of $f(C)$ is the image of at least one $x \in C$, and each $x \in C$ is an element of at least one of the finite number of $f^{-1}(U_j)$, it follows that the finite collection of open sets, $U_1, U_2, U_3, \ldots, U_k$, covers $f(C)$, proving that $f(C)$ is compact.

PROOF: If $f : A \to B$ is continuous on A, and if $C \subseteq A$ is a compact set, then $f(C)$ is a compact set in B.
Assume that $f : A \to B$ is continuous on A, and $C \subseteq A$ is a compact set.
Let I be an index set, and $\{U_i \mid i \in I\}$ be a collection of open sets that cover $f(C)$.
For each $x \in C$ there is an $i \in I$ such that $f(x) \in U_i$.
Since f is continuous, and, for each $i \in I$, $U_i$ is an open set in B, $f^{-1}(U_i)$ is an open set in A.
Thus, $\{f^{-1}(U_i) \mid i \in I\}$ is an open cover of C.
Because C is compact, this open cover of C has a finite subcover.
By renaming subscripts, the subcover is given as $f^{-1}(U_1), f^{-1}(U_2), f^{-1}(U_3), \ldots, f^{-1}(U_k)$ for some natural number k.
Let y be any element of $f(C)$. Then $y = f(x)$ for some $x \in C$.
Since $x \in f^{-1}(U_j)$ for one of the $j = 1, 2, 3, \ldots, k$, it follows that $y = f(x) \in U_j$, showing that the finite collection $U_1, U_2, U_3, \ldots, U_k$ covers $f(C)$.
Therefore, every open cover of $f(C)$ has a finite subcover, and $f(C)$ is compact.

Notice that it is an immediate consequence of this theorem that a real valued continuous function on a closed bounded interval on the real line is bounded and obtains

its maximum and minimum values. This is because every closed bounded interval on

the real line is a compact set, so its image under a continuous function is compact

which means the image is closed and bounded. The image being bounded is just

another way of saying that the function is bounded. The image being closed shows

that the image contains its boundary which includes the maximum and minimum

values of the function.

The Heine-Borel Theorem can be extended to n-dimensional Euclidean space $\mathbb{R}^n$. That is, the compact sets in $\mathbb{R}^n$ are the sets that are both closed and bounded. One can use mathematical induction to show that a rectangular box that is a cross product of n closed intervals is compact, and then, that can be extended to any closed bounded set.

9.6.1 Exercises

1. Find an example of a function f and a set C such that $f^{-1}\big(f(C)\big)$ is not equal to C.

2. Find an example of a continuous function f and a set D such that $f\big(f^{-1}(D)\big)$ is not equal to D.

3. Find an example of a continuous function $f : A \to B$ and a compact set $D \subseteq B$ such that $f^{-1}(D)$ is not compact.

4. Suppose that the continuous function f has domain $[0, 10]$ and codomain $(-4, 4)$. Show that the function is not surjective.

9.7 Connectedness

The intervals on the real line were discussed in Chap. 2. A set of real numbers is an interval if whenever x and y are elements of the interval, then all the real numbers between x and y are also elements of the interval. The intervals are the connected sets on the real line, but the concept of connectedness can be extended to any topological space. In a general topological space, two nonempty sets A and B are disconnected if there are disjoint open sets U and V with $A \subseteq U$ and $B \subseteq V$. For example, the sets $[0, 1]$ and $(4, 5)$ are disconnected because $[0, 1] \subseteq (-1, 2)$ and $(4, 5) \subseteq (4, 5)$ where $(-1, 2)$ and $(4, 5)$ are disjoint open sets (Fig. 9.8). The sets $[0, 3]$ and $(3, 5)$ are disjoint nonempty sets, but they are not disconnected because any open set that contains $[0, 3]$ will necessarily share points with any open set containing $(3, 5)$, and, in particular, both open sets will contain the element 3. A set is called connected if it is not the union of two disconnected nonempty sets. Even though the connected sets of real numbers are just the intervals, the concept of connectedness gets far more interesting in more general topological spaces.

If $f : A \to B$ is continuous, then it always maps connected sets to connected sets, that is, if $C \subseteq A$ is a connected set, then so is $f(C)$. This is easy to see since, if $f(C)$ is disconnected, then there are two disjoint open sets U and V in B and two nonempty sets S and T in B such that $f(C) = S \cup T$ and $S \subseteq U$ and $T \subseteq V$. But then $C \subseteq f^{-1}(U) \cup f^{-1}(V)$ where $f^{-1}(U)$ and $f^{-1}(V)$ are disjoint open sets in A. Because S and T are nonempty, $C \cap f^{-1}(U)$ and $C \cap f^{-1}(V)$ are nonempty, implying that C is a disconnected set. Thus, if C is connected, $f(C)$ must also be connected.

PROOF: If $f : A \to B$ is continuous on A, and if $C \subseteq A$ is a connected set, then $f(C)$ is a connected set in B.
Let $f : A \to B$ be a continuous function on A, and assume that $C \subseteq A$ such that $f(C)$ is disconnected.
This means that there are disjoint open sets U and V in B, and nonempty sets S and T in B with $S \subseteq U$ and $T \subseteq V$ such that $f(C) = S \cup T$.
Since f is continuous, $f^{-1}(U)$ and $f^{-1}(V)$ are open sets in A, and they are disjoint because U and V are disjoint.
Since S and T are nonempty sets whose union is $f(C)$, both $C \cap f^{-1}(U)$ and $C \cap f^{-1}(V)$ are nonempty, and their union is C.
This shows that C is a disconnected set.
Therefore, if C is a connected set, $f(C)$ must also be connected.

When this theorem is applied to functions from the real numbers to the real numbers, the result is the Intermediate Value Theorem which states that if f is a continuous real valued function on the interval $[a, b]$, then for every c between $f(a)$ and $f(b)$ there is an $x \in [a, b]$ such that $f(x) = c$. This is because f must map the connected set $[a, b]$ onto a connected set, that is, an interval, which must include all the elements c between $f(a)$ and $f(b)$. Note that $f^{-1}$ need not bring connected sets to connected sets.
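The Intermediate Value Theorem only asserts that a suitable x exists. As an illustration that is not part of the text's argument, the following Python sketch locates such an x numerically by repeated bisection. It assumes that f is continuous on $[a, b]$ and that c lies between $f(a)$ and $f(b)$; the name `find_preimage`, the tolerance, and the sample function $t^3 + t$ are hypothetical choices made for this example.

```python
def find_preimage(f, a, b, c, tol=1e-12):
    """Approximate an x in [a, b] with f(x) = c by bisection.

    Assumes f is continuous on [a, b] and c lies between f(a) and f(b).
    """
    lo, hi = a, b
    g_lo = f(lo) - c
    g_hi = f(hi) - c
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("c is not between f(a) and f(b)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = f(mid) - c
        if g_mid == 0:
            return mid
        # Keep the half of the interval on which f - c still changes sign.
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return (lo + hi) / 2


# Example: t^3 + t takes the values 0 and 10 at the endpoints of [0, 2],
# so it must take the value 5 somewhere in between.
x = find_preimage(lambda t: t**3 + t, 0.0, 2.0, 5.0)
print(x, x**3 + x)
```

The sign test in the loop is where continuity is used: each retained half-interval still has $f - c$ taking values of opposite sign at its endpoints, so a solution remains trapped inside it.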

In n-dimensional Euclidean space the concept of connectedness gets considerably richer as the connected sets are not merely the cross products of intervals (Fig. 9.9). In $\mathbb{R}^2$ one introduces what it means for a set to be path-connected which makes precise the intuitive notion that a set is connected if you can draw a path between any two of its points where the path stays inside the set. On the real line, this just means that for any two points in the set, the interval between the two points stays in the set. But in $\mathbb{R}^2$ where paths need not be straight lines, the examples are far more varied. In fact, in $\mathbb{R}^2$ there are examples of connected sets that are not path-connected, a phenomenon that cannot occur on the real line. A famous example is the set consisting of the graph of the equation $y = \sin\frac{1}{x}$ along with the y-axis. This is a connected set because any open set that contains the y-axis must intersect parts of the graph of $y = \sin\frac{1}{x}$ both to the left and to the right of the y-axis. On the other hand, this set is not path-connected because there is no way to construct a path that stays inside the set and connects the points $(-\frac{1}{\pi}, 0)$ and $(\frac{1}{\pi}, 0)$ (Fig. 9.10).

Fig. 9.9 The set C is a connected set. The set N is not a connected set

Fig. 9.10 Graph of $\sin\frac{1}{x}$ with the y-axis

9.7.1 Exercises

1. Find an example of a continuous function $f : A \to B$ and connected set $D \subseteq B$ such that $f^{-1}(D)$ is not connected.

2. Show that in any topological space A, if S and T are connected sets with $A = S \cup T$ and $S \cap T \neq \emptyset$, then A is connected.

Chapter 10

Metric Spaces

This book has discussed at length how one writes proofs about the limits and continuity of functions whose domains and ranges are subsets of the real numbers, $\mathbb{R}$. Although the real numbers form a far simpler set to study than many other naturally arising sets in Analysis, the techniques learned while dealing with real-valued functions of a real variable can be applied almost exactly to prove similar theorems about functions defined on other domains with other types of ranges. It is instructive to take note of the properties of the real numbers that play important roles in these proofs. In particular, most of the proofs about limits and continuity involve measuring the distance between two real numbers x and y. This is done by calculating the absolute value of the difference between the numbers, $|x - y|$. This distance measure has important properties that allow the proofs about limits and continuity to proceed. Among the useful properties of this distance measure is that if $|x - y| < \epsilon$ for every $\epsilon > 0$, then it follows that $x = y$, and if $|x - y| > 0$, then x is surely different from y. Another property used repeatedly in these proofs is the triangle inequality. For example, if f and g are two functions, and x and y are both elements in the domains of these functions, then knowing that $|f(x) - f(y)| < \frac{\epsilon}{2}$ and $|g(x) - g(y)| < \frac{\epsilon}{2}$ allows the proofs to conclude that

\[ \big| \big(f(x) + g(x)\big) - \big(f(y) + g(y)\big) \big| = \big| \big(f(x) - f(y)\big) + \big(g(x) - g(y)\big) \big| \le |f(x) - f(y)| + |g(x) - g(y)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]

The fact that the triangle inequality holds true for this chosen measure of distance is crucial in the argument.

J.M. Kane, Writing Proofs in Analysis, DOI 10.1007/978-3-319-30967-5_10

[Fig. 10.1: a triangle in the plane with vertices x, y, and z and side lengths d(x, y), d(y, z), and d(x, z)]

The conclusion is, then, that if there were a set, X, and a distance measure that assigned to each x and y in X a real number, $d(x, y)$, that had many of the same properties that the $|x - y|$ distance measure does in the real numbers, then it might be possible to prove limit and continuity theorems for functions defined on X by just adopting the same proof techniques used for the theorems about functions of real numbers. With this in mind a nonempty set X together with distance function d is defined to be a metric space if, for all x, y, and z in X, this distance function satisfies the following properties:

$d(x, y) = 0$ if and only if $x = y$ (the distance function separates points).

$d(x, y) = d(y, x)$ (the distance function is symmetric).

$d(x, y) + d(y, z) \ge d(x, z)$ (the distance function satisfies the triangle inequality).

The distance function defines a metric for the metric space, and the metric space is designated as $\langle X, d \rangle$ (Fig. 10.1). This definition is a generalization of the distance function defined on the real numbers, $d(x, y) = |x - y|$. Clearly, for all real numbers x, y, and z,

$d(x, y) = |x - y| \ge 0$
$0 = d(x, y) = |x - y|$ if and only if $x = y$
$d(x, y) = |x - y| = |y - x| = d(y, x)$
$d(x, y) + d(y, z) = |x - y| + |y - z| \ge |(x - y) + (y - z)| = |x - z| = d(x, z)$

It is a fairly straightforward process to construct a proof that $\langle X, d \rangle$ is a metric space. Most proofs would follow this template:

TEMPLATE for proving $\langle X, d \rangle$ is a metric space
SET THE CONTEXT: Give the definitions of X and d.
METRIC DEFINITION: Show that d maps each $x, y \in X$ to a nonnegative real number.
SEPARATION OF POINTS: Show that $d(x, y) = 0$ implies $x = y$.
ZERO DISTANCE: Show that for all $x \in X$ that $d(x, x) = 0$.
SYMMETRY: Show that for all $x, y \in X$ that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: Show that for all $x, y, z \in X$ that $d(x, y) + d(y, z) \ge d(x, z)$.
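Before writing a proof along these lines, it can help to test a candidate distance function on a handful of points. The Python sketch below does this by brute force on a finite sample. It is an illustration only: the function name `check_metric`, the sample points, and the tolerance are assumptions made for this example, and passing the check on a sample proves nothing, although a failure does show that the candidate is not a metric.

```python
from itertools import product


def check_metric(points, d, tol=1e-12):
    """Return the metric-space conditions that fail on a finite sample.

    An empty list means no violation was found on this sample only.
    """
    failures = []
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            failures.append(("nonnegative", x, y))
        # d(x, y) = 0 must hold exactly when x = y.
        if (d(x, y) == 0) != (x == y):
            failures.append(("separation / zero distance", x, y))
        if abs(d(x, y) - d(y, x)) > tol:
            failures.append(("symmetry", x, y))
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            failures.append(("triangle inequality", x, y, z))
    return failures


sample = [-2.0, -0.5, 0.0, 1.0, 3.5]
usual = check_metric(sample, lambda x, y: abs(x - y))       # |x - y|
squared = check_metric(sample, lambda x, y: (x - y) ** 2)   # (x - y)^2
print(len(usual), len(squared))
```

On this sample the absolute value distance passes every check, while the squared difference fails only the triangle inequality, which is the step of the template that usually requires the most work.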

Given a metric space $\langle X, d \rangle$, an element $a \in X$, and a positive real number r, define the neighborhood of a with radius r to be $N(a, r) = \{x \in X \mid d(a, x) < r\}$.

Sometimes, as in the definition of a limit at point a, one needs to exclude the point a

from the neighborhood of a. In this case, one can define the deleted neighborhood

of a with radius r to be N .a; r/ D fx 2 X j 0 < d.a; x/ < rg. These neighborhoods

play a central role in defining limits and continuity of functions defined on X and

in establishing a topology for the space X. It is not uncommon for there to be


several different distance functions defined on a particular set X that make X into a

metric space. Each new distance function results in different shaped neighborhoods.

Some give rise to the same topology of X while others may result in quite different

topologies. Many examples of these different distance functions will be explored in

the sections that follow.

10.2 Inequalities

Most proofs in Analysis involve establishing one or more inequalities. Some

inequalities seem to keep reappearing in different guises throughout Analysis, so

they provide great tools for writing proofs. This section presents two very common

inequalities that will be used later in the chapter to justify the triangle inequality for

some examples of metric spaces.

For natural number n let $a = (a_1, a_2, a_3, \ldots, a_n)$ and $b = (b_1, b_2, b_3, \ldots, b_n)$ be any two points in n-dimensional Euclidean space. The Cauchy-Schwarz Inequality states that

\[ |a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n| \le \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}. \]

Those who are familiar with the dot product of two vectors will find this inequality easy to remember. If $|a| = \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}$ refers to the magnitude of vector a and the dot product $a \cdot b = (a_1, a_2, a_3, \ldots, a_n) \cdot (b_1, b_2, b_3, \ldots, b_n) = a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n$, then $a \cdot b = |a|\,|b| \cos\theta$ where $\theta$ is the angle between the two vectors. Then the Cauchy-Schwarz Inequality is just the statement that $|a|\,|b| \ge |a \cdot b|$ which follows because $|\cos\theta| \le 1$.

To prove the Cauchy-Schwarz Inequality note that for given $a, b \in \mathbb{R}^n$ and every real number x the quantity $\sum_{j=1}^{n} (a_j + x b_j)^2$ is a sum of squares of real numbers, so

\[ \sum_{j=1}^{n} a_j^2 + 2x \sum_{j=1}^{n} a_j b_j + x^2 \sum_{j=1}^{n} b_j^2 \ge 0 \]

for every real number x. Any quadratic polynomial $Ax^2 + Bx + C$ with $A > 0$ is nonnegative for every x if and only if its discriminant $B^2 - 4AC$ is not positive. (Here $A = \sum_{j=1}^{n} b_j^2$; if $A = 0$, then every $b_j$ is 0 and the inequality is trivially true.) But the discriminant of the previous polynomial is

\[ 4\left[ \left( \sum_{j=1}^{n} a_j b_j \right)^2 - \left( \sum_{j=1}^{n} a_j^2 \right) \left( \sum_{j=1}^{n} b_j^2 \right) \right]. \]

The statement that this discriminant is less than or equal to 0 is exactly the statement of the Cauchy-Schwarz Inequality. An even stronger statement can now be made. Equality occurs in the Cauchy-Schwarz Inequality if and only if the given discriminant is 0 so that the underlying quadratic polynomial has exactly one root, meaning that the sum $\sum_{j=1}^{n} (a_j + x b_j)^2$ is 0 for exactly one value of x. This happens if and only if $a_j = -x b_j$ for every j, that is, a is a scalar multiple of b. Thus, equality holds exactly when one of the points $(a_1, a_2, a_3, \ldots, a_n)$ and $(b_1, b_2, b_3, \ldots, b_n)$ is a scalar multiple of the other.

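Starting with the Cauchy-Schwarz Inequality

\[ a_1 b_1 + a_2 b_2 + a_3 b_3 + \cdots + a_n b_n \le \sqrt{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}, \]

doubling it and adding $a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2 + b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2$ to both sides yields

\[ (a_1 + b_1)^2 + (a_2 + b_2)^2 + \cdots + (a_n + b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2) + 2\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + \cdots + b_n^2} + (b_1^2 + b_2^2 + \cdots + b_n^2). \]

The right side is the square of $\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} + \sqrt{b_1^2 + b_2^2 + \cdots + b_n^2}$, so taking the square root of both sides gives

\[ \sqrt{(a_1 + b_1)^2 + (a_2 + b_2)^2 + \cdots + (a_n + b_n)^2} \le \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} + \sqrt{b_1^2 + b_2^2 + \cdots + b_n^2} \]

which is a special case of the Minkowski Inequality which can be restated $|a + b| \le |a| + |b|$. Again, equality occurs only when one of the points is a scalar multiple of the other.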
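Both inequalities are easy to spot-check numerically. The following Python sketch, which is an illustration and not a substitute for the proofs above, draws random vectors and records the smallest slack observed in each inequality; the helper names `dot` and `norm`, the random seed, and the sample sizes are arbitrary choices made for this example. It then evaluates both sides of the Cauchy-Schwarz Inequality in a case where one vector is a scalar multiple of the other.

```python
import math
import random


def dot(a, b):
    """Dot product a . b of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean magnitude |a|."""
    return math.sqrt(dot(a, a))


random.seed(0)  # fixed seed so that the run is repeatable
worst_cs = worst_mink = float("inf")
for _ in range(1000):
    n = random.randint(1, 6)
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    s = [x + y for x, y in zip(a, b)]
    # Slack in each inequality; both should be nonnegative up to rounding.
    worst_cs = min(worst_cs, norm(a) * norm(b) - abs(dot(a, b)))
    worst_mink = min(worst_mink, norm(a) + norm(b) - norm(s))
print(worst_cs, worst_mink)

# Equality case: b is a scalar multiple of a (here b = 3a).
a = [1.0, 2.0, 2.0]
b = [3.0, 6.0, 6.0]
print(norm(a) * norm(b), abs(dot(a, b)))
```

Because of floating-point rounding, the recorded slack can be a tiny negative number rather than exactly nonnegative, which is why a numerical check like this supports but cannot replace the discriminant argument.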

10.2.3 Exercises

1. Show that the Cauchy-Schwarz Inequality extends to infinite series. That is, if $\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ are both convergent series, then $\sum_{n=1}^{\infty} a_n b_n \le \sqrt{\sum_{n=1}^{\infty} a_n^2}\,\sqrt{\sum_{n=1}^{\infty} b_n^2}$.

2. Show that the Minkowski Inequality extends to infinite series. That is, if $\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ are both convergent series, then $\sqrt{\sum_{n=1}^{\infty} (a_n + b_n)^2} \le \sqrt{\sum_{n=1}^{\infty} a_n^2} + \sqrt{\sum_{n=1}^{\infty} b_n^2}$.

3. Show that for any real numbers $a_1, a_2, a_3, \ldots, a_n$ and positive real numbers $b_1, b_2, b_3, \ldots, b_n$, the following inequality holds:

\[ \frac{a_1^2}{b_1} + \frac{a_2^2}{b_2} + \frac{a_3^2}{b_3} + \cdots + \frac{a_n^2}{b_n} \ge \frac{(a_1 + a_2 + a_3 + \cdots + a_n)^2}{b_1 + b_2 + b_3 + \cdots + b_n}. \]

This can be shown by mathematical induction on n, but can also be shown using the Cauchy-Schwarz Inequality.

For any natural number n one can define n-dimensional Euclidean space, $\mathbb{R}^n$, with $\mathbb{R}^1$ being the real numbers, $\mathbb{R}^2$ being the Euclidean plane, $\mathbb{R}^3$ being 3-dimensional Euclidean space, and so forth. Elements of $\mathbb{R}^n$ can be represented as ordered n-tuples of real numbers, $(x_1, x_2, x_3, \ldots, x_n)$. You should be familiar with the Euclidean distance between two points in n-dimensional Euclidean space, $x = (x_1, x_2, x_3, \ldots, x_n)$ and $y = (y_1, y_2, y_3, \ldots, y_n)$, given by the generalization of the Pythagorean Theorem as

\[ d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}. \]

This distance is a nonnegative real number because it is a square root of the sum of squares of real numbers. Moreover, the distance is 0 exactly when the sum of squares is 0 which happens only when $x = y$. The fact that d is symmetric follows from the fact that for all real numbers a and b, $(a - b)^2 = (b - a)^2$. The fact that the Euclidean distance satisfies the triangle inequality is just the statement of the Minkowski Inequality with $a = (x_1 - y_1, x_2 - y_2, x_3 - y_3, \ldots, x_n - y_n)$ and $b = (y_1 - z_1, y_2 - z_2, y_3 - z_3, \ldots, y_n - z_n)$. Then

\[ d(x, y) + d(y, z) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2} + \sqrt{(y_1 - z_1)^2 + (y_2 - z_2)^2 + \cdots + (y_n - z_n)^2} \ge \sqrt{(x_1 - z_1)^2 + (x_2 - z_2)^2 + \cdots + (x_n - z_n)^2} = d(x, z). \]

Together these facts show that $\mathbb{R}^n$ with Euclidean distance is a metric space (Fig. 10.2).

Fig. 10.2 Euclidean distance in $\mathbb{R}^2$ [a right triangle with vertices $(x_1, y_1)$ and $(x_2, y_2)$, legs $|x_2 - x_1|$ and $|y_2 - y_1|$, and hypotenuse d(x, y)]

PROOF: $\mathbb{R}^n$ with the Euclidean distance function $d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}$ is a metric space.
SET THE CONTEXT: For natural number n, let $x = (x_1, x_2, x_3, \ldots, x_n)$, $y = (y_1, y_2, y_3, \ldots, y_n)$, and $z = (z_1, z_2, z_3, \ldots, z_n)$ be elements of $\mathbb{R}^n$.
METRIC DEFINITION: Define $d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + (x_n - y_n)^2}$ which is the square root of a sum of squares of real numbers, so it is a nonnegative real number.
SEPARATION OF POINTS: If $x \neq y$, then for some j between 1 and n, $(x_j - y_j)^2$ must be positive, implying that $d(x, y) > 0$.
ZERO DISTANCE: For each j between 1 and n, $(x_j - x_j)^2 = 0$, so $d(x, x) = 0$.
SYMMETRY: Since for each j between 1 and n, $(x_j - y_j)^2 = (y_j - x_j)^2$, it follows that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: The fact that $d(x, y) + d(y, z) \ge d(x, z)$ is just a restatement of the Minkowski Inequality with $a_j = x_j - y_j$ and $b_j = y_j - z_j$ for each j between 1 and n.
This shows that $\mathbb{R}^n$ with the Euclidean distance is a metric space.

The Euclidean distance, sometimes called the Euclidean metric, may be the most commonly seen distance function used for Euclidean space, but there are many other distance functions which can make $\mathbb{R}^n$ into a metric space. One example is $d(a, b) = |a_1 - b_1| + |a_2 - b_2| + |a_3 - b_3| + \cdots + |a_n - b_n|$. This is sometimes called the taxicab metric because the distance $d(a, b)$ is the distance you would travel between the two points a and b if you could only travel in directions parallel to one of the coordinate axes as a taxicab would do on a rectangular grid of streets. Proving that this distance function makes $\mathbb{R}^n$ into a metric space is quite easy.

PROOF: $\mathbb{R}^n$ with the taxicab distance function $d(x, y) = |x_1 - y_1| + |x_2 - y_2| + |x_3 - y_3| + \cdots + |x_n - y_n|$ is a metric space.
SET THE CONTEXT: For natural number n, let $x = (x_1, x_2, x_3, \ldots, x_n)$, $y = (y_1, y_2, y_3, \ldots, y_n)$, and $z = (z_1, z_2, z_3, \ldots, z_n)$ be elements of $\mathbb{R}^n$.
METRIC DEFINITION: Define $d(x, y) = |x_1 - y_1| + |x_2 - y_2| + |x_3 - y_3| + \cdots + |x_n - y_n|$ which is the sum of nonnegative absolute values, so it is a nonnegative real number.
SEPARATION OF POINTS: If $x \neq y$, then for some j between 1 and n, $|x_j - y_j|$ must be positive, implying that $d(x, y) > 0$.
ZERO DISTANCE: For each j between 1 and n, $|x_j - x_j| = 0$, so $d(x, x) = 0$.
SYMMETRY: Since for each j between 1 and n, $|x_j - y_j| = |y_j - x_j|$, it follows that $d(x, y) = d(y, x)$.
TRIANGLE INEQUALITY: Since for each j between 1 and n, $|x_j - y_j| + |y_j - z_j| \ge |x_j - z_j|$, summing over j shows that $d(x, y) + d(y, z) \ge d(x, z)$.
This shows that $\mathbb{R}^n$ with the d distance function is a metric space.

Still another distance function that can be used for Euclidean space is called the supremum metric given by $d(a, b) = \max(|a_1 - b_1|, |a_2 - b_2|, |a_3 - b_3|, \ldots, |a_n - b_n|)$. It is instructive to compare the shapes of the neighborhoods that you get using the Euclidean metric, the taxicab metric, and the supremum metric as shown in Fig. 10.3. Since the Euclidean distance is the familiar distance from Euclidean Geometry, it is easy to see that if $a \in \mathbb{R}^n$ and $r > 0$, then $N(a, r)$ is an open ball with center a and radius r. On the other hand, using the taxicab metric, $N(a, r)$ is a union of $2^n$ n-dimensional triangular pyramids. That is, when $n = 2$, $N(a, r)$ is a diamond made up of four isosceles right triangles, and when $n = 3$, $N(a, r)$ is a union of 8 tetrahedra, one in each octant, forming a regular octahedron. For the supremum metric, $N(a, r)$ is an n-dimensional cube. Note that in the Euclidean metric, if the coordinate axes are rotated (performing an orthogonal change of coordinates), there is no change in the neighborhood whereas with the other two metrics, rotating the axes changes the orientation of the neighborhoods. It turns out that all three of these metrics give rise to the same topology on $\mathbb{R}^n$ because each metric gives the same open sets even though the open neighborhoods are different in shape. But the three metrics have different algebraic properties, and sometimes it is easier to prove a particular theorem using one of these metrics rather than the others.

Fig. 10.3 $N(0, 1)$ in the Euclidean, taxicab, and supremum metrics in 2 and 3 dimensions
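To see how the three neighborhoods differ, the following Python sketch tests a few points of the plane for membership in $N(0, 1)$ under each metric. The function names and the sample points are hypothetical choices made for this illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def taxicab(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def supremum(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


origin = (0.0, 0.0)
for p in [(0.9, 0.0), (0.6, 0.6), (0.8, 0.7)]:
    # Record whether p lies in N(origin, 1) for each of the three metrics.
    inside = {d.__name__: d(origin, p) < 1 for d in (euclidean, taxicab, supremum)}
    print(p, inside)
```

The point $(0.6, 0.6)$ lies in the disc and the square but not in the diamond, and $(0.8, 0.7)$ lies only in the square. This matches the picture of the diamond sitting inside the disc, which sits inside the square.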

Distance measures in metric spaces need not be complicated. For any set X you can define $d(x, x) = 0$ for all $x \in X$ and $d(x, y) = 1$ for all x and y in X with $x \neq y$. It is very easy to see that $d(x, y)$ is nonnegative, symmetric, and equal to 0 if and only if $x = y$. Also, for any $x, y, z \in X$, if $d(x, z) = 1$, then $x \neq z$, so at least one of $d(x, y)$ and $d(y, z)$ must be 1 which implies the triangle inequality $d(x, y) + d(y, z) \ge d(x, z)$. Thus, any set X is a metric space with this metric sometimes called the discrete metric, and $\langle X, d \rangle$ is called a discrete metric space. Note that for this metric, each neighborhood $N(a, r)$ is either all of X or just the single point $\{a\}$ depending on whether or not r is greater than 1.
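The claim about neighborhoods in a discrete metric space can be checked directly on a small finite set. In this Python sketch the set `X`, the function names, and the radii are assumptions chosen only for illustration.

```python
def discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0 if x == y else 1


def neighborhood(X, a, r, d=discrete):
    """N(a, r) = {x in X : d(a, x) < r} for a finite set X."""
    return {x for x in X if d(a, x) < r}


X = {"p", "q", "r", "s"}
print(neighborhood(X, "p", 0.5))   # radius below 1
print(neighborhood(X, "p", 1))     # radius exactly 1
print(neighborhood(X, "p", 1.5))   # radius above 1
```

Because the neighborhood is defined with a strict inequality, the radius $r = 1$ still gives only the single point, and the neighborhood becomes all of X only once r exceeds 1.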

Next, consider a space that looks much different than Euclidean space. Let $C[0, 1]$ be the set of all real-valued functions continuous on the interval $[0, 1]$. Certainly, this set contains all the polynomials with real coefficients, but it also includes the rational functions that are defined on $[0, 1]$, exponential functions, many elementary functions, and a much larger class of functions continuous but not differentiable on $[0, 1]$. This set is truly very large as compared, say, to the set of real numbers. There are many ways you might try to measure the distance between two functions in this set. For example, you could evaluate the functions at one or more points and measure how much the functions differ at those points. That is, if f and g are in $C[0, 1]$, you could define $d(f, g) = |f(0) - g(0)| + \left|f\left(\tfrac{1}{2}\right) - g\left(\tfrac{1}{2}\right)\right| + |f(1) - g(1)|$. The only problem with this definition is that there are continuous functions f and g which are equal at 0, $\tfrac{1}{2}$, and 1 but not equal at other points such as $f(x) = x(2x - 1)(x - 1)$ and $g(x) = 2x(2x - 1)(x - 1)$. Because the given distance function gives a distance of 0 between two unequal functions, it cannot serve as a metric for the space of continuous functions on $[0, 1]$ (Fig. 10.4).
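The failure of this three-point distance is easy to confirm by direct evaluation. The Python sketch below uses the two polynomials named above; the helper name `three_point_distance` and the test point $x = \tfrac{1}{4}$ are choices made for this illustration.

```python
def f(x):
    return x * (2 * x - 1) * (x - 1)


def g(x):
    return 2 * x * (2 * x - 1) * (x - 1)


def three_point_distance(f, g):
    """The proposed (flawed) distance, which looks only at x = 0, 1/2, and 1."""
    return sum(abs(f(x) - g(x)) for x in (0.0, 0.5, 1.0))


print(three_point_distance(f, g))   # distance assigned to the pair f, g
print(f(0.25), g(0.25))             # values of f and g at x = 1/4
```

Both functions vanish at 0, $\tfrac{1}{2}$, and 1, so the proposed distance between them is 0 even though their values at $x = \tfrac{1}{4}$ differ. This is exactly the failure of the separation of points property.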

As a result, a distance function that makes $C[0, 1]$ into a metric space really needs to take into account the values of the functions at all the points (or at least a dense set of points) in $[0, 1]$. One distance measure that does this is called the supremum metric or sup metric for short. It is defined for all f and g in $C[0, 1]$ as $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$. It is clear that if $f \neq g$, then there are points $x \in [0, 1]$ where $|f(x) - g(x)| > 0$, so $d(f, g) > 0$, and if $f = g$, then $d(f, g) = 0$ as needed. It is necessary to check that this distance function has a valid definition, that is, for every f and g in $C[0, 1]$ the distance function gives a nonnegative real number. But if f and g are continuous functions on $[0, 1]$, then so is $|f(x) - g(x)|$. Since all functions continuous on $[0, 1]$ are bounded and $|f(x) - g(x)|$ is a continuous function, the needed supremum is defined. The triangle inequality follows from the fact that the triangle inequality works for real numbers. Since for any three continuous functions f, g, and h and for each $x \in [0, 1]$ it is true that $|f(x) - g(x)| + |g(x) - h(x)| \ge |f(x) - h(x)|$, it follows that

\[ \sup_{x \in [0,1]} \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \ge \sup_{x \in [0,1]} |f(x) - h(x)|. \]

Then, the inequality $\sup(A + B) \le \sup A + \sup B$ shows that

\[ \sup_{x \in [0,1]} |f(x) - g(x)| + \sup_{x \in [0,1]} |g(x) - h(x)| \ge \sup_{x \in [0,1]} \big( |f(x) - g(x)| + |g(x) - h(x)| \big) \ge \sup_{x \in [0,1]} |f(x) - h(x)|, \]

and $d(f, g) + d(g, h) \ge d(f, h)$.

Fig. 10.4 Some functions in $C[0, 1]$

PROOF: The set $C[0, 1]$ with distance function $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$ is a metric space.

SET THE CONTEXT: Let $C[0, 1]$ be the set of real-valued functions continuous on the interval $[0, 1]$.

METRIC DEFINITION: For any $f$ and $g$ in $C[0, 1]$, the function $|f(x) - g(x)|$ is also in $C[0, 1]$. Define $d(f, g) = \sup_{x \in [0,1]} |f(x) - g(x)|$, which is a nonnegative real number.

SEPARATION OF POINTS: For $f, g \in C[0, 1]$, if $f \neq g$, then for some $x \in [0, 1]$, $|f(x) - g(x)|$ must be positive, implying that $d(f, g) > 0$.

ZERO DISTANCE: For all $x \in [0, 1]$ and $f \in C[0, 1]$, $|f(x) - f(x)| = 0$, so $\sup_{x \in [0,1]} |f(x) - f(x)| = 0$ and $d(f, f) = 0$.

SYMMETRY: Since for all $x \in [0, 1]$ and all $f, g \in C[0, 1]$, $|f(x) - g(x)| = |g(x) - f(x)|$, it follows that $d(f, g) = d(g, f)$.

TRIANGLE INEQUALITY: Since for all $x \in [0, 1]$ and all $f, g, h \in C[0, 1]$, it holds that $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$, it follows that $\sup_{x \in [0,1]} |f(x) - g(x)| + \sup_{x \in [0,1]} |g(x) - h(x)| \geq \sup_{x \in [0,1]} \left( |f(x) - g(x)| + |g(x) - h(x)| \right) \geq \sup_{x \in [0,1]} |f(x) - h(x)|$, and $d(f, g) + d(g, h) \geq d(f, h)$.

This shows that $C[0, 1]$ with the supremum distance function is a metric space.
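As a numerical illustration of the supremum metric, here is a minimal Python sketch that replaces the supremum over $[0, 1]$ with a maximum over an evenly spaced grid. The grid size and the three sample functions are assumptions made only for this example, and a grid maximum can only approximate the true supremum from below.

```python
import math

def sup_distance(u, v, samples=10_001):
    """Approximate sup over [0, 1] of |u(x) - v(x)| by a maximum over a grid."""
    step = 1.0 / (samples - 1)
    return max(abs(u(k * step) - v(k * step)) for k in range(samples))

# Three continuous functions on [0, 1], chosen only as examples.
f, g, h = math.sin, math.cos, math.exp

d_fg = sup_distance(f, g)
d_gh = sup_distance(g, h)
d_fh = sup_distance(f, h)

print(d_fg, d_gh, d_fh)
print(sup_distance(f, f) == 0)            # zero distance
print(d_fg == sup_distance(g, f))         # symmetry
print(d_fg + d_gh >= d_fh)                # triangle inequality
```

These checks are sanity tests on particular functions, not a substitute for the proof above.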

The supremum metric provides only one of many possible distance functions for the space $C[0, 1]$. Another example is called the $L^1$ metric and is defined by $d(f, g) = \int_0^1 |f(x) - g(x)|\,dx$. Since all functions continuous on a closed interval are integrable there, this distance function is defined. Moreover, since $|f(x) - g(x)| \geq 0$ for all $x \in [0, 1]$, its integral is also nonnegative. If $f \neq g$, then there is an $a \in [0, 1]$ where $f(a) \neq g(a)$. Because $|f(x) - g(x)|$ is continuous and positive at $x = a$, there is a $\delta > 0$ such that for all $x \in [0, 1]$ with $|x - a| < \delta$, $|f(x) - g(x)| > \tfrac{1}{2} |f(a) - g(a)|$.

Thus, $\int_0^1 |f(x) - g(x)|\,dx \geq \int_{a-\delta}^{a+\delta} |f(x) - g(x)|\,dx > 0$, so $d(f, g) > 0$. A rigorous proof will take care that the limits of integration in the previous sentence are chosen in a way that the integral is guaranteed to be defined. The symmetry of $d$ follows from its definition. For all $f, g, h \in C[0, 1]$ and each $x \in [0, 1]$, the triangle inequality gives $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$. Thus, $\int_0^1 |f(x) - g(x)|\,dx + \int_0^1 |g(x) - h(x)|\,dx = \int_0^1 \left( |f(x) - g(x)| + |g(x) - h(x)| \right) dx \geq \int_0^1 |f(x) - h(x)|\,dx$, so $d(f, g) + d(g, h) \geq d(f, h)$, and the needed triangle inequality holds.

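PROOF: The set $C[0, 1]$ with distance function $d(f, g) = \int_0^1 |f(x) - g(x)|\,dx$ is a metric space.

SET THE CONTEXT: Let $C[0, 1]$ be the set of real-valued functions continuous on the interval $[0, 1]$.

METRIC DEFINITION: Define $d(f, g) = \int_0^1 |f(x) - g(x)|\,dx$, which is the integral of a nonnegative continuous function and is therefore a nonnegative real number.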

SEPARATION OF POINTS: For $f, g \in C[0, 1]$, if $f \neq g$, then for some $a \in [0, 1]$, $|f(a) - g(a)|$ must be positive.

Because $|f(x) - g(x)|$ is a continuous function, there is a $\delta > 0$ such that $|f(x) - g(x)| > \tfrac{1}{2} |f(a) - g(a)|$ for all $x \in [0, 1]$ satisfying $|x - a| < \delta$.

In particular, there are $\alpha$ and $\beta$ in $[0, 1]$ with $\alpha < \beta$ such that $|f(x) - g(x)| > \tfrac{1}{2} |f(a) - g(a)|$ for all $x$ satisfying $\alpha < x < \beta$.

Then $d(f, g) = \int_0^1 |f(x) - g(x)|\,dx \geq \int_\alpha^\beta |f(x) - g(x)|\,dx > \tfrac{1}{2} |f(a) - g(a)| (\beta - \alpha) > 0$.

ZERO DISTANCE: For all $x \in [0, 1]$ and $f \in C[0, 1]$, $|f(x) - f(x)| = 0$, so $\int_0^1 |f(x) - f(x)|\,dx = \int_0^1 0\,dx = 0$ and $d(f, f) = 0$.

SYMMETRY: Since for all $x \in [0, 1]$ and all $f, g \in C[0, 1]$, $|f(x) - g(x)| = |g(x) - f(x)|$, it follows that $d(f, g) = d(g, f)$.

TRIANGLE INEQUALITY: Since for all $x \in [0, 1]$ and all $f, g, h \in C[0, 1]$, it holds that $|f(x) - g(x)| + |g(x) - h(x)| \geq |f(x) - h(x)|$, it follows that $\int_0^1 |f(x) - g(x)|\,dx + \int_0^1 |g(x) - h(x)|\,dx = \int_0^1 \left( |f(x) - g(x)| + |g(x) - h(x)| \right) dx \geq \int_0^1 |f(x) - h(x)|\,dx$, and $d(f, g) + d(g, h) \geq d(f, h)$.

This shows that $C[0, 1]$ with the distance function $d$ is a metric space.
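The integral distance can be approximated numerically as well. The sketch below uses the midpoint rule, which is an assumption of this example and not a method prescribed by the text. The functions $x$, $x^2$, and the constant $\tfrac{1}{2}$ are chosen only for illustration; for the first two, the exact distance is $\int_0^1 (x - x^2)\,dx = \tfrac{1}{6}$.

```python
def l1_distance(u, v, pieces=1000):
    """Approximate the integral over [0, 1] of |u(x) - v(x)| with the midpoint rule."""
    width = 1.0 / pieces
    return width * sum(
        abs(u((k + 0.5) * width) - v((k + 0.5) * width)) for k in range(pieces)
    )

def identity(x):
    return x

def square(x):
    return x * x

def half(x):
    return 0.5

d_fg = l1_distance(identity, square)   # close to 1/6
d_fh = l1_distance(identity, half)     # close to 1/4
d_hg = l1_distance(half, square)

print(d_fg, d_fh, d_hg)
print(d_fh + d_hg >= d_fg)             # triangle inequality through the constant 1/2
```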


It is important to note that the supremum metric and the $L^1$ metric are distinctly different. In particular, consider the sequence of functions

$$f_n(x) = \begin{cases} 0 & \text{if } 0 \leq x \leq \frac{1}{n+1} \\ n(n+1)x - n & \text{if } \frac{1}{n+1} < x \leq \frac{1}{n} \\ 1 & \text{if } \frac{1}{n} < x \leq 1 \end{cases}$$

for all natural numbers $n$. In the $L^1$ metric, these functions converge to the function which is identically 1 on $[0, 1]$. On the other hand, this sequence is not even a Cauchy sequence in the supremum metric since $d(f_n, f_m) = 1$ for all $n \neq m$. All metrics for $C[0, 1]$ need to measure the distance between two continuous functions. The supremum metric measures the maximum distance between two functions whereas the $L^1$ metric measures a mean distance between two functions.
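The contrast between the two metrics can be made concrete with exact rational arithmetic. The sketch below reads the sequence as $f_n(x) = 0$ on $[0, \tfrac{1}{n+1}]$, $f_n(x) = n(n+1)x - n$ on $(\tfrac{1}{n+1}, \tfrac{1}{n}]$, and $f_n(x) = 1$ on $(\tfrac{1}{n}, 1]$. The helper names are introduced here for illustration, and the area formula is computed by hand as a rectangle plus a triangle.

```python
from fractions import Fraction

def f(n, x):
    """Piecewise-linear ramp f_n evaluated at a rational point x in [0, 1]."""
    if x <= Fraction(1, n + 1):
        return Fraction(0)
    if x <= Fraction(1, n):
        return n * (n + 1) * x - n
    return Fraction(1)

def l1_distance_to_one(n):
    """Exact integral of |1 - f_n| over [0, 1]: a rectangle plus a triangle."""
    rectangle = Fraction(1, n + 1)
    triangle = Fraction(1, 2) * (Fraction(1, n) - Fraction(1, n + 1))
    return rectangle + triangle

def gap_at_kink(n, m):
    """|f_n(x) - f_m(x)| at x = 1/(min(n, m) + 1), where the smaller index is still 0."""
    x = Fraction(1, min(n, m) + 1)
    return abs(f(n, x) - f(m, x))

for n in (1, 10, 100):
    print(n, l1_distance_to_one(n))    # shrinks toward 0 as n grows

print(gap_at_kink(3, 4), gap_at_kink(2, 5))   # both equal 1
```

Since every $f_n$ takes values in $[0, 1]$, a gap of 1 at a single point already forces the supremum distance between $f_n$ and $f_m$ to equal 1.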

10.3.1 Exercises

Write proofs for each of the following statements.

1. Let $C$ be a circle. For $x$ and $y$ in $C$, define $d(x, y)$ to be the number in $[0, \pi]$ equal to the measure of the central angle in $C$ of the arc bounded by $x$ and $y$. Show that $C$ with this distance function is a metric space.

2. If $d$ is defined for points $(x_1, y_1)$ and $(x_2, y_2)$ in $\mathbb{R}^2$ by $d((x_1, y_1), (x_2, y_2)) = 2|x_1 - x_2| + 3|y_1 - y_2|$, then $\mathbb{R}^2$ with distance function $d$ is a metric space.

3. Let $X$ be the set consisting of all integers plus one extra point $M$. For each $x \in X$, let $d(x, x)$ be 0. For integers $x \neq y$, let $d(x, y) = \frac{1}{\min(|x|, |y|) + 1}$, and for each integer $x$, let $d(x, M) = d(M, x) = \frac{1}{|x| + 1}$. Then $\langle X, d \rangle$ is a metric space.

4. Let $X$ be the collection of all sequences of real numbers $a_1, a_2, a_3, \ldots$ for which there exists a natural number $n$ such that $a_j = a_k$ for all $j$ and $k$ greater than or equal to $n$. In other words, $X$ is the collection of all sequences which are constant from some point on, such as $1, 2, 3, 4, 3, 3, 3, 3, \ldots$ or $\tfrac{1}{2}, \tfrac{2}{3}, \tfrac{2}{3}, \tfrac{2}{3}, \tfrac{2}{3}, \ldots$. Define the distance between two sequences $\langle a_j \rangle$ and $\langle b_j \rangle$ to be 0 if the sequences are identical, and to be the least natural number $n$ for which the difference between the two sequences $\langle a_j - b_j \rangle$ is constant for all terms $j \geq n$. Then $X$ is a metric space with this metric.

5. Let $p$ be any prime number. Then for any two rational numbers $r$ and $s$, define $d(r, s) = 0$ if $r = s$. Otherwise, if $r \neq s$, then $|r - s| = p^n \frac{a}{b}$ where $a$ and $b$ are relatively prime natural numbers, $n$ is an integer, and neither $a$ nor $b$ is divisible by $p$. Define $d(r, s) = p^{-n}$. Then the rational numbers with distance function $d$ form a metric space.

6. If $\langle X, d \rangle$ is a metric space, then for any $c > 0$, $\langle X, c \cdot d \rangle$ is also a metric space.

7. If $\langle X, d_1 \rangle$ and $\langle X, d_2 \rangle$ are both metric spaces, then $\langle X, d_1 + d_2 \rangle$ is also a metric space.


8. If $\langle X, d_X \rangle$ and $\langle Y, d_Y \rangle$ are both metric spaces, then $X \times Y = \{(x, y) \mid x \in X \text{ and } y \in Y\}$ with distance function $d((x_1, y_1), (x_2, y_2)) = d_X(x_1, x_2) + d_Y(y_1, y_2)$ is a metric space.

9. $C[0, 1]$ with distance function $d(f, g) = \sqrt{\int_0^1 \left( f(x) - g(x) \right)^2 dx}$ is a metric space.

10. The $L^1$ metric $d(f, g) = \int_0^1 |f(x) - g(x)|\,dx$ is not a metric for the space of all

Recall that in the real numbers, $\mathbb{R}$, the interior of a set $S$ is defined to be the set of points $x \in S$ such that there is an $\epsilon > 0$ for which the entire interval $(x - \epsilon, x + \epsilon)$ is contained in $S$. The exterior of a set is defined to be the set of points $x \notin S$ such that there is an $\epsilon > 0$ for which the entire interval $(x - \epsilon, x + \epsilon)$ is contained in the complement of $S$. The boundary of a set $S$ is defined to be the set of points neither in the interior nor the exterior of the set, or the points $x$ such that for all $\epsilon > 0$ the set $(x - \epsilon, x + \epsilon)$ contains elements of $S$ and elements of $S^c$. All three of these definitions generalize in a natural way to all metric spaces. Indeed, one just has to replace the role of the open interval $(x - \epsilon, x + \epsilon)$ with the neighborhood $N(x, \epsilon)$. That is, if $\langle X, d \rangle$ is a metric space, and $S \subseteq X$, the interior of $S$, $\text{int}(S)$, is the set of $x \in S$ such that there is an $\epsilon > 0$ for which $N(x, \epsilon) \subseteq S$, the exterior of $S$, $\text{ext}(S)$, is the set of $x \in S^c$ such that there is an $\epsilon > 0$ for which $N(x, \epsilon) \subseteq S^c$, and the boundary of $S$, $\partial S$, is the set of $x \in X$ such that for every $\epsilon > 0$, the set $N(x, \epsilon)$ contains points in $S$ and points in $S^c$.

The definitions of interior, exterior, and boundary, in turn, allow one to define open and closed sets, accumulation point, derived set, and closure in ways analogous to how they are defined for the set of real numbers. That is, if $S$ is a subset of a metric space $X$, $S$ is an open set if $S = \text{int}(S)$, $S$ is a closed set if $\partial S \subseteq S$, $S$ has accumulation point $a$ if, for every $\epsilon > 0$, $\left( N(a, \epsilon) \setminus \{a\} \right) \cap S \neq \emptyset$, the derived set of $S$, $S'$, is the set of accumulation points of $S$, and the closure of $S$, $\text{cl}(S)$, is $S \cup S'$. It is worth noting that for every $x \in X$ and every $\epsilon > 0$, $N(x, \epsilon)$ is an open set. This is easy to show by thinking about how you prove that an open interval in the real numbers is an open set. In the real numbers, if $a < b$, then $(a, b)$ is open because if $y \in (a, b)$, the interval $(y - \epsilon, y + \epsilon) \subseteq (a, b)$ when $\epsilon = \min(y - a, b - y)$. Similarly, then, in metric space $\langle X, d \rangle$, if $a \in X$ and $\epsilon > 0$ are given, let $y \in N(a, \epsilon)$. It follows from the definition of $N(a, \epsilon)$ that $\delta = \epsilon - d(a, y) > 0$. Then, if $x \in N(y, \delta)$, $d(x, y) < \delta = \epsilon - d(a, y)$, so by the triangle inequality $d(a, x) \leq d(a, y) + d(y, x) < \epsilon$, and $x \in N(a, \epsilon)$. Thus, you can conclude that $N(y, \delta) \subseteq N(a, \epsilon)$ when $\delta = \epsilon - d(a, y)$, which proves that $N(a, \epsilon)$ is open.
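The choice of radius $\epsilon - d(a, y)$ for the smaller neighborhood can be spot-checked by random sampling. The sketch below works in the plane with the Euclidean metric, which is only one concrete metric space chosen for illustration. The name `delta` for the smaller radius and the sampling helper are assumptions of this example.

```python
import math
import random

def dist(p, q):
    """Euclidean metric on the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def random_point_in_ball(center, radius, rng):
    """Pick a point strictly inside N(center, radius) using polar coordinates."""
    r = 0.999 * radius * rng.random()
    theta = 2 * math.pi * rng.random()
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))

rng = random.Random(0)              # fixed seed so the run is repeatable
a, eps = (0.0, 0.0), 1.0

all_inside = True
for _ in range(1000):
    y = random_point_in_ball(a, 0.9 * eps, rng)   # a point y in N(a, eps)
    delta = eps - dist(a, y)                      # radius used in the argument above
    x = random_point_in_ball(y, delta, rng)       # a point x in N(y, delta)
    all_inside = all_inside and dist(a, x) < eps  # x should also lie in N(a, eps)

print(all_inside)
```

Sampling can only fail to find a counterexample; the triangle inequality argument is what guarantees the containment for every point.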

Fig. 10.5 Union of open sets is open

Many of the theorems pertaining to the topology of the real numbers proved in the preceding chapter can now be reproved in the context of metric spaces by merely replacing references to open intervals $(x - \epsilon, x + \epsilon)$ with the new neighborhood notation, $N(x, \epsilon)$. For example, consider the proof that the union of open sets is also an open set (Fig. 10.5).

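PROOF: In metric space $\langle X, d \rangle$ assume that for each $i$ in the index set $I$, $A_i$ is an open set. Then $\bigcup_{i \in I} A_i$ is an open set.

In metric space $\langle X, d \rangle$ assume that for each $i$ in the index set $I$, $A_i$ is an open set.

Let $x \in \bigcup_{i \in I} A_i$.

By the definition of set union, there is a $j \in I$ such that $x$ is in the open set $A_j$.

By the definition of open set, there is an $\epsilon > 0$ such that $N(x, \epsilon) \subseteq A_j$.

But by the definition of set union, $A_j \subseteq \bigcup_{i \in I} A_i$, showing that $N(x, \epsilon) \subseteq \bigcup_{i \in I} A_i$, so $\bigcup_{i \in I} A_i$ is an open set.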

Several other examples are left for the exercises. Note that any metric space $\langle X, d \rangle$ with the given definition of open set is a topological space as defined in Chap. 9.

10.4.1 Exercises

Write proofs for each of the following statements.

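1. For every subset, $S$, of metric space $\langle X, d \rangle$, $\text{int}(\text{int}(S)) = \text{int}(S)$.

2. For every subset, $S$, of metric space $\langle X, d \rangle$, $\partial(\partial S) \subseteq \partial S$.

3. For subsets $S$ and $T$ of metric space $\langle X, d \rangle$, $\text{ext}(S \cup T) \subseteq \text{ext}(S) \cap \text{ext}(T)$.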


4. A subset S of a metric spa
