
1.1. Notation and Set Theory

Sets are the most basic building blocks in mathematics, and it is in fact not easy to give a
precise definition of the mathematical object set. Once sets are introduced, however, one
can compare them, define operations similar to addition and multiplication on them, and
use them to define new objects such as various kinds of number systems. In fact, most of
the topics in modern analysis are ultimately based on sets.
Therefore, it is good to have a basic understanding of sets, and we will review a few
elementary facts in this section. Most, if not all, of this section should be familiar and its
main purpose is to define the basic notation so that there will be no confusion in the
remainder of this text.
Definition 1.1.1: Sets and Operations on Sets
A set is a collection of objects chosen from some universe. The universe is
usually understood from the context. Sets are denoted by capital, bold letters or
curly brackets.
A ⊆ B: A is a subset of B means that every element in A is also contained in B.
A ∪ B: A union B is the set of all elements that are either in A or in B or in both.
A ∩ B: A intersection B is the set of all elements that are in both sets A and B.
A \ B: A minus B is the set of all elements from A that are not in B.
comp(A): The complement of A consists of all elements that are not in A.
Two sets A and B are disjoint if A ∩ B = Ø (the empty set).
Two sets A and B are equal if A ⊆ B and B ⊆ A.
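For readers who want to experiment, here is a small computational sketch of these operations using Python's built-in set type; the sets A, B and the finite 'universe' U below are made-up example data, not part of the original text.

A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
U = set(range(0, 10))            # a finite universe chosen only for this illustration

print(A <= B)                    # subset test: is A a subset of B?  -> False
print(A | B)                     # union of A and B
print(A & B)                     # intersection of A and B
print(A - B)                     # A minus B (set difference)
print(U - A)                     # complement of A relative to the universe U
print((A & B) == set())          # disjointness test (False here, since A and B overlap)
print(A <= B and B <= A)         # equality test via mutual inclusion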
The most commonly used sets are the sets of natural numbers, integers, rational and
real numbers, and the empty set. They are usually denoted by these symbols:
N = {1, 2, 3, 4, ... } = natural numbers (sometimes 0 is considered part of the
natural numbers as well)
Z = {... -3, -2, -1, 0, 1, 2, 3, ... } = integers
Q = {p / q : p, q ∈ Z, q ≠ 0} (read as "all numbers p / q such that p and q are elements of Z and q is not zero") = rational numbers
R = real numbers
Ø = empty set (the set that contains no elements)

All of the number systems (except the natural numbers) will be defined in a
mathematically precise way in later sections. First, some examples:
Examples 1.1.2:

Define the following sets: E = { x : x = 2n for n ∈ N}, O = { x : x = 2n - 1 for n ∈ N}, A = { x ∈ R : -4 < x < 3}, B = { x ∈ R : -1 < x < 7}, and I = { x ∈ R : x^2 = -2}. Then:
1. What, in words, are the sets E, O, and I ?
2. Find A ∪ B, A ∩ B, A \ B, comp(A).
3. Find O ∪ E, O ∩ I, comp(I).
Sets can be combined using the above operations much like adding and multiplying
numbers. Familiar laws such as associative, commutative, and distributive laws will be true

for sets as well. As an example, the next result will illustrate the distributive law; other
laws are left as exercises.
Proposition 1.1.3: Distributive Law for Sets
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
Proof

Many results in set theory can be illustrated using Venn diagrams, as in the above proof.
However, such diagrams do not represent mathematically rigorous proofs. Nonetheless,
before an actual proof is developed, it is first necessary to form a mental picture of the
assumptions, conclusions, and implications of a theorem. For this process a Venn diagram
can be very helpful. You can practice Venn diagrams by using them for some of the
true/false statements in the exercises.
There are many other theorems dealing with operations on sets. One that is particularly
interesting is the theorem about de Morgan's Laws, because it deals with any number of
sets (even infinitely many). Drawing a Venn diagram in such a situation would be
impossible, but a mathematical proof can easily deal with this situation:
Theorem 1.1.4: de Morgan Laws

i.e. the complement of the intersection of any


number of sets equals the union of their complements.

i.e. the complement of the union of any number of


sets equals the intersection of their complements.
Proof

So far, we have reviewed a few basic facts from set theory, and also got an idea about how
a course in Real Analysis will proceed:
First, there are definitions, stating precisely what we are talking about. From those
definitions we derive new results, based on old results, notation, and logic. The new results
are called Theorems (if they are important or broad), Propositions (if they are interesting,
but not so broadly applicable) and Corollaries (which are usually restatements of theorems
or propositions in special situations). We will proceed that way throughout the text.
The most difficult part of Real Analysis is trying to understand the proofs of new results, or
even developing your own proofs. While there are a few 'general' methods for proofs, a lot
of experience and practice is needed before you will feel familiar with giving your own
proofs. However, only a few proofs require real ingenuity, and many other proofs can be
understood by carefully reviewing the definitions of terms involved. Therefore, as a rule:
write down the precise mathematical definitions of all terms involved before
starting a proof
In following that rule, one often gets ideas about how to start a proof by starting to
manipulate the mathematical symbols involved in the precise definitions of the terms.
Keep in mind that a proof can (almost) never be given by means of examples. Working out
a few examples can certainly be helpful - and should in fact always be done before starting
a proof - but they can not constitute a rigorous proof of a general statement.
Two types of proofs will be encountered frequently, and deserve special attention:
Proof by Induction: This type of proof is introduced in detail in the next chapter.

Proof by Contradiction: In this type of proof one assumes that the proposition (i.e.
what one actually would like to prove) is false. Then one derives a contradiction,
i.e. a logical impossibility. If that can be accomplished, then one has shown that the
negation of a statement will result in an illogical situation. Hence, the original
statement must be true.
Examples 1.1.5:

Prove that when two even integers are multiplied, the result is an even
integer, and when two odd integers are multiplied, the result is an odd
integer.
Prove that if the square of a number is an even integer, then the
original number must also be an even integer. (Try a proof by
contradiction)
Euclid's Theorem states that there is no largest prime. A proof by
contradiction would start out by assuming that the statement is false, i.e.
there is a largest prime. The advantage now is that if there was a largest
prime, there would be only finitely many primes. This seems easier to
handle than the original statement which implies the existence of
infinitely many primes. Finish the proof.

1.2. Relations and Functions


After introducing some of the basic elements of set theory (sets), we will move on to the
second most elementary concept, the concept of relations and functions.
Definition 1.2.1: Relation
Let A and B be two sets. A relation between A and B is a collection of ordered pairs (a, b) such that a ∈ A and b ∈ B. Often we use the notation a ~ b to indicate that a and b are related, rather than the ordered pair notation (a, b).
Note that this does not mean that each element from A needs to be associated with one (or
more) elements from B. It is sufficient if some associations between elements of A and B
are defined. In contrast, there is the definition of a function:
Definition 1.2.2: Function, Domain, and Range
Let A and B be two sets. A function f from A to B is a relation between A and B such that for each a ∈ A there is one and only one associated b ∈ B. The set A is called the domain of the function, B is called its range.
Often a function is denoted as y = f(x) or simply f(x), indicating
the relation { (x, f(x)) }.
Examples 1.2.3:

Let A = {1, 2, 3, 4}, B = {14, 7, 234}, C = {a, b, c}, and R = real numbers. Define the following relations:
1. r is the relation between A and B that associates the pairs 1 ~ 234,
2 ~ 7, 3 ~ 14, 4 ~ 234, 2 ~ 234
2. f is the relation between A and C that relates the pairs {(1,c),
(2,b), (3,a), (4,b)}
3. g is the relation between A and C consisting of the associations
{(1,a), (2,a), (3,a)}
4. h is the relation between R and itself consisting of pairs

{(x,sin(x))}
Which of those relations are functions ?
The outcomes of a function (i.e. the elements of the range associated to elements in the domain) depend not only on the rule of the function (such as: x is associated with sin(x)) but also on the domain of the function. Therefore, we need to specify those outcomes that are possible for a given rule and a given domain:
Definition 1.2.4: Image and Preimage

Let A and B be two sets and f a function from A to B. Then the image of f is
defined as
imag(f) = { b ∈ B : there is an a ∈ A with f(a) = b }.
Let A and B be two sets and f a function from A to B. If C is a subset of the range B then the preimage, or inverse image, of C under the function f is the set defined as
f^(-1)(C) = { x ∈ A : f(x) ∈ C }
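As a quick computational illustration (not part of the original text), the image and the preimage can be computed directly from these definitions when the domain is a small finite set; the function and the sets below are made-up examples.

A = {-2, -1, 0, 1, 2}                     # domain of the example function
f = lambda x: x * x                       # the rule of the function

image = {f(a) for a in A}                 # imag(f) = { b : there is an a in A with f(a) = b }
C = {1, 4, 9}                             # a subset of the range
preimage = {a for a in A if f(a) in C}    # preimage of C = { x in A : f(x) in C }

print(image)                              # {0, 1, 4}
print(preimage)                           # {-2, -1, 1, 2}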
As an example, consider the following functions:
Example 1.2.5:

Let f(x) = 0 if x is rational and f(x) = 1 if x is irrational. This function is called Dirichlet's Function. The range for f is R.
o Find the image of the domain of the Dirichlet Function when:
1. the domain of f is Q
2. the domain of f is R
3. the domain of f is [0, 1] (the closed interval between 0 and 1)
o What is the preimage of R ? What is the preimage of [-1/2, 1/2] ?
Let f(x) = x^2, with domain and range being R. Then use the graph of the function to determine:
1. What is the image of [0, 2] and the preimage of [1, 4] ?
2. Find the image and the preimage of [-2, 2].


Functions can be classified into three groups: those for which every element in the image
has one preimage, those for which the range is the same as the image, and those which
have both of these properties. Accordingly, we make the following definitions:
Definition 1.2.6: One-one, Onto, Bijection
A function f from A to B is called one to one (or one-one) if whenever f(a) = f(b) then a = b. Such functions are also called injections.
A function f from A to B is called onto if for all b in B there is an a in A such
that f(a) = b. Such functions are also called surjections.
A function f from A to B is called a bijection if it is one to one and onto, i.e.
bijections are functions that are injective and surjective.
Examples 1.2.7:

If the graph of a function is known, how can you decide whether a function is one-to-one (injective) or onto (surjective) ?
Which of the following functions are one-one, onto, or bijections ?
The domain for all functions is R.
1. f(x) = 2x + 5
2. g(x) = arctan(x)

3. g(x) = sin(x)
4. h(x) = 2x^3 + 5x^2 - 7x + 6

1.3. Equivalence Relations and Classes


Just as there were different classes of functions (bijections, injections, and surjections),
there are also special classes of relations. One of the most useful kinds of relations (besides functions, which of course are also relations) is the class of equivalence relations.
Definition 1.3.1: Equivalence Relation
Let S be a set and r a relation between S and itself. We call r an equivalence
relation on S if r has the following three properties:
1. Reflexivity: Every element of S is related to itself
2. Symmetry: If s is related to t then t is related to s
3. Transitivity: If s is related to t and t is related to u, then s is related to u.
Examples 1.3.2:

Let A = {1, 2, 3, 4} and B = {a, b, c} and define the following two relations:
1. r : { (a,a), (b,b), (a,b), (b,a) }
2. s : 1 ~ 1, 2 ~ 2, 3 ~ 3, 4 ~ 4, 1 ~ 4, 4 ~ 1, 2 ~ 4, 4 ~ 2

Which one is an equivalence relation, if any ?


At first glance equivalence relations seem to be too abstract to be useful. However, just the
opposite is the case. Because they are defined in an abstract fashion, equivalence relations
can be utilized in many different situations. In fact, they can be used to define such basic
objects as the integers, the rational numbers, and the real numbers.
The main result about an equivalence relation on a set A is that it induces a partition of A
into disjoint sets. This property is the one that will allow us to define new mathematical
objects based on old ones in the next section.
Theorem 1.3.3: Equivalence Classes

Let r be an equivalence relation on a set A. Then A can be written as a union of disjoint sets A_i with the following properties:
1. If a, b are in A then a ~ b if and only if a and b are in the same set A_i.
2. The subsets A_i are non-empty and pairwise disjoint.
The sets A_i are called equivalence classes.

Proof

Example 1.3.4:

Consider the set Z of all integers. Define a relation r by saying that x and y are related if their difference y - x is divisible by 2. Then
1. Check that this relation is an equivalence relation
2. Find the two equivalence classes, and name them appropriately.
3. How would you add these equivalence classes, if at all ?

What kind of equivalence classes do you get when x and y are defined
to be related if their difference is divisible by m ? How could you add
those ?
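To make the example above concrete, the following small Python sketch (an illustration, not part of the original text) groups a finite range of integers into the classes of the relation "x ~ y if y - x is divisible by m".

def equivalence_classes(elements, m):
    classes = {}
    for x in elements:
        classes.setdefault(x % m, []).append(x)   # x % m labels the class containing x
    return classes

print(equivalence_classes(range(-6, 7), 2))   # two classes: the even and the odd integers
print(equivalence_classes(range(-6, 7), 3))   # three classes for m = 3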

Here is another, more complicated example:


Example 1.3.5:

Consider the set R x R \ {(0,0)} of all points in the plane minus the origin. Define a relation between two points (x,y) and (x',y') by saying that they are related if they are lying on the same straight line passing through the origin. Then:
o Check that this relation is an equivalence relation and find a graphical representation of all equivalence classes by picking an appropriate member for each class.
The space of all equivalence classes obtained under this equivalence relation is called projective space.
More examples for equivalence relations and their resulting classes are given in the next
section.

1.4. Natural Numbers, Integers, and Rational Numbers

In this section we will define some number systems based on those numbers that anyone is
familiar with: the natural numbers. In a course on logic and foundation even the natural
numbers can be defined rigorously. In this course, however, we will take the numbers {1,
2, 3, 4, ...} and their basic operations of addition and multiplication for granted. Actually,
the most basic properties of the natural numbers are called the Peano Axioms, and are
defined (being axioms they need not be derived) as follows:
Definition 1.4.1: Peano Axioms

1 is a natural number.
For every natural number x there exists another natural number x', called the successor of x.
1 ≠ x' for every natural number x (x' being the successor of x).
If x' = y' then x = y.
If Q is a property such that:
1. 1 has the property Q
2. if x has property Q then x' has property Q
then the property Q holds for all natural numbers.


The last property is called the Principle of Induction and it will be treated in more detail in
the next chapter. Right now we want to use the natural numbers to define a new number
system using equivalence classes, discussed in the previous section.
Theorem 1.4.2: The Integers

Let A be the set N x N and define a relation r on N x N by saying that (a,b) is related to (a',b') if a + b' = a' + b. Then this relation is an equivalence relation.
If [(a,b)] and [(a',b')] denote the equivalence classes containing (a,b) and (a',b'), respectively, and if we define addition and multiplication of those equivalence classes as:
1. [(a,b)] + [(a',b')] = [(a + a', b + b')]
2. [(a,b)] * [(a',b')] = [(a * b' + b * a', a * a' + b * b')]

then these operations are well-defined and the resulting set of all equivalence
classes has all of the familiar properties of the integers (it therefore serves to
define the integers based only on the natural numbers).
Proof

When defining operations on equivalence classes, one must prove that the operation is well-defined. That means, because the operation is usually defined by picking particular representatives of a class, one needs to show that the result is independent of these particular representatives. Check the above proof for details.
The above proof is in fact rather abstract. The basic question, however, is: why would
anyone even get this idea of defining those classes and their operations, and what does this
really mean, if anything ? In particular, why is the theorem entitled The Integers ? Try to
go through the few examples below:
Examples 1.4.3:

Which elements are contained in the equivalence classes of, say, [(1,
2)], [(0,0)] and of [(1, 0)] ? Which of the pairs (1, 5), (5, 1), (10, 14), (7,
3) are in the same equivalence classes ?
What do you get when adding, say, [(1,2)] + [(4, 6)] ? How about
[(3,1)] + [(1,3)] ? What about multiplying [(5,4)] * [(7, 4)] and [(1,2)] *
[(2,1)] ?

Can you think of a better notation for denoting the classes containing,
say, (5, 7) ? How about the class containing (7, 5) ? Do you see why the
above theorem is called 'The Integers' ?
An obvious question about multiplication is: why did we not define
[(a, b)] * [(a', b')] = [(a * a', b * b')] ?

It certainly looks a lot easier. But keep in mind that all pairs (a,b) are in the same class if their difference b - a is the same. Hence, we think of the class [(a,b)] as represented by b - a and of the class [(a',b')] as the class represented by b' - a'. Then, in view of the fact that
(b - a) * (b' - a') = b * b' + a * a' - (a * b' + a' * b)
our original definition of multiplication does make some sense.
Incidentally, now it is clear why multiplying two negative numbers together does give a
positive number: it is precisely the above operation on equivalent classes that induces this
interpretation. For example:
(-2) * (-4) = 8
This now has a precise mathematical interpretation.
(-2) is a representative of the equivalence class of pairs of natural numbers (a, b) (which we understand completely) whose difference b - a = -2.
One such pair representing the whole class is, for example, (4, 2).
To represent the class that might be denoted by -4, we might choose the pair (8, 4).
But then, according to our definition of multiplication, we have:
[(4, 2)] * [(8, 4)] = [(4 * 4 + 2 * 8, 4 * 8 + 2 * 4)] = [(32, 40)]
and the appropriate notation for that class is 40 - 32 = 8. Hence, we now understand
completely why (-2) * (-4) = +8.
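To watch the construction at work, here is a small sketch (not from the original text) that manipulates pairs of natural numbers directly, using the addition and multiplication rules of Theorem 1.4.2; representing each class by the difference b - a recovers the familiar integer.

def add(p, q):
    (a, b), (a2, b2) = p, q
    return (a + a2, b + b2)                    # [(a,b)] + [(a',b')] = [(a + a', b + b')]

def mul(p, q):
    (a, b), (a2, b2) = p, q
    return (a * b2 + b * a2, a * a2 + b * b2)  # [(a,b)] * [(a',b')] = [(a b' + b a', a a' + b b')]

def as_integer(p):
    a, b = p
    return b - a                               # the class of (a, b) is represented by b - a

print(as_integer(mul((4, 2), (8, 4))))         # 8, i.e. (-2) * (-4) = 8
print(as_integer(add((1, 2), (4, 6))))         # 3, i.e. 1 + 2 = 3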
Next, we will define the rational numbers in much the same way, leaving the proof as an
exercise. To motivate the next theorem, think about the following:
a / b = a' / b' if and only if a * b' = a' * b
a / b + a' / b' = (a * b' + b * a') / (b * b')
a / b * a' / b' = (a * a') / (b * b')
Theorem 1.4.4: The Rationals
Let A be the set Z x (Z - {0}) and define a relation r on Z x (Z - {0}) by saying that (a, b) is related to (a', b') if a * b' = a' * b. Then this relation is an equivalence relation.
If [(a, b)] and [(a', b')] denote the equivalence classes containing (a, b) and (a', b'), respectively, and if we define the operations
1. [(a, b)] + [(a', b')] = [(a * b' + a' * b, b * b')]
2. [(a, b)] * [(a', b')] = [(a * a', b * b')]
then these operations are well-defined and the resulting set of all equivalence
classes has all of the familiar properties of the rational numbers (it therefore
serves to define the rationals based only on the natural numbers).
Proof

Note that the second component of a pair of integers can not be zero (otherwise the relation
would not be an equivalence relation). As before, this will yield, in a mathematically
rigorous way, a new set of equivalence classes commonly called the 'rational numbers'. The individual equivalence classes [(a,b)] are commonly denoted by the symbol a / b, and are
often called a 'fraction'. The requirement that the second component should not be zero is
the familiar restriction on fractions that their denominator not be zero.
As a matter of fact, the rational numbers are much nicer than the integers or the natural
numbers:
A natural number has no inverse with respect to addition or multiplication
An integer has an inverse with respect to addition, but none with respect to
multiplication.
A rational number has an inverse with respect to both addition and multiplication.
Example 1.4.5:

So, why would anyone bother introducing more complicated numbers, such as the real (or even complex) numbers ? Find as many reasons as you can.

2.1. Countable Infinity


One of the more obvious features of the three number systems N, Z, and Q that were
introduced in the previous chapter is that each contains infinitely many elements. Before
defining our next (and last) number system, R, we want to take a closer look at how one
can handle 'infinity' in a mathematically precise way. We would like to be able to answer
questions like:
1. Are there more even than odd numbers ?
2. Are there more even numbers than integers ?
3. Are there more rational numbers than negative integers ?
While most people would probably agree that there are just as many even as odd numbers, it might be surprising that the answer to the last two questions is no as well. All
of the sets mentioned have the same number - albeit infinite - of elements. The person who
first established a rigorous 'theory of the infinite' was G. Cantor.
The basic idea when trying to count infinitely large (or otherwise difficult to count) sets
can roughly be described as follows:

Suppose you are standing in an empty classroom, with a lot of students waiting to
get in. How could you know whether there are enough chairs for everyone? You
can not count the students, because they walk around too much. So, you simply let the students in, one by one, and have each take a seat. If all the seats are taken, and no students are left standing, then there was the same number of students as chairs.

This simple idea of matching two sets element by element is the basis for comparing two
sets of any size, finite or infinite. Since 'matching elements from one set with those in
another set ' seems related to the concept of a function, we have arrived at the following
definition:
Definition 2.1.1: Cardinality

Let A and B be two sets. We say that A and B have the same
cardinality if there is a bijection f from A to B. We write card(A) =
card(B).
If there exists a function f from A to B that is injective (i.e. one-to-one) we say that card(A) ≤ card(B).
If there exists a function f from A to B that is surjective (i.e. onto) we say that card(A) ≥ card(B).
Please explain carefully what this definition has to do with the above idea of counting
students and chairs?
Examples 2.1.2:

We can now answer questions similar to the ones posed at the beginning:
1. Let E be the set of all even integers, O be the set of odd integers.
Then card(E) = card(O). What is the bijection ?
2. Let E be the set of even integers, Z be the set of all integers.
Again, card(E) = card(Z). Can you find the bijection ?
3. Let N be the set of natural numbers, Z be the set of all integers.
Which set, if any, has the bigger cardinality ?

Definition 2.1.3: Countable and Uncountable


If a set A has the same cardinality as N (the natural numbers), then we say that
A is countable. In other words, a set is countable if there is a bijection from
that set to N.
An alternate way to define countable is: if there is a way to enumerate the
elements of a set, then the set has the same cardinality as N and is called
countable.
A set that is infinite and not countable is called uncountable.
The second part of this definition is actually just a rephrasing of what it means to have a bijection from N to a set A:
If a set A is countable, there is a bijection f from N to A. Therefore, the elements
f(1), f(2), f(3), ... are all in A. But we can easily enumerate them, by putting them in
the following order: f(1) is the first element in A, f(2) is the second element in A,
f(3) is the third one, and so on...
If a set A can be enumerated, then there is a first element, a second element, a third
element, and so on. Then the function that assigns to each element of A its position
in the enumeration process is a bijection between A and N and thus A is countable
by definition.

By the above examples, the set of even integers, odd integers, all positive and negative
integers are all countable.
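One explicit enumeration of the integers, written as a Python sketch for illustration (it is not part of the original text), alternates between the non-negative and the negative integers.

def f(n):
    # f(1) = 0, f(2) = 1, f(3) = -1, f(4) = 2, f(5) = -2, ... is a bijection from N to Z
    return n // 2 if n % 2 == 0 else -(n // 2)

print([f(n) for n in range(1, 10)])   # [0, 1, -1, 2, -2, 3, -3, 4, -4]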
Note that there is a difference between finite and countable, but we will often use the word
countable to actually mean countable or finite (even though it is not proper). However,
here is a nice result that distinguishes the finite from the infinite sets:
Theorem 2.1.4: Dedekind Theorem
A set S is infinite if and only if there exists a proper subset A of S which has the
same cardinality as S.
Proof

Examples 2.1.5:
Use Dedekind's Theorem to show that the set of integers Z and the
interval of real numbers between 0 and 2, [0, 2], are both infinite (which
is of course not surprising).
The surprising fact when dealing with countably infinite sets is that when combining two
countable sets one gets a new set that contains no more elements than each of the previous
sets. The next result will illustrate that.
Proposition 2.1.6: Combining Countable Sets

Every subset of a countable set is again countable (or finite).


The set of all ordered pairs of positive integers is countable.
The countable union of countable sets is countable
The finite cross product of countable sets is countable.
Proof
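The second statement can be made concrete by listing all pairs of positive integers along the diagonals m + n = 2, 3, 4, ...; the following sketch (an illustration, not the proof referenced above) prints the beginning of such an enumeration.

def enumerate_pairs(limit):
    out, s = [], 2
    while len(out) < limit:
        for m in range(1, s):          # all pairs (m, n) with m + n = s
            out.append((m, s - m))
        s += 1
    return out[:limit]

print(enumerate_pairs(10))   # (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ...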

Think about these propositions carefully. They may seem contrary to one's intuition. To see some rather striking examples of the above propositions, consider the following:
Examples 2.1.7:

The set of all rational numbers is countable.


The collection of all polynomials with integer coefficients is countable. To prove this, follow these steps:
1. Show that all polynomials of a fixed degree n (with integer coefficients) are countable by using the above result on finite cross products.
2. Show that all polynomials (with integer coefficients) are countable by writing that set as a countable union of countable sets.

2.2. Uncountable Infinity


The last section raises the question whether it is at all possible to have sets that contain
more than countably many elements. After all, the examples of infinite sets we encountered
so far were all countable. It was Georg Cantor who answered that question: not all infinite
sets are countable.
Proposition 2.2.1: An Uncountable Set
The open interval (0, 1) is uncountable.
Proof

Note that this proposition assumes the existence of the real numbers. At this stage,
however, we have only defined the integers and rationals. We are not supposed to know

anything about the real numbers. Therefore, this proposition should - from a strictly logical
point of view - be rephrased:
if there are 'real numbers' in the interval (0, 1) then there must be uncountably
many.
However, the real numbers, of course, do exist, and are thus uncountable. As for a more
elementary uncountable set, one could consider the following:
the set of all infinite sequences of 0's and 1's is uncountable
The proof of this statement is similar to the above proposition, and is left as an exercise.
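The diagonal idea behind these statements can be illustrated with a small sketch (not part of the original text): given any list of 0-1 sequences, one can always write down a sequence that differs from the n-th listed sequence in its n-th digit, so no list can contain them all. Here only finite prefixes are used, purely for illustration.

listed = [
    "0000000000",
    "0101010101",
    "1111111111",
    "0010010010",
]
diagonal = "".join("1" if seq[n] == "0" else "0" for n, seq in enumerate(listed))
print(diagonal)                                                      # differs from listed[n] at position n
print(all(diagonal[n] != listed[n][n] for n in range(len(listed))))  # True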
What about other familiar sets that are uncountable ?
Examples 2.2.2:

As for other candidates of uncountable sets, we might consider the following:
1. What is the cardinality of the open interval (-1, 1) ?
2. What is the cardinality of any open interval (a, b) ?

3. Is the set R of all real numbers countable or uncountable ?


Now we know examples of countable sets (natural numbers, rational numbers) as well as
of uncountable sets (real numbers, any interval of real numbers). Both contain infinitely
many numbers, but, according to our counting convention via bijections, the reals actually
have a lot more numbers than the rationals.
Are there sets that contain even more elements than the real numbers ? More generally,
given any set, is there a method for constructing another set from the given one that will
contain more elements than the original set ? For finite sets the answer is easy: just add one
more element that was not part of the original set. For countable sets, however, this does
not work, since the new set with one more element would again be countable.
But, there is indeed such a procedure leading to bigger and bigger sets:
Definition 2.2.3: Power Set
The power set of a given set S is the set of all subsets of S, denoted by P(S).
Examples 2.2.4:

If S = {1,2,3}, then what is P(S) ? What is the power set of the set S
= {1, 2, 3, 4} ? How many elements does the power set of S = {1, 2, 3,
4, 5, 6} have ?

Show that card(P(S)) ≥ card(S) for any set S (you don't have to prove strict inequality, only greater or equal).

Theorem 2.2.5: Cardinality of Power Sets


The cardinality of the power set P(S) is always bigger than the cardinality of S for any set S.
Proof

Example 2.2.6: Logical Impossibilities - The Set of all Sets


When dealing with sets of sets, one has to be careful that one does not by
accident construct a logical impossibility. For example, one might casually
define the following superset:
Let S be the set of all those sets which are not members of themselves.
Do you see why this is an impossibility ? Ask yourself whether the set S is a
member of itself ?
Example 2.2.7: A Hierarchy of Infinity - Cardinal Numbers
We can now make a 'hierarchy' of infinity. The 'smallest' infinity is card(N).
The next smallest infinity is card(R), then card(P(R)), then card(P(P(R))), and
so on. By the above theorem, we keep getting bigger and bigger 'infinities'. The

numbers card(S), where S is any finite or infinite set, are called cardinal
numbers, and one can indeed establish rigorous rules for adding and
subtracting them.
Try to establish the following definitions for dealing with cardinalities:
1. Definition of a cardinal number
2. Comparing cardinal numbers
3. Addition of two cardinal numbers
Examples 2.2.8:

Using your above definitions, find the answers for the following
examples:
1. What is card(N) + card(N) ?
2. What is card(N) - card(N) ?
3. What is card(R) + card(N) ?
4. What is card(R) + card(R) ?

Example 2.2.9: The Continuum Hypothesis


An important question when dealing with cardinal numbers is: is there a set S whose cardinal number satisfies the strict inequalities
card(N) < card(S) < card(R)
What might be a possible candidate ?
In real analysis, we will (almost always) deal with finite sets, countable sets, or sets of the
same cardinality as card(R). Larger sets will almost never appear in this text, and the
existence of a set as described above will not be important to us.
In order to prove that two sets have the same cardinality one must find a bijection between
them. That is often difficult, however. In particular, the difficulty in proving that a function
is a bijection is to show that it is surjective (i.e. onto). On the other hand, it is usually easy
to find injective (i.e. one-to-one) functions. Therefore, the next theorem deals with equality
of cardinality if only one-to-one functions can be found.
Note that it should be the case that if you find a one-to-one function from A to B (i.e. each element in A is matched with a different one from B, possibly with some unmatched elements of B left over) and another one-to-one function from B to A (i.e. each element in B is matched with a different one from A, this time possibly leaving extra elements of A unmatched),
then the two sets A and B should contain the same number of elements.
This is indeed true, and is the content of the next and last theorem:
Theorem 2.2.10: Cantor-Bernstein
Let A and B be two sets. If there exists a one-to-one function f from A to B and
another one-to-one function g from B to A, then card(A) = card(B).
Proof

This theorem can be used to show, for example, that R x R has the same cardinality as R
itself, which is one of the exercises. It can also be used to prove that Q is countable by
showing that Q and Z x Z have the same cardinality (another exercise).

2.3. The Principle of Induction


In this section we will briefly review a common technique for many mathematical proofs
called the Principle of Induction. Based on this principle there is a constructive method

called Recursive Definition that is also used in several proofs. Both principles, in fact, can
be applied to any well-ordered set.
Definition 2.3.1: Ordered and Well-Ordered Set
A set S is called partially ordered if there exists a relation r (usually denoted by the symbol ≤) between S and itself such that the following conditions are satisfied:
1. reflexive: a ≤ a for any element a in S
2. transitive: if a ≤ b and b ≤ c then a ≤ c
3. anti-symmetric: if a ≤ b and b ≤ a then a = b
A set S is called ordered if it is partially ordered and every pair of elements x
and y from the set S can be compared with each other via the partial ordering
relation.
A set S is called well-ordered if it is an ordered set for which every non-empty
subset contains a smallest element.
Examples 2.3.2:

Determine which of the following sets and their ordering relations are
partially ordered, ordered, or well-ordered:
1. S is any set. Define a ≤ b if a = b
2. S is any set, and P(S) the power set of S. Define A ≤ B if A ⊆ B
3. S is the set of real numbers in the interval [0, 1]. Define a ≤ b if a is less than or equal to b (i.e. the 'usual' interpretation of the symbol ≤)
4. S is the set of real numbers in the interval [0, 1]. Define a ≤ b if a is greater than or equal to b.
Which of the following sets are well-ordered ?
1. The number systems N, Z, Q, or R ?
2. The set of all rational numbers in [0, 1] ?
3. The set of positive rational numbers whose denominator equals
3?

Theorem 2.3.3: Induction Principle


Let S be a well-ordered set with the additional property that every element except
for the smallest one has an immediate predecessor. Then: if Q is a property such
that:
1. the smallest element of S has the property Q
2. if s ∈ S has property Q then the successor of s also has property Q
Then the property Q holds for every element in S
Proof

Recall that this is very similar to parts of the Peano Axioms, and it is easy to see that the principle of induction applies to the well-ordered set of natural numbers.
To use the principle of induction for the natural numbers one has to proceed in four steps:
1. Define a property that you believe to be true for some ordered set (such as N)
2. Check if the property is true for the smallest number of your set (1 for N)
3. Assume that property is true for an arbitrary element of your set (n for N)
4. Prove that the property is still true for the successor of that element (n+1 for N)
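As an illustration of these four steps (this worked example is not in the original text), consider the statement that the sum of the first n odd numbers equals n^2:
1. Property Q(n): 1 + 3 + 5 + ... + (2n - 1) = n^2.
2. Base case: for n = 1 the sum is 1, and 1^2 = 1, so Q(1) holds.
3. Assume Q(n) holds for some n, i.e. 1 + 3 + ... + (2n - 1) = n^2.
4. Then 1 + 3 + ... + (2n - 1) + (2n + 1) = n^2 + 2n + 1 = (n + 1)^2, so Q(n+1) holds.
By the induction principle, Q(n) is true for every natural number n.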
Examples 2.3.4:

Use induction to prove the following statements:
o The sum of the first n positive integers is n (n+1) / 2.
o If a, b > 0, then (a + b)^n ≥ a^n + b^n for any positive integer n.
Use induction to prove Bernoulli's inequality:
If x ≥ -1 then (1 + x)^n ≥ 1 + n x for all n ∈ N.


Before stating a theorem whose proof is based on the induction principle, we should find
out why the additional property that every element except the smallest one must have an
immediate predecessor is necessary for the induction principle:
Example 2.3.5:
1. The set of natural numbers, with the usual ordering, is well-ordered, and in addition every element except 1 has an immediate predecessor. Now impose a different ordering labeled << on the natural numbers:
o if n and m are both even, then define n << m if n < m
o if n and m are both odd, then define n << m if n < m
o if n is even and m is odd, we always define n << m
Is the set of natural numbers, together with this new ordering <<, well-ordered ? Does it have the property that every element has an immediate predecessor ?

2.

Suppose the induction principle defined above does not contain the
assumption that every element except for the smallest has an immediate
predecessor. Then show that it could be proved that every natural
number must be even (which is, of course, not true so the additional
assumption on the induction principle is necessary).
A somewhat more complicated, but very useful theorem that can be proved by induction is
the binomial theorem:
Theorem 2.3.6: Binomial Theorem
Let a and b be real numbers, and n a natural number. Then:
(a + b)^n = C(n,0) a^n + C(n,1) a^(n-1) b + ... + C(n,k) a^(n-k) b^k + ... + C(n,n) b^n
where C(n,k) = n! / ( k! (n-k)! ) denotes the binomial coefficient.
Proof

Based on the Induction Principle is the principle of Recursive Definition, which is used frequently in computer science.
Definition 2.3.7: Recursive Definition
Let S be a set. If we define a function h from N to S as follows:
1. h(1) is a uniquely defined element of S
2. h(n) is defined via a formula that involves at most terms h(j) for 0 < j <
n
Then this construction determines a unique function h from N to S.
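Recursive definitions translate directly into code. As a small illustration (not part of the original text), the sketch below computes h(n) from previously computed values h(j) with j < n, using a Fibonacci-type rule.

def h(n):
    values = {1: 1, 2: 1}                              # h(1) and h(2) are uniquely defined
    for k in range(3, n + 1):
        values[k] = values[k - 1] + values[k - 2]      # h(k) uses only terms h(j) with j < k
    return values[n]

print([h(n) for n in range(1, 9)])                     # [1, 1, 2, 3, 5, 8, 13, 21]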
Examples 2.3.8:

Below are two recursive definitions, only one of which is valid.


Which one is the valid one ?
1. Let x_0 = x_1 = 1 and define x_n = x_(n-1) + x_(n-2) for all n > 1.
2. Select the following subset from the natural numbers:
x_0 = smallest element of N
x_n = smallest element of N - {x_0, x_1, x_2, ..., x_(n+1)}


When first encountering proofs by induction, it seems that anything can be proved. It is
hard, in fact almost impossible, to find out why a particular property should be true when

looking at an induction proof, and therefore, one might use induction to prove anything
(Incidentally, such proofs are often called non-constructive proofs).
Exercise 2.3.9:
Try to use induction to prove that the sum of the first n square numbers is equal to n (n + 1/2) (n + 1) / 3. (This induction proof should fail, since the statement is false - or is it true ?)
Here is a more elaborate example of an invalid induction proof:
Example 2.3.10:

All birds are of the same color.

"Proof"
The property is clearly true for n = 1, because one bird is of the same color.
Assume that n birds are of the same color.
Now take (n+1) birds. Put one aside. There are n birds left, which, by assumption,
are of the same color. For simplicity, say they are all black. Put the one bird back
into the group, and take out another one. Again, there are n birds remaining, which
by assumption must have the same color. In particular, the bird that was taken out
in the first place must now have the same color as all the other birds, namely black.
The bird taken outside was also black. Therefore, the n+1 birds must all be black.
But that means that we have proved property Q for all natural numbers by
induction. Yet, the statement is obviously wrong. Where does this proof actually
break down ?
A similar word of caution applies to Recursive Definitions. While that principle can be
very useful, one has to be careful not to get into logical difficulties.
Example 2.3.11:

A classical example for a recursive definition that does not work is the
paradox of the barber of Seville: The barber of Seville is that inhabitant
of Seville who shaves every man in Seville that does not shave himself.

The problem here is: who shaves the barber ?


To conclude, let's prove two more 'theorems' via induction:
Examples 2.3.12: Sum of Squares and Cubes

Prove the following statements via induction:


1. The sum of the first n numbers is equal to n (n + 1) / 2
2. The sum of the first n square numbers is equal to n (n + 1) (2n + 1) / 6
3. The sum of the first n cubic numbers is equal to n^2 (n + 1)^2 / 4

2.4. The Real Number System


In the previous chapter we have defined the integers and rational numbers based on the natural numbers and equivalence relations. We have also used the real numbers as our prime example of an uncountable set. In this section we will actually define - in a mathematically correct way - the 'real numbers' and establish their most important properties.
There are actually several convenient ways to define R. Two possible methods of
construction are:

Construction of R via Dedekind cuts
Construction of R via equivalence classes of Cauchy sequences.

Right now, however, it will be more important to describe those properties of R that we
will need for the remainder of this class.
The first question is: why do we need the real numbers ? Aren't the rationals good enough ?
Theorem 2.4.1: No Square Roots in Q
There is no rational number x such that x^2 = x * x = 2.
Proof

Thus, we see that even simple equations have no solution if all we knew were rational
numbers. We therefore need to expand our number system to contain numbers which do
provide a solution to equations such as the above.
There is another reason for preferring real over rational numbers: Informally speaking,
while the rational numbers are all 'over the place', they contain plenty of holes (namely the
irrationals). The real numbers, on the other hand, contain no holes. A little more formally,
we could say that the rational numbers are not closed under the limit operations, while the
real numbers are. More formally speaking, we need some definitions.
Definition 2.4.2: Upper and Least Upper Bound
Let A be an ordered set and X a subset of A. An element b is called
an upper bound for the set X if every element in X is less than or
equal to b. If such an upper bound exists, the set X is called bounded
above.
Let A be an ordered set, and X a subset of A. An element b in A is
called a least upper bound (or supremum) for X if b is an upper
bound for X and there is no other upper bound b' for X that is less
than b. We write b = sup(X). By its definition, if a least upper bound
exists, it is unique.
Examples 2.4.3:

Consider the set S of all rational numbers strictly between 0 and 1.
1. Find 5 different upper bounds for S
2. Find the least upper bound for S.
3. Is there any difference for the set [0, 1] ?
Consider the set of rational numbers {1, 1.4, 1.41, 1.414,
1.4142, ...} converging to the square root of 2.
1. What is the least upper bound of this set, if all we
knew were the rational numbers?
2. What is the least upper bound of this set, allowing real
numbers ?

Can you define Lower Bound and Greatest Lower Bound (called Infimum) ?
In the above example we have seen several facts:
an upper (or lower) bound need not be unique
a least upper bound (or greatest lower bound) may or may not be part of the set
least upper bounds (or greatest lower bounds) may fail to exist in Q, but do exist in
R.

In fact, this last property is exactly what distinguishes the real numbers from the other number systems, and makes them useful to us. We will state this as a theorem:
Theorem 2.4.4: Least Upper Bound Property
There exists an ordered field of numbers R with the properties:
1. it contains all rational numbers
2. it has the property that any non-empty subset which has an
upper bound has a least upper bound.
Proof

Note that we have not defined 'ordered field', but that is not so important for us right now.
The importance for us is that this property is one of the most basic properties of the real
numbers, and it distinguishes the real from the rational numbers (which do not have this
property).
In order to prove this theorem we need to know what exactly the real numbers are, and we have indeed given two possible constructions at the beginning of this section. However, it
is more important to understand these properties of R, and to know about the differences
between R and the other number systems N, Z, and Q.
We can use this theorem to illustrate another property of the real numbers that makes them
more useful than the rational numbers:
Theorem 2.4.5: Square Roots in R
There is a positive real number x such that x^2 = 2
Proof

There are several other properties that will be of importance later on. Two of those are the
Archimedean and the Density property. Again, as for the Least Upper Bound property, it is
more important to understand what these properties mean than to follow the proof exactly.
Theorem 2.4.6: Properties of R and Q
The set of real numbers satisfies the Archimedean Property:
Let a and b be positive real numbers. Then there is a natural
number n such that n * a > b
The set of rational numbers satisfies the following
Density Property:
Let c < d be real numbers. Then there is a rational number q
with c < q < d.
Proof

This concludes the 'elementary' part of this text. We have now defined much of our basic
notation, learned how to count to infinity, introduced all number systems that we will be
using, and we have seen several different types of proofs. We are now ready for some more
complicated topics.

2.5. Pi and e are irrational


Pi and e (Euler's number) are two numbers that occur everywhere in mathematics. Both are
irrational and in fact transcendental numbers. In this little appendix we will prove that both

numbers are irrational. We will not show that they are also transcendental numbers. The
proofs will actually use tools from much later sections, but they might be interesting
nevertheless, and the results fit well in this section.
First, we will prove that e (Euler's number) is irrational. In fact, it is a transcendental number, but that requires a lot of work to prove.
Theorem 2.5.3: e is irrational
Euler's number e is irrational
Proof: We know that for any integer n
e = 1 + 1/1! + 1/2! + ... + 1/n! + Rn
(check the definition of Euler's number and of the corresponding series), where
Rn = 1/(n+1)! + 1/(n+2)! + ... , and 0 < Rn < 1/(n! n).
Now suppose that e was rational, i.e. there are two positive integers a and b with e = a / b. Choose n > b. Then, using the above representation, we have:
a / b = 1 + 1/1! + 1/2! + ... + 1/n! + Rn
or multiplying both sides by n! we get:
n! a / b = n! (1 + 1/1! + 1/2! + ... + 1/n!) + n! Rn
Since n > b, the left side of this equation represents an integer, and hence n! Rn is also an integer. But we know that
0 < n! Rn < n! / (n! n) = 1/n
so that if n is large enough the left side is less than 1. But then n! Rn must be a positive integer less than 1, which is a contradiction.
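As a purely numerical illustration of the estimate used above (not part of the original proof), one can compute n! * Rn = n! * (e - (1 + 1/1! + ... + 1/n!)) for a few values of n and watch it stay strictly between 0 and 1/n; ordinary floating point arithmetic is accurate enough here.

import math

for n in (2, 5, 8, 10):
    partial = sum(1 / math.factorial(k) for k in range(n + 1))
    scaled_remainder = math.factorial(n) * (math.e - partial)   # this is n! * Rn
    print(n, scaled_remainder, 1 / n)                           # 0 < n! * Rn < 1/n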
Next we prove that Pi is irrational, which is a lot harder to do. We need a preliminary
result:
Lemma 2.5.1:
Define a function fn(x) = x^n (1 - x)^n / n!. Then this function has the following properties:
0 < fn(x) < 1 / n! for 0 < x < 1
the derivatives fn^(k)(0) and fn^(k)(1) are both integers for every k

Proof: Using the binomial theorem, we see that when the numerator of the function is multiplied out, the lowest power of x will be n, and the highest power is 2n. Therefore, the function can be written as
fn(x) = (1/n!) ( c_n x^n + c_(n+1) x^(n+1) + ... + c_(2n) x^(2n) )
where all coefficients c_k are integers. It is clear from this expression that fn^(k)(0) = 0 for k < n and for k > 2n. Also, looking at the sum more carefully, we see that for n ≤ k ≤ 2n
fn^(k)(0) = (k! / n!) c_k
But this implies that fn^(k)(0) is an integer for any k, since k! / n! is an integer when k ≥ n. Moreover, since fn(x) = fn(1 - x), we also have fn^(k)(1) = (-1)^k fn^(k)(0). Therefore fn^(k)(1) is an integer for any k as well.
Also note that, for any fixed a, the quantity a^n / n! becomes arbitrarily small for n large enough (for example n > 2a). Now we can prove the result of this chapter:
result of this chapter:
Theorem 2.5.2: Pi is irrational
Pi is irrational
Proof: We will prove that Pi^2 is not rational, which implies the assertion. Suppose it was rational. Then there are two positive integers a and b with
Pi^2 = a / b
Define the function
G(x) = b^n ( Pi^(2n) fn(x) - Pi^(2n-2) fn''(x) + Pi^(2n-4) fn^(4)(x) - ... + (-1)^n fn^(2n)(x) )
Each of the factors b^n Pi^(2n-2k) = a^(n-k) b^k is an integer, by the assumption on Pi^2. We also know that fn^(k)(0) and fn^(k)(1) are integers as well. Hence, G(0) and G(1) are both integers.
Differentiating G(x) twice, we get:
G''(x) = b^n ( Pi^(2n) fn''(x) - Pi^(2n-2) fn^(4)(x) + ... + (-1)^n fn^(2n+2)(x) )
The last term in this equation, the (2n+2)-th derivative, is zero, by the properties of the function fn (it is a polynomial of degree 2n). Adding G(x) and G''(x) we get:
G''(x) + Pi^2 G(x) = b^n Pi^(2n+2) fn(x) = Pi^2 a^n fn(x)
Now define another function
H(x) = G'(x) sin(Pi x) - Pi G(x) cos(Pi x)
Then, using the above formula for G(x) + G''(x), we have:
H'(x) = ( G''(x) + Pi^2 G(x) ) sin(Pi x)
= Pi^2 a^n fn(x) sin(Pi x)
Now we can use the second fundamental theorem of calculus to conclude:
Pi * integral from 0 to 1 of a^n fn(x) sin(Pi x) dx = ( H(1) - H(0) ) / Pi
= G(1) + G(0)
Thus, the integral
Pi * integral from 0 to 1 of a^n fn(x) sin(Pi x) dx
is an integer. But we also know that 0 < fn(x) < 1 / n! for 0 < x < 1. Therefore, estimating the above integral, we get
0 < Pi a^n fn(x) sin(Pi x) < Pi a^n / n!
for 0 < x < 1. Therefore, we can estimate our integral to get
0 < Pi * integral from 0 to 1 of a^n fn(x) sin(Pi x) dx < Pi a^n / n! < 1
Here we have used the fact that the last fraction approaches zero if n is large enough. But now we have a contradiction, because that integral was supposed to be an integer. Since there is no positive integer less than 1, our assumption that Pi^2 was rational resulted in a contradiction. Hence, Pi^2 must be irrational.

This proof is, admittedly, rather curious. Aside from the assumption that Pi is rational (leading to the contradiction) the only other properties of Pi that are really involved in this proof are that
sin(Pi) = 0, cos(Pi) = -1
and we need the defining properties of the trig. functions that
sin'(x) = cos(x), cos'(x) = -sin(x), sin(0) = 0, cos(0) = 1
That does not make the proof much clearer, but it illustrates that some essential properties of Pi are indeed used along the way, and this proof will not work for any other number.
Taken from Calculus (2nd Edition), by Michael Spivak
Publish or Perish, Inc, 1980, pages 307 - 310

3.1. Sequences

So far we have introduced sets as well as the number systems that we will use in this text.
Next, we will study sequences of numbers. Sequences are, basically, countably many
numbers arranged in an order that may or may not exhibit certain patterns. Here is the
formal definition of a sequence:
Definition 3.1.1: Sequence
A sequence of real numbers is a function f: N → R. In other words, a sequence can be written as f(1), f(2), f(3), ... Usually, we will denote such a sequence by the symbol {a_j}, where a_j = f(j).

For example, the sequence 1, 1/2, 1/3, 1/4, 1/5, ... is written as {1/j}. Keep in mind that
despite the strange notation, a sequence can be thought of as an ordinary function. In many
cases that may not be the most expedient way to look at the situation. It is often easier to
simply look at a sequence as a 'list' of numbers that may or may not exhibit a certain
pattern.
We now want to describe what the long-term behavior, or pattern, of a
sequence is, if any.
Definition 3.1.2: Convergence
A sequence {a_j} of real (or complex) numbers is said to converge to a real (or complex) number c if for every ε > 0 there is an integer N > 0 such that if j > N then
| a_j - c | < ε
The number c is called the limit of the sequence {a_j} and we sometimes write a_j → c.
If a sequence {a_j} does not converge, then we say that it diverges.

Example 3.1.3:

Consider the sequence { 1/j }. It converges to zero. Prove it.
The sequence { (-1)^j } does not converge. Prove it.
The sequence { (-1)^j / j } converges to zero. Prove it.
Convergent sequences, in other words, exhibit the behavior that they get closer and closer
to a particular number. Note, however, that divergent sequence can also have a regular
pattern, as in the second example above. But it is convergent sequences that will be
particularly useful to us right now.
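As a purely numerical illustration of the definition (not part of the original text), one can compute, for a given ε, an index N beyond which all terms of the example sequence 1/j stay within ε of its limit 0; the sequence and the code are illustration material only.

def find_N(epsilon):
    # for a_j = 1/j and limit c = 0, find N such that |a_j - c| < epsilon whenever j > N
    N = 1
    while 1 / (N + 1) >= epsilon:
        N += 1
    return N

for eps in (0.1, 0.01, 0.001):
    N = find_N(eps)
    print(eps, N, 1 / (N + 1))   # the first term beyond N is already within eps of 0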

We are going to establish several properties of convergent sequences, most of which are probably familiar to you. Many proofs will use an 'ε argument' as in the proof of the next result. This type of argument is not easy to get used to, but it will appear again and again, so you should try to get as familiar with it as you can.
Proposition 3.1.4: Convergent Sequences are Bounded
Let {a_j} be a convergent sequence. Then the sequence is bounded, and the limit is unique.
Proof

Example 3.1.5:
The Fibonacci numbers are recursively defined as x_1 = 1, x_2 = 1, and for all n > 2 we set x_n = x_(n-1) + x_(n-2). Show that the sequence of Fibonacci numbers {1, 1, 2, 3, 5, ...} does not converge.
Convergent sequences can be manipulated on a term by term basis, just as one would
expect:
Proposition 3.1.6: Algebra on Convergent Sequences

Suppose {a_n} and {b_n} are convergent sequences converging to a and b, respectively. Then:
1. Their sum is convergent to a + b, and the sequences can be added term by term.
2. Their product is convergent to a * b, and the sequences can be multiplied term by term.
3. Their quotient is convergent to a / b, provided that b ≠ 0, and the sequences can be divided term by term (if the denominators are not zero).
4. If a_n ≤ b_n for all n, then a ≤ b.
Proof

This theorem states exactly what you would expect to be true. The proof
of it employs the standard trick of 'adding zero' and using the triangle
inequality. Try to prove it on your own before looking it up.
Note that the fourth statement is no longer true for strict inequalities. In
other words, there are convergent sequences with an < bn for all n, but
strict inequality is no longer true for their limits. Can you find an
example ?
While we now know how to deal with convergent sequences, we still need an easy criterion that will tell us whether a sequence converges. The next proposition gives reasonably easy conditions, but it will not tell us the actual limit of the convergent sequence.
First, recall the following definitions:
Definition 3.1.7: Monotonicity
A sequence {a_j} is called monotone increasing if a_(j+1) ≥ a_j for all j.
A sequence {a_j} is called monotone decreasing if a_j ≥ a_(j+1) for all j.
In other words, if every next member of a sequence is larger than the previous one, the
sequence is growing, or monotone increasing. If the next element is smaller than each
previous one, the sequence is decreasing. While this condition is easy to understand, there
are equivalent conditions that are often easier to check:
Monotone increasing:
1. a_(j+1) ≥ a_j
2. a_(j+1) - a_j ≥ 0
3. a_(j+1) / a_j ≥ 1, if a_j > 0
Monotone decreasing:
1. a_(j+1) ≤ a_j
2. a_(j+1) - a_j ≤ 0
3. a_(j+1) / a_j ≤ 1, if a_j > 0
Examples 3.1.8:

Is the sequence
monotone increasing or decreasing ?
Is the sequence
monotone increasing or decreasing ?
Is it true that a bounded sequence converges ? How about monotone increasing sequences ?
Here is a very useful theorem to establish convergence of a given sequence (without,
however, revealing the limit of the sequence): First, we have to apply our concepts of
supremum and infimum to sequences:

If a sequence {x_k} is bounded above, then c = sup(x_k) is finite. Moreover, given any ε > 0, there exists at least one integer k such that x_k > c - ε.
If a sequence {x_k} is bounded below, then c = inf(x_k) is finite. Moreover, given any ε > 0, there exists at least one integer k such that x_k < c + ε.

Proposition 3.1.9: Monotone Sequences

If {a_j} is a monotone increasing sequence that is bounded above, then the sequence must converge.
If {a_j} is a monotone decreasing sequence that is bounded below, then the sequence must converge.

Proof

Using this result it is often easy to prove convergence of a sequence just


by showing that it is bounded and monotone. The downside is that this
method will not reveal the actual limit, just prove that there is one.
Examples 3.1.10:

Prove that the sequences


and
converge.
What is their limit?
Define x_1 = b and let x_n = x_(n-1) / 2 for all n > 1. Prove that this sequence converges for any number b. What is the limit ?
Let a > 0 and x_0 > 0 and define the recursive sequence
x_(n+1) = (1/2) (x_n + a / x_n)
Show that this sequence converges to the square root of a regardless of the starting point x_0 > 0.
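Before attempting a proof for the last example, it can help to watch the recursion numerically. The sketch below (an illustration only, not part of the original text) iterates x_(n+1) = (1/2)(x_n + a / x_n) for a = 2 from an arbitrary starting value.

a, x = 2.0, 5.0                  # example values: a = 2, starting point x_0 = 5
for n in range(6):
    x = (x + a / x) / 2          # x_(n+1) = (1/2)(x_n + a / x_n)
    print(n + 1, x)              # the terms settle down toward 1.41421356..., the square root of 2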
There is one more simple but useful theorem that can be used to find a limit if comparable
limits are known. The theorem states that if a sequence is pinched in between two
convergent sequences that converge to the same limit, then the sequence in between must
also converge to the same limit.
Theorem 3.1.11: The Pinching Theorem
Suppose {a_j} and {c_j} are two convergent sequences such that lim a_j = lim c_j = L. If a sequence {b_j} has the property that
a_j ≤ b_j ≤ c_j
for all j, then the sequence {b_j} converges and lim b_j = L.
Proof

Example 3.1.12:
Show that the sequences sin(n) / n and cos(n) / n both converge to zero.


3.2. Cauchy Sequences

What is slightly annoying for the mathematician (in theory and in practice) is that we refer to the limit of a sequence in the definition of a convergent sequence when that limit may not be known at all. In fact, more often than not it is quite hard to determine the actual limit of a sequence.
We would prefer to have a definition which only includes the known
elements of the particular sequence in question and does not rely on the
unknown limit. Therefore, we will introduce the following definition:
Definition 3.2.1: Cauchy Sequence
Let {a_j} be a sequence of real (or complex) numbers. We say that the sequence satisfies the Cauchy criterion (or simply is Cauchy) if for each ε > 0 there is an integer N > 0 such that if j, k > N then
| a_j - a_k | < ε
This definition states precisely what it means for the elements of a sequence to get closer
together, and to stay close together. Of course, we want to know what the relation between
Cauchy sequences and convergent sequences is.
Theorem 3.2.2: Completeness Theorem in R
Let {a_j} be a Cauchy sequence of real numbers. Then the sequence is bounded.
Let {a_j} be a sequence of real numbers. The sequence is Cauchy if and only if it converges to some limit a.
Proof

Thus, by considering Cauchy sequences instead of convergent


sequences we do not need to refer to the unknown limit of a sequence,
and in effect both concepts are the same.
Note that the Completeness Theorem is not true if we consider only rational numbers. For example, the sequence 1, 1.4, 1.41, 1.414, ...
(convergent to the square root of 2) is Cauchy, but does not converge to
a rational number. Therefore, the rational numbers are not complete, in
the sense that not every Cauchy sequence of rational numbers
converges to a rational number.
Hence, the proof will have to use that property which distinguishes the
reals from the rationals: the least upper bound property.

3.3. Subsequences

So far we have learned the basic definitions of a sequence (a function from the natural
numbers to the Reals), the concept of convergence, and we have extended that concept to
one which does not pre-suppose the unknown limit of a sequence (Cauchy sequence).
Unfortunately, however, not all sequences converge. We will now
introduce some techniques for dealing with those sequences. The first is
to change the sequence into a convergent one (extract subsequences)
and the second is to modify our concept of limit (lim sup and lim inf).
Definition 3.3.1: Subsequence
Let { aj } be a sequence. When we extract from this sequence only
certain elements and drop the remaining ones we obtain a new
sequence consisting of an infinite subset of the original sequence.
That sequence is called a subsequence and is denoted by { an_j },
where n1 < n2 < n3 < ... are the indices of the extracted elements.
One can extract infinitely many subsequences from any given sequence.
Examples 3.3.2:

Take the sequence


, which we have proved does
not converge. Extract every other member, starting with the
first. Does this sequence converge ? What if we extract every
other member, starting with the second. What do you get in
this case ?

Take the sequence


. Extract three different
subsequences of your choice. Do these subsequences
converge ? If so, to what limit ?
The last example is an indication of a general result:
Proposition 3.3.3: Subsequences from Convergent Sequence

If { aj } is a convergent sequence, then every subsequence of that
sequence converges to the same limit.
If { aj } is a sequence such that every possible
subsequence extracted from that sequence converges
to the same limit, then the original sequence also
converges to that limit.
Proof

The next statement is probably one of the most fundamental results of


basic real analysis, and generalizes the above proposition. It also
explains why subsequences can be useful, even if the original sequence
does not converge.

Theorem 3.3.4: Bolzano-Weierstrass


Let { aj } be a sequence of real numbers that is bounded. Then there
exists a subsequence { an_j } that converges.
Proof

Example 3.3.5:

Does
converge ? Does there exist a convergent
subsequence ? What is that subsequence ?

In fact, the following is true: given any number L between


-1 and 1, it is possible to extract a subsequence from the

sequence
that converges to L. This is difficult to
prove.
Next, we will broaden our concept of limits.

3.4. Lim Sup and Lim Inf


When dealing with sequences there are two choices:
the sequence converges
the sequence diverges

While we know how to deal with convergent sequences, we don't know much about
divergent sequences. One possibility is to try and extract a convergent subsequence, as
described in the last section. In particular, Bolzano-Weierstrass' theorem can be useful in
case the original sequence was bounded. However, we often would like to discuss the limit
of a sequence without having to spend much time on investigating convergence, or
thinking about which subsequence to extract. Therefore, we need to broaden our concept of
limits to allow for the possibility of divergent sequences.
Definition 3.4.1: Lim Sup and Lim Inf
Let { aj } be a sequence of real numbers. Define
Aj = inf{aj , aj + 1 , aj + 2 , ...}
and let c = lim (Aj). Then c is called the limit inferior of the sequence
{ aj }.
Let { aj } be a sequence of real numbers. Define
Bj = sup{aj , aj + 1 , aj + 2 , ...}
and let c = lim (Bj). Then c is called the limit superior of the
sequence { aj }.
In short, we have:
1. lim inf(aj) = lim(Aj) , where Aj = inf{aj , aj + 1 , aj + 2 , ...}
2. lim sup(aj) = lim(Bj) , where Bj = sup{aj , aj + 1 , aj + 2 , ...}
When trying to find lim sup and lim inf for a given sequence, it is best to find the first few
Aj's or Bj's, respectively, and then to determine the limit of those. If you try to guess the
answer quickly, you might get confused between an ordinary supremum and the lim sup, or
the regular infimum and the lim inf.
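For a concrete feeling, here is a Python sketch (the sample sequence an = (-1)^n (1 + 1/n) and the finite truncation of the tails are illustrative assumptions) that computes approximate values of the first few Aj and Bj:

# Approximate A_j = inf of the tail and B_j = sup of the tail for the sample
# sequence a_n = (-1)**n * (1 + 1/n); truncating the tail at a finite length
# only approximates the true infimum and supremum.
a = [(-1) ** n * (1 + 1.0 / n) for n in range(1, 2001)]

for j in (1, 10, 100, 1000):
    tail = a[j - 1:]                 # the terms a_j, a_{j+1}, ...
    print(j, min(tail), max(tail))   # A_j tends to -1 (lim inf), B_j to +1 (lim sup)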
Examples 3.4.2:

What is inf, sup, lim inf and lim sup for

What is inf, sup, lim inf and lim sup for

?
?

What is inf, sup, lim inf and lim sup for


While these limits are often somewhat counter-intuitive, they have one very useful
property:
Proposition 3.4.3: Lim inf and Lim sup exist

lim sup and lim inf always exist (possibly infinite) for any sequence
of real numbers.
Proof

It is important to try to develop a more intuitive understanding about


lim sup and lim inf. The next results will attempt to make these concepts
somewhat more clear.
Proposition 3.4.4: Characterizing lim sup and lim inf
Let
be an arbitrary sequence and let c = lim sup(aj) and d =
lim inf(aj). Then
1. there is a subsequence converging to c
2. there is a subsequence converging to d
3. d ≤ lim inf( an_j ) ≤ lim sup( an_j ) ≤ c for any subsequence { an_j }
4. If c and d are both finite, then: given any ε > 0 there are arbitrarily
large j such that aj > c - ε and arbitrarily large k such that ak < d + ε.

Proof

A little more colloquially, we could say:


Aj picks out the greatest lower bound for the truncated sequences {aj}. Therefore Aj
tends to the smallest possible limit of any convergent subsequence.
Similarly, Bj picks the smallest upper bound of the truncated sequences, and hence
tends to the greatest possible limit of any convergent subsequence.
Compare this with a similar statement about supremum and infimum.
Example 3.4.5
If
is the sequence of all rational numbers in the
interval [0, 1], enumerated in any way, find the lim sup and
lim inf of that sequence.
The final statement relates lim sup and lim inf with our usual concept of limit.
Proposition 3.4.6: Lim sup, lim inf, and limit

If a sequence {aj} converges then


lim sup aj = lim inf aj = lim aj
Conversely, if lim sup aj = lim inf aj are both finite then {aj}
converges.
Proof

3.5. Special Sequences


In this section we will take a look at some sequences that will appear again and again. You
should try to memorize all those sequences and their convergence behavior.
Power Sequence
Exponent Sequence
Root of n Sequence
n-th Root Sequence
Binomial Sequence
Euler's Sequence
Exponential Sequence


Definition 3.5.1: Power Sequence


Power Sequence: The convergence properties of the power sequence { a^n }
depend on the size of the base a:
|a| < 1: the sequence converges to 0.
a = 1: the sequence converges to 1 (being constant)
a > 1: the sequence diverges to plus infinity
a ≤ -1: the sequence diverges
(Applet illustration: convergent power sequence with a = -9/10; divergent power sequence with a = 11/10)

Proof:
This seems an obvious statement: if a number is less than one in absolute value, it gets
smaller and smaller when raised to higher and higher powers. Proving something 'obvious',
however, is often difficult, because it may not be clear how to start. To prove the statement
we have to resort to one of the elementary properties of the real number system: the
Archimedean principle.
Case a > 1:
Take any real number K > 0 and define
x=a-1
Since a > 1 we know that x > 0. By the Archimedean principle there exists a
positive integer n such that nx > K - 1. Using Bernoulli's inequality for that n we
have:
a^n = (1 + x)^n ≥ 1 + nx > 1 + (K - 1) = K
But since K was an arbitrary number, this proves that the sequence {a^n} is
unbounded. Hence it cannot converge.
Case 0 < a < 1:
Take any ε > 0. Since 0 < a < 1 we know that 1/a > 1, so that by the previous proof
we can find an N with
(1/a)^N > 1/ε
But then it follows that
a^n < ε for all n > N
This proves that the sequence {a^n} converges to zero.

Case -1 < a < 0:


By the above proof we know that | a^n | converges to zero. But since -| a^n | ≤ a^n ≤
| a^n | the sequence {a^n} again converges to zero by the Pinching Theorem.
Case a < -1:
Extract the subsequence {a^(2m)} from the sequence {a^n}. Then this subsequence diverges
to infinity by the first part of the proof, and therefore the original sequence cannot
converge either.
Case a = 1:
This is the constant sequence, so it converges.
Case a = -1:
We have already proved that the sequence { (-1)^n } does not converge.



Definition 3.5.2: Exponent Sequence


Exponent Sequence: The convergence of the sequence { n^a } depends on the size of the
exponent a:
a > 0: the sequence diverges to positive infinity
a = 0: the sequence is constant
a < 0: the sequence converges to 0
(Applet illustration: exponent sequence with a = 2 and with a = -2)

Proof:
Write n^a = e^(a ln(n)). Then:
if a > 0 then as n approaches infinity, the function e^(a ln(n)) approaches infinity as well
if a < 0 then as n approaches infinity, the function e^(a ln(n)) approaches zero
if a = 0 then the sequence is the constant sequence, and hence convergent
Is this a good proof ? Of course not, because at this stage we know nothing about the
exponential or logarithm function. So, we should come up with a better proof.
But actually we first need to understand exactly what n^a really means:
If a is an integer, then clearly n^a means to multiply n with itself a times
If a = p/q is a rational number, then n^(p/q) means to multiply n with itself p times, then take the
q-th root
It is unclear, however, what n^a means if a is an irrational number. For example,
what should n raised to an irrational power such as π mean ?
One way to define n^a for all a is to resort to the exponential function:
n^a = e^(a ln(n))
In that light, the original 'proof' was not bad after all, but of course we now need to know
how exactly the exponential function is defined, and what its properties are before we can
continue. As it turns out, the exponential function is not easy to define properly. Generally
one can either define it as the inverse function of the ln function, or via a power series.
Another way to define n^a for irrational a is to take a sequence of
rational numbers rn converging to a and to define n^a as the limit of the
sequence {n^(rn)}. There the problem is to show that this is well-defined,
i.e. that if there are two different sequences of rational numbers converging
to a, the resulting limit will be the same.
In either case, we will base our proof on the simple fact:
if p > 0 and x > y > 0 then x^p > y^p

which seems clear enough and could be formally proved as soon as the exponential
function is introduced to properly define x^p.
Now take any positive number K and let n be an integer bigger than K^(1/a).
Since
n > K^(1/a)
we can raise both sides to the a-th power to get
n^a > K
which means - since K was arbitrary - that the sequence {n^a} is unbounded.
The case a = 0 is clear (since n^0 = 1). The case a < 0 is
related to the first by taking reciprocals (details are left as an exercise).
Since we have already proved the first case we are done.

Definition 3.5.3: Root of n Sequence


Root of n Sequence: The sequence { n^(1/n) } converges to 1.
(Applet illustration: root of n sequence)

Proof:
If n > 1, then n^(1/n) > 1. Therefore, we can find numbers an > 0 such that
n^(1/n) = 1 + an for each n > 1
Hence, we can raise both sides to the n-th power and use the Binomial
theorem:
n = (1 + an)^n = 1 + n an + [n(n-1)/2] an^2 + ... + an^n
In particular, since all terms are positive, we obtain
n ≥ [n(n-1)/2] an^2
Solving this for an we obtain
0 ≤ an ≤ sqrt( 2/(n-1) )
But that implies that an converges to zero as n approaches infinity, which
means, by the definition of an, that n^(1/n) converges to 1 as n goes to infinity.
That is what we wanted to prove.


Definition 3.5.4: n-th Root Sequence


n-th Root Sequence: The sequence { a^(1/n) } converges to 1 for any a > 0.
(Applet illustration: n-th root sequence with a = 3)

Proof:
Case a > 1:
If a > 1, then for n large enough we have 1 < a < n. Taking n-th roots on both sides we
obtain
1 < a^(1/n) < n^(1/n)
But the right-hand side approaches 1 as n goes to infinity by our statement of the
root-n sequence. Then the sequence { a^(1/n) } must also approach 1, being squeezed
between terms approaching 1 on both sides (Pinching theorem).
Case 0 < a < 1:
If 0 < a < 1, then (1/a) > 1. Using the first part of this proof, the reciprocal of the
sequence { a^(1/n) } must converge to one, which implies the same for the original
sequence.
Incidentally, if a = 0 then we are dealing with the constant sequence, and the limit is of
course equal to 0.

Definition 3.5.5: Binomial Sequence

Binomial Sequence: If b > 1 then the sequence { n^k / b^n } converges to zero
for any positive integer k. In fact, this is still true if k is replaced by any real
number.
(Applet illustration: binomial sequence with k = 2 and b = 1.3)

Proof:
Note that both numerator and denominator tend to infinity. Our goal will be to show that
the denominator grows faster than the k-th power of n, thereby 'winning' the race to infinity
and forcing the whole expression to tend to zero.
The name of this sequence indicates that we might try to use the
binomial theorem for this. Indeed, define x such that
b=1+x
Since b > 1 we know that x > 0. Therefore, each term in the binomial theorem is positive,
and we can use the (k+1)-st term of that theorem to estimate:
b^n = (1 + x)^n ≥ [ n (n-1) (n-2) ... (n-k) / (k+1)! ] x^(k+1)
for any n ≥ k+1. Let n = 2k + 1, or equivalently, k = (n-1)/2. Then n - k = n - (n-1)/2 =
(n+1)/2 > n/2, so that each of the expressions n, n-1, n-2, ..., n - k is greater than n/2.
Hence, we have that
b^n ≥ (n/2)^(k+1) x^(k+1) / (k+1)!
But then, taking reciprocals, we have:
n^k / b^n ≤ [ 2^(k+1) (k+1)! / x^(k+1) ] (1/n)
But this inequality is true for all n > 2k + 1 as well, so that, with k fixed, we can take the
limit as n approaches infinity and the right-hand side will approach zero. Since the left-hand
side is always greater than or equal to zero, the limit of the binomial sequence must
also be zero.
If k is replaced by any real number, see the exponent sequence to
find out how n^k could be defined for rational and irrational values of k.
But perhaps some simple estimate will help if k is not an integer. Details
are left as an exercise.


Definition 3.5.6: Euler Sequence

Euler's Sequence: The sequence { (1 + 1/n)^n } converges to e ~
2.71828182845904523536028747135... (Euler's number). This sequence
serves to define e.
(Applet illustration: Euler's sequence)

Proof:
We will show that the sequence is monotone increasing and bounded above. If that is
true, then it must converge. Its limit, by definition, will be called e, Euler's number.
Euler's number e is irrational (in fact transcendental), and an
approximation of e to 30 decimals is e ~
2.71828182845904523536028747135.
First, we can use the binomial theorem to expand the expression

Similarly, we can replace n by n+1 in this expression to obtain

The first expression has (n+1) terms, the second expression has (n+2) terms. Each of the
first (n+1) terms of the second expression is greater than or equal to the corresponding
term of the first expression, because

But then the sequence is monotone increasing, because we have shown that

Next, we need to show that the sequence is bounded. Again, consider the binomial expansion of
(1 + 1/n)^n: it consists of the leading term 1 plus n further terms, and the k-th of these
further terms is at most 1/k!. If we define
Sn = 1/1! + 1/2! + 1/3! + ... + 1/n!
then
(1 + 1/n)^n ≤ 1 + Sn
Since 1/k! ≤ 1/2^(k-1) for every k ≥ 1, comparing with a geometric series gives
Sn ≤ 1 + 1/2 + 1/4 + ... + 1/2^(n-1) ≤ 2
so that, finally,
(1 + 1/n)^n ≤ 1 + Sn ≤ 3
for all n. Hence, Euler's sequence is bounded by 3 for all n.


Therefore, since the sequence is monotone increasing and bounded, it
must converge. We already know that the limit is less than or equal to 3.
In fact, the limit is approximately equal to
2.71828182845904523536028747135
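A brief numerical sanity check (illustrative only, and no substitute for the argument above): the terms (1 + 1/n)^n increase with n, stay below 3, and approach 2.71828...

# Illustrative check that (1 + 1/n)**n is increasing in n and bounded by 3.
prev = 0.0
for n in (1, 2, 5, 10, 100, 1000, 10**6):
    term = (1.0 + 1.0 / n) ** n
    assert prev < term < 3.0      # monotone increasing and bounded above
    prev = term
    print(n, term)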

Definition 3.5.7: Exponential Sequence

Exponential Sequence: The sequence { (1 + x/n)^n } converges to the exponential function
e^x = exp(x) for any real number x.

Proof:
We will use a simple substitution to prove this. Let
x/n = 1/u, or equivalently, n = u x
Then we have
(1 + x/n)^n = [ (1 + 1/u)^u ]^x
But the term inside the square brackets is Euler's sequence, which converges to Euler's
number e. Hence, the whole expression converges to e^x, as required.
In fact, we have used a property relating to functions to make this proof
work correctly. What is that property ?
If we did not want to use functions, we could first prove the statement
for x being an integer. Then we could expand it to rational numbers, and
then, approximating x by rational numbers, we could prove the final
result.

4.1. Series and Convergence


So far we have learned about sequences of numbers. Now we will investigate what may
happen when we add all terms of a sequence together to form what will be called an
infinite series.
The ancient Greeks already wondered about this, and actually did not have
the tools to quite understand it. This is illustrated by the old tale of
Achilles and the Tortoise.
Example 4.1.1: Zeno's Paradox (Achilles and the Tortoise)

Achilles, a fast runner, was asked to race against a tortoise. Achilles


can run 10 meters per second, the tortoise only 5 meters per second.
The track is 100 meters long. Achilles, being a fair sportsman, gives
the tortoise a 10 meter advantage. Who will win ?
Both start running, with the tortoise being 10 meters ahead.
After one second, Achilles has reached the spot where the tortoise started. The
tortoise, in turn, has run 5 meters.
Achilles runs again and reaches the spot the tortoise has just been. The tortoise, in
turn, has run 2.5 meters.
Achilles runs again to the spot where the tortoise has just been. The tortoise, in
turn, has run another 1.25 meters ahead.

This continues for a while, but whenever Achilles manages to reach the spot where the
tortoise has just been a split-second ago, the tortoise has again covered a little bit of
distance, and is still ahead of Achilles. Hence, as hard as he tries, Achilles only manages to
cut the remaining distance in half each time, implying, of course, that Achilles can actually
never reach the tortoise. So, the tortoise wins the race, which does not make Achilles very
happy at all.
Obviously, this is not true, but where is the mistake ?
Now let's return to mathematics. Before we can deal with any new
objects, we need to define them:
Definition 4.1.2: Series, Partial Sums, and Convergence
Let { a n } be an infinite sequence.
1. The formal expression
a 1 + a 2 + a 3 + ...
is called an (infinite) series.
2. For N = 1, 2, 3, ... the expression S N = a 1 + a 2 + ... + a N is called the
N-th partial sum of the series.
3. If lim S N exists and is finite, the series is said to converge.
4. If lim S N does not exist or is infinite, the series is said to
diverge.
Note that while a series is the result of an infinite addition - which we do not yet know how
to handle - each partial sum is the sum of finitely many terms only. Hence, the partial sums
form a sequence, and we already know how to deal with sequences.
Examples 4.1.3:

1/2 + 1/4 + 1/8 + 1/16 + ... is an infinite series.
The 3rd, 4th, and 5th partial sums, for example, are,
respectively: 0.875, 0.9375, and 0.96875.
Does this series converge or diverge ?
1 + 1/2 + 1/3 + 1/4 + 1/5 + ... is another infinite
series, called the harmonic series. The 3rd, 4th, and 5th partial
sums are, respectively: 1.833, 2.0833, and 2.2833.
Does this series converge or diverge ?
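To make the notion of a partial sum concrete, here is a small Python sketch (illustrative only) that reproduces the partial sums quoted above for both series.

# Partial sums S_N for the series 1/2 + 1/4 + 1/8 + ... and 1 + 1/2 + 1/3 + ...
def partial_sums(term, N):
    total, sums = 0.0, []
    for n in range(1, N + 1):
        total += term(n)
        sums.append(total)
    return sums

print(partial_sums(lambda n: 1.0 / 2 ** n, 5))   # ..., 0.875, 0.9375, 0.96875
print(partial_sums(lambda n: 1.0 / n, 5))        # ..., 1.8333, 2.0833, 2.2833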


Actually, if a series contains positive and negative terms, many of them may cancel out
when being added together. Hence, there are different modes of convergence: one mode
that applies to series with positive terms, and another mode that applies to series whose
terms may be negative and positive.
Definition 4.1.4: Absolute and Conditional Convergence
A series converges absolutely if the sum of the absolute
values of its terms converges.
A series converges conditionally if it converges, but
not absolutely.
Examples 4.1.5:

Does the series

converge absolutely, conditionally, or not at all ?

Does the series

converge absolutely, conditionally, or not at all ?

Does the series

converge absolutely, conditionally, or not at all ? (This series is called the
alternating harmonic series.)

Show that if a series converges absolutely, it
converges in the ordinary sense. The converse is not true.
Conditionally convergent series are rather difficult to work with. Several operations
that one would expect to be true do not hold for such series. Perhaps the most striking
example is the commutative law. Since a + b = b + a for any two real numbers a and b,
positive or negative, one would expect also that changing the order of summation in a
series should have little effect on the outcome. However:
Theorem 4.1.6: Absolute Convergence and Rearrangement

Let
be an absolutely convergent series. Then any
rearrangement of terms in that series results in a new series that is
also absolutely convergent to the same limit.

Let
be a conditionally convergent series. Then, for
any real number c there is a rearrangement of the
series such that the new resulting series will converge
to c.
Proof

It seems that conditionally convergent series contain a few surprises. As a concrete


example, we can rearrange the alternating harmonic series so that it converges to, say, 2.
Examples 4.1.7: Rearranging the Alternating Harmonic Series

Find a rearrangement of the alternating harmonic series


that comes within 0.001 of 2, i.e. show a concrete
rearrangement of that series that appears to converge to the
number 2.

Find a rearrangement of the alternating harmonic series

that diverges to positive infinity.
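The following Python sketch (illustrative; the target value 2 and the number of terms used are arbitrary choices) carries out the standard greedy rearrangement behind the second part of Theorem 4.1.6: add the next unused positive term 1, 1/3, 1/5, ... while the running sum is at or below the target, otherwise subtract the next unused negative term 1/2, 1/4, 1/6, ...

# Greedy rearrangement of the alternating harmonic series aimed at the target 2.
def rearranged_sum(target=2.0, n_terms=100000):
    total = 0.0
    pos, neg = 1, 2    # next unused odd (positive term) and even (negative term) denominators
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos
            pos += 2
        else:
            total -= 1.0 / neg
            neg += 2
    return total

print(rearranged_sum())    # close to 2 after many terms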


Absolutely convergent series, however, behave just as one would expect.
Theorem 4.1.8: Algebra on Series
Let

and
be two absolutely convergent series. Then:
1. The sum of the two series is again absolutely convergent. Its
limit is the sum of the limits of the two series.
2. The difference of the two series is again absolutely
convergent. Its limit is the difference of the limits of the two
series.
3. The product of the two series is again absolutely convergent.
Its limit is the product of the limits of the two series (Cauchy
Product).
Proof

We will give one more rather abstract result on series before stating and proving easy-to-use
convergence criteria. The one result that is of more theoretical importance is
Theorem 4.1.9: Cauchy Criterion for Series
The series converges if and only if for every ε > 0 there is a
positive integer N such that if m > n > N then
| a n+1 + a n+2 + ... + a m | < ε
Proof

4.2. Convergence Tests


In this section we will list many of the better known tests for convergence or divergence of
series, complete with proofs and examples. You should memorize each and every one of
those tests. The most useful tests are marked with a star (*). Each test is discussed in
detail below.
Divergence Test (*)
Comparison Test
Limit Comparison Test (*)
Cauchy Condensation Test
Geometric Series Test (*)
p Series Test (*)
Root Test
Ratio Test (*)
Abel's Convergence Test
Alternating Series Test (*)
Integral Test


4.3. Special Series

This is a brief listing of some of the most important series. Each of the convergence tests
given in the previous section contains even more examples.
Geometric Series
converges absolutely for |a| < 1 to 1 / (1 - a), diverges otherwise
p Series
converges for p > 1, diverges otherwise
Harmonic Series
diverges
Alternating Harmonic Series
converges conditionally
Euler's Series
converges to e (Euler's number)

Divergence Test

If the series converges, then the sequence of its terms { a n }
converges to zero.
Equivalently:
If the sequence of terms { a n } does not converge to zero, then the
series cannot converge.


Context

This test can never be used to show that a series converges. It can only be used to show
that a series diverges. Hence, the second version of this theorem is the more important,
applicable statement.
Examples 4.2.2:
Does the Divergence test apply to show that the series

converges or diverges ? How about convergence or


divergence of the series

Proof:
Suppose the series does converge. Then it must satisfy the Cauchy
criterion. In other words, given any ε > 0 there exists a positive integer
N such that whenever m > n > N then
| a n+1 + a n+2 + ... + a m | < ε
Let n > N and set m = n + 1. Then the sum above reduces to
| a n+1 | < ε
for all n > N. That, however, is saying that the sequence { a n }
converges to zero.

Comparison Test

Suppose that the series with terms a n converges absolutely, and { b n } is a sequence of
numbers for which | b n | ≤ | a n | for all n > N. Then the series with terms b n
converges absolutely as well.
If the series with terms a n diverges to positive infinity, and { b n } is
a sequence of numbers for which a n ≤ b n for all n > N, then
the series with terms b n also diverges.
Context

This is a useful test, but the limit comparison test, which is rather similar, is much easier
to use, and therefore more useful. However, this comparison test is very easy to memorize:
Assuming that everything is positive, for simplicity, say we know that:
| b n | ≤ | a n |
for all n. Then just sum both sides to see what you get formally:
| b 1 | + | b 2 | + | b 3 | + ... ≤ | a 1 | + | a 2 | + | a 3 | + ...
Then:
If the left side equals infinity, so must the right side.
If the right side is finite, the left side also converges.
Examples 4.2.4:

Does

Does

converge or diverge ?

converge or diverge ?

Proof:
The proof, at first glance, seems easy: Suppose that the series with terms a n converges
absolutely, and | b n | ≤ | a n | for all n. For simplicity, assume that all
terms in both sequences are positive. Let
S N = a 1 + a 2 + ... + a N and T N = b 1 + b 2 + ... + b N
Then we have that
T N ≤ S N
Since the right side is a convergent sequence, it is in particular bounded. Hence, the left
side is also a bounded sequence of partial sums. Therefore it converges.
This proof is wrong, because it does show that the sequence of partial
sums is bounded, but it is not necessarily true that a bounded series also
converges - as we know.
However, this proof, slightly modified, does work: Again, assume that all
terms in both sequences are positive. Since the series with terms a n converges, it satisfies
the Cauchy criterion:
| a n+1 + a n+2 + ... + a m | < ε
if m > n > N. Since | b n | ≤ | a n | we then have
| b n+1 + b n+2 + ... + b m | ≤ | a n+1 + a n+2 + ... + a m | < ε
if m > n > N. Hence the series with terms b n
satisfies the Cauchy criterion, and therefore converges.
The proof for divergence is similar.

Limit Comparison Test

Suppose we are given two infinite series, one with terms a n and one with terms b n. Suppose also that
r = lim | a n / b n |
exists and 0 < r < ∞. Then the series with terms a n
converges absolutely if and only if the series with terms b n
converges absolutely.

Context

This test is more useful than the "direct" comparison test because you
do not need to compare the terms of two series too carefully. It is
sufficient if the two terms behave similar "in the long run".
Examples 4.2.6:
Use the limit comparison test to decide whether the
following series converge or diverge. Note that you need to
know convergence of the p-series.

1. Does the series

converge or diverge ?

2. Does the series

converge or diverge ?

3. If r(n) = p(n) / q(n), where p and q are polynomials in
n, can you find general criteria for the series with terms r(n) to
converge or diverge ?
Proof:
Since r = lim | a n / b n | exists, and r is between 0 and infinity, there exist
constants c and C, 0 < c < C < ∞, such that for some positive integer N
we have:
c < | a n / b n | < C
if n > N.
Assume the series with terms a n converges absolutely. From above we have that
c | b n | < | a n |
for n > N. Hence, the series with terms b n
converges absolutely by the comparison test.
Assume the series with terms b n converges absolutely. From above we have that
| a n | < C | b n |
for n > N. But since the series with terms C b n
also converges absolutely, we can use again the
comparison test to see that the series with terms a n
must converge absolutely.


Cauchy Condensation Test


Suppose { a n } is a decreasing sequence of positive terms. Then the series
with terms a n converges if and only if the series with terms 2^k a_(2^k)
converges.
Context

This test is rather specialized, just as Abel's Convergence Test. The main
purpose of the Cauchy Condensation test is to prove that the p-series
converges if p > 1.
Example 4.2.8:

Use the Cauchy Condensation criterion to answer the


following questions:
1. In the sum
, list the terms a4, ak, and a2k. Then
show that this series (called the harmonic series)
diverges.

2. For which p does the series


converge or
diverge ? (In addition to the p-Series test , recall the
Geometric Series Test for this example)
Proof:
Assume that the series with terms a n converges: We have
2^(k-1) a_(2^k) = a_(2^k) + a_(2^k) + ... + a_(2^k) ≤ a_(2^(k-1)+1) + a_(2^(k-1)+2) + ... + a_(2^k)
because the sequence is decreasing. Hence, the partial sums of the series with terms
2^(k-1) a_(2^k) are bounded by the partial sums of the original series.
Now the partial sums on the right are bounded, by assumption. Hence the partial sums on
the left are also bounded. Since all terms are positive, the partial sums now form an
increasing sequence that is bounded above, hence it must converge. Multiplying the left
sequence by 2 will not change convergence, and hence the series with terms 2^k a_(2^k) converges.
Assume that the series with terms 2^k a_(2^k) converges: We have
a_(2^k) + a_(2^k + 1) + ... + a_(2^(k+1) - 1) ≤ 2^k a_(2^k)
because the sequence is decreasing. Therefore, similar to above, we get that the partial
sums of the original series are bounded by the partial sums of the condensed series.
But now the sequence of partial sums on the right is bounded, by assumption. Therefore,
the left side forms an increasing sequence that is bounded above, and therefore must
converge.

Theorem 4.2.9: Geometric Series

Let a be any real number. Then the series
1 + a + a^2 + a^3 + ...
is called the Geometric Series.
if | a | < 1 the geometric series converges
if | a | ≥ 1 the geometric series diverges
If the geometric series converges (i.e. if | a | < 1) then
1 + a + a^2 + a^3 + ... = 1 / (1 - a)
Context

Note that the index for the geometric series starts at 0. This is not
important for the convergence behavior, but it is important for the
resulting limit.
Examples 4.2.10:

Investigate the convergence behavior of the following


series:
1. What is the actual limit of the sum
2. What is the actual limit of the sum

?
?

3. Does the sum


converge ? (Here the limit
comparison test may be helpful).
Proof:
The proof consists of a nice trick. Consider the partial sum S N and
multiply it by a:
S N = 1 + a + a^2 + a^3 + ... + a^N
a S N = a + a^2 + a^3 + ... + a^(N+1)
Subtracting both equations yields (1 - a) S N = 1 - a^(N+1). Dividing both sides by (1 - a) and
taking the limit, the result follows from the previous result on the power sequence.
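A quick numerical check of the closed form just derived (illustrative; the particular values of a and N are arbitrary choices):

# Compare the directly computed partial sum of 1 + a + a^2 + ... + a^N with the
# closed form (1 - a**(N + 1)) / (1 - a); for |a| < 1 both tend to 1/(1 - a).
a, N = 0.9, 50
direct = sum(a ** n for n in range(N + 1))
closed = (1 - a ** (N + 1)) / (1 - a)
print(direct, closed, 1 / (1 - a))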

Theorem 4.2.11: p Series

The series with terms 1/n^p is called a p Series.
if p > 1 the p-series converges
if p ≤ 1 the p-series diverges
Context

Examples 4.2.12:

Does the series

converge or diverge ?

Does the series

converge or diverge ?

Does the series


converge or diverge ? (This is the
same series as in the example for the Limit Comparison test .
Are we running in a circle here ?)

Proof:
If p ≤ 0 then the sequence { 1/n^p } does not converge to zero. Hence, the
series diverges by the Divergence Test.
If p > 0 then consider the condensed series with terms
2^k (1/2^k)^p = (2^(1-p))^k
The right hand series is now a Geometric Series.
if 0 < p ≤ 1 then 2^(1-p) ≥ 1, hence the right-hand series diverges
if 1 < p then 2^(1-p) < 1, hence the right-hand series converges
Now the result follows from the Cauchy Condensation test.

Root Test

Consider the series with terms a n. Then:
if lim sup | a n |^(1/n) < 1 then the series converges absolutely
if lim sup | a n |^(1/n) > 1 then the series diverges
if lim sup | a n |^(1/n) = 1, this test gives no information
Context

Compare this test with the Ratio test. Although this root test is more difficult to apply, it is
better than the ratio test in the following sense: there are series for which the ratio test gives
no information, yet the root test will be conclusive. You can also use the root test to prove
the ratio test, but not vice versa.
It is important to remember that when the root test gives 1 as the
answer for the lim sup, then no conclusion at all is possible.
The use of the lim sup rather than the regular limit has the advantage
that we do not have to be concerned with the existence of a limit. On
the other hand, if the regular limit exists, it is the same as the lim sup,
so that we are not giving up anything using the lim sup.
Examples 4.2.14:

Does the root test apply to

? Does the series converge or diverge ?

Does the root test apply to

? Does the series converge or diverge ?

Does the series

converge or diverge ?

Proof:
Assume that lim sup | a n | 1/n < 1: Because of the properties of the limit
superior, we know that there exists > 0 and N > 1 such that
| a n | 1/n < 1 for n > N. Raising both sides to the n-th power we have:
| a n | < (1 - ) n
for n > N. But the terms on the right hand side form a convergent geometric series. Hence,
by the comparison test the series with terms on the left-hand side will converge absolutely.
The proof for the second case if left as an exercise.


Ratio Test

Consider the series with terms a n. Then
if lim sup | a n+1 / a n | < 1 then the series converges absolutely.
if there exists an N such that | a n+1 / a n | ≥ 1 for all n > N then the
series diverges.
if lim sup | a n+1 / a n | = 1, this test gives no information
Note that the second condition is true if lim | a n+1 / a n | exists and is strictly
bigger than 1.
Context

The ratio test is easier to use than the Root test. However, there are series for which the
ratio test gives no information, but the root test will. In that sense, the ratio test is weaker
than the root test. In addition, the ratio test can be proved using the root test, but not vice
versa.
Using the lim sup rather than the regular limit has the advantage that
we don't have to worry about existence of the limit. However, if the
regular limit exists, the lim sup yields the same number. Therefore, we
lose nothing by looking at the limit superior.
Example 4.2.16:

Does Euler's series

converge ?

Use the ratio test to test the series


for convergence.
Compare with the same example using the root test.
Are the following statements equivalent ? If not, which
statement is stronger ?
o There exists an N such that | a n+1 / a n | 1 for all n >
N
o lim sup | a n+1 / a n | 1
Give an alternative proof of the ratio test using the root test
(therefore showing that the root test is stronger than the ratio
test).
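Concerning the first question above: assuming that Euler's series refers to the series with terms 1/n! (compare Section 4.3), the ratios are | a n+1 / a n | = 1/(n+1), which tend to 0 < 1, so the ratio test gives absolute convergence. A small Python sketch (illustrative only):

from math import factorial

# Ratio test applied to the terms a_n = 1/n! (assumed form of Euler's series):
# |a_{n+1} / a_n| = 1/(n+1) -> 0 < 1, hence absolute convergence.
for n in (1, 2, 5, 10, 50):
    ratio = (1.0 / factorial(n + 1)) / (1.0 / factorial(n))
    print(n, ratio)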

Proof:
The proof is very similar to the proof of the root test:
Assume that lim sup | a n+1 / a n | < 1: because of the properties of the
lim sup, we know that there exists > 0 and N > 1 such that
| a n+1 / a n | < 1 for n > N. Multiplying both sides by a n we obtain

| a n+1 | < (1- ) | a n |


for n > N. Therefore, we also have
| a n+2 | < (1- ) | a n+1 | < (1 - ) 2 | a n |
for n > N. Repeating this procedure, we get, eventually, that
| a k | < (1 - ) k-N | a N |
for k > N. But the terms on the right hand side form a convergent geometric series, indexed
using the variable k, where N is some constant integer. Hence, by the comparison test the
series with terms on the left-hand side will converge absolutely.
The proof for the second case if left as an exercise.

Abel Convergence Test


Consider the series with terms a n b n. Suppose that
1. the partial sums S N = a 1 + a 2 + ... + a N form a bounded sequence
2. the sequence { b n } is decreasing
3. lim b n = 0
Then the series with terms a n b n converges.
Context

This test is rather sophisticated. Its main application is to prove the Alternating Series test,
but one can sometimes use it for other series as well, if the more obvious tests do not work.
Examples 4.2.18:

Does the sum

converge or diverge ?

Does the series

converge or diverge ?

Proof:
First, we need a lemma, called the Summation by Parts Lemma:
Lemma: Summation by Parts

Consider the two sequences { a n } and { b n }. Let S N = a 1 + a 2 + ... + a N
be the N-th partial sum. Then for any 0 ≤ m ≤ n we have:

Proof

Assuming this lemma is proved, we will use it as follows for Abel's Test:
First, let's assume that the partial sums S N are bounded by, say, K.
Next, since the sequence { b n } converges to zero, we can choose an
integer N such that | b n | < ε / 2K. Using the Summation by Parts lemma,
we then have:

But the sequence { b n } is decreasing to zero, so in particular all terms must be positive
and all absolute values inside the summation above are superfluous. But then the sum is a
telescoping sum. All that remains is the first and last term, and we have:

But by our choice of N this is less than ε if we choose n and m larger than the
predetermined N. This proves Abel's Test.
What remains to be done is the proof of the lemma.

Alternating Series Test

A series with terms of the form (-1)^n b n, where b n ≥ 0, is called an Alternating Series. If
the sequence { b n } is decreasing and converges to zero, then the sum
converges.
Context

This test does not prove absolute convergence. In fact, when checking for absolute
convergence the term 'alternating series' is meaningless.
It is important that the series truly alternates, that is, each positive term
is followed by a negative one, and vice versa. If that is not the case, the
alternating series test does not apply (while Abel's Test may still work).
Examples 4.2.20:

Does the sum

converge or diverge ?

Does the Alternating Series test apply to

Proof:
Let a n = (-1)^n. Then the formal sum a 1 + a 2 + a 3 + ... has bounded partial sums
(although the sum does not itself converge. Why not ?). Then, with the
given choice of { b n }, Abel's test applies directly, showing that the series
indeed converges.


Integral Test
Suppose that f(x) is a positive, continuous, decreasing function on the interval
[N, ∞). Let a n = f(n). Then the series with terms a n
converges if and only if the improper integral of f over [N, ∞)
converges.
Context

Note that this test is quite different from all the others. We have not yet formally
introduced the concept of an integral - or even of a continuous function - so we cannot
prove this test here. However, for completeness it is included as a test that is sometimes
useful to apply.


5.1. Open and Closed Sets

In the previous chapters we dealt with collections of points: sequences and series. Each
time, the collection of points was either finite or countable, and the most important property
of a point, in a sense, was its location in some coordinate or number system. Now we will deal
with points, or more precisely with sets of points, in a more abstract setting. The location
of points relative to each other will be more important than their absolute location, or size,
in a coordinate system. Therefore, concepts such as addition and multiplication will not
work anymore, and we will have to start, in a sense, at the beginning again.
All of the previous sections were, in effect, based on the natural
numbers. Those numbers were postulated as existing and all other
properties - including other number systems - were deduced from those
numbers and a few principles of logic.
We will now proceed in a similar way: first, we need to define the basic
objects we want to deal with, together with their most elementary
properties. Then we will develop a theory of those objects and call it
topology.
Definition 5.1.1: Open and Closed Sets
A set U R is called open, if for each x U there exists and > 0
such that the interval ( x - , x + ) is contained in U. Such an
interval is often called an - neighborhood of x, or simply a
neighborhood of x.
A set F is called closed if the complement of F, R \ F,
is open.
Examples 5.1.2:

Which of the following sets are open, closed, both, or


neither ?
1.
The intervals (-3, 3), [4, 7], (-4, 5], (0, ∞) and [0, ∞)
2.
The sets R (the whole real line) and 0 (the empty
set)
3.

The set {1, 1/2, 1/3, 1/4, 1/5, ...} and {1, 1/2, 1/3,
1/4, ...} ∪ {0}
It is fairly clear that when combining two open sets (either via union or intersection) the
resulting set is again open, and the same statement should be true for closed sets. What
about combining infinitely many sets ?
Proposition 5.1.3: Unions of Open Sets, Intersections of Closed Sets

Every union of open sets is again open.


Every intersection of closed sets is again closed.
Every finite intersection of open sets is again open
Every finite union of closed sets is again closed.
Proof

How complicated can an open or closed set really be ? The basic open
(or closed) sets in the real line are the intervals, and they are certainly

not complicated. As it will turn out, open sets in the real line are
generally easy, while closed sets can be very complicated.
The worst-case scenario for the open sets, in fact, will be given in the
next result, and we will concentrate on closed sets for much of the rest
of this chapter.
Proposition 5.1.4: Characterizing Open Sets
Let U ⊂ R be an arbitrary open set. Then there are countably many
pairwise disjoint open intervals Un such that U = ∪ Un
Proof

Next we need to establish some relationship between topology and our previous studies, in
particular sequences of real numbers. We shall need the following definitions:
Definition 5.1.5: Boundary, Accumulation, Interior, and Isolated
Points
Let S be an arbitrary set in the real line R.
1. A point b in R is called a boundary point of S if every neighborhood of b intersects
S and the complement of S. The set of all boundary points of S is called the boundary of
S, denoted by bd(S).
2. A point s in S is called an interior point of S if there exists a
neighborhood of s completely contained in S. The set of all
interior points of S is called the interior, denoted by int(S).
3. A point t in S is called an isolated point of S if there exists a
neighborhood U of t such that U ∩ S = {t}.
4. A point r in S is called an accumulation point if every
neighborhood of r contains infinitely many distinct points of
S.
Examples 5.1.6:

What is the boundary and the interior of (0, 4), [-1, 2], R,
and O ? Which points are isolated and accumulation points, if
any ?

Find the boundary, interior, isolated and accumulation


points, if any, for the set {1, 1/2, 1/3, ... } {0}
Here are some results that relate these various definitions with each other.
Proposition 5.1.7: Boundary, Accumulation, Interior, and Isolated
Points

Let S R. Then each point of S is either an interior point or a


boundary point.
Let S R. Then bd(S) = bd(R \ S).
A closed set contains all of its boundary points. An open set
contains none of its boundary points.
Every non-isolated boundary point of a set S R is an
accumulation point of S.
An accumulation point is never an isolated point.
Proof

Finally, here is a theorem that relates these topological concepts with our previous notion
of sequences.
Theorem 5.1.8: Closed Sets, Accumulation Points, and Sequences

A set S ⊂ R is closed if and only if every Cauchy sequence of
elements in S has a limit that is contained in S.
Every bounded, infinite subset of R has an accumulation
point.
If S is closed and bounded, and { a n } is any sequence in S,
then there exists a subsequence of { a n } that
converges to an element of S.

Proof


5.2. Compact and Perfect Sets


We have already seen that all open sets in the real line can be written as the countable
union of disjoint open intervals. We will now take a closer look at closed sets. The most
important type of closed sets in the real line are called compact sets:
Definition 5.2.1: Compact Sets
A set S of real numbers is called compact if every sequence in S has a
subsequence that converges to an element again contained in S.
Examples 5.2.2:

Is the interval [0,1] compact ? How about [0, 1) ?


Is the set {1, 2, 3} compact ? How about the set N of
natural numbers ?
Is the set {1, 1/2, 1/3, 1/4, ...} compact ?

Is the set {1, 1/2, 1/3, 1/4, ...} ∪ {0} compact ?


It is not easy to see what compact sets really look like, based on this definition. However,
the following result gives a nice characterization of them, and lets you answer the above
questions easily.
Proposition 5.2.3: Compact means Closed and Bounded

A set S of real numbers is compact if and only if it is closed and


bounded.
Proof

The above definition of compact sets using sequences cannot be used in more abstract
situations. We would also like a characterization of compact sets based entirely on open
sets. We need some definitions first.
Definition 5.2.4: Open Cover
Let S be a set of real numbers. An open cover of S is a collection C of
open sets whose union contains S. The collection C of open sets is said to
cover the set S.
A subcollection of sets from the collection C that still covers
the set S is called a subcovering of S.
Examples 5.2.5:

Let S = [0, 1], and C = { (-1/2, 1/2), (1/3, 2/3), (1/2, 3/2)}.
Is C an open cover for S ?
Let S = [0, 1]. For a fixed ε > 0 and each x in S define
U x = { t in R : | t - x | < ε }
Is the collection of all U x, x in S, an open
cover for S ? How many sets of type U x are actually needed to
cover S ?

Let S = (0, 1). Define a collection C = { (1/j, 1), for all j >
0 }. Is C an open cover for S ? How many sets from the
collection C are actually needed to cover S ?
Here is the characterization of compact sets based only on open sets:
Theorem 5.2.6: Heine-Borel Theorem

A set S of real numbers is compact if and only if every open cover C


of S can be reduced to a finite subcovering.

Proof

Compact sets share many properties with finite sets. For example, if A and B are two non-empty
sets with B ⊂ A, then A ∩ B ≠ 0. That is, in fact, true for finitely many sets as well,
but fails to be true for infinitely many sets.
Examples 5.2.7:

Consider the collection of sets (0, 1/j) for all j > 0. What is
the intersection of all of these sets ?

Can you find infinitely many closed sets such that their
intersection is empty and such that each set is contained in its
predecessor ? That is, can you find sets Aj such that Aj+1 ⊂ Aj
and ∩ Aj = 0 ?
Compact sets, on the other hand, have the following nice property, which will be used in
some of the following chapters:
Proposition 5.2.8: Intersection of Nested Compact Sets

Suppose { Aj } is a collection of sets such that each Aj is non-empty,
compact, and Aj+1 ⊂ Aj. Then A = ∩ Aj is not empty.
Proof

Another interesting collection of closed sets are the perfect sets:


Definition 5.2.9: Perfect Set
A set S is perfect if it is closed and every point of S is an
accumulation point of S.
Example 5.2.10:

Find a perfect set. Find a closed set that is not perfect. Find
a compact set that is not perfect. Find an unbounded closed set
that is not perfect. Find a closed set that is neither compact
nor perfect.

Is the set {1, 1/2, 1/3, ...} perfect ? How about the set {1,
1/2, 1/3, ...} ∪ {0} ?
As an application of the above result, we will see that perfect sets are closed sets that
contain lots of points:
Proposition 5.2.11: Perfect sets are Uncountable

Every non-empty perfect set must be uncountable.


Proof

This can yield a quick, but rather sophisticated proof of the fact that the
interval [a, b] is uncountable: the interval [a, b] is a perfect set, hence,
it must be uncountable.
Another, rather peculiar example of a closed, compact, and perfect set
is the Cantor set.
Definition 5.2.12: Cantor Middle Third Set
Start with the unit interval
S0 = [0, 1]
Remove from that set the middle third and set
S1 = S0 \ (1/3, 2/3)

Remove from that set the two middle thirds and set
S2 = S1 \ { (1/9, 2/9) ∪ (7/9, 8/9) }

Continue in this fashion, where


Sn+1 = Sn \ { middle thirds of subintervals of Sn }
Then the Cantor set C is defined as
C = ∩ Sn (the intersection of all the sets Sn)
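The construction is easy to carry out on a computer. The Python sketch below (illustrative only) represents each Sn as a list of closed intervals and removes the open middle third of each; it also prints the total length (2/3)^n, which shrinks to zero.

# Build S_0, S_1, S_2, ... as lists of closed intervals (left, right);
# each stage removes the open middle third of every interval.
def next_stage(intervals):
    out = []
    for left, right in intervals:
        third = (right - left) / 3.0
        out.append((left, left + third))      # keep the left closed third
        out.append((right - third, right))    # keep the right closed third
    return out

S = [(0.0, 1.0)]
for n in range(4):
    total_length = sum(r - l for l, r in S)
    print(n, total_length, S)                 # total length equals (2/3)**n
    S = next_stage(S)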
The Cantor set gives an indication of the complicated structure of closed sets in the real
line. It has the following properties:
Example 5.2.13: Properties of the Cantor Set

Show that the Cantor set is compact (i.e. closed and


bounded)
Show that the Cantor set is perfect (and hence
uncountable)
Show that the Cantor set has length zero, but contains
uncountably many points.

Show that the Cantor set does not contain any open set
Think about this set. It seems surprising that
a set of length zero can contain uncountably many points.
a perfect set does not have to contain an open set

Therefore, the Cantor set shows that closed subsets of the real line can be more
complicated than intuition might at first suggest. It is in fact often used to construct
difficult, counter-intuitive objects in analysis.


5.3. Connected and Disconnected Sets

In the last two sections we have classified the open sets, and looked at two classes of closed
sets: the compact and the perfect sets. In this section we will introduce two other classes of
sets: connected and disconnected sets.
Definition 5.3.1: Connected and Disconnected
An open set S is called disconnected if there are two open, non-empty sets U and V such that:
1. U ∩ V = 0
2. U ∪ V = S
A set S (not necessarily open) is called disconnected
if there are two open sets U and V such that
1. (U ∩ S) ≠ 0 and (V ∩ S) ≠ 0
2. (U ∩ S) ∩ (V ∩ S) = 0
3. (U ∩ S) ∪ (V ∩ S) = S
If S is not disconnected it is called connected.
Note that the definition of disconnected set is easier for an open set S. In principle,
however, the idea is the same: If a set S can be separated into two open, disjoint sets in
such a way that neither set is empty and both sets combined give the original set S, then S
is called disconnected.
To show that a set is disconnected is generally easier than showing
connectedness: if you can find a point that is not in the set S, then that
point can often be used to 'disconnect' your set into two new open sets
with the above properties.
Examples 5.3.2:

Is the set { x R : | x | < 1, x # 0 } connected or


disconnected ? What about the set { x R : | x | 1, x # 0 }
Is the set [-1, 1] connected or disconnected ?
Is the set of rational numbers connected or disconnected ?
How about the irrationals ?

Is the Cantor set connected or disconnected ?


In the real line connected sets have a particularly nice description:
Proposition 5.3.3: Connected Sets in R are Intervals

If S is any connected subset of R then S must be some interval.


Proof

Hence, as with open and closed sets, one of these two groups of sets
is easy:
open sets in R are the union of disjoint open intervals
connected sets in R are intervals
The other group is the complicated one:
closed sets are more difficult than open sets (e.g. Cantor set)
disconnected sets are more difficult than connected ones (e.g. Cantor set)
In fact, a set can be disconnected at every point.
Definition 5.3.4: Totally Disconnected

A set S is called totally disconnected if for each distinct x, y in S
there exist disjoint open sets U and V such that x in U, y in V, and (U
∩ S) ∪ (V ∩ S) = S.
Intuitively, totally disconnected means that a set can be broken up into two pieces at
each of its points, and the breakpoint is always 'in between' the original set.
Example 5.3.5:

The Cantor set is disconnected. Is it totally disconnected ?


Is the set {0, 1} connected or disconnected ? Is it totally
disconnected ?
Is the set {1, 1/2, 1/3, 1/4, ...} totally disconnected ? How
about the set {1, 1/2, 1/3, 1/4, ...} ∪ {0} ?
Find a totally disconnected subset of the interval [0, 1] of
length 0 (different from the Cantor set), and another one of
length 1.

6.1. Limits


We now want to combine some of the concepts that we have introduced before: functions,
sequences, and topology. In particular, if we have some function f(x) and a given sequence
{ an }, then we can apply the function to each element of the sequence, resulting in a new
sequence. What we would want is that if the original sequence converges to some number
L, then the new sequence { f( an )} should converge to f(L), and if the original sequence
diverges, the new one should diverge also. This seems not too much to ask for, but is quite
simple minded.
Example 6.1.1:

Consider the function f, where f(x) = 1 if x ≤ 0 and f(x) = 2


if x > 0.
1. The sequence { 1/n } converges to 0. What happens to
the sequence { f( 1/n ) } ?
2. The sequence { 3 + (-1)n } is divergent. What happens
to the sequence { f ( 3 + (-1)n ) } ?

3. The sequence { (-1)n / n } converges to zero. What


happens to the sequence { f ( (-1)n / n ) } ?
As the above easy example shows, things can be more complicated than anticipated.
Therefore, we have to attack the problem more systematically. First, we need to define
what we mean by 'limit of a function'.
Definition 6.1.2: Limit of a Function (sequences version)
A function f with domain D in R converges to a limit L as x
approaches a number c if for any sequence { xn } in D that converges to
c the sequence { f ( xn ) } converges to L.
We write
lim x→c f(x) = L

Examples 6.1.3:

Apply this definition in these cases:


1. Let f(x) = m x + b. Then does the limit of that function
exist at an arbitrary point x ?
2. Let g(x) = [x], where [x] denotes the greatest integer
less than or equal to x. Then does the limit of g exist at
an integer ? How about at numbers that are not
integers ?

3. In the above definition, does c have to be in the


domain D of the function ? Is c in the closure(D) ? Do
you know a name for c in terms of topology ?
The above definition works quite well to show that a function is not continuous, because
you only have to find one particular sequence whose images do not converge as a
sequence. It is not a good definition, in general, to prove convergence of a function,
because you will have to check every possible convergent sequence, and that is hard to do.
We would therefore like another definition of convergence or limit of a function.
Definition 6.1.4: Limit of a function (epsilon-delta Version)
A function f with domain D in R converges to a limit L as x
approaches a number c closure (D) if: given any > 0 there exists a
> 0 such that:

if x

D and | x - c | <

then | f(x) - L | <

Example 6.1.5:
Consider the function f with f(x) = 1 if x is rational and f(x)
= 0 if x is irrational. Does the limit of f(x) exist at an arbitrary
number x ?
Regardless of which of the two definitions might be considered easier to use in a particular
situation, the basic problem right now is that we have two different definitions for the same
concept. We therefore have to show that both definitions are actually equivalent to each
other.
Proposition 6.1.6: Equivalence of Definitions of Limits

If f is any function with domain D in R, and c in closure(D), then the
following are equivalent:
1. For any sequence { xn } in D that converges to c the sequence
{ f ( xn ) } converges to L
2. given any ε > 0 there exists a δ > 0 such that if x is in D and
| x - c | < δ then | f(x) - L | < ε
Proof

In other words, both definitions of the limit are equivalent, and we can use whichever
seems the easiest. Here are some basic properties of limits of functions.
Proposition 6.1.7: Properties for limits of Functions

If lim x→c f(x) exists, the limit is unique.
lim x→c [ f(x) + g(x) ] = lim x→c f(x) + lim x→c g(x), provided that
lim x→c f(x) and lim x→c g(x) exist.
lim x→c [ f(x) g(x) ] = lim x→c f(x) * lim x→c g(x), provided that
lim x→c f(x) and lim x→c g(x) exist.
lim x→c [ f(x) / g(x) ] = lim x→c f(x) / lim x→c g(x), provided that
lim x→c f(x) and lim x→c g(x) exist and lim x→c g(x) ≠ 0.
Proof

Sometimes a function may not have a limit using the above definitions, but when the
domain of the function is restricted, then a limit exists. This leads to the concept of one-sided limits.
Definition 6.1.8: One-Sided Limits of a Function
If f is a function with domain D and c in closure(D), then:
1. f has a left-hand limit L at c if for every ε > 0 there exists a
δ > 0 such that if x is in D and c - δ < x < c then | f(x) - L | < ε.
We write
lim x→c- f(x) = L.
2. f has a right-hand limit L at c if for every ε > 0 there exists a
δ > 0 such that if x is in D and c < x < c + δ then | f(x) - L | < ε.
We write
lim x→c+ f(x) = L.
This is the formal definition of x approaching c either only from the right side, or only
from the left side. These one-sided limits are related to regular limits in a straightforward
manner:

Proposition 6.1.9: Limits and One-Sided Limits
lim x→c f(x) = L if and only if lim x→c- f(x) = L and lim x→c+ f(x) = L
Proof
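A similar numerical sketch, again only an illustration and not a proof, estimates the two one-sided limits of the greatest integer function from Example 6.1.3 at an integer; the sampling points are an assumption of the sketch.

import math

g = math.floor                                    # g(x) = [x], greatest integer <= x
c = 1
left  = [g(c - 10 ** -k) for k in range(1, 6)]    # x -> 1 from the left
right = [g(c + 10 ** -k) for k in range(1, 6)]    # x -> 1 from the right
print(left)    # all 0: the left-hand limit appears to be 0
print(right)   # all 1: the right-hand limit appears to be 1
# The one-sided limits differ, so by Proposition 6.1.9 the two-sided limit at c = 1 does not exist.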

Now that we have some idea about limits of functions, we will move to the next question:
if some sequence converges to c, and the function converges to L as x approaches c, then
when is it true that f(c) = L ? This will be the contents of the next section, continuity.

6.2. Continuous Functions

If one looks up continuity in a thesaurus, one finds synonyms like perpetuity or lack of
interruption. Descartes said that a function is continuous if its graph can be drawn without
lifting the pencil from the paper.
Example 6.2.1:

Use the above imprecise meaning of continuity to decide


which of the two functions are continuous:
1. f(x) = 1 if x > 0 and f(x) = -1 if x < 0. Is this function
continuous ?

2. f(x) = 5x - 6. Is this function continuous?


However, if we want to deal with more complicated functions, we need mathematical
concepts that we can manipulate.
Definition 6.2.2: Continuity
A function f is continuous at a point c in its domain D if: given any
ε > 0 there exists a δ > 0 such that: if x is in D and | x - c | < δ then
| f(x) - f(c) | < ε.
A function is continuous in its domain D if it is
continuous at every point of its domain.
This, like many epsilon-delta definitions and arguments, is not easy to understand.
Continuous functions are precisely those functions that
preserve limits, as the next proposition indicates:
Proposition 6.2.3: Continuity preserves Limits
If f is continuous at a point c in the domain D, and { xn } is a sequence
of points in D converging to c, then lim n→∞ f(xn) = f(c).
If lim n→∞ f(xn) = f(c) for every sequence { xn } of points in
D converging to c, then f is continuous at the point c.
Proof

Again, as with limits, this proposition gives us two equivalent mathematical conditions for
a function to be continuous, and either one can be used in a particular situation.
Example 6.2.4:

Consider the following functions:

1. If f(x) = 5x - 6, prove that f is continuous in its
domain.
2. If f(x) = 1 if x is rational and f(x) = 0 if x is irrational,
prove that f is not continuous at any point of its
domain.
If f(x) = x if x is rational and f(x) = 0 if x is irrational,
prove that f is continuous at 0.
If f(x) is continuous in a domain D, and { xn } is a Cauchy
sequence in D, is the sequence { f ( xn ) } also Cauchy ?

Continuous functions can be added, multiplied, divided, and composed with one another
and yield again continuous functions.
Proposition 6.2.5: Algebra with Continuous Functions

The identity function f(x) = x is continuous in its domain.


If f(x) and g(x) are both continuous at x = c, so is f(x) + g(x) at
x = c.
If f(x) and g(x) are both continuous at x = c, so is f(x) * g(x) at
x = c.
If f(x) and g(x) are both continuous at x = c, and g(c) ≠ 0, then
f(x) / g(x) is continuous at x = c.
If f(x) is continuous at x = c, and g(x) is continuous at x = f(c),
then the composition g(f(x)) is continuous at x = c.
Proof

While this proposition seems not very important, it can be used to quickly prove the
following:
Examples 6.2.6:

Every polynomial is continuous in R, and every rational


function r(x) = p(x) / q(x) is continuous whenever q(x) ≠ 0.

The absolute value of any continuous function is


continuous.
Continuity is defined at a single point, and the epsilon and delta appearing in the definition
may be different from one point of continuity to another. There is, however, another
kind of continuity that works for all points of the domain at the same time.
Definition 6.2.7: Uniform Continuity
A function f with domain D is called uniformly continuous on the
domain D if for any ε > 0 there exists a δ > 0 such that: if s, t are in D and
| s - t | < δ then | f(s) - f(t) | < ε.
While this definition looks very similar to the original definition of
continuity, it is in fact not the same: a function can be continuous, but
not uniformly continuous. The difference is that the delta in the
definition of uniform continuity depends only on epsilon, whereas in the
definition of ordinary continuity delta depends on epsilon as well as on the
particular point c in question.
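The following Python sketch makes this dependence visible for the function f(x) = 1/x on (0, 1) from the next example; the formula for the largest usable delta is worked out for this particular f and is an assumption of the illustration, not a general recipe.

# For f(x) = 1/x and a fixed eps, solving 1/(c - delta) - 1/c = eps gives the largest
# delta that works at the point c:  delta(c) = eps * c^2 / (1 + eps * c).
eps = 0.5
for c in (0.5, 0.1, 0.01, 0.001):
    delta = eps * c ** 2 / (1 + eps * c)
    print(f"c = {c:7.3f}   largest delta = {delta:.2e}")
# The admissible delta shrinks to 0 as c -> 0, so no single delta works for every c in (0, 1):
# f is continuous there, but not uniformly continuous.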
Example 6.2.8:

The function f(x) = 1 / x is continuous on (0, 1). Is it


uniformly continuous there ?
The function f(x) = x^2 is continuous on [0, 1]. Is it
uniformly continuous there ?
The function f(x) = x^2 is continuous on [0, ∞). Is it
uniformly continuous there ?

If f(x) is uniformly continuous in R, and { xn } is a Cauchy


sequence, is the sequence { f ( xn ) } also Cauchy ?
The next theorem illustrates the connection between continuity and uniform continuity, and
gives an easy condition for a continuous function to be uniformly continuous.
Theorem 6.2.9: Continuity and Uniform Continuity

If f is uniformly continuous in a domain D, then f is continuous in D.


If f is continuous on a compact domain D, then f is
uniformly continuous in D.
Proof

Next, we will look at functions that are not continuous.



6.3. Discontinuous Functions


Definition 6.3.1: Discontinuous function


If a function fails to be continuous at a point c, then the function is
called discontinuous at c, and c is called a point of discontinuity, or
simply a discontinuity.
Points of discontinuity can be classified into three different categories: 'fake'
discontinuities, 'regular' discontinuities, and 'difficult' discontinuities.
Examples 6.3.2:

Consider the following four functions:

1.

2.
3.
4.
Which of these functions, without proof, has a 'fake'
discontinuity, a 'regular' discontinuity, or a 'difficult'
discontinuity ?
Of course, we need some mathematical description of the various types of discontinuities
that a function could have.
Definition 6.3.3: Classification of Discontinuities
Suppose f is a function with domain D and c in D is a point of
discontinuity of f.
1. If lim x→c f(x) exists, then c is called a removable discontinuity.
2. If lim x→c f(x) does not exist, but both lim x→c- f(x) and lim x→c+ f(x)
exist, then c is called a discontinuity of the first kind, or
jump discontinuity.
3. If either lim x→c- f(x) or lim x→c+ f(x) does not exist, then c is called
a discontinuity of the second kind, or essential discontinuity.
Examples 6.3.4:

Consider the functions from the previous examples. Then


1.
Prove that k(x) has a removable discontinuity at x =
3, and draw the graph of k(x).
2.
Prove that h(x) has a jump discontinuity at x = 0,
and draw the graph of h(x)
3.
Prove that f(x) has a discontinuity of second kind at

x=0
4.

What kind of discontinuity does the function g(x)


have at every point (with proof).
It is clear that any function is either continuous at any given point in its domain, or it has a
discontinuity of one of the above three kinds. It is also clear that removable discontinuities
are 'fake' ones, since one only has to define f(c) = lim x→c f(x) and the function will be
continuous at c. Of the other two types of discontinuities, the one of the second kind is hard.
Fortunately, however, discontinuities of the second kind are rare, as the following results will
indicate.
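As an informal illustration (not a substitute for the proofs asked for below), one can probe a suspected discontinuity numerically by estimating the one-sided limits; the two sample functions here are assumptions chosen for the sketch.

import math

def probe(f, c):
    """Sample f just left and just right of c to guess the one-sided limits."""
    left  = [f(c - 10 ** -k) for k in range(2, 7)]
    right = [f(c + 10 ** -k) for k in range(2, 7)]
    return left, right

jump = lambda x: 1.0 if x > 0 else -1.0          # a jump discontinuity at 0
wild = lambda x: math.sin(1.0 / x)               # a discontinuity of the second kind at 0

print(probe(jump, 0.0))   # left values hover at -1, right values at +1: both one-sided limits exist
print(probe(wild, 0.0))   # values oscillate without settling: a one-sided limit fails to exist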
Definition 6.3.5: Monotone Function
A function f is monotone increasing on (a, b) if f(x) ≤ f(y) whenever
x < y.
A function f is monotone decreasing on (a, b) if f(x) ≥ f(y) whenever x < y.
A function f is called monotone on (a, b) if it is either
monotone increasing or monotone decreasing.
Note that f is increasing if -f is decreasing, and vice versa. Equivalently, f is increasing if
f(x) - f(y) ≤ 0 whenever x < y
or, for positive functions, if
f(x) / f(y) ≤ 1 whenever x < y.
These inequalities are often easier to use in applications, since their left sides take a very
nice and simple form. Next, we will determine what type of discontinuities monotone
functions can possibly have. The proof of the next theorem, despite its surprising result, is
not too bad.
Theorem 6.3.6: Discontinuities of Monotone Functions
If f is a monotone function on an open interval (a, b), then any
discontinuity that f may have in this interval is of the first kind.
If f is a monotone function on an interval [a, b], then f
has at most countably many discontinuities.
Proof

This theorem also states that if a function is to have a discontinuity of the second kind
at a point x = c, then it cannot be monotone in any neighborhood of c.
Corollary 6.3.7: Discontinuities of Second Kind
If f has a discontinuity of the second kind at x = c, then f must switch
between increasing and decreasing in every neighborhood of c; in particular,
f is not monotone in any neighborhood of c.
Proof

In other words, f must look pretty bad if it has a discontinuity of the second kind.
Examples 6.3.8:

What kind of discontinuity does the function f(x) =


exp(1/x) have at x = 0 ?
What kind of discontinuity does the function f(x) = x
sin(1/x) have at x = 0 ?
What kind of discontinuity does the function f(x) =
cos(1/x) have at x = 0 ?


6.4. Topology and Continuity

While the definition of continuity suffices for functions on the real line, there are other,
more abstract spaces for which this definition will not work. In particular, our continuity
definition relies on the presence of an absolute value. There are spaces which do not have
such a distance function, yet we still might want to study continuous functions on those
abstract spaces. In this section we will investigate some topological properties of
continuity which will, in fact, apply equally well to more general settings. In addition, this
section will contain several important theoretical results on continuous functions on the real
line.
Proposition 6.4.1: Continuity and Topology
Let f be a function with domain D in R. Then the following
statements are equivalent:
1. f is continuous
2. If D is open, then the inverse image of every open set under f
is again open.
3. If D is open, then the inverse image of every open interval
under f is again open.
4. If D is closed, then the inverse image of every closed set
under f is again closed.
5. If D is closed, then the inverse image of every closed interval
under f is again closed.
6. The inverse image of every open set under f is the intersection
of D with an open set.
7. The inverse image of every closed set under f is the
intersection of D with a closed set.
Proof

This proposition can be used to prove that a function is continuous, and is especially nice if
the domain of the function is either open or closed. This is true in particular for functions
defined on all of R (which is both open and closed).
Examples 6.4.2:

Let f(x) = x^2. Show that f is continuous by proving


1. that the inverse image of an open interval is open.
2. that the inverse image of a closed interval is closed.
Let f(x) = 1 if x > 0 and f(x) = -1 if x ≤ 0. Show that f is not
continuous by
1. finding an open set whose inverse image is not open.

2. finding a closed set whose inverse image is not closed.


Now we know that the inverse images of open sets are open, and the inverse images of
closed sets are closed whenever f is continuous. What about the images of sets under
continuous functions ?
Examples 6.4.3:
Is it true that if f is continuous, then the image of an open
set is again open ? How about the image of a closed set ?
As the above examples show, the image of a closed set is not necessarily closed for
continuous functions. It is also easy to see that the image of a bounded set is not
necessarily bounded. However, the image of bounded and closed sets under continuous
functions is both bounded and closed again. That is the content of the next theorem.

Proposition 6.4.4: Images of Compact and Connected sets


If f is a continuous function on a domain D, then:
1. the image of every compact set is again compact.
2. the image of every connected set is again connected.
Proof

Since compact sets in the real line are characterized by being closed and bounded, we
should note that while it is not true that the image of a closed set is closed, one must look at an
unbounded closed set for a counterexample. If the set were closed and bounded, then its
image would be closed again, because the image of a compact set is, in particular, closed,
by the above theorem.
Examples 6.4.5:

If
then:
1. what is the image of [-2, 1] ?
2. find a closed set whose image is not closed
Find examples for the following situations:
1. A continuous function and a set whose image is not
connected.
2. A continuous function and a disconnected set whose
image is connected.
3. A function such that the image of a connected set is
disconnected.

4. Is it true that inverse images of connected sets under


continuous functions are again connected ?
This proposition has several important consequences for continuous functions.
Theorem 6.4.6: Max-Min theorem for Continuous Functions
If f is a continuous function on a compact set K, then f has an
absolute maximum and an absolute minimum on K.
In particular, f must be bounded on the compact set K.
Proof

6.5. Differentiable Functions


Having discussed continuity we will turn to another class of functions: differentiable
functions. This group of functions is one of the focal points of Calculus, and you should
already be familiar with many aspects of those functions.
In our setting these functions will play a rather minor role and we will only briefly review
the main topics of that theory. As usual, our focus will be on proofs rather than on
techniques of differentiation, as it was in Calculus.
First, we will start with the definition of derivative.
Definition 6.5.1: Derivative
Let f be a function with domain D in R, where D is an open set in R.
Then the derivative of f at the point c is defined as
f'(c) = lim x→c ( f(x) - f(c) ) / ( x - c )
If that limit exists, the function is called differentiable at c. If f is
differentiable at every point in D then f is called differentiable in D.
Other notations for the derivative of f are df/dx or d/dx f(x).
The usual geometric interpretation of the derivative at a point is as slope of the tangent line
to the graph of f(x) at the point (c, f(c)). If a function is differentiable, its graph cannot have any
'corners' or 'edges'. That often makes it easy to decide whether a function is differentiable if you can
see the graph of the function.
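The following short Python sketch illustrates the definition for the assumed example f(x) = x^2 at c = 3; the difference quotients approach f'(3) = 6.

f = lambda x: x ** 2
c = 3.0
for h in (0.1, 0.01, 0.001, 0.0001):
    q = (f(c + h) - f(c)) / h            # the difference quotient at c with step h
    print(f"h = {h:8.5f}   difference quotient = {q:.6f}")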
Examples 6.5.2:

Find the derivative of f(x) = x and of f(x) = 1 / x.

Find the derivative of

Find the derivative of


Another way to define a differentiable function is by saying that f(x) can be approximated
by a linear function, as in the following theorem:
Theorem 6.5.3: Derivative as Linear Approximation

Let f be a function defined on (a, b) and c any number in (a, b). Then
f is differentiable at c if and only if there exists a constant M such that
f(x) = f(c) + M ( x - c ) + r(x)
where the remainder function r(x) satisfies the condition
lim x→c r(x) / ( x - c ) = 0
Proof

This theorem provides a suitable method to generalize the concept of derivative to other
spaces: a function defined in some general space is called differentiable at a point c if it can
be approximated by a linear function at that point. On the real line the linear function M ( x
- c ) + f(c), of course, is the equation of the tangent line to f at the point c. In higher
dimensional real space this concept is known as the total derivative of a function.

Examples 6.5.4:

Why might our original definition of differentiability not


be suitable for functions of, say, two or three real variables ?

Use the characterization of differentiability via


approximation by linear functions to define the concept of
'derivative' for functions of n real variables.
In any case, differentiability is a new concept, so that we should first ask ourselves what its
relation to the previous concept of continuity is.
Theorem 6.5.5: Differentiability and Continuity

If f is differentiable at a point c, then f is continuous at that point c.


The converse is not true.
Proof

Examples 6.5.6:

The function f(x) = | x | is continuous everywhere. Is it also


differentiable everywhere ?
The function f(x) = x sin(1/x) is continuous everywhere
except at x = 0, where it has a removable discontinuity. If the
function is extended appropriately to be continuous at x = 0, is
it then differentiable at x = 0 ?

The function f(x) = x^2 sin(1/x) has a removable


discontinuity at x = 0. If the function is extended
appropriately to be continuous at x = 0, is it then differentiable
at x = 0 ?
As with continuous functions, differentiable functions can be added, multiplied, divided,
and composed with each other to yield again differentiable functions. In fact, there are easy
rules to compute the derivative of those new functions, all of which are well- known from
Calculus.
Theorem 6.5.7: Algebra with Derivatives

Addition Rule: If f and g are differentiable at x = c then f(x)
+ g(x) is differentiable at x = c, and
( f(x) + g(x) )' = f'(x) + g'(x)
Product Rule: If f and g are differentiable at x = c then f(x)
g(x) is differentiable at x = c, and
( f(x) g(x) )' = f'(x) g(x) + f(x) g'(x)
Quotient Rule: If f and g are differentiable at x = c, and g(c)
≠ 0, then f(x) / g(x) is differentiable at x = c, and
( f(x) / g(x) )' = ( f'(x) g(x) - f(x) g'(x) ) / g(x)^2
Chain Rule: If g is differentiable at x = c, and f is
differentiable at x = g(c), then f(g(x)) is differentiable at x = c,
and
( f(g(x)) )' = f'(g(x)) g'(x)
Proof

Next, we will state several important theorems for differentiable functions:


Theorem 6.5.8: Rolle's Theorem
If f is continuous on [a, b] and differentiable on (a, b), and f(a) = f(b)
= 0, then there exists a number x in (a, b) such that f'(x) = 0.
Proof

An extension of Rolle's theorem that removes the conditions on f(a) and f(b) is the Mean Value Theorem. It is actually a 'shifted' version of Rolle's theorem, as its proof illustrates.
A more general version of the Mean Value Theorem, which is sometimes useful, is also included.
Theorem 6.5.9: Mean Value Theorem
If f is continuous on [a, b] and differentiable on (a, b), then there
exists a number c in (a, b) such that
f'(c) = ( f(b) - f(a) ) / ( b - a )
If f and g are continuous on [a, b] and differentiable on
(a, b) and g'(x) ≠ 0 in (a, b), then there exists a number
c in (a, b) such that
f'(c) / g'(c) = ( f(b) - f(a) ) / ( g(b) - g(a) )
Proof
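As an illustration, here is a Python sketch that locates the number c promised by the Mean Value Theorem for the assumed example f(x) = x^3 on [0, 2]; the explicit derivative and the bisection search are choices made for the sketch.

a, b = 0.0, 2.0
f = lambda x: x ** 3
fprime = lambda x: 3 * x ** 2
slope = (f(b) - f(a)) / (b - a)          # average slope over [a, b], here 4

# Solve f'(c) = slope by bisection; f'(x) - slope changes sign on [0, 2].
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if fprime(mid) - slope < 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2
print(c, fprime(c), slope)               # c is about 1.1547 and f'(c) matches the average slope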

Examples 6.5.10:

Does Rolle's theorem apply to


defined on (-3, 3) ? If so, find the number guaranteed by the theorem to
exist.
Prove that if f is differentiable on R and | f'(x) | ≤ M for all
x, then | f(x) - f(y) | ≤ M | x - y | for all numbers x, y. Functions
that satisfy such an inequality are called Lipschitz functions.
Use the Mean Value theorem to show that

Rolle's theorem and the Mean Value theorem allow us to develop the familiar test for local
extrema of a function, as well as increasing and decreasing functions. Recall the definition
of local extremum:
Definition 6.5.11: Local Extremum
Let f be a function defined on a domain D, and c a point in D.
1. If there exists a neighborhood U of c with f(c) ≥ f(x) for all x
in U, then f(c) is called a local maximum for the function f
that occurs at x = c.
2. If there exists a neighborhood U of c with f(c) ≤ f(x) for all x
in U, then f(c) is called a local minimum for the function f
that occurs at x = c.
3. If f(x) has either a local minimum or a local maximum at x =
c, then f(c) is called a local extremum of the function f.
You can find possible local extrema by applying the following theorem:
Theorem 6.5.12: Local Extrema and Monotonicity

If f is differentiable on (a, b), and f has a local extremum at x =
c, then f'(c) = 0.
If f'(x) > 0 on (a, b) then f is increasing on (a, b).
If f'(x) < 0 on (a, b) then f is decreasing on (a, b).
Proof

This theorem suggests the following table in order to find local minima and maxima:
Suppose you have found a point c such that f'(c) either does not exist or f'(c) = 0. For each
c (called a critical point of f) we may have one of these four situations:
                 Loc. Max             Loc. Min             No Extremum          No Extremum
interval         (a, c)   (c, b)      (a, c)   (c, b)      (a, c)   (c, b)      (a, c)   (c, b)
sign of f'(x)      +        -           -        +           +        +           -        -
dir. of f(x)      up      down        down      up          up       up         down     down
The results of this table can be summarized in the following:


Corollary 6.5.13: Finding Local Extrema
Suppose f is differentiable on (a, b). Then:
1. If f'(c) = 0 and f'(x) > 0 on (a, c) and f'(x) < 0 on (c, b), then
f(c) is a local maximum.
2. If f'(c) = 0 and f'(x) < 0 on (a, c) and f'(x) > 0 on (c, b), then
f(c) is a local minimum.
Proof
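The following Python sketch applies this first-derivative test to the first function of Examples 6.5.14 below; the derivative f'(x) = 3x^2 - 4x and its critical points are computed by hand and supplied as assumptions of the sketch.

fprime = lambda x: 3 * x ** 2 - 4 * x     # derivative of f(x) = x^3 - 2x^2, i.e. x(3x - 4)
critical_points = [0.0, 4.0 / 3.0]        # the solutions of f'(c) = 0

for c in critical_points:
    left_positive  = fprime(c - 1e-3) > 0     # sign of f' just left of c
    right_positive = fprime(c + 1e-3) > 0     # sign of f' just right of c
    if left_positive and not right_positive:
        kind = "local maximum"
    elif (not left_positive) and right_positive:
        kind = "local minimum"
    else:
        kind = "no extremum"
    print(f"c = {c:.4f}: {kind}")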

These results above are the cornerstones of Calculus 1 in most colleges. As a review, you
may enjoy the following examples:
Examples 6.5.14:

If f(x) = x^3 - 2x^2, then find all local extrema.

If f(x) = | 1 - x^2 |, then find all relative extrema.


One of the nice applications of derivatives is that they give an easy short-cut rule to finding
limits, when those limits are difficult to obtain otherwise.
Theorem 6.5.15: l'Hospital's Rules

If f and g are differentiable in a neighborhood of x = c, and f(c) = g(c)
= 0, then
lim x→c f(x) / g(x) = lim x→c f'(x) / g'(x), provided the limit on the right
exists.
The same result holds for one-sided limits.
If f and g are differentiable and lim x→∞ f(x) = lim x→∞ g(x) = ∞, then
lim x→∞ f(x) / g(x) = lim x→∞ f'(x) / g'(x), provided the last limit exists.

Proof

There are other situations where l'Hospital's rule may apply, but often
expressions can be rewritten so that one of these two cases will apply.
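As a numerical illustration (not a proof), here is a Python sketch for the assumed example f(x) = 1 - cos(x) and g(x) = x^2 at c = 0, where f(c) = g(c) = 0; both quotients approach 1/2.

import math

f,  g  = lambda x: 1 - math.cos(x), lambda x: x ** 2
fp, gp = lambda x: math.sin(x),     lambda x: 2 * x       # the derivatives

for x in (0.5, 0.1, 0.01, 0.001):
    print(f"x = {x:6.3f}   f/g = {f(x)/g(x):.6f}   f'/g' = {fp(x)/gp(x):.6f}")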
Examples 6.5.16:

Please find the following limits, using, if necessary,


l'Hospital's rules:
1.
2.

3.

6.6. A Function Primer


Here we want to list some functions that illustrate more or less subtle points for continuous
and differentiable functions. These functions are all difficult, in one sense or another, but
should definitely be part of the repertoire of any math student with an interest in analysis.

Examples of Continuous and Differentiable Functions


A function that is not continuous at any point in R
A function that is continuous at the irrational numbers and discontinuous at the
rational numbers.
A function that is differentiable, but the derivative is not continuous.
A function that is n-times differentiable, but not (n+1)-times differentiable
A function that is not zero, infinitely often differentiable, but the n-th derivative
at zero is always zero.
A function that is continuous everywhere and nowhere differentiable in R.
A continuous, non-constant, differentiable function whose derivative is zero
everywhere except on a set of length zero.

Dirichlet Function

The Dirichlet Function is not continuous at any point on the real line.

Proof:
Left as Exercise. Show that g can not be continuous at a rational number by considering
sequences of irrational numbers. Then show that g can not be continuous at an irrational
number by considering sequences of rational numbers.

Countable Discontinuities

The function g is continuous precisely at the irrational numbers, and


discontinuous at all rational numbers.
Incidentally, it is impossible to have a function that is continuous only at the rationals,
which will be proved in the section on Metric spaces and Baire categories.

Proof:

The actual proof is left as an exercise. However, you may want to make use of the
following fact:

Lemma
If r = p / q is a rational number (in lowest terms) define a function with
domain Q via f(r) = 1 / q. Then the limit of f(r) as r approaches any real
number is zero.

Proof of Lemma:
Take any sequence { rn } of rational numbers converging to a fixed number a (which could
be rational or irrational). Since the sequence converges, it is bounded. For simplicity
assume that all rational numbers rn are in the interval [0, K] for some integer K. Now take
any ε > 0 and pick an integer M such that 1 / M < ε. Because each rational number is the
quotient of two positive integers, we have:
at most K of the rational numbers rn in the interval [0, K] can have denominator
equal to 1
at most 2 K of the rational numbers rn in the interval [0, K] can have denominator
equal to 2
...
at most M * K of the rational numbers rn in the interval [0, K] can have
denominator equal to M
In total, at most finitely many of the rn can have a denominator less than or equal to M.
That means, however, that there exists an integer N such that the denominator of rn is bigger
than M for all n > N. But then
| f(rn) | < 1 / M < ε for all n > N
Since ε > 0 was arbitrary, that means that the limit of f(rn) must be zero, as needed. Our
assumption that the numbers rn were all positive can easily be dropped, and a similar proof
would work again.
Using this lemma it should not be too hard to prove the original assertion.
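Here is a small Python sketch of the lemma; the particular target number a = sqrt(2)/2 and the decimal convergents used as the sequence of rationals are assumptions chosen for the illustration.

import math
from fractions import Fraction

a = math.sqrt(2) / 2
for n in range(1, 8):
    r = Fraction(round(a * 10 ** n), 10 ** n)     # Fraction reduces p/q to lowest terms
    # f(r) = 1 / (denominator of r in lowest terms); the denominators grow, so f(r) -> 0
    print(f"r = {str(r):>14}   f(r) = 1/{r.denominator} = {1 / r.denominator:.2e}")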

C(1) Function

The function g is continuous on R, differentiable on R, but the derivative is


not continuous.
Experiment with different powers of x in the definition of the function. Is the function still
differentiable at zero if n = 1 or n = 0 ? If yes, is the derivative continuous ? How about if
n > 2 ?

Proof:
We have studied this function before. The details are left as an exercise, but here is the
idea:
g is differentiable for all non-zero x (apply some theorems)
find the derivative for non-zero x by using chain and product rules
find the derivative of g at zero by looking at the limit of the difference quotient at
zero
show that the derivative you have just found is not continuous at zero by
considering left and right limits as x approaches zero

C(n) Function
The function g is n-times differentiable, the n-th derivative is continuous,
and g is not (n+1)-times differentiable.
The graphs of this root function for n = 0, 1, and 2 look increasingly smooth at 0: the
function must get smoother and smoother at 0 if it is to be more and
more times differentiable.
Proof:
This proof is left as an exercise. As a hint, start with n = 1. The example, if correct, says
that for n = 1 the function g is once differentiable, the derivative is continuous, but not
differentiable again. Why ? Then try n = 2. Then, maybe, try an induction argument to
prove the general case.

C(inf) Function

The function g is non-zero, infinitely often differentiable, and any derivative

of g at x = 0 equals zero.

Notice how smooth the function is at
the origin: its graph is hardly distinguishable from the
x-axis near zero.

Can a polynomial have that property ? In other words, can you find a polynomial of any
(fixed) degree such that it is not identically zero, yet all derivatives at x = 0 are zero ?

Proof:
This function is special, as we will see when dealing with power series representations: if
one wants to find a power series representation, one could apply Taylor's formula to find
the coefficients of the power series, say, centered at zero. That requires that the function
under consideration has to be infinitely often differentiable. This function is. Taylor's
formula involves derivatives of the function at the origin. In this case, they are all zero.
Hence, the power series associated with this function would be identically equal to zero.
But then it does not represent the original function.
In other words, there are functions for which you can use Taylor's theorem to find a
convergent power series, but this power series is not equal to the original function.
In any case, we will first prove that this function is once differentiable, with g'(0) = 0.
Obviously, if x ≠ 0, then g'(x) exists by the chain rule, and we have

for x not zero


Next, we need to find the derivative at x = 0 by looking at the limit of the difference
quotient at x = 0

This limit looks hard, but we can make the substitution u = 1 / x^2 and use l'Hospital's rule:

Note that we do have to look at u approaching positive and negative infinity, but because
of the square term in the denominator we can deal with both cases in one line.
Therefore, g is differentiable at x = 0 and the derivative is zero. Next we need to show that
g'(x) is continuous. It is obviously continuous for all x but zero, so we only need to check
continuity at x = 0:

Again, we have used the above substitution, and we have applied l'Hospital's rule several
times in our head. But the result is that g is differentiable everywhere, g'(0) = 0, and g' is continuous at 0.
Now we could treat the case N = 2, N = 3, and so forth. But we need to prove this for all N,
so eventually we will have to employ an induction argument. Also, the computations of
higher derivatives will become more and more complicated, since the product rule will
introduce additional terms. Therefore, we have to look at the problem in a more abstract
way.
While the details are left as an exercise, you might want to make use of (and first prove)
the following steps:
Let p(x) be a polynomial in 1 / x. Then show the following:

the n-th derivative of g(x) for x not zero is of the form p(x) g(x)

the limit as x approaches zero of p(x) g(x) equals zero

the limit of the difference quotient at zero for functions of the form p(x) g(x) is zero

Combining these steps will prove the result.

Weierstrass Function

The function g is continuous in R, but not differentiable at any point in R.


Incidentally, there are many other functions of this type, and they are best
treated in a course on complex analysis.
The graph of the n-th partial sum of Weierstrass's 'monster' already looks extremely jagged.
Can you believe that this function is continuous,
but differentiable nowhere ?
There are many other, similarly constructed functions. Here is another example, which is the same as
the function above but without the absolute values:

Proof:
This function, incidentally, is the one which was used to generate the logo at the title
pages. To show that it is continuous will be easy - if we assume knowledge from the next
chapter. To show it is not differentiable will be somewhat more difficult.
We'll do it later, sorry.

Cantor Function
The Cantor function is a function that is continuous, differentiable,
increasing, non-constant, and the derivative is zero everywhere except at a
set with length zero.
This is the most difficult function in our repertoire and can be found, for example, in
Kolmogorov and Fomin.
Recall the definition of the Cantor set: Let
A(1,1) = [ 1/3, 2/3 ]
be the middle third of the interval [0, 1]. Let
A(2,1) = [ 1/9, 2/9 ],   A(2,2) = [ 7/9, 8/9 ]
be the middle thirds of the intervals remaining after deleting A(1,1) from [0, 1]. Let
A(3,1) = [ 1/27, 2/27 ],   A(3,2) = [ 7/27, 8/27 ],   A(3,3) = [ 19/27, 20/27 ],   A(3,4) = [ 25/27, 26/27 ]
be the middle thirds of the intervals remaining after deleting A(1,1), A(2,1), and A(2,2)
from [0, 1]. Continue in this fashion, so that at the n-th stage we have the intervals
A(n,1), ..., A(n,k), ..., A(n, 2^(n-1)).
Then the complement of the union of all these intervals is the Cantor set
without the endpoints. Now define the following function:
F(t) = (2k - 1) / 2^n   if t is in the interval A(n,k).
Then, for example, we have that
F(t) = 1/2   if 1/3 ≤ t ≤ 2/3

Then F(t) is defined everywhere in [0, 1] except at the Cantor set minus the end points 0,
1, 1/3, 2/3, 1/9, 2/9, 7/9, 8/9, ... If t is a number where F is not defined, then there exists an
increasing sequence { xn } of these endpoints converging to t, and a decreasing sequence
{ xn' } of these endpoints converging to t. Since F is defined at those endpoints xn and xn',
we define
F(t) = lim F(xn) = lim F(xn')

Now we have completely defined the Cantor function. It has the following properties:
F is defined everywhere in the interval [0, 1]
F is not constant
F is increasing
F is continuous in the interval [0, 1]
F is differentiable in the interval [0, 1]

F' is zero at every interior point of the intervals A(n,k)

In particular, F' is zero at points of total length 1 in the interval [0, 1], yet F is not
constant.

Proof
Some of these properties are obvious, and some require more thought. In particular, why
does the above limit of endpoints exist ? That is a crucial point, because we used this limit
to extend the function to the whole interval [0, 1]. For details and hints about the Cantor
Function, please consult Kolmogorov and Fomin, p 334 ff.

7.1. Riemann Integral


In a calculus class integration is introduced as 'finding the area under a curve'. While this
interpretation is certainly useful, we instead want to think of 'integration' as a more
sophisticated form of summation. Geometric considerations, in our situation, will not be so
fruitful, whereas the summation interpretation of integration will make many of its
properties easy to remember.
First, as usual, we need to define integration before we can discuss its
properties. We will start with defining the Riemann integral, and we will
move to the Riemann-Stieltjes and the Lebesgue integral later.
Definition 7.1.1: Partition of an Interval
A partition P of the closed interval [a, b] is a finite set of points P =
{ x0, x1, x2, ..., xn} such that
a = x0 < x1 < x2 < ... < xn-1 < xn = b
The maximum difference between any two consecutive points of the
partition is called the norm or mesh of the partition and denoted as |
P |, i.e.
| P | = max { xj - xj-1, j = 1 ... n }
A refinement of the partition P is another partition P' that contains
all the points from P and some additional points, again sorted by
order of magnitude.
Examples 7.1.2:

What is the norm of a partition of 10 equally spaced


subintervals in the interval [0, 2] ?
What is the norm of a partition of n equally spaced
subintervals in the interval [a, b] ?

Show that if P' is a refinement of P then | P' | ≤ | P |.


Using these partitions, we can define the following finite sum:
Definition 7.1.3: Riemann Sums

If P = { x0, x1, x2, ..., xn} is a partition of the closed interval [a, b] and
f is a function defined on that interval, then the n-th Riemann Sum
of f with respect to the partition P is defined as:

R(f, P) = Σ f(tj) (xj - xj-1)   (the sum taken over j = 1, ..., n)

where tj is an arbitrary number in the interval [xj-1, xj].

Note: If the function f is positive, a Riemann Sum


geometrically corresponds to a summation of areas of
rectangles with length xj - xj-1 and height f(tj).
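For experimentation, here is a Python sketch computing the left and right Riemann sums asked for in Examples 7.1.4 below (f(x) = x^2 on [0, 2] with n equal subintervals); both approach 8/3 as n grows.

def riemann_sum(f, a, b, n, side="left"):
    h = (b - a) / n
    xs = [a + j * h for j in range(n + 1)]                 # the partition points
    tags = xs[:-1] if side == "left" else xs[1:]           # left or right endpoints as the tj
    return sum(f(t) * h for t in tags)

f = lambda x: x ** 2
for n in (5, 50, 500):
    print(n, riemann_sum(f, 0, 2, n, "left"), riemann_sum(f, 0, 2, n, "right"))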

Examples 7.1.4:
Suppose f(x) = x^2 on [0, 2]. Find
1. the fifth Riemann sum for an equally spaced partition, taking
always the left endpoint of each subinterval
2. the fifth Riemann sum for an equally spaced partition, taking
always the right endpoint of each subinterval
3. the n-th Riemann sum for an equally spaced partition, taking
always the right endpoint of each subinterval.
Riemann sums have the practical disadvantage that we do not know which point to take
inside each subinterval. To remedy that one could agree to always take the left endpoint
(resulting in what is called the left Riemann sum) or always the right one (resulting in the
right Riemann sum). However, it will turn out to be more useful to single out two other
close cousins of Riemann sums:
Definition 7.1.5: Upper and Lower Sum
Let P = { x0, x1, x2, ..., xn} be a partition of the closed interval [a, b]
and f a bounded function defined on that interval. Then:
the upper sum of f with respect to the partition P is defined
as:

U(f, P) = Σ cj (xj - xj-1)   (the sum taken over j = 1, ..., n)

where cj is the supremum of f(x) in the interval [xj-1, xj].

the lower sum of f with respect to the partition P is defined as

L(f, P) = Σ dj (xj - xj-1)   (the sum taken over j = 1, ..., n)

where dj is the infimum of f(x) in the interval [xj-1, xj].


For example, for the partition P = { 0.5, 1, 1.5, 2 } of an interval [0.5, 2], the upper sum uses
the supremum of f on each of the subintervals [0.5, 1], [1, 1.5], and [1.5, 2], while the lower sum
uses the infimum of f on each of those subintervals.

Examples 7.1.6:

Suppose f(x) = x^2 - 1 for x in the interval [-1, 1]. Find:


1. The left and right sums where the interval [-1, 1] is
subdivided into 10 equally spaced subintervals.
2. The upper and lower sums where the interval [-1, 1] is
subdivided into 10 equally spaced subintervals.
3. The upper and lower sums where the interval [-1,1] is
subdivided into n equally spaced subintervals.
Why is, in general, an upper (or lower) sum not a special
case of a Riemann sum ? Find a condition for a function f so
that the upper and lower sums are actually special cases of
Riemann sums.
Find conditions for a function so that the upper sum can be
computed by always taking the left endpoint of each
subinterval of the partition, or conditions for always being
able to take the right endpoints.

Suppose f is the Dirichlet function, i.e. the function that is


equal to 1 for every rational number and 0 for every irrational
number. Find the upper and lower sums over the interval [0,
1] for an arbitrary partition.
These various sums are related via a basic inequality, and they are also related to a
refinement of the partition in the following theorem:
Proposition 7.1.7: Size of Riemann Sums

Suppose P = { x0, x1, x2, ..., xn} is a partition of the closed interval [a,
b], f a bounded function defined on that interval. Then we have:
The lower sum is increasing with respect to refinements of
partitions, i.e. L(f, P') ≥ L(f, P) for every refinement P' of the
partition P
The upper sum is decreasing with respect to refinements of
partitions, i.e. U(f, P') ≤ U(f, P) for every refinement P' of the
partition P
L(f, P) ≤ R(f, P) ≤ U(f, P) for every partition P
Proof

In other words, the lower sum is always less than or equal to the upper
sum, and the upper sum is decreasing with respect to a refinement of
the partition while the lower sum is increasing with respect to a
refinement of the partition. Hence, a natural question is: will the two
quantities ever coincide ?
Definition 7.1.8: The Riemann Integral
Suppose f is a bounded function defined on a closed, bounded interval
[a, b]. Define the upper and lower Riemann integrals, respectively, as
I*(f) = inf{ U(f, P): P a partition of [a, b] }     (the upper integral)
I*(f) = sup{ L(f, P): P a partition of [a, b] }     (the lower integral)
Then if the upper and lower integrals agree, the function f is called Riemann integrable and
the Riemann integral of f over the interval [a, b] is denoted by
∫a^b f(x) dx

Note that upper and lower sums depend on the particular partition chosen, while the upper
and lower integrals are independent of partitions. However, this definition is very difficult
for practical applications, since we need to find the sup and inf over any partition.
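The following Python sketch computes upper and lower sums for the assumed example f(x) = x^2 on [0, 1]; because f is increasing there, the suprema and infima on each subinterval occur at the endpoints, and U - L shrinks as the partition is refined, suggesting the integral 1/3.

f = lambda x: x ** 2
for n in (10, 100, 1000):
    h = 1.0 / n
    lower = sum(f(j * h) * h for j in range(n))            # infimum at the left endpoint
    upper = sum(f((j + 1) * h) * h for j in range(n))      # supremum at the right endpoint
    print(f"n = {n:5d}   L = {lower:.6f}   U = {upper:.6f}   U - L = {upper - lower:.6f}")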
Examples 7.1.9:

Show that the constant function f(x) = c is Riemann


integrable on any interval [a, b] and find the value of the
integral.
Is the function f(x) = x^2 Riemann integrable on the interval
[0,1] ? If so, find the value of the Riemann integral. Do the
same for the interval [-1, 1].

Is the Dirichlet function Riemann integrable on the interval


[0, 1] ?
The third example shows that not every function is Riemann integrable, and the second one
shows that we need an easier condition to determine integrability of a given function. The
next lemma provides such a condition for integrability.
Lemma 7.1.10: Riemann Lemma

Suppose f is a bounded function defined on the closed, bounded


interval [a, b]. Then f is Riemann integrable if and only if for every
ε > 0 there exists at least one partition P such that
| U(f, P) - L(f, P) | < ε
Proof

Examples 7.1.11:

Is the function f(x) = x^2 Riemann integrable on the interval


[0,1] ? If so, find the value of the Riemann integral. Do the
same for the interval [-1, 1] (since this is the same example as
before, using Riemann's Lemma will hopefully simplify the
solution).
Suppose f is Riemann integrable over an interval [-a, a]
and { Pn } is a sequence of partitions whose mesh converges to
zero. Show that for any Riemann sum we have

lim R(f, Pn) = ∫ f(x) dx

Suppose f is Riemann integrable over an interval [-a, a]


and f is an odd function, i.e. f(-x) = -f(x). Show that the
integral of f over [-a, a] is zero. What can you say if f is an
even function?
Now we can state some easy conditions that the Riemann integral satisfies. All of them are
easy to memorize if one thinks of the Riemann integral as a somewhat glorified
summation.
Proposition 7.1.12: Properties of the Riemann Integral

Suppose f and g are Riemann integrable functions defined on [a, b].


Then
1. ∫a^b ( c f(x) + d g(x) ) dx = c ∫a^b f(x) dx + d ∫a^b g(x) dx
2. If a < c < b then ∫a^b f(x) dx = ∫a^c f(x) dx + ∫c^b f(x) dx
3. | ∫a^b f(x) dx | ≤ ∫a^b | f(x) | dx
4. If g is another function defined on [a, b] such that g(x) < f(x)
on [a, b], then ∫a^b g(x) dx ≤ ∫a^b f(x) dx
5. If g is another Riemann integrable function on [a, b] then
f(x) g(x) is integrable on [a, b]
Proof

Examples 7.1.13:

Find an upper and lower estimate for ∫ x sin(x) dx over the
interval [0, 4].

Suppose f(x) = x^2 if x ≤ 1 and f(x) = 3 if x > 1. Find ∫ f(x)
dx over the interval [-1, 2].
If f is an integrable function defined on [a, b] which is
bounded by M on that interval, prove that
M (a - b) ≤ ∫a^b f(x) dx ≤ M (b - a)
Now we can illustrate the relation between Riemann integrable and continuous functions.
Theorem 7.1.14: Riemann Integrals of Continuous Functions
Every continuous function on a closed, bounded interval is Riemann
integrable. The converse is false.
Proof


7.2. Integration Techniques

This section provides integration techniques, i.e. methods for finding the actual value of an
integral. We have already found the most basic technique (the Integral Evaluation Shortcut
or First Fundamental Theorem of Calculus): to evaluate an integral over an interval [a, b],
find an antiderivative F of the integrand f and compute F(b) - F(a).
But unlike differentiation, where a couple of rules can be
used to find virtually every derivative, finding an antiderivative is
basically a guessing game.
Example 7.2.1: Standard Antiderivatives
Find the following antiderivatives:
(a) ∫ x^r dx      (b) ∫ 1/x dx      (c) ∫ e^x dx
(d) ∫ sin(x) dx   (e) ∫ cos(x) dx   (f) ∫ tan(x) dx
(g)               (h)               (i)
As is typical for finding the antiderivatives, the answers for the above examples can not be
deduced. They must be guessed, or learned. However, there are two standard mechanisms
that can be useful: substitution and integration by parts.
Theorem 7.2.2: Substitution Rule
If f is a continuous function defined on [a, b], and s is a continuously
differentiable function from [c, d] into [a, b], then
∫c^d f( s(x) ) s'(x) dx = ∫s(c)^s(d) f(u) du
Proof

This theorem, in words, says that if you can identify a composition of functions as well as
the derivative of one of the composed functions, there's a good chance you can find the
antiderivative and evaluate the corresponding integral.
Think about the substitution rule as a "change of variable": if a
particular expression in x makes an integrand difficult, change it to u.
Then compute the derivative of u, i.e. du/dx = u'(x) and "solve" it for du:
du = u'(x) dx. If you can use u'(x) dx to change the integrand into an
expression involving only u and no x, then the substitution worked (and
has hopefully simplified the integrand). Otherwise try a different
substitution or a different technique altogether.
Example 7.2.3: Applying the Substitution Rule

Here is a simple example: find ∫ (4x + 3)^2 dx

If F(x) is an antiderivative of f(x), find
∫ f(cx + d) dx
∫ x f(cx^2) dx
∫ f'(x) / f(x) dx
Compute the area of a circle with radius r.

Find ∫ tan(x) dx and ∫ cot(x) dx

Find
and
Another theorem that is commonly used is "integration by parts".
Theorem 7.2.4: Integration by Parts

Suppose f and g are two continuously differentiable functions. Let


G(x) = f(x) g(x). Then
∫a^b f(x) g'(x) dx = ( G(b) - G(a) ) - ∫a^b f'(x) g(x) dx
Proof

In order for integration by parts to be useful, three conditions should be satisfied:


The integrand must be a product of two expressions (one of which
could be 1)
You must know the antiderivative of one of the two expressions
The derivative of the other expression should become easier
Example 7.2.5: Applying Integration by Parts
Evaluate the following integrals:

∫ x e^x dx

∫ x^2 cos(x) dx

∫ ln(x) dx

∫ sin^5(x) dx
Integration by parts can be used to provide an effective means to compute the value of an
integral numerically. Of course, Riemann sums can be used to approximate an integral, but
convergence is usually slow. A much faster convergence scheme is based on the trapezoid
rule. To prove it, we need the Mean Value Theorem for Integration:
Theorem 7.2.6: Mean Value Theorem for Integration

If f and g are continuous functions defined on [a, b] so that g(x) ≥ 0,
then there exists a number c in [a, b] with
∫a^b f(x) g(x) dx = f(c) ∫a^b g(x) dx
Proof

Now we can state and prove the Trapezoid Rule.


Proposition 7.2.7: Trapezoid Rule
Let f be a twice continuously differentiable function defined on [a, b]
and set
K = sup{ | f''(x) | : x in [a, b] }
If h = (b - a) / n, where n is a positive integer, then
∫a^b f(x) dx = h ( f(a)/2 + f(a+h) + f(a+2h) + ... + f(b-h) + f(b)/2 ) + R(n)
where | R(n) | < K/12 (b - a) h^2.


Proof

The Trapezoid Rule is useful because the error R(n) depends on the square of h. If h is
small, h2 is a lot smaller so that the Trapezoid Rule provides a good approximation to the
numeric value of an integral (as long as f is twice continuously differentiable).
Example 7.2.8: Application of the Trapezoid Rule
Compare the numeric approximations to the integral

∫ sin(x) cos(x) dx
obtained by using (a) a left Riemann sum and (b) the Trapezoid Rule,
using a partition of size 5 and of size 100.
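Here is a Python sketch of that comparison; since the bounds of the integral in Example 7.2.8 were given graphically, the interval [0, pi/2] (with exact value 1/2) is assumed for the illustration.

import math

f = lambda x: math.sin(x) * math.cos(x)
a, b = 0.0, math.pi / 2          # assumed bounds; the exact integral on this interval is 1/2

def left_sum(n):
    h = (b - a) / n
    return sum(f(a + j * h) * h for j in range(n))

def trapezoid(n):
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + j * h) for j in range(1, n)) + f(b) / 2)

for n in (5, 100):
    print(f"n = {n:3d}   left sum = {left_sum(n):.6f}   trapezoid = {trapezoid(n):.6f}")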
Our final integration technique uses partial fraction decomposition of a polynomial to
simplify rational functions so that they can be integrated.
Theorem 7.2.9: Partial Fraction Decomposition

Suppose p(x) is a polynomial of degree n such that p(x) = q1(x) q2(x) ... qm(x),
where each qj is a polynomial of degree j that is irreducible over R. If
s(x) is another polynomial of degree less than n with no factors in
common with p(x), then the rational function s(x) / p(x) can be written
as a finite sum
s(x) / p(x) = r1(x) / q1(x) + r2(x) / q2(x) + ... + rm(x) / qm(x)
where each rj is a polynomial of degree less than j.


Proof

This theorem sounds complicated, so we should restate it in a more useful form:


p(x) is a polynomial that can be factored into polynomials qj

each qj can not be factored any further (with real coefficients)


s(x) is another polynomial whose factors, if any, do not cancel any of
the qj, and whose degree is less than the degree of p
Then s(x)/p(x) can be written as a finite sum of simple rational
functions whose denominators consist of the qj's and whose
numerators have degree at most j-1
finding the rj amounts to solving a system of linear equations, as the
examples will show

7.3. Measures


The Riemann integral is certainly useful (and complicated) enough, but it does have a few
limitations and oddities:
Examples 7.3.1: Oddities of the Riemann Integral

What happens when you change the value of a Riemann


integrable function at a single point?
Is it true that a function that is constant except at countably
many points is Riemann integrable?
What is the difference between Riemann integrable
functions and bounded continuous functions?
Can you take a Riemann integral over anything else but an
interval?

Could you define a Riemann integral of a function whose


domain is not R?
We therefore want to define another concept of integration that is more general than the
Riemann integral, yet retains the "good" properties of that integral.
One of the limitations of the Riemann integral is that it is based on the
concept of an "interval", or rather on the length of subintervals [xj-1, xj].
We therefore need to find a generalization of the "length" concept of a
set in the real line. That new "length" concept, which we will call
"measure", should satisfy two key conditions:
1. The new "measure" concept should be applicable to intervals, unions
of intervals, and to more general sets (such as a Cantor set). Ideally,
it should be defined for all sets.
2. The new "measure" concept should share as many properties as
possible with the standard length of an interval, such as:
o the 'measure' of a set should be non-negative
o the 'measure' of an interval should be the length of that
interval
o the 'measure' of countably many disjoint sets should be the
sum of the 'measures' of the individual sets

To define what will eventually be called Lebesgue Measure, we follow a two-stage


strategy:
Stage One:
We will define a concept extending length that is defined for all sets (to satisfy
condition 1 above)
Stage Two:
We will modify that concept so that it looks as close as possible to the standard
length concept (to satisfy condition 2 above)
The stage-one concept is called outer measure, defined as follows:
Definition 7.3.2: Outer Measure
If A is any subset of R, define the (Lebesgue) outer measure of A as:
m*(A) = inf { Σ l(An) }
where the infimum is taken over all countable collections of open
intervals An such that A is contained in the union of the An, and l(An) is the standard length of the
interval An.
Examples 7.3.3: Outer Measure of Intervals

Find the outer measure of the empty set O, and prove that
m*(A) ≤ m*(B) for all A ⊆ B.
Find the outer measure of a closed interval [a, b]
Find the outer measure of an open interval (a, b)
Find the outer measure of an infinite interval

Find the outer measure of the set A of all rational numbers


in [0, 1]. Also show that for any finite collection of intervals
covering A we have that the sum of their lengths is greater or
equal to 1.
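The first part of the last item (that the rationals in [0, 1] have outer measure 0) can be made concrete with a short Python sketch; the particular enumeration of the rationals used here is an arbitrary choice for the illustration.

from fractions import Fraction

def rationals_01(count):
    """List the first `count` rationals in [0, 1] (a simple, repetitive enumeration)."""
    out, q = [], 1
    while len(out) < count:
        for p in range(q + 1):
            out.append(Fraction(p, q))
            if len(out) == count:
                break
        q += 1
    return out

# Cover the n-th rational by an open interval of length eps / 2^(n+1);
# the total length of the cover is then less than eps, for any eps > 0.
eps = 0.01
covered = rationals_01(1000)
total_length = sum(eps / 2 ** (n + 1) for n in range(len(covered)))
print(f"{len(covered)} rationals covered, total length of intervals < {total_length:.6f} <= {eps}")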
Outer measure is defined for all sets in R and has some of the properties we wanted our
concept of measure to have.
Proposition 7.3.4: Properties of Outer Measure

Outer measure has the following properties:


1. Outer measure m* is a non-negative set function whose
domain is P(R), i.e. the power set of R.
2. The outer measure of an interval is its length.
3. Outer measure is countably subadditive, i.e. if { An } is a
countable collection of sets, then
m*( ∪ An ) ≤ Σ m*(An)
Proof

Since outer measure is defined for all sets in R, our first key condition above is satisfied.
But outer measure is not quite similar to a length, because it is only subadditive ( m*(A ∪
B) ≤ m*(A) + m*(B) ), not additive ( m*(A ∪ B) = m*(A) + m*(B) for disjoint sets A and B).
Therefore one of the requirements of the second key condition is not satisfied.
Examples 7.3.5: Properties of Outer Measure

Show that the outer measure of a single point is 0, and the


outer measure of a countable set is also 0.
The set [0, 1] is not countable.

Outer measure is subadditive, but not additive.


To make our outer measure additive, something has to give. We decide to restrict the
domain to gain countable additivity, which will result in our stage-two definition.
There are several ways to restrict an outer measure. A somewhat
intuitive way, perhaps, is to define an inner measure m* that is similar to
the outer measure but involves a sup over suitable sets. Then sets for
which m*(A) = m*(A) are called measurable and the restriction of the
outer (or inner) measure to the measurable sets is called a measure.
Using an outer and inner measure to define a measure does not have
many advantages, and as a definition it is difficult to deal with. We
prefer another approach that is due to Caratheodory to define
measurable sets.
Definition 7.3.6: Measurable Sets and Lebesgue Measure

A set E is (Lebesgue) measurable if for every set A we have that
m*(A) = m*(A ∩ E) + m*(A ∩ comp(E))
If E is measurable, the non-negative number m(E) = m*(E) is the
(Lebesgue) measure of the set E.
Since outer measure is subadditive, we have
m*(A) = m*( (A ∩ E) ∪ (A ∩ comp(E)) ) ≤ m*(A ∩ E) + m*(A ∩ comp(E))
Therefore, to show that a set E is measurable it is sufficient to prove that
m*(A) ≥ m*(A ∩ E) + m*(A ∩ comp(E))

The motivation for this definition is that it ensures that for two disjoint measurable sets E
and F we have that
m(E ∪ F) = m(E) + m(F)

as we will show later, i.e. measure is additive.


Examples 7.3.7: Measurable Sets

Show that the empty set, the set R, and the complement of
a measurable set are all measurable.
Show that every set with outer measure 0 is Lebesgue
measurable.
Show that the union of two measurable sets is measurable.
Show that the intersection of two measurable sets is
measurable.

Show that the interval (a, ∞) is measurable.


The next two results show that a measure has all of the properties of a length, but is no
longer defined for all sets.
Theorem 7.3.8: Properties of Lebesgue measure

1. All intervals are measurable and the measure of an interval is


its length
2. All open and closed sets are measurable
3. The union and intersection of a finite or countable number of
measurable sets is again measurable
4. If A is measurable and A is the union of a countable number of
measurable sets An, then m(A) ≤ Σ m(An)
5. If A is measurable and A is the union of a countable number of
disjoint measurable sets An, then m(A) = Σ m(An)
Proof

Examples 7.3.9: Properties of Measure

Show that for any two measurable sets A and B we have that
m(A - B) = m(A) - m(A ∩ B). What if B ⊆ A ?
What is the measure of the set Q of all rational numbers
and the set I of all irrational numbers inside [0, 1] ?
Find the measure of the Cantor middle-third set (if it
is measurable).
Outer measure is defined for all sets, and according to the above list of properties, most
"common" sets such as intervals, closed or open sets, unions and intersections of
measurable sets are all measurable. But we have to pay for the property that measure is
(countably) additive with the fact that not every set is measurable.
Proposition 7.3.10: Not all Sets are Measurable

There are sets that are not (Lebesgue) measurable, i.e. not every set is
(Lebesgue) measurable.
Proof

To summarize, we introduced a new concept called measure in two stages, each of which
had a typical "good news, bad news" property.
1. In the first stage we defined outer measure.
o good news: outer measure is defined for all sets.
o bad news: outer measure is not additive, i.e. it is not quite comparable to a
length.
2. In the second stage we defined measure by restricting outer measure to the
measurable sets.
o good news: measure is additive, i.e. it is a good generalization of length.
o bad news: measure is not defined for all sets
Before concluding this section, we want to prove one more result that will be
important for the next section:
Proposition 7.3.11: Monotone Sequences of Measurable Sets
If { An } is a sequence of measurable sets that is decreasing, i.e. Aj ⊇
Aj+1 for all j, and m(A1) is finite, then
lim m(Aj) = m( ∩ Aj )
If { An } is a sequence of measurable sets that is increasing in the
sense that Aj+1 ⊇ Aj for all j, then
lim m(Aj) = m( ∪ Aj )
Proof

Examples 7.3.11: Monotone sequences of measurable sets



For decreasing sets we had to assume that m(A1)


was finite. Show that without this assumption the statement in
the previous proposition is false.

Find the measure of the Cantor middle-fifth set, i.e.


the set obtained by using a Cantor-set construction but
removing the middle-fifth instead of the middle-third at each
stage.


7.4. Lebesgue Integral


We previously defined the Riemann integral roughly as follows:


subdivide the domain of the function (usually a closed, bounded
interval) into finitely many subintervals (the partition)
construct a simple function that has a constant value on each of the
subintervals of the partition (the Upper and Lower sums)
take the limit of these simple functions as you add more and more
points to the partition.
If the limit exists it is called the Riemann integral and the function is called Riemann
integrable. Now we will take, in a manner of speaking, the "opposite" approach:
subdivide the range of the function into finitely many pieces
construct a simple function by taking a function whose values are
those finitely many numbers
take the limit of these simple functions as you add more and more
points in the range of the original function
If the limit exists it is called the Lebesgue integral and the function is called Lebesgue
integrable. To define this new concept we use several steps:
1. we define the Lebesgue Integral for "simple functions"
2. we define the Lebesgue integral for bounded functions over sets of
finite measure
3. we extend the Lebesgue integral to positive functions (that are not
necessarily bounded)
4. we define the general Lebesgue integral
First, we need to clarify what we mean by "simple function".
Definition 7.4.1: Characteristic and Simple Function
For any set A the function
XA(x) = 1 if x is in A, and XA(x) = 0 otherwise,
is called the characteristic function of A. A finite linear combination
of characteristic functions
s(x) = Σ ai XEi(x)
is called a simple function if all sets Ei are measurable.


A function f defined on a measurable set A that takes no more than finitely many distinct
values a1, a2, ... , an can always be written as a simple function
f(x) = Σ an XAn(x)
where
An = { x in A: f(x) = an }

Therefore simple functions can be thought of as dividing the range of f, where the resulting
sets An may or may not be intervals.
Examples 7.4.2: Simple Functions

A step function is a function s(x) such that
s(x) = cj for xj-1 < x < xj
and the { xj } form a partition of [a, b]. Upper, Lower, and Riemann sums are examples of step functions. What is the difference, if any, between step functions and simple functions?
Are simple functions uniquely determined? In other words, if s1 and s2 are two simple functions with s1(x) = s2(x), do they have to have the same representation? If different representations are possible, which one is "the best"?
How does the Dirichlet function fit in with this terminology?
Is the function that is equal to 1 if x is part of the Cantor middle-third set and 0 otherwise a simple function?
Are sums, differences, and products of simple functions simple?
For simple functions we define the Lebesgue integral as follows:
Definition 7.4.3: Lebesgue Integral for Simple Function

If s(x) = Σ an XAn(x) is a simple function and m(An) is finite for all n, then the Lebesgue integral of s is defined as
∫ s(x) dx = Σ an m(An)
If E is a measurable set, we define
∫E s(x) dx = ∫ XE(x) s(x) dx
Example 7.4.4: Lebesgue Integral for Simple Functions

Find the Lebesgue integral of the constant function f(x) = c over the interval [a, b].
Find the Lebesgue integral of a step function, i.e. a function s such that s(x) = cj for xj-1 < x < xj and the { xj } form a partition of [a, b].
Find the Lebesgue integral of the Dirichlet function restricted to [0, 1] and of the characteristic function of the Cantor middle-third set.
Define two simple functions
s1(x) = 2 X[0, 2](x) + 4 X[1, 3](x)
s2(x) = 2 X[0, 1)(x) + 6 X[1, 2](x) + 4 X(2, 3](x)
Show that s1(x) = s2(x) and ∫ s1(x) dx = ∫ s2(x) dx.
We have seen before that the representation of a simple function is not unique. Show that the Lebesgue integral of a simple function is independent of its representation.

Just as step functions were used to define the Riemann integral of a bounded function f
over an interval [a, b], simple functions are used to define the Lebesgue integral of f over a
set of finite measure.
Definition 7.4.5: Lebesgue Integral for Bounded Function
Suppose f is a bounded function defined on a measurable set E with finite measure. Define the upper and lower Lebesgue integrals, respectively, as
I^*(f)L = inf{ ∫ s(x) dx : s is simple and s ≥ f }
I_*(f)L = sup{ ∫ s(x) dx : s is simple and s ≤ f }
If I^*(f)L = I_*(f)L the function f is called Lebesgue integrable over E and the Lebesgue integral of f over E is denoted by
∫E f(x) dx

Examples 7.4.6: Lebesgue Integral for Bounded Functions

Is the function f(x) = x Lebesgue integrable over [0, 1]? If


so, find the integral.
Is the function f(x) = x2 Lebesgue integrable over the
rational numbers inside [0, 2]? If so, find the integral.
Is the Dirichlet function restricted to [0, 1] Lebesgue
integrable? If so, find the integral.

Is every bounded function Lebesgue integrable?


Now a function f can be integrated (if it is integrable) using either the Riemann or the
Lebesgue integral. Fortunately, for many simple functions the two integrals agree and the
Lebesgue integral is indeed a generalization of the Riemann integral.
Theorem 7.4.7: Riemann implies Lebesgue Integrable

If f is a bounded function defined on [a, b] such that f is Riemann integrable, then f is Lebesgue integrable and
∫ f(x) dx = ∫[a,b] f(x) dx
(the Riemann integral over [a, b] on the left, the Lebesgue integral over [a, b] on the right).
Proof

For most practical applications this theorem is all that is needed.
7.5. Riemann versus Lebesgue

IRA

In progress ...
If f is L-integrable, so is |f|; the converse fails in general, since |f| can be integrable while f is not even measurable.
There is a function f whose improper Riemann integral exists even though the improper Riemann integral of |f| does not.
The Riemann integral is defined for bounded functions only; the Lebesgue integral handles bounded and unbounded functions.
The Riemann integral can be extended to improper Riemann integrals, but can not allow functions that are extended real valued.
An improper R-integral may exist without the function being L-integrable (see f(x) = sin(x) / x for x > 0; a rough numerical illustration follows below).
If f is L-integrable and the improper R-integral exists, then both agree.
Mention sequences of R-integrable functions vs. sequences of L-integrable functions.
Lebesgue's Theorem: a bounded function f on [a, b] is Riemann integrable if and only if the set of points where f is not continuous has measure zero.
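The sin(x)/x example can be illustrated numerically. The following rough Python sketch (not part of the original text) approximates the integrals of sin(x)/x and |sin(x)/x| over [0, b] with a crude midpoint rule: the first value settles near π/2 while the second keeps growing, which is the numerical face of the statement that the improper Riemann integral exists although the function is not Lebesgue integrable.

    import math

    def midpoint_sum(f, a, b, n=20000):
        # crude midpoint-rule approximation of the integral of f over [a, b]
        h = (b - a) / n
        return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

    f = lambda x: math.sin(x) / x
    for b in [10, 100, 1000]:
        signed = midpoint_sum(f, 1e-9, b)
        absolute = midpoint_sum(lambda x: abs(f(x)), 1e-9, b)
        print(b, round(signed, 4), round(absolute, 4))
    # The signed integrals approach pi/2 ~ 1.5708; the absolute integrals grow
    # roughly like log(b), so they do not converge.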
Examples 1.1.2(a):
If E = { x : x = 2n for n ∈ N }, O = { x : x = 2n - 1 for n ∈ N }, and I = { x ∈ R : x^2 = -2 }, then what, in your own words, do these sets represent?
The set E is the set of all even natural numbers 2, 4, 6, ....
The set O is the set of all odd natural numbers 1, 3, 5, ....
The set I consists of all real numbers whose square equals -2. Since there are no such real numbers, the set I is the empty set.
Proposition 1.1.3: Distributive Law for Sets


A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

Proof:
These relations can best be illustrated by means of a Venn diagram.
[Venn diagram illustrating A ∪ (B ∩ C)]
[Venn diagram for (A ∪ B) ∩ (A ∪ C)]
Obviously, the two resulting sets are the same, hence proving the first law. However, this is not a rigorous proof, and is therefore not acceptable. Here is a real proof of the first distributive law:
If x is in A union (B intersect C) then x is either in A or in (B and C). Therefore, we have to consider two cases:
If x is in A, then x is also in (A union B) as well as in (A union C). Therefore, x is in (A union B) intersect (A union C).
If x is in (B and C), then x is in (A union B) because x is in B, and x is also in (A union C), because x is in C. Hence, again x is in (A union B) intersect (A union C).
This proves that
o A ∪ (B ∩ C) ⊂ (A ∪ B) ∩ (A ∪ C)
To finish the proof, we have to prove the reverse inclusion. So, take x in (A union B) intersect (A union C). Then x is in (A or B) as well as in (A or C).
If x is in A, then x is also in A union (B intersect C).
If x is not in A, then x must be in B (since x is in A union B) and it must also be in C (since x is in A union C). Hence, x is in B intersect C, and therefore it is in A union (B intersect C). That shows that
o (A ∪ B) ∩ (A ∪ C) ⊂ A ∪ (B ∩ C)
Both inclusions together prove equality of the two sets.
The second distributive law can be proved the same way and is left as an exercise.
Examples 1.1.2(b):
If A = { x ∈ R : -4 < x < 3 } and B = { x ∈ R : -1 < x < 7 }, then find A ∪ B, A ∩ B, A \ B, and comp(A).
A ∪ B = { x ∈ R : -4 < x < 7 }
A ∩ B = { x ∈ R : -1 < x < 3 }
A \ B = { x ∈ R : -4 < x ≤ -1 }
comp(A) = { x ∈ R : -∞ < x ≤ -4 } ∪ { x ∈ R : 3 ≤ x < ∞ }
Examples 1.1.5(a):
Prove that when two even integers are multiplied, the result is an even
integer, and when two odd integers are multiplied, the result is an odd
integer.
To prove this we first need to know what exactly an even and odd integer is:
an integer x is even if x = 2n for some integer n
an integer x is odd if x = 2n + 1 for some integer n
Now that we have a precise definition, the actual proof is easy: Take x and y two even
numbers. Then
x = 2n for some integer n
y = 2m for some integer m
Multiplying these numbers together we get
xy = (2n)(2m) = 4 nm = 2 (2nm) = 2 k
where k = 2nm. Hence, xy is again even.
If x and y are two odd numbers, then
x = 2n + 1 for some integer n
y = 2m + 1 for some integer m
Multiplying these numbers together we get
xy = (2n+1)(2m+1) = 4nm + 2(n + m) + 1 = 2 (2nm + n + m) + 1 = 2k + 1
where k = 2nm + n + m. Hence, xy is again odd.
Examples 1.1.5(b):
Prove that if the square of a number is an even integer, then the original
number must also be an even integer. (Try a proof by contradiction).
To prove this we first need to know what exactly an even and odd integer is:
an integer x is even if x = 2n for some integer n
an integer x is odd if x = 2n + 1 for some integer n
Now that we have a precise definition, the actual proof is easy: Suppose x is a number such
that x2 is even. To start a proof by contradiction we will assume the opposite of what we
would like to prove: assume that x is odd (but x2 is still even). Then, because x is odd, we
can write it as
x = 2n + 1
But then the square of x is
x2 = (2n + 1)(2n + 1) = 4 n2 + 4n + 1 = 2 (2 n2 + 2n) + 1 = 2k + 1
with k = 2 n2 + 2n. Therefore x2 is odd. But that is contrary to our assumption that the
square of x is even. Hence, if the square of x is supposed to be even, x itself must be 'not
odd'. But 'not odd' means even. Therefore, the proof is finished.
Euclid's Theorem:
There is no largest prime number.
Proof
Suppose there was a largest prime number; call it N. Then there are only finitely many
prime numbers, because each has to be between 1 and N. Let's call those prime numbers a,
b, c, ..., N. Then consider this number:
M = a * b * c * ... * N + 1
Is this new number M a prime number? We could check for divisibility:
M is not divisible by a, because M / a = b * c * ... * N + 1 / a
M is not divisible by b, because M / b = a * c * ... * N + 1 / b
M is not divisible by c, because M / c = a * b * ... * N + 1 / c
.....
Hence, M is not divisible by a, b, c, ..., N. Since these are all possible prime numbers, M is not divisible by any prime number, and therefore not by any number other than 1 and itself. That means that M is also a prime number. But clearly M > N, which is impossible, because N
was supposed to be the largest possible prime number. Therefore, our assumption is wrong,
and thus there is no largest prime number.
Examples 1.2.3:
Let A = {1, 2, 3, 4}, B = {14, 7, 234}, C = {a, b, c}, and R = real numbers.
Define the following relations:
1. r relates A and B via: 1 ~ 234, 2 ~ 7, 3 ~ 14, 4 ~ 234, 2 ~ 234
2. f relates A and C via: {(1,c), (2,b), (3,a), (4,b)}
3. g relates A and C via: {(1,a), (2,a), (3,a)}
4. h relates R and itself via: {(x,sin(x))}
1. The relation r is not a function, because the element 2 from the set A is associated with two elements from B.
2. The relation f is a function, because every element from A is associated with exactly one element from the set C.
3. The relation g is not a function, because the element 4 from the domain A has no element associated with it.
4. The relation h is a function with domain R, because every element x in R has exactly one element sin(x) associated with it.
Example 1.2.5(a):
Let f(x) = 0 if x is rational and f(x) = 1 if x is irrational. This function is called Dirichlet's function. The range of f is R. Find the image of the domain of the Dirichlet function when:
1. the domain of f is Q
2. the domain of f is R
3. the domain of f is [0, 1] (the closed interval between 0 and 1)
1. When the domain is Q, we have that f(x) = 0 for any x, because x must be a rational
number. Hence, the image of the domain Q is the set consisting of the single
element {0}.
2. When the domain is R, we have that f(x) could be 0 or 1, because x could be
rational or irrational. f(x) can not be any other number. Hence, the image of the
domain R is the set consisting of the two elements {0, 1}.
3. The interval [0, 1] contains irrationals as well as rational numbers. Therefore, f(x)
could be equal to 0 or 1, and the image of [0, 1] under f is the set with the two
elements {0, 1}.
Example 1.2.5(b):
Let f(x) = 0 if x is rational and f(x) = 1 if x is irrational. This function is called Dirichlet's function. The domain and range of f is R.
What is the preimage of R? What is the preimage of [-1/2, 1/2]?
The preimage of R is the set of all elements such that f(x) is contained in R. Since R
includes the numbers 0 and 1, the preimage of R under f is everything in the domain, i.e.
R.
The preimage of [-1/2, 1/2] consists of all elements such that f(x) is contained in [-1/2, 1/2]. This set does not contain 1, so the preimage can not contain any irrational number. But it does contain 0, so the preimage of [-1/2, 1/2] is the set of rational numbers Q.
Examples 1.2.7(a):
If the graph of a function is known, how can you decide whether a function
is one-to-one (injective) or onto (surjective) ?
Injection
If a horizontal line intersects the graph of a function in more than one place, then there are two different points a and b for which f(a) = f(b), but a ≠ b. Then the function is not one-to-one.
[Graph: not a one-to-one function]
Surjection
If you place a light on the left and on the right hand side of the coordinate system, then the shadow of the graph on the y-axis is the image of the domain of the function. If that shadow covers the range of the function, then the function is onto.
[Graph: not an onto function (if the range is R)]
Note that every function can be modified to be onto by setting its range to be the image of its domain.
Examples 1.2.7(b):
Which of the following functions are one-to-one, onto, or bijections? The domain for all functions is R.
1. f(x) = 2x + 5
2. g(x) = arctan(x)
3. g(x) = sin(x)
4. h(x) = 2x^3 + 5x^2 - 7x + 6
1. Linear Function
This function is linear. The equation y = 2x + 5 has a unique solution x for every y, so that the function is one-to-one and onto, i.e. a bijection. In fact, every linear function with nonzero slope is a bijection.
f(x) = 2x + 5 is a bijection.
2. Inverse Functions
This function is the inverse function of the tangent function. As such, it must be one-to-one. It is onto if the range is taken to be (-π/2, π/2), but not if the range is R. In fact, all inverse functions are one-to-one.
g(x) = arctan(x) is injective, but not surjective onto R.
3. Periodic Function
Since this function is periodic, it can not be one-to-one. It is onto if the range is taken to be the interval [-1, 1], but not onto if the range is R. In fact, no periodic function is one-to-one.
g(x) = sin(x) is neither one-to-one nor onto R.
4. Odd Degree Polynomial
This is an odd-degree polynomial with positive leading coefficient. Hence, the limit as x approaches plus or minus infinity is plus or minus infinity, respectively. That means that the function is onto. Since most third-degree equations have three zeros, this function is probably not one-to-one. A look at the graph confirms this. In fact, every odd-degree polynomial is onto, while no even-degree polynomial is onto.
h(x) = 2x^3 + 5x^2 - 7x + 6 is onto, but not one-to-one.
Theorem 1.3.3: Equivalence Classes


Let r be an equivalence relation on a set A. Then A can be written as a union of disjoint sets Aj with the following properties:
1. If a, b are in A then a ~ b if and only if a and b are in the same set Aj.
2. The subsets Aj are non-empty and pairwise disjoint.
The sets Aj are called equivalence classes.
Proof:
We have to decide what the equivalence classes should be. Since by property (1) two elements a and b are supposed to be related if and only if they are in the same class, it seems natural to define the class of a as
{ b ∈ A : b ~ a }
To simplify notation a little, we denote this class by A(a). Now we need to check whether this is a good definition, and whether it satisfies the above properties.
Because of reflexivity of the equivalence relation, the class A(a) contains the element a and is therefore not empty.
Next, we will show that two classes that are different can not have any elements in common, i.e. they are disjoint. Recall that disjoint means that two sets either have an empty intersection or are the same set.
Take two elements a and a' and suppose that A(a) and A(a') have a non-empty intersection; say both classes contain the element c. Then
c ~ a, because c is in A(a)
a ~ c, by symmetry and the above line
c ~ a', because c is in A(a')
Therefore, by transitivity, a ~ a'.
Now take any element b in A(a). Then b ~ a and a ~ a', so that b ~ a' by transitivity, and hence b is contained in A(a'). Therefore A(a) is a subset of A(a'). The same argument works the other way around as well (by symmetry), showing that A(a') is a subset of A(a). Hence, A(a) = A(a'). This shows that two different classes must be disjoint.
Finally, we need to show that all classes A(a) make up the original set A. But that is clear: every element a in A belongs to the class A(a), so A is contained in the union of all classes; since each class A(a) is a subset of A, the union of all classes is also contained in A, and therefore equals A.
Example 1.3.4(a):
Consider the set Z of all integers. Define a relation r by saying that x and y
are related if their difference y - x is divisible by 2. Then
1. Check that this relation is an equivalence relation
2. Find the two equivalence classes, and name them appropriately.
3. How would you add these equivalence classes, if at all ?
1. Equivalence Relation
reflexive:
x - x is equal to zero, which is divisible by two. Hence, every element is related to
itself.
symmetry:
if x ~ y, then y - x is divisible by 2. But then - (y - x) = x - y is divisible by two.
Hence, y ~ x
transitivity:
if x ~ y then y - x = 2n for some integer n
if y ~ z then z - y = 2m for some integer m
But then z - x = (2m + y) - (y - 2n) = 2m + 2n = 2 (m + n), so that z - x is divisible by 2. In other words, x and z are related.
2. Equivalence Classes
Two elements are in the same equivalence class if and only if they are related. If x and y are in the same class, then y - x = 2n for some integer n.
If y was even, then y = 2m for some integer m, and x = 2m - 2n must also be even.
If y was odd, then y = 2m + 1 for some integer m, and x = 2m + 1 - 2n must also be odd.
Therefore, there are two equivalence classes, and they are appropriately labeled:
E = even numbers: E = [(2)] contains all even numbers
O = odd numbers: O = [(3)] contains all odd numbers.
3. Adding Equivalence Classes
Define [x] + [y] = [x + y]. We need to show that this is well-defined, i.e.
independent of the particular representative of the equivalence classes
of [x] and [y]. Take x ~ x' and y ~ y'.
Then x' - x = 2m and y' - y = 2n for some integers n and m.
Then (x' + y') - (x + y) = x' - x + y' - y = 2m + 2n = 2 (m + n)

That means that (x' + y') and (x + y) are related if x ~ x' and y ~ y'. But then
[x] + [y] = [x + y] = [x' + y'] = [x'] + [y']
Hence, addition does not depend on the particular representative from a class, so that
addition as defined above is indeed a well-defined operation.
A better method would be the following: Define two classes 0 and 1 by saying:
all numbers that leave remainder 0 when divided by 2 are in class 0
all numbers that leave remainder 1 when divided by 2 are in class 1
Then add equivalence classes by adding numbers modulo 2.
Example 1.3.4(b):
Consider the set Z of all integers. Define a relation r by saying that x and y
are related if their difference y - x is divisible by m. Then we have:
This relation is an equivalence relation (i.e. the three conditions are satisfied)
There are m equivalence classes:
1. all numbers that leave remainder 0 when divided by m are in class 0
2. all numbers that leave remainder 1 when divided by m are in class 1
3. all numbers that leave remainder 2 when divided by m are in class 2
4. ...
5. all numbers that leave remainder m-1 when divided by m are in class m-1
Addition can be defined by adding modulo m. That is, if we consider the equivalence classes obtained by dividing the differences by, say, 5, then we have, as an example:
o [(2)] + [(1)] = [(3)]
o [(2)] + [(4)] = [(1)]
o [(3)] + [(4)] = [(2)]
o etc...
The (important) details are left as an exercise.
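As a small illustration (not part of the original text), the following Python sketch implements class addition modulo m via representatives and checks the examples above for m = 5, including the fact that the result does not depend on which representatives are chosen.

    def cls(x, m):
        # the class of x is determined by the remainder of x upon division by m
        return x % m

    def add_classes(x, y, m):
        # add two classes by adding any representatives and reducing mod m
        return cls(x + y, m)

    m = 5
    print(add_classes(2, 1, m))   # class 3
    print(add_classes(2, 4, m))   # class 1
    print(add_classes(3, 4, m))   # class 2
    # well-definedness: different representatives of the same classes give the same answer
    print(add_classes(2 + 5 * 7, 4 - 5 * 3, m))   # still class 1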
Example 1.3.5:
Consider the set R x R \ {(0,0)} of all points in the plane minus the origin. Define a relation between two points (x, y) and (x', y') by saying that they are related if they are lying on the same straight line passing through the origin.
This relation is an equivalence relation, and the resulting space of equivalence classes is called projective space. By picking an appropriate member for each class one can find a graphical representation for this space.
As usual, we need to check to first verify that this is a relation, and then check the three
conditions that turn a relation into an equivalence relation: This clearly is a relation (most
things are) so that we will have to check:
reflexive:
every point is related to itself, since it lies on a line through the origin.
symmetry:
if (x, y) ~ (x', y'), then they are both on the same line through the origin. This is a
symmetric definition, so symmetry holds as well.
transitivity:
here we should use some math. Recall that two points (x, y) and (x', y') are on the
same line through the origin, if and only if there is a non-zero real number t such
that (x, y) = t (x', y').
take (x, y) ~ (x', y').
Then there exists t ≠ 0 with (x, y) = t (x', y').
take (x', y') ~ (x'', y'')
Then there exists s ≠ 0 with (x', y') = s (x'', y''). But then (x'', y'') = 1/s (x', y') = 1 / (s t) (x, y), hence (x'', y'') ~ (x, y), proving transitivity.
This proves that the relation is indeed an equivalence relation. Next, we have to try to find
a graphical representation of P by picking an appropriate member for each class.
Since the relation is an equivalence relation we know that there are disjoint equivalence classes [(x, y)], and we can pick any member of [(x, y)] as a representative of that class. Identify a point p on the unit circle with the whole equivalence class [(x, y)] containing p. That is, we identify each equivalence class with a point on the unit circle by drawing the line through that point and the origin. That line contains every member of the equivalence class.
Points on the unit circle that are diametrically opposite each other, however, are in the same equivalence class, since they are on the same line through the origin. Thus we identify the space P with the unit circle, where diametrically opposite points are considered identical.
This space is called projective space, and can be generalized to higher dimensions as well.
Theorem 1.4.2: The Integers


Let A be the set N x N and define a relation r on N x N by saying that (a, b) is related to (a', b') if a + b' = a' + b. Then this relation is an equivalence relation.
If [(a, b)] and [(a', b')] denote the equivalence classes containing (a, b) and (a', b'), respectively, and if we define addition and multiplication of those equivalence classes as:
1. [(a,b)] + [(a', b')] = [(a + a', b + b')]
2. [(a,b)] * [(a', b')] = [(a * b' + b * a', a * a' + b * b')]
then these operations are well-defined and the resulting set of all equivalence classes has all of the familiar properties of the integers (it therefore serves to define the integers based only on the natural numbers).
Proof:
We first have to prove reflexivity, symmetry, and transitivity to show that the relation is an
equivalence relation. The first two properties are easy, and are left as an exercise. As for
transitivity:
Take (a, b) ~ (a', b'), i.e. a + b' = a' + b
Take (a', b') ~ (a'', b''), i.e. a' + b'' = a'' + b'
Adding, we get (a + b') + (a' + b'') = (a' + b) + (a'' + b'). Canceling a' and b' on both sides yields
a + b'' = a'' + b, i.e. (a, b) ~ (a'', b'')
Hence, transitivity is proved.
Next, we have to show that the definition of addition and multiplication is well-defined. In particular, we need to show that the definition of these operations does not depend on the particular representatives of the equivalence classes that we chose. The idea of that proof is clear: pick different members of a class, and show that their sum or product results in the same class. Thus, suppose:
(a, b) and (c, d) are related
(a', b') and (c', d') are related.
Then [(a, b)] + [(a', b')] = [(a + a', b + b')]
and [(c, d)] + [(c', d')] = [(c + c', d + d')]
Because (a, b) ~ (c, d) we know: a + d = c + b
Because (a', b') ~ (c', d') we know that a' + d' = c' + b'
Adding both of those equations we get that:
(a + a') + (d + d') = (c + c') + (b + b')
which means, in other words, that the two sums are related in turn. Hence:
[(a + a', b + b')] = [(c + c', d + d')]
which is exactly what we wanted to show.
Finally, we need to show that multiplication is also well-defined. Therefore, suppose:
(a, b) and (c, d) are related, i.e. a + d = c + b
(a', b') and (c', d') are related, i.e. a' + d' = c' + b'
Then, according to the definition of multiplication:
[(a, b)] * [(a', b')] = [(a * b' + b * a', a * a' + b * b')]
[(c, d)] * [(c', d')] = [(c * d' + d * c', c * c' + d * d')]
We need to show that these two classes are identical, i.e. that the representatives are related. That means we need to show that
(a * b' + b * a') + (c * c' + d * d') = (c * d' + d * c') + (a * a' + b * b')
It is easiest to start with what we would like to be true and work backwards. Therefore,
rearranging the above equation we have:
a * b' - a * a' + b * a' - b * b' = c * d' - c * c' + d * c' - d * d'
Factoring, we obtain:
a * (b' - a') - b * (b' - a') = c * (d' - c') - d * (d' - c')
and factoring again:
(a - b) * (b' - a') = (c - d) * (d' - c')
But now we can use the fact that (a, b) ~ (c, d) and (a', b') ~ (c', d') to see that this equation is indeed true.
As for the actual proof, we only have to read the last few lines
backwards to have a perfectly good proof. This will show that the two
resulting classes are the same, proving that multiplication is indeed
well-defined.
As to why this definition yields equivalence classes with properties
similar to the integers, consider a few examples on your own. We do not
actually have to prove anything, because we simply define the integers
to be the set of equivalence classes with respect to the above
equivalence relation and definition of addition and multiplication.
Before we finish: it seems like a real coincidence that this nice
factorization worked in the proof of the well-definition of the
multiplication. If you work out some examples with actual numbers,
however, it might become clear that this is of course not a coincidence
after all.
Examples 1.4.3(a):
Let A be the set N x N and define an equivalence relation r on N x N and addition of the equivalence classes as follows:
1. (a, b) is related to (a', b') if a + b' = a' + b
2. [(a,b)] + [(a',b')] = [(a + a', b + b')]
3. [(a,b)] * [(a', b')] = [(a * b' + b * a', a * a' + b * b')]
Which elements are contained in the equivalence classes of, say, [(1, 2)], [(0,0)] and of [(1, 0)]? Which of the pairs (1, 5), (5, 1), (10, 14), (7, 3) are in the same equivalence classes?
The elements in the equivalence class of [(1, 2)] are all pairs (x, y) that are related to (1, 2), i.e. all (x, y) such that
1 + y = x + 2, or y - x = 1
In other words, the difference of the second and the first entry is one. Some members of this equivalence class are therefore
(2, 3), (3, 4), (100, 101) ∈ [(1, 2)]
Some of the elements of the classes [(0, 0)] and [(1, 0)] are:
(x, y) ∈ [(0, 0)] if (0, 0) ~ (x, y), i.e. 0 + y = x + 0, or y = x
Hence, (1, 1), (5, 5), (100, 100) ∈ [(0, 0)]
(x, y) ∈ [(1, 0)] if (1, 0) ~ (x, y), i.e. 1 + y = 0 + x, or x - y = 1
Hence, (2, 1), (6, 5), (101, 100) ∈ [(1, 0)]. To determine which of the pairs (1, 5), (5, 1),
(10, 14), (7, 3) are in the same equivalence classes, all we have to do is compare the
differences between the second and the first entry:
(1,5): the difference y - x = 4
(5, 1): the difference y - x = -4
(10, 14): the difference y - x = 4
(7, 3): the difference y - x = -4
Therefore (1,5) and (10, 14) are in the same class, and (5,1) and (7,3) are also in the same
class.
Examples 1.4.3(b):
Let A be the set N x N and define an equivalence relation r on N x N and addition of the equivalence classes as follows:
1. (a, b) is related to (a', b') if a + b' = a' + b
2. [(a,b)] + [(a',b')] = [(a + a', b + b')]
3. [(a,b)] * [(a', b')] = [(a * b' + b * a', a * a' + b * b')]
Below are some examples for addition and multiplication of these equivalence classes.
If you add [(1,2)] + [(4, 6)] you would get the following:
[(1,2)] + [(4, 6)] = [(1 + 4, 2 + 6)] = [(5, 8)]
By the above examples, that implies
[(1,2)] contains all pairs whose difference y - x = 1
[(4,6)] contains all pairs whose difference y - x = 2
[(1,2)] + [(4,6)] contains all pairs whose difference y - x = 3
Adding [(3,1)] + [(1,3)] gives the following:
[(3,1)] + [(1,3)] = [(3+1, 1+3)]
This is, by the above example, equivalent to the following:
[(3,1)] contains all pairs whose difference y - x = -2
[(1,3)] contains all pairs whose difference y - x = 2

[(3,1)] + [(1,3)] contains all pairs whose difference y - x = 0.

Multiplying the equivalence classes [(5,4)] * [(7, 4)] we get the following:
[(5,4)] * [(7, 4)] = [(5*4 + 4*7, 5*7 + 4*4)] = [(48,51)]
This is, by the above example, equivalent to the following:
[(5,4)] contains all pairs whose difference y - x is -1
[(7,4)] contains all pairs whose difference y - x is -3
[(5,4)] * [(7, 4)] = [(48,51)] contains all pairs whose difference y - x = 3
Multiplying the equivalence classes [(1,2)] * [(2,1)] we get the following:
[(1,2)] * [(2,1)] = [(1*1 + 2*2, 1*2 + 2*1)] = [(5,4)]
This is, by the above example, equivalent to the following:
[(1,2)] contains all pairs whose difference y - x is 1
[(2,1)] contains all pairs whose difference y - x is -1
[(1,2)] * [(2,1)] = [(5,4)] contains all pairs whose difference y - x is -1.
Examples 1.4.3(c):
Let A be the set N x N and define an equivalence relation r on N x N and addition of the equivalence classes as follows:
1. (a, b) is related to (a', b') if a + b' = a' + b
2. [(a,b)] + [(a',b')] = [(a + a', b + b')]
3. [(a,b)] * [(a', b')] = [(a * b' + b * a', a * a' + b * b')]
What is the best symbol to use for the resulting equivalence classes?
Since two pairs (a, b) and (a', b') are related if
a + b' = a' + b, or b - a = b' - a'
we might as well choose the symbol b - a to denote their equivalence classes. Hence:
the symbol 2 denotes the equivalence class [(1,3)] containing, for example, the pairs (1,3), (5,7), and (100, 102).
the symbol -3 denotes the equivalence class [(4,1)], containing, for example, the pairs (4,1), (8,5), and (103, 100).
By the above rules, if the symbols 2 and -3 are added together we get the class
2 + (-3) = [(1,3)] + [(103,100)] = [(104, 103)] = [(2,1)] = -1
and if the symbols 2 and -3 are multiplied together we get the class
2 * (-3) = [(1,3)] * [(103,100)] = [(409, 403)] = [(7,1)] = -6
Hence, these equivalence classes, together with the definition of addition and multiplication, give a mathematically precise meaning to symbols such as -3, and explain in fact the meaning, the addition, and the multiplication of the integers Z.
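The construction can be mimicked in a few lines of Python (a sketch, not part of the original text; the function names are ad hoc). Pairs of natural numbers stand for integers, the class [(a, b)] representing b - a.

    def related(p, q):
        # (a, b) ~ (a', b')  iff  a + b' = a' + b
        (a, b), (a2, b2) = p, q
        return a + b2 == a2 + b

    def add(p, q):
        (a, b), (a2, b2) = p, q
        return (a + a2, b + b2)

    def mul(p, q):
        (a, b), (a2, b2) = p, q
        return (a * b2 + b * a2, a * a2 + b * b2)

    def symbol(p):
        # the integer a pair represents: second entry minus first entry
        a, b = p
        return b - a

    print(symbol(add((1, 3), (103, 100))))   # 2 + (-3) = -1
    print(symbol(mul((1, 3), (103, 100))))   # 2 * (-3) = -6
    # independence of representatives: (4,1) and (103,100) are in the same class
    print(related(mul((1, 3), (4, 1)), mul((1, 3), (103, 100))))   # True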
Example 1.4.5:
Why does one need another number system more complicated than the
rational numbers Q?
There are several reasons, many of which are explored in detail in the next chapters. Here
are a few of them (which may use terms we are not yet familiar with).
a simple equation like x^2 - 9 = 0 does have a solution in Q, but another, just as simple equation x^2 - 2 = 0 does not have a solution in Q.
If we construct a right triangle for which two sides have length 1, then we could not measure the length of the remaining side if all we knew were rational numbers.
We could not measure the circumference of a circle with rational radius if all we knew were rational numbers.
if we set x0 = 2 and then for each integer n > 0 compute the numbers
xn = (xn-1 + 2 / xn-1) / 2
successively, then each resulting number is a rational number, the sequence of numbers is getting smaller and smaller, but they seem to get closer and closer to some limit. However, this sequence of numbers does not converge to a rational number. The sequence looks like this (do you know its limit? - see also the short numerical sketch after this list):
o x0 = 2
o x1 = 3/2 = 1.5
o x2 = 17/12 = 1.416...
o x3 = 577/408 = 1.414215686...
o and so on ...
There are sets consisting of rational numbers that are bounded, but do not have a
least upper bound in Q.
Equations such as sin(x) = 1/2 or cos(x) = 0 do not have solutions in Q.
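Here is a short numerical sketch (not part of the original text) of the iteration above, using Python's exact fractions to emphasize that every term of the sequence is a rational number even though the limit is not.

    from fractions import Fraction

    x = Fraction(2)
    for n in range(5):
        print(n, x, float(x))
        x = (x + 2 / x) / 2
    # prints 2, 3/2, 17/12, 577/408, 665857/470832, ... all rational,
    # approaching the square root of 2, which is not rational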
Example: Countable Chairs


Suppose you are standing in an empty classroom, with a lot of students
waiting to get in. How can you know whether there are enough chairs for
everyone? Use the mathematical definition of cardinality to determine the
answer.
We can not count the students, since they are moving around too much. Therefore, we set up a function f that associates each student with a chair by simply asking each student to find a chair and sit down. If all chairs are taken, and no students are left standing, then
what does this mean for our function f ?
the domain of f is the space of all students
the range of f is the space of all chairs
the function f is one-to-one, because: if f(a) = f(b), then student a and student b are occupying the same chair. This can not happen unless student a and student b are the same student. Hence, f is injective.
the function f is onto, because: the range is the space of all chairs. Since all chairs
are occupied, there is a student associated with each chair. Hence, f is surjective.

Therefore, the function f is a bijection between the domain and the range, and by definition
of cardinality the number of students matches the number of chairs, i.e. both sets have the
same cardinality.
On the other hand, if the two sets did not contain the same number of
elements, the following could happen:
if f was one-to-one, but not onto, then there would be empty chairs. Hence,
cardinality of the students is less than the cardinality of the chairs.
if f was onto, but not one-to-one, then all chairs are taken, but some chairs hold
more than one student. Hence, cardinality of the students is greater than the
cardinality of the chairs.
Examples 2.1.2:
Find the bijections to prove the following statements:
1. Let E be the set of all even integers, O be the set of odd integers.
Then card(E) = card(O)
2. Let E be the set of even integers, Z be the set of all integers. Then
card(E) = card(Z)
3. Let N be the set of natural numbers, Z be the set of all integers. Then
card(N) = card(Z)
In each of the three cases we have to find a bijection between the two pairs of sets.
1. Define the function f(n) = n + 1 with domain E and range O. Then the function f is
clearly one-to-one and onto, hence it is a bijection. Now f is a bijection between E
and O, so that card(E) = card(O).
2. Define the function f(n) = 2n with domain Z and range E. Then it is straightforward to show that this function is one-to-one and onto, giving the required bijection. Hence, card(Z) = card(E).
3. Define the following function: f(n) = n / 2 if n is even and f(n) = -(n - 1) / 2 if n is odd, with domain N and range Z. Again, it is not hard to show that this function is one-to-one and onto, and therefore card(N) = card(Z).
The actual details of proving that the functions are bijections are left as an exercise.
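A quick Python sketch (not part of the original text) of the bijection in part 3, listing the first few values to show how the natural numbers are matched with the integers:

    def f(n):
        # even n -> the positive integers, odd n -> zero and the negative integers
        return n // 2 if n % 2 == 0 else -(n - 1) // 2

    values = [f(n) for n in range(1, 12)]
    print(values)                            # [0, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5]
    print(len(set(values)) == len(values))   # True: no repeats on this window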
Theorem 2.1.4: Dedekind Theorem


A set S is infinite if and only if there exists a proper subset A of S which has
the same cardinality as S.
Proof:
The proof contains some gaps that you are supposed to fill in as an exercise.
Assume that S is countable. Then, by definition,
there exists a bijection f from S to N

Next, consider the function g from N to N defined as


g(n) = n + 1, which is a bijection from N to N \ {1}.
Both functions, being bijections, have an inverse function. Define the function
h(s) = (f^(-1) o g o f)(s) = f^(-1)(g(f(s))) from S to S
Because f and g are bijections, h is also a bijection between S and h(S). But h(S) is now a
proper subset of S - which element is in S but not in h(S) ? Hence, S is equivalent to a
proper subset of itself.
Assume that S is uncountable. Then we can extract a countable subset
B of S. By the above proof, there is a bijection h from B to a proper
subset A of B. Now define the function
f(s) = s if s is in S \ B and f(s) = h(s) if s is in B.
Then f is a bijection between S and f(S), and f(S) is a proper subset of S - do you see why ?
Hence, S is again equivalent to a proper subset of itself.
Assume that S is finite. If S has, say, N elements, then a proper subset
of S contains at most N - 1 elements. But then it is impossible to find a
bijection from S to this proper subset - do you see why ?
Examples 2.1.5:
The set of integers Z and the interval of real numbers between 0 and 2, [0,
2], are both infinite.
According to Dedekind's theorem, we need to find a proper subset of each set that has the
same cardinality as the original set. In other words, we need to find a bijection from the
original set into a proper subset of itself.
Define the function f(n) = 2n. Then this function is a bijection between Z
and the even integers. Hence, Z has the same cardinality as a proper
subset of itself, and therefore Z is infinite.
Define the function f(x) = x / 2. Then the function is a bijection between
the interval [0, 2] and the interval [0, 1]. Hence, the interval [0, 2] is
infinite.
Proposition 2.1.6: Combining Countable Sets

Every subset of a countable set is again countable (or finite).


The set of all ordered pairs of positive integers is countable.
The countable union of countable sets is countable
The finite cross product of countable sets is countable.
Proof:
The first statement seems obvious. Nonetheless, it needs to be proved - as an exercise.
To prove the second statement, we list all ordered pairs as follows:
(1,1)  (1,2)  (1,3)  (1,4) ...
(2,1)  (2,2)  (2,3)  (2,4) ...
(3,1)  (3,2)  (3,3)  (3,4) ...
(4,1)  (4,2)  (4,3)  (4,4) ...
 ...    ...    ...    ...
Next, we have to devise a method for counting these elements in order to enumerate all pairs. Here is one possible solution:
[Figure: diagonalized counting of N x N - the pairs are enumerated along the diagonals, i.e. in the order (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), ...]
Since we are able to enumerate all elements in the above table, the set
must be countable.
The proof of the next statement - that the countable union of countable
sets is again countable - is very similar. Try to duplicate the above proof
by writing down the elements from all sets in a table similar to the
above table. The details are left as another exercise.
Finally, we need to prove the last statement: Suppose S(1), S(2), S(3), ..., S(n) are all countable. Then the cross product of those sets is defined to be
S := { (x(1), x(2), ..., x(n)) : x(j) ∈ S(j) for all j = 1, 2, ..., n }
If there were only two such sets, then the proof is exactly the same as the proof of the second statement. If there are more than two sets, we can group them. That is, consider
S(1) x S(2) x S(3) x ... x S(n) = ( S(1) x S(2) ) x S(3) x ... x S(n)
The first cross product is countable, because it is a cross product of two countable sets. Then consider
S(1) x S(2) x S(3) x ... x S(n) = ( ( S(1) x S(2) ) x S(3) ) x ... x S(n)
The cross product ( ( S(1) x S(2) ) x S(3) ) is countable, because it can be considered again as the cross product of only two sets, each of which is countable. Clearly we can continue in this fashion, and since there are only finitely many sets, we will have proved the third statement.
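The diagonal counting scheme can be written out as a small Python generator (a sketch, not part of the original text): it walks along the diagonals on which the two entries have a constant sum, which is exactly the enumeration indicated in the figure above.

    def diagonal_pairs():
        s = 2                       # the sum i + j on the current diagonal
        while True:
            for i in range(1, s):   # the pairs (i, s - i) on this diagonal
                yield (i, s - i)
            s += 1

    gen = diagonal_pairs()
    print([next(gen) for _ in range(10)])
    # [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1)]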
Proposition: Rationals are Countable


The set of all rational numbers is countable.
Proof:
Every rational number has the form p / q, and we may assume that q is always positive.
Define the height of a rational number p / q as |p| + q. Then we can list all rational
numbers as follows:
Height 1: 0/1
Height 2: -1/1, 1/1
Height 3: -2/1, -1/2, 1/2, 2/1
Height 4: -3/1, -2/2, -1/3, 1/3, 2/2, 3/1
Height 5: ...
...
But then each row is countable (in fact, finite), and there are countably many rows. Hence, we have written the rationals as a countable union of countable sets, so that the rationals are indeed countable. As a matter of fact, the above list contains many rational numbers more than once (for example -1/1 and -2/2 represent the same number). But the rationals correspond to a subset of this countable list, and since every subset of a countable set is countable or finite, and Q is certainly not finite, the rationals are countable.
Examples 2.1.7(b):
The set of all polynomials that have integer coefficients and degree n is
countable.
Let P(n) be the set of all polynomials with integer coefficients and degree n. Then a particular element of P(n) is
p_n(x) = a_n x^n + a_(n-1) x^(n-1) + a_(n-2) x^(n-2) + ... + a_1 x + a_0
Define a function f as follows:
the domain of f is P(n), the range of f is Z x Z x ... x Z (n+1 times)
f(p_n) = f( a_n x^n + a_(n-1) x^(n-1) + ... + a_1 x + a_0 ) = (a_n, a_(n-1), ..., a_1, a_0)
Because a polynomial is determined by its coefficients, this function is one-to-one, so that P(n) has at most the cardinality of Z x Z x ... x Z. But the finite cross product of countable sets is countable, and P(n) is certainly not finite, so P(n) is also countable.
Examples 2.1.7(c):
The set of all polynomials with integer coefficients is countable.
Let P be the set of all polynomials with integer coefficients, and define the set P(n) to be the set of all polynomials with integer coefficients and degree n. From before we already know that P(n) is countable. But
P = P(0) ∪ P(1) ∪ P(2) ∪ ... = ∪ P(n)
Hence, P is the countable union of countable sets, and must therefore be countable itself by our result on countable unions of countable sets.
Proposition 2.2.1: An Uncountable Set


The open interval (0, 1) is uncountable.
Proof:
Any number x in the interval (0, 1) can be expressed as a unique, never-ending decimal.
Actually, this is not quite true: 0.1499999... is the same number as 0.15000.... But when we
simply discard those numbers with a non-ending tail of 9's we still get the open interval (0,
1), and now every number has a unique decimal representation. If these numbers were
countable, we could list them in a two-way infinite table:
1. number: x11, x21, x31, x41, ...
2. number: x12, x22, x32, x42, ...
3. number: x13, x23, x33, x43, ...
4. number: x14, x24, x34, x44, ...
...
where xjn stands for the j-th decimal digit of the n-th number in the list (without the leading '0.').
In this list, what would be the number associated to the following element:
Let x be the number represented by the digits (x1, x2, x3, x4, ...), where we let:
o x1 = 1 if x11 ≠ 1, and x1 = 2 if x11 = 1
o x2 = 1 if x22 ≠ 1, and x2 = 2 if x22 = 1
o x3 = 1 if x33 ≠ 1, and x3 = 2 if x33 = 1
o ...

This new element x is different from the first one in our list, because they differ in their
first entry; x is different from the second one in the list, because they differ in the second
entry; x is different from the third one because they differ in the third entry, etc. But now it
is clear that x can not be in the above list, because it differs with the n-th element of that
list in the n-th entry. But this element represents a number in the interval (0, 1). Hence, we
have found that we were unable to list all numbers in (0,1), and therefore the interval is
indeed uncountable.
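The diagonal construction can be imitated on a finite sample (a Python sketch, not part of the original text): given a few decimal expansions, listed by their digits after the leading '0.', change the n-th digit of the n-th number to obtain an expansion that differs from every number in the list.

    numbers = ["14159", "50000", "33333", "71828", "12345"]

    # use digit 1 unless the diagonal digit already is 1, in which case use 2
    diag = "".join("1" if num[n] != "1" else "2" for n, num in enumerate(numbers))
    print(diag)   # 21111: differs from the n-th listed expansion in its n-th digit
    print(all(diag[n] != numbers[n][n] for n in range(len(numbers))))   # True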
Examples 2.2.2:
Show that all of the following sets are uncountable:
1. The open interval (-1, 1) is uncountable
2. Any open interval (a, b) is uncountable
3. The set of all real numbers R is uncountable
Recall that the set (0, 1) is uncountable, as proved before. Then:


1. Define the function

f(x) = 2x - 1 from (0, 1) to (-1, 1)

This is a bijection between those two intervals, and therefore both intervals have the same
cardinality.
2. A similar proof can show that any open interval (a, b) is uncountable.
What is the appropriate bijection (try a linear function that maps 0 to a
and 1 to b) ?
3. Define a function
f(x) = π x - π / 2
Then this function is a bijection between the open intervals (0, 1) and (-π / 2, π / 2). Next, take the function
g(x) = tan(x)
This function is a bijection between (-π / 2, π / 2) and R. But then the composition of the two functions will be a bijection from (0, 1) to R, and hence both sets must have the same cardinality.
Examples 2.2.4(a):
If S = {1,2,3}, then what is P(S) ? What is the power set of the set S = {1, 2,
3, 4} ? How many elements does the power set of S = {1, 2, 3, 4, 5, 6}
have ?
The power set is the set of all subsets of a given set.


For the set S = {1,2,3} this means:

subsets with 0 elements: 0 (the empty set)


subsets with 1 element: {1}, {2}, {3}
subsets with 2 elements: {1,2}, {1,3}, {2,3}
subsets with 3 elements: S

Hence:

P(S) = {0, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, S}

Therefore, we have:
card(S) = 3 and card(P(S)) = 8 = 2^3
For the set S = {1,2,3,4} this means:
subsets with 0 elements: 0 (the empty set)
subsets with 1 element: {1}, {2}, {3}, {4}
subsets with 2 elements: {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}
subsets with 3 elements: {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}
subsets with 4 elements: S
Therefore, we have:
card(S) = 4 and card(P(S)) = 16 = 2^4
Finally, if S = {1,2,3,4,5,6} then, based on the above examples, we would suspect that since card(S) = 6, card(P(S)) = 2^6 = 64.
In fact, if a set S contains n elements, then its power set will contain 2^n elements. This can be proved by induction as an exercise.
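The counting is also easy to check by machine. The following Python sketch (not part of the original text) generates the power set with itertools and confirms that a set with n elements has 2^n subsets:

    from itertools import combinations

    def power_set(s):
        s = list(s)
        return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

    for S in [{1, 2, 3}, {1, 2, 3, 4}, {1, 2, 3, 4, 5, 6}]:
        P = power_set(S)
        print(len(S), len(P), len(P) == 2 ** len(S))   # n, 2^n, True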
Examples 2.2.4(b):
The cardinality of the power set of S is always greater than or equal to the
cardinality of a set S for any set S.
To show that the cardinality of P(S) is greater than or equal to that of a set S we have to
find either a surjection from P(S) to S or an injection from S to P(S).
Define the function f from S to P(S) as follows:

f(s) = {s} for any s in S

i.e. every element s in S is mapped to the set {s} containing the single element s. Note that
the set {s} is an element of the power set P(S).
This map is clearly one-to-one, and therefore card(S) ≤ card(P(S)).
This does not prove that card(P(S)) > card(S). However, that statement is also true, but its proof is more complicated.
Theorem 2.2.5: Cardinality of Power Sets

The cardinality of the power set P(S) is always bigger than the cardinality of S for any set S.
Proof:
The proof will be similar to the proof of the uncountability of the open interval (0,1): we will attempt to list all sets of P(S) next to all elements of S, and then exhibit a set that can not be in that list. However, we will do this in a more abstract fashion this time.
Let us denote the elements of the set S by small letters a, b, c, ..., and the elements of P(S) by capital letters A, B, C, .... Note that each member of P(S) is again a set in itself. Suppose there was a function f between S and P(S). That means that we may have the following correspondence:
a corresponds to A, i.e. A = f(a)
b corresponds to B, i.e. B = f(b)
c corresponds to C, i.e. C = f(c)
and so on. The fact that we use matching small and capital letters is merely for convenience. Now define the set X as follows:
X = { s ∈ S : {s} ∩ f(s) = 0 }

or, in other words:


if T = f(t), then t is in X if and only if t is not contained in T
Hence, X contains all those elements that are not contained in their associated sets under
the above function f. This is easy to understand by means of a concrete example:
Let S = {1, 2, 3}. Then P(S) = { 0, {1}, {2}, {3}, {1,2}, {1,3}, {2,3},
{1,2,3} }. An arbitrary function from S to P(S) might associate the
elements of S as follows:
1 is associated with {3}
2 is associated with {1,2,3}
3 is associated with {1,2}.
Then the set X defined above consists of the following elements:
f(1) = {3}, and 1 is not in {3}. Thus, 1 is in X.
f(2) = {1,2,3}, and 2 is in {1,2,3}. Thus, 2 is not in X.
f(3) = {1, 2}, and 3 is not in {1,2}. Thus, 3 is in X.
Hence, X = {1, 3}.
Incidentally, there is no element from S that is mapped to the set X.
Now let us return to the proof. Since X consists of elements of S, X is a
subset of S, and hence X is an element of P(S). If the above map f was
an onto map, then there must be an element x from S with f(x) = X.
Where is that element x ?: Suppose x is contained in X:
Since X contains those elements that are not in its associated set, we have
that x is not contained in f(x) = X.
Suppose x is not contained in X:

Since X contains exactly those elements that are not in its associated set,
and x is assumed to be not in X = f(x), then x must be contained in X.
This is clearly not possible, showing that our assumption that f is onto is false. Hence,
whatever map between S and P(S) exists, it can not be onto. But from a previous example
we already know that the cardinality of P(S) is at least as large as the cardinality of S, i.e.
there is a one-to-one map from S to P(S). Since this map can not also be onto, we have
proved that card(P(S)) > card(S).
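For the finite example used in the proof, the "diagonal" set X can be computed directly. A small Python sketch (not part of the original text):

    S = {1, 2, 3}
    f = {1: {3}, 2: {1, 2, 3}, 3: {1, 2}}    # the sample map from S into P(S)

    X = {s for s in S if s not in f[s]}
    print(X)                                  # {1, 3}
    print(any(f[s] == X for s in S))          # False: no element of S is mapped to X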
Example 2.2.6: Logical Impossibilities - The Set of all Sets
Let S be the set of all those sets which are not members of themselves. Then this set can not exist.
This definition seems to make sense, because a set could be an entity of its own, as well as
an element of another set. For example, we could define two sets
A = { {1}, {1,3}, A }
B = { {1}, {1,3} }
Then A is a set that is also a member of itself, whereas B is not a member of itself. Therefore, we could consider the set of all those sets that are not members of themselves. Call this set S. The above set A would not be an element of S, whereas B is an element of S.
While this, albeit strange, does seem to make sense, we might ask:
Is S an element of itself or not ?
But this question will give a contradiction, because:
If S is an element of itself, then - since by definition S contains those sets only that
are not part of itself - S is not an element of itself. That's not possible.
If S is not an element of itself, then - since S does contain those sets that are not
part of itself - S is a member of S. That's not possible either.
Hence, we have arrived at a logical impossibility, and the set S does indeed not exist.
Example 2.2.7: A Hierarchy of Infinity - Cardinal Numbers
When dealing with cardinal numbers, one can establish the following rules and definitions:
1. Definition of a cardinal number
2. Comparing cardinal numbers
3. The power of the continuum and the cardinality aleph null
4. Addition of two cardinal numbers
1. Definition of Cardinal Number

Two sets A and B are called equivalent if there exists a bijection between A and B.
The two sets are said to have the same cardinality, or power.
The cardinality of a set A is denoted by card(A).
The cardinal numbers of two sets are equal if the sets are equivalent.

Note that for finite sets the cardinality is just the number of elements in that set. That is,
card({a, b, 2} ) = card({1, 2, 9}) = 3. However, the concept of cardinality also applies to
infinite sets.

2. Comparing Cardinal Numbers

A cardinal number c is less than or equal to another cardinal number d if there exist two sets A and B with card(A) = c and card(B) = d and card(A) ≤ card(B), i.e. there is an injection from A into B.

Note that according to this definition, we have that the cardinality of the natural numbers is strictly less than the cardinality of the real numbers. In fact, those cardinalities have a special name:

3. Special Cardinalities
The cardinality of the real numbers is called the cardinality (or power) of the continuum, and is denoted by
c = card(R).
The cardinality of the natural numbers is called aleph null and is denoted by
ℵ0 = card(N)

4. Adding Cardinal Numbers

Let c and d be two cardinal numbers and take sets A and B with card(A) = c and card(B) = d. Define the sets
A' = { (a, 0) : a ∈ A }
B' = { (b, 1) : b ∈ B }
Then the sum of the two cardinal numbers c and d is defined as
c + d = card(A' ∪ B')
Note that the reason for defining the sets A' and B' is to make sure that the resulting sets are disjoint. If the two sets A and B are disjoint from the outset, one could define the sum of the cardinal numbers as the cardinality of the union of the original sets.

Example:
Let A = {1,2,3} and B = {1,2}. Then card(A) = 3 and card(B) = 2. However,
card(A ∪ B) = 3
because A ∪ B = {1,2,3}. Using the above definition, we get:
A' = { (1,0), (2,0), (3,0) }
B' = { (1,1), (2,1) }
therefore
A' ∪ B' = { (1,0), (2,0), (3,0), (1,1), (2,1) }
and therefore, as one would think:
card(A' ∪ B') = 5 = card(A) + card(B)
Examples 2.2.8:
We want to add or subtract the following cardinal numbers:
1. card(N) + card(N) = card(N)
2. card(N) - card(N) = undefined
3. card(R) + card(N) = card(R)
4. card(R) + card(R) = card(R)
1. card(N) + card(N) = card(N)

According to the definition, this is the same as the cardinality of A ∪ B, where A and B are both countable, disjoint sets. But the countable union of countable sets is again countable. Hence, card(A ∪ B) = card(N), so that
card(N) + card(N) = card(N)
Using our notation for the cardinality of the natural numbers we can rephrase this equation (recall that ℵ0 = aleph null = card(N)):
ℵ0 + ℵ0 = ℵ0
2. card(N) - card(N) = undefined

Although this has not been properly defined, one could say that this should be the same as the cardinality of A \ B, where A and B are both countable sets and B is a subset of A. This creates problems, however, as the following examples show:
A = B = N. Then card(A \ B) = card(0) = 0
A = all integers, B = even integers. Then card(A \ B) = card(odd integers) = card(N)
Since we can not have two possible answers, we would guess that
card(N) - card(N) is undefined.
Or, in our 'aleph null' notation we would say:
ℵ0 - ℵ0 is undefined

3. card(R) + card(N) = card(R)

According to the definition, this is the same as the cardinality of A ∪ B, where A is uncountable and B is countable and A and B are disjoint. We know that every subset of a countable set is countable or finite. Since A is a subset of A ∪ B, the set A ∪ B can not be countable. Hence, it must be uncountable. We would therefore guess that
card(R) + card(N) = card(R)
Using our notation for the cardinalities of the natural numbers and the continuum we can rephrase this equation as:
c + ℵ0 = c
We would actually need to show that the cardinality of A ∪ B can not be strictly larger than the cardinality of A to establish this. That, however, is left as an exercise.

4. card(R) + card(R) = card(R)

This should be the same as the cardinality of A ∪ B, where both A and B are uncountable and disjoint. It is easy to find a one-to-one function from A to A ∪ B, so that card(A) ≤ card(A ∪ B). But then card(A ∪ B) is again uncountable, so that we would guess that
card(R) + card(R) = card(R)
In our 'special cardinality' notation we could rephrase this as
c + c = c
To establish this, we also need to show that card(A ∪ B) ≤ card(A). This is true, and left as an exercise.
Example 2.2.9: The Continuum Hypothesis


Is there a cardinal number c with card(N) < c < card(R) ? What is the most
obvious candidate ?
Back

We need to find a set whose cardinality is bigger than N and less that that of R. The most
obvious candidate would be the power set of N. However, one can show that
card(P(N)) = card(R)
In fact, this is a deep question called the continuum hypothesis. This question results in
serious problems:
In the 1940's the German mathematician Goedel showed that if one denies the
existence of an uncountable set whose cardinalities is less than the cardinality of
the continuum, no logical contradictions to the axioms of set theory would arise.
One the other hand, it was shown recently that the existence of an uncountable set
with cardinality less than that of the continuum would also be consistent with the
axioms of set theory.
Hence, it seems impossible to decide this question with our usual methods of proving
theorems.
Such undecidable questions do indeed exist for any reasonably complex
logical system (such as set theory), and in fact one can even prove that
such 'non-provable' statements must exist. To read more about this
fascinating subject, look at the book Goedel's Proof or Goedel, Escher,
Bach as mentioned in the reference section of the glossary.
Can you find sets with cardinality strictly bigger than that of the
continuum ?
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 2.3.2(a):
Which of the following sets and their ordering relations are partially
ordered, ordered, or well-ordered:
1. S is any set. Define a b if a = b
2. S is any set, and P(S) the power set of S. Define A B if A B
3. S is the set of real numbers between [0, 1]. Define a b if a is less
than or equal to b (i.e. the 'usual' interpretation of the symbol )
4. S is the set of real numbers between [0, 1]. Define a b if a is greater
than or equal to b.
Back

1. S is any set. Define a b if a = b.


This is a trivial partial ordering. Since no element is related to an
element different from itself, this is not an ordered set. Without more
information about S we can not determine anything else.
This example shows that any set can be partially ordered.
2. S is any set, and P(S) the power set of S. Define A B if A

Recall that if A B and B A then A = B. Therefore, this is indeed a


partial ordering. Without more information about the set S we can not
determine anything else.
3. S is the set of real numbers between [0, 1]. Define a b if a is
less than or equal to b (i.e. the 'usual' interpretation of the
symbol )
This is clearly an ordering, and the set [0, 1] with this ordering is usually
represented as a subset of the number line. It is not a well-ordered set,
because the subset (0,1] has no smallest element.
4. S is the set of real numbers between [0, 1]. Define a b if a is
greater than or equal to b.
This is also an ordering. The set is not well-ordered, however, because
the subset [0, 1) has no smallest element. Note that 1 is the smallest
element in the set [0, 1), according to our convention. Here it is
important to distinguish between the conventional meaning of the
symbol and its meaning as we choose to define it for a particular
situation.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 2.3.2(b):
Which of the following sets are well-ordered ?
1. The number systems N, Z, Q, or R ?

2. The set of all rational numbers in [0, 1] ?


3. The set of positive rational numbers whose denominator equals 3 ?
Back

The natural numbers N are well-ordered:


A subset of natural numbers may not have a largest element, but it must have a
smallest element.
The integers Z are not well-ordered:
While many subsets of Z has a smallest element, the set Z itself does not have a
smallest element.
The rationals Q are not well-ordered:
The set Q itself does not have a smallest element.
The real numbers R are not well-ordered:
R itself does not have a smallest element.
The set of all rational numbers in [0, 1] is not well-ordered:
While the set itself does have a smallest element (namely 0), the subset of all
rational numbers in (0, 1) does not have a smallest element.
The set of all positive rational numbers whose denominator equals 3 is well-ordered:
This set is actually the same as the set of natural numbers, because we could simply
re-label a natural number n to look like the symbol n / 3. Then both sets are the
same, and hence this set is well-ordered.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 2.3.3: Induction Principle


Let S be a well-ordered set with the additional property that every element
except for the smallest one has an immediate predecessor. Then: if Q is a
property such that:
1. the smallest element of S has the property Q
2. if s S has property Q then the successor of s also has property Q
then the property Q holds for every element in S.
Context

Proof:
It is not quite obvious what it is that we have to prove: we need to show that if a property
holds for the smallest element of a well-ordered set S, and it holds for every successor of a
general element of that set, then it is indeed true for the whole set S.
We will prove this by contradiction: Suppose property Q is true for the
smallest element of a well-ordered set S, denoted by the symbol 1.
Suppose also that property Q is true for the successor n + 1 of the
general element n of that set, if it is assumed to be true for that
element n.
Denote by E the set of all elements from S for which property Q is not true
Since S is well-ordered, the subset E contains a smallest element, say n. If n = 1, we would
have a contradiction immediately. If n > 1, then it must have an immediate predecessor,
denoted by the element n-1.

For that element the property Q holds true, and therefore, being true for
n - 1 it must be true for (n-1) + 1 = n, by assumption. Hence, the set E
must be empty, and property Q is therefore true for all of S.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 2.3.4(a):
Use induction to prove the following statements:
1. The sum of the first n positive integers is n (n+1) / 2.
2. If a, b > 0, then (a + b)n an + bn for any positive integer n.
Back

1. The sum of the first n positive integers is n (n+1) / 2


To use induction, we have to proceed in fours steps.
Property Q(n):
1 + 2 + 3 + ... + n = n (n+1) / 2
Check Q(1):
1 = 1 * 2 / 2, which is true.
Assume Q(n) is true:
Assume that 1 + 2 + 3 + ... + n = n (n+1) / 2.
Check Q(n+1):
1 + 2 + 3 + ... + n + n + 1 =
= (1 + 2 + 3 + ... + n) + (n + 1) =
= n (n+1) / 2 + (n+1) =
= n (n+1) / 2 + (2n + 2) / 2 =
= (n+1)(n+2) / 2
Hence, Q(n+1) is true under the assumption that Q(n) is true.
But that finishes the induction proof.
2. If a, b > 0, then (a + b)n an + bn for any positive integer n.
Using an induction proof, we have to proceed in four steps:
Property Q(n):
(a + b)n an + bn if a, b > 0
Check Q(1):
(a + b) a + b, which is true.
Assume Q(n) is true:
Assume that (a + b)n an + bn
Check Q(n+1):
(a + b)n + 1 = (a + b)n (a + b) (an + bn) (a + b) =
= an + 1 + an b + a bn + bn + 1 an + 1 + bn + 1
because a, b > 0. Hence, Q(n+1) is true under the assumption that Q(n) is true.
But that finishes the induction proof.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition: Bernoulli Inequality


If x -1 then (1 + x) n 1 + nx for all positive integers n.
Context

Proof:
The proof goes by induction.
Property Q(n):
If x -1 then (1 + x) n 1 + nx
Check Q(1):
(1 + x) 1 + x is true.
Assume Q(n) is true:
If x -1 then (1 + x) n 1 + nx
Check Q(n+1):
(1 + x) n + 1 = (1 + x) n (1 + x) (1 + nx) (1 + x) =
1 + nx 2 + nx + x = 1 + nx 2 + (n + 1)x 1 + (n + 1)x
so Q(n+1) is true (where have we used that x -1 ?).
As an alternative proof, this statement follows directly from the binomial theorem (do it !).
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 2.3.5(a):
Impose a new ordering labeled << on the natural numbers as follows:
if n and m are both even, then define n << m if n < m
if n and m are both odd, then define n << m if n < m
if n is even and m is odd, we always define n << m
Is the set of natural numbers, together with this new ordering << wellordered ? Does it have the property that every element has an immediate
predecessor ?
Back

The natural numbers, ordered by the ordering <<, could be listed in order as follows:
2, 4, 6, 8, ....., 1, 3, 5, 7, 9, ..... ,
To show it is well-ordered, take any subset A of natural numbers.
If it contains only odd numbers, then the smallest number in the usual ordering is
the smallest element of A
If it contains only even numbers, then the smallest number in the usual ordering is
the smallest element of A
If it contains both even and odd numbers, then the smallest of the even numbers in
the usual ordering is the smallest element of A
Hence, the set is well-ordered.

But, not every element has an immediate predecessor. For example, the
set:
A = {1, 3, 5, 7, ...}
has a smallest element (namely 1), but 1 does not have an immediate predecessor, since
every even number is smaller than 1 by definition.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 2.3.5(b):
Suppose the induction principle defined above does not contain the
assumption that every element except for the smallest has an immediate
predecessor. Then show that it could be proved that every natural number
must be even (which is, of course, not true so the additional assumption on
the induction principle is necessary).
Back

In other words, we assume that the induction principles was stated as follows:
Let S be a well-ordered set Then: if Q is a property such that:
1. the smallest element of S has the property Q
2. if s S has property Q then the successor of s also has property Q
Then the property Q holds for every element in S
If this principle was true, we could prove that every natural number must be even as
follows:
Consider the natural numbers with the ordering << defined as follows:
if n and m are both even, then define n << m if n < m
if n and m are both odd, then define n << m if n < m
if n is even and m is odd, we always define n << m
We have already proved that this set is well-ordered. We want to show that every number is
even. Therefore:
Q is the property that every element is even.

The smallest element of our set in the << ordering is 2, which is even.

Also, if s has property Q then so does the successor of s. That is because in our
ordering, the successor of an even number is always the next even number, never
an odd number, and if s has property Q, then s must be even.

Therefore, by the incorrect induction principle, every natural number is


even - which is, of course, not true.
The actual induction principle as we have defined it does, however, not
apply to this example, since 1 does not have an immediate predecessor.
This example was suggested by Karl Hahn who pointed out that there is
another principle, called Transfinite Induction which - suitably stated does apply to every well-ordered set. He also suggested the book Set
Theory and Logic by Stoll, published by Dover, for further reference on
this and other set theoretical topics.

Interactive Real Analysis, ver. 1.9.3


(c) 1994-2000, Bert G. Wachsmuth

Examples 2.3.8:
Which of the following two recursive definition is valid ?
1. Let x0 = x1 = 1 and define xn = xn - 1 + xn - 2 for all n > 1.
2. Select the following subset from the natural numbers:
o x0 = smallest element of N
o xn = smallest element of N - {x0 , x1 , x2 , ..., xn + 1}
Back

1. Suppose we let x0 = x1 = 1 and define xn = xn - 1 + xn - 2 for all n > 1. Then the first
element is uniquely determined, and each following element to be selected depends
only on the ones that have already been selected. Hence, this is a valid recursive
definition. Note that to compute, say, x5 , we need to compute all previous numbers
first. The numbers defined in this way are called Fibonacci numbers.
2. This is not a valid recursive definition, because to find the element x2, say, we need
to find the smallest element of the set N - {x0 , x1 , x2 , x3}. Since this definition
involves the element x2 itself (and also x3) which we do not know at this stage, this
recursive definition is not valid.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Exercise 2.3.9:
Is the sum of the first n square numbers equal to (n + 2)/3 ?
Back

We might try to prove this statement via induction


Property Q(n):
12 + 22 + 32 + ... + n2 = (n + 2) / 3
Check Q(1):
12 = 3/3 is true
Assume Q(n) is true:
Assume that 12 + 22 + 32 + ... + n2 = (n + 2) / 3
Check Q(n+1):
12 + 22 + 32 + ... + (n + 1)2 =
= (12 + 22 + 32 + ... + n2) + (n+1)2 =
= (n + 2) / 3 + 3 (n + 1)2 / 3 =
= 1/3 (3 n2 + 7n + 5)
which is not equal to (n+3)/3
Hence, the induction proof failed. That does not, in principle, mean that the statement is
false. It is, however, a strong indication that property Q(n) is false. Indeed, checking for n
= 2 gives:
Q(2) = 1 + 4 = 5 # 4 / 3
so that the statement is indeed false by this counterexample.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 2.3.11:
The barber of Seville is that inhabitant of Seville who shaves every man in
Seville that does not shave himself
Back

This is a paradox, because who will shave the barber of Seville himself ?
if the barber does not shave himself, he - according to definition - must shave
himself, because he shaves all who do not shave themselves
if the barber does shave himself, he - according to definition - can not shave
himself, because the barber only shaves those who do not shave themselves
This is actually and example of an invalid recursive definition: In the definition the barber
appears twice: one as the defined entity "barber of Seville" and again as a member of all
men living in Seville. Such a recursive definition is invalid, and in fact leads to a
contradiction in this case.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 2.3.12: Sum of Squares and Cubes


Prove the following statements via induction:
1. The sum of the first n numbers is equal to
2. The sum of the first n square numbers is equal to
3. The sum of the first n cubic numbers is equal to
Back

1. We actually have already proved this statement in example 2.3.4, but we should mention
another proof of this statement that does not use induction.
Actually, as the story goes, the young Gauss, who later became a
famous mathematician, was once in grade school. His teacher, who did
not want to teach anything, asked his students to compute the sum of
the first 100 integers, thinking that this would give him plenty of time to
read a newspaper. However, after a few seconds the young Gauss came
up with the correct answer, much to the dismay of his teacher.
How was Gauss able to find the answer so quickly ? He certainly did not
know about induction, so he used the following trick. We want to find 1
+ 2 + 3 + ... + n-1 + n. Instead of finding that sum once, we compute it
twice, calling the unknown answer S. We list the numbers once in
forward and once in backward order:
1+ 2+
3 + ... + (n-1) + n = S
n + (n-1) + (n-2) + ... + 2 + 1 = S
Each column except the last adds up to n+1, and there are n of them. The last column adds
up to 2 S, so that we have the equation:
n (n+1) = 2 S

from which the result follows.


2. Let's prove the second statement via induction.
Property Q(n):
1 + 22 + 32 + ... + n2 =
Check Q(1):
1=123/6=1
which is true. Now assume Q(n) is true, let's prove Q(n+1):
1 + 22 + 32 + ... + n2 + (n+1)2 =
=

+ (n+1)2 =

=
The last expression is exactly property Q(n+1), which finishes the induction proof.
3. We will - of course - leave the third statement as an exercise. It may
be that the last step requires some factorization that may or may not be
easy to do. Instead, try working your way 'backward'. For example,
instead of trying to factor an expression such as 2 n2 + 7n + 6, you
could start with the answer you are hoping to get, in this case (n + 2)
(2n + 3), and work your way back.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 2.4.1: No Square Roots in Q


There is no rational number x such that x2 = x * x = 2.
Context

Proof:
Suppose there was such an x. Being a rational number, we can write it as
x = a / b (with no common divisors)
Since x2 = x * x = 2 we have
a2 = 2 b2
In other words, a2 is even, and therefore a must be even as well. (Can you prove this ?).
Hence,
a = 2 c for some integer c.
But then we have that
4 c2 = 2 b2, or 2 c2 = b2

As before, this means that b is even.. But then both a and b are divisible by 2. That's a
contradiction, because a and b were supposed to have no common divisors.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 2.4.3(a):
Consider the set S of all rational numbers strictly between 0 and 1. Then this
set has many upper bounds, but only one least upper bound. That supremum
does not have to be part of the original set.
Back

An upper bound for the set S is any number that is greater than or equal to any number in
the set S. Five different upper bounds for S are, for example:
1, 10, 100, 42, and e (Euler's number)
Note that an upper bound is therefore not unique. All that is required is to find number
bigger than all other numbers in the set S.
To find the least upper bound for S we need to find a number x such that
x is an upper bound for S
there is no other upper bound lower than x
Clearly, this upper bound is 1. Note that the supremum, or least upper bound, is unique, but
it is not part of the original set S.
If we include 0 and 1 in the original set S, then the least upper bound is
again 1. This time the unique supremum is part of the set S, which may
or may not happen in general.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 2.4.3(b):
Consider the set of rational numbers {1, 1.4, 1.41, 1.414, 1.4142, ...}
converging to the square root of 2. If all we knew were rational numbers,
this set would have no supremum. If we allow real numbers, there is a
unique supremem.
Back

If we consider the universe to consist only of rational numbers, then this set does not have
a least upper bound.
No number bigger than
is the least upper bound (although each of these numbers
is an upper bound), because if x was that least upper bound, then we can find a
rational number between
and x. That rational number would then be an upper
bound smaller than x, which is a contradiction.
No number less than
is the least upper bound, because if x was that least upper
bound, there is some element of the set between x and
. But then x is not an
upper bound, which is a contradiction.
Hence, there is no supremum for this set S in Q.
If we consider this set as a subset of the real numbers, then the least
upper bound of this set is
.

Interactive Real Analysis, ver. 1.9.3


(c) 1994-2000, Bert G. Wachsmuth

Definition: Lower and Greatest Lower Bound


Let A be an ordered set and X a subset of A. An element b is called a lower
bound for the set X if every element in X is greater than or equal to b. If
such a lower bound exists, the set X is called bounded below.
Let A be an ordered set, and X a subset of A. An element b
in A is called a greatest lower bound (or infimum) for X if
b is a lower bound for X and there is no other lower bound b'
for X that is greater than b. We write b = inf(X).
By its definition, if a greatest lower bound exists, it is unique.
Context
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 2.4.5: Square Roots in R


There is a positive real number x such that x2 = 2
Context

Proof:
Since the above equation is not true in Q, we have to use a property of the real numbers
that is not true for the rational numbers. Define the set
S = { t R: t > 0 and t2 2 }
Our hope is that the supremum of this set should be the desired solution to the above
equation. However, we first need to make sure that the supremum indeed exists before
showing that it is the desired solution.
S is not empty, because 1 is contained in S, and S is bounded above by,
say, 5. Hence, using the least upper bound property of the real
numbers, S has a least upper bound s:
s = sup(S)
Note that this would not necessarily be true if we restricted ourselves to the rational
numbers.
Now we hope that s2 = 2, i.e. s is the desired solution. Since 1 is in S, we
know
s>1
Now s either is the solution, or one of the following two cases are true:
Is s2 < 2 ?
Let

. Then, by assumption 0 < < 1, so that

Hence, s + is also in S, in which case s can not be an upper bound for S. This is a
contradiction, so this case is not possible.
2
Is s > 2 ?
Let

. Again > 0, so that

Hence, s - is another upper bound for S, so that s is not the least upper bound for
S. This is a contradiction, so that this case is not possible.
Having eliminated these two cases, we are left with s2 = 2, which is what we wanted to
prove.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.1.3(a):
The sequence

converges to zero.
Back

You browser is not 'Java-enabled' ...

= {1, 1/2, 1/3, 1/4, ... }, which seems to indicate that the terms
are getting closer and closer to zero. According to the definition of
convergence, we need to show that no matter which > 0 one chooses,
the sequence will eventually become smaller than this number. To be
precise: take any > 0. Then there exists a positive integer N such that
1 / N < . Therefore, for any j > N we have:
| 1/j - 0 | = | 1/j | < 1/N <
whenever j > N. But this is precisely the definition of the sequence {1/j} converging to
zero.
While it looks like this prove is easy, it is a good indication for ' arguments' that will appear again and again. In most of those cases the
proper choice of N will make it appear as if the proof works like magic.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.1.3(b):

The sequence

does not converge.


Back

You browser is not 'Java-enabled' ...


Note that
= {-1, 1, -1, 1, -1, 1, ...}. While this sequence does
exhibit a definite pattern, it does not get close to any one number, i.e. it
does not seem to have a limit. Of course we must prove this statement,
so we will use a proof by contradiction.
Suppose that the sequence did converge to a limit L. Then, for = 1/2
there exists a positive integer N such that
| (-1) n- L | < 1/2
for all n > N. But then, for some n > N, we have the inequality:
2 = | (-1) n + 1 - (-1) n | = | ((-1) n + 1 - L) + (L - (-1) n ) |
| (-1) n + 1 - L | + | (-1) n - L | < 1/2 + 1/2 = 1
for n > N, which is a contradiction since it says that 2 < 1, which is not true.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.1.3(c):
The sequence

converges to zero.
Back

You browser is not 'Java-enabled' ...


{ n / 2n } = { 0, 1/2, 1/2, 3/8, 1/4, 5/32, ...}. It is not clear, but it seems
as if the terms get smaller and smaller. Indeed this is the case, and we
will prove it:
First, we can use induction to show that
n2 2n
for n > 3. But then we have that
n2 / 2n 1
or equivalently
n / 2n 1/n
for n > 3. But now you should be able to finish the proof yourself. As a hint, for a given ,
choose

N = max{3, 1/ }
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.1.4: Convergent Sequences are Bounded


Let
be a convergent sequence. Then the sequence is bounded, and
the limit is unique.
Context

Proof:
Let's prove uniqueness first. Suppose the sequence has two limits, a and a'. Take any > 0.
Then there is an integer N such that:
| aj - a | <
if j > N. Also, there is another integer N' such that
| aj - a' | <
if j > N'. Then, by the triangle inequality:
| a - a' | < | a - aj + aj - a' |
|aj - a | + | aj - a' |
< + =2
if j > max{N,N'}. Hence | a - a' | < 2 for any > 0. But that implies that a = a', so that the
limit is indeed unique.
Next, we prove boundedness. Since the sequence converges, we can
take, for example, = 1. Then
| aj - a | < 1
if j > N. Fix that number N. We have that
| aj | | aj - a | + | a | < 1 + |a|
for all j > N. Define
M = max{|a1|, |a2|, ...., |aN|, (1 + |a|)}
Then | aj | < M for all j, i.e. the sequence is bounded as required.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.1.5:

The Fibonacci numbers are recursively defined as x1 = 1, x2 = 1, and for all


n > 2 we set xn = xn - 2 + xn - 1. The sequence of Fibonacci numbers {1, 1, 2, 3,
5, ...} does not converge.
Back

We will show by induction that the sequence of Fibonacci numbers is unbounded. If that is
true, then the sequence can not converge, because every convergent sequence must be
bounded.
As for the induction process: The first terms of the Fibonacci numbers
are
{1, 1, 2, 3, 5, 8, 13, 21, ...}
We will show that the n-th term of that sequence is greater or equal to n, at least for n > 4.
Property Q(n):
xn n for all n > 4
Check Q(5) (the lowest term):
x5 = x4 + x3 = 3 + 2 = 5 5 is true.
Assume Q(n) true:
xn n for all n > 4
Check Q(n+1):
xn + 1 = xn + xn - 1 n + xn - 1 n + 1 n
Hence, by induction the Fibonacci numbers are unbounded and the sequence can not
converge.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.1.6: Algebra on Convergent Sequences


Suppose
and
are converging to a and b, respectively. Then
1. Their sum is convergent to a + b, and the sequences can be added
term by term.
2. Their product is convergent to a * b, and the sequences can be
multiplied term by term.
3. Their quotient is convergent to a / b, provide that b # 0, and the
sequences can be divided term by term (if the denominators are not
zero).
4. If an bn for all n, then a b
Context

Proof:
The proofs of these statements involve the triangle inequality, as well as an occasional trick
of adding and subtracting zero, in a suitable form. A proof of the first statement, for
example, goes as follows.
Take any > 0. We know that an a, which implies that there exists an
integer N1 such that
| an - a | < / 2

if n > N1. Similarly, since bn

b there exists another integer N2 such that

| bn - b | < / 2
if n > N2. But then we know that
| (an + bn) - (a + b) | = | (an - a) + (bn - b) |
| an - a | + | b n - b |
< /2 + /2 =
if n > max(N1, N2), which proves the first statement.
Proving the second statement is similar, with some added tricks. We
know that { bn } converges, therefore there exists an integer N1 such
that
| bn | < |b| + 1
if n > N1. We also know that we can find integers N2 and N3 so that
| an - a | < / (|b| + 1)
if n > N2, and
| bn - b | < / (|a| + 1)
if n > N3, because |a| and |b| are some fixed numbers. But then we have:
| an bn - a b | = | an bn - a bn + a bn - a b |
= | bn(an - a) + a (bn - b) |
| bn| |an - a | + | a | | bn - b |
< (| b | + 1) / (|b| + 1) + | a | / (|a| +1) < 2
if n > max(N1, N2, N3), which proves the second statement.
The proof of the third statement is similar, so we will leave it as an
exercise.
The last statement does require a new trick: we will use a proof by
contradiction to get that result:
Assume that an bn for all n, but a > b.
We now need to work out the contradiction: the idea is that since a > b there is some
number c such that b < c < a.
<----------[b]-------[a]-------->
<----------[b]--[c]--[a]-------->

Since an converges to a, we can make the terms of the sequence fall between c and a, and
the terms of bn between b and c. But then we no longer have that an bn, which is our
contradiction. Now let's formalize this idea:
Let c = (a + b)/2. Then clearly b < c < a (verify!). Choose N1 such that
bn < c if n > N1. That works because b < c. Also choose N2 such that an
> c if n > N2. But now we have that
bn < c < an

for n > max(N1, N2). That is a contradiction to the original assumption that an bn for all n.
Hence it can not be true that a > b, so that the statement is indeed proved.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.8(a):
Is the sequence

monotone increasing or decreasing ?


Back

One can start to investigate this statement without having to suspect the correct answer. We
will simply compare the quotient of two consecutive terms to check whether the answer is
greater or less than one:
(1/n) / (1/ (n+1) ) = (n+1) / n > 1
Hence, the n-th term of the sequence divided by the (n+1) term is always greater than 1, or,
in other words, the n-th term is greater than the (n+1)-th term.
That is the definition of a decreasing sequence so that the sequence is
decreasing. Checking a graphical representation of this sequence
confirms that.

You browser is not 'Java-enabled' ...


Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.8(b):
Is the sequence

monotone increasing or decreasing ?


Back

We will again not guess what the correct answer might be ahead of time. We will instead
look at the difference between two consecutive terms and see if that comes out greater or
less than zero.

This means that the n-th term minus the (n+1)-th term of the sequence is less than 0, so
that the n-th term is less than then (n+1)-th term.
That means, by definition, that the sequence is increasing. Checking a
graphical representation of this sequence confirms that.

You browser is not 'Java-enabled' ...


Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.8(c):
Is it true that a bounded sequence converges ? How about monotone
increasing sequences ?
Back

Both statements are false. As a counter-example to the first statement, consider the
sequence
{ (-1) j }
Each term of this sequence is bounded by -1 or +1, so that the sequence is indeed bounded.
But, as we have seen before, the sequence does not converge.
As for the second statement, consider the simple sequence {n}, i.e. the
sequence consisting of the numbers {1, 2, 3, 4, ...}. It is obviously
increasing, but does not converge to a finite number.
It does, however, get closer and closer to infinity, but we do not, at this
time, consider this convergent.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.8(c):
Is it true that a bounded sequence converges ? How about monotone
increasing sequences ?
Back

Both statements are false. As a counter-example to the first statement, consider the
sequence
{ (-1) j }
Each term of this sequence is bounded by -1 or +1, so that the sequence is indeed bounded.
But, as we have seen before, the sequence does not converge.
As for the second statement, consider the simple sequence {n}, i.e. the
sequence consisting of the numbers {1, 2, 3, 4, ...}. It is obviously
increasing, but does not converge to a finite number.
It does, however, get closer and closer to infinity, but we do not, at this
time, consider this convergent.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.8(c):
Is it true that a bounded sequence converges ? How about monotone
increasing sequences ?
Back

Both statements are false. As a counter-example to the first statement, consider the
sequence
{ (-1) j }

Each term of this sequence is bounded by -1 or +1, so that the sequence is indeed bounded.
But, as we have seen before, the sequence does not converge.
As for the second statement, consider the simple sequence {n}, i.e. the
sequence consisting of the numbers {1, 2, 3, 4, ...}. It is obviously
increasing, but does not converge to a finite number.
It does, however, get closer and closer to infinity, but we do not, at this
time, consider this convergent.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.1.9: Monotone Sequences


If
is a monotone increasing sequence that is bounded above, then the
sequence must converge (see picture).

If
is a monotone decreasing sequence that is bounded below, then
the sequence must converge (see picture).

Context

Proof:
Let's look at the first statement, i.e. the sequence in monotone increasing. Take an > 0
and let c = sup(xk). Then c is finite, and given > 0, there exists at least one integer N such
that xN > c - . Since the sequence is monotone increasing, we then have that
xk > c for all k > N, or
| c - xk | <
for all k > N. But that means, by definition, that the sequence converges to c.
The proof for the infimum is very similar, and is left as an exercise.

Interactive Real Analysis, ver. 1.9.3


(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.10(a):
The sequences

and

both converge.
Back

First, let us consider the sequence

. It is decreasing because:

( 1/n ) - (1 / (n+1) ) > 0


Also, the sequence is bounded below by 0, because each term is positive. Hence, the
sequence must converge.
Note that this does not tell us the actual limit. But we have proved
before that this sequence converges to 0.
Next, we consider the sequence
because

. This sequence is increasing

n / (n+1) - (n+1) / (n + 2) < 0


The sequence is also bounded above by 1, because n < n + 1 so that
n / (n + 1) < 1
Hence, the sequence must converge.
Note that this does not tell us what the limit of the sequence is.
However, the limit is equal to 1, as you can easily prove yourself.

You browser is not 'Java-enabled' ...


Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.10(b):
Define x1 = b and let xn = xn - 1 / 2 for all n > 1. Then this sequence converges
for any number b.
Back

The proof is very easy using the theorem on monotone, bounded sequences:
b > 0: the sequence is decreasing and bounded below by 0.
b < 0: the sequence is increasing and bounded above by 0
b = 0: the sequence is constantly equal to zero
In either case the sequence converges. As to finding the actual limit, we proceed as
follows: we already know that the limit exists. Call that limit L. Then we have:
lim xn = L = lim xn + 1

But then we have that


L = lim xn + 1 = lim xn / 2 = 1/2 lim xn = 1/2 L
so that we have the equation for the unknown limit L:
L = 1/2 L
Therefore, the limit must be zero.
This proof illustrates the advantage of knowing that a sequence
converges. Based on that fact it was easy to determine the actual limit
of this recursively defined sequence. On the other hand, it would be
very difficult to try to establish convergence based on the original
definition of a convergent sequence.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.10(c): Computing Square Roots


Let a > 0 and x0 > 0 and define the recursive sequence
xn+1 = 1/2 (xn + a / xn)
Show that this sequence converges to the square root of a regardless of the
starting point x0 > 0.
Back

Before giving the proof, let's see how this recursive sequence can be used to compute a
square root very efficiently. Let's say we want to compute
. Let's start with x0 = 2. Note
that the 'true' value of
with 12 digits after the period is 1.414213562373. Our recursive
sequence would approximate this value quickly as follows:
Term
Exact Value
Approximate Value
x0

2
1

x1 = /2 (x0 + / x0) 1/2 (2 + 2/2) = 3/2

1.5

x2 = 1/2 (x1 + a / x1) 1/2 (3/2 + 2/3/2) = 17/12


1

17

x3 = /2 (x2 + / x2) /2 ( /12 + /17/12) =

1.416666667

577

/408

x4 = 1/2 (x3 + a / x3) 1/2 (577/408 + 2/577/408) = 665857/470832

1.414215686
1.414213562375

After only 4 steps our sequence has approximated the 'true' value of
with 11 digits accuracy.
Now that we have seen the usefulness of this particular recursive
sequence we need to prove that it really does converge to the square
root of a.
First, note that xn > 0 for all n so that the sequence is bounded below.
Next, let's see if the sequence is monotone decreasing, in which case it
would have to converge to some limit. Compute

xn - xn+1 = xn - 1/2 (xn - a / xn) = 1/2 (xn2 - a) / xn


Now let's take a look at xn2 - a:
xn2 - a = 1/4 (xn-1 + a / xn-1)2 - a
= 1/4 xn-12 + 1/2 a + 1/4 a2 / xn-12 - a
= 1/4 xn-12 - 1/2 a + 1/4 a2 / xn-12
= 1/4 (xn-1 - a / xn-1)2
0
But that means that xn - xn+1 0, or equivalently xn xn+1. Hence, the sequence is monotone
decreasing and bounded below by 0 so it must converge.
We now know that
following:
(*) L =

xn =

xn = L. To find that limit, we could try the

xn+1 =

/2 (xn + a/xn) = 1/2 (L + a / L)

Solving the equation L = 1/2 (L + a / L) gives


2 L2 = L 2 + a
or equivalently
L2 = a
which means that the limit L is indeed the square root of a, as required.
However, our proof contains one small caveat. In order to take the limit
inside the fraction in equation (*) we need to know that L is not zero
before we can write down equation (*). We already know that xn is
bounded below by zero, but that is not good enough to exclude the
possibility of L = 0. But we have already shown that
xn2 - a = 1/4 (xn-1 - a / xn-1)2 0
so that xn2 a. That implies that the limit of the sequence (which we already know exists) is
strictly positive since a > 0. Therefore equation (*) is justified and we have completed the
proof.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 3.1.11: The Pinching Theorem


Suppose {aj} and {cj} are two convergent sequences such that lim aj = lim cj
= L. If a sequence {bj} has the property that
aj bj cj
for all j, then the sequence {bj} converges and lim bj = L.
Context

Proof:
The statement of the theorem is easiest to memorize by looking at a diagram:

All bj are between aj and cj, and since aj and cj converge to the same limit L the bj have no
choice but to also converge to L.
Of course this is not a formal proof, so here we go: we want to show that
given any > 0 there exists an integer N such that | bj - L | < if j > N.
We know that
aj bj cj
Subtracting L from these inequalities gives:
aj - L bj - L cj - L
But there exists an integer N1 such that | aj - L | < or equivalently
- < aj - L <
and another integer N2 such that | cj - L | < or equivalently
- < cj - L <
if j > max(N1, N2). Taking these inequalities together we get:
- < aj - L bj - L cj - L <
But that means that
- < bj - L <
or equivalently | bj - L | < as long as j > max(N1, N2). But that means that {bj} converges
to L, as required.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.1.12:
Show that the sequence sin(n) / n and cos(n) / n both converge to zero.
Back

This might seem difficult because trig functions such as sin and cos are
often tricky. However, using the Pinching theorem the proof will be very
easy.
We know that | sin(x) | 1 for all x. Therefore

-1 sin(n) 1
for all n. But then we also know that:
-1/n sin(n)/n 1/n
The sequences {1/n} and -1/n both converge to zero so that the Pinching theorem applies
and the term in the middle must also converge to zero.
To prove the statement involving the cos is similar and left as an
exercise.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 3.2.2: Completeness Theorem in R


Let
bounded.

be a Cauchy sequence of real numbers. Then the sequence is

Let
be a sequence of real numbers. The sequence is
Cauchy if and only if it converges to some limit a.
Context

Proof:
The proof of the first statement follows closely the proof of the corresponding result for
convergent sequences. Can you do it ?
To prove the second, more important statement, we have to prove two
parts:
First, assume that the sequence converges to some limit a. Take any >
0. There exists an integer N such that if j > N then | aj - a | < /2. Hence:
| aj - ak | | aj - a | + | a - ak| < 2 / 2 =
if j, k > N. Thus, the sequence is Cauchy.
Second, assume that the sequence is Cauchy (this direction is much
harder). Define the set
S = {x

R: x < aj for all j except for finitely many}

Since the sequence is bounded (by part one of the theorem), say by a constant M, we know
that every term in the sequence is bigger than -M. Therefore -M is contained in S. Also,
every term of the sequence is smaller than M, so that S is bounded by M. Hence, S is a
non-empty, bounded subset of the real numbers, and by the least upper bound property it
has a well-defined, unique least upper bound. Let
a = sup(S)
We will now show that this a is indeed the limit of the sequence. Take any > 0 , and
choose an integer N > 0 such that

| a j - ak | < / 2
if j, k > N. In particular, we have:
| a j - aN + 1 | < / 2
if j > N, or equivalently
- / 2 < a j - aN + 1 < / 2
Hence we have:
aj > aN + 1 - / 2
for j > N. Thus, aN + 1 - / 2 is in the set S, and we have that
a aN + 1 - / 2
It also follows that
aj < aN + 1 < / 2
for j > N. Thus, aN + 1 < / 2 is not in the set S, and therefore
a aN + 1 < / 2
But now, combining the last several line, we have that:
|a - aN + 1 | < / 2
and together with the above that results in the following:
| a - aj | < |a - aN + 1 | + | aN + 1 - aj | < 2 / 2 =
for any j > N.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.3.2(a):
Take the sequence
. Extract every other member, starting with the
first. Then do the same, starting with the second.
Back

The sequence in question is


= {-1, 1, -1, 1, -1, 1, ...}
If we extract every second number, starting with the first, we get:
{-1, -1, -1, -1, ...}

This subsequence now converges to -1.


If we extract every second number, starting with the second, we get:
{1, 1, 1, 1, 1, ...}
This subsequence now converges to -1.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.3.2(b):
Take the sequence
. Extract three different subsequences of your
choice and look at the convergence behavior of these subsequences.
Back

The sequence in question is:

= {1, 1/2, 1/3, 1/4, 1/5, 1/6, ... }


which converges to zero. Now let us extract some subsequences:
Extracting the even terms yields the subsequence
{1/2, 1/4, 1/6, 1/8, 1/10, ...}
which converges to zero (prove it !).
Extracting the odd terms yields the subsequence
{1, 1/3, 1/5, 1/7, 1/9, ...}
which converges to zero (prove it !).
Extracting every third member yields the sequence
{1, 1/4, 1/7, 1/10, 1/13, ...}
which converges to zero (prove it !).
Hence, all three subsequences converge to zero. This is an illustration of
a general result: if a sequence converges to a limit L then every
subsequence extracted from it will also converge to that limit L.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.3.3: Subsequences from Convergent


Sequence
If
is a convergent sequence, then every subsequence of that
sequence converges to the same limit

If
is a sequence such that every possible subsequence
extracted from that sequences converge to the same limit,
then the original sequence also converges to that limit.
Context

Proof:
The first statement is easy to prove: Suppose the original sequence {aj} converges to some
limit L. Take any sequence nj of the natural numbers and consider the corresponding
subsequence of the original sequence For any > 0 there exists an integer N such that
| an - L | <
as long as n > N. But then we also have the same inequality for the subsequence as long as
nj > N. Therefore any subsequence must converge to the same limit L.
The second statement is just as easy. Suppose {aj} is a sequence such
that every subsequence extracted from it converges to the same limit L.
Now take any > 0. Extract from the original sequence every other
element, starting with the first. The resulting subsequence converges to
L by assumption, i.e. there exists an integer N such that
| aj - L | <
where j is odd and j > N. Now extract every other element, starting with the second. The
resulting subsequence again converges to L, so that
| aj - L | <
where j is even and j > N. But now we take any j, even or odd, and assume that j > N
if j is odd, then | aj - L | < because aj is part of the first subsequence
if j is even, then | aj - L | < because aj is part of the second subsequence
Hence, the original sequence must also converge to L.
Note that we can see from the proof that if the "even" and "odd"
subsequence of a sequence converge to the same limit L, then the full
sequence must also converge to L. It is not enough to just say that the
"even" and "odd" subsequence simply converge, they must converge to
the same limit.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Theorem 3.3.4: Bolzano-Weierstrass


Let

be a sequence of real numbers that is bounded. Then there exists

a subsequence

that converges.

Context

Proof:
Since the sequence is bounded, there exists a number M such that | aj | < M for all j. Then:
either [-M, 0] or [0, M] contains infinitely many elements of the sequence
Say that [0, M] does. Choose one of them, and call it
either [0, M/2] or [M/2, M] contains infinitely many elements of the
(original) sequence.
Say it is [0, M/2]. Choose one of them, and call it
either [0, M/4] or [M/4, M/2] contains infinitely many elements of the
(original) sequence
This time, say it is [M/4, M/2]. Pick one of them and call it
Keep on going in this way, halving each interval from the previous step
at the next step, and choosing one element from that new interval. Here
is what we get:

| < M, because both are in [0, M]

| < M / 2, because both are in [0, M/2]

| < M / 4, because both are in [M/2, M/4]

and in general, we see that


|

| < M / 2k-1

because both are in an interval of length M / 2k-1.


So, this proves that consecutive elements of this subsequence are close together. That is not
enough, however, to say that the sequence is Cauchy, since for that not only consecutive
elements must be close together, but all elements must get close to each other eventually.
So: take any > 0, and pick an integer N such that ???...??? (This trick is
often used: first, do some calculation, then decide what the best choice
for N should be. Right now, we have no way of knowing a good choice).
Pretending, however, that we knew this choice of N, we continue the
proof. For any k, m > N (with m > k) we have:

Now we can see the choice for N: we want to make is so large, such that whenever k, m >
N, the difference between the members of the subsequence is less than the prescribed .
What is therefore the right choice for N to finish the proof ?
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.3.5(a):
The sequence
subsequence.

does not converge, but we can extract a convergent


Back

Since | sin(x) | < 4, the sequence is clearly bounded above and below (the sequence is also,
of course, bounded by 1).
Therefore, using the Bolzano-Weierstrass theorem, there exists a
convergent subsequence.
However, it is nearly impossible to actually list this subsequence. The
Bolzano-Weierstrass theorem does guaranty the existence of that
subsequence, but says nothing about how to obtain it.
The original sequence { sin(j) }, incidentally, does not converge. The
proof of this is not so easy, but if we assume that the second part of this
example has been proved, it would be easy. Remember that the second
part of this example states that given any number L with |L| < 1 there
exists a subsequence of
that converges to L. If that was true
the original sequence can not converge, because otherwise all its
subsequences would have to converge to the same limit.
Of course this proof is only valid if this - more complicated - statement
can be proved.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.3.5(b):
Given any number L between -1 and 1, it is possible to extract a
subsequence from the sequence

that converges to L.

Examples 3.4.2(a):

What is inf, sup, lim inf and lim sup for

?
Back

Clearly, the infimum of the sequence is -1, and the supremum is +1. To find lim inf and lim
sup, we will first find the sequence of numbers Aj and Bj mentioned in the definition.
Let's find the numbers Aj = inf{aj, aj + 1, aj + 2, ...} for the sequence {-1, 1,
-1, 1, ...}.

A1 = inf{-1, 1, -1, 1, ...} = -1


A2= inf{1, -1, 1, -1, ... } = -1

and so on. Therefore, it is clear that


lim inf

= -1

Similarly, we find the numbers Bj = sup{aj, aj + 1, Aj + 2, ...}:


B1 = sup{-1, 1, -1, 1, ...} = 1
B2 = sup{1, -1, 1, -1, ...} = 1
and so on. Therefore, it is clear that
lim sup

=1
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth
Back

For this statement the Bolzano-Weierstrass theorem does not help at all. We need to
proceed differently - but not at this time, sorry.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.4.2(b):
What is inf, sup, lim inf and lim sup for

?
Back

Since this sequence is {1, 1/2, 1/3, 1/4, ...} the infimum is zero, while the supremum is 1.
As for lim inf and lim sup, we find first the sequence of numbers Aj and Bj mentioned in the
definition.
A1 = inf{1, 1/2, 1/3, 1/4, ...} = 0
A2 = inf{1/2, 1/3, 1/4, 1/5, ...} = 0
A3 = inf(1/3, 1/4, 1/5, 1/6, ...} = 0

and so on. Therefore, it is clear that

lim inf

=0

Similarly, we find the numbers Bj= sup{aj, aj + 1, aj + 2, ...}:


B1 = sup{1, 1/2, 1/3, 1/4, ...} = 1
B2 = sup{1/2, 1/3, 1/4, 1/5, ...} = 1/2
B3 = sup(1/3, 1/4, 1/5, 1/6, ...} = 1/3
and so on. Therefore, it is clear that

lim sup

=0
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Examples 3.4.2(c):
What is inf, sup, lim inf and lim sup for
Back

This sequence is {-1, 2, -3, 4, -5, 6, -7, ...}. You can quickly check, by looking at the
definition of lim inf and lim sup and working out the numbers Aj and Bj that:
inf { (-1) j j } = lim inf { (-1) j j } = sup{ (-1) j j } =
lim sup{ (-1) j j } =
Note that while lim sup and lim inf are not real numbers, they are uniquely defined as plus
or minus infinity. The limit of the original sequence, on the other hand, does not exist at all.
Hence, there is a difference between a limit not existing, and a limit that
approaches infinity. In the latter sense, lim inf and lim sup will always
exist, which is their most useful property.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.4.3: Lim inf and Lim sup exist


lim sup and lim inf always exist (possibly infinite) for any sequence of real
numbers.
Context

Proof:
The sequence
Aj = inf{aj , aj + 1 , aj + 2 , ...}

is monotone increasing (which you should prove yourself). Hence, lim inf exists (possibly
positive infinity).
The sequence
Bj = sup{aj , aj + 1 , aj + 2 , ...}
is monotone decreasing (which you should prove yourself). Hence, lim sup exists (possibly
negative infinity).
Here we have to allow for a limit to be positive or negative infinity,
which is different from saying that a limit does not exist.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.4.4: Characterizing lim sup and lim inf


Let
be an arbitrary sequence, and let c = lim sup aj and d = lim inf
aj. Then
1. there is a subsequence converging to c
2. there is a subsequence converging to d
3. d lim inf

lim sup

c for any subsequence {

If c and d are both finite, then: given any > 0 there are arbitrary large j
such that aj > c - and arbitrary large k such that ak < d + .

Context

Proof:
First let's assume that c = lim sup{aj} is finite, which implies that the sequence {aj} is
bounded. Recall the properties of the sup (and inf) for sequences:
If a sequence is bounded above, then given any > 0 there exists at least
one integer k such that ak > c Now take any > 0. Then
Ak = sup{ak, ak+1, ...}
so by the above property there exists an integer jk > k such that
Ak >
or equivalently

> Ak - / 2

| Ak -

|< /2

We also have by definition that Ak converges to c so that there exists an integer N such that
| Ak - c | < / 2
But now the subsequence {
|

-c|=|

} is the desired one, because:

- Ak + Ak - c |

|
- Ak | + | Ak - c |
< /2+ /2= /2
if jk > N. Hence, this particular subsequence of {an} converges to c.
The proof to find a subsequence converging to the lim inf is similar and
is left as an exercise.
Statement (3) is pretty simple to prove: For any sequence we always
have that
inf{ak, ak+1, ... } sup{ak, ak+1, ... }
Taking limits on both sides gives lim inf(an) lim sup(an) for any sequence, so it is true in
particular for any subsequence.
Next take any subsequence of {an}. Then:
inf(ak, ak+1, ...) inf(

, ...)

because an infimum over more numbers (on the left side) is less than or equal to an
infimum over fewer numbers (on the right side). But then
d lim inf(

The proof of the inequality lim sup(


shown that
d lim inf

lim sup

) c is similar. Taking all pieces together we have

for any subsequence {


}, as we set out to do.
It remains to show that given any > 0 there are arbitrary large j such
that aj > c - (as well as the corresponding statement for the lim inf d).
But previously we have found a subsequence {
so that there exists an integer N such that
|

} that converges to c

-c|<

if k > N. But that means that - <


c- <

<c+

- c < which implies that

as long as k > N. But that of course means that there are arbitrarily large indices - namely
those jk for which k > N - with the property that
> c - as required. Hence, we have
shown the last statement involving the lim sup, and a similar proof would work for the lim
inf.
All our proofs rely on the fact that the lim sup and lim inf are bounded. It
is not hard to adjust them for unbounded values, but we will leave the
details as an exercise.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Example 3.4.5:
If
is the sequence of all rational numbers in the interval [0, 1],
enumerated in any way, find the lim sup and lim inf of that sequence.
Back

Since the numbers 1 and 0 are itself rational numbers, it is clear that
sup{ an } = 1 and inf{ an } = 0
Therefore, we already know that
0 lim inf{ an } lim sup{ an } 1
To find the lim sup, we will construct a subsequence that converges to 1:

there exists 0 < aj1 < 1 with 1 - aj1 < 1


there exists 0 < aj2 < 1 with 1 - aj2 < 1/2 and aj1 # aj2
there exists 0 < aj3 < 1 with 1 - aj3 < 1/3 and aj3 different from the
previous ones
and so on ...

These numbers exist because the rational numbers in the interval [0, 1] are
arbitrarily close to any real number in that interval, according to the Density
principle.
The subsequence { ajk } constructed in the above way
converges to 1. We already know that any limit of any
convergent subsequence must be less than or equal to 1.
Therefore, since the lim sup is the greatest limit of any
convergent subsequence, we have
lim sup { an } = 1
Similarly, we can extract a subsequence { ajk } that converges to 0. We also
know that every limit of any convergent subsequence must be greater or
equal to zero. Therefore, since the lim inf is the smallest possible limit of all
convergent subsequence, we have:
lim inf { an } = 0

Interactive Real Analysis, ver. 1.9.3


(c) 1994-2000, Bert G. Wachsmuth

Proposition 3.4.6: Lim sup, lim inf, and limit


If a sequence {aj} converges then
lim sup aj = lim inf aj = lim aj
Conversely, if lim sup aj = lim inf aj are both finite then {aj} converges.
Context

Proof:
Let c = lim sup aj. From before we know that there exists a subsequence of {aj} that
converges to c. But since the original sequence converges, every subsequence must
converge to the same limit. Hence
c = lim sup aj = lim aj
To prove that lim inf aj = c is similar.
The converse of this statement can be proved by noting that
Bj = inf(aj, aj+1, ...) aj sup(aj, aj+1, ...) = Aj
Noting that lim Bj = lim inf(aj) = lim sup(aj) = lim Aj we can apply the Pinching Theorem
to see that the terms in the middle must converge to the same value.
Interactive Real Analysis, ver. 1.9.3
(c) 1994-2000, Bert G. Wachsmuth

Definition 3.5.1: Power Sequence


Power Sequence: The convergence properties of the power sequence
depends on the size of the base a:
|a| < 1: the sequence converges to 0.
a = 1: the sequence converges to 1 (being constant)
a > 1: the series diverges to plus infinity
a -1: the series diverges
back

You browser is not 'Java-enabled' ...


Convergent Power series with a = -9/10
Divergent Power series with a = 11/10

Proof:
This seems an obvious statement: if a number is in absolute values less than one, it gets
smaller and smaller when raised to higher and higher powers. Proving something 'obvious',
however, is often difficult, because it may not be clear how to start. To prove the statement
we have to resort to one of the elementary properties of the real number system: the
Archimedian principle.
Case a > 1:
Take any real number K > 0 and define
x=a-1
Since a > 1 we know that x > 0. By the Archimedian principle there exists a
positive integer n such that nx > K - 1. Using Bernoulli's inequality for that n we
have:
an = (1 + x)n 1 + nx > 1 + (K - 1) = K
But since K was an arbitrary number, this proves that the sequence {an} is
unbounded. Hence it can not converge.
Case 0 < a < 1:
Take any ε > 0. Since 0 < a < 1 we know that 1/a > 1, so that by the previous proof
we can find an N with
(1/a)n > 1/ε for all n > N
But then it follows that
an < ε for all n > N
This proves that the sequence {an} converges to zero.
Case -1 < a < 0:
By the above proof we know that | an | converges to zero. But since -| an | ≤ an ≤ | an |
the sequence {an} again converges to zero by the Pinching Theorem.
Case a < -1:
Extract the subsequence {a2m} from the sequence {an}. Then this sequence diverges
to infinity by the first part of the proof, and therefore the original sequence can not
converge either.
Case a = 1:
This is the constant sequence, so it converges.
Case a = -1:
We have already proved that the sequence
{ (-1)n }
does not converge.
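As a quick plausibility check (a sketch only, not a substitute for the proof), the following Python snippet evaluates the power sequence for one representative base from each case above.

```python
# Evaluate a**n for one representative base a from each case of the proof.
for a in (0.9, -0.9, 1.0, -1.0, 1.1, -1.1):
    terms = [a ** n for n in (1, 5, 10, 50, 100)]
    print(a, [round(t, 6) for t in terms])
# |a| < 1: the terms shrink toward 0;           a = 1: the sequence is constant;
# a = -1: the terms oscillate between -1 and 1; |a| > 1: the magnitude is unbounded.
```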



Definition 3.5.2: Exponent Sequence


Exponent Sequence: The convergence of the sequence { n a } depends on the size of the
exponent a:
a > 0: the sequence diverges to positive infinity
a = 0: the sequence is constant
a < 0: the sequence converges to 0



Exponent sequence with a = 2
Exponent sequence with a = -2

Proof:
Write n a = e a ln(n). Then:
if a > 0 then as n approaches infinity, the function ea ln(n) approaches infinity as well
if a < 0 then as n approaches infinity, the function ea ln(n) approaches zero
if a = 0 then the sequence is the constant sequence, and hence convergent
Is this a good proof ? Of course not, because at this stage we know nothing about the
exponential or logarithm function. So, we should come up with a better proof.
But actually we first need to understand exactly what na really means:
If a is an integer, then clearly na means to multiply n with itself a times
If a = p/q is a rational number, then np/q means to multiply n p times, then take the
q-root
It is unclear, however, what na means if a is an irrational number, for example
an exponent such as the square root of 2.
One way to define na for all a is to resort to the exponential function:
na = ea ln(n)
In that light, the original 'proof' was not bad after all, but of course we now need to know
how exactly the exponential function is defined, and what its properties are before we can
continue. As it turns out, the exponential function is not easy to define properly. Generally
one can either define it as the inverse function of the ln function, or via a power series.
Another way to define na for irrational a is to take a sequence of
rational numbers rn converging to a and to define na as the limit of the
sequence {nrn}. There the problem is to show that this is well-defined,
i.e. if there are two sequences of rational numbers, the resulting limit
will be the same.
In either case, we will base our proof on the simple fact:
if p > 0 and x > y > 0 then xp > yp
which seems clear enough and could be formally proved as soon as the exponential
function is introduced to properly define xp.
Now take any positive number K and let n be an integer bigger than K1/a.
Since
n > K1/a

we can raise both sides to the a-th power to get


na > K
which means - since K was arbitrary - that the sequence {na} is unbounded.
The case a = 0 is clear (since n0 = 1). The second case of a < 0 is
related to the first by taking reciprocals (details are left as an exercise).
Since we have already proved the first case we are done.

Definition 3.5.3: Root of n Sequence


Root of n Sequence: The sequence { n 1/n } converges to 1.



Root of n sequence

Proof:
If n > 1, then n 1/n > 1. Therefore, we can find numbers an > 0 such that
n 1/n = 1 + an for each n > 1
Hence, we can raise both sides to the n-th power and use the Binomial
theorem:
n = (1 + an) n = 1 + n an + n(n-1)/2 an 2 + ... + an n
In particular, since all terms are positive, we obtain
n ≥ n(n-1)/2 an 2
Solving this for an we obtain
0 ≤ an ≤ sqrt( 2 / (n-1) )
But that implies that an converges to zero as n approaches infinity, which
means, by the definition of an, that n 1/n converges to 1 as n goes to infinity.
That is what we wanted to prove.
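The estimate above can also be checked numerically; the short Python sketch below (an illustration, not part of the proof) prints n 1/n next to the bound 1 + sqrt( 2 / (n-1) ) obtained from the binomial theorem.

```python
import math

# Compare the root-of-n sequence with the upper bound 1 + sqrt(2/(n-1)) used above.
for n in (2, 10, 100, 10_000, 1_000_000):
    print(n, n ** (1 / n), 1 + math.sqrt(2 / (n - 1)))
# Both columns approach 1, and n**(1/n) always stays below the bound.
```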


Definition 3.5.4: n-th Root Sequence


n-th Root Sequence: The sequence { a 1/n } converges to 1 for any a > 0.



n-th Root sequence with a = 3

Proof:
Case a > 1:
If a > 1, then for n large enough we have 1 < a < n. Taking n-th roots on both sides we
obtain
1 < a 1/n < n 1/n
But the right-hand side approaches 1 as n goes to infinity by our statement of the
root-n sequence. Then the sequence { a 1/n } must also approach 1, being squeezed
between values approaching 1 on both sides (Pinching theorem).
Case 0 < a < 1:
If 0 < a < 1, then (1/a) > 1. Using the first part of this proof, the reciprocal of the
sequence { a 1/n } must converge to one, which implies the same for the original
sequence.
Incidentally, if a = 0 then we are dealing with the constant sequence, and the limit is of
course equal to 0.

Definition 3.5.5: Binomial Sequence

Binomial Sequence: If b > 1 then the sequence { n k / b n } converges to zero
for any positive integer k. In fact, this is still true if k is replaced by any real
number.



Binomial sequence with k = 2 and b = 1.3

Proof:
Note that both numerator and denominator tend to infinity. Our goal will be to show that
the denominator grows faster than the k-th power of n, thereby 'winning' the race to infinity
and forcing the whole expression to tend to zero.
The name of this sequence indicates that we might try to use the
binomial theorem for this. Indeed, define x such that
b=1+x
Since b > 1 we know that x > 0. Therefore, each term in the binomial theorem is positive,
and we can use the (k+1)-st term of that theorem to estimate:

for any n ≥ k + 1. Let n = 2k + 1, or equivalently, k = (n-1)/2. Then n - k = n - (n-1)/2 =


(n+1)/2 > n/2, so that each of the expressions n, n-1, n-2, ..., n - k is greater than n/2.
Hence, we have that

But then, taking reciprocals, we have:

But this expression is true for all n > 2k + 1 as well, so that, with k fixed, we can take the
limit as n approaches infinity and the right-hand side will approach zero. Since the left-hand side is always greater than or equal to zero, the limit of the binomial sequence must
also be zero.
If k is replaced by any real number, see the exponent sequence to
find out how n k could be defined for rational and irrational values of k.
But perhaps some simple estimate will help if k is not an integer. Details
are left as an exercise.

Definition 3.5.6: Euler Sequence

Euler's Sequence: The sequence { (1 + 1/n) n } converges to e ~
2.71828182845904523536028747135... (Euler's number). This sequence
serves to define e.


Euler's sequence

Proof:
We will show that the sequence is monotone increasing and bounded above. If that was
true, then it must converge. Its limit, by definition, will be called e for Euler's number.
Euler's number e is irrational (in fact transcendental), and an
approximation of e to 30 decimals is e ~
2.71828182845904523536028747135.
First, we can use the binomial theorem to expand the expression

Similarly, we can replace n by n+1 in this expression to obtain

The first expression has (n+1) terms, the second expression has (n+2) terms. Each of the
first (n+1) terms of the second expression is greater than or equal to each of the (n+1)
terms of the first expression, because

But then the sequence is monotone increasing, because we have shown that

Next, we need to show that the sequence is bounded. Again, consider the expansion

1+

Now we need to estimate the expression

to finish the proof.

If we define Sn =

, then

so that, finally,

for all n.
But then, putting everything together, we have shown that
(1 + 1/n)n ≤ 1 + Sn ≤ 3
for all n. Hence, Euler's sequence is bounded by 3 for all n.


Therefore, since the sequence is monotone increasing and bounded, it
must converge. We already know that the limit is less than or equal to 3.
In fact, the limit is approximately equal to
2.71828182845904523536028747135
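A short numerical sketch (assuming ordinary floating point arithmetic suffices for an illustration) shows both properties used in the proof: the terms increase and never exceed 3.

```python
import math

# The Euler sequence (1 + 1/n)**n: monotone increasing and bounded above by 3.
for n in (1, 2, 5, 10, 100, 10_000, 1_000_000):
    print(n, (1 + 1 / n) ** n)
print("e =", math.e)   # the limit the sequence defines
```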

Definition 3.5.7: Exponential Sequence

Exponential Sequence: The sequence { (1 + x/n) n } converges to the exponential function
ex = exp(x) for any real number x.

Proof:
We will use a simple substitution to prove this. Let
x/n = 1/u, or equivalently, n = u x
Then we have
(1 + x/n)n = [ (1 + 1/u)u ]x
But the term inside the square brackets is Euler's sequence, which converges to Euler's
number e. Hence, the whole expression converges to ex, as required.
In fact, we have used a property relating to functions to make this proof
work correctly. What is that property ?

If we did not want to use functions, we could first prove the statement
for x being an integer. Then we could expand it to rational numbers, and
then, approximating x by rational number, we could prove the final
result.

Example 4.1.1: Zeno Paradox (Achilles and the Tortoise)


Achilles is racing against a tortoise. Achilles can run 10 meters per second,
the tortoise only 5 meters per second. The track is 100 meters long. Achilles,
being a fair sportsman, gives the tortoise a 10 meter advantage. Who will
win ?

Let us look at the difference between Achilles and the tortoise:

Time                                   Difference
t = 0                                  10 meters
t = 1                                  5 = 10 / 2 meters
t = 1 + 1/2                            2.5 = 10 / 4 meters
t = 1 + 1/2 + 1/4                      1.25 = 10 / 8 meters
t = 1 + 1/2 + 1/4 + 1/8                0.625 = 10 / 16 meters

and so on. In general we have:

Time                                   Difference
t = 1 + 1/2 + 1/2 2 + 1/2 3 + ... + 1/2 n     10 / 2 n meters
Now we want to take the limit as n goes to infinity to find out when the distance between
Achilles and the tortoise is zero. But that involves adding infinitely many numbers in the
above expression for the time, and we don't know how to do that. However, if we define
S n = 1 + 1 / 2 + 1 / 2 2 + 1 / 2 3 + ... + 1 / 2 n
then, dividing by 2 and subtracting the two expressions:
S n - 1/2 S n = 1 - 1 / 2 n+1
or equivalently, solving for S n:
S n = 2 ( 1 - 1 / 2 n+1)
But now S n is a simple sequence, for which we know how to take limits. In fact, from the
last expression it is clear that
lim S n = 2
as n approaches infinity. Hence, we have - mathematically correct - computed that Achilles
reaches the tortoise after exactly 2 seconds, and then, of course passes it and wins the race.
A much simpler calculation not involving infinitely many numbers gives
the same result:
Achilles runs 10 meters per second, so he covers 20 meters in 2 seconds

The tortoise runs 5 meters per second, and has an advantage of 10 meters.
Therefore, it also reaches the 20 meter mark after 2 seconds
Therefore, both are even after 2 seconds

Of course, Achilles will finish the race after 10 seconds, while the tortoise needs 18
seconds to finish, and Achilles will clearly win.
The problem with Zeno's paradox is that Zeno was uncomfortable with
adding infinitely many numbers. In fact, his basic argument was: if you
add infinitely many numbers then, no matter what those numbers are, you must get infinity.
If that were true, it would take Achilles infinitely
long to reach the tortoise, and he would lose the race. However,
reducing the infinite addition to the limit of a sequence, we have seen
that this argument is false.
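The catching-up time can also be mimicked in a few lines of Python (a sketch; the closed form is the one derived above).

```python
# Times S_n = 1 + 1/2 + ... + 1/2**n at which the gap has shrunk to 10/2**n meters,
# computed directly and via the closed form S_n = 2 (1 - 1/2**(n+1)).
for n in (1, 2, 5, 10, 20, 50):
    direct = sum(1 / 2 ** k for k in range(n + 1))
    closed = 2 * (1 - 1 / 2 ** (n + 1))
    print(n, direct, closed, "gap:", 10 / 2 ** n, "meters")
# Both expressions agree and approach 2 seconds, while the gap shrinks to 0.
```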

Examples 4.1.3(a):

The infinite series 1/2 + 1/4 + 1/8 + 1/16 + ... converges to 1
(this series is a special case of the geometric series).

The n-th partial sum for this series is defined as


S n = 1/2 + 1/2 2 + 1/2 3 + ... + 1/2 n



We need to find a closed form for this expression to be able to take the limit of the
sequence of partial sums.
If we divide the above expression by 2 and then subtract it from the
original one we get:
S n - 1/2 S n = 1/2 - 1/2 n+1
Hence, solving this for S n we obtain
S n = 2 (1/2 - 1/2 n+1)
This is now a sequence, and we can take the limit as n goes to infinity. By our result on the
power sequence, the term 1/2 n+1 goes to zero, so that
lim S n = 1
That proves, by definition, that the infinite series converges to the number 1.

Definition: Harmonic Series

The series 1 + 1/2 + 1/3 + 1/4 + 1/5 + ... is called the harmonic series.
It diverges to infinity.

Proof:
We need to estimate the n-th term in the sequence of partial sums.



The n-th partial sum for this series is:
S n = 1 + 1/2 + 1/3 + 1/4 + ... + 1/n
Now consider the following subsequence extracted from the sequence of partial sums:
S1=1
S 2 = 1 + 1/2
S 4 = 1 + 1/2 + (1/3 + 1/4)
    ≥ 1 + 1/2 + (1/4 + 1/4) = 1 + 1/2 + 1/2 = 1 + 2/2
S 8 = 1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8)
    ≥ 1 + 1/2 + (1/4 + 1/4) + (1/8 + 1/8 + 1/8 + 1/8) = 1 + 1/2 + 1/2 + 1/2
    = 1 + 3/2
In general, one can use induction (do it as an exercise) to show that
S 2k ≥ 1 + k / 2
for all k. Hence, the subsequence { S 2 k } extracted from the sequence of partial sums { S
N } is unbounded. But then the sequence { S N } can not converge either and must diverge to
infinity.
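The estimate S 2k ≥ 1 + k/2 is easy to watch numerically; the following Python sketch (an illustration only) compares the partial sums at the indices 2 k with that lower bound.

```python
# Partial sums of the harmonic series at n = 2**k, compared with the bound 1 + k/2.
for k in range(0, 21, 4):
    n = 2 ** k
    s = sum(1 / j for j in range(1, n + 1))
    print(k, n, round(s, 4), ">=", 1 + k / 2)
# The lower bound grows without bound, so the partial sums are unbounded as well.
```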

Examples 4.1.5(a):
The series -1 + 1 - 1 + 1 - ... (with terms (-1) n ) does not converge.

Consider the sequence of partial sums:


S n = -1 + 1 - 1 + 1 ... - 1 = -1
if n is odd, and

S n = -1 + 1 - 1 + 1 ... - 1 + 1 = 0
if n is even.



But then we have that
S n = -1 if n is odd and 0 if n is even
or, in other words, the sequence of partial sums is the same as the sequence
{ (-1) n }
This sequence diverges, as proved before. Hence, our sequence of partial sums - while
bounded - does not converge and therefore the series is divergent.

Examples 4.1.5(b):
The series

converges absolutely.


The series of absolute values is equal to

and this series converges, as shown before. It is in fact a special case of the geometric
series.
Hence, the series of absolute values converges, which by definition
means that our series converges absolutely. Since absolute convergence
implies 'regular' convergence, the series also converges in its original
form.

Definition: Alternating Harmonic Series

The series 1 - 1/2 + 1/3 - 1/4 + ... is called the Alternating Harmonic series. It converges
but not absolutely, i.e. it converges conditionally.

Proof:



There are many proofs of this fact. For example, the series of absolute values is a p-series
with p = 1, and diverges by the p-series test. The original series converges, because it is an
alternating series, and the alternating series test applies easily. However, here is a more
elementary proof of the convergence of the alternating harmonic series.
We already know that the series of absolute values does not converge
by a previous example. Hence, the series does not converge absolutely.
As for regular convergence, consider the even-indexed partial sums S 2n and the
odd-indexed partial sums S 2n+1. We have that
S 2n+2 - S 2n = 1 / (2n+1) - 1 / (2n+2) > 0
and
S 2n+3 - S 2n+1 = - 1 / (2n+2) + 1/ (2n+3) < 0
which means for the two subsequences
{ S 2n } is monotone increasing and { S 2n+1 } is monotone decreasing
For each sequence we can combine pairs to see that
S 2n ≤ 1 and S 2n+1 ≥ 0
for all n. Hence, both subsequences are monotone and bounded and must therefore be
convergent. Define their limits as
lim S 2n = L and lim S 2n+1 = M
Then
| M - L | = lim | S 2n+1 - S 2n | = lim 1 / (2n+1) = 0
Therefore, M = L, i.e. both subsequences converge to the same
limit. But this common limit is the same as the limit of the full sequence, because: given
any ε > 0 we have
there exists an integer N such that | L - S 2n | < ε if n > N
there exists an integer M such that | L - S 2n+1 | < ε if n > M
Now set K = max(N, M). Then, for the above ε > 0 we have
| L - S n | < ε
for n > K because n is either even or odd. Hence, the alternating harmonic series converges
conditionally.
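A small Python sketch illustrates the squeezing of the even and odd partial sums (the numerical limit ln 2 is a known value that is not needed for the argument above).

```python
import math

def S(n):
    # n-th partial sum of the alternating harmonic series 1 - 1/2 + 1/3 - ...
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (2, 4, 8, 100, 1000):
    print(n, S(n), S(n + 1))   # even-indexed sums increase, odd-indexed sums decrease
print("common limit is about", math.log(2))
```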


Theorem: Absolute Convergence implies Convergence

If a series converges absolutely, it converges in the ordinary sense.
The converse is not true.

Suppose the series is absolutely convergent. Let
Tj = |a1| + |a2| + ... + |aj|
be the sequence of partial sums of absolute values, and
Sj = a1 + a2 + ... + aj
be the "regular" sequence of partial sums. Since the series converges absolutely, given any
ε > 0 there exists an integer N such that:
| Tn - Tm | = |an| + |an-1| + ... + |am+2| + |am+1| < ε
if n > m > N. But we have by the triangle inequality that
| Sn - Sm | = | an + an-1 + ... + am+2 + am+1 |
≤ |an| + |an-1| + ... + |am+2| + |am+1| = | Tn - Tm | < ε
Hence the sequence of regular partial sums {Sn} is Cauchy and therefore must converge.
(Compare this proof with the Cauchy Criterion for Series).
The converse is not true because, for example, the alternating harmonic series
converges, but the corresponding series of absolute values (the harmonic series)
does not converge.


Theorem 4.1.6: Absolute Convergence and


Rearrangement

Let a series be absolutely convergent. Then any rearrangement of
terms in that series results in a new series that is also absolutely convergent
to the same limit.
Let a series be conditionally convergent. Then, for any
real number c there is a rearrangement of the series such
that the new resulting series will converge to c.

Proof:

Suppose the series is absolutely convergent. Then the sequence
Sn = |a1| + |a2| + ... + |an|
converges. In particular, it is bounded, i.e. |Sn| < K for some number K. If we take any
rearrangement of terms in the series and form a new sequence of partial sums
Tn (adding up the absolute values of the first n rearranged terms),
then Tn is again bounded by the same number K. But since all terms in the partial sum Tn
are positive the sequence is monotone increasing. Therefore {Tn} is monotone increasing
and bounded and must therefore converge.
It remains to show that the limit of the rearrangement is the same as
the limit of the original series. That is left as an exercise.
Finally suppose the series converges conditionally. Let's first collect
a few facts:
By the divergence test (which we will prove later) we know that the sequence of
general terms an converges to zero.
The series does not converge absolutely.
Since the sequence of partial sums of absolute values is increasing it means that the
series of absolute values must "converge" to positive infinity.
There must be infinitely many positive terms among the aj. If we call them bj and
collect them to form a series then that new series must "converge" to positive
infinity (why?).
There must be infinitely many negative terms among the aj. If we call them cj and
collect them to form a series then that new series must "converge" to negative
infinity (why?).
Now we can describe the idea of the proof, leaving the details as an exercise:
Collect enough bj so that they add up to a number just bigger than c
Add as many terms from cj as is necessary to make the resulting sum just less than
c
Add again terms from bj to be just bigger than c, then again terms from cj to be less
than c, and so on.
Then you can show that the resulting arrangement of bj's and cj's indeed
forms a series that converges to c. As usual, the details of this proof are
left as an exercise.

Examples 4.1.7(a): Rearranging the Alternating


Harmonic Series

Find a rearrangement of the alternating harmonic series that comes
within 0.001 of 2, i.e. show a concrete rearrangement of that series that
appears to converge to the number 2.

This is a simple numeric computation, involving some trial and error. First, let's collect
positive odd terms so that they add up to something larger than 2:
1 + 1/3 + ... + 1/15 = 2.021800422

Next we subtract the first negative term:


1 + 1/3 + ... + 1/15
- 1/2
= 1.521800422
Then we again add positive terms until we are larger than 2 again:
1 + 1/3 + ... + 1/15
- 1/2
+ 1/17 + 1/19 + ... + 1/41
= 2.004063454
Subtracting the next negative term gives:
1 + 1/3 + ... + 1/15
- 1/2
+ 1/17 + 1/19 + ... + 1/41
- 1/4
= 1.754063454
Again adding positive terms:
1 + 1/3 + ... + 1/15
- 1/2
+ 1/17 + 1/19 + ... + 1/41
- 1/4
+ 1/43 + 1/45 + ... + 1/69
= 2.009446048
Once again we subtract the next negative term:
1 + 1/3 + ... + 1/15
- 1/2
+ 1/17 + 1/19 + ... + 1/41
- 1/4
+ 1/43 + 1/45 + ... + 1/69
- 1/6
= 1.842779381
and add positive terms:
1 + 1/3 + ... + 1/15
- 1/2
+ 1/17 + 1/19 + ... + 1/41
- 1/4
+ 1/43 + 1/45 + ... + 1/69
- 1/6
+ 1/71 + 1/73 + ... + 1/95
= 2.000697893
Now we are within 0.001 of 2, and it seems clear how we can continue in this manner to
find a rearrangement of the alternating harmonic series that converges to 2.
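The same bookkeeping can be automated; the Python sketch below (the helper name is hypothetical, but the procedure is exactly the one carried out by hand above) greedily adds unused positive terms until the running sum exceeds the target and unused negative terms until it drops below it again.

```python
def rearranged_partial_sums(target, steps):
    # Greedy rearrangement of the alternating harmonic series toward 'target':
    # add positive terms 1, 1/3, 1/5, ... while at or below the target,
    # add negative terms -1/2, -1/4, ... while above it.
    s, pos, neg, sums = 0.0, 1, 2, []
    for _ in range(steps):
        if s <= target:
            s += 1.0 / pos
            pos += 2
        else:
            s -= 1.0 / neg
            neg += 2
        sums.append(s)
    return sums

print(rearranged_partial_sums(2.0, 2000)[-1])   # close to 2, as in the computation above
```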


Examples 4.1.7(b):

Find a rearrangement of the alternating harmonic series that
diverges to positive infinity.

Take a look at the following inequality:
2 n + j < 2 n+1
for j = 1, 3, ..., 2 n - 1 (which makes for 2 n-1

Theorem 4.1.8: Algebra on Series

Let  a n  and  b n  be two absolutely convergent series. Then:
1. The sum of the two series is again absolutely convergent. Its limit is
the sum of the limit of the two series.
2. The difference of the two series is again absolutely convergent. Its
limit is the difference of the limit of the two series.
3. The product of the two series is again absolutely convergent. Its limit
is the product of the limit of the two series (Cauchy Product ).

Proof:
The proof of the first statement is a simple application of the triangle inequality. Let
An = |a1| + |a2| + ... + |an|
and
Bn = |b1| + |b2| + ... + |bn|
Assume that An converges to A and Bn converges to B. Then
Sn = |a1 + b1| + |a2 + b2| + ... + |an + bn|
|a1| + |a2| + ... + |an| + |b1| + |b2| + ... + |bn|
A+B
Therefore the sequence {Sn} is bounded above by A + B. The sequence is also monotone
increasing so that it must converge to some limit. To find the limit, note that
| (a1 + b1) + (a2 + b2) + ... + (an + bn) - (A + B) |
≤ | a1 + a2 + ... + an - A | + | b1 + b2 + ... + bn - B |
< ε/2 + ε/2 = ε
if n is big enough. Therefore the sequence {Sn} converges to A + B, as required.

The proof for the difference of sums is similar. The proof for the Cauchy
product, on the other hand, is much more complicated and will be given
in the statement on the Cauchy Product.

Theorem 4.1.9: Cauchy Criteria for Series

The series converges if and only if for every ε > 0 there is a positive
integer N such that if m > n > N then
| a n+1 + a n+2 + ... + a m | < ε

Proof:
Suppose that the Cauchy criterion holds. Pick any ε > 0. Then, for m > n > N,
| S m - S n | = | a n+1 + a n+2 + ... + a m | < ε
But that means precisely that the sequence of partial sums { S n } is a Cauchy sequence,
and hence convergent.
Now suppose that the sum converges. Then, by definition, the sequence
of partial sums converges. In particular, that sequence must be a
Cauchy sequence: given any ε > 0, there is a positive integer N such that
whenever m > n > N we have that
| S m - S n | = | a n+1 + a n+2 + ... + a m | < ε

But that, in turn, means that the Cauchy criterion for series holds.

Examples 4.2.2:
Does the Divergence test apply to show that the series with terms (-1) n converges or
diverges ? How about the series with terms 1/n ?

The first series diverges: to apply the divergence test, we have to consider the sequence
{ (-1) n }
which we have shown to diverge. In particular, the sequence does not converge to zero.
Hence, the above series can not converge.
One can easily show directly that this series does not converge by
looking at the sequence of partial sums.
For the second series, the divergence test does not apply. In order to
use the divergence test, we need to check the limit of the sequence
{ 1/n }
But this limit is indeed zero. Therefore, the divergence test does not give any information
about this series.
In fact, this series is the harmonic series, and it can be shown to
diverge.

Examples 4.2.4(a):
The series with terms sin(n) / 2 n converges.


Because | sin(x) | ≤ 1 for all real numbers x, we have that
| 1 / 2 n sin(n) | ≤ | 1 / 2 n |

But the series with terms 1 / 2 n is a special case of the geometric series, which converges.
Hence, by the comparison test, the original series converges absolutely.
Note that, formally, we could say: since
| 1 / 2 n sin(n) | ≤ | 1 / 2 n |
for all n, and the series of the larger terms 1 / 2 n converges,
"therefore" the series converges absolutely.



Examples 4.2.4(b):

The series with terms 1 / (n-1) diverges.


This looks similar to the harmonic series. In fact: n > n - 1, so that 1/n <
1/ (n-1). But the harmonic series
diverges to positive infinity, hence
the above series also diverges.
Note that we could state this argument as follows:
1 / n < 1 / (n-1)
for all n > 1. Hence the series with terms 1 / (n-1) dominates the harmonic series, which
diverges to infinity. Therefore, the series diverges.

Examples 4.2.6:
Use the limit comparison test together with the results on p-series to
investigate the following series:
1.
2.
3. If r(n) = p(n) / q(n), where p and q are polynomials in n, can you
find general criteria for the series with terms r(n) to converge or diverge ?


Since the term of the first series basically looks like 1 / n 2, we want to
limit-compare this series with the p-series 1 / n 2. In fact, the limit of the quotient of the two
terms is a finite, positive number.
Hence, both series have the same convergence behavior, and since the p-series
1 / n 2 converges, so does the original series.



Since the term of the second series basically looks like the general term of a divergent
p-series, we want to limit-compare with that p-series. In fact, the limit of the quotient of the
two terms is again a finite, positive number.
Hence, both series have the same convergence behavior, and since that p-series
diverges, so does the original series.

The last series, with terms r(n) = p(n) / q(n), is left as an exercise. Here are some hints:
The p-series test tells you the convergence behavior of 1 / n k for different
k.
Check the limit of an expression like n k r(n) by comparing the degrees of
numerator and denominator
Depending on the answers above, use the limit comparison to find the
behavior of the original series, based on the degrees of numerator and
denominator.

Examples 4.2.8:
Use the Cauchy Condensation criteria to answer the following questions:
1. In the sum with terms a n = 1/n, list the terms a 4, a k, and a 2 k. Then show that this
series (called the harmonic series) diverges.
2. For which p does the series with terms 1 / n p converge or diverge ?

The sequence { 1/n } corresponds to the harmonic series. Therefore:

a 4 = 1/4
a k = 1/k
a 2k = 1 / 2 k

As for convergence or divergence of this series, we already know, by 'elementary' means,


that this series diverges. Here is an alternative proof, using the Cauchy
Condensation test: the condensed series has terms 2 k a 2k = 2 k (1 / 2 k) = 1,
and that series diverges by the Divergence test. Hence, the original series also diverges.
Next, we investigate the series with terms 1 / n p for various p:
If p < 0 then the sequence 1 / n p converges to infinity. Hence, the series diverges
by the Divergence Test.
If p > 0 then consider the condensed series with terms
2 k (1 / 2 k) p = ( 2 1-p ) k
The right-hand series is now a Geometric Series, so that:
if 0 < p ≤ 1 then 2 1-p ≥ 1, hence the right-hand series diverges
if 1 < p then 2 1-p < 1, hence the right-hand series converges

But this is exactly what we need, in conjunction with Cauchy's Condensation test, to finish
the proof of the statement.
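Numerically, the condensation step for the p-series looks as follows (a sketch; it only illustrates the dichotomy, it does not replace the test).

```python
# Condensed terms 2**k * a(2**k) for a(n) = 1/n**p form a geometric sequence
# with ratio 2**(1-p); the dichotomy at p = 1 is visible immediately.
for p in (0.5, 1.0, 1.5, 2.0):
    condensed = [(2 ** k) / (2 ** k) ** p for k in range(5)]
    print("p =", p, "ratio 2**(1-p) =", round(2 ** (1 - p), 4),
          [round(t, 4) for t in condensed])
# p <= 1: ratio >= 1, the condensed terms do not shrink and the series diverges;
# p > 1 : ratio < 1, the condensed terms decay geometrically and the series converges.
```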

Examples 4.2.10:
Investigate the convergence properties of the following series:
1. What is the actual limit of the first sum ?
2. What is the actual limit of the second sum ?
3. Does the third sum converge ?

The first series

seems to be a geometric series with a = 1/2.



However, the index n starts at n = 1, whereas for the geometric series it
starts at n = 0. While that does not influence the convergence behavior,
it does change the actual limit of the series. In fact, we have that:
1 + 1/2 + 1/4 + 1/8 + ... = 1 / (1 - 1/2) = 2
by the geometric series test, so that
1/2 + 1/4 + 1/8 + ... = 1 / (1 - 1/2) - 1 = 1.

The second series


is again similar to the geometric series, except for the index,
which is supposed to start at 0. This does not influence convergence (or divergence) but it
does change the actual value of the series.



While

= 1 / (1 - 3/4) = 4
we have for our series:

which is the answer to the above infinite series.


For the last series
we will use the limit comparison test,
together with the geometric series test.



First note that

Therefore, by the limit comparison test, the series


and
have the same
convergence behavior. But by the geometric series test, the second series converges, so that
by the limit comparison test the first one also converges.
Note that we have established convergence of the series, but we do not
know the actual limit. In fact, that limit is very difficult to determine.


Examples 4.2.12(a):

The series with terms 1/n diverges.

This series is called harmonic series, and is a p-series with p = 1. Hence, by the p-series
test, the series diverges.
One can also prove divergence of this series directly by looking at the
limit of partial sums.

Note that, in contrast, the alternating harmonic series


converges.

Examples 4.2.12(b):

The series with terms 1 / (1 + n 2) converges.


We will use the comparison test, together with the p-series test. First,
note that
1 / (1 + n 2) < 1 / n 2
But since the series 1 / n 2 is a p-series with p = 2, and therefore converges, the original
series must also converge by the comparison test.



Examples 4.2.12(c):
The series diverges.


We will use the limit comparison test, together with the p-series test.
First, note that

Hence, by the limit comparison test, the original series and the series with terms
1 / n 1/2 have the same convergence behavior. But the second series is a p-series with p = 1/2, so it
diverges by the p-series test. Hence, the original series also diverges.

Examples 4.2.14(a):

The root test does not apply to the harmonic series with terms 1/n, but the series diverges.

To apply the root test, we have to check
lim sup (1/n) 1/n = lim sup 1 / n 1/n = 1

by our result on the limit of the n-th root sequence. Hence, the root test, applied to the
above series, is inconclusive.
However, since this series is the harmonic series, we can prove either
directly that it diverges or apply the p-series test to show divergence.
Note that the root test also does not apply to the alternating harmonic
series, which turns out to converge conditionally.

Examples 4.2.14(b):

The root test does not apply to the series with terms 1 / n 2, but the series converges.

To apply the root test, we have to compute the limit superior of the expression
(n2)1/n = n2/n = (n1/n)2
By our result on the limit of the n-th root sequence, the lim sup of this expression is equal
to 1. Hence, the root test is inconclusive.
However, the above series is a p-series with p = 2, and therefore
converges by the p-series test.

Examples 4.2.14(c):
The series

converges by the root test.




We have to check the limit superior of the expression

By our result on the limit of the n-th root sequence, the lim sup of this expression is 1/2.
Hence, by the root test the above series converges absolutely.

Definition: Euler Series

The series 1/0! + 1/1! + 1/2! + 1/3! + ... (with terms 1 / n! ) is called Euler's series.
It converges to Euler's number e.


Proof:
It is easy to show that this series converges. We could use, for example, the ratio test:
the ratio of successive terms is (1 / (n+1)!) / (1 / n!) = 1 / (n+1), which converges to 0 < 1.
Hence, the series converges by the ratio test. However, this test says nothing about the
actual limit of the series. We have to be a lot more sophisticated to find the actual limit of
this series.
Recall that we have defined Euler's number as the limit of the Euler
sequence. The proof that the above sum equals that limit is very similar
to the proof that the Euler sequence converges in the first place.
First, we use the binomial theorem to expand the expression

From that expansion it is clear that

because each term in parenthesis is smaller than one. On the other hand, we also have:

for N < n, because each term in parenthesis is greater than zero. But then, taking the limit
as n approaches infinity, we have:

Hence, taking the limit as N approaches infinity, we have the sum squeezed in between the
limit of Euler's sequence, which we know is equal to e. That proves the result.

Examples 4.2.16(b):
The ratio test as well as the root test show that the series

converges.

We have to check the lim sup of the expression

Clearly, the lim sup of this expression is equal to 1/2. Hence, by the ratio test the series
converges absolutely.
The root test for this series, of course, gives the same result. However, it
is much easier to compute the limit of the above expression rather than
computing it for the expression resulting in the root test.

Examples 4.2.16(c):
The following statements are not equivalent:
There exists an N such that | an+1 / an | ≥ 1 for all n > N
lim sup | an+1 / an | ≥ 1
In fact, the first statement implies the second, but not the other way around.

Suppose that the first statement is true, i.e.


There exists an N such that | a n+1 / a n | ≥ 1 for all n > N
Now recall the definition of the lim sup as the limit of the suprema of the truncated
sequences:
lim sup | a n+1 / a n | = lim ( sup{ | a j+1 / a j | : j ≥ n } )
But if n > N, then each expression | a j+1 / a j | ≥ 1. Therefore, the lim sup must also be
greater than or equal to one.
As an example to show that the second statement does not imply the
first one, consider the sequence
2, 1/2, 2, 1/2, 2, 1/2, ...
Here the lim sup is clearly equal to 2, but there is no N such that the terms are all greater
than or equal to 1 for n > N. What remains for us to do is write this sequence as a quotient
| a n+1 / a n |
So, let

a n = 2 if n is even
a n = 1 if n is odd

Then

a n+1 / a n = 1 / 2 if n is even
a n+1 / a n = 2 / 1 if n is odd

Therefore, the second statement above does not imply the first one.

Examples 4.2.16(d):
Here is an alternative proof of the ratio test that uses the root test directly.
Note that this shows that the root test is 'better' than the ratio test, because
the ratio test can be deduced from the root test.

Proof:
First, recall the ratio test
if lim sup | a n+1 / a n | < 1 then the series converges absolutely.
and the root test
if lim sup | a n | 1/n < 1 then the series converges absolutely.
We will use the root test to prove the ratio test
Assume that lim sup | a n+1 / a n | < 1. Using the properties of the limit
superior, there exists a number c with 0 < c < 1, such that
| a n+1 / a n | < c
for n > N or equivalently,
| a n+1 | < c | a n |
for n > N. Then, for any positive integer k, we have that:
| a N+k | ≤ c | a N+k-1 | ≤ c ( c | a N+k-2 | ) ≤ ... ≤ c k | a N |
for all k > 1. Making the substitution N + k = n, for n > N, this is equivalent to:
| a n | ≤ c n-N | a N |
or
| a n | 1/n ≤ c (n-N)/n | a N | 1/n
Taking the lim sup on both sides as n approaches infinity (N is fixed) we obtain, using the
result on the n-th Root Sequence:
lim sup | a n | 1/n ≤ c < 1
Hence, the series satisfies the root test, and therefore it converges absolutely.
The proof of divergence is left as an exercise.


Examples 4.2.18(a):

The sum of the terms (-1) n / n converges conditionally.


We already know that the series does not converge absolutely (why ?).
As for convergence, let us verify the conditions for Abel's test:
First, let
{ a n } = { (-1) n }
and
{bn}={1/n}
Then the sequence of partial sums of a n's is clearly bounded (by what number ?), and the
sequence { b n } is decreasing and convergent to zero. Hence, Abel's test applies, showing
that the series converges.
Therefore, the series converges conditionally.

Examples 4.2.18(b):
The series with terms cos(n) / n converges conditionally.


Note that we can not simply apply the alternating series test, because
this series is not alternating. We will rather apply the more sophisticated
Abel's test.
To apply Abel's test, we first need to identify two sequences. Let
{ an } = { cos(n) }
and
{ b n } = { 1/n }
While it is clear that { b n } is decreasing and convergent to zero, it is not clear at all
whether the sequence of partial sums of the a n's is bounded or not. We will make an unfair
detour here, using some basic formulas from complex analysis:
Recall from Complex Analysis that

cos(t) + i sin(t) = e it
for all real numbers t, where i is the basic complex number given as the square root of -1.
Taking this formula for granted, we have:
( cos(1) + i sin(1) ) + ( cos(2) + i sin(2) ) + ... + ( cos(n+1) + i sin(n+1) )
=
= e i + e 2i + ... + e (n+1)i =
= e i ( 1 + ( e i ) + ( e i ) 2 + ... + ( e i ) n ) =
= e i ( 1 - ( e i ) n+1 ) / ( 1 - e i )
Therefore, taking absolute values on both sides we can estimate the last expression using
the triangle inequality and the fact that | e it | = 1 to obtain:
| e i + e 2i + ... + e (n+1)i | ≤ 2 / | 1 - e i |
for all n. Finally, using the fact that | a | ≤ | a + i b | for all real numbers a, b, we have
| cos(1) + cos(2) + ... + cos(n+1) | ≤ 2 / | 1 - e i |
Putting the last two expressions together, we see that the sequence of partial sums
| cos(1) + cos(2) + ... + cos(n) |
is bounded for all n. Therefore, both conditions of Abel's test are verified, and hence the
original series converges.
Of course, this is an 'unfair' proof, since it uses complex analysis and we
have not even defined properly what a complex number is. Therefore,
another proof of this example is necessary that uses only real analysis.
Such a proof is left as an exercise, as well as the proof that the series
does not converge absolutely (not easy).
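The boundedness of the partial sums of cos(n), which is the only delicate condition in Abel's test here, can at least be observed numerically (a sketch; the bound 2 / | 1 - e i | ~ 2.09 is the one from the estimate above).

```python
import math

# Track the largest magnitude of the partial sums cos(1) + cos(2) + ... + cos(n).
s, worst = 0.0, 0.0
for n in range(1, 100_001):
    s += math.cos(n)
    worst = max(worst, abs(s))
print("largest |cos(1) + ... + cos(n)| observed:", worst)
print("theoretical bound 2 / |1 - e^i| =", 2 / abs(1 - complex(math.cos(1), math.sin(1))))
```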

Lemma: Summation by Parts

Consider two sequences { a j } and { b j }, and let S n = a 1 + a 2 + ... + a n be the n-th
partial sum of the a's. Then for any 0 ≤ m ≤ n the sum of the products a j b j can be
rewritten in terms of the partial sums S j and the differences b j - b j+1 (summation by parts).


Proof

The proof is simply a calculation, where the various sums are carefully reindexed:


Examples 4.2.20(a):

The Alternating Harmonic Series: The series 1 - 1/2 + 1/3 - 1/4 + ... is called the
Alternating Harmonic series. It converges but not absolutely, i.e. it
converges conditionally.

Proof:
There are many proofs of this fact. For example, the series of absolute values is a p-series
with p = 1, and diverges by the p-series test. The original series converges, because it is an
alternating series, and the alternating series test applies easily. However, here is a more
elementary proof of the convergence of the alternating harmonic series.
We already know that the series of absolute values does not converge
by a previous example. Hence, the series does not converge absolutely.
As for regular convergence, consider the even-indexed partial sums S 2n and the
odd-indexed partial sums S 2n+1. We have that
S 2n+2 - S 2n = 1 / (2n+1) - 1 / (2n+2) > 0
S 2n+3 - S 2n+1 = - 1 / (2n+2) + 1/ (2n+3) < 0
which means for the two subsequences
{ S 2n } is monotone increasing
{ S 2n+1 } is monotone decreasing
For each sequence we can combine pairs to see that
S 2n ≤ 1 for all n
S 2n+1 ≥ 0 for all n
Hence, both subsequences are monotone and bounded, and must therefore be convergent.
Define their limits as
lim S 2n = L and lim S 2n+1 = M

Then
| M - L | = lim | S 2n+1 - S 2n | = lim 1 / (2n+1) = 0
Therefore, M = L, i.e. both subsequences converge to the same
limit. But this common limit is the same as the limit of the full sequence, because: given
any ε > 0 we have
there exists an integer N such that | L - S 2n | < ε if n > N
there exists an integer M such that | L - S 2n+1 | < ε if n > M
Now set K = max(N, M). Then, for the above ε > 0 we have
| L - S n | < ε
for n > K, because n is either even or odd. Hence, the alternating harmonic series
converges conditionally.

Examples 4.2.20(b):
The series with terms cos(n) / n converges conditionally.


Note that we can not simply apply the alternating series test, because
this series is not alternating. We will rather apply the more sophisticated
Abel's test.
To apply Abel's test, we first need to identify two sequences. Let
{ an } = { cos(n) }
and
{ b n } = { 1/n }
While it is clear that { b n } is decreasing and convergent to zero, it is not clear at all
whether the sequence of partial sums of the a n's is bounded or not. We will make an unfair
detour here, using some basic formulas from complex analysis:
Recall from Complex Analysis that
cos(t) + i sin(t) = e it
for all real numbers t, where i is the basic complex number given as the square root of -1.
Taking this formula for granted, we have:
( cos(1) + i sin(1) ) + ( cos(2) + i sin(2) ) + ... + ( cos(n+1) + i sin(n+1) )
=
= e i + e 2i + ... + e (n+1)i =

= e i ( 1 + ( e i ) + ( e i ) 2 + ... + ( e i ) n ) =
= e i ( 1 - ( e i ) n+1 ) / ( 1 - e i )
Therefore, taking absolute values on both sides we can estimate the last expression using
the triangle inequality and the fact that | e it | = 1 to obtain:
| e i + e 2i + ... + e (n+1)i | ≤ 2 / | 1 - e i |
for all n. Finally, using the fact that | a | ≤ | a + i b | for all real numbers a, b, we have
| cos(1) + cos(2) + ... + cos(n+1) | ≤ 2 / | 1 - e i |
Putting the last two expressions together, we see that the sequence of partial sums
| cos(1) + cos(2) + ... + cos(n) |
is bounded for all n. Therefore, both conditions of Abel's test are verified, and hence the
original series converges.
Of course, this is an 'unfair' proof, since it uses complex analysis and we
have not even defined properly what a complex number is. Therefore,
another proof of this example is necessary that uses only real analysis.
Such a proof is left as an exercise, as well as the proof that the series
does not converge absolutely (not easy).

Proposition 5.1.3: Unions of Open Sets, Intersections of


Closed Sets

Every union of open sets is again open.


Every intersection of closed sets is again closed.
Every finite intersection of open sets is again open.
Every finite union of closed sets is again closed.

Proof:
Let { U n } be a collection of open sets, and let U be the union of all the U n. Take any x in U. Being in the
union of all U's, it must be contained in one specific U n. Since that set is open, there exists
a neighborhood of x contained in that specific U n. But then that neighborhood must also be
contained in the union U. Hence, any x in U has a neighborhood that is also in U, which
means by definition that U is open.
To prove the second statement, simply use the definition of closed sets
and de Morgan's laws.
Now let U n, n=1, 2, 3, ..., N be finitely many open sets. Take x in the
intersection of all of them. Then:

x is in the first set: there exists an ε1 with ( x - ε1, x + ε1 ) contained in the first set
x is in the second set: there is an ε2 with ( x - ε2, x + ε2 ) contained in the second set
....
x is in the N-th set: there is an εN with ( x - εN, x + εN ) contained in the last set
But then let ε = min{ ε1, ε2, ..., εN }. Then ( x - ε, x + ε ) is contained in each set U n

Hence it is contained in their finite intersection, which is therefore open, since x was
arbitrary.
The last statement follows again from de Morgan's laws.

Proposition 5.1.4: Characterizing Open Sets


Let U be an arbitrary open set of real numbers. Then there are countably many pairwise
disjoint open intervals U n such that U is the union of the U n.

Proof:
This proposition is rather interesting, giving a complete description of any possible open
set in the real line. To prove it, we will make use of equivalence relations and classes
again. First, let us define a relation on U:
if a and b are in U, we say that a ~ b if the whole line segment between a and b is
also contained in U.
Is this relation indeed an equivalence relation ? Assuming that it is, we know
immediately that U equals the union of the equivalence classes, and the equivalence classes
are pairwise disjoint. Denote those equivalence classes by U n.
Each U n is an interval: take any two points a and b in U n. Being in the
same equivalence classes, a and b must be related. But then the whole
line segment between a and b is contained in U n as well. Since a and b
were arbitrary, U n is indeed an interval.
Each U n is open: take any x in U n. Then x is in U, and since U is open, there
exists an ε > 0 such that ( x - ε, x + ε ) is contained in U. But clearly
each point in that interval is related to x, hence this neighborhood is
contained in U n, proving that U n is open.
There are only countably many U n: This seems the hard part. But, each
U n must contain at least one different rational number.
Why ? Since
there are only countably many rational numbers, there can only be
countably many of the U n's (since they are disjoint).

Examples 5.1.6(a):
What is the boundary and the interior of (0, 4), [-1, 2], R, and the empty set ? Which
points are isolated and accumulation points, if any ?

The boundary of (0, 4) is the set consisting of the two elements {0, 4}. Every
neighborhood of these two points contains points both from the interval (0,4) and
from the complement of that interval. Therefore, both form the boundary. The
interior of the set (0, 4) is the set (0, 4) (i.e. itself). No points of either set are
isolated, and each point of either set is an accumulation point. The same is true,
incidentally, for each of the sets (0, 4), [0, 4), (0, 4], and [0, 4].
The boundary of [-1, 2] is the two-element set {-1, 2}, and the interior is (-1, 2). No
points are isolated, and each point in either set is an accumulation point.
The boundary of the set R as well as its interior is the set R itself. No point is
isolated, all points are accumulation points.
The boundary of the empty set as well as its interior is the empty set itself. Since
the set contains no points, it can not contain isolated or accumulation points.

Examples 5.1.6(b):
Find the boundary, interior, isolated and accumulation points, if any, for the
set
{1, 1/2, 1/3, ... } ∪ {0}

The boundary of the set is the set itself, because any neighborhood of every point
contains that point itself and some irrational points. Therefore, any neighborhood of
every point contains points from within and from without the set, i.e. every point of
the set is a boundary point.
The interior of this set is empty, because if x is any point in that set, then any
neighborhood of x contains at least one irrational point that is not part of the set.
Therefore, this neighborhood is not contained in the set.
Every point except 0 is an isolated point. First, it is easy to find a small enough
neighborhood for any point of the form x = 1/n that does not contain any point
from the set but x = 1/n. Therefore, every point x = 1/n is isolated. On the other
hand, if (-a, a) is any small neighborhood of 0, then if n is large enough we have
that 1/n < a. But then the point 1/n which is different from 0, is also part of the
neighborhood, and hence 0 is not isolated.
The point 0 is the only accumulation point of the set. It is clear that no point of the
form x = 1/n is an accumulation point, and it is also clear that 0 is an accumulation
point. Simply look at the previous example, and recall the definition of
accumulation points and their related points.

Proposition 5.1.7: Boundary, Accumulation, Interior, and


Isolated Points

Let S be a set of real numbers. Then each point of S is either an interior point or a
boundary point.
Let S be a set of real numbers. Then bd(S) = bd(R \ S).
A closed set contains all of its boundary points. An open set contains
none of its boundary points.
Every non-isolated boundary point of a set S of real numbers is an accumulation
point of S.
An accumulation point is never an isolated point.

Proof:

Examples 5.2.2(a):
Is the interval [0,1] compact ? How about [0, 1) ?

The interval [0, 1] is compact. To see this, take any sequence of points in [0, 1].
Since the sequence must be bounded, we can extract a convergent subsequence by
the Bolzano Weierstrass theorem. Using the theorem on accumulation and
boundary points and noting that the set [0, 1] is closed, the limit of this
subsequence must be contained in [0, 1]. Hence, the set is compact by definition.
The interval [0, 1) is not compact. Consider the sequence { 1 - 1/n }. Then that
sequence is contained in [0, 1), and converges to 1. Therefore, every subsequence
of it must also converge to 1, which is not part of the original set. Therefore, the set
can not be compact.

Examples 5.2.2(b):
Is the set {1, 2, 3} compact ? How about the set N of natural numbers ?

The set {1, 2, 3} is compact. Take any sequence with elements from the set {1, 2,
3}. This sequence is bounded, so we can extract a convergent subsequence from it.
Since the subsequence converges, it forms a Cauchy sequence. Therefore,
consecutive numbers in this subsequence must eventually be closer than, say, 1/2.
But then the subsequence must eventually be constant, and that constant must be
either 1, 2, or 3. Therefore the subsequence converges to an element of the original
set. But then the set {1, 2, 3} is compact. Note that a similar argument applies to
any set of finitely many numbers.
The set of natural numbers N is not compact. The sequence { n } of natural numbers
converges to infinity, and so does every subsequence. But infinity is not part of the
natural numbers.

Examples 5.2.2(c):
Is the set {1, 1/2, 1/3, 1/4, ...} compact ?

This set is not compact. Take the sequence { 1/n } which is the set itself. It - and hence
every subsequence - converges to zero, which is not part of the set. Therefore, the set can
not be compact by our definition.

Examples 5.2.2(d):
Is the set {1, 1/2, 1/3, 1/4, ...} ∪ {0} compact ?

This set is compact. To see this, first note that the set is bounded. Hence, any sequence of
elements from this set is also bounded, and using the Bolzano-Weierstrass theorem we can
extract a convergent subsequence. But the set is also closed, so that by the theorem on
accumulation and boundary points the limit in fact must be part of the set as well.

Proposition 5.2.3: Compact means Closed and Bounded


A set S of real numbers is compact if and only if it is closed and bounded.

Proof:
First, suppose S is closed and bounded. Take a sequence { a n } in S. Because S is
bounded, the sequence is bounded also, and by the Bolzano-Weierstrass theorem we can
extract a convergent subsequence from it. Using the theorem about closed sets,
accumulation points and sequences, we know that the limit of the subsequence is again in
S. Hence, S is compact.
Now assume that S is compact. We have to show that S is bounded and
closed.
Suppose S was not bounded. Then for every n there exists a number a n in
S with | a n | > n. But then no subsequence of the sequence { a n } will
converge to a finite number, hence S can not be compact. Thus, S must
be bounded.
Suppose S was not closed. Then there exists an accumulation point c for
the set S that is not contained in S. Since c is an accumulation point,
there exists a sequence { a n } of elements in S that converges to c. But
then every subsequence of that sequence will also converge to c, and
since c is not in S, that contradicts compactness. Hence, S must be
closed.


Examples 5.2.5(a):
Let S = [0, 1], and C = { (-1/2, 1/2), (1/3, 2/3), (1/2, 3/2)}. Is C an open
cover for S ?

C consists of open sets only.


The union of all sets in C contains S

Therefore, C is an open cover for S.



Examples 5.2.5(b):
Let S = [0, 1]. Define U α = { t in R : | t - α | < ε } for each α in S and a fixed ε > 0.
Is the collection of all { U α }, α in S, an open cover for S ? How many sets of
type U α are actually needed to cover S ?

First, each set U α is an open set, because it is the same as an interval around α of length
2 ε. Second, the union of all sets U α equals the open interval ( - ε, 1 + ε ), so it contains the
set S. Therefore, the collection { U α }, α in S, is an open cover of S.
The collection { U α }, α in S, consists of uncountably many sets. In order
to cover S, however, we need only a finite subcollection for any given ε.
To see this, fix an ε > 0. Then let N be the smallest integer greater than
1 / ε, and define
α k = k ε, k = 0, 1, 2, ..., N
Then one can quickly check that the collection { U α k }, k = 0, 1, 2, ..., N, is a covering of S.
That is, this new collection forms a finite subcover of S with respect to the original
collection of sets.

Examples 5.2.5(c):
Let S = (0, 1). Define a collection C = { (1/j, 1), for all j > 0 }. Is C an open
cover for S ? How many sets from the collection C are actually needed to
cover S ?

Clearly, each set in the collection C is open. Also, the union of all the sets
(1 / j, 1) is equal to (0, 1)
so that C is indeed an open cover of S.
Not all sets from the original collection C are needed to cover S. For
example, the subcollection of intervals (1 / (2 j) , 1) is also an open

covering. However, we can not reduce this cover to a finite subcovering.


To see this, extract finitely many sets of the form (1 / j , 1) from the
collection C. Let N be the largest integer j that occurs in this
subcollection. Then the point 1 / (N + 1) is in S, but it is not in any of the
intervals of the finite subcollection. Hence, no finite subcollection from C
can cover S.
On the other hand, S does have some other finite open coverings. For
example, the collection { (-1, 1/2), (0, 2) } is such a finite open cover.
However, the given open cover C from above can not be reduced to
a finite subcover.

Theorem 5.2.6: Heine-Borel Theorem


A set S of real numbers is compact if and only if every open cover C of S
can be reduced to a finite subcovering.

Proof:
First, assume that every open cover C of S can be reduced to a finite subcovering. We will
show that S must then be closed and bounded, which means by the previous result that S is
compact.
S must be bounded: Take the collection C = { U α : α in S }, where
U α = ( α - 1, α + 1 ). Then this collection is an open cover of S, and by
assumption can be reduced to a finite subcovering of S. But if aj1 is the
smallest of the centers of the sets in that subcovering, and aj2 is the largest one, then S
is contained in the set ( aj1 - 1, aj2 + 1 ) and is therefore bounded.
S must be closed: Suppose S was not closed. Then there exists an
accumulation point s of S that is not contained in S. Since s is an
accumulation point of S we know:
for any n > 1 there exists an S with | s - an | < 1 / n
because every neighborhood of s must contain elements from S. The sequence { an }
clearly converges to s. Define the collection of sets
C = { comp([s - 1/n, s + 1/n]), n > 0 }
Then each set in C is open, being the complement of closed sets. They also cover S,
because the only point that this collection is missing is the point s, which is not part of S.
By assumption, a finite subcover already covers S. If N is the largest index of that finite
subcovering, then aN+1 is not part of that subcovering. However, aN+1 is an element of S, so
that this subcovering can not cover S. That is a contradiction, showing that if S was not
closed, not every covering of S can be reduced to a finite subcovering. Hence, S must be
closed.
Now we have to prove the other direction. Assume therefore that S is compact. Let C be any open cover of S. We need to reduce C to a finite subcover. Since S is compact, we know it is closed and bounded. Then a = inf(S) and b = sup(S) are both part of S (Why ?). Define the set A as
A = { x : x ∈ [a, b] and a finite subcollection of C covers [a, x] ∩ S }
Then the set A is not empty (because a ∈ A). Define
c = sup(A)
Since A is a subset of [a, b], we know that a ≤ c ≤ b. Suppose c < b. Since S is closed, comp(S) is open. Therefore, if c ∈ comp(S) then there exists an open neighborhood U of c that is contained in [a, b] (because c < b) and disjoint from S. But then c can not be the supremum of the set A. Therefore, if c < b then c ∈ S. Then c must be contained in some set U from the open cover C of S. Choose two points y and z in U with y < c < z. As before, there exists a finite subcollection of C whose members cover [a, y] ∩ S. Then these sets, together with U, cover [a, z] ∩ S. But then z ∈ A, which means again that c can not be the supremum of A. This means that assuming c < b leads to a contradiction, so that c = b. But that is exactly what we need. If sup(A) = c = b, then let U be that member of the open cover C that contains b. There exists some open neighborhood (b - ε, b + ε) contained in U. But b - ε is not an upper bound for A, so there exists x ∈ A with x > b - ε. Then [a, x] ∩ S is covered by a finite number of members of C. Together with the set U these sets form a finite open cover for S.
We have indeed reduced the open cover of S to a finite subcovering of S, finishing the proof.
Examples 5.2.7(a):
Consider the collection of sets (0, 1/j) for all j > 0. What is the intersection
of all of these sets ?
The intersection of all intervals (0, 1/j) is empty. To see this, take any real number x. If x ≤ 0 it is not in any of the intervals (0, 1/j), and hence not in their intersection. If x > 0, then there exists an integer N such that 0 < 1 / N < x. But then x is not in the set (0, 1 / N) and therefore x is not in the intersection. Therefore, the intersection is empty.
Note that this is an intersection of 'nested' sets, that is, sets that are decreasing: every 'next' set is a subset of its predecessor.
Examples 5.2.7(b):
Can you find infinitely many closed sets such that their intersection is empty and such that each set is contained in its predecessor ? That is, can you find sets A_j such that A_{j+1} ⊂ A_j and ∩ A_j = ∅ ?

It is easy to simply find some closed sets with empty intersection. For example, the intersection of all intervals of the form [n, n+1] is certainly empty.
To find sets contained in one another is slightly more complicated. We might try sets of the form A_j = [0, 1 / j] for all j. Then A_{j+1} ⊂ A_j, but their intersection contains the point 0.
Let A_j = [j, ∞). Then A_{j+1} ⊂ A_j, because [j + 1, ∞) ⊂ [j, ∞). But by the Archimedean principle, the intersection of all sets A_j is empty.
Proposition 5.2.8: Intersection of Nested Compact Sets

Suppose { A_j } is a collection of sets such that each A_k is non-empty, compact, and A_{j+1} ⊂ A_j. Then A = ∩ A_j is not empty.
Proof:
Each A_j is compact, hence closed and bounded. Therefore, A is closed and bounded as well, and hence A is compact. Pick an a_j ∈ A_j for each j.
Then the sequence { a_j } is contained in A_1. Since that set is compact, there exists a convergent subsequence { a_jk } with limit in A_1.
But that subsequence, except possibly the first term, is also contained in A_2. Since A_2 is compact, the limit must be contained in A_2.
Continuing in this fashion, we see that the limit must be contained in every A_j, and hence it is also contained in their intersection A. But then A can not be empty.
Example 5.2.10(a):
Find a perfect set. Find a closed set that is not perfect. Find a compact set
that is not perfect. Find an unbounded closed set that is not perfect. Find a
closed set that is neither compact nor perfect.
A perfect set needs to be closed, such as the closed interval [a, b]. In fact, every point in that interval [a, b] is an accumulation point, so that the set [a, b] is a perfect set.
The simplest closed set is a singleton { b }. The element b in the set { b } is not an accumulation point, so the set { b } is closed but not perfect.
The set { b } from above is also compact, being closed and bounded. Hence, it is compact but not perfect.
The set {-1} ∪ [0, ∞) is closed, unbounded, but not perfect, because the element -1 is not an accumulation point of the set.
The set {-1} ∪ [0, ∞) from above is closed, not perfect, and also not compact, because it is unbounded.
Example 5.2.10(b):

Is the set {1, 1/2, 1/3, ...} perfect ? How about the set {1, 1/2, 1/3, ...} ∪ {0} ?

The first set is not closed. Hence it is not perfect.
The second set is closed, and 0 is an accumulation point. However, every point different from 0 is isolated, and can therefore not be an accumulation point. Therefore, this set is not perfect either.
Proposition 5.2.11: Perfect sets are Uncountable

Every non-empty perfect set must be uncountable.
Proof:
If S is perfect, it consists of accumulation points, and therefore can not be finite. Therefore it is either countable or uncountable. Suppose S was countable and could be written as
S = { x_1, x_2, x_3, ... }
The interval U_1 = (x_1 - 1, x_1 + 1) is a neighborhood of x_1. Since x_1 must be an accumulation point of S, there are infinitely many elements of S contained in U_1.
Take one of those elements, say x_2, and take a neighborhood U_2 of x_2 such that closure( U_2 ) is contained in U_1 and x_1 is not contained in closure( U_2 ). Again, x_2 is an accumulation point of S, so that the neighborhood U_2 contains infinitely many elements of S.
Select an element, say x_3, and take a neighborhood U_3 of x_3 such that closure( U_3 ) is contained in U_2 but x_1 and x_2 are not contained in closure( U_3 ).
Continue in that fashion: we can find sets U_n and points x_n such that:
closure( U_{n+1} ) ⊂ U_n
x_j is not contained in U_n for all 0 < j < n
x_n is contained in U_n
Now consider the set
V = ∩ ( closure( U_n ) ∩ S )
Then each set closure( U_n ) ∩ S is closed and bounded, hence compact. Also, by construction, ( closure( U_{n+1} ) ∩ S ) ⊂ ( closure( U_n ) ∩ S ). Therefore, by the above result, V is not empty. But which element of S should be contained in V ? It can not be x_1, because x_1 is not contained in closure( U_2 ). It can not be x_2 because x_2 is not in closure( U_3 ), and so forth.
Hence, none of the elements { x_1, x_2, x_3, ... } can be contained in V. But V is non-empty, so that it must contain an element not in this list. That means, however, that S is not countable.
Example 5.2.13(a): Properties of the Cantor Set

The Cantor set is compact.
The definition of the Cantor set is as follows: let
A_0 = [0, 1]
and define, for each n, the sets A_n recursively as
A_n = A_{n-1} \ { the union of the open middle thirds of the intervals that make up A_{n-1} }
Then the Cantor set is given as:
C = ∩ A_n
Each removed set is open, being a union of open intervals. Since A_0 is closed, the sets A_n are all closed as well, which can be shown by induction. Also, each set A_n is a subset of A_0, so that all sets A_n are bounded.
Hence, C is the intersection of closed, bounded sets, and therefore C is also closed and bounded. But then C is compact.
Example 5.2.13(b): Properties of the Cantor Set

The Cantor set is perfect and hence uncountable.
The definition of the Cantor set is as follows: let
A_0 = [0, 1]
and define, for each n, the sets A_n recursively as
A_n = A_{n-1} \ { the union of the open middle thirds of the intervals that make up A_{n-1} }
Then the Cantor set is given as:
C = ∩ A_n
From this representation it is clear that C is closed. Next, we need to show that every point in the Cantor set is a limit point.
One way to do this is to note that each of the sets A_n can be written as a finite union of 2^n closed intervals, each of which has a length of 1 / 3^n, as follows:
A_0 = [0, 1]
A_1 = [0, 1/3] ∪ [2/3, 1]
A_2 = [0, 1/9] ∪ [2/9, 3/9] ∪ [6/9, 7/9] ∪ [8/9, 1]
...
Note that all endpoints of every subinterval will be contained in the Cantor set. Now take any x ∈ C = ∩ A_n. Then x is in A_n for all n. If x is in A_n, then x must be contained in one of the 2^n intervals that comprise the set A_n. Define x_n to be the left endpoint of that subinterval (if x is equal to that endpoint, then let x_n be equal to the right endpoint of that subinterval). Since each subinterval has length 1 / 3^n, we have:
| x - x_n | ≤ 1 / 3^n
Hence, the sequence { x_n } converges to x, and since all endpoints of the subintervals are contained in the Cantor set, we have found a sequence of numbers contained in C that converges to x. Therefore, x is a limit point of C. But since x was arbitrary, every point of C is a limit point. Since C is also closed, it is then perfect.
Note that this proof is not yet complete. One still has to prove the assertion that each set A_n is indeed comprised of 2^n closed subintervals, with all endpoints being part of the Cantor set. But that is left as an exercise.
Since every perfect set is uncountable, so is the Cantor set.
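The interval description of the sets A_n is easy to make concrete. The sketch below is an added illustration (the function name and the use of exact fractions are our choices): it builds the intervals of A_n by keeping the two outer closed thirds of each interval, and checks that A_n consists of 2^n intervals of length 1/3^n whose endpoints are still endpoints at the next stage, and hence belong to C.

from fractions import Fraction

def cantor_intervals(n):
    # Return the closed intervals (pairs of Fractions) that make up A_n.
    intervals = [(Fraction(0), Fraction(1))]           # A_0 = [0, 1]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))       # keep the left closed third
            next_intervals.append((b - third, b))       # keep the right closed third
        intervals = next_intervals
    return intervals

A3 = cantor_intervals(3)
assert len(A3) == 2**3                                  # 2^n subintervals
assert all(b - a == Fraction(1, 3**3) for a, b in A3)   # each of length 1/3^n
endpoints_A4 = {p for interval in cantor_intervals(4) for p in interval}
assert all(a in endpoints_A4 and b in endpoints_A4 for a, b in A3)  # endpoints survive
print(A3)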
Example 5.2.13(c): Properties of the Cantor Set

The Cantor set has length zero, but contains uncountably many points.
The definition of the Cantor set is as follows: let
A_0 = [0, 1]
and define, for each n, the sets A_n recursively as
A_n = A_{n-1} \ { the union of the open middle thirds of the intervals that make up A_{n-1} }
Then the Cantor set is given as:
C = ∩ A_n
To be more specific, we have:
A_0 = [0, 1]
A_1 = [0, 1] \ (1/3, 2/3)
A_2 = A_1 \ [ (1/9, 2/9) ∪ (7/9, 8/9) ] = [0, 1] \ (1/3, 2/3) \ (1/9, 2/9) \ (7/9, 8/9)
...
That is, at the n-th stage (n > 0) we remove 2^(n-1) open intervals from the previous set, each having length 1 / 3^n. Therefore, we remove a total length of
∑ 2^(n-1) / 3^n = (1/3) ∑ (2/3)^(n-1) = (1/3) · 1 / (1 - 2/3) = 1
from the unit interval [0, 1]. Since we remove a set of total length 1 from the unit interval, the length of the remaining Cantor set must be 0.
The Cantor set contains uncountably many points because it is a perfect
set.
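As a quick numerical cross-check (added here as an illustration, not part of the original text), the partial sums of the removed lengths approach 1, while the total length (2/3)^n remaining after n stages approaches 0.

def removed_length(n):
    # Total length removed from [0, 1] after n stages: sum of 2^(k-1) / 3^k for k = 1..n.
    return sum(2**(k - 1) / 3**k for k in range(1, n + 1))

for n in (1, 5, 10, 40):
    print(n, removed_length(n), (2/3)**n)   # removed length -> 1, remaining length -> 0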
Example 5.2.13(d): Properties of the Cantor Set

The Cantor set does not contain any open set.
The definition of the Cantor set is as follows: let
A_0 = [0, 1]
and define, for each n, the sets A_n recursively as
A_n = A_{n-1} \ { the union of the open middle thirds of the intervals that make up A_{n-1} }
Then the Cantor set is given as:
C = ∩ A_n
Another way to write the Cantor set is to note that each of the sets A_n can be written as a finite union of 2^n closed intervals, each of which has a length of 1 / 3^n, as follows:
A_0 = [0, 1]
A_1 = [0, 1/3] ∪ [2/3, 1]
A_2 = [0, 1/9] ∪ [2/9, 3/9] ∪ [6/9, 7/9] ∪ [8/9, 1]
...
Now suppose that there is an open set U contained in C. Then there must be an open interval (a, b) contained in C. Now pick an integer N such that
1 / 3^N < b - a
Then the interval (a, b) can not be contained in the set A_N, because that set is comprised of intervals of length 1 / 3^N. But if that interval is not contained in A_N it can not be contained in C. Hence, no open set can be contained in the Cantor set C.
Examples 5.3.2(a):
Is the set { x ∈ R : | x | < 1, x ≠ 0 } connected or disconnected ? What about the set { x ∈ R : | x | ≤ 1, x ≠ 0 } ?
The first set is open, so we can try to use the easier definition of disconnected. It is indeed
disconnected; it can be 'split at 0'. More precisely, let U = (-1,0) and V = (0,1). Then U and
V have empty intersection and their union is the original set. But that means the original
set must be disconnected.
The second set is also disconnected, but since it is not open we must use the second part of the definition of disconnected sets. Again, we can 'split' the set at zero. Let U = (-2, 0) and V = (0, 2), and let S = [-1, 0) ∪ (0, 1] be the set in question. Then both sets U and V are open and we have that
U ∩ S = [-1, 0)
V ∩ S = (0, 1]
so that
( U ∩ S ) ∩ ( V ∩ S ) = ∅
( U ∩ S ) ∪ ( V ∩ S ) = S
Note that the actual choice of U and V was not important. We could have chosen different intervals, e.g. (-50, 0) and (0, 10), to show both sets were disconnected. The important part was that the sets could be split at zero into two disjoint pieces.
Examples 5.3.2(b):
Is the set [-1, 1] connected or disconnected ?
It seems obvious that this set is connected. How should we even try to split it into two disjoint pieces whose union gives back the original set ? But, try to prove it ! You will find that it is much easier to show that a set is disconnected than it is to show that a set is connected.
Examples 5.3.2(c):
Is the set of rational numbers connected or disconnected ? How about the
irrationals ?
Both sets are disconnected. The idea is simple: The set of rational numbers can be 'split' at
any irrational number, while the set of irrationals can be 'split' at any rational number. Can
you provide the details ? Note that neither set is open, so that the more complicated
definition of disconnected must be used.
In fact, both of these sets are totally disconnected.
Examples 5.3.2(d):
Is the Cantor set connected or disconnected ?

The Cantor set is disconnected. In fact, one of the properties of the Cantor
set states that it does not contain any open set. Using that fact one can see
that the Cantor set must in fact be totally disconnected, which implies
disconnected. Can you provide the details ?

Proposition 5.3.3: Connected Sets in R are Intervals


If S is any connected subset of R then S must be some interval.
Context

Proof:
If S is not an interval, then there exist a, b ∈ S and a point t between a and b such that t is not in S. Then define the two sets
U = ( -∞, t ) and V = ( t, ∞ )
Then U ∩ S ≠ ∅ (because it contains a) and V ∩ S ≠ ∅ (because it contains b), and clearly (U ∩ S) ∩ (V ∩ S) = ∅. Finally, because t is not contained in S, we know that (U ∩ S) ∪ (V ∩ S) = S. Hence, we have found the required sets U and V to disconnect S. So, we have proved that if a set is not an interval it is disconnected. That is equivalent to saying that if it is connected, it must be an interval.
Example 5.3.5(a):
The Cantor set is disconnected. It is in fact totally disconnected.

Example 6.1.1:
Consider the function f, where f(x) = 1 if x ≤ 0 and f(x) = 2 if x > 0.
1. The sequence {1/n} converges to 0. What happens to the sequence { f( 1/n ) } ?
2. The sequence { 3 + (-1)^n } is divergent. What happens to the sequence { f( 3 + (-1)^n ) } ?
3. The sequence { (-1)^n / n } converges to zero. What happens to the sequence { f( (-1)^n / n ) } ?
1. For the first sequence, we clearly have that f(1/n) = 2 for all n. Hence, the limit of f(1/n) equals 2. So a function applied to a sequence results in a new sequence that can converge to a different number.
2. For the second sequence, we know that 3 + (-1)^n > 0 for all n. Hence, f( 3 + (-1)^n ) = 2 for all n. This time, a function applied to a divergent sequence results in a convergent one.
3. For the third sequence we know that alternating terms switch signs. Hence, f( (-1)^n / n ) = 1 if n is odd, and 2 if n is even. But then the resulting new sequence is divergent. Hence, we have an example where a function can turn a convergent sequence into a divergent one.
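The three cases are easy to reproduce numerically. The sketch below is an added illustration (not part of the original text): it evaluates f along the first few terms of each sequence, and the 'convergent versus divergent' behaviour described above is visible in the printed values.

def f(x):
    return 1 if x <= 0 else 2

terms = range(1, 9)
print([f(1 / n) for n in terms])            # all 2: {1/n} -> 0, but {f(1/n)} -> 2
print([f(3 + (-1) ** n) for n in terms])    # all 2: divergent input, convergent output
print([f((-1) ** n / n) for n in terms])    # 1, 2, 1, 2, ...: convergent input, divergent output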
Examples 6.1.3:
1. Let f(x) = m x + b. Then does the limit of that function exist at an
arbitrary point x ?
2. Let g(x) = [x], where [x] denotes the greatest integer less than or
equal to x. Then does the limit of g exist at an integer ? How about at
numbers that are not integers ?
3. In the above definition, does c have to be in the domain D of the
function ? Is c in the closure(D) ? Do you know a name for c in
terms of topology ?
1. Consider a sequence {x_n} converging to c. Then we want to show that f(x_n) converges as well. If m = 0 the function is constant and there is nothing to prove, so assume m ≠ 0. Choose any ε > 0 and take a positive integer N so large that
| x_n - c | < ε / | m | for n > N
Then we have
| f(x_n) - (m c + b) | = | m x_n + b - m c - b | = | m | | x_n - c | < | m | ε / | m | = ε
for n > N. But that means that the sequence f(x_n) converges to the number (m c + b).
2. Let g(x) = [x] and assume that c is an integer. Then take the sequence
x_n = c + (-1)^n / n
Then this sequence converges to c as n approaches infinity, but the sequence g(x_n) does not converge at all. Hence, the limit of g(x) does not exist at any integer. On the other hand, if c is not an integer, then the function g(x) converges to the limit L = g(c). Can you prove it ?
3. Recalling our knowledge of topology, we remember that if {xn} is a
sequence in a set D converging to c, then c is called an accumulation
point of D. Would it be correct to say that a function f(x) with domain D
converges to a limit L if for every accumulation point c of D the number
L is also an accumulation point of the image of D ?
Example 6.1.5:
Consider the function f with f(x) = 1 if x is rational and f(x) = 0 if x is irrational. Does the limit of f(x) exist at an arbitrary number x ?
Recall that f is called the Dirichlet function. Since f 'jumps wildly' up and down, we
suspect that the function does not have a limit at any point. This is indeed the case, as is
easy to show using our new definition of limit:
Let c be any real number and pick ε = 1/2. Suppose there was a δ > 0 such that whenever | x - c | < δ then | f(x) - L | < ε = 1/2.
We can find a rational number q with | q - c | < δ. Hence f(q) = 1, and therefore
| f(q) - L | < 1/2, or | 1 - L | < 1/2
or equivalently
| L | > 1/2
We can also find an irrational number y with | y - c | < δ. Hence f(y) = 0, and therefore
| f(y) - L | < 1/2
or
| L | < 1/2
But that's a contradiction, so the function can not have a limit at any point c.
Proposition 6.1.6: Equivalence of Definitions of Limits

If f is any function with domain D in R, and c ∈ closure(D), then the following are equivalent:
1. For any sequence {x_n} in D that converges to c, the sequence {f(x_n)} converges to L.
2. Given any ε > 0 there exists a δ > 0 such that if x ∈ D and | x - c | < δ then | f(x) - L | < ε.
Proof:
Suppose the first condition is true, but the second condition fails. Then there exists an ε > 0 such that there is no δ > 0 with the property that if | x - c | < δ then | f(x) - L | < ε. Therefore, if we let δ = 1 / n, then for each n we can produce a number x_n with
| x_n - c | < 1 / n but | f(x_n) - L | ≥ ε
But then the sequence { x_n } converges to c, but the sequence f(x_n) does not converge to L. That is contrary to the first condition being true, and hence we have proved by contradiction that the first condition implies the second.
Suppose the second condition is true. Let c be some number in closure(D) and pick any ε > 0. There exists a number δ > 0 such that
whenever | x - c | < δ then | f(x) - L | < ε
Take any sequence { x_n } in D that converges to c. Then there is an integer N such that
| x_n - c | < δ for n > N
But then, by assumption of the second condition,
| f(x_n) - L | < ε for n > N
But that is the definition of the sequence { f(x_n) } converging to L, as required.
Example 6.2.1:
Which of the two functions is intuitively continuous and which one is not ?
1. f(x) = 1 if x > 0 and f(x) = -1 if x < 0. Is this function continuous ?
2. f(x) = 5x - 6. Is this function continuous?
Since Descartes claims that a function is continuous if its graph can be drawn without
lifting the pencil, we will look at the graph of each function:

f(x) = -1 if x < 0 and 1 if x > 0: Since we have to lift the pencil to draw this graph, this
function does not appear to be continuous.

f(x) = 5x - 6: Since this graph, being a straight line, does not require us to lift the pencil, we
would call this function continuous.
Proposition 6.2.3: Continuity preserves Limits

If f is continuous at a point c in the domain D, and {x_n} is a sequence of points in D converging to c, then lim f(x_n) = f(c).
If lim f(x_n) = f(c) for every sequence {x_n} of points in D converging to c, then f is continuous at the point c.
Proof:
The proof is very similar to the previous result about the equivalence of
the two definitions of limits for a function. It is therefore left as an
exercise. It would be good practice to see if you can modify the previous
proof and adapt it to this result.
Example 6.2.4(a):
Which of the following two functions is continuous:
1. If f(x) = 5x - 6, prove that f is continuous in its domain.
2. If f(x) = 1 if x is rational and f(x) = 0 if x is irrational, prove that f is not continuous at any point of its domain.
We have already seen from the graph that the first function seems to be continuous while the second one does not seem to be. We have to formally prove it, though.
Pick any ε > 0. Take any sequence { x_n } converging to c. Then there exists an integer N such that
| x_n - c | < ε / 5
for n > N. Then
| f(x_n) - (5 c - 6) | = | 5 x_n - 6 - 5 c + 6 | = 5 | x_n - c | < ε
for n > N. But then the sequence {f(x_n)} converges to 5 c - 6, or in other words: if a sequence {x_n} converges to c, then f(x_n) converges to f(c). That proves continuity of the first function.
As for the second one: if c is any real number we can find a sequence of rational numbers a_n converging to c, as well as another sequence of irrational numbers b_n also converging to c. But then the sequence {f(a_n)} is identically 1, and the sequence {f(b_n)} is identically 0. But then f does not have a limit at c, and hence can not be continuous at c either (we have seen this argument - more formally - in a previous example already).
Example 6.2.4(b):
If f(x) = x if x is rational and f(x) = 0 if x is irrational, prove that f is continuous at 0.
If one looks at this poor representation of the function, we see that it does not at all look continuous. But if {x_n} is any sequence of numbers (rational or irrational) that converges to zero, then for any ε > 0 there exists an integer N such that | x_n | < ε for n > N. But f(x_n) is either zero or x_n itself, and in any case we have
| f(x_n) | ≤ | x_n | < ε
That proves that the sequence { f(x_n) } converges to 0 = f(0), which proves that the function is continuous at zero.
As an exercise, prove that the function is not continuous for any other x.
Example 6.2.4(c):
If f(x) is continuous in a domain D, and {xn} is a Cauchy sequence in D, is
the sequence {f(xn)} also Cauchy ?
This seems to be true. After all, if f is continuous at c, and {xn} is a sequence converging to
c, then f(xn) must converge to f(c). And since convergent sequences are Cauchy one would
assume that the statement should be true.
But of course life is not so simple. Consider the function f(x) = 1/x and
the sequence {1/n}. Then the sequence is convergent to zero, and thus
is Cauchy. But f(1/n) = n, which is not Cauchy. This function is
continuous in the domain D = (0, 2), say, the sequence {1/n} is Cauchy
in D, but the sequence f(1/n) fails to be Cauchy.
The point here is that the function does not need to be continuous at the
limit point of a sequence, and hence the above statement is false. While
the sequence {1/n} converges to zero, the function f(x) is not
continuous at zero. Yet the sequence {1/n} is Cauchy in (0, 2).
Is the statement true if the domain of the function is all of R? Can you
find other formulations for which the statement would become true ?
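The counterexample is easy to see numerically. The sketch below is an added illustration (not from the text): it computes consecutive gaps of the Cauchy sequence {1/n} and of its image under f(x) = 1/x on (0, 2); the first gaps shrink while the second stay equal to 1.

def f(x):
    return 1.0 / x

xs = [1.0 / n for n in range(1, 12)]                      # Cauchy sequence in (0, 2)
gaps_x = [abs(a - b) for a, b in zip(xs, xs[1:])]
gaps_fx = [abs(f(a) - f(b)) for a, b in zip(xs, xs[1:])]
print(gaps_x)     # consecutive differences shrink towards 0
print(gaps_fx)    # consecutive differences of f(x_n) = n are all 1: not a Cauchy sequence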
Proposition 6.2.5: Algebra with Continuous Functions

The identity function f(x) = x is continuous in its domain.
If f(x) and g(x) are both continuous at x = c, so is f(x) + g(x) at x = c.
If f(x) and g(x) are both continuous at x = c, so is f(x) * g(x) at x = c.
If f(x) and g(x) are both continuous at x = c, and g(c) ≠ 0, then f(x) / g(x) is continuous at x = c.
If f(x) is continuous at x = c, and g(x) is continuous at x = f(c), then the composition g(f(x)) is continuous at x = c.
Proof:
Suppose f(x) = x. Then, given any ε > 0 choose δ = ε / 2. Then, if
| x - c | < δ
it implies that
| f(x) - f(c) | = | x - c | < δ = ε / 2 < ε
Hence, the identity function is indeed continuous. Was it really necessary to take δ = ε / 2 ?
That the sum of continuous functions is continuous follows directly from the triangle inequality. Take any ε > 0. There exists δ_1 > 0 such that whenever
| x - c | < δ_1
we know that
| f(x) - f(c) | < ε
(because f is continuous at c). There also exists δ_2 > 0 such that whenever
| x - c | < δ_2
we know that
| g(x) - g(c) | < ε
(because g is continuous at c). But then, if we let δ = min( δ_1, δ_2 ), we have: if
| x - c | < δ
then
| (f(x) + g(x)) - (f(c) + g(c)) | ≤ | f(x) - f(c) | + | g(x) - g(c) | < ε + ε = 2 ε
That finishes the proof. (That we don't get a simple ε should not bother us any more.)
The product of two continuous functions is again continuous, which follows from a simple trick. We will only look at the trick involved, and leave the details to the reader:
| f(x) g(x) - f(c) g(c) | = | f(x) g(x) - f(x) g(c) + f(x) g(c) - f(c) g(c) | ≤ | f(x) | | g(x) - g(c) | + | g(c) | | f(x) - f(c) |
With this trick the rest of the proof should not be too difficult.
A similar trick works for the quotient. Here is the idea:
| f(x) / g(x) - f(c) / g(c) | = | 1 / ( g(x) g(c) ) | | f(x) g(c) - f(c) g(x) |
Can you see how to continue ? Adding and subtracting will help again.
As for the composition of functions, we have to proceed somewhat differently. We know that f(x) is continuous at c, and g(x) is continuous at f(c). Therefore, given any ε > 0 there exists δ_1 > 0 such that whenever
| t - d | < δ_1
then
| g(t) - g(d) | < ε
There also exists δ_2 > 0 such that if
| x - c | < δ_2
then
| f(x) - f(c) | < δ_1
(Note that we have replaced the usual ε by δ_1 here.) Now let δ = min( δ_1, δ_2 ) and substitute t = f(x) and d = f(c). We have: if
| x - c | < δ
then
| f(x) - f(c) | < δ_1
and then
| g(f(x)) - g(f(c)) | < ε
In other words, g(f(x)) is continuous at x = c.
Examples 6.2.6(a):
Every polynomial is continuous in R, and every rational function r(x) = p(x) / q(x) is continuous whenever q(x) ≠ 0.
This follows by repeatedly applying the proposition on algebra on continuous functions:


The identity function is continuous
Multiplication of continuous functions yields continuous functions
o Hence, all monomials are continuous
Sums and differences of continuous functions are continuous
o Hence, all polynomials are continuous.
To prove directly that a general polynomial p(x) = a_n x^n + ... + a_1 x + a_0 is continuous is next to impossible. Taking the above detour makes the
proof very easy. This is an example of a case where proving an abstract
situation can be much simpler than proving a statement in a concrete
situation.
The statement about rational functions being continuous now follows
immediately from the fact that the division of two continuous functions
yields another continuous function provided that the denominator is not
zero.
Examples 6.2.6(b):
The absolute value of any continuous function is continuous.
First we have to prove that the usual absolute value function f(x) = | x |
is continuous. But this is clear from the graph of that function
(nonetheless, can you prove it formally ?).
Now the statement follows immediately from the fact that the
composition of two continuous functions yields another continuous
function.
As in the case of proving that a polynomial is continuous, the abstract case is much easier to prove than trying to prove, say, that the absolute value of a particular polynomial is continuous.
Example 6.2.8(a):
The function f(x) = 1 / x is continuous on (0, 1). Is it uniformly continuous
there ?
It helps to look at the graph of the function:
As the ε-interval 'slides' up the positive y-axis, the corresponding δ-interval on the x-axis gets smaller and smaller. That indicates that the function is not uniformly continuous - but it is of course not a proof. So:
Take ε = 1. Does there exist any δ such that
if | t - s | < δ then | f(t) - f(s) | < ε = 1 ?
The basic idea is easy: since | f(t) - f(s) | = | 1/t - 1/s | = | s - t | · 1 / | s t |, we can see that this expression approaches infinity if s and t approach zero, and therefore it will eventually be bigger than any chosen ε. All we have to do now is formalize this idea.
Assume there exists such a δ. Without loss of generality we may assume that δ < 1 (why ?). Then let t = δ / 2 and s = t + δ / 2 = δ. We have (since s and t are positive and | t - s | = δ / 2 < δ):
| f(t) - f(s) | = | 1/t - 1/s | = | (s - t) / (s t) | = (δ / 2) / (δ^2 / 2) = 1 / δ > 1
So no matter which δ < 1 we pick, we can find s and t with | t - s | < δ but | f(t) - f(s) | > 1 = ε. Therefore, the function is not uniformly continuous.
This proof, loosely speaking, depends on the fact that after simplification | f(t) - f(s) | goes to infinity if s and t approach zero. That is exactly the situation as described in the above picture.
Example 6.2.8(b):
The function f(x) = x^2 is continuous on [0, 1]. Is it uniformly continuous there ?
This function (as you could guess from its graph) is uniformly continuous on the closed interval [0, 1]. To prove it, note that
| f(t) - f(s) | = | t - s | | t + s | ≤ 2 | t - s |
because s and t are in the interval [0, 1]. Hence, given any ε > 0 we can simply choose δ = ε / 10 (or something similar) to prove uniform continuity. Can you fill in the details ? A similar argument, incidentally, would work on the interval [0, N] for any number N, but it would fail for the interval [0, ∞). So, is this function uniformly continuous on the interval [0, ∞) ? That's the next example.
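One way to fill in the details asked for above (an added sketch; the original leaves this as an exercise, and δ = ε / 2 is used here instead of ε / 10):

% Given \varepsilon > 0, choose \delta = \varepsilon / 2. For s, t in [0, 1] with |t - s| < \delta:
\[
  |f(t) - f(s)| = |t^2 - s^2| = |t - s|\,|t + s| \le 2\,|t - s| < 2\delta = \varepsilon .
\]
% Since \delta does not depend on where s and t lie in [0, 1], f is uniformly continuous there.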
Example 6.2.8(c):
The function f(x) = x^2 is continuous on [0, ∞). Is it uniformly continuous there ?
While this function is uniformly continuous on any interval [0, N] (N any number) it is no longer uniformly continuous on the interval [0, ∞). To prove this, take ε = 1. Note that
| f(s) - f(t) | = | s - t | | s + t |
Can you see that if s = t + δ / 2 and if t is sufficiently large (depending on the undetermined δ) then, no matter what δ is chosen, | f(s) - f(t) | > 1 ? That would prove that the function is no longer uniformly continuous. The details are left as an exercise.
Note that this argument no longer works on a bounded interval [0, N]. Here we can not make t 'sufficiently large', since it can be no larger than N. And indeed, the function is uniformly continuous on those bounded intervals. Later we will show that any function that is continuous on a compact set is necessarily uniformly continuous.
Example 6.2.8(d):
If f(x) is uniformly continuous in R, and { xn} is a Cauchy sequence, is the
sequence { f( xn) } also Cauchy ?
Note that this is not true if the function is only assumed to be continuous (as the example f(x) = 1 / x on the interval (0, 1) illustrates). But if f is uniformly continuous, we have:
given any ε > 0 there exists δ > 0 such that if | t - s | < δ then | f(t) - f(s) | < ε
If { x_n } is a Cauchy sequence, there exists an integer N such that
| x_n - x_j | < δ if n, j > N
But then, by uniform continuity, we have that
| f( x_n ) - f( x_j ) | < ε if n, j > N
Therefore, the sequence { f( x_n ) } is again a Cauchy sequence.
Why will this argument not work for a continuous function ? As an example, consider again the function f(x) = 1/x. It is continuous in the open interval (0, 1). Therefore, if we consider a sequence { x_n } that converges to a fixed point inside (0, 1), the same argument as above would show that { f( x_n ) } is also Cauchy. We can not use this argument, however, for sequences that converge to 0, since f is not even defined at zero. On the other hand, we can find Cauchy sequences which converge to 0, without ever having to refer to the value of the function at the limit of the sequence.
Theorem 6.2.9: Continuity and Uniform Continuity

If f is uniformly continuous in a domain D, then f is continuous in D.
If f is continuous on a compact domain D, then f is uniformly continuous in D.
Proof:
The first part of the proof is obvious. If | f(t) - f(s) | is small whenever | s - t | is small, regardless of the particular location of s and t, then in particular | f(x) - f( x_0 ) | must be small when | x - x_0 | is small.
The second part is much more complicated, and relies on the structure of compact sets on the real line. Recall that a set is compact if every open cover has a finite subcover. We first want to define a suitable open cover: pick an ε > 0. For every fixed x_0 in the compact set D there exists a (possibly different) δ( x_0 ) > 0 such that
| f( x_0 ) - f(x) | < ε if | x_0 - x | < δ( x_0 ) (here δ depends on x_0).
Define
U( x_0 ) = { x : | x_0 - x | < 1/2 δ( x_0 ) } and U = { U( x_0 ) : x_0 is contained in D }
Then U is an open cover of D, so by compactness it can be reduced to a finite subcover. Suppose that the sets U_1 = U( x_1 ), U_2 = U( x_2 ), ..., U_n = U( x_n ) cover the set D. Let
δ = 1/2 min{ δ( x_1 ), δ( x_2 ), ..., δ( x_n ) }
Since this is a minimum over a finite set, we know that δ > 0. Now take any two numbers t, s in D such that | t - s | < δ. Since the finite collection covers the compact set D we know that s is contained in, say, U_2. How far away from the center of U_2 is t then ? By the choice of δ we know that
| t - x_2 | ≤ | t - s | + | s - x_2 | < δ + 1/2 δ( x_2 ) ≤ 1/2 ( δ( x_2 ) + δ( x_2 ) ) = δ( x_2 )
Now we are almost done, because if | t - s | < δ then
| f(t) - f(s) | ≤ | f(t) - f( x_2 ) | + | f( x_2 ) - f(s) |
The first difference on the right is less than epsilon because | t - x_2 | < δ( x_2 ). The second one is also less than epsilon because s is contained in the set U_2. Hence, the difference on the left is less than twice epsilon. But now it should be easy for you to modify the proof so that we can arrive at
| f(t) - f(s) | < ε whenever | t - s | < δ in D
That finishes the proof.
Examples 6.3.2:
Which of the following functions, without proof, has a 'fake' discontinuity, a
'regular' discontinuity, or a 'difficult' discontinuity ?
This function seems to have a 'fake' discontinuity at x = 3, since we could easily move the
single point at x = 3 to the right height, thereby filling in the discontinuity.

This function, while simple, seems to have a 'true' discontinuity at x = 0. We can not
change the function in a single point to make it continuous.

This function is unclear. It is hard to determine what exactly is going on as x gets closer to
zero. Assuming that the function does turn out to be discontinuous at x = 0, it definitely
seems to have a 'difficult' discontinuity at x = 0.

This function is impossible to graph. The picture above is only a poor representation of the
true graph. Nonetheless, given any point x, the function jumps between 1 and 0 in every
neighborhood of x. That seems to mean that the function has a difficult discontinuity at
every point.
Examples 6.3.4(a):
Prove that k(x) has a removable discontinuity at x = 3, and draw the graph of
k(x).
We can easily check that the limit as x approaches 3 from the right and from the left is
equal to 4. Hence, the limit as x approaches 3 exists, and therefore the function has a
removable discontinuity at x = 3. If we define k(3) = 4 instead of k(3) = 1 then the
function in fact will be continuous on the real line.
Examples 6.3.4(b):
Prove that h(x) has a jump discontinuity at x = 0, and draw the graph of h(x)
It is easy to see that the limit of h(x) as x approaches 0 from the left is -1, while the limit of
h(x) as x approaches 0 from the right is +1. Hence, the left and right handed limits exist
and are not equal, which makes x = 0 a jump discontinuity for this function.
Examples 6.3.4(c):
Prove that f(x) has a discontinuity of second kind at x = 0
This function is more complicated. Consider the sequence x_n = 1 / (2 n π). As n goes to infinity, the sequence converges to zero from the right. But f( x_n ) = sin(2 n π) = 0 for all n. On the other hand, consider the sequence x_n = 2 / ( (2n+1) π ). Again, the sequence converges to zero from the right as n goes to infinity. But this time f( x_n ) = sin( (2n+1) π / 2 ), which alternates between +1 and -1. Hence, this limit does not exist. Therefore, the limit of f(x) as x approaches zero from the right does not exist.
Since f(x) is an odd function, the same argument shows that the limit of f(x) as x approaches zero from the left does not exist.
Therefore, the function has an essential discontinuity at x = 0.
Examples 6.3.4(d):
What kind of discontinuity does the function g(x) have at every point (with
proof).
This function is impossible to graph. The picture above is only a poor representation of the
true graph. Nonetheless, take an arbitrary point x0 on the real axis. We can find a sequence
{xn} of rational points that converge to x0 from the right. Then g(xn) converges to 1. But we
can also find a sequence {xn} of irrational points converging to x0 from the right. In that
case g(xn) converges to 0. But that means that the limit of g(x) as x approaches x0 from the
right does not exist. The same argument, of course, works to show that the limit of g(x) as
x approaches x0 from the left does not exist. Hence, x0 is an essential discontinuity for g(x).
Theorem 6.3.6: Discontinuities of Monotone Functions

If f is a monotone function on an open interval (a, b), then any discontinuity that f may have in this interval is of the first kind.
If f is a monotone function on an interval [a, b], then f has at most countably many discontinuities.
Proof:
Suppose, without loss of generality, that f is monotone increasing, and has a discontinuity at x_0. Take any sequence x_n that converges to x_0 from the left, i.e. x_n < x_0. Then f( x_n ) is a monotone increasing sequence of numbers that is bounded above by f( x_0 ). Therefore, it must have a limit. Since this is true for every such sequence, the limit of f(x) as x approaches x_0 from the left exists. The same proof works for limits from the right.
Note: This proof is actually not quite correct. Can you see the mistake ? Is it really true that if x_n converges to x_0 from the left then f( x_n ) is necessarily increasing ? Can you fix the proof so that it is correct ?
As for the second statement, we again assume without loss of generality that f is monotone increasing. Define, at any point c, the jump of f at x = c as:
j(c) = lim(x → c+) f(x) - lim(x → c-) f(x)
Note that j(c) is well-defined, since both one-sided limits exist by the first part of the theorem. Since f is increasing, the jumps j(c) are all non-negative. Note also that the sum of all jumps can not exceed the number f(b) - f(a). Now let J(n) be the set of all points c where j(c) is greater than 1/n, and let J be the set of all points where f has a jump in the interval [a, b]. Since the sum of the jumps can not exceed f(b) - f(a), each set J(n) is finite. But then, since the union of all sets J(n) gives the set J, the set of jumps is a countable union of finite sets, and is thus countable.
Corollary 6.3.7: Discontinuities of Second Kind

If f has a discontinuity of the second kind at x = c, then f must change from increasing to decreasing in every neighborhood of c.
Proof:
Suppose not, i.e. f has a discontinuity of the second kind at a point x =
c, and there does exist some (small) neighborhood of c where f, say, is
always decreasing. But then f is a monotone function, and hence, by the
previous theorem, can only have discontinuities of the first kind. Since
that contradicts our assumption, we have proved the corollary.
Examples 6.3.8(a):
What kind of discontinuity does the function f(x) = exp(1/x) have at x = 0 ?
f(x) = exp(1/x)
As x approaches zero from the right, 1/x approaches positive infinity.


Therefore, the limit of f(x) as x approaches zero from the right is positive
infinity.
As x approaches zero from the left, 1/x approaches negative infinity.
Therefore, the limit of f(x) as x approaches zero from the left is zero.
Since the right-handed limit fails to exist, the function has an essential
discontinuity at zero.
Examples 6.3.8(b):
What kind of discontinuity does the function f(x) = x sin(1/x) have at x = 0 ?
f(x) = x sin(1/x)
Since | x sin(1/x) | ≤ | x |, we can see that the limit of f(x) as x approaches zero from either side is zero. Hence, the function has a removable discontinuity at zero. If we set f(0) = 0 then f(x) is continuous.
Examples 6.3.8(c):

What kind of discontinuity does the function f(x) = cos(1/x) have at x = 0 ?
f(x) = cos(1/x)
By looking at sequences involving integer multiples of π or of π / 2, we can see that the limits of f(x) as x approaches zero from the right and from the left both do not exist. Hence, f(x) has an essential discontinuity at x = 0.
Proposition 6.4.1: Continuity and Topology
Let f be a function with domain D in R. Then the following statements are
equivalent:
1. f is continuous
2. If D is open, then the inverse image of every open set under f is again
open.
3. If D is open, then the inverse image of every open interval under f is
again open.
4. If D is closed, then the inverse image of every closed set under f is
again closed.
5. If D is closed, then the inverse image of every closed interval under f
is again closed.
6. The inverse image of every open set under f is the intersection of D
with an open set.
7. The inverse image of every closed set under f is the intersection of D
with a closed set.
Proof:
(1) => (2): Assume that f is continuous on an open set D. Let U be an open set in the range of f. We need to show that f^{-1}(U) ⊂ D is again open. Take any x_0 ∈ f^{-1}(U). That is equivalent to saying that f( x_0 ) ∈ U. Since U is open, we can find an ε > 0 such that the ε-neighborhood of f( x_0 ) is contained in U. For this fixed ε we can use the continuity of f to pick a δ > 0 such that
if | x - x_0 | < δ then | f(x) - f( x_0 ) | < ε
This implies that the δ-neighborhood of x_0 is contained in f^{-1}(U). Hence, the inverse image of the arbitrary open set U is open.
(2) => (1): Assume that the inverse image f^{-1}(U) of every open set U is open. Take any point x_0 ∈ D and pick an ε > 0. Then the ε-neighborhood of f( x_0 ) is an open set, so that its inverse image is again open. That inverse image contains x_0, and since it is open it contains a δ-neighborhood of x_0 for some δ > 0. But that is exactly what we want:
if | x - x_0 | < δ then x is contained in the set f^{-1}( ( f( x_0 ) - ε, f( x_0 ) + ε ) )
or in other words
if | x - x_0 | < δ then | f(x) - f( x_0 ) | < ε
(2) <=> (3): This follows immediately from the fact that every open set in the real line can be written as the countable union of open intervals.
(2) <=> (4): This follows immediately by looking at complements, i.e. from the fact that
f^{-1}( comp(U) ) = comp( f^{-1}(U) )
That equality should be proved as an exercise.
(4) <=> (5): This follows again by combining the two previous remarks.
(6), (7): This proof is very similar to the proof of (2) <=> (1). In fact, where in that previous proof have we used the fact that the domain D of the function is open ? The details are left as an exercise again.
Examples 6.4.2(a):
Let f(x) = x^2. Show that f is continuous by proving
1. that the inverse image of an open interval is open.
2. that the inverse image of a closed interval is closed.
f(x) = x^2
First, let's look at the inverse images of an open interval:
If 0 < a < b then the inverse image of the open interval (a, b) is ( √a, √b ) ∪ ( -√b, -√a ). In particular, the inverse image is open.
If a < 0 < b then the inverse image of the open interval (a, b) is ( -√b, √b ), which is again open.
If a < b < 0, then the inverse image of the open interval (a, b) is again open (which set is it ?)
But it is now obvious that the inverse image of closed intervals is again a closed set (note that the empty set is both open and closed).
Hence, we have proved that the function f(x) = x^2 is continuous, avoiding the tedious epsilon-delta proof.
Examples 6.4.2(b):
Let f(x) = 1 if x > 0 and f(x) = -1 if x ≤ 0. Show that f is not continuous by
1. finding an open set whose inverse image is not open.
2. finding a closed set whose inverse image is not closed.
f(x) = 1 if x > 0 and f(x) = -1 if x ≤ 0
The inverse image of the set (-2, 0) is the negative real axis, together with the origin. That
set is closed. We have found an open set whose inverse image is not open; therefore the
function is not continuous.
The inverse image of the set [0,2] is the positive real axis without the
origin. That set is open. We have found a closed set whose inverse
image is not closed; therefore, the function is not continuous.
Examples 6.4.3:
Is it true that if f is continuous, then the image of an open set is again open ?
How about the image of a closed set ?
This is true for inverse images but not for images. Consider the example of a parabola, which certainly represents a continuous function:
f(x) = x^2
Then the image of the set (-1, 1) is the set [0, 1). That set is neither open nor closed; in particular, it is not open.
To find a counterexample for images of closed sets, let's look at a function such as f(x) = 1 / (1 + x^2). This function is continuous on the whole real line, and the image of the set [0, ∞) is the set (0, 1]. Therefore we have found a closed set whose image under a continuous function is not closed (nor open).
Proposition 6.4.4: Images of Compact and Connected Sets

If f is a continuous function on a domain D, then:
1. the image of every compact set is again compact.
2. the image of every connected set is again connected.
Proof:
1. Note that the image of a closed set is not necessarily closed for continuous functions, and the image of a bounded set is not necessarily bounded. However, the image of a closed and bounded set is again closed and bounded (under continuous functions). Despite this, the proof is fairly easy: Recall that a set D is compact if every open cover of D can be reduced to a finite subcover.
Let A be an open cover of the set f(D). Since f is continuous, the collection
{ f^{-1}(U) : U ∈ A }
is a collection of open sets that cover D. Since D is compact, this collection can be reduced to a finite subcover, say:
f^{-1}( U_1 ), f^{-1}( U_2 ), ..., f^{-1}( U_n )
But then the sets U_1, U_2, ..., U_n cover the set f(D). Hence, every open cover of f(D) can be reduced to a finite subcover. Therefore, f(D) is compact.
Note: We have used the fact that the inverse image of open sets is open. That, however, is only true if the original domain D of the function is also open. Can you modify this proof so that it still works for a not necessarily open domain D ?
2. This proof is again simple. The idea is as follows: suppose U ⊂ D is connected, but f(U) is not connected. Then
f(U) = A ∪ B with A, B open and disjoint
Since f is continuous, f^{-1}(A) and f^{-1}(B) are both open. They are clearly disjoint, and their union contains all of U. But then U is not connected, which is a contradiction. Thus, the image of every connected set under a continuous function is connected.
Note: Just as before, this proof is not completely correct. It does reflect the major idea, but some technicalities are missing. Why is this proof technically not correct ? Can you fix it ?
Examples 6.4.5(a):
If f(x) = 1 / (1 + x^2) (the original gave the formula as an image; this function has the properties used in the answer below), then:
1. what is the image of [-2, 1] ?
2. find a closed set whose image is not closed
Note that this function is continuous. The image of the interval [-2, 1] is [1/5, 1]. In
particular, the image of a compact set is compact.
To find a closed set whose image is not closed we must consider an
unbounded set. Otherwise, a closed and bounded set is compact, and
since the image of a compact set under continuous functions is
compact, it is in particular closed again.

But if we look at the closed, unbounded set [0, ∞), we see that the image of that set under the above function is the set (0, 1], which is not closed.
Examples 6.4.5(b):
Find examples for the following situations:
1. A continuous function and a set whose image is not connected.
2. A continuous function and a disconnected set whose image is
connected.
3. A function such that the image of a connected set is disconnected.
4. Is it true that inverse images of connected sets under continuous
functions are again connected ?
1. If f is a continuous function, the image of every connected set is connected. Therefore, to find an example for this situation, we must start with a disconnected set. It is very easy then to come up with examples. One can use, for example, the standard parabola and two separate intervals on the positive real axis. Details are left as an (easy) exercise.
2. Again, we can use the standard parabola. By taking two suitable
intervals, one on the positive real axis, and the other on the negative
real axis, one can easily construct such an example. Details are left as
an exercise.
3. This time we can not take a continuous function, since the images of connected sets always will be connected for those types of functions. An easy example may be provided by taking the function f(x) = 1 for x > 0 and f(x) = -1 for x ≤ 0 and a suitable connected interval on the real axis. Details, as usual, are left as an exercise.
4. No, that is not true: inverse images of connected sets may be
disconnected. A quick look at the standard parabola will provide us with
an easy example. Details ? Of course left as exercise.
Theorem 6.4.6: Max-Min Theorem for Continuous Functions

If f is a continuous function on a compact set K, then f has an absolute maximum and an absolute minimum on K.
In particular, f must be bounded on the compact set K.
Proof:
With the work we have done previously, this proof is easy: Since K is
compact and f a continuous function, f(K) is compact also. The compact
set f(K) is bounded, so that f is bounded on K. The compact set f(K) also
contains its infimum and supremum, so that f has an absolute minimum
and maximum on K.
That's all !
Examples 6.4.7(a):
Find a continuous function on a bounded interval that is unbounded. How
about a continuous function on a bounded interval that does not have an
absolute maximum but is bounded ?
If the function was defined on a bounded and closed set (i.e. a compact set), it would have
to have an absolute maximum and minimum, and would therefore be bounded. To
construct our counterexample, we again have to define a function on an open, bounded
interval. The rest is left as an exercise.
While at first glance confusing, all we have to remember is that if the domain of the function was a compact (i.e. closed and bounded) interval, the function would have to have an absolute maximum. Hence, to construct a counterexample, we need to define a continuous function on an open, bounded interval. The rest is easy, and left as an exercise.
These examples show that the closedness condition in the Max/Min
theorem for continuous functions is essential.
Examples 6.4.7(b):
Does the function f(x) = 1 / (1 + x^2) (the original gave the formula as an image; this function matches the values used in the answer) have an absolute maximum and minimum on [-2, 1] ? How about on the interval [0, ∞) ?
This function does have an absolute maximum and minimum on the interval [-2, 1], as
predicted by the Max/Min theorem for continuous functions. The absolute maximum is 1,
and the absolute minimum is 1/5.

On the other hand, on the unbounded interval [0, ∞) the function fails to possess both absolute maximum and minimum. While 1 is still the absolute maximum, there no longer is an absolute minimum.
Thus, the boundedness condition in the Max/Min theorem for continuous
functions is essential.
Theorem 6.4.8: Bolzano Theorem
If f is continuous on a closed interval [a, b] and f(a) and f(b) have opposite signs, then there exists a number c in the open interval (a, b) such that f(c) = 0.
Proof:
With the work we have done so far this proof is easy. Since (a, b) is
connected and f a continuous function, the interval f( (a,b) ) is also
connected. Therefore that set must contain the interval (f(a), f(b))
(assuming that f(a) < f(b) ). Since f(a) and f(b) have opposite signs, this
interval must include 0. Therefore, 0 is in the image of (a, b), or
equivalently: there exists a c in open interval (a, b) such that f(c) = 0.
That's it !
Examples 6.4.9(a):
Show that the equation cos(x) = x has a solution in the interval [-10, 10].
Let's take a look at the two functions f(x) = cos(x) and g(x) = x in one coordinate system:
So one can clearly see that there is exactly one solution. We can use Bolzano's theorem to actually prove that there must be at least one solution: Let h(x) = cos(x) - x. Then h is a continuous function and
h(-π) = -1 + π > 0
h(π) = -1 - π < 0
Hence, by Bolzano's theorem there must be at least one place x_0 where h( x_0 ) = 0, or equivalently where cos( x_0 ) = x_0.
One can use Bolzano's theorem to construct an algorithm that will find
zeros of a function to a prescribed degree of accuracy in many cases. In
simple terms:
start with an interval [a, b] where h(a) * h(b) < 0 (i.e. h(a) and h(b) have opposite
signs)
find a point c - usually (a + b) / 2 - such that either h(a) * h(c) < 0 or h(b) * h(c) <
0
o if h(a) * h(c) < 0, repeat this procedure with b replaced by c
o if h(b) * h(c) < 0, repeat this procedure with a replaced by c.
Continue until the difference b - a is small enough.
Would this procedure find the zero of the function f(x) = x2 in the
interval [-1, 1] ?
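This bisection procedure is easy to turn into a short program. The following Python sketch (our own illustration; the names are not from the text) finds the zero of h(x) = cos(x) - x and also shows why the method cannot even get started for f(x) = x² on [-1, 1]:

    import math

    def bisect(h, a, b, tol=1e-10):
        """Approximate a zero of h in [a, b], assuming h(a) and h(b) have opposite signs."""
        if h(a) * h(b) >= 0:
            raise ValueError("h(a) and h(b) must have opposite signs")
        while b - a > tol:
            c = (a + b) / 2.0
            if h(a) * h(c) < 0:
                b = c          # the sign change, and hence a zero, lies in [a, c]
            else:
                a = c          # otherwise it lies in [c, b]
        return (a + b) / 2.0

    # The root of cos(x) = x, i.e. of h(x) = cos(x) - x, lies between -pi and pi:
    print(bisect(lambda x: math.cos(x) - x, -math.pi, math.pi))   # about 0.739085

    # For f(x) = x**2 on [-1, 1] the method cannot start, since f(-1) * f(1) = 1 > 0.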
Examples 6.4.9(b):
Show that the equation p(x) = 0 has at least one solution in R, where p is a polynomial of odd degree.
Back

Using a computer it is simple enough to draw this function and to see the approximate
solution. However, it is even easier to prove that there must be a solution (without
specifying where the solution would be).
The function p(x) is an odd-degree polynomial. Therefore:
As x approaches +∞, p(x) approaches +∞ (assuming a positive leading coefficient), so that there exists A such that p(A) > 0.
As x approaches -∞, p(x) approaches -∞, so that there exists B such that p(B) < 0.
Hence, by Bolzano's theorem there exists a zero of p(x) between the (unknown !) numbers A and B.
Theorem 6.4.10: Intermediate Value Theorem


If f is continuous on a closed interval [a, b] and d is any number between
f(a) and f(b). Then there exists a number c in the open interval (a, b) such
that f(c) = d.
Context

Proof:
With the work we have done so far this proof is easy. In fact, the easiest
proof is an application of Bolzano's theorem, and is left as an exercise.

Examples 6.5.2(a):
Find the derivative of f(x) = x and of f(x) = 1 / x
Back

1. If f(x) = x, then
lim_{x → c} (x - c) / (x - c) = 1, so that f'(x) = 1 for all x.
2. If f(x) = 1 / x, then
lim_{x → c} (1/x - 1/c) / (x - c) = lim_{x → c} (c - x) / (x - c) · 1 / (x c) = - 1 / c², so that f'(x) = - 1 / x² for all x not equal to zero.
Examples 6.5.2(b):
Find the derivative of
Back

Expanding the difference quotient ( f(x) - f(c) ) / (x - c) and taking the limit as x approaches c gives:
f'(x) = 2 - 3 x² for all x.
Theorem 6.5.3: Derivative as Linear Approximation


Let f be a function defined on (a, b) and c any number in (a, b). Then f is
differentiable at c if and only if there exists a constant M such that
f(x) = f(c) + M ( x - c ) + r(x)
where the remainder function r(x) satisfies the condition
lim_{x → c} r(x) / (x - c) = 0
Context

Proof:

First, suppose f is differentiable at x = c. Let the constant M = f'(c) and set
r(x) = f(x) - f(c) - f'(c) ( x - c )
We have to check the limit of the quotient
r(x) / (x - c) = ( f(x) - f(c) ) / (x - c) - f'(c)
Since f is differentiable, the limit of this expression is zero as x approaches c, as required.
Second, suppose that f(x) = f(c) + M ( x - c ) + r(x) for some constant M and
lim_{x → c} r(x) / (x - c) = 0. Then
( f(x) - f(c) ) / (x - c) - M = r(x) / (x - c)
The limit on the right as x approaches c is zero by assumption. Hence, the limit on the left must also be zero, and we recognize the constant M as the derivative f'(c) of the function f.
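As a quick numerical illustration of this characterization (a sketch of ours, using f(x) = sin(x) as a sample function), the quotient r(x)/(x - c) indeed shrinks as x approaches c:

    import math

    f, fprime, c = math.sin, math.cos, 0.5   # sample function, its derivative, and a point

    for h in [0.1, 0.01, 0.001, 0.0001]:
        x = c + h
        r = f(x) - f(c) - fprime(c) * (x - c)   # remainder of the linear approximation
        print(h, r / (x - c))                   # tends to 0 as h shrinks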
Examples 6.5.4(a):
Why might our original definition of differentiability not be suitable for
functions of, say, two or three real variables ?
Back

Suppose we have a function of three real variables X = (x, y, z), defined in all of three-dimensional space, with values in R.
Let us try to use the original definition of differentiability: we would like
to call the function f(X) = f(x, y, z) differentiable in a point C = (a, b, c) if
the limit of the difference quotient

( f(X) - f(C) ) / (X - C)

as X approaches C exists. The numerator is well-defined, since f(X) and


f(C) are real numbers, so that their difference can be computed as
usual. For the denominator we can define the difference of X - C = (x, y,
z) - (a, b, c) by taking the differences in each component, i.e. X - C = ( x
- a, y - b, z - c). However, we have problems defining division. To make
sense of the above difference quotient we must know how to define the
quotient of a real number and a vector X = (x, y, z).
There is no satisfying definition of such a quotient. Therefore, we can
not use this difference quotient in this situation to define 'derivative'.
To avoid this problem, we need a single real number in the denominator
as well. Hence, we could look, for example, at the quotient
( f(X) - f(C) ) / (x - a)

where X = (x, y, z) and C = (a, b, c) and take the limit as x approaches c


with all other variables fixed. This concept, somewhat modified, is
known as partial derivatives.
Thus, we can use our original definition of differentiability to define
partial derivatives, but not to define 'the derivative' of a function of
more than one variable.
Note: There is one notable exception - the space of complex
numbers. Recall that a complex number z = x + i y consists of a real
and an imaginary part, where i is the square root of -1. Thus, a complex
number z = x + i y can be identified with the tuple z = (x, y). And there
is a way in two dimensional space to define division (and multiplication)
in a meaningful way (how ?). Therefore, if we identify two-dimensional
real space with the space of complex numbers, we can use the original
definition of derivative to define the complex derivative of a complex
function.
Examples 6.5.4(b):
Use the characterization of differentiability via approximation by linear
functions to define the concept of 'derivative' for functions of n real
variables.
Back

As we have seen, we can not use the definition involving the limit of the difference
quotient as a definition of 'the derivative' of a function of n real variables. Therefore, we
want to try to use the 'linear approximation' characterization to give us a workable concept
of derivatives in higher dimensions. First, recall that a linear function in n variables is
given as matrix multiplication:
X --> A · X
where A is an n-by-n matrix with constant coefficients. As you recall, this


is a linear map from n-dimensional space into itself. We can also define
the concept of distance, or norm, in n-dimensional space as follows:
|| X || = ( x1² + x2² + ... + xn² )^(1/2)
Now we can define 'the derivative' of a function f(X) of n-variables with


range in R^n:
f: R^n --> R^n is called 'differentiable' at a point C in R^n if there exists an n-by-n matrix
A such that:
o f(X) = f(C) + A * (X - C) + r(X)
where the remainder r(X) satisfies the condition
o lim_{X → C} || r(X) || / || X - C || = 0
The matrix A is called the total derivative of the function f. Its
coefficients, as you may recall from multi-dimensional calculus, are all
possible partial derivatives of f at the point X = C.
We can define, in a similar fashion, the total differential of a function
from n-dimensional into m-dimensional space, as long as we have the
concept of a linear map and of a distance, or norm.
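The following Python sketch (ours; it uses NumPy, and all names are illustrative) approximates the matrix A of partial derivatives by finite differences for a sample map from R² to R², and checks that the remainder r(X) = f(X) - f(C) - A (X - C) is small compared to || X - C ||, in the spirit of the definition above:

    import numpy as np

    def f(X):
        # sample map from R^2 to R^2
        x, y = X
        return np.array([x * y, x + np.sin(y)])

    def jacobian(f, C, h=1e-6):
        """Approximate the matrix of partial derivatives of f at C by central differences."""
        C = np.asarray(C, dtype=float)
        n = len(C)
        m = len(f(C))
        A = np.zeros((m, n))
        for j in range(n):
            E = np.zeros(n); E[j] = h
            A[:, j] = (f(C + E) - f(C - E)) / (2 * h)
        return A

    C = np.array([1.0, 2.0])
    A = jacobian(f, C)

    X = C + np.array([1e-3, -2e-3])
    r = f(X) - f(C) - A @ (X - C)                       # remainder of the linear approximation
    print(np.linalg.norm(r) / np.linalg.norm(X - C))    # small, and -> 0 as X -> C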
Theorem 6.5.5: Differentiability and Continuity


If f is differentiable at a point c, then f is continuous at that point c. The
converse is not true.
Context

Proof:
Note that
f(x) - f(c) = ( f(x) - f(c) ) / (x - c) · (x - c)
As x approaches c, the limit of the quotient exists by assumption and is
equal to f'(c), and the limit of the right-hand factor (x - c) exists also and is
zero. Therefore:
lim_{x → c} ( f(x) - f(c) ) = f'(c) · 0 = 0
which is another way of stating that f is continuous at x = c.


Examples 6.5.6(a):
The function f(x) = | x | is continuous everywhere. Is it also differentiable
everywhere ?
Back

We know that f is continuous. To check for differentiability, we have to


employ the basic definition:
f'(c) = lim_{x → c} ( | x | - | c | ) / (x - c)
o If c > 0 then x > 0 eventually. Then there is no need for the absolute value.
The limit becomes +1.
o If c < 0 then x < 0 eventually. The absolute values are resolved by an
additional negative sign. The limit becomes -1.
o If c = 0, then the left and the right-handed limits will be different (-1 and
+1). Therefore, the function is not differentiable at 0.

This is an example of a function that shows that
differentiability is a stronger concept than continuity:
every differentiable function is continuous (theorem)
there are continuous functions that are not differentiable
Examples 6.5.6(b):
The function f(x) = x sin(1/x) is continuous everywhere except at x = 0,
where it has a removable discontinuity. If the function is extended
appropriately to be continuous at x = 0, is it then differentiable at x = 0 ?
Back

We have seen before that this function has a removable


discontinuity at x = 0. If we set
f(0) = 0
then the function is continuous on the real line. To
check differentiability, we'll have to check the limit of
the difference quotient for c = 0.

Note that
(f(x) - f(0) ) / x = x sin( 1 / x) / x = sin( 1 / x )
But we have seen before that this function does not have a limit as x
approaches 0. Can you recall the argument for this statement ?
Therefore, the function is not differentiable at 0. As a product and
composition of functions that are differentiable for all x but zero, this
function is differentiable everywhere except at x = 0.
Examples 6.5.6(c):
The function f(x) = x2 sin(1 / x ) has a removable discontinuity at x = 0. If
the function is extended appropriately to be continuous at x = 0, is it then
differentiable at x = 0 ?
Back

To change the function into a continuous function,


we set
f(0) = 0

This function is now differentiable at 0,


because:
(f(x) - f(0)) / (x - 0) = x sin( 1 / x )

Since | x sin( 1 / x ) | ≤ | x | for all x ≠ 0, we see that the limit of the
difference quotient for c = 0 equals zero. Hence, f is differentiable at 0,
and

f'(0) = 0

The function is also differentiable everywhere else, since it is the


product and composition of differentiable functions everywhere but for x
= 0. Therefore, the function is differentiable on the whole real line.
Actually, this function is more interesting than it seems, because it is
continuous
once differentiable
but its derivative is not continuous
Thus, it provides an example to show that even if derivatives exist, they
do not necessarily have to be continuous. These statements are proved
at a later point, but you might want to try it on your own already.
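Here is a small numerical sketch (ours) illustrating both claims: the difference quotients at 0 shrink to 0, while f'(x) = 2x sin(1/x) - cos(1/x), valid for x ≠ 0 by the product and chain rules, keeps oscillating and does not approach f'(0) = 0:

    import math

    def f(x):
        return x * x * math.sin(1.0 / x) if x != 0 else 0.0

    def fprime(x):
        # valid for x != 0
        return 2 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

    for x in [0.1, 0.01, 0.001, 0.0001]:
        # difference quotient at 0 tends to 0, but f'(x) does not settle down near 0
        print(x, (f(x) - f(0)) / x, fprime(x))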
Theorem 6.5.7: Algebra with Derivatives


Addition Rule: If f and g are differentiable at x = c then f(x) + g(x) is
differentiable at x = c and
d/dx (f(x) + g(x)) = f'(x) + g'(x)
Product Rule: If f and g are differentiable at x = c then f(x) g(x) is
differentiable at x = c and
d/dx (f(x) g(x)) = f'(x) g(x) + f(x) g'(x)
Quotient Rule: If f and g are differentiable at x = c, and g(c) ≠ 0 then f(x) /
g(x) is differentiable at x = c, and
d/dx ( f(x) / g(x) ) = ( f'(x) g(x) - f(x) g'(x) ) / g(x)²
Chain Rule: If g is differentiable at x = c, and f is differentiable at x = g(c)
then f(g(x)) is differentiable at x = c, and
d/dx f(g(x)) = f'(g(x)) g'(x)


Context

Proof:
These proofs, except for the chain rule, consist of adding and
subtracting the same terms and rearranging the result. We will prove
the product and chain rule, and leave the others as an exercise.
Product rule:
( f(x) g(x) - f(c) g(c) ) / (x - c)
= ( f(x) g(x) - f(x) g(c) + f(x) g(c) - f(c) g(c) ) / (x - c)
= f(x) ( g(x) - g(c) ) / (x - c) + g(c) ( f(x) - f(c) ) / (x - c)
which converges to f(c) g'(c) + g(c) f'(c) as x approaches c.

Chain Rule: A 'quick and dirty' proof would go as follows:
( f(g(x)) - f(g(c)) ) / (x - c) = [ ( f(g(x)) - f(g(c)) ) / ( g(x) - g(c) ) ] · [ ( g(x) - g(c) ) / (x - c) ]
Since g is differentiable, g is also continuous at x = c. Therefore, as x
approaches c we know that g(x) approaches g(c). The first factor, by a
simple substitution, converges to f'(u), where u = g(c). The second
factor converges to g'(c). Hence, by our rule on product of limits we see
that the final limit is going to be f'(u) g'(c) = f'(g(c)) g'(c), as required.
But this 'simple substitution' may not be mathematically precise. Here is
a better proof of the chain rule.
Define the function h(t) as follows, for a fixed s = g(c):
h(t) = ( f(t) - f(s) ) / (t - s) for t ≠ s, and h(s) = f'(s)
Since f is differentiable at s = g(c) the function h is continuous at s. We
have
f'(s) = h(s) = lim_{t → s} h(t) and
f(t) - f(s) = h(t) (t - s) for all t
Now we have, with t = g(x):
( f(g(x)) - f(g(c)) ) / (x - c) = h(g(x)) · ( g(x) - g(c) ) / (x - c) → h(g(c)) g'(c) = f'(g(c)) g'(c)
which proves the chain rule. This is, of course, the rigorous version of
the above 'simple substitution'.
Note that the chain rule and the product rule can be used to give a
quick proof of the quotient rule.
Theorem: Rolle's Theorem


If f is continuous on [a, b] and differentiable on (a, b), and f(a) = f(b) = 0, then there exists
a number c in (a, b) such that f'(c) = 0.

Context

Proof:

Illustrating Rolle's theorem


If f is constantly equal to zero, there is nothing to prove. Hence, assume
f is not constantly equal to zero. Since f is a continuous function on a
compact set it assumes its maximum and minimum on that set. One of
them must be non-zero, otherwise the function would be identically
equal to zero. Assume for now that f(c) ≠ 0 is a maximum. Since f(a) =
f(b) = 0 we know that c is in (a, b), and therefore f is differentiable at c.
Note that f(x) ≤ f(c) since f(c) is a maximum.
if x < c then ( f(x) - f(c) ) / (x - c) ≥ 0 for all x < c
if x > c then ( f(x) - f(c) ) / (x - c) ≤ 0 for all x > c

The first inequality implies that as x approaches c from the left, the limit
must be greater than or equal to zero. The second one says that as x
approaches c from the right, the limit must be less than or equal to zero.
But since f is differentiable at c we know that both right and left handed
limits exist and must agree. Therefore, f'(c) = 0.
The proof is similar if f(c) is a minimum. Can you see what would
change, if anything ?
Note: As a consequence of this proof we have shown that if a
differentiable function has a maximum or minimum in the interior of its
domain then the derivative at that point must be zero.
Theorem 6.5.9: Mean Value Theorem


If f is continuous on [a, b] and differentiable on (a, b), then there exists a
number c in (a, b) such that

f'(c) = ( f(b) - f(a) ) / (b - a)
If f and g are continuous on [a, b] and differentiable on (a, b) and g'(x) ≠ 0
in (a, b) then there exists a number c in (a, b) such that
f'(c) / g'(c) = ( f(b) - f(a) ) / ( g(b) - g(a) )
Context

Proof:
The first version of the Mean Value theorem is actually Rolle's theorem
in disguise. A simple linear function can convert one situation into the
other:

geometric interpretation of
MVT
We need a linear function (linear so that we can easily compute its
derivative) that maps the line through the two points ( a, f(a) ) and ( b,
f(b) ) to the points ( a, 0 ) and ( b, 0 ). If we subtract that map from the
function we will be in a situation where we can apply Rolle's theorem.
To find the equation of such a line is easy:

y - f(a) = ( f(b) - f(a) ) / (b - a) · (x - a)
Thus, we define the following function:
h(x) = f(x) - ( f(b) - f(a) ) / (b - a) · ( x - a ) - f(a)

Then h is differentiable in ( a, b ) with h(a) = h(b) = 0. Therefore, Rolle's
theorem guarantees a number c between a and b such that h'(c) = 0.
But then
0 = h'(c) = f'(c) - ( f(b) - f(a) ) / (b - a)
which is exactly what we had to show for the first part.


The second part is very similar. You can fill in the details yourself by
considering the function
h(x) = f(x) - ( f(b) - f(a) ) / ( g(b) - g(a) ) · ( g(x) - g(a) ) - f(a)

Examples 6.5.10(a):
Does Rolle's theorem apply to
defined on (-3, 3) ? If so, find
the number guaranteed by the theorem to exist.
Back

This function is continuous on the interval [-3, 3], and differentiable on (-3, 3). It is not differentiable at x = -3 and x = +3, but Rolle's theorem
does not require the function to be differentiable at the endpoints. Also,
f(3) = f(-3) = 0. Therefore, Rolle's theorem does apply.

It guarantees the existence of a number c between -3 and 3 such that


f'(c) = 0. It does not specify where exactly this number is located.
However, a quick calculation shows that the number c is in fact c = 0.
Examples 6.5.10(b):
If f is differentiable on R and | f'(x) | ≤ M for all x, then | f(x) - f(y) | ≤ M | x - y | for all numbers x, y.
Back

Functions that satisfy the inequality
| f(x) - f(y) | ≤ M | x - y |
for some constant M are called Lipschitz functions. Using those terms, we have to prove
that if f is differentiable with a uniformly bounded derivative then it is a Lipschitz function.
Take any two numbers a, b. By the mean value theorem we know that
there exists an x in (a, b) such that:
f'(x) = ( f(b) - f(a) ) / (b - a)
Taking absolute values on both sides and moving the denominator to
the other side we have
| f(b) - f(a) | = | f'(x) | | b - a |
Since f'(x) is uniformly bounded by M, we therefore have
| f(b) - f(a) | ≤ M | b - a |
But that is exactly what we wanted to prove.
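As a quick numerical sanity check of this Lipschitz estimate (a sketch of ours with the sample function f(x) = sin(x), for which |f'(x)| ≤ 1):

    import math
    import random

    M = 1.0   # bound for |f'(x)| when f = sin

    for _ in range(5):
        x, y = random.uniform(-10, 10), random.uniform(-10, 10)
        lhs = abs(math.sin(x) - math.sin(y))
        rhs = M * abs(x - y)
        print(round(lhs, 6), "<=", round(rhs, 6), lhs <= rhs)   # always True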
Examples 6.5.10(c):

Use the Mean Value theorem to show that lim_{x → ∞} ( sqrt(x+1) - sqrt(x) ) = 0
Back

Define the function f(x) = sqrt(x). Then f is continuous on [0, ∞) and differentiable on (0, ∞).
By the Mean Value theorem there exists a number c such that
sqrt(x+1) - sqrt(x) = ( f(x+1) - f(x) ) / ( (x+1) - x ) = f'(c)
for c between x and x + 1. But then
0 < sqrt(x+1) - sqrt(x) = 1/2 * 1 / sqrt(c)
As x goes to infinity, so does c (it is always bigger than x). The right side of this equation
goes to 0 as c goes to infinity. Therefore, the left side must also go to zero.
Theorem 6.5.12: Local Extrema and Monotonicity


If f is differentiable on (a, b), and f has a local extremum at x = c, then f'(c) =
0.
If f'(x) > 0 on (a, b) then f is increasing on (a, b).
If f'(x) < 0 on (a, b) then f is decreasing on (a, b).
Context

Proof:
This proof is left as an exercise. For a hint on the first part you might
want to look at the proof of Rolle's theorem. If you have proved the first
part, it should be more or less clear how the remaining parts can be
proved. Recall the definitions of an increasing or decreasing function
and compare it with the difference quotient involved in the derivative of
f.
Corollary 6.5.13: Finding Local Extrema


Suppose f is differentiable on (a, b). Then:
1. If f'(c) = 0 and f'(x) > 0 on (a, c) and f'(x) < 0 on (c, b), then f(c) is a
local maximum.
2. If f'(c) = 0 and f'(x) < 0 on (a, c) and f'(x) > 0 on (c, b), then f(c) is a
local minimum.
Context

Proof:
This corollary becomes obvious when we interpret what it means for the
function to have a positive or negative derivative, as in these tables:
Loc. Max
interval        (a, c)   (c, b)
sign of f'(x)   +        -
dir. of f(x)    up       down

Loc. Min
interval        (a, c)   (c, b)
sign of f'(x)   -        +
dir. of f(x)    down     up

No Extremum
interval        (a, c)   (c, b)
sign of f'(x)   +        +
dir. of f(x)    up       up

No Extremum
interval        (a, c)   (c, b)
sign of f'(x)   -        -
dir. of f(x)    down     down
Of course, these tables are no proof - which is once again left as an exercise.
Examples 6.5.14(b):
If f(x) = | 1 - x² |, then find all relative extrema
Back

This function is so easy that you should 'see' the answer right away
by looking at a mental picture of the graph of the function. There are
three local extrema ... all details are left to you.

Note that this function is not differentiable at x = 1 and x = -1. Which of


our theorems can we still apply ? Or does none apply ?
Theorem 6.5.15: l'Hospital's Rules


If f and g are differentiable in a neighborhood of x = c, and f(c) = g(c) = 0,
then
lim_{x → c} f(x) / g(x) = lim_{x → c} f'(x) / g'(x)
provided the limit on the right exists. The same result holds for one-sided
limits.
If f and g are differentiable and lim_{x → ∞} f(x) = ± ∞ and lim_{x → ∞} g(x) = ± ∞,
then
lim_{x → ∞} f(x) / g(x) = lim_{x → ∞} f'(x) / g'(x)
provided the last limit exists.


Context

Proof:
The first part can be proved easily, if the right hand limit equals f'(c) /
g'(c): Since f(c) = g(c) = 0 we have
f(x) / g(x) = ( f(x) - f(c) ) / ( g(x) - g(c) ) = [ ( f(x) - f(c) ) / (x - c) ] / [ ( g(x) - g(c) ) / (x - c) ]
Taking the limit as x approaches c we get the first result. However, the
actual result is somewhat more general, and we have to be slightly
more careful. We will use a version of the Mean Value theorem:
Take any sequence {xn} converging to c from above. All assumptions of
the generalized Mean Value theorem are satisfied (check !) on [c, xn].
Therefore, for each n there exists a number cn in the interval (c, xn) such
that
f(xn) / g(xn) = ( f(xn) - f(c) ) / ( g(xn) - g(c) ) = f'(cn) / g'(cn)
Taking the limit as n approaches infinity will give the desired result for
right-handed limits. The proof is similar for left handed limits and
therefore for 'full' limits.
The proof of the last part of this theorem is left as an exercise.
Examples 6.5.16(a):
Find lim_{x → 0} sin(x) / x
Back

L'Hospital's Rule applies directly in this case. If g(x) = sin(x) and f(x)
= x, then g'(x) = cos(x) and f'(x) = 1. Hence, by l'Hospital's rule, we
have:
lim_{x → 0} sin(x) / x = lim_{x → 0} cos(x) / 1 = cos(0) / 1 = 1
Examples 6.5.16(b):
Find lim_{x → ∞} x^n e^(-x)

Back

L'Hospital's rule does not seem to apply in this case, since we have '0
· ∞', not '0 / 0'. But if we write this expression as x^n / e^x we see that
we can apply the second of l'Hospital's rules. Let
f(x) = x^n and g(x) = e^x


Then l'Hospital's rule applies to the limit of f(x) / g(x) as x goes to
infinity. In fact, taking derivatives separately, it is easy to see that we
can continue to apply l'Hospital's rule n times. The n-th application of
the rule will yield the expression
n! / e^x
which approaches zero as x approaches infinity. Thus, applying l'Hospital's rule n times we
get:
lim_{x → ∞} x^n e^(-x) = lim_{x → ∞} x^n / e^x = 0
Note: This is simply saying that the exponential function grows faster than any power of x
as x goes to infinity. Therefore, when numerator and denominator 'race' to infinity, the
denominator 'wins', forcing the fraction to be zero as x goes to infinity.
Examples 6.5.16(c):
Find
Back

First, we need to write this expression as a fraction before we can try to apply l'Hospital's
rule:

Written in this form we see that we can apply l'Hospital's rule. It leads to the expression

This is again an expression for which l'Hospital's rule applies. We get the expression

For this expression we get that the limit as x approaches zero is zero. Hence, according to
l'Hospital's rule applied twice we get

=0
Examples 7.1.2(a):
What is the norm of a partition of 10 equally spaced subintervals in the
interval [0, 2] ?
Back

To find a partition of the interval [0, 2] into 10 equally spaced subintervals means to find
points x0, x1, ... x10 inside that interval so that each point has the same distance from its
predecessor. Therefore, the points are:
x0 = 0/10 = 0 x1 = 2/10
x2 = 4/10
x3 = 6/10
x4 = 8/10
x5 = 10/10
x6 = 12/10
x7 = 14/10
x8 = 16/10
x9 = 18/10
x10 = 20/10 = 2
Therefore, the norm of this partition is 2/10.
Examples 7.1.2(b):
What is the norm of a partition of n equally spaced points in the interval [a,
b] ?
Back

If we want to divide [a, b] into n subintervals so that each of them has the same width,
then we must choose the width of those subintervals to be (b - a) / n. Moreover, the partition
would consist of the following n+1 points:
x0 = a = a + 0 · (b - a) / n
x1 = a + 1 · (b - a) / n
...
xn-1 = a + (n-1) · (b - a) / n
xn = a + n · (b - a) / n = b
The norm of this partition is therefore (b - a) / n.

Examples 7.1.2(c):
Show that if P' is a refinement of P then | P' | ≤ | P |
Back

To prove this fact is more confusing than enlightening. It seems clear that if one or more
points are inserted into the partition P to form the refinement partition P', the largest
distance between the points of P' must now be less than (or equal to) that of the points of
P.
But alas, even things that "seem clear" still need formal proof, so ...
Since | P | is a maximum, there must be at least one integer j such that |
P | = xj+1 - xj. Take all such points from the partition P, i.e. all points such
that | P | = xj+1 - xj. Now consider the refinement P'.

Suppose none of the additional points are inside the intervals [xj, xj+1]. Then the
original maximum has not changed so that | P | = | P' |
Suppose at least one of the additional points is inside at least one of the subintervals
[xj, xj+1]. Then this subinterval can no longer contribute to the maximum of P' so
that | P' | ≤ | P |
Examples 7.1.4:
Suppose f(x) = x2 on [0, 2]. Find
1. the fifth Riemann sum for an equally spaced partition, taking always
the left endpoint of each subinterval
2. the fifth Riemann sum for an equally spaced partition, taking always
the right endpoint of each subinterval
3. the n-th Riemann sum for an equally spaced partition, taking always
the right endpoint of each subinterval.
Back

The right and left Riemann sums are illustrated in the Java applet below. To verify the
result of that applet, let's perform the computation manually as well.
The interval is [0, 2], and we want to find the fifth Riemann sum.
Therefore the partition we need is:

x0 = 0, x1 = 2/5 = 0.4, x2 = 4/5 = 0.8,


x3 = 6/5 = 1.2, x4 = 8/5 = 1.6, x5 = 10/5 = 2
with a norm of | P | = 0.4. Taking the right points of each of the resulting intervals we can
compute the right Riemann sum as:
f(0.4)·0.4 + f(0.8)·0.4 + f(1.2)·0.4 + f(1.6)·0.4 + f(2)·0.4 =
= 0.4·(f(0.4) + f(0.8) + f(1.2) + f(1.6) + f(2)) =
= 0.4·(0.4² + 0.8² + 1.2² + 1.6² + 2²) =
= 0.4·(0.16 + 0.64 + 1.44 + 2.56 + 4) =
= 0.4·8.8 = 3.52
The left Riemann sum, correspondingly, computes to:
f(0.0)·0.4 + f(0.4)·0.4 + f(0.8)·0.4 + f(1.2)·0.4 + f(1.6)·0.4 =
= 0.4·(f(0.0) + f(0.4) + f(0.8) + f(1.2) + f(1.6)) =
= 0.4·(0² + 0.4² + 0.8² + 1.2² + 1.6²) =
= 0.4·(0 + 0.16 + 0.64 + 1.44 + 2.56) =
= 0.4·4.8 = 1.92
For the last part we can not use our Java applet since we do not have a numerical value for
n. But we can manually compute the answer as follows: subdividing [0, 2] into n equal
subintervals gives the partition:
xj = j·2/n, where j = 0, 1, ..., n
Taking the right endpoint of all resulting subintervals and substituting them into the
Riemann sum formula gives:
R(f, P) = f(2/n)·2/n + f(4/n)·2/n + ... + f( (n-1)·2/n )·2/n + f(n·2/n)·2/n
which works out to
R(f, P) = 2/n ( (1·2/n)² + (2·2/n)² + ... + ( (n-1)·2/n )² + (n·2/n)² )
We can factor out 4/n² to get:
R(f, P) = 8/n³ (1² + 2² + ... + (n-1)² + n²)
In the chapter on induction we have shown that the sum of the first n square numbers
equals
1/6 n (n+1) (2n+1)
so that we now have:
R(f, P) = 8/n³ · 1/6 n (n+1) (2n+1) = 4/3 (n+1) (2n+1) / n²
We can substitute n = 5 to verify our formula:
R(f, Pn=5) = 4/3 · 6 · 11 / 25 = 88/25 = 3.52
just as computed above.
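The computations above are easy to confirm with a short program. This Python sketch (ours) evaluates the left and right Riemann sums for f(x) = x² on [0, 2] and compares the fifth right sum with the formula 4/3 (n+1)(2n+1)/n²:

    def right_riemann_sum(f, a, b, n):
        dx = (b - a) / n
        return sum(f(a + j * dx) * dx for j in range(1, n + 1))

    def left_riemann_sum(f, a, b, n):
        dx = (b - a) / n
        return sum(f(a + j * dx) * dx for j in range(0, n))

    f = lambda x: x * x
    print(right_riemann_sum(f, 0, 2, 5))               # 3.52
    print(left_riemann_sum(f, 0, 2, 5))                # 1.92
    n = 5
    print(4.0 / 3.0 * (n + 1) * (2 * n + 1) / n**2)    # 3.52, matching the formula above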
Examples 7.1.6(a):
Suppose f(x) = x2-1 for x in the interval [-1, 1]. Find:
1. The left and right sums where the interval [-1, 1] is subdivided into
10 equally spaced subintervals.
2. The upper and lower sums where the interval [-1, 1] is subdivided
into 10 equally spaced subintervals.
3. The upper and lower sums where the interval [-1,1] is subdivided
into n equally spaced subintervals.
Back

For the first two questions we can again use a Java applet that will perform the
computations for us. Here are the results of computing the left and right sums.
Note, in particular, that right and left Riemann sums are the same (an
accident ?). The upper and lower sums are similarly computed, noting
that in the interval [0, 1] the largest value of x²-1 over any subinterval
inside [0, 1] is always attained at the right endpoint, while the smallest value occurs
at the left endpoint. For subintervals inside [-1, 0] it is just the other
way around.
Note that this time the values don't agree but we have that L(f, P) ≤ U(f,
P), which is again no accident.
To answer the last question we can not use the above Java applet
because we don't know a numeric value for n. Here is the appropriate
manual computation for the, say, the upper sum: We are looking for n
equally spaced subintervals of [-1, 1] so that our partition consists of the
points:
xj = -1 + j * 2/n
where j = 0, 1, ..., n and | P | = 2/n.
Case 1: n is even, i.e. n = 2 N for some integer N
The renumbered partition now is:
xj = -1 + j * 1/N
where j = 0, 1, ..., N, N+1, ..., 2 N. In that case the point 0 is part of the partition (take j =
N). The point to notice before we can begin the computation is that the function is
decreasing over the interval [-1, 0] and increasing over the interval [0, 1]. That implies
that for all partition points less than 0 the maximum of f occurs on the left endpoint, for
points bigger than 0 the maximum occurs on the right. Hence:

U(f, P) = Σ_{j=1..N} f(xj-1) · 1/N + Σ_{j=N+1..2N} f(xj) · 1/N
In appropriate "sigma" notation we therefore have:
U(f, P) = 1/N Σ_{j=0..N-1} ( (-1 + j/N)² - 1 ) + 1/N Σ_{j=N+1..2N} ( (-1 + j/N)² - 1 )
= 2/N Σ_{k=1..N} ( k²/N² - 1 )
= 2/N³ · 1/6 N (N+1) (2N+1) - 2
= (N+1) (2N+1) / (3 N²) - 2
We have used the results on the sum of square integers mentioned in the chapter on
Induction to simplify the various sums.
We can verify our formula by looking at the above applet. There the 10-th upper sum computes to -1.119999. In our formula we need to let N = 5
because 10 = n = 2 N. Substituting N = 5 gives the exact value of -1.12
for the 10-th upper sum.
The case where n is odd, as well as the computation of the lower sum, is
left as an exercise. There's nothing new in those computations (but for n
odd there's something special happening at 0), it's just somewhat
tedious. You can check whichever formulas you come up with against
the numeric answers of the above applet.
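Because x² - 1 decreases on [-1, 0] and increases on [0, 1], the sup and inf over each subinterval are attained at endpoints, so upper and lower sums are easy to compute. The following sketch (ours; it simply samples each subinterval to locate max and min) reproduces the value -1.12 and the closed form derived above:

    def upper_lower_sums(f, a, b, n):
        """Upper and lower sums on [a, b] with n equal subintervals,
        approximating sup/inf on each piece by sampling many points."""
        dx = (b - a) / n
        U = L = 0.0
        for j in range(n):
            xs = [a + j * dx + k * dx / 200 for k in range(201)]   # fine sample, endpoints included
            vals = [f(x) for x in xs]
            U += max(vals) * dx
            L += min(vals) * dx
        return U, L

    f = lambda x: x * x - 1
    N = 5
    U, L = upper_lower_sums(f, -1, 1, 2 * N)
    print(U, L)                                        # U is about -1.12
    print((N + 1) * (2 * N + 1) / (3.0 * N**2) - 2)    # -1.12, the closed form from above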
Examples 7.1.6(b):
Why is, in general, an upper (or lower) sum not a special case of a Riemann
sum ? Find a condition for a function f so that the upper and lower sums are
actually special cases of Riemann sums.
Back

The reason is simple: a Riemann sum requires points where the function is defined, while
an upper/lower sum involves a sup/inf which may not correspond to points in the range of
the function.
What this means is best illustrated via an example. Take the function

Now take the simple partition consisting of the interval [-1, 1]. Since f(x)
> 0 for all x (which is not visible in the above applet) we must have that
any Riemann sum R(f, P) must give a value strictly bigger than 0. The
lower sum with respect to this partition, on the other hand, is truly zero
(which again is not visible in the above applet (why?)). Therefore this
particular lower sum can not be a Riemann sum.
So to find a condition that ensures that upper/lower sums are special
cases of Riemann sums we must ensure that the sup/inf that appears in
the definition of upper/lower sum is a max or min, respectively. In the
topology chapter we have shown that a continuous function over a
closed, bounded interval must have a max and a min.
Therefore, if f is continuous over the interval [a, b] then the upper and
the lower sum are both special cases of a Riemann sum.
Examples 7.1.6(c):
Find conditions for a function so that the upper sum can be computed by
always taking the left endpoint of each subinterval of the partition, or
conditions for always being able to take the right endpoints.
Back

To ensure, for example, that the left endpoint will always be used for the computation of a
lower sum we must ensure that regardless of the partition that was chosen the function
takes its smallest value inside every partition subinterval on its left endpoint.
We will leave it to you to find the correct condition(s), but you can use
the applet below to experiment with various functions.
Click on Options and use the functions:
f(x) = x2
f(x) = 1 - x2
For one of them the right sum is identical to the upper sum, for the other it is identical to
the lower sum. Perhaps that helps finding the right conditions.
Examples 7.1.6(d):
Suppose f is the Dirichlet function, i.e. the function that is equal to 1 for
every rational number and 0 for every irrational number. Find the upper and
lower sums over the interval [0, 1] for an arbitrary partition.
Back

Take an arbitrary partition P = { x0, x1, ..., xn } of the interval [0, 1].
Between any two points xj and xj+1 there is an irrational number. Therefore the inf
over [ xj, xj+1 ] must be 0. That means that L(f, P) = 0.

Between any two points xj and xj+1 there is a rational number. Therefore the sup
over [ xj, xj+1 ] must be 1. That means that
U(f, P) = (x1 - x0) + (x2 - x1) + ... + (xn - xn-1)
which is a telescoping sum so that
U(f, P) = xn - x0 = 1 - 0 = 1

Thus, we have shown that for the Dirichlet function and for any partition P we have that
L(f, P) = 0 and U(f, P) = 1.
Proposition 7.1.7: Size of Riemann Sums


Suppose P = { x0, x1, x2, ..., xn} is a partition of the closed interval [a, b], f a
bounded function defined on that interval. Then we have:
The lower sum is increasing with respect to refinements of partitions,
i.e. L(f, P') ≥ L(f, P) for every refinement P' of the partition P
The upper sum is decreasing with respect to refinements of
partitions, i.e. U(f, P') ≤ U(f, P) for every refinement P' of the
partition P
L(f, P) ≤ R(f, P) ≤ U(f, P) for every partition P
Context

Proof:
The last statement is simple to prove: take any partition P = {x0, x1, ..., xn}. Then
inf{ f(x) : xj-1 ≤ x ≤ xj } ≤ f(tj) ≤ sup{ f(x) : xj-1 ≤ x ≤ xj }
where tj is an arbitrary number in [xj-1, xj] and j = 1, 2, ..., n. That immediately implies that
L(f, P) ≤ R(f, P) ≤ U(f, P)
The other statements are somewhat trickier. Let's first find out why they
should be true. To make it simple, let's say that P = {a, b} and P' = {a,
x0, b}. Then
U(f, P) = sup{ f(x) : x ∈ [a, b] } (b - a)
and the upper sum for P' would be
U(f, P') = sup{ f(x) : x ∈ [a, x0] } (x0 - a) + sup{ f(x) : x ∈ [x0, b] } (b - x0)

Geometrically, the upper sum for P corresponds to one large rectangle, the one for P' to
two smaller rectangles, where the smaller rectangles fit into the larger one but do not cover
it.
U(f, P) = 1.089 U(f, P') = 0.86

Since the area covered by the two rectangles is smaller than that covered by the first one,
we have U(f, P) > U(f, P').
Let's show this mathematically, in case one additional point t0 is added
to a particular subinterval [xj-1, xj]. Let:
cj be the sup of f(x) in the interval [xj-1, xj]
Aj be the sup of f(x) in the interval [xj-1, t0]
Bj be the sup of f(x) in the interval [t0, xj]
Then cj ≥ Aj and cj ≥ Bj so that
cj (xj - xj-1) = cj (xj - t0 + t0 - xj-1) = cj (xj - t0) + cj (t0 - xj-1)
≥ Bj (xj - t0) + Aj (t0 - xj-1)
That shows that if P = {x0, ... xj-1, xj, ..., xn} and P' = {x0, ... xj-1, t0, xj, ..., xn} then U(f, P)
≥ U(f, P').
The proof for a general refinement P' of P uses the same idea plus some
confusing indexing scheme. No more details should be necessary.
The proof for the statement regarding the lower sum is analogous.
Examples 7.1.9(a):
Show that the constant function f(x) = c is Riemann integrable on any
interval [a, b] and find the value of the integral.
Back

We have to compute the upper and lower sums for an arbitrary partition, then find the
appropriate inf and sup to compute the lower and upper integrals. If they agree, we are
done and the common value is the answer. So, here we go:
Take an arbitrary partition P = {x0, x1, ..., xn}. The lower sum of f(x) = c
is:
L(f, P) = c (x1 - x0) + c (x2 - x1) + ... + c (xn - xn-1)
= c (xn - x0) = c (b - a)
because the inf over any interval (as well as the sup) is always c, and the above sum is
telescoping.
Similarly, we have that
U(f, P) = c (b - a)
Hence, the upper and lower sums are independent of the particular partition. Therefore f is
integrable and
I_*(f) = I^*(f) = c (b - a)
In particular,
∫_a^b f(x) dx = c (b - a)

Examples 7.1.9(b):
Is the function f(x) = x2 Riemann integrable on the interval [0,1] ? If so, find
the value of the Riemann integral. Do the same for the interval [-1, 1].
Back
First let's experiment with our "Integrator" applet. To "show" that f(x) = x² is integrable we
need to take partitions with more and more points, compute the upper and lower sum, and
hope that the numeric answers will get closer and closer to one common value. If that's the
case, our guess is that this limit is the integral of f over the indicated interval.
But of course that's no proof. As it turns out, to prove that this simple
function is integrable will be difficult, because we do not have a simple
condition at our disposal that could tell us quickly whether this, or any
other function, is integrable. That's the bad news; the good news will be
that we should be able to generalize the proof for this particular
example to a wider set of functions.
Anyhow, here we go. First we should note that in the definition of upper
and lower integral it is not necessary to take the sup and inf over all
partitions. After all, if P is a partition and P' is a refinement of P then L(f,
P') ≥ L(f, P) and U(f, P') ≤ U(f, P). Therefore partitions with large norm
don't contribute to the sup or inf and it is enough to compute the upper
and lower integral by considering partitions with a small norm only.
Next, take any ε > 0 and a partition P with |P| < ε / 2. Then
| U(f, P) - L(f, P) | ≤ Σ |cj - dj| (xj - xj-1)
where cj is the sup of f over [xj-1, xj] and dj is the inf over that interval.
Since f is increasing over [0, 1] we know that the sup is achieved on the
right side of each subinterval, the inf on the left side. Therefore:
| U(f, P) - L(f, P) | ≤ Σ |cj - dj| (xj - xj-1) = Σ |f(xj) - f(xj-1)| (xj - xj-1)
To estimate this sum, we'll apply the Mean Value Theorem for f(x) = x²:
|f(x) - f(y)| ≤ |f'(c)| |x - y|
for c between x and y. Since |f'(c)| ≤ 2 for c in the interval [0, 1] we know that
|f(x) - f(y)| ≤ 2 |x - y|
But the partition P was chosen with |P| < ε / 2 so that
|f(xj) - f(xj-1)| ≤ 2 |xj - xj-1| ≤ 2 ε / 2 = ε
But then
| U(f, P) - L(f, P) | ≤ Σ |f(xj) - f(xj-1)| (xj - xj-1) ≤ Σ ε (xj - xj-1) = ε (xn - x0) = ε (1 - 0) = ε
because the last sum is telescoping.
Since the partition P was arbitrary but with small norm - which we
remarked is sufficient for the upper and lower integral - we know that
the upper and lower integral must exist and be equal to one common
limit L.
Therefore we know that f is integrable and it remains to find the value L
of the integral. But that's easy to compute because now that we know
that the function is integrable we can take a suitable partition to find the
value of the integral. So, take the following partition:
xj = j/n
for j = 0, 1, 2, ..., n. Then the upper sum computes to

U(f, P) = Σ cj (xj - xj-1) = Σ f(xj) · 1/n = Σ (j/n)² · 1/n = 1/n³ Σ j²
= 1/n³ · 1/6 n (n+1) (2n+1) = 1/6 (n+1) (2n+1) / n²
We know that the upper integral exists and is equal to L. Therefore, the limit as n goes to
infinity of the above expression must also converge to L. But then L = 1/3, or in other
words:

∫_a^b x² dx = 1/3
where a = 0 and b = 1.
As for the interval [-1, 1], we can first play our applet game to get an
idea about the answer, then set out to prove everything just as we did
above.
It seems that the applet says that the integral now should evaluate to
something close to 2/3. Of course we need some formal proof, but since
that would be similar to the above one we'll leave it as an exercise.
The proof is somewhat misleading, because it seems to be based on the
fact that f(x) = x2 is differentiable. As a generalization we might
conclude that differentiable functions are integrable. That's correct, but
another, more general concept can be substituted for differentiability
(what might it be?).
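To watch the convergence numerically, here is a minimal Python sketch (ours): since x² is increasing on [0, 1], the sup of each subinterval is its right endpoint and the inf its left endpoint, so the upper and lower sums are exact and both squeeze down to 1/3:

    def upper_lower_x2_on_0_1(n):
        dx = 1.0 / n
        xs = [j * dx for j in range(n + 1)]
        upper = sum(xs[j] ** 2 * dx for j in range(1, n + 1))   # sup on [x_{j-1}, x_j] is f(x_j)
        lower = sum(xs[j] ** 2 * dx for j in range(0, n))       # inf is f(x_{j-1})
        return upper, lower

    for n in [10, 100, 1000, 10000]:
        print(n, upper_lower_x2_on_0_1(n))   # both values approach 1/3

    # The analogous computation on [-1, 1] (splitting at 0) approaches 2/3.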

Examples 7.1.9(c):
Is the Dirichlet function Riemann integrable on the interval [0, 1] ?
Back

In a previous example we have shown that U(f, P) = 1 and L(f, P) = 0, regardless of the
partition. Therefore:
I*(f) = 1 and I*(f) = 0
Therefore, the Dirichlet function is not integrable over the interval [0, 1].
Lemma 7.1.10: Riemann Lemma


Suppose f is a bounded function defined on the closed, bounded interval [a,
b]. Then f is Riemann integrable if and only if for every ε > 0 there exists at
least one partition P such that
| U(f,P) - L(f,P) | < ε
Context

Proof:
One direction is simple: If f is Riemann integrable, then I_*(f) = I^*(f) = L. By the properties
of sup and inf we know:
There exists a partition P such that L = I^*(f) > U(f, P) - ε / 2
there exists a partition Q such that L = I_*(f) < L(f, Q) + ε / 2
Take the partition P' that is the common refinement of P and Q. Then we know that:
U(f,P) ≥ U(f,P')
L(f,Q) ≤ L(f,P')
Taking this together we have:
L > U(f,P) - ε / 2 ≥ U(f,P') - ε / 2
L < L(f,Q) + ε / 2 ≤ L(f,P') + ε / 2
Multiplying the second inequality by -1 and adding it to the first gives:
0 > U(f,P') - L(f,P') - ε, or equivalently:
ε > U(f,P') - L(f,P') = | U(f, P') - L(f, P')|
Therefore we found a particular partition (namely P') such that
| U(f, P') - L(f, P')| < ε
for any given ε.


The other direction is a little bit harder: Assume that for every ε > 0 we
can find one partition P such that
| U(f, P) - L(f, P)| < ε
We then need to show that | I^*(f) - I_*(f) | < ε.
We will do that later.
Examples 7.1.11(a): Riemann Lemma


Is the function f(x) = x2 Riemann integrable on the interval [0,1] ? If so, find
the value of the Riemann integral. Do the same for the interval [-1, 1] (since
this is the same example as before, using Riemann's Lemma will hopefully
simplify the solution).
Back

When we proved before that the integral of f(x) = x² exists and is equal to 1/3 we showed
that
(*) U(f, P) - L(f, P) < ε
for every partition P with small enough norm. Using Riemann's Lemma it is enough to find
one partition such that inequality (*) holds. Therefore, take the partition
P = {j/n, j = 0, 1, 2, ..., n}
for the interval [0, 1]. On every subinterval [(j-1)/n, j/n] the maximum of f occurs on the
right, the minimum on the left. Therefore:

| U(f, P) - L(f, P) |
= Σ | f(j/n) - f((j-1)/n) | · 1/n
= Σ | (j/n)² - ((j-1)/n)² | · 1/n = 1/n³ Σ | j² - (j-1)² |
= 1/n³ Σ | j² - j² + 2j - 1 | ≤ 2/n³ Σ j
= 2/n³ · 1/2 n (n+1) = (n+1) / n²
But since the last expression converges to zero as n goes to infinity, Riemann's Lemma
shows that f(x) = x2 is indeed integrable over the interval [0, 1].
The remainder of this example is just as before (exercise!).
So, Riemann's lemma has indeed simplified our computation, because
we are now able to pick one partition that is best suited for our
particular function and/or interval.
Examples 7.1.11(c): Riemann Lemma


Suppose f is Riemann integrable over an interval [-a, a] and f is an odd

function, i.e. f(-x) = -f(x). Show that the integral of f over [-a, a] is zero.
What can you say if f is an even function?
Back

By assumption f is Riemann integrable so we can use the previous example and take a
sequence of partitions with smaller and smaller mesh, compute a Riemann sum for each
partition, and find the limit.
For an evenly spaced partition that includes 0 it is easy to compute a
particular Riemann sum (such as a "middle Riemann sum") as long as
the function is odd.
The rest is left as an exercise.
For even functions, i.e. functions where f(x) = f(-x), we can show that
the integral of f from -a to a is twice the integral from 0 to a. The details
are left as an exercise again (sorry -:).
Proposition 7.1.12: Properties of the Riemann Integral


Suppose f and g are Riemann integrable functions defined on [a, b]. Then
1. ∫_a^b ( c f(x) + d g(x) ) dx = c ∫_a^b f(x) dx + d ∫_a^b g(x) dx
2. If a < c < b then ∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx
3. | ∫_a^b f(x) dx | ≤ ∫_a^b | f(x) | dx
4. If g is another function defined on [a, b] such that g(x) < f(x) on [a,
b], then ∫_a^b g(x) dx ≤ ∫_a^b f(x) dx
5. If g is another Riemann integrable function on [a, b] then f(x) · g(x)
is integrable on [a, b]
Context

Proof:
Examples 7.1.13(a):

Find an upper and lower estimate for ∫ x sin(x) dx over the interval [0, 4].


Back

Since | sin(x) | ≤ 1 we have:
-x ≤ x sin(x) ≤ x
for all x ≥ 0. By part 4 of the previous proposition we have:
∫_a^b -x dx ≤ ∫_a^b x sin(x) dx ≤ ∫_a^b x dx
for any a, b. But just as we computed ∫ x² dx in a previous exercise we could show that:
∫_a^b x dx = 1/2 (b² - a²)
(which would be a nice and simple exercise). Putting everything together we get that
-8 ≤ ∫_0^4 x sin(x) dx ≤ 8
Examples 7.1.13(b):

Suppose f(x) = x² if x ≤ 1 and f(x) = 3 if x > 1. Find ∫ f(x) dx over the interval [-1, 2].


Back

According to part 2 of the previous result we know that
∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx
for a < c < b. Let a = -1, c = 1, and b = 2. Then we have:
∫_{-1}^{1} f(x) dx = ∫_{-1}^{1} x² dx = 1/3 (1³ - (-1)³) = 2/3
and
∫_{1}^{2} f(x) dx = ∫_{1}^{2} 3 dx = 3 (2 - 1) = 3
Therefore
∫_{-1}^{2} f(x) dx = 2/3 + 3 = 11/3
Here we computed ∫ x² dx similarly to a previous example.


Examples 7.1.13(c):
If f is an integrable function defined on [a, b], which is bounded by M on
that interval, prove that

M (a - b) ≤ ∫_a^b f(x) dx ≤ M (b - a)
Back

Since f is bounded on [a, b] by M we know that
-M ≤ f ≤ M
Therefore:
∫_a^b -M dx ≤ ∫_a^b f(x) dx ≤ ∫_a^b M dx
or equivalently (we have computed the left and right integrals before):
-M (b - a) ≤ ∫_a^b f(x) dx ≤ M (b - a)
Theorem 7.1.14: Riemann Integrals of Continuous


Functions
Every continuous function on a closed, bounded interval is Riemann
integrable. The converse is false.
Context

Proof:
We have shown before that f(x) = x2 is integrable where we used the fact that f was
differentiable. We will now adjust that proof to this situation, using uniform continuity
instead of differentiability. We can actually simplify the previous proof because we now
have Riemann's lemma at our disposal.

We know that f is continuous over a closed and bounded interval.


Therefore f must be uniformly continuous over [a, b], i.e. for any given
ε > 0 we can find a δ > 0 such that |f(x) - f(y)| < ε for all x and y with |x - y| < δ.
Now take any ε > 0 and choose a partition P with |P| < δ. Then
| U(f, P) - L(f, P)| ≤ Σ |cj - dj| (xj - xj-1)

where cj is the sup of f over [xj-1, xj] and dj is the inf over that interval. Since the function is
continuous, it assumes its maximum and minimum over each of the subintervals, so that
inf and sup can be replaced with min and max.
Since the norm of the partition is less than δ we know that |cj - dj| < ε for
all j.
But then
| U(f, P) - L(f, P)| ≤ Σ ε (xj - xj-1) = ε Σ (xj - xj-1) = ε (xn - x0) = ε (b - a)
because the last sum is telescoping. That finishes the proof.
For the purist, we should have chosen δ such that |f(x) - f(y)| < ε / (b-a)
whenever |x - y| < δ so that at the end of our above computation we
could get a single ε.
It is easy to find an example of a function that is Riemann integrable but
not continuous. For example, the function f that is equal to -1 over the
interval [0, 1] and +1 over the interval [1, 2] is not continuous but
Riemann integrable (show it!).
Examples 7.1.15:
Find a function that is not integrable, a function that is integrable but not
continuous, and a function that is continuous but not differentiable.
Back

What this example really shows is


there are functions that are not integrable, continuous, or differentiable
there are more Riemann integrable functions than there are continuous functions
there are more continuous functions than there are differentiable ones
A function that is not integrable:
Take the Dirichlet function. We have previously shown that I*(f) = 1 and
I*(f) =0. Therefore the Dirichlet function is not integrable.
A function that is integrable but not continuous:

Take the function that equals 1 over the interval [0, 1] and 2 over the
interval [1, 2]. It is clear that the function is not continuous, but we need to
prove that it is integrable.
Take a partition P of the interval [0, 2] with norm less than ε / 2
such that the point xk = 1 is part of the partition. Then:
| U(f, P) - L(f, P)| ≤ Σ |cj - dj| (xj - xj-1)
= |ck - dk| (xk - xk-1) + |ck+1 - dk+1| (xk+1 - xk) =
= | 2 - 1 | (1 - xk-1) + |2 - 1| (xk+1 - 1) =
= (1 - xk-1) + (xk+1 - 1) =
= xk+1 - xk-1 < ε
because cj = dj over all subintervals except those that include xk. But then
Riemann's Lemma says that f is integrable.
A function that is continuous but not differentiable:
Take the absolute value function f(x) = |x| over the interval [-1, 1]. It is
integrable because it is continuous, but not differentiable because it has a
sharp corner at 0.
Theorem 7.1.16: Riemann Integral of almost Continuous


Function
If f is a bounded function defined on a closed, bounded interval [a, b] and f
is continuous except at countably many points, then f is Riemann integrable.
The converse is also true: If f is a bounded function defined
on a closed, bounded interval [a, b] and f is Riemann
integrable, then f is continuous on [a, b] except possibly at
countably many points.
Context

Proof:
To prove this is not easy; we will start with a simpler version of this theorem: if f is
continuous and bounded over the interval [a, b] except at one point xk, then f is Riemann
integrable over [a, b].

We know that f is bounded by some number M


over the interval [a, b].
Take any ε > 0 and choose a partition P that
includes the point xk such that
| P | < ε / 12M
Then in particular
|xk+1 - xk-1| < ε / 6M
We also know that f is uniformly continuous over [a, xk-1] as well as uniformly continuous
over [xk+1, b]. Therefore, for our chosen ε there exists
a δ' such that |f(x) - f(y)| < 1/3 ε / (b - a) for all x, y inside [a, xk-1] with |x - y| < δ'
a δ'' such that |f(x) - f(y)| < 1/3 ε / (b - a) for all x, y inside [xk+1, b] with |x - y| < δ''
Now refine the partition P by adding points on the left side of xk-1 so that the mesh on that
side is less than δ', and by adding points on the right side of xk+1 so that the mesh there is
less than δ''. For simplicity, call that new partition again P. Then we have:

| U(f,P) - L(f,P) | ≤ Σ |cj - dj| (xj - xj-1) = (first term) + (middle term) + (third term)
For the first term we have:
|c1 - d1| (x1 - x0) + ... + |ck-1 - dk-1| (xk-1 - xk-2)
< 1/3 ε/(b-a) (xk-1 - x0) < 1/3 ε/(b-a) (b - a) = 1/3 ε
because of uniform continuity to the left of xk and our choice of the partition. The third
term can be estimated similarly:
|ck+2 - dk+2| (xk+2 - xk+1) + ... + |cn - dn| (xn - xn-1)
< 1/3 ε/(b-a) (xn - xk+1) < 1/3 ε/(b-a) (b - a) = 1/3 ε
Since f is bounded by M we know that |cj - dj| < 2M for all j so that the middle term can be
estimated by:
|ck - dk| (xk - xk-1) + |ck+1 - dk+1| (xk+1 - xk)
< 2M (xk+1 - xk-1) < 2M ε / 6M = 1/3 ε
Taking everything together we have:
|U(f,P) - L(f,P)| < 1/3 ε + 1/3 ε + 1/3 ε = ε
Therefore, by Riemann's Lemma, the function f is Riemann integrable.

Examples 7.1.17(a):
Show that every monotone function defined on [a, b] is Riemann integrable.
Back

We have shown before that a monotone function f defined on a closed interval [a, b] has at
most countably many discontinuities.
Therefore such a function f is continuous except at countably many
points, so that by our previous theorem the function must be Riemann
integrable.
Examples 7.1.17(b):
Let g(x) = 1/q if x = p/q is rational and g(x) = 0 if x is irrational,
where p, q relatively prime and q > 0, and assume g is restricted to [0, 1]. Is
g Riemann integrable ? If so, what is the value of the integral ?
Back

We have seen this function before, where we have shown that it is continuous at all
irrational numbers and discontinuous at the rationals. In particular, the function has
countably many points of discontinuity. Since the discontinuities are dense, i.e. they are
"all over" the interval [0, 1] it might seem that it is difficult to find the value of the integral
(if the function is Riemann integrable). But with the theoretical background we developed
so far it will be easy to compute the answer.
Having countably many discontinuities, we know by our previous
theorem that the function is Riemann integrable and it remains to find
the value of the integral.
Take any partition P = {x0, x1, ..., xn} and look at:
dj = inf{ g(x) : x ∈ [xj-1, xj] }

Since every subinterval [xj-1, xj] contains irrational numbers we clearly have that dj = 0 for
all j. But then the lower integral I_*(g) = sup{ L(g,P): P a partition of [a, b]} must also be 0.
Since g was integrable the upper and lower integral agree so that
∫_a^b g(x) dx = 0

for a = 0 and b = 1.
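The upper sums of g can even be computed exactly: on a subinterval the sup of g is 1/q for the smallest denominator q admitting a fraction p/q in that subinterval. The sketch below (ours) does this for equally spaced partitions of [0, 1]; the upper sums shrink toward 0, consistent with the value of the integral, while every lower sum is 0:

    import math

    def sup_on(a, b, qmax=10000):
        """Sup of g on [a, b]: 1/q for the smallest q with some fraction p/q in [a, b]."""
        for q in range(1, qmax + 1):
            if math.floor(b * q) >= math.ceil(a * q):   # some integer p satisfies a <= p/q <= b
                return 1.0 / q
        return 0.0

    def upper_sum(n):
        dx = 1.0 / n
        return sum(sup_on(j * dx, (j + 1) * dx) * dx for j in range(n))

    for n in [10, 100, 1000]:
        print(n, upper_sum(n))   # decreases toward 0 as the partition gets finer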
Theorem 7.1.18: Fundamental Theorem of Calculus


Suppose f is a bounded, integrable function defined on the closed, bounded
interval [a, b], define a new function:

F(x) = ∫_a^x f(t) dt
Then F is continuous in [a, b]. Moreover, if f is also continuous, then F is
differentiable in (a, b) and
F'(x) = f(x) for all x in (a, b).
Note: In many calculus texts this theorem is called the
Second fundamental theorem of calculus. Those books also
define a First Fundamental Theorem of Calculus.
Context

Proof:
The first assertion is simple to prove: Take x and c inside [a, b]. Since f is bounded we
know that | f(x) | < M for some number M. By the properties of the Riemann integral we
know:

| F(x) - F(c) | = | ∫_x^c f(t) dt | ≤ M | x - c |,
assuming without loss of generality that c > x. But then we can take the limit as x
approaches c to see that
lim_{x → c} | F(x) - F(c) | = 0
which implies continuity of F at x = c.
Now we want to prove that F(x) is differentiable with F' = f if f is
continuous.
Pick x inside the interval (a, b) and choose a number h so small that
x+h is also in (a, b). We compute the difference quotient:

( F(x+h) - F(x) ) / h = 1/h ∫_x^{x+h} f(t) dt
and define
m = inf{ f(t) : t ∈ [x, x+h] }
M = sup{ f(t) : t ∈ [x, x+h] }
Clearly we have that m ≤ f(t) ≤ M so that by the properties of the Riemann integral we have
that:
m ≤ 1/h ∫_x^{x+h} f(t) dt ≤ M
But since f is continuous at x we know that m and M both converge to f(x) as h goes to zero. Therefore:
F'(x) = lim_{h → 0} ( F(x+h) - F(x) ) / h = f(x)
which proves the second assertion.


Corollary 7.1.19: Integral Evaluation Shortcut


Suppose f is a continuous function defined on the closed, bounded interval
[a, b], and G is a function on [a, b] such that G'(x) = f(x) for all x in (a, b).
Then

∫_a^b f(x) dx = G(b) - G(a)


Note: The function G is often called an antiderivative of f, and this corollary
is called the First Fundamental Theorem of Calculus in many calculus text
books. Those books also define a Second Fundamental Theorem
of Calculus.
Context

Proof:
This being a corollary means that it must be easy to prove. We already know from the
previous theorem that if we define the function

F(x) = ∫_a^x f(t) dt
then F(b) - F(a) = ∫_a^b f(x) dx and F' = f. What we need to prove is that if we take any
function G such that G'(x) = f(x) then G(b) - G(a) = ∫_a^b f(x) dx also. So, define
H(x) = F(x) - G(x)
where F and G are as defined above. Then
H'(x) = F'(x) - G'(x) = f(x) - f(x) = 0
so that H(x) = c for some constant c. But then F(x) = G(x) + c so that
F(b) - F(a) = (G(b) + c) - (G(a) + c) = G(b) - G(a)
Examples 7.1.20(a):
Define a function F(x) = ∫_a^x t² sin(t) dt for x in the interval [a, a + 10].
1. Find F(a)
2. Find F'(x)
3. Find F''(x)
4. Find all critical points of F(x) in [a, a + 10]
Back

1. Since F(x) = ∫_a^x t² sin(t) dt, the value F(a) is an integral from a to a. But such an integral is
0 regardless of the integrand. Therefore F(a) = 0.
2. The second part is a direct application of the (second) Fundamental
Theorem of Calculus:

F'(x) = d/dx ∫_a^x t² sin(t) dt = x² sin(x)
3. Since we have computed the first derivative already, it is easy to
compute the second derivative:
F''(x) = d/dx F'(x) = d/dx ( x² sin(x) ) = 2x sin(x) + x² cos(x)

4. To find the critical points of F we need to find the points where F is not differentiable or
where F'(x) = 0. We know that F is differentiable on any closed interval, so that the critical
points are those where
F'(x) = x² sin(x) = 0
Therefore the critical points are x = 0 and x = kπ, k = 1, 2, ..., or better those points kπ,
k = 0, 1, ... that are inside the interval [a, a+10].
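Parts 2-4 can be cross-checked symbolically (a sketch; the lower bound a = 0 is an illustrative assumption, since the example leaves a unspecified):

import sympy as sp

t, x = sp.symbols('t x')
a = 0                                   # illustrative choice of lower bound
F = sp.integrate(t**2 * sp.sin(t), (t, a, x))

F1 = sp.simplify(sp.diff(F, x))         # expect x**2*sin(x)
F2 = sp.simplify(sp.diff(F, x, 2))      # expect 2*x*sin(x) + x**2*cos(x)
print(F1)
print(F2)
print(sp.solve(sp.Eq(F1, 0), x))        # representative critical points: 0 and pi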



Examples 7.1.20(b):
Find the value of the following integrals:
1. ∫ x⁵ - 4x² dx on the interval [0, 2].
2. ∫ 1/x² + cos(x) dx on the interval [1, 4].
3. ∫ (1 + x²)⁻¹ dx on the interval [-1, 1].



This time we will use the Integral Evaluation Shortcut, or First Fundamental Theorem of
Calculus, which requires us to find the antiderivative for each of the functions.

1. ∫ x⁵ - 4x² dx on the interval [0, 2]: To compute the exact value, we let
P(x) = 1/6 x⁶ - 4/3 x³ + C
Then we clearly have:
P'(x) = x⁵ - 4x²
so that P is an antiderivative of the integrand. Therefore:
∫[a, b] x⁵ - 4x² dx = P(b) - P(a)
With our choices of a and b we can evaluate the integral to
P(2) - P(0) = 1/6 · 2⁶ - 4/3 · 2³ = 1/6 · 64 - 4/3 · 8 = 0

2. ∫ 1/x² + cos(x) dx on the interval [1, 4]: To compute the exact value, first note that
1/x² + cos(x) is continuous over the interval [1, 4] so that the function is Riemann
integrable. If we define P(x) = -1/x + sin(x) + C then
P'(x) = 1/x² + cos(x), so that again we found an antiderivative of the original integrand.
Using the Integral Evaluation Shortcut over [1, 4] we can compute the value of the
integral to
P(4) - P(1) = -1/4 + sin(4) - (-1/1 + sin(1)) = -0.8482734801

3. ∫ (1 + x²)⁻¹ dx on the interval [-1, 1]: The integrand is continuous everywhere, so that
it is Riemann integrable over any closed bounded interval. To find an antiderivative of
(1 + x²)⁻¹ is less obvious than in the previous cases until we remember that
d/dx arctan(x) = (1 + x²)⁻¹
But then the integral evaluates to
arctan(1) - arctan(-1) = 2 · π/4 = π/2 = 1.570796327...
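All three answers can be confirmed numerically (a sketch using scipy's quadrature, not part of the original text):

from scipy.integrate import quad
import numpy as np

print(quad(lambda x: x**5 - 4*x**2, 0, 2)[0])       # expect 0
print(quad(lambda x: 1/x**2 + np.cos(x), 1, 4)[0])  # expect about -0.8483
print(quad(lambda x: 1/(1 + x**2), -1, 1)[0])       # expect pi/2, about 1.5708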

Examples 7.1.20(c):
Show that if one starts with an integrable function f in the Fundamental
Theorem of Calculus that is not continuous, the corresponding function F
may not be differentiable.

Our Fundamental Theorem states that if we start with a continuous function f(t) over some
interval [a, b], then the new function F(x) obtained by integrating f from a to some variable
value x is differentiable:
F(x) = ∫[a, x] f(t) dt is differentiable as long as f is continuous


Now let's start with a simple step function that is integrable but not continuous over the
interval, say, [-1, 1]. Computing F(x) = ∫[-1, x] f(t) dt separately for x < 0 and for x ≥ 0
gives a piecewise formula for F with a corner at x = 0, so F is continuous but
not differentiable at x = 0.
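One concrete instance (an assumption on our part, since the original definition of the step function is not reproduced here): take f(t) = 0 for t < 0 and f(t) = 1 for t ≥ 0. Then F(x) = ∫[-1, x] f(t) dt = 0 for x < 0 and F(x) = x for x ≥ 0, so F is continuous on [-1, 1], but the left-hand derivative at 0 is 0 while the right-hand derivative is 1; hence F is not differentiable at x = 0.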



Example 7.2.1: Standard Antiderivatives


Find the following antiderivatives:
(a) ∫ x^r dx
(b) ∫ 1/x dx
(c) ∫ e^x dx
(d) ∫ sin(x) dx
(e) ∫ cos(x) dx
(f) ∫ tan(x) dx
(g) ∫ 1/(1 + x²) dx
(h) ∫ 1/cos²(x) dx
(i) ∫ 1/√(1 - x²) dx

For each function to integrate we need to come up with another function whose derivative
is the integrand. Therefore we guess a function F, then differentiate it (for which there are
fixed rules). If indeed F' is equal to the integrand, we have found an antiderivative and we
are done.
Actually, if we can find one antiderivative F of a function f, the function
F(x) + c is also an antiderivative. To simplify notation, we will be content
with finding one antiderivative and not note the arbitrary constant
(which drops out anyway when evaluating F(b) - F(a)).
(a) ∫ x^r dx
We need a function whose derivative is f(x) = x^r. Since differentiating a power
reduces it by one, finding the opposite might mean increasing the power by
one. Therefore, our first guess for the antiderivative is F(x) = x^(r+1). A quick
check shows:
d/dx x^(r+1) = (r+1) x^r
which has the correct power but an incorrect coefficient. Therefore we
guess again: let F(x) = 1/(r+1) x^(r+1), where r ≠ -1. Then
F'(x) = (r+1)/(r+1) x^r = x^r
so that we have found the correct antiderivative, valid for all x and for r ≠ -1:
∫ x^r dx = 1/(r+1) x^(r+1)
(b) ∫ 1/x dx
This is x to the power -1, which we had to exclude from the example above.
Simple power functions, therefore, are not sufficient as antiderivatives and
we need to resort to more exotic functions. Thinking a little we remember
that there is the natural logarithm ln whose derivative is
d/dx ln(x) = 1/x
But that function is only defined for positive x. On the other hand, if x is
negative then -x is positive, and
d/dx ln(-x) = -1/(-x) = 1/x
again. Therefore
∫ 1/x dx = ln( |x| )
where the bounds of integration must either both be positive or both be
negative, i.e. the point x = 0 cannot be inside the integration interval [a, b].
(c) ∫ e^x dx
The distinguishing feature of the exponential function is that it is its own
derivative. But then it is also its own antiderivative, so that
∫ e^x dx = e^x
(d) ∫ sin(x) dx
The close cousin of the trig function sin is of course the cos, whose
derivative is -sin. But then:
∫ sin(x) dx = -cos(x)
(e) ∫ cos(x) dx
This is similar to above, and we can guess the answer right away (and verify
it by differentiating the right side to obtain the integrand):
∫ cos(x) dx = sin(x)
(f) ∫ tan(x) dx
Now it gets complicated. No simple function has the tan as its derivative, so
we seem to be stuck. But then tan(x) = sin(x) / cos(x), which at first glance
does not help much. But in terms of differentiation, the top function
happens to be the derivative of the bottom function (except for a negative
sign), so that we - after being struck by intuition - come up with a function
F(x) = ln(cos(x)). Differentiating gives F'(x) = -sin(x) / cos(x), which is
almost correct. Adjusting for the minus sign we have as our final answer:
∫ tan(x) dx = - ln( cos(x) )
where the integration interval [a, b] must be contained in (-π/2, π/2). (We will soon see a
technique called substitution that can clarify the intuitive flash we had.)

(g) ∫ 1/(1 + x²) dx
This is again tricky. We could guess for quite a while here, not much seems
to work. But then - again an intuitive wonder - we remember the arctan
function, whose derivative is just what we need:
d/dx arctan(x) = 1/(1 + x²)
so that:
∫ 1/(1 + x²) dx = arctan(x)

(h) ∫ 1/cos²(x) dx
Well, tricky again. But we already tried to find the antiderivative of the tan
in one of the earlier examples. While we were trying to find that answer, we
might have computed the derivative of the tan, which was at that time not
helpful. But it works now, as a quick application of the quotient rule
shows:
d/dx tan(x) = ( cos²(x) + sin²(x) ) / cos²(x) = 1/cos²(x)
Therefore:
∫ 1/cos²(x) dx = tan(x)

(i) ∫ 1/√(1 - x²) dx
Now it gets to be really exotic. We have already seen that the derivative of
the arctan function yields an expression with an x² in the denominator.
Therefore we should recall the derivatives of the other two (less frequently
used) inverse trig functions arccos and arcsin:
d/dx arcsin(x) = 1/√(1 - x²)
Therefore we have:
∫ 1/√(1 - x²) dx = arcsin(x)
Incidentally, can you prove that the derivative for the arcsin is indeed as
shown above (or for the arccos or arctan, for that matter)?
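Each of the answers above can be verified mechanically by differentiating it and comparing with the integrand (a sketch using sympy; the assumption x > 0 below just keeps the logarithm and square root single-valued, and r = 3 stands in for a general exponent):

import sympy as sp

x = sp.symbols('x', positive=True)
pairs = [
    (x**4 / 4, x**3),                     # (a) with the concrete exponent r = 3
    (sp.log(x), 1 / x),                   # (b), for x > 0
    (sp.exp(x), sp.exp(x)),               # (c)
    (-sp.cos(x), sp.sin(x)),              # (d)
    (sp.sin(x), sp.cos(x)),               # (e)
    (-sp.log(sp.cos(x)), sp.tan(x)),      # (f), on (-pi/2, pi/2)
    (sp.atan(x), 1 / (1 + x**2)),         # (g)
    (sp.tan(x), 1 / sp.cos(x)**2),        # (h)
    (sp.asin(x), 1 / sp.sqrt(1 - x**2)),  # (i), on (-1, 1)
]
for F, f in pairs:
    # each difference should simplify to 0
    print(sp.simplify(sp.diff(F, x) - f) == 0)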

Theorem 7.2.2: Substitution Rule


If f is a continuous function defined on [a, b] and s is a continuously
differentiable function from [c, d] into [a, b], then
∫[c, d] f(s(x)) s'(x) dx = ∫[s(c), s(d)] f(u) du

Proof:
f is continuous so that there exists a function F with F' = f (in other words, F is an
antiderivative of f). Differentiate the function F(s(x)) using the Chain Rule:
d/dx F(s(x)) = F'(s(x)) s'(x) = f(s(x)) s'(x)
because F' = f. Therefore the composite function F(s(x)) is an antiderivative of f(s(x)) s'(x),
so that by our evaluation shortcut we have:
∫[c, d] f(s(t)) s'(t) dt = F(s(d)) - F(s(c))
But since F is by assumption an antiderivative of f we also have that
F(s(d)) - F(s(c)) = ∫[s(c), s(d)] f(u) du
which finishes the proof.
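A quick numerical illustration of the substitution rule (a sketch; f(u) = u² and s(x) = sin(x) on [0, π/2] are illustrative choices):

from scipy.integrate import quad
import numpy as np

f = lambda u: u**2            # continuous f
s = lambda x: np.sin(x)       # continuously differentiable s
ds = lambda x: np.cos(x)      # s'
c, d = 0.0, np.pi / 2

left, _ = quad(lambda x: f(s(x)) * ds(x), c, d)  # integral of f(s(x)) s'(x) over [c, d]
right, _ = quad(f, s(c), s(d))                   # integral of f(u) over [s(c), s(d)]
print(left, right)                               # both should be about 1/3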


Theorem 7.2.4: Integration by Parts


Suppose f and g are two continuously differentiable functions on [a, b]. Let G(x) =
f(x) g(x). Then
∫[a, b] f(x) g'(x) dx = ( G(b) - G(a) ) - ∫[a, b] f'(x) g(x) dx

Proof:
For the function G(x) = f(x) g(x) we have by the Product Rule:
G'(x) = d/dx [ f(x) g(x) ] = f'(x) g(x) + f(x) g'(x)
Therefore the function G is an antiderivative of the function f'(x) g(x) + f(x) g'(x), which
means that
G(b) - G(a) = ∫[a, b] f'(x) g(x) + f(x) g'(x) dx = ∫[a, b] f'(x) g(x) dx + ∫[a, b] f(x) g'(x) dx

But that is equivalent to the statement we want to prove.
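As a numerical sanity check of integration by parts (a sketch; f(x) = x and g(x) = sin(x) on [0, π] are illustrative choices):

from scipy.integrate import quad
import numpy as np

f, df = (lambda x: x), (lambda x: 1.0)                 # f and f'
g, dg = (lambda x: np.sin(x)), (lambda x: np.cos(x))   # g and g'
a, b = 0.0, np.pi

lhs, _ = quad(lambda x: f(x) * dg(x), a, b)      # integral of f(x) g'(x)
boundary = f(b) * g(b) - f(a) * g(a)             # G(b) - G(a) with G = f g
rhs_int, _ = quad(lambda x: df(x) * g(x), a, b)  # integral of f'(x) g(x)
print(lhs, boundary - rhs_int)                   # the two sides should agree (about -2)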



Theorem 7.2.6: Mean Value Theorem for Integration


If f and g are continuous functions defined on [a, b] with g(x) ≥ 0, then
there exists a number c ∈ [a, b] with
∫[a, b] f(x) g(x) dx = f(c) ∫[a, b] g(x) dx

Proof:
Define the numbers
m = inf{ f(x): x ∈ [a, b] }
M = sup{ f(x): x ∈ [a, b] }
Then we have m ≤ f(x) ≤ M, and since g is non-negative we also have
m g(x) ≤ f(x) g(x) ≤ M g(x)
By the properties of the Riemann integral this implies that
m ∫[a, b] g(x) dx ≤ ∫[a, b] f(x) g(x) dx ≤ M ∫[a, b] g(x) dx
Therefore there exists a number d between m and M such that
d ∫[a, b] g(x) dx = ∫[a, b] f(x) g(x) dx
(if ∫[a, b] g(x) dx = 0, then both integrals are zero and any d between m and M works).
But since f is continuous on [a, b] and d is between m and M, we can apply the
Intermediate Value Theorem to find a number c such that f(c) = d. Then
f(c) ∫[a, b] g(x) dx = ∫[a, b] f(x) g(x) dx

which is what we wanted to prove.
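The number c can also be located numerically in a concrete case (a sketch; f(x) = x² and g(x) = 1 on [0, 1] are illustrative choices, for which c = 1/√3):

from scipy.integrate import quad
from scipy.optimize import brentq
import numpy as np

f = lambda x: x**2            # continuous f
g = lambda x: 1.0             # non-negative weight g
a, b = 0.0, 1.0

fg, _ = quad(lambda x: f(x) * g(x), a, b)
gI, _ = quad(g, a, b)
# solve f(c) * integral(g) = integral(f*g) for c in [a, b]
c = brentq(lambda t: f(t) * gI - fg, a, b)
print(c, 1 / np.sqrt(3))      # both should be about 0.5774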



Proposition 7.2.7: The Trapezoid Rule


Let f be a twice continuously differentiable function defined on [a, b] and
set
K = sup{ |f''(x)| : x ∈ [a, b] }
If h = (b - a) / n, where n is a positive integer, then
∫[a, b] f(x) dx = h [ 1/2 f(a) + f(a+h) + f(a+2h) + ... + f(a+(n-1)h) + 1/2 f(b) ] + R(n)
where |R(n)| < K/12 (b-a) h².



We first prove a simpler version of the trapezoid rule using the Mean Value Theorem for
integrals and integration by parts.
Simple Trapezoid Rule: Let f be a function defined on the interval [0, 1] so
that f is twice continuously differentiable. Then there exists a number c ∈ [0, 1] so that
∫[0, 1] f(x) dx = 1/2 (f(0) + f(1)) - 1/12 f''(c)
The trick to prove this statement is to define a function
v(x) = 1/2 x (1 - x)
which has the properties:
v(x) ≥ 0 for all x ∈ [0, 1]
v'(x) = 1/2 - x
v''(x) = -1

Then, since v''(x) = -1,
∫[0, 1] f(x) dx = - ∫[0, 1] v''(x) f(x) dx
Using integration by parts with g'(x) = v''(x) we get:
∫[0, 1] v''(x) f(x) dx = v'(1) f(1) - v'(0) f(0) - ∫[0, 1] v'(x) f'(x) dx =
= -1/2 f(1) - 1/2 f(0) - ∫[0, 1] v'(x) f'(x) dx
Again using integration by parts with g'(x) = v'(x), and using that v(0) = v(1) = 0, we get:
∫[0, 1] v'(x) f'(x) dx = v(1) f'(1) - v(0) f'(0) - ∫[0, 1] v(x) f''(x) dx = - ∫[0, 1] v(x) f''(x) dx =
= - f''(c) ∫[0, 1] v(x) dx = - f''(c) 1/12
where we used the Mean Value Theorem for Integration with some number c inside the
interval [0, 1]. Taking everything together (careful with the negative signs) we then have:
∫[0, 1] f(x) dx = 1/2 f(1) + 1/2 f(0) + ∫[0, 1] v'(x) f'(x) dx = 1/2 f(1) + 1/2 f(0) - 1/12 f''(c)
which proves the simple Trapezoid Rule.
To prove the general Trapezoid Rule, assume that f is defined on [a, b].
Let h = (b - a) / n, pick an integer j, and define the function u(x) = a + jh + xh for x ∈ [0, 1].
The composite function g(x) = f(u(x)) is twice continuously differentiable and defined on
the interval [0, 1] so that the simple trapezoid rule applies:
∫[0, 1] g(x) dx = 1/2 g(0) + 1/2 g(1) - 1/12 g''(c_j)
for some c_j in [0, 1]. But g(0) = f(u(0)) = f(a + jh), g(1) = f(u(1)) = f(a + (j+1)h), and
g''(x) = h² f''(u(x)). Moreover, since u'(x) = h, the substitution rule gives
∫[0, 1] g(x) dx = ∫[0, 1] f(u(x)) dx = 1/h ∫[0, 1] f(u(x)) u'(x) dx = 1/h ∫[a + jh, a + (j+1)h] f(u) du
Therefore:
∫[a + jh, a + (j+1)h] f(u) du = h [ 1/2 f(a + jh) + 1/2 f(a + (j+1)h) ] - h³/12 f''(u(c_j))
Summing this equation from j = 0 to j = n-1 gives:
∫[a, b] f(x) dx = h [ 1/2 f(a) + f(a+h) + ... + f(a+(n-1)h) + 1/2 f(b) ] + R(n)
where
|R(n)| = | -h³/12 ( f''(u(c_0)) + ... + f''(u(c_{n-1})) ) | ≤ n h³/12 K = K/12 (b-a) h²
which finishes the proof.
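The composite rule translates directly into a short routine (a sketch; the test integrand e^x on [0, 1] with n = 10 is an illustrative choice, for which K = sup |f''| = e):

import numpy as np

def trapezoid(f, a, b, n):
    # composite trapezoid rule with n subintervals of width h = (b - a) / n
    h = (b - a) / n
    x = np.linspace(a, b, n + 1)
    y = f(x)
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

f = np.exp
a, b, n = 0.0, 1.0, 10
K = np.e
approx = trapezoid(f, a, b, n)
exact = np.e - 1
bound = K / 12 * (b - a) * ((b - a) / n) ** 2
print(approx, exact, abs(approx - exact), bound)  # the actual error stays below the bound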



Example 7.2.8: Application of the Trapezoid Rule


Compare the numeric approximations to the integral
∫[0, 1] sin(x) cos(x) dx
obtained by using (a) a left Riemann sum and (b) the Trapezoid Rule, using
a partition of size 5 and of size 100.

First, let's determine the exact value of the integral by substitution. Let
u = sin(x) so that
du/dx = cos(x) or du = cos(x) dx
Then
∫[0, 1] sin(x) cos(x) dx = ∫[0, sin(1)] u du =
= 1/2 (sin²(1) - sin²(0)) = 0.5 (0.7080734183 - 0) = 0.3540367092
To find the left Riemann sum, we let f(x) = sin(x) cos(x) and compute:
f(0) = 0
f(0.2) = 0.1947091712
f(0.4) = 0.3586780454
f(0.6) = 0.4660195430
f(0.8) = 0.4997868015
and therefore
R(P, f) = 1/5*(f(0) + f(0.2) + f(0.4) + f(0.6) + f(0.8)) =
= 0.2 * 1.519193562 = 0.3038387122
The error between the approximate and exact value is about 0.05, or 14%. To estimate the
error using the trapezoid rule, we compute f''(x) = -4 sin(x) cos(x) so that K ≤ 4. Then
|R| < 4/12 * 1 * 0.2² = 0.04 / 3 ≈ 0.013
so that even the theoretically worst error is a lot better (less than 4%). To find the value
using the trapezoid rule, we need to evaluate f at the same values as before and also
compute f(1) = 0.4546487134. Then the trapezoid rule, combining these numbers a little
differently than the left Riemann sum, gives:
[1/2 f(0) + f(0.2) + f(0.4) + f(0.6) + f(0.8) + 1/2 f(1)] * 0.2 =
= [0 + 1.519193562 + 0.5 * 0.4546487134] * 0.2 =
= 0.3493035838
which is indeed much closer than our previous approximation (the error is only about 1%).
Incidentally, if we increased the size of the partition to 100, the error committed by the
trapezoid rule would already be less than 1/3 * 0.01² ≈ 0.00003, or 0.01%, whereas the
left Riemann sum would still have an error of about 0.003, or about 1%.
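The comparison in this example is easy to reproduce (a sketch, not part of the original text; it recomputes both approximations for n = 5 and n = 100):

import numpy as np

def left_riemann(f, a, b, n):
    h = (b - a) / n
    x = a + h * np.arange(n)              # left endpoints of the n subintervals
    return h * f(x).sum()

def trapezoid(f, a, b, n):
    h = (b - a) / n
    x = np.linspace(a, b, n + 1)
    y = f(x)
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

f = lambda x: np.sin(x) * np.cos(x)
exact = 0.5 * np.sin(1.0) ** 2            # exact value, about 0.3540367
for n in (5, 100):
    print(n, left_riemann(f, 0.0, 1.0, n) - exact, trapezoid(f, 0.0, 1.0, n) - exact)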

Theorem 7.2.9: Partial Fraction Decomposition

Suppose p(x) is a polynomial of degree n such that p(x) = q_1(x) q_2(x) ··· q_m(x), where
each q_j is a polynomial that is irreducible over R. If s(x) is another
polynomial of degree less than n with no factors in common with p(x), then
the rational function s(x) / p(x) can be written as a finite sum
s(x) / p(x) = r_1(x) / q_1(x) + r_2(x) / q_2(x) + ... + r_m(x) / q_m(x)
where each r_j is a polynomial of degree less than the degree of q_j.



Proof:
The proof is easiest when allowing complex numbers instead of reals and is not given.
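For concrete rational functions the decomposition can be computed with a computer algebra system (a sketch using sympy; the example fraction is an illustrative choice):

import sympy as sp

x = sp.symbols('x')
# s(x) / p(x) with p factored into factors irreducible over R
expr = (3*x + 5) / ((x - 1) * (x**2 + 1))
print(sp.apart(expr, x))   # equivalent to 4/(x - 1) - (4*x + 1)/(x**2 + 1)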
