
Fundamentals of Mathematics

Sets, Logic and Relations


Vasco Brattka
Cape Town
February 11, 2011
Picture of the Sierpiński Pyramid on the front page is taken from:
http://en.wikipedia.org/wiki/File:Sierpinski_pyramid.png
It is under GNU Free Documentation License, Version 1.2 and
Creative Commons Attribution ShareAlike 3.0 Licence
For the written notes
© 2010 Vasco Brattka
All rights reserved.
Version of February 11, 2011
Contents
1 Mathematics
1.1 What is Mathematics about?
1.2 What are Proofs?
1.3 Indirect Proofs and the Principle of Excluded Middle
2 Sets
2.1 What is a Set?
2.2 Explicit Definitions of Sets
2.3 Subsets and Comprehension
2.4 Russell's Paradox
2.5 Union and Intersection of Sets
2.6 Difference and Complement of Sets
2.7 Union and Intersection of Indexed Families of Sets
2.8 Power Sets
2.9 Product of Sets
2.10 Disjoint Union of Sets
3 Logic
3.1 What is Logic?
3.2 Propositional Logic
3.3 First-Order Logic
3.4 Correspondence Between Logic and Set Theory
4 Relations and Functions
4.1 What are Relations?
4.2 Composition and Inverse Relations
4.3 Functions
4.4 Injections, Surjections and Bijections
4.5 Families, Sequences and Restrictions
4.6 Images and Preimages
4.7 Set of Functions
4.8 The Axiom of Choice
4.9 Infinite Products
5 Cardinality
5.1 What is the Cardinality of a Set?
5.2 The Theorem of Schröder-Bernstein
5.3 Cantor's Diagonalization Method
5.4 The Continuum Hypothesis
5.5 Cantor's Pairing Function
5.6 Induction Principle on Natural Numbers
5.7 Finite and Countable Sets
5.8 Dedekind Infinite Sets
5.9 Cardinality and Set Constructions
6 Order
6.1 What is Order?
6.2 Reflexivity, Symmetry and Transitivity
6.3 Equivalence Relations
6.4 Preorders, Partial Orders and Linear Orders
6.5 Monoids
6.6 Maximum and Minimum
6.7 Supremum and Infimum
Axiomatic Set Theory
Mathematicians
Greek Alphabet
Mathematical Symbols
Index
CHAPTER 1
Mathematics
In my own experience, mathematics in general and pure mathematics
in particular has always seemed like secret gardens, special places where
I could grow exotic and beautiful theories. You need a key to get in, a key
that you earn by letting mathematical structures turn in your head until they
are as real as the room you are sitting in.
David Mumford (Fields Medalist, Brown University)
1.1 What is Mathematics about?
What is mathematics about? It is difficult to come up with an exact answer to
this question. Perhaps the best way to approach this question is to look at what
the actual practice in mathematics is and to look at the areas that mathematicians
actually work in.
However, even this is not so easy to undertake. The Mathematical Reviews
database of the American Mathematical Society (AMS) contains references to more
than 2 million articles produced by many thousands of authors, and currently about
one hundred thousand articles are added each year, which means that on average
more than 270 mathematical articles are published per day.¹ The articles in this
database are classified according to the Mathematics Subject Classification, and this
classification alone is almost 50 pages long. Today an active mathematician is usually
an expert in just one tiny subfield in some of these categories and has only a
rough idea about some of the others. Nowadays, the sheer volume of the body of
knowledge in mathematics is so enormous that no single human being can oversee
all of it. Following the exposition in [5], one can subsume most of mathematics under
the following main areas.

¹ This database can be found at http://www.ams.org/mathscinet/
Main areas of mathematics
1. Algebra
2. Number Theory
3. Geometry
4. Algebraic Geometry
5. Analysis
6. Logic
7. Combinatorics
8. Theoretical Computer Science
9. Probability
10. Mathematical Physics
Of course, this classification is a simplified one and many mathematicians will
wonder where their particular area can be found here. Regarding these topics and
numbers that we have mentioned, we can try a vague encyclopaedia-like definition
of what mathematics is:
Mathematics is the science of structure, quantity, change and space and
the interactions between them. While mathematical ideas can be inspired
by everyday observations, it is a characteristic feature of mathematical
truth that it is derived with logical reasoning on the basis of sound
definitions. Mathematics dates back to ancient times, but has undergone
some of its most dramatic advances in the modern era. Nowadays it
can be considered as one of the most successful collective human
endeavours. Each day mathematicians all over the world prove hundreds of
new theorems and solve numerous open problems and in this way they
contribute to the systematic body of knowledge that comprises modern
mathematics.
Besides the question of what mathematics is, this description also addresses the
question of how mathematics approaches its subject. That is, besides the content
there is an activity that characterizes what mathematics is, and the main tool of this
activity is the proof. Mathematics is developed by rigorous reasoning about precise
definitions, and the result of this reasoning is presented in the form of theorems. The
correctness of a theorem is usually witnessed by a proof. In the next section we will
deal with the question of what a proof is.
The results of mathematical work appear under different titles, and we give
the reader a short glossary of the terminology.
1. Theorem: This usually stands for some major result that might itself be based
on several other auxiliary results. In a mathematical article often just a handful
or even only one result comes with the title of a theorem and that is then
usually the main result. Sometimes a theorem can also have a very simple
proof.
2. Corollary: A corollary is usually a direct conclusion made from results that
have been presented before. It usually does not come with a separate proof.
3. Proposition: A proposition is usually a result, which is considered as interesting
by itself and which is worth being spelled out separately, although it might
have a relatively simple proof and is not necessarily a major achievement.
4. Lemma: A lemma is typically an auxiliary result that is used to prove some
other theorem. It is spelled out separately, because this structures the entire
proof and makes it usually more understandable. Sometimes, lemmas are so
useful that they become very well-known and perhaps better known than the
theorems originally derived from them.
The above terminology is not entirely clear-cut and the boundaries between these
different terms are fuzzy. Different authors also adopt different habits in using these
terms, and the above is just meant as a rough guideline.
1.2 What are Proofs?
If Gauss says he has proved something, it seems very probable to me; if Cauchy says
so, it is about as likely as not; if Dirichlet says so, it is certain.
Carl Gustav Jacob Jacobi (1804-1851)
Now, what exactly is this activity of mathematicians about? What is a
mathematical proof? Usually, a proof is considered to be a text that convinces the reader of a
certain result by means of rigorous logical reasoning about the underlying definitions
and concepts. But what is rigorous logical reasoning? The truth is that we cannot
present a proper definition of rigorous reasoning and that mathematics is learned
by doing. This is a bit like learning to ride a bicycle. It is very hard to describe in words
what you have to do, but somebody will show you how to do it and eventually you
will manage not to fall. Basically, everybody can learn how to reason logically and
rigorously in the mathematical sense, but it requires some years of practice under
the guidance of other mathematicians to achieve some mastery in this discipline.
So, let us start right away and let us look into some proofs.
We recall that the natural numbers are exactly the numbers $0, 1, 2, 3, \ldots$. We
write
\[ \mathbb{N} = \{0, 1, 2, 3, \ldots\} \]
for the set of natural numbers. Strictly speaking, this is not a good definition of
$\mathbb{N}$, since it leaves the dots "..." open to interpretation. However, we assume that
the reader has some intuitive understanding of the concept of natural numbers and
hence the definition above is clear enough. For the professional mathematician, the
most important information in this definition is that $0$ is considered as a natural
number. Some authors also start with $1$ here, but throughout this text we will
consider $0$ as a natural number as well.
Now, among the natural numbers we single out the prime numbers as an interesting
subset. We recall the definition.
Definition 1.1 (Prime numbers) A natural number $p \geq 2$ is called a prime
number if it has no natural number as a divisor other than $1$ and $p$ itself. By
\[ \mathbb{P} = \{p \in \mathbb{N} : p \text{ is a prime number}\} \]
we denote the set of all prime numbers.
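Definition 1.1 can be checked mechanically for small numbers. The following Python sketch tests the definition literally, by trying every candidate divisor; the function name `is_prime` and the bound 100 are choices made only for this illustration.

```python
def is_prime(p: int) -> bool:
    """Return True if p is a prime number in the sense of Definition 1.1."""
    if p < 2:
        return False
    # p is prime if no natural number d with 1 < d < p divides it.
    return all(p % d != 0 for d in range(2, p))


# All prime numbers below 100, found by testing each candidate in turn.
primes_below_100 = [p for p in range(100) if is_prime(p)]
print(primes_below_100)
```

This is deliberately the slowest reasonable test: it mirrors the wording of the definition rather than aiming for efficiency.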
An easy calculation shows that the first few prime numbers are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, ...
One obvious question is whether this sequence of numbers ends eventually or whether
it is infinite. Euclid already proved more than 2000 years ago that the set of prime
numbers is infinite. His proof is basically still the same one that we use nowadays. Let
us formulate our first theorem and proof to illustrate the mathematical activity.
Theorem 1.2 (Euclid, 300 BC) The set $\mathbb{P}$ of prime numbers is not finite.
Proof. Let us assume the contrary, i.e. suppose that there are exactly finitely many
prime numbers $p_1, \ldots, p_n$ for some $n \in \mathbb{N}$. We know that there are prime numbers
such as $2$ and hence $n \geq 1$. That is, the finite set $\mathbb{P} = \{p_1, \ldots, p_n\}$ is the set of all
prime numbers and it is not empty. Now consider the product of all these numbers
plus one:
\[ k = p_1 p_2 \cdots p_n + 1. \]
This number $k > 1$ has a prime divisor $p$ and hence $p \in \mathbb{P}$. Then $p$ divides the
product $p_1 p_2 \cdots p_n$ and the number $k$, and hence it also divides $1 = k - p_1 p_2 \cdots p_n$,
which is impossible. This means that we have a contradiction and our assumption
was wrong. Thus, the set $\mathbb{P}$ of prime numbers cannot be finite. $\Box$
The little box $\Box$ indicates the position in the text where the proof ends. Some
authors use other symbols for this purpose or they write "q.e.d." (which stands for
the Latin phrase "quod erat demonstrandum", i.e. "which was to be demonstrated").
This version of the proof can be found in many textbooks and it is considered to be
a logically rigorous example of a proof and a starting point of number theory.
Despite this fact, the proof raises a number of questions:
1. What exactly does it mean that a set is finite or not finite?
2. What is a set at all?
3. We have proved that the assumption that there are only finitely many prime
numbers leads to a contradiction. Is it admissible to conclude in this indirect
way that there are infinitely many prime numbers?
4. And is this really an indirect proof?
5. What is a proof at all?
6. Why is it so that the number $k$ in the proof must have a prime divisor?
7. What does it mean precisely that one number divides another number?
All these questions are legitimate questions and the first five questions really
touch some core topics of this course. We will try to answer these questions during
this course step by step. The last two questions are rather specific technical questions
and they address indeed a gap in the proof that we have left and we will very soon
close this gap and answer the last two questions.
However, let us step back for a moment and let us analyse what this experience
tells us about the mathematical concept of a proof:
1. The question of whether a proof is rigorous enough is a context-dependent
question. It depends on what the reader is supposed to know, it depends on
the relevant background and the development of the subject and hence it also
depends on the level of advancement of the presentation.
2. In a course of mathematics it has to be negotiated between students and
lecturers what the right level of rigour is, and even during the course it might
depend on the exact advancement within the course. At some stage, a
certain type of argument needs to be practised in detail and lecturers will expect
that students flesh out every little detail of the required argument. At a later
stage it is taken for granted that this type of technique is mastered and the
corresponding claim just needs to be mentioned without proof.
What this means is that there is no mathematically well-defined concept of
"rigorous enough", but this is a topic that needs to be resolved by interaction within the
relevant community (this could be a class or, for instance, the group of all experts
in a particular field in general).
Now let us try to close some gaps that we have left in the above proof of Euclid's
theorem on the infinity of primes. Firstly, we have to define precisely what divisibility
means. That one natural number $a$ divides another number $b$, in symbols $a \mid b$, means
that there exists a natural number $d$ such that $b = ad$. In mathematical terms this
is often written as follows.
Definition 1.3 (Divisibility) Let $a$ and $b$ be natural numbers. We define
\[ a \mid b \;:\Longleftrightarrow\; (\exists d \in \mathbb{N})\; b = ad \]
and if $a \mid b$ holds, then we say that $a$ divides $b$ and that $a$ is a divisor or factor of $b$
and that $b$ is a multiple of $a$.
Here "$\mid$" is read as "divides", "$:\Longleftrightarrow$" is read as "is defined to be equivalent
to" and "$\exists$" is read as "there exists". We summarize all the symbols that we use in
the course of this text in an appendix. A prime divisor is a divisor that is a prime
number. Now we can formulate the following lemma that closes the most essential
gap in the proof of Theorem 1.2. It shows why the number $k$ in the proof has to
have a prime divisor.
Lemma 1.4 Any natural number $n > 1$ has a prime divisor $p \in \mathbb{P}$.
Proof. Let $n > 1$ be a natural number. Let us consider the set
\[ D = \{d \in \mathbb{N} : d > 1 \text{ and } d \text{ is a divisor of } n\} \]
of all natural numbers $d > 1$ that divide $n$. This set is certainly not empty since
$n \in D$, and hence there is a minimal number $m \in D$ and we write $m = \min(D)$. If
this $m$ is a prime number, then we have found the desired prime divisor of $n$. If $m$
is not a prime number, then it has a natural number $d$ as a divisor other than $1$ and
$m$ itself. This $d$ divides $m$ and hence also $n$, i.e. $d \in D$ and $d < m = \min(D)$. But
this is a contradiction and hence this second case cannot occur. $\Box$
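The construction in the proof of Lemma 1.4 can be carried out directly for concrete numbers. The following Python sketch builds the set $D$ as a list and returns its minimum; the names `smallest_divisor_greater_than_one` and `has_proper_divisor` are hypothetical helpers introduced only for this illustration.

```python
def smallest_divisor_greater_than_one(n: int) -> int:
    """Return min(D) for D = {d in N : d > 1 and d is a divisor of n}."""
    if n <= 1:
        raise ValueError("n must be a natural number greater than 1")
    # n itself divides n, so D is never empty and min(D) exists.
    D = [d for d in range(2, n + 1) if n % d == 0]
    return min(D)


def has_proper_divisor(m: int) -> bool:
    """True if m has a divisor other than 1 and m itself."""
    return any(m % d == 0 for d in range(2, m))


# In each case the minimum of D has no proper divisor, i.e. it is prime.
for n in (12, 91, 97):
    m = smallest_divisor_greater_than_one(n)
    print(n, m, has_proper_divisor(m))
```

A finite check like this is of course no substitute for the proof; it only illustrates what the proof asserts.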
Note that we have proved more than claimed: we have even proved that each
natural number $n > 1$ has a prime number as its smallest divisor $d > 1$. Once again
one could complain that this proof is not rigorous enough. For instance, we did not
prove that if $d$ divides $m$ and $m$ divides $n$, then $d$ also divides $n$. This property
is called transitivity of the divisor relation and we will discuss it in the exercises.
Besides this question and the questions raised above, this second proof that we have
seen provokes a number of further questions:
1. Are we allowed to form arbitrary sets or why can we just build a set like $D$?
2. What exactly is a minimum and why does $D$ have such a minimum?
We will address all these questions more carefully in this course. Below we
formulate a number of exercises that continue our little excursion into number theory.
Here we end with a big open problem.
Conjecture 1.5 (Twin primes) There are infinitely many twin primes, i.e. there
are infinitely many pairs $(p, q)$ of prime numbers $p, q \in \mathbb{P}$ such that $q - p = 2$.
Here are the first few twin primes:
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103)
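The list above can be reproduced by a short search. The following Python sketch enumerates all twin prime pairs up to a given bound; the function names and the bound 103 are assumptions made for this illustration only.

```python
def is_prime(p: int) -> bool:
    """Trial-division primality test following Definition 1.1."""
    return p >= 2 and all(p % d != 0 for d in range(2, p))


def twin_primes(limit: int) -> list[tuple[int, int]]:
    """All pairs (p, q) of primes with q - p = 2 and q <= limit."""
    return [(p, p + 2) for p in range(2, limit - 1)
            if is_prime(p) and is_prime(p + 2)]


print(twin_primes(103))
```

Such a search can only ever confirm finitely many pairs, so it says nothing about whether the conjecture is true.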
It has long been conjectured that there are infinitely many such twin
primes, but until today nobody has managed to prove this conjecture. In recent years
there has been some partial progress on this matter, but as of today (February 2010)
there is no final solution to this problem.
So, after the discussion of a proof that is more than two thousand years old, we see
a very similar property as a conjecture that is still unsolved. Hence, the impression
that mathematics is a completed body of results is as wrong as it can be. Every
solution of a mathematical problem brings further new questions with it that await
a solution, and in some cases it can take a long time and can require enormous efforts
to find a solution.
Problems
1.1 Prove that the following properties of the divisor relation hold true for all natural
numbers $a, b, d, n, m$:
1. $n \mid n$ (reflexivity)
2. $d \mid n$ and $n \mid m$ implies $d \mid m$ (transitivity)
3. $d \mid n$ and $d \mid m$ implies $d \mid (an + bm)$ (linearity)
4. $ad \mid an$ and $a \neq 0$ implies $d \mid n$ (cancellation)
5. $1 \mid n$ ($1$ divides every natural number)
6. $n \mid 0$ (every natural number divides $0$)
7. $0 \mid n$ implies $n = 0$ ($0$ only divides itself)
8. $d \mid n$ and $n \neq 0$ implies $d \leq n$ (divisibility implies less or equal)
9. $d \mid n$ and $n \mid d$ implies $d = n$ (antisymmetry)
Identify the places in the proofs of this section where some of these properties have been
implicitly used without further mentioning.
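Before attempting the proofs, it can help to test the nine properties on small numbers. The following Python sketch implements Definition 1.3 literally, by searching for a witness $d$, and checks all properties for natural numbers below a small bound. The names `divides` and `check_properties` and the bound are assumptions of this sketch, and a finite check is evidence, not a proof.

```python
from itertools import product


def divides(a: int, b: int) -> bool:
    """a | b if and only if there is a natural number d with b = a * d."""
    # For natural numbers a witness d, if one exists, satisfies d <= b,
    # so it suffices to search d = 0, 1, ..., b.
    return any(b == a * d for d in range(b + 1))


def check_properties(bound: int) -> bool:
    """Check properties 1-9 of Problem 1.1 for all a, b, d, n, m < bound."""
    for a, b, d, n, m in product(range(bound), repeat=5):
        checks = [
            divides(n, n),                                                     # 1
            not (divides(d, n) and divides(n, m)) or divides(d, m),            # 2
            not (divides(d, n) and divides(d, m)) or divides(d, a * n + b * m),  # 3
            not (divides(a * d, a * n) and a != 0) or divides(d, n),           # 4
            divides(1, n),                                                     # 5
            divides(n, 0),                                                     # 6
            not divides(0, n) or n == 0,                                       # 7
            not (divides(d, n) and n != 0) or d <= n,                          # 8
            not (divides(d, n) and divides(n, d)) or d == n,                   # 9
        ]
        if not all(checks):
            return False
    return True


print(check_properties(5))
```

Note that `divides(0, 0)` is true under this definition, in agreement with properties 1 and 6.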
1.3 Indirect Proofs and the Principle of Excluded
Middle
So far as I can judge, Platonism of working mathematicians
is based on a feeling that important mathematical facts are
discoveries rather than inventions.
Yuri Manin (Bonn)
The proof of Euclid's result, Theorem 1.2, that the set of prime numbers is not
finite was presented as an indirect proof. An indirect proof is a proof that follows
the following logical pattern:
\[ (\neg A \Longrightarrow \bot) \Longrightarrow A. \]
Here $\bot$ stands for "false" and the symbol $\bot$ is also called falsum. The symbol $\Longrightarrow$
stands for "implies" and $\neg$ stands for "not". That is, if one uses the above logical
formula in order to prove $A$, then one first shows that the negation of $A$ implies
something incorrect and then one concludes that this entire implication entails $A$.
Here one reads $\neg A$ as "not $A$". Why is this pattern of logical reasoning justified? It is
essentially based on the principle of excluded middle, which we formulate separately.
Principle 1.6 (Excluded Middle, Aristotle 350 BC) Any well-defined
mathematical statement $A$ is either true or false. In particular, the statement $(A \vee \neg A)$ is
true.
Here $\vee$ is read as "or" and the principle says that $(A \vee \neg A)$ is true. For most
mathematicians the principle of excluded middle is clearly true. For instance, we
believe that the Twin Prime Conjecture 1.5 is either true or false; it is just that
we do not know which alternative is correct. There are some mathematicians who
would argue that the statement of the Twin Prime Conjecture 1.5 is not clearly true
or false. This direction of mathematics is called intuitionism and was essentially
founded by Brouwer. In intuitionistic logic the formula $(A \vee \neg A)$ is not considered
as correct, since our current knowledge does not suffice to say clearly whether a
statement $A$ such as the Twin Prime Conjecture 1.5 is correct or whether its negation
$\neg A$ is correct. That is, intuitionists have an understanding of truth that is time
dependent, and something that is not found to be true today might be recognized
as true tomorrow. Most mathematicians rather follow platonism and they believe
that any well-defined mathematical statement is either true or not. By the way,
we will say that a mathematical object or statement is well-defined if its definition
or specification has a clear and non-ambiguous mathematical interpretation that
actually leads to an object of the specified type. In case of a statement the type
would be a truth value that we can clearly assign.
Directions of Mathematical Philosophy: Platonism versus Intuitionism
1. Platonism: Any well-defined mathematical statement $A$ is either true or false.
2. Intuitionism: For some well-defined mathematical statements $A$ proofs have
been found, i.e. those $A$ are true; others have been disproved, i.e. for them $\neg A$
is true. For some statements $A$ currently neither $A$ nor $\neg A$ is known.
For intuitionists truth entails knowledge and hence they cannot relate to the
statement that $(A \vee \neg A)$ holds in cases where neither $A$ nor $\neg A$ is known. We
adopt the platonistic mainstream philosophy of mathematics for this course and we
assume that the Principle of Excluded Middle is correct. Using this principle we can
obtain a justification for indirect proofs.
Proposition 1.7 (Indirect proof) For each well-defined mathematical statement
$A$ the reasoning $((\neg A \Longrightarrow \bot) \Longrightarrow A)$ is correct.
Proof. Let $A$ be some well-defined mathematical statement for which we can show
$(\neg A \Longrightarrow \bot)$. This means that if $A$ does not hold, then something false follows.
The Principle of Excluded Middle 1.6 tells us that either $A$ or $\neg A$ is correct. Since
something false follows from $\neg A$, we have no other option but to conclude that $A$
must be correct. $\Box$
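Under the platonistic reading, in which every statement has one of exactly two truth values, the pattern of Proposition 1.7 can be checked with a truth table. The following Python sketch does this; the names `implies`, `FALSUM`, `indirect_proof_pattern` and `excluded_middle` are introduced here for illustration. Note that evaluating the formula over just the two values `True` and `False` already presupposes the Principle of Excluded Middle, so this is a consistency check rather than an independent justification.

```python
def implies(x: bool, y: bool) -> bool:
    """Classical implication: x => y is false only if x is true and y is false."""
    return (not x) or y


FALSUM = False  # the truth value of the symbol "falsum"


def indirect_proof_pattern(a: bool) -> bool:
    """Truth value of ((not A => falsum) => A)."""
    return implies(implies(not a, FALSUM), a)


def excluded_middle(a: bool) -> bool:
    """Truth value of (A or not A)."""
    return a or (not a)


for a in (True, False):
    print(a, indirect_proof_pattern(a), excluded_middle(a))
```

Both formulas evaluate to true in every row, which is what it means for them to be classically valid.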
We formulate the indirect proof method (also called proof by contradiction or
reductio ad absurdum) as a method again.
Proof Method (Proof by Contradiction)
If we want to prove $A$, then it is sufficient to prove that $\neg A \Longrightarrow \bot$ holds, since this
implies $A$ by the Principle of Excluded Middle.
We have seen that indirect proofs are essentially based on the Principle of
Excluded Middle 1.6 and our platonistic mindset. Despite this platonistic mindset,
many theorems in mathematics are constructive, so they can also be proved without
using the Principle of Excluded Middle. One of the first examples of a result that
was not provable without the Principle of Excluded Middle is Hilbert's Basis
Theorem, which Hilbert proved indirectly. His proof provoked significant controversies and
doubts about the justification of the indirect method. However, indirect proofs are
sometimes much shorter and more elegant than direct proofs. Often, proofs without
the Principle of Excluded Middle are significantly harder, but today we know that
it is worth investing these additional efforts. The benefit is that proofs done in
intuitionistic logic, i.e. without the Principle of Excluded Middle, can (more or less
automatically) be translated into computer programmes. That is, one can extract
programmes from intuitionistic proofs and this is not possible, in general, for
non-intuitionistic proofs. From this perspective, intuitionism has found a very pragmatic
justification and is, as a technique rather than as a philosophy, successfully used for
this and other purposes also by platonists.
Problems
1.2 Revisit the proof of Euclid's Theorem 1.2 and show that essentially the same proof with
small modifications proves the following statement:
For any given finite number $p_1, \ldots, p_n \in \mathbb{P}$ of prime numbers with $n \geq 1$ there
exists a prime number $p \in \mathbb{P}$ that is not among the numbers $p_1, \ldots, p_n$, i.e. such
that $p \notin \{p_1, \ldots, p_n\}$.
Show that this proof is easily arranged such that it does not use any indirect reasoning!
In fact, this shows that Euclid's Theorem is a constructive theorem and the proof actually
contains an algorithm for computing a further prime number $p$, given any finite number
$p_1, \ldots, p_n$ of prime numbers.
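One possible reading of the algorithm contained in Euclid's proof is sketched below in Python: form $k = p_1 p_2 \cdots p_n + 1$ and return the smallest divisor of $k$ greater than $1$, which is prime by Lemma 1.4. The function names `smallest_prime_divisor` and `further_prime` are hypothetical, and the sketch assumes that the input list really consists of prime numbers.

```python
from math import prod


def smallest_prime_divisor(k: int) -> int:
    """Smallest divisor d > 1 of k; by Lemma 1.4 this divisor is prime."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    d = 2
    while k % d != 0:
        d += 1
    return d


def further_prime(primes: list[int]) -> int:
    """Given primes p_1, ..., p_n with n >= 1, return a prime not among them."""
    if not primes:
        raise ValueError("at least one prime number is required")
    k = prod(primes) + 1  # k leaves remainder 1 when divided by each p_i
    return smallest_prime_divisor(k)


print(further_prime([2, 3, 5, 7, 11, 13]))
```

For the primes 2, 3, 5, 7, 11, 13 one obtains $k = 30031 = 59 \cdot 509$, so $k$ itself need not be prime; the new prime is a divisor of $k$, here 59.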
Bibliographic Remarks
We close this chapter with some bibliographic remarks on useful books. There exists a
huge number of text books which can be used together with this course. Most of them
complement the course in one or the other way. We just mention a few of them.
[1] Martin Aigner and Günter M. Ziegler, Proofs from THE BOOK, 4th edition, Springer,
Berlin, 2009.
[2] Ethan D. Bloch, Proofs and Fundamentals, A First Course in Abstract Mathematics,
Birkhäuser, Boston, 2000.
[3] Mariana Cook, Mathematicians: An Outer View of the Inner World, Princeton
University Press, 2009.
[4] Philip J. Davis and Reuben Hersh, The Mathematical Experience, Birkhäuser, Boston,
1981.
[5] Timothy Gowers (editor), The Princeton Companion to Mathematics, Princeton
University Press, 2008.
[6] Paul R. Halmos, Naive Set Theory, Springer, New York, 1974.
[7] Kevin Houston, How to Think Like a Mathematician, A Companion to Undergraduate
Mathematics, Cambridge University Press, 2009.
The content of the first textbook by Aigner and Ziegler goes far beyond this course and
it is basically a collection of the gems of mathematics. It is a good companion throughout
the life of any professional mathematician, who will return to this book in order to
learn some of the most beautiful proofs in mathematics. The book by Cook is not a
textbook, but a collection of more than 90 photographic portraits of mathematicians, which
provides the reader with some authentic insights into what mathematicians think and feel about
their work. The book by Davis and Hersh tries to disclose the nature of
mathematics and the philosophical grounds on which many mathematicians operate. It also raises
questions about the metaphysical status of truth in mathematics and dominant attitudes
of mathematicians in this respect, such as platonism, formalism and constructivism. The
companion edited by Gowers is an encyclopedic introduction to all areas of mathematics.
The content goes far beyond our course, but it is one of the best available such introductions
to mathematics in general. Finally, the textbook by Houston is perhaps the most useful
and affordable companion for the reader of this course.
CHAPTER 2
Sets
No one shall expel us from the paradise that Cantor created for us.
David Hilbert (1862-1943)
2.1 What is a Set?
A set is a Many that allows itself to be thought of as a One.
Georg Cantor (1845-1918)
In the previous chapter we have already seen several examples of sets, among
them the set of natural numbers $\mathbb{N}$ and the set of prime numbers $\mathbb{P}$. Although
mathematics is about rigorous reasoning, we will not present a formal definition
of what a set is here. There is a more rigorous development of set theory, which is
called axiomatic set theory, but this axiomatic approach is too difficult for beginners.
The way we will develop set theory here is called naive set theory, since it is based
on an intuitive understanding of the concept of a set. In other words, although we
want to develop mathematics rigorously, we have to start from somewhere and in
this case the starting point is the naive concept of a set. However, even this naive
concept has a number of features that we can make more precise:
Informal definition of a set
1. A set $S$ is a well-defined collection of mathematical objects $x$.
2. The members $x$ of a set are called elements. If $x$ is an element of the set $S$,
then we write $x \in S$. Otherwise, we write $x \notin S$.
3. Two sets $S_1$ and $S_2$ are equal if and only if they contain exactly the same
elements. If $S_1$ and $S_2$ are equal, then we write $S_1 = S_2$. Otherwise, we write
$S_1 \neq S_2$.
There is a particular set $\emptyset$, which is called the empty set and it does not contain
any elements. Sometimes one writes $\emptyset = \{\}$. We give a few further examples of sets
that are commonly used. The sets $\mathbb{N}$ and $\mathbb{P}$ have already been mentioned and used
in the previous chapter.
Some useful sets of numbers
1. $\mathbb{N} := \{0, 1, 2, \ldots\}$, the set of natural numbers,
2. $\mathbb{N}_n := \{1, \ldots, n\}$, the set of natural numbers from $1$ to $n$ for $n \in \mathbb{N}$,
3. $\mathbb{P}$, the set of prime numbers,
4. $\mathbb{Z} := \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, the set of integers,
5. $\mathbb{Q}$, the set of rational numbers,
6. $\mathbb{A}$, the set of algebraic numbers,
7. $\mathbb{R}$, the set of real numbers,
8. $\mathbb{C}$, the set of complex numbers.
We do not define the integers, rational numbers, algebraic numbers, real numbers
and complex numbers precisely here. We assume that the reader has seen these sets
of numbers before and we leave a precise treatment to a later stage. We just want
to name some commonly used sets here in order to have some examples. We close
by emphasizing two important properties that members in a set do not have. They
have neither a position nor multiple appearances.
Multiplicity and Order
1. Multiplicity of an object x in a set S is not considered. Either x is an element
of S or not. No element can have multiple instances within a set.
2. Order of elements in a set does not play any role. That is, a member of a set
has no particular position within the set.
Later we will see the concept of an indexed family, where the position of elements
plays a role and one and the same element can also appear several times in different
positions. In certain special areas of mathematics also multisets are considered,
which are sets where multiplicity occurs, although order plays no role. Such multisets
have already been considered by Dedekind, but we will not use them here.
2.2 Explicit Definitions of Sets
I remember once going to see him [Ramanujan] when he was ill at Putney. I had
ridden in taxi cab number 1729 and remarked that the number seemed to me rather
a dull one, and that I hoped it was not an unfavourable omen. "No," he replied, "it
is a very interesting number; it is the smallest number expressible as the sum of two
cubes in two different ways."
Godfrey Harold Hardy (1877-1947)
In general, the curly brackets "{" and "}" are used to specify a particular set.
This can happen in at least two different ways, either by listing the elements
explicitly or by comprehension. Only finite sets can be specified by listing their elements
explicitly.¹ For instance
\[ \{2, 7, 2010, 4, 2\} \]
is a finite set with 4 elements. The particular listing of elements used to specify
the set names the elements in a particular order and some elements might even be
repeated in this list. Nevertheless, neither the order nor the repetition matters for
the set that is defined in this way, as pointed out before. So,
\[ \{2, 7, 2010, 4, 2\} = \{2010, 7, 4, 2\}. \]
This is simply because we agreed that two sets are equal if and only if they contain
exactly the same elements. That is, order and multiplicity are features of the list that
specifies the set, but not properties of the set itself. Also the naming of elements
can happen in very different ways. For instance, the following two singletons are
identical:
\[ \{1729\} = \{\text{the smallest number expressible as the sum of two cubes in two different ways}\}. \]
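Both equalities can be illustrated with Python's built-in sets, which also ignore order and repetition. The following sketch compares the two listings and then searches for the smallest number that is a sum of two cubes in two different ways. The function name `smallest_two_way_cube_sum` and the search bound 12 are assumptions of this sketch, as is the reading that the cubes are cubes of positive natural numbers. The bound suffices because $13^3 > 1729$, so every representation of a number up to 1729 uses only bases from 1 to 12.

```python
# Order and repetition in the listing do not matter for the set itself.
listed = {2, 7, 2010, 4, 2}
print(listed == {2010, 7, 4, 2}, len(listed))


def smallest_two_way_cube_sum(limit: int) -> int:
    """Smallest number a^3 + b^3 with 1 <= a <= b <= limit that arises
    from at least two different pairs (a, b)."""
    counts: dict[int, int] = {}
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            s = a**3 + b**3
            counts[s] = counts.get(s, 0) + 1
    # min() raises ValueError if no such number exists within the bound.
    return min(s for s, c in counts.items() if c >= 2)


# The two singletons from the text contain the same element.
print({1729} == {smallest_two_way_cube_sum(12)})
```

The two representations found are $1729 = 1^3 + 12^3 = 9^3 + 10^3$.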
A singleton is a set with exactly one element. In this case it is not so easy to
recognize that the two sets are actually identical. This requires some knowledge
about cubes and numbers and also some agreements. For instance, there is an
implicit agreement that the number 1729 is understood as a base 10 expansion (this
is usually the case if not mentioned otherwise). The text in the set on the
right-hand side is not read as a sequence of symbols, but as a mathematical definition
of the uniquely identified number, which is an element of this set. However, such
definitions are only acceptable if they have some clear mathematical interpretation.
See Problem 2.1 for a problematic example. Sometimes even concretely specified
sets are not so easy to understand. Here is an example.
Example 2.1 Let $T = \{\text{the largest pair of twin primes } (p, q)\}$. If the Twin Prime
Conjecture 1.5 is correct, then there is no largest pair $(p, q)$ of twin primes and
consequently $T = \emptyset$. If the Twin Prime Conjecture is false, then there is a largest
pair $(p, q)$ of twin primes and $T = \{(p, q)\}$ is the singleton with this pair as its only
member. That is, $T = \emptyset$ if and only if the Twin Prime Conjecture is correct and
since we do not know yet whether this conjecture is correct, we do not know whether
$T = \emptyset$ or not.
Footnote 1: We made only one exception: we also defined the set of natural numbers
$\mathbb{N} = \{0, 1, 2, 3, ...\}$ by indicating an infinite list of elements.
Also note that the pair $(p, q)$, if existent, is a single object in the set $T$, although
it itself contains two components. The set $T$ above is mathematically well-defined
although we cannot say whether it is empty or not. Hence, the limitation just
concerns our current knowledge, not the mathematical well-definedness of the set $T$.
However, other sets are not well-defined. For instance the set
$$S = \{\text{the most beautiful natural number}\}$$
is not well-defined, as long as we do not provide any precise mathematical definition
for a "most beautiful natural number". Perhaps some people would argue that
$S = \{1729\}$, but as long as "beautiful" is not specified, the definition does not make
any sense. In contrast to that, we want to agree in this course that a set like
$$S = \{\text{the largest natural number}\}$$
is well-defined, since the property "the largest natural number" used to specify the
set has a clear mathematical interpretation. However, this set $S$ is empty, since
there is no largest natural number.
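The two observations of this section, that order and repetition in a listing do not matter and that {1729} can be named in two ways, can be checked with a small sketch. It assumes that "cubes" means cubes of positive integers; the function name and the search bound `limit` are hypothetical choices made only for this illustration.

```python
# Order and repetition in a listing do not change the set that is specified.
assert {2, 7, 2010, 4, 2} == {2010, 7, 4, 2}
assert len({2, 7, 2010, 4, 2}) == 4


def smallest_two_way_cube_sum(limit=20):
    """Smallest number that is a sum of two positive cubes in two different ways.

    Only cubes of 1..limit are searched; the bound is an assumption of this
    sketch and is large enough because 12**3 = 1728 already exceeds the
    relevant range of summands.
    """
    ways = {}
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            ways.setdefault(a ** 3 + b ** 3, []).append((a, b))
    return min(n for n, pairs in ways.items() if len(pairs) >= 2)


# Both descriptions name the same singleton.
assert {1729} == {smallest_two_way_cube_sum()}
```

The dictionary `ways` plays the role of the "knowledge about cubes and numbers" mentioned above: it records every way of writing a number as a sum of two cubes within the search range.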
2.3 Subsets and Comprehension
While the above sets are formed by an explicit listing of their objects, a more common
method is to specify a set by comprehension. Comprehension usually means that a
set is formed by specifying a subset of a given set using some property. An example
that we have already seen is
$$\mathbb{P} := \{p \in \mathbb{N} : p \text{ is a prime number}\}.$$
Here the given set is the set $\mathbb{N}$ of natural numbers and we single out a subset $\mathbb{P}$ of it
by specifying which elements of $\mathbb{N}$ are members of this subset. So, the way to read
the above definition is that $\mathbb{P}$ is defined to be the set of those natural numbers $p \in \mathbb{N}$
that have the property that they are prime. Some authors also write this set as
$$\{p \in \mathbb{N} \mid p \text{ is a prime number}\},$$
i.e. with a "|" instead of a ":". In both cases ":" and "|" are read as "such that".
The symbol ":=" is read as "is defined to be equal". Let us now more formally
capture what a subset is in general.
Definition 2.2 (Subset) Let $S$ be a set. We say that $T$ is a subset of $S$ if all
elements of $T$ are also elements of $S$. If $T$ is a subset of $S$, then we write $T \subseteq S$.
Otherwise, we write $T \not\subseteq S$.
Please note that some authors also write $T \subset S$ instead of $T \subseteq S$. However, the
former notation is somewhat ambiguous because it does not make clear whether the
sets $T$ and $S$ are also allowed to be equal. We define
$$T \subsetneq S :\iff (T \subseteq S \text{ and } T \neq S).$$
That is, $\subsetneq$ stands for "subset, but not equal". The subset symbols can also be
used the other way around. For instance, $S \supseteq T$ means that $T$ is a subset of $S$.
Sometimes, $S$ is also called a superset of $T$ in this situation.
[Figure 2.1: A subset $T \subseteq S$]
The diagram in Figure 2.1 illustrates a subset $T \subseteq S$. This figure is a so-called
Venn diagram. The value of such diagrams is limited, since they illustrate sets as if
they are subsets of the two-dimensional plane. This can lead to wrong conclusions
and perceptions. Hence, one should use such diagrams only for inspiration and
any formal proof has to be based rigorously on the original definitions. Here is an
example of some sets and their corresponding subset relations.
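Example 2.3 (Subsets) All the following statements hold true (try to prove them!):
1. $\{2, 3, 5\} \subseteq \mathbb{P} \subseteq \mathbb{N}$,
2. $\{2, 3, 5\} \subsetneq \mathbb{P}$,
3. $\{1, 2, 3\} \not\subseteq \mathbb{P}$,
4. $\mathbb{P} \not\subseteq \{1, 2, 3\}$.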
The last two examples show that for two given sets $S$ and $T$ neither $S \subseteq T$ nor
$T \subseteq S$ needs to hold. However, both inclusions can also hold simultaneously. Our
first result says when exactly this happens. Although it is a very simple result, we
work out the proof in detail.
Proposition 2.4 (Equality) Let $S$ and $T$ be sets. Then
$$S = T \iff (S \subseteq T \text{ and } T \subseteq S).$$
Proof. "$\Longrightarrow$": We start with the assumption that $S = T$. By definition this means
that $S$ and $T$ have exactly the same elements. In particular, all elements of $S$ are
elements of $T$, i.e. $S \subseteq T$, and all elements of $T$ are also in $S$, i.e. $T \subseteq S$.
"$\Longleftarrow$": Now we assume that $S \subseteq T$ and $T \subseteq S$. By definition this means that all
elements that are in $S$ are also in $T$ and all elements that are in $T$ are also in $S$.
Thus, $S$ and $T$ have to have exactly the same elements and hence $S = T$. $\Box$
Although this is an extremely simple proof, it helps to illustrate a number of
important proof techniques. If we want to prove an equivalence $A \iff B$
of two statements $A$ and $B$, then we have to prove two implications $A \Longrightarrow B$
and $B \Longrightarrow A$ because these two implications together comprise the meaning of
$A \iff B$. The way to read "$\iff$" is as "if and only if" or "is equivalent to".
Similarly, Proposition 2.4 tells us how to prove the equality of two sets $S$ and $T$,
namely by showing that $S \subseteq T$ and $T \subseteq S$. This is important enough to be captured
again.
Proof Methods (Equivalence of Statements and Equality of Sets)
1. Equivalence: An equivalence of statements like $A \iff B$ is usually proved
by showing $A \Longrightarrow B$ and $B \Longrightarrow A$ separately.
2. Equality: A set equality like $S = T$ is usually proved by showing $S \subseteq T$
and $T \subseteq S$ separately.
There is another related terminology in mathematics. This is the terminology of
sufficient and necessary conditions.
Terminology (Sufficient and necessary conditions)
If $A$ and $B$ are two mathematical statements such that $A \Longrightarrow B$ holds, then
1. $A$ is called a sufficient condition for $B$ and
2. $B$ is called a necessary condition for $A$.
That is, the condition "$S \subseteq T$ and $T \subseteq S$" is necessary and sufficient for the two
sets $S$ and $T$ to be equal.
It is important to note that there is a correspondence between subsets and the
way sets can be defined by comprehension using some properties. Let us assume
that $S$ is a set and $P$ is a property that can hold for elements $x$ of $S$ or not. We
write $P(x)$ in order to indicate that property $P$ holds for $x$. Then
$$T := \{x \in S : P(x)\}$$
defines a subset of $S$ with exactly those elements of $S$ that satisfy property $P$. On
the other hand, if $T \subseteq S$, then by
$$P(x) :\iff x \in T$$
we can define a property $P$ for all elements $x \in S$ that holds if and only if $x \in T$.
Since we can move from properties to subsets by comprehension and backwards
from subsets to the respective properties, we can somehow identify subsets and
properties. These are essentially the same things! For example, we have used the
property of being a prime number in order to define the set $\mathbb{P}$ of prime numbers by
comprehension. And hence "being prime" is equivalent to "being a member of $\mathbb{P}$".
This correspondence between subsets formed by comprehension and properties
will play a crucial role in this course, since, as we will see, there is a close relationship
between set theory and logic and this is the first indication of this relationship.
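The passage from a property to a subset and back can be sketched in Python, where a set comprehension plays the role of comprehension and a membership test plays the role of the recovered property. The finite set `S` below is only a stand-in for N (an assumption of this sketch, since a program cannot enumerate all natural numbers), and the names `is_prime`, `T` and `P` are chosen for illustration.

```python
def is_prime(n):
    """Trial-division primality test for natural numbers."""
    return n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


S = set(range(50))                  # finite stand-in for N (assumption)

# From a property to a subset: T := {x in S : is_prime(x)}.
T = {x for x in S if is_prime(x)}


def P(x):
    """From a subset back to a property: P(x) holds if and only if x is in T."""
    return x in T


assert T <= S                                   # T is a subset of S
assert all(P(x) == is_prime(x) for x in S)      # both properties agree on S
```

On the elements of `S`, the property `P` recovered from the subset is indistinguishable from the property `is_prime` that was used to form it.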
Now let us look at further examples of subsets. It turns out that the empty set
is a subset of any set.
Proposition 2.5 For any set $S$ we have $\emptyset \subseteq S$.
Proof. In order to show $\emptyset \subseteq S$ we have to show that all elements of $\emptyset$ are also in
$S$. Since by definition there are no elements in $\emptyset$, we have nothing to show and the
statement is proved. $\Box$
Some readers might not like this proof, since it seems to show nothing. Perhaps
an indirect proof is even more instructive in this case. So, let us formulate an indirect
proof for this result.
Proof. Let us assume that $\emptyset \not\subseteq S$. If $\emptyset \subseteq S$ does not hold, then by definition
there must be an element $x \in \emptyset$ which is not in $S$. But this is a contradiction, because
there is no element in $\emptyset$ at all. Hence, the assumption was wrong and $\emptyset \subseteq S$. $\Box$
So, what is the logical principle that corresponds to the statement that $\emptyset \subseteq S$
holds for any set $S$? It is the so-called principle of explosion that $\bot \Longrightarrow A$ holds
for any statement $A$ or in other words "from a false statement one can conclude
everything". This principle illustrates why it is so important that mathematics is
consistent! As soon as there is some inconsistency, i.e. some contradiction or some
false statement that could be derived in mathematics, one could conclude everything
and hence all such results would be useless.
It is important to point out that our naive approach to set theory is based on
untyped sets. That is, in order to define a set, we do not have to specify its type, i.e.
we do not have to name a superset $S$ first. In particular, there is only one empty
set $\emptyset$ and not one for each type $S$.
So, what is the property that corresponds to the empty set? The answer is
that this is the property $\bot$ (falsum) that we have seen before. It is the property
that does not hold. For an arbitrary mathematical object $x$ we could also define
$\bot :\iff (x \neq x)$. Since the property $x \neq x$ is always wrong, no matter what $x$
is, $\bot$ is just a property that corresponds to "is not true". We obtain the following.
Proposition 2.6 (Empty set) For any set $S$ we have
$$\emptyset = \{x \in S : x \neq x\} = \{x \in S : \bot\}.$$
Proof. Let $S$ be some set and $x \in S$. Since $\bot \iff (x \neq x)$, it is clear that
$$\{x \in S : x \neq x\} = \{x \in S : \bot\}.$$
We have to show $\emptyset = \{x \in S : x \neq x\}$. By Proposition 2.4 it suffices to show
$\emptyset \subseteq \{x \in S : x \neq x\}$ and $\{x \in S : x \neq x\} \subseteq \emptyset$.
The first statement $\emptyset \subseteq \{x \in S : x \neq x\}$ holds by Proposition 2.5. For the second
statement, let $x$ be an element in $\{x \in S : x \neq x\}$. This element satisfies $x \neq x$,
which is impossible. Hence such an element does not exist and we have nothing to
show. $\Box$
Here is an example of a set that we have considered before.
Example 2.7 The set $S = \{\text{the largest natural number}\}$ can be defined more precisely
by comprehension as follows:
$$S = \{n \in \mathbb{N} : n \text{ is the largest natural number}\}.$$
The property $P(n)$ which is equivalent to "$n$ is the largest natural number" is a
well-defined property and for each natural number $n \in \mathbb{N}$ it is clear what this is supposed
to mean. However, this property does not hold true for any natural number $n \in \mathbb{N}$
(since there is no largest one). Hence, $S$ is the empty set by the previous proposition.
In contrast to that, the set
$$S = \{n \in \mathbb{N} : n \text{ is the most beautiful natural number}\}$$
is not well-defined, since it is not even clear for a fixed number $n \in \mathbb{N}$ what it is
supposed to mean that this is the most beautiful number.
Problems
2.1 Discuss the definition of the following set:
$$S = \{\text{the smallest natural number which cannot be defined with less than 100 symbols}\}.$$
Is this set $S$ well-defined? If yes, can we determine the member of this set? If not, why not?
2.2 Prove that for any three sets $R$, $S$, $T$ the following statements hold true:
1. $S \subseteq S$ (reflexivity)
2. $R \subseteq S$ and $S \subseteq T$ implies $R \subseteq T$ (transitivity)
2.3 Find out which of the following statements are correct!
1. ,
2. ,
3. ,
4. N.
2.4 Russell's Paradox
In formal logic, a contradiction is the signal of defeat,
but in the evolution of real knowledge it marks the
first step in progress toward a victory.
Alfred North Whitehead (1861-1947)
We have already seen that there are some limitations on how a set can be listed
explicitly. Namely the listing has to be mathematically well-defined. Now we will see
another type of limitation that shows why comprehension has to be used carefully
as well. It was already recognised in the early years of set theory that there are sets
that lead to serious problems. One such construction is called Russell's paradox and
we present it as an example here.
Example 2.8 (Russell's paradox 1901) We consider the set
$$S = \{X : X \notin X\}$$
of all sets $X$ that do not contain themselves as an element. Now the question is whether
$S$ is an element of itself. On the one hand, if $S \in S$, then $S$, by definition, has the
property that $S \notin S$. This is a contradiction! On the other hand, if $S \notin S$, then $S$,
by definition, has the property that $S \in S$. This is also a contradiction! Altogether,
we obtain
$$S \in S \iff S \notin S.$$
This statement is clearly not correct, hence the existence of the set $S$ leads to a
contradiction!
Cantor had discovered similar antinomies, but he did not publish them. The
way out of the problem of Russell's paradox is just to declare the formation of sets
such as $S$ as illegal. We have to apply some restrictions with regard to which sets
we can actually build. The discovery of this paradox has led to the development of
axiomatic set theory, a discipline which explains in great formal detail which sets
can be formed and which sets cannot be formed. Essentially, the situation is as
follows.
Admissible constructions of sets
1. We have an empty set and an infinite set such as the set $\mathbb{N}$ of natural numbers.
2. We can form finite sets by explicit specification of their elements.
3. We can form subsets of already constructed sets by comprehension (using some
property that characterizes the elements in the subset).
4. We can apply certain well-defined operations to sets in order to form new sets
out of given sets. These operations are the union of sets and the power set
construction.
We will specify in subsequent sections what union and power set construction
exactly mean. The essential point is that the set in Russell's paradox has not been
built by any of these admissible tools. The condition $X \notin X$ could be considered as
a property, but in order to use comprehension to form the set $S$ of Russell's paradox
one would first need the set $U$ of all sets $X$ and this set does not exist (something
that we will prove below exactly with the argument of Russell's paradox).
In some approaches to axiomatic set theory, the collection $U$ of all sets is considered
as a proper class, which is something like a set of second order. Cantor already
considered such classes as the way to avoid antinomies. Then the class $S$ of all sets
which are not members of themselves can be formed, but the question of whether
$S$ is a member of itself does not make any sense, since $S$ contains only sets and not
proper classes. Outside of set theory, the term class is sometimes also used as a
synonym for sets.
One should note that a set can very well be a member of another set. For instance
the set $\{\emptyset, \mathbb{N}\}$ is a set with exactly two elements: the empty set $\emptyset$ and the set of
natural numbers $\mathbb{N}$. And in fact, Russell's paradox can be turned into a useful proof
of the fact that there is no set that contains all sets.
Proposition 2.9 (No universal set) There is no set $U$ that contains all sets $S$.
Proof. Let $U$ be some set. Now we consider the set
$$S := \{X \in U : X \notin X\}$$
of all sets in $U$ that are not members of themselves. This set $S$ is well-defined by
comprehension. Now, the assumption $S \in S$ implies $S \notin S$. This is a contradiction.
Hence $S \notin S$. But this implies $S \notin U$, since $S \in U$ together with $S \notin S$ would yield
$S \in S$ by the definition of $S$. So, no matter how we choose the set $U$ to
start with, we can always construct a set $S$ that is not a member of $U$. Hence no
set $U$ can contain all sets $S$. $\Box$
Despite other claims, this result was already known and proved by Cantor in
1899 before Russell presented his paradox. However, Cantor did not publish his
result, but he only reported it in letters to David Hilbert and Richard Dedekind.
However, Cantor's original proof was different from the proof presented here and we
will come back to his proof at a later stage (see Problem 5.5).
In computability theory, a branch of mathematical logic, one can use the idea
of Russell's paradox also in a constructive way to define sets such as the halting
problem or the self-applicability problem, which have been studied, for instance, by
Alan Turing and Kurt Gödel. These sets exhibit some interesting behaviour and
they play a crucial role in computability theory. It is not unusual in mathematics
that some paradox or contradiction has eventually been turned into a useful result.
In relation to Russell's paradox one can ask the question whether there can be
any set $S$ with $S \in S$ at all. Indeed, the axioms of formal set theory do not allow
the construction of such a set $S$. A set $S$ is called well-founded if an infinite chain
$S_0, S_1, S_2, S_3, ...$ of sets with
$$... \in S_3 \in S_2 \in S_1 \in S_0 \in S$$
is impossible. If $S$ is a set with $S \in S$, then we obtain an infinite chain
$$... \in S \in S \in S \in S \in S$$
and hence $S$ is not well-founded. There is a particular axiom in formal set theory
which ensures that any set is well-founded. Hence, a set $S$ with $S \in S$ does not
exist. This shows that in axiomatic set theory the set $S$ that we have considered in
the proof of Proposition 2.9 is actually identical to $U$.
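The construction from the proof of Proposition 2.9 can be tried out on small examples. The following sketch models hereditarily finite sets as Python `frozenset` objects, which is an assumption of this illustration and not part of the notes; such objects are always well-founded, so the sketch also shows the remark that $S$ coincides with $U$. The function name `russell_subset` is hypothetical.

```python
def russell_subset(U):
    """Return S = {X in U : X not in X} for a frozenset U of frozensets."""
    return frozenset(X for X in U if X not in X)


empty = frozenset()                      # the empty set
one = frozenset({empty})                 # {emptyset}
two = frozenset({empty, one})            # {emptyset, {emptyset}}

U = frozenset({empty, one, two})
S = russell_subset(U)

assert S not in U    # S escapes U, as in the proof of Proposition 2.9
assert S == U        # no frozenset contains itself, so S is all of U
```

Whatever finite collection `U` of such sets is chosen, the same two assertions hold, which mirrors the statement that no set can contain all sets.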
Nowadays variants of non-well-founded set theory are studied, which have
found interesting applications in the study of non-terminating processes, in linguistics
and also in a branch of mathematics called non-standard analysis.
There is a little variant of comprehension, which is slightly more general, but
which we want to subsume under comprehension, and this variant is called replacement.
We illustrate this with an example.
Example 2.10 (Multiples) For any natural number $k \in \mathbb{N}$ we define the set $k\mathbb{N}$
of numbers which are multiples of $k$ by
$$k\mathbb{N} := \{n \in \mathbb{N} : (\exists m \in \mathbb{N})\ n = km\} = \{n \in \mathbb{N} : k \mid n\}.$$
This definition by comprehension is sometimes also written as follows:
$$k\mathbb{N} = \{km \in \mathbb{N} : m \in \mathbb{N}\}.$$
Here all those values $km$ are members of this set, for which $m \in \mathbb{N}$. Such a definition
is called a definition by replacement.
For the purposes of this course we will not formally distinguish between replace-
ment and comprehension.
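That both forms describe the same set can be checked on a finite initial segment of the natural numbers. The bound 60 and the function names below are assumptions made only for this sketch; the first function follows the comprehension form with the existential quantifier, the second follows the replacement form.

```python
N = range(60)   # finite stand-in for the natural numbers (assumption)


def multiples_by_comprehension(k):
    """{n in N : there exists m in N with n = k*m}, restricted to the stand-in."""
    return {n for n in N if any(n == k * m for m in N)}


def multiples_by_replacement(k):
    """{k*m : m in N}, keeping only values that lie inside the stand-in."""
    return {k * m for m in N if k * m in N}


# Both definitions agree, including the special case k = 0 with 0N = {0}.
for k in range(8):
    assert multiples_by_comprehension(k) == multiples_by_replacement(k)
```

The comprehension form filters the given set, whereas the replacement form generates the members directly from the values $km$.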
2.5 Union and Intersection of Sets
In this section we want to study operations that allow us to construct new sets from
given sets; in particular, we will look at the union and the intersection of sets.
Definition 2.11 (Union and intersection) Let $X$ and $Y$ be sets.
1. We define the union $X \cup Y$ of $X$ and $Y$ by
$$X \cup Y := \{x : x \in X \text{ or } x \in Y\}.$$
2. We define the intersection $X \cap Y$ of $X$ and $Y$ by
$$X \cap Y := \{x : x \in X \text{ and } x \in Y\}.$$
Thus, the union $X \cup Y$ is the collection of all elements from $X$ and $Y$ together,
i.e. it is the set of all elements which are in $X$ or in $Y$. The intersection $X \cap Y$ is
the collection of all elements that are simultaneously in both sets $X$ and $Y$, i.e. it is
the set of all elements $x$ which are in $X$ and also in $Y$.
We note that the union is formally not a special case of comprehension, since
we do not define $X \cup Y$ as a subset of some given set $U$, but we only create this
set $U$ by forming the union. In contrast to this, intersection can be considered as a
special case of comprehension, since we can prove the following lemma.
Lemma 2.12 For any two sets $X$ and $Y$ we have $X \cap Y = \{x \in X : x \in Y\}$.
The diagram in Figure 2.2 illustrates the intersection $X \cap Y$ and the union $X \cup Y$
of two sets $X$ and $Y$.
[Figure 2.2: The intersection $X \cap Y$ and the union $X \cup Y$]
We give a number of examples of the intersection and union of sets.
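Example 2.13 The following statements hold true (try to prove them!):
1. $\{1, 3, 5\} \cup \{2, 3, 4\} = \{1, 2, 3, 4, 5\}$,
2. $\{1, 3, 5\} \cap \{2, 3, 4\} = \{3\}$,
3. $2\mathbb{N} \cap 3\mathbb{N} = 6\mathbb{N}$,
4. $2\mathbb{N} \cap \mathbb{P} = \{2\}$,
5. $\mathbb{N} \cup \mathbb{P} = \mathbb{N}$,
6. $\mathbb{N} \cap \mathbb{P} = \mathbb{P}$.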
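These statements can be checked experimentally on a finite initial segment of the natural numbers, which is of course not a proof. In the sketch below the bound 100, the helper `multiples` and the variable names are assumptions of the illustration; Python writes `|` for union and `&` for intersection.

```python
BOUND = 100
N = set(range(BOUND))                 # finite stand-in for the natural numbers


def multiples(k):
    """Multiples of k >= 1 inside the finite stand-in."""
    return {n for n in N if n % k == 0}


# Primes below the bound, by trial division.
P = {n for n in N if n >= 2 and all(n % d != 0 for d in range(2, n))}

assert {1, 3, 5} | {2, 3, 4} == {1, 2, 3, 4, 5}
assert {1, 3, 5} & {2, 3, 4} == {3}
assert multiples(2) & multiples(3) == multiples(6)
assert multiples(2) & P == {2}
assert N | P == N
assert N & P == P
```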
In the following proposition we collect a number of useful properties of union
and intersection. In particular, we prove that the union and intersection are both
commutative and associative and, additionally, they are distributive with respect to
each other.
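Proposition 2.14 (Union and intersection) Let $X$, $Y$ and $Z$ be sets. Then the
following properties hold:
1. $X \cap Y \subseteq X$ and $X \subseteq X \cup Y$,
2. $X \cup Y = Y \cup X$ and $X \cap Y = Y \cap X$, (commutativity)
3. $Z \cup (X \cup Y) = (Z \cup X) \cup Y$ and
$Z \cap (X \cap Y) = (Z \cap X) \cap Y$, (associativity)
4. $Z \cup (X \cap Y) = (Z \cup X) \cap (Z \cup Y)$ and
$Z \cap (X \cup Y) = (Z \cap X) \cup (Z \cap Y)$. (distributivity)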
Proof.
1. If $x \in X \cap Y$, then $x \in X$ and $x \in Y$. Hence, in particular, $x \in X$. This proves
the first statement. If $x \in X$, then $x \in X$ or $x \in Y$. Hence, $x \in X \cup Y$. This
proves the second statement.
2. and 3. are left to the reader (see Problem 2.4).
4. We prove only the first equality and leave the second one to the reader (see
Problem 2.4). In order to prove the first equality, we convince ourselves that
$$x \in Z \cup (X \cap Y) \iff x \in Z \text{ or } x \in X \cap Y$$
$$\iff x \in Z \text{ or } (x \in X \text{ and } x \in Y)$$
$$\iff (x \in Z \text{ or } x \in X) \text{ and } (x \in Z \text{ or } x \in Y)$$
$$\iff x \in (Z \cup X) \cap (Z \cup Y).$$
This implies the claim.
$\Box$
In case of the last part of the proof, we have highlighted the logical structure of
the proof, by combining both inclusions in a single equivalence chain. This is not
recommendable as a general approach, but sometimes proofs can be captured in this
way more transparently. In this case one can see clearly how a set theoretical proof
is done using the underlying logical operations that have been used to dene union
and intersection.
The fact that the union and intersection of sets are associative allows us to write
expressions like $X \cup Y \cup Z$ for the union of three sets, since it does not matter how
we add parentheses to this expression. In other words, we have
$$X \cup Y \cup Z := (X \cup Y) \cup Z = X \cup (Y \cup Z) \text{ and}$$
$$X \cap Y \cap Z := (X \cap Y) \cap Z = X \cap (Y \cap Z).$$
An analogous definition holds for the union and intersection of any finite number
of sets in general. Now we prove a result on inclusion and its interaction with union
and intersection.
Proposition 2.15 (Inclusion, union and intersection) Let $X$, $Y$ and $Z$ be sets.
Then the following hold:
1. ($X \subseteq Z$ and $Y \subseteq Z$) if and only if $(X \cup Y) \subseteq Z$.
2. ($Z \subseteq X$ and $Z \subseteq Y$) if and only if $Z \subseteq (X \cap Y)$.
Proof.
1. We prove both directions of the equivalence separately.
"$\Longrightarrow$": Let $X \subseteq Z$ and $Y \subseteq Z$. We have to prove $(X \cup Y) \subseteq Z$. Thus, let
$x \in X \cup Y$. That is, $x \in X$ or $x \in Y$. In the first case, it follows that $x \in Z$
since $X \subseteq Z$ and in the second case it also follows that $x \in Z$ since $Y \subseteq Z$. Thus,
in any case $x \in Z$, which was to be proved.
"$\Longleftarrow$": Now suppose $(X \cup Y) \subseteq Z$. We have to prove $X \subseteq Z$ and $Y \subseteq Z$. Let
$x \in X$. Then $x \in X \cup Y$ and hence $x \in Z$. For the second part, let $x \in Y$.
Then $x \in X \cup Y$ and $x \in Z$ follows, which was to be proved.
2. We leave this proof to the reader (see Problem 2.5).
$\Box$
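Propositions 2.14 and 2.15 can be tested exhaustively on all subsets of a small set. Such a test does not replace the proofs, but it is a quick way to catch a misremembered law. The sketch below assumes a three-element universe; the helper names `subsets` and `laws_hold` are hypothetical.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def laws_hold(universe):
    """Check Propositions 2.14 and 2.15 for all X, Y, Z inside the universe."""
    for X, Y, Z in product(subsets(universe), repeat=3):
        checks = [
            (X & Y) <= X and X <= (X | Y),                       # 2.14 (1)
            (X | Y) == (Y | X) and (X & Y) == (Y & X),           # commutativity
            (Z | (X | Y)) == ((Z | X) | Y),                      # associativity
            (Z & (X & Y)) == ((Z & X) & Y),
            (Z | (X & Y)) == ((Z | X) & (Z | Y)),                # distributivity
            (Z & (X | Y)) == ((Z & X) | (Z & Y)),
            (X <= Z and Y <= Z) == ((X | Y) <= Z),               # 2.15 (1)
            (Z <= X and Z <= Y) == (Z <= (X & Y)),               # 2.15 (2)
        ]
        if not all(checks):
            return False
    return True


assert laws_hold({0, 1, 2})
```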
If two sets have no elements in common, then they are called disjoint.
Definition 2.16 (Disjoint) Let $X$ and $Y$ be sets. If $X \cap Y = \emptyset$, then $X$ and $Y$
are called disjoint.
The diagram in Figure 2.3 illustrates two disjoint sets.
[Figure 2.3: Two disjoint sets $X$ and $Y$]
Problems
2.4 Prove the remaining statements from Proposition 2.14. Let $X$, $Y$ and $Z$ be sets. Then
the following properties hold:
1. $X \cup Y = Y \cup X$ and $X \cap Y = Y \cap X$ (commutativity)
2. $Z \cup (X \cup Y) = (Z \cup X) \cup Y$ and
$Z \cap (X \cap Y) = (Z \cap X) \cap Y$ (associativity)
3. $Z \cap (X \cup Y) = (Z \cap X) \cup (Z \cap Y)$ (distributivity)
2.5 Let $X$, $Y$ and $Z$ be sets. Prove that $Z \subseteq X$ and $Z \subseteq Y$ if and only if $Z \subseteq (X \cap Y)$.
2.6 Let $X$, $Y$ and $Z$ be sets. Prove that $Z \subseteq X$ or $Z \subseteq Y$ implies $Z \subseteq (X \cup Y)$. Give
examples of sets $X$, $Y$ and $Z$ such that the inverse implication does not hold true.
2.7 Prove that for any two distinct prime numbers $p, q \in \mathbb{P}$ and their product $r = pq$ it
holds that
$$r\mathbb{N} = p\mathbb{N} \cap q\mathbb{N}.$$
Is this also true for arbitrary distinct natural numbers $p, q \in \mathbb{N}$?
2.6 Difference and Complement of Sets
In this section we discuss another method to create new sets from given sets, by
considering the difference of sets.
Definition 2.17 (Difference) Let $X$ and $Y$ be sets. We define the difference $X \setminus Y$
of $X$ and $Y$ by
$$X \setminus Y := \{x : x \in X \text{ and } x \notin Y\}.$$
Some authors also write $X - Y$ instead of $X \setminus Y$. Figure 2.4 illustrates the
difference of two sets $X$ and $Y$.
[Figure 2.4: The difference $X \setminus Y$ of two sets $X$ and $Y$]
The following example illustrates some set dierences.
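Example 2.18 The following statements hold true (try to prove them!):
1. $\{1, 3, 5\} \setminus \{2, 3, 4\} = \{1, 5\}$,
2. $2\mathbb{N} \setminus 3\mathbb{N} = 2\mathbb{N} \setminus 6\mathbb{N}$,
3. $2\mathbb{N} \setminus \mathbb{P} = 2\mathbb{N} \setminus \{2\}$,
4. $\mathbb{N} \setminus 2\mathbb{N} = \{2k + 1 : k \in \mathbb{N}\}$.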
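As before, these statements can be checked on a finite initial segment of the natural numbers. The following sketch uses Python's `-` operator for the set difference; the bound 100 and the helper names are assumptions of this illustration.

```python
BOUND = 100
N = set(range(BOUND))                 # finite stand-in for the natural numbers


def multiples(k):
    """Multiples of k >= 1 inside the finite stand-in."""
    return {n for n in N if n % k == 0}


# Primes below the bound, by trial division.
P = {n for n in N if n >= 2 and all(n % d != 0 for d in range(2, n))}

assert {1, 3, 5} - {2, 3, 4} == {1, 5}
assert multiples(2) - multiples(3) == multiples(2) - multiples(6)
assert multiples(2) - P == multiples(2) - {2}
assert N - multiples(2) == {2 * k + 1 for k in range(BOUND // 2)}
```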
In the following proposition we collect some useful basic properties of the set
difference.
Proposition 2.19 (Difference) Let $X$ and $Y$ be sets. Then
1. $X \setminus Y \subseteq X$,
2. $(X \setminus Y) \cap Y = \emptyset$,
3. $(X \setminus Y) \cup Y = X \cup Y$.
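Proof.
1. Let $x \in X \setminus Y$. Then $x \in X$ and $x \notin Y$. In particular, $x \in X$.
2. Let $x \in (X \setminus Y) \cap Y$. Then $x \in X \setminus Y$ and $x \in Y$. But $x \in X \setminus Y$ means $x \in X$
and $x \notin Y$. Hence, $x \in Y$ and $x \notin Y$, which is a contradiction. Thus, there is
no $x \in (X \setminus Y) \cap Y$ and hence $(X \setminus Y) \cap Y = \emptyset$.
3. We prove both inclusions separately.
"$\subseteq$": Let $x \in (X \setminus Y) \cup Y$. Then $x \in X \setminus Y$ or $x \in Y$. This means ($x \in X$ and
$x \notin Y$) or $x \in Y$. Altogether, $x \in X$ or $x \in Y$, i.e. $x \in X \cup Y$.
"$\supseteq$": Let $x \in X \cup Y$. Then $x \in X$ or $x \in Y$. Now we make a case distinction.
Case 1: $x \in Y$. In this case "$x \in X \setminus Y$ or $x \in Y$" is certainly correct.
Case 2: $x \notin Y$. In this case "$x \in X$ and $x \notin Y$" is correct. Hence "$x \in X \setminus Y$ or
$x \in Y$" is correct.
In both cases we obtain $x \in (X \setminus Y) \cup Y$.
$\Box$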
De Morgan's Law captures what happens if we subtract a union or an intersection
of sets from another set. Basically, unions are turned into intersections and
intersections into unions in this case, as expressed more precisely in the following
proposition.
Proposition 2.20 (De Morgan's Laws) Let $X$, $Y$ and $Z$ be sets. Then
1. $Z \setminus (X \cup Y) = (Z \setminus X) \cap (Z \setminus Y)$,
2. $Z \setminus (X \cap Y) = (Z \setminus X) \cup (Z \setminus Y)$.
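Proof. We only prove the first part of the claim.
1. We prove both inclusions separately.
"$\subseteq$": Let $x \in Z \setminus (X \cup Y)$. This means $x \in Z$ and $x \notin (X \cup Y)$. That is, we
do not have $x \in X$ or $x \in Y$, hence $x$ is neither in $X$ nor in $Y$, which means
$x \in Z \setminus X$ and $x \in Z \setminus Y$, i.e. $x \in (Z \setminus X) \cap (Z \setminus Y)$.
"$\supseteq$": Let now $x \in (Z \setminus X) \cap (Z \setminus Y)$. Then $x \in (Z \setminus X)$ and $x \in (Z \setminus Y)$,
which means that $x \in Z$ and $x \notin X$ and $x \notin Y$. The latter implies that it is
not the case that $x \in X$ or $x \in Y$, i.e. $x \notin (X \cup Y)$. Altogether, this entails
$x \in Z \setminus (X \cup Y)$.
2. We leave this part of the proof to the reader (see Problem 2.8).
$\Box$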
Sometimes, if only subsets of a fixed type are considered, the following notation
is used in mathematics.
Definition 2.21 (Complement) We consider subsets $X \subseteq Z$ of a fixed given set
$Z$. Then we denote the complement of $X$ by $X^c := Z \setminus X$.
The reader should note that the notation $X^c$ implicitly refers to the given set
$Z$ and this notation does not make sense if $Z$ has not been specified before. Some
authors also write $\overline{X}$ for the complement of $X$; other notations are used as well. De
Morgan's Laws can be expressed somewhat more neatly in this notation.
Corollary 2.22 (De Morgan's Laws) Let $X$, $Y$ both be subsets of some fixed given
set $Z$. Then the following hold (where all complements are taken with respect to $Z$):
1. $(X \cup Y)^c = X^c \cap Y^c$,
2. $(X \cap Y)^c = X^c \cup Y^c$.
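For a small fixed set $Z$ the corollary can be confirmed by checking all pairs of subsets. In the following sketch the four-element set `Z` and the helper names are assumptions chosen for the illustration; `complement` always refers to this particular `Z`, in line with the remark that the notation depends on the given set.

```python
from itertools import combinations

Z = frozenset(range(4))       # the fixed given set (an arbitrary small choice)


def complement(X):
    """Complement of X with respect to the fixed set Z."""
    return Z - X


SUBSETS = [frozenset(c) for r in range(len(Z) + 1)
           for c in combinations(sorted(Z), r)]


def de_morgan_holds():
    """Check both laws of Corollary 2.22 for all subsets X, Y of Z."""
    return all(
        complement(X | Y) == complement(X) & complement(Y)
        and complement(X & Y) == complement(X) | complement(Y)
        for X in SUBSETS for Y in SUBSETS
    )


assert de_morgan_holds()
```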
The fact that unions and intersections are swapped under complementation
indicates that there is a duality between these two concepts. That means that any
property of unions can be translated into a property of intersections and vice versa,
using complements. The logical rule behind this result is captured in the following
two formulas (where $A$ and $B$ are two propositions):
$$\neg(A \lor B) \iff \neg A \land \neg B,$$
$$\neg(A \land B) \iff \neg A \lor \neg B.$$
Here $\lor$ stands for "or" and $\land$ stands for "and". A complement taken twice just
yields the original set back. We first prove a slightly more general statement.
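Proposition 2.23 (Double difference) For sets $X$, $Y$ and $Z$ we have
1. $X \setminus (Y \setminus Z) = (X \setminus Y) \cup (X \cap Z)$,
2. $(X \setminus Y) \setminus Z = X \setminus (Y \cup Z)$.
Proof. We prove the first statement and leave the second one to the reader (see
Problem 2.11).
1. We obtain the following equivalence chain of statements for all $x$:
$$x \in X \setminus (Y \setminus Z) \iff x \in X \text{ and } x \notin Y \setminus Z$$
$$\iff x \in X \text{ and not}(x \in Y \text{ and } x \notin Z)$$
$$\iff x \in X \text{ and } (x \notin Y \text{ or } x \in Z)$$
$$\iff (x \in X \text{ and } x \notin Y) \text{ or } (x \in X \text{ and } x \in Z)$$
$$\iff x \in (X \setminus Y) \cup (X \cap Z).$$
This means $X \setminus (Y \setminus Z) = (X \setminus Y) \cup (X \cap Z)$.
2. This is left to the reader (see Problem 2.11).
$\Box$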
As a special case we obtain the following result on double complements.
Corollary 2.24 (Double complement) Let $X$ be a subset of some fixed set $Z$.
Then $(X^c)^c = X$, where both complements are understood with respect to $Z$.
The logical rule behind this observation is the double negation law:
$$\neg\neg A \iff A,$$
which holds for all well-defined mathematical statements. The next quite important
proposition is about the contraposition law.
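Proposition 2.25 (Contraposition) Let $X$, $Y$, $Z$ and $W$ be sets. Then the
following holds:
$$X \subseteq Y \text{ and } W \subseteq Z \Longrightarrow (W \setminus Y) \subseteq (Z \setminus X).$$
Proof. Suppose $X \subseteq Y$ and $W \subseteq Z$. We prove $(W \setminus Y) \subseteq (Z \setminus X)$. For this purpose
let $x \in W \setminus Y$. Then $x \in W$ and $x \notin Y$. We obtain $x \in Z$ since $W \subseteq Z$. On the
other hand, we know that $x \in X$ implies $x \in Y$. Hence, $x \notin X$. Altogether, we have
$x \in Z \setminus X$. $\Box$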
It is easy to see that the inverse implication of this proposition does not even
hold true if W = Z (see Problem 2.9). Once again, the contraposition law can be
expressed somewhat neater using the complement notation. Roughly speaking it
says that the inclusion order is inverted by complements.
Corollary 2.26 Let $X$, $Y$ both be subsets of some fixed given set $Z$. Then
$$X \subseteq Y \iff Y^c \subseteq X^c,$$
where the complements are all understood with respect to $Z$.
Convince yourself why we do not just get "$\Longrightarrow$" here but also "$\Longleftarrow$"! The logical
version of the contraposition law is the following:
$$(A \Longrightarrow B) \iff (\neg B \Longrightarrow \neg A).$$
This leads to a common proof method in mathematics that we formulate separately.
Proof Method (Contraposition)
In order to prove $A \Longrightarrow B$ for two well-defined mathematical statements it is sufficient
(and, in fact, logically equivalent) to prove $\neg B \Longrightarrow \neg A$.
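Both versions of the contraposition law can be verified mechanically: the logical version by a truth table with four rows, the set version by checking all pairs of subsets of a small fixed set. The helper names and the four-element set `Z` in this sketch are assumptions made for the illustration.

```python
from itertools import combinations, product


def implies(a, b):
    """Truth value of the implication a => b."""
    return (not a) or b


def contraposition_is_tautology():
    """(A => B) <=> (not B => not A) for all four truth assignments."""
    return all(implies(a, b) == implies(not b, not a)
               for a, b in product([False, True], repeat=2))


Z = frozenset(range(4))       # the fixed given set (an arbitrary small choice)
SUBSETS = [frozenset(c) for r in range(len(Z) + 1)
           for c in combinations(sorted(Z), r)]


def contraposition_for_complements():
    """X <= Y holds exactly when Z - Y <= Z - X, as in Corollary 2.26."""
    return all((X <= Y) == ((Z - Y) <= (Z - X))
               for X in SUBSETS for Y in SUBSETS)


assert contraposition_is_tautology()
assert contraposition_for_complements()
```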
Problems
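2.8 Let $X$, $Y$ and $Z$ be sets. Prove that $Z \setminus (X \cap Y) = (Z \setminus X) \cup (Z \setminus Y)$.
2.9 Find sets $X$, $Y$ and $Z$ such that $(Z \setminus Y) \subseteq (Z \setminus X)$ and $X \not\subseteq Y$.
2.10 Let $X$ and $Y$ be sets with $X \subseteq Y$. Prove that the following statements are pairwise
equivalent to each other:
1. $X \subsetneq Y$,
2. $Y \not\subseteq X$,
3. $Y \setminus X \neq \emptyset$.
2.11 We consider double differences of sets.
1. Prove that $(X \setminus Y) \setminus Z = X \setminus (Y \cup Z)$ for all sets $X$, $Y$ and $Z$.
2. Prove that $(X \setminus Y) \setminus Z \subseteq X \setminus (Y \setminus Z)$ for all sets $X$, $Y$ and $Z$.
3. Show that there are sets $X$, $Y$ and $Z$ such that $(X \setminus Y) \setminus Z \neq X \setminus (Y \setminus Z)$.
2.12 Let $X$ and $Y$ be subsets of a fixed set $Z$. Prove that $X \setminus Y = X \cap Y^c$, where the
complement is taken with respect to $Z$.
2.13 Let $X$ and $Y$ be sets. The symmetric difference $X \Delta Y$ of $X$ and $Y$ is defined by
$$X \Delta Y := (X \setminus Y) \cup (Y \setminus X).$$
Let $X$, $Y$ and $Z$ be sets. Prove that the following holds:
1. $X \Delta \emptyset = X$,
2. $X \Delta X = \emptyset$,
3. $X \Delta Y = Y \Delta X$, (commutative)
4. $X \Delta (Y \Delta Z) = (X \Delta Y) \Delta Z$, (associative)
5. $X \Delta Y = (X \cup Y) \setminus (X \cap Y)$.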
2.7 Union and Intersection of Indexed Families of Sets
One should always generalise.
Carl Jacobi (1804-1851)
Often we want to work with the union and intersection of infinitely many sets and
not just of finitely many sets. Usually this is done by considering indexed families of
sets. If $I$ is a non-empty set and there is a set $X_i$ given for each $i \in I$, then $(X_i)_{i \in I}$
is called an indexed family of sets over $I$. Some authors also write $\{X_i\}_{i \in I}$ for an
indexed family of sets, but this is an unfortunate notation since the curly brackets
"{" and "}" are overloaded in this way with a different meaning, hence we will only
use round brackets in order to denote indexed families. Now we can define union
and intersection for indexed families.
Definition 2.27 (Union and intersection for indexed families) Let $I$ be a non-empty
set and let $(X_i)_{i \in I}$ and $(Y_i)_{i \in I}$ be indexed families of sets over $I$. We define:
1. $\bigcup_{i \in I} X_i := \{x : (\exists i \in I)\ x \in X_i\}$,
2. $\bigcap_{i \in I} X_i := \{x : (\forall i \in I)\ x \in X_i\}$.
Here "$(\exists i \in I)$" is read as "there exists an $i$ in $I$ such that" and "$(\forall i \in I)$" is
read as "for all $i$ in $I$ it holds that". In some sense, the existential quantifier $\exists$ can
actually be read like a big "or" operation $\bigvee$ and the universal quantifier $\forall$ can be
read like a big "and" operation $\bigwedge$. Indeed, some authors write $\bigvee$ instead of $\exists$ and
$\bigwedge$ instead of $\forall$. Having this in mind it is easy to see that the union and intersection
for indexed families of sets actually generalise the union and intersection for two
sets (see Problem 2.14). The reader should also note that intersection could be
considered as a special case of comprehension again, whereas union yields a genuinely
new type of set.
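In the special case that the index set $I = \mathbb{N}$ is the set of natural numbers, one
also uses the following notations:
$$\bigcup_{i=0}^{\infty} X_i := \bigcup_{i \in \mathbb{N}} X_i \quad \text{and} \quad \bigcap_{i=0}^{\infty} X_i := \bigcap_{i \in \mathbb{N}} X_i.$$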
Note that $\infty$ is not considered as a value, but the notation "$i = 0$ to $\infty$" is just
understood as another way of saying $i \in \mathbb{N}$. Similarly, the above notation is used if
the index set is $I = \{n, ..., n + k\}$ for $n, k \in \mathbb{N}$ and then we write
$$\bigcup_{i=n}^{n+k} X_i := \bigcup_{i \in I} X_i \quad \text{and} \quad \bigcap_{i=n}^{n+k} X_i := \bigcap_{i \in I} X_i.$$
Note that these notations can also be typeset in-line like $\bigcup_{i=0}^{\infty} X_i$ and $\bigcap_{i=n}^{n+k} X_i$ with
indexes written at the side. We give some examples of sets formed with union and
intersection over the natural numbers (or a subset of natural numbers).
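Example 2.28 We obtain the following (try to prove all these statements, see also
Problem 2.7):
1. $k\mathbb{N} = \bigcup_{m \in \mathbb{N}} \{km\}$ for all $k \in \mathbb{N}$,
2. $\bigcap_{k \in \mathbb{N}} k\mathbb{N} = \{0\}$,
3. $\bigcup_{k \in \mathbb{N}} k\mathbb{N} = \mathbb{N}$,
4. $\mathbb{P} = \mathbb{N} \setminus (\{1\} \cup \bigcup_{k=2}^{\infty} (k\mathbb{N} \setminus \{k\}))$.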
The first result that we prove on unions and intersections of indexed families of sets concerns the change of the index set. The following proposition shows how this affects the union and intersection, respectively.
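Proposition 2.29 (Variation of index sets) Let $I$ be a non-empty index set with non-empty subsets $J$ and $K$. Let $(X_i)_{i \in I}$ be an indexed family of sets. Then the following hold:
1. $J \subseteq K \Longrightarrow \bigcup_{i \in J} X_i \subseteq \bigcup_{i \in K} X_i$ and $\bigcap_{i \in K} X_i \subseteq \bigcap_{i \in J} X_i$,
2. $\left(\bigcup_{i \in J} X_i\right) \cup \left(\bigcup_{i \in K} X_i\right) = \bigcup_{i \in J \cup K} X_i$,
3. $\left(\bigcap_{i \in J} X_i\right) \cap \left(\bigcap_{i \in K} X_i\right) = \bigcap_{i \in J \cup K} X_i$.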
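Proof.
1. Let $J \subseteq K$. We only prove the second statement. The first part of the statement is left to the reader (see Problem 2.16). To this end, let $x \in \bigcap_{i \in K} X_i$. Then $x \in X_i$ for all $i \in K$. Since $J \subseteq K$, this means that $x \in X_i$, in particular, for all $i \in J$. But this means that $x \in \bigcap_{i \in J} X_i$.
2. We prove this statement by considering both inclusions separately.
"$\subseteq$" Let $x \in \left(\bigcup_{i \in J} X_i\right) \cup \left(\bigcup_{i \in K} X_i\right)$. This means that $x \in \bigcup_{i \in J} X_i$ or $x \in \bigcup_{i \in K} X_i$. Hence, there exists an $i \in J$ such that $x \in X_i$ or there exists an $i \in K$ such that $x \in X_i$. Altogether, this means that there exists an $i \in J \cup K$ such that $x \in X_i$, which means $x \in \bigcup_{i \in J \cup K} X_i$.
"$\supseteq$" Now let $x \in \bigcup_{i \in J \cup K} X_i$. Then there exists $i \in J \cup K$ such that $x \in X_i$. Hence, there exists $i \in J$ such that $x \in X_i$ or there exists $i \in K$ such that $x \in X_i$. This means $x \in \bigcup_{i \in J} X_i$ or $x \in \bigcup_{i \in K} X_i$, i.e. $x \in \left(\bigcup_{i \in J} X_i\right) \cup \left(\bigcup_{i \in K} X_i\right)$.
3. We leave this proof to the reader (see Problem 2.16).
$\Box$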
Next we discuss some basic properties of unions and intersections of indexed families of sets, analogous to those that we have discussed for the union and intersection of two sets.
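Proposition 2.30 (Inclusion, union and intersection) Let $I$ be a non-empty index set and let $(X_i)_{i \in I}$ be an indexed family of sets and let $Y$ be another set. Then the following hold:
1. $(\forall k \in I)\ \bigcap_{i \in I} X_i \subseteq X_k$,
2. $(\forall k \in I)\ X_k \subseteq \bigcup_{i \in I} X_i$,
3. $(\forall k \in I)\ Y \subseteq X_k \iff Y \subseteq \bigcap_{i \in I} X_i$,
4. $(\forall k \in I)\ X_k \subseteq Y \iff \bigcup_{i \in I} X_i \subseteq Y$.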
Proof.
1. Let $k \in I$. If $x \in \bigcap_{i \in I} X_i$, then $x \in X_i$ for all $i \in I$. In particular, $x \in X_k$.
2. We leave this proof to the reader (see Problem 2.17).
3. We prove both directions of the equivalence separately.
"$\Longrightarrow$" Let $Y \subseteq X_k$ for all $k \in I$ and let $x \in Y$. Then $x \in X_k$ for all $k \in I$. Hence $x \in \bigcap_{i \in I} X_i$.
"$\Longleftarrow$" Let now $Y \subseteq \bigcap_{i \in I} X_i$. Fix a $k \in I$ and let $x \in Y$. Then $x \in \bigcap_{i \in I} X_i$ and hence $x \in X_i$ for all $i \in I$. In particular, $x \in X_k$, which was to be proved.
4. We leave this proof to the reader (see Problem 2.17).
$\Box$
Also the distributivity law that we had formulated for union and intersection of
three sets can be generalised to the case of indexed families of sets.
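Proposition 2.31 (Distributivity) Let $(Y_i)_{i \in I}$ be an indexed family of sets over a non-empty index set $I$. Let $X$ be another set. Then the following hold true:
1. $X \cap \left(\bigcup_{i \in I} Y_i\right) = \bigcup_{i \in I} (X \cap Y_i)$,
2. $X \cup \left(\bigcap_{i \in I} Y_i\right) = \bigcap_{i \in I} (X \cup Y_i)$.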
Proof.
1. We prove both inclusions separately.
"$\subseteq$" Let $x \in X \cap \left(\bigcup_{i \in I} Y_i\right)$. Then $x \in X$ and $x \in \bigcup_{i \in I} Y_i$. Hence, there exists an $i \in I$ such that $x \in Y_i$. But this means that there exists $i \in I$ such that $x \in X$ and $x \in Y_i$, i.e. there is an $i \in I$ such that $x \in X \cap Y_i$. Altogether, $x \in \bigcup_{i \in I} (X \cap Y_i)$.
"$\supseteq$" Let $x \in \bigcup_{i \in I} (X \cap Y_i)$. Then there exists $i \in I$ such that $x \in X \cap Y_i$, i.e. such that $x \in X$ and $x \in Y_i$. In particular, $x \in X$ and there exists $i \in I$ such that $x \in Y_i$ and we obtain $x \in X \cap \left(\bigcup_{i \in I} Y_i\right)$.
2. We leave this proof to the reader (see Problem 2.18).
$\Box$
Next we generalise de Morgan's law about unions and intersections in a set difference to the general case of unions and intersections of families of sets.
Proposition 2.32 (Generalised de Morgan's law) Let $I$ be a non-empty index set. Let $X$ be a set and $(Y_i)_{i \in I}$ an indexed family of sets. Then
1. $X \setminus \left(\bigcap_{i \in I} Y_i\right) = \bigcup_{i \in I} (X \setminus Y_i)$,
2. $X \setminus \left(\bigcup_{i \in I} Y_i\right) = \bigcap_{i \in I} (X \setminus Y_i)$.
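Proof.
1. We prove both inclusions separately.
"$\subseteq$" Let $x \in X \setminus \left(\bigcap_{i \in I} Y_i\right)$. Then $x \in X$ and $x \notin \bigcap_{i \in I} Y_i$. The latter means that it is not the case that $x \in Y_i$ for all $i \in I$. In other words, this means that there is an $i \in I$ such that $x \notin Y_i$. Hence there is an $i \in I$ such that $x \in X \setminus Y_i$, which means $x \in \bigcup_{i \in I} (X \setminus Y_i)$.
"$\supseteq$" Let $x \in \bigcup_{i \in I} (X \setminus Y_i)$. Then there is some $i \in I$ such that $x \in X$ and $x \notin Y_i$. Hence, it is not the case that for all $i \in I$ we have $x \in Y_i$ and this means that it is not the case that $x \in \bigcap_{i \in I} Y_i$. Altogether, we obtain $x \in X \setminus \left(\bigcap_{i \in I} Y_i\right)$.
2. We leave this proof to the reader (see Problem 2.18).
$\Box$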
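The two identities of the generalised de Morgan law can be sanity-checked on a small finite family. The sketch below is only an illustration under the assumption that all sets are finite; the concrete sets `X` and `Y` are made up for the example, and a finite check is of course not a proof.

```python
# Hypothetical finite example: X = {0, ..., 9} and a family (Y_i) over I = {1, 2, 3}.
X = set(range(10))
Y = {1: {0, 1, 2, 3}, 2: {2, 3, 4}, 3: {3, 5, 7}}

big_union = set.union(*Y.values())                # union of all Y_i
big_intersection = set.intersection(*Y.values())  # intersection of all Y_i

# 1. X \ (intersection of Y_i) = union of (X \ Y_i)
lhs1 = X - big_intersection
rhs1 = set.union(*(X - Yi for Yi in Y.values()))

# 2. X \ (union of Y_i) = intersection of (X \ Y_i)
lhs2 = X - big_union
rhs2 = set.intersection(*(X - Yi for Yi in Y.values()))

print(lhs1 == rhs1, lhs2 == rhs2)  # True True
print(sorted(lhs2))                # [6, 8, 9]
```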
Once again, there is a way to formulate de Morgan's law for complements.
Corollary 2.33 Let $I$ be a non-empty index set. Let $X$ be a fixed set and let $(Y_i)_{i \in I}$ be a family of subsets $Y_i \subseteq X$. Then we obtain (with all complements taken with respect to $X$):
1. $\left(\bigcap_{i \in I} Y_i\right)^c = \bigcup_{i \in I} Y_i^c$,
2. $\left(\bigcup_{i \in I} Y_i\right)^c = \bigcap_{i \in I} Y_i^c$.
Problems
2.14 Let $X_1$ and $X_2$ be sets. Prove that the following hold:
1. $X_1 \cup X_2 = \bigcup_{i=1}^{2} X_i$,
2. $X_1 \cap X_2 = \bigcap_{i=1}^{2} X_i$.
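2.15 We consider the set $k\mathbb{N} = \{mk \in \mathbb{N} : m \in \mathbb{N}\}$ of all multiples of $k \in \mathbb{N}$. Prove
$$\mathbb{P} = \mathbb{N} \setminus \left(\{1\} \cup \bigcup_{k=2}^{\infty} (k\mathbb{N} \setminus \{k\})\right).$$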
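2.16 Let $I$ be a non-empty index set with non-empty subsets $J$ and $K$. Let $(X_i)_{i \in I}$ be an indexed family of sets. Prove that the following hold:
1. $J \subseteq K \Longrightarrow \bigcup_{i \in J} X_i \subseteq \bigcup_{i \in K} X_i$,
2. $\left(\bigcap_{i \in J} X_i\right) \cap \left(\bigcap_{i \in K} X_i\right) = \bigcap_{i \in J \cup K} X_i$.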
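2.17 Let $I$ be a non-empty index set and let $(X_i)_{i \in I}$ be an indexed family of sets and let $Y$ be another set. Prove that the following hold:
1. $(\forall k \in I)\ X_k \subseteq \bigcup_{i \in I} X_i$,
2. $(\forall k \in I)\ X_k \subseteq Y \iff \bigcup_{i \in I} X_i \subseteq Y$.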
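2.18 Let $I$ be a non-empty index set. Let $X$ be a set and $(Y_i)_{i \in I}$ an indexed family of sets. Prove that:
1. $X \cup \left(\bigcap_{i \in I} Y_i\right) = \bigcap_{i \in I} (X \cup Y_i)$,
2. $X \setminus \left(\bigcup_{i \in I} Y_i\right) = \bigcap_{i \in I} (X \setminus Y_i)$.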
2.19 Let I be a non-empty index set. Let X be a set and (Y
i
)
iN
an indexed family of sets.
Prove that
1.
_
iI
X
i
_
Y =

iI
(X
i
Y ),
2.
_
iI
X
i
_
Y =

iI
(X
i
Y ).
2.8 Power Sets
Besides the method of comprehension we have already seen that the union of sets (or of an indexed family of sets) is a way to define new sets from given sets. Another very important set theoretical construction that cannot be subsumed under comprehension is the power set construction. Given a set $X$ the power set $2^X$ is the set of all subsets of $X$, which is much larger than $X$ itself.
Definition 2.34 (Power set) Let $X$ be a set. Then
$$2^X := \{Y : Y \subseteq X\}$$
is called the power set of $X$.
Some authors also write $\mathcal{P}(X)$ instead of $2^X$. The power set $2^X$ of any set $X$ is always non-empty, since $\emptyset \in 2^X$. We mention some examples.
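Example 2.35 We obtain the following (try to verify these examples!):
1. $2^{\emptyset} = \{\emptyset\}$,
2. $2^{(2^{\emptyset})} = 2^{\{\emptyset\}} = \{\emptyset, \{\emptyset\}\}$,
3. $2^{(2^{(2^{\emptyset})})} = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}\}$,
4. $2^{\{0,1\}} = \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}$,
5. $\emptyset, \mathbb{N}, \mathbb{P}, k\mathbb{N}, \{k\} \in 2^{\mathbb{N}}$ for all $k \in \mathbb{N}$.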
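For finite sets the power set can be computed explicitly, which makes the first examples easy to verify. The following Python sketch is an illustration only; the function name `power_set` is chosen here, and `frozenset` is used so that sets can themselves be elements of sets.

```python
from itertools import combinations


def power_set(X):
    """Return the set of all subsets of the finite set X (as frozensets)."""
    elements = list(X)
    return {
        frozenset(subset)
        for size in range(len(elements) + 1)
        for subset in combinations(elements, size)
    }


empty = frozenset()

# 2^{} has exactly one element, namely the empty set.
print(power_set(empty) == {empty})               # True
# 2^(2^{}) = { {}, {{}} } has two elements.
print(len(power_set(power_set(empty))))          # 2
# 2^{0,1} has the four elements {}, {0}, {1}, {0,1}.
print(len(power_set({0, 1})))                    # 4
```

A set with n elements has 2^n subsets, so this only works for very small inputs.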
The power set $2^{\mathbb{N}}$ of $\mathbb{N}$ is very large and contains many sets. The example only lists a few of those. Later on, we will discuss the size of sets and we will see that the power set $2^X$ of a set $X$ is usually much larger than $X$ itself. We collect some important properties of the power set in the following proposition.
Proposition 2.36 (Power set) Let $X$ and $Y$ be sets. Then the following hold true:
1. $X \subseteq Y \iff 2^X \subseteq 2^Y$, (monotonicity)
2. $2^X \cap 2^Y = 2^{X \cap Y}$,
3. $2^X \cup 2^Y \subseteq 2^{X \cup Y}$.
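Proof.
1. We prove both implications separately.
"$\Longrightarrow$" Let $X \subseteq Y$. We have to prove $2^X \subseteq 2^Y$. Let $A \in 2^X$. This means $A \subseteq X$. Since $X \subseteq Y$, we get $A \subseteq Y$ by transitivity of the inclusion relation (see Problem 2.2). This means $A \in 2^Y$.
"$\Longleftarrow$" Now let $2^X \subseteq 2^Y$. We have to prove $X \subseteq Y$. Let $x \in X$. Then $\{x\} \subseteq X$, i.e. $\{x\} \in 2^X$ and hence $\{x\} \in 2^Y$ since $2^X \subseteq 2^Y$. But this means $\{x\} \subseteq Y$ and hence $x \in Y$.
2. We prove both inclusions separately.
"$\subseteq$" Let $A \in 2^X \cap 2^Y$. Then $A \in 2^X$ and $A \in 2^Y$. That is $A \subseteq X$ and $A \subseteq Y$. By Proposition 2.15 this implies $A \subseteq X \cap Y$, i.e. $A \in 2^{X \cap Y}$.
"$\supseteq$" Let $A \in 2^{X \cap Y}$. Then $A \subseteq X \cap Y$, which implies by Proposition 2.15 that $A \subseteq X$ and $A \subseteq Y$. Hence $A \in 2^X$ and $A \in 2^Y$, i.e. $A \in 2^X \cap 2^Y$.
3. This proof is left to the reader (see Problem 2.20).
$\Box$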
We note that the inverse inclusion of the last statement (3) does not hold true in general (see Problem 2.20). We close this section by introducing another notation that is commonly used in mathematics in order to denote unions and intersections and that is best expressed using the power set.
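Definition 2.37 (Union and intersection for sets of subsets) Let $X$ be a set and let $\mathcal{S} \subseteq 2^X$. Then we define
1. $\bigcup \mathcal{S} := \bigcup_{S \in \mathcal{S}} S$,
2. $\bigcap \mathcal{S} := \bigcap_{S \in \mathcal{S}} S$.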
In order to make this more precise, we consider $\mathcal{S}$ here as an index set. Then we can define an indexed family of sets $(X_S)_{S \in \mathcal{S}}$ where $X_S := S$ and we obtain
$$\bigcup \mathcal{S} = \bigcup_{S \in \mathcal{S}} S = \bigcup_{S \in \mathcal{S}} X_S \quad\text{and}\quad \bigcap \mathcal{S} = \bigcap_{S \in \mathcal{S}} S = \bigcap_{S \in \mathcal{S}} X_S.$$
This is the precise interpretation of the definition above and it shows that this is not a new concept. It is just another way of writing the union and intersection of indexed families of sets in a way that is sometimes more convenient.
Problems
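2.20 Let $X$ and $Y$ be sets. Prove that
1. $2^X \cup 2^Y \subseteq 2^{X \cup Y}$,
2. $2^X \cup 2^Y = 2^{X \cup Y} \iff X \subseteq Y$ or $Y \subseteq X$.
2.21 Let $(X_i)_{i \in I}$ be an indexed family of sets. Prove that
1. $2^{\bigcap_{i \in I} X_i} = \bigcap_{i \in I} 2^{X_i}$,
2. $2^{\bigcup_{i \in I} X_i} \supseteq \bigcup_{i \in I} 2^{X_i}$.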
2.9 Product of Sets
When we discussed the twin prime conjecture, we have already spoken about pairs $(p, q)$ of prime numbers. The essential idea of a pair $(p, q)$ is that it is ordered, i.e. it matters in which position $p$ and $q$ appear, respectively. This distinguishes an ordered pair from a set $\{p, q\}$. We could leave the definition of a pair intuitive, but it is also relatively simple to define a pair more precisely using sets. This idea of formalizing pairs goes back to Kuratowski.
Definition 2.38 (Kuratowski pair) Let $X$ be a set with $x, y \in X$. Then we define the Kuratowski pair, or for short the pair $(x, y)$, as follows:
$$(x, y) := \{\{x\}, \{x, y\}\}.$$
Essentially, this definition of a pair is not really used in practice in mathematics, but only the following property of pairs is of importance. That is, as soon as you have understood the following proposition and its proof, you can forget the previous definition.
Proposition 2.39 (Equality of pairs) Let $X$ be a set with $x_1, x_2, y_1, y_2 \in X$. Then
$$(x_1, y_1) = (x_2, y_2) \iff (x_1 = x_2 \text{ and } y_1 = y_2).$$
Proof. We prove both implications separately.
"$\Longrightarrow$" Let $(x_1, y_1) = (x_2, y_2)$. Then we obtain
$$\{x_1\} = \{x_1\} \cap \{x_1, y_1\} = \bigcap (x_1, y_1) = \bigcap (x_2, y_2) = \{x_2\} \cap \{x_2, y_2\} = \{x_2\},$$
which implies $x_1 = x_2$. Moreover, we obtain
$$\{x_1, y_1\} = \{x_1\} \cup \{x_1, y_1\} = \bigcup (x_1, y_1) = \bigcup (x_2, y_2) = \{x_2\} \cup \{x_2, y_2\} = \{x_2, y_2\}.$$
Now we make a case distinction.
1. Case: $x_1 = y_1$. Then $\{x_1\} = \{x_1, y_1\} = \{x_2, y_2\}$ and hence $y_2 \in \{x_1\}$, i.e. $y_2 = x_1 = y_1$.
2. Case: $x_1 \neq y_1$. Then $\{y_1\} = \{x_1, y_1\} \setminus \{x_1\} = \{x_2, y_2\} \setminus \{x_2\}$ and hence $y_1 \in \{x_2, y_2\} \setminus \{x_2\}$, which implies $y_1 = y_2$.
"$\Longleftarrow$" If $x_1 = x_2$ and $y_1 = y_2$, then obviously the sets $(x_1, y_1) = \{\{x_1\}, \{x_1, y_1\}\}$ and $(x_2, y_2) = \{\{x_2\}, \{x_2, y_2\}\}$ coincide. $\Box$
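The Kuratowski encoding and the recovery of both components used in the proof can be mimicked with nested frozensets. This is a minimal sketch for illustration; the helper names `kpair`, `first` and `second` are introduced here and do not appear in the notes.

```python
def kpair(x, y):
    """Kuratowski pair (x, y) := {{x}, {x, y}} built from frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def first(p):
    """First component: the intersection of all members of p is {x}."""
    (x,) = frozenset.intersection(*p)
    return x


def second(p):
    """Second component, following the case distinction of the proof."""
    union = frozenset.union(*p)                # {x, y}
    intersection = frozenset.intersection(*p)  # {x}
    rest = union - intersection                # {y} if x != y, else empty
    (y,) = rest if rest else intersection
    return y


print(kpair(1, 2) == kpair(2, 1))                  # False: order matters
print({1, 2} == {2, 1})                            # True: sets ignore order
print(first(kpair(1, 2)), second(kpair(1, 2)))     # 1 2
print(kpair(1, 1) == frozenset({frozenset({1})}))  # True: (x, x) = {{x}}
```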
What distinguishes a pair $(x, y)$ from the set $\{x, y\}$ is the fact that the pair is ordered, i.e. the position in which $x$ and $y$ occur matters, whereas this aspect is irrelevant in case of the set $\{x, y\}$. Two sets $\{x_1, y_1\}$ and $\{x_2, y_2\}$ are equal if and only if they contain exactly the same elements, whereas the pairs $(x_1, y_1)$ and $(x_2, y_2)$ are equal if and only if they contain exactly the same elements in exactly the same positions. Now we use pairs in order to define the product of two sets, which is also called the Cartesian product after René Descartes.
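Definition 2.40 (Cartesian product) Let $X$ and $Y$ be sets. Then
$$X \times Y := \{(x, y) : x \in X \text{ and } y \in Y\}$$
is called the Cartesian product or just the product of $X$ and $Y$.
We give a few examples of products of sets.
Example 2.41 The following sets are examples of products:
1. $\mathbb{N} \times \mathbb{N}$ is the set of pairs of natural numbers,
2. $2\mathbb{N} \times (\mathbb{N} \setminus 2\mathbb{N}) = \{(n, k) \in \mathbb{N} \times \mathbb{N} : n \text{ even and } k \text{ odd}\}$,
3. $\{(p, q) \in \mathbb{P} \times \mathbb{P} : q - p = 2\}$ is the set of twin primes.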
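On finite sets these examples can be computed directly. The sketch below is an illustration under the assumption that the natural numbers are cut off at a small bound (10 and 50 are arbitrary choices made here); the names `cartesian` and `is_prime` are hypothetical helpers, and Python tuples stand in for ordered pairs.

```python
from itertools import product


def cartesian(X, Y):
    """Cartesian product of two finite sets as a set of Python tuples."""
    return set(product(X, Y))


def is_prime(p):
    """Naive primality test, sufficient for small numbers."""
    return p >= 2 and all(p % n != 0 for n in range(2, p))


# Truncation of the natural numbers (assumption: bound 10 chosen for the demo).
N10 = set(range(10))
even = {n for n in N10 if n % 2 == 0}

# Example 2: (even numbers) x (odd numbers) = pairs with n even and k odd.
lhs = cartesian(even, N10 - even)
rhs = {(n, k) for (n, k) in cartesian(N10, N10) if n % 2 == 0 and k % 2 == 1}
print(lhs == rhs)  # True

# Example 3: twin primes below 50 as a subset of the product of primes.
primes = {p for p in range(50) if is_prime(p)}
twins = sorted((p, q) for (p, q) in cartesian(primes, primes) if q - p == 2)
print(twins)  # [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
```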
In the following proposition we capture some basic properties of the product of
sets.
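Proposition 2.42 (Products) Let $W$, $X$, $Y$ and $Z$ be sets. Then the following hold:
1. $X \times \emptyset = \emptyset \times X = \emptyset$,
2. $X \subseteq Y$ and $W \subseteq Z \Longrightarrow X \times W \subseteq Y \times Z$, (monotonicity)
3. $X \times (Y \cup Z) = (X \times Y) \cup (X \times Z)$, (distributivity)
4. $X \times (Y \cap Z) = (X \times Y) \cap (X \times Z)$, (distributivity)
5. $(X \times Y) \cap (W \times Z) = (X \cap W) \times (Y \cap Z)$,
6. $(X \times Y) \cup (W \times Z) \subseteq (X \cup W) \times (Y \cup Z)$.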
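Proof.
1., 2. and 3. are left to the reader (see Problem 2.22).
4. We prove both inclusions separately.
"$\subseteq$" Let $(x, y) \in X \times (Y \cap Z)$. Then $x \in X$ and $y \in Y \cap Z$. The latter means $y \in Y$ and $y \in Z$. Hence $(x, y) \in X \times Y$ and $(x, y) \in X \times Z$, which means $(x, y) \in (X \times Y) \cap (X \times Z)$.
"$\supseteq$" Let $a \in (X \times Y) \cap (X \times Z)$. Then $a \in (X \times Y)$ and $a \in (X \times Z)$. This means that there are $x \in X$, $y \in Y$ and $z \in Z$ such that $a = (x, y)$ and $a = (x, z)$. This implies $y = z$. In particular, $y \in Y \cap Z$ and thus $a = (x, y) \in X \times (Y \cap Z)$.
5. We prove both inclusions separately.
"$\subseteq$" Let $a \in (X \times Y) \cap (W \times Z)$. Then $a \in (X \times Y)$ and $a \in (W \times Z)$. Hence there are $x \in X$, $y \in Y$, $w \in W$ and $z \in Z$ such that $a = (x, y)$ and $a = (w, z)$. This implies $x = w$ and $y = z$ and hence $x \in X \cap W$ and $y \in Y \cap Z$. Thus $a = (x, y) \in (X \cap W) \times (Y \cap Z)$.
"$\supseteq$" Let now $(x, y) \in (X \cap W) \times (Y \cap Z)$. Then $x \in X \cap W$ and $y \in Y \cap Z$, i.e. $x \in X$ and $x \in W$ and $y \in Y$ and $y \in Z$. Thus $(x, y) \in (X \times Y) \cap (W \times Z)$.
6. This proof is left to the reader (see Problem 2.22).
$\Box$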
We point out that the inverse inclusion in 6. does not hold true in general (see Problem 2.22). The diagram in Figure 2.5 illustrates the products $X \times Y$ and $W \times Z$ in a coordinate system (this is not a Venn diagram!). The first components of pairs are illustrated on the horizontal axis whereas the second components are illustrated on the vertical axis. One can see why the intersection is a product (= rectangle) itself and why the union is not. However, this does not constitute a formal proof.
Figure 2.5: The product $X \times Y$ and $W \times Z$ in a coordinate system
The definition of pairs can easily be generalised to higher arities. By the arity we mean the number $n$ of components in a tuple $(x_1, x_2, ..., x_n)$. For instance, we could define triples by $(x_1, x_2, x_3) := (x_1, (x_2, x_3))$ and then we could prove that $(x_1, x_2, x_3) = (y_1, y_2, y_3)$ if and only if $x_1 = y_1$ and $x_2 = y_2$ and $x_3 = y_3$. We will not work this out formally here, but we will take an intuitive understanding of $n$-tuples $(x_1, ..., x_n)$ for an $n \in \mathbb{N}$ from now on. That is, we assume
$$(x_1, ..., x_n) = (y_1, ..., y_n) \iff (\forall i \in \{1, ..., n\})\ x_i = y_i.$$
By the way, $n$-tuples are called pairs, triples, quadruples and quintuples for $n = 2, 3, 4$ and $5$, respectively. There is also a tuple $()$ of arity 0, which is sometimes called the empty tuple or empty word. We do not distinguish between tuples of arity 1 and their only component, i.e. $(x) = x$. Using tuples of arbitrary arity $n$ we can now also generalise the Cartesian product to higher arities.
Definition 2.43 (Generalised Cartesian product) Let $X_1, ..., X_n$ be sets with $n \in \mathbb{N}$. Then we define
$$\mathop{\times}_{i=1}^{n} X_i := \{(x_1, x_2, ..., x_n) : (\forall i \in \{1, ..., n\})\ x_i \in X_i\}.$$
Later on, we can even further generalize this definition to products over families of sets, but we first need to define what an infinite (or indexed) tuple is for this purpose and we do not have such a definition at hand yet. An important special case of the previous definition is the situation where all the sets $X_i$ are the same set. In this case we simply write
$$X^n := \mathop{\times}_{i=1}^{n} X = \underbrace{X \times \dots \times X}_{n \text{ times}}$$
and call this the $n$-fold product of the set $X$ with itself. We also allow the special case $n = 0$ here, in which case we obtain a singleton $X^0 = \{()\}$ with the empty tuple.
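The n-fold product, including the special case n = 0, can be illustrated with Python tuples. This is a small sketch for finite sets; the function name `power` is chosen here for the example, and `itertools.product` with `repeat=0` yields exactly one result, the empty tuple.

```python
from itertools import product


def power(X, n):
    """n-fold product X^n of a finite set X as a set of n-tuples."""
    return set(product(X, repeat=n))


bits = {0, 1}
print(power(bits, 0))          # {()}  : the singleton with the empty tuple
print(sorted(power(bits, 2)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(power(bits, 3)))     # 8
```

Note that Python, unlike the convention above, distinguishes the 1-tuple `(x,)` from its only component `x`.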
Problems
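2.22 Let $W$, $X$, $Y$ and $Z$ be sets. Prove that the following hold:
1. $X \times \emptyset = \emptyset \times X = \emptyset$,
2. $X \subseteq Y$ and $W \subseteq Z \Longrightarrow X \times W \subseteq Y \times Z$,
3. $X \times (Y \cup Z) = (X \times Y) \cup (X \times Z)$, (distributivity)
4. $(X \times Y) \cup (W \times Z) \subseteq (X \cup W) \times (Y \cup Z)$.
5. Prove that there are sets $X, Y, Z, W$ such that the inverse inclusion in the previous statement does not hold.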
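2.23 Let $X$ be a set, $I$ a non-empty set and $(Y_i)_{i \in I}$ an indexed family of sets over $I$. Prove that
1. $X \times \left(\bigcup_{i \in I} Y_i\right) = \bigcup_{i \in I} (X \times Y_i)$,
2. $X \times \left(\bigcap_{i \in I} Y_i\right) = \bigcap_{i \in I} (X \times Y_i)$.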
2.10 Disjoint Union of Sets

Sometimes one would like to define a union of two sets $X, Y$ such that one can keep track of which set the elements actually originate from. This is important, in particular, when $X$ and $Y$ have non-empty intersection.
Definition 2.44 (Disjoint union) Let $X$ and $Y$ be sets. Then we define the disjoint union by
$$X \mathbin{\dot{\cup}} Y := (\{1\} \times X) \cup (\{2\} \times Y).$$
Sometimes the disjoint union is also denoted by $X + Y$ or by $X \sqcup Y$ and sometimes it is called discriminated union or tagged union. Here the numbers 1 and 2 in the first component are used like labels that indicate from which set, either $X$ or $Y$, the elements originate. If one takes the ordinary union $X \cup Y$ of two sets $X$ and $Y$ that are not disjoint, i.e. such that $X \cap Y \neq \emptyset$, then the information whether an element of $X \cap Y$ originates from $X$ or $Y$ (or both) is lost in the set $X \cup Y$. Properties of the disjoint union can easily be derived from properties of the ordinary union and the set product and we are not going to study such properties here. We just mention that the disjoint union can be generalized to families of sets.
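The effect of the labels is easy to see on two small overlapping sets. The following Python sketch is only an illustration; `disjoint_union` is a name chosen here, and the tags 1 and 2 follow the definition above.

```python
def disjoint_union(X, Y):
    """Tagged union ({1} x X) u ({2} x Y) of two finite sets."""
    return {(1, x) for x in X} | {(2, y) for y in Y}


X = {0, 1}
Y = {1, 2}

print(sorted(X | Y))                  # [0, 1, 2]: the origin of 1 is lost
print(sorted(disjoint_union(X, Y)))   # [(1, 0), (1, 1), (2, 1), (2, 2)]
print(len(X | Y), len(disjoint_union(X, Y)))  # 3 4
```

The common element 1 appears twice in the disjoint union, once with each label, so the disjoint union always has as many elements as both sets together.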
Definition 2.45 (Disjoint union of a family of sets) Let $(X_i)_{i \in I}$ be an indexed family of sets. Then we define the disjoint union of this family by
$$\dot{\bigcup_{i \in I}} X_i := \bigcup_{i \in I} (\{i\} \times X_i).$$

Sometimes, the disjoint union is also denoted by $\sum_{i \in I} X_i$ or $\coprod_{i \in I} X_i$ or by $\bigsqcup_{i \in I} X_i$. Sometimes one would like to consider sets $X$ that contain tuples $(x_1, ..., x_n)$ of different arities $n \in \mathbb{N}$. One can capture this idea using the operation $*$ on sets that can be expressed as disjoint union.
Definition 2.46 (Sets of finite words) Let $X$ be a set. Then the set $X^*$ of finite words over $X$ is defined by
$$X^* := \dot{\bigcup_{n \in \mathbb{N}}} X^n = \bigcup_{n \in \mathbb{N}} (\{n\} \times X^n).$$
The operation on sets is also called Kleene star operation. Strictly speaking,
any element of X

has the form (i, x


1
, ..., x
i
) for some i N. This includes the case
0 = (0). One usually only writes (x
1
, ..., x
i
) = i (x
1
, ..., x
i
) in this situation with
the understanding that i is dened implicitly by the number of arguments in the
tuple (x
1
, ..., x
i
). In this abbreviated notation on obtains 0 = (). The Kleene star
operation has many applications also in computer science, where it is used to describe
regular languages. We give some examples how it can be used in mathematics.
Example 2.47 Here are some examples.
1. We want to create a set $E \subseteq \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^*$ that contains all decimal expansions of natural numbers without leading zeros. That is
$$E := \{(n_1, ..., n_k) \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^* : k = 1 \text{ or } (k > 1 \text{ and } n_1 \neq 0)\}.$$
2. We want to create a set $D \subseteq \mathbb{N}^*$ that contains chains of numbers that divide each other. That is
$$D := \{(n_1, ..., n_k) \in \mathbb{N}^* : k \geq 2 \text{ and } (\forall i \in \{2, ..., k\})\ n_{i-1} \mid n_i\}.$$
That is, $(2, 4)$ and $(3, 6, 12, 36)$ are examples of elements in $D$.
We have used the simplified way to denote elements of sets of finite words that has been described above.
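Membership in the two sets of this example can be tested mechanically. The sketch below is an illustration, not part of the notes: finite words are modelled as Python tuples in the simplified notation, and the function names `in_E`, `in_D` and `divides` are chosen here. Divisibility is taken to mean that n * m = p for some natural number m, so 0 divides only 0.

```python
def divides(n, p):
    """n | p for natural numbers: there is a natural m with n * m == p."""
    if n == 0:
        return p == 0
    return p % n == 0


def in_E(word):
    """Decimal expansions without leading zeros (words over the digits 0..9)."""
    k = len(word)
    only_digits = all(isinstance(d, int) and 0 <= d <= 9 for d in word)
    return only_digits and (k == 1 or (k > 1 and word[0] != 0))


def in_D(word):
    """Chains of natural numbers in which each entry divides the next one."""
    k = len(word)
    naturals = all(isinstance(n, int) and n >= 0 for n in word)
    return naturals and k >= 2 and all(
        divides(word[i - 1], word[i]) for i in range(1, k)
    )


print(in_E((0,)), in_E((1, 0)), in_E((0, 7)), in_E(()))  # True True False False
print(in_D((2, 4)), in_D((3, 6, 12, 36)), in_D((2, 3)))  # True True False
```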
CHAPTER 3
Logic
Logic is the anatomy of thought.
John Locke (1632–1704)
3.1 What is Logic?
Since ancient times logic was mostly considered as the art of proper and systematic reasoning. Aristotle's work on "analytics" (as he called what we call logic nowadays) was considered for a long time as the major work in logic and for almost 2000 years there was not much progress in this discipline. This changed radically at the end of the 19th century when logic became an active field of research again within mathematics. Nowadays logic is a rich subfield of mathematics that has many sub-disciplines of its own, such as model theory, proof theory and computability theory. There are many applications of particular branches of logic in other disciplines such as computer science and philosophy, but also within algebra, analysis or other mathematical areas. Within mathematics logic is also the major foundational sub-discipline which undertakes a reflection about mathematics with mathematical methods and this is what has been called metamathematics. Gödel's results have spectacularly contributed to the understanding of the limitations of mathematics and perhaps also the scientific method in general and they are part of the jewels that 20th century mathematics has produced.
The purpose of this section is neither to introduce any particular knowledge in logic nor to introduce the subject as a foundational discipline. We will rather take a naive approach to logic (similarly as with set theory) and we will try to highlight the relevance of logic as it is used on a day-to-day basis by any working mathematician. Essentially, we will just look at how we have used logic so far and we will emphasize and collect the rules of logical reasoning that we have already used.
3.2 Propositional Logic
Propositional logic is the part of logic that deals with the logical combination of mathematical propositions without considering any particular mathematical objects.
Informal definition of a proposition
A proposition is a well-defined mathematical statement that is either true or false.
We will typically denote propositions by variables using letters $A, B, C, ...$. The truth values "true" and "false" are sometimes denoted by "t" and "f". We will denote them by 1 (for "true") and 0 (for "false"). If we have two propositions $A$ and $B$, which both are either true or false, then we have altogether $2^2 = 4$ different possibilities of assigning truth values to the pair $(A, B)$ and correspondingly we have $2^4 = 16$ different binary logical operations that we can consider. We will only consider a small subset of these and we will define them in the following definition via a truth table.
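Definition 3.1 (Logical operations) Let $A$ and $B$ be propositions. Then we define the logical operations of negation $\neg A$, conjunction $A \wedge B$, disjunction $A \vee B$, implication $A \Longrightarrow B$ and equivalence $A \iff B$ via the following table of truth values:
$$\begin{array}{cc|ccccc}
A & B & \neg A & A \wedge B & A \vee B & A \Longrightarrow B & A \iff B \\ \hline
0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 & 1
\end{array}$$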
The symbols $\bot$ and $\top$ can be read as constant logical operations with truth values 0 and 1, respectively. The way to read this table is that it tells us what we actually mean when we say "$A$ and $B$ is true": we mean that $A$ and $B$ both have the truth value 1. Similarly, "$A$ or $B$ is true" means that at least one (possibly both) of the propositions $A$ and $B$ has the truth value 1. This should be distinguished from the "exclusive or" that we occasionally mean when we say "or" in our daily language and that excludes the option that both $A$ and $B$ are true. The "exclusive or" operation is sometimes denoted by $A + B$ or $A \oplus B$ and it is true if and only if exactly one of $A$ and $B$ is true. However, we will not further use this operation here and hence we have not included it in the table above. Note that by definition $A \Longrightarrow B$ is considered as true if and only if the statement "if $A$ is true, then $B$ is true" is true. By definition this is always correct if $A$ is not true (no matter what the truth value of $B$ is). That means that implications are closely related to disjunctions and we capture this relation in the following proposition.
But before we do this, we point out that whenever $A$ and $B$ are propositions, then also $\neg A$, $A \wedge B$, $A \vee B$, $A \Longrightarrow B$ and $A \iff B$ are propositions. Such propositions that only involve logical operations and propositional variables are called logical formulas. If we combine several propositions, then we will use parentheses to make the order of combinations clear. In cases of doubt we use the rule that negation binds stronger than any other operation, followed by conjunction, disjunction, implication and equivalence in this order of priority. Some logical propositions have truth values that do not depend on the involved propositional variables.
Definition 3.2 (Tautology) A proposition that involves finitely many propositional variables $A, B, C, ...$ and that has the property that its truth value is always 1, irrespective of the truth values of $A, B, C, ...$, is called a tautology.
We give some examples.
Example 3.3 We consider the following logical formulas.
1. $A \wedge B$ is not a tautology, since it depends on the truth values of $A$ and $B$ whether $A \wedge B$ is true or not.
2. $\neg(A \wedge \neg A)$ is a tautology, since it is always true, irrespective of the truth value of $A$. (Law of Contradiction)
3. $(A \vee \neg A)$ is a tautology. (Principle of Excluded Middle)
4. $\neg\neg A \iff A$ is a tautology. (Double Negation Law)
5. $\top$ is a tautology since it is true and there are no propositional variables involved.
How can we actually find out whether a logical formula is a tautology or not? This can be done with the truth table method that we illustrate in the proof of the next proposition. We collect a number of tautologies that involve implications.
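Proposition 3.4 (Implication) Let $A$, $B$ and $C$ be propositions. Then the following are tautologies:
1. $((A \Longrightarrow B) \wedge A) \Longrightarrow B$ (modus ponens)
2. $(A \Longrightarrow B) \iff (\neg A \vee B)$ (implication and disjunction)
3. $((A \Longrightarrow B) \wedge (B \Longrightarrow C)) \Longrightarrow (A \Longrightarrow C)$ (hypothetical syllogism)
4. $(A \Longrightarrow B) \iff (\neg B \Longrightarrow \neg A)$ (contraposition law)
5. $(A \iff B) \iff ((A \Longrightarrow B) \wedge (B \Longrightarrow A))$ (equivalence)
6. $((A \wedge B) \Longrightarrow C) \iff (A \Longrightarrow (B \Longrightarrow C))$ (currying)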
Proof. We only prove 2. and leave the other proofs to the reader (see Problem 3.1). We prove that the given logical formula is a tautology by systematically writing down its truth table:
$$\begin{array}{cc|cccc}
A & B & \neg A & (\neg A \vee B) & A \Longrightarrow B & (A \Longrightarrow B) \iff (\neg A \vee B) \\ \hline
0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 1
\end{array}$$
The last column of this table indicates the truth value of the entire logical formula $(A \Longrightarrow B) \iff (\neg A \vee B)$ depending on the truth values of $A$ and $B$ in the first two columns. We see that the last column always carries the truth value 1 for "true", irrespective of the truth values of $A$ and $B$. Hence the formula is a tautology. $\Box$
In fact, the tautology 2. is sometimes exploited in order to prove implications and we capture this as a proof method.
Proof Method (Implications as disjunctions)
In order to prove $A \Longrightarrow B$ for two well-defined mathematical statements $A$ and $B$ it is sufficient (and, in fact, logically equivalent) to prove $\neg A \vee B$.
In the previous proposition we have carefully used parentheses to indicate in which order the logical operations are to be applied. Sometimes, parentheses are left out and the operations are ordered in the following priority list:
$$\iff,\ \Longrightarrow,\ \vee,\ \wedge,\ \neg$$
with increasing priority. That is, a logical formula such as
$\neg A \wedge B \Longrightarrow C \vee D$ would be read as $((\neg A) \wedge B) \Longrightarrow (C \vee D)$.
However, in cases of doubt it is better to use more rather than fewer parentheses. In the following result we collect some very common other tautologies.
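Proposition 3.5 (Tautologies) Let $A$, $B$ and $C$ be propositions. Then the following are tautologies:
1. $((A \wedge B) \wedge C) \iff (A \wedge (B \wedge C))$ (associativity)
2. $((A \vee B) \vee C) \iff (A \vee (B \vee C))$
3. $(A \wedge B) \iff (B \wedge A)$ (commutativity)
4. $(A \vee B) \iff (B \vee A)$
5. $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$ (distributivity)
6. $(A \vee (B \wedge C)) \iff ((A \vee B) \wedge (A \vee C))$
7. $\neg(A \wedge B) \iff (\neg A \vee \neg B)$ (de Morgan's laws)
8. $\neg(A \vee B) \iff (\neg A \wedge \neg B)$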
Proof. One can use the truth table method to prove that all these formulas are tautologies. We work number 5. out as an example and leave the rest to the reader (see Problem 3.2). We denote the entire formula $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$ by $F$.
$$\begin{array}{ccc|cccccc}
A & B & C & B \vee C & (A \wedge (B \vee C)) & A \wedge B & A \wedge C & ((A \wedge B) \vee (A \wedge C)) & F \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{array}$$
The last column of this table indicates the truth value of the entire logical formula $F$, i.e. $(A \wedge (B \vee C)) \iff ((A \wedge B) \vee (A \wedge C))$, depending on the truth values of $A$, $B$ and $C$ in the first three columns. We see that the last column always carries the truth value 1 for "true", irrespective of the truth values of $A$, $B$ and $C$. Hence the formula is a tautology. $\Box$
The previous proof shows that checking whether a logical formula is a tautology or not becomes increasingly time consuming as more propositional variables are involved. Roughly speaking, the truth table method requires $2^n$ computational steps (i.e. rows in the table) if the formula involves $n$ propositional variables $A_1, ..., A_n$.
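The truth table method is straightforward to mechanise, which also makes the exponential number of rows visible. The following Python sketch is an illustration under the assumption that a formula is given as a Python function of n Boolean arguments; the names `implies` and `is_tautology` are chosen here.

```python
from itertools import product


def implies(a, b):
    """Truth value of a ==> b, defined as (not a) or b."""
    return (not a) or b


def is_tautology(formula, n):
    """Truth table method: evaluate the formula on all 2**n assignments."""
    return all(formula(*values) for values in product([False, True], repeat=n))


# Modus ponens: ((A ==> B) and A) ==> B
modus_ponens = lambda a, b: implies(implies(a, b) and a, b)
# Distributivity: (A and (B or C)) <==> ((A and B) or (A and C))
distributivity = lambda a, b, c: (a and (b or c)) == ((a and b) or (a and c))

print(is_tautology(modus_ponens, 2))           # True
print(is_tautology(distributivity, 3))         # True
print(is_tautology(lambda a, b: a and b, 2))   # False
print(len(list(product([False, True], repeat=10))))  # 1024 rows for n = 10
```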
This observation is related to one of the challenging big open problems of mathematics, the so-called P-NP problem. Here P stands for the set of problems that can be decided in polynomial time and NP stands for the set of problems that can be verified in polynomial time. We cannot make the definitions of these sets precise here, as this would be the subject of a course on computational complexity theory, but we state that the big open problem is whether these two sets are equal or not. While it can be proved easily that $P \subseteq NP$, it is not known whether the inverse inclusion holds or not. The majority of experts believe that the inverse inclusion does not hold. That is, we have the following conjecture (which has been open more or less since the late 1960s).
Conjecture 3.6 (P-NP Problem) $P \subsetneq NP$.
This problem is among those few big open mathematical problems for which the Clay Mathematics Institute offers one million US dollars to anybody who solves the problem successfully in either way (i.e. by proving or disproving the conjecture and by publishing the result properly). For the proof that indeed $P \subsetneq NP$ holds, it would be sufficient to show that there is no significantly more efficient way to check whether a given formula is a tautology or not than the truth table method discussed above. For the proof that $P = NP$, it would be sufficient to provide a significantly more efficient algorithm (that is, one that does not require $2^n$ many steps, but rather roughly $n^2$, $n^3$ or $n^k$ many steps for some fixed $k \in \mathbb{N}$).
Problems
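3.1 Let $A$, $B$ and $C$ be propositions. Prove that the following logical formulas are tautologies:
1. $((A \Longrightarrow B) \wedge A) \Longrightarrow B$ (modus ponens)
2. $((A \Longrightarrow B) \wedge (B \Longrightarrow C)) \Longrightarrow (A \Longrightarrow C)$ (hypothetical syllogism)
3. $(A \Longrightarrow B) \iff (\neg B \Longrightarrow \neg A)$ (contraposition law)
4. $(A \iff B) \iff ((A \Longrightarrow B) \wedge (B \Longrightarrow A))$ (equivalence)
5. $((A \wedge B) \Longrightarrow C) \iff (A \Longrightarrow (B \Longrightarrow C))$ (currying)
3.2 Let $A$, $B$ and $C$ be propositions. Prove that the following logical formulas are tautologies:
1. $((A \wedge B) \wedge C) \iff (A \wedge (B \wedge C))$ (associativity)
2. $((A \vee B) \vee C) \iff (A \vee (B \vee C))$
3. $(A \wedge B) \iff (B \wedge A)$ (commutativity)
4. $(A \vee B) \iff (B \vee A)$
5. $(A \vee (B \wedge C)) \iff ((A \vee B) \wedge (A \vee C))$ (distributivity)
6. $\neg(A \wedge B) \iff (\neg A \vee \neg B)$ (de Morgan's laws)
7. $\neg(A \vee B) \iff (\neg A \wedge \neg B)$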
3.3 First-Order Logic
Roughly speaking, first-order logic is an extension of logic where one does not only consider mathematical propositions and their truth values, but also such propositions that depend on certain mathematical objects. Besides the ordinary logical operations discussed in the previous section, first-order formulas also involve quantifications over such objects using universal and existential quantifiers. We start with an example.
Example 3.7 The following first-order formula expresses the fact that $p \in \mathbb{N}$ is a prime number:
$$p \geq 2 \wedge (\forall n \in \mathbb{N})(n \mid p \Longrightarrow (n = 1 \vee n = p)).$$
If we abbreviate this formula with $F(p)$, then we have
$$p \text{ is a prime number} \iff F(p) \text{ is true}.$$
In particular, whether $F(p)$ is true does not only depend on the formula $F$, but also on the involved mathematical object $p \in \mathbb{N}$. We say that $p$ is a free variable in the formula $F(p)$, whereas the variable $n$ is a bound variable that falls into the scope of the universal quantifier $(\forall n \in \mathbb{N})$.
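The formula F(p) can be evaluated for concrete values of the free variable p. The sketch below is an illustration only and rests on one assumption that is not part of the formula itself: the universal quantifier over all natural numbers is replaced by the finite range 0, ..., p, which is harmless because any n with n | p and p >= 1 satisfies n <= p. The function names are chosen here.

```python
def divides(n, p):
    """n | p for natural numbers: there is a natural m with n * m == p."""
    if n == 0:
        return p == 0
    return p % n == 0


def F(p):
    """p >= 2 and (for all n)(n | p ==> (n == 1 or n == p)).

    The bound variable n only ranges over 0..p (assumption explained above).
    The implication a ==> b is written as (not a) or b.
    """
    return p >= 2 and all(
        (not divides(n, p)) or n == 1 or n == p for n in range(p + 1)
    )


# p is free: the truth value of F(p) depends on the chosen p.
print([p for p in range(20) if F(p)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```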
Besides the logical operations $\neg$, $\wedge$, $\vee$, $\Longrightarrow$, $\iff$ a first-order formula can also involve existential quantifiers $\exists$ and universal quantifiers $\forall$. Almost everything in mathematics can be expressed using first-order logical formulas. Occasionally, one needs second-order logic, where quantifications over subsets are allowed (i.e. we can have formulas like $(\forall A \subseteq X)...$). First-order formulas can also involve other mathematical objects such as relations or functions (that we will discuss later on). For instance, in the above example the divisibility relation $\mid$ has been used.
Similarly to propositional formulas, first-order formulas can be true just due to their mere logical form. For instance, the formula
(x X)F(x) (x X)F(x)
is true, irrespective of what $F(x)$ means or whether it is true for certain $x$. Such first-order formulas are called valid. An example of a formula which is not valid is
$$(\forall x \in X)(F(x) \Longrightarrow G(x)).$$
The truth of this formula depends on what $F(x)$ and $G(x)$ actually mean and how the respective truth value depends on $x$. If, for instance, $X = \mathbb{N}$ and $F(x)$ means "$x$ is a prime number" and $G(x)$ means "$x \geq 2$", then this would be correct. If we swap the meaning of $F(x)$ and $G(x)$, then the above formula would not be true.
Since we have not defined first-order formulas precisely, we will not be able to prove in detail that a given formula is actually valid. This can only be done in a course on logic where syntax and semantics of first-order formulas are defined more precisely. However, we believe that all the following examples are intuitively understandable.
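Example 3.8 (Valid first-order formulas) Let $X, Y$ be sets and let $F(x)$, $G(x)$,
$H(x, y)$ be first-order formulas whose truth values depend on some $x \in X$ and
$y \in Y$. Let $E$ be a formula that does not depend on $x$. Then all the following
first-order formulas are valid.
1. $(\forall x \in X)(F(x) \land G(x)) \iff ((\forall x \in X)F(x) \land (\forall x \in X)G(x))$
(quantifier exportation)
2. $(\exists x \in X)(F(x) \lor G(x)) \iff ((\exists x \in X)F(x) \lor (\exists x \in X)G(x))$
3. $(\forall x \in X)(E \lor G(x)) \iff (E \lor (\forall x \in X)G(x))$ (free quantifier exportation)
4. $(\exists x \in X)(E \land G(x)) \iff (E \land (\exists x \in X)G(x))$
5. $\neg(\forall x \in X)F(x) \iff (\exists x \in X)\neg F(x)$ (de Morgan's law)
6. $\neg(\exists x \in X)F(x) \iff (\forall x \in X)\neg F(x)$
7. $(\forall x \in X)(\forall y \in Y)H(x, y) \iff (\forall y \in Y)(\forall x \in X)H(x, y)$ (quantifier order)
8. $(\exists x \in X)(\exists y \in Y)H(x, y) \iff (\exists y \in Y)(\exists x \in X)H(x, y)$.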
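Some of these laws can be tested mechanically on a small finite set. The following Python sketch is only an illustration under stated assumptions: the three-element set `X` and all helper names are hypothetical, and a check on one finite set is evidence, not a proof of validity. It enumerates all predicates on `X`, tests the two quantifier exportation laws and de Morgan's law, and also tests an exportation with an incompatible connective.

```python
from itertools import product

X = [0, 1, 2]  # small finite universe, chosen only for illustration


def all_predicates(domain):
    """Yield every predicate on `domain` as a dict element -> bool."""
    for values in product([False, True], repeat=len(domain)):
        yield dict(zip(domain, values))


def forall_and(F, G):
    # (forall x)(F(x) and G(x))  <=>  (forall x)F(x) and (forall x)G(x)
    return all(F[x] and G[x] for x in X) == (all(F.values()) and all(G.values()))


def exists_or(F, G):
    # (exists x)(F(x) or G(x))  <=>  (exists x)F(x) or (exists x)G(x)
    return any(F[x] or G[x] for x in X) == (any(F.values()) or any(G.values()))


def de_morgan(F):
    # not (forall x)F(x)  <=>  (exists x) not F(x)
    return (not all(F.values())) == any(not F[x] for x in X)


def exists_and(F, G):
    # (exists x)(F(x) and G(x))  <=>  (exists x)F(x) and (exists x)G(x)
    # This exportation uses an incompatible connective.
    return any(F[x] and G[x] for x in X) == (any(F.values()) and any(G.values()))


preds = list(all_predicates(X))
laws_hold = all(forall_and(F, G) and exists_or(F, G) and de_morgan(F)
                for F in preds for G in preds)
bad_law_holds = all(exists_and(F, G) for F in preds for G in preds)
print(laws_hold, bad_law_holds)
```

The first flag reports that no predicate on `X` violates the tested laws, while the second one reports that the incompatible exportation fails for at least one pair of predicates.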
The rule of thumb for the quantifier exportation rules is that universal quantification
is compatible with conjunctions and existential quantification is compatible
with disjunctions. This is because one can read the universal quantifier like a big
"and" $\bigwedge$ and the existential quantifier like a big "or" $\bigvee$. Correspondingly, some
authors write $\bigwedge_{x \in X}$ instead of $(\forall x \in X)$ and $\bigvee_{x \in X}$ instead of $(\exists x \in X)$. It is easy
to see that the exportation does not work if conjunctions are used with existential
quantifiers or if disjunctions are used with universal quantifiers (see Problem 3.3).
If one of the formulas involved does not contain the variable over which one quantifies,
then one can export quantifiers also with incompatible logical operations, as
specified under "free quantifier exportation" above. Just as general quantifier
exportation is not valid for incompatible logical connectives, quantifiers of different
type cannot be swapped in general (see Problem 3.3).
In the section on propositional logic we have illustrated a simple method that
can be used to find out whether a given propositional formula is a tautology or
not. This truth table method was described as inefficient, but at least in principle
it is applicable to any formula whatsoever and it yields a clear result by following the
specified algorithm. Unfortunately, there is no such method for first-order formulas.
The absence of such a method does not merely mean that nobody has found one yet;
it has been proved that there is no such method as a matter of principle.
Theorem 3.9 (Church 1936) There is no algorithm that can decide for a given
first-order formula whether the formula is valid or not.
However, this does not mean that we cannot prove that certain first-order formulas
are valid. Indeed, there is an axiom system of valid first-order formulas from
which one can derive all the valid first-order formulas. This is the subject of Gödel's
Completeness Theorem, which is treated in a course on logic. That means, in particular,
that for any valid first-order formula there is also a proof that the formula is
valid. It might just be that the proof is very intricate and lengthy. For the current
purposes we just treat first-order formulas intuitively and we do not formally prove
their correctness. We can, however, construct some counterexamples for first-order
formulas that are not valid.
Problems
3.3 We consider counterexamples for incompatible quantifier exportation.
1. Prove that there is a set $X$ and logical formulas $F(x)$, $G(x)$ such that
$(\exists x \in X)(F(x) \land G(x)) \iff ((\exists x \in X)F(x) \land (\exists x \in X)G(x))$
is not true.
2. Prove that there is a set $X$ and logical formulas $F(x)$, $G(x)$ such that
$(\forall x \in X)(F(x) \lor G(x)) \iff ((\forall x \in X)F(x) \lor (\forall x \in X)G(x))$
is not true.
3. Prove that there are sets $X$, $Y$ and a formula $H(x, y)$ such that
$(\forall x \in X)(\exists y \in Y)H(x, y) \iff (\exists y \in Y)(\forall x \in X)H(x, y)$
is not true.
3.4 Correspondence Between Logic and Set Theory
We have noticed repeatedly that there is a close correspondence between concepts
in set theory and concepts in logic. Union and intersection, for instance, are defined
using the concepts of disjunction and conjunction, respectively (and using existential
quantification and universal quantification, respectively, in the case of indexed
families of sets).
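Set Theory              Logic
$\cup$                  $\lor$
$\cap$                  $\land$
${}^c$                  $\neg$
$\subseteq$             $\Longrightarrow$
$=$                     $\iff$
$\bigcup_{i \in I}$     $(\exists i \in I)$
$\bigcap_{i \in I}$     $(\forall i \in I)$
Table 3.1: Correspondence between concepts in set theory and logic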
On the other hand, for instance, the logical form of de Morgan's law is used
(implicitly) to prove its counterpart for sets. In Table 3.1 we just want to highlight
and collect these corresponding occurrences of concepts again in order to emphasize
this relation. One could add the correspondence between the symmetric difference
(see Problem 2.13) and the "exclusive or" operation to this table, but since we
are not going to use them any further, we will not include them here.
We are not going to explain the exact relation of these concepts here, but we
refer the reader to the respective definitions in order to identify the relation.
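The correspondence can be made tangible with Python's built-in sets. The following sketch is only an illustration: the ambient set `X`, the subsets `A`, `B` and the indexed family are arbitrary choices made here. Each set-theoretic operation is compared with a set comprehension built from the corresponding logical operation.

```python
X = set(range(10))                     # ambient set, an arbitrary choice
A = {x for x in X if x % 2 == 0}       # even numbers in X
B = {x for x in X if x < 5}            # numbers below 5 in X

# union <-> "or", intersection <-> "and", complement <-> "not"
assert A | B == {x for x in X if (x in A) or (x in B)}
assert A & B == {x for x in X if (x in A) and (x in B)}
assert X - A == {x for x in X if not (x in A)}

# subset <-> implication, equality of sets <-> equivalence
assert (A <= B) == all((x not in A) or (x in B) for x in X)
assert (A == B) == all((x in A) == (x in B) for x in X)

# indexed families: union <-> "exists", intersection <-> "for all"
family = {i: {i, i + 1} for i in range(3)}   # A_0, A_1, A_2
union = {x for x in X if any(x in family[i] for i in family)}
inter = {x for x in X if all(x in family[i] for i in family)}
assert union == set().union(*family.values())
assert inter == set.intersection(*family.values())
print(sorted(union), sorted(inter))
```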
CHAPTER 4
Relations and Functions
Mathematicians do not study objects, but relations between objects.
Thus, they are free to replace some objects by others so long as the
relations remain unchanged. Content to them is irrelevant:
they are interested in form only.
Jules Henri Poincaré (1854-1912)
4.1 What are Relations?
Mathematics is not just about objects, but about relations between objects. If we
study natural numbers, then we are not just interested in them as such, but we
want to understand relations between natural numbers such as divisibility. Only
such relations give substance to a subject such as number theory. Similarly,
if we study real numbers, we want to understand relations between them such as
linear, continuous or differentiable functions. This is what brings substance to linear
algebra and analysis. All such relations can be considered as subsets of set products
in the following straightforward sense.
Definition 4.1 (Relation) A triple $(R, X, Y)$ is called a relation, if $X$ and $Y$ are
sets and $R \subseteq X \times Y$. We will call $X$ the source and $Y$ the target of the relation and
$R$ its graph.
Typically, we will just say that $R \subseteq X \times Y$ is a relation between $X$ and $Y$ and we
assume that the source $X$ and target $Y$ are defined implicitly in this way. However,
one should keep in mind that source $X$ and target $Y$ have to be specified as part of
the relation. It is not sufficient just to specify the graph alone. A relation is called
homogeneous if $X = Y$, i.e. if source and target are identical. A relation $R \subseteq X \times X$
is also called a relation on $X$. If $R \subseteq X \times Y$ is a relation, then we also write
$xRy :\iff (x, y) \in R$.
The idea of the notation $xRy$ is that it is a short way of saying that $x$ is in relation
$R$ to $y$. To understand the nature of the definition of a relation, we illustrate it
with a number of examples, some of which we have actually seen earlier.
Example 4.2 The following are relations:
1. The set $\{(x, y) \in \mathbb{N} \times \mathbb{N} : x \leq y\} \subseteq \mathbb{N} \times \mathbb{N}$ defines the ordinary "less or equal"
relation on natural numbers $\mathbb{N}$, usually denoted by $\leq$.
2. The set $\{(x, y) \in \mathbb{N} \times \mathbb{N} : x < y\} \subseteq \mathbb{N} \times \mathbb{N}$ defines the ordinary "strictly less"
relation on natural numbers $\mathbb{N}$, usually denoted by $<$.
3. The set $\{(x, y) \in \mathbb{N} \times \mathbb{N} : x \mid y\} \subseteq \mathbb{N} \times \mathbb{N}$ defines the ordinary divisor relation
on natural numbers $\mathbb{N}$, usually denoted by $\mid$.
4. For any set $X$, the set $\Delta_X := \{(x, y) \in X \times X : x = y\} \subseteq X \times X$ defines
the usual equality relation on $X$, usually denoted by $=$. The set $\Delta_X$ is also
called the diagonal of $X$.
5. For any two sets $X, Y$, the set $X \times Y$ defines the all relation between $X$ and $Y$.
6. For any two sets $X, Y$, the set $\emptyset \subseteq X \times Y$ defines the empty relation between
$X$ and $Y$.
7. For any set $X$, the set $\{(x, A) \in X \times 2^X : x \in A\}$ defines the element relation
between $X$ and $2^X$, usually denoted by $\in$.
8. For any set $X$, the set $\{(A, B) \in 2^X \times 2^X : A \subseteq B\}$ defines the subset relation
on the power set $2^X$, usually denoted by $\subseteq$.
9. For any set $X$, the set $\{(A, B) \in 2^X \times 2^X : A \subsetneq B\}$ defines the proper subset
relation on the power set $2^X$, usually denoted by $\subsetneq$.
We emphasize again that the source and target sets specified here are part of
the definition of the corresponding relations. For instance, there is just one unique
empty set $\emptyset$, but there are many empty relations $(\emptyset, X, Y)$, namely one for each pair
of sets $X$ and $Y$. All the relations given in the previous example, besides the element
relation, the all relation and the empty relation, are homogeneous.
Since relations are just specified using subsets of products of sets, we can perform
all usual set-theoretic operations on relations or, more precisely, on their graphs. We
illustrate this with another example, where we try to capture the divisibility relation
restricted to numbers up to 5 and disregarding the number 1.
Example 4.3 We consider the following sets:
1. $D := \{(n, k) \in \mathbb{N} \times \mathbb{N} : n \mid k\}$,
2. $X := Y := \{0, 1, 2, 3, 4, 5\}$,
3. $A := X \setminus \{1\} = \{0, 2, 3, 4, 5\}$,
4. $R := D \cap (A \times A)$
$= \{(0, 0), (2, 0), (2, 2), (2, 4), (3, 0), (3, 3), (4, 0), (4, 4), (5, 0), (5, 5)\}$.
Then $(R, X, Y)$ is a relation that captures divisibility up to 5, but disregarding the
number 1. The following diagram illustrates this relation.
[Diagram: the elements 0, ..., 5 of $X$ (left) and of $Y$ (right), with an arrow from $x$ to $y$ whenever $xRy$.]
Figure 4.1: The relation $R \subseteq X \times Y$.
Finite relations R are often illustrated in diagrams such as the one in Figure 4.1.
The source set and the target set are given separately with all their elements and an
arrow is added from each point x in the source space to each point y in the target
space with xRy.
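The construction of Example 4.3 can be replayed with Python sets. This is a minimal sketch: the helper `divides` is a name introduced here, and since the divisibility relation on all natural numbers is infinite, only the part of $D$ that lies in $A \times A$ is ever computed.

```python
def divides(n, k):
    """n | k on natural numbers: k = n * m for some natural number m."""
    if n == 0:
        return k == 0          # 0 divides only 0
    return k % n == 0


X = Y = set(range(6))          # {0, 1, 2, 3, 4, 5}
A = X - {1}                    # {0, 2, 3, 4, 5}

# R = D intersected with A x A, computed directly on the finite set A x A
R = {(n, k) for n in A for k in A if divides(n, k)}
print(sorted(R))
```

The ten printed pairs can be compared one by one with the list given in Example 4.3.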
4.2 Composition and Inverse Relations
Since relations are special sets, we can apply the machinery of set theory to relations,
i.e. we can form unions, intersections, differences and other operations on sets. This
has been illustrated in Example 4.3. However, there are also some operations that
are tailor-made for relations, the most important of which is composition, which we
define next.
Definition 4.4 (Composition) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Then
we define a relation $S \circ R \subseteq X \times Z$, which is called the composition of the two given
relations, by
$S \circ R := \{(x, z) \in X \times Z : (\exists y \in Y)(xRy \text{ and } ySz)\}$.
We point out that two relations $(R, X, Y)$ and $(S, V, Z)$ can only be composed in
the order $S \circ R$ if the target $Y$ of $R$ is identical to the source $V$ of $S$, i.e. if $Y = V$.
Sometimes we will just write $SR := S \circ R$, for short. We illustrate composition in a
continuation of Example 4.3.
Example 4.5 We consider the relation $R \subseteq X \times Y$ from Example 4.3 and we let
$Z := X = Y$. Moreover, we consider the predecessor relation $S \subseteq Y \times Z$ with
$S := \{(y, z) \in Y \times Z : z = y - 1\}$.
The following diagram illustrates the composition $S \circ R$ of the two relations.
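[Diagram: the elements 0, ..., 5 of $X$, $Y$ and $Z$ with the arrows of $R$ from $X$ to $Y$ and of $S$ from $Y$ to $Z$, next to the resulting arrows of $S \circ R$ from $X$ to $Z$.]
Figure 4.2: The relation $S \circ R \subseteq X \times Z$.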
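The composition in Example 4.5 can be computed directly from Definition 4.4. In the following sketch the function name `compose` is an assumption made here; the graph of $R$ is copied from Example 4.3 so that the block is self-contained.

```python
def compose(S, R):
    """Graph of S o R: all (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


X = Y = Z = set(range(6))

# graph of R as listed in Example 4.3
R = {(0, 0), (2, 0), (2, 2), (2, 4), (3, 0), (3, 3),
     (4, 0), (4, 4), (5, 0), (5, 5)}

# predecessor relation S = {(y, z) in Y x Z : z = y - 1}
S = {(y, y - 1) for y in Y if (y - 1) in Z}

SR = compose(S, R)
print(sorted(SR))
```

Pairs of $R$ that end in 0 contribute nothing, since 0 has no predecessor in $Z$.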
The composition operation on relations satisfies a number of important properties.
We mention that it is associative and that the diagonal acts as identity element
with respect to composition.
Proposition 4.6 (Composition) Let $R \subseteq X \times Y$, $S \subseteq Y \times Z$ and $T \subseteq Z \times W$ be
relations. Then
1. $(T \circ S) \circ R = T \circ (S \circ R)$ (associativity)
2. $R \circ \Delta_X = \Delta_Y \circ R = R$ (identity element)
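Proof. We only prove the first statement and we leave the second one to the reader
(see Problem 4.1). Let $x \in X$ and $w \in W$. Then we obtain
$(x, w) \in (T \circ S) \circ R$
$\iff (\exists y \in Y)\big((x, y) \in R \text{ and } (y, w) \in (T \circ S)\big)$
$\iff (\exists y \in Y)\big((x, y) \in R \text{ and } (\exists z \in Z)((y, z) \in S \text{ and } (z, w) \in T)\big)$
$\iff (\exists y \in Y)(\exists z \in Z)\big((x, y) \in R \text{ and } ((y, z) \in S \text{ and } (z, w) \in T)\big)$
$\iff (\exists z \in Z)(\exists y \in Y)\big(((x, y) \in R \text{ and } (y, z) \in S) \text{ and } (z, w) \in T\big)$
$\iff (\exists z \in Z)\big((\exists y \in Y)((x, y) \in R \text{ and } (y, z) \in S) \text{ and } (z, w) \in T\big)$
$\iff (\exists z \in Z)\big((x, z) \in S \circ R \text{ and } (z, w) \in T\big)$
$\iff (x, w) \in T \circ (S \circ R)$.
Thus, we have proved $(T \circ S) \circ R = T \circ (S \circ R)$. We note that in the above proof
we have used, among other logical transformations, free quantifier exportation (see
Example 3.8). □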
A relation does not need to be defined on all elements of the source and it does
not need to reach all elements of the target. In the next definition we capture those
elements of the source and target, respectively, which are actually in use. The
corresponding subsets of the source and the target are called domain and range,
respectively.
Definition 4.7 (Domain and range) Let $R \subseteq X \times Y$ be a relation. Then we
define
1. $\mathrm{dom}(R) := \{x \in X : (\exists y \in Y)\, xRy\}$, which is called the domain of $R$,
2. $\mathrm{range}(R) := \{y \in Y : (\exists x \in X)\, xRy\}$, which is called the range of $R$.
Some authors write ran(R) or im(R) instead of range(R). In Example 4.3 we
obtain dom(R) = range(R) = A, which is a proper subset of the source and target
X = Y . Those relations for which domain and source set, on the one hand, and
range and target set, on the other hand, coincide, have special names.
Definition 4.8 (Totality) Let $R \subseteq X \times Y$ be a relation. Then
1. $R$ is called left total, if $\mathrm{dom}(R) = X$,
2. $R$ is called right total, if $\mathrm{range}(R) = Y$.
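For finite relations, domain, range and the two totality properties can be read off mechanically. The sketch below uses hypothetical helper names (`dom`, `rng`, `is_left_total`, `is_right_total`) and the graph of $R$ from Example 4.3.

```python
def dom(R):
    """Domain: all x that are related to at least one y."""
    return {x for (x, _) in R}


def rng(R):
    """Range: all y that at least one x is related to."""
    return {y for (_, y) in R}


def is_left_total(R, X):
    return dom(R) == X


def is_right_total(R, Y):
    return rng(R) == Y


X = Y = set(range(6))
R = {(0, 0), (2, 0), (2, 2), (2, 4), (3, 0), (3, 3),
     (4, 0), (4, 4), (5, 0), (5, 5)}

print(sorted(dom(R)), sorted(rng(R)))
print(is_left_total(R, X), is_right_total(R, Y))
```

Here the element 1 belongs neither to the domain nor to the range, so $R$ is neither left total nor right total on $X = Y$.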
None of the relations $R$, $S$ and $S \circ R$ in Examples 4.3 and 4.5 is left total or right total.
The less or equal relation $\leq$ and the divisibility relation $\mid$ on $\mathbb{N}$ are examples of left
and right total relations. The strictly less relation $<$ on $\mathbb{N}$ is left total, but not right
total, since no natural number is strictly less than 0. The strictly larger relation $>$ on $\mathbb{N}$
is right total, but not left total. In fact, $>$ is just the inversion of $<$. Inversion is another
operation that can be performed on relations in general and we define it next.
Definition 4.9 (Inverse relation) Let $R \subseteq X \times Y$ be a relation. Then we define
the inverse relation $R^{-1} \subseteq Y \times X$ by
$R^{-1} := \{(y, x) \in Y \times X : xRy\}$.
Inversion intuitively means to swap source and target space, but to leave the
relation as it is otherwise. For instance, the inverse $\leq^{-1}$ is nothing but $\geq$, the
inverse $<^{-1}$ is nothing but $>$ (where we consider all these relations on $\mathbb{N}$). Moreover,
the inverse $\subseteq^{-1}$ is nothing but $\supseteq$ (where we consider these relations on $2^X$ for an
arbitrary set $X$). Regarding a diagram as in Example 4.3, inversion means to reverse
all the arrows. We state a number of properties regarding inversion and composition.
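Proposition 4.10 (Inverse and composition) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$
be relations. Then
1. $(R^{-1})^{-1} = R$.
2. $(S \circ R)^{-1} = R^{-1} \circ S^{-1}$.
3. $\Delta_{\mathrm{dom}(R)} \subseteq R^{-1} \circ R$.
4. $\mathrm{dom}(R^{-1}) = \mathrm{range}(R)$ and $\mathrm{range}(R^{-1}) = \mathrm{dom}(R)$.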
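Proof. We prove 2. and 3. and we leave the other statements to the reader (see
Problem 4.2). Let $x \in X$ and $z \in Z$. Then we obtain
$(z, x) \in (S \circ R)^{-1} \iff (x, z) \in S \circ R$
$\iff (\exists y \in Y)\big((x, y) \in R \text{ and } (y, z) \in S\big)$
$\iff (\exists y \in Y)\big((z, y) \in S^{-1} \text{ and } (y, x) \in R^{-1}\big)$
$\iff (z, x) \in R^{-1} \circ S^{-1}$.
Thus, we have proved $(S \circ R)^{-1} = R^{-1} \circ S^{-1}$. Now let $(x, x') \in \Delta_{\mathrm{dom}(R)}$. Then
$x = x'$ and $x \in \mathrm{dom}(R)$. Hence, there exists $y \in Y$ such that $(x, y) \in R$ and
hence $(y, x) \in R^{-1}$. This implies $(x, x') = (x, x) \in R^{-1} \circ R$. Thus, we have proved
$\Delta_{\mathrm{dom}(R)} \subseteq R^{-1} \circ R$. □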
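The statements of Proposition 4.10 can be checked on the finite relations of Examples 4.3 and 4.5. The following sketch only tests one example and does not replace the proof; the helper names `inverse` and `compose` are introduced here for illustration.

```python
def compose(S, R):
    """Graph of S o R: all (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


def inverse(R):
    """Graph of the inverse relation: swap the components of every pair."""
    return {(y, x) for (x, y) in R}


Y = Z = set(range(6))
R = {(0, 0), (2, 0), (2, 2), (2, 4), (3, 0), (3, 3),
     (4, 0), (4, 4), (5, 0), (5, 5)}
S = {(y, y - 1) for y in Y if (y - 1) in Z}

# 1. inverting twice gives the original relation
double_inverse_ok = inverse(inverse(R)) == R

# 2. the inverse of a composition reverses the order of the factors
anti_hom_ok = inverse(compose(S, R)) == compose(inverse(R), inverse(S))

# 3. the diagonal of dom(R) is contained in R^{-1} o R
diag_dom = {(x, x) for (x, _) in R}
diag_ok = diag_dom <= compose(inverse(R), R)

print(double_inverse_ok, anti_hom_ok, diag_ok)
```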
Now we want to discuss a result that shows how composition and inverses affect
the totality of relations.
Proposition 4.11 (Totality) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Then
1. If $R$ and $S$ are left total, then $S \circ R$ is left total.
2. If $R$ and $S$ are right total, then $S \circ R$ is right total.
3. $R$ is left total $\iff$ $R^{-1}$ is right total.
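Proof. We prove 1. and we leave 2. and 3. to the reader (see Problem 4.3). Let $R$
and $S$ be left total. Then $\mathrm{dom}(R) = X$ and $\mathrm{dom}(S) = Y$ and we obtain
$x \in \mathrm{dom}(S \circ R) \iff (\exists z \in Z)\, x(S \circ R)z$
$\iff (\exists z \in Z)(\exists y \in Y)(xRy \text{ and } ySz)$
$\iff (\exists y \in Y)(xRy \text{ and } (\exists z \in Z)\, ySz)$
$\iff (\exists y \in Y)(xRy \text{ and } y \in \mathrm{dom}(S))$
$\iff (\exists y \in Y)\, xRy$
$\iff x \in \mathrm{dom}(R) = X$.
Hence $\mathrm{dom}(S \circ R) = X$ and $S \circ R$ is left total. □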
Problems
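4.1 Let $R \subseteq X \times Y$ be a relation. Prove that
$R \circ \Delta_X = \Delta_Y \circ R = R$ (identity element).
4.2 Let $R \subseteq X \times Y$ be a relation. Prove that:
1. $(R^{-1})^{-1} = R$,
2. $\mathrm{dom}(R^{-1}) = \mathrm{range}(R)$ and $\mathrm{range}(R^{-1}) = \mathrm{dom}(R)$.
4.3 Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Prove that:
1. If $R$ and $S$ are right total, then $S \circ R$ is right total.
2. $R$ is left total $\iff$ $R^{-1}$ is right total.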
4.3 Functions
Perhaps the most important relations that are considered in mathematics are functions.
The idea of a function $f : X \to Y$ is that each value $x \in X$ is mapped to one
and only one function value $f(x) \in Y$. Thus, the crucial property here is uniqueness.
For symmetry reasons we define uniqueness on the left and uniqueness on the
right-hand side, where right uniqueness is what is required for functions.
Definition 4.12 (Uniqueness) Let $R \subseteq X \times Y$ be a relation.
1. $R$ is called left unique, if for all $x_1, x_2 \in X$ and $y \in Y$
$x_1Ry \text{ and } x_2Ry \Longrightarrow x_1 = x_2$.
2. $R$ is called right unique, if for all $x \in X$ and $y_1, y_2 \in Y$
$xRy_1 \text{ and } xRy_2 \Longrightarrow y_1 = y_2$.
Firstly, we show how composition and inverses affect the uniqueness of relations.
Proposition 4.13 (Uniqueness) Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations.
Then we obtain:
1. If $R$ and $S$ are right unique, then $S \circ R$ is right unique.
2. If $R$ and $S$ are left unique, then $S \circ R$ is left unique.
3. $R$ is left unique if and only if $R^{-1}$ is right unique.
Proof. We just prove the first statement and we leave the other statements to the
reader (see Problem 4.4). Let $R$ and $S$ be right unique. We prove that this implies
that $S \circ R$ is right unique. Let $x \in X$ and let $z_1, z_2 \in Z$ such that $x(S \circ R)z_1$ and
$x(S \circ R)z_2$ hold. Then there are $y_1, y_2 \in Y$ such that $xRy_1$, $y_1Sz_1$ and $xRy_2$, $y_2Sz_2$
hold. Since $R$ is right unique, we obtain $y_1 = y_2$. That is, we have $y_1Sz_1$ and $y_1Sz_2$.
This implies $z_1 = z_2$ since $S$ is right unique. Altogether we have proved that $S \circ R$
is right unique. □
Using the notion of right uniqueness we can now formally define what a function
is.
Definition 4.14 (Function) Any left total and right unique relation $R \subseteq X \times Y$
is called a function. We denote such a function by $f : X \to Y$.
The above notation $f : X \to Y$ for a function just indicates that this object is a
left total and right unique relation $f = (R, X, Y)$. The underlying set $R$ is usually
referred to as $\mathrm{graph}(f) = R$ and it is called the graph of the function $f$. Functions
are often called maps or mappings and we use all three words
synonymously here. The important thing is, once again, that a function is more than
its graph $R$; the entire triple $(R, X, Y)$ constitutes the function. We write $\mathrm{range}(f)$
for the range of a function. The domain of a function is by definition always equal
to the source space. The target space of a function is referred to by some authors
as codomain. However, we will avoid this terminology and stick to the notion pairs
source and target, on the one hand, and domain and range, on the other hand. We
mention that by $Y^X$ one denotes the set of functions $f : X \to Y$. None of the
relations $R$, $S$ or $S \circ R$ in Example 4.5 is a function. We provide an example of a
function.
Example 4.15 Let $X := Y := \{0, 1, 2, 3, 4, 5\}$ and let
$R := \{(0, 1), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5)\}$.
Then $R \subseteq X \times Y$ is a relation that is left total and right unique. Hence it defines a
function $f : X \to Y$. The relation $R$ is illustrated in the diagram in Figure 4.3.
[Diagram: the elements 0, ..., 5 of $X$ (left) and of $Y$ (right), with one arrow from each $x$ to its value under $f$.]
Figure 4.3: A function $f : X \to Y$.
A characteristic feature of a function $f : X \to Y$ with $R := \mathrm{graph}(f)$ is that for
each $x \in X$ there is one and only one value $f(x) \in Y$ such that $x$ is related to $f(x)$,
i.e. such that $(x, f(x)) \in R$ holds. This value will be called the function value of $f$
on input $x$.
Definition 4.16 (Function value) Let $f : X \to Y$ be a function with $\mathrm{graph}(f) = R$.
Then we define $f(x) \in Y$ to be the unique value in the set
$\{y \in Y : xRy\}$
for any $x \in X$. The value $f(x)$ is called the function value of $f$ at $x$.
We need to justify why the value $f(x)$ is well-defined by the above implicit
definition. Firstly, the given set $\{y \in Y : xRy\}$ is non-empty, since $R$ is left total,
and secondly, it contains exactly one point, since $R$ is right unique. Hence, we can
define $f(x)$ to be this point.
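For finite relations, Definition 4.14 and Definition 4.16 translate into a direct check. The sketch below is illustrative only; the names `is_function` and `value` are assumptions made here. It tests the relation of Example 4.15 and, for contrast, the relation of Example 4.3.

```python
def is_function(R, X, Y):
    """True if the relation (R, X, Y) is left total and right unique."""
    left_total = all(any((x, y) in R for y in Y) for x in X)
    right_unique = all(y1 == y2
                       for (x1, y1) in R for (x2, y2) in R if x1 == x2)
    return left_total and right_unique


def value(R, x):
    """Function value f(x): the unique element of {y : xRy}.

    Raises ValueError if that set does not contain exactly one element.
    """
    (y,) = {y for (x0, y) in R if x0 == x}
    return y


X = Y = set(range(6))
R_415 = {(0, 1), (1, 1), (2, 3), (3, 2), (4, 4), (5, 5)}       # Example 4.15
R_43 = {(0, 0), (2, 0), (2, 2), (2, 4), (3, 0), (3, 3),
        (4, 0), (4, 4), (5, 0), (5, 5)}                         # Example 4.3

print(is_function(R_415, X, Y), is_function(R_43, X, Y))
print([value(R_415, x) for x in sorted(X)])
```

The relation of Example 4.3 fails both conditions: 1 is not in its domain, and 2 is related to 0, 2 and 4.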
Example 4.17 We consider the relation $R \subseteq \mathbb{N} \times \mathbb{N}$ given by
$R := \{(n, k) \in \mathbb{N} \times \mathbb{N} : k = n^2\}$.
This relation is left total and right unique and hence it defines a function $f : \mathbb{N} \to \mathbb{N}$
with $\mathrm{graph}(f) = R$. We have
$f(n) = n^2$
for each value $n \in \mathbb{N}$. We could also define the function $f : \mathbb{N} \to \mathbb{N}$ by saying that
$f(n) := n^2$ for all $n \in \mathbb{N}$, as this fully specifies the graph of $f$ (see Proposition 4.18).
Another common way of denoting this definition is as follows:
$f : \mathbb{N} \to \mathbb{N}, n \mapsto n^2$.
Here the understanding is that $n$ is an arbitrary element in the source set $\mathbb{N}$ and
$n \mapsto n^2$ means that $n$ is mapped to $n^2$. This is the same as saying $f(n) := n^2$
for all $n \in \mathbb{N}$.
Sometimes one finds statements such as
"We consider the function $f(n) = n^2$ ..."
The reader should be warned that this is an abuse of mathematical terminology that
sometimes creates confusion and mistakes. The function in the above example is
the object $f : \mathbb{N} \to \mathbb{N}$ and it is not fully specified without naming its source and
target set. Moreover, the object $f(n)$ is a natural number and not a function in this
case. It is advisable to avoid the above terminology and to keep functions and
their function values clearly separated in mathematical formulations. The equation
$f(n) = n^2$ can only be used to define $f$, given its source and target set. But neither
the equation nor $f(n)$ is the function. The following proposition justifies defining
a function $f$ by an equation as in Example 4.17 above.
Proposition 4.18 (Graph) Let $f : X \to Y$ be a function. Then
$\mathrm{graph}(f) = \{(x, y) \in X \times Y : f(x) = y\}$.
Proof. If $f : X \to Y$ is a function, then that means that $R := \mathrm{graph}(f)$ is a left total
and right unique relation $R \subseteq X \times Y$. We prove $R = \{(x, y) \in X \times Y : f(x) = y\}$.
If $(x, y) \in R$, then $xRy$ and hence $\{f(x)\} = \{y' \in Y : xRy'\} = \{y\}$ due to right
uniqueness of $R$. This means $f(x) = y$, which proves "$\subseteq$". For the other inclusion
we consider $(x, y) \in X \times Y$ with $f(x) = y$. This means that $y$ is the only value
in $\{y' \in Y : xRy'\}$ and in particular $xRy$ holds, i.e. $(x, y) \in R$. □
[Plot: points of the graph in a Cartesian coordinate system; horizontal axis $n$, vertical axis $f(n)$.]
Figure 4.4: Graph of the function $f : \mathbb{N} \to \mathbb{N}, n \mapsto n^2$.
This proposition says that the graph of a function can essentially be characterized
by the function values. The diagram in Figure 4.4 is a typical illustration of (a part
of) a graph of a function. This illustration uses a Cartesian coordinate system to
illustrate the graph. The horizontal axis represents the input values n, whereas the
vertical axis represents the function values f(n). The previous proposition is the
basis of the following observation which says that two functions with identical source
and target set are equal if and only if all their function values coincide.
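Proposition 4.19 (Equality of functions) Let $f : X \to Y$ and $g : X' \to Y'$ be
functions. Then
$f = g \iff X = X' \text{ and } Y = Y' \text{ and } (\forall x \in X)\, f(x) = g(x)$.
Proof. The fact that $f : X \to Y$ and $g : X' \to Y'$ are functions means that
$f = (\mathrm{graph}(f), X, Y)$ and $g = (\mathrm{graph}(g), X', Y')$ and that $\mathrm{graph}(f) \subseteq X \times Y$ and
$\mathrm{graph}(g) \subseteq X' \times Y'$ are left total and right unique relations. Hence, it is clear that
$f = g \iff X = X' \text{ and } Y = Y' \text{ and } \mathrm{graph}(f) = \mathrm{graph}(g)$.
So let us assume now that $X = X'$ and $Y = Y'$. By Proposition 4.18 we obtain
$\mathrm{graph}(f) = \mathrm{graph}(g)$
$\iff \{(x, y) \in X \times Y : f(x) = y\} = \{(x, y) \in X \times Y : g(x) = y\}$
$\iff (\forall x \in X)(\forall y \in Y)(f(x) = y \iff g(x) = y)$
$\iff (\forall x \in X)\, f(x) = g(x)$.
Altogether, this proves the claim. □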
Next we mention that the composition of two functions is a function again. In
fact this result follows from our previous results on relations.
Corollary 4.20 (Composition) Let $f : X \to Y$ and $g : Y \to Z$ be functions with
graphs $R := \mathrm{graph}(f)$ and $S := \mathrm{graph}(g)$. Then the relation $S \circ R \subseteq X \times Z$ is a
function too, which we denote by $g \circ f : X \to Z$.
If $f$ and $g$ are functions, then the relations $R$ and $S$ are both left total and right
unique. Hence $S \circ R$ is also left total and right unique by Propositions 4.11 and 4.13.
But this means that $S \circ R$ is a function too. The composition of two functions can
be understood as applying the functions one after the other. This is made precise
in the following proposition.
Proposition 4.21 (Composition) Let $f : X \to Y$ and $g : Y \to Z$ be functions.
Then we obtain
$(g \circ f)(x) = g(f(x))$
for all $x \in X$.
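Proof. For $x \in X$ and $z \in Z$ we obtain by Proposition 4.18
$(g \circ f)(x) = z \iff (x, z) \in \mathrm{graph}(g \circ f)$
$\iff (\exists y \in Y)\big((x, y) \in \mathrm{graph}(f) \text{ and } (y, z) \in \mathrm{graph}(g)\big)$
$\iff (\exists y \in Y)(f(x) = y \text{ and } g(y) = z)$
$\iff g(f(x)) = z$.
This proves $(g \circ f)(x) = g(f(x))$ for all $x \in X$. □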
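Proposition 4.21 can be observed on a finite example by composing the graphs as relations and comparing the result with pointwise application. In this sketch functions on a finite set are stored as Python dicts, which is a representation chosen here for illustration; `f` is the function of Example 4.15 and `g` is a predecessor-like function on the same six-element set.

```python
def compose(S, R):
    """Graph of S o R: all (x, z) such that xRy and ySz for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}


X = range(6)
f = {0: 1, 1: 1, 2: 3, 3: 2, 4: 4, 5: 5}          # Example 4.15
g = {n: (0 if n == 0 else n - 1) for n in X}      # illustrative choice of g

graph_f = set(f.items())
graph_g = set(g.items())

# composition as a relation, turned back into a dict
g_after_f = dict(compose(graph_g, graph_f))

# composition by applying the functions one after the other
pointwise = {x: g[f[x]] for x in X}

print(g_after_f == pointwise, pointwise)
```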
The composition of functions $f : X \to Y$ and $g : Y \to Z$ is often illustrated in
so-called commutative diagrams. Figure 4.5 shows a commutative diagram, which is
called such because it does not matter along which path one goes through the diagram.
Moving from $X$ to $Y$ along the arrow $f$ and continuing along $g$ to $Z$ leads to the
same result as moving from $X$ to $Z$ along $g \circ f$.
[Diagram: triangle with arrows $f$ from $X$ to $Y$, $g$ from $Y$ to $Z$ and $g \circ f$ from $X$ to $Z$.]
Figure 4.5: A commutative diagram for the composition of two functions.
Next we mention that the diagonal $\Delta_X \subseteq X \times X$ of any set is a function that
we actually call the identity of $X$.
Definition 4.22 (Identity function) Let $X$ be a set. The function
$\mathrm{id}_X : X \to X, x \mapsto x$
is called the identity of $X$.
It is easy to see that $\mathrm{graph}(\mathrm{id}_X) = \Delta_X$. We mention an immediate corollary of
Proposition 4.6.
Corollary 4.23 (Identity) Let $f : X \to Y$ be a function. Then
$f = f \circ \mathrm{id}_X = \mathrm{id}_Y \circ f$.
At the end of this section we mention some other types of relations which are
often used in mathematics. Relations that are only left total are called multi-valued
functions or correspondences and they are typically denoted by $f : X \rightrightarrows Y$. They
lack the uniqueness property and hence there is not necessarily one unique function
value $f(x)$, but an entire set $f(x) \subseteq Y$ of possible values. Relations that are only
right unique are called partial functions and they are often denoted by $f :\subseteq X \to Y$
or $f : X \rightharpoonup Y$. Partial functions are not necessarily defined on the entire source
set $X$. We write $\mathrm{dom}(f)$ for the domain of a partial function. The diagram in
Figure 4.6 lists some common types of functions and relations that are often used
in mathematics. We study injections, surjections and bijections more closely in the
next section.
[Diagram: hierarchy of relation types. A relation $R \subseteq X \times Y$ becomes a partial function by adding "right unique" and a multi-valued function by adding "left total"; both properties together give a function. A function becomes an injection by adding "left unique" and a surjection by adding "right total"; both properties together give a bijection.]
Figure 4.6: Some common types of functions
Problems
4.4 Let $R \subseteq X \times Y$ and $S \subseteq Y \times Z$ be relations. Prove that:
1. If $R$ and $S$ are left unique, then $S \circ R$ is left unique.
2. $R$ is left unique if and only if $R^{-1}$ is right unique.
4.4 Injections, Surjections and Bijections
In this section we discuss functions that have additional totality and uniqueness
properties. We start with a definition.
Definition 4.24 (Injective, surjective, bijective) Let $f : X \to Y$ be a function.
1. $f$ is called injective, if $f$ is left unique,
2. $f$ is called surjective, if $f$ is right total,
3. $f$ is called bijective, if $f$ is injective and surjective.
Injective, surjective and bijective functions are also called injection, surjection and
bijection, respectively.
An injection is sometimes denoted as $f : X \hookrightarrow Y$, where the arrow $\hookrightarrow$ is supposed
to indicate that this function is injective. Such an injection is also called a function $f$
from $X$ into $Y$. Similarly, surjections are sometimes denoted as $f : X \twoheadrightarrow Y$, where
the arrow $\twoheadrightarrow$ indicates that this function is surjective. Such a surjection is also
called a function $f$ from $X$ onto $Y$. For bijections one sometimes sees a dedicated
arrow notation as well, but we will not use this here.
By definition an injective function is a function that cannot map two different
inputs to the same output and a surjective function is a function that yields all
values of the target space as output. We capture these characterizations in terms of
function values in the following proposition.
Proposition 4.25 (Injectivity, surjectivity and bijectivity) Let $f : X \to Y$
be a function. Then
1. $f$ is injective if and only if for all $x, y \in X$ we have that $f(x) = f(y)$ implies
$x = y$,
2. $f$ is surjective if and only if for all $y \in Y$ there exists an $x \in X$ with $f(x) = y$,
3. $f$ is bijective if and only if for all $y \in Y$ there exists exactly one $x \in X$ with
$f(x) = y$.
We leave the proof to the reader (see Problem 4.5). Often the above characterization
of injectivity is used in its contrapositive form, i.e. a function $f : X \to Y$ is
injective if and only if for all $x, y \in X$ we have that $x \neq y$ implies $f(x) \neq f(y)$. For
short: distinct inputs have to be mapped to distinct outputs. This is the reason why
some authors also call injective functions one-to-one functions. However, this terminology
is ambiguous, since it is also sometimes used to refer to bijective functions
and hence we will try to avoid it here.
[Diagrams: three arrow diagrams on small finite sets, labelled "injective (but not surjective)", "surjective (but not injective)" and "bijective".]
Figure 4.7: Examples of injective, surjective and bijective functions
Figure 4.6 summarizes the different types of functions that we have seen. The
function in Example 4.15 is neither surjective nor injective. The diagrams in Figure 4.7
provide examples of injective, surjective and bijective functions. We provide
some further examples.
Example 4.26
1. The square function $f : \mathbb{N} \to \mathbb{N}, n \mapsto n^2$ is an example of a function that is
injective, but not surjective. Hence, $f$ is also not bijective.
2. The square function $f : \mathbb{Z} \to \mathbb{Z}, z \mapsto z^2$ on integers is an example of a function
that is neither injective nor surjective.
3. The predecessor function
$f : \mathbb{N} \to \mathbb{N}, n \mapsto \begin{cases} 0 & \text{if } n = 0 \\ n - 1 & \text{otherwise} \end{cases}$
is an example of a function that is not injective, but surjective.
4. The maximum function $\max : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is defined by
$\max(n, k) := \begin{cases} n & \text{if } n \geq k \\ k & \text{otherwise} \end{cases}$
for all $n, k \in \mathbb{N}$. The function $\max$ is surjective, but not injective. The same
holds for the minimum function $\min : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ that is defined analogously
with $\leq$ in place of $\geq$.
5. The identity function $\mathrm{id}_X : X \to X, x \mapsto x$ on any set $X$ is injective and
surjective, hence bijective.
6. The constant function $c_y : X \to Y, x \mapsto y$ is defined for any two sets $X$ and $Y$
and any $y \in Y$. If each of $X$ and $Y$ contains at least two different elements,
then $c_y$ is neither surjective nor injective.
The examples of the predecessor function and the maximum function illustrate
another way in which definitions (of functions) are often written in mathematics,
namely by case distinction.
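For functions on finite sets, the characterizations of Proposition 4.25 can be tested directly. The sketch below stores a function as a dict from inputs to outputs, a representation assumed here for illustration; the examples `inj` and `sur` are hypothetical, while `f` is the function of Example 4.15.

```python
def is_injective(f):
    """Distinct inputs have distinct outputs."""
    return len(set(f.values())) == len(f)


def is_surjective(f, Y):
    """Every element of the target Y occurs as an output."""
    return set(f.values()) == set(Y)


def is_bijective(f, Y):
    return is_injective(f) and is_surjective(f, Y)


Y = range(6)
f = {0: 1, 1: 1, 2: 3, 3: 2, 4: 4, 5: 5}     # Example 4.15
identity = {x: x for x in Y}                  # identity on {0, ..., 5}
inj = {0: 0, 1: 1, 2: 2}                      # into {0, 1, 2, 3}
sur = {0: 0, 1: 0, 2: 1}                      # onto {0, 1}

print(is_injective(f), is_surjective(f, Y))           # neither property
print(is_bijective(identity, Y))                      # bijective
print(is_injective(inj), is_surjective(inj, range(4)))
print(is_injective(sur), is_surjective(sur, range(2)))
```

Note that surjectivity depends on the target set that is passed in, in line with the remark that source and target are part of a function.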
Another way to characterize injective and surjective functions is by using their
behaviour under composition with other functions. Roughly speaking, we can cancel
injective functions on the left-hand side of functional equations and surjective
functions on the right-hand side. These properties actually characterize
injective and surjective functions and they explain why these types of functions play
a significant role.
Theorem 4.27 (Cancellation) Let $f : X \to Y$ be a function. Then
1. $f$ is injective if and only if for all sets $Z$ and all functions $g, h : Z \to X$ we
have that $f \circ g = f \circ h$ implies $g = h$,
2. $f$ is surjective if and only if for all sets $Z$ and all functions $g, h : Y \to Z$ we
have that $g \circ f = h \circ f$ implies $g = h$.
Proof.
1. "$\Longrightarrow$" Let $f$ be injective and let $g, h : Z \to X$ be two functions with $f \circ g = f \circ h$.
By Propositions 4.19 and 4.21 we obtain
$f(g(z)) = (f \circ g)(z) = (f \circ h)(z) = f(h(z))$
for all $z \in Z$ and hence $g(z) = h(z)$ follows for all $z \in Z$ due to injectivity of
$f$ by Proposition 4.25. Again by Proposition 4.19 we obtain $g = h$.
"$\Longleftarrow$" Now let us assume that for all functions $g, h : Z \to X$ we have that
$f \circ g = f \circ h$ implies $g = h$. Let us now choose $Z = \{0\}$ (or any other non-empty
set) and let us consider for any $x \in X$ the constant function $c_x : Z \to X, z \mapsto x$.
Let now $x_1, x_2 \in X$ with $f(x_1) = f(x_2)$. Then by Proposition 4.21
$(f \circ c_{x_1})(z) = f(c_{x_1}(z)) = f(x_1) = f(x_2) = f(c_{x_2}(z)) = (f \circ c_{x_2})(z)$
follows for all $z \in Z$. By Proposition 4.19 this means $f \circ c_{x_1} = f \circ c_{x_2}$ and
hence by assumption $c_{x_1} = c_{x_2}$. This implies again by Proposition 4.19 that
we obtain $x_1 = c_{x_1}(z) = c_{x_2}(z) = x_2$ for any $z \in Z$. Hence we have proved by
Proposition 4.25 that $f$ is injective.
2. We leave this proof to the reader (see Problem 4.5).
□
Another important observation is that injective, surjective and bijective functions
are all closed under composition.
Corollary 4.28 Let $f : X \to Y$ and $g : Y \to Z$ be functions. Then we obtain the
following:
1. If $f$ and $g$ are injective, then $g \circ f$ is injective.
2. If $f$ and $g$ are surjective, then $g \circ f$ is surjective.
3. If $f$ and $g$ are bijective, then $g \circ f$ is bijective.
All these statements follow from Propositions 4.11 and 4.13. Another interesting
question is when the inverse relation of a function is actually a function.
Proposition 4.29 (Inverse function) Let $f : X \to Y$ be a function with
$R := \mathrm{graph}(f)$. Then the inverse relation $R^{-1} \subseteq Y \times X$ is a function if and only if $f$ is
bijective.
Proof. "$\Longleftarrow$" Let $f$ be a bijective function, i.e. $R$ is left and right total and left
and right unique. Then $R^{-1}$ is also left and right total and left and right unique by
Propositions 4.11 and 4.13. Thus, $R^{-1}$ is, in particular, a function.
"$\Longrightarrow$" If $R^{-1}$ is a function, then it is left total and right unique. This implies that
$R$ is right total and left unique by Propositions 4.11 and 4.13. Moreover, since $f$
is a function, $R$ is also left total and right unique. Altogether, this shows that $f$ is
bijective. □
If $f : X \to Y$ is a bijective function with $R := \mathrm{graph}(f)$, then the inverse
relation $R^{-1} \subseteq Y \times X$ is a function too by this result. We denote this function
by $f^{-1} : Y \to X$ and we call it the inverse function of $f$. If $f : X \to Y$ is only
an injective function, then the inverse relation $R^{-1}$ is only a partial function (see
Problem 4.6). It is common practice in mathematics to denote this partial function
also by $f^{-1}$, written $f^{-1} :\subseteq Y \to X$ in the notation for partial functions. This partial
function can also be considered as a function
of type $f^{-1} : \mathrm{range}(f) \to X$. In other words, the inverse of an injective function
$f$ always exists as a function with $\mathrm{range}(f)$ as source set. We obtain the following
result as a corollary of Proposition 4.10.
Corollary 4.30 (Inverse function) Let $f : X \to Y$ and $g : Y \to Z$ be bijective functions. Then
1. $f^{-1}$ is bijective and $(f^{-1})^{-1} = f$.
2. $g \circ f$ is bijective and $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
3. $f \circ f^{-1} = \mathrm{id}_Y$ and $f^{-1} \circ f = \mathrm{id}_X$.
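For finite sets the three statements of Corollary 4.30 can be checked mechanically. The following is only a small sketch, assuming that functions are stored as Python dictionaries; the helper names `compose`, `inverse` and `identity` and the example data are hypothetical and not part of the notes.

```python
def compose(g, f):
    """Return g ∘ f (first f, then g) for finite functions stored as dicts."""
    return {x: g[f[x]] for x in f}


def inverse(f):
    """Return the inverse of an injective dict-function, defined on its range."""
    inv = {y: x for x, y in f.items()}
    if len(inv) != len(f):
        raise ValueError("f is not injective, so the inverse relation is not a function")
    return inv


def identity(X):
    """Return the identity function on the finite set X."""
    return {x: x for x in X}


# Hypothetical example data: f : X -> Y and g : Y -> Z are bijections.
f = {1: "a", 2: "b", 3: "c"}
g = {"a": 10, "b": 30, "c": 20}

assert inverse(inverse(f)) == f                                   # statement 1
assert inverse(compose(g, f)) == compose(inverse(f), inverse(g))  # statement 2
assert compose(f, inverse(f)) == identity(g.keys())               # f ∘ f^{-1} = id_Y
assert compose(inverse(f), f) == identity(f.keys())               # f^{-1} ∘ f = id_X
```

Note the reversed order in statement 2: to undo "first f, then g" one has to undo g first.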
4. Relations and Functions
The bijective functions of type $f : X \to X$ (with identical source and target set) have particularly nice properties. They form what is called the symmetric group on X. We mention all the relevant properties.
Corollary 4.31 (Symmetric group) Let X be a set. Then we obtain for all bijective $f, g, h : X \to X$ the following:
1. $(f \circ g) \circ h = f \circ (g \circ h)$ (associative)
2. $f \circ \mathrm{id}_X = \mathrm{id}_X \circ f = f$ (identity)
3. $f \circ f^{-1} = f^{-1} \circ f = \mathrm{id}_X$ (inverse)
The bijective functions $f : X \to X$ are also called permutations. This terminology is in particular used if X is a finite set. This is because a bijective map $f : X \to X$ actually permutes the elements of X. The bijective function in Figure 4.7 is a typical example of a permutation on a finite set.
Problems
4.5 Let $f : X \to Y$ be a function. Prove the following:
1. f is injective if and only if for all $x, y \in X$ we have that $f(x) = f(y)$ implies $x = y$,
2. f is surjective if and only if for all $y \in Y$ there exists an $x \in X$ with $f(x) = y$,
3. f is bijective if and only if for all $y \in Y$ there exists exactly one $x \in X$ with $f(x) = y$,
4. f is surjective if and only if for all functions $g, h : Y \to Z$ we have that $g \circ f = h \circ f$ implies $g = h$.
4.6 Let $f : X \to Y$ be a function with $R := \mathrm{graph}(f)$. Prove that the inverse relation $R^{-1} \subseteq Y \times X$ is a partial function of type $f^{-1} : \subseteq Y \to X$ if and only if f is injective. Prove that for injective f one obtains $\mathrm{dom}(f^{-1}) = \mathrm{range}(f)$. Hence, one can also consider this partial function as a function $f^{-1} : \mathrm{range}(f) \to X$.
4.7 Let X and Y be non-empty sets. Prove that the canonical projections
$$p_X : X \times Y \to X, (x, y) \mapsto x \quad \text{and} \quad p_Y : X \times Y \to Y, (x, y) \mapsto y$$
are both surjective.
4.8 Let $f_i : X_i \to Y_i$ be functions for $i \in \{1, 2\}$. Then we define the product function by
$$f_1 \times f_2 : X_1 \times X_2 \to Y_1 \times Y_2, (x_1, x_2) \mapsto (f_1(x_1), f_2(x_2)).$$
Prove the following:
1. $f_1$ and $f_2$ injective $\Longrightarrow$ $f_1 \times f_2$ injective,
2. $f_1$ and $f_2$ surjective $\Longrightarrow$ $f_1 \times f_2$ surjective,
3. $f_1$ and $f_2$ bijective $\Longrightarrow$ $f_1 \times f_2$ bijective,
4. $f_1$ and $f_2$ bijective $\Longrightarrow$ $(f_1 \times f_2)^{-1} = f_1^{-1} \times f_2^{-1}$.
4.9 Let X be a set. Prove the following:
1. The union $U : 2^X \times 2^X \to 2^X, (A, B) \mapsto A \cup B$ is surjective, but not injective in general.
2. The intersection $I : 2^X \times 2^X \to 2^X, (A, B) \mapsto A \cap B$ is surjective, but not injective in general.
3. The complement $C : 2^X \to 2^X, A \mapsto X \setminus A$ is bijective.
4.10 Let $f : X \to Y$ and $g : Y \to Z$ be functions. Prove the following:
1. $g \circ f$ surjective $\Longrightarrow$ g surjective,
2. $g \circ f$ injective $\Longrightarrow$ f injective.
4.11 Prove that for each function $f : X \to Y$ there exists a set Z and a surjective function $g : X \to Z$ and an injective function $h : Z \to Y$ such that $f = h \circ g$.
4.5 Families, Sequences and Restrictions

In this section we just introduce some further terminology that is related to the source set of a function. A sequence in X is just another name for a function $f : \mathbb{N} \to X$ and a family in X indexed by I is just a function $f : I \to X$. There are special ways of denoting such functions.
Definition 4.32 (Family and sequence) Let I and X be non-empty sets and let $x_i \in X$ for each $i \in I$. Then $(x_i)_{i \in I}$ is just another way of writing the function
$$f : I \to X, i \mapsto x_i$$
and this function is called a family in X (indexed by I). A family $(x_n)_{n \in \mathbb{N}}$ in X that is indexed by $\mathbb{N}$ is called a sequence in X.
The notation $(x_n)_{n \in \mathbb{N}}$ for sequences can easily be read as a generalization of the notation $(x_1, x_2, ..., x_n)$ for n-tuples, since a sequence $(x_n)_{n \in \mathbb{N}}$ can be considered in some sense as the infinite tuple
$$(x_0, x_1, x_2, x_3, ...).$$
However, one should keep in mind that formally we mean by $(x_n)_{n \in \mathbb{N}}$ the function $f : \mathbb{N} \to X, n \mapsto x_n$. Sometimes sequences are also written as $\{x_n\}_{n \in \mathbb{N}}$, but this notation is misleading since one has to distinguish a sequence $(x_n)_{n \in \mathbb{N}}$ (which is a function $f : \mathbb{N} \to X$) from the set $\{x_n : n \in \mathbb{N}\}$ (which is, in fact, nothing but $\mathrm{range}(f)$). See also Problem 4.12. We mention that the terminology of an indexed family of sets $(X_i)_{i \in I}$ naturally falls under the terminology of a family introduced here. If $X := \bigcup_{i \in I} X_i$, then $(X_i)_{i \in I}$ can be seen as a family in $2^X$ indexed by I. In other words, what we mean by $(X_i)_{i \in I}$ is exactly the function $f : I \to 2^X, i \mapsto X_i$.
Occasionally, one is interested in changing the source set of a function. Since the
source set is part of what constitutes the function, this might change the properties
of the function.
Definition 4.33 (Restriction and extension) Let $f : X \to Y$ be a function and let $A \subseteq X$. Then we define the restriction of f to A by
$$f|_A : A \to Y, x \mapsto f(x).$$
In this situation f is also called an extension of $f|_A$.
So, in other words, the restriction $f|_A$ of f to A is simply obtained by leaving f as it is, but by allowing only inputs from the (potentially smaller) source set A. In this way one can cut off pieces of f that stop f from being injective or from having other properties. We give an example.
Example 4.34
1. We consider the function $f : X \to Y$ from Example 4.15. This function f is not injective, since $f(0) = f(1) = 1$. By restricting f to either $A = \{0, 2, 3, 4, 5\}$ or to $B = \{1, 2, 3, 4, 5\}$ we obtain restrictions $f|_A : A \to Y$ and $f|_B : B \to Y$ that are both injective.
2. We consider the square function $f : \mathbb{Z} \to \mathbb{Z}, z \mapsto z^2$, which is not injective since, for instance, $f(-1) = f(1) = 1$. If we restrict f to $\mathbb{N}$, then the resulting restriction $f|_{\mathbb{N}} : \mathbb{N} \to \mathbb{Z}$ is injective.
Later we will prove in Proposition 4.51 that any function $f : X \to Y$ has an injective restriction $f|_A : A \to Y$ with the same range, i.e. such that $\mathrm{range}(f) = \mathrm{range}(f|_A)$. However, this proof requires the Axiom of Choice.
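The second example can be replayed on a finite piece of $\mathbb{Z}$. This is only an illustrative sketch: functions are modelled as dictionaries, and the names `restrict`, `is_injective`, `Z_part` and `N_part` are hypothetical helpers introduced here.

```python
def restrict(f, A):
    """Restriction f|_A of a dict-function f to a subset A of its source set."""
    return {x: f[x] for x in A}


def is_injective(f):
    """A finite function is injective iff no value is attained twice."""
    return len(set(f.values())) == len(f)


# A finite window {-3, ..., 3} of the integers stands in for Z.
Z_part = range(-3, 4)
square = {z: z * z for z in Z_part}

# Keep only the non-negative inputs, i.e. the part of N inside the window.
N_part = [z for z in Z_part if z >= 0]
square_on_N = restrict(square, N_part)

assert not is_injective(square)      # (-1)^2 == 1^2
assert is_injective(square_on_N)     # the restriction is injective
assert set(square.values()) == set(square_on_N.values())  # same range
```

In this particular case the restriction even keeps the range unchanged, which matches the kind of restriction announced for Proposition 4.51.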
Problems
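4.12 Show that there are two sequences $(x_i)_{i \in \mathbb{N}}$ and $(y_i)_{i \in \mathbb{N}}$ in $\mathbb{N}$ such that
$$(x_i)_{i \in \mathbb{N}} \neq (y_i)_{i \in \mathbb{N}} \quad \text{and} \quad \{x_i : i \in \mathbb{N}\} = \{y_i : i \in \mathbb{N}\}.$$
4.13 Let X be a set. We consider the union map
$$U : 2^X \times 2^X \to 2^X, (A, B) \mapsto A \cup B.$$
Find a restriction $U|_Y$ of U that is bijective.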
4.6 Images and Preimages
When we work with functions $f : X \to Y$ we are often not just interested in single function values, but we would like to know how a function f behaves on certain subsets $A \subseteq X$ or $B \subseteq Y$. In order to express such properties, we define the image of a set under a function and the preimage of a set under a function.
Definition 4.35 (Image and preimage) Let $f : X \to Y$ be a function and let $A \subseteq X$ and $B \subseteq Y$. Then we define
1. $f(A) := \{y \in Y : (\exists x \in A)\ f(x) = y\}$, the image of A under f,
2. $f^{-1}(B) := \{x \in X : f(x) \in B\}$, the preimage of B under f.
In other words, the values in the image $f(A)$ are all the function values of f that one obtains for inputs from A and the set $f^{-1}(B)$ is the set of all inputs that yield outputs in B. In particular, we obtain $\mathrm{range}(f) = f(X)$ and $X = f^{-1}(Y)$.
Figure 4.8: Image $B = f(A)$ and preimage $f^{-1}(B)$
The diagram in Figure 4.8 illustrates the situation that we get if we start with a function $f : X \to Y$ and a set $A \subseteq X$: in a first step we consider the image $B = f(A)$ and in a second step the preimage $f^{-1}(B)$. The diagram illustrates that the preimage $f^{-1}(B)$ is potentially larger than the set A that we started with. This is because elements from X that are not in A can also potentially be mapped to $B = f(A)$. So, what we can say is that $A \subseteq f^{-1}(f(A))$. A proper proof of this fact is requested in Problem 4.19.
Correspondingly, the diagram in Figure 4.9 illustrates the situation that we get if we start with a function $f : X \to Y$ and a set $B \subseteq Y$: in a first step we consider the preimage $A = f^{-1}(B)$ and in a second step we consider the image $f(A)$. In general, we only get that $f(f^{-1}(B)) \subseteq B$ and potentially $f(f^{-1}(B))$ is smaller than the set B we started with. This is because some elements of B might not be in the range of f. A proper proof of this fact is the subject of Problem 4.19.
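Both inclusions can be strict, which a small finite example makes visible. The following sketch assumes that a function is stored as a dictionary whose keys form the source set X; the names `image`, `preimage` and the example data are hypothetical.

```python
def image(f, A):
    """f(A): all values f(x) for inputs x in A."""
    return {f[x] for x in A}


def preimage(f, B):
    """f^{-1}(B): all inputs x in X whose value f(x) lies in B."""
    return {x for x in f if f[x] in B}


# f : X -> Y with X = {0, 1, 2} and Y = {"a", "b", "c"}.
# f is neither injective (0 and 1 collide) nor surjective ("c" is never hit).
f = {0: "a", 1: "a", 2: "b"}

A = {0}
back_and_forth = preimage(f, image(f, A))   # f^{-1}(f(A))

B = {"b", "c"}
forth_and_back = image(f, preimage(f, B))   # f(f^{-1}(B))

assert A < back_and_forth    # strict subset: 1 is also mapped into f(A)
assert forth_and_back < B    # strict subset: "c" is not in the range of f
```

The first strictness comes from the failure of injectivity and the second from the failure of surjectivity, in line with Problem 4.19.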
It is important to emphasize that $f^{-1}$ in the definition of the preimage does not refer to the inverse function, but the notation $f^{-1}$ is overloaded with two different meanings. If $f^{-1}$ is used together with a set B, such as in $f^{-1}(B)$, then it refers to the preimage of B under f, which always exists, and if $f^{-1}$ is used with a single value $y \in Y$, such as in $f^{-1}(y)$, then it refers to the inverse function of f, which does not need to exist. We mention that there is a special name for the sets $f^{-1}(\{y\})$.
Figure 4.9: Preimage $A = f^{-1}(B)$ and image $f(A)$
Definition 4.36 (Fiber) Let $f : X \to Y$ be a function and $y \in Y$. Then $f^{-1}(\{y\})$ is called the fiber over y.
In general, the fiber over y contains many elements, namely all those elements $x \in X$ that are mapped by f to y. If the inverse function $f^{-1}$ exists, then $f^{-1}(B)$ is the same thing as the image of B under the inverse function $f^{-1}$ and hence this overloading of notation is justified. We capture this for the special case of fibers in the following proposition.
Proposition 4.37 (Inverse function and preimage) If $f : X \to Y$ is an injective function, then
$$f^{-1}(\{y\}) = \{f^{-1}(y)\}$$
for all $y \in \mathrm{range}(f)$.
Here $f^{-1}$ appears in two different meanings: on the left-hand side of the equality it appears as notation for the preimage, on the right-hand side of the equality it appears as notation for the inverse function. We recall that for injective f the function $f^{-1}$ can either be considered as a partial function $f^{-1} : \subseteq Y \to X$ or as an ordinary function $f^{-1} : \mathrm{range}(f) \to X$. We leave the obvious proof of the proposition to the reader. It is very important to keep in mind that the preimage $f^{-1}(B)$ exists irrespective of whether the inverse function $f^{-1}$ exists or not.
In a context where the values might be sets as well, it is better to use a slightly different notation in order to avoid confusion. Some authors write $f[A]$ for the image and $f^{-1}[B]$ for the preimage in such a context. The image is sometimes also called forward image and the preimage is called inverse image as well. As a first result we mention a monotonicity property of the image and the preimage that shows that both preserve the subset relation.
Proposition 4.38 (Monotonicity of image and preimage) Let $f : X \to Y$ be a function and let $A, B \subseteq X$ and $C, D \subseteq Y$. Then
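1. $f(\emptyset) = \emptyset$, $f^{-1}(\emptyset) = \emptyset$,
2. $A \subseteq B \Longrightarrow f(A) \subseteq f(B)$,
3. $C \subseteq D \Longrightarrow f^{-1}(C) \subseteq f^{-1}(D)$.
Proof.
1. This property is easy to verify.
2. We leave this one to the reader (see Problem 4.14).
3. Let C and D be subsets of Y with $C \subseteq D$. We need to prove $f^{-1}(C) \subseteq f^{-1}(D)$. Let $x \in f^{-1}(C)$. This means $f(x) \in C$. Since $C \subseteq D$, we obtain $f(x) \in D$. But this means $x \in f^{-1}(D)$.
□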
Images and preimages are not completely independent constructions. There are some important relations between images and preimages, which are studied in Problems 4.17 and 4.19. For many applications in mathematics it is important to understand how set-theoretical operations behave with respect to images and preimages. The rough rule of thumb is that preimages are much better behaved than images and images basically only preserve unions. We make this precise in the following proposition.
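Proposition 4.39 (Image, preimage and set operations) Let $f : X \to Y$ be a function and let $(A_i)_{i \in I}$ be an indexed family of subsets of X and let $(B_i)_{i \in I}$ be an indexed family of subsets of Y. Let $A, B \subseteq X$ and $C, D \subseteq Y$. Then
1. $f(\bigcup_{i \in I} A_i) = \bigcup_{i \in I} f(A_i)$,
2. $f(\bigcap_{i \in I} A_i) \subseteq \bigcap_{i \in I} f(A_i)$,
3. $f(A \setminus B) \supseteq f(A) \setminus f(B)$,
4. $f^{-1}(\bigcup_{i \in I} B_i) = \bigcup_{i \in I} f^{-1}(B_i)$,
5. $f^{-1}(\bigcap_{i \in I} B_i) = \bigcap_{i \in I} f^{-1}(B_i)$,
6. $f^{-1}(C \setminus D) = f^{-1}(C) \setminus f^{-1}(D)$.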
Proof.
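1. Let $(A_i)_{i \in I}$ be an indexed family of sets. Then we obtain for all $y \in Y$
$$\begin{aligned}
y \in f(\bigcup_{i \in I} A_i) &\iff (\exists x \in \bigcup_{i \in I} A_i)\ f(x) = y \\
&\iff (\exists i \in I)(\exists x \in A_i)\ f(x) = y \\
&\iff (\exists i \in I)\ y \in f(A_i) \\
&\iff y \in \bigcup_{i \in I} f(A_i).
\end{aligned}$$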
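This shows $f(\bigcup_{i \in I} A_i) = \bigcup_{i \in I} f(A_i)$.
2. to 4. We leave these proofs to the reader (see Problem 4.14).
5. Let $(B_i)_{i \in I}$ be an indexed family of sets. Then we obtain for all $x \in X$
$$\begin{aligned}
x \in f^{-1}(\bigcap_{i \in I} B_i) &\iff f(x) \in \bigcap_{i \in I} B_i \\
&\iff (\forall i \in I)\ f(x) \in B_i \\
&\iff (\forall i \in I)\ x \in f^{-1}(B_i) \\
&\iff x \in \bigcap_{i \in I} f^{-1}(B_i).
\end{aligned}$$
This shows $f^{-1}(\bigcap_{i \in I} B_i) = \bigcap_{i \in I} f^{-1}(B_i)$.
6. We leave this proof to the reader (see Problem 4.14).
□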
It is important to point out that the image does not preserve intersections and differences, but we only have the inclusions given in 2. and 3. It is easy to find examples that show that the other inclusions are not valid in general. However, for injective functions one can prove somewhat more (see Problem 4.15). Problem 4.16 is about how restrictions affect the preimage. In Problem 4.20 we discuss two maps $f_*$ and $f^*$, which are induced by the image and preimage, respectively. We close this section with a result that shows how images and preimages of compositions can be determined.
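One of the easy counterexamples mentioned above is a constant function on a two-element set. The following sketch uses dictionaries as functions; `image`, `preimage` and the concrete sets are hypothetical illustration data, not taken from the notes.

```python
def image(f, A):
    """f(A): all values f(x) for inputs x in A."""
    return {f[x] for x in A}


def preimage(f, B):
    """f^{-1}(B): all inputs x whose value f(x) lies in B."""
    return {x for x in f if f[x] in B}


# A constant (hence non-injective) function f : {0, 1} -> {"a", "b"}.
f = {0: "a", 1: "a"}

# Intersections: f(A ∩ B) can be strictly smaller than f(A) ∩ f(B).
A, B = {0}, {1}
assert image(f, A & B) == set()
assert image(f, A) & image(f, B) == {"a"}

# Differences: f(A2 \ B2) can be strictly larger than f(A2) \ f(B2).
A2, B2 = {0, 1}, {1}
assert image(f, A2 - B2) == {"a"}
assert image(f, A2) - image(f, B2) == set()

# Preimages, in contrast, commute with intersection and difference.
C, D = {"a", "b"}, {"a"}
assert preimage(f, C & D) == preimage(f, C) & preimage(f, D)
assert preimage(f, C - D) == preimage(f, C) - preimage(f, D)
```

Both failures disappear for injective functions, which is the content of Problem 4.15.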
Proposition 4.40 (Image, preimage and composition) Let $f : X \to Y$ and $g : Y \to Z$ be functions and let $A \subseteq X$ and $B \subseteq Z$. Then the following holds:
1. $(g \circ f)(A) = g(f(A))$,
2. $(g \circ f)^{-1}(B) = f^{-1}(g^{-1}(B))$.
We leave the proof to the reader (see Problem 4.21).
Problems
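4.14 Let $f : X \to Y$ be a function and let $(A_i)_{i \in I}$ be an indexed family of subsets of X and let $(B_i)_{i \in I}$ be an indexed family of subsets of Y. Let $A, B \subseteq X$ and $C, D \subseteq Y$. Prove the following:
1. $A \subseteq B \Longrightarrow f(A) \subseteq f(B)$,
2. $f(\bigcap_{i \in I} A_i) \subseteq \bigcap_{i \in I} f(A_i)$,
3. $f(A \setminus B) \supseteq f(A) \setminus f(B)$,
4. $f^{-1}(\bigcup_{i \in I} B_i) = \bigcup_{i \in I} f^{-1}(B_i)$,
5. $f^{-1}(C \setminus D) = f^{-1}(C) \setminus f^{-1}(D)$.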
4.15 Let $f : X \to Y$ be an injective function and $A, B \subseteq X$. Prove the following:
1. $f(A \cap B) = f(A) \cap f(B)$,
2. $f(A \setminus B) = f(A) \setminus f(B)$.
4.16 Let $f : X \to Y$ be a function and let $A \subseteq X$ and $B \subseteq Y$. Prove that
$$(f|_A)^{-1}(B) = A \cap f^{-1}(B).$$
4.17 Let $f : X \to Y$ be a function and let $A \subseteq X$ and $B \subseteq Y$. Prove that
$$f(A) \subseteq B \iff A \subseteq f^{-1}(B).$$
4.18 Let $f : X \to Y$ be a function. Prove the following:
1. f is injective if and only if $f^{-1}(\{y\})$ contains at most one element for each $y \in Y$.
2. f is surjective if and only if $f^{-1}(\{y\})$ contains at least one element for each $y \in Y$.
3. f is bijective if and only if $f^{-1}(\{y\})$ contains exactly one element for each $y \in Y$.
4.19 Let $f : X \to Y$ be a function. Prove the following:
1. $A \subseteq f^{-1}(f(A))$ for each set $A \subseteq X$,
2. $f(f^{-1}(B)) \subseteq B$ for each set $B \subseteq Y$,
3. f is injective if and only if $A = f^{-1}(f(A))$ for each set $A \subseteq X$,
4. f is surjective if and only if $B = f(f^{-1}(B))$ for each set $B \subseteq Y$.
4.20 For each function $f : X \to Y$ we define two associated functions
1. $f_* : 2^X \to 2^Y, A \mapsto f(A)$ (image map)
2. $f^* : 2^Y \to 2^X, B \mapsto f^{-1}(B)$ (preimage map)
Prove the following:
1. f is injective if and only if $f_*$ is injective if and only if $f^*$ is surjective,
2. f is surjective if and only if $f_*$ is surjective if and only if $f^*$ is injective,
3. f is bijective if and only if $f_*$ is bijective if and only if $f^*$ is bijective,
4. If f is bijective then $(f_*)^{-1} = f^*$.
4.21 Let $f : X \to Y$ and $g : Y \to Z$ be functions and let $A \subseteq X$ and $B \subseteq Z$. Prove that the following holds:
1. $(g \circ f)(A) = g(f(A))$,
2. $(g \circ f)^{-1}(B) = f^{-1}(g^{-1}(B))$.
4.7 Set of Functions

In this section we discuss the set $Y^X$ of all functions $f : X \to Y$ for two given sets X and Y. As we will see, this concept generalizes the concept of a power set in some sense and it can be considered as an exponentiation operation for sets.
Definition 4.41 (Set of functions) Let X and Y be sets. Then we denote by $Y^X$ the set of all functions $f : X \to Y$ and by $X!$ the set of bijective functions $f : X \to X$.
Some authors denote the set of bijective functions also by $\mathcal{S}_X$, since it is also called the symmetric group on X, as mentioned in Corollary 4.31. There is one important function that comes associated with the function set $Y^X$ and which is called evaluation.
Definition 4.42 (Evaluation) Let X and Y be sets. Then we define the evaluation map by
$$\mathrm{ev} : Y^X \times X \to Y, (f, x) \mapsto f(x).$$
Sometimes the evaluation map is also called apply operation since it applies the first argument (which is a function) to the second argument (which is a suitable input). The next theorem is telling us that we can identify the set $Z^{X \times Y}$ with the set $(Z^Y)^X$. This corresponds to the arithmetic rule that for natural numbers $x, y, z \in \mathbb{N}$ we have $(z^y)^x = z^{x \cdot y}$. The bijection that maps $Z^{X \times Y}$ to $(Z^Y)^X$ is called currying operation since it has been studied by Haskell Curry (and indeed already earlier by others such as Moses Schönfinkel).
Theorem 4.43 (Currying) Let X, Y and Z be sets. Then the so-called currying operation
$$C : Z^{X \times Y} \to (Z^Y)^X,$$
which is defined by $C(f)(x)(y) := f(x, y)$ for all functions $f : X \times Y \to Z$ and all $x \in X$ and $y \in Y$, is bijective.
Proof. Let $g : X \to Z^Y$ be a function. Then we can define a function $f : X \times Y \to Z$ by
$$f(x, y) := g(x)(y)$$
for all $x \in X$ and $y \in Y$. For this function f we obtain
$$C(f)(x)(y) = f(x, y) = g(x)(y)$$
for all $x \in X$ and $y \in Y$. Hence $C(f)(x) = g(x)$ for all $x \in X$, which means $C(f) = g$. This shows that C is surjective. Now, let $f_1, f_2 : X \times Y \to Z$ be two functions with $C(f_1) = C(f_2)$. This implies $C(f_1)(x) = C(f_2)(x)$ for all $x \in X$ and hence
$$f_1(x, y) = C(f_1)(x)(y) = C(f_2)(x)(y) = f_2(x, y)$$
for all $x \in X$ and $y \in Y$. This means $f_1 = f_2$. Hence C is injective. Altogether, we have proved that C is bijective. □
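Currying is directly available in programming languages with first-class functions. The following is a sketch only: `curry` plays the role of C, `uncurry` is the inverse map constructed in the surjectivity part of the proof, and `ev` is the evaluation map; all three names and the sample function are assumptions made for illustration.

```python
def curry(f):
    """C(f): turn f : X x Y -> Z into a function X -> (Y -> Z)."""
    return lambda x: lambda y: f(x, y)


def uncurry(g):
    """Inverse direction: turn g : X -> (Y -> Z) into a function X x Y -> Z."""
    return lambda x, y: g(x)(y)


def ev(h, y):
    """Evaluation map: apply the function h to the input y."""
    return h(y)


# Hypothetical sample function on small finite sets X and Y.
def f(x, y):
    return 10 * x + y


X = range(3)
Y = range(4)

# C(f)(x)(y) = f(x, y), and uncurrying gives back f on all of X x Y.
assert all(curry(f)(x)(y) == f(x, y) for x in X for y in Y)
assert all(uncurry(curry(f))(x, y) == f(x, y) for x in X for y in Y)

# The identity ev ∘ (C(f) x id_Y) = f from Problem 4.22, checked pointwise.
assert all(ev(curry(f)(x), y) == f(x, y) for x in X for y in Y)
```

Pointwise checks on a finite grid are of course no proof, but they show how the three maps fit together.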
Now we want to show in which sense the exponentiation $Y^X$ generalizes the power set construction. There is a particular function $\chi_A : X \to \{0, 1\}$ for each set $A \subseteq X$ which is called the characteristic function.
Definition 4.44 (Characteristic function) Let X be a set. For each subset $A \subseteq X$ we define the characteristic function $\chi_A : X \to \{0, 1\}$ of A by
$$\chi_A : X \to \{0, 1\}, x \mapsto \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{otherwise} \end{cases}.$$
The characteristic function $\chi_A$ is called so, since one can think about it as if it answers the question of whether $x \in A$ by the result 1 for "true" and 0 for "false". In particular, we have
$$x \in A \iff \chi_A(x) = 1$$
for all $x \in X$. The following result shows now in which sense the function set construction $Y^X$ generalizes the power set construction $2^X$. Namely, there is a bijection between $2^X$ and $\{0, 1\}^X$.
Theorem 4.45 (Characteristic function) Let X be a set. Then the following map is bijective:
$$\chi : 2^X \to \{0, 1\}^X, A \mapsto \chi_A.$$
We leave the proof to the reader (see Problem 4.23). This result is telling us that we can somehow identify the power set $2^X$ with the set of functions $\{0, 1\}^X$, where each set $A \in 2^X$ is represented by its characteristic function $\chi_A$. That the map $\chi$ is bijective means that we do not lose any information when we move from the set A to its characteristic function $\chi_A$ or backwards.
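For a small finite set the bijection of Theorem 4.45 can be enumerated completely. In the following sketch a function $X \to \{0, 1\}$ is encoded as a tuple of bits listed in the order of X (essentially a bit vector); the names `chi`, `subset_of`, `power_set` and the set X itself are hypothetical choices for illustration.

```python
from itertools import combinations

X = ("a", "b", "c")


def chi(A):
    """Characteristic function of A, encoded as a tuple of 0/1 values ordered like X."""
    return tuple(1 if x in A else 0 for x in X)


def subset_of(bits):
    """Recover the subset from an encoded function X -> {0, 1}."""
    return frozenset(x for x, b in zip(X, bits) if b == 1)


# All subsets of X, i.e. the power set 2^X.
power_set = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
codes = [chi(A) for A in power_set]

assert len(power_set) == 2 ** len(X)
assert len(set(codes)) == len(power_set)              # chi is injective
assert len(set(codes)) == 2 ** len(X)                 # and hits every 0/1 tuple
assert all(subset_of(chi(A)) == A for A in power_set) # subset_of inverts chi
```

This encoding is the usual reason why subsets of an n-element set are stored as n-bit words.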
Problems
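4.22 Let X, Y and Z be sets. Prove that for any function $f : X \times Y \to Z$ we obtain
$$\mathrm{ev} \circ (C(f) \times \mathrm{id}_Y) = f.$$
Here C denotes the currying operation as defined in Theorem 4.43 and $\mathrm{ev} : Z^Y \times Y \to Z$ denotes the evaluation map. The following diagram illustrates the situation.
(Diagram: $C(f) \times \mathrm{id}_Y : X \times Y \to Z^Y \times Y$ followed by $\mathrm{ev} : Z^Y \times Y \to Z$ equals $f : X \times Y \to Z$.)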
4.23 Let X be a set. Prove that the following map is bijective:
$$\chi : 2^X \to \{0, 1\}^X, A \mapsto \chi_A.$$
4.24 Let X be a set and let $A, B \subseteq X$. Prove the following for all $x \in X$:
1. $\chi_{A \cap B}(x) = \min(\chi_A(x), \chi_B(x))$,
2. $\chi_{A \cup B}(x) = \max(\chi_A(x), \chi_B(x))$,
3. $\chi_{A^c}(x) = 1 - \chi_A(x)$.
Here the complement is understood with respect to X. We denote by $\min, \max : \mathbb{N}^2 \to \mathbb{N}$ the minimum and maximum on natural numbers, respectively, as defined in Example 4.26.
4.25 Let X and Y both be sets with more than one element. Prove the following:
1. The range map $\mathrm{range} : Y^X \to 2^Y, f \mapsto \mathrm{range}(f)$ is not injective. Is it surjective?
2. The graph map $\mathrm{graph} : Y^X \to 2^{X \times Y}, f \mapsto \mathrm{graph}(f)$ is injective, but not surjective.
3. The inversion map $\mathrm{inv} : X! \to X!, f \mapsto f^{-1}$ is bijective.
4.8 The Axiom of Choice
The Axiom of Choice (together with the Continuum Hypothesis)
is probably the most interesting and most discussed axiom in
mathematics after Euclid's Axiom of Parallels.
P. Bernays and A.A. Fraenkel (1958)
There is a particularly important axiom in set theory, which is called the Axiom of Choice. Perhaps it is the most controversial set-theoretical axiom and some mathematicians prefer not to use it, or at least to indicate whenever they use it. However, often this axiom is applied tacitly without even mentioning it. We phrase this axiom in the form of a definition.
Definition 4.46 (Axiom of Choice) The Axiom of Choice is the statement that for any set X there exists a choice function
$$C_X : 2^X \to X$$
with $C_X(A) \in A$ for all non-empty $A \subseteq X$.
What the choice function $C_X$ does is that for any non-empty set $A \subseteq X$ it selects a point $x = C_X(A)$ with the property that $x \in A$. This seems to be a trivial task since any non-empty set A has to have some member x. This is the reason why most mathematicians readily accept the Axiom of Choice. However, from a more constructive point of view, the axiom is debatable, since it does not specify how such a point x shall be chosen in general. The other axioms of set theory, such as the power set axiom, specify in some sense how the object whose existence is
postulated is actually constructed. This is different in case of the Axiom of Choice. The existence of a certain set (a left total and right unique relation that is the graph of a choice function $C_X$) is postulated for each set X without further specification. Indeed, one can prove that the Axiom of Choice directly implies the Principle of Excluded Middle and hence it is not constructive in this sense. We formulate and prove a corresponding theorem.
Theorem 4.47 (Diaconescu-Goodman-Myhill 1975) The Axiom of Choice directly implies the Principle of Excluded Middle (and "directly" here means that this implication can be shown with a direct proof that does not use the principle itself).
Proof. Let P be a proposition. We have to show that $P \vee \neg P$ is true (without using this principle itself). We consider the sets
$$U := \{x \in \{0, 1\} : (x = 0) \vee P\} \quad \text{and} \quad V := \{x \in \{0, 1\} : (x = 1) \vee P\}.$$
Certainly both sets are non-empty since $0 \in U$ and $1 \in V$. Now we consider the set $X := \{0, 1\}$. By the Axiom of Choice there is a choice function $C_X : 2^X \to X$ for X and this function has the property that $C_X(U) \in U$ and $C_X(V) \in V$. Now, if P is true, then $U = V = \{0, 1\}$ and hence $C_X(U) = C_X(V)$. This means that $C_X(U) \neq C_X(V)$ implies $\neg P$. Now we see
$$\begin{aligned}
(C_X(U) \in U) \wedge (C_X(V) \in V) &\Longrightarrow ((C_X(U) = 0) \vee P) \wedge ((C_X(V) = 1) \vee P) \\
&\Longrightarrow (C_X(U) \neq C_X(V)) \vee P \\
&\Longrightarrow \neg P \vee P.
\end{aligned}$$
This was to be proved. □
In the Appendix on Axiomatic Set Theory we have given a slightly different alternative formulation of the Axiom of Choice. In the next section we will see yet another equivalent version. The Axiom of Choice is known to be independent of the other axioms of Zermelo-Fraenkel set theory (which follows from work of Kurt Gödel and Paul Cohen). Hence, it can neither be proven from those nor can it be refuted on the basis of the other axioms. A reasonable perspective is to think about mathematics as a very rich theory that can explore theorems based on the Axiom of Choice and also theorems which are not based on it (or even theorems based on other competing axioms that might contradict the Axiom of Choice). We will adopt this perspective here to some extent and hence we will try to indicate whenever we actually use the Axiom of Choice. We will give some applications soon. We need the notion of a right inverse.
Definition 4.48 (Right inverse) Let $f : X \to Y$ and $g : Y \to X$ be functions.
1. $g : Y \to X$ is called a right inverse of f if $f \circ g = \mathrm{id}_Y$.
2. $g : Y \to X$ is called a left inverse of f if $g \circ f = \mathrm{id}_X$.
3. $g : Y \to X$ is called an inverse of f if g is a left and right inverse of f.
By Corollary 4.30 the inverse function $f^{-1}$ of a function f is an inverse if it exists. However, by Proposition 4.29 the inverse function $f^{-1}$ only exists if f is bijective. We generalize this observation. However, the proof requires the Axiom of Choice.
Theorem 4.49 (Left and right inverses) Let X and Y be non-empty sets and let $f : X \to Y$ be a function. Then
1. f has a right inverse if and only if f is surjective.
2. f has a left inverse if and only if f is injective.
3. f has an inverse if and only if f is bijective.
The proof of statement 1. uses the Axiom of Choice.
Proof.
1. Let us assume that the Axiom of Choice holds. We need to show that for every function $f : X \to Y$ it holds that f has a right inverse if and only if f is surjective. Let us fix some function $f : X \to Y$.
"$\Longleftarrow$" Let f be surjective. Then there is a choice function $C_X : 2^X \to X$ with $C_X(A) \in A$ for each non-empty set $A \subseteq X$. We need to show that f has a right inverse $g : Y \to X$. Since f is surjective, the preimage $f^{-1}(\{y\})$ is a non-empty subset of X for each $y \in Y$ and hence we can define g by
$$g(y) := C_X(f^{-1}(\{y\}))$$
for every $y \in Y$. Then $g(y) \in f^{-1}(\{y\})$, i.e. $f \circ g(y) = y$ for all $y \in Y$, which means $f \circ g = \mathrm{id}_Y$. Hence f has a right inverse g.
"$\Longrightarrow$" If, on the other hand, f has a right inverse $g : Y \to X$, then for each $y \in Y$ we have that $f(g(y)) = f \circ g(y) = \mathrm{id}_Y(y) = y$, hence f is surjective. We do not need the Axiom of Choice for this direction.
2. We leave this proof to the reader (see Problem 4.26).
3. "$\Longleftarrow$" Let f be bijective. Then the inverse function $f^{-1}$ of f exists by Proposition 4.29 and by Corollary 4.30 we obtain $f \circ f^{-1} = \mathrm{id}_Y$ and $f^{-1} \circ f = \mathrm{id}_X$. Hence, the inverse $f^{-1}$ is a right inverse as well as a left inverse of f.
"$\Longrightarrow$" Let f have a right inverse $g : Y \to X$ and a left inverse $h : Y \to X$, i.e. $f \circ g = \mathrm{id}_Y$ and $h \circ f = \mathrm{id}_X$. We obtain by associativity
$$g = \mathrm{id}_X \circ g = (h \circ f) \circ g = h \circ (f \circ g) = h \circ \mathrm{id}_Y = h.$$
Hence g is a left inverse and a right inverse of f, hence it is an inverse. In particular, f is surjective and injective by 1. and 2., hence bijective.
□
The proof of 3. also shows that the inverse of a function f, if it exists, is uniquely determined, namely it is the inverse function $f^{-1}$.
Corollary 4.50 Let $f : X \to Y$ be a bijective function. Then the inverse function $f^{-1} : Y \to X$ is the uniquely determined inverse of f.
We mention another result whose proof requires the Axiom of Choice.
Proposition 4.51 (Injective restriction) The following statement is a consequence of the Axiom of Choice. Let $f : X \to Y$ be a function. Then there exists a subset $A \subseteq X$ such that the restriction $f|_A : A \to Y$ is injective and such that $\mathrm{range}(f) = \mathrm{range}(f|_A)$.
Proof. Let $f : X \to Y$ be a function. We define a function $h : X \to \mathrm{range}(f)$ by $h(x) := f(x)$ for all $x \in X$. This function is surjective and hence it admits a right inverse $g : \mathrm{range}(f) \to X$ by Theorem 4.49, which requires the Axiom of Choice. Let $A := \mathrm{range}(g)$. Then we obtain for each $y \in \mathrm{range}(f)$
$$f|_A \circ g(y) = f(g(y)) = h(g(y)) = y,$$
hence $\mathrm{range}(f|_A) = \mathrm{range}(f)$. Now let $x, y \in A$ with $f|_A(y) = f|_A(x)$. Then there are $w, z \in \mathrm{range}(f)$ such that $g(w) = x$ and $g(z) = y$. Hence
$$w = f|_A \circ g(w) = f|_A(x) = f|_A(y) = f|_A \circ g(z) = z,$$
which implies $x = y$. Hence $f|_A$ is injective. □
The Axiom of Choice has not only lots of important applications in mathematics, but also some counter-intuitive consequences. One of those is the Banach-Tarski Paradox, which is in fact a theorem that follows from the Axiom of Choice. It states that a solid ball in the three-dimensional Euclidean space can be decomposed into finitely many disjoint pieces that can be reassembled to two balls of the same volume as the original ball. And this process can be performed by rotations and other geometrical transformations that do not change the shape of the pieces. The pieces themselves are, however, very complicated and not like solid physical objects.
Problems
4.26 Let X and Y be non-empty sets and let $f : X \to Y$ be a function. Prove that f has a left inverse if and only if f is injective.
4.9 Infinite Products

When we defined the Cartesian product of sets $X \times Y$ we generalized this concept only to finite products $\mathop{\times}_{i=1}^{n} X_i$, but not to products over indexed families of sets. Using functions we can now provide such a generalization.
Definition 4.52 (Product) Let $(X_i)_{i \in I}$ be an indexed family of sets. Then we define the product
$$\prod_{i \in I} X_i := \left\{ f : I \to \bigcup_{i \in I} X_i : (\forall i \in I)\ f(i) \in X_i \right\}.$$
We note that in case that $X_i = X$ for all $i \in I$, we obtain $\prod_{i \in I} X = X^I$. In this sense the exponentiation or function set construction is a special case of the product. We will see in Problem 4.28 that the product also generalizes the finite Cartesian product $\mathop{\times}_{i=1}^{n} X_i$ that we have considered earlier. Now we show that the Axiom of Choice is equivalent to the statement that this product is non-empty whenever the sets $X_i$ are all non-empty.
Theorem 4.53 (Non-empty products) The following statement is equivalent to the Axiom of Choice. For all indexed families $(X_i)_{i \in I}$ we obtain
$$\prod_{i \in I} X_i \neq \emptyset \iff (\forall i \in I)\ X_i \neq \emptyset.$$
Proof. Let $X := \bigcup_{i \in I} X_i$. Let us assume the Axiom of Choice holds. We need to show that the statement in the theorem holds too. We consider both directions.
"$\Longleftarrow$" Let $X_i \neq \emptyset$ for all $i \in I$. We need to prove that there exists a function $f : I \to X$ with $f(i) \in X_i$ for all $i \in I$. By the Axiom of Choice there is a choice function $C_X : 2^X \to X$ for X. Since $X_i \neq \emptyset$ for all $i \in I$, we obtain $C_X(X_i) \in X_i$. Hence, we can define a suitable f by $f(i) := C_X(X_i)$ for all $i \in I$. For this f we obtain $f \in \prod_{i \in I} X_i$ and hence $\prod_{i \in I} X_i \neq \emptyset$.
"$\Longrightarrow$" We prove the contrapositive statement. Let $j \in I$ be such that $X_j = \emptyset$. Then, obviously, there cannot be any function $f : I \to X$ with $f(j) \in X_j$. Hence $\prod_{i \in I} X_i = \emptyset$. For this direction we have not used the Axiom of Choice.
Let us now assume that the statement in the theorem is correct. We prove that under this assumption the Axiom of Choice follows. Let X be a set. We consider the indexed family of sets $(Y_A)_{A \in I}$ with $I := 2^X$ and $Y_A := A$ for each $A \in I$. We have to prove that there is a function $C_X : 2^X \to X$ with $C_X(A) \in A$ for each non-empty $A \subseteq X$. But if $\prod_{A \in I} Y_A$ is non-empty, then there exists a function $f : I \to \bigcup_{A \in I} Y_A$ with $f(A) \in Y_A = A$ for each $A \in I = 2^X$. Hence $C_X(A) := f(A)$ is a suitable choice for each non-empty $A \subseteq X$. Thus, the Axiom of Choice follows from the non-emptiness of $\prod_{A \in I} Y_A$. □
There are many other theorems in mathematics that are, in fact, equivalent to
the Axiom of Choice. We just mention two examples:
1. The statement that each vector space has a basis is equivalent to the Axiom of
Choice. This statement is a crucial and fundamental fact in Linear Algebra.
2. The Theorem of Tychonoff, which is the statement that the product of compact topological spaces is compact, is equivalent to the Axiom of Choice. This theorem is an important theorem in Topology.$^1$
The product of sets comes with associated maps $\mathrm{pr}_j$ for each $j \in I$, which are called the canonical projections:
$$\mathrm{pr}_j : \prod_{i \in I} X_i \to X_j, f \mapsto f(j).$$
These maps are all surjective (see Problem 4.27). The product together with the canonical projections satisfies a so-called universal property that we formulate in the following result.
Theorem 4.54 (Product) Let $(X_i)_{i \in I}$ be an indexed family of sets. For each set Y and each family $(f_i)_{i \in I}$ of functions $f_i : Y \to X_i$ there exists exactly one function $f : Y \to \prod_{i \in I} X_i$ such that
$$f_j = \mathrm{pr}_j \circ f$$
for all $j \in I$.
Proof. We first prove the existence of f. Let Y be a set and let $(f_i)_{i \in I}$ be a family of functions $f_i : Y \to X_i$. Then we define a function $f : Y \to \prod_{i \in I} X_i$ by
$$f(y)(i) := f_i(y)$$
for all $y \in Y$ and $i \in I$. This f is well-defined, since the function $f(y) : I \to \bigcup_{i \in I} X_i$ has the property that $f(y)(i) = f_i(y) \in X_i$ for each $i \in I$, hence $f(y) \in \prod_{i \in I} X_i$ for all $y \in Y$. Now we obtain
$$f_j(y) = f(y)(j) = \mathrm{pr}_j(f(y)) = \mathrm{pr}_j \circ f(y)$$
for each $y \in Y$ and $j \in I$, which means $f_j = \mathrm{pr}_j \circ f$ for each $j \in I$. Now we still need to prove that f is uniquely determined. Hence, let $g : Y \to \prod_{i \in I} X_i$ be some function such that $f_j = \mathrm{pr}_j \circ g$ for all $j \in I$. We have to show that $f = g$. We obtain
$$g(y)(j) = \mathrm{pr}_j \circ g(y) = f_j(y) = \mathrm{pr}_j \circ f(y) = f(y)(j)$$
for all $y \in Y$ and $j \in I$, hence $g(y) = f(y)$ for all $y \in Y$ and this means $f = g$. This completes the proof. □
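The existence part of this proof is a recipe that can be written down directly. In the following sketch an element of the product is modelled as a dictionary indexed by I, and a family $(f_i)_{i \in I}$ as a dictionary of callables; the names `tuple_map`, `pr` and the three sample functions are hypothetical.

```python
def tuple_map(fs):
    """Given a family (f_i) as a dict of callables, return f with f(y)(i) = f_i(y)."""
    return lambda y: {i: f_i(y) for i, f_i in fs.items()}


def pr(j):
    """Canonical projection pr_j: read off the j-th component of a product element."""
    return lambda t: t[j]


# A family of three functions on strings, indexed by I = {"length", "upper", "first"}.
fs = {
    "length": len,
    "upper": str.upper,
    "first": lambda s: s[0],
}
f = tuple_map(fs)

# The defining equation f_j = pr_j ∘ f, checked on a few sample inputs.
for y in ["set", "logic", "relation"]:
    for j in fs:
        assert pr(j)(f(y)) == fs[j](y)
```

Uniqueness corresponds to the observation that a dictionary is completely determined by its entries, so no other f can satisfy all the equations $f_j = \mathrm{pr}_j \circ f$.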
The diagram in Figure 4.10 illustrates the situation of the proof. It is an example of a commutative diagram. For finite products (i.e. finite sets I) we do not have to distinguish between the product $\prod_{i \in I} X_i$ and the product $\mathop{\times}_{i \in I} X_i$ that we have introduced earlier. We leave the proof to the reader (see Problem 4.28). Thus, the product $\prod_{i \in I} X_i$ introduced in this section actually generalizes the finite product $\mathop{\times}_{i \in I} X_i$.
$^1$ Compactness is a notion that is studied in Topology and Analysis. It plays a very important role because compact sets have many properties in common with finite sets, although they are not necessarily finite.
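[Figure 4.10: Commutative diagram for $f_j = \mathrm{pr}_j \circ f$, with arrows $f : Y \to \prod_{i \in I} X_i$, $f_j : Y \to X_j$ and $\mathrm{pr}_j : \prod_{i \in I} X_i \to X_j$.]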
Problems
4.27 Let $(X_i)_{i \in I}$ be a family of sets. Prove that the canonical projections $\mathrm{pr}_j : \prod_{i \in I} X_i \to X_j$ are surjective for all $j \in I$. Prove that these maps are not injective in general.
4.28 Let $I := \mathbb{N}_n = \{1, 2, ..., n\}$ for some $n \geq 1$. We consider the projections
$$p_j : \bigtimes_{i=1}^{n} X_i \to X_j, \quad (x_1, x_2, ..., x_n) \mapsto x_j.$$
Prove that there is exactly one map
$$F : \bigtimes_{i=1}^{n} X_i \to \prod_{i \in I} X_i$$
with the property that $p_j = \mathrm{pr}_j \circ F$. Show that $F$ is bijective.
CHAPTER 5
Cardinality
The infinite! No other question has ever moved so profoundly the spirit of man.
David Hilbert (1862-1943)
5.1 What is the Cardinality of a Set?
Obviously, for some considerations about sets the size of a set matters. This size of a set $X$ is called the cardinality of $X$ and it is often denoted by $|X|$. Some authors also write $\mathrm{card}(X) = |X|$. Defining exactly what kind of quantity $|X|$ is turns out to be somewhat non-trivial and we will not do this here. There is a mathematically precise way to interpret $|X|$ as a so-called cardinal number, which is a quantity that can take natural number values but also many different infinite values. Perhaps surprisingly, we do not have to specify what $|X|$ exactly is in order to work with cardinalities. We will just specify what expressions like $|X| \leq |Y|$ mean without saying what $|X|$ and $|Y|$ actually are. We make this precise in the following definition.
Definition 5.1 (Cardinality) Let $X$ and $Y$ be sets.
1. We write $|X| = |Y|$ and we say that $X$ has the same cardinality as $Y$ if there is a bijective map $f : X \to Y$.
2. We write $|X| \leq |Y|$ and we say that $X$ has smaller or the same cardinality as $Y$ if there is an injective map $f : X \to Y$.
3. We write $|X| < |Y|$ and we say that $X$ has strictly smaller cardinality than $Y$ if $|X| \leq |Y|$ and not $|X| = |Y|$.
To make this clear again: if we write $|X| \leq |Y|$, we are not saying that some sort of number $|X|$ is less than or equal to some sort of number $|Y|$ (although there is a meaningful way to interpret things in this direction), but the expression $|X| \leq |Y|$ is simply a short way to say that there exists an injective function $f : X \to Y$. We give a simple example.
Example 5.2 The two sets $\{1, 2, 3\}$ and $\{A, B, C\}$ (where we assume that $A$, $B$ and $C$ are pairwise distinct objects) are of the same cardinality, i.e. $|\{1, 2, 3\}| = |\{A, B, C\}|$. We can easily verify this using the bijective function $f : \{1, 2, 3\} \to \{A, B, C\}$ defined by $f(1) := A$, $f(2) := B$ and $f(3) := C$.
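For finite sets the conditions of Definition 5.1 can be checked mechanically. The sketch below represents a function as a Python dictionary and tests the map of Example 5.2; the helper names are hypothetical and only serve the illustration.

```python
def is_injective(f: dict) -> bool:
    """No two arguments share a value."""
    return len(set(f.values())) == len(f)


def is_surjective(f: dict, codomain: set) -> bool:
    """Every element of the codomain occurs as a value."""
    return set(f.values()) == codomain


def is_bijective(f: dict, codomain: set) -> bool:
    return is_injective(f) and is_surjective(f, codomain)


# The function f from Example 5.2, written as a dictionary.
f = {1: "A", 2: "B", 3: "C"}
same_cardinality = is_bijective(f, {"A", "B", "C"})

# A map that witnesses neither equality nor "smaller or equal": it is not injective.
collapsing = {1: "A", 2: "A", 3: "C"}
```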
If one accepts the Axiom of Choice, then $|X| \leq |Y|$ is the same as saying that there is a surjective function $g : Y \to X$.
Proposition 5.3 The following statement follows from the Axiom of Choice. Let $X$ and $Y$ be non-empty sets. There exists an injective function $f : X \to Y$ if and only if there exists a surjective function $g : Y \to X$.
Proof. Let $f : X \to Y$ be injective. Then $f$ has a left inverse $g : Y \to X$ by Theorem 4.49 and hence $g \circ f = \mathrm{id}_X$. This implies that $g : Y \to X$ has to be surjective. For the other direction we assume that there exists a surjective function $g : Y \to X$. Then by Theorem 4.49 the Axiom of Choice implies that there exists a right inverse $f : X \to Y$ of $g$, i.e. $g \circ f = \mathrm{id}_X$. This implies that $f$ has to be injective. □
Hence, if one accepts the Axiom of Choice, then it does not matter whether one defines $|X| \leq |Y|$ via injections $f : X \to Y$ or via surjections $g : Y \to X$. If, however, one wants to be rather careful and independent of the Axiom of Choice, then one should use injections here. The previous proposition is not correct for the special case $X = \emptyset$ and $Y \neq \emptyset$. In this case, the only function $f : \emptyset \to Y$ is injective, but there is no function whatsoever of type $g : Y \to \emptyset$, in particular, no surjective one. As a first result we show that any subset $A \subseteq X$ of a set $X$ has smaller or the same cardinality as $X$.
Proposition 5.4 (Inclusion and cardinality) Let $X$ and $Y$ be sets. Then
$$X \subseteq Y \implies |X| \leq |Y|.$$
Proof. Let $X$ and $Y$ be sets with $X \subseteq Y$. We consider the identity $\mathrm{id}_Y$ restricted to $X$, i.e. the function
$$f : X \to Y, \quad x \mapsto x.$$
This function is clearly injective, hence $|X| \leq |Y|$. □
However, one should be careful since $X \subsetneq Y$ does not necessarily mean $|X| < |Y|$. That is, a set $X$ can very well be smaller than $Y$ in the sense that it omits some elements of $Y$ without having a smaller cardinality. We give an example.
Proposition 5.5 (Hilbert's hotel) We have
$$|\mathbb{N}| = |2\mathbb{N}|,$$
where $2\mathbb{N}$ denotes the set of even natural numbers.
Proof. Using the bijective function $f : \mathbb{N} \to 2\mathbb{N}, n \mapsto 2n$ we obtain $|\mathbb{N}| = |2\mathbb{N}|$. □
This example is also known as the Hilbert Hotel Paradox, although it is not really a paradox. The story goes as follows: imagine a hotel with infinitely many rooms numbered by natural numbers $0, 1, 2, 3, ...$. Suppose all the rooms are occupied and there are 5 new guests arriving. Then you can easily create space by asking all the existing guests to move from room number $n$ into room number $n + 5$. Then the rooms $0, 1, 2, 3, 4$ become vacant and can host the 5 new guests. But even if there is a bus arriving with infinitely many guests numbered by natural numbers $0, 1, 2, 3, ...$, one can create enough space. One just asks each guest in room number $n$ to move into room number $2n$. Then only the even room numbers are occupied and all the newly arriving guests can move into the rooms with odd room numbers. One can even continue this game if there are infinitely many buses arriving, one for each natural number $0, 1, 2, 3, ...$, but we do not go into this here. We close this section with a number of further examples.
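The two room reassignments from the story can be written down as functions on room numbers. This is a small sketch on a finite window of rooms; the function names are hypothetical, and sending bus guest number $i$ to room $2i + 1$ is one natural way to use the odd rooms that the text does not spell out.

```python
def move_for_five_guests(n: int) -> int:
    """Existing guest in room n moves to room n + 5."""
    return n + 5


def move_for_bus(n: int) -> int:
    """Existing guest in room n moves to room 2n."""
    return 2 * n


def room_for_bus_guest(i: int) -> int:
    """Assumed assignment: bus guest number i takes the odd room 2i + 1."""
    return 2 * i + 1


# After the first move, rooms 0..4 are free (checked for the first 100 guests).
occupied_after_shift = {move_for_five_guests(n) for n in range(100)}

# After the second move, old and new guests never collide and fill a whole block.
old_guests = {move_for_bus(n) for n in range(1000)}
new_guests = {room_for_bus_guest(i) for i in range(1000)}
```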
Example 5.6 Let $X$, $Y$ and $Z$ be sets.
1. $|2^X| = |\{0, 1\}^X|$, i.e. the power set $2^X$ of $X$ and the set $\{0, 1\}^X$ of functions $f : X \to \{0, 1\}$ have the same cardinality, which follows from Theorem 4.45.
2. $|(Z^Y)^X| = |Z^{X \times Y}|$, which follows from Theorem 4.43.
3. $|Y^X| < |2^{X \times Y}|$, if $X$ and $Y$ are finite sets that both have at least two elements, which follows from Problem 4.25.
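The first item of Example 5.6 can be made concrete for a small finite set. The sketch below assumes the correspondence between a subset and its indicator function (Theorem 4.45 is cited in the text but not reproduced here, so this reading is an assumption), and the three-element set `X` is hypothetical.

```python
from itertools import product

X = ["a", "b", "c"]  # hypothetical finite set


def indicator(subset) -> tuple:
    """The function X -> {0, 1} of a subset, stored as a tuple of values in the order of X."""
    return tuple(1 if x in subset else 0 for x in X)


def subset_of(chi: tuple) -> frozenset:
    """The subset of X on which the function chi takes the value 1."""
    return frozenset(x for x, bit in zip(X, chi) if bit == 1)


# All functions X -> {0, 1}, and the subsets they correspond to.
all_functions = list(product([0, 1], repeat=len(X)))
all_subsets = {subset_of(chi) for chi in all_functions}

# The two maps are inverse to each other, so both are bijective.
round_trip_ok = all(indicator(subset_of(chi)) == chi for chi in all_functions)
```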
Problems
5.1 Let $X_1$, $X_2$, $Y_1$ and $Y_2$ be sets. Prove the following:
1. $(|X_1| \leq |X_2|$ and $|Y_1| \leq |Y_2|) \implies |X_1 \sqcup Y_1| \leq |X_2 \sqcup Y_2|$,
2. $(|X_1| \leq |X_2|$ and $|Y_1| \leq |Y_2|) \implies |X_1 \times Y_1| \leq |X_2 \times Y_2|$,
3. $|X_1| \leq |X_2| \implies |2^{X_1}| \leq |2^{X_2}|$,
4. $(|X_1| = |X_2|$ and $|Y_1| \leq |Y_2|) \implies |Y_1^{X_1}| \leq |Y_2^{X_2}|$,
5. $|X_1| \leq |X_2| \implies |X_1!| \leq |X_2!|$.
5.2 Let $X$ and $Y$ be sets and let $x \in X$. Prove that $|\{x\} \times Y| = |Y|$.
5.3 Let X and Y be sets. Prove that the following map is bijective:
F : X . Y (X Y ) . (X Y ), (i, z)
_
(i, z) if z X Y
(1, z) if z (X Y ) (X Y )
Conclude that [X . Y [ = [(X Y ) . (X Y )[. Prove also [X Y [ [X Y [ [X . Y [.
5.2 The Theorem of Schröder-Bernstein
An obvious question about cardinality is whether $|X| \leq |Y|$ and $|Y| \leq |X|$ together imply $|X| = |Y|$. It turns out that the positive answer to this question requires a proof that is somewhat more complicated. In fact, Cantor already proved this result with a somewhat simpler proof using the Axiom of Choice. We will see a different proof without any usage of the Axiom of Choice. This is perhaps the most difficult theorem that we have seen here so far. We first state and prove a preparatory result that is somewhat surprising.
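Proposition 5.7 Let $X$ and $Y$ be sets and let $f : X \to Y$ and $g : Y \to X$ be injective functions. Then there exists a set $A \subseteq X$ such that
$$g(Y \setminus f(A)) = X \setminus A.$$
Proof. We consider the following function
$$F : 2^X \to 2^X, \quad B \mapsto X \setminus g(Y \setminus f(B)).$$
First we prove that this function $F$ is monotone, this means
$$B \subseteq A \implies F(B) \subseteq F(A)$$
for all sets $A, B \subseteq X$. If $B \subseteq A$, then $f(B) \subseteq f(A)$ and hence $Y \setminus f(A) \subseteq Y \setminus f(B)$. This implies $g(Y \setminus f(A)) \subseteq g(Y \setminus f(B))$ and hence $X \setminus g(Y \setminus f(B)) \subseteq X \setminus g(Y \setminus f(A))$. But this means $F(B) \subseteq F(A)$. This finishes the proof that $F$ is monotone. Next we consider the set
$$\mathcal{M} := \{B \in 2^X : B \subseteq F(B)\}.$$
Since $\emptyset \in \mathcal{M}$ we obtain $\mathcal{M} \neq \emptyset$ and hence we can define
$$A := \bigcup_{B \in \mathcal{M}} B.$$
We claim that this set has the desired property. In order to prove this, it suffices to show that $A$ is a fixed point of $F$, i.e.
$$A = F(A),$$
since this implies $A = F(A) = X \setminus g(Y \setminus f(A))$, which implies $X \setminus A = g(Y \setminus f(A))$ and this is the claim. In order to prove $A = F(A)$, we firstly note that $B \subseteq A$ for all $B \in \mathcal{M}$ and hence monotonicity of $F$ yields
$$F(B) \subseteq F(A) = F\Big(\bigcup_{B \in \mathcal{M}} B\Big),$$
which implies
$$\bigcup_{B \in \mathcal{M}} F(B) \subseteq F\Big(\bigcup_{B \in \mathcal{M}} B\Big).$$
Since $B \subseteq F(B)$ for all $B \in \mathcal{M}$, we obtain
$$A = \bigcup_{B \in \mathcal{M}} B \subseteq \bigcup_{B \in \mathcal{M}} F(B) \subseteq F\Big(\bigcup_{B \in \mathcal{M}} B\Big) = F(A).$$
Since $F$ is monotone, this implies $F(A) \subseteq F(F(A))$ and hence $F(A) \in \mathcal{M}$. Hence
$$F(A) \subseteq \bigcup_{B \in \mathcal{M}} B = A.$$
Altogether, this proves $A = F(A)$, i.e. $A$ is a fixed point of $F$. This finishes the proof. □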
The diagram in Figure 5.1 illustrates the claim of the previous proposition.
[Figure 5.1: Two injections $f : X \to Y$ and $g : Y \to X$. The set $X$ is split into $A$ and $X \setminus A$, the set $Y$ into $f(A)$ and $Y \setminus f(A)$.]
Now we can prove the following theorem.
Theorem 5.8 (Schröder-Bernstein 1897) Let $X$ and $Y$ be sets. Then we obtain
$$|X| = |Y| \iff (|X| \leq |Y| \text{ and } |Y| \leq |X|).$$
Proof. Let $X$ and $Y$ be sets. If $|X| = |Y|$, then there exists a bijective map $h : X \to Y$. It follows by Proposition 4.29 that the inverse $h^{-1} : Y \to X$ exists and by Corollary 4.30 the inverse $h^{-1}$ is bijective too. In particular, $h : X \to Y$ and $h^{-1} : Y \to X$ are injective and hence $|X| \leq |Y|$ and $|Y| \leq |X|$.
Now we still have to prove the converse implication. Hence, suppose $|X| \leq |Y|$ and $|Y| \leq |X|$. This means that there are injective functions $f : X \to Y$ and $g : Y \to X$. We need to show $|X| = |Y|$, i.e. we have to construct a bijective function $h : X \to Y$. We recall that the inverse function $g^{-1} : \mathrm{range}(g) \to Y$ exists (see Problem 4.6). By Proposition 5.7 there exists a subset $A \subseteq X$ such that $g(Y \setminus f(A)) = X \setminus A$. Now we can define a map $h : X \to Y$ by
$$h(x) := \begin{cases} f(x) & \text{if } x \in A \\ g^{-1}(x) & \text{if } x \in X \setminus A \end{cases}$$
for all $x \in X$. This map is well-defined, since $X \setminus A = g(Y \setminus f(A)) \subseteq \mathrm{range}(g)$. The diagram in Figure 5.1 illustrates the idea of the construction. We need to prove that $h$ is bijective.
We first prove that $h$ is surjective. Let $y \in Y$. If $y \in f(A)$, then clearly there is an $x \in A$ such that $h(x) = f(x) = y$. Otherwise, $y \in Y \setminus f(A)$, but then $x := g(y) \in X \setminus A$ and hence $h(x) = g^{-1}(x) = y$.
Next we prove that $h$ is injective. To this end, let $x, y \in X$ with $h(x) = h(y)$. If $x$ and $y$ are both in $A$, then we obtain $f(x) = h(x) = h(y) = f(y)$ and hence $x = y$, since $f$ is injective. If $x$ and $y$ are both in $X \setminus A$, then $g^{-1}(x) = h(x) = h(y) = g^{-1}(y)$. Since $g^{-1} : \mathrm{range}(g) \to Y$ is bijective and, in particular, injective, this implies $x = y$. If $x$ and $y$ are not both in $A$ and not both in $X \setminus A$, then we can assume without loss of generality $x \in A$ and $y \in X \setminus A$. In this case $h(x) = f(x) \in f(A)$ and $h(y) = g^{-1}(y) \in g^{-1}(X \setminus A) = Y \setminus f(A)$. Hence this case is impossible, since $h(x) = h(y)$. This finishes the proof that $h$ is injective and hence bijective. □
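The construction of $h$ can be followed on a concrete pair of injections. The sketch below is an illustration under assumptions that are not in the notes: it takes $X = Y = \mathbb{N}$ with the hypothetical injections $f(n) = 2n$ and $g(n) = 2n$, and it decides membership in a set $A$ by unfolding the fixed point equation $A = X \setminus g(Y \setminus f(A))$ recursively, rather than by forming the union over $\mathcal{M}$ as in the proof of Proposition 5.7.

```python
def f(n: int) -> int:
    """Hypothetical injection X -> Y, not surjective."""
    return 2 * n


def g(n: int) -> int:
    """Hypothetical injection Y -> X, not surjective."""
    return 2 * n


def in_A(x: int) -> bool:
    """Membership in a set A with A = X minus g(Y minus f(A))."""
    if x == 0:
        return True   # 0 = g(0) and 0 = f(0), so putting 0 into A is consistent
    if x % 2 == 1:
        return True   # x is not in range(g), hence not in g(Y minus f(A))
    y = x // 2        # y is the preimage of x under g
    # x belongs to A exactly when y lies in f(A), i.e. y = f(a) for some a in A
    return y % 2 == 0 and in_A(y // 2)


def h(x: int) -> int:
    """The map from the proof: f on A, the inverse of g on the complement of A."""
    return f(x) if in_A(x) else x // 2


first_values = [h(x) for x in range(9)]
```

On this example $h$ swaps or stretches numbers depending on how often they can be halved, which is exactly the back-and-forth between $f(A)$ and $Y \setminus f(A)$ shown in Figure 5.1.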
Now we can conclude that the relations on cardinality that we have studied satisfy the following important properties.
Corollary 5.9 The following holds for all sets $X$, $Y$ and $Z$:
1. $|X| = |X|$ (reflexivity)
2. $|X| \leq |Y|$ and $|Y| \leq |X| \implies |X| = |Y|$ (antisymmetry)
3. $|X| \leq |Y|$ and $|Y| \leq |Z| \implies |X| \leq |Z|$ (transitivity)
The first statement clearly holds since the identity $\mathrm{id}_X : X \to X$ is bijective, the second statement is the statement of the Theorem of Schröder-Bernstein 5.8 and the third statement holds since the composition of two injective maps is injective by Corollary 4.28. The above properties are basically those of an order relation (except that the underlying class of all sets is not a set itself). We will study such order relations later on. This particular order is even total (in a sense specified in Definition 6.1), as the next result shows.
Theorem 5.10 (Trichotomy) The following statement is equivalent to the Axiom of Choice. For any two sets $X$ and $Y$ we have $|X| < |Y|$ or $|X| = |Y|$ or $|Y| < |X|$.
The proof is beyond our scope here and we have to postpone it until later.
Problems
5.4 Prove that the following two functions are injective:
1. $I : \{0, 1\}^{\mathbb{N}} \to \mathbb{N}^{\mathbb{N}}, f \mapsto f$,
2. $J : \mathbb{N}^{\mathbb{N}} \to \{0, 1\}^{\mathbb{N}}, f \mapsto (\underbrace{1, ..., 1}_{f(0)\text{ times}}, 0, \underbrace{1, ..., 1}_{f(1)\text{ times}}, 0, \underbrace{1, ..., 1}_{f(2)\text{ times}}, ...)$.
For the definition of $I$ we interpret $f$ on the left-hand side as function $f : \mathbb{N} \to \{0, 1\}$ and on the right-hand side we consider $f$ as function $f : \mathbb{N} \to \mathbb{N}$ (which is possible since $\{0, 1\} \subseteq \mathbb{N}$). For the definition of $J$ we have written the function $g : \mathbb{N} \to \{0, 1\}$ with $g = J(f)$ like a sequence as an infinite tuple, i.e. the tuple contains the function values $(g(0), g(1), g(2), ...)$. Conclude that $|2^{\mathbb{N}}| = |\{0, 1\}^{\mathbb{N}}| = |\mathbb{N}^{\mathbb{N}}|$.
5.3 Cantor's Diagonalization Method
In this section we want to prove that there are sets of many different sizes. We start with a result of Cantor that shows that the power set $2^X$ of any set $X$ is larger than the set $X$ itself. In some sense this is another instance of Russell's paradox (see Example 2.8).
Theorem 5.11 (Cantor 1892) Let $X$ be a set. Then
$$|X| < |2^X|.$$
Proof. Let $X$ be a set. We have to prove $|X| \leq |2^X|$ and $|X| \neq |2^X|$. That is, it is sufficient to show that there is an injective function $f : X \to 2^X$ and that there is no injective function $g : 2^X \to X$ (the inverse of a bijection $X \to 2^X$ would be such an injective function). It is easy to see that the function
$$f : X \to 2^X, \quad x \mapsto \{x\}$$
is injective: if $x, y \in X$ with $f(x) = f(y)$, then we obtain $\{x\} = \{y\}$, which implies $x = y$. Now let us assume that there is an injective function $g : 2^X \to X$. Then this function has a left inverse $h : X \to 2^X$ by Theorem 4.49, which means $h \circ g = \mathrm{id}_{2^X}$. Now we define the set
$$A := \{x \in X : x \notin h(x)\}.$$
Let $y := g(A)$. Then we obtain $h(y) = (h \circ g)(A) = A$, which implies
$$y \in A \iff y \notin h(y) \iff y \notin A.$$
This is clearly a contradiction. Hence the assumption was wrong and there cannot be any injective function $g : 2^X \to X$. □
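The diagonal set $A = \{x \in X : x \notin h(x)\}$ can be inspected exhaustively for a small set. The sketch below, under the assumption of a hypothetical three-element set `X`, runs through every function $h : X \to 2^X$ and confirms that the diagonal set is never among the values of $h$. This is the reason why the left inverse $h$ in the proof, which would have to take the value $A$ at $g(A)$, cannot exist.

```python
from itertools import combinations, product

X = [0, 1, 2]  # hypothetical finite set

# The power set of X as a list of frozensets.
power_set = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def diagonal_set(h: dict) -> frozenset:
    """The set A = {x in X : x not in h(x)} for a function h : X -> 2^X."""
    return frozenset(x for x in X if x not in h[x])


# Enumerate every function h : X -> 2^X (there are 8**3 = 512 of them).
never_hit = True
for images in product(power_set, repeat=len(X)):
    h = dict(zip(X, images))
    if diagonal_set(h) in h.values():
        never_hit = False
```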
In Theorem 4.45 we have proved that the power set $2^X$ has the same cardinality as the set of functions $\{0, 1\}^X$. Hence we obtain the following corollary of Cantor's Theorem.
Corollary 5.12 For any set $X$ we have $|X| < |\{0, 1\}^X|$.
From Cantor's Theorem 5.11 we can deduce that there are infinite sets of different cardinality. In particular, we get the following corollary.
Corollary 5.13 We have $|\mathbb{N}| < |2^{\mathbb{N}}|$.
That is, the power set $2^{\mathbb{N}}$ of the natural numbers is larger than the set $\mathbb{N}$ of natural numbers itself, with respect to cardinality. Hence, we get an infinite chain of larger and larger infinite sets:
$$|\mathbb{N}| < |2^{\mathbb{N}}| < |2^{2^{\mathbb{N}}}| < |2^{2^{2^{\mathbb{N}}}}| < ...$$
We mention that one can also conclude from Cantor's Theorem 5.11 that there is no universal set.
Problems
5.5 Show as follows that Proposition 2.9 is also a consequence of Cantor's Theorem 5.11:
1. Assume that there is a universal set $U$ that contains all sets $X$.
2. Show that $X \in U$ implies $2^X \subseteq U$.
3. Conclude that $X \in U$ implies $|2^X| \leq |U|$.
4. Show that this leads to a contradiction!
5.4 The Continuum Hypothesis
Since the power set $2^X$ of any set $X$ is strictly larger than the set $X$ itself, the question arises whether there is any set of cardinality in between. For large enough finite sets this is certainly the case. If $X$ is finite and has two or more elements, then there is a set $Y$ with $|X| < |Y| < |2^X|$, namely any set $Y$ that has exactly one element more than $X$ will do the job. However, it was a matter of many mathematical investigations whether there can be such a set $Y$ for infinite sets $X$ as well. It is the so-called Continuum Hypothesis that there is no such set. It comes in a generalized form which makes a statement for any infinite set and in a basic form which makes the statement only for the set $X = \mathbb{N}$. We capture both in the following definition.
Definition 5.14 (Continuum Hypothesis) The Generalized Continuum Hypothesis is the statement that for each set $X$ with $|\mathbb{N}| \leq |X|$ there does not exist any set $Y$ with
$$|X| < |Y| < |2^X|.$$
The (ordinary) Continuum Hypothesis is this statement for the special case $X = \mathbb{N}$.
Kurt Gödel proved in 1940 that consistency of Zermelo-Fraenkel set theory implies consistency of Zermelo-Fraenkel set theory together with the Continuum Hypothesis. This also holds in the presence of the Axiom of Choice. This implies that the Continuum Hypothesis cannot be proved to be false using the Zermelo-Fraenkel axioms together with the Axiom of Choice (provided these axioms are consistent). In 1963 Paul Cohen proved that the Continuum Hypothesis also cannot be derived from the Zermelo-Fraenkel axioms, not even in the presence of the Axiom of Choice. This means that the negation of the Continuum Hypothesis together with the Zermelo-Fraenkel axioms and the Axiom of Choice is also consistent. Surprisingly, the Generalized Continuum Hypothesis implies the Axiom of Choice (although the first one is a non-existence claim and the second one an existence statement). We state this result here without proof.
Theorem 5.15 (Sierpiński 1947) The Generalized Continuum Hypothesis implies the Axiom of Choice.
Hence the Generalized Continuum Hypothesis can be considered as an even stronger non-constructive principle than the Axiom of Choice. Unlike the Axiom of Choice, the Generalized Continuum Hypothesis is not widely accepted in mathematics. That is, any application of it has to be mentioned explicitly.
Problems
5.6 Let $X$, $Y$ be sets. Prove that the Generalized Continuum Hypothesis implies
$$|X| < |Y| \implies |2^X| < |2^Y|.$$
5.5 Cantor's Pairing Function
In Section 5.1 we have seen that $|\mathbb{N}| = |2\mathbb{N}|$, i.e. cardinalitywise there are exactly as many natural numbers as there are even numbers. Perhaps even more surprisingly, we will show in this section that cardinalitywise there are as many pairs of natural numbers as there are natural numbers, i.e. $|\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$. This proof is due to Cantor and it is called Cantor's first diagonalization. The idea is captured in the diagram in Figure 5.2. We systematically enumerate all pairs $(n, k) \in \mathbb{N} \times \mathbb{N}$ of natural numbers in a coordinate system by moving diagonally through this system. This enumeration yields a function $f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ that assigns the number $f(n, k) \in \mathbb{N}$ at position $(n, k)$ to each pair $(n, k) \in \mathbb{N} \times \mathbb{N}$. This function $f$ is bijective and hence it shows that cardinalitywise there are as many pairs of natural numbers as there are natural numbers.
         k=0  k=1  k=2  k=3  k=4
    n=0    0    2    5    9   14
    n=1    1    4    8   13
    n=2    3    7   12
    n=3    6   11
    n=4   10
Figure 5.2: Cantor's pairing function
Proposition 5.16 (Cantor's pairing function) The function
$$f : \mathbb{N} \times \mathbb{N} \to \mathbb{N}, \quad (n, k) \mapsto \tfrac{1}{2}(n + k)(n + k + 1) + k$$
is bijective.
Proof. We consider the following additional functions:
1. $s : \mathbb{N} \to \mathbb{N}, i \mapsto \tfrac{1}{2} i(i + 1)$,
2. $h : \mathbb{N} \to \mathbb{N}, m \mapsto \max\{i \in \mathbb{N} : s(i) \leq m\}$,
3. $g_2 : \mathbb{N} \to \mathbb{N}, m \mapsto m - s(h(m))$,
4. $g_1 : \mathbb{N} \to \mathbb{N}, m \mapsto h(m) - g_2(m)$,
5. $g : \mathbb{N} \to \mathbb{N} \times \mathbb{N}, m \mapsto (g_1(m), g_2(m))$.
Intuitively, $s$ captures the values in the first column of the diagram in Figure 5.2 and $h(m)$ determines the number of the row in which the upwards diagonal starts on which $m$ is located. Firstly, we need to show that all the above defined functions are correctly defined. Since $s(0) = 0$ and $s(i) < s(i + 1)$ for all $i \in \mathbb{N}$, we obtain that the maximum $h(m)$ actually exists for all $m \in \mathbb{N}$. The definitions of $s$ and $h$ imply
$$s(h(m)) \leq m < s(h(m) + 1) = s(h(m)) + h(m) + 1$$
for all $m \in \mathbb{N}$. The first inequality shows that $m - s(h(m))$ is non-negative and the second inequality implies $m \leq s(h(m)) + h(m)$ and hence $h(m) - m + s(h(m))$ is also non-negative for all $m \in \mathbb{N}$. Hence $g_2$ and $g_1$ are both correctly defined.
Now we prove that $f$ is surjective. The definition of $s$ implies
$$f(n, k) = s(n + k) + k$$
for all $n, k \in \mathbb{N}$ and hence we obtain for all $m \in \mathbb{N}$
$$\begin{aligned} (f \circ g)(m) &= f(g_1(m), g_2(m)) \\ &= s(g_1(m) + g_2(m)) + g_2(m) \\ &= s(h(m) - g_2(m) + g_2(m)) + m - s(h(m)) \\ &= s(h(m)) + m - s(h(m)) \\ &= m. \end{aligned}$$
This proves that $f$ is surjective. In fact, it also proves that the function $g$ is a right inverse of $f$. Now we prove that $g$ is also a left inverse of $f$ and hence $f$ is injective. First we note that an easy calculation shows
$$s(n + k + 1) > s(n + k) + k$$
for all $n, k \in \mathbb{N}$. Together with $s(n + k) \leq s(n + k) + k$ this implies
$$h(f(n, k)) = h(s(n + k) + k) = n + k$$
for all $n, k \in \mathbb{N}$ and we obtain
$$g_2(f(n, k)) = f(n, k) - s(h(f(n, k))) = f(n, k) - s(n + k) = k$$
for all $n, k \in \mathbb{N}$ and hence
$$g_1(f(n, k)) = h(f(n, k)) - g_2(f(n, k)) = n + k - k = n$$
for all $n, k \in \mathbb{N}$. Altogether, this means
$$(g \circ f)(n, k) = (g_1(f(n, k)), g_2(f(n, k))) = (n, k)$$
for all $n, k \in \mathbb{N}$, i.e. $g$ is a left inverse of $f$. This implies that $f$ is injective. □
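The functions $s$, $h$, $g_1$, $g_2$, $g$ and $f$ from the proof translate directly into code. In the sketch below the only step that is not taken from the text is the closed form used for $h$: since $s(i) \leq m$ is equivalent to $(2i + 1)^2 \leq 8m + 1$, the maximum can be computed with an integer square root.

```python
from math import isqrt


def s(i: int) -> int:
    """s(i) = i(i + 1)/2, the first column of Figure 5.2."""
    return i * (i + 1) // 2


def h(m: int) -> int:
    """h(m) = max{i : s(i) <= m}, computed via (2i + 1)^2 <= 8m + 1."""
    return (isqrt(8 * m + 1) - 1) // 2


def f(n: int, k: int) -> int:
    """Cantor's pairing function f(n, k) = s(n + k) + k."""
    return s(n + k) + k


def g(m: int) -> tuple:
    """The inverse g(m) = (g1(m), g2(m)) from the proof."""
    g2 = m - s(h(m))
    g1 = h(m) - g2
    return (g1, g2)


first_column = [f(n, 0) for n in range(5)]
first_row = [f(0, k) for k in range(5)]
```

Both `f(*g(m)) == m` and `g(f(n, k)) == (n, k)` can be checked on any finite range, which mirrors the two halves of the proof (right inverse and left inverse).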
From this proposition we obtain the following corollary.
Corollary 5.17 We have $|\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$.
The same idea can be used to generalize the result to triples and $k$-tuples of natural numbers in general. For instance, for $k = 3$ we can use the bijective function
$$F : \mathbb{N}^3 \to \mathbb{N}, \quad (m, n, k) \mapsto f(m, f(n, k))$$
that is defined with the help of the pairing function $f$ from Proposition 5.16. This implies $|\mathbb{N}^3| = |\mathbb{N}|$ and similarly we obtain $|\mathbb{N}^k| = |\mathbb{N}|$ for all $k \geq 1$. From this we can conclude that the set of natural numbers $\mathbb{N}$, the set of integers $\mathbb{Z}$ and the set of rational numbers $\mathbb{Q}$ all have the same cardinality.
Proposition 5.18 We have $|\mathbb{N}| = |\mathbb{Z}| = |\mathbb{Q}|$.
We have not defined the sets $\mathbb{Z}$ and $\mathbb{Q}$ formally, however, we indicate in Problem 5.7 and Problem 5.8 how this conclusion follows. Another important remark is that the set of real numbers has exactly the same cardinality as the power set of $\mathbb{N}$.
Proposition 5.19 We have $|2^{\mathbb{N}}| = |\mathbb{R}|$.
Once again, we have not formally defined $\mathbb{R}$ here, but we indicate how to prove this result in Problem 5.9. Finally, we mention that from the two aforementioned propositions and the Theorem of Cantor 5.11 we get the following corollary.
Corollary 5.20 We have $|\mathbb{N}| < |\mathbb{R}|$.
Problems
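5.7 Prove that the following function is surjective:
$$f : \mathbb{N} \times \mathbb{N} \to \mathbb{Z}, \quad (n, k) \mapsto n - k.$$
Provide a concrete right inverse $g : \mathbb{Z} \to \mathbb{N} \times \mathbb{N}$ of $f$ (without using the Axiom of Choice). Show that this implies $|\mathbb{N}| = |\mathbb{Z}|$.
5.8 Prove that the following function is surjective:
$$f : \mathbb{N} \times \mathbb{N} \times \mathbb{N} \to \mathbb{Q}, \quad (n, k, m) \mapsto \frac{n - k}{m + 1}.$$
Provide a concrete right inverse $g : \mathbb{Q} \to \mathbb{N} \times \mathbb{N} \times \mathbb{N}$ of $f$ (without using the Axiom of Choice). Show that this implies $|\mathbb{N}| = |\mathbb{Q}|$.
5.9 This question requires some basic knowledge about real numbers. Prove that the following two functions are injective:
1. $F : \{0, 1\}^{\mathbb{N}} \to \mathbb{R}, f \mapsto \sum_{n=0}^{\infty} f(n) 3^{-n}$,
2. $G : \mathbb{R} \to 2^{\mathbb{Q} \times \mathbb{Q}}, x \mapsto \{(a, b) \in \mathbb{Q} \times \mathbb{Q} : a < x < b\}$.
Show that this implies $|2^{\mathbb{N}}| = |\mathbb{R}|$.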
5.10 Prove that the map
f : 2
N
2
N
2
NN
, (A, B) AB
is bijective. Conclude that [2
N
2
N
[ = [2
N
[.
5.11 Show with the help of the previous problems that $|\mathbb{R}^2| = |\mathbb{R}|$.
5.12 This question requires some basic knowledge about complex numbers. Prove that the following function is bijective:
$$f : \mathbb{R}^2 \to \mathbb{C}, \quad (a, b) \mapsto a + bi.$$
Show that this implies $|\mathbb{C}| = |\mathbb{R}|$.
5.6 Induction Principle on Natural Numbers
In the following section we want to discuss finite and infinite sets, and prototypes of such sets will be derived from the natural numbers. For this purpose we need to clarify some further properties of natural numbers. The first property is called the induction principle.
Proposition 5.21 (Induction principle) Let $A \subseteq \mathbb{N}$ be a subset that satisfies the following properties:
1. $0 \in A$ (induction base)
2. $(\forall n \in \mathbb{N})(n \in A \implies n + 1 \in A)$ (induction step)
then $A = \mathbb{N}$.
We cannot really prove this proposition here, since we are working with an intuitive concept of the natural numbers. We have just defined $\mathbb{N}$ as the set of numbers $0, 1, 2, ...$ and on the basis of this informal definition, the above induction principle is just intuitively correct. The interpretation of the dots "..." in the informal definition of $\mathbb{N}$ is just that with any number $n$ also its successor $n + 1$ follows in that list. Besides the above induction principle, there is a second principle that also follows intuitively. This is called the recursion principle. The recursion principle allows us to define functions "inductively" (or "recursively") following the inductive structure of natural numbers.
Proposition 5.22 (Recursion principle) Let $X$, $Y$ be sets and let $g : X \to Y$ and $h : Y \times X \times \mathbb{N} \to Y$ be functions. Then there exists exactly one function $f : X \times \mathbb{N} \to Y$ with
1. $f(x, 0) := g(x)$,
2. $f(x, n + 1) := h(f(x, n), x, n)$
for all $x \in X$ and $n \in \mathbb{N}$.
Once again, we cannot prove this result here, at least not the existence claim, since we do not use a precise definition of $\mathbb{N}$. However, we can use the induction principle in order to derive the uniqueness claim in the recursion principle. We formulate this as an example here that shows how to use the induction method.
Example 5.23 Let $X$ and $Y$ be sets and let $g : X \to Y$ and $h : Y \times X \times \mathbb{N} \to Y$ be functions. Let us assume that we have two functions $f : X \times \mathbb{N} \to Y$ and $f' : X \times \mathbb{N} \to Y$ that both satisfy the equations given in the Recursion Principle 5.22. We claim that $f = f'$ follows, using the induction principle. In order to show this, we prove the following claim:
$$(\forall n \in \mathbb{N})(\forall x \in X)\, f(x, n) = f'(x, n).$$
This claim clearly implies $f = f'$. More precisely, this claim is equivalent to the statement that the set
$$A := \{n \in \mathbb{N} : (\forall x \in X)\, f(x, n) = f'(x, n)\}$$
is equal to $\mathbb{N}$. If we can show that $A$ satisfies both requirements of the Induction Principle 5.21, then $A = \mathbb{N}$ follows. We prove this now.
Induction base: $n = 0$. In this case clearly $f(x, 0) = g(x) = f'(x, 0)$ for all $x \in X$. That means $0 \in A$.
Induction step: $n \to n + 1$. Now we assume that $n \in \mathbb{N}$ is fixed and that for this fixed $n$ we have $(\forall x \in X)\, f(x, n) = f'(x, n)$. This means $n \in A$, which is the so-called induction hypothesis. We need to show $n + 1 \in A$. We obtain for all $x \in X$
$$f(x, n + 1) = h(f(x, n), x, n) = h(f'(x, n), x, n) = f'(x, n + 1),$$
where the induction hypothesis has been used in the middle equality. This now means $n + 1 \in A$. Hence we have proved $n \in A \implies n + 1 \in A$.
This finishes the induction. Altogether, we have proved $A = \mathbb{N}$ and hence $f = f'$. Hence, there can be at most one function $f$ that satisfies the requirements of the Recursion Principle 5.22.
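The two equations of the Recursion Principle 5.22 can be read as a recipe for computing $f(x, n)$: start with $g(x)$ and apply $h$ exactly $n$ times. The following sketch illustrates this reading; the helper `recursively_defined` and the two instances (addition and powers) are hypothetical examples and not taken from the notes.

```python
def recursively_defined(g, h):
    """Return the function f with f(x, 0) = g(x) and f(x, n + 1) = h(f(x, n), x, n)."""
    def f(x, n: int):
        value = g(x)                 # f(x, 0)
        for i in range(n):
            value = h(value, x, i)   # f(x, i + 1) = h(f(x, i), x, i)
        return value
    return f


# Addition: add(x, 0) = x and add(x, n + 1) = add(x, n) + 1.
add = recursively_defined(lambda x: x, lambda y, x, n: y + 1)

# Powers: power(x, 0) = 1 and power(x, n + 1) = power(x, n) * x.
power = recursively_defined(lambda x: 1, lambda y, x, n: y * x)
```

The loop makes the uniqueness argument of Example 5.23 plausible: each value $f(x, n + 1)$ is forced by the previous value $f(x, n)$.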
The above structure of a proof by induction is typical, including the terminology of an induction base, an induction step and an induction hypothesis. Usually, we will not formulate the set $A$ explicitly. The implicit understanding is that whenever we want to prove a statement of the form $(\forall n \in \mathbb{N}) P(n)$ with a proposition $P(n)$, then we can achieve this by applying the Induction Principle to the set
$$A := \{n \in \mathbb{N} : P(n)\}.$$
The induction principle is the key idea for the so-called Peano axioms of natural numbers. We formulate these axioms in the following definition.
Definition 5.24 (Peano model) We say that a triple $(N, z, s)$ is a Peano model of the natural numbers if the following holds:
1. $N$ is a set (natural numbers)
2. $z \in N$ (zero)
3. $s : N \to N$ is an injective function with $z \notin \mathrm{range}(s)$ (successor)
4. if a subset $A \subseteq N$ has the properties
a) $z \in A$ (induction base)
b) $(\forall n \in N)(n \in A \implies s(n) \in A)$ (induction step)
then $A = N$.
Since we are not going to develop set theory axiomatically here, we will not prove that there are Peano models of the natural numbers at all. We keep on using our intuitive model $(\mathbb{N}, 0, s)$ where $s : \mathbb{N} \to \mathbb{N}, n \mapsto n + 1$ is the successor function. The Induction Principle 5.21 essentially says that $(\mathbb{N}, 0, s)$ is a Peano model of the natural numbers. We briefly sketch how one can construct a set theoretical model of the natural numbers, namely by choosing the following sets:
$$0 := \emptyset, \quad 1 := 0 \cup \{0\}, \quad 2 := 1 \cup \{1\}, \quad 3 := 2 \cup \{2\}, \quad ...$$
The set of natural numbers corresponds then to the set $N$ of all these sets and the successor function corresponds to the function $s : N \to N, n \mapsto n \cup \{n\}$. One can actually prove in a precise way that along these lines one can construct a Peano model of the natural numbers. But we will not work this out in detail here.
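The set theoretical numbers $0 := \emptyset$ and $s(n) := n \cup \{n\}$ can be built with immutable sets. The sketch below uses Python's `frozenset` as a stand-in for sets of sets, which is an assumption of this illustration and not a formal model.

```python
def successor(n: frozenset) -> frozenset:
    """s(n) = n united with {n}."""
    return n | frozenset([n])


zero = frozenset()        # 0 := empty set
one = successor(zero)     # 1 = {0}
two = successor(one)      # 2 = {0, 1}
three = successor(two)    # 3 = {0, 1, 2}
```

In this model the number $n$ is a set with exactly $n$ elements, namely all smaller numbers, and distinct numbers have distinct successors, in line with the injectivity of $s$ required in Definition 5.24.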
Problems
5.13 Use the Induction Principle 5.21 in order to prove the following statement by induction:
$$(\forall n \in \mathbb{N}) \sum_{i=0}^{n} i = \frac{n(n + 1)}{2}.$$
5.14 Use the Recursion Principle 5.22 in order to prove that there exists exactly one function $f : \mathbb{N} \to \mathbb{N}$ such that
1. $f(0) := 1$,
2. $f(n + 1) := f(n) \cdot (n + 1)$,
for all $n \in \mathbb{N}$. This function $f$ is called the factorial function and usually one writes $n! := f(n)$ for all $n \in \mathbb{N}$.
5.15 For $n, k \in \mathbb{N}$ with $k \leq n$ we define the binomial coefficient
$$\binom{n}{k} := \frac{n!}{k!(n - k)!}$$
and we define $\binom{n}{k} := 0$ for $k > n$. Prove Pascal's rule, which states that
$$\binom{n + 1}{k + 1} = \binom{n}{k} + \binom{n}{k + 1}$$
for all $n, k \in \mathbb{N}$. Use this rule in order to show by induction that $\binom{n}{k}$ is a natural number for all $n, k \in \mathbb{N}$.
5.7 Finite and Countable Sets
We have seen that many common sets are either of the same cardinality as the set of natural numbers $\mathbb{N}$ or of the same cardinality as the power set $2^{\mathbb{N}}$. In fact, most infinite sets that commonly occur in mathematics are of one of the two corresponding cardinalities. We introduce some related terminology. We recall that for each $n \in \mathbb{N}$ with $n \geq 1$ we denote by $\mathbb{N}_n := \{1, 2, 3, ..., n\}$ the set of the natural numbers from 1 to $n$. We define $\mathbb{N}_0 := \emptyset$ to be the empty set.
Definition 5.25 Let $X$ be a set. Then we say that
1. $X$ is finite if $|X| = |\mathbb{N}_n|$ for some $n \in \mathbb{N}$,
2. $X$ is infinite if $X$ is not finite,
3. $X$ is countable if $|X| \leq |\mathbb{N}|$,
4. $X$ is countably infinite if $|X| = |\mathbb{N}|$,
5. $X$ is uncountable if $X$ is not countable.
We note that it follows directly from the definition that the empty set is finite, each finite set is countable and each countably infinite set is countable. Countable sets are sometimes also called denumerable. We give some examples.
Example 5.26 We discuss some examples of sets.
1. The empty set $\emptyset$ is finite and so are $2^{\emptyset}$ and $\mathbb{N}_n$ for each $n \in \mathbb{N}$.
2. The set $P$ of prime numbers is infinite; this is exactly what we proved in Theorem 1.2.
3. The sets $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$ are all countably infinite.
4. The sets $2^{\mathbb{N}}$ and $\mathbb{R}$ and $2^{\mathbb{R}}$ are uncountable.
Next we prove that the relation between the cardinalities of $\mathbb{N}_n$ and $\mathbb{N}_k$ can be directly deduced from $n$ and $k$.
Proposition 5.27 (Finite sets) Let $n, k \in \mathbb{N}$. Then we obtain:
1. $|\mathbb{N}_n| \leq |\mathbb{N}_k| \iff n \leq k$,
2. $|\mathbb{N}_n| = |\mathbb{N}_k| \iff n = k$,
3. $|\mathbb{N}_n| < |\mathbb{N}_k| \iff n < k$.
Proof.
1. Let $|\mathbb{N}_n| \leq |\mathbb{N}_k|$. Then there is an injective map $f : \mathbb{N}_n \to \mathbb{N}_k$. If $n > k$, then the values $f(1), ..., f(n)$ cannot all be distinct, since one of the values $1, ..., k$ must occur twice among $f(1), ..., f(n)$. In this case $f$ is not injective. Hence $n \leq k$. Let us now assume that $n \leq k$. Then $\mathbb{N}_n \subseteq \mathbb{N}_k$ and hence $|\mathbb{N}_n| \leq |\mathbb{N}_k|$ by Proposition 5.4.
2. Let $|\mathbb{N}_n| = |\mathbb{N}_k|$. This means $|\mathbb{N}_n| \leq |\mathbb{N}_k|$ and $|\mathbb{N}_k| \leq |\mathbb{N}_n|$ and hence $n \leq k$ and $k \leq n$ by 1. Hence $n = k$ follows. If, on the other hand, $n = k$, then the identity $\mathrm{id} : \mathbb{N}_n \to \mathbb{N}_k$ is clearly bijective and hence $|\mathbb{N}_n| = |\mathbb{N}_k|$.
3. This follows directly from 1. and 2.
□
The previous result is the reason why one can actually consider $|X|$ as a natural number for finite sets $X$.
Definition 5.28 (Cardinality) Let $X$ be a finite set and $n \in \mathbb{N}$. Then we define
$$|X| = n :\iff |X| = |\mathbb{N}_n|.$$
In this case, the natural number $n \in \mathbb{N}$ is called the cardinality of $X$.
Firstly, this quantity $|X|$ is well-defined for finite sets, since $|X| = |\mathbb{N}_n|$ and $|X| = |\mathbb{N}_k|$ for $n, k \in \mathbb{N}$ implies $n = k$ by Proposition 5.27. Secondly, we now have two ways of reading expressions like $|X| \leq |Y|$ for finite sets $X$, $Y$. For one, according to the original definition this means that there is an injective map $f : X \to Y$. For another, we can read this statement as the inequality $n \leq k$ for the cardinalities $n = |X|$ and $k = |Y|$. Proposition 5.27 guarantees that these two different interpretations actually lead to the same result, i.e. this ambiguity cannot cause any confusion. An analogous remark holds for the interpretation of the expressions $|X| = |Y|$ and $|X| < |Y|$. Next we prove that the set $\mathbb{N}$ is actually infinite according to our definition of finiteness.
Proposition 5.29 The set $\mathbb{N}$ is infinite.
Proof. Let us assume that $\mathbb{N}$ is finite, i.e. that there is an $n \in \mathbb{N}$ such that $|\mathbb{N}| = |\mathbb{N}_n|$. Then it follows that there is a bijection $f : \mathbb{N} \to \mathbb{N}_n$. Then among the function values $f(0), f(1), ..., f(n)$ there must be at least one repetition, since there are only $n$ distinct values in $\mathbb{N}_n$, but there are $n + 1$ function values. This is a contradiction to the injectivity of $f$. Hence, $\mathbb{N}$ cannot be finite. □
Now we show how the notions of finiteness and infinity behave with respect to a change in cardinality.
Proposition 5.30 Let $X$ and $Y$ be sets. Then we obtain the following:
1. $|X| \leq |Y|$ and $Y$ finite $\implies$ $X$ finite,
2. $|X| \leq |Y|$ and $Y$ countable $\implies$ $X$ countable,
3. $|X| \leq |Y|$ and $X$ infinite $\implies$ $Y$ infinite,
4. $|X| \leq |Y|$ and $X$ uncountable $\implies$ $Y$ uncountable.
Proof.
1. Let $|X| \leq |Y|$ and let $Y$ be finite. This means that $|Y| = |\mathbb{N}_n|$ for some $n \in \mathbb{N}$ and there is an injective map $f : X \to \mathbb{N}_n$. Then the inverse $f^{-1}$ can be considered as a function $f^{-1} : \mathrm{range}(f) \to X$ and $\mathrm{range}(f) \subseteq \mathbb{N}_n$ is a set that contains exactly $k$ values with $k \leq n$. Let $n_i$ be the $i$-th value in $\mathrm{range}(f)$ in increasing order $n_1 < n_2 < ... < n_k$ and let $g : \mathbb{N}_k \to X$ be defined by $g(i) := f^{-1}(n_i)$. Then $g$ is bijective and hence $|X| = |\mathbb{N}_k|$, i.e. $X$ is finite.
2. This follows directly from the definition by transitivity.
3. and 4. This follows from 1. and 2. by contraposition.
□
In particular, we obtain that any subset of a finite set is finite and any subset of a countable set is countable. Now we prove that for finite sets a situation like $|2\mathbb{N}| = |\mathbb{N}|$ for the natural numbers cannot occur.
Proposition 5.31 Let $X$ be a finite set. Then the following hold:
1. any injective map $f : X \to X$ is bijective,
2. any surjective map $f : X \to X$ is bijective.
Proof. Let $X$ be finite. Then $|X| = |\mathbb{N}_n|$ for some $n \in \mathbb{N}$. Then there is a bijective
map $h : X \to \mathbb{N}_n$.
1. If $f : X \to X$ is injective, then $g := h \circ f \circ h^{-1} : \mathbb{N}_n \to \mathbb{N}_n$ is also injective and
hence all the values $g(1), ..., g(n) \in \mathbb{N}_n$ are distinct. Since there are exactly $n$
distinct values in $\mathbb{N}_n$, this means that $g$ is surjective. But then $h^{-1} \circ g \circ h = f$ is
surjective too. See the diagram in Figure 5.3 for an illustration of the situation.
2. We leave this proof to the reader (see Problem 5.17).
$\Box$
[Figure 5.3: Commutative diagram for $g \circ h = h \circ f$, with $f : X \to X$, $h : X \to \mathbb{N}_n$ and $g : \mathbb{N}_n \to \mathbb{N}_n$.]
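Proposition 5.31 can be checked by brute force on a small example. The following Python sketch is only an illustration, not part of the proof: the set `X` and the helper names are assumptions chosen here, and maps are represented as dictionaries.

```python
from itertools import product

def is_injective(f: dict) -> bool:
    """A map (given as a dict) is injective if no value is hit twice."""
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain: set) -> bool:
    """A map is surjective if every element of the codomain is hit."""
    return set(f.values()) == codomain

def all_self_maps(X: list):
    """Enumerate every function f : X -> X as a dict."""
    for values in product(X, repeat=len(X)):
        yield dict(zip(X, values))

# Hypothetical four-element example set.
X = ["a", "b", "c", "d"]
maps = list(all_self_maps(X))

# On a finite set, injectivity and surjectivity of f : X -> X coincide.
agree = all(is_injective(f) == is_surjective(f, set(X)) for f in maps)
print(len(maps), agree)
```

The enumeration grows like $|X|^{|X|}$, so this kind of check is only feasible for very small sets.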
In the next section we will see that the conditions given in this result also imply
finiteness, provided one accepts the Axiom of Choice. We close this section with
some additional quantifiers that are related to finiteness.
Definition 5.32 (Infinitely many and for almost all) Let $X$ be a set and $P(x)$ a
predicate that depends on $x \in X$. Then we define
1. $(\forall^\infty x \in X)\, P(x) :\iff \{x \in X : \neg P(x)\}$ is finite,
2. $(\exists^\infty x \in X)\, P(x) :\iff \{x \in X : P(x)\}$ is infinite.
In the first case, we say that $P(x)$ holds for almost all $x \in X$ and in the
second case we say that $P(x)$ holds for infinitely many $x \in X$.
Here "almost all" means "for all but finitely many". In probability theory, measure
theory and topology the term "for almost all" is sometimes used with a different
meaning (with different concepts of size). We note that we get directly from the
definition the following version of de Morgan's law (and the corresponding statement
with the quantifiers swapped):
$$\neg(\forall^\infty x \in X)\, P(x) \iff (\exists^\infty x \in X)\, \neg P(x).$$
Problems
5.16 Let $f : X \to Y$ be a function and let $A \subseteq X$ and $B \subseteq Y$. Then we obtain:
1. $A$ finite $\Longrightarrow$ $f(A)$ finite,
2. $A$ countable $\Longrightarrow$ $f(A)$ countable,
3. $B$ infinite and $f$ surjective $\Longrightarrow$ $f^{-1}(B)$ infinite,
4. $B$ uncountable and $f$ surjective $\Longrightarrow$ $f^{-1}(B)$ uncountable.
5.17 Let $X$ be a finite set. Prove that any surjective map $f : X \to X$ is bijective.
5.8 Dedekind Infinite Sets
Even before Cantor defined finite sets in terms of $\mathbb{N}_n$ and infinite sets as sets that
are not finite, Richard Dedekind already suggested a concept of infinity that does
not refer to the natural numbers but that uses an intrinsic property of finite sets
to characterize them. This concept leads to further important characterizations of
finite and infinite sets (which, however, require the Axiom of Choice).
Definition 5.33 (Dedekind infinite sets) A set $X$ is called Dedekind infinite if
and only if there is a proper subset $A \subsetneq X$ such that $|A| = |X|$.
We give an example.
Example 5.34 The set $\mathbb{N}$ is Dedekind infinite, since $|\mathbb{N}| = |2\mathbb{N}|$ and $2\mathbb{N} \subsetneq \mathbb{N}$. The
set $\mathbb{P}$ of prime numbers is also Dedekind infinite (see Problem 5.18).
In the following theorem we collect a number of conditions that are equivalent
to Dedekind infiniteness.
Theorem 5.35 (Dedekind infinite sets) Let $X$ be a set. Then the following are
equivalent:
1. $X$ is Dedekind infinite,
2. there is a function $f : X \to X$ which is injective but not bijective,
3. $|\mathbb{N}| \le |X|$,
4. $X$ has a countably infinite subset.
Proof. 1.$\Longrightarrow$2. Let $X$ be Dedekind infinite. Then there is a proper subset
$A \subsetneq X$ such that $|A| = |X|$, i.e. there is a bijective map $g : A \to X$. Then the
inverse $g^{-1} : X \to A$ is bijective as well and the function $f : X \to X, x \mapsto g^{-1}(x)$ is
injective, but not surjective, since $\mathrm{range}(f) = A \subsetneq X$.
2.$\Longrightarrow$3. Let us now assume that there is a function $f : X \to X$ that is injective,
but not surjective. We inductively define an injective function $h : \mathbb{N} \to X$. Since
$f$ is not surjective, there is some $x_0 \in X \setminus \mathrm{range}(f)$. Now we assume that we have
already defined $x_n$ for some $n \in \mathbb{N}$ and we use $x_n$ in order to define
$$x_{n+1} := f(x_n).$$
We claim that $h : \mathbb{N} \to X, n \mapsto x_n$ is injective. We prove by
induction that
$$(\forall n \in \mathbb{N})(\forall i < n)\; x_i \ne x_n.$$
Induction base: $n = 0$. In this case nothing is to be proved.
Induction step: $n \to n + 1$. We assume we have some fixed $n \in \mathbb{N}$ such that
$(\forall i < n)\; x_i \ne x_n$ holds. We need to show this statement for $n + 1$. By injectivity of $f$,
the assumption implies $(\forall i < n)\; f(x_i) \ne f(x_n)$. But this means $(\forall i < n)\; x_{i+1} \ne x_{n+1}$. Since
$x_{n+1} \in \mathrm{range}(f)$ and $x_0 \notin \mathrm{range}(f)$, it is also clear that $x_0 \ne x_{n+1}$. Altogether, this
means $(\forall i < n + 1)\; x_i \ne x_{n+1}$. But this is the claim for $n + 1$. Altogether, this
proves that $h$ is injective.
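3.$\Longrightarrow$4. Let $|\mathbb{N}| \le |X|$. Then there is an injective function $h : \mathbb{N} \to X$ and hence
$f : \mathbb{N} \to \mathrm{range}(h), x \mapsto h(x)$ is bijective and hence $|\mathbb{N}| = |\mathrm{range}(h)|$. Hence $\mathrm{range}(h)$
is a countably infinite subset of $X$.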
4.$\Longrightarrow$1. Let $D \subseteq X$ be a countably infinite set. Then there exists a bijective
function $h : \mathbb{N} \to D$. Let $B := h(2\mathbb{N})$, $C := X \setminus D$ and $A := B \cup C$. We consider the
inverse function $h^{-1} : D \to \mathbb{N}$. Since $h$ is injective, it is clear that $B \subsetneq D$ and hence
$A \subsetneq X$. We define a function $g : A \to X$ by
$$g(x) := \begin{cases} h(\tfrac{1}{2} h^{-1}(x)) & \text{if } x \in B \\ x & \text{if } x \in C \end{cases}$$
for all $x \in A$. Then $g(B) = D$ and $g(C) = C$, i.e. $\mathrm{range}(g) = D \cup C = X$, i.e. $g$ is
surjective. Moreover, $g$ is also injective: if $x, y \in B$, then $g(x) = g(y)$ implies
$h(\tfrac{1}{2} h^{-1}(x)) = h(\tfrac{1}{2} h^{-1}(y))$, which in turn implies $x = y$; if $x, y \in C$, then clearly
$g(x) = g(y)$ implies $x = y$; if $x \in B$ and $y \in C$, then $g(x) \in D$ and $g(y) \in C$, hence
$g(x) = g(y)$ is not possible in this case. Altogether, this shows that $g$ is bijective. $\Box$
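The construction in the step from 2. to 3. can be illustrated with a concrete map. The sketch below assumes the hypothetical example $f(n) = 2n + 1$ on $\mathbb{N}$, which is injective and never takes the value $0$, so that $x_0 = 0$ lies outside $\mathrm{range}(f)$; the function name `orbit` is likewise chosen only for this illustration.

```python
def orbit(f, x0, length):
    """Return [x_0, x_1, ..., x_{length-1}] with x_{n+1} = f(x_n)."""
    xs = [x0]
    for _ in range(length - 1):
        xs.append(f(xs[-1]))
    return xs

def f(n: int) -> int:
    """Hypothetical injective, non-surjective map on N: 0 is not a value of f."""
    return 2 * n + 1

# x_0 = 0 is not in range(f), so the orbit never returns to an earlier point.
xs = orbit(f, 0, 10)
print(xs)
print(len(set(xs)) == len(xs))  # all listed values are pairwise distinct
```

A finite prefix can of course only illustrate the injectivity of $n \mapsto x_n$; the induction in the proof is what establishes it for all $n$.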
It is easy to see that each Dedekind infinite set is infinite. This follows from
Proposition 5.31. The reverse implication requires the Axiom of Choice. We first
prove the following proposition.
Proposition 5.36 It follows from the Axiom of Choice that any infinite set $X$
contains a countably infinite set $A \subseteq X$.
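Proof. Let $X$ be infinite. By the Axiom of Choice there exists a choice function
$C_X : 2^X \to X$ such that $C_X(A) \in A$ for each non-empty $A \subseteq X$. We define a
function $f : \mathbb{N} \to X$ inductively by
$$f(0) := C_X(X)$$
$$f(n + 1) := C_X(X \setminus f(\{0, ..., n\}))$$
for all $n \in \mathbb{N}$. Since $X$ is infinite, the set $X \setminus f(\{0, ..., n\})$ is non-empty in each step.
This function is injective and hence it proves $|\mathbb{N}| \le |X|$. By Theorem 5.35
this means that $X$ is Dedekind infinite and, in particular, contains a countably infinite subset. $\Box$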
Now we obtain the following characterization of infinite sets.
Theorem 5.37 (Infinite sets) Let $X$ be a set. It follows from the Axiom of Choice
that the following are equivalent:
1. $X$ is infinite,
2. $X$ is Dedekind infinite.
Proof. Let $X$ be a set. If $X$ is Dedekind infinite, then by Theorem 5.35 it follows
that there is an injection $f : X \to X$ that is not surjective. Hence $X$ cannot be
finite according to Proposition 5.31. Hence $X$ is infinite.
For the other direction, let $X$ be infinite. Then $X$ contains a countably infinite
subset $A \subseteq X$ by Proposition 5.36. This implies by Theorem 5.35 that $X$ is
Dedekind infinite. $\Box$
If we accept the Axiom of Choice, then we also get the following characterization
of finite sets.
Theorem 5.38 (Finite sets) Let $X$ be a set. It follows from the Axiom of Choice
that the following are equivalent:
1. $X$ is finite,
2. there is no proper subset $A \subsetneq X$ with $|A| = |X|$,
3. any injective function $f : X \to X$ is bijective,
4. any surjective function $g : X \to X$ is bijective,
5. $|X| < |\mathbb{N}|$.
Proof. Let us assume the Axiom of Choice. It follows from Theorem 5.37 that $X$ is
finite if and only if $X$ is not Dedekind infinite, hence finiteness of $X$ is equivalent to
the negations of the conditions of Theorem 5.35, i.e. to 2., 3. and 5. That $|X| < |\mathbb{N}|$
is the negation of $|\mathbb{N}| \le |X|$ follows from the Trichotomy Theorem 5.10. In order to
complete the proof, it suffices to show that 4. is equivalent to the other statements. It
follows from Proposition 5.31 that any finite $X$ satisfies 4. Let us hence assume that
4. holds. We prove that 3. follows. Let hence $f : X \to X$ be injective. Then $f$ has
a left inverse $g : X \to X$, i.e. $g \circ f = \mathrm{id}_X$. Such a left inverse $g$ is necessarily
surjective and hence bijective by 4. Thus, $g^{-1} = g^{-1} \circ g \circ f = f$ and hence $f$ is bijective. $\Box$
We mention that it also follows from the Axiom of Choice that a set is countably
infinite if and only if it is countable and infinite.
Problems
5.18 Prove that the set of prime numbers $\mathbb{P}$ is Dedekind infinite, without using the Axiom
of Choice, directly by going back to the Theorem of Euclid 1.2 (see also Problem 1.2).
5.19 Let $A \subseteq \mathbb{N}$. Prove that it follows from the Axiom of Choice that the following are
equivalent:
1. $A$ is infinite,
2. $(\forall n \in \mathbb{N})(\exists k \in \mathbb{N})(k \ge n$ and $k \in A)$.
5.20 Let $X$ be a set. Prove that the following are equivalent (without the Axiom of Choice):
1. there exists a function $f : X \to X$ which is surjective but not injective,
2. there exists a surjective map $g : X \to Y$ with a countably infinite set $Y$,
3. the power set $2^X$ is Dedekind infinite.
5.21 Let $X$ be a set. Prove that it follows from the Axiom of Choice that the following are
equivalent:
1. $X$ is countably infinite,
2. $X$ is countable and infinite.
5.9 Cardinality and Set Constructions
In this section we investigate how set constructions affect the cardinality of sets. We
will only treat finite and countable sets here. The first observation is that all finite
set constructions that we have considered preserve finiteness. That means that a
finite union, intersection, product, or the power set or function set of finite sets is
finite again. We can say more than this: we can determine formulas that allow us to
compute the size of the resulting sets, if we know the size of the original sets. We
assume that $k^0 = 1$ for all $k \in \mathbb{N}$.
Theorem 5.39 (Constructions on finite sets) Let $X$ and $Y$ be finite sets. Then
we obtain
1. $|X \cup Y| + |X \cap Y| = |X \sqcup Y| = |X| + |Y|$,
2. $|X \times Y| = |X| \cdot |Y|$,
3. $|Y^X| = |Y|^{|X|}$,
4. $|X!| = |X|!$,
5. $|2^X| = 2^{|X|}$.
In particular, the sets $X \cup Y$, $X \cap Y$, $X \times Y$, $Y^X$, $X!$ and $2^X$ are finite as well.
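Proof.
1. Before we start, we prove the extra claim that for any finite set $Z$ and $x \notin Z$
we obtain $|Z \cup \{x\}| = |Z| + 1$. Since $Z$ is finite, there is some $m \in \mathbb{N}$ such
that $|Z| = |\mathbb{N}_m|$ and hence there is some bijection $h : Z \to \mathbb{N}_m$. Hence we get
a bijection
$$H : Z \cup \{x\} \to \mathbb{N}_{m+1}, \quad z \mapsto \begin{cases} h(z) & \text{if } z \in Z \\ m + 1 & \text{otherwise} \end{cases}$$
This proves the claim, i.e. $|Z \cup \{x\}| = |Z| + 1$. Since $|X| = |\mathbb{N}_n|$ and $|Y| = |\mathbb{N}_k|$
implies $|X \sqcup Y| = |\mathbb{N}_n \sqcup \mathbb{N}_k|$ (see Problem 5.1), it suffices to show
$|\mathbb{N}_n \sqcup \mathbb{N}_k| = n + k$ for all $n, k \in \mathbb{N}$ in order to prove $|X \sqcup Y| = |X| + |Y|$. We prove this
claim by induction on $n \in \mathbb{N}$.
Induction base: $n = 0$. In this case $\mathbb{N}_n \sqcup \mathbb{N}_k = (\{1\} \times \emptyset) \cup (\{2\} \times \mathbb{N}_k) = \{2\} \times \mathbb{N}_k$
and hence $|\mathbb{N}_n \sqcup \mathbb{N}_k| = |\{2\} \times \mathbb{N}_k| = k$ for all $k \in \mathbb{N}$ by Problem 5.2.
Induction step: $n \to n + 1$. We assume that we have a fixed $n \in \mathbb{N}$ with
$|\mathbb{N}_n \sqcup \mathbb{N}_k| = n + k$ for all $k \in \mathbb{N}$. We consider the claim for $n + 1$ and we obtain
$$|\mathbb{N}_{n+1} \sqcup \mathbb{N}_k| = |(\{1\} \times (\mathbb{N}_n \cup \{n + 1\})) \cup (\{2\} \times \mathbb{N}_k)|$$
$$= |(\mathbb{N}_n \sqcup \mathbb{N}_k) \cup \{(1, n + 1)\}|$$
$$= n + k + 1,$$
where the last equality follows from the induction hypothesis and the extra
claim we proved first since $(1, n + 1) \notin (\mathbb{N}_n \sqcup \mathbb{N}_k)$. This finishes the induction.
We still have to prove $|X \sqcup Y| = |X \cup Y| + |X \cap Y|$. Firstly, we note that $X \cup Y$
and $X \cap Y$ are finite, since $|X \cap Y| \le |X \cup Y| \le |X \sqcup Y|$ holds (see Problem 5.3).
Hence, the claim follows since $|X \sqcup Y| = |(X \cup Y) \sqcup (X \cap Y)| = |X \cup Y| + |X \cap Y|$
by Problem 5.3 and by what we proved before.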
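2. Since $|X| = |\mathbb{N}_n|$ and $|Y| = |\mathbb{N}_k|$ implies $|X \times Y| = |\mathbb{N}_n \times \mathbb{N}_k|$ (see Problem 5.1),
it suffices to show $|\mathbb{N}_n \times \mathbb{N}_k| = n \cdot k$ for all $n, k \in \mathbb{N}$. We prove this by induction
on $n \in \mathbb{N}$.
Induction base: $n = 0$. In this case $\mathbb{N}_0 \times \mathbb{N}_k = \emptyset \times \mathbb{N}_k = \emptyset$ for all $k \in \mathbb{N}$ and
hence $|\mathbb{N}_0 \times \mathbb{N}_k| = |\emptyset| = 0 = 0 \cdot k$.
Induction step: $n \to n + 1$. We assume that $n \in \mathbb{N}$ is some fixed number such
that $|\mathbb{N}_n \times \mathbb{N}_k| = n \cdot k$ for all $k \in \mathbb{N}$. We now consider the statement for $n + 1$.
We obtain
$$\mathbb{N}_{n+1} \times \mathbb{N}_k = (\mathbb{N}_n \cup \{n + 1\}) \times \mathbb{N}_k = (\mathbb{N}_n \times \mathbb{N}_k) \cup (\{n + 1\} \times \mathbb{N}_k)$$
where $(\mathbb{N}_n \times \mathbb{N}_k) \cap (\{n + 1\} \times \mathbb{N}_k) = \emptyset$. Hence we obtain with 1. and Problem 5.2
$$|\mathbb{N}_{n+1} \times \mathbb{N}_k| = |\mathbb{N}_n \times \mathbb{N}_k| + |\{n + 1\} \times \mathbb{N}_k| = n \cdot k + k = (n + 1) \cdot k.$$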
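3. Since $|X| = |\mathbb{N}_n|$ and $|Y| = |\mathbb{N}_k|$ implies $|Y^X| = |\mathbb{N}_k^{\mathbb{N}_n}|$ (see Problem 5.1), it
suffices to show $|\mathbb{N}_k^{\mathbb{N}_n}| = k^n$ for all $n, k \in \mathbb{N}$. We prove this by induction on
$n \in \mathbb{N}$.
Induction base: $n = 0$. Then $\mathbb{N}_0 = \emptyset$ and there is only one function $f : \emptyset \to \mathbb{N}_k$,
i.e. $|\mathbb{N}_k^{\mathbb{N}_0}| = 1 = k^0$ for all $k \in \mathbb{N}$.
Induction step: $n \to n + 1$. We assume that we have a fixed $n \in \mathbb{N}$ with
$|\mathbb{N}_k^{\mathbb{N}_n}| = k^n$ for all $k \in \mathbb{N}$. Now we consider the map
$$F : \mathbb{N}_k^{\mathbb{N}_{n+1}} \to \mathbb{N}_k^{\mathbb{N}_n} \times \mathbb{N}_k, \quad f \mapsto (f|_{\mathbb{N}_n}, f(n + 1)),$$
where $f|_{\mathbb{N}_n} : \mathbb{N}_n \to \mathbb{N}_k$ denotes the restriction of $f : \mathbb{N}_{n+1} \to \mathbb{N}_k$ to $\mathbb{N}_n$. It is
not too difficult to see that $F$ is actually bijective, which implies
$$|\mathbb{N}_k^{\mathbb{N}_{n+1}}| = |\mathbb{N}_k^{\mathbb{N}_n} \times \mathbb{N}_k| = |\mathbb{N}_k^{\mathbb{N}_n}| \cdot |\mathbb{N}_k| = k^n \cdot k = k^{n+1}$$
by induction hypothesis and by 2.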
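4. Since $|X| = |\mathbb{N}_n|$ implies $|X!| = |\mathbb{N}_n!|$ (see Problem 5.1), it suffices to show
$|\mathbb{N}_n!| = n!$ for all $n \in \mathbb{N}$. We prove this by induction on $n \in \mathbb{N}$.
Induction base: $n = 0$. Then $|\mathbb{N}_0!| = |\emptyset!| = 1 = 0!$, since there is exactly one
bijective function $f : \emptyset \to \emptyset$.
Induction step: $n \to n + 1$. We assume that we have a fixed $n \in \mathbb{N}$ with
$|\mathbb{N}_n!| = n!$. We consider the map
$$F : \mathbb{N}_{n+1}! \to \mathbb{N}_n! \times \mathbb{N}_{n+1}, \quad f \mapsto (f', f(n + 1)),$$
where $f' : \mathbb{N}_n \to \mathbb{N}_n$ is the function defined by
$$f'(k) := \begin{cases} f(k) & \text{if } f(k) \in \mathbb{N}_n \\ f(n + 1) & \text{if } f(k) = n + 1 \end{cases}$$
for all $k \in \mathbb{N}_n$ and $f : \mathbb{N}_{n+1} \to \mathbb{N}_{n+1}$. Since $f'$ is bijective for any bijective $f$, it
is clear that $F$ is well-defined. Moreover, since $F$ is bijective, we obtain
$$|\mathbb{N}_{n+1}!| = |\mathbb{N}_n! \times \mathbb{N}_{n+1}| = |\mathbb{N}_n!| \cdot |\mathbb{N}_{n+1}| = n! \cdot (n + 1) = (n + 1)!$$
by induction hypothesis and by 2.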
5. By Theorem 4.45 we have $|2^X| = |\{0, 1\}^X|$. Hence, we can conclude with 3.
$$|2^X| = |\{0, 1\}^X| = |\{0, 1\}|^{|X|} = 2^{|X|}.$$
$\Box$
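The formulas of Theorem 5.39 can be sanity-checked on small examples by enumerating the constructed sets. The sketch below uses Python's `itertools`; the concrete sets `X`, `Y`, `U`, `V` are arbitrary choices made for this illustration, and the disjoint union is modelled as $(\{1\} \times X) \cup (\{2\} \times Y)$ as in the proof above.

```python
from itertools import combinations, permutations, product
from math import factorial

X = ["a", "b", "c"]      # |X| = 3
Y = [0, 1, 2, 3]         # |Y| = 4

counts = {
    # disjoint union modelled as ({1} x X) u ({2} x Y)
    "disjoint_union": len({(1, x) for x in X} | {(2, y) for y in Y}),
    # product X x Y
    "product": len(list(product(X, Y))),
    # function set Y^X: one value in Y for each element of X
    "functions": len(list(product(Y, repeat=len(X)))),
    # X!: bijections X -> X, i.e. permutations of X
    "bijections": len(list(permutations(X))),
    # power set 2^X: subsets of every size r
    "power_set": sum(1 for r in range(len(X) + 1) for _ in combinations(X, r)),
}

expected = {
    "disjoint_union": len(X) + len(Y),
    "product": len(X) * len(Y),
    "functions": len(Y) ** len(X),
    "bijections": factorial(len(X)),
    "power_set": 2 ** len(X),
}
print(counts == expected)

# Item 1 for two overlapping sets: |U u V| + |U n V| = |U| + |V|.
U, V = {1, 2, 3}, {3, 4}
union_plus_intersection = len(U | V) + len(U & V)
print(union_plus_intersection == len(U) + len(V))
```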
This result explains one motivation for the notations for products, the power set,
the set of functions and the set of bijective functions. These notations are inspired by
the corresponding operations on natural numbers that yield the cardinalities
of these sets in the case of finite sets. For infinite sets, these notations still make some
sense in cardinal arithmetic, but we are not going to discuss this here. Countable
sets are less well-behaved with respect to set constructions. We give some examples
of operations that do not preserve countability.
Example 5.40 By definition $\mathbb{N}$ is countable.
1. $2^{\mathbb{N}}$ and $\mathbb{N}^{\mathbb{N}}$ are uncountable, hence power sets $2^X$ and function sets $Y^X$ are not
countable in general for countable $X, Y$.
2. $\prod_{i \in \mathbb{N}} \mathbb{N} = \mathbb{N}^{\mathbb{N}}$ is uncountable, hence products $\prod_{i \in I} X_i$ are not countable in
general for countable $I$ and $X_i$.
3. $\mathbb{N}!$ is uncountable (see Problem 5.23), hence $X!$ is not countable in general for
countable $X$.
However, we can say at least something positive on set operations that preserve
countability.
Proposition 5.41 Let $X$ and $Y$ be countable and let $(X_i)_{i \in I}$ be a countable family
of countable sets, i.e. $I$ is countable and $X_i$ is countable for all $i \in I$. Then we
obtain:
1. $X \cup Y$ and $X \cap Y$ are countable,
2. $\bigcup_{i \in I} X_i$ and $\bigcap_{i \in I} X_i$ are countable,
3. $X \times Y$ and $X^n$ are countable for each $n \in \mathbb{N}$.
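Proof. That $X$, $Y$ and $X_i$ are countable for all $i \in I$ means $|X| \le |\mathbb{N}|$, $|Y| \le |\mathbb{N}|$ and
$|X_i| \le |\mathbb{N}|$ for all $i \in I$. Hence there are injective functions $f : X \to \mathbb{N}$, $g : Y \to \mathbb{N}$
and $f_i : X_i \to \mathbb{N}$ for all $i \in I$.
1. The function
$$F : X \cup Y \to \mathbb{N}, \quad z \mapsto \begin{cases} 2f(z) & \text{if } z \in X \\ 2g(z) + 1 & \text{if } z \in Y \setminus X \end{cases}$$
is injective. Hence $|X \cup Y| \le |\mathbb{N}|$ and $X \cup Y$ is countable. Moreover, $X \cap Y \subseteq X \cup Y$
and hence $|X \cap Y| \le |X \cup Y| \le |\mathbb{N}|$ and $X \cap Y$ is countable too.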
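2. It is sufficient to prove the statement for $I = \mathbb{N}_n$ with $n \in \mathbb{N}$ and for $I = \mathbb{N}$.
The case of a general $I$ can be reduced to either of these cases. The case
$I = \mathbb{N}_n$ with $n \in \mathbb{N}$ follows from 1. by induction over $n \in \mathbb{N}$. We only discuss
the case $I = \mathbb{N}$ here. We consider the function
$$m : \bigcup_{i \in \mathbb{N}} X_i \to \mathbb{N}, \quad x \mapsto \min\{i \in \mathbb{N} : x \in X_i\}$$
that determines for each $x \in \bigcup_{i \in \mathbb{N}} X_i$ the smallest index $i = m(x) \in \mathbb{N}$ such
that $x \in X_i$. Using this function, we define
$$F : \bigcup_{i \in \mathbb{N}} X_i \to \mathbb{N} \times \mathbb{N}, \quad x \mapsto (f_{m(x)}(x), m(x)).$$
This function $F$ is injective and hence $|\bigcup_{i \in \mathbb{N}} X_i| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$ by Corollary 5.17.
This means that $\bigcup_{i \in \mathbb{N}} X_i$ is countable. Moreover, $\bigcap_{i \in \mathbb{N}} X_i \subseteq \bigcup_{i \in \mathbb{N}} X_i$
and hence $|\bigcap_{i \in \mathbb{N}} X_i| \le |\bigcup_{i \in \mathbb{N}} X_i| \le |\mathbb{N}|$ and this means that $\bigcap_{i \in \mathbb{N}} X_i$
is countable as well.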
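3. The function
$$f \times g : X \times Y \to \mathbb{N} \times \mathbb{N}, \quad (x, y) \mapsto (f(x), g(y))$$
is injective (see Problem 4.8). Hence $|X \times Y| \le |\mathbb{N} \times \mathbb{N}| = |\mathbb{N}|$ by Corollary 5.17.
This means that $X \times Y$ is countable. It follows that $X^n$ is countable for all
$n \in \mathbb{N}$ by an easy induction on $n$.
$\Box$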
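The injection $F$ from part 1 of the proof interleaves two given injections by sending elements of $X$ to even numbers and elements of $Y \setminus X$ to odd numbers. The following Python sketch illustrates this on small finite sets; the sets and the injections `f`, `g` (list positions) are assumptions made only for the example.

```python
def union_injection(X: set, f, g):
    """Build F on the union of X and Y from injections f : X -> N and g : Y -> N."""
    def F(z):
        # elements of X get even codes, elements of Y \ X get odd codes
        return 2 * f(z) if z in X else 2 * g(z) + 1
    return F

# Hypothetical countable (here even finite) sets with a common element "c".
X = ["a", "b", "c"]
Y = ["c", "d"]
f = X.index   # injective map X -> N: position in the list
g = Y.index   # injective map Y -> N: position in the list

F = union_injection(set(X), f, g)
codes = {z: F(z) for z in set(X) | set(Y)}
print(sorted(codes.items()))
```

The common element is encoded through `f` only, which is exactly why the second case of $F$ is restricted to $Y \setminus X$.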
Other operations that are well-behaved with respect to countability are the disjoint
union (see Problem 5.24) and the Kleene star operation (see Problem 5.25).
Problems
5.22 Let $X$ and $Y$ be finite sets. Prove that
1. $|X \setminus Y| = |X| - |X \cap Y|$,
2. $|X \setminus Y| = |X| - |Y|$ if $Y \subseteq X$,
3. $|X \Delta Y| = |X \cup Y| - |X \cap Y|$.
5.23 Prove that the set $\mathbb{N}!$ of bijections $f : \mathbb{N} \to \mathbb{N}$ is not countable.
5.24 Let $(X_i)_{i \in \mathbb{N}}$ be a sequence of countable sets, i.e. $X_i$ is countable for all $i \in \mathbb{N}$.
Prove that the disjoint union $\bigsqcup_{i \in \mathbb{N}} X_i$ is countable.
5.25 Let $X$ be a set. We consider the Kleene operation $X^*$ (i.e. the set of all finite words
over $X$). Prove the following:
1. $\{0, 1\}^*$ is countably infinite,
2. $X^*$ is countable for each countable $X$.
5.26 For any set $X$ we define the set
$$T(X) := \{A \subseteq X : A \text{ finite}\}$$
of finite subsets of $X$. Prove that
1. $X$ finite $\Longrightarrow$ $T(X)$ finite,
2. $X$ countable $\Longrightarrow$ $T(X)$ countable.
Prove that the following are equivalent:
1. $|X| = |T(X)|$ for any infinite set $X$,
2. the Axiom of Choice.
5.27 For any two sets $X$ and $Y$ we define the set
$$\binom{X}{Y} := \{A \subseteq X : |A| = |Y|\}$$
of subsets of $X$ with exactly the same cardinality as $Y$. Prove that for finite sets $X$ and $Y$
we obtain
$$\left|\binom{X}{Y}\right| = \binom{|X|}{|Y|},$$
where the right hand side uses the binomial coefficient as defined in Problem 5.15.
5.28 Let $X$ and $Y$ be finite sets with $k = |X|$ and $n = |Y|$. Prove that there are
$\frac{n!}{(n-k)!}$ many injective functions $f : X \to Y$.
5.29 Let $X$ be a set. Prove that the following are equivalent, without using the Axiom of
Choice:
1. $X$ is infinite,
2. $2^{2^X}$ is Dedekind infinite.
Hint: Consider the function $f : \mathbb{N} \to 2^{2^X}, n \mapsto \{A \subseteq X : |A| = n\}$.
5.30 Prove that the following are equivalent:
1. For all sets X, Y we obtain
[X[ < [X Y [ and [Y [ < [X Y [ =X Y nite,
2. the Axiom of Choice.
CHAPTER 6
Order
The mathematical sciences particularly exhibit order, symmetry, and limitation;
and these are the greatest forms of the beautiful.
Aristotle (384-322 BC)
6.1 What is Order?
So far we have studied mainly uniqueness and totality properties of relations and
their consequences. The concepts of left and right totality and uniqueness are the
building blocks that we have used to define functions, injections, surjections and
bijections. These building blocks have also led to the concept of cardinality. Now
we want to focus on homogeneous relations that extend the identity relation. Such
relations can be used to order mathematical objects and to identify them according
to specific properties. Often there are different properties that we can use to identify
or order objects. For instance, the relation $\le$ on natural numbers orders the natural
numbers according to their appearance in the sequence 0, 1, 2, 3, ... (one could also
say that this is an additive property), whereas the divisibility relation $|$ orders natural
numbers according to their multiplicative properties. These two ways of ordering
natural numbers focus on different properties and they are not identical. However,
they share certain properties as we will see. As another example, the relation $\subseteq$
orders sets according to the containment of elements and the relation $|X| \le |Y|$
orders sets according to their cardinality. As we have seen, these different types of
ordering sets are not identical, but again they share certain properties. In a first
step we will identify the relevant properties that certain types of order relations have
in common.
6.2 Reflexivity, Symmetry and Transitivity
We recall that we call a relation $R \subseteq X \times X$ a relation on $X$. Such relations are also
called homogeneous, since source and target set are identical. Similarly, as totality
and uniqueness are the building blocks of our study of functions, the concepts of
reflexivity, symmetry and transitivity and variants thereof are the building blocks
of our study of order relations.
Definition 6.1 (Reflexivity, symmetry and transitivity) Let $R \subseteq X \times X$ be
a relation. Then
1. $R$ is called reflexive, if $xRx$ holds for all $x \in X$.
2. $R$ is called irreflexive, if $xRx$ holds for no $x \in X$.
3. $R$ is called symmetric, if $xRy$ implies $yRx$ for all $x, y \in X$.
4. $R$ is called antisymmetric, if $xRy$ and $yRx$ implies $x = y$ for all $x, y \in X$.
5. $R$ is called transitive, if $xRy$ and $yRz$ implies $xRz$ for all $x, y, z \in X$.
6. $R$ is called total, if $xRy$ or $yRx$ holds for all $x, y \in X$.
The reader should be warned that the concept of totality mentioned here is not
the same that we studied before. A relation that is total in the sense defined here is
left total and right total, but not every left and right total relation is necessarily total
in the sense specified here. We have already seen a number of relations with some
of the properties listed in the previous definition.
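Example 6.2 In the following table we list the empty relation $\emptyset \subseteq \mathbb{N} \times \mathbb{N}$, the all
relation $\mathbb{N} \times \mathbb{N}$, the relations $=$, $\ne$, $\le$, $<$ and divisibility $|$ all considered on $\mathbb{N}$ and
the set relations $\subseteq$, $\subsetneq$ and $\not\subseteq$ all on $2^{\mathbb{N}}$. For readability, the table has one row per
relation and one column per property.

relation                        reflexive  irreflexive  symmetric  antisymmetric  transitive  total
$\emptyset$                         -          +            +           +             +         -
$\mathbb{N} \times \mathbb{N}$      +          -            +           -             +         +
$=$                                 +          -            +           +             +         -
$\ne$                               -          +            +           -             -         -
$\le$                               +          -            -           +             +         +
$<$                                 -          +            -           +             +         -
$|$                                 +          -            -           +             +         -
$\subseteq$                         +          -            -           +             +         -
$\subsetneq$                        -          +            -           +             +         -
$\not\subseteq$                     -          +            -           -             -         -

A plus + in the table means that the relation has the corresponding property, a
minus - means that it does not have the property.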
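The properties of Definition 6.1 can be tested mechanically on a finite set. The following Python sketch checks them pointwise for a relation given as a predicate; it only inspects the finite truncation $\{0, ..., 5\}$ of $\mathbb{N}$, so it illustrates rather than proves the corresponding table entries. The function names are choices made for this example.

```python
def properties(X, rel):
    """Check the six properties of Definition 6.1 for rel on the finite set X."""
    X = list(X)
    return {
        "reflexive": all(rel(x, x) for x in X),
        "irreflexive": not any(rel(x, x) for x in X),
        "symmetric": all(rel(y, x) for x in X for y in X if rel(x, y)),
        "antisymmetric": all(x == y for x in X for y in X
                             if rel(x, y) and rel(y, x)),
        "transitive": all(rel(x, z) for x in X for y in X for z in X
                          if rel(x, y) and rel(y, z)),
        "total": all(rel(x, y) or rel(y, x) for x in X for y in X),
    }

def divides(x: int, y: int) -> bool:
    """x | y on N: there is a natural number z with x * z == y."""
    return y == 0 if x == 0 else y % x == 0

le = properties(range(6), lambda x, y: x <= y)
div = properties(range(6), divides)
ne = properties(range(6), lambda x, y: x != y)
for name, props in [("<=", le), ("|", div), ("!=", ne)]:
    print(name, [p for p, holds in props.items() if holds])
```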
[Figure 6.1: Some common types of homogeneous relations $R \subseteq X \times X$. The diagram relates
homogeneous relations, irreflexive relations, reflexive relations, strict orders, preorders,
partial orders, linear orders, compatibility relations, equivalence relations and the equality
relation; its edges are labeled with the defining properties reflexive, irreflexive, symmetric,
antisymmetric, transitive and total.]
Symmetry is quite a restrictive property for relations. The equality relation,
the empty relation and the all relation are all unique symmetric relations in some
sense. For instance, equality is the only relation on a given set that is reflexive,
symmetric and antisymmetric (see Problem 6.2). It is interesting to point out that
the properties of homogeneous relations defined here can be characterized in terms
of composition and inversion and without mentioning points. We formulate a
corresponding result.
Proposition 6.3 Let $R \subseteq X \times X$ be a relation. Then the following hold:
1. $R$ is reflexive if and only if $\Delta_X \subseteq R$.
2. $R$ is irreflexive if and only if $R \cap \Delta_X = \emptyset$.
3. $R$ is symmetric if and only if $R = R^{-1}$.
4. $R$ is antisymmetric if and only if $R \cap R^{-1} \subseteq \Delta_X$.
5. $R$ is transitive if and only if $R \circ R \subseteq R$.
6. $R$ is total if and only if $X \times X \subseteq R \cup R^{-1}$.
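The point-free characterizations of Proposition 6.3 translate directly into set operations on relations represented as sets of pairs. The sketch below evaluates them for the relation $\le$ on the hypothetical finite set $\{0, 1, 2, 3\}$; the helper names `diagonal`, `inverse` and `compose` are assumptions of this example.

```python
from itertools import product

def diagonal(X):
    """The diagonal (equality relation) on X."""
    return {(x, x) for x in X}

def inverse(R):
    """The inverse relation: all pairs of R swapped."""
    return {(y, x) for (x, y) in R}

def compose(R, S):
    """All pairs (x, z) such that (x, y) is in R and (y, z) is in S for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

X = range(4)
R = {(x, y) for x in X for y in X if x <= y}   # "less or equal" on {0, 1, 2, 3}

checks = {
    "reflexive": diagonal(X) <= R,
    "irreflexive": not (R & diagonal(X)),
    "symmetric": R == inverse(R),
    "antisymmetric": R & inverse(R) <= diagonal(X),
    "transitive": compose(R, R) <= R,
    "total": set(product(X, X)) <= R | inverse(R),
}
print(checks)
```

Here `<=` between Python sets is the subset test, so each entry is a literal transcription of the corresponding item of the proposition.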
We leave the proof to the reader (see Problem 6.1). The diagram in Figure 6.1
shows how the building blocks of reflexivity, symmetry and transitivity can be used
to define certain common types of order relations. Those types of relations that are
highlighted in bold face are the most common ones in mathematics. We will, in
particular, focus on the right hand side of the diagram and study those relations
that contain the equality relation (i.e. the reflexive relations).
Problems
6.1 Let $R \subseteq X \times X$ be a relation. Prove all the statements in Proposition 6.3.
6.2 Let $X$ be a set. Prove the following:
1. The equality relation $\Delta_X \subseteq X \times X$ is the only relation on $X$ that is reflexive, symmetric
and antisymmetric.
2. The empty relation $\emptyset \subseteq X \times X$ is the only relation on $X$ that is irreflexive, symmetric
and antisymmetric.
3. The all relation $X \times X$ is the only relation on $X$ that is symmetric and total.
6.3 Let $R \subseteq X \times X$ be a relation.
1. We define the reflexive closure of $R$ by $R^= := \Delta_X \cup R$. Prove the following:
a) $R^=$ is a reflexive relation.
b) If $S \subseteq X \times X$ is reflexive and $R \subseteq S$, then $R^= \subseteq S$.
c) $R^= = \bigcap \{S \subseteq X \times X : S$ reflexive and $R \subseteq S\}$.
d) If $R$ is reflexive, then $R = R^=$.
2. We define the reflexive and symmetric closure of $R$ by $\overleftrightarrow{R} := \Delta_X \cup R \cup R^{-1}$. Prove the
following:
a) $\overleftrightarrow{R}$ is a reflexive and symmetric relation.
b) If $S \subseteq X \times X$ is reflexive and symmetric and $R \subseteq S$, then $\overleftrightarrow{R} \subseteq S$.
c) $\overleftrightarrow{R} = \bigcap \{S \subseteq X \times X : S$ reflexive and symmetric and $R \subseteq S\}$.
d) If $R$ is reflexive and symmetric, then $R = \overleftrightarrow{R}$.
3. We define the transitive closure of $R$ by $R^+ := \bigcup_{n=1}^{\infty} R^n$. Here $R^n$ stands for the
$n$-fold composition of $R$ with itself. Prove the following:
a) $R^+$ is a transitive relation.
b) If $S \subseteq X \times X$ is transitive and $R \subseteq S$, then $R^+ \subseteq S$.
c) $R^+ = \bigcap \{S \subseteq X \times X : S$ transitive and $R \subseteq S\}$.
d) If $R$ is transitive, then $R = R^+$.
6.4 Let $X$ be a finite set with $n = |X|$ elements. Prove the following:
1. There are exactly $2^{n^2}$ relations $R \subseteq X \times X$.
2. There are exactly $2^{n^2 - n}$ reflexive relations $R \subseteq X \times X$.
6.3 Equivalence Relations
Often one needs to identify some mathematical objects that share certain properties.
Equivalence relations are a tool to express such identifications.
Definition 6.4 (Equivalence relation) Let $\equiv$ be a relation on $X$. Then $\equiv$ is
called an equivalence relation on $X$, if $\equiv$ is reflexive, symmetric and transitive.
Perhaps the most basic example of an equivalence relation is the equality relation
$=$ on an arbitrary set $X$. Obviously, we use the equality to identify objects. However,
in general we can also identify objects which are not equal.
Example 6.5 We mention a few examples of equivalence relations.
1. Let $X$ be a set with the diagonal $\Delta_X \subseteq X \times X$. The diagonal can be seen as
equality relation on $X$ and this relation is an equivalence relation.
2. Let $X$ be a set with the all relation $X \times X$. This relation is also an equivalence
relation on $X$.
3. Let $S$ be a set of sets. We define the equinumerosity relation $\approx\; \subseteq S \times S$ by
$$X \approx Y :\iff |X| = |Y|$$
for all $X, Y \in S$. By Corollary 5.9 this relation is an equivalence relation.
4. Let $n \in \mathbb{N}$ be fixed. For integers $x, y \in \mathbb{Z}$ we define the relation
$$x \equiv_n y :\iff n \text{ divides } |x - y|.$$
In this case $x$ is called congruent to $y$ modulo $n$. The relation $\equiv_n$ is an
equivalence relation (see Problem 6.9).
5. Let $f : X \to Y$ be a function. Then we define the fiber relation $\equiv_f\; \subseteq X \times X$
of $f$ by
$$x \equiv_f y :\iff f(x) = f(y)$$
for all $x, y \in X$. This relation is an equivalence relation. The fiber relation
$\equiv_f$ is also called the equivalence kernel of $f$.
We will study the fiber relation $\equiv_f$ of functions somewhat more below, since it
is a particularly important equivalence relation. Since the purpose of equivalence
relations is to identify objects, we need a tool to combine those objects that we
identify. By $[x]$ we denote the equivalence class of $x$, which is the set of all those
objects that are identified with $x$ with respect to some given equivalence relation.
Definition 6.6 (Equivalence classes) Let $\equiv$ be an equivalence relation on $X$.
Then we define the equivalence class of $x \in X$ by
$$[x] := \{y \in X : y \equiv x\}.$$
The set
$$X/{\equiv} := \{[x] : x \in X\}$$
of all equivalence classes is called the quotient of $X$ by $\equiv$.
Any equivalence class $[x]$ can be seen as a cluster of objects in $X$, namely the
cluster of those objects that we identify. The quotient $X/{\equiv}$ is then the coarsening
of $X$ that we obtain if we replace points by clusters. In fact, the quotient $X/{\equiv}$
yields a partition of $X$ and vice versa any partition of $X$ defines an equivalence relation
whose quotient is the original partition (see Problem 6.8 for details). We give an
example.
Example 6.7 Let $X = \{1, 2, 3\}$ and let $\equiv\; := \{(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)\} \subseteq X \times X$.
Then $\equiv$ is an equivalence relation and $X/{\equiv} = \{[1], [3]\}$, where $[1] = [2]$.
[Figure 6.2: Quotient $X/{\equiv}$ of $X = \{1, 2, 3\}$, with the points 1 and 2 grouped into one class and 3 forming its own class.]
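The equivalence classes and the quotient of Example 6.7 can be computed directly from Definition 6.6. The following Python sketch represents the relation as a set of pairs; the function names are chosen for this illustration only.

```python
def equivalence_class(x, X, rel):
    """[x] = set of all y in X with y related to x."""
    return frozenset(y for y in X if (y, x) in rel)

def quotient(X, rel):
    """The quotient of X by rel: the set of all equivalence classes."""
    return {equivalence_class(x, X, rel) for x in X}

# The data of Example 6.7.
X = {1, 2, 3}
rel = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3)}

Q = quotient(X, rel)
print(sorted(sorted(c) for c in Q))
print(equivalence_class(1, X, rel) == equivalence_class(2, X, rel))
```

Using `frozenset` for classes lets Python identify $[1]$ and $[2]$ automatically, so the quotient has two elements although $X$ has three.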
The map in the following definition assigns to each point its equivalence class.
Definition 6.8 (Canonical projection) Let $\equiv$ be an equivalence relation on a
set $X$. Then
$$p : X \to X/{\equiv}, \quad x \mapsto [x]$$
is called the canonical projection of the equivalence relation $\equiv$.
It is easy to see that $p$ is surjective and that the fiber relation of $p$ is just $\equiv$
(see Problem 6.6). In particular, this shows that any equivalence relation $\equiv$ on $X$
is the fiber relation of some function $f : X \to Y$ for some suitable $Y$. Now we prove
that the canonical projection and the fiber relation can be used to decompose any
function into an injective, a bijective and a surjective function.
Theorem 6.9 (Canonical decomposition) Let $f : X \to Y$ be a function. Then
$f$ can be decomposed into an injective function $i$, a bijective function $b$ and a
surjective function $p$, i.e.
$$f = i \circ b \circ p.$$
This decomposition can be obtained with the fiber relation $\equiv_f$ and the following
selection of functions:
1. $p : X \to X/{\equiv_f}, x \mapsto [x]$ is the canonical projection of $\equiv_f$,
2. $b : X/{\equiv_f} \to \mathrm{range}(f), [x] \mapsto f(x)$,
3. $i : \mathrm{range}(f) \to Y, x \mapsto x$ is the restriction of the identity to $\mathrm{range}(f)$.
Proof. It is clear that $p$ is surjective (see Problem 6.6) and that $i$ is injective (since
it is a restriction of the injective identity). We need to prove that $b$ is well-defined
and bijective. We obtain for all $x, y \in X$
$$[x] = [y] \iff x \equiv_f y \iff f(x) = f(y) \iff b([x]) = b([y]).$$
The forwards direction $\Longrightarrow$ shows that $b$ is well-defined (i.e. right unique) and the
backwards direction $\Longleftarrow$ shows that $b$ is injective (i.e. left unique). Moreover, $b$
is also surjective since for any $y \in \mathrm{range}(f)$ there is some $x \in X$ with $f(x) = y$ and
hence $b([x]) = f(x) = y$. Now we still need to prove $f = i \circ b \circ p$. We obtain
$$i \circ b \circ p(x) = i \circ b([x]) = i(f(x)) = f(x)$$
for all $x \in X$ and hence $f = i \circ b \circ p$. $\Box$
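For a function on a finite set, the three factors of Theorem 6.9 can be written down explicitly. The following Python sketch is one possible transcription; the example function on $\{1, 2, 3\}$ is a hypothetical choice (not necessarily the function drawn in the figure), and equivalence classes are represented as `frozenset` objects.

```python
def canonical_decomposition(f, X):
    """Return (p, b, i) with f(x) == i(b(p(x))) for all x in X."""
    def p(x):
        # canonical projection: x is sent to its class [x] under the fiber relation
        return frozenset(y for y in X if f(y) == f(x))

    def b(cls):
        # well-defined because f is constant on each class
        return f(next(iter(cls)))

    def i(y):
        # inclusion of range(f) into the target set
        return y

    return p, b, i

# Hypothetical example function f : {1, 2, 3} -> {1, 2, 3}.
X = [1, 2, 3]
table = {1: 2, 2: 2, 3: 1}

def f(x):
    return table[x]

p, b, i = canonical_decomposition(f, X)
classes = {p(x) for x in X}
print(sorted(sorted(c) for c in classes))
print(all(i(b(p(x))) == f(x) for x in X))
```

In this example the quotient has the two classes $\{1, 2\}$ and $\{3\}$, and $b$ maps them bijectively onto $\mathrm{range}(f) = \{1, 2\}$.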
The composition $f = i \circ b \circ p$ of $f$ is called the canonical decomposition of $f$. The
commutative diagram in Figure 6.3 illustrates the situation for a general function
$f : X \to Y$ whereas the set diagram next to it illustrates the special case of a
particular function $f : \{1, 2, 3\} \to \{1, 2, 3\}$.
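[Figure 6.3: Canonical decomposition $f = i \circ b \circ p$, shown as a commutative diagram with $p : X \to X/{\equiv_f}$, $b : X/{\equiv_f} \to \mathrm{range}(f)$, $i : \mathrm{range}(f) \to Y$ and $f : X \to Y$, next to a set diagram for a particular function.]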
Problems
6.5 Let $\equiv\; \subseteq X \times X$ be an equivalence relation. Then the following three statements are
equivalent to each other for all $x, y \in X$:
1. $x \equiv y$,
2. $[x] = [y]$,
3. $[x] \cap [y] \ne \emptyset$.
6.6 Let $\equiv$ be an equivalence relation on $X$ with canonical projection $p : X \to X/{\equiv}$. Prove
the following:
1. $p$ is surjective,
2. $\equiv$ is the same relation as $\equiv_p$.
Here $\equiv_p$ denotes the fiber relation of $p$.
6.7 Let $X$ be a set and $R \subseteq X \times X$ a relation on $X$. We define the equivalence closure of
$R$ by $R^{\equiv} := (\overleftrightarrow{R})^+$ (see Problem 6.3). Prove the following:
1. $R^{\equiv}$ is an equivalence relation.
2. If $S \subseteq X \times X$ is an equivalence relation and $R \subseteq S$, then $R^{\equiv} \subseteq S$.
3. $R^{\equiv} = \bigcap \{S \subseteq X \times X : S$ is an equivalence relation and $R \subseteq S\}$.
4. If $R$ is an equivalence relation, then $R = R^{\equiv}$.
6.8 Let $X$ be a set. Then $P \subseteq 2^X$ is called a partition of $X$, if the following hold:
1. $\emptyset \notin P$ (non-emptiness)
2. $\bigcup P = X$ (cover)
3. $A \ne B \Longrightarrow A \cap B = \emptyset$ for all $A, B \in P$ (disjointness)
Each element $C \in P$ is called a cell of the partition $P$. Prove the following:
1. If $P$ is a partition of $X$, then $\bigcup_{C \in P} C \times C$ is an equivalence relation $\equiv_P$ on $X$.
Moreover, $X/{\equiv_P}$ is identical to $P$.
2. If $\equiv$ is an equivalence relation on $X$, then $X/{\equiv}$ is a partition $P$ of $X$ such that
$\equiv_P$ is identical with $\equiv$.
6.9 Let $n \in \mathbb{N}$ be fixed. For integers $x, y \in \mathbb{Z}$ we define the relation
$$x \equiv_n y :\iff n \text{ divides } |x - y|.$$
In this case $x$ is called congruent to $y$ modulo $n$. This is often denoted by $x \equiv y \pmod{n}$
instead of $x \equiv_n y$. Prove the following:
1. $\equiv_n$ is an equivalence relation,
2. $\equiv_0$ is the equality relation $\Delta_{\mathbb{Z}}$,
3. $[x] = \{x + zn : z \in \mathbb{Z}\}$ for all $x \in \mathbb{Z}$,
4. $\mathbb{Z}/{\equiv_n} = \{[0], [1], ..., [n - 1]\}$ for $n > 0$,
5. $|\mathbb{Z}/{\equiv_n}| = n$ for $n > 0$ and $|\mathbb{Z}/{\equiv_0}| = |\mathbb{Z}|$.
6.10 Let $\equiv$ be an equivalence relation on a set $X$. Prove that it follows from the Axiom of
Choice that $|X/{\equiv}| \le |X|$.
6.11 Let $X$ be a finite set with $n = |X|$ elements. Prove that there are exactly $B_n$ equivalence
relations on $X$, where $B_n$ is the $n$th Bell number, defined by $B_0 := 1$ and
$$B_{n+1} := \sum_{k=0}^{n} \binom{n}{k} B_k$$
for all $n \in \mathbb{N}$.
6.4 Preorders, Partial Orders and Linear Orders
The minimal requirements that an order relation should satisfy are reflexivity and
transitivity. The properties of antisymmetry and totality are additional properties
that are considered.
Definition 6.10 (Order) Let $\le$ be a relation on $X$.
1. $\le$ is called a preorder, if it is reflexive and transitive.
2. $\le$ is called a partial order, if it is reflexive, antisymmetric and transitive.
3. $\le$ is called a linear order, if it is reflexive, antisymmetric, transitive and total.
The pair $(X, \le)$ is called a preordered set, a partially ordered set or a linearly ordered
set if $\le$ is a preorder, a partial order or a linear order on $X$, respectively.
Preorders are sometimes also called quasiorders and linear orders are also called
total orders.
Example 6.11 We provide a number of examples of order relations.
1. The less or equal relation $\le$ on $\mathbb{N}$ is a linear order.
2. The divisibility relation $|$ on $\mathbb{N}$ is a partial order, but not a linear order.
3. The divisibility relation $|$ on $\mathbb{Z}$, defined by
$$x | y :\iff (\exists z \in \mathbb{Z})\; x \cdot z = y$$
for all $x, y \in \mathbb{Z}$, is a preorder, but not a partial order.
4. The prefix relation $\sqsubseteq$ on $X^*$ for some set $X$, defined by
$$(n, u_1, ..., u_n) \sqsubseteq (k, v_1, ..., v_k) :\iff (n \le k \text{ and } (\forall i \le n)\; u_i = v_i)$$
for all $n, k \in \mathbb{N}$ and $u_1, ..., u_n, v_1, ..., v_k \in X$, is a partial order, but not a linear
order in general.
5. The subset relation $\subseteq$ on the power set $2^X$ of some set $X$ is a partial order,
but not a linear order in general (see Problem 6.12).
6. The relation $\preceq$ on a set of sets $S$, defined by
$$X \preceq Y :\iff |X| \le |Y|$$
for all $X, Y \in S$ is a preorder, but not a partial order in general. It follows
from the Axiom of Choice that it is total (see Theorem 5.10).
The graphs in Figures 6.4 and 6.8 illustrate the partially ordered sets $(\mathbb{N}, \leq)$,
$(\mathbb{N}, |)$ and $(2^{\mathbb{N}}, \subseteq)$. Such a graph is called a Hasse diagram. If $x$ is below $y$ and
connected to $y$ by an edge, then this means that $x \leq y$. Edges that follow from
transitivity are omitted in these graphs. Linearly ordered sets like $(\mathbb{N}, \leq)$ actually
correspond to linear graphs in this way.
[Figure 6.4: Hasse diagrams of the partially ordered sets $(\mathbb{N}, \leq)$ and $(\mathbb{N}, |)$]
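The edges of a Hasse diagram of a finite partially ordered set can be computed directly from the description above: keep a pair $(x, y)$ with $x$ strictly below $y$ only if no third element lies strictly between them. The following Python sketch does this for divisibility on $\{1, ..., 12\}$; the function name `hasse_edges` and the choice of this finite fragment are assumptions made for illustration.

```python
def hasse_edges(elements, leq):
    """Return the pairs (x, y) with x strictly below y and no element strictly in between."""
    elements = list(elements)

    def lt(x, y):
        return x != y and leq(x, y)

    return [(x, y) for x in elements for y in elements
            if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)]

def divides(x, y):
    return y % x == 0  # x is assumed to be non-zero here

edges = hasse_edges(range(1, 13), divides)
print(edges)
```

The pair (2, 12) does not appear in the result, because it follows by transitivity from the edges (2, 4) and (4, 12). For the linear order $\leq$ on $\{0, 1, 2, 3\}$ the same function returns only the edges between consecutive numbers.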
Whenever we have a preorder, we automatically get an equivalence relation. If $R$
is a relation, then $R \cup R^{-1}$ is called the symmetric closure of $R$ (see
Problem 6.3). The symmetric closure is symmetric by construction, but it need not be
transitive. If $R$ is a preorder, however, then the intersection $R \cap R^{-1}$ is an
equivalence relation. This is what the following result says in other words.
Proposition 6.12 (Induced equivalence relation) Let $\leq$ be a preorder on $X$.
Then we obtain an equivalence relation $\equiv$ on $X$ by
$x \equiv y :\iff (x \leq y \text{ and } y \leq x)$
for all $x, y \in X$. The equivalence relation $\equiv$ is called the equivalence relation that is
induced by the preorder $\leq$.
Proof. We can also express the definition by saying that $\equiv$ is the intersection of
$\leq$ and its inverse $\leq^{-1}$. This relation is obviously symmetric and it is also reflexive,
since $\leq$ is reflexive. We prove that $\equiv$ is transitive. Let $x, y, z \in X$ such that $x \equiv y$
and $y \equiv z$. Then $x \leq y$ and $y \leq z$ and $z \leq y$ and $y \leq x$. Since $\leq$ is transitive, we
obtain $x \leq z$ and $z \leq x$, which means $x \equiv z$. $\Box$
We discuss a number of induced equivalence relations.
Example 6.13 We give some examples of induced equivalence relations.
1. The equivalence relation induced by any partial order on a set $X$ is the equality
relation on $X$ (see Problem 6.13).
2. The equivalence relation induced by the relation $\preceq$ from Example 6.11 is just the
equinumerosity relation of Example 6.5 (this is the statement of the Theorem of
Schröder-Bernstein 5.8).
3. The equivalence relation induced by the divisibility relation $|$ on $\mathbb{Z}$ has the
equivalence classes $[z] := \{z, -z\}$ for all $z \in \mathbb{Z}$.
If we start with a preorder $\leq$ on some set $X$, then the induced equivalence relation
$\equiv$ leads to some quotient $X/{\equiv}$ and on this quotient we can now derive a partial
order from $\leq$. We denote this derived partial order here with a slightly different
symbol $\leqq$.
Proposition 6.14 (Induced partial order) Let $\leq$ be a preorder on $X$ with
induced equivalence relation $\equiv$. Then we can define a partial order $\leqq$ on the quotient
$X/{\equiv}$ by
$[x] \leqq [y] :\iff x \leq y$
for all $x, y \in X$.
Proof. Firstly, we need to show that $\leqq$ is a well-defined relation on $X/{\equiv}$. Let
$x, x', y, y' \in X$ with $[x] = [x']$ and $[y] = [y']$. Then $x \equiv x'$ and $y \equiv y'$ and we obtain
$(x \leq y \implies x' \leq x \leq y \leq y')$ and $(x' \leq y' \implies x \leq x' \leq y' \leq y)$,
which means $x \leq y \iff x' \leq y'$. Hence, $\leqq$ is well-defined. Now we need to show
that $\leqq$ is a partial order.
Reflexivity: We obtain $[x] \leqq [x]$ for all $x \in X$ since $x \leq x$.
Antisymmetry: Let $[x] \leqq [y]$ and $[y] \leqq [x]$ for some $x, y \in X$. Then $x \leq y$ and $y \leq x$
and hence $x \equiv y$. But this means $[x] = [y]$.
Transitivity: Let $x, y, z \in X$ with $[x] \leqq [y] \leqq [z]$. Then $x \leq y \leq z$ and hence $x \leq z$,
which implies $[x] \leqq [z]$. $\Box$
This result is the reason why one usually studies partial orders and not preorders.
If a relation $\leq$ is just a preorder on a given set $X$, then one can replace it by the partial
order $\leqq$ induced on the quotient set $X/{\equiv}$.
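The passage from a preorder to the partial order on the quotient can be traced on a small example. The following Python sketch uses divisibility on the finite fragment $\{-4, ..., 4\}$ of $\mathbb{Z}$; the function names are hypothetical and the restriction to a finite fragment is an assumption made so that the classes can be enumerated.

```python
def divides(x, y):
    return y == 0 if x == 0 else y % x == 0

def equivalence_classes(X, leq):
    """Group X into the classes of the induced equivalence (x <= y and y <= x)."""
    classes = []
    for x in X:
        for cls in classes:
            rep = next(iter(cls))  # any representative works, by transitivity
            if leq(x, rep) and leq(rep, x):
                cls.add(x)
                break
        else:
            classes.append({x})
    return [frozenset(cls) for cls in classes]

def induced_leq(cls_x, cls_y, leq):
    """[x] <= [y] :<=> x <= y, evaluated on arbitrary representatives."""
    return leq(next(iter(cls_x)), next(iter(cls_y)))

classes = equivalence_classes(range(-4, 5), divides)
print(sorted(sorted(c) for c in classes))
# [[-4, 4], [-3, 3], [-2, 2], [-1, 1], [0]]
```

The classes are exactly the sets $\{z, -z\}$, and `induced_leq` is antisymmetric on them: two classes that are related in both directions coincide.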
Problems
6.12 Let $X$ be a set and consider the subset relation $\subseteq$ on $2^X$. Prove that $\subseteq$ is linear if and
only if $X$ has fewer than two elements.
6.13 Let $\leq$ be a partial order on a set $X$. Prove that the induced equivalence relation $\equiv$ is
the equality on $X$.
6.14 We call a relation $<$ on $X$ a strict order if it is irreflexive and transitive. Let $\leq$ be a
preorder on $X$. Prove that by
$x < y :\iff (x \leq y \text{ and } y \not\leq x)$
we can define a strict order on $X$, which is called the induced strict order of $\leq$. Prove that
1. the strict order $<$ induced by the usual less or equal relation $\leq$ on $\mathbb{N}$ is the usual
strictly less relation,
2. the strict order $\subsetneq$ induced by the inclusion relation $\subseteq$ on the power set $2^X$ of some
set $X$ is the usual proper inclusion relation,
3. the strict order $\prec$ induced by the cardinality relation $\preceq$ on some set $S$ of sets satisfies
the property $X \prec Y \iff |X| < |Y|$ for all $X, Y \in S$.
6.15 Let $X$ be a finite set with $n = |X|$ elements. Prove that there are exactly $n!$ linear
orders on $X$.
6.5 Monoids*
In the previous section we have seen that equivalence relations and partial orders
are often induced by preorders. But where do preorders come from? In this section
we show that many preorders are induced by monoids. A monoid is an algebraic
structure with one binary operation $\cdot : X \times X \to X$. In general, an algebraic
structure is a set together with operations that typically satisfy some additional
conditions. In the following definition we list some conditions that apply to a single
binary operation.
Definition 6.15 (Binary operations) Let $X$ be a set with a binary operation
$\cdot : X \times X \to X$. We usually write $x \cdot y$ instead of $\cdot(x, y)$ for all $x, y \in X$. Let $e \in X$.
We define the following properties of binary operations:
1. $\cdot$ is called associative if $x \cdot (y \cdot z) = (x \cdot y) \cdot z$ for all $x, y, z \in X$,
2. $\cdot$ is called commutative if $x \cdot y = y \cdot x$ for all $x, y \in X$,
3. $e$ is said to be an identity for $\cdot$ if $x \cdot e = x = e \cdot x$ for all $x \in X$,
4. if $e$ is an identity for $\cdot$, then $y \in X$ is said to be an inverse for $x \in X$ with respect
to $\cdot$, if $x \cdot y = e = y \cdot x$.
Algebraic structures with one binary operation $\cdot : X \times X \to X$ that satisfy
some combination of these conditions have particular names. We give a survey of
some common structures of this kind in the diagram in Figure 6.5. Here, we are mainly
interested in so-called monoids.
[Figure 6.5: Algebraic structures with one binary operation $\cdot : X \times X \to X$. The diagram connects magma, quasigroup, loop, semigroup, monoid and group by arrows labelled "inverses", "associative" and "identity".]
Definition 6.16 (Monoid) A triple $(M, \cdot, e)$ is called a monoid if $M$ is a set,
$\cdot : M \times M \to M$ is a binary operation on $M$ that is associative and $e \in M$ is an
identity for $\cdot$.
One can prove that the identity of a monoid is uniquely determined (see Problem 6.16).
The most important algebraic structures with one binary operation are
groups. These are monoids such that, additionally, each element has an inverse (see
Problem 6.20). We mention a number of monoids in the following example.
Example 6.17 Let $X$ be a set.
1. $(\mathbb{N}, +, 0)$ is a monoid, where $+ : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is the usual addition.
2. $(\mathbb{N}, \cdot, 1)$ is a monoid, where $\cdot : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is the usual multiplication.
3. $(\mathbb{Z}, \cdot, 1)$ is a monoid, where $\cdot : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ is the usual multiplication.
4. $(2^X, \cup, \emptyset)$ is a monoid, where $\cup : 2^X \times 2^X \to 2^X$ is the usual union.
5. $(X^*, \cdot, 0)$ is a monoid, where $\cdot : X^* \times X^* \to X^*$ is the concatenation operation,
defined by
$(n, u_1, ..., u_n) \cdot (k, v_1, ..., v_k) := (n + k, u_1, ..., u_n, v_1, ..., v_k)$
for all $n, k \in \mathbb{N}$ and $u_1, ..., u_n, v_1, ..., v_k \in X$.
6. $(X^X, \circ, \mathrm{id}_X)$ is a monoid, where $\circ : X^X \times X^X \to X^X$ is the composition.
We leave the proofs of these facts to Problem 6.17. The reason why we discuss
monoids here is that each monoid automatically comes with an induced preorder
that we mention in the following result.
Theorem 6.18 (Preorder of monoids) Let $(M, \cdot, e)$ be a monoid. Then we obtain
a preorder $\leq$ on $M$ by
$x \leq y :\iff (\exists z \in M)\; x \cdot z = y$
for all $x, y \in M$. The preorder $\leq$ is called the induced preorder of the monoid
$(M, \cdot, e)$.
Proof. Firstly, $\leq$ is reflexive, since $\cdot$ has an identity $e$ and hence $x = x \cdot e$ for all
$x \in M$ and hence $x \leq x$. Secondly, let $x \leq y$ and $y \leq z$ for $x, y, z \in M$. Then there
are $a, b \in M$ such that $y = x \cdot a$ and $z = y \cdot b$. Hence $z = (x \cdot a) \cdot b = x \cdot (a \cdot b)$
because $\cdot$ is associative. But this means $x \leq z$. Hence $\leq$ is transitive. $\Box$
The following example shows that many of the preorders that we have considered
here are actually induced by monoids.
Example 6.19 Let $X$ be a set. We consider some monoids and the induced
preorders:
1. $(\mathbb{N}, +, 0)$ induces the usual less or equal relation $\leq$ on $\mathbb{N}$,
2. $(\mathbb{N}, \cdot, 1)$ induces the usual divisibility relation $|$ on $\mathbb{N}$,
3. $(\mathbb{Z}, \cdot, 1)$ induces the usual divisibility relation $|$ on $\mathbb{Z}$,
4. $(X^*, \cdot, 0)$ induces the usual prefix relation $\sqsubseteq$ on $X^*$,
5. $(2^X, \cup, \emptyset)$ induces the usual inclusion relation $\subseteq$ on $2^X$.
We leave the proof to Problem 6.18. The example of the monoid $(\mathbb{Z}, \cdot, 1)$ shows
that the induced preorder of a monoid is not necessarily a partial order. A monoid
is just the right algebraic structure to yield an interesting preorder. If we have
too much algebraic structure, then the induced preorder can become trivial (see
Problem 6.21).
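For a finite monoid the induced preorder of Theorem 6.18 can be computed by searching for the witness $z$. The following Python sketch does this for the union monoid on the subsets of $\{0, 1, 2\}$ and for addition modulo 5 as a small group; the function names and both examples are illustrative assumptions, and the search is only meaningful when the whole monoid is passed in as a finite collection.

```python
from itertools import combinations

def powerset(X):
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def induced_preorder(M, op):
    """Return leq with leq(x, y) :<=> there is z in M such that op(x, z) == y."""
    M = list(M)
    return lambda x, y: any(op(x, z) == y for z in M)

# The union monoid on the subsets of {0, 1, 2} induces inclusion.
subsets = powerset({0, 1, 2})
leq_union = induced_preorder(subsets, lambda a, b: a | b)
print(all(leq_union(a, b) == (a <= b) for a in subsets for b in subsets))  # True

# A group, here addition modulo 5, induces the all relation.
Z5 = range(5)
leq_mod5 = induced_preorder(Z5, lambda a, b: (a + b) % 5)
print(all(leq_mod5(a, b) for a in Z5 for b in Z5))  # True
```

The second check illustrates why a group carries too much structure: every equation $x \cdot z = y$ has a solution, so any two elements are related.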
Problems
6.16 Let $(M, \cdot, e)$ be a monoid and let $e' \in M$ be an element with the property that
$x \cdot e' = x = e' \cdot x$ for all $x \in M$. Prove that $e = e'$ follows.
6.17 Let $X$ be a set. Prove the statements in Example 6.17.
6.18 Let $X$ be a set. Prove the statements of Example 6.19.
[Figure 6.6: Hasse diagram of the partially ordered set $(\{a, b\}^*, \sqsubseteq)$]
6.19 We consider the monoid $(X^X, \circ, \mathrm{id}_X)$ for a set $X$, where $\circ : X^X \times X^X \to X^X$ denotes
the ordinary composition operation. Let $\leq$ be the induced preorder and $\equiv$ the induced
equivalence relation on $X^X$.
1. Prove that $f \equiv \mathrm{id}_X$ if and only if $f$ has a right inverse.
2. Prove that $\leq$ is not antisymmetric, if $X$ has at least two elements.
3. Prove that $\leq$ is not total, if $X$ has at least two elements.
6.20 A monoid $(G, \cdot, e)$ is called a group, if for all $x \in G$ there is a $y \in G$ such that
$x \cdot y = e = y \cdot x$.
In this situation $y$ is called the inverse of $x$.
1. Prove that $(\mathbb{Z}, +, 0)$ is a group with the usual addition $+ : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$.
2. Prove that $(X!, \circ, \mathrm{id}_X)$ is a group for any set $X$, where $\circ : X! \times X! \to X!$ denotes the
composition (recall that $X!$ denotes the set of bijective functions $f : X \to X$).
6.21 Prove that the preorder induced by a group $(G, \cdot, e)$ is the all relation $G \times G$.
6.6 Maximum and Minimum
Some elements in a preordered set are located in a particular position. In order to
make this more precise, we need the concept of an upper and a lower bound.
Definition 6.20 (Upper and lower bounds) Let $(X, \leq)$ be a preordered set,
$A \subseteq X$ and $b \in X$.
1. $b$ is called an upper bound of $A$, if $x \leq b$ for all $x \in A$.
2. $b$ is called a lower bound of $A$, if $b \leq x$ for all $x \in A$.
That is, an upper bound of a set $A$ is an element $b \in X$ that is above all elements
of $A$ and a lower bound of $A$ is an element $b \in X$ that is below all elements of $A$.
However, in both cases the bound is not required to be an element of the set $A$ itself.
If an upper bound or a lower bound additionally is a member of $A$, then we call it
a greatest element or a least element, respectively.
Definition 6.21 (Minimum and Maximum) Let $(X, \leq)$ be a preordered set and
let $A \subseteq X$ and $m \in X$.
1. Then $m$ is called a greatest element of $A$ or a maximum of $A$ if $m \in A$ and $m$
is an upper bound of $A$.
2. Then $m$ is called a least element of $A$ or a minimum of $A$ if $m \in A$ and $m$ is a
lower bound of $A$.
The least and the greatest element of a subset of a preordered set need not exist
and if one of them exists it is not necessarily uniquely determined. We give some
examples.
Example 6.22 Let $X$ be a set.
1. The partially ordered set $(\mathbb{N}, \leq)$ has 0 as least element and it has no greatest
element.
2. The partially ordered set $(\mathbb{N}, |)$ has 0 as greatest element and 1 as least element.
3. The preordered set $(\mathbb{Z}, |)$ has 0 as greatest element and 1 and $-1$ as least
elements.
4. In the preordered set $(\mathbb{Z}, |)$ the set $A = \{-4, -2, -1, 1, 2, 4\}$ has the greatest
elements 4 and $-4$ and the least elements 1 and $-1$.
5. In the preordered set $(\mathbb{N}, |)$ the set $A = \{1, 2, 3\}$ has the least element 1 and no
greatest element (see the diagram in Figure 6.7).
6. The partially ordered set $(2^X, \subseteq)$ has the empty set $\emptyset$ as least element and $X$
as greatest element.
7. The partially ordered set $(X^*, \sqsubseteq)$ has the empty word 0 as least element and it
has no greatest element in general.
This example also shows that even a finite subset of a partially ordered set does
not necessarily have a greatest element. On the other hand, an infinite set (such as
$2^{\mathbb{N}}$ in $(2^{\mathbb{N}}, \subseteq)$) can have a least and a greatest element. In a partially ordered set,
the maximum and the minimum are at least uniquely determined if they exist. From
now on we will essentially consider only partially ordered sets.
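Some items of Example 6.22 can be reproduced by a direct search over finite sets. The following Python sketch implements Definitions 6.20 and 6.21 literally; the function names are hypothetical, and the ambient set passed to `upper_bounds` is a finite fragment of $\mathbb{N}$ chosen for illustration.

```python
def divides(x, y):
    return y == 0 if x == 0 else y % x == 0

def upper_bounds(X, A, leq):
    """All b in X with x <= b for every x in A."""
    return [b for b in X if all(leq(x, b) for x in A)]

def lower_bounds(X, A, leq):
    """All b in X with b <= x for every x in A."""
    return [b for b in X if all(leq(b, x) for x in A)]

def greatest_elements(A, leq):
    """Upper bounds of A that belong to A (possibly none, possibly several)."""
    return upper_bounds(A, A, leq)

def least_elements(A, leq):
    """Lower bounds of A that belong to A."""
    return lower_bounds(A, A, leq)

A = [-4, -2, -1, 1, 2, 4]
print(greatest_elements(A, divides))                 # [-4, 4]
print(least_elements(A, divides))                    # [-1, 1]
print(greatest_elements([1, 2, 3], divides))         # []
print(upper_bounds(range(0, 13), [1, 2, 3], divides))  # [0, 6, 12]
```

The last line shows that $\{1, 2, 3\}$ has upper bounds with respect to divisibility although none of them belongs to the set itself.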
Proposition 6.23 If $(X, \leq)$ is a partially ordered set and $A \subseteq X$, then $A$ has at
most one minimum and at most one maximum.
Proof. We prove the claim for the minimum. Let us assume that $A \subseteq X$ has two
minima $m, m'$. Then $m \in A$ and $m' \in A$ and hence $m \leq m'$ and $m' \leq m$. This
implies $m = m'$, since $\leq$ is antisymmetric. $\Box$
[Figure 6.7: The subset $A = \{1, 2, 3\}$ of the partially ordered set $(\mathbb{N}, |)$]
Since the maximum and the minimum of a set in a partially ordered set are uniquely
determined, if they exist at all, we can use a special notation for these elements.
Definition 6.24 (Minimum and maximum) Let $(X, \leq)$ be a partially ordered
set and let $A \subseteq X$. Then we denote by $\max(A)$ the maximum of $A$, if it exists, and
by $\min(A)$ the minimum of $A$, if it exists.
We give some further examples. In particular, we show that the maximum and
the minimum of a two element subset of a linear order always exist.
Example 6.25 We give some examples of maxima and minima.
1. We consider a linearly ordered set $(X, \leq)$. Then for all $x, y \in X$
$\max(x, y) = \begin{cases} y & \text{if } x \leq y \\ x & \text{otherwise} \end{cases}$ and $\min(x, y) = \begin{cases} x & \text{if } x \leq y \\ y & \text{otherwise} \end{cases}$
Hence these notations correspond to our previous usage of max and min as
functions on natural numbers (see Example 4.26).
2. We consider the partially ordered set $(2^X, \subseteq)$ for some set $X$. Then $\max(2^X) = X$
and $\min(2^X) = \emptyset$.
In Example 6.22 we have seen that it can happen that a set like $\{1, 2, 3\}$ in $(\mathbb{N}, |)$
has no greatest element. Nevertheless, there are elements like 2 and 3 in this set,
for which there are no greater elements. Such elements are called maximal.
Definition 6.26 (Maximal and minimal elements) Let $(X, \leq)$ be a partially
ordered set and let $A \subseteq X$.
1. Then an element $m \in A$ is called a maximal element of $A$, if $m \leq x$ implies
$m = x$ for all $x \in A$.
2. Then an element $m \in A$ is called a minimal element of $A$, if $x \leq m$ implies
$x = m$ for all $x \in A$.
Example 6.27 The elements 2 and 3 of the subset $A = \{1, 2, 3\}$ of the partially
ordered set $(\mathbb{N}, |)$ are maximal elements of $A$ and 1 is a minimal element.
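Definition 6.26 translates almost word for word into a search over a finite set. The following Python sketch recomputes Example 6.27; the function names are hypothetical and `divides` assumes a non-zero first argument.

```python
def divides(x, y):
    return y % x == 0  # x is assumed to be non-zero

def maximal_elements(A, leq):
    """m is maximal in A if leq(m, x) implies m == x for all x in A."""
    return [m for m in A if all(m == x for x in A if leq(m, x))]

def minimal_elements(A, leq):
    """m is minimal in A if leq(x, m) implies x == m for all x in A."""
    return [m for m in A if all(x == m for x in A if leq(x, m))]

A = [1, 2, 3]
print(maximal_elements(A, divides))  # [2, 3]
print(minimal_elements(A, divides))  # [1]
```

With the linear order $\leq$ instead of divisibility, the same set has the single maximal element 3, which is then also its maximum.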
If a subset of a partially ordered set has a maximum or a minimum, then this is the only
maximal or minimal element of that subset, respectively.
Proposition 6.28 Let $(X, \leq)$ be a partially ordered set and let $A \subseteq X$.
1. If $\max(A)$ exists, then it is the only maximal element of $A$.
2. If $\min(A)$ exists, then it is the only minimal element of $A$.
Proof. Let $A \subseteq X$ and let us assume that $\max(A)$ exists and let $m \in A$ be a
maximal element. Then $m \leq \max(A)$ since $\max(A)$ is the maximum and hence
$m = \max(A)$ since $m$ is a maximal element. This shows that any maximal element
of $A$ is already the maximum. Moreover, if $\max(A) \leq x$ for some $x \in A$, then we
also have $x \leq \max(A)$ since $\max(A)$ is the maximum and hence $x = \max(A)$ follows
by antisymmetry of $\leq$. Hence $\max(A)$ is actually a maximal element. The second
claim can be proved analogously. $\Box$
If one considers a linearly ordered set, then each maximal or minimal element of
a set is automatically the maximum or minimum of that set.
Proposition 6.29 Let $(X, \leq)$ be a linearly ordered set and let $A \subseteq X$ and $m \in A$.
1. If $m$ is a maximal element of $A$, then $m = \max(A)$ follows.
2. If $m$ is a minimal element of $A$, then $m = \min(A)$ follows.
Proof. Let $m \in A$ be a maximal element of $A$. Then for each $x \in A$ we have $m \leq x$
or $x \leq m$ since $\leq$ is total. If $m \leq x$, then $m = x$ follows from the maximality of $m$.
In any case $x \leq m$ holds. Hence $m$ is the maximum of $A$. The second claim can be
proved analogously. $\Box$
Besides the least and the greatest element, there are often elements in the second
row that are also of some importance. These elements are called atoms and co-atoms.
Definition 6.30 (Atoms) Let $(X, \leq)$ be a partially ordered set and let $A \subseteq X$.
1. Then $a \in A$ is called an atom of $A$, if $\min(A)$ exists and $a$ is minimal in
$A \setminus \{\min(A)\}$.
2. Then $a \in A$ is called a co-atom of $A$, if $\max(A)$ exists and $a$ is maximal in
$A \setminus \{\max(A)\}$.
We give some examples of atoms and co-atoms.
Example 6.31 Let $X$ be some set.
1. In the partially ordered set $(\mathbb{N}, \leq)$ the only atom of $\mathbb{N}$ is 1 and there are no
co-atoms, since there is no greatest element.
2. In the partially ordered set $(\mathbb{N}, |)$ the atoms of $\mathbb{N}$ are exactly the prime numbers
and there are no co-atoms, although there is a greatest element $0 \in \mathbb{N}$.
3. In the partially ordered set $(2^X, \subseteq)$ the atoms of $2^X$ are exactly the singletons
$\{x\}$ for $x \in X$ and the co-atoms are exactly the complements of singletons
$X \setminus \{x\}$ for $x \in X$.
4. In the partially ordered set $(X^*, \sqsubseteq)$ the atoms of $X^*$ are exactly the words
$x \in X$ of length 1 (more formally, one should say the words $(1, x) \in X^*$ for
$x \in X$) and there are no co-atoms in general.
[Figure 6.8: Hasse diagram of the partially ordered set $(2^{\mathbb{N}}, \subseteq)$]
Problems
6.22 Let $(M, \cdot, e)$ be a monoid with induced preorder $\leq$. Prove that $e$ is a least element
in $M$.
6.23 Prove the statements of Example 6.31.
6.7 Supremum and Infimum
We have called an upper bound $b$ of a subset $A$ of a partially ordered set a
maximum if additionally $b \in A$. It is also interesting to consider the least upper
bound, which is not necessarily a member of $A$. Correspondingly, we also
consider the greatest lower bound. These values, if they exist, are called supremum
and infimum, respectively.
Definition 6.32 (Supremum and infimum) Let $(X, \leq)$ be a partially ordered
set and let $A \subseteq X$.
1. If it exists, the value $\sup(A) := \min\{b \in X : (\forall x \in A)\; x \leq b\}$ is called the
supremum or least upper bound of $A$.
2. If it exists, the value $\inf(A) := \max\{b \in X : (\forall x \in A)\; b \leq x\}$ is called the infimum
or greatest lower bound of $A$.
We give an example that shows that the supremum is actually a concept that is
different from the maximum and from maximal elements.
Example 6.33 We consider the set $A = \{1, 2, 3\}$ in the partially ordered set $(\mathbb{N}, |)$.
Then $\inf(A) = \min(A) = 1$ and $\sup(A) = 6$, but $A$ has no maximum and $A$ has the
maximal elements 2 and 3 (see the diagram in Figure 6.7).
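Example 6.33 can be recomputed by following Definition 6.32 step by step: collect the upper bounds and take their least element. The following Python sketch does this inside the finite fragment $\{0, ..., 12\}$ of $\mathbb{N}$; the function names and the fragment are assumptions, and the result agrees with the supremum in $(\mathbb{N}, |)$ only as long as the least common multiple lies inside the fragment.

```python
def divides(x, y):
    return y == 0 if x == 0 else y % x == 0

def least(B, leq):
    """Return the least element of B, or None if B has none."""
    for m in B:
        if all(leq(m, b) for b in B):
            return m
    return None

def supremum(X, A, leq):
    """Least element of the set of upper bounds of A in X, if it exists."""
    upper = [b for b in X if all(leq(x, b) for x in A)]
    return least(upper, leq)

def infimum(X, A, leq):
    """Greatest element of the set of lower bounds of A in X, if it exists."""
    lower = [b for b in X if all(leq(b, x) for x in A)]
    # The greatest lower bound is the least element for the reversed order.
    return least(lower, lambda x, y: leq(y, x))

X = range(0, 13)  # finite fragment of N, chosen to keep the search finite
A = [1, 2, 3]
print(supremum(X, A, divides))  # 6
print(infimum(X, A, divides))   # 1
```

The supremum 6 is not an element of $A$, which is why $A$ has a supremum but no maximum.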
If the supremum or the infimum of a set $A$ is additionally a member of that set $A$,
then it follows automatically that it is the maximum or the minimum, respectively.
This follows directly from the uniqueness of maximum and minimum by
Proposition 6.23. If the maximum or the minimum of a set $A$ exists, then the supremum or
the infimum exists and is identical to the maximum or minimum, respectively.
Corollary 6.34 Let $(X, \leq)$ be a partially ordered set and let $A \subseteq X$. Then the
following conditions are equivalent:
1. $\max(A)$ exists,
2. $\sup(A)$ exists and $\sup(A) = \max(A)$,
3. $\sup(A)$ exists and $\sup(A) \in A$.
Moreover, also the following are equivalent:
1. $\min(A)$ exists,
2. $\inf(A)$ exists and $\inf(A) = \min(A)$,
3. $\inf(A)$ exists and $\inf(A) \in A$.
The importance of the supremum and the infimum stems from the fact that they
can exist even in some cases where the maximum and the minimum, respectively, do not
exist. We have already seen in Example 6.33 that the supremum can exist in
a case where the maximum does not exist. Roughly speaking, if the maximum or
minimum of a set does not exist, then the supremum or the infimum, respectively,
is the next best object that one can hope to exist.
Partially ordered sets for which the supremum and the infimum of any two
elements always exist have a special name: they are called lattices.
Definition 6.35 (Lattice) A partially ordered set $(X, \leq)$ is called a lattice, if the
values $x \vee y := \sup\{x, y\}$ and $x \wedge y := \inf\{x, y\}$ exist for all $x, y \in X$.
In this situation $x \vee y$ is often called the join of $x$ and $y$ and $x \wedge y$ is called the
meet of $x$ and $y$. We discuss a number of examples.
Example 6.36 Let $X$ be a set.
1. Any linear order $(X, \leq)$ is a lattice with $x \vee y = \sup\{x, y\} = \max\{x, y\}$ and
$x \wedge y = \inf\{x, y\} = \min\{x, y\}$ for all $x, y \in X$.
2. The partially ordered set $(\mathbb{N}, |)$ is a lattice and for all $n, k \in \mathbb{N}$
$n \vee k = \sup\{n, k\} = \mathrm{lcm}(n, k)$ := the least common multiple of $n$ and $k$,
$n \wedge k = \inf\{n, k\} = \gcd(n, k)$ := the greatest common divisor of $n$ and $k$.
3. The partially ordered set $(2^X, \subseteq)$ is a lattice with
$A \vee B = \sup\{A, B\} = A \cup B$ and $A \wedge B = \inf\{A, B\} = A \cap B$
for all $A, B \subseteq X$.
4. The partially ordered set $(X^*, \sqsubseteq)$ is not a lattice in general. The supremum
$\sup\{x, y\}$ of two words $x, y \in X^*$ exists if and only if $x \sqsubseteq y$ or $y \sqsubseteq x$ holds
(in which case it is $y$ or $x$, respectively). The infimum $\inf\{x, y\}$ of two words
$x, y \in X^*$ always exists and it is the longest common prefix of $x$ and $y$.
We leave the proof of these claims to Problem 6.25. In a lattice $(X, \leq)$ the
operations join $\vee$ and meet $\wedge$ can be interpreted as binary operations $\vee : X \times X \to X$
and $\wedge : X \times X \to X$. These operations satisfy some properties that constitute an
algebraic characterization of a lattice.
Proposition 6.37 (Lattice) Let $(X, \leq)$ be a lattice. Then we obtain the following
for all $x, y, z \in X$:
1. $(x \vee y) \vee z = x \vee (y \vee z)$ and $(x \wedge y) \wedge z = x \wedge (y \wedge z)$ (associative)
2. $x \vee y = y \vee x$ and $x \wedge y = y \wedge x$ (commutative)
3. $x \vee (x \wedge y) = x$ and $x \wedge (x \vee y) = x$ (absorption)
4. $x \vee x = x$ and $x \wedge x = x$ (idempotent)
We leave the proof to the reader (see Problem 6.26).
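The four laws can be spot-checked for the lattice $(\mathbb{N}, |)$, where join is the least common multiple and meet is the greatest common divisor. The following Python sketch tests them on the numbers 0 to 12; the names `join` and `meet` are hypothetical, the convention $\mathrm{lcm}(n, 0) = 0$ reflects that 0 is the greatest element of $(\mathbb{N}, |)$, and a finite check of this kind is an illustration, not a proof.

```python
from itertools import product
from math import gcd

def join(n, k):
    # lcm(n, k), with lcm(n, 0) = 0 because 0 is the greatest element of (N, |)
    return 0 if n == 0 or k == 0 else n * k // gcd(n, k)

def meet(n, k):
    return gcd(n, k)  # math.gcd(n, 0) == n, matching inf{n, 0} = n

sample = range(0, 13)
laws = {
    "associative": all(join(join(x, y), z) == join(x, join(y, z)) and
                       meet(meet(x, y), z) == meet(x, meet(y, z))
                       for x, y, z in product(sample, repeat=3)),
    "commutative": all(join(x, y) == join(y, x) and meet(x, y) == meet(y, x)
                       for x, y in product(sample, repeat=2)),
    "absorption": all(join(x, meet(x, y)) == x and meet(x, join(x, y)) == x
                      for x, y in product(sample, repeat=2)),
    "idempotent": all(join(x, x) == x and meet(x, x) == x for x in sample),
}
print(laws)
```

All four entries of `laws` come out as `True` on this sample.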
Problems
6.24 Let $(X, \leq)$ be a lattice with least element $e \in X$. Prove that $(X, \vee, e)$ is a monoid.
Conclude that the following are monoids: $(\mathbb{N}, \max, 0)$, $(\mathbb{N}, \mathrm{lcm}, 1)$, $(2^{\mathbb{N}}, \cup, \emptyset)$.
6.25 Prove the statements of Example 6.36.
6.26 Prove Proposition 6.37.
Axiomatic Set Theory
God exists since mathematics is consistent,
and the Devil exists since we cannot prove it.
André Weil (1906–1998)
In this section we want to present a brief survey of axiomatic set theory. Actually,
there are many different versions of axiomatic set theory and the version
presented here is called Zermelo-Fraenkel set theory, often abbreviated by ZF.
Zermelo-Fraenkel set theory plus the Axiom of Choice, abbreviated as ZFC, is the standard
framework in which most of modern mathematics is developed.
Zermelo-Fraenkel set theory ZFC
1. Axiom of the Empty Set. There exists an empty set without elements.
2. Axiom of Extensionality. Two sets $X$ and $Y$ are equal, if they contain
exactly the same elements.
3. Axiom of Comprehension. For each given set $X$ and each predicate $P$ for
$X$, there exists a set $S = \{x \in X : P(x)\}$ of all elements of $X$ that satisfy the
predicate $P$.
4. Axiom of Pairing. For all objects $x$ and $y$ there exists a set $\{x, y\}$ that
contains exactly $x$ and $y$.
5. Axiom of Union. For all sets of sets $\mathcal{S}$ there exists a set
$\bigcup \mathcal{S} = \{x : (\exists X \in \mathcal{S})\; x \in X\}$
that contains all elements that belong to at least one set $X \in \mathcal{S}$.
6. Axiom of the Power Set. For all sets $X$ there exists a power set
$2^X = \{S : S \subseteq X\}$
that contains all subsets of $X$.
7. Axiom of Infinity. There exists an infinite set (such as $\mathbb{N}$).
8. Axiom of Replacement. If $C$ and $D$ are classes and $f : C \to D$ is a function,
then for each subset $X$ of $C$ the image $f(X)$ is a subset of the class $D$.
9. Axiom of Foundation. Any non-empty set $X$ of sets has the property that
it contains a member $Y$ such that $X \cap Y = \emptyset$.
10. Axiom of Choice. Any set $\mathcal{S}$ that contains only non-empty sets has a choice
function, i.e. a function $f : \mathcal{S} \to \bigcup \mathcal{S}$ such that $f(X) \in X$ for each $X \in \mathcal{S}$.
The axioms as listed here are only informal and simplified versions of the formal
axioms of ZFC. The purpose is to give the reader an impression of what these
axioms are about. For instance, in case of the Axiom of Replacement one would have
to be more precise about what kind of functions $f$ are allowed here. As phrased here,
the axioms are certainly also redundant. For instance, the existence of the empty
set can be deduced from the Axiom of Infinity and the Axiom of Comprehension.
The Axiom of Comprehension itself can be deduced from the Axiom of Replacement.
Hence, the presentation of these axioms can be optimised. The Axiom of Foundation
(sometimes also called the Axiom of Regularity) is equivalent to the fact that the
element relation $\in$ is well-founded (provided one has the Axiom of Choice). There
are other versions of axiomatic set theory such as von Neumann-Bernays-Gödel set
theory.
Unfortunately, it is not known whether the ZFC axioms are consistent! This is
one of the big open problems in mathematics and is related to a problem that was
discussed by Hilbert in a famous talk he gave in Paris in 1900.
Conjecture 6.38 (Consistency) The ZFC axioms of Zermelo-Fraenkel set theory
together with the Axiom of Choice are consistent.
Usually one proves that some mathematical axioms are consistent by providing
an example of a mathematical object that satisfies all the axioms. However, in the case
of set theory the problem is that nobody has any idea how to create a model for set
theory without using sets (and hence set theory). Nobody has been able to resolve this
circularity to date. Even worse, Gödel turned the observation that there is such
a circularity into the following negative result.
Theorem 6.39 (Gödel's Second Incompleteness Theorem 1931) ZFC is consistent
if and only if one cannot prove its consistency within ZFC.
The "if" direction of this result is trivial: if ZFC is inconsistent, then one can
conclude everything from ZFC, using the principle of explosion. In particular, one
can prove the consistency of ZFC in ZFC in that case. The "only if" direction of
the proof requires a deeper insight into logic and computability. What this theorem
clearly shows is that a consistency proof for ZFC would require some meta theory
that goes beyond set theory and then the question of consistency for this meta theory
would have to be resolved.
What is known, however, is that the Axiom of Choice is independent of the other
axioms of ZFC; in particular, ZFC is consistent if and only if ZF is consistent.
Mathematicians
Mathematicians are generally thought of as some kind of intellectual machine, a great
brain that crunches numbers and spits out theorems. In fact we are, as Hermann
Weyl said, more like creative artists. Although strongly constrained by the rules of
logic and by physical experience, we use our imagination to make great leaps into
the unknown. The development of mathematics over thousands of years is one of the
great achievements of civilization.
Sir Michael Francis Atiyah (Fields Medalist, Cambridge)
Some Selected Biographies of Mathematicians
Aristotle (384–322 BC) was a philosopher and student of Plato. His work
includes texts on almost all areas of sciences, philosophy and politics of his
time. Aristotle is considered the first author who studied formal logic and
his views dominated mathematical logic for more than 2000 years. Aristotle
systematised deduction rules and systems and it seems that he was the first
one who formulated the Principle of Excluded Middle and in this way he
gave a logical foundation to the concept of reductio ad absurdum, which
was already used in Greek philosophy.
Euclid of Alexandria (323–283 BC) is best known for his book Elements,
in which he developed geometry axiomatically. For more than 2000
years this was the standard textbook on geometry. However, Euclid also
worked in number theory and results such as Euclid's lemma on factorization
or the Euclidean algorithm to calculate greatest common divisors of
two numbers are named after him.
René Descartes (1596–1650) was a philosopher, mathematician and
physicist. As a mathematician Descartes is best known for his contributions
to analytic geometry including his approach to treat geometry using
coordinates and vectors. With this approach he created a link between
geometry and algebra that is essential for the modern treatment of these
subjects. Cartesian coordinate systems and the Cartesian product are named
after Descartes.
Leopold Kronecker (1823–1891) was a student of Dirichlet and worked
in algebra, number theory, analysis, and mathematical logic. He rejected
Cantor's set theory due to its non-constructive nature and himself proposed
finitism, which can be considered as a forerunner of intuitionism. Kronecker
proposed the idea that analysis and other branches of mathematics should
be based on natural numbers and he said "God made the integers; all else
is the work of man."
Georg Cantor (1845–1918) was a student of Kummer and Weierstraß and
is best known for his development of set theory. His own definition of a set
was roughly the following: "In its entirety we consider any collection $M$ of
well-defined and distinguished objects $m$ of our perception or of our thoughts
as a set. The objects $m$ are called the elements of the set $M$." Cantor was
led to his study of set theory by his investigation of bijective maps and the
concept of equinumerosity. In this context Cantor's diagonalisation method
and Cantor's pairing function are well-known and named after him. He
also initiated the study of cardinality and of transfinite numbers. The set
$\{0, 1\}^{\mathbb{N}}$ as a metric space is often called Cantor space and plays a crucial
role in some fields of mathematics such as topology, descriptive set theory
and fractal geometry.
David Hilbert (1862–1943) was a student of Lindemann and is one of
the most well-known mathematicians of the 20th century. This is because
he made substantial contributions to various fields of mathematics such as
the theory of invariants, foundations of functional analysis and mathematical
physics as well as mathematical logic. Many concepts and results are
named after him. For instance, Hilbert spaces play a central role in functional
analysis and for the mathematical foundations of modern quantum
physics. Hilbert was a strong supporter of Cantor's set theory. Hilbert's
Basis Theorem caused a controversy around the fact that the proof was
highly non-constructive. Hilbert's Nullstellensatz is another fundamental
result that relates geometry to algebra and it is one of the starting points of
modern algebraic geometry. In a famous talk that Hilbert delivered at the
Sorbonne in Paris in 1900 he described 23 mathematical problems that he
considered essential for the mathematics of the 20th century. His problems
turned out to be very influential and some developments of mathematics
in the 20th century were motivated by attempts to solve some of his problems.
Some of these problems are still unsolved. One of his problems was
to find a complete and sound axiomatic system for mathematics. Gödel's
First Incompleteness Theorem brought Hilbert's programme in its original
form to an end, since Gödel proved that even for arithmetic there cannot
be any reasonable axiom system that is sound and complete simultaneously.
Nevertheless, Hilbert's foundational ideas in mathematical logic have a substantial
impact even on present-day mathematical logic.
Betrand Russell (18721970) was a philosopher who published texts
on various areas of philosophy, science and politics. In mathematics he is
best known for the three volume work Principia Mathematica, which he
wrote together with Afred North Whitehead. The purpose of this work
was to derive mathematics starting from some basic axioms and using only
the rules of symbolic logic. At the same time this approach was designed
to avoid antinomies such as Russels paradox that Russel discovered when
he studied Freges work on naive set theory. The Principia Mathematica
is sometimes considered as the most signicant work on formal logic since
Aristotle. The value of this approach was relativised, when Godel published
his First Incompleteness Theorem, from which it follows that the system in
Principia Mathematica cannot be complete and sound simultaneously.
Luitzen Egbertus Jan Brouwer (1881–1966) was a student of Korteweg
and made substantial contributions to topology and logic. He was one of
the main proponents of intuitionism, a constructive approach to mathematics
that avoids the Principle of the Excluded Middle. Ironically, he is best
known for the Brouwer Fixed Point Theorem, which is a theorem in classical
topology that does not admit any constructive proof. Brouwer proved
this theorem in the early years of his career, when he was seeking
recognition among mathematicians; at that time he deliberately refrained
from propagating intuitionism in order not to endanger his career.
Andrey Kolmogorov (1903–1987) was a student of Luzin and one of
the most influential mathematicians of the 20th century. He is perhaps best
known for his contributions to probability theory, since he started to develop
the subject systematically using the concepts of measure and integration. In
topology Kolmogorov is known, for instance, for his Superposition Theorem,
which solved Hilbert's 13th problem and shows that addition is a universal
continuous function of two arguments (any other continuous function of
two arguments can be expressed using addition and continuous functions
of only one variable). Kolmogorov also contributed extensively to classical
mechanics, the mathematical treatment of turbulence, analysis and
the theory of dynamical systems. In the area of logic and the foundations
of mathematics he is best known for his work on intuitionistic logic and
the concept of Kolmogorov complexity, which allows one to define random
objects using the concept of algorithms (as opposed to the treatment in
probability theory, which does not allow one to speak about the randomness
of single objects).
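To make the universality of addition in Kolmogorov's Superposition Theorem
concrete, the statement for two arguments can be sketched as follows. The
symbols $\Phi_q$, $\varphi_q$ and $\psi_q$ are notation assumed here for
illustration and do not appear in these notes; the formula is the usual
textbook form of the theorem for continuous functions on the unit square.

```latex
% Kolmogorov's Superposition Theorem for two arguments (notation assumed here):
% every continuous f : [0,1]^2 -> R can be written as
\[
  f(x, y) \;=\; \sum_{q=0}^{4} \Phi_q\bigl(\varphi_q(x) + \psi_q(y)\bigr),
\]
% where the outer functions \Phi_q : R -> R and the inner functions
% \varphi_q, \psi_q : [0,1] -> R are continuous functions of one variable,
% and the inner functions can be chosen independently of f.
```

The only operations that combine the two arguments are the addition inside
each summand and the finite sum over q.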
Kurt Gödel (1906–1978) was a student of Hahn and certainly the most
influential mathematical logician of the 20th century. The Gödel Completeness
Theorem shows that first-order logic can be axiomatised in a complete
and sound way, i.e. it shows that in some sense provability and truth
correspond in pure logic. However, Gödel's First Incompleteness Theorem
indicates that this is not true for mathematics in general. Even a simple
fragment of mathematics such as arithmetic does not admit any axiom system
that is complete and sound simultaneously. This result brought Hilbert's
programme in its original form to an end. Gödel's Second Incompleteness
Theorem states that a sufficiently rich consistent mathematical theory
cannot prove its own consistency and this applies, in particular, to set
theory. Gödel also proved that the Axiom of Choice and the Generalised
Continuum Hypothesis are both consistent with Zermelo-Fraenkel set theory
(i.e. if ZF is consistent, then so is ZF together with the Axiom of Choice
and the Generalised Continuum Hypothesis). Later, Paul Cohen was able to
show that the same holds for the negation of the Axiom of Choice and the
negation of the Generalised Continuum Hypothesis, so that both statements
are independent of Zermelo-Fraenkel set theory. Gödel also made notable
contributions to other areas of logic, such as intuitionistic logic, and
to relativity theory.
Greek Alphabet
α alpha
β beta
γ gamma
δ delta
ε epsilon
ζ zeta
η eta
θ or ϑ theta
ι iota
κ kappa
λ lambda
μ mu
ν nu
ξ xi
ο omicron
π pi
ρ rho
σ sigma
τ tau
υ upsilon
φ phi
χ chi
ψ psi
ω omega
Α Alpha
Β Beta
Γ Gamma
Δ Delta
Ε Epsilon
Ζ Zeta
Η Eta
Θ Theta
Ι Iota
Κ Kappa
Λ Lambda
Μ Mu
Ν Nu
Ξ Xi
Ο Omicron
Π Pi
Ρ Rho
Σ Sigma
Τ Tau
Υ Upsilon
Φ Phi
Χ Chi
Ψ Psi
Ω Omega
Mathematical Symbols
$\Box$  end of a proof (q.e.d.)
$\mathbb{N}$  set of natural numbers 0, 1, 2, ...
$\mathbb{P}$  set of prime numbers 2, 3, 5, 7, 11, ...
$\mathbb{Z}$  set of integers ..., -2, -1, 0, 1, 2, ...
$\mathbb{Q}$  set of rational numbers
$\mathbb{A}$  set of algebraic numbers
$\mathbb{R}$  set of real numbers
$\mathbb{C}$  set of complex numbers
$a \mid b$  a divides b
$\in$  is an element of
$\notin$  is not an element of
$\emptyset$  empty set
$\{$, $\}$  set brackets
$=$  is equal to
$\neq$  is not equal to
$:=$  is defined to be equal to
$\subseteq$  is a subset of
$\not\subseteq$  is not a subset of
$\subsetneq$  subset, but not equal to
$\cup$  union
$\cap$  intersection
$\setminus$  set difference
$X^c$  complement of X (with respect to some other set)
$\times$  Cartesian product
$\sqcup$  disjoint union
$2^X$  power set of X, also written as $\mathcal{P}(X)$
$(X_i)_{i \in I}$  indexed family of sets
$\bigcup_{i \in I} X_i$  union of an indexed family of sets
$\bigcap_{i \in I} X_i$  intersection of an indexed family of sets
$\bigsqcup_{i \in I} X_i$  disjoint union of an indexed family of sets
$\prod$  product
$\sum$  sum
$\coprod$  coproduct (disjoint union)
$X^*$  set of finite words over X (Kleene star)
$\bot$  is not true (falsum)
$\top$  is true (verum)
$\wedge$  and
$\vee$  or
$\neg$  not
X Y X is of the same cardinality as Y
X Y X is of cardinality less than or equal to that of Y
X Y X is of smaller cardinality than Y
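$\Longrightarrow$  implies
$\Longleftarrow$  is implied by
$\Longleftrightarrow$  if and only if (is equivalent to)
$:\Longleftrightarrow$  is defined to be equivalent to
$\exists$  there exists (existential quantifier)
$\forall$  for all (universal quantifier)
$\exists!$  there is exactly one
$\exists^\infty$  there are infinitely many
$\forall^\infty$  for almost all (for all but finitely many)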
$S \circ R$  composition of relations
$R^{-1}$  inverse relation
$\mathrm{dom}(R)$  domain of a relation
$\mathrm{range}(R)$  range of a relation
$\Delta_X$  diagonal of X (equality relation)
$\mathrm{id}_X : X \to X$  identity function
$\mathrm{graph}(f)$  graph of a function
$f : X \to Y$  function from X to Y
$f : X \hookrightarrow Y$  injection from X into Y
$f : X \twoheadrightarrow Y$  surjection from X onto Y
$f : \subseteq X \to Y$  partial function from X to Y
$f : X \rightrightarrows Y$  multi-valued function from X to Y
$x \mapsto y$  x is mapped to y
$f(x)$  function value of $f : X \to Y$ at $x \in X$
$f^{-1} : Y \to X$  inverse function of a bijective $f : X \to Y$
$f(A)$  image of $A \subseteq X$ under $f : X \to Y$
$f^{-1}(B)$  preimage of $B \subseteq Y$ under $f : X \to Y$
$Y^X$  set of functions $f : X \to Y$
$X!$  set of bijective functions $f : X \to X$
$|X|$  cardinality of a set X
$\binom{X}{Y}$  set of subsets of X of cardinality $|Y|$
T(X) set of finite subsets of X
$\sim$  equivalence relation
$[x]$  equivalence class of x
$X/{\sim}$  quotient of X by $\sim$
$\sim_f$  fiber relation of $f : X \to Y$
$\mid$  divisibility relation
$\leq$  less than or equal (or a preorder)
$<$  strictly less (or a strict order)
$x \equiv y \pmod{n}$  x congruent to y modulo n
$n!$  factorial of n
$\binom{n}{k}$  binomial coefficient
$B_n$  n-th Bell number
Index
n-fold product, 42
n-tuples, 41
algebraic geometry, 142
algebraic numbers, 14
algebraic structure, 128
all relation, 56
almost all, 106
American Mathematical Society, 3
antisymmetric, 118
apply operation, 80
arity, 41
associative, 128
atom, 134
Axiom of Choice, 82, 139, 144
Axiom of Regularity, 140
axiomatic set theory, 13
Banach-Tarski Paradox, 85
Bell number, 125
bijection, 68
bijective, 68
binary operation, 128
binomial coefficient, 103
bound variable, 51
Brouwer Fixed Point Theorem, 143
canonical decomposition, 123
canonical projection, 122
canonical projections, 72, 87
Cantor space, 142
Cantor's diagonalisation method, 142
Cantor's first diagonalization, 97
Cantor's pairing function, 142
cardinal number, 89
cardinality, 89, 104
Cartesian product, 39
case distinction, 70
cell, 124
characteristic function, 80, 81
choice function, 82
co-atom, 134
codomain, 62
commutative, 128
commutative diagrams, 66
complex numbers, 14
comprehension, 15, 16
computability theory, 22, 45
computational complexity theory, 49
concatenation, 129
congruent, 121, 124
conjunction, 46
consistent, 19
constant function, 70
Continuum Hypothesis, 96
Corollary, 5
correspondence, 66
countable, 103
countably infinite, 103
currying operation, 80
Dedekind infinite, 107
denumerable, 104
descriptive set theory, 142
diagonal, 56
difference, 27
discriminated union, 42
disjoint, 26
disjoint union, 42
disjunction, 46
divides, 7
divisibility relation, 125
divisor, 7
divisor relation, 56
domain, 59
double negation law, 30
duality, 29
element relation, 56
elements, 13
empty relation, 56
empty set, 14
empty tuple, 41
empty word, 41
equal, 14
equality relation, 56
equinumerosity relation, 121
equivalence, 46
equivalence class, 121, 122
equivalence closure, 124
equivalence kernel, 121
equivalence relation, 121
Euclid's lemma, 141
Euclidean algorithm, 141
evaluation, 80
evaluation map, 80
exclusive or, 46
existential quantifier, 32
extension, 74
factor, 7
factorial function, 103
falsum, 19
family, 73
fiber, 76
fiber relation, 121
finite, 6, 103
finite words, 43
finitism, 142
First Incompleteness Theorem, 143
first-order logic, 50
fixed point, 92
forward image, 76
fractal geometry, 142
free variable, 51
function, 62
function value, 63
Gödel Completeness Theorem, 144
Gödel's Completeness Theorem, 52
Gödel's First Incompleteness Theorem, 144
Gödel's Second Incompleteness Theorem, 144
Generalized Continuum Hypothesis, 96, 144
graph, 55, 62
graph map, 82
greatest element, 132
greatest lower bound, 136
group, 131
groups, 129
halting problem, 22
Hasse diagram, 126
Hilbert Hotel Paradox, 91
Hilbert spaces, 142
Hilbert's Basis Theorem, 142
Hilbert's Nullstellensatz, 142
Hilbert's programme, 142
homogeneous, 55, 118
identity, 66, 128
identity function, 70
image, 75
implication, 46
inconsistency, 19
indexed family, 14, 32
indirect, 6
indirect proof, 7
induced preorder, 130
induced strict order, 128
Induction base, 101
induction base, 102
induction hypothesis, 102
induction principle, 100
Induction step, 101
induction step, 102
infimum, 136
infinite, 103
infinitely many, 106
injection, 68
injective, 68
integers, 14
intersection, 23
into, 68
Intuitionism, 10
intuitionism, 10, 142, 143
intuitionistic logic, 10
inverse, 83, 129, 131
inverse function, 71
inverse image, 76
inverse relation, 60
inversion map, 82
irreflexive, 118
join, 137
Kleene star operation, 43
Kuratowski pair, 38
lattice, 137
lattices, 137
least element, 132
least upper bound, 136
left inverse, 83
left total, 59
left unique, 61
Lemma, 5
less or equal relation, 56
linear order, 125
linearly ordered set, 125
logical formulas, 47
lower bound, 131
map, 62
mapping, 62
Mathematical Reviews, 3
Mathematics Subject Classification, 3
maximal element, 133
maximum, 132
maximum function, 69
meet, 137
metamathematics, 45
minimal element, 134
minimum, 8, 132
minimum function, 69
model theory, 45
modulo, 121, 124
monoid, 129
monotone, 92
multi-valued function, 66
multiple, 7
multisets, 14
naive set theory, 13
natural numbers, 14
necessary, 18
necessary condition, 18
negation, 46
non well-founded set theory, 23
non-standard analysis, 23
not nite, 6
one-to-one function, 68
onto, 68
order relation, 94
P-NP problem, 49
pair, 38
pairs, 41
partial function, 67
partial order, 125
partially ordered set, 125
partition, 122, 124
Pascal's rule, 103
Peano axioms, 102
Peano model of the natural numbers, 102
permutations, 72
Platonism, 10
platonism, 10
power set, 36
power set construction, 21
predecessor function, 69
prefix relation, 125
preimage, 75
preorder, 125
preordered set, 125
prime divisor, 8
prime numbers, 14
Principia Mathematica, 143
principle of excluded middle, 9
principle of explosion, 19
Principle of the Excluded Middle, 143
product, 39, 85
product function, 72
projections, 88
proof, 4
proof by contradiction, 10
proof by induction, 102
proof theory, 45
proper class, 22
proper subset relation, 56
Proposition, 5
proposition, 46
Propositional, 46
quadruples, 41
quasiorders, 125
quintuples, 41
quotient, 122
range, 59
range map, 82
rational numbers, 14
real numbers, 14
reductio ad absurdum, 10
reflexive, 118
reflexive and symmetric closure, 120
reflexive closure, 120
relation, 55
replacement, 23
restriction of f to A, 74
right inverse, 83
right total, 59
right unique, 62
rigorous enough, 7
Russell's paradox, 21, 143
same cardinality, 89
scope, 51
second-order logic, 51
self-applicability problem, 22
sequence, 73
set, 6, 13
set of all functions, 79
set of bijective functions, 79
set of functions, 62
Sierpiński Pyramid, 2
singleton, 15
smaller or the same cardinality, 89
source, 55
square function, 69, 74
strict order, 128
strictly less relation, 56
strictly smaller cardinality, 89
subset, 16
subset relation, 56
successor function, 102
sufficient, 18
sufficient condition, 18
superset, 17
supremum, 136
surjection, 68
surjective, 68
symmetric, 118
symmetric closure, 126
symmetric difference, 31
symmetric group, 72, 80
tagged union, 42
target, 55
tautology, 47
Theorem, 4
Theorem of Diaconescu-Goodman-Myhill, 83
Theorem of Tychonoff, 86
topology, 142
total, 118
total orders, 125
transitive, 118
transitive closure, 120
transitivity, 8
triples, 41
truth table, 46
truth table method, 47
tuple, 41
twin primes, 8
uncountable, 103
union, 21, 23
universal property, 87
universal quantifier, 32
untyped sets, 19
upper bound, 131
valid, 51
Venn diagram, 17
von Neumann-Bernays-Gödel set theory, 140
well-defined, 10, 63
well-founded, 22
Zermelo-Fraenkel set theory, 139