
Background Course in Mathematics

European University Institute
Department of Economics
Fall 2011

Antonio Villanacci¹

September 26, 2011

¹ I would like to thank the following friends for helpful comments and discussions: Laura Carosi, Michele Gori, Michal Markun, Marina Pireddu, Kiril Shakhnov and all the students of the courses I used these notes for in the past several years.
Contents

I Linear Algebra

1 Systems of linear equations
  1.1 Linear equations and solutions
  1.2 Systems of linear equations, equivalent systems and elementary operations
  1.3 Systems in triangular and echelon form
  1.4 Reduction algorithm
  1.5 Matrices
  1.6 Systems of linear equations and matrices

2 The Euclidean Space R^n
  2.1 Sum and scalar multiplication
  2.2 Scalar product
  2.3 Norms and Distances

3 Matrices
  3.1 Matrix operations
  3.2 Inverse matrices
  3.3 Elementary matrices
  3.4 Elementary column operations

4 Vector spaces
  4.1 Definition
  4.2 Examples
  4.3 Vector subspaces
  4.4 Linear combinations
  4.5 Row and column space of a matrix
  4.6 Linear dependence and independence
  4.7 Basis and dimension
  4.8 Change of basis

5 Determinant and rank of a matrix
  5.1 Definition and properties of the determinant of a matrix
  5.2 Rank of a matrix
  5.3 Inverse matrices (continued)
  5.4 Span of a matrix, linearly independent rows and columns, rank

6 Eigenvalues and eigenvectors
  6.1 Characteristic polynomial
  6.2 Eigenvalues and eigenvectors
  6.3 Similar matrices and diagonalizable matrices

7 Linear functions
  7.1 Definition
  7.2 Kernel and Image of a linear function
  7.3 Nonsingular functions and isomorphisms

8 Linear functions and matrices
  8.1 From a linear function to the associated matrix
  8.2 From a matrix to the associated linear function
  8.3 M(m, n) and L(V, U) are isomorphic
  8.4 Some related properties of a linear function and associated matrix
  8.5 Some facts on L(R^n, R^m)
  8.6 Examples of computation of [l]^u_v
  8.7 Change of basis and linear operators
  8.8 Diagonalization of linear operators

9 Solutions to systems of linear equations
  9.1 Some preliminary basic facts
  9.2 A solution method: Rouche-Capelli's and Cramer's theorems

10 Appendix. Complex numbers

II Some topology in metric spaces

11 Metric spaces
  11.1 Definitions and examples
  11.2 Open and closed sets
    11.2.1 Sets which are open or closed in metric subspaces
  11.3 Sequences
  11.4 Sequential characterization of closed sets
  11.5 Compactness
    11.5.1 Compactness and bounded, closed sets
    11.5.2 Sequential compactness
  11.6 Completeness
    11.6.1 Cauchy sequences
    11.6.2 Complete metric spaces
    11.6.3 Completeness and closedness
  11.7 Fixed point theorem: contractions
  11.8 Appendix. Some characterizations of open and closed sets

12 Functions
  12.1 Limits of functions
  12.2 Continuous Functions
  12.3 Continuous functions on compact sets

13 Correspondence and fixed point theorems
  13.1 Continuous Correspondences
  13.2 The Maximum Theorem
  13.3 Fixed point theorems
  13.4 Application of the maximum theorem to the consumer problem

III Differential calculus in Euclidean spaces

14 Partial derivatives and directional derivatives
  14.1 Partial Derivatives
  14.2 Directional Derivatives

15 Differentiability
  15.1 Total Derivative and Differentiability
  15.2 Total Derivatives in terms of Partial Derivatives

16 Some Theorems
  16.1 The chain rule
  16.2 Mean value theorem
  16.3 A sufficient condition for differentiability
  16.4 A sufficient condition for equality of mixed partial derivatives
  16.5 Taylor's theorem for real valued functions

17 Implicit function theorem
  17.1 Some intuition
  17.2 Functions with full rank square Jacobian
  17.3 The inverse function theorem
  17.4 The implicit function theorem
  17.5 Some geometrical remarks on the gradient
  17.6 Extremum problems with equality constraints

IV Nonlinear programming

18 Concavity
  18.1 Convex sets
  18.2 Different Kinds of Concave Functions
    18.2.1 Concave Functions
    18.2.2 Strictly Concave Functions
    18.2.3 Quasi-Concave Functions
    18.2.4 Strictly Quasi-concave Functions
    18.2.5 Pseudo-concave Functions
  18.3 Relationships among Different Kinds of Concavity
    18.3.1 Hessians and Concavity

19 Maximization Problems
  19.1 The case of inequality constraints: Kuhn-Tucker theorems
    19.1.1 On uniqueness of the solution
  19.2 The Case of Equality Constraints: Lagrange Theorem
  19.3 The Case of Both Equality and Inequality Constraints
  19.4 Main Steps to Solve a (Nice) Maximization Problem
    19.4.1 Some problems and some solutions
  19.5 The Implicit Function Theorem and Comparative Statics Analysis
    19.5.1 Maximization problem without constraint
    19.5.2 Maximization problem with equality constraints
    19.5.3 Maximization problem with Inequality Constraints
  19.6 The Envelope Theorem and the meaning of multipliers
    19.6.1 The Envelope Theorem
    19.6.2 On the meaning of the multipliers

20 Applications to Economics
  20.1 The Walrasian Consumer Problem
  20.2 Production
  20.3 The demand for insurance

V Problem Sets

21 Exercises
  21.1 Linear Algebra
  21.2 Some topology in metric spaces
    21.2.1 Basic topology in metric spaces
    21.2.2 Correspondences
  21.3 Differential calculus in Euclidean spaces
  21.4 Nonlinear programming

22 Solutions
  22.1 Linear Algebra
  22.2 Some topology in metric spaces
    22.2.1 Basic topology in metric spaces
    22.2.2 Correspondences
  22.3 Differential Calculus in Euclidean Spaces
  22.4 Nonlinear Programming
Part I
Linear Algebra
Chapter 1
Systems of linear equations
1.1 Linear equations and solutions
Definition 1 A¹ linear equation in the unknowns x_1, x_2, ..., x_n is an equation of the form

a_1 x_1 + a_2 x_2 + ... + a_n x_n = b,    (1.1)

where b ∈ R and ∀j ∈ {1, ..., n}, a_j ∈ R. The real number a_j is called the coefficient of x_j, and b is called the constant of the equation. The numbers a_j for j ∈ {1, ..., n} and b are also called parameters of equation (1.1).
Definition 2 A solution to the linear equation (1.1) is an ordered n-tuple (x̄_1, ..., x̄_n) := (x̄_j)_{j=1}^n such² that the following statement (obtained by substituting x̄_j in the place of x_j for any j) is true:

a_1 x̄_1 + a_2 x̄_2 + ... + a_n x̄_n = b.

The set of all such solutions is called the solution set or the general solution or, simply, the solution of equation (1.1).
The following fact is well known.
Proposition 3 Let the linear equation

ax = b    (1.2)

in the unknown (variable) x ∈ R and parameters a, b ∈ R be given. Then,

1. if a ≠ 0, then x = b/a is the unique solution to (1.2);
2. if a = 0 and b ≠ 0, then (1.2) has no solutions;
3. if a = 0 and b = 0, then any real number is a solution to (1.2).
Definition 4 A linear equation (1.1) is said to be degenerate if ∀j ∈ {1, ..., n}, a_j = 0, i.e., it has the form

0x_1 + 0x_2 + ... + 0x_n = b.    (1.3)

Clearly,

1. if b ≠ 0, then equation (1.3) has no solution;
2. if b = 0, any n-tuple (x_j)_{j=1}^n is a solution to (1.3).
Definition 5 Let a nondegenerate equation of the form (1.1) be given. The leading unknown of the linear equation (1.1) is the first unknown with a nonzero coefficient, i.e., x_p is the leading unknown if

∀j ∈ {1, ..., p - 1}, a_j = 0 and a_p ≠ 0.

For any j ∈ {1, ..., n} \ {p}, x_j is called a free variable, consistently with the following obvious result.
¹ In this part, I often follow Lipschutz (1991).
² ":=" means "equal by definition".
Proposition 6 Consider a nondegenerate linear equation a_1 x_1 + a_2 x_2 + ... + a_n x_n = b with leading unknown x_p. Then the set of solutions to that equation is

$$\left\{ (x_k)_{k=1}^{n} : \forall j \in \{1,...,n\} \setminus \{p\},\ x_j \in \mathbb{R} \text{ and } x_p = \frac{b - \sum_{j \in \{1,...,n\} \setminus \{p\}} a_j x_j}{a_p} \right\}.$$
1.2 Systems of linear equations, equivalent systems and elementary operations

Definition 7 A system of m linear equations in the n unknowns x_1, x_2, ..., x_n is a system of the form

$$\begin{cases} a_{11}x_1 + \dots + a_{1j}x_j + \dots + a_{1n}x_n = b_1 \\ \qquad\qquad\qquad \vdots \\ a_{i1}x_1 + \dots + a_{ij}x_j + \dots + a_{in}x_n = b_i \\ \qquad\qquad\qquad \vdots \\ a_{m1}x_1 + \dots + a_{mj}x_j + \dots + a_{mn}x_n = b_m \end{cases} \qquad (1.4)$$

where ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, a_{ij} ∈ R, and ∀i ∈ {1, ..., m}, b_i ∈ R. We call L_i the i-th linear equation of system (1.4).

A solution to the above system is an ordered n-tuple (x̄_j)_{j=1}^n which is a solution of each equation of the system. The set of all such solutions is called the solution set of the system.
Definition 8 Systems of linear equations are equivalent if their solution sets are the same.
The following fact is obvious.
Proposition 9 Assume that a system of linear equations contains the degenerate equation

L : 0x_1 + 0x_2 + ... + 0x_n = b.

1. If b = 0, then L may be deleted from the system without changing the solution set;
2. if b ≠ 0, then the system has no solutions.

A way to solve a system of linear equations is to transform it into an equivalent system whose solution set is easy to find. In what follows, we make the above sentence precise.
Definition 10 An elementary operation on a system of linear equations (1.4) is one of the following operations:

[E_1] Interchange L_i with L_j, an operation denoted by L_i ↔ L_j (which we can read "put L_i in the place of L_j and L_j in the place of L_i");

[E_2] Multiply L_i by k ∈ R \ {0}, denoted by kL_i → L_i, k ≠ 0 (which we can read "put kL_i in the place of L_i, with k ≠ 0");

[E_3] Replace L_i by (k times L_j plus L_i), denoted by (L_i + kL_j) → L_i (which we can read "put L_i + kL_j in the place of L_i").

Sometimes we apply [E_2] and [E_3] in one step, i.e., we perform the following operation:

[E] Replace L_i by (k′ times L_j plus k ∈ R \ {0} times L_i), denoted by (k′L_j + kL_i) → L_i, k ≠ 0.
Elementary operations are important because of the following obvious result.
Proposition 11 If S_1 is a system of linear equations obtained from a system S_2 of linear equations using a finite number of elementary operations, then systems S_1 and S_2 are equivalent.

In what follows, we first define two types of simple systems (systems in triangular and in echelon form), and we see why those systems are in fact easy to solve. Then, we show how to transform any system into one of those simple systems.
1.3 Systems in triangular and echelon form
Definition 12 A linear system (1.4) is in triangular form if the number n of equations is equal to the number of unknowns and, ∀i ∈ {1, ..., n}, x_i is the leading unknown of equation i, i.e., the system has the following form:

$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1,n-1}x_{n-1} + a_{1n}x_n = b_1 \\ \phantom{a_{11}x_1 +{}} a_{22}x_2 + \dots + a_{2,n-1}x_{n-1} + a_{2n}x_n = b_2 \\ \qquad\qquad\qquad\qquad \vdots \\ \phantom{a_{11}x_1 +{}} a_{n-1,n-1}x_{n-1} + a_{n-1,n}x_n = b_{n-1} \\ \phantom{a_{11}x_1 +{}} a_{nn}x_n = b_n \end{cases} \qquad (1.5)$$

where ∀i ∈ {1, ..., n}, a_{ii} ≠ 0.
Proposition 13 System (1.5) has a unique solution.

Proof. We can compute the solution of system (1.5) using the following procedure, known as back-substitution.

First, since by assumption a_{nn} ≠ 0, we solve the last equation with respect to the last unknown, i.e., we get

x_n = b_n / a_{nn}.

Second, we substitute that value of x_n in the next-to-last equation and solve it for the next-to-last unknown, i.e.,

$$x_{n-1} = \frac{b_{n-1} - a_{n-1,n}\,\dfrac{b_n}{a_{nn}}}{a_{n-1,n-1}},$$

and so on. The process ends when we have determined the first unknown, x_1.

Observe that the above procedure shows that the solution to a system in triangular form is unique since, at each step of the algorithm, the value of each x_i is uniquely determined, as a consequence of Proposition 3, conclusion 1.
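The back-substitution procedure translates directly into a single backward loop. Below is a minimal Python sketch, added here as an illustration (it is not part of the original notes); `A` and `b` are assumed to encode a triangular system (1.5) with nonzero diagonal.

    def back_substitution(A, b):
        """Solve Ax = b for an upper triangular A with nonzero diagonal,
        following the procedure in the proof of Proposition 13."""
        n = len(b)
        x = [0.0] * n
        for i in range(n - 1, -1, -1):          # last equation first
            s = sum(A[i][j] * x[j] for j in range(i + 1, n))
            x[i] = (b[i] - s) / A[i][i]         # A[i][i] != 0 by assumption
        return x

    # Example: x1 + 2x2 = 5, 3x2 = 3 has the unique solution (3, 1).
    print(back_substitution([[1.0, 2.0], [0.0, 3.0]], [5.0, 3.0]))  # [3.0, 1.0]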
Definition 14 A linear system (1.4) is said to be in echelon form if

1. no equation is degenerate, and
2. the leading unknown in each equation is to the right of the leading unknown of the preceding equation.

In other words, the system is of the form

$$\begin{cases} a_{11}x_1 + \dots + a_{1j_2}x_{j_2} + \dots + a_{1s}x_s = b_1 \\ a_{2j_2}x_{j_2} + \dots + a_{2j_3}x_{j_3} + \dots + a_{2s}x_s = b_2 \\ a_{3j_3}x_{j_3} + \dots + a_{3s}x_s = b_3 \\ \qquad\qquad \vdots \\ a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \dots + a_{rs}x_s = b_r \end{cases} \qquad (1.6)$$

with j_1 := 1 < j_2 < ... < j_r and a_{11}, a_{2j_2}, ..., a_{rj_r} ≠ 0. Observe that the above system has r equations and s variables and that s ≥ r. The leading unknown in equation i ∈ {1, ..., r} is x_{j_i}.
Remark 15 Systems with no degenerate equations are the interesting ones. If an equation is
degenerate and the right hand side term is zero, then you can erase it; if the right hand side term
is not zero, then the system has no solutions.
Definition 16 An unknown x_k in system (1.6) is called a free variable if x_k is not the leading unknown of any equation, i.e., ∀i ∈ {1, ..., r}, x_k ≠ x_{j_i}.
In system (1.6), there are r leading unknowns, r equations and s - r ≥ 0 free variables.
Proposition 17 Let a system in echelon form with r equations and s variables be given. Then the following results hold true.

1. If s = r, i.e., the number of unknowns is equal to the number of equations, then the system has a unique solution;
2. if s > r, i.e., the number of unknowns is greater than the number of equations, then we can arbitrarily assign values to the s - r > 0 free variables and obtain solutions of the system.
Proof. We prove the theorem by induction on the number r of equations of the system.

Step 1. r = 1.
In this case, we have a single, nondegenerate linear equation, to which Proposition 6 applies if s > r = 1, and Proposition 3 applies if s = r = 1.

Step 2.
Assume that r > 1 and that the desired conclusion is true for a system with r - 1 equations. Consider the given system in the form (1.6) and erase the first equation, so obtaining the following system:

$$\begin{cases} a_{2j_2}x_{j_2} + \dots + a_{2j_3}x_{j_3} + \dots + a_{2s}x_s = b_2 \\ a_{3j_3}x_{j_3} + \dots + a_{3s}x_s = b_3 \\ \qquad\qquad \vdots \\ a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \dots + a_{rs}x_s = b_r \end{cases} \qquad (1.7)$$

in the unknowns x_{j_2}, ..., x_s. First of all, observe that the above system is in echelon form and has r - 1 equations; therefore we can apply the induction argument, distinguishing the two cases s > r and s = r.

If s > r, then we can assign arbitrary values to the free variables, whose number is (the old number minus the erased ones)

s - r - (j_2 - j_1 - 1) = s - r - j_2 + 2,

and obtain a solution of system (1.7). Consider the first equation of the original system:

a_{11}x_1 + a_{12}x_2 + ... + a_{1,j_2-1}x_{j_2-1} + a_{1j_2}x_{j_2} + ... = b_1.    (1.8)

We immediately see that the above found values, together with arbitrary values for the additional j_2 - 2 free variables of equation (1.8), yield a solution of that equation, as desired. Observe also that the values given to the variables x_1, ..., x_{j_2-1} from the first equation do satisfy the other equations, simply because their coefficients are zero there.

If s = r, the system in echelon form, in fact, becomes a system in triangular form, and then the solution exists and is unique.
Remark 18 From the proof of the previous proposition, if the echelon system (1.6) contains more unknowns than equations, i.e., s > r, then the system has an infinite number of solutions, since each of the s - r ≥ 1 free variables may be assigned an arbitrary real number.
1.4 Reduction algorithm
The following algorithm (sometimes called row reduction) reduces system (1.4) of m equations and n unknowns to either echelon form or triangular form, or shows that the system has no solution. The algorithm then gives a proof of the following result.

Proposition 19 Any system of linear equations has either

1. infinitely many solutions, or
2. a unique solution, or
3. no solutions.
Reduction algorithm.

Consider a system of the form (1.4) such that

∀j ∈ {1, ..., n}, ∃i ∈ {1, ..., m} such that a_{ij} ≠ 0,    (1.9)

i.e., a system in which each variable has a nonzero coefficient in at least one equation. If that is not the case, the remaining variables can be renamed in order to have (1.9) satisfied.

Step 1. Interchange equations so that the first unknown, x_1, appears with a nonzero coefficient in the first equation, i.e., arrange that a_{11} ≠ 0.

Step 2. Use a_{11} as a pivot to eliminate x_1 from all equations but the first one. That is, for each i > 1, apply the elementary operation

[E_3]: (-a_{i1}/a_{11}) L_1 + L_i → L_i,    or    [E]: -a_{i1} L_1 + a_{11} L_i → L_i.

Step 3. Examine each new equation L:

1. If L has the form 0x_1 + 0x_2 + ... + 0x_n = 0, or if L is a multiple of another equation, then delete L from the system.³

2. If L has the form 0x_1 + 0x_2 + ... + 0x_n = b with b ≠ 0, then exit the algorithm: the system has no solutions.

Step 4. Repeat Steps 1, 2 and 3 with the subsystem formed by all the equations, excluding the first one.

Step 5. Continue the above process until the system is in echelon form or a degenerate equation is obtained in Step 3.2.

Summarizing, our method for solving system (1.4) consists of two steps.

Step A. Use the above reduction algorithm to reduce system (1.4) to an equivalent simpler system, in triangular form (1.5) or echelon form (1.6).

Step B. If the system is in triangular form, use back-substitution to find the solution; if the system is in echelon form, bring the free variables to the right hand side of each equation, give them arbitrary values (say, the name of the free variable with an upper bar), and then use back-substitution.

³ The justification of Step 3 is Proposition 9 and the fact that, if L = kL′ for some other equation L′ in the system, then the operation -kL′ + L → L replaces L by 0x_1 + 0x_2 + ... + 0x_n = 0, which again may be deleted by Proposition 9.
Example 20 Consider the system

$$\begin{cases} x_1 + 2x_2 + (-3)x_3 = -1 \\ 3x_1 + (-1)x_2 + 2x_3 = 7 \\ 5x_1 + 3x_2 + (-4)x_3 = 2 \end{cases}$$

Step A.

Step 1. Nothing to do.

Step 2. Apply the operations -3L_1 + L_2 → L_2 and -5L_1 + L_3 → L_3 to get

$$\begin{cases} x_1 + 2x_2 + (-3)x_3 = -1 \\ (-7)x_2 + 11x_3 = 10 \\ (-7)x_2 + 11x_3 = 7 \end{cases}$$

Step 3. Examine the new equations L_2 and L_3:

1. L_2 and L_3 do not have the form 0x_1 + 0x_2 + ... + 0x_n = 0; L_2 is not a multiple of L_3;
2. L_2 and L_3 do not have the form 0x_1 + 0x_2 + ... + 0x_n = b with b ≠ 0.

Step 4.

Step 1.1. Nothing to do.

Step 2.1. Apply the operation -L_2 + L_3 → L_3 to get

$$\begin{cases} x_1 + 2x_2 + (-3)x_3 = -1 \\ (-7)x_2 + 11x_3 = 10 \\ 0x_1 + 0x_2 + 0x_3 = -3 \end{cases}$$

Step 3.1. L_3 has the form 0x_1 + 0x_2 + ... + 0x_n = b with b = -3 ≠ 0: exit the algorithm. The system has no solutions.
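A compact computational sketch of Steps 1-5, in Python (an illustration added to these notes, not part of the original text): it row-reduces the augmented array of a system and either returns an echelon form or reports inconsistency.

    def to_echelon(M):
        """Reduce a list-of-lists augmented matrix to echelon form,
        mimicking Steps 1-5 of the reduction algorithm; returns None
        if a row 0 = b with b != 0 appears (no solutions)."""
        M = [row[:] for row in M]
        rows, cols = len(M), len(M[0])
        r = 0                                   # index of the current pivot row
        for j in range(cols - 1):               # last column holds the constants
            p = next((i for i in range(r, rows) if M[i][j] != 0), None)
            if p is None:
                continue                        # no pivot in this column
            M[r], M[p] = M[p], M[r]             # Step 1: interchange
            for i in range(r + 1, rows):        # Step 2: eliminate below the pivot
                k = M[i][j] / M[r][j]
                M[i] = [a - k * b for a, b in zip(M[i], M[r])]
            r += 1
        for row in M:                           # Step 3: degenerate equations
            if all(a == 0 for a in row[:-1]) and row[-1] != 0:
                return None
        return M

    # The system of Example 20: the last row reduces to 0 = -3, so no solutions.
    print(to_echelon([[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]]))  # None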
1.5 Matrices
Definition 21 Given m, n ∈ N \ {0}, a matrix (of real numbers) of order m × n is a table of real numbers with m rows and n columns, as displayed below:

$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1j} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2j} & \dots & a_{2n} \\ \vdots \\ a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{in} \\ \vdots \\ a_{m1} & a_{m2} & \dots & a_{mj} & \dots & a_{mn} \end{pmatrix}$$

For any i ∈ {1, ..., m} and any j ∈ {1, ..., n}, the real numbers a_{ij} are called entries of the matrix; the first subscript i denotes the row the entry belongs to, the second subscript j denotes the column the entry belongs to. We will usually denote matrices with capital letters, and we will write A_{m×n} to denote a matrix of order m × n. Sometimes it is useful to denote a matrix by its typical element, and we write [a_{ij}]_{i∈{1,...,m}, j∈{1,...,n}}, or simply [a_{ij}] if no ambiguity arises about the number of rows and columns. For i ∈ {1, ..., m},

$$\begin{pmatrix} a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{in} \end{pmatrix}$$

is called the i-th row of A and is denoted by R_i(A). For j ∈ {1, ..., n},

$$\begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{ij} \\ \vdots \\ a_{mj} \end{pmatrix}$$

is called the j-th column of A and is denoted by C_j(A).

We denote the set of m × n matrices by M_{m,n}, and we write, in an equivalent manner, A_{m×n} or A ∈ M_{m,n}.
Definition 22 The matrix

$$A_{m \times 1} = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}$$

is called a column vector, and the matrix

$$A_{1 \times n} = \begin{pmatrix} a_1 & \dots & a_n \end{pmatrix}$$

is called a row vector. We usually denote row or column vectors by small Latin letters.
Definition 23 The first nonzero entry in a row R of a matrix A_{m×n} is called the leading nonzero entry of R. If R has no leading nonzero entry, i.e., if every entry in R is zero, then R is called a zero row. If all the rows of A are zero, i.e., each entry of A is zero, then A is called a zero matrix, denoted by 0_{m×n} or simply 0 if no confusion arises.

In the previous sections, we defined systems of linear equations in triangular and echelon form. Below, we define triangular matrices, echelon matrices and a special kind of echelon matrix. In Section 1.6, we will see that there is a simple relationship between systems and matrices.
Definition 24 A matrix A_{m×n} is square if m = n. A square matrix A belonging to M_{m,m} is called a square matrix of order m.

Definition 25 Given A = [a_{ij}] ∈ M_{m,m}, the main diagonal of A is made up of the entries a_{ii} with i ∈ {1, ..., m}.

Definition 26 A square matrix A = [a_{ij}] ∈ M_{m,m} is an upper triangular matrix, or simply a triangular matrix, if all entries below the main diagonal are equal to zero, i.e., ∀i, j ∈ {1, ..., m}, if i > j, then a_{ij} = 0.

Definition 27 A ∈ M_{m,m} is called a diagonal matrix of order m if any element outside the principal diagonal is equal to zero, i.e., ∀i, j ∈ {1, ..., m} such that i ≠ j, a_{ij} = 0.
Definition 28 A matrix A ∈ M_{m,n} is called an echelon (form) matrix, or is said to be in echelon form, if the following two conditions hold:

1. All zero rows, if any, are at the bottom of the matrix.
2. The leading nonzero entry of each row is to the right of the leading nonzero entry in the preceding row.

Definition 29 If a matrix A is in echelon form, then its leading nonzero entries are called pivot entries, or simply pivots.

Remark 30 If a matrix A ∈ M_{m,n} is in echelon form and r is the number of its pivot entries, then r ≤ min{m, n}. In fact, r ≤ m because the matrix may have zero rows, and r ≤ n because the leading nonzero entry of the first row may be not in the first column, and each of the other leading nonzero entries must be strictly to the right of the previous leading nonzero entry.
Definition 31 A matrix A ∈ M_{m,n} is in row canonical form if

1. it is in echelon form,
2. each pivot is 1, and
3. each pivot is the only nonzero entry in its column.

Example 32 1. All the matrices below are echelon matrices; only the fourth one is in row canonical form.

$$\begin{pmatrix} 0 & 7 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 & 3 & 3 \\ 0 & 0 & 0 & 0 & 0 & 7 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad \begin{pmatrix} 2 & 3 & 2 & 0 & 1 & 2 & 4 \\ 0 & 0 & 1 & 1 & 3 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 7 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \quad \begin{pmatrix} 1 & 2 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad \begin{pmatrix} 0 & 1 & 3 & 0 & 0 & 4 \\ 0 & 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{pmatrix}$$

2. Any zero matrix is in row canonical form.

Remark 33 Let a matrix A_{m×n} in row canonical form be given. As a consequence of the definition, we have what follows.

1. If some rows of A are erased, the resulting matrix is still in row canonical form.
2. If some columns of zeros are added, the resulting matrix is still in row canonical form.
Definition 34 Denote by R_i the i-th row of a matrix A. An elementary row operation is one of the following operations on the rows of A:

[E_1] (Row interchange) Interchange R_i with R_j, an operation denoted by R_i ↔ R_j (which we can read "put R_i in the place of R_j and R_j in the place of R_i");

[E_2] (Row scaling) Multiply R_i by k ∈ R \ {0}, denoted by kR_i → R_i, k ≠ 0 (which we can read "put kR_i in the place of R_i, with k ≠ 0");

[E_3] (Row addition) Replace R_i by (k times R_j plus R_i), denoted by (R_i + kR_j) → R_i (which we can read "put R_i + kR_j in the place of R_i").

Sometimes we apply [E_2] and [E_3] in one step, i.e., we perform the following operation:

[E] Replace R_i by (k′ times R_j plus k ∈ R \ {0} times R_i), denoted by (k′R_j + kR_i) → R_i, k ≠ 0.
Definition 35 A matrix A ∈ M_{m,n} is said to be row equivalent to a matrix B ∈ M_{m,n} if B can be obtained from A by a finite number of elementary row operations.

It is hard not to recognize the similarity between the above operations and those used in solving systems of linear equations.

We use the expression "row reduce" with the meaning of "transform a given matrix into another matrix using row operations". The following algorithm row reduces a matrix A into a matrix in echelon form.
Row reduction algorithm to echelon form.

Consider a matrix A = [a_{ij}] ∈ M_{m,n}.

Step 1. Find the first column with a nonzero entry. Suppose it is column j_1.

Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j_1, i.e., so that a_{1j_1} ≠ 0.

Step 3. Use a_{1j_1} as a pivot to obtain zeros below a_{1j_1}, i.e., for each i > 1, apply the row operation

[E_3]: (-a_{ij_1}/a_{1j_1}) R_1 + R_i → R_i,    or    [E]: -a_{ij_1} R_1 + a_{1j_1} R_i → R_i.

Step 4. Repeat Steps 1, 2 and 3 with the submatrix formed by all the rows, excluding the first row.

Step 5. Continue the above process until the matrix is in echelon form.
Example 36 Let's apply the above algorithm to the following matrix:

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 3 & -1 & 2 & 7 \\ 5 & 3 & -4 & 2 \end{pmatrix}$$

Step 1. Find the first column with a nonzero entry: that is C_1, and therefore j_1 = 1.

Step 2. Interchange the rows so that a nonzero entry appears in the first row of column j_1, i.e., so that a_{1j_1} ≠ 0: here a_{1j_1} = a_{11} = 1 ≠ 0, so there is nothing to do.

Step 3. Use a_{11} as a pivot to obtain zeros below a_{11}. Apply the row operations -3R_1 + R_2 → R_2 and -5R_1 + R_3 → R_3 to get

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & -7 & 11 & 10 \\ 0 & -7 & 11 & 7 \end{pmatrix}$$

Step 4. Apply the operation -R_2 + R_3 → R_3 to get

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & -7 & 11 & 10 \\ 0 & 0 & 0 & -3 \end{pmatrix}$$

which is in echelon form.
Row reduction algorithm from echelon form to row canonical form.

Consider a matrix A = [a_{ij}] ∈ M_{m,n} in echelon form, say with pivots

a_{1j_1}, a_{2j_2}, ..., a_{rj_r}.

Step 1. Multiply the last nonzero row R_r by 1/a_{rj_r}, so that the leading nonzero entry of that row becomes 1.

Step 2. Use a_{rj_r} (now equal to 1) as a pivot to obtain zeros above the pivot, i.e., for each i ∈ {r - 1, r - 2, ..., 1}, apply the row operation

[E_3]: -a_{ij_r} R_r + R_i → R_i.

Step 3. Repeat Steps 1 and 2 for rows R_{r-1}, R_{r-2}, ..., R_2.

Step 4. Multiply R_1 by 1/a_{1j_1}.
Example 37 Consider the matrix

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & -7 & 11 & 10 \\ 0 & 0 & 0 & -3 \end{pmatrix}$$

in echelon form, with leading nonzero entries

a_{11} = 1, a_{22} = -7, a_{34} = -3.

Step 1. Multiply the last nonzero row R_3 by -1/3, so that its leading nonzero entry becomes 1:

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & -7 & 11 & 10 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Step 2. Use a_{rj_r} = a_{34} as a pivot to obtain zeros above the pivot, i.e., for each i ∈ {r - 1, r - 2, ..., 1} = {2, 1}, apply the row operation

[E_3]: -a_{ij_r} R_r + R_i → R_i,

which in our case are

-a_{24} R_3 + R_2 → R_2, i.e., -10R_3 + R_2 → R_2,
-a_{14} R_3 + R_1 → R_1, i.e., R_3 + R_1 → R_1.

Then we get

$$\begin{pmatrix} 1 & 2 & -3 & 0 \\ 0 & -7 & 11 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Step 3. Multiply R_2 by -1/7, and get

$$\begin{pmatrix} 1 & 2 & -3 & 0 \\ 0 & 1 & -\frac{11}{7} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Use a_{22} as a pivot to obtain zeros above the pivot, applying the operation -2R_2 + R_1 → R_1, to get

$$\begin{pmatrix} 1 & 0 & \frac{1}{7} & 0 \\ 0 & 1 & -\frac{11}{7} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

which is in row canonical form.
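The two algorithms together amount to Gauss-Jordan elimination. A minimal Python sketch of the second one (an added illustration, not from the original notes; exact rationals are used so the fractions 1/7 and -11/7 come out as in Example 37):

    from fractions import Fraction

    def to_row_canonical(M):
        """Turn an echelon-form matrix into row canonical form:
        scale each pivot to 1, then clear the entries above it."""
        M = [[Fraction(a) for a in row] for row in M]
        for r in range(len(M) - 1, -1, -1):             # bottom row first
            piv = next((j for j, a in enumerate(M[r]) if a != 0), None)
            if piv is None:
                continue                                # skip zero rows
            M[r] = [a / M[r][piv] for a in M[r]]        # Step 1: pivot -> 1
            for i in range(r):                          # Step 2: zeros above
                M[i] = [a - M[i][piv] * b for a, b in zip(M[i], M[r])]
        return M

    rcf = to_row_canonical([[1, 2, -3, -1], [0, -7, 11, 10], [0, 0, 0, -3]])
    print([[str(a) for a in row] for row in rcf])
    # [['1', '0', '1/7', '0'], ['0', '1', '-11/7', '0'], ['0', '0', '0', '1']]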
Proposition 38 Any matrix A ∈ M_{m,n} is row equivalent to a matrix in row canonical form.

Proof. The two above algorithms show that any matrix is row equivalent to at least one matrix in row canonical form.

Remark 39 In fact, in Proposition 150, we will show more: any matrix A ∈ M_{m,n} is row equivalent to a unique matrix in row canonical form.
1.6 Systems of linear equations and matrices
Definition 40 Given system (1.4), i.e., a system of m linear equations in the n unknowns x_1, x_2, ..., x_n,

$$\begin{cases} a_{11}x_1 + \dots + a_{1j}x_j + \dots + a_{1n}x_n = b_1 \\ \qquad\qquad\qquad \vdots \\ a_{i1}x_1 + \dots + a_{ij}x_j + \dots + a_{in}x_n = b_i \\ \qquad\qquad\qquad \vdots \\ a_{m1}x_1 + \dots + a_{mj}x_j + \dots + a_{mn}x_n = b_m, \end{cases}$$

the matrix

$$M = \begin{pmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} & b_1 \\ & & \vdots \\ a_{i1} & \dots & a_{ij} & \dots & a_{in} & b_i \\ & & \vdots \\ a_{m1} & \dots & a_{mj} & \dots & a_{mn} & b_m \end{pmatrix}$$

is called the augmented matrix M of system (1.4).
Each row of M corresponds to an equation of the system, and each column of M corresponds to the coefficients of an unknown, except the last column, which corresponds to the constants of the system.

In an obvious way, given an arbitrary matrix M, we can find a unique system whose associated matrix is M; moreover, given a system of linear equations, there is only one matrix M associated with it. We can therefore identify systems of linear equations with (augmented) matrices.

The coefficient matrix of the system is

$$A = \begin{pmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} \\ & & \vdots \\ a_{i1} & \dots & a_{ij} & \dots & a_{in} \\ & & \vdots \\ a_{m1} & \dots & a_{mj} & \dots & a_{mn} \end{pmatrix}$$

One way to solve a system of linear equations is as follows.

1. Reduce its augmented matrix M to echelon form, which tells whether the system has a solution; if the echelon form has a row of the form (0, 0, ..., 0, b) with b ≠ 0, then the system has no solution and you can stop. If the system admits solutions, go to the step below.

2. Reduce the matrix in echelon form obtained in the above step to its row canonical form. Write the corresponding system. In each equation, bring the free variables to the right hand side, obtaining a triangular system. Solve it by back-substitution.

The simple justification of this process comes from the following facts:

1. Applying any elementary row operation to the augmented matrix M of the system is equivalent to applying the corresponding operation to the system itself.

2. The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form (0, 0, ..., 0, b) with b ≠ 0, simply because such a row corresponds to a degenerate equation with a nonzero constant.

3. In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each nonfree variable is a leading nonzero entry which is equal to one and is the only nonzero entry in its column; hence the free-variable form of the solution is obtained by simply transferring the free-variable terms to the other side of each equation.
Example 41 Consider the system presented in Example 20:

$$\begin{cases} x_1 + 2x_2 + (-3)x_3 = -1 \\ 3x_1 + (-1)x_2 + 2x_3 = 7 \\ 5x_1 + 3x_2 + (-4)x_3 = 2 \end{cases}$$

The associated augmented matrix is

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 3 & -1 & 2 & 7 \\ 5 & 3 & -4 & 2 \end{pmatrix}$$

In Example 36, we have seen that the echelon form of the above matrix is

$$\begin{pmatrix} 1 & 2 & -3 & -1 \\ 0 & -7 & 11 & 10 \\ 0 & 0 & 0 & -3 \end{pmatrix}$$

whose last row is of the form (0, 0, ..., 0, b) with b = -3 ≠ 0; therefore the system has no solution.
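Assuming the `to_echelon` sketch given after Example 20 is available, the same conclusion can be reproduced mechanically (again an added illustration, not part of the original notes):

    # The augmented matrix of Example 41; to_echelon detects the
    # degenerate row 0 = -3 and reports that there is no solution.
    M = [[1, 2, -3, -1], [3, -1, 2, 7], [5, 3, -4, 2]]
    print(to_echelon(M))  # None: the system is inconsistent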
Chapter 2

The Euclidean Space R^n
2.1 Sum and scalar multiplication
It is well known that the real line is a representation of the set R of real numbers. Similarly, an ordered pair (x, y) of real numbers can be used to represent a point in the plane, and a triple (x, y, z) or (x_1, x_2, x_3) a point in space. In general, if n ∈ N_+ := {1, 2, ...}, we can define (x_1, x_2, ..., x_n) or (x_i)_{i=1}^n as a point in "n-space".

Definition 42 R^n := R × ... × R (n times).

In other words, R^n is the Cartesian product of R multiplied n times by itself.
Definition 43 The elements of R^n are ordered n-tuples of real numbers and are denoted by

x = (x_1, x_2, ..., x_n) or x = (x_i)_{i=1}^n.

x_i is called the i-th component of x ∈ R^n.

Definition 44 x = (x_i)_{i=1}^n ∈ R^n and y = (y_i)_{i=1}^n ∈ R^n are equal if

∀i ∈ {1, ..., n}, x_i = y_i.

In that case we write x = y.
Let us introduce two operations on R^n and analyze some properties they satisfy.

Definition 45 Given x ∈ R^n, y ∈ R^n, we call addition or sum of x and y the element denoted by x + y ∈ R^n obtained as follows:

x + y := (x_i + y_i)_{i=1}^n.

Definition 46 An element λ ∈ R is called a scalar.

Definition 47 Given x ∈ R^n and λ ∈ R, we call scalar multiplication of x by λ the element λx ∈ R^n obtained as follows:

λx := (λx_i)_{i=1}^n.
Geometrical interpretation of the two operations in the case n = 2: [picture to be inserted].

From the well known properties of the sum and product of real numbers, it is possible to verify that the following properties of the above operations do hold true.

Properties of addition.

A1. (Associative) ∀x, y, z ∈ R^n, (x + y) + z = x + (y + z);
A2. (Existence of a null element) there exists an element e in R^n such that for any x ∈ R^n, x + e = x; in fact, such an element is unique and is denoted by 0;
A3. (Existence of an inverse element) ∀x ∈ R^n ∃y ∈ R^n such that x + y = 0; in fact, that element is unique and is denoted by -x;
A4. (Commutative) ∀x, y ∈ R^n, x + y = y + x.

Properties of scalar multiplication.

M1. (Distributive) ∀λ ∈ R, ∀x ∈ R^n, ∀y ∈ R^n, λ(x + y) = λx + λy;
M2. (Distributive) ∀λ, μ ∈ R, ∀x ∈ R^n, (λ + μ)x = λx + μx;
M3. ∀λ, μ ∈ R, ∀x ∈ R^n, (λμ)x = λ(μx);
M4. ∀x ∈ R^n, 1x = x.
2.2 Scalar product
Definition 48 Given x = (x_i)_{i=1}^n, y = (y_i)_{i=1}^n ∈ R^n, we call dot, scalar or inner product of x and y, denoted by xy or x · y, the scalar

$$x \cdot y := \sum_{i=1}^{n} x_i y_i \in \mathbb{R}.$$

Remark 49 The scalar product of elements of R^n satisfies the following properties.

1. ∀x, y ∈ R^n, x · y = y · x;
2. ∀α, β ∈ R, ∀x, y, z ∈ R^n, (αx + βy) · z = α(x · z) + β(y · z);
3. ∀x ∈ R^n, x · x ≥ 0;
4. ∀x ∈ R^n, x · x = 0 ⇔ x = 0.
Definition 50 The set R^n with the above described three operations (addition, scalar multiplication and dot product) is usually called the Euclidean space of dimension n.

Definition 51 Given x = (x_i)_{i=1}^n ∈ R^n, we denote the (Euclidean) norm or length of x by

$$\|x\| := (x \cdot x)^{\frac{1}{2}} = \sqrt{\sum_{i=1}^{n} (x_i)^2}.$$
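Since all of these operations are componentwise, they map directly onto array arithmetic. A small NumPy sketch (an added illustration; the two vectors are arbitrary examples):

    import numpy as np

    x = np.array([3.0, 4.0])
    y = np.array([1.0, -2.0])

    print(x + y)               # addition: [4. 2.]
    print(2.0 * x)             # scalar multiplication: [6. 8.]
    print(np.dot(x, y))        # scalar product x.y = 3*1 + 4*(-2) = -5.0
    print(np.linalg.norm(x))   # Euclidean norm (x.x)^(1/2) = 5.0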
Geometrical interpretation of scalar products in R².

Given x = (x_1, x_2) ∈ R² \ {0}, from elementary trigonometry we know that

x = (‖x‖ cos α, ‖x‖ sin α),    (2.1)

where α is the measure of the angle between the positive part of the horizontal axis and x itself.

[picture to be inserted: Marcellini-Sbordone, page 178]

Using the above observation, we can verify that, given x = (x_1, x_2) and y = (y_1, y_2) in R² \ {0},

xy = ‖x‖ ‖y‖ cos θ,

where θ is an¹ angle between x and y.

[picture to be inserted: Marcellini-Sbordone, page 179]

From the picture and (2.1), we have

x = (‖x‖ cos α_1, ‖x‖ sin α_1)  and  y = (‖y‖ cos α_2, ‖y‖ sin α_2).

Then²

xy = ‖x‖ ‖y‖ (cos α_1 cos α_2 + sin α_1 sin α_2) = ‖x‖ ‖y‖ cos(α_2 - α_1).

Taking x and y not belonging to the same line, define θ := (angle between x and y with minimum measure). From the above equality, it follows that

θ = π/2 ⇔ x · y = 0;
θ < π/2 ⇔ x · y > 0;
θ > π/2 ⇔ x · y < 0.

¹ Recall that ∀x ∈ R, cos x = cos(-x) = cos(2π - x).
² Recall that cos(α_1 - α_2) = cos α_1 cos α_2 + sin α_1 sin α_2.

Definition 52 x, y ∈ R^n \ {0} are orthogonal if xy = 0.
2.3 Norms and Distances
Proposition 53 (Properties of the norm) Let λ ∈ R and x, y ∈ R^n. Then

1. ‖x‖ ≥ 0, and ‖x‖ = 0 ⇔ x = 0;
2. ‖λx‖ = |λ| ‖x‖;
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality);
4. |xy| ≤ ‖x‖ ‖y‖ (Cauchy-Schwarz inequality).
Proof. 1. By definition, ‖x‖ = (Σ_{i=1}^n (x_i)²)^{1/2} ≥ 0. Moreover, ‖x‖ = 0 ⇔ ‖x‖² = 0 ⇔ Σ_{i=1}^n (x_i)² = 0 ⇔ x = 0.

2. ‖λx‖ = (Σ_{i=1}^n λ²(x_i)²)^{1/2} = |λ| (Σ_{i=1}^n (x_i)²)^{1/2} = |λ| ‖x‖.

4. (3 is proved using 4.) We want to show that |xy| ≤ ‖x‖ ‖y‖, or |xy|² ≤ ‖x‖² ‖y‖², i.e.,

$$\left( \sum_{i=1}^{n} x_i y_i \right)^2 \le \left( \sum_{i=1}^{n} x_i^2 \right) \left( \sum_{i=1}^{n} y_i^2 \right).$$

Defining X := Σ_{i=1}^n x_i², Y := Σ_{i=1}^n y_i² and Z := Σ_{i=1}^n x_i y_i, we have to prove that

Z² ≤ XY.    (2.2)

Observe that ∀a ∈ R,

1. Σ_{i=1}^n (a x_i + y_i)² ≥ 0, and
2. Σ_{i=1}^n (a x_i + y_i)² = 0 ⇔ ∀i ∈ {1, ..., n}, a x_i + y_i = 0.

Moreover,

$$\sum_{i=1}^{n} (a x_i + y_i)^2 = a^2 \sum_{i=1}^{n} x_i^2 + 2a \sum_{i=1}^{n} x_i y_i + \sum_{i=1}^{n} y_i^2 = a^2 X + 2aZ + Y \ge 0. \qquad (2.3)$$

If X > 0, we can take a = -Z/X, and from (2.3) we get

$$0 \le \frac{Z^2}{X^2} X - 2\frac{Z^2}{X} + Y,$$

or Z² ≤ XY, as desired.

If X = 0, then x = 0 and Z = 0, and (2.2) is true simply because 0 ≤ 0.

3. It suffices to show that ‖x + y‖² ≤ (‖x‖ + ‖y‖)². Indeed,

$$\|x+y\|^2 = \sum_{i=1}^{n} (x_i + y_i)^2 = \sum_{i=1}^{n} \left( (x_i)^2 + 2 x_i y_i + (y_i)^2 \right) = \|x\|^2 + 2xy + \|y\|^2 \le$$
$$\le \|x\|^2 + 2|xy| + \|y\|^2 \overset{(4)}{\le} \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
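A quick numeric sanity check of properties 3 and 4 (an added illustration; the vectors are random draws, not data from the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(5), rng.standard_normal(5)

    # Cauchy-Schwarz: |x.y| <= ||x|| ||y||
    print(abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y))                 # True
    # Triangle inequality: ||x + y|| <= ||x|| + ||y||
    print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))      # True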
Remark 54 |‖x‖ - ‖y‖| ≤ ‖x - y‖.

Recall that ∀a, b ∈ R, -b ≤ a ≤ b ⇔ |a| ≤ b.

From Proposition 53.3, identifying x with x - y and y with y, we get ‖x - y + y‖ ≤ ‖x - y‖ + ‖y‖, i.e.,

‖x‖ - ‖y‖ ≤ ‖x - y‖.

From Proposition 53.3, identifying x with y - x and y with x, we get ‖y - x + x‖ ≤ ‖y - x‖ + ‖x‖, i.e.,

‖y‖ - ‖x‖ ≤ ‖y - x‖ = ‖x - y‖,

and therefore

-‖x - y‖ ≤ ‖x‖ - ‖y‖ ≤ ‖x - y‖.
Definition 55 For any n ∈ N \ {0} and for any i ∈ {1, ..., n}, e_n^i := (e_{n,j}^i)_{j=1}^n ∈ R^n with

$$e_{n,j}^i = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j. \end{cases}$$

In other words, e_n^i is the element of R^n whose components are all zero, but the i-th component, which is equal to 1. The vector e_n^i is called the i-th canonical vector in R^n.
Remark 56 ∀x ∈ R^n,

$$\|x\| \le \sum_{i=1}^{n} |x_i|,$$

as verified below:

$$\|x\| = \left\| \sum_{i=1}^{n} x_i e^i \right\| \overset{(1)}{\le} \sum_{i=1}^{n} \left\| x_i e^i \right\| \overset{(2)}{=} \sum_{i=1}^{n} |x_i| \left\| e^i \right\| = \sum_{i=1}^{n} |x_i|,$$

where (1) follows from the triangle inequality, i.e., Proposition 53.3, and (2) from Proposition 53.2.
Definition 57 Given x, y ∈ R^n, we denote the (Euclidean) distance between x and y by

d(x, y) := ‖x - y‖.

Proposition 58 (Properties of the distance) Let x, y, z ∈ R^n. Then

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).

Proof. 1. It follows from property 1 of the norm.
2. It follows from the definition of the distance as a norm.
3. Identifying x with x - y and y with y - z in property 3 of the norm, we get ‖(x - y) + (y - z)‖ ≤ ‖x - y‖ + ‖y - z‖, i.e., the desired result.
Remark 59 Let a set X be given.

Any function n : X → R which satisfies the properties listed below is called a norm. More precisely, we have what follows. Let λ ∈ R and x, y ∈ X. Assume that

1. n(x) ≥ 0, and n(x) = 0 ⇔ x = 0;
2. n(λx) = |λ| n(x);
3. n(x + y) ≤ n(x) + n(y).

Then n is called a norm and (X, n) is called a normed space.

Any function d : X × X → R satisfying properties 1, 2 and 3 presented for the Euclidean distance in Proposition 58 is called a distance or a metric. More precisely, we have what follows. Let x, y, z ∈ X. Assume that

1. d(x, y) ≥ 0, and d(x, y) = 0 ⇔ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z).

Then d is called a distance and (X, d) a metric space.
Chapter 3
Matrices
We presented the concept of matrix in Definition 21. In this chapter, we study further properties of matrices.
Definition 60 The transpose of a matrix A ∈ M_{m,n}, denoted by A^T, belongs to M_{n,m}, and it is the matrix obtained by writing the rows of A, in order, as columns:

$$A^T = \begin{pmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} \\ & & \vdots \\ a_{i1} & \dots & a_{ij} & \dots & a_{in} \\ & & \vdots \\ a_{m1} & \dots & a_{mj} & \dots & a_{mn} \end{pmatrix}^{\!T} = \begin{pmatrix} a_{11} & \dots & a_{i1} & \dots & a_{m1} \\ & & \vdots \\ a_{1j} & \dots & a_{ij} & \dots & a_{mj} \\ & & \vdots \\ a_{1n} & \dots & a_{in} & \dots & a_{mn} \end{pmatrix}.$$

In other words, row 1 of the matrix A becomes column 1 of A^T, row 2 of A becomes column 2 of A^T, and so on, up to row m, which becomes column m of A^T. The same result is obtained proceeding as follows: column 1 of A becomes row 1 of A^T, column 2 of A becomes row 2 of A^T, and so on, up to column n, which becomes row n of A^T. More formally, given A = [a_{ij}]_{i∈{1,...,m}, j∈{1,...,n}} ∈ M_{m,n}, then A^T = [a_{ji}]_{j∈{1,...,n}, i∈{1,...,m}} ∈ M_{n,m}.
Definition 61 A matrix A ∈ M_{n,n} is said to be symmetric if A = A^T, i.e., ∀i, j ∈ {1, ..., n}, a_{ij} = a_{ji}.
Remark 62 We can write a matrix A_{m×n} = [a_{ij}] as

$$A = \begin{pmatrix} R_1(A) \\ \vdots \\ R_i(A) \\ \vdots \\ R_m(A) \end{pmatrix} = \begin{pmatrix} C_1(A) & \dots & C_j(A) & \dots & C_n(A) \end{pmatrix},$$

where, for i ∈ {1, ..., m},

R_i(A) = [a_{i1}, ..., a_{ij}, ..., a_{in}] := [R_i^1(A), ..., R_i^j(A), ..., R_i^n(A)] ∈ R^n,

and, for j ∈ {1, ..., n},

$$C_j(A) = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{ij} \\ \vdots \\ a_{mj} \end{pmatrix} := \begin{pmatrix} C_j^1(A) \\ \vdots \\ C_j^i(A) \\ \vdots \\ C_j^m(A) \end{pmatrix} \in \mathbb{R}^m.$$

In other words, R_i(A) denotes row i of the matrix A and C_j(A) denotes column j of the matrix A.
3.1 Matrix operations
Definition 63 Two matrices A_{m×n} := [a_{ij}] and B_{m×n} := [b_{ij}] are equal if

∀i ∈ {1, ..., m}, ∀j ∈ {1, ..., n}, a_{ij} = b_{ij}.

Definition 64 Given the matrices A_{m×n} := [a_{ij}] and B_{m×n} := [b_{ij}], the sum of A and B, denoted by A + B, is the matrix C_{m×n} = [c_{ij}] such that

∀i ∈ {1, ..., m}, ∀j ∈ {1, ..., n}, c_{ij} = a_{ij} + b_{ij}.

Definition 65 Given the matrix A_{m×n} := [a_{ij}] and the scalar λ, the product of the matrix A by the scalar λ, denoted by λA or Aλ, is the matrix obtained by multiplying each entry of A by λ:

λA := [λa_{ij}].

Remark 66 It is easy to verify that the set of matrices M_{m,n} with the above defined sum and scalar multiplication satisfies all the properties listed for elements of R^n in Section 2.1.
Definition 67 Given A = [a_{ij}] ∈ M_{m,n} and B = [b_{jk}] ∈ M_{n,p}, the product A · B is the matrix C = [c_{ik}] ∈ M_{m,p} such that

$$\forall i \in \{1,...,m\},\ \forall k \in \{1,...,p\},\qquad c_{ik} := \sum_{j=1}^{n} a_{ij} b_{jk} = R_i(A) \cdot C_k(B),$$

i.e., since

$$A = \begin{pmatrix} R_1(A) \\ \vdots \\ R_i(A) \\ \vdots \\ R_m(A) \end{pmatrix}, \qquad B = \begin{pmatrix} C_1(B) & \dots & C_k(B) & \dots & C_p(B) \end{pmatrix}, \qquad (3.1)$$

$$AB = \begin{pmatrix} R_1(A)C_1(B) & \dots & R_1(A)C_k(B) & \dots & R_1(A)C_p(B) \\ & & \vdots \\ R_i(A)C_1(B) & \dots & R_i(A)C_k(B) & \dots & R_i(A)C_p(B) \\ & & \vdots \\ R_m(A)C_1(B) & \dots & R_m(A)C_k(B) & \dots & R_m(A)C_p(B) \end{pmatrix}. \qquad (3.2)$$

Remark 68 If A ∈ M_{1,n} and B ∈ M_{n,1}, the above definition coincides with the definition of the scalar product between elements of R^n. In what follows, we often identify an element of R^n with a row or a column vector (see Definition 22), consistently with what we write. In other words, A_{m×n} x = y means that x and y are column vectors (with n and m entries, respectively), and wA_{m×n} = z means that w and z are row vectors (with m and n entries, respectively).
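A direct Python transcription of Definition 67 (an added illustration, not part of the original notes): entry (i, k) of AB is the scalar product of row i of A with column k of B. The 2 × 3 and 3 × 2 matrices used are the ones that reappear in Remark 76 below.

    def matmul(A, B):
        """Product of A (m x n) and B (n x p) as in Definition 67:
        c_ik = sum_j a_ij * b_jk = R_i(A) . C_k(B)."""
        m, n, p = len(A), len(B), len(B[0])
        assert all(len(row) == n for row in A), "A's columns must match B's rows"
        return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
                for i in range(m)]

    A = [[1, 2, 1], [-1, 1, 3]]         # 2 x 3
    B = [[1, 0], [2, 1], [0, 1]]        # 3 x 2
    print(matmul(A, B))                 # [[5, 3], [1, 4]]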
Definition 69 If two matrices are such that a given operation between them is well defined, we say that they are conformable with respect to that operation.

Remark 70 If A, B ∈ M_{m,n}, they are conformable with respect to matrix addition. If A ∈ M_{m,n} and B ∈ M_{n,p}, they are conformable with respect to multiplying A on the left of B. We often simply say that the two matrices are conformable and let the context define precisely the sense in which conformability is to be understood.
Remark 71 (For future use.) ∀k ∈ {1, ..., p},

$$A \cdot C_k(B) = \begin{pmatrix} R_1(A) \\ \vdots \\ R_i(A) \\ \vdots \\ R_m(A) \end{pmatrix} C_k(B) = \begin{pmatrix} R_1(A)\,C_k(B) \\ \vdots \\ R_i(A)\,C_k(B) \\ \vdots \\ R_m(A)\,C_k(B) \end{pmatrix}. \qquad (3.3)$$

Then, just comparing (3.2) and (3.3), we get

$$AB = \begin{pmatrix} A\,C_1(B) & \dots & A\,C_k(B) & \dots & A\,C_p(B) \end{pmatrix}. \qquad (3.4)$$

Similarly, ∀i ∈ {1, ..., m},

$$R_i(A) \cdot B = R_i(A) \begin{pmatrix} C_1(B) & \dots & C_k(B) & \dots & C_p(B) \end{pmatrix} = \begin{pmatrix} R_i(A)C_1(B) & \dots & R_i(A)C_k(B) & \dots & R_i(A)C_p(B) \end{pmatrix}. \qquad (3.5)$$

Then, just comparing (3.2) and (3.5), we get

$$AB = \begin{pmatrix} R_1(A)\,B \\ \vdots \\ R_i(A)\,B \\ \vdots \\ R_m(A)\,B \end{pmatrix}. \qquad (3.6)$$
Definition 72 A submatrix of a matrix A ∈ M_{m,n} is a matrix obtained from A by erasing some rows and columns.

Definition 73 A matrix A ∈ M_{m,n} is partitioned in blocks if it is written as submatrices using a system of horizontal and vertical lines.

The reason for partitioning into blocks is that the result of operations on block matrices can be obtained by carrying out the computation with the blocks, just as if they were actual scalar entries of the matrices, as described below.
Remark 74 We verify below that, for matrix multiplication, we do not commit an error if, upon conformably partitioning two matrices, we proceed to regard the partitioned blocks as real numbers and apply the usual rules.

1. Take a := (a_i)_{i=1}^{n_1} ∈ R^{n_1}, b := (b_j)_{j=1}^{n_2} ∈ R^{n_2}, c := (c_i)_{i=1}^{n_1} ∈ R^{n_1}, d := (d_j)_{j=1}^{n_2} ∈ R^{n_2}. Then

$$\begin{pmatrix} a \mid b \end{pmatrix}_{1 \times (n_1+n_2)} \begin{pmatrix} c \\ \hline d \end{pmatrix}_{(n_1+n_2) \times 1} = \sum_{i=1}^{n_1} a_i c_i + \sum_{j=1}^{n_2} b_j d_j = a \cdot c + b \cdot d. \qquad (3.7)$$

2. Take A ∈ M_{m,n_1}, B ∈ M_{m,n_2}, C ∈ M_{n_1,p}, D ∈ M_{n_2,p}, with

$$A = \begin{pmatrix} R_1(A) \\ \vdots \\ R_m(A) \end{pmatrix}, \quad B = \begin{pmatrix} R_1(B) \\ \vdots \\ R_m(B) \end{pmatrix}, \quad C = \begin{pmatrix} C_1(C) & \dots & C_p(C) \end{pmatrix}, \quad D = \begin{pmatrix} C_1(D) & \dots & C_p(D) \end{pmatrix}.$$

Then

$$\begin{pmatrix} A \mid B \end{pmatrix}_{m \times (n_1+n_2)} \begin{pmatrix} C \\ \hline D \end{pmatrix}_{(n_1+n_2) \times p} = \begin{pmatrix} R_1(A)C_1(C) + R_1(B)C_1(D) & \dots & R_1(A)C_p(C) + R_1(B)C_p(D) \\ \vdots \\ R_m(A)C_1(C) + R_m(B)C_1(D) & \dots & R_m(A)C_p(C) + R_m(B)C_p(D) \end{pmatrix} =$$

$$= \begin{pmatrix} R_1(A)C_1(C) & \dots & R_1(A)C_p(C) \\ \vdots \\ R_m(A)C_1(C) & \dots & R_m(A)C_p(C) \end{pmatrix} + \begin{pmatrix} R_1(B)C_1(D) & \dots & R_1(B)C_p(D) \\ \vdots \\ R_m(B)C_1(D) & \dots & R_m(B)C_p(D) \end{pmatrix} = AC + BD.$$
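A NumPy check of this block rule (an added illustration; the shapes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    A, B = rng.standard_normal((2, 3)), rng.standard_normal((2, 4))
    C, D = rng.standard_normal((3, 5)), rng.standard_normal((4, 5))

    # [A | B] stacked against [C ; D] multiplies block-wise: AC + BD.
    lhs = np.hstack([A, B]) @ np.vstack([C, D])
    print(np.allclose(lhs, A @ C + B @ D))  # True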
Definition 75 Let matrices A_i ∈ M(n_i, n_i) for i ∈ {1, ..., K} be given. Then the matrix

$$A = \begin{pmatrix} A_1 & & & & \\ & A_2 & & & \\ & & \ddots & & \\ & & & A_i & \\ & & & & \ddots \\ & & & & & A_K \end{pmatrix} \in M\!\left( \sum_{i=1}^{K} n_i,\ \sum_{i=1}^{K} n_i \right)$$

is called a block diagonal matrix.

Very often, having information on the matrices A_i gives information on A.
Remark 76 It is easy, but cumbersome, to verify the following properties.

1. (Associative) ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, ∀C ∈ M_{p,q}, A(BC) = (AB)C;
2. (Distributive) ∀A, B ∈ M_{m,n}, ∀C ∈ M_{n,p}, (A + B)C = AC + BC;
3. ∀A ∈ M_{m,n}, ∀x, y ∈ R^n and ∀α, β ∈ R, A(αx + βy) = A(αx) + A(βy) = αAx + βAy.

It is false that:

1. (Commutative) ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, AB = BA;
2. ∀A ∈ M_{m,n}, ∀B, C ∈ M_{n,p}, (A ≠ 0 and AB = AC) ⇒ (B = C);
3. ∀A ∈ M_{m,n}, ∀B ∈ M_{n,p}, (A ≠ 0 and AB = 0) ⇒ (B = 0).
Let's show why the above statements are false.

1. Take

$$A = \begin{pmatrix} 1 & 2 & 1 \\ -1 & 1 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 0 & 1 \end{pmatrix}.$$

Then

$$AB = \begin{pmatrix} 1 & 2 & 1 \\ -1 & 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 5 & 3 \\ 1 & 4 \end{pmatrix}, \qquad BA = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 1 \\ -1 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 5 & 5 \\ -1 & 1 & 3 \end{pmatrix},$$

so AB and BA do not even have the same order. Moreover, taking

$$C = \begin{pmatrix} 1 & 2 \\ -1 & 1 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 0 \\ 3 & 2 \end{pmatrix},$$

we have

$$CD = \begin{pmatrix} 1 & 2 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 3 & 2 \end{pmatrix} = \begin{pmatrix} 7 & 4 \\ 2 & 2 \end{pmatrix}, \qquad DC = \begin{pmatrix} 1 & 0 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 1 & 8 \end{pmatrix},$$

so two square matrices of the same order need not commute either.

Observe that, since the commutative property does not hold true, we have to distinguish between factoring out on the left and factoring out on the right:

AB + AC = A(B + C),
EF + GF = (E + G)F,

while, in general,

AB + CA ≠ A(B + C)  and  AB + CA ≠ (B + C)A.

2. Given

$$A = \begin{pmatrix} 3 & 1 \\ 6 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 4 & 1 \\ -5 & 6 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix},$$

we have

$$AB = \begin{pmatrix} 3 & 1 \\ 6 & 2 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ -5 & 6 \end{pmatrix} = \begin{pmatrix} 7 & 9 \\ 14 & 18 \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 6 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} = AC,$$

with A ≠ 0 and B ≠ C.

3. Observe that 3 ⇒ 2, and therefore (not 2) ⇒ (not 3). Otherwise, you can simply observe that the falsity of 3 follows from the falsity of 2, choosing A in 3 equal to A in 2, and B in 3 equal to B - C in 2:

$$A(B - C) = \begin{pmatrix} 3 & 1 \\ 6 & 2 \end{pmatrix} \left[ \begin{pmatrix} 4 & 1 \\ -5 & 6 \end{pmatrix} - \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix} \right] = \begin{pmatrix} 3 & 1 \\ 6 & 2 \end{pmatrix} \begin{pmatrix} 3 & -1 \\ -9 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$

with A ≠ 0 and B - C ≠ 0.
Since the associative property of the product between matrices does hold true, we can give the following definition.

Definition 77 Given A ∈ M_{m,m} and k ∈ N \ {0},

A^k := A · A · ... · A  (k times).

Observe that, if A ∈ M_{m,m} and k, l ∈ N \ {0}, then

A^k A^l = A^{k+l}.
Remark 78 Properties of transpose matrices.

1. ∀A ∈ M_{m,n}, (A^T)^T = A;
2. ∀A, B ∈ M_{m,n}, (A + B)^T = A^T + B^T;
3. ∀λ ∈ R, ∀A ∈ M_{m,n}, (λA)^T = λA^T;
4. ∀A ∈ M_{m,n}, ∀B ∈ M_{n,m}, (AB)^T = B^T A^T.
Matrices and linear systems.

In Section 1.6, we saw that a system of m linear equations in the n unknowns x_1, x_2, ..., x_n, with parameters a_{ij} for i ∈ {1, ..., m}, j ∈ {1, ..., n}, and (b_i)_{i=1}^m ∈ R^m, is displayed below:

$$\begin{cases} a_{11}x_1 + \dots + a_{1j}x_j + \dots + a_{1n}x_n = b_1 \\ \qquad\qquad\qquad \vdots \\ a_{i1}x_1 + \dots + a_{ij}x_j + \dots + a_{in}x_n = b_i \\ \qquad\qquad\qquad \vdots \\ a_{m1}x_1 + \dots + a_{mj}x_j + \dots + a_{mn}x_n = b_m \end{cases} \qquad (3.8)$$

Moreover, the matrix

$$M = \begin{pmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} & b_1 \\ & & \vdots \\ a_{i1} & \dots & a_{ij} & \dots & a_{in} & b_i \\ & & \vdots \\ a_{m1} & \dots & a_{mj} & \dots & a_{mn} & b_m \end{pmatrix}$$

is called the augmented matrix M of the system, and the coefficient matrix A of the system is

$$A = \begin{pmatrix} a_{11} & \dots & a_{1j} & \dots & a_{1n} \\ & & \vdots \\ a_{i1} & \dots & a_{ij} & \dots & a_{in} \\ & & \vdots \\ a_{m1} & \dots & a_{mj} & \dots & a_{mn} \end{pmatrix}.$$

Using the notation we described in the present section, we can rewrite linear equations and systems of linear equations in a convenient and short manner, as described below.

The linear equation in the unknowns x_1, ..., x_n with parameters a_1, ..., a_i, ..., a_n, b ∈ R,

a_1 x_1 + ... + a_i x_i + ... + a_n x_n = b,

can be rewritten as

$$\sum_{i=1}^{n} a_i x_i = b \qquad \text{or} \qquad a \cdot x = b,$$

where a = [a_1, ..., a_n] is a row vector and x = (x_1, ..., x_n)^T is a column vector.

The linear system (3.8) can be rewritten as

$$\begin{cases} \sum_{j=1}^{n} a_{1j} x_j = b_1 \\ \qquad \vdots \\ \sum_{j=1}^{n} a_{mj} x_j = b_m \end{cases} \qquad \text{or} \qquad \begin{cases} R_1(A) \cdot x = b_1 \\ \qquad \vdots \\ R_m(A) \cdot x = b_m \end{cases} \qquad \text{or} \qquad Ax = b,$$

where A = [a_{ij}].
Definition 79 The trace of A ∈ M_{m,m}, written tr A, is the sum of the diagonal entries, i.e.,

$$\operatorname{tr} A = \sum_{i=1}^{m} a_{ii}.$$

Definition 80 The identity matrix I_m is the diagonal matrix of order m with each element on the principal diagonal equal to 1. If no confusion arises, we simply write I in the place of I_m.

Remark 81 1. ∀n ∈ N \ {0}, (I_m)^n = I_m;
2. ∀A ∈ M_{m,n}, I_m A = A I_n = A.

Proposition 82 Let A, B ∈ M(m, m) and k ∈ R. Then

1. tr(A + B) = tr A + tr B;
2. tr kA = k tr A;
3. tr AB = tr BA.

Proof. Exercise.
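Point 3 is worth a numeric check, since it holds even though AB ≠ BA in general (an added NumPy illustration; the matrices are random draws):

    import numpy as np

    rng = np.random.default_rng(2)
    A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

    print(np.allclose(A @ B, B @ A))                     # False: the products differ...
    print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # True: ...but the traces agree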
3.2 Inverse matrices
Definition 83 Given a matrix A_{n×n}, a matrix B_{n×n} is called an inverse of A if

AB = BA = I_n.

We then say that A is invertible, or that A admits an inverse.

Proposition 84 If A admits an inverse, then the inverse is unique.

Proof. Let B and C be inverse matrices of A. Then

AB = BA = I_n    (3.9)

and

AC = CA = I_n.    (3.10)

Left multiplying the first two terms in equality (3.9) by C, we get

(CA)B = C(BA),

and from (3.10) and (3.9) we get B = C, as desired.
Thanks to the above proposition, we can present the following definition.

Definition 85 If the inverse of A exists, then it is denoted by A^{-1}.

Example 86 Assume that, for i ∈ {1, ..., n}, λ_i ≠ 0. The diagonal matrix

$$\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$$

is invertible, and its inverse is

$$\begin{pmatrix} \lambda_1^{-1} & & \\ & \ddots & \\ & & \lambda_n^{-1} \end{pmatrix}.$$
Remark 87 If a row or a column of A is zero, then A is not invertible, as verified below.

Without loss of generality, assume the first row of A is equal to zero, and assume that B is the inverse of A. Then, since I = AB, we would have 1 = R_1(A) · C_1(B) = 0, a contradiction.
Proposition 88 If A ∈ M_{m,m} and B ∈ M_{m,m} are invertible matrices, then AB is invertible and

(AB)^{-1} = B^{-1} A^{-1}.

Proof.

(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I;

(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I.
Remark 89 The existence of the inverse matrix gives an obvious way of solving systems of linear equations with the same number of equations and unknowns. Given the system

A_{n×n} x = b,

if A^{-1} exists, then

x = A^{-1} b.
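For instance (an added NumPy illustration; note that in numerical practice `np.linalg.solve` is preferred to forming the inverse explicitly):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([3.0, 4.0])

    x = np.linalg.inv(A) @ b          # x = A^{-1} b, as in Remark 89
    print(x)                          # [1. 1.]
    print(np.allclose(A @ x, b))      # True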
Proposition 90 (Some other properties of the inverse matrix) Let the invertible matrix A be given.

1. A^{-1} is invertible and (A^{-1})^{-1} = A;
2. A^T is invertible and (A^T)^{-1} = (A^{-1})^T.

Proof. 1. We want to verify that the inverse of A^{-1} is A, i.e.,

A^{-1} A = I and A A^{-1} = I,

which is obvious.

2. Observe that

A^T (A^{-1})^T = (A^{-1} A)^T = I^T = I,

and

(A^{-1})^T A^T = (A A^{-1})^T = I.
3.3 Elementary matrices
Below, we recall the definition of elementary row operations on a matrix A ∈ M_{m,n} presented in Definition 34.

Definition 91 An elementary row operation on a matrix A ∈ M_{m,n} is one of the following operations on the rows of A:

[E_1] (Row interchange) Interchange R_i with R_j, denoted by R_i ↔ R_j;

[E_2] (Row scaling) Multiply R_i by k ∈ R \ {0}, denoted by kR_i → R_i, k ≠ 0;

[E_3] (Row addition) Replace R_i by (k times R_j plus R_i), denoted by (R_i + kR_j) → R_i.

Sometimes we apply [E_2] and [E_3] in one step, i.e., we perform the following operation:

[E′] Replace R_i by (k′ times R_j plus k ∈ R \ {0} times R_i), denoted by (k′R_j + kR_i) → R_i, k ≠ 0.
Definition 92 Let 𝓔 be the set of functions E : M_{m,n} → M_{m,n} which associate with any matrix A ∈ M_{m,n} a matrix E(A) obtained from A via an elementary row operation presented in Definition 91. For i ∈ {1, 2, 3}, let 𝓔_i ⊆ 𝓔 be the set of elementary row operation functions of type i presented in Definition 91.

Definition 93 For any E ∈ 𝓔, define

E_E = E(I_m) ∈ M_{m,m}.

E_E is called the elementary matrix corresponding to the elementary row operation function E. With some abuse of terminology, we call any E ∈ 𝓔 an elementary row operation (omitting the word "function"), and we sometimes omit the subscript E.
Proposition 94 Each elementary row operation of type E_1, E_2, E_3 has an inverse, and that inverse is of the same type, i.e., for i ∈ {1, 2, 3}, E ∈ 𝓔_i ⇒ E^{-1} ∈ 𝓔_i.

Proof. 1. The inverse of R_i ↔ R_j is R_j ↔ R_i.
2. The inverse of kR_i → R_i, k ≠ 0, is k^{-1}R_i → R_i.
3. The inverse of (R_i + kR_j) → R_i is (-kR_j + R_i) → R_i.
Remark 95 Given the canonical vectors e_n^i (see Definition 55), regarded as row vectors, for i ∈ {1, ..., n},

$$I_n = \begin{pmatrix} e_n^1 \\ \vdots \\ e_n^i \\ \vdots \\ e_n^j \\ \vdots \\ e_n^n \end{pmatrix}.$$

The following proposition shows that the result of applying an elementary row operation E to a matrix A can be obtained by premultiplying A by the corresponding elementary matrix E_E.
Proposition 96 For any A ∈ M_{m,n} and for any E ∈ 𝓔,

E(A) = E(I) A := E_E A.    (3.11)

Proof. Recall that

$$A = \begin{pmatrix} R_1(A) \\ \vdots \\ R_i(A) \\ \vdots \\ R_j(A) \\ \vdots \\ R_m(A) \end{pmatrix}.$$
We have to prove that (3.11) does hold true E {E
1
, E
2
, E
3
}.
1. E E
1
.
First of all observe that
E (I) =

e
1
m
...
e
j
m
...
e
i
m
...
e
m
m

and E (A) =

R
1
(A)
...
R
j
(A)
...
R
i
(A)
...
R
m
(A)

.
From (3.6),
E (I) A =

e
1
m
A
...
e
j
m
A
...
e
i
m
A
...
e
m
m
A

R
1
(A)
...
R
j
(A)
...
R
i
(A)
...
R
m
(A)

,
as desired.
2. E E
2
.
Observe that
E (I) =

e
1
m
...
k e
i
m
...
e
j
m
...
e
m
m

and E (A) =

R
1
(A)
...
k R
j
(A)
...
R
i
(A)
...
R
m
(A)

.
E (I) A =

e
1
m
A
...
k e
i
m
A
...
e
j
m
A
...
e
m
m
A

R
1
(A)
...
k R
i
(A)
...
R
j
(A)
...
R
m
(A)

,
as desired.
3. E E
3
.
Observe that
E (I) =

e
1
m
...
e
i
m
+k e
j
m
...
e
j
m
...
e
m
m

and E (A) =

R
1
(A)
...
R
i
(A) +k R
j
(A)
...
R
i
(A)
...
R
m
(A)

.
E (I) A =

e
1
m
...
e
i
m
+k e
j
m
...
e
j
m
...
e
m
m

A =

e
1
m
A
...

e
i
m
+k e
j
m

A
...
e
j
m
A
...
e
m
m
A

R
1
(A)
...
R
i
(A) +k R
j
(A)
...
R
j
(A)
...
R
m
(A)

,
as desired.
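A quick numerical check of Proposition 96 (a sketch assuming numpy; the matrix and the chosen operation are arbitrary):

import numpy as np

def row_add(M, i, j, k):
    # Return a copy of M with row i replaced by row i + k * row j.
    M = M.copy()
    M[i] += k * M[j]
    return M

A = np.arange(12.0).reshape(3, 4)
E = row_add(np.eye(3), 0, 2, 2.0)                  # E(I) for (R_1 + 2R_3) -> R_1
assert np.allclose(row_add(A, 0, 2, 2.0), E @ A)   # E(A) = E(I) A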
Corollary 97 If A is row equivalent to B, then there exist k ∈ N and elementary matrices E_1, ..., E_k such that
B = E_1 E_2 ... E_k A.
Proof. It follows from the definition of row equivalence and Proposition 96.
Proposition 98 Every elementary matrix E_E is invertible and (E_E)^{-1} is an elementary matrix. In fact, (E_E)^{-1} = E_{E^{-1}}.
Proof. Given an elementary matrix E, from Definition 93, there exists E ∈ E such that
E = E(I).  (3.12)
Define
E' = E^{-1}(I).
Then
I = E^{-1}(E(I)) = E^{-1}(E) = E^{-1}(I) E = E' E,
where the first equality follows from the definition of inverse function, the second from (3.12), the third from Proposition 96 and the fourth from the definition of E'. Moreover,
I = E(E^{-1}(I)) = E(E') = E(I) E' = E E',
where the first equality follows from the definition of inverse function, the second from the definition of E', the third from Proposition 96 and the fourth from (3.12).
Corollary 99 If E_1, ..., E_k are elementary matrices, then
P := E_1 E_2 ... E_k
is an invertible matrix.
Proof. It follows from Proposition 88 and Proposition 98. In fact, E_k^{-1} ... E_2^{-1} E_1^{-1} is the inverse of P.
Proposition 100 Let A ∈ M_{m,n} be given. Then, there exist a matrix B ∈ M_{m,n} in row canonical form, k ∈ N and elementary matrices E_1, ..., E_k such that
B = E_1 E_2 ... E_k A.
Proof. From Proposition 38, there exist k ∈ N elementary operations E_1, ..., E_k such that
(E_1 ∘ E_2 ∘ ... ∘ E_k)(A) = B.
From Proposition 96, ∀j ∈ {1, ..., k},
E_j(M) = E_j(I) M := E_j M.
Then,
(E_1 ∘ E_2 ∘ ... ∘ E_k)(A) = (E_1 ∘ E_2 ∘ ... ∘ E_{k−1})(E_k(A)) = (E_1 ∘ E_2 ∘ ... ∘ E_{k−1})(E_k A) =
= (E_1 ∘ E_2 ∘ ... ∘ E_{k−2})(E_{k−1}(E_k A)) = (E_1 ∘ E_2 ∘ ... ∘ E_{k−2})(E_{k−1} E_k A) =
= ... = E_1 E_2 ... E_k A,
as desired.
Remark 101 In fact, in Proposition 150, we will show that the matrix B of the above Proposition is unique.
Proposition 102 To be row equivalent is an equivalence relation.
Proof. Obvious.
Proposition 103 ∀n ∈ N\{0}, A_{n×n} is in row canonical form and it is invertible ⇔ A = I.
Proof. [⇐] Obvious.
[⇒] We proceed by induction on n.
Case 1. n = 1.
The case n = 1 is obvious. To try to better understand the logic of the proof, take n = 2, i.e., suppose that
A = [a_{11}, a_{12}; a_{21}, a_{22}]
is in row canonical form and invertible. Observe that A ≠ 0.
1. a_{11} = 1. Suppose a_{11} = 0. Then, from 1. in the definition of matrix in echelon form - see Definition 28 - a_{12} ≠ 0 (otherwise, you would have a zero row not on the bottom of the matrix). Then, from 2. in that definition, we must have a_{21} = 0. But then the first column is zero, contradicting the fact that A is invertible - see Remark 87. Since a_{11} ≠ 0, then from 2. in the definition of row canonical form matrix - see Definition 31 - we get a_{11} = 1.
2. a_{21} = 0. It follows from the fact that a_{11} = 1 and 3. in Definition 31.
3. a_{22} = 1. Suppose a_{22} = 0; but then the last row would be zero, contradicting the fact that A is invertible. Therefore a_{22} is the leading nonzero entry of the second row, i.e., a_{22} ≠ 0. Then from 2. in the definition of row canonical form matrix, we get a_{22} = 1.
4. a_{12} = 0. It follows from the fact that a_{22} = 1 and 3. in Definition 31.
Case 2. Assume that the statement is true for n − 1.
Suppose that
A = [a_{11}, a_{12}, ..., a_{1j}, ..., a_{1n}; a_{21}, a_{22}, ..., a_{2j}, ..., a_{2n}; ...; a_{i1}, a_{i2}, ..., a_{ij}, ..., a_{in}; ...; a_{n1}, a_{n2}, ..., a_{nj}, ..., a_{nn}]
is in row canonical form and invertible.
1. a_{11} = 1. Suppose a_{11} = 0. Then, from 1. in the definition of matrix in echelon form - see Definition 28 -
(a_{12}, ..., a_{1n}) ≠ 0.
Then, from 2. in that definition, we must have
(a_{21}, ..., a_{i1}, ..., a_{n1}) = 0.
But then the first column is zero, contradicting the fact that A is invertible - see Remark 87. Since a_{11} ≠ 0, then from 2. in the definition of row canonical form matrix - see Definition 31 - we get a_{11} = 1.
2. Therefore, we can rewrite the matrix as follows:
A = [1, a; 0, A_{22}]  (3.13)
with obvious definitions of a and A_{22}. Since, by assumption, A is invertible, there exists B, which we can partition in the same way we partitioned A, i.e.,
B = [b_{11}, b; c, B_{22}],
and such that B is the inverse of A. Then,
I_n = BA = [b_{11}, b; c, B_{22}] [1, a; 0, A_{22}] = [b_{11}, b_{11} a + b A_{22}; c, c a + B_{22} A_{22}] = [1, 0; 0, I_{n−1}];
then c = 0 and B_{22} A_{22} = I_{n−1}.
Moreover,
I_n = AB = [1, a; 0, A_{22}] [b_{11}, b; c, B_{22}] = [b_{11} + a c, b + a B_{22}; A_{22} c, A_{22} B_{22}] = [1, 0; 0, I_{n−1}].  (3.14)
Therefore, A_{22} is invertible. From (3.13), A_{22} can be obtained from A erasing the first row and then erasing a column of zeros; from Remark 33, A_{22} is a row canonical form matrix. Then, we can apply the assumption of the induction argument to conclude that A_{22} = I_{n−1}. Then, from (3.13),
A = [1, a; 0, I].
Since, by assumption, A_{n×n} is in row canonical form, from 3. in Definition 31, a = 0, and, as desired, A = I.
Proposition 104 Let A belong to M_{m,m}. Then the following statements are equivalent.
1. A is invertible;
2. A is row equivalent to I_m;
3. A is the product of elementary matrices.
Proof. 1. ⇒ 2.
From Proposition 100, there exist a matrix B ∈ M_{m,m} in row canonical form, k ∈ N and elementary matrices E_1, ..., E_k such that
B = E_1 E_2 ... E_k A.
Since A is invertible and, from Corollary 99, E_1 E_2 ... E_k is invertible as well, from Proposition 88, B is invertible as well. Then, from Proposition 103, B = I.
2. ⇒ 3.
By assumption and from Corollary 97, there exist k ∈ N and elementary matrices E_1, ..., E_k such that
I = E_1 E_2 ... E_k A,
and, using Proposition 88,
A = (E_1 E_2 ... E_k)^{-1} I = E_k^{-1} ... E_2^{-1} E_1^{-1} I.
Since, ∀i ∈ {1, ..., k}, E_i^{-1} is an elementary matrix, the desired result follows.
3. ⇒ 1.
By assumption, there exist k ∈ N and elementary matrices E_1, ..., E_k such that
A = E_1 E_2 ... E_k.
Since, from Proposition 98, ∀i ∈ {1, ..., k}, E_i is invertible, A is invertible as well, from Proposition 88.
Proposition 105 Let A_{m×n} be given.
1. B_{m×n} is row equivalent to A_{m×n} ⇔ there exists an invertible P_{m×m} such that B = PA.
2. P_{m×m} is an invertible matrix ⇒ PA is row equivalent to A.
Proof. 1.
[⇒] From Corollaries 99 and 97, B = E_1 ... E_k A with E_1 ... E_k an invertible matrix. Then, it suffices to take P = E_1 ... E_k.
[⇐] From Proposition 104, P is row equivalent to I, i.e., there exist E_1, ..., E_k such that P = E_1 ... E_k I. Then, by assumption, B = E_1 ... E_k I A, i.e., B is row equivalent to A.
2.
From Proposition 104, P is the product of elementary matrices. Then, the desired result follows from Proposition 96.
Proposition 106 If A is row equivalent to a matrix with a zero row, then A is not invertible.
Proof. Suppose otherwise, i.e., A is row equivalent to a matrix C with a zero row and A is invertible. From Proposition 105, there exists an invertible P such that A = PC and then P^{-1} A = C. Since A and P^{-1} are invertible, then, from Proposition 88, P^{-1} A is invertible, while C, from Remark 87, is not invertible, a contradiction.
Remark 107 From Proposition 104, we know that if A_{m×m} is invertible, then there exist E_1, ..., E_k such that
E_1 ... E_k A = I  (3.15)
and, taking inverses of both sides,
(E_1 ... E_k A)^{-1} = I,
i.e.,
A^{-1} E_k^{-1} ... E_1^{-1} = I,
or
A^{-1} = E_1 ... E_k I.  (3.16)
Then, from (3.15) and (3.16), if A is invertible then A^{-1} is equal to the finite product of those elementary matrices which transform A into I, or, equivalently, can be obtained applying a finite number of corresponding elementary operations to the identity matrix I. That observation leads to the following (Gaussian elimination) algorithm, which either shows that an arbitrary matrix A_{m×m} is not invertible or finds the inverse of A.
An algorithm to find the inverse of a matrix A_{m×m} or to show the matrix is not invertible.
Step 1. Construct the following matrix M ∈ M_{m,2m}:
[ A | I_m ].
Step 2. Row reduce M to echelon form. If the process generates a zero row in the part of M corresponding to A, then stop: A is not invertible, since A is row equivalent to a matrix with a zero row and therefore, from Proposition 106, is not invertible. Otherwise, the part of M corresponding to A is a triangular matrix.
Step 3. Row reduce M to the row canonical form
[ I_m | B ].
Then, from Remark 107, A^{-1} = B.
Example 108 We find the inverse of
A =
[ 1   0  2 ]
[ 2  −1  3 ]
[ 4   1  8 ]
applying the above algorithm.
Step 1.
M =
[ 1   0  2 | 1 0 0 ]
[ 2  −1  3 | 0 1 0 ]
[ 4   1  8 | 0 0 1 ].
Step 2.
[ 1   0   2 |  1 0 0 ]      [ 1   0   2 |  1 0 0 ]
[ 0  −1  −1 | −2 1 0 ]  →  [ 0  −1  −1 | −2 1 0 ]
[ 0   1   0 | −4 0 1 ]      [ 0   0  −1 | −6 1 1 ]
The matrix is invertible.
Step 3.
[ 1   0   2 |  1 0 0 ]      [ 1  0  2 | 1  0  0 ]      [ 1  0  0 | −11  2  2 ]
[ 0  −1  −1 | −2 1 0 ]  →  [ 0  1  1 | 2 −1  0 ]  →  [ 0  1  0 |  −4  0  1 ]
[ 0   0  −1 | −6 1 1 ]      [ 0  0  1 | 6 −1 −1 ]      [ 0  0  1 |   6 −1 −1 ]
Then
A^{-1} =
[ −11   2   2 ]
[  −4   0   1 ]
[   6  −1  −1 ].
Example 109
[ 1  3 | 1 0 ]
[ 4  2 | 0 1 ]
→
[ 1   3 |  1  0 ]
[ 0 −10 | −4  1 ]
→
[ 1  3 | 1     0     ]
[ 0  1 | 4/10  −1/10 ]
→
[ 1  0 | −2/10   3/10 ]
[ 0  1 |  4/10  −1/10 ]
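The following sketch implements the algorithm of Remark 107 (assuming numpy; partial pivoting is added for numerical stability, which the hand computation above does not need):

import numpy as np

def inverse_or_none(A, tol=1e-12):
    # Gauss-Jordan: row reduce [A | I]; return A^{-1}, or None if a zero
    # pivot column shows that A is not invertible (Proposition 106).
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    M = np.hstack([A, np.eye(m)])              # Step 1: M = [A | I]
    for col in range(m):
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if abs(M[pivot, col]) < tol:
            return None                        # A is not invertible
        M[[col, pivot]] = M[[pivot, col]]      # row interchange [E_1]
        M[col] /= M[col, col]                  # row scaling [E_2]
        for r in range(m):                     # row additions [E_3]
            if r != col:
                M[r] -= M[r, col] * M[col]
    return M[:, m:]                            # Step 3: M = [I | A^{-1}]

A = [[1, 0, 2], [2, -1, 3], [4, 1, 8]]         # the matrix of Example 108
print(inverse_or_none(A))                      # [[-11, 2, 2], [-4, 0, 1], [6, -1, -1]]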
3.4 Elementary column operations
This section repeats some of the discussion of the previous section using columns instead of rows of a matrix.
Definition 110 An elementary column operation is one of the following operations on the columns of A_{m×n}:
[F_1] (Column interchange) Interchange C_i with C_j, denoted by C_i ↔ C_j;
[F_2] (Column scaling) Multiply C_i by k ∈ R\{0}, denoted by kC_i → C_i, k ≠ 0;
[F_3] (Column addition) Replace C_i by (k times C_j plus C_i), denoted by (C_i + kC_j) → C_i.
Each of the above column operations has an inverse operation of the same type, just like the corresponding row operations.
Definition 111 Let F be an elementary column operation on a matrix A_{m×n}. We denote the resulting matrix by F(A). We also define
F_F = F(I_n) ∈ M_{n,n}.
F_F is then called an elementary matrix corresponding to the elementary column operation F. We sometimes omit the subscript F.
Definition 112 Given an elementary row operation E, define F_E, if it exists², as the column operation obtained from E substituting the word row with the word column. Similarly, given an elementary column operation F, define E_F, if it exists, as the row operation obtained from F substituting the word column with the word row.
In what follows, F and E are such that F = F_E and E_F = E.
² Of course, if you exchange the first and the third row, and the matrix has only two columns, you cannot exchange the first and the third column.
Proposition 113 Let a matrix A_{m×n} be given. Then
F(A) = (E(A^T))^T.
Proof. The above fact is equivalent to E(A^T) = (F(A))^T and it is a consequence of the fact that the columns of A are the rows of A^T and vice versa. As an exercise, carefully do the proof in the case of each of the three elementary operation types.
Remark 114 The above Proposition says that applying the column operation F to a matrix A gives the same result as applying the corresponding row operation E_F to A^T and then taking the transpose.
Proposition 115 Let a matrix A_{m×n} be given. Then
1.
F(A) = A (E(I))^T = A F(I),
or, since E := E(I) and F := F(I),
F(A) = A E^T = A F.  (3.17)
2. F = E^T and F is invertible.
Proof. 1.
F(A) = (E(A^T))^T = (E(I) A^T)^T = A (E(I))^T = A F(I),
where the first equality follows from Proposition 113, the second from Proposition 96 and the last from Proposition 113 applied to the identity matrix.
2. From (3.17), we then get
F := F(I) = I E^T = E^T.
From Proposition 90 and Proposition 98, it follows that F is invertible.
Remark 116 The above Proposition says that the result of applying an elementary column operation F on a matrix A can be obtained by postmultiplying A by the corresponding elementary matrix F.
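A one-line numerical check of this fact (a sketch assuming numpy; the matrix and operation are arbitrary):

import numpy as np

# Interchanging C_1 and C_3 equals postmultiplication by F = E^T,
# where E = E(I) is the corresponding row-interchange elementary matrix.
A = np.arange(6.0).reshape(2, 3)
E = np.eye(3)[[2, 1, 0]]                      # E(I): rows 1 and 3 swapped
assert np.allclose(A[:, [2, 1, 0]], A @ E.T)  # F(A) = A E^T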
Definition 117 A matrix B_{m×n} is said to be column equivalent to a matrix A_{m×n} if B can be obtained from A using a finite number of elementary column operations.
Remark 118 By definition of row equivalent, column equivalent and transpose of a matrix, we have that
A and B are row equivalent ⇔ A^T and B^T are column equivalent,
and
A and B are column equivalent ⇔ A^T and B^T are row equivalent.
Proposition 119 1. B_{m×n} is column equivalent to A_{m×n} ⇔ there exists an invertible Q_{n×n} such that B_{m×n} = A_{m×n} Q_{n×n}.
2. Q_{n×n} is an invertible matrix ⇒ AQ is column equivalent to A.
Proof. It is very similar to the proof of Proposition 105.
Definition 120 A matrix B_{m×n} is said to be equivalent to a matrix A_{m×n} if B can be obtained from A using a finite number of elementary row and column operations.
Proposition 121 A matrix B_{m×n} is equivalent to a matrix A_{m×n} ⇔ there exist invertible matrices P_{m×m} and Q_{n×n} such that B_{m×n} = P_{m×m} A_{m×n} Q_{n×n}.
Proof. [⇒]
By assumption, B = E_1 ... E_k A F_1 ... F_h.
[⇐]
Similar to the proof of Proposition 105.
Proposition 122 For any matrix A_{m×n} there exists a number r ∈ {0, 1, ..., min{m, n}} such that A is equivalent to the block matrix of the form
[ I_r  0 ]
[ 0    0 ].  (3.18)
Proof. The proof is constructive in the form of an algorithm.
Step 1. Row reduce A to row canonical form, with leading nonzero entries a_{1j_1}, a_{2j_2}, ..., a_{rj_r}.
Step 2. Interchange C_1 and C_{j_1}, C_2 and C_{j_2}, and so on up to C_r and C_{j_r}. You then get a matrix of the form
[ I_r  B ]
[ 0    0 ].
Step 3. Use column operations to replace the entries in B with zeros.
Remark 123 From Proposition 150 the matrix in Step 2 is unique and therefore the resulting matrix in Step 3, i.e., matrix (3.18), is unique.
Proposition 124 For any A ∈ M_{m,n}, there exist invertible matrices P ∈ M_{m,m} and Q ∈ M_{n,n} and r ∈ {0, 1, ..., min{m, n}} such that
PAQ =
[ I_r  0 ]
[ 0    0 ].
Proof. It follows immediately from Propositions 122 and 121.
Remark 125 From Proposition 150 the number r in the statement of the previous Proposition is unique.
Proposition 126 If A_{m×m} B_{m×m} = I, then BA = I and therefore A is invertible and A^{-1} = B.
Proof. Suppose A is not invertible; then, from Proposition 104, it is not row equivalent to I_m and, from Proposition 122, A is equivalent to a block matrix of the form displayed in (3.18) with r < m. Then, from Proposition 121, there exist invertible matrices P_{m×m} and Q_{m×m} such that
PAQ =
[ I_r  0 ]
[ 0    0 ],
and from AB = I, we get
P = P (AB) = PAQ Q^{-1} B
and therefore
[ I_r  0 ]
[ 0    0 ] (Q^{-1} B) = P.
Therefore, P has some zero rows, contradicting the fact that P is invertible.
Remark 127 The previous Proposition says that to verify that A is invertible it is enough to check that AB = I.
Remark 128 We will come back to the analysis of further properties of the inverse and of another way of computing it in Section 5.3.
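A numerical illustration of Proposition 126 (a sketch assuming numpy; the matrix is random, with a fixed seed for repeatability):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = np.linalg.inv(A)                   # built so that AB = I
assert np.allclose(A @ B, np.eye(4))
assert np.allclose(B @ A, np.eye(4))   # BA = I follows, as the Proposition says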
Chapter 4
Vector spaces
4.1 Definition
Definition 129 Let a nonempty set F with the operations of
addition, which assigns to any x, y ∈ F an element denoted by x ⊕ y ∈ F, and
multiplication, which assigns to any x, y ∈ F an element denoted by x ⊙ y ∈ F,
be given. (F, ⊕, ⊙) is called a field if the following properties hold true.
1. (Commutative) ∀x, y ∈ F, x ⊕ y = y ⊕ x and x ⊙ y = y ⊙ x;
2. (Associative) ∀x, y, z ∈ F, (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) and (x ⊙ y) ⊙ z = x ⊙ (y ⊙ z);
3. (Distributive) ∀x, y, z ∈ F, x ⊙ (y ⊕ z) = (x ⊙ y) ⊕ (x ⊙ z);
4. (Existence of null elements) ∃f_0, f_1 ∈ F such that ∀x ∈ F, f_0 ⊕ x = x and f_1 ⊙ x = x;
5. (Existence of a negative element) ∀x ∈ F, ∃y ∈ F such that x ⊕ y = f_0;
6. (Existence of an inverse element) ∀x ∈ F\{f_0}, ∃y ∈ F such that x ⊙ y = f_1.
Elements of a field are called scalars.
Elements of a elds are called scalars.
Example 130 The set R of real numbers with the standard addition and multiplication is a eld.
From the above properties all the rules of elementary algebra can be deduced.
1
The set C of complex numbers is a eld - see Appendix 10.
From the above properties, it follows that f
0
and f
1
are unique.
2
We denote f
0
and f
1
by 0 and
1, respectively.
Denition 131 Let (F, , ) be a eld and V be a nonempty set with the operations of
addition which assigns to any u, v V an element denoted by u +v V, and
scalar multiplication which assigns to any u V and any F an element u V .
Then (V, +, ) is called a vector space on the eld (F, , ) and its elements are called vectors if
the following properties are satised.
A1. (Associative) u, v, w V, (u +v) +w = u + (v +w);
A2. (existence of zero element) there exists an element 0 in V such that u V , u + 0 = u;
A3. (existence of inverse element) u V v V such that u +v = 0;
A4. (Commutative) u, v V , u +v = v +u;
M1. (distributive) F and u, v V, (u +v) = u + v;
M2. (distributive) , F and u V, ( ) u = u + u;
M3. , F and u V, ( ) u = ( u);
M4. u V, 1 u = u.
1
See, for example, Apostol (1967), Section 13.2, page 17.
2
The proof of that result is very similar to the proof of Proposition 132.1 and 2.
43
44 CHAPTER 4. VECTOR SPACES
In the remainder of these notes, if no confusion arises, for ease of notation, we will denote a eld
simply by F and a vector space by V.Moreover, we will write + in the place of , and we will omit
and , i.e., we will write xy instead of x y and v instead of v.
Proposition 132 If V is a vector space, then (as a consequence of the first four properties)
1. The zero vector is unique and it is denoted by 0.
2. ∀u ∈ V, the inverse element of u is unique and it is denoted by −u.
3. (Cancellation law) ∀u, v, w ∈ V,
u + w = v + w ⇒ u = v.
Proof. 1. Assume that there exist 0_1, 0_2 ∈ V which are zero vectors. Then from (A2),
0_1 + 0_2 = 0_1 and 0_2 + 0_1 = 0_2.
From (A4),
0_1 + 0_2 = 0_2 + 0_1,
and therefore 0_1 = 0_2.
2. Given u ∈ V, assume there exist v_1, v_2 ∈ V such that
u + v_1 = 0 and u + v_2 = 0.
Then
v_2 = v_2 + 0 = v_2 + (u + v_1) = (v_2 + u) + v_1 = (u + v_2) + v_1 = 0 + v_1 = v_1.
3.
u + w = v + w ⇒(1) u + w + (−w) = v + w + (−w) ⇒(2) u + 0 = v + 0 ⇒(3) u = v,
where (1) follows from the definition of the operation, (2) from the definition of −w and (3) from the definition of 0.
Proposition 133 If V is a vector space over a field F, then
1. For 0 ∈ F and u ∈ V, 0u = 0, and therefore 0 ∈ V.
2. For 0 ∈ V and α ∈ F, α0 = 0.
3. If α ∈ F, u ∈ V and αu = 0, then either α = 0 or u = 0 or both.
4. ∀α ∈ F and ∀u ∈ V, (−α)u = −(αu) = α(−u) := −αu.
Proof. 1. From (M2),
0u + 0u = (0 + 0)u = 0u.
Then, adding −(0u) to both sides,
0u + 0u + (−(0u)) = 0u + (−(0u))
and, using (A3),
0u + 0 = 0
and, using (A2), we get the desired result.
2. From (A2),
0 + 0 = 0;
then, multiplying both sides by α and using (M1),
α0 = α(0 + 0) = α0 + α0;
and, adding −(α0) to both sides and using (A3),
α0 + (−(α0)) = α0 + α0 + (−(α0)),
and, using (A2), we get the desired result.
3. Assume that αu = 0 and α ≠ 0. Then
u = 1u = (α^{-1} α) u = α^{-1} (αu) = α^{-1} 0 = 0.
Taking the contrapositive of the above result, we get ⟨u ≠ 0⟩ ⇒ ⟨αu ≠ 0 or α = 0⟩. Therefore ⟨u ≠ 0 and αu = 0⟩ ⇒ ⟨α = 0⟩.
4. From u + (−u) = 0, we get α(u + (−u)) = α0 = 0, and then αu + α(−u) = 0, and therefore α(−u) = −(αu).
From α + (−α) = 0, we get (α + (−α))u = 0u = 0, and then αu + (−α)u = 0, and therefore −(αu) = (−α)u.
Remark 134 From Proposition 133.4 and (M4) in Definition 131, we have
(−1)u = 1(−u) := −u.
We also define subtraction as follows:
v − u := v + (−u).
4.2 Examples
Euclidean spaces.
The Euclidean space R^n with sum and scalar multiplication defined in Chapter 2 is a vector space over the field R.
Matrices.
For any m, n ∈ N\{0}, the set M_{m,n} of matrices with elements in a field F, with the operations of addition and scalar multiplication defined in Section 2.3, is a vector space on the field F and it is denoted by
M_F(m, n).
We also set
M(m, n) := M_R(m, n),
i.e., M(m, n) is the vector space on the field R.
Polynomials.
The set of all polynomials
a_0 + a_1 t + a_2 t² + ... + a_n t^n
with n ∈ N and a_0, a_1, a_2, ..., a_n ∈ R is a vector space on R with respect to the standard sum between polynomials and scalar multiplication.
Function space F(X).
Given a nonempty set X, the set of all functions f : X → R with the obvious sum and scalar multiplication is a vector space on R.
Sets which are not vector spaces.
(0, +∞) and [0, +∞) are not vector spaces.
For any n ∈ N\{0}, the set of all polynomials of degree exactly n is not a vector space on R.
4.3 Vector subspaces
In what follows, if no ambiguity may arise, we will say vector space instead of vector space on a field.
Definition 135 Let W be a subset of a vector space V. W is called a vector subspace of V if W is a vector space with respect to the operations of vector addition and scalar multiplication defined on V.
Proposition 136 Let W be a subset of a vector space V. The following three statements are equivalent.
1. W is a vector subspace of V.
2. a. W ≠ ∅, i.e.,³ 0 ∈ W;
b. ∀u, v ∈ W, u + v ∈ W;
c. ∀u ∈ W, ∀α ∈ F, αu ∈ W.
3. a. W ≠ ∅, i.e., 0 ∈ W;
b. ∀u, v ∈ W, ∀α, β ∈ F, αu + βv ∈ W.
³ The i.e. follows from Proposition 133.1.
Proof. 2. and 3. are clearly equivalent.
If W is a vector subspace, clearly 2. holds. To show the opposite implication we have to check only (A2) and (A3), simply by definition of restriction of a function.
(A2): Since W ≠ ∅, we can take u ∈ W. Then, from c., taking 0 ∈ R, we have 0u = 0 ∈ W.
(A3): Taking u ∈ W, from c., (−1)u ∈ W, but then (−1)u = −u ∈ W.
Example 137 1. Given an arbitrary vector space V, {0} and V are vector subspaces of V.
2. Given R³,
W := {(x_1, x_2, x_3) ∈ R³ : x_3 = 0}
is a vector subspace of R³.
3. Given the space V of polynomials, the set W of all polynomials of degree ≤ n is a vector subspace of V.
4. The set of all bounded or continuous or differentiable or integrable functions f : X → R is a vector subspace of F(X).
5. If V and W are vector spaces, then V ∩ W is a vector subspace of V and W.
6. [0, +∞) is not a vector subspace of R.
7. Let V = {0} × R ⊆ R² and W = R × {0} ⊆ R². Then V ∪ W is not a vector subspace of R².
4.4 Linear combinations
Notation convention. Unless otherwise stated, a Greek (or Latin) letter with a subscript denotes a scalar; a Latin letter with a superscript denotes a vector.
Definition 138 Let V be a vector space, m ∈ N\{0} and v^1, v^2, ..., v^m ∈ V. A linear combination of vectors v^1, v^2, ..., v^m via coefficients α_1, α_2, ..., α_m ∈ F is the vector
∑_{i=1}^m α_i v^i.
The set of all such combinations
{v ∈ V : ∃(α_i)_{i=1}^m ∈ F^m such that v = ∑_{i=1}^m α_i v^i}
is called the span of {v^1, v^2, ..., v^m} and it is denoted by span({v^1, v^2, ..., v^m}) or simply span{v^1, v^2, ..., v^m}.
Definition 139 Let V be a vector space and S ⊆ V. span(S) is the set of all linear combinations of a finite number of vectors in S.
Proposition 140 Let V be a vector space and S ≠ ∅, S ⊆ V.
1. a. S ⊆ span(S) and b. span(S) is a vector subspace of V.
2. If W is a subspace of V and S ⊆ W, then span(S) ⊆ W.
Proof. 1a. Given v ∈ S, 1v = v ∈ span(S). 1b. Since S ≠ ∅, then span(S) ≠ ∅. Given α, β ∈ F and v, w ∈ span(S), then ∃α_1, ..., α_n ∈ F, v^1, ..., v^n ∈ S and ∃β_1, ..., β_m ∈ F, w^1, ..., w^m ∈ S, such that v = ∑_{i=1}^n α_i v^i and w = ∑_{j=1}^m β_j w^j. Then
αv + βw = ∑_{i=1}^n (αα_i) v^i + ∑_{j=1}^m (ββ_j) w^j ∈ span(S).
2. Take v ∈ span(S). Then ∃α_1, ..., α_n ∈ F, v^1, ..., v^n ∈ S ⊆ W such that v = ∑_{i=1}^n α_i v^i ∈ W, as desired.
Definition 141 Let V be a vector space and v^1, v^2, ..., v^m ∈ V. If V = span{v^1, v^2, ..., v^m}, we say that V is the vector space generated or spanned by the vectors v^1, v^2, ..., v^m.
Example 142 1. R³ = span({(1, 0, 0), (0, 1, 0), (0, 0, 1)}).
2. span({t^n}_{n∈N}) is equal to the vector space of all polynomials.
4.5 Row and column space of a matrix
Definition 143 Given A ∈ M(m, n),
rowspan A := span(R_1(A), ..., R_i(A), ..., R_m(A))
is called the row space of A or rowspan of A.
The column space of A, or col span A, is
col span A := span(C_1(A), ..., C_j(A), ..., C_n(A)).
Remark 144 Given A ∈ M(m, n),
col span A = rowspan A^T.
Remark 145 Linear combinations of columns and rows of a matrix.
Let A ∈ M(m, n), x ∈ R^n and y ∈ R^m. Then, ∀j ∈ {1, ..., n}, C_j(A) ∈ R^m and, ∀i ∈ {1, ..., m}, R_i(A) ∈ R^n. Then,
Ax = [C_1(A), ..., C_j(A), ..., C_n(A)] (x_1, ..., x_j, ..., x_n) = ∑_{j=1}^n x_j C_j(A)
and
Ax is a linear combination of the columns of A via the components of the vector x.
Moreover,
yA = [y_1, ..., y_i, ..., y_m] A = ∑_{i=1}^m y_i R_i(A)
and
yA is a linear combination of the rows of A via the components of the vector y.
As a consequence of the above observations, we have what follows.
1.
rowspan A = {w ∈ R^n : ∃y ∈ R^m such that w = yA}.
2.
col span A = {z ∈ R^m : ∃x ∈ R^n such that z = Ax}.
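A numerical restatement of this Remark (a sketch assuming numpy; the data are arbitrary):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, -1.0, 2.0])
y = np.array([2.0, 3.0])

# Ax combines the columns of A, yA combines the rows of A.
assert np.allclose(A @ x, sum(x[j] * A[:, j] for j in range(3)))
assert np.allclose(y @ A, sum(y[i] * A[i, :] for i in range(2)))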
Proposition 146 Given A, B ∈ M(m, n),
1. if A is row equivalent to B, then rowspan A = rowspan B;
2. if A is column equivalent to B, then col span A = col span B.
Proof. 1. B is obtained from A via elementary row operations. Therefore, ∀i ∈ {1, ..., m}, either
i. R_i(B) = R_i(A), or
ii. R_i(B) is a linear combination of rows of A.
Therefore, rowspan B ⊆ rowspan A. Since A is obtained from B via elementary row operations, rowspan A ⊆ rowspan B.
2. If A is column equivalent to B, then A^T is row equivalent to B^T and therefore, from 1. above, rowspan A^T = rowspan B^T. Then the result follows from Remark 144.
Remark 147 Let A ∈ M(m, n) be given and assume that
b := (b_j)_{j=1}^n = ∑_{i=1}^m c_i R_i(A),
i.e., b is a linear combination of the rows of A. Then,
∀j ∈ {1, ..., n}, b_j = ∑_{i=1}^m c_i R_i^j(A),
where, ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, R_i^j(A) is the j-th component of the i-th row R_i(A) of A.
Lemma 148 Assume that A, B ∈ M(m, n) are in echelon form with pivots
a_{1j_1}, ..., a_{ij_i}, ..., a_{rj_r} and b_{1k_1}, ..., b_{ik_i}, ..., b_{sk_s},
respectively, and⁴ r, s ≤ min{m, n}. Then
⟨rowspan A = rowspan B⟩ ⇒ ⟨s = r and, for i ∈ {1, ..., s}, j_i = k_i⟩.
⁴ See Remark 30.
Proof. Preliminary remark 1. If A = 0, then A = B and s = r = 0.
Preliminary remark 2. Assume that A, B ≠ 0 and then s, r ≥ 1. We want to verify that j_1 = k_1. Suppose j_1 < k_1. Then, by definition of echelon matrix, C_{j_1}(B) = 0, otherwise you would contradict Property 2 of the Definition 28 of echelon matrix. Then, from the assumption that rowspan A = rowspan B, we have that R_1(A) is a linear combination of the rows of B via some coefficients c_1, ..., c_m, and from Remark 147 and the fact that C_{j_1}(B) = 0, we have that a_{1j_1} = c_1 · 0 + ... + c_m · 0 = 0, contradicting the fact that a_{1j_1} is a pivot for A. Therefore, j_1 ≥ k_1. A perfectly symmetric argument shows that j_1 ≤ k_1.
We can now prove the result by induction on the number m of rows.
Step 1. m = 1.
It is basically the proof of Preliminary Remark 2.
Step 2.
Given A, B ∈ M(m, n), define A', B' ∈ M(m−1, n) as the matrices obtained erasing the first row in matrices A and B respectively. From Remark 33, A' and B' are still in echelon form. If we show that rowspan A' = rowspan B', from the induction assumption, and using Preliminary Remark 2, we get the desired result.
Let R = (a_1, ..., a_n) be any row of A'. Since R ∈ rowspan B, ∃(d_i)_{i=1}^m such that
R = ∑_{i=1}^m d_i R_i(B).
Since A is in echelon form and we erased its first row, we have that if i ≤ j_1 = k_1, then a_i = 0, otherwise you would contradict the definition of j_1. Since B is in echelon form, each entry in its k_1-th column is zero, except b_{1k_1}, which is different from zero. Then,
a_{k_1} = 0 = ∑_{i=1}^m d_i b_{ik_1} = d_1 b_{1k_1},
and therefore d_1 = 0, i.e., R = ∑_{i=2}^m d_i R_i(B), or R ∈ rowspan B', as desired. A symmetric argument shows the other inclusion.
Proposition 149 Assume that A, B ∈ M(m, n) are in row canonical form. Then,
⟨rowspan A = rowspan B⟩ ⇔ ⟨A = B⟩.
Proof. [⇐] Obvious.
[⇒] From Lemma 148, the number of pivots in A and B is the same. Therefore, A and B have the same number s of nonzero rows, which in fact are the first s rows. Take i ∈ {1, ..., s}. Since rowspan A = rowspan B, there exists (c_h)_{h=1}^s such that
R_i(A) = ∑_{h=1}^s c_h R_h(B).  (4.1)
We want then to show that c_i = 1 and, ∀l ∈ {1, ..., s}\{i}, c_l = 0.
Let a_{ij_i} be the pivot of R_i(A), i.e., a_{ij_i} is the nonzero j_i-th component of R_i(A). Then, from Remark 147,
a_{ij_i} = ∑_{h=1}^s c_h R_h^{j_i}(B) = ∑_{h=1}^s c_h b_{hj_i}.  (4.2)
From Lemma 148, for i ∈ {1, ..., s}, j_i = k_i, and therefore b_{ij_i} is a pivot entry for B, and since B is in row canonical form, b_{ij_i} is the only nonzero element in the j_i-th column of B. Therefore, from (4.2),
a_{ij_i} = c_i b_{ij_i}.
Since A and B are in row canonical form, a_{ij_i} = b_{ij_i} = 1 and therefore
c_i = 1.
Now take l ∈ {1, ..., s}\{i} and consider the pivot element b_{lj_l} in R_l(B). From (4.1) and Remark 147,
a_{ij_l} = ∑_{h=1}^s c_h b_{hj_l} = c_l,  (4.3)
where the last equality follows from the fact that B is in row canonical form and therefore b_{lj_l} is the only nonzero element in the j_l-th column of B; in fact, b_{lj_l} = 1. From Lemma 148, since b_{lj_l} is a pivot element for B, a_{lj_l} is a pivot element for A. Since A is in row canonical form, a_{lj_l} is the only nonzero element in column j_l of A. Therefore, since l ≠ i, a_{ij_l} = 0, and from (4.3), the desired result,
∀l ∈ {1, ..., s}\{i}, c_l = 0,
does follow.
Proposition 150 For every A ∈ M(m, n), there exists a unique B ∈ M(m, n) which is in row canonical form and row equivalent to A.
Proof. The existence of at least one matrix with the desired properties is the content of Proposition 38. Suppose that there exist B_1 and B_2 with those properties. Then, from Proposition 146, we get
rowspan A = rowspan B_1 = rowspan B_2.
From Proposition 149,
B_1 = B_2.
As an immediate consequence of the above Proposition 150 and of Propositions 38, 100, 122 and 124, we have the following results.
Corollary 151 1. Any matrix A ∈ M(m, n) is row equivalent to a unique matrix in row canonical form, called the row canonical form of A.
2. Let A ∈ M(m, n) be given. Then, there exist a unique matrix B ∈ M(m, n) in row canonical form, k ∈ N and elementary matrices E_1, ..., E_k such that
B = E_1 E_2 ... E_k A.
3. For any matrix A ∈ M(m, n) there exists a unique number r ∈ {0, 1, ..., min{m, n}} such that A is equivalent to the block matrix of the form
[ I_r  0 ]
[ 0    0 ].
4. For any A ∈ M(m, n), there exist invertible matrices P ∈ M(m, m) and Q ∈ M(n, n) and a unique number r ∈ {0, 1, ..., min{m, n}} such that
PAQ =
[ I_r  0 ]
[ 0    0 ],
and therefore,
5. For any A ∈ M(m, n), A is equivalent to a unique matrix of the form
[ I_r  0 ]
[ 0    0 ].
4.6 Linear dependence and independence
Definition 152 Let V be a vector space on a field F, m ∈ N\{0} and v^1, v^2, ..., v^m ∈ V. The set S = {v^1, v^2, ..., v^m} is a set of linearly dependent vectors if
either S = {0},
or ∃k ∈ {1, ..., m} and (m−1) coefficients α_j ∈ F, with j ∈ {1, ..., m}\{k}, such that
v^k = ∑_{j∈{1,...,m}\{k}} α_j v^j,
or, shortly,
v^k = ∑_{j≠k} α_j v^j,
i.e., there exists a vector equal to a linear combination of the other vectors.
Geometrical example in R².
Definition 153 Let V be a vector space and S ⊆ V. The set S is linearly dependent if either S = {0} or there exists a subset of S of finite cardinality which is linearly dependent.
Proposition 154 Let V be a vector space on a field F and v^1, v^2, ..., v^m ∈ V. The set S = {v^1, v^2, ..., v^m} ⊆ V is a set of linearly dependent vectors if and only if
∃(α_1, ..., α_i, ..., α_m) ∈ F^m \{0} such that ∑_{i=1}^m α_i v^i = 0,  (4.4)
i.e., there exists a linear combination of the vectors equal to the null vector and with some nonzero coefficient.
Proof. [⇒]
If #S = 1, i.e., S = {0}, any α ∈ R\{0} is such that α0 = 0. Assume then that #S > 1, so that ∃k and (α_j)_{j≠k} with v^k = ∑_{j≠k} α_j v^j. Take
β_i = α_i if i ≠ k, and β_i = −1 if i = k;
then ∑_{i=1}^m β_i v^i = 0 with β_k ≠ 0.
[⇐]
If S = {v} and α ∈ R\{0} is such that αv = 0, then from Proposition 58.3, v = 0. Assume then that #S > 1. Without loss of generality take α_1 ≠ 0. Then
α_1 v^1 + ∑_{i≠1} α_i v^i = 0
and
v^1 = ∑_{i≠1} (−α_i/α_1) v^i.
Proposition 155 Let m ≥ 2 and v^1, ..., v^m be nonzero linearly dependent vectors. Then, one of the vectors is a linear combination of the preceding vectors, i.e., ∃k > 1 and (β_i)_{i=1}^{k−1} such that v^k = ∑_{i=1}^{k−1} β_i v^i.
Proof. Since {v^1, ..., v^m} are linearly dependent, ∃(α_i)_{i=1}^m ∈ R^m \{0} such that ∑_{i=1}^m α_i v^i = 0. Let k be the largest i such that α_i ≠ 0, i.e.,
∃k ∈ {1, ..., m} such that α_k ≠ 0 and ∀i ∈ {k+1, ..., m}, α_i = 0.  (4.5)
Consider the case k = 1. Then we would have α_1 ≠ 0 and, ∀i > 1, α_i = 0, and therefore 0 = ∑_{i=1}^m α_i v^i = α_1 v^1, contradicting the assumption that v^1, ..., v^m are nonzero vectors. Then, we must have k > 1, and from (4.5) we have
0 = ∑_{i=1}^m α_i v^i = ∑_{i=1}^k α_i v^i
and
α_k v^k = −∑_{i=1}^{k−1} α_i v^i,
or, as desired,
v^k = ∑_{i=1}^{k−1} (−α_i/α_k) v^i.
It is then enough to choose β_i = −α_i/α_k for any i ∈ {1, ..., k−1}.
Example 156 Take the vectors x^1 = (1, 2), x^2 = (−1, −2) and x^3 = (0, 4). S := {x^1, x^2, x^3} is a set of linearly dependent vectors: x^1 = (−1) x^2 + 0 x^3. Observe that there are no λ_1, λ_2 ∈ R such that x^3 = λ_1 x^1 + λ_2 x^2.
Definition 157 The set of vectors S = {v^1, ..., v^m} ⊆ V is called a set of linearly independent vectors if it is not linearly dependent, i.e., if (4.4) fails, i.e., if
∀(α_1, ..., α_i, ..., α_m) ∈ F^m \{0} it is the case that ∑_{i=1}^m α_i v^i ≠ 0,
or
∑_{i=1}^m α_i v^i = 0 ⇒ (α_1, ..., α_i, ..., α_m) = 0,
or the only linear combination of the vectors which is equal to the null vector has each coefficient equal to zero.
52 CHAPTER 4. VECTOR SPACES
Geometrical example in R
2
.
Denition 158 Let V be a vector space and S V.The set S is linearly independent if every
nite subset of S is linearly independent.
Remark 159 From Remark 145, we have what follows:
hAx = 0 x = 0i hthe column vectors of A are linearly independenti (4.6)
hyA = 0 y = 0i hthe row vectors of A are linearly independenti (4.7)
Remark 160 ∅ is a set of linearly independent vectors: see Definition 157.
Example 161 Consider the vectors x^1 = (1, 0, 0, 0) ∈ R⁴ and x^2 = (0, 1, 0, 0). Observe that α_1 x^1 + α_2 x^2 = 0 means (α_1, α_2, 0, 0) = (0, 0, 0, 0).
Exercise 162 Say if the following set is a set of linearly dependent or independent vectors:
S = {x^1, x^2, x^3} with x^1 = (3, 2, 1), x^2 = (4, 1, 3), x^3 = (3, 3, 6).
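One way to settle such exercises numerically (a sketch assuming numpy): three vectors in R³ are linearly independent if and only if the matrix having them as rows has rank 3.

import numpy as np

S = np.array([[3.0, 2.0, 1.0],
              [4.0, 1.0, 3.0],
              [3.0, 3.0, 6.0]])
print(np.linalg.matrix_rank(S))   # 3 -> independent; less than 3 -> dependent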
Proposition 163 Let V be a vector space and v^1, v^2, ..., v^m ∈ V. If S = {v^1, ..., v^m} is a set of linearly dependent vectors and v^{m+1}, ..., v^{m+k} ∈ V are arbitrary vectors, then
S' := S ∪ {v^{m+1}, ..., v^{m+k}} = {v^1, ..., v^m, v^{m+1}, ..., v^{m+k}}
is a set of linearly dependent vectors.
Proof. From the assumptions, ∃i ∈ {1, ..., m} and (α_j)_{j≠i} such that
v^i = ∑_{j≠i} α_j v^j.
But then
v^i = ∑_{j≠i} α_j v^j + 0 v^{m+1} + ... + 0 v^{m+k}.
Proposition 164 If S = {v^1, ..., v^m} ⊆ V is a set of linearly independent vectors, then every S' ⊆ S is a set of linearly independent vectors.
Proof. Suppose otherwise; but then you contradict the previous proposition.
Remark 165 Consider vectors in R^n.
1. Adding components to linearly dependent vectors gives rise to linearly dependent or independent vectors;
2. Eliminating components from linearly independent vectors gives rise to linearly dependent or independent vectors;
3. Adding components to linearly independent vectors gives rise to linearly independent vectors;
4. Eliminating components from linearly dependent vectors gives rise to linearly dependent vectors.
To verify 1. and 2. above, consider the two vectors (1, 1, 0) and (2, 2, 1): the vectors (1, 1) and (2, 2) are linearly dependent, while (1, 1, 0) and (2, 2, 1) are linearly independent.
To verify 4., if you have a set of linearly dependent vectors, it is possible to express one vector as a linear combination of the others and then eliminate components leaving the equality still true.
The proof of 3. is contained in the following Proposition.
Proposition 166 If S_x = {x^1, ..., x^m} ⊆ R^n is a set of linearly independent vectors and S_y = {y^1, ..., y^m} ⊆ R^k is a set of vectors, then S = {(x^1, y^1), ..., (x^m, y^m)} ⊆ R^{n+k} is a set of linearly independent vectors.
Proof. By assumption,
∑_{i=1}^m α_i x^i = 0 ⇒ (α_1, ..., α_i, ..., α_m) = 0.
S is a set of linearly independent vectors if
∑_{i=1}^m α_i (x^i, y^i) = 0 ⇒ (α_1, ..., α_i, ..., α_m) = 0.
Since
∑_{i=1}^m α_i (x^i, y^i) = 0 ⇒ ∑_{i=1}^m α_i x^i = 0 and ∑_{i=1}^m α_i y^i = 0,
the desired result follows.
Corollary 167 If S_x = {x^1, ..., x^m} ⊆ R^n is a set of linearly independent vectors and S_y = {y^1, ..., y^m} ⊆ R^k and S_z = {z^1, ..., z^m} ⊆ R^l are sets of vectors, then S = {(z^1, x^1, y^1), ..., (z^m, x^m, y^m)} ⊆ R^{l+n+k} is a set of linearly independent vectors.
Example 168 1. The set of vectors {v^1, ..., v^m} is a linearly dependent set if ∃k ∈ {1, ..., m} such that v^k = 0:
1 v^k + ∑_{i≠k} 0 v^i = 0.
2. The set of vectors {v^1, ..., v^m} is a linearly dependent set if ∃k, k' ∈ {1, ..., m} and α ∈ R such that v^{k'} = α v^k:
v^{k'} − α v^k + ∑_{i≠k,k'} 0 v^i = 0.
3. Two vectors are linearly dependent if and only if one is a multiple of the other.
Proposition 169 The nonzero rows of a matrix A in echelon form are linearly independent.
Proof. We will show that each row of A, starting from the first one, is not a linear combination of the subsequent rows. Then, as a consequence of Proposition 155, the desired result will follow. Since A is in echelon form, the first row has a pivot below which all the elements are zero. Then that row cannot be a linear combination of the following rows. A similar argument applies to the other rows.
4.7 Basis and dimension
Definition 170 A set S in a vector space V on a field F is a basis of V if
1. S is a linearly independent set;
2. span(S) = V.
Proposition 171 A set S = {u^1, u^2, ..., u^n} ⊆ V is a basis of V on a field F if and only if ∀v ∈ V there exists a unique (α_i)_{i=1}^n ∈ F^n such that v = ∑_{i=1}^n α_i u^i.
Proof. [⇒] Suppose there exist (α_i)_{i=1}^n, (β_i)_{i=1}^n ∈ R^n such that v = ∑_{i=1}^n α_i u^i = ∑_{i=1}^n β_i u^i. Then
0 = ∑_{i=1}^n α_i u^i − ∑_{i=1}^n β_i u^i = ∑_{i=1}^n (α_i − β_i) u^i.
Since u^1, u^2, ..., u^n are linearly independent,
∀i ∈ {1, ..., n}, α_i − β_i = 0,
as desired.
[⇐]
Clearly V = span(S); we are left with showing that u^1, u^2, ..., u^n are linearly independent. Consider ∑_{i=1}^n α_i u^i = 0. Moreover, ∑_{i=1}^n 0 u^i = 0. But since there exists a unique (α_i)_{i=1}^n ∈ R^n such that 0 = ∑_{i=1}^n α_i u^i, it must be the case that, ∀i ∈ {1, ..., n}, α_i = 0.
Lemma 172 Suppose that, given a vector space V, span{v^1, ..., v^m} = V.
1. If w ∈ V, then {w, v^1, ..., v^m} is linearly dependent and span{w, v^1, ..., v^m} = V;
2. If v^i is a linear combination of (v^j)_{j=1}^{i−1}, then span{v^1, ..., v^{i−1}, v^{i+1}, ..., v^m} = V.
Proof. Obvious.
Lemma 173 (Replacement Lemma) Given a vector space V, if
1. span{v^1, ..., v^n} = V,
2. {w^1, ..., w^m} ⊆ V is linearly independent,
then
1. n ≥ m,
2. a. If n = m, then span{w^1, ..., w^m} = V;
b. if n > m, there exists {v^{i_1}, ..., v^{i_{n−m}}} ⊆ {v^1, ..., v^n} such that span{w^1, ..., w^m, v^{i_1}, ..., v^{i_{n−m}}} = V.
Proof. Observe preliminarily that since {w^1, ..., w^m} is linearly independent, ∀j ∈ {1, ..., m}, w^j ≠ 0.
Define I_1 as the subset of {1, ..., n} such that ∀i ∈ I_1, v^i ≠ 0. Then, clearly, span({v^i}_{i∈I_1}) = span{v^1, ..., v^n} = V. We are going to show that m ≤ #I_1, and since #I_1 ≤ n, our result will imply conclusion 1. Moreover, we are going to show that there exists {v^{i_1}, ..., v^{i_{#I_1−m}}} ⊆ {v^i}_{i∈I_1} ⊆ {v^1, ..., v^n} such that span{w^1, ..., w^m, v^{i_1}, ..., v^{i_{#I_1−m}}} = V, and that result will imply conclusion 2.
Using the above observation and to make notation easier, we will assume that I_1 = {1, ..., n}, i.e., ∀i ∈ {1, ..., n}, v^i ≠ 0.
Now consider the case n = 1. {w^1, ..., w^m} ⊆ V = span{v^1} implies that there exists (α_j)_{j=1}^m ∈ R^m such that, ∀j, α_j ≠ 0 and w^j = α_j v^1; then it has to be m = 1 (and conclusion 1 holds) and, since w^1 = α_1 v^1, span{w^1} = V (and conclusion 2 holds).
In conclusion, we can assume n ≥ 2.
First of all, observe that from Lemma 172.1, {w^1, v^1, ..., v^n} is linearly dependent and span{w^1, v^1, ..., v^n} = V. By Lemma 155, there exists k_1 ∈ {1, ..., n} such that v^{k_1} is a linear combination of the preceding vectors. Then, from Lemma 172.2, we have
span{w^1, {v^i}_{i≠k_1}} = V.
Then again, from Lemma 172.1, {w^1, w^2, {v^i}_{i≠k_1}} is linearly dependent and span{w^1, w^2, {v^i}_{i≠k_1}} = V. By Lemma 155, there exists k_2 ∈ {1, ..., n}\{k_1} such that v^{k_2} or w^2 is a linear combination of the preceding vectors. That vector cannot be w^2 because of assumption 2. Therefore,
span{w^1, w^2, {v^i}_{i≠k_1,k_2}} = V.
We can now distinguish three cases: m < n, m = n and m > n.
Now if m < n, after m steps of the above procedure we get
span{w^1, ..., w^m, {v^i}_{i≠k_1,k_2,...,k_m}} = V,
which shows 2.b. If m = n, we have
span{w^1, ..., w^m} = V,
which shows 2.a.
Let's now show that it cannot be m > n. Suppose that is the case. Then, after n of the above steps, we get span{w^1, ..., w^n} = V and therefore w^{n+1} is a linear combination of {w^1, ..., w^n}, contradicting assumption 2.
Proposition 174 Assume that S = {u^1, u^2, ..., u^n} and T = {v^1, v^2, ..., v^m} are bases of V. Then n = m.
Proof. By definition of basis we have that
span{u^1, u^2, ..., u^n} = V and {v^1, v^2, ..., v^m} are linearly independent.
Then, from Lemma 173, m ≤ n. Similarly,
span{v^1, v^2, ..., v^m} = V and {u^1, u^2, ..., u^n} are linearly independent,
and from Lemma 173, n ≤ m.
The above Proposition allows us to give the following Definition.
Definition 175 A vector space V has dimension n if there exists a basis of V whose cardinality is n. In that case, we say that V has finite dimension (equal to n) and we write dim V = n. If a vector space does not have finite dimension, it is said to be of infinite dimension.
Definition 176 The vector space {0} has dimension 0.
Example 177 1. A basis of R^n is {e^1, ..., e^i, ..., e^n}, where e^i is defined in Definition 55. That basis is called the canonical basis.
2. Consider the vector space P_n(t) of polynomials of degree ≤ n. The set {t⁰, t¹, ..., t^n} of polynomials is a basis of P_n(t) and therefore dim P_n(t) = n + 1.
Proposition 178 Let V be a vector space of dimension n.
1. m > n vectors in V are linearly dependent;
2. If S = {u^1, ..., u^n} is a linearly independent set, then it is a basis of V;
3. If span{u^1, ..., u^n} = V, then {u^1, ..., u^n} is a basis of V.
Proof. Let {w^1, ..., w^n} be a basis of V. Then span{w^1, ..., w^n} = V.
1. We want to show that arbitrary vectors {v^1, ..., v^m} in V, with m > n, are linearly dependent. Suppose otherwise; then, by Lemma 173, we would have m ≤ n, a contradiction.
2. It is the content of Lemma 173.2.a.
3. We have to show that {u^1, ..., u^n} are linearly independent. Suppose otherwise. Then there exists k ∈ {1, ..., n} such that span{{u^i}_{i≠k}} = V, but since {w^1, ..., w^n} is linearly independent, from Lemma 173, we have n ≤ n − 1, a contradiction.
Remark 179 The above Proposition 178 shows that, in the case of finite dimensional vector spaces, one of the two conditions defining a basis is sufficient to obtain a basis.
Proposition 180 Let V be a vector space of dimension n and {w^1, ..., w^m} ⊆ V be a linearly independent set, with⁵ m ≤ n. If m < n, then there exists a set {u^1, ..., u^{n−m}} such that {w^1, ..., w^m, u^1, ..., u^{n−m}} is a basis of V.
⁵ The inequality m ≤ n follows from Proposition 178.1.
Proof. Take a basis {v^1, ..., v^n} of V. Then, from Conclusion 2 in Lemma 173,
span{w^1, ..., w^m, v^{i_1}, ..., v^{i_{n−m}}} = V.
Then, from Proposition 178.3, we get the desired result.
Proposition 181 Let W be a subspace of an n-dimensional vector space V. Then
1. dim W ≤ n;
2. If dim W = n, then W = V.
Proof. 1. From Proposition 178.1, m > n vectors in V are linearly dependent. Since a basis of W is a set of linearly independent vectors, then dim W ≤ n.
2. If {w^1, ..., w^n} is a basis of W, then span{w^1, ..., w^n} = W. Moreover, those vectors are n linearly independent vectors in V. Therefore, from Proposition 178.2, span{w^1, ..., w^n} = V.
Remark 182 As a trivial consequence of the above Proposition, ⟨∀v ∈ V, v ∈ span{u^1, ..., u^r}⟩ ⇒ dim V ≤ r.
Example 183 Let W be a subspace of R³, whose dimension is 3. Then, from the previous Proposition, dim W ∈ {0, 1, 2, 3}. In fact,
1. if dim W = 0, then W = {0}, i.e., a point,
2. if dim W = 1, then W is a straight line through the origin,
3. if dim W = 2, then W is a plane through the origin,
4. if dim W = 3, then W = R³.
Definition 184 A maximal linearly independent subset S' of a set of vectors S ⊆ V is a subset of S such that
1. S' is linearly independent, and
2. S' ⊊ S'' ⊆ S ⇒ S'' is linearly dependent, i.e., if S'' is another subset of S whose cardinality is bigger than the cardinality of S', then S'' is a linearly dependent set.
Lemma 185 Given a vector space V, if
1. S = {v^1, ..., v^k} ⊆ V is linearly independent, and
2. S' = S ∪ {v^{k+1}} is linearly dependent,
then
v^{k+1} is a linear combination of the vectors in S.
Proof. Since S' is a linearly dependent set,
∃i ∈ {1, ..., k+1} and (α_j)_{j∈{1,...,k+1}\{i}} such that v^i = ∑_{j∈{1,...,k+1}\{i}} α_j v^j.
If i = k+1, we are done. If i ≠ k+1, without loss of generality, take i = 1. Then
∃(α_j)_{j∈{1,...,k+1}\{1}} such that v^1 = ∑_{j=2}^{k+1} α_j v^j.
If α_{k+1} = 0, we would have v^1 − ∑_{j=2}^k α_j v^j = 0, contradicting Assumption 1. Then
v^{k+1} = (1/α_{k+1}) (v^1 − ∑_{j=2}^k α_j v^j).
Remark 186 Let V be a vector space and S ⊆ T ⊆ V. Then,
1. span S ⊆ span T;
2. span(span S) = span S.
Proposition 187 Assume that S ⊆ V is a finite set of vectors, and that span S = V. Then
1. Any maximal linearly independent subset of S is a basis of V;
2. If one deletes from S each vector which is a linear combination of preceding vectors in S, then the remaining set is a basis of V.
Proof. 1.
Suppose {v^1, ..., v^n} is a maximal linearly independent subset of S, and suppose that w ∈ S. By assumption, {v^1, ..., v^n, w} is linearly dependent and, from Lemma 185, w is a linear combination of the elements in {v^1, ..., v^n}. Therefore, w ∈ span{v^1, ..., v^n}. Then, S ⊆ span{v^1, ..., v^n}, and
V =(1) span S ⊆(2) span{v^1, ..., v^n} ⊆ V,
where (1) is true by assumption and (2) follows from Remark 186. Therefore span{v^1, ..., v^n} = V, and from the definition of basis, the result follows.
2.
The remaining vectors are a maximal linearly independent subset of S and therefore the desired result follows from part 1 above.
Definition 188 The row rank of A ∈ M(m, n) is the maximum number of linearly independent rows of A (i.e., the cardinality of each maximal linearly independent subset of the set of the row vectors of A).
Definition 189 The column rank of A ∈ M(m, n) is the maximum number of linearly independent columns of A.
Proposition 190 For any A ∈ M(m, n),
1. the row rank of A is equal to dim rowspan A;
2. the column rank of A is equal to dim col span A.
Proof. 1.
Define V := rowspan A, r := row rank A (the maximum number of linearly independent rows of A) and let S be a maximal linearly independent subset of the rows of A. Then, from Proposition 187.1, S is a basis of V and therefore r = dim V, i.e., row rank A = dim rowspan A.
2.
dim col span A = dim rowspan A^T = max number of linearly independent rows of A^T = max number of linearly independent columns of A, where the first equality follows from Remark 144.
Proposition 191 For any A ∈ M(m, n), the row rank of A is equal to the column rank of A.
Proof. Let A be an arbitrary m × n matrix
[ a_{11}  ...  a_{1j}  ...  a_{1n} ]
[  ...                             ]
[ a_{i1}  ...  a_{ij}  ...  a_{in} ]
[  ...                             ]
[ a_{m1}  ...  a_{mj}  ...  a_{mn} ].
Suppose the row rank is r ≤ m and the following r vectors form a basis of the row space:
S^1 = [b_{11}, ..., b_{1j}, ..., b_{1n}],
...
S^k = [b_{k1}, ..., b_{kj}, ..., b_{kn}],
...
S^r = [b_{r1}, ..., b_{rj}, ..., b_{rn}].
Then, each row vector of A is a linear combination of the above vectors, i.e., we have
∀i ∈ {1, ..., m}, R_i = ∑_{k=1}^r λ_{ik} S^k,
or
∀i ∈ {1, ..., m}, [a_{i1}, ..., a_{ij}, ..., a_{in}] = ∑_{k=1}^r λ_{ik} [b_{k1}, ..., b_{kj}, ..., b_{kn}],
and, setting the j-th components of each of the above vector equations equal to each other, we have
∀j ∈ {1, ..., n} and ∀i ∈ {1, ..., m}, a_{ij} = ∑_{k=1}^r λ_{ik} b_{kj},
and
∀j ∈ {1, ..., n}, (a_{1j}, ..., a_{ij}, ..., a_{mj}) = ∑_{k=1}^r b_{kj} (λ_{1k}, ..., λ_{ik}, ..., λ_{mk}),
i.e., each column of A is a linear combination of the r vectors
(λ_{11}, ..., λ_{i1}, ..., λ_{m1}), ..., (λ_{1k}, ..., λ_{ik}, ..., λ_{mk}), ..., (λ_{1r}, ..., λ_{ir}, ..., λ_{mr}).
Then, from Remark 182,
dim col span A ≤ r = row rank A,  (4.8)
i.e.,
col rank A ≤ row rank A.
From (4.8), which holds for an arbitrary matrix A, we also get
dim col span A^T ≤ row rank A^T.  (4.9)
Moreover, from Remark 144,
dim col span A^T = dim rowspan A := row rank A
and
row rank A^T := dim rowspan A^T = dim col span A.
Then, from the two above equalities and (4.9), we get
row rank A ≤ dim col span A,  (4.10)
and (4.8) and (4.10) give the desired result.
We can summarize Propositions 190 and 191 as follows.
Corollary 192 For every A ∈ M(m, n),
row rank A = dim rowspan A = col rank A = dim col span A.
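In numerical work this common value is what np.linalg.matrix_rank computes (a sketch assuming numpy; the matrix is made up so that its rank is visibly 2):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # twice the first row
              [0.0, 1.0, 1.0]])
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T) == 2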
4.8 Change of basis
Definition 193 Let V be a vector space over a field F with a basis S = {v^1, ..., v^n}. Then, from Proposition 171, ∀v ∈ V there exists a unique (a_i)_{i=1}^n ∈ F^n such that v = ∑_{i=1}^n a_i v^i. The scalars (a_i)_{i=1}^n are called the coordinates of v relative to the basis S and are denoted by [v]_S, or simply by [v], when no ambiguity arises. We also denote the i-th component of the vector [v]_S by [v]_S^i and therefore we have [v]_S = ([v]_S^i)_{i=1}^n.
Remark 194 If V = R^n and S = {e_n^i}_{i=1}^n := e_n, i.e., the canonical basis, then, since x = ∑_{i=1}^n x_i e_n^i, we have
∀x ∈ R^n, [x]_{e_n} = x.
Exercise 195 Consider the vector space V of polynomials of degree ≤ 2,
{a t² + b t + c : a, b, c ∈ R}.
Find the coordinates of an arbitrary element of V with respect to the basis
{v^1 = 1, v^2 = t − 1, v^3 = (t − 1)²}.
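A numerical way to do the exercise (a sketch assuming numpy): write each basis polynomial in the monomial coordinates (1, t, t²) as a column of a matrix B and solve a linear system.

import numpy as np

# Columns: v^1 = 1, v^2 = t - 1, v^3 = (t - 1)^2 = 1 - 2t + t^2.
B = np.array([[1.0, -1.0,  1.0],
              [0.0,  1.0, -2.0],
              [0.0,  0.0,  1.0]])
a, b, c = 2.0, 3.0, 5.0                      # sample polynomial 2t^2 + 3t + 5
x = np.linalg.solve(B, np.array([c, b, a]))
print(x)   # [10, 7, 2], i.e., coordinates (a+b+c, 2a+b, a)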
Definition 196 Consider a vector space V of dimension n and two bases v = {v^i}_{i=1}^n and u = {u^k}_{k=1}^n of V. Then,
∀k ∈ {1, ..., n}, there exists a unique (a_{ik})_{i=1}^n = (a_{1k}, ..., a_{ik}, ..., a_{nk}) such that u^k = ∑_{i=1}^n a_{ik} v^i.  (4.11)
The matrix
P =
[ a_{11}  ...  a_{1k}  ...  a_{1n} ]
[  ...                             ]
[ a_{i1}  ...  a_{ik}  ...  a_{in} ]
[  ...                             ]
[ a_{n1}  ...  a_{nk}  ...  a_{nn} ]  ∈ M(n, n),  (4.12)
i.e.,
P = [ [u^1]_v  ...  [u^k]_v  ...  [u^n]_v ] ∈ M(n, n),
is called the change-of-basis matrix from the basis v to the basis u.
Remark 197 The name of the matrix P in the above Definition follows from the fact that the entries of P are used to transform (in the way described in (4.11)) vectors of the basis v into vectors of the basis u.
Proposition 198 If P is the change-of-basis matrix from the basis v to the basis u and Q is the change-of-basis matrix from the basis u to the basis v, then P and Q are invertible matrices and P = Q^{-1}.
Proof. By assumption,
∀k ∈ {1, ..., n}, ∃(a_{ik})_{i=1}^n such that u^k = ∑_{i=1}^n a_{ik} v^i,  (4.13)
with P = [a_{ik}] ∈ M(n, n), and
∀i ∈ {1, ..., n}, ∃(b_{ji})_{j=1}^n such that v^i = ∑_{j=1}^n b_{ji} u^j,  (4.14)
with Q = [b_{ji}] ∈ M(n, n).
Substituting (4.14) in (4.13), we get
u^k = ∑_{i=1}^n a_{ik} (∑_{j=1}^n b_{ji} u^j) = ∑_{j=1}^n (∑_{i=1}^n b_{ji} a_{ik}) u^j.
Now, observe that, ∀k ∈ {1, ..., n},
[u^k]_u = (∑_{i=1}^n b_{ji} a_{ik})_{j=1}^n = e_n^k
and
(∑_{i=1}^n b_{ji} a_{ik})_{j=1}^n = ( (b_{j1}, ..., b_{ji}, ..., b_{jn}) (a_{1k}, ..., a_{ik}, ..., a_{nk}) )_{j=1}^n
is the k-th column of QP. Therefore
QP = I
and we get the desired result.
The next Proposition explains how the coordinate vectors are affected by a change of basis.
Proposition 199 Let P be the change-of-basis matrix from v to u. Then,
∀w ∈ V, [w]_u = P^{-1} [w]_v and [w]_v = P [w]_u.
Proof. By definition of [w]_u, there exists (γ_k)_{k=1}^n ∈ R^n such that [w]_u = (γ_k)_{k=1}^n and
w = ∑_{k=1}^n γ_k u^k.
Moreover, as said at the beginning of the previous proof, ∀k ∈ {1, ..., n}, ∃(a_{ik})_{i=1}^n such that u^k = ∑_{i=1}^n a_{ik} v^i.
Then,
w = ∑_{k=1}^n γ_k (∑_{i=1}^n a_{ik} v^i) = ∑_{i=1}^n (∑_{k=1}^n γ_k a_{ik}) v^i,
i.e.,
[w]_v = (∑_{k=1}^n a_{ik} γ_k)_{i=1}^n.
Moreover, using the definition of P - see (4.12) - we get
P [w]_u = [a_{ik}] (γ_1, ..., γ_k, ..., γ_n) = (∑_{k=1}^n a_{1k} γ_k, ..., ∑_{k=1}^n a_{ik} γ_k, ..., ∑_{k=1}^n a_{nk} γ_k),
i.e., [w]_v = P [w]_u and, since P is invertible, [w]_u = P^{-1} [w]_v, as desired.
Remark 200 Although P is called the change-of-basis matrix from v to u, for the reason explained in Remark 197, it is P^{-1} which transforms the coordinates of w ∈ V relative to the original basis v into the coordinates of w relative to the new basis u.
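A small numerical check of Propositions 198 and 199 in R² (a sketch assuming numpy; the basis is arbitrary):

import numpy as np

# Columns of P are the vectors of the new basis u written in v-coordinates
# (here v is the canonical basis, so the columns are the u^k themselves).
P = np.array([[1.0, 1.0],
              [0.0, 2.0]])
w_v = np.array([3.0, 4.0])        # [w]_v

w_u = np.linalg.solve(P, w_v)     # [w]_u = P^{-1} [w]_v
assert np.allclose(P @ w_u, w_v)  # [w]_v = P [w]_u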
Proposition 201 Let S = {v^1, ..., v^n} be a subset of a vector space V. Let T be a subset of V obtained from S using one of the following elementary operations:
1. interchange two vectors,
2. multiply a vector by a nonzero scalar,
3. add a multiple of one vector to another one.
Then,
1. span S = span T, and
2. S is independent ⇔ T is independent.
Proof. 1. Any element in T is a linear combination of vectors of S. Since any operation has an inverse, i.e., an operation which brings the set back to its original nature - similarly to what was said in Proposition 94 - any element in S is a linear combination of vectors in T.
2. S is independent ⇔(1) S is a basis of span S ⇔(2) dim span S = n ⇔(3) dim span T = n ⇔ T is a basis of span T ⇔ T is independent,
where (1) follows from the definitions of basis and span, the [⇐] part in (2) from Proposition 178.3, and (3) from conclusion 1 above.
Proposition 202 Let A = [a_{ij}], B = [b_{ij}] ∈ M(m, n) be row equivalent matrices over a field F and v^1, ..., v^n vectors in a vector space V. Define
u^1 = a_{11} v^1 + ... + a_{1j} v^j + ... + a_{1n} v^n
...
u^i = a_{i1} v^1 + ... + a_{ij} v^j + ... + a_{in} v^n
...
u^m = a_{m1} v^1 + ... + a_{mj} v^j + ... + a_{mn} v^n
and
w^1 = b_{11} v^1 + ... + b_{1j} v^j + ... + b_{1n} v^n
...
w^i = b_{i1} v^1 + ... + b_{ij} v^j + ... + b_{in} v^n
...
w^m = b_{m1} v^1 + ... + b_{mj} v^j + ... + b_{mn} v^n.
Then
span{u^1, ..., u^m} = span{w^1, ..., w^m}.
Proof. Observe that applying elementary operations of the type defined in the previous Proposition to the set of vectors {u^1, ..., u^m} is equivalent to applying elementary row operations to the matrix A.
Since B can be obtained via row operations from A, the set of vectors {w^1, ..., w^m} can be obtained applying elementary operations of the type defined in the previous Proposition to the set of vectors {u^1, ..., u^m}, and from that Proposition the desired result follows.
Proposition 203 Let v^1, ..., v^n belong to a vector space V. Assume that, ∀k ∈ {1, ..., n}, ∃(a_{ki})_{i=1}^n such that u^k = ∑_{i=1}^n a_{ki} v^i, and define P = [a_{ki}] ∈ M(n, n). Then,
1. If P is invertible, then ⟨{v^i}_{i=1}^n is linearly independent⟩ ⇔ ⟨{u^k}_{k=1}^n is linearly independent⟩;
2. If P is not invertible, then {u^k}_{k=1}^n is linearly dependent.
Proof. 1.
From Proposition 104, P is row equivalent to the identity matrix I_n. Therefore, from Proposition 202,
span{v^i}_{i=1}^n = span{u^k}_{k=1}^n,
and from Proposition 201, the desired result follows.
2. Since P is not invertible, then P is not row equivalent to the identity matrix and it is row equivalent to a matrix with a zero row, say the last one. From Proposition 202, the vectors u'^1, ..., u'^n defined via that matrix have the same span as u^1, ..., u^n, and u'^n = 0. Hence dim span{u^k}_{k=1}^n ≤ n − 1 and, from Proposition 178.1, {u^k}_{k=1}^n is linearly dependent.
Corollary 204 Let v = {v^1, ..., v^n} be a basis of a vector space V. Let P = [a_{ik}] ∈ M(n, n) be invertible. Then u = {u^1, ..., u^n} such that, ∀k ∈ {1, ..., n}, u^k = ∑_{i=1}^n a_{ik} v^i is a basis of V, and P is the change-of-basis matrix from v to u.
Proposition 205 If v = {v^i}_{i=1}^n is a basis of R^n,
B := [ v^1 ... v^n ] ∈ M(n, n),
i.e., B is the matrix whose columns are the vectors of v, and P ∈ M(n, n) is an invertible matrix, then
1.
BP := [ u^1 ... u^n ] ∈ M(n, n)
is a matrix whose columns are another basis of R^n, and P is the change-of-basis matrix from v to u := {u^1, ..., u^n}, and
2. ∀w ∈ R^n, [w]_u = P^{-1} [w]_v.
Proof. 1.
BP = [ B C_1(P) ... B C_n(P) ] = [ ∑_{k=1}^n C_1^k(P) v^k ... ∑_{k=1}^n C_n^k(P) v^k ],
where the first equality follows from (3.4) in Remark 71 and the second from Remark 145. Therefore,
∀i ∈ {1, ..., n}, u^i = ∑_{k=1}^n C_i^k(P) v^k,
i.e., each element of the basis u is a linear combination of elements of v. Then, from Corollary 204, the desired result follows.
2. It follows from Proposition 199.
Remark 206 In the above Proposition 205, if
B := [ e^1 ... e^n ],
i.e., e = {e^1, ..., e^n} is the canonical basis (and P is invertible), then
1. the column vectors of P are a new basis of R^n, call it u, and
2. ∀x ∈ R^n,
y := [x]_u = P^{-1} [x]_e = P^{-1} x.
Chapter 5
Determinant and rank of a matrix
5.1 Definition and properties of the determinant of a matrix
To motivate the definition of determinant, we present an informal discussion of a way to find solutions to the linear system with two equations and two unknowns shown below:
a_{11} x_1 + a_{12} x_2 = b_1
a_{21} x_1 + a_{22} x_2 = b_2  (5.1)
The system can be rewritten as follows:
Ax = b,
where
A = [a_{11}, a_{12}; a_{21}, a_{22}], x = (x_1, x_2), b = (b_1, b_2).
Let's informally discuss how to find solutions to system (5.1). If a_{22} ≠ 0 and a_{12} ≠ 0, multiplying both sides of the first equation by a_{22}, of the second equation by −a_{12} and adding up, we get
a_{11} a_{22} x_1 + a_{12} a_{22} x_2 − a_{12} a_{21} x_1 − a_{12} a_{22} x_2 = a_{22} b_1 − a_{12} b_2.
Therefore, if
a_{11} a_{22} − a_{12} a_{21} ≠ 0,
we have
x_1 = (b_1 a_{22} − b_2 a_{12}) / (a_{11} a_{22} − a_{12} a_{21}).  (5.2)
In a similar manner,¹ we have
x_2 = (b_2 a_{11} − b_1 a_{21}) / (a_{11} a_{22} − a_{12} a_{21}).  (5.3)
¹ Assuming a_{21} ≠ 0 and a_{11} ≠ 0, multiply both sides of the first equation by −a_{21}, of the second equation by a_{11} and then add up.
We can then give the following preliminary definition: given A ∈ M(2, 2), the determinant of A is
det A = det [a_{11}, a_{12}; a_{21}, a_{22}] := a_{11} a_{22} − a_{12} a_{21}.
Using the definition of determinant, we can rewrite (5.2) and (5.3) as follows:
x_1 = det [b_1, a_{12}; b_2, a_{22}] / det A and x_2 = det [a_{11}, b_1; a_{21}, b_2] / det A.
We can now present the definition of the determinant of a square matrix A_{n×n} for arbitrary n ∈ N\{0}.
Definition 207 Given $n > 1$ and $A \in M(n, n)$, $\forall i, j \in \{1, ..., n\}$, we call $A_{ij} \in M(n-1, n-1)$ the matrix obtained from $A$ erasing the $i$-th row and the $j$-th column.

Definition 208 Given $A \in M(1, 1)$, i.e., $A = [a]$ with $a \in \mathbb{R}$, the determinant of $A$ is denoted by $\det A$ and we let $\det A := a$. For $n \in \mathbb{N} \setminus \{0, 1\}$, given $A \in M(n, n)$, we define the determinant of $A$ as
$$\det A := \sum_{j=1}^n (-1)^{1+j} a_{1j} \det A_{1j}.$$
Observe that $[a_{1j}]_{j=1}^n$ is the first row of $A$, i.e.,
$$\det A := \sum_{j=1}^n (-1)^{1+j} R^j_1(A) \det A_{1j}.$$
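As a concrete illustration of Definition 208 (an added Python sketch, not part of the original notes), here is the determinant computed literally by recursive expansion along the first row; it takes exponentially many operations in $n$ and is meant only to mirror the definition, not to be an efficient algorithm.

```python
def det(A):
    """Determinant by Laplace expansion along the first row (Definition 208)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):                      # 0-based j, so (-1)**j = (-1)**(1 + (j+1))
        # A_{1j}: erase row 1 and column j+1.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                     # 1*4 - 2*3 = -2
print(det([[1, 2, 3], [0, 1, 2], [-1, -2, 0]]))  # 3
```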
Example 209 For $n = 2$, we have
$$\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \sum_{j=1}^2 (-1)^{1+j} a_{1j} \det A_{1j} = (-1)^{1+1} a_{11} \det A_{11} + (-1)^{1+2} a_{12} \det A_{12} = a_{11} a_{22} - a_{12} a_{21},$$
and we get the informal definition given above.
Example 210 For $n = 3$, we have
$$\det A = \det \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = (-1)^{1+1} a_{11} \det A_{11} + (-1)^{1+2} a_{12} \det A_{12} + (-1)^{1+3} a_{13} \det A_{13}.$$
Definition 211 Given $A = [a_{ij}] \in M(n, n)$, $\forall i, j$, $\det A_{ij}$ is called the minor of $a_{ij}$ in $A$; $(-1)^{i+j} \det A_{ij}$ is called the cofactor of $a_{ij}$ in $A$.
Theorem 212 Given $A \in M(n, n)$, $\det A$ is equal to the sum of the products of the elements of any row or column times the corresponding cofactors, i.e.,
$$\forall i \in \{1, ..., n\}, \quad \det A = \sum_{j=1}^n (-1)^{i+j} R^j_i(A) \det A_{ij} \tag{5.4}$$
and
$$\forall j \in \{1, ..., n\}, \quad \det A = \sum_{i=1}^n (-1)^{i+j} C^i_j(A) \det A_{ij} \tag{5.5}$$
Proof. Omitted. We are going to omit several proofs about determinants. There are different ways of introducing the concept of determinant of a square matrix. One of them uses the concept of permutations - see, for example, Lipschutz (1991), Chapter 7. Another one is an axiomatic approach - see, for example, Lang (1971) - he introduces (three) properties that a function $f : M(n, n) \to \mathbb{R}$ has to satisfy and then shows that there exists a unique such function, called determinant. Following the first approach, the proof of the present theorem can be found on page 252 in Lipschutz (1991), Theorem 7.8, or, following the second approach, in Lang (1971), page 128, Theorem 4.

Definition 213 The expression used above for the computation of $\det A$ in (5.4) is called the (Laplace) expansion of the determinant by row $i$. The expression used above for the computation of $\det A$ in (5.5) is called the (Laplace) expansion of the determinant by column $j$.
Definition 214 Consider a matrix $A \in M(n,n)$. Let $1 \leq k \leq n$. A $k$-th order principal submatrix (minor) of $A$ is the (determinant of the) square submatrix of $A$ obtained deleting $(n - k)$ rows and $(n - k)$ columns in the same positions.
Theorem 215 (Properties of determinants)
Let the matrix $A = [a_{ij}] \in M(n, n)$ be given. The properties presented below hold true even if the words "column, columns" are substituted by the words "row, rows".
1. $\det A = \det A^T$.
2. If two columns are interchanged, the determinant changes its sign.
3. If there exists $j \in \{1, .., n\}$ such that $C^j(A) = \sum_{k=1}^p \alpha_k b^k$, then
$$\det \left[ C^1(A), ..., \sum_{k=1}^p \alpha_k b^k, ..., C^n(A) \right] = \sum_{k=1}^p \alpha_k \det \left[ C^1(A), ..., b^k, ..., C^n(A) \right],$$
i.e., the determinant of a matrix which has a column equal to a linear combination of some vectors is equal to the linear combination of the determinants of the matrices in which the column under analysis is each of the vectors of the initial linear combination; therefore, $\forall \alpha \in \mathbb{R}$ and $\forall j \in \{1, ..., n\}$,
$$\det [C^1(A), ..., \alpha C^j(A), ..., C^n(A)] = \alpha \det [C^1(A), ..., C^j(A), ..., C^n(A)] = \alpha \det A.$$
4. If $\exists j \in \{1, ..., n\}$ such that $C^j(A) = 0$, then $\det A = 0$, i.e., if a matrix has a column equal to zero, then the determinant is zero.
5. If $\exists j, k \in \{1, ..., n\}$ and $\alpha \in \mathbb{R}$ such that $C^j(A) = \alpha C^k(A)$, then $\det A = 0$, i.e., the determinant of a matrix with two columns proportional one to the other is zero.
6. If $\exists k \in \{1, ..., n\}$ and $\alpha_1, ..., \alpha_{k-1}, \alpha_{k+1}, ..., \alpha_n \in \mathbb{R}$ such that $C^k(A) = \sum_{j \neq k} \alpha_j C^j(A)$, then $\det A = 0$, i.e., if a column is equal to a linear combination of the other columns, then $\det A = 0$.
7.
$$\det \left[ C^1(A), ..., C^k(A) + \sum_{j \neq k} \alpha_j C^j(A), ..., C^n(A) \right] = \det A.$$
8. $\forall j, j' \in \{1, ..., n\}$ with $j \neq j'$, $\sum_{i=1}^n a_{ij} (-1)^{i+j'} \det A_{ij'} = 0$, i.e., the sum of the products of the elements of a column times the cofactors of the analogous elements of another column is equal to zero.
9. If $A$ is triangular, $\det A = a_{11} \cdot a_{22} \cdot ... \cdot a_{nn}$, i.e., if $A$ is triangular (for example, diagonal), the determinant is the product of the elements on the diagonal.
Proof. 1.
Consider the expansion of the determinant by the first row for the matrix $A$ and the expansion of the determinant by the first column for the matrix $A^T$.
2.
We proceed by induction on $n$. Let $A$ be the starting matrix and $A'$ the matrix with the interchanged columns.
$P(2)$ is obviously true.
$P(n-1) \Rightarrow P(n)$: expand $\det A$ and $\det A'$ by a column $j$ which is not one of the interchanged ones:
$$\det A = \sum_{i=1}^n (-1)^{i+j} C^i_j(A) \det A_{ij}, \qquad \det A' = \sum_{i=1}^n (-1)^{i+j} C^i_j(A) \det A'_{ij}.$$
Since $\forall i \in \{1, ..., n\}$, $A_{ij}, A'_{ij} \in M(n-1, n-1)$ and they have interchanged columns, by the induction argument $\det A'_{ij} = -\det A_{ij}$, and the desired result follows.
3.
Observe that
$$\sum_{k=1}^p \alpha_k b^k = \left( \sum_{k=1}^p \alpha_k b^k_i \right)_{i=1}^n.$$
Then, expanding by column $j$,
$$\det \left[ C^1(A), ..., \sum_{k=1}^p \alpha_k b^k, ..., C^n(A) \right] = \sum_{i=1}^n (-1)^{i+j} \left( \sum_{k=1}^p \alpha_k b^k_i \right) \det A_{ij} = \sum_{k=1}^p \alpha_k \sum_{i=1}^n (-1)^{i+j} b^k_i \det A_{ij} = \sum_{k=1}^p \alpha_k \det \left[ C^1(A), ..., b^k, ..., C^n(A) \right].$$
4.
It is sufficient to expand the determinant by the column equal to zero.
5.
Let $A := [C^1(A), \alpha C^1(A), C^3(A), ..., C^n(A)]$ and $\widetilde{A} := [C^1(A), C^1(A), C^3(A), ..., C^n(A)]$ be given. Then, from property 3, $\det A = \alpha \det \widetilde{A}$. Interchanging the first column with the second column of the matrix $\widetilde{A}$, from property 2, we have that $\det \widetilde{A} = -\det \widetilde{A}$, and therefore $\det \widetilde{A} = 0$, and $\det A = \alpha \det \widetilde{A} = 0$.
6.
It follows from 3 and 5.
7.
It follows from 3 and 6.
8.
It follows from the fact that the obtained expression is the determinant of a matrix with two equal columns.
9.
It can be shown by induction, expanding the determinant by the first row or column, choosing one which has all the elements equal to zero excluding at most the first element. In other words, in the case of an upper triangular matrix, we can say what follows:
$$\det \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix} = a_{11} a_{22},$$
$$\det \begin{bmatrix} a_{11} & a_{12} & a_{13} & ... & a_{1n} \\ 0 & a_{22} & a_{23} & ... & a_{2n} \\ 0 & 0 & a_{33} & ... & a_{3n} \\ & & & \ddots & \vdots \\ 0 & ... & 0 & ... & a_{nn} \end{bmatrix} = a_{11} \det \begin{bmatrix} a_{22} & a_{23} & ... & a_{2n} \\ 0 & a_{33} & ... & a_{3n} \\ & & \ddots & \vdots \\ 0 & ... & 0 & a_{nn} \end{bmatrix} = a_{11} a_{22} a_{33} \cdots a_{nn}.$$
Theorem 216 $\forall A, B \in M(n, n)$, $\det(AB) = \det A \cdot \det B$.

Proof. Exercise.

Definition 217 $A \in M(n, n)$ is called nonsingular if $\det A \neq 0$.
5.2 Rank of a matrix

Definition 218 Given $A \in M(m, n)$, a square submatrix of $A$ of order $k \leq \min\{m, n\}$ is a matrix obtained considering the elements belonging to $k$ rows and $k$ columns of $A$.

Definition 219 Given $A \in M(m, n)$, the rank of $A$ is the greatest order of square nonsingular submatrices of $A$.

Remark 220 $\operatorname{rank} A \leq \min\{m, n\}$.

To compute $\operatorname{rank} A$, with $A \in M(m, n)$, we can proceed as follows.
1. Consider $k = \min\{m, n\}$ and the set of square submatrices of $A$ of order $k$. If there exists a nonsingular matrix among them, then $\operatorname{rank} A = k$. If all the square submatrices of $A$ of order $k$ are singular, go to step 2 below.
2. Consider $k - 1$, and then the set of the square submatrices of $A$ of order $k - 1$. If there exists a nonsingular matrix among them, then $\operatorname{rank} A = k - 1$. If all square submatrices of order $k - 1$ are singular, go to step 3.
3. Consider $k - 2$ ... and so on.
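A brute-force Python sketch of this procedure (an added illustration, not part of the original notes; `rank_by_minors` is a hypothetical helper name, and since it enumerates all square submatrices it is usable only for small matrices):

```python
import itertools
import numpy as np

def rank_by_minors(A, tol=1e-10):
    """Rank as the greatest order of a nonsingular square submatrix (Definition 219)."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for k in range(min(m, n), 0, -1):          # largest order first, as in the procedure
        for rows in itertools.combinations(range(m), k):
            for cols in itertools.combinations(range(n), k):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return k
    return 0

A = [[1, 2, 3], [2, 3, 4], [3, 5, 7]]
print(rank_by_minors(A))                       # 2
print(np.linalg.matrix_rank(np.array(A)))      # 2 (agrees)
```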
Remark 221 1. $\operatorname{rank} I_n = n$.
2. The rank of a matrix with a zero row or column is equal to the rank of that matrix without that row or column, i.e.,
$$\operatorname{rank} \begin{bmatrix} A \\ 0 \end{bmatrix} = \operatorname{rank} \begin{bmatrix} A & 0 \end{bmatrix} = \operatorname{rank} \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix} = \operatorname{rank} A.$$
That result follows from the fact that the determinant of any square submatrix of $A$ involving that zero row or column is zero.
3. From the above results, we also have that
$$\operatorname{rank} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} = r.$$
We now describe an easier way to compute the rank of $A$, which in fact involves the elementary row and column operations we studied in Chapter 1.

Proposition 222 Given $A, A' \in M(m, n)$,
$$[A \text{ is equivalent to } A'] \Leftrightarrow [\operatorname{rank} A = \operatorname{rank} A'].$$

Proof. $[\Rightarrow]$ Since $A$ is equivalent to $A'$, it is possible to go from $A$ to $A'$ through a finite number of elementary row or column operations. In each step, in any square submatrix $A^*$ of $A$ which has been changed according to those operations, the elementary row or column operations 1, 2 and 3 (i.e., 1. row or column interchange, 2. row or column scaling and 3. row or column addition) are such that the determinant of $A^*$ changes its sign (Property 2, Theorem 215), is multiplied by a nonzero constant (Property 3), or remains unchanged (Property 7), respectively. Therefore, each submatrix $A^*$ whose determinant is different from zero remains with determinant different from zero, and any submatrix $A^*$ whose determinant is zero remains with zero determinant.
$[\Leftarrow]$ From Corollary 151.5², we have that there exist unique $\hat{r}$ and $\hat{r}'$ such that
$$A \text{ is equivalent to } \begin{bmatrix} I_{\hat{r}} & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad A' \text{ is equivalent to } \begin{bmatrix} I_{\hat{r}'} & 0 \\ 0 & 0 \end{bmatrix}.$$
Moreover, from the $[\Rightarrow]$ part of the present proposition and Remark 221,
$$\operatorname{rank} A = \operatorname{rank} \begin{bmatrix} I_{\hat{r}} & 0 \\ 0 & 0 \end{bmatrix} = \hat{r} \quad \text{and} \quad \operatorname{rank} A' = \operatorname{rank} \begin{bmatrix} I_{\hat{r}'} & 0 \\ 0 & 0 \end{bmatrix} = \hat{r}'.$$
Then, by assumption, $\hat{r} = \hat{r}' := r$, and $A$ and $A'$ are both equivalent to
$$\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix},$$
and therefore $A$ is equivalent to $A'$.

²That result says what follows: for any $A \in M(m, n)$, there exist invertible matrices $P \in M(m, m)$ and $Q \in M(n, n)$ and a unique number $r \in \{0, 1, ..., \min\{m, n\}\}$ such that $A$ is equivalent to
$$PAQ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.$$
Example 223 Given
$$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 5 & 7 \end{bmatrix},$$
we can perform the following elementary row and column operations, and cancellations of zero rows and columns, on the matrix:
$$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 5 & 7 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 3 \\ 2 & 1 & 4 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 2 & 1 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Therefore, the rank of the matrix is 2.
5.3 Inverse matrices (continued)

Using the notion of determinant, we can find another way of analyzing the problems of i. existence and ii. computation of the inverse matrix. To do that, we introduce the concept of adjoint matrix.

Definition 224 Given a matrix $A \in M(n,n)$, we call adjoint matrix of $A$, and we denote it by $\operatorname{Adj} A$, the matrix whose elements are the cofactors³ of the corresponding elements of $A^T$.

Remark 225 In other words, to construct $\operatorname{Adj} A$:
1. construct $A^T$;
2. consider the cofactors of each element of $A^T$.
Example 226
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ -1 & -2 & 0 \end{bmatrix}, \quad A^T = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & -2 \\ 3 & 2 & 0 \end{bmatrix}, \quad \operatorname{Adj} A = \begin{bmatrix} 4 & -6 & 1 \\ -2 & 3 & -2 \\ 1 & -2 & 1 \end{bmatrix}$$
Proposition 227 Given $A \in M(n,n)$, we have
$$A \cdot \operatorname{Adj} A = \operatorname{Adj} A \cdot A = \det A \cdot I \tag{5.6}$$

Proof. Making the product $A \cdot \operatorname{Adj} A := B$, we have:
1. $\forall i \in \{1, ..., n\}$, the $i$-th element on the diagonal of $B$ is the expansion of the determinant by the $i$-th row and therefore is equal to $\det A$;
2. any element not on the diagonal of $B$ is the sum of the products of the elements of a row times the corresponding cofactors of a parallel row, and it is therefore equal to zero, due to Property 8 of the determinants stated in Theorem 215.
Example 228
$$\det \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ -1 & -2 & 0 \end{bmatrix} = 3, \qquad \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ -1 & -2 & 0 \end{bmatrix} \begin{bmatrix} 4 & -6 & 1 \\ -2 & 3 & -2 \\ 1 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix} = 3 \cdot I_3$$

³From Definition 211, recall that given $A = [a_{ij}] \in M(n, n)$, $\forall i, j$, $(-1)^{i+j} \det A_{ij}$ is called the cofactor of $a_{ij}$ in $A$.
Proposition 229 Given an $n \times n$ matrix $A$, the following statements are equivalent:
1. $\det A \neq 0$, i.e., $A$ is nonsingular;
2. $A^{-1}$ exists, i.e., $A$ is invertible;
3. $\operatorname{rank} A = n$;
4. $\operatorname{col\,rank} A = n$, i.e., the column vectors of the matrix $A$ are linearly independent;
5. $\operatorname{row\,rank} A = n$, i.e., the row vectors of the matrix $A$ are linearly independent;
6. $\dim \operatorname{col\,span} A = n$;
7. $\dim \operatorname{row\,span} A = n$.

Proof. $1 \Rightarrow 2$.
From (5.6) and from the fact that $\det A \neq 0$, we have
$$A \cdot \frac{\operatorname{Adj} A}{\det A} = \frac{\operatorname{Adj} A}{\det A} \cdot A = I,$$
and therefore
$$A^{-1} = \frac{\operatorname{Adj} A}{\det A} \tag{5.7}$$
$1 \Leftarrow 2$.
$AA^{-1} = I \Rightarrow \det(AA^{-1}) = \det I \Rightarrow \det A \cdot \det A^{-1} = 1 \Rightarrow \det A \neq 0$ (and $\det A^{-1} \neq 0$).
$1 \Leftrightarrow 3$.
It follows from the definition of rank and the fact that $A$ is an $n \times n$ matrix.
$2 \Rightarrow 4$.
From (4.6), it suffices to show that $[Ax = 0 \Rightarrow x = 0]$. Since $A^{-1}$ exists, $Ax = 0 \Rightarrow A^{-1}Ax = A^{-1}0 \Rightarrow x = 0$.
$2 \Leftarrow 4$.
From Proposition 178.2, the $n$ linearly independent column vectors $a^1, ..., a^i, ..., a^n$ are a basis of $\mathbb{R}^n$. Therefore, each vector in $\mathbb{R}^n$ is equal to a linear combination of those vectors. Then $\forall k \in \{1, ..., n\}$, $\exists b^k \in \mathbb{R}^n$ such that the $k$-th vector $e^k$ in the canonical basis is equal to $[C^1(A), ..., C^i(A), ..., C^n(A)]\, b^k = A b^k$, i.e.,
$$\left[\, e^1 \; ... \; e^k \; ... \; e^n \,\right] = \left[\, Ab^1 \; ... \; Ab^k \; ... \; Ab^n \,\right],$$
or, from (3.4) in Remark 71, defined
$$B := \left[\, b^1 \; ... \; b^k \; ... \; b^n \,\right], \qquad I = AB,$$
i.e., $A^{-1}$ exists (and it is equal to $B$).
The remaining equivalences follow from Corollary 192.

Remark 230 From the proof of the previous Proposition, we also have that, if $\det A \neq 0$, then $\det A^{-1} = (\det A)^{-1}$.

Remark 231 The previous Proposition gives a way to compute the inverse matrix, as explained in (5.7).
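A small Python sketch of formula (5.7) (added for illustration; `adjoint` is a hypothetical helper name, and the cofactor-based computation is only reasonable for small matrices, not a practical inversion routine):

```python
import numpy as np

def adjoint(A):
    """Adj A: the (j, i) entry is the cofactor of a_{ij} (Definition 224)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    adj = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            adj[j, i] = (-1) ** (i + j) * np.linalg.det(minor)  # transposition built in
    return adj

A = np.array([[1, 2, 3], [0, 1, 2], [-1, -2, 0]], dtype=float)
A_inv = adjoint(A) / np.linalg.det(A)       # formula (5.7)
print(np.allclose(A @ A_inv, np.eye(3)))    # True
```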
Example 232 1.
$$\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ -1 & -2 & 0 \end{bmatrix}^{-1} = \frac{1}{3} \begin{bmatrix} 4 & -6 & 1 \\ -2 & 3 & -2 \\ 1 & -2 & 1 \end{bmatrix}$$
2.
$$\begin{bmatrix} 0 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} -1 & 0 & 1 \\ 1 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}$$
3.
$$\begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \end{bmatrix}^{-1} \text{ does not exist because } \det \begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \\ 6 & 7 & 8 \end{bmatrix} = 0$$
4.
$$\begin{bmatrix} 0 & 1 & 0 \\ 2 & 0 & 2 \\ 1 & 2 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & \frac{3}{4} & -\frac{1}{2} \\ 1 & 0 & 0 \\ -1 & -\frac{1}{4} & \frac{1}{2} \end{bmatrix}$$
5.
$$\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}^{-1} = \begin{bmatrix} \frac{1}{a} & 0 & 0 \\ 0 & \frac{1}{b} & 0 \\ 0 & 0 & \frac{1}{c} \end{bmatrix} \quad \text{if } a, b, c \neq 0.$$
5.4 Span of a matrix, linearly independent rows and columns, rank

Proposition 233 Given $A \in M(m, n)$, then
$$\dim \operatorname{row\,span} A = \operatorname{rank} A.$$

Proof. First proof.
The following result, which is the content of Corollary 151.5, plays a crucial role in the proof: for any $A \in M(m, n)$, there exist invertible matrices $P \in M(m, m)$ and $Q \in M(n, n)$ and a unique number $r \in \{0, 1, ..., \min\{m, n\}\}$ such that
$$A \text{ is equivalent to } PAQ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.$$
From the above result, Proposition 222 and Remark 221, we have that
$$\operatorname{rank} A = \operatorname{rank} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} = r. \tag{5.8}$$
From Propositions 105 and 146, we have
$$\operatorname{row\,span} A = \operatorname{row\,span} PA;$$
from Propositions 119 and 146, we have
$$\operatorname{col\,span} PA = \operatorname{col\,span} PAQ = \operatorname{col\,span} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.$$
From Corollary 192,
$$\dim \operatorname{row\,span} PA = \dim \operatorname{col\,span} PA.$$
Therefore
$$\dim \operatorname{row\,span} A = \dim \operatorname{col\,span} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} = r, \tag{5.9}$$
where the last equality follows simply because
$$\operatorname{col\,span} \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} = \operatorname{col\,span} \begin{bmatrix} I_r \\ 0 \end{bmatrix},$$
and the $r$ column vectors of the matrix $\begin{bmatrix} I_r \\ 0 \end{bmatrix}$ are linearly independent and therefore, from Proposition 187, they are a basis of $\operatorname{col\,span} \begin{bmatrix} I_r \\ 0 \end{bmatrix}$.
From (5.8) and (5.9), the desired result follows.
Second proof.
We are going to show that $\operatorname{row\,rank} A = \operatorname{rank} A$. Recall that $\operatorname{row\,rank} A$ is the number $r \in \mathbb{N}$ such that
i. $r$ row vectors of $A$ are linearly independent, and
ii. if $m > r$, any set of rows of $A$ of cardinality $> r$ is linearly dependent.
We want to show that
1. if $\operatorname{row\,rank} A = r$, then $\operatorname{rank} A = r$, and
2. if $\operatorname{rank} A = r$, then $\operatorname{row\,rank} A = r$.
1.
Consider the $r$ linearly independent (l.i.) row vectors of $A$. Since $r$ is the maximal number of l.i. row vectors, from Lemma 185, each of the remaining $(m - r)$ row vectors is a linear combination of the $r$ l.i. ones. Then, up to a reordering of the rows of $A$, which does not change either $\operatorname{row\,rank} A$ or $\operatorname{rank} A$, there exist matrices $A_1 \in M(r, n)$ and $A_2 \in M(m - r, n)$ such that
$$\operatorname{rank} A = \operatorname{rank} \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \operatorname{rank} A_1,$$
where the last equality follows from Proposition 222. Then $r$ is the maximum number of l.i. row vectors of $A_1$ and therefore, from Proposition 191, the maximum number of l.i. column vectors of $A_1$. Then, again from Lemma 185, there exist $A_{11} \in M(r, r)$ and $A_{12} \in M(r, n - r)$ such that
$$\operatorname{rank} A_1 = \operatorname{rank} \begin{bmatrix} A_{11} & A_{12} \end{bmatrix} = \operatorname{rank} A_{11}.$$
Then the square $r \times r$ matrix $A_{11}$ contains $r$ linearly independent column vectors. Then, from Proposition 229, the result follows.
2.
By assumption, up to a reordering of rows and columns, which does not change either $\operatorname{row\,rank} A$ or $\operatorname{rank} A$, we can write, with column blocks of widths $s$, $r$, $n - r - s$ and row blocks of heights $r$, $m - r$,
$$A = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \end{bmatrix}$$
with $A_{12} \in M(r, r)$ and $\operatorname{rank} A_{12} = r$.
Then, from Proposition 229, the row and the column vectors of $A_{12}$ are linearly independent. Then, from Corollary 167, the $r$ row vectors of $\begin{bmatrix} A_{11} & A_{12} & A_{13} \end{bmatrix}$ are l.i.
Now suppose that the maximum number of l.i. row vectors of $A$ is $r' > r$ (and the other $m - r'$ row vectors of $A$ are linear combinations of them). Then, from part 1 of the present proof, $\operatorname{rank} A = r' > r$, contradicting the assumption.
Remark 234 From Corollary 192 and the above Proposition, for any matrix $A \in M(m,n)$ the following numbers are equal:
1. $\operatorname{rank} A :=$ greatest order of square nonsingular submatrices of $A$;
2. $\dim \operatorname{row\,span} A$;
3. $\dim \operatorname{col\,span} A$;
4. $\operatorname{row\,rank} A :=$ maximum number of linearly independent rows of $A$;
5. $\operatorname{col\,rank} A :=$ maximum number of linearly independent columns of $A$.
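A quick numerical confirmation of Remark 234 on a sample matrix (an added sketch; `np.linalg.matrix_rank` computes the rank via the singular value decomposition, a method not covered in these notes):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 3., 4.],
              [3., 5., 7.]])

# rank A = dim col span A, and dim row span A = rank of A^T: the two agree.
print(np.linalg.matrix_rank(A))    # 2
print(np.linalg.matrix_rank(A.T))  # 2
```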
Chapter 6

Eigenvalues and eigenvectors

In the present chapter, we try to answer the following basic question: given a real matrix $A \in M(n, n)$, can we find a nonsingular matrix $P$ (which can be thought of as representing a change-of-basis matrix - see Proposition 205 and the Remark after it) such that $P^{-1}AP$ is a diagonal matrix?

6.1 Characteristic polynomial

We first introduce the notion of similar matrices.

Definition 235 A matrix $B \in M(n, n)$ is similar to a matrix $A \in M(n, n)$ if there exists an invertible matrix $P \in M(n, n)$ such that
$$B = P^{-1} A P.$$

Remark 236 Similarity is an equivalence relation. Therefore, we say that $A$ and $B$ are similar matrices if $B = P^{-1} A P$.

Definition 237 A matrix $A \in M(n, n)$ is said to be diagonalizable if there exists an invertible matrix $P$ such that $B = P^{-1} A P$ is a diagonal matrix, i.e., if there exists a similar diagonal matrix.
Definition 238 Given a matrix $A = [a_{ij}] \in M_F(n, n)$, the matrix
$$tI - A,$$
with $t \in F$, is called the characteristic matrix of $A$;
$$\det[tI - A]$$
is called the characteristic polynomial (in $t$) of $A$ and it is denoted by $\Delta_A(t)$;
$$\Delta_A(t) = 0$$
is the characteristic equation of $A$.
Proposition 239 Similar matrices have the same determinant and the same characteristic polynomial.

Proof. Assume that $A$ and $B$ are similar matrices; then, by definition, there exists a nonsingular matrix $P$ such that $B = P^{-1}AP$. Then
$$\det B = \det(P^{-1}AP) = \det P^{-1} \cdot \det A \cdot \det P = \det P^{-1} \cdot \det P \cdot \det A = \det(P^{-1}P) \cdot \det A = \det A,$$
and, very similarly,
$$\det[tI - B] = \det[tI - P^{-1}AP] = \det[P^{-1}\,tI\,P - P^{-1}AP] = \det[P^{-1}(tI - A)P] =$$
$$= \det P^{-1} \cdot \det[tI - A] \cdot \det P = \det P^{-1} \cdot \det P \cdot \det[tI - A] = \det[tI - A].$$
The proof of the following result goes beyond the scope of these notes and it is therefore omitted.

Proposition 240 Let $A \in M_F(n, n)$ be given. Then its characteristic polynomial is
$$t^n - S_1 t^{n-1} + S_2 t^{n-2} + ... + (-1)^n S_n,$$
where $\forall i \in \{1, ..., n\}$, $S_i$ is the sum of the principal minors of order $i$ in $A$ - see Definition 214.

Exercise 241 Verify the statement of the above Proposition for $n \in \{2, 3\}$.
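A symbolic check of Proposition 240 in the spirit of Exercise 241 (an added sketch using Python with SymPy, which is not part of these notes; the matrix is a made-up $3 \times 3$ example and `S` is a hypothetical helper name):

```python
import itertools
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 2, 0], [0, 1, 1], [2, 0, 1]])
n = A.shape[0]

# S_i = sum of the principal minors of order i (Definition 214).
def S(i):
    return sum(A[list(rows), list(rows)].det()
               for rows in itertools.combinations(range(n), i))

# t^n - S_1 t^{n-1} + S_2 t^{n-2} - ... + (-1)^n S_n
p = t**n + sum((-1)**i * S(i) * t**(n - i) for i in range(1, n + 1))
assert sp.expand(p - A.charpoly(t).as_expr()) == 0   # matches det(tI - A)
```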
6.2 Eigenvalues and eigenvectors

Definition 242 Let $A \in M(n, n)$ on a field $F$ be given. A scalar $\lambda \in F$ is called an eigenvalue of $A$ if there exists a nonzero vector $v \in F^n$ such that
$$Av = \lambda v.$$
Every nonzero vector $v$ satisfying the above relationship is called an eigenvector of $A$ associated with (or belonging to) the eigenvalue $\lambda$.

Remark 243 The two main reasons to require an eigenvector to be different from zero are explained in Proposition 245, part 1, and in the first line of both steps of the induction proof of Proposition 252 below.

Definition 244 Let $E_\lambda$ be the set of eigenvectors belonging to $\lambda$.

Proposition 245 1. There is only one eigenvalue associated with an eigenvector.
2. The set $E_\lambda \cup \{0\}$ is a subspace of $F^n$.

Proof. 1. If $Av = \lambda v = \mu v$, then $(\lambda - \mu)v = 0$. Since $v \neq 0$, the result follows.
2. Step 1. If $\lambda$ is an eigenvalue of $A$ and $v$ is an associated eigenvector, then $\forall k \in F \setminus \{0\}$, $kv$ is an associated eigenvector as well: $Av = \lambda v \Rightarrow A(kv) = kAv = k(\lambda v) = \lambda(kv)$.
Step 2. If $\lambda$ is an eigenvalue of $A$ and $v, u$ are associated eigenvectors, then $v + u$ is an associated eigenvector as well: $A(v + u) = Av + Au = \lambda v + \lambda u = \lambda(v + u)$.
Proposition 246 Let $A$ be a matrix in $M(n, n)$ over a field $F$. Then the following statements are equivalent.
1. $\lambda \in F$ is an eigenvalue of $A$.
2. $\lambda I - A$ is a singular matrix.
3. $\lambda \in F$ is a solution to the characteristic equation $\Delta_A(t) = 0$.

Proof. $\lambda \in F$ is an eigenvalue of $A$ $\Leftrightarrow$ $\exists v \in F^n \setminus \{0\}$ such that $Av = \lambda v$ $\Leftrightarrow$ $\exists v \in F^n \setminus \{0\}$ such that $(\lambda I - A)v = 0$ $\overset{\text{Exercise}}{\Leftrightarrow}$ $\lambda I - A$ is a singular matrix $\Leftrightarrow$ $\det(\lambda I - A) = 0$ $\Leftrightarrow$ $\lambda \in F$ is a solution to the characteristic equation $\Delta_A(t) = 0$.
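A numerical illustration of Proposition 246 (an added NumPy sketch on a made-up matrix): the eigenvalues returned by `np.linalg.eig` are exactly the scalars making $\lambda I - A$ singular, and each comes with a nonzero eigenvector.

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 2.]])
eigvals, eigvecs = np.linalg.eig(A)
print(np.sort(eigvals))                  # [1. 3.]

# Each eigenvalue makes (lam*I - A) singular ...
for lam in eigvals:
    assert abs(np.linalg.det(lam * np.eye(2) - A)) < 1e-9
# ... and each column of eigvecs is a nonzero v with A v = lam v.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)
```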
Theorem 247 (The fundamental theorem of algebra)¹ Let
$$p(z) = a_n z^n + a_{n-1} z^{n-1} + ... + a_1 z + a_0, \quad \text{with } a_n \neq 0,$$
be a polynomial of degree $n \geq 1$ with complex coefficients $a_0, ..., a_n$. Then
$$p(z) = 0$$
for at least one $z \in \mathbb{C}$.
In fact, there exist $r \in \mathbb{N} \setminus \{0\}$, pairwise distinct $(z_i)_{i=1}^r \in \mathbb{C}^r$, $(n_i)_{i=1}^r \in \mathbb{N}^r$ and $c \in \mathbb{C} \setminus \{0\}$ such that
$$p(z) = c \prod_{i=1}^r (z - z_i)^{n_i} \quad \text{and} \quad \sum_{i=1}^r n_i = n.$$

¹For a brief description of the field of complex numbers, see Chapter 10.

Definition 248 For any $i \in \{1, ..., r\}$, $n_i$ is called the algebraic multiplicity of the solution $z_i$.
As a consequence of the Fundamental Theorem of Algebra, we have the following result.

Proposition 249 Let $A$ be a matrix in $M(n, n)$ over the complex field $\mathbb{C}$. Then $A$ has at least one eigenvalue.

Definition 250 Let $\lambda$ be an eigenvalue of $A$. The algebraic multiplicity of $\lambda$ is the multiplicity of $\lambda$ as a solution to the characteristic equation $\Delta_A(t) = 0$. The geometric multiplicity of $\lambda$ is the dimension of $E_\lambda \cup \{0\}$.

In Section 8.8, we will prove the following result: let $\lambda$ be an eigenvalue of $A$; then the geometric multiplicity of $\lambda$ is smaller than or equal to the algebraic multiplicity of $\lambda$.
6.3 Similar matrices and diagonalizable matrices

Proposition 251 $A \in M(n, n)$ is diagonalizable, i.e., $A$ is similar to a diagonal matrix $D$, $\Leftrightarrow$ $A$ has $n$ linearly independent eigenvectors. In that case, the elements on the diagonal of $D$ are the associated eigenvalues and $D = P^{-1}AP$, where $P$ is the matrix whose columns are the $n$ eigenvectors.

Proof. $[\Leftarrow]$ Assume that $\left(v^k\right)_{k=1}^n$ is the set of linearly independent eigenvectors associated with $A$ and, for every $k \in \{1, ..., n\}$, $\lambda_k$ is the unique eigenvalue associated with $v^k$. Then
$$\forall k \in \{1, ..., n\}, \quad Av^k = \lambda_k v^k,$$
i.e.,
$$AP = P \operatorname{diag}\left[(\lambda_k)_{k=1}^n\right],$$
where $P$ is the matrix whose columns are the $n$ eigenvectors. Since they are linearly independent, $P$ is invertible and
$$P^{-1}AP = \operatorname{diag}\left[(\lambda_k)_{k=1}^n\right].$$
$[\Rightarrow]$ Assume that there exist an invertible matrix $P \in M(n, n)$ and a diagonal matrix $D = \operatorname{diag}\left[(\lambda_k)_{k=1}^n\right]$ such that $P^{-1}AP = D$. Then
$$AP = PD.$$
Let $\left(v^k\right)_{k=1}^n$ be the column vectors of $P$. Then the above expression can be rewritten as
$$\forall k \in \{1, ..., n\}, \quad Av^k = \lambda_k v^k.$$
First of all, $\forall k \in \{1, ..., n\}$, $v^k \neq 0$, otherwise $P$ would not be invertible. Then $v^k, \lambda_k$ are eigenvector-eigenvalue pairs associated with $A$. Finally, $\left(v^k\right)_{k=1}^n$ are linearly independent because $P$ is invertible.
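A numerical sketch of Proposition 251 (added illustration on a made-up matrix): stacking the eigenvectors as the columns of $P$ and computing $P^{-1}AP$ yields a diagonal matrix with the eigenvalues on the diagonal.

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])
lam, P = np.linalg.eig(A)          # columns of P: eigenvectors of A
D = np.linalg.inv(P) @ A @ P       # P^{-1} A P
print(np.round(D, 10))             # diagonal, with the eigenvalues 5 and 2
```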
Proposition 252 Let $v^1, ..., v^n$ be eigenvectors of $A \in M(m, m)$ belonging to pairwise distinct eigenvalues $\lambda_1, ..., \lambda_n$, respectively. Then $v^1, ..., v^n$ are linearly independent.

Proof. We prove the result by induction on $n$.
Step 1. $n = 1$. Since, by definition, eigenvectors are nonzero, the result follows.
Step 2. Assume that
$$\sum_{i=1}^n a_i v^i = 0. \tag{6.1}$$
Multiplying (6.1) by $A$, and using the assumption that $\forall i \in \{1, ..., n\}$, $Av^i = \lambda_i v^i$, we get
$$0 = \sum_{i=1}^n a_i A v^i = \sum_{i=1}^n a_i \lambda_i v^i. \tag{6.2}$$
Multiplying (6.1) by $\lambda_n$, we get
$$0 = \sum_{i=1}^n a_i \lambda_n v^i. \tag{6.3}$$
Subtracting (6.3) from (6.2), we get
$$0 = \sum_{i=1}^{n-1} a_i (\lambda_i - \lambda_n) v^i. \tag{6.4}$$
From the assumption of Step 2 of the proof by induction and from (6.4), we have that $\left(v^1, ..., v^{n-1}\right)$ are linearly independent, and therefore $\forall i \in \{1, ..., n-1\}$, $a_i(\lambda_i - \lambda_n) = 0$; since the eigenvalues are pairwise distinct,
$$\forall i \in \{1, ..., n-1\}, \quad a_i = 0.$$
Substituting the above conditions in (6.1), we get $a_n v^n = 0$, and since $v^n$ is an eigenvector and therefore different from zero,
$$a_n = 0.$$
Remark 253 $A \in M(m, m)$ admits at most $m$ distinct eigenvalues. Otherwise, from Proposition 252, we would have $m + 1$ linearly independent vectors in $\mathbb{R}^m$, contradicting Proposition 178.1.
Proposition 254 Let $A \in M(n, n)$ be given. Assume that $\Delta_A(t) = \prod_{i=1}^n (t - a_i)$, with $a_1, ..., a_n$ pairwise distinct. Then $A$ is similar to $\operatorname{diag}\left[(a_i)_{i=1}^n\right]$ and, in fact, $\operatorname{diag}\left[(a_i)_{i=1}^n\right] = P^{-1}AP$, where $P$ is the matrix whose columns are the $n$ eigenvectors.

Proof. From Proposition 246, $\forall i \in \{1, ..., n\}$, $a_i$ is an eigenvalue of $A$. For every $i \in \{1, ..., n\}$, let $v^i$ be an eigenvector associated with $a_i$. Since $a_1, ..., a_n$ are pairwise distinct, from Proposition 252 the eigenvectors $v^1, ..., v^n$ are linearly independent. From Proposition 251, the desired result follows.

In Section 9.2, we will describe an algorithm to find eigenvalues and eigenvectors of $A$ and to decide whether $A$ is diagonalizable.
Chapter 7

Linear functions

7.1 Definition

Definition 255 Given the vector spaces $V$ and $U$ over the same field $F$, a function $l : V \to U$ is linear if
1. $\forall v, w \in V$, $l(v + w) = l(v) + l(w)$, and
2. $\forall \alpha \in F$, $\forall v \in V$, $l(\alpha v) = \alpha l(v)$.
Call $L(V, U)$ the set of all such functions. Any time we write $L(V, U)$, we implicitly assume that $V$ and $U$ are vector spaces on the same field $F$. In other words, $l$ is linear if it preserves the two basic operations of vector spaces.

Remark 256 1. $l \in L(V, U)$ iff $\forall v, w \in V$ and $\forall \alpha, \beta \in F$, $l(\alpha v + \beta w) = \alpha l(v) + \beta l(w)$;
2. If $l \in L(V, U)$, then $l(0) = 0$: for arbitrary $x \in V$, $l(0) = l(0x) = 0 \cdot l(x) = 0$.
Example 257 Let $V$ and $U$ be vector spaces. The following functions are linear.
1. (identity function) $l_1 : V \to V$, $l_1(v) = v$.
2. (null function) $l_2 : V \to U$, $l_2(v) = 0$.
3. $\forall a \in F$, $l_a : V \to V$, $l_a(v) = av$.
4. (projection function) $\forall n \in \mathbb{N}$, $\forall k \in \mathbb{N} \setminus \{0\}$,
$$\operatorname{proj}_{n+k,n} : \mathbb{R}^{n+k} \to \mathbb{R}^n, \quad \operatorname{proj}_{n+k,n} : (x_i)_{i=1}^{n+k} \mapsto (x_i)_{i=1}^n;$$
5. (immersion function) $\forall n \in \mathbb{N}$, $\forall k \in \mathbb{N} \setminus \{0\}$,
$$i_{n,n+k} : \mathbb{R}^n \to \mathbb{R}^{n+k}, \quad i_{n,n+k} : (x_i)_{i=1}^n \mapsto ((x_i)_{i=1}^n, 0), \text{ with } 0 \in \mathbb{R}^k.$$

Example 258 Taken $A \in M(m, n)$, then
$$l : \mathbb{R}^n \to \mathbb{R}^m, \quad l(x) = Ax$$
is a linear function, as shown in part 3 of Remark 76.
Example 259 Let $V$ be the vector space of polynomials in the variable $t$. The following functions are linear.
1. The derivative defines a function $D : V \to V$ as $D : p \mapsto p'$, where $p'$ is the derivative function of $p$.
2. The definite integral from 0 to 1 defines a function $i : V \to \mathbb{R}$ as
$$i : p \mapsto \int_0^1 p(t)\, dt.$$
Proposition 260 If $l \in L(V, U)$ is invertible, then its inverse $l^{-1}$ is linear.

Proof. Take arbitrary $u, u' \in U$ and $\alpha, \beta \in F$. Then, since $l$ is onto, there exist $v, v' \in V$ such that
$$l(v) = u \quad \text{and} \quad l(v') = u',$$
and, by definition of inverse function,
$$l^{-1}(u) = v \quad \text{and} \quad l^{-1}(u') = v'.$$
Then
$$\alpha u + \beta u' = \alpha l(v) + \beta l(v') = l(\alpha v + \beta v'),$$
where the last equality comes from the linearity of $l$. Then, again by definition of inverse,
$$l^{-1}(\alpha u + \beta u') = \alpha v + \beta v' = \alpha l^{-1}(u) + \beta l^{-1}(u').$$
7.2 Kernel and Image of a linear function

Definition 261 Assume that $l \in L(V, U)$. The kernel of $l$, denoted by $\ker l$, is the set
$$\{v \in V : l(v) = 0\} = l^{-1}(0).$$
The image of $l$, denoted by $\operatorname{Im} l$, is the set
$$\{u \in U : \exists v \in V \text{ such that } l(v) = u\} = l(V).$$
Proposition 262 Given $l \in L(V, U)$, $\ker l$ is a vector subspace of $V$ and $\operatorname{Im} l$ is a vector subspace of $U$.

Proof. Take $v^1, v^2 \in \ker l$ and $\alpha, \beta \in F$. Then
$$l(\alpha v^1 + \beta v^2) = \alpha l(v^1) + \beta l(v^2) = 0,$$
i.e., $\alpha v^1 + \beta v^2 \in \ker l$.
Take $w^1, w^2 \in \operatorname{Im} l$ and $\alpha, \beta \in F$. Then, for $i \in \{1, 2\}$, $\exists v^i \in V$ such that $l(v^i) = w^i$. Moreover,
$$\alpha w^1 + \beta w^2 = \alpha l(v^1) + \beta l(v^2) = l(\alpha v^1 + \beta v^2),$$
i.e., $\alpha w^1 + \beta w^2 \in \operatorname{Im} l$.
Proposition 263 If $\operatorname{span}\left(v^1, ..., v^n\right) = V$ and $l \in L(V, U)$, then $\operatorname{span}\left(l(v^i)\right)_{i=1}^n = \operatorname{Im} l$.

Proof. Taken $u \in \operatorname{Im} l$, there exists $v \in V$ such that $l(v) = u$. Moreover, $\exists (\alpha_i)_{i=1}^n \in \mathbb{R}^n$ such that $v = \sum_{i=1}^n \alpha_i v^i$. Then
$$u = l(v) = l\left(\sum_{i=1}^n \alpha_i v^i\right) = \sum_{i=1}^n \alpha_i l(v^i),$$
as desired.

Remark 264 From the previous proposition, we have that if $\left(v^1, ..., v^n\right)$ is a basis of $V$, then
$$n \geq \dim \operatorname{span}\left(l(v^i)\right)_{i=1}^n = \dim \operatorname{Im} l.$$
Example 265 Let $V$ be the vector space of polynomials and $D^3 : V \to V$, $p \mapsto p'''$, i.e., the third derivative of $p$. Then
$$\ker D^3 = \text{set of polynomials of degree} \leq 2,$$
since $D^3(at^2 + bt + c) = 0$ and $D^3(t^n) \neq 0$ for $n > 2$. Moreover,
$$\operatorname{Im} D^3 = V,$$
since every polynomial is the third derivative of some polynomial.
Proposition 266 (Dimension Theorem) If $V$ is a finite dimensional vector space and $l \in L(V, U)$, then
$$\dim V = \dim \ker l + \dim \operatorname{Im} l.$$

Proof. (Idea of the proof:
1. using a basis of $\ker l$ (with $n_1$ elements) and a basis of $\operatorname{Im} l$ (with $n_2$ elements), construct a set with $n_1 + n_2$ elements which generates $V$;
2. show that the set is linearly independent (by contradiction), and therefore a basis of $V$, and therefore $\dim V = n_1 + n_2$.)
Since $\ker l \subseteq V$ and from Remark 264, $\ker l$ and $\operatorname{Im} l$ have finite dimension. Therefore, we can define $n_1 = \dim \ker l$ and $n_2 = \dim \operatorname{Im} l$.
Take an arbitrary $v \in V$. Let
$$\left(w^1, .., w^{n_1}\right) \text{ be a basis of } \ker l \tag{7.1}$$
and
$$\left(u^1, .., u^{n_2}\right) \text{ be a basis of } \operatorname{Im} l. \tag{7.2}$$
Then,
$$\forall i \in \{1, .., n_2\}, \; \exists v^i \in V \text{ such that } u^i = l(v^i). \tag{7.3}$$
From (7.2),
$$\exists c = (c_i)_{i=1}^{n_2} \text{ such that } l(v) = \sum_{i=1}^{n_2} c_i u^i. \tag{7.4}$$
Then, from (7.4) and (7.3), we get
$$l(v) = \sum_{i=1}^{n_2} c_i u^i = \sum_{i=1}^{n_2} c_i l(v^i),$$
and, from the linearity of $l$,
$$0 = l(v) - \sum_{i=1}^{n_2} c_i l(v^i) = l(v) - l\left(\sum_{i=1}^{n_2} c_i v^i\right) = l\left(v - \sum_{i=1}^{n_2} c_i v^i\right),$$
i.e.,
$$v - \sum_{i=1}^{n_2} c_i v^i \in \ker l. \tag{7.5}$$
From (7.1),
$$\exists (d_j)_{j=1}^{n_1} \text{ such that } v - \sum_{i=1}^{n_2} c_i v^i = \sum_{j=1}^{n_1} d_j w^j.$$
Summarizing, we have
$$\forall v \in V, \; \exists (c_i)_{i=1}^{n_2} \text{ and } (d_j)_{j=1}^{n_1} \text{ such that } v = \sum_{i=1}^{n_2} c_i v^i + \sum_{j=1}^{n_1} d_j w^j.$$
Therefore, we found $n_1 + n_2$ vectors which generate $V$; if we show that they are l.i., then, by definition, they are a basis and therefore $n = n_1 + n_2$, as desired.
We want to show that
$$\sum_{i=1}^{n_2} \gamma_i v^i + \sum_{j=1}^{n_1} \delta_j w^j = 0 \;\Rightarrow\; (\gamma_i)_{i=1}^{n_2} = 0 \text{ and } (\delta_j)_{j=1}^{n_1} = 0. \tag{7.6}$$
Then
$$0 = l\left(\sum_{i=1}^{n_2} \gamma_i v^i + \sum_{j=1}^{n_1} \delta_j w^j\right) = \sum_{i=1}^{n_2} \gamma_i l(v^i) + \sum_{j=1}^{n_1} \delta_j l(w^j).$$
From (7.1), i.e., $\left(w^1, .., w^{n_1}\right)$ is a basis of $\ker l$, and from (7.3), we get
$$\sum_{i=1}^{n_2} \gamma_i u^i = 0.$$
From (7.2), i.e., $\left(u^1, .., u^{n_2}\right)$ is a basis of $\operatorname{Im} l$,
$$(\gamma_i)_{i=1}^{n_2} = 0. \tag{7.7}$$
But from the assumption in (7.6) and (7.7), we have that
$$\sum_{j=1}^{n_1} \delta_j w^j = 0,$$
and since $\left(w^1, .., w^{n_1}\right)$ is a basis of $\ker l$, we get also that $(\delta_j)_{j=1}^{n_1} = 0$, as desired.
Example 267 Let $V$ and $U$ be vector spaces, with $\dim V = n$. In 1. and 2. below, we verify the statement of the Dimension Theorem; in 3. and 4., we use that statement.
1. Identity function $\operatorname{id}_V$:
$$\dim \operatorname{Im} \operatorname{id}_V = n, \quad \dim \ker \operatorname{id}_V = 0.$$
2. Null function $0 \in L(V, U)$:
$$\dim \operatorname{Im} 0 = 0, \quad \dim \ker 0 = n.$$
3. $l \in L(\mathbb{R}^2, \mathbb{R})$, $l((x_1, x_2)) = x_1$. Since $\ker l = \{(x_1, x_2) \in \mathbb{R}^2 : x_1 = 0\}$, $\{(0, 1)\}$ is a basis¹ of $\ker l$ and
$$\dim \ker l = 1, \quad \dim \operatorname{Im} l = 2 - 1 = 1.$$
4. $l \in L(\mathbb{R}^3, \mathbb{R}^2)$,
$$l((x_1, x_2, x_3)) = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
Since
$$\operatorname{Im} l = \{y \in \mathbb{R}^2 : \exists x \in \mathbb{R}^3 \text{ such that } Ax = y\} = \operatorname{col\,span} A,$$
and since the first two column vectors of $A$ are linearly independent, we have that
$$\dim \operatorname{Im} l = 2, \quad \dim \ker l = 3 - 2 = 1.$$

¹In Remark 335 we will present an algorithm to compute a basis of $\ker l$.
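A numerical check of item 4 and of the Dimension Theorem (an added sketch; `null_space_basis` is a hypothetical helper that extracts the kernel from the singular value decomposition, a tool not developed in these notes):

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Columns form a basis of ker l_A = {x : Ax = 0}, via the SVD."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

A = np.array([[1., 2., 3.],
              [0., 1., 0.]])
N = null_space_basis(A)
print(N.shape[1])              # 1 = dim ker l, and 3 = dim ker l + rank A
print(np.allclose(A @ N, 0))   # True
```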
7.3 Nonsingular functions and isomorphisms

Definition 268 $l \in L(V, U)$ is singular if $\exists v \in V \setminus \{0\}$ such that $l(v) = 0$.

Remark 269 Thus $l \in L(V, U)$ is nonsingular if $\forall v \in V \setminus \{0\}$, $l(v) \neq 0$, i.e., $\ker l = \{0\}$. Briefly, $l \in L(V, U)$ is nonsingular $\Leftrightarrow \ker l = \{0\}$.

Remark 270 In Remark 301, we will discuss the relationship between singular matrices and singular linear functions.
Proposition 271 If $l \in L(V, U)$ is nonsingular, then the image of any linearly independent set is linearly independent.

Proof. Suppose $\left(v^1, ..., v^n\right)$ are linearly independent. We want to show that $\left(l(v^1), ..., l(v^n)\right)$ are linearly independent as well. Suppose
$$\sum_{i=1}^n \alpha_i l(v^i) = 0.$$
Then
$$l\left(\sum_{i=1}^n \alpha_i v^i\right) = 0,$$
and therefore $\sum_{i=1}^n \alpha_i v^i \in \ker l = \{0\}$, where the last equality comes from the fact that $l$ is nonsingular. Then $\sum_{i=1}^n \alpha_i v^i = 0$ and, since $\left(v^1, ..., v^n\right)$ are linearly independent, $(\alpha_i)_{i=1}^n = 0$, as desired.
Definition 272 Let two vector spaces $V$ and $U$ be given. $U$ is isomorphic to $V$ if there exists a function $l \in L(V, U)$ which is one-to-one and onto. $l$ is called an isomorphism from $V$ to $U$.

Remark 273 By definition of isomorphism, if $l$ is an isomorphism, then $l$ is invertible and therefore, from Proposition 260, $l^{-1}$ is linear.

Remark 274 To be isomorphic is an equivalence relation.

Proposition 275 Any $n$-dimensional vector space $V$ on a field $F$ is isomorphic to $F^n$.
Proof. Since $V$ and $F^n$ are vector spaces, we are left with showing that there exists an isomorphism between them. Let $v = \left(v^1, ..., v^n\right)$ be a basis of $V$. Define
$$cr : V \to F^n, \quad v \mapsto [v]_v,$$
where $cr$ stands for "coordinates".
1. $cr$ is linear. Given $v, w \in V$, suppose
$$v = \sum_{i=1}^n a_i v^i \quad \text{and} \quad w = \sum_{i=1}^n b_i v^i,$$
i.e., $[v]_v = [a_i]_{i=1}^n$ and $[w]_v = [b_i]_{i=1}^n$. Then, $\forall \alpha, \beta \in F$,
$$\alpha v + \beta w = \alpha \sum_{i=1}^n a_i v^i + \beta \sum_{i=1}^n b_i v^i = \sum_{i=1}^n (\alpha a_i + \beta b_i) v^i,$$
i.e.,
$$[\alpha v + \beta w]_v = \alpha [a_i]_{i=1}^n + \beta [b_i]_{i=1}^n = \alpha [v]_v + \beta [w]_v.$$
2. $cr$ is onto: $\forall (a_i)_{i=1}^n \in F^n$, $cr\left(\sum_{i=1}^n a_i v^i\right) = (a_i)_{i=1}^n$.
3. $cr$ is one-to-one: $cr(v) = cr(w)$ implies that $v = w$, simply because $v = \sum_{i=1}^n cr_i(v)\, v^i$ and $w = \sum_{i=1}^n cr_i(w)\, v^i$.
Proposition 276 Let $V$ and $U$ be finite dimensional vector spaces on the same field $F$ such that $S = \left(v^1, ..., v^n\right)$ is a basis of $V$ and $\left(u^1, ..., u^n\right)$ is a set of arbitrary vectors in $U$. Then there exists a unique linear function $l : V \to U$ such that $\forall i \in \{1, ..., n\}$, $l(v^i) = u^i$.

Proof. The proof goes through the following three steps:
1. define $l$;
2. show that $l$ is linear;
3. show that $l$ is unique.
1. Using Definition 193, $\forall v \in V$, define
$$l : V \to U, \quad v \mapsto \sum_{i=1}^n [v]^i_S\, u^i.$$
Then $\forall j \in \{1, ..., n\}$, $\left[v^j\right]_S = e^j_n$² and $l(v^j) = \sum_{i=1}^n \left(e^j_n\right)_i u^i = u^j$.
2. Let $v, w \in V$ and $\alpha, \beta \in F$. Then
$$l(\alpha v + \beta w) = \sum_{i=1}^n [\alpha v + \beta w]^i_S\, u^i = \sum_{i=1}^n \left( \alpha [v]^i_S + \beta [w]^i_S \right) u^i = \alpha \sum_{i=1}^n [v]^i_S\, u^i + \beta \sum_{i=1}^n [w]^i_S\, u^i,$$
where the next-to-last equality follows from the linearity of $[\cdot]_S$ - see the proof of Proposition 275.
3. Suppose $g \in L(V, U)$ and $\forall i \in \{1, ..., n\}$, $g(v^i) = u^i$. Then $\forall v \in V$,
$$g(v) = g\left(\sum_{i=1}^n [v]^i_S\, v^i\right) = \sum_{i=1}^n [v]^i_S\, g(v^i) = \sum_{i=1}^n [v]^i_S\, u^i = l(v),$$
where the last equality follows from the definition of $l$.

²Recall that $e^i_n \in \mathbb{R}^n$ is the $i$-th element in the canonical basis of $\mathbb{R}^n$.
Remark 277 Observe that if $V$ and $U$ are finite (nonzero) dimensional vector spaces, there is a multitude of functions from $V$ to $U$. The above Proposition says that linear functions are completely determined by what they do to the elements of a basis of $V$.

Proposition 278 Assume that $l \in L(V, U)$. Then
$$l \text{ is one-to-one} \Leftrightarrow l \text{ is nonsingular.}$$

Proof. $[\Rightarrow]$ Take $v \in \ker l$. Then
$$l(v) = 0 = l(0),$$
where the last equality follows from Remark 256. Since $l$ is one-to-one, $v = 0$.
$[\Leftarrow]$ If $l(v) = l(w)$, then $l(v - w) = 0$ and, since $l$ is nonsingular, $v - w = 0$.
Proposition 279 Assume that $V$ and $U$ are finite dimensional vector spaces and $l \in L(V, U)$. Then,
1. $l$ is one-to-one $\Rightarrow \dim V \leq \dim U$;
2. $l$ is onto $\Rightarrow \dim V \geq \dim U$;
3. $l$ is invertible $\Rightarrow \dim V = \dim U$.

Proof. 1. Since $l$ is one-to-one, from the previous Proposition, $\dim \ker l = 0$. Then, from Proposition 266, $\dim V = \dim \operatorname{Im} l$. Since $\operatorname{Im} l$ is a subspace of $U$, then $\dim \operatorname{Im} l \leq \dim U$.
2. Since $l$ is onto iff $\operatorname{Im} l = U$, from Proposition 266 we get
$$\dim V = \dim \ker l + \dim U \geq \dim U.$$
3. $l$ is invertible iff $l$ is one-to-one and onto.

Proposition 280 Let $V$ and $U$ be finite dimensional vector spaces on the same field $F$. Then,
$$U \text{ and } V \text{ are isomorphic} \Leftrightarrow \dim U = \dim V.$$

Proof. $[\Rightarrow]$ It follows from the definition of isomorphism and part 3 of the previous Proposition.
$[\Leftarrow]$ Assume that $V$ and $U$ are vector spaces such that $\dim V = \dim U = n$. Then, from Proposition 275, $V$ and $U$ are isomorphic to $F^n$ and, from Remark 274, the result follows.
Proposition 281 Suppose $V$ and $U$ are vector spaces such that $\dim V = \dim U = n$ and $l \in L(V, U)$. Then the following statements are equivalent:
1. $l$ is nonsingular, i.e., $\ker l = \{0\}$;
2. $l$ is one-to-one;
3. $l$ is onto;
4. $l$ is an isomorphism.

Proof. $[1 \Leftrightarrow 2]$ It is the content of Proposition 278.
$[1 \Rightarrow 3]$ Since $l$ is nonsingular, then $\ker l = \{0\}$ and $\dim \ker l = 0$. Then, from Proposition 266, i.e., $\dim V = \dim \ker l + \dim \operatorname{Im} l$, and the fact $\dim V = \dim U$, we get $\dim U = \dim \operatorname{Im} l$. Since $\operatorname{Im} l \subseteq U$ and $U$ is finite dimensional, from Proposition 181, $\operatorname{Im} l = U$, i.e., $l$ is onto, as desired.
$[3 \Rightarrow 1]$ Since $l$ is onto, $\dim \operatorname{Im} l = \dim V$ and, from Proposition 266, $\dim V = \dim \ker l + \dim V$, and therefore $\dim \ker l = 0$, i.e., $l$ is nonsingular.
$[1 \Rightarrow 4]$ It follows from the definition of isomorphism and the facts that $[1 \Rightarrow 2]$ and $[1 \Rightarrow 3]$.
$[4 \Rightarrow 1]$ It follows from the definition of isomorphism and the fact that $[2 \Rightarrow 1]$.
Chapter 8

Linear functions and matrices

In Remark 66 we have seen that the set of $m \times n$ matrices with the standard sum and scalar multiplication is a vector space, called $M(m, n)$. We are going to show that:
1. the set $L(V, U)$ with naturally defined sum and scalar multiplication is a vector space, called $\mathcal{L}(V, U)$;
2. if $\dim V = n$ and $\dim U = m$, then $\mathcal{L}(V, U)$ and $M(m, n)$ are isomorphic.

8.1 From a linear function to the associated matrix

Definition 282 Suppose $V$ and $U$ are vector spaces over a field $F$, $l_1, l_2 \in L(V, U)$ and $\alpha \in F$. Define
$$l_1 + l_2 : V \to U, \quad v \mapsto l_1(v) + l_2(v),$$
$$\alpha l_1 : V \to U, \quad v \mapsto \alpha l_1(v).$$

Proposition 283 $L(V, U)$ with the above defined operations is a vector space, denoted by $\mathcal{L}(V, U)$.

Proof. Exercise.

Remark 284 Compositions of linear functions are linear.
Definition 285 Suppose $l \in L(V, U)$, $v = \left(v^1, ..., v^n\right)$ is a basis of $V$ and $u = \left(u^1, ..., u^m\right)$ is a basis of $U$. Then
$$[l]^u_v := \left[\, \left[l(v^1)\right]_u \; ... \; \left[l(v^j)\right]_u \; ... \; \left[l(v^n)\right]_u \,\right] \in M(m, n), \tag{8.1}$$
where, for any $j \in \{1, ..., n\}$, $\left[l(v^j)\right]_u$ is a column vector, is called the matrix representation of $l$ relative to the bases $v$ and $u$. In words, $[l]^u_v$ is the matrix whose columns are the coordinates, relative to the basis of the codomain of $l$, of the images of each vector in the basis of the domain of $l$.
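A numerical sketch of Definition 285 (an added illustration with made-up bases; the $j$-th column of $[l]^u_v$ is the coordinate vector of $l(v^j)$ in the basis $u$, obtained by solving $Uy = l(v^j)$, where $U$ stacks the basis $u$ as columns):

```python
import numpy as np

# l : R^2 -> R^2, l(x) = A x  (a sample linear map in canonical coordinates)
A = np.array([[2., 1.],
              [0., 3.]])

V = np.array([[1., 1.],    # columns: basis v = (v^1, v^2)
              [0., 1.]])
U = np.array([[1., 0.],    # columns: basis u = (u^1, u^2)
              [1., 1.]])

# Column j of [l]^u_v is [l(v^j)]_u, i.e., the solution of U y = A v^j.
L = np.linalg.solve(U, A @ V)

# Check Proposition 287 below: [l]^u_v [x]_v = [l(x)]_u.
x = np.array([2., 5.])
x_v = np.linalg.solve(V, x)
assert np.allclose(L @ x_v, np.linalg.solve(U, A @ x))
```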
Definition 286 Suppose $l \in L(V, U)$, $v = \left(v^1, ..., v^n\right)$ is a basis of $V$ and $u = \left(u^1, ..., u^m\right)$ is a basis of $U$. Define
$$\Phi^u_v : L(V, U) \to M(m, n), \quad l \mapsto [l]^u_v,$$
as in (8.1). If no confusion may arise, we will denote $\Phi^u_v$ simply by $\Phi$.
The proposition below shows that, multiplying the coordinate vector of $v$ relative to the basis $v$ by the matrix $[l]^u_v$, we get the coordinate vector of $l(v)$ relative to the basis $u$.

Proposition 287 $\forall v \in V$,
$$[l]^u_v [v]_v = [l(v)]_u \tag{8.2}$$

Proof. Assume $v \in V$. First of all, observe that
$$[l]^u_v [v]_v = \left[\, \left[l(v^1)\right]_u \; ... \; \left[l(v^j)\right]_u \; ... \; \left[l(v^n)\right]_u \,\right] \begin{bmatrix} [v]^1_v \\ \vdots \\ [v]^j_v \\ \vdots \\ [v]^n_v \end{bmatrix} = \sum_{j=1}^n [v]^j_v \left[l(v^j)\right]_u.$$
Moreover, from the linearity of the function $cr_u := [\cdot]_u$, and using the fact that the composition of linear functions is a linear function, we get:
$$[l(v)]_u = cr_u(l(v)) = (cr_u \circ l)\left(\sum_{j=1}^n [v]^j_v\, v^j\right) = \sum_{j=1}^n [v]^j_v\, (cr_u \circ l)(v^j) = \sum_{j=1}^n [v]^j_v \left[l(v^j)\right]_u.$$
8.2 From a matrix to the associated linear function

Given $A \in M(m, n)$, recall that $\forall i \in \{1, .., m\}$, $R_i(A)$ denotes the $i$-th row vector of $A$, i.e.,
$$A = \begin{bmatrix} R_1(A) \\ \vdots \\ R_i(A) \\ \vdots \\ R_m(A) \end{bmatrix} \in M(m,n).$$

Definition 288 Consider vector spaces $V$ and $U$ with bases $v = \left(v^1, ..., v^n\right)$ and $u = \left(u^1, ..., u^m\right)$, respectively. Given $A \in M(m, n)$, define
$$l^u_{A,v} : V \to U, \quad v \mapsto \sum_{i=1}^m \left(R_i(A)\, [v]_v\right) u^i.$$
Proposition 289 $l^u_{A,v}$ defined above is linear, i.e., $l^u_{A,v} \in L(V, U)$.

Proof. $\forall \alpha, \beta \in \mathbb{R}$ and $\forall v^1, v^2 \in V$,
$$l^u_{A,v}(\alpha v^1 + \beta v^2) = \sum_{i=1}^m \left(R_i(A)\, [\alpha v^1 + \beta v^2]_v\right) u^i = \sum_{i=1}^m \left(R_i(A)\left(\alpha [v^1]_v + \beta [v^2]_v\right)\right) u^i =$$
$$= \alpha \sum_{i=1}^m \left(R_i(A)\, [v^1]_v\right) u^i + \beta \sum_{i=1}^m \left(R_i(A)\, [v^2]_v\right) u^i = \alpha\, l^u_{A,v}(v^1) + \beta\, l^u_{A,v}(v^2),$$
where the second equality follows from the proof of Proposition 275.
Definition 290 Given the vector spaces $V$ and $U$ with bases $v = \left(v^1, ..., v^n\right)$ and $u = \left(u^1, ..., u^m\right)$, respectively, define
$$\Psi^u_v : M(m, n) \to L(V, U), \quad A \mapsto l^u_{A,v}.$$
If no confusion may arise, we will denote $\Psi^u_v$ simply by $\Psi$.
Proposition 291 $\Psi$ defined above is linear.

Proof. We want to show that $\forall \alpha, \beta \in \mathbb{R}$ and $\forall A, B \in M(m, n)$,
$$\Psi(\alpha A + \beta B) = \alpha \Psi(A) + \beta \Psi(B),$$
i.e.,
$$l^u_{\alpha A + \beta B, v} = \alpha\, l^u_{A,v} + \beta\, l^u_{B,v},$$
i.e., $\forall v \in V$,
$$l^u_{\alpha A + \beta B, v}(v) = \alpha\, l^u_{A,v}(v) + \beta\, l^u_{B,v}(v).$$
Now,
$$l^u_{\alpha A + \beta B, v}(v) = \sum_{i=1}^m \left( (\alpha R_i(A) + \beta R_i(B))\, [v]_v \right) u^i = \alpha \sum_{i=1}^m \left(R_i(A)\, [v]_v\right) u^i + \beta \sum_{i=1}^m \left(R_i(B)\, [v]_v\right) u^i = \alpha\, l^u_{A,v}(v) + \beta\, l^u_{B,v}(v),$$
where the first equality comes from Definition 288.
8.3 M(m, n) and L(V, U) are isomorphic

Proposition 292 Given vector spaces $V$ and $U$ with dimension $n$ and $m$, respectively, $M(m, n)$ and $\mathcal{L}(V, U)$ are isomorphic.

Proof. Linearity of the two spaces was proved above. We now want to check that $\Psi$, presented in Definition 290, is an isomorphism, i.e., that $\Psi$ is linear, one-to-one and onto. In fact, thanks to Proposition 291, it is enough to show that $\Psi$ is invertible.
First proof.
1. $\Psi$ is one-to-one: see Theorem 2, page 105 in Lang (1971);
2. $\Psi$ is onto: see bottom of page 107 in Lang (1971).
Second proof. We show that $\Phi$ is the inverse of $\Psi$.
1. $\Psi \circ \Phi = \operatorname{id}_{L(V,U)}$.
Given $l \in L(V, U)$, we want to show that $\forall v \in V$,
$$l(v) = ((\Psi \circ \Phi)(l))(v),$$
i.e., from Proposition 275,
$$[l(v)]_u = [((\Psi \circ \Phi)(l))(v)]_u.$$
First of all, observe that from (8.2) we have
$$[l(v)]_u = [l]^u_v [v]_v.$$
Moreover,
$$[((\Psi \circ \Phi)(l))(v)]_u \overset{(1)}{=} \left[ \left( \Psi\left([l]^u_v\right) \right)(v) \right]_u \overset{(2)}{=} \left[ \sum_{i=1}^m \left( R_i\left([l]^u_v\right) [v]_v \right) u^i \right]_u \overset{(3)}{=} \left( R_i\left([l]^u_v\right) [v]_v \right)_{i=1}^m \overset{(4)}{=} [l]^u_v [v]_v,$$
where (1) comes from the definition of $\Phi$, (2) from the definition of $\Psi$, (3) from the definition of $[\cdot]_u$, and (4) from the definition of the product between matrices.
2. $\Phi \circ \Psi = \operatorname{id}_{M(m,n)}$.
Given $A \in M(m, n)$, we want to show that $(\Phi \circ \Psi)(A) = A$. By definition of $\Psi$,
$$\Psi(A) = l^u_{A,v} \quad \text{such that} \quad \forall v \in V, \; l^u_{A,v}(v) = \sum_{i=1}^m \left(R_i(A)\, [v]_v\right) u^i. \tag{8.3}$$
By definition of $\Phi$,
$$\Phi(\Psi(A)) = \left[l^u_{A,v}\right]^u_v.$$
Therefore, we want to show that $\left[l^u_{A,v}\right]^u_v = A$. Observe that, from (8.3), since $\left[v^j\right]_v = e^j_n$,
$$l^u_{A,v}(v^1) = \sum_{i=1}^m R_i(A)\left[v^1\right]_v u^i = \sum_{i=1}^m [a_{i1}, ..., a_{ij}, ..., a_{in}]\,[1, ..., 0, ..., 0]^T u^i = a_{11}u^1 + ... + a_{i1}u^i + ... + a_{m1}u^m,$$
$$\vdots$$
$$l^u_{A,v}(v^j) = \sum_{i=1}^m R_i(A)\left[v^j\right]_v u^i = \sum_{i=1}^m [a_{i1}, ..., a_{ij}, ..., a_{in}]\,[0, ..., 1, ..., 0]^T u^i = a_{1j}u^1 + ... + a_{ij}u^i + ... + a_{mj}u^m,$$
$$\vdots$$
$$l^u_{A,v}(v^n) = \sum_{i=1}^m R_i(A)\left[v^n\right]_v u^i = \sum_{i=1}^m [a_{i1}, ..., a_{ij}, ..., a_{in}]\,[0, ..., 0, ..., 1]^T u^i = a_{1n}u^1 + ... + a_{in}u^i + ... + a_{mn}u^m.$$
(From the above, it is clear why in Definition 285 we take the transpose.) Therefore,
$$\left[l^u_{A,v}\right]^u_v = \begin{bmatrix} a_{11} & ... & a_{1j} & ... & a_{1n} \\ & & \vdots & & \\ a_{i1} & & a_{ij} & & a_{in} \\ & & \vdots & & \\ a_{m1} & ... & a_{mj} & ... & a_{mn} \end{bmatrix} = A,$$
as desired.
Proposition 293 Let the following objects be given:
1. vector spaces $V$ with basis $v = \left(v^1, ..., v^j, ..., v^n\right)$, $U$ with basis $u = \left(u^1, ..., u^i, ..., u^m\right)$ and $W$ with basis $w = \left(w^1, ..., w^k, ..., w^p\right)$;
2. $l_1 \in L(V, U)$ and $l_2 \in L(U, W)$.
Then
$$[l_2 \circ l_1]^w_v = [l_2]^w_u \cdot [l_1]^u_v,$$
or
$$\Phi^w_v(l_2 \circ l_1) = \Phi^w_u(l_2) \cdot \Phi^u_v(l_1).$$
Proof. By definition¹,
$$[l_1]^u_v = \left[\, \left[l_1(v^1)\right]_u \; ... \; \left[l_1(v^j)\right]_u \; ... \; \left[l_1(v^n)\right]_u \,\right] \in M(m,n) := \left[ l_1^{ij} \right]_{i \in \{1,...,m\},\, j \in \{1,...,n\}} := A,$$
and therefore $\forall j \in \{1, ..., n\}$, $l_1(v^j) = \sum_{i=1}^m l_1^{ij}\, u^i$.
Similarly,
$$[l_2]^w_u = \left[\, \left[l_2(u^1)\right]_w \; ... \; \left[l_2(u^i)\right]_w \; ... \; \left[l_2(u^m)\right]_w \,\right] \in M(p,m) := \left[ l_2^{ki} \right]_{k \in \{1,...,p\},\, i \in \{1,...,m\}} := B,$$
and therefore $\forall i \in \{1, ..., m\}$, $l_2(u^i) = \sum_{k=1}^p l_2^{ki}\, w^k$.
Moreover, defined $l := l_2 \circ l_1$, we get
$$[l_2 \circ l_1]^w_v = \left[\, \left[l(v^1)\right]_w \; ... \; \left[l(v^j)\right]_w \; ... \; \left[l(v^n)\right]_w \,\right] \in M(p,n) := \left[ l^{kj} \right]_{k \in \{1,...,p\},\, j \in \{1,...,n\}} := C,$$
and therefore $\forall j \in \{1, ..., n\}$, $l(v^j) = \sum_{k=1}^p l^{kj}\, w^k$.
Now, $\forall j \in \{1, ..., n\}$,
$$l(v^j) = (l_2 \circ l_1)(v^j) = l_2\left(l_1(v^j)\right) = l_2\left(\sum_{i=1}^m l_1^{ij}\, u^i\right) = \sum_{i=1}^m l_1^{ij}\, l_2(u^i) = \sum_{i=1}^m l_1^{ij} \sum_{k=1}^p l_2^{ki}\, w^k = \sum_{k=1}^p \left( \sum_{i=1}^m l_2^{ki}\, l_1^{ij} \right) w^k.$$
The above says that, $\forall j \in \{1, ..., n\}$, the $j$-th column of $C$ is
$$\begin{bmatrix} \sum_{i=1}^m l_2^{1i}\, l_1^{ij} \\ \vdots \\ \sum_{i=1}^m l_2^{ki}\, l_1^{ij} \\ \vdots \\ \sum_{i=1}^m l_2^{pi}\, l_1^{ij} \end{bmatrix}.$$
On the other hand, the $j$-th column of $B \cdot A$ is
$$\begin{bmatrix} [\text{1st row of } B] \cdot [j\text{-th column of } A] \\ \vdots \\ [k\text{-th row of } B] \cdot [j\text{-th column of } A] \\ \vdots \\ [p\text{-th row of } B] \cdot [j\text{-th column of } A] \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m l_2^{1i}\, l_1^{ij} \\ \vdots \\ \sum_{i=1}^m l_2^{ki}\, l_1^{ij} \\ \vdots \\ \sum_{i=1}^m l_2^{pi}\, l_1^{ij} \end{bmatrix},$$
as desired.

¹$\left[l_1(v^1)\right]_u, .., \left[l_1(v^n)\right]_u$ are column vectors.
8.4 Some related properties of a linear function and associated matrix

In this section, the following objects will be given: a vector space $V$ with a basis $v = \left(v^1, ..., v^n\right)$; a vector space $U$ with a basis $u = \left(u^1, ..., u^m\right)$; $l \in L(V, U)$ and $A \in M(m, n)$.
From Remark 145, recall that
$$\operatorname{col\,span} A = \{z \in \mathbb{R}^m : \exists x \in \mathbb{R}^n \text{ such that } z = Ax\}.$$

Lemma 294 $cr_u(\operatorname{Im} l) = \operatorname{col\,span} [l]^u_v$.

Proof. $[\subseteq]$
$y \in cr_u(\operatorname{Im} l)$ $\overset{\text{def } cr_u}{\Rightarrow}$ $\exists u \in \operatorname{Im} l$ such that $cr_u(u) = [u]_u = y$ $\overset{\text{def Im} l}{\Rightarrow}$ $\exists v \in V$ such that $l(v) = u$ $\Rightarrow$ $\exists v \in V$ such that $[l(v)]_u = y$ $\overset{\text{Prop. 287}}{\Rightarrow}$ $\exists v \in V$ such that $[l]^u_v [v]_v = y$ $\Rightarrow$ $y \in \operatorname{col\,span} [l]^u_v$.
$[\supseteq]$
We want to show that $y \in \operatorname{col\,span} [l]^u_v \Rightarrow y \in cr_u(\operatorname{Im} l)$, i.e., $\exists u \in \operatorname{Im} l$ such that $y = [u]_u$.
$y \in \operatorname{col\,span} [l]^u_v$ $\Rightarrow$ $\exists x_y \in \mathbb{R}^n$ such that $[l]^u_v x_y = y$ $\overset{\text{def } cr}{\Rightarrow}$ $\exists v = \sum_{j=1}^n x_{y,j}\, v^j$ such that $[l]^u_v [v]_v = y$ $\overset{\text{Prop. 287}}{\Rightarrow}$ $\exists v \in V$ such that $[l(v)]_u = y$ $\overset{u = l(v)}{\Rightarrow}$ $\exists u \in \operatorname{Im} l$ such that $y = [u]_u$, as desired.

Lemma 295 $\dim \operatorname{Im} l = \dim \operatorname{col\,span} [l]^u_v$.

Proof. It follows from the above Lemma, from the proof of Proposition 275, which says that $cr_u$ is an isomorphism, and from Proposition 280, which says that isomorphic spaces have the same dimension.
Proposition 296 Given $l \in L(V, U)$:
1. $l$ onto $\Leftrightarrow$ $\operatorname{rank} [l]^u_v = \dim U$;
2. $l$ one-to-one $\Leftrightarrow$ $\operatorname{rank} [l]^u_v = \dim V$;
3. $l$ invertible $\Leftrightarrow$ $[l]^u_v$ invertible, and in that case $\left[l^{-1}\right]^v_u = \left([l]^u_v\right)^{-1}$.

Proof. Recall that, from Remark 269 and Proposition 278, $l$ one-to-one $\Leftrightarrow$ $l$ nonsingular $\Leftrightarrow$ $\ker l = \{0\}$.
1. $l$ onto $\Leftrightarrow$ $\operatorname{Im} l = U$ $\Leftrightarrow$ $\dim \operatorname{Im} l = \dim U$ $\overset{\text{Lemma 295}}{\Leftrightarrow}$ $\dim \operatorname{col\,span} [l]^u_v = \dim U$ $\overset{\text{Remark 234}}{\Leftrightarrow}$ desired result.
2. $l$ one-to-one $\overset{\text{Proposition 266}}{\Leftrightarrow}$ $\dim \operatorname{Im} l = \dim V$ $\overset{\text{Lemma 295}}{\Leftrightarrow}$ $\dim \operatorname{col\,span} [l]^u_v = \dim V$ $\overset{\text{Remark 234}}{\Leftrightarrow}$ desired result.
3. The first part of the statement follows from 1. and 2. above. The second part is proven below. First of all, observe that for any vector space $W$ with a basis $w = \left(w^1, .., w^k\right)$, $\operatorname{id}_W \in L(W, W)$ and
$$[\operatorname{id}_W]^w_w = \left[\, \left[\operatorname{id}_W(w^1)\right]_w, .., \left[\operatorname{id}_W(w^k)\right]_w \,\right] = I_k.$$
Moreover, if $l$ is invertible,
$$l^{-1} \circ l = \operatorname{id}_V \quad \text{and} \quad \left[l^{-1} \circ l\right]^v_v = [\operatorname{id}_V]^v_v = I_n.$$
Since
$$\left[l^{-1} \circ l\right]^v_v = \left[l^{-1}\right]^v_u\, [l]^u_v,$$
the desired result follows.
Remark 297 From the definitions of $\Phi$ and $\Psi$, we have what follows:
1.
$$l = \Psi(\Phi(l)) = \Psi\left([l]^u_v\right) = l^u_{[l]^u_v,\, v}.$$
2.
$$A = \Phi(\Psi(A)) = \Phi\left(l^u_{A,v}\right) = \left[l^u_{A,v}\right]^u_v.$$

Lemma 298 $cr_u\left(\operatorname{Im} l^u_{A,v}\right) = \operatorname{col\,span} A$.

Proof. Recall that $\Phi(l) = [l]^u_v$ and $\Psi(A) = l^u_{A,v}$. For any $l \in L(V, U)$,
$$cr_u(\operatorname{Im} l) \overset{\text{Lemma 294}}{=} \operatorname{col\,span} [l]^u_v \overset{\text{Def. 286}}{=} \operatorname{col\,span} \Phi(l). \tag{8.4}$$
Take $l = l^u_{A,v}$. Then, from (8.4), we have
$$cr_u\left(\operatorname{Im} l^u_{A,v}\right) = \operatorname{col\,span} \Phi\left(l^u_{A,v}\right) \overset{\text{Rmk. 297.2}}{=} \operatorname{col\,span} A.$$

Lemma 299 $\dim \operatorname{Im} l^u_{A,v} = \dim \operatorname{col\,span} A = \operatorname{rank} A$.

Proof. Since Lemma 295 holds for any $l \in L(V, U)$ and $l^u_{A,v} \in L(V, U)$, we have that
$$\dim \operatorname{Im} l^u_{A,v} = \dim \operatorname{col\,span} \left[l^u_{A,v}\right]^u_v \overset{\text{Rmk. 297.2}}{=} \dim \operatorname{col\,span} A \overset{\text{Rmk. 234}}{=} \operatorname{rank} A.$$
Proposition 300 Let $A \in M(m, n)$ be given.
1. $\operatorname{rank} A = m$ $\Leftrightarrow$ $l^u_{A,v}$ onto;
2. $\operatorname{rank} A = n$ $\Leftrightarrow$ $l^u_{A,v}$ one-to-one;
3. $A$ invertible $\Leftrightarrow$ $l^u_{A,v}$ invertible, and in that case $l^v_{A^{-1},u} = \left(l^u_{A,v}\right)^{-1}$.

Proof. 1. $\operatorname{rank} A = m$ $\overset{\text{Lemma 299}}{\Leftrightarrow}$ $\dim \operatorname{Im} l^u_{A,v} = m$ $\Leftrightarrow$ $l^u_{A,v}$ onto.
2. $\operatorname{rank} A = n$ $\overset{(1)}{\Leftrightarrow}$ $\dim \ker l^u_{A,v} = 0$ $\overset{\text{Proposition 278}}{\Leftrightarrow}$ $l^u_{A,v}$ one-to-one, where the first equivalence (1) follows from the fact that $n = \dim \ker l^u_{A,v} + \dim \operatorname{Im} l^u_{A,v}$, and Lemma 299.
3. First statement: $A$ invertible $\overset{\text{Prop. 229}}{\Leftrightarrow}$ $\operatorname{rank} A = m = n$ $\overset{\text{1 and 2 above}}{\Leftrightarrow}$ $l^u_{A,v}$ invertible.
Second statement: since $l^u_{A,v}$ is invertible, there exists $\left(l^u_{A,v}\right)^{-1} : U \to V$ such that
$$\operatorname{id}_V = \left(l^u_{A,v}\right)^{-1} \circ l^u_{A,v}.$$
Then
$$I = \Phi^v_v(\operatorname{id}_V) \overset{\text{Prop. 293}}{=} \Phi^v_u\left(\left(l^u_{A,v}\right)^{-1}\right) \cdot \Phi^u_v\left(l^u_{A,v}\right) \overset{\text{Rmk. 297}}{=} \Phi^v_u\left(\left(l^u_{A,v}\right)^{-1}\right) \cdot A.$$
Then, by definition of inverse matrix,
$$A^{-1} = \Phi^v_u\left(\left(l^u_{A,v}\right)^{-1}\right)$$
and
$$\Psi^v_u\left(A^{-1}\right) = \left(\Psi^v_u \circ \Phi^v_u\right)\left(\left(l^u_{A,v}\right)^{-1}\right) = \operatorname{id}_{L(U,V)}\left(\left(l^u_{A,v}\right)^{-1}\right) = \left(l^u_{A,v}\right)^{-1}.$$
Finally, from the definition of $\Psi^v_u$, we have
$$\Psi^v_u\left(A^{-1}\right) = l^v_{A^{-1},u},$$
as desired.
Remark 301 Consider $A \in M(n, n)$. Then, from Proposition 300,
$$A \text{ invertible} \Leftrightarrow l^u_{A,v} \text{ invertible};$$
from Proposition 229,
$$A \text{ invertible} \Leftrightarrow A \text{ nonsingular};$$
from Proposition 281,
$$l^u_{A,v} \text{ invertible} \Leftrightarrow l^u_{A,v} \text{ nonsingular}.$$
Therefore,
$$A \text{ nonsingular} \Leftrightarrow l^u_{A,v} \text{ nonsingular}.$$
8.5 Some facts on $L(\mathbb{R}^n, \mathbb{R}^m)$

In this Section, we specialize (basically repeat) the content of the previous Section in the important case in which
$$V = \mathbb{R}^n, \quad v = \left(e^j_n\right)_{j=1}^n := e_n, \quad U = \mathbb{R}^m, \quad u = \left(e^i_m\right)_{i=1}^m := e_m, \quad v = x, \tag{8.5}$$
and therefore
$$l \in L(\mathbb{R}^n, \mathbb{R}^m).$$
From $L(\mathbb{R}^n, \mathbb{R}^m)$ to $M(m, n)$.
From Definition 285, we have
$$[l]^{e_m}_{e_n} = \left[\, \left[l(e^1_n)\right]_{e_m} \; ... \; \left[l(e^j_n)\right]_{e_m} \; ... \; \left[l(e^n_n)\right]_{e_m} \,\right] = \left[\, l(e^1_n) \; ... \; l(e^j_n) \; ... \; l(e^n_n) \,\right] := [l]; \tag{8.6}$$
from Definition 286,
$$\Phi := \Phi^{e_m}_{e_n} : L(\mathbb{R}^n, \mathbb{R}^m) \to M(m, n), \quad l \mapsto [l];$$
from Proposition 287,
$$[l]\, x = l(x). \tag{8.7}$$
From $M(m, n)$ to $L(\mathbb{R}^n, \mathbb{R}^m)$.
From Definition 288,
$$l_A := l^{e_m}_{A, e_n} : \mathbb{R}^n \to \mathbb{R}^m, \tag{8.8}$$
$$l_A(x) = \sum_{i=1}^m \left( R_i(A)\, [x]_{e_n} \right) e^i_m = \begin{bmatrix} R_1(A)\, x \\ \vdots \\ R_i(A)\, x \\ \vdots \\ R_m(A)\, x \end{bmatrix} = Ax. \tag{8.9}$$
From Definition 290,
$$\Psi := \Psi^{e_m}_{e_n} : M(m, n) \to L(\mathbb{R}^n, \mathbb{R}^m), \quad A \mapsto l_A.$$
From Proposition 292, $M(m, n)$ and $\mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ are isomorphic.
From Proposition 293, if $l_1 \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $l_2 \in L(\mathbb{R}^m, \mathbb{R}^p)$, then
$$[l_2 \circ l_1] = [l_2] \cdot [l_1]. \tag{8.10}$$
Some related properties.
From Proposition 296, given $l \in L(\mathbb{R}^n, \mathbb{R}^m)$:
1. $l$ onto $\Leftrightarrow$ $\operatorname{rank} [l] = m$;
2. $l$ one-to-one $\Leftrightarrow$ $\operatorname{rank} [l] = n$;
3. $l$ invertible $\Leftrightarrow$ $[l]$ invertible, and in that case $\left[l^{-1}\right] = [l]^{-1}$.
From Remark 297,
1. $l = \Psi(\Phi(l)) = \Psi([l]) = l_{[l]}$;
2. $A = \Phi(\Psi(A)) = \Phi(l_A) = [l_A]$.
From Proposition 300, given $A \in M(m, n)$:
1. $\operatorname{rank} A = m$ $\Leftrightarrow$ $l_A$ onto;
2. $\operatorname{rank} A = n$ $\Leftrightarrow$ $l_A$ one-to-one;
3. $A$ invertible $\Leftrightarrow$ $l_A$ invertible, and in that case $l_{A^{-1}} = (l_A)^{-1}$.

Remark 302 From (8.7) and Remark 145,
$$\operatorname{Im} l := \{y \in \mathbb{R}^m : \exists x \in \mathbb{R}^n \text{ such that } y = [l]\, x\} = \operatorname{col\,span} [l]. \tag{8.11}$$
Then, from the above and Remark 234,
$$\dim \operatorname{Im} l = \operatorname{rank} [l] = \text{max number of linearly independent columns of } [l]. \tag{8.12}$$
Similarly, from (8.9) and Remark 234, we get
$$\operatorname{Im} l_A := \{y \in \mathbb{R}^m : \exists x \in \mathbb{R}^n \text{ such that } y = l_A(x) = Ax\} = \operatorname{col\,span} A,$$
and
$$\dim \operatorname{Im} l_A = \operatorname{rank} A = \text{max number of linearly independent columns of } A.$$
Remark 303 Assume that $l \in L(\mathbb{R}^n, \mathbb{R}^m)$.
1. The above Remark gives a way of finding a basis of $\operatorname{Im} l$: it is enough to consider a number equal to $\operatorname{rank} [l]$ of linearly independent vectors among the column vectors of $[l]$.
2. From (8.7) we have that
$$\ker l = \{x \in \mathbb{R}^n : [l]\, x = 0\},$$
i.e., $\ker l$ is the set, in fact the vector space, of solutions to the system $[l]\, x = 0$. In Remark 335, we will describe an algorithm to find a basis of the kernel of an arbitrary linear function.

Remark 304 From Remark 302 and Proposition 266, given $l \in L(\mathbb{R}^n, \mathbb{R}^m)$,
$$\dim \mathbb{R}^n = \dim \ker l + \operatorname{rank} [l],$$
and given $A \in M(m, n)$,
$$\dim \mathbb{R}^n = \dim \ker l_A + \operatorname{rank} A.$$
8.6 Examples of computation of $[l]^u_v$

1. $\operatorname{id} \in L(V, V)$:
$$[\operatorname{id}]^v_v = \left[\, \left[\operatorname{id}(v^1)\right]_v, ..., \left[\operatorname{id}(v^j)\right]_v, ..., \left[\operatorname{id}(v^n)\right]_v \,\right] = \left[\, \left[v^1\right]_v, ..., \left[v^j\right]_v, ..., \left[v^n\right]_v \,\right] = \left[\, e^1_n, ..., e^j_n, ..., e^n_n \,\right] = I_n.$$
2. $0 \in L(V, U)$:
$$[0]^u_v = \left[\, [0]_u, ..., [0]_u, ..., [0]_u \,\right] = 0 \in M(m, n).$$
3. $l_\lambda \in L(V, V)$, $l_\lambda(v) = \lambda v$, with $\lambda \in F$:
$$[l_\lambda]^v_v = \left[\, \left[\lambda v^1\right]_v, ..., \left[\lambda v^j\right]_v, ..., \left[\lambda v^n\right]_v \,\right] = \lambda \left[\, \left[v^1\right]_v, ..., \left[v^j\right]_v, ..., \left[v^n\right]_v \,\right] = \lambda \left[\, e^1_n, ..., e^j_n, ..., e^n_n \,\right] = \lambda I_n.$$
4. $l_A \in L(\mathbb{R}^n, \mathbb{R}^m)$, with $A \in M(m, n)$:
$$[l_A] = \left[\, A e^1_n, ..., A e^j_n, ..., A e^n_n \,\right] = A \left[\, e^1_n, ..., e^j_n, ..., e^n_n \,\right] = A \cdot I_n = A.$$
5. (projection function) $\operatorname{proj}_{n+k,n} \in L(\mathbb{R}^{n+k}, \mathbb{R}^n)$, $\operatorname{proj}_{n+k,n} : (x_i)_{i=1}^{n+k} \mapsto (x_i)_{i=1}^n$. Defined $p := \operatorname{proj}_{n+k,n}$, we have
$$[p] = \left[\, p(e^1_{n+k}), ..., p(e^n_{n+k}), p(e^{n+1}_{n+k}), ..., p(e^{n+k}_{n+k}) \,\right] = \left[\, e^1_n, ..., e^n_n, 0, ..., 0 \,\right] = [I_n \,|\, 0],$$
where $0 \in M(n, k)$.
6. (immersion function) $i_{n,n+k} \in L(\mathbb{R}^n, \mathbb{R}^{n+k})$, $i_{n,n+k} : (x_i)_{i=1}^n \mapsto ((x_i)_{i=1}^n, 0)$ with $0 \in \mathbb{R}^k$. Defined $i := i_{n,n+k}$, we have
$$[i] = \left[\, i(e^1_n), ..., i(e^n_n) \,\right] = \begin{bmatrix} e^1_n & ... & e^n_n \\ 0 & ... & 0 \end{bmatrix} = \begin{bmatrix} I_n \\ 0 \end{bmatrix},$$
where $0 \in M(k, n)$.

Remark 305 Point 4. above implies that if $l : \mathbb{R}^n \to \mathbb{R}^m$, $x \mapsto Ax$, then $[l] = A$. In other words, to compute $[l]$ you do not have to take the image of each element in the canonical basis: the first row of $[l]$ is the vector of the coefficients of the first component function of $l$, and so on. For example, if $l : \mathbb{R}^2 \to \mathbb{R}^3$,
$$x \mapsto \begin{bmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \\ a_{31} x_1 + a_{32} x_2 \end{bmatrix},$$
then
$$[l] = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}.$$
8.7 Change of basis and linear operators

In this section, we answer the following question: how does the matrix representation of a linear function $l \in L(V, V)$ for a given basis $v = \left(v^1, .., v^n\right)$ change when we change the basis?

Definition 306 Let $V$ be a vector space over a field $F$. $l \in L(V, V)$ is called a linear operator on $V$.

Proposition 307 Let $P$ be the change-of-basis matrix from a basis $v$ to a basis $u$ for a vector space $V$. Then, for any $l \in L(V, V)$,
$$[l]^u_u = P^{-1} [l]^v_v P.$$
In words, if $A$ is the matrix representing $l$ in the basis $v$, then $B = P^{-1}AP$ is the matrix which represents $l$ in the new basis $u$, where $P$ is the change-of-basis matrix from $v$ to $u$.
Proof. For any $v \in V$, from Proposition 199, we have that
$$P [v]_u = [v]_v.$$
Then, premultiplying both sides by $P^{-1} [l]^v_v$, we get
$$P^{-1} [l]^v_v P [v]_u = P^{-1} [l]^v_v [v]_v \overset{\text{Prop. 287}}{=} P^{-1} [l(v)]_v \overset{\text{Prop. 199}}{=} [l(v)]_u. \tag{8.13}$$
From Proposition 287,
$$[l]^u_u [v]_u = [l(v)]_u. \tag{8.14}$$
Therefore, from (8.13) and (8.14), we get
$$P^{-1} [l]^v_v P [v]_u = [l]^u_u [v]_u. \tag{8.15}$$
Since, from the proof of Proposition 275, $[\cdot]_u := cr_u : V \to \mathbb{R}^n$ is onto, $\forall x \in \mathbb{R}^n$, $\exists v \in V$ such that $[v]_u = x$. Observe that $\forall k \in \{1, .., n\}$, $\left[u^k\right]_u = e^k_n$. Therefore, rewriting (8.15) with $v = u^k$ for every $k \in \{1, .., n\}$, we get
$$P^{-1} [l]^v_v P\, I_n = [l]^u_u\, I_n,$$
as desired.
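A numerical check of Proposition 307 (an added sketch with made-up matrices): take $A = [l]^v_v$ in some basis $v$ and an invertible change-of-basis matrix $P$; then $B = P^{-1}AP$ represents the same operator in the new basis and, in line with Proposition 239 and Remark 310 below, shares its determinant, trace and eigenvalues.

```python
import numpy as np

A = np.array([[1., 2.],
              [0., 3.]])           # [l]_v^v
P = np.array([[1., 1.],
              [1., 2.]])           # change-of-basis matrix (invertible)

B = np.linalg.inv(P) @ A @ P       # [l]_u^u
print(np.trace(A), np.trace(B))                           # 4.0 4.0
print(np.linalg.det(A), np.linalg.det(B))                 # both approx. 3.0
print(np.sort(np.linalg.eigvals(A)), np.sort(np.linalg.eigvals(B)))  # equal
```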
Remark 308 "$A$ and $B$ represent $l$" means that there exist bases $v$ and $u$ such that
$$A = [l]^v_v \quad \text{and} \quad B = [l]^u_u.$$
Moreover, from Definition 196 and Proposition 198, there exists an invertible $P$ which is a change-of-basis matrix. The above Proposition 307 says that $A$ and $B$ are similar. Therefore, all the matrix representations of $l \in L(V, V)$ form an equivalence class of similar matrices.

Remark 309 Now suppose that
$$f : M(n, n) \to \mathbb{R}$$
is such that
$$\text{if } A \text{ is similar to } B, \text{ then } f(A) = f(B).$$
Then, given a vector space $V$ of dimension $n$, we can define
$$F : L(V, V) \to \mathbb{R}$$
such that, for any basis $v$ of $V$,
$$F(l) = f\left([l]^v_v\right).$$
Indeed, by definition of $f$ and by Remark 308, the definition is well given: for any pair of bases $u$ and $v$, $f\left([l]^u_u\right) = f\left([l]^v_v\right)$.

Remark 310 For any $A, P \in M(n, n)$ with $P$ invertible,
$$\operatorname{tr} PAP^{-1} = \operatorname{tr} A,$$
as verified below: $\operatorname{tr} PAP^{-1} = \operatorname{tr} (PA)P^{-1} = \operatorname{tr} P^{-1}(PA) = \operatorname{tr} A$, where the second equality comes from Property 3 in Proposition 82.
Remark 309, Proposition 239 and Remark 310 allow to give the following definitions.

Definition 311 Let $l \in L(V, V)$ and a basis $v$ of $V$ be given. The definitions of determinant, trace and characteristic polynomial of $l$ are as follows:
$$\det : L(V, V) \to \mathbb{R}, \quad l \mapsto \det [l]^v_v,$$
$$\operatorname{tr} : L(V, V) \to \mathbb{R}, \quad l \mapsto \operatorname{tr} [l]^v_v,$$
$$\Delta_l : L(V, V) \to \text{space of polynomials}, \quad l \mapsto \Delta_{[l]^v_v}.$$

8.8 Diagonalization of linear operators

Definition 312 Given a vector space $V$ of dimension $n$, $l \in L(V, V)$ is diagonalizable if there exists a basis $v$ of $V$ such that $[l]^v_v$ is a diagonal matrix.

We present below some definitions and results which are analogous to those discussed in Section 6.3. In what follows, $V$ is a vector space of dimension $n$.
Definition 313 Let a vector space $V$ on a field $F$ and $l \in L(V, V)$ be given. A scalar $\lambda \in F$ is called an eigenvalue of $l$ if there exists a nonzero vector $v \in V$ such that
$$l(v) = \lambda v.$$
Every nonzero vector $v$ satisfying the above relationship is called an eigenvector of $l$ associated with (or belonging to) the eigenvalue $\lambda$.

Definition 314 Let $E_\lambda$ be the set of eigenvectors belonging to $\lambda$.

Proposition 315 1. There is only one eigenvalue associated with an eigenvector.
2. The set $E_\lambda \cup \{0\}$ is a subspace of $V$.

Proof. Similar to the matrix case.

Proposition 316 Let a vector space $V$ on a field $F$ and $l \in L(V, V)$ be given. Then the following statements are equivalent.
1. $\lambda \in F$ is an eigenvalue of $l$.
2. $\lambda \operatorname{id}_V - l$ is singular.
3. $\lambda \in F$ is a solution to the characteristic equation $\Delta_l(t) = 0$.

Proof. Similar to the matrix case.

Definition 317 Let $\lambda$ be an eigenvalue of $l$. The algebraic multiplicity of $\lambda$ is the multiplicity of $\lambda$ as a solution to the characteristic equation $\Delta_l(t) = 0$. The geometric multiplicity of $\lambda$ is the dimension of $E_\lambda \cup \{0\}$.
Proposition 318 $l \in L(V, V)$ is diagonalizable $\Leftrightarrow$ there exists a basis $v$ of $V$ made of $n$ (linearly independent) eigenvectors. In that case, the elements on the diagonal of $D := [l]^v_v$ are the associated eigenvalues.

Proof. $[\Rightarrow]$
$[l]^v_v = \operatorname{diag}\left[(\lambda_i)_{i=1}^n\right] := D$ $\Rightarrow$ $\forall i \in \{1, ..., n\}$, $\left[l(v^i)\right]_v = [l]^v_v \left[v^i\right]_v = D\, e^i_n = \lambda_i e^i_n$. Taking $(cr_v)^{-1}$ of both sides, we get
$$\forall i \in \{1, ..., n\}, \quad l(v^i) = \lambda_i v^i.$$
$[\Leftarrow]$
$\forall i \in \{1, ..., n\}$, $l(v^i) = \lambda_i v^i$ $\Rightarrow$ $\forall i \in \{1, ..., n\}$, $\left[l(v^i)\right]_v = \left[\lambda_i v^i\right]_v = \lambda_i e^i_n$ $\Rightarrow$
$$[l]^v_v := \left[\, \left[l(v^1)\right]_v |...| \left[l(v^n)\right]_v \,\right] = \left[\, \lambda_1 e^1_n |...| \lambda_n e^n_n \,\right] = \operatorname{diag}\left[(\lambda_i)_{i=1}^n\right].$$
Lemma 319 $\left(\left[v^1\right]_v, ..., \left[v^n\right]_v\right)$ are linearly independent $\Leftrightarrow$ $\left(v^1, ..., v^n\right)$ are linearly independent.

Proof. $\sum_{i=1}^n \alpha_i \left[v^i\right]_v = 0$ $\Leftrightarrow$ $(cr_v)^{-1}\left(\sum_{i=1}^n \alpha_i \left[v^i\right]_v\right) = (cr_v)^{-1}(0)$ $\overset{(cr_v)^{-1} \text{ is linear}}{\Leftrightarrow}$ $\sum_{i=1}^n \alpha_i\, (cr_v)^{-1}\left(\left[v^i\right]_v\right) = 0$ $\Leftrightarrow$ $\sum_{i=1}^n \alpha_i v^i = 0$.
Proposition 320 Let $v^1, ..., v^n$ be eigenvectors of $l \in L(V, V)$ belonging to pairwise distinct eigenvalues $\lambda_1, ..., \lambda_n$, respectively. Then $v^1, ..., v^n$ are linearly independent.

Proof. By assumption, $\forall i \in \{1, ..., n\}$, $l(v^i) = \lambda_i v^i$ and $[l]^v_v \left[v^i\right]_v = \lambda_i \left[v^i\right]_v$. Then, from the analogous theorem for matrices, i.e., Proposition 252, $\left(\left[v^1\right]_v, ..., \left[v^n\right]_v\right)$ are linearly independent. Then, the result follows from Lemma 319.

Proposition 321 $\lambda$ is an eigenvalue of $l \in L(V, V)$ $\Leftrightarrow$ $\lambda$ is a root of the characteristic polynomial of $l$.

Proof. Similar to the matrix case, i.e., Proposition 246.
Proposition 322 Let $\lambda$ be an eigenvalue of $l \in L(V, V)$. The geometric multiplicity of $\lambda$ is smaller than or equal to the algebraic multiplicity of $\lambda$.

Proof. Assume that the geometric multiplicity of $\lambda$ is $r$. Then, by assumption, $\dim E_\lambda \cup \{0\} = r$ and therefore there are $r$ linearly independent eigenvectors $v^1, ..., v^r$, and therefore
$$\forall i \in \{1, ..., r\}, \quad l(v^i) = \lambda v^i.$$
From Lemma 180, there exist $\left(w^1, .., w^s\right)$ such that $u = \left(v^1, ..., v^r, w^1, .., w^s\right)$ is a basis of $V$. Then, by definition of the matrix associated with $l$ with respect to that basis, we have
$$[l]^u_u = \left[\, \left[l(v^1)\right]_u |...| \left[l(v^r)\right]_u \,\big|\, \left[l(w^1)\right]_u |...| \left[l(w^s)\right]_u \,\right] = \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix},$$
for well chosen values of the entries of $A \in M(r, s)$ and $B \in M(s, s)$. Then the characteristic polynomial of $[l]^u_u$ is
$$\det \left( tI - \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix} \right) = \det \begin{bmatrix} (t - \lambda) I_r & -A \\ 0 & tI_s - B \end{bmatrix} = (t - \lambda)^r \det(tI_s - B).$$
Therefore, the algebraic multiplicity of $\lambda$ is greater than or equal to $r$.
Chapter 9

Solutions to systems of linear equations

9.1 Some preliminary basic facts

Let's recall some basic definitions from Section 1.6.

Definition 323 Consider the following linear system with m equations and n unknowns:
a_11 x_1 + ... + a_1n x_n = b_1
...
a_m1 x_1 + ... + a_mn x_n = b_m,
which can be rewritten as
Ax = b.
A ∈ M(m, n) is called the matrix of the coefficients (or coefficient matrix) associated with the system and M ∈ M(m, n+1), M = [A | b], is called the augmented matrix associated with the system.
Recall the following definition.

Definition 324 Two linear systems are said to be equivalent if they have the same solutions.

Let's recall some basic facts we discussed in previous chapters.

Remark 325 It is well known that the following operations applied to a system of linear equations lead to an equivalent system:
I) interchange two equations;
II) multiply both sides of an equation by a nonzero real number;
III) add the left and right hand sides of an equation to the left and right hand sides of another equation;
IV) change the place of the unknowns.
The transformations I), II), III) and IV) are called elementary transformations. Those transformations correspond to elementary operations on the rows of M or on the columns of A in the way described below:
I) interchange two rows of M;
II) multiply a row of M by a nonzero real number;
III) sum a row of M to another row of M;
IV) interchange two columns of A.
The above described operations do not change the rank of A and they do not change the rank of M - see Proposition 222.
Homogeneous linear systems.

Definition 326 A linear system for which b = 0, i.e., of the type
Ax = 0
with A ∈ M(m, n), is called a homogeneous system.

Remark 327 Obviously, 0 is a solution of the homogeneous system. The set of solutions of a homogeneous system is ker l_A. From Remark 304,
dim ker l_A = n − rank l_A.
9.2 A solution method: Rouché-Capelli's and Cramer's theorems

The solution method presented in this section is based on two basic theorems:
1. Rouché-Capelli's Theorem, which gives a necessary and sufficient condition for the existence of solutions;
2. Cramer's Theorem, which gives a method to compute solutions - if they exist.

Theorem 328 (Rouché-Capelli) A system with m equations and n unknowns
A_{m×n} x = b (9.1)
has solutions ⟺ rank A = rank [A | b].

Proof. [⟹]
Let x* be a solution to (9.1). Then b is a linear combination, via the solution x*, of the columns of A. Then, from Proposition 222,
rank [A | b] = rank [A | 0] = rank A.
[⟸]
1st proof.
We want to show that ∃x* ∈ R^n such that Ax* = b, i.e., b = Σ_{j=1}^n x*_j C^j(A).
By assumption, rank A = rank [A | b] := r. Since rank A = r, there are r linearly independent column vectors of A, say {C^j(A)}_{j∈R}, where R ⊆ {1, ..., n} and #R = r.
Since rank [A | b] = r, {C^j(A)}_{j∈R} ∪ {b} is a linearly dependent set and, from Lemma 185, b is a linear combination of the vectors in {C^j(A)}_{j∈R}, i.e., ∃(x_j)_{j∈R} such that b = Σ_{j∈R} x_j C^j(A), and therefore
b = Σ_{j∈R} x_j C^j(A) + Σ_{j′∈{1,...,n}∖R} 0 · C^{j′}(A).
Then x* = (x*_j)_{j=1}^n, with
x*_j = x_j if j ∈ R, and x*_j = 0 if j ∈ {1, ..., n}∖R,
is a solution to Ax = b.
Second proof.
Since rank A := r ≤ min{m, n}, by definition of rank, there exists a rank r square submatrix A* of A. From Remark 325, reordering the columns of A and the rows of [A | b] does not change the rank of A or of [A | b] and leads to the following system, which is equivalent to Ax = b:
[ A*  A^{12} ; A^{21}  A^{22} ] [ x^1 ; x^2 ] = [ b′ ; b″ ],
where A* ∈ M(r, r), A^{12} ∈ M(r, n−r), A^{21} ∈ M(m−r, r), A^{22} ∈ M(m−r, n−r), x^1 ∈ R^r, x^2 ∈ R^{n−r}, b′ ∈ R^r, b″ ∈ R^{m−r},
(x^1, x^2) has been obtained from x performing on it the same permutations performed on the columns of A, and
(b′, b″) has been obtained from b performing on it the same permutations performed on the rows of A.
Since
rank [A* A^{12} b′] = rank A* = r,
the r rows of [A* A^{12} b′] are linearly independent. Since
rank [ A* A^{12} b′ ; A^{21} A^{22} b″ ] = r,
the rows of that matrix are linearly dependent and, from Lemma 185, the last m − r rows of [A|b] are linear combinations of the first r rows. Therefore, using again Remark 325, we have that Ax = b is equivalent to
[A* A^{12}] (x^1 ; x^2) = b′,
or, using Remark 74.2,
A* x^1 + A^{12} x^2 = b′,
and
x^1 = (A*)⁻¹ (b′ − A^{12} x^2) ∈ R^r,
while x^2 can be chosen arbitrarily; more precisely,
{(x^1, x^2) ∈ R^n : x^1 = (A*)⁻¹ (b′ − A^{12} x^2) ∈ R^r and x^2 ∈ R^{n−r}}
is the nonempty set of solutions to the system Ax = b.
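The rank condition is easy to check numerically. The following is a minimal sketch (an illustration, not part of the notes) that tests the Rouché-Capelli condition with NumPy; the data are those of Example 338 below.

```python
# Hedged sketch: check rank A = rank [A|b] numerically with NumPy.
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 2.0, 3.0]])
b = np.array([[4.0], [8.0], [12.0]])

rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))

if rank_A == rank_Ab:
    # solutions exist; the solution set has dimension n - rank A
    print("solvable, degrees of freedom:", A.shape[1] - rank_A)
else:
    print("no solutions")
```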
Theorem 329 (Cramer) A system with n equations and n unknowns
A_{n×n} x = b,
with det A ≠ 0, has a unique solution x = (x_1, ..., x_i, ..., x_n), where for i ∈ {1, ..., n},
x_i = det A^i / det A,
and A^i is the matrix obtained from A substituting the column vector b in the place of the i-th column.

Proof. Since det A ≠ 0, A⁻¹ exists and it is unique. Moreover, from Ax = b, we get A⁻¹Ax = A⁻¹b and
x = A⁻¹ b.
Moreover,
A⁻¹ b = (1 / det A) (Adj A) b.
It is then enough to verify that
(Adj A) b = (det A^1, ..., det A^i, ..., det A^n),
which we omit (see Exercise 7.34, page 268, in Lipschutz (1991)).
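As an aside, Cramer's rule translates directly into code. The sketch below is an illustration, not part of the notes; it builds each A^i by substituting b into the i-th column, and the tolerance 1e-12 used to test det A ≠ 0 is an arbitrary choice.

```python
# Hedged sketch of Cramer's rule for a square system Ax = b with det A != 0.
import numpy as np

def cramer_solve(A, b):
    d = np.linalg.det(A)
    if abs(d) < 1e-12:                 # tolerance is an arbitrary choice
        raise ValueError("det A = 0: Cramer's theorem does not apply")
    x = np.empty(A.shape[1])
    for i in range(A.shape[1]):
        A_i = A.copy()
        A_i[:, i] = b                  # substitute b in place of the i-th column
        x[i] = np.linalg.det(A_i) / d
    return x

A = np.array([[1.0, 1.0], [1.0, -1.0]])
b = np.array([2.0, 0.0])
print(cramer_solve(A, b))              # [1. 1.], as in Example 339 below
```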
The combination of Rouché-Capelli's and Cramer's Theorems allows to give a method to solve any linear system - apart from computational difficulties.

Rouché-Capelli and Cramer's Theorem based method.
Let the following system with m equations and n unknowns be given:
A_{m×n} x = b.
1. Compute rank A and rank [A | b].
i. If rank A ≠ rank [A | b], then the system has no solutions.
ii. If rank A = rank [A | b] := r, then the system has solutions, which can be computed as follows.
2. Extract a square r-dimensional invertible submatrix A^r from A.
i. Discard the equations, if any, whose corresponding rows are not part of A^r.
ii. In the remaining equations, bring to the right hand side the terms containing unknowns whose coefficients are not part of the matrix A^r, if any.
iii. You then get a system to which Cramer's Theorem can be applied, treating as constants the expressions on the right hand side, which contain n − r unknowns. Those unknowns can be chosen arbitrarily. Sometimes it is said that the system has ∞^{n−r} solutions or that the system admits n − r degrees of freedom. More formally, we can say what follows.
Definition 330 Given S, T ⊆ R^n, we define the sum of the sets S and T, denoted by S + T, as follows:
S + T = {x ∈ R^n : ∃s ∈ S, t ∈ T such that x = s + t}.

Proposition 331 Assume that the set S of solutions to the system Ax = b is nonempty and let x* ∈ S. Then
S = {x*} + ker l_A := {x ∈ R^n : ∃x′ ∈ ker l_A such that x = x* + x′}.

Proof. [⊆]
Take x ∈ S. We want to find x′ ∈ ker l_A such that x = x* + x′. Take x′ = x − x*. Clearly x = x* + (x − x*). Moreover,
Ax′ = A(x − x*) =(1) b − b = 0, (9.2)
where (1) follows from the fact that x, x* ∈ S.
[⊇]
Take x = x* + x′ with x* ∈ S and x′ ∈ ker l_A. Then
Ax = Ax* + Ax′ = b + 0 = b.
Remark 332 The above proposition implies that a linear system either has no solutions, or has a unique solution, or has infinitely many solutions.

Definition 333 V is an affine subspace of R^n if there exists a vector subspace W of R^n and a vector x ∈ R^n such that
V = {x} + W.
We say that the (1) dimension of the affine subspace V is dim W.

Remark 334 Since dim ker l_A = n − rank A, the above Proposition and Definition say that if a nonhomogeneous system has solutions, then the set of solutions is an affine space of dimension n − rank A.

(1) The dimension is well defined: if W′ and W″ are vector subspaces of R^n, x ∈ R^n and V := {x} + W′ = {x} + W″, then W′ = W″. Take w ∈ W′; then x + w ∈ V = {x} + W″. Then there exist x̄ ∈ {x} and w̄ ∈ W″ such that x + w = x̄ + w̄. Then w = w̄ ∈ W″, and W′ ⊆ W″. A similar proof applies for the opposite inclusion.
Remark 335 How to find a basis of ker l_A.
Let A ∈ M(m, n) be given with rank A = r ≤ min{m, n}. Then, from the second proof of Rouché-Capelli's Theorem, we have that the system
Ax = 0
admits the following set of solutions:
{(x^1, x^2) ∈ R^r × R^{n−r} : x^1 = −(A*)⁻¹ A^{12} x^2}. (9.3)
Observe that dim ker l_A = n − r := p. Then, a basis of ker l_A is
B = { (−(A*)⁻¹ A^{12} e_1^p, e_1^p), ..., (−(A*)⁻¹ A^{12} e_p^p, e_p^p) }.
To check the above statement, we check that 1. B ⊆ ker l_A, and 2. B is linearly independent (2).
1. It follows from (9.3);
2. It follows from the fact that det [e_1^p, ..., e_p^p] = det I = 1.

(2) Observe that B is made up of p vectors and dim ker l_A = p.
Example 336 Consider the system
x_1 + x_2 + x_3 + 2x_4 = 0
x_1 − x_2 + x_3 + 2x_4 = 0.
Defining
A* = [1 1; 1 −1], A^{12} = [1 2; 1 2],
the starting system can be rewritten as
[1 1; 1 −1] (x_1; x_2) = −[1 2; 1 2] (x_3; x_4).
Then a basis of ker l_A is
B = { (−[1 1; 1 −1]⁻¹ [1 2; 1 2] (1; 0), (1; 0)), (−[1 1; 1 −1]⁻¹ [1 2; 1 2] (0; 1), (0; 1)) } = {(−1, 0, 1, 0), (−2, 0, 0, 1)}.
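The construction of Remark 335 can be checked numerically. Below is a minimal sketch (not part of the notes), assuming the block decomposition of Example 336, with A* the invertible block and A^{12} the remaining columns.

```python
# Hedged sketch: basis of ker l_A via x1 = -(A*)^{-1} A12 x2 (Remark 335).
import numpy as np

A_star = np.array([[1.0, 1.0], [1.0, -1.0]])   # invertible r x r block
A12 = np.array([[1.0, 2.0], [1.0, 2.0]])
p = A12.shape[1]                                # p = n - r

M = -np.linalg.inv(A_star) @ A12                # x1 = M x2
# basis vectors: stack (M e_j, e_j) for j = 1, ..., p
B = [np.concatenate([M[:, j], np.eye(p)[:, j]]) for j in range(p)]
for v in B:
    print(v)                                    # (-1, 0, 1, 0) and (-2, 0, 0, 1)
```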
An algorithm to find eigenvalues and eigenvectors of A ∈ M(n, n) and to show whether A is diagonalizable.
Step 1. Find the characteristic polynomial χ(t) of A.
Step 2. Find the roots of χ(t) to obtain the eigenvalues of A.
Step 3. Repeat (a) and (b) below for each eigenvalue λ of A:
(a) Form M = A − λI;
(b) Find a basis of ker M. These basis vectors are linearly independent eigenvectors of A belonging to λ.
Step 4. Consider the set S = {v^1, v^2, ..., v^m} of all eigenvectors obtained in Step 3:
(a) If m ≠ n, then A is not diagonalizable.
(b) If m = n, let P be the matrix whose columns are the eigenvectors {v^1, v^2, ..., v^m}. Then
D = P⁻¹ A P = diag (λ_i)_{i=1}^n,
where for i ∈ {1, ..., n}, λ_i is the eigenvalue corresponding to the eigenvector v^i.
Example 337 Apply the above algorithm to
A = [1 4; 2 3].
1. det [tI − A] = det [t−1 −4; −2 t−3] = t² − 4t − 5.
2. t² − 4t − 5 = 0 has solutions λ_1 = −1, λ_2 = 5.
3. λ_1 = −1.
(a) M = A − (−1)I = [1+1 4; 2 3+1] = [2 4; 2 4].
(b) ker M = {(x, y) ∈ R² : x = −2y} = {(x, y) ∈ R² : 2x + 4y = 0}. Therefore a basis of that one-dimensional space is {(−2, 1)}.
3. λ_2 = 5.
(a) M = A − 5I = [1−5 4; 2 3−5] = [−4 4; 2 −2].
(b) ker M = {(x, y) ∈ R² : x = y}. Therefore a basis of that one-dimensional space is {(1, 1)}.
4. A is diagonalizable:
P = [−2 1; 1 1], P⁻¹ = [−1/3 1/3; 1/3 2/3],
and
P⁻¹ A P = [−1/3 1/3; 1/3 2/3] [1 4; 2 3] [−2 1; 1 1] = [−1 0; 0 5].
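Example 337 can be verified numerically; here is a hedged sketch (not part of the notes) using numpy.linalg.eig, which returns the eigenvalues (in some order) and a matrix whose columns are corresponding eigenvectors.

```python
# Hedged sketch: numerical check of Example 337.
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 3.0]])
eigvals, P = np.linalg.eig(A)
print(eigvals)                 # -1 and 5, in some order
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))         # diagonal matrix with the eigenvalues
```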
Example 338 Discuss the following system (i.e., say if it admits solutions):
x_1 + x_2 + x_3 = 4
x_1 + x_2 + 2x_3 = 8
2x_1 + 2x_2 + 3x_3 = 12.
The augmented matrix [A|b] is
[1 1 1 | 4; 1 1 2 | 8; 2 2 3 | 12],
and
rank [A|b] = rank [1 1 1 | 4; 1 1 2 | 8; 0 0 0 | 0] = 2 = rank A.
From Step 2 of the Rouché-Capelli and Cramer method, we can consider the system
x_2 + x_3 = 4 − x_1
x_2 + 2x_3 = 8 − x_1.
Therefore, x_1 can be chosen arbitrarily and, since det [1 1; 1 2] = 1,
x_2 = det [4−x_1 1; 8−x_1 2] = −x_1,
x_3 = det [1 4−x_1; 1 8−x_1] = 4.
Therefore, the set of solutions is
{(x_1, x_2, x_3) ∈ R³ : x_2 = −x_1, x_3 = 4}.
Example 339 Discuss the following system:
x_1 + x_2 = 2
x_1 − x_2 = 0.
The augmented matrix [A|b] is
[1 1 | 2; 1 −1 | 0].
Since det A = −1 − 1 = −2 ≠ 0,
rank [A|b] = rank A = 2,
and the system has a unique solution:
x_1 = det [2 1; 0 −1] / (−2) = −2/−2 = 1,
x_2 = det [1 2; 1 0] / (−2) = −2/−2 = 1.
Therefore, the set of solutions is {(1, 1)}.
Example 340 Discuss the following system:
x_1 + x_2 = 2
x_1 + x_2 = 0.
The augmented matrix [A|b] is
[1 1 | 2; 1 1 | 0].
Since det A = 1 − 1 = 0, and det [1 2; 1 0] = −2 ≠ 0, we have that
rank [A|b] = 2 ≠ 1 = rank A,
and the system has no solutions. Therefore, the set of solutions is ∅.
Example 341 Discuss the following system:
x_1 + x_2 = 2
2x_1 + 2x_2 = 4.
The augmented matrix [A|b] is
[1 1 | 2; 2 2 | 4].
From rank properties,
rank [1 1; 2 2] = rank [1 1; 0 0] = 1,
rank [1 1 2; 2 2 4] = rank [1 1 2; 0 0 0] = 1.
Recall that elementary row operations on the augmented matrix do not change the rank of either the augmented or the coefficient matrix. Therefore
rank [A|b] = 1 = rank A,
and the system has infinitely many solutions. More precisely, the set of solutions is
{(x_1, x_2) ∈ R² : x_1 = 2 − x_2}.
Example 342 Say for which values of the parameter k ∈ R the following system has one, infinitely many or no solutions:
(k − 1)x + (k + 2)y = 1
−x + ky = 1
x − 2y = 1.
[A|b] = [k−1 k+2 | 1; −1 k | 1; 1 −2 | 1].
Expanding along the third column,
det [A|b] = det [k−1 k+2 1; −1 k 1; 1 −2 1] = det [−1 k; 1 −2] − det [k−1 k+2; 1 −2] + det [k−1 k+2; −1 k] =
= (2 − k) − (−2k + 2 − k − 2) + (k² − k + k + 2) = 2 − k + 3k + k² + 2 = k² + 2k + 4,
whose discriminant is 4 − 16 = −12 < 0. Therefore, the determinant is never equal to zero and rank [A|b] = 3. Since rank A_{3×2} ≤ 2, the solution set of the system is empty for each value of k.
Remark 343 To solve a parametric linear system Ax = b, where A ∈ M(m, n), it is convenient to proceed as follows (see the sketch after this list).
1. Perform easy row operations on [A|b];
2. Compute min{m, n + 1} := k and consider the k × k submatrices of the matrix [A|b];
3. Among those square matrices, choose a matrix A* which is a submatrix of A, if possible; if you have more than one matrix to choose among, choose the easiest one from a computational viewpoint, i.e., the one with the highest number of zeros, the lowest number of times a parameter appears, ... ;
4. Compute det A*, a function of the parameter;
5. Analyze the two cases det A* ≠ 0 and det A* = 0.
Example 344 Say for which values of the parameter a ∈ R the following system has one, infinitely many or no solutions:
a x_1 + x_2 + x_3 = 2
x_1 − a x_2 = 0
2a x_1 + a x_2 = 4.
[A|b] = [a 1 1 | 2; 1 −a 0 | 0; 2a a 0 | 4].
det [a 1 1; 1 −a 0; 2a a 0] = a + 2a² = 0 if a = 0, −1/2.
Therefore, if a ≠ 0, −1/2,
rank [A|b] = rank A = 3,
and the system has a unique solution.
If a = 0,
[A|b] = [0 1 1 | 2; 1 0 0 | 0; 0 0 0 | 4].
Since det [0 1; 1 0] = −1, rank A = 2. On the other hand,
det [0 1 2; 1 0 0; 0 0 4] = −1 · det [1 2; 0 4] = −4 ≠ 0,
and therefore rank [A|b] = 3 ≠ 2 = rank A, and the system has no solutions.
If a = −1/2,
[A|b] = [−1/2 1 1 | 2; 1 1/2 0 | 0; −1 −1/2 0 | 4].
Since det [1 1; 1/2 0] = −1/2, rank A = 2. Since
det [−1/2 1 2; 1 0 0; −1 0 4] = −det [1 0; −1 4] = −4,
rank [A|b] = 3 ≠ 2 = rank A, and the system has no solutions.
Example 345 Say for which values of the parameter a ∈ R the following system has one, infinitely many or no solutions:
(a + 1) x_1 − 2 x_2 + 2a x_3 = a
a x_1 − a x_2 + x_3 = 2.
[A|b] = [a+1 −2 2a | a; a −a 1 | 2].
det [−2 2a; −a 1] = 2a² − 2 = 0,
whose solutions are −1, 1. Therefore, if a ∈ R∖{−1, 1},
rank [A|b] = 2 = rank A,
and the system has infinitely many solutions. Let's study the system for a ∈ {−1, 1}.
If a = −1, we get
[A|b] = [0 −2 −2 | −1; −1 1 1 | 2],
and since
det [0 −2; −1 1] = −2 ≠ 0,
we have again
rank [A|b] = 2 = rank A,
and the system has infinitely many solutions.
If a = 1, we have
[A|b] = [2 −2 2 | 1; 1 −1 1 | 2],
and
rank [A|b] = 2 > 1 = rank A,
and the system has no solutions.
Example 346 Say for which values of the parameter a ∈ R the following system has one, infinitely many or no solutions:
a x_1 + x_2 = 1
x_1 + x_2 = a
2x_1 + x_2 = 3a
3x_1 + 2x_2 = a.
[A|b] = [a 1 | 1; 1 1 | a; 2 1 | 3a; 3 2 | a].
Observe that rank A_{4×2} ≤ 2, and
det [1 1 a; 2 1 3a; 3 2 a] = 3a = 0 if a = 0.
Therefore, if a ∈ R∖{0},
rank [A|b] = 3 > 2 ≥ rank A_{4×2},
and the system has no solutions.
If a = 0,
[A|b] = [0 1 | 1; 1 1 | 0; 2 1 | 0; 3 2 | 0],
and since
det [0 1 1; 1 1 0; 2 1 0] = −1 ≠ 0,
the system has no solutions for a = 0 as well.
Summarizing, ∀a ∈ R, the system has no solutions.
Chapter 10

Appendix. Complex numbers

A simple motivation to introduce complex numbers is to observe that the equation
x² + 1 = 0
does not have any solution in the set R. The symbol √−1, denoted by i, was introduced to solve the equation anyway; it was called the imaginary unit, and numbers such as 3 + 4i were called complex numbers. In what follows, we present a rigorous axiomatic description of those numbers. (1)

(1) The main source of this Appendix is Chapter 9 in Apostol (1967).
Definition 347 The set of complex numbers is the set R², where equality, addition and multiplication are defined as follows: ∀(x_1, x_2), (y_1, y_2) ∈ R²,
a. (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2);
b. (x_1, x_2) · (y_1, y_2) = (x_1 y_1 − x_2 y_2, x_1 y_2 + x_2 y_1),
where addition and multiplication on real numbers are the standard ones.
The set of complex numbers is denoted by C. ∀x = (x_1, x_2) ∈ C, x_1 is called the real part of x and x_2 the imaginary part of x.

Remark 348 The symbol i = √−1 will be introduced soon as a specific element of C.
Proposition 349 C is a field.

Proof. We have to verify that the operations presented in Definition 347 satisfy the properties of Definition 129. We will show only some of them.
Property 2 for multiplication (associativity). Given x = (x_1, x_2), y = (y_1, y_2) and z = (z_1, z_2), we have
(xy)z = (x_1 y_1 − x_2 y_2, x_1 y_2 + x_2 y_1)(z_1, z_2) =
= ((x_1 y_1 − x_2 y_2) z_1 − (x_1 y_2 + x_2 y_1) z_2, (x_1 y_1 − x_2 y_2) z_2 + (x_1 y_2 + x_2 y_1) z_1) =
= (x_1 y_1 z_1 − x_2 y_2 z_1 − x_1 y_2 z_2 − x_2 y_1 z_2, x_1 y_1 z_2 − x_2 y_2 z_2 + x_1 y_2 z_1 + x_2 y_1 z_1),
and
x(yz) = (x_1, x_2)(y_1 z_1 − y_2 z_2, y_1 z_2 + y_2 z_1) =
= (x_1 (y_1 z_1 − y_2 z_2) − x_2 (y_1 z_2 + y_2 z_1), x_1 (y_1 z_2 + y_2 z_1) + x_2 (y_1 z_1 − y_2 z_2)) =
= (x_1 y_1 z_1 − x_1 y_2 z_2 − x_2 y_1 z_2 − x_2 y_2 z_1, x_1 y_1 z_2 + x_1 y_2 z_1 + x_2 y_1 z_1 − x_2 y_2 z_2),
so the two products coincide.
Property 4. The null element with respect to the sum is (0, 0). The identity element with respect to multiplication is (1, 0):
∀(x_1, x_2) ∈ C, (1, 0)(x_1, x_2) = (1x_1 − 0x_2, 1x_2 + 0x_1) = (x_1, x_2).
Property 5. The negative element of (x_1, x_2) is simply −(x_1, x_2), defined as (−x_1, −x_2).
Property 6. ∀(x_1, x_2) ∈ C∖{0}, we want to find its inverse element (x_1, x_2)⁻¹ := (a, b). We must then have
(x_1, x_2)(a, b) = (1, 0),
or
x_1 a − x_2 b = 1
x_2 a + x_1 b = 0,
which admits a unique solution iff (x_1)² + (x_2)² ≠ 0, i.e., (x_1, x_2) ≠ 0, as we assumed. Then
a = x_1 / ((x_1)² + (x_2)²), b = −x_2 / ((x_1)² + (x_2)²).
Summarizing,
(x_1, x_2)⁻¹ = ( x_1 / ((x_1)² + (x_2)²), −x_2 / ((x_1)² + (x_2)²) ).
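The inverse formula is easy to check against Python's built-in complex arithmetic; a small illustrative sketch (not part of the notes):

```python
# Hedged sketch: the inverse formula versus built-in complex division.
def inverse(x1, x2):
    n = x1**2 + x2**2            # assumes (x1, x2) != (0, 0)
    return (x1 / n, -x2 / n)

print(inverse(3.0, 4.0))         # (0.12, -0.16)
print(1 / complex(3, 4))         # (0.12-0.16j)
```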
Remark 350 As said in the above proof,
0 ∈ C means 0 = (0, 0);
1 ∈ C means 1 = (1, 0);
∀x = (x_1, x_2) ∈ C, we define −x = (−x_1, −x_2).

Definition 351 C_0 = {(x_1, x_2) ∈ C : x_2 = 0}.

Definition 352 Given two fields F_1 and F_2, if there exists a function f : F_1 → F_2 which is invertible and preserves the operations, then the function is called a (field) isomorphism and the fields are said to be isomorphic.

Remark 353 In the case described in the above definition, each element in F_1 can be identified with its isomorphic image.

Definition 354 A subset F′ of a field F is a subfield of F if F′ is itself a field with the operations inherited from F.

Definition 355 A field F is an extension of a field G if F contains a subfield F′ isomorphic to G.
Proposition 356 f : R → C_0, a ↦ (a, 0), is an isomorphism from R to C_0.

Proof. 1. f is invertible. Its inverse is f⁻¹ : C_0 → R, (a, 0) ↦ a:
(f ∘ f⁻¹)((a, 0)) = f(f⁻¹(a, 0)) = f(a) = (a, 0);
(f⁻¹ ∘ f)(a) = f⁻¹(f(a)) = f⁻¹(a, 0) = a.
2. f(a + b) = f(a) + f(b):
f(a + b) = (a + b, 0); f(a) + f(b) = (a, 0) + (b, 0) = (a + b, 0).
3. f(ab) = f(a) f(b):
f(ab) = (ab, 0); f(a) f(b) = (a, 0)(b, 0) = (ab − 0, 0a + 0b) = (ab, 0).

Remark 357 As a consequence of the previous Proposition, C_0 is a field. Therefore, C_0 is a subfield of C isomorphic to R. Then C is an extension of R and, using what was said in Remark 353,
we can identify a ∈ R with (a, 0) ∈ C_0. (10.1)
Definition 358 ∀x ∈ C, x² = x · x = (x_1, x_2)(x_1, x_2) = ((x_1)² − (x_2)², 2x_1 x_2).

Proposition 359 The equation
x² + 1 = 0 (10.2)
has a solution x* ∈ C.

Proof. Take x* = (0, 1). Observe that in the above equation 1 ∈ C, i.e., we want to show that
(0, 1)(0, 1) + (1, 0) = (0, 0).
Indeed,
(0, 1)(0, 1) + (1, 0) = (0 − 1, 0) + (1, 0) = (0, 0),
as desired. Observe that x** = (0, −1) is a solution as well.

On the basis of the above proposition, we can give the following definition.

Definition 360 i = (0, 1).

Remark 361 We then have that −i = (0, −1), that both i and −i are solutions to equation (10.2), and that i² = (−i)² = −1.
The following simple Proposition makes sense of the definition of a + bi as a complex number.

Proposition 362 ∀(a, b) ∈ C,
(a, b) = (a, 0) + (b, 0) i,
and therefore, using (10.1), we get (a, b) = a + bi.

Proof. (a, 0) + (b, 0) i = (a, 0) + (b, 0)(0, 1) = (a, 0) + (b·0 − 0·1, b + 0) = (a, 0) + (0, b) = (a, b).

Remark 363 The use of the identification of (a, b) with a + bi is mainly mnemonic. In fact, treating a + bi as made up of real numbers and using the fact that i² = −1, we get back the definitions of product and inverse as follows:
(a + bi)(c + di) = ac + adi + bci − bd = (ac − bd) + (ad + bc) i,
1/(a + bi) = (a − bi)/((a + bi)(a − bi)) = a/(a² + b²) − (b/(a² + b²)) i.
Definition 364 The complex conjugate of z = a + ib is z̄ = a − ib.

Remark 365 The following properties follow from the above definitions. ∀z_1, z_2, z ∈ C,
1. the conjugate of z_1 + z_2 is z̄_1 + z̄_2;
2. the conjugate of z_1 z_2 is z̄_1 z̄_2;
3. the conjugate of z_1/z_2 is z̄_1/z̄_2;
4. z z̄ = |z|².
Theorem 366 (Fundamental Theorem of Algebra) Given n ∈ N, n ≥ 1, and (a_i)_{i=0}^n ∈ C^{n+1} with a_n ≠ 0, the equation
p_n(z) := Σ_{k=0}^n a_k z^k = 0
has solutions in C. Moreover, there exists (α_k)_{k=1}^n ∈ C^n such that {α_k : k ∈ {1, ..., n}} is equal to the set of solutions of p_n(z) = 0.
The complex numbers α_1, ..., α_n are called roots of the complex polynomial p. If they are pairwise distinct, we say that p has simple roots. If a root appears l times in the list (α_k)_{k=1}^n of solutions, it is called a root of multiplicity l.

The proof of Theorem 366 is beyond the scope of these notes. We just verify the conclusion of the theorem for n = 2 and real coefficients, i.e., for the equation ax² + bx + c = 0.
ax² + bx + c = a [ (x + b/(2a))² − (b² − 4ac)/(4a²) ].
If b² − 4ac ≥ 0, the equation has real solutions. If b² − 4ac < 0, then it has the following complex solutions:
r_1 = −b/(2a) + i √(−(b² − 4ac))/(2a) and r_2 = −b/(2a) − i √(−(b² − 4ac))/(2a).
Therefore, if a second degree equation (with real coefficients) has no real solutions, then its solutions are complex conjugates. Vice versa, if z = a + ib, then z and z̄ are the solutions of the equation
x² − 2ax + |z|² = 0.
For a geometrical interpretation see Section 9.5, page 361, in Apostol (1967).
Since a complex number is an ordered pair of real numbers, we can give a simple geometrical representation of complex numbers using Cartesian diagrams. Given a complex number z = (a, b) ≠ 0, we can express it in polar coordinates, i.e., we can find a unique r ∈ R₊₊, called the modulus of the number (a, b) and denoted by |(a, b)|, and some number θ such that
a = r cos θ and b = r sin θ. (10.3)
In fact, r is the Euclidean norm of (a, b):
|(a, b)| = √(a² + b²),
and the angle θ is called an argument of (a, b): if θ is an argument of (a, b), then ∀k ∈ Z, θ + 2kπ is an argument as well. To find an argument of (a, b), we can take
θ = arctan(b/a) if a ≠ 0; θ = π/2 if a = 0 and b > 0; θ = −π/2 if a = 0 and b < 0,
where we consider the tan function restricted to (−π/2, π/2) in order to have Im arctan = (−π/2, π/2).
Therefore, we can define the principal argument of (a, b) as the unique θ ∈ R such that
a = r cos θ, b = r sin θ and θ ∈ (−π, π].
Proposition 367 For any z = r (cos θ + i sin θ), any w = s (cos φ + i sin φ) and n ∈ N, we have
1. zw = rs (cos(θ + φ) + i sin(θ + φ)),
2. z^n = r^n (cos nθ + i sin nθ).

Proof. The identities follow from basic trigonometric identities - see for example Sydsaeter (1981), page 40.
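A quick numerical check of identity 2 (de Moivre's formula), using the standard library's cmath.polar and cmath.rect; an illustrative sketch only:

```python
# Hedged sketch: z^n = r^n (cos(n*theta) + i sin(n*theta)).
import cmath

z = complex(1.0, 1.0)
r, theta = cmath.polar(z)            # modulus and principal argument
n = 5
lhs = z**n
rhs = cmath.rect(r**n, n * theta)    # r^n (cos(n*theta) + i sin(n*theta))
print(lhs, rhs)                      # both approximately (-4-4j)
```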
Some trigonometric identities.
The following trigonometric identities hold true (2): ∀α, β ∈ R,
1. (fundamental formula) cos² α + sin² α = 1;
2. addition formulas:
(a) cos(α ± β) = cos α cos β ∓ sin α sin β;
(b) sin(α ± β) = sin α cos β ± cos α sin β;
3. duplication formulas:
(a) sin 2α = 2 sin α cos α;
(b) cos 2α = cos² α − sin² α.

(2) See, for example, Simon and Blume (1994), Appendix A2.
Part II

Some topology in metric spaces

Chapter 11

Metric spaces

11.1 Definitions and examples

Definition 368 Let X be a nonempty set. A metric or distance on X is a function d : X × X → R such that ∀x, y, z ∈ X,
1. (a) d(x, y) ≥ 0, and (b) d(x, y) = 0 ⟺ x = y;
2. d(x, y) = d(y, x);
3. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
(X, d) is called a metric space.

Remark 369 Observe that the definition requires that ∀x, y ∈ X, it must be the case that d(x, y) ∈ R.
Example 370 (n-dimensional Euclidean space with Euclidean metric) Given n ∈ N∖{0}, take X = R^n and
d_{2,n} : R^n × R^n → R, (x, y) ↦ (Σ_{i=1}^n (x_i − y_i)²)^{1/2}.
(X, d_{2,n}) was shown to be a metric space in Proposition 58, Section 2.3. d_{2,n} is called the Euclidean distance in R^n. In what follows, unless needed, we write simply d_2 in the place of d_{2,n}.

Proposition 371 (Discrete metric space) Given a nonempty set X and the function
d : X² → R, d(x, y) = 0 if x = y, and d(x, y) = 1 if x ≠ y,
(X, d) is a metric space, called the discrete metric space.

Proof. 1a. 0, 1 ≥ 0.
1b. From the definition, d(x, y) = 0 ⟺ x = y.
2. It follows from the fact that x = y ⟺ y = x and x ≠ y ⟺ y ≠ x.
3. If x = z, the result follows. If x ≠ z, then it cannot be that x = y and y = z, and again the result follows.
Proposition 372 Given n ∈ N∖{0}, p ∈ [1, +∞), X = R^n and
d : R^n × R^n → R, (x, y) ↦ (Σ_{i=1}^n |x_i − y_i|^p)^{1/p},
(X, d) is a metric space.

Proof. 1a. It follows from the definition of absolute value.
1b. [⟸] Obvious. [⟹] (Σ_{i=1}^n |x_i − y_i|^p)^{1/p} = 0 ⟹ Σ_{i=1}^n |x_i − y_i|^p = 0 ⟹ for any i, |x_i − y_i| = 0 ⟹ for any i, x_i − y_i = 0.
2. It follows from the fact that |x_i − y_i| = |y_i − x_i|.
3. First of all observe that
d(x, z) = (Σ_{i=1}^n |(x_i − y_i) + (y_i − z_i)|^p)^{1/p}.
Then it is enough to show that
(Σ_{i=1}^n |(x_i − y_i) + (y_i − z_i)|^p)^{1/p} ≤ (Σ_{i=1}^n |x_i − y_i|^p)^{1/p} + (Σ_{i=1}^n |y_i − z_i|^p)^{1/p},
which is a consequence of Proposition 373 below.
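Before the formal statement, here is a small numerical illustration (a sketch, of course not a proof) that the p-distance above satisfies the triangle inequality for several values of p ≥ 1 on randomly sampled vectors:

```python
# Hedged sketch: sampling the triangle inequality for the p-distance, p >= 1.
import numpy as np

def d_p(x, y, p):
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 4))
for p in (1.0, 1.5, 2.0, 3.0):
    assert d_p(x, z, p) <= d_p(x, y, p) + d_p(y, z, p) + 1e-12
print("triangle inequality holds in all sampled cases")
```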
Proposition 373 Taken n ∈ N∖{0}, p ∈ [1, +∞), X = R^n, ∀a, b ∈ R^n,
(Σ_{i=1}^n |a_i + b_i|^p)^{1/p} ≤ (Σ_{i=1}^n |a_i|^p)^{1/p} + (Σ_{i=1}^n |b_i|^p)^{1/p}.

Proof. It follows from the proof of Proposition 376 below.

Definition 374 Let R^∞ be the set of sequences in R.

Definition 375 For any p ∈ [1, +∞), define (1)
l_p = { (x_n)_{n∈N} ∈ R^∞ : Σ_{n=1}^{+∞} |x_n|^p < +∞ },
i.e., roughly speaking, l_p is the set of sequences whose associated series are absolutely convergent.

(1) For basic results on series, see, for example, Section 10.5 in Apostol (1967).
Proposition 376 (Minkowski inequality) ∀(x_n)_{n∈N}, (y_n)_{n∈N} ∈ l_p, ∀p ∈ [1, +∞),
(Σ_{n=1}^{+∞} |x_n + y_n|^p)^{1/p} ≤ (Σ_{n=1}^{+∞} |x_n|^p)^{1/p} + (Σ_{n=1}^{+∞} |y_n|^p)^{1/p}. (11.1)

Proof. If either (x_n)_{n∈N} or (y_n)_{n∈N} is such that ∀n ∈ N, x_n = 0 or ∀n ∈ N, y_n = 0, i.e., if either sequence is the constant sequence of zeros, then (11.1) is trivially true.
Then we can consider the case in which ∃α, β ∈ R₊₊ such that
(Σ_{n=1}^{+∞} |x_n|^p)^{1/p} = α and (Σ_{n=1}^{+∞} |y_n|^p)^{1/p} = β. (11.2)
Define
∀n ∈ N, x̂_n = |x_n|/α and ŷ_n = |y_n|/β. (11.3)
Then
Σ_{n=1}^{+∞} (x̂_n)^p = Σ_{n=1}^{+∞} (ŷ_n)^p = 1. (11.4)
For any n ∈ N, from the triangle inequality for the absolute value, we have
|x_n + y_n| ≤ |x_n| + |y_n|;
since p ∈ [1, +∞), f_p : R₊ → R, f_p(t) = t^p is an increasing function, and we have
|x_n + y_n|^p ≤ (|x_n| + |y_n|)^p. (11.5)
Moreover, from (11.3),
(|x_n| + |y_n|)^p = (α x̂_n + β ŷ_n)^p = (α + β)^p ( (α/(α+β)) x̂_n + (β/(α+β)) ŷ_n )^p. (11.6)
Since p ∈ [1, +∞), f_p is convex (just observe that f_p″(t) = p(p − 1) t^{p−2} ≥ 0), and we get
( (α/(α+β)) x̂_n + (β/(α+β)) ŷ_n )^p ≤ (α/(α+β)) (x̂_n)^p + (β/(α+β)) (ŷ_n)^p. (11.7)
From (11.5), (11.6) and (11.7), we get
|x_n + y_n|^p ≤ (α + β)^p ( (α/(α+β)) (x̂_n)^p + (β/(α+β)) (ŷ_n)^p ).
From the above inequalities and basic properties of series, we then get
Σ_{n=1}^{+∞} |x_n + y_n|^p ≤ (α + β)^p ( (α/(α+β)) Σ_{n=1}^{+∞} (x̂_n)^p + (β/(α+β)) Σ_{n=1}^{+∞} (ŷ_n)^p ) =(11.4) (α + β)^p ( α/(α+β) + β/(α+β) ) = (α + β)^p.
Therefore, using (11.2), we get
Σ_{n=1}^{+∞} |x_n + y_n|^p ≤ ( (Σ_{n=1}^{+∞} |x_n|^p)^{1/p} + (Σ_{n=1}^{+∞} |y_n|^p)^{1/p} )^p,
and therefore the desired result.
Proposition 377 (l_p, d_p), with
d_p : l_p × l_p → R, d_p((x_n)_{n∈N}, (y_n)_{n∈N}) = (Σ_{n=1}^{+∞} |x_n − y_n|^p)^{1/p},
is a metric space.

Proof. We first of all have to check that d_p(x, y) ∈ R, i.e., that (Σ_{n=1}^{+∞} |x_n − y_n|^p)^{1/p} converges:
(Σ_{n=1}^{+∞} |x_n − y_n|^p)^{1/p} = (Σ_{n=1}^{+∞} |x_n + (−y_n)|^p)^{1/p} ≤ (Σ_{n=1}^{+∞} |x_n|^p)^{1/p} + (Σ_{n=1}^{+∞} |y_n|^p)^{1/p} < +∞,
where the first inequality follows from the Minkowski inequality and the second one from the assumption that we are considering sequences in l_p.
Properties 1 and 2 of the distance follow easily from the definition. Property 3 is again a consequence of the Minkowski inequality:
d_p(x, z) = (Σ_{n=1}^{+∞} |(x_n − y_n) + (y_n − z_n)|^p)^{1/p} ≤ (Σ_{n=1}^{+∞} |x_n − y_n|^p)^{1/p} + (Σ_{n=1}^{+∞} |y_n − z_n|^p)^{1/p} := d_p(x, y) + d_p(y, z).
Definition 378 Let T be a nonempty set. B(T) is the set of all bounded real functions defined on T, i.e.,
B(T) := {f : T → R : sup{|f(x)| : x ∈ T} < +∞},
and (2)
d_∞ : B(T) × B(T) → R, d_∞(f, g) = sup{|f(x) − g(x)| : x ∈ T}.

(2) (Definition 379) Observe that d_∞(f, g) ∈ R:
d_∞(f, g) := sup{|f(x) − g(x)| : x ∈ T} ≤ sup{|f(x)| : x ∈ T} + sup{|g(x)| : x ∈ T} < +∞.
Definition 380
l_∞ = { (x_n)_{n∈N} ∈ R^∞ : sup{|x_n| : n ∈ N} < +∞ }
is called the set of bounded real sequences and, still using the symbol of the previous definition,
d_∞ : l_∞ × l_∞ → R, d_∞((x_n)_{n∈N}, (y_n)_{n∈N}) = sup{|x_n − y_n| : n ∈ N}.

Proposition 381 (B(T), d_∞) and (l_∞, d_∞) are metric spaces, and d_∞ is called the sup metric.

Proof. We show that (B(T), d_∞) is a metric space. As usual, the difficult part is to show property 3 of d_∞, which is done below.
∀f, g, h ∈ B(T), ∀x ∈ T,
|f(x) − g(x)| ≤ |f(x) − h(x)| + |h(x) − g(x)| ≤ sup{|f(x) − h(x)| : x ∈ T} + sup{|h(x) − g(x)| : x ∈ T} = d_∞(f, h) + d_∞(h, g).
Then
d_∞(f, g) := sup{|f(x) − g(x)| : x ∈ T} ≤ d_∞(f, h) + d_∞(h, g).
Exercise 382 If (X, d) is a metric space, then (X, d/(1 + d)) is a metric space.

Proposition 383 Given a metric space (X, d) and a set Y such that ∅ ≠ Y ⊆ X, (Y, d_{|Y×Y}) is a metric space.

Proof. By definition.

Definition 384 Given a metric space (X, d) and a set Y such that ∅ ≠ Y ⊆ X, (Y, d_{|Y×Y}), or simply (Y, d), is called a metric subspace of X.

Example 385 1. Given R with the (Euclidean) distance d_{2,1}, ([0, 1), d_{2,1}) is a metric subspace of (R, d_{2,1}).
2. Given R² with the (Euclidean) distance d_{2,2}, ({0} × R, d_{2,2}) is a metric subspace of (R², d_{2,2}).

Exercise 386 Let C([0, 1]) be the set of continuous functions from [0, 1] to R. Show that a metric on that set is defined by
d(f, g) = ∫_0^1 |f(x) − g(x)| dx,
where f, g ∈ C([0, 1]).
Example 387 Let X be the set of continuous functions from R to R, and consider d(f, g) = sup_{x∈R} |f(x) − g(x)|. (X, d) is not a metric space because d is not a function from X² to R: it can be that sup_{x∈R} |f(x) − g(x)| = +∞.

Example 388 Let X = {a, b, c} and d : X² → R be such that
d(a, b) = d(b, a) = 2, d(a, c) = d(c, a) = 0, d(b, c) = d(c, b) = 1.
Since d(a, b) = 2 > 0 + 1 = d(a, c) + d(c, b), (X, d) is not a metric space.
Example 389 Given p ∈ (0, 1) and X = R², define
d : R² × R² → R, (x, y) ↦ (Σ_{i=1}^2 |x_i − y_i|^p)^{1/p}.
(X, d) is not a metric space, as shown below. Take x = (0, 1), y = (1, 0) and z = (0, 0). Then
d(x, y) = (1^p + 1^p)^{1/p} = 2^{1/p},
d(x, z) = (0^p + 1^p)^{1/p} = 1,
d(z, y) = 1.
Then, d(x, y) − (d(x, z) + d(z, y)) = 2^{1/p} − 2 > 0.
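The counterexample can be reproduced in a few lines of code (an illustrative sketch, not part of the notes):

```python
# Hedged sketch: failure of the triangle inequality for p = 1/2.
import numpy as np

def d_p(x, y, p):
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

p = 0.5
x, y, z = np.array([0.0, 1.0]), np.array([1.0, 0.0]), np.array([0.0, 0.0])
print(d_p(x, y, p))                    # 2^(1/p) = 4
print(d_p(x, z, p) + d_p(z, y, p))     # 1 + 1 = 2 < 4
```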
11.2 Open and closed sets

Definition 390 Let (X, d) be a metric space. ∀x_0 ∈ X and ∀r ∈ R₊₊, the open r-ball of x_0 in (X, d) is the set
B_{(X,d)}(x_0, r) = {x ∈ X : d(x, x_0) < r}.
If there is no ambiguity about the metric space (X, d) we are considering, we use the lighter notation B(x_0, r) in the place of B_{(X,d)}(x_0, r).
Example 391 1. B_{(R,d_2)}(x_0, r) = (x_0 − r, x_0 + r) is the open interval of radius r centered in x_0.
2. B_{(R²,d_2)}(x_0, r) = {(x_1, x_2) ∈ R² : √((x_1 − x_{01})² + (x_2 − x_{02})²) < r} is the open disk of radius r centered in x_0.
3. In R² with the metric d given by
d((x_1, x_2), (y_1, y_2)) = max{|x_1 − y_1|, |x_2 − y_2|},
the open ball B(0, 1) can be pictured as a square around zero.

Definition 392 Let (X, d) be a metric space. x is an interior point of S ⊆ X if there exists an open ball centered in x and contained in S, i.e.,
∃r ∈ R₊₊ such that B(x, r) ⊆ S.

Definition 393 The set of all interior points of S is called the Interior of S and it is denoted by Int_{(X,d)} S, or simply by Int S.
Remark 394 Int S ⊆ S, simply because x ∈ Int S ⟹ x ∈ B(x, r) ⊆ S, where the first inclusion follows from the definition of open ball and the second one from the definition of Interior. In other words, to find interior points of S, we can limit our search to points belonging to S.
It is not true that ∀S ⊆ X, S ⊆ Int S, as shown below. We want to prove that it is false that
∀S ⊆ X, ∀x ∈ S, x ∈ S ⟹ x ∈ Int S,
i.e., that
∃S ⊆ X and x ∈ S such that x ∉ Int S.
Take (X, d) = (R, d_2), S = {1} and x = 1. Then, clearly 1 ∈ {1}, but 1 ∉ Int S: ∀r ∈ R₊₊, (1 − r, 1 + r) ⊄ {1}.

Remark 395 To understand the following example, recall that ∀a, b ∈ R such that a < b, ∃c ∈ Q and d ∈ R∖Q such that c, d ∈ (a, b) - see, for example, Apostol (1967).

Example 396 Let (R, d_2) be given.
1. Int N = Int Q = ∅.
2. ∀a, b ∈ R, a < b, Int [a, b] = Int [a, b) = Int (a, b] = Int (a, b) = (a, b).
3. Int R = R.
4. Int ∅ = ∅.
Definition 397 Let (X, d) be a metric space. A set S ⊆ X is open in (X, d), or (X, d)-open, or open with respect to the metric space (X, d), if S ⊆ Int S, i.e., S = Int S, i.e.,
∀x ∈ S, ∃r ∈ R₊₊ such that B_{(X,d)}(x, r) := {y ∈ X : d(y, x) < r} ⊆ S.

Remark 398 Let (R, d_2) be given. From Example 396, it follows that N, Q, [a, b], [a, b), (a, b] are not open sets, and (a, b), R and ∅ are open sets. In particular, open intervals are open sets, but there are open sets which are not open intervals. Take for example S = (0, 1) ∪ (2, 3).

Exercise 399 ∀n ∈ N, ∀i ∈ {1, ..., n}, ∀a_i, b_i ∈ R with a_i < b_i,
×_{i=1}^n (a_i, b_i)
is (R^n, d_2) open.
Proposition 400 Let (X, d) be a metric space. An open ball is an open set.

Proof. Take y ∈ B(x_0, r). Define
δ = r − d(x_0, y). (11.8)
First of all, observe that, since y ∈ B(x_0, r), d(x_0, y) < r and then δ ∈ R₊₊. It is then enough to show that B(y, δ) ⊆ B(x_0, r), i.e., we assume that
d(z, y) < δ (11.9)
and we want to show that d(z, x_0) < r. From the triangle inequality,
d(z, x_0) ≤ d(z, y) + d(y, x_0) <(11.9),(11.8) δ + (r − δ) = r,
as desired.

Example 401 In a discrete metric space (X, d), ∀x ∈ X, ∀r ∈ (0, 1], B(x, r) := {y ∈ X : d(x, y) < r} = {x}, and ∀r > 1, B(x, r) := {y ∈ X : d(x, y) < r} = X. Then it is easy to show that any subset of a discrete metric space is open, as verified below. Let (X, d) be a discrete metric space and S ⊆ X. For any x ∈ S, take r = 1/2; then B(x, 1/2) = {x} ⊆ S.
Definition 402 Let a metric space (X, d) be given. A set T ⊆ X is closed in (X, d) if its complement in X, i.e., X∖T, is open in (X, d).
If no ambiguity arises, we simply say that T is closed in X, or even that T is closed; we also write T^C in the place of X∖T.

Remark 403 S is open ⟺ S^C is closed, simply because S^C closed ⟺ (S^C)^C = S is open.

Example 404 The following sets are closed in (R, d_2): R; N; ∅; ∀a, b ∈ R, a < b, {a} and [a, b].

Remark 405 It is false that
S is not open ⟹ S is closed
(and therefore that S is not closed ⟹ S is open), i.e., there exist sets which are neither open nor closed, for example (0, 1] in (R, d_2). There are also sets which are both open and closed: ∅ and R^n in (R^n, d_2).
Proposition 406 Let a metric space (X, d) be given.
1. ∅ and X are open sets.
2. The union of any (finite or infinite) collection of open sets is an open set.
3. The intersection of any finite collection of open sets is an open set.

Proof. 1. ∀x ∈ X, ∀r ∈ R₊₊, B(x, r) ⊆ X. ∅ is open because it contains no elements.
2. Let I be a collection of open sets and S = ∪_{A∈I} A. Assume that x ∈ S. Then there exists A ∈ I such that x ∈ A. Then, for some r ∈ R₊₊,
x ∈ B(x, r) ⊆ A ⊆ S,
where the first inclusion follows from the fact that A is open and the second one from the definition of S.
3. Let F be a finite collection of open sets, i.e., F = {A_n}_{n∈N}, where N ⊆ N, #N is finite and ∀n ∈ N, A_n is an open set. Take S = ∩_{n∈N} A_n. If S = ∅, we are done. Assume that S ≠ ∅ and that x ∈ S. Then, from the fact that each set A_n is open and from the definition of S as the intersection of those sets,
∀n ∈ N, ∃r_n ∈ R₊₊ such that x ∈ B(x, r_n) ⊆ A_n.
Since N is a finite set, there exists a positive r* = min{r_n : n ∈ N} > 0. Then
∀n ∈ N, x ∈ B(x, r*) ⊆ B(x, r_n) ⊆ A_n,
and from the very definition of intersection,
x ∈ B(x, r*) ⊆ ∩_{n∈N} B(x, r_n) ⊆ ∩_{n∈N} A_n = S.

Remark 407 The assumption that #N is finite cannot be dispensed with:
∩_{n=1}^{+∞} B(0, 1/n) = ∩_{n=1}^{+∞} (−1/n, 1/n) = {0}
is not open.
Remark 408 A generalization of metric spaces is the concept of topological spaces. In fact, we have the following definition, which takes the previous Proposition as a starting point.
Let X be a nonempty set. A collection T of subsets of X is said to be a topology on X if
1. ∅ and X belong to T;
2. the union of any (finite or infinite) collection of sets in T belongs to T;
3. the intersection of any finite collection of sets in T belongs to T.
(X, T) is called a topological space. The members of T are said to be open sets with respect to the topology T, or (X, T) open.
Proposition 409 Let a metric space (X, d) be given.
1. ∅ and X are closed sets.
2. The intersection of any (finite or infinite) collection of closed sets is a closed set.
3. The union of any finite collection of closed sets is a closed set.

Proof. 1. It follows from the definition of closed set, the fact that ∅^C = X and X^C = ∅, and Proposition 406.
2. Let I be a collection of closed sets and S = ∩_{B∈I} B. Then, from de Morgan's laws,
S^C = (∩_{B∈I} B)^C = ∪_{B∈I} B^C.
Then, from Remark 403, ∀B ∈ I, B^C is open and, from Proposition 406.2, ∪_{B∈I} B^C is open as well.
3. Let F be a finite collection of closed sets, i.e., F = {B_n}_{n∈N}, where N ⊆ N, #N is finite and ∀n ∈ N, B_n is a closed set. Take S = ∪_{n∈N} B_n. Then, from de Morgan's laws,
S^C = (∪_{n∈N} B_n)^C = ∩_{n∈N} B_n^C.
Then, from Remark 403, ∀n ∈ N, B_n^C is open and, from Proposition 406.3, ∩_{n∈N} B_n^C is open as well.

Remark 410 The assumption that #N is finite cannot be dispensed with:
(∩_{n=1}^{+∞} B(0, 1/n))^C = ∪_{n=1}^{+∞} (B(0, 1/n))^C = ∪_{n=1}^{+∞} ((−∞, −1/n] ∪ [1/n, +∞)) = R∖{0}
is not closed.
Definition 411 If S is both closed and open in (X, d), S is called clopen in (X, d).

Remark 412 In any metric space (X, d), X and ∅ are clopen.

Proposition 413 In any metric space (X, d), {x} is closed.

Proof. We want to show that X∖{x} is open. If X = {x}, then X∖{x} = ∅, and we are done. If X ≠ {x}, take y ∈ X∖{x}, so that y ≠ x. Take
r = d(y, x), (11.10)
with r > 0 because x ≠ y. We are left with showing that B(y, r) ⊆ X∖{x}, which is true because of the following argument. Suppose otherwise; then x ∈ B(y, r), i.e., r =(11.10) d(y, x) < r, a contradiction.

Remark 414 From Example 401, any set in any discrete metric space is open. Therefore the complement of each set is open, and therefore each set is clopen.

Definition 415 Let a metric space (X, d) and a set S ⊆ X be given. x is a boundary point of S if any open ball centered in x intersects both S and its complement in X, i.e.,
∀r ∈ R₊₊, B(x, r) ∩ S ≠ ∅ and B(x, r) ∩ S^C ≠ ∅.

Definition 416 The set of all boundary points of S is called the Boundary of S and it is denoted by F(S).

Exercise 417 F(S) = F(S^C).

Exercise 418 F(S) is a closed set.
Definition 419 The closure of S, denoted by Cl(S), is the intersection of all closed sets containing S, i.e., Cl(S) = ∩_{S′∈S} S′, where S := {S′ ⊆ X : S′ is closed and S′ ⊇ S}.

Proposition 420 1. Cl(S) is a closed set;
2. S is closed ⟺ S = Cl(S).

Proof. 1. It follows from the definition and Proposition 409.
2. [⟸] It follows from 1. above.
[⟹] Since S is closed, then S ∈ S. Therefore Cl(S) = S ∩ (∩_{S′∈S} S′) = S.

Definition 421 x ∈ X is an accumulation point for S ⊆ X if any open ball centered at x contains points of S different from x, i.e., if
∀r ∈ R₊₊, (S∖{x}) ∩ B(x, r) ≠ ∅.
The set of accumulation points of S is denoted by D(S) and it is called the Derived set of S.

Definition 422 x ∈ X is an isolated point for S ⊆ X if x ∈ S and it is not an accumulation point for S, i.e.,
x ∈ S and ∃r ∈ R₊₊ such that (S∖{x}) ∩ B(x, r) = ∅,
or
∃r ∈ R₊₊ such that S ∩ B(x, r) = {x}.
The set of isolated points of S is denoted by Is(S).
Proposition 423 D(S) = {x ∈ X : ∀r ∈ R₊₊, S ∩ B(x, r) has infinite cardinality}.

Proof. [⟹] Suppose otherwise, i.e., x is an accumulation point of S and ∃r ∈ R₊₊ such that S ∩ B(x, r) = {x^1, ..., x^n}. Then, defining δ := min{d(x, x^i) : i ∈ {1, ..., n} and x^i ≠ x}, we get (S∖{x}) ∩ B(x, δ/2) = ∅, a contradiction.
[⟸] Since S ∩ B(x, r) has infinite cardinality, (S∖{x}) ∩ B(x, r) ≠ ∅.

Remark 424
Is(S) = {x ∈ S : ∃r ∈ R₊₊ such that S ∩ B(x, r) = {x}},
as verified below. x is an isolated point of S if it belongs to S and it is not an accumulation point, i.e., if
x ∈ S and x ∉ D(S),
or
x ∈ S and ∃r ∈ R₊₊ such that (S∖{x}) ∩ B(x, r) = ∅,
or
∃r ∈ R₊₊ such that S ∩ B(x, r) = {x}.
11.2.1 Sets which are open or closed in metric subspaces

Remark 425 1. [0, 1) is ([0, 1), d_2) open.
2. [0, 1) is not (R, d_2) open. We want to show that it is false that
∀x_0 ∈ [0, 1), ∃r ∈ R₊₊ such that B_{(R,d_2)}(x_0, r) = (x_0 − r, x_0 + r) ⊆ [0, 1), (11.11)
i.e., that
∃x_0 ∈ [0, 1) such that ∀r ∈ R₊₊, ∃x′_0 ∈ R such that x′_0 ∈ (x_0 − r, x_0 + r) and x′_0 ∉ [0, 1).
It is enough to take x_0 = 0 and x′_0 = −r/2.
3. Let ([0, +∞), d_2) be given. [0, 1) is open, as shown below. By definition of open set - go back to Definition 397 and read it again - we have that, given the metric space ([0, +∞), d_2), [0, 1) is open if
∀x_0 ∈ [0, 1), ∃r ∈ R₊₊ such that B_{([0,+∞),d_2)}(x_0, r) := {x ∈ [0, +∞) : d(x_0, x) < r} ⊆ [0, 1).
If x_0 ∈ (0, 1), then take r = min{x_0, 1 − x_0} > 0.
If x_0 = 0, then take r = 1/2. Therefore we have B_{(R₊,d_2)}(0, 1/2) = {x ∈ R₊ : |x − 0| < 1/2} = [0, 1/2) ⊆ [0, 1).
Remark 426 1. (0, 1) is ((0, 1), d_2) closed.
2. (0, 1] is ((0, +∞), d_2) closed, simply because (1, +∞) is open.

Proposition 427 Let a metric space (X, d), a metric subspace (Y, d) of (X, d) and a set S ⊆ Y be given.
S is open in (Y, d) ⟺ there exists a set O open in (X, d) such that S = Y ∩ O.

Proof. Preliminary remark:
∀x_0 ∈ Y, ∀r ∈ R₊₊, B_{(Y,d)}(x_0, r) := {x ∈ Y : d(x_0, x) < r} = Y ∩ {x ∈ X : d(x_0, x) < r} = Y ∩ B_{(X,d)}(x_0, r). (11.12)
[⟹] Taken x_0 ∈ S, by assumption ∃r_{x_0} ∈ R₊₊ such that B_{(Y,d)}(x_0, r_{x_0}) ⊆ S ⊆ Y. Then
S = ∪_{x_0∈S} B_{(Y,d)}(x_0, r_{x_0}) =(11.12) ∪_{x_0∈S} (Y ∩ B_{(X,d)}(x_0, r_{x_0})) =(distributive laws) Y ∩ (∪_{x_0∈S} B_{(X,d)}(x_0, r_{x_0})),
and then it is enough to take O = ∪_{x_0∈S} B_{(X,d)}(x_0, r_{x_0}) to get the desired result.
[⟸] Take x_0 ∈ S. Then x_0 ∈ O and, since by assumption O is open in (X, d), ∃r ∈ R₊₊ such that B_{(X,d)}(x_0, r) ⊆ O. Then
B_{(Y,d)}(x_0, r) =(11.12) Y ∩ B_{(X,d)}(x_0, r) ⊆ O ∩ Y = S,
where the last equality follows from the assumption. Summarizing, ∀x_0 ∈ S, ∃r ∈ R₊₊ such that B_{(Y,d)}(x_0, r) ⊆ S, as desired.
Corollary 428 Let a metric space (X, d), a metric subspace (Y, d) of (X, d) and a set S ⊆ Y be given.
1. ⟨S closed in (Y, d)⟩ ⟺ ⟨there exists a set C closed in (X, d) such that S = Y ∩ C⟩.
2. ⟨S open (respectively, closed) in (X, d)⟩ is not implied by ⟨S open (respectively, closed) in (Y, d)⟩.
3. If Y is open (respectively, closed) in X, then
⟨S open (respectively, closed) in (Y, d)⟩ ⟹ ⟨S open (respectively, closed) in (X, d)⟩,
i.e., the implication in the above statement 2. does hold true.

Proof. 1.
⟨S closed in (Y, d)⟩ ⟺(def.) ⟨Y∖S open in (Y, d)⟩ ⟺(Prop. 427) ⟨there exists an open set S″ in (X, d) such that Y∖S = S″ ∩ Y⟩ ⟺ ⟨there exists a closed set S′ in (X, d) such that S = S′ ∩ Y⟩,
where the last equivalence is proved below.
[⟸] Take S″ = X∖S′, open in (X, d) by definition. We want to show that
if S″ = X∖S′, S = S′ ∩ Y and Y ⊆ X, then Y∖S = S″ ∩ Y:
x ∈ Y∖S iff x ∈ Y and x ∉ S
iff x ∈ Y and ¬(x ∈ S′ ∩ Y)
iff x ∈ Y and ¬(x ∈ S′ and x ∈ Y)
iff x ∈ Y and (x ∉ S′ or x ∉ Y)
iff (x ∈ Y and x ∉ S′) or (x ∈ Y and x ∉ Y)
iff x ∈ Y and x ∉ S′;
x ∈ S″ ∩ Y iff x ∈ Y and x ∈ S″
iff x ∈ Y and (x ∈ X and x ∉ S′)
iff (x ∈ Y and x ∈ X) and x ∉ S′
iff x ∈ Y and x ∉ S′.
[⟹] Take S′ = X∖S″. Then S′ is closed in (X, d). We want to show that
if S′ = X∖S″, Y∖S = S″ ∩ Y and Y ⊆ X, then S = S′ ∩ Y.
From the assumptions, it is enough to show that
S″ ∩ Y = Y∖((X∖S″) ∩ Y):
x ∈ Y∖((X∖S″) ∩ Y) iff x ∈ Y and ¬(x ∈ X∖S″ and x ∈ Y)
iff x ∈ Y and (x ∉ X∖S″ or x ∉ Y)
iff x ∈ Y and (x ∈ S″ or x ∉ Y)
iff (x ∈ Y and x ∈ S″) or (x ∈ Y and x ∉ Y)
iff x ∈ Y and x ∈ S″
iff x ∈ S″ ∩ Y.
2. and 3. Exercises.
11.3 Sequences

Unless otherwise specified, up to the end of the chapter, we assume that X is a metric space with metric d, and that R^n is the metric space with the Euclidean metric.

Definition 429 A sequence in X is a function x : N → X.
Usually, for any n ∈ N, the value x(n) is denoted by x_n, which is called the n-th term of the sequence; the sequence is denoted by (x_n)_{n∈N}.

Definition 430 Given a nonempty set X, X^∞ is the set of sequences (x_n)_{n∈N} such that ∀n ∈ N, x_n ∈ X.

Definition 431 A strictly increasing sequence of natural numbers is a sequence (k_n)_{n∈N} in N such that
1 ≤ k_1 < k_2 < ... < k_n < ...

Definition 432 A subsequence of a sequence (x_n)_{n∈N} is a sequence (y_n)_{n∈N} such that there exists a strictly increasing sequence (k_n)_{n∈N} of natural numbers such that ∀n ∈ N, y_n = x_{k_n}.
Definition 433 A sequence (x_n)_{n∈N} ∈ X^∞ is said to be (X, d) convergent to x_0 ∈ X (or convergent to x_0 ∈ X with respect to the metric space (X, d)) if
∀ε > 0, ∃n_0 ∈ N such that ∀n > n_0, d(x_n, x_0) < ε. (11.13)
x_0 is called the limit of the sequence (x_n)_{n∈N}, and we write
lim_{n→+∞} x_n = x_0, or x_n →_n x_0. (11.14)
(x_n)_{n∈N} in a metric space (X, d) is convergent if there exists x_0 ∈ X such that (11.13) holds. In that case, we say that the sequence converges to x_0 and x_0 is the limit of the sequence. (3)

Remark 434 A more precise, and heavier, notation for (11.14) would be
lim_{n→+∞, (X,d)} x_n = x_0, or x_n →_{n, (X,d)} x_0.

Remark 435 Observe that (1/n)_{n∈N} converges with respect to (R, d_2) and it does not converge with respect to (R₊₊, d_2).

(3) For the last sentence in the Definition, see, for example, Morris (2007), page 121.
Proposition 436 lim_{n→+∞} x_n = x_0 ⟺ lim_{n→+∞} d(x_n, x_0) = 0.

Proof. Observe that we can define the sequence (d(x_n, x_0))_{n∈N} in R. Then, from Definition 755, we have that lim_{n→+∞} d(x_n, x_0) = 0 means that
∀ε > 0, ∃n_0 ∈ N such that ∀n > n_0, |d(x_n, x_0) − 0| < ε.

Remark 437 Since (d(x_n, x_0))_{n∈N} is a sequence in R, all well known results hold for that sequence. Some of those results are listed below.
Proposition 438 (Some properties of sequences in R) All the following statements concern sequences in R.
1. Every convergent sequence is bounded.
2. Every increasing (decreasing) sequence that is bounded above (below) converges to its sup (inf).
3. Every sequence has a monotone subsequence.
4. (Bolzano-Weierstrass 1) Every bounded sequence has a convergent subsequence.
5. (Bolzano-Weierstrass 2) Every sequence contained in a closed and bounded set has a convergent subsequence.
Moreover, suppose that (x_n)_{n∈N} and (y_n)_{n∈N} are sequences in R, lim_{n→+∞} x_n = x_0 and lim_{n→+∞} y_n = y_0. Then
6. lim_{n→+∞} (x_n + y_n) = x_0 + y_0;
7. lim_{n→+∞} x_n y_n = x_0 y_0;
8. if ∀n ∈ N, x_n ≠ 0 and x_0 ≠ 0, then lim_{n→+∞} 1/x_n = 1/x_0;
9. if ∀n ∈ N, x_n ≤ y_n, then x_0 ≤ y_0;
10. let (z_n)_{n∈N} be a sequence such that ∀n ∈ N, x_n ≤ z_n ≤ y_n, and assume that x_0 = y_0; then lim_{n→+∞} z_n = x_0.

Proof. Most of the above results can be found in Chapter 12 of Simon and Blume (1994), Ok (2007), page 50 on, Morris (2007), pages 126-128, and Apostol (1967).
Proposition 439 If (x_n)_{n∈N} converges to x_0 and (y_n)_{n∈N} is a subsequence of (x_n)_{n∈N}, then (y_n)_{n∈N} converges to x_0.

Proof. By definition of subsequence, there exists a strictly increasing sequence (k_n)_{n∈N} of natural numbers, i.e., 1 ≤ k_1 < k_2 < ... < k_n < ..., such that ∀n ∈ N, y_n = x_{k_n}.
If n → +∞, then k_n → +∞. Moreover, ∀n ∈ N,
d(x_0, y_n) = d(x_0, x_{k_n}).
Taking limits of both sides for n → +∞, and observing that d(x_0, x_{k_n}) → 0, we get the desired result.
Proposition 440 A sequence in (X, d) converges at most to one element in X.

Proof. Assume that x_n →_n p and x_n →_n q; we want to show that p = q. From the triangle inequality,
∀n ∈ N, 0 ≤ d(p, q) ≤ d(p, x_n) + d(x_n, q). (11.15)
Since d(p, x_n) → 0 and d(x_n, q) → 0, Proposition 438 and (11.15) imply that d(p, q) = 0 and therefore p = q.
Proposition 441 Given a sequence (x_n)_{n∈N} = ((x_n^i)_{i=1}^k)_{n∈N} in R^k,
⟨(x_n)_{n∈N} converges to x⟩ ⟺ ⟨∀i ∈ {1, ..., k}, (x_n^i)_{n∈N} converges to x^i⟩,
and
lim_{n→+∞} x_n = (lim_{n→+∞} x_n^i)_{i=1}^k.

Proof. [⟹] Observe that
|x_n^i − x^i| = √((x_n^i − x^i)²) ≤ d(x_n, x).
Then the result follows.
[⟸] By assumption, ∀ε > 0 and ∀i ∈ {1, ..., k}, there exists n_0 such that ∀n > n_0 we have |x_n^i − x^i| < ε/√k. Then ∀n > n_0,
d(x_n, x) = (Σ_{i=1}^k (x_n^i − x^i)²)^{1/2} < (Σ_{i=1}^k (ε/√k)²)^{1/2} = (ε² Σ_{i=1}^k 1/k)^{1/2} = ε.
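A small numerical illustration of Proposition 441 (a sketch; the sequence used is an arbitrary choice): the Euclidean distance to the limit and the componentwise absolute errors vanish together.

```python
# Hedged sketch: convergence in (R^k, d_2) versus componentwise convergence.
import numpy as np

x0 = np.array([1.0, -2.0])
for n in (10, 100, 1000):
    x_n = x0 + np.array([1.0 / n, -1.0 / n])
    print(n, np.linalg.norm(x_n - x0), np.abs(x_n - x0))
```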
Proposition 442 Suppose that (x_n)_{n∈N} and (y_n)_{n∈N} are sequences in R^k, lim_{n→+∞} x_n = x_0 and lim_{n→+∞} y_n = y_0. Then
1. lim_{n→+∞} (x_n + y_n) = x_0 + y_0;
2. ∀c ∈ R, lim_{n→+∞} c x_n = c x_0;
3. lim_{n→+∞} x_n · y_n = x_0 · y_0.

Proof. It follows from Propositions 438 and 441.
Example 443 In Proposition 381 we have seen that (B([0, 1]), d_∞) is a metric space. Observe that, defined ∀n ∈ N∖{0},
f_n : [0, 1] → R, t ↦ t^n,
we have that (f_n)_n ∈ B([0, 1])^∞. Moreover, ∀t ∈ [0, 1], (f_n(t))_{n∈N} ∈ R^∞ and it converges in (R, d_2). In fact,
lim_{n→+∞} t^n = 0 if t ∈ [0, 1), and = 1 if t = 1.
Define
f : [0, 1] → R, t ↦ 0 if t ∈ [0, 1), and t ↦ 1 if t = 1.
We want to check that it is false that
f_m →_{(B([0,1]),d_∞)} f,
i.e., it is false that
d_∞(f_m, f) →_m 0.
[Figure: graphs of f_m(t) = t^m on [0, 1] for increasing values of m.]
∀m ∈ N∖{0}, ∀t ∈ [0, 1],
0 ≤ f_m(t) ≤ 1.
Therefore, by definition of f, ∀m ∈ N∖{0},
∀t ∈ [0, 1), 0 = f(t) ≤ f_m(t) < 1, and f(1) = f_m(1).
Then, ∀m ∈ N∖{0},
∀t ∈ [0, 1), 0 ≤ f_m(t) − f(t) < 1, and ∀ε > 0, ∃t̃ ∈ [0, 1) such that f_m(t̃) > 1 − ε,
and therefore, ∀m ∈ N∖{0},
d_∞(f_m, f) = sup_{t∈[0,1]} |f_m(t) − f(t)| = 1.
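The value of d_∞(f_m, f) can also be approximated numerically; the sketch below replaces the sup over [0, 1] by a maximum over a fine grid (an approximation: the true sup is exactly 1 for every m).

```python
# Hedged sketch: d_inf(f_m, f) approximated on a grid.
import numpy as np

t = np.linspace(0.0, 1.0, 10001)
f = np.where(t < 1.0, 0.0, 1.0)
for m in (1, 10, 100):
    print(m, np.max(np.abs(t**m - f)))   # close to 1 for each m
```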
Exercise 444 For any metric space (X, d) and (x_n)_{n∈N} ∈ X^∞,
⟨x_n →_{(X,d)} x⟩ ⟺ ⟨x_n →_{(X, d/(1+d))} x⟩.
11.4 Sequential characterization of closed sets

Proposition 445 Let (X, d) be a metric space and S ⊆ X. (4)
⟨S is closed⟩ ⟺ ⟨any (X, d) convergent sequence (x_n)_{n∈N} ∈ S^∞ converges to an element of S⟩.

Proof. We want to show that
S is closed ⟺ ⟨(x_n)_{n∈N} is such that 1. ∀n ∈ N, x_n ∈ S, and 2. x_n → x_0⟩ ⟹ ⟨x_0 ∈ S⟩.
[⟹] Suppose otherwise, i.e., 1. and 2. above do hold, but x_0 ∈ X∖S. Since S is closed, X∖S is open and therefore ∃r ∈ R₊₊ such that B(x_0, r) ⊆ X∖S. Since x_n → x_0, ∃M ∈ N such that ∀n > M, x_n ∈ B(x_0, r) ⊆ X∖S, contradicting Assumption 1 above.
[⟸] Suppose otherwise, i.e., S is not closed. Then X∖S is not open. Then ∃x ∈ X∖S such that ∀n ∈ N, ∃x_n ∈ X such that x_n ∈ B(x, 1/n) ∩ S, i.e.,
i. x ∈ X∖S;
ii. ∀n ∈ N, x_n ∈ S;
iii. d(x_n, x) < 1/n, and therefore x_n → x;
and i., ii. and iii. contradict the assumption.

(4) Proposition 503 in Appendix 11.8 presents a different proof of the result above.
Remark 446 The Appendix to this chapter contains some other characterizations of closed sets and summarizes all the presented characterizations of open and closed sets.

11.5 Compactness

Definition 447 Let (X, d) be a metric space, S a subset of X, and Λ a set of arbitrary cardinality. A family S = {S_α}_{α∈Λ} such that ∀α ∈ Λ, S_α is (X, d) open, is said to be an open cover of S if S ⊆ ∪_{α∈Λ} S_α.
A subfamily S′ of S is called a subcover of S if S ⊆ ∪_{S′∈S′} S′.

Definition 448 A metric space (X, d) is compact if every open cover of X has a finite subcover. A set S ⊆ X is compact in X if every open cover of S has a finite subcover.

Example 449 Any finite set in any metric space is compact.
Take S = {x^i}_{i=1}^n in (X, d) and an open cover S of S. For any i ∈ {1, ..., n}, take an open set in S which contains x^i; call it S^i. Then S′ = {S^i : i ∈ {1, ..., n}} is the desired finite subcover of S.
Example 450 1. (0, 1) is not compact in (R, d_2). We want to show that the following statement is true:
∃S such that ∪_{S∈S} S ⊇ (0, 1) and, ∀S′ ⊆ S, either #S′ is infinite or ∪_{S∈S′} S ⊉ (0, 1).
Take S = {(1/n, 1)}_{n∈N∖{0,1}} and let S′ be any finite subfamily of S. Then there exists a finite set N such that S′ = {(1/n, 1)}_{n∈N}. Take n* = max{n ∈ N}. Then
∪_{S∈S′} S = ∪_{n∈N} (1/n, 1) = (1/n*, 1), and (1/n*, 1) ⊉ (0, 1).
2. (0, 1] is not compact in ((0, +∞), d_2). Take S = {(1/n, 1 + 1/n)}_{n∈N∖{0}} and let S′ be any finite subfamily of S. Then there exists a finite set N such that S′ = {(1/n, 1 + 1/n)}_{n∈N}. Take n* = max{n ∈ N} and n_* = min{n ∈ N}. Then
∪_{S∈S′} S = ∪_{n∈N} (1/n, 1 + 1/n) = (1/n*, 1 + 1/n_*), and (1/n*, 1 + 1/n_*) ⊉ (0, 1].
Proposition 451 Let (X, d) be a metric space.
⟨X compact and C ⊆ X closed⟩ ⟹ ⟨C compact⟩.

Proof. Take an open cover S of C. Then S ∪ {X∖C} is an open cover of X. Since X is compact, there exists a finite subcover S′ of S ∪ {X∖C} which covers X. Then S′∖{X∖C} is a finite subcover of S which covers C.
11.5.1 Compactness and bounded, closed sets

Definition 452 Let (X, d) be a metric space and S a subset of X. S is bounded in (X, d) if
∃r ∈ R₊₊ and x ∈ S such that S ⊆ B_{(X,d)}(x, r).

Proposition 453 Let (X, d) be a metric space and S a subset of X.
S compact ⟹ S bounded.

Proof. If S = ∅, we are done. Assume then that S ≠ ∅, and take x ∈ S and B = {B(x, n)}_{n∈N∖{0}}. B is an open cover of X and therefore of S. Then there exists B′ ⊆ B such that
B′ = {B(x, n_i)}_{i∈N},
where N is a finite set and B′ covers S.
Then, taking n* = max_{i∈N} n_i, we get S ⊆ B(x, n*), as desired.
Proposition 454 Let (X, d) be a metric space and S a subset of X.
S compact ⟹ S closed.

Proof. If S = X, we are done by Proposition 409. Assume that S ≠ X: we want to show that X∖S is open. Take x ∈ X∖S. For every y ∈ S, taken r_y ∈ R such that
0 < r_y < (1/2) d(x, y),
we have
B(y, r_y) ∩ B(x, r_y) = ∅.
Now, S = {B(y, r_y) : y ∈ S} is an open cover of S and, since S is compact, there exists a finite subcover S′ of S which covers S, say
S′ = {B(y_n, r_n)}_{n∈N},
with N a finite set. Take
r* = min_{n∈N} r_n,
and therefore r* > 0. Then ∀n ∈ N,
B(y_n, r_n) ∩ B(x, r_n) = ∅ and B(y_n, r_n) ∩ B(x, r*) = ∅,
and
(∪_{n∈N} B(y_n, r_n)) ∩ B(x, r*) = ∅.
Since {B(y_n, r_n)}_{n∈N} covers S, we then have
S ∩ B(x, r*) = ∅,
or
B(x, r*) ⊆ X∖S.
Therefore we have shown that
∀x ∈ X∖S, ∃r* ∈ R₊₊ such that B(x, r*) ⊆ X∖S,
i.e., X∖S is open and S is closed.
Remark 455 Summarizing, we have seen that in any metric space,
S compact ⟹ S bounded and closed.
The opposite implication is false. In fact, the following sets are bounded, closed and not compact.
1. Consider the metric space ((0, +∞), d_2). (0, 1] is closed from Remark 426, it is clearly bounded, and it is not compact from Example 450.2.
2. (X, d), where X is an infinite set and d is the discrete metric:
X is closed, from Remark 414;
X is bounded: take x ∈ X and r = 2;
X is not compact: take S = {B(x, 1)}_{x∈X}. Then ∀x ∈ X there exists a unique element S_x in S such that x ∈ S_x. (5)

Remark 456 In the next section we are going to show that if (X, d) is a Euclidean space with the Euclidean distance and S ⊆ X, then
S compact ⟺ S bounded and closed.

(5) For other examples, see, among others, page 155, Ok (2007).
11.5.2 Sequential compactness

Definition 457 Let a metric space (X, d) be given. S ⊆ X is sequentially compact if every sequence of elements of S has a subsequence which converges to an element of S, i.e.,
⟨(x_n)_{n∈N} is a sequence in S⟩ ⟹ ⟨∃ a subsequence (y_n)_{n∈N} of (x_n)_{n∈N} such that y_n → x ∈ S⟩.

In what follows, we want to prove that in metric spaces compactness is equivalent to sequential compactness. Doing that requires some work and the introduction of some concepts which are useful also in themselves.
Proposition 458 (Nested intervals) For every i ∈ N, define I_i = [a_i, b_i] ⊆ R such that I_{i+1} ⊆ I_i. Then ∩_{i∈N} I_i ≠ ∅.

Proof. By assumption,
a_1 ≤ a_2 ≤ ... ≤ a_n ≤ ... (11.16)
and
... ≤ b_n ≤ b_{n−1} ≤ ... ≤ b_1. (11.17)
Then
∀m, n ∈ N, a_m < b_n,
simply because, if m > n, then a_m < b_m ≤ b_n, where the first inequality follows from the definition of the interval I_m and the second one from (11.17); and if m ≤ n, then a_m ≤ a_n < b_n, where the first inequality follows from (11.16) and the second one from the definition of the interval I_n.
Then A := {a_n : n ∈ N} is nonempty and bounded above by b_n for any n. Then sup A := s exists. Since ∀n ∈ N, b_n is an upper bound for A,
∀n ∈ N, s ≤ b_n,
and, from the definition of sup,
∀n ∈ N, a_n ≤ s.
Then
∀n ∈ N, a_n ≤ s ≤ b_n,
and therefore
∀n ∈ N, s ∈ I_n, i.e., ∩_{n∈N} I_n ≠ ∅.

Remark 459 The statement in the above Proposition is false if, instead of taking closed bounded intervals, we take either open or unbounded intervals. To see that, consider I_n = (0, 1/n) and I_n = [n, +∞).
130 CHAPTER 11. METRIC SPACES
Proposition 460 (Bolzano- Weirstrass) If S R
n
, S with innite cardinality and S bounded .
Then S admits at least an accumulation point, i.e., D(S) 6= .
Proof. Step 1. n = 1.
Since S is bounded, ∃a_0, b_0 ∈ R such that S ⊆ [a_0, b_0] := B_0. Divide B_0 into two subintervals of equal length:
[a_0, (a_0 + b_0)/2] and [(a_0 + b_0)/2, b_0].
Choose an interval which contains an infinite number of points of S, and call B_1 = [a_1, b_1] that interval. Proceed as above for B_1. We therefore obtain a family of intervals
B_0 ⊇ B_1 ⊇ ... ⊇ B_n ⊇ ...
Observe that length B_0 := b_0 − a_0 and
∀n ∈ N, length B_n = (b_0 − a_0)/2^n.
Therefore, ∀ε > 0, ∃N ∈ N such that ∀n > N, length B_n < ε.
From Proposition 458, it follows that
∃x ∈ ∩_{n=0}^{+∞} B_n.
We are now left with showing that x is an accumulation point for S, i.e., that
∀r ∈ R_{++}, B(x, r) contains an infinite number of points of S.
By construction, ∀n ∈ N, B_n contains an infinite number of points of S; it is therefore enough to show that
∀r ∈ R_{++}, ∃n ∈ N such that B(x, r) ⊇ B_n.
Observe that
B(x, r) ⊇ B_n ⇔ (x − r, x + r) ⊇ [a_n, b_n] ⇔ (x − r < a_n and b_n < x + r) ⇔ max {x − a_n, b_n − x} < r.
Moreover, since x ∈ [a_n, b_n],
max {x − a_n, b_n − x} ≤ b_n − a_n = length B_n = (b_0 − a_0)/2^n.
Therefore, it suffices to show that
∀r ∈ R_{++}, ∃n ∈ N such that (b_0 − a_0)/2^n < r,
i.e., n ∈ N and n > log_2 ((b_0 − a_0)/r).
Step 2. Omitted.
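The bisection construction of Step 1 can be mimicked numerically. The following Python sketch (not part of the original argument) uses the hypothetical set S = {1/k : k = 1, ..., 10^5}; as a computable proxy for "contains an infinite number of points of S", each step keeps the half-interval containing more of the listed points. The nested intervals close in on the accumulation point 0.

    S = [1.0 / k for k in range(1, 10**5 + 1)]   # bounded, infinite subset of R
    a, b = 0.0, 1.0                              # B_0 = [a_0, b_0] contains S

    for n in range(30):
        m = (a + b) / 2.0
        left = sum(1 for s in S if a <= s <= m)      # points of S in [a, m]
        right = sum(1 for s in S if m <= s <= b)     # points of S in [m, b]
        # keep a half containing "infinitely many" points (here: more points)
        if left >= right:
            b = m
        else:
            a = m

    print(a, b)   # both close to 0, the accumulation point of S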
Remark 461 The above Proposition does not say that there exists an accumulation point which belongs to S. To see that, consider S = {1/n : n ∈ N\{0}}: its unique accumulation point is 0 ∉ S.
Proposition 462 Let a metric space (X, d) be given and consider the following statements:
1. S is a compact set;
2. every infinite subset of S has an accumulation point which belongs to S, i.e.,
⟨T ⊆ S ∧ #T is infinite⟩ ⇒ ⟨D(T) ∩ S ≠ ∅⟩;
3. S is sequentially compact;
4. S is closed and bounded.
Then
1 ⇔ 2 ⇔ 3 ⇒ 4.
If X = R^n and d = d_2, then we also have that
3 ⇐ 4.
Proof. (1) ⇒ (2)
Take an infinite subset T ⊆ S and suppose otherwise. Then no point of S is an accumulation point of T, i.e., ∀x ∈ S, ∃r_x > 0 such that
B(x, r_x) ∩ (T\{x}) = ∅.
Then
B(x, r_x) ∩ T ⊆ {x}. (11.19)
(Here we use the general fact that
A\B ⊆ C ⇒ A ⊆ C ∪ B, (11.18)
applied to A = B(x, r_x) ∩ T, B = {x} and C = ∅. To see (11.18), observe that, since A\B = A ∩ B^C ⊆ C by assumption, we have (A ∩ B^C) ∪ B ⊆ C ∪ B, and (A ∩ B^C) ∪ B = (A ∪ B) ∩ (B^C ∪ B) = A ∪ B ⊇ A. Observe also that the inclusion in (11.18) can be strict: just take A = {1}, B = {2} and C = {1}; then A\B = {1} ⊆ C and A = {1} ⊊ C ∪ B = {1, 2}.)
Since
S ⊆ ∪_{x∈S} B(x, r_x)
and S is compact, ∃x_1, ..., x_n such that
S ⊆ ∪_{i=1}^n B(x_i, r_{x_i}).
Then, since T ⊆ S,
T = S ∩ T ⊆ (∪_{i=1}^n B(x_i, r_{x_i})) ∩ T = ∪_{i=1}^n (B(x_i, r_{x_i}) ∩ T) ⊆ {x_1, ..., x_n},
where the last inclusion follows from (11.19). But then #T ≤ n, a contradiction.
(2) ⇒ (3)
Take a sequence (x_n)_{n∈N} of elements in S.
If #{x_n : n ∈ N} is finite, then ∃x_n such that x_j = x_n for j in an infinite subset of N, and (x_n, ..., x_n, ...) is the required convergent subsequence, converging to x_n ∈ S.
If #{x_n : n ∈ N} is infinite, then there exists a subsequence (y_n)_{n∈N} of (x_n)_{n∈N} with an infinite number of distinct values, i.e., such that ∀n, m ∈ N with n ≠ m, we have y_n ≠ y_m. To construct the subsequence (y_n)_{n∈N}, proceed as follows:
y_1 = x_1 := x_{k_1},
y_2 = x_{k_2} ∉ {x_{k_1}},
y_3 = x_{k_3} ∉ {x_{k_1}, x_{k_2}},
...
y_n = x_{k_n} ∉ {x_{k_1}, x_{k_2}, ..., x_{k_{n−1}}},
...
Since {y_n : n ∈ N} is an infinite subset of S, by assumption it does have an accumulation point x in S; moreover, we can redefine (y_n)_{n∈N} in order to have ∀n ∈ N, y_n ≠ x (below we need d(y_n, x) > 0), as follows. If ∃k such that y_k = x, take the (sub)sequence (y_{k+1}, y_{k+2}, ...) = (y_{k+n})_{n∈N}. With some abuse of notation,
call still (y_n)_{n∈N} the sequence so obtained. Now take a further subsequence as follows, using the fact that x is an accumulation point of {y_n : n ∈ N} := T:
∃y_{m_1} ∈ T such that d(y_{m_1}, x) < 1/1,
∃y_{m_2} ∈ T such that d(y_{m_2}, x) < min {1/2, min_{m ≤ m_1} d(y_m, x)},
∃y_{m_3} ∈ T such that d(y_{m_3}, x) < min {1/3, min_{m ≤ m_2} d(y_m, x)},
...
∃y_{m_n} ∈ T such that d(y_{m_n}, x) < min {1/n, min_{m ≤ m_{n−1}} d(y_m, x)},
...
Observe that, since ∀n, d(y_{m_n}, x) < min_{m ≤ m_{n−1}} d(y_m, x), we have that ∀n, m_n > m_{n−1}, and therefore (y_{m_n})_{n∈N} is a subsequence of (y_n)_{n∈N} and therefore of (x_n)_{n∈N}. Finally, since
∀n ∈ N, 0 ≤ d(y_{m_n}, x) < 1/n,
we also have that
lim_{n→+∞} d(y_{m_n}, x) = 0, i.e., lim_{n→+∞} y_{m_n} = x,
as desired.
(3) ⇒ (1)
It is the content of Proposition 469 below.
(1) ⇒ (4)
It is the content of Remark 455.
Second proof of (1 ∨ 3) ⇒ (4); compare with the above proofs.
Assume that S is compact; we first show that S is bounded and then that it is closed.
Boundedness. Since S ⊆ ∪_{x∈S} B(x, 1) and S is compact, there exists a finite number n of points x_1, ..., x_n in S such that S ⊆ ∪_{i=1}^n B(x_i, 1). Define M = max {d(x_i, x_j) : i, j ∈ {1, ..., n}}. Take x, y ∈ S; then there exist i and j such that x ∈ B(x_i, 1) and y ∈ B(x_j, 1). Therefore
d(x, y) ≤ d(x, x_i) + d(x_i, x_j) + d(x_j, y) < 1 + M + 1 = M + 2.
Closedness. From Proposition 420, it suffices to show that Cl(S) ⊆ S. Take x ∈ Cl(S). From Proposition 502, ∃(x_n)_{n∈N} in S converging to x ∈ Cl(S). From 3 above, (x_n)_{n∈N} admits a subsequence converging to a point of S. Since, from Proposition 439, every subsequence converges to x, we get x ∈ S, as desired.
If X = R^n, (4) ⇒ (2).
Take an infinite subset T ⊆ S. Since S is bounded, T is bounded as well. Then, from the Bolzano–Weierstrass theorem (i.e., Proposition 460), D(T) ≠ ∅. Since T ⊆ S, from Proposition 496, D(T) ⊆ D(S), and, since S is closed, D(S) ⊆ S. Then, summarizing, ∅ ≠ D(T) ⊆ S and therefore D(T) ∩ S = D(T) ≠ ∅, as desired.
To complete the proof of the above Theorem, it suffices to show that sequential compactness implies compactness, which is done below and requires some preliminary results.
Definition 463 Let (X, d) be a metric space and S a subset of X. S is totally bounded if ∀ε > 0, ∃ a finite set T ⊆ S such that S ⊆ ∪_{x∈T} B(x, ε).
Proposition 464 Let (X, d) be a metric space and S a subset of X.
S totally bounded ⇒ S bounded.
Proof. By assumption, taking ε = 1, there exist N ∈ N and T = {x_i}_{i=1}^N ⊆ S such that
S ⊆ ∪_{i=1}^N B(x_i, 1). (11.20)
Take r = (max_{i∈{1,...,N}} d(x_i, x_1)) + 1. Then, ∀i ∈ {1, ..., N},
d(x_i, x_1) ≤ r − 1. (11.21)
We now want to show that
∪_{i=1}^N B(x_i, 1) ⊆ B(x_1, r), (11.22)
and therefore, from (11.20), we get the desired result.
For any i ∈ {1, ..., N} and for any y ∈ B(x_i, 1),
d(y, x_1) ≤ d(y, x_i) + d(x_i, x_1) < 1 + r − 1 = r,
where the last inequality follows from (11.21) and the fact that y ∈ B(x_i, 1).
Remark 465 In the previous Proposition, the opposite implication does not hold true. Take (X, d) where X is an infinite set and d is the discrete metric: X is bounded, but if ε = 1/2, then B(x, ε) = {x}, so a ball is needed to take care of each element of X, and no finite family of balls suffices.
Remark 466 In (R^n, d_2), S bounded ⇒ S totally bounded. Since S is bounded, there exists r ∈ R_{++} such that S ⊆ B(x, r), and, for any ε > 0, a ball of R^n can be covered by a finite number of balls of radius ε, centered at the points of a sufficiently fine finite grid.
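A Python sketch of the grid argument just mentioned (all names and numbers are illustrative): a bounded set of (R², d_2) sits inside a cube, and the cube is covered by finitely many ε-balls centered on a grid of mesh ε/√n, since every point is then within half a cell diagonal, i.e., within ε/2, of some center.

    import math
    import itertools

    def finite_eps_net(radius, eps, n=2):
        """Grid centers whose eps-balls cover the cube [-radius, radius]^n."""
        step = eps / math.sqrt(n)        # cell diagonal = eps, so half of it is eps/2
        k = int(math.ceil(radius / step))
        axis = [i * step for i in range(-k, k + 1)]
        return list(itertools.product(axis, repeat=n))

    centers = finite_eps_net(radius=1.0, eps=0.1)
    print(len(centers))   # finitely many balls of radius 0.1 cover [-1, 1]^2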
Lemma 467 Let (X, d) be a metric space and S a subset of X.
S sequentially compact ⇒ S totally bounded.
Proof. Suppose otherwise, i.e., ∃ε > 0 such that, for any finite set T ⊆ S, S ⊈ ∪_{x∈T} B(x, ε). We are now going to construct a sequence in S which does not admit any convergent subsequence, contradicting sequential compactness.
Take an arbitrary
x_1 ∈ S.
Then, by assumption, S ⊈ B(x_1, ε). Then take x_2 ∈ S\B(x_1, ε), i.e.,
x_2 ∈ S and d(x_1, x_2) ≥ ε.
By assumption, S ⊈ B(x_1, ε) ∪ B(x_2, ε). Then take x_3 ∈ S\(B(x_1, ε) ∪ B(x_2, ε)), i.e.,
x_3 ∈ S and, for i ∈ {1, 2}, d(x_3, x_i) ≥ ε.
By the axiom of choice, we get that
∀n ∈ N, x_n ∈ S and, for i ∈ {1, ..., n − 1}, d(x_n, x_i) ≥ ε.
Therefore, we have constructed a sequence (x_n)_{n∈N} ⊆ S such that
∀i, j ∈ N, if i ≠ j, then d(x_i, x_j) ≥ ε. (11.23)
But then (x_n)_{n∈N} does not have any convergent subsequence in S, as verified below. Suppose otherwise; then (x_n)_{n∈N} would admit a subsequence (x_{n_k})_{k∈N} ⊆ S such that x_{n_k} → x ∈ S. But, by definition of convergence, ∃N ∈ N such that ∀k > N, d(x_{n_k}, x) < ε/2, and therefore
d(x_{n_k}, x_{n_{k+1}}) ≤ d(x_{n_k}, x) + d(x_{n_{k+1}}, x) < ε,
contradicting (11.23).
Lemma 468 Let (X, d) be a metric space and S a subset of X.
⟨S sequentially compact ∧ S is an open cover of S⟩ ⇒ ⟨∃ε > 0 such that ∀x ∈ S, ∃O_x ∈ S such that B(x, ε) ⊆ O_x⟩.
Proof. Suppose otherwise; then
∀n ∈ N, ∃x_n ∈ S such that ∀O ∈ S, B(x_n, 1/n) ⊈ O. (11.24)
By sequential compactness, the sequence (x_n)_{n∈N} ⊆ S admits a subsequence (x_{n_k})_{k∈N} ⊆ S such that x_{n_k} → x ∈ S. Since S is an open cover of S, ∃O ∈ S such that x ∈ O and, since O is open, ∃ε > 0 such that
B(x, ε) ⊆ O. (11.25)
Since x_{n_k} → x, ∃M ∈ N such that ∀k > M, x_{n_k} ∈ B(x, ε/2). Now take k > M such that n_k > 2/ε. Then
B(x_{n_k}, 1/n_k) ⊆ B(x, ε). (11.26)
Indeed, we want to show that d(y, x_{n_k}) < 1/n_k ⇒ d(y, x) < ε. But
d(y, x) ≤ d(y, x_{n_k}) + d(x_{n_k}, x) < 1/n_k + ε/2 < ε.
From (11.25) and (11.26), we get B(x_{n_k}, 1/n_k) ⊆ O ∈ S, contradicting (11.24).
Proposition 469 Let (X, d) be a metric space and S a subset of X.
S sequentially compact ⇒ S compact.
Proof. Take an open cover S of S. Since S is sequentially compact, from Lemma 468,
∃ε > 0 such that ∀x ∈ S, ∃O_x ∈ S such that B(x, ε) ⊆ O_x.
Moreover, from Lemma 467 and the definition of total boundedness, there exists a finite set T ⊆ S such that S ⊆ ∪_{x∈T} B(x, ε) ⊆ ∪_{x∈T} O_x. But then {O_x : x ∈ T} is the required finite subcover of S which covers S.
We conclude our discussion of compactness with some results which, we hope, will clarify the concept of compactness in R^n.
Proposition 470 Let X be a proper subset of R^n, and C a subset of X. Then
C is bounded and (R^n, d_2)-closed
⇕ (1)
C is (R^n, d_2)-compact
⇕ (2)
C is (X, d_2)-compact
⇓ (3) (the opposite implication is false)
C is bounded and (X, d_2)-closed.
Proof. [1 ⇕]
It is the content of Propositions 453, 454 and of the last part of Proposition 462.
Observe preliminarily that, for any family {S_α}_{α∈A} of subsets of R^n,
∪_{α∈A} (X ∩ S_α) = X ∩ (∪_{α∈A} S_α).
[2 ⇓]
Take T := {T_α}_{α∈A} such that ∀α ∈ A, T_α is (X, d)-open and C ⊆ ∪_{α∈A} T_α. From Proposition 427,
∀α ∈ A, ∃S_α such that S_α is (R^n, d_2)-open and T_α = X ∩ S_α.
Then
C ⊆ ∪_{α∈A} T_α = ∪_{α∈A} (X ∩ S_α) = X ∩ (∪_{α∈A} S_α).
We then have that
C ⊆ ∪_{α∈A} S_α,
i.e., S := {S_α}_{α∈A} is a (R^n, d_2)-open cover of C and, since C is (R^n, d_2)-compact, there exists a finite subcover {S_i}_{i∈N} of S such that
C ⊆ ∪_{i∈N} S_i.
Since C ⊆ X, we then have
C ⊆ (∪_{i∈N} S_i) ∩ X = ∪_{i∈N} (S_i ∩ X) = ∪_{i∈N} T_i,
i.e., {T_i}_{i∈N} is a (X, d)-open subcover of {T_α}_{α∈A} which covers C, as required.
[2 ⇑]
Take S := {S_α}_{α∈A} such that ∀α ∈ A, S_α is (R^n, d_2)-open and C ⊆ ∪_{α∈A} S_α. From Proposition 427,
∀α ∈ A, T_α := X ∩ S_α is (X, d)-open.
Since C ⊆ X, we then have
C ⊆ (∪_{α∈A} S_α) ∩ X = ∪_{α∈A} (S_α ∩ X) = ∪_{α∈A} T_α.
Then, by assumption, there exists an open subcover {T_i}_{i∈N} of {T_α}_{α∈A} which covers C, i.e., there exists a set N with finite cardinality such that
C ⊆ ∪_{i∈N} T_i = ∪_{i∈N} (S_i ∩ X) = (∪_{i∈N} S_i) ∩ X ⊆ ∪_{i∈N} S_i,
i.e., {S_i}_{i∈N} is a (R^n, d_2)-open subcover of {S_α}_{α∈A} which covers C, as required.
[3 ⇓]
It is the content of Propositions 453, 454.
[3 not ⇑]
See Remark 455.1.
Remark 471 The proof of part [2 ⇕] above can be used to show the following result. Given a metric space (X, d), a metric subspace (Y, d) and a set C ⊆ Y, then
C is (Y, d)-compact
⇕
C is (X, d)-compact.
In other words, (X′, d)-compactness of C ⊆ X′ ⊆ X is an intrinsic property of C: it does not depend on the subspace X′ you are considering. On the other hand, as we have seen, closedness and openness are not intrinsic properties of a set.
Remark 472 Observe also that to define compact sets as closed and bounded sets would not be a good choice anyway. The conclusion of the Extreme Value Theorem (see Theorem 528) would not hold in that case. That theorem basically says that a continuous real valued function on a compact set admits a global maximum. It is not the case that a continuous real valued function on a closed and bounded set admits a global maximum: consider the continuous function
f : (0, 1] → R, f(x) = 1/x.
The set (0, 1] is bounded and closed (in ((0, +∞), d_2)) and f has no maximum on (0, 1].
11.6 Completeness
11.6.1 Cauchy sequences
Definition 473 Let (X, d) be a metric space. A sequence (x_n)_{n∈N} ⊆ X is a Cauchy sequence if
∀ε > 0, ∃N ∈ N such that ∀l, m > N, d(x_l, x_m) < ε.
Proposition 474 Let a metric space (X, d) and a sequence (x_n)_{n∈N} ⊆ X be given.
1. (x_n)_{n∈N} is convergent ⇒ (x_n)_{n∈N} is Cauchy, but not vice-versa;
2. (x_n)_{n∈N} is Cauchy ⇒ (x_n)_{n∈N} is bounded;
3. (x_n)_{n∈N} is Cauchy and it has a subsequence converging in X ⇒ (x_n)_{n∈N} is convergent.
Proof. 1.
[⇒] Since (x_n)_{n∈N} is convergent, by definition ∃x ∈ X such that x_n → x, and therefore ∃N ∈ N such that ∀l, m > N, d(x, x_l) < ε/2 and d(x, x_m) < ε/2. But then d(x_l, x_m) ≤ d(x_l, x) + d(x, x_m) < ε/2 + ε/2 = ε.
[⇍]
Take X = (0, 1), d = absolute value, and (x_n)_{n∈N\{0}} ⊆ (0, 1) such that ∀n ∈ N\{0}, x_n = 1/n.
(x_n)_{n∈N\{0}} is Cauchy:
∀ε > 0, d(1/l, 1/m) = |1/l − 1/m| ≤ 1/l + 1/m < ε/2 + ε/2 = ε,
where the last inequality is true if 1/l < ε/2 and 1/m < ε/2, i.e., if l > 2/ε and m > 2/ε. Then it is enough to take N > 2/ε, N ∈ N, to get the desired result.
(x_n)_{n∈N\{0}} is not convergent to any point in (0, 1):
take any x ∈ (0, 1). We want to show that
∃ε > 0 such that ∀N ∈ N, ∃n > N such that d(x_n, x) > ε.
Take ε = x/2 > 0 and, ∀N ∈ N, take n* ∈ N such that 1/n* < min {1/N, x/2}. Then n* > N, and
d(x_{n*}, x) = x − 1/n* > x − x/2 = x/2 = ε.
2.
Take ε = 1. Then ∃N ∈ N such that ∀l, m > N, d(x_l, x_m) < 1. Define
r = 1 + max {1, d(x_1, x_{N+1}), ..., d(x_N, x_{N+1})}.
Then
{x_n : n ∈ N} ⊆ B(x_{N+1}, r),
since d(x_n, x_{N+1}) < 1 < r for n > N and d(x_n, x_{N+1}) ≤ r − 1 < r for n ≤ N.
3.
Let (x_{n_k})_{k∈N} be a subsequence converging to x ∈ X. Then
d(x_n, x) ≤ d(x_n, x_{n_k}) + d(x_{n_k}, x).
Since d(x_n, x_{n_k}) → 0, because the sequence is Cauchy, and d(x_{n_k}, x) → 0, because the subsequence is convergent, the desired result follows.
11.6.2 Complete metric spaces
Definition 475 A metric space (X, d) is complete if every Cauchy sequence is a convergent sequence.
Remark 476 If a metric space is complete, to show convergence you do not need to guess the limit of the sequence: it is enough to show that the sequence is Cauchy.
Example 477 ((0, 1), absolute value) is not a complete metric space; it is enough to consider (1/n)_{n∈N\{0}}.
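A quick numerical illustration (a sketch, not part of the text) of Example 477: in X = (0, 1), the tail terms of x_n = 1/n get arbitrarily close to each other, with N chosen as in the proof of Proposition 474.1, while the only candidate limit, 0, lies outside X.

    def x(n):
        return 1.0 / n

    eps = 1e-3
    N = int(2 / eps) + 1          # N > 2/eps, as in the proof of Proposition 474.1
    tail = range(N + 1, N + 50)
    print(all(abs(x(l) - x(m)) < eps for l in tail for m in tail))  # True: Cauchy tail
    print(x(10**6))               # terms approach 0, which is not a point of (0, 1)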
Example 478 Let (X, d) be a discrete metric space. Then it is complete. Take a Cauchy sequence (x_n)_{n∈N} ⊆ X. Then, we claim, ∃N ∈ N and x ∈ X such that ∀n > N, x_n = x. Suppose otherwise:
∀N ∈ N, ∃m, m′ > N such that x_m ≠ x_{m′};
but then d(x_m, x_{m′}) = 1, contradicting the fact that the sequence is Cauchy (take ε < 1). An eventually constant sequence clearly converges.
Example 479 (Q, d_2) is not a complete metric space. Since Q is dense in R, given x ∈ R\Q we can find (x_n)_{n∈N} ⊆ Q such that x_n → x; such a sequence is Cauchy in Q, but it does not converge to any point of Q.
Exercise 480 (R, d_2) is complete.
Example 481 For any nonempty set T, (B(T), d_∞) is a complete metric space.
Let (f_n)_{n∈N} ⊆ B(T) be a Cauchy sequence. For any x ∈ T, (f_n(x))_{n∈N} ⊆ R is a Cauchy sequence, and, since R is complete, it converges, say to f_x ∈ R. Define
f : T → R, x ↦ f_x.
We are going to show that (i) f ∈ B(T), and (ii) f_n → f.
(i). Since (f_n)_n is Cauchy,
∀ε > 0, ∃N ∈ N such that ∀l, m > N, d_∞(f_l, f_m) := sup_{x∈T} |f_l(x) − f_m(x)| < ε.
Then,
∀x ∈ T, |f_l(x) − f_m(x)| ≤ sup_{x∈T} |f_l(x) − f_m(x)| = d_∞(f_l, f_m) < ε. (11.27)
Taking limits of both sides of (11.27) for l → +∞, and using the continuity of the absolute value function, we have that
∀x ∈ T, lim_{l→+∞} |f_l(x) − f_m(x)| = |f(x) − f_m(x)| ≤ ε. (11.28)
Since (see, for example, Ok (2007), page 37)
∀x ∈ T, ||f(x)| − |f_m(x)|| ≤ |f(x) − f_m(x)| ≤ ε,
we therefore have
∀x ∈ T, |f(x)| ≤ |f_m(x)| + ε.
Since f_m ∈ B(T), f ∈ B(T) as well.
(ii). From (11.28), we also have that, ∀m > N,
∀x ∈ T, |f(x) − f_m(x)| ≤ ε,
and, by definition of sup,
d_∞(f_m, f) := sup_{x∈T} |f_m(x) − f(x)| ≤ ε,
i.e., d_∞(f_m, f) → 0.
For future use, we also show the following result.
Proposition 482 BC(X) := {f : X → R : f is bounded and continuous}, endowed with the metric d(f, g) = sup_{x∈X} |f(x) − g(x)|, is a complete metric space.
Proof. See Stokey and Lucas (1989), page 47.
11.6.3 Completeness and closedness
Proposition 483 Let a metric space (X, d) and a metric subspace (Y, d) of (X, d) be given.
1. Y complete ⇒ Y closed;
2. Y closed and X complete ⇒ Y complete.
Proof. 1.
Take (x_n)_{n∈N} ⊆ Y such that x_n → x ∈ X. From Proposition 445, it is enough to show that x ∈ Y. Since (x_n)_{n∈N} is convergent in X, then, using an argument similar to the one used in the proof of Proposition 474.1, it is Cauchy. Since Y is complete, by definition x_n → x′ ∈ Y, and, by uniqueness of limits, x = x′ ∈ Y.
2.
Take a Cauchy sequence (x_n)_{n∈N} ⊆ Y. We want to show that x_n → x ∈ Y. Since Y ⊆ X, (x_n)_{n∈N} is Cauchy in X, and, since X is complete, x_n → x ∈ X. But since Y is closed, x ∈ Y.
Remark 484 An example of a metric subspace (Y, d) of (X, d) which is closed and not complete is the following one: (X, d) = ((0, +∞), d_2), (Y, d) = ((0, 1], d_2), with (x_n)_{n∈N\{0}} = (1/n)_{n∈N\{0}}.
Corollary 485 Let a complete metric space (X, d) and a metric subspace (Y, d) of (X, d) be given. Then
Y complete ⇔ Y closed.
11.7 Fixed point theorem: contractions
Definition 486 Let (X, d) be a metric space. A function φ : X → X is said to be a contraction if
∃k ∈ (0, 1) such that ∀x, y ∈ X, d(φ(x), φ(y)) ≤ k · d(x, y).
The inf of the set of k satisfying the above condition is called the contraction coefficient of φ.
Example 487 1. Given (R, d_2),
f_α : R → R, x ↦ αx
is a contraction iff |α| < 1; in that case, |α| is the contraction coefficient of f_α.
2. Let S be a nonempty open interval of R and f : S → S a differentiable function. If
sup_{x∈S} |f′(x)| < 1,
then f is a contraction.
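A small Python check (an illustration under the assumption f(x) = cos(x)/2, which fits item 2 above on S = R since sup |f′(x)| = sup |sin(x)|/2 = 1/2 < 1): sampled difference quotients never exceed the coefficient 1/2.

    import math
    import random

    f = lambda x: math.cos(x) / 2.0

    random.seed(0)
    ratios = []
    for _ in range(10**4):
        x, y = random.uniform(-10, 10), random.uniform(-10, 10)
        if x != y:
            ratios.append(abs(f(x) - f(y)) / abs(x - y))
    print(max(ratios) <= 0.5 + 1e-12)   # True: empirical Lipschitz ratio <= 1/2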
Definition 488 For any f, g ∈ X ⊆ B(T), we say that f ≤ g if ∀x ∈ T, f(x) ≤ g(x).
Proposition 489 (Blackwell) Let the following objects be given:
1. a nonempty set T;
2. a nonempty subset X of the set B(T) such that ∀f ∈ X, ∀α ∈ R_+, f + α ∈ X;
3. φ : X → X which is increasing, i.e., f ≤ g ⇒ φ(f) ≤ φ(g);
4. β ∈ (0, 1) such that ∀f ∈ X, ∀α ∈ R_+, φ(f + α) ≤ φ(f) + βα.
Then φ is a contraction with contraction coefficient β.
Proof. ∀f, g ∈ X, ∀x ∈ T,
f(x) − g(x) ≤ |f(x) − g(x)| ≤ sup_{x∈T} |f(x) − g(x)| = d_∞(f, g).
Therefore, f ≤ g + d_∞(f, g), and, from Assumption 3,
φ(f) ≤ φ(g + d_∞(f, g)).
Then, from Assumption 4,
φ(g + d_∞(f, g)) ≤ φ(g) + β d_∞(f, g),
and therefore
φ(f) ≤ φ(g) + β d_∞(f, g). (11.29)
Since the argument above is symmetric with respect to f and g, we also have
φ(g) ≤ φ(f) + β d_∞(f, g). (11.30)
From (11.29) and (11.30) and the definition of absolute value, we have, ∀x ∈ T,
|φ(f)(x) − φ(g)(x)| ≤ β d_∞(f, g),
and, taking the sup over x ∈ T, d_∞(φ(f), φ(g)) ≤ β d_∞(f, g),
as desired.
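A minimal numerical sketch of Blackwell's conditions. The operator below, (φf)(t) = u(t) + β max f on the finite set T = {0, ..., 9}, is hypothetical, chosen only because it visibly satisfies Assumptions 1–4; the check confirms the contraction inequality d_∞(φ(f), φ(g)) ≤ β d_∞(f, g) on random inputs.

    import random

    beta = 0.9
    u = [t ** 0.5 for t in range(10)]            # any bounded function on T

    def phi(f):
        # increasing in f (Assumption 3), and phi(f + a) = phi(f) + beta*a (Assumption 4)
        return [u[t] + beta * max(f) for t in range(10)]

    def d_inf(f, g):
        return max(abs(a - b) for a, b in zip(f, g))

    random.seed(1)
    f = [random.uniform(0, 5) for _ in range(10)]
    g = [random.uniform(0, 5) for _ in range(10)]
    print(d_inf(phi(f), phi(g)) <= beta * d_inf(f, g) + 1e-12)   # True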
Proposition 490 (Banach fixed point theorem) Let (X, d) be a complete metric space. If φ : X → X is a contraction with coefficient k, then
∃! x* ∈ X such that x* = φ(x*), (11.31)
and
∀x_0 ∈ X and ∀n ∈ N\{0}, d(φ^n(x_0), x*) ≤ k^n · d(x_0, x*), (11.32)
where φ^n := φ ∘ φ ∘ ... ∘ φ (n times).
Proof. (11.31) holds true.
Take any x_0 ∈ X and define the sequence
(x_n)_{n∈N} ⊆ X, with ∀n ∈ N, x_{n+1} = φ(x_n).
We want to show 1. that (x_n)_{n∈N} is Cauchy, 2. that its limit is a fixed point for φ, and 3. that the fixed point is unique.
1. First of all, observe that
∀n ∈ N\{0}, d(x_{n+1}, x_n) ≤ k^n · d(x_1, x_0), (11.33)
where k is the contraction coefficient of φ, as shown by induction below.
Step 1: P(1) is true:
d(x_2, x_1) = d(φ(x_1), φ(x_0)) ≤ k · d(x_1, x_0),
from the definition of the chosen sequence and the assumption that φ is a contraction.
Step 2: P(n − 1) ⇒ P(n):
d(x_{n+1}, x_n) = d(φ(x_n), φ(x_{n−1})) ≤ k · d(x_n, x_{n−1}) ≤ k^n · d(x_1, x_0),
from the definition of the chosen sequence, the assumption that φ is a contraction and the assumption of the induction step.
Now, for any m, l ∈ N with m > l,
d(x_m, x_l) ≤ d(x_m, x_{m−1}) + d(x_{m−1}, x_{m−2}) + ... + d(x_{l+1}, x_l) ≤
≤ (k^{m−1} + k^{m−2} + ... + k^l) · d(x_1, x_0) = k^l · ((1 − k^{m−l})/(1 − k)) · d(x_1, x_0),
where the first inequality follows from the triangle inequality, the second one from (11.33) and the equality from the following computation:
k^{m−1} + k^{m−2} + ... + k^l = k^l (1 + k + ... + k^{m−l−1}) = k^l · (1 − k^{m−l})/(1 − k).
Finally, since k ∈ (0, 1), we get
d(x_m, x_l) ≤ (k^l/(1 − k)) · d(x_1, x_0). (11.34)
We are also using the basic fact used to study geometric series. Define
s_n := 1 + a + a² + ... + a^n.
Multiply both sides of the above equality by (1 − a):
(1 − a) s_n = (1 + a + a² + ... + a^n) − (a + a² + ... + a^{n+1}) = 1 − a^{n+1}.
Divide both sides by (1 − a), for a ≠ 1:
s_n = (1 − a^{n+1})/(1 − a) = 1/(1 − a) − a^{n+1}/(1 − a).
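A one-line numerical check of the identity above (the values of a and n are illustrative):

    a, n = 0.3, 25
    s_n = sum(a ** i for i in range(n + 1))                  # 1 + a + ... + a^n
    print(abs(s_n - (1 - a ** (n + 1)) / (1 - a)) < 1e-12)   # True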
If x_1 = x_0, then, for any m, l ∈ N with m > l, d(x_m, x_l) = 0 and ∀n ∈ N, x_n = x_0: the sequence is converging and therefore it is Cauchy. Therefore, consider the case x_1 ≠ x_0. From (11.34), it follows that (x_n)_{n∈N} ⊆ X is Cauchy: ∀ε > 0, choose N ∈ N such that
(k^N/(1 − k)) · d(x_1, x_0) < ε,
i.e., k^N < ε(1 − k)/d(x_1, x_0), i.e., N > log (ε(1 − k)/d(x_1, x_0)) / log k.
2. Since (X, d) is a complete metric space, (x_n)_{n∈N} ⊆ X does converge, say to x* ∈ X, and, in fact, we want to show that φ(x*) = x*. ∀ε > 0, ∃N ∈ N such that ∀n > N,
d(φ(x*), x*) ≤ d(φ(x*), x_{n+1}) + d(x_{n+1}, x*) = d(φ(x*), φ(x_n)) + d(x_{n+1}, x*) ≤ k · d(x*, x_n) + d(x_{n+1}, x*) ≤ ε/2 + ε/2 = ε,
where the first inequality comes from the triangle inequality, the equality from the construction of the sequence (x_n)_{n∈N}, the second inequality from the assumption that φ is a contraction and the last one from the fact that (x_n)_{n∈N} converges to x*. Since ε is arbitrary, d(φ(x*), x*) = 0, as desired.
3. Suppose that x̂ is another fixed point for φ, besides x*. Then
d(x̂, x*) = d(φ(x̂), φ(x*)) ≤ k · d(x̂, x*),
and assuming x̂ ≠ x* would imply 1 ≤ k, contradicting the fact that φ is a contraction with contraction coefficient k.
(11.32) holds true.
We show the claim by induction on n ∈ N\{0}.
P(1) is true:
d(φ(x_0), x*) = d(φ(x_0), φ(x*)) ≤ k · d(x_0, x*),
where the equality follows from the fact that x* is a fixed point for φ, and the inequality from the fact that φ is a contraction.
P(n − 1) is true implies that P(n) is true:
d(φ^n(x_0), x*) = d(φ^n(x_0), φ(x*)) = d(φ(φ^{n−1}(x_0)), φ(x*)) ≤ k · d(φ^{n−1}(x_0), x*) ≤ k · k^{n−1} · d(x_0, x*) = k^n · d(x_0, x*).
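The whole theorem can be watched at work numerically. The Python sketch below (an illustration, with the hypothetical contraction φ(x) = cos(x)/2 on the complete space (R, d_2), coefficient k = 1/2) iterates φ from an arbitrary x_0 and checks the error bound (11.32) along the way.

    import math

    phi = lambda x: math.cos(x) / 2.0     # contraction on R with k = 1/2
    k = 0.5

    x_star = 0.0                          # locate the fixed point by iterating
    for _ in range(200):
        x_star = phi(x_star)

    x = x_0 = 3.0
    d0 = abs(x_0 - x_star)
    for n in range(1, 11):
        x = phi(x)
        assert abs(x - x_star) <= k ** n * d0 + 1e-12    # bound (11.32)
    print(x_star)   # approximately 0.4502, the unique fixed point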
11.8 Appendix. Some characterizations of open and closed sets
Remark 491 From basic set theory, we have A ∩ B^C = ∅ ⇔ A ⊆ B, as verified below:
⟨∀x, ¬(x ∈ A ∧ x ∈ B^C)⟩ = ⟨∀x, ¬(x ∈ A) ∨ x ∈ B⟩ =(∗) ⟨∀x, x ∈ A ⇒ x ∈ B⟩,
where (∗) follows from the fact that ⟨p ⇒ q⟩ = ⟨(¬p) ∨ q⟩.
Proposition 492 S is open ⇔ S ∩ F(S) = ∅.
Proof. [⇒]
Suppose otherwise, i.e., ∃x ∈ S ∩ F(S). Since x ∈ F(S), ∀r ∈ R_{++}, B(x, r) ∩ S^C ≠ ∅. Then, from Remark 491, ∀r ∈ R_{++}, it is false that B(x, r) ⊆ S, contradicting the assumption that S is open.
[⇐]
Suppose otherwise, i.e., ∃x ∈ S such that
∀r ∈ R_{++}, B(x, r) ∩ S^C ≠ ∅. (11.35)
Moreover, since x ∈ B(x, r) ∩ S,
∀r ∈ R_{++}, B(x, r) ∩ S ≠ ∅. (11.36)
But (11.35) and (11.36) imply x ∈ F(S). Since x ∈ S, we would have S ∩ F(S) ≠ ∅, contradicting the assumption.
Proposition 493 S is closed ⇔ F(S) ⊆ S.
Proof.
S closed ⇔ S^C open ⇔(1) S^C ∩ F(S^C) = ∅ ⇔(2) S^C ∩ F(S) = ∅ ⇔(3) F(S) ⊆ S,
where
(1) follows from Proposition 492;
(2) follows from Remark 417;
(3) follows from Remark 491.
Proposition 494 S is closed ⇔ D(S) ⊆ S.
Proof. We are going to use Proposition 493, i.e., S is closed ⇔ F(S) ⊆ S.
[⇒]
Suppose otherwise, i.e.,
∃x ∉ S such that ∀r ∈ R_{++}, (S\{x}) ∩ B(x, r) ≠ ∅,
and, since x ∉ S, it is also true that
∀r ∈ R_{++}, S ∩ B(x, r) ≠ ∅ (11.37)
and, since x ∈ S^C ∩ B(x, r),
∀r ∈ R_{++}, S^C ∩ B(x, r) ≠ ∅. (11.38)
From (11.37) and (11.38), it follows that x ∈ F(S), while x ∉ S, which, from Proposition 493, contradicts the assumption that S is closed.
[⇐]
Suppose otherwise, i.e., using Proposition 493,
∃x ∈ F(S) such that x ∉ S.
Then, by definition of F(S),
∀r ∈ R_{++}, B(x, r) ∩ S ≠ ∅.
Since x ∉ S, we also have
∀r ∈ R_{++}, B(x, r) ∩ (S\{x}) ≠ ∅,
i.e., x ∈ D(S) and x ∉ S, a contradiction.
Remark 495 The above proof shows also that
∀x ∉ S, ⟨x ∈ D(S) ⇔ x ∈ F(S)⟩.
Proposition 496 ∀S, T ⊆ X, S ⊆ T ⇒ D(S) ⊆ D(T).
Proof. Take x ∈ D(S). Then
∀r ∈ R_{++}, (S\{x}) ∩ B(x, r) ≠ ∅. (11.39)
Since S ⊆ T, we also have
(T\{x}) ∩ B(x, r) ⊇ (S\{x}) ∩ B(x, r). (11.40)
From (11.39) and (11.40), we get x ∈ D(T).
Proposition 497 S ∪ D(S) is a closed set.
Proof. Take x ∈ (S ∪ D(S))^C. Since x ∉ D(S), ∃r ∈ R_{++} such that B(x, r) ∩ (S\{x}) = ∅. Since x ∉ S, we also have that
∃r ∈ R_{++} such that S ∩ B(x, r) = ∅, (11.41)
and therefore
B(x, r) ⊆ S^C.
Let's show that B(x, r) ∩ D(S) = ∅. If y ∈ B(x, r), then, since B(x, r) is open, ∃r′ ∈ R_{++} such that B(y, r′) ⊆ B(x, r), and, from (11.41),
(S\{y}) ∩ B(y, r′) ⊆ S ∩ B(x, r) = ∅,
so y ∉ D(S), i.e.,
B(x, r) ∩ D(S) = ∅. (11.42)
Then, from (11.41) and (11.42),
∃r ∈ R_{++} such that B(x, r) ∩ (S ∪ D(S)) = (B(x, r) ∩ S) ∪ (B(x, r) ∩ D(S)) = ∅,
i.e.,
∃r ∈ R_{++} such that B(x, r) ⊆ (S ∪ D(S))^C,
or x ∈ Int (S ∪ D(S))^C, which is therefore open.
Proposition 498 Cl(S) = S ∪ D(S).
Proof. [⊇]
Since
S ⊆ Cl(S), (11.43)
from Proposition 496,
D(S) ⊆ D(Cl(S)). (11.44)
Since Cl(S) is closed, from Proposition 494,
D(Cl(S)) ⊆ Cl(S). (11.45)
From (11.43), (11.44) and (11.45), we get
S ∪ D(S) ⊆ Cl(S).
[⊆]
Since, from the previous Proposition, S ∪ D(S) is closed and contains S, then, by definition of Cl(S),
Cl(S) ⊆ S ∪ D(S).
Proposition 499 Cl(S) = Int S ∪ F(S).
Proof. X = Int S ∪ F(S) ∪ Int S^C and
(Int S ∪ F(S))^C = Int S^C.
Therefore, it is enough to show that
(Cl(S))^C = Int S^C.
[⊇]
Take x ∈ Int S^C. Then ∃r ∈ R_{++} such that B(x, r) ⊆ S^C, and therefore B(x, r) ∩ S = ∅ and, since x ∉ S,
B(x, r) ∩ (S\{x}) = ∅.
Then x ∉ S and x ∉ D(S), i.e.,
x ∉ S ∪ D(S) = Cl(S),
where the last equality follows from Proposition 498. In other words, x ∈ (Cl(S))^C.
[⊆]
Take x ∈ (Cl(S))^C = (D(S) ∪ S)^C. Since x ∉ D(S),
∃r ∈ R_{++} such that (S\{x}) ∩ B(x, r) = ∅. (11.46)
Since x ∉ S,
S ∩ B(x, r) = ∅, (11.47)
i.e.,
B(x, r) ⊆ S^C, (11.48)
and x ∈ Int S^C.
Definition 500 x ∈ X is an adherent point for S if ∀r ∈ R_{++}, B(x, r) ∩ S ≠ ∅, and
Ad(S) := {x ∈ X : ∀r ∈ R_{++}, B(x, r) ∩ S ≠ ∅}.
Corollary 501 1. Cl(S) = Ad(S).
2. A set S is closed ⇔ Ad(S) = S.
Proof. 1.
[⊆]
x ∈ Cl(S) ⇒ ⟨x ∈ Int S or x ∈ F(S)⟩, and in both cases the desired conclusion is insured.
[⊇]
If x ∈ S, then, by definition of closure, x ∈ Cl(S). If x ∉ S, then S = S\{x} and, from the assumption, ∀r ∈ R_{++}, B(x, r) ∩ (S\{x}) ≠ ∅, i.e., x ∈ D(S), which is contained in Cl(S) from Proposition 498.
2. It follows from 1. above and Proposition 420.2.
Proposition 502 x ∈ Cl(S) ⇔ ∃(x_n)_{n∈N} in S converging to x.
Proof. [⇒]
From Corollary 501, if x ∈ Cl(S), then ∀n ∈ N we can take x_n ∈ B(x, 1/n) ∩ S. Then d(x, x_n) < 1/n and lim_{n→+∞} d(x, x_n) = 0.
[⇐]
By definition of convergence,
∀ε > 0, ∃n_ε ∈ N such that ∀n > n_ε, d(x_n, x) < ε, or x_n ∈ B(x, ε),
so
∀ε > 0, B(x, ε) ∩ S ⊇ {x_n : n > n_ε},
and
∀ε > 0, B(x, ε) ∩ S ≠ ∅,
and from the Corollary the desired result follows.
Proposition 503 S is closed ⇔ any convergent sequence (x_n)_{n∈N} with elements in S converges to an element of S.
Proof. We are going to use Proposition 494, i.e., S is closed ⇔ D(S) ⊆ S. We want to show that
⟨D(S) ⊆ S⟩ ⇔ ⟨(x_n)_{n∈N} is such that 1. ∀n ∈ N, x_n ∈ S, and 2. x_n → x_0, imply x_0 ∈ S⟩.
[⇒]
Suppose otherwise, i.e., there exists (x_n)_{n∈N} such that 1. ∀n ∈ N, x_n ∈ S, and 2. x_n → x_0, but x_0 ∉ S.
By definition of convergent sequence, we have
∀ε > 0, ∃n_0 ∈ N such that ∀n > n_0, d(x_n, x_0) < ε,
and, since ∀n ∈ N, x_n ∈ S and x_0 ∉ S,
{x_n : n > n_0} ⊆ B(x_0, ε) ∩ (S\{x_0}).
Then
∀ε > 0, B(x_0, ε) ∩ (S\{x_0}) ≠ ∅,
and therefore x_0 ∈ D(S) while x_0 ∉ S, contradicting the fact that S is closed.
[⇐]
Suppose otherwise, i.e., ∃x_0 ∈ D(S) with x_0 ∉ S. We are going to construct a convergent sequence (x_n)_{n∈N} with elements in S which converges to x_0 (a point not belonging to S), contradicting the assumption.
From the definition of accumulation point,
∀n ∈ N, (S\{x_0}) ∩ B(x_0, 1/n) ≠ ∅.
Then we can take x_n ∈ (S\{x_0}) ∩ B(x_0, 1/n), and, since d(x_n, x_0) < 1/n, we have that d(x_n, x_0) → 0.
Summarizing, the following statements are equivalent:
1. S is open (i.e., S ⊆ Int S),
2. S^C is closed,
3. S ∩ F(S) = ∅,
and the following statements are equivalent:
1. S is closed,
2. S^C is open,
3. F(S) ⊆ S,
4. S = Cl(S),
5. D(S) ⊆ S,
6. Ad(S) = S,
7. any convergent sequence (x_n)_{n∈N} with elements in S converges to an element of S.
Chapter 12
Functions
12.1 Limits of functions
In what follows, we take as given two metric spaces (X, d) and (X′, d′) and two sets S ⊆ X and T ⊆ X′.
Definition 504 Given x_0 ∈ D(S), i.e., given an accumulation point x_0 for S, and f : S → T, we write
lim_{x→x_0} f(x) = l
if
∀ε > 0, ∃δ > 0 such that x ∈ (B_{(X,d)}(x_0, δ) ∩ S)\{x_0} ⇒ f(x) ∈ B_{(X′,d′)}(l, ε),
or
∀ε > 0, ∃δ > 0 such that ⟨x ∈ S ∧ 0 < d(x, x_0) < δ⟩ ⇒ d′(f(x), l) < ε.
Proposition 505 Given x_0 ∈ D(S) and f : S → T,
⟨lim_{x→x_0} f(x) = l⟩
⇔
⟨for any sequence (x_n)_{n∈N} in S such that ∀n ∈ N, x_n ≠ x_0 and lim_{n→+∞} x_n = x_0, we have lim_{n→+∞} f(x_n) = l⟩.
Proof. (For the following proof, see also Proposition 6.2.4, page 123, in Morris.)
[⇒]
Take a sequence (x_n)_{n∈N} in S such that ∀n ∈ N, x_n ≠ x_0 and lim_{n→+∞} x_n = x_0. We want to show that lim_{n→+∞} f(x_n) = l, i.e.,
∀ε > 0, ∃n_0 ∈ N such that ∀n > n_0, d′(f(x_n), l) < ε.
Since lim_{x→x_0} f(x) = l,
∀ε > 0, ∃δ > 0 such that x ∈ S ∧ 0 < d(x, x_0) < δ ⇒ d′(f(x), l) < ε.
Since lim_{n→+∞} x_n = x_0,
∃n_0 ∈ N such that ∀n > n_0, 0 <(∗) d(x_n, x_0) < δ,
where (∗) follows from the fact that ∀n ∈ N, x_n ≠ x_0.
Therefore, combining the above results, we get
∀ε > 0, ∃n_0 ∈ N such that ∀n > n_0, d′(f(x_n), l) < ε,
as desired.
[⇐]
Suppose otherwise; then
∃ε > 0 such that, taking δ_n = 1/n, ∀n ∈ N, ∃x_n ∈ S such that 0 < d(x_n, x_0) < 1/n and d′(f(x_n), l) ≥ ε. (12.1)
Consider (x_n)_{n∈N}; then, from the above and from Proposition 436, x_n → x_0, and from the above (specifically, the fact that 0 < d(x_n, x_0)) we also have that ∀n ∈ N, x_n ≠ x_0. Then, by assumption, lim_{n→+∞} f(x_n) = l, i.e., by definition of limit,
∀ε > 0, ∃N ∈ N such that if n > N, then d′(f(x_n), l) < ε,
contradicting (12.1).
Proposition 506 (uniqueness) Given x_0 ∈ D(S) and f : S → T,
⟨lim_{x→x_0} f(x) = l_1 and lim_{x→x_0} f(x) = l_2⟩ ⇒ ⟨l_1 = l_2⟩.
Proof. It follows from Proposition 505 and Proposition 440.
Proposition 507 Given S ⊆ X, x_0 ∈ D(S) and f, g : S → R such that
lim_{x→x_0} f(x) = l and lim_{x→x_0} g(x) = m,
1. lim_{x→x_0} (f(x) + g(x)) = l + m;
2. lim_{x→x_0} f(x) · g(x) = l · m;
3. if m ≠ 0 and ∀x ∈ S, g(x) ≠ 0, lim_{x→x_0} f(x)/g(x) = l/m.
Proof. It follows from Proposition 505 and Proposition 438.
12.2 Continuous Functions
Definition 508 Given a metric space (X, d) and a set V ⊆ X, an open neighborhood of V is an open set containing V.
Remark 509 Sometimes, an open neighborhood is simply called a neighborhood.
Definition 510 Take x_0 ∈ S and f : S → T. Then f is continuous at x_0 if
∀ε > 0, ∃δ > 0 such that x ∈ B_{(X,d)}(x_0, δ) ∩ S ⇒ f(x) ∈ B_{(X′,d′)}(f(x_0), ε),
i.e.,
∀ε > 0, ∃δ > 0 such that x ∈ S ∧ d(x, x_0) < δ ⇒ d′(f(x), f(x_0)) < ε,
i.e.,
∀ε > 0, ∃δ > 0 such that f(B_{(X,d)}(x_0, δ) ∩ S) ⊆ B_{(X′,d′)}(f(x_0), ε),
i.e.,
for any open neighborhood V of f(x_0), there exists an open neighborhood U of x_0 such that f(U ∩ S) ⊆ V.
If f is continuous at x_0 for every x_0 in S, f is continuous on S.
Remark 511 If x_0 is an isolated point of S, f is continuous at x_0. If x_0 is an accumulation point for S, f is continuous at x_0 if and only if lim_{x→x_0} f(x) = f(x_0).
Proposition 512 Suppose that Z ⊆ X″, where (X″, d″) is a metric space, and let
f : S → T, g : W ⊇ f(S) → Z,
h : S → Z, h(x) = g(f(x)).
If f is continuous at x_0 ∈ S and g is continuous at f(x_0), then h is continuous at x_0.
Proof. Exercise; see Apostol (1974), page 79, or Ok (2007), page 206.
Proposition 513 Take f, g : S ⊆ R^n → R. If f and g are continuous, then
1. f + g is continuous;
2. f · g is continuous;
3. if ∀x ∈ S, g(x) ≠ 0, f/g is continuous.
Proof. If x_0 is an isolated point of S, from Remark 511, we are done. If x_0 is an accumulation point for S, the result follows from Remark 511 and Proposition 507.
Proposition 514 Let f : S ⊆ X → R^m and, for any j ∈ {1, ..., m}, let f_j : S → R be such that
∀x ∈ S, f(x) = (f_j(x))_{j=1}^m.
Then
⟨f is continuous⟩ ⇔ ⟨∀j ∈ {1, ..., m}, f_j is continuous⟩.
Proof. The proof follows the strategy used in Proposition 441.
Definition 515 Given, for any i ∈ {1, ..., n}, S_i ⊆ R, f : ∏_{i=1}^n S_i → R is continuous in each variable separately if ∀i ∈ {1, ..., n} and ∀x_i^0 ∈ S_i, the function
f_{x_i^0} : ∏_{k≠i} S_k → R, f_{x_i^0} ((x_k)_{k≠i}) = f(x_1, ..., x_{i−1}, x_i^0, x_{i+1}, ..., x_n)
is continuous.
Proposition 516 Given, for any i ∈ {1, ..., n}, S_i ⊆ R,
f : ∏_{i=1}^n S_i → R is continuous ⇒ f is continuous in each variable separately.
Proof. Exercise.
Remark 517 It is false that
f is continuous in each variable separately ⇒ f is continuous.
To see that, consider f : R² → R,
f(x, y) = xy/(x² + y²) if (x, y) ≠ (0, 0), and f(x, y) = 0 if (x, y) = (0, 0).
The following Proposition is useful to show continuity of functions using the results about continuity of functions from R to R.
Proposition 518 For any k ∈ {1, ..., n}, take S_k ⊆ X, and define S := ∏_{k=1}^n S_k ⊆ X^n. Moreover, take i ∈ {1, ..., n} and let
g : S_i → Y, x_i ↦ g(x_i)
be a continuous function and
f : S → Y, (x_k)_{k=1}^n ↦ g(x_i).
Then f is continuous.
Example 519 An example of the objects described in the above Proposition is the following one:
g : [0, π] → R, g(x) = sin x,
f : [0, π] × [−π, 0] → R, f(x, y) = sin x.
Proof of Proposition 518. We want to show that
∀x^0 ∈ S, ∀ε > 0, ∃δ > 0 such that d(x, x^0) < δ ∧ x ∈ S ⇒ d(f(x), f(x^0)) < ε.
We know that
∀x_i^0 ∈ S_i, ∀ε > 0, ∃δ′ > 0 such that d(x_i, x_i^0) < δ′ ∧ x_i ∈ S_i ⇒ d(g(x_i), g(x_i^0)) < ε.
Take δ = δ′. Then d(x, x^0) < δ ∧ x ∈ S ⇒ d(x_i, x_i^0) ≤ d(x, x^0) < δ′ ∧ x_i ∈ S_i, and therefore ε > d(g(x_i), g(x_i^0)) = d(f(x), f(x^0)), as desired.
Exercise 520 Show that the following function is continuous:
f : R² → R³, f(x_1, x_2) = (e^{x_1} + cos(x_1 x_2), sin² x_1 · e^{x_2}, x_1 + x_2).
From Proposition 514, it suffices to show that each component function is continuous. We are going to show that f_1 : R² → R,
f_1(x_1, x_2) = e^{x_1} + cos(x_1 x_2),
is continuous, leaving the proof of the continuity of the other component functions to the reader.
1. f_{11} : R² → R, f_{11}(x_1, x_2) = e^{x_1} is continuous from Proposition 518 and Calculus 1;
2. h_1 : R² → R, h_1(x_1, x_2) = x_1 is continuous from Proposition 518 and Calculus 1,
h_2 : R² → R, h_2(x_1, x_2) = x_2 is continuous from Proposition 518 and Calculus 1,
g : R² → R, g(x_1, x_2) = h_1(x_1, x_2) · h_2(x_1, x_2) = x_1 x_2 is continuous from Proposition 513.2,
ψ : R → R, ψ(x) = cos x is continuous from Calculus 1,
f_{12} : R² → R, f_{12}(x_1, x_2) = (ψ ∘ g)(x_1, x_2) = cos(x_1 x_2) is continuous from Proposition 512 (continuity of composition);
3. f_1 = f_{11} + f_{12} is continuous from Proposition 513.1.
The following Proposition is useful in the proofs of several results.
Proposition 521 Let S, T be arbitrary sets, f : S → T, {A_i}_{i=1}^n a family of subsets of S and {B_i}_{i=1}^n a family of subsets of T. Then
1. inverse image preserves inclusions, unions, intersections and set differences, i.e.,
a. B_1 ⊆ B_2 ⇒ f^{−1}(B_1) ⊆ f^{−1}(B_2),
b. f^{−1}(∪_{i=1}^n B_i) = ∪_{i=1}^n f^{−1}(B_i),
c. f^{−1}(∩_{i=1}^n B_i) = ∩_{i=1}^n f^{−1}(B_i),
d. f^{−1}(B_1\B_2) = f^{−1}(B_1)\f^{−1}(B_2);
2. image preserves inclusions and unions only, i.e.,
e. A_1 ⊆ A_2 ⇒ f(A_1) ⊆ f(A_2),
f. f(∪_{i=1}^n A_i) = ∪_{i=1}^n f(A_i),
g. f(∩_{i=1}^n A_i) ⊆ ∩_{i=1}^n f(A_i), and
if f is one-to-one, then f(∩_{i=1}^n A_i) = ∩_{i=1}^n f(A_i),
h. f(A_1\A_2) ⊇ f(A_1)\f(A_2), and
if f is one-to-one and onto, then f(A_1\A_2) = f(A_1)\f(A_2);
3. relationship between image and inverse image:
i. A_1 ⊆ f^{−1}(f(A_1)), and
if f is one-to-one, then A_1 = f^{−1}(f(A_1)),
l. B_1 ⊇ f(f^{−1}(B_1)), and
if f is onto, then B_1 = f(f^{−1}(B_1)).
Proof.
...
g.
(i) y ∈ f(A_1 ∩ A_2) ⇔ ∃x ∈ A_1 ∩ A_2 such that f(x) = y;
(ii) y ∈ f(A_1) ∩ f(A_2) ⇔ y ∈ f(A_1) ∧ y ∈ f(A_2) ⇔ (∃x_1 ∈ A_1 such that f(x_1) = y) ∧ (∃x_2 ∈ A_2 such that f(x_2) = y).
To show that (i) ⇒ (ii), it is enough to take x_1 = x and x_2 = x.
...
Proposition 522 f : X → Y is continuous ⇔ ⟨V ⊆ Y is open ⇒ f^{−1}(V) ⊆ X is open⟩.
Proof. [⇒]
Take a point x_0 ∈ f^{−1}(V). We want to show that
∃r > 0 such that B(x_0, r) ⊆ f^{−1}(V).
Define y_0 = f(x_0) ∈ V. Since V ⊆ Y is open,
∃ε > 0 such that B(y_0, ε) ⊆ V. (12.2)
Since f is continuous,
∀ε > 0, ∃δ > 0 such that f(B(x_0, δ)) ⊆ B(f(x_0), ε) = B(y_0, ε). (12.3)
Then, taking r = δ, we have
B(x_0, r) = B(x_0, δ) ⊆(1) f^{−1}(f(B(x_0, δ))) ⊆(2) f^{−1}(B(y_0, ε)) ⊆(3) f^{−1}(V),
where (1) follows from 3.i in Proposition 521,
(2) follows from 1.a in Proposition 521 and (12.3),
(3) follows from 1.a in Proposition 521 and (12.2).
[⇐]
Take x_0 ∈ X and define y_0 = f(x_0); we want to show that f is continuous at x_0.
Take ε > 0; then B(y_0, ε) is open and, by assumption,
f^{−1}(B(y_0, ε)) ⊆ X is open. (12.4)
Moreover, by definition of y_0,
x_0 ∈ f^{−1}(B(y_0, ε)). (12.5)
(12.4) and (12.5) imply that
∃δ > 0 such that B(x_0, δ) ⊆ f^{−1}(B(y_0, ε)). (12.6)
Then
f(B(x_0, δ)) ⊆(1) f(f^{−1}(B(y_0, ε))) ⊆(2) B(y_0, ε),
where
(1) follows from 2.e in Proposition 521 and (12.6),
(2) follows from 3.l in Proposition 521.
Proposition 523 f : X → Y is continuous ⇔ ⟨V ⊆ Y closed ⇒ f^{−1}(V) ⊆ X closed⟩.
Proof. [⇒]
V closed in Y ⇒ Y\V open. Then
f^{−1}(Y\V) = f^{−1}(Y)\f^{−1}(V) = X\f^{−1}(V), (12.7)
where the first equality follows from 1.d in Proposition 521.
Since f is continuous and Y\V is open, from (12.7), X\f^{−1}(V) ⊆ X is open, and therefore f^{−1}(V) is closed.
[⇐]
We want to show that, for every open set V in Y, f^{−1}(V) is open.
V open ⇒ Y\V closed ⇒ f^{−1}(Y\V) closed ⇒ X\f^{−1}(V) closed ⇒ f^{−1}(V) open.
Definition 524 A function f : X → Y is open if
S ⊆ X open ⇒ f(S) open;
it is closed if
S ⊆ X closed ⇒ f(S) closed.
Exercise 525 Through simple examples, show the relationship between open, closed and continuous functions.
We can summarize our discussion on continuous functions in the following Proposition.
Proposition 526 Let f be a function between metric spaces (X, d) and (Y, d′). Then the following statements are equivalent:
1. f is continuous;
2. V ⊆ Y is open ⇒ f^{−1}(V) ⊆ X is open;
3. V ⊆ Y is closed ⇒ f^{−1}(V) ⊆ X is closed;
4. ∀x_0 ∈ X and for every sequence (x_n)_{n∈N} ⊆ X such that lim_{n→+∞} x_n = x_0, we have lim_{n→+∞} f(x_n) = f(x_0).
12.3 Continuous functions on compact sets
Proposition 527 Given f : X → Y, if S is a compact subset of X and f is continuous, then f(S) is a compact subset (of Y).
Proof. Let F be an open covering of f(S), so that
f(S) ⊆ ∪_{A∈F} A. (12.8)
We want to show that F admits a finite subcover which covers f(S). Since f is continuous,
∀A ∈ F, f^{−1}(A) is open in X.
Moreover,
S ⊆(1) f^{−1}(f(S)) ⊆(2) f^{−1}(∪_{A∈F} A) =(3) ∪_{A∈F} f^{−1}(A),
where
(1) follows from 3.i in Proposition 521,
(2) follows from 1.a in Proposition 521 and (12.8),
(3) follows from 1.b in Proposition 521.
In other words, {f^{−1}(A)}_{A∈F} is an open cover of S. Since S is compact, there exist A_1, ..., A_n ∈ F such that
S ⊆ ∪_{i=1}^n f^{−1}(A_i).
Then
f(S) ⊆(1) f(∪_{i=1}^n f^{−1}(A_i)) =(2) ∪_{i=1}^n f(f^{−1}(A_i)) ⊆(3) ∪_{i=1}^n A_i,
where
(1) follows from 2.e in Proposition 521,
(2) follows from 2.f in Proposition 521,
(3) follows from 3.l in Proposition 521.
Proposition 528 (Extreme Value Theorem) If S is a nonempty, compact subset of X and f : S → R is continuous, then f admits a global maximum and a global minimum on S, i.e.,
∃x_min, x_max ∈ S such that ∀x ∈ S, f(x_min) ≤ f(x) ≤ f(x_max).
Proof. From the previous Proposition, f(S) is closed and bounded. Therefore, since f(S) is bounded, there exists M = sup f(S). By definition of sup,
∀ε > 0, B(M, ε) ∩ f(S) ≠ ∅.
Then, ∀n ∈ N, take
α_n ∈ B(M, 1/n) ∩ f(S).
Then (α_n)_{n∈N} is such that ∀n ∈ N, α_n ∈ f(S) and d(α_n, M) < 1/n. Therefore, α_n → M, and, since f(S) is closed, M ∈ f(S). (The fact that M ∈ f(S) can also be proved as follows: from Corollary 501, M ∈ Cl f(S) = f(S), where the last equality follows from the fact that f(S) is closed.) But M ∈ f(S) means that ∃x_max ∈ S such that f(x_max) = M, and the fact that M = sup f(S) implies that ∀x ∈ S, f(x) ≤ f(x_max). Similar reasoning holds for x_min.
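A numerical illustration (a sketch; the grids are arbitrary) contrasting the theorem with Remark 472 above: on the compact set [0.1, 1], the continuous f(x) = 1/x attains its maximum, while on (0, 1] the values f(ε) grow without bound as ε → 0, so no maximum exists.

    f = lambda x: 1.0 / x

    grid = [0.1 + i * 0.9 / 1000 for i in range(1001)]   # compact set [0.1, 1]
    print(max(f(x) for x in grid))                       # ~10.0, attained at x = 0.1

    for eps in (1e-2, 1e-4, 1e-6):                       # (0, 1]: bounded, closed in
        print(f(eps))                                    # ((0,+inf), d_2), but no max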
We conclude the section with a result which is useful in itself and is needed to show the inverse function theorem; see Section 17.3.
Proposition 529 Let f : X → Y be a function from a metric space (X, d) to another metric space (Y, d′). Assume that f is one-to-one and onto. If X is compact and f is continuous, then the inverse function f^{−1} is continuous.
Proof. Exercise.
Chapter 13
Correspondences and fixed point theorems
13.1 Continuous Correspondences
This chapter is based mainly on McLean (1985) and Hildenbrand (1974).
Definition 530 Consider two metric spaces (X, d_X) and (Y, d_Y). A correspondence from X to Y is a rule which associates a subset of Y with each element of X, and it is described by the notation
φ : X →→ Y, φ : x ↦↦ φ(x).
Remark 531 In other words, a correspondence φ : X →→ Y can be identified with a function from X to 2^Y (the set of all subsets of Y). If we identify x with {x}, a function from X to Y can be thought of as a particular correspondence.
Remark 532 Some authors make it part of the definition of a correspondence that φ is not empty valued, i.e., that ∀x ∈ X, φ(x) ≠ ∅.
In what follows, unless otherwise stated, (X, d_X) and (Y, d_Y) are assumed to be metric spaces and are denoted by X and Y, respectively.
Definition 533 Given U ⊆ X, φ(U) = ∪_{x∈U} φ(x) = {y ∈ Y : ∃x ∈ U such that y ∈ φ(x)}.
Definition 534 The graph of φ : X →→ Y is
graph φ := {(x, y) ∈ X × Y : y ∈ φ(x)}.
Definition 535 Consider φ : X →→ Y. φ is Upper Hemi-Continuous (UHC) at x ∈ X if φ(x) ≠ ∅ and, for every open neighborhood V of φ(x), there exists an open neighborhood U of x such that for every x′ ∈ U, φ(x′) ⊆ V (or φ(U) ⊆ V).
φ is UHC if it is UHC at every x ∈ X.
Definition 536 Consider φ : X →→ Y. φ is Lower Hemi-Continuous (LHC) at x ∈ X if φ(x) ≠ ∅ and, for any open set V in Y such that φ(x) ∩ V ≠ ∅, there exists an open neighborhood U of x such that for every x′ ∈ U, φ(x′) ∩ V ≠ ∅.
φ is LHC if it is LHC at every x ∈ X.
Example 537 Consider X = R_+ and Y = [0, 1], and
φ_1(x) = [0, 1] if x = 0, and φ_1(x) = {0} if x > 0;
φ_2(x) = {0} if x = 0, and φ_2(x) = [0, 1] if x > 0.
φ_1 is UHC and not LHC; φ_2 is LHC and not UHC.
Some (partial) intuition about the above definitions can be given as follows.
Upper Hemi-Continuity does not allow explosions. In other words, φ is not UHC at x if, in every small enough open neighborhood of x, φ does explode, i.e., it becomes much bigger than φ(x).
Lower Hemi-Continuity does not allow implosions. In other words, φ is not LHC at x if, in every small enough open neighborhood of x, φ does implode, i.e., it becomes much smaller than φ(x).
In other words, ⟨UHC ⇒ no explosion⟩ and ⟨LHC ⇒ no implosion⟩ (or: ⟨explosion ⇒ not UHC⟩ and ⟨implosion ⇒ not LHC⟩). On the other hand, the opposite implications are false, i.e., it is false that ⟨explosion ⇐ not UHC⟩ and ⟨implosion ⇐ not LHC⟩, or, in an equivalent manner, it is false that ⟨no explosion ⇒ UHC⟩ and ⟨no implosion ⇒ LHC⟩.
An example of a correspondence which neither explodes nor implodes and which is not UHC and not LHC is presented below:
φ : R_+ →→ R, φ : x ↦↦ [1, 2] if x ∈ [0, 1), and [3, 4] if x ∈ [1, +∞).
φ does not implode or explode if you move away from 1 (in a small open neighborhood of 1): on the right of 1, φ does not change; on the left, it changes completely. Clearly, φ is neither UHC nor LHC (at 1).
The following correspondence is both UHC and LHC:
φ : R_+ →→ R, φ : x ↦↦ [x, x + 1].
A, maybe disturbing, example is the following one:
φ : R_+ →→ R, φ : x ↦↦ (x, x + 1).
Observe that the graph of the correspondence under consideration does not implode, does not explode, does not jump. In fact, the above correspondence is LHC, but it is not UHC at any x ∈ R_+, as verified below. We want to show that
not ⟨for every open neighborhood V of φ(x), there exists an open neighborhood U of x such that for every x′ ∈ U, φ(x′) ⊆ V⟩,
i.e.,
⟨there exists an open neighborhood V* of φ(x) such that, for every open neighborhood U of x, there exists x′ ∈ U such that φ(x′) ⊈ V*⟩.
Just take V* = φ(x) = (x, x + 1); then, for any open neighborhood U of x and, in fact, any x′ ∈ U\{x}, φ(x′) ⊈ V*.
Example 538 The correspondence
φ : R_+ →→ R, φ : x ↦↦ [1, 2] if x ∈ [0, 1], and [3, 4] if x ∈ (1, +∞)
is UHC, but not LHC.
Remark 539 Summarizing the above results, we can maybe say that a correspondence which is both UHC and LHC, in fact a continuous correspondence, is a correspondence which agrees with our intuition of a graph without explosions, implosions or jumps.
Proposition 540 1. If φ : X →→ Y is either UHC or LHC and it is a function, then it is a continuous function.
2. If φ : X →→ Y is a continuous function, then it is a UHC and LHC correspondence.
Proof.
1.
Case 1. φ is UHC.
First proof. Use the fourth characterization of continuous functions in Definition 510.
Second proof. Recall that a function f : X → Y is continuous iff ⟨V open in Y⟩ ⇒ ⟨f^{−1}(V) open in X⟩. Take V open in Y. Consider x ∈ f^{−1}(V), i.e., x such that f(x) ∈ V. By assumption f is UHC, and therefore ∃ an open neighborhood U of x such that f(U) ⊆ V. Then U ⊆ f^{−1}(f(U)) ⊆ f^{−1}(V). Then, for any x ∈ f^{−1}(V), we have found an open set U which contains x and is contained in f^{−1}(V), i.e., f^{−1}(V) is open.
Case 2. φ is LHC.
See Remark 544 below.
2.
The result follows from the definitions and again from Remark 544 below.
Definition 541 φ : X →→ Y is a continuous correspondence if it is both UHC and LHC.
Very often, checking whether a correspondence is UHC or LHC is not easy. We present some related concepts which are more convenient to use.
Definition 542 φ : X →→ Y is sequentially LHC at x ∈ X if,
for every sequence (x_n)_{n∈N} ⊆ X such that x_n → x, and for every y ∈ φ(x),
there exists a sequence (y_n)_{n∈N} ⊆ Y such that ∀n ∈ N, y_n ∈ φ(x_n) and y_n → y.
φ is sequentially LHC if it is sequentially LHC at every x ∈ X.
Proposition 543 Consider φ : X →→ Y. φ is LHC at x ∈ X ⇔ φ is sequentially LHC at x ∈ X.
Proof.
[⇒]
Consider an arbitrary sequence (x_n)_{n∈N} ⊆ X such that x_n → x, and an arbitrary y ∈ φ(x). For every r ∈ N\{0}, consider B(y, 1/r). Clearly B(y, 1/r) ∩ φ(x) ≠ ∅, since y belongs to both sets. From the fact that φ is LHC, we have that
∀r ∈ N\{0}, ∃ a neighborhood U_r of x such that ∀z ∈ U_r, φ(z) ∩ B(y, 1/r) ≠ ∅. (1)
Since x_n → x,
∀r ∈ N\{0}, ∃n_r ∈ N such that n ≥ n_r ⇒ x_n ∈ U_r. (2)
Consider {n_1, ..., n_r, ...}. Define recursively n′_1 := n_1 and n′_{r+1} := max {n_{r+1}, n′_r + 1}. Then ∀r ∈ N\{0}, n′_{r+1} > n′_r and condition (2) still holds with n′_r in place of n_r, i.e.,
∀r, ∃n′_r such that n ≥ n′_r ⇒ x_n ∈ U_r, and n′_{r+1} > n′_r. (3)
We can now define the desired sequence (y_n)_{n∈N}. For any n ∈ [n′_r, n′_{r+1}), observe that, from (3), x_n ∈ U_r, and then, from (1), φ(x_n) ∩ B(y, 1/r) ≠ ∅. Then,
for any r and for any n ∈ [n′_r, n′_{r+1}) ∩ N, choose y_n ∈ φ(x_n) ∩ B(y, 1/r) (4)
(for n < n′_1, if any, choose y_n ∈ φ(x_n) arbitrarily).
We are left with showing that y_n → y, i.e., ∀ε > 0, ∃m such that n ≥ m ⇒ y_n ∈ B(y, ε). Observe that (4) just says that
for any n ∈ [n′_1, n′_2), y_n ∈ φ(x_n) ∩ B(y, 1/1),
for any n ∈ [n′_2, n′_3), y_n ∈ φ(x_n) ∩ B(y, 1/2) ⊆ B(y, 1),
...
for any n ∈ [n′_r, n′_{r+1}), y_n ∈ φ(x_n) ∩ B(y, 1/r) ⊆ B(y, 1/(r − 1)),
and so on. Then, for any ε > 0, choose r > 1/ε (so that 1/r < ε) and m = n′_r; then, from the above observations, n ∈ [m, n′_{r+1}) ⇒ y_n ∈ B(y, 1/r) ⊆ B(y, ε), and, for n ≥ n′_{r+1}, a fortiori, y_n ∈ B(y, 1/r) ⊆ B(y, ε).
[⇐]
Assume otherwise, i.e., ∃ an open set V such that
φ(x) ∩ V ≠ ∅ (5)
and such that, for every open neighborhood U of x, ∃x_U ∈ U such that φ(x_U) ∩ V = ∅.
Consider the following family of open neighborhoods of x: {B(x, 1/n) : n ∈ N\{0}}. Then, ∀n ∈ N\{0}, ∃x_n ∈ B(x, 1/n), and therefore x_n → x, such that
φ(x_n) ∩ V = ∅. (6)
From (5), we can take y ∈ φ(x) ∩ V. By assumption, we know that there exists a sequence (y_n)_{n∈N} ⊆ Y such that ∀n ∈ N\{0}, y_n ∈ φ(x_n) and y_n → y. Since V is open and y ∈ V, ∃n̄ such that n > n̄ ⇒ y_n ∈ V. Therefore,
y_n ∈ φ(x_n) ∩ V. (7)
But (7) contradicts (6).
Thanks to the above Proposition, from now on we talk simply of Lower Hemi-Continuous correspondences.
Remark 544 If φ : X →→ Y is LHC and it is a function, then it is a continuous function. The result follows from the characterization of Lower Hemi-Continuity in terms of sequences and from the characterization of continuous functions presented in Proposition 526.
Definition 545 φ : X →→ Y is closed, or "sequentially UHC", at x ∈ X if,
for every sequence (x_n)_{n∈N} ⊆ X such that x_n → x, and for every sequence (y_n)_{n∈N} ⊆ Y such that y_n ∈ φ(x_n) and y_n → y,
it is the case that y ∈ φ(x).
φ is closed if it is closed at every x ∈ X.
Proposition 546 φ is closed ⇔ graph φ is a closed set in X × Y. (Observe that (X × Y, d_{X×Y}), with d_{X×Y}((x, y), (x′, y′)) = d_X(x, x′) + d_Y(y, y′), is a metric space.)
Proof. An equivalent way of stating the above Definition is the following one: for every sequence (x_n, y_n)_{n∈N} ⊆ X × Y such that ∀n ∈ N, (x_n, y_n) ∈ graph φ and (x_n, y_n) → (x, y), it is the case that (x, y) ∈ graph φ. Then, from the characterization of closed sets in terms of sequences, i.e., Proposition 445, the desired result follows.
Remark 547 Because of the above result, many authors use the expression "φ has closed graph" in place of "φ is closed".
Remark 548 The definition of closed correspondence does NOT reduce to continuity in the case of functions, as the following example shows:
φ_3 : R_+ →→ R, φ_3(x) = {0} if x = 0, and {1/x} if x > 0.
φ_3 is a closed correspondence, but it is not a continuous function.
Definition 549 φ : X →→ Y is closed (non-empty, convex, compact, ...) valued if, for every x ∈ X, φ(x) is a closed (non-empty, convex, compact, ...) set.
Proposition 550 Consider φ : X →→ Y. φ closed ⇒ φ closed valued; the opposite implication is false.
Proof.
[⇒]
We want to show that every sequence in φ(x) which converges in fact converges in φ(x). Choose x ∈ X and a sequence (y_n)_{n∈N} such that {y_n : n ∈ N} ⊆ φ(x) and y_n → y. Then, setting ∀n ∈ N, x_n = x, we get x_n → x, y_n ∈ φ(x_n), y_n → y. Then, since φ is closed, y ∈ φ(x). This shows that φ(x) is a closed set.
[⇍]
φ_2 in Example 537 is closed valued, but not closed.
Remark 551 Consider φ : X →→ Y. φ UHC ⇏ and ⇍ φ closed.
[⇏]
φ_4 : R_+ →→ R, φ_4(x) = [0, 1) for every x ∈ R_+,
is UHC and not closed.
[⇍]
φ_3 in Remark 548 is closed and not UHC, simply because it is not a continuous function.
Proposition 552 Consider φ : X →→ Y. If φ is UHC (at x) and closed valued (at x), then φ is closed (at x).
Proof.
Take an arbitrary x ∈ X. We want to show that φ is closed at x: assume that x_n → x, y_n ∈ φ(x_n), y_n → y; we want to show that y ∈ φ(x). Since φ(x) is a closed set, it suffices to show that y ∈ Cl φ(x), i.e. (see Corollary 501), that
∀ε > 0, B(y, ε) ∩ φ(x) ≠ ∅.
Consider {B(z, ε/2) : z ∈ φ(x)}. Then V := ∪_{z∈φ(x)} B(z, ε/2) is open and contains φ(x). Since φ is UHC at x, there exists an open neighborhood U of x such that
φ(U) ⊆ V. (1)
Since x_n → x ∈ U, ∃n̂ ∈ N such that ∀n > n̂, x_n ∈ U, and, from (1), φ(x_n) ⊆ V. Since y_n ∈ φ(x_n),
∀n > n̂, y_n ∈ V := ∪_{z∈φ(x)} B(z, ε/2). (2)
From (2), ∀n > n̂, ∃z_n ∈ φ(x) such that y_n ∈ B(z_n, ε/2), and then
d(y_n, z_n) < ε/2. (3)
Since y_n → y, ∃n_ε such that ∀n > n_ε,
d(y_n, y) < ε/2. (4)
From (3) and (4), ∀n > max {n̂, n_ε}, z_n ∈ φ(x) and d(y, z_n) ≤ d(y, y_n) + d(y_n, z_n) < ε, i.e.,
z_n ∈ B(y, ε) ∩ φ(x) ≠ ∅.
Proposition 553 Consider φ : X →→ Y. If φ is closed and there exists a compact set K ⊆ Y such that φ(X) ⊆ K, then φ is UHC.
Therefore, in simpler terms, if φ is closed (at x) and Y is compact, then φ is UHC (at x).
Proof.
Assume that there exists x ∈ X such that φ is not UHC at x ∈ X, i.e., there exists an open neighborhood V of φ(x) such that, for every open neighborhood U_x of x, φ(U_x) ∩ V^C ≠ ∅.
In particular, ∀n ∈ N\{0}, φ(B(x, 1/n)) ∩ V^C ≠ ∅. Therefore, we can construct a sequence (x_n)_{n∈N} ⊆ X such that x_n → x and φ(x_n) ∩ V^C ≠ ∅. Now take y_n ∈ φ(x_n) ∩ V^C. Since y_n ∈ φ(X) ⊆ K and K is compact, and therefore sequentially compact, up to a subsequence,
y_n → y ∈ K. Moreover, since ∀n ∈ N\{0}, y_n ∈ V^C and V^C is closed,
y ∈ V^C. (1)
Since φ is closed and x_n → x, y_n ∈ φ(x_n), y_n → y, we have that y ∈ φ(x). Since, by assumption, φ(x) ⊆ V, we have that
y ∈ V. (2)
But (2) contradicts (1).
None of the assumptions of the above Proposition can be dispensed with. All the examples below show correspondences which are not UHC.
Example 554 1.
φ : R_+ →→ R, φ(x) = {1/2} if x ∈ [0, 2], and {1} if x > 2.
Y = [0, 1] is compact, but φ is not closed.
2.
φ : R_+ →→ R, φ(x) = {0} if x = 0, and {1/x} if x > 0.
φ is closed, but φ(X) = R_+, which is closed, but not bounded.
3.
φ : [0, 1] →→ [0, 1), φ(x) = {x} if x ∈ [0, 1), and {0} if x = 1.
φ is closed (in Y), but Y = [0, 1) is not compact. Observe that if you consider
φ : [0, 1] →→ [0, 1], φ(x) = {x} if x ∈ [0, 1), and {0} if x = 1,
then φ is not closed.
Definition 555 Consider φ : X →→ Y and V ⊆ Y.
The strong inverse image of V via φ is
φ_s^{−1}(V) := {x ∈ X : φ(x) ⊆ V}.
The weak inverse image of V via φ is
φ_w^{−1}(V) := {x ∈ X : φ(x) ∩ V ≠ ∅}.
Remark 556 1. ∀V ⊆ Y, φ_s^{−1}(V) ⊆ φ_w^{−1}(V).
2. If φ is a function, the usual definition of inverse image coincides with both of the above definitions.
Proposition 557 Consider φ : X →→ Y.
1.1. φ is UHC ⇔ for every open set V in Y, φ_s^{−1}(V) is open in X;
1.2. φ is UHC ⇔ for every closed set V in Y, φ_w^{−1}(V) is closed in X;
2.1. φ is LHC ⇔ for every open set V in Y, φ_w^{−1}(V) is open in X;
2.2. φ is LHC ⇔ for every closed set V in Y, φ_s^{−1}(V) is closed in X.
(Part 2.2 of the Proposition will be used in the proof of the Maximum Theorem.)
Proof.
[1.1, ⇒] Consider V open in Y. Take x_0 ∈ φ_s^{−1}(V); by definition of φ_s^{−1}, φ(x_0) ⊆ V. By definition of UHC correspondence, ∃ an open neighborhood U of x_0 such that ∀x ∈ U, φ(x) ⊆ V. Then x_0 ∈ U ⊆ φ_s^{−1}(V).
[1.1, ⇐] Take an arbitrary x_0 ∈ X and an open neighborhood V of φ(x_0). Then x_0 ∈ φ_s^{−1}(V), and φ_s^{−1}(V) is open by assumption. Therefore (just identifying U with φ_s^{−1}(V)), we have proved that φ is UHC.
To show 1.2, preliminarily observe that
(φ_w^{−1}(V))^C = φ_s^{−1}(V^C). (13.1)
(To see that, simply observe that (φ_w^{−1}(V))^C := {x ∈ X : φ(x) ∩ V = ∅} and φ_s^{−1}(V^C) := {x ∈ X : φ(x) ⊆ V^C}.)
[1.2, ⇒] V closed ⇒ V^C open ⇒ (by the assumption and (1.1)) φ_s^{−1}(V^C) open; from (13.1), φ_s^{−1}(V^C) = (φ_w^{−1}(V))^C is open, and therefore φ_w^{−1}(V) is closed.
[1.2, ⇐] From (1.1), it suffices to show that, for every open set V in Y, φ_s^{−1}(V) is open in X. Then,
V open ⇒ V^C closed ⇒ (by assumption) φ_w^{−1}(V^C) closed ⇒ (φ_w^{−1}(V^C))^C open; from (13.1), (φ_w^{−1}(V^C))^C = φ_s^{−1}(V) is open.
The proofs of parts 2.1 and 2.2 are similar to the above ones.
Remark 558 Observe that φ UHC ⇏ and ⇍ ⟨for every closed set V in Y, φ_s^{−1}(V) is closed in X⟩.
[⇏]
Consider
φ : R_+ →→ R, φ(x) = [0, 2] if x ∈ [0, 1], and [0, 1] if x > 1.
φ is UHC and [0, 1] is closed, but φ_s^{−1}([0, 1]) := {x ∈ R_+ : φ(x) ⊆ [0, 1]} = (1, +∞) is not closed.
[⇍]
Consider
φ : R_+ →→ R_+, φ(x) = [0, 1/2] ∪ {1} if x = 0, and [0, 1] if x > 0.
For any closed set V in Y := R_+, φ_s^{−1}(V) can be only one of the following sets, each of which is closed: ∅, {0}, R_+. On the other hand, φ is not UHC at 0.
Definition 559 Let the metric spaces (X, d_X), (Y, d_Y) and (Z, d_Z) and the correspondences φ : X →→ Y, ψ : Y →→ Z be given. The composition of ψ with φ is
ψ ∘ φ : X →→ Z,
(ψ ∘ φ)(x) := ∪_{y∈φ(x)} ψ(y) = {z ∈ Z : ∃y ∈ φ(x) such that z ∈ ψ(y)}.
Proposition 560 Consider φ : X →→ Y, ψ : Y →→ Z. If φ and ψ are UHC, then ψ ∘ φ is UHC.
Proof.
Step 1. (ψ ∘ φ)_s^{−1}(V) = φ_s^{−1}(ψ_s^{−1}(V)).
(ψ ∘ φ)_s^{−1}(V) = {x ∈ X : ψ(φ(x)) ⊆ V} = {x ∈ X : ∀y ∈ φ(x), ψ(y) ⊆ V} =
= {x ∈ X : ∀y ∈ φ(x), y ∈ ψ_s^{−1}(V)} = {x ∈ X : φ(x) ⊆ ψ_s^{−1}(V)} = φ_s^{−1}(ψ_s^{−1}(V)).
Step 2. Desired result.
Take V open in Z. From Theorem 557, we want to show that (ψ ∘ φ)_s^{−1}(V) is open in X. From Step 1, we have that (ψ ∘ φ)_s^{−1}(V) = φ_s^{−1}(ψ_s^{−1}(V)). Now, ψ_s^{−1}(V) is open because ψ is UHC, and φ_s^{−1}(ψ_s^{−1}(V)) is open because φ is UHC.
Proposition 561 Consider φ : X →→ Y. If φ is UHC and compact valued, and A ⊆ X is a compact set, then φ(A) is compact.
Proof.
Consider an arbitrary open cover {C_α}_{α∈I} of φ(A). Since φ(A) := ∪_{x∈A} φ(x) and φ is compact valued, for every x ∈ A there exists a finite set N_x ⊆ I such that
φ(x) ⊆ ∪_{α∈N_x} C_α := G_x. (13.2)
Since for every α ∈ N_x, C_α is open, G_x is open. Since φ is UHC, φ_s^{−1}(G_x) is open. Moreover, x ∈ φ_s^{−1}(G_x): this is the case because, by definition, x ∈ φ_s^{−1}(G_x) iff φ(x) ⊆ G_x, which is just (13.2). Therefore, {φ_s^{−1}(G_x)}_{x∈A} is an open cover of A. Since, by assumption, A is compact, there exists a finite set {x_i}_{i=1}^m ⊆ A such that A ⊆ ∪_{i=1}^m φ_s^{−1}(G_{x_i}). Finally,
φ(A) ⊆ φ(∪_{i=1}^m φ_s^{−1}(G_{x_i})) ⊆(1) ∪_{i=1}^m φ(φ_s^{−1}(G_{x_i})) ⊆(2) ∪_{i=1}^m G_{x_i} = ∪_{i=1}^m ∪_{α∈N_{x_i}} C_α,
and {{C_α}_{α∈N_{x_i}}}_{i=1}^m is a finite subcover of {C_α}_{α∈I}. We are left with showing (1) and (2) above.
(1). In general, it is the case that φ(∪_{i=1}^m S_i) ⊆ ∪_{i=1}^m φ(S_i):
y ∈ φ(∪_{i=1}^m S_i) ⇒ ∃x ∈ ∪_{i=1}^m S_i such that y ∈ φ(x) ⇒ ∃i such that y ∈ φ(S_i) ⇒ y ∈ ∪_{i=1}^m φ(S_i).
(2). In general, it is the case that φ(φ_s^{−1}(A)) ⊆ A:
y ∈ φ(φ_s^{−1}(A)) ⇒ ∃x ∈ φ_s^{−1}(A) such that y ∈ φ(x). But, by definition of φ_s^{−1}(A), and since x ∈ φ_s^{−1}(A), it follows that φ(x) ⊆ A and therefore y ∈ A.
Remark 562 Observe that the assumptions in the above Proposition cannot be dispensed with, as verified below.
Consider φ : R_+ →→ R, φ(x) = [0, 1). Observe that φ is UHC and bounded valued, but not closed valued, and φ([0, 1]) = [0, 1) is not compact.
Consider φ : R_+ →→ R, φ(x) = R_+. Observe that φ is UHC and closed valued, but not bounded valued, and φ([0, 1]) = R_+ is not compact.
Consider φ : R_+ →→ R_+, φ(x) = {x} if x ≠ 1, and {0} if x = 1. Observe that φ is not UHC, and φ([0, 1]) = [0, 1) is not compact.
(Add Proposition 5, page 25, and Proposition 6, page 26, from Hildenbrand (1974), maybe as exercises.)
Remark 563 Below we summarize, in a somewhat informal manner, some facts shown in the present Section.
⟨If φ is a function, it is continuous⟩ ⇐ ⟨φ is UHC⟩; ⟨φ is UHC⟩ ⇏ and ⇍ ⟨φ is sequentially UHC, i.e., closed⟩; ⟨φ is closed⟩ ⇏ ⟨if φ is a function, it is continuous⟩.
⟨If φ is a function, it is continuous⟩ ⇐ ⟨φ is LHC⟩ ⇔ ⟨φ is sequentially LHC⟩.
⟨φ UHC and closed valued at x⟩ ⇒ ⟨φ is closed at x⟩.
⟨φ UHC at x⟩ ⇐ ⟨φ is closed at x and Im φ compact⟩.
13.2 The Maximum Theorem
Theorem 564 (Maximum Theorem) Let the metric spaces (Θ, d_Θ), (X, d_X), the correspondence β : Θ →→ X and a function u : X × Θ → R be given. (Think of β as a budget correspondence and of u as a utility function.) Define
γ : Θ →→ X,
γ(θ) = {z ∈ β(θ) : ∀x ∈ β(θ), u(z, θ) ≥ u(x, θ)} = arg max_{x∈β(θ)} u(x, θ).
Assume that
β is non-empty valued, compact valued and continuous, and
u is continuous.
Then
1. γ is non-empty valued, compact valued, UHC and closed, and
2. v : Θ → R, v : θ ↦ max_{x∈β(θ)} u(x, θ)
is continuous.
Proof.
γ is non-empty valued.
It is a consequence of the fact that β is non-empty valued and compact valued and of the Extreme Value Theorem; see Proposition 528.
γ is compact valued.
We are going to show that, for any θ ∈ Θ, γ(θ) is a sequentially compact set. Consider a sequence (x_n)_{n∈N} ⊆ X such that {x_n : n ∈ N} ⊆ γ(θ). Since γ(θ) ⊆ β(θ) and β(θ) is compact by assumption, without loss of generality, up to a subsequence, x_n → x_0 ∈ β(θ). We are left with showing that x_0 ∈ γ(θ). Take an arbitrary z ∈ β(θ). Since {x_n : n ∈ N} ⊆ γ(θ), we have that u(x_n, θ) ≥ u(z, θ). By continuity of u, taking limits with respect to n of both sides, we get u(x_0, θ) ≥ u(z, θ), i.e., x_0 ∈ γ(θ), as desired.
γ is UHC.
From Proposition 557, it suffices to show that, given an arbitrary closed set V in X, γ_w^{−1}(V) := {θ ∈ Θ : γ(θ) ∩ V ≠ ∅} is closed in Θ. Consider an arbitrary sequence (θ_n)_{n∈N} such that {θ_n : n ∈ N} ⊆ γ_w^{−1}(V) and such that θ_n → θ_0. We have to show that θ_0 ∈ γ_w^{−1}(V).
Take a sequence (x_n)_{n∈N} ⊆ X such that, for every n, x_n ∈ γ(θ_n) ∩ V ≠ ∅. Since γ(θ_n) ⊆ β(θ_n), it follows that x_n ∈ β(θ_n). We can now show the following
Claim. There exists a subsequence (x_{n_k})_{k∈N} of (x_n)_{n∈N} such that x_{n_k} → x_0 and x_0 ∈ γ(θ_0).
Proof of the Claim.
Since {θ_n : n ∈ N} ∪ {θ_0} is a compact set (show it), and since, by assumption, β is UHC and compact valued, from Proposition 561, β({θ_n : n ∈ N} ∪ {θ_0}) is compact. Since {x_n}_n ⊆ β({θ_n : n ∈ N} ∪ {θ_0}), there exists a subsequence (x_{n_k})_{k∈N} of (x_n)_{n∈N} which converges to some x_0.
Since β is compact valued, it is closed valued, too. Then β is UHC and closed valued, and, from Proposition 552, β is closed. Since
θ_{n_k} → θ_0, x_{n_k} ∈ β(θ_{n_k}), x_{n_k} → x_0,
the fact that β is closed implies that x_0 ∈ β(θ_0).
Choose an arbitrary element z_0 such that z_0 ∈ β(θ_0). Since we assumed that θ_n → θ_0 and since β is LHC, there exists a sequence (z_n)_{n∈N} ⊆ X such that z_n ∈ β(θ_n) and z_n → z_0. Summarizing, and taking the subsequences of (θ_n)_{n∈N} and (z_n)_{n∈N} corresponding to (x_{n_k})_{k∈N}, we have, for any n_k,
θ_{n_k} → θ_0,
x_{n_k} → x_0, x_{n_k} ∈ γ(θ_{n_k}), x_0 ∈ β(θ_0),
z_{n_k} → z_0, z_{n_k} ∈ β(θ_{n_k}), z_0 ∈ β(θ_0).
Then, for any n_k, we have that u(x_{n_k}, θ_{n_k}) ≥ u(z_{n_k}, θ_{n_k}). Since u is continuous, taking limits, we get that u(x_0, θ_0) ≥ u(z_0, θ_0). Since the choice of z_0 in β(θ_0) was arbitrary, we then have x_0 ∈ γ(θ_0).
End of the Proof of the Claim.
Finally, since (x_{n_k})_{k∈N} ⊆ V, x_{n_k} → x_0 and V is closed, x_0 ∈ V. Then x_0 ∈ γ(θ_0) ∩ V ≠ ∅ and θ_0 ∈ {θ ∈ Θ : γ(θ) ∩ V ≠ ∅} := γ_w^{−1}(V), which was the desired result.
γ is closed.
γ is UHC and compact valued, and therefore closed valued. Then, from Proposition 552, it is closed, too.
v is a continuous function.
The basic idea of the proof is that v is a function and it is equal to the composition of UHC correspondences; therefore, it is a continuous function. A precise argument goes as follows.
Let the following correspondences be given:
(γ, id) : Θ →→ X × Θ, θ ↦↦ γ(θ) × {θ},
μ : X × Θ →→ R, (x, θ) ↦↦ {u(x, θ)}.
Then, from Definition 559,
(μ ∘ (γ, id))(θ) = ∪_{(x,θ)∈γ(θ)×{θ}} {u(x, θ)}.
By definition of γ,
∀θ ∈ Θ, ∀x ∈ γ(θ), ∪_{(x,θ)∈γ(θ)×{θ}} {u(x, θ)} = {u(x, θ)},
and
∀θ ∈ Θ, (μ ∘ (γ, id))(θ) = {u(x, θ)} = {v(θ)}. (13.3)
Now, (γ, id) is UHC, and, since u is a continuous function, μ is UHC as well. From Proposition 560, μ ∘ (γ, id) is UHC and, from (13.3), v is a continuous function.
A sometimes more useful version of the Maximum Theorem is one which does not use the fact that β is UHC.

Theorem 565 (Maximum Theorem) Consider the correspondence β : Θ ⇉ X and the function u : X × Θ → R defined in Theorem 564, with Θ and X Euclidean spaces. Assume that

β is non-empty valued, compact valued, convex valued, closed and LHC,
u is continuous.

Then
1. γ is a non-empty valued, compact valued, closed and UHC correspondence;
2. v is a continuous function.

Proof. The desired result follows from the next Proposition.

Proposition 566 Consider the correspondence β : Θ ⇉ X, with Θ and X Euclidean spaces. Assume that β is non-empty valued, compact valued, convex valued, closed and LHC. Then β is UHC.

Proof. See Hildenbrand (1974), Lemma 1, page 33. The proof also requires Theorem 1 in Hildenbrand (1974).

The following result allows us to substitute the requirement "Γ is LHC" with the easier-to-check requirement "Cl Γ is LHC".

Proposition 567 Consider the correspondence Γ : Θ ⇉ X. Γ is LHC ⇔ Cl Γ is LHC.

Proof.
Preliminary Claim. For every open set V and every θ ∈ Θ, Cl Γ(θ) ∩ V ≠ ∅ ⇔ Γ(θ) ∩ V ≠ ∅.

Proof of the Preliminary Claim. The implication [⇐] is obvious, since Γ(θ) ⊆ Cl Γ(θ). For [⇒], take z ∈ Cl Γ(θ) ∩ V. Since V is open, ∃ε > 0 such that B(z, ε) ⊆ V. Since z ∈ Cl Γ(θ), there exists {z_n} ⊆ Γ(θ) such that z_n → z. But then ∃n̄ such that ∀n > n̄, z_n ∈ B(z, ε) ⊆ V. But z_n ∈ V and z_n ∈ Γ(θ) implies that Γ(θ) ∩ V ≠ ∅.
End of the Proof of the Preliminary Claim.

[⇒] Take an open set V such that Cl Γ(θ) ∩ V ≠ ∅. We want to show that there exists an open set U_θ such that θ ∈ U_θ and ∀θ′ ∈ U_θ, Cl Γ(θ′) ∩ V ≠ ∅. From the Preliminary Claim, it must be the case that Γ(θ) ∩ V ≠ ∅. Then, since Γ is LHC, there exists an open set U such that θ ∈ U and ∀θ′ ∈ U, Γ(θ′) ∩ V ≠ ∅. Since Cl Γ(θ′) ⊇ Γ(θ′), we also have Cl Γ(θ′) ∩ V ≠ ∅. Choosing U_θ = U, we are done.

[⇐] Take an open set V such that Γ(θ) ∩ V ≠ ∅. Then Cl Γ(θ) ∩ V ≠ ∅, and, since by assumption Cl Γ is LHC, there exists an open set U₀ such that θ ∈ U₀ and ∀θ′ ∈ U₀, Cl Γ(θ′) ∩ V ≠ ∅. Then, from the Preliminary Claim, it must be the case that Γ(θ′) ∩ V ≠ ∅.
Remark 568 In some economic models, a convenient strategy to show that a correspondence Γ is LHC is the following one. Introduce a correspondence Γ̂; show that Γ̂ is LHC; show that Cl Γ̂ = Γ. Then, from the above Proposition 567, the desired result follows — see, for example, point 5 of the proof of Theorem 581 below.
13.3 Fixed point theorems

A thorough analysis of the many versions of fixed point theorems existing in the literature is outside the scope of these notes. Below, we present a useful, relatively general version of fixed point theorems both in the case of functions and in the case of correspondences.

Theorem 569 (The Brouwer Fixed Point Theorem) For any n ∈ N\{0}, let S be a nonempty, compact, convex subset of Rⁿ. If f : S → S is a continuous function, then ∃x ∈ S such that f(x) = x.

Proof. For a (not self-contained) proof, see Ok (2007), page 279.

Just to try to avoid having a Section without a proof, let's show the following extremely simple version of that theorem.

Proposition 570 If f : [0, 1] → [0, 1] is a continuous function, then ∃x ∈ [0, 1] such that f(x) = x.

Proof. If f(0) = 0 or f(1) = 1, the result is true. Then suppose otherwise, i.e., f(0) ≠ 0 and f(1) ≠ 1, i.e., since the codomain of f is [0, 1], suppose that f(0) > 0 and f(1) < 1. Define

g : [0, 1] → R, x ↦ x − f(x).

Clearly, g is continuous, g(0) = −f(0) < 0 and g(1) = 1 − f(1) > 0. Then, from the Intermediate Value Theorem for continuous functions, ∃x ∈ [0, 1] such that g(x) = x − f(x) = 0, i.e., x = f(x), as desired.
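The proof above is constructive in spirit: since g(0) ≤ 0 ≤ g(1), a fixed point can be located by bisection on g(x) = x − f(x). The following Python sketch is only an illustration of this idea; the particular map f = cos, which sends [0, 1] into itself, is a hypothetical choice, not taken from these notes.

import math

def fixed_point_bisection(f, tol=1e-10):
    # Bisection on g(x) = x - f(x); g(0) = -f(0) <= 0 and g(1) = 1 - f(1) >= 0.
    g = lambda x: x - f(x)
    a, b = 0.0, 1.0
    while b - a > tol:
        m = 0.5 * (a + b)
        if g(m) <= 0:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

x_star = fixed_point_bisection(math.cos)
print(x_star, math.cos(x_star))  # both ~ 0.7390851332, so f(x*) = x*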
Theorem 571 (Kakutani's Fixed Point Theorem) For any n ∈ N\{0}, let S be a nonempty, compact, convex subset of Rⁿ. If Γ : S ⇉ S is a non-empty valued, convex valued, closed correspondence, then ∃x ∈ S such that Γ(x) ∋ x.

Proof. For a proof, see Ok (2007), page 331.
13.4 Application of the maximum theorem to the consumer problem

Definition 572 (Mas-Colell (1996), page 17) Commodities are goods and services available for purchase in the market.

We assume the number of commodities is finite and equal to C. Commodities are indexed by superscript c = 1, ..., C.

Definition 573 A commodity vector is an element of the commodity space R^C.

Definition 574 (almost Mas-Colell (1996), page 18) A consumption set is a subset of the commodity space R^C. It is denoted by X. Its elements are the vectors of commodities the individual can conceivably consume given the physical or institutional constraints imposed by the environment.

Example 575 See Mas-Colell, pages 18, 19.

Common assumptions on X are that it is convex, bounded below and unbounded. Unless otherwise stated, we make the following stronger

Assumption 1 X = R^C₊ := {x ∈ R^C : x ≥ 0}.

Definition 576 p ∈ R^C is the vector of commodity prices.

Households' choices are limited also by an economic constraint: they cannot buy goods whose value is bigger than their wealth, i.e., it must be the case that px ≤ w, where w is the household's wealth.

Remark 577 w can take different specifications. For example, we can have w = pe, where e ∈ R^C is the vector of goods owned by the household, i.e., her endowments.

Assumption 2 All commodities are traded in markets at publicly observable prices, expressed in monetary unit terms.

Assumption 3 All commodities are assumed to be strictly goods (and not bads), i.e., p ∈ R^C₊₊.

Assumption 4 Households behave as if they cannot influence prices.

Definition 578 The budget set is

β(p, w) := {x ∈ R^C₊ : px ≤ w}.

With some abuse of notation, we define the budget correspondence as

β : R^C₊₊ × R₊₊ ⇉ R^C, β(p, w) = {x ∈ R^C₊ : px ≤ w}.

Definition 579 The utility function is

u : X → R, x ↦ u(x).

Definition 580 The Utility Maximization Problem (UMP) is

max_{x∈R^C₊} u(x) s.t. px ≤ w, or x ∈ β(p, w);

γ : R^C₊₊ × R₊₊ ⇉ R^C, γ(p, w) = arg max (UMP) is the demand correspondence.
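As a purely numerical aside, for a specific utility function the demand correspondence can be computed explicitly and checked against a numerical solution of the UMP. The Python sketch below does this for a hypothetical Cobb-Douglas utility u(x) = Σ_c a_c log x^c (with Σ_c a_c = 1), whose demand function is known to be x^c(p, w) = a_c w / p^c; all numbers are illustrative assumptions, not data from these notes.

import numpy as np
from scipy.optimize import minimize

a = np.array([0.3, 0.7])   # Cobb-Douglas weights, summing to 1 (assumption)
p = np.array([2.0, 5.0])   # prices, p >> 0 (assumption)
w = 10.0                   # wealth, w > 0 (assumption)

neg_u = lambda x: -np.sum(a * np.log(x))  # minimize -u(x)
res = minimize(neg_u, x0=np.array([1.0, 1.0]),
               bounds=[(1e-9, None), (1e-9, None)],
               constraints=[{"type": "ineq", "fun": lambda x: w - p @ x}])
print(res.x)       # numerical demand, ~ [1.5, 1.4]
print(a * w / p)   # closed-form Cobb-Douglas demand: [1.5, 1.4]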
Theorem 581 γ is a non-empty valued, compact valued, closed and UHC correspondence and

v : R^C₊₊ × R₊₊ → R, v : (p, w) ↦ max (UMP),

i.e., the indirect utility function, is a continuous function.

Proof.
As an application of the (second version of the) Maximum Theorem, i.e., Theorem 565, we have to show that β is non-empty valued, compact valued, convex valued, closed and LHC.
1. β is non-empty valued.
x = (w/(Cp^c))_{c=1}^C ∈ β(p, w) (or, simpler, 0 ∈ β(p, w)).

2. β is compact valued.
β(p, w) is closed because it is the intersection of the inverse images of closed sets via continuous functions. β(p, w) is bounded below by zero. β(p, w) is bounded above because for every c,

x^c ≤ (w − Σ_{c′≠c} p^{c′} x^{c′})/p^c ≤ w/p^c,

where the first inequality comes from the fact that px ≤ w, and the second inequality from the fact that p ∈ R^C₊₊ and x ∈ R^C₊.

3. β is convex valued.
To see that, simply observe that for x′, x″ ∈ β(p, w) and λ ∈ [0, 1], p((1−λ)x′ + λx″) = (1−λ)px′ + λpx″ ≤ (1−λ)w + λw = w.

4. β is closed.
We want to show that for every sequence {(p_n, w_n)}_n ⊆ R^C₊₊ × R₊₊ such that

(p_n, w_n) → (p, w), x_n ∈ β(p_n, w_n), x_n → x,

it is the case that x ∈ β(p, w). Since x_n ∈ β(p_n, w_n), we have that p_n x_n ≤ w_n and x_n ≥ 0. Taking limits of both sides of both inequalities, we get px ≤ w and x ≥ 0, i.e., x ∈ β(p, w).

5. β is LHC.
We proceed as follows: a. Int β is LHC; b. Cl Int β = β. Then, from Proposition 567, the result follows.

a. Observe that Int β(p, w) := {x ∈ R^C₊ : x ≫ 0 and px < w} and that Int β(p, w) ≠ ∅, since x = (w/(2Cp^c))_{c=1}^C ∈ Int β(p, w). We want to show that the following is true: for every sequence (p_n, w_n)_n ∈ (R^C₊₊ × R₊₊)^∞ such that (p_n, w_n) → (p, w) and for any x ∈ Int β(p, w), there exists a sequence {x_n}_n ⊆ R^C₊ such that ∀n, x_n ∈ Int β(p_n, w_n) and x_n → x.

p_n x − w_n → px − w < 0 (where the strict inequality follows from the fact that x ∈ Int β(p, w)). Then, ∃N such that ∀n > N, p_n x − w_n < 0.

For n ≤ N, choose an arbitrary x_n ∈ Int β(p_n, w_n) ≠ ∅. Since p_n x − w_n < 0 for every n > N, there exists ε_n > 0 such that z ∈ B(x, ε_n) ⇒ p_n z − w_n < 0.

For any n > N, choose x_n = x + (1/√C) min{ε_n/2, 1/n} · 1. Then,

d(x, x_n) = (Σ_{c=1}^C ((1/√C) min{ε_n/2, 1/n})²)^{1/2} = min{ε_n/2, 1/n} < ε_n,

i.e., x_n ∈ B(x, ε_n) and therefore

p_n x_n − w_n < 0.  (1)

Since x_n ≥ x ≫ 0, we also have

x_n ≫ 0.  (2)

(1) and (2) imply that x_n ∈ Int β(p_n, w_n). Moreover, since x_n − x = (1/√C) min{ε_n/2, 1/n} · 1 for n > N, we have 0 ≤ lim_{n→+∞} (x_n − x) ≤ lim_{n→+∞} (1/n)(1/√C) · 1 = 0, i.e., lim_{n→+∞} x_n = x. (Or, simply, 0 ≤ lim_n d(x, x_n) ≤ lim_n 1/n = 0.)

b. It follows from the fact that the budget correspondence is the intersection of the inverse images of half spaces via continuous functions.

2. It follows from Proposition 582, part (4), and the Maximum Theorem.
Proposition 582 For every (p, w) ∈ R^C₊₊ × R₊₊,
(1) ∀λ ∈ R₊₊, γ(λp, λw) = γ(p, w);
(2) if u is LNS, ∀x ∈ R^C₊, x ∈ γ(p, w) ⇒ px = w;
(3) if u is quasi-concave, γ is convex valued;
(4) if u is strictly quasi-concave, γ is single valued, i.e., it is a function.

Proof.
(1) It simply follows from the fact that ∀λ ∈ R₊₊, β(λp, λw) = β(p, w).

(2) Suppose otherwise; then ∃x′ ∈ R^C₊ such that x′ ∈ γ(p, w) and px′ < w. Therefore, ∃ε′ > 0 such that B(x′, ε′) ⊆ β(p, w) (take ε′ = d(x′, H(p, w))). Then, from the fact that u is LNS, there exists x* such that x* ∈ B(x′, ε′) ⊆ β(p, w) and u(x*) > u(x′), i.e., x′ ∉ γ(p, w), a contradiction.

(3) Assume there exist x′, x″ such that x′, x″ ∈ γ(p, w). We want to show that ∀λ ∈ [0, 1], x_λ := (1−λ)x′ + λx″ ∈ γ(p, w). Observe that u(x′) = u(x″) := u*. From the quasi-concavity of u, we have u(x_λ) ≥ u*. We are therefore left with showing that x_λ ∈ β(p, w), i.e., that β is convex valued. To see that, simply observe that px_λ = (1−λ)px′ + λpx″ ≤ (1−λ)w + λw = w.

(4) Assume otherwise. Following exactly the same argument as above, we have x′, x″ ∈ γ(p, w) with x′ ≠ x″, and px_λ ≤ w. Since u is strictly quasi-concave, we also have that u(x_λ) > u(x′) = u(x″) := u*, which contradicts the fact that x′, x″ ∈ γ(p, w).
Proposition 583 If u is a continuous LNS utility function, then the indirect utility function has the following properties. For every (p, w) ∈ R^C₊₊ × R₊₊:
(1) ∀λ ∈ R₊₊, v(λp, λw) = v(p, w);
(2) v is strictly increasing in w and, for every c, non-increasing in p^c;
(3) for every v̄ ∈ R, {(p, w) : v(p, w) ≤ v̄} is convex;
(4) v is continuous.

Proof.
(1) It follows from Proposition 582 (1).

(2) If w increases, say by Δw > 0, then, from Proposition 582 (2), px(p, w) = w < w + Δw. Define x′ := x(p, w). Then, ∃ε′ > 0 such that B(x′, ε′) ⊆ β(p, w + Δw) (take ε′ = d(x′, H(p, w + Δw))). Then, from the fact that u is LNS, there exists x* such that x* ∈ B(x′, ε′) ⊆ β(p, w + Δw) and u(x*) > u(x′). The result follows observing that v(p, w + Δw) ≥ u(x*).

A similar proof applies to the case of a decrease in p. Assume Δp^{c′} < 0. Define Δ := (Δ^c)_{c=1}^C ∈ R^C with Δ^c = 0 if c ≠ c′ and Δ^{c′} = Δp^{c′}. Then

px(p, w) = w ⇒ (p + Δ)x(p, w) = px(p, w) + Δp^{c′} x^{c′}(p, w) = w + Δp^{c′} x^{c′}(p, w) ≤ w,

since Δp^{c′} < 0 and x^{c′}(p, w) ≥ 0. The remaining part of the proof is the same as in the case of an increase of w.

(3) Take (p′, w′), (p″, w″) ∈ {(p, w) : v(p, w) ≤ v̄} := S(v̄). We want to show that ∀λ ∈ [0, 1], (p_λ, w_λ) := (1−λ)(p′, w′) + λ(p″, w″) ∈ S(v̄), i.e., that ∀x such that p_λ x ≤ w_λ, u(x) ≤ v̄.

p_λ x ≤ w_λ ⇔ ((1−λ)p′ + λp″)x ≤ (1−λ)w′ + λw″.

Then, either p′x ≤ w′ or p″x ≤ w″. If p′x ≤ w′, then u(x) ≤ v(p′, w′) ≤ v̄. Similarly, if p″x ≤ w″.

(4) It was proved in Theorem 581.
Part III

Differential calculus in Euclidean spaces

Chapter 14

Partial derivatives and directional derivatives

14.1 Partial Derivatives
The concept of partial derivative is not that different from the concept of derivative of a function from R to R. (In this Part, I follow closely Section 5.14 and Chapters 12 and 13 in Apostol (1974).) To get an intuitive idea of the former concept, consider the function

f : R² → R, (x₁, x₂) ↦ x₁x₂;

fix x₂ = x₀₂, and define

f_{|x₀₂} : R → R, x₁ ↦ f(x₁, x₀₂) = x₁x₀₂.

Then we know that f_{|x₀₂} is differentiable at x₀₁ if the following limit exists and is finite:

lim_{x₁→x₀₁} [f_{|x₀₂}(x₁) − f_{|x₀₂}(x₀₁)]/(x₁ − x₀₁) = lim_{x₁→x₀₁} (x₁x₀₂ − x₀₁x₀₂)/(x₁ − x₀₁).

In that case, we denote that limit by f′_{|x₀₂}(x₀₁) and we call it the derivative of the function f_{|x₀₂} at x₀₁. We are going to define that derivative (of a function from R to R) as the partial derivative of f at x₀ with respect to x₁.

Definition 584 Let a set S ⊆ Rⁿ, a point x₀ = (x₀ₖ)_{k=1}ⁿ ∈ Int S and a function f : S → R be given. If the following limit exists and is finite,

lim_{xₖ→x₀ₖ} [f(x₀₁, ..., x₀,k−1, xₖ, x₀,k+1, ..., x₀ₙ) − f(x₀₁, ..., x₀,k−1, x₀ₖ, x₀,k+1, ..., x₀ₙ)]/(xₖ − x₀ₖ),  (14.1)

then it is called the partial derivative of f with respect to the k-th coordinate computed at x₀, and it is denoted by any of the following symbols:

D_{xₖ} f(x₀), D_k f(x₀), ∂f/∂xₖ(x₀), ∂f(x)/∂xₖ|_{x=x₀}.
Remark 585 1. Taking eₙᵏ = (0, ..., 1, ..., 0), the k-th vector in the canonical basis of Rⁿ, and h = xₖ − x₀ₖ, we can rewrite the limit in (14.1) as follows:

lim_{xₖ→x₀ₖ} [f(x₀ + (xₖ − x₀ₖ)eₙᵏ) − f(x₀)]/(xₖ − x₀ₖ) = lim_{h→0} [f(x₀ + h eₙᵏ) − f(x₀)]/h = D_{xₖ} f(x₀),  (14.2)

where the last equality holds if the limit exists and is finite.

2. As said above, partial derivatives are not really a new concept. We are just treating f as a function of one variable at a time, keeping the other variables fixed. In other words, for simplicity taking S = Rⁿ and using the notation of the above definition, we can define

g_k : R → R, g_k(xₖ) = f(x₀ + (xₖ − x₀ₖ)eₙᵏ),

a function of only one variable, and, by definition of g_k,

lim_{xₖ→x₀ₖ} [f(x₀ + (xₖ − x₀ₖ)eₙᵏ) − f(x₀)]/(xₖ − x₀ₖ) = lim_{xₖ→x₀ₖ} [g_k(xₖ) − g_k(x₀ₖ)]/(xₖ − x₀ₖ) = g′_k(x₀ₖ),

or

lim_{h→0} [f(x₀ + h eₙᵏ) − f(x₀)]/h = lim_{h→0} [g_k(x₀ₖ + h) − g_k(x₀ₖ)]/h = g′_k(x₀ₖ).
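Since a partial derivative is just a one-variable derivative, it can be checked numerically by a one-dimensional difference quotient. The Python sketch below (the function and the point are hypothetical choices, made only for illustration) computes central differences for f(x₁, x₂) = x₁x₂, the function used at the beginning of this Section.

def partial(f, x0, k, h=1e-6):
    # Central difference quotient in the k-th coordinate, all others fixed.
    xp, xm = list(x0), list(x0)
    xp[k] += h
    xm[k] -= h
    return (f(*xp) - f(*xm)) / (2 * h)

f = lambda x1, x2: x1 * x2
print(partial(f, (2.0, 3.0), 0))  # ~ 3.0 = x2, i.e., D_{x1} f(2, 3)
print(partial(f, (2.0, 3.0), 1))  # ~ 2.0 = x1, i.e., D_{x2} f(2, 3)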
Remark 586 Loosely speaking, we can give the following geometrical interpretation of partial derivatives. Given f : R² → R admitting partial derivatives, ∂f(x₀)/∂x₁ is the slope of the graph of the function obtained by cutting the graph of f with a plane which is orthogonal to the x₁x₂ plane and goes through the line parallel to the x₁ axis and passing through the point x₀, line to which we have given the same orientation as the x₁ axis.

(Picture to be added.)
Definition 587 Given an open subset S of Rⁿ and a function f : S → R, if ∀k ∈ {1, ..., n} the limit in (14.1) exists, we call the gradient of f at x₀ the following vector:

(D_k f(x₀))_{k=1}ⁿ,

and we denote it by Df(x₀).

Example 588 Given f : R³ → R,

f(x, y, z) = e^{xy} cos x + sin(yz),

we have

Df(x, y, z) = ( −(sin x)e^{xy} + y(cos x)e^{xy},  z cos(yz) + x(cos x)e^{xy},  y cos(yz) ).
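The gradient just computed can be cross-checked symbolically; the following sympy sketch is only a verification aid, not part of the original argument.

import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.exp(x*y) * sp.cos(x) + sp.sin(y*z)
print([sp.diff(f, v) for v in (x, y, z)])
# Output agrees with Example 588:
# [-exp(x*y)*sin(x) + y*exp(x*y)*cos(x),
#  x*exp(x*y)*cos(x) + z*cos(y*z),
#  y*cos(y*z)]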

Remark 589 The existence of the gradient of f at x₀ does not imply continuity of the function at x₀, as the following example shows.

f : R² → R, f(x₁, x₂) = { x₁ + x₂ if either x₁ = 0 or x₂ = 0, i.e., (x₁, x₂) ∈ ({0} × R) ∪ (R × {0});  1 otherwise. }

(Figure: graph of f.)

D₁f(0) = lim_{x₁→0} [f(x₁, 0) − f(0, 0)]/(x₁ − 0) = lim_{x₁→0} x₁/x₁ = 1,

and similarly D₂f(0) = 1.

f is not continuous at 0: we want to show that ∃ε > 0 such that ∀δ > 0 there exists (x₁, x₂) ∈ R² such that (x₁, x₂) ∈ B(0, δ) and |f(x₁, x₂) − f(0, 0)| ≥ ε. Take ε = 1/2 and any (x₁, x₂) ∈ B(0, δ) such that x₁ ≠ 0 and x₂ ≠ 0. Then |f(x₁, x₂) − f(0, 0)| = 1 > ε.
14.2 Directional Derivatives

A first generalization of the concept of partial derivative of a function is presented in Definition 591 below.

Definition 590 Given

f : S ⊆ Rⁿ → Rᵐ, x ↦ f(x),

and i ∈ {1, ..., m}, the function

f_i : S ⊆ Rⁿ → R, x ↦ i-th component of f(x)

is called the i-th component function of f. Therefore,

∀x ∈ S, f(x) = (f_i(x))_{i=1}ᵐ.  (14.3)

Definition 591 Given m, n ∈ N\{0}, a set S ⊆ Rⁿ, x₀ ∈ Int S, u ∈ Rⁿ, h ∈ R such that x₀ + hu ∈ S, and f : S → Rᵐ, we call the directional derivative of f at x₀ in the direction u, denoted by the symbol

f′(x₀; u),

the limit

lim_{h→0} [f(x₀ + hu) − f(x₀)]/h,  (14.4)

if it exists and is finite.
Remark 592 Assume that the limit in (14.4) exists and is finite. Then, from (14.3) and using Proposition 441,

f′(x₀; u) = lim_{h→0} [f(x₀ + hu) − f(x₀)]/h = ( lim_{h→0} [f_i(x₀ + hu) − f_i(x₀)]/h )_{i=1}ᵐ = (f′_i(x₀; u))_{i=1}ᵐ.

If u = eₙʲ, the j-th element of the canonical basis of Rⁿ, we then have

f′(x₀; eₙʲ) = ( lim_{h→0} [f_i(x₀ + h eₙʲ) − f_i(x₀)]/h )_{i=1}ᵐ = (f′_i(x₀; eₙʲ))_{i=1}ᵐ =(*) (D_{xⱼ} f_i(x₀))_{i=1}ᵐ := D_{xⱼ} f(x₀),  (14.5)

where equality (*) follows from (14.2).

We can then construct a matrix whose n columns are the above vectors, a matrix which involves all partial derivatives of all component functions of f. That matrix is formally defined below.
Definition 593 Assume that f = (f_i)_{i=1}ᵐ : S ⊆ Rⁿ → Rᵐ admits all partial derivatives at x₀. The Jacobian matrix of f at x₀ is denoted by Df(x₀) and is the following m × n matrix:

⎡ D_{x₁}f₁(x₀)  ...  D_{xⱼ}f₁(x₀)  ...  D_{xₙ}f₁(x₀) ⎤
⎢ ...                ...                ...           ⎥
⎢ D_{x₁}f_i(x₀) ...  D_{xⱼ}f_i(x₀) ...  D_{xₙ}f_i(x₀) ⎥
⎢ ...                ...                ...           ⎥
⎣ D_{x₁}f_m(x₀) ...  D_{xⱼ}f_m(x₀) ...  D_{xₙ}f_m(x₀) ⎦

= [ D_{x₁}f(x₀)  ...  D_{xⱼ}f(x₀)  ...  D_{xₙ}f(x₀) ].
Remark 594 How to easily write the Jacobian matrix of a function.
To compute the Jacobian of f, it is convenient to construct a table as follows.
1. In the first column, write the m component functions f₁, ..., f_i, ..., f_m of f.
2. In the first row, write the subvectors x₁, ..., xⱼ, ..., xₙ of x.
3. For each i and j, write the partial Jacobian matrix D_{xⱼ}f_i(x) in the entry at the intersection of the i-th row and j-th column.

We then obtain the following table,

        x₁             ...  xⱼ             ...  xₙ
f₁    ⎡ D_{x₁}f₁(x)         D_{xⱼ}f₁(x)         D_{xₙ}f₁(x) ⎤
...   ⎢                                                      ⎥
f_i   ⎢ D_{x₁}f_i(x)        D_{xⱼ}f_i(x)        D_{xₙ}f_i(x) ⎥
...   ⎢                                                      ⎥
f_m   ⎣ D_{x₁}f_m(x)        D_{xⱼ}f_m(x)        D_{xₙ}f_m(x) ⎦

where the Jacobian matrix is the part of the table between square brackets.
Example 595 Given f : R⁴ → R⁵,

f(x, y, z, t) = ( xy/(x²+1),  (x + yz)/eˣ,  xyzt/eᵗ,  x + y + z + t,  x² + t² ),

its Jacobian matrix is the following 5 × 4 matrix:

⎡ y/(x²+1) − 2x²y/(x²+1)²   x/(x²+1)   0        0                 ⎤
⎢ 1/eˣ − (x+yz)/eˣ          z/eˣ       y/eˣ     0                 ⎥
⎢ tyz/eᵗ                    txz/eᵗ     txy/eᵗ   xyz/eᵗ − txyz/eᵗ  ⎥
⎢ 1                         1          1        1                 ⎥
⎣ 2x                        0          0        2t                ⎦
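The matrix above can be cross-checked symbolically; the following sympy sketch is only a verification aid.

import sympy as sp

x, y, z, t = sp.symbols('x y z t')
F = sp.Matrix([x*y/(x**2 + 1),
               (x + y*z)/sp.exp(x),
               x*y*z*t/sp.exp(t),
               x + y + z + t,
               x**2 + t**2])
J = F.jacobian([x, y, z, t])
print(sp.simplify(J))  # agrees entrywise with the 5 x 4 matrix above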
Remark 596 From Remark 592,

[∀u ∈ Rⁿ, f′(x₀; u) exists] ⇒ [Df(x₀) exists].  (14.6)

On the other hand, the opposite implication does not hold true. Consider the example in Remark 589. There, we have seen that

D_x f(0) = D_y f(0) = 1.

But if u = (u₁, u₂) with u₁ ≠ 0 and u₂ ≠ 0, we have

lim_{h→0} [f(0 + hu) − f(0)]/h = lim_{h→0} (1 − 0)/h = ∞.
Remark 597 Again loosely speaking, we can give the following geometrical interpretation of directional derivatives. Take f : R² → R admitting directional derivatives. f′(x₀; u) with ‖u‖ = 1 is the slope of the graph of the function obtained by cutting the graph of f with a plane which is orthogonal to the x₁x₂ plane and goes through the line through the points x₀ and x₀ + u, line to which we have given the same orientation as u.

(Picture to be added.)
Example 598 Take

f : Rⁿ → R, x ↦ x · x = ‖x‖².

Then, the existence of f′(x₀; u) can be checked by computing the following limit:

lim_{h→0} [f(x₀ + hu) − f(x₀)]/h = lim_{h→0} [(x₀ + hu)·(x₀ + hu) − x₀·x₀]/h
= lim_{h→0} [x₀·x₀ + h x₀·u + h u·x₀ + h² u·u − x₀·x₀]/h
= lim_{h→0} [2h x₀·u + h² u·u]/h = lim_{h→0} (2 x₀·u + h u·u) = 2 x₀·u.
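The closed form 2 x₀·u can be checked against the difference quotient in (14.4); the following Python sketch does so for hypothetical, purely illustrative vectors.

import numpy as np

f = lambda x: float(x @ x)               # f(x) = ||x||^2
x0 = np.array([1.0, -2.0, 0.5])
u = np.array([0.3, 0.4, -1.0])
h = 1e-7
print((f(x0 + h*u) - f(x0)) / h)         # difference quotient, ~ -2.0
print(2 * x0 @ u)                        # closed form: 2 x0 . u = -2.0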
Exercise 599 Verify that f′(x₀; −u) = −f′(x₀; u).

Solution:

f′(x₀; −u) = lim_{h→0} [f(x₀ + h(−u)) − f(x₀)]/h = lim_{h→0} [f(x₀ − hu) − f(x₀)]/h;

setting k := −h, this equals

lim_{k→0} [f(x₀ + ku) − f(x₀)]/(−k) = −lim_{k→0} [f(x₀ + ku) − f(x₀)]/k = −f′(x₀; u).
Remark 600 It is not the case that

[∀u ∈ Rⁿ, f′(x₀; u) exists] ⇒ [f is continuous at x₀],  (14.7)

as the following example shows. Consider

f : R² → R, f(x, y) = { xy²/(x² + y⁴) if x ≠ 0;  0 if x = 0, i.e., (x, y) ∈ {0} × R. }

(Figure: graph of f.)

Let's compute f′(0; u). If u₁ ≠ 0,

lim_{h→0} [f(0 + hu) − f(0)]/h = lim_{h→0} [hu₁ · h²u₂²]/[(h²u₁² + h⁴u₂⁴)h] = lim_{h→0} u₁u₂²/(u₁² + h²u₂⁴) = u₂²/u₁.

If u₁ = 0, we have

lim_{h→0} [f(0 + hu) − f(0)]/h = lim_{h→0} [f(0, hu₂) − f(0)]/h = lim_{h→0} 0/h = 0.

On the other hand, if x = y² and x, y ≠ 0, i.e., along the graph of the parabola x = y² except the origin, we have

f(x, y) = f(y², y) = y⁴/(y⁴ + y⁴) = 1/2,

while f(0, 0) = 0.

Roughly speaking, the existence of directional derivatives at a given point in all directions implies continuity along straight lines through that point; it does not imply continuity along all possible curves through that point, as in the case of the parabola in the picture above.
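This discontinuity along the parabola is easy to see numerically; the following Python sketch (illustrative only) evaluates f along x = y² as y → 0.

def f(x, y):
    return x * y**2 / (x**2 + y**4) if x != 0 else 0.0

for y in (0.1, 0.01, 0.001):
    print(f(y**2, y))   # always 0.5, while f(0, 0) = 0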
Remark 601 We are now left with two problems:
1. Is there a definition of derivative whose existence implies continuity?
2. Is there any easy way to compute the directional derivative?
Appendix (to be corrected)

There are other definitions of directional derivatives used in the literature. Let the following objects be given: m, n ∈ N\{0}, a set S ⊆ Rⁿ, x₀ ∈ Int S, u ∈ Rⁿ, h ∈ R such that x₀ + hu ∈ S, f : S → Rᵐ.

Definition 602 (our definition, following Apostol) We call the directional derivative of f at x₀ in the direction u according to Apostol, denoted by the symbol f′_A(x₀; u), the limit

lim_{h→0} [f(x₀ + hu) − f(x₀)]/h,  (14.8)

if it exists and is finite.

Definition 603 (Girsanov) We call the directional derivative of f at x₀ in the direction u according to Girsanov, denoted by the symbol f′_G(x₀; u), the limit

lim_{h→0⁺} [f(x₀ + hu) − f(x₀)]/h,  (14.9)

if it exists and is finite.

Definition 604 (Wikipedia) Take u ∈ Rⁿ such that ‖u‖ = 1. We call the directional derivative of f at x₀ in the direction u according to Wikipedia, denoted by the symbol f′_W(x₀; u), the limit

lim_{h→0⁺} [f(x₀ + hu) − f(x₀)]/h,  (14.10)

if it exists and is finite.

Fact 1. For given x₀ ∈ S, u ∈ Rⁿ, existence in the sense of A implies existence in the sense of G, which implies existence in the sense of W, while the opposite implications do not hold true. In particular, to see why G does not imply A, just take f : R → R,

f(x) = { 0 if x < 0;  1 if x ≥ 0, }

and observe that while the right derivative at 0 is

lim_{h→0⁺} [f(h) − f(0)]/h = lim_{h→0⁺} (1 − 1)/h = 0,

the left derivative is

lim_{h→0⁻} [f(h) − f(0)]/h = lim_{h→0⁻} (0 − 1)/h = +∞.

Fact 2. For given x₀ ∈ S, if f′_W(x₀; u) exists, then f′_G(x₀; v) exists for any v = λu with λ ∈ R₊₊, and equals λ f′_W(x₀; u).

Proof.
f′_G(x₀; v) = lim_{h→0⁺} [f(x₀ + hλu) − f(x₀)]/h = λ lim_{h→0⁺} [f(x₀ + hλu) − f(x₀)]/(λh); setting k = λh > 0, this equals λ lim_{k→0⁺} [f(x₀ + ku) − f(x₀)]/k = λ f′_W(x₀; u).

Fact 3. Assume that u ≠ 0 and x₀ ∈ Rⁿ. Then the following implications are true:

[∀u ∈ Rⁿ, f′_A(x₀; u) exists] ⇔ [∀u ∈ Rⁿ, f′_G(x₀; u) exists] ⇔ [∀u ∈ Rⁿ such that ‖u‖ = 1, f′_W(x₀; u) exists].

Proof.
From Fact 1, we are left with showing just two implications.

G ⇒ A. We want to show that

[∀u ∈ Rⁿ, lim_{h→0⁺} [f(x₀ + hu) − f(x₀)]/h ∈ R] ⇒ [∀v ∈ Rⁿ, lim_{h→0} [f(x₀ + hv) − f(x₀)]/h ∈ R].

Therefore, it suffices to show that l := lim_{h→0⁻} [f(x₀ + hv) − f(x₀)]/h ∈ R. Take u = −v. Then,

l = lim_{h→0⁻} [f(x₀ − hu) − f(x₀)]/h = lim_{h→0⁻} −[f(x₀ − hu) − f(x₀)]/(−h); setting k = −h, = lim_{k→0⁺} −[f(x₀ + ku) − f(x₀)]/k ∈ R.

W ⇒ G. The proof of this implication is basically the proof of Fact 2. We want to show that

[∀u ∈ Rⁿ such that ‖u‖ = 1, lim_{h→0⁺} [f(x₀ + hu) − f(x₀)]/h ∈ R] ⇒ [∀v ∈ Rⁿ\{0}, l := lim_{h→0⁺} [f(x₀ + hv) − f(x₀)]/h ∈ R].

In fact,

l := lim_{h→0⁺} [f(x₀ + h‖v‖ · v/‖v‖) − f(x₀)]/h ∈ R,

simply because ‖v/‖v‖‖ = 1.
Remark 605 We can give the following geometrical interpretation of directional derivatives. First of all, observe that from Proposition 608,

f′(x₀; u) := lim_{h→0} [f(x₀ + hu) − f(x₀)]/h = df_{x₀}(u).

If f : R → R, we then have

f′(x₀; u) = f′(x₀) · u.

Therefore, if u = 1, we have f′(x₀; u) = f′(x₀), and if u > 0, we have sign f′(x₀; u) = sign f′(x₀).

Take now f : R² → R admitting directional derivatives. Then,

f′(x₀; u) = Df(x₀) · u with ‖u‖ = 1

is the slope of the graph of the function obtained by cutting the graph of f with a plane which is orthogonal to the x₁x₂ plane, along the line going through the points x₀ and x₀ + u, in the direction from x₀ to x₀ + u.
Chapter 15

Differentiability

15.1 Total Derivative and Differentiability

If f : R → R, we say that f is differentiable at x₀ if the following limit exists and is finite:

lim_{h→0} [f(x₀ + h) − f(x₀)]/h,

and we write

f′(x₀) = lim_{h→0} [f(x₀ + h) − f(x₀)]/h

or, in an equivalent manner,

lim_{h→0} [f(x₀ + h) − f(x₀) − f′(x₀)h]/h = 0

and

f(x₀ + h) − (f(x₀) + f′(x₀)h) = r(h), where lim_{h→0} r(h)/h = 0,

or

f(x₀ + h) = f(x₀) + f′(x₀)h + r(h),

or, using what was said in Section 8.5, and more specifically using Definition 8.8,

f(x₀ + h) = f(x₀) + l_{f′(x₀)}(h) + r(h),

where l_{f′(x₀)} ∈ L(R, R) and lim_{h→0} r(h)/h = 0.

Definition 606 Given a set S ⊆ Rⁿ, x₀ ∈ Int S, f : S → Rᵐ, and u ∈ Rⁿ, u ≠ 0, such that x₀ + u ∈ S, we say that f is differentiable at x₀ if there exists a linear function df_{x₀} : Rⁿ → Rᵐ such that

lim_{u→0} [f(x₀ + u) − f(x₀) − df_{x₀}(u)]/‖u‖ = 0.  (15.1)

In that case, the linear function df_{x₀} is called the total derivative or the differential or simply the derivative of f at x₀.
Remark 607 Obviously, given the conditions of the previous Definition, we can say that f is differentiable at x₀ if there exists a linear function df_{x₀} : Rⁿ → Rᵐ such that, ∀u ∈ Rⁿ such that x₀ + u ∈ S,

f(x₀ + u) = f(x₀) + df_{x₀}(u) + r(u), with lim_{u→0} r(u)/‖u‖ = 0,  (15.2)

or

f(x₀ + u) = f(x₀) + df_{x₀}(u) + ‖u‖ E_{x₀}(u), with lim_{u→0} E_{x₀}(u) = 0.  (15.3)

The above equations are called the first-order Taylor formula (of f at x₀ in the direction u). Condition (15.3) is the most convenient one to use in many instances.
Proposition 608 Assume that f : S → Rᵐ is differentiable at x₀; then

∀u ∈ Rⁿ, f′(x₀; u) = df_{x₀}(u).

Proof.

f′(x₀; u) := lim_{h→0} [f(x₀ + hu) − f(x₀)]/h
=(1) lim_{h→0} [f(x₀) + df_{x₀}(hu) + ‖hu‖ E_{x₀}(hu) − f(x₀)]/h
=(2) lim_{h→0} [h df_{x₀}(u) + |h| ‖u‖ E_{x₀}(hu)]/h
=(3) lim_{h→0} df_{x₀}(u) + lim_{h→0} sign(h) ‖u‖ E_{x₀}(hu)
=(4) df_{x₀}(u) + ‖u‖ lim_{hu→0} sign(h) E_{x₀}(hu)
=(5) df_{x₀}(u),

where
(1) follows from (15.3) with hu in the place of u,
(2) from the fact that df_{x₀} is linear and therefore (Exercise) continuous, and from a property of a norm,
(3) from the fact that |h|/h = sign(h) (sign is the function defined as follows: sign : R → {−1, 0, +1}, x ↦ −1 if x < 0, 0 if x = 0, +1 if x > 0),
(4) from the fact that h → 0 implies that hu → 0,
(5) from the assumption that f is differentiable at x₀.
Remark 609 The above Proposition implies that if the differential exists, then it is unique — from the fact that the limit is unique, if it exists.

Proposition 610 If f : S → Rᵐ is differentiable at x₀, then f is continuous at x₀.

Proof. We have to prove that

lim_{u→0} [f(x₀ + u) − f(x₀)] = 0,

i.e., from (15.3), it suffices to show that

lim_{u→0} [df_{x₀}(u) + ‖u‖ E_{x₀}(u)] = df_{x₀}(0) + lim_{u→0} ‖u‖ E_{x₀}(u) = 0,

where the first equality follows from the fact that df_{x₀} is linear and therefore continuous, and the second equality from the fact, again, that df_{x₀} is linear, and therefore df_{x₀}(0) = 0, and from (15.3).

Remark 611 The above Proposition is the answer to Question 1 in Remark 601. We still do not have an answer to Question 2, and another question naturally arises at this point:
3. Is there an easy way of checking differentiability?
15.2 Total Derivatives in terms of Partial Derivatives

In Remark 613 below, we answer Question 2 in Remark 601: "Is there any easy way to compute the directional derivative?"

Proposition 612 Assume that f = (f_j)_{j=1}ᵐ : S ⊆ Rⁿ → Rᵐ is differentiable at x₀. The matrix associated with df_{x₀} with respect to the canonical bases of Rⁿ and Rᵐ is the Jacobian matrix Df(x₀), i.e., using the notation of Section 8.5,

[df_{x₀}] = Df(x₀),

i.e.,

∀x ∈ Rⁿ, df_{x₀}(x) = Df(x₀) x.  (15.4)

Proof. From (8.6) in Section 8.5,

[df_{x₀}] = [ df_{x₀}(eₙ¹)  ...  df_{x₀}(eₙⁱ)  ...  df_{x₀}(eₙⁿ) ]_{m×n}.

From Proposition 608,

∀i ∈ {1, ..., n}, df_{x₀}(eₙⁱ) = f′(x₀; eₙⁱ),

and from (14.5),

f′(x₀; eₙⁱ) = (D_{xᵢ} f_j(x₀))_{j=1}ᵐ.

Then

[df_{x₀}] = [ (D_{x₁}f_j(x₀))_{j=1}ᵐ  ...  (D_{xᵢ}f_j(x₀))_{j=1}ᵐ  ...  (D_{xₙ}f_j(x₀))_{j=1}ᵐ ]_{m×n},

as desired.
Remark 613 From Proposition 608 and the above Proposition 612, we have that if f is differentiable at x₀, then

∀u ∈ Rⁿ, f′(x₀; u) = Df(x₀) u.
Remark 614 From (15.4), we get

‖df_{x₀}(x)‖ = ‖[Df(x₀)]_{m×n} x‖ ≤(1) Σ_{j=1}ᵐ |Df_j(x₀) x| ≤(2) Σ_{j=1}ᵐ ‖Df_j(x₀)‖ ‖x‖,

where (1) follows from Remark 56 and (2) from the Cauchy-Schwarz inequality in (53). Therefore, defining M := Σ_{j=1}ᵐ ‖Df_j(x₀)‖, we have that

‖df_{x₀}(x)‖ ≤ M ‖x‖

and

lim_{x→0} ‖df_{x₀}(x)‖ = 0.
Remark 615 We have seen that

f differentiable at x₀ ⇒ f admits directional derivatives at x₀ ⇒ Df(x₀) exists,

while neither converse implication holds; and that f differentiable at x₀ ⇒ f continuous at x₀, while neither the existence of all directional derivatives nor the existence of Df(x₀) implies continuity at x₀. Therefore,

Df(x₀) exists ⇏ f differentiable at x₀,

and

f admits directional derivatives at x₀ ⇏ f differentiable at x₀.

We still do not have an answer to Question 3 in Remark 611: "Is there an easy way of checking differentiability?" We will provide an answer in Proposition 643.
Chapter 16

Some Theorems

We first introduce some needed definitions.

Definition 616 Given an open S ⊆ Rⁿ,

f : S → Rᵐ, x := (xⱼ)_{j=1}ⁿ ↦ f(x) = (f_i(x))_{i=1}ᵐ,

I ⊆ {1, ..., m} and J ⊆ {1, ..., n}, the partial Jacobian of (f_i)_{i∈I} with respect to (xⱼ)_{j∈J} at x₀ ∈ S is the following (#I) × (#J) submatrix of Df(x₀):

[ ∂f_i(x₀)/∂xⱼ ]_{i∈I, j∈J},

and it is denoted by

D_{(xⱼ)_{j∈J}} (f_i)_{i∈I} (x₀).

Example 617 Take:
S an open subset of R^{n₁}, with generic element x′ = (xⱼ)_{j=1}^{n₁},
T an open subset of R^{n₂}, with generic element x″ = (xₖ)_{k=1}^{n₂}, and

f : S × T → Rᵐ, (x′, x″) ↦ f(x′, x″).

Then, defining n = n₁ + n₂, we have

D_{x′} f(x₀) = ⎡ D_{x₁}f₁(x₀)  ...  D_{x_{n₁}}f₁(x₀) ⎤
               ⎢ ...                ...              ⎥
               ⎢ D_{x₁}f_i(x₀) ...  D_{x_{n₁}}f_i(x₀) ⎥
               ⎢ ...                ...              ⎥
               ⎣ D_{x₁}f_m(x₀) ...  D_{x_{n₁}}f_m(x₀) ⎦_{m×n₁}

and, similarly,

D_{x″} f(x₀) := ⎡ D_{x_{n₁+1}}f₁(x₀)  ...  D_{xₙ}f₁(x₀) ⎤
                ⎢ ...                      ...          ⎥
                ⎢ D_{x_{n₁+1}}f_i(x₀) ...  D_{xₙ}f_i(x₀) ⎥
                ⎢ ...                      ...          ⎥
                ⎣ D_{x_{n₁+1}}f_m(x₀) ...  D_{xₙ}f_m(x₀) ⎦_{m×n₂},

and therefore

Df(x₀) := [ D_{x′}f(x₀) | D_{x″}f(x₀) ]_{m×n}.
Definition 618 Given an open set S ⊆ Rⁿ and f : S → R, assume that ∀x ∈ S, Df(x) := (∂f(x)/∂xⱼ)_{j=1}ⁿ exists. Then, ∀j ∈ {1, ..., n}, we define the j-th partial derivative function as

∂f/∂xⱼ : S → R, x ↦ ∂f(x)/∂xⱼ.

Assuming that the above function has a partial derivative with respect to xₖ for k ∈ {1, ..., n}, we define it as the mixed second-order partial derivative of f with respect to xⱼ and xₖ and we write

∂²f(x)/(∂xⱼ∂xₖ) := ∂(∂f(x)/∂xⱼ)/∂xₖ.

Definition 619 Given f : S ⊆ Rⁿ → R, the Hessian matrix of f at x₀ is the n × n matrix

D²f(x₀) := [ ∂²f/(∂xⱼ∂xₖ)(x₀) ]_{j,k=1,...,n}.
Remark 620 D²f(x₀) is the Jacobian matrix of the gradient function of f.

Example 621 Compute the Hessian matrix of f : R⁴₊₊ → R,

f(x, y, z, t) = eˣ cos y + z² + x² log y + log x + log z + 2t log t.

We first compute the gradient:

Df(x, y, z, t) = ( 2x ln y + (cos y)eˣ + 1/x,  −(sin y)eˣ + x²/y,  2z + 1/z,  2 ln t + 2 ),

and then the Hessian matrix

⎡ 2 ln y + (cos y)eˣ − 1/x²   −(sin y)eˣ + 2x/y     0           0   ⎤
⎢ −(sin y)eˣ + 2x/y           −(cos y)eˣ − x²/y²    0           0   ⎥
⎢ 0                           0                     2 − 1/z²    0   ⎥
⎣ 0                           0                     0           2/t ⎦
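Both the gradient and the Hessian above can be cross-checked symbolically; the sympy sketch below is only a verification aid.

import sympy as sp

x, y, z, t = sp.symbols('x y z t', positive=True)
f = (sp.exp(x)*sp.cos(y) + z**2 + x**2*sp.log(y)
     + sp.log(x) + sp.log(z) + 2*t*sp.log(t))
grad = [sp.diff(f, v) for v in (x, y, z, t)]
H = sp.hessian(f, (x, y, z, t))
print(grad)  # matches the gradient in Example 621
print(H)     # matches the 4 x 4 Hessian matrix above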

16.1 The chain rule

Proposition 622 (Chain Rule) Given S ⊆ Rⁿ, T ⊆ Rᵐ,

f : S ⊆ Rⁿ → Rᵐ

such that Im f ⊆ T, and

g : T ⊆ Rᵐ → Rᵖ,

assume that f and g are differentiable at x₀ and at y₀ = f(x₀), respectively. Then

h : S ⊆ Rⁿ → Rᵖ, h(x) = (g ∘ f)(x)

is differentiable at x₀ and

dh_{x₀} = dg_{f(x₀)} ∘ df_{x₀}.

Proof. We want to show that there exists a linear function dh_{x₀} : Rⁿ → Rᵖ such that

h(x₀ + u) = h(x₀) + dh_{x₀}(u) + ‖u‖ Ẽ_{x₀}(u), with lim_{u→0} Ẽ_{x₀}(u) = 0,

and dh_{x₀} = dg_{f(x₀)} ∘ df_{x₀}.

Taking u sufficiently small (in order to have x₀ + u ∈ S), we have

h(x₀ + u) − h(x₀) = g[f(x₀ + u)] − g[f(x₀)] = g[f(x₀ + u)] − g(y₀)

and, defining

v = f(x₀ + u) − y₀,

we get

h(x₀ + u) − h(x₀) = g(y₀ + v) − g(y₀).

Since f is differentiable at x₀, we get

v = df_{x₀}(u) + ‖u‖ E_{x₀}(u), with lim_{u→0} E_{x₀}(u) = 0.  (16.1)

Since g is differentiable at y₀ = f(x₀), we get

g(y₀ + v) − g(y₀) = dg_{y₀}(v) + ‖v‖ E_{y₀}(v), with lim_{v→0} E_{y₀}(v) = 0.  (16.2)

Inserting (16.1) in (16.2), we get

g(y₀ + v) − g(y₀) = dg_{y₀}(df_{x₀}(u) + ‖u‖ E_{x₀}(u)) + ‖v‖ E_{y₀}(v) = dg_{y₀}(df_{x₀}(u)) + ‖u‖ dg_{y₀}(E_{x₀}(u)) + ‖v‖ E_{y₀}(v).

Defining

Ẽ_{x₀}(u) := { 0 if u = 0;  dg_{y₀}(E_{x₀}(u)) + (‖v‖/‖u‖) E_{y₀}(v) if u ≠ 0, }

we are left with showing that

lim_{u→0} Ẽ_{x₀}(u) = 0.

Observe that

lim_{u→0} dg_{y₀}(E_{x₀}(u)) = 0,

since linear functions are continuous and from (16.1). Moreover, since lim_{u→0} v = 0, from (16.2) we get

lim_{u→0} E_{y₀}(v) = 0.

Finally, we have to show that ‖v‖/‖u‖ is bounded as u → 0. Now, from the definition of v and from Remark 614, defining M := Σ_{j=1}ᵐ ‖Df_j(x₀)‖,

‖v‖ = ‖df_{x₀}(u) + ‖u‖ E_{x₀}(u)‖ ≤ ‖df_{x₀}(u)‖ + ‖u‖ ‖E_{x₀}(u)‖ ≤ (M + ‖E_{x₀}(u)‖) ‖u‖

and

lim_{u→0} ‖v‖/‖u‖ ≤ lim_{u→0} (M + ‖E_{x₀}(u)‖) = M,

as desired.
Remark 623 From Proposition 612 and Proposition 293, or simply (8.10), we also have

[Dh(x₀)]_{p×n} = [Dg(f(x₀))]_{p×m} [Df(x₀)]_{m×n}.

Observe that Dg(f(x₀)) is obtained by computing Dg(y) and then substituting f(x₀) in the place of y. We therefore also write Dg(f(x₀)) = Dg(y)|_{y=f(x₀)}.

Exercise 624 Compute dh_{x₀} if n = 1 and p = 1.
Definition 625 Given f : S ⊆ Rⁿ → R^{m₁} and g : S ⊆ Rⁿ → R^{m₂}, define

(f, g) : Rⁿ → R^{m₁+m₂}, (f, g)(x) = (f(x), g(x)).

Remark 626 Clearly,

D(f, g)(x₀) = ⎡ Df(x₀) ⎤
              ⎣ Dg(x₀) ⎦.
Example 627 Given

f : R → R², x ↦ (sin x, cos x),
g : R² → R², (y₁, y₂) ↦ (y₁ + y₂, y₁y₂),
h = g ∘ f : R → R², x ↦ (sin x + cos x, sin x cos x),

verify the conclusion of the Chain Rule Proposition.

Df(x) = ⎡ cos x  ⎤        Dg(y) = ⎡ 1   1  ⎤        Dg(f(x)) = ⎡ 1      1     ⎤
        ⎣ −sin x ⎦                ⎣ y₂  y₁ ⎦                   ⎣ cos x  sin x ⎦

Dg(f(x)) Df(x) = ⎡ 1      1     ⎤ ⎡ cos x  ⎤ = ⎡ cos x − sin x  ⎤ = Dh(x).
                 ⎣ cos x  sin x ⎦ ⎣ −sin x ⎦   ⎣ cos²x − sin²x  ⎦
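The equality Dh(x) = Dg(f(x)) Df(x) can also be checked numerically at a hypothetical point; the Python sketch below compares the matrix product with a finite-difference derivative of h.

import numpy as np

x = 0.7                                               # illustrative point
Df = np.array([[np.cos(x)], [-np.sin(x)]])            # 2 x 1
Dg = np.array([[1.0, 1.0], [np.cos(x), np.sin(x)]])   # Dg evaluated at f(x)
print((Dg @ Df).ravel())                              # chain rule product

h = lambda x: np.array([np.sin(x) + np.cos(x), np.sin(x)*np.cos(x)])
eps = 1e-7
print((h(x + eps) - h(x)) / eps)                      # finite-difference Dh(x)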
Example 628 Take

g : Rᵏ → Rⁿ, t ↦ g(t),
f : Rⁿ × Rᵏ → Rᵐ, (x, t) ↦ f(x, t),
h : Rᵏ → Rᵐ, t ↦ f(g(t), t).

Then

g̃ := (g, id_{Rᵏ}) : Rᵏ → Rⁿ × Rᵏ, t ↦ (g(t), t),

and

h = f ∘ g̃ = f ∘ (g, id_{Rᵏ}).

Therefore, assuming that f, g, h are differentiable,

[Dh(t₀)]_{m×k} = [Df(g(t₀), t₀)]_{m×(n+k)} ⎡ Dg(t₀) ⎤
                                            ⎣   I    ⎦_{(n+k)×k}

= [ [D_x f(g(t₀), t₀)]_{m×n} | [D_t f(g(t₀), t₀)]_{m×k} ] ⎡ [Dg(t₀)]_{n×k} ⎤
                                                            ⎣   I_{k×k}     ⎦

= [D_x f(g(t₀), t₀)]_{m×n} [Dg(t₀)]_{n×k} + [D_t f(g(t₀), t₀)]_{m×k}.

In the case k = n = m = 1, the above expression becomes

df(g(t), t)/dt = ∂f(g(t), t)/∂x · dg(t)/dt + ∂f(g(t), t)/∂t

or

df(g(t), t)/dt = ∂f(x, t)/∂x|_{x=g(t)} · dg(t)/dt + ∂f(x, t)/∂t|_{x=g(t)}.
16.2 Mean value theorem

Proposition 629 (Mean Value Theorem) Let S be an open subset of Rⁿ and f : S → Rᵐ a differentiable function. Let x, y ∈ S be such that the line segment joining them is contained in S, i.e.,

L(x, y) := {z ∈ Rⁿ : ∃λ ∈ [0, 1] such that z = (1 − λ)x + λy} ⊆ S.

Then

∀a ∈ Rᵐ, ∃z ∈ L(x, y) such that a · [f(y) − f(x)] = a · [Df(z)(y − x)].

Remark 630 Under the assumptions of the above theorem, the following conclusion is false:

∃z ∈ L(x, y) such that f(y) − f(x) = Df(z)(y − x).

But if f : S → R^{m=1}, then setting a ∈ R^{m=1} equal to 1, we get that the above statement is indeed true.
Proof of Proposition 629. Define u = y − x. Since S is open and L(x, y) ⊆ S, ∃δ > 0 such that ∀t ∈ (−δ, 1 + δ), x + tu = (1 − t)x + ty ∈ S. Taking a ∈ Rᵐ, define

F : (−δ, 1 + δ) → R, t ↦ a · f(x + tu) = Σ_{j=1}ᵐ aⱼ fⱼ(x + tu).

Then

F′(t) = Σ_{j=1}ᵐ aⱼ [Dfⱼ(x + tu)]_{1×n} u_{n×1} = a_{1×m} [Df(x + tu)]_{m×n} u_{n×1},

and F is continuous on [0, 1] and differentiable on (0, 1); then, we can apply the Calculus 1 Mean Value Theorem and conclude that

∃θ ∈ (0, 1) such that F(1) − F(0) = F′(θ),

and, by definition of F and u,

∃θ ∈ (0, 1) such that a · [f(y) − f(x)] = a · Df(x + θu)(y − x),

which, choosing z = x + θu, gives the desired result.

Remark 631 Using the results we have seen on directional derivatives, the conclusion of the above theorem can be rewritten as follows:

∀a ∈ Rᵐ, ∃z ∈ L(x, y) such that a · [f(y) − f(x)] = a · f′(z; y − x).

As in the case of real functions of a real variable, the Mean Value Theorem allows us to give a simple relationship between the sign of the derivative and monotonicity.

Definition 632 A set C ⊆ Rⁿ is convex if ∀x₁, x₂ ∈ C and ∀λ ∈ [0, 1], (1 − λ)x₁ + λx₂ ∈ C.

Proposition 633 Let S be an open and convex subset of Rⁿ and f : S → Rᵐ a differentiable function. If ∀x ∈ S, df_x = 0, then f is constant on S.

Proof. Take arbitrary x, y ∈ S. Then, since S is convex and f is differentiable, from the Mean Value Theorem we have that

∀a ∈ Rᵐ, ∃z ∈ L(x, y) such that a · [f(y) − f(x)] = a · [Df(z)(y − x)] = 0.

Taking a = f(y) − f(x), we get that

‖f(y) − f(x)‖² = 0

and therefore

f(x) = f(y),

as desired.
Definition 634 Given x := (xᵢ)_{i=1}ⁿ, y := (yᵢ)_{i=1}ⁿ ∈ Rⁿ,

x ≥ y means ∀i ∈ {1, ..., n}, xᵢ ≥ yᵢ;
x > y means x ≥ y and x ≠ y;
x ≫ y means ∀i ∈ {1, ..., n}, xᵢ > yᵢ.

Definition 635 f : S ⊆ Rⁿ → R is increasing if x, y ∈ S, x > y ⇒ f(x) ≥ f(y).
f is strictly increasing if x, y ∈ S, x > y ⇒ f(x) > f(y).

Proposition 636 Take an open, convex subset S of Rⁿ, and f : S → R differentiable.
1. If ∀x ∈ S, Df(x) ≥ 0, then f is increasing;
2. If ∀x ∈ S, Df(x) ≫ 0, then f is strictly increasing.

Proof. 1. Take y > x. Since S is convex, L(x, y) ⊆ S. Then, from the Mean Value Theorem,

∃z ∈ L(x, y) such that f(y) − f(x) = Df(z)(y − x).

Since y − x > 0 and Df(z) ≥ 0, the result follows.
2. Take y > x. Since S is convex, L(x, y) ⊆ S. Then, from the Mean Value Theorem,

∃z ∈ L(x, y) such that f(y) − f(x) = Df(z)(y − x).

Since y − x > 0 and Df(z) ≫ 0, the result follows.
Exercise 637 Is the following statement correct: "If ∀x ∈ S, Df(x) > 0, then f is strictly increasing"?

Corollary 638 Take an open, convex subset S of Rⁿ, and f ∈ C¹(S, R). If x₀ ∈ S and u ∈ Rⁿ\{0} are such that f′(x₀; u) > 0, then ∃t* ∈ R₊₊ such that ∀t ∈ (0, t*),

f(x₀ + tu) ≥ f(x₀).

Proof. Since f is C¹ and f′(x₀; u) = Df(x₀) · u > 0, ∃r > 0 such that

∀x ∈ B(x₀, r), f′(x; u) > 0.

Then ∀t ∈ (−r, r), ‖(x₀ + (t/‖u‖)u) − x₀‖ = |t| < r, and therefore

f′(x₀ + (t/‖u‖)u; u) > 0.

Then, from the Mean Value Theorem, setting t* := r/(2‖u‖), ∀t ∈ (0, t*) there exists τ ∈ (0, t) such that

f(x₀ + tu) − f(x₀) = t f′(x₀ + τu; u) ≥ 0,

since x₀ + τu ∈ B(x₀, r).
Definition 639 Given a function f : S ⊆ Rⁿ → R, x₀ ∈ S is a point of local maximum for f if

∃δ > 0 such that ∀x ∈ B(x₀, δ) ∩ S, f(x₀) ≥ f(x);

x₀ is a point of global maximum for f if

∀x ∈ S, f(x₀) ≥ f(x);

x₀ ∈ S is a point of strict local maximum for f if

∃δ > 0 such that ∀x ∈ (B(x₀, δ) ∩ S)\{x₀}, f(x₀) > f(x);

x₀ is a point of strict global maximum for f if

∀x ∈ S\{x₀}, f(x₀) > f(x).

Local, global and strict minima are defined in the obvious manner.

Proposition 640 If S ⊆ Rⁿ, f : S → R admits all partial derivatives at x₀ ∈ Int S and x₀ is a point of local maximum or minimum, then Df(x₀) = 0.

Proof. Assume that x₀ is a point of local maximum (the argument for a minimum is analogous): ∃δ > 0 such that ∀x ∈ B(x₀, δ), f(x₀) ≥ f(x). As in Remark 585, for any k ∈ {1, ..., n}, define

g_k : R → R, g_k(xₖ) = f(x₀ + (xₖ − x₀ₖ)eₙᵏ).

Then g_k has a local maximum point at x₀ₖ. Then, from Calculus 1,

g′_k(x₀ₖ) = 0.

Since, again from Remark 585, we have

D_k f(x₀) = g′_k(x₀ₖ),

the result follows.
16.3 A sufficient condition for differentiability

Definition 641 f = (f_i)_{i=1}ᵐ : S ⊆ Rⁿ → Rᵐ is continuously differentiable on A ⊆ S, or f is C¹ on A, or f ∈ C¹(A, Rᵐ), if ∀i ∈ {1, ..., m}, ∀j ∈ {1, ..., n},

D_{xⱼ} f_i : A → R, x ↦ D_{xⱼ} f_i(x) is continuous.

Definition 642 f : S ⊆ Rⁿ → R is twice continuously differentiable on A ⊆ S, or f is C² on A, or f ∈ C²(A, R), if ∀j, k ∈ {1, ..., n},

∂²f/(∂xⱼ∂xₖ) : A → R, x ↦ ∂²f/(∂xⱼ∂xₖ)(x) is continuous.

Proposition 643 If f is C¹ in an open neighborhood of x₀, then it is differentiable at x₀.

Proof. See Apostol (1974), page 357, or Section "Existence of derivative", page 232, in Bartle (1964), where the significant case f : Rⁿ → R^{m=1} is presented using the Cauchy-Schwarz inequality. See also Theorem 1, page 197, in Taylor and Mann, for the case f : R² → R.

The above result is the answer to Question 3 in Remark 611. To show that f : Rⁿ → Rᵐ is differentiable, it is enough to verify that all its partial derivatives, i.e., the entries of the Jacobian matrix, are continuous functions.
16.4 A sufficient condition for equality of mixed partial derivatives

Proposition 644 If f : S ⊆ Rⁿ → R is C² in an open neighborhood of x₀, then ∀i, k,

∂(∂f/∂xᵢ)/∂xₖ (x₀) = ∂(∂f/∂xₖ)/∂xᵢ (x₀),

or

∂²f/(∂xᵢ∂xₖ)(x₀) = ∂²f/(∂xₖ∂xᵢ)(x₀).

Proof. See Apostol (1974), Section 12.13, page 358.
16.5 Taylor's theorem for real valued functions

To get Taylor's theorem for functions f : Rⁿ → R, we introduce some notation in line with the definition of directional derivative:

f′(x; u) = Σ_{i=1}ⁿ D_{xᵢ} f(x) uᵢ.

Definition 645 Assume S is an open subset of Rⁿ and the function f : S → R admits partial derivatives at least up to order m, and x ∈ S, u ∈ Rⁿ. Then

f″(x; u) := Σ_{i=1}ⁿ Σ_{j=1}ⁿ D_{i,j} f(x) uᵢuⱼ,

f‴(x; u) := Σ_{i=1}ⁿ Σ_{j=1}ⁿ Σ_{k=1}ⁿ D_{i,j,k} f(x) uᵢuⱼuₖ,

and a similar definition applies to f^{(m)}(x; u).
Proposition 646 (Taylor's formula) Assume S is an open subset of Rⁿ and the function f : S → R admits partial derivatives at least up to order m, and all its partial derivatives of order < m are differentiable. If y and x are such that L(y, x) ⊆ S, then there exists z ∈ L(y, x) such that

f(y) = f(x) + Σ_{k=1}^{m−1} (1/k!) f^{(k)}(x; y − x) + (1/m!) f^{(m)}(z; y − x).

Proof. Since S is open and L(x, y) ⊆ S, ∃δ > 0 such that ∀t ∈ (−δ, 1 + δ), x + t(y − x) ∈ S. Define g : (−δ, 1 + δ) → R,

g(t) = f(x + t(y − x)).

From the standard Calculus 1 Taylor theorem, we have that ∃θ ∈ (0, 1) such that

f(y) − f(x) = g(1) − g(0) = Σ_{k=1}^{m−1} (1/k!) g^{(k)}(0) + (1/m!) g^{(m)}(θ).

Then

g′(t) = Df(x + t(y − x))(y − x) = Σ_{i=1}ⁿ D_{xᵢ} f(x + t(y − x))(yᵢ − xᵢ) = f′(x + t(y − x); y − x),

g″(t) = Σ_{i=1}ⁿ Σ_{j=1}ⁿ D_{xᵢ,xⱼ} f(x + t(y − x))(yᵢ − xᵢ)(yⱼ − xⱼ) = f″(x + t(y − x); y − x),

and similarly

g^{(m)}(t) = f^{(m)}(x + t(y − x); y − x).

Then the desired result follows substituting 0 in the place of t where needed and choosing z = x + θ(y − x).
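Taylor's formula can be illustrated numerically by truncating at order 2 and checking that the error is small when y is close to x. In the Python sketch below, the function and the points are hypothetical choices, made only for illustration; the gradient and Hessian are computed by hand for this particular f.

import numpy as np

f = lambda v: np.exp(v[0]) * np.sin(v[1])
x = np.array([0.0, 1.0])
y = np.array([0.1, 1.2])
u = y - x

grad = np.array([np.exp(x[0])*np.sin(x[1]), np.exp(x[0])*np.cos(x[1])])
H = np.array([[np.exp(x[0])*np.sin(x[1]),  np.exp(x[0])*np.cos(x[1])],
              [np.exp(x[0])*np.cos(x[1]), -np.exp(x[0])*np.sin(x[1])]])

order2 = f(x) + grad @ u + 0.5 * u @ H @ u
print(f(y), order2)   # close, since the remainder is of third order in ||u||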
Chapter 17

Implicit function theorem

17.1 Some intuition

Below, we present an informal discussion of the Implicit Function Theorem. Assume that

f : R² → R, (x, t) ↦ f(x, t)

is at least C¹. The basic goal is to study the nonlinear equation

f(x, t) = 0,

where x can be interpreted as an endogenous variable and t as a parameter (or an exogenous variable). Assume that

∃(x₀, t₀) ∈ R² such that f(x₀, t₀) = 0,

and, for some δ > 0,

∃ a C¹ function g : (t₀ − δ, t₀ + δ) → R, t ↦ g(t),

such that

f(g(t), t) = 0.  (17.1)

We can then say that g describes the solution to the equation

f(x, t) = 0,

in the unknown variable x and parameter t, in an open neighborhood of t₀. Therefore, using the Chain Rule — and, in fact, Example 628 — applied to both sides of (17.1), we get

∂f(x, t)/∂x|_{x=g(t)} · dg(t)/dt + ∂f(x, t)/∂t|_{x=g(t)} = 0

and, assuming that

∂f(x, t)/∂x|_{x=g(t)} ≠ 0,

we have

dg(t)/dt = − [∂f(x, t)/∂t|_{x=g(t)}] / [∂f(x, t)/∂x|_{x=g(t)}].  (17.2)

The above expression is the derivative of the function implicitly defined by (17.1) close to the value t₀. In other words, it is the slope of the level curve f(x, t) = 0 at the point (t, g(t)).

For example, taking

f : R² → R, (x, t) ↦ x² + t² − 1,

f(x, t) = 0 describes the circle with center at the origin and radius equal to 1. Putting t on the horizontal axis and x on the vertical axis, we have the following picture.
(Figure: the unit circle in the (t, x) plane.)

Clearly, f((0, 1)) = 0. As long as t ∈ (−1, 1),

g(t) = √(1 − t²)

is such that

f(g(t), t) = 0.  (17.3)

Observe that

d√(1 − t²)/dt = −t/√(1 − t²)

and

− [∂f(x, t)/∂t|_{x=g(t)}] / [∂f(x, t)/∂x|_{x=g(t)}] = −2t/(2x)|_{x=g(t)} = −t/√(1 − t²).

For example, for t = 1/√2, g′(t) = −(1/√2)/√(1 − 1/2) = −1.
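The slope formula (17.2) can be checked numerically by solving f(x, t) = 0 for x directly and differencing; the Python sketch below does so for the circle at t = 1/√2.

import math

t = 1 / math.sqrt(2)
x = math.sqrt(1 - t**2)          # g(t)
df_dt = 2 * t                    # partial of f in t at (g(t), t)
df_dx = 2 * x                    # partial of f in x at (g(t), t)
print(-df_dt / df_dx)            # formula (17.2): -1.0

g = lambda t: math.sqrt(1 - t**2)
h = 1e-7
print((g(t + h) - g(t)) / h)     # difference quotient, ~ -1.0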
Let's try to present a more detailed geometrical interpretation. (This discussion is taken from Sydsaeter (1981), pages 80-81.) Consider the set

{(x, t) ∈ R² : f(x, t) = 0}

presented in the following picture.

(Insert picture a., page 80.)

In this case, does the equation

f(x, t) = 0  (17.4)

define x as a function of t? Certainly, the curve presented in the picture is not the graph of a function with x as dependent variable and t as independent variable for all values of t in R. In fact,

1. if t ∈ (−∞, t₁], there is only one value of x which satisfies equation (17.4);
2. if t ∈ (t₁, t₂), there are two values of x for which f(x, t) = 0;
3. if t ∈ (t₂, +∞), there are no values satisfying the equation.

If we consider t belonging to an interval contained in (t₁, t₂), we have to restrict the admissible range of variation of x in order to conclude that equation (17.4) defines x as a function of t in that interval. For example, we see that if the rectangle R is as indicated in the picture, the given equation defines x as a function of t, for well-chosen domain and codomain — naturally associated with R. The graph of that function is indicated in the figure below.

(Insert picture b., page 80.)
The size of R is limited by the fact that we need to define a function and therefore one and only one value has to be associated with t. Similar rectangles and associated solutions to the equation can be constructed for all other points on the curve, except one: (t₂, x₂). Irrespective of how small we choose the rectangle around that point, there will be values of t close to t₂, say t′, such that there are two values of x, say x′ and x″, with the property that both (t′, x′) and (t′, x″) satisfy the equation and lie inside the rectangle. Therefore, equation (17.4) does not define x as a function of t in an open neighborhood of the point (t₂, x₂). In fact, there the slope of the tangent to the curve is infinite. If you try to use expression (17.2) to compute the slope of the curve defined by x² + t² = 1 at the point (1, 0), you get an expression with zero in the denominator.

On the basis of the above discussion, we see that it is crucial to require the condition

∂f(x, t)/∂x|_{x=g(t)} ≠ 0

to ensure the possibility of locally writing x as a solution (to (17.4)) function of t.

We can informally summarize what we said as follows.

If f is C¹, f(x₀, t₀) = 0 and ∂f(x, t)/∂x|_{(x,t)=(x₀,t₀)} ≠ 0, then f(x, t) = 0 defines x as a C¹ function g of t in an open neighborhood of t₀, and

g′(t) = − [∂f(x, t)/∂t / ∂f(x, t)/∂x]|_{x=g(t)}.

The next sections provide a formal statement and proof of the Implicit Function Theorem. Some work is needed.
17.2 Functions with full rank square Jacobian

Proposition 647 Taking a ∈ Rⁿ and r ∈ R₊₊, assume that
1. f := (f_i)_{i=1}ⁿ : Rⁿ → Rⁿ is continuous on Cl(B(a, r));
2. ∀x ∈ B(a, r), [Df(x)]_{n×n} exists and det Df(x) ≠ 0;
3. ∀x ∈ F(B(a, r)), f(x) ≠ f(a).

Then, ∃δ ∈ R₊₊ such that

f(B(a, r)) ⊇ B(f(a), δ).

Proof. Define B := B(a, r) and

g : F(B) → R, x ↦ ‖f(x) − f(a)‖.

From Assumption 3, ∀x ∈ F(B), g(x) > 0. Moreover, since g is continuous and F(B) is compact, g attains a global minimum value m > 0 on F(B). Take δ = m/2; to prove the desired result, it is enough to show that T := B(f(a), δ) ⊆ f(B), i.e., ∀y ∈ T, y ∈ f(B). Define

h : Cl(B) → R, x ↦ ‖f(x) − y‖.

Since h is continuous and Cl(B) is compact, h attains a global minimum at a point c ∈ Cl(B). We now want to show that c ∈ B. Observe that, since y ∈ T = B(f(a), m/2),

h(a) = ‖f(a) − y‖ < m/2.  (17.5)

Therefore, since c is a global minimum point for h, it must be the case that h(c) < m/2. Now take x ∈ F(B); then

h(x) = ‖f(x) − y‖ = ‖f(x) − f(a) − (y − f(a))‖ ≥(1) ‖f(x) − f(a)‖ − ‖y − f(a)‖ ≥(2) g(x) − m/2 ≥(3) m − m/2 = m/2,

where (1) follows from Remark 54, (2) from (17.5) and (3) from the fact that g has minimum value equal to m. Therefore, ∀x ∈ F(B), h(x) > h(a) ≥ h(c) and h does not attain its minimum on F(B). Then h, and therefore h², attains its minimum at c ∈ B (∀x ∈ Cl(B), h(x) ≥ 0 and h(x) ≥ h(c); therefore h²(x) ≥ h²(c)). Since

H(x) := h²(x) = ‖f(x) − y‖² = Σ_{i=1}ⁿ (f_i(x) − yᵢ)²,

from Proposition 640, DH(c) = 0, i.e.,

∀k ∈ {1, ..., n}, 2 Σ_{i=1}ⁿ D_{xₖ} f_i(c)(f_i(c) − yᵢ) = 0,

i.e.,

[Df(c)]ᵀ_{n×n} (f(c) − y)_{n×1} = 0.

Then, from Assumption 2 (det Df(c) ≠ 0, and hence det [Df(c)]ᵀ ≠ 0),

f(c) = y,

i.e., since c ∈ B, y ∈ f(B), as desired.
Proposition 648 (1st sufficient condition for openness of a function)
Let an open set A ⊆ Rⁿ and a function f : A → Rⁿ be given. If
1. f is continuous,
2. f is one-to-one,
3. ∀x ∈ A, Df(x) exists and det Df(x) ≠ 0,
then f is open.

Proof. Taking b ∈ f(A), there exists a ∈ A such that f(a) = b. Since A is open, there exists r ∈ R₊₊ such that B(a, r) ⊆ A; then, for sufficiently small r, Cl(B(a, r)) ⊆ A (simply observe that ∀r ∈ R₊₊, Cl(B(x, r/2)) ⊆ B(x, r)). Moreover, since f is one-to-one and since a ∉ F(B(a, r)), ∀x ∈ F(B(a, r)), f(x) ≠ f(a). Then the assumptions of Proposition 647 are satisfied and there exists δ ∈ R₊₊ such that

f(A) ⊇ f(Cl(B(a, r))) ⊇ B(f(a), δ),

as desired.

Definition 649 Given f : S → T and A ⊆ S, the function f_{|A} is defined as follows:

f_{|A} : A → f(A), f_{|A}(x) = f(x).
Proposition 650 Let an open set A ⊆ Rⁿ and a function f : A → Rⁿ be given. If
1. f is C¹,
2. ∃a ∈ A such that det Df(a) ≠ 0,
then ∃r ∈ R₊₊ such that f is one-to-one on B(a, r), and, therefore, f_{|B(a,r)} is invertible.

Proof. Consider (Rⁿ)ⁿ with generic element z := (zⁱ)_{i=1}ⁿ, where ∀i ∈ {1, ..., n}, zⁱ ∈ Rⁿ, and define

h : R^{n²} → R, (zⁱ)_{i=1}ⁿ ↦ det ⎡ Df₁(z¹) ⎤
                                  ⎢ ...      ⎥
                                  ⎢ Df_i(zⁱ) ⎥
                                  ⎢ ...      ⎥
                                  ⎣ Dfₙ(zⁿ)  ⎦.

Observe that h is continuous because f is C¹ and the determinant function is continuous in its entries. Moreover, from Assumption 2,

h(a, ..., a, ..., a) = det Df(a) ≠ 0.

Therefore, ∃r′ ∈ R₊₊ such that

∀(zⁱ)_{i=1}ⁿ ∈ B((a, ..., a, ..., a), r′), h((zⁱ)_{i=1}ⁿ) ≠ 0.

Choosing r ∈ R₊₊ small enough that z¹, ..., zⁿ ∈ B(a, r) implies (zⁱ)_{i=1}ⁿ ∈ B((a, ..., a, ..., a), r′), we can summarize as follows:

∃r ∈ R₊₊ such that ∀i ∈ {1, .., n}, ∀zⁱ ∈ B(a, r), h(z¹, ..., zⁱ, ..., zⁿ) ≠ 0.

We now want to show that f is one-to-one on B(a, r). Suppose otherwise, i.e., for some x, y ∈ B(a, r), f(x) = f(y) but x ≠ y. We can now apply the Mean Value Theorem (see Remark 630) to f_i for any i ∈ {1, ..., n} on the segment L(x, y) ⊆ B(a, r). Therefore,

∀i ∈ {1, .., n}, ∃zⁱ ∈ L(x, y) such that 0 = f_i(y) − f_i(x) = Df_i(zⁱ)(y − x),

i.e.,

⎡ Df₁(z¹) ⎤
⎢ ...      ⎥
⎢ Df_i(zⁱ) ⎥ (y − x) = 0.
⎢ ...      ⎥
⎣ Dfₙ(zⁿ)  ⎦

Observe that ∀i, zⁱ ∈ B(a, r), and therefore the determinant of the above matrix is h((zⁱ)_{i=1}ⁿ) ≠ 0, and therefore y = x, a contradiction.
Remark 651 The above result is not a global result, i.e., it is false that if f is C¹ and its Jacobian has full rank everywhere in the domain, then f is one-to-one. Just take the function tan.

(Figure: graph of tan.)

The next result gives a global property.
Proposition 652 (2nd sufficient condition for openness of a function) Let an open set A ⊆ Rⁿ and a function f : A → Rⁿ be given. If
1. f is C¹,
2. ∀x ∈ A, det Df(x) ≠ 0,
then f is an open function.

Proof. Take an open set S ⊆ A. From Proposition 650, ∀x ∈ S there exists r_x ∈ R₊₊ such that f is one-to-one on B(x, r_x) ⊆ S. Then, from Proposition 648, f(B(x, r_x)) is open in Rⁿ. We can then write S = ∪_{x∈S} B(x, r_x) and

f(S) = f(∪_{x∈S} B(x, r_x)) = ∪_{x∈S} f(B(x, r_x)),

where the second equality follows from Proposition 521, and then f(S) is an open set.
17.3 The inverse function theorem

Proposition 650 shows that a C¹ function with full rank square Jacobian at a point a has a local inverse in an open neighborhood of a. The Inverse Function Theorem gives local differentiability properties of that local inverse function.

Lemma 653 If g is the inverse function of f : X → Y and A ⊆ X, then g_{|f(A)} is the inverse of f_{|A}; and if g is the inverse function of f : X → Y and B ⊆ Y, then g_{|B} is the inverse of f_{|g(B)}.

Proof. Exercise.

Proposition 654 Let an open set S ⊆ Rⁿ and a function f : S → Rⁿ be given. If
1. f is C¹, and
2. a ∈ S, det Df(a) ≠ 0,
then there exist two open sets X ⊆ S and Y ⊆ f(S) and a unique function g such that
1. a ∈ X and f(a) ∈ Y,
2. Y = f(X),
3. f is one-to-one on X,
4. g is the inverse of f_{|X},
5. g is C¹.

Proof. Since f is C¹, ∃r₁ ∈ R₊₊ such that ∀x ∈ B(a, r₁), det Df(x) ≠ 0. Then, from Proposition 650, f is one-to-one on B(a, r₁). Then take r₂ ∈ (0, r₁), and define B := B(a, r₂). Observe that Cl(B) = Cl(B(a, r₂)) ⊆ B(a, r₁). Using the fact that f is one-to-one on B(a, r₁) and therefore on B(a, r₂), we get that Assumption 3 in Proposition 647 is satisfied — while the other two are trivially satisfied. Then, ∃δ ∈ R₊₊ such that

f(B(a, r₂)) ⊇ B(f(a), δ) := Y.

Define also

X := f^{−1}(Y) ∩ B,  (17.6)

an open set because Y and B are open sets and f is continuous. Since f is one-to-one and continuous on the compact set Cl(B), from Proposition 529, there exists a unique continuous inverse function ĝ : f(Cl(B)) → Cl(B) of f_{|Cl(B)}. From the definition of Y,

Y ⊆ f(B) ⊆ f(Cl(B)).  (17.7)

From the definition of X,

f(X) = Y ∩ f(B) = Y.

Then, from Lemma 653,

g = ĝ_{|Y}

is the inverse of f_{|X}.

The above shows conclusions 1-4 of the Proposition. (About conclusion 1, observe that a ∈ f^{−1}(B(f(a), δ)) ∩ B(a, r₂) = X and f(a) ∈ f(X) = Y.)

We are then left with proving conclusion 5.
17.3. THE INVERSE FUNCTION THEOREM 195
Following what said in the proof of Proposition 650, we can dene
h : R
n
2
R, :

z
i

n
i=1
7det

Df
1
(z
1
)
...
Df
i

z
i

...
Df
n
(z
n
)

.
and get that, from Assumption 2,
h(a, ..., a, ..., a) = det Df (a) 6= 0,
and, see the proof of Proposition 650 for details,
r R
++
such that i {1, .., n} , z
i
B(a, r) , h

z
1
, ..., z
i
..., z
n

6= 0, (17.8)
and trivially also
r R
++
such that z B(a, r) , h(z, ..., z..., z) = det Df (z) 6= 0. (17.9)
Assuming, without loss of generality that we took r
1
< r, we have that
Cl (B) := Cl (B) (a, r
2
) B(a, r
1
) B(a, r) .
Then z
1
, ..., z
n
Cl (B) , h

z
1
, ..., z
i
..., z
n

6= 0. Writing g =

g
j

n
j=1
, we want to prove that
i {1, .., n}, g
i
is C
1
. We go through the following two steps: 1. y Y, i, k {1, .., n} ,
D
y
k
g
i
(y) exists, and 2.it is continuous.
Step 1.
We want to show that the following limit exists and it is nite:
lim
h0
g
i

y +he
k
n

g
i
(y)
h
.
Dene
x = (x
i
)
n
i=1
= g (y) X Cl (B)
x
0
= (x
0
i
)
n
i=1
= g

y +he
k
n

X Cl (B)
(17.10)
Then
f (x
0
) f (x) =

y +he
k
n

y = he
k
n
.
We can now apply the Mean Value Theorem to f
i
for i {1, .., n}: z
i
L(x, x
0
) Cl (B) ,
where the inclusion follows from the fact that x, x
0
Cl (B) a convex set, such that
i {1, .., n} ,
f
i
(x
0
) f
i
(x)
h
=
Df
i

z
i

(x
0
x)
h
and therefore

Df
1
(z
1
)
...
Df
i

z
i

...
Df
n
(z
n
)

1
h
(x
0
x) = e
k
n
.
Dene
A =

Df
1
(z
1
)
...
Df
i

z
i

...
Df
n
(z
n
)

Then, from (17.8), the above system admits a unique solution, i.e., using (17.10),
g

y +he
k
n

g (y)
h
=
1
h
(x
0
x)
196 CHAPTER 17. IMPLICIT FUNCTION THEOREM
and, using Cramer theorem (i.e., Theorem 329),
g

y +he
k
n

g (y)
h
=

z
1
, ..., z
n

h(z
1
, ..., z
n
)
where takes values which are determinants of a matrix involving entries of A. We are left with
showing that
lim
h0

z
1
, ..., z
n

h(z
1
, ..., z
n
)
exists and it is nite, i.e., the limit of the numerator exists and its nite and the limit of the
denominator exists is nite and nonzero.
Then, if h → 0, y + h e_k^n → y, and, g being continuous, x' → x; and, since z^i ∈ L(x, x'), z^i → x for any i. Then h(z^1, ..., z^n) → h(x, ..., x) ≠ 0, because, from (17.10), x ∈ Cl(B), and from (17.9). Moreover, Δ(z^1, ..., z^n) → Δ(x, ..., x).
Step 2.
Since

lim_{h→0} [g^i(y + h e_k^n) − g^i(y)] / h = Δ(x, ..., x) / h(x, ..., x),

and Δ and h are continuous functions, the desired result follows.
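The content of Conclusion 5 can be checked on a concrete map. The sketch below is only an illustration (the map f, the point a, and the use of Newton's method to evaluate the local inverse are our own choices, not part of the text): it compares a finite-difference Jacobian of the local inverse g with [Df(a)]^{-1}, which is what Dg(f(a)) must equal.

    import numpy as np

    # Illustration of Proposition 654: for a C^1 map with det Df(a) != 0,
    # the local inverse g satisfies Dg(f(a)) = [Df(a)]^(-1).
    def f(x):
        return np.array([np.exp(x[0]) * np.cos(x[1]),
                         np.exp(x[0]) * np.sin(x[1])])

    def Df(x):
        e = np.exp(x[0])
        return np.array([[e * np.cos(x[1]), -e * np.sin(x[1])],
                         [e * np.sin(x[1]),  e * np.cos(x[1])]])

    def g(y, x0, tol=1e-12):
        # local inverse of f near x0, computed by Newton's method
        x = x0.copy()
        for _ in range(50):
            step = np.linalg.solve(Df(x), f(x) - y)
            x -= step
            if np.linalg.norm(step) < tol:
                break
        return x

    a = np.array([0.3, 0.5])
    y0 = f(a)
    h = 1e-6
    # finite-difference Jacobian of g at y0, column by column
    Dg = np.column_stack([(g(y0 + h * e, a) - g(y0 - h * e, a)) / (2 * h)
                          for e in np.eye(2)])
    print(np.allclose(Dg, np.linalg.inv(Df(a)), atol=1e-6))  # True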
17.4 The implicit function theorem

Theorem 655 Given S, T open subsets of R^n and R^k respectively and a function

f : S × T → R^n, (x, t) ↦ f(x, t),

assume that
1. f is C^1,
and there exists (x_0, t_0) ∈ S × T such that
2. f(x_0, t_0) = 0,
3. [D_x f(x_0, t_0)]_{n×n} is invertible.
Then there exist N(x_0) ⊆ S open neighborhood of x_0, N(t_0) ⊆ T open neighborhood of t_0 and a unique function

g : N(t_0) → N(x_0)

such that
1. g is C^1,
2. {(x, t) ∈ N(x_0) × N(t_0) : f(x, t) = 0} = {(x, t) ∈ N(x_0) × N(t_0) : x = g(t)} := graph g.^4

Proof. See Apostol (1974).

Remark 656 Conclusion 2 above can be rewritten as

∀t ∈ N(t_0), f(g(t), t) = 0.   (17.11)

Computing the Jacobian of both sides of (17.11), using Remark 628, we get

∀t ∈ N(t_0), 0 = [D_x f(g(t), t)]_{n×n} [Dg(t)]_{n×k} + [D_t f(g(t), t)]_{n×k},   (17.12)

and, using Assumption 3 of the Implicit Function Theorem, we get

∀t ∈ N(t_0), [Dg(t)]_{n×k} = −[D_x f(g(t), t)]^{-1}_{n×n} [D_t f(g(t), t)]_{n×k}.

Observe that (17.12) can be rewritten as the following k systems of equations: ∀i ∈ {1, ..., k},

[D_x f(g(t), t)]_{n×n} [D_{t_i} g(t)]_{n×1} = −[D_{t_i} f(g(t), t)]_{n×1}.

^4 Then g(t_0) = x_0.
Exercise 657 ^5 Discuss the application of the Implicit Function Theorem to f : R^5 → R^2,

f(x_1, x_2, t_1, t_2, t_3) ↦ ( 2e^{x_1} + x_2 t_1 − 4 t_2 + 3 ,  x_2 cos x_1 − 6 x_1 + 2 t_1 − t_3 ),

at (x_0, t_0) = (0, 1, 3, 2, 7).
Let's check that each assumption of the Theorem is verified.
1. f(x_0, t_0) = 0. Obvious.
2. f is C^1.
We have to compute the Jacobian of the function and check that each entry is a continuous function. With columns ordered (x_1, x_2, t_1, t_2, t_3):

Df(x, t) =
[ 2e^{x_1}             t_1        x_2   −4    0 ]
[ −x_2 sin x_1 − 6     cos x_1    2      0   −1 ]

3. [D_x f(x_0, t_0)]_{n×n} is invertible.

[D_x f(x_0, t_0)] = [ 2e^{x_1}  t_1 ; −x_2 sin x_1 − 6  cos x_1 ]_{|(0,1,3,2,7)} = [ 2  3 ; −6  1 ],

whose determinant is 20.
Therefore, we can apply the Implicit Function Theorem and compute the Jacobian of g : N(t_0) ⊆ R^3 → N(x_0) ⊆ R^2:

Dg(t) = − [ 2e^{x_1}  t_1 ; −x_2 sin x_1 − 6  cos x_1 ]^{-1} [ x_2  −4  0 ; 2  0  −1 ] =

= 1 / (6t_1 + 2(cos x_1)e^{x_1} + t_1 x_2 sin x_1) ·
[ 2t_1 − x_2 cos x_1                          4 cos x_1           −t_1     ]
[ −x_2^2 sin x_1 − 6x_2 − 4e^{x_1}            4x_2 sin x_1 + 24    2e^{x_1} ]

Exercise 658 Given the utility function u : R^2_{++} → R_{++}, (x, y) ↦ u(x, y) satisfying the following properties
i. u is C^2; ii. ∀(x, y) ∈ R^2_{++}, Du(x, y) >> 0; iii. ∀(x, y) ∈ R^2_{++}, D_{xx} u(x, y) < 0, D_{yy} u(x, y) < 0, D_{xy} u(x, y) > 0,
compute the Marginal Rate of Substitution at (x_0, y_0) and say if the graph of each indifference curve is concave.

[Figure: an indifference curve in the (x, y) plane.]

^5 The example is taken from Rudin (1976), pages 227-228.
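The Jacobian computed in Exercise 657 can be cross-checked numerically. The sketch below is only an illustration (numpy is our choice of tool; the point values are those of the exercise): it evaluates the formula Dg(t_0) = −[D_x f(x_0, t_0)]^{-1} [D_t f(x_0, t_0)] of Remark 656 directly, and the result agrees with the closed form above (denominator 20 at the given point).

    import numpy as np

    # Numerical cross-check of Exercise 657 at (x0, t0) = (0, 1, 3, 2, 7):
    # Dg(t0) = -[D_x f(x0,t0)]^(-1) [D_t f(x0,t0)].
    x1, x2, t1, t2, t3 = 0.0, 1.0, 3.0, 2.0, 7.0

    Dxf = np.array([[2 * np.exp(x1),              t1],
                    [-x2 * np.sin(x1) - 6.0, np.cos(x1)]])
    Dtf = np.array([[x2, -4.0,  0.0],
                    [2.0, 0.0, -1.0]])

    Dg = -np.linalg.solve(Dxf, Dtf)   # solves Dxf @ Dg = -Dtf
    print(Dg)
    # [[ 0.25  0.2  -0.15]
    #  [-0.5   1.2   0.1 ]]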
17.5 Some geometrical remarks on the gradient

In what follows we make some geometrical, not rigorous, remarks on the meaning of the gradient, using the implicit function theorem. Consider an open subset X of R^2, a C^1 function

f : X → R, (x, y) ↦ f(x, y),

and a ∈ R. Assume that the set

L(a) := {(x, y) ∈ X : f(x, y) = a}

is such that ∀(x, y) ∈ X, ∂f(x, y)/∂y ≠ 0 and ∂f(x, y)/∂x ≠ 0. Then
1. L(a) is the graph of a C^1 function from a subset of R to R;
2. ∀(x*, y*) ∈ L(a), the line going through the origin and the point Df(x*, y*) is orthogonal to the line going through the origin and parallel to the tangent line to L(a) at (x*, y*); in other words, the line tangent to the curve L(a) at (x*, y*) is orthogonal to the line to which the gradient belongs;
3. ∀(x*, y*) ∈ L(a), the directional derivative of f at (x*, y*) in the direction u such that ||u|| = 1 is the largest one if u = Df(x*, y*) / ||Df(x*, y*)||.

1. It follows from the Implicit Function Theorem.
2. The slope of the line going through the origin and the vector Df(x*, y*) is

[∂f(x*, y*)/∂y] / [∂f(x*, y*)/∂x].   (17.13)

Again from the Implicit Function Theorem, the slope of the tangent line to L(a) at (x*, y*) is

− [∂f(x*, y*)/∂x] / [∂f(x*, y*)/∂y].   (17.14)

The product between the expressions in (17.13) and (17.14) is equal to −1.
3. The directional derivative of f at (x*, y*) in the direction u is

f'((x*, y*); u) = Df(x*, y*) · u = ||Df(x*, y*)|| ||u|| cos θ,

where θ is the angle between the two vectors. Then the above quantity is the greatest possible iff cos θ = 1, i.e., u is collinear with Df(x*, y*), i.e., u = Df(x*, y*) / ||Df(x*, y*)||.
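Both remarks lend themselves to a quick numerical check. The sketch below is an illustration only (the function f(x, y) = x^2 + 2y^2 and the point are arbitrary choices): it verifies that the gradient is orthogonal to the tangent direction of the level curve, and that among unit vectors the directional derivative is maximal in the direction of the normalized gradient.

    import numpy as np

    def f(p):
        return p[0]**2 + 2 * p[1]**2

    def Df(p):
        return np.array([2 * p[0], 4 * p[1]])

    p = np.array([1.0, 0.5])
    grad = Df(p)
    tangent = np.array([-grad[1], grad[0]])   # direction of the tangent line
    print(np.isclose(grad @ tangent, 0.0))    # True: orthogonality (point 2)

    # directional derivatives along many unit vectors: the maximum is
    # attained (up to grid resolution) at u = grad / ||grad|| (point 3)
    angles = np.linspace(0, 2 * np.pi, 400)
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])
    best = dirs[np.argmax(dirs @ grad)]
    print(np.allclose(best, grad / np.linalg.norm(grad), atol=2e-2))  # True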
17.6 Extremum problems with equality constraints

Given the open set X ⊆ R^n, consider the C^1 functions

f : X → R, f : x ↦ f(x),
g : X → R^m, g : x ↦ g(x) := (g^j(x))_{j=1}^m,

with m ≤ n. Consider also the following maximization problem:

(P)  max_{x∈X} f(x) subject to g(x) = 0.   (17.15)

The set

C := {x ∈ X : g(x) = 0}

is called the constraint set associated with problem (17.15).

Definition 659 The solution set to problem (17.15) is the set

{x* ∈ C : ∀x ∈ C, f(x*) ≥ f(x)},

and it is denoted by arg max (17.15).
The function

L : X × R^m → R, L : (x, λ) ↦ f(x) + λ^T g(x)

is called the Lagrange function associated with problem (17.15).

Theorem 660 Given the open set X ⊆ R^n and the C^1 functions

f : X → R, f : x ↦ f(x),  g : X → R^m, g : x ↦ g(x) := (g^j(x))_{j=1}^m,

assume that
1. f and g are C^1 functions,
2. x_0 is a solution to problem (17.15),^6 and
3. rank [Dg(x_0)]_{m×n} = m.
Then, there exists λ_0 ∈ R^m such that DL(x_0, λ_0) = 0, i.e.,

Df(x_0) + λ_0 Dg(x_0) = 0
g(x_0) = 0.
   (17.16)
Proof. Define x̂ := (x_i)_{i=1}^m ∈ R^m and t := (x_{m+k})_{k=1}^{n−m} ∈ R^{n−m}, and therefore x = (x̂, t). From Assumption 3, without loss of generality,

det [D_{x̂} g(x_0)]_{m×m} ≠ 0.   (17.17)

We want to show that there exists λ_0 ∈ R^m which is a solution to the system

[Df(x_0)]_{1×n} + λ_{1×m} [Dg(x_0)]_{m×n} = 0.   (17.18)

We can rewrite (17.18) as follows:

[ D_{x̂} f(x_0)_{1×m} | D_t f(x_0)_{1×(n−m)} ] + λ_{1×m} [ D_{x̂} g(x_0)_{m×m} | D_t g(x_0)_{m×(n−m)} ] = 0,

or

[D_{x̂} f(x_0)]_{1×m} + λ_{1×m} [D_{x̂} g(x_0)]_{m×m} = 0   (1)
[D_t f(x_0)]_{1×(n−m)} + λ_{1×m} [D_t g(x_0)]_{m×(n−m)} = 0   (2)
   (17.19)

From (17.17), there exists a unique solution λ_0 to subsystem (1) in (17.19). If n = m, we are done. Assume now that n > m. We now have to verify that λ_0 is a solution to subsystem (2) in (17.19), as well. To get the desired result, we are going to use the Implicit Function Theorem.
Summarizing, we have that

1. g is C^1, 2. g(x̂_0, t_0) = 0, 3. det [D_{x̂} g(x̂_0, t_0)]_{m×m} ≠ 0,

i.e., all the assumptions of the Implicit Function Theorem are verified. Then we can conclude that there exist N(x̂_0) ⊆ R^m open neighborhood of x̂_0, N(t_0) ⊆ R^{n−m} open neighborhood of t_0 and a unique function φ : N(t_0) → N(x̂_0) such that

1. φ is C^1, 2. φ(t_0) = x̂_0, 3. ∀t ∈ N(t_0), g(φ(t), t) = 0.   (17.20)

Define now

F : N(t_0) ⊆ R^{n−m} → R, t ↦ f(φ(t), t),

and

G : N(t_0) ⊆ R^{n−m} → R^m, t ↦ g(φ(t), t).

Then, from (17.20) and from Remark 656, we have that ∀t ∈ N(t_0),

0 = [DG(t)]_{m×(n−m)} = [D_{x̂} g(φ(t), t)]_{m×m} [Dφ(t)]_{m×(n−m)} + [D_t g(φ(t), t)]_{m×(n−m)}.   (17.21)

Since,^7 from (17.20), ∀t ∈ N(t_0), g(φ(t), t) = 0 and since

x_0 := (x̂_0, t_0) = (by (17.20)) = (φ(t_0), t_0)   (17.22)

is a solution to problem (17.15), we have that f(x_0) = F(t_0) ≥ F(t), i.e., briefly,

∀t ∈ N(t_0), F(t_0) ≥ F(t).

Then, from Proposition 640, DF(t_0) = 0. Then, from the definition of F and the Chain Rule, we have

[D_{x̂} f(φ(t_0), t_0)]_{1×m} [Dφ(t_0)]_{m×(n−m)} + [D_t f(φ(t_0), t_0)]_{1×(n−m)} = 0.   (17.23)

Premultiplying (17.21) by λ, we get

λ_{1×m} [D_{x̂} g(φ(t), t)]_{m×m} [Dφ(t)]_{m×(n−m)} + λ_{1×m} [D_t g(φ(t), t)]_{m×(n−m)} = 0.   (17.24)

Adding up (17.23) and (17.24), computed at t = t_0, we get

([D_{x̂} f(φ(t_0), t_0)] + λ [D_{x̂} g(φ(t_0), t_0)]) [Dφ(t_0)] + [D_t f(φ(t_0), t_0)] + λ [D_t g(φ(t_0), t_0)] = 0,

and from (17.22),

([D_{x̂} f(x_0)] + λ [D_{x̂} g(x_0)]) [Dφ(t_0)] + [D_t f(x_0)] + λ [D_t g(x_0)] = 0.   (17.25)

Then, from the definition of λ_0 as the unique solution to (1) in (17.19), we have that [D_{x̂} f(x_0)] + λ_0 [D_{x̂} g(x_0)] = 0, and then from (17.25) computed at λ = λ_0, we have

[D_t f(x_0)] + λ_0 [D_t g(x_0)] = 0,

i.e., (2) in (17.19), the desired result.

^6 The result does apply in the case in which x_0 is a local maximum for Problem (17.15). Obviously the result applies to the case of (local) minima, as well.
^7 The only place where the proof has to be slightly changed to get the result for local maxima is here.
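Condition (17.16) can be solved symbolically in simple cases. Below is a minimal sketch (the objective and constraint are arbitrary choices, and sympy is assumed to be available): it solves the system DL(x, λ) = 0 for max xy subject to 2 − x − y = 0.

    import sympy as sp

    # A minimal illustration of Theorem 660: max f(x,y) = x*y subject to
    # g(x,y) = 2 - x - y = 0.  The first order conditions Df + lam*Dg = 0
    # together with g = 0 pick out the candidate (1, 1).
    x, y, lam = sp.symbols('x y lam', real=True)
    f = x * y
    g = 2 - x - y
    L = f + lam * g
    sol = sp.solve([sp.diff(L, x), sp.diff(L, y), g], [x, y, lam], dict=True)
    print(sol)  # [{x: 1, y: 1, lam: 1}]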
Part IV
Nonlinear programming
Chapter 18
Concavity
Consider^1 a set X ⊆ R^n, and the functions f : X → R, g : X → R^m, h : X → R^l. The goal of this Chapter is to study the problem

max_{x∈X} f(x) s.t. g(x) ≥ 0 and h(x) = 0,

under suitable assumptions. The role of concavity (and differentiability) of the functions f and g is crucial.
18.1 Convex sets

Definition 661 A set C ⊆ R^n is convex if ∀x^1, x^2 ∈ C and ∀λ ∈ [0, 1], (1 − λ)x^1 + λx^2 ∈ C.

Definition 662 A set C ⊆ R^n is strictly convex if ∀x^1, x^2 ∈ C such that x^1 ≠ x^2, and ∀λ ∈ (0, 1), (1 − λ)x^1 + λx^2 ∈ Int C.

Remark 663 If C is strictly convex, then C is convex, but not vice-versa.

Proposition 664 The intersection of an arbitrary family of convex sets is convex.

Proof. We want to show that given a family {C_i}_{i∈I} of convex sets, if x, y ∈ C := ∩_{i∈I} C_i, then ∀λ ∈ [0, 1], (1 − λ)x + λy ∈ C. x, y ∈ C implies that x, y ∈ C_i, ∀i ∈ I. Since C_i is convex, ∀i ∈ I, ∀λ ∈ [0, 1], (1 − λ)x + λy ∈ C_i, and therefore ∀λ ∈ [0, 1], (1 − λ)x + λy ∈ C.

Exercise 665 ∀n ∈ N, if ∀i ∈ {1, ..., n}, I_i is an interval in R, then ∏_{i=1}^n I_i is a convex set.
18.2 Different Kinds of Concave Functions

Maintained Assumptions in this Chapter. Unless otherwise stated, X is an open and convex subset of R^n. f is a function such that

f : X → R, x ↦ f(x).

For each type of concavity we study, we present
1. the definition in the case in which f is C^0 (i.e., continuous),
2. an attempt at a partial characterization of that definition in the case in which f is C^1 and C^2; by partial characterization, we mean a statement which is either sufficient or necessary for the concept presented in the case of continuous f;
3. the relationship between the different partial characterizations;
4. the relationship between the type of concavity and critical points and local or global extrema of f.
Finally, we study the relationship between different kinds of concavity.
The following pictures are taken from David Cass's Microeconomics course I followed at the University of Pennsylvania (in 1985) and summarize points 1, 2 and 3 above.

^1 This part is based on Cass (1991).
18.2.1 Concave Functions.

Definition 666 Consider a C^0 function f. f is concave iff ∀x', x'' ∈ X, ∀λ ∈ [0, 1],

f((1 − λ)x' + λx'') ≥ (1 − λ)f(x') + λf(x'').

Proposition 667 Consider a C^0 function f.

f is concave ⇔ M := {(x, y) ∈ X × R : y ≤ f(x)} is convex.

Proof.
[⇒]
Take (x', y'), (x'', y'') ∈ M. We want to show that

∀λ ∈ [0, 1], ((1 − λ)x' + λx'', (1 − λ)y' + λy'') ∈ M.

But, from the definition of M, we get that

(1 − λ)y' + λy'' ≤ (1 − λ)f(x') + λf(x'') ≤ f((1 − λ)x' + λx'').

[⇐]
From the definition of M, ∀x', x'' ∈ X, (x', f(x')) ∈ M and (x'', f(x'')) ∈ M.
Since M is convex,

((1 − λ)x' + λx'', (1 − λ)f(x') + λf(x'')) ∈ M,

and from the definition of M,

(1 − λ)f(x') + λf(x'') ≤ f((1 − λ)x' + λx''),

as desired.
Proposition 668 (Some properties of concave functions)
1. If f, g : X → R are concave functions and a, b ∈ R_+, then the function af + bg : X → R, af + bg : x ↦ af(x) + bg(x) is a concave function.
2. If f : X → R is a concave function and F : A → R, with A ⊇ Im f, is nondecreasing and concave, then F ∘ f is a concave function.

Proof.
1. This result follows by a direct application of the definition.
2. Let x', x'' ∈ X and λ ∈ [0, 1]. Then

(F ∘ f)((1 − λ)x' + λx'') ≥(1) F((1 − λ)f(x') + λf(x'')) ≥(2) (1 − λ)(F ∘ f)(x') + λ(F ∘ f)(x''),

where (1) comes from the fact that f is concave and F is nondecreasing, and
(2) comes from the fact that F is concave.

Remark 669 (from Sydsaeter (1981)) With the notation of part 2 of the above Proposition, the assumption that F is concave cannot be dropped, as the following example shows. Take f, F : R_{++} → R_{++}, f(x) = √x and F(y) = y^3. Then f is concave and F is strictly increasing, but (F ∘ f)(x) = x^{3/2} and its second derivative is (3/4) x^{−1/2} > 0. Then, from Calculus I, we know that F ∘ f is strictly convex and therefore it is not concave.
Of course, the monotonicity assumption cannot be dispensed with either. Consider f(x) = −x^2 and F(y) = −y. Then (F ∘ f)(x) = x^2, which is not concave.
Proposition 670 Consider a differentiable function f.

f is concave ⇔ ∀x', x'' ∈ X, f(x'') − f(x') ≤ Df(x')(x'' − x').

Proof.
[⇒]
From the definition of concavity, we have that for λ ∈ (0, 1),

(1 − λ)f(x') + λf(x'') ≤ f(x' + λ(x'' − x'))
λ(f(x'') − f(x')) ≤ f(x' + λ(x'' − x')) − f(x')
f(x'') − f(x') ≤ [f(x' + λ(x'' − x')) − f(x')] / λ.

Taking limits of both sides of the last inequality for λ → 0, we get the desired result.
[⇐]
Consider x', x'' ∈ X and λ ∈ (0, 1). For λ ∈ {0, 1}, the desired result is clearly true. Since X is convex, x_λ := (1 − λ)x' + λx'' ∈ X. By assumption,

f(x'') − f(x_λ) ≤ Df(x_λ)(x'' − x_λ) and
f(x') − f(x_λ) ≤ Df(x_λ)(x' − x_λ).

Multiplying the first expression by λ, the second one by (1 − λ) and summing up, we get

λ(f(x'') − f(x_λ)) + (1 − λ)(f(x') − f(x_λ)) ≤ Df(x_λ)(λ(x'' − x_λ) + (1 − λ)(x' − x_λ)).

Since

λ(x'' − x_λ) + (1 − λ)(x' − x_λ) = x_λ − x_λ = 0,

we get

λf(x'') + (1 − λ)f(x') ≤ f(x_λ),

i.e., the desired result.
Definition 671 Given a symmetric matrix A_{n×n}, A is negative semidefinite if ∀x ∈ R^n, x^T A x ≤ 0. A is negative definite if ∀x ∈ R^n \ {0}, x^T A x < 0.

Proposition 672 Consider a C^2 function f.

f is concave ⇔ ∀x ∈ X, D^2 f(x) is negative semidefinite.
Proof.
[⇒]
We want to show that ∀u ∈ R^n, ∀x_0 ∈ X, it is the case that u^T D^2 f(x_0) u ≤ 0. Since X is open, ∀x_0 ∈ X, ∃a ∈ R_{++} such that |h| < a ⇒ (x_0 + hu) ∈ X. Taking I := (−a, a) ⊆ R, define

g : I → R, g : h ↦ f(x_0 + hu) − f(x_0) − Df(x_0) hu.

Observe that

g'(h) = D_x f(x_0 + hu) · u − Df(x_0) · u

and

g''(h) = u^T D^2 f(x_0 + hu) u.

Since f is a concave function, from Proposition 670 we have that ∀h ∈ I, g(h) ≤ 0. Since g(0) = 0, h = 0 is a maximum point. Then, g'(0) = 0 and

g''(0) ≤ 0.   (1)

Moreover, ∀h ∈ I, g'(h) = Df(x_0 + hu)u − Df(x_0)u and g''(h) = u^T D^2 f(x_0 + hu)u. Then,

g''(0) = u^T D^2 f(x_0) u.   (2)

(1) and (2) give the desired result.
[⇐]
Consider x, x_0 ∈ X. From Taylor's Theorem (see Proposition 646), we get

f(x) = f(x_0) + Df(x_0)(x − x_0) + (1/2)(x − x_0)^T D^2 f(x̄)(x − x_0),

where x̄ = x_0 + θ(x − x_0) for some θ ∈ (0, 1). Since, by assumption, (x − x_0)^T D^2 f(x̄)(x − x_0) ≤ 0, we have that

f(x) ≤ f(x_0) + Df(x_0)(x − x_0),

the desired result.
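The criterion of Proposition 672 is easy to apply numerically: a symmetric matrix is negative semidefinite iff all its eigenvalues are ≤ 0. A minimal sketch (the quadratic example f(x_1, x_2) = −x_1^2 + x_1 x_2 − x_2^2 is an arbitrary choice; its Hessian is constant):

    import numpy as np

    # Checking Proposition 672 numerically: the Hessian of
    # f(x1,x2) = -x1^2 + x1*x2 - x2^2 is the constant matrix below,
    # and it is negative semidefinite iff all eigenvalues are <= 0.
    H = np.array([[-2.0,  1.0],
                  [ 1.0, -2.0]])      # D^2 f(x), the same at every x
    eigvals = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric matrix
    print(eigvals)                    # [-3. -1.]
    print(np.all(eigvals <= 0))       # True: f is concave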
Some Properties.

Proposition 673 Consider a concave function f. If x_0 is a local maximum point, then it is a global maximum point.

Proof.
By definition of local maximum point, we know that ∃δ > 0 such that ∀x ∈ B(x_0, δ), f(x_0) ≥ f(x). Take y ∈ X; we want to show that f(x_0) ≥ f(y).
Since X is convex,

∀λ ∈ [0, 1], (1 − λ)x_0 + λy ∈ X.

Take λ' > 0 and sufficiently small to have (1 − λ')x_0 + λ'y ∈ B(x_0, δ). To find such λ', just solve the inequality ||(1 − λ')x_0 + λ'y − x_0|| = λ'||y − x_0|| < δ, where, without loss of generality, y ≠ x_0.
Then,

f(x_0) ≥ f((1 − λ')x_0 + λ'y) ≥ (f concave) ≥ (1 − λ')f(x_0) + λ'f(y),

or λ'f(x_0) ≥ λ'f(y). Dividing both sides of the inequality by λ' > 0, we get f(x_0) ≥ f(y).

Proposition 674 Consider a differentiable and concave function f. If Df(x_0) = 0, then x_0 is a global maximum point.

Proof.
From Proposition 670, if Df(x_0) = 0, we get that ∀x ∈ X, f(x_0) ≥ f(x), the desired result.
18.2.2 Strictly Concave Functions.

Definition 675 Consider a C^0 function f. f is strictly concave iff ∀x', x'' ∈ X such that x' ≠ x'', ∀λ ∈ (0, 1),

f((1 − λ)x' + λx'') > (1 − λ)f(x') + λf(x'').

Proposition 676 Consider a C^1 function f.

f is strictly concave ⇔ ∀x', x'' ∈ X such that x' ≠ x'',

f(x'') − f(x') < Df(x')(x'' − x').

Proof.
[⇒]
Since strict concavity implies concavity, it is the case that

∀x', x'' ∈ X, f(x'') − f(x') ≤ Df(x')(x'' − x').   (18.1)

By contradiction, suppose the desired conclusion is false. Then, from (18.1), we have that

∃x', x'' ∈ X, x' ≠ x'', such that f(x'') = f(x') + Df(x')(x'' − x').   (18.2)

From the definition of strict concavity and (18.2), for λ ∈ (0, 1),

f((1 − λ)x' + λx'') > (1 − λ)f(x') + λ[f(x') + Df(x')(x'' − x')],

or

f((1 − λ)x' + λx'') > f(x') + λ Df(x')(x'' − x').   (18.3)

Applying (18.1) to the points x(λ) := (1 − λ)x' + λx'' and x', we get that for λ ∈ (0, 1),

f((1 − λ)x' + λx'') ≤ f(x') + Df(x')((1 − λ)x' + λx'' − x'),

or

f((1 − λ)x' + λx'') ≤ f(x') + λ Df(x')(x'' − x').   (18.4)

And (18.4) contradicts (18.3).
[⇐] The proof is very similar to that one in Proposition 667.

Proposition 677 Consider a C^2 function f. If

∀x ∈ X, D^2 f(x) is negative definite,

then f is strictly concave.

Proof.
The proof is similar to that of Proposition 672.

Remark 678 In the above Proposition, the opposite implication does not hold. The standard counterexample is f : R → R, f : x ↦ −x^4.
Some Properties.

Proposition 679 Consider a strictly concave, C^0 function f. If x_0 is a local maximum point, then it is a strict global maximum point, i.e., the unique global maximum point.

Proof.
First, we show that a. it is a global maximum point, and then b. the desired result.
a. It follows from the fact that strict concavity is stronger than concavity and from Proposition 673.
b. Suppose otherwise, i.e., ∃x_0, x'_0 ∈ X such that x_0 ≠ x'_0 and both of them are global maximum points. Then, ∀λ ∈ (0, 1), (1 − λ)x_0 + λx'_0 ∈ X, since X is convex, and

f((1 − λ)x_0 + λx'_0) > (1 − λ)f(x_0) + λf(x'_0) = f(x_0) = f(x'_0),

a contradiction.

Proposition 680 Consider a strictly concave, differentiable function f. If Df(x_0) = 0, then x_0 is a strict global maximum point.

Proof.
Take an arbitrary x ∈ X such that x ≠ x_0. Then from Proposition 676, we have that f(x) < f(x_0) + Df(x_0)(x − x_0) = f(x_0), the desired result.
18.2.3 Quasi-Concave Functions.

Definitions.

Definition 681 Consider a C^0 function f. f is quasi-concave iff ∀x', x'' ∈ X, ∀λ ∈ [0, 1],

f((1 − λ)x' + λx'') ≥ min{f(x'), f(x'')}.

Proposition 682 If f : X → R is a quasi-concave function and F : R → R is nondecreasing, then F ∘ f is a quasi-concave function.

Proof.
Without loss of generality, assume

f(x'') ≥ f(x').   (1)

Then, since f is quasi-concave, we have

f((1 − λ)x' + λx'') ≥ f(x').   (2)

Then,

F(f((1 − λ)x' + λx'')) ≥(a) F(f(x')) =(b) min{F(f(x')), F(f(x''))},

where (a) comes from (2) and the fact that F is nondecreasing, and
(b) comes from (1) and the fact that F is nondecreasing.

Proposition 683 Consider a C^0 function f. f is quasi-concave ⇔ ∀α ∈ R, B(α) := {x ∈ X : f(x) ≥ α} is convex.

Proof.
[⇒] [Strategy: write what you want to show.]
We want to show that ∀α ∈ R and ∀λ ∈ [0, 1], we have that

⟨x', x'' ∈ B(α)⟩ ⇒ ⟨(1 − λ)x' + λx'' ∈ B(α)⟩,

i.e.,

⟨f(x') ≥ α and f(x'') ≥ α⟩ ⇒ ⟨f((1 − λ)x' + λx'') ≥ α⟩.

But by assumption,

f((1 − λ)x' + λx'') ≥ min{f(x'), f(x'')} ≥ (by definition of x', x'') ≥ α.

[⇐]
Consider arbitrary x', x'' ∈ X. Define α := min{f(x'), f(x'')}. Then x', x'' ∈ B(α). By assumption, ∀λ ∈ [0, 1], (1 − λ)x' + λx'' ∈ B(α), i.e.,

f((1 − λ)x' + λx'') ≥ α := min{f(x'), f(x'')}.
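The equivalence in Proposition 683 suggests a cheap Monte Carlo test of quasi-concavity via Definition 681. The sketch below is an illustration only (f(x, y) = xy on R^2_{++} is a standard quasi-concave but not concave example, and the sampling scheme is an arbitrary choice):

    import numpy as np

    # Monte Carlo check of Definition 681 for f(x,y) = x*y on R^2_{++}:
    # along every sampled segment, f at the convex combination is at
    # least the minimum of f at the endpoints.
    rng = np.random.default_rng(0)

    def f(p):
        return p[0] * p[1]

    ok = True
    for _ in range(10000):
        a, b = rng.uniform(0.1, 10.0, size=(2, 2))   # two points in R^2_{++}
        lam = rng.uniform()
        ok &= f((1 - lam) * a + lam * b) >= min(f(a), f(b)) - 1e-12
    print(ok)  # True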
Proposition 684 Consider a differentiable function f. f is quasi-concave ⇔ ∀x', x'' ∈ X,

f(x'') − f(x') ≥ 0 ⇒ Df(x')(x'' − x') ≥ 0.

Proof.
[⇒] [Strategy: use the definition of directional derivative.]
Take x', x'' such that f(x'') ≥ f(x'). By assumption,

f((1 − λ)x' + λx'') ≥ min{f(x'), f(x'')} = f(x')

and

f((1 − λ)x' + λx'') − f(x') ≥ 0.

Dividing both sides of the above inequality by λ > 0, and taking limits for λ → 0^+, we get

lim_{λ→0^+} [f(x' + λ(x'' − x')) − f(x')] / λ = Df(x')(x'' − x') ≥ 0.

[⇐]
Without loss of generality, take

f(x') = min{f(x'), f(x'')}.   (1)

Define

φ : [0, 1] → R, φ : λ ↦ f((1 − λ)x' + λx'').

We want to show that

∀λ ∈ [0, 1], φ(λ) ≥ φ(0).

Suppose otherwise, i.e., ∃λ* ∈ [0, 1] such that φ(λ*) < φ(0). Observe that in fact it cannot be λ* ∈ {0, 1}: if λ* = 0, we would have φ(0) < φ(0), and if λ* = 1, we would have φ(1) < φ(0), i.e., f(x'') < f(x'), contradicting (1). Then, we have that

∃λ* ∈ (0, 1) such that φ(λ*) < φ(0).   (2)

Observe that from (1), we also have that

φ(1) ≥ φ(0).   (3)

Therefore, see Lemma 685, ∃λ** > λ* such that

φ'(λ**) > 0   (4),  and  φ(λ**) < φ(0)   (5).

From (4), and using the definition of φ', and the Chain Rule,^2 we get

0 < φ'(λ**) = [Df((1 − λ**)x' + λ**x'')](x'' − x').   (6)

Define x** := (1 − λ**)x' + λ**x''. From (5), and the assumption, we get that

f(x**) < f(x').

Therefore, by assumption,

0 ≤ Df(x**)(x' − x**) = Df(x**)(−λ**)(x'' − x'),

i.e.,

[Df(x**)](x'' − x') ≤ 0.   (7)

But (7) contradicts (6).
Lemma 685 Consider a function g : [a, b] → R with the following properties:
1. g is differentiable on (a, b);
2. there exists c ∈ (a, b) such that g(b) ≥ g(a) > g(c).
Then, ∃t ∈ (c, b) such that g'(t) > 0 and g(t) < g(a).

^2 Defining v : [0, 1] → X ⊆ R^n, λ ↦ (1 − λ)x' + λx'', we have that φ = f ∘ v. Therefore, φ'(λ**) = Df(v(λ**)) · Dv(λ**).

Proof.
Without loss of generality and to simplify notation, assume g(a) = 0. Define A := {x ∈ [c, b] : g(x) = 0}. Observe that A = [c, b] ∩ g^{-1}(0) is closed; and it is non-empty, because g is continuous and by assumption g(c) < 0 and g(b) ≥ 0.
Therefore, A is compact, and we can define β := min A.
Claim. x ∈ [c, β) ⇒ g(x) < 0.
Suppose not, i.e., ∃y ∈ (c, β) such that g(y) ≥ 0. If g(y) = 0, β could not be min A. If g(y) > 0, since g(c) < 0 and g is continuous, there exists x' ∈ (c, y) ⊆ (c, β) with g(x') = 0, again contradicting the definition of β. End of the proof of the Claim.
Finally, applying Lagrange's Theorem to g on [c, β], we have that ∃t ∈ (c, β) such that

g'(t) = [g(β) − g(c)] / (β − c).

Since g(β) = 0 and g(c) < 0, we have that g'(t) > 0. From the above Claim, the desired result then follows.
Proposition 686 Consider a C^2 function f. If f is quasi-concave, then

∀x ∈ X, ∀Δ ∈ R^n such that Df(x)Δ = 0, Δ^T D^2 f(x)Δ ≤ 0.

Proof. (For another proof, see Laura's file.)
Suppose otherwise, i.e., ∃x_0 ∈ X and Δ ∈ R^n such that Df(x_0)Δ = 0 and Δ^T D^2 f(x_0)Δ > 0.
Since the function h : X → R, h : x ↦ Δ^T D^2 f(x)Δ is continuous and X is open, ∃δ > 0 such that if ||x − x_0|| < δ, then ∀θ ∈ [0, 1],

Δ^T D^2 f(θx + (1 − θ)x_0)Δ > 0.   (1)

Define x := x_0 + εΔ/||Δ||, with 0 < ε < δ. Then

||x − x_0|| = ||εΔ/||Δ|| || = ε < δ,

and x satisfies (1). Observe that

Δ = (||Δ||/ε)(x − x_0).

Then, we can rewrite (1) as

(x − x_0)^T D^2 f(θx + (1 − θ)x_0)(x − x_0) > 0.

From Taylor's Theorem, ∃θ ∈ (0, 1) such that

f(x) = f(x_0) + (x − x_0)^T [Df(x_0)]^T + (1/2)(x − x_0)^T D^2 f(θx + (1 − θ)x_0)(x − x_0).

Since Df(x_0)(x − x_0) = (ε/||Δ||) Df(x_0)Δ = 0 and from (1), we have

f(x) > f(x_0).   (2)

Letting x̃ := x_0 − ε(Δ/||Δ||), using the same procedure as above, we can conclude that

f(x̃) > f(x_0).   (3)

But, since x_0 = (1/2)(x + x̃), (2) and (3) contradict the definition of quasi-concavity.

Remark 687 In the above Proposition, the opposite implication does not hold. Consider f : R → R, f : x ↦ x^4.
From Proposition 683, this function is clearly not quasi-concave. Take α > 0. Then B(α) = {x ∈ R : x^4 ≥ α} = (−∞, −α^{1/4}] ∪ [α^{1/4}, +∞), which is not convex.
On the other hand, observe the following. f'(x) = 4x^3, and 4x^3 Δ = 0 iff either x = 0 or Δ = 0. In both cases Δ^T D^2 f(x)Δ = 12x^2 Δ^2 = 0 ≤ 0. (This example is taken from Avriel M. and others (1988), page 91.)
Some Properties.

Remark 688 Consider a quasi-concave function f. It is NOT the case that if x_0 is a local maximum point, then it is a global maximum point. To see that, consider the following function:

f : R → R, f : x ↦ −x^2 + 1 if x < 1, and f : x ↦ 0 if x ≥ 1.

[Figure: graph of f in the (x, y) plane.]

Proposition 689 Consider a C^0 quasi-concave function f. If x_0 is a strict local maximum point, then it is a strict global maximum point.

Proof.
By assumption, ∃δ > 0 such that if x ∈ B(x_0, δ) ∩ X and x_0 ≠ x, then f(x_0) > f(x).
Suppose the conclusion of the Proposition is false; then ∃x' ∈ X such that f(x') ≥ f(x_0). Since f is quasi-concave,

∀λ ∈ [0, 1], f((1 − λ)x_0 + λx') ≥ f(x_0).   (1)

For sufficiently small λ, (1 − λ)x_0 + λx' ∈ B(x_0, δ) and (1) above holds, contradicting the fact that x_0 is a strict local maximum point.

Proposition 690 Consider f : (a, b) → R. f monotone ⇒ f quasi-concave.

Proof.
Without loss of generality, take x'' ≥ x'.
Case 1. f is increasing. Then f(x'') ≥ f(x'). If λ ∈ [0, 1], then (1 − λ)x' + λx'' = x' + λ(x'' − x') ≥ x' and therefore f((1 − λ)x' + λx'') ≥ f(x').
Case 2. f is decreasing. Then f(x'') ≤ f(x'). If λ ∈ [0, 1], then (1 − λ)x' + λx'' = (1 − λ)x' − (1 − λ)x'' + x'' = x'' − (1 − λ)(x'' − x') ≤ x'' and therefore f((1 − λ)x' + λx'') ≥ f(x'').
Remark 691 The following statement is false: if f_1 and f_2 are quasi-concave and a, b ∈ R_+, then af_1 + bf_2 is quasi-concave.
It is enough to consider f_1, f_2 : R → R, f_1(x) = x^3 + x, and f_2(x) = −4x. Since f'_1 > 0, then f_1 and, of course, f_2 are monotone and then, from Proposition 690, they are quasi-concave. On the other hand, g(x) = f_1(x) + f_2(x) = x^3 − 3x has a strict local maximum at x = −1 which is not a strict global maximum, and therefore, from Proposition 689, g is not quasi-concave.

[Figure: graph of x^3 − 3x in the (x, y) plane.]

Remark 692 Consider a differentiable quasi-concave function f. It is NOT the case that if Df(x_0) = 0, then x_0 is a global maximum point.
Just consider f : R → R, f : x ↦ x^3 and x_0 = 0, and use Proposition 690.
18.2.4 Strictly Quasi-Concave Functions.

Definitions.

Definition 693 Consider a C^0 function f. f is strictly quasi-concave iff ∀x', x'' ∈ X such that x' ≠ x'', and ∀λ ∈ (0, 1), we have that

f((1 − λ)x' + λx'') > min{f(x'), f(x'')}.

Proposition 694 Consider a C^0 function f. f is strictly quasi-concave ⇒ ∀α ∈ R, B(α) := {x ∈ X : f(x) ≥ α} is strictly convex.

Proof.
Taking an arbitrary α and x', x'' ∈ B(α), with x' ≠ x'', we want to show that ∀λ ∈ (0, 1), we have that

x_λ := (1 − λ)x' + λx'' ∈ Int B(α).

Since f is strictly quasi-concave,

f(x_λ) > min{f(x'), f(x'')} ≥ α.

Since f is C^0, there exists δ > 0 such that ∀x ∈ B(x_λ, δ), f(x) > α, i.e., B(x_λ, δ) ⊆ B(α), as desired. (Of course, we are using the fact that {x ∈ X : f(x) > α} ⊆ B(α).)

Remark 695 Observe that in Proposition 694, the opposite implication does not hold true: just consider f : R → R, f : x ↦ 1.
Observe that ∀α ≤ 1, B(α) = R, and ∀α > 1, B(α) = ∅. On the other hand, f is not strictly quasi-concave.
Definition 696 Consider a differentiable function f. f is differentiable-strictly-quasi-concave iff ∀x', x'' ∈ X such that x' ≠ x'', we have that

f(x'') − f(x') ≥ 0 ⇒ Df(x')(x'' − x') > 0.

Proposition 697 Consider a differentiable function f.
If f is differentiable-strictly-quasi-concave, then f is strictly quasi-concave.

Proof.
The proof is analogous to the case of quasi-concave functions.

Remark 698 Given a differentiable function, it is not the case that strict quasi-concavity implies differentiable-strict-quasi-concavity.
f : R → R, f : x ↦ x^3: a. it is differentiable and strictly quasi-concave, and b. it is not differentiable-strictly-quasi-concave.
a. f is strictly increasing and therefore strictly quasi-concave — see the Fact below.
b. Take x' = 0 and x'' = 1. Then f(1) = 1 > 0 = f(0). But Df(x')(x'' − x') = 0 · 1 = 0, which is not > 0.

Remark 699 If we restrict the class of differentiable functions to those with non-zero gradients everywhere in the domain, then differentiable-strict-quasi-concavity and strict quasi-concavity are equivalent (see Balasko (1988), Math. 7.2).

Fact. Consider f : (a, b) → R. f strictly monotone ⇒ f strictly quasi-concave.

Proof.
By assumption, x' ≠ x'', say x' < x'', implies that f(x') < f(x'') (or f(x') > f(x'')). If λ ∈ (0, 1), then x' < (1 − λ)x' + λx'' < x'' and therefore f((1 − λ)x' + λx'') > min{f(x'), f(x'')}.
Proposition 700 Consider a C^2 function f. If

∀x ∈ X, ∀Δ ∈ R^n \ {0}, we have that (Df(x)Δ = 0 ⇒ Δ^T D^2 f(x)Δ < 0),

then f is differentiable-strictly-quasi-concave.

Proof.
Suppose otherwise, i.e., there exist x', x'' ∈ X such that

x' ≠ x'', f(x'') ≥ f(x') and Df(x')(x'' − x') ≤ 0.

Since X is an open set, ∃a ∈ R_{++} such that the following function is well defined:

g : [−a, 1] → R, g : h ↦ f((1 − h)x' + hx'').

Since g is continuous, there exists h_m ∈ [0, 1] which is a global minimum of g on [0, 1]. We now proceed as follows. Step 1: h_m ∉ {0, 1}. Step 2: h_m is a strict local maximum point, a contradiction.
Preliminarily, observe that

g'(h) = Df(x' + h(x'' − x'))(x'' − x')

and

g''(h) = (x'' − x')^T D^2 f(x' + h(x'' − x'))(x'' − x').

Step 1. If Df(x')(x'' − x') = 0, then, by assumption,

g''(0) = (x'' − x')^T D^2 f(x')(x'' − x') < 0.

Therefore, zero is a strict local maximum point of g (see, for example, Theorem 13.10, page 378, in Apostol (1974)). Therefore, there exists h* ∈ (0, 1) such that g(h*) = f(x' + h*(x'' − x')) < f(x') = g(0).
If

g'(0) = Df(x')(x'' − x') < 0,

then there exists h* ∈ (0, 1) such that

g(h*) = f(x' + h*(x'' − x')) < f(x') = g(0).

Moreover, g(1) = f(x'') ≥ f(x'). In conclusion, neither zero nor one can be global minimum points for g on [0, 1].
Step 2. Since the global minimum point h_m ∈ (0, 1), we have that

0 = g'(h_m) = Df(x' + h_m(x'' − x'))(x'' − x').

Then, by assumption,

g''(h_m) = (x'' − x')^T D^2 f(x' + h_m(x'' − x'))(x'' − x') < 0,

but then h_m is a strict local maximum point, a contradiction.

Remark 701 Differentiable-strict-quasi-concavity does not imply the condition presented in Proposition 700. f : R → R, f : x ↦ −x^4 is differentiable-strictly-quasi-concave (in the next section we will show that strict concavity implies differentiable-strict-quasi-concavity). On the other hand, take x* = 0. Then Df(x*) = 0. Therefore, for any Δ ∈ R \ {0}, we have Df(x*)Δ = 0, but Δ^T D^2 f(x*)Δ = 0, which is not < 0.
Some Properties.

Proposition 702 Consider a differentiable-strictly-quasi-concave function f.

x* is a strict global maximum point ⇔ Df(x*) = 0.

Proof.
[⇒] Obvious.
[⇐] From the contrapositive of the definition of differentiable-strictly-quasi-concave function, we have:
∀x*, x'' ∈ X such that x* ≠ x'', it is the case that Df(x*)(x'' − x*) ≤ 0 ⇒ f(x'') − f(x*) < 0, or f(x*) > f(x''). Since Df(x*) = 0, the desired result follows.

Remark 703 Obviously, we also have that if f is differentiable-strictly-quasi-concave, it is the case that:

x* local maximum point ⇒ x* is a strict global maximum point.

Remark 704 The above implication is true also for continuous strictly quasi-concave functions. (Suppose otherwise, i.e., ∃x' ∈ X such that f(x') ≥ f(x*). Since f is strictly quasi-concave, ∀λ ∈ (0, 1), f((1 − λ)x* + λx') > f(x*), which for λ sufficiently small contradicts the fact that x* is a local maximum point.)

Is there a definition of ?-concavity weaker than concavity and such that:
if f is a ?-concave function, then

x* is a global maximum point iff Df(x*) = 0?

The answer is given in the next section.
18.2.5 Pseudo-Concave Functions.

Definition 705 Consider a differentiable function f. f is pseudo-concave iff

∀x', x'' ∈ X, f(x'') > f(x') ⇒ Df(x')(x'' − x') > 0,

or, equivalently,

∀x', x'' ∈ X, Df(x')(x'' − x') ≤ 0 ⇒ f(x'') ≤ f(x').

Proposition 706 If f is a pseudo-concave function, then

x* is a global maximum point ⇔ Df(x*) = 0.

Proof.
[⇒] Obvious.
[⇐] Df(x*) = 0 ⇒ ∀x ∈ X, Df(x*)(x − x*) ≤ 0 ⇒ f(x) ≤ f(x*).

Remark 707 Observe that the following definition of pseudo-concavity would not be useful:

∀x', x'' ∈ X, Df(x')(x'' − x') ≥ 0 ⇒ f(x'') ≤ f(x').   (18.5)

For such a definition the above Proposition would still apply, but the property is not weaker than concavity. Simply consider the function f : R → R, f : x ↦ −x^2. That function is concave, but it does not satisfy condition (18.5). Take x' = −2 and x'' = −1. Then f'(x')(x'' − x') = 4 · (−1 − (−2)) = 4 > 0, but f(x'') = −1 > f(x') = −4.
We summarize some of the results of this subsection in the following table.

Class of function         | C ⇒ G max | L max ⇒ G max | Uniqueness of G max
Strictly concave          | Yes       | Yes           | Yes
Concave                   | Yes       | Yes           | No
Diff.ble-str.-q.-conc.    | Yes       | Yes           | Yes
Pseudo-concave            | Yes       | Yes           | No
Quasi-concave             | No        | No            | No

where C stands for the property of being a critical point, and L and G stand for local and global, respectively. Observe that the first, the second and the last row of the second column apply to the case of C^0 and not necessarily differentiable functions.

18.3 Relationships among Different Kinds of Concavity

The relationships among different definitions of concavity in the case of differentiable functions are summarized in the following list of implications:

strict concavity ⇒ concavity;
strict concavity ⇒ differentiable-strict-quasi-concavity;
linearity ⇒ affinity ⇒ concavity;
concavity ⇒ pseudo-concavity;
differentiable-strict-quasi-concavity ⇒ pseudo-concavity;
pseudo-concavity ⇒ quasi-concavity.

All the implications which are not implied by those explicitly written do not hold true.
In what follows, we prove the truth of each implication described in the table and we explain why the other implications do not hold.
Recall that
1. f : R^n → R^m is a linear function iff ∀x', x'' ∈ R^n, ∀a, b ∈ R, f(ax' + bx'') = af(x') + bf(x'');
2. g : R^n → R^m is an affine function iff there exists a linear function f : R^n → R^m and c ∈ R^m such that ∀x ∈ R^n, g(x) = f(x) + c.

SC ⇒ C
Obvious (a > b ⇒ a ≥ b).
C ⇒ PC
From the assumption and from Proposition 670, we have that f(x'') − f(x') ≤ Df(x')(x'' − x'). Then f(x'') − f(x') > 0 ⇒ Df(x')(x'' − x') > 0.

PC ⇒ QC
Suppose otherwise, i.e., ∃x', x'' ∈ X and λ* ∈ [0, 1] such that

f((1 − λ*)x' + λ*x'') < min{f(x'), f(x'')}.

Define x(λ) := (1 − λ)x' + λx''. Consider the segment L(x', x'') joining x' to x''. Take

λ̄ ∈ arg min_λ f(x(λ)) s.t. λ ∈ [0, 1];

λ̄ is well defined from the Extreme Value Theorem. Observe that λ̄ ≠ 0, 1, because f(x(λ*)) < min{f(x(0)) = f(x'), f(x(1)) = f(x'')}.
Therefore, ∀μ ∈ [0, 1] and ∀γ ∈ (0, 1),

f(x(λ̄)) ≤ f((1 − γ)x(λ̄) + γ x(μ)).

[Figure: the segment from x' to x'' with the points x(μ) and x(λ̄) marked on it.]

Then,

∀μ ∈ [0, 1], 0 ≤ lim_{γ→0^+} [f((1 − γ)x(λ̄) + γ x(μ)) − f(x(λ̄))] / γ = Df(x(λ̄))(x(μ) − x(λ̄)).

Taking μ = 0, 1 in the above expression, we get:

Df(x(λ̄))(x' − x(λ̄)) ≥ 0   (1)

and

Df(x(λ̄))(x'' − x(λ̄)) ≥ 0.   (2)

Since

x' − x(λ̄) = x' − [(1 − λ̄)x' + λ̄x''] = −λ̄(x'' − x')   (3)

and

x'' − x(λ̄) = x'' − [(1 − λ̄)x' + λ̄x''] = (1 − λ̄)(x'' − x'),   (4)

substituting (3) in (1), and (4) in (2), we get

(−λ̄)[Df(x(λ̄))(x'' − x')] ≥ 0

and

(1 − λ̄)[Df(x(λ̄))(x'' − x')] ≥ 0.

Therefore, since λ̄ ∈ (0, 1),

0 = Df(x(λ̄))(x'' − x') and hence, by (4), Df(x(λ̄))(x'' − x(λ̄)) = (1 − λ̄) Df(x(λ̄))(x'' − x') = 0.

Then, by pseudo-concavity,

f(x'') ≤ f(x(λ̄)).   (5)

By assumption,

f(x(λ*)) < f(x'').   (6)

(5) and (6) contradict the definition of λ̄: f(x(λ̄)) ≤ f(x(λ*)) < f(x'') ≤ f(x(λ̄)) is impossible.

DSQC ⇒ PC
Obvious.

SC ⇒ DSQC
Obvious.

L ⇒ C
Obvious.

C ⇏ SC
f : R → R, f : x ↦ x.

QC ⇏ PC
f : R → R, f : x ↦ 0 if x ≤ 0, and f : x ↦ e^{−1/x^2} if x > 0.

[Figure: graph of f.]

f is clearly nondecreasing and therefore, from Proposition 690, quasi-concave.
f is not pseudo-concave: f(1) > f(−1) = 0, but

f'(−1)(1 − (−1)) = 0 · 2 = 0, which is not > 0.

PC ⇏ C, DSQC ⇏ C and DSQC ⇏ SC
Take f : (1, +∞) → R, f : x ↦ x^3.
Take x' < x''. Then f(x'') > f(x'). Moreover, Df(x')(x'' − x') > 0, both factors being positive. Therefore, f is DSQC and therefore PC. Since f''(x) > 0, f is strictly convex and therefore it is not concave and, a fortiori, it is not strictly concave.

PC ⇏ DSQC, C ⇏ DSQC
Consider f : R → R, f : x ↦ 1. f is clearly concave and PC as well (∀x', x'' ∈ R, Df(x')(x'' − x') = 0 and f(x'') ≤ f(x')). Moreover, any point in R is a critical point, but it is not the unique global maximum point. Therefore, from Proposition 702, f is not differentiable-strictly-quasi-concave.

QC ⇏ DSQC
If so, we would have QC ⇒ DSQC ⇒ PC, contradicting the fact that QC ⇏ PC.

C ⇏ L and SC ⇏ L
f : R → R, f : x ↦ −x^2.
18.3. RELATIONSHIPS AMONG DIFFERENT KINDS OF CONCAVITY 219
18.3.1 Hessians and Concavity.
In this subsection, we study the relation between submatrices of a matrix involving the Hessian
matrix of a C
2
function and the concavity of that function.
Denition 708 Consider a matrix A
nn
. Let 1 k n.
A kth order principal submatrix (minor) of A is the (determinant of the) square submatrix of
A obtained deleting (n k) rows and (n k) columns in the same position. Denote these matrices
by
e
D
i
k
.
The k th order leading principal submatrix (minor) of A is the (determinant of the) square
submatrix of A obtained deleting the last (n k) rows and the last (n k) columns. Denote these
matrices by D
k
.
Example 709 Consider
A =

a
11
a
12
a
13
a
21
a
22
a
23
a
31
a
32
a
33

.
Then
e
D
1
1
= a
11
,
e
D
2
1
= a
22
,
e
D
3
1
= a
33
, D
1
=
e
D
1
1
= a
11
;
e
D
1
2
=

a
11
a
12
a
21
a
22

,
e
D
2
2
=

a
11
a
13
a
31
a
33

,
e
D
3
2
=

a
22
a
23
a
32
a
33

,
D
2
=
e
D
1
2
=

a
11
a
12
a
21
a
22

;
D
3
=
e
D
1
3
= A.
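Definition 708 translates directly into code. The sketch below (an illustration, not part of the text) computes all leading principal minors of a matrix; for the tridiagonal test matrix chosen here, the signs alternate as (−1)^k, the pattern used in Theorem 711 below.

    import numpy as np

    # The k-th order leading principal minor of A is the determinant
    # of the top-left k x k submatrix.
    def leading_principal_minors(A):
        A = np.asarray(A, dtype=float)
        return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

    A = np.array([[-2.0,  1.0,  0.0],
                  [ 1.0, -2.0,  1.0],
                  [ 0.0,  1.0, -2.0]])
    print(leading_principal_minors(A))  # [-2.0, 3.0, -4.0]: signs (-1)^k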
Definition 710 Consider a C^2 function f : X ⊆ R^n → R. The bordered Hessian of f is the following matrix:

B_f(x) = [ 0           Df(x)     ]
         [ [Df(x)]^T   D^2 f(x)  ].

Theorem 711 (Simon (1985), Theorem 1.9.c, page 79, and Sydsaeter (1981), Theorem 5.17, page 259) Consider a C^2 function f : X → R.
1. If ∀x ∈ X, ∀k ∈ {1, ..., n},

sign(k-th leading principal minor of D^2 f(x)) = sign(−1)^k,

then f is strictly concave.
2. ∀x ∈ X, ∀k ∈ {1, ..., n},

sign(non-zero k-th principal minor of D^2 f(x)) = sign(−1)^k,

iff f is concave.
3. If n ≥ 2 and ∀x ∈ X, ∀k ∈ {3, ..., n + 1},

sign(k-th leading principal minor of B_f(x)) = sign(−1)^{k−1},

then f is pseudo-concave and, therefore, quasi-concave.
4. If f is quasi-concave, then ∀x ∈ X, ∀k ∈ {2, ..., n + 1},

sign(non-zero k-th leading principal minor of B_f(x)) = sign(−1)^{k−1}.

Remark 712 It can be proved that the conditions in parts 1 and 2 of the above Theorem are sufficient for D^2 f(x) being negative definite and equivalent to D^2 f(x) being negative semidefinite, respectively.
Remark 713 (from Sydsaeter (1981), page 239) It is tempting to conjecture that a function f is concave iff

∀x ∈ X, ∀k ∈ {1, ..., n}, sign(non-zero k-th leading principal minor of D^2 f(x)) = sign(−1)^k.   (18.6)

That conjecture is false. Consider

f : R^3 → R, f : (x_1, x_2, x_3) ↦ −x_2^2 + x_3^2.

Then Df(x) = (0, −2x_2, 2x_3) and

D^2 f(x) =
[ 0   0   0 ]
[ 0  −2   0 ]
[ 0   0   2 ].

All the leading principal minors of the above matrix are zero, and therefore Condition (18.6) is satisfied, but f is not a concave function. Take x' = (0, 0, 0) and x'' = (0, 0, 1). Then

∀λ ∈ (0, 1), f((1 − λ)x' + λx'') = λ^2 < (1 − λ)f(x') + λf(x'') = λ.
Example 714 Consider f : R^2_{++} → R, f : (x, y) ↦ x^α y^β, with α, β ∈ R_{++}. Observe that ∀(x, y) ∈ R^2_{++}, f(x, y) > 0. Verify that
1. if α + β < 1, then f is strictly concave;
2. ∀α, β ∈ R_{++}, f is quasi-concave;
3. α + β ≤ 1 if and only if f is concave.

1.

D_x f(x, y) = αx^{α−1}y^β = (α/x) f(x, y);
D_y f(x, y) = βx^α y^{β−1} = (β/y) f(x, y);
D^2_{xx} f(x, y) = α(α − 1)x^{α−2}y^β = [α(α − 1)/x^2] f(x, y);
D^2_{yy} f(x, y) = β(β − 1)x^α y^{β−2} = [β(β − 1)/y^2] f(x, y);
D^2_{xy} f(x, y) = αβ x^{α−1}y^{β−1} = [αβ/(xy)] f(x, y).

D^2 f(x, y) = f(x, y) [ α(α−1)/x^2   αβ/(xy) ; αβ/(xy)   β(β−1)/y^2 ].

a. α(α − 1)/x^2 < 0 ⇔ α ∈ (0, 1).
b. The determinant of the matrix in square brackets is

[α(α−1)β(β−1) − α^2β^2] / (x^2 y^2) = [αβ/(x^2 y^2)] (−α − β + 1) = [αβ/(x^2 y^2)] (1 − α − β) > 0 ⇔(α,β>0) α + β < 1.

In conclusion, if α, β ∈ (0, 1) and α + β < 1, then f is strictly concave.

2.
Observe that

f(x, y) = g(h(x, y)),

where

h : R^2_{++} → R, (x, y) ↦ α ln x + β ln y,
g : R → R, z ↦ e^z.

Since h is strictly concave (why?) and therefore quasi-concave, and g is strictly increasing, the desired result follows from Proposition 682.

3.
Obvious from the above results.
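The sign analysis of Example 714 can also be checked through eigenvalues of the Hessian. The sketch below is an illustration only (the sampled region and the parameter values 0.3 and 0.7 are arbitrary choices): it confirms that the Hessian of x^α y^β is negative definite at sampled points when α + β < 1, and fails negative semidefiniteness when α + β > 1.

    import numpy as np

    # Hessian of f(x,y) = x^a * y^b on R^2_{++}, as computed in Example 714
    def hessian(a, b, x, y):
        fv = x**a * y**b
        return fv * np.array([[a * (a - 1) / x**2, a * b / (x * y)],
                              [a * b / (x * y), b * (b - 1) / y**2]])

    rng = np.random.default_rng(1)
    pts = rng.uniform(0.1, 5.0, size=(100, 2))
    nd = all(np.all(np.linalg.eigvalsh(hessian(0.3, 0.3, x, y)) < 0)
             for x, y in pts)   # a + b = 0.6 < 1: negative definite
    nsd = all(np.all(np.linalg.eigvalsh(hessian(0.7, 0.7, x, y)) <= 0)
              for x, y in pts)  # a + b = 1.4 > 1: fails
    print(nd, nsd)  # True False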
Chapter 19

Maximization Problems

Let the following objects be given:
1. an open convex set X ⊆ R^n, n ∈ N \ {0};
2. f : X → R, g : X → R^m, h : X → R^k, m, k ∈ N \ {0}, with f, g, h at least differentiable.
The goal of this Chapter is to study the problem

max_{x∈X} f(x)
s.t. g(x) ≥ 0   (1)
     h(x) = 0   (2)
   (19.1)

f is called the objective function; x the choice variable vector; (1) and (2) in (19.1) the constraints; g and h the constraint functions;

C := {x ∈ X : g(x) ≥ 0 and h(x) = 0}

is the constraint set.
To solve problem (19.1) means to describe the following set

{x* ∈ C : ∀x ∈ C, f(x*) ≥ f(x)},

which is called the solution set to problem (19.1) and is also denoted by arg max (19.1). We will proceed as follows.
1. We will analyze in detail the problem with inequality constraints, i.e.,

max_{x∈X} f(x) s.t. g(x) ≥ 0.   (1)

2. We will analyze in detail the problem with equality constraints, i.e.,

max_{x∈X} f(x) s.t. h(x) = 0.   (2)

3. We will describe how to solve the problem with both equality and inequality constraints, i.e.,

max_{x∈X} f(x) s.t. g(x) ≥ 0   (1)
                    h(x) = 0.  (2)

19.1 The case of inequality constraints: Kuhn-Tucker theorems

Consider the open and convex set X ⊆ R^n and the differentiable functions f : X → R, g := (g^j)_{j=1}^m : X → R^m. The problem we want to study is

max_{x∈X} f(x) s.t. g(x) ≥ 0.   (19.2)
Definition 715 The Kuhn-Tucker system (or conditions) associated with problem (19.2) is

Df(x) + λ Dg(x) = 0   (1)
λ ≥ 0                 (2)
g(x) ≥ 0              (3)
λ g(x) = 0            (4)
   (19.3)

Equations (1) are called first order conditions; equations (2), (3) and (4) are called complementary slackness conditions.

Remark 716 (x, λ) ∈ X × R^m is a solution to the Kuhn-Tucker system iff it is a solution to any of the following systems:
1.

∂f(x)/∂x_i + Σ_{j=1}^m λ_j ∂g^j(x)/∂x_i = 0 for i = 1, ..., n   (1)
λ_j ≥ 0 for j = 1, ..., m   (2)
g^j(x) ≥ 0 for j = 1, ..., m   (3)
λ_j g^j(x) = 0 for j = 1, ..., m   (4)

2.

Df(x) + λ Dg(x) = 0   (1)
min{λ_j, g^j(x)} = 0 for j = 1, ..., m   (2)

Moreover, (x, λ) ∈ X × R^m is a solution to the Kuhn-Tucker system iff it is a solution to

Df(x) + λ Dg(x) = 0   (1)

and, for each j = 1, ..., m, to one of the following conditions:

either (λ_j > 0 and g^j(x) = 0)
or (λ_j = 0 and g^j(x) > 0)
or (λ_j = 0 and g^j(x) = 0).
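The last reformulation in Remark 716 is the basis of the usual case-by-case solution method. A minimal symbolic sketch (the one-constraint problem max −(x − 2)^2 s.t. 1 − x ≥ 0 is an arbitrary example, and sympy is assumed available):

    import sympy as sp

    # Case enumeration for max -(x-2)^2 s.t. g(x) = 1 - x >= 0.
    x, lam = sp.symbols('x lam', real=True)
    f = -(x - 2)**2
    g = 1 - x
    foc = sp.diff(f, x) + lam * sp.diff(g, x)   # first order condition

    # case lam = 0 and g(x) > 0: FOC gives x = 2, but g(2) = -1 < 0 -- rejected
    print(sp.solve(foc.subs(lam, 0), x))   # [2]
    # case g(x) = 0: x = 1, and the FOC gives lam = 2 >= 0 -- accepted
    print(sp.solve([foc, g], [x, lam]))    # {lam: 2, x: 1}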
Definition 717 Given x* ∈ X, we say that j is a binding constraint at x* if g^j(x*) = 0. Let

J*(x*) := {j ∈ {1, ..., m} : g^j(x*) = 0}

and

g* := (g^j)_{j∈J*(x*)},  ĝ := (g^j)_{j∉J*(x*)}.

Definition 718 x* ∈ R^n satisfies the constraint qualifications associated with problem (19.2) if it is a solution to

max_{x∈R^n} Df(x*) · x s.t. Dg*(x*)(x − x*) ≥ 0.   (19.4)
The above problem is obtained from (19.2) by
1. replacing g with g*;
2. linearizing f and g* around x*, i.e., substituting f and g* with f(x*) + Df(x*)(x − x*) and g*(x*) + Dg*(x*)(x − x*), respectively;
3. dropping redundant terms, i.e., the term f(x*) in the objective function, and the term g*(x*) = 0 in the constraint.

Theorem 719 Suppose x* is a solution to problem (19.2) and to problem (19.4); then there exists λ* ∈ R^m such that (x*, λ*) satisfies the Kuhn-Tucker conditions.

The proof of the above theorem requires the following lemma, whose proof can be found, for example, in Chapter 2 of Mangasarian (1994).

Lemma 720 (Farkas) Given a matrix A_{m×n} and a vector a ∈ R^n,
either 1. there exists λ ∈ R^m_+ such that a = λA,
or 2. there exists y ∈ R^n such that Ay ≥ 0 and ay < 0,
but not both.
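Farkas' Lemma can be decided computationally: alternative 1 holds iff the linear system A^T λ = a, λ ≥ 0 is feasible, which is a linear programming feasibility problem. A sketch (the matrix A and the vectors a below are arbitrary test data; scipy is assumed available):

    import numpy as np
    from scipy.optimize import linprog

    # Alternative 1 of Farkas' Lemma as an LP feasibility problem:
    # does there exist lam >= 0 with a = lam @ A, i.e. A.T @ lam = a?
    def farkas_alternative_1(A, a):
        m = A.shape[0]
        res = linprog(c=np.zeros(m), A_eq=A.T, b_eq=a,
                      bounds=[(0, None)] * m, method='highs')
        return res.success   # False means some y of alternative 2 exists

    A = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
    print(farkas_alternative_1(A, np.array([2.0, 3.0])))   # True: a = 2*row1 + 3*row2
    print(farkas_alternative_1(A, np.array([-1.0, 0.0])))  # False: alternative 2 applies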
Proof of Theorem 719.
(Main steps: 1. use the fact that x* is a solution to problem (19.4); 2. apply Farkas' Lemma; 3. choose λ* = (λ from Farkas, 0).)
Since x* is a solution to problem (19.4), for any x ∈ R^n such that Dg*(x*)(x − x*) ≥ 0 it is the case that Df(x*) · x* ≥ Df(x*) · x, or

Dg*(x*)(x − x*) ≥ 0 ⇒ [−Df(x*)](x − x*) ≥ 0.   (19.5)

Applying Farkas' Lemma, identifying

a with −Df(x*)

and

A with Dg*(x*),

we have that either
1. there exists λ ∈ R^{m*}_+ such that

−Df(x*) = λ Dg*(x*),   (19.6)

or 2. there exists y ∈ R^n such that

Dg*(x*) y ≥ 0 and −Df(x*) · y < 0,   (19.7)

but not both 1 and 2.
Choose x = y + x*, and therefore you have y = x − x*. Then, (19.7) contradicts (19.5). Therefore, 1. above holds.
Now, choosing λ* := (λ, 0) ∈ R^{m*} × R^{m−m*}, we have that

Df(x*) + λ* Dg(x*) = Df(x*) + (λ, 0) [ Dg*(x*) ; Dĝ(x*) ] = Df(x*) + λ Dg*(x*) = 0,

where the last equality follows from (19.6);

λ* ≥ 0 by Farkas' Lemma;

g(x*) ≥ 0 from the assumption that x* solves problem (19.2);

λ* g(x*) = (λ, 0) · (g*(x*), ĝ(x*)) = λ g*(x*) = 0, where the last equality follows from the definition of g*.
Theorem 721 If x* is a solution to problem (19.2) and
either for j = 1, .., m, g^j is pseudo-concave and ∃x^{++} ∈ X such that g(x^{++}) >> 0,
or rank Dg*(x*) = m* := #J*(x*),
then x* solves problem (19.4).

Proof. We prove the conclusion of the theorem under the first set of conditions.
(Main steps: 1. suppose otherwise: ∃x̃ ...; 2. use the two assumptions; 3. move from x* in the direction of x^γ := (1 − γ)x̃ + γx^{++}.)
Suppose that the conclusion of the theorem is false. Then there exists x̃ ∈ R^n such that

Dg*(x*)(x̃ − x*) ≥ 0 and Df(x*)(x̃ − x*) > 0.   (19.8)

Moreover, from the definition of g* and x^{++}, we have that

g*(x^{++}) >> 0 = g*(x*).

Since for j = 1, .., m, g^j is pseudo-concave, we have that

Dg*(x*)(x^{++} − x*) >> 0.   (19.9)

Define

x^γ := (1 − γ)x̃ + γx^{++}

with γ ∈ (0, 1). Observe that

x^γ − x* = (1 − γ)(x̃ − x*) + γ(x^{++} − x*).

Therefore,

Dg*(x*)(x^γ − x*) = (1 − γ) Dg*(x*)(x̃ − x*) + γ Dg*(x*)(x^{++} − x*) >> 0,   (19.10)

where the last inequality comes from (19.8) and (19.9).
Moreover,

Df(x*)(x^γ − x*) = (1 − γ) Df(x*)(x̃ − x*) + γ Df(x*)(x^{++} − x*) > 0,   (19.11)

where the last inequality comes from (19.8) and a choice of γ sufficiently small.^1
Observe that from Remark 613, (19.10) and (19.11) we have that

(g*)'(x*; x^γ − x*) >> 0 and f'(x*; x^γ − x*) > 0.

Therefore, using the fact that X is open, and that ĝ(x*) >> 0, there exists δ̄ ∈ (0, 1) such that, setting x̄ := x* + δ̄(x^γ − x*),

x̄ ∈ X,  g*(x̄) ≥ 0,  ĝ(x̄) ≥ 0,  f(x̄) > f(x*).   (19.12)

But then (19.12) contradicts the fact that x* solves problem (19.2).
From Theorems 719 and 721, we then get the following corollary.

Theorem 722 Suppose x* is a solution to problem (19.2), and one of the following constraint qualifications holds:
a. for j = 1, ..., m, g^j is pseudo-concave and there exists x^{++} ∈ X such that g(x^{++}) >> 0;
b. rank Dg*(x*) = #J*.
Then there exists λ* ∈ R^m such that (x*, λ*) solves the system (19.3).

Theorem 723 If f is pseudo-concave, and for j = 1, ..., m, g^j is quasi-concave, and (x*, λ*) solves the system (19.3), then x* solves problem (19.2).

Proof. (Main steps: 1. suppose otherwise and use the fact that f is pseudo-concave; 2. for j ∈ Ĵ(x*) := {1, ..., m} \ J*(x*), use the quasi-concavity of g^j; 3. for j ∈ J*(x*), use the (second part of) the Kuhn-Tucker conditions; 4. observe that 2. and 3. above contradict the first part of the Kuhn-Tucker conditions.)
Suppose otherwise, i.e., there exists x̂ ∈ X such that

g(x̂) ≥ 0 and f(x̂) > f(x*).   (19.13)

From (19.13) and the fact that f is pseudo-concave, we get

Df(x*)(x̂ − x*) > 0.   (19.14)

From (19.13), the fact that g*(x*) = 0 and that g^j is quasi-concave, we get that

for j ∈ J*(x*), Dg^j(x*)(x̂ − x*) ≥ 0,

and since λ* ≥ 0,

for j ∈ J*(x*), λ*_j Dg^j(x*)(x̂ − x*) ≥ 0.   (19.15)

For j ∈ Ĵ(x*), from the Kuhn-Tucker conditions, we have that g^j(x*) > 0 and λ*_j = 0, and therefore

for j ∈ Ĵ(x*), λ*_j Dg^j(x*)(x̂ − x*) = 0.   (19.16)

But then from (19.14), (19.15) and (19.16), we have

Df(x*)(x̂ − x*) + λ* Dg(x*)(x̂ − x*) > 0,

contradicting the Kuhn-Tucker conditions.

^1 Assume that γ ∈ (0, 1), β ∈ R_{++} and α ∈ R. We want to show that there exists γ ∈ (0, 1) such that (1 − γ)β + γα > 0, i.e., γ(α − β) > −β. If α − β = 0, the claim is true. If α − β > 0, any γ ∈ (0, 1) will work. If α − β < 0, the claim is true for any γ < β/(β − α) (observe that β/(β − α) > 0).
We can summarize the above results as follows. Call (M) the problem

max_{x∈X} f(x) s.t. g(x) ≥ 0   (19.17)

and define

M := arg max (M)   (19.18)

S := {x ∈ X : ∃λ ∈ R^m such that (x, λ) is a solution to the Kuhn-Tucker system (19.3)}.   (19.19)

1. Assume that one of the following conditions holds:
(a) for j = 1, ..., m, g^j is pseudo-concave and there exists x^{++} ∈ X such that g(x^{++}) >> 0;
(b) rank Dg*(x*) = #J*.
Then

x* ∈ M ⇒ x* ∈ S.

2. Assume that both the following conditions hold:
(a) f is pseudo-concave, and
(b) for j = 1, ..., m, g^j is quasi-concave.
Then

x* ∈ S ⇒ x* ∈ M.
19.1.1 On uniqueness of the solution

The following proposition is a useful tool to show uniqueness.

Proposition 724 The solution to the problem

max_{x∈X} f(x) s.t. g(x) ≥ 0   (P)

either does not exist or is unique if one of the following conditions holds:
1. f is strictly quasi-concave, and for j ∈ {1, ..., m}, g^j is quasi-concave;
2. f is quasi-concave and locally non-satiated (i.e., ∀x ∈ X, ∀ε > 0, there exists x' ∈ B(x, ε) such that f(x') > f(x)), and for j ∈ {1, ..., m}, g^j is strictly quasi-concave.

Proof. 1.
Since g^j is quasi-concave, V_j := {x ∈ X : g^j(x) ≥ 0} is convex. Since the intersection of convex sets is convex, V := ∩_{j=1}^m V_j is convex.
Suppose that both x' and x'' are solutions to problem (P) and x' ≠ x''. Then for any λ ∈ (0, 1),

(1 − λ)x' + λx'' ∈ V   (19.20)

because V is convex, and

f((1 − λ)x' + λx'') > min{f(x'), f(x'')} = f(x') = f(x'')   (19.21)

because f is strictly quasi-concave.
But (19.20) and (19.21) contradict the fact that x' and x'' are solutions to problem (P).
2.
Observe that V is strictly convex because each V_j is strictly convex. Suppose that both x' and x'' are solutions to problem (P) and x' ≠ x''. Then for any λ ∈ (0, 1),

x(λ) := (1 − λ)x' + λx'' ∈ Int V,

i.e., ∃ε > 0 such that B(x(λ), ε) ⊆ V. Since f is locally non-satiated, there exists x̂ ∈ B(x(λ), ε) ⊆ V such that

f(x̂) > f(x(λ)).   (19.22)

Since f is quasi-concave,

f(x(λ)) ≥ f(x') = f(x'').   (19.23)

(19.22) and (19.23) contradict the fact that x' and x'' are solutions to problem (P).
Remark 725 1. If f is strictly increasing (i.e., ∀x', x'' ∈ X such that x' > x'', we have that f(x') > f(x'')) or strictly decreasing, then f is locally non-satiated.
2. If f is affine and not constant, then f is quasi-concave and locally non-satiated.

Proof of 2.
f : R^n → R affine and not constant means that there exist a ∈ R and b ∈ R^n \ {0} such that f : x ↦ a + b^T x. Take an arbitrary x and ε > 0, and define x̃ := x + (ε/k) b, with k > ||b||. Then

f(x̃) = a + b^T x + (ε/k)||b||^2 > f(x);

||x̃ − x|| = (ε/k)||b|| < ε.

Remark 726 In part 2 of the statement of the Proposition, f has to be both quasi-concave and locally non-satiated.
a. Example of f quasi-concave (and g^j strictly quasi-concave) with more than one solution:

max_{x∈R} 1 s.t. x + 1 ≥ 0, 1 − x ≥ 0.

The set of solutions is [−1, +1].
b. Example of f locally non-satiated (and g^j strictly quasi-concave) with more than one solution:

max_{x∈R} x^2 s.t. x + 1 ≥ 0, 1 − x ≥ 0.

The set of solutions is {−1, +1}.
19.2 The Case of Equality Constraints: Lagrange Theorem.

Consider the C^1 functions

f : X → R, f : x ↦ f(x),
g : X → R^m, g : x ↦ g(x) := (g^j(x))_{j=1}^m,

with m ≤ n. Consider also the following maximization problem:

(P)  max_{x∈X} f(x) s.t. g(x) = 0.   (19.24)

L : X × R^m → R, L : (x, λ) ↦ f(x) + λ g(x)

is called the Lagrange function associated with problem (19.24).
We recall below the statement of Theorem 660.
Theorem 727 (Necessary Conditions)
Assume that rank [Dg(x*)] = m.
Under the above condition, we have that

x* is a local maximum for (P) ⇒ there exists λ* ∈ R^m such that

Df(x*) + λ* Dg(x*) = 0
g(x*) = 0.
   (19.25)

Remark 728 The full rank condition in the above Theorem cannot be dispensed with. The following example shows a case in which x* is a solution to maximization problem (19.24), Dg(x*) does not have full rank and there exists no λ* satisfying Condition (19.25). Consider

max_{(x,y)∈R^2} x s.t. x^3 − y = 0, x^3 + y = 0.

The constraint set is {(0, 0)} and therefore the solution is just (x*, y*) = (0, 0). The Jacobian matrix of the constraint function is

[ 3x^2  −1 ; 3x^2  1 ]_{|(x*,y*)} = [ 0  −1 ; 0  1 ],

which does not have full rank. Moreover,

(0, 0) = Df(x*, y*) + (λ_1, λ_2) Dg(x*, y*) = (1, 0) + (λ_1, λ_2) [ 0  −1 ; 0  1 ] = (1, −λ_1 + λ_2),

from which it follows that there exists no λ* solving the above system.
Theorem 729 (Sufficient Conditions)
Assume that
1. f is pseudo-concave,
2. for j = 1, ..., m, g^j is quasi-concave.
Under the above conditions, we have what follows:
[there exists (x*, λ*) ∈ X × R^m such that
3. λ* ≥ 0,
4. Df(x*) + λ* Dg(x*) = 0, and
5. g(x*) = 0]
⇒ x* solves (P).

Proof.
Suppose otherwise, i.e., there exists x̂ ∈ X such that

for j = 1, ..., m, g^j(x̂) = g^j(x*) = 0   (1), and
f(x̂) > f(x*)   (2).

Quasi-concavity of g^j and (1) imply that

Dg^j(x*)(x̂ − x*) ≥ 0.   (3)

Pseudo-concavity of f and (2) imply that

Df(x*)(x̂ − x*) > 0.   (4)

But then

0 = (by Assumption 4) = [Df(x*) + λ* Dg(x*)](x̂ − x*) = Df(x*)(x̂ − x*) + λ* Dg(x*)(x̂ − x*) > 0,

the first term being > 0 by (4), and the second ≥ 0 by (3) and λ* ≥ 0: a contradiction.
19.3 The Case of Both Equality and Inequality Constraints.

Consider the open and convex set X ⊆ R^n and the differentiable functions f : X → R, g : X → R^m, h : X → R^l.
Consider the problem

max_{x∈X} f(x) s.t. g(x) ≥ 0, h(x) = 0.   (19.26)

Observe that

h(x) = 0 ⇔ for k = 1, ..., l, h^k(x) = 0 ⇔ for k = 1, ..., l, h^{k1}(x) := h^k(x) ≥ 0 and h^{k2}(x) := −h^k(x) ≥ 0.

Defining h^{.1}(x) := (h^{k1}(x))_{k=1}^l and h^{.2}(x) := (h^{k2}(x))_{k=1}^l, problem (19.26), with associated multipliers listed on the right, can be rewritten as

max_{x∈X} f(x) s.t. g(x) ≥ 0         λ
                    h^{.1}(x) ≥ 0    μ^1
                    h^{.2}(x) ≥ 0    μ^2
   (19.27)

The Lagrangian function of the above problem is

L(x; λ, μ^1, μ^2) = f(x) + λ^T g(x) + (μ^1 − μ^2)^T h(x) = f(x) + Σ_{j=1}^m λ_j g^j(x) + Σ_{k=1}^l (μ^k_1 − μ^k_2) h^k(x),

and the Kuhn-Tucker Conditions are:

Df(x) + λ^T Dg(x) + (μ^1 − μ^2)^T Dh(x) = 0
g^j(x) ≥ 0, λ_j ≥ 0, λ_j g^j(x) = 0, for j = 1, ..., m,
h^k(x) = 0, μ^k := μ^k_1 − μ^k_2 of arbitrary sign, for k = 1, ..., l.
   (19.28)

Theorem 730 Assume that f, g and h are C^2 functions and that

either rank [ Dg*(x*) ; Dh(x*) ] = m* + l,
or for j = 1, ..., m, g^j is pseudo-concave, and ∀k, h^k and −h^k are pseudo-concave.

Under the above conditions, if x* solves (19.26), then there exists (λ*, μ*) such that (x*, λ*, μ*) ∈ X × R^m × R^l satisfies the associated Kuhn-Tucker conditions.

Proof. The above conditions are called the Weak reverse convex constraint qualification (Mangasarian (1969)) or the Reverse constraint qualification (Bazaraa and Shetty (1976)). The needed result is presented and proved in Mangasarian^2 — see 4, page 172 and Theorem 6, page 173 — and in Bazaraa and Shetty (1976) — see 7, page 148, Theorem 6.2.3, page 148 and Theorem 6.2.4, page 150.
See also El-Hodiri (1991), Theorem 1, page 48 and Simon (1985), Theorem 4.4 (iii), page 104.

^2 What Mangasarian calls a linear function is what we call an affine function.
Theorem 731 Assume that
f is pseudo-concave, and
for j = 1, ..., m, g^j is quasi-concave, and for k = 1, ..., l, h^k is quasi-concave and −h^k is quasi-concave.
Under the above conditions, if (x*, λ*, μ*) ∈ X × R^m × R^l satisfies the Kuhn-Tucker conditions associated with (19.26), then x* solves (19.26).

Proof.
This follows from the Theorems proved in the case of inequality constraints.

Similarly to what we have done in previous sections, we can summarize what was said above as follows.
Call (M_2) the problem

max_{x∈X} f(x) s.t. g(x) ≥ 0, h(x) = 0,   (19.29)

and define

M_2 := arg max (M_2)

S_2 := {x ∈ X : ∃(λ, μ) ∈ R^m × R^l such that (x, λ, μ) is a solution to the Kuhn-Tucker system (19.28)}.

1. Assume that one of the following conditions holds:
(a) rank [ Dg*(x*) ; Dh(x*) ] = m* + l, or
(b) for j = 1, ..., m, g^j is linear, and h is affine.
Then

M_2 ⊆ S_2.

2. Assume that both the following conditions hold:
(a) f is pseudo-concave, and
(b) for j = 1, ..., m, g^j is quasi-concave, and for k = 1, ..., l, h^k is quasi-concave and −h^k is quasi-concave.
Then

M_2 ⊇ S_2.
19.4 Main Steps to Solve a (Nice) Maximization Problem
We have studied the problem
max
xX
f (x) s.t. g (x) 0 (M)
which we call a maximization problem in the canonical form, i.e., a maximization problem
with constraints in the form of , and we have dened
M := arg max (M)
C := {x X : g (x) 0}
S := {x X : R
m
such that (x, ) satises Kuhn-Tucker Conditions (19.3)}
Recall that X is an open, convex subset of R
n
, f : X R, j {1, ..., m}, g
j
: X R and
g := (g
j
)
m
j=1
: X R
m
.
230 CHAPTER 19. MAXIMIZATION PROBLEMS
In many cases, we have to study the following problem
max
x
f (x) s.t. g (x) 0, (M
0
)
in which the set X is not specied.
We list the main steps to try to solve (M
0
).
1. Canonical form.
Write the problem in the (in fact, our denition of) canonical form. Sometimes the problem
contains a parameter an open subset of R
k
. Then we should write: for given
max
x
f (x, ) s.t. g (x, ) 0.
2. The set X and the functions f and g.
a. Dene the functions
e
f, e g naturally arising from the problem with domain equal to their
denition set, where the denition set of a function is the largest set D

which can be the domain


of that function.
b. Determine X. A possible choice for X is the intersection of the denition set of each
function, , i.e.,
X = D
f
D
g1
... D
gm
c. Check if X is open and convex.
d. To apply the analysis described in the previous sections, show, if possible, that f and g are
of class C
2
or at least C
1
.
3. Existence.
Try to apply the Extreme Value Theorem. If f is at least C^1, then f is continuous, and therefore we have to check if the constraint set C is non-empty and compact. Recall that a set S in R^n is compact if and only if S is closed (in R^n) and bounded.
Boundedness has to be shown "brute force", i.e., using the specific form of the maximization problem.
If X = R^n, then C := {x ∈ X : g(x) ≥ 0} is closed, because of the following well-known argument: C = ∩_{j=1}^m g_j^{-1}([0, +∞)); since g_j is C^2 (or at least C^1) and therefore continuous, and [0, +∞) is closed, g_j^{-1}([0, +∞)) is closed in X = R^n; then C is closed because it is an intersection of closed sets.
A problem may arise if X is an open proper subset of R^n. In that case the above argument shows that C is a closed set in X ≠ R^n, and therefore it is not necessarily closed in R^n. A possible way out is the following one.
Verify that, while the definition set of f is X, the definition set of g is R^n. If D_g = R^n, then from the above argument

C̃ := {x ∈ R^n : g̃(x) ≥ 0}

is closed (in R^n). Observe that C = C̃ ∩ X. Then we are left with showing that C̃ ⊆ X, and therefore C̃ ∩ X = C̃ and then C = C̃. If C̃ is compact, C is compact as well.³

³ Observe that the above argument does not apply to the case in which C = {x ∈ R^2_{++} : w - x_1 - x_2 ≥ 0}: in that case, C̃ = {x ∈ R^2 : w - x_1 - x_2 ≥ 0} ⊄ R^2_{++}.
4. Number of solutions.
See Subsection 19.1.1. In fact, summarizing what was said there, we know that the solution to (M), if any, is unique if
1. f is strictly quasi-concave, and for j ∈ {1, ..., m}, g_j is quasi-concave; or
2. for j ∈ {1, ..., m}, g_j is strictly quasi-concave and
either a. f is quasi-concave and locally non-satiated,
or b. f is affine and non-constant,
or c. f is quasi-concave and strictly monotone,
or d. f is quasi-concave and ∀x ∈ X, Df(x) >> 0,
or e. f is quasi-concave and ∀x ∈ X, Df(x) << 0.
5. Necessity of K-T conditions.
Check if the conditions which insure that M ⊆ S hold, i.e.,
either a. for j = 1, ..., m, g_j is pseudo-concave and there exists x^{++} ∈ X such that g(x^{++}) >> 0,
or b. rank Dg^*(x^*) = #J^*.
If those conditions hold, each property which we show holds for the elements of S does hold, a fortiori, for the elements of M.
6. Sufficiency of K-T conditions.
Check if the conditions which insure that M ⊇ S hold, i.e., that
f is pseudo-concave and, for j = 1, ..., m, g_j is quasi-concave.
If those conditions hold, each property which we show does not hold for the elements of S does not hold, a fortiori, for the elements of M.
7. K-T conditions.
Write the Lagrangian function and then the Kuhn-Tucker conditions.
8. Solve the K-T conditions.
Try to solve the system of Kuhn-Tucker conditions in the unknown variables (x, λ). To do that,
either analyze all cases,
or try to get a good conjecture and check if the conjecture is correct.
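As a complement to steps 7 and 8, the following minimal Python sketch (a hypothetical helper, not part of the notes) checks numerically whether a candidate pair (x, λ) satisfies the Kuhn-Tucker conditions of a problem in canonical form. The toy problem max -x_1^2 - x_2^2 s.t. x_1 + x_2 - 1 ≥ 0, with solution x = (1/2, 1/2) and λ = 1, is used only as an illustration.

```python
import numpy as np

def kt_residuals(Df, Dg, g, x, lam):
    """Residuals of the Kuhn-Tucker system at (x, lam) for
    max f(x) s.t. g(x) >= 0 in canonical form:
    stationarity, primal feasibility, dual feasibility, complementary slackness."""
    stat = Df(x) + lam @ Dg(x)          # should be ~0
    feas = np.minimum(g(x), 0.0)        # should be ~0 (g(x) >= 0)
    dual = np.minimum(lam, 0.0)         # should be ~0 (lam >= 0)
    comp = lam * g(x)                   # should be ~0
    return stat, feas, dual, comp

# Toy check: max -x1^2 - x2^2 s.t. x1 + x2 - 1 >= 0, solved by x = (1/2, 1/2), lam = 1.
Df = lambda x: np.array([-2 * x[0], -2 * x[1]])
Dg = lambda x: np.array([[1.0, 1.0]])
g  = lambda x: np.array([x[0] + x[1] - 1.0])
print(kt_residuals(Df, Dg, g, np.array([0.5, 0.5]), np.array([1.0])))
```

All four residuals are (numerically) zero, confirming the conjecture for this toy problem.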
Example 732 Discuss the problem

max_{(x_1, x_2)} (1/2) log(1 + x_1) + (1/3) log(1 + x_2)  s.t.  x_1 ≥ 0,
x_2 ≥ 0,
x_1 + x_2 ≤ w,

with w > 0.
1. Canonical form.
For given w ∈ R_{++},

max_{(x_1, x_2)} (1/2) log(1 + x_1) + (1/3) log(1 + x_2)  s.t.  x_1 ≥ 0,
x_2 ≥ 0,
w - x_1 - x_2 ≥ 0.   (19.30)
2. The set X and the functions f and g.
a.

f̃ : (-1, +∞)^2 → R, (x_1, x_2) ↦ (1/2) log(1 + x_1) + (1/3) log(1 + x_2),
g̃_1 : R^2 → R, (x_1, x_2) ↦ x_1,
g̃_2 : R^2 → R, (x_1, x_2) ↦ x_2,
g̃_3 : R^2 → R, (x_1, x_2) ↦ w - x_1 - x_2.

b.
X = (-1, +∞)^2, and therefore f and g are just f̃ and g̃ restricted to X.
c. X is open and convex because it is the Cartesian product of open intervals, which are open, convex sets.
d. Let's try to compute the Hessian matrices of f, g_1, g_2, g_3. Gradients are

Df(x_1, x_2) = ( 1/(2(x_1 + 1)), 1/(3(x_2 + 1)) ),
Dg̃_1(x_1, x_2) = (1, 0),
Dg̃_2(x_1, x_2) = (0, 1),
Dg̃_3(x_1, x_2) = (-1, -1).
Hessian matrices are

D^2 f(x_1, x_2) = [ -1/(2(x_1 + 1)^2)   0 ; 0   -1/(3(x_2 + 1)^2) ],
D^2 g̃_1(x_1, x_2) = 0,
D^2 g̃_2(x_1, x_2) = 0,
D^2 g̃_3(x_1, x_2) = 0.

In fact, g_1, g_2, g_3 are affine functions. In conclusion, f and g_1, g_2, g_3 are C^2. In fact, g_1 and g_2 are linear and g_3 is affine.
3. Existence.
C is clearly bounded: ∀x ∈ C,

(0, 0) ≤ (x_1, x_2) ≤ (w, w).

In fact, the first two constraints simply say that (x_1, x_2) ≥ (0, 0). Moreover, from the third constraint, x_1 ≤ w - x_2 ≤ w, simply because x_2 ≥ 0; a similar argument can be used to show that x_2 ≤ w.
To show closedness, use the strategy proposed above.

C̃ := {x ∈ R^2 : g̃(x) ≥ 0}

is obviously closed. Since C̃ ⊆ R^2_+, because of the first two constraints, C̃ ⊆ X := (-1, +∞)^2, and therefore C = C̃ ∩ X = C̃ is closed.
We can then conclude that C is compact and therefore arg max (19.30) ≠ ∅.
4. Number of solutions.
From the analysis of the Hessian and using Theorem 711, parts 1 and 2, we have that f is strictly concave:

-1/(2(x_1 + 1)^2) < 0,

det [ -1/(2(x_1 + 1)^2)   0 ; 0   -1/(3(x_2 + 1)^2) ] = (1/(2(x_1 + 1)^2)) · (1/(3(x_2 + 1)^2)) > 0.

Moreover, g_1, g_2, g_3 are affine and therefore concave. From Proposition 724, part 1, the solution is unique.
5. Necessity of K-T conditions.
Since each g_j is affine and therefore pseudo-concave, we are left with showing that there exists x^{++} ∈ X such that g(x^{++}) >> 0. Just take

(x_1^{++}, x_2^{++}) = (w/4)(1, 1):  w/4 > 0,  w/4 > 0,  w - w/4 - w/4 = w/2 > 0.

Therefore M ⊆ S.
6. Sufficiency of K-T conditions.
f is strictly concave and therefore pseudo-concave, and each g_j is linear and therefore quasi-concave. Therefore M ⊇ S.
7. K-T conditions.

L(x_1, x_2, λ_1, λ_2, μ; w) = (1/2) log(1 + x_1) + (1/3) log(1 + x_2) + λ_1 x_1 + λ_2 x_2 + μ(w - x_1 - x_2)

1/(2(x_1 + 1)) + λ_1 - μ = 0
1/(3(x_2 + 1)) + λ_2 - μ = 0
min{x_1, λ_1} = 0
min{x_2, λ_2} = 0
min{w - x_1 - x_2, μ} = 0
8. Solve the K-T conditions.
Conjecture: x_1 > 0 and therefore λ_1 = 0; x_2 > 0 and therefore λ_2 = 0; w - x_1 - x_2 = 0. The Kuhn-Tucker system becomes:

1/(2(x_1 + 1)) - μ = 0
1/(3(x_2 + 1)) - μ = 0
w - x_1 - x_2 = 0
μ ≥ 0
x_1 > 0, x_2 > 0
λ_1 = 0, λ_2 = 0

Then,

1/(2(x_1 + 1)) = 1/(3(x_2 + 1)) = μ
w - x_1 - x_2 = 0
μ > 0
x_1 > 0, x_2 > 0
λ_1 = 0, λ_2 = 0

⇔

x_1 = 1/(2μ) - 1
x_2 = 1/(3μ) - 1
w - (1/(2μ) - 1) - (1/(3μ) - 1) = 0
μ > 0
x_1 > 0, x_2 > 0
λ_1 = 0, λ_2 = 0

From 0 = w - (1/(2μ) - 1) - (1/(3μ) - 1) = w - 5/(6μ) + 2, we get μ = 5/(6(w + 2)) > 0. Then x_1 = 1/(2μ) - 1 = 6(w + 2)/(2·5) - 1 = (3w + 6 - 5)/5 = (3w + 1)/5 and x_2 = 1/(3μ) - 1 = 6(w + 2)/(3·5) - 1 = (2w + 4 - 5)/5 = (2w - 1)/5.
Summarizing,

x_1 = (3w + 1)/5 > 0
x_2 = (2w - 1)/5
μ = 5/(6(w + 2)) > 0
λ_1 = 0, λ_2 = 0

Observe that while x_1 > 0 for any value of w, x_2 > 0 iff w > 1/2. Therefore, for w ∈ (0, 1/2], the above one is not a solution, and we have to come up with another conjecture:
x_1 = w and therefore λ_1 = 0; x_2 = 0 and λ_2 ≥ 0; w - x_1 - x_2 = 0 and μ ≥ 0. The Kuhn-Tucker conditions become

1/(2(w + 1)) - μ = 0
1/3 + λ_2 - μ = 0
λ_1 = 0
x_2 = 0
λ_2 ≥ 0
x_1 = w
μ ≥ 0

and

μ = 1/(2(w + 1)) > 0
λ_2 = 1/(2(w + 1)) - 1/3 = (3 - 2w - 2)/(6(w + 1)) = (1 - 2w)/(6(w + 1))
λ_1 = 0
x_2 = 0
x_1 = w

with λ_2 = (1 - 2w)/(6(w + 1)) = 0 if w = 1/2, and λ_2 = (1 - 2w)/(6(w + 1)) > 0 if w ∈ (0, 1/2).
Summarizing, the unique solution x^* to the maximization problem is:

if w ∈ (0, 1/2), then x_1^* = w, λ_1^* = 0 and x_2^* = 0, λ_2^* > 0;
if w = 1/2, then x_1^* = w, λ_1^* = 0 and x_2^* = 0, λ_2^* = 0;
if w ∈ (1/2, +∞), then x_1^* = (3w + 1)/5 > 0, λ_1^* = 0 and x_2^* = (2w - 1)/5 > 0, λ_2^* = 0.
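The interior-conjecture computation in step 8 can also be verified symbolically. The following sympy sketch (assuming w > 1/2, so that the interior case applies) solves the three binding equations and recovers the closed-form solution above.

```python
import sympy as sp

x1, x2, mu, w = sp.symbols('x1 x2 mu w', positive=True)
# Interior conjecture: x1, x2 > 0 (so lambda1 = lambda2 = 0), budget binding (mu > 0).
sol = sp.solve(
    [sp.Rational(1, 2) / (x1 + 1) - mu,   # dL/dx1 = 0
     sp.Rational(1, 3) / (x2 + 1) - mu,   # dL/dx2 = 0
     w - x1 - x2],                        # binding third constraint
    [x1, x2, mu], dict=True)[0]
# x1 = (3*w + 1)/5, x2 = (2*w - 1)/5, mu = 5/(6*(w + 2)) (up to equivalent forms)
print(sp.simplify(sol[x1]), sp.simplify(sol[x2]), sp.simplify(sol[mu]))
```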
The graph of x_1^* as a function of w is presented below (please, complete the picture).

[Figure: x_1^* as a function of w, for w ∈ [0, 2].]

The graph below shows the constraint sets for w = 1/4, 1/2, 1 and some significant level curves of the objective function.

[Figure: constraint sets and level curves in the (x_1, x_2) plane.]
Observe that in the example we get that if λ_2^* = 0, the associated constraint x_2 ≥ 0 is not significant. See Subsection 19.6.2 for a discussion of that statement.
Of course, several problems may arise in applying the above procedure. Below, we describe some commonly encountered problems and some possible (partial) solutions.
19.4.1 Some problems and some solutions
1. The set X.
X is not open.
Rewrite the problem in terms of an open set X' and some added constraints. A standard example is the following one:

max_{x ∈ R^n_+} f(x) s.t. g(x) ≥ 0,

which can be rewritten as

max_{x ∈ R^n} f(x) s.t. g(x) ≥ 0,
x ≥ 0.
2. Existence.
a. The constraint set is not compact. If the constraint set is not compact, it is sometimes possible to find another maximization problem such that
i. its constraint set is compact and nonempty, and
ii. its solution set is contained in the solution set of the problem we are analyzing.
A way to try to achieve both i. and ii. above is to restrict the constraint set (to make it compact) without eliminating the solutions of the original problem. Sometimes a problem with the above properties is the following one:

max_{x ∈ X} f(x) s.t. g(x) ≥ 0,   (P1)
f(x) - f(x̂) ≥ 0,

where x̂ is an element of X such that g(x̂) ≥ 0.
Observe that (P cons) is an example of (P1).
In fact, while condition i. above, i.e., the compactness of the constraint set of (P1), depends upon the specific characteristics of X, f and g, condition ii. above is satisfied by problem (P1), as shown in detail below.
Define

M := arg max (P),  M_1 := arg max (P1),

and V and V_1 the constraint sets of Problems (P) and (P1), respectively. Observe that

V_1 ⊆ V.   (19.31)

If V_1 is compact, then M_1 ≠ ∅, and the only thing left to show is that M_1 ⊆ M, which is always insured, as proved below.
Proposition 733 M_1 ⊆ M.
Proof. If M_1 = ∅, we are done.
Suppose that M_1 ≠ ∅, and that the conclusion of the Proposition is false, i.e., there exists x^1 ∈ M_1 such that
a. x^1 ∈ M_1, and b. x^1 ∉ M, or
a. ∀x ∈ X such that g(x) ≥ 0 and f(x) ≥ f(x̂), we have f(x^1) ≥ f(x);
and
b. either i. x^1 ∉ V,
or ii. ∃ x̃ ∈ X such that

g(x̃) ≥ 0   (19.32)

and

f(x̃) > f(x^1).   (19.33)

Let's show that i. and ii. cannot hold.
i. It cannot hold simply because V_1 ⊆ V, from (19.31).
ii. Since x^1 ∈ V_1,

f(x^1) ≥ f(x̂).   (19.34)

From (19.33) and (19.34), it follows that

f(x̃) > f(x̂).   (19.35)

But (19.32), (19.35) and (19.33) contradict the definition of x^1, i.e., a. above.
b. Existence without the Extreme Value Theorem. If you are not able to show existence, but
i. sufficient conditions to apply Kuhn-Tucker conditions hold, and
ii. you are able to find a solution to the Kuhn-Tucker conditions,
then a solution exists.
19.5 The Implicit Function Theorem and Comparative Statics Analysis
The Implicit Function Theorem can be used to study how solutions (x ∈ X ⊆ R^n) to maximization problems and, if needed, the associated Lagrange or Kuhn-Tucker multipliers (λ ∈ R^m) change when parameters (π ∈ Π ⊆ R^k) change. That analysis can be done if the solutions to the maximization problem (and the multipliers) are solutions to a system of equations of the form

F_1(x, π) = 0,

with (# choice variables) = (dimension of the codomain of F_1), or

F_2(ψ, π) = 0,

where ψ := (x, λ), and (# choice variables and multipliers) = (dimension of the codomain of F_2).
To apply the Implicit Function Theorem, it must be the case that the following conditions do hold.
1. (# choice variables x) = (dimension of the codomain of F_1), or
(# choice variables and multipliers) = (dimension of the codomain of F_2).
2. F_i has to be at least C^1. That condition is insured if the above systems are obtained from maximization problems characterized by functions f, g which are at least C^2: usually the above systems contain some form of first order conditions, which are written using first derivatives of f and g.
3. F_1(x^*, π^0) = 0 or F_2(ψ^*, π^0) = 0. The existence of a solution to the system is usually the result of the strategy describing how to solve a maximization problem - see Section 19.4 above.
4. det [D_x F_1(x^*, π^0)]_{n×n} ≠ 0 or det [D_ψ F_2(ψ^*, π^0)]_{(n+m)×(n+m)} ≠ 0. That condition has to be verified directly on the problem.
If the above conditions are verified, the Implicit Function Theorem allows us to conclude what follows (in reference to F_2).
There exist an open neighborhood N(ψ^*) ⊆ X × R^m of ψ^*, an open neighborhood N(π^0) of π^0 and a unique C^1 function g : N(π^0) ⊆ R^k → N(ψ^*) ⊆ X × R^m such that

∀π ∈ N(π^0), F(g(π), π) = 0, and

Dg(π) = - [ D_ψ F(ψ, π)|_{ψ = g(π)} ]^{-1} · [ D_π F(ψ, π)|_{ψ = g(π)} ].

Therefore, using the above expression, we may be able to say if an increase in the value of any parameter implies an increase in the value of any choice variable (or multiplier).
Three significant cases of application of the above procedure are presented below. We are going to consider C^2 functions defined on open subsets of Euclidean spaces.
19.5.1 Maximization problem without constraints
Assume that the problem to study is

max_{x ∈ X} f(x, π)

and that
1. f is concave;
2. there exists a solution x^* to the above problem associated with π^0.
Then, from Proposition 674, we know that x^* is a solution to

D_x f(x, π^0) = 0.

Therefore, we can try to apply the Implicit Function Theorem to

F_1(x, π) = D_x f(x, π).

An example of application of the strategy illustrated above is presented in Section 20.3.
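A minimal symbolic illustration of this unconstrained case (with a toy objective chosen for this sketch only, not taken from the notes): for f(x, θ) = θx - x^4/4, the first order condition is θ - x^3 = 0, and the Implicit Function Theorem formula reproduces the derivative of the explicit solution x(θ) = θ^{1/3}.

```python
import sympy as sp

x, th = sp.symbols('x theta', positive=True)
f = th*x - x**4/4                  # strictly concave in x for x > 0
F1 = sp.diff(f, x)                 # FOC: theta - x**3 = 0, so x(theta) = theta**(1/3)
dx_dth = -sp.diff(F1, th) / sp.diff(F1, x)   # Implicit Function Theorem formula
print(sp.simplify(dx_dth))         # 1/(3*x**2); at x = theta**(1/3), this is theta**(-2/3)/3
print(sp.diff(th**sp.Rational(1, 3), th))    # direct check: theta**(-2/3)/3
```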
19.5.2 Maximization problem with equality constraints
Consider a maximization problem

max_{x ∈ X} f(x, π) s.t. g(x, π) = 0.

Assume that necessary and sufficient conditions to apply the Lagrange Theorem hold and that there exists a vector (x^*, λ^*) which is a solution (not necessarily unique) associated with the parameter π^0. Therefore, we can try to apply the Implicit Function Theorem to

F_2(ψ, π) = [ Df(x, π) + λ^T Dg(x, π) ; g(x, π) ].   (19.36)
19.5.3 Maximization problem with inequality constraints
Consider the following maximization problem with inequality constraints. For given π,

max_{x ∈ X} f(x, π) s.t. g(x, π) ≥ 0.   (19.37)

Moreover, assume that the set of solutions of that problem is nonempty and characterized by the set of solutions of the associated Kuhn-Tucker system, i.e., using the notation of Subsection 19.1, M = S ≠ ∅.
We have seen that we can write the Kuhn-Tucker conditions in one of the two following ways, beside some other ones:

Df(x, π) + λ^T Dg(x, π) = 0   (1)
λ ≥ 0   (2)
g(x, π) ≥ 0   (3)
λ · g(x, π) = 0   (4)
(19.38)

or

Df(x, π) + λ^T Dg(x, π) = 0   (1)
min{λ_j, g_j(x, π)} = 0 for j ∈ {1, ..., m}   (2)
(19.39)
The Implicit Function Theorem cannot be applied to either system (19.38) or system (19.39): system (19.38) contains inequalities; system (19.39) involves functions which are not differentiable. We present below conditions under which the Implicit Function Theorem can anyway be applied, so as to make comparative statics analysis possible. Take a solution (x^*, λ^*, π^0) to the above system(s). Assume that

for each j, either λ_j^* > 0 or g_j(x^*, π^0) > 0.

In other words, there is no j such that λ_j^* = g_j(x^*, π^0) = 0. Consider the partition J^*, Ĵ of {1, ..., m}, and the resulting Kuhn-Tucker conditions:

Df(x^*, π^0) + (λ^*)^T Dg(x^*, π^0) = 0
λ_j^* > 0 for j ∈ J^*
g_j(x^*, π^0) = 0 for j ∈ J^*
λ_j^* = 0 for j ∈ Ĵ
g_j(x^*, π^0) > 0 for j ∈ Ĵ
(19.40)
Define

g^*(x^*, π^0) := (g_j(x^*, π^0))_{j ∈ J^*},   ĝ(x^*, π^0) := (g_j(x^*, π^0))_{j ∈ Ĵ},
λ^* := (λ_j)_{j ∈ J^*},   λ̂ := (λ_j)_{j ∈ Ĵ}.

Write the system of equations obtained from system (19.40) by eliminating the strict inequality constraints and substituting in the zero variables:

Df(x^*, π^0) + (λ^*)^T Dg^*(x^*, π^0) = 0
g^*(x^*, π^0) = 0
(19.41)

Observe that the number of equations is equal to the number of remaining unknowns, and they are

n + #J^*,

i.e., Condition 1 presented at the beginning of the present Section 19.5 is satisfied. Assume that the needed rank condition does hold; we can therefore apply the Implicit Function Theorem to

F_2(ψ, π) = [ Df(x^*, π^0) + (λ^*)^T Dg^*(x^*, π^0) ; g^*(x^*, π^0) ] = 0.
Then we can conclude that there exists a unique C^1 function ψ defined in an open neighborhood N_1 of π^0 such that ∀π ∈ N_1,

ψ(π) := (x^*(π), λ^*(π))

is a solution to system (19.41) at π.
Therefore, by definition of ψ,

Df(x^*(π), π) + (λ^*(π))^T Dg^*(x^*(π), π) = 0
g^*(x^*(π), π) = 0
(19.42)

Since ψ is continuous and λ^*(π^0) > 0 and ĝ(x^*(π^0), π^0) > 0, there exists an open neighborhood N_2 ⊆ N_1 of π^0 such that ∀π ∈ N_2,

λ^*(π) > 0,
ĝ(x(π), π) > 0.
(19.43)

Take also, ∀π ∈ N_2,

λ̂(π) = 0.   (19.44)

Then, systems (19.42), (19.43) and (19.44) say that ∀π ∈ N_2, (x(π), λ^*(π), λ̂(π)) satisfies the Kuhn-Tucker conditions for problem (19.37), and therefore, since M = S, they are solutions to the maximization problem.
The above conclusion does not hold true if the Kuhn-Tucker conditions are of the following form:

Df(x, π) + λ^T Dg(x, π) = 0
λ_j = 0, g_j(x, π) = 0 for j ∈ J'
λ_j > 0, g_j(x, π) = 0 for j ∈ J''
λ_j = 0, g_j(x, π) > 0 for j ∈ Ĵ
(19.45)

where J' ≠ ∅, and {J', J'', Ĵ} is a partition of J.
In that case, applying the same procedure described above, i.e., eliminating the strict inequality constraints and substituting in the zero variables, leads to the following system in the unknowns x ∈ R^n and (λ_j)_{j ∈ J''} ∈ R^{#J''}:

Df(x, π) + ((λ_j)_{j ∈ J''})^T D(g_j)_{j ∈ J''}(x, π) = 0
g_j(x, π) = 0 for j ∈ J'
g_j(x, π) = 0 for j ∈ J''

and therefore the number of equations is n + #J'' + #J' > n + #J'', simply because we are considering the case J' ≠ ∅. Therefore the crucial condition

(# choice variables and multipliers) = (dimension of the codomain of F_2)

is violated.
Even if the Implicit Function Theorem could be applied to the equations contained in (19.45), in an open neighborhood of π^0 we could have

λ_j(π) < 0 and/or g_j(x(π), π) < 0 for j ∈ J'.

Then ψ(π) would be a solution to a set of equations and inequalities which are not the Kuhn-Tucker conditions of the maximization problem under analysis, and therefore x(π) would not be a solution to that maximization problem.
An example of application of the strategy illustrated above is presented in Section 20.1.
19.6 The Envelope Theorem and the meaning of multipliers
19.6.1 The Envelope Theorem
Consider the problem (P): for given π,

max_{x ∈ X} f(x, π) s.t. g(x, π) = 0.

Assume that for every π, the above problem admits a unique solution characterized by Lagrange conditions and that the Implicit Function Theorem can be applied. Then, there exists an open set O ⊆ Π such that

x : O → X, x : π ↦ arg max (P),
v : O → R, v : π ↦ max (P), and
λ : O → R^m, π ↦ the unique Lagrange multiplier vector

are differentiable functions.

Theorem 734 For any π^* ∈ O and for any pair of associated (x^*, λ^*) := (x(π^*), λ(π^*)), we have

D_π v(π^*) = D_π L(x^*, λ^*, π^*),

i.e.,

D_π v(π^*) = D_π f(x^*, π^*) + (λ^*)^T D_π g(x^*, π^*).
Remark 735 Observe that the above analysis applies also to the case of inequality constraints, as
long as the set of binding constraints does not change.
Proof of Theorem 734. By definition of v(·) and x(·), we have that

∀π ∈ O, v(π) = f(x(π), π).   (1)

Consider an arbitrary value π^* and the unique associated solution x^* = x(π^*) of problem (P). Differentiating both sides of (1) with respect to π and computing at π^*, we get

[D_π v(π^*)]_{1×k} = [D_x f(x, π)|_{(x^*, π^*)}]_{1×n} · [D_π x(π)|_{π = π^*}]_{n×k} + [D_π f(x, π)|_{(x^*, π^*)}]_{1×k}.   (2)

From the Lagrange conditions,

D_x f(x, π)|_{(x^*, π^*)} = -(λ^*)^T D_x g(x, π)|_{(x^*, π^*)},   (3)

where λ^* is the unique value of the Lagrange multiplier. Moreover,

∀π ∈ O, g(x(π), π) = 0.   (4)

Differentiating both sides of (4) with respect to π and computing at π^*, we get

[D_x g(x, π)|_{(x^*, π^*)}]_{m×n} · [D_π x(π)|_{π = π^*}]_{n×k} + [D_π g(x, π)|_{(x^*, π^*)}]_{m×k} = 0.   (5)

Finally, using (2) and (3),

[D_π v(π^*)]_{1×k} = -(λ^*)^T D_x g(x, π)|_{(x^*, π^*)} · D_π x(π)|_{π = π^*} + D_π f(x, π)|_{(x^*, π^*)},

and, using (5),

[D_π v(π^*)]_{1×k} = D_π f(x, π)|_{(x^*, π^*)} + (λ^*)^T D_π g(x, π)|_{(x^*, π^*)}.
19.6.2 On the meaning of the multipliers
The main goal of this subsection is to try to formalize the following statements.
1. The fact that λ_j^* = 0 indicates that the associated constraint g_j(x) ≥ 0 is not significant - see Proposition 736 below.
2. The fact that λ_j^* > 0 indicates that a way to increase the value of the objective function is to violate the associated constraint g_j(x) ≥ 0 - see Proposition 737 below.
For simplicity, consider the case m = 1. Let (CP) be the problem

max_{x ∈ X} f(x) s.t. g(x) ≥ 0

and (UP) the problem

max_{x ∈ X} f(x),

with f strictly quasi-concave, g quasi-concave, and such that solutions to both problems exist. Define x^* := arg max (CP), with associated multiplier λ^*, and x^{**} := arg max (UP).
Proposition 736 If λ^* = 0, then x^* = arg max (UP), i.e., x^* = x^{**}.
Proof. By the assumptions of this section, the solution to (CP) exists, is unique, is equal to x^*, and there exists λ^* such that

Df(x^*) + λ^* Dg(x^*) = 0,
min{g(x^*), λ^*} = 0.

Moreover, the solution to (UP) exists, is unique and is the solution to

Df(x) = 0.

Since λ^* = 0, the desired result follows.
Take ε > 0 and k ∈ (-ε, +ε). Let (CP_k) be the problem

max_{x ∈ X} f(x) s.t. g(x) ≥ k.

Let

x̂ : (-ε, +ε) → X, k ↦ arg max (CP_k),
v̂ : (-ε, +ε) → R, k ↦ max (CP_k) := f(x̂(k)),

and let λ̂(k) be such that (x̂(k), λ̂(k)) is the solution to the associated Kuhn-Tucker conditions.
Observe that

x^* = x̂(0),  λ^* = λ̂(0).   (19.46)

Proposition 737 If λ^* > 0, then v̂'(0) < 0.
Proof. From the Envelope Theorem,

∀k ∈ (-ε, +ε), v̂'(k) = ∂(f(x) + λ(g(x) - k))/∂k |_{(x̂(k), λ̂(k))} = -λ̂(k),

and from (19.46),

v̂'(0) = -λ̂(0) = -λ^* < 0.

Remark 738 Consider the following problem. For given a ∈ R,

max_{x ∈ X} f(x) s.t. g(x) - a ≥ 0.   (19.47)

Assume that the above problem is well-behaved and that x(a) = arg max (19.47), v(a) = f(x(a)) and (x(a), λ(a)) is the solution of the associated Kuhn-Tucker conditions. Then, applying the Envelope Theorem, we have

v'(a) = -λ(a).
Chapter 20
Applications to Economics
20.1 The Walrasian Consumer Problem
The utility function of a household is

u : R^C_{++} → R : x ↦ u(x).

Assumption. u is a C^2 function; u is differentiably strictly increasing, i.e., ∀x ∈ R^C_{++}, Du(x) >> 0; u is differentiably strictly quasi-concave, i.e., ∀x ∈ R^C_{++}, ∀Δx ∈ R^C, if Δx ≠ 0 and Du(x) · Δx = 0, then Δx^T D^2 u(x) Δx < 0; for any ū ∈ R, {x ∈ R^C_{++} : u(x) ≥ ū} is closed in R^C.
The maximization problem for household h is

(P1) max_{x ∈ R^C_{++}} u(x) s.t. px ≤ w.

The budget set of the above problem is clearly not compact. But, in the Appendix, we show that the solutions of (P1) are the same as the solutions of (P2) and (P3) below. Observe that the constraint set of (P3) is compact.

(P2) max_{x ∈ R^C_{++}} u(x) s.t. px - w = 0;
(P3) max_{x ∈ R^C_{++}} u(x) s.t. px ≤ w and u(x) ≥ u(ẽ),

where ẽ ∈ {x ∈ R^C_{++} : px ≤ w}.
Theorem 739 Under the Assumptions (smooth 1-5), x^h(p, w^h) is a C^1 function.
Proof.
Observe that it can be easily shown that x(·) is a function.
We want to show that (P2) satisfies necessary and sufficient conditions to apply the Lagrange Theorem, and then apply the Implicit Function Theorem to the First Order Conditions of that problem.
The necessary condition is satisfied because D_x[px - w] = p ≠ 0.
Define also

λ : R^C_{++} × R_{++} → R, λ : (p, w) ↦ the Lagrange multiplier for (P2).

The sufficient conditions are satisfied because: from Assumption (smooth 4), u is differentiably strictly quasi-concave; the constraint is linear; the Lagrange multiplier is strictly positive - see below.
The Lagrangian function for problem (P2) and the associated First Order Conditions are described below.

L(x, λ, p, w) = u(x) - λ(px - w)

(FOC) (1) Du(x) - λp = 0
(2) -px + w = 0
Define

F : R^C_{++} × R_{++} × R^C_{++} × R_{++} → R^C × R,
F : (x, λ, p, w) ↦ [ Du(x) - λp ; -px + w ].

As an application of the Implicit Function Theorem, it is enough to show that D_{(x,λ)} F(x, λ, p, w) has full row rank (C + 1).
Suppose D_{(x,λ)} F does not have full rank; then there would exist Δx ∈ R^C and Δλ ∈ R such that Δ := (Δx, Δλ) ≠ 0 and D_{(x,λ)} F(x, λ) · Δ = 0, or

[ D^2 u(x)   -p^T ; -p   0 ] · [ Δx ; Δλ ] = 0,

or

(a) D^2 u(x) Δx - p^T Δλ = 0,
(b) -p Δx = 0.

The idea of the proof is to contradict Assumption u3.
Claim 1. Δx ≠ 0.
By assumption it must be Δ ≠ 0, and therefore, if Δx = 0, Δλ ≠ 0. Since p ∈ R^C_{++}, p^T Δλ ≠ 0. Moreover, if Δx = 0, from (a) we would have p^T Δλ = 0, a contradiction.
Claim 2. Du(x) · Δx = 0.
From (FOC1), we have Du(x) · Δx = λ p · Δx; using (b), the desired result follows.
Claim 3. Δx^T D^2 u(x) Δx = 0.
Premultiplying (a) by Δx^T, we get Δx^T D^2 u(x) Δx - Δx^T p^T Δλ = 0. Using (b), the result follows.
Claims 1, 2 and 3 contradict Assumption u3.
The above result also gives a way of computing D_{(p,w)} x(p, w), as an application of the Implicit Function Theorem. The derivatives of the two blocks of F with respect to (x, λ, p, w) are

Du(x) - λp:   D^2 u   -p^T   -λ I_C   0
-px + w:       -p      0     -x^T     1

and therefore

[ D_{(p,w)}(x, λ)(p, w) ]_{(C+1)×(C+1)} = [ D_p x   D_w x ; D_p λ   D_w λ ] =

= [ D^2 u   -p^T ; -p   0 ]^{-1}_{(C+1)×(C+1)} · [ λ I_C   0 ; x^T   -1 ]_{(C+1)×(C+1)}.
To compute the inverse of the above matrix, we can use the following fact about the inverse of a partitioned matrix (see, for example, Goldberger (1963), page 26).
Let A be an n × n nonsingular matrix partitioned as

A = [ E   F ; G   H ],

where E is n_1 × n_1, F is n_1 × n_2, G is n_2 × n_1, H is n_2 × n_2 and n_1 + n_2 = n. Suppose that E and D := H - G E^{-1} F are nonsingular. Then

A^{-1} = [ E^{-1}(I + F D^{-1} G E^{-1})   -E^{-1} F D^{-1} ; -D^{-1} G E^{-1}   D^{-1} ].
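The partitioned-inverse formula can be checked numerically. The following numpy sketch draws a random matrix (nonsingularity of E and of the Schur complement D holds almost surely here, and is simply assumed) and compares the block formula with the direct inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 3, 2
A = rng.normal(size=(n1 + n2, n1 + n2))
E, F = A[:n1, :n1], A[:n1, n1:]
G, H = A[n1:, :n1], A[n1:, n1:]
Ei = np.linalg.inv(E)
D = H - G @ Ei @ F                    # Schur complement, assumed nonsingular
Di = np.linalg.inv(D)
top = np.hstack([Ei @ (np.eye(n1) + F @ Di @ G @ Ei), -Ei @ F @ Di])
bot = np.hstack([-Di @ G @ Ei, Di])
Ainv = np.vstack([top, bot])
print(np.allclose(Ainv, np.linalg.inv(A)))   # True
```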
If we assume that D^2 u is negative definite and therefore invertible, we have

[ D^2 u   -p^T ; -p   0 ]^{-1} = [ (D^2 u)^{-1}(I + (1/γ) p^T p (D^2 u)^{-1})   (1/γ)(D^2 u)^{-1} p^T ; (1/γ) p (D^2 u)^{-1}   1/γ ],

where γ := -p (D^2 u)^{-1} p^T ∈ R_{++}.
Therefore

[D_p x(p, w)]_{C×C} = [ (D^2 u)^{-1}(I + (1/γ) p^T p (D^2 u)^{-1})   (1/γ)(D^2 u)^{-1} p^T ] · [ λ I_C ; x^T ] =

= λ (D^2 u)^{-1} (I_C + (1/γ) p^T p (D^2 u)^{-1}) + (1/γ)(D^2 u)^{-1} p^T x^T,

[D_w x(p, w)]_{C×1} = [ (D^2 u)^{-1}(I + (1/γ) p^T p (D^2 u)^{-1})   (1/γ)(D^2 u)^{-1} p^T ] · [ 0 ; -1 ] = -(1/γ)(D^2 u)^{-1} p^T,

[D_p λ(p, w)]_{1×C} = [ (1/γ) p (D^2 u)^{-1}   1/γ ] · [ λ I_C ; x^T ] = (λ/γ) p (D^2 u)^{-1} + (1/γ) x^T,

[D_w λ(p, w)]_{1×1} = [ (1/γ) p (D^2 u)^{-1}   1/γ ] · [ 0 ; -1 ] = -1/γ.
As a simple application of the Envelope Theorem, we also have that, defining the indirect utility function as

v : R^{C+1}_{++} → R, v : (p, w) ↦ u(x(p, w)),

we have that

D_{(p,w)} v(p, w) = λ · (-x^T, 1).
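As a sketch of this identity in a special case not treated here (Cobb-Douglas utility, for which demand and the multiplier are known in closed form, an assumption of this example), one can verify D_w v = λ and D_{p_1} v = -λ x_1 symbolically:

```python
import sympy as sp

a, p1, p2, w = sp.symbols('a p1 p2 w', positive=True)
x1, x2 = a*w/p1, (1 - a)*w/p2            # Cobb-Douglas Walrasian demand
lam = a/(p1*x1)                           # multiplier from the FOC Du(x) = lam*p
v = a*sp.log(x1) + (1 - a)*sp.log(x2)     # indirect utility
print(sp.simplify(sp.diff(v, w) - lam))        # 0, i.e. D_w v = lam
print(sp.simplify(sp.diff(v, p1) + lam*x1))    # 0, i.e. D_p1 v = -lam*x1
```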
20.2 Production
Definition 740 A production vector (or input-output or netput vector) is a vector y := (y^c)_{c=1}^C ∈ R^C which describes the net outputs of C commodities from a production process. Positive numbers denote outputs, negative numbers denote inputs, zero entries denote commodities neither used nor produced.
Observe that, given the above definition, py is the profit of the firm.
Definition 741 The set of all feasible production vectors is called the production set Y ⊆ R^C. If y ∈ Y, then y can be obtained as a result of the production process; if y ∉ Y, that is not the case.
Definition 742 The Profit Maximization Problem (PMP) is

max_y py s.t. y ∈ Y.

It is convenient to describe the production set Y using a function F : R^C → R called the transformation function. That is done as follows:

Y = {y ∈ R^C : F(y) ≥ 0}.

We list below a smooth version of the assumptions made on Y, using the transformation function.
Some assumptions on F(·):
(1) ∃y ∈ R^C such that F(y) ≥ 0.
(2) F is C^2.
(3) (No Free Lunch) If y ≥ 0 and y ≠ 0, then F(y) < 0.
(4) (Possibility of Inaction) F(0) = 0.
(5) (F is differentiably strictly decreasing) ∀y ∈ R^C, DF(y) << 0.
(6) (Irreversibility) If y ≠ 0 and F(y) ≥ 0, then F(-y) < 0.
(7) (F is differentiably strictly concave) ∀y ∈ R^C and ∀Δ ∈ R^C \ {0}, Δ^T D^2 F(y) Δ < 0.
Definition 743 Consider a function F(·) satisfying the above properties and a strictly positive real number N. The Smooth Profit Maximization Problem (SPMP) is

max_y py s.t. F(y) ≥ 0 and ||y|| ≤ N.   (20.1)
Remark 744 For any solution to the above problem, it must be the case that F(y) = 0. Suppose there exists a solution y^0 to (SPMP) such that F(y^0) > 0. Since F is continuous, in fact C^2, there exists ε > 0 such that z ∈ B(y^0, ε) ⇒ F(z) > 0. Take z^0 = y^0 + (ε/(2√C)) · (1, ..., 1). Then

d(y^0, z^0) := ( Σ_{c=1}^C (ε/(2√C))^2 )^{1/2} = ( C · ε^2/(4C) )^{1/2} = ε/2 < ε.

Therefore z^0 ∈ B(y^0, ε) and

F(z^0) > 0.   (1)

But

p z^0 = p y^0 + (ε/(2√C)) p · (1, ..., 1) > p y^0.   (2)

(1) and (2) contradict the fact that y^0 solves (SPMP).
Proposition 745 Assume a solution y with ||y|| < N to (SPMP) exists. Then y : R^C_{++} → R^C, p ↦ arg max (20.1) is a well defined C^1 function.
Proof.
Let's first show that y(p) is single valued.
Suppose there exist y, y' ∈ y(p) with y ≠ y'. Consider y^α := (1 - α)y + αy'. Since F(·) is strictly concave, it follows that F(y^α) > (1 - α)F(y) + αF(y') ≥ 0, where the last inequality comes from the fact that y, y' ∈ y(p). But then F(y^α) > 0. Then, following the same argument as in Remark 744, there exists ε > 0 such that z^0 = y^α + (ε/(2√C)) · (1, ..., 1) satisfies F(z^0) > 0. But p z^0 > p y^α = (1 - α) py + α py' = py, contradicting the fact that y ∈ y(p).
Let's now show that y is C^1.
From Remark 744 and from the assumption that ||y|| < N, (SPMP) can be rewritten as max_y py s.t. F(y) = 0. We can then try to apply the Lagrange Theorem.
Necessary conditions: DF(y) << 0, so DF(y) ≠ 0;
sufficient conditions: py is linear and therefore pseudo-concave; F(·) is differentiably strictly concave and therefore quasi-concave; the Lagrange multiplier is strictly positive - see below.
Therefore, the solutions to (SPMP) are characterized by the following First Order Conditions, i.e., the derivatives of the Lagrangian function

L(y, λ, p) = py + λ F(y)

with respect to y and λ equated to zero:

p + λ DF(y) = 0,
F(y) = 0.

Observe that λ = -p_1 / D_{y_1} F(y) > 0.
As usual, to show differentiability of the choice function we take derivatives of the First Order Conditions with respect to (y, λ), obtaining the matrix

[ λ D^2 F(y)   [DF(y)]^T ; DF(y)   0 ].

We want to show that the above matrix has full rank. By contradiction, assume that there exists Δ := (Δy, Δλ) ∈ R^C × R, Δ ≠ 0, such that

[ λ D^2 F(y)   [DF(y)]^T ; DF(y)   0 ] · [ Δy ; Δλ ] = 0,

i.e.,

λ D^2 F(y) Δy + [DF(y)]^T Δλ = 0,   (a)
DF(y) Δy = 0.   (b)

Note that Δy ≠ 0: otherwise, from (a) and DF(y) ≠ 0, we would have Δλ = 0 and hence Δ = 0. Premultiplying (a) by Δy^T, we get λ Δy^T D^2 F(y) Δy + Δy^T [DF(y)]^T Δλ = 0. From (b), it follows that λ Δy^T D^2 F(y) Δy = 0, contradicting the differentiably strict concavity of F(·).
(3)
From the Envelope Theorem, we know that if (ȳ, λ̄) is the unique pair of solution-multiplier associated with p̄, we have that

D_p π(p)|_{p̄} = D_p(py)|_{(p̄, ȳ)} + λ̄ D_p F(y)|_{(p̄, ȳ)}.

Since D_p(py)|_{(p̄, ȳ)} = ȳ, D_p F(y) = 0 and, by definition of ȳ, ȳ = y(p̄), we get D_p π(p)|_{p̄} = y(p̄), as desired.
(4)
From (3), we have that D_p y(p) = D^2_p π(p). Since π(·) is convex - see Proposition ?? (2) - the result follows.
(5)
From Proposition ?? (4) and the fact that y(p) is single valued, we know that ∀α ∈ R_{++}, y(αp) - y(p) = 0. Taking derivatives with respect to α, we have D_p y(p)|_{(αp)} · p = 0. For α = 1, the desired result follows.
20.3 The demand for insurance
Consider an individual whose wealth is

W - d with probability π, and
W with probability 1 - π,

where W > 0 and d > 0.
Let the function

u : A → R, u : c ↦ u(c)

be the individual's Bernoulli function.
Assumption 1. ∀c ∈ R, u'(c) > 0 and u''(c) < 0.
Assumption 2. u is bounded above.
An insurance company offers a contract with the following features: the potentially insured individual pays a premium p in each state and receives d if the accident occurs. The (potentially insured) individual can buy a quantity a ∈ R of the contract. In that case, she pays a premium (a·p) in each state and receives a reimbursement (a·d) if the accident occurs. Therefore, if the individual buys a quantity a of the contract, she gets a wealth described as follows:

W_1 := W - d - ap + ad with probability π, and
W_2 := W - ap with probability 1 - π.
(20.2)

Remark 746 It is reasonable to assume that p ∈ (0, d).
Define

U : R → R, U : a ↦ π u(W - d - ap + ad) + (1 - π) u(W - ap).

Then the individual solves the following problem. For given W ∈ R_{++}, d ∈ R_{++}, p ∈ (0, d), π ∈ (0, 1),

max_{a ∈ R} U(a).   (M) (20.3)

To show existence of a solution, we introduce the problem presented below. For given W ∈ R_{++}, d ∈ R_{++}, p ∈ (0, d), π ∈ (0, 1),

max_{a ∈ R} U(a) s.t. U(a) - U(0) ≥ 0.   (M_0)

Defining A^* := arg max (M) and A^*_0 := arg max (M_0), the existence of a solution to (M) follows from the Proposition below.

Proposition 747 1. A^*_0 ⊆ A^*. 2. A^*_0 ≠ ∅.
Proof. Exercise.
To show that the solution is unique, observe that

U'(a) = π u'(W - d + a(d - p)) (d - p) + (1 - π) u'(W - ap) (-p)   (20.4)

and therefore

U''(a) = π u''(W - d + a(d - p)) (d - p)^2 + (1 - π) u''(W - ap) p^2 < 0,

since each term is the product of strictly positive factors and u'' < 0.
Summarizing, the unique solution of problem (M) is the unique solution of the equation

U'(a) = 0.
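For a concrete specification satisfying Assumptions 1 and 2 (the CARA Bernoulli function u(c) = -e^{-c}; both this choice and the parameter values below are illustrative assumptions of this sketch only), the equation U'(a) = 0 can be solved numerically; since U' is strictly decreasing, a bracketing root finder is appropriate.

```python
import numpy as np
from scipy.optimize import brentq

# CARA Bernoulli function u(c) = -exp(-c): u' > 0, u'' < 0, bounded above.
W, d, p, pi = 10.0, 4.0, 1.5, 0.3   # hypothetical values with p in (0, d)

def U_prime(a):
    c1, c2 = W - d + a*(d - p), W - a*p
    return pi*np.exp(-c1)*(d - p) - (1 - pi)*np.exp(-c2)*p

a_star = brentq(U_prime, -10.0, 10.0)   # U' is positive at -10 and negative at 10
print(a_star, U_prime(a_star))
```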
Definition 748 a^* : R_{++} × (0, 1) × (0, d) × R_{++} → R,

a^* : θ := (d, π, p, W) ↦ arg max (M),

and, denoting the parameter set by Θ,

U^* : Θ → R, U^* : θ ↦ π u(W - d + a^*(θ)(d - p)) + (1 - π) u(W - a^*(θ) p).

Proposition 749 The signs of the derivatives of a^* and U^* with respect to θ are presented in the following table¹:

        d                      π                      p                    W
a^*     > 0 if a^* ∈ [0, 1]    > 0                    ≶ 0                  = 0 if a^* = 1
U^*     ≤ 0 if a^* ∈ [0, 1]    ≤ 0 if a^* ∈ [0, 1]    ≤ 0 if a^* ≥ 0       > 0

Proof. Exercise.
¹ Conditions on a^*(θ) contained in the table can be expressed in terms of exogenous variables.
Part V
Problem Sets

Chapter 21
Exercises
21.1 Linear Algebra
1.
Show that the set of pairs of real numbers is not a vector space with respect to the following operations:
(a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b);
(a, b) + (c, d) = (a + c, b) and k(a, b) = (ka, kb).
2.
Show that W is not a vector subspace of R^3 if
W = {(x, y, z) ∈ R^3 : z ≥ 0};
W = {x ∈ R^3 : ||x|| ≤ 1};
W = Q^3.
3.
Let V be the vector space of all functions f : R → R. Show that W is a vector subspace of V if
W = {f ∈ V : f(1) = 0};
W = {f ∈ V : f(1) = f(2)}.
4.
Let V be the vector space of all functions f : R → R. Show that f, g, h defined below are linearly independent:
f(x) = e^{2x}, g(x) = x^2, h(x) = x.
5.
Show that
a. V = {(x_1, x_2, x_3) ∈ R^3 : x_1 + x_2 + x_3 = 0} is a vector subspace of R^3;
b. S = {(1, -1, 0), (0, 1, -1)} is a basis for V.
6.
Show the following fact.
Proposition. Let a matrix A ∈ M(n, n), with n ∈ N \ {0}, be given. The set

C_A := {B ∈ M(n, n) : BA = AB}

is a vector subspace of M(n, n) (with respect to the field R).
7.
Let U and V be vector subspaces of a vector space W. Show that

U + V := {w ∈ W : ∃u ∈ U and ∃v ∈ V such that w = u + v}

is a vector subspace of W.
8.
Show that the following set of vectors is a set of linearly independent vectors:
{(1, 1, 1), (0, 1, 1), (0, 0, 1)}.
9.
Show that
V = {(x_1, x_2, x_3) ∈ R^3 : x_1 - x_2 = 0}
is a vector subspace of R^3, and find a basis for V.
10. Find the change-of-basis matrix from
S = {u_1 = (1, 2), u_2 = (3, 5)}
to
E = {e_1 = (1, 0), e_2 = (0, 1)}.
11. Find the determinant of

C = [ 6 2 1 0 5 ; 2 1 1 2 1 ; 1 1 2 2 3 ; 3 0 2 -3 -1 ; -1 -1 -3 -4 2 ]

12. Say for which values of k ∈ R the following matrix has rank a. 4, b. 3:

A := [ k+1  1  k  2 ; k-1  2-k  k  k ; 1  0  1  1 ]
13.
Say if the following matrices have the same row spaces:

A = [ 1 1 5 ; 2 3 13 ],  B = [ 1 1 2 ; 3 2 3 ],  C = [ 1 -1 -1 ; 4 -3 -1 ; 3 -1 3 ].

14.
Say if the following matrices have the same column spaces:

A = [ 1 3 5 ; 1 4 3 ; 1 1 9 ],  B = [ 1 2 3 ; -2 -3 -4 ; 7 12 17 ].

15.
Diagonalize

A = [ 4 2 ; 3 -1 ].
16. Let

A = [ 2 2 ; 1 3 ].

Find: (a) all eigenvalues of A and the corresponding eigenspaces, (b) an invertible matrix P such that D = P^{-1}AP is diagonal.
17.
Show that similarity between matrices is an equivalence relation.
18.
Let A be a square symmetric matrix with real entries. Show that its eigenvalues are real numbers. Show that if λ_i ≠ λ_j for i ≠ j, then the corresponding eigenvectors are orthogonal.
19.
Given

l_4 : R^4 → R^4, l_4(x_1, x_2, x_3, x_4) = (x_1, x_1 + x_2, x_1 + x_2 + x_3, x_1 + x_2 + x_3 + x_4),

show it is linear, compute the associated matrix, and compute ker l_4 and Im l_4.
20.
Complete the text below.
Proposition. Assume that l ∈ L(V, U) and ker l = {0}. Then, ∀u ∈ Im l, there exists a unique v ∈ V such that l(v) = u.
Proof.
Since ........................, by definition, there exists v ∈ V such that

l(v) = u.   (21.1)

Take v' ∈ V such that l(v') = u. We want to show that

.........................   (21.2)

Observe that

l(v) - l(v') =(a)= .........................   (21.3)

where (a) follows from .........................
Moreover,

l(v) - l(v') =(b)= .........................,   (21.4)

where (b) follows from .........................
Therefore,

l(v - v') = 0,

and, by definition of ker l,

.........................   (21.5)

Since ........................., from (21.5), it follows that

v - v' = 0.
21.
Let the following sets be given:

V = {(x_1, x_2, x_3, x_4) ∈ R^4 : x_1 - x_2 + x_3 - x_4 = 0}

and

W = {(x_1, x_2, x_3, x_4) ∈ R^4 : x_1 + x_2 + x_3 + x_4 = 0}.

If possible, find a basis of V ∩ W.
22.
Say if the following statement is true or false.
Let V and U be vector spaces on R, W a vector subspace of U and l ∈ L(V, U). Then l^{-1}(W) is a vector subspace of V.
23.
Let the following full rank matrices

A = [ a_{11} a_{12} ; a_{21} a_{22} ],  B = [ b_{11} b_{12} ; b_{21} b_{22} ]

be given. Say for which values of k ∈ R the following linear system has solutions:

[ 1  a_{11}  a_{12}  0       0       0 ]
[ 2  a_{21}  a_{22}  0       0       0 ]
[ 3  5       6       b_{11}  b_{12}  0 ]
[ 4  7       8       b_{21}  b_{22}  0 ]
[ 1  a_{11}  a_{12}  0       0       k ]

· (x_1, x_2, x_3, x_4, x_5, x_6)^T = (k, 1, 2, 3, k)^T.
24.
Consider the following Proposition contained in Section 8.1 in the class Notes:
Proposition. ∀v ∈ V,

[l]^u_v [v]_v = [l(v)]_u.   (21.6)

Verify the above equality in the case in which
a. l : R^2 → R^2, (x_1, x_2) ↦ (x_1 + x_2, x_1 - x_2);
b. the basis v of the domain of l is {(1, 0), (0, 1)};
c. the basis u of the codomain of l is {(1, 1), (2, 1)};
d. v = (3, 4).
25.
Complete the following proof.
Proposition. Let
n, m ∈ N \ {0} such that m > n, and
a vector subspace L of R^m such that dim L = n
be given. Then, there exists l ∈ L(R^n, R^m) such that Im l = L.
Proof. Let {v^i}_{i=1}^n be a basis of L ⊆ R^m. Take l ∈ L(R^n, R^m) such that

∀i ∈ {1, ..., n}, l(e^i_n) = v^i,

where e^i_n is the i-th element of the canonical basis of R^n. Such a function does exist and, in fact, it is unique, as a consequence of a Proposition in the Class Notes that we copy below:
..........................................................................................................................
Then, from the Dimension Theorem,

dim Im l = ..............................................

Moreover,

L = ........... {v^i}_{i=1}^n ...............

Summarizing,

L ⊆ Im l, dim L = n and dim Im l ≤ n,

and therefore

dim Im l = n.

Finally, from Proposition ................................. in the class Notes, since L ⊆ Im l, dim L = n and dim Im l = n, we have that Im l = L, as desired.
Proposition ............................. in the class Notes says what follows:
...................................................................................................................................................
26.
Say for which values of the parameter a ∈ R the following system has one, infinitely many or no solutions:

a x_1 + x_2 = 1
x_1 + x_2 = a
2 x_1 + x_2 = 3a
3 x_1 + 2 x_2 = a

27.
Say for which values of k the system below admits one, none or infinitely many solutions:

A(k) x = b(k),

where k ∈ R, and

A(k) = [ 1  0 ; 1-k  2-k ; 1  k ; 1  k-1 ],  b(k) = (k+1, k, 1, 0)^T.
21.2 Some topology in metric spaces
21.2.1 Basic topology in metric spaces
1.
Do Exercise 382: Let d be a metric on a non-empty set X. Show that

d' : X × X → R, d'(x, y) = d(x, y) / (1 + d(x, y))

is a metric on X.
2.
Let X be the set of continuous real valued functions with domain [0, 1] ⊆ R and d(f, g) = ∫_0^1 (f(x) - g(x)) dx, where the integral is the Riemann integral (the one you learned in Calculus 1). Show that (X, d) is not a metric space.
3.
Do Exercise 399 for n = 2: ∀n ∈ N, ∀i ∈ {1, ..., n}, ∀a_i, b_i ∈ R with a_i < b_i,

×_{i=1}^n (a_i, b_i)

is (R^n, d_2) open.
4.
Show the second equality in Remark 407:

∩_{n=1}^{+∞} (-1/n, 1/n) = {0}.

5.
Say¹ if the following set is open or closed:

S := {x ∈ R : ∃n ∈ N \ {0} such that x = (-1)^n · (1/n)}.

6.
Say if the following set is open or closed:

A := ∪_{n=1}^{+∞} [1/n, 10 - 1/n].

¹ You should prove what you say - in all exercises.
7.
Do Exercise 417: show that F(S) = F(S^C).
8.
Do Exercise 418: show that F(S) is a closed set.
9.
Let the metric space (R, d_2) be given. Find Int S, Cl(S), F(S), D(S), Is(S) for S = Q, S = (0, 1) and S = {x ∈ R : ∃n ∈ N such that x = 1/n}.
10.
Show that the following statements are false:
a. Cl(Int S) = S,
b. Int Cl(S) = S.
11.
Given S ⊆ R, say if the following statements are true or false.
a. S is an open bounded interval ⇒ S is an open set;
b. S is an open set ⇒ S is an open bounded interval;
c. x ∈ F(S) ⇒ x ∈ D(S);
d. x ∈ D(S) ⇒ x ∈ F(S).
12.
Using the definition of convergent sequences, show that the following sequences do converge:
(x_n)_{n ∈ N} ⊆ R such that ∀n ∈ N, x_n = 1;
(x_n)_{n ∈ N\{0}} ⊆ R such that ∀n ∈ N \ {0}, x_n = 1/n.
13.
Using Proposition 445, show that [0, 1] is (R, d_2) closed.
14.
Show that a subset of a discrete space, i.e., a metric space with the discrete metric, is compact if and only if it is finite.
15.
Say if the following statement is true: An open set is not compact.
16.
Using the definition of compactness, show the following statement: Any open ball in (R^2, d_2) is not compact.
17.
Complete the following solution of Exercise 480.
We want to show that (R, d_2) is complete.
Let (x_n)_{n ∈ N} be a Cauchy sequence in (R, d_2). From Proposition ..................., (x_n)_n is bounded. From Proposition ..................., (x_n)_{n ∈ N} has a convergent subsequence (x_{n_k})_{k ∈ N} such that x_{n_k} →_k x_0; we want to show that x_n →_n x_0. Since (x_n)_{n ∈ N} is a Cauchy sequence, ∀ε > 0,
..........................................................
Since x_{n_k} →_k x_0,
..........................................................
Then ∀n, n_k > N_3 := max{N_1, N_2},
..........................................................
as desired.
18.
Show that f(A ∪ B) = f(A) ∪ f(B).
19.
Show that f(A ∩ B) ≠ f(A) ∩ f(B).
20.
Using the characterization of continuous functions in terms of open sets, show that for any metric space (X, d) the constant function is continuous.
21.
a. Say if the following sets are (R^n, d_2) compact:
R^n_+;
∀x ∈ R^n and ∀r ∈ R_{++}, B(x, r).
b. Say if the following set is (R, d_2) compact:
{x ∈ R : ∃n ∈ N \ {0} such that x = -1/n}.
22.
Given the continuous functions g : R^n → R^m, show that the following set is closed:
{x ∈ R^n : g(x) ≥ 0}.
23.
Show that the following set is closed:
{(x, y) ∈ R^2 : x ≥ 0, y ≥ 0, x + y ≤ 1}.
24.
Assume that f : R^m → R^n is continuous. Say if
{x ∈ R^m : f(x) = 0}
(a) is closed, (b) is compact.
25.
Using the characterization of continuous functions in terms of open sets, show that the following function is not continuous:

f : R → R, f(x) = { x if x ≠ 0 ; 1 if x = 0 }.
26.
Using the Extreme Value Theorem, say if the following maximization problems have solutions:

max_{x ∈ R^n} Σ_{i=1}^n x_i s.t. ||x|| ≤ 1;
max_{x ∈ R^n} Σ_{i=1}^n x_i s.t. ||x|| < 1;
max_{x ∈ R^n} Σ_{i=1}^n x_i s.t. ||x|| ≥ 1.
21.2.2 Correspondences
To solve the following exercises on correspondences, we need some preliminary definitions.²
A set C ⊆ R^n is convex if ∀x^1, x^2 ∈ C and ∀λ ∈ [0, 1], (1 - λ)x^1 + λx^2 ∈ C.
A set C ⊆ R^n is strictly convex if ∀x^1, x^2 ∈ C and ∀λ ∈ (0, 1), (1 - λ)x^1 + λx^2 ∈ Int C.
Consider an open and convex set X ⊆ R^n and a continuous function f : X → R. f is quasi-concave iff ∀x', x'' ∈ X, ∀λ ∈ [0, 1],

f((1 - λ)x' + λx'') ≥ min{f(x'), f(x'')}.

² The definitions of quasi-concavity and strict quasi-concavity will be studied in detail in Chapter ??.
f is strictly quasi-concave
Definition 750 iff ∀x', x'' ∈ X such that x' ≠ x'', and ∀λ ∈ (0, 1), we have that

f((1 - λ)x' + λx'') > min{f(x'), f(x'')}.

We define the budget correspondence as follows.

Definition 751 β : R^C_{++} × R_{++} →→ R^C, β(p, w) = {x ∈ R^C_+ : px ≤ w}.

The Utility Maximization Problem (UMP) is given next.

Definition 752 max_{x ∈ R^C_+} u(x) s.t. px ≤ w, or x ∈ β(p, w);
ξ : R^C_{++} × R_{++} →→ R^C, ξ(p, w) = arg max (UMP) is the demand correspondence.

The Profit Maximization Problem (PMP) is

max_y py s.t. y ∈ Y.

Definition 753 The supply correspondence is

y : R^C_{++} →→ R^C, y(p) = arg max (PMP).
We can now solve some exercises (the numbering has to be changed).
1.
Show that ξ is non-empty valued.
2.
Show that for every (p, w) ∈ R^C_{++} × R_{++},
(a) if u is quasi-concave, ξ is convex valued;
(b) if u is strictly quasi-concave, ξ is single valued, i.e., it is a function.
3.
Show that ξ is closed.
4.
If a solution to (PMP) exists, show that the following properties hold.
(a) If Y is convex, y(·) is convex valued;
(b) If Y is strictly convex (i.e., ∀λ ∈ (0, 1), y^λ := (1 - λ)y' + λy'' ∈ Int Y), y(·) is single valued.
5.
Consider φ_1, φ_2 : [0, 2] →→ R,

φ_1(x) = { {1 + 0.25x, x^2 - 1} if x ∈ [0, 1) ; [-1, 1] if x = 1 ; {1 + 0.25x, x^2 - 1} if x ∈ (1, 2] },

and

φ_2(x) = { {1 + 0.25x, x^2 - 1} if x ∈ [0, 1) ; [-0.75, 0.25] if x = 1 ; {1 + 0.25x, x^2 - 1} if x ∈ (1, 2] }.

Say if φ_1 and φ_2 are LHC, UHC, closed, convex valued, compact valued.
6.
Consider φ : R_+ →→ R,

φ(x) = { {sin(1/x)} if x > 0 ; [-1, 1] if x = 0 }.

Say if φ is LHC, UHC, closed.
7.
Consider φ : [0, 1] →→ [-1, 1],

φ(x) = { [0, 1] if x ∈ Q ∩ [0, 1] ; [-1, 0] if x ∈ [0, 1] \ Q }.

Say if φ is LHC, UHC, closed.
8.
Consider φ_1, φ_2 : [0, 3] →→ R,

φ_1(x) = [x^2 - 2, x^2],

and

φ_2(x) = [x^2 - 3, x^2 - 1],

φ_3(x) := (φ_1 ∩ φ_2)(x) := φ_1(x) ∩ φ_2(x).

Say if φ_1, φ_2 and φ_3 are LHC, UHC, closed.
21.3 Differential calculus in Euclidean spaces
1.
Using the definition, compute the partial derivatives of the following function in an arbitrary point (x_0, y_0):
f : R^2 → R, f(x, y) = 2x^2 - xy + y^2.
2.
If possible, compute the partial derivatives of the following functions.
a. f(x, y) = x arctan(y/x);
b. f(x, y) = x^y;
c. f(x, y) = (sin(x + y))^{√(x+y)} in (0, 3).
3.
Given the function f : R^2 → R,

f(x, y) = { xy/(x^2 + y^2) if (x, y) ≠ (0, 0) ; 0 otherwise },

show that it admits both partial derivatives in (0, 0) and that it is not continuous in (0, 0).
4.
Using the definition, compute the directional derivative f'((1, 1); (u_1, u_2)), with u_1, u_2 ≠ 0, for f : R^2 → R,

f(x, y) = (x + y) / (x^2 + y^2 + 1).

5.
Using the definition, show that the following function is differentiable: f : R^2 → R,

f(x, y) = x^2 - y^2 + xy.   (21.7)

Comment: this exercise requires some tricky computations. Do not spend too much time on it. Do this exercise after having done Proposition 643.
6.
Using the definition, show that the following functions are differentiable:
a. l ∈ L(R^n, R^m);
b. the projection function f : R^n → R, f : (x_i)_{i=1}^n ↦ x_1.
7.
Show the following proposition.
If ∀j ∈ {1, ..., m}, a_j ∈ R and f_j : R^n → R is differentiable, then

f : R^n → R, f(x) = Σ_{j=1}^m a_j f_j(x)

is differentiable.
7'.
Show the following result, which was used in the proof of Proposition 608: a linear function l : R^n → R^m is continuous.
8.
Compute the Jacobian matrix of f : R^2 → R^3,

f(x, y) = (sin x cos y, sin x sin y, cos x cos y).

9.
Given differentiable functions g, h : R → R and y ∈ R \ {0}, compute the Jacobian matrix of f : R^3 → R^3,

f(x, y, z) = ( g(x) h(z), g(h(x))/y, e^{x g(h(x))} ).
10.
Compute the total derivative and the directional derivative at x^0 in the direction u.
a.
f : R^3_{++} → R, f(x_1, x_2, x_3) = (1/3) log x_1 + (1/6) log x_2 + (1/2) log x_3,
x^0 = (1, 1, 2), u = (1/√3)(1, 1, 1);
b.
f : R^3 → R, f(x_1, x_2, x_3) = x_1^2 + 2x_2^2 - x_3^2 - 2x_1x_2 - 6x_2x_3,
x^0 = (1, 0, 1), u = (1/√2, 0, -1/√2);
c.
f : R^2 → R, f(x_1, x_2) = x_1 e^{x_1 x_2},
x^0 = (0, 0), u = (2, 3).
11.
Given

f(x, y, z) = (x^2 + y^2 + z^2)^{-1/2},

show that if (x, y, z) ≠ 0, then

∂^2 f(x, y, z)/∂x^2 + ∂^2 f(x, y, z)/∂y^2 + ∂^2 f(x, y, z)/∂z^2 = 0.
12.
Given the C^2 functions g, h : R → R_{++}, compute the Jacobian matrix of

f : R^3 → R^3, f(x, y, z) = ( g(x)/h(z), g(h(x)) + xy, ln(g(x) + h(x)) ).

13.
Given the functions

f : R^2 → R^2, f(x, y) = ( e^x + y, e^y + x ),
g : R → R, x ↦ g(x),
h : R → R^2, h(x) = f(x, g(x)),

assume that g is C^2. If possible, compute the differential of h in 0.
14.
Let the following differentiable functions be given:

f : R^3 → R, (x_1, x_2, x_3) ↦ f(x_1, x_2, x_3),
g : R^3 → R, (x_1, x_2, x_3) ↦ g(x_1, x_2, x_3),
a : R^3 → R^3, (x_1, x_2, x_3) ↦ ( f(x_1, x_2, x_3), g(x_1, x_2, x_3), x_1 ),
b : R^3 → R^2, (y_1, y_2, y_3) ↦ ( g(y_1, y_2, y_3), f(y_1, y_2, y_3) ).

Compute the directional derivative of the function b ∘ a in the point (0, 0, 0) in the direction (1, 1, 1).
15.
Using the theorems of Chapter 16, show that the function in (21.7) is differentiable.
16.
Given

f : R^3 → R, (x, y, z) ↦ z + x + y^3 + 2x^2y^2 + 3xyz + z^3 - 9,

say if you can apply the Implicit Function Theorem to the function in (x_0, y_0, z_0) = (1, 1, 1) and, if possible, compute ∂z/∂x and ∂z/∂y in (1, 1, 1).
17.
Using the notation of the statement of the Implicit Function Theorem presented in the Class Notes, say if that Theorem can be applied to the cases described below; if it can be applied, compute the Jacobian of g. (Assume that a solution to the system f(x, t) = 0 does exist.)
a. f : R^4 → R^2,

f(x_1, x_2, t_1, t_2) ↦ ( x_1^2 - x_2^2 + 2t_1 + 3t_2, x_1 x_2 + t_1 t_2 );

b. f : R^4 → R^2,

f(x_1, x_2, t_1, t_2) ↦ ( 2x_1 x_2 + t_1 + t_2^2, x_1^2 + x_2^2 + t_1^2 - 2t_1 t_2 + t_2^2 );

c. f : R^4 → R^2,

f(x_1, x_2, t_1, t_2) ↦ ( t_1^2 - t_2^2 + 2x_1 + 3x_2, t_1 t_2 + x_1 x_2 ).
18.
Say under which conditions, if z^3 - xz - y = 0, then

∂^2 z/∂x∂y = -(3z^2 + x) / (3z^2 - x)^3.

19. Do Exercise 658: Given the utility function u : R^2_{++} → R_{++}, (x, y) ↦ u(x, y), satisfying the following properties: i. u is C^2; ii. ∀(x, y) ∈ R^2_{++}, Du(x, y) >> 0; iii. ∀(x, y) ∈ R^2_{++}, D_{xx}u(x, y) < 0, D_{yy}u(x, y) < 0, D_{xy}u(x, y) > 0; compute the Marginal Rate of Substitution in (x_0, y_0) and say if the graph of each indifference curve is concave.
21.4 Nonlinear programming
1.
Determine, if possible, the nonnegative parameter values for which the following functions f : X → R, f : (x_i)_{i=1}^n := x ↦ f(x) are concave, pseudo-concave, quasi-concave, strictly concave.
(a) X = R_{++}, f(x) = x^α;
(b) X = R^n_{++}, n ≥ 2, f(x) = Σ_{i=1}^n α_i (x_i)^{β_i} (for pseudo-concavity and quasi-concavity consider only the case n = 2);
(c) X = R, f(x) = min{α, x - β}.
2.
a. Discuss the following problem. For given α ∈ (0, 1), a ∈ (0, +∞),

max_{(x,y)} α u(x) + (1 - α) u(y)  s.t.  y ≤ a - (1/2)x,
y ≤ 2a - 2x,
x ≥ 0,
y ≥ 0,

where u : R → R is a C^2 function such that ∀z ∈ R, u'(z) > 0 and u''(z) < 0.
b. Say if there exist values of (α, a) such that (x, y, λ_1, λ_2, λ_3, λ_4) = ((2/3)a, (2/3)a, λ_1, 0, 0, 0), with λ_1 > 0, is a solution to the Kuhn-Tucker conditions, where λ_j is the multiplier associated with constraint j ∈ {1, 2, 3, 4}.
c. Assuming that
the first and fourth constraints hold with strict inequality, the second constraint holds as an equality and the associated multiplier is positive,
describe in detail how to compute the effect of a change of a or α on the solution of the problem.
3.
a. Discuss the following problem. For given α ∈ (0, 1), w_1, w_2 ∈ R_{++},

max_{(x,y,m) ∈ R^2_{++} × R} α log x + (1 - α) log y  s.t.  w_1 - m - x ≥ 0,
w_2 + m - y ≥ 0.

b. Compute the effect of a change of w_1 on the component x^* of the solution.
c. Compute the effect of a change of α on the objective function computed at the solution of the problem.
4.
a. Discuss the following problem.

min_{(x,y) ∈ R^2} x^2 + y^2 - 4x - 6y  s.t.  x + y ≤ 6,
y ≤ 2,
x ≥ 0,
y ≥ 0.

b. Let (x^*, y^*) be a solution to the problem. Can it be x^* = 1?
c. Can it be (x^*, y^*) = (2, 2)?
5.
Discuss the problem

max_{(x,y,z) ∈ R^3} -x^2 - 2y^2 - 3z^2 + 2x  s.t.  x + y + z ≤ 2,
x ≥ 0.
6.
Discuss and try to solve the following problems.
(a)

max_{(x,y) ∈ R_{++} × (-2, +∞)} 3 log x + 2 log(2 + y)  s.t.  y ≥ 0 (λ_y),
x ≥ 1 (λ_x),
x + y ≤ 10 (μ),
150x + 200y ≤ 1550 (ν),

where λ_y, λ_x, μ, ν are the multipliers associated with the corresponding constraints. (Hint: the solution is such that x > 1, y > 0, λ_y = λ_x = μ = 0, ν > 0.)
(b)

max_{x := (x_i)_{i=1}^n ∈ R^n} Σ_{i=1}^n x_i  s.t.  Σ_{i=1}^n α_i x_i^2 ≤ 1,

where for i = 1, ..., n, α_i ∈ (0, 1).
(c)

max_{(x,y,z) ∈ R^3_{++}} log x + log y + log z  s.t.  4x^2 + y^2 + z^2 ≤ 100,
x ≥ 1,
y ≥ 1,
z ≥ 1.
7.
Characterize the solutions to the following problems.
(a) (consumption-investment)

max_{(c_1, c_2, k) ∈ R^3} u(c_1) + δ u(c_2)
s.t. c_1 + k ≤ e,
c_2 ≤ f(k),
c_1, c_2, k ≥ 0,

where u : R → R, u' > 0, u'' < 0; f : R_+ → R, f' > 0, f'' < 0 and such that f(0) = 0; δ ∈ (0, 1), e ∈ R_{++}. After having written the Kuhn-Tucker conditions, consider just the case in which c_1, c_2, k > 0.
(b) (labor-leisure)

max_{(x, l) ∈ R^2} u(x, l)
s.t. px + wl = w l̄,
l ≤ l̄,
x, l ≥ 0,

where u : R^2 → R is C^2, ∀(x, l), Du(x, l) >> 0, u is differentiably strictly quasi-concave, i.e., ∀(x, l), if Δ ≠ 0 and Du(x, l) · Δ = 0, then Δ^T D^2 u(x, l) Δ < 0; p > 0, w > 0 and l̄ > 0.
8.
(a) Consider the model described in Exercise 7(a). What would be the effect on consumption (c_1, c_2) of an increase in the initial endowment e?
What would be the effect on the value of the objective function computed at the solution of the problem of an increase in the initial endowment e?
Assume that f(k) = a k^α, with a ∈ R_{++} and α ∈ (0, 1). What would be the effect on consumption (c_1, c_2) of an increase in a?
(b) Consider the model described in Exercise 7(b). What would be the effect on leisure l of an increase in the wage rate w? In the price level p?
What would be the effect on the value of the objective function computed at the solution of the problem of an increase in the wage rate w? In the price level p?
Chapter 22
Solutions
22.1 Linear Algebra
1.
Show that the set of pairs of real numbers is not a vector space with respect to the following operations:
(a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b);
(a, b) + (c, d) = (a + c, b) and k(a, b) = (ka, kb).
Solution:
To show this, we employ the definition of a vector space (Definition 129 on page 43 of the handouts). We proceed by counterexample.
a. (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b).
The property M2 (Distributive) is violated. Let us show this: ((k + n)a, b) = (k + n)(a, b) ≠ k(a, b) + n(a, b) = (ka, b) + (na, b) = ((k + n)a, 2b).
Otherwise, it can easily be done by using Proposition 132: for 0 ∈ K (R in our case) and u ∈ V, 0u = 0. But 0(a, b) = (0, b) ≠ 0, so the operations cannot be part of a vector space.
b. (a, b) + (c, d) = (a + c, b) and k(a, b) = (ka, kb).
The property A4 (Commutative) is violated. Take u = (a, b) and v = (c, d) with b ≠ d; then u + v = (a + c, b) and v + u = (c + a, d), therefore u + v ≠ v + u.
2.
Show that W is not a vector subspace of R^3 if
W = {(x, y, z) ∈ R^3 : z ≥ 0};
W = {x ∈ R^3 : ||x|| ≤ 1};
W = Q^3.
Solution:
In the first case, take any w ∈ W with z > 0, and α < 0; then αw has its last coordinate negative, but it shouldn't.
In the second case, take any nonzero w ∈ W and define α = 2/||w||; note that ||αw|| = 2 > 1, so αw ∉ W, but it should belong to W by the definition of a subspace.
In the last case, multiplication of any nonzero element of Q^3 by a scalar α ∈ R \ Q will give an element of R^3 \ Q^3 instead of Q^3, as it should if W were a subspace.
3.
Let V be the vector space of all functions f : R → R. Show that W is a vector subspace of V if
W = {f ∈ V : f(1) = 0};
W = {f ∈ V : f(1) = f(2)}.
Solution:
Note that f_0 ≡ 0 ∈ V belongs to both W's, so they are nonempty. Checking the other conditions is equally straightforward.
First case: assume two functions g(·), h(·) ∈ V such that g(1) = h(1) = 0, so g(·), h(·) ∈ W, and a scalar k ∈ R. The sum satisfies g(1) + h(1) = 0 + 0 = 0, so g(·) + h(·) ∈ W. Similarly, the scalar product satisfies k·g(1) = k·0 = 0, so k·g(·) ∈ W. Therefore, as the sum of any two vectors of the set belongs to the set and the scalar product of any vector of the set and any scalar belongs to the set (that is, W is closed under addition and scalar multiplication), we can conclude that W is a subspace.
Second case: assume two functions g(·), h(·) ∈ V such that g(1) = g(2) and h(1) = h(2), so g(·), h(·) ∈ W, and a scalar k ∈ R. The sum satisfies g(1) + h(1) = g(2) + h(2), so g(·) + h(·) ∈ W. Similarly, the scalar product satisfies k·g(1) = k·g(2), so k·g(·) ∈ W. Therefore, W is closed under addition and scalar multiplication, and we can conclude that W is a subspace.
4.
Let V be the vector space of all functions f : R → R. Show that f, g, h defined below are linearly independent:
f(x) = e^{2x}, g(x) = x^2, h(x) = x.
Solution:
By Definition 158 of linear independence, we need to show that if α_1 e^{2x} + α_2 x^2 + α_3 x ≡ 0, then α_i = 0 for all i. It is obvious that the functions mentioned above are pairwise linearly independent, but that is not enough to show joint linear independence. We also need to prove that the sum above cannot be identically zero with some α_i ≠ 0. Without loss of generality, suppose α_1 > 0; then α_1 e^{2x} > 0 for all x. Hence we would need α_2 x^2 + α_3 x < 0 for all x; however, this cannot hold, because α_1 e^{2x} grows faster than |α_2 x^2 + α_3 x| and we can always find a large enough x. To sum up, the combination cannot be identically zero with nonzero coefficients. Therefore the functions e^{2x}, x^2 and x are linearly independent.
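A small numerical cross-check of the argument (a sketch, not a proof): if the three functions were linearly dependent, the same coefficients would solve the linear system obtained by evaluating the combination at any sample points; a full-rank sample matrix therefore forces the trivial solution.

```python
import numpy as np

# Evaluate a1*exp(2x) + a2*x**2 + a3*x = 0 at three sample points;
# a full-rank coefficient matrix forces a1 = a2 = a3 = 0.
xs = np.array([0.0, 1.0, 2.0])
M = np.column_stack([np.exp(2*xs), xs**2, xs])
print(np.linalg.matrix_rank(M))   # 3, so the only solution is the trivial one
```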
5.
Show that
a. V = {(x_1, x_2, x_3) ∈ R^3 : x_1 + x_2 + x_3 = 0} is a vector subspace of R^3;
b. S = {(1, -1, 0), (0, 1, -1)} is a basis for V.
Solution:
The first part is trivial; nothing changes in comparison with Exercise 3. For the second part, note that V ≠ R^3 and S ⊆ V. See Example 182 from the notes, or simply observe that the vector (1, 1, 1) ∉ V. According to Proposition 180, dim V < 3. Note, however, that (1, -1, 0) and (0, 1, -1) are independent. Thus by Proposition 170, dim V = 2 and S is a basis for V.
6.
Show the following fact.
Proposition. Let a matrix A ∈ M(n, n), with n ∈ N \ {0}, be given. The set

C_A := {B ∈ M(n, n) : BA = AB}

is a vector subspace of M(n, n) (with respect to the field R).
Solution:
1. 0 ∈ M(n, n) and A0 = 0A = 0, so 0 ∈ C_A.
2. ∀α, β ∈ R and ∀B, B' ∈ C_A,

(αB + βB')A = αBA + βB'A = αAB + βAB' = A(αB + βB'),

so αB + βB' ∈ C_A.
7.
Show that the sum of vector subspaces is a vector subspace.
Solution:
See Smith (1992), page 24. Let U and V be vector subspaces of a vector space W. We need to show:
i. U + V ≠ ∅, i.e., 0 ∈ U + V;
ii. ∀α, β ∈ F and ∀w^1, w^2 ∈ U + V, αw^1 + βw^2 ∈ U + V.
i. The first point is trivial: 0 ∈ U + V, because 0 ∈ U and 0 ∈ V.
ii. Take α, β ∈ F and w^1, w^2 ∈ U + V. Then there exist u^1, u^2 ∈ U and v^1, v^2 ∈ V such that w^1 = u^1 + v^1 and w^2 = u^2 + v^2. Therefore,

αw^1 + βw^2 = α(u^1 + v^1) + β(u^2 + v^2) = (αu^1 + βu^2) + (αv^1 + βv^2) ∈ U + V,

because U and V are vector subspaces and therefore αu^1 + βu^2 ∈ U and αv^1 + βv^2 ∈ V.
8.
Show that the following set of vectors is a set of linearly independent vectors:
{(1, 1, 1), (0, 1, 1), (0, 0, 1)}.
Solution:
We want to show that if Σ_{i=1}^3 α_i v^i = 0, then α_i = 0 for all i. Note that

Σ_{i=1}^3 α_i v^i = (α_1, α_1 + α_2, α_1 + α_2 + α_3) = 0;

solving recursively, we get that α_i = 0 indeed.
Show that
V =

(x
1
, x
2
, x
3
) R
3
: x
1
x
2
= 0

is a vector subspace of R
3
,and nd a basis for V .
Solution:
The rst part is trivial. For the second part note that V 6= R
3
as (0, 1, 0)
0
6= V , therefore
dim(V ) < 3. Note however, that u
1
= (1, 1, 0)
0
and u
2
= (0, 0, 1)
0
are independent elements of V .
Therefore dim(V ) = 2, and so u
1
, u
2
form its basis (point 2), Proposition 170.
10. Find the change-of-basis matrix from
S = {u_1 = (1, 2), u_2 = (3, 5)}
to
E = {e_1 = (1, 0), e_2 = (0, 1)}.
Solution:
Note that u_1 = 1e_1 + 2e_2 and u_2 = 3e_1 + 5e_2, or e_1 = -5u_1 + 2u_2 and e_2 = 3u_1 - 1u_2. To get P, according to Definition 195 in the lecture notes (e = uP), write the coefficients at u_1 and u_2 as columns of P, to get

P = [ -5  3 ; 2  -1 ].
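A quick numpy check of the result (the identity [e_1 e_2] = [u_1 u_2] P):

```python
import numpy as np

U = np.array([[1.0, 3.0], [2.0, 5.0]])     # columns: u1, u2 in the standard basis
P = np.array([[-5.0, 3.0], [2.0, -1.0]])   # claimed change-of-basis matrix
print(U @ P)                                # identity matrix
```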
11. Find the determinant of
C = [6 2 1 0 5; 2 1 1 −2 1; 1 1 2 −2 3; 3 0 2 3 −1; −1 −1 −3 4 2].
Solution:
Easiest way: use row and column operations to change C into a triangular matrix. Remember that the determinant stays the same under operations of the type αRᵢ + Rⱼ → Rⱼ, and changes sign if the operation is of the type Rᵢ ↔ Rⱼ or Cᵢ ↔ Cⱼ. For example,
det C = det [6 2 1 0 5; 2 1 1 −2 1; 1 1 2 −2 3; 3 0 2 3 −1; −1 −1 −3 4 2]
(R₁ ↔ R₃)
= −det [1 1 2 −2 3; 2 1 1 −2 1; 6 2 1 0 5; 3 0 2 3 −1; −1 −1 −3 4 2]
(−2R₁ + R₂ → R₂, −6R₁ + R₃ → R₃, −3R₁ + R₄ → R₄, R₁ + R₅ → R₅)
= −det [1 1 2 −2 3; 0 −1 −3 2 −5; 0 −4 −11 12 −13; 0 −3 −4 9 −10; 0 0 −1 2 5]
(−4R₂ + R₃ → R₃, −3R₂ + R₄ → R₄)
= −det [1 1 2 −2 3; 0 −1 −3 2 −5; 0 0 1 4 7; 0 0 5 3 5; 0 0 −1 2 5]
(−5R₃ + R₄ → R₄, R₃ + R₅ → R₅)
= −det [1 1 2 −2 3; 0 −1 −3 2 −5; 0 0 1 4 7; 0 0 0 −17 −30; 0 0 0 6 12]
((6/17)R₄ + R₅ → R₅)
= −det [1 1 2 −2 3; 0 −1 −3 2 −5; 0 0 1 4 7; 0 0 0 −17 −30; 0 0 0 0 24/17]
= −(1)(−1)(1)(−17)(24/17) ⇒ det C = −24.
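The sign bookkeeping above is easy to get wrong, so it is worth confirming the value numerically; a minimal sketch (assuming numpy, and with the signs of C as reconstructed above):

import numpy as np

C = np.array([[ 6,  2,  1,  0,  5],
              [ 2,  1,  1, -2,  1],
              [ 1,  1,  2, -2,  3],
              [ 3,  0,  2,  3, -1],
              [-1, -1, -3,  4,  2]])

print(np.linalg.det(C))  # -24.0, up to floating-point error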
12. Say for which values of k ∈ R the following matrix has rank a. 4, b. 3:
A := [k+1 1 k 2; k 1 2−k k; 1 0 1 1].
Solution:
We employ the same ideas as in a previous exercise. The rank cannot be 4, as rank(A) ≤ min{#rows, #columns} = 3. It is easy to check that rank(A) = 3 for every k, by using the elementary transformations 2R₃ + R₁ → R₁, C₄ + C₁ → C₁, C₄ + C₃ → C₃, C₁ + C₃ → C₃, C₂ + C₃ → C₃, to get a matrix of the form
[k−1 1 0 0; 0 0 1 k; 0 0 0 1],
whose last three columns are independent for any k.
13.
Say if the following matrices have the same row spaces:
A = [1 1 5; 2 3 13], B = [1 1 2; 3 2 3], C = [1 −1 −1; 4 −3 −1; 3 −1 3].
Solution:
We can paraphrase proposition 145 in the following way: matrices have the same row space if and only if their row canonical forms have the same nonzero rows. Hence we row reduce each matrix to row canonical form.
A = [1 1 5; 2 3 13] → (−2R₁+R₂→R₂) [1 1 5; 0 1 3] → (−R₂+R₁→R₁) [1 0 2; 0 1 3]
B = [1 1 2; 3 2 3] → (−3R₁+R₂→R₂) [1 1 2; 0 −1 −3] → (R₂+R₁→R₁, −R₂→R₂) [1 0 −1; 0 1 3]
C = [1 −1 −1; 4 −3 −1; 3 −1 3] → (−4R₁+R₂→R₂, −3R₁+R₃→R₃) [1 −1 −1; 0 1 3; 0 2 6] → (R₂+R₁→R₁, −2R₂+R₃→R₃) [1 0 2; 0 1 3; 0 0 0]
Since the nonzero rows of the reduced form of A and of the reduced form of C are the same, A and C have the same row space. On the other hand, the nonzero rows of the reduced form of B are not the same as the others, and so B has a different row space.
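The row canonical forms can be computed mechanically; a minimal sketch with sympy (rref returns the reduced row echelon form together with the pivot columns):

from sympy import Matrix

A = Matrix([[1, 1, 5], [2, 3, 13]])
B = Matrix([[1, 1, 2], [3, 2, 3]])
C = Matrix([[1, -1, -1], [4, -3, -1], [3, -1, 3]])

# Same row space iff the reduced row echelon forms have the same nonzero rows.
print(A.rref()[0])  # Matrix([[1, 0, 2], [0, 1, 3]])
print(B.rref()[0])  # Matrix([[1, 0, -1], [0, 1, 3]])
print(C.rref()[0])  # Matrix([[1, 0, 2], [0, 1, 3], [0, 0, 0]])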
14.
Say if the following matrices have the same column spaces:
A = [1 3 5; 1 4 3; 1 1 9], B = [1 2 3; −2 −3 −4; 7 12 17].
Solution:
We employ proposition 145 as in the previous exercise. There is only a minor difference: we compute the column reduced form of the matrices, or equivalently the row reduced form of the transposes, which is the same thing.
A = [1 3 5; 1 4 3; 1 1 9] → (−3C₁+C₂→C₂, −5C₁+C₃→C₃) [1 0 0; 1 1 −2; 1 −2 4] → (2C₂+C₃→C₃, −C₂+C₁→C₁) [1 0 0; 0 1 0; 3 −2 0]
B = [1 2 3; −2 −3 −4; 7 12 17] → (−2C₁+C₂→C₂, −3C₁+C₃→C₃) [1 0 0; −2 1 2; 7 −2 −4] → (−2C₂+C₃→C₃, 2C₂+C₁→C₁) [1 0 0; 0 1 0; 3 −2 0]
The nonzero columns of the two reduced forms coincide: the matrices A and B have the same column space.
15.
Diagonalize A = [4 2; 3 −1].
Solution:
First we find the eigenvalues of the matrix:
det[tI − A] = det [t−4 −2; −3 t+1] = t² − 3t − 10.
The solutions of det[tI − A] = 0, and therefore the eigenvalues of the matrix, are λ ∈ {5, −2}. They are different, and thus by proposition 250 from the lecture notes the matrix P composed of the corresponding eigenvectors v₁ = (2, 1)′ and v₂ = (1, −3)′ (for example) does the trick. The answer is
P = [2 1; 1 −3] and D = [5 0; 0 −2].
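A numerical cross-check of the diagonalization (a sketch assuming numpy; the eigenvalue order may differ):

import numpy as np

A = np.array([[4, 2],
              [3, -1]])

print(np.linalg.eigvals(A))  # [ 5. -2.] (possibly in another order)

P = np.array([[2, 1],
              [1, -3]])
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))       # [[ 5.  0.]  [ 0. -2.]]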
16. Let A = [2 2; 1 3].
Find: (a) all eigenvalues of A and the corresponding eigenspaces, (b) an invertible matrix P such that D = P⁻¹AP is diagonal.
Solution:
For the eigenvalues, we look at the solutions of det[tI − A] = 0:
det[tI − A] = det [t−2 −2; −1 t−3] = t² − 5t + 4,
so that the eigenvalues are λ ∈ {1, 4}. For the eigenspaces, we substitute each eigenvalue into the matrix tI − A.
For λ = 1:
I − A = [−1 −2; −1 −2],
from where we can observe that ker(I − A) = {(x, y) ∈ R² : x = −2y} is the eigenspace for λ = 1.
For λ = 4:
4I − A = [2 −2; −1 1],
from where we can observe that ker(4I − A) = {(x, y) ∈ R² : x = y} is the eigenspace for λ = 4.
Finally, for the P matrix, we consider one eigenvector belonging to each of the eigenspaces, for example (−2, 1) for λ = 1 and (1, 1) for λ = 4. Then, by Proposition 241, we can construct P as
P = [−2 1; 1 1].
17.
Show that similarity between matrices is an equivalence relation.
Solution:
Recall that a matrix B ∈ M(n, n) is similar to a matrix A ∈ M(n, n) if there exists an invertible matrix P ∈ M(n, n) such that
B = P⁻¹AP.
An equivalence relation is a relation which is reflexive, symmetric and transitive.
a. Reflexive: B = I⁻¹BI.
b. Symmetric: B = P⁻¹AP ⇒ A = PBP⁻¹ = (P⁻¹)⁻¹B(P⁻¹).
c. Transitive: B = P⁻¹AP and C = Q⁻¹BQ ⇒ C = Q⁻¹BQ = Q⁻¹(P⁻¹AP)Q = (PQ)⁻¹A(PQ).
18.
Let A be a square symmetric matrix with real entries. Show that its eigenvalues are real numbers. Show that if λᵢ ≠ λⱼ for i ≠ j, then the corresponding eigenvectors are orthogonal.
Solution:
A ∈ R^{n×n}. Recall that for any matrix A (even with complex entries) A* = (ā_{ji})_{ij}, the conjugate transpose. Denote by λ and v an eigenvalue and a corresponding eigenvector. Then Av = λv, and thus v*Av = λ v*v. Applying the conjugate transpose (·)* to both sides, and using the fact that for real A we have A* = A′, I get v*A′v = λ̄ v*v. By the symmetry of the matrix A, v*Av = λ̄ v*v, so λ v*v = λ̄ v*v, which is equivalent to λ ∈ R, as v is nonzero.
Since A = A′, we have vᵢ′Avⱼ = vⱼ′Avᵢ, and so (λᵢ − λⱼ) vᵢ′vⱼ = 0. As the λ's are different, the latter can hold only if vᵢ′vⱼ = 0 for i ≠ j (of course vᵢ′vᵢ ≠ 0).
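Both facts can be illustrated numerically with numpy's eigh routine, which is specialized to symmetric matrices (the matrix below is our own arbitrary example, not one from the notes):

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])   # symmetric

eigvals, Q = np.linalg.eigh(A)
print(eigvals)                    # all real
print(np.round(Q.T @ Q, 10))      # identity: the eigenvectors are orthonormal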
19.
Given
l₄ : R⁴ → R⁴, l₄(x₁, x₂, x₃, x₄) = (x₁, x₁+x₂, x₁+x₂+x₃, x₁+x₂+x₃+x₄),
show it is linear, compute the associated matrix, and compute ker l₄ and Im l₄.
Solution:
Linearity is easy. According to definition 254, l₄ ∈ L(R⁴, R⁴) is linear if ∀u, v ∈ R⁴ and ∀α, β ∈ R, l₄(αu + βv) = αl₄(u) + βl₄(v). Therefore, we take two vectors u = (u₁, u₂, u₃, u₄) and v = (v₁, v₂, v₃, v₄) in R⁴ and α, β ∈ R. Then
αl₄(u) + βl₄(v) = α(u₁, u₁+u₂, u₁+u₂+u₃, u₁+u₂+u₃+u₄) + β(v₁, v₁+v₂, v₁+v₂+v₃, v₁+v₂+v₃+v₄)
= (αu₁+βv₁, α(u₁+u₂)+β(v₁+v₂), α(u₁+u₂+u₃)+β(v₁+v₂+v₃), α(u₁+u₂+u₃+u₄)+β(v₁+v₂+v₃+v₄))
= l₄(αu + βv).
So l₄ is linear.
For the associated matrix, we first have to choose the bases of the domain and codomain. It is natural to choose the canonical basis (eⁱ)⁴ᵢ₌₁ in both domain and codomain, where eⁱ has 1 in the i-th place and 0's otherwise. Then express the value of l₄ at each vector of the basis of the domain, l₄(eⁱ) for i = 1, ..., 4, in terms of the basis of the codomain (eⁱ)⁴ᵢ₌₁. We have l₄(e¹) = (1, 1, 1, 1)′ = 1e¹ + 1e² + 1e³ + 1e⁴, l₄(e²) = (0, 1, 1, 1)′ = 0e¹ + 1e² + 1e³ + 1e⁴, l₄(e³) = (0, 0, 1, 1)′ = 0e¹ + 0e² + 1e³ + 1e⁴, l₄(e⁴) = (0, 0, 0, 1)′ = 0e¹ + 0e² + 0e³ + 1e⁴. Put the coefficients of each equation as columns of a matrix to get
A = [1 0 0 0; 1 1 0 0; 1 1 1 0; 1 1 1 1].
The matrix is nonsingular and therefore ker l₄ = {0}. Moreover 4 = dim R⁴ = dim ker l₄ + dim Im l₄ = dim Im l₄, and therefore Im l₄ = R⁴.
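A numerical confirmation that the matrix of l₄ is nonsingular (a minimal sketch assuming numpy):

import numpy as np

A = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [1, 1, 1, 0],
              [1, 1, 1, 1]])

print(np.linalg.det(A))          # 1.0: nonsingular, so ker l4 = {0}
print(np.linalg.matrix_rank(A))  # 4: dim Im l4 = 4, so Im l4 = R^4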
20.
Completed text.
Proposition. Assume that l ∈ L(V, U) and ker l = {0}. Then, ∀u ∈ Im l, there exists a unique v ∈ V such that l(v) = u.
Proof.
Since u ∈ Im l, by definition there exists v ∈ V such that
l(v) = u. (22.1)
Take v′ ∈ V such that
l(v′) = u. (22.2)
We want to show that v′ = v. Observe that
l(v) − l(v′) =(a) u − u = 0, (22.3)
where (a) follows from (22.1) and (22.2). Moreover,
l(v) − l(v′) =(b) l(v − v′), (22.4)
where (b) follows from the assumption that l ∈ L(V, U). Therefore
l(v − v′) = 0,
and, by definition of ker l,
v − v′ ∈ ker l. (22.5)
Since, by assumption, ker l = {0}, from (22.5) it follows that v − v′ = 0.
21.
Let the following sets be given:
V = {(x₁, x₂, x₃, x₄) ∈ R⁴ : x₁ − x₂ + x₃ − x₄ = 0}
and
W = {(x₁, x₂, x₃, x₄) ∈ R⁴ : x₁ + x₂ + x₃ + x₄ = 0}.
If possible, find a basis of V ∩ W.
Solution:
Both V and W are kernels of linear functions; therefore V, W and V ∩ W are vector subspaces of R⁴. Moreover
V ∩ W = {(x₁, x₂, x₃, x₄) ∈ R⁴ : x₁ − x₂ + x₃ − x₄ = 0 and x₁ + x₂ + x₃ + x₄ = 0},
rank [1 −1 1 −1; 1 1 1 1] = 2.
Therefore dim ker l = dim V ∩ W = 4 − 2 = 2.
Let's compute a basis of V ∩ W. The two equations read
x₁ − x₂ = −x₃ + x₄, x₁ + x₂ = −x₃ − x₄.
After taking sum and difference we get the following expression:
x₁ = −x₃, x₂ = −x₄.
A basis consists of two linearly independent vectors. For example, if x₃ = 1 and x₄ = 0, then x₁ = −1 and x₂ = 0; if x₃ = 0 and x₄ = 1, then x₁ = 0 and x₂ = −1:
{(−1, 0, 1, 0), (0, −1, 0, 1)}.
22.
Say if the following statement is true or false.
Let V and U be vector spaces on R, W a vector subspace of U and l ∈ L(V, U). Then l⁻¹(W) is a vector subspace of V.
Solution:
True. By proposition 135, we have to show that
1. 0 ∈ l⁻¹(W);
2. ∀α, β ∈ R and ∀v₁, v₂ ∈ l⁻¹(W), we have αv₁ + βv₂ ∈ l⁻¹(W).
1. Since l ∈ L(V, U), l(0) = 0, and since W is a vector subspace, 0 ∈ W; hence 0 ∈ l⁻¹(W).
2. Since v₁, v₂ ∈ l⁻¹(W),
l(v₁), l(v₂) ∈ W. (22.6)
Then, since l ∈ L(V, U),
l(αv₁ + βv₂) = αl(v₁) + βl(v₂) ∈(a) W,
where (a) follows from (22.6) and the fact that W is a vector subspace.
23.
Let the following full rank matrices
A = [a₁₁ a₁₂; a₂₁ a₂₂], B = [b₁₁ b₁₂; b₂₁ b₂₂]
be given. Say for which values of k ∈ R the following linear system Cx = d has solutions:
[1 a₁₁ a₁₂ 0 0 0; 2 a₂₁ a₂₂ 0 0 0; 3 5 6 b₁₁ b₁₂ 0; 4 7 8 b₂₁ b₂₂ 0; 0 a₁₁ a₁₂ 0 0 k] (x₁, x₂, x₃, x₄, x₅, x₆)′ = (k, 1, 2, 3, k)′.
Solution:
Observe that
det [a₁₁ a₁₂ 0 0 0; a₂₁ a₂₂ 0 0 0; 5 6 b₁₁ b₁₂ 0; 7 8 b₂₁ b₂₂ 0; a₁₁ a₁₂ 0 0 k] = det A · det B · k.
Then, if k ≠ 0, the rank of both the matrix of coefficients and the augmented matrix is 5, and the set of solutions to the system is an affine subspace of R⁶ of dimension 1. If k = 0, the fifth equation has zero right-hand side and becomes redundant, so the system is equivalent to
[1 a₁₁ a₁₂ 0 0 0; 2 a₂₁ a₂₂ 0 0 0; 3 5 6 b₁₁ b₁₂ 0; 4 7 8 b₂₁ b₂₂ 0] (x₁, x₂, x₃, x₄, x₅, x₆)′ = (0, 1, 2, 3)′,
whose set of solutions is an affine subspace of R⁶ of dimension 2.
24.
Consider the following Proposition contained in Section 8.1 in the class Notes:
Proposition. ∀v ∈ V,
[l]ᵘᵥ [v]ᵥ = [l(v)]ᵤ. (22.7)
Verify the above equality in the case in which
a. l : R² → R², (x₁, x₂) ↦ (x₁ + x₂, x₁ − x₂);
b. the basis v of the domain of l is {(1, 0)′, (0, 1)′};
c. the basis u of the codomain of l is {(1, 1)′, (2, 1)′};
d. v = (3, 4)′.
Solution:
[l]ᵘᵥ := [[l(1, 0)]ᵤ, [l(0, 1)]ᵤ] = [[(1, 1)]ᵤ, [(1, −1)]ᵤ] = [1 −3; 0 2],
since (1, 1) = 1·(1, 1) + 0·(2, 1) and (1, −1) = −3·(1, 1) + 2·(2, 1). Moreover
[v]ᵥ = (3, 4)′, l(v) = (7, −1), [l(v)]ᵤ = (−9, 8)′,
since (7, −1) = −9·(1, 1) + 8·(2, 1). Finally
[l]ᵘᵥ [v]ᵥ = [1 −3; 0 2] (3, 4)′ = (−9, 8)′ = [l(v)]ᵤ.
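The three coordinate computations can be checked at once; a minimal numpy sketch (solve(U, w) computes the coordinates of w in the basis u):

import numpy as np

U = np.array([[1, 2],
              [1, 1]])            # columns: u1 = (1,1), u2 = (2,1)
L = np.array([[1, 1],
              [1, -1]])           # matrix of l in the canonical bases
v = np.array([3, 4])              # [v]_v (v is the canonical basis)

M = np.linalg.solve(U, L)         # [l]^u_v, column by column
print(M)                          # [[ 1. -3.]  [ 0.  2.]]
print(M @ v)                      # [-9.  8.]
print(np.linalg.solve(U, L @ v))  # [-9.  8.] = [l(v)]_u, the same vector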
25.
Let
n, m ∈ N\{0} such that m > n, and
a vector subspace L of Rᵐ such that dim L = n
be given. Then, there exists l ∈ L(Rⁿ, Rᵐ) such that Im l = L.
Proof. Let (vⁱ)ⁿᵢ₌₁ be a basis of L ⊆ Rᵐ. Take l ∈ L(Rⁿ, Rᵐ) such that
∀i ∈ {1, ..., n}, l(eⁱₙ) = vⁱ,
where eⁱₙ is the i-th element of the canonical basis of Rⁿ. Such a function does exist and, in fact, it is unique, as a consequence of a Proposition in the Class Notes that we copy below:
Let V and U be finite dimensional vector spaces such that S = {v¹, ..., vⁿ} is a basis of V and {u¹, ..., uⁿ} is a set of arbitrary vectors in U. Then there exists a unique linear function l : V → U such that ∀i ∈ {1, ..., n}, l(vⁱ) = uⁱ; see Proposition 273, page 82.
Then, from the Dimension theorem,
dim Im l = n − dim ker l ≤ n.
Moreover, L = span (vⁱ)ⁿᵢ₌₁ ⊆ Im l. Summarizing,
L ⊆ Im l, dim L = n and dim Im l ≤ n,
and therefore
dim Im l = n.
Finally, from Proposition 179 in the class Notes, since L ⊆ Im l, dim L = n and dim Im l = n, we have that Im l = L, as desired.
Proposition 179 in the class Notes says what follows:
Proposition. Let W be a subspace of an n-dimensional vector space V. Then
1. dim W ≤ n;
2. if dim W = n, then W = V.
26.
Say for which values of the parameter a ∈ R the following system has one, infinitely many or no solutions:
ax₁ + x₂ = 1
x₁ + x₂ = a
2x₁ + x₂ = 3a
3x₁ + 2x₂ = a
Solution:
For the matrix A we have
rank A = rank [a 1; 1 1; 2 1; 3 2] = 2,
as we can find a nonsingular 2×2 submatrix for any value of a, for example [1 1; 2 1]. For the matrix [A | b], we can perform elementary row operations:
[A | b] = [a 1 1; 1 1 a; 2 1 3a; 3 2 a] → (R₁ ↔ R₂) [1 1 a; a 1 1; 2 1 3a; 3 2 a]
→ (−aR₁+R₂→R₂, −2R₁+R₃→R₃, −3R₁+R₄→R₄) [1 1 a; 0 1−a 1−a²; 0 −1 a; 0 −1 −2a]
→ (R₂ ↔ R₃, R₃ ↔ R₄) [1 1 a; 0 −1 a; 0 −1 −2a; 0 1−a 1−a²]
→ (−R₂+R₃→R₃, (1−a)R₂+R₄→R₄) [1 1 a; 0 −1 a; 0 0 −3a; 0 0 (1−a)(1+2a)].
Therefore the matrix [A | b] always has rank 3: the entries −3a and (1−a)(1+2a) never vanish simultaneously (if a = 0 the matrix composed of rows 1, 2 and 4 is nonsingular; if a = 1 the matrix composed of rows 1, 2 and 3 is nonsingular). Therefore, as rank A ≠ rank [A | b] for every a, by Theorem 327 (Rouché–Capelli) the system has no solutions.
27.
Say for which values of k the system below admits one, none or infinitely many solutions:
A(k) x = b(k),
where k ∈ R, and
A(k) = [1 0; 1−k 2−k; 1 k; 1 k−1], b(k) = (k−1, k, 1, 0)′.
Solution:
[A(k) | b(k)] = [1 0 k−1; 1−k 2−k k; 1 k 1; 1 k−1 0],
det [1 0 k−1; 1 k 1; 1 k−1 0] = 2 − 2k.
If k ≠ 1, then rank [A(k) | b(k)] = 3 > 2 ≥ rank A(k), and the system has no solutions. If k = 1,
[A(1) | b(1)] = [1 0 0; 0 1 1; 1 1 1; 1 0 0] ~ [1 0 0; 0 1 1; 1 1 1] ~ [1 0 0; 0 1 1].
Then, if k = 1, there exists a unique solution.
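The rank comparison behind Rouché–Capelli can be automated; a minimal sketch with sympy, using A(k) and b(k) as reconstructed above:

from sympy import Matrix, symbols

k = symbols('k')
A = Matrix([[1, 0], [1 - k, 2 - k], [1, k], [1, k - 1]])
b = Matrix([k - 1, k, 1, 0])
M = A.row_join(b)  # augmented matrix [A | b]

print(A.subs(k, 1).rank(), M.subs(k, 1).rank())  # 2 2: unique solution
print(A.subs(k, 0).rank(), M.subs(k, 0).rank())  # 2 3: no solutions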
22.2 Some topology in metric spaces
22.2.1 Basic topology in metric spaces
1.
Do Exercise 382: Let d be a metric on a non-empty set X. Show that
d′ : X × X → R, d′(x, y) = d(x, y)/(1 + d(x, y))
is a metric on X.
Solution:
To prove that d′ is a metric, we have to check the properties listed in Definition 368.
a. d′(x, y) ≥ 0, and d′(x, y) = 0 ⟺ x = y.
By the definition of d′(x, y), it is always nonnegative, as d(x, y) ≥ 0. Furthermore, d′(x, y) = 0 ⟺ d(x, y) = 0 ⟺ x = y.
b. d′(x, y) = d′(y, x).
Since d(x, y) = d(y, x), applying the definition gives
d′(x, y) = d(x, y)/(1 + d(x, y)) = d(y, x)/(1 + d(y, x)) = d′(y, x).
c. d′(x, z) ≤ d′(x, y) + d′(y, z).
Observe that the function t ↦ t/(1 + t) is increasing on [0, +∞), and that d(x, z) ≤ d(x, y) + d(y, z). Therefore
d′(x, z) = d(x, z)/(1 + d(x, z)) ≤ (d(x, y) + d(y, z))/(1 + d(x, y) + d(y, z))
= d(x, y)/(1 + d(x, y) + d(y, z)) + d(y, z)/(1 + d(x, y) + d(y, z))
≤ d(x, y)/(1 + d(x, y)) + d(y, z)/(1 + d(y, z)) = d′(x, y) + d′(y, z),
which concludes the proof.
2.
Let X be the set of continuous real valued functions with domain [0, 1] ⊆ R and
d(f, g) = |∫₀¹ (f(x) − g(x)) dx|,
where the integral is the Riemann integral (the one you learned in Calculus 1). Show that d is not a metric on X.
Solution:
We prove it with a counter-example: we want to show that we can have d(f, g) = 0 with f ≠ g. Take
f(x) = 0, ∀x ∈ [0, 1], and g(x) = −2x + 1.
Then f ≠ g, but d(f, g) = |∫₀¹ (−2x + 1) dx| = |−1 + 1| = 0.
3.
Do Exercise 399 for n = 2: ∀n ∈ N, ∀i ∈ {1, ..., n}, ∀aᵢ, bᵢ ∈ R with aᵢ < bᵢ, the set ×ⁿᵢ₌₁ (aᵢ, bᵢ) is (Rⁿ, d₂) open.
Solution:
Define S = (a₁, b₁) × (a₂, b₂) and take x⁰ := (x⁰₁, x⁰₂) ∈ S. Then, for i ∈ {1, 2}, there exists εᵢ > 0 such that B(x⁰ᵢ, εᵢ) ⊆ (aᵢ, bᵢ). Take ε = min{ε₁, ε₂}. It then suffices to show that B(x⁰, ε) ⊆ S. Observe that if x = (x₁, x₂) ∈ B(x⁰, ε), then for each i ∈ {1, 2}
|x⁰ᵢ − xᵢ| = √((x⁰ᵢ − xᵢ)²) ≤ √((x⁰₁ − x₁)² + (x⁰₂ − x₂)²) = d₂(x, x⁰) < ε ≤ εᵢ,
so that xᵢ ∈ B(x⁰ᵢ, εᵢ) ⊆ (aᵢ, bᵢ), and therefore x ∈ S.
4.
Show the second equality in Remark 407:
∩⁺∞ₙ₌₁ (−1/n, 1/n) = {0}.
Solution:
Define A := ∩⁺∞ₙ₌₁ (−1/n, 1/n) and B := {0}. To show that A = B, we have to prove both A ⊆ B and A ⊇ B.
A ⊆ B: We have to show that if x ∈ A, then x ∈ B; it is equivalent to show that if x ∉ B, then x ∉ A, i.e., if x ≠ 0, then x ∉ A. For x = a > 0, there exists N > 1/a such that x ∉ A_N := (−1/N, 1/N); therefore x ∉ A. The case x < 0 is analogous.
A ⊇ B: We have to show that x = 0 ∈ B belongs to A. For every N, −1/N < 0 < 1/N, so 0 ∈ A_N. As a result, zero belongs to the intersection.
5.
Say if the following set is open or closed (you should prove what you say, in all exercises):
S := {x ∈ R : ∃n ∈ N\{0} such that x = (−1)ⁿ (1/n)}.
Solution:
S = {−1, +1/2, −1/3, +1/4, −1/5, +1/6, ...}.
The set is not open: it suffices to find x ∈ S such that x ∉ Int S; take for example −1. We want to show that it is false that
∃ε > 0 such that (−1 − ε, −1 + ε) ⊆ S.
In fact, ∀ε > 0, −1 + ε/2 ∈ (−1 − ε, −1 + ε), but, for ε small enough, −1 + ε/2 ∉ S.
The set is not closed. It suffices to show that F(S) is not contained in S; in fact, 0 ∉ S (obvious) and 0 ∈ F(S). We want to show that ∀ε > 0, B(0, ε) ∩ S ≠ ∅. In fact, (−1)ⁿ(1/n) ∈ B(0, ε) if n is even and (−1)ⁿ(1/n) = 1/n < ε. It is then enough to take n even and n > 1/ε.
6.
Say if the following set is open or closed:
A := ∪⁺∞ₙ₌₁ (1/n, 10 − 1/n).
Solution:
A = (0, 10).
The exercise implicitly supposes that we are in the metric space (R, d₂). The set is (R, d₂) open, as a union of an infinite collection of open sets (proposition 407). The set is not closed, because Aᶜ is not open: take 10 or 0; these two points belong to Aᶜ but not to Int(Aᶜ).
7.
Do Exercise 417: show that F(S) = F(Sᶜ).
Solution:
The set of all boundary points of S is called the boundary of S and is denoted by F(S). The result follows immediately from definition 415:
Definition 754. Let a metric space (X, d) and a set S ⊆ X be given. x is a boundary point of S if any open ball centered in x intersects both S and its complement in X, i.e.,
∀r ∈ R₊₊, B(x, r) ∩ S ≠ ∅ ∧ B(x, r) ∩ Sᶜ ≠ ∅.
As you can see, nothing changes in the definition above if you replace the set by its complement, since (Sᶜ)ᶜ = S.
8.
Do Exercise 418 (F is closed): show that F(S) is a closed set.
Solution:
According to definition 402, F(S) is closed if its complement is open. The complement is the union of the set of interior points of S and the set of interior points of Sᶜ, which are open sets; and the union of two open sets is open. Let's prove it formally.
We want to show that (F(S))ᶜ is an open set. Observe that
x ∈ (F(S))ᶜ ⟺ x ∉ F(S) ⟺
¬(∀r ∈ R₊₊, B(x, r) ∩ S ≠ ∅ ∧ B(x, r) ∩ Sᶜ ≠ ∅) ⟺
∃r ∈ R₊₊ such that B(x, r) ∩ S = ∅ ∨ B(x, r) ∩ Sᶜ = ∅ ⟺
∃r ∈ R₊₊ such that B(x, r) ⊆ Sᶜ ∨ B(x, r) ⊆ S ⟺
x ∈ Int Sᶜ ∨ x ∈ Int S
⟹(1) ∃r_x ∈ R₊₊ such that either a. B(x, r_x) ⊆ Int Sᶜ or b. B(x, r_x) ⊆ Int S, (22.8)
where (1) follows from the fact that the interior of a set is an open set.
If case a. in (22.8) holds true, then B(x, r_x) ⊆ (F(S))ᶜ, and similarly for case b., as desired.
9.
Let the metric space (R, d₂) be given. Find Int S, Cl(S), F(S), D(S), Is(S) for S = Q, S = (0, 1) and S = {x ∈ R : ∃n ∈ N such that x = 1/n}.
Solution:
S = Q is neither closed nor open.
Int S = ∅, because there is no ball that includes only points belonging to S;
Cl(S) = R; F(S) = R; D(S) = R; Is(S) = ∅.
S = (0, 1) is an open set.
Int S = S; Cl(S) = [0, 1]; F(S) = {0} ∪ {1}; D(S) = [0, 1]; Is(S) = ∅.
S = {x ∈ R : ∃n ∈ N such that x = 1/n} is neither closed nor open.
Int S = ∅; Cl(S) = S ∪ {0}; F(S) = S ∪ {0}; D(S) = {0}; Is(S) = S.
It is useful to check your answers using the summary from the lecture notes.
The following statements are equivalent:
1. S is open (i.e., S ⊆ Int S);
2. Sᶜ is closed;
3. S ∩ F(S) = ∅;
and the following statements are equivalent:
1. S is closed;
2. Sᶜ is open;
3. F(S) ⊆ S;
4. S = Cl(S);
5. D(S) ⊆ S;
6. D(S) ∪ Is(S) = S.
And some extra facts (propositions 498 and 499):
1. the closure of S is equal to the union of S and D(S);
2. the closure of S is equal to the union of Int(S) and F(S).
10.
Show that the following statements are false:
a. Cl(Int S) = S;
b. Int Cl(S) = S.
Solution:
a. Take S = N. Then Int S = ∅, Cl(∅) = ∅, and Cl(Int S) = ∅ ≠ N = S.
b. Take S = N. Then Cl(S) = N, Int N = ∅, and Int Cl(S) = ∅ ≠ N = S.
11.
Given S ⊆ R, say if the following statements are true or false.
a. S is an open bounded interval ⇒ S is an open set.
True. If S is an open bounded interval, then ∃a, b ∈ R, a < b, such that S = (a, b). Take x ∈ S and ε = min{|x − a|, |x − b|}. Then I(x, ε) ⊆ (a, b).
b. S is an open set ⇒ S is an open bounded interval.
False. (0, 1) ∪ (2, 3) is an open set, but it is not an open interval.
c. x ∈ F(S) ⇒ x ∈ D(S).
False. Take S := {0, 1}. 0 ∈ F(S), but 0 ∉ D(S).
d. x ∈ D(S) ⇒ x ∈ F(S).
False. Take S = (0, 1). 1/2 ∈ D(S), but 1/2 ∉ F(S).
12.
Using the definition of convergent sequences, show that the following sequences converge:
(xₙ)ₙ∈N ∈ R^∞ such that ∀n ∈ N, xₙ = 1;
(xₙ)ₙ∈N\{0} ∈ R^∞ such that ∀n ∈ N\{0}, xₙ = 1/n.
Solution:
Definition 755. A sequence (xₙ)ₙ∈N ∈ X^∞ is said to be (X, d) convergent to x₀ ∈ X (or convergent with respect to the metric space (X, d)) if
∀ε > 0, ∃n₀ ∈ N such that ∀n > n₀, d(xₙ, x₀) < ε. (22.9)
x₀ is called the limit of the sequence (xₙ)ₙ∈N, and we write limₙ→₊∞ xₙ = x₀, or xₙ →ₙ x₀.
(xₙ)ₙ∈N in a metric space (X, d) is convergent if there exists x₀ ∈ X such that (22.9) holds. In that case, we say that the sequence converges to x₀ and x₀ is the limit of the sequence.
Take first the sequence with xₙ = 1 for every n ∈ N. It is obvious that d(xₙ, 1) = 0 < ε for every ε > 0 and every n.
Check the second, with xₙ = 1/n for n ∈ N\{0}:
∀ε > 0, take n₀ ∈ N with n₀ ≥ 2/ε; then ∀n > n₀, d(xₙ, 0) = 1/n < 1/n₀ ≤ ε/2 < ε. (22.10)
13.
Using Proposition 445 (closed in terms of sequences), show that [0, 1] is (R, d₂) closed.
Solution:
Take (xₙ)ₙ∈N ∈ [0, 1]^∞ such that xₙ → x₀; we want to show that x₀ ∈ [0, 1]. Suppose otherwise, i.e., x₀ ∉ [0, 1].
Case 1. x₀ < 0. By definition of convergence, chosen ε = −x₀/2 > 0, there exists n_ε ∈ N such that ∀n > n_ε, d(xₙ, x₀) < ε, i.e., |xₙ − x₀| < −x₀/2, i.e., xₙ < x₀ + ε = x₀/2 < 0. Summarizing, ∀n > n_ε, xₙ ∉ [0, 1], contradicting the assumption that (xₙ)ₙ∈N ∈ [0, 1]^∞.
Case 2. x₀ > 1. Similar to case 1.
14.
A subset of a discrete space, i.e., a metric space with the discrete metric, is compact if and only if it is finite.
Solution:
This is Example 7.15, page 150, Morris (2007).
1. In fact, we have the following result: let (X, d) be a metric space and A = {x₁, ..., xₙ} any finite subset of X. Then A is compact, as shown below.
Let Oᵢ, i ∈ I, be any family of open sets such that A ⊆ ∪ᵢ∈I Oᵢ. Then for each xⱼ ∈ A there exists O_{iⱼ} such that xⱼ ∈ O_{iⱼ}. Then A ⊆ O_{i₁} ∪ O_{i₂} ∪ ... ∪ O_{iₙ}. Therefore A is compact.
2. Conversely, let A be compact. In a discrete space every singleton is open, so the family of singleton sets Oₓ = {x}, x ∈ A, is such that each Oₓ is open and A ⊆ ∪ₓ∈A Oₓ. Since A is compact, there exist O_{x₁}, O_{x₂}, ..., O_{xₙ} such that A ⊆ O_{x₁} ∪ O_{x₂} ∪ ... ∪ O_{xₙ}, that is, A ⊆ {x₁, ..., xₙ}. Hence A is finite.
15.
Say if the following statement is true: an open set is not compact.
Solution:
In general it is false: in a discrete metric space every set is open and every finite set is compact; see the previous exercise.
16.
Using the definition of compactness, show the following statement: any open ball in (R², d₂) is not compact.
Solution:
Take an open ball B(x, r). Consider S = {B(x, r(1 − 1/n))}ₙ∈N\{0,1}. Observe that S is an open cover of B(x, r); in fact ∪ₙ∈N\{0,1} B(x, r(1 − 1/n)) = B(x, r), as shown below.
[⊆] x₀ ∈ ∪ₙ∈N\{0,1} B(x, r(1 − 1/n)) ⟹ ∃n_{x₀} ∈ N\{0, 1} such that x₀ ∈ B(x, r(1 − 1/n_{x₀})) ⊆ B(x, r).
[⊇] Take x₀ ∈ B(x, r). Then d(x, x₀) < r. Take n such that d(x₀, x) < r(1 − 1/n), i.e., n > r/(r − d(x₀, x)) (and n > 1); then x₀ ∈ B(x, r(1 − 1/n)).
Consider an arbitrary finite subfamily of S, i.e., S′ = {B(x, r(1 − 1/n))}ₙ∈N with #N = N ∈ N. Define n* = max{n ∈ N}. Then
∪ₙ∈N B(x, r(1 − 1/n)) = B(x, r(1 − 1/n*)),
and if d(x₀, x) ∈ [r(1 − 1/n*), r), then x₀ ∈ B(x, r) and x₀ ∉ ∪ₙ∈N B(x, r(1 − 1/n)). Hence no finite subfamily of S covers B(x, r), and B(x, r) is not compact.
17.
Complete the following solution of Exercise 480 (R is complete). (Page 129 in Morris.)
We want to show that (R, d₂) is complete.
Let (xₙ)ₙ∈N be a Cauchy sequence in (R, d₂). From Proposition 474, (xₙ)ₙ is bounded. From Proposition 438, (xₙ)ₙ∈N has a convergent subsequence (x_{n_k})_{k∈N} such that x_{n_k} →_k x₀; we want to show that xₙ →ₙ x₀. Since (xₙ)ₙ∈N is a Cauchy sequence, ∀ε > 0,
∃N₁ ∈ N such that ∀n, m > N₁, |xₙ − xₘ| < ε/2.
Since x_{n_k} →_k x₀,
∃N₂ ∈ N such that ∀n_k > N₂, |x_{n_k} − x₀| < ε/2.
Then, ∀n, n_k > N₃ := max{N₁, N₂},
|xₙ − x₀| ≤ |xₙ − x_{n_k}| + |x_{n_k} − x₀| < ε,
as desired.
18.
Show that f(A ∪ B) = f(A) ∪ f(B).
Solution:
As we did before, we must show two inclusions: f(A ∪ B) ⊆ f(A) ∪ f(B) and f(A) ∪ f(B) ⊆ f(A ∪ B). To prove the first one, let y ∈ f(A ∪ B), i.e., ∃x ∈ A ∪ B such that f(x) = y. Then either x ∈ A or x ∈ B, which implies y = f(x) ∈ f(A) or y = f(x) ∈ f(B). In both cases y ∈ f(A) ∪ f(B).
We now show the reverse inclusion. Let y ∈ f(A) ∪ f(B); then y ∈ f(A) or y ∈ f(B). But y ∈ f(A) implies that ∃x ∈ A such that f(x) = y, and the same implication holds for y ∈ f(B). As a result, y = f(x) in either case with x ∈ A ∪ B, i.e., y ∈ f(A ∪ B). Q.E.D.
19.
Show that, in general, f(A ∩ B) ≠ f(A) ∩ f(B).
Solution:
Take f = sin, A = [−2π, 0], B = [0, 2π]. Then f(A ∩ B) = f({0}) = {0}, while f(A) ∩ f(B) = [−1, 1] ∩ [−1, 1] = [−1, 1].
20.
Using the characterization of continuous functions in terms of open sets, show that for any metric space (X, d) the constant function is continuous.
Solution:
Take c ∈ R and define the following function:
f : X → Y, f(x) = c.
It suffices to show that the preimage of every open subset of the codomain is open in the domain. The inverse image of any open set K is either X (if c ∈ K) or ∅ (if c ∉ K), which are both open sets.
21.
a. Say if the following sets are (Rⁿ, d₂) compact: Rⁿ₊; and, ∀x ∈ Rⁿ and ∀r ∈ R₊₊, Cl B(x, r).
b. Say if the following set is (R, d₂) compact:
{x ∈ R : ∃n ∈ N\{0} such that x = (−1)ⁿ (1/n)}.
Solution:
a. Rⁿ₊ is not bounded; then by proposition 453 it is not compact.
Cl B(x, r) is compact.
Step 1. Cl B(x, r) = {y ∈ Rⁿ : d(x, y) ≤ r} := C.
Proof of Step 1.
i. Cl B(x, r) ⊆ C.
The function dₓ : Rⁿ → R, dₓ(y) = d(x, y) := (Σⁿᵢ₌₁ (xᵢ − yᵢ)²)^{1/2}, is continuous. Therefore C = dₓ⁻¹([0, r]) is closed. Since B(x, r) ⊆ C, by definition of closure the desired result follows.
ii. Cl B(x, r) ⊇ C.
From Corollary 502 in the Notes, it suffices to show that Ad B(x, r) ⊇ C. If d(y, x) < r, we are done. Suppose that d(y, x) = r. We want to show that for every ε > 0 we have B(x, r) ∩ B(y, ε) ≠ ∅. If ε > r, then x ∈ B(x, r) ∩ B(y, ε). Now take ε ≤ r. It is enough to take a point very close to y "inside B(x, r)". For example (check the computations), we can verify that z ∈ B(x, r) ∩ B(y, ε), where z = x + (1 − ε/(2r))(y − x). In fact
d(x, z) = (1 − ε/(2r)) d(y, x) = (1 − ε/(2r)) r = r − ε/2 < r,
and
d(y, z) = (ε/(2r)) d(y, x) = (ε/(2r)) r = ε/2 < ε.
Step 2. {y ∈ Rⁿ : d(x, y) ≤ r} is closed and bounded. Easy.
b. S = {x ∈ R : ∃n ∈ N\{0} such that x = (−1)ⁿ (1/n)}. See the solution to exercise 5, where it was shown that S is not closed; therefore, using proposition 454, we can conclude that S is not compact.
22.
Given the continuous functions g : Rⁿ → Rᵐ, show that the following set is closed:
{x ∈ Rⁿ : g(x) ≥ 0}.
Solution:
Observe that, given for any j ∈ {1, ..., m} the continuous functions gⱼ : Rⁿ → R with g = (gⱼ)ᵐⱼ₌₁, we can define
C := {x ∈ Rⁿ : g(x) ≥ 0}.
Then C is closed, because of the following argument:
C = ∩ᵐⱼ₌₁ gⱼ⁻¹([0, +∞)); since gⱼ is continuous and [0, +∞) is closed, gⱼ⁻¹([0, +∞)) is closed in Rⁿ; then C is closed because it is an intersection of closed sets.
23.
Show that the following set is closed:
S = {(x, y) ∈ R² : x ≥ 0, y ≥ 0, x + y ≤ 1}.
Solution:
It can be done by employing proposition 493, that is, S is closed ⟺ F(S) ⊆ S. Here
F(S) = {(x, y) : x = 0, y ∈ [0, 1]} ∪ {(x, y) : y = 0, x ∈ [0, 1]} ∪ {(x, y) : x + y = 1, x ∈ [0, 1]},
which clearly belongs to S; hence F(S) ⊆ S and S is closed.
24.
Given the continuous functions g : Rⁿ → Rᵐ, show that the following set is closed:
X = {x ∈ Rⁿ : g(x) ≥ 0}.
Solution:
The set is closed, but it is not necessarily compact. To show this we are going to use proposition 523. Given that g is a continuous function, we have to show that the preimage of a closed set is closed. Take Y = {y ∈ Rᵐ : y ≥ 0}, which is clearly closed, because F(Y) ⊆ Y. All conditions of proposition 523 are satisfied; then g⁻¹(Y) is closed. The set g⁻¹(Y) can be bounded or not; here are two examples. First, suppose gᵢ(x) = 4 − xᵢ² (an n-dimensional paraboloid component by component); then g⁻¹(Y) = [−2, 2]ⁿ, which is obviously compact in Rⁿ. Second example: the preimage for gᵢ(x) = e^{xᵢ} is Rⁿ, which is not bounded, so it is not compact.
25.
Assume that f : Rᵐ → Rⁿ is continuous. Say if
X = {x ∈ Rᵐ : f(x) = 0}
is (a) closed, (b) compact.
Solution:
This question is the same story as the exercise above. The set is closed, but it is not necessarily compact. By example 449, any finite set (in our case, the single point y = 0) in any metric space is compact; hence it is closed and bounded. As a result, X = f⁻¹({0}) is closed, but not necessarily compact. Compare
S = {x ∈ R² : x₁ + x₂ = 0} with S = {x ∈ R² : x₁² + x₂² = 0}:
the first is unbounded, the second is the single point 0.
26.
Using the characterization of continuous functions in terms of open sets, show that the following function is not continuous:
f : R → R, f(x) = x if x ≠ 0, and f(x) = 1 if x = 0.
Solution:
As before we are going to use Proposition 523; in our case, to show that the function is not continuous, we have to find an open set V of the codomain such that f⁻¹(V) is not open.
Let V = B(1, ε) = (1 − ε, 1 + ε) with ε ∈ (0, 1), an open ball around the value 1 of the codomain. Its preimage is f⁻¹(V) = {0} ∪ (1 − ε, 1 + ε): it contains 0 (since f(0) = 1 ∈ V), but 0 is not an interior point of f⁻¹(V). Hence f⁻¹(V) is not open, and f is not continuous.
27.
Using the Extreme Value Theorem, say if the following maximization problems have solutions:
max_{x∈Rⁿ} Σⁿᵢ₌₁ xᵢ s.t. ‖x‖ ≤ 1;
max_{x∈Rⁿ} Σⁿᵢ₌₁ xᵢ s.t. ‖x‖ < 1;
max_{x∈Rⁿ} Σⁿᵢ₌₁ xᵢ s.t. ‖x‖ ≥ 1.
Solution:
For the purpose of the Extreme Value Theorem, we first note that the objective function is continuous: Σⁿᵢ₌₁ xᵢ is linear, and it is continuous because the sum of continuous functions is continuous by Proposition 513. Therefore, to check for the existence of solutions we only have to check the compactness of the constraint sets.
The first set is closed, because it is the inverse image of the closed set [0, 1] via the continuous function ‖·‖, and it is bounded by definition. Therefore the set is compact and the function is continuous: we can apply the Extreme Value Theorem, and a solution exists. The second set is not closed, therefore it is not compact and the Extreme Value Theorem cannot be applied (indeed, the supremum √n is not attained on ‖x‖ < 1). The third set is unbounded; as a result it is not compact and the Extreme Value Theorem cannot be applied (indeed, the objective is unbounded above on ‖x‖ ≥ 1).
22.2.2 Correspondences
Solutions to exercises on correspondences. (Below, β(p, w) := {x ∈ R^C₊ : px ≤ w} denotes the budget correspondence.)
1.
Since u is a continuous function, from the Extreme Value Theorem we are left with showing that, for every (p, w), β(p, w) is non-empty and compact, i.e., β is non-empty valued and compact valued.
x̂ = (w/(C p_c))^C_{c=1} ∈ β(p, w), so β(p, w) ≠ ∅.
β(p, w) is closed, because it is the intersection of the inverse images of two closed sets via continuous functions.
β(p, w) is bounded below by zero.
β(p, w) is bounded above because, for every c, x_c ≤ (w − Σ_{c′≠c} p_{c′} x_{c′})/p_c ≤ w/p_c, where the first inequality comes from the fact that px ≤ w, and the second inequality from the fact that p ∈ R^C₊₊ and x ∈ R^C₊.
2.
(Below, ξ(p, w) denotes the set of solutions of the utility maximization problem on β(p, w).)
(a) Consider x′, x″ ∈ ξ(p, w). We want to show that ∀λ ∈ [0, 1], x^λ := (1 − λ)x′ + λx″ ∈ ξ(p, w). Observe that u(x′) = u(x″) := u*. From the quasiconcavity of u, we have u(x^λ) ≥ u*. We are therefore left with showing that x^λ ∈ β(p, w), i.e., that β is convex valued. To see that, simply observe that
px^λ = (1 − λ)px′ + λpx″ ≤ (1 − λ)w + λw = w.
(b) Assume otherwise. Following exactly the same argument as above we have x′, x″ ∈ ξ(p, w) and px^λ ≤ w. Since u is strictly quasi-concave, we also have that u(x^λ) > u(x′) = u(x″) := u*, which contradicts the fact that x′, x″ ∈ ξ(p, w).
3.
We want to show that for every (p, w) the following is true: for every sequence {(pₙ, wₙ)}ₙ ⊆ R^C₊₊ × R₊₊ such that
(pₙ, wₙ) → (p, w), xₙ ∈ β(pₙ, wₙ), xₙ → x,
it is the case that x ∈ β(p, w).
Since xₙ ∈ β(pₙ, wₙ), we have that pₙxₙ ≤ wₙ. Taking limits on both sides, we get px ≤ w, i.e., x ∈ β(p, w).
4.
(a) We want to show that ∀y′, y″ ∈ y(p), ∀λ ∈ [0, 1], it is the case that y^λ := (1 − λ)y′ + λy″ ∈ y(p), i.e., y^λ ∈ Y and ∀y ∈ Y, py^λ ≥ py.
y^λ ∈ Y simply because Y is convex. Moreover
py^λ := (1 − λ)py′ + λpy″ ≥ (1 − λ)py + λpy = py,
where the inequality uses y′, y″ ∈ y(p).
(b) Suppose not; then ∃y′, y″ ∈ Y such that y′ ≠ y″ and such that
∀y ∈ Y, py′ = py″ ≥ py. (1)
Since Y is strictly convex, ∀λ ∈ (0, 1), y^λ := (1 − λ)y′ + λy″ ∈ Int Y. Then ∃ε > 0 such that B(y^λ, ε) ⊆ Y. Consider y^ε := y^λ + (ε/(2C)) 1, where 1 := (1, ..., 1) ∈ R^C. Then d(y^λ, y^ε) = √(Σ^C_{c=1} (ε/(2C))²) = ε/(2√C) < ε. Then y^ε ∈ Y and, since p ≫ 0, we have that py^ε > py^λ = py′ = py″, contradicting (1).
5.
This exercise is taken from Beavis and Dobbs (1990), pages 74-78.
[Figures: graphs of the correspondences φ₁ and φ₂ on [0, 2].]
For every x ∈ [0, 2], both φ₁(x) and φ₂(x) are closed, bounded intervals and therefore convex and compact sets. Clearly φ₁ is closed and φ₂ is not closed.
φ₁ and φ₂ are clearly UHC and LHC for x ≠ 1. Using the definitions, it is easy to see that, for x = 1, φ₁ is UHC and not LHC, and φ₂ is LHC and not UHC.
6.
[Figure: graph of the correspondence φ, with φ(x) = {sin(1/x)} for x > 0 and φ(0) = [−1, 1].]
For every x > 0, φ is a continuous function. Therefore, for those values of x, φ is both UHC and LHC.
φ is UHC in 0: for every open neighborhood V of φ(0) = [−1, 1] and for any x in any neighborhood of {0} in R₊, φ(x) ⊆ [−1, 1] ⊆ V.
φ is not LHC in 0. Take the open set V = (1/2, 3/2); we want to show that ∀δ > 0 there exists z_δ ∈ (0, δ) such that φ(z_δ) ∩ V = ∅. Take n ∈ N such that 1/(nπ) < δ and z_δ = 1/(nπ). Then 0 < z_δ < δ and
sin(1/z_δ) = sin(nπ) = 0 ∉ (1/2, 3/2).
Since φ is UHC and closed valued, from Proposition 16 φ is closed.
is not closed. Take x
n
=

2
2n
[0, 1] for every n N. Observe that x
n
0. For every
n N, y
n
= 1 (x
n
) and y
n
1. But 1 (0) = [0, 1] .
22.2. SOME TOPOLOGY IN METRIC SPACES 289
is not UHC. Take x = 0 and a neighborhood V =

1
2
,
3
2

of (0) = [0, 1] . Then > 0, x


(0, ) \Q. Therefore, (x

) = [1, 0] * V.
is not LHC. Take x = 0 and the open set V =

1
2
,
3
2

.Then (0)

1
2
,
3
2

= [0, 1]

1
2
,
3
2

1
2
, 1

6= . But, as above, > 0, x

(0, ) \Q. . Then (x

) V = [1, 0]

1
2
,
3
2

= .
8.
(This exercise is taken from Klein, E. (1973), Mathematical Methods in Theoretical Economics, Academic Press, New York, NY, page 119.)
Observe that φ₃(x) = [x² − 2, x² − 1].
[Figure: graphs of the correspondences on [0, 3].]
graph φ₁ = {(x, y) ∈ R² : x ∈ [0, 3], y ≥ x² − 2, y ≤ x²}. The graph is defined in terms of weak inequalities and continuous functions, so it is closed, and therefore φ₁ is closed. A similar argument applies to φ₂ and φ₃.
Since [−10, 10] is a compact set such that φ₁([0, 3]) ⊆ [−10, 10], from Proposition 17 φ₁ is UHC. A similar argument applies to φ₂ and φ₃.
φ₁ is LHC. Take an arbitrary x ∈ [0, 3] and an open set V with non-empty intersection with φ₁(x) = [x² − 2, x²]. To fix ideas, take V = (x² − η, x² + η), with η ∈ (0, x²). Then take U = (√(x² − η), √(x² + η)). Then for every x′ ∈ U, x′² ∈ V ∩ φ₁(x′).
[Figure: the graph of φ₁ together with the sets V and U.]
A similar argument applies to φ₂ and φ₃.
22.3 Differential Calculus in Euclidean Spaces
1.
f′((x₀, y₀); (λ₁, λ₂)) = lim_{h→0} [f(x₀ + hλ₁, y₀ + hλ₂) − f(x₀, y₀)]/h =
lim_{h→0} (1/h) [2(x₀ + hλ₁)² − (x₀ + hλ₁)(y₀ + hλ₂) + (y₀ + hλ₂)² − 2x₀² + x₀y₀ − y₀²] =
x₀(4λ₁ − λ₂) + y₀(2λ₂ − λ₁)
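The limit can be delegated to a computer algebra system; a minimal sympy sketch of the same computation:

from sympy import symbols, limit

x0, y0, l1, l2, h = symbols('x0 y0 l1 l2 h')
f = lambda x, y: 2*x**2 - x*y + y**2

q = (f(x0 + h*l1, y0 + h*l2) - f(x0, y0)) / h
print(limit(q, h, 0).expand())  # 4*l1*x0 - l1*y0 - l2*x0 + 2*l2*y0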
2.
a) f is partially differentiable on R²\({0} × R):
∂f/∂x = arctan(y/x) − xy/(x² + y²), ∂f/∂y = x²/(x² + y²).
b) f is partially differentiable on R₊₊ × R:
∂f/∂x = y x^{y−1}, ∂f/∂y = x^y log x.
c) Note that (log f)′ = f′/f and f(x, y) = (sin(x + y))^{√(x+y)} = e^{√(x+y) log sin(x+y)}. Therefore
∂f/∂x = (sin(x + y))^{√(x+y)} ( log sin(x + y) / (2√(x + y)) + √(x + y) cos(x + y)/sin(x + y) ),
∂f/∂y = ∂f/∂x.
3.
a. Dₓf(0, 0) = lim_{h→0} [f(h, 0) − f(0, 0)]/h = lim_{h→0} 0/h = 0;
b. D_y f(0, 0) = lim_{k→0} [f(0, k) − f(0, 0)]/k = lim_{k→0} 0/k = 0;
c. The basic idea of the proof is to compute the limit for (x, y) going to (0, 0) along the line y = mx in the xy plane:
lim_{(x,y)→0} x(mx)/(x² + (mx)²) = m/(1 + m²).
Therefore the limit depends on m, and using that result a precise statement in terms of the definition of continuity can be given.
4.
f′((1, 1); (λ₁, λ₂)) = lim_{h→0} [f(1 + hλ₁, 1 + hλ₂) − f(1, 1)]/h =
lim_{h→0} (1/h) [ (2 + h(λ₁ + λ₂)) / ((1 + hλ₁)² + (1 + hλ₂)² + 1) − 2/3 ] =
lim_{h→0} [ −2h(λ₁² + λ₂²) − (λ₁ + λ₂) ] / [ 3((1 + hλ₁)² + (1 + hλ₂)² + 1) ] = −(λ₁ + λ₂)/9
5.
We will show the existence of a linear function T_{(x₀,y₀)}(x, y) = a(x − x₀) + b(y − y₀) such that the definition of differential is satisfied. After substituting, we want to show that
lim_{(x,y)→(x₀,y₀)} |x² − y² + xy − x₀² + y₀² − x₀y₀ − a(x − x₀) − b(y − y₀)| / √((x − x₀)² + (y − y₀)²) = 0.
Manipulate the numerator of the above to get NUM = |(x − x₀)(x + x₀ − a + y) − (y − y₀)(y + y₀ + b − x₀)|. Now the ratio R whose limit we are interested to obtain satisfies
0 ≤ R ≤ |x − x₀||x + x₀ − a + y| / √((x − x₀)² + (y − y₀)²) + |y − y₀||y + y₀ + b − x₀| / √((x − x₀)² + (y − y₀)²)
≤ |x + x₀ − a + y| + |y + y₀ + b − x₀|.
For a = 2x₀ + y₀ and b = x₀ − 2y₀, both terms go to zero as (x, y) → (x₀, y₀), and we get the limit of R equal to zero, as required.
6.
a) Given x₀ ∈ Rⁿ, we need to find T_{x₀} : Rⁿ → Rᵐ linear and E_{x₀} with lim_{v→0} E_{x₀}(v) = 0. Take T_{x₀} = l and E_{x₀} ≡ 0. By linearity, check that they satisfy the requirements.
b) A projection is linear, so by a) it is differentiable. Note that T_{x₀}((vᵢ)ⁿᵢ₌₁) = v₁ in this case.
7.
From the definition of continuity, we want to show that ∀x₀ ∈ Rⁿ, ∀ε > 0, ∃δ > 0 such that ‖x − x₀‖ < δ ⇒ ‖l(x) − l(x₀)‖ < ε. Defining [l] = A, we have that
‖l(x) − l(x₀)‖ = ‖A(x − x₀)‖ = ‖(R₁(A)(x − x₀), ..., Rₘ(A)(x − x₀))‖
≤(1) Σᵐᵢ₌₁ |Rᵢ(A)(x − x₀)| ≤(2) Σᵐᵢ₌₁ ‖Rᵢ(A)‖ ‖x − x₀‖ ≤ m (max_{i∈{1,...,m}} ‖Rᵢ(A)‖) ‖x − x₀‖, (22.11)
where (1) follows from Remark 56 and (2) from Proposition 53.4, i.e., the Cauchy–Schwarz inequality.
Therefore, if max_{i∈{1,...,m}} ‖Rᵢ(A)‖ = 0, we are done. Otherwise, take
δ = ε / (m max_{i∈{1,...,m}} ‖Rᵢ(A)‖).
Then we have that ‖x − x₀‖ < δ implies m (max_{i∈{1,...,m}} ‖Rᵢ(A)‖) ‖x − x₀‖ < ε and, from (22.11), ‖l(x) − l(x₀)‖ < ε, as desired.
8.
For f(x, y) = (cos x cos y, sin x sin y),
Df(x, y) = [−sin x cos y  −cos x sin y; cos x sin y  sin x cos y].
9.
For f(x, y, z) = (g(x)h(z), g(h(x))/y, e^{x g(h(x))}),
Df(x, y, z) = [g′(x)h(z)  0  g(x)h′(z); g′(h(x))h′(x)/y  −g(h(x))/y²  0; e^{x g(h(x))}(g(h(x)) + g′(h(x))h′(x)x)  0  0].
10.
a) Df = (1/(3x₁), 1/(6x₂), 1/(2x₃)); hence f is C¹ at x₀ and Df(x₀) = (1/3, 1/6, 1/4). This is, hence, also the representation of the total derivative at x₀, with f′(x₀, h) = 3/4.
b) Df = (2x₁ − 2x₂, 4x₂ − 2x₁ − 6x₃, 2x₃ − 6x₂), Df(x₀) = (2, 4, 2) and f′(x₀, h) = 0.
c) Df = (e^{x₁x₂} + x₁x₂e^{x₁x₂}, x₁²e^{x₁x₂}) and f′(x₀, h) = 2.
11.
Given
f(x, y, z) = (x² + y² + z²)^{−1/2},
show that if (x, y, z) ≠ 0, then
∂²f(x, y, z)/∂x² + ∂²f(x, y, z)/∂y² + ∂²f(x, y, z)/∂z² = 0.
Solution:
∂f(x, y, z)/∂x = −(1/2)(x² + y² + z²)^{−3/2} 2x,
∂²f(x, y, z)/∂x² = 3x²(x² + y² + z²)^{−5/2} − (x² + y² + z²)^{−3/2}.
The second partial derivatives with respect to y and z are the same up to a change of x into y or z. After summation we get
3(x² + y² + z²)(x² + y² + z²)^{−5/2} − 3(x² + y² + z²)^{−3/2} = 0, the desired equality.
12.
For f(x, y, z) = (g(x)/h(z), g(h(x)) + xy, log(g(x) + h(x))),
Df = [g′(x)/h(z)  0  −g(x)h′(z)/h(z)²; g′(h(x))h′(x) + y  x  0; (g′(x) + h′(x))/(g(x) + h(x))  0  0].
13.
Apply the chain rule:
Df(0, g(0)) = [1 + g′(0); g′(0)e^{g(0)} + 1].
14.
Dₓ(b ∘ a)(x) = D_y b(y)|_{y=a(x)} · Dₓa(x), with
D_y b(y) = [D_{y₁}g(y)  D_{y₂}g(y)  D_{y₃}g(y); D_{y₁}f(y)  D_{y₂}f(y)  D_{y₃}f(y)]|_{y=a(x)},
Dₓa(x) = [D_{x₁}f(x)  D_{x₂}f(x)  D_{x₃}f(x); D_{x₁}g(x)  D_{x₂}g(x)  D_{x₃}g(x); 1  0  0].
Therefore, writing the entries of D_y b evaluated at y = a(x) = (f(x), g(x), x₁),
Dₓ(b ∘ a)(x) = [D_{y₁}g  D_{y₂}g  D_{y₃}g; D_{y₁}f  D_{y₂}f  D_{y₃}f] · [D_{x₁}f  D_{x₂}f  D_{x₃}f; D_{x₁}g  D_{x₂}g  D_{x₃}g; 1  0  0] =
[D_{y₁}g D_{x₁}f + D_{y₂}g D_{x₁}g + D_{y₃}g,  D_{y₁}g D_{x₂}f + D_{y₂}g D_{x₂}g,  D_{y₁}g D_{x₃}f + D_{y₂}g D_{x₃}g;
D_{y₁}f D_{x₁}f + D_{y₂}f D_{x₁}g + D_{y₃}f,  D_{y₁}f D_{x₂}f + D_{y₂}f D_{x₂}g,  D_{y₁}f D_{x₃}f + D_{y₂}f D_{x₃}g].
15.
By the sufficient condition of differentiability, it is enough to show that the function f is C¹ (partial derivatives continuous). The partial derivatives are ∂f/∂x = 2x + y and ∂f/∂y = 2y + x; both are indeed continuous, so f is differentiable.
16.
i) f ∈ C¹, as Df(x, y, z) = (1 + 4xy² + 3yz, 3y² + 4x²y + 3xz, 1 + 3xy + 3z²) has continuous entries (everywhere, in particular around (x₀, y₀, z₀)).
ii) f(x₀, y₀, z₀) = 0 by direct calculation.
iii) f′_z = ∂f/∂z |_{(x₀,y₀,z₀)} = 7 ≠ 0, f′_y = ∂f/∂y |_{(x₀,y₀,z₀)} = 10 ≠ 0 and f′_x = ∂f/∂x |_{(x₀,y₀,z₀)} = 8 ≠ 0.
Therefore we can apply the Implicit Function Theorem around (x₀, y₀, z₀) = (1, 1, 1) to get
∂x/∂z = −f′_z/f′_x = −7/8, ∂y/∂z = −f′_z/f′_y = −7/10.
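A sympy spot-check of i)-iii); the explicit formula for f below is our reconstruction from the partial derivatives, with the constant chosen so that f(1, 1, 1) = 0:

from sympy import symbols, diff

x, y, z = symbols('x y z')
f = x + 2*x**2*y**2 + 3*x*y*z + y**3 + z + z**3 - 9  # reconstructed f

p = {x: 1, y: 1, z: 1}
print(f.subs(p))  # 0
fx, fy, fz = diff(f, x), diff(f, y), diff(f, z)
print(fx.subs(p), fy.subs(p), fz.subs(p))  # 8 10 7
print(-fz.subs(p) / fx.subs(p))            # -7/8
print(-fz.subs(p) / fy.subs(p))            # -7/10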
17.
a) Df = [2x₁  −2x₂  2  3; x₂  x₁  1  1] is continuous, det Dₓf(x, t) = 2x₁² + 2x₂² ≠ 0 except for x₁ = x₂ = 0 (so exclude this point from the domain). Finally
Dg(t) = −(1/(2x₁² + 2x₂²)) [2x₁ + 2x₂  3x₁ + 2x₂; 2x₁ − 2x₂  2x₁ − 3x₂].
b) Df = [2x₂  2x₁  1  −2t₂; 2x₁  2x₂  2t₁ − 2t₂  −2t₁ + 2t₂] is continuous, det Dₓf(x, t) = 4x₂² − 4x₁² ≠ 0 except for |x₁| = |x₂| (so exclude these points from the domain). Finally
Dg(t) = −(1/(4x₂² − 4x₁²)) [2x₂ − 4x₁t₁ + 4x₁t₂  4x₁t₁ − 4x₁t₂ − 4x₂t₂; −2x₁ + 4x₂t₁ − 4x₂t₂  4x₁t₂ − 4x₂t₁ + 4x₂t₂].
c) Df = [2  −3  2t₁  −2t₂; 1  1  t₂  t₁] is continuous, det Dₓf(x, t) = 5 ≠ 0 always. Finally
Dg(t) = −(1/5) [2t₁ + 3t₂  3t₁ − 2t₂; 2t₂ − 2t₁  2t₁ + 2t₂].
18.
As an application of the Implicit Function Theorem applied to z³ − xz − y = 0, we have that
∂z/∂x = −[∂(z³ − xz − y)/∂x] / [∂(z³ − xz − y)/∂z] = z/(3z² − x),
if 3z² − x ≠ 0. Then
∂²z/∂x∂y = ∂/∂y [z(x, y)/(3(z(x, y))² − x)] = [z_y (3z² − x) − 6z² z_y] / (3z² − x)².
Since
z_y = ∂z/∂y = −[∂(z³ − xz − y)/∂y] / [∂(z³ − xz − y)/∂z] = 1/(3z² − x),
we get
∂²z/∂x∂y = (3z² − x − 6z²)/(3z² − x)³ = −(3z² + x)/(3z² − x)³.
19.
As an application of the Implicit Function Theorem, we have that the Marginal Rate of Substitution in (x₀, y₀) is
dy/dx |_{(x,y)=(x₀,y₀)} = −[∂(u(x, y) − k)/∂x] / [∂(u(x, y) − k)/∂y] |_{(x,y)=(x₀,y₀)} < 0.
Moreover
d²y/dx² = d/dx [−Dₓu(x, y(x)) / D_yu(x, y(x))]
= −[ (Dₓₓu(−) + Dₓᵧu(+) · (dy/dx)(−)) · D_yu(+) − (Dₓᵧu(+) + Dᵧᵧu(−) · (dy/dx)(−)) · Dₓu(+) ] / (D_yu)²(+) > 0,
where the signs in parentheses indicate the sign of each term under the standing assumptions on u; and therefore the function y(x) describing the indifference curves is convex.
22.4 Nonlinear Programming
(Problem 1 is taken from David Cass's problem sets for his Microeconomics course at the University of Pennsylvania.)
Exercise 1.
(a)
If α = 0, then f(x) = β. The constant function is concave and therefore pseudo-concave and quasi-concave, but not strictly concave.
If α > 0, f′(x) = αβx^{α−1}, f″(x) = α(α − 1)βx^{α−2}.
f″(x) ≤ 0 ⟺ α(α − 1) ≤ 0 (for β ≥ 0) ⟺ 0 ≤ α ≤ 1 ⇒ f concave ⇒ f quasi-concave.
f″(x) < 0 ⟺ (β > 0 and α ∈ (0, 1)) ⇒ f strictly concave.
(b)
The Hessian matrix of f(x) = Σᵢ βᵢxᵢ^{αᵢ} is
D²f(x) = diag( β₁α₁(α₁ − 1)x₁^{α₁−2}, ..., βₙαₙ(αₙ − 1)xₙ^{αₙ−2} ).
D²f(x) is negative semidefinite (∀i, αᵢ ∈ [0, 1]) ⇒ f is concave.
D²f(x) is negative definite (∀i, βᵢ > 0 and αᵢ ∈ (0, 1)) ⇒ f is strictly concave.
We consider the case n = 2. The bordered Hessian matrix is
B(f(x)) = [0  β₁α₁x₁^{α₁−1}  β₂α₂x₂^{α₂−1}; β₁α₁x₁^{α₁−1}  β₁α₁(α₁−1)x₁^{α₁−2}  0; β₂α₂x₂^{α₂−1}  0  β₂α₂(α₂−1)x₂^{α₂−2}].
The determinants of the relevant leading principal minors are
det [0  β₁α₁x₁^{α₁−1}; β₁α₁x₁^{α₁−1}  β₁α₁(α₁−1)x₁^{α₁−2}] = −β₁²α₁²x₁^{2(α₁−1)} < 0,
det B(f(x)) = −(β₁α₁x₁^{α₁−1})² β₂α₂(α₂−1)x₂^{α₂−2} − (β₂α₂x₂^{α₂−1})² β₁α₁(α₁−1)x₁^{α₁−2} > 0
iff, for i = 1, 2, βᵢ > 0 and αᵢ ∈ (0, 1).
(c)
If α = 0, then f(x) = min{0, β} = 0.
If α > 0, the two lines y = αx and y = β intersect at x* := β/α.
[Figure: graphs of y = αx and y = β.]
f is clearly not strictly concave, because it is constant on a subset of its domain. Let's show it is concave, and therefore pseudo-concave and quasi-concave.
Given x′, x″ ∈ X, 3 cases are possible.
Case 1. x′, x″ ≤ x*.
Case 2. x′, x″ ≥ x*.
Case 3. x′ ≤ x* and x″ ≥ x*.
The most difficult case is case 3: we want to show that, for λ ∈ [0, 1],
(1 − λ)f(x′) + λf(x″) ≤ f((1 − λ)x′ + λx″).
Then, we have
(1 − λ)f(x′) + λf(x″) = (1 − λ) min{αx′, β} + λ min{αx″, β} = (1 − λ)αx′ + λβ.
Since, by construction, αx′ ≤ β,
(1 − λ)αx′ + λβ ≤ β;
since, by construction, αx″ ≥ β,
(1 − λ)αx′ + λβ ≤ (1 − λ)αx′ + λαx″ = α[(1 − λ)x′ + λx″].
Then (1 − λ)f(x′) + λf(x″) ≤ min{β, α[(1 − λ)x′ + λx″]} := f((1 − λ)x′ + λx″), as desired.
Exercise 2.
1. Canonical form.
For given λ ∈ (0, 1), a ∈ (0, +∞),
max_{(x,y)∈R²} λu(x) + (1 − λ)u(y) s.t. a − (1/2)x − y ≥ 0 (μ₁)
2a − 2x − y ≥ 0 (μ₂)
x ≥ 0 (μ₃)
y ≥ 0 (μ₄)
The system y = a − (1/2)x, y = 2a − 2x has solution (x, y) = ((2/3)a, (2/3)a).
[Figure: the constraint set.]
2. The set X and the functions f and g.
a. The domain of all functions is R².
b. X = R².
c. R² is open and convex.
d. Df(x, y) = (λu′(x), (1 − λ)u′(y)). The Hessian matrix is
[λu″(x)  0; 0  (1 − λ)u″(y)].
Therefore f and g are C² functions, f is strictly concave and the functions gⱼ are affine.
3. Existence.
C is closed and bounded below by (0, 0) and above by (a, a):
y ≤ a − (1/2)x ≤ a, 2x ≤ 2a − y ≤ 2a.
4. Number of solutions.
The solution is unique because f is strictly concave and the functions gⱼ are affine and therefore concave.
5. Necessity of K-T conditions.
The functions gⱼ are affine and therefore concave. Moreover x⁺⁺ = ((1/2)a, (1/2)a) satisfies the constraints with strict inequalities:
a − (1/2)(1/2)a − (1/2)a = (1/4)a > 0
2a − 2(1/2)a − (1/2)a = (1/2)a > 0
(1/2)a > 0
(1/2)a > 0
6. Sufficiency of K-T conditions.
f is strictly concave, and therefore pseudo-concave, and the functions gⱼ are affine.
7. K-T conditions.
L(x, y, μ₁, ..., μ₄; λ, a) = λu(x) + (1 − λ)u(y) + μ₁(a − (1/2)x − y) + μ₂(2a − 2x − y) + μ₃x + μ₄y.
λu′(x) − (1/2)μ₁ − 2μ₂ + μ₃ = 0
(1 − λ)u′(y) − μ₁ − μ₂ + μ₄ = 0
min{μ₁, a − (1/2)x − y} = 0
min{μ₂, 2a − 2x − y} = 0
min{μ₃, x} = 0
min{μ₄, y} = 0
8. "Solve the K-T conditions."
At the candidate x = y = (2/3)a both the first and the second constraint are binding, and x, y > 0, so μ₃ = μ₄ = 0. In the case μ₁ = 0:
λu′((2/3)a) − 2μ₂ = 0
(1 − λ)u′((2/3)a) − μ₂ = 0
⇒ μ₂ = (1/2)λu′((2/3)a) = (1 − λ)u′((2/3)a) > 0 ⟺ λ/2 = 1 − λ ⟺ λ = 2/3, ∀a ∈ R₊₊.
c. For the comparative statics, consider the case in which the first constraint is slack and the second one binds:
λu′(x) − 2μ₂ = 0
(1 − λ)u′(y) − μ₂ = 0
a − (1/2)x − y > 0, μ₁ = 0
2a − 2x − y = 0, μ₂ > 0
x > 0, μ₃ = 0
y > 0, μ₄ = 0.
The relevant system is therefore F(x, y, μ₂; λ, a) = 0, with
λu′(x) − 2μ₂ = 0
(1 − λ)u′(y) − μ₂ = 0
2a − 2x − y = 0.
The Jacobians are
D_{(x,y,μ₂)}F = [λu″(x)  0  −2; 0  (1 − λ)u″(y)  −1; −2  −1  0],
det D_{(x,y,μ₂)}F = −λu″(x) − 4(1 − λ)u″(y) > 0,
D_{(λ,a)}F = [u′(x)  0; −u′(y)  0; 0  2].
By the Implicit Function Theorem,
D_{(λ,a)}(x, y, μ₂) = −[D_{(x,y,μ₂)}F]⁻¹ D_{(λ,a)}F.
Using Maple (or computing the adjugate by hand),
[λu″(x)  0  −2; 0  (1 − λ)u″(y)  −1; −2  −1  0]⁻¹ =
(1/(−λu″(x) − 4(1 − λ)u″(y))) [−1  2  2(1 − λ)u″(y); 2  −4  λu″(x); 2(1 − λ)u″(y)  λu″(x)  λ(1 − λ)u″(x)u″(y)],
and therefore
D_{(λ,a)}(x, y, μ₂) = (1/(λu″(x) + 4(1 − λ)u″(y))) ·
[−u′(x) − 2u′(y)  4(1 − λ)u″(y); 2u′(x) + 4u′(y)  2λu″(x); 2(1 − λ)u″(y)u′(x) − λu″(x)u′(y)  2λ(1 − λ)u″(x)u″(y)].
Exercise 3.
1. OK.
2. Existence. ...
3. Canonical form.
max_{(x,y,m)∈R²₊₊×R} λ log x + (1 − λ) log y s.t. w₁ − m − x ≥ 0 (μₓ)
w₂ + m − y ≥ 0 (μᵧ)
2. Uniqueness.
The objective function is concave and the constraint functions are linear; uniqueness is not insured on the basis of the sufficient conditions presented in the notes.
3. Necessity of Kuhn-Tucker conditions.
The constraint functions are linear. Choose (x, y, m)⁺⁺ = (w₁/2, w₂/2, 0).
4. Sufficiency of Kuhn-Tucker conditions.
OK.
5. Kuhn-Tucker conditions.
DₓL = 0: λ/x − μₓ = 0
DᵧL = 0: (1 − λ)/y − μᵧ = 0
DₘL = 0: −μₓ + μᵧ = 0
min{μₓ, w₁ − m − x} = 0
min{μᵧ, w₂ + m − y} = 0
6. Solutions.
The constraints are binding, since μₓ = μᵧ =: μ > 0:
w₁ − m − x = 0
w₂ + m − y = 0
with x = λ/μ and y = (1 − λ)/μ. Then
w₁ − m = λ/μ, w₂ + m = (1 − λ)/μ,
and, summing, μ = 1/(w₁ + w₂), so
m = w₁ − λ(w₁ + w₂), x = λ(w₁ + w₂), y = (1 − λ)(w₁ + w₂).
b.
Computation of the derivative is straightforward. Otherwise, you can apply the Implicit Function Theorem. Let F be the function defined by the left-hand sides of the Kuhn-Tucker conditions:
λ/x − μₓ = 0
(1 − λ)/y − μᵧ = 0
−μₓ + μᵧ = 0
w₁ − m − x = 0
w₂ + m − y = 0.
The desired result follows from the expression below:
D_{(λ,w₁,w₂)}(x*, y*, m*) = −[D_{(x,y,m,μₓ,μᵧ)}F]⁻¹ D_{(λ,w₁,w₂)}F.
c. As an application of the Envelope Theorem, the desired result follows by computing the partial derivative of the Lagrange function with respect to λ:
D_λ(λ log x + (1 − λ) log y + μₓ(w₁ − m − x) + μᵧ(w₂ + m − y)) = log x* − log y*.
Exercise 4.
1. Existence.
The constraint set C is nonempty (0 belongs to it) and closed. It is bounded below by 0; y is bounded above by 2, and x is bounded above because of the first constraint: x ≤ 6 − y ≤ 6 (using y ≥ 0). Therefore C is compact.
2. Uniqueness.
C is convex. Let's compute the gradient and the Hessian matrix of the objective function −x² − y² + 4x + 6y:
gradient (−2x + 4, −2y + 6), Hessian [−2  0; 0  −2];
since the Hessian matrix is negative definite, the objective function is strictly concave. Therefore the solution is unique.
3. Canonical form.
max_{(x,y)∈R²} −x² − y² + 4x + 6y s.t. −x − y + 6 ≥ 0 (λ₁)
2 − y ≥ 0 (λ₂)
x ≥ 0 (μₓ)
y ≥ 0 (μᵧ)
4. Necessity of Kuhn-Tucker conditions.
Constraints are linear and therefore pseudo-concave. Take (x⁺⁺, y⁺⁺) = (1, 1).
5. Sufficiency of Kuhn-Tucker conditions.
The objective function is strictly concave and therefore pseudo-concave. Constraints are linear and therefore quasi-concave.
6. Lagrange function and Kuhn-Tucker conditions.
L(x, y, λ₁, λ₂, μₓ, μᵧ) = −x² − y² + 4x + 6y + λ₁(−x − y + 6) + λ₂(2 − y) + μₓx + μᵧy
DₓL = 0: −2x + 4 − λ₁ + μₓ = 0
DᵧL = 0: −2y + 6 − λ₁ − λ₂ + μᵧ = 0
min{−x − y + 6, λ₁} = 0
min{2 − y, λ₂} = 0
min{x, μₓ} = 0
min{y, μᵧ} = 0
b.
If x* = 0, then the first Kuhn-Tucker condition reads 4 − λ₁ + μₓ = 0. Since y ≤ 2, −y + 6 > 0 and therefore λ₁ = 0. But then μₓ = −4 < 0, which contradicts the Kuhn-Tucker conditions above.
c.
Substituting (x, y) = (2, 2) in the Kuhn-Tucker conditions:
−2·2 + 4 − λ₁ + μₓ = 0 ⇒ −λ₁ + μₓ = 0
−2·2 + 6 − λ₁ − λ₂ + μᵧ = 0 ⇒ 2 − λ₁ − λ₂ + μᵧ = 0
min{−2 − 2 + 6, λ₁} = min{2, λ₁} = 0 ⇒ λ₁ = 0
min{2 − 2, λ₂} = min{0, λ₂} = 0
min{2, μₓ} = 0 ⇒ μₓ = 0
min{2, μᵧ} = 0 ⇒ μᵧ = 0
From the second equation λ₂ = 2, and therefore
(x*, y*, λ₁*, λ₂*, μₓ*, μᵧ*) = (2, 2, 0, 2, 0, 0).
Exercise 5.
1. R³ is open and convex.
2. Existence.
The Extreme Value Theorem does not help (the constraint set is unbounded).
3. Canonical form.
max_{(x,y,z)∈R³} −x² − 2y² − 3z² + 2x s.t. −x − y − z + 2 ≥ 0 (λ)
x ≥ 0 (μ)
3. Uniqueness.
The constraint set is convex because the constraints are linear. The Hessian matrix of the objective function is
[−2  0  0; 0  −4  0; 0  0  −6].
Therefore the objective function is strictly concave and the solution is unique.
4. Necessity of the Kuhn-Tucker conditions.
The constraints are linear and therefore pseudo-concave; (x, y, z)⁺⁺ = (1/3, 1/3, 1/3).
5. Sufficiency of Kuhn-Tucker conditions.
The objective function is strictly concave and therefore pseudo-concave, and the constraints are linear and therefore quasi-concave.
6. Lagrange function and Kuhn-Tucker conditions.
L(x, y, z) = −x² − 2y² − 3z² + 2x + λ(−x − y − z + 2) + μx
∂L/∂x = 0: −2x + 2 − λ + μ = 0 (1)
∂L/∂y = 0: −4y − λ = 0 (2)
∂L/∂z = 0: −6z − λ = 0 (3)
min{−x − y − z + 2, λ} = 0 (4)
min{x, μ} = 0 (5)
We can conjecture that y = z = 0 and therefore, from (2), λ = 0. Then the objective function becomes −x² + 2x, which has a global maximum at 1. If x = 1, from (5), μ = 0. (1) and (4) are then verified.
Exercise 6. a.
1. Existence.
The objective function is continuous on R₊₊ × (−2, +∞).
The constraint set is closed because it is the inverse image of closed sets via continuous functions. It is bounded below by (1, 0). Moreover, from the third constraint, x ≤ 10 − y ≤ 10 and similarly y ≤ 10, i.e., the constraint set is bounded above by (10, 10). Therefore, as an application of the Extreme Value Theorem, a solution exists.
2. Uniqueness.
The constraint set is convex because it is an intersection of convex sets.
The Hessian of the objective function is [−3/x²  0; 0  −2/(2 + y)²], which is negative definite; therefore the function is strictly concave and the solution is unique.
3. Canonical form.
max_{(x,y)} 3 log x + 2 log(2 + y) s.t. y ≥ 0 (μᵧ)
x − 1 ≥ 0 (μₓ)
10 − x − y ≥ 0 (λ)
1550 − 150x − 200y ≥ 0 (χ)
4. Necessity of K-T.
The constraint functions are affine and therefore pseudo-concave, and (x⁺⁺, y⁺⁺) := (2, 1) satisfies the constraints with strict inequalities.
5. Sufficiency of K-T.
The objective function is strictly concave and therefore pseudo-concave, and the constraint functions are linear and therefore quasi-concave.
6. K-T conditions.
L(x, y, μₓ, μᵧ, λ, χ) := 3 log x + 2 log(2 + y) + μᵧy + μₓ(x − 1) + λ(10 − x − y) + χ(1550 − 150x − 200y)
DₓL = 3/x + μₓ − λ − 150χ = 0
DᵧL = 2/(2 + y) + μᵧ − λ − 200χ = 0
y ≥ 0, μᵧ ≥ 0, μᵧy = 0,
x − 1 ≥ 0, μₓ ≥ 0, μₓ(x − 1) = 0,
10 − x − y ≥ 0, λ ≥ 0, λ(10 − x − y) = 0,
1550 − 150x − 200y ≥ 0, χ ≥ 0, χ(1550 − 150x − 200y) = 0.
7. Solve the K-T conditions.
Following the suggestion (see Kreps for the way the suggestion is obtained), we look for a solution with μᵧ = 0, y > 0; μₓ = 0, x > 1; λ = 0, 10 − x − y ≥ 0; χ > 0, 1550 − 150x − 200y = 0. The Kuhn-Tucker system then becomes
3/x − 150χ = 0
2/(2 + y) − 200χ = 0
1550 − 150x − 200y = 0,
whose solution is (χ, y, x) = (1/390, 19/10, 39/5).
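A quick numerical sanity check of this candidate (plain Python; the residuals are zero up to floating-point rounding):

x, y, chi = 39/5, 19/10, 1/390

print(3/x - 150*chi)         # ~0: first first-order condition
print(2/(2 + y) - 200*chi)   # ~0: second first-order condition
print(1550 - 150*x - 200*y)  # ~0: binding constraint
print(10 - x - y)            # 0.3 > 0: slack constraint, so lambda = 0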
Exercise 6. b.
1. Existence.
The objective function is continuous on Rⁿ.
The constraint set is closed because it is the inverse image of a closed set via a continuous function. It is bounded: for any j, we have αⱼxⱼ² ≤ 1 − Σ_{i≠j} αᵢxᵢ² ≤ 1 and αⱼxⱼ² ≥ 0, and finally
−√(1/αⱼ) ≤ xⱼ ≤ √(1/αⱼ).
Therefore, as an application of the Extreme Value Theorem, a solution exists.
2. Uniqueness.
The objective function is linear and strictly increasing. The Hessian of the constraint function is
−2 diag(α₁, ..., αₙ),
which is clearly negative definite. Therefore the constraint function is strictly concave, and uniqueness follows.
3. Canonical form.
max_{x:=(xᵢ)ⁿᵢ₌₁∈Rⁿ} Σⁿᵢ₌₁ xᵢ s.t. 1 − Σⁿᵢ₌₁ αᵢxᵢ² ≥ 0.
4. Necessity of K-T.
The constraint function is strictly concave and therefore pseudo-concave. Take, ∀i, xᵢ⁺⁺ := √(1/(2nαᵢ)); then
1 − Σⁿᵢ₌₁ αᵢ(xᵢ⁺⁺)² = 1 − Σⁿᵢ₌₁ 1/(2n) = 1/2 > 0.
5. Sufficiency of K-T.
The objective function is linear and therefore pseudo-concave, and the constraint function is strictly concave and therefore quasi-concave.
6. K-T conditions.
L(x, λ; α) := Σⁿᵢ₌₁ xᵢ + λ(1 − Σⁿᵢ₌₁ αᵢxᵢ²)
D_{xᵢ}L = 1 − 2λαᵢxᵢ = 0
1 − Σⁿᵢ₌₁ αᵢxᵢ² ≥ 0, λ ≥ 0, λ(1 − Σⁿᵢ₌₁ αᵢxᵢ²) = 0.
7. Solve the K-T conditions.
Observe that if λ = 0, from the first order conditions we get 1 = 0. Therefore it must be λ > 0. Then we have:
1 − 2λαᵢxᵢ = 0, i = 1, ..., n, and 1 − Σⁿᵢ₌₁ αᵢxᵢ² = 0
⇒ xᵢ = 1/(2λαᵢ), i = 1, ..., n, and 1 − Σⁿᵢ₌₁ αᵢ(1/(2λαᵢ))² = 0
⇒ xᵢ = 1/(2λαᵢ), i = 1, ..., n, and λ = (1/2)√(Σⁿᵢ₌₁ 1/αᵢ)
⇒ xᵢ = 1/(αᵢ √(Σⁿⱼ₌₁ 1/αⱼ)), i = 1, ..., n, and λ = (1/2)√(Σⁿᵢ₌₁ 1/αᵢ).
Exercise 6. c.
1. Existence.
The objective function is continuous on R³₊₊.
The constraint set is closed because it is the inverse image of closed sets via continuous functions. It is bounded below by (1, 1, 1). It is bounded above: suppose not; then there would exist a sequence {(xₙ, yₙ, zₙ)}ₙ∈N contained in the set and such that, say, limₙ→₊∞ xₙ = +∞; but since for any n, (yₙ, zₙ) ≥ (1, 1), the first constraint would be violated. Therefore, as an application of the Extreme Value Theorem, a solution exists.
2. Uniqueness.
The budget set is convex, because it is an intersection of convex sets.
The Hessian of the objective function is diag(−1/x², −1/y², −1/z²). Therefore the function is strictly concave and the solution is unique.
3. Canonical form.
max_{(x,y,z)∈R³₊₊} log x + log y + log z s.t. 100 − 4x² − y² − z² ≥ 0 (λ)
x − 1 ≥ 0 (μₓ)
y − 1 ≥ 0 (μᵧ)
z − 1 ≥ 0 (μ_z)
4. Necessity of K-T.
The Hessian of the first constraint function is diag(−8, −2, −2), which is clearly negative definite. Therefore the first constraint function is strictly concave and therefore pseudo-concave. The other constraint functions are linear and therefore pseudo-concave. Take (x⁺⁺, y⁺⁺, z⁺⁺) = (2, 2, 2); then the constraints are verified with strict inequality.
5. Sufficiency of K-T.
The objective function is concave and therefore pseudo-concave. The constraint functions are either strictly concave or linear and therefore quasi-concave.
6. K-T conditions.
L(x, y, z; λ, μₓ, μᵧ, μ_z) := log x + log y + log z + λ(100 − 4x² − y² − z²) + μₓ(x − 1) + μᵧ(y − 1) + μ_z(z − 1).
DₓL = 1/x − 8λx + μₓ = 0
DᵧL = 1/y − 2λy + μᵧ = 0
D_zL = 1/z − 2λz + μ_z = 0
λ ≥ 0, 100 − 4x² − y² − z² ≥ 0, λ(100 − 4x² − y² − z²) = 0
μₓ ≥ 0, x − 1 ≥ 0, μₓ(x − 1) = 0
μᵧ ≥ 0, y − 1 ≥ 0, μᵧ(y − 1) = 0
μ_z ≥ 0, z − 1 ≥ 0, μ_z(z − 1) = 0
7. Solve the K-T conditions.
Observe that if λ = 0, from the first order conditions we get μₓ = −1/x < 0, a contradiction. Moreover, it is reasonable to conjecture that x, y, z > 1. Hoping for the non-rare case, we look for a solution where μₓ = μᵧ = μ_z = 0. (Recall that we know that the solution exists and is unique.) We then get the following system:
1/x − 8λx = 0, 1/y − 2λy = 0, 1/z − 2λz = 0, 100 − 4x² − y² − z² = 0
⟺ 1 − 8λx² = 0, 1 − 2λy² = 0, 1 − 2λz² = 0, 100 − 4x² − y² − z² = 0.
Then
100 − 4/(8λ) − 1/(2λ) − 1/(2λ) = 100 − 3/(2λ) = 0, λ = 3/200;
x = √(1/(8λ)) = √(25/3) = 5/√3;
y = z = √(1/(2λ)) = √(100/3) = 10/√3.
Exercise 7.a.
1. Existence.
The objective function is continuous on $\mathbb{R}^3$.
The constraint set is closed, because it is the inverse image of closed sets via continuous functions. It is bounded below by $0$. It is bounded above: suppose not; then
if $c_1 \to +\infty$, from the first constraint it must be that $k \to -\infty$, which is impossible;
if $c_2 \to +\infty$, from the second constraint and the fact that $f' > 0$, it must be that $k \to +\infty$, violating the first constraint;
if $k \to +\infty$, the first constraint is violated.
Therefore, as an application of the Extreme Value Theorem, a solution exists.
2. Uniqueness.
The budget set is convex. Observe that $(c_1, c_2, k) \mapsto f(k) - c_2$ is a concave function and therefore quasi-concave.
The Hessian of the objective function is
$$\begin{bmatrix} u''(c_1) & 0 & 0 \\ 0 & u''(c_2) & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
Therefore the function is concave.
3. Canonical form.
$$\max_{(c_1, c_2, k) \in \mathbb{R}^3} \ u(c_1) + u(c_2) \quad \text{s.t.} \quad \begin{array}{ll} e - c_1 - k \geq 0 & (\lambda_1) \\ f(k) - c_2 \geq 0 & (\lambda_2) \\ c_1 \geq 0 & (\mu_1) \\ c_2 \geq 0 & (\mu_2) \\ k \geq 0. & (\mu_3) \end{array}$$
4. Necessity of K-T.
Observe that the Hessian of the second constraint function with respect to $(c_1, c_2, k)$ is
$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & f''(k) \end{bmatrix},$$
and therefore that constraint function is concave. Therefore, the constraint functions are linear or concave.
Take $\left(c_1^{++}, c_2^{++}, k^{++}\right) = \left(\frac{e}{4}, \frac{1}{2} f\left(\frac{e}{4}\right), \frac{e}{4}\right)$. Then the constraints are verified with strict inequality.
5. Sufficiency of K-T.
The objective function is concave and therefore pseudo-concave. The constraint functions are either concave or linear and therefore quasi-concave.
6. K-T conditions.
$$L(c_1, c_2, k; \lambda_1, \lambda_2, \mu_1, \mu_2, \mu_3) := u(c_1) + u(c_2) + \lambda_1 (e - c_1 - k) + \lambda_2 (f(k) - c_2) + \mu_1 c_1 + \mu_2 c_2 + \mu_3 k.$$
$$D_{c_1} L = u'(c_1) - \lambda_1 + \mu_1 = 0$$
$$D_{c_2} L = u'(c_2) - \lambda_2 + \mu_2 = 0$$
$$D_k L = -\lambda_1 + \lambda_2 f'(k) + \mu_3 = 0$$
$$\lambda_1 \geq 0, \quad e - c_1 - k \geq 0, \quad \lambda_1 (e - c_1 - k) = 0$$
$$\lambda_2 \geq 0, \quad f(k) - c_2 \geq 0, \quad \lambda_2 (f(k) - c_2) = 0$$
$$\mu_1 \geq 0, \quad c_1 \geq 0, \quad \mu_1 c_1 = 0$$
$$\mu_2 \geq 0, \quad c_2 \geq 0, \quad \mu_2 c_2 = 0$$
$$\mu_3 \geq 0, \quad k \geq 0, \quad \mu_3 k = 0$$
7. Solve the K-T conditions.
Since we are looking for a positive solution, we get
$$\begin{cases} u'(c_1) - \lambda_1 = 0 \\ u'(c_2) - \lambda_2 = 0 \\ -\lambda_1 + \lambda_2 f'(k) = 0 \\ e - c_1 - k = 0 \\ f(k) - c_2 = 0. \end{cases}$$
Observe that from the first two equations of the above system, $\lambda_1, \lambda_2 > 0$.
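The system above has no closed form for general $u$ and $f$, but it is straightforward to solve numerically. The following sketch does so for the hypothetical specification $u(c) = \log c$, $f(k) = \sqrt{k}$, $e = 1$ (chosen only for illustration; it is not part of the exercise):

    import numpy as np
    from scipy.optimize import fsolve

    e = 1.0

    def system(v):
        c1, c2, k, l1, l2 = v
        return [1/c1 - l1,                      # u'(c1) - lambda_1 = 0
                1/c2 - l2,                      # u'(c2) - lambda_2 = 0
                -l1 + l2 * 0.5 / np.sqrt(k),    # -lambda_1 + lambda_2 f'(k) = 0
                e - c1 - k,                     # first constraint binds
                np.sqrt(k) - c2]                # second constraint binds

    sol = fsolve(system, [0.5, 0.5, 0.5, 1.0, 1.0])
    print(sol)    # approximately c1 = 2/3, c2 = 1/sqrt(3), k = 1/3

For this specification one can also solve by hand: the third equation gives $\frac{1}{c_1} = \frac{1}{c_2} \cdot \frac{1}{2\sqrt{k}} = \frac{1}{2k}$, so $c_1 = 2k$, and the first constraint then gives $k = e/3$.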
Exercise 7.b.
1. Existence.
The objective function is continuous on $\mathbb{R}^2$.
The constraint set is closed, because it is the inverse image of closed sets via continuous functions. It is bounded below by $0$. It is bounded above: suppose not; then, if $x \to +\infty$, from the first constraint ($px + wl = w\bar{l}$) it must be that $l \to -\infty$, which is impossible. A similar argument applies if $l \to +\infty$.
Therefore, as an application of the Extreme Value Theorem, a solution exists.
2. Uniqueness.
The budget set is convex. The objective function is differentiably strictly quasi-concave and therefore strictly quasi-concave, and the solution is unique.
3. Canonical form.
$$\max_{(x, l) \in \mathbb{R}^2} \ u(x, l) \quad \text{s.t.} \quad \begin{array}{l} -px - wl + w\bar{l} = 0 \\ \bar{l} - l \geq 0 \\ x \geq 0 \\ l \geq 0. \end{array}$$
4. Necessity of K-T.
The equality constraint is linear; the other constraints are linear and therefore concave.
5. Sufficiency of K-T.
The objective function is differentiably strictly quasi-concave and therefore pseudo-concave. The equality constraint is linear; the inequality constraints are linear and therefore quasi-concave.
6. K-T conditions.
$$L\left(x, l; \lambda_1, \lambda_2, \lambda_3, \lambda_4; p, w, \bar{l}\right) := u(x, l) + \lambda_1 \left(-px - wl + w\bar{l}\right) + \lambda_2 \left(\bar{l} - l\right) + \lambda_3 x + \lambda_4 l.$$
$$D_x L = D_x u - \lambda_1 p + \lambda_3 = 0$$
$$D_l L = D_l u - \lambda_1 w - \lambda_2 + \lambda_4 = 0$$
$$-px - wl + w\bar{l} = 0$$
$$\lambda_2 \geq 0, \quad \bar{l} - l \geq 0, \quad \lambda_2 \left(\bar{l} - l\right) = 0$$
$$\lambda_3 \geq 0, \quad x \geq 0, \quad \lambda_3 x = 0$$
$$\lambda_4 \geq 0, \quad l \geq 0, \quad \lambda_4 l = 0$$
7. Solve the K-T conditions.
Looking for non-corner solutions, i.e., solutions at which $x > 0$ and $0 < l < \bar{l}$, we get $\lambda_2 = \lambda_3 = \lambda_4 = 0$ and
$$\begin{cases} D_x u - \lambda_1 p = 0 \\ D_l u - \lambda_1 w = 0 \\ -px - wl + w\bar{l} = 0. \end{cases}$$
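For a concrete illustration, the following sketch solves these conditions in closed form for the hypothetical Cobb-Douglas specification $u(x, l) = a \log x + (1 - a) \log l$ (this particular $u$ is not imposed by the exercise):

    # From a = lam1*p*x and 1-a = lam1*w*l, summing gives 1 = lam1*(p*x + w*l)
    # = lam1*w*lbar, from which the demands follow.
    a, p, w, lbar = 0.3, 2.0, 1.0, 10.0    # illustrative parameter values

    lam1 = 1 / (w * lbar)
    x = a * w * lbar / p                   # a/x = lam1*p
    l = (1 - a) * lbar                     # (1-a)/l = lam1*w

    print(abs(a/x - lam1*p) < 1e-12, abs((1-a)/l - lam1*w) < 1e-12)
    print(abs(p*x + w*l - w*lbar) < 1e-12, 0 < l < lbar)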
Exercise 8.a.
Let's apply the Implicit Function Theorem (:= IFT) to the conditions found in Exercise 3.(a). Writing them in the usual informal way, we have:
$$\begin{array}{l|ccccc|cc}
 & c_1 & c_2 & k & \lambda_1 & \lambda_2 & e & a \\ \hline
u'(c_1) - \lambda_1 = 0 & u''(c_1) & & & -1 & & & \\
u'(c_2) - \lambda_2 = 0 & & u''(c_2) & & & -1 & & \\
-\lambda_1 + \lambda_2 f'(k) = 0 & & & \lambda_2 f''(k) & -1 & f'(k) & & \lambda_2 k \\
e - c_1 - k = 0 & -1 & & -1 & & & 1 & \\
f(k) - c_2 = 0 & & -1 & f'(k) & & & & k
\end{array}$$
To apply the IFT, we need to check that the following matrix has full rank:
$$M := \begin{bmatrix} u''(c_1) & 0 & 0 & -1 & 0 \\ 0 & u''(c_2) & 0 & 0 & -1 \\ 0 & 0 & \lambda_2 f''(k) & -1 & f'(k) \\ -1 & 0 & -1 & 0 & 0 \\ 0 & -1 & f'(k) & 0 & 0 \end{bmatrix}.$$
Suppose not; then there exists $\Delta := (\Delta c_1, \Delta c_2, \Delta k, \Delta\lambda_1, \Delta\lambda_2) \neq 0$ such that $M\Delta = 0$, i.e.,
$$\begin{cases} u''(c_1)\,\Delta c_1 - \Delta\lambda_1 = 0 \\ u''(c_2)\,\Delta c_2 - \Delta\lambda_2 = 0 \\ \lambda_2 f''(k)\,\Delta k - \Delta\lambda_1 + f'(k)\,\Delta\lambda_2 = 0 \\ -\Delta c_1 - \Delta k = 0 \\ -\Delta c_2 + f'(k)\,\Delta k = 0. \end{cases}$$
Recall that
$$[M\Delta = 0 \Rightarrow \Delta = 0] \quad \text{iff} \quad M \text{ has full rank.}$$
The idea of the proof is either to prove directly that $[M\Delta = 0 \Rightarrow \Delta = 0]$, or to assume $M\Delta = 0$ and $\Delta \neq 0$ and get a contradiction.
If we define $\Delta c := (\Delta c_1, \Delta c_2)$, $\Delta\lambda := (\Delta\lambda_1, \Delta\lambda_2)$ and $D^2 := \begin{bmatrix} u''(c_1) & 0 \\ 0 & u''(c_2) \end{bmatrix}$, the above system can be rewritten as
$$\begin{cases} D^2 \Delta c - \Delta\lambda = 0 \\ \lambda_2 f''(k)\,\Delta k + [-1, f'(k)]\,\Delta\lambda = 0 \\ -\Delta c + \begin{bmatrix} -1 \\ f'(k) \end{bmatrix} \Delta k = 0. \end{cases}$$
Premultiplying the three equations by $\Delta c^T$, $\Delta k$ and $\Delta\lambda^T$, respectively, we get
$$\begin{cases} \Delta c^T D^2 \Delta c - \Delta c^T \Delta\lambda = 0 & (1) \\ \lambda_2 f''(k)\,(\Delta k)^2 + \Delta k\,[-1, f'(k)]\,\Delta\lambda = 0 & (2) \\ -\Delta\lambda^T \Delta c + \Delta\lambda^T \begin{bmatrix} -1 \\ f'(k) \end{bmatrix} \Delta k = 0. & (3) \end{cases}$$
Observe first that $\Delta k \neq 0$: otherwise the last two equations of the system give $\Delta c = 0$, and then the first two give $\Delta\lambda = 0$, so that $\Delta = 0$, a contradiction. Then
$$\Delta c^T D^2 \Delta c \overset{(1)}{=} \Delta c^T \Delta\lambda = \Delta\lambda^T \Delta c \overset{(3)}{=} \Delta\lambda^T \begin{bmatrix} -1 \\ f'(k) \end{bmatrix} \Delta k \overset{(2)}{=} -\lambda_2 f''(k)\,(\Delta k)^2 > 0,$$
while $\Delta c^T D^2 \Delta c = (\Delta c_1)^2\,u''(c_1) + (\Delta c_2)^2\,u''(c_2) < 0$, since $\Delta c_1 = -\Delta k \neq 0$. Since we got a contradiction, $M$ has full rank.
Therefore, in a neighborhood of the solution we have
$$D_{(e,a)}(c_1, c_2, k, \lambda_1, \lambda_2) = -\begin{bmatrix} u''(c_1) & 0 & 0 & -1 & 0 \\ 0 & u''(c_2) & 0 & 0 & -1 \\ 0 & 0 & \lambda_2 f''(k) & -1 & f'(k) \\ -1 & 0 & -1 & 0 & 0 \\ 0 & -1 & f'(k) & 0 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & \lambda_2 k \\ 1 & 0 \\ 0 & k \end{bmatrix}.$$
To compute the inverse of the above matrix, we can use the following fact about the inverse of a partitioned matrix (see Goldberger, (1964), page 27).
Let $A$ be an $n \times n$ nonsingular matrix partitioned as
$$A = \begin{bmatrix} E & F \\ G & H \end{bmatrix},$$
where $E$ is $n_1 \times n_1$, $F$ is $n_1 \times n_2$, $G$ is $n_2 \times n_1$, $H$ is $n_2 \times n_2$, and $n_1 + n_2 = n$. Suppose that $E$ and $D := H - GE^{-1}F$ are nonsingular. Then
$$A^{-1} = \begin{bmatrix} E^{-1}\left(I + FD^{-1}GE^{-1}\right) & -E^{-1}FD^{-1} \\ -D^{-1}GE^{-1} & D^{-1} \end{bmatrix}.$$
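This formula is easy to check numerically; here is a minimal sketch on a random matrix (the block sizes $n_1 = 3$, $n_2 = 2$ are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n1, n2 = 3, 2
    A = rng.normal(size=(n1 + n2, n1 + n2))
    E, F = A[:n1, :n1], A[:n1, n1:]
    G, H = A[n1:, :n1], A[n1:, n1:]

    Ei = np.linalg.inv(E)
    D = H - G @ Ei @ F                       # D := H - G E^{-1} F
    Di = np.linalg.inv(D)
    Ainv = np.block([[Ei @ (np.eye(n1) + F @ Di @ G @ Ei), -Ei @ F @ Di],
                     [-Di @ G @ Ei,                          Di]])

    print(np.allclose(Ainv, np.linalg.inv(A)))    # True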
In fact, using Maple, with the obviously simplified notation $u_1 := u''(c_1)$, $u_2 := u''(c_2)$, $f_1 := f'(k)$, $f_2 := f''(k)$, the inverse can be computed explicitly. Its entries are lengthy rational expressions, all with the same denominator
$$u_1 + u_2 f_1^2 + \lambda_2 f_2 = \det M < 0.$$
Therefore, the matrix
$$\begin{bmatrix} D_e c_1 & D_a c_1 \\ D_e c_2 & D_a c_2 \\ D_e k & D_a k \\ D_e \lambda_1 & D_a \lambda_1 \\ D_e \lambda_2 & D_a \lambda_2 \end{bmatrix}$$
is obtained by multiplying the inverse just described by the matrix of partial derivatives with respect to $(e, a)$, entry by entry. The entries we are interested in are the following.
Then
$$D_e c_1 = \frac{u_2 f_1^2 + \lambda_2 f_2}{u_1 + u_2 f_1^2 + \lambda_2 f_2} > 0,$$
since $u''(c_2) < 0$, $f'' < 0$ and $\lambda_2 > 0$ make the numerator negative, while $u''(c_1) < 0$ makes the denominator negative as well.
$$D_e c_2 = \frac{f_1 u_1}{u_1 + u_2 f_1^2 + \lambda_2 f_2} > 0,$$
since $f' > 0$ and $u''(c_1) < 0$ make the numerator negative, too.
$$D_a k = \frac{-\left(\lambda_2 k + u_2 f_1 k\right)}{u_1 + u_2 f_1^2 + \lambda_2 f_2},$$
which, since the denominator is negative, has sign equal to the sign of
$$\overset{(+)}{\lambda_2} - \overset{(+)}{\left(-u''(c_2)\right)}\,\overset{(+)}{f'(k)},$$
a difference of two positive terms; without further assumptions the sign is therefore ambiguous.
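As a sanity check on the comparative statics with respect to $e$, the following sketch re-solves the system at $e$ and $e + h$ and compares finite differences with the signs obtained above, continuing the hypothetical specification $u(c) = \log c$, $f(k) = \sqrt{k}$ used in an earlier sketch:

    import numpy as np
    from scipy.optimize import fsolve

    def solve(e):
        def system(v):
            c1, c2, k, l1, l2 = v
            return [1/c1 - l1, 1/c2 - l2,
                    -l1 + l2 * 0.5 / np.sqrt(k),
                    e - c1 - k, np.sqrt(k) - c2]
        return fsolve(system, [0.5, 0.5, 0.5, 1.0, 1.0])

    h = 1e-6
    lo, hi = solve(1.0), solve(1.0 + h)
    print((hi[0] - lo[0]) / h > 0,    # D_e c_1 > 0
          (hi[1] - lo[1]) / h > 0)    # D_e c_2 > 0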
Exercise 8.b.
Let's apply the Implicit Function Theorem to the conditions found in a previous exercise. Writing them in the usual informal way, we have:
$$\begin{array}{l|ccc|ccc}
 & x & l & \lambda_1 & p & w & \bar{l} \\ \hline
D_x u - \lambda_1 p = 0 & D^2_{xx} u & D^2_{xl} u & -p & -\lambda_1 & 0 & 0 \\
D_l u - \lambda_1 w = 0 & D^2_{xl} u & D^2_{ll} u & -w & 0 & -\lambda_1 & 0 \\
-px - wl + w\bar{l} = 0 & -p & -w & 0 & -x & \bar{l} - l & w
\end{array}$$
To apply the IFT, we need to check that the following matrix has full rank:
$$M := \begin{bmatrix} D^2_{xx} u & D^2_{xl} u & -p \\ D^2_{xl} u & D^2_{ll} u & -w \\ -p & -w & 0 \end{bmatrix}.$$
Defining $D^2 := \begin{bmatrix} D^2_{xx} u & D^2_{xl} u \\ D^2_{xl} u & D^2_{ll} u \end{bmatrix}$ and $q := \begin{bmatrix} p \\ w \end{bmatrix}$, we have $M = \begin{bmatrix} D^2 & -q \\ -q^T & 0 \end{bmatrix}$.
Suppose not; then there exists $\Delta := (\Delta y, \Delta\lambda) \in \left(\mathbb{R}^2 \times \mathbb{R}\right) \setminus \{0\}$ such that $M\Delta = 0$, i.e.,
$$\begin{cases} D^2 \Delta y - q\,\Delta\lambda = 0 & (1) \\ q^T \Delta y = 0. & (2) \end{cases}$$
We are going to show:
Step 1. $\Delta y \neq 0$; Step 2. $Du \cdot \Delta y = 0$; Step 3. It is not the case that $\Delta y^T D^2 \Delta y < 0$.
These results contradict the assumption of differentiable strict quasi-concavity of $u$.
Step 1.
Suppose $\Delta y = 0$. Since $q \gg 0$, from $(1)$ we get $\Delta\lambda = 0$, and therefore $\Delta = 0$, a contradiction.
Step 2.
From the First Order Conditions, we have
$$Du - \lambda_1 q = 0. \quad (3)$$
Then $Du \cdot \Delta y \overset{(3)}{=} \lambda_1 q^T \Delta y \overset{(2)}{=} 0$.
Step 3.
$$\Delta y^T D^2 \Delta y \overset{(1)}{=} \Delta y^T q\,\Delta\lambda \overset{(2)}{=} 0.$$
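A numerical illustration may help here: for the hypothetical utility $u(x, l) = \sqrt{xl}$ (differentiably strictly quasi-concave, with a Hessian that is singular everywhere), the bordered matrix $M$ is nonetheless invertible at the solution:

    import numpy as np

    p, w, lbar = 2.0, 1.0, 10.0
    x, l = w * lbar / (2 * p), lbar / 2           # interior solution for this u

    u_xx = -0.25 * x**-1.5 * l**0.5
    u_ll = -0.25 * x**0.5 * l**-1.5
    u_xl = 0.25 * (x * l)**-0.5
    D2 = np.array([[u_xx, u_xl], [u_xl, u_ll]])
    M = np.block([[D2, -np.array([[p], [w]])],
                  [-np.array([[p, w]]), np.zeros((1, 1))]])

    print(np.isclose(np.linalg.det(D2), 0))       # Hessian of u is singular...
    print(not np.isclose(np.linalg.det(M), 0))    # ...but M has full rank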
Therefore, in a neighborhood of the solution we have
$$D_{(p,w,\bar{l})}(x, l, \lambda_1) = -\begin{bmatrix} D^2_{xx} u & D^2_{xl} u & -p \\ D^2_{xl} u & D^2_{ll} u & -w \\ -p & -w & 0 \end{bmatrix}^{-1} \begin{bmatrix} -\lambda_1 & 0 & 0 \\ 0 & -\lambda_1 & 0 \\ -x & \bar{l} - l & w \end{bmatrix}.$$
Unfortunately, here we cannot use the formula seen in Exercise 4.(a), because the Hessian of the utility function is not necessarily nonsingular. We can invert the matrix using the definition of inverse. (For the inverse of a partitioned matrix with these characteristics, see also Dhrymes, P. J., (1978), Mathematics for Econometrics, 2nd edition, Springer-Verlag, New York, NY, Addendum, pages 142-144.)
With obvious notation ($d_x := D^2_{xx} u$, $d := D^2_{xl} u$, $d_l := D^2_{ll} u$) and using Maple, we get
$$\begin{bmatrix} d_x & d & -p \\ d & d_l & -w \\ -p & -w & 0 \end{bmatrix}^{-1} = \frac{1}{\Delta} \begin{bmatrix} w^2 & -pw & dw - pd_l \\ -pw & p^2 & dp - d_x w \\ dw - pd_l & dp - d_x w & d^2 - d_x d_l \end{bmatrix}, \qquad \Delta := d_x w^2 - 2dpw + p^2 d_l.$$
Therefore,
$$D_{(p,w,\bar{l})}(x, l, \lambda_1) = \frac{1}{\Delta} \begin{bmatrix} \lambda_1 w^2 + x(dw - pd_l) & -pw\lambda_1 - (\bar{l} - l)(dw - pd_l) & w(pd_l - dw) \\ -pw\lambda_1 + x(dp - d_x w) & p^2\lambda_1 - (\bar{l} - l)(dp - d_x w) & w(d_x w - dp) \\ \lambda_1(dw - pd_l) + x(d^2 - d_x d_l) & \lambda_1(dp - d_x w) - (\bar{l} - l)(d^2 - d_x d_l) & w(d_x d_l - d^2) \end{bmatrix}.$$
In particular,
$$D_p l = \frac{-pw\lambda_1 + x(dp - d_x w)}{d_x w^2 - 2dpw + p^2 d_l}, \qquad D_w l = \frac{p^2\lambda_1 - (\bar{l} - l)(dp - d_x w)}{d_x w^2 - 2dpw + p^2 d_l}.$$
The sign of these expressions is ambiguous, unless other assumptions are made.
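For the hypothetical Cobb-Douglas specification used in a previous sketch, the whole comparative-statics matrix can be checked against the closed-form demands; this is a minimal verification of the formula $D_{(p,w,\bar{l})}(x, l, \lambda_1) = -M^{-1}B$:

    import numpy as np

    a, p, w, lbar = 0.3, 2.0, 1.0, 10.0
    x, l, lam1 = a*w*lbar/p, (1 - a)*lbar, 1/(w*lbar)

    dx, d, dl = -a/x**2, 0.0, -(1 - a)/l**2      # entries of D^2 u at the solution
    M = np.array([[dx, d, -p], [d, dl, -w], [-p, -w, 0.0]])
    B = np.array([[-lam1, 0.0, 0.0], [0.0, -lam1, 0.0], [-x, lbar - l, w]])
    J = -np.linalg.inv(M) @ B                    # Jacobian from the IFT

    # derivatives of the explicit demands x = a*w*lbar/p, l = (1-a)*lbar,
    # lam1 = 1/(w*lbar), for comparison
    J_explicit = np.array([[-a*w*lbar/p**2, a*lbar/p, a*w/p],
                           [0.0, 0.0, 1 - a],
                           [0.0, -1/(w**2*lbar), -1/(w*lbar**2)]])
    print(np.allclose(J, J_explicit))            # True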
References
Apostol, T. M., (1967), Calculus, Volume 1, 2nd edition, John Wiley & Sons, New York, NY.
Apostol, T. M., (1974), Mathematical Analysis, 2nd edition, Addison-Wesley Publishing Company, Reading, MA.
Azariadis, C., (2000), Intertemporal Macroeconomics, Blackwell Publishers Ltd, Oxford, UK.
Balasko, Y., (1988), Foundations of the Theory of General Equilibrium, Academic Press, Boston, MA.
Bazaraa, M. S. and C. M. Shetty, (1976), Foundations of Optimization, Springer-Verlag, Berlin.
Bartle, R. G., (1964), The Elements of Real Analysis, John Wiley & Sons, New York, NY.
Cass, D., (1991), Nonlinear Programming for Economists, mimeo, University of Pennsylvania.
de la Fuente, A., (2000), Mathematical Methods and Models for Economists, Cambridge University Press, Cambridge, UK.
Dhrymes, P. J., (1978), Mathematics for Econometrics, 2nd edition, Springer-Verlag, New York, NY.
El-Hodiri, M. A., (1991), Extrema of Smooth Functions, Springer-Verlag, Berlin.
Goldberger, A. S., (1964), Econometric Theory, John Wiley & Sons, New York, NY.
Hildenbrand, W., (1974), Core and Equilibria of a Large Economy, Princeton University Press, Princeton, NJ.
Hirsch, M. W. and S. Smale, (1974), Differential Equations, Dynamical Systems and Linear Algebra, Academic Press, New York, NY.
Lang, S., (1971), Linear Algebra, 2nd edition, Addison-Wesley, Reading, MA.
Lang, S., (1986), Introduction to Linear Algebra, 2nd edition, Springer-Verlag, New York, NY.
Lipschutz, S., (1991), Linear Algebra, 2nd edition, McGraw-Hill, New York, NY.
Lipschutz, S., (1965), General Topology, McGraw-Hill, New York, NY.
Mangasarian, O. L., (1994), Nonlinear Programming, SIAM, Philadelphia, PA.
McLean, R., (1985), Class notes for the course of Mathematical Economics (708), University of Pennsylvania, Philadelphia, PA, mimeo.
Moore, J. C., (1999), Mathematical Methods for Economic Theory 1, Springer-Verlag, Berlin.
Moore, J. C., (1999), Mathematical Methods for Economic Theory 2, Springer-Verlag, Berlin.
Ok, E. A., (2007), Real Analysis with Economic Applications, Princeton University Press, Princeton, NJ.
Rudin, W., (1976), Principles of Mathematical Analysis, 3rd edition, McGraw-Hill, New York, NY.
Simon, C. P., (1986), Scalar and vector maximization: calculus techniques with economic applications, in Reiter, S. (ed.), Studies in Mathematical Economics, The Mathematical Association of America, pp. 62-159.
Simon, C. P. and L. Blume, (1994), Mathematics for Economists, Norton, New York, NY.
Simmons, G. F., (1963), Introduction to Topology and Modern Analysis, McGraw-Hill, New York, NY.
Smith, L., (1992), Linear Algebra, 2nd edition, Springer-Verlag, New York, NY.
Stokey, N. L. and R. E. Lucas Jr. (with E. C. Prescott), (1989), Recursive Methods in Economic Dynamics, Harvard University Press, Cambridge, MA.
Sydsaeter, K., (1981), Topics in Mathematical Analysis for Economists, Academic Press, London, UK.
Taylor, A. E. and W. R. Mann, (1984), Advanced Calculus, 3rd edition, John Wiley & Sons, New York, NY.
Wrede, R. C. and M. Spiegel, (2002), Theory and Problems of Advanced Calculus, 2nd edition, Schaum's Outlines, McGraw-Hill, New York, NY.