
mlbaker presents
(lambertw.wordpress.com)
MATH 245
Advanced Linear Algebra 2
Instructor: Stephen New
Term: Spring 2011 (1115)
University of Waterloo
Finalized: August 2011
Disclaimer: These notes are provided as-is, and may be incomplete or contain errors.
Contents
1 Affine spaces 1
1.1 Affine spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Affine independence and span . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Convex sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Simplices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Orthogonality 5
2.1 Norms, distances, angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Orthogonal complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Orthogonal projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Applications of orthogonality 9
3.1 Circumcenter of a simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Polynomial interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3 Least-squares best fit polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
4 Cross product in R^n 13
4.1 Generalized cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4.2 Parallelotope volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
5 Spherical geometry 17
6 Inner product spaces 17
6.1 Abstract inner products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
6.2 Orthogonality and Gram-Schmidt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.3 Orthogonal complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
6.4 Orthogonal projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
7 Linear operators 28
7.1 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
7.2 Dual spaces and quotient spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
8 Adjoint of a linear operator 35
8.1 Similarity and triangularizability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
8.2 Self-adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
8.3 Singular value decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
9 Bilinear and quadratic forms 41
9.1 Bilinear forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
9.2 Quadratic forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
10 Jordan canonical form 49
Administrative
Website: http://math.uwaterloo.ca/~snew.
Office: MC 5163.
Email: snew@uwaterloo.ca
Textbook: We will cover most of Chapters 6 and 7 from Linear Algebra by Friedberg, Insel and Spence
(Sections 6.1–6.8, 7.1–7.4).
Midterm: Tuesday, June 7.
1 Affine spaces
1.1 Affine spaces
The set $\mathbb{R}^n$ is given by
$$\mathbb{R}^n = \left\{ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} : x_i \in \mathbb{R} \right\}.$$
A vector space in $\mathbb{R}^n$ is a set of the form $U = \operatorname{span}\{v_1, \ldots, v_\ell\}$.

Definition 1.1. An affine space in $\mathbb{R}^n$ is a set of the form $p + U = \{p + u \mid u \in U\}$ for some point $p \in \mathbb{R}^n$ and some vector space $U$ in $\mathbb{R}^n$.
Theorem 1.2. Let $P = p + U$ and $Q = q + V$ be two affine spaces. We have $p + U \subseteq q + V$ if and only if $q - p \in V$ and $U \subseteq V$.

Proof. Suppose that $p + U \subseteq q + V$. We have $p = p + 0 \in p + U$. Hence $p \in q + V$. Hence $p = q + v$ for some $v \in V$, by definition. From that equation, $q - p = -v \in V$. Let $u \in U$. Consider $p + u$. This is an element of $p + U$. But $p + U \subseteq q + V$. So we can write $p + u = q + v$ for some $v \in V$. Hence $u = -(q - p) + v$, which is in $V$ since $q - p, v \in V$. This finishes the proof of one direction.

Conversely, suppose $q - p \in V$ and $U \subseteq V$. We will show $p + U \subseteq q + V$. Let $u \in U$. Then $p + u = p + u + q - q = q + (u - (q - p)) \in q + V$.

Corollary 1.3. $p + U = q + V$ if and only if $q - p \in U$ and $U = V$.
Definition 1.4. The vector space $U$ is called the vector space associated to the affine space $p + U$. The dimension of an affine space is the dimension of its associated vector space. A 0-dimensional affine space in $\mathbb{R}^n$ is a point. A 1-dimensional affine space in $\mathbb{R}^n$ is called a line (this defines the word line). A 2-dimensional affine space in $\mathbb{R}^n$ is called a plane. An $(n-1)$-dimensional affine space in $\mathbb{R}^n$ is called a hyperplane.

Remark 1.5. To calculate the dimension of a solution space, you count the number of parameters (columns with no pivots in the row-reduced matrix).
Remark 1.6. A vector is the name given to an element of a vector space. In an ane space, they are called points.
1.2 Affine independence and span

Definition 1.7. Let $a_0, \ldots, a_\ell$ be points in $\mathbb{R}^n$. We let $\langle a_0, \ldots, a_\ell \rangle$ denote the smallest affine space in $\mathbb{R}^n$ that contains those points (that is, the intersection of all affine spaces in $\mathbb{R}^n$ which contain each $a_i$). This is called the affine span of the points.

Remark 1.8. To be rigorous, we should prove that an intersection of affine spaces is always an affine space.

Definition 1.9. We say that the points $a_0, \ldots, a_\ell$ are affinely independent if $\dim \langle a_0, \ldots, a_\ell \rangle = \ell$.
Theorem 1.10. Let $a_0, \ldots, a_\ell$ be points in $\mathbb{R}^n$. Let $u_k = a_k - a_0$ (for $1 \le k \le \ell$). Then $\langle a_0, \ldots, a_\ell \rangle = a_0 + \operatorname{span}\{u_1, \ldots, u_\ell\}$.

Proof. Let $U = \operatorname{span}\{u_1, \ldots, u_\ell\}$. We first show that $a_0 + U$ contains all the points $a_i$. Note $a_0 = a_0 + 0 \in a_0 + U$, and for the rest we can write $a_k = a_0 + a_k - a_0 = a_0 + u_k \in a_0 + U$. This proves that $\langle a_0, \ldots, a_\ell \rangle \subseteq a_0 + U$.

Next, we show that $a_0 + U \subseteq \langle a_0, \ldots, a_\ell \rangle$. So we need to show that $a_0 + U \subseteq q + V$ for every affine space $q + V$ which contains $a_0, \ldots, a_\ell$. So let $q + V$ be such an affine space. We must show that $q - a_0 \in V$ and $U$ is a subspace of $V$. Since $a_0 \in q + V$ we have $q - a_0 \in V$. Also, we have $a_k \in q + V$ for $k \ge 1$. Say $a_k = q + v_k$ for $v_k \in V$. Now $u_k = a_k - a_0 = q + v_k - a_0$. So $u_k \in V$ since $q - a_0, v_k \in V$. Hence $U$ is a subspace of $V$.
Figure 1: The setup for $\ell = 3$ (the points $a_0, a_1, a_2, a_3$ and the vectors $u_1, u_2, u_3$).
Corollary 1.11. We have the following:

(i)
$$\langle a_0, \ldots, a_\ell \rangle = a_0 + \operatorname{span}\{u_1, \ldots, u_\ell\} = \left\{ a_0 + \sum_{i=1}^{\ell} t_i u_i \;\middle|\; t_i \in \mathbb{R} \right\} = \left\{ \sum_{i=0}^{\ell} s_i a_i \;\middle|\; s_i \in \mathbb{R},\ \sum_{i=0}^{\ell} s_i = 1 \right\}$$

(ii) $a_0, \ldots, a_\ell$ is affinely independent if and only if $\{u_1, \ldots, u_\ell\}$ is linearly independent, where $u_k = a_k - a_0$.
1.3 Convex sets
Definition 1.12. Let $a, b \in \mathbb{R}^n$. The line segment $[a, b]$ is the set
$$\{a + t(b - a) \mid 0 \le t \le 1\}.$$
A set $C \subseteq \mathbb{R}^n$ is called convex if for all $a, b \in C$, we have $[a, b] \subseteq C$.

Definition 1.13. For a set $S \subseteq \mathbb{R}^n$, the convex hull of $S$, denoted by $[S]$, is the smallest convex set in $\mathbb{R}^n$ containing $S$, that is, the intersection of all convex sets which contain $S$. (We should prove that the intersection of convex sets is convex for full rigour; this is left as an exercise.)

Remark 1.14. When $S = \{a_0, \ldots, a_\ell\}$ we write $[S]$ as $[a_0, \ldots, a_\ell]$. Note that this agrees with the notation for the line segment (convex hull of two points).
Figure 2: Convex and non-convex subsets of the plane
1.4 Simplices
Definition 1.15. An (ordered, non-degenerate) $\ell$-simplex consists of an ordered $(\ell+1)$-tuple $(a_0, \ldots, a_\ell)$ of points $a_i \in \mathbb{R}^n$ with $a_0, \ldots, a_\ell$ affinely independent, together with the convex hull $[a_0, \ldots, a_\ell]$. A 0-simplex is a point, a 1-simplex is an ordered line segment, a 2-simplex is an ordered triangle, and a 3-simplex is an ordered tetrahedron.

Theorem 1.16. Let $a_0, \ldots, a_\ell$ be affinely independent points in $\mathbb{R}^n$. Then $[a_0, \ldots, a_\ell]$ can be written
$$\left\{ a_0 + \sum_{i=1}^{\ell} t_i u_i \;\middle|\; 0 \le t_i \in \mathbb{R},\ \sum_{i=1}^{\ell} t_i \le 1 \right\} = \left\{ \sum_{i=0}^{\ell} s_i a_i \;\middle|\; 0 \le s_i \in \mathbb{R},\ \sum_{i=0}^{\ell} s_i = 1 \right\}$$
where $u_k = a_k - a_0$.
Proof. Exercise. Check the statement of the theorem and then prove it.
Remark 1.17 (triangle facts). The medians of a triangle meet at a point called the centroid. The perpendicular
bisectors meet at the circumcenter (the center of the circle in which the triangle is inscribed). The angle bisectors
meet at the incenter (the center of the circle inscribed in the triangle). Furthermore, the altitudes meet at the
orthocenter. The cleavers meet at the cleavance center. We will think about higher-dimensional generalizations
of these notions.
Figure 3: The medial hyperplane $M_{1,2}$ (shaded) in a 3-simplex.
Definition 1.18. Let $[a_0, \ldots, a_\ell]$ be an $\ell$-simplex in $\mathbb{R}^n$. For $0 \le j < k \le \ell$, the $(j, k)$ medial hyperplane is defined to be the affine space $M_{j,k}$ which is the affine span of the points $a_i$ (for $i \ne j, k$) and the midpoint $\frac{1}{2}(a_j + a_k)$.
Theorem 1.19. Let $[a_0, \ldots, a_\ell]$ be an $\ell$-simplex in $\mathbb{R}^n$. The medial hyperplanes $M_{j,k}$ with $0 \le j < k \le \ell$ have a unique point of intersection $g$. This point $g$ is called the centroid of the $\ell$-simplex, and it is given by the average of the points $a_i$,
$$g = \frac{1}{\ell + 1} \sum_{i=0}^{\ell} a_i.$$
Proof. We first claim that $g$ lies in each $M_{j,k}$. We have
$$g = \frac{1}{\ell+1}\sum_{i=0}^{\ell} a_i = \frac{1}{\ell+1}\sum_{i \ne j,k} a_i + \frac{1}{\ell+1}(a_j + a_k) = \frac{1}{\ell+1}\sum_{i \ne j,k} a_i + \frac{2}{\ell+1}\left(\frac{a_j + a_k}{2}\right)$$
and the sum of the coefficients is
$$(\ell - 1)\frac{1}{\ell+1} + \frac{2}{\ell+1} = \frac{\ell+1}{\ell+1} = 1.$$
Hence $g \in M_{j,k}$ for all $0 \le j < k \le \ell$, hence
$$g \in \bigcap_{j,k} M_{j,k}.$$
Further, we claim $g$ is unique; it is the only point of intersection. Note that each $M_{j,k}$ really is a hyperplane. To see this, we show that the points $a_i$ ($i \ne j,k$) and $\frac{1}{2}(a_j + a_k)$ are affinely independent. Suppose
$$\sum_{i \ne j,k} s_i a_i + s\left(\frac{a_j + a_k}{2}\right) = 0 \qquad \text{and} \qquad s + \sum_{i \ne j,k} s_i = 0.$$
Then
$$s_0 a_0 + \ldots + \frac{s}{2} a_j + \ldots + \frac{s}{2} a_k + \ldots + s_\ell a_\ell = 0.$$
Also, the sum of these coefficients is zero, therefore each coefficient is zero since $a_0, \ldots, a_\ell$ is affinely independent. Thus $\{a_i \mid i \ne j,k\} \cup \{\frac{1}{2}(a_j + a_k)\}$ is affinely independent. This shows that each $M_{j,k}$ is indeed a hyperplane of dimension $\ell - 1$.
Note $a_k$ does not lie in the medial hyperplane $M_{0,k} = \langle \tfrac{1}{2}(a_0 + a_k), a_1, \ldots, a_{k-1}, a_{k+1}, \ldots, a_\ell \rangle$. Suppose to the contrary that $a_k \in M_{0,k}$, say
$$a_k = s_1 a_1 + \ldots + s_{k-1} a_{k-1} + s_k\left(\frac{a_0 + a_k}{2}\right) + s_{k+1} a_{k+1} + \ldots + s_\ell a_\ell$$
with $\sum_{i=1}^{\ell} s_i = 1$. Then
$$\frac{s_k}{2} a_0 + s_1 a_1 + \ldots + s_{k-1} a_{k-1} + \left(\frac{s_k}{2} - 1\right) a_k + s_{k+1} a_{k+1} + \ldots + s_\ell a_\ell = 0$$
and the sum of the coefficients is zero (since we moved $a_k$ to the other side). Since $a_0, \ldots, a_\ell$ is affinely independent, all of the coefficients
$$\frac{s_k}{2},\ s_1, \ldots, s_{k-1},\ \frac{s_k}{2} - 1,\ s_{k+1}, \ldots, s_\ell$$
must be zero. However, it is impossible that both $\frac{s_k}{2}$ and $\frac{s_k}{2} - 1$ are zero. So we have reached a contradiction, proving that $a_k \notin M_{0,k}$.
We now construct a sequence of affine spaces (formed by taking intersections of the medial hyperplanes) by putting $V_k := \bigcap_{i=1}^{k} M_{0,i}$ for $1 \le k \le \ell$. Hence $V_1 = M_{0,1}$, $V_2 = M_{0,1} \cap M_{0,2}$, and so on. To finish the proof, we note that because $a_{k+1} \in V_k$ but $a_{k+1} \notin V_{k+1}$, it follows that
$$\bigcap_{i=1}^{k+1} M_{0,i} \subsetneq \bigcap_{i=1}^{k} M_{0,i}.$$
In other words, each $V_{k+1}$ is properly contained within $V_k$ (for $1 \le k \le \ell - 1$). We know $\dim(M_{0,1}) = \ell - 1$ and $g \in V_\ell$ (as was previously shown), so $\dim(V_\ell) \ge 0$. By the above, $\dim(V_k) \le \ell - k$, and hence $\dim(V_\ell) = 0$. This demonstrates that $g$ is the unique point of intersection of the hyperplanes $M_{0,k}$ ($1 \le k \le \ell$). Thus $g$ is the unique point of intersection of the hyperplanes $M_{j,k}$ for $0 \le j < k \le \ell$.
2 Orthogonality
2.1 Norms, distances, angles
Definition 2.1. Let $u, v \in \mathbb{R}^n$. We define the dot product of $u$ with $v$ by
$$u \cdot v := u^t v = v^t u = \sum_{i=1}^{n} u_i v_i.$$
Theorem 2.2 (Properties of Dot Product). The dot product satisfies, for all $u, v, w \in \mathbb{R}^n$ and $t \in \mathbb{R}$:
1. [positive definite] $u \cdot u \ge 0$, holding with equality if and only if $u = 0$.
2. [symmetry] $u \cdot v = v \cdot u$.
3. [bilinearity] $(tu) \cdot v = t(u \cdot v) = u \cdot (tv)$, $(u + v) \cdot w = u \cdot w + v \cdot w$, and $u \cdot (v + w) = u \cdot v + u \cdot w$.

Proof. Easy.
Definition 2.3. For $u \in \mathbb{R}^n$ we define the length (or norm) of $u$ to be
$$|u| := \sqrt{u \cdot u}.$$
Theorem 2.4 (Properties of Length). Length satisfies, for all $u, v, w \in \mathbb{R}^n$ and $t \in \mathbb{R}$:
1. [positive definite] $|u| \ge 0$, holding with equality if and only if $u = 0$.
2. [homogeneous] $|tu| = |t|\,|u|$.
3. [polarization identity] $u \cdot v = \frac{1}{2}\big(|u+v|^2 - |u|^2 - |v|^2\big) = \frac{1}{4}\big(|u+v|^2 - |u-v|^2\big)$. (Law of Cosines)
4. [Cauchy–Schwarz inequality] $|u \cdot v| \le |u|\,|v|$, with equality if and only if $\{u, v\}$ is linearly dependent.
5. [triangle inequality] $|u+v| \le |u| + |v|$, and also $\big||u| - |v|\big| \le |u+v|$.
Proof. The first two are trivial; we will prove properties 3 to 5.

3. Note that
$$|u+v|^2 = (u+v)\cdot(u+v) = u \cdot u + 2(u \cdot v) + v \cdot v = |u|^2 + 2(u \cdot v) + |v|^2,$$
whereby the first equality follows immediately. We also have
$$|u-v|^2 = |u|^2 - 2(u \cdot v) + |v|^2.$$
Therefore $|u+v|^2 - |u-v|^2 = 4(u \cdot v)$, and the second equality falls out.

4. Suppose $\{u, v\}$ is linearly dependent, say $v = tu$. Then
$$|u \cdot v| = |u \cdot (tu)| = |t(u \cdot u)| = |t|\,|u \cdot u| = |t|\,|u|^2 = |u|\,|tu| = |u|\,|v|.$$
Now suppose $\{u, v\}$ is linearly independent. We will prove strict inequality. Note that for all $t \in \mathbb{R}$ we have $u + tv \ne 0$ by independence. Hence
$$0 < |u + tv|^2 = (u+tv)\cdot(u+tv) = |u|^2 + 2t(u \cdot v) + t^2|v|^2,$$
which is a quadratic in $t$ with no real roots. Hence the discriminant must be negative, that is,
$$(2(u \cdot v))^2 - 4|v|^2|u|^2 < 0,$$
whereby, taking square roots, we are done.
5. Note that
$$|u+v|^2 = |u|^2 + 2(u \cdot v) + |v|^2 \le |u|^2 + 2|u \cdot v| + |v|^2 \le |u|^2 + 2|u|\,|v| + |v|^2 = (|u| + |v|)^2$$
by applying Cauchy–Schwarz. Taking square roots, the proof follows. For the other inequality, note
$$|u| = |(u+v) - v| \le |u+v| + |v|,$$
so that $|u| - |v| \le |u+v|$. Similarly, $|v| - |u| \le |u+v|$.

We are done.
Definition 2.5. For $a, b \in \mathbb{R}^n$, the distance between $a$ and $b$ is defined to be
$$\operatorname{dist}(a, b) := |b - a|.$$

Theorem 2.6 (Properties of Distance). Distance satisfies, for all $a, b, c \in \mathbb{R}^n$:
1. [positive definite] $\operatorname{dist}(a, b) \ge 0$, holding with equality if and only if $a = b$.
2. [symmetric] $\operatorname{dist}(a, b) = \operatorname{dist}(b, a)$.
3. [triangle inequality] $\operatorname{dist}(a, c) \le \operatorname{dist}(a, b) + \operatorname{dist}(b, c)$.

Proof. Exercise.
Definition 2.7. Let $u \in \mathbb{R}^n$. We say $u$ is a unit vector if it has length 1.

Definition 2.8. For $0 \ne u, v \in \mathbb{R}^n$, we define the angle between them to be the angle
$$\theta(u, v) := \arccos\left(\frac{u \cdot v}{|u|\,|v|}\right) \in [0, \pi].$$
Theorem 2.9 (Properties of Angle). Angle satisfies, for all $0 \ne u, v \in \mathbb{R}^n$:
1. The following:
   $\theta(u, v) \in [0, \pi]$;
   $\theta(u, v) = 0$ if and only if $u = tv$ for some $0 < t \in \mathbb{R}$;
   $\theta(u, v) = \pi$ if and only if $u = tv$ for some $0 > t \in \mathbb{R}$;
   $\theta(u, v) = \frac{\pi}{2}$ if and only if $u \cdot v = 0$.
2. $\theta(u, v) = \theta(v, u)$.
3. $\theta(tu, v) = \theta(u, tv) = \begin{cases} \theta(u, v) & \text{if } 0 < t \in \mathbb{R} \\ \pi - \theta(u, v) & \text{if } 0 > t \in \mathbb{R}. \end{cases}$
4. The Law of Cosines holds. Put $\theta := \theta(u, v)$. Then
$$|v - u|^2 = |u|^2 + |v|^2 - 2|u|\,|v|\cos\theta.$$
5. Trigonometric ratios. Put $\theta := \theta(u, v)$ and suppose $(v - u) \cdot u = 0$. Then
$$\cos\theta = \frac{|u|}{|v|} \quad \text{and} \quad \sin\theta = \frac{|v - u|}{|v|}.$$

Proof. Exercise.
Remark 2.10. We have the following:
1. For points $a, b, c \in \mathbb{R}^n$ with $a \ne b$ and $b \ne c$, we define
$$\angle abc := \theta(a - b, c - b).$$
2. For $0 \ne u, v \in \mathbb{R}^2$ we can define the oriented angle from $u$ to $v$ to be the angle $\theta \in [0, 2\pi)$ with
$$\cos\theta = \frac{u \cdot v}{|u|\,|v|} \quad \text{and} \quad \sin\theta = \frac{\det(u, v)}{|u|\,|v|} = \frac{u_1 v_2 - u_2 v_1}{|u|\,|v|}.$$
In this case it is understood that
$$\det(u, v) := \det\begin{pmatrix} u_1 & v_1 \\ u_2 & v_2 \end{pmatrix}.$$
Definition 2.11. For $u, v \in \mathbb{R}^n$, we say that $u$ and $v$ are orthogonal (or perpendicular) when $u \cdot v = 0$.
2.2 Orthogonal complements
Definition 2.12. For a vector space $U$ in $\mathbb{R}^n$ we define the orthogonal complement of $U$ in $\mathbb{R}^n$ to be the vector space
$$U^\perp = \{v \in \mathbb{R}^n : v \cdot u = 0 \text{ for all } u \in U\}.$$
Theorem 2.13 (Properties of Orthogonal Complement). We have the following:
1. $U^\perp$ is a vector space in $\mathbb{R}^n$.
2. If $U = \operatorname{span}\{u_1, \ldots, u_\ell\}$ where each $u_i \in \mathbb{R}^n$, then
$$U^\perp = \{v \in \mathbb{R}^n : v \cdot u_i = 0 \text{ for all } i\}.$$
3. For $A \in M_{k \times n}(\mathbb{R})$, we have $(\operatorname{Row} A)^\perp = \operatorname{null} A$. For
$$A = \begin{pmatrix} r_1^t \\ \vdots \\ r_k^t \end{pmatrix}$$
we have $\operatorname{Row} A = \operatorname{span}\{r_1, \ldots, r_k\} \subseteq \mathbb{R}^n$. Note that
$$Ax = \begin{pmatrix} r_1^t x \\ \vdots \\ r_k^t x \end{pmatrix} = \begin{pmatrix} r_1 \cdot x \\ \vdots \\ r_k \cdot x \end{pmatrix}.$$
4. $\dim U + \dim U^\perp = n$, and $U \oplus U^\perp = \mathbb{R}^n$.
5. $(U^\perp)^\perp = U$.
6. $(\operatorname{null} A)^\perp = \operatorname{Row} A$.

Proof. We will prove #5; the rest are left as exercises.

5. Let $x \in U$. By definition of $U^\perp$, $x \cdot v = 0$ for all $v \in U^\perp$. By definition of $(U^\perp)^\perp$, we have $x \in (U^\perp)^\perp$, therefore $U \subseteq (U^\perp)^\perp$.
On the other hand, by #4, $\dim U + \dim(U^\perp) = n$. Also, $\dim U^\perp + \dim(U^\perp)^\perp = n$. Therefore, $\dim U = \dim(U^\perp)^\perp$. Since $U$ is a subspace of $(U^\perp)^\perp$ and $\dim U = \dim(U^\perp)^\perp$, it follows that $U = (U^\perp)^\perp$.
2.3 Orthogonal projection
Theorem 2.14. Let $A \in M_{k \times n}(\mathbb{R})$. Then $\operatorname{null}(A^t A) = \operatorname{null}(A)$.

Proof. For $x \in \mathbb{R}^n$, $x \in \operatorname{null}(A)$ implies $Ax = 0$, so $A^t A x = 0$. Therefore $x \in \operatorname{null}(A^t A)$.
If $x \in \operatorname{null}(A^t A)$ then $A^t A x = 0$, then $x^t A^t A x = 0$, so $(Ax)^t(Ax) = 0$, so $(Ax)\cdot(Ax) = 0$, so $|Ax|^2 = 0$, so $|Ax| = 0$, so $Ax = 0$, so $x \in \operatorname{null}(A)$.
Remark 2.15. For $A = (u_1, \ldots, u_n)$, we have
$$A^t A = \begin{pmatrix} u_1^t \\ \vdots \\ u_n^t \end{pmatrix}(u_1, \ldots, u_n) = \begin{pmatrix} u_1 \cdot u_1 & u_1 \cdot u_2 & u_1 \cdot u_3 & \ldots \\ u_2 \cdot u_1 & u_2 \cdot u_2 & & \ldots \\ \vdots & & \ddots & \end{pmatrix}.$$
Theorem 2.16 (Orthogonal Projection Theorem). Let $U$ be a subspace of $\mathbb{R}^n$. Then for any $x \in \mathbb{R}^n$ there exist unique vectors $u, v \in \mathbb{R}^n$ with $u \in U$ and $v \in U^\perp$ and $u + v = x$.

Proof. (uniqueness) Let $x \in \mathbb{R}^n$. Suppose $u \in U$, $v \in U^\perp$, $u + v = x$. Let $u_1, \ldots, u_\ell$ be a basis for $U$. Let $A$ be the matrix with columns $u_1$ up to $u_\ell$ ($A \in M_{n \times \ell}$), so $U = \operatorname{col}(A)$. Since $u \in U = \operatorname{col}(A)$, we have $u = At$ for some $t \in \mathbb{R}^\ell$. Since $v \in U^\perp = \operatorname{col}(A)^\perp = \operatorname{null} A^t$, we have $A^t v = 0$. We have
$$u + v = x \implies At + v = x \implies A^t A t = A^t x \ \ (\text{since } A^t v = 0) \implies t = (A^t A)^{-1} A^t x,$$
since $u_1, \ldots, u_\ell$ is linearly independent, so $\operatorname{rank}(A^t A) = \operatorname{rank}(A) = \ell$ and $A^t A \in M_\ell(\mathbb{R})$ is invertible. So
$$u = At = A(A^t A)^{-1} A^t x$$
and $v = x - u$.

(existence) Again, let $u_1, \ldots, u_\ell$ be a basis for $U$ and let $A = (u_1, \ldots, u_\ell)$, and let
$$u = A(A^t A)^{-1} A^t x, \qquad v = x - u.$$
Then clearly $u \in \operatorname{col}(A) = U$ and $u + v = x$. We need to show that $v \in U^\perp = (\operatorname{col} A)^\perp = \operatorname{null} A^t$:
$$A^t v = A^t(x - u) = A^t x - A^t A(A^t A)^{-1} A^t x = A^t x - (A^t A)(A^t A)^{-1} A^t x = A^t x - A^t x = 0.$$
So the proof is complete.
Definition 2.17. Let $U$ be a vector space in $\mathbb{R}^n$. Let $x \in \mathbb{R}^n$. Let $u$ and $v$ be the vectors of the above theorem with $u \in U$, $v \in U^\perp$, $u + v = x$. Then $u$ is called the orthogonal projection of $x$ onto $U$ and we write
$$u = \operatorname{Proj}_U(x).$$
Note that since $U = (U^\perp)^\perp$ it follows that
$$v = \operatorname{Proj}_{U^\perp}(x).$$
Example 2.18. When $U = \operatorname{span}\{u\}$ with $0 \ne u \in \mathbb{R}^n$, we can take $A = u$, hence
$$\operatorname{Proj}_U(x) = A(A^t A)^{-1} A^t x = u(u^t u)^{-1} u^t x = u(|u|^2)^{-1} u^t x = \frac{u u^t x}{|u|^2} = \frac{u(u \cdot x)}{|u|^2} = \frac{u \cdot x}{|u|^2}\, u.$$
We also write
$$\operatorname{Proj}_u(x) = \operatorname{Proj}_U(x) = \frac{u \cdot x}{|u|^2}\, u.$$
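As a numerical illustration of the general formula $\operatorname{Proj}_U(x) = A(A^tA)^{-1}A^tx$, here is a minimal NumPy sketch (my own addition, not part of the original notes; the matrix $A$ and the point $x$ are arbitrary examples):

import numpy as np

# Columns of A form a basis for the subspace U of R^3.
A = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [2.0, 3.0]])
x = np.array([1.0, 0.0, 0.0])

# Proj_U(x) = A (A^t A)^{-1} A^t x
P = A @ np.linalg.inv(A.T @ A) @ A.T     # projection matrix onto col(A)
u = P @ x                                # the projection of x onto U
v = x - u                                # the component in U^perp

# v should be orthogonal to every column of A (up to rounding error).
print(u, A.T @ v)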
Theorem 2.19. $u := \operatorname{Proj}_U(x)$ is the unique point in $U$ nearest to $x$.

Proof. Let $w \in U$ with $w \ne u$. Note that $(w - u) \cdot (x - u) = 0$ since $u, w \in U$ so $w - u \in U$, and $v = x - u \in U^\perp$ by the definition of $u = \operatorname{Proj}_U(x)$. By Pythagoras' theorem (a special case of the Law of Cosines), we note that
$$|w - x|^2 = |w - u|^2 + |x - u|^2 > |x - u|^2$$
since $|w - u|^2 > 0$ (because $w \ne u$).
Definition 2.20. Let $U$ be a subspace of $\mathbb{R}^n$ and let $x \in \mathbb{R}^n$. We define the reflection of $x$ in $U$ to be
$$\operatorname{Refl}_U(x) = \operatorname{Proj}_U(x) - \operatorname{Proj}_{U^\perp}(x) = x - 2\operatorname{Proj}_{U^\perp}(x) = 2\operatorname{Proj}_U(x) - x.$$
For an affine space $P = p + U \subseteq \mathbb{R}^n$ and a point $x \in \mathbb{R}^n$ we can also define
$$\operatorname{Proj}_{p+U}(x) = p + \operatorname{Proj}_U(x - p), \qquad \operatorname{Refl}_{p+U}(x) = p + \operatorname{Refl}_U(x - p).$$
3 Applications of orthogonality
We started off by talking about affine spaces (when you solve systems of equations, the solution set is, in general, an affine space: a vector space translated by a point). We also defined $\ell$-simplices, and saw that we can apply the ideas of linear algebra to talk about geometry in higher dimensions. We then talked about inner products (the dot product in $\mathbb{R}^n$ and its relationship to lengths and angles, orthogonality, orthogonal complements of vector spaces, orthogonal projections). Geometric applications of these include finding the circumcenter of a simplex by finding an intersection of perpendicular bisectors, and finding best-fit and interpolating polynomials.
3.1 Circumcenter of a simplex
Aside 3.1. In geometry, we study curves and surfaces and higher-dimensional versions of those. There are three main ways: a graph (draw a graph of a function of one variable, you get a curve; a graph of a function of two variables, you get a surface), you can do an implicit description, or you can do it parametrically.

For $f : U \subseteq \mathbb{R}^k \to \mathbb{R}^\ell$, the graph of $f$ is $\operatorname{Graph}(f) = \{(x, y) \in \mathbb{R}^{k+\ell} \mid y = f(x)\} \subseteq \mathbb{R}^{k+\ell}$; usually this is a $k$-dimensional version of a surface. The equation $y = f(x)$ is known as an explicit equation (it can be thought of as $\ell$ equations, actually). The kernel of $f$ is $\{x \in \mathbb{R}^k \mid f(x) = 0\}$. Usually this is $(k - \ell)$-dimensional. We also define the image of $f$, which is the set $\{f(x) \mid x \in U\} \subseteq \mathbb{R}^\ell$. Usually this is $k$-dimensional. (It is described parametrically, by parameter $x$.)

For example, the top half of a sphere can be described by $z = \sqrt{r^2 - (x^2 + y^2)}$. An implicit equation describing the whole sphere would be $x^2 + y^2 + z^2 - r^2 = 0$. The sphere is the kernel of $f(x, y, z) = x^2 + y^2 + z^2 - r^2$.
Definition 3.2. Let $P$ be an affine space in $\mathbb{R}^n$ and let $a, b$ be points in $\mathbb{R}^n$ with $a \ne b$. The perpendicular bisector of $[a, b]$ in $P$ is the set of all $x \in P$ such that $\left(x - \frac{a+b}{2}\right) \cdot (b - a) = 0$.

Theorem 3.3. $x$ is on the perpendicular bisector of $[a, b]$ if and only if $\operatorname{dist}(x, a) = \operatorname{dist}(x, b)$.

Proof. For $x \in P$, $x$ lies on the perpendicular bisector of $[a, b]$ if and only if $\left(x - \frac{a+b}{2}\right) \cdot (b - a) = 0$, if and only if $(2x - (a+b)) \cdot (b - a) = 0$. This holds iff $2x \cdot b - 2x \cdot a - a \cdot b + a \cdot a - b \cdot b + b \cdot a = 0$. This holds iff $a \cdot a - 2x \cdot a = b \cdot b - 2x \cdot b$. But this holds iff $a \cdot a - 2x \cdot a + x \cdot x = b \cdot b - 2x \cdot b + x \cdot x$, iff $|x - a|^2 = |x - b|^2$, iff $|x - a| = |x - b|$.
Theorem 3.4 (Simplicial Circumcenter Theorem). Let $[a_0, a_1, \ldots, a_\ell]$ be an $\ell$-simplex in $\mathbb{R}^n$. For $0 \le j < k \le \ell$, write $B_{j,k}$ for the perpendicular bisector of $[a_j, a_k]$ in $\langle a_0, \ldots, a_\ell \rangle$. Then the affine spaces $B_{j,k}$ with $0 \le j < k \le \ell$ have a unique point of intersection in $\langle a_0, \ldots, a_\ell \rangle$. This point is denoted by $c$ and is called the circumcenter of the $\ell$-simplex.

By the above theorem, $c$ is the unique point in $\langle a_0, \ldots, a_\ell \rangle$ which is equidistant from each $a_i$. There is an $(\ell - 1)$-dimensional sphere centered at $c$ passing through each of the points $a_i$.
Proof. (uniqueness) Suppose such a point $c$ exists. Then $c$ lies on each $B_{0,k}$ for $1 \le k \le \ell$. We have $c \in \langle a_0, \ldots, a_\ell \rangle$. Say $c = a_0 + t_1 u_1 + \ldots + t_\ell u_\ell$ where $u_k = a_k - a_0$ and $t_k \in \mathbb{R}$. That is,
$$c = a_0 + At$$
where $A$ is the matrix with the column vectors $u_1$ to $u_\ell$. Since $c$ lies on $B_{0,k}$, where $1 \le k \le \ell$, we use the definition of $B_{0,k}$ and write the equation that $c$ satisfies:
$$\left(c - \frac{a_0 + a_k}{2}\right) \cdot (a_k - a_0) = 0.$$
We rewrite this as
$$\left((a_0 + At) - \left(a_0 + \frac{a_k - a_0}{2}\right)\right) \cdot (a_k - a_0) = 0.$$
This gives
$$\left(At - \tfrac{1}{2}u_k\right) \cdot u_k = 0, \qquad \text{that is,} \qquad (At) \cdot u_k = \tfrac{1}{2}|u_k|^2.$$
Collecting these equations for $k = 1, \ldots, \ell$,
$$\begin{pmatrix} (At) \cdot u_1 \\ \vdots \\ (At) \cdot u_\ell \end{pmatrix} = \frac{1}{2}\begin{pmatrix} |u_1|^2 \\ \vdots \\ |u_\ell|^2 \end{pmatrix}, \qquad \text{that is,} \qquad A^t A\, t = \frac{1}{2}\begin{pmatrix} |u_1|^2 \\ \vdots \\ |u_\ell|^2 \end{pmatrix},$$
where we note that $A^t$ denotes the transpose and not $A$ raised to the power of $t$ (the letter $t$ is being used in two different ways here). Since $[a_0, \ldots, a_\ell]$ is a (non-degenerate) $\ell$-simplex, we know that $u_1, \ldots, u_\ell$ is linearly independent, so that $\operatorname{rank}(A^t A) = \operatorname{rank}(A) = \ell$ and $A^t A$ is an $\ell \times \ell$ matrix, so it is invertible. Therefore,
$$t = (A^t A)^{-1} v \qquad \text{where} \qquad v = \frac{1}{2}\begin{pmatrix} |u_1|^2 \\ \vdots \\ |u_\ell|^2 \end{pmatrix},$$
and therefore $c = a_0 + At = a_0 + A(A^t A)^{-1} v$.

(existence) We need to show that the point $c = a_0 + A(A^t A)^{-1} v$ lies on all the perpendicular bisectors $B_{j,k}$ with $0 \le j < k \le \ell$. Reversing the computation above shows that $c$ satisfies each of the defining equations of $B_{0,k}$, so $c \in B_{0,k}$ for $1 \le k \le \ell$. Now let $0 \le j < k \le \ell$. Since $c \in B_{0,j}$ and $c \in B_{0,k}$, we have
$$\operatorname{dist}(c, a_0) = \operatorname{dist}(c, a_j) \qquad \text{and} \qquad \operatorname{dist}(c, a_0) = \operatorname{dist}(c, a_k),$$
hence $\operatorname{dist}(c, a_k) = \operatorname{dist}(c, a_j)$, so that $c$ lies on $B_{j,k}$.
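The proof is constructive, so the circumcenter can be computed directly. Here is a small numerical sketch of the formula $c = a_0 + A(A^tA)^{-1}v$ (my own illustration, not from the notes; the 2-simplex below is an arbitrary triangle in $\mathbb{R}^2$):

import numpy as np

a = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([1.0, 3.0])]   # a 2-simplex
U = np.column_stack([a[k] - a[0] for k in (1, 2)])     # columns u_k = a_k - a_0
v = 0.5 * np.array([np.dot(U[:, k], U[:, k]) for k in range(U.shape[1])])

t = np.linalg.solve(U.T @ U, v)        # solve A^t A t = v
c = a[0] + U @ t                       # circumcenter

print([np.linalg.norm(c - p) for p in a])   # all circumradii agree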
3.2 Polynomial interpolation
The following theorem gives us information about the so-called polynomial interpolation of data points. This process
produces a polynomial which actually passes through the points.
Theorem 3.5. Let $(a_0, b_0), (a_1, b_1), \ldots, (a_n, b_n)$ be $n + 1$ points with the $a_i$ distinct. Then there exists a unique polynomial $p$ of degree at most $n$ with $p(a_i) = b_i$ for all $i$.

Proof. For $p(x) = c_0 + c_1 x + c_2 x^2 + \ldots + c_n x^n$, we have $p(a_i) = b_i$ for all $i$ if and only if
$$c_0 + c_1 a_0 + c_2 a_0^2 + \ldots + c_n a_0^n = b_0, \quad \ldots, \quad c_0 + c_1 a_n + c_2 a_n^2 + \ldots + c_n a_n^n = b_n,$$
if and only if $Ac = b$, where
$$A = \begin{pmatrix} 1 & a_0 & a_0^2 & \ldots & a_0^n \\ 1 & a_1 & a_1^2 & \ldots & a_1^n \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \ldots & a_n^n \end{pmatrix}, \qquad c = \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix}, \qquad b = \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_n \end{pmatrix}.$$
This matrix $A$ is called the Vandermonde matrix for $a_0, a_1, \ldots, a_n$. We must show that $A$ is invertible. We claim that
$$\det A = \prod_{0 \le j < k \le n} (a_k - a_j).$$
We prove this by induction. Let
$$A_k = \begin{pmatrix} 1 & a_0 & a_0^2 & \ldots & a_0^k \\ 1 & a_1 & a_1^2 & \ldots & a_1^k \\ \vdots & & & & \vdots \\ 1 & a_k & a_k^2 & \ldots & a_k^k \end{pmatrix},$$
that is, $A_k$ is the Vandermonde matrix for $a_0, a_1, \ldots, a_k$. We have
$$\det A_1 = \det\begin{pmatrix} 1 & a_0 \\ 1 & a_1 \end{pmatrix} = a_1 - a_0.$$
Fix $k$ with $2 \le k \le n$ and suppose $\det A_{k-1} = \prod_{0 \le i < j \le k-1} (a_j - a_i)$. Write $x = a_k$. Then
$$D(x) := \det A_k = \det\begin{pmatrix} 1 & a_0 & a_0^2 & \ldots & a_0^k \\ 1 & a_1 & a_1^2 & \ldots & a_1^k \\ \vdots & & & & \vdots \\ 1 & x & x^2 & \ldots & x^k \end{pmatrix}$$
and we see that, expanding along the last row, $D(x)$ is a polynomial of degree $k$ in $x$, with leading coefficient $C = \det A_{k-1}$, which by the induction hypothesis is $\prod_{0 \le i < j \le k-1} (a_j - a_i)$.

For each $0 \le i \le k - 1$, by subtracting the $i$th row from the last, we see that
$$D(x) = \det A_k = \det\begin{pmatrix} 1 & a_0 & a_0^2 & \ldots & a_0^k \\ 1 & a_1 & a_1^2 & \ldots & a_1^k \\ \vdots & & & & \vdots \\ 1 & a_{k-1} & a_{k-1}^2 & \ldots & a_{k-1}^k \\ 0 & x - a_i & x^2 - a_i^2 & \ldots & x^k - a_i^k \end{pmatrix}.$$
So $(x - a_i)$ is a factor of each term on the last row, so $(x - a_i)$ is a factor of $D(x)$ for each $0 \le i \le k - 1$. It follows that
$$D(x) = C(x - a_0)(x - a_1)\cdots(x - a_{k-1}) = \left(\prod_{0 \le i < j \le k-1} (a_j - a_i)\right)(a_k - a_0)(a_k - a_1)\cdots(a_k - a_{k-1}) = \prod_{0 \le i < j \le k} (a_j - a_i).$$
This completes the proof.
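The interpolating coefficients are just the solution of $Ac = b$ with $A$ the Vandermonde matrix. A minimal NumPy sketch of this (my own illustration with made-up data; for many nodes one would use a numerically stabler method, e.g. the Lagrange form of Example 6.33 below):

import numpy as np

a = np.array([0.0, 1.0, 2.0, 3.0])        # distinct interpolation nodes a_0, ..., a_n
b = np.array([1.0, 2.0, 0.0, 5.0])        # prescribed values b_i = p(a_i)

n = len(a) - 1
A = np.vander(a, n + 1, increasing=True)  # rows (1, a_i, a_i^2, ..., a_i^n)
c = np.linalg.solve(A, b)                 # coefficients c_0, ..., c_n

# p(a_i) reproduces b_i exactly (up to floating point).
p = lambda x: sum(c[k] * x**k for k in range(n + 1))
print([p(t) for t in a])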
3.3 Least-squares best fit polynomials

Now we will look not at polynomial interpolation, but at fitting polynomials, that is, finding a polynomial which is a best fit curve to some data points (not necessarily passing through those points).

Example 3.6. Given a positive integer $\ell$ and given $n$ data points $(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)$, find the best fit polynomial of degree $\ell$ for the data points.

The solution depends on what we mean by best fit. We could mean some function $f$ which minimizes the sum of the distances of each data point to the graph of $f$:
$$\sum_{i=1}^{n} \operatorname{dist}\big((a_i, b_i),\ \text{graph of } f\big).$$
Or we could mean a function $f$ which minimizes the sum of the vertical distances between the data points and the graph:
$$\sum_{i=1}^{n} |b_i - f(a_i)|.$$
As our definition, we will take best fit to mean something that minimizes
$$\sum_{i=1}^{n} (b_i - f(a_i))^2,$$
hence the name "sum of squares". Equivalently, such a function minimizes
$$\sqrt{\sum_{i=1}^{n} (b_i - f(a_i))^2} = \operatorname{dist}\left(\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \begin{pmatrix} f(a_1) \\ \vdots \\ f(a_n) \end{pmatrix}\right).$$
To help us with this task, we have the following theorem.
Theorem 3.7. Given a positive integer $\ell$, and $n$ points $(a_1, b_1), \ldots, (a_n, b_n)$ in $\mathbb{R}^2$ such that at least $\ell + 1$ of the points $a_i$ are distinct, there exists a unique polynomial $f(x)$ of degree at most $\ell$ which minimizes the sum
$$\sum_{i=1}^{n} (b_i - f(a_i))^2.$$
This polynomial is called the least-squares best fit polynomial of degree $\ell$ for the data points.
Proof. For $f(x) = c_0 + c_1 x + \ldots + c_\ell x^\ell$, we have
$$\begin{pmatrix} f(a_1) \\ \vdots \\ f(a_n) \end{pmatrix} = \begin{pmatrix} c_0 + c_1 a_1 + c_2 a_1^2 + \ldots + c_\ell a_1^\ell \\ \vdots \\ c_0 + c_1 a_n + c_2 a_n^2 + \ldots + c_\ell a_n^\ell \end{pmatrix} = Ac,$$
where
$$A = \begin{pmatrix} 1 & a_1 & a_1^2 & \ldots & a_1^\ell \\ 1 & a_2 & a_2^2 & \ldots & a_2^\ell \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \ldots & a_n^\ell \end{pmatrix} \in M_{n \times (\ell+1)}(\mathbb{R}) \qquad \text{and} \qquad c = \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_\ell \end{pmatrix}.$$
To minimize the sum
$$\sum_{i=1}^{n} (b_i - f(a_i))^2,$$
we must minimize
$$\operatorname{dist}\left(\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \begin{pmatrix} f(a_1) \\ \vdots \\ f(a_n) \end{pmatrix}\right) = \operatorname{dist}(b, Ac).$$
We must have that $Ac$ is the (unique) point in $\operatorname{col}(A)$ which lies nearest to $b$. Thus
$$Ac = \operatorname{Proj}_{\operatorname{col}(A)}(b).$$
Since $\ell + 1$ of the $a_i$ are distinct, it follows that the corresponding $\ell + 1$ rows of $A$ form a Vandermonde matrix on $\ell + 1$ distinct points. This $(\ell+1) \times (\ell+1)$ Vandermonde matrix is invertible by the previous theorem, so these $\ell + 1$ rows are linearly independent. Therefore, the rank of the matrix $A$ is $\ell + 1$, which means the columns of $A$ are linearly independent. Therefore $A$ is one-to-one, and hence there is a unique $c$.

We now seek a formula for $c$. We look for $u, v \in \mathbb{R}^n$ with $u \in U = \operatorname{col} A$, $v \in U^\perp = \operatorname{null} A^t$, and $u + v = b$. Say $u = Ac$. Then
$$u + v = b \implies Ac + v = b \implies A^t A c = A^t b \implies c = (A^t A)^{-1} A^t b.$$
Since $c$ is the vector of coefficients, the proof is complete. Put $f(x) = c_0 + c_1 x + \ldots + c_\ell x^\ell$.
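The resulting recipe $c = (A^tA)^{-1}A^tb$ translates directly into code. A sketch of mine (example data invented; in practice one would use a library routine such as numpy.polyfit or a QR factorization rather than forming $A^tA$ explicitly):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])        # data points (a_i, b_i)
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
ell = 1                                         # degree of the fitting polynomial

A = np.vander(x, ell + 1, increasing=True)      # n x (ell+1) Vandermonde-type matrix
c = np.linalg.solve(A.T @ A, A.T @ y)           # normal equations: A^t A c = A^t y

print(c)                  # coefficients c_0, ..., c_ell of the best fit polynomial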
4 Cross product in $\mathbb{R}^n$

A familiar notion is the cross product of two vectors in $\mathbb{R}^3$. In this section, we generalize this to the cross product of $n-1$ vectors in $\mathbb{R}^n$, and see some results about the connections between cross products, determinants, and the volume of the parallelotope generated by vectors.
4.1 Generalized cross product
Definition 4.1. Let $u_1, \ldots, u_{n-1}$ be vectors in $\mathbb{R}^n$. We define the cross product of these vectors to be
$$X(u_1, \ldots, u_{n-1}) = \text{the formal determinant of the } n \times n \text{ matrix } \begin{pmatrix} u_1 & \ldots & u_{n-1} & e \end{pmatrix},$$
whose first $n-1$ columns are $u_1, \ldots, u_{n-1}$ and whose last column has the standard basis vectors $e_1, \ldots, e_n$ as its entries. This is equal to
$$\sum_{i=1}^{n} (-1)^{i+n} \det(A_i)\, e_i,$$
where $A = (u_1, \ldots, u_{n-1}) \in M_{n \times (n-1)}$ and $A_i$ is the $(n-1) \times (n-1)$ matrix obtained from $A$ by removing the $i$th row.
Example 4.2. In $\mathbb{R}^2$, for $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \in \mathbb{R}^2$ we write
$$u^X = X(u) = \operatorname{formal\ det}\begin{pmatrix} u_1 & e_1 \\ u_2 & e_2 \end{pmatrix}.$$
In $\mathbb{R}^3$, for $u, v \in \mathbb{R}^3$ we write
$$X(u, v) = \operatorname{formal\ det}\begin{pmatrix} u_1 & v_1 & e_1 \\ u_2 & v_2 & e_2 \\ u_3 & v_3 & e_3 \end{pmatrix} = \det\begin{pmatrix} u_2 & v_2 \\ u_3 & v_3 \end{pmatrix} e_1 - \det\begin{pmatrix} u_1 & v_1 \\ u_3 & v_3 \end{pmatrix} e_2 + \det\begin{pmatrix} u_1 & v_1 \\ u_2 & v_2 \end{pmatrix} e_3 = \begin{pmatrix} u_2 v_3 - u_3 v_2 \\ u_3 v_1 - u_1 v_3 \\ u_1 v_2 - u_2 v_1 \end{pmatrix}.$$
The length of this particular cross product gives the area of the parallelogram generated by the vectors $u, v$.
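Definition 4.1 can be translated into code essentially verbatim: delete each row of $A = (u_1, \ldots, u_{n-1})$ in turn, take the determinant, and attach the sign $(-1)^{i+n}$. A sketch of mine (not part of the notes):

import numpy as np

def cross(*vectors):
    """Generalized cross product X(u_1, ..., u_{n-1}) of n-1 vectors in R^n."""
    A = np.column_stack(vectors)          # n x (n-1) matrix
    n = A.shape[0]
    assert A.shape[1] == n - 1
    x = np.empty(n)
    for i in range(n):                    # code index i corresponds to row i+1
        A_i = np.delete(A, i, axis=0)     # remove the (i+1)th row
        x[i] = (-1) ** ((i + 1) + n) * np.linalg.det(A_i)
    return x

# In R^3 this agrees with the usual cross product:
print(cross(np.array([1.0, 0, 0]), np.array([0, 1.0, 0])))   # -> [0. 0. 1.]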
Theorem 4.3 (Properties of Cross Product). We have the following, for $u_1, \ldots, u_{n-1}, v \in \mathbb{R}^n$ and $t \in \mathbb{R}$:
1. $X(u_1, \ldots, tu_k, \ldots, u_{n-1}) = t\,X(u_1, \ldots, u_k, \ldots, u_{n-1})$.
2. $X(u_1, \ldots, u_\ell, \ldots, u_k, \ldots, u_{n-1}) = -X(u_1, \ldots, u_k, \ldots, u_\ell, \ldots, u_{n-1})$, that is, interchanging two vectors flips the sign of the cross product. (This means the cross product is skew-symmetric.)
3. $X(u_1, \ldots, u_k + v, \ldots, u_{n-1}) = X(u_1, \ldots, u_k, \ldots, u_{n-1}) + X(u_1, \ldots, v, \ldots, u_{n-1})$. This, together with property 1, says that the cross product is multilinear.
4. We have
$$X(u_1, \ldots, u_{n-1}) \cdot v = \det(u_1, \ldots, u_{n-1}, v),$$
hence for $1 \le i \le n-1$ we have
$$X(u_1, \ldots, u_{n-1}) \cdot u_i = 0.$$
5. $X(u_1, \ldots, u_{n-1}) = 0$ if and only if $\{u_1, \ldots, u_{n-1}\}$ is linearly dependent. Furthermore, when $X(u_1, \ldots, u_{n-1}) \ne 0$, the set
$$\{u_1, \ldots, u_{n-1}, X(u_1, \ldots, u_{n-1})\}$$
is what we would call a positively oriented basis for $\mathbb{R}^n$, which means
$$\det\big(u_1, \ldots, u_{n-1}, X(u_1, \ldots, u_{n-1})\big) > 0.$$
Proof. We prove 4 and 5.

4. Note that
$$X(u_1, \ldots, u_{n-1}) \cdot v = \left(\sum_{i=1}^{n} (-1)^{i+n}\det(A_i)\,e_i\right) \cdot v = \sum_{i=1}^{n} (-1)^{i+n}\det(A_i)(e_i \cdot v) = \sum_{i=1}^{n} (-1)^{i+n}\det(A_i)\,v_i = \det(u_1, \ldots, u_{n-1}, v),$$
the last equality being the cofactor expansion of $\det(u_1, \ldots, u_{n-1}, v)$ along its last column.

5. $\{u_1, \ldots, u_{n-1}\}$ is linearly independent if and only if $A$ has rank $n-1$, if and only if the row space has dimension $n-1$, if and only if some $n-1$ rows of $A$ are linearly independent, if and only if one of the matrices $A_i$ is invertible, if and only if
$$X(u_1, \ldots, u_{n-1}) = \begin{pmatrix} (-1)^{1+n}|A_1| \\ (-1)^{2+n}|A_2| \\ \vdots \\ |A_n| \end{pmatrix} \ne 0.$$
Also,
$$\det\big(u_1, \ldots, u_{n-1}, X(u_1, \ldots, u_{n-1})\big) = X(u_1, \ldots, u_{n-1}) \cdot X(u_1, \ldots, u_{n-1}) = |X(u_1, \ldots, u_{n-1})|^2 > 0$$
whenever $X(u_1, \ldots, u_{n-1}) \ne 0$.

The proof is complete.
4.2 Parallelotope volume
Definition 4.4. For vectors $u_1, \ldots, u_k$ in $\mathbb{R}^n$ we define the parallelotope (or parallelepiped) on the vectors $u_1, \ldots, u_k$ to be the set
$$\left\{ \sum_{i=1}^{k} t_i u_i \;\middle|\; 0 \le t_i \le 1 \text{ for all } i \right\}.$$
We define the $k$-volume of this parallelotope, written $\operatorname{vol}_k(u_1, \ldots, u_k)$, inductively by $\operatorname{vol}_1(u_1) = |u_1|$, and for $k \ge 2$ we define
$$\operatorname{vol}_k(u_1, \ldots, u_k) = \operatorname{vol}_{k-1}(u_1, \ldots, u_{k-1})\,|u_k|\sin\theta,$$
where $\theta$ is the angle between the vector $u_k$ and the vector space $U = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$, or alternatively
$$\operatorname{vol}_k(u_1, \ldots, u_k) = \operatorname{vol}_{k-1}(u_1, \ldots, u_{k-1})\,\big|\operatorname{Proj}_{U^\perp}(u_k)\big|.$$
Theorem 4.5 (Parallelotope Volume Theorem). Let $u_1, \ldots, u_k \in \mathbb{R}^n$. Then
$$\operatorname{vol}_k(u_1, \ldots, u_k) = \sqrt{\det(A^t A)},$$
where $A$ is the matrix with columns $u_1, \ldots, u_k$.
Proof. To see the truth of the base case, note
$$\operatorname{vol}_1(u_1) = |u_1| = \sqrt{u_1 \cdot u_1} = \sqrt{u_1^t u_1} = \sqrt{\det(u_1^t u_1)} = \sqrt{\det(A^t A)},$$
where $A = u_1$. Next, fix $k \ge 2$, and suppose inductively that
$$\operatorname{vol}_{k-1}(u_1, \ldots, u_{k-1}) = \sqrt{\det(A^t A)}$$
where $A = (u_1, \ldots, u_{k-1})$. Let $B = (u_1, \ldots, u_k) = (A, u_k)$, that is, $B$ is the matrix obtained from $A$ by inserting the vector $u_k$ as its last column. Now, put $U = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$ and then define the projections
$$p := \operatorname{Proj}_U(u_k) \in \operatorname{col} A, \qquad q := \operatorname{Proj}_{U^\perp}(u_k) \in (\operatorname{col} A)^\perp = \operatorname{null}(A^t),$$
and hence $p + q = u_k$. Then of course $B = (A, p + q)$. Since $p \in \operatorname{col} A$, the matrix $B$ can be obtained from the matrix $(A, q)$ by performing elementary column operations of type 3. Hence $B = (A, p+q) = (A, q)E$ where $E$ is a product of type 3 elementary matrices. Hence $E$ is $k \times k$, and $\det(E) = 1$. It follows that
$$\det(B^t B) = \det\big(E^t (A, q)^t (A, q) E\big) = \det\left(\begin{pmatrix} A^t \\ q^t \end{pmatrix}(A, q)\right)$$
since $\det E = 1$. This then becomes
$$\det(B^t B) = \det\begin{pmatrix} A^t A & A^t q \\ q^t A & q^t q \end{pmatrix} = \det\begin{pmatrix} A^t A & 0 \\ 0 & |q|^2 \end{pmatrix} = \det(A^t A)\,|q|^2.$$
Therefore, taking square roots, we apply the induction hypothesis to obtain
$$\sqrt{\det(B^t B)} = \sqrt{\det(A^t A)}\,|q| = \operatorname{vol}_{k-1}(u_1, \ldots, u_{k-1})\,|q| = \operatorname{vol}_k(u_1, \ldots, u_k),$$
hence completing the proof.
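The theorem gives an immediate way to compute $k$-volumes numerically; a one-function sketch of mine (the sample vectors are arbitrary):

import numpy as np

def vol(*vectors):
    """k-volume of the parallelotope on the given vectors in R^n: sqrt(det(A^t A))."""
    A = np.column_stack(vectors)
    return np.sqrt(np.linalg.det(A.T @ A))

# Area of the parallelogram on (1,0,0) and (1,1,0) inside R^3 is 1.
print(vol(np.array([1.0, 0, 0]), np.array([1.0, 1, 0])))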
Theorem 4.6 (Dot Product of Two Cross Products). Let $u_1, \ldots, u_{n-1}, v_1, \ldots, v_{n-1} \in \mathbb{R}^n$. Then
$$X(u_1, \ldots, u_{n-1}) \cdot X(v_1, \ldots, v_{n-1}) = \det(A^t B)$$
where $A$ is the matrix with columns $u_1, \ldots, u_{n-1}$ and $B$ is the matrix with columns $v_1, \ldots, v_{n-1}$.
Proof. First, we define
$$x := X(u_1, \ldots, u_{n-1}), \qquad y := X(v_1, \ldots, v_{n-1}).$$
We now calculate $x \cdot y$ twice:
$$x \cdot y = X(u_1, \ldots, u_{n-1}) \cdot y = \det(A, y)$$
and also
$$x \cdot y = x \cdot X(v_1, \ldots, v_{n-1}) = X(v_1, \ldots, v_{n-1}) \cdot x = \det(B, x).$$
Therefore, we see that
$$(x \cdot y)^2 = \det(A, y)\det(B, x) = \det\big((A, y)^t(B, x)\big) = \det\left(\begin{pmatrix} A^t \\ y^t \end{pmatrix}(B, x)\right)$$
and this becomes
$$(x \cdot y)^2 = \det\begin{pmatrix} A^t B & A^t x \\ y^t B & y^t x \end{pmatrix} = \det\begin{pmatrix} A^t B & 0 \\ 0 & x \cdot y \end{pmatrix} = \det(A^t B)\,(x \cdot y).$$
To justify that $A^t x = 0$, observe that since $x = X(u_1, \ldots, u_{n-1})$, it is the case that $x \cdot u_i = 0$ for all $i$. This means that $x \in (\operatorname{span}\{u_1, \ldots, u_{n-1}\})^\perp = (\operatorname{col} A)^\perp = \operatorname{null} A^t$. Also, $y \in \operatorname{null} B^t$. It is now clear that either $x \cdot y = 0$, or $x \cdot y = \det(A^t B)$.

Suppose that $x \cdot y = 0$ with one of $x$ or $y$ being zero. If $x = X(u_1, \ldots, u_{n-1}) = 0$, then $u_1, \ldots, u_{n-1}$ is linearly dependent, so $\operatorname{rank}(A) < n-1$ (the proof of this is an exercise). Therefore $A^t B$ is non-invertible, implying $\det(A^t B) = 0$. Similarly, if $y = 0$ we also obtain $\det(A^t B) = 0$.

Suppose next that $x \cdot y = 0$ with both $x, y$ nonzero. Since
$$0 = x \cdot y = X(u_1, \ldots, u_{n-1}) \cdot y = \det(A, y),$$
we have $y \in \operatorname{col}(A)$ (since $x \ne 0$, the columns of $A$ are linearly independent). Similarly, $x \in \operatorname{col}(B)$, say $x = Bt$. Note that $t \ne 0$, since $x \ne 0$. Note that since $x = X(u_1, \ldots, u_{n-1})$, we know it is perpendicular to $\operatorname{col}(A)$, which means $x \in \operatorname{null}(A^t)$, therefore $A^t B t = A^t x = 0$. Since $t \ne 0$ and $(A^t B)t = 0$, we see that $A^t B$ has nontrivial nullspace, proving $A^t B$ cannot be invertible. Hence $\det(A^t B) = 0$.

In all cases, we have obtained $x \cdot y = \det(A^t B)$. So the proof is complete.
Corollary 4.7. For $u_1, \ldots, u_{n-1} \in \mathbb{R}^n$,
$$|X(u_1, \ldots, u_{n-1})| = \sqrt{\det(A^t A)} = \operatorname{vol}_{n-1}(u_1, \ldots, u_{n-1}).$$
This is the end of the material that will be tested on the course midterm.
5 Spherical geometry
In this section we study some spherical geometry. A useful reference for this material is available at
http://www.math.uwaterloo.ca/~snew/math245spring2011/Notes/sphere.pdf.
In view of this, these notes do not themselves include information on spherical geometry.
6 Inner product spaces
6.1 Abstract inner products
Definition 6.1. Let $F$ be a field. We define the dot product of two vectors $u, v \in F^n$ by
$$u \cdot v := \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = \sum_{i=1}^{n} u_i v_i = v^t u = u^t v.$$
Remark 6.2. Let $u, v, w \in F^n$ and $t \in F$. The dot product is symmetric and bilinear, but it is not positive definite, that is, it is not the case that $u \cdot u \ge 0$.

For example, if $F = \mathbb{Z}_5$ then
$$\begin{pmatrix} 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 1^2 + 2^2 = 0,$$
or on the other hand if $F = \mathbb{C}$ then
$$\begin{pmatrix} 1 \\ i \end{pmatrix} \cdot \begin{pmatrix} 1 \\ i \end{pmatrix} = 1^2 + i^2 = 0.$$
For $A \in M_{k \times \ell}(F)$ and $x \in F^\ell$, if
$$A = \begin{pmatrix} r_1^t \\ \vdots \\ r_k^t \end{pmatrix}$$
then we observe that
$$Ax = \begin{pmatrix} r_1 \cdot x \\ \vdots \\ r_k \cdot x \end{pmatrix}.$$
For $u, v \in \mathbb{C}^n$, we could define
$$u \cdot v = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \cdot \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} x_1 + iy_1 \\ \vdots \\ x_n + iy_n \end{pmatrix} \cdot \begin{pmatrix} r_1 + is_1 \\ \vdots \\ r_n + is_n \end{pmatrix} = x_1 r_1 + y_1 s_1 + x_2 r_2 + y_2 s_2 + \ldots$$
Here we are identifying $\mathbb{C}^n$ with $\mathbb{R}^{2n}$ and using the dot product in $\mathbb{R}^{2n}$. This dot product is symmetric and $\mathbb{R}$-bilinear, but is not $\mathbb{C}$-bilinear. Note that for $z \in \mathbb{C}$ we have $|z|^2 = z\bar{z}$.
Definition 6.3. For $u, v \in \mathbb{C}^n$, we define the (standard) inner product of $u$ and $v$ to be
$$\langle u, v \rangle = \sum_{i=1}^{n} u_i \bar{v}_i = \bar{v}^t u = v^* u$$
where $v^* = \bar{v}^t$.
This product has the following properties:

It is conjugate-symmetric: $\langle u, v \rangle = \overline{\langle v, u \rangle}$.

It is linear in the first variable, and conjugate linear in the second (we call this hermitian or sesquilinear). That is,
$$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle, \quad \langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle, \quad \langle tu, v \rangle = t\langle u, v \rangle, \quad \langle u, tv \rangle = \bar{t}\langle u, v \rangle.$$

It is positive definite: we have $\langle u, u \rangle \ge 0$ with $\langle u, u \rangle = 0$ if and only if $u = 0$.

For $A = (u_1, \ldots, u_\ell) \in M_{n \times \ell}(\mathbb{C})$ and $x \in \mathbb{C}^n$, we have
$$A^* x = \bar{A}^t x = \begin{pmatrix} \bar{u}_1^t \\ \vdots \\ \bar{u}_\ell^t \end{pmatrix} x = \begin{pmatrix} \langle x, u_1 \rangle \\ \vdots \\ \langle x, u_\ell \rangle \end{pmatrix},$$
and for $B = (v_1, \ldots, v_\ell) \in M_{n \times \ell}(\mathbb{C})$ we have
$$B^* A = \begin{pmatrix} \bar{v}_1^t \\ \vdots \\ \bar{v}_\ell^t \end{pmatrix}(u_1, \ldots, u_\ell) = \begin{pmatrix} \langle u_1, v_1 \rangle & \ldots & \langle u_\ell, v_1 \rangle \\ \vdots & \ddots & \vdots \\ \langle u_1, v_\ell \rangle & \ldots & \langle u_\ell, v_\ell \rangle \end{pmatrix},$$
where $A^* = \bar{A}^t$, that is, $(A^*)_{ij} = \overline{A_{ji}}$. When $A \in M_n(\mathbb{R})$ then we simply have $A^* = A^t$, since complex conjugation, when restricted to the reals, is the identity map.

Definition 6.4. Let $F$ be $\mathbb{R}$ or $\mathbb{C}$. Let $V$ be a vector space over the field $F$. Then we can define an inner product on $V$ as a map $\langle \cdot, \cdot \rangle : V \times V \to F$ such that for all $u, v, w \in V$ and all $t \in F$ we have:
$$\langle u, v \rangle = \overline{\langle v, u \rangle}, \quad \langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle, \quad \langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle,$$
$$\langle tu, v \rangle = t\langle u, v \rangle, \quad \langle u, tv \rangle = \bar{t}\langle u, v \rangle, \quad \langle u, u \rangle \ge 0 \text{ with equality iff } u = 0.$$
A vector space $V$ over $F$ ($\mathbb{R}$ or $\mathbb{C}$) together with an inner product is called an inner product space over $F$.

Example 6.5. We have the following:

$\mathbb{R}^n$ is an inner product space using the dot product (also called the standard inner product) on $\mathbb{R}^n$.

$\mathbb{C}^n$ is an inner product space using its standard inner product.

The standard inner product on $M_{k \times \ell}(F)$ is given by
$$\langle A, B \rangle = \left\langle \begin{pmatrix} a_{11} \\ \vdots \\ a_{k\ell} \end{pmatrix}, \begin{pmatrix} b_{11} \\ \vdots \\ b_{k\ell} \end{pmatrix} \right\rangle = a_{11}\bar{b}_{11} + a_{12}\bar{b}_{12} + \ldots + a_{k\ell}\bar{b}_{k\ell} = \operatorname{tr}(B^* A).$$
Example 6.6. Let $F$ be $\mathbb{R}$ or $\mathbb{C}$. Then $F^\infty$ is the vector space of sequences $a = (a_1, a_2, \ldots)$ with $a_i \in F$, where $a_i \ne 0$ for only finitely many $i$. This vector space has the standard basis
$$e_1, e_2, e_3, \ldots$$
where $e_k = (e_{k1}, e_{k2}, \ldots)$ with $e_{ki} = \delta_{ki}$ (Kronecker delta notation). The standard inner product on $F^\infty$ is
$$\langle a, b \rangle = \sum_{i=1}^{\infty} a_i \bar{b}_i,$$
where we note that this sum is indeed finite since only finitely many $a_i, b_i$ are nonzero.
Example 6.7. Let $a < b$ be real. Then we define $\mathcal{C}([a, b], F)$ to be the vector space of continuous functions $f : [a, b] \to F$. The standard inner product on $\mathcal{C}([a, b], F)$ is given by
$$\langle f, g \rangle = \int_a^b f\,\bar{g},$$
where we note that $|f|^2 = \langle f, f \rangle = \int_a^b f\bar{f} = \int_a^b |f(x)|^2\,dx \ge 0$.
Example 6.8. Let $\mathcal{P}_n(F)$ denote the vector space of polynomials with coefficients in $F$ of degree at most $n$. Let $\mathcal{P}(F)$ denote the vector space of all polynomials over $F$. In $\mathcal{P}_n(F)$ we have several inner products:

We can define
$$\left\langle \sum_{i=0}^{n} a_i x^i, \sum_{j=0}^{n} b_j x^j \right\rangle = \sum_{i=0}^{n} a_i \bar{b}_i.$$

For $a < b$ real, we can put
$$\langle f, g \rangle = \int_a^b f\,\bar{g}.$$

For distinct points $a_0, \ldots, a_n \in F$ we can define
$$\langle f, g \rangle = \sum_{i=0}^{n} f(a_i)\overline{g(a_i)}.$$

Note that the first and second of these inner products generalize to inner products on $\mathcal{P}(F)$.
Definition 6.9. Let $U$ be an inner product space over $F$ (where $F$ is $\mathbb{R}$ or $\mathbb{C}$). For $u, v \in U$ we say that $u$ and $v$ are orthogonal when $\langle u, v \rangle = 0$. Also, for $u \in U$ we define the norm (or length) of $u$ to be
$$|u| = \sqrt{\langle u, u \rangle},$$
noting that when $|u| = 1$ we call $u$ a unit vector.

Definition 6.10. Let $U$ be an inner product space. Let $u, x \in U$ with $u \ne 0$. We define the orthogonal projection of $x$ onto $u$ to be
$$\operatorname{Proj}_u x = \frac{\langle x, u \rangle}{|u|^2}\, u \qquad \left(\text{not the same as } \frac{\langle u, x \rangle}{|u|^2}\, u \text{ if } F = \mathbb{C}!\right)$$
Proposition 6.11. $x - \operatorname{Proj}_u x$ is orthogonal to $u$.

Proof. This is an easy computation. Just expand $\langle x - \operatorname{Proj}_u x,\, u \rangle$ and use the properties of the inner product.
Theorem 6.12 (Properties of Norm). Let $U$ be an inner product space over $F$, with $F = \mathbb{R}$ or $F = \mathbb{C}$. Then for all $u, v, w \in U$ and $t \in F$ we have
1. $|tu| = |t|\,|u|$ (note that these are two different norms).
2. $|u| \ge 0$ with equality if and only if $u = 0$.
3. The Cauchy–Schwarz inequality holds, that is, $|\langle u, v \rangle| \le |u|\,|v|$.
4. The triangle inequality holds: $\big||u| - |v|\big| \le |u + v| \le |u| + |v|$.
5. The polarization identity holds. If $F = \mathbb{R}$ then we have
$$\langle u, v \rangle = \frac{1}{4}\big(|u+v|^2 - |u-v|^2\big),$$
whereas if $F = \mathbb{C}$ we have
$$\langle u, v \rangle = \frac{1}{4}\big(|u+v|^2 + i|u+iv|^2 - |u-v|^2 - i|u-iv|^2\big).$$
6. Pythagoras' theorem holds: if $\langle u, v \rangle = 0$ then $|u+v|^2 = |u|^2 + |v|^2$. Note that in the complex setting, the converse is not true!
Proof of Cauchy–Schwarz. If $\{u, v\}$ is linearly dependent, then one of $u$ and $v$ is a multiple of the other, say $v = tu$ with $t \in F$. Then
$$|\langle u, v \rangle| = |\langle u, tu \rangle| = |\bar{t}\langle u, u \rangle| = |t|\,|u|^2 = |u|\,|tu| = |u|\,|v|.$$
Next, suppose $\{u, v\}$ is linearly independent. Consider
$$w = v - \operatorname{Proj}_u v = v - \frac{\langle v, u \rangle}{|u|^2}\, u.$$
Since $\{u, v\}$ is linearly independent, we have $w \ne 0$, hence $|w|^2 > 0$, hence $\langle w, w \rangle > 0$:
$$\left\langle v - \frac{\langle v, u \rangle}{|u|^2} u,\; v - \frac{\langle v, u \rangle}{|u|^2} u \right\rangle > 0,$$
hence
$$\langle v, v \rangle - \frac{\langle v, u \rangle}{|u|^2}\overline{\langle v, u \rangle} - \frac{\overline{\langle v, u \rangle}}{|u|^2}\langle u, v \rangle + \frac{\langle v, u \rangle\overline{\langle v, u \rangle}}{|u|^4}\langle u, u \rangle > 0,$$
therefore
$$|v|^2 - \frac{|\langle u, v \rangle|^2}{|u|^2} - \frac{|\langle u, v \rangle|^2}{|u|^2} + \frac{|\langle u, v \rangle|^2}{|u|^4}|u|^2 > 0,$$
hence
$$|v|^2 - \frac{|\langle u, v \rangle|^2}{|u|^2} > 0,$$
therefore multiplying through by $|u|^2$ we obtain
$$|u|^2|v|^2 - |\langle u, v \rangle|^2 > 0,$$
finally yielding
$$|\langle u, v \rangle|^2 < |u|^2|v|^2$$
as required.
Proof of triangle inequality. We will prove that $|u+v| \le |u| + |v|$. Note that
$$|u+v|^2 = \langle u+v, u+v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle = \langle u, u \rangle + \langle u, v \rangle + \overline{\langle u, v \rangle} + \langle v, v \rangle,$$
and therefore this is equal to
$$|u|^2 + 2\operatorname{Re}\langle u, v \rangle + |v|^2 \le |u|^2 + 2|\operatorname{Re}\langle u, v \rangle| + |v|^2 \le |u|^2 + 2|u|\,|v| + |v|^2 = (|u| + |v|)^2$$
by applying Cauchy–Schwarz. So the proof is complete.

Proof of Pythagoras' theorem. We will prove that if $\langle u, v \rangle = 0$ then $|u+v|^2 = |u|^2 + |v|^2$. As above, we have
$$|u+v|^2 = |u|^2 + 2\operatorname{Re}\langle u, v \rangle + |v|^2.$$
We see that $|u+v|^2 = |u|^2 + |v|^2$ if and only if $\operatorname{Re}\langle u, v \rangle = 0$.
Remark 6.13. Let $F$ be $\mathbb{R}$ or $\mathbb{C}$ and $U$ be a vector space over $F$. Then a norm on $U$ is a map $|\cdot| : U \to \mathbb{R}$ such that for all $u, v \in U$ and all $t \in F$,
1. $|tu| = |t|\,|u|$.
2. $|u+v| \le |u| + |v|$.
3. $|u| \ge 0$ with equality if and only if $u = 0$.
A vector space $U$ over $F$ with a norm is called a normed linear space.

Example 6.14. Some norms do not arise from inner products. In $\mathbb{R}^n$, for $u \in \mathbb{R}^n$ we can define the norm of $u$ to be
$$|u| = \sum_{i=1}^{n} |u_i|.$$
This is a norm on $\mathbb{R}^n$ (that is different from the standard one).
Definition 6.15. Let $U$ be an inner product space (it could be a normed linear space) over $F$ ($\mathbb{R}$ or $\mathbb{C}$). For $a, b \in U$ we define the distance between them to be
$$\operatorname{dist}(a, b) = |b - a|.$$
The distance function has the following properties for all $a, b, c \in U$:
1. $\operatorname{dist}(a, b) = \operatorname{dist}(b, a)$.
2. $\operatorname{dist}(a, c) \le \operatorname{dist}(a, b) + \operatorname{dist}(b, c)$.
3. $\operatorname{dist}(a, b) \ge 0$ with equality if and only if $a = b$.

Remark 6.16. A metric on a set $X$ is a map $\operatorname{dist} : X \times X \to \mathbb{R}$ which satisfies properties 1, 2 and 3 above. A subset $U \subseteq X$ is open when for all $a \in U$, there exists $r > 0$ such that $B_r(a) \subseteq U$, where
$$B_r(a) := \{x \in X : \operatorname{dist}(a, x) < r\}.$$
The collection of open sets has the following properties:
1. $\emptyset$ and $X$ are open.
2. If $U_1, \ldots, U_n$ are open then so is $\bigcap_{i=1}^{n} U_i$.
3. If $U_\alpha$ is open for all $\alpha \in A$ then so is $\bigcup_{\alpha \in A} U_\alpha$.
A topology on a set $X$ is a set $\mathcal{T}$ of subsets of $X$, which we call the open sets in $X$, such that 1, 2, and 3 hold. A topological space is a set $X$ together with a topology $\mathcal{T}$.
6.2 Orthogonality and Gram-Schmidt
Definition 6.17. Let $W$ be an inner product space over $F$ ($\mathbb{R}$ or $\mathbb{C}$). Let $\mathcal{U} \subseteq W$. We say $\mathcal{U}$ is orthogonal when $\langle u, v \rangle = 0$ for all $u \ne v$ in $\mathcal{U}$. We say $\mathcal{U}$ is orthonormal if it is orthogonal and furthermore $|u| = 1$ for all $u \in \mathcal{U}$.

Example 6.18. For $\mathcal{U} = \{u_1, \ldots, u_\ell\} \subseteq \mathbb{R}^n$, let $A = (u_1, \ldots, u_\ell)$; then $\mathcal{U}$ is orthogonal if and only if $A^t A$ is diagonal, since
$$A^t A = \begin{pmatrix} u_1 \cdot u_1 & \ldots & u_\ell \cdot u_1 \\ \vdots & \ddots & \vdots \\ u_1 \cdot u_\ell & \ldots & u_\ell \cdot u_\ell \end{pmatrix},$$
and $\mathcal{U}$ is orthonormal if and only if $A^t A = I$.
Example 6.19. Recall that if $\mathcal{U}$ is a basis for the vector space $U$, then for $x \in U = \operatorname{span}\,\mathcal{U}$, we can write
$$x = t_1 u_1 + t_2 u_2 + \ldots + t_\ell u_\ell$$
with each $t_i \in F$ and the $u_i$ distinct elements in $\mathcal{U}$. When $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ and $x = t_1 u_1 + \ldots + t_\ell u_\ell$ we write
$$[x]_{\mathcal{U}} = t = \begin{pmatrix} t_1 \\ \vdots \\ t_\ell \end{pmatrix} \in F^\ell.$$
The map $\phi : U \to F^\ell$ given by $\phi(x) = [x]_{\mathcal{U}}$ is a vector space isomorphism, so $U \cong F^\ell$.
Remark 6.20. For $\mathcal{U} = \{u_1, \ldots, u_\ell\} \subseteq \mathbb{R}^n$, $A = (u_1, \ldots, u_\ell)$, and $x \in U = \operatorname{span}\,\mathcal{U} = \operatorname{col} A$ with $x = t_1 u_1 + \ldots + t_\ell u_\ell = At$, note $[x]_{\mathcal{U}} = t$. To find $t = [x]_{\mathcal{U}}$ we solve $At = x$. Hence
$$A^t A t = A^t x$$
and thus $[x]_{\mathcal{U}} = t = (A^t A)^{-1} A^t x$. As a remark, note that $(A^t A)^{-1} A^t$ is a left inverse for $A$. If $\mathcal{U}$ is orthonormal then $A^t A = I$, so $[x]_{\mathcal{U}} = A^t x$, which gives
$$[x]_{\mathcal{U}} = \begin{pmatrix} u_1^t \\ \vdots \\ u_\ell^t \end{pmatrix} x = \begin{pmatrix} x \cdot u_1 \\ \vdots \\ x \cdot u_\ell \end{pmatrix}.$$
Also, for $x \in \mathbb{R}^n$, we have
$$\operatorname{Proj}_U x = A(A^t A)^{-1} A^t x.$$
If $\mathcal{U}$ is orthonormal so that $A^t A = I$, then $\operatorname{Proj}_U x = A A^t x$, which is equal to
$$(u_1, \ldots, u_\ell)\begin{pmatrix} x \cdot u_1 \\ \vdots \\ x \cdot u_\ell \end{pmatrix}.$$
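A quick numerical check of the coordinate formula $[x]_{\mathcal{U}} = (A^tA)^{-1}A^tx$ for $x \in \operatorname{col} A$ (my own sketch; the basis vectors are arbitrary):

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])                  # columns form a basis for U = col(A)
t = np.array([2.0, -1.0])
x = A @ t                                   # so [x]_U should equal t

coords = np.linalg.solve(A.T @ A, A.T @ x)  # (A^t A)^{-1} A^t x
print(coords)                               # -> [ 2. -1.]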
Theorem 6.21 (Gram–Schmidt Procedure). Let $U$ be a finite (or countable) dimensional inner product space over $F$ ($\mathbb{R}$ or $\mathbb{C}$). Let $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ (or $\mathcal{U} = \{u_1, \ldots\}$) be a basis for $U$. Let $v_1 = u_1$, and for $k \ge 2$ let
$$v_k = u_k - \sum_{i=1}^{k-1} \operatorname{Proj}_{v_i}(u_k) = u_k - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{|v_i|^2}\, v_i.$$
Then
$$\mathcal{V} = \{v_1, \ldots, v_\ell\} \qquad (\text{or } \mathcal{V} = \{v_1, \ldots\})$$
is an orthogonal basis for $U$ with $\operatorname{span}\{v_1, \ldots, v_k\} = \operatorname{span}\{u_1, \ldots, u_k\}$ for all $k \ge 1$.
Proof. We will prove inductively that $\{v_1, \ldots, v_k\}$ is an orthogonal basis for $\operatorname{span}\{u_1, \ldots, u_k\}$. The base case holds since $v_1 = u_1$. Fix $k \ge 2$ and suppose that $\{v_1, \ldots, v_{k-1}\}$ is an orthogonal basis for $\operatorname{span}\{u_1, \ldots, u_{k-1}\}$. Let
$$v_k = u_k - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{|v_i|^2}\, v_i.$$
Since $v_k$ is equal to $u_k$ plus a linear combination of $v_1, \ldots, v_{k-1}$, and since $\operatorname{span}\{v_1, \ldots, v_{k-1}\} = \operatorname{span}\{u_1, \ldots, u_{k-1}\}$, it follows that
$$\operatorname{span}\{v_1, \ldots, v_k\} = \operatorname{span}\{v_1, \ldots, v_{k-1}, u_k\} = \operatorname{span}\{u_1, \ldots, u_{k-1}, u_k\},$$
and hence $\{v_1, \ldots, v_k\}$ is a basis for $\operatorname{span}\{u_1, \ldots, u_k\}$. Next, we seek to show that $\{v_1, \ldots, v_k\}$ is orthogonal. By our induction hypothesis, $\langle v_i, v_j \rangle = 0$ for all $i \ne j$ less than $k$. So it remains to show that $\langle v_k, v_i \rangle = 0$ for $1 \le i \le k-1$. We have for $1 \le j \le k-1$ that
$$\langle v_k, v_j \rangle = \left\langle u_k - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{|v_i|^2} v_i,\; v_j \right\rangle = \langle u_k, v_j \rangle - \sum_{i=1}^{k-1} \frac{\langle u_k, v_i \rangle}{|v_i|^2} \langle v_i, v_j \rangle,$$
but since $\langle v_i, v_j \rangle = 0$ for all $i \ne j$, this becomes
$$\langle u_k, v_j \rangle - \frac{\langle u_k, v_j \rangle}{|v_j|^2}\langle v_j, v_j \rangle = 0.$$
This completes the proof.
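Here is a minimal sketch of the procedure for column vectors with the standard real dot product (my own addition, so no conjugates are needed; the input vectors are arbitrary):

import numpy as np

def gram_schmidt(us):
    """Given a linearly independent list of vectors, return an orthogonal list
    with span{v_1,...,v_k} = span{u_1,...,u_k} for each k."""
    vs = []
    for u in us:
        v = u.astype(float).copy()
        for w in vs:
            v -= (np.dot(u, w) / np.dot(w, w)) * w   # subtract Proj_w(u)
        vs.append(v)
    return vs

vs = gram_schmidt([np.array([1.0, 1, 0]), np.array([1.0, 0, 1]), np.array([0.0, 1, 1])])
print(np.round([[np.dot(a, b) for b in vs] for a in vs], 10))   # diagonal Gram matrix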
Corollary 6.22. Every finite (or countable) dimensional inner product space $U$ over $F$ ($\mathbb{R}$ or $\mathbb{C}$) has an orthonormal basis.

Proof. Let $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ (or $\mathcal{U} = \{u_1, \ldots\}$) be a basis for $U$. Apply the Gram–Schmidt procedure to $\mathcal{U}$ to obtain an orthogonal basis $\mathcal{V} = \{v_1, \ldots, v_\ell\}$ (or $\mathcal{V} = \{v_1, \ldots\}$) for $U$, then for all $k \ge 1$ let
$$w_k = \frac{v_k}{|v_k|}$$
to obtain an orthonormal basis $\mathcal{W} = \{w_1, \ldots, w_\ell\}$ (or $\mathcal{W} = \{w_1, \ldots\}$).
Remark 6.23. The previous corollary does not hold for uncountable dimensional vector spaces.

Corollary 6.24. Let $W$ be a finite (or countable) dimensional inner product space over $\mathbb{R}$ or $\mathbb{C}$ and let $U$ be a finite dimensional subspace. Then any orthogonal (or orthonormal) basis $\mathcal{U}$ for $U$ can be extended to an orthogonal (or orthonormal) basis $\mathcal{W}$ for $W$.

Proof. Extend a given orthogonal basis $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ to a basis $\{u_1, \ldots, u_\ell, u_{\ell+1}, \ldots\}$ for $W$, then apply the Gram–Schmidt procedure to obtain an orthogonal basis $\{v_1, \ldots, v_\ell, v_{\ell+1}, \ldots\}$ for $W$, and verify that $v_i = u_i$ for $1 \le i \le \ell$.

Remark 6.25. This result does not always hold when $U$ is countable dimensional.
Example 6.26. Let $W$ be an inner product space. Let $\mathcal{U} = \{u_1, \ldots, u_\ell\} \subseteq W$. If $\mathcal{U}$ is an orthogonal set of nonzero vectors, then for $x \in \operatorname{span}\,\mathcal{U}$, say
$$x = t_1 u_1 + \ldots + t_\ell u_\ell = \sum_{i=1}^{\ell} t_i u_i$$
with each $t_i \in F$ ($\mathbb{R}$ or $\mathbb{C}$), we have for each $k$ ($1 \le k \le \ell$) that
$$\langle x, u_k \rangle = \left\langle \sum_{i=1}^{\ell} t_i u_i,\ u_k \right\rangle = \sum_{i=1}^{\ell} t_i \langle u_i, u_k \rangle = t_k \langle u_k, u_k \rangle = t_k |u_k|^2.$$
Thus,
$$t_k = \frac{\langle x, u_k \rangle}{|u_k|^2},$$
so $\mathcal{U}$ is linearly independent and we have
$$[x]_{\mathcal{U}} = \left(\frac{\langle x, u_1 \rangle}{|u_1|^2}, \ \ldots, \ \frac{\langle x, u_\ell \rangle}{|u_\ell|^2}\right)^t.$$
If $\mathcal{U}$ is orthonormal, then
$$[x]_{\mathcal{U}} = \big(\langle x, u_1 \rangle, \ \ldots, \ \langle x, u_\ell \rangle\big)^t.$$
6.3 Orthogonal complements
Definition 6.27. Let $W$ be an inner product space. Let $U$ be a subspace of $W$. The orthogonal complement of $U$ in $W$ is the vector space
$$U^\perp = \{x \in W : \langle x, u \rangle = 0 \text{ for all } u \in U\}.$$
Note that if $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ is a basis for $U$, then
$$U^\perp = \{x \in W : \langle x, u_i \rangle = 0 \text{ for all } i = 1, \ldots, \ell\},$$
and also if $\mathcal{U}$ is a (possibly infinite) basis for $U$, then
$$U^\perp = \{x \in W : \langle x, u \rangle = 0 \text{ for all } u \in \mathcal{U}\}.$$
Remark 6.28. Let $W$ be a finite dimensional inner product space and let $U$ be a subspace of $W$. Let $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ be an orthogonal (or orthonormal) basis for $U$. Extend $\mathcal{U}$ to an orthogonal (or orthonormal) basis
$$\mathcal{W} = \{u_1, \ldots, u_\ell, v_1, \ldots, v_m\}$$
for $W$. Then $\mathcal{V} = \{v_1, \ldots, v_m\}$ is an orthogonal (or orthonormal) basis for $U^\perp$.

Proof. For $x = t_1 u_1 + \ldots + t_\ell u_\ell + s_1 v_1 + \ldots + s_m v_m$, we have
$$t_k = \frac{\langle x, u_k \rangle}{|u_k|^2} \qquad \text{and} \qquad s_k = \frac{\langle x, v_k \rangle}{|v_k|^2}.$$
If $x \in U^\perp$, then each $t_k = 0$, so $x \in \operatorname{span}\,\mathcal{V}$. If $x \in \operatorname{span}\,\mathcal{V}$ then $x = s_1 v_1 + \ldots + s_m v_m$, and then for each $k$, $\langle x, u_k \rangle = \langle \sum s_i v_i, u_k \rangle = \sum s_i \langle v_i, u_k \rangle = 0$.
Remark 6.29. As a consequence, we see that in the case above,
1. $U \oplus U^\perp = W$.
2. $\dim U + \dim U^\perp = \dim W$.
3. $U = (U^\perp)^\perp$.

Proof of #3. For $x \in U$ we have $\langle x, v \rangle = 0$ for all $v \in U^\perp$ (from the definition of $U^\perp$), and so $x \in (U^\perp)^\perp$ (from the definition of $(U^\perp)^\perp$); thus $U \subseteq (U^\perp)^\perp$. Also $\dim U = \dim W - \dim U^\perp = \dim W - (\dim W - \dim(U^\perp)^\perp) = \dim(U^\perp)^\perp$. Thus $U = (U^\perp)^\perp$.

Note carefully that the first part of the proof above does not use finite-dimensionality.

Remark 6.30. When $U$ is infinite dimensional we still have $U \subseteq (U^\perp)^\perp$, but in general $U \ne (U^\perp)^\perp$.

Since $U \oplus U^\perp = W$, it follows that given $x \in W$ there exist unique vectors $u, v$ with $u \in U$, $v \in U^\perp$ such that $u + v = x$.
Example 6.31. Consider the inner product space $W = \mathbb{R}^\infty$. This is the set of sequences $(a_1, a_2, \ldots)$ with each $a_i \in \mathbb{R}$ and only finitely many $a_i$ nonzero. $W$ has basis $\{e_1, e_2, \ldots\}$. Let $U$ be the subspace of sequences whose sum $\sum_{i=1}^{\infty} a_i = 0$ (note that this is a finite sum). $U$ has basis
$$\{e_k - e_1 : k \ge 2\} = \{e_2 - e_1, e_3 - e_1, \ldots\}$$
and
$$U^\perp = \{x \in \mathbb{R}^\infty : \langle x, a \rangle = 0 \text{ for all } a \in U\}.$$
Note that
$$\langle x, a \rangle = \langle (x_1, x_2, \ldots), (a_1, a_2, \ldots) \rangle = \sum_{i=1}^{\infty} x_i a_i.$$
So
$$U^\perp = \{x \in \mathbb{R}^\infty : \langle x, e_k - e_1 \rangle = 0 \text{ for all } k \ge 2\} = \{x = (x_1, x_1, x_1, \ldots) \in \mathbb{R}^\infty\} = \{0\},$$
since only finitely many $x_i$ are nonzero.
6.4 Orthogonal projections
Theorem 6.32 (Orthogonal Projection Theorem). Let $W$ be a (possibly infinite dimensional) inner product space. Let $U$ be a finite dimensional subspace of $W$. Let $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ be an orthogonal basis for $U$. Given $x \in W$, there exist unique vectors $u, v \in W$ with $u \in U$, $v \in U^\perp$ such that $u + v = x$. The vector $u$ is called the orthogonal projection of $x$ onto $U$ and is denoted $\operatorname{Proj}_U x$. The projection is given by
$$u = \operatorname{Proj}_U x = \sum_{i=1}^{\ell} \frac{\langle x, u_i \rangle}{|u_i|^2}\, u_i.$$
Also, $u$ is the unique vector in $U$ which is nearest to $x$.
Proof. (uniqueness) Suppose such $u, v$ exist. So $u \in U$, $v \in U^\perp$ and $u + v = x$. Say
$$u = \sum_{i=1}^{\ell} t_i u_i, \qquad \text{so that } t_i = \frac{\langle u, u_i \rangle}{|u_i|^2}.$$
We have
$$\langle x, u_i \rangle = \langle u + v, u_i \rangle = \langle u, u_i \rangle + \langle v, u_i \rangle = \langle u, u_i \rangle,$$
since $v \in U^\perp$ so $\langle v, u_i \rangle = 0$; therefore
$$u = \sum_{i=1}^{\ell} \frac{\langle u, u_i \rangle}{|u_i|^2}\, u_i = \sum_{i=1}^{\ell} \frac{\langle x, u_i \rangle}{|u_i|^2}\, u_i.$$
This completes the proof of uniqueness.

(existence) Given $x \in W$, let
$$u = \sum_{i=1}^{\ell} \frac{\langle x, u_i \rangle}{|u_i|^2}\, u_i, \qquad \text{so that} \qquad \frac{\langle u, u_i \rangle}{|u_i|^2} = \frac{\langle x, u_i \rangle}{|u_i|^2},$$
then let $v = x - u$. Clearly $u \in U = \operatorname{span}\{u_1, \ldots, u_\ell\}$ and $u + v = x$. We must verify $v \in U^\perp$. We have
$$\langle v, u_i \rangle = \langle x - u, u_i \rangle = \langle x, u_i \rangle - \langle u, u_i \rangle = 0.$$
This completes the proof of existence.

We claim $u$ is the unique point in $U$ nearest to $x$. Let $w \in U$ with $w \ne u$. Note $w - u \in U$, since $w, u \in U$. Also $x - u = v \in U^\perp$, so $\langle x - u, w - u \rangle = 0$. By Pythagoras' theorem,
$$|w - x|^2 = |(w - u) - (x - u)|^2 = |x - u|^2 + |w - u|^2 > |x - u|^2$$
since $w \ne u$.
Example 6.33. Let $a_0, \ldots, a_n$ be $n + 1$ distinct points in $F$ ($\mathbb{R}$ or $\mathbb{C}$). Consider $\mathcal{P}_n(F)$ with the inner product
$$\langle f, g \rangle = \sum_{i=0}^{n} f(a_i)\overline{g(a_i)}.$$
For $0 \le k \le n$, let
$$g_k(x) = \prod_{i \ne k} \frac{x - a_i}{a_k - a_i},$$
so that we have $g_k(a_i) = \delta_{ki}$ (Kronecker delta notation). Note that $\{g_0, \ldots, g_n\}$ is an orthonormal basis for $\mathcal{P}_n(F)$. For $f \in \mathcal{P}_n(F)$,
$$f = \sum_{k=0}^{n} \frac{\langle f, g_k \rangle}{|g_k|^2}\, g_k,$$
and
$$\langle f, g_k \rangle = \sum_{i=0}^{n} f(a_i)\overline{g_k(a_i)} = f(a_k), \qquad \text{since } g_k(a_i) = \delta_{ki}.$$
So then
$$f = \sum_{k=0}^{n} f(a_k)\, g_k.$$
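Example 6.33 is exactly Lagrange interpolation. A short sketch of mine (not from the notes) evaluating the basis polynomials $g_k$ numerically and recombining a sample $f$:

import numpy as np

a = np.array([0.0, 1.0, 2.0])              # distinct points a_0, ..., a_n

def g(k, x):
    """Lagrange basis polynomial g_k evaluated at x."""
    terms = [(x - a[i]) / (a[k] - a[i]) for i in range(len(a)) if i != k]
    return np.prod(terms)

f = lambda x: 3 * x**2 - x + 2              # any f in P_n
# f = sum_k f(a_k) g_k, so evaluating the right-hand side reproduces f:
print([sum(f(a[k]) * g(k, x) for k in range(len(a))) for x in (0.5, 1.7)])
print([f(x) for x in (0.5, 1.7)])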
Example 6.34. Find the polynomial of degree at most 2 ($f \in \mathcal{P}_2(\mathbb{R})$) which minimizes
$$\int_{-1}^{1} (f(x) - |x|)^2\, dx.$$

Solution. Consider the vector space $C([-1, 1], \mathbb{R})$ with its standard inner product
$$\langle f, g \rangle = \int_{-1}^{1} fg.$$
We need to find the unique $f \in \mathcal{P}_2(\mathbb{R})$ which minimizes $\operatorname{dist}(f, g)$ where $g(x) = |x|$. We must take
$$f = \operatorname{Proj}_{\mathcal{P}_2} g.$$
Let $p_0 = 1$, $p_1 = x$, $p_2 = x^2$. So $\{p_0, p_1, p_2\}$ is the standard basis for $\mathcal{P}_2(\mathbb{R})$. Apply the Gram–Schmidt procedure:
$$q_0 = p_0 = 1,$$
$$q_1 = p_1 - \frac{\langle p_1, q_0 \rangle}{|q_0|^2} q_0 = p_1 = x,$$
$$q_2 = p_2 - \frac{\langle p_2, q_0 \rangle}{|q_0|^2} q_0 - \frac{\langle p_2, q_1 \rangle}{|q_1|^2} q_1 = x^2 - \frac{2/3}{2}\cdot 1 - 0 = x^2 - \frac{1}{3}.$$
We now have an orthogonal basis $\{q_0, q_1, q_2\}$. So we take
$$f = \operatorname{Proj}_{\mathcal{P}_2} g = \frac{\langle g, q_0 \rangle}{|q_0|^2} q_0 + \frac{\langle g, q_1 \rangle}{|q_1|^2} q_1 + \frac{\langle g, q_2 \rangle}{|q_2|^2} q_2 = \ldots = \frac{15}{16}x^2 + \frac{3}{16}.$$
We are done.
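A numerical sanity check of this answer (a sketch of mine using a simple quadrature on a fine grid; any nearby polynomial should give a larger integral):

import numpy as np

x = np.linspace(-1, 1, 200001)
g = np.abs(x)

def err(c0, c1, c2):
    f = c0 + c1 * x + c2 * x**2
    return np.trapz((f - g) ** 2, x)        # integral of (f - |x|)^2 over [-1, 1]

print(err(3/16, 0.0, 15/16))                 # the projection found above
print(err(3/16 + 0.01, 0.0, 15/16))          # perturbations do worse
print(err(3/16, 0.0, 15/16 + 0.01))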
Definition 6.35. Let $U$ and $V$ be inner product spaces. An inner product space isomorphism from $U$ to $V$ is a bijective linear map $L : U \to V$ which preserves the inner product, that is, $\langle L(x), L(y) \rangle = \langle x, y \rangle$ for all $x, y \in U$.

Proposition 6.36. Note that if $\mathcal{U} = \{u_1, \ldots, u_\ell\}$ and $\mathcal{V} = \{v_1, \ldots, v_\ell\}$ are orthonormal bases for $U$ and $V$ respectively, then the linear map $L : U \to V$ given by $L(u_i) = v_i$ (that is, $L(\sum t_i u_i) = \sum t_i v_i$) is an inner product space isomorphism.

Proof. For $x = \sum s_i u_i$, $y = \sum t_j u_j$ we have
$$\langle x, y \rangle = \left\langle \sum s_i u_i, \sum t_j u_j \right\rangle = \sum_{i,j} s_i \bar{t}_j \langle u_i, u_j \rangle = \sum_i s_i \bar{t}_i = \left\langle \begin{pmatrix} s_1 \\ \vdots \\ s_\ell \end{pmatrix}, \begin{pmatrix} t_1 \\ \vdots \\ t_\ell \end{pmatrix} \right\rangle = \big\langle [x]_{\mathcal{U}}, [y]_{\mathcal{U}} \big\rangle,$$
and we have
$$L(x) = L\Big(\sum s_i u_i\Big) = \sum s_i L(u_i) = \sum s_i v_i, \qquad L(y) = \sum t_j v_j, \qquad \langle L(x), L(y) \rangle = \sum_i s_i \bar{t}_i.$$
This completes the proof.
As a corollary to the Gram–Schmidt orthogonalization procedure, we see that every $n$-dimensional inner product space over $F$ is isomorphic to $F^n$. If you have any (possibly infinite dimensional) vector space, it has a basis, and you can use that basis to construct an inner product. Say $\mathcal{U}$ is a basis. We define an inner product as follows: for
$$x = \sum_{u \in \mathcal{U}} s_u u \qquad \text{and} \qquad y = \sum_{u \in \mathcal{U}} t_u u$$
we define
$$\langle x, y \rangle = \sum_{u \in \mathcal{U}} s_u \bar{t}_u.$$
So on every vector space we can construct an inner product such that a given basis is orthonormal. However, not all inner product spaces (namely, certain infinite dimensional ones) furnish an orthogonal basis!
7 Linear operators
7.1 Eigenvalues and eigenvectors
We now want to consider the following problem. Given a linear map $L : U \to V$, find bases $\mathcal{U}$ and $\mathcal{V}$ which are related to the geometry of $L$ so that the matrix $[L]_{\mathcal{U}}^{\mathcal{V}}$ is in some sense simple.
Example 7.1. Let $U$ and $V$ be finite-dimensional vector spaces over any field $F$. Let $L : U \to V$ be a linear map. Show that we can choose bases $\mathcal{U}$ and $\mathcal{V}$ for $U$ and $V$ so that
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}.$$

Solution. Suppose $r = \operatorname{rank}(L)$. Choose a basis
$$\{u_{r+1}, \ldots, u_k\}$$
for $\ker(L) = \operatorname{null}(L)$. Extend this to a basis
$$\mathcal{U} := \{u_1, \ldots, u_r, u_{r+1}, \ldots, u_k\}$$
for $U$. Let $v_i = L(u_i)$ (for $i$ with $1 \le i \le r$). Verify that
$$\{v_1, \ldots, v_r\}$$
is linearly independent and hence forms a basis for the range of $L$. Extend this to a basis
$$\mathcal{V} = \{v_1, \ldots, v_r, \ldots, v_\ell\}.$$
Thus
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \big([L(u_1)]_{\mathcal{V}}\ \ldots\ [L(u_k)]_{\mathcal{V}}\big) = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$$
and we are done.
Remark 7.2. $[L]_{\mathcal{U}}^{\mathcal{V}}$ is the matrix such that
$$[L(x)]_{\mathcal{V}} = [L]_{\mathcal{U}}^{\mathcal{V}}\, [x]_{\mathcal{U}}.$$
When $\mathcal{U} = \mathcal{V}$ we sometimes simply write $[L]_{\mathcal{U}}$. Note that for $\mathcal{U} = \{u_1, \ldots, u_k\}$ and $\mathcal{V} = \{v_1, \ldots, v_\ell\}$ we have
$$[L(u_i)]_{\mathcal{V}} = [L]_{\mathcal{U}}^{\mathcal{V}}\, [u_i]_{\mathcal{U}} = [L]_{\mathcal{U}}^{\mathcal{V}}\, e_i = i\text{th column of } [L]_{\mathcal{U}}^{\mathcal{V}},$$
therefore
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \big([L(u_1)]_{\mathcal{V}}\ \ldots\ [L(u_k)]_{\mathcal{V}}\big) \in M_{\ell \times k}.$$
Also, for
$$U\ (\mathcal{U}) \xrightarrow{\ L\ } V\ (\mathcal{V}) \xrightarrow{\ M\ } W\ (\mathcal{W})$$
we have
$$[ML]_{\mathcal{U}}^{\mathcal{W}} = [M]_{\mathcal{V}}^{\mathcal{W}}\, [L]_{\mathcal{U}}^{\mathcal{V}}.$$
Similarly, for
$$U\ (\mathcal{U}_1) \xrightarrow{\ I_U\ } U\ (\mathcal{U}_2) \xrightarrow{\ L\ } V\ (\mathcal{V}_2) \xrightarrow{\ I_V\ } V\ (\mathcal{V}_1)$$
we have
$$[L]_{\mathcal{U}_1}^{\mathcal{V}_1} = [I_V]_{\mathcal{V}_2}^{\mathcal{V}_1}\, [L]_{\mathcal{U}_2}^{\mathcal{V}_2}\, [I_U]_{\mathcal{U}_1}^{\mathcal{U}_2}.$$
Warning. Some calculational examples follow. I'm not completely sure about the correctness of these calculations; sorry.
Example 7.3. Let $u_1 = (1, 1, 2)^t$, $u_2 = (2, 1, 3)^t$, $\mathcal{U} = \{u_1, u_2\}$, and $U = \operatorname{span}(\mathcal{U})$. Let $F = \operatorname{Refl}_U$, that is, $F : \mathbb{R}^3 \to \mathbb{R}^3$ is given by
$$F(x) = \operatorname{Refl}_U(x) = x - 2\operatorname{Proj}_{U^\perp}(x) = \operatorname{Proj}_U(x) - \operatorname{Proj}_{U^\perp}(x) = 2\operatorname{Proj}_U(x) - x.$$
We wish to find $[F]$.

Solution. There are three methods.

1. Let $A = (u_1, u_2)$. We have $\operatorname{Proj}_U(x) = A(A^t A)^{-1} A^t x$. Therefore
$$F(x) = 2\operatorname{Proj}_U(x) - x = \big(2A(A^t A)^{-1} A^t - I\big)x,$$
so that
$$[F] = 2A(A^t A)^{-1} A^t - I.$$
Calculating, we see that
$$A^t A = \begin{pmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 1 & 1 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 6 & 9 \\ 9 & 14 \end{pmatrix}$$
and eventually we arrive at
$$[F] = \frac{1}{3}\begin{pmatrix} 1 & -2 & 2 \\ -2 & 1 & 2 \\ 2 & 2 & 1 \end{pmatrix}.$$
2. Use Gram–Schmidt to construct an orthonormal basis and perform the projection this way.
3. We have $F(u_1) = u_1$, $F(u_2) = u_2$. Choose $u_3 = u_2 \times u_1 = (-1, -1, 1)^t$, so that $u_3 \in U^\perp$. Then $F(u_3) = -u_3$. For $\mathcal{V} = \{u_1, u_2, u_3\}$,
$$[F]_{\mathcal{V}} = \big([F(u_1)]_{\mathcal{V}}\ [F(u_2)]_{\mathcal{V}}\ [F(u_3)]_{\mathcal{V}}\big) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
So
$$F(u_1, u_2, u_3) = (u_1, u_2, -u_3) = (u_1, u_2, u_3)\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$
and then we get
$$[F] = (u_1, u_2, u_3)\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}(u_1, u_2, u_3)^{-1}.$$
We are done.
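A numerical check of method 1 against the matrix above (a sketch of mine, not part of the notes):

import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [2.0, 3.0]])                      # columns u_1, u_2
F = 2 * A @ np.linalg.inv(A.T @ A) @ A.T - np.eye(3)
print(np.round(3 * F))      # -> [[ 1. -2.  2.] [-2.  1.  2.] [ 2.  2.  1.]]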
Example 7.4. Let $R$ be the rotation in $\mathbb{R}^3$ about the vector $u = u_1 = (1, 0, 2)^t$ by $\pi/2$ (with the direction given by the right hand rule). We wish to find $[R]$.
Solution. Choose $u_2 = (2, 0, -1)^t$ so that $u \cdot u_2 = 0$, and choose $u_3 = u_1 \times u_2 = (0, 5, 0)^t$. Let
$$v_1 = \frac{u_1}{|u_1|} = \frac{1}{\sqrt{5}}\begin{pmatrix}1\\0\\2\end{pmatrix}, \qquad v_2 = \frac{u_2}{|u_2|} = \frac{1}{\sqrt{5}}\begin{pmatrix}2\\0\\-1\end{pmatrix}, \qquad v_3 = \frac{u_3}{|u_3|} = \begin{pmatrix}0\\1\\0\end{pmatrix}$$
and observe that $\mathcal{V} = \{v_1, v_2, v_3\}$ is orthonormal (and right-handed). Then $R(v_1) = v_1$, $R(v_2) = v_3$, $R(v_3) = -v_2$, so we get
$$[R]_{\mathcal{V}} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}.$$
Then
$$[R]\,(v_1, v_2, v_3) = (v_1, v_2, v_3)\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}$$
so that
$$[R] = (v_1, v_2, v_3)\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}(v_1, v_2, v_3)^{-1}.$$
Note that $(v_1, v_2, v_3)^{-1} = (v_1, v_2, v_3)^t$ for orthonormal vectors.
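As a numerical sanity check, here is a minimal sketch (not from the notes, NumPy assumed) that assembles $[R]$ by conjugating the simple matrix above by the orthonormal basis.

# Sketch (not from the notes): build [R] = V M V^t for Example 7.4, where the
# columns of V are the orthonormal basis v1, v2, v3.
import numpy as np

v1 = np.array([1.0, 0.0, 2.0]) / np.sqrt(5)
v2 = np.array([2.0, 0.0, -1.0]) / np.sqrt(5)
v3 = np.array([0.0, 1.0, 0.0])
V = np.column_stack([v1, v2, v3])

M = np.array([[1.0, 0.0,  0.0],        # [R] in the basis {v1, v2, v3}:
              [0.0, 0.0, -1.0],        # v1 fixed, v2 -> v3, v3 -> -v2
              [0.0, 1.0,  0.0]])

R = V @ M @ V.T
print(np.allclose(R @ v1, v1))         # the axis is fixed
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))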
Example 7.5. Let $L : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear map which scales by a factor of 2 in the direction of $u_1 := (1, 1, 2)^t$, fixes vectors in the direction of $u_2 := (2, 1, 3)^t$, and annihilates vectors in the direction of $u_3 := (1, 1, -1)^t$. Find $[L] = [L]_{\mathcal{S}}$ (where $\mathcal{S}$ is the standard basis).
Solution. Let $\mathcal{U} = \{u_1, u_2, u_3\}$. Note $L(u_1) = 2u_1$, $L(u_2) = u_2$, and $L(u_3) = 0$. Then
$$[L]_{\mathcal{U}} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
but note that
$$[L]_{\mathcal{S}} = [I]_{\mathcal{U}}^{\mathcal{S}}\,[L]_{\mathcal{U}}^{\mathcal{U}}\,[I]_{\mathcal{S}}^{\mathcal{U}}.$$
Since $[I]_{\mathcal{U}}^{\mathcal{S}} = (u_1, u_2, u_3)$ and $[I]_{\mathcal{S}}^{\mathcal{U}} = (u_1, u_2, u_3)^{-1}$, we calculate
$$[L]_{\mathcal{S}} = \frac{1}{3}\begin{pmatrix} -2 & 4 & 2 \\ -5 & 7 & 2 \\ -7 & 11 & 4 \end{pmatrix}.$$
We are done.
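The change-of-basis computation is a one-liner numerically; here is a minimal sketch (not from the notes, NumPy assumed).

# Sketch (not from the notes): Example 7.5 via [L]_S = P diag(2, 1, 0) P^{-1},
# with P = (u1, u2, u3).
import numpy as np

P = np.column_stack([[1, 1, 2], [2, 1, 3], [1, 1, -1]]).astype(float)
D = np.diag([2.0, 1.0, 0.0])

L = P @ D @ np.linalg.inv(P)
print(np.round(3 * L))   # expect [[-2,4,2],[-5,7,2],[-7,11,4]]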
Remark 7.6. Note that
$$[L]_{\mathcal{U}} = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \quad\text{that is,}\quad \big([L(u_1)]_{\mathcal{U}} \;\cdots\; [L(u_n)]_{\mathcal{U}}\big) = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix},$$
which occurs if and only if $L(u_i) = \lambda_i u_i$ for all $i$ ($1 \le i \le n$).
Definition 7.7. For a linear operator $L : U \to U$, we say $\lambda \in F$ is an eigenvalue of $L$ and $u$ is an eigenvector of $L$ for $\lambda$ when $u \ne 0$ and $L(u) = \lambda u$.
Definition 7.8. For a linear operator $L$ on a finite dimensional vector space $U$, the characteristic polynomial of $L$ is the polynomial
$$f(t) = f_L(t) = \det(L - tI).$$
Definition 7.9. For $\lambda \in F$ an eigenvalue, the eigenspace of $\lambda$ is the space
$$E_\lambda := \operatorname{null}(L - \lambda I).$$
Remark 7.10 (eigenvalue characterizations). Let $L$ be a linear operator on $U$. The following are equivalent:
1. $\lambda$ is an eigenvalue of $L$.
2. $L(u) = \lambda u$ for some nonzero $u \in U$.
3. $(L - \lambda I)(u) = 0$ for some nonzero $u \in U$.
4. $L - \lambda I$ has a nontrivial kernel (that is, $E_\lambda$ is nontrivial).
If we assume further that $U$ is finite dimensional then we can add the following to the list:
5. $\det(L - \lambda I) = 0$.
6. $\lambda$ is a root of the characteristic polynomial $f_L$.
Note also that a nonzero $u \in U$ is an eigenvector for $\lambda$ if and only if $u \in E_\lambda$.
Remark 7.11. Suppose $\dim(U) = n$. If $L : U \to U$ is diagonalizable, say
$$[L]_{\mathcal{U}} = \operatorname{diag}(\lambda_1, \dots, \lambda_n),$$
then we have
$$f_L(t) = \det(L - tI) = \det\big([L - tI]_{\mathcal{U}}\big) = \det\big([L]_{\mathcal{U}} - tI\big) = \det\begin{pmatrix} \lambda_1 - t & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n - t \end{pmatrix} = (-1)^n\prod_{i=1}^{n}(t - \lambda_i).$$
Note that the isomorphism $\Phi_{\mathcal{U}} : U \to F^n$ given by $\Phi_{\mathcal{U}}(x) = [x]_{\mathcal{U}}$ maps $\operatorname{null}(L)$ onto $\operatorname{null}([L]_{\mathcal{U}})$ and hence maps $\operatorname{null}(L - \lambda I)$ onto
$$\operatorname{null}\begin{pmatrix} \lambda_1 - \lambda & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n - \lambda \end{pmatrix} = \operatorname{span}\{e_i : \lambda_i = \lambda\}.$$
Thus,
$$\operatorname{null}(L - \lambda I) = \operatorname{span}\{u_i : \lambda_i = \lambda\}$$
so that
$$\dim E_\lambda = \dim\operatorname{null}(L - \lambda I) = \#\{i : \lambda_i = \lambda\} = \text{multiplicity of $\lambda$ in } f_L(t).$$
Theorem 7.12 (Eigenvector Independence). Let $L$ be a linear operator on a vector space $U$. Let $\lambda_1, \dots, \lambda_k$ be distinct eigenvalues of $L$. Let $u_1, \dots, u_k$ be corresponding eigenvectors. Then the set $\{u_1, \dots, u_k\}$ is linearly independent.
Proof. We proceed by induction. Note that $\{u_1\}$ is linearly independent, since $u_1 \ne 0$ by definition of an eigenvector. Suppose $\{u_1, \dots, u_{k-1}\}$ is linearly independent and assume
$$t_1 u_1 + t_2 u_2 + \dots + t_{k-1}u_{k-1} + t_k u_k = 0$$
where each $t_i \in F$. Operate on both sides by the transformation $(L - \lambda_k I)$. We obtain
$$t_1\big(L(u_1) - \lambda_k u_1\big) + \dots + t_{k-1}\big(L(u_{k-1}) - \lambda_k u_{k-1}\big) + t_k\big(L(u_k) - \lambda_k u_k\big) = 0$$
and therefore
$$t_1(\lambda_1 - \lambda_k)u_1 + \dots + t_{k-1}(\lambda_{k-1} - \lambda_k)u_{k-1} + t_k(\lambda_k - \lambda_k)u_k = 0.$$
Since the $\lambda_i$ are distinct, and $\{u_1, \dots, u_{k-1}\}$ is linearly independent, it follows that
$$t_1 = t_2 = \dots = t_{k-1} = 0,$$
and hence $t_k u_k = 0$, so $t_k = 0$ as required.
Corollary 7.13. If $\lambda_1, \dots, \lambda_k$ are distinct eigenvalues for a linear operator $L$ on a vector space $U$ and if $\mathcal{U}_i = \{u_{i,1}, \dots, u_{i,\ell_i}\}$ is a basis for $E_{\lambda_i}$, then
$$\bigcup_{i=1}^{k}\mathcal{U}_i$$
is a basis for
$$E_{\lambda_1} + E_{\lambda_2} + \dots + E_{\lambda_k} = \bigoplus_{i=1}^{k}E_{\lambda_i}.$$
Theorem 7.14 (Dimension of an Eigenspace). Let $L$ be a linear operator on a finite dimensional vector space $U$. Let $\lambda$ be an eigenvalue of $L$. Then
$$1 \le \dim E_\lambda \le \operatorname{mult}_\lambda(f_L).$$
Proof. Since $\lambda$ is an eigenvalue, we have
$$E_\lambda = \operatorname{null}(L - \lambda I) \ne \{0\}.$$
Hence $\dim E_\lambda \ge 1$. Let $\{u_1, \dots, u_\ell\}$ be a basis for $E_\lambda$ (so that $\dim E_\lambda = \ell$). Extend this to a basis
$$\mathcal{U} = \{u_1, \dots, u_n\}$$
for $U$. Then $[L]_{\mathcal{U}}$ has the block form
$$[L]_{\mathcal{U}} = \begin{pmatrix} \lambda I_\ell & A \\ 0 & B \end{pmatrix}$$
where $0$ is a zero matrix. So
$$f_L(t) = \det\big([L]_{\mathcal{U}} - tI\big) = \det\begin{pmatrix} (\lambda - t)I_\ell & A \\ 0 & B - tI \end{pmatrix} = (\lambda - t)^\ell\det(B - tI).$$
Hence $(t - \lambda)^\ell$ divides $f_L(t)$, therefore $\ell \le \operatorname{mult}_\lambda(f_L)$. This completes the proof.
Corollary 7.15. For a linear operator $L$ on a finite dimensional vector space $U$, $L$ is diagonalizable if and only if $f_L$ splits (factors completely into linear factors) and $\dim(E_\lambda) = \operatorname{mult}_\lambda(f_L)$ for each eigenvalue $\lambda$.
Example 7.16. Let $L(x) = Ax$, where
$$A = \begin{pmatrix} -2 & 4 & 2 \\ -5 & 7 & 2 \\ -7 & 11 & 4 \end{pmatrix}.$$
Diagonalize $L$.
Solution. Find the eigenvalues:
$$\det(A - \lambda I) = \det\begin{pmatrix} -2-\lambda & 4 & 2 \\ -5 & 7-\lambda & 2 \\ -7 & 11 & 4-\lambda \end{pmatrix}$$
which expands and simplifies to $-\lambda(\lambda - 3)(\lambda - 6)$. Hence the eigenvalues are $\lambda = 0, 3, 6$. Find a basis for each eigenspace:
($\lambda = 0$): Note that
$$\operatorname{null}(A - \lambda I) = \operatorname{null}\begin{pmatrix} -2 & 4 & 2 \\ -5 & 7 & 2 \\ -7 & 11 & 4 \end{pmatrix} = \operatorname{null}\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$
by a row reduction. So a basis is $\{(1, 1, -1)^t\}$.
($\lambda = 3$): Note that
$$\operatorname{null}(A - \lambda I) = \operatorname{null}\begin{pmatrix} -5 & 4 & 2 \\ -5 & 4 & 2 \\ -7 & 11 & 1 \end{pmatrix} = \operatorname{null}\begin{pmatrix} 1 & 0 & -2/3 \\ 0 & 1 & -1/3 \\ 0 & 0 & 0 \end{pmatrix}$$
hence a basis is $\{(2, 1, 3)^t\}$.
And so on.
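A quick numerical check of this example is below; this is a sketch (not from the notes, NumPy assumed), and the specific matrix entries follow the reconstruction above.

# Sketch (not from the notes): verify the eigenvalues/eigenvectors of Example 7.16.
import numpy as np

A = np.array([[-2.0, 4.0, 2.0],
              [-5.0, 7.0, 2.0],
              [-7.0, 11.0, 4.0]])

evals, evecs = np.linalg.eig(A)
print(np.round(np.sort(evals.real), 6))            # expect [0, 3, 6]
print(np.allclose(A @ np.array([2, 1, 3]), 3 * np.array([2, 1, 3])))
print(np.allclose(A @ np.array([1, 1, -1]), 0))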
Aside 7.17. Define
$$\theta(u, v) = \cos^{-1}\frac{\langle u, v\rangle}{|u|\,|v|} \in [0, \pi/2]$$
for nonzero $u, v \in \mathbb{C}^n$. Then in general we have to deal with the complex $\cos^{-1}$ function on the unit disk, so we get complex angles.
Remark 7.18. Read section 5.4.
Theorem 7.19 (Cayley-Hamilton Theorem). For a linear map $L : U \to U$ of finite dimensional vector spaces, $f_L(L) = 0$.
(Aside: if $f_L(t) = a_0 + a_1 t + \dots + a_n t^n$ then by definition $f_L(L) = a_0 I + a_1 L + a_2 L^2 + \dots + a_n L^n$, so indeed $f_L(L)$ is a linear operator on $U$.)
7.2 Dual spaces and quotient spaces
Definition 7.20. Let $W$ be a vector space over a field $F$ and let $U$ be a subspace of $W$. Then we define the quotient space $W/U$ to be the vector space
$$W/U = \{x + U : x \in W\}$$
with addition given by
$$(x + U) + (y + U) = (x + y) + U$$
and the zero given by $0 + U = U$.
Example 7.21. Show that if $\mathcal{U}$ is a basis for $U$ and we extend this to a basis $\mathcal{W} = \mathcal{U} \cup \mathcal{V}$ for $W$ (where $\mathcal{U}$ and $\mathcal{V}$ are disjoint) then $\{v + U : v \in \mathcal{V}\}$ is a basis for $W/U$.
Corollary 7.22. We have
$$\dim U + \dim(W/U) = \dim W.$$
In the finite dimensional case, if $W$ is an inner product space, we have an isomorphism $U^\perp \cong W/U$, given by the map $W/U \to U^\perp$ defined by $x + U \mapsto \operatorname{Proj}_{U^\perp}(x) = x - \operatorname{Proj}_U(x)$.
Definition 7.23. The codimension of $U$ in $W$ is defined to be $\dim(W/U)$.
Recall that for vector spaces $U$ and $V$, the set $\mathcal{L}(U, V) = \{\text{linear maps from $U$ to $V$}\}$ is a vector space. This space is also sometimes denoted $\operatorname{Hom}(U, V)$ (due to the word homomorphism, since linear maps are technically vector space homomorphisms). In the finite dimensional case, if we choose bases $\mathcal{U}$ and $\mathcal{V}$ for $U$ and $V$ we obtain an isomorphism $\Phi : \mathcal{L}(U, V) \to M_{k \times \ell}$, where $k = \dim V$ and $\ell = \dim U$, given by
$$\Phi(L) = [L]_{\mathcal{U}}^{\mathcal{V}}.$$
Definition 7.24. For a vector space $U$ over a field $F$, the dual space of $U$ is
$$U^* = \operatorname{Hom}(U, F),$$
that is, the space of linear maps from the vector space to the underlying field (such maps are often called linear functionals on $U$).
Definition 7.25. Given a basis $\mathcal{U} = \{u_1, \dots, u_\ell\}$ for $U$, for each $k = 1, \dots, \ell$ define $f_k \in U^*$ by
$$f_k(u_i) = \delta_{ki}$$
where $\delta_{ki}$ is the Kronecker delta notation. Observe the effect of these maps:
$$f_k\Big(\sum_i t_i u_i\Big) = \sum_i t_i f_k(u_i) = t_k.$$
Remark 7.26. Note that $\mathcal{L} = \{f_1, \dots, f_\ell\}$ is a basis for $U^*$. Indeed, if $t_1 f_1 + \dots + t_\ell f_\ell = 0$ in $U^*$, then
$$t_1 f_1(x) + \dots + t_\ell f_\ell(x) = 0$$
for all $x \in U$, so $\sum_i t_i f_i(u_k) = 0$ for all $k$. Hence $t_k = 0$ for all $k$. Also, given $L \in U^*$ (that is, $L : U \to F$ linear), note that for $x = \sum_i t_i u_i$, we get
$$L(x) = L\Big(\sum_i t_i u_i\Big) = \sum_i t_i L(u_i) = \sum_i f_i(x)L(u_i) = \Big(\sum_i L(u_i)f_i\Big)(x)$$
and hence $L = \sum_i L(u_i)f_i$. This basis $\mathcal{L}$ is called the dual basis to the basis $\mathcal{U}$.
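A concrete way to see the dual basis (a sketch going slightly beyond the notes, NumPy assumed): for $U = \mathbb{R}^n$ with basis given by the columns of an invertible matrix $B$, the functional $f_k$ is "take the $k$-th coordinate with respect to that basis", so $f_k(x) = (B^{-1}x)_k$ and the $f_k$ are represented by the rows of $B^{-1}$.

# Sketch (assumption beyond the notes): dual basis functionals as rows of B^{-1}.
import numpy as np

B = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.0, 1.0],
              [2.0, 3.0, -1.0]])      # columns: u1, u2, u3
F = np.linalg.inv(B)                  # row k represents the functional f_k

print(np.allclose(F @ B, np.eye(3)))  # f_k(u_i) = Kronecker delta

x = 2 * B[:, 0] - 5 * B[:, 2]         # x = 2 u1 - 5 u3
print(np.round(F @ x, 6))             # coordinates of x: expect [2, 0, -5]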
Remark 7.27. Note that for $x \in U$, say $x = \sum_i t_i u_i$, we have $t_k = f_k(x)$, so
$$[x]_{\mathcal{U}} = \begin{pmatrix} f_1(x) \\ \vdots \\ f_\ell(x) \end{pmatrix},$$
and for $L \in U^*$ we have
$$[L]_{\mathcal{L}} = \begin{pmatrix} L(u_1) \\ \vdots \\ L(u_\ell) \end{pmatrix}.$$
Definition 7.28. Define $\operatorname{eval} : U \to U^{**}$ by
$$(\operatorname{eval}(x))(f) = f(x)$$
for $x \in U$, $f \in U^*$. This is called the evaluation map. This map is always a monomorphism (injective linear map), but in the finite dimensional case it is actually a vector space isomorphism.
Example 7.29. Show that in the case that $U$ is finite dimensional, the evaluation map is an isomorphism.
Definition 7.30. Let $W$ be a vector space over a field $F$ and let $U$ be a subspace of $W$. Then we define the annihilator of $U$ in $W^*$ to be
$$U^0 = \{g \in W^* : g(x) = 0 \ \forall x \in U\}.$$
Example 7.31. Show that when $W$ is finite dimensional,
$$\dim U^0 + \dim U = \dim W.$$
Definition 7.32. Let $U$, $V$ be vector spaces over a field $F$. Let $L : U \to V$ be a linear map. Then the transpose of $L$ (or the dual map) is the linear map $L^t : V^* \to U^*$ given by
$$L^t(g)(x) = g(L(x)) \qquad \forall g \in V^*,\ x \in U.$$
Hence in terms of linear maps,
$$L^t(g) = g \circ L \qquad \forall g \in V^*.$$
Theorem 7.33. Let $U$ and $V$ be finite dimensional vector spaces over $F$. Let $L : U \to V$ be linear. Let $\mathcal{U}$, $\mathcal{V}$ be bases for $U$ and $V$ respectively. Let $\mathcal{F}$, $\mathcal{G}$ be the dual bases for $U^*$ and $V^*$. Then
$$[L^t]_{\mathcal{G}}^{\mathcal{F}} = \big([L]_{\mathcal{U}}^{\mathcal{V}}\big)^t.$$
Proof. We have
$$[L^t]_{\mathcal{G}}^{\mathcal{F}} = \big([L^t(g_1)]_{\mathcal{F}} \;\cdots\; [L^t(g_\ell)]_{\mathcal{F}}\big) = \begin{pmatrix} L^t(g_1)(u_1) & \cdots & L^t(g_\ell)(u_1) \\ \vdots & & \vdots \\ L^t(g_1)(u_k) & \cdots & L^t(g_\ell)(u_k) \end{pmatrix} = \begin{pmatrix} g_1(L(u_1)) & \cdots & g_\ell(L(u_1)) \\ \vdots & & \vdots \\ g_1(L(u_k)) & \cdots & g_\ell(L(u_k)) \end{pmatrix}$$
and also
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \big([L(u_1)]_{\mathcal{V}} \;\cdots\; [L(u_k)]_{\mathcal{V}}\big) = \begin{pmatrix} g_1(L(u_1)) & \cdots & g_1(L(u_k)) \\ \vdots & & \vdots \\ g_\ell(L(u_1)) & \cdots & g_\ell(L(u_k)) \end{pmatrix}.$$
This completes the proof.
[Diagram: the map $L : V \to W$, its transpose $L^t : W^* \to V^*$, and the adjoint $L^*$ fit into a commutative square via the maps $\Phi$ defined in the next section.]
8 Adjoint of a linear operator
Definition 8.1. Let $U$ be an inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. We define a map
$$\Phi = \Phi_U : U \to U^* \quad\text{given by}\quad \Phi(u) = \langle\cdot, u\rangle,$$
by which we mean, of course, that
$$\Phi(u)(x) = \langle x, u\rangle$$
for all $u, x \in U$.
Note that when $F = \mathbb{R}$, $\Phi$ is linear in $u$; however, when $F = \mathbb{C}$ it is conjugate linear in $u$. However, in either case, each map $\Phi(u)$ ($u \in U$) is linear, and $\Phi$ is injective, since for $u \in U$, $\Phi(u) = 0$ implies that
$$\Phi(u)(x) = 0 \text{ for all } x \iff \langle x, u\rangle = 0 \text{ for all } x,$$
hence in particular $\langle u, u\rangle = 0$, implying that $u = 0$. In the case that $U$ is finite dimensional, the map $\Phi$ is also surjective. To see this, suppose $U$ is finite dimensional, and choose an orthonormal basis $\mathcal{U} = \{u_1, \dots, u_\ell\}$ for $U$. Let $L \in U^*$, so $L : U \to F$. For
$$x = \sum_i t_i u_i, \qquad L(x) = L\Big(\sum_i t_i u_i\Big) = \sum_i t_i L(u_i).$$
For $u = \sum_i s_i u_i$, we have
$$\Phi(u)(x) = \langle x, u\rangle = \Big\langle \sum_i t_i u_i, \sum_j s_j u_j\Big\rangle = \sum_{i,j} t_i\overline{s_j}\langle u_i, u_j\rangle = \sum_i t_i\overline{s_i}.$$
To get $\Phi(u) = L$, that is $\Phi(u)(x) = L(x)$ for all $x$, choose $s_i = \overline{L(u_i)}$, so that
$$u = \sum_i \overline{L(u_i)}\,u_i.$$
Definition 8.2. Let $U$ and $V$ be finite dimensional inner product spaces over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to V$ be a linear map. We define the adjoint (or conjugate transpose) of $L$ to be the linear map $L^* : V \to U$ defined by
$$L^* = \Phi_U^{-1} \circ L^t \circ \Phi_V,$$
where we note that $\Phi_U^{-1}$ and $\Phi_V$ are conjugate linear. Hence $L^*$ is linear, and we have
$$\Phi_U \circ L^* = L^t \circ \Phi_V \iff (\Phi_U L^*)(y) = (L^t\Phi_V)(y) \ \forall y \in V$$
$$\iff (\Phi_U L^*)(y)(x) = (L^t\Phi_V)(y)(x) \ \forall x \in U,\ y \in V$$
$$\iff \Phi_U(L^*(y))(x) = L^t(\Phi_V(y))(x) = \Phi_V(y)(L(x))$$
$$\iff \langle x, L^*(y)\rangle = \langle L(x), y\rangle \quad \forall x \in U,\ y \in V.$$
$L^*$ is the unique linear map from $V$ to $U$ satisfying the last line above for all $x \in U$ and $y \in V$.
Definition 8.3. More generally, for any (possibly infinite dimensional) inner product spaces $U$ and $V$ and for a linear map $L : U \to V$, if there exists a map $L^* : V \to U$ such that
$$\langle L(x), y\rangle = \langle x, L^*(y)\rangle \qquad \forall x \in U,\ y \in V,$$
then $L^*$ is indeed unique and linear (prove this as an exercise) and we call it the adjoint of $L$.
36
Remark 8.4. For A M
k
(C) that is, A : C

C
k
, we have for all x C

and y C
k
,
Ax, y) = y

Ax = y

x = (A

y)

x = x, A

y).
Theorem 8.5. Let $U$ and $V$ be finite dimensional inner product spaces over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to V$ be linear. Let $\mathcal{U}$ and $\mathcal{V}$ be orthonormal bases for $U$ and $V$ respectively. Then
$$[L^*]_{\mathcal{V}}^{\mathcal{U}} = \big([L]_{\mathcal{U}}^{\mathcal{V}}\big)^*.$$
Proof. We simply calculate the matrices:
$$[L^*]_{\mathcal{V}}^{\mathcal{U}} = \big([L^*(v_1)]_{\mathcal{U}} \;\cdots\; [L^*(v_\ell)]_{\mathcal{U}}\big) = \begin{pmatrix} \langle L^*(v_1), u_1\rangle & \cdots & \langle L^*(v_\ell), u_1\rangle \\ \vdots & \ddots & \vdots \\ \langle L^*(v_1), u_k\rangle & \cdots & \langle L^*(v_\ell), u_k\rangle \end{pmatrix},$$
and since $\langle L^*(v_j), u_i\rangle = \overline{\langle u_i, L^*(v_j)\rangle} = \overline{\langle L(u_i), v_j\rangle}$, this becomes
$$\begin{pmatrix} \overline{\langle L(u_1), v_1\rangle} & \cdots & \overline{\langle L(u_1), v_\ell\rangle} \\ \vdots & \ddots & \vdots \\ \overline{\langle L(u_k), v_1\rangle} & \cdots & \overline{\langle L(u_k), v_\ell\rangle} \end{pmatrix}.$$
However, we have
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \big([L(u_1)]_{\mathcal{V}} \;\cdots\; [L(u_k)]_{\mathcal{V}}\big) = \begin{pmatrix} \langle L(u_1), v_1\rangle & \cdots & \langle L(u_k), v_1\rangle \\ \vdots & \ddots & \vdots \\ \langle L(u_1), v_\ell\rangle & \cdots & \langle L(u_k), v_\ell\rangle \end{pmatrix},$$
which completes the proof.
8.1 Similarity and triangularizability
Remark 8.6. If $L : U \to U$ is linear and we are given bases $\mathcal{U}$, $\mathcal{V}$ for $U$, then
$$[L]_{\mathcal{V}}^{\mathcal{V}} = [I]_{\mathcal{U}}^{\mathcal{V}}\,[L]_{\mathcal{U}}^{\mathcal{U}}\,[I]_{\mathcal{V}}^{\mathcal{U}}.$$
Definition 8.7. For $A, B \in M_{n \times n}(F)$, we say $A$ and $B$ are similar when $B = P^{-1}AP$ for some invertible matrix $P \in M_{n \times n}(F)$. Note that if $\mathcal{U} = \{u_1, \dots, u_n\}$ is an orthonormal basis for an inner product space $U$, then $\Phi_{\mathcal{U}} : U \to F^n$ given by
$$\Phi_{\mathcal{U}}(x) = [x]_{\mathcal{U}}$$
is an inner product space isomorphism. As a result, we see that the following are equivalent:
1. $\mathcal{V} = \{v_1, \dots, v_n\}$ is orthonormal.
2. $\{[v_1]_{\mathcal{U}}, \dots, [v_n]_{\mathcal{U}}\}$ is orthonormal.
3. $Q := [I]_{\mathcal{V}}^{\mathcal{U}}$ satisfies $Q^*Q = I$, or in other words, $Q^* = Q^{-1}$.
4. $P := [I]_{\mathcal{U}}^{\mathcal{V}}$ satisfies $P^*P = I$.
Definition 8.8. For $P \in M_{n \times n}(F)$ where $F = \mathbb{R}$ or $\mathbb{C}$, we call $P$ an orthonormal matrix if $P^*P = I$. When $F = \mathbb{C}$, we call $P$ a unitary matrix. When $F = \mathbb{R}$, $P^* = P^t$, hence $P^*P = I$ if and only if $P^tP = I$, and $P$ is called orthogonal.
For $A, B \in M_{n \times n}(F)$ where $F = \mathbb{R}$ or $\mathbb{C}$, we say $A$ and $B$ are orthonormally similar when $B = P^*AP$ ($= P^{-1}AP$) for some orthonormal matrix $P$.
Definition 8.9. For $L : U \to U$ linear, where $U$ is a finite dimensional inner product space, we say $L$ is orthonormally diagonalizable when there exists an orthonormal basis $\mathcal{U}$ for $U$ such that $[L]_{\mathcal{U}}$ is diagonal.
For $A \in M_{n \times n}(F)$ where $F = \mathbb{R}$ or $\mathbb{C}$, we say $A$ is orthonormally diagonalizable when $A$ is orthonormally similar to a diagonal matrix, that is, when $P^*AP$ is diagonal for some orthonormal matrix $P$.
Theorem 8.10 (Schur's Theorem). Let $U$ be a finite dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to U$ be linear. Then $L$ is orthonormally triangularizable (i.e. there exists an orthonormal basis $\mathcal{U}$ for $U$ such that $[L]_{\mathcal{U}}$ is upper triangular) if and only if $f_L(t)$ splits over $F$.
We now recast the above theorem as an equivalent statement about square matrices over F = R or C, and then prove
that.
Theorem 8.11 (Schur's Theorem). For $A \in M_{n \times n}(F)$, $F = \mathbb{R}$ or $\mathbb{C}$, $A$ is orthonormally triangularizable if and only if $f_A(t)$ splits.
Proof. Suppose $A$ is orthonormally triangularizable. Choose an orthonormal matrix $P$ such that $T = P^*AP$ is upper triangular. Note that $f_A(t) = f_T(t)$, which is equal to
$$\det\begin{pmatrix} t_{11} - t & t_{12} & \cdots & t_{1n} \\ 0 & t_{22} - t & \cdots & t_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & t_{nn} - t \end{pmatrix} = \prod_{i=1}^{n}(t_{ii} - t),$$
which shows that $f_T(t)$, and hence $f_A(t)$, splits.
Conversely, suppose $f_A(t)$ splits. Choose an eigenvalue $\lambda_1$ of $A$ with a corresponding unit eigenvector $u_1$ for $\lambda_1$. Then $Au_1 = \lambda_1 u_1$. Extend $u_1$ to an orthonormal basis $\{u_1, \dots, u_n\}$ for $F^n$. Define
$$P = (u_1, \dots, u_n)$$
so that $P$ is an orthonormal matrix. Also, let $Q = (u_2, \dots, u_n)$ so that $P = (u_1, Q)$. Then
$$P^*AP = \begin{pmatrix} u_1^* \\ Q^* \end{pmatrix}A(u_1, Q) = \begin{pmatrix} u_1^* \\ Q^* \end{pmatrix}(\lambda_1 u_1, AQ) = \begin{pmatrix} \lambda_1 u_1^*u_1 & u_1^*AQ \\ \lambda_1 Q^*u_1 & Q^*AQ \end{pmatrix} = \begin{pmatrix} \lambda_1 & u_1^*AQ \\ 0 & B \end{pmatrix},$$
where the last equation is obtained by putting $B = Q^*AQ$ and noting that $Q^*u_1$ is a matrix of dot products which are all zero due to orthonormality. Also $u_1^*u_1 = 1$ as remarked above, so that $\lambda_1 u_1^*u_1 = \lambda_1$. Now note that $f_A(t) = (\lambda_1 - t)f_B(t)$, so $f_B(t)$ splits. Assume inductively that $B$ is orthonormally triangularizable. Choose $R \in M_{(n-1)\times(n-1)}(F)$ with $R^*R = I$ so that $R^*BR$ is upper triangular. Then we have
$$\left(P\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}\right)^* A\left(P\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}\right) = \begin{pmatrix} 1 & 0 \\ 0 & R^* \end{pmatrix}\begin{pmatrix} \lambda_1 & u_1^*AQ \\ 0 & B \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix} = \begin{pmatrix} \lambda_1 & u_1^*AQR \\ 0 & R^*BR \end{pmatrix}$$
which is upper triangular. Note that
$$\left(P\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}\right)^*\left(P\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}\right) = \begin{pmatrix} 1 & 0 \\ 0 & R^* \end{pmatrix}P^*P\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & R^*R \end{pmatrix} = I.$$
The proof is complete.
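The matrix version of Schur's theorem is easy to check numerically. The following is a minimal sketch (not from the notes, and it assumes SciPy is available); over $\mathbb{C}$ every characteristic polynomial splits, so requesting the complex Schur form always yields an upper triangular factor.

# Sketch (assumption: SciPy is available): verify A = Z T Z* with Z unitary and
# T upper triangular, i.e. Schur's theorem over C.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

T, Z = schur(A, output='complex')                  # A = Z @ T @ Z.conj().T
print(np.allclose(Z @ T @ Z.conj().T, A))
print(np.allclose(Z.conj().T @ Z, np.eye(4)))      # Z is unitary
print(np.allclose(np.tril(T, -1), 0))              # T is upper triangular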
Definition 8.12. Let $U$ be a finite-dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to U$ be linear. Then $L$ is called normal when $L$ commutes with $L^*$ (that is, $L^*L = LL^*$). Similarly, for $A \in M_{n \times n}(F)$, with $F = \mathbb{R}$ or $\mathbb{C}$, $A$ is called normal when $A^*A = AA^*$.
Definition 8.13. The spectrum of a linear map $L : U \to U$ is the set of eigenvalues of $L$.
Theorem 8.14 (Orthonormal Diagonalization of Normal Matrices). Let $U$ be a finite-dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to U$ be linear. Then $L$ is orthonormally diagonalizable if and only if $L$ is normal and $f_L(t)$ splits.
Proof. Suppose $L$ is orthonormally diagonalizable. Then choose an orthonormal basis $\mathcal{U}$ so that $[L]_{\mathcal{U}} = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$. Then $L^*$ commutes with $L$, since $[L^*]_{\mathcal{U}} = D^* = \operatorname{diag}(\overline{\lambda_1}, \dots, \overline{\lambda_n})$, which commutes with $D = [L]_{\mathcal{U}}$, and $f_L(t)$ splits since
$$f_L(t) = f_D(t) = \prod_{i=1}^{n}(\lambda_i - t).$$
Conversely, suppose that $L^*L = LL^*$ and $f_L(t)$ splits. Since $f_L(t)$ splits, $L$ is orthonormally upper triangularizable by Schur's theorem. Choose an orthonormal basis $\mathcal{U}$ for $U$ so that $T = [L]_{\mathcal{U}}$ is upper triangular. Since $L^*L = LL^*$, we have $T^*T = TT^*$. We shall show that this implies $T$ is diagonal. Suppose
$$T = \begin{pmatrix} T_{11} & T_{12} & T_{13} & \cdots \\ 0 & T_{22} & T_{23} & \cdots \\ & & \ddots & \end{pmatrix} \quad\text{so that}\quad T^* = \begin{pmatrix} \overline{T_{11}} & 0 & \cdots \\ \overline{T_{12}} & \overline{T_{22}} & \cdots \\ \overline{T_{13}} & \overline{T_{23}} & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}.$$
Note that since $T^*T = TT^*$, we clearly have $(TT^*)_{11} = (T^*T)_{11}$ and so
$$|T_{11}|^2 + |T_{12}|^2 + |T_{13}|^2 + \dots = |T_{11}|^2,$$
and thus $|T_{12}|^2 = |T_{13}|^2 = \dots = 0$. Now $(TT^*)_{22} = (T^*T)_{22}$ and hence
$$|T_{22}|^2 + |T_{23}|^2 + |T_{24}|^2 + \dots = |T_{12}|^2 + |T_{22}|^2 = |T_{22}|^2 \quad\text{(by the previous equation)},$$
whereby $T_{23} = T_{24} = \dots = 0$, and so on. This completes the proof.
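The key step of the proof (normality forces the triangular Schur factor to be diagonal) can be observed numerically. A minimal sketch (not from the notes, SciPy assumed) follows.

# Sketch (assumption: SciPy is available): for a normal matrix, the complex Schur
# factor T is diagonal, in line with the argument above.
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0, 0.0],      # a real normal (block skew-symmetric) matrix
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 2.0]])
print(np.allclose(A @ A.T, A.T @ A))              # A is normal

T, Z = schur(A, output='complex')
print(np.round(np.diag(T), 6))                    # eigenvalues: +-i and 2
print(np.allclose(T, np.diag(np.diag(T))))        # T is diagonal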
Definition 8.15. For $0 \ne u \in \mathbb{R}^3$ and $\theta \in \mathbb{R}$, extend $\frac{u}{|u|}$ to an orthonormal basis $\mathcal{U} = \{u_1, u_2, u_3\}$. Then $R_{u,\theta} : \mathbb{R}^3 \to \mathbb{R}^3$ is the map with
$$[R_{u,\theta}]_{\mathcal{U}} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}.$$
Remark 8.16. Given $x_0$, $x_1$, $x_2$, and the recurrence
$$x_{n+3} = 6x_n + 5x_{n+1} - 2x_{n+2},$$
we have $f(x) = 6 + 5x - 2x^2 - x^3$ and we solve for the roots $\alpha$, $\beta$, $\gamma$ (here $f$ factors as $-(x - 2)(x + 1)(x + 3)$, so the roots are $2, -1, -3$). The solution is of the form
$$x_n = A\alpha^n + B\beta^n + C\gamma^n.$$
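A minimal computational sketch of this technique (not from the notes, NumPy assumed; the initial values below are made up for illustration): find the characteristic roots, then fit the coefficients $A$, $B$, $C$ to the initial conditions.

# Sketch (not from the notes): solve x_{n+3} = 6 x_n + 5 x_{n+1} - 2 x_{n+2}
# via characteristic roots and a linear solve for the coefficients.
import numpy as np

roots = np.roots([1, 2, -5, -6])             # t^3 + 2t^2 - 5t - 6 = 0  ->  2, -1, -3
x0, x1, x2 = 1.0, 0.0, 0.0                   # sample initial values (hypothetical)

V = np.vander(roots, 3, increasing=True).T   # rows: roots^0, roots^1, roots^2
coeffs = np.linalg.solve(V, [x0, x1, x2])    # A, B, C

def x(n):
    return (coeffs * roots**n).real.sum()

seq = [x0, x1, x2]                           # compare against direct iteration
for _ in range(7):
    seq.append(6 * seq[-3] + 5 * seq[-2] - 2 * seq[-1])
print(np.allclose([x(n) for n in range(10)], seq))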
8.2 Self-adjoint operators
Definition 8.17. Let $U$ be a finite-dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. A linear operator $L$ on $U$ is called Hermitian or self-adjoint when $L^* = L$. For $A \in M_{n \times n}(F)$ with $F = \mathbb{R}$ or $\mathbb{C}$ we say that $A$ is Hermitian or self-adjoint when $A^* = A$, and we say $A$ is symmetric when $A^t = A$.
Theorem 8.18 (Spectral Theorem for Hermitian Maps). Let $U$ be a finite dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L$ be a linear operator on $U$. Then $L$ is orthonormally diagonalizable and every eigenvalue of $L$ is real if and only if $L$ is Hermitian.
Proof. Suppose $L$ is orthonormally diagonalizable and every eigenvalue of $L$ is real. Choose an orthonormal basis $\mathcal{U}$ for $U$ so that
$$[L]_{\mathcal{U}} = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$$
where each $\lambda_i \in \mathbb{R}$. We have $D^* = \operatorname{diag}(\overline{\lambda_1}, \dots, \overline{\lambda_n})$, and so $[L^*]_{\mathcal{U}} = D^* = D = [L]_{\mathcal{U}}$. Therefore $L^* = L$.
Proof of other direction: Suppose $L^* = L$. We claim every eigenvalue of $L$ is real. To see this, proceed as follows. Choose an orthonormal basis $\mathcal{U}$ for $U$ and let $A = [L]_{\mathcal{U}}$. Note that $A^* = A$, so $A$ commutes with $A^*$. By the previous theorem, $A$ is orthonormally diagonalizable over $\mathbb{C}$. Choose a change of basis matrix $P \in M_{n \times n}(\mathbb{C})$ with $P^*P = I$, such that
$$P^*AP = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$$
with each $\lambda_i \in \mathbb{C}$. We have
$$D^* = (P^*AP)^* = P^*A^*P = P^*AP = D.$$
Since $D^* = \operatorname{diag}(\overline{\lambda_1}, \dots, \overline{\lambda_n}) = \operatorname{diag}(\lambda_1, \dots, \lambda_n) = D$, we have $\overline{\lambda_i} = \lambda_i$ for all $i$, so each $\lambda_i$ is real.
Alternate proof: Suppose $L^* = L$. Let $\lambda$ be any eigenvalue of $L$ and let $u$ be an eigenvector for $\lambda$. Then
$$\lambda\langle u, u\rangle = \langle \lambda u, u\rangle = \langle L(u), u\rangle = \langle u, L^*(u)\rangle = \langle u, L(u)\rangle = \langle u, \lambda u\rangle = \overline{\lambda}\langle u, u\rangle,$$
therefore $\lambda = \overline{\lambda}$ since $\langle u, u\rangle \ne 0$, therefore $\lambda \in \mathbb{R}$.
Since the eigenvalues are all real, $f_L$ splits (even when $F = \mathbb{R}$). Since $L^* = L$, $L$ commutes with $L^*$. Therefore $L$ is normal. Therefore $L$ is orthonormally diagonalizable.
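In practice the spectral theorem for Hermitian matrices is packaged as a single library call. A minimal sketch (not from the notes, NumPy assumed):

# Sketch (not from the notes): eigh returns real eigenvalues and an orthonormal
# eigenbasis for a Hermitian matrix, realizing Theorem 8.18 numerically.
import numpy as np

A = np.array([[2.0, 1.0 + 1j, 0.0],
              [1.0 - 1j, 3.0, -2j],
              [0.0, 2j, 1.0]])
print(np.allclose(A, A.conj().T))                 # A is Hermitian

evals, P = np.linalg.eigh(A)
print(evals)                                      # real eigenvalues
print(np.allclose(P.conj().T @ P, np.eye(3)))     # columns are orthonormal
print(np.allclose(P @ np.diag(evals) @ P.conj().T, A))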
Remark 8.19. Recall that for a linear operator $L$ on a finite dimensional inner product space $U$, the following are equivalent:
1. $L$ is an isometry.
2. $L$ preserves norm.
3. $L$ is an isomorphism of inner product spaces.
4. Given an orthonormal basis $\mathcal{U} = \{u_1, \dots, u_n\}$ for $U$, the set $\{L(u_1), \dots, L(u_n)\}$ is an orthonormal basis.
5. The columns of $A = [L]_{\mathcal{U}}$ are an orthonormal basis.
6. $A^*A = I$.
Definition 8.20. Let $U$ be a (finite dimensional) inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L$ be a linear operator on $U$. Then $L$ is called unitary, or orthonormal, if it satisfies any of the (equivalent) conditions above.
For $A \in M_{n \times n}(F)$ where $F$ is $\mathbb{R}$ or $\mathbb{C}$, $A$ is unitary if $A^*A = I$, and we say $A$ is orthogonal when $A^tA = I$.
Remark 8.21. Here is some notation not used in the course. For any field $F$:
The general linear group, $GL(n, F) = \{A \in M_{n \times n}(F) : \det A \ne 0\}$.
The special linear group, $SL(n, F) = \{A \in GL(n, F) : \det A = 1\}$.
The orthogonal group, $O(n, F) = \{A \in GL(n, F) : A^tA = I\}$.
The special orthogonal group, $SO(n, F) = \{A \in O(n, F) : \det A = 1\}$.
When $F = \mathbb{C}$, we also have the unitary group, $U(n) = \{A \in GL(n, \mathbb{C}) : A^*A = I\}$.
And also the special unitary group, $SU(n) = \{A \in U(n) : \det A = 1\}$.
Theorem 8.22 (Spectral Theorem for Unitary Maps). Let $U$ be a finite dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L$ be a linear operator on $U$. Then $L$ is orthonormally diagonalizable and every eigenvalue of $L$ has unit norm if and only if $L$ is unitary and $f_L$ splits.
Proof. This is an immediate corollary of the Spectral Theorem for Normal Operators, since if $L^*L = I$ then $L^{-1} = L^*$, and thus $L$ commutes with $L^*$. Also, for $D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$, we have
$$D^*D = \operatorname{diag}(\overline{\lambda_1}, \dots, \overline{\lambda_n})\operatorname{diag}(\lambda_1, \dots, \lambda_n) = \operatorname{diag}(|\lambda_1|^2, \dots, |\lambda_n|^2).$$
So $D^*D = I$, which is equivalent to each $|\lambda_i| = 1$.
Remark 8.23. Suppose $U$ is a finite dimensional inner product space, and $V \subseteq U$ a subspace. Say $\{u_1, \dots, u_k\}$ is an orthonormal basis for $V$ and we extend it to an orthonormal basis $\mathcal{U} = \{u_1, \dots, u_n\}$ for the bigger space. Then
$$[\operatorname{Proj}_V]_{\mathcal{U}} = \begin{pmatrix} I_k & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and also}\quad [\operatorname{Refl}_V]_{\mathcal{U}} = \begin{pmatrix} I_k & 0 \\ 0 & -I_{n-k} \end{pmatrix}.$$
Orthogonal scaling map:
$$[\operatorname{Scale}_{\lambda, V}]_{\mathcal{U}} = \begin{pmatrix} \lambda I_k & 0 \\ 0 & I_{n-k} \end{pmatrix}.$$
For $L : U \to U$, if $\mathcal{U} = \{u_1, \dots, u_n\}$ is an orthonormal basis for $U$ such that $[L]_{\mathcal{U}} = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$, then
$$L = \sum_{i=1}^{n}\lambda_i\operatorname{Proj}_{u_i} = \sum_{\text{distinct eigenvalues }\lambda}\lambda\operatorname{Proj}_{E_\lambda} = \operatorname{Scale}_{\lambda_1, u_1}\circ\cdots\circ\operatorname{Scale}_{\lambda_n, u_n} = \bigcirc_{\text{distinct eigenvalues }\lambda}\operatorname{Scale}_{\lambda, E_\lambda}.$$
Also, for $L : U \to U$ where $U$ is a finite-dimensional inner product space:
$L$ is an orthogonal reflection (that is, $L = \operatorname{Refl}_V$ for some subspace $V \subseteq U$) if and only if $L = L^*$ and $L^*L = I$, if and only if $L = L^*$ and $L^2 = I$. Furthermore, $L$ is an orthogonal projection (that is, $L = \operatorname{Proj}_V$ for some subspace $V \subseteq U$) if and only if $L = L^*$ and $L^2 = L$.
8.3 Singular value decomposition
Theorem 8.24 (Singular Value Theorem). Let $U$ and $V$ be finite dimensional inner product spaces over $F = \mathbb{R}$ or $\mathbb{C}$. Let $L : U \to V$ be linear. Then there exist orthonormal bases $\mathcal{U}$ and $\mathcal{V}$ for $U$ and $V$ such that $[L]_{\mathcal{U}}^{\mathcal{V}}$ is in the form
$$\begin{pmatrix} \operatorname{diag}(\sigma_1, \dots, \sigma_r) & 0 \\ 0 & 0 \end{pmatrix}$$
where the $\sigma_i$ are real ($r = \operatorname{rank} L$) with $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$. The values $\sigma_i$ are uniquely determined by $L$.
Proof. (uniqueness) Suppose that $\mathcal{U} = \{u_1, \dots, u_k\}$ and $\mathcal{V} = \{v_1, \dots, v_\ell\}$ are orthonormal bases for $U$ and $V$ such that
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \begin{pmatrix} \operatorname{diag}(\sigma_1, \dots, \sigma_r) & 0 \\ 0 & 0 \end{pmatrix}$$
with $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$. Then $L(u_i) = \sigma_i v_i$ for $i \le r$ and $L(u_i) = 0$ for $i > r$. Also,
$$[L^*]_{\mathcal{V}}^{\mathcal{U}} = \begin{pmatrix} \operatorname{diag}(\sigma_1, \dots, \sigma_r) & 0 \\ 0 & 0 \end{pmatrix},$$
so $L^*(v_i) = \sigma_i u_i$ for $i \le r$ and also $L^*(v_i) = 0$ for $i > r$; therefore $L^*L(u_i) = L^*(\sigma_i v_i) = \sigma_i L^*(v_i) = \sigma_i^2 u_i$, so each $\sigma_i^2$ is an eigenvalue of $L^*L$ and each $u_i$ is an eigenvector for $\sigma_i^2$. This completes the proof of uniqueness.
(existence) Note that $\operatorname{rank}(L) = \operatorname{rank}(L^*) = \operatorname{rank}(L^*L)$. Indeed, $\operatorname{null}(L) = \operatorname{null}(L^*L)$ by your homework. Note that the eigenvalues of $L^*L$ are all non-negative, since if $L^*L(u) = \lambda u$ where $0 \ne u \in U$ then
$$|L(u)|^2 = \langle L(u), L(u)\rangle = \langle u, L^*L(u)\rangle = \langle u, \lambda u\rangle = \lambda\langle u, u\rangle = \lambda|u|^2,$$
therefore
$$\lambda = \frac{|L(u)|^2}{|u|^2} \ge 0.$$
Also note that $L^*L$ is self-adjoint, since $(L^*L)^* = L^*L^{**} = L^*L$. Thus we can orthonormally diagonalize $L^*L$, so choose an orthonormal basis $\mathcal{U}$ for $U$ so that
$$[L^*L]_{\mathcal{U}} = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$$
with $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > 0$ and $\lambda_i = 0$ for $i > r$. For each $i \le r$, let
$$\sigma_i = \sqrt{\lambda_i} \quad\text{and}\quad v_i = \frac{L(u_i)}{\sigma_i}.$$
Note that $\{v_1, \dots, v_r\}$ is orthonormal since
$$\langle v_i, v_j\rangle = \Big\langle \frac{L(u_i)}{\sigma_i}, \frac{L(u_j)}{\sigma_j}\Big\rangle = \frac{1}{\sigma_i\sigma_j}\langle L(u_i), L(u_j)\rangle = \frac{1}{\sigma_i\sigma_j}\langle u_i, L^*L(u_j)\rangle = \frac{1}{\sigma_i\sigma_j}\langle u_i, \sigma_j^2 u_j\rangle = \frac{\sigma_j}{\sigma_i}\delta_{ij} = \delta_{ij}.$$
Extend $\{v_1, \dots, v_r\}$ to an orthonormal basis $\mathcal{V} = \{v_1, \dots, v_r, \dots, v_\ell\}$ for $V$. Then
$$[L]_{\mathcal{U}}^{\mathcal{V}} = \begin{pmatrix} \operatorname{diag}(\sigma_1, \dots, \sigma_r) & 0 \\ 0 & 0 \end{pmatrix}$$
since $L(u_i) = \sigma_i v_i$ for $i \le r$ and $L(u_i) = 0$ for $i > r$.
Remark 8.25. When we choose $\mathcal{U}$ and $\mathcal{V}$ to put $[L]_{\mathcal{U}}^{\mathcal{V}}$ in the above form, the above proof shows the following:
The $\sigma_i$ are the positive square roots of the eigenvalues $\lambda_i$ (which are non-negative) of $L^*L$.
The vectors $u_i$ are eigenvectors of $L^*L$.
For $i = 1, \dots, r$,
$$v_i = \frac{L(u_i)}{\sigma_i} \quad\text{and}\quad u_i = \frac{L^*(v_i)}{\sigma_i}.$$
$\{u_{r+1}, \dots, u_k\}$ is an orthonormal basis for $\operatorname{null} L$.
$\{u_1, \dots, u_r\}$ is an orthonormal basis for $(\operatorname{null} L)^\perp$.
$\{v_1, \dots, v_r\}$ is an orthonormal basis for $\operatorname{range}(L)$.
$\{v_{r+1}, \dots, v_\ell\}$ is an orthonormal basis for $\operatorname{range}(L)^\perp = \operatorname{null} L^*$.
For $A \in M_{\ell \times k}(F)$, $F = \mathbb{R}$ or $\mathbb{C}$ (viewing $A$ as the map $x \mapsto Ax$ from $F^k$ to $F^\ell$), we can find unitary (or orthogonal when $F = \mathbb{R}$) matrices $P$ and $Q$, where $P = (u_1, \dots, u_k)$ and $Q = (v_1, \dots, v_\ell)$ (for $u_i$, $v_i$ as above), so that $A = Q\Sigma P^*$ where
$$\Sigma = \begin{pmatrix} \operatorname{diag}(\sigma_1, \dots, \sigma_r) & 0 \\ 0 & 0 \end{pmatrix}$$
with $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$. Such a factorization $A = Q\Sigma P^*$ is called a singular value decomposition of $A$, and the numbers
$$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$$
are called the singular values of $A$, or the singular values of $L$ for $L \in \operatorname{Hom}(U, V)$.
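NumPy computes exactly this factorization. A minimal sketch (not from the notes, NumPy assumed), also checking that the $\sigma_i^2$ are the eigenvalues of $A^*A$:

# Sketch (not from the notes): A = Q Sigma P* via numpy's svd.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

Q, sigma, Pstar = np.linalg.svd(A)          # singular values in decreasing order
Sigma = np.zeros((3, 5))
Sigma[:3, :3] = np.diag(sigma)

print(np.allclose(Q @ Sigma @ Pstar, A))
print(np.allclose(sigma**2,                 # sigma_i^2 are eigenvalues of A^t A
                  np.sort(np.linalg.eigvalsh(A.T @ A))[::-1][:3]))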
9 Bilinear and quadratic forms
9.1 Bilinear forms
Definition 9.1. Let $U$, $V$, $W$ be vector spaces over any field $F$. Then a map $F : U \times V \to W$ is called bilinear when the following are satisfied:
$$F(u + v, w) = F(u, w) + F(v, w), \quad F(u, v + w) = F(u, v) + F(u, w), \quad F(tu, v) = tF(u, v), \quad F(u, tv) = tF(u, v)$$
for all $u, v, w$ in the appropriate spaces (notational note: the small letters do not correspond to the large ones).
The set of bilinear maps $F : U \times V \to W$ is a vector space, which we denote by $\operatorname{Bilin}(U \times V, W)$.
Theorem 9.2. Let $U$ and $V$ be finite-dimensional vector spaces over $F$. Then for a bilinear map $F : U \times V \to F$, if we choose bases $\mathcal{U}$ and $\mathcal{V}$ then there is a unique matrix $[F]_{\mathcal{U}}^{\mathcal{V}}$ such that for all $u \in U$ and $v \in V$ we have
$$F(u, v) = [v]_{\mathcal{V}}^t\,[F]_{\mathcal{U}}^{\mathcal{V}}\,[u]_{\mathcal{U}}.$$
Moreover the map $\Phi_{\mathcal{U},\mathcal{V}} : \operatorname{Bilin}(U \times V, F) \to M_{\ell \times k}(F)$, where $k = \dim U$ and $\ell = \dim V$, given by
$$\Phi_{\mathcal{U},\mathcal{V}}(F) = [F]_{\mathcal{U}}^{\mathcal{V}},$$
is an isomorphism.
Proof. (uniqueness) Suppose
$$F(u, v) = y^tAx$$
where $y = [v]_{\mathcal{V}}$, $x = [u]_{\mathcal{U}}$, for all $u \in U$, $v \in V$. Say $\mathcal{U} = \{u_1, \dots, u_k\}$ and $\mathcal{V} = \{v_1, \dots, v_\ell\}$. Then
$$F(u_i, v_j) = e_j^tAe_i = A_{ji}.$$
(existence) Given $F \in \operatorname{Bilin}(U \times V, F)$, let $A$ be the matrix given by
$$A_{ji} = F(u_i, v_j).$$
For $u = \sum x_iu_i$, $v = \sum y_jv_j$, so that $[u]_{\mathcal{U}} = x$, $[v]_{\mathcal{V}} = y$, we have
$$F(u, v) = F\Big(\sum x_iu_i, \sum y_jv_j\Big) = \sum_{i,j}x_iy_jF(u_i, v_j) = \sum_{i,j}x_iy_jA_{ji} = y^tAx.$$
Verify that this map is indeed linear and bijective (a vector space isomorphism).
Remark 9.3. $U \otimes V = \operatorname{Bilin}(U^* \times V^*, F)$ is called the tensor product of $U$ and $V$.
Definition 9.4. Let $U$ be a vector space over a field $F$. A bilinear form on $U$ is a bilinear map $F : U \times U \to F$. A bilinear form $F : U \times U \to F$ is symmetric when
$$F(u, v) = F(v, u)$$
for all $u, v \in U$. It is skew-symmetric when
$$F(u, v) = -F(v, u).$$
When $U$ is finite-dimensional and $\mathcal{U}$ is a basis for $U$ we write $[F]_{\mathcal{U}} = [F]_{\mathcal{U}}^{\mathcal{U}}$.
As an exercise, verify that symmetry (resp. skew-symmetry) of bilinear forms is equivalent to symmetry (resp. skew-symmetry) of their matrices.
Example 9.5. Let $U$ be a finite dimensional vector space over a field $F$. Let $F : U \times U \to F$ be a bilinear form on $U$. For bases $\mathcal{U}$ and $\mathcal{V}$ for $U$, determine how $[F]_{\mathcal{U}}$ and $[F]_{\mathcal{V}}$ are related.
Solution. For $x, y \in U$,
$$F(x, y) = [y]_{\mathcal{V}}^t\,[F]_{\mathcal{V}}\,[x]_{\mathcal{V}} = \big([I]_{\mathcal{U}}^{\mathcal{V}}[y]_{\mathcal{U}}\big)^t\,[F]_{\mathcal{V}}\,\big([I]_{\mathcal{U}}^{\mathcal{V}}[x]_{\mathcal{U}}\big) = [y]_{\mathcal{U}}^t\,\big([I]_{\mathcal{U}}^{\mathcal{V}}\big)^t\,[F]_{\mathcal{V}}\,[I]_{\mathcal{U}}^{\mathcal{V}}\,[x]_{\mathcal{U}},$$
therefore
$$[F]_{\mathcal{U}} = \big([I]_{\mathcal{U}}^{\mathcal{V}}\big)^t\,[F]_{\mathcal{V}}\,[I]_{\mathcal{U}}^{\mathcal{V}}$$
and we are done.
Definition 9.6. For $A, B \in M_{n \times n}(F)$, when $A = P^tBP$ for some invertible matrix $P$, we say $A$ and $B$ are congruent.
Remark 9.7. When $A$ and $B$ are congruent, they have the same rank, so we can define the rank of a bilinear form $F : U \times U \to F$ to be the rank of $[F]_{\mathcal{U}}$ (this is independent of the choice of basis $\mathcal{U}$). On the other hand, in general,
$$\det A \ne \det B$$
for congruent $A$, $B$. The spectra are not equal either, nor are the characteristic polynomials.
Example 9.8. Given a bilinear form $F : U \times U \to F$ on $U$, when can we diagonalize $F$, that is, when can we find a basis $\mathcal{U}$ for $U$ so that $[F]_{\mathcal{U}}$ is diagonal?
Theorem 9.9 (Diagonalization of Bilinear Forms). Let $F$ be a field of characteristic not equal to 2. Let $U$ be a finite dimensional vector space over $F$. Let $F : U \times U \to F$ be a bilinear form on $U$. Then there exists a basis $\mathcal{U}$ for $U$ such that $[F]_{\mathcal{U}}$ is diagonal if and only if $F$ is symmetric.
Proof. Note that if $[F]_{\mathcal{U}} = D$ is diagonal, then $D$ is symmetric, so $F$ is symmetric.
Suppose $F$ is symmetric. Choose a basis $\mathcal{V}$ for $U$ and let $A = [F]_{\mathcal{V}}$. Note $A$ is symmetric, since $F$ is symmetric. We need to find an invertible matrix $P$ so that $P^tAP$ is diagonal. We shall put $A$ into diagonal form using a sequence of column operations and corresponding row operations, so that the elementary matrices for the row operations are the transposes of the elementary matrices for the column operations. Here is an algorithm to do this (a computational sketch follows the proof).
1. If $A_{11} \ne 0$, we use the operations
$$C_i \mapsto C_i - \frac{A_{1i}}{A_{11}}C_1, \qquad R_i \mapsto R_i - \frac{A_{i1}}{A_{11}}R_1$$
(notice that the corresponding elementary matrices are transposes of each other) to eliminate the entries in the first row and column.
If $A_{11} = 0$, then if $A_{ii} \ne 0$ for some $i \ge 2$, use $R_1 \leftrightarrow R_i$, $C_1 \leftrightarrow C_i$ to move $A_{ii}$ into the $(1, 1)$ position. If $A_{ii} = 0$ for all $i \ge 2$, and some $A_{1i} \ne 0$ for $i \ge 2$, then do
$$C_1 \mapsto C_1 + C_i, \qquad R_1 \mapsto R_1 + R_i$$
to convert the $(1,1)$ entry to $2A_{1i} \ne 0$, then eliminate the other entries in the first row and column as above.
2. Note that since $A$ is symmetric, so is $E^tAE$, since $(E^tAE)^t = E^tA^tE^{tt} = E^tAE$. Repeat the above procedure on the lower-right $(n-1) \times (n-1)$ submatrix.
This completes the proof.
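The following is a minimal sketch (not from the notes, NumPy assumed, working over the reals with a numerical tolerance) of the symmetric row/column algorithm just described; it returns an invertible $P$ with $P^tAP$ diagonal.

# Sketch (not from the notes): congruence-diagonalize a real symmetric matrix by
# paired row/column operations, tracking the column operations in P.
import numpy as np

def congruent_diagonalize(A, tol=1e-12):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    P = np.eye(n)
    for p in range(n):
        # Step 1: arrange a nonzero pivot in position (p, p) if possible.
        if abs(A[p, p]) < tol:
            for i in range(p + 1, n):
                if abs(A[i, i]) > tol:              # swap rows/columns p and i
                    A[[p, i], :] = A[[i, p], :]
                    A[:, [p, i]] = A[:, [i, p]]
                    P[:, [p, i]] = P[:, [i, p]]
                    break
            else:
                for i in range(p + 1, n):
                    if abs(A[p, i]) > tol:          # add row/column i into row/column p
                        A[p, :] += A[i, :]
                        A[:, p] += A[:, i]
                        P[:, p] += P[:, i]
                        break
        if abs(A[p, p]) < tol:
            continue                                # the remaining row/column is zero
        # Step 2: clear the rest of row p and column p using the pivot.
        for i in range(p + 1, n):
            c = A[p, i] / A[p, p]
            A[i, :] -= c * A[p, :]
            A[:, i] -= c * A[:, p]
            P[:, i] -= c * P[:, p]
    return A, P

B = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])
D, P = congruent_diagonalize(B)
print(np.allclose(P.T @ B @ P, D))     # P^t B P equals the diagonal matrix D
print(np.round(np.diag(D), 6))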
Corollary 9.10. When $F = \mathbb{C}$, given a symmetric bilinear form $F$ on $U$, we can choose a basis $\mathcal{U}$ so that $[F]_{\mathcal{U}}$ is in the form
$$[F]_{\mathcal{U}} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$$
where $r = \operatorname{rank} F$.
Proof. Choose $\mathcal{V}$ so that
$$[F]_{\mathcal{V}} = \begin{pmatrix} \operatorname{diag}(z_1, \dots, z_r) & 0 \\ 0 & 0 \end{pmatrix} = D$$
with each $z_i \ne 0$. Choose $w_i$ with $w_i^2 = z_i$. Let
$$E = \begin{pmatrix} \operatorname{diag}(1/w_1, \dots, 1/w_r) & 0 \\ 0 & I_{n-r} \end{pmatrix}, \quad\text{so that}\quad E^tDE = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}.$$
We are done.
Corollary 9.11. When $F = \mathbb{R}$ we can choose $\mathcal{U}$ so that
$$[F]_{\mathcal{U}} = \begin{pmatrix} I_k & & \\ & -I_{r-k} & \\ & & 0_{n-r} \end{pmatrix}.$$
Proof. Exercise. (Take $w_i = \sqrt{|z_i|}$ above.)
Theorem 9.12 (Sylvester's Law of Inertia). Let $U$ be an $n$-dimensional vector space over $\mathbb{R}$. Let $F : U \times U \to \mathbb{R}$ be a symmetric bilinear form on $U$. Let $\mathcal{U}$ and $\mathcal{V}$ be bases for $U$ such that
$$[F]_{\mathcal{U}} = \begin{pmatrix} I_k & & \\ & -I_{r-k} & \\ & & 0_{n-r} \end{pmatrix} \quad\text{and}\quad [F]_{\mathcal{V}} = \begin{pmatrix} I_\ell & & \\ & -I_{r-\ell} & \\ & & 0_{n-r} \end{pmatrix};$$
then $k = \ell$. The number $k$ is called the index of $F$.
Proof. Suppose $k \ne \ell$, say $\ell > k$. Say $\mathcal{U} = \{u_1, \dots, u_n\}$ and $\mathcal{V} = \{v_1, \dots, v_n\}$. For $w \in U$, say $w = \sum x_iu_i$, so $[w]_{\mathcal{U}} = x$ where $x = (x_1, \dots, x_n)^t$, we have
$$F(w, u_j) = F\Big(\sum x_iu_i, u_j\Big) = \sum x_iF(u_i, u_j) = \begin{cases} +x_j & \text{if } 1 \le j \le k, \\ -x_j & \text{if } k+1 \le j \le r, \\ 0 & \text{if } r+1 \le j \le n, \end{cases}$$
and
$$F(w, w) = F\Big(\sum_i x_iu_i, \sum_j x_ju_j\Big) = \sum_{i,j}x_ix_jF(u_i, u_j) = \sum_{i=1}^{k}x_i^2 - \sum_{j=k+1}^{r}x_j^2.$$
We claim that we can choose $w \in U$ so that
$$F(w, u_i) = 0 \text{ for } 1 \le i \le k, \qquad F(w, v_j) = 0 \text{ for } \ell+1 \le j \le r, \qquad F(w, u_i) \ne 0 \text{ for some } k+1 \le i \le r.$$
If we can do this, we will be done (why?). We now prove the above claim. Define $L : U \to \mathbb{R}^{k + r - \ell}$ by
$$L(w) = \big(F(w, u_1), \dots, F(w, u_k), F(w, v_{\ell+1}), \dots, F(w, v_r)\big)^t.$$
Note that $\operatorname{rank}(L) \le k + r - \ell$, and $\operatorname{nullity}(L) \ge n - (k + r - \ell) = (n - r) + (\ell - k) > n - r$. Therefore,
$$\operatorname{span}\{u_{r+1}, \dots, u_n\} \ne \operatorname{null}(L),$$
so we can choose $w \in \operatorname{null}(L)$ with $w \notin \operatorname{span}\{u_{r+1}, \dots, u_n\}$. If we write
$$w = \sum_{i=1}^{n}x_iu_i,$$
then because $F(w, u_i) = 0$ for $1 \le i \le k$, it follows that $x_i = 0$ ($1 \le i \le k$), and because $w \notin \operatorname{span}\{u_{r+1}, \dots, u_n\}$, it follows that $x_i \ne 0$ for some $k+1 \le i \le r$, hence $F(w, u_i) = -x_i \ne 0$ for that $i$. This proves the claim. Choose such a vector $w \in U$. Then writing $w = \sum x_iu_i = \sum y_jv_j$,
$$F(w, w) = \sum_{i=1}^{k}x_i^2 - \sum_{i=k+1}^{r}x_i^2 = -\sum_{i=k+1}^{r}x_i^2$$
since $x_i = F(w, u_i) = 0$ for $1 \le i \le k$. But the above is strictly negative, since $x_i \ne 0$ for some $k+1 \le i \le r$. On the other hand,
$$F(w, w) = \sum_{i=1}^{\ell}y_i^2 - \sum_{i=\ell+1}^{r}y_i^2 = \sum_{i=1}^{\ell}y_i^2$$
since $y_i = -F(w, v_i) = 0$ for $\ell+1 \le i \le r$. But the above is nonnegative, giving the desired contradiction. This completes the proof.
Definition 9.13. Let $U$ be a vector space over $\mathbb{R}$. Let $F : U \times U \to \mathbb{R}$ be a symmetric bilinear form. Then we make the following definitions:
1. $F$ is called positive definite when $F(u, u) \ge 0$ for all $u \in U$, with equality if and only if $u = 0$.
2. $F$ is called positive semidefinite when $F(u, u) \ge 0$ for all $u \in U$.
3. $F$ is called negative definite when $F(u, u) \le 0$ for all $u \in U$, with equality if and only if $u = 0$.
4. $F$ is called negative semidefinite when $F(u, u) \le 0$ for all $u \in U$.
5. $F$ is called indefinite if it is none of the above, that is, if $F(u, u) > 0$ for some $u \in U$, and $F(v, v) < 0$ for some $v \in U$.
When $U$ is finite dimensional, $\mathcal{U}$ is a basis for $U$, and $A = [F]_{\mathcal{U}}$, then we have: $F$ is positive definite $\iff$ $F(u, u) > 0$ for all $0 \ne u \in U$ $\iff$ $[u]_{\mathcal{U}}^t[F]_{\mathcal{U}}[u]_{\mathcal{U}} > 0$ for all $0 \ne u \in U$ $\iff$ $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$.
For $A \in M_{n \times n}(\mathbb{R})$ symmetric, $A$ is positive definite $\iff$ $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$; $A$ is positive semidefinite $\iff$ $x^tAx \ge 0$ for all $x \in \mathbb{R}^n$.
In particular, an inner product is a positive-definite symmetric bilinear form.
Theorem 9.14. Let $U$ be a finite dimensional vector space over $\mathbb{R}$. Let $F : U \times U \to \mathbb{R}$ be a symmetric bilinear form. Let $\mathcal{U}$ be a basis for $U$. Let $A = [F]_{\mathcal{U}}$. Then $F$ is positive definite (if and only if $A$ is positive definite) if and only if all the eigenvalues of $A$ are positive. $F$ is positive semidefinite if and only if all the eigenvalues of $A$ are nonnegative, etc.
Proof. We know that $F$ is positive definite if and only if $A$ is positive definite, if and only if $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$. Hence suppose that $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$. Let $\lambda$ be an eigenvalue of $A$, and let $x$ be an eigenvector (hence $x \ne 0$ by definition) of $A$ for $\lambda$. Then
$$Ax = \lambda x \implies x^tAx = \lambda x^tx = \lambda|x|^2 \implies \lambda = \frac{x^tAx}{|x|^2} > 0.$$
Conversely, suppose that all the eigenvalues of $A$ are positive. Since $A$ is symmetric, we can orthogonally diagonalize $A$. Hence, choose an orthogonal matrix $P$ so that
$$P^*AP = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n),$$
where the $\lambda_i$ are the eigenvalues of $A$, which are assumed to all be positive. We have $A = PDP^* = PDP^t$ since we are in $\mathbb{R}$. For $0 \ne x \in \mathbb{R}^n$,
$$x^tAx = x^tPDP^tx = y^tDy \quad\text{where } y = P^tx,$$
noting that $y \ne 0$ since $x \ne 0$ and $P$ is invertible. This becomes
$$(y_1, \dots, y_n)\operatorname{diag}(\lambda_1, \dots, \lambda_n)\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \sum_{i=1}^{n}\lambda_iy_i^2 > 0,$$
since each $\lambda_i > 0$ and some $y_i \ne 0$.
Theorem 9.15 (Characterization of Positive/Negative Definite Bilinear Forms over $\mathbb{R}$). Let $A \in M_{n \times n}(\mathbb{R})$ be symmetric. Then $A$ is positive definite if and only if $\det(A_{kk}) > 0$ for all $k = 1, \dots, n$, where $A_{kk}$ is the upper-left $k \times k$ submatrix of $A$. $A$ is negative definite if and only if
$$(-1)^k\det(A_{kk}) > 0$$
for all $k = 1, \dots, n$.
Remark 9.16. For $A \in M_{n \times n}(\mathbb{R})$ with $A^t = A$, $A$ is positive definite if and only if $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$, if and only if the eigenvalues of $A$ are all positive.
Proof of theorem. We show that $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$ if and only if $\det(A_{kk}) > 0$ for all $k$. Suppose $x^tAx > 0$ for all $0 \ne x \in \mathbb{R}^n$. Then for all $0 \ne x \in \mathbb{R}^k$ we have
$$\begin{pmatrix} x \\ 0 \end{pmatrix}^tA\begin{pmatrix} x \\ 0 \end{pmatrix} = (x^t, 0)\begin{pmatrix} A_{kk} & B \\ C & D \end{pmatrix}\begin{pmatrix} x \\ 0 \end{pmatrix} = (x^t, 0)\begin{pmatrix} A_{kk}x \\ Cx \end{pmatrix} = x^tA_{kk}x > 0,$$
and hence $A_{kk}$ is positive definite for all $k$. Hence the eigenvalues of $A_{kk}$ are all positive. Therefore the determinant of $A_{kk}$ is positive (since the determinant is the product of the eigenvalues). A different way of seeing this: since the $k \times k$ matrix is symmetric, we can orthogonally diagonalize it; we convert it to a diagonal matrix and look at the determinant of that matrix.
Now suppose $\det A_{kk} > 0$ for all $k$ ($1 \le k \le n$). Use our row/column operation algorithm to diagonalize $A$. We have
$$A_{11} = \det(A_{11}) > 0,$$
so we use
$$C_i \mapsto C_i - \frac{A_{1i}}{A_{11}}C_1, \qquad R_i \mapsto R_i - \frac{A_{i1}}{A_{11}}R_1$$
to convert $A$ to the form
$$\begin{pmatrix} A_{11} & 0 \\ 0 & B \end{pmatrix}.$$
These same operations convert $A_{(k+1)(k+1)}$ to
$$\begin{pmatrix} A_{11} & 0 \\ 0 & B_{kk} \end{pmatrix}.$$
These operations do not change the determinant, so
$$\det B_{kk} = \frac{\det A_{(k+1)(k+1)}}{A_{11}} > 0$$
for all $k$ ($1 \le k \le n-1$). Continuing the algorithm, we convert $A$ to diagonal form and the resulting diagonal matrix has positive entries. Therefore the index of $A$ is $n$. On the other hand, since $A = A^t$ we can orthogonally diagonalize to convert $A$ to the form $D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$, where the $\lambda_i$ are the eigenvalues of $A$. Since the index of $A$ is $n$, every $\lambda_i > 0$ (we are using Sylvester's theorem).
Part 2 follows from part 1 by replacing $A$ by $-A$.
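Numerically, the leading-principal-minor test and the eigenvalue test agree; the following is a minimal sketch (not from the notes, NumPy assumed).

# Sketch (not from the notes): Sylvester's criterion vs. the eigenvalue test.
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
print(np.round(minors, 6))                        # leading minors: 2, 3, 4
print(all(m > 0 for m in minors),
      bool(np.all(np.linalg.eigvalsh(A) > 0)))    # both say: A is positive definite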
9.2 Quadratic forms
Definition 9.17. Let $U$ be a vector space over a field $F$. A quadratic form is a map $K : U \to F$ given by
$$K(u) = F(u, u)$$
for some symmetric bilinear form $F$ on $U$.
Remark 9.18. When $U$ is finite dimensional and $\mathcal{U}$ is a basis for $U$ we write $[K]_{\mathcal{U}} = [F]_{\mathcal{U}}$, so that
$$K(u) = F(u, u) = [u]_{\mathcal{U}}^t\,[F]_{\mathcal{U}}\,[u]_{\mathcal{U}} = [u]_{\mathcal{U}}^t\,[K]_{\mathcal{U}}\,[u]_{\mathcal{U}} = x^tAx$$
where $x = [u]_{\mathcal{U}}$, $A = [K]_{\mathcal{U}}$, and
$$x^tAx = \sum_{i,j}x_iA_{ij}x_j = \sum_i A_{ii}x_i^2 + \sum_{i<j}2A_{ij}x_ix_j,$$
which is a homogeneous polynomial of degree 2. A polynomial (or power series) in $x, y$ is of the form
$$p(x, y) = c_{00} + \underbrace{c_{10}x + c_{01}y}_{\text{homog. of deg. 1}} + \underbrace{c_{20}x^2 + c_{11}xy + c_{02}y^2}_{\text{homog. of deg. 2 (quadratic form)}} + \underbrace{c_{30}x^3 + c_{21}x^2y + c_{12}xy^2 + c_{03}y^3}_{\text{homog. of deg. 3 (cubic form)}} + \dots$$
For $f(x, y) \in C^\infty$, the Taylor series at $(0, 0)$ is
$$T(x, y) = f(0, 0) + \frac{\partial f}{\partial x}(0, 0)\,x + \frac{\partial f}{\partial y}(0, 0)\,y + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(0, 0)\,x^2 + \frac{\partial^2 f}{\partial x\partial y}(0, 0)\,xy + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(0, 0)\,y^2 + \frac{1}{3!}\frac{\partial^3 f}{\partial x^3}(0, 0)\,x^3 + \dots$$
$$= f(0, 0) + D\begin{pmatrix} x \\ y \end{pmatrix} + \frac{1}{2}(x, y)\,H\begin{pmatrix} x \\ y \end{pmatrix} + \dots$$
where
$$D = \begin{pmatrix} \frac{\partial f}{\partial x}(0, 0) & \frac{\partial f}{\partial y}(0, 0) \end{pmatrix} \quad\text{and}\quad H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2}(0, 0) & \frac{\partial^2 f}{\partial x\partial y}(0, 0) \\ \frac{\partial^2 f}{\partial y\partial x}(0, 0) & \frac{\partial^2 f}{\partial y^2}(0, 0) \end{pmatrix}.$$
Example 9.19. Sketch $3x^2 - 4xy + 6y^2 = 10$ (or sketch $3x^2 - 4xy + 6y^2 = z$).
Solution. $3x^2 - 4xy + 6y^2 = (x\;\;y)\,A\begin{pmatrix}x\\y\end{pmatrix}$, where
$$A = \begin{pmatrix} 3 & -2 \\ -2 & 6 \end{pmatrix}.$$
Orthogonally diagonalize $A$:
$$f_A(t) = \det\begin{pmatrix} 3 - t & -2 \\ -2 & 6 - t \end{pmatrix} = t^2 - 9t + 14 = (t - 7)(t - 2),$$
so the eigenvalues are $\lambda_1 = 7$, $\lambda_2 = 2$. When $\lambda = 7$,
$$A - \lambda I = \begin{pmatrix} -4 & -2 \\ -2 & -1 \end{pmatrix} \sim \begin{pmatrix} 2 & 1 \\ 0 & 0 \end{pmatrix},$$
so we take $u_1 = \frac{1}{\sqrt{5}}(1, -2)^t$. When $\lambda = 2$ we can take $u_2 = \frac{1}{\sqrt{5}}(2, 1)^t$. Let
$$P = (u_1, u_2) = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix},$$
which is a rotation, so
$$P^tAP = D = \begin{pmatrix} 7 & 0 \\ 0 & 2 \end{pmatrix}.$$
Now we have $A = PDP^t$ and
$$(x\;\;y)\,A\begin{pmatrix}x\\y\end{pmatrix} = (x\;\;y)\,PDP^t\begin{pmatrix}x\\y\end{pmatrix} = (s\;\;t)\,D\begin{pmatrix}s\\t\end{pmatrix} \quad\text{where}\quad \begin{pmatrix}s\\t\end{pmatrix} = P^t\begin{pmatrix}x\\y\end{pmatrix}, \qquad \begin{pmatrix}x\\y\end{pmatrix} = P\begin{pmatrix}s\\t\end{pmatrix}.$$
Hence
$$10 = 3x^2 - 4xy + 6y^2 = (x\;\;y)\,A\begin{pmatrix}x\\y\end{pmatrix} = (s\;\;t)\,D\begin{pmatrix}s\\t\end{pmatrix} = 7s^2 + 2t^2 \iff \frac{s^2}{10/7} + \frac{t^2}{5} = 1,$$
which is an ellipse in the rotated $(s, t)$ coordinates.
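A minimal numerical sketch of this conic-sketching procedure (not from the notes, NumPy assumed):

# Sketch (not from the notes): orthogonally diagonalize the quadratic form of
# Example 9.19 and read off the semi-axes of the ellipse 7 s^2 + 2 t^2 = 10.
import numpy as np

A = np.array([[3.0, -2.0],
              [-2.0, 6.0]])
evals, P = np.linalg.eigh(A)            # eigenvalues ascending: [2, 7]
print(evals)
print(np.allclose(P.T @ A @ P, np.diag(evals)))

# semi-axis lengths of 3x^2 - 4xy + 6y^2 = 10 along the eigenvector directions:
print(np.sqrt(10 / evals))              # sqrt(5) and sqrt(10/7)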
Example 9.20. Let $U$ be a finite dimensional inner product space over $\mathbb{R}$. Let $K : U \to \mathbb{R}$ be a quadratic form on $U$. Find the maximum and minimum values of $K(u)$ for $|u| = 1$.
Solution. Choose an orthonormal basis $\mathcal{U} = \{u_1, \dots, u_n\}$ so that
$$[K]_{\mathcal{U}} = D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$$
with $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$. Then for $u \in U$, write $x = [u]_{\mathcal{U}}$. Note that $|x| = |u|$ since $\mathcal{U}$ is orthonormal. We have
$$K(u) = x^tDx = \sum_{i=1}^{n}\lambda_ix_i^2 \le \sum_{i=1}^{n}\lambda_1x_i^2 = \lambda_1\sum_{i=1}^{n}x_i^2 = \lambda_1|x|^2 = \lambda_1$$
when $|u| = |x| = 1$. When $u = u_1$ (an eigenvector for $\lambda_1$) we have $x = e_1$, so
$$K(u) = \sum_{i=1}^{n}\lambda_ix_i^2 = \lambda_1,$$
so
$$\max_{|u|=1}K(u) = \lambda_1 \quad\text{with } K(u_1) = \lambda_1.$$
Similarly,
$$\min_{|u|=1}K(u) = \lambda_n \quad\text{with } K(u_n) = \lambda_n.$$
Example 9.21. Let $U$ and $V$ be two finite dimensional inner product spaces over $\mathbb{R}$. Let $L : U \to V$ be linear. Find
$$\max_{|u|=1}|L(u)| \quad\text{and}\quad \min_{|u|=1}|L(u)|.$$
Solution. Choose an orthonormal basis $\mathcal{U}$ for $U$ and $\mathcal{V}$ for $V$. Let $A = [L]_{\mathcal{U}}^{\mathcal{V}}$. For $u \in U$, write $x = [u]_{\mathcal{U}}$. Then
$$[L(u)]_{\mathcal{V}} = Ax,$$
and because $\mathcal{V}$ is orthonormal,
$$|L(u)| = |Ax| \implies |L(u)|^2 = |Ax|^2 = x^tA^tAx.$$
Therefore,
$$\max_{|u|=1}|L(u)|^2$$
is the maximum eigenvalue of $A^tA$. Similarly,
$$\min_{|u|=1}|L(u)|^2$$
is the minimum eigenvalue of $A^tA$. Thus
$$\max_{|u|=1}|L(u)| = \sigma_1 = \sqrt{\lambda_1}, \qquad \min_{|u|=1}|L(u)| = \sigma_n = \sqrt{\lambda_n},$$
where $\lambda_1 \ge \dots \ge \lambda_n$ are the eigenvalues of $A^tA$.
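Equivalently, $\max_{|u|=1}|L(u)|$ is the largest singular value of $A$ and the minimum is the smallest. A minimal brute-force check (not from the notes, NumPy assumed):

# Sketch (not from the notes): compare singular values of A with |Au| sampled on
# the unit sphere.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
sigma = np.linalg.svd(A, compute_uv=False)   # singular values, decreasing

X = rng.standard_normal((4, 20000))          # many random unit vectors
X /= np.linalg.norm(X, axis=0)
norms = np.linalg.norm(A @ X, axis=0)

print(sigma[0], norms.max())    # max |Au| approaches sigma_1 from below
print(sigma[-1], norms.min())   # min |Au| approaches sigma_n from above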
This is the end of the material that will be tested on the course final exam.
10 Jordan canonical form
Theorem 10.1 (Jordan Canonical Form). Let $U$ be a finite dimensional vector space over any field $F$. Let $L : U \to U$ be linear. Suppose $f_L(t)$ splits. There is a basis $\mathcal{U}$ for $U$ such that $[L]_{\mathcal{U}}$ is of the block-diagonal form
$$\begin{pmatrix} J_{k_1}^{\lambda_1} & & \\ & \ddots & \\ & & J_{k_\ell}^{\lambda_\ell} \end{pmatrix}$$
where
$$J_{k}^{\lambda} = \begin{pmatrix} \lambda & 1 & & & 0 \\ & \lambda & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & 1 \\ 0 & & & & \lambda \end{pmatrix}$$
is a $k \times k$ Jordan block, and the blocks $J_{k_i}^{\lambda_i}$ are unique (up to order).