Waffle
Mathcamp 2009
Last week, Ari taught you about one kind of "simple" (in the nontechnical
sense) ring, specifically semisimple rings. These have the property that every
module splits as a direct sum of simple modules (in the technical sense).
This week, we'll look at a rather different kind of ring, namely a principal
ideal domain, or PID. These rings, like semisimple rings, have the property that
every (finitely generated) module is a direct sum of "simple" modules, though
here we use "simple" in the nontechnical sense of "easy to understand." However,
while this property of modules was almost the definition of semisimple rings,
for PIDs it is much less obvious, and the bulk of our time will be devoted
to proving this classification of modules. This classification is very powerful,
and its applications include both a complete classification of finitely generated
abelian groups and a classification of matrices up to conjugation over C (or
any algebraically closed field). The main example of a PID we will focus on is
the integers Z, for which modules are just abelian groups. However, another
important example will be k[x], the ring of polynomials in one variable over a
field. Indeed, while k[x] and Z may look like fairly different rings at first, they
are in fact very similar just by both being PIDs.1
The first, most obvious difference between what I will do and what Ari did is
that PIDs are by definition commutative. Thus throughout these notes, all
rings will be assumed to be commutative. If $a, b, c, \ldots \in R$ are elements
of a ring, we let $(a, b, c, \ldots)$ denote the ideal they generate. More generally,
if $a, b, c, \ldots \in M$ are elements of an $R$-module, we let $(a, b, c, \ldots)$ denote the
submodule they generate.
We start with some generalities on commutative rings that will lead up to the
notion of a PID.
Definition 1.1. A ring $R$ is an integral domain (or domain, for short) if $0 \neq 1$
and whenever $a, b \in R$ and $ab = 0$, either $a = 0$ or $b = 0$. A ring is a field if
1 There are even mathematicians who are trying to make sense of an imaginary "field with
one element," or $\mathbb{F}_1$, such that $\mathbb{Z}$ would be $\mathbb{F}_1[x]$ (or perhaps a similar ring formed from $\mathbb{F}_1$
that is not quite the same as $\mathbb{F}_1[x]$; I'm not an expert on this so I might be getting the details
wrong). If they can make enough sense of this, they may be able to use it to prove the
Riemann hypothesis!
1.1 Exercises
$$0 \to K \xrightarrow{i} M \xrightarrow{p} N \to 0$$
is a short exact sequence if i is injective, p is surjective, and ker(p) = Im(i).
Equivalently, up to isomorphism, K is a submodule of M and N = M/K, with
i and p the obvious maps.
If these conditions hold, we say that the short exact sequence splits.
The next exercise uses the following result, which is equivalent to the Axiom
of Choice.
Lemma 1.13 (Zorn's Lemma). Let $P$ be a partially ordered set, and suppose
that whenever $C \subseteq P$ is totally ordered, there is $x \in P$ such that $x \geq c$ for
all $c \in C$ (briefly, "every chain in $P$ has an upper bound"). Then there is an
element $z \in P$ which is maximal: there is no $x \in P$ such that $x > z$.
Exercise 1.14.
(a): Let $I \subset R$ be a proper ideal. Show that there is a maximal ideal containing
$I$. (Hint: To get upper bounds, use that an ideal is proper iff it does not
contain 1.)
(b): Show that an element $a \in R$ is contained in a maximal ideal iff it is not a
unit.
Exercise 1.15.
(a): Show that the polynomial ring R[x] satisfies the following universal property: a homomorphism from R[x] to another ring S is equivalent to a homomorphism from R to S together with a chosen element in S (the image
of x).5
(b): Show that an $R[x]$-module is equivalent to an $R$-module $M$ together with
an element of $\mathrm{End}_R(M)$ (corresponding to multiplication by $x$). (Hint: An
$S$-module structure on an abelian group $M$ is the same as a homomorphism
from $S$ to the (noncommutative) ring $\mathrm{End}_{\mathbb{Z}}(M)$.)
Exercise 1.16.
(a): Let $R$ be a domain. Let $k$ be the set of formal expressions $a/b$ where
$a, b \in R$ and $b \neq 0$, and we say $a/b = c/d$ if $ad = bc$. Show that with
the usual addition and multiplication of fractions, $k$ is a field. Show that
$a \mapsto a/1$ is an injective ring-homomorphism from $R$ into $k$. We call $k$ the
field of fractions of $R$.
(b): Show that a ring is a domain iff it is a subring of a field.
Exercise 1.17. Show that if a domain is finite, it is a field.
5 By "equivalent," we mean that there is a naturally defined bijection between the set of
homomorphisms $R[x] \to S$ and the set of pairs $(f, s)$ where $f : R \to S$ and $s \in S$.
First, we note that changing an element by a unit does not change its divisibility properties. If we allow ourselves to change factors by units, factorization
is not even unique in $\mathbb{Z}$. For example $6 = 2 \cdot 3 = (-2)(-3)$, where $2$, $3$, $-2$,
and $-3$ are all prime. For the integers, we remedy this by arbitrarily choosing a
representative of each associate class, namely the positive one. For polynomials
over a field, we can similarly choose to use only monic polynomials (polynomials
with leading coefficient 1) in our factorizations. For a general PID, we will just
arbitrarily choose representatives of each associate class of primes, and call these
the chosen primes. Equivalently, we are picking a generator for every maximal
ideal.
Theorem 2.6 (Unique Factorization). Let $R$ be a PID. Then every nonzero
$a \in R$ can be factorized as $u p_1^{d_1} p_2^{d_2} \cdots p_n^{d_n}$, where $u$ is a unit, the $p_i$ are chosen
primes, and $d_i > 0$. This factorization is unique up to permuting the $p_i$.
Proof. First, we prove existence of the factorization. If $a$ is a unit, this is
trivial. Otherwise, let $P$ be any maximal ideal containing $(a)$. Then $P = (p_1)$
for some chosen prime $p_1$, and $p_1 \mid a$ since $(a) \subseteq (p_1)$. Write $a = p_1 a_1$. Repeat
the argument with $a_1$ in place of $a$ to write $a_1 = p_2 a_2$ if $a_1$ is not a unit.
Continue by induction. If we ever get $a_n = u$ to be a unit, we are done, since
$a = p_1 a_1 = p_1 p_2 a_2 = \cdots = p_1 p_2 \cdots p_n u$. If we never get a unit, we get an infinite
ascending chain of ideals $(a) \subsetneq (a_1) \subsetneq (a_2) \subsetneq (a_3) \subsetneq \cdots$. Let $I = \bigcup_n (a_n)$; then
$I$ is an ideal. But then $I = (b)$ since $R$ is a PID, and $b \in (a_n)$ for some $n$. But
then $I = (a_n) \supseteq (a_{n+1})$, so $(a_n) = (a_{n+1})$, a contradiction. Hence the process
must eventually stop with $a_n$ a unit, and we get the desired factorization.
Now we prove uniqueness by induction on $\sum d_i$. If $\sum d_i = 0$ (i.e., $n = 0$),
then $a = u$ is just a unit, and uniqueness is obvious. Now suppose $\sum d_i > 0$ and
$$a = u p_1^{d_1} p_2^{d_2} \cdots p_n^{d_n} = v q_1^{e_1} q_2^{e_2} \cdots q_m^{e_m}$$
are two different factorizations. Then $p_1$
divides the product on the right-hand side and is prime, so $p_1$ must divide one of
the factors. Since no two chosen primes are associate, this implies that $p_1 = q_i$
for some $i$. Cancelling these common factors, we get
$$b = u p_1^{d_1 - 1} p_2^{d_2} \cdots p_n^{d_n} = v q_1^{e_1} q_2^{e_2} \cdots q_i^{e_i - 1} \cdots q_m^{e_m}.$$
We have decreased $\sum d_i$ by one, so by induction the
factorization of $b$ is unique, so these two factorizations are the same up to
permutation. It follows that the two original factorizations of $a$ were the same
up to permutation.
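For $R = \mathbb{Z}$, with the positive primes as the chosen primes, the factorization of Theorem 2.6 can be computed by trial division. The following Python sketch is only an illustration under that assumption; the function name `factor` is hypothetical and does not come from these notes.

```python
def factor(a):
    """Return (u, powers) with a == u * prod(p**d for p, d in powers.items()).

    u is a unit of Z (+1 or -1) and the keys of powers are positive primes.
    """
    if a == 0:
        raise ValueError("0 has no factorization into primes")
    u = 1 if a > 0 else -1
    a = abs(a)
    powers = {}
    p = 2
    while p * p <= a:
        # Divide out p as often as possible; any p that divides here is prime,
        # because all smaller primes have already been removed.
        while a % p == 0:
            powers[p] = powers.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:
        # Whatever is left is a single prime larger than every earlier p.
        powers[a] = powers.get(a, 0) + 1
    return u, powers
```

For example, `factor(-360)` separates the unit $-1$ from the chosen primes $2^3 \cdot 3^2 \cdot 5$.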
2.1 Exercises
Exercise 2.7. Let $k$ be a field (though any ring will do for the first part). Show
that the ideal $(x, y) \subset k[x, y]$ is not principal. Conclude that $k[x, y]$ is a domain
but not a PID.6
Exercise 2.8.
6 In fact, it can be shown that k[x, y] has unique factorization, showing that a domain with
unique factorization (or UFD) need not be a PID.
(a): Show that if a polynomial $f(x) \in k[x]$ is irreducible and of degree greater
than 1, then $f$ has no roots. (Hint: If $f(a) = 0$, show by division that
$(x - a) \mid f$.)
(b): Show that the polynomial $x^2 + 1$ is irreducible as an element of $\mathbb{R}[x]$.
(c): Show that the polynomial $x^4 - 4$ is reducible as an element of $\mathbb{Q}[x]$, but
has no roots.
The next exercise gives another class of examples of PIDs which are important
in more advanced commutative algebra.
Exercise 2.9. A local ring is a ring $R$ with a single maximal ideal $P$. By
Exercise 1.14(b), this means that every element of $R \setminus P$ is a unit. A discrete
valuation ring (DVR) is a Noetherian local domain such that the maximal ideal
$P$ is principal. Let $R$ be a DVR and let $p$ be a generator for $P$.
(a): Show that every nonzero element of $R$ is of the form $u p^n$ for $u$ a unit.
(Hint: Imitate the proof of existence of prime factorizations in a PID and
use Noetherianness.)
(b): Show that $R$ is a PID, and in fact that every nonzero ideal is $(p^n)$ for some
$n$.
(c): Let $p \in \mathbb{Z}$ be a prime, and let $\mathbb{Z}_{(p)}$ be the set of rational numbers whose
denominators are relatively prime to $p$. Show that $\mathbb{Z}_{(p)}$ is a DVR with
maximal ideal $(p)$.
(d): Let $k$ be a field, and let $k[[x]] = \{\sum_{n=0}^{\infty} a_n x^n : a_n \in k\}$ be the ring of
formal power series in one variable with coefficients in $k$. Show that $k[[x]]$
is a DVR with maximal ideal $(x)$. (Hint: Show that $\sum a_n x^n$ is a unit if
$a_0 \neq 0$, and use this to prove that (b) above holds for $R = k[[x]]$ and $p = x$.)
The example of Exercise 2.9(c) is generalized in Exercise 3.15.
Exercise 2.10. A Euclidean domain is a domain $R$ together with a function
$d : R \setminus \{0\} \to \mathbb{N}$ satisfying $d(ab) \geq \max(d(a), d(b))$ and such that if $a, b \in R$ and
$b \neq 0$, then there exist $q$ and $r$ such that $a = qb + r$ and $r = 0$ or $d(r) < d(b)$.
For example, if $R = \mathbb{Z}$ we can let $d(n) = |n|$ and if $R = k[x]$ we can let
$d(f) = \deg(f)$. Show that any Euclidean domain is a PID.
Exercise 2.11. Let $R$ be a PID and $a, b \in R$. Then a greatest common divisor
(gcd) of $a$ and $b$ is $c \in R$ such that $c \mid a$, $c \mid b$, and if $d \mid a$ and $d \mid b$ then $d \mid c$. We write
$c = \gcd(a, b)$.
(a): Show that $c$ is a gcd of $a$ and $b$ iff $c$ generates the ideal $(a, b)$. Conclude
that any two elements have a gcd, well-defined up to units.
(b): Show that if $c = \gcd(a, b)$, then there exist $r, s \in R$ such that $c = ra + sb$.
Conversely, show that if $c = ra + sb$ and $c \mid a$ and $c \mid b$, then $c = \gcd(a, b)$.
(c): Suppose $R$ is a Euclidean domain. Then show that the following algorithm,
known as the Euclidean algorithm, computes $\gcd(a, b)$ and allows one to
explicitly write it in the form $ra + sb$: Assume WLOG that $d(b) \geq d(a)$.
If $a \mid b$, stop and say the gcd is $a$. Otherwise, write $b = qa + r$ with
$d(r) < d(a)$. Now repeat the algorithm, but replace $b$ with $r$. The algorithm will
eventually terminate because $d(a) + d(b)$ decreases with each step. (To get
an idea of how the algorithm works, you may want to try it with $R = \mathbb{Z}$
and $a = 18$, $b = 26$.)7
7 The Greeks invented the Euclidean algorithm to prove that $\mathbb{Z}$ is a PID (though they didn't
call it that!). Oddly, I have seen multiple references in which the example used to demonstrate
the Euclidean algorithm is 18 and 26.
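For $R = \mathbb{Z}$ the Euclidean algorithm of Exercise 2.11(c) can be sketched in a few lines of Python. This is one standard way to organize the bookkeeping (the "extended" Euclidean algorithm), not necessarily the way the exercise intends; the name `extended_gcd` and the variable names are my own.

```python
def extended_gcd(a, b):
    """Return (g, r, s) with g = gcd(a, b) >= 0 and g == r*a + s*b."""
    # Invariants maintained by the loop:
    #   old_rem == old_r*a + old_s*b   and   rem == r*a + s*b
    old_rem, rem = a, b
    old_r, r = 1, 0
    old_s, s = 0, 1
    while rem != 0:
        q = old_rem // rem
        # Division with remainder: replace the pair by (rem, old_rem - q*rem).
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_rem < 0:
        # Normalize by a unit so that the gcd is the positive representative.
        old_rem, old_r, old_s = -old_rem, -old_r, -old_s
    return old_rem, old_r, old_s
```

With the traditional example, `extended_gcd(18, 26)` gives $\gcd = 2$ together with $2 = 3 \cdot 18 - 2 \cdot 26$.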
In this section, R will always denote a PID, and primes will be taken to mean
chosen primes.
We can now get to the real meat: classifying finitely generated modules
over a PID. Note that a $\mathbb{Z}$-module is the same as an abelian group, so this will
also give a complete classification of finitely generated abelian groups. This is
perhaps more impressive when you consider that it is generally accepted to be
hopeless to even classify finite nonabelian groups.
What about the case $R = k[x]$? In this case, by Exercise 1.15, an $R$-module is
a $k$-module $V$ (i.e., vector space) together with an endomorphism $A \in \mathrm{End}_k(V)$
(i.e., a linear transformation on $V$). If $(V, A)$ and $(W, B)$ are $k[x]$-modules, a homomorphism between them is a $k$-linear map $T : V \to W$ that intertwines with
the endomorphisms: $TA = BT$ (this is just the property that $T(xv) = xT(v)$,
since $A$ and $B$ correspond to multiplication by $x$). It follows that $(V, A)$ and
$(V, B)$ are isomorphic iff there is a $k$-linear automorphism $T$ of $V$ such that
$TA = BT$, or $B = TAT^{-1}$. Thus by classifying $k[x]$-modules, we will classify
linear transformations on vector spaces up to conjugation by linear automorphisms. Or, in simpler language, we will classify matrices up to conjugation.
Let's now look at what the classification is.
Definition 3.1. An R-module M is cyclic if it is generated by a single element.
This generalizes the notion of a cyclic group. Note that if M is generated
by x, then f (a) = ax is a surjective homomorphism from R to M . Hence cyclic
modules are exactly those of the form R/I for some ideal I.
Here is the main theorem of this section.
Theorem (Classification of Modules over a PID). Let $M$ be a finitely generated
$R$-module. Then $M$ is isomorphic to a direct sum of cyclic modules
$R^n \oplus \bigoplus_i R/(q_i)$, where each $q_i = p_i^{d_i}$ is a power of a prime. This decomposition
is unique up to permuting the factors.
The number n is sometimes called the Betti number of M and the qi are the
elementary divisors or torsion coefficients.
There are several steps to proving this theorem. Basically, we can prove it
separately for the summands Rn and for the summands R/(q) corresponding to
each prime.
3.1 Torsion modules
Note that the period of an element is only defined up to units; this should
not cause any confusion. If $x$ has period $a$, then $(x) \subseteq M$ is isomorphic to
$R/(a)$.
Example 3.3. Let $R = \mathbb{Z}$. Then a module is a finitely generated torsion module
iff it is a finite abelian group. Indeed, it is clear that any element of a finite
abelian group must be torsion (otherwise its multiples would all be different
and there would be infinitely many of them). Conversely, if $M$ is generated by
$x_1, \ldots, x_n$ with periods $a_1, \ldots, a_n$, then any element of $M$ can be written as
$b_1 x_1 + \cdots + b_n x_n$ where each $b_i$ satisfies $0 \leq b_i < a_i$, so $M$ is finite.
Example 3.4. Let $R = k[x]$. If $(V, A)$ is a $k[x]$-module and $v \in V$ is torsion,
then $(v) \cong k[x]/(f(x))$ for some nonzero polynomial $f$, which (by changing by
a unit) we may assume to be monic. If $f(x) = x^n + a_{n-1} x^{n-1} + \cdots$, then in
$k[x]/(f)$, $x^n = -a_{n-1} x^{n-1} - \cdots$ is $k$-linearly dependent on the lower powers
of $x$. We similarly can see that all higher powers of $x$ can be spanned just by
$B = \{1, \ldots, x^{n-1}\}$. In fact, $B$ is a basis for $k[x]/(f)$ as a $k$-vector space, since
every nonzero element of $(f)$ has degree at least $n$, so no nontrivial linear combination of
elements of $B$ can vanish in $k[x]/(f)$.
Thus if $v$ is torsion, $(v)$ is finite-dimensional. In fact, more generally, a $k[x]$-module
is finitely generated torsion iff it is finite-dimensional as a vector space,
which you will prove in Exercise 3.11.
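To make Example 3.4 concrete, here is a Python sketch (with $k = \mathbb{Q}$, using exact fractions) that writes down the matrix of multiplication by $x$ on $k[x]/(f)$ in the basis $B = \{1, x, \ldots, x^{n-1}\}$ and checks that $f$ kills it. The function names are hypothetical, and the convention that column $j$ holds the coordinates of $x \cdot x^j$ is my own choice.

```python
from fractions import Fraction

def mult_by_x_matrix(coeffs):
    """Matrix of multiplication by x on k[x]/(f) in the basis 1, x, ..., x^(n-1).

    coeffs = [a_0, ..., a_{n-1}] are the lower coefficients of the monic
    polynomial f(x) = x^n + a_{n-1} x^(n-1) + ... + a_0.
    """
    n = len(coeffs)
    A = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n - 1):
        A[j + 1][j] = Fraction(1)            # x * x^j = x^(j+1)
    for i in range(n):
        A[i][n - 1] = -Fraction(coeffs[i])   # x * x^(n-1) = -a_0 - ... - a_{n-1} x^(n-1)
    return A

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate the monic polynomial f at the matrix A by Horner's rule."""
    n = len(A)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    result = identity                        # leading coefficient 1
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        result = [[result[i][j] + a * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result
```

For $f = x^2 + 1$ (that is, `coeffs = [1, 0]`) the matrix is $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and `poly_at_matrix([1, 0], A)` is the zero matrix, reflecting that $f$ acts as $0$ on $k[x]/(f)$.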
In this section, we will prove the classification theorem for torsion modules,
which is the general classification theorem in the case where the Betti number
is 0.
First, we show that we can treat each prime separately when analyzing a
torsion module.
Definition 3.5. Let $p \in R$ be a prime and $M$ be an $R$-module. Then an
element $x \in M$ is $p$-torsion if its period is a power of $p$. If every element of $M$
is $p$-torsion, we say $M$ is a $p$-module.
The name "$p$-module" generalizes the traditional term "$p$-group" for a group
whose order is a power of $p$.
Lemma 3.6. Let $M$ be a finitely generated torsion $R$-module. For each $p$, let
$M_p = \{x \in M : x \text{ is } p\text{-torsion}\}$. Then $M_p$ is a submodule of $M$, and $M$ splits
as the direct sum $\bigoplus_p M_p$ over all primes $p$.
Proof. First, we show $M_p$ is a submodule. Let $x, y \in M_p$, with $(0/x) = (p^n)$
and $(0/y) = (p^m)$ and WLOG $m \leq n$. Then $p^n(x + y) = 0$, so $p^n \in (0/(x + y))$.
Since $p$ is prime, the only ideals containing $(p^n)$ are $(p^k)$ for $k \leq n$. Hence the
period of $x + y$ is a power of $p$, so $x + y \in M_p$. A similar argument shows that
$M_p$ is closed under scalar multiplication.
Now we show that $M = \bigoplus M_p$. This consists of two statements: $M = \sum M_p$
and $M_p \cap \sum_{q \neq p} M_q = 0$ for all $p$. That is, the $M_p$ generate all of $M$ and they
are linearly disjoint. We prove the second statement first.
Suppose $x \in M_p \cap \sum_{q \neq p} M_q$ is nonzero; write $x = \sum y_q$ for $y_q \in M_q$.
Let $r$ be the product of the periods of the $y_q$. Then we have $rx = \sum r y_q = 0$,
so $r \in (0/x)$. But $x \in M_p$, so $(0/x) = (p^n)$ for some $n > 0$ (since $x \neq 0$). Since $r$, as a product
of powers of $q$ for $q \neq p$, is relatively prime to $p$, this is impossible. Hence
$M_p \cap \sum_{q \neq p} M_q = 0$.
Now we show that $M = \sum M_p$, i.e. the $M_p$ together generate $M$. Let $x \in M$
be nonzero and factor the period of $x$ as $r = p_1^{d_1} \cdots p_n^{d_n}$ (ignoring units for
convenience). Let $a_i = r/p_i^{d_i}$ and $y_i = a_i x$; i.e. multiply $x$ by all of its period
except the $p_i$ part. Then $p_i^{d_i} y_i = rx = 0$, and as above this implies $y_i \in M_{p_i}$.
Now consider the ideal $I = (a_1, \ldots, a_n) \subseteq R$. Since $R$ is a PID, $I = (a)$ for
some $a$. We then have $a \mid a_i$ for all $i$. By unique factorization, this implies $a = 1$
(up to a unit), so $I$ is all of $R$. In particular, $1 \in I = (a_1, \ldots, a_n)$, so we can write $1 = \sum b_i a_i$
for some $b_i \in R$. But now we have
$$x = \sum b_i a_i x = \sum b_i y_i.$$
Since $y_i \in M_{p_i}$, we conclude that $x \in \sum M_p$, as desired.
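The second half of this proof is effectively an algorithm. As a sketch for $R = \mathbb{Z}$ and $M = \mathbb{Z}/n$, the Python below splits an element $x$ into its $p$-torsion pieces $b_i a_i x$. The function names are hypothetical, and one detail differs from the proof: instead of finding $b_i$ with $\sum b_i a_i = 1$ exactly, it picks $b_i$ with $\sum b_i a_i \equiv 1$ modulo the period $r$, which is enough because $rx = 0$. It relies on `pow(a, -1, q)` for modular inverses, which needs Python 3.8 or later.

```python
from math import gcd

def prime_power_factors(n):
    """Return the prime powers exactly dividing n > 0, e.g. 12 -> [4, 3]."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                q *= p
                n //= p
            factors.append(q)
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def primary_components(x, n):
    """Split x in Z/n into p-torsion pieces, following the proof of Lemma 3.6.

    Returns {q_i: y}, where q_i = p_i^{d_i} runs over the prime powers in the
    period r of x and y = b_i * a_i * x (mod n) is killed by q_i.
    """
    r = n // gcd(x, n)                  # period of x in Z/n
    parts = {}
    for q in prime_power_factors(r):
        a = r // q                      # the period with the p_i part removed
        b = pow(a, -1, q)               # b*a == 1 mod q; a == 0 mod the other prime powers
        parts[q] = (b * a * x) % n
    return parts
```

For instance, in $\mathbb{Z}/12$ the element $1$ splits as $9 + 4$, where $9$ has period $4$ and $4$ has period $3$.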
Exercises
(d): If $M$ is an $R$-module, we can define the localization $M_P$ similarly as fractions $x/s$ with $x \in M$ and $s \in R \setminus P$, with the same identification. Show
that if $R$ is a PID and $M$ is torsion, the localization $M_{(p)}$ is naturally isomorphic to the module $M_p$ of Lemma 3.6. (Hint: Localization preserves
direct sums. Use Lemma 3.6, and show $(M_p)_{(p)} = M_p$ and $(M_q)_{(p)} = 0$ if
$q \neq p$.)
Localization is an extremely important technique in commutative algebra,
though we don't have time to discuss it in more depth here. Unless it turns out
that we do, in which case I might talk about it some more on Saturday.
3.2 Torsion-free modules
Now we look at the other extreme case of the classification, torsion-free modules.
We want to show that any finitely generated torsion-free module $M$ is free, i.e. isomorphic to
a direct sum of copies of $R$. Generators of $M$ inducing such a direct sum
representation are called a basis for $M$. More concretely, $\{x_i\}$ is a basis for $M$
iff they generate $M$ and $\sum a_i x_i = 0$ implies $a_i = 0$ for all $i$. Like bases for a
vector space, this implies that any $x \in M$ can be written as $\sum a_i x_i$ for unique
$a_i$. The size of the basis $\{x_i\}$ is called the rank of $M$ (like the dimension of
a vector space). Note that for any prime $p$, $\{x_i\}$ will also give a basis for the
$R/(p)$-vector space $M/pM$. This implies that the rank of $M$ is well-defined.
The first step to proving all torsion-free modules are free is the following.
Lemma 3.16. Let $M$ be a finitely generated free $R$-module, and let $N \subseteq M$ be
a submodule. Then $N$ is free.
Proof. Let $\{x_1, \ldots, x_n\}$ be a basis for $M$, let $M_r = (x_1, \ldots, x_r)$ and let
$N_r = N \cap M_r$. We show by induction that $N_r$ is free; since $N_n = N$ this will imply
$N$ is free. For $r = 0$, $N_0 = 0$, so this is trivial. Now suppose $N_r$ is free,
and let $I$ be the set of $a \in R$ such that some element of $N_{r+1}$ has $a$ as its
coefficient for $x_{r+1}$. Equivalently, identifying $(x_{r+1})$ with $R$, $I$ is the ideal
$(M_r + N_{r+1})/M_r \subseteq M_{r+1}/M_r \cong (x_{r+1})$.
If $I = 0$, then clearly $N_{r+1} = N_r$, so $N_{r+1}$ is free. Otherwise, since $R$
is a PID, we can write $I = (a)$ for $a$ nonzero. Let $w \in N_{r+1}$ be such that
$w = b_1 x_1 + \cdots + b_r x_r + a x_{r+1}$. If $y \in N_{r+1}$, then the $x_{r+1}$ coefficient of $y$ is $ca$ for
some $c \in R$, so $y - cw \in N_r$. Thus $N_{r+1} = N_r + (w)$. Furthermore, $N_r \cap (w) = 0$
clearly, since any nonzero multiple of $w$ has nonzero $x_{r+1}$ coefficient. Thus
$N_{r+1} = N_r \oplus (w)$; since $N_r$ and $(w) \cong R$ are free, so is $N_{r+1}$.
Note that this proof also shows that the rank of N is at most the rank of
M.
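For $R = \mathbb{Z}$ and $M = \mathbb{Z}^n$, the idea of this proof can be turned into a small algorithm: given finitely many generators of $N$, work through the coordinates one at a time and use Euclidean steps to find an element $w$ whose coefficient generates the ideal $I$. The Python below is only a sketch under those assumptions. The name `submodule_basis` is hypothetical, generators are given as integer tuples, and it peels off the last coordinate first (so it finds the $w$ for $N_n$, then for $N_{n-1}$, and so on), which is the proof's induction run in reverse.

```python
def submodule_basis(gens, n):
    """Return a basis of the subgroup of Z^n generated by the vectors in gens."""
    rows = [list(v) for v in gens]
    basis = []
    for c in reversed(range(n)):
        # Every row still in play vanishes in all coordinates after c.
        while True:
            nonzero = [v for v in rows if v[c] != 0]
            if len(nonzero) <= 1:
                break
            # Euclidean step: reduce the other rows modulo the row whose
            # c-th coefficient is smallest in absolute value.
            pivot = min(nonzero, key=lambda v: abs(v[c]))
            for v in nonzero:
                if v is not pivot:
                    q = v[c] // pivot[c]
                    for j in range(n):
                        v[j] -= q * pivot[j]
        if nonzero:
            # This row plays the role of w: its c-th coefficient generates I.
            w = nonzero[0]
            basis.append(w)
            rows = [v for v in rows if v is not w]
    return basis
```

For example, the subgroup of $\mathbb{Z}^2$ generated by $(2, 4)$, $(6, 8)$ and $(4, 4)$ comes out with the two-element basis $(2, 4)$, $(2, 0)$, consistent with the rank of $N$ being at most the rank of $M$.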
We can now prove that torsion-free modules are free.
Theorem 3.17. Let $M$ be a finitely generated torsion-free $R$-module. Then $M$ is free, i.e. isomorphic to
$R^n$ for some $n$, and this $n$ is unique.
Proof. Uniqueness of $n$ is just the well-definedness of rank of free modules.
Now let $\{v_1, \ldots, v_n\} \subseteq M$ be a maximal linearly independent set, i.e. a
maximal set such that $\sum a_i v_i = 0$ implies $a_i = 0$ for all $i$. Let $N = (v_1, \ldots, v_n)$; by linear
independence, $N$ is free. For any $y \in M$, there is some nonzero $a \in R$ such that
$ay \in N$. Indeed, if no such $a$ existed, then $y$ would be linearly independent from
$\{v_1, \ldots, v_n\}$, contradicting maximality.
In particular, let $\{y_1, \ldots, y_m\}$ generate
$M$, choose nonzero $a_i$ with $a_i y_i \in N$, and let $b = \prod a_i$. Then $b y_i \in N$ for all $i$, so $bM \subseteq N$. But
now multiplication by $b$ is a homomorphism from $M$ to $N$, and is injective since
$M$ is torsion-free and $b \neq 0$. Thus $M$ is isomorphic to a submodule of $N$. By Lemma 3.16
and freeness of $N$, $M$ is free.
3.3 Exercises
Exercise 3.18. Let $k$ be a field and let $M$ be the ideal $(x, y) \subset k[x, y]$. Show
that $M$ is a finitely generated torsion-free $k[x, y]$-module, but $M$ is not free.
(Hint: Show that $I = (x, y)$ and $J = (x - 1, y)$ are both maximal ideals and
$k[x, y]/I \cong k[x, y]/J \cong k$ as rings. Then show that $M/IM$ and $M/JM$ have
different dimensions as $k$-vector spaces, and that this is impossible for a free
module.9 )
Exercise 3.19. Find an example of a (non-finitely generated) torsion-free $\mathbb{Z}$-module which is not free.
Exercise 3.20. Find an example of a (non-finitely generated) torsion-free $k[x]$-module which is not free.
Exercise 3.21. Find an example of a (non-finitely generated) torsion $\mathbb{Z}$-module
which is not a direct sum of cyclic modules.
Exercise 3.22. Let $R$ be a PID. Show that a submodule of any (not necessarily
finitely generated) free $R$-module is free. (Hint: Let $M$ be free and $N \subseteq M$.
Use Zorn's Lemma on the poset of linearly independent subsets of $N$ which
generate free submodules $F \subseteq N$ such that $N/F$ is torsion-free. Show that a
maximal such basis must generate $N$ itself by imitating the proof of Lemma
3.16. To show that every chain has an upper bound, show that if $(F_\alpha)$ is a
chain of nested submodules of $N$ such that each $N/F_\alpha$ is torsion-free, then $N/\bigcup F_\alpha$
is torsion-free.)
3.4
We can now combine Corollary 3.10 for torsion modules and Theorem 3.17 for
torsion-free modules to prove the full classification theorem.
Theorem 3.23 (Classification of Modules over a PID). Let $M$ be a finitely
generated $R$-module. Then $M$ is isomorphic to a direct sum of cyclic modules
$R^n \oplus \bigoplus_i R/(q_i)$, where each $q_i = p_i^{d_i}$ is a power of a prime. This decomposition
is unique up to permuting the factors.
Proof. Let $T = \{x \in M : x \text{ is torsion}\}$, the torsion part of $M$. Then $T$ is a
submodule of $M$; the argument is the same as the argument in Lemma 3.6 that
$M_p$ is a submodule for $p$ a prime. Clearly $T$ is torsion, and $T$ is finitely generated
by Noetherianness, so Corollary 3.10 says it splits uniquely as $\bigoplus R/(q_i)$ for $q_i$
prime powers.
Let $N = M/T$, and let $f : M \to N$ be the canonical map. Then $N$ is
torsion-free. Indeed, if $\bar{x} \in N$ is nonzero and $a\bar{x} = 0$ for some nonzero $a$, let $x \in M$ be such that
$f(x) = \bar{x}$. Then $ax \in T$, so $b(ax) = 0$ for some nonzero $b$. But then $(ba)x = 0$ with $ba \neq 0$,
so $x \in T$, contradicting the assumption that $\bar{x} \neq 0$.
9 Here
Thus $N$ is torsion-free and hence free by Theorem 3.17. If the short exact
sequence
$$0 \to T \to M \to N \to 0$$
were to split, we would get the desired direct sum representation of $M$. But a
splitting map $r : N \to M$ is easily constructed: let $\{\bar{x}_i\}$ be a basis for $N$, and let
$r(\bar{x}_i) = x_i$ for some $x_i$ such that $f(x_i) = \bar{x}_i$. This is well-defined since $N$ is free.
We clearly have $fr = 1$, so we get a splitting $M \cong N \oplus T \cong R^n \oplus \bigoplus R/(q_i)$.
Finally, we show uniqueness of the representation of $M$. Given $M \cong R^n \oplus \bigoplus R/(q_i)$,
it is clear that $T \cong \bigoplus R/(q_i)$ and $N \cong R^n$. The uniqueness thus
follows from the uniqueness of the $q_i$ and $n$ given in Corollary 3.10 and Theorem
3.17.
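For $R = \mathbb{Z}$ and Betti number $0$, the theorem says that the abelian groups of order $n$ correspond to the ways of choosing elementary divisors $q_i = p_i^{d_i}$ with $\prod q_i = n$. That amounts to choosing a partition of each exponent in the prime factorization of $n$. The Python sketch below enumerates them; the function names are hypothetical and each group is represented simply by its tuple of elementary divisors.

```python
from itertools import product

def partitions(d, largest=None):
    """Yield the partitions of d as non-increasing tuples of positive integers."""
    if largest is None:
        largest = d
    if d == 0:
        yield ()
        return
    for first in range(min(d, largest), 0, -1):
        for rest in partitions(d - first, first):
            yield (first,) + rest

def abelian_groups(n):
    """List the elementary divisors of each abelian group of order n >= 1."""
    prime_exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            prime_exponents[p] = prime_exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        prime_exponents[n] = 1
    # For each prime p with exponent d, a partition (e_1, e_2, ...) of d
    # gives the summands Z/p^e_1 + Z/p^e_2 + ...
    choices = [[tuple(p ** e for e in part) for part in partitions(d)]
               for p, d in prime_exponents.items()]
    return [sum(combo, ()) for combo in product(*choices)]
```

For example, `abelian_groups(72)` has six entries, from `(8, 9)` (the cyclic group $\mathbb{Z}/72$) to `(2, 2, 2, 3, 3)`.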
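$$\begin{pmatrix}
a & 0 & 0 & 0 & \cdots & 0 & 0 \\
1 & a & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & a & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & a & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & a & 0 \\
0 & 0 & 0 & 0 & \cdots & 1 & a
\end{pmatrix}$$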
A matrix of this form is called a Jordan block. Note in particular that every
Jordan block is lower triangular.10 A matrix is said to be in Jordan normal
form if it is a direct sum of Jordan blocks. The classification theorem for
finitely generated torsion k[x]-modules thus gives:
Corollary 4.1. Let k be an algebraically closed field. Then every square matrix
over k is conjugate to a Jordan normal form matrix. That is, every linear
transformation of a finite-dimensional k-vector space is in Jordan normal form
with respect to some basis. Two matrices are conjugate iff their Jordan normal
forms are the same up to permuting the Jordan blocks.
10 By convention, the order of the basis is often reversed, so that Jordan blocks are upper
triangular instead of lower triangular.
Corollary 4.2. Let k be an algebraically closed field. Then every linear transformation of a finite-dimensional k-vector space is lower triangular11 with respect to some basis.
Proof. Jordan normal forms are lower triangular.
Jordan normal forms enormously simplify much of linear algebra. For example, the determinant of a triangular matrix is just the product of its diagonal
entries.
If $(V, A) \cong \bigoplus k[x]/((x - a_i)^{n_i})$, the polynomials $(x - a_i)^{n_i}$ are called
the elementary divisors of the linear transformation $A$ on $V$. The product $\prod (x - a_i)^{n_i}$
is called the characteristic polynomial $\chi_A(x)$. Note that if we were working over
$\mathbb{Z}$ instead of $k[x]$, the product of the elementary divisors of a finitely generated
torsion module would be exactly its order as a group. Thus in some sense we
could consider the characteristic polynomial to measure the "size" of $(V, A)$. For
example, it is easy to see that $\deg(\chi_A) = \dim V$.
Proposition 4.3. Let $k$ be algebraically closed and $A$ be a matrix over $k$. The
characteristic polynomial is $\chi_A(x) = \det(xI - A)$, where $xI - A$ is a matrix with
entries in $k[x]$.
Proof. Since conjugating $A$ changes neither side, we may assume $A$ is in Jordan normal form.
The matrix $xI - A$ over the ring $k[x]$ is then the same as the matrix of $-A$,
except that its diagonal entries are $x - a_i$ instead of $-a_i$. It is triangular, so its determinant is
$\prod (x - a_i)^{n_i} = \chi_A(x)$.
Recall that an eigenvalue of a linear map $A$ is a scalar $\lambda \in k$ such that $\lambda I - A$
is not invertible, i.e. there exists a nonzero $v \in V$ such that $Av = \lambda v$. Note that
the eigenvalues of $A$ are exactly the roots of the polynomial $\chi_A(x)$: $\lambda I - A$ is
not invertible iff $0 = \det(\lambda I - A) = \chi_A(\lambda)$.
What are the roots of $\chi_A$? Since $\chi_A = \prod_i (x - a_i)^{n_i}$ for $(x - a_i)^{n_i}$ the
elementary divisors of $A$, the roots are the $a_i$. This is also easy to see quite
concretely from the Jordan normal form: if $A$ is in Jordan normal form and
$Av = \lambda v$, explicit computations with a basis fairly easily show that $\lambda$ is one of
the $a_i$.
The following theorem is nearly immediate using Jordan normal forms.
Theorem 4.4 (Cayley-Hamilton over Algebraically Closed Fields). Let $A$ be a
square matrix over an algebraically closed field $k$. Then $\chi_A(A) = 0$.
Proof. We may assume $A$ is in Jordan normal form, and we can split it into
Jordan blocks. But on a $(x - a_i)^{n_i}$ Jordan block, it is easy to explicitly compute
that $(A - a_i I)^{n_i} = 0$. Alternatively, note that $A$ represents multiplication by $x$
on the module $k[x]/((x - a_i)^{n_i})$, so clearly $(A - a_i I)^{n_i}$ represents multiplication
by 0. In any case, we get $\chi_A(A) = \prod_i (A - a_i I)^{n_i} = 0$.
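The "easy to explicitly compute" step can be checked by machine for any particular block. The Python sketch below builds a lower-triangular Jordan block, matching the convention used in these notes, and computes powers of $J - aI$; the function names are hypothetical.

```python
def jordan_block(a, n):
    """Lower-triangular n x n Jordan block with eigenvalue a."""
    return [[a if i == j else 1 if i == j + 1 else 0 for j in range(n)]
            for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted_power(J, a, m):
    """Return (J - a*I)^m."""
    n = len(J)
    N = [[J[i][j] - (a if i == j else 0) for j in range(n)] for i in range(n)]
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, N)
    return result
```

For a $3 \times 3$ block with eigenvalue $5$, `shifted_power(J, 5, 3)` is the zero matrix while `shifted_power(J, 5, 2)` is not, so the exponent $n_i$ in $(A - a_i I)^{n_i} = 0$ cannot be lowered on that block.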
All of this is well and good, but what if our field is not algebraically closed?
Then irreducible polynomials are not always linear, so we do not have Jordan
11 Or,
$$D = \begin{pmatrix}
A(f_1) & 0 & \cdots & 0 \\
0 & A(f_2) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A(f_n)
\end{pmatrix}.$$
It follows that $A$ and $B$ are conjugate iff they have the same invariant
factors $f_i$. Also, the invariant factors of $A$ are the same whether we think of
it as a matrix over k or over K. Indeed, A is conjugate to D over k and hence
also over K, and by uniqueness of D, the invariant factors are the same over
both fields.
Now suppose A and B are conjugate over K. Then they have the same
invariant factors over K. But these are the same as the invariant factors over
k, so A and B are also conjugate over k.
Proposition 4.5 is quite useful. For example, we could have two real matrices which we know are diagonalizable over C and have the same (complex)
eigenvalues. This implies that they are conjugate over C, so we can conclude
that they are conjugate over R.
We can also talk about characteristic polynomials over non-algebraically
closed fields. The elementary divisors (i.e., the torsion coefficients of $(V, A)$ as a
$k[x]$-module) are no longer necessarily powers of linear polynomials, but we can
still define $\chi_A$ as the product of $A$'s elementary divisors. Note that it is easy to
see that the product of the invariant factors is the same as the product of the
elementary divisors (since the elementary divisors are just the irreducible-power
factors of the invariant factors). Since the invariant factors are invariant under
extending the field, so is the product of the elementary divisors, i.e. $\chi_A$. We
thus obtain:
Proposition 4.6. Let $k$ be any field and $A$ be a matrix over $k$. The characteristic polynomial is $\chi_A(x) = \det(xI - A)$, where $xI - A$ is a matrix with entries
in $k[x]$.
Proof. Let $K$ be an algebraically closed field containing $k$. By the discussion
above, we can compute $\chi_A$ over $K$ instead of over $k$. However, $\det(xI - A)$
clearly also doesn't depend on what field we're working over. Thus the result
follows by Proposition 4.3.
This could also be proven by explicitly writing down the matrices $A(f)$ and
computing that $\det(xI - A(f)) = f(x)$.
By a similar method, we can prove the Cayley-Hamilton Theorem over any
field.
Theorem 4.7 (Cayley-Hamilton). Let $A$ be a square matrix over a field $k$.
Then $\chi_A(A) = 0$.
Proof. Let $K$ be an algebraically closed field containing $k$. Then by the discussion above, $\chi_A$ is the same over $k$ and over $K$. By Theorem 4.4, we have
$\chi_A(A) = 0$ over $K$, and hence also over $k$.
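For $2 \times 2$ matrices the theorem can be checked by hand, since $\det(xI - A) = x^2 - \mathrm{tr}(A)\,x + \det(A)$. The Python sketch below does this check with exact integer (or rational) entries; the function names are hypothetical, and the $2 \times 2$ restriction is only there to keep the formulas explicit.

```python
def char_poly_2x2(A):
    """Return (c0, c1) with chi_A(x) = x^2 + c1*x + c0 = det(xI - A)."""
    (a, b), (c, d) = A
    return a * d - b * c, -(a + d)

def cayley_hamilton_residual(A):
    """Return chi_A(A) = A^2 + c1*A + c0*I, which should be the zero matrix."""
    (a, b), (c, d) = A
    c0, c1 = char_poly_2x2(A)
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    return [[A2[0][0] + c1 * a + c0, A2[0][1] + c1 * b],
            [A2[1][0] + c1 * c, A2[1][1] + c1 * d + c0]]
```

The matrix `[[0, -1], [1, 0]]` is a good test case: its characteristic polynomial is $x^2 + 1$, which has no roots in $\mathbb{Q}$ or $\mathbb{R}$, so there is no Jordan normal form over those fields, yet `cayley_hamilton_residual` still returns the zero matrix.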
Let's look at some more concrete applications of all this theory.
Theorem 4.8. Diagonalizable matrices are dense in $M_n(\mathbb{C})$. That is, for any
complex matrix $A$ and any $\epsilon > 0$, there is a diagonalizable matrix $B$ such that
every entry of $B - A$ is smaller than $\epsilon$ in absolute value.
Proof. Since conjugation by any invertible matrix is continuous, we may assume
$A$ is in Jordan normal form. By modifying $A$ by less than $\epsilon$ and only on the
diagonal, we can obtain a triangular matrix $B$ all of whose diagonal entries are
distinct; call them $b_i$. But then $\chi_B(x) = \prod (x - b_i)$, and it follows that $B$'s
elementary divisors are $x - b_i$ so its Jordan normal form is diagonal.
Theorem 4.8 is sometimes useful when trying to prove a property of a continuous function of matrices, since it then suffices to check that property for
diagonalizable matrices.
As another application, let's classify all $2 \times 2$ real matrices up to conjugation.
Over $\mathbb{C}$, the only possible Jordan normal forms are
$$D_{ab} = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \quad\text{and}\quad T_a = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix}.$$
If $D_{ab}$ is conjugate to a real matrix, its trace $a + b$ and determinant
$ab$ must be real, and it easily follows that $a$ and $b$ must either be conjugate
complex numbers or both be real. On the other hand, for any $z = x + iy \in \mathbb{C}$,
there does indeed exist a real matrix conjugate to $D_{z\bar{z}}$, namely
$$\begin{pmatrix} x & -y \\ y & x \end{pmatrix}.$$
If $T_a$ is conjugate to a real matrix, similar considerations show that $a \in \mathbb{R}$.
Thus we have the following result:
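Theorem 4.9. Every real $2 \times 2$ matrix is conjugate to exactly one matrix of
the form
$$\begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}, \quad \begin{pmatrix} x & -y \\ y & x \end{pmatrix}, \quad\text{or}\quad \begin{pmatrix} x & 1 \\ 0 & x \end{pmatrix}$$
(for $x, y \in \mathbb{R}$).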
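Which of the three families a given matrix falls into can be read off from its trace and determinant, via the discriminant of $x^2 - \mathrm{tr}(A)\,x + \det(A)$. The Python sketch below is one way to phrase that test; the function name and the string labels are hypothetical, and it assumes exact entries (integers or `Fraction`s), since comparing a floating-point discriminant with zero is unreliable.

```python
def normal_form_type(A):
    """Name the family from Theorem 4.9 that the real matrix A = [[a, b], [c, d]] is conjugate to."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det      # discriminant of x^2 - trace*x + det
    if disc > 0:
        return "diagonal"               # two distinct real eigenvalues
    if disc < 0:
        return "rotation"               # [[x, -y], [y, x]] with y != 0
    # Repeated eigenvalue: A is diagonalizable only if it is already scalar.
    if b == 0 and c == 0:
        return "scalar"                 # diagonal with x == y
    return "jordan"                     # [[x, 1], [0, x]]
```

So `[[1, 2], [3, 4]]` is of diagonal type, `[[0, -1], [1, 0]]` is of rotation type, and `[[1, 1], [0, 1]]` is a genuine Jordan block.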
4.1 Exercises
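$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 3 & 1 \end{pmatrix}$$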