
University of New South Wales School of Mathematics

MATH2601

Higher Linear Algebra

2. VECTOR SPACES AND LINEAR TRANSFORMATIONS


Definition. Let F be a field. A vector space over F consists of a set
V , a binary operation on V (called addition and written u + v) and a
function from F × V to V (called scalar multiplication and written λv),
for which the following properties hold.
1. Associativity of addition. For all u, v, w in V we have (u + v) + w =
u + (v + w).
2. Commutativity of addition. For all v, w in V we have v + w = w + v.
3. Zero element. There exists an element 0 of V such that if v ∈ V
then v + 0 = v.
4. Negatives. For each element v of V there is an element w of V such
that v + w = 0. (We usually denote this element by −v.)
5. "Associativity" of multiplication. For all λ, μ ∈ F and all v ∈ V we
have (λμ)v = λ(μv).
6. Multiplicative identity. For each v ∈ V we have 1v = v.
7. Vector distributive law. For all λ in F and all v, w in V we have
λ(v + w) = λv + λw.
8. Scalar distributive law. For all λ, μ in F and all v in V we have
(λ + μ)v = λv + μv.

Comments.
The elements of V are usually referred to as vectors and the field
elements as scalars. The properties are called the vector space
axioms.
The first four axioms show that (V, +) is an abelian group. Hence all
the results we know about groups apply to the addition of vectors.

Examples. You should be very familiar with many of the following
examples, so we won't say all that much about them.
1. The set Fⁿ consists of all n-tuples (a1, a2, ..., an) of elements of F.
It is a vector space over F, under the standard operations
(a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)
λ(a1, a2, ..., an) = (λa1, λa2, ..., λan).
In first year you did a lot of work in R² and R³, and some in R⁴
and so on.
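As an aside (ours, not part of the notes), the operations on Fⁿ are easy to check with exact rational arithmetic; the sketch below takes F = Q and n = 3, with `vadd` and `smul` as hypothetical names for the two operations, and verifies the scalar distributive law on sample values.

```python
from fractions import Fraction as F

def vadd(u, v):
    # componentwise addition in F^n
    return tuple(a + b for a, b in zip(u, v))

def smul(lam, v):
    # componentwise scalar multiplication in F^n
    return tuple(lam * a for a in v)

u = (F(1), F(2), F(3))
v = (F(1, 2), F(0), F(-3))
lam, mu = F(2, 3), F(5)

# scalar distributive law: (lam + mu)v = lam v + mu v
lhs = smul(lam + mu, v)
rhs = vadd(smul(lam, v), smul(mu, v))
```

Because every entry is a `Fraction`, the comparison is exact rather than approximate.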
2. The set of geometric vectors in 2 dimensions (or in n dimensions)
consists of "arrows" which are added by placing them head to
tail; scalar multiplication by λ consists of multiplying an arrow's length
by |λ| while leaving its direction unchanged (or reversed in the case of a
negative scalar). You may well be so used to treating these vectors
as n-tuples that you believe they are identical to n-tuples. But
really, of course, an arrow is not the same as a list of numbers!
To express this example more precisely: two arrows with the same
length and direction are regarded as being the same, no matter
where they start. So rather than individual arrows, we should be
talking about equivalence classes of arrows.
3. For any positive integers m, n, the set Mm,n(F) of m × n matrices
with entries from F is a vector space under the usual matrix
operations.
4. The set P(F) of polynomials with coefficients in F is a vector space.
So is Pn(F), the set of such polynomials having degree at most n.
5. Let X be a nonempty set. The set of all functions from X to F is
a vector space over F. Operations are defined by
(f + g)(x) = f(x) + g(x) for all x ∈ X;
(λf)(x) = λf(x) for all x ∈ X.
This is worth pondering, as it is very easy to imagine that we are
not saying anything with these definitions. So, what are we saying?
Note carefully that the field contains a zero element 0, which is not
the same as the vector 0.
It is not necessary that the set X be a vector space. (Why not?)
Exercise. What are we really talking about if X = { 1, 2, ..., n }?
What if X = { 1, ..., m } × { 1, ..., n }?
Note that, strictly speaking, a polynomial is not a function. (However, a polynomial function is a function. Confused yet?)
Question. Why did I put quotation marks around the word "associativity" in axiom 5?

6. A class of examples which is very important in number theory: let
K and L be fields with K ⊆ L ⊆ C. We often call L a field
extension of K. Then L is a vector space over K, where the vector
space operations of addition and scalar multiplication are the usual
field operations of addition and multiplication in C.
For instance, the set Q(√2) is by definition the smallest field
which contains Q and √2. It is not very hard to show that
Q(√2) = { x + y√2 | x, y ∈ Q },
and this is a vector space over Q. Similarly, Q(√2, √3) is a vector
space over Q(√2).
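A computational aside (ours, not in the notes): elements x + y√2 of Q(√2) can be stored exactly as pairs (x, y) of rationals. The sketch below checks that sums, products, and rational scalar multiples stay in this form, which is the closure behind both the field claim and the vector-space claim.

```python
from fractions import Fraction as F

# represent x + y*sqrt(2) by the pair (x, y) with x, y rational
def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    # (x + y*sqrt2)(u + v*sqrt2) = (xu + 2yv) + (xv + yu)*sqrt2
    x, y = p
    u, v = q
    return (x * u + 2 * y * v, x * v + y * u)

def smul(lam, p):
    # scalar multiplication by a rational lam
    return (lam * p[0], lam * p[1])

a = (F(1), F(1))       # 1 + sqrt2
b = (F(3), F(-2))      # 3 - 2*sqrt2
prod = mul(a, b)       # still of the form x + y*sqrt2: closure
```

The pair (x, y) is exactly the coordinate description { x + y√2 | x, y ∈ Q } from the notes.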

7. In the same way as we emphasized in discussing groups, the operations in a vector space need not be conventional addition and
multiplication: as long as the axioms are satisfied, it's a vector
space. For example, take V = R⁺, the set of positive real numbers,
and define addition and scalar multiplication by
v ⊕ w = vw and λ ⊙ v = v^λ.
That is, "addition" is really multiplication, and "multiplication" is
really exponentiation. We have used different symbols to make it
clear that we are not talking about standard operations. Exercise.
(a) Prove that under these operations, V is a vector space over R.
(b) Summarise your proof in one word!
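As a numerical sanity check (ours, with `vadd` and `smul` as hypothetical names for the exotic operations on V = R⁺), several of the axioms can be tested on sample values before attempting the proof:

```python
import math

def vadd(v, w):        # "addition" on R+ is really multiplication
    return v * w

def smul(lam, v):      # "scalar multiplication" is really exponentiation
    return v ** lam

v, w, lam, mu = 2.0, 5.0, 3.0, -1.5

ok_zero  = math.isclose(vadd(v, 1.0), v)                    # the zero vector is 1
ok_dist1 = math.isclose(smul(lam, vadd(v, w)),
                        vadd(smul(lam, v), smul(lam, w)))   # vector distributive law
ok_dist2 = math.isclose(smul(lam + mu, v),
                        vadd(smul(lam, v), smul(mu, v)))    # scalar distributive law
ok_assoc = math.isclose(smul(lam * mu, v),
                        smul(lam, smul(mu, v)))             # "associativity"
```

A numerical check is of course no proof, but it makes the one-word summary easier to spot.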

Lemma. Basic properties of vector spaces. Let V be a vector space.
For all v, w ∈ V and all scalars λ we have
1. 0v = 0;
2. λ0 = 0;
3. (−1)v = −v;
4. if λv = λw and λ ≠ 0, then v = w;
5. if λv = 0 then either λ = 0 or v = 0.
Proof of the third result. By axiom 6, axiom 8, a field calculation and
part 1 of this lemma,
v + (−1)v = 1v + (−1)v = (1 + (−1))v = 0v = 0.
By uniqueness of inverses in the group (V, +) we have (−1)v = −v.
Definition. Let V be a vector space and W a subset of V . If W is a
vector space with the same scalar field and the same operations as V ,
then W is said to be a subspace of V .
Lemma. Subspace lemma. Let V be a vector space over the field F,
and let W be a subset of V . Then W is a subspace of V if and only if
the following conditions hold.
1. W is not empty.
2. Closure under addition. For all v, w ∈ W we have v + w ∈ W .
3. Closure under scalar multiplication. For all v ∈ W and all λ ∈ F
we have λv ∈ W .
Proof. Exercise. In particular, make sure you can explain clearly why
W satisfies axioms 3 and 4.
Comments.
An equivalent alternative for condition 1 is that W contain the zero
vector.
An equivalent alternative for the second and third conditions: if
v, w ∈ W and λ ∈ F then λv + w ∈ W .
Examples.
1. The subspaces of R³ are the trivial subspace { 0 }; lines through the
origin; planes through the origin; R³ itself.
Note that a plane through the origin in R³ should not be described
as R². Strictly speaking, R² is not even a subspace of R³, because
the vectors in R² are pairs, while those in R³ are triples. However,
{ (x, y, 0) | x, y ∈ R },
which "looks like" R², is a subspace of R³.
2. The space Mn,n(R) of real square matrices has various important
subspaces, for example,
{ symmetric matrices } and { skew-symmetric matrices }.
3. Let V be the vector space of functions f : X → F, and let Y be a
subset of X. Then
{ f ∈ V | f(y) = 0 for all y ∈ Y }
is a subspace of V .
4. The vector space R⁺ from example 7 is not a subspace of R.
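A quick computational illustration (ours) of the subspace lemma for example 2: the symmetric 2 × 2 matrices pass all three tests — nonempty (they contain the zero matrix) and closed under both operations.

```python
import numpy as np

def is_symmetric(M):
    # a matrix is symmetric when it equals its own transpose
    return np.array_equal(M, M.T)

A = np.array([[1., 2.], [2., 5.]])
B = np.array([[0., -3.], [-3., 4.]])
lam = 2.5

has_zero    = is_symmetric(np.zeros((2, 2)))  # the set is not empty
closed_add  = is_symmetric(A + B)             # closure under addition
closed_smul = is_symmetric(lam * A)           # closure under scalar multiplication
```

The same three checks, with `M.T` replaced by `-M.T`, cover the skew-symmetric matrices.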
Linear combinations, span, linear independence.
Definition. Let V be a vector space over F. A linear combination of
vectors v1, v2, ..., vn in V is any vector which can be expressed
λ1v1 + λ2v2 + ... + λnvn,
where the λk are scalars. If S is a subset of V , then the span of S is
span S = { linear combinations of vectors in S } .
We say that S spans V if span S = V .
The point of the last item in the definition is that span S is always a
subset of V ; sometimes this subset is equal to the whole of V , sometimes
it is only a part of V . Indeed, span S is not just a subset but a subspace
of V , as stated in the next result.
Lemma. If S is a subset of a vector space V , then span(S) is a subspace
of V .

Examples.
1. The span of two nonzero vectors in R³ is a plane through the origin,
unless the vectors are scalar multiples of each other, in which case
it is a line through the origin.
2. For any n ∈ N we have span{ 1, t, t², ..., tⁿ } = Pn.
3. Any field can be considered as a vector space over itself. Consider
the vector space Q of rational numbers over the field Q, and the set
S = { 1, 1/10, 1/100, 1/1000, ... } ⊆ Q.
We have
π = 3(1) + 1(1/10) + 4(1/100) + 1(1/1000) + 5(1/10000) + ... ;
but π is not an element of Q. How can we reconcile this with the
preceding lemma?
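A numerical aside (ours): the resolution turns on the fact that span involves only finite linear combinations. Every partial sum of the decimal expansion is a finite combination of elements of S and hence a rational number in span(S), while the limit π is not. A sketch:

```python
from fractions import Fraction
import math

# the digits 3.1415926...: each partial sum is a FINITE linear
# combination of elements of S = {1, 1/10, 1/100, ...}
digits = [3, 1, 4, 1, 5, 9, 2, 6]
partial = sum(Fraction(d, 10 ** k) for k, d in enumerate(digits))

# the partial sum is an exact rational, so it lies in span(S) = Q;
# it approaches pi, but pi itself requires infinitely many terms
err = abs(float(partial) - math.pi)
```

So span(S) contains every truncation of the series, but not its limit.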
Definition. A subset S of a vector space V is linearly independent
if for all vectors v1, v2, ..., vn in S (with n ≥ 1) the equation
λ1v1 + λ2v2 + ... + λnvn = 0
has a unique solution for the scalars λ1, λ2, ..., λn in F. A set which is
not linearly independent is said to be linearly dependent.
Note that in the definition of linear independence, once again we are
considering specifically finite linear combinations.
Example. It is easy to see that the polynomials
p1 = 1 + 2t + 3t², p2 = 4 + 5t + 6t², p3 = 14 + 25t + 36t²
are linearly dependent. (How?) For a slightly harder example take
p1 = 1 + 2t + 3t², p2 = 4 + 5t + 6t², p3 = 1 + 2t².
Following the definition, we consider the equation
λ1p1 + λ2p2 + λ3p3 = 0,
where the right hand side is the zero polynomial. Substituting the given
polynomials,
λ1(1 + 2t + 3t²) + λ2(4 + 5t + 6t²) + λ3(1 + 2t²) = 0 + 0t + 0t²;
expanding and equating coefficients,
λ1 + 4λ2 + λ3 = 0,
2λ1 + 5λ2 = 0,
3λ1 + 6λ2 + 2λ3 = 0.
Writing as an augmented matrix and row-reducing,

    [ 1 4 1 | 0 ]      [ 1  4  1 | 0 ]
    [ 2 5 0 | 0 ]  →   [ 0 −3 −2 | 0 ]     (∗)
    [ 3 6 2 | 0 ]      [ 0  0  3 | 0 ]

As the left hand side of the echelon form has no non-leading columns,
the system has a unique solution. Therefore p1, p2, p3 are linearly independent.
Comment. You are probably used to putting the polynomial
coefficients directly into a matrix and starting at (∗) for this kind of
problem. That's fine, as long as you understand the background and
know where the matrix comes from. In particular, make sure you understand why the coefficients of each polynomial must always become a
column, and not a row, of the matrix.
Basis, dimension and coordinates.
Definition. A basis for a vector space V is a linearly independent subset
of V which spans V .
Examples. The idea is that we are looking for a minimal set of "basic"
vectors from which we can form, by means of linear combinations, all
vectors in V . (The word "basic" here is not a joke, it is the reason
for the term "basis"!) The following examples are intended to give you
some intuitive feel for how you can find a basis of a given vector space;
you should be able also to give a formal proof that each basis is actually
correct.
1. To obtain all vectors in Rⁿ in terms of addition and (real) scalar
multiplication, we need to be able to specify independently all the
unknowns in the vector
v = (a1, a2, ..., an).
This can be written as
v = a1(1, 0, ..., 0) + a2(0, 1, ..., 0) + ... + an(0, 0, ..., 1);
we can obtain all these expressions if we begin with the vectors
B = { (1, 0, ..., 0), (0, 1, ..., 0), ..., (0, 0, ..., 1) },
and it seems clear that fewer than this many vectors will not suffice.
Therefore B seems to be a basis for Rⁿ.
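A sketch (ours) of how the earlier independence check for p1, p2, p3 can be automated: the polynomials are independent exactly when the matrix whose columns are their coefficient vectors has full column rank.

```python
import numpy as np

# coefficient vectors of p1 = 1+2t+3t^2, p2 = 4+5t+6t^2, p3 = 1+2t^2,
# entered as the COLUMNS of the matrix, as in the notes
A = np.array([[1., 4., 1.],
              [2., 5., 0.],
              [3., 6., 2.]])

# full column rank <=> the homogeneous system has only the trivial
# solution <=> p1, p2, p3 are linearly independent
rank = np.linalg.matrix_rank(A)
independent = (rank == 3)
```

The rank computation plays the role of the row reduction done by hand.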
2. Exactly the same basis works for Cⁿ as a vector space over C. But
what if we want Cⁿ as a vector space over R?
3. Consider
V = { (x1, x2, ..., xn) ∈ Rⁿ | x1 + x2 + ... + xn = 0 },
a subspace of Rⁿ. In this case the vector v = (a1, a2, ..., an) in V
is completely fixed once we choose n − 1 of the values a1, a2, ..., an,
say for example the first n − 1. We can write
v = a1(1, 0, ...) + a2(0, 1, ...) + ... ;
remembering that the vectors we pick must be elements of V , a
natural choice is
v = a1(1, 0, ..., 0, −1) + a2(0, 1, ..., 0, −1) + ...
+ a_{n−1}(0, 0, ..., 1, −1).
The n − 1 vectors on the right hand side form a basis for V .
4. Any 3 × 3 real symmetric matrix can be written as a linear combination

    [ a b c ]     [ 1 0 0 ]     [ 0 1 0 ]           [ 0 0 0 ]     [ 0 0 0 ]
    [ b d e ] = a [ 0 0 0 ] + b [ 1 0 0 ] + ... + e [ 0 0 1 ] + f [ 0 0 0 ] .
    [ c e f ]     [ 0 0 0 ]     [ 0 0 0 ]           [ 0 1 0 ]     [ 0 0 1 ]

A basis for the space of 3 × 3 real symmetric matrices is

    B = { [ 1 0 0 ; 0 0 0 ; 0 0 0 ], [ 0 1 0 ; 1 0 0 ; 0 0 0 ], [ 0 0 1 ; 0 0 0 ; 1 0 0 ],
          [ 0 0 0 ; 0 1 0 ; 0 0 0 ], [ 0 0 0 ; 0 0 1 ; 0 1 0 ], [ 0 0 0 ; 0 0 0 ; 0 0 1 ] }

(rows of each matrix separated by semicolons).
5. Find a basis for { X ∈ M2,2 | X [ 1 ; 2 ] = [ 0 ; 0 ] }, where
[ 1 ; 2 ] denotes the column vector with entries 1 and 2.
6. Find a basis for V = { p ∈ P3 | p(5) = 0 }.
7. Find bases for Q(√2), and for Q(∛2), over Q.
8. It may be much harder to find bases for more complicated vector
spaces; indeed, for a space such as
C[0, 1] = { continuous functions f : [0, 1] R }
it is more or less impossible to write down any specific basis. The
assertion that every vector space has a basis is equivalent to the
Axiom of Choice.
How can we find a basis for a vector space if our intuition does not help?
Essentially, there are two options: start big (begin with a spanning
set, then remove vectors one by one until we have a set which still
spans the space but is also independent), or start small (begin with
an independent set, then add vectors one by one until we have a set which
is still independent but also spans the space). Both of these procedures
will run into difficulties if infinitely many vectors are needed to span the
space, so we exclude this possibility in the following theorem.
Theorem. Let V be a vector space over F, and suppose that V has a
finite spanning set.
1. If S is a finite spanning set for V , then S contains a basis for V .
2. If T is a linearly independent subset of V , then T can be extended
to a basis of V : that is, there is a basis of V which contains T .
3. Any two bases of V have the same number of elements.

In order to prove this theorem we need a few easy observations about


linear independence and spans, followed by a more difficult but very
important lemma.
Observations. Properties of independent sets and spans.
1. Any subset of a linearly independent set is linearly independent.
2. (a) If v ∈ span(S), then S ∪ { v } is linearly dependent.
(b) If S is linearly independent and S ∪ { v } is linearly dependent,
then v ∈ span(S).
3. If S1 ⊆ S2, then span(S1) ⊆ span(S2).
4. We have span(S ∪ { v }) = span(S) if and only if v ∈ span(S).
5. If S is linearly dependent, then there is a vector v in S such that
span(S \ { v }) = span(S).
Proof: exercise.

Lemma. The Exchange Lemma. Suppose that S0 is a finite spanning
set for V and that T is a linearly independent subset of V . Then there
is a spanning set S of V such that
T ⊆ S and |S| = |S0|.

Proof. Suppose that |S0| = n. We shall first prove that the lemma is
true when T = { u1, u2, ..., um } is a finite independent set and m ≤ n;
we shall then show that T cannot have more than n elements (and in
particular, T cannot be infinite).
The proof will be by induction on m. The case m = 0 is clear, since we
can choose S = S0 .
For the inductive step, suppose that the result is true for some specific
nonnegative integer m < n, and consider a linearly independent set
T = { u1 , u2 , . . . , um+1 }. Since { u1 , u2 , . . . , um } is also independent,
V has a spanning set
S1 = { u1 , u2 , . . . , um , vm+1 , . . . , vn } ,
and it follows that
S2 = { u1 , u2 , . . . , um , um+1 , vm+1 , . . . , vn }

is a linearly dependent spanning set of V . Of the sets


{ u1 } ,

{ u1 , u2 } , . . . ,

{ u1 , u2 , . . . , um , um+1 } ,

{ u1 , u2 , . . . , um , um+1 , vm+1 } , . . . ,
{ u1 , u2 , . . . , um , um+1 , vm+1 , . . . , vm+k } , . . . ,
{ u1 , u2 , . . . , um , um+1 , vm+1 , . . . , vn } ,

consider the first which is linearly dependent. By assumption, it cannot
be any of the first m + 1 sets (those involving vectors uj only), so it must
be, say,
R = { u1, u2, ..., um, u_{m+1}, v_{m+1}, ..., v },
where v is one of the vectors v_{m+1}, ..., vn. Using properties 2, 3 and 4
from the preceding observations, we have
v ∈ span(R \ { v }) ⊆ span(S2 \ { v }),
and hence
span(S2 \ { v }) = span(S2) = V.
Now take S = S2 \ { v }. Then we have just shown that S spans V ;
also, T ⊆ S. Moreover, S has one element fewer than S2, and S2 has
one element more than S1, so |S| = |S1| = n. This completes the proof
of the inductive step.
It remains to show that T cannot have more than n elements. If it
were so, then by what we have just proved, { u1 , u2 , . . . , un } would be a
spanning set for V ; so { u1 , u2 , . . . , un+1 } would be linearly dependent;
so T would be linearly dependent, contrary to assumption. Hence, T
has at most n elements.
Corollary. If S is a finite spanning set for a vector space V and T is a
linearly independent subset of V , then |T| ≤ |S|.
We can now return to the proof of the theorem stated earlier.
1. Suppose that S is a finite spanning set for V . Let B be a minimal
subset of S such that B spans V . Then B is independent (otherwise
observation 5 shows that there is a smaller spanning subset), and
therefore is a basis for V .

2. Let S be a finite spanning set for V , and let T = { u1, u2, ..., um }
be a linearly independent subset of V . From the exchange lemma
we know that T is finite, and so S ∪ T is a finite spanning set for
V . Let B be a minimal subset of S ∪ T which satisfies T ⊆ B and
span(B) = V , and write B = { u1, ..., um, v_{m+1}, ..., vn }. Now
suppose that B is linearly dependent. Then
μ1u1 + ... + μmum + λ_{m+1}v_{m+1} + ... + λnvn = 0
for some scalars μ1, ..., μm, λ_{m+1}, ..., λn, not all zero. If all the
λk are zero then u1, u2, ..., um are dependent, which is not the case;
so at least one λk is nonzero. But then vk ∈ span(B \ { vk }), so
B \ { vk } is a spanning set which contains T, and this contradicts
the minimality of B. Therefore B is independent, and is hence a
basis for V .
3. Suppose that B1 and B2 are bases for V , a vector space with a
finite spanning set. Since B1 is linearly independent and B2 spans
V , the corollary says that |B1| ≤ |B2|. Similarly |B2| ≤ |B1|, and
this completes the proof.
Definition. A vector space V is finite-dimensional if it has a finite
spanning set; in this case the dimension of V , written dim V , is the
number of vectors in a basis of V . If V is not finite-dimensional it is
said to be infinite-dimensional.
Comments and examples.
The definition of dim V makes sense because of part 3 from the
previous theorem.
By choosing convenient bases (known as standard bases), it is not
very hard to show that dim Fⁿ = n and dim Mm,n(F) = mn and
dim Pn(F) = n + 1, assuming all of these are vector spaces over F.
Exercise. What is the dimension of Cⁿ as a vector space over R?
What about the space of all symmetric n × n matrices over a field
F? Skew-symmetric matrices?
The set P(F) of all polynomials with coefficients in F is infinite-dimensional.
Let α be a complex number which is a root of a polynomial of degree
n having rational coefficients, but not a root of any smaller-degree
polynomial. Then { 1, α, α², ..., α^(n−1) } is a basis for Q(α) over Q,
and so dim Q(α) = n. If α is not a root of any such polynomial
then Q(α) is infinite-dimensional.
The corollary to the Exchange Lemma says, essentially, that any independent set T is no larger than any spanning set S. The sets S and T
can be of the same size only if they are both independent and spanning
sets, that is, bases, as stated in the following result.

of R3 . Solution. From the definition of the coordinate vector, we


need to solve
x1 (1, 2, 1) + x2 (2, 5, 3) + x3 (5, 7, 3) = (9, 1, 5) .
In scalar form this is the system
x1 + 2x2 + 5x3 = 9

Lemma. Let V be a vector space of (finite) dimension n. In V , an independent set containing n vectors is a basis; and a spanning set containing
n vectors is a basis.
Proof of the first claim. Let B be a basis for V and let T be an
independent set containing n vectors. By definition, B is a spanning
set, and by assumption it is finite; so by the Exchange Lemma, there is
a spanning set S such that T S and |S| = |B| = n. Since T and S
have the same number of elements, T = S: thus T is independent, it is
a spanning set and it is a basis.
Let B = { b1 , b2 , . . . , bn } be a basis for a (finitedimensional) vector space V , and let v V . Since B spans V , there exist scalars
x1 , x2 , . . . , xn such that
v = x1 b1 + x2 b2 + + xn bn ;
since B is independent, these scalars are uniquely determined by v and
B. This prompts the following definition.
Definition. Let B = { b1 , b2 , . . . , bn } be an ordered basis for a
vector space V over a field F, and let v V . The coordinate vector
of v with respect to B is the vector (x1 , x2 , . . . , xn ) Fn such that
v = x1 b1 + x2 b2 + + xn bn .
The coordinate vector of v with respect to B will often be denoted [v]B .
Comments and examples.
1. If B and v are given, finding [v]B is just a matter of solving linear
equations. For instance, find the coordinates of (9, 1, 5) with respect
to the ordered basis
B = { (1, 2, 1), (2, 5, 3), (5, 7, 3) }
13

2x1 5x2 7x3 = 1


x1 3x2 3x3 = 5 .

As usual, we

1
2
1

write this in augmented matrix form and rowreduce:


2
5
1 2
5
9
9
1 0 1 3
19 . ()
5 7
5
5
3 3
0
0 1

Solving, the coordinate vector is


x = (x1 , x2 , x3 ) = (8, 4, 5) .
Check that 8(1, 2, 1) 4(2, 5, 3) + 5(5, 7, 3) = (9, 1, 5).
Note that the left hand side of the echelon form in () also confirms
that B really is a basis for R3 .
As we have seen in previous problems, it is fine to begin with the
matrices in (), provided you understand the bits which you are not
writing down.
2. Finding coordinates with respect to standard bases is nearly always
very easy: this is, in fact, the main reason why standard bases are
useful. For example,
the coordinate vector of (a1 , a2 , . . . , an ) with respect to the
standard ordered basis in Rn is (a1 , a2 , . . . , an ), which is the
same vector. (Or is it?)
the standard ordered basis for P3 is { 1, t, t2 , t3 }, and the coordinate vector of 4 t + 7t3 with respect to this basis is
(4, 1, 0, 7).
14
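A computational aside (ours), using a small hypothetical basis of R² rather than the basis in the example: the coordinate vector is found by solving a linear system whose coefficient matrix has the basis vectors as its columns.

```python
import numpy as np

# hypothetical ordered basis of R^2 (not the one in the notes) and a target v
b1, b2 = np.array([1., 1.]), np.array([1., -1.])
v = np.array([3., 1.])

# the columns of A are the basis vectors; [v]_B solves A x = v
A = np.column_stack([b1, b2])
x = np.linalg.solve(A, v)             # the coordinate vector [v]_B
recombined = x[0] * b1 + x[1] * b2    # should reproduce v exactly
```

Solving the system is exactly the row reduction from the worked example, done by the library.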

3. For a carefully chosen basis, it may be possible to find coordinate
vectors by simpler methods than solving equations. Example. Define polynomials p1, p2, ..., pn by
p1(t) = (t − 2)(t − 3)(t − 4)...(t − n)
p2(t) = (t − 1)(t − 3)(t − 4)...(t − n)
p3(t) = (t − 1)(t − 2)(t − 4)...(t − n)
and so on. That is, pk(t) is the product (t − 1)(t − 2)...(t − n) with
the factor (t − k) missing. Given that B = { p1, p2, ..., pn } is a basis
for P_{n−1}, the coordinate vector (λ1, λ2, ..., λn) of a polynomial
p ∈ P_{n−1} satisfies
p(t) = λ1p1(t) + λ2p2(t) + ... + λnpn(t).
We can find the coordinates by substituting t = k, giving directly
p(k) = λkpk(k). Therefore the coordinate vector of p with respect
to B is
[p]_B = ( p(1)/p1(1), p(2)/p2(2), ..., p(n)/pn(n) ).
Can you complete this calculation by finding a formula for pk(k)?
Exercise. Use these ideas to confirm with no extra calculation that
B is a basis for P_{n−1}.
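The substitution trick can be checked by machine; the sketch below (ours) takes n = 3 and the sample polynomial p(t) = t² + 1, computing the coordinates λk = p(k)/pk(k) exactly with rational arithmetic and confirming that they reconstruct p.

```python
from fractions import Fraction as F

n = 3  # work in P_{n-1} = P_2, with basis p1, p2, p3

def p_k(k, t):
    # the product (t - 1)(t - 2)(t - 3) with the factor (t - k) omitted
    out = F(1)
    for j in range(1, n + 1):
        if j != k:
            out = out * (t - j)
    return out

def p(t):
    # sample polynomial p(t) = t^2 + 1
    return t * t + 1

# coordinates lambda_k = p(k) / p_k(k), found by substituting t = k
coords = [p(F(k)) / p_k(k, F(k)) for k in range(1, n + 1)]

def reconstruct(t):
    # lambda_1 p_1(t) + lambda_2 p_2(t) + lambda_3 p_3(t)
    return sum(c * p_k(k, t) for k, c in zip(range(1, n + 1), coords))
```

Since both sides are polynomials of degree at most 2, agreement at a few extra points confirms they are equal.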
Sums and direct sums of vector spaces. Let S be a spanning set for
V . It's true by definition that every vector in V is a sum of multiples of
vectors in S; if S is a basis, this can be done in only one way. We'll see
later, when we revise eigenspaces and introduce generalised eigenspaces,
that it is sometimes useful to clump vectors in S into subsets rather
than to regard them as individual vectors. The present topic is the
background for this.
Definition. Suppose that V is a vector space with subspaces W1 and
W2. The sum of W1 and W2 is the subspace
W1 + W2 = { w1 + w2 | w1 ∈ W1, w2 ∈ W2 }
of V . If W1 ∩ W2 = { 0 } then W1 + W2 is called the direct sum of W1
and W2 and is written W1 ⊕ W2.
Comments.
It's obvious that W1 + W2 is a subset of V , but we didn't actually
prove that it's a subspace. Exercise. Do so! While you're at it,
also prove that W1 ∩ W2 is a subspace of V .
The importance of direct sums is that every vector in W1 ⊕ W2 can
be written in a unique way as a sum w1 + w2. Prove this!

Examples.
1. In R³, the sum of two different lines through the origin will be a
plane through the origin. Moreover, the lines will intersect only at
the origin, so it is a direct sum:
span{ v } ⊕ span{ w } = span{ v, w }.
If we have a line and a plane through the origin in R³, where the
line is not contained in the plane, then any vector in R³ can be
found in a unique way by adding a vector from the line and a vector
from the plane: symbolically,
L ⊕ P = R³.
Two different planes through the origin in R³ must intersect in a
line, so we do not have a direct sum. It is geometrically clear that
we can get any vector in R³ by adding a vector from each plane, so
P1 + P2 = R³.
2. For any positive integer n, the vector space Mn,n of n × n matrices
is the direct sum of its subspaces { symmetric n × n matrices } and
{ skew-symmetric n × n matrices }.
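A computational aside (ours): the unique decomposition in example 2 is given explicitly by the standard formulas X = (A + Aᵀ)/2 and Y = (A − Aᵀ)/2, which are easy to verify:

```python
import numpy as np

A = np.array([[1., 2.], [7., 4.]])

# the unique decomposition A = X + Y with X symmetric, Y skew-symmetric
X = (A + A.T) / 2
Y = (A - A.T) / 2

ok = (np.allclose(X, X.T)        # X is symmetric
      and np.allclose(Y, -Y.T)   # Y is skew-symmetric
      and np.allclose(X + Y, A)) # and they sum to A
```

Uniqueness follows from the fact that the intersection of the two subspaces is { 0 }.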

3. In M2,2(R), describe the sum and the intersection of the subspaces
W1 = { [ a b ; 0 0 ] | a, b ∈ R } and W2 = { [ a 0 ; b 0 ] | a, b ∈ R },
where the semicolon separates the rows of each matrix.

4. Let n be a nonnegative integer and let k, m be nonnegative integers not exceeding n. Consider the vector space Pn, and write
Lk = { a0 + a1t + ... + antⁿ | a_{k+1} = a_{k+2} = ... = an = 0 },
Um = { a0 + a1t + ... + antⁿ | a0 = a1 = ... = a_{m−1} = 0 }.
When is it true that Pn = Lk + Um? When is Pn = Lk ⊕ Um?
5. In any vector space we have span(S) + span(T) = span(S ∪ T).
It is useful to think of these problems in terms of dimension. For the
two planes in example 1, each has dimension 2; they are not the same,
so adding vectors in one plane to those in the other will give a space
of dimension strictly greater than 2; as we are in R³, the only possible
larger dimension is 3. In fact, we can also see intuitively that the planes
have "combined" dimensions 2 + 2; but vectors in the 1-dimensional
intersection will be counted twice; so the dimension of the sum will
be 2 + 2 − 1.
Likewise, consider the polynomials in example 4. Let's be specific
and take n = 7, k = 4, m = 2. Then we have
Lk = span{ 1, t, t², t³, t⁴ } and Um = span{ t², t³, t⁴, t⁵, t⁶, t⁷ }.
It is clear that by adding elements from Lk and Um we can get any
polynomial in Pn = span{ 1, t, t², t³, t⁴, t⁵, t⁶, t⁷ }; and that the spanning
set for Pn consists of those for Lk and Um, with the overlap counted
only once; and so
dim(Lk + Um) = 5 + 6 − 3
= dim(Lk) + dim(Um) − dim(Lk ∩ Um).
Note that we could have expressed the subspaces differently, and there
might not be any actual intersection between their spanning sets; nevertheless, with a bit of care we can prove that this kind of result is
generally true.
Theorem. Dimension of a sum. If V is a finite-dimensional vector
space with subspaces W1 and W2, then
dim(W1 + W2) = dim(W1) + dim(W2) − dim(W1 ∩ W2).
Proof. Let { a1, ..., ak } be a basis for W1 ∩ W2. This can be extended
to bases { a1, ..., ak, b1, ..., bm } and { a1, ..., ak, c1, ..., cn } for W1
and W2 respectively. We shall prove that
B = { a1, ..., ak, b1, ..., bm, c1, ..., cn }
is a basis for W1 + W2. First, it is a spanning set: any vector w in
W1 + W2 can be written as a sum w1 + w2 with w1 ∈ W1 and w2 ∈ W2,
and so we have a linear combination
w = (α1a1 + ... + β1b1 + ...) + (α1′a1 + ... + γ1c1 + ...)
= (α1 + α1′)a1 + ... + β1b1 + ... + γ1c1 + ... ,
which is in span B. To prove independence, suppose that
α1a1 + ... + β1b1 + ... + γ1c1 + ... = 0.     (∗)
This can be rewritten as
α1a1 + ... + β1b1 + ... = −γ1c1 − ... .
Now the left hand side is an element of W1 and the right hand side is
an element of W2; since LHS = RHS, each side is in W1 ∩ W2 and hence
is a linear combination of a1, ..., ak. In particular, we have an equation
of the form −γ1c1 − ... = δ1a1 + ..., that is,
δ1a1 + ... + γ1c1 + ... = 0;
and since the vectors aj, cj are all independent, all the coefficients γj are
zero. The same argument shows that all the βj are zero; substituting
back into (∗) and using a similar argument for a third time shows that
all the αj are zero. Thus B is an independent set, and hence is a basis
for W1 + W2.
It is now easy to complete the proof: we have
dim(W1 + W2) = k + m + n
= (k + m) + (k + n) − k
= dim(W1) + dim(W2) − dim(W1 ∩ W2),
and we are done.
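The formula can be sanity-checked numerically (a sketch of ours, realising the two planes from example 1 as coordinate planes): dim(W1 + W2) is the rank of the stacked basis vectors, and the formula then predicts the dimension of the intersection.

```python
import numpy as np

# two planes through the origin in R^3: W1 = span{e1, e2}, W2 = span{e2, e3}
B1 = np.array([[1., 0., 0.], [0., 1., 0.]])   # rows span W1
B2 = np.array([[0., 1., 0.], [0., 0., 1.]])   # rows span W2

dim_W1 = np.linalg.matrix_rank(B1)
dim_W2 = np.linalg.matrix_rank(B2)
dim_sum = np.linalg.matrix_rank(np.vstack([B1, B2]))  # dim(W1 + W2)

# the dimension-of-a-sum theorem predicts the intersection's dimension
dim_int = dim_W1 + dim_W2 - dim_sum
```

Here the intersection is the line span{ e2 }, matching the predicted 2 + 2 − 3 = 1.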



Corollary. Dimension of a direct sum. For a direct sum we have
dim(W1 ⊕ W2) = dim(W1) + dim(W2).
Comments and examples.
1. It is always good to check a theorem by looking at extreme cases
and other special cases. Confirm that the "dimension of a sum"
theorem is correct when W2 = W1, and also when W2 = V , and
also when W2 = { 0 }.
2. Suppose that a vector space V of dimension 26 has subspaces W1
and W2 of dimension 19 and 11 respectively. Find all possible dimensions of W1 + W2 and W1 ∩ W2.
3. This example shows how, sometimes, thoughtful use of the above
theorem can replace large amounts of boring algebra. Problem.
Prove that every matrix in M2,2 can be written in a unique way as
a sum X + Y , where X and Y are 2 × 2 matrices satisfying
X [ 1 ; 2 ] = [ 0 ; 0 ] and Y [ 3 ; 4 ] = [ 0 ; 0 ]
(here [ 1 ; 2 ] denotes the column vector with entries 1 and 2, and
similarly for the others).
Solution. It is not hard to see that
W1 = { X | X [ 1 ; 2 ] = [ 0 ; 0 ] } and W2 = { Y | Y [ 3 ; 4 ] = [ 0 ; 0 ] }
are 2-dimensional subspaces of M2,2. Moreover, any matrix X in
their intersection satisfies
X [ 1 3 ; 2 4 ] = [ 0 0 ; 0 0 ];
since [ 1 3 ; 2 4 ] is invertible this means that X = 0. So we have a direct
sum with dimension
dim(W1 ⊕ W2) = dim(W1) + dim(W2) = 4;
therefore W1 ⊕ W2 = M2,2, and the result is proved.
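A computational check of this argument (ours): explicit bases for W1 and W2, with each 2 × 2 matrix flattened to a vector of length 4, together span a space of dimension 4, i.e. all of M2,2.

```python
import numpy as np

# bases for W1 = {X : X @ [1,2] = 0} and W2 = {Y : Y @ [3,4] = 0}
W1 = [np.array([[2., -1.], [0., 0.]]), np.array([[0., 0.], [2., -1.]])]
W2 = [np.array([[4., -3.], [0., 0.]]), np.array([[0., 0.], [4., -3.]])]

# confirm membership in each subspace
for X in W1:
    assert np.allclose(X @ np.array([1., 2.]), 0)
for Y in W2:
    assert np.allclose(Y @ np.array([3., 4.]), 0)

# the four matrices together span a space of dimension 4 = dim M2,2,
# so W1 + W2 is all of M2,2 and the sum is direct
stacked = np.vstack([M.flatten() for M in W1 + W2])
total_dim = np.linalg.matrix_rank(stacked)
```

The rank computation replaces the dimension count done by hand in the solution.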

Linear transformations. Consider a function T : V → W between
two vector spaces. Since, fundamentally, the important things about
vector spaces are the operations of addition and scalar multiplication, it
seems plausible that the most important such functions should be those
that preserve these operations.
Definition. Let V and W be vector spaces over a field F. A function
T : V → W is a linear transformation if
1. for all v1, v2 ∈ V we have T(v1 + v2) = T(v1) + T(v2);
2. for all v ∈ V and all λ ∈ F we have T(λv) = λT(v).
The kernel (or nullspace) and image (or range) of T are
ker T = { v ∈ V | T(v) = 0 } and im(T) = { T(v) | v ∈ V }.

Examples.
1. For any matrix A ∈ Mm,n(F), the function
T : Fⁿ → Fᵐ where T(x) = Ax
is a linear transformation from Fⁿ to Fᵐ. In fact, as we shall see
later, every linear transformation is really like this, as long as we
are talking about finite-dimensional spaces.
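A quick numerical illustration (ours) of example 1, with a hypothetical 2 × 3 real matrix A, checking both defining properties of linearity on sample vectors:

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [0., 1., -1.]])   # A in M_{2,3}(R), so T : R^3 -> R^2

def T(x):
    return A @ x

u = np.array([1., 0., 2.])
v = np.array([3., -1., 1.])
lam = 4.0

additive    = np.allclose(T(u + v), T(u) + T(v))   # T(u+v) = T(u) + T(v)
homogeneous = np.allclose(T(lam * u), lam * T(u))  # T(lam u) = lam T(u)
```

Both properties follow from the distributivity of matrix multiplication, which is what the general proof uses.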
2. Differentiation and definite integration are linear transformations on
suitable spaces of functions or polynomials; so are multiplication by
a fixed function and substitution for a variable. For this reason there
are close connections between linear algebra and calculus, some of
which we shall see later in this course. For example, we might have
T(f)(x) = f′(x)
or T(f)(x) = ∫ f(x) cos x dx
or T(f)(x) = f(x²).
In each case we have to take some care in choosing the domain and
codomain of T. For example, not all functions f are differentiable;
and even for those which are, f′ need not be differentiable.

3. The function T : P2 → R defined by
T(p) = ∫_{−1}^{1} p(t) dt
is linear. Its kernel is given by
ker T = { p ∈ P2 | ∫_{−1}^{1} p(t) dt = 0 }
= { a + bt + ct² | 2a + (2/3)c = 0 }
= { a + bt − 3at² | a, b ∈ R }
= span{ t, 1 − 3t² }.
The fact that ker T is the span of something (anything!) shows that
ker T is a subspace of V : we shall show later that this is always true.
It is easy to see (how?) that the two polynomials in the spanning
set are independent and therefore form a basis for ker T. To find
the image of T we note that any real number x is in the image,
because x = T(p) for p = (1/2)x + 0t + 0t²; therefore im(T) = R.
4. Many important geometric transformations of R² and R³ (and, if
you can imagine it, Rⁿ) are linear: rotations, reflections, dilations,
projections. But not translations.
5. Consider C as a vector space over R. Linear functions from C to
C include the conjugate, real part and imaginary part. Exercise.
Show that none of these is linear if considered as a function on the
space C over C.
6. An important example which we'll need later. Let V be a (finite-dimensional) vector space over F with a basis B = { b1, ..., bn }.
Then the function S : V → Fⁿ defined by S(v) = [v]_B is linear.
Proof. Let v, w be vectors in V whose coordinate vectors with
respect to B are (v1, ..., vn) and (w1, ..., wn) respectively. By definition
v = v1b1 + ... + vnbn and w = w1b1 + ... + wnbn,
and so
v + w = (v1 + w1)b1 + ... + (vn + wn)bn.
Hence
[v + w]_B = (v1 + w1, ..., vn + wn)
= (v1, ..., vn) + (w1, ..., wn)
= [v]_B + [w]_B,
and S preserves addition. The rest of the proof is similar, and is
left as an exercise.
Lemma. Basic properties of linear transformations. Let T : V → W
be a linear transformation between vector spaces over F. Then
1. T(0) = 0;
2. for all v ∈ V we have T(−v) = −T(v);
3. ker T is a subspace of V ;
4. if U is a subspace of V , then the set T(U), defined by
T(U) = { T(u) | u ∈ U },
is a subspace of W ; in particular, im(T) is a subspace of W ;
5. if U is finite-dimensional, then T(U) is finite-dimensional; in particular, im(T) is finite-dimensional;
6. if T is bijective, then T⁻¹ is a linear transformation from W to V .
Proof. We prove the first parts of properties 4 and 5. Let U be a
subspace of V ; then certainly T(U) is a subset of W , which is a known
vector space. Because U is not empty, it is clear that T(U) is not empty.
Let w1, w2 ∈ T(U); by definition we have w1 = T(u1) and w2 = T(u2)
for some u1, u2 ∈ U . Using linearity of T and the fact that U is a
subspace of V , we have
w1 + w2 = T(u1) + T(u2) = T(u1 + u2) ∈ T(U),
so T(U) is closed under addition.
Likewise, if w ∈ T(U) and λ ∈ F, then we have w = T(u) for some
u ∈ U and hence
λw = λT(u) = T(λu) ∈ T(U).
Thus T(U) is closed under scalar multiplication. Hence T(U) is a subspace of W . Now if U has a finite spanning set { u1, u2, ..., un }, then
every vector w in T(U) can be written
w = T(u) = T(λ1u1 + λ2u2 + ... + λnun)
= λ1T(u1) + λ2T(u2) + ... + λnT(un)
for some scalars λ1, λ2, ..., λn, and so { T(u1), T(u2), ..., T(un) } is a
(finite) spanning set for T(U). This shows that if U is finite-dimensional,
then T(U) is finite-dimensional.
Lemma. Alternative definition of linear transformation. A function
T : V → W between vector spaces over F is linear if and only if
T(λv1 + v2) = λT(v1) + T(v2)
for all v1, v2 ∈ V and all λ ∈ F.

clear that ker(T) is also finite-dimensional*; suppose that { u1, ..., un }
is a basis for ker(T). We shall show that
B = { u1, ..., un, v1, ..., vr }
is a basis for V . First, suppose that
λ1u1 + ... + λnun + μ1v1 + ... + μrvr = 0.     (∗)
Applying T to both sides gives
μ1T(v1) + ... + μrT(vr) = 0,
because each T(uk) is zero; since the wk are independent, each μk is zero.
Substituting back into (∗) and recalling that the uk are also independent
shows that each λk is also zero. Hence B is a linearly independent set.
Now let v ∈ V ; we must show that v is in span(B). By definition,
T(v) is in im(T), and so we have T(v) = γ1w1 + ... + γrwr for some
scalars k . Consider the vector

for all v1 , v2 V and all scalars .


Since we know that the kernel and image of a linear transformation are
vector spaces, it makes sense to talk about their dimensions.
Definition. If T is a linear transformation, then the dimension of the
kernel of T is called the nullity of T , and the dimension of its image is
called the rank of T .
There is a very important connection between the rank and the nullity of
a linear transformation. The proof of the following theorem employs the
same technique as that of the dimension of a sum theorem, page 18.
Theorem. The RankNullity Theorem. Let V be a finitedimensional
vector space. If T : V W is a linear transformation, then
rank(T ) + nullity(T ) = dim V .

1 w1 + + r wr = 0

v (1 v1 + + r vr ) .
We have
T (v (1 v1 + + r vr )) = T (v) (1 w1 + + r wr ) = 0 ;
so the above vector is in the kernel of T ; so
v (1 v1 + + r vr ) = 1 u1 + + n un
for some scalars k . Therefore
v = 1 u1 + + n un + 1 v1 + + r vr span(B) ;
this shows that B is a spanning set and hence also a basis for V . Finally,
B contains r + n vectors; so
dim V = r + n = rank(T ) + nullity(T )

Proof. Since V is finitedimensional, property 5 of the lemma shows


that im(T ) is finitedimensional; let { w1 , . . . , wr } be a basis for im(T ).
Choose vectors v1 , . . . , vr in V such that wk = T (vk ) for each k. It is
23

as required.
* It is probably clear. But try to prove it carefully!
24
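The arithmetic of example 3 is easy to check by machine. As a quick sketch (Python with numpy; the code is our illustration, not part of the notes), represent the integration functional in coordinates with respect to the standard basis { 1, t, t² } of P2 as the 1 × 3 matrix (2, 0, 2/3), and confirm the rank, the nullity and the kernel basis found above.

```python
import numpy as np

# T(a + bt + ct^2) = integral from -1 to 1 of (a + bt + ct^2) dt
#                  = 2a + (2/3)c,
# so in standard coordinates (a, b, c) the map T is a 1x3 matrix:
T = np.array([[2.0, 0.0, 2.0 / 3.0]])

rank = np.linalg.matrix_rank(T)
nullity = T.shape[1] - rank
assert rank == 1 and nullity == 2     # rank + nullity = dim P2 = 3

# the kernel basis found above: t and 1 - 3t^2 both integrate to zero
for p in (np.array([0.0, 1.0, 0.0]), np.array([1.0, 0.0, -3.0])):
    assert abs((T @ p)[0]) < 1e-12
```

Here rank + nullity = 1 + 2 = 3 = dim P2, in accordance with the Rank-Nullity Theorem.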

Examples.
1. Another look at example 3 on page 21. Having found a basis for
   ker T as in the previous working, we can say that

       rank T = dim P2 - nullity T = 3 - 2 = 1 ;

   so im T is a 1-dimensional subspace of R; so im T = R.
2. Consider the mapping T : R³ → R³ which sends a vector to its
   projection onto a certain plane P through the origin. It is geomet-
   rically clear that the image of T is P and the kernel of T is the line
   through the origin perpendicular to P. So we have rank T = 2 and
   nullity T = 1, in accordance with the Rank-Nullity Theorem.

Let A be an m × n matrix over the field F, and consider the linear
transformation T : Fⁿ → Fᵐ given by T(x) = Ax. The kernel, image,
nullity and rank of A are by definition the same as those of T. Observe
that if A has columns c1, . . . , cn then

    im(A) = { Ax | x ∈ Fⁿ }
          = { x1 c1 + ... + xn cn | x1, . . . , xn ∈ F }
          = span{ c1, . . . , cn }
          = span{ columns of A } .

This space is called the column space of A; we have just proved that
CS(A) = im(A). Similarly, the row space of A, meaning the span of
the rows of A, satisfies

    RS(A) = CS(Aᵀ) = im(Aᵀ) .

It is an immediate corollary of the Rank-Nullity Theorem that

    rank(A) + nullity(A) = n ,

sometimes referred to as the rank-nullity theorem for matrices.

Matrices of linear transformations. There is a very close and impor-
tant relationship between linear transformations and matrix multipli-
cation. For example, consider the linear mapping T : R³ → R² defined
by

    T(x1, x2, x3) = (7x1 + x2, x2 + 4x3) .

If we write both x and T(x) as column vectors this can be expressed in
the form

    T (x1, x2, x3)ᵀ = (7x1 + x2, x2 + 4x3)ᵀ = [7 1 0; 0 1 4] (x1, x2, x3)ᵀ ,

where [7 1 0; 0 1 4] denotes the 2 × 3 matrix with rows (7, 1, 0) and
(0, 1, 4). We say that

    A = [7 1 0; 0 1 4]

is the matrix of T with respect to standard bases. Similarly, consider
the differentiation map T : P3 → P2 given by T(p) = p′. Using the
general form of a polynomial in P3, this is the same as

    T(a + bt + ct² + dt³) = b + 2ct + 3dt² ;

once again we shall replace the polynomials by their coordinate vectors
and rewrite the result in terms of matrix multiplication,

    T (a, b, c, d)ᵀ = (b, 2c, 3d)ᵀ = [0 1 0 0; 0 0 2 0; 0 0 0 3] (a, b, c, d)ᵀ .

The matrix of differentiation with respect to standard bases in P3 and
P2 is

    [0 1 0 0; 0 0 2 0; 0 0 0 3] .

(Of course, strictly speaking, this is nonsense: the domain of T is P3,
and so T of a vector in R⁴ is not defined. We'll clear up the matter and
write things more accurately very soon.)

Another example: define T : M2,2 → M2,2 by T(X) = 3X - Xᵀ, and
take the standard basis matrices for M2,2 to be

    [1 0; 0 0] ,  [0 1; 0 0] ,  [0 0; 1 0] ,  [0 0; 0 1] ,

in that order. Then we have

    T [a b; c d] = 3 [a b; c d] - [a c; b d] = [2a 3b-c; 3c-b 2d] ,

or in terms of coordinates

    T (a, b, c, d)ᵀ = (2a, 3b - c, -b + 3c, 2d)ᵀ
                    = [2 0 0 0; 0 3 -1 0; 0 -1 3 0; 0 0 0 2] (a, b, c, d)ᵀ .

So the matrix of this linear transformation with respect to standard
bases is

    [2 0 0 0; 0 3 -1 0; 0 -1 3 0; 0 0 0 2] .
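The computation just done can be mechanised. The following sketch (Python with numpy; our own code, not from any library) applies T(X) = 3X - Xᵀ to each standard basis matrix, flattens the result row by row to get its coordinate vector, and assembles these vectors as the columns of the matrix.

```python
import numpy as np

def T(X):
    # the map of the example: T(X) = 3X - X^T on 2x2 matrices
    return 3 * X - X.T

# standard ordered basis E11, E12, E21, E22 of M_{2,2}
basis = [np.array([[1, 0], [0, 0]]), np.array([[0, 1], [0, 0]]),
         np.array([[0, 0], [1, 0]]), np.array([[0, 0], [0, 1]])]

# jth column = coordinate vector of T(E_j), i.e. T(E_j) read row by row
A = np.column_stack([T(E).reshape(4) for E in basis])

expected = np.array([[2, 0, 0, 0],
                     [0, 3, -1, 0],
                     [0, -1, 3, 0],
                     [0, 0, 0, 2]])
assert np.array_equal(A, expected)
```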
There is no particular need to do the preceding problems using standard
bases (though in many cases, but not all, to do so will simplify the
working). For example, let's consider again the differentiation map from
P3 to P2. We'll continue to use the standard ordered basis S = { 1, t, t² }
for P2, but shall use a different ordered basis

    B = { 3 + t, 5t² - 2t³, 4 - 7t + t³, 1 + t² }

for P3. (Exercise. Check that this truly is a basis!) Writing the differ-
entiation mapping in terms of these bases we have

    T( a(3 + t) + b(5t² - 2t³) + c(4 - 7t + t³) + d(1 + t²) )
        = T( (3a + 4c + d) + (a - 7c)t + (5b + d)t² + (-2b + c)t³ )
        = (a - 7c) + (10b + 2d)t + (-6b + 3c)t² ;

in terms of coordinate vectors,

    T (a, b, c, d)ᵀ = (a - 7c, 10b + 2d, -6b + 3c)ᵀ
                    = [1 0 -7 0; 0 10 0 2; 0 -6 3 0] (a, b, c, d)ᵀ .

The 3 × 4 matrix we have just found is the matrix of T with respect to
the bases B in the domain and S in the codomain.
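A sanity check on the 3 × 4 matrix just found (Python with numpy; an illustration of ours, using an arbitrary coordinate vector of our choosing): multiplying the matrix by B-coordinates should give the S-coordinates of the derivative.

```python
import numpy as np

# matrix of d/dt with respect to B = {3+t, 5t^2-2t^3, 4-7t+t^3, 1+t^2}
# in the domain and S = {1, t, t^2} in the codomain
M = np.array([[1, 0, -7, 0],
              [0, 10, 0, 2],
              [0, -6, 3, 0]])

# basis polynomials of B as coefficient vectors in powers (1, t, t^2, t^3)
B = np.array([[3, 1, 0, 0], [0, 0, 5, -2], [4, -7, 0, 1], [1, 0, 1, 0]])

def deriv(coeffs):                      # d/dt on (c0, c1, c2, c3)
    return np.array([coeffs[1], 2 * coeffs[2], 3 * coeffs[3]])

x = np.array([2, -1, 3, 5])             # arbitrary B-coordinates
p = x @ B                               # the polynomial, in powers of t
assert np.array_equal(M @ x, deriv(p))  # matrix reproduces differentiation
```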
We could change to a nonstandard basis in the codomain as well
as the domain. For example, take

    C = { 1 - 4t + 3t², 2t - t², t² } .

As above we have

    T( a(3 + t) + b(5t² - 2t³) + c(4 - 7t + t³) + d(1 + t²) )
        = (a - 7c) + (10b + 2d)t + (-6b + 3c)t² ;

we need to write the right hand side in terms of the basis C. Setting

    (a - 7c) + (10b + 2d)t + (-6b + 3c)t²
        = λ1 (1 - 4t + 3t²) + λ2 (2t - t²) + λ3 (t²)

and solving gives

    λ1 = a - 7c ,   λ2 = 2a + 5b - 14c + d ,   λ3 = -a - b + 10c + d .

Now we can proceed as above to write the definition of T in terms of
coordinate vectors (this time, with respect to B in the domain and C in
the codomain) and mimic T by matrix multiplication. We have

    T (a, b, c, d)ᵀ = (a - 7c, 2a + 5b - 14c + d, -a - b + 10c + d)ᵀ
                    = [1 0 -7 0; 2 5 -14 1; -1 -1 10 1] (a, b, c, d)ᵀ ,

and therefore the matrix of T with respect to bases B in the domain
and C in the codomain is

    [1 0 -7 0; 2 5 -14 1; -1 -1 10 1] .
The formal definition of the matrix of a linear transformation is as
follows.

Definition. Let T : V → W be a linear transformation, where V and
W are finite-dimensional vector spaces. Then A is the matrix of T with
respect to ordered bases B for V and C for W if

    [T(v)]_C = A [v]_B

for all vectors v in V.


A consequence of this definition leads to a (sometimes) simpler
method of finding the matrix of a linear transformation, and to a proof
that every linear transformation between finite-dimensional spaces has
a matrix. Let the basis for V be B = { b1, . . . , bn }. Then from the
definition we have

    [T(b1)]_C = A [b1]_B = A (1, 0, . . . , 0)ᵀ = { first column of A } .

In other words, to find the first column of A we take the first basis vector
for V, calculate T of this vector, and write down its coordinate vector
with respect to the basis of W. The other columns of A are then found
by a similar method.
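This "columns method" translates directly into code. Below is a minimal sketch (Python with numpy; the function name `matrix_of` is ours, not from any library) which, given the domain basis, the map, and a coordinates function for the codomain, builds the matrix column by column; we test it on the differentiation map P3 → P2 with standard bases.

```python
import numpy as np

def matrix_of(T, domain_basis, coords_C):
    """Columns method: the jth column of the matrix of T is the
    coordinate vector (w.r.t. the codomain basis) of T(b_j)."""
    return np.column_stack([coords_C(T(b)) for b in domain_basis])

# Example: differentiation P3 -> P2, both with standard bases.
# Polynomials are stored as coefficient arrays (c0, c1, c2, ...).
deriv = lambda p: np.array([k * p[k] for k in range(1, len(p))])
std_P3 = [np.eye(4, dtype=int)[j] for j in range(4)]   # 1, t, t^2, t^3
coords_P2 = lambda q: q                                # standard coords

A = matrix_of(deriv, std_P3, coords_P2)
assert np.array_equal(A, np.array([[0, 1, 0, 0],
                                   [0, 0, 2, 0],
                                   [0, 0, 0, 3]]))
```

This reproduces the matrix of differentiation with respect to standard bases found earlier.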
Theorem. Matrix of a linear transformation. Let V and W be vector
spaces with bases B = { b1, . . . , bn } and C = { c1, . . . , cm } respectively,
and let T be a linear transformation from V to W. Let A be the m × n
matrix whose jth column is the coordinate vector with respect to C of
T(bj). Then A is the matrix of T with respect to bases B and C.

Proof. For any vector v = v1 b1 + ... + vn bn in V we have

    A [v]_B = A (v1, . . . , vn)ᵀ = v1 [T(b1)]_C + ... + vn [T(bn)]_C .

But recalling (example 6, page 21) that the function which maps vectors
to coordinates is linear, we have

    A [v]_B = [v1 T(b1) + ... + vn T(bn)]_C = [T(v1 b1 + ... + vn bn)]_C = [T(v)]_C ,

and this completes the proof.

Examples.
1. Reworking the differentiation map from P3 to P2: take the basis

       B = { 3 + t, 5t² - 2t³, 4 - 7t + t³, 1 + t² }

   for P3 and the standard basis S = { 1, t, t² } for P2. Then

       T(3 + t) = d/dt (3 + t) = 1
       T(5t² - 2t³) = d/dt (5t² - 2t³) = 10t - 6t²
       T(4 - 7t + t³) = d/dt (4 - 7t + t³) = -7 + 3t²
       T(1 + t²) = d/dt (1 + t²) = 2t ,

   and the coordinates of the right hand sides with respect to S are

       (1, 0, 0)ᵀ ,  (0, 10, -6)ᵀ ,  (-7, 0, 3)ᵀ  and  (0, 2, 0)ᵀ .

   The coordinate vectors are the columns of

       A = [1 0 -7 0; 0 10 0 2; 0 -6 3 0] ,

   and this, as we found on page 27, is the matrix of T with respect
   to bases B for P3 and S for P2.
2. A mapping given by the formula T(x) = x for all x is called an
   identity mapping as it leaves every vector unchanged. Consider the
   identity mapping on R³; we shall find its matrix with respect to the
   basis B = { (1, 4, 1), (0, 2, 5), (3, 1, 1) } in the domain, and the
   standard basis in the codomain. Using the "columns" method is
   boring but easy:

       T(1, 4, 1) = (1, 4, 1) ;   its coordinate vector is (1, 4, 1)ᵀ ;
       T(0, 2, 5) = (0, 2, 5) ;   its coordinate vector is (0, 2, 5)ᵀ ;
       T(3, 1, 1) = (3, 1, 1) ;   its coordinate vector is (3, 1, 1)ᵀ ;

   and so the matrix of the identity mapping on R³ with respect to
   bases B in the domain and S in the codomain is

       [1 0 3; 4 2 1; 1 5 1] .

   This illustrates the next result, which will lead us to a third and
   sometimes even more convenient method for finding the matrix of
   a linear transformation.
Theorem. Let T be the identity map on a vector space V, and let B =
{ b1, b2, . . . , bn } be a basis for V. If A is the matrix of T with respect
to the basis B in the domain and the standard basis in the codomain,
then the columns of A are b1, b2, . . . , bn, written as coordinate vectors
with respect to S.

To apply this result we need to consider the composition of linear
mappings. Let T1 : V1 → V2 and T2 : V2 → V3 be linear transformations;
then the composition T3, a function from V1 to V3 defined by

    T3(v) = T2( T1(v) ) ,

is also linear. (Exercise. Prove it!) Now suppose that we have bases
B1, B2 and B3 for V1, V2 and V3; let the matrices of T1 and T2 with
respect to these bases be A1 and A2 respectively. Then for any v in V1
we have

    [T3(v)]_{B3} = [T2(T1(v))]_{B3} = A2 [T1(v)]_{B2} = A2 A1 [v]_{B1} ;

so the matrix of T3 with respect to bases B1 in the domain and B3 in
the codomain is A2 A1. (Note the order of multiplication!)

To find the matrix of T : V → W with respect to bases B in the
domain and C in the codomain, we may write T as the composition of
three mappings. Given any v in V,
• calculate I(v) = v, where I is the identity map on V;
• calculate T(I(v)) = T(v);
• calculate J(T(I(v))) = T(v), where J is the identity map on W.
The point is that if we use the basis B to represent the initial vector v
and C to represent the final value of T(v), but switch to standard bases
for all the intermediate steps, then the matrices of all three functions will
be quite easy to find. It is often helpful to display the triple composition
in diagrammatic form.

                       T  (matrix A)
        V w.r.t. S  -------------->  W w.r.t. S
            ^                            ^
            | P                          | Q
            |                            |
        V w.r.t. B  -------------->  W w.r.t. C
                       T  (matrix M)
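The order of multiplication in the composition rule is easy to confirm numerically. In the sketch below (Python with numpy; random matrices of compatible sizes stand in for A1 and A2, our choice of dimensions), applying T1 and then T2 agrees with the single matrix A2 A1, while the shapes show that A1 A2 would not even be defined.

```python
import numpy as np

rng = np.random.default_rng(0)
A1 = rng.integers(-5, 5, size=(4, 3))   # matrix of T1 : V1 -> V2
A2 = rng.integers(-5, 5, size=(2, 4))   # matrix of T2 : V2 -> V3

v = rng.integers(-5, 5, size=3)
# applying T1 then T2 agrees with the single matrix A2 @ A1 ...
assert np.array_equal(A2 @ (A1 @ v), (A2 @ A1) @ v)
# ... and the composite is 2x3, while A1 @ A2 (a 4x3 times a 2x4
# product) is not defined at all
assert (A2 @ A1).shape == (2, 3)
```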

For example, define T : R³ → R² by

    T(x) = (x1 + 2x2 + 3x3, 4x1 + 3x2 + 2x3) .

To find the matrix of T with respect to bases

    B = { (1, 0, 1), (1, 1, 3), (1, 1, 0) }   and   C = { (1, 2), (3, 1) }

for R³ and R², we can write down the matrices which correspond to the
left, top and right sides of the diagram:

    P = [1 1 1; 0 1 1; 1 3 0] ,   A = [1 2 3; 4 3 2] ,   Q = [1 3; 2 1] .

One more thing to observe. We have obtained the matrix Q of
the identity map J with respect to a new basis in the domain and the
standard basis in the codomain; but we actually want it the other way
around, and so we must use Q⁻¹. Therefore the matrix of T with respect
to the above bases is

    M = Q⁻¹ A P = [1 3; 2 1]⁻¹ [1 2 3; 4 3 2] [1 1 1; 0 1 1; 1 3 0]
                = (1/5) [14 27 18; 2 11 -1] .
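The matrix M = Q⁻¹AP can be computed numerically from the three matrices above; the sketch below (Python with numpy, our illustration) also cross-checks the first column of M against T applied to the first basis vector of B.

```python
import numpy as np

P = np.array([[1, 1, 1], [0, 1, 1], [1, 3, 0]])   # basis B as columns
A = np.array([[1, 2, 3], [4, 3, 2]])              # T w.r.t. standard bases
Q = np.array([[1, 3], [2, 1]])                    # basis C as columns

M = np.linalg.inv(Q) @ A @ P
assert np.allclose(M, np.array([[14, 27, 18], [2, 11, -1]]) / 5)

# cross-check one column: the first column of M is T(1,0,1) in C-coords,
# so converting back to standard coordinates must give A @ (1,0,1)
b1 = np.array([1, 0, 1])
assert np.allclose(Q @ M[:, 0], A @ b1)
```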

To conclude this section we give a more careful proof of the methods we
have just been using. The proof takes a bit of effort to set up, but the
central idea is nothing more complicated than matrix multiplication.

Theorem. Diagram chasing. Let V be a finite-dimensional vector
space with bases B = { b1, . . . , bn } and B′ = { b1′, . . . , bn′ }. Let W
be a finite-dimensional vector space with bases C = { c1, . . . , cm } and
C′ = { c1′, . . . , cm′ }. Let T : V → W be a linear transformation which
has matrix A with respect to bases B and C, and matrix A′ with respect
to bases B′ and C′. Let P be the n × n matrix whose jth column is
the coordinate vector [bj′]_B, and let Q be the m × m matrix whose jth
column is [cj′]_C. Then

    A′ = Q⁻¹ A P .

Proof. Let v ∈ V; write x = [v]_B and x′ = [v]_{B′}. By using the fact
that the vectors-to-coordinates map is linear (example 6, page 21),
we have

    x = [x1′ b1′ + ... + xn′ bn′]_B = x1′ [b1′]_B + ... + xn′ [bn′]_B = P x′ :

the last step is simply the definition of matrix multiplication, together
with the fact that [bj′]_B is the jth column of P. That is, we have

    [v]_B = P [v]_{B′} .

An identical argument gives [T(v)]_C = Q [T(v)]_{C′}, and so

    A′ x′ = [T(v)]_{C′} = Q⁻¹ [T(v)]_C = Q⁻¹ A [v]_B = Q⁻¹ A P x′ .

Since this is true for all x′ ∈ Fⁿ, we have A′ = Q⁻¹ A P, and the proof
is done.

Comment. The matrix P is called the change-of-basis matrix from
B′ to B.

Comment. If W is the same as V, then the bases B and C may be the
same. "The matrix of T with respect to a basis B" means the matrix
of T with respect to B in the domain and also B in the codomain.

Example. Consider the map T : P2 → R² given by T(p) = (p(1), p(2)),
with bases { 1 - t, 2 - t, t² } for P2 and { (2, 3), (4, 5) } for R². The matrix
of T with respect to these bases is

    [2 4; 3 5]⁻¹ [1 1 1; 1 2 4] [1 2 0; -1 -1 0; 0 0 1] = (1/2) [-4 -5 11; 2 3 -5] .
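The example's product is again easy to verify by machine (a numpy sketch of our own; the three factors are Q⁻¹, the matrix of T(p) = (p(1), p(2)) with respect to standard bases, and the change-of-basis matrix P whose columns are the coordinates of 1 - t, 2 - t and t²).

```python
import numpy as np

Q = np.array([[2, 4], [3, 5]])                     # basis {(2,3),(4,5)} of R^2
A = np.array([[1, 1, 1], [1, 2, 4]])               # p -> (p(1), p(2)) on {1,t,t^2}
P = np.array([[1, 2, 0], [-1, -1, 0], [0, 0, 1]])  # coords of 1-t, 2-t, t^2

M = np.linalg.inv(Q) @ A @ P
assert np.allclose(M, np.array([[-4, -5, 11], [2, 3, -5]]) / 2)
```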

Isomorphism of vector spaces. Rewriting an equation w = T(v)
in the form y = Ax suggests that we are thinking of all vectors as
coordinate vectors, and all linear transformations as matrices. As with
groups, we can formalise this by defining suitable kinds of functions
between apparently different objects.

Definition. Let T be a bijective linear mapping from V to W. Then
T is called an isomorphism from V to W. If such a T exists, we say
that V and W are isomorphic.

Examples.
1. The example on page 1 of chapter 1 illustrates the fact that R⁴
   and M2,2 and P3 and C² are isomorphic (when considered as vector
   spaces over R). An isomorphism from R⁴ to M2,2 is defined by

       T(a, b, c, d) = [a b; c d] ;

   we leave you to check the details of this and the other isomorphisms.
2. The vector spaces Fⁿ and Mn,1(F) and M1,n(F) are all isomorphic:
   this is why we can treat vectors of n components as if they were
   n × 1 matrices, and why it doesn't matter (out of context) whether
   we regard them as row vectors or column vectors.
3. Similarly, C (as a vector space over R) is isomorphic to R² and
   to the set of geometric vectors ("arrows") in 2 dimensions: this is
   why we can interpret addition of complex numbers as addition of
   vectors.
4. Let V be a finite-dimensional vector space over F; then V is iso-
   morphic to Fⁿ, where n = dim V. In fact, we already know that
   for any basis B of V, the map S : V → Fⁿ given by S(v) = [v]_B is
   linear, and it follows from remarks on page 13 that it is bijective.
5. Recall example 7 from page 3: V = R⁺ and the "addition" and
   "scalar multiplication" operations are actually multiplication and
   exponentiation. We may check that

       T : V → R   where T(x) = log x

   is an isomorphism, and so these two spaces are isomorphic.

We denote by L(V, W) the set of all linear transformations from a
vector space V over a field F to a space W over the same field. You
may check that L(V, W) is a vector space under the usual operations
of addition and scalar multiplication for functions. If dim V = n
and dim W = m, then L(V, W) is isomorphic to Mm,n(F). In fact,
if we choose bases B and C for V and W, and if S_C : W → Fᵐ is the
vectors-to-coordinates map for W with respect to the basis C,
as in the previous paragraph, then isomorphisms in both directions
are given by

    Φ(T) = A ,  the m × n matrix with jth column S_C(T(bj)) ,

and

    Φ⁻¹(A) = T ,  the function which maps v to S_C⁻¹(A [v]_B) .

Recall that to prove a function is bijective, we must show that it is
one-to-one and onto. For linear transformations there is a very simple
criterion for the former.

Lemma. One-to-one and kernel. A linear transformation T : V → W
is one-to-one if and only if ker T = { 0 }.

Proof. Suppose T is one-to-one; let v ∈ ker T. Then T(v) = 0 = T(0),
and since T is one-to-one, we have v = 0. Thus ker T = { 0 }.
Conversely, suppose that ker T = { 0 }. For any v1, v2 ∈ V we have

    T(v1) = T(v2)  ⇒  T(v1 - v2) = 0
                   ⇒  v1 - v2 ∈ ker T  ⇒  v1 - v2 = 0  ⇒  v1 = v2 ;

so T is one-to-one.

Note that this result is certainly not true in general for non-linear
functions: consider, for example, the functions from R to R given by

    f(x) = x²   and   g(x) = x + 1 .

Theorem. Isomorphisms and matrices. Let T : V → W be a linear
map between finite-dimensional spaces, and let A be the matrix of T
(with respect to any bases). Then T is invertible if and only if A is
invertible. If this is the case, then the matrix of T⁻¹ (with respect to
the same bases) is A⁻¹.

Proof. We'll work with bases B = { b1, . . . , bn } for V and C for W.
Recall that by definition, we have

    [T(v)]_C = A [v]_B

for all v ∈ V. First, suppose that A is invertible. We define a function
S from W to V: for any w ∈ W, let

    S(w) = x1 b1 + ... + xn bn   where x = A⁻¹ [w]_C .

Then for any w we have

    [T(S(w))]_C = A [S(w)]_B = A x = [w]_C        (∗)

and so T(S(w)) = w. Let v ∈ V; applying (∗) to w = T(v), we have

    S(T(v)) = x1 b1 + ... + xn bn   where x = A⁻¹ [T(v)]_C ;

hence

    [S(T(v))]_B = x = A⁻¹ A [v]_B = [v]_B ,

and so S(T(v)) = v. Thus T is invertible (and S is its inverse).
Conversely, suppose that T is invertible; we know that T⁻¹ is a
linear map from W to V and therefore has a matrix A′. For any x ∈ Fⁿ,
take v = x1 b1 + ... + xn bn; then

    A′ A x = A′ A [v]_B = A′ [T(v)]_C = [T⁻¹(T(v))]_B = [v]_B = x .

Therefore A′ A = In. Similarly, A A′ = Im. Therefore A is invertible,
and the matrix A′ of T⁻¹ is equal to A⁻¹.

Example. Let

    V = { p ∈ P3 | p(0) = 0 } ;

you can check that V is a vector space, that B = { t, t², t³ } is a basis
for V, and that the differentiation map D : V → P2, where D(p) = p′,
is linear. Just to make it difficult (or maybe not?), we'll take a non-
standard basis

    C = { 1 + 2t, 3 + 4t, 1 + 3t² }

for P2. Calculating the images

    D(t) = 1 ,   D(t²) = 2t ,   D(t³) = 3t²

and their respective coordinate vectors

    (-2, 1, 0)ᵀ ,   (3, -1, 0)ᵀ ,   (2, -1, 1)ᵀ

gives the matrix of D with respect to B and C as

    A = [-2 3 2; 1 -1 -1; 0 0 1] .

The inverse of D is the map D⁻¹ : P2 → V given by

    D⁻¹(p)(t) = ∫₀ᵗ p(s) ds ;

its matrix

    [1 3 1; 1 2 0; 0 0 1]

is easier to calculate than A since we have a standard basis (more or
less) in the codomain; and we can check that this matrix is A⁻¹.
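The claim that the matrix of D⁻¹ is the inverse of A can be checked directly (a numpy sketch of our own; the two matrices are the ones computed in the example).

```python
import numpy as np

# matrix of D (differentiation) w.r.t. B = {t, t^2, t^3} and
# C = {1+2t, 3+4t, 1+3t^2}, and of D^{-1} w.r.t. C and B
A     = np.array([[-2, 3, 2], [1, -1, -1], [0, 0, 1]])
A_inv = np.array([[1, 3, 1], [1, 2, 0], [0, 0, 1]])

# the two products are the 3x3 identity, so A_inv really is A^{-1}
assert np.array_equal(A @ A_inv, np.eye(3, dtype=int))
assert np.array_equal(A_inv @ A, np.eye(3, dtype=int))
```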

Similarity of matrices. Consider a linear transformation T : V → V,
where V is finite-dimensional; let B1 and B2 be two bases for V. If the
matrix of T with respect to B1 is A1 and the matrix with respect to B2
is A2, then diagram chasing shows that

    A2 = P⁻¹ A1 P ,

where P is the matrix whose columns are the basis vectors in B2, writ-
ten as coordinate vectors with respect to B1. This equation defines an
important relation between matrices.

Definition. Matrices A1 and A2 (square, and of the same size) over a
field F are said to be similar if there exists an invertible matrix P over
F such that A2 = P⁻¹ A1 P.

Lemma. Similarity is an equivalence relation on Mn,n(F).

Theorem. Matrices A1 and A2 are similar if and only if they are the
matrices of the same linear transformation with respect to two choices
of bases.

Proof. The "if" statement is proved by the remarks at the beginning
of this section. Conversely, suppose that A1 and A2 are similar n × n
matrices over a field F, say A2 = P⁻¹ A1 P, and let T : Fⁿ → Fⁿ be given
by T(x) = A1 x. Then A1 is the matrix of T with respect to the standard
basis S = { e1, . . . , en } for Fⁿ. Now consider B = { P e1, . . . , P en }: it
is not hard to show that this is a linearly independent set, and it has n
elements, so it is a basis for Fⁿ. The diagram chasing theorem shows
immediately that the matrix of T with respect to B is P⁻¹ A1 P = A2.
Therefore A1 and A2 are the matrices of T with respect to the bases S
and B.

Many properties of matrices that we have studied are similarity
invariants: that is, if A1 has the property and A1 is similar to A2, then
A2 has the property as well. Here are two important examples.

Theorem. Rank and nullity are similarity invariants.

Proof. Let A1, A2 be similar matrices, say, A1 = P⁻¹ A2 P. Take a
basis B1 = { b1, . . . , br } for im(A1), and write B2 = { P b1, . . . , P br }.
We shall show that B2 is a basis for im(A2), and it will follow that

    rank(A1) = |B1| = |B2| = rank(A2) .

To show that B2 is independent, let λ1 (P b1) + ... + λr (P br) = 0; then

    P (λ1 b1 + ... + λr br) = 0 ;

since P is invertible, λ1 b1 + ... + λr br = 0 and hence every λk is zero.
To show that B2 spans im(A2), let w ∈ im(A2); by definition of the
image, similarity of A1, A2, and the fact that B1 spans im(A1), we have

    w = A2 v = P A1 P⁻¹ v = P (λ1 b1 + ... + λr br)

for some scalars λ1, . . . , λr; the last step holds since A1 P⁻¹ v is in
im(A1). Hence

    w = λ1 (P b1) + ... + λr (P br) ∈ span(B2) ,

and this completes the proof.
To show that nullity is a similarity invariant, suppose once again
that A1 = P⁻¹ A2 P, take a basis B1 = { b1, . . . , bn } for ker(A1) and
write B2 = { P b1, . . . , P bn }. If w ∈ ker(A2) then

    A2 w = 0  ⇒  P A1 P⁻¹ w = 0  ⇒  A1 P⁻¹ w = 0  ⇒  P⁻¹ w ∈ ker(A1)
          ⇒  P⁻¹ w = λ1 b1 + ... + λn bn
          ⇒  w = λ1 (P b1) + ... + λn (P bn) ∈ span(B2) .

Therefore B2 spans ker(A2). Exactly as before, B2 is a linearly inde-
pendent set; hence B2 is a basis for ker(A2), and nullity(A1) = nullity(A2).

Comments.
• The matrices

      A1 = [1 2 3; 0 4 5; 0 0 6]   and   A2 = [1 2 3; 0 4 5; 0 0 0]

  have different nullities (and ranks), and therefore are not similar.
• Note that when A1, A2 are similar, it is not in general true that
  ker(A1) = ker(A2): the dimensions are the same, but the spaces
  themselves are usually different. In other words, although the nul-
  lity is a similarity invariant, the kernel is not. In fact, it is not hard
  to see from the preceding proof that if A1 = P⁻¹ A2 P, then

      ker(A2) = { P v | v ∈ ker(A1) } .

  Corresponding remarks apply to the images of similar matrices.
• Other examples of similarity invariants are the trace, determinant
  and eigenvalues. These will be left as exercises or treated later in
  the course. Note, however, that we cannot use these invariants to
  prove that two matrices are similar. You may check that

      A1 = [3 1; 0 3]   and   A2 = [3 0; 0 3]

  coincide in all the similarity invariants we have mentioned, yet they
  are not similar.
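The final pair of matrices makes a good machine exercise (Python with numpy; our illustration, with one invertible P chosen arbitrarily). The listed invariants coincide, yet since A2 = 3I commutes with every matrix, any conjugate P⁻¹ A2 P equals A2 itself and so can never be A1.

```python
import numpy as np

A1 = np.array([[3, 1], [0, 3]])
A2 = np.array([[3, 0], [0, 3]])

# trace, determinant, rank and eigenvalues all coincide ...
assert np.trace(A1) == np.trace(A2)
assert round(np.linalg.det(A1)) == round(np.linalg.det(A2)) == 9
assert np.linalg.matrix_rank(A1) == np.linalg.matrix_rank(A2) == 2
assert np.allclose(np.sort(np.linalg.eigvals(A1)),
                   np.sort(np.linalg.eigvals(A2)))

# ... yet A2 = 3I, so P^{-1} A2 P = A2 for every invertible P,
# and that conjugate is never equal to A1
P = np.array([[2, 5], [1, 3]])        # any invertible P illustrates this
assert np.allclose(np.linalg.inv(P) @ A2 @ P, A2)
assert not np.array_equal(A1, A2)
```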
