
1.1
System of Linear Equations
When mathematics is used to solve a problem, it often
becomes necessary to find a solution to a so-called system
of linear equations. Historically, linear algebra developed
from studying methods for solving such equations. This
module introduces methods for solving systems of linear
equations and includes examples of problems that reduce
to solving such equations. The techniques of this module
will be used throughout the remainder of the course.

Definition: A linear equation is an algebraic equation in
which each term is either a constant or the product of a
constant and the first power of a single variable.

Example: 𝑥 + 3𝑦 = 9.
The graph of this equation is a straight line in the 𝑥 − 𝑦
plane.

Note: Linear equations can have one or more variables.


Consider a system of two linear equations
𝑥 + 3𝑦 = 9

−2𝑥 + 𝑦 = −4
A pair of values of 𝑥 and 𝑦 that satisfies both equations is
called a solution. It can be seen by substitution that
𝑥 = 3, 𝑦 = 2 is a solution to this system. A solution to such a
system will be a point at which the graphs of the two
equations intersect. The following examples illustrate that
three possibilities can arise for such a system of equations.
There can be a unique solution, no solution, or many
solutions.

Unique Solution
𝑥 + 3𝑦 = 9
−2𝑥 + 𝑦 = −4

Lines intersect at (3,2).

Unique solution, 𝑥 = 3, 𝑦 = 2.

[Figure: the lines x + 3y = 9 and −2x + y = −4 intersecting at the point (3,2).]

No Solution

−2𝑥 + 𝑦 = 3

−4𝑥 + 2𝑦 = 2
Lines are parallel. No point of intersection.

[Figure: the parallel lines −2x + y = 3 and −4x + 2y = 2.]

Many Solutions

4𝑥 − 2𝑦 = 6

6𝑥 − 3𝑦 = 9
Both equations have the same graph. Any point on the
graph is a solution.

[Figure: the coincident lines 4x − 2y = 6 and 6x − 3y = 9.]
Our aim in this module is to analyze larger systems of
linear equations. The following is an example of a system of
three linear equations.
𝑥1 + 𝑥2 + 𝑥3 = 2
2𝑥1 + 3𝑥2 + 𝑥3 = 3
𝑥1 − 𝑥2 − 2𝑥3 = −6

A linear equation in three variables corresponds to a plane
in three-dimensional space. Solutions will be points that lie
on all three planes. As for systems of two equations, there
can be a unique solution, no solution, or many solutions.

[Figure: three planes A, B and C intersecting at a single point P.]
Three planes A, B and C intersect at a single point P.
P corresponds to a unique solution.

[Figure: three planes A, B and C with no point in common.]
Planes A, B and C have no points in common.
There is no solution.

[Figure: three planes A, B and C intersecting in a line PQ.]
Three planes A, B and C intersect in a line PQ.
Any point on the line is a solution.

As the number of variables increases, a geometrical
interpretation of such a system of equations becomes
increasingly complex. Each equation will represent a space
embedded in a larger space. Solutions will be points that lie
in all of the embedded spaces. A geometrical approach to
visualizing solutions becomes impractical. We have to rely
solely on algebraic methods. We introduce a method for
solving systems of linear equations called Gauss-Jordan
elimination.
Gauss-Jordan Elimination:
Gauss-Jordan Elimination:

The general form of a system of linear equations is

𝑎11 𝑥1 + 𝑎12 𝑥2 + ⋯ + 𝑎1𝑛 𝑥𝑛 = 𝑏1
𝑎21 𝑥1 + 𝑎22 𝑥2 + ⋯ + 𝑎2𝑛 𝑥𝑛 = 𝑏2
⋮
𝑎𝑚1 𝑥1 + 𝑎𝑚2 𝑥2 + ⋯ + 𝑎𝑚𝑛 𝑥𝑛 = 𝑏𝑚
The matrix representation of the above system of linear
equations is

[ 𝑎11 𝑎12 ⋯ 𝑎1𝑛 ] [ 𝑥1 ]   [ 𝑏1 ]
[ 𝑎21 𝑎22 ⋯ 𝑎2𝑛 ] [ 𝑥2 ] = [ 𝑏2 ]
[  ⋮            ⋮  ] [ ⋮  ]   [ ⋮  ]
[ 𝑎𝑚1 𝑎𝑚2 ⋯ 𝑎𝑚𝑛 ] [ 𝑥𝑛 ]   [ 𝑏𝑚 ]

where the coefficient matrix is 𝑚 × 𝑛, the column of unknowns is 𝑛 × 1, and the column of constants is 𝑚 × 1,

i.e., 𝐴𝑋 = 𝐵

𝐴 is called the coefficient matrix.


The matrix

[𝐴 | 𝐵] = [ 𝑎11 𝑎12 ⋯ 𝑎1𝑛 | 𝑏1 ]
          [ 𝑎21 𝑎22 ⋯ 𝑎2𝑛 | 𝑏2 ]
          [  ⋮                ⋮  ]
          [ 𝑎𝑚1 𝑎𝑚2 ⋯ 𝑎𝑚𝑛 | 𝑏𝑚 ]

is the augmented matrix of the system of linear equations.


Definition: A matrix is in reduced echelon form if

1. Any row consisting entirely of zeros is grouped at the bottom of the matrix.
2. The first non-zero element of each row is 1. This element is called the leading coefficient or pivot.
3. The leading coefficient of each row after the first is positioned to the right of the leading coefficient of the previous row.
4. All other elements in a column that contains a leading coefficient are zero.

Note: If we relax condition 2 (that each leading coefficient
must be 1) and drop condition 4, the resulting form of the
matrix is called echelon form.
Example: The following matrices are all in reduced
echelon form.
[ 1 0 8 ]   [ 1 0 0 7 ]   [ 1 4 0 0 ]   [ 1 2 3 0 ]
[ 0 1 2 ] , [ 0 1 0 3 ] , [ 0 0 1 0 ] , [ 0 0 0 1 ]
[ 0 0 0 ]   [ 0 0 1 9 ]   [ 0 0 0 1 ]   [ 0 0 0 0 ]
The following matrices are not in reduced echelon form.
[ 1 2 0 4 ]   [ 1 2 0 3 0 ]
[ 0 0 0 0 ]   [ 0 0 3 4 0 ]
[ 0 0 1 3 ]   [ 0 0 0 0 1 ]

First matrix: row of zeros not at bottom of matrix.
Second matrix: first non-zero element of row 2 is not 1.

[ 1 0 0 2 ]   [ 1 7 0 8 ]
[ 0 0 1 4 ]   [ 0 1 0 3 ]
[ 0 1 0 3 ]   [ 0 0 1 2 ]
[ 0 0 0 0 ]

First matrix: leading 1 in row 3 not to the right of leading 1 in row 2.
Second matrix: non-zero element above leading 1 in row 2.

There are usually many sequences of row operations that
can be used to transform a given matrix to reduced echelon
form. They all, however, lead to the same reduced echelon
form. We say that the reduced echelon form of a matrix is
unique.
Gauss-Jordan Elimination:
Gauss-Jordan Elimination:

1. Write down the augmented matrix of the system of linear equations.
2. Derive the reduced echelon form of the augmented matrix using elementary row operations.
3. Write down the system of equations corresponding to the reduced echelon form. This system gives the solution.
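
The elimination procedure lends itself directly to a computer implementation. The following is a minimal Python sketch of the algorithm (an illustration, not part of the original module; the function name rref and the numerical tolerance tol are our own choices):

    # A minimal sketch of Gauss-Jordan elimination on an augmented
    # matrix (a list of row lists), following the steps of this module.
    def rref(M, tol=1e-12):
        M = [row[:] for row in M]              # work on a copy
        rows, cols = len(M), len(M[0])
        pivot_row = 0
        for col in range(cols):
            # Find a nonzero entry (pivot) in this column, at or below
            # pivot_row, and swap its row to the top of the submatrix.
            pivot = next((r for r in range(pivot_row, rows)
                          if abs(M[r][col]) > tol), None)
            if pivot is None:
                continue                       # no pivot in this column
            M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
            # Scale the pivot row so the pivot becomes 1.
            p = M[pivot_row][col]
            M[pivot_row] = [x / p for x in M[pivot_row]]
            # Create zeros everywhere else in the pivot column.
            for r in range(rows):
                if r != pivot_row and abs(M[r][col]) > tol:
                    f = M[r][col]
                    M[r] = [x - f * y for x, y in zip(M[r], M[pivot_row])]
            pivot_row += 1
            if pivot_row == rows:
                break
        return M

    # The augmented matrix of Problem 1 below:
    A = [[1, -2, 4, 12], [2, -1, 5, 18], [-1, 3, -3, -8]]
    print(rref(A))   # [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 3.0]]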
Problem 1: Solve the following system of linear equations.

𝑥1 − 2𝑥2 + 4𝑥3 = 12
2𝑥1 − 𝑥2 + 5𝑥3 = 18

−𝑥1 + 3𝑥2 − 3𝑥3 = −8


Solution:

Step 1:
Start with the augmented matrix and use the first row
to create zeros in the first column (This corresponds to
using the first equation to eliminate 𝑥1 from the second and
third equations).

 1 2 4 12   1 2 4 12 
 2 1 5 18  R 2  2 R1 0 3 3 6 
     
 1 3 3 8 R3  R1 0 1 1 4 

Step 2:
Next, multiply row 2 by 1/3 to make the (2,2) element
1. (This corresponds to making the coefficient of 𝑥2 in the
second equation 1.)

≈ ((1/3)R2)

[ 1 −2  4 | 12 ]
[ 0  1 −1 | −2 ]
[ 0  1  1 |  4 ]
Step 3:
Create zeros in the second column as follows. (This
corresponds to using the second equation to eliminate 𝑥2
from the first and third equations.)

≈ (R1 + 2R2, R3 + (−1)R2)

[ 1  0  2 |  8 ]
[ 0  1 −1 | −2 ]
[ 0  0  2 |  6 ]

Step 4:
Multiply row 3 by 1/2. (This corresponds to making
the coefficient of 𝑥3 in the third equation 1.)

≈ ((1/2)R3)

[ 1  0  2 |  8 ]
[ 0  1 −1 | −2 ]
[ 0  0  1 |  3 ]

Finally, create zeros in the third column. (This corresponds
to using the third equation to eliminate 𝑥3 from the first
and second equations.)

≈ (R1 + (−2)R3, R2 + R3)

[ 1 0 0 | 2 ]
[ 0 1 0 | 1 ]
[ 0 0 1 | 3 ]

This matrix corresponds to the system

𝑥1 = 2
𝑥2 = 1
𝑥3 = 3

The solution is 𝑥1 = 2, 𝑥2 = 1, 𝑥3 = 3.
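
As a quick numerical cross-check (an illustration, not part of the module), the same square system can be handed to NumPy, assuming the numpy package is available:

    import numpy as np

    A = np.array([[1, -2, 4], [2, -1, 5], [-1, 3, -3]], dtype=float)
    b = np.array([12, 18, -8], dtype=float)
    print(np.linalg.solve(A, b))   # [2. 1. 3.], i.e. x1 = 2, x2 = 1, x3 = 3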
Problem 2: Solve, if possible, the system of equations.

2𝑥3 − 2𝑥4 = 2
3𝑥1 + 3𝑥2 − 3𝑥3 + 9𝑥4 = 12

4𝑥1 + 4𝑥2 − 2𝑥3 + 11𝑥4 = 12


Solution:

Step 1:
We interchange rows, if necessary, to bring a nonzero
element to the top of the first nonzero column. This
nonzero element is called a pivot.

[ 0 0  2 −2 |  2 ]
[ 3 3 −3  9 | 12 ]
[ 4 4 −2 11 | 12 ]

≈ (R1 ↔ R2; the pivot is the 3 in position (1,1))

[ 3 3 −3  9 | 12 ]
[ 0 0  2 −2 |  2 ]
[ 4 4 −2 11 | 12 ]

Step 2:
Create a 1 in the pivot location by multiplying the
pivot row by 1/pivot.

≈ ((1/3)R1)

[ 1 1 −1  3 |  4 ]
[ 0 0  2 −2 |  2 ]
[ 4 4 −2 11 | 12 ]
3
Step 3:
Create zeros elsewhere in the pivot column by
adding suitable multiples of the pivot row to all other rows of
the matrix.

≈ (R3 + (−4)R1)

[ 1 1 −1  3 |  4 ]
[ 0 0  2 −2 |  2 ]
[ 0 0  2 −1 | −4 ]

Step 4:
Cover the pivot row and all rows above it. Repeat
steps 1 and 2 for the remaining submatrix. Repeat step 3
for the whole matrix. Continue until the reduced echelon
form is reached.

[ 1 1 −1  3 |  4 ]
[ 0 0  2 −2 |  2 ]   (the pivot is now the 2 in position (2,3))
[ 0 0  2 −1 | −4 ]

≈ ((1/2)R2)

[ 1 1 −1  3 |  4 ]
[ 0 0  1 −1 |  1 ]
[ 0 0  2 −1 | −4 ]

≈ (R1 + R2, R3 + (−2)R2)

[ 1 1 0  2 |  5 ]
[ 0 0 1 −1 |  1 ]
[ 0 0 0  1 | −6 ]

The pivot is now the 1 in position (3,4).

≈ (R1 + (−2)R3, R2 + R3)

[ 1 1 0 0 | 17 ]
[ 0 0 1 0 | −5 ]
[ 0 0 0 1 | −6 ]

This matrix is the reduced echelon form of the given
matrix. The corresponding system of equations is

𝑥1 + 𝑥2 = 17
𝑥3 = −5
𝑥4 = −6

Let us assign an arbitrary value 𝑟 to 𝑥1. The general
solution to the system of equations is

𝑥1 = 𝑟, 𝑥2 = 17 − 𝑟, 𝑥3 = −5, and 𝑥4 = −6.

As 𝑟 ranges over the set of real numbers we get many
solutions.
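
For systems with free variables such as this one, an exact-arithmetic check is possible with SymPy's rref (a sketch, assuming the sympy package is available):

    from sympy import Matrix

    M = Matrix([[0, 0, 2, -2, 2],
                [3, 3, -3, 9, 12],
                [4, 4, -2, 11, 12]])
    R, pivots = M.rref()
    print(R)        # rows (1 1 0 0 17), (0 0 1 0 -5), (0 0 0 1 -6)
    print(pivots)   # (0, 2, 3): column 1 carries no pivot, so one of
                    # x1, x2 in x1 + x2 = 17 may be chosen freely (the parameter r)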
Problem 3: Solve, if possible, the system of equations

3𝑥1 − 3𝑥2 + 3𝑥3 = 9

2𝑥1 − 𝑥2 + 4𝑥3 = 7

3𝑥1 − 5𝑥2 − 𝑥3 = 7
Solution: Start with the augmented matrix and follow the
Gauss-Jordan algorithm.

[ 3 −3  3 | 9 ]
[ 2 −1  4 | 7 ]
[ 3 −5 −1 | 7 ]

≈ ((1/3)R1)

[ 1 −1  1 | 3 ]
[ 2 −1  4 | 7 ]
[ 3 −5 −1 | 7 ]

≈ (R2 + (−2)R1, R3 + (−3)R1)

[ 1 −1  1 |  3 ]
[ 0  1  2 |  1 ]
[ 0 −2 −4 | −2 ]

≈ (R1 + R2, R3 + 2R2)

[ 1 0 3 | 4 ]
[ 0 1 2 | 1 ]
[ 0 0 0 | 0 ]

We have arrived at the reduced echelon form. The
corresponding system of equations is

𝑥1 + 3𝑥3 = 4
𝑥2 + 2𝑥3 = 1

There are many values of 𝑥1, 𝑥2 and 𝑥3 that satisfy these
equations. This is a system of equations that has many
solutions. 𝑥1 is called the leading variable of the first
equation and 𝑥2 is the leading variable of the second
equation. To express these many solutions, we write the
leading variables in each equation in terms of the
remaining variables. We get

𝑥1 = −3𝑥3 + 4
𝑥2 = −2𝑥3 + 1

Let us assign the arbitrary value 𝑟 to 𝑥3. The general
solution to the system is

𝑥1 = −3𝑟 + 4, 𝑥2 = −2𝑟 + 1, 𝑥3 = 𝑟

As 𝑟 ranges over the set of real numbers we get many
solutions. 𝑟 is called a parameter. We can get specific
solutions by giving 𝑟 different values. For example,

𝑟 = 1 gives 𝑥1 = 1, 𝑥2 = −1, 𝑥3 = 1

𝑟 = −2 gives 𝑥1 = 10, 𝑥2 = 5, 𝑥3 = −2.


Problem 4: Solve, if possible, the system of equations.

𝑥1 − 𝑥2 + 2𝑥3 = 3
2𝑥1 − 2𝑥2 + 5𝑥3 = 4

𝑥1 + 2𝑥2 − 𝑥3 = −3
2𝑥2 + 2𝑥3 = 1

Solution: Starting with the augmented matrix we get

[ 1 −1  2 |  3 ]
[ 2 −2  5 |  4 ]
[ 1  2 −1 | −3 ]
[ 0  2  2 |  1 ]

≈ (R2 + (−2)R1, R3 + (−1)R1)

[ 1 −1  2 |  3 ]
[ 0  0  1 | −2 ]
[ 0  3 −3 | −6 ]
[ 0  2  2 |  1 ]

≈ (R2 ↔ R3)

[ 1 −1  2 |  3 ]
[ 0  3 −3 | −6 ]
[ 0  0  1 | −2 ]
[ 0  2  2 |  1 ]

≈ ((1/3)R2)

[ 1 −1  2 |  3 ]
[ 0  1 −1 | −2 ]
[ 0  0  1 | −2 ]
[ 0  2  2 |  1 ]

≈ (R1 + R2, R4 + (−2)R2)

[ 1 0  1 |  1 ]
[ 0 1 −1 | −2 ]
[ 0 0  1 | −2 ]
[ 0 0  4 |  5 ]

≈ (R1 + (−1)R3, R2 + R3, R4 + (−4)R3)

[ 1 0 0 |  3 ]
[ 0 1 0 | −4 ]
[ 0 0 1 | −2 ]
[ 0 0 0 | 13 ]

≈ ((1/13)R4)

[ 1 0 0 |  3 ]
[ 0 1 0 | −4 ]
[ 0 0 1 | −2 ]
[ 0 0 0 |  1 ]
This matrix is still not in reduced echelon form; zeros still
have to be created above the 1 in the last row. However, in
such a situation, when the last nonzero row of a matrix is of
the form (0 0 … 0 1), there is no need to proceed further.
The system has no solution. To see this, let us write down
the equation that corresponds to the last row of the matrix.

0𝑥1 + 0𝑥2 + 0𝑥3 = 1

This equation cannot be satisfied by any values
of 𝑥1, 𝑥2 and 𝑥3. Thus the system of equations has no solution.
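
Inconsistency of this kind can also be detected programmatically: after reduction, a pivot in the right-hand-side column signals a row of the form (0 0 … 0 1). A sketch using SymPy (assuming sympy is available):

    from sympy import Matrix

    M = Matrix([[1, -1, 2, 3],
                [2, -2, 5, 4],
                [1, 2, -1, -3],
                [0, 2, 2, 1]])
    R, pivots = M.rref()
    print(M.cols - 1 in pivots)   # True -> the system has no solution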
Exercise:
1. Determine the matrix of coefficients and augmented
matrix of each of the following systems of equations.

(a) 𝑥1 + 3𝑥2 = 7
2𝑥1 − 5𝑥2 = −3

(b) −𝑥1 + 3𝑥2 − 5𝑥3 = −3


2𝑥1 − 2𝑥2 + 4𝑥3 = 8

𝑥1 + 3𝑥2 = 6

(c) 5𝑥1 + 2𝑥2 − 4𝑥3 = 8

4𝑥2 + 3𝑥3 = 0
𝑥1 − 𝑥3 = 6

2. Interpret the following matrices as augmented matrices
of systems of equations. Write down each system of
equations.
7 9 8 
(a) 6 4 3
 

8 7 5 1
(b) 4 6 2 4 
 
9 3 7 6 
0 2 4 
(c) 5 7 3 
 
6 0 8 

1 2 1 6 
(d) 0 1 4 5 
 
0 0 1 2 

3. In the following exercises you are given a matrix followed
by an elementary row operation. Determine each resulting
matrix.

(a) [ 2 −6 4 0 ]
    [ 1  2 3 6 ]   (1/2)R1
    [−8  3 2 5 ]

(b) [ 1  2 3  1 ]
    [−1  1 7 −1 ]   R2 + R1
    [ 2 −4 5  3 ]   R3 + (−2)R1

(c) [ 1 0  4 −3 ]   R1 + (−4)R3
    [ 0 1 −3  2 ]   R2 + 3R3
    [ 0 0  1  5 ]

4. The following systems of equations all have unique
solutions. Solve these systems using the method of Gauss-
Jordan elimination with matrices.

(a) 2𝑥2 + 4𝑥3 = 8

2𝑥1 + 2𝑥2 = 6
𝑥1 + 𝑥2 + 𝑥3 = 5
(b) 𝑥1 + 2𝑥2 + 3𝑥3 = 14
2𝑥1 + 5𝑥2 + 8𝑥3 = 36

𝑥1 − 𝑥2 = −4

(c) 2𝑥1 + 2𝑥2 − 4𝑥3 = 14


3𝑥1 + 𝑥2 + 𝑥3 = 8

2𝑥1 − 𝑥2 + 2𝑥3 = −1

5. Determine whether the following matrices are in reduced
echelon form. If a matrix is not in reduced echelon form,
give a reason.
1 0 2 
(a) 0 1 3 
 

1 2 5 6 
(b) 0 1 3 7 
 

1 0 0 
(c) 0 1 0 
 
0 0 1

1 0 0 3 2 
(d) 0 2 0 6 1 
 
0 0 1 2 3 
6. Determine whether the following matrices are in reduced
echelon form. If a matrix is not in reduced echelon form,
give a reason.
1 0 3 2 
(a) 0 0 1 8 
 
0 1 4 9 

1 5 0 2 0
0 0 1 9 0 
(b) 
0 0 0 0 1
 
0 0 0 0 0

1 0 2 0 3
0 0 0 0 0 
(c) 
0 1 2 0 7
 
0 0 0 1 3

1 0 0 5 3 
(d) 0 0 1 0 3 
 
0 1 2 3 7 

1 5 3 0 7 
(e) 0 0 0 1 4 
 
0 0 0 0 0 

7. Solve (if possible) each of the following systems of
equations using the method of Gauss-Jordan elimination.

(a) 𝑥1 +2𝑥2 − 𝑥3 − 𝑥4 = 0
𝑥1 + 2𝑥2 + 𝑥4 = 4

−𝑥1 −2𝑥2 + 2𝑥3 + 4𝑥4 = 5


(b) 𝑥2 − 3𝑥3 + 𝑥4 = 0
𝑥1 + 𝑥2 − 𝑥3 + 4𝑥4 = 0

−2𝑥1 − 2𝑥2 + 2𝑥3 − 8𝑥4 = 0

(c) 𝑥1 − 𝑥2 − 2𝑥3 = 7
2𝑥1 −2𝑥2 + 2𝑥3 − 4𝑥4 = 12

−𝑥1 + 𝑥2 − 𝑥3 + 2𝑥4 = −4

−3𝑥1 + 𝑥2 − 8𝑥3 − 10𝑥4 = −29

Answers:
1. (a) [ 1  3 ]   (b) [ −1  3 −5 ]   (c) [ 5 2 −4 ]
       [ 2 −5 ]       [  2 −2  4 ]       [ 0 4  3 ]
                      [  1  3  0 ]       [ 1 0 −1 ]

2. (a) 7𝑥1 + 9𝑥2 = 8


6𝑥1 + 4𝑥2 = −3
(b) 8𝑥1 + 7𝑥2 + 5𝑥3 = −1

4𝑥1 + 6𝑥2 + 2𝑥3 = 4


9𝑥1 + 3𝑥2 + 7𝑥3 = 6

(c) −2𝑥2 = 4
5𝑥1 + 7𝑥2 = −3
6𝑥1 =8

(d) 𝑥1 + 2𝑥2 − 𝑥3 = 6
𝑥2 + 4𝑥3 = 5
𝑥3 = −2
1 3 2 0  1 2 3 1 1 0 0 23
3. (a) 1 2 3 6  (b) 0 3 10 0  (c) 0 1 0 17 
     
8 3 2 5  0 8 1 1 0 0 1 5 

4. (a) 𝑥1 = 3, 𝑥2 = 0, 𝑥3 = 2 (b) 𝑥1 = 0, 𝑥2 = 4, 𝑥3 = 2

(c) 𝑥1 = 2, 𝑥2 = 3, 𝑥3 = −1
5. (a) Yes (b) No (c) Yes (d) No

6. (a) No (b) Yes (c) No (d) No (e) Yes

7. (a) 𝑥1 = 3 − 2𝑟, 𝑥2 = 𝑟, 𝑥3 = 2, 𝑎𝑛𝑑 𝑥4 = 1

(b) 𝑥1 = −2𝑟 − 3𝑠, 𝑥2 = 3𝑟 − 𝑠, 𝑥3 = 𝑟, 𝑥4 = 𝑠

(c) No solution
1.2
Vector Space ℝ𝑛

Introduction to vectors
The locations of points in a plane are usually discussed in
terms of a coordinate system. For example, in the figure below,
the location of each point in the plane can be described
using a rectangular coordinate system. The point 𝐴 is the
point (5,3).

Furthermore, 𝐴 is a certain distance in a certain direction
from the origin (0,0). The distance and direction are
characterized by the length and direction of the line
segment from the origin 𝑂 to 𝐴. We call such a directed line
segment a position vector and denote it by 𝑂𝐴. 𝑂 is called
the initial point of 𝑂𝐴, and 𝐴 is called the terminal point.
There are thus two ways of interpreting (5,3); it defines the
location of a point in a plane, and it also defines the
position vector 𝑂𝐴.

EXAMPLE 1:

Sketch the position vectors 𝑂𝐴 = (4,1), 𝑂𝐵 = (−5,−2), and
𝑂𝐶 = (−3,4). See the figure below.

Denote the collection of all ordered pairs of real numbers
by ℝ2. Note the significance of “ordered” here; for example,
(5,3) is not the same vector as (3,5). The order is significant.
These concepts can be extended to arrays consisting of
three real numbers, such as (2,4,3), which can be
interpreted in two ways: as the location of a point in three-
space relative to an 𝑥𝑦𝑧 coordinate system, or as a position
vector. These interpretations are illustrated in the figure
below. We shall denote the set of all ordered triples of real
numbers by ℝ3.
We now generalize these concepts with the following
definition.
We now generalize these concepts with the following
definition.
DEFINITION: Let (𝑢1, 𝑢2, … , 𝑢𝑛) be a sequence of 𝑛 real
numbers. The set of all such sequences is called n-space
and is denoted ℝ𝑛.
𝑢1 is the first component of (𝑢1, 𝑢2, … , 𝑢𝑛), 𝑢2 is the second
component, and so on.
EXAMPLE 2: ℝ4 is the collection of all sets of four ordered
real numbers. For example, (1,2,3,4) and (−1, 3/4, 0, 5) are
elements of ℝ4.

ℝ5 is the collection of all sets of five ordered real
numbers. For example, (−1, 2, 0, 7/8, 9) is in this collection.
DEFINITION: Let 𝑢 = (𝑢1, … , 𝑢𝑛) and 𝑣 = (𝑣1, … , 𝑣𝑛) be two
elements of ℝ𝑛. We say that 𝑢 and 𝑣 are equal if 𝑢1 =
𝑣1, … , 𝑢𝑛 = 𝑣𝑛. Thus two elements of ℝ𝑛 are equal if their
corresponding components are equal.

Let us now develop the algebraic structure for ℝ𝑛.

DEFINITION: Let 𝑢 = (𝑢1, … , 𝑢𝑛) and 𝑣 = (𝑣1, … , 𝑣𝑛) be
elements of ℝ𝑛 and let c be a scalar. Addition and scalar
multiplication are performed as follows.
Addition: 𝑢 + 𝑣 = (𝑢1 + 𝑣1, … , 𝑢𝑛 + 𝑣𝑛)
Scalar multiplication: 𝑐𝑢 = (𝑐𝑢1, … , 𝑐𝑢𝑛)
To add two elements of ℝ𝑛, we add corresponding
components. To multiply an element of ℝ𝑛 by a scalar, we
multiply every component by that scalar. Observe that the
resulting elements are in ℝ𝑛. We say that ℝ𝑛 is closed under
addition and under scalar multiplication.
ℝ𝑛 with the operations of componentwise addition and
scalar multiplication is an example of a vector space, and
its elements are called vectors.
We shall henceforth in this course interpret ℝ𝑛 to be a
vector space.
We now give example to illustrate geometrical
interpretations of these vectors and their operations.
EXAMPLE 3: This example gives a geometrical
interpretation of vector addition. Consider the sum of the
vectors (4,1) and (2,3). We get

(4,1) + (2,3) = (6,4)

In the figure below, we interpret these vectors as position
vectors. Construct the parallelogram having the vectors
(4,1) and (2,3) as adjacent sides. The vector (6,4), the sum,
will be the diagonal of the parallelogram.

In general, if 𝑢 and 𝑣 are vectors in the same vector space,
then 𝑢 + 𝑣 is the diagonal of the parallelogram defined by 𝑢
and 𝑣. See the figure below. This way of visualizing vector
addition is useful in all vector spaces.
EXAMPLE 4: This example gives a geometrical
interpretation of scalar multiplication. Consider the scalar
multiple of the vector (3,2) by 2. We get

2(3,2) = (6,4)

Observe in the figure below that (6,4) is a vector in the
same direction as (3,2), and 2 times it in length.

The direction will depend upon the sign of the scalar. The
general result is as follows. Let 𝑢 be a vector and 𝑐 a scalar.
The direction of 𝑐𝑢 will be the same as the direction of 𝑢 if
𝑐 > 0, and the opposite direction to 𝑢 if 𝑐 < 0. The length of 𝑐𝑢
is |𝑐| times the length of 𝑢. See the figure below.

Zero vector:
The vector (0,0, … ,0), having 𝑛 zero components, is called the
zero vector of ℝ𝒏 and is denoted by 𝟎. For example, (0,0,0)
is the zero vector of ℝ3. We shall find that zero vectors play a
central role in the development of vector spaces.
Negative Vector:
The vector (−1)𝒖 is written −𝒖 and is called the negative of
𝒖. It is a vector that has the same magnitude as 𝒖, but lies
in the opposite direction to 𝒖.
Subtraction:
Subtraction is performed on elements of ℝ𝒏 by subtracting
corresponding components. For example, in ℝ3,

(5,3,−6) − (2,1,3) = (3,2,−9)

Observe that this is equivalent to

(5,3,−6) + (−1)(2,1,3) = (3,2,−9)

Thus subtraction is not a new operation on ℝ𝑛, but a
combination of addition and scalar multiplication by −1.
We have only two independent operations on ℝ𝑛, namely
addition and scalar multiplication.
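
These componentwise operations are easy to mirror in code; the following small Python illustration (ours, not the module's) reproduces the subtraction example above:

    def add(u, v):
        return tuple(a + b for a, b in zip(u, v))

    def scale(c, u):
        return tuple(c * a for a in u)

    u, v = (5, 3, -6), (2, 1, 3)
    print(add(u, scale(-1, v)))   # (3, 2, -9): u - v as addition plus scaling by -1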
We now discuss some of the properties of vector addition
and scalar multiplication. The properties are similar to
those of matrices.

THEOREM:
Let 𝒖, 𝒗 and 𝒘 be vectors in ℝ𝑛 and let 𝑐 and 𝑑 be scalars.
(a) 𝒖 + 𝒗 = 𝒗 + 𝒖
(b) 𝒖 + (𝒗 + 𝒘) = (𝒖 + 𝒗) + 𝒘
(c) 𝒖 + 𝟎 = 𝟎 + 𝒖 = 𝒖
(d) 𝒖 + (−𝒖) = 𝟎
(e) 𝒄 (𝒖 + 𝒗) = 𝒄𝒖 + 𝒄𝒗
(f) (𝒄 + 𝒅) 𝒖 = 𝒄𝒖 + 𝒅𝒖
(g) 𝒄 (𝒅𝒖) = (𝒄𝒅)𝒖
(h) 𝟏𝒖 = 𝒖
These results are verified by writing the vectors in terms of
components and using the definitions of vector addition
and scalar multiplication, and the properties of real
numbers. We give the proofs of (a) and (e).
𝒖 + 𝒗 = 𝒗 + 𝒖:
Let 𝑢 = (𝑢1, … , 𝑢𝑛) and 𝑣 = (𝑣1, … , 𝑣𝑛). Then
𝑢 + 𝑣 = (𝑢1, … , 𝑢𝑛) + (𝑣1, … , 𝑣𝑛)
= (𝑢1 + 𝑣1, … , 𝑢𝑛 + 𝑣𝑛)
= (𝑣1 + 𝑢1, … , 𝑣𝑛 + 𝑢𝑛)
= 𝑣 + 𝑢
𝒄(𝒖 + 𝒗) = 𝒄𝒖 + 𝒄𝒗:
𝑐(𝒖 + 𝒗) = 𝑐((𝑢1, … , 𝑢𝑛) + (𝑣1, … , 𝑣𝑛))
= 𝑐(𝑢1 + 𝑣1, … , 𝑢𝑛 + 𝑣𝑛)
= (𝑐(𝑢1 + 𝑣1), … , 𝑐(𝑢𝑛 + 𝑣𝑛))
= (𝑐𝑢1 + 𝑐𝑣1, … , 𝑐𝑢𝑛 + 𝑐𝑣𝑛)
= (𝑐𝑢1, … , 𝑐𝑢𝑛) + (𝑐𝑣1, … , 𝑐𝑣𝑛)
= 𝑐(𝑢1, … , 𝑢𝑛) + 𝑐(𝑣1, … , 𝑣𝑛)
= 𝑐𝑢 + 𝑐𝑣
Some of the above properties can be illustrated
geometrically. The commutative property of vector addition
is illustrated in the figure below. Note that we get the same
diagonal to the parallelogram whether we add the vectors
in the order 𝑢 + 𝑣 or in the order 𝑣 + 𝑢. One implication of
part (b) above is that we can write certain algebraic
expressions involving vectors, without parentheses.

General Vector Space

In this section, we generalize the concept of the vector
space ℝ𝑛. We examine the underlying algebraic structure of
ℝ𝑛. Any set with this structure has the same mathematical
properties and will be called a vector space. The results
that were developed for the vector space ℝ𝑛 will also apply
to such sets. We will, for example, find that certain spaces of
functions have the same mathematical properties as the
vector space ℝ𝑛. Similarly, the scalar set has the algebraic
structure of the real number set and will be called a field.
Precise definitions are as follows.
Definition: Let 𝐹 be a set having at least two elements 0𝐹
and 1𝐹 (0𝐹 ≠ 1𝐹), together with two operations ‘·’ (multiplication)
and ‘+’ (addition). A field (𝐹, +, ·) is a triplet satisfying the
following axioms:
For any three elements 𝑎, 𝑏, 𝑐 ∈ 𝐹:
1) Addition and multiplication are closed:
𝑎 + 𝑏 ∈ 𝐹 and 𝑎𝑏 ∈ 𝐹.
2) Addition and multiplication are associative:
(𝑎 + 𝑏) + 𝑐 = 𝑎 + (𝑏 + 𝑐), (𝑎𝑏)𝑐 = 𝑎(𝑏𝑐)

3) Addition and multiplication are commutative:
𝑎 + 𝑏 = 𝑏 + 𝑎, 𝑎𝑏 = 𝑏𝑎
4) Multiplication distributes over addition:
𝑎(𝑏 + 𝑐) = 𝑎𝑏 + 𝑎𝑐
5) 0𝐹 is the additive identity:
0𝐹 + 𝑎 = 𝑎 + 0𝐹 = 𝑎
6) 1𝐹 is the multiplicative identity:
1𝐹 𝑎 = 𝑎 1𝐹 = 𝑎
7) Every element has an additive inverse:
∃ −𝑎 ∈ 𝐹 such that 𝑎 + (−𝑎) = (−𝑎) + 𝑎 = 0𝐹
8) Every non-zero element has a multiplicative inverse:
If 𝑎 ≠ 0𝐹, then ∃ 𝑎⁻¹ ∈ 𝐹 such that 𝑎𝑎⁻¹ = 𝑎⁻¹𝑎 = 1𝐹
Example: (ℚ, +, ·), (ℝ, +, ·) and (ℂ, +, ·) are all fields.
(ℤ, +, ·) is not a field because no element other than
−1 and 1 has a multiplicative inverse in ℤ.
Definition: A vector space 𝑉(𝐹) over a field 𝐹 is a non-
empty set, whose elements are called vectors, possessing
two operations ‘+’ (vector addition) and ‘·’ (scalar
multiplication) which satisfy the following axioms:

For 𝑎, 𝑏 and 𝑐 ∈ 𝑉(𝐹) and 𝛼, 𝛽 ∈ 𝐹:


1) Vector addition and scalar multiplication are closed:

𝑎 + 𝑏 ∈ 𝑉(𝐹), 𝛼𝑎 ∈ 𝑉(𝐹).
2) Commutativity:

𝑎 + 𝑏 = 𝑏 + 𝑎
3) Associativity:

(𝑎 + 𝑏) + 𝑐 = 𝑎 + (𝑏 + 𝑐)

4) Existence of an additive identity:

∃ 0 ∈ 𝑉(𝐹) such that 𝑎 + 0 = 0 + 𝑎 = 𝑎
5) Existence of additive inverses:

∃ −𝑎 ∈ 𝑉(𝐹) such that 𝑎 + (−𝑎) = (−𝑎) + 𝑎 = 0
6) Distributive laws:

𝛼(𝑎 + 𝑏) = 𝛼𝑎 + 𝛼𝑏, (𝛼 + 𝛽)𝑎 = 𝛼𝑎 + 𝛽𝑎

7) 1𝐹 𝑎 = 𝑎
8) (𝛼𝛽)𝑎 = 𝛼(𝛽𝑎).
Note: Throughout this course we use the field of real
numbers as the scalar set. We may refer to a vector space
𝑉(𝐹) simply as 𝑉.
Example:
1) The set of all 2×2 matrices with real entries is
a vector space over the field of real numbers under the usual
addition and scalar multiplication of matrices.
2) The set of all functions having the real numbers as their
domain is a vector space over the field of real numbers
under the following operations:
(𝒇 + 𝒈)(𝒙) = 𝒇(𝒙) + 𝒈(𝒙)
(𝒄𝒇)(𝒙) = 𝒄 · 𝒇(𝒙)
for all functions 𝑓 and 𝑔 and scalars 𝑐.
We now give a theorem that contains useful properties of
vectors. These are properties that were immediately
apparent for ℝ𝑛 and were taken almost for granted. They
are not, however, so apparent for all vector spaces.
Theorem: Let 𝑉 be a vector space, 𝑣 a vector in 𝑉, 𝟎 the
zero vector of 𝑉, 𝑐 a scalar, and 0 the zero scalar. Then

(a) 0𝑣 = 𝟎
(b) 𝑐𝟎 = 𝟎
(c) (−1)𝑣 = −𝑣
(d) If 𝑐𝑣 = 𝟎, then either 𝑐 = 0 or 𝑣 = 𝟎
Proof: (a) 0𝑣 + 0𝑣 = (0 + 0)𝑣 = 0𝑣
Add the negative of 0𝑣, namely −0𝑣, to both sides of this
equation.

(0𝑣 + 0𝑣) + (−0𝑣) = 0𝑣 + (−0𝑣)

⇒ 0𝑣 + (0𝑣 + (−0𝑣)) = 𝟎

⇒ 0𝑣 + 𝟎 = 𝟎

⇒ 0𝑣 = 𝟎

(b) 𝑐𝟎 = 𝑐(𝟎 + 𝟎)

⇒ 𝑐𝟎 = 𝑐𝟎 + 𝑐𝟎

⇒ 𝑐𝟎 = 𝟎.
(c) (−1)𝑣 + 𝑣 = (−1)𝑣 + 1𝑣
= (−1 + 1)𝑣
= 0𝑣 = 𝟎
Thus (−1)𝑣 is the additive inverse of 𝑣,
i.e. (−1)𝑣 = −𝑣
(d) Assume that 𝑐 ≠ 0𝐹.

Then ∃ 𝑐⁻¹ such that 𝑐⁻¹𝑐 = 1𝐹

𝑐𝑣 = 𝟎 ⇒ 𝑐⁻¹(𝑐𝑣) = 𝑐⁻¹𝟎

⇒ (𝑐⁻¹𝑐)𝑣 = 𝟎

⇒ 1𝐹 𝑣 = 𝟎

⇒ 𝑣 = 𝟎.

SUBSPACES
Definition: Let 𝑉(𝐹) be a vector space. A non-empty subset
𝑈 ⊆ 𝑉 which is also a vector space under the inherited
operations of 𝑉 is called a vector subspace of 𝑉.

Example: {𝟎} and 𝑉 are trivial vector subspaces of 𝑉.

Theorem: Let 𝑉(𝐹) be a vector space. Then 𝑈 ⊆ 𝑉, 𝑈 ≠ ∅, is a
subspace of 𝑉 if and only if for all 𝛼 ∈ 𝐹 and 𝑎, 𝑏 ∈ 𝑈 we
have 𝑎 + 𝛼𝑏 ∈ 𝑈.
Proof: Let 𝑉(𝐹) be a vector space and let 𝑈 be a non-empty
subset of 𝑉. If 𝑈 is a subspace of 𝑉, then it is clear that
𝑎 + 𝛼𝑏 ∈ 𝑈 for all 𝛼 ∈ 𝐹 and 𝑎, 𝑏 ∈ 𝑈.
Conversely, suppose that 𝑈 is a non-empty subset of 𝑉 and
that for all 𝛼 ∈ 𝐹 and 𝑎, 𝑏 ∈ 𝑈 we have 𝑎 + 𝛼𝑏 ∈ 𝑈.
We prove that 𝑈 is a subspace of 𝑉; that is, 𝑈 is a vector
space under the inherited operations of 𝑉.
Taking 𝑎 = 𝑏 = 𝑣 and 𝛼 = −1 gives 0 = 𝑣 − 𝑣 ∈ 𝑈; taking
𝑎 = 0 then gives 𝛼𝑣 ∈ 𝑈 for every 𝑣 ∈ 𝑈, and taking 𝛼 = 1
gives 𝑎 + 𝑏 ∈ 𝑈. Thus vector addition and scalar
multiplication are closed on 𝑈; they are commutative and
associative because they are so in 𝑉.
All other properties hold as they hold in 𝑉.
Therefore 𝑈 is a vector space with the same binary
operations as on 𝑉, and hence 𝑈 is a subspace of 𝑉.
Theorem: Let X⊆ 𝑉, 𝑌 ⊆ 𝑉 be vector subspaces of a vector
space 𝑉(𝐹). Then their intersection 𝑋 ∩ 𝑌 is also a vector
subspace of 𝑉.
Proof: Follows from the above Theorem.
Problem 1: Let 𝑢 = (−1,4,3,7) and 𝑣 = (−2,−3,1,0) be vectors
in ℝ4. Find 𝑢 + 𝑣 and 3𝑢.

Solution: We get

𝑢 + 𝑣 = (−1,4,3,7) + (−2,−3,1,0) = (−3,1,4,7)

3𝑢 = 3(−1,4,3,7) = (−3,12,9,21)

Note that the resulting vector under each
operation is in the original vector space ℝ4.
Problem 2: Let 𝑢 = (2,5,−3), 𝑣 = (−4,1,9), 𝑤 = (4,0,2).
Determine the vector 2𝑢 − 3𝑣 + 𝑤.

Solution: 2𝑢 − 3𝑣 + 𝑤 = 2(2,5,−3) − 3(−4,1,9) + (4,0,2)

= (4,10,−6) − (−12,3,27) + (4,0,2)

= (4 + 12 + 4, 10 − 3 + 0, −6 − 27 + 2)

= (20,7,−31).
Problem 3: Let ℂ denote the complex numbers and ℝ
denote the real numbers. Is ℂ a vector space over ℝ under
ordinary addition and multiplication? Is ℝ a vector space
over ℂ?

Solution: ℂ is a vector space over ℝ, but ℝ is not a vector
space over ℂ, since ℝ is not closed under scalar
multiplication by elements of ℂ.
Problem 4: Let 𝑉(𝐹) be a vector space, and let 𝑈1 ⊆ 𝑉 and
𝑈2 ⊆ 𝑉 be vector subspaces. Prove that if 𝑈1 ∪ 𝑈2 is a vector
subspace of 𝑉, then either 𝑈1 ⊆ 𝑈2 or 𝑈2 ⊆ 𝑈1.
Solution: If 𝑈1 ⊆ 𝑈2 or 𝑈2 ⊆ 𝑈1, then it is trivial that 𝑈1 ∪ 𝑈2
is a subspace of 𝑉.

Suppose that 𝑈1 ⊈ 𝑈2 and 𝑈2 ⊈ 𝑈1.

Then ∃ 𝑢1 ∈ 𝑈1 \ 𝑈2 and 𝑢2 ∈ 𝑈2 \ 𝑈1. Consider 𝑢1 + 𝑢2.
Then 𝑢1 + 𝑢2 can be in neither 𝑈1 nor 𝑈2.

(If 𝑢1 + 𝑢2 ∈ 𝑈1, then (𝑢1 + 𝑢2) − 𝑢1 = 𝑢2 ∈ 𝑈1, a contradiction; similarly for 𝑈2.)

Therefore 𝑈1 ∪ 𝑈2 is not closed with respect to vector
addition. Hence 𝑈1 ∪ 𝑈2 is not a subspace of 𝑉.
Problem 5: Prove that 𝑋 = {(𝑎, 𝑏, 𝑐, 𝑑) ∈ ℝ4 : 𝑎 − 𝑏 − 3𝑑 = 0} is a
vector subspace of ℝ4.

Solution: Let (𝑎1, 𝑏1, 𝑐1, 𝑑1), (𝑎2, 𝑏2, 𝑐2, 𝑑2) ∈ 𝑋 and 𝛼 ∈ ℝ. Then

(𝑎1, 𝑏1, 𝑐1, 𝑑1) + 𝛼(𝑎2, 𝑏2, 𝑐2, 𝑑2) = (𝑎1 + 𝛼𝑎2, 𝑏1 + 𝛼𝑏2, 𝑐1 + 𝛼𝑐2, 𝑑1 + 𝛼𝑑2),

and

(𝑎1 + 𝛼𝑎2) − (𝑏1 + 𝛼𝑏2) − 3(𝑑1 + 𝛼𝑑2) = (𝑎1 − 𝑏1 − 3𝑑1) + 𝛼(𝑎2 − 𝑏2 − 3𝑑2) = 0 + 𝛼 · 0 = 0,

so the result again lies in 𝑋. By the subspace criterion, 𝑋 is a subspace of ℝ4.
Problem 6: Prove that 𝑋 = {(𝑎, 2𝑎 − 3𝑏, 5𝑏, 𝑎 + 2𝑏, 𝑎) : 𝑎, 𝑏 ∈ ℝ}
is a vector subspace of ℝ5.

Solution:
(𝑎1, 2𝑎1 − 3𝑏1, 5𝑏1, 𝑎1 + 2𝑏1, 𝑎1) + 𝛼(𝑎2, 2𝑎2 − 3𝑏2, 5𝑏2, 𝑎2 + 2𝑏2, 𝑎2)
= (𝑎1 + 𝛼𝑎2, 2(𝑎1 + 𝛼𝑎2) − 3(𝑏1 + 𝛼𝑏2), 5(𝑏1 + 𝛼𝑏2), (𝑎1 + 𝛼𝑎2) + 2(𝑏1 + 𝛼𝑏2), 𝑎1 + 𝛼𝑎2),

which is again of the required form, with 𝑎 = 𝑎1 + 𝛼𝑎2 and
𝑏 = 𝑏1 + 𝛼𝑏2. By the subspace criterion, 𝑋 is a subspace of ℝ5.
Exercise
1. Compute the following vector expressions for 𝑢 = (1,2),
𝑣 = (4,−1), and 𝑤 = (−3,5).
(a) 𝑢 + 3𝑣

(b) 2𝑢 + 3𝑣 − 𝑤

(c) −3𝑢 + 4𝑣 − 2𝑤
2. Prove that the set ℂ𝑛, with the operations of addition and
scalar multiplication defined as follows, is a vector space:

(𝑢1, … , 𝑢𝑛) + (𝑣1, … , 𝑣𝑛) = (𝑢1 + 𝑣1, … , 𝑢𝑛 + 𝑣𝑛)

𝑐(𝑢1, … , 𝑢𝑛) = (𝑐𝑢1, … , 𝑐𝑢𝑛)

Determine 𝑢 + 𝑣 and 𝑐𝑢 for the following vectors and scalars
in ℂ2.

(a) 𝑢 = (2 − 𝑖, 3 + 4𝑖), 𝑣 = (5, 1 + 3𝑖), 𝑐 = 3 − 2𝑖.

(b) 𝑢 = (1 + 5𝑖, −2 − 3𝑖), 𝑣 = (2𝑖, 3 − 2𝑖), 𝑐 = 4 + 𝑖.

3. Let 𝑊 be the set of vectors of the form (𝑎, 𝑎², 𝑏). Show
that 𝑊 is not a subspace of ℝ3.

4. Prove that the set 𝑈 of 2 × 2 diagonal matrices is a
subspace of the vector space 𝑀22 of 2 × 2 matrices.

5. Let 𝑃𝑛 denote the set of real polynomial functions of
degree ≤ 𝑛. Prove that 𝑃𝑛 is a vector space if addition and
scalar multiplication are defined on polynomials in a
pointwise manner.

6. Let 𝑊 be the set of vectors of the form (𝑎, 𝑎, 𝑎 + 2). Show
that 𝑊 is not a subspace of ℝ3.

7. Consider the sets of vectors of the following form. Prove
that the sets are subspaces of ℝ3.

(a) (𝑎, 𝑏, 0)

(b) (𝑎, 2𝑎, 𝑏)

(c) (𝑎, 𝑎 + 𝑏, 3𝑎)

8. Are the following sets subspaces of ℝ3? The set of all
vectors of the form (𝑎, 𝑏, 𝑐), where

(a) 𝑎 + 𝑏 + 𝑐 = 0

(b) 𝑎𝑏 = 0

(c) 𝑎𝑏 = 𝑎𝑐.

9. Prove that the following sets are not subspaces of ℝ3.
The set of all vectors of the form

(a) (𝑎, 𝑎 + 1, 𝑏)

(b) (𝑎, 𝑏, 𝑎 + 𝑏 − 4).

10. Let 𝑈 be the set of all vectors of the form (𝑎, 𝑏, 𝑐) and 𝑉
be the set of all vectors of the form (𝑎, 𝑎 + 𝑏, 𝑐). Show that 𝑈
and 𝑉 are the same set. Is this set a subspace of ℝ3?

Answers
1. (a) (13,−1) (b) (17,−4) (c) (19,−20)
2. (a) 𝑢 + 𝑣 = (7 − 𝑖, 4 + 7𝑖), 𝑐𝑢 = (4 − 7𝑖, 17 + 6𝑖)

3. 𝑊 is not a subspace.

4. It is a vector space of matrices, embedded in 𝑀22.

5. Vector space

6. Not a subspace

7. (a) The set is the 𝑥𝑦 plane.

(b) The set is the plane given by the equation 𝑦 = 2𝑥.

(c) The set is the plane given by the equation 𝑧 = 3𝑥.

8. (a) Subspace (b) Not a subspace (c) Not a subspace

9. (a) If (𝑎, 𝑎 + 1, 𝑏) = (0,0,0) then 𝑎 = 𝑎 + 1 = 𝑏 = 0. 𝑎 = 𝑎 + 1
is impossible. That is, the set does not contain the zero
vector.

(b) If (𝑎, 𝑏, 𝑎 + 𝑏 − 4) = (0,0,0), then 𝑎 = 𝑏 = 0 and −4 = 0,
which is not possible. That is, the set does not contain the
zero vector.
10. Yes.
1.3
Linear Combination of Vectors

Observe that any vector (𝑎, 𝑏, 𝑐) in the vector space ℝ3 can be
written as

(𝑎, 𝑏, 𝑐) = 𝑎(1,0,0) + 𝑏(0,1,0) + 𝑐(0,0,1)

The vectors (1,0,0), (0,1,0) and (0,0,1) in some sense
characterize the vector space ℝ3. We pursue this approach
to understanding vector spaces in terms of certain vectors
that represent the whole space.
Definition: Let 𝑣1, 𝑣2, … , 𝑣𝑚 be vectors in a vector space 𝑉.
We say that 𝑣, a vector in 𝑉, is a linear combination of
𝑣1, 𝑣2, … , 𝑣𝑚 if there exist scalars 𝑐1, 𝑐2, … , 𝑐𝑚 such that 𝑣
can be written as

𝑣 = 𝑐1𝑣1 + 𝑐2𝑣2 + ⋯ + 𝑐𝑚𝑣𝑚
Example: The vector (5,4,2) is a linear combination of the
vectors (1,2,0), (3,1,4) and (1,0,3), since it can be written as
(5,4,2) = (1,2,0) + 2(3,1,4) − 2(1,0,3).
DEFINITION: The vectors 𝑣1, 𝑣2, … , 𝑣𝑚 are said to span
a vector space if every vector in the space can be expressed
as a linear combination of these vectors.
A spanning set of vectors in a sense defines the vector
space, since every vector in the space can be obtained from
this set.
We have developed the mathematics for looking at a
vector space in terms of a set of vectors that spans the
space. It is also useful to be able to do the converse,
namely to use a set of vectors to generate a vector space.
THEOREM: Let 𝑣1 , 𝑣2 , … , 𝑣𝑚 be vectors in a vector space 𝑉.
Let 𝑈 be the set consisting of all linear combinations of
𝑣1 , 𝑣2 , … , 𝑣𝑚 . Then 𝑈 is a subspace of 𝑉 spanned by the
vectors 𝑣1 , 𝑣2 , … , 𝑣𝑚 . 𝑈 is said to be the vector space
generated by 𝑣1 , 𝑣2 , … , 𝑣𝑚 .
Proof: Let 𝑢1 = 𝑎1 𝑣1 + ⋯ + 𝑎𝑚 𝑣𝑚 and 𝑢2 = 𝑏1 𝑣1 + ⋯ + 𝑏𝑚 𝑣𝑚
be arbitrary elements of 𝑈. Then
𝑢1 + 𝑢2 = (𝑎1𝑣1 + ⋯ + 𝑎𝑚𝑣𝑚) + (𝑏1𝑣1 + ⋯ + 𝑏𝑚𝑣𝑚)
= (𝑎1 + 𝑏1)𝑣1 + ⋯ + (𝑎𝑚 + 𝑏𝑚)𝑣𝑚.
𝑢1 + 𝑢2 is a linear combination of 𝑣1, 𝑣2, … , 𝑣𝑚. Thus 𝑢1 + 𝑢2
is in 𝑈; 𝑈 is closed under vector addition.
Let 𝑐 be an arbitrary scalar. Then
𝑐𝑢1 = 𝑐(𝑎1𝑣1 + ⋯ + 𝑎𝑚𝑣𝑚) = 𝑐𝑎1𝑣1 + ⋯ + 𝑐𝑎𝑚𝑣𝑚
𝑐𝑢1 is a linear combination of 𝑣1, 𝑣2, … , 𝑣𝑚. Therefore 𝑐𝑢1
is in 𝑈; 𝑈 is closed under scalar multiplication. Thus 𝑈 is a
subspace of 𝑉.
By the definition of 𝑈, every vector in 𝑈 can be written
as a linear combination of 𝑣1 , 𝑣2 , … , 𝑣𝑚 . Thus 𝑣1 , 𝑣2 , … , 𝑣𝑚
span 𝑈.
Problem 1: Determine whether or not the vector (−1,1,5) is
a linear combination of the vectors (1,2,3), (0,1,4) and (2,3,6).
Solution: We examine the identity
𝐶1(1,2,3) + 𝐶2(0,1,4) + 𝐶3(2,3,6) = (−1,1,5)

Can we find scalars 𝐶1, 𝐶2 and 𝐶3 such that this identity
holds?

Using the operations of addition and scalar multiplication
we get

(𝐶1 + 2𝐶3, 2𝐶1 + 𝐶2 + 3𝐶3, 3𝐶1 + 4𝐶2 + 6𝐶3) = (−1,1,5)


Equating components leads to the following system of
linear equations.

𝐶1 + 2𝐶3 = −1

2𝐶1 + 𝐶2 + 3𝐶3 = 1

3𝐶1 + 4 𝐶2 + 6𝐶3 = 5
It can be shown that this system of equations has the
unique solution.

𝐶1 = 1, 𝐶2 = 2, 𝐶3 = −1.

Thus the vector (−1,1,5) is the following linear
combination of the vectors (1,2,3), (0,1,4) and (2,3,6):
(−1,1,5) = (1,2,3) + 2(0,1,4) − 1(2,3,6).
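
Finding such coefficients is exactly solving a linear system whose columns are the given vectors; a numerical sketch of Problem 1 (assuming NumPy is available):

    import numpy as np

    A = np.column_stack([(1, 2, 3), (0, 1, 4), (2, 3, 6)])   # vectors as columns
    v = np.array([-1, 1, 5], dtype=float)
    print(np.linalg.solve(A, v))   # [ 1.  2. -1.], i.e. C1 = 1, C2 = 2, C3 = -1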
Problem 2: Express the vector (4,5,5) as a linear
combination of the vectors (1,2,3), (−1,1,4) and (3,3,2)

Solution: Examine the following identity for values of 𝐶1,
𝐶2 and 𝐶3:
𝐶1(1,2,3) + 𝐶2(−1,1,4) + 𝐶3(3,3,2) = (4,5,5)

We get (𝐶1 − 𝐶2 + 3𝐶3, 2𝐶1 + 𝐶2 + 3𝐶3, 3𝐶1 + 4𝐶2 + 2𝐶3) = (4,5,5)
Equating components leads to the following system of
linear equations.
𝐶1 − 𝐶2 + 3𝐶3 = 4

2𝐶1 + 𝐶2 + 3𝐶3 = 5

3𝐶1 + 4 𝐶2 + 2𝐶3 = 5
This system of equations has many solutions,

𝐶1 = −2𝑟 + 3, 𝐶2 = 𝑟 − 1, 𝐶3 = 𝑟

Thus the vector can be expressed in many ways as a linear


combination of the vectors (1,2,3), (−1,1,4) and (3,3,2)
( 4,5,5) = (−2𝑟 + 3) (1,2,3) + (𝑟 − 1) (−1,1,4) + 𝑟(3,3,2)
For example,

𝑟 = 3 gives ( 4,5,5) = −3 (1,2,3) + 2(−1,1,4) + 3(3,3,2)

𝑟 = −1 gives ( 4,5,5) = 5 (1,2,3) − 2(−1,1,4) − (3,3,2).


Problem 3: Show that the vector (3,−4,−6) cannot be
expressed as a linear combination of the vectors
(1,2,3), (−1,−1,−2) and (1,4,5).
Solution: Consider the identity

𝐶1(1,2,3) + 𝐶2(−1,−1,−2) + 𝐶3(1,4,5) = (3,−4,−6)

This identity leads to the following system of linear
equations.

𝐶1 − 𝐶2 + 𝐶3 = 3
2𝐶1 − 𝐶2 + 4𝐶3 = −4

3𝐶1 − 2𝐶2 + 5𝐶3 = −6

This system has no solution. Thus (3,−4,−6) is not a linear
combination of the vectors

(1,2,3), (−1,−1,−2) and (1,4,5).


Problem 4: Show that the vectors (1,2,0), (0,1, −1) and
(1, 1,2) span ℝ3 .

Solution: Let (𝑥, 𝑦, 𝑧) be an arbitrary element of ℝ3 .

We have to determine whether we can write (𝑥, 𝑦, 𝑧) =


𝐶1 (1,2,0) + 𝐶2 (0,1, −1) + 𝐶3 (1, 1,2).
Multiply and add the vectors to get

(𝑥, 𝑦, 𝑧) = ( 𝐶1 + 𝐶3 , 2𝐶1 + 𝐶2 + 𝐶3 , −𝐶2 + 2𝐶3 )

Thus, 𝐶1 + 𝐶3 = 𝑥
2𝐶1 + 𝐶2 + 𝐶3 = 𝑦
−𝐶2 + 2𝐶3 = 𝑧

This system of equations in the variables 𝐶1, 𝐶2 and 𝐶3 is
solved by the method of Gauss-Jordan elimination. It is
found to have the solution

𝐶1 = 3𝑥 − 𝑦 − 𝑧,

𝐶2 = −4𝑥 + 2𝑦 + 𝑧,

𝐶3 = −2𝑥 + 𝑦 + 𝑧.

We can write an arbitrary vector of ℝ3 as a linear


combination of these vectors as follows.

(𝑥, 𝑦, 𝑧) = (3𝑥 − 𝑦 − 𝑧) (1,2,0) + (−4𝑥 + 2𝑦 + 𝑧) (0,1, −1)


+ (−2𝑥 + 𝑦 + 𝑧) (1, 1,2).

The vectors (1,2,0), (0,1, −1) and (1, 1,2) span ℝ3 .
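
A rank computation gives the same conclusion mechanically: n vectors span ℝ3 exactly when the matrix having them as columns has rank 3. A sketch (assuming NumPy is available):

    import numpy as np

    A = np.column_stack([(1, 2, 0), (0, 1, -1), (1, 1, 2)]).astype(float)
    print(np.linalg.matrix_rank(A) == 3)   # True -> the three vectors span R^3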


Problem 5: Let 𝑣1 and 𝑣2 span a subspace 𝑈 of a vector
space 𝑉. Let 𝑘1 and 𝑘2 be non-zero scalars. Show that 𝑘1 𝑣1
and 𝑘2 𝑣2 also span 𝑈.
Solution: Let 𝑣 be a vector in 𝑈.

Since 𝑣1 and 𝑣2 span 𝑈, there exist scalars 𝑎 and 𝑏 such
that

𝑣 = 𝑎𝑣1 + 𝑏𝑣2

We can write

𝑣 = (𝑎/𝑘1)(𝑘1𝑣1) + (𝑏/𝑘2)(𝑘2𝑣2)

Thus the vectors 𝑘1 𝑣1 and 𝑘2 𝑣2 span 𝑈.


Problem 6: Let 𝑈 be the subspace of ℝ3 generated by the
vectors (1,2,0) and (−3,1,2). Let 𝑉 be the subspace of ℝ3
generated by the vectors (−1,5,2) and (4,1,−2). Show that
𝑈 = 𝑉.

Solution: Let 𝑢 be a vector in 𝑈. Let us show that 𝑢 is in 𝑉.

Since 𝑢 is in 𝑈, there exist scalars 𝑎 and 𝑏 such that

𝑢 = 𝑎(1,2,0) + 𝑏(−3,1,2)

= (𝑎 − 3𝑏, 2𝑎 + 𝑏, 2𝑏)

Let us see if we can write 𝑢 as a linear combination of
(−1,5,2) and (4,1,−2):

𝑢 = 𝑝(−1,5,2) + 𝑞(4,1,−2)

= (−𝑝 + 4𝑞, 5𝑝 + 𝑞, 2𝑝 − 2𝑞)

Such 𝑝 and 𝑞 would have to satisfy

−𝑝 + 4𝑞 = 𝑎 − 3𝑏

5𝑝 + 𝑞 = 2𝑎 + 𝑏

2𝑝 − 2𝑞 = 2𝑏.

This system of equations has the unique solution
𝑝 = (𝑎 + 𝑏)/3, 𝑞 = (𝑎 − 2𝑏)/3.

Thus 𝑢 can be written as

𝑢 = ((𝑎 + 𝑏)/3)(−1,5,2) + ((𝑎 − 2𝑏)/3)(4,1,−2).

Therefore 𝑢 is a vector in 𝑉. Conversely, let 𝑣 be a vector in
𝑉. Similarly to the above, we can show that 𝑣 is in 𝑈.
Therefore 𝑈 = 𝑉.
Exercise
1. Let 𝑈 be the vector space generated by the functions
𝑓(𝑥) = 𝑥 + 1 and 𝑔(𝑥) = 2𝑥² − 2𝑥 + 3. Show that the function
ℎ(𝑥) = 6𝑥² − 10𝑥 + 5 lies in 𝑈.

2. In the following sets of vectors, determine whether the
first vector is a linear combination of the other vectors.

(a) (−3,3,7); (1,−1,2), (2,1,0), (−1,2,1)

(b) (0,10,8); (−1,2,3), (1,3,1), (1,8,5)

3. Determine whether the following vectors span ℝ3.

(a) (2,1,0), (−1,3,1), (4,5,0)

(b) (1,2,1), (−1,3,0), (0,5,1)

4. Give three other vectors in the subspace of ℝ3 generated
by the vectors (1,2,3), (1,2,0).

5. Let 𝑈 be the subspace of ℝ3 generated by the vectors
(3,−1,2) and (1,0,4). Let 𝑉 be the subspace of ℝ3 generated
by the vectors (4,−1,6) and (1,−1,−6). Show that 𝑈 = 𝑉.

6. In each of the following, determine whether the first
function is a linear combination of the functions that
follow:

(a) 𝑓(𝑥) = 3𝑥² + 2𝑥 + 9; 𝑔(𝑥) = 𝑥² + 1, ℎ(𝑥) = 𝑥 + 3

(b) 𝑓(𝑥) = 𝑥² + 4𝑥 + 5; 𝑔(𝑥) = 𝑥² + 𝑥 − 1, ℎ(𝑥) = 𝑥² + 2𝑥 + 1

7. Let 𝑣, 𝑣1 and 𝑣2 be vectors in a vector space 𝑉. Let 𝑣 be
a linear combination of 𝑣1 and 𝑣2. If 𝑐1 and 𝑐2 are nonzero
scalars, show that 𝑣 is also a linear combination of 𝑐1𝑣1 and
𝑐2𝑣2.

Answers
2. (a) (−3,3,7) = 2(1,−1,2) − (2,1,0) + 3(−1,2,1)
(b) (0,10,8) = (2 − 𝑐)(−1,2,3) + (2 − 2𝑐)(1,3,1) + 𝑐(1,8,5),
where 𝑐 is any real number

3. (a) Span (b) Do not span

4. e.g., (1,2,3) + (1,2,0) = (2,4,3), (1,2,3) − (1,2,0) = (0,0,3),
2(1,2,3) = (2,4,6).
1.4
Linear Dependence and Independence
In this module, we continue the development of vector
space structure. We introduce concepts of dependence and
independence of vectors. These will be useful tools in
constructing “efficient” spanning sets for vector spaces–sets
in which there are no redundant vectors.
Let us motivate the idea of dependence of vectors. Observe
that the vector (4, −1,0) is a linear combination of the
vectors (2,1,3) and (0,1,2) since it can be written as
(4, −1,0) = 2(2,1,3) − 3(0,1,2).
The above equation can be rewritten in a number of ways.
Each vector can be expressed in terms of the other vectors.
(2,1,3) = (1/2) (4, −1,0) + (3/2) (0,1,2)

(0,1,2) = (2/3) (2,1,3) − (1/3) (4, −1,0).

Each of the three vectors is, in fact, dependent on the other
two vectors. We express this by writing

(4,−1,0) − 2(2,1,3) + 3(0,1,2) = (0,0,0).


This concept of dependence of vectors is made precise with
the following definition.

DEFINITION: (a) The set of vectors {𝑣1, … , 𝑣𝑚} in a vector
space 𝑉 is said to be linearly dependent if there exist
scalars 𝑐1, … , 𝑐𝑚, not all zero, such that 𝑐1𝑣1 + ⋯ + 𝑐𝑚𝑣𝑚 = 0.
(b) The set of vectors {𝑣1, … , 𝑣𝑚} is linearly independent if
𝑐1𝑣1 + ⋯ + 𝑐𝑚𝑣𝑚 = 0 can only be satisfied when 𝑐1 =
0, … , 𝑐𝑚 = 0.
We now present an important result that relates the
concepts of linear dependence and linear combination.
THEOREM: A set consisting of two or more vectors in a
vector space is linearly dependent if and only if it is
possible to express one of the vectors as a linear
combination of the other vectors.

Proof: Let the set {𝑣1, 𝑣2, … , 𝑣𝑚} be linearly dependent.

Therefore, there exist scalars 𝑐1, 𝑐2, … , 𝑐𝑚, not all zero, such
that

𝑐1𝑣1 + 𝑐2𝑣2 + ⋯ + 𝑐𝑚𝑣𝑚 = 0

Assume that 𝑐1 ≠ 0.
The above identity can be rewritten as

𝑣1 = (−𝑐2/𝑐1)𝑣2 + ⋯ + (−𝑐𝑚/𝑐1)𝑣𝑚

Thus, 𝑣1 is a linear combination of 𝑣2, … , 𝑣𝑚. Conversely,
assume that 𝑣1 is a linear combination of 𝑣2, … , 𝑣𝑚.

Therefore there exist scalars 𝑑2, … , 𝑑𝑚 such that

𝑣1 = 𝑑2𝑣2 + ⋯ + 𝑑𝑚𝑣𝑚

⇒ 1𝑣1 + (−𝑑2)𝑣2 + ⋯ + (−𝑑𝑚)𝑣𝑚 = 0.

Thus the set {𝑣1, 𝑣2, … , 𝑣𝑚} is linearly dependent,
completing the proof.
THEOREM: Let 𝑉 be a vector space. Any set of vectors in 𝑉
that contains the zero vector is linearly dependent.

Proof: Consider the set {0, 𝑣2, … , 𝑣𝑚}, which contains the zero
vector. Let us examine the identity

𝑐1 0 + 𝑐2𝑣2 + ⋯ + 𝑐𝑚𝑣𝑚 = 0.

We see that the identity is true for 𝑐1 = 1, 𝑐2 = 0, … , 𝑐𝑚 = 0
(not all zero). Thus the set of vectors is linearly dependent,
proving the theorem.

THEOREM: Let the set {𝑣1 , … , 𝑣𝑚 } be linearly dependent in


a vector space 𝑉. Any set of vectors in 𝑉 that contains these
vectors will also be linearly dependent.

Proof: Since the set {𝑣1, … , 𝑣𝑚} is linearly dependent, there
exist scalars 𝑐1, … , 𝑐𝑚, not all zero, such that 𝑐1𝑣1 + ⋯ +
𝑐𝑚𝑣𝑚 = 0.

Consider any set of vectors {𝑣1, … , 𝑣𝑚, 𝑣𝑚+1, … , 𝑣𝑛} that
contains the given vectors.

There are scalars, not all zero, namely 𝑐1, 𝑐2, … , 𝑐𝑚, 0, … , 0,
such that

𝑐1𝑣1 + ⋯ + 𝑐𝑚𝑣𝑚 + 0𝑣𝑚+1 + ⋯ + 0𝑣𝑛 = 0.

Thus the set {𝑣1, … , 𝑣𝑚, 𝑣𝑚+1, … , 𝑣𝑛} is linearly dependent.


Problem 1: Show that the set {(1,2,3), (−2,1,1), (8,6,10)} is
linearly dependent in ℝ3 .

Solution: Let us examine the identity
𝐶1(1,2,3) + 𝐶2(−2,1,1) + 𝐶3(8,6,10) = 0. We want to show that at least
one of the 𝐶’s can be nonzero. We get
(𝐶1 − 2𝐶2 + 8𝐶3, 2𝐶1 + 𝐶2 + 6𝐶3, 3𝐶1 + 𝐶2 + 10𝐶3) = (0,0,0).
Equating each component of this vector to zero gives the
system of equations.

𝐶1 − 2𝐶2 + 8𝐶3 = 0

2𝐶1 + 𝐶2 + 6𝐶3 = 0
3𝐶1 + 𝐶2 + 10𝐶3 = 0

This system has the solution 𝐶1 = 4, 𝐶2 = −2, 𝐶3 = −1. Since
at least one of the 𝐶’s is nonzero, the set of vectors is
linearly dependent.
The linear dependence is expressed by the equation

4(1,2,3) − 2(−2,1,1) − (8,6,10) = 0.


Problem 2: Show that the set {(3,−2,2), (3,−1,4), (1,0,5)} is
linearly independent in ℝ3.

Solution: We examine the identity 𝐶1(3,−2,2) + 𝐶2(3,−1,4) +
𝐶3(1,0,5) = 0.

We want to show that this identity can hold only if
𝐶1, 𝐶2 and 𝐶3 are all zero. We get

(3𝐶1 + 3𝐶2 + 𝐶3, −2𝐶1 − 𝐶2, 2𝐶1 + 4𝐶2 + 5𝐶3) = 0.

Equating the components to zero gives

3𝐶1 + 3𝐶2 + 𝐶3 = 0

−2𝐶1 − 𝐶2 = 0
2𝐶1 + 4𝐶2 + 5𝐶3 = 0

This system has the unique solution 𝐶1 = 0, 𝐶2 = 0, 𝐶3 = 0.
Thus the set of vectors is linearly independent.
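
The same conclusion can be read off from a determinant: three vectors in ℝ3 are linearly independent exactly when the matrix with these vectors as columns has nonzero determinant. A sketch for Problem 2 (assuming NumPy is available):

    import numpy as np

    A = np.column_stack([(3, -2, 2), (3, -1, 4), (1, 0, 5)]).astype(float)
    print(np.linalg.det(A))   # approximately 9 (nonzero) -> linearly independent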
Problem 3: Let the set {𝑣1 , 𝑣2 } be linearly independent.
Prove that {𝑣1 + 𝑣2 , 𝑣1 − 𝑣2 } is also linearly independent.
Solution: Let us examine the identity

𝑎(𝑣1 + 𝑣2) + 𝑏(𝑣1 − 𝑣2) = 0 ……(1)

If we can show that this identity implies 𝑎 = 0 and 𝑏 = 0,
then {𝑣1 + 𝑣2, 𝑣1 − 𝑣2} will be linearly independent.
We get
𝑎𝑣1 + 𝑎𝑣2 + 𝑏𝑣1 − 𝑏𝑣2 = 0

⇒ (𝑎 + 𝑏)𝑣1 + (𝑎 − 𝑏)𝑣2 = 0

Since {𝑣1, 𝑣2} is linearly independent, 𝑎 + 𝑏 = 0 and 𝑎 − 𝑏 = 0.

This system has the unique solution 𝑎 = 0, 𝑏 = 0.


EXERCISE
1. Find values of ‘𝑡’ for which the following sets are
linearly dependent.

(a) {(−1, 2), (𝑡, −4)}

(b) {(2, −𝑡), (2𝑡 + 6,4𝑡)}.

2. Let the set {𝑣1 , 𝑣2 , 𝑣3 } be linearly dependent in a vector


space 𝑉 . Let ‘𝑐’ be a non-zero scalar. Prove that the
following sets are also linearly dependent.

(a) {𝑣1 , 𝑣1 + 𝑣2 , 𝑣3 }

(b) {𝑣1 , 𝑐𝑣2 , 𝑣3 }

(c) {𝑣1 , 𝑣1 + 𝑐𝑣2 , 𝑣3 } .


3. Same question as above replacing linearly dependent
with linearly independent.
4. Let a set ‘𝑆’ be linearly independent in a vector space
𝑉 . Prove that every subset of 𝑆 is also linearly
independent. Let 𝑃 be linearly dependent. Is every
subset of 𝑃 linearly dependent?
5. Let {𝑣1 , 𝑣2 } be linearly independent in a vector space 𝑉.
Show that if a vector 𝑣3 is not of the form 𝑎𝑣1 + 𝑏𝑣2 ,
then the set {𝑣1 , 𝑣2 , 𝑣3 } is linearly independent.
6. Prove that a set of two or more vectors in a vector
space is linearly independent if no vector in the set can
be expressed as a linear combination of the other
vectors.
Answers
1. (a) 2
(b) 0 and −7
1.5
BASES AND DIMENSION

DEFINITION: A finite set of vectors {𝑣1, … , 𝑣𝑚} is called a
basis for a vector space 𝑉 if the set spans 𝑉 and is linearly
independent.
Intuitively, a basis is an efficient set for characterizing a
vector space, in that any vector can be expressed as a
linear combination of the basis vectors, and the basis
vectors are independent of one another.
Example: The set of 𝑛 vectors {(1,0, … ,0), (0,1,0, … ,0), … ,
(0, … ,0,1)} is a basis for ℝ𝑛. This basis is called the standard
basis for ℝ𝑛.
THEOREM: Let {𝑣1, 𝑣2, … , 𝑣𝑛} be a basis for a
vector space 𝑉. If {𝜔1, 𝜔2, … , 𝜔𝑚} is a set of more than 𝑛
vectors in 𝑉, then this set is linearly dependent.
Proof: We examine the identity 𝑐1𝜔1 + ⋯ + 𝑐𝑚𝜔𝑚 = 0 …..(1)
We will show that values of 𝑐1, … , 𝑐𝑚, not all zero, exist
satisfying this identity, proving that the vectors are
linearly dependent.
Since the set 𝑣1 , 𝑣2 , … , 𝑣𝑛 is a basis for 𝑉, each of the
vectors 𝜔1 , 𝜔2 , … , 𝜔𝑚 can be expressed as a linear
combination of 𝑣1 , 𝑣2 , … . , 𝑣𝑛
Let 𝜔1 = 𝑎11𝑣1 + 𝑎12𝑣2 + ⋯ + 𝑎1𝑛𝑣𝑛
𝜔2 = 𝑎21𝑣1 + 𝑎22𝑣2 + ⋯ + 𝑎2𝑛𝑣𝑛
⋮
𝜔𝑚 = 𝑎𝑚1𝑣1 + 𝑎𝑚2𝑣2 + ⋯ + 𝑎𝑚𝑛𝑣𝑛
Substituting these values into (1) we get
𝑐1(𝑎11𝑣1 + 𝑎12𝑣2 + ⋯ + 𝑎1𝑛𝑣𝑛) + ⋯ + 𝑐𝑚(𝑎𝑚1𝑣1 + 𝑎𝑚2𝑣2 + ⋯ + 𝑎𝑚𝑛𝑣𝑛) = 0
Rearranging, we get
(𝑐1𝑎11 + 𝑐2𝑎21 + ⋯ + 𝑐𝑚𝑎𝑚1)𝑣1 + ⋯ + (𝑐1𝑎1𝑛 + 𝑐2𝑎2𝑛 + ⋯ + 𝑐𝑚𝑎𝑚𝑛)𝑣𝑛 = 0

Since 𝑣1, 𝑣2, … , 𝑣𝑛 are linearly independent, we get

𝑎11𝑐1 + 𝑎21𝑐2 + ⋯ + 𝑎𝑚1𝑐𝑚 = 0
⋮
𝑎1𝑛𝑐1 + 𝑎2𝑛𝑐2 + ⋯ + 𝑎𝑚𝑛𝑐𝑚 = 0
Thus finding 𝑐’s that satisfy equation (1) reduces to finding
solutions to this system of 𝑛 equations in 𝑚 variables.
Since 𝑚 > 𝑛, the number of variables is greater than the
number of equations. We know that such a system of
homogeneous equations has many solutions.
Therefore, there are non-zero values of the 𝑐’s that satisfy
equation (1). Thus the set {𝜔1, 𝜔2, … , 𝜔𝑚} is linearly
dependent.
THEOREM: Any two bases for a vector space 𝑉 consist of
the same number of vectors.
Proof: Let {𝑣1, 𝑣2, … , 𝑣𝑛} and {𝜔1, 𝜔2, … , 𝜔𝑚} be two bases for
𝑉. If we interpret {𝑣1, 𝑣2, … , 𝑣𝑛} as a basis for 𝑉 and
{𝜔1, 𝜔2, … , 𝜔𝑚} as a set of linearly independent vectors in 𝑉,
then the previous theorem tells us that 𝑚 ≤ 𝑛. Conversely, if
we interpret {𝜔1, 𝜔2, … , 𝜔𝑚} as a basis for 𝑉 and {𝑣1, 𝑣2, … , 𝑣𝑛}
as a set of linearly independent vectors in 𝑉, then 𝑛 ≤ 𝑚.
Thus 𝑛 = 𝑚, proving that both bases consist of the same
number of vectors.
DEFINITION: If a vector space 𝑉 has a basis consisting of
𝑛 vectors, then the dimension of 𝑉 is said to be 𝑛. We write
dim(𝑉) for the dimension of 𝑉.
EXAMPLE: The set of 𝑛 vectors {(1,0, … ,0), … , (0, … ,0,1)}
forms a basis (the standard basis) for ℝ𝑛. Thus the
dimension of ℝ𝑛 is 𝑛.
Note that we have defined a basis for a vector space to be
a finite set of vectors that spans the space and is linearly
independent. Such a set does not exist for all vector
spaces. When such a finite set exists, we say that the
vector space is finite dimensional. If such a finite set does
not exist, we say that the vector space is infinite
dimensional.
THEOREM: Let 𝑣1 , 𝑣2 , … , 𝑣𝑛 be a basis for a vector space 𝑉.
Then each vector in 𝑉 can be expressed uniquely as a
linear combination of these vectors.
Proof: Let ′𝑣′ be a vector in 𝑉. Since 𝑣1 , 𝑣2 , … , 𝑣𝑛 is a basis,
we can express 𝑣 as a linear combination of these vectors.
Suppose we can write
𝑣 = 𝑎1 𝑣1 + 𝑎2 𝑣2 + ⋯ + 𝑎𝑛 𝑣𝑛 and
𝑣 = 𝑏1 𝑣1 + 𝑏2 𝑣2 + ⋯ + 𝑏𝑛 𝑣𝑛 then
𝑎1 𝑣1 + 𝑎2 𝑣2 + ⋯ + 𝑎𝑛 𝑣𝑛 = 𝑏1 𝑣1 + 𝑏2 𝑣2 + ⋯ + 𝑏𝑛 𝑣𝑛
⟹ 𝑎1 − 𝑏1 𝑣1 + 𝑎2 − 𝑏2 𝑣2 + ⋯ + 𝑎𝑛 − 𝑏𝑛 𝑣𝑛 = 0
Since 𝑣1 , 𝑣2 , … , 𝑣𝑛 is a basis, the vectors 𝑣1 , 𝑣2 , … , 𝑣𝑛 are
linearly independent. Thus 𝑎1 − 𝑏1 = 0, … , 𝑎𝑛 − 𝑏𝑛 = 0
implying that 𝑎1 = 𝑏1 , … , 𝑎𝑛 = 𝑏𝑛
Therefore there is only one way of expressing 𝑣 as a linear
combination of the basis.
Lemma. Let S be a linearly independent subset of a vector
space V. Suppose β is a vector in V which is not in the
subspace spanned by S. Then the set obtained by adjoining
β to S is linearly independent.
Proof: Suppose 𝛼1, … , 𝛼𝑚 are distinct vectors in S and
that 𝑐1𝛼1 + ⋯ + 𝑐𝑚𝛼𝑚 + 𝑏𝛽 = 0.
Then 𝑏 = 0; for otherwise,

𝛽 = (−𝑐1/𝑏)𝛼1 + ⋯ + (−𝑐𝑚/𝑏)𝛼𝑚

and 𝛽 is in the subspace spanned by S. Thus 𝑐1𝛼1 + ⋯ +
𝑐𝑚𝛼𝑚 = 0, and since S is a linearly independent set, each
𝑐𝑖 = 0.
Theorem: If W is a subspace of finite-dimensional vector
space V, every linearly independent subset of W is finite
and is part of a basis for W.
Proof: Suppose S0 is a linearly independent subset of W. If
S is a linearly independent subset of W containing S0, then
S is also a linearly independent subset of V; since V is
finite-dimensional, S contains no more than dim V
elements.
We extend S0 to a basis for W as follows. If S0 spans
W, then S0 is a basis for W and we are done. If S0 does not
span W, we use the preceding lemma to find a vector β1 in
W such that the set 𝑆1 = 𝑆0 ∪ {β1} is independent. If S1
spans W, fine. If not, apply the lemma to obtain a vector β2
in W such that 𝑆2 = 𝑆1 ∪ {β2} is independent. If we continue
in this way, then (in not more than dim V steps) we reach a
set
𝑆𝑚 = 𝑆0 ∪ {β1, … , βm}
which is a basis for W.
Suppose that a vector space is known to be of dimension
𝑛. The following theorem tells us that we do not have to
check both the linear independence and the spanning condition
to see if a given set is a basis.
THEOREM: Let 𝑉 be a vector space of dimension 𝑛.
a) If 𝑆 = 𝑣1 , 𝑣2 , … , 𝑣𝑛 is a set of 𝑛 linearly independent
vectors in 𝑉, then 𝑆 is a basis for 𝑉.
b) If 𝑆 = 𝑣1 , 𝑣2 , … , 𝑣𝑛 is a set of 𝑛 vectors that spans 𝑉,
then 𝑆 is a basis for 𝑉.
Proof: (a) This part is clear from the above theorem and the fact
that every basis of 𝑉 contains 𝑛 elements.
(b) It is enough to show that 𝑆 is linearly independent.
Let {𝑢1, 𝑢2, … , 𝑢𝑛} be a basis of 𝑉. If we give a proof similar to
the first theorem of this material, using the fact that
a homogeneous system of linear equations with an equal
number of variables and equations has a unique
solution, we can prove that {𝑣1, 𝑣2, … , 𝑣𝑛} is linearly
independent.
THEOREM: If 𝑊1 and 𝑊2 are finite-dimensional subspaces
of a vector space 𝑉, then 𝑊1 + 𝑊2 is finite-dimensional and

dim 𝑊1 + dim 𝑊2 = dim(𝑊1 ∩ 𝑊2) + dim(𝑊1 + 𝑊2)

Proof: By Theorem 5 and its corollaries, 𝑊1 ∩ 𝑊2 has a finite
basis {𝛼1, … , 𝛼𝑘} which is part of a basis

{𝛼1, … , 𝛼𝑘, 𝛽1, … , 𝛽𝑚} for 𝑊1

and part of a basis

{𝛼1, … , 𝛼𝑘, 𝛾1, … , 𝛾𝑛} for 𝑊2.

The subspace 𝑊1 + 𝑊2 is spanned by the vectors
𝛼1, … , 𝛼𝑘, 𝛽1, … , 𝛽𝑚, 𝛾1, … , 𝛾𝑛, and these vectors form an
independent set. For suppose

∑ 𝑥𝑖𝛼𝑖 + ∑ 𝑦𝑗𝛽𝑗 + ∑ 𝑧𝑟𝛾𝑟 = 0.

Then

−∑ 𝑧𝑟𝛾𝑟 = ∑ 𝑥𝑖𝛼𝑖 + ∑ 𝑦𝑗𝛽𝑗

which shows that ∑ 𝑧𝑟𝛾𝑟 belongs to 𝑊1. As ∑ 𝑧𝑟𝛾𝑟 also
belongs to 𝑊2, it follows that

∑ 𝑧𝑟𝛾𝑟 = ∑ 𝑐𝑖𝛼𝑖

for certain scalars 𝑐1, … , 𝑐𝑘. Because the set
{𝛼1, … , 𝛼𝑘, 𝛾1, … , 𝛾𝑛}
is independent, each of the scalars 𝑧𝑟 = 0. Thus,

∑ 𝑥𝑖𝛼𝑖 + ∑ 𝑦𝑗𝛽𝑗 = 0

and since
{𝛼1, … , 𝛼𝑘, 𝛽1, … , 𝛽𝑚}
is also an independent set, each 𝑥𝑖 = 0 and each 𝑦𝑗 = 0.
Thus {𝛼1, … , 𝛼𝑘, 𝛽1, … , 𝛽𝑚, 𝛾1, … , 𝛾𝑛}
is a basis for 𝑊1 + 𝑊2. Finally,

dim 𝑊1 + dim 𝑊2 = (𝑘 + 𝑚) + (𝑘 + 𝑛)
= 𝑘 + (𝑚 + 𝑘 + 𝑛)
= dim(𝑊1 ∩ 𝑊2) + dim(𝑊1 + 𝑊2).
Problem 1: Show that the set {(1,0,−1), (1,1,1), (1,2,4)} is a
basis for ℝ3.

Solution: Let us first show that the set spans ℝ3.

Let (𝑥1, 𝑥2, 𝑥3) be an arbitrary element of ℝ3.

We try to find scalars 𝑎1, 𝑎2, 𝑎3 such that (𝑥1, 𝑥2, 𝑥3) =
𝑎1(1,0,−1) + 𝑎2(1,1,1) + 𝑎3(1,2,4). This identity leads to the
system of equations

𝑎1 + 𝑎2 + 𝑎3 = 𝑥1

𝑎2 + 2𝑎3 = 𝑥2

−𝑎1 + 𝑎2 + 4𝑎3 = 𝑥3
This system of equations has the solution

𝑎1 = 2𝑥1 − 3𝑥2 + 𝑥3

𝑎2 = −2𝑥1 + 5𝑥2 − 2𝑥3

𝑎3 = 𝑥1 − 2𝑥2 + 𝑥3
Thus the set spans the space. We now show that the set is
linearly independent.

Consider the identity

𝑏1(1,0,−1) + 𝑏2(1,1,1) + 𝑏3(1,2,4) = (0,0,0)

This identity leads to the system of equations

𝑏1 + 𝑏2 + 𝑏3 = 0

𝑏2 + 2𝑏3 = 0

−𝑏1 + 𝑏2 + 4𝑏3 = 0
This system has the unique solution 𝑏1 = 0, 𝑏2 = 0, and 𝑏3 =
0. Thus the set is linearly independent.

Therefore {(1,0,−1), (1,1,1), (1,2,4)} forms a basis for ℝ3.
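
The coordinate formulas found above are the rows of the inverse of the matrix whose columns are the basis vectors; a numerical cross-check (assuming NumPy is available):

    import numpy as np

    B = np.column_stack([(1, 0, -1), (1, 1, 1), (1, 2, 4)]).astype(float)
    print(np.linalg.inv(B))
    # [[ 2. -3.  1.]    a1 = 2x1 - 3x2 + x3
    #  [-2.  5. -2.]    a2 = -2x1 + 5x2 - 2x3
    #  [ 1. -2.  1.]]   a3 = x1 - 2x2 + x3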


Problem 2: Prove that the set {(1,3,−1), (2,1,0), (4,2,1)} is a
basis for ℝ3.

Solution: The dimension of ℝ3 is three. Thus a basis of
ℝ3 consists of three vectors. We have the correct number of
vectors for a basis.

Normally, we would have to show that this set is linearly
independent and that it spans ℝ3.

Since ℝ3 is a finite-dimensional vector space, we need to
check only one of these two conditions. Let us check for
linear independence. We get 𝑐1(1,3,−1) + 𝑐2(2,1,0) +
𝑐3(4,2,1) = (0,0,0). This identity leads to the system of
equations

𝑐1 + 2𝑐2 + 4𝑐3 = 0
3𝑐1 + 𝑐2 + 2𝑐3 = 0

−𝑐1 + 𝑐3 = 0

This system has the unique solution 𝑐1 = 0, 𝑐2 = 0, 𝑐3 = 0.

Thus the vectors are linearly independent. The set
{(1,3,−1), (2,1,0), (4,2,1)} is therefore a basis for ℝ3.
Problem 3: State (with a brief explanation) whether the
following statements are true or false.

(a) The vectors (1, 2), (−1, 3), (5, 2) are linearly dependent in
ℝ2.

(b) The vectors (1, 0, 0), (0, 2, 0), (1, 2, 0) span ℝ3.

(c) {(1, 0, 2), (0, 1, -3)} is a basis for the subspace of ℝ3


consisting of vectors of the form (a, b, 2a-3b).

(d) Any set of two vectors can be used to generate a two-


dimensional subspace of ℝ3.
Solution:

(a) True: The dimension of ℝ2 is two. Thus any three


vectors are linearly dependent.

(b) False: The three vectors are linearly dependent. Thus


they cannot span a three-dimensional space.

(c) True: The vectors span the subspace since

(a, b, 2a-3b) = a(1, 0, 2) + b(0, 1, -3)


The vectors are also linearly independent since they are not
collinear.

(d) False: The two vectors must be linearly independent.


Exercise
1. Prove that the subspace of ℝ3 generated by the vectors
(−1,2,1), (2,−1,0), and (1,4,3) is a two-dimensional
subspace of ℝ3, and give a basis for this subspace.
2. Find a basis for ℝ3 that includes the vectors (1,1,1) and
(1,0,−2).
3. Determine a basis for each of the following subspaces
of ℝ3. Give the dimension of each subspace.
a) The set of vectors of the form (𝑎, 𝑎, 𝑏).
b) The set of vectors of the form (𝑎, 𝑏, 𝑎 + 𝑏).
c) The set of vectors of the form (𝑎, 𝑏, 𝑐), where 𝑎 + 𝑏 + 𝑐 =
0.
4. Which of the following sets of vectors are bases for ℝ2?

(a) {(3, 1), (2, 1)} (b) {(1, −3), (−2, 6)}

5. Which of the following sets are bases for ℝ3?

(a) {(1, −1, 2), (2, 0, 1), (3, 0, 0)}

(b) {(2, 1, 0), (−1, 1, 1), (3, 3, 1)}

6. Prove that the vector (1, 2, -1) lies in the two


dimensional subspace of ℝ3 generated by the vectors
(1, 3, 1) and (1, 4, 3).
7. Let {𝑣1 , 𝑣2 } be a basis for a vector space V. Show that
the set of vectors {𝑢1 , 𝑢2 }, where 𝑢1 = 𝑣1 + 𝑣2 , 𝑢2 = 𝑣1 −
𝑣2 , is also a basis for V.
8. Let V be a vector space of dimension n. Prove that no
set of n - 1 vectors can span V.
9. Let V be a vector space, and let W be a subspace of V.
If dim (V) = n and dim (W) = m, prove that m ≤ n.
Answers
1. {(−1,2,1), (2,−1,0)} is a basis.
2. {(1,1,1), (1,0,−2), (1,0,0)}.
3. a) Basis = {(1,1,0), (0,0,1)}, dimension = 2.
b) Basis = {(1,0,1), (0,1,1)}, dimension = 2.
c) Basis = {(1,0,−1), (0,1,−1)}, dimension = 2.
4. (a) Basis (b) Not a basis
5. (a) Basis (b) Not a basis
6. (1,2,−1) = 2(1,3,1) − (1,4,3).
1.6
LINEAR TRANSFORMATIONS
A vector space has two operations defined on it, namely,
addition and scalar multiplication. Linear transformations
between vectors spaces are those functions that preserve
these linear structures in the following sense.
DEFINITION: Let 𝑈 and 𝑉 be vector spaces. Let 𝑢 and 𝑣 be
vectors in 𝑈 and let 𝑐 be a scalar. A function 𝑇: 𝑈 → 𝑉 is said
to be linear transformation if
𝑇(𝑢 + 𝑣) = 𝑇(𝑢) + 𝑇(𝑣)
𝑇(𝑐𝑢) = 𝑐𝑇(𝑢)
The first condition implies that ′𝑇′ maps the sum of two
vectors into the sum of images of those vectors. The second
condition implies that ′𝑇′ maps the scalar multiple of a
vector into the same scalar multiple of the image. Thus the
operations of addition and scalar multiplication are
preserved under linear transformation.
THEOREM: Let V be a finite-dimensional vector space over
the field F and let {𝛼1, … , 𝛼𝑛} be an ordered basis for V. Let W
be a vector space over the same field F and let 𝛽1, … , 𝛽𝑛 be
any vectors in W. Then there is precisely one linear
transformation T from V into W such that
T(𝛼𝑗) = 𝛽𝑗, j = 1, … , n.
Proof: To prove there is some linear transformation T with
T(𝛼𝑗 ) = 𝛽𝑗 we proceed as follows. Given 𝛼 in V, there is a
unique n-tuple (𝑥1 , … , 𝑥𝑛 ) such that
𝛼 = 𝑥1 𝛼1 + ⋯ + 𝑥𝑛 𝛼𝑛 .
For this vector 𝛼 we define
𝑇(𝛼) = 𝑥1 𝛽1 + ⋯ + 𝑥𝑛 𝛽𝑛 .
Then T is a well-defined rule for associating with each
vector 𝛼 in V a vector 𝑇(𝛼) in W. From the definition it is
clear that 𝑇(𝛼𝑗 ) = 𝛽𝑗 for each j. To see that T is linear, let

𝛽 = 𝑦1 𝛼1 + ⋯ + 𝑦𝑛 𝛼𝑛
be in V and let c be any scalar. Now
𝑐𝛼 + 𝛽 = 𝑐𝑥1 + 𝑦1 𝛼1 + ⋯ + 𝑐𝑥𝑛 + 𝑦𝑛 𝛼𝑛
and so by definition
𝑇 𝑐𝛼 + 𝛽 = 𝑐𝑥1 + 𝑦1 𝛽1 + ⋯ + 𝑐𝑥𝑛 + 𝑦𝑛 𝛽𝑛
on the other hand,
𝑛 𝑛

𝑐 𝑇(𝛼) + 𝑇(𝛽) = 𝑐 𝑥𝑖 𝛽𝑖 + 𝑦𝑖 𝛽𝑖
𝑖=1 𝑖=1
𝑛

= (𝑐𝑥𝑖 + 𝑦𝑖 ) 𝛽𝑖
𝑖=1

and thus 𝑇 𝑐𝛼 + 𝛽 = 𝑐 𝑇(𝛼) + 𝑇(𝛽).


If U is a linear transformation from V into W with
𝑈(𝛼𝑗) = 𝛽𝑗, 𝑗 = 1, … , 𝑛, then for the vector 𝛼 = 𝑥1𝛼1 + ⋯ + 𝑥𝑛𝛼𝑛
we have

𝑈(𝛼) = 𝑈(𝑥1𝛼1 + ⋯ + 𝑥𝑛𝛼𝑛)
= 𝑥1𝑈(𝛼1) + ⋯ + 𝑥𝑛𝑈(𝛼𝑛)
= 𝑥1𝛽1 + ⋯ + 𝑥𝑛𝛽𝑛

so that U is exactly the rule T which we defined above.
This shows that the linear transformation T with 𝑇(𝛼𝑖) = 𝛽𝑖
is unique.
The following theorem shows that any linear
transformation maps the zero vector of the domain vector
space to the zero vector of the co-domain vector space.
THEOREM: Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Let 0𝑈
and 0𝑉 be the zero vectors of 𝑈 and 𝑉. Then 𝑇 0𝑈 = 0𝑉 .

That is, a linear transformation maps a zero vector into a
zero vector.
Proof: Let 𝑢 be a vector in 𝑈 and let 𝑇(𝑢) = 𝑣. Then

𝑇(0𝑈) = 𝑇(0𝑢) = 0𝑇(𝑢) = 0𝑣 = 0𝑉.
DEFINITION: Let 𝑇: 𝑈 → 𝑉 be a linear transformation. The
set of vectors in 𝑈 that are mapped into the zero vector of 𝑉
is called the kernel of 𝑇. The kernel is denoted by 𝑘𝑒𝑟 𝑇 .
The set of vectors in 𝑉 that are images of vectors in 𝑈
is called the range of 𝑇. The range is denoted by 𝑟𝑎𝑛𝑔𝑒 𝑇 .
We illustrate these sets in the following figure.

U V

O
Kernel

𝑇→
All vectors in 𝑈 that are mapped into 0.

U V

range

𝑇→
All vectors in 𝑉 that are images of vectors in 𝑈.

Whenever we introduce sets in linear algebra, we are
interested in knowing whether they are vector spaces or
not. We now find that the kernel and range are indeed
vector spaces.
THEOREM: Let 𝑇: 𝑈 → 𝑉 be a linear transformation.

a) The kernel of 𝑇 is a subspace of 𝑈.

b) The range of 𝑇 is a subspace of 𝑉.
Proof: (a) From the previous theorem, we know that the kernel
is non-empty since it contains the zero vector of 𝑈.
To prove that the kernel is a subspace of 𝑈, we show
that it is closed under addition and scalar multiplication.
First we prove closure under addition. Let 𝑢1, 𝑢2 ∈ ker(𝑇).
Then 𝑇(𝑢1) = 𝑇(𝑢2) = 0.
Now 𝑇(𝑢1 + 𝑢2) = 𝑇(𝑢1) + 𝑇(𝑢2) = 0 + 0 = 0.
The vector 𝑢1 + 𝑢2 is mapped into 0. Thus 𝑢1 + 𝑢2 is in
ker(𝑇).
Let us now show that ker(𝑇) is closed under scalar
multiplication. Let 𝑐 be a scalar. Then
𝑇(𝑐𝑢1) = 𝑐𝑇(𝑢1) = 𝑐0 = 0.
Thus 𝑐𝑢1 is in ker(𝑇).
The kernel is closed under addition and under scalar
multiplication. It is a subspace of 𝑈.
(b) The previous theorem tells us that the range is non
empty since it contains the zero vector of 𝑉.
To prove that the range is a subspace of 𝑉, we show
that it is closed under addition and scalar multiplication.
Let 𝑣1 and 𝑣2 be elements of 𝑟𝑎𝑛𝑔𝑒 𝑇 . Thus ∃ vectors 𝑢1 and
𝑢2 in the domain 𝑈 such that
𝑇 𝑢1 = 𝑣1 and 𝑇 𝑢2 = 𝑣2
Now 𝑇 𝑢1 + 𝑢2 = 𝑇 𝑢1 + 𝑇 𝑢2 = 𝑣1 + 𝑣2 . The vector 𝑣1 + 𝑣2
is the image of 𝑢1 + 𝑢2 . Thus 𝑣1 + 𝑣2 is in the range.
Let ′𝑐′ be a scalar. Then 𝑇 𝑐𝑢1 = 𝑐𝑇 𝑢1 = 𝑐𝑣1
The vector 𝑐𝑣1 is the image of 𝑐𝑢1 . Thus 𝑐𝑣1 is in the range.
The range is closed under addition and under scalar
multiplication. It is a subspace of 𝑉.
The following theorem gives an important relationship
between the “sizes” of the kernel and the range of a linear
transformation.
THEOREM: Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Then
𝐷𝑖𝑚 ker 𝑇 + dim 𝑟𝑎𝑛𝑔𝑒 𝑇 = dim 𝑑𝑜𝑚𝑎𝑖𝑛 𝑇

Proof: If ker(𝑇) = 𝑈, then 𝑟𝑎𝑛𝑔𝑒(𝑇) = {0}, which has dimension 0. Since dim ker(𝑇) = dim 𝑈 in this case, we are done.
Suppose that ker 𝑇 ≠ 𝑈.
Let 𝑢1 , 𝑢2 , … , 𝑢𝑚 be a basis for ker 𝑇 . Add vectors 𝑢𝑚 +1 , … , 𝑢𝑛
to this set to get a basis 𝑢1 , 𝑢2 , … , 𝑢𝑛 for 𝑈.
We shall show that 𝑇(𝑢𝑚 +1 ), … , 𝑇(𝑢𝑛 ) form a basis for the
range, thus proving the theorem.
Let 𝑢 ∈ 𝑈.
Then we get scalars 𝑎1 , 𝑎2 , … . , 𝑎𝑛 such that 𝑢 = 𝑎1 𝑢1 + 𝑎2 𝑢2 +
⋯ + 𝑎𝑚 𝑢𝑚 + 𝑎𝑚 +1 𝑢𝑚 +1 + ⋯ + 𝑎𝑛 𝑢𝑛 .
Thus
𝑇(𝑢) = 𝑇(𝑎1 𝑢1 + 𝑎2 𝑢2 + ⋯ + 𝑎𝑚 𝑢𝑚 + 𝑎𝑚+1 𝑢𝑚+1 + ⋯ + 𝑎𝑛 𝑢𝑛 )
= 𝑎1 𝑇(𝑢1 ) + ⋯ + 𝑎𝑚 𝑇(𝑢𝑚 ) + 𝑎𝑚+1 𝑇(𝑢𝑚+1 ) + ⋯ + 𝑎𝑛 𝑇(𝑢𝑛 )
= 𝑎𝑚+1 𝑇(𝑢𝑚+1 ) + ⋯ + 𝑎𝑛 𝑇(𝑢𝑛 ) (since 𝑢1 , … , 𝑢𝑚 lie in the kernel, 𝑇(𝑢1 ) = ⋯ = 𝑇(𝑢𝑚 ) = 0).
Since 𝑇 𝑢 represents an arbitrary vector in the range of 𝑇,
the vectors 𝑇 𝑢𝑚 +1 , … . 𝑇 𝑢𝑛 span the range.
It remains to prove that these vectors are linearly
independent. Consider the identity
𝑏𝑚 +1 𝑇 𝑢𝑚 +1 + ⋯ + 𝑏𝑛 𝑇 𝑢𝑛 = 0
⟹ 𝑇 𝑏𝑚+1 𝑢𝑚 +1 + ⋯ + 𝑏𝑛 𝑢𝑛 = 0
⟹ 𝑏𝑚+1 𝑢𝑚 +1 + ⋯ + 𝑏𝑛 𝑢𝑛 ∈ ker 𝑇 .
⟹ 𝑏𝑚+1 𝑢𝑚+1 + ⋯ + 𝑏𝑛 𝑢𝑛 = 𝑐1 𝑢1 + ⋯ + 𝑐𝑚 𝑢𝑚 for some scalars 𝑐1 , … , 𝑐𝑚
⟹ 𝑐1 𝑢1 + ⋯ + 𝑐𝑚 𝑢𝑚 − 𝑏𝑚+1 𝑢𝑚+1 − ⋯ − 𝑏𝑛 𝑢𝑛 = 0
Since the vectors 𝑢1 , … , 𝑢𝑚 , 𝑢𝑚+1 , … , 𝑢𝑛 are a basis, they are linearly independent. Therefore, the coefficients are all zero:
𝑐1 = 0, … , 𝑐𝑚 = 0, 𝑏𝑚+1 = 0, … , 𝑏𝑛 = 0. So 𝑇(𝑢𝑚+1 ), … , 𝑇(𝑢𝑛 ) are linearly independent.
Therefore the set of vectors 𝑇 𝑢𝑚 +1 , … . , 𝑇 𝑢𝑛 is a basis for
the range.
TERMINOLOGY:
The kernel of a linear mapping 𝑇 is often called the null space. dim ker(𝑇) is called the nullity, and dim 𝑟𝑎𝑛𝑔𝑒(𝑇) is called the rank of the transformation. The previous theorem is often referred to as the rank/nullity theorem and written in the form
𝑟𝑎𝑛𝑘(𝑇) + 𝑛𝑢𝑙𝑙𝑖𝑡𝑦(𝑇) = dim 𝑑𝑜𝑚𝑎𝑖𝑛(𝑇).
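The theorem is easy to check numerically when 𝑇 is given by a matrix: the rank is dim 𝑟𝑎𝑛𝑔𝑒(𝑇), and the nullity falls out as the number of columns minus the rank. The short NumPy sketch below is our own illustration (the matrix chosen is hypothetical); it also recounts the nullity independently from the singular values.

import numpy as np

# T: R^4 -> R^3 given by a matrix; dim domain(T) = number of columns.
A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 1.],
              [1., 3., 1., 2.]])          # third row = row1 + row2

rank = np.linalg.matrix_rank(A)           # dim range(T) = 2
nullity = A.shape[1] - rank               # dim ker(T), by the theorem
s = np.linalg.svd(A, compute_uv=False)    # independent check via SVD
nullity_svd = A.shape[1] - np.sum(s > 1e-10)
print(rank, nullity, nullity_svd)         # 2 2 2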
Problem 1: Prove that the following transformation
𝑇: 𝑅 2 → 𝑅 2 is linear. T(x,y) = (2x, x+y)
Solution: We first show that T preserves addition. Let
(x1,y1) and (x2,y2) be elements of 𝑅 2 . Then
T((x1,y1)+ (x2,y2)) = T (x1+x2,y1+y2) by vector addition
= (2x1+2x2, x1+x2+y1+y2) by definition of T
= (2x1, x1+ y1)+( 2x2, x2+ y2) by vector addition
= T(x1,y1)+T (x2,y2) by definition of T
Thus T preserves vector addition.
We now show that T preserves scalar multiplication. Let c
be a scalar.
T(c(x1,y1))=T(cx1,cy1) by scalar multiplication of a vector
=(2c x1, cx1+c y1) by definition of T
=c(2x1, x1+ y1) by scalar multiplication of a vector
=cT (x1, y1) by definition of T
Thus T preserves scalar multiplication. T is linear.
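A numerical spot-check of the two conditions can complement the algebraic proof above. The following sketch is ours (it assumes NumPy and tests only a few random inputs, so it is evidence rather than proof):

import numpy as np

def T(v):
    x, y = v
    return np.array([2 * x, x + y])        # T(x, y) = (2x, x + y)

rng = np.random.default_rng(0)
u, v = rng.standard_normal(2), rng.standard_normal(2)
c = rng.standard_normal()

assert np.allclose(T(u + v), T(u) + T(v))  # preserves addition
assert np.allclose(T(c * u), c * T(u))     # preserves scalar multiplication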
Problem 2: Let 𝑃𝑛 be the vector space of real polynomial
functions of degree ≤n. Show that the following
transformation 𝑇: 𝑃2 → 𝑃1 is linear.
T (ax2+bx+c)= (a+b)x+c
Solution: Let ax2+bx+c and px2+qx+r be arbitrary
elements of P2. Then
T((ax2+bx+c)+(px2+qx+r)) = T((a+p)x2+(b+q)x+(c+r)) by vector addition
= (a+p+b+q)x+(c+r) by definition of T
= ((a+b)x+c)+((p+q)x+r)
= T(ax2+bx+c)+T(px2+qx+r) by definition of T
Thus T preserves addition
We now show that T preserves scalar multiplication. Let k
be a scalar.
T(k(ax2+bx+c))=T(kax2+kbx+kc) by scalar multiplication
= (ka+kb)x+kc by definition of T
=k((a+b)x+c)
=kT(ax2+bx+c) by definition of T
T preserves scalar multiplication. Therefore, T is a linear
transformation.
Problem 3: Find the kernel and range of the linear
operator T(x,y,z)=(x,y,0)
Solution: Since the linear operator T maps R3 into R3, the
kernel and range will both be subspaces of R3.
Kernel: ker(T) is the subset that is mapped into (0,0,0). We
see that T(x,y,z) = (x,y,0)
= (0,0,0), if x=0,y=0
Thus ker(T) is the set of all vectors of the form (0,0,z). We
express this as ker(T) ={(0,0,z)}
Geometrically, ker(T) is the set of all vectors that lie on the
z-axis.
Range: The range of T is the set of all vectors of form
(x,y,0). Thus range (T) = {(x,y,0)}
Range (T) is the set of all vectors that lie in the x-y plane.
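Since this operator is multiplication by a projection matrix, the kernel and range can also be read off numerically. A small sketch of ours (NumPy assumed):

import numpy as np

P = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 0.]])              # matrix of T(x, y, z) = (x, y, 0)

print(np.linalg.matrix_rank(P))           # 2: range(T) is the x-y plane
print(P @ np.array([0., 0., 5.]))         # [0. 0. 0.]: the z-axis lies in ker(T)
# dim ker(T) + dim range(T) = 1 + 2 = 3 = dim domain(T), as required.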
Exercise
1. Prove that the following transformations 𝑇: 𝑅2 → 𝑅 are not
linear.

(a) T(x, y) = y2

(b) T(x, y) = x-3

2. Determine the kernel and range of each of the following


transformations. Show that dim ker(T)+dim range (T)=dim
domain (T) for each transformation.

(a) T(x, y, z) = (x, 0, 0) of 𝑹3 → 𝑹3

(b) T(x, y, z) = (x + y, z) of 𝑹3 → 𝑹2

(c) T(x, y) = (3x, x-y, y) of 𝑹2 → 𝑹3

3. Let 𝑇: 𝑈 → 𝑉 be a linear mapping. Let v be a nonzero vector in V. Let W be the set of vectors in U such that T(w)=v. Is W a subspace of U?

4. Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Prove that


dim range (T) = dim domain (T)

if and only if T is one-to-one.

5. Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Prove that T is one-to-one if and only if it preserves linear independence.
Answers
2. (a) The kernel is the set {(0,r,s)} and the range is the set
{(a,0,0)}. Dim ker(T) =2, dim range(T) =1, and dim
domain(T)= 3, so dim ker(T)+ dim range(T) =dim domain(T).

(b) The kernel is the set {(r,-r,0)} and the range is R2. Dim
ker(T) =1, dim range(T)=2, and dim domain(T)= 3, so dim
ker(T)+ dim range(T) = dim domain(T).
(c) The kernel is the zero vector and the range is the set
{(3a,a-b,b)}. Dim ker(T) =0, dim range(T)= 2, so dim ker(T)+
dim range(T) = dim domain(T).

3. This set is not a subspace because it does not contain


the zero vector.
4. T is one-to-one if and only if ker(T) is the zero vector if
and only if dim ker(T) = 0 if and only if dim range (T)=dim
domain(T).
1.7

Matrix Representations of Linear Transformation

In this module we introduce a way of representing a


linear transformation between general vector spaces by a
matrix. We lead up to this discussion by looking at the
information below that is necessary to represent a linear
transformation by a matrix.
Definition: Let 𝑼 be a vector space with basis 𝑩 =
𝑢1 , . . . , 𝑢𝑛 and let 𝒖 be a vector in 𝑈. We know that there
exist unique scalars 𝑎1 , . . . . , 𝑎𝑛 such that
𝒖 = 𝑎1 𝒖1 + . . . . + 𝑎𝑛 𝒖𝑛

The column vector 𝒖𝑩 = (𝑎1 , . . . , 𝑎𝑛 )ᵀ is called the coordinate vector of u relative to this basis. The scalars 𝑎1 , . . . , 𝑎𝑛 are called the coordinates of 𝒖 relative to this basis.
Note: We will use column vectors for coordinate vectors rather than row vectors. The theory develops most smoothly with this convention.
Example: Find the coordinate vector of 𝒖 = (4,5) relative to
the following bases 𝑩 and 𝑩′ of 𝑅 2 :
(a) The standard basis, 𝐵 = 1,0 , 0,1 and
(b) 𝐵′ = { 2,1 , −1,1 }.

Solution:
(a) By observation, we see that
(4, 5) = 4(1, 0) + 5(0, 1)
Thus 𝒖𝐵 = (4, 5)ᵀ. The given representation of u is, in fact, relative to the standard basis.
(b) Let us now find the coordinate vector of u relative to 𝐵′, a basis that is not the standard basis. Let
(4, 5) = 𝑎1 (2, 1) + 𝑎2 (−1, 1)
Thus
(4, 5) = (2𝑎1 , 𝑎1 ) + (−𝑎2 , 𝑎2 ) = (2𝑎1 − 𝑎2 , 𝑎1 + 𝑎2 )
Comparing components leads to the following system of equations.
2𝑎1 − 𝑎2 = 4
𝑎1 + 𝑎2 = 5
This system has the unique solution
𝑎1 = 3, 𝑎2 = 2
Thus 𝒖𝐵′ = (3, 2)ᵀ.
Definition: Let 𝑩 = 𝑢1 , . . . , 𝑢𝑛 and 𝑩′ = 𝑢′1 , . . . , 𝑢′ 𝑛 be
bases for a vector space 𝑈. Let the coordinate vectors of
𝑢1 , . . . , 𝑢𝑛 relative to the basis 𝑩′ = 𝑢′1 , . . . , 𝑢′ 𝑛 be
(𝑢1 )𝐵 ′ , . . . . , (𝑢𝑛 )𝐵 ′ . The matrix 𝑃, having these vectors as
columns , plays a central role in our discussion. It is called
the transition matrix from the basis 𝑩 to the basis 𝑩′.
Transition matrix 𝑷 = [(𝑢1 )𝐵 ′ , . . . . , (𝑢𝑛 )𝐵 ′ ].

Theorem: Let 𝑩 = 𝑢1 , . . . , 𝑢𝑛 and 𝑩′ = 𝑢′1 , . . . , 𝑢′ 𝑛 be


bases for a vector space 𝑈 . If u is a vector in 𝑈 having
coordinate vectors 𝒖𝐵 and 𝒖𝐵 ′ relative to these bases, then

𝒖𝐵 ′ = 𝑃𝒖𝐵

where 𝑃 is the transition matrix from 𝐵 to 𝐵′:


𝑃 = [ 𝑢1 𝐵′ , . . . . , 𝑢𝑛 𝐵 ′ ].

Proof: Since {𝑢′1 ,. . . . , 𝑢𝑛′ } is a basis for 𝑈, each of the


vectors 𝑢1 , . . . , 𝑢𝑛 can be expressed as a linear
combination of these vectors.
Let
𝑢1 = 𝑐11 𝑢′1 + . . . + 𝑐𝑛1 𝑢′𝑛
⋮
𝑢𝑛 = 𝑐1𝑛 𝑢′1 + . . . + 𝑐𝑛𝑛 𝑢′𝑛
If 𝒖 = 𝑎1 𝑢1 + . . . + 𝑎𝑛 𝑢𝑛 , we get
𝒖 = 𝑎1 (𝑐11 𝑢′1 + . . . + 𝑐𝑛1 𝑢′𝑛 ) + . . . + 𝑎𝑛 (𝑐1𝑛 𝑢′1 + . . . + 𝑐𝑛𝑛 𝑢′𝑛 )
= (𝑎1 𝑐11 + . . . + 𝑎𝑛 𝑐1𝑛 )𝑢′1 + . . . + (𝑎1 𝑐𝑛1 + . . . + 𝑎𝑛 𝑐𝑛𝑛 )𝑢′𝑛
The coordinate vector of u relative to 𝐵′ can therefore be written
𝒖𝐵′ = (𝑎1 𝑐11 + . . . + 𝑎𝑛 𝑐1𝑛 , … , 𝑎1 𝑐𝑛1 + . . . + 𝑎𝑛 𝑐𝑛𝑛 )ᵀ = [(𝑢1 )𝐵′ , . . . , (𝑢𝑛 )𝐵′ ] 𝒖𝐵
proving the theorem.


Example: Consider the bases 𝐵 = {(1, 2), (3, −1)} and 𝐵′ = {(1, 0), (0, 1)} of 𝑅2 . If u is a vector such that 𝒖𝐵 = (3, 4)ᵀ, find 𝒖𝐵′ .
Solution: We express the vectors of 𝐵 in terms of the vectors of 𝐵′ to get the transition matrix.
(1, 2) = 1(1, 0) + 2(0, 1)
(3, −1) = 3(1, 0) − 1(0, 1)
The coordinate vectors of (1, 2) and (3, −1) are (1, 2)ᵀ and (3, −1)ᵀ.
The transition matrix 𝑃 is thus
𝑃 = [ 1 3; 2 −1 ]
(Observe that the columns of 𝑃 are the vectors of the basis.) We get
𝒖𝐵′ = 𝑃𝒖𝐵 = [ 1 3; 2 −1 ] (3, 4)ᵀ = (15, 2)ᵀ

Let f be any function. We know that f is defined if its


effect on every element of the domain is known. This is
usually done by means of an equation that gives that effect
of the function on an arbitrary element in the domain. For
example, consider the function f defined by
𝑓(𝑥) = √(𝑥 − 3)
The domain of f is 𝑥 ≥ 3. The above equation gives the effect of f on every element in this interval. For example, 𝑓(7) = 2.
Similarly, a linear transformation T is defined if its value at
every vector in the domain is known. However, unlike a
general function, we will see that if we know the effect of
the linear transformation on a finite subset of the domain
(a basis), it will be automatically defined on all elements of
the domain.
Theorem: Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Let {𝑢1 , … , 𝑢𝑛 } be a basis for 𝑈. T is defined by its effect on the basis vectors, namely by 𝑇(𝑢1 ), … , 𝑇(𝑢𝑛 ). The range of T is spanned by 𝑇(𝑢1 ), … , 𝑇(𝑢𝑛 ).
Thus, defining a linear transformation on a basis defines it on the whole domain.
Proof: Let u be an element of 𝑈. Since {𝑢1 , … , 𝑢𝑛 } is a basis for 𝑈, there exist scalars 𝑎1 , … , 𝑎𝑛 such that
𝑢 = 𝑎1 𝑢1 + ⋯ + 𝑎𝑛 𝑢𝑛
The linearity of T gives
𝑇(𝑢) = 𝑇(𝑎1 𝑢1 + ⋯ + 𝑎𝑛 𝑢𝑛 ) = 𝑎1 𝑇(𝑢1 ) + ⋯ + 𝑎𝑛 𝑇(𝑢𝑛 )
Therefore 𝑇(𝑢) is known if 𝑇(𝑢1 ), … , 𝑇(𝑢𝑛 ) are known.
Further, 𝑇(𝑢) may be interpreted as an arbitrary element in the range of T, and it is expressed above as a linear combination of 𝑇(𝑢1 ), … , 𝑇(𝑢𝑛 ). Thus 𝑇(𝑢1 ), … , 𝑇(𝑢𝑛 ) span the range of T.
From now onwards, we will represent the elements of U
and V by coordinate vectors, and T by a matrix A that
defines a transformation of coordinate vectors. The matrix
A is constructed by finding the effect of T on basis vectors.
Theorem: Let 𝑈 and 𝑉 be vector spaces with bases 𝐵 = {𝑢1 , … , 𝑢𝑛 } and 𝐵′ = {𝑣1 , … , 𝑣𝑚 }. Let 𝑇: 𝑈 → 𝑉 be a linear transformation. If u is a vector in 𝑈 with image 𝑇(𝑢) having coordinate vectors a and b relative to these bases, then
𝑏 = 𝐴𝑎, where 𝐴 = [𝑇(𝑢1 )𝐵′ … 𝑇(𝑢𝑛 )𝐵′ ]
The matrix 𝐴 thus defines a transformation of coordinate vectors of 𝑈 in the "same way" as 𝑇 transforms the vectors of 𝑈. 𝐴 is called the matrix representation of 𝑻 (or matrix of 𝑻) with respect to the bases 𝐵 and 𝐵′.
[Figure: 𝑢 ↦ 𝑇(𝑢) under 𝑇 in the vector spaces, and the coordinate mapping 𝑎 ↦ 𝑏 = 𝐴𝑎 under the matrix 𝐴.]

Proof: Let 𝑢 = 𝑎1 𝑢1 + . . . +𝑎𝑛 𝑢𝑛 .


Using the linearity of 𝑇, we can write

𝑇(𝑢) = 𝑇(𝑎1 𝑢1 + . . . +𝑎𝑛 𝑢𝑛 )


= 𝑎1 𝑇(𝑢1 )+ . . . +𝑎𝑛 𝑇(𝑢𝑛 )

Let the effect of 𝑇 on the basis vectors of 𝑈 be
𝑇(𝑢1 ) = 𝑐11 𝑣1 + . . . + 𝑐1𝑚 𝑣𝑚
𝑇(𝑢2 ) = 𝑐21 𝑣1 + . . . + 𝑐2𝑚 𝑣𝑚
⋮
𝑇(𝑢𝑛 ) = 𝑐𝑛1 𝑣1 + . . . + 𝑐𝑛𝑚 𝑣𝑚
Thus
𝑇(𝑢) = 𝑎1 (𝑐11 𝑣1 + . . . + 𝑐1𝑚 𝑣𝑚 ) + . . . + 𝑎𝑛 (𝑐𝑛1 𝑣1 + . . . + 𝑐𝑛𝑚 𝑣𝑚 )
= (𝑎1 𝑐11 + . . . + 𝑎𝑛 𝑐𝑛1 )𝑣1 + . . . + (𝑎1 𝑐1𝑚 + . . . + 𝑎𝑛 𝑐𝑛𝑚 )𝑣𝑚
The coordinate vector of 𝑇(𝑢) is therefore
𝑏 = (𝑎1 𝑐11 + . . . + 𝑎𝑛 𝑐𝑛1 , … , 𝑎1 𝑐1𝑚 + . . . + 𝑎𝑛 𝑐𝑛𝑚 )ᵀ = [ 𝑐11 … 𝑐𝑛1 ; ⋮ ⋮ ; 𝑐1𝑚 … 𝑐𝑛𝑚 ] (𝑎1 , … , 𝑎𝑛 )ᵀ = 𝐴𝑎
proving the theorem.

Importance of Matrix Representation

The fact that every linear transformation can now be


represented by a matrix means that all the theoretical mathematics of these vector spaces and their linear transformations can be undertaken in terms of the vector spaces 𝑅𝑛 and matrices. A second reason is a computational one. The elements of 𝑅𝑛 and matrices can be manipulated on computers. Thus general vector spaces and their linear transformations can be discussed on computers through these representations.

Relation between Matrix Representations

We have seen that the matrix representation of a linear


transformation depends upon the bases selected. When linear transformations arise in applications, a goal is often to determine a simple matrix representation. At this time we discuss how matrix representations of linear operators relative to different bases are related. We remind the reader that if A and B are square matrices of the same size, then 𝐵 is said to be similar to 𝐴 if there exists an invertible matrix 𝑃 such that
𝐵 = 𝑃−1 𝐴𝑃
The transformation of the matrix 𝐴 into the matrix 𝐵 in this manner is called a similarity transformation. We now find that the matrix representations of a linear operator relative to two bases are similar matrices.

Theorem: Let 𝑈 be a vector space with bases 𝐵 and 𝐵′. Let 𝑃 be the transition matrix from 𝐵′ to 𝐵. If 𝑇 is a linear operator on 𝑈, having matrix 𝐴 with respect to the first basis and 𝐴′ with respect to the second basis, then
𝐴′ = 𝑃−1 𝐴𝑃
Proof: Consider a vector 𝑢 in 𝑈. Let its coordinate vector
relative to B and 𝐵′ be 𝑎 and 𝑎′. The coordinate vectors of
𝑇(𝑢) are 𝐴𝑎 and 𝐴′𝑎′. Since P is the transition matrix from 𝐵′
to 𝐵, we know that

𝑎 = 𝑃𝑎′ and 𝐴𝑎 = 𝑃(𝐴′ 𝑎′ )

This second equation may be rewritten

𝑃−1 𝐴𝑎 = 𝐴′𝑎′

Substituting 𝑎 = 𝑃𝑎′ into this equation gives

𝑃−1 𝐴𝑃𝑎′ = 𝐴′𝑎′

The effect of the matrices 𝑃−1 𝐴𝑃 and 𝐴′ as transformations on an arbitrary coordinate vector 𝑎′ is the same. Thus these matrices are equal.

Applications of Linear Transformation


A specific application of linear maps is for geometric
transformations, such as those performed in computer
graphics, where the translation, rotation and scaling of 2D
or 3D objects is performed by the use of a transformation
matrix. For Example:

1. Reflection with respect to the x-axis:
𝐿: 𝑅2 → 𝑅2 , 𝐿(𝑢1 , 𝑢2 )ᵀ = 𝐴(𝑢1 , 𝑢2 )ᵀ, 𝐴 = [ 1 0; 0 −1 ], so 𝐿(𝑢1 , 𝑢2 ) = (𝑢1 , −𝑢2 ).
For example, the reflection of the triangle with vertices (−1, 4), (3, 1), (2, 6) is
𝐿(−1, 4) = (−1, −4), 𝐿(3, 1) = (3, −1), 𝐿(2, 6) = (2, −6).
[Plot: the triangle (−1, 4), (3, 1), (2, 6) and its reflection (−1, −4), (3, −1), (2, −6) in the x-axis.]

2. Reflection with respect to 𝑦 = −𝑥:
𝐿: 𝑅2 → 𝑅2 , 𝐿(𝑢1 , 𝑢2 )ᵀ = 𝐴(𝑢1 , 𝑢2 )ᵀ, 𝐴 = [ 0 −1; −1 0 ], so 𝐿(𝑢1 , 𝑢2 ) = (−𝑢2 , −𝑢1 ).
Thus, the reflection of the triangle with vertices (−1, 4), (3, 1), (2, 6) is
𝐿(−1, 4) = (−4, 1), 𝐿(3, 1) = (−1, −3), 𝐿(2, 6) = (−6, −2).
[Plot: the triangle (−1, 4), (3, 1), (2, 6) and its reflection (−4, 1), (−1, −3), (−6, −2) in the line 𝑦 = −𝑥.]

3. Rotation:
𝐿: 𝑅2 → 𝑅2 , 𝐿(𝑢1 , 𝑢2 )ᵀ = 𝐴(𝑢1 , 𝑢2 )ᵀ, 𝐴 = [ cos 𝜃 −sin 𝜃; sin 𝜃 cos 𝜃 ].
For example, for 𝜃 = 𝜋/2,
𝐴 = [ cos(𝜋/2) −sin(𝜋/2); sin(𝜋/2) cos(𝜋/2) ] = [ 0 −1; 1 0 ].
Thus, the rotation of the triangle with vertices (0, 0), (1, 0), (1, 1) is
𝐿(0, 0) = (0, 0), 𝐿(1, 0) = (0, 1), 𝐿(1, 1) = (−1, 1).
[Plot: the triangle (0, 0), (1, 0), (1, 1) and its image (0, 0), (0, 1), (−1, 1) after rotation through 𝜋/2.]
Problem 1: Consider the linear transformation 𝑇: 𝑅3 → 𝑅2 defined as follows on the basis vectors of 𝑅3 . Find 𝑇(1, 2, 3).
𝑇(1, 0, 0) = (−3, 1), 𝑇(0, 1, 0) = (2, 1), 𝑇(0, 0, 1) = (−3, 0)
Solution: Since T is defined on basis vectors of 𝑅3 , it is defined on the whole space. To find 𝑇(1, 2, 3), express the vector (1, 2, 3) as a linear combination of the basis vectors and use the linearity of T.
𝑇(1, 2, 3) = 𝑇(1(1, 0, 0) + 2(0, 1, 0) + 3(0, 0, 1))
= 1𝑇(1, 0, 0) + 2𝑇(0, 1, 0) + 3𝑇(0, 0, 1)
= 1(−3, 1) + 2(2, 1) + 3(−3, 0)
= (−8, 3)
Problem 2: Let 𝑇: 𝑈 → 𝑉 be a linear transformation. T is
defined relative to bases 𝐵 = {𝑢1 , 𝑢2 , 𝑢3 } and 𝐵′ = 𝑣1 , 𝑣2 of 𝑈
and 𝑉 as follows

𝑇 𝑢1 = 2𝑣1 − 𝑣2
𝑇 𝑢2 = 3𝑣1 + 2𝑣2
𝑇 𝑢3 = 𝑣1 − 4𝑣2
Find the matrix representation of 𝑇 with respect to these
bases and use this matrix to determine the image of the
vector 𝑢 = 3𝑢1 + 2𝑢2 − 𝑢3 .

Solution: The coordinate vectors of 𝑇(𝑢1 ), 𝑇(𝑢2 ) and 𝑇(𝑢3 ) are
(2, −1)ᵀ, (3, 2)ᵀ and (1, −4)ᵀ
These vectors make up the columns of the matrix of 𝑇
𝐴 = [ 2 3 1; −1 2 −4 ]
Let us now find the image of the vector 𝑢 = 3𝑢1 + 2𝑢2 − 𝑢3 using this matrix.
The coordinate vector of 𝑢 is 𝑎 = (3, 2, −1)ᵀ. We get
𝐴𝑎 = [ 2 3 1; −1 2 −4 ] (3, 2, −1)ᵀ = (11, 5)ᵀ
𝑇(𝑢) has coordinate vector (11, 5)ᵀ. Thus 𝑇(𝑢) = 11𝑣1 + 5𝑣2 .
Problem 3: Consider the linear transformation 𝑇: 𝑅 3 → 𝑅 2 ,
defined by 𝑇 𝑥, 𝑦, 𝑧 = (𝑥 + 𝑦, 2𝑧). Find the matrix of 𝑇 with
respect to the bases {𝑢1 , 𝑢2 , 𝑢3 } and {𝑢′1 , 𝑢′2 } of 𝑅 3 and 𝑅 2 ,
where
𝑢1 = 1,1,0 , 𝑢2 = 0,1,4 , 𝑢3 = 1,2,3 𝑎𝑛𝑑 𝑢′1 = 1,0 , 𝑢′2 = (0,2).

Use this matrix to find the image of the vector 𝑢 = (2,3,5)

Solution: We find the effect of 𝑇 on the basis vectors of 𝑅3 .
𝑇(𝑢1 ) = 𝑇(1, 1, 0) = (2, 0) = 2(1, 0) + 0(0, 2) = 2𝑢′1 + 0𝑢′2
𝑇(𝑢2 ) = 𝑇(0, 1, 4) = (1, 8) = 1(1, 0) + 4(0, 2) = 1𝑢′1 + 4𝑢′2
𝑇(𝑢3 ) = 𝑇(1, 2, 3) = (3, 6) = 3(1, 0) + 3(0, 2) = 3𝑢′1 + 3𝑢′2
The coordinate vectors of 𝑇(𝑢1 ), 𝑇(𝑢2 ) and 𝑇(𝑢3 ) are thus (2, 0)ᵀ, (1, 4)ᵀ and (3, 3)ᵀ. These vectors form the columns of the matrix of 𝑇.
𝐴 = [ 2 1 3; 0 4 3 ]
Let us now use A to find the image of the vector 𝑢 = 2,3,5 .
We determine the coordinate vector of 𝑢. It can be shown
that
𝑢 = 2,3,5 = 3 1,1,0 + 2 0,1,4 − 1,2,3
= 3𝑢1 + 2𝑢2 + (−1)𝑢3
3
The coordinate vector of 𝑢 is thus a   2  . The coordinate
 
 1
vector of 𝑇(𝑢) is
3
 2 1 3    5 
b  Aa     2   5 
 0 4 3   1  
 
Therefore, 𝑇 𝑢 = 5𝑢′1 + 5𝑢′2 = 5 1,0 + 5 0,2 = (5,10).
We can check this result directly using the definition
𝑇 𝑥, 𝑦, 𝑧 = (𝑥 + 2𝑦, 2𝑧).
For 𝑢 = 2,3,5 , this gives

𝑇 𝑢 = 𝑇 2,3,5 = 2 + 3,2 × 5 = (5,10).
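The construction used in this problem can be scripted directly: apply 𝑇 to each domain basis vector, solve for its coordinates in the codomain basis, and assemble those coordinate vectors as columns. The helper below is our own sketch (NumPy assumed), reproducing the numbers above:

import numpy as np

def T(v):
    x, y, z = v
    return np.array([x + y, 2 * z])        # T(x, y, z) = (x + y, 2z)

U_basis = [np.array([1., 1., 0.]),
           np.array([0., 1., 4.]),
           np.array([1., 2., 3.])]
V_basis = np.array([[1., 0.],
                    [0., 2.]])             # columns are u'_1 = (1,0), u'_2 = (0,2)

# jth column of A = coordinates of T(u_j) relative to the basis of V
A = np.column_stack([np.linalg.solve(V_basis, T(u)) for u in U_basis])
print(A)                                   # [[2. 1. 3.], [0. 4. 3.]]

a = np.array([3., 2., -1.])                # coordinate vector of u = (2, 3, 5)
print(V_basis @ (A @ a))                   # [ 5. 10.] = T(2, 3, 5)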


Problem 4: Consider the linear operator 𝑇(𝑥, 𝑦) = (2𝑥, 𝑥 + 𝑦) on 𝑅2 . Find the matrix of 𝑇 with respect to the standard basis 𝐵 = {(1, 0), (0, 1)} of 𝑅2 . Use the transformation 𝐴′ = 𝑃−1 𝐴𝑃 to determine the matrix 𝐴′ with respect to the basis 𝐵′ = {(−2, 3), (1, −1)}.

Solution: The effect of 𝑇 on the vectors of the standard basis is
𝑇(1, 0) = (2, 1) = 2(1, 0) + 1(0, 1)
𝑇(0, 1) = (0, 1) = 0(1, 0) + 1(0, 1)
The matrix of T relative to the standard basis is
𝐴 = [ 2 0; 1 1 ]
We now find 𝑃, the transition matrix from 𝐵′ to 𝐵. Write the vectors of 𝐵′ in terms of those of 𝐵.
(−2, 3) = −2(1, 0) + 3(0, 1)
(1, −1) = 1(1, 0) − 1(0, 1)
The transition matrix is
𝑃 = [ −2 1; 3 −1 ]
Therefore
𝐴′ = 𝑃−1 𝐴𝑃 = [ −2 1; 3 −1 ]−1 [ 2 0; 1 1 ][ −2 1; 3 −1 ]
= [ 1 1; 3 2 ][ 2 0; 1 1 ][ −2 1; 3 −1 ] = [ −3 2; −10 6 ]

Exercise
1) Let 𝑇: 𝑈 → 𝑉 be a linear transformation. Let 𝑇 be
defined relative to bases {𝑢1 , 𝑢2 , 𝑢3 } and {𝑣1 , 𝑣2 , 𝑣3 } of 𝑈
and 𝑉 as follows:
𝑇 𝑢1 = 𝑣1 + 𝑣2 + 𝑣3
𝑇 𝑢2 = 3𝑣1 −2𝑣2
𝑇 𝑢3 = 𝑣1 + 2𝑣2 − 𝑣3 .
Find the matrix of 𝑇 with respect to these bases. Use
this matrix to find the image of the vector
𝑢 = 3𝑢1 + 2𝑢2 − 5𝑢3 .

2) Find the matrices of the following linear operators on


𝑅3 with respect to the standard basis of 𝑅3 . Use these
matrices to find the images of the vector −1,5,2
(a) 𝑇 𝑥, 𝑦, 𝑧 = 𝑥, 2𝑦, 3𝑧
(b) 𝑇 𝑥, 𝑦, 𝑧 = 𝑥, 0,0

3) Consider the linear transformation 𝑇: 𝑅3 → 𝑅2 defined


by 𝑇 𝑥, 𝑦, 𝑧 = (𝑥 − 𝑦, 𝑥 + 𝑧) . Find the matrix of 𝑇 with
respect to the bases {𝑢1 , 𝑢2 , 𝑢3 } and {𝑢′1 , 𝑢′2 } of 𝑅3 and
𝑅2 , where
𝑢1 = 1, −1,0 , 𝑢2 = (2,0,1)
𝑢3 = 1,2,1 and 𝑢′1 = −1,0 , 𝑢′2 = (0,1).
Use this matrix to find the image of the vector
𝑢 = 3, −4,0 .

4) Find the matrix of the differential operator 𝐷 with


respect to the basis {2𝑥 2 , 𝑥, −1} of 𝑃2 . Use this matrix to
find the image of 3𝑥 2 − 2𝑥 + 4.
5) Find the matrix of the following linear transformations
with respect to the basis 𝑥, 1 of 𝑃1 and {𝑥 2 , 𝑥, 1}of 𝑃2 .
(a) 𝑇(𝑎𝑥 2 + 𝑏𝑥 + 𝑐) = 𝑏 + 𝑐 𝑥 2 + 𝑏 − 𝑐 𝑥 of 𝑃2 into itself
(b) 𝑇 𝑎𝑥 + 𝑏 = 𝑏𝑥 2 + 𝑎𝑥 + 𝑏 of 𝑃1 into 𝑃2 .

6) Consider the linear operator 𝑇 𝑥, 𝑦 = (2𝑥, 𝑥 + 𝑦) on 𝑅2 .


Find the matrix of 𝑇 with respect to the standard basis
of 𝑅2 . Use a similarity transformation to then find the
matrix with respect to the basis 1,1 , 2,1 of 𝑅2 .

7) Let 𝑈, 𝑉 and 𝑊 be vector spaces with bases 𝐵 = {𝑢1 , . . . , 𝑢𝑛 }, 𝐵′ = {𝑣1 , . . . , 𝑣𝑚 }, and 𝐵′′ = {𝑤1 , . . . , 𝑤𝑝 }, and let 𝑇: 𝑈 → 𝑉 and 𝐿: 𝑉 → 𝑊 be linear transformations. Let 𝑃 be the matrix of 𝑇 with respect to 𝐵 and 𝐵′, and 𝑄 be the matrix representation of 𝐿 with respect to 𝐵′ and 𝐵′′. Prove that the matrix of 𝐿∘𝑇 with respect to 𝐵 and 𝐵′′ is 𝑄𝑃.

8) Is it possible for two distinct linear transformations


𝑇: 𝑈 → 𝑉 and 𝐿: 𝑈 → 𝑉 to have the same matrix with
respect to bases 𝐵 𝑎𝑛𝑑 𝐵′ of 𝑈 𝑎𝑛𝑑 𝑉?
Answers
1) [ 1 3 1; 1 −2 2; 1 0 −1 ], 𝑇(𝑢) = 4𝑣1 − 11𝑣2 + 8𝑣3
2) (a) [ 1 0 0; 0 2 0; 0 0 3 ], (−1, 10, 6)
(b) [ 1 0 0; 0 0 0; 0 0 0 ], (−1, 0, 0)
3) [ −2 −2 1; 1 3 2 ], (7, 3)
4) [ 0 0 0; 4 0 0; 0 −1 0 ], 6𝑥 − 2
5) (a) [ 0 1 1; 0 1 −1; 0 0 0 ]
(b) [ 0 1; 1 0; 0 1 ]
6) [ 2 0; 1 1 ], [ 2 2; 0 1 ]
8) No
1.8
The Inverse of a Matrix
In this module we introduce the concept of the matrix
inverse. We will see how an inverse can be used to solve
certain systems of linear equations, and we will see an
application of matrix inverse in cryptography, the study of
codes.
We motivate the idea of the inverse of a matrix by looking
at the multiplicative inverse of a real number. If number 𝑏
is the inverse of 𝑎, then

𝑎𝑏 = 1 and 𝑏𝑎 = 1
For example, 1/4 is the inverse of 4 and we have
4 × (1/4) = (1/4) × 4 = 1

These are the ideas that we extend to matrices.

Definition: Let 𝐴 be an 𝑛 × 𝑛 matrix. If a matrix 𝐵 can be


found such that 𝐴𝐵 = 𝐵𝐴 = 𝐼𝑛 , then 𝐴 is said to be
invertible and 𝐵 is called the inverse of 𝐴. If such a matrix
𝐵 does not exist, then 𝐴 has no inverse.
Example: Prove that the matrix 𝐴 = [ 1 2; 3 4 ] has inverse 𝐵 = [ −2 1; 3/2 −1/2 ].
Solution: We have that
𝐴𝐵 = [ 1 2; 3 4 ][ −2 1; 3/2 −1/2 ] = [ 1 0; 0 1 ] = 𝐼2
and
𝐵𝐴 = [ −2 1; 3/2 −1/2 ][ 1 2; 3 4 ] = [ 1 0; 0 1 ] = 𝐼2
Thus 𝐴𝐵 = 𝐵𝐴 = 𝐼2 , proving that the matrix 𝐴 has inverse 𝐵.

We know that a real number can have at most one


inverse. We now see that this is also the case for a matrix.
Theorem: The inverse of an invertible matrix is unique.

Proof: Let 𝐵 and 𝐶 be inverses of 𝐴. Thus 𝐴𝐵 = 𝐵𝐴 = 𝐼𝑛 and


𝐴𝐶 = 𝐶𝐴 = 𝐼𝑛 . Multiply both sides of the equation 𝐴𝐵 = 𝐼𝑛 by
𝐶 and use the algebraic properties of matrices.
𝐶 𝐴𝐵 = 𝐶𝐼𝑛

𝐶𝐴 𝐵 = 𝐶
𝐼𝑛 𝐵 = 𝐶

𝐵=𝐶

Thus an invertible matrix has only one inverse.

Notation: Let 𝐴 be an invertible matrix. We denote its


inverse 𝐴−1 .
Thus 𝐴𝐴−1 = 𝐴−1 𝐴 = 𝐼𝑛

Let 𝐴−𝑛 = (𝐴−1 )𝑛 = 𝐴−1 𝐴−1 ⋯ 𝐴−1 (𝑛 times).
We now derive a method, based on the Gauss-Jordan
algorithm, for finding the inverse of a matrix.

Let 𝐴 be an invertible matrix. Let the columns of 𝐴−1 be


𝑋1 , 𝑋2 , … , 𝑋𝑛 , and the columns of 𝐼𝑛 be 𝐶1 , 𝐶2 , … , 𝐶𝑛 . Write 𝐴−1
and 𝐼𝑛 in terms of their columns as follows

𝐴−1 = 𝑋1 𝑋2 … 𝑋𝑛 and 𝐼𝑛 = 𝐶1 𝐶2 … 𝐶𝑛

We shall find 𝐴−1 by finding 𝑋1 𝑋2 … 𝑋𝑛 . Since 𝐴𝐴−1 = 𝐼𝑛 ,


then
𝐴 𝑋1 𝑋2 … 𝑋𝑛 = 𝐶1 𝐶2 … 𝐶𝑛

Matrix multiplication, carried out in terms of columns,


gives

𝐴𝑋1 𝐴𝑋2 … 𝐴𝑋𝑛 = 𝐶1 𝐶2 … 𝐶𝑛


leading to

𝐴𝑋1 = 𝐶1 , 𝐴𝑋2 = 𝐶2 , … , 𝐴𝑋𝑛 = 𝐶𝑛

Thus 𝑋1 𝑋2 … 𝑋𝑛 are solutions to the equations


𝐴𝑋 = 𝐶1 , 𝐴𝑋 = 𝐶2 , … , 𝐴𝑋 = 𝐶𝑛 , all having the same matrix of
coefficients 𝐴. Solve these systems by using Gauss-Jordan
elimination on the large augmented matrix 𝐴: 𝐶1 𝐶2 … 𝐶𝑛 .
𝐴: 𝐶1 𝐶2 … 𝐶𝑛 ≈ ⋯ ≈ 𝐼𝑛 : 𝑋1 𝑋2 … 𝑋𝑛

Thus

𝐴: 𝐼𝑛 ≈ ⋯ ≈ 𝐼𝑛 : 𝐴−1

Therefore, if we compute the reduced echelon form of 𝐴: 𝐼𝑛


and get a matrix of the form 𝐼𝑛 : 𝐵 , then 𝐵 = 𝐴−1 .
On the other hand, if we compute the reduced echelon form
of 𝐴: 𝐼𝑛 and find that it is not of the form 𝐼𝑛 : 𝐵 , then 𝐴 is
not invertible.
We now summarize the results of this discussion.

Gauss-Jordan Elimination for Finding the Inverse of a


Matrix

Let 𝐴 be an 𝑛 × 𝑛 matrix.
1. Adjoin the identity 𝑛 × 𝑛 matrix 𝐼𝑛 to 𝐴 to form the matrix
𝐴: 𝐼𝑛 .
2. Compute the reduced echelon form of 𝐴: 𝐼𝑛 .
If the reduced echelon form is of the type 𝐼𝑛 : 𝐵 , then 𝐵 is
the inverse of 𝐴.
If the reduced echelon form is not of the type 𝐼𝑛 : 𝐵 , in
that the first 𝑛 × 𝑛 submatrix is not 𝐼𝑛 , then 𝐴 has no
inverse.
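The procedure translates directly into code. The function below is a minimal sketch of our own (NumPy assumed): it forms [𝐴 : 𝐼], row-reduces with partial pivoting for numerical stability, and reports failure when the left block cannot be reduced to 𝐼𝑛 .

import numpy as np

def gauss_jordan_inverse(A, tol=1e-12):
    # Return the inverse of A by reducing [A : I], or None if A is singular.
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])        # the matrix [A : I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting
        if abs(M[pivot, col]) < tol:
            return None                # reduced form cannot be of type [I : B]
        M[[col, pivot]] = M[[pivot, col]]              # swap rows
        M[col] /= M[col, col]                          # make the pivot 1
        for r in range(n):
            if r != col:
                M[r] -= M[r, col] * M[col]             # clear the rest of the column
    return M[:, n:]

A = np.array([[1., -1., -2.], [2., -3., -5.], [-1., 3., 5.]])
print(gauss_jordan_inverse(A))   # matches the inverse found in Problem 1 below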

The Gauss-Jordan method of computing the inverse of a


matrix tells us that 𝐴 is invertible if and only if the reduced
echelon form of 𝐴: 𝐼𝑛 is 𝐼𝑛 : 𝐵 . As 𝐴: 𝐼𝑛 is transformed to
𝐼𝑛 : 𝐵 , 𝐴 is transformed to 𝐼𝑛 . This observation leads to the
following result.
Theorem: An 𝑛 × 𝑛 matrix 𝐴 is invertible if and only if its
reduced echelon form is 𝐼𝑛 .
If the matrix of coefficients of a system of linear equations
is invertible then the inverse can be used to discuss the
solutions. The following is a key result in such discussions.
Theorem: Let 𝐴𝑋 = 𝐵 be a system of 𝑛 linear equations in 𝑛
variables. If 𝐴−1 exists, the solution is unique and is given
by 𝑋 = 𝐴−1 𝐵.

Proof: We first prove that 𝑋 = 𝐴−1 𝐵 is a solution.

Substitute 𝑋 = 𝐴−1 𝐵 into the matrix equation. Using the


properties of matrices we get

𝐴𝑋 = 𝐴 𝐴−1 𝐵 = 𝐴𝐴−1 𝐵 = 𝐼𝑛 𝐵 = 𝐵

𝑋 = 𝐴−1 𝐵 satisfies the equation; thus it is a solution.

We now prove the uniqueness of the solution. Let 𝑋1 be a solution. Thus 𝐴𝑋1 = 𝐵. Multiplying both sides of this equation by 𝐴−1 gives

𝐴−1 𝐴𝑋1 = 𝐴−1 𝐵

𝐼𝑛 𝑋1 = 𝐴−1 𝐵

𝑋1 = 𝐴−1 𝐵

Thus there is a unique solution 𝐴−1 𝐵.
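In floating-point practice one calls a solver rather than forming the inverse, but the theorem can be illustrated either way. A sketch of ours (NumPy assumed), using the system solved in Problem 3 below:

import numpy as np

A = np.array([[1., -1., -2.], [2., -3., -5.], [-1., 3., 5.]])
B = np.array([1., 3., -2.])

X_inv = np.linalg.inv(A) @ B      # X = A^{-1} B, as in the theorem
X_dir = np.linalg.solve(A, B)     # the numerically preferred route
print(X_inv, X_dir)               # both give [ 1. -2.  1.]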

We now summarize some of the algebraic properties of


matrix inverse.

Properties of Matrix Inverse

Let 𝐴 and 𝐵 be invertible matrices and 𝑐 a nonzero scalar.


Then

1. (𝐴−1 )−1 = 𝐴
2. (𝑐𝐴)−1 = (1/𝑐)𝐴−1
3. (𝐴𝐵)−1 = 𝐵−1 𝐴−1
4. (𝐴𝑛 )−1 = (𝐴−1 )𝑛
5. (𝐴𝑡 )−1 = (𝐴−1 )𝑡

We verify results 1 and 3 to illustrate the techniques


involved, leaving the remaining results for the reader to
verify.

(𝑨−𝟏 )−𝟏 = 𝑨: This result follows directly from the definition


of inverse of a matrix. Since 𝐴−1 is the inverse of 𝐴, we have

𝐴𝐴−1 = 𝐴−1 𝐴 = 𝐼𝑛

This statement also tells us that 𝐴 is the inverse of 𝐴−1 .


Thus (𝐴−1 )−1 = 𝐴.

(𝑨𝑩)−𝟏 = 𝑩−𝟏 𝑨−𝟏 : We want to show that the matrix 𝐵−1 𝐴−1


is the inverse of the matrix 𝐴𝐵. We get, using the properties
of matrices,

(𝐴𝐵)(𝐵−1 𝐴−1 ) = 𝐴(𝐵𝐵−1 )𝐴−1
= 𝐴𝐼𝑛 𝐴−1
= 𝐴𝐴−1
= 𝐼𝑛

Similarly, it can be shown that (𝐵−1 𝐴−1 )(𝐴𝐵) = 𝐼𝑛 . Thus 𝐵−1 𝐴−1 is the inverse of the matrix 𝐴𝐵.
Example: If 𝐴 = [ 4 1; 3 1 ], then it can be shown that 𝐴−1 = [ 1 −1; −3 4 ]. Use this information to compute (𝐴𝑡 )−1 .
Solution: Result 5 above tells us that if we know the inverse of a matrix we also know the inverse of its transpose. We get
(𝐴𝑡 )−1 = (𝐴−1 )𝑡 = [ 1 −1; −3 4 ]𝑡 = [ 1 −3; −1 4 ].
Cryptography
Cryptography is the process of coding and decoding
messages. The word comes from the Greek kryptos,
meaning “hidden”. The technique can be traced back to the
ancient Greeks. Today governments use sophisticated
methods of coding and decoding messages. One type of
code that is extremely difficult to break makes use of a
large matrix to encode a message. The receiver of the
message decodes it using the inverse of the matrix. This
first matrix is called the encoding matrix and its inverse is
called the decoding matrix. We illustrate the method for a
3 X 3 matrix.

Let the message be

PREPARE TO ATTACK
and the encoding matrix be
−3 −3 −4
0 1 1
4 3 4
We assign a number to each letter of the alphabet. For
convenience, let us associate each letter with its position in
the alphabet: 𝐴 is 1, 𝐵 is 2, and so on. Let a space between
words be denoted by the number 27. Thus the message
becomes
P R E P A R E * T O * A T T A C K

16 18 5 16 1 18 5 27 20 15 27 1 20 20 1 3 11

Since we are going to use a 3 X 3 matrix to encode the


message, we break the enumerated message up into a
sequence of 3 X 1 column matrices as follows.
16 16 5 15 20 3
18 1 27 27 20 11
5 18 20 1 1 27
Observe that it was necessary to add a space at the end of
the message in order to complete the last matrix. We now
put the message into code by multiplying each of the above
column matrices by the encoding matrix. This step can be
conveniently done by writing the column matrices as
columns of a matrix and pre-multiplying that matrix by the
encoding matrix. We get
−3 −3 −4 16 16 5 15 20 3
0 1 1 18 1 27 27 20 11
4 3 4 5 18 20 1 1 27
−122 −123 −176 −130 −124 −150
= 23 19 47 28 21 38
138 139 181 145 144 153
The columns of this matrix give the encoded message. The
message is transmitted in the following linear form.
−122, 23, 138, −123, 19, 139, −176, 47, 181,
−130, 28, 145, −124, 21, 144, −150, 38, 153
To decode the message, the receiver writes this string as a
sequence of 3 X 1 column matrices and repeats the
technique using the inverse of the encoding matrix. The
inverse of this encoding matrix, the decoding matrix, is
1 0 1
4 4 3
−4 −3 −3
To decode the message, we multiply
1 0 1 −122 −123 −176 −130 −124 −150
4 4 3 23 19 47 28 21 38
−4 −3 −3 138 139 181 145 144 153
16 16 5 15 20 3
= 18 1 27 27 20 11
5 18 20 1 1 27
The columns of the final matrix, written in linear form, give
the original message:
16 18 5 16 1 18 5 27 20 15 27 1 20 20 1 3 11

P R E P A R E * T O * A T T A C K
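The whole cycle is three matrix multiplications. The sketch below is our own (NumPy assumed), using the coding A = 1, …, Z = 26, space = 27 described above:

import numpy as np

E = np.array([[-3, -3, -4], [0, 1, 1], [4, 3, 4]])    # encoding matrix
D = np.array([[1, 0, 1], [4, 4, 3], [-4, -3, -3]])    # decoding matrix, D = E^(-1)

msg = "PREPARE TO ATTACK "                  # padded to a multiple of 3
nums = [27 if ch == " " else ord(ch) - ord("A") + 1 for ch in msg]
blocks = np.array(nums).reshape(-1, 3).T    # the 3 x 1 columns, side by side

coded = E @ blocks                          # transmitted message
decoded = D @ coded                         # receiver applies the inverse
print("".join(" " if n == 27 else chr(n + ord("A") - 1)
              for n in decoded.T.flatten()))   # PREPARE TO ATTACK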
Problem 1: Determine the inverse of the matrix
1 −1 −2
𝐴= 2 −3 −5
−1 3 5
Solution: Applying the method of Gauss-Jordan elimination, we get
1 −1 −2 1 0 0
𝐴: 𝐼3 = 2 −3 −5 0 1 0
−1 3 5 0 0 1
≈ 1 −1 −2 1 0 0
𝑅2 + −2 𝑅1 0 −1 −1 −2 1 0
𝑅3 + 𝑅1 0 2 3 1 0 1

≈ 1 −1 −2 1 0 0
−1 𝑅2 0 1 1 2 −1 0
0 2 3 1 0 1
≈ 1 0 −1 3 −1 0
𝑅1 + 𝑅2 0 1 1 2 −1 0
𝑅3 + −2 𝑅2 0 0 1 −3 2 1
≈ 1 0 0 0 1 1
𝑅1 + 𝑅3 0 1 0 5 −3 −1
𝑅2 + −1 𝑅3 0 0 1 −3 2 1
Thus
0 1 1
𝐴−1 = 5 −3 −1
−3 2 1
Problem 2: Determine the inverse of the following matrix, if
it exists.
1 1 5
𝐴= 1 2 7
2 −1 4
Solution: Applying the method of Gauss-Jordan elimination, we get
1 1 5 1 0 0
𝐴: 𝐼3 = 1 2 7 0 1 0
2 −1 4 0 0 1
≈ 1 1 5 1 0 0
𝑅2 + −1 𝑅1 0 1 2 −1 1 0
𝑅3 + −2 𝑅1 0 −3 −6 −2 0 1
≈ 1 0 3 2 −1 0
𝑅1 + −1 𝑅2 0 1 2 −1 1 0
𝑅3 + 3𝑅2 0 0 0 −5 3 1
There is no need to proceed further. The reduced echelon
form cannot have a one in the 3,3 location. The reduced
echelon form cannot be of the form 𝐼𝑛 : 𝐵 . Thus 𝐴−1 does
not exist.
Problem 3: Solve the system of equations

𝑥1 − 𝑥2 − 2𝑥3 = 1
2𝑥1 − 3𝑥2 − 5𝑥3 = 3

−𝑥1 + 3𝑥2 + 5𝑥3 = −2


Solution: This system can be written in the following
matrix form:
[ 1 −1 −2; 2 −3 −5; −1 3 5 ] (𝑥1 , 𝑥2 , 𝑥3 )ᵀ = (1, 3, −2)ᵀ
If the matrix of coefficients is invertible, the unique
solution is
(𝑥1 , 𝑥2 , 𝑥3 )ᵀ = [ 1 −1 −2; 2 −3 −5; −1 3 5 ]−1 (1, 3, −2)ᵀ
This inverse has already been found in Problem 1. Using
that result we get
(𝑥1 , 𝑥2 , 𝑥3 )ᵀ = [ 0 1 1; 5 −3 −1; −3 2 1 ] (1, 3, −2)ᵀ = (1, −2, 1)ᵀ
The unique solution is 𝑥1 = 1, 𝑥2 = −2, 𝑥3 = 1.
Exercise
1. Determine the inverse of each of the following 3 X 3
matrices, if it exists, using the method of Gauss Jordan
elimination.
1 2 3
a. 0 1 2
4 5 3
1 2 −3
b. 1 −2 1
5 −2 −3

2. Determine the inverse of each of the following 4 X 4


matrices, if it exists, using the method of Gauss Jordan
elimination.
−3 −1 1 −2
a. −1 3 2 1
1 2 3 −1
−2 1 −1 −3
−1 0 −1 −1
b. −3 −1 0 −1
5 0 4 3
3 0 3 2

3. Solve the following systems of three equations in three variables by determining the inverse of the matrix of coefficients and then using matrix multiplication.
a. 𝑥1 − 2𝑥2 + 𝑥3 = 2
𝑥1 − 𝑥2 − 2𝑥3 = 0
𝑥1 + 𝑥2 + 𝑥3 = 1
b. 𝑥1 + 2𝑥2 + 3𝑥3 = 1
2𝑥1 + 5𝑥2 + 3𝑥3 = 3
𝑥1 + 8𝑥3 = 15
c. −𝑥1 + 𝑥2 = 5
−𝑥1 + 𝑥3 = −2
6𝑥1 − 2𝑥2 − 3𝑥3 = 1

4. Prove that 𝐴𝐵𝐶 −1 = 𝐶 −1 𝐵−1 𝐴−1 .


5. Prove that if 𝐴 has no inverse then 𝐴𝑡 also has no
inverse.
6. Prove that a diagonal matrix is invertible if and only if all
its diagonal elements are nonzero. Can you find a quick
way for determining the inverse of an invertible diagonal
matrix?
4 −3
7. Encode the message RETREAT using the matrix .
3 −2
8. Decode the message 49,38, −5, −3, −61, −39 , which was
4 −3
encoded using the matrix .
3 −2
Answers
1.
a. [ 7/3 −3 −1/3; −8/3 3 2/3; 4/3 −1 −1/3 ]
b. The inverse does not exist.
2.
a. (1/5) [ −1 −1 1 0; −1 1 0 1; 1 0 1 −1; 0 1 −1 −1 ]
b. [ 1 0 1 −1; 0 −1 −3 4; 1 0 −1 2; −3 0 0 −1 ]
3.
a. (𝑥1 , 𝑥2 , 𝑥3 ) = (7/9, −1/3, 5/9)
b. (𝑥1 , 𝑥2 , 𝑥3 ) = (143, −47, −16)
c. (𝑥1 , 𝑥2 , 𝑥3 ) = (5, 10, 3)
7. 57, 44, 26, 24, 17, 13, −1, 6
8. PEACE
1.9
The Rank of a Matrix

In this module the student is introduced to the concept of


the rank of a matrix. Rank enables one to relate matrices to
vectors, and vice versa. Rank is a unifying tool that enables
us to bring together many of the concepts discussed in the
course. Solutions to certain systems of linear equations
and invertibility of a matrix all come together under the
umbrella of rank.
Definition: Let 𝐴 be an 𝑚 × 𝑛 matrix. The rows of 𝐴 may be
viewed as row vectors 𝒓1 , … , 𝒓𝑚 , and the columns as column
vectors 𝒄1 , … , 𝒄𝑛 . Each row vector will have 𝑛 components,
and each column vector will have 𝑚 components. The row
vectors will span a subspace of 𝑹𝑛 called the row space of
𝐴 , and the column vectors will span a subspace of 𝑹𝑚
called the column space of 𝐴.
Example: Consider the matrix
𝐴 = [ 1 2 −1 2; 3 4 1 6; 5 4 1 0 ]
The row vectors of 𝐴 are
𝒓1 = 1,2, −1,2 𝒓2 = 3,4,1,6 𝒓3 = 5,4,1,0
These vectors span a subspace of 𝑹4 called the row space of
𝐴.
The column vectors of 𝐴 are
𝒄1 = (1, 3, 5)ᵀ, 𝒄2 = (2, 4, 4)ᵀ, 𝒄3 = (−1, 1, 1)ᵀ, 𝒄4 = (2, 6, 0)ᵀ
These vectors span a subspace of 𝑹3 called the column
space of 𝐴.
Theorem: The row space and the column space of a matrix
𝐴 have the same dimension.
Proof: Let 𝒖1 , … , 𝒖𝑚 be the row vectors of 𝐴. The 𝑖th vector
is
𝒖𝑖 = 𝑎𝑖1 , 𝑎𝑖2 , … , 𝑎𝑖𝑛
Let the dimension of the row space be 𝑠. Let the vectors
𝒗1 , … 𝒗𝑠 form a basis for the row space. Let the 𝑗th vector of
this set be

𝒗𝑗 = 𝑏𝑗 1 , 𝑏𝑗 2 , … , 𝑏𝑗𝑛

Each of the row vectors of 𝐴 is a linear combination of


𝒗1 , … 𝒗𝑠 . Let
𝒖1 = 𝑐11 𝒗1 + 𝑐12 𝒗2 + ⋯ + 𝑐1𝑠 𝒗𝑠
.
.
.
𝒖𝑚 = 𝑐𝑚1 𝒗1 + 𝑐𝑚2 𝒗2 + ⋯ + 𝑐𝑚𝑠 𝒗𝑠
Equating the 𝑖th components of the vectors on the left and
right, we get
𝑎1𝑖 = 𝑐11 𝑏1𝑖 + 𝑐12 𝑏2𝑖 + ⋯ + 𝑐1𝑠 𝑏𝑠𝑖
.
.
.
𝑎𝑚𝑖 = 𝑐𝑚1 𝑏1𝑖 + 𝑐𝑚2 𝑏2𝑖 + ⋯ + 𝑐𝑚𝑠 𝑏𝑠𝑖
This may be written
(𝑎1𝑖 , … , 𝑎𝑚𝑖 )ᵀ = 𝑏1𝑖 (𝑐11 , … , 𝑐𝑚1 )ᵀ + 𝑏2𝑖 (𝑐12 , … , 𝑐𝑚2 )ᵀ + ⋯ + 𝑏𝑠𝑖 (𝑐1𝑠 , … , 𝑐𝑚𝑠 )ᵀ
This implies that each column vector of 𝐴 lies in a space
spanned by a single set of 𝑠 vectors. Since 𝑠 is the
dimension of the row space of 𝐴, we get
dim(column space of 𝐴) ≤ dim(row space of 𝐴)
By similar reasoning, we can show that
dim(row space of 𝐴) ≤ dim(column space of 𝐴)
Combining these two results we see that
dim(row space of 𝐴) = dim(column space of 𝐴)
proving the theorem.
Definition: The dimension of the row space and the
column space of a matrix 𝐴 is called the rank of 𝐴. The
rank of 𝐴 is denoted rank 𝐴 .
EXAMPLE: Determine the rank of the matrix
1 2 3
𝐴= 0 1 2
2 5 8
Solution: We see by inspection that the third row of 𝐴 is a
linear combination of the first two rows:
2,5,8 = 2 1,2,3 + 0,1,2
Hence the three rows of 𝐴 are linearly dependent. The rank
of 𝐴 must be less than 3. Since 1,2,3 is not a scalar
multiple of 0,1,2 , these two vectors are linearly
independent. These vectors form a basis for the row space
of 𝐴. Thus rank 𝐴 = 2.
This method, based on the definition, is not practical for
determining the ranks of larger matrices. We shall give a
more systematic method for finding the rank of a matrix.
The following theorem, which paves the way for the
method, tells us that the rank of a matrix that is in
reduced echelon form is immediately known.
Theorem: The nonzero row vectors of a matrix 𝐴 that is in
reduced echelon form are a basis for the row space of 𝐴.
The rank of 𝐴 is the number of nonzero row vectors.
Proof: Let 𝐴 be an 𝑚 × 𝑛 matrix with nonzero row vectors 𝒓1 , … , 𝒓𝑡 . Consider the identity
𝑘1 𝒓1 + 𝑘2 𝒓2 + ⋯ + 𝑘𝑡 𝒓𝑡 = 0
where 𝑘1 , … , 𝑘𝑡 are scalars.
The first nonzero element of 𝒓1 is 1. 𝒓1 is the only one of the
row vectors to have a nonzero number in this component.
Thus, on adding the vectors 𝑘1 𝒓1 , 𝑘2 𝒓2 , … , 𝑘𝑡 𝒓𝑡 , we get a
vector whose first component is 𝑘1 . On equating this vector
to zero, we get 𝑘1 = 0. The identity then reduces to
𝑘2 𝒓2 + ⋯ + 𝑘𝑡 𝒓𝑡 = 0
The first nonzero element of 𝒓2 is 1, and it is the only one of
these remaining row vectors with a nonzero number in this
component. Thus 𝑘2 = 0. Similarly, 𝑘3 , … , 𝑘𝑡 are all zero. The
vectors 𝒓1 , … , 𝒓𝑡 are therefore linearly independent. These
vectors span the row space of 𝐴. They thus form a basis for
the row space of 𝐴. The dimension of the row space is 𝑡.
The rank of 𝐴 is 𝑡, the number of nonzero row vectors in 𝐴.
Theorem: Let 𝐴 and 𝐵 be row equivalent matrices. Then 𝐴
and 𝐵 have the same row space. Rank 𝐴 = rank 𝐵 .
Proof: Since 𝐴 and 𝐵 are row equivalent, the rows of 𝐵 can
be obtained from the rows of 𝐴 through a sequence of
elementary row operations. Therefore each row of 𝐵 is a
linear combination of the rows of 𝐴. Thus the row space of
𝐵 is contained in the row space of 𝐴.
In the same way, the rows of 𝐴 can be obtained from the
rows of 𝐵 through a sequence of elementary row
operations, implying that the row space of 𝐴 is contained in
the row space of 𝐵.
It follows that the row spaces of 𝐴 and 𝐵 are equal. Since
their row spaces are equal, their ranks must be equal.
The next result brings the last two results together to give
a method for finding a basis for the row space of a matrix
and the rank of the matrix.
Theorem: Let 𝐸 be a reduced echelon form of a matrix 𝐴.
The nonzero row vectors of 𝐸 form a basis for the row space
of 𝐴. The rank of 𝐴 is the number of nonzero row vectors in
𝐸.
The concept of rank plays an important role in
understanding the behavior of systems of linear equations.
We have seen how systems of linear equations can have a
unique solution, many solutions, or no solutions at all.
These situations can be conveniently categorized in terms
of the ranks of the augmented matrix and the matrix of
coefficients.
Theorem: Consider a system of 𝑚 equations in 𝑛 variables.
(a) If the augmented matrix and the matrix of coefficients have the same rank 𝑟 and 𝑟 = 𝑛, the solution is unique.
(b) If the augmented matrix and the matrix of coefficients have the same rank 𝑟 and 𝑟 < 𝑛, there are many solutions.
(c) If the augmented matrix and the matrix of coefficients do not have the same rank, a solution does not exist.
Proof: Let the system of equations be
𝑎11 𝑥1 + ⋯ + 𝑎1𝑛 𝑥𝑛 = 𝑏1
.
.
.
𝑎𝑚1 𝑥1 + ⋯ + 𝑎𝑚𝑛 𝑥𝑛 = 𝑏𝑚
This system can be written
𝑥1 (𝑎11 , … , 𝑎𝑚1 )ᵀ + ⋯ + 𝑥𝑛 (𝑎1𝑛 , … , 𝑎𝑚𝑛 )ᵀ = (𝑏1 , … , 𝑏𝑚 )ᵀ
That is
𝑥1 𝒂1 + ⋯ + 𝑥𝑛 𝒂𝑛 = 𝒃 ------- (1)
Let us now look at the three possibilities.
(a) Since the ranks of the matrix of coefficients and
augmented matrix are the same, 𝒃 must be linearly
dependent on 𝒂1 , … , 𝒂𝑛 . Furthermore, since the rank
is 𝒏 , the vectors 𝒂1 , … 𝒂𝑛 are linearly independent
and thus form a basis for the column space of the
augmented matrix. Therefore Equation (1) has a
unique solution; the solution to the system is
unique.
(b) Since the ranks of the matrix of coefficients and
augmented matrix are the same, 𝒃 must be linearly
dependent on 𝒂1 , … , 𝒂𝑛 . However since rank < 𝒏, the
vectors 𝒂1 , … 𝒂𝑛 are linearly dependent. 𝒃 can
therefore be expressed in more than one way as a
linear combination of 𝒂1 , … 𝒂𝑛 . Thus Equation (1) has
many solutions; the solution to the system exists
but is not unique.
(c) Since the rank of the augmented matrix is not equal
to the rank of the matrix of coefficients, 𝒃 is linearly
independent of 𝒂1 , … 𝒂𝑛 . Thus Equation (1) has no
solution; a solution to the system does not exist.
Problem 1: Find a basis for the row space of the following
matrix 𝐴, and determine its rank.
1 2 3
𝐴= 2 5 4
1 1 5
Solution: Use elementary row operations to find a reduced
echelon form of the matrix 𝐴. We get
1 2 3 1 2 3 1 0 7
2 5 4 ≈ 0 1 −2 ≈ 0 1 −2
1 1 5 0 −1 2 0 0 0
The two vectors 1,0,7 , 0,1, −2 form a basis for the row
space of 𝐴. Rank 𝐴 = 2.
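Numerically, numpy.linalg.matrix_rank counts the singular values above a tolerance, and it agrees with the row reduction above (a sketch of ours):

import numpy as np

A = np.array([[1., 2., 3.],
              [2., 5., 4.],
              [1., 1., 5.]])
print(np.linalg.matrix_rank(A))    # 2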
Problem 2: Find a basis for the column space of the
following matrix 𝐴.
1 1 0
𝐴= 2 3 −2
−1 −4 6

Solution: The transpose of 𝐴 is
𝐴𝑡 = [ 1 2 −1; 1 3 −4; 0 −2 6 ]
The column space of 𝐴 becomes the row space of 𝐴𝑡 . Let us find a basis for the row space of 𝐴𝑡 . Compute a reduced echelon form of 𝐴𝑡 .
[ 1 2 −1; 1 3 −4; 0 −2 6 ] ≈ [ 1 2 −1; 0 1 −3; 0 −2 6 ] ≈ [ 1 0 5; 0 1 −3; 0 0 0 ]
The nonzero row vectors of this echelon form, namely (1, 0, 5) and (0, 1, −3), are a basis for the row space of 𝐴𝑡 . Write these vectors in column form to get a basis for the column space of 𝐴. The following vectors are a basis for the column space of 𝐴:
(1, 0, 5)ᵀ , (0, 1, −3)ᵀ
Problem 3: Find a basis for the subspace 𝑉 of 𝑹4 spanned
by the vectors

1,2,3,4 , −1, −1, −4, −2 , 3,4,11,8

Solution: We construct a matrix 𝐴 having these vectors as


row vectors.
1 2 3 4
𝐴 = −1 −1 −4 −2
3 4 11 8
Determine a reduced echelon form of 𝐴. We get
1 2 3 4 1 2 3 4 1 0 5 0
−1 −1 −4 −2 ≈ 0 1 −1 2 ≈ 0 1 −1 2
3 4 11 8 0 −2 2 −4 0 0 0 0
The nonzero vectors of this reduced echelon form, namely
1,0,5,0 and 0,1, −1,2 , are a basis for the subspace 𝑉.
Exercise
1. Determine the ranks of the following matrices using the
definition of rank.
1 2 1
a. 2 4 2
1 2 3
2 1 3
b. 4 2 6
2 1 3
1 3 4
c. −1 3 1
0 6 5

2. Find the reduced echelon form for each of the following


matrices. Use the echelon form to determine a basis for
the row space and the rank of each matrix.
1 2 −1
a. 2 5 2
0 2 9
1 −3 2
b. −2 6 −4
−1 3 −2

3. Find bases for the subspaces of 𝑹3 spanned by the


following vectors.
a. 1,3,2 , 0,1,4 , 1,4,9
b. 1, −1,3 , 1,0,1 , −2,1, −4
4. Find bases for both the row and column spaces of the
following matrix 𝐴. Show that the dimensions of both row
space and column space are the same.
1 2 −1
𝐴= 0 1 3
1 4 6

5. Let 𝐴 be a 3 X 4 matrix. Prove that the column vectors of


𝐴 are linearly dependent.
6. Let 𝐴 be a 𝑛 × 𝑛 invertible matrix. Prove that the row
vectors of 𝐴 form a basis for 𝑹𝑛 .
7. Let 𝐴 be a 𝑛 × 𝑛 invertible matrix. Prove that the columns
of 𝐴 are linearly independent if and only if rank 𝐴 = 𝑛.
8. If 𝐴 and 𝐵 are matrices of the same size, prove that
rank 𝐴 + 𝐵 ≤ rank 𝐴 + rank 𝐵 .

Answers
1.
a. 2
b. 1
c. 2

2.
1 0 0
a. 0 1 0 . Basis 1,0,0 , 0,1,0 and 0,0,1 . Rank 3.
0 0 1
1 −3 2
b. 0 0 0 . Basis 1, −3,2 . Rank 1.
0 0 0

3.
a. The given vectors are linearly independent so they
are a basis for the space they span, which is 𝑹3 .
1 −1 3 1 0 1
b. 1 0 1 ≈ ⋯ ≈ 0 1 −2 . Vectors 1,0,1 and
−2 1 −4 0 0 0
0,1, −2 are a basis for the space spanned by the
given vectors.

1 2 −1 1 0 0
4. 𝐴 = 0 1 3 ≈ ⋯ ≈ 0 1 0 . Vectors 1,0,0 , 0,1,0
1 4 6 0 0 1
and 0,0,1 are a basis for the row space of 𝐴.
1.10
Eigenvalues and Eigenvectors

Definition: Let 𝐴 be an 𝑛 × 𝑛 matrix. A scalar 𝜆 is called an eigenvalue of 𝐴 if there exists a nonzero vector 𝑋 in ℝ𝑛 such that 𝐴𝑋 = 𝜆𝑋. The vector 𝑋 is called an eigenvector corresponding to 𝜆.
Let us look at the geometrical significance of an eigenvector that corresponds to a nonzero eigenvalue. The vector 𝐴𝑋 is in the same or opposite direction as 𝑋, depending on the sign of 𝜆. An eigenvector of 𝐴 is thus a vector whose direction is unchanged or reversed when multiplied by 𝐴.

Computation of Eigenvalues and Eigenvectors

Let 𝐴 be an 𝑛 × 𝑛 matrix with eigenvalue 𝜆 and corresponding eigenvector 𝑋. Thus 𝐴𝑋 = 𝜆𝑋. This equation may be rewritten as 𝐴𝑋 − 𝜆𝑋 = 0, giving (𝐴 − 𝜆𝐼𝑛 )𝑋 = 0.
This matrix equation represents a system of homogeneous linear equations having matrix of coefficients 𝐴 − 𝜆𝐼𝑛 . 𝑋 = 0 is a solution to this system. However, eigenvectors have been defined to be nonzero vectors. Further, nonzero solutions to this system of equations can only exist if the matrix of coefficients is singular, |𝐴 − 𝜆𝐼𝑛 | = 0. Hence, solving the equation |𝐴 − 𝜆𝐼𝑛 | = 0 for 𝜆 leads to all the eigenvalues of 𝐴.
On expanding the determinant |𝐴 − 𝜆𝐼𝑛 |, we get a polynomial in 𝜆. This polynomial is called the characteristic polynomial of A. The equation |𝐴 − 𝜆𝐼𝑛 | = 0 is called the characteristic equation of 𝐴.
The eigenvalues are then substituted back into the equation (𝐴 − 𝜆𝐼𝑛 )𝑋 = 0 to find the corresponding eigenvectors.
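In practice, eigenvalues are not found by expanding the determinant; library routines factor the matrix directly. A sketch of ours (NumPy assumed), run on the matrix of Problem 1 below:

import numpy as np

A = np.array([[-4., -6.],
              [ 3.,  5.]])
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                     # 2 and -1 (order may vary)
# Each column of eigvecs is a unit eigenvector; check A X = lambda X:
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)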

We are always interested in knowing whether sets of


vectors form subspaces!

Theorem: Let 𝐴 be an 𝑛 × 𝑛 matrix and 𝜆 an eigenvalue of


𝐴. The set of all eigenvectors corresponding to 𝜆, together
with the zero vectors, is a subspace of ℝ𝑛 . This subspace is
called the eigenspace of 𝜆.
Proof: In order to show that the eigenspace is a subspace,
we have to show that it is closed under vector addition and
scalar multiplication.

Let 𝑋1 and 𝑋2 be two vectors in the eigenspace of 𝜆 and let


𝑐 be a scalar.

Then 𝐴𝑋1 = 𝜆𝑋1 and 𝐴𝑋2 = 𝜆𝑋2 .

Hence, 𝐴𝑋1 + 𝐴𝑋2 = 𝜆𝑋1 + 𝜆𝑋2


𝐴( 𝑋1 + 𝑋2 ) = 𝜆(𝑋1 + 𝑋2 ).
Thus 𝑋1 + 𝑋2 is a vector in the eigenspace of 𝜆 . The
eigenspace is closed under addition.

Further, since 𝐴𝑋1 = 𝜆𝑋1 ,


𝑐𝐴𝑋1 = 𝑐𝜆𝑋1

𝐴 𝑐𝑋1 = 𝜆 𝑐𝑋1 .
Therefore 𝑐𝑋1 is a vector in the eigenspace of 𝜆 . The
eigenspace is closed under scalar multiplication.

Thus the eigenspace is a subspace.

Properties of Eigen Values

Property 1: The sum of the eigenvalues of a square matrix


𝐴 is the sum of the diagonal elements (trace) of 𝐴.

Property 2: The product of the eigenvalues is |𝐴|.

Proof: Let 𝐴 = [𝑎𝑖𝑗 ] be the 𝑛 × 𝑛 matrix. The eigen values are obtained from the characteristic equation
|𝐴 − 𝜆𝐼| = 0
Let |𝐴 − 𝜆𝐼| = 𝑎0 𝜆𝑛 + 𝑎1 𝜆𝑛−1 + ⋯ + 𝑎𝑛 …………(1)
Setting 𝜆 = 0 we get |𝐴| = 𝑎𝑛 …………(2)
Also, expanding |𝐴 − 𝜆𝐼| by the first row we get
|𝐴 − 𝜆𝐼| = (−1)𝑛 𝜆𝑛 + (−1)𝑛−1 (𝑎11 + 𝑎22 + ⋯ + 𝑎𝑛𝑛 )𝜆𝑛−1 + ⋯ …………(3)
Comparing (1) and (3) we get
𝑎0 = (−1)𝑛 ; 𝑎1 = (−1)𝑛−1 (𝑎11 + 𝑎22 + ⋯ + 𝑎𝑛𝑛 ) …………(4)
If 𝜆1 , 𝜆2 , … , 𝜆𝑛 are the eigen values (i.e., the roots of the characteristic equation), then
Sum of the eigen values = sum of the roots = 𝜆1 + 𝜆2 + ⋯ + 𝜆𝑛 = −𝑎1 /𝑎0 = 𝑎11 + 𝑎22 + ⋯ + 𝑎𝑛𝑛 (using (4)) = trace of 𝐴.
Product of the eigen values = product of the roots = 𝜆1 𝜆2 ⋯ 𝜆𝑛 = (−1)𝑛 𝑎𝑛 /𝑎0 = (−1)𝑛 𝑎𝑛 /(−1)𝑛 = 𝑎𝑛 = |𝐴| (from (2)).
Property 3: The eigen values of 𝐴 and its transpose 𝐴𝑇 are
the same.

Proof: It is enough if we prove that 𝐴 and 𝐴𝑇 have the same characteristic polynomial. Since for any square matrix 𝑀, |𝑀| = |𝑀𝑇 |, we have
|𝐴 − 𝜆𝐼| = |(𝐴 − 𝜆𝐼)𝑇 | = |𝐴𝑇 − (𝜆𝐼)𝑇 | = |𝐴𝑇 − 𝜆𝐼|
Hence the result.

Property 4: If 𝜆 is an eigen value of a non singular matrix 𝐴 then 1/𝜆 is an eigen value of 𝐴−1 .
Proof: Let 𝑋 be an eigen vector corresponding to 𝜆.
Then 𝐴𝑋 = 𝜆𝑋. Since 𝐴 is non singular, 𝐴−1 exists.
∴ 𝐴−1 (𝐴𝑋) = 𝐴−1 (𝜆𝑋)
𝐼𝑋 = 𝜆𝐴−1 𝑋
∴ 𝐴−1 𝑋 = (1/𝜆)𝑋.
∴ 1/𝜆 is an eigen value of 𝐴−1 .

Corollary: If 𝜆1 , 𝜆2 , … , 𝜆𝑛 are the eigen values of a non singular matrix 𝐴 then 1/𝜆1 , 1/𝜆2 , … , 1/𝜆𝑛 are the eigen values of 𝐴−1 .
Property 5: If 𝜆 is an eigen value of 𝐴 then 𝑘𝜆 is an eigen
value of 𝑘𝐴 where 𝑘 is a scalar.

Proof: Let 𝑋 be an eigen vector corresponding to 𝜆.
Then 𝐴𝑋 = 𝜆𝑋 …………(1)
Now, (𝑘𝐴)𝑋 = 𝑘(𝐴𝑋)
= 𝑘(𝜆𝑋) (by (1))
= (𝑘𝜆)𝑋.
∴ 𝑘𝜆 is an eigen value of 𝑘𝐴.

Property 6: If 𝜆 is an eigen value of 𝐴 then 𝜆𝑘 is an eigen


value of 𝐴𝑘 where 𝑘 is any positive integer.

Proof: Let 𝑋 be an eigen vector corresponding to 𝜆.
Then 𝐴𝑋 = 𝜆𝑋 …………(1)
Now, 𝐴2 𝑋 = (𝐴𝐴)𝑋 = 𝐴(𝐴𝑋)
= 𝐴(𝜆𝑋) (by (1))
= 𝜆(𝐴𝑋)
= 𝜆(𝜆𝑋) (by (1))
= 𝜆2 𝑋.
∴ 𝜆2 is an eigen value of 𝐴2 .

Proceeding like this we can prove that 𝜆𝑘 is an eigen value


of 𝐴𝑘 for any positive integer.

Corollary: If 𝜆1 , 𝜆2 , … , 𝜆𝑛 are eigen values of 𝐴 then 𝜆1𝑘 , 𝜆2𝑘 , … , 𝜆𝑛𝑘 are eigen values of 𝐴𝑘 for any positive integer 𝑘.
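These properties are easy to sanity-check numerically. The sketch below is ours (NumPy assumed), run on the matrix of Problem 1 later in this module:

import numpy as np

A = np.array([[-4., -6.],
              [ 3.,  5.]])
lam = np.sort(np.linalg.eigvals(A))

assert np.isclose(lam.sum(), np.trace(A))                  # Property 1
assert np.isclose(lam.prod(), np.linalg.det(A))            # Property 2
assert np.allclose(np.sort(np.linalg.eigvals(A.T)), lam)   # Property 3
assert np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))),
                   np.sort(1 / lam))                       # Property 4
assert np.allclose(np.sort(np.linalg.eigvals(A @ A)),
                   np.sort(lam ** 2))                      # Property 6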
CAYLEY-HAMILTON THEOREM:
Statement: Every square matrix satisfies its own
characteristic equation.

Proof: Let 𝐴 be a 𝑛 × 𝑛 square matrix. Then characteristic


equation of 𝐴 is 𝐴 − 𝜆𝐼 = 0,

Let |𝐴 − 𝜆𝐼| = (−1)𝑛 (𝜆𝑛 + 𝑎𝑛−1 𝜆𝑛−1 + ⋯ + 𝑎0 ).

Cofactor of any element of 𝐴 − 𝜆𝐼 is a polynomial of degree


at most equal to 𝑛 − 1.
Hence, adj 𝐴 − 𝜆𝐼 may be expressed as a matrix
polynomial in 𝜆.

Let adj 𝐴 − 𝜆𝐼 = 𝐵𝑛 −1 𝜆𝑛−1 + 𝐵𝑛−2 𝜆𝑛−2 + ⋯ + 𝐵1 𝜆 + 𝐵0 ,

Where 𝐵0 , … . , 𝐵𝑛−1 are all matrices of order 𝑛 and whose


elements are functions of the elements of the matrix 𝐴 .
Now, we have

(𝐴 − 𝜆𝐼) adj(𝐴 − 𝜆𝐼) = |𝐴 − 𝜆𝐼| 𝐼.
i.e.,
(𝐴 − 𝜆𝐼)(𝐵𝑛−1 𝜆𝑛−1 + ⋯ + 𝐵0 ) = (−1)𝑛 (𝜆𝑛 + 𝑎𝑛−1 𝜆𝑛−1 + ⋯ + 𝑎0 ) 𝐼.

Comparing the coefficients of like powers of 𝜆 on both


sides,
−𝐵𝑛−1 = (−1)𝑛 𝐼.
𝐴𝐵𝑛−1 − 𝐵𝑛−2 = (−1)𝑛 𝑎𝑛−1 𝐼.
𝐴𝐵𝑛−2 − 𝐵𝑛−3 = (−1)𝑛 𝑎𝑛−2 𝐼.
…………………………
𝐴𝐵1 − 𝐵0 = (−1)𝑛 𝑎1 𝐼.
𝐴𝐵0 = (−1)𝑛 𝑎0 𝐼.

Pre-multiplying the first relation by 𝐴𝑛 , the second by 𝐴𝑛−1 , …, the last by 𝐼, and adding the above relations, we get
0 = (−1)𝑛 (𝐴𝑛 + 𝑎𝑛−1 𝐴𝑛−1 + ⋯ + 𝑎1 𝐴 + 𝑎0 𝐼)
⇒ (−1)𝑛 (𝐴𝑛 + 𝑎𝑛−1 𝐴𝑛−1 + ⋯ + 𝑎1 𝐴 + 𝑎0 𝐼) = 0
Hence 𝐴 satisfies its characteristic equation.

Remark: Determination of 𝐴−1 using the Cayley-Hamilton theorem.
𝐴 satisfies its characteristic equation:
(−1)𝑛 (𝐴𝑛 + 𝑎𝑛−1 𝐴𝑛−1 + ⋯ + 𝑎1 𝐴 + 𝑎0 𝐼) = 0
⇒ 𝐴𝑛 + 𝑎𝑛−1 𝐴𝑛−1 + ⋯ + 𝑎1 𝐴 + 𝑎0 𝐼 = 0
Pre-multiplying by 𝐴−1 (if 𝐴 is non-singular),
𝐴𝑛−1 + 𝑎𝑛−1 𝐴𝑛−2 + ⋯ + 𝑎1 𝐼 + 𝑎0 𝐴−1 = 0
⇒ 𝑎0 𝐴−1 = −𝐴𝑛−1 − 𝑎𝑛−1 𝐴𝑛−2 − ⋯ − 𝑎1 𝐼
∴ 𝐴−1 = −(1/𝑎0 )(𝐴𝑛−1 + 𝑎𝑛−1 𝐴𝑛−2 + ⋯ + 𝑎1 𝐼).
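A concrete check of both the theorem and the inverse formula (a sketch of ours; NumPy assumed — numpy.poly returns the coefficients of det(𝜆𝐼 − 𝐴); the matrix is the one from Exercise 8 below):

import numpy as np

A = np.array([[ 2., -1.,  1.],
              [-1.,  2., -1.],
              [ 1., -1.,  2.]])
c = np.poly(A)   # det(lambda I - A) = lambda^3 + c[1] lambda^2 + c[2] lambda + c[3]

# Cayley-Hamilton: A^3 + c[1] A^2 + c[2] A + c[3] I = 0
p_of_A = sum(ck * np.linalg.matrix_power(A, 3 - k) for k, ck in enumerate(c))
assert np.allclose(p_of_A, np.zeros((3, 3)))

# Inverse from the theorem: A^{-1} = -(A^2 + c[1] A + c[2] I) / c[3]
A_inv = -(A @ A + c[1] * A + c[2] * np.eye(3)) / c[3]
assert np.allclose(A_inv, np.linalg.inv(A))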

Applications
Eigenvalues and eigenvectors are extremely important tools
in applying mathematics. They are used in many branches
of engineering and the natural and social sciences. Now we
see use of eigenvectors in long term behavior of the
population distribution.

It is estimated that the number of people living in cities in


the United States during 2000 is 58 million. The number of
people living in the surrounding suburbs is 142 million.
 58 
Let us represent this information by the matrix 𝑋0 =   .
142 
Consider the population flow from cities to suburbs.
During 2000, the probability of a person staying in the city
was 0.96 . Thus the probability of moving to the suburbs
was 0.04 (assuming that all those who moved went to the
suburbs). Consider now the reverse population flow, from
suburbia to city. The probability of a person moving to the
city was 0.01; the probability of remaining in suburbia was
0.99. These probabilities can be written as the elements of a
stochastic matrix P:

(from) (to)
City Suburb

0.96 0.01 city


P 
0.04 0.99  Suburb

The probability of moving from location 𝐴 to location 𝐵 is


given by the element in column 𝐴 and row 𝐵 . In this
context, the stochastic matrix is called a matrix of
transition probabilities.
Now consider the population distribution in 2001, one year
later:

City population in 2001 = People who remained from 2000 + People who moved in from the suburbs
= (0.96 × 58) + (0.01 × 142)
= 57.1 million
Suburban population in 2001 = People who moved in from
the city +People who stayed
from 2000

= 0.04 × 58 + (0.99 × 142)

= 142.9 million.

Note that we can arrive at these numbers using matrix multiplication:
[ 0.96 0.01; 0.04 0.99 ] (58, 142)ᵀ = (57.1, 142.9)ᵀ
Using 2000 as the base year, let 𝑋1 be the population distribution in 2001, one year later. We can write 𝑋1 = 𝑃𝑋0 .
Assume that the population flow represented by the matrix
𝑃 is unchanged over the years. The population distribution
𝑋2 after 2 years is given by

𝑋2 = 𝑃𝑋1

After 3 years the population distribution is given by


𝑋3 = 𝑃𝑋2

After 𝑛 years we get


𝑋𝑛 = 𝑃𝑋𝑛 −1
The predictions of this model (to four decimal places, in millions, with the city population first and the suburban population second) are
𝑋0 = (58, 142)ᵀ, 𝑋1 = (57.1, 142.9)ᵀ, 𝑋2 = (56.245, 143.755)ᵀ,
𝑋3 = (55.4327, 144.5672)ᵀ, 𝑋4 = (54.6611, 145.3389)ᵀ,
and so on.

Suppose the sequence 𝑋0 , 𝑋1 , 𝑋2 , … converges to some fixed vector 𝑋. Then 𝑃𝑋 = 𝑋, and the population movement would be in a steady state, with the total city population and total suburban population remaining constant thereafter. We then write
𝑋0 , 𝑋1 , 𝑋2 , … → 𝑋

Since such a vector 𝑋 satisfies 𝑃𝑋 = 𝑋 , it would be an


eigenvector of 𝑃 corresponding the eigenvalue 1. Knowledge
of the existence and value of such a vector would give us
information about the long term behavior of the population
distribution.
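The long-term distribution can be found either by iterating or directly as the eigenvector of 𝑃 for the eigenvalue 1, scaled so that its entries total 200 million. A sketch of ours (NumPy assumed):

import numpy as np

P = np.array([[0.96, 0.01],
              [0.04, 0.99]])
x = np.array([58., 142.])

for _ in range(1000):              # iterate X_{n+1} = P X_n
    x = P @ x
print(x)                           # approaches the steady state

vals, vecs = np.linalg.eig(P)
v = vecs[:, np.argmin(np.abs(vals - 1.0))]   # eigenvector for eigenvalue 1
print(200 * v / v.sum())           # [ 40. 160.]: 40 million city, 160 million suburbs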
Problem 1: Find the eigenvalues and eigenvectors of the matrix 𝐴 = [ −4 −6; 3 5 ].
Solution: Let us first derive the characteristic polynomial of 𝐴. We get
𝐴 − 𝜆𝐼2 = [ −4 −6; 3 5 ] − 𝜆[ 1 0; 0 1 ] = [ −4−𝜆 −6; 3 5−𝜆 ]
Note that the matrix 𝐴 − 𝜆𝐼2 is obtained by subtracting 𝜆 from the diagonal elements of 𝐴. The characteristic polynomial of 𝐴 is
|𝐴 − 𝜆𝐼2 | = (−4 − 𝜆)(5 − 𝜆) + 18 = 𝜆2 − 𝜆 − 2.
We now solve the characteristic equation of 𝐴.
𝜆2 − 𝜆 − 2 = 0
(𝜆 − 2)(𝜆 + 1) = 0
𝜆 = 2 or −1.

The eigenvalues of 𝐴 are 2 and −1.

The corresponding eigenvectors are found by using these


values of 𝜆 in the equation 𝐴 − 𝜆𝐼2 𝑋 = 0. There are many
eigenvectors corresponding to each eigenvalue.

𝝀 = 𝟐: We solve the equation 𝐴 − 2𝐼2 𝑋 = 0 for x. The matrix


𝐴 − 2𝐼2 is obtained by subtracting 2 from the diagonal
elements of A. We get
[ −6 −6; 3 3 ] (𝑥1 , 𝑥2 )ᵀ = 0
This leads to the system of equations
−6𝑥1 − 6𝑥2 = 0

3𝑥1 + 3𝑥2 = 0

giving 𝑥1 = −𝑥2 . The solutions to this system of equations


are 𝑥1 = −𝑟, 𝑥2 = 𝑟, where r is a scalar. Thus the eigenvectors of A corresponding to 𝜆 = 2 are nonzero vectors of the form 𝑟(−1, 1)ᵀ.
𝝀 = −𝟏: We solve the equation (𝐴 + 1𝐼2 )𝑋 = 0 for X. The matrix 𝐴 + 1𝐼2 is obtained by adding 1 to the diagonal elements of 𝐴. We get
[ −3 −6; 3 6 ] (𝑥1 , 𝑥2 )ᵀ = 0
This leads to the system of equations
−3𝑥1 − 6𝑥2 = 0
3𝑥1 + 6𝑥2 = 0
Thus 𝑥1 = −2𝑥2 . The solutions to these equations are 𝑥1 = −2𝑠, 𝑥2 = 𝑠, where s is a scalar. Thus the eigenvectors of A corresponding to 𝜆 = −1 are nonzero vectors of the form 𝑠(−2, 1)ᵀ.
Problem 2: Find the eigenvalues and eigenvectors of the matrix [ 5 4 2; 4 5 2; 2 2 2 ].
Solution: The matrix 𝐴 − 𝜆𝐼3 is obtained by subtracting 𝜆 from the diagonal elements of 𝐴. Thus
𝐴 − 𝜆𝐼3 = [ 5−𝜆 4 2; 4 5−𝜆 2; 2 2 2−𝜆 ].
The characteristic polynomial of 𝐴 is |𝐴 − 𝜆𝐼3 |. Using row and column operations to simplify determinants, we get
|𝐴 − 𝜆𝐼3 | = | 5−𝜆 4 2; 4 5−𝜆 2; 2 2 2−𝜆 |
= | 1−𝜆 −1+𝜆 0; 4 5−𝜆 2; 2 2 2−𝜆 |  (𝑅1 → 𝑅1 − 𝑅2 )
= | 1−𝜆 0 0; 4 9−𝜆 2; 2 4 2−𝜆 |  (𝐶2 → 𝐶2 + 𝐶1 )
= (1 − 𝜆)[(9 − 𝜆)(2 − 𝜆) − 8] = (1 − 𝜆)(𝜆2 − 11𝜆 + 10)
= (1 − 𝜆)(𝜆 − 10)(𝜆 − 1) = −(𝜆 − 10)(𝜆 − 1)2

We now solve the characteristic equation of 𝐴:
−(𝜆 − 10)(𝜆 − 1)2 = 0
𝜆 = 10 or 1

The eigenvalues of A are 10 and 1.


The corresponding eigenvectors are found by using these
values of 𝜆 in the equation 𝐴 − 𝜆𝐼3 𝑋 = 0.

𝜆 = 10: We get (𝐴 − 10𝐼3 )𝑋 = 0:
[ −5 4 2; 4 −5 2; 2 2 −8 ] (𝑥1 , 𝑥2 , 𝑥3 )ᵀ = 0
The solutions to this system of equations are 𝑥1 = 2𝑟, 𝑥2 = 2𝑟 and 𝑥3 = 𝑟, where r is a scalar. Thus the eigenspace of 𝜆 = 10 is the one-dimensional space of vectors of the form 𝑟(2, 2, 1)ᵀ.
𝜆 = 1: Let 𝜆 = 1 in (𝐴 − 𝜆𝐼3 )𝑋 = 0. We get (𝐴 − 1𝐼3 )𝑋 = 0:
[ 4 4 2; 4 4 2; 2 2 1 ] (𝑥1 , 𝑥2 , 𝑥3 )ᵀ = 0
The solutions to this system of equations can be shown to be 𝑥1 = −𝑠 − 𝑡, 𝑥2 = 𝑠, and 𝑥3 = 2𝑡, where s and t are scalars. Thus the eigenspace of 𝜆 = 1 is the space of vectors of the form
(−𝑠 − 𝑡, 𝑠, 2𝑡)ᵀ
Separating the parameters s and t, we can write
(−𝑠 − 𝑡, 𝑠, 2𝑡)ᵀ = 𝑠(−1, 1, 0)ᵀ + 𝑡(−1, 0, 2)ᵀ
Thus the eigenspace of 𝜆 = 1 is a two-dimensional subspace of ℝ3 with basis
{(−1, 1, 0)ᵀ , (−1, 0, 2)ᵀ }
If an eigenvalue occurs as a 𝑘 times repeated root of the
characteristic equation, we say that it is of multiplicity 𝑘.
Thus 𝜆 = 10 has multiplicity 1, while 𝜆 = 1 has multiplicity
2 in this problem.
Problem 3: Let 𝐴 be an 𝑛 × 𝑛 matrix 𝐴 with eigenvalues
𝜆1 , … . . 𝜆𝑛 and corresponding eigenvectors 𝑋1 , … . . 𝑋𝑛 . Prove
that if 𝑐 ≠ 0, then the eigenvalues of 𝑐𝐴 are 𝑐𝜆1 , … . . 𝑐𝜆𝑛 with
corresponding eigenvectors 𝑋1 , … . . 𝑋𝑛 .

Solution: Let 𝜆𝑖 be one of the eigenvalues of 𝐴 with


corresponding eigenvector 𝑋𝑖 . Then 𝐴𝑋𝑖 = 𝜆𝑖 𝑋𝑖 . Multiply both
sides of this equation by 𝑐 to get

𝑐𝐴𝑋𝑖 = 𝑐𝜆𝑖 𝑋𝑖 .

Thus 𝑐𝜆𝑖 is an eigenvalue of 𝑐𝐴 with corresponding


eigenvector 𝑋𝑖 .
Problem 4: If the eigen values of 𝐴 = [ 3 10 5; −2 −3 −4; 3 5 7 ] are 2, 2, 3, find the eigen values of 𝐴−1 and 𝐴2 .
Solution: Since 0 is not an eigen value of 𝐴, 𝐴 is a non-singular matrix and hence 𝐴−1 exists.
The eigen values of 𝐴−1 are 1/2, 1/2, 1/3 and the eigen values of 𝐴2 are 2², 2², 3².
Problem 5: Find the eigen values of 𝐴5 when 𝐴 = [ 3 0 0; 5 4 0; 3 6 1 ].
Solution: Since 𝐴 is lower triangular, the characteristic equation of 𝐴 is obviously (3 − 𝜆)(4 − 𝜆)(1 − 𝜆) = 0.
Hence the eigen values of 𝐴 are 3, 4, 1.
∴ The eigen values of 𝐴5 are 3⁵, 4⁵, 1⁵.


Problem 6: Find the sum and product of the eigen values of the matrix [ 3 −4 4; 1 −2 4; 1 −1 3 ] without actually finding the eigen values.
Solution: Let 𝐴 = [ 3 −4 4; 1 −2 4; 1 −1 3 ]
Sum of the eigen values = trace of 𝐴 = 3 + (−2) + 3 = 4.
Product of the eigen values = |𝐴|.
Now, |𝐴| = 3(−6 + 4) + 4(3 − 4) + 4(−1 + 2) = −6 − 4 + 4 = −6.
∴ Product of the eigen values = −6.


EXERCISE
1. Determine the characteristic polynomials, eigenvalues,
and corresponding eigenspaces of the given 3 × 3 matrices.
a. [ 3 2 −2; −3 −1 3; 1 2 0 ]  b. [ 1 −2 2; −2 1 2; −2 0 3 ]
c. [ 15 7 −7; −1 1 1; 13 7 −5 ]
2. Determine the characteristic polynomials, eigenvalues,
and corresponding eigenspaces of the matrix.
4 2 −2 2
1 3 1 −1
0 0 2 0
1 1 −3 5
3. Prove that if 𝐴 is a diagonal matrix, then its eigenvalues
are the diagonal elements.

4. Prove that A and 𝐴𝑇 have the same eigenvalues.


5. Prove that λ = 0 is an eigenvalue of a matrix A if and only
if A is singular.
6. Prove that if the eigenvalues of a matrix A are
𝜆1 , … . . 𝜆𝑛 with corresponding eigenvectors 𝑋1 , … . . 𝑋𝑛 , then
𝜆1𝑚 , … , 𝜆𝑛𝑚 are eigenvalues of 𝐴𝑚 with corresponding
eigenvectors 𝑋1 , … . . 𝑋𝑛 .
7. Show that the following matrices satisfy their characteristic equations.
a. [ 0 2; −1 3 ]  b. [ 8 −10; 5 −7 ]  c. [ 6 −8; 4 −6 ]  d. [ −1 5; −10 14 ]
8. Verify the Cayley-Hamilton theorem for the matrix 𝐴 and find its inverse, where 𝐴 = [ 2 −1 1; −1 2 −1; 1 −1 2 ].
9. Find the characteristic equation of the matrix 𝐴 = [ 2 1 1; 0 1 0; 1 1 2 ] and hence find the matrix 𝐴7 − 5𝐴6 + 9𝐴5 − 13𝐴4 + 17𝐴3 − 21𝐴2 + 21𝐴 − 8𝐼.

10. Using the Cayley-Hamilton theorem find 𝐴8 , if 𝐴 = [ 1 2; 2 −1 ].

ANSWERS
1)
a. Characteristic polynomial −𝜆3 + 2𝜆2 + 𝜆 − 2; eigenvalues 1, −1, 2; corresponding eigenspaces
𝑟(1, 0, 1)ᵀ, 𝑠(−1, 1, −1)ᵀ, 𝑡(0, 1, 1)ᵀ
b. Characteristic polynomial (1 − 𝜆)2 (3 − 𝜆); eigenvalues 1, 3; corresponding eigenspaces
𝑟(1, 1, 1)ᵀ, 𝑠(0, 1, 1)ᵀ
c. Characteristic polynomial (1 − 𝜆)(2 − 𝜆)(8 − 𝜆); eigenvalues 1, 2, 8; corresponding eigenspaces
𝑟(1, −1, 1)ᵀ, 𝑠(0, 1, 1)ᵀ, 𝑡(1, 0, 1)ᵀ
2) Characteristic polynomial (2 − 𝜆)(2 − 𝜆)(4 − 𝜆)(6 − 𝜆); eigenvalues 2, 4, 6; corresponding eigenspaces
𝑟(1, −1, 0, 0)ᵀ + 𝑠(0, 0, 1, 1)ᵀ, 𝑡(0, 1, 0, −1)ᵀ, 𝑝(1, 0, 0, 1)ᵀ
8) (1/4) [ 3 1 −1; 1 3 1; −1 1 3 ]
9) 𝜆3 − 5𝜆2 + 7𝜆 − 3 = 0, 𝐼
10) 625𝐼
