Erik Wahlén
erik.wahlen@math.lu.se
October 3, 2014
(1) x′ = Ax

is computed using eigenvalues and eigenvectors. This method works when A has n distinct eigenvalues or, more generally, when there is a basis for C^n consisting of eigenvectors of A. In the latter case, we say that A is diagonalizable. In this case, the general solution is given by

x(t) = e^{tA} c,
where
r_n(t) = e^s t^{n+1} / (n + 1)!,

for some s between 0 and t. Since lim_{n→∞} r_n(t) = 0, we have

e^t = Σ_{k=0}^∞ t^k / k!,   t ∈ R.
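As a quick numerical illustration (a sketch of ours, not part of the notes; the helper name and the choice t = 3 are assumptions), the partial sums of the series converge to e^t and the error is controlled by the Lagrange remainder bound:

```python
import math

def exp_partial_sum(t, n):
    """Partial sum sum_{k=0}^{n} t^k / k! of the exponential series."""
    return sum(t**k / math.factorial(k) for k in range(n + 1))

t, n = 3.0, 20
approx = exp_partial_sum(t, n)
error = abs(approx - math.exp(t))
# Lagrange remainder bound: |r_n(t)| <= e^{|t|} |t|^{n+1} / (n+1)!
bound = math.exp(abs(t)) * abs(t) ** (n + 1) / math.factorial(n + 1)
```

Already 21 terms give the value of e^3 to roughly ten digits, in agreement with the remainder estimate.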
The number R is called the radius of convergence of the series. The significance of the radius of
convergence is that the power series converges inside of it, and that the series can
be manipulated as though it were a finite sum there (e.g. differentiated termwise).
To prove this we first show a preliminary lemma.
Lemma 1. Suppose that {f_n} is a sequence of continuously differentiable functions on an interval I. Assume that f_n′ → g uniformly on I and that {f_n(a)} converges for some a ∈ I. Then {f_n} converges pointwise on I to some function f. Moreover, f is continuously differentiable with f′(x) = g(x).
Proof. We have

f_n(x) = f_n(a) + ∫_a^x f_n′(s) ds   for all n.

Since f_n′(s) → g(s) uniformly on the interval between a and x, we can take the limit under the integral and obtain that f_n(x) converges to f(x) defined by

f(x) = f(a) + ∫_a^x g(s) ds,   x ∈ I,

where f(a) = lim_{n→∞} f_n(a). The fundamental theorem of calculus now shows that f is continuously differentiable with f′(x) = g(x).
Theorem 2. Assume that the power series

(2) Σ_{k=0}^∞ a_k (x − x_0)^k

has positive radius of convergence R. The series converges uniformly and absolutely in the interval [x_0 − r, x_0 + r] whenever 0 < r < R and diverges when |x − x_0| > R. The limit is infinitely differentiable and the series can be differentiated termwise.
Proof. If |x − x_0| > R the sequence a_k (x − x_0)^k is unbounded, so the series diverges. Consider an interval |x − x_0| ≤ r, where r ∈ (0, R). Choose r̃ with r < r̃ < R. Then

|a_k (x − x_0)^k| ≤ |a_k| r^k ≤ C (r/r̃)^k

when |x − x_0| ≤ r, where C is a constant such that |a_k| r̃^k ≤ C for all k. Since r/r̃ < 1 we find that

Σ_{k=0}^∞ (r/r̃)^k < ∞.
The Weierstrass M-test then shows that (2) converges uniformly when |x − x_0| ≤ r to some function S(x). S is at least continuous since it is the uniform limit of a sequence of continuous functions. If S is differentiable and can be differentiated termwise, then the derivative is

Σ_{k=0}^∞ (k + 1) a_{k+1} (x − x_0)^k.

This is again a power series with radius of convergence R (prove this!). Hence it converges uniformly on [x_0 − r, x_0 + r], 0 < r < R. It follows from the previous lemma that S is continuously differentiable on (x_0 − R, x_0 + R) and that the power series can be differentiated termwise. An induction argument shows that S is infinitely many times differentiable.
The interval (x_0 − R, x_0 + R) is called the interval of convergence. The power series also converges for complex x with |x − x_0| < R (hence the name radius of convergence). All of the above properties still hold for complex x, but the derivative must now be understood as a complex derivative. This concept is studied in courses in complex analysis; we will not linger on it here.
In particular,

e^{tA} = Σ_{k=0}^∞ t^k A^k / k!.

For this to make sense, we have to show that the series converges for all t ∈ R.
A matrix sequence {A_k}_{k=1}^∞, A_k ∈ C^{n×n}, is said to converge if it converges elementwise, i.e. [A_k]_{ij} converges as k → ∞ for all i and j, where the notation [A]_{ij} is used to denote the element on row i and column j of A (this will also be denoted a_{ij} in some places). As usual, a matrix series is said to converge if the corresponding (matrix) sequence of partial sums converges. Instead of checking if all the elements converge, it is sometimes useful to work with the matrix norm.
The matrix norm of A is defined by ‖A‖ = max_{|x|=1} |Ax|. Equivalently, ‖A‖ is the smallest K ≥ 0 such that

|Ax| ≤ K|x|,   x ∈ C^n,

since

|Ax| = |A(x/|x|)| |x| ≤ ‖A‖ |x|   and   |x/|x|| = 1.
The following proposition follows from the definition (prove this!).

Proposition 6. ‖AB‖ ≤ ‖A‖ ‖B‖.

Proof. We have |ABx| ≤ ‖A‖ |Bx| ≤ ‖A‖ ‖B‖ |x| for all x ∈ C^n.
Proposition 7. We have

(1) |a_{ij}| ≤ ‖A‖ for all i, j;

(2) ‖A‖ ≤ (Σ_{i,j} |a_{ij}|²)^{1/2}.

Proof. (1) Recall that a_{ij} = [A]_{ij}. Let {e_1, . . . , e_n} be the standard basis. Then

Ae_j = (a_{1j}, . . . , a_{nj})^T   ⇒   |a_{ij}| ≤ |Ae_j| ≤ ‖A‖ |e_j| = ‖A‖.

(2) We have that

Ax = (R_1 x, . . . , R_n x)^T,

where R_i = (a_{i1} · · · a_{in}) is row i of A. Hence, by the Cauchy–Schwarz inequality,

|Ax|² = Σ_{i=1}^n |R_i x|² ≤ (Σ_{i=1}^n |R_i|²) |x|² = (Σ_{i,j} |a_{ij}|²) |x|².
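These inequalities are easy to check numerically. The following sketch (ours, not from the notes) uses NumPy's operator 2-norm, which agrees with the norm defined above, and a random test matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

def norm(M):
    # Operator norm ||M|| = max_{|x|=1} |Mx| (the largest singular value).
    return np.linalg.norm(M, 2)

ok_prod = norm(A @ B) <= norm(A) * norm(B) + 1e-12    # Proposition 6
ok_elem = np.abs(A).max() <= norm(A) + 1e-12           # Proposition 7 (1)
ok_frob = norm(A) <= np.linalg.norm(A, 'fro') + 1e-12  # Proposition 7 (2)
```

The small tolerances only guard against floating-point rounding.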
Corollary 8. Let {A_k}_{k=1}^∞ ⊂ C^{n×n} and A ∈ C^{n×n}. Then A_k → A as k → ∞ if and only if ‖A_k − A‖ → 0.
We can now show that our definition of the matrix exponential makes sense.

Proposition 9. The series Σ_{k=0}^∞ t^k A^k / k! defining e^{tA} converges uniformly on compact intervals. Moreover, the function t ↦ e^{tA} is differentiable with derivative Ae^{tA}.

It follows that the general solution of (1) is

x(t) = e^{tA} c,   c ∈ C^n,

and that the solution of the IVP

x′ = Ax,   x(0) = x_0,

is given by

x(t) = e^{tA} x_0.
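As a sanity check (a sketch of ours, not from the notes), SciPy's `expm` computes the matrix exponential, and a finite difference confirms that x(t) = e^{tA} x_0 satisfies x′ = Ax; the matrix below is an arbitrary example:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])  # arbitrary example matrix
x0 = np.array([1.0, 0.0])
t, h = 0.5, 1e-6

x = expm(t * A) @ x0  # x(t) = e^{tA} x0
# Centered difference approximation of x'(t); should be close to A x(t).
dxdt = (expm((t + h) * A) @ x0 - expm((t - h) * A) @ x0) / (2 * h)
residual = np.max(np.abs(dxdt - A @ x))
```

The residual is at the level of finite-difference error, and e^{0·A} x_0 = x_0 recovers the initial condition.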
In certain cases it is possible to compute the matrix exponential directly from
the definition.
Example 11. Suppose that

A = diag(λ_1, . . . , λ_n)

is diagonal. Then A^k = diag(λ_1^k, . . . , λ_n^k), so

e^{tA} = diag( Σ_{k=0}^∞ t^k λ_1^k / k!, . . . , Σ_{k=0}^∞ t^k λ_n^k / k! ) = diag( e^{λ_1 t}, . . . , e^{λ_n t} ).
x(t) = c_1 e^{λ_1 t} v_1 + · · · + c_n e^{λ_n t} v_n.

The initial condition then takes the form

T c = x_0   ⇔   c = T^{−1} x_0.
Thus the solution of the IVP is

x(t) = T e^{tD} T^{−1} x_0,

where

D = diag(λ_1, . . . , λ_n).

This can also be seen by computing the matrix exponential e^{tA} using the following proposition.
Proposition 13. e^{T BT^{−1}} = T e^B T^{−1}.

Proof. This follows by noting that

(T BT^{−1})^k = (T BT^{−1})(T BT^{−1}) · · · (T BT^{−1}) = T B(T^{−1}T)B(T^{−1}T) · · · (T^{−1}T)BT^{−1} = T B^k T^{−1},

whence

Σ_{k=0}^∞ (T BT^{−1})^k / k! = Σ_{k=0}^∞ T B^k T^{−1} / k! = T e^B T^{−1}.
(3) A = T D T^{−1},

with T and D as above. D is the matrix for the linear operator x ↦ Ax in the basis v_1, . . . , v_n. The matrix for this linear operator in the standard basis is simply A. Equation (3) is the change of basis formula for matrices. For convenience we will also refer to D as the matrix for A in the basis v_1, . . . , v_n, although this is a bit sloppy. From (3) and Proposition 13 it follows that
e^{tA} = T e^{tD} T^{−1}.

The matrix for e^{tA} in the basis v_1, . . . , v_n is thus given by e^{tD}. The solution of the IVP is given by

e^{tA} x_0 = T e^{tD} T^{−1} x_0,
confirming our previous result.
We finally record some properties of the matrix exponential which will be useful for matrices which can't be diagonalized.

Lemma 14. AB = BA ⇒ e^A B = Be^A.

Proof. A^k B = A^{k−1}BA = · · · = BA^k. Thus

( Σ_{k=0}^N A^k / k! ) B = B ( Σ_{k=0}^N A^k / k! ),

and letting N → ∞ gives the result.
Proposition 15.

(1) (e^A)^{−1} = e^{−A}.
(2) e^A e^B = e^{A+B} = e^B e^A if AB = BA.
(3) e^{tA} e^{sA} = e^{(t+s)A}.

Proof. (1) We have that

d/dt ( e^{tA} e^{−tA} ) = Ae^{tA} e^{−tA} + e^{tA} (−Ae^{−tA}) = (A − A) e^{tA} e^{−tA} = 0,

where we have used the previous lemma to interchange the order of e^{tA} and A. Hence

e^{tA} e^{−tA} ≡ C (constant matrix).

Setting t = 0, we find that C = I (identity matrix). Setting t = 1, we find that e^A e^{−A} = I.
(2) We have that

d/dt ( e^{t(A+B)} e^{−tA} e^{−tB} ) = (A + B) e^{t(A+B)} e^{−tA} e^{−tB} − e^{t(A+B)} Ae^{−tA} e^{−tB} − e^{t(A+B)} e^{−tA} Be^{−tB}
= (A + B − (A + B)) e^{t(A+B)} e^{−tA} e^{−tB}
= 0,

where we have used the previous lemma in the second line. As in (1) we obtain e^{A+B} e^{−A} e^{−B} = I, and hence e^A e^B = e^{A+B}, where we have used (1).
(3) follows from (2) since tA and sA commute.
Let's use this to compute the matrix exponential of a matrix which can't be diagonalized.

Example 16. Let

D = [2 0; 0 2],   N = [0 1; 0 0]

and

A = D + N = [2 1; 0 2].

The matrix A is not diagonalizable, since the only eigenvalue is 2 and the equation Ax = 2x only has the solutions

x = z (1, 0)^T,   z ∈ C.

Since D is diagonal, we have that

e^{tD} = [e^{2t} 0; 0 e^{2t}].

Since D and N commute, and N² = 0 so that e^{tN} = I + tN, we find that

e^{tA} = e^{tD+tN} = e^{tD} e^{tN} = [e^{2t} 0; 0 e^{2t}] [1 t; 0 1] = [e^{2t} te^{2t}; 0 e^{2t}].
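The closed form in Example 16 can be checked against SciPy's general-purpose `expm` (a small sketch of ours):

```python
import numpy as np
from scipy.linalg import expm

t = 0.3
A = np.array([[2.0, 1.0], [0.0, 2.0]])
# e^{tA} = e^{2t} [[1, t], [0, 1]] as derived in Example 16.
formula = np.exp(2 * t) * np.array([[1.0, t], [0.0, 1.0]])
```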
Exercise 5 below shows that the hypothesis that A and B commute in Proposition 15 (2) is essential.
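A standard counterexample (our sketch; any non-commuting pair works) shows what goes wrong without commutativity:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
# A and B do not commute, and e^A e^B differs from e^{A+B}.
noncommuting = not np.allclose(A @ B, B @ A)
differs = not np.allclose(expm(A) @ expm(B), expm(A + B))
```

Here e^A = I + A and e^B = I + B exactly, since A² = B² = 0, while e^{A+B} involves cosh and sinh of 1.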
2 Generalized eigenvectors
We know now how to compute the matrix exponential when A is diagonalizable. In
the next section we will discuss how this can be done when A is not diagonalizable.
In order to do that, we need to introduce some more advanced concepts from linear
algebra. When A is diagonalizable there is a basis consisting of eigenvectors. The
main idea when A is not diagonalizable is to replace this basis by a basis consisting
of generalized eigenvectors.
Definition 17. Let A ∈ C^{n×n}. A vector v ∈ C^n, v ≠ 0, is called a generalized eigenvector corresponding to the eigenvalue λ if

(A − λI)^m v = 0

for some integer m ≥ 1.

Note that according to this definition, an eigenvector also qualifies as a generalized eigenvector. We also remark that λ in the above definition has to be an eigenvalue, since if m ≥ 1 is the smallest positive integer such that (A − λI)^m v = 0, then w = (A − λI)^{m−1} v ≠ 0 and

(A − λI)w = (A − λI)(A − λI)^{m−1} v = (A − λI)^m v = 0,

so w is an eigenvector and λ is an eigenvalue.
Example 19. Suppose that A has n distinct eigenvalues λ_1, . . . , λ_n with corresponding eigenvectors v_1, . . . , v_n. It then follows that the vectors v_1, . . . , v_n are linearly independent (see Theorem 7.1.2 in Ahmad and Ambrosetti) and thus form a basis for C^n. It follows that

ker(A − λ_i I) = span{v_i},   i = 1, . . . , n,

where ker B = {x ∈ C^n : Bx = 0}, by the definition of a basis. It is also clear that each eigenspace is invariant under A.

More generally, suppose that A is diagonalizable, i.e. that it has k distinct eigenvalues λ_1, . . . , λ_k and that the geometric multiplicity of each eigenvalue λ_i equals the algebraic multiplicity a_i. Let ker(A − λ_i I), i = 1, . . . , k, be the corresponding eigenspaces. We can then find a basis for each eigenspace consisting of a_i eigenvectors. The union of these bases consists of a_1 + · · · + a_k = n elements and is linearly independent, since eigenvectors belonging to different eigenvalues are linearly independent (this follows from an argument similar to Theorem 7.1.2 in Ahmad and Ambrosetti). We thus obtain a basis for C^n and it follows that
where the adjugate matrix adj B is a matrix whose elements are the cofactors of B. More specifically, the element on row i and column j of the adjugate matrix for B is C_{ji}, where C_{ji} = (−1)^{j+i} det B_{ji}, in which B_{ji} is the (n − 1) × (n − 1) matrix obtained by eliminating row j and column i from B (see Sections 7.1.2–7.1.3 of Ahmad and Ambrosetti). Note that each element of adj(A − λI) is a polynomial in λ of degree at most n − 1 (since at least one element of the diagonal is eliminated). Thus,

p_A(λ)(A − λI)^{−1} = λ^{n−1} B_{n−1} + · · · + λB_1 + B_0,

for some constant n × n matrices B_0, . . . , B_{n−1}. Multiplying with A − λI and comparing coefficients, we obtain

−B_{n−1} = a_0 I,
AB_{n−1} − B_{n−2} = a_1 I,
  ⋮
AB_1 − B_0 = a_{n−1} I,
AB_0 = a_n I,

where

p_A(λ) = a_0 λ^n + a_1 λ^{n−1} + · · · + a_{n−1} λ + a_n.

Multiplying the rows by A^n, A^{n−1}, . . . , A, I and adding them, we get p_A(A) = 0 (the Cayley–Hamilton theorem).
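The Cayley–Hamilton theorem is easy to test numerically (our sketch; `np.poly` returns the coefficients of the monic characteristic polynomial of a square matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
coeffs = np.poly(A)  # monic characteristic polynomial, highest power first
# Evaluate p_A(A); Cayley-Hamilton says this is the zero matrix.
P = sum(c * np.linalg.matrix_power(A, len(coeffs) - 1 - k)
        for k, c in enumerate(coeffs))
```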
Lemma 21. Suppose that p(λ) = p_1(λ) p_2(λ), where p_1 and p_2 are relatively prime. If p(A) = 0 we have that

C^n = ker p_1(A) ⊕ ker p_2(A),

and each of these subspaces is invariant under A.

Proof. The invariance follows from p_i(A)Ax = Ap_i(A)x = 0, x ∈ ker p_i(A). Since p_1 and p_2 are relatively prime, it follows by Euclid's algorithm that there exist polynomials q_1, q_2 such that

p_1(λ) q_1(λ) + p_2(λ) q_2(λ) = 1.

Thus

p_1(A) q_1(A) + p_2(A) q_2(A) = I.

Applying this identity to a vector x ∈ C^n, we obtain

x = x_1 + x_2,   x_1 = p_2(A) q_2(A) x,   x_2 = p_1(A) q_1(A) x,

where

p_2(A) x_2 = p_2(A) p_1(A) q_1(A) x = p(A) q_1(A) x = 0,

so that x_2 ∈ ker p_2(A). Similarly x_1 ∈ ker p_1(A). Thus C^n = ker p_1(A) + ker p_2(A). On the other hand, if x ∈ ker p_1(A) ∩ ker p_2(A), we obtain that

x = q_1(A) p_1(A) x + q_2(A) p_2(A) x = 0,

so that the sum is direct.
Proof of Theorem 18. If we select a basis {v_{i,1}, . . . , v_{i,n_i}} for each subspace ker(A − λ_i I)^{a_i}, then the union {v_{1,1}, . . . , v_{1,n_1}, v_{2,1}, . . . , v_{2,n_2}, . . . , v_{k,1}, . . . , v_{k,n_k}} will be a basis for C^n consisting of generalized eigenvectors. The fact that these vectors are linearly independent follows from Theorem 22. Indeed, suppose that

Σ_{i=1}^k ( Σ_{j=1}^{n_i} α_{i,j} v_{i,j} ) = 0.
We remark that the matrix B in the above theorem is not unique. Apart from the order of the blocks B_i, the blocks themselves depend on the particular bases chosen for the generalized eigenspaces. There is a particularly useful way of choosing these bases which gives rise to the Jordan normal form. Although the Jordan normal form will not be required to compute the matrix exponential, we present it here for completeness and since it is mentioned in Ahmad and Ambrosetti.

Theorem 24. Let A ∈ C^{n×n}. There exists an invertible n × n matrix T such that

T^{−1} AT = J,

where J is block diagonal and each block has the form λI + N, where λ is an eigenvalue of A, I is a unit matrix and N has ones on the line directly above the diagonal and zeros everywhere else. In particular, N is nilpotent.
(4) x(t) = c_1 e^{λ_1 t} v_1 + · · · + c_n e^{λ_n t} v_n.
we simply choose c_1, . . . , c_n so that

x(0) = c_1 v_1 + · · · + c_n v_n = x_0.

In other words, the numbers c_i are the coordinates for the vector x_0 in the basis v_1, . . . , v_n. Note that each term e^{λ_i t} v_i in the solution is actually e^{tA} v_i. Since the jth column of the matrix e^{tA} is given by e^{tA} e_j, where {e_1, . . . , e_n} is the standard basis, we can compute the matrix exponential by repeating the above steps with initial data x_0 = e_j for j = 1, . . . , n.

The same approach works when A is not diagonalizable, with the difference that the basis vectors v_i are now generalized eigenvectors instead of eigenvectors. Denote the basis vectors v_{i,j}, i = 1, . . . , k, j = 1, . . . , a_i, as in the proof of Theorem 18 (recall that the dimension of each generalized eigenspace is the algebraic multiplicity of the corresponding eigenvalue). Then the general solution is

x(t) = Σ_{i=1}^k Σ_{j=1}^{a_i} c_{i,j} e^{tA} v_{i,j}.
Since

(A − λ_i I)^{a_i} v_{i,j} = 0,

we find that

e^{tA} v_{i,j} = e^{λ_i t} e^{t(A − λ_i I)} v_{i,j} = e^{λ_i t} Σ_{ℓ=0}^{a_i − 1} (t^ℓ / ℓ!) (A − λ_i I)^ℓ v_{i,j},

where we have also used the fact that λ_i I and A − λ_i I commute and the definition of the matrix exponential. The general solution can therefore be written

(6) x(t) = Σ_{i=1}^k Σ_{j=1}^{a_i} c_{i,j} e^{λ_i t} ( Σ_{ℓ=0}^{a_i − 1} (t^ℓ / ℓ!) (A − λ_i I)^ℓ v_{i,j} ).
In order to solve the IVP (5), we simply have to express x_0 in the basis v_{i,j} to find the coefficients c_{i,j}. Finally, to compute e^{tA} we repeat the above steps for each standard basis vector e_j.
Remark 25. Formula (6) shows that the general solution is a linear combination
of exponential functions multiplied with polynomials. The polynomial factors can
only appear for eigenvalues with (algebraic) multiplicity two or higher. This is
precisely the same structure which we encountered for homogeneous higher order
scalar linear differential equations with constant coefficients.
Remark 26. When A is diagonalizable we see from (4) that the general solution is just a linear combination of exponential functions, without polynomial factors. This seems to contradict the previous remark. A closer look reveals that in this case (A − λ_i I) v_{i,j} = 0, so that only the ℓ = 0 terms in (6) survive.

Remark 27. More generally, it might happen that (A − λ_i I)^ℓ vanishes on the generalized eigenspace ker(A − λ_i I)^{a_i} for some ℓ < a_i. One can show that there is a unique monic polynomial p_min(λ) of smallest degree such that p_min(A) = 0. p_min(λ) is called the minimal polynomial. It divides the characteristic polynomial p_A(λ) and can therefore be factorized as

p_min(λ) = (λ − λ_1)^{m_1} · · · (λ − λ_k)^{m_k},

with m_i ≤ a_i for each i. Repeating the proof of Theorem 22, we obtain that

C^n = ker(A − λ_1 I)^{m_1} ⊕ · · · ⊕ ker(A − λ_k I)^{m_k},

so that ker(A − λ_i I)^{a_i} = ker(A − λ_i I)^{m_i}, i.e. (A − λ_i I)^{m_i} vanishes on ker(A − λ_i I)^{a_i}. In fact, one can show that (A − λ_i I)^m vanishes on ker(A − λ_i I)^{a_i} if and only if m ≥ m_i. In the diagonalizable case we have

p_min(λ) = (λ − λ_1) · · · (λ − λ_k),
In order to compute the matrix exponential, we find the solutions with x(0) = e_1 and x(0) = e_2, respectively. In the first case, we obtain the equation

c_1 (1, −1)^T + c_2 (1, 1)^T = (1, 0)^T,

i.e.

c_1 + c_2 = 1,
−c_1 + c_2 = 0,

so that c_1 = 1/2, c_2 = 1/2. Hence,

e^{At} e_1 = (1/2) e^t (1, −1)^T + (1/2) e^{3t} (1, 1)^T = (1/2) (e^t + e^{3t}, −e^t + e^{3t})^T

and, in the same way,

e^{At} e_2 = −(1/2) e^t (1, −1)^T + (1/2) e^{3t} (1, 1)^T = (1/2) (−e^t + e^{3t}, e^t + e^{3t})^T.

Finally,

e^{At} = (1/2) [e^t + e^{3t}  −e^t + e^{3t};  −e^t + e^{3t}  e^t + e^{3t}].
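The matrix of this example is stated before this excerpt; the computations above are consistent with A = [2 1; 1 2] (an assumption on our part), and under that assumption the formula checks out numerically:

```python
import numpy as np
from scipy.linalg import expm

t = 0.5
A = np.array([[2.0, 1.0], [1.0, 2.0]])  # assumed matrix of the example
et, e3t = np.exp(t), np.exp(3 * t)
formula = 0.5 * np.array([[et + e3t, -et + e3t],
                          [-et + e3t, et + e3t]])
```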
Example 29. Let

A = [−3 4; −1 1].

The characteristic polynomial is (λ + 1)², so λ_1 = −1 is the only eigenvalue. This means that any vector belongs to the generalized eigenspace ker(A + I)², so that

e^{At} v = e^{−t} e^{(A+I)t} v = e^{−t} (I + t(A + I)) v = e^{−t} ( [1 0; 0 1] + t [−2 4; −1 2] ) v = e^{−t} [1 − 2t  4t; −t  1 + 2t] v.

In particular,

e^{At} = e^{−t} [1 − 2t  4t; −t  1 + 2t].
We could come to the same conclusion by using the standard basis vectors e1 and
e2 as our basis v1,1 , v1,2 in the solution formula (6). This would give the general
solution
x(t) = c_1 e^{tA} e_1 + c_2 e^{tA} e_2 = c_1 e^{−t} (1 − 2t, −t)^T + c_2 e^{−t} (4t, 1 + 2t)^T.
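Example 29 can be verified numerically (our sketch):

```python
import numpy as np
from scipy.linalg import expm

t = 0.6
A = np.array([[-3.0, 4.0], [-1.0, 1.0]])
# (A + I)^2 = 0, so e^{tA} = e^{-t}(I + t(A + I)).
formula = np.exp(-t) * np.array([[1 - 2 * t, 4 * t],
                                 [-t, 1 + 2 * t]])
nilpotent = np.allclose((A + np.eye(2)) @ (A + np.eye(2)), 0)
```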
There is one more possibility for 2 × 2 matrices. The matrix could be diagonalizable and still have a double eigenvalue. We leave it as an exercise to show that this can happen if and only if A is a scalar multiple of the identity matrix, i.e. A = λI for some number λ (which will be the only eigenvalue). In this case e^{tA} = e^{λt} I.
Ax = 0 ⇔ x = z (1, 0, 1)^T,
Ax = 2x ⇔ x = z (0, 1, 0)^T,

z ∈ C. Thus v_1 = (1, 0, 1)^T and v_2 = (0, 1, 0)^T are eigenvectors corresponding to λ_1 = 0 and λ_2 = 2, respectively. We see that A is not diagonalizable.

The generalised eigenspace corresponding to λ_2 is simply the usual eigenspace ker(A − 2I), but the one corresponding to λ_1 is ker A². Calculating

A² = [0 0 0; 0 4 0; 0 0 0],

we find e.g. the basis v_{1,1} = v_1 = (1, 0, 1)^T, v_{1,2} = (1, 0, 0)^T for ker A², and we previously found the basis v_{2,1} = v_2 = (0, 1, 0)^T for ker(A − 2I).
We have

e^{tA} (0, 1, 0)^T = e^{2t} (0, 1, 0)^T

and

e^{tA} (1, 0, 1)^T = e^{0·t} (1, 0, 1)^T = (1, 0, 1)^T.

Finally, since (1, 0, 0)^T ∈ ker A²,

e^{tA} (1, 0, 0)^T = (I + tA) (1, 0, 0)^T = [1+t 0 −t; 0 1+2t 0; t 0 1−t] (1, 0, 0)^T = (1 + t, 0, t)^T.
We can thus already write the general solution as

(8) x(t) = c_1 (1, 0, 1)^T + c_2 (1 + t, 0, t)^T + c_3 e^{2t} (0, 1, 0)^T,

where we chose to use the simpler notation c_1, c_2, c_3 for the coefficients instead of c_{1,1}, c_{1,2}, c_{2,1} from eq. (6).
In order to compute e^{tA}, we need to compute e^{tA} e_i for the standard basis vectors. Note that we have already computed

e^{tA} e_1 = (1 + t, 0, t)^T

and

e^{tA} e_2 = e^{2t} (0, 1, 0)^T,

so it remains to compute e^{tA} e_3. We thus need to solve the equation

(0, 0, 1)^T = c_1 (1, 0, 1)^T + c_2 (1, 0, 0)^T + c_3 (0, 1, 0)^T,

and a simple calculation gives c_1 = 1, c_2 = −1 and c_3 = 0, so that

e^{tA} e_3 = (1, 0, 1)^T − (1 + t, 0, t)^T = (−t, 0, 1 − t)^T.

Thus,

e^{tA} = [1+t 0 −t; 0 e^{2t} 0; t 0 1−t].
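Example 30's matrix is stated before this excerpt; the computations shown are consistent with A = [1 0 −1; 0 2 0; 1 0 −1] (our assumption), and with that A the result agrees with SciPy:

```python
import numpy as np
from scipy.linalg import expm

t = 0.4
A = np.array([[1.0, 0.0, -1.0],
              [0.0, 2.0, 0.0],
              [1.0, 0.0, -1.0]])  # assumed matrix of Example 30
formula = np.array([[1 + t, 0.0, -t],
                    [0.0, np.exp(2 * t), 0.0],
                    [t, 0.0, 1 - t]])
```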
Example 31. Suppose that we in the previous example wanted to solve the IVP

x′ = Ax,   x(0) = (0, 1, 1)^T.

Of course, once we have the formula for the matrix exponential we can find the solution by calculating

x(t) = e^{tA} x(0) = [1+t 0 −t; 0 e^{2t} 0; t 0 1−t] (0, 1, 1)^T = (−t, e^{2t}, 1 − t)^T.
We could however also solve the problem by using the general solution (8) and finding c_1, c_2, c_3 to match the initial data. We thus need to solve the system

(0, 1, 1)^T = c_1 (1, 0, 1)^T + c_2 (1, 0, 0)^T + c_3 (0, 1, 0)^T.

e^{tA} = e^{2t} e^{t(A−2I)} = e^{2t} ( I + t(A − 2I) + (t²/2)(A − 2I)² ).

We find that

A − 2I = [1 1 1; 0 0 0; −1 −1 −1]   and   (A − 2I)² = 0.

Thus the term (t²/2)(A − 2I)² vanishes and we obtain
Again p_A(λ) = (λ − 2)³ and thus 2 is the only eigenvalue. This time

A − 2I = [0 0 0; 0 0 1; 1 0 0]   and   (A − 2I)² = [0 0 0; 1 0 0; 0 0 0],

so that

e^{tA} = e^{2t} ( I + t(A − 2I) + (t²/2)(A − 2I)² )
       = e^{2t} ( [1 0 0; 0 1 0; 0 0 1] + t [0 0 0; 0 0 1; 1 0 0] + (t²/2) [0 0 0; 1 0 0; 0 0 0] )
       = [e^{2t} 0 0; (t²/2) e^{2t}  e^{2t}  t e^{2t}; t e^{2t} 0 e^{2t}].
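This computation can also be checked numerically (our sketch; the entries of A − 2I are taken as printed above, so that (A − 2I)³ = 0):

```python
import numpy as np
from scipy.linalg import expm

t = 0.25
N = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])  # N = A - 2I, nilpotent: N^3 = 0
A = 2.0 * np.eye(3) + N
formula = np.exp(2 * t) * (np.eye(3) + t * N + t ** 2 / 2 * (N @ N))
```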
We can also formulate the above algorithm using the block diagonal representation of A from Theorem 23. Let T be the matrix for the corresponding change of basis, i.e. T is the matrix whose columns are the coordinates for the basis vectors in which A takes the block diagonal form B. Then A = T BT^{−1} and

e^{tA} = T e^{tB} T^{−1},

where

e^{tB} = diag(e^{tB_1}, . . . , e^{tB_k})

and

e^{tB_i} = e^{t(λ_i I_i + N_i)} = e^{λ_i t} e^{tN_i} = e^{λ_i t} ( I_i + tN_i + · · · + (t^{a_i−1} / (a_i − 1)!) N_i^{a_i−1} ),
In Example 30 we found that the generalized eigenspace ker A² had the basis

v_{1,1} = (1, 0, 1)^T,   v_{1,2} = (1, 0, 0)^T.

Thus, we take

T = [1 1 0; 0 0 1; 1 0 0]

and find that

T^{−1} = [0 0 1; 1 0 −1; 0 1 0].
Since Av_{1,1} = 0, Av_{1,2} = v_{1,1} and Av_{2,1} = 2v_{2,1}, the corresponding block diagonal matrix is

B = [0 1 0; 0 0 0; 0 0 2] = diag(B_1, B_2),

where

B_1 = [0 1; 0 0],   B_2 = (2).

We find that

e^{tB_1} = I + tB_1 = [1 t; 0 1],   e^{tB_2} = e^{2t}.

Thus,

e^{tB} = [1 t 0; 0 1 0; 0 0 e^{2t}]

and

e^{tA} = T e^{tB} T^{−1} = [1 1 0; 0 0 1; 1 0 0] [1 t 0; 0 1 0; 0 0 e^{2t}] [0 0 1; 1 0 −1; 0 1 0] = [1+t 0 −t; 0 e^{2t} 0; t 0 1−t].
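The block diagonal route can be verified end to end (our sketch, again assuming A = [1 0 −1; 0 2 0; 1 0 −1] for Example 30):

```python
import numpy as np
from scipy.linalg import expm

t = 0.4
A = np.array([[1.0, 0.0, -1.0], [0.0, 2.0, 0.0], [1.0, 0.0, -1.0]])
T = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
B = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
Tinv = np.linalg.inv(T)
same_B = np.allclose(Tinv @ A @ T, B)                  # A = T B T^{-1}
same_exp = np.allclose(expm(t * A), T @ expm(t * B) @ Tinv)
```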
Exercises
1. Compute e^A by summing the power series when

   a) A = [0 1; 1 0]   b) A = [0 1 2; 0 0 2; 0 0 0].
2. Compute e^{tA} by diagonalising the matrix, where A = [0 1; 1 0].
3. Show that ‖e^A‖ ≤ e^{‖A‖}.
4. a) Show that (e^A)^* = e^{A^*}.

   b) Show that e^S is unitary if S is skew symmetric, that is, S^* = −S.

5. Find matrices A and B such that e^A e^B ≠ e^{A+B}.
6. Let

   A_1 = [0 1 1; 0 1 1; 0 1 1],   A_2 = [1 1 2; 1 1 0; 1 1 2],   A_3 = [1 2 0; 3 1 3; 0 2 1].
x′ = Ax,   x(0) = x_0,

where

A = [2 1 1; 4 1 4; 5 1 4]   and   x_0 = (1, 0, 0)^T.
9. The matrix

   A = [18 3 2 12; 0 2 0 0; 2 12 2 1; 24 6 3 16]

   has the eigenvalues 1 and 2. Find the corresponding generalized eigenspaces and determine a basis consisting of generalized eigenvectors.
10. Consider the initial value problem

    x_1′ = x_1 + 3x_2,
    x_2′ = 3x_1 + x_2,
    x(0) = x_0.
11. Can you find a general condition on the eigenvalues of A which guarantees that all solutions of the IVP

    x′ = Ax,   x(0) = x_0,

    converge to zero as t → ∞?
12. The matrices A_1 and A_2 in Exercise 6 have the same eigenvalues. If you've solved Exercise 7 correctly, you will notice that all solutions of the IVP corresponding to A_1 are bounded for t ≥ 0 while there are unbounded solutions of the IVP corresponding to A_2. Explain the difference and try to formulate a general principle.