
Department of Mathematics 2005/06

Controllability of Linear Systems

Candidate Number: 16776

Submitted for the Master of Science


London School of Economics
University of London

Table of Contents

Abstract

I) Introduction

II) Control Theory

   Linear control systems
   Solution to dx/dt = Ax(t) + Bu(t)

III) Controllability

IV) Feedback and eigenvalue placement

V) Algorithm and Program

VI) exp(A), where A ∈ ℝ^(n×n), and the Cayley-Hamilton theorem

List of Figures

Figure I.1: A Control System
Figure II.1: Algebraic Underdetermined Equation
Figure IV.1: A Closed Loop System by State Feedback

Abstract
Control systems of the form

dx/dt = f(x(t), u(t)), t ≥ 0, x(t) ∈ ℝ^n, u(t) ∈ ℝ^m,

are explained and their applications are described. Moreover, the special case of linear
control systems of the form

d/dt x(t) = Ax(t) + Bu(t), t ≥ 0, A ∈ ℝ^(n×n), B ∈ ℝ^(n×m), x(t) ∈ ℝ^n, u(t) ∈ ℝ^m,

is solved in terms of the matrices A and B.
Conditions on the matrices A and B are given that make the system controllable, reachable
spaces are introduced, and the Hautus test for controllability and the Pole Placement
Theorem are proved.
Finally, Matlab code implementing the Pole Placement Theorem is given and a numerical
example is solved with the program.
The necessary background on the exponential of a matrix and the Cayley-Hamilton theorem
is provided in the last chapter.

Chapter I: Introduction

Control theory is a domain of applied mathematics that deals with the basic principles
behind the analysis of control systems. Roughly speaking, control systems are models
described by underdetermined differential equations of the type

dx/dt = f(x(t), u(t)), t ≥ 0, (*)

where f: ℝ^n × ℝ^m → ℝ^n, and for each t ≥ 0 we have x(t) ∈ ℝ^n and u(t) ∈ ℝ^m. The
function u is called the input; it can be freely chosen, that is, it can be specified
arbitrarily, and once this is done, the differential equation (*) has a unique solution x
(given an initial condition x(0), and under some mild conditions on f).
The models given by the differential equation (*) arise frequently in applications from
various sciences such as engineering, economics and biology.

One can think of a control system as a black box. This box, being given the input u,
manufactures the output x, according to the equation.

Schematically this can be represented as the following figure

Figure I.1: A control system

The aim in control theory is to choose the input u in such a manner that some desired
effect is produced on the output x.

Chapter II: Control Theory

As was just said above, in control theory one studies models which are described
by underdetermined differential equations. This means that some of the variables
satisfying the differential equations are 'free'. An example of an underdetermined
algebraic equation would be x² + u² = 1, where both x and u are non-negative real
numbers. In this case we can see that this equation does not have a unique solution of
the form (x, u). Rather, there is freedom in choosing one of the variables, say u, and
once a particular u is chosen, the variable x is uniquely determined by x = √(1 − u²).

Figure II.1: Algebraic Underdetermined Equation

Linear Control Systems


In an analogous manner, in linear control theory one considers the differential
equation

d/dt x(t) = Ax(t) + Bu(t), t ≥ 0, x(0) = x₀, (**)

where A ∈ ℝ^(n×n), B ∈ ℝ^(n×m), and for each t ≥ 0: x(t) ∈ ℝ^n and u(t) ∈ ℝ^m.

The equation (**) is an underdetermined differential equation: indeed, u is free, and
once a particular u ∈ (C[0, ∞))^m has been chosen, this determines a unique solution x.
In the next section we will show that this x is given by

x(t) = e^(tA) x₀ + ∫₀ᵗ e^((t−τ)A) B u(τ) dτ, t ≥ 0.

We shall also explain the meaning of the exponential of a matrix.

It is important to note that the underdetermined differential equation (**) is a special
case of the much more general underdetermined differential equation (*), in which
f: ℝ^n × ℝ^m → ℝ^n is the linear function given by

f(ξ, ϑ) = Aξ + Bϑ, (ξ, ϑ) ∈ ℝ^n × ℝ^m.

That is the reason why systems of the form (**) are called linear control systems.

Models described by equations of the type (**) occur frequently in applications (often
as linear approximations of more complicated non-linear models), and we give one
example below from the biological field.

Consider a lake with two species of fish whose populations x₁, x₂ over a time period
[0, T] are reasonably accurately modelled by

d/dt [x₁(t); x₂(t)] = [a₁₁ a₁₂; a₂₁ a₂₂] [x₁(t); x₂(t)] for t ∈ [0, T],

with initial conditions

[x₁(0); x₂(0)] = [x₁ᵢ; x₂ᵢ].

This might very well be a model of the typical predator-prey system. Here, obviously,
there is no input, and this system can be solved rather easily.
Let us bring in the input: suppose we are interested in harvesting these fish (both types)
at harvesting rates h₁, h₂, which are the inputs. In this case, the model describing the
evolution of the population becomes

d/dt [x₁(t); x₂(t)] = [a₁₁ a₁₂; a₂₁ a₂₂] [x₁(t); x₂(t)] − [h₁(t); h₂(t)]

for the same t ∈ [0, T], and the same initial conditions.


We might desire several things, that is, several different outputs. For instance, suppose
we want to harvest the species of fish over [0, T] in such a way that by time t = T (the
end of the fishing period) we are left with the desired final populations x₁f, x₂f starting
from x₁ᵢ, x₂ᵢ. Perhaps by then one of the species is nearing extinction, if that is what
we have chosen.

So we would like to know whether this is possible and, if so, how we might achieve it.
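
Before turning to the theory, here is a small Matlab sketch that simply simulates the
harvested model forward in time; the interaction coefficients, harvesting rates, horizon
and initial populations below are hypothetical values chosen purely for illustration.

% Toy simulation of the harvested two-species model (all numbers are made up)
a = [0.5 -0.4; 0.3 0.2];             % hypothetical interaction matrix [a11 a12; a21 a22]
h = @(t) [0.1; 0.05];                % hypothetical constant harvesting rates h1, h2
[t, X] = ode45(@(t, x) a*x - h(t), [0 10], [2; 1]);
plot(t, X);                          % population trajectories under harvesting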

Solution to dx/dt = Ax(t) + Bu(t)

In this part of the chapter we shall find the unique solution of the differential equation

(E.II.1) d/dt x(t) = Ax(t) + Bu(t), t ≥ 0, x(0) = x₀.

See the Appendix for the manipulation of exponentials of a square matrix.

Lemma
Suppose that A ∈ ℝ^(n×n). Then the first-order linear differential equation

(E.II.2) dx/dt = Ax(t), t ≥ 0, x(0) = x₀,

has the unique solution x(t) = e^(tA) x₀.

Proof
Set x₁(t) = e^(tA) x₀ and compute

d/dt x₁(t) = d/dt (e^(tA) x₀) = A e^(tA) x₀,

so t ↦ e^(tA) x₀ does indeed solve (E.II.2). Furthermore, x₁(0) = e^(0A) x₀ = I x₀ = x₀.
Finally, uniqueness requires the product rule. Suppose x is a solution to (E.II.2), and
consider the derivative of e^(−tA) x(t):

d/dt [e^(−tA) x(t)] = −A e^(−tA) x(t) + e^(−tA) A x(t).

However, A and e^(−tA) commute, as can be seen from the definition of the exponential of
a matrix, and hence the derivative of e^(−tA) x(t) is zero. This implies that e^(−tA) x(t)
is a constant column vector, say C, and x(t) = e^(tA) C. Because x(0) = x₀, we have that
x₀ = e^(0A) C = C, so it follows that x(t) = e^(tA) x₀. ∎

Theorem
Suppose that A ∈ ℝ^(n×n) and that B ∈ ℝ^(n×m).
If u is a continuous (ℝ^m-valued) function, then the differential equation (E.II.1) has
the unique solution x(⋅) on [0, +∞) given by

x(t) = e^(tA) x₀ + ∫₀ᵗ e^((t−τ)A) B u(τ) dτ.

Proof.
First we show that this is indeed a solution. Writing e^((t−τ)A) = e^(tA) e^(−τA), we have

d/dt [e^(tA) x₀ + ∫₀ᵗ e^((t−τ)A) B u(τ) dτ] = d/dt [e^(tA) x₀ + e^(tA) ∫₀ᵗ e^(−τA) B u(τ) dτ]

= A e^(tA) x₀ + A e^(tA) ∫₀ᵗ e^(−τA) B u(τ) dτ + e^(tA) e^(−tA) B u(t)

= A [e^(tA) x₀ + e^(tA) ∫₀ᵗ e^(−τA) B u(τ) dτ] + B u(t)

= A [e^(tA) x₀ + ∫₀ᵗ e^((t−τ)A) B u(τ) dτ] + B u(t);

hence this x(⋅) does indeed satisfy (E.II.1), and the initial condition follows by setting
t to zero:

e^(0A) x₀ + ∫₀⁰ e^((0−τ)A) B u(τ) dτ = I x₀ + 0 = x₀.

Uniqueness follows by considering two solutions x₁ and x₂. The difference
x := x₁ − x₂ satisfies the differential equation (E.II.2) with x(0) = 0, and so from the
previous Lemma we have that x(t) = 0 for all t ≥ 0, which means that x₁ = x₂. And this
ends the proof. ∎
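
To make the formula concrete, the following minimal Matlab sketch evaluates the
variation-of-constants formula by quadrature and compares it with a direct numerical
solution of (E.II.1); the matrices A and B, the input u and the horizon T are arbitrary
illustrative choices.

% Numerical check of the variation-of-constants formula (illustrative values)
A = [0 1; -2 -3];                    % an arbitrary A in R^(2x2)
B = [0; 1];                          % an arbitrary B in R^(2x1)
x0 = [1; 0];                         % initial condition
u = @(t) sin(t);                     % an arbitrary continuous input
T = 2; N = 2000; tau = linspace(0, T, N);
G = zeros(2, N);                     % columns: e^((T-tau)A) * B * u(tau)
for k = 1:N
    G(:,k) = expm((T - tau(k))*A) * B * u(tau(k));
end
x_formula = expm(T*A)*x0 + trapz(tau, G, 2);
[~, X] = ode45(@(t, x) A*x + B*u(t), [0 T], x0);   % direct solve of (E.II.1)
disp([x_formula, X(end,:)']);        % the two columns agree closely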

Chapter III: Controllability

As we said in the definition of linear systems, the interesting fact about
underdetermined equations is that we are allowed to choose the free variable in such a
way that we can produce a desired effect on the dependent variable.
Control theory seeks to do this in the differential equation

dx/dt = f(x(t), u(t)), x(0) = x₀, t ≥ 0.

We will restrict our attention to linear systems

dx/dt = Ax(t) + Bu(t).

The state variables x are the variables to be controlled, and they depend upon the free
variables u, the inputs. The main goal is then to answer the question: how do we choose
the control inputs to achieve regulation of the state variables?

We would like to define and characterize the property of controllability in terms of the
matrices A and B. To this end, we first define controllability and reachability of linear
systems.

Definition
The system

(E.III.1) dx/dt = Ax(t) + Bu(t)

for t ∈ [0, T] is said to be controllable at time T if for every pair of vectors x₀, x₁
in ℝ^n there exists a control u ∈ C[0, T] such that the solution x of (E.III.1) with
x(0) = x₀ satisfies x(T) = x₁.

The system (E.III.1) is referred to as the system (A, B).

Examples
Let A = 0 and B = 1, so that the system (A, B) becomes

dx/dt = u(t), 0 ≤ t ≤ T.

Given x₀, x₁ ∈ ℝ, define u ∈ C[0, T] to be the constant function

u(t) = (1/T)(x₁ − x₀), 0 ≤ t ≤ T.

Then, by the fundamental theorem of calculus, we can write

x(T) − x(0) = x(T) − x₀ = ∫₀ᵀ x′(τ) dτ = ∫₀ᵀ u(τ) dτ = (1/T)(x₁ − x₀)(T − 0) = x₁ − x₀,

and hence x(T) = x₁.

Let us now study a system which is not controllable.

Consider the system with A = [1 0; 0 1] and B = [1; 0], which gives:

dx₁/dt = x₁(t) + u(t)
dx₂/dt = x₂(t)

The second equation implies that x₂(t) = x₂(0)eᵗ, and so if x₂(0) > 0, then for all t ≥ 0
we have x₂(t) > 0. This in turn means that a final state with negative x₂-component is
never reachable by any control.

Let us define more precisely what reachability means.

Definition
The reachable space of (E.III.1) at time T, denoted by ℜT, is defined as the set of all
x ∈ ℝ^n for which there exists a control u ∈ C[0, T] such that

x = ∫₀ᵀ e^((T−τ)A) B u(τ) dτ.

Let us remark that this is exactly the state reached at time T if we run the differential
equation (E.III.1) with the input u and with initial condition x(0) = 0; thus ℜT is the
set of all points in the state space that are 'reachable' at time T starting from 0 by
means of some input u ∈ C[0, T].

Formally we have

ℜT = { x ∈ ℝ^n | ∃u ∈ C[0, T] s.t. x = ∫₀ᵀ e^((T−τ)A) B u(τ) dτ }.

Note that ℜT consists of states, not of controls.
First we show that a reachable space is a subspace of ℝ^n.

Lemma
The reachable space ℜT is a subspace of ℝ^n.

Proof
In order to prove this claim, we need to show that ℜT is
- nonempty,
- closed under addition,
- closed under scalar multiplication.
If we take u = 0, then

∫₀ᵀ e^((T−τ)A) B u(τ) dτ = 0,

and hence 0 ∈ ℜT.

Next, suppose x₁, x₂ ∈ ℜT. Then there exist u₁, u₂ ∈ C[0, T] such that

xᵢ = ∫₀ᵀ e^((T−τ)A) B uᵢ(τ) dτ for i = 1, 2.

Thus u := u₁ + u₂ ∈ C[0, T] and

∫₀ᵀ e^((T−τ)A) B u(τ) dτ = ∫₀ᵀ e^((T−τ)A) B u₁(τ) dτ + ∫₀ᵀ e^((T−τ)A) B u₂(τ) dτ = x₁ + x₂,

and hence x₁ + x₂ ∈ ℜT.

Finally, if x ∈ ℜT then there exists a u ∈ C[0, T] such that

x = ∫₀ᵀ e^((T−τ)A) B u(τ) dτ.

If α ∈ ℝ, then αu ∈ C[0, T] and

∫₀ᵀ e^((T−τ)A) B (αu)(τ) dτ = α ∫₀ᵀ e^((T−τ)A) B u(τ) dτ = αx,

thus αx ∈ ℜT. The proof is therefore finished. ∎

So far we have proved that ℜT ⊂ ℝ^n; now we want to give a necessary and sufficient
condition for ℜT to actually be ℝ^n. As we will see, this condition depends on the rank
of the matrix [B | AB | ⋯ | A^(n−1)B].

Let us first recall the definition of rank:

Definition
The rank of a matrix A, written rank(A), is equal to the maximum number of linearly
independent rows of A or, equivalently, the dimension of the row space of A.

Essentially, the rank counts the number of genuinely independent rows of the matrix A.

Theorem
The following propositions are equivalent:
(1) ℜT = ℝ^n,
(2) rank[B | AB | ⋯ | A^(n−1)B] = n.

Proof
Let us argue by contradiction to prove that (2) implies (1). To this end, suppose
ℜT ≠ ℝ^n. Since ℜT is then a proper subspace of ℝ^n, there exists an x₀ ≠ 0 in ℝ^n such
that for all r ∈ ℜT we have x₀ᵀ r = 0. This is equivalent to the following equation:

x₀ᵀ ∫₀ᵀ e^((T−τ)A) B u(τ) dτ = 0 for all u ∈ (C[0, T])^m.

In particular, it remains true if we choose u = u(τ) = Bᵀ e^((T−τ)Aᵀ) x₀, τ ∈ [0, T].
(As a precaution it is worth reminding the reader that the final time T and the transpose
ᵀ have very different meanings, and the notation can become a bit tricky.) This choice of
u belongs to (C[0, T])^m. Plugging it into the integral gives

0 = ∫₀ᵀ x₀ᵀ e^((T−τ)A) B Bᵀ e^((T−τ)Aᵀ) x₀ dτ = ∫₀ᵀ y(τ)ᵀ y(τ) dτ,

where

y(τ) = Bᵀ e^((T−τ)Aᵀ) x₀ = (y₁(τ), …, y_m(τ))ᵀ;

see the Appendix for the justification of moving the transpose into the exponent.
The product inside the integral can be written as

0 = ∫₀ᵀ Σ_{k=1}^{m} yₖ(τ)² dτ = Σ_{k=1}^{m} ∫₀ᵀ yₖ(τ)² dτ,

where the interchange of summation and integration is justified because the sum is
finite. However, each ∫₀ᵀ yₖ(τ)² dτ is greater than or equal to zero, so the sum is only
zero if each ∫₀ᵀ yₖ(τ)² dτ is actually zero, i.e.

∀k ∈ {1, …, m}: ∫₀ᵀ yₖ(τ)² dτ = 0;

consequently yₖ(τ)² = 0, i.e. yₖ(τ) = 0, for all τ ∈ [0, T] and all k ∈ {1, …, m}.

Hence ∀τ ∈ [0, T]: x₀ᵀ e^((T−τ)A) B = 0. Differentiating this last equation k times with
respect to τ gives x₀ᵀ Aᵏ e^((T−τ)A) B = 0 (up to the sign (−1)ᵏ), and setting τ = T we
obtain x₀ᵀ Aᵏ B = 0 for k = 0, 1, …, n − 1. This is the same as saying that
x₀ᵀ [B | AB | ⋯ | A^(n−1)B] = 0; but since x₀ ≠ 0, this forces
rank[B | AB | ⋯ | A^(n−1)B] < n, which is a contradiction, since by (2) this rank is
exactly n. Thus ℜT = ℝ^n.

Let us now prove that (1) implies (2). Suppose, for a contradiction, that
rank[B | AB | ⋯ | A^(n−1)B] < n. Then there exists a nonzero x₀ ∈ ℝ^n such that

x₀ᵀ [B | AB | ⋯ | A^(n−1)B] = 0.

However, every matrix is a zero of its characteristic polynomial (the Cayley-Hamilton
theorem, see the Appendix), so Aⁿ is a linear combination of I, A, …, A^(n−1), say
Aⁿ = α₀I + α₁A + ⋯ + α_{n−1}A^(n−1); thus

x₀ᵀ Aⁿ B = x₀ᵀ (α₀I + α₁A + ⋯ + α_{n−1}A^(n−1)) B = 0,

and by induction ∀k ≥ n: x₀ᵀ Aᵏ B = 0.
These last two displays give ∀k ≥ 0: x₀ᵀ Aᵏ B = 0.
Now,

0 = x₀ᵀ I B + x₀ᵀ A t B + (1/2!) x₀ᵀ A² t² B + ⋯ = Σ_{k=0}^{∞} (tᵏ/k!) x₀ᵀ Aᵏ B,

since each term in the sum is zero, and we continue from here:

0 = x₀ᵀ (Σ_{k=0}^{∞} (tᵏ/k!) Aᵏ) B = x₀ᵀ e^(tA) B.

However, this implies that x₀ ∉ ℜT unless x₀ = 0: if for some u ∈ (C[0, T])^m

x₀ = ∫₀ᵀ e^((T−τ)A) B u(τ) dτ,

then

x₀ᵀ x₀ = ∫₀ᵀ x₀ᵀ e^((T−τ)A) B u(τ) dτ = ∫₀ᵀ 0 ⋅ u(τ) dτ = 0,

which gives x₀ = 0, a contradiction. Hence ℜT ≠ ℝ^n whenever the rank is less than n;
equivalently, if ℜT = ℝ^n, the rank has to be exactly n. And this ends the proof. ∎

Schematically we have the following results:

ℜT := { x ∈ ℝ^n | ∃u ∈ C[0, T] s.t. x = ∫₀ᵀ e^((T−τ)A) B u(τ) dτ },

ℜT = ℝ^n ⇔ rank[B | AB | ⋯ | A^(n−1)B] = n, and hence

(A, B) is controllable ⇔ rank[B | AB | ⋯ | A^(n−1)B] = n,

where the last equivalence holds because, by the solution formula, steering x₀ to x₁ at
time T amounts to x₁ − e^(TA) x₀ ∈ ℜT, and this is possible for every pair x₀, x₁
exactly when ℜT = ℝ^n.

Combining the results of the Lemma and of the Theorem we obtain:

Corollary
The following propositions are equivalent:
(1) (A, B) is controllable,
(2) rank[B | AB | ⋯ | A^(n−1)B] = n, the dimension of the state space.

Before proceeding to the eigenvalue analysis of controllability, let us return to the
examples we considered at the beginning and see, using this new Corollary, why one of
them is controllable and the other is not.
In the first example we had n = 1 and

rank[B | AB | ⋯ | A^(n−1)B] = rank[B] = rank[1] = 1 = n,

which is indeed the dimension of ℝ, the state space.

In the second example, however, the state space was ℝ², so n = 2, and we have

rank[B | AB | ⋯ | A^(n−1)B] = rank[B | AB] = rank [1 1; 0 0] = 1 ≠ 2 = n.

Lemma
Let A ∈ ℝ^(n×n). Then the characteristic polynomial of A is the same as the
characteristic polynomial of T⁻¹AT, where T is any invertible matrix in ℝ^(n×n).

Proof
We have

det(λI − T⁻¹AT) = det(T⁻¹(λI − A)T)
                = det(T⁻¹) det(λI − A) det(T)
                = det(λI − A),

since det(T⁻¹) det(T) = 1, so the characteristic polynomials are indeed the same. ∎

Lemma
Let A be the companion matrix

A = [  0    1   ⋯    0
       ⋮         ⋱    ⋮
       0    0   ⋯    1
      −a₀  −a₁  ⋯  −a_{n−1} ].

Then its characteristic polynomial is

det(λI − A) = λⁿ + Σ_{k=0}^{n−1} aₖλᵏ.

Proof
Expand det(λIₙ − A) along the first column. The (1,1) cofactor is the determinant of a
matrix of the same form of size n − 1 (with a₁, …, a_{n−1} in the last row), while the
entry a₀ in position (n, 1) is multiplied by the cofactor (−1)^(n+1) det(M), where the
minor M is upper triangular with −1 in every diagonal position, so that
det(M) = (−1)^(n−1) and the whole term equals a₀. Iterating this expansion gives

det(λIₙ − A) = λ(λ(⋯(λ(λ + a_{n−1}) + a_{n−2})⋯) + a₁) + a₀
             = λⁿ + a_{n−1}λ^(n−1) + a_{n−2}λ^(n−2) + ⋯ + a₁λ + a₀
             = λⁿ + Σ_{k=0}^{n−1} aₖλᵏ. ∎

From now onwards, let A ∈ ℝ^(n×n) and B ∈ ℝ^(n×1).

Lemma
Let (A, B) be controllable and let A have characteristic polynomial
λⁿ + Σ_{k=0}^{n−1} aₖλᵏ. Then there exists an invertible T in ℝ^(n×n) such that

T⁻¹AT = [  0    1   ⋯    0
           ⋮         ⋱    ⋮
           0    0   ⋯    1
          −a₀  −a₁  ⋯  −a_{n−1} ]    and    T⁻¹B = (0, 0, …, 0, 1)ᵀ.

Proof
Because (A, B) is controllable we have rank[B | AB | ⋯ | A^(n−1)B] = n, so
Ω = {B, AB, …, A^(n−1)B} forms a basis of ℝ^n. With respect to this basis Ω,
the matrix of the linear transformation A is

Q = [ 0  ⋯  0  −a₀
      1  ⋯  0  −a₁
      ⋮      ⋱   ⋮
      0  ⋯  1  −a_{n−1} ].

Let the characteristic polynomial of A be given by

p(z) = zⁿ + a_{n−1}z^(n−1) + ⋯ + a₁z + a₀ = zⁿ + Σ_{k=0}^{n−1} aₖzᵏ,

and define the auxiliary polynomials

p₁(z) = z^(n−1) + a_{n−1}z^(n−2) + ⋯ + a₂z + a₁ = z^(n−1) + Σ_{k=0}^{n−2} a_{k+1}zᵏ,
p₂(z) = z^(n−2) + a_{n−1}z^(n−3) + ⋯ + a₃z + a₂ = z^(n−2) + Σ_{k=0}^{n−3} a_{k+2}zᵏ,
⋮
p_{n−1}(z) = z + a_{n−1},
pₙ(z) = 1,

the general polynomial being given by

pᵢ(z) = z^(n−i) + Σ_{k=1}^{n−i} a_{n−k} z^(n−i−k).

These auxiliary polynomials satisfy the recursion relation (for i ∈ {1, …, n}, with the
convention p₀ := p)

z pᵢ(z) = p_{i−1}(z) − a_{i−1} pₙ(z).

Now, also for i ∈ {1, …, n}, introduce the vectors eᵢ = pᵢ(A)B. Written for the matrix
A, the auxiliary polynomials give

pᵢ(A) = A^(n−i) + Σ_{k=1}^{n−i} a_{n−k} A^(n−i−k),

and, since p₀(A) = p(A) = 0 by the Cayley-Hamilton theorem, the recursion relation
becomes

A eᵢ = e_{i−1} − a_{i−1} eₙ, with e₀ = p(A)B = 0.

Each eᵢ is a linear combination of A^(n−i)B, …, AB, B with leading coefficient 1, so

span{B, AB, …, A^(n−1)B} = span{e₁, e₂, …, eₙ}.

Therefore {eᵢ | i ∈ {1, …, n}} is a basis for ℝ^n. By the recursion relation, in this
basis A has the matrix

[  0    1   ⋯    0
   ⋮         ⋱    ⋮
   0    0   ⋯    1
  −a₀  −a₁  ⋯  −a_{n−1} ].

Finally, since pₙ(z) = 1 we have eₙ = pₙ(A)B = B, so, taking T to be the matrix whose
columns are e₁, …, eₙ, the coordinate vector of B in this basis is

T⁻¹B = (0, 0, …, 0, 1)ᵀ. ∎

Next we prove an equivalence lemma which says that (A, B) is controllable if and only if
it is still controllable after a change of basis by an invertible matrix T.

Lemma
Let T be an invertible matrix in ℝ^(n×n). Then the following propositions are equivalent:
(1) (A, B) is controllable,
(2) (T⁻¹AT, T⁻¹B) is controllable.

Proof
Using the relation (T⁻¹AT)ᵏ = T⁻¹AᵏT, we have

[T⁻¹B | (T⁻¹AT)(T⁻¹B) | ⋯ | (T⁻¹AT)^(n−1)(T⁻¹B)] = T⁻¹ [B | AB | ⋯ | A^(n−1)B],

and multiplication by the invertible matrix T⁻¹ does not change the rank, so

rank[B | AB | ⋯ | A^(n−1)B] = n ⇔ rank[T⁻¹B | T⁻¹ATT⁻¹B | ⋯ | T⁻¹A^(n−1)TT⁻¹B] = n,

and the claim follows. ∎

Recall that a subspace V ⊂ ℝ^n is said to be A-invariant if AV ⊂ V, which is equivalent
to saying that if x ∈ V then Ax ∈ V. For the system dx/dt = Ax, this means that the whole
trajectory remains in V whenever the initial state belongs to V.

Lemma
The space ran[B | AB | ⋯ | A^(n−1)B] is the smallest A-invariant subspace containing
ran B.

Proof
Let us first prove that ran[B | AB | ⋯ | A^(n−1)B] is a subspace containing ran B.
Suppose that x ∈ ran B; this means that there exists u ∈ ℝ^m such that x = Bu, and this
can be seen by computing

[B | AB | ⋯ | A^(n−1)B] (u, 0, …, 0)ᵀ = Bu = x.

Next, let us show that ran[B | AB | ⋯ | A^(n−1)B] is an A-invariant subspace. To
accomplish this, let x ∈ ran[B | AB | ⋯ | A^(n−1)B], which means that

x = [B | AB | ⋯ | A^(n−1)B] (u₁, u₂, …, uₙ)ᵀ = Bu₁ + ABu₂ + ⋯ + A^(n−1)Buₙ = Σ_{k=0}^{n−1} Aᵏ B u_{k+1},

and operating with A on x we have

Ax = ABu₁ + A²Bu₂ + ⋯ + A^(n−1)Bu_{n−1} + AⁿBuₙ = Σ_{k=1}^{n} Aᵏ B uₖ.

If we let p denote the characteristic polynomial of A, then by Cayley-Hamilton we have
p(A) = 0. So, if

p(z) = zⁿ + a_{n−1}z^(n−1) + ⋯ + a₁z + a₀ = zⁿ + Σ_{k=0}^{n−1} aₖzᵏ,

then

Aⁿ = −a_{n−1}A^(n−1) − ⋯ − a₁A − a₀I = −Σ_{k=0}^{n−1} aₖAᵏ.

If we now substitute this expression for Aⁿ in the equation for Ax, we have

Ax = ABu₁ + A²Bu₂ + ⋯ + A^(n−1)Bu_{n−1} + (−a_{n−1}A^(n−1) − ⋯ − a₁A − a₀I)Buₙ
   = B(−a₀uₙ) + AB(u₁ − a₁uₙ) + A²B(u₂ − a₂uₙ) + ⋯ + A^(n−1)B(u_{n−1} − a_{n−1}uₙ)
   = [B | AB | ⋯ | A^(n−1)B] (−a₀uₙ, u₁ − a₁uₙ, …, u_{n−1} − a_{n−1}uₙ)ᵀ
   ∈ ran[B | AB | ⋯ | A^(n−1)B],

so ran[B | AB | ⋯ | A^(n−1)B] is indeed A-invariant.


Finally, we need to prove that this subspace is the smallest of all subspaces with these
two properties. In order to do so, let V be a subspace of ℝ^n such that
- ran B ⊂ V,
- AV ⊂ V;
the claim is then that ran[B | AB | ⋯ | A^(n−1)B] ⊂ V. Let us proceed as in the previous
parts of the proof. Let x ∈ ran[B | AB | ⋯ | A^(n−1)B]; then

x = [B | AB | ⋯ | A^(n−1)B] (u₁, u₂, …, uₙ)ᵀ = Σ_{k=1}^{n} A^(k−1) B uₖ.

Since Buₗ ∈ ran B for all ℓ ∈ {1, …, n}, and since ran B ⊂ V, it follows that Buₗ ∈ V for
all ℓ in the set {1, …, n}. And, as we have assumed, V is A-invariant, so
∀k ≥ 0: Aᵏ Buₗ ∈ V. As V is a subspace, it follows that

x = Bu₁ + ⋯ + A^(n−1)Buₙ = Σ_{k=1}^{n} A^(k−1) B uₖ ∈ V.

This ends the proof of the lemma. ∎

Hautus test for controllability

The following propositions are equivalent:
(1) (A, B) is controllable,
(2) ∀λ ∈ ℂ: rank[B | λI − A] = n.

Proof
Let us first prove that (1) implies (2). To this end, let (A, B) be controllable, and
consider two cases for λ. If λ is not an eigenvalue of A, then λI − A is invertible and
we have rank[B | λI − A] = n at once. If λ is an eigenvalue of A, suppose for a
contradiction that rank[B | λI − A] < n. Then there exists v ≠ 0 such that
vᵀ[B | λI − A] = 0, that is, vᵀB = 0 and vᵀA = λvᵀ. But then

vᵀ[B | AB | ⋯ | A^(n−1)B] = [vᵀB | λvᵀB | ⋯ | λ^(n−1)vᵀB] = 0,

a contradiction to (A, B) being controllable.
To show that (2) implies (1), suppose that (A, B) is not controllable. Choose a basis for
ran[B | AB | ⋯ | A^(n−1)B] and extend it to a basis of ℝ^n. Since this subspace is
A-invariant and contains ran B, in this new basis (that is, for a suitable invertible
T ∈ ℝ^(n×n)) we have

T⁻¹AT = [A₁₁ A₁₂; 0 A₂₂], T⁻¹B = [B₁; 0],

where the block A₂₂ is nonempty because the subspace is proper. Now let λ₂ be an
eigenvalue of A₂₂ and let x₂ ≠ 0 be such that x₂ᵀA₂₂ = λ₂x₂ᵀ. Setting vᵀ = (0 | x₂ᵀ)T⁻¹,
we get vᵀB = 0 and vᵀA = λ₂vᵀ, hence vᵀ[B | λ₂I − A] = 0 with v ≠ 0, so that
rank[B | λ₂I − A] < n and (2) fails. This proves the equivalence. ∎
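
Computationally, the Hautus test only needs to be checked at the eigenvalues of A, since
for any other λ the block λI − A is already invertible. A small Matlab sketch, using the
matrices of the earlier non-controllable example:

% Hautus test sketch: check rank [B | lambda*I - A] at each eigenvalue of A
A = [1 0; 0 1]; B = [1; 0]; n = size(A, 1);
ev = eig(A);
controllable = true;
for k = 1:numel(ev)
    if rank([B, ev(k)*eye(n) - A]) < n
        controllable = false;
    end
end
disp(controllable)   % displays 0 (false): this pair is not controllable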

Now we define the state feedback, which expresses the control u in terms of the state
variable x by means of an appropriate matrix. It is so called because it feeds u back
into the system in a closed-loop manner.

State Feedback
Let F ∈ ℝ^(m×n); the state feedback law is u(t) = Fx(t).

Theorem
If (A, B) is controllable then (A + BF, B) is still controllable.

Proof
For every λ we have

rank[B | λI − A − BF] = rank( [B | λI − A] [I  −F; 0  I] ) = rank[B | λI − A],

since the block matrix [I −F; 0 I] is invertible, and the claim follows from the Hautus
test. ∎

Chapter IV: The eigenvalue placement theorem

Let us summarise the linear system (A, B):

dx/dt = Ax(t) + Bu(t), t ≥ 0,

with

x ∈ ℝ^n, u ∈ ℝ^m, A ∈ ℝ^(n×n), B ∈ ℝ^(n×m),

and the state feedback defined as

u(t) = Fx(t), F ∈ ℝ^(m×n).

The state feedback has been defined in such a way that we can create a closed-loop
system, that is, a system that is fed u(t) in terms of x(t).

Closed loop system

dx/dt = Ax(t) + Bu(t) = Ax(t) + BFx(t) = (A + BF)x(t).

Figure IV.1: A Closed Loop System by State Feedback

A sensible question to ask now is the following: what are the conditions on A and B such
that for every monic, real polynomial p of degree n there exists F ∈ ℝ^(m×n) such that
the characteristic polynomial of A + BF is p?

In order to answer this question we first state and prove some technical lemmas which
will eventually lead to a major proposition called the Pole Placement Theorem.

Lemma
If (A, B) is controllable and b ∈ ran B with b ≠ 0, then there exist u₁, …, u_{n−1} such
that the sequence defined by

x₁ := b,
x_{k+1} := Axₖ + Buₖ, k ∈ {1, …, n − 1},

is linearly independent.

Proof
We proceed by steps. First, x₁ ≠ 0, and therefore {x₁} is independent. Now suppose we
have constructed the vectors xᵢ for i = 1, …, k with k < n, and let ϑ = span{x₁, …, xₖ},
that is, the linear space generated by {x₁, …, xₖ}. We need to choose uₖ such that
x_{k+1} = Axₖ + Buₖ ∉ ϑ. Should this not be possible, then ∀u: Axₖ + Bu ∈ ϑ (*). In
particular, choosing u = 0 we have Axₖ ∈ ϑ (**). Combining (*) and (**) we get
∀u: Bu ∈ ϑ, or in other words ran B ⊂ ϑ. Furthermore, for all i < k we have
Axᵢ = x_{i+1} − Buᵢ ∈ ϑ, since both x_{i+1} ∈ ϑ and Buᵢ ∈ ϑ.
This now gives Axᵢ ∈ ϑ for all i ∈ {1, …, k}, and so it follows that Aϑ ⊂ ϑ. Thus ϑ is an
A-invariant subspace containing ran B; by the Lemma of Chapter III and the
controllability of (A, B), the smallest such subspace is ℝ^n, so ϑ = ℝ^n and hence k = n.
In other words, as long as k < n a suitable uₖ exists, and the construction proceeds up
to an independent set x₁, …, xₙ. ∎

Lemma
If (A, B) is controllable and b ∈ ran B with b ≠ 0, then there exists F such that
(A + BF, b) is controllable.

Proof
This follows from the previous lemma: with x₁, …, xₙ as constructed there, define F by
Fxₖ = uₖ for k ∈ {1, …, n − 1} and Fxₙ = 0. Then (A + BF)xₖ = Axₖ + Buₖ = x_{k+1} for
k < n, so (A + BF)^(k−1) b = xₖ for k ∈ {1, …, n}, and since the xₖ are independent,
(A + BF, b) is controllable. ∎

Note: we could have used any other arbitrary vector in ℝ^n to define Fxₙ.

We are now in a position to give a necessary and sufficient condition for the system
(A, B) to be controllable in terms of characteristic polynomials. This result is
called the:

Eigenvalue Placement Theorem (also known as the Pole Placement Theorem)

The following propositions are equivalent:
(1) (A, B) is controllable;
(2) for every monic polynomial p with real coefficients of degree n, there exists
F ∈ ℝ^(m×n) such that the characteristic polynomial of A + BF is p.

Proof
Let us prove the 'only if' part first ((2) ⇒ (1), in contrapositive form). Suppose that
(A, B) is not controllable; we want to exhibit a monic polynomial p that is not the
characteristic polynomial of A + BF for any F.
If (A, B) is not controllable, then by the Hautus test there exist λ ∈ ℂ and v ≠ 0 such
that vᵀA = λvᵀ and vᵀB = 0. Thus ∀F ∈ ℝ^(m×n): vᵀ(A + BF) = λvᵀ, which is the same as
saying that for every F ∈ ℝ^(m×n), λ is an eigenvalue of A + BF.
Having this result, we now take any monic polynomial p of degree n with real coefficients
such that p(λ) ≠ 0. Since λ is always an eigenvalue of A + BF, the characteristic
polynomial of A + BF is never equal to p, which is the conclusion we were looking for.
[Note: take for instance the natural choice p(x) = xⁿ if λ ≠ 0, and p(x) = xⁿ + 1 if
λ = 0.]
The proof of the 'if' part ((1) ⇒ (2)) is considerably longer and needs to be divided
into two cases: first the case m = 1, and then the general case of arbitrary m.
For m = 1, write b for the single column of B and let f = [f₁ ⋯ fₙ] denote a row vector.
By the earlier lemma, find T such that

T⁻¹AT = [  0    1   ⋯    0
           ⋮         ⋱    ⋮
           0    0   ⋯    1
          −a₀  −a₁  ⋯  −a_{n−1} ]    and    T⁻¹b = (0, 0, …, 0, 1)ᵀ.

Now choose f such that

T⁻¹AT + (T⁻¹b)f = [  0     1    ⋯    0
                     ⋮          ⋱    ⋮
                     0     0    ⋯    1
                    −a₀′  −a₁′  ⋯  −a_{n−1}′ ]

(this can indeed be accomplished: since T⁻¹b = (0, …, 0, 1)ᵀ, the product (T⁻¹b)f simply
adds the row f to the last row, so we take −a₀ + f₁ = −a₀′, …, −a_{n−1} + fₙ = −a_{n−1}′),
where

λⁿ + Σ_{k=0}^{n−1} aₖ′λᵏ

is the desired characteristic polynomial.

It just remains to realize that

T⁻¹(A + bfT⁻¹)T = T⁻¹AT + (T⁻¹b)f,

and so A + bfT⁻¹, being similar to the companion matrix above, has characteristic
polynomial λⁿ + Σ_{k=0}^{n−1} aₖ′λᵏ (by the two earlier lemmas on similar matrices and on
companion matrices). The feedback F := fT⁻¹ therefore does the job.

The general case is handled as follows. Suppose (A, B) is controllable. Pick b ∈ ran B
with b ≠ 0; then by the previous lemma there exists F₀ such that (A + BF₀, b) is
controllable. By the single-input case there exists a row vector f′ such that the
characteristic polynomial of A + BF₀ + bf′ is the desired polynomial. Moreover, there
exists u such that b = Bu (because b ∈ ran B).
Finally, define F = F₀ + uf′; then A + BF = A + BF₀ + Buf′ = A + BF₀ + bf′, so the
characteristic polynomial of A + BF is the desired polynomial. ∎

In the following chapter we will work through a numerical example using this theorem
(about the linearized motion of a satellite, where the inputs are the thrust in the
radial direction and the thrust in the tangential direction).
We will also give code which was successfully implemented in Matlab.

Chapter V: Algorithm and Program

The following is Matlab code for the eigenvalue placement theorem, commented and
described.

% The program is Pole Assignment.
% Given (A, B) and poles p, this script will find out if the given pair is
% controllable and, if so, it will relocate the poles. The outputs are the
% relocated poles and the feedback matrix z.

% Checking for controllability

n=size(A,1);   % A is n-by-n
C=[];
for i=1:n,
C=[C A^(i-1)*B];
end
if rank(C)==n,
disp('(A,B) is controllable')
else
disp('(A,B) is uncontrollable'), return
end
[n,m]=size(B);

r=n-1;
while r~=n,
% form a random matrix of size m x n
F=rand(m,n);
Ao=A+B*F;
% find the eigenvalues of Ao
x=eig(Ao);
% check whether they are distinct: the Vandermonde matrix V built below
% has full rank exactly when the eigenvalues are distinct

V=[];
for i=1:n,
v=[];
for j=1:n,
v=[v x(i)^(j-1)];
end
V=[V;v];
end
r=rank(V);
end

r=n-1;
while r~=n,
% form a random matrix u of size m x 1
u=rand(m,1);
b=B*u;
% check for controllability
C=[];
for i=1:n,
C=[C Ao^(i-1)*b];
end
r=rank(C);
end

% given poles p(1), ..., p(n), find the coefficients of the desired
% characteristic polynomial


x=[1 -p(1)];
for i=2:n,
x = conv(x, [1 -p(i)]);
end

for i=1:n,
d(i)=x(i+1);
end

% find the characteristic polynomial of Ao
x=eig(Ao);
y=[1 -x(1)];
for i=2:n,
y=conv(y,[1 -x(i)]);
end
for i=1:n,
ao(i)=y(i+1);
end

% Transformation matrix
P=[zeros(1,n); eye(n-1) zeros(n-1,1)];
x=[1];
for i=1:(n-1),
x=[ao(i) x];
end
U=x;
t=x;
for i=1:(n-1),
t=t*P;
U=[U;t];
end

T=C*U;
fo=[];
for i=1:n,
fo=[fo (d(n+1-i)-ao(n+1-i)) ];
end
fo=fo*inv(T);
x=eig(Ao-b*fo);
disp('poles for controllable pair have been relocated at');
disp(x);
disp('feedback matrix is');
z=(F-u*fo);

disp(z);

Now, consider the system

d/dt [x₁(t); x₂(t); x₃(t); x₄(t)] = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 −2 0 0] [x₁(t); x₂(t); x₃(t); x₄(t)]
                                   + [0 0; 1 0; 0 0; 0 1] [u₁(t); u₂(t)].

This system is a linearized equation of motion of a satellite, where the input u₁ is the
thrust in the radial direction and u₂ is the thrust in the tangential direction.

We input this into the program Controllability.m:

A = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 −2 0 0],

B = [0 0; 1 0; 0 0; 0 1],

and we define the poles p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 4, which are the
eigenvalues we wish the closed-loop system to have. We also try the eigenvalues
p(1) = 1, p(2) = −1, p(3) = 5, and p(4) = −2.

The output of the program is as follows

>> A = [0 1 0 0 ; 3 0 0 2 ; 0 0 0 1 ; 0 -2 0 0]

A=

0 1 0 0
3 0 0 2
0 0 0 1
0 -2 0 0

>> B = [0 0 ; 1 0 ; 0 0 ; 0 1]

B=

0 0
1 0

0 0
0 1

>> p(1)=1

p=

1
>> p(2)=2

p=

1 2

>> p(3)=3

p=

1 2 3

>> p(4)=4

p=

1 2 3 4

>> controllability
(A,B) is controllable
poles for controllable pair have been relocated at
4.0000
1.0000
3.0000
2.0000

feedback matrix is
118.8359 -24.5533 -5.2740 64.2453
64.0536 -13.1355 -2.5757 34.5533

>>

>> A = [0 1 0 0 ; 3 0 0 2 ; 0 0 0 1 ; 0 -2 0 0]

A=

0 1 0 0
3 0 0 2
0 0 0 1
0 -2 0 0

>> B = [0 0 ; 1 0 ; 0 0 ; 0 1]

B=

0 0
1 0
0 0
0 1

>> p(1) = 1

p=

1
>> p(2) = -1

p=

1 -1

>> p(3) = 5

p=

1 -1 5

>> p(4) = -2

p=

1 -1 5 -2

>> controllability
(A,B) is controllable
poles for controllable pair have been relocated at
5.0000
-2.0000
-1.0000
1.0000

feedback matrix is
3.6791 3.4798 4.2638 -0.4639
1.7086 2.0414 2.5879 -0.4798

>>

Therefore, with the first set of eigenvalues the matrix F is

F = [ 118.8359  −24.5533  −5.2740  64.2453
       64.0536  −13.1355  −2.5757  34.5533 ],

and for the second set we have

F = [ 3.6791  3.4798  4.2638  −0.4639
      1.7086  2.0414  2.5879  −0.4798 ].

Since F was defined such that u(t) = Fx(t), we have our control u for both cases, with
the eigenvalues we wanted.
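
As a cross-check (not part of the program above), Matlab's Control System Toolbox
provides a built-in pole-placement routine. Note that place returns K such that
eig(A − B*K) = p, so in our convention u = Fx with closed loop A + BF we must take
F = −K; the resulting F generally differs from the matrices above, since for m > 1 the
feedback achieving given poles is not unique.

% Cross-check with the built-in routine (requires the Control System Toolbox)
A = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 -2 0 0];
B = [0 0; 1 0; 0 0; 0 1];
p = [1 2 3 4];            % desired closed-loop eigenvalues
K = place(A, B, p);       % K is such that eig(A - B*K) = p
F = -K;                   % our convention: closed loop is A + B*F
disp(eig(A + B*F));       % displays 1, 2, 3, 4 (up to rounding)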

Chapter VI: exp(A), where A ∈ ℝ^(n×n), and the Cayley-Hamilton theorem

The objective of this chapter is to set out the tools required in the process of
constructing a consistent control theory. The aim is to understand these tools now,
instead of having to digress during the theory, which would take our attention away.
The only knowledge required is single-variable calculus and some linear algebra: namely
vector spaces and matrix theory, in particular expansion of determinants,
diagonalisation of matrices, eigenvalues and eigenvectors. The main objective is not to
go into the details of linear algebra, but to explain the exponential of a matrix and
the Cayley-Hamilton theorem.

The (i, j)-entry of a generic matrix A will be denoted by Aij, or by aij once the
cofactors and minors are introduced.

Definition A.1
Let ‖⋅‖ denote the following norm:

‖A‖ = max{ |Aij| : 1 ≤ i, j ≤ n }.

Clearly we have

(*) |Aij| ≤ ‖A‖

for all i and j.

Although there are several possible choices for a norm, this one is convenient because it
allows a relatively straightforward proof of the convergence of the exponential of a
matrix.

Lemma A.2
Suppose that A and B are two real n-square matrices; then ‖AB‖ ≤ n‖A‖‖B‖.

Proof
This follows by using the inequality pointed out previously. We estimate the absolute
value of the general entry of the matrix AB, denoted (AB)ij:

|(AB)ij| = |Σ_{k=1}^{n} Aik Bkj|
         ≤ Σ_{k=1}^{n} |Aik Bkj|       (by the triangle inequality)
         = Σ_{k=1}^{n} |Aik| |Bkj|
         ≤ Σ_{k=1}^{n} ‖A‖‖B‖          (by the inequality (*))
         = n‖A‖‖B‖.

Since this bound holds for every entry of AB, taking the maximum over i and j gives
‖AB‖ ≤ n‖A‖‖B‖. And this ends the proof. ∎

Lemma A.3
Suppose that A is a real n-square matrix; then ‖Aᵏ‖ ≤ n^(k−1)‖A‖ᵏ.

Proof (by induction)
Suppose the statement above is true for some integer k; then

‖A^(k+1)‖ = ‖Aᵏ A‖
          ≤ n‖Aᵏ‖‖A‖            (applying Lemma A.2 to Aᵏ and A)
          ≤ n·n^(k−1)‖A‖ᵏ‖A‖     (by the induction hypothesis)
          = nᵏ‖A‖^(k+1).

It remains to exhibit one integer for which the statement is true. This can be done by
setting B = A in Lemma A.2:

‖A²‖ = ‖AA‖ ≤ n‖A‖‖A‖ = n^(2−1)‖A‖²

(and the case k = 1 is the trivial bound ‖A‖ ≤ n⁰‖A‖). Hence the result follows by
induction. ∎

Lemma A.4
Suppose A is a square matrix and I is the identity matrix; then the series

e^A = Σ_{k=0}^{∞} (1/k!) Aᵏ, with A⁰ = I,

is well-defined and always converges.

Proof
By the convention of our notation, the (i, j)-entry of e^A is

(e^A)ij = Σ_{k=0}^{∞} (1/k!) (Aᵏ)ij, with (A⁰)ij = Iij.

To show that the exponential of a matrix converges, we show that the series

Iij + Aij + (1/2!)(A²)ij + ⋯ = Σ_{k=0}^{∞} (1/k!) (Aᵏ)ij

is absolutely convergent, and hence convergent. To this end, let a = n‖A‖. Then
|Iij| ≤ ‖I‖ = 1 for k = 0, and (1/k!)|(Aᵏ)ij| ≤ (1/k!)‖Aᵏ‖ ≤ (1/k!) n^(k−1)‖A‖ᵏ for
k ≥ 1 (by Lemma A.3). Hence

|Iij| + |Aij| + (1/2!)|(A²)ij| + (1/3!)|(A³)ij| + ⋯
  ≤ 1 + ‖A‖ + (1/2!) n‖A‖² + (1/3!) n²‖A‖³ + ⋯
  = 1 + (1/n)(a + (1/2!)a² + (1/3!)a³ + ⋯)
  = 1 + (eᵃ − 1)/n.

The same in abbreviated notation:

Σ_{k=0}^{∞} (1/k!)|(Aᵏ)ij| ≤ 1 + Σ_{k=1}^{∞} (1/k!) n^(k−1)‖A‖ᵏ = 1 + (1/n) Σ_{k=1}^{∞} (1/k!) aᵏ = 1 + (eᵃ − 1)/n.

And the result follows since the series converges absolutely. ∎
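
The definition can be tried out directly in Matlab by truncating the series. This is
purely an illustration of the convergence just proved; Matlab's expm uses a more robust
scaling-and-squaring algorithm, not this naive sum.

% Naive truncated series for e^A, compared with expm (illustration only)
A = [0 1; -1 0];
S = eye(2); term = eye(2);
for k = 1:25
    term = term * A / k;    % term now equals A^k / k!
    S = S + term;
end
disp(norm(S - expm(A)))     % tiny: the truncated series approaches expm(A)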

Lemma A.5
The series

Σ_k Σ_{r+s=k} (1/(r!s!)) (AʳBˢ)ij

converges absolutely for all i and j.

Proof
As in the proof above, let a = n‖A‖ and b = n‖B‖, and estimate the double sum:

Σ_k Σ_{r+s=k} (1/(r!s!)) |(AʳBˢ)ij|
  ≤ Σ_k Σ_{r+s=k} (1/(r!s!)) ‖AʳBˢ‖                          (by the (*) inequality)
  ≤ Σ_k Σ_{r+s=k} (1/(r!s!)) n ‖Aʳ‖‖Bˢ‖                      (by Lemma A.2)
  ≤ Σ_k Σ_{r+s=k} (1/(r!s!)) n (n^(r−1)‖A‖ʳ)(n^(s−1)‖B‖ˢ)     (by Lemma A.3)
  = Σ_k Σ_{r+s=k} (1/(r!s!)) n^(r+s−1) ‖A‖ʳ‖B‖ˢ
  = Σ_k Σ_{r+s=k} (1/(r!s!)) (1/n) aʳbˢ
  ≤ Σ_k Σ_{r+s=k} (1/(r!s!)) aʳbˢ
  = e^(a+b) < ∞

(for the single term with r = s = 0 one uses the direct bound |Iij| ≤ 1 = a⁰b⁰). ∎

Lemma A.6
Suppose that A and B are two real n-square matrices that commute, i.e. AB = BA. Then
e^(A+B) = e^A e^B.

Proof
Formally we have

e^(A+B) = Σ_{k=0}^{∞} (1/k!)(A + B)ᵏ and e^A e^B = (Σ_{k=0}^{∞} (1/k!) Aᵏ)(Σ_{k=0}^{∞} (1/k!) Bᵏ).

The terms of degree k in these expansions are respectively

(1/k!)(A + B)ᵏ = (1/k!) Σ_{r+s=k} (k choose r) AʳBˢ and Σ_{r+s=k} (1/(r!s!)) AʳBˢ,

where the binomial expansion of (A + B)ᵏ is valid precisely because A and B commute.
Because of the binomial coefficients we have

(1/k!)(k choose r) = (1/k!) ⋅ k!/(r!(k − r)!) = 1/(r!(k − r)!) = 1/(r!s!),

and hence both terms are equal for all k and all r and s such that k = r + s.
Define the partial sums

Sₙ(A) := Σ_{k=0}^{n} (1/k!) Aᵏ.

Then

Sₙ(A) Sₙ(B) = (Σ_{r=0}^{n} (1/r!) Aʳ)(Σ_{s=0}^{n} (1/s!) Bˢ) = Σ_{r=0}^{n} Σ_{s=0}^{n} (1/(r!s!)) AʳBˢ,

but similarly

Sₙ(A + B) = Σ_{k=0}^{n} (1/k!) Σ_{r+s=k} (k choose r) AʳBˢ = Σ_{k=0}^{n} Σ_{r+s=k} (1/(r!s!)) AʳBˢ.

When we compare terms, we realize that the expansion of the partial sum Sₙ(A + B)
consists of exactly those terms of Sₙ(A) Sₙ(B) with r + s ≤ n.

The subtlety here is showing that the sum of the remaining terms (the ones with
r + s > n) tends to zero as n grows without bound.
By the estimate of Lemma A.5, the (i, j)-entry of Sₖ(A) Sₖ(B) − Sₖ(A + B) satisfies

|(Sₖ(A) Sₖ(B) − Sₖ(A + B))ij| ≤ Σ_{r+s>k} (1/(r!s!)) |(AʳBˢ)ij|,

and this tail of an absolutely convergent series tends to zero as k grows. But we also
have

Sₖ(A) Sₖ(B) − Sₖ(A + B) → e^A e^B − e^(A+B),

hence e^A e^B − e^(A+B) = 0, which proves the lemma. ∎
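
The hypothesis AB = BA is essential, as a quick Matlab experiment suggests; the two
pairs of matrices below are arbitrary test cases.

% e^(A+B) = e^A e^B requires AB = BA: one commuting and one non-commuting pair
A = [1 2; 0 1]; B = [3 4; 0 3];             % these commute: A*B - B*A is zero
disp(norm(expm(A+B) - expm(A)*expm(B)))     % essentially zero
C = [0 1; 0 0]; D = [0 0; 1 0];             % these do not commute
disp(norm(expm(C+D) - expm(C)*expm(D)))     % clearly nonzero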

Lemma A.7
Suppose I is the identity matrix; then lim_{h→0} (1/h)(e^(hA) − I) = A.

Proof
Expanding the exponential (a valid operation by Lemma A.4) gives

(1/h)(e^(hA) − I) − A = Σ_{k=2}^{∞} (1/k!) h^(k−1) Aᵏ.

Using the substitution a = n|h|‖A‖, estimate the series entrywise:

|(Σ_{k=2}^{∞} (1/k!) h^(k−1) Aᵏ)ij|
  ≤ Σ_{k=2}^{∞} (1/k!) |h|^(k−1) |(Aᵏ)ij|     (by the triangle inequality)
  ≤ Σ_{k=2}^{∞} (1/k!) |h|^(k−1) ‖Aᵏ‖         (by the (*) inequality)
  ≤ Σ_{k=2}^{∞} (1/k!) |h|^(k−1) n^(k−1) ‖A‖ᵏ  (by Lemma A.3)
  = (1/(n|h|)) Σ_{k=2}^{∞} (1/k!) aᵏ
  = (‖A‖/a)(eᵃ − 1 − a)
  = ‖A‖((eᵃ − 1)/a − 1).

At this point it is important to note that a → 0 as h → 0. The key now relies on real
analysis, which tells us that the derivative of the exponential function is the
exponential function itself:

lim_{a→0} (eᵃ − 1)/a = (d/da eᵃ)|_{a=0} = e⁰ = 1

(by the definition of the derivative, or by L'Hôpital's rule), so the bound above tends
to zero with h. ∎

Lemma A.8
The function e^(tA) is a differentiable matrix-valued function of t, and its derivative
is A e^(tA).

Proof
Apply Newton's quotient definition of the derivative:

d/dt e^(tA) = lim_{h→0} (1/h)(e^((t+h)A) − e^(tA)).

Because tA and hA commute (both t and h are scalars), Lemma A.6 gives

(1/h)(e^((t+h)A) − e^(tA)) = ((1/h)(e^(hA) − I)) e^(tA),

and by Lemma A.7 the term in parentheses tends to A as h goes to zero, which proves the
result. ∎

The exponential of a matrix, although well-defined and convergent, requires further
explanation; in particular, it requires an intelligent way of computing it, and
certainly exponentiating the entries of A is not a good way of doing so.

Nonetheless, there is a straightforward case which can be computed easily, namely when
the matrix A is diagonal, with diagonal entries λᵢ.

Inspection of the series shows that e^A is also diagonal in this case, and that its
diagonal entries are e^(λᵢ). This can easily be seen from the definition: if A is a
diagonal matrix with entries λᵢ, then each term in the expansion of e^A is a diagonal
matrix with entries λᵢᵏ/k!, and hence e^A is a diagonal matrix (see the example later
on).

Definition A.9
A matrix A is diagonalizable if there exists an invertible matrix P such that
D := P⁻¹AP is a diagonal matrix.

The following two lemmas are part of the details of linear algebra which we have chosen
to omit, so the proofs will not be given.

Lemma A.10
If A is diagonalizable, then A = PDP⁻¹.

Lemma A.11
If A is diagonalizable, then Aᵏ = (PDP⁻¹)ᵏ = PDᵏP⁻¹.

Lemma A.12
If A is diagonalizable, then

e^A = P diag(e^(λ₁), …, e^(λₙ)) P⁻¹,

where λ₁, …, λₙ are the eigenvalues of A (the diagonal entries of D).

Proof
One has

e^A = Σ_{k=0}^{∞} (1/k!) Aᵏ, with A⁰ = I,
    = I + Σ_{k=1}^{∞} (1/k!) (PDP⁻¹)ᵏ
    = PIP⁻¹ + Σ_{k=1}^{∞} (1/k!) PDᵏP⁻¹
    = P (I + Σ_{k=1}^{∞} (1/k!) Dᵏ) P⁻¹
    = P e^D P⁻¹
    = P diag(e^(λ₁), …, e^(λₙ)) P⁻¹. ∎
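
This gives a practical way of computing e^A whenever A happens to be diagonalizable; a
short Matlab sketch (the test matrix is an arbitrary symmetric, hence diagonalizable,
matrix):

% Diagonalisation route for e^A
A = [2 1; 1 2];
[P, D] = eig(A);                     % A = P*D*inv(P)
E = P * diag(exp(diag(D))) / P;      % P * e^D * P^(-1)
disp(norm(E - expm(A)))              % essentially zero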
 

Example
Compute e^A for A = [0 −θ; θ 0].

Solution
The key step is to realize that A² = −θ²I, whence for every k ≥ 0

A^(2k) = (−1)ᵏ θ^(2k) I and A^(2k+1) = (−1)ᵏ θ^(2k) A.

Therefore

e^A = Σ_{k=0}^{∞} (1/k!) Aᵏ = I + A + (1/2!)A² + (1/3!)A³ + ⋯

    = [1 0; 0 1] + [0 −θ; θ 0] + (1/2!)[−θ² 0; 0 −θ²] + (1/3!)[0 θ³; −θ³ 0] + ⋯

    = [ 1 − θ²/2! + θ⁴/4! − ⋯      −θ + θ³/3! − θ⁵/5! + ⋯
        θ − θ³/3! + θ⁵/5! − ⋯       1 − θ²/2! + θ⁴/4! − ⋯ ]

    = [ Σ (−1)ᵏθ^(2k)/(2k)!        −Σ (−1)ᵏθ^(2k+1)/(2k+1)!
        Σ (−1)ᵏθ^(2k+1)/(2k+1)!     Σ (−1)ᵏθ^(2k)/(2k)! ]

    = [ cos θ   −sin θ
        sin θ    cos θ ].

Example (diagonal 2 × 2 real matrix)
Compute e^A for A = [λ₁ 0; 0 λ₂].

Solution
We already know the answer to this one: Aⁿ = [λ₁ⁿ 0; 0 λ₂ⁿ], and hence

e^A = Σ_{n=0}^{∞} (1/n!) Aⁿ = Σ_{n=0}^{∞} (1/n!) [λ₁ⁿ 0; 0 λ₂ⁿ] = [e^(λ₁) 0; 0 e^(λ₂)].

Example
Compute e^A for A = [0 1; 0 0].

Solution
The crucial step is detecting that this is a nilpotent matrix, i.e. A² = 0, and hence
Aᵏ = 0 for all k ≥ 2, so we only need to sum a finite number of terms:

e^A = I + A = [1 0; 0 1] + [0 1; 0 0] = [1 1; 0 1].

Definition A.13
Suppose A is an n-square matrix. Let Mij denote the (n − 1)-square submatrix of A
obtained by deleting its ith row and jth column. The determinant det(Mij) is called the
minor of the element aij of A, and we define the cofactor of aij, denoted by Aij, to be
the signed minor: Aij = (−1)^(i+j) det(Mij).

Note that Mij denotes a matrix, whereas Aij denotes a scalar.

Definition A.14
Consider an n-square matrix A over a field K. The classical adjoint (also simply called
the adjoint) of A, denoted by adj A, is the transpose of the matrix of cofactors of A,
i.e.

adj A = [ A11  A21  ⋯  An1
          A12  A22  ⋯  An2
          ⋮    ⋮    ⋱   ⋮
          A1n  A2n  ⋯  Ann ].

Lemma A.15 (Laplace Expansions)
For each fixed column index j, and for each fixed row index i, one has

det A = Σ_{i=1}^{n} aij Aij = Σ_{j=1}^{n} aij Aij

(expansion along the jth column and along the ith row, respectively).

Proof
Omitted. ∎

Lemma A.16
Suppose A is an n-square matrix, and suppose B is the matrix obtained from A by
replacing the ith row of A by the row vector (bi1 bi2 … bin). Then

det B = Σ_{j=1}^{n} bij Aij.

Additionally, if i ≠ j then

Σ_{k=1}^{n} ajk Aik = Σ_{k=1}^{n} akj Aki = 0.

Proof
As usual with our convention, B = (bij), and by the Laplace expansion of Lemma A.15 we
have

det B = Σ_{j=1}^{n} bij Bij.

Now, we can safely say that Bij = Aij for j = 1, …, n, since Bij does not depend upon
the ith row of B. Therefore the equation above becomes

det B = Σ_{j=1}^{n} bij Aij.

This proves the first part of the claim. The second part requires us to introduce the
matrix A′ obtained from A by replacing the ith row of A by its jth row. Determinant
theory gives det A′ = 0, since A′ has two identical rows (this is a classical result in
linear algebra). Using the first part of the claim (with the jth row of A playing the
role of the row vector) we have

0 = det A′ = Σ_{k=1}^{n} ajk Aik.

The same argument applied to columns instead of rows (using the classical fact
det Aᵀ = det A) gives

Σ_{k=1}^{n} akj Aki = 0.

And this completes the proof. ∎

Lemma A.17
Suppose A is an n-square matrix; then A ⋅ (adj A) = (adj A) ⋅ A = det(A) I.

Proof
As usual with our notation, the elements of A are denoted by aij, and the elements of
A ⋅ (adj A) will be denoted by bij.

The ith row of A is (ai1 ai2 … ain). Since adj A is the transpose of the matrix of
cofactors, the jth column of adj A is the vector of cofactors of the jth row of A, i.e.
(Aj1 Aj2 … Ajn)ᵀ. We obtain bij by multiplying these two vectors together, i.e.

bij = (ai1 ai2 … ain) ⋅ (Aj1 Aj2 … Ajn)ᵀ = Σ_{k=1}^{n} aik Ajk.

By Lemma A.16 (and Lemma A.15 when i = j) we have

bij = det(A) if i = j, and bij = 0 otherwise.

Hence A ⋅ (adj A) is the diagonal matrix with each diagonal element equal to det(A);
this means that A ⋅ (adj A) = det(A) I. The exact same reasoning gives
(adj A) ⋅ A = det(A) I. ∎

We now reach the second important proposition of this chapter, having developed all the
tools necessary to state and prove it.

Proposition A.18 (Cayley-Hamilton theorem)
Every matrix is a root of its characteristic polynomial.

Proof
Suppose that A is an arbitrary real n-square matrix, and let Δ(t) denote its
characteristic polynomial, say

Δ(t) = det(tI − A) = tⁿ + a_{n−1}t^(n−1) + ⋯ + a₁t + a₀ = tⁿ + Σ_{k=0}^{n−1} aₖtᵏ.

Now let B(t) denote the classical adjoint of the matrix tI − A. The elements of B(t) are
cofactors of the matrix tI − A and hence are polynomials in t of degree not exceeding
n − 1; thus

B(t) = B_{n−1}t^(n−1) + ⋯ + B₁t + B₀ = Σ_{k=0}^{n−1} Bₖtᵏ,

where the Bₖ are constant n-square matrices. By Lemma A.17 we have
(tI − A)B(t) = det(tI − A)I, or

(tI − A)(Σ_{k=0}^{n−1} Bₖtᵏ) = (tⁿ + Σ_{k=0}^{n−1} aₖtᵏ) I.

The key step is now to equate the coefficients of corresponding powers of t (this can be
done since the equality holds for all t):

B_{n−1} = I
B_{n−2} − AB_{n−1} = a_{n−1}I
B_{n−3} − AB_{n−2} = a_{n−2}I    (in general, B_{k−1} − ABₖ = aₖI for 1 ≤ k ≤ n − 1)
⋮
B₀ − AB₁ = a₁I
−AB₀ = a₀I.

Next, multiply these matrix equations by Aⁿ, A^(n−1), …, A, I, respectively; this yields

AⁿB_{n−1} = Aⁿ
A^(n−1)B_{n−2} − AⁿB_{n−1} = a_{n−1}A^(n−1)
A^(n−2)B_{n−3} − A^(n−1)B_{n−2} = a_{n−2}A^(n−2)
⋮
AB₀ − A²B₁ = a₁A
−AB₀ = a₀I.

Finally, add the above matrix equations; the left-hand sides telescope to zero, giving

0 = Aⁿ + a_{n−1}A^(n−1) + a_{n−2}A^(n−2) + ⋯ + a₀I = Aⁿ + Σ_{k=0}^{n−1} aₖAᵏ = Δ(A),

which is the Cayley-Hamilton theorem. ∎
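
The theorem is easy to illustrate numerically in Matlab: poly computes the coefficients
of the characteristic polynomial and polyvalm evaluates a polynomial at a matrix
argument (the test matrix below is arbitrary).

% Numerical illustration of the Cayley-Hamilton theorem
A = [0 1 0; 0 0 1; -6 -11 -6];
c = poly(A);                  % coefficients of det(tI - A)
disp(norm(polyvalm(c, A)))    % ~ 0 up to rounding: Delta(A) = 0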

Lemma A.19
Let A and B be two real n-square matrices; then (AB)ᵀ = BᵀAᵀ, where ᵀ denotes the
transpose of a matrix.

Proof
According to our notation, A = (aij) and B = (bij), so the ij-entry of AB is
Σ_{k=1}^{n} aik bkj, which is also the ji-entry of (AB)ᵀ.

Now, column j of B becomes row j of Bᵀ, and row i of A becomes column i of Aᵀ. This
means that the ji-entry of BᵀAᵀ is

(b1j b2j … bnj) (ai1, ai2, …, ain)ᵀ = Σ_{k=1}^{n} bkj aik.

Thus (AB)ᵀ = BᵀAᵀ, because the corresponding entries are equal. ∎

Lemma A.20
For any real n-square matrix A we have (Aᵀ)ᵏ = (Aᵏ)ᵀ.

Proof (by induction)
By setting B = A in Lemma A.19 we have (A²)ᵀ = (Aᵀ)². Now we proceed by induction. Let
P(n) be the statement (Aᵀ)ⁿ = (Aⁿ)ᵀ; we know it to be true for n = 2.
Assume P(n) is true for some integer n, so that (Aᵀ)ⁿ = (Aⁿ)ᵀ, and multiply both sides
by Aᵀ. The left-hand side is simply (Aᵀ)^(n+1), whereas the right-hand side becomes
(Aⁿ)ᵀAᵀ = (AAⁿ)ᵀ = (A^(n+1))ᵀ, by Lemma A.19. So if P(n) is true for some integer n,
then so is P(n + 1), and thus P(n) is true for all n ≥ 2. ∎

We finish with two lemmas concerning the determinant and the transpose of the
exponential of a square matrix.

Lemma A.21
Suppose A is a real n-square matrix; then det e^A = e^(Tr A).

Proof
Let us start from a reduction of A of the form A = P⁻¹TP, where T is upper triangular
(such a reduction always exists, over ℂ). Note that since Tᵏ is still upper triangular,
with diagonal entries tⱼⱼᵏ, e^T is upper triangular as well, with diagonal entries
e^(tⱼⱼ). Hence

det e^T = Π_j e^(tⱼⱼ) = exp(Σ_j tⱼⱼ) = e^(Tr T),

and this is the result we wanted: since e^A = P⁻¹e^T P, and both the determinant and the
trace are invariant under similarity, det e^A = det e^T = e^(Tr T) = e^(Tr A). ∎
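
A quick numerical illustration of this identity in Matlab (on a random test matrix):

% Check of det(e^A) = e^(trace(A))
A = randn(4);
disp([det(expm(A)), exp(trace(A))])   % the two numbers agree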

Lemma A.22
Suppose A is a real n-square matrix; then (e^A)ᵀ = exp(Aᵀ), where exp denotes the
exponential of a square matrix as usual.

Proof
By the definition and Lemma A.4 we have

(e^A)ᵀ = (Σ_{k=0}^{∞} (1/k!) Aᵏ)ᵀ = Σ_{k=0}^{∞} (1/k!) (Aᵏ)ᵀ = Σ_{k=0}^{∞} (1/k!) (Aᵀ)ᵏ = exp(Aᵀ),

where we have used Lemma A.20 to interchange the transpose and the power. ∎


