
Optimal control

Overview of optimal control


Linear quadratic regulator
Linear quadratic tracking control
Examples



Overview

What is optimal control?

Optimal control is the process of finding control and state histories for a system so as to minimize a performance index.
The optimal control problem is to find a control $u$ that forces the system $\dot{x} = Ax + Bu$ to follow an optimal trajectory $x$ that minimizes the performance criterion, or cost function, $J = \int_0^{t_1} h(x, u)\, dt$.
For discrete-time systems, it is $x_{k+1} = Ax_k + Bu_k$, with $J = \sum_{k=0}^{k_1} h(x_k, u_k)$.
$t_1$, $k_1$ are called the optimization horizon.
Example: Autopilot of a yacht

Autopilot designed for course-keeping, i.e. minimize the heading error $e = \psi_d - \psi$ in the presence of disturbances (wind, waves).
What are the objectives?
Keep on track as much as possible, i.e. minimize $e$ (save time)
Use as little fuel $f$ as possible (save cost)
Minimize rudder activity $\delta$ (save cost)



Example: Autopilot of a yacht

Define a quadratic performance index:

$$J = \int_{t_0}^{t_1} \left( q e^2 + r_1 f^2 + r_2 \delta^2 \right) dt, \qquad q, r_1, r_2 > 0$$

$$= \int_{t_0}^{t_1} \left( e \, q \, e + \begin{bmatrix} f & \delta \end{bmatrix} \begin{bmatrix} r_1 & 0 \\ 0 & r_2 \end{bmatrix} \begin{bmatrix} f \\ \delta \end{bmatrix} \right) dt$$

$$= \int_{t_0}^{t_1} \left( x^T Q x + u^T R u \right) dt$$

$Q$, $R$ are state and control weighting matrices, always square and symmetric.
Optimal control seeks to find $u$ that minimizes $J$.
Linear Quadratic Regulator (LQR)

Consider a state-space system

$$\dot{x} = Ax + Bu \qquad (1)$$
$$y = Cx \qquad (2)$$

where $\dim(x) = n_x$, $\dim(y) = n_y$, $\dim(u) = n_u$.

Our objective is to make $x \to 0$ (regulate the system) as fast as possible, but using as little control effort $u$ as possible.

Question: How do we design the control $u$ so that the states converge as fast as possible with as little control effort as possible?
LQR

Define the value function

$$f = \min_u \int_{t_0}^{t_1} \left( x^T Q x + u^T R u \right) dt > 0 \qquad (3)$$

where $Q$, $R$ are symmetric positive definite weighting matrices.

A positive definite matrix $Q$ satisfies $x^T Q x > 0$ for all nonzero values of $x$.
Mathematical representation: if $Q > 0$ then $x^T Q x > 0 \ \forall x \neq 0$.
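
In MATLAB, this can be checked via the eigenvalues (a small illustration, not from the slides):

% a symmetric Q is positive definite iff all its eigenvalues are strictly positive
Q = [2 0; 0 3];                           % example weight
isPD = issymmetric(Q) && all(eig(Q) > 0)  % returns true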



LQR

Then differentiate with respect to $t$ to get

$$-\frac{\partial f}{\partial t} = \min_u \left( x^T Q x + u^T R u + \left( \frac{\partial f}{\partial x} \right)^T \frac{\partial x}{\partial t} \right) \qquad (4)$$

Define $f = x^T P x$ where $P = P^T > 0$, so we get

$$\frac{\partial f}{\partial x} = 2Px, \qquad \left( \frac{\partial f}{\partial x} \right)^T = 2x^T P \qquad (5)$$

$$\frac{\partial f}{\partial t} = x^T \frac{\partial P}{\partial t} x \qquad (6)$$


LQR

Substitute for $\frac{\partial f}{\partial t}$, $\frac{\partial f}{\partial x}$, $\frac{\partial x}{\partial t}$ from (6), (5), (1) into (4) to get

$$-x^T \frac{\partial P}{\partial t} x = \min_u \left( x^T Q x + u^T R u + 2x^T P (Ax + Bu) \right) \qquad (7)$$

To minimize the RHS of (7) w.r.t. $u$, take the partial derivative

$$\frac{\partial}{\partial u} \left( x^T Q x + u^T R u + 2x^T P (Ax + Bu) \right) = 2u^T R + 2x^T P B \qquad (8)$$

Equate it to zero to get

$$u = -R^{-1} B^T P x \qquad (9)$$


LQR

Substitute (9) into (7) to get

$$-x^T \dot{P} x = x^T \left( Q + 2PA - PBR^{-1}B^T P \right) x \qquad (10)$$

Since $2x^T P A x = x^T (PA + A^T P) x$, (10) becomes

$$-x^T \dot{P} x = x^T \left( Q + PA + A^T P - PBR^{-1}B^T P \right) x$$

and

$$\dot{P} = -\left( Q + PA + A^T P - PBR^{-1}B^T P \right) \qquad (11)$$


LQR

It can be shown that the solutions of $P$ will converge and hence $\dot{P} \to 0$.

If the final time $t_1$ is very far away from $t_0$ (infinite horizon), then (11) reduces to the algebraic Riccati equation

$$PA + A^T P + Q - PBR^{-1}B^T P = 0 \qquad (12)$$
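
Numerically, a candidate solution can be verified by checking that the residual of (12) vanishes (assumes A, B, Q, R, P are already in the workspace):

% residual of the algebraic Riccati equation (12); ~0 at the stabilizing solution
norm(P*A + A'*P + Q - P*B*(R\(B'*P)))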



LQR

Summary

Given a state-space system (1), to design the optimal controller:

Select the weighting matrices Q, R as in (3)
The size of the weights corresponds to how much you want to penalize x and u; to make x converge faster, make Q bigger; to use less input, make R bigger

Solve the Riccati equation (12) to get P
You can solve this in Matlab using the command care or lqr; a minimal sketch follows below

Set u as in (9)
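
A minimal MATLAB sketch of this recipe (the plant matrices are placeholders, borrowed from Example 1 below):

% LQR design sketch: assumes (A, B) stabilizable, Q and R chosen as in (3)
A = [0 1; -3 -2];  B = [0; 1];
Q = [2 0; 0 3];    R = 1;
P = care(A, B, Q, R);       % solve the algebraic Riccati equation (12)
K = R \ (B' * P);           % optimal gain, so u = -K*x as in (9)
% lqr does both steps at once: gain, Riccati solution, closed-loop poles
[K2, P2, cl] = lqr(A, B, Q, R);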



LQR

One good thing about this design method is that stability is guaranteed if (A, B) is stabilizable.
Just need to choose Q, R and the system will be stable.
The Riccati equation (12) can be re-written as

$$P(A - BR^{-1}B^T P) + (A - BR^{-1}B^T P)^T P + Q + PBR^{-1}B^T P = 0$$

Recall that $K = R^{-1}B^T P$, so re-arrange to get

$$P(A - BK) + (A - BK)^T P = \underbrace{-\left( Q + PBR^{-1}B^T P \right)}_{\text{negative definite}} \qquad (13)$$
LQR

Quoting Lyapunov theory: if there exists a matrix $P = P^T > 0$ such that

$$PA + A^T P < 0$$

then the matrix $A$ is stable.

Apply the same argument to (13): therefore $A - BK$ is stable.
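
As a quick numerical check, reusing A, B, P, K from the sketch above:

% Closed-loop stability check via Lyapunov: A - B*K is stable because
% P*(A-B*K) + (A-B*K)'*P is negative definite, per (13)
Acl = A - B*K;
eig(Acl)             % all eigenvalues should have negative real part
eig(P*Acl + Acl'*P)  % all eigenvalues should be negative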



LQR

The weights Q and R can be used to tune the size of K


If Q is chosen to be larger, then K will also be larger
If R is chosen to be larger, then K will be smaller



Example 1

Consider a state-space system where

$$A = \begin{bmatrix} 0 & 1 \\ -3 & -2 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

It is desired to minimize the cost function

$$J = \int \left( x^T Q x + u^T R u \right) dt$$

where

$$Q = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}, \qquad R = 1$$


Example 1

The Matlab commands are

>> A=[0 1;-3 -2]; B=[0;1];
>> Q=[2 0;0 3]; R=1;
>> P=care(A,B,Q,R);

and we get

$$P = \begin{bmatrix} 3.1633 & 0.3166 \\ 0.3166 & 0.7628 \end{bmatrix}$$

$$K = K_{opt} = R^{-1} B^T P = \begin{bmatrix} 0.3166 & 0.7628 \end{bmatrix}$$

$$\lambda(A - BK) = -1.3814 \pm j1.1867$$


Example 1

Let's try another design to make $\lambda(A - BK)$ deeper in the LHP; the following choice of K will give $\lambda(A - BK) = -4, -5$ (pole placement).
Solve using $\det(\lambda I - A + BK) = 0$ to find K:

$$K = K_1 = \begin{bmatrix} 17 & 7 \end{bmatrix}$$

Let's now simulate the system; set an initial condition of

$$x_0 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
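
One way to reproduce the comparison in the plots that follow is sketched below; the ode45 call and the trapezoidal cost quadrature are my own scaffolding, not from the slides:

% Simulate xdot = (A - B*K)*x from x0 for both gains and accumulate the cost
Kopt = lqr(A, B, Q, R);   % optimal gain from before
K1   = [17 7];            % pole-placement gain
x0   = [2; 3];
for K = {Kopt, K1}
    k = K{1};
    [t, x] = ode45(@(t, x) (A - B*k)*x, [0 10], x0);
    u = -x * k';                               % u(t) = -k*x(t), one row per sample
    J = trapz(t, sum((x*Q).*x, 2) + R*u.^2)    % cost integral, higher for K1
end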



Example 1

[Figure: the states x1, x2 versus time (0-10 s) for K = Kopt and K = K1]


Example 1

[Figure: the input u versus time (0-10 s) for K = Kopt and K = K1]


Example 1

[Figure: the accumulated cost J versus time (0-10 s) for K = Kopt and K = K1]


Example 1

So what can we observe/conclude?

The non-optimal controller $K_1$ makes x converge faster, because the eigenvalues of $A - BK_1$ are more negative
The non-optimal controller $K_1$ requires a larger control effort u
Ultimately the cost function with $K_1$ is higher, because it is not optimal


Example 1

The Riccati equation can actually be solved by hand.

Let $P = \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix}$

Substitute A, B, Q, R into the Riccati equation:

$$\begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -3 & -2 \end{bmatrix} + \begin{bmatrix} 0 & -3 \\ 1 & -2 \end{bmatrix} \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} - \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} p_1 & p_2 \\ p_2 & p_3 \end{bmatrix} = 0$$


Example 1

We get the following equations:

$$-6p_2 + 2 - p_2^2 = 0 \qquad (14)$$
$$p_1 - 2p_2 - 3p_3 - p_2 p_3 = 0 \qquad (15)$$
$$2p_2 - 4p_3 + 3 - p_3^2 = 0 \qquad (16)$$

Solve (14) to get $p_2 = 0.3166, -6.3166$
Substitute $p_2 = 0.3166$ into (16) and solve: $p_3 = 0.7628, -4.7628$ (take only the positive root)
Substitute $p_2 = -6.3166$ into (16) and solve: $p_3 = -2 \pm j2.3634$ (reject this set of $p_2, p_3$)
Finally solve (15) to get $p_1 = 3.1631$
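
The hand calculation is easy to cross-check numerically (my own scaffolding, using roots):

% (14): p2^2 + 6*p2 - 2 = 0, keep the positive root 0.3166
p2 = max(roots([1 6 -2]));
% (16): p3^2 + 4*p3 - (2*p2 + 3) = 0, keep the positive root 0.7628
p3 = max(roots([1 4 -(2*p2 + 3)]));
% (15): p1 = 2*p2 + 3*p3 + p2*p3 = 3.1631
p1 = 2*p2 + 3*p3 + p2*p3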
LQ tracking

What was demonstrated earlier was a regulation problem (to make $x \to 0$).

In reality, it is desired that $x$ follows a desired state trajectory $r$.

So now the quadratic performance index is

$$J = \int_{t_0}^{t_1} \left( (r - x)^T Q (r - x) + u^T R u \right) dt \qquad (17)$$

$Q$, $R$ have the same function as before.


LQ tracking

The optimal control u is given by

$$u = -R^{-1} B^T P x + R^{-1} B^T s \qquad (18)$$

$$\dot{s} = -(A - BR^{-1}B^T P)^T s - Qr \qquad (19)$$

$$\dot{P} = -PA - A^T P - Q + PBR^{-1}B^T P \qquad (20)$$
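
For the infinite-horizon case with a constant reference r, $\dot{P} = 0$ and $\dot{s} = 0$, so (19) becomes an algebraic equation; a sketch under that assumption (A, B, Q, R and the reference vector r are assumed given):

% steady-state LQ tracking: P and K from the ARE, s from (19) with sdot = 0
[K, P] = lqr(A, B, Q, R);    % K = inv(R)*B'*P
s = -(A - B*K)' \ (Q*r);     % solves 0 = -(A - B*K)'*s - Q*r
u = @(x) -K*x + R\(B'*s);    % control law (18)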



Example 2

Consider a DC motor modelled by:

$$\begin{bmatrix} \dot{\omega} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} -2 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \omega \\ \theta \end{bmatrix} + \begin{bmatrix} 10 \\ 0 \end{bmatrix} V$$

where $\omega$ is the motor speed and $\theta$ the shaft position. Let $x = \begin{bmatrix} \omega & \theta \end{bmatrix}^T$ and define $r$ to be the reference for $x$. Find the optimal control $V$ (for infinite horizon) such that the following cost function is minimized:

$$J = \int_0^{\infty} \left( (r - x)^T Q (r - x) + V^2 \right) dt, \qquad Q = \begin{bmatrix} 1 & 0 \\ 0 & 5 \end{bmatrix}$$
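
A sketch of the design step for this example (the reference signal used in the plots is not specified numerically on the slides, so only the Riccati solution is computed here):

% DC motor LQ tracking design, x = [omega; theta], R = 1
A = [-2 0; 1 0];  B = [10; 0];
Q = [1 0; 0 5];   R = 1;
[K, P] = lqr(A, B, Q, R)     % P matches the values on the next slide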



Example 2

Based on A, B, Q, R = 1, solve the Riccati equation:

$$P = \begin{bmatrix} 0.1020 & 0.2236 \\ 0.2236 & 2.7269 \end{bmatrix}$$

Implement the optimal controller in (18)-(19) and get the following results:


Example 2

[Figure: motor position (solid) and its reference (dashed), 0-50 s]


Example 2

[Figure: input voltage V, 0-50 s]


Example 2

[Figure: cost function J, 0-50 s]


Example 2

The response of the position $\theta$ is rather sluggish, but it saves on the control input $V$.
To make $\theta$ respond faster, increase its weight:

$$Q = \begin{bmatrix} 1 & 0 \\ 0 & 50 \end{bmatrix}$$

Solve the Riccati equation to get

$$P = \begin{bmatrix} 0.1367 & 0.7071 \\ 0.7071 & 11.0775 \end{bmatrix}$$

$\theta$ converges faster, but at a higher cost of $V$.


Example 2

[Figure: motor position (solid) and its reference (dashed) with the larger Q, 0-50 s]


Example 2

[Figure: input voltage V with the larger Q, 0-50 s]


LQ tracking

[Figure: block diagram of the LQ tracking controller (18)-(19)]


Example 2

Notice that $\theta$ converges much faster now, but at a higher cost of $V$.
Note that there is no steady-state error here because this is a type 1 system; for optimal tracking in general, there may still be steady-state error, in order to save on input cost.
