Chapter 6
Dynamic Programming
$$J = \tfrac{1}{2}\, s\, x_N^T x_N + \tfrac{1}{2} \sum_{k=0}^{N-1} \left( q\, x_k^T x_k + r\, u_k^T u_k \right)$$
BC-I: $x_0$ given
BC-II: $x_N = r_N$ given
Example 6.2-1. Optimal control of a discrete system using dynamic programming. Find an admissible control sequence u0*, u1* that minimizes J0 and results in an admissible state trajectory x0*, x1*, x2*.
System: $x_{k+1} = x_k + u_k$
Performance index: $J_0 = x_N^2 + \tfrac{1}{2} \sum_{k=0}^{N-1} u_k^2$, with N = 2
Constraints: $0 \le x_k \le 1.5$ and $u_k \in \{-1, -0.5, 0, 0.5, 1\}$, so the admissible states are $x_k \in \{0, 0.5, 1, 1.5\}$
State: with x0 = 1.5 at k = 0 and control u0 = 1, the state equation gives x1 = 1.5 + 1 = 2.5, and a further u1 = 1 gives x2 = 2.5 + 1 = 3.5. Both values violate the state constraint, so this control is not admissible.
Contribution to J0: each control adds $\tfrac{1}{2} u_k^2$, and the final state adds $J_2^* = x_2^2$; for example, x2 = 1.5 gives $J_2^* = 1.5^2 = 2.25$.
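The backward recursion for this example can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own procedure: it assumes the performance index J0 = x_N^2 + 0.5 * sum(u_k^2) implied by the terminal values shown (for example 1.5^2 = 2.25), and the names `STATES`, `CONTROLS` and `solve` are hypothetical.

```python
# Backward dynamic programming for x[k+1] = x[k] + u[k] with N = 2.
# Assumption: J0 = x_N**2 + 0.5 * sum(u_k**2), as the terminal costs suggest.

STATES = [0.0, 0.5, 1.0, 1.5]            # admissible states, 0 <= x <= 1.5
CONTROLS = [-1.0, -0.5, 0.0, 0.5, 1.0]   # admissible controls
N = 2


def solve():
    """Return (cost, policy): cost[k][x] is J_k*(x); policy[k][x] lists minimizing u."""
    cost = {N: {x: x ** 2 for x in STATES}}   # terminal cost J_N* = x_N^2
    policy = {}
    for k in range(N - 1, -1, -1):            # k = N-1, ..., 0
        cost[k], policy[k] = {}, {}
        for x in STATES:
            candidates = {}
            for u in CONTROLS:
                x_next = x + u
                if x_next in cost[k + 1]:     # skip states outside the constraint
                    candidates[u] = 0.5 * u ** 2 + cost[k + 1][x_next]
            best = min(candidates.values())
            cost[k][x] = best
            policy[k][x] = [u for u, c in candidates.items() if c == best]
    return cost, policy


if __name__ == "__main__":
    cost, policy = solve()
    for x0 in STATES:
        print(f"x0={x0}: J0*={cost[0][x0]}, u0*={policy[0][x0]}")
```

Because every state and control is a multiple of 0.5, the floating-point sums are exact and can be used directly as dictionary keys; a finer grid would need rounding or integer indices instead.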
State transitions $x_{k+1} = x_k + u_k$, used both for x1 = x0 + u0 (K = 0) and for x2 = x1 + u1 (K = 1). Entries marked * lie outside $0 \le x_k \le 1.5$ and are not admissible.

x_k     u=1     u=0.5   u=0     u=-0.5  u=-1
1.5     2.5*    2*      1.5     1       0.5
1       2*      1.5     1       0.5     0
0.5     1.5     1       0.5     0       -0.5*
0       1       0.5     0       -0.5*   -1*
Costs are computed backwards from K = N = 2 using $J_2^* = x_N^2$, $J_1 = \tfrac{1}{2} u_1^2 + J_2^*$ and $J_0 = \tfrac{1}{2} u_0^2 + J_1^*$. A dash marks a branch whose next state exceeds the constraint (not in X_k).

K = N = 2: J2*(1.5) = 2.25, J2*(1) = 1, J2*(0.5) = 0.25, J2*(0) = 0

K = 1: J1 = 0.5 u1^2 + J2*(x1 + u1)
x_1     u=1     u=0.5   u=0     u=-0.5  u=-1
1.5     -       -       2.25    1.125   0.75
1       -       2.375   1       0.375   0.5
0.5     2.75    1.125   0.25    0.125   -
0       1.5     0.375   0       -       -

K = 0: J0 = 0.5 u0^2 + J1*(x0 + u0)
x_0     u=1     u=0.5   u=0     u=-0.5  u=-1
1.5     -       -       0.75    0.5     0.625
1       -       0.875   0.375   0.25    0.5
0.5     1.25    0.5     0.125   0.125   -
0       0.875   0.25    0       -       -
Optimal controls and costs for Example 6.2-1 (the minimizing branch at each node):

x_k     u1*     J1*     u0*         J0*
1.5     -1      0.75    -0.5        0.5
1       -0.5    0.375   -0.5        0.25
0.5     -0.5    0.125   0 or -0.5   0.125
0       0       0       0           0

For example, from x0 = 1.5 the optimal sequence is u0* = -0.5, u1* = -0.5, giving the trajectory x1* = 1, x2* = 0.5 with J0* = 0.125 + 0.125 + 0.25 = 0.5.
A further worked tree uses the system $x_{k+1} = x_k u_k + 1$ with $u_k \in \{1, -1\}$ and N = 2. The costs are $J_2^* = x_N^2$, $J_1 = x_1 u_1 + J_2^*$ and $J_0 = x_0 u_0 + J_1^*$. The states shown are $x_k \in \{-1, 0, 1, 2\}$; x = 3 is marked "State X_k exceeds constraint".

K = N = 2: J2*(2) = 4, J2*(1) = 1, J2*(0) = 0, J2*(-1) = 1

K = 1: x2 = x1 u1 + 1, J1 = x1 u1 + J2*
x_1     u=1: x2, J1             u=-1: x2, J1
2       3, exceeds constraint   -1, -2 + 1 = -1
1       2, 1 + 4 = 5            0, -1 + 0 = -1
0       1, 0 + 1 = 1            1, 0 + 1 = 1
-1      0, -1 + 0 = -1          2, 1 + 4 = 5

K = 0: x1 = x0 u0 + 1, J0 = x0 u0 + J1*
x_0     u=1: x1, J0             u=-1: x1, J0
2       3, exceeds constraint   -1, -2 - 1 = -3
1       2, 1 - 1 = 0            0, -1 + 1 = 0
0       1, 0 - 1 = -1           1, 0 - 1 = -1
-1      0, -1 + 1 = 0           2, 1 - 1 = 0
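The tree for the system x[k+1] = x[k] u[k] + 1 can be checked with a short backward recursion. The sketch below assumes the stage cost x_k u_k and terminal cost x_N^2 read off the slide, and it takes the four states that appear there as the admissible set; `cost_to_go`, `STATES` and `CONTROLS` are hypothetical names.

```python
# Backward recursion for x[k+1] = x[k]*u[k] + 1, u in {1, -1}, N = 2.
# Assumptions: stage cost x_k*u_k, terminal cost x_N**2, admissible states {-1, 0, 1, 2}.

STATES = [-1, 0, 1, 2]
CONTROLS = [1, -1]
N = 2


def cost_to_go():
    """Return (J, best_u): J[k][x] is the optimal cost from x at stage k."""
    J = {N: {x: x * x for x in STATES}}
    best_u = {}
    for k in range(N - 1, -1, -1):
        J[k], best_u[k] = {}, {}
        for x in STATES:
            # Only controls whose successor state is admissible are kept.
            options = {
                u: x * u + J[k + 1][x * u + 1]
                for u in CONTROLS
                if x * u + 1 in J[k + 1]
            }
            J[k][x] = min(options.values())
            best_u[k][x] = [u for u, c in options.items() if c == J[k][x]]
    return J, best_u


if __name__ == "__main__":
    J, best_u = cost_to_go()
    for x0 in STATES:
        print(x0, J[0][x0], best_u[0][x0])
```

Ties are kept as lists, which makes it visible that several initial states have more than one optimal first move.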
[Figure: decision tree for x0 = 0 in Example 6.2-1 over the stages K = 0, K = 1 and K = N = 2, built from $J_0 = \tfrac{1}{2} u_0^2 + J_1^*$, $J_1 = \tfrac{1}{2} u_1^2 + J_2^*$ and $J_2^* = x_N^2$. Branches that leave $0 \le x_k \le 1.5$ are marked "Not in X_k constraints".]
[Figure: aircraft routing network with nodes a to i. Aircraft ascending: u = 1. Aircraft descending: u = -1.]
A minimum-fuel problem with fixed final state and constrained control and state values.
The initial state is x0 = a. The control uk at stage k can take the values uk = ±1, where uk = 1 results in a move up and uk = -1 results in a move down to stage k + 1.
If x3 = f, the optimal (only) control is u3 = -1, and the cost is then 4. This is indicated by placing (4) above node f and an arrowhead on path f → i. If x3 = h, the optimal control is u3 = 1, with a cost of 2.
Now decrement k to 2. If x2 = c, then u2 = -1 with a cost to go of 4 + 3 = 7. This information is added to the figure. If x2 = e, then we must make a decision. If we apply u2 = 1 to get to f and then go via the optimal path to i, the cost is 4 + 3 = 7. On the other hand, if we apply u2 = -1 at e and go to h, the cost is 2 + 2 = 4. Hence, at e the optimal decision is u2 = -1 with a cost to go of 4. Add this information to the figure.
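The cost-to-go bookkeeping described for nodes f, h, c and e can be sketched as a small recursion. Only the five edge costs quoted in the text are used (f to i: 4, h to i: 2, c to f: 3, e to f: 3, e to h: 2); the rest of the network is not reproduced, and the names `EDGES`, `GOAL` and `cost_to_go` are hypothetical.

```python
from functools import lru_cache

# Partial routing network: node -> {control: (next_node, edge_cost)}.
# Assumption: only the edges whose costs are quoted in the text are listed.
EDGES = {
    "c": {-1: ("f", 3)},
    "e": {1: ("f", 3), -1: ("h", 2)},
    "f": {-1: ("i", 4)},
    "h": {1: ("i", 2)},
}
GOAL = "i"


@lru_cache(maxsize=None)
def cost_to_go(node):
    """Return (optimal cost from node to GOAL, optimal control at node)."""
    if node == GOAL:
        return (0, None)
    # Edge cost plus the optimal cost from the successor node.
    return min(
        (cost + cost_to_go(nxt)[0], u) for u, (nxt, cost) in EDGES[node].items()
    )


if __name__ == "__main__":
    for node in "fhce":
        print(node, cost_to_go(node))
```

At node e the recursion compares 3 + 4 = 7 (up, via f) against 2 + 2 = 4 (down, via h) and keeps the latter, matching the decision worked out by hand.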
Note:
- The optimal solution is not unique, e.g., there are two optimal paths a → i
- State-variable feedback: the optimal choice of path depends on the present node (city)
- Forward planning does not work
- Optimal paths satisfy Bellman's principle of optimality,
e.g. b → e → h → i is optimal from b to i, so
e → h → i is optimal from e to i
- Less computation than the exhaustive search approach
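The last note can be illustrated on the discrete system of Example 6.2-1 by enumerating every control sequence directly. The sketch below is an assumption-laden illustration: it takes x[k+1] = x[k] + u[k], 0 <= x <= 1.5, N = 2 and J0 = x_N^2 + 0.5 * sum(u_k^2), and the function name `exhaustive_search` is hypothetical.

```python
from itertools import product

# Exhaustive search over all control sequences for x[k+1] = x[k] + u[k].
# Assumption: J0 = x_N**2 + 0.5 * sum(u_k**2) with 0 <= x_k <= 1.5 and N = 2.
CONTROLS = (-1.0, -0.5, 0.0, 0.5, 1.0)
X_MIN, X_MAX, N = 0.0, 1.5, 2


def exhaustive_search(x0):
    """Return (best cost, best control sequence) from x0, or None if none is admissible."""
    best = None
    for seq in product(CONTROLS, repeat=N):   # 5**N candidate sequences
        x, cost, admissible = x0, 0.0, True
        for u in seq:
            x += u
            cost += 0.5 * u ** 2
            if not X_MIN <= x <= X_MAX:       # state constraint violated
                admissible = False
                break
        if admissible:
            total = cost + x ** 2             # add terminal cost
            if best is None or total < best[0]:
                best = (total, seq)
    return best


if __name__ == "__main__":
    print(exhaustive_search(1.5))
```

The number of sequences examined grows as 5^N for a single initial state, whereas the backward recursion evaluates at most 4 x 5 state-control pairs per stage and yields the optimal control for every initial state at once.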
Discrete-time System
For nonlinear systems the state and costate equations are hard to solve, and constraints further complicate things. Dynamic programming can easily be applied to nonlinear systems, and the more constraints there are on the control and state variables, the easier the solution!
Solving this equation backwards in time for all states yields the optimal control law.
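In the standard discrete-time formulation, the equation solved backwards in time is the functional equation of dynamic programming. As a sketch, assuming a plant $x_{k+1} = f^k(x_k, u_k)$ and a performance index $J_i = \phi(N, x_N) + \sum_{k=i}^{N-1} L^k(x_k, u_k)$ (the symbols $f^k$, $L^k$ and $\phi$ are assumed notation, not taken from the slide), it reads:

```latex
J_k^*(x_k) = \min_{u_k} \left[ L^k(x_k, u_k) + J_{k+1}^*\!\left( f^k(x_k, u_k) \right) \right],
\qquad J_N^*(x_N) = \phi(N, x_N)
```

The minimizing $u_k$ at each admissible $x_k$ defines the feedback law $u_k^* = u_k^*(x_k)$, with the minimization restricted to controls that keep $x_{k+1}$ admissible.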
Assignment (board)
Problem 6.2-1: A routing network is shown in Fig. P6.11. Find the optimal path from x0 to x6 if only movement from left to right is permitted. Now find the optimal path from any node as a state-variable feedback.
[Figure P6.11: routing network from X0 to X6]
[Figure: complete decision tree for Example 6.2-1 with $x_{k+1} = x_k + u_k$, $J_0 = \tfrac{1}{2} u_0^2 + J_1^*$, $J_1 = \tfrac{1}{2} u_1^2 + J_2^*$ and $J_2^* = x_N^2$, showing every branch from x0 = 1.5, 1, 0.5 and 0. Inadmissible states are marked "State X_k exceeds constraint" or "Not in X_k constraints".]
End
Example 6.2-1, variant with state weighting and a wider state range. The system is again $x_{k+1} = x_k + u_k$, with
$J_0 = \tfrac{1}{2} x_N^2 + \tfrac{1}{2} \sum_{k=0}^{N-1} \left( x_k^2 + u_k^2 \right)$, $0 \le x_k \le 2$, N = 2, $u_k \in \{-1, -0.5, 0, 0.5, 1\}$.
The recursion becomes $J_2^* = \tfrac{1}{2} x_N^2$, $J_1 = \tfrac{1}{2} (x_1^2 + u_1^2) + J_2^*$ and $J_0 = \tfrac{1}{2} (x_0^2 + u_0^2) + J_1^*$.
Terminal costs: J2*(2) = 2, J2*(1.5) = 1.125, J2*(1) = 0.5, J2*(0.5) = 0.125, J2*(0) = 0.
For example, at x1 = 2 with u1 = 0 the next state is x2 = 2 and J1 = 0.5(4 + 0) + 2 = 4.
[Figure: decision tree for x0 = 2, 1.5, 1, 0.5 and 0 with the costs J1 and J0 on each branch; branches that leave $0 \le x_k \le 2$ are marked "State X_k exceeds constraint" or "Not in X_k constraints".]
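The state-weighted variant can be worked through with the same backward recursion. This is a sketch under the assumptions stated on the slide as reconstructed here, namely J0 = 0.5 x_N^2 + 0.5 * sum(x_k^2 + u_k^2) and 0 <= x <= 2 on a grid of step 0.5; the name `solve_weighted` is hypothetical.

```python
# Backward recursion for x[k+1] = x[k] + u[k] with a state-weighted cost.
# Assumption: J0 = 0.5*x_N**2 + 0.5*sum(x_k**2 + u_k**2), 0 <= x_k <= 2, N = 2.

STATES = [0.0, 0.5, 1.0, 1.5, 2.0]
CONTROLS = [-1.0, -0.5, 0.0, 0.5, 1.0]
N = 2


def solve_weighted():
    """Return (J, U): J[k][x] is the optimal cost, U[k][x] the minimizing controls."""
    J = {N: {x: 0.5 * x ** 2 for x in STATES}}   # terminal cost
    U = {}
    for k in reversed(range(N)):
        J[k], U[k] = {}, {}
        for x in STATES:
            options = {}
            for u in CONTROLS:
                if x + u in J[k + 1]:             # admissible successor only
                    options[u] = 0.5 * (x ** 2 + u ** 2) + J[k + 1][x + u]
            J[k][x] = min(options.values())
            U[k][x] = [u for u, c in options.items() if c == J[k][x]]
    return J, U


if __name__ == "__main__":
    J, U = solve_weighted()
    for x0 in STATES:
        print(f"x0={x0}: J0*={J[0][x0]}, u0*={U[0][x0]}")
```

With the state penalty included, the controller moves toward x = 0 more aggressively than in the original example; from x0 = 2 the first move is u0 = -1.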