06 - Optimal Control Theory
Optimal Control Problem
➢ Control System Modeling
▪ State vector
- describes the internal behavior of the system
- $x = [x_1 \; x_2 \; \cdots \; x_n]^T$
▪ Control vector
- determines the system behavior
- $u = [u_1 \; u_2 \; \cdots \; u_m]^T$
Optimal Control Problem
➢ Admissible controls
A piecewise continuous control $u(\cdot)$, defined on some time
interval $t_0 \le t \le t_f$ with range in the control region $U$,
$u(t) \in U, \quad t \in [t_0, t_f]$
is said to be an admissible control.
[Figure: an admissible state trajectory $x(t)$ and a piecewise continuous control $u(t)$ over $[t_0, t_f]$]
Optimal Control Problem
➢ Performance criterion
▪ A performance criterion (or a cost function) measures the
penalty that must be paid as a consequence of the dynamic
system’s trajectory.
Optimal Control Problem
➢ Types of Functional Forms
▪ Lagrange form
$J(u) = \int_{t_0}^{t_f} L(t, x(t), u(t))\,dt$
Ex) When the integrand is an exact time derivative, $L = \dfrac{dl}{dt}$:
$J(u) = \int_{t_0}^{t_f} L(t, x(t), u(t))\,dt = \int_{t_0}^{t_f} \frac{dl}{dt}(t, x(t), u(t))\,dt$
▪ Minimum fuel
$J(u) = \int_{t_0}^{t_f} \dot{m}\,dt = m(t_f) - m(t_0)$
$J(u) = \int_{t_0}^{t_f} f_i(t, x(t), u(t))\,dt = x_i(t_f) - x_i(t_0) \;\equiv\; \phi(x(t_f))$
(since $\dot{x}_i = f_i$, such a Lagrange cost reduces to a Mayer, i.e. terminal, cost)
➢ Types of constraints
▪ Point constraints
$\psi_i(t, x(t)) \le 0, \quad \psi_e(t, x(t)) = 0, \quad t \in [t_0, t_f]$
▪ Isoperimetric (integral) constraints
- Can be reformulated as a point constraint (see the sketch below)
$\int_{t_0}^{t_f} g_i(t, x(t), u(t))\,dt \le 0, \qquad \int_{t_0}^{t_f} g_e(t, x(t), u(t))\,dt = 0$
▪ Path constraints
$g_i(t, x(t), u(t)) \le 0, \quad g_e(t, x(t), u(t)) = 0, \quad t \in [t_0, t_f]$
Ex) Input constraints, path constraints
$u_{\min} \le u(t) \le u_{\max}, \qquad x_{\min} \le x_k(t) \le x_{\max}$
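As a sketch of the reformulation (the auxiliary state $z$ is notation introduced here, not taken from the original slide): augment the dynamics with a state that accumulates the integral, so that the integral constraint becomes a point constraint at the final time:
$\dot{z}(t) = g_i(t, x(t), u(t)), \quad z(t_0) = 0 \quad\Longrightarrow\quad \int_{t_0}^{t_f} g_i(t, x(t), u(t))\,dt \le 0 \;\Longleftrightarrow\; z(t_f) \le 0$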
Optimal Control Problem
➢ Generalized optimal control problem
Find the control history $u(t)$ that minimizes the performance index
$J = \phi(t_f, x(t_f)) + \int_{t_0}^{t_f} L(t, x, u)\,dt$
Functional
➢ Function
▪ A function $f$ is a rule of correspondence that assigns to each
element $q$ in a certain set $D$ a unique element in a set $R$.
$D$ is called the domain of $f$ and $R$ is the range.
▪ Ex) Suppose $q_1, q_2, \ldots, q_n$ are the coordinates of a point in
$n$-dimensional space; then a rule assigning a real number to each
point $(q_1, \ldots, q_n)$ is a function of $n$ variables.
➢ Functional
▪ A functional $J$ is a rule of correspondence that assigns to each
function $x$ in a certain class $\Omega$ a unique real number.
$\Omega$ is called the domain of the functional, and the set of
corresponding real numbers is the range $R$.
▪ A functional is a “function of a function.”
▪ Ex) Suppose that $x$ is a continuous function of $t$ defined on the
interval $[t_0, t_f]$. Then $J(x) = \int_{t_0}^{t_f} x(t)\,dt$ is a functional.
Variation of a function
➢ The function $\delta x(t)$, which is infinitesimal, is called a virtual
displacement and may be considered as the variation of
the function $x(t)$ at a fixed value of $t$. The varied function $\tilde{x}(t)$
is defined as
$\tilde{x}(t) = x(t) + \delta x(t)$
➢ The operator $\delta$ means a change in the function at a fixed value
of the independent variable, while the operator $d$ means a
change in the function w.r.t. a change in the independent variable.
[Figure: a curve $x(t)$ with the variation $\delta x(t)$ at fixed $t$ giving $\tilde{x} = x + \delta x$, versus the differential $dx(t)$ between $t$ and $t + dt$]
Variation of a function
➢ Some properties of the $\delta$ operator
$\delta\!\left( \frac{dx}{dt} \right) = \frac{d}{dt}(\delta x)$
$\delta \int_{t_0}^{t_f} x(t)\,dt = \int_{t_0}^{t_f} \delta x(t)\,dt$
Calculus of Variation
➢ Increment of the functional $J$
If $x$ and $x + \delta x$ are functions for which the functional $J$ is defined, then the
increment of $J$, denoted by $\Delta J$, is
$\Delta J(x, \delta x) \triangleq J(x + \delta x) - J(x)$
where $\delta x$ is called the variation of the function $x$.
➢ Variation of $J$
The increment of a functional can be written as
$\Delta J(x, \delta x) = \delta J(x, \delta x) + g(x, \delta x)\,\lVert \delta x \rVert$
where $\delta J$ is linear in $\delta x$. If
$\lim_{\lVert \delta x \rVert \to 0} g(x, \delta x) = 0,$
then $J$ is said to be differentiable on $x$, and $\delta J$ is the variation of $J$.
Calculus of Variation
➢ Fundamental theorem
If $x^*$ is an extremal, the variation of $J$ must vanish on $x^*$;
that is, $\delta J(x^*, \delta x) = 0$ for all admissible $\delta x$.
Proof by contradiction)
Assume that $x^*$ is an extremal and $\delta J(x^*, \delta x) \ne 0$.
We will see that these assumptions imply that the sign of $\Delta J$ can be changed
in an arbitrarily small neighborhood of $x^*$.
Recall that $\Delta J(x^*, \delta x) = J(x^* + \delta x) - J(x^*) = \delta J(x^*, \delta x) + g(x^*, \delta x)\,\lVert \delta x \rVert$
where $g(x^*, \delta x) \to 0$ as $\lVert \delta x \rVert \to 0$; thus, there is a neighborhood $\lVert \delta x \rVert < \epsilon$ where
$g(x^*, \delta x)\,\lVert \delta x \rVert$ is small enough that $\delta J$ dominates $\Delta J$.
Calculus of Variation
Now let us select the variation of $x$ as
$\delta x = \alpha\,\delta\bar{x}$, where $\alpha > 0$ and $\lVert \alpha\,\delta\bar{x} \rVert < \epsilon$. Suppose that $\delta J(x^*, \delta\bar{x}) > 0$.
Since $\delta J$ is a linear functional of $\delta x$, from the homogeneity
$\delta J(x^*, \alpha\,\delta\bar{x}) = \alpha\,\delta J(x^*, \delta\bar{x}) > 0$. Since $\delta J$ dominates $\Delta J$, $\Delta J(x^*, \alpha\,\delta\bar{x}) > 0$.
Choosing $\alpha < 0$ instead gives $\Delta J(x^*, \alpha\,\delta\bar{x}) < 0$, so the sign of $\Delta J$ changes in an
arbitrarily small neighborhood of $x^*$, contradicting the assumption that $x^*$ is an extremal.
Calculus of Variation
➢ The simplest variational problem
For fixed $t_f$ and $x(t_f)$, find $x^*$ for which the following functional has
a relative minimum:
$J(x) = \int_{t_0}^{t_f} g(t, x(t), \dot{x}(t))\,dt$
Sol)
$\Delta J(x, \delta x) = J(x + \delta x) - J(x)$
$= \int_{t_0}^{t_f} g(t, x(t) + \delta x(t), \dot{x}(t) + \delta\dot{x}(t))\,dt - \int_{t_0}^{t_f} g(t, x(t), \dot{x}(t))\,dt$
$= \int_{t_0}^{t_f} \left[ \frac{\partial g}{\partial x}(t, x(t), \dot{x}(t))\,\delta x(t) + \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t))\,\delta\dot{x}(t) \right] dt + o(\delta x(t), \delta\dot{x}(t))$
Therefore,
$\delta J(x, \delta x) = \int_{t_0}^{t_f} \left[ \frac{\partial g}{\partial x}(t, x(t), \dot{x}(t))\,\delta x(t) + \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t))\,\delta\dot{x}(t) \right] dt$
Calculus of Variation
Applying integration by parts to the second term of the integrand,
$\int_{t_0}^{t_f} \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t))\,\delta\dot{x}(t)\,dt = \left[ \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t))\,\delta x(t) \right]_{t_0}^{t_f} - \int_{t_0}^{t_f} \frac{d}{dt}\!\left( \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t)) \right) \delta x(t)\,dt$
For fixed $t_f$ and fixed $x(t_f)$, $\delta t_f = \delta x(t_f) = 0$, so
$\delta J(x, \delta x) = \int_{t_0}^{t_f} \left[ \frac{\partial g}{\partial x}(t, x(t), \dot{x}(t)) - \frac{d}{dt}\!\left( \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t)) \right) \right] \delta x(t)\,dt$
Setting $\delta J(x^*, \delta x) = 0$ for all admissible $\delta x$ gives the Euler equation:
$\frac{\partial g}{\partial x}(t, x^*(t), \dot{x}^*(t)) - \frac{d}{dt}\!\left( \frac{\partial g}{\partial \dot{x}}(t, x^*(t), \dot{x}^*(t)) \right) = 0$
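As a quick symbolic check of the Euler equation, SymPy's `euler_equations` helper can form it directly. A minimal sketch, assuming the integrand $g = \dot{x}^2 + 2x\dot{x} + 4x^2$ (an assumption chosen here so that the result matches the equation $\ddot{x}^* = 4x^*$ and the natural boundary condition used in the example that follows; the original problem statement is not on the extracted slides):

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.Symbol('t')
x = sp.Function('x')

# Assumed integrand g(t, x, xdot) for illustration
g = sp.diff(x(t), t)**2 + 2*x(t)*sp.diff(x(t), t) + 4*x(t)**2

# Forms g_x - d/dt(g_xdot) = 0  ->  8*x(t) - 2*x''(t) = 0, i.e. xddot = 4x
print(euler_equations(g, x(t), t))
```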
Calculus of Variation
➢ Fixed final time, free final state
For the case where $t_f$ is fixed but $x(t_f)$ is free,
$\delta t_f = 0$ but $\delta x(t_f)$ is arbitrary.
[Figure: admissible curves starting from $x_0$ at $t_0$, each reaching a different free endpoint at $t_f$]
$J(x) = \int_{t_0}^{t_f} g(t, x(t), \dot{x}(t))\,dt$
$\delta J(x, \delta x) = \left. \frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t))\,\delta x(t) \right|_{t_0}^{t_f} + \int_{t_0}^{t_f} \left[ \frac{\partial g}{\partial x}(t, x(t), \dot{x}(t)) - \frac{d}{dt}\frac{\partial g}{\partial \dot{x}}(t, x(t), \dot{x}(t)) \right] \delta x(t)\,dt$
From $\delta J(x^*, \delta x) = 0$, the necessary conditions are the Euler equation together with
$\frac{\partial g}{\partial \dot{x}}(t_f, x^*(t_f), \dot{x}^*(t_f))\,\delta x(t_f) = 0 \;\Rightarrow\; \frac{\partial g}{\partial \dot{x}}(t_f, x^*(t_f), \dot{x}^*(t_f)) = 0$
an additional B.C. for determining the final state.
Ex) With $t_f = 2$ fixed, $x(2)$ free, and $x(0) = 1$, suppose the Euler equation reduces to
$-\ddot{x}^*(t) + 4x^*(t) = 0$
$\Rightarrow\; x^*(t) = c_1 e^{-2t} + c_2 e^{2t} \;\text{ and }\; \dot{x}^*(t) = -2c_1 e^{-2t} + 2c_2 e^{2t}$
From the terminal B.C.,
$\frac{\partial g}{\partial \dot{x}}(t_f, x^*(t_f), \dot{x}^*(t_f)) = 0 \;\Rightarrow\; \dot{x}^*(2) + x^*(2) = 0 \;\Rightarrow\; -c_1 e^{-4} + 3c_2 e^{4} = 0$
and, from the initial condition $x(0) = 1$, $c_1 + c_2 = 1$.
Therefore, $c_1 = \dfrac{3e^{4}}{e^{-4} + 3e^{4}}, \qquad c_2 = \dfrac{e^{-4}}{e^{-4} + 3e^{4}}$
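The constants are easy to check symbolically; a small SymPy sketch written for this note:

```python
import sympy as sp

t, c1, c2 = sp.symbols('t c1 c2')
x = c1*sp.exp(-2*t) + c2*sp.exp(2*t)          # general solution of xddot = 4x
xdot = sp.diff(x, t)

# Initial condition x(0) = 1 and natural B.C. xdot(2) + x(2) = 0
sol = sp.solve([x.subs(t, 0) - 1, (xdot + x).subs(t, 2)], [c1, c2])
c1_ref = 3*sp.exp(4)/(sp.exp(-4) + 3*sp.exp(4))
c2_ref = sp.exp(-4)/(sp.exp(-4) + 3*sp.exp(4))
print(sp.simplify(sol[c1] - c1_ref), sp.simplify(sol[c2] - c2_ref))  # -> 0 0
```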
Calculus of Variation
➢ Constrained extremum problems
Ex) Find the point on the line $x_1 + x_2 = 5$ that is nearest the origin,
i.e., minimize $f(x_1, x_2) = x_1^2 + x_2^2$.
A. Elimination method
The differential of $f$ at $x^*$ should be zero:
$df(x_1^*, x_2^*) = 0 = \frac{\partial f}{\partial x_1}(x_1^*, x_2^*)\,\delta x_1 + \frac{\partial f}{\partial x_2}(x_1^*, x_2^*)\,\delta x_2$
Eliminating $x_1$ in the cost using the constraint,
$f(x_2) = (5 - x_2)^2 + x_2^2$
Then, $df(x_2^*) = 0 = (4x_2^* - 10)\,\delta x_2$.
Therefore, $x_2^* = 2.5$ and $x_1^* = 2.5$.
Calculus of Variation
B. Lagrange multiplier method
Consider the augmented function
$f_a(x_1, x_2, p) = x_1^2 + x_2^2 + p(x_1 + x_2 - 5)$
Necessary condition:
$df_a(x_1^*, x_2^*, p) = 0 = \frac{\partial f_a}{\partial x_1}(x_1^*, x_2^*, p)\,\delta x_1 + \frac{\partial f_a}{\partial x_2}(x_1^*, x_2^*, p)\,\delta x_2 + \frac{\partial f_a}{\partial p}(x_1^*, x_2^*, p)\,\delta p$
$= (2x_1^* + p)\,\delta x_1 + (2x_2^* + p)\,\delta x_2 + (x_1^* + x_2^* - 5)\,\delta p$
By solving
$2x_1^* + p = 0, \quad 2x_2^* + p = 0, \quad x_1^* + x_2^* - 5 = 0$
we have
$x_1^* = 2.5, \quad x_2^* = 2.5, \;\text{ and }\; p = -5$
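The same stationary point falls out of a few lines of SymPy (a sketch written for this note, not part of the original slides):

```python
import sympy as sp

x1, x2, p = sp.symbols('x1 x2 p')
fa = x1**2 + x2**2 + p*(x1 + x2 - 5)            # augmented function
stationarity = [sp.diff(fa, v) for v in (x1, x2, p)]
print(sp.solve(stationarity, [x1, x2, p]))       # -> {p: -5, x1: 5/2, x2: 5/2}
```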
Calculus of Variation
➢ Lagrange multiplier method
▪ Augment the constraint function, multiplied by a variable called
the Lagrange multiplier, to the original performance cost
▪ This simply adds a zero to the cost
▪ For a point constraint, the Lagrange multiplier is a constant.
▪ For a time-varying constraint, the Lagrange multiplier is a function of time.
Minimize $J = \phi(t_f, x(t_f)) + \int_{t_0}^{t_f} L(t, x(t), u(t))\,dt$
subject to $\dot{x} = f(t, x(t), u(t))$
and $\psi(t_f, x(t_f)) = 0$
$\Rightarrow$ Minimize $J_a = \phi(t_f, x(t_f)) + \nu^T \psi(t_f, x(t_f)) + \int_{t_0}^{t_f} \left[ L(t, x(t), u(t)) + \lambda^T(t)\left( f(t, x(t), u(t)) - \dot{x}(t) \right) \right] dt$
E-L Necessary Conditions
➢ Standard optimal control problem with free final time
Find the control history $u(t)$ that minimizes the performance index
$J = \phi(t_f, x(t_f)) + \int_{t_0}^{t_f} L(t, x(t), u(t))\,dt$
subject to $\dot{x} = f(t, x(t), u(t))$ and the terminal constraint $\psi(t_f, x(t_f)) = 0$
E-L Necessary Conditions
➢ Modified Optimal Control Problem
Minimize $J_a = G(t_f, x(t_f), \nu) + \int_{t_0}^{t_f} \left[ H(t, x(t), u(t), \lambda(t)) - \lambda^T(t)\,\dot{x}(t) \right] dt$, where
$G(t_f, x(t_f), \nu) = \phi(t_f, x(t_f)) + \nu^T \psi(t_f, x(t_f))$ : endpoint function
$H(t, x(t), u(t), \lambda(t)) = L(t, x(t), u(t)) + \lambda^T(t)\,f(t, x(t), u(t))$ : Hamiltonian
E-L Necessary Conditions
Recall that, in the case of free final time/state (writing $g = L + \lambda^T(f - \dot{x}) = H - \lambda^T\dot{x}$ for the augmented integrand and suppressing arguments, all evaluated along the optimal solution),
$\delta J_a(u^*) = 0 = G_x\,\delta x_f + G_t\,\delta t_f + G_\nu\,\delta\nu$
$+\; g_{\dot{x}}\big|_{t=t_f}\,\delta x_f + \left[ -g_{\dot{x}}\,\dot{x}^* + g \right]\big|_{t=t_f}\,\delta t_f$
$+ \int_{t_0}^{t_f} \left[ g_x - \frac{d}{dt} g_{\dot{x}} \right] \delta x(t)\,dt + \int_{t_0}^{t_f} g_u\,\delta u(t)\,dt + \int_{t_0}^{t_f} g_\lambda\,\delta\lambda(t)\,dt$
Setting the coefficient of each independent variation to zero:
Coeff. of $\delta x_f$ : $G_x + g_{\dot{x}}\big|_{t=t_f} = 0 \;\Rightarrow\; \lambda^T(t_f) = G_x\big|_{t=t_f}$
Coeff. of $\delta t_f$ : $G_t + \left[ -g_{\dot{x}}\,\dot{x}^* + g \right]\big|_{t=t_f} = G_t + \left[ \lambda^T\dot{x}^* + H - \lambda^T\dot{x}^* \right]\big|_{t=t_f} = 0 \;\Rightarrow\; G_t + H\big|_{t=t_f} = 0$
Coeff. of $\delta\nu$ : $G_\nu(t_f, x^*(t_f), \nu^*) = 0 \;\Rightarrow\; \psi(t_f, x^*(t_f)) = 0$
Coeff. of $\delta x(t)$ : $g_x - \frac{d}{dt} g_{\dot{x}} = 0 \;\Rightarrow\; \dot{\lambda} = -H_x^T$
Coeff. of $\delta u$ : $g_u(t, x^*, \dot{x}^*, u^*, \lambda^*) = 0 \;\Rightarrow\; H_u = 0$
Coeff. of $\delta\lambda$ : $g_\lambda(t, x^*, \dot{x}^*, u^*, \lambda^*) = 0 \;\Rightarrow\; \dot{x} = f$
E-L Necessary Conditions
Summary
Optimal control problem
Min. $J = \phi(t_f, x(t_f)) + \int_{t_0}^{t_f} L(t, x, u)\,dt$
Necessary conditions
- Unknowns ($2n + p + 2$): $x(t_f) \sim n$, $\lambda(t_0) \sim n$, $\nu \sim p + 1$, $t_f \sim 1$
- Euler-Lagrange equations (TPBVP):
$\dot{x}^* = f(t, x^*, u^*), \quad x^*(t_0) = x_0$ : system equation ($n$)
$\dot{\lambda}^* = -H_x^T(t, x^*, u^*, \lambda^*), \quad \lambda^*(t_f) = G_x^T\big|_{t=t_f}$ : co-state equation ($n$)
- Terminal constraint ($p + 1$): $\psi(t_f, x^*(t_f)) = 0$
- Transversality condition ($1$): $H + G_t\big|_{t=t_f} = 0$ (not used if $t_f$ is fixed)
- Control equation: $H_u(t, x^*, u^*, \lambda^*) = 0$
E-L Necessary Conditions
Ex) Min. $J = t_f + k \int_{t_0}^{t_f} u^2\,dt$
subject to $\dot{x} = u$, $t_0 = 0$, $x_0 = 0$ and $x_f = 1$.
Sol) $\phi = t_f, \;\; \psi(t_f) = x(t_f) - 1 \;\Rightarrow\; G = \phi + \nu\psi = t_f + \nu\left( x(t_f) - 1 \right)$
$L = ku^2, \;\; f = u \;\Rightarrow\; H = L + \lambda f = ku^2 + \lambda u$
$\dot{\lambda} = -H_x^T = 0, \quad \lambda(t_f) = G_x\big|_{t=t_f} = \nu \;\Rightarrow\; \lambda(t) = \nu$
$H_u(t, x^*, u^*, \lambda^*) = 0 \;\Rightarrow\; 2ku + \lambda = 0 \;\Rightarrow\; u = -\frac{\lambda}{2k} = -\frac{\nu}{2k}$
$H + G_t\big|_{t=t_f} = 0 \;\Rightarrow\; ku^2(t_f) + \nu u(t_f) + 1 = 0 \;\Rightarrow\; \nu = \pm 2\sqrt{k}$
$\dot{x} = -\frac{\nu}{2k} \;\Rightarrow\; x(t) = -\left( \frac{\nu}{2k} \right) t$
$\psi(t_f, x^*(t_f)) = 0 \;\Rightarrow\; x(t_f) = 1 \;\Rightarrow\; -\left( \frac{\nu}{2k} \right) t_f = 1 \;\Rightarrow\; t_f = \sqrt{k} \;\; (\ge 0 \text{ when } \nu = -2\sqrt{k})$
$x(t) = \frac{t}{\sqrt{k}}, \qquad u(t) = \frac{1}{\sqrt{k}}$
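A quick numerical sanity check (a sketch written for this note, with an arbitrary $k$): for any fixed $t_f$, the control minimizing $\int u^2\,dt$ while steering $x$ from 0 to 1 under $\dot{x} = u$ is the constant $u = 1/t_f$, so the cost reduces to $J(t_f) = t_f + k/t_f$, whose minimizer should be $t_f = \sqrt{k}$:

```python
import numpy as np

k = 2.0                                # arbitrary control weight
tf = np.linspace(0.1, 5.0, 100_000)
J = tf + k/tf                          # cost under the constant control u = 1/tf
print(tf[np.argmin(J)], np.sqrt(k))    # both approximately 1.4142
```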
E-L Necessary Conditions
Ex) Navigation problem
Find the velocity direction $\theta(t)$ that minimizes $J = t_f$
subject to $\dot{x}_1 = V\cos\theta$
$\dot{x}_2 = V\sin\theta + w$
with the prescribed b.c.'s: $t_0 = 0$, $x_{10} = x_{20} = 0$ and $x_{1f} = 1$, $x_{2f} = 0$.
[Figure: velocity $V$ at heading angle $\theta$ in the $(x_1, x_2)$ plane with crosscurrent $w$]
Sol) $\phi = t_f, \qquad \psi(t_f) = \begin{bmatrix} x_1(t_f) - 1 \\ x_2(t_f) \end{bmatrix}$
$G = \phi + \nu^T\psi = t_f + \nu_1\left( x_1(t_f) - 1 \right) + \nu_2\,x_2(t_f)$
$H = L + \lambda^T f = \lambda_1(V\cos\theta) + \lambda_2(V\sin\theta + w)$
$\lambda(t_f) = G_x^T\big|_{t=t_f} \;\Rightarrow\; \lambda_1(t_f) = \nu_1, \;\; \lambda_2(t_f) = \nu_2$
$\dot{\lambda} = -H_x^T: \quad \dot{\lambda}_1 = -H_{x_1} = 0 \;\Rightarrow\; \lambda_1 = \nu_1, \qquad \dot{\lambda}_2 = -H_{x_2} = 0 \;\Rightarrow\; \lambda_2 = \nu_2$
E-L Necessary Conditions
$H_\theta(t, x^*, u^*, \lambda^*) = 0 = -\lambda_1 V\sin\theta + \lambda_2 V\cos\theta \;\Rightarrow\; \tan\theta = \frac{\lambda_2}{\lambda_1} = \frac{\nu_2}{\nu_1} = \text{const}$
$\dot{x}_1 = V\cos\theta \;\Rightarrow\; x_1 = (V\cos\theta)\,t; \quad x_{1f} = 1 = (V\cos\theta)\,t_f \;\Rightarrow\; t_f = \frac{1}{V\cos\theta}$
$\dot{x}_2 = V\sin\theta + w \;\Rightarrow\; x_2 = (V\sin\theta + w)\,t; \quad x_{2f} = 0 = (V\sin\theta + w)\,t_f \;\Rightarrow\; \sin\theta = -\frac{w}{V}, \;\; \cos\theta = \frac{\sqrt{V^2 - w^2}}{V}$
$\theta = -\sin^{-1}\frac{w}{V}, \qquad t_f = \frac{1}{\sqrt{V^2 - w^2}}$
$H + G_t\big|_{t=t_f} = 0 \;\Rightarrow\; \lambda_1\sqrt{V^2 - w^2} + 1 = 0 \;\Rightarrow\; \lambda_1 = -\frac{1}{\sqrt{V^2 - w^2}}$
$\lambda_2 = \lambda_1\tan\theta = \frac{w}{V^2 - w^2}$
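A numerical check of the closed-form answer (a sketch written for this note; the speed $V = 2$ and current $w = 1$ are assumed values):

```python
import numpy as np

V, w = 2.0, 1.0
theta = -np.arcsin(w/V)                  # optimal constant heading from the E-L conditions
tf = 1.0/(V*np.cos(theta))
print(tf, 1.0/np.sqrt(V**2 - w**2))      # both ~ 0.5774
print(V*np.cos(theta)*tf, (V*np.sin(theta) + w)*tf)  # final state: x1f -> 1.0, x2f -> 0.0
```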
E-L Necessary Conditions
Ex) Launch vehicle control
For a given $t_f$, find the thrust acceleration direction $\theta(t)$ that maximizes
$J = u(t_f)$
subject to $\dot{u} = a\cos\theta$
$\dot{v} = a\sin\theta - g(t)$
$\dot{x} = u$
$\dot{y} = v$
with the final conditions
$y(t_f) = h, \quad v(t_f) = 0$
[Figure: vehicle of mass $m$ with thrust acceleration $a$ at angle $\theta$, velocity components $u$ (along $x$) and $v$ (along $y$), under gravity $g$]
E-L Necessary Conditions
Sol) $G = \phi + \nu^T\psi = u + \nu_y\left( y - h \right) + \nu_v\,v$
$H = L + \lambda^T f = \lambda_u\,a\cos\theta + \lambda_v\left( a\sin\theta - g \right) + \lambda_x u + \lambda_y v$
$\dot{\lambda} = -H_x^T: \quad \dot{\lambda}_u = -\lambda_x, \;\; \dot{\lambda}_v = -\lambda_y, \;\; \dot{\lambda}_x = 0, \;\; \dot{\lambda}_y = 0$
$\Rightarrow\; \lambda_u = -c_1 t + c_3, \;\; \lambda_v = -c_2 t + c_4, \;\; \lambda_x = c_1, \;\; \lambda_y = c_2$
$\lambda(t_f) = G_x^T\big|_{t=t_f} \;\Rightarrow\; \lambda_u(t_f) = 1, \;\; \lambda_v(t_f) = \nu_v, \;\; \lambda_x(t_f) = 0, \;\; \lambda_y(t_f) = \nu_y$
$\Rightarrow\; c_1 = 0, \;\; c_3 = 1, \;\; c_2 = \nu_y, \;\; c_4 = \nu_v + \nu_y t_f$
$H_\theta = -\lambda_u\,a\sin\theta + \lambda_v\,a\cos\theta = 0 \;\Rightarrow\; \tan\theta = \frac{\lambda_v}{\lambda_u} = \tan\theta_0 - ct$ : linear tangent law
where $\tan\theta_0 = \nu_v + \nu_y t_f$ and $c = \nu_y$.
- $\tan\theta_0$ and $c$ (or $\nu_v$ and $\nu_y$) are determined by the two final b.c.'s $y(t_f) = h$, $v(t_f) = 0$; see the numerical sketch below.
- The linear tangent law has been the fundamental law for the attitude control of
most launch vehicles.
Ex) IGM (Iterative Guidance Mode) for the Saturn rockets,
PEG (Powered Explicit Guidance) for the Space Shuttle
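A minimal sketch of how the two steering parameters can be found numerically, shooting on the linear tangent law and matching the two final b.c.'s (all numbers, $a$, $g$, $t_f$, $h$, are assumptions made for this note, in arbitrary units):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

a, g, tf, h = 2.0, 1.0, 10.0, 20.0     # assumed thrust accel, gravity, burn time, target altitude

def residuals(params):
    tan0, c = params
    def rhs(t, s):                     # s = [u, v, y]
        th = np.arctan(tan0 - c*t)     # linear tangent steering law
        return [a*np.cos(th), a*np.sin(th) - g, s[1]]
    sf = solve_ivp(rhs, (0.0, tf), [0.0, 0.0, 0.0], rtol=1e-9, atol=1e-9).y[:, -1]
    return [sf[1], sf[2] - h]          # enforce v(tf) = 0 and y(tf) = h

tan0, c = fsolve(residuals, [3.0, 0.5])
print(tan0, c)                         # tan(theta0) and c of the linear tangent law
```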
E-L Necessary Conditions
[Figure: comparison of steering profiles: optimal turn, linear tangent law, and gravity turn]
Principle of Optimality
Theorem) If a-b-c-e is the optimal path from a to e, then b-c-e is
the optimal path from b to e.
Proof by contradiction)
Suppose that b-d-e is the optimal path from b to e; then
$J_{bde} < J_{bce}$
and
$J_{ab} + J_{bde} < J_{ab} + J_{bce} = J_{ae}^*$
This can be satisfied only by violating the condition that a-b-c-e is
the optimal path from a to e.
[Figure: paths from a to e through b, continuing either via c (cost $J_{bce}$) or via d (cost $J_{bde}$), with segment cost $J_{ab}$]
H-J-B Equation
Consider a value ftn. defined by
$V(t, x(t), u(t)) = \phi(t_f, x(t_f)) + \int_t^{t_f} L(\tau, x(\tau), u(\tau))\,d\tau$
$V^*(t, x(t)) = \min_u \left[ \phi(t_f, x(t_f)) + \int_t^{t_f} L(\tau, x(\tau), u(\tau))\,d\tau \right]$
$= \min_u \left[ \phi(t_f, x(t_f)) + \int_t^{t+\Delta t} L(\tau, x(\tau), u(\tau))\,d\tau + \int_{t+\Delta t}^{t_f} L(\tau, x(\tau), u(\tau))\,d\tau \right]$
By the principle of optimality,
$V^*(t, x(t)) = \min_u \left[ \int_t^{t+\Delta t} L(\tau, x(\tau), u(\tau))\,d\tau + V^*(t + \Delta t, x(t + \Delta t)) \right]$
Expanding $V^*(t + \Delta t, x(t + \Delta t))$ to first order,
$V^*(t, x(t)) = \min_u \left[ \int_t^{t+\Delta t} L\,d\tau + V^*(t, x(t)) + V_t^*(t, x(t))\,\Delta t + V_x^{*T}(t, x(t))\left( x(t + \Delta t) - x(t) \right) \right]$
$0 = V_t^*(t, x(t))\,\Delta t + \min_u \left[ L(t, x(t), u(t))\,\Delta t + V_x^{*T}(t, x(t))\,f(t, x(t), u(t))\,\Delta t \right]$
By dividing by $\Delta t$, we have a partial differential eqn.,
$0 = V_t^*(t, x(t)) + \min_u \left[ L(t, x(t), u(t)) + V_x^{*T}(t, x(t))\,f(t, x(t), u(t)) \right]$
with the b.c. obtained by setting $t = t_f$ in the value ftn.: $V^*(t_f, x(t_f)) = \phi(t_f, x(t_f))$.
Define the Hamiltonian
$H(t, x, u, V_x^*) = L(t, x(t), u(t)) + V_x^{*T}(t, x(t))\,f(t, x(t), u(t))$
then we have the Hamilton-Jacobi-Bellman eqn.
$0 = V_t^*(t, x(t)) + \min_u H\left( t, x(t), u(t), V_x^*(t, x(t)) \right)$
or
$V_t^*(t, x(t)) = -H\left( t, x(t), u^*(t), V_x^*(t, x(t)) \right)$
H-J-B Equation
H-J-B equation
- Based on the principle of optimality
- A partial differential equation; solving it typically involves guessing the functional form of $V$
- Provides the rule for defining optimal controls of continuous systems
- Dynamic programming is required for obtaining the solution numerically; a minimal sketch follows below
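A minimal backward dynamic-programming sketch on a crude grid (written for this note; the scalar problem $\min \int_0^1 (x^2 + u^2)\,dt$ with $\dot{x} = u$ and zero terminal cost is an assumed example, whose analytic HJB solution is $V^*(t, x) = \tanh(t_f - t)\,x^2$):

```python
import numpy as np

dt, T = 0.05, 1.0
xs = np.linspace(-2.0, 2.0, 81)                # state grid
us = np.linspace(-3.0, 3.0, 61)                # control grid
V = np.zeros_like(xs)                          # terminal cost phi = 0
for _ in range(int(T/dt)):                     # backward Bellman recursion
    xn = xs[:, None] + us[None, :]*dt          # next state for every (x, u) pair
    stage = (xs[:, None]**2 + us[None, :]**2)*dt
    V = (stage + np.interp(xn, xs, V)).min(axis=1)
i = np.argmin(np.abs(xs - 1.0))
print(V[i])                                    # approx tanh(1) = 0.7616 at x = 1
```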
H-J-B Equation
$V(t) = \phi(t_f, x(t_f)) + \int_t^{t_f} L(\tau, x(\tau), u(\tau))\,d\tau \quad\Rightarrow\quad V(t) = J_{\max} - J(t) \;\text{ for } t \le t_f$
where $J(t)$ is the cost accumulated up to time $t$, so that
$V_{\min} = \phi(t_f, x(t_f)) \;\text{ at } t = t_f, \qquad J_{\max} = \phi(t_f, x(t_f)) + J(t_f) \;\text{ at } t = t_f$
[Figure: two plots over $[t_0, t_f]$: the accumulated cost $J(t)$ rising to $J_{\max}$, and the cost-to-go $V(t)$ falling from $V_{\max}$ to $\phi(t_f, x(t_f))$]
H-J-B Equation
Is the $H$ in $J$ the same as the $H$ in $V$?
$J: \quad H = L + \lambda^T f$
$J_x(t) = \frac{\partial}{\partial x}\int_{t_0}^{t} L\,dt = \frac{\partial}{\partial x}\int_{t_0}^{t} \left[ L + \lambda^T(f - \dot{x}) \right] dt = \frac{\partial}{\partial x}\int_{t_0}^{t} \left[ H - \lambda^T\dot{x} \right] dt$
$= \int_{t_0}^{t} \frac{\partial H}{\partial x}\,dt = -\int_{t_0}^{t} \dot{\lambda}^T\,dt = -\left[ \lambda^T(t) - \lambda^T(t_0) \right]$
$J_x(t) = -V_x(t)$ since $V = J_{\max} - J$
$V: \quad H = L + V_x^T f = L + \lambda^T f$
i.e., the costate $\lambda(t)$ plays the role of the value-function gradient $V_x^T(t)$, so the two Hamiltonians are the same.
H-J-B Equation
Ex) Find $u$ to minimize $J = \frac{1}{4}x^2(t_f) + \int_0^{t_f} \frac{1}{4}u^2(t)\,dt$ sub. to $\dot{x}(t) = x(t) + u(t)$, where $t_f$ is fixed.
Sol) $H = \frac{1}{4}u^2 + \lambda(x + u)$; $\;\; H_u = \frac{1}{2}u + \lambda = 0 \;\Rightarrow\; u = -2\lambda$; $\;\; \dot{\lambda} = -H_x = -\lambda$
$\dot{x} = x - 2\lambda \;\Rightarrow\; x(t) = c_h e^{t} + c_\lambda e^{-t}, \qquad \lambda(t) = c_\lambda e^{-t}$
B.C.: $\lambda(t_f) = \phi_x\big|_{t=t_f} = \frac{1}{2}x(t_f) \;\Rightarrow\; c_\lambda e^{-t_f} = \frac{1}{2}\left( c_h e^{t_f} + c_\lambda e^{-t_f} \right) \;\Rightarrow\; c_h e^{t_f} - c_\lambda e^{-t_f} = 0$
$x(0) = x_0 \;\Rightarrow\; c_h + c_\lambda = x_0$
$c_\lambda = \frac{e^{t_f}}{e^{t_f} + e^{-t_f}}\,x_0, \qquad c_h = \frac{e^{-t_f}}{e^{t_f} + e^{-t_f}}\,x_0$
$x(t) = c_h e^{t} + c_\lambda e^{-t} = \frac{e^{-(t_f - t)} + e^{t_f - t}}{e^{t_f} + e^{-t_f}}\,x_0$
$\lambda(t) = c_\lambda e^{-t} = \frac{e^{t_f - t}}{e^{t_f} + e^{-t_f}}\,x_0 = \frac{e^{t_f - t}}{e^{-(t_f - t)} + e^{t_f - t}}\,x(t) \;\Rightarrow\; \lambda(t) = K(t)\,x(t)$
$u^*(t) = -2\lambda(t) = -2K(t)\,x(t)$ : the same optimal control as that of the H-J-B approach
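The E-L side of this example is a linear TPBVP, so `scipy.integrate.solve_bvp` can verify that the computed costate indeed satisfies $\lambda(t) = K(t)\,x(t)$ (a sketch written for this note; $t_f = 1$ and $x_0 = 1$ are assumed values):

```python
import numpy as np
from scipy.integrate import solve_bvp

tf, x0 = 1.0, 1.0                       # assumed horizon and initial state

def rhs(t, s):                          # s = [x, lam]; u* = -2*lam from H_u = 0
    x, lam = s
    return np.vstack([x - 2.0*lam, -lam])

def bc(s0, sf):                         # x(0) = x0 and lam(tf) = x(tf)/2
    return np.array([s0[0] - x0, sf[1] - 0.5*sf[0]])

t = np.linspace(0.0, tf, 50)
sol = solve_bvp(rhs, bc, t, np.vstack([np.ones_like(t), 0.2*np.ones_like(t)]))

K = np.exp(tf - t)/(np.exp(-(tf - t)) + np.exp(tf - t))
print(np.max(np.abs(sol.sol(t)[1] - K*sol.sol(t)[0])))   # ~ 0, i.e. lam = K x
```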