06 - Optimal Control Theory

Optimal Control Theory

- Optimal Control Problem


- Calculus of Variations
- Euler-Lagrange Necessary Conditions
- Hamilton-Jacobi-Bellman Equations
Optimal Control Problem
➢ Objective
To find a history of the control vector that makes the control
system satisfy the physical constraints and, at the same time,
minimizes (or maximizes) some performance criterion (cost
function)

➢ Steps of Problem Formulation
▪ Modeling of the control system
- Set up the differential eqns. of the physical system
- Specify admissible controls
▪ Specification of a performance criterion
- Lagrange, Mayer, and Bolza forms
▪ Specification of the physical constraints to be satisfied
- Path and terminal constraints
- Control limits

2
Optimal Control Problem
➢ Control System Modeling
▪ State vector
- to describe the internal behavior of the system
- x = [x₁ x₂ ⋯ xₙ]ᵀ
▪ Control vector
- to determine the system behavior
- u = [u₁ u₂ ⋯ uₘ]ᵀ
▪ Ordinary differential equations
- the simplest model that accurately predicts the system
- ẋ(t) = f(t, x(t), u(t)) with specified initial condition x(t₀) = x₀

3
Optimal Control Problem
➢ Admissible controls
A piecewise continuous control u(·), defined on some time
interval t₀ ≤ t ≤ t_f with range in the control region U,
u(t) ∈ U,  ∀t ∈ [t₀, t_f],
is said to be an admissible control.

(Figure: a state history x(t) and a piecewise-continuous control u(t) on [t₀, t_f])

4
Optimal Control Problem
➢ Performance criterion
▪ A performance criterion (or a cost function) measures the
penalty that must be paid as a consequence of the dynamic
system’s trajectory.

▪ A functional used for quantitative evaluation of a system's
performance depends on
- the control and state variables
- the initial and/or terminal times (if not fixed)

5
Optimal Control Problem
➢ Types of Functional Forms
▪ Lagrange form:  J(u) = ∫_{t₀}^{t_f} L(t, x(t), u(t)) dt
▪ Mayer form:  J(u) = φ(t_f, x(t_f))
▪ Bolza form:  J(u) = φ(t_f, x(t_f)) + ∫_{t₀}^{t_f} L(t, x(t), u(t)) dt  : widely used

▪ Each type of functional form can be transformed into either of the
other two types.

Ex) Minimum-time problem:  J(u) = t_f − t₀ = ∫_{t₀}^{t_f} 1 dt

Ex) J(u) = ∫_{t₀}^{t_f} L(t, x(t), u(t)) dt = ∫_{t₀}^{t_f} (dl/dt)(t, x(t), u(t)) dt
         = l(t_f, x(t_f), u(t_f)) − l(t₀, x(t₀), u(t₀))

   where, with u = g(x),  (dl/dt)(t, x(t)) = L(t, x(t))  ⇔  (∂l/∂x)·ẋ + ∂l/∂t = L
6
Optimal Control Problem
➢ Typical performance criteria
▪ Minimum time (open-end-time problem)
  J(u) = t_f − t₀ = ∫_{t₀}^{t_f} 1 dt

▪ Minimum fuel
  J(u) = ∫_{t₀}^{t_f} ṁ dt = m(t_f) − m(t₀)
  ⇒ J(u) = ∫_{t₀}^{t_f} f_i(t, x(t), u(t)) dt = x_i(t_f) − x_i(t₀) ≐ φ(x(t_f))

▪ Minimum time and fuel
  J(u) = ∫_{t₀}^{t_f} (k_t + k_fᵀ f_i) dt,  k_t and k_f are weightings.

▪ Linear Quadratic (LQ) cost
  φ = [x(t_f) − x_D]ᵀ Q [x(t_f) − x_D]
  L(t, x(t), u(t)) = xᵀ(t)Q x(t) + xᵀ(t)C u(t) + uᵀ(t)R u(t)
7
Optimal Control Problem
➢ Physical constraints
▪ Functional equalities/inequalities restricting the range of
values that can be assumed by control and/or state variables

➢ Types of constraints
▪ Point constraints
  ψ_i(t, x(t)) ≤ 0,  ψ_e(t, x(t)) = 0,  t ∈ [t₀, t_f]
▪ Isoperimetric (integral) constraints
- Possible to be reformulated as a point constraint
  ∫_{t₀}^{t_f} ψ_i(t, x(t), u(t)) dt ≤ 0,  ∫_{t₀}^{t_f} ψ_e(t, x(t), u(t)) dt = 0
▪ Path constraints
  ψ_i(t, x(t), u(t)) ≤ 0,  ψ_e(t, x(t), u(t)) = 0,  t ∈ [t₀, t_f]
Ex) Input constraints: u_min ≤ u(t) ≤ u_max;  Path constraints: x_min ≤ x_k(t) ≤ x_max

8
Optimal Control Problem
➢ Generalized optimal control problem
Find the control history u(t) that minimizes the performance index
J = φ(t_f, x(t_f)) + ∫_{t₀}^{t_f} L(t, x, u) dt
subject to the differential constraints
ẋ = f(t, x, u),  x(t₀) = x₀
and the constraints
ψ_iʲ(t_f, x(t_f)) ≤ 0,  j = 1, …, n_i
ψ_eʲ(t_f, x(t_f)) = 0,  j = 1, …, n_e
ψ_iʲ(t, x(t), u(t)) ≤ 0,  j = 1, …, n_i
ψ_eʲ(t, x(t), u(t)) = 0,  j = 1, …, n_e
u(t) ∈ [u_min, u_max]

▪ Constraints can easily be converted into the cost function.
▪ Inequality constraints generally preclude an analytic solution.
9
Optimal Control Problem
➢ Open-loop optimal control
▪ Depends on particular initial conditions
▪ Typically given by numerical optimization results
▪ Path planning
▪ u*(t) = γ(t; t₀, x₀),  t ∈ [t₀, t_f]

➢ Closed-loop optimal control
▪ Feedback control
▪ Available when an analytic solution is obtained
▪ LQR, optimal guidance laws
▪ u*(t) = γ(t, x(t))

10
Functional
➢ Function
▪ A function f is a rule of correspondence that assigns each
element q in a certain set D to a unique element in a set R.
D is called the domain of f and R is the range.
▪ Ex) Suppose q₁, q₂, …, qₙ are the coordinates of a point in n-dimensional
Euclidean space. Then f(q) = q₁² + q₂² + ⋯ + qₙ² is a function.

➢ Functional
▪ A functional J is a rule of correspondence that assigns each
function x in a certain class Ω to a unique real number.
Ω is called the domain of the functional, and the set of
corresponding real numbers is the range R.
▪ A functional is a "function of a function."
▪ Ex) Suppose that x is a continuous function of t defined on the
interval [t₀, t_f]. Then J(x) = ∫_{t₀}^{t_f} x(t) dt is a functional.

11
Variation of a function
➢ The function δx(t), which is infinitesimal, is called a virtual
displacement function and may be considered as the variation of
the function x(t) at a fixed value of t. The varied function x̃(t)
is defined as
x̃(t) = x(t) + δx(t)
➢ The operator δ means a change in the function at a fixed value
of the independent variable, while the operator d means a
change in the function w.r.t. a change in the independent variable.

(Figure: the variation δx(t) at fixed t versus the differential dx(t) over dt, for the curves x and x̃ = x + δx)
12
Variation of a function
➢ Some properties of the operator δ

δ(dx/dt) = d(δx)/dt

δ ∫_{t₀}^{t_f} x(t) dt = ∫_{t₀}^{t_f} [δx(t)] dt

δF = (∂F/∂x)(t, x(t), ẋ(t))·δx + (∂F/∂ẋ)(t, x(t), ẋ(t))·δẋ

13
Calculus of Variations
➢ Increment of a functional J
If x and x + δx are functions for which the functional J is defined, then the
increment of J, denoted by ΔJ, is
ΔJ(x, δx) ≐ J(x + δx) − J(x),
where δx is called the variation of the function x.

➢ Variation of J
The increment of a functional can be written as
ΔJ(x, δx) = δJ(x, δx) + g(x, δx)·‖δx‖,
where δJ is linear in δx. If
lim_{‖δx‖→0} g(x, δx) = 0,
then J is said to be differentiable and δJ is the variation of J evaluated
for the function x.

14
Calculus of Variations
➢ Fundamental theorem
If x* is an extremal, the variation of J must vanish on x*;
that is, δJ(x*, δx) = 0 for all admissible δx.

Proof by contradiction)
Assume that x* is an extremal and δJ(x*, δx) ≠ 0.

We will see that these assumptions imply that the sign of ΔJ can be changed
in an arbitrarily small neighborhood of x*.

Recall that ΔJ(x*, δx) = J(x* + δx) − J(x*)
= δJ(x*, δx) + g(x*, δx)·‖δx‖,
where g(x*, δx) → 0 as ‖δx‖ → 0; thus, there is a neighborhood, ‖δx‖ < ε, where
g(x*, δx)·‖δx‖ is small enough that δJ dominates ΔJ.
15
Calculus of Variations
Now let us select the variation of x:
δx = α·δy, where α > 0 and ‖δx‖ < ε. Suppose that δJ(x*, δy) > 0.
Since δJ is a linear functional of δx, from homogeneity
δJ(x*, α·δy) = α·δJ(x*, δy) > 0. Since δJ dominates ΔJ, ΔJ(x*, α·δy) > 0.

Now consider the variation δx = −α·δy. Again, we have
δJ(x*, −α·δy) = −α·δJ(x*, δy) < 0, and ΔJ(x*, −α·δy) < 0.

In summary, we have shown that if δJ(x*, δx) ≠ 0, then
in an arbitrarily small neighborhood of x* the sign of ΔJ changes.
This contradicts the assumption that x* is an extremal.

16
Calculus of Variations
➢ The simplest variational problem
For fixed t_f and x(t_f), find x* for which the following functional has
a relative minimum:
J(x) = ∫_{t₀}^{t_f} g(t, x(t), ẋ(t)) dt

Sol)
ΔJ(x, δx) = J(x + δx) − J(x)
= ∫_{t₀}^{t_f} g(t, x(t) + δx(t), ẋ(t) + δẋ(t)) dt − ∫_{t₀}^{t_f} g(t, x(t), ẋ(t)) dt

= ∫_{t₀}^{t_f} [ (∂g/∂x)(t, x(t), ẋ(t))·δx(t) + (∂g/∂ẋ)(t, x(t), ẋ(t))·δẋ(t)
              + o(δx(t), δẋ(t)) ] dt
Therefore,
δJ(x, δx) = ∫_{t₀}^{t_f} [ (∂g/∂x)(t, x(t), ẋ(t))·δx(t) + (∂g/∂ẋ)(t, x(t), ẋ(t))·δẋ(t) ] dt
17
Calculus of Variations
Applying integration by parts to the second integrand,
∫_{t₀}^{t_f} (∂g/∂ẋ)(t, x(t), ẋ(t))·δẋ(t) dt
= [ (∂g/∂ẋ)(t, x(t), ẋ(t))·δx(t) ]_{t₀}^{t_f} − ∫_{t₀}^{t_f} d/dt[ (∂g/∂ẋ)(t, x(t), ẋ(t)) ]·δx(t) dt

For fixed t_f and fixed x(t_f), δt_f = δx(t_f) = 0, so
δJ(x, δx) = ∫_{t₀}^{t_f} [ (∂g/∂x)(t, x(t), ẋ(t)) − d/dt( (∂g/∂ẋ)(t, x(t), ẋ(t)) ) ]·δx(t) dt

From the fundamental theorem, the necessary condition for x* to be an extremal is
δJ(x*, δx) = 0 ⇒ (∂g/∂x)(t, x*(t), ẋ*(t)) − d/dt[ (∂g/∂ẋ)(t, x*(t), ẋ*(t)) ] = 0, ∀t ∈ [t₀, t_f]

: Euler equation

18
Calculus of Variations

➢ We need to solve the Euler equation to obtain x*

➢ General properties of the Euler equation
▪ Ordinary differential eqns.
▪ Nonlinear and time-varying
▪ Difficult to solve either analytically or numerically
- A TPBVP (Two-Point Boundary Value Problem) arises
when x(t₀), x(t_f) are given instead of x(t₀), ẋ(t₀).

19
Calculus of Variations
➢ Fixed final time, free final state x(t_f)
For the case that t_f is fixed but x(t_f) is free,
δt_f = 0 but δx(t_f) is arbitrary.

(Figure: admissible curves leaving x₀ at t₀ with a free endpoint at t_f)

J(x) = ∫_{t₀}^{t_f} g(t, x(t), ẋ(t)) dt

δJ(x, δx) = [ (∂g/∂ẋ)(t, x(t), ẋ(t))·δx(t) ]_{t₀}^{t_f}
+ ∫_{t₀}^{t_f} [ (∂g/∂x)(t, x(t), ẋ(t)) − d/dt( (∂g/∂ẋ)(t, x(t), ẋ(t)) ) ]·δx(t) dt

From δJ(x*, δx) = 0, the necessary conditions are
(∂g/∂ẋ)(t_f, x*(t_f), ẋ*(t_f))·δx(t_f) = 0 ⇒ (∂g/∂ẋ)(t_f, x*(t_f), ẋ*(t_f)) = 0
: additional B.C. for determining the final state
and the Euler equation
(∂g/∂x)(t, x*(t), ẋ*(t)) − d/dt[ (∂g/∂ẋ)(t, x*(t), ẋ*(t)) ] = 0,  ∀t ∈ [t₀, t_f]
20
Calculus of Variations
➢ Ex) Find x*(t) to minimize J(x) = ∫₀² [ẋ²(t) + 2x(t)ẋ(t) + 4x²(t)] dt,
where x(0) = 1, and x(2) is free.

Sol) From the Euler equation,
(∂g/∂x)(t, x*(t), ẋ*(t)) − d/dt[ (∂g/∂ẋ)(t, x*(t), ẋ*(t)) ]
= [2ẋ*(t) + 8x*(t)] − d/dt[2ẋ*(t) + 2x*(t)]
= −2ẍ*(t) + 8x*(t) = 0  ⇒  ẍ*(t) = 4x*(t)
⇒ x*(t) = c₁e^{−2t} + c₂e^{2t} and ẋ*(t) = −2c₁e^{−2t} + 2c₂e^{2t}
From the terminal B.C.,
(∂g/∂ẋ)(t_f, x*(t_f), ẋ*(t_f)) = 0 = (∂g/∂ẋ)(2, x*(2), ẋ*(2)) ⇒ ẋ*(2) + x*(2) = 0
⇒ −c₁e^{−4} + 3c₂e⁴ = 0
And, from the initial condition x(0) = 1, c₁ + c₂ = 1.
Therefore, c₁ = 3e⁴/(e^{−4} + 3e⁴),  c₂ = e^{−4}/(e^{−4} + 3e⁴)

21
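The closed-form answer on this slide can be sanity-checked numerically. Below is a minimal sketch (my illustration, not part of the slides) that verifies the initial condition, the natural boundary condition ∂g/∂ẋ = 0 at t = 2, and the Euler residual ẍ = 4x by finite differences:

```python
import math

# Constants from the slide: c1 = 3e^4/D, c2 = e^-4/D, D = e^-4 + 3e^4
D = math.exp(-4) + 3 * math.exp(4)
c1, c2 = 3 * math.exp(4) / D, math.exp(-4) / D

def x(t):        # candidate extremal x*(t) = c1 e^(-2t) + c2 e^(2t)
    return c1 * math.exp(-2 * t) + c2 * math.exp(2 * t)

def xdot(t):
    return -2 * c1 * math.exp(-2 * t) + 2 * c2 * math.exp(2 * t)

assert abs(x(0.0) - 1.0) < 1e-12                 # initial condition x(0) = 1
assert abs(2 * xdot(2.0) + 2 * x(2.0)) < 1e-9    # natural B.C.: g_xdot = 2xdot + 2x = 0 at t = 2
h = 1e-4
for t in (0.5, 1.0, 1.5):                        # Euler residual: xddot - 4x = 0
    xdd = (x(t + h) - 2 * x(t) + x(t - h)) / h ** 2
    assert abs(xdd - 4 * x(t)) < 1e-4
print("boundary conditions and Euler equation verified")
```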
Calculus of Variations
➢ Constrained extremum problems
Ex) Find the point on the line x₁ + x₂ = 5 that is nearest the origin.

Sol) The problem can be restated as:
Find x₁*, x₂* which minimize f(x₁, x₂) = x₁² + x₂² subject to x₁ + x₂ = 5.

A. Elimination method
The differential of f at x* should be zero:
df(x₁*, x₂*) = 0 = (∂f/∂x₁)(x₁*, x₂*)·dx₁ + (∂f/∂x₂)(x₁*, x₂*)·dx₂
Eliminating x₁ from the cost using the constraint,
f(x₂) = (5 − x₂)² + x₂².
Then, df(x₂*) = 0 = (4x₂* − 10)·dx₂.
Therefore, x₂* = 2.5 and x₁* = 2.5.
22
Calculus of Variations
B. Lagrange multiplier method
Consider the augmented function
f_a(x₁, x₂, p) = x₁² + x₂² + p(x₁ + x₂ − 5)
Necessary condition
df_a(x₁*, x₂*, p) = 0 = (∂f_a/∂x₁)(x₁*, x₂*, p)·dx₁ + (∂f_a/∂x₂)(x₁*, x₂*, p)·dx₂
                      + (∂f_a/∂p)(x₁*, x₂*, p)·dp
= (2x₁* + p)·dx₁ + (2x₂* + p)·dx₂ + (x₁* + x₂* − 5)·dp
By solving
2x₁* + p = 0,  2x₂* + p = 0,  x₁* + x₂* − 5 = 0,
we have
x₁* = 2.5, x₂* = 2.5, and p = −5

23
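The three stationarity equations form a small linear system that can be solved by hand; a minimal Python check (my illustration, not part of the slides) confirms the solution and that nearby feasible points on the line cost more:

```python
# Stationarity of f_a = x1^2 + x2^2 + p(x1 + x2 - 5):
#   2*x1 + p = 0,  2*x2 + p = 0,  x1 + x2 - 5 = 0
# The first two give x1 = x2 = -p/2; the constraint then gives -p = 5.
p = -5.0
x1 = x2 = -p / 2.0

assert (x1, x2, p) == (2.5, 2.5, -5.0)
# (2.5, 2.5) beats nearby points that stay on the line x1 + x2 = 5
f_star = x1 ** 2 + x2 ** 2
for eps in (-0.1, 0.1):
    assert f_star < (x1 + eps) ** 2 + (x2 - eps) ** 2
print("minimum at (2.5, 2.5) with multiplier p = -5")
```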
Calculus of Variations
➢ Lagrange multiplier method
▪ Augment the constraint function, multiplied by a variable called
the Lagrange multiplier, to the original performance cost
▪ Simply adds a zero to the cost
▪ For a point constraint, the Lagrange multiplier is a constant.
▪ For a time-varying constraint, the Lagrange multiplier is a variable.

Minimize J = φ(t_f, x(t_f)) + ∫_{t₀}^{t_f} L(t, x(t), u(t)) dt
subject to ẋ = f(t, x(t), u(t))
and ψ(t_f, x(t_f)) = 0

⇓

Minimize J = φ(t_f, x(t_f)) + νᵀψ(t_f, x(t_f))
           + ∫_{t₀}^{t_f} [ L(t, x(t), u(t)) + λᵀ(t)·( f(t, x(t), u(t)) − ẋ ) ] dt

24
E-L Necessary Conditions
➢ Standard optimal control problem with free final time
Find the control history u(t) that minimizes the performance index
J = φ(t_f, x(t_f)) + ∫_{t₀}^{t_f} L(t, x(t), u(t)) dt    ← φ: soft terminal constr.
subject to the differential constraints
ẋ = f(t, x(t), u(t)),  x(t₀) = x₀
and the final conditions
ψ(t_f, x(t_f)) = 0.    ← hard terminal constr.
Here, x: n × 1 state vector
u: m × 1 control vector
φ and L: scalars
f: n × 1 vector
ψ: (p + 1) × 1 vector, where p ≤ n

25
E-L Necessary Conditions
➢ Modified Optimal Control Problem

Find the control history u(t) that minimizes the performance index

J_a(u) = G(t_f, x(t_f), ν) + ∫_{t₀}^{t_f} [ H(t, x(t), u(t), λ(t)) − λᵀ(t)·ẋ(t) ] dt

where
G(t_f, x(t_f), ν) = φ(t_f, x(t_f)) + νᵀψ(t_f, x(t_f)) : endpoint function
H(t, x(t), u(t), λ(t)) = L(t, x(t), u(t)) + λᵀ(t)·f(t, x(t), u(t)) : Hamiltonian

Let g(t, x(t), ẋ(t), u(t), λ(t)) ≐ H(t, x(t), u(t), λ(t)) − λᵀ(t)·ẋ(t); then

J_a(u) = G(t_f, x(t_f), ν) + ∫_{t₀}^{t_f} g(t, x(t), ẋ(t), u(t), λ(t)) dt

26
E-L Necessary Conditions
Recall, for the case of free final time/state:
δJ_a(u*) = 0 = (∂G/∂x)(t_f, x*(t_f), ν)·δx_f + (∂G/∂t)(t_f, x*(t_f), ν)·δt_f
             + (∂G/∂ν)(t_f, x*(t_f), ν)·δν
+ (∂g/∂ẋ)(t_f, x*(t_f), ẋ*(t_f), u*(t_f), λ*(t_f))·δx_f
+ [ −(∂g/∂ẋ)(t_f, x*(t_f), ẋ*(t_f), u*(t_f), λ*(t_f))·ẋ*(t_f)
    + g(t_f, x*(t_f), ẋ*(t_f), λ*(t_f)) ]·δt_f
+ ∫_{t₀}^{t_f} { [ (∂g/∂x)(t, x*(t), ẋ*(t), u*(t), λ*(t))
                 − d/dt( (∂g/∂ẋ)(t, x*(t), ẋ*(t), u*(t), λ*(t)) ) ]·δx(t)
+ (∂g/∂u)(t, x*(t), ẋ*(t), u*(t), λ*(t))·δu(t)
+ (∂g/∂λ)(t, x*(t), ẋ*(t), u*(t), λ*(t))·δλ(t) } dt

⇒ δJ_a(u*) = 0 = [G_x − λᵀ]_{t=t_f}·δx_f + [G_t + H]_{t=t_f}·δt_f + ψ(t_f, x*(t_f))·δν
+ ∫_{t₀}^{t_f} { [H_x + λ̇ᵀ]·δx(t) + H_u·δu(t) + [f − ẋ]ᵀ·δλ(t) } dt
27
E-L Necessary Conditions
G g
Coeff. of  x f : (t f , x* (t f ), * ) + (t f , x* (t f ), x* (t f ), * ,  * (t f )) = 0
x x
 Gx t =t −   T  = 0  Gx −  T  = 0
f t =t f t =t f

G  g 
Coeff. of  t f : (t f , x* (t f ), * ) −  (t f , x* (t f ), x* (t f ), * ,  * (t f ))  x* (t f )
t  x 
+ g (t f , x* (t f ), x* (t f ), * ,  * (t f )) = 0

 Gt t =t −  − T  x* (t f ) +  H − T x* 
f t =t f   t =t f
 Gt + H t =t = 0
f

G
Coeff. of  t f : (t f , x* (t f ), * ) = 0   (t f , x* (t f )) = 0


Note) g = H (t , x(t ), u (t ),  (t )) −  T (t ) x(t )


G (t f , x(t f ), ) =  (t f , x(t f )) +  T (t f , x (t f ))
H (t , x(t ), u (t ),  (t )) = L(t , x(t ), u (t )) +  T (t ) f (t , x(t ), u (t ))
28
E-L Necessary Conditions
Coeff. of δx: (∂g/∂x)(t, x*(t), ẋ*(t), u*(t), λ*(t)) − d/dt[ (∂g/∂ẋ)(t, x*(t), ẋ*(t), u*(t), λ*(t)) ] = 0
  ⇒ H_x − d/dt(−λᵀ) = 0  ⇒  H_x + λ̇ᵀ = 0

Coeff. of δu: (∂g/∂u)(t, x*(t), ẋ*(t), u*(t), λ*(t)) = 0  ⇒  H_u = 0

Coeff. of δλ: (∂g/∂λ)(t, x*(t), ẋ*(t), u*(t), λ*(t)) = 0  ⇒  ẋ = f

Note) g = H(t, x(t), u(t), λ(t)) − λᵀ(t)·ẋ(t)
G(t_f, x(t_f), ν) = φ(t_f, x(t_f)) + νᵀψ(t_f, x(t_f))
H(t, x(t), u(t), λ(t)) = L(t, x(t), u(t)) + λᵀ(t)·f(t, x(t), u(t))

29
E-L Necessary Conditions
⟨Summary⟩
Optimal control problem
Min. J = φ(t_f, x(t_f)) + ∫_{t₀}^{t_f} L(t, x, u) dt
subject to ẋ = f(t, x, u), x(t₀) = x₀ and ψ(t_f, x(t_f)) = 0 for a free t_f.

Necessary conditions
- Unknowns (2n + p + 2): x(t_f) ~ n, λ(t₀) ~ n, ν ~ p + 1, t_f ~ 1
- Euler-Lagrange equations (TPBVP)
  ẋ* = f(t, x*, u*),  x*(t₀) = x₀ : system equation (n)
  λ̇* = −H_xᵀ(t, x*, u*, λ*),  λ*(t_f) = [G_xᵀ]_{t=t_f} : co-state equation (n)
- Terminal constraint (p + 1): ψ(t_f, x*(t_f)) = 0
- Transversality condition (1): [H + G_t]_{t=t_f} = 0 (dropped if t_f is fixed)
- Control equation: H_u(t, x*, u*, λ*) = 0
30
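The summary above is a TPBVP: x is specified at t₀ while λ is specified at t_f. A standard numerical remedy is the shooting method: guess λ(t₀), integrate forward, and adjust the guess until the terminal costate condition holds. The sketch below is my illustration on an assumed simple problem (not from the slides): min ∫₀¹ (x² + u²) dt with ẋ = u, x(0) = 1, x(1) free, whose E-L conditions reduce to ẋ = −λ/2, λ̇ = −2x, λ(1) = 0, with analytic answer λ(0) = 2 tanh 1.

```python
import math

# TPBVP from the E-L conditions of the assumed example:
#   xdot = -lam/2,  lamdot = -2x,  x(0) = 1,  lam(1) = 0
def rhs(x, lam):
    return -lam / 2.0, -2.0 * x

def residual(lam0, n=1000):
    """Integrate forward with RK4 from a guessed lam(0); return lam(1)."""
    x, lam, h = 1.0, lam0, 1.0 / n
    for _ in range(n):
        k1 = rhs(x, lam)
        k2 = rhs(x + h / 2 * k1[0], lam + h / 2 * k1[1])
        k3 = rhs(x + h / 2 * k2[0], lam + h / 2 * k2[1])
        k4 = rhs(x + h * k3[0], lam + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        lam += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return lam

# Bisect on the unknown initial costate lam(0) until lam(1) = 0
lo, hi = 0.0, 5.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(lo) * residual(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
lam0 = 0.5 * (lo + hi)
assert abs(lam0 - 2.0 * math.tanh(1.0)) < 1e-6   # analytic lam(0) = 2 tanh 1
print("shooting found lam(0) =", round(lam0, 6))
```

For this linear example a single Newton (secant) step would also suffice; bisection is used only because it needs no derivative of the residual.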
E-L Necessary Conditions
Ex) Min. J = t_f + k ∫_{t₀}^{t_f} u² dt
subject to ẋ = u, t₀ = 0, x₀ = 0 and x_f = 1.
Sol) φ(t) = t, ψ(t) = x(t) − 1 ⇒ G(t) = φ + νψ = t + ν·[x(t) − 1]
L(t) = ku²(t), f(t) = u(t) ⇒ H(t) = L + λf = ku²(t) + λu(t)
λ̇ = −H_xᵀ ⇒ λ̇ = 0,
λ(t_f) = [G_x]_{t=t_f} ⇒ λ(t_f) = ν ⇒ λ(t) = ν

H_u(t, x*, u*, λ*) = 0 ⇒ 2ku + λ = 0 ⇒ u = −λ/(2k) = −ν/(2k)

[H + G_t]_{t=t_f} = 0 ⇒ ku²(t_f) + νu(t_f) + 1 = 0 ⇒ ν = ±2√k

ẋ = −ν/(2k) ⇒ x(t) = −[ν/(2k)]·t,

ψ(t_f, x*(t_f)) = 0 ⇒ x(t_f) = 1 ⇒ −[ν/(2k)]·t_f = 1 ⇒ t_f = √k (> 0 when ν = −2√k)

⇒ x(t) = (1/√k)·t,  u(t) = 1/√k
31
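Since the costate λ is constant here, the optimal control is constant, so the answer can be cross-checked by a brute-force search over constant controls u > 0 (a sketch of mine with k = 4 assumed, not part of the slides): x(t) = u·t reaches x_f = 1 at t_f = 1/u, so the cost collapses to J(u) = 1/u + ku.

```python
import math

# Brute-force cross-check of u* = 1/sqrt(k), t_f* = sqrt(k), J* = 2 sqrt(k)
k = 4.0
us = [0.01 * i for i in range(1, 500)]        # candidate constant controls
costs = [1.0 / u + k * u for u in us]         # J(u) = t_f + k u^2 t_f = 1/u + k u
u_best = us[costs.index(min(costs))]

assert abs(u_best - 1.0 / math.sqrt(k)) < 0.011     # u* = 0.5
assert abs(1.0 / u_best - math.sqrt(k)) < 0.05      # t_f* = 2
assert abs(min(costs) - 2.0 * math.sqrt(k)) < 1e-3  # J* = 4
print("u* =", round(u_best, 3), " t_f* =", round(1.0 / u_best, 3))
```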
E-L Necessary Conditions
Ex) Navigation problem
Find the velocity direction θ(t) that minimizes J = t_f
subject to ẋ₁ = V cos θ
ẋ₂ = V sin θ + w
with the prescribed b.c.'s: t₀ = 0, x₁₀ = x₂₀ = 0 and x₁f = 1, x₂f = 0.

(Figure: velocity V at heading θ(t) in the (x₁, x₂) plane, with crossflow w along x₂)

Sol) φ(t_f) = t_f,  ψ(t_f) = [x₁(t_f) − 1, x₂(t_f)]ᵀ
G(t_f) = φ + νᵀψ = t_f + ν₁·[x₁(t_f) − 1] + ν₂·x₂(t_f)

H = L + λᵀf = λ₁(V cos θ) + λ₂(V sin θ + w)

λ(t_f) = [G_x]_{t=t_f} ⇒ λ₁(t_f) = ν₁, λ₂(t_f) = ν₂

λ̇ = −H_xᵀ ⇒ λ̇₁ = −H_{x₁} = 0 ⇒ λ₁ = ν₁
            λ̇₂ = −H_{x₂} = 0 ⇒ λ₂ = ν₂
32
E-L Necessary Conditions
2  2
H (t , x* , u * ,  * ) = 0 = − 1V sin  + 2V cos   tan  = =
1  1
1
x1 = V cos   t  x1 f = 1 = V cos   t f  t f =
V cos 
w V 2 − w2
x2 = (V sin  + w)t  x2 f = 0 = (V sin  + w)t f  sin  = − , cos  =
V V
w 1
 = − sin −1 , tf = V
V V 2 − w2
−w

V 2 − w2
1
 H + Gt t =t = 0   1 V 2 − w2 + 1 = 0   1 = −
f
V 2 − w2
w
 2 =  1 tan  =
V 2 − w2

33
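A quick numerical check of this result (my sketch with assumed numbers V = 2, w = 1, not from the slides): the heading θ* = −sin⁻¹(w/V) exactly cancels the crossflow, while neighboring constant headings that still reach x₁ = 1 violate the terminal condition x₂(t_f) = 0.

```python
import math

V, w = 2.0, 1.0                          # assumed speeds for illustration
theta = -math.asin(w / V)                # optimal heading
tf = 1.0 / math.sqrt(V ** 2 - w ** 2)    # optimal final time

x1f = V * math.cos(theta) * tf           # downrange: should hit 1
x2f = (V * math.sin(theta) + w) * tf     # cross-range: drift cancelled, 0
assert abs(x1f - 1.0) < 1e-12
assert abs(x2f) < 1e-12
# Nearby constant headings that reach x1 = 1 miss x2(tf) = 0
for th in (theta + 0.1, theta - 0.1):
    t_hit = 1.0 / (V * math.cos(th))
    assert abs((V * math.sin(th) + w) * t_hit) > 1e-3
print("theta* =", math.degrees(theta), "deg, t_f* =", tf)
```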
E-L Necessary Conditions
Ex) Launch vehicle control
For a given t_f, find the thrust acceleration direction
θ(t) that maximizes
J = u(t_f)
subject to u̇ = a cos θ
v̇ = a sin θ − g
ẋ = u
ẏ = v
with the final conditions
y(t_f) = h, v(t_f) = 0

(Figure: vehicle of mass m with thrust acceleration a at angle θ(t) from the horizontal, velocity components u and v, and gravity g)
34
E-L Necessary Conditions
Sol) G = φ + νᵀψ = u + ν_y·(y − h) + ν_v·v
H = L + λᵀf = λ_u·a cos θ + λ_v·(a sin θ − g) + λ_x·u + λ_y·v
λ̇ = −H_xᵀ ⇒ λ̇_u = −λ_x, λ̇_v = −λ_y, λ̇_x = 0, λ̇_y = 0
⇒ λ_u = −c₁t + c₃, λ_v = −c₂t + c₄, λ_x = c₁, λ_y = c₂
λ(t_f) = [G_x]_{t=t_f} ⇒ λ_u(t_f) = 1, λ_v(t_f) = ν_v, λ_x(t_f) = 0, λ_y(t_f) = ν_y

⇒ c₁ = 0, c₃ = 1, c₂ = ν_y, c₄ = ν_v + ν_y·t_f
H_θ = a·(−λ_u sin θ + λ_v cos θ) = 0 ⇒ tan θ = tan θ₀ − c·t : linear tangent law
where tan θ₀ = ν_v + ν_y·t_f and c = ν_y

- tan θ₀ and c (or ν_v and ν_y) are determined by the two final b.c.'s y(t_f) = h, v(t_f) = 0.
- The linear tangent law has been the fundamental law for attitude control of
most launch vehicles.
Ex) IGM (Iterative Guidance Mode) for the Saturn rockets,
PEG (Powered Explicit Guidance) for the Space Shuttle
35
E-L Necessary Conditions

(Figure: optimal pitch-angle program from the linear tangent law compared with a gravity turn, α = 0)
36
Principle of Optimality
Theorem) If a–b–c–e is the optimal path from a to e, then b–c–e is
the optimal path from b to e.
Proof by contradiction)
Suppose that b–d–e is the optimal path from b to e; then
J_bde < J_bce
and
J_ab + J_bde < J_ab + J_bce = J*_ae.
This can be satisfied only by violating the condition that a–b–c–e is
the optimal path from a to e.

(Figure: node a connects to b with cost J_ab; b reaches e either through c with cost J_bce or through d with cost J_bde)
37
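The theorem can be illustrated by brute-force enumeration on the figure's small network, with edge costs assumed by me for illustration:

```python
# Assumed edge costs for the figure's network: a -> b -> {c, d} -> e
edges = {('a', 'b'): 2, ('b', 'c'): 1, ('b', 'd'): 4,
         ('c', 'e'): 3, ('d', 'e'): 1}

def cost(path):
    return sum(edges[(p, q)] for p, q in zip(path, path[1:]))

paths_ae = [['a', 'b', 'c', 'e'], ['a', 'b', 'd', 'e']]   # all a -> e paths
paths_be = [['b', 'c', 'e'], ['b', 'd', 'e']]             # all b -> e paths
best_ae = min(paths_ae, key=cost)   # a-b-c-e costs 6; a-b-d-e costs 7
best_be = min(paths_be, key=cost)

assert best_ae == ['a', 'b', 'c', 'e'] and cost(best_ae) == 6
assert best_be == best_ae[1:]       # the tail of the optimal path is optimal
print("J*_ae =", cost(best_ae), ", optimal tail:", best_be)
```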
H-J-B Equation
Consider a value ftn. defined by
V(t, x(t), u(t)) = φ(t_f, x(t_f)) + ∫_t^{t_f} L(τ, x(τ), u(τ)) dτ
where t ≤ t_f and t_f is fixed. The minimum cost is then

V*(t, x(t)) = min_u { φ(t_f, x(t_f)) + ∫_t^{t_f} L(τ, x(τ), u(τ)) dτ }
= min_u { φ(t_f, x(t_f)) + ∫_t^{t+Δt} L(τ, x(τ), u(τ)) dτ + ∫_{t+Δt}^{t_f} L(τ, x(τ), u(τ)) dτ }

From the principle of optimality,
V*(t, x(t)) = min_u { ∫_t^{t+Δt} L(τ, x(τ), u(τ)) dτ + V*(t + Δt, x(t + Δt)) }
= min_u { ∫_t^{t+Δt} L dτ + V*(t, x(t)) + V_t*(t, x(t))·Δt
         + V_x*ᵀ(t, x(t))·[x(t + Δt) − x(t)] + higher-order terms }
where x(t + Δt) − x(t) ≈ f(t, x(t), u(t))·Δt.
38
H-J-B Equation
For small Δt,
0 = V_t*(t, x(t))·Δt + min_u { L(t, x(t), u(t))·Δt + V_x*ᵀ(t, x(t))·f(t, x(t), u(t))·Δt }

Dividing by Δt, we have a partial differential eqn.
0 = V_t*(t, x(t)) + min_u { L(t, x(t), u(t)) + V_x*ᵀ(t, x(t))·f(t, x(t), u(t)) }
with the b.c. obtained by setting t = t_f in the value ftn.: V*(t_f, x(t_f)) = φ(t_f, x(t_f)).
Define the Hamiltonian
H(t, x, u, V_x*) = L(t, x(t), u(t)) + V_x*ᵀ(t, x(t))·f(t, x(t), u(t));
then we have the Hamilton-Jacobi-Bellman eqn.
0 = V_t*(t, x(t)) + min_u H(t, x(t), u(t), V_x*(t, x(t)))
or
V_t*(t, x(t)) = −H(t, x(t), u*(t), V_x*(t, x(t)))

39
H-J-B Equation
H-J-B equation
- Based on the principle of optimality
- A partial differential equation: guessing the solution form of V is involved
- Provides the rule for defining optimal controls of continuous systems
- Dynamic programming is required for obtaining the solution

40
H-J-B Equation
V(t) = φ(t_f, x(t_f)) + ∫_t^{t_f} L(τ, x(τ), u(τ)) dτ
  ⇒ V(t) = J_max − J(t) for t < t_f,  with V_min = φ(t_f, x(t_f)) at t = t_f

J(t) = ∫_{t₀}^t L(τ, x(τ), u(τ)) dτ for t < t_f,  with J_max = φ(t_f, x(t_f)) + J(t_f) at t = t_f

(Figure: the cost-to-come J(t) grows from t₀ toward J_max, while the cost-to-go V(t) decays from V_max toward φ(t_f, x(t_f)))

41
H-J-B Equation
Is the H in J the same as the H in V?

J:  H = L + λᵀf
J_x(t) = ∂/∂x ∫_{t₀}^t L dτ = ∂/∂x ∫_{t₀}^t [ L + λᵀ(f − ẋ) ] dτ = ∂/∂x ∫_{t₀}^t [ H − λᵀẋ ] dτ
       = ∫_{t₀}^t (∂H/∂x) dτ = −∫_{t₀}^t λ̇ᵀ dτ = −[ λᵀ(t) − λᵀ(t₀) ]

J_x(t) = −V_x(t) since V = J_max − J

V:  H = L + V_xᵀf = L + λᵀf
42
H-J-B Equation
Ex) Find u to minimize J = ¼x²(t_f) + ∫₀^{t_f} ¼u²(t) dt
subject to ẋ(t) = x(t) + u(t), with t_f fixed.

Sol) By the H-J-B equations

Hamiltonian: H(t, x(t), u(t), V_x*) = ¼u²(t) + V_x*·[x(t) + u(t)]

Optimal control: H_u = 0 = ½u*(t) + V_x* ⇒ u*(t) = −2V_x*

H-J-B eqn.: 0 = V_t* + ¼[−2V_x*]² + V_x*·[x(t) − 2V_x*] = V_t* − [V_x*]² + V_x*·x(t)

Guess a solution that satisfies the o.d.e. and b.c.: V* = ½K(t)x²(t)
⇒ V_x* = K(t)x(t) ⇒ u = −2K(t)x(t)
⇒ V_t* = ½K̇(t)x²(t)   (x is not an explicit ftn. of time)

From the H-J-B eqn.: 0 = ½K̇x² − K²x² + Kx² ⇒ 0 = ½K̇(t) − K²(t) + K(t)

Boundary condition: V*(t_f, x(t_f)) = ½K(t_f)x²(t_f) = ¼x²(t_f) ⇒ K(t_f) = ½

Finally, K(t) = e^{t_f−t} / (e^{t_f−t} + e^{−(t_f−t)})
43
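The closed-form gain can be checked against the Riccati equation derived above by finite differences (my sketch, with t_f = 1 assumed):

```python
import math

tf = 1.0                      # assumed final time for the check

def K(t):
    """Closed-form gain K(t) = e^(tf-t) / (e^(tf-t) + e^-(tf-t))."""
    s = tf - t
    return math.exp(s) / (math.exp(s) + math.exp(-s))

assert abs(K(tf) - 0.5) < 1e-12              # boundary condition K(tf) = 1/2
h = 1e-6
for t in (0.0, 0.3, 0.7):
    Kdot = (K(t + h) - K(t - h)) / (2 * h)   # central-difference derivative
    # Riccati equation from the H-J-B step: (1/2) Kdot - K^2 + K = 0
    assert abs(0.5 * Kdot - K(t) ** 2 + K(t)) < 1e-6
print("K(t) satisfies (1/2)K' - K^2 + K = 0 and K(tf) = 1/2")
```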
H-J-B Equation
Sol) By the E-L equations
Hamiltonian: H(t, x, u, λ) = ¼u² + λ·(x + u)

Optimal control: H_u = 0 = ½u* + λ ⇒ u* = −2λ

Costate eqn.: λ̇ = −H_x ⇒ λ̇ = −λ ⇒ λ(t) = c_λ·e^{−t}

System eqn.: ẋ = x + u = x − 2λ ⇒ ẋ = x − 2c_λ·e^{−t} ⇒ x = c_h·e^t + c_λ·e^{−t}

B.C.: λ(t_f) = φ_x(t_f) ⇒ λ(t_f) = ½x(t_f)
  ⇒ c_λ·e^{−t_f} = ½(c_h·e^{t_f} + c_λ·e^{−t_f}) ⇒ c_h·e^{t_f} − c_λ·e^{−t_f} = 0
x(0) = x₀ ⇒ c_h + c_λ = x₀

⇒ c_λ = [e^{t_f} / (e^{t_f} + e^{−t_f})]·x₀,  c_h = [e^{−t_f} / (e^{t_f} + e^{−t_f})]·x₀

x(t) = c_h·e^t + c_λ·e^{−t} = [(e^{−(t_f−t)} + e^{t_f−t}) / (e^{t_f} + e^{−t_f})]·x₀

λ(t) = c_λ·e^{−t} = [e^{t_f−t} / (e^{t_f} + e^{−t_f})]·x₀
     = [e^{t_f−t} / (e^{−(t_f−t)} + e^{t_f−t})]·x(t) = K(t)·x(t)

u*(t) = −2λ(t) = −2K(t)·x(t) : the same optimal control as that of H-J-B
44
References
➢ Optimal Control Theory
[1] D. E. Kirk, Optimal Control Theory: An Introduction, Prentice-Hall, 1970.
[2] A. E. Bryson, Jr. and Y.-C. Ho, Applied Optimal Control, Hemisphere Publishing Corp., 1975.
[3] R. F. Stengel, Optimal Control and Estimation, Dover Publications, 1986.
[4] D. G. Hull, Optimal Control Theory for Applications, Springer-Verlag, 2003.

45
