DETERMINISTIC FINITE-STATE PROBLEM
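[Figure: transition graph of a deterministic finite-state problem. Nodes are arranged in columns for Stage 0, Stage 1, Stage 2, ..., Stage N - 1, Stage N, starting from the initial state s at Stage 0. Terminal arcs, with cost equal to the terminal cost, connect the Stage N nodes to an artificial terminal node t.]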
• States <==> Nodes
• Controls <==> Arcs
• Control sequences (open-loop) <==> paths
from initial state to terminal states
• $a^k_{ij}$: Cost of transition from state $i \in S_k$ to state $j \in S_{k+1}$ at time $k$ (view it as “length” of the arc)
• $a^N_{it}$: Terminal cost of state $i \in S_N$
• Cost of control sequence <==> Cost of the corresponding path (view it as “length” of the path)
BACKWARD AND FORWARD DP ALGORITHMS
• DP algorithm:
$$J_N(i) = a^N_{it}, \quad i \in S_N,$$
$$J_k(i) = \min_{j \in S_{k+1}} \big[ a^k_{ij} + J_{k+1}(j) \big], \quad i \in S_k, \; k = 0, \ldots, N-1$$
The optimal cost is $J_0(s)$ and is equal to the length of the shortest path from $s$ to $t$
• Observation: An optimal path s → t is also an
optimal path t → s in a “reverse” shortest path
problem where the direction of each arc is reversed
and its length is left unchanged
• Forward DP algorithm (= backward DP algorithm for the reverse problem):
$$\tilde J_N(j) = a^0_{sj}, \quad j \in S_1,$$
$$\tilde J_k(j) = \min_{i \in S_{N-k}} \big[ a^{N-k}_{ij} + \tilde J_{k+1}(i) \big], \quad j \in S_{N-k+1}$$
The optimal cost is $\tilde J_0(t) = \min_{i \in S_N} \big[ a^N_{it} + \tilde J_1(i) \big]$
• View $\tilde J_k(j)$ as optimal cost-to-arrive to state $j$ from initial state $s$
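The two recursions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the dictionary layout (`stage_costs[k][i][j]` for $a^k_{ij}$, `terminal_cost[i]` for $a^N_{it}$), and the small two-stage graph are all assumptions made for the example. The forward routine stores costs-to-arrive indexed by stage, so `arrive[m][j]` plays the role of $\tilde J_{N-m+1}(j)$.

```python
import math

def backward_dp(stage_costs, terminal_cost):
    """Cost-to-go J[k][i] for every stage k and state i in S_k.

    stage_costs[k][i][j] is the arc length a^k_ij (hypothetical layout);
    terminal_cost[i] is a^N_it. Every state is assumed to have a successor.
    """
    N = len(stage_costs)
    J = [None] * (N + 1)
    J[N] = dict(terminal_cost)                      # J_N(i) = a^N_it
    for k in range(N - 1, -1, -1):                  # k = N-1, ..., 0
        J[k] = {i: min(a + J[k + 1][j] for j, a in arcs.items())
                for i, arcs in stage_costs[k].items()}
    return J

def forward_dp(stage_costs, terminal_cost, s):
    """Cost-to-arrive at every state, plus the optimal s -> t cost."""
    arrive = [{s: 0.0}]                             # stage 0: only s, at cost 0
    for arcs_k in stage_costs:
        nxt = {}
        for i, arcs in arcs_k.items():
            if i not in arrive[-1]:
                continue                            # i cannot be reached from s
            for j, a in arcs.items():
                cand = arrive[-1][i] + a
                if cand < nxt.get(j, math.inf):
                    nxt[j] = cand
        arrive.append(nxt)
    best = min(arrive[-1][i] + terminal_cost[i] for i in arrive[-1])
    return arrive, best

# Hypothetical data: S_0 = {s}, S_1 = {a, b}, S_2 = {c, d}, N = 2.
stage_costs = [
    {"s": {"a": 2, "b": 1}},
    {"a": {"c": 1, "d": 4}, "b": {"c": 5, "d": 2}},
]
terminal_cost = {"c": 3, "d": 1}

J = backward_dp(stage_costs, terminal_cost)
arrive, best = forward_dp(stage_costs, terminal_cost, "s")
print(J[0]["s"], best)   # both equal 4 for this data
```

Both passes visit every arc once; the difference is only whether a node's value summarizes the path that follows it (cost-to-go) or the path that precedes it (cost-to-arrive).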
A NOTE ON FORWARD DP ALGORITHMS
• There is no forward DP algorithm for stochastic
problems
• Mathematically, for stochastic problems, we
cannot restrict ourselves to open-loop sequences,
so the shortest path viewpoint fails
• Conceptually, in the presence of uncertainty, the concept of “optimal-cost-to-arrive” at a state $x_k$ does not make sense. For example, it may be impossible to guarantee (with probability 1) that any given state can be reached
• By contrast, even in stochastic problems, the concept of “optimal cost-to-go” from any state $x_k$ makes clear sense
GENERIC SHORTEST PATH PROBLEMS
• $\{1, 2, \ldots, N, t\}$: nodes of a graph ($t$: the destination)
• $a_{ij}$: cost of moving from node $i$ to node $j$
• Find a shortest (minimum cost) path from each node $i$ to node $t$
• Assumption: All cycles have nonnegative length. Then an optimal path need not take more than $N$ moves
• We formulate the problem as one where we require exactly $N$ moves but allow degenerate moves from a node $i$ to itself with cost $a_{ii} = 0$
$J_k(i)$ = optimal cost of getting from $i$ to $t$ in $N-k$ moves
$J_0(i)$: Cost of the optimal path from $i$ to $t$
• DP algorithm:
$$J_k(i) = \min_{j=1,\ldots,N} \big[ a_{ij} + J_{k+1}(j) \big], \quad k = 0, 1, \ldots, N-2,$$
with $J_{N-1}(i) = a_{it}$, $i = 1, 2, \ldots, N$
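As a rough sketch of this recursion, the Python below starts from $J_{N-1}(i) = a_{it}$ and applies the minimization $N-1$ times, with the degenerate move $i \to i$ of cost 0 handled by keeping the previous value of $J(i)$ as a candidate. The function name, the dictionary-of-dictionaries layout for the arc lengths, and the three-node graph are assumptions made for illustration; missing arcs are treated as having infinite length.

```python
import math

def shortest_path_dp(a, t):
    """Return J_0(i), the shortest distance from each node i to t.

    a[i][j] is the arc length a_ij (hypothetical layout); a missing entry
    means there is no arc i -> j. Cycles are assumed to have nonnegative length.
    """
    nodes = [i for i in a if i != t]
    N = len(nodes)
    # Terminal condition: J_{N-1}(i) = a_it
    J = {i: a[i].get(t, math.inf) for i in nodes}
    for _ in range(N - 1):                          # k = N-2, ..., 0
        J = {
            i: min(
                J[i],                               # degenerate move i -> i, a_ii = 0
                min((a[i].get(j, math.inf) + J[j] for j in nodes if j != i),
                    default=math.inf),
            )
            for i in nodes
        }
    return J

# Hypothetical graph: nodes 1, 2, 3 and destination "t".
a = {
    1: {2: 1, "t": 10},
    2: {3: 1, "t": 5},
    3: {"t": 1},
}
print(shortest_path_dp(a, "t"))   # {1: 3, 2: 2, 3: 1}
```

Here node 1 improves from 10 (direct arc) to 6 (via node 2) to 3 (via nodes 2 and 3), one extra move per iteration, which is why $N-1$ iterations suffice.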
EXAMPLE
[Figure: (a) example graph with arc lengths and destination node 5; (b) optimal costs-to-go $J_k(i)$ for each state i, plotted against stage k.]
$$J_{N-1}(i) = a_{it}, \quad i = 1, 2, \ldots, N,$$
$$J_k(i) = \min_{j=1,\ldots,N} \big[ a_{ij} + J_{k+1}(j) \big], \quad k = 0, 1, \ldots, N-2.$$
ESTIMATION / HIDDEN MARKOV MODELS
• Markov chain with transition probabilities $p_{ij}$
• State transitions are hidden from view
• For each transition, we get an (independent) observation
• $r(z; i, j)$: Probability that the observation takes value $z$ when the state transition is from $i$ to $j$
• Trajectory estimation problem: Given the observation sequence $Z_N = \{z_1, z_2, \ldots, z_N\}$, what is the “most likely” state transition sequence $\hat X_N = \{\hat x_0, \hat x_1, \ldots, \hat x_N\}$ [one that maximizes $p(X_N \mid Z_N)$ over all $X_N = \{x_0, x_1, \ldots, x_N\}$]?
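[Figure: layered graph with an artificial origin node s, one column of state nodes for each of $x_0, x_1, x_2, \ldots, x_{N-1}, x_N$, and an artificial terminal node t.]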
VITERBI ALGORITHM
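• We have
$$p(X_N \mid Z_N) = \frac{p(X_N, Z_N)}{p(Z_N)}$$
where $p(X_N, Z_N)$ and $p(Z_N)$ are the unconditional probabilities of occurrence of $(X_N, Z_N)$ and $Z_N$
• Since $p(Z_N)$ does not depend on $X_N$, maximizing $p(X_N \mid Z_N)$ is equivalent to maximizing $\ln(p(X_N, Z_N))$
• We have
$$p(X_N, Z_N) = \pi_{x_0} \prod_{k=1}^{N} p_{x_{k-1} x_k} \, r(z_k; x_{k-1}, x_k)$$
where $\pi_{x_0}$ is the probability of the initial state $x_0$, so the problem is equivalent to
$$\text{minimize} \quad -\ln(\pi_{x_0}) - \sum_{k=1}^{N} \ln\big( p_{x_{k-1} x_k} \, r(z_k; x_{k-1}, x_k) \big)$$
over all possible sequences $\{x_0, x_1, \ldots, x_N\}$.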
• This is a shortest path problem.
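A minimal Python sketch of this shortest path view follows. It runs a forward pass over the stages with arc lengths $-\ln\big(p_{ij}\, r(z_k; i, j)\big)$ and initial lengths $-\ln(\pi_i)$, then traces parents back from the best final state. The function name `viterbi`, the argument layout, and the two-state model (states "H" and "L", with an observation probability that depends only on the arrival state) are assumptions chosen for illustration; at least one state sequence is assumed to have positive probability.

```python
import math

def viterbi(pi, p, r, z):
    """Most likely state sequence x_0, ..., x_N given observations z_1, ..., z_N.

    pi[i]: initial probability, p[i][j]: transition probability,
    r(z_k, i, j): probability of observing z_k on the transition i -> j.
    Returns (sequence, joint probability p(X_N, Z_N)).
    """
    states = list(pi)
    # Length of the arc from the artificial origin s to each x_0.
    cost = {i: -math.log(pi[i]) if pi[i] > 0 else math.inf for i in states}
    parents = []
    for zk in z:
        new_cost, parent = {}, {}
        for j in states:
            best_i, best = None, math.inf
            for i in states:
                w = p[i][j] * r(zk, i, j)
                if w > 0 and cost[i] - math.log(w) < best:
                    best_i, best = i, cost[i] - math.log(w)
            new_cost[j], parent[j] = best, best_i
        parents.append(parent)
        cost = new_cost
    # Arcs into the artificial terminal t have length 0: pick the best x_N.
    x = min(cost, key=cost.get)
    total = cost[x]
    path = [x]
    for parent in reversed(parents):
        x = parent[x]
        path.append(x)
    path.reverse()
    return path, math.exp(-total)

# Hypothetical two-state model.
pi = {"H": 0.6, "L": 0.4}
p = {"H": {"H": 0.7, "L": 0.3}, "L": {"H": 0.4, "L": 0.6}}
e = {"H": {"x": 0.9, "y": 0.1}, "L": {"x": 0.2, "y": 0.8}}

def r(z, i, j):
    # Assumption: the observation depends only on the arrival state j.
    return e[j][z]

path, prob = viterbi(pi, p, r, ["x", "y"])
print(path)   # ['H', 'H', 'L']
```

For this data the best sequence has joint probability $0.6 \cdot (0.7 \cdot 0.9) \cdot (0.3 \cdot 0.8) = 0.09072$. Working with sums of negative logarithms rather than products also avoids numerical underflow on long observation sequences.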
GENERAL SHORTEST PATH ALGORITHMS
• There are many non-DP shortest path algorithms. They can all be used to solve deterministic finite-state problems
• They may be preferable to DP if they avoid calculating the optimal cost-to-go of EVERY state
• This is essential for problems with HUGE state spaces. Such problems arise, for example, in combinatorial optimization
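[Figure: shortest path formulation of a four-city example. The origin node s is A; below it are the partial sequences AB, AC, AD, then ABC, ABD, ACB, ACD, ADB, ADC, then the complete sequences ABCD, ABDC, ACBD, ACDB, ADBC, ADCB, each of which is connected to an artificial terminal node t. Arc lengths are the distances in the table below; the arc from a complete sequence to t has the length of the return to A.]

        A     B     C     D
A       -     5     1    15
B       5     -    20     4
C       1    20     -     3
D      15     4     3     -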
LABEL CORRECTING METHODS
• Given: Origin $s$, destination $t$, lengths $a_{ij} \ge 0$.
• Idea is to progressively discover shorter paths from the origin $s$ to every other node $i$
• Notation:
− $d_i$ (label of $i$): Length of the shortest path found (initially $d_s = 0$, $d_i = \infty$ for $i \ne s$)
− UPPER: The label $d_t$ of the destination
− OPEN list: Contains nodes that are currently active in the sense that they are candidates for further examination (initially OPEN = $\{s\}$)
Label Correcting Algorithm
Step 1 (Node Removal): Remove a node $i$ from OPEN and for each child $j$ of $i$, do step 2
Step 2 (Node Insertion Test): If $d_i + a_{ij} < \min\{d_j, \text{UPPER}\}$, set $d_j = d_i + a_{ij}$ and set $i$ to be the parent of $j$. In addition, if $j \ne t$, place $j$ in OPEN if it is not already in OPEN, while if $j = t$, set UPPER to the new value $d_i + a_{it}$ of $d_t$
Step 3 (Termination Test): If OPEN is empty, terminate; else go to step 1
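The three steps can be sketched directly in Python. The slides do not say which node to remove from OPEN in Step 1, so this sketch offers two common choices through a hypothetical `breadth_first` flag (last-in-first-out by default, first-in-first-out when set). The function name, the `children[i] = {j: a_ij}` layout, and the small test graph are assumptions made for the example.

```python
import math
from collections import deque

def label_correcting(children, s, t, breadth_first=False):
    """Return (UPPER, path) for a shortest s -> t path.

    children[i] maps each child j of i to the arc length a_ij >= 0
    (hypothetical layout). path is None if t cannot be reached.
    """
    d = {s: 0.0}                 # labels; a missing label means infinity
    parent = {}
    upper = math.inf
    open_list = deque([s])
    while open_list:                                          # Step 3
        # Step 1: remove a node from OPEN (the removal rule is a free choice).
        i = open_list.popleft() if breadth_first else open_list.pop()
        for j, a_ij in children.get(i, {}).items():
            cand = d[i] + a_ij
            if cand < min(d.get(j, math.inf), upper):         # Step 2
                d[j] = cand
                parent[j] = i
                if j != t:
                    if j not in open_list:
                        open_list.append(j)
                else:
                    upper = cand                              # new label of t
    if upper == math.inf:
        return upper, None
    path = [t]
    while path[-1] != s:
        path.append(parent[path[-1]])
    return upper, path[::-1]

# Hypothetical graph with positive arc lengths.
children = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 2, "t": 6},
    "b": {"t": 1},
}
print(label_correcting(children, "s", "t"))   # (4.0, ['s', 'a', 'b', 't'])
```

On this graph both removal rules first find a longer s → t path (length 5 depth-first, 7 breadth-first) and then correct UPPER to 4 once the label of b drops from 4 to 3.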
VISUALIZATION/EXPLANATION
• Given: Origin $s$, destination $t$, lengths $a_{ij} \ge 0$
• $d_i$ (label of $i$): Length of the shortest path found thus far (initially $d_s = 0$, $d_i = \infty$ for $i \ne s$). The label $d_i$ is implicitly associated with an $s \to i$ path
• UPPER: The label $d_t$ of the destination
• OPEN list: Contains “active” nodes (initially OPEN = $\{s\}$)
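[Figure: flowchart of one iteration. A node $i$ is REMOVED from OPEN, and each child $j$ of $i$ is tested in two stages. First test: is $d_i + a_{ij} < d_j$? (Is the path $s \to i \to j$ better than the current path $s \to j$?) If YES, second test: is $d_i + a_{ij} <$ UPPER? (Does the path $s \to i \to j$ have a chance to be part of a shorter $s \to t$ path?) If YES, set $d_j = d_i + a_{ij}$ and INSERT $j$ into OPEN.]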
EXAMPLE
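[Figure: tree of the four-city example from origin node s = A down to the artificial terminal node t. The nodes that enter OPEN are numbered in the order they exit: 1 = A, 2 = AB, 3 = ABC, 4 = ABCD, 5 = ABD, 6 = ABDC, 7 = AC, 8 = ACD, 9 = ACDB, 10 = AD. Nodes ACB, ADB, ADC, ACBD, ADBC, ADCB are unnumbered. Arc lengths: A-B 5, A-C 1, A-D 15, B-C 20, B-D 4, C-D 3; the arc from a complete sequence to t has the length of the return to A.]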
Iter. No.   Node Exiting OPEN   OPEN after Iteration   UPPER
0           -                   1                      ∞
1           1                   2, 7, 10               ∞
2           2                   3, 5, 7, 10            ∞
3           3                   4, 5, 7, 10            ∞
4           4                   5, 7, 10               43
5           5                   6, 7, 10               43
6           6                   7, 10                  13
7           7                   8, 10                  13
8           8                   9, 10                  13
9           9                   10                     13
10          10                  Empty                  13
• Note that some nodes never entered OPEN
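The iteration table can be reproduced with a short Python sketch. It assumes the arc lengths in the example figure come from symmetric distances A-B = 5, A-C = 1, A-D = 15, B-C = 20, B-D = 4, C-D = 3, and that OPEN is handled depth-first, with newly inserted children placed at the top in alphabetical order. The names `DIST`, `children`, and `trace_depth_first` are hypothetical.

```python
import math

# Assumed symmetric distances, read from the arc labels of the example figure.
DIST = {
    "A": {"B": 5, "C": 1, "D": 15},
    "B": {"A": 5, "C": 20, "D": 4},
    "C": {"A": 1, "B": 20, "D": 3},
    "D": {"A": 15, "B": 4, "C": 3},
}

def children(node):
    """Arcs out of a partial sequence such as "AB"; complete ones lead to "t"."""
    if len(node) == len(DIST):
        return {"t": DIST[node[-1]]["A"]}           # return to the origin city
    return {node + c: DIST[node[-1]][c] for c in sorted(DIST) if c not in node}

def trace_depth_first(s="A", t="t"):
    """Run the label correcting steps; return (exiting node, OPEN, UPPER) rows."""
    d = {s: 0}
    upper = math.inf
    open_list = [s]                                 # top of the stack is index 0
    rows = []
    while open_list:
        i = open_list.pop(0)
        inserted = []
        for j, a_ij in children(i).items():
            cand = d[i] + a_ij
            if cand < min(d.get(j, math.inf), upper):
                d[j] = cand
                if j == t:
                    upper = cand
                else:
                    inserted.append(j)              # tree: j is never already in OPEN
        open_list = inserted + open_list
        rows.append((i, list(open_list), upper))
    return rows

for number, (node, open_after, upper) in enumerate(trace_depth_first(), start=1):
    print(number, node, open_after, upper)
```

Under these assumptions the nodes exit in the order A, AB, ABC, ABCD, ABD, ABDC, AC, ACD, ACDB, AD, and UPPER drops to 43 at iteration 4 and to 13 at iteration 6, matching the table. ACB, ADB, and ADC fail the insertion test against UPPER = 13, which is why they and their descendants never enter OPEN.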
VALIDITY OF LABEL CORRECTING METHODS
Proposition: If there exists at least one path from the origin to the destination, the label correcting algorithm terminates with UPPER equal to the shortest distance from the origin to the destination
Proof: (1) Each time a node $j$ enters OPEN, its label is decreased and becomes equal to the length of some path from $s$ to $j$
(2) The number of possible distinct path lengths is finite, so the number of times a node can enter OPEN is finite, and the algorithm terminates
(3) Let $(s, j_1, j_2, \ldots, j_k, t)$ be a shortest path and let $d^*$ be the shortest distance. If UPPER $> d^*$ at termination, UPPER will also be larger than the length of all the paths $(s, j_1, \ldots, j_m)$, $m = 1, \ldots, k$, throughout the algorithm (arc lengths are nonnegative, so these lengths are at most $d^*$, and UPPER never increases). Hence, node $j_k$ will never enter the OPEN list with $d_{j_k}$ equal to the shortest distance from $s$ to $j_k$; otherwise, when $j_k$ later exits OPEN, UPPER would be set to $d^*$. Similarly node $j_{k-1}$ will never enter the OPEN list with $d_{j_{k-1}}$ equal to the shortest distance from $s$ to $j_{k-1}$. Continue to $j_1$ to get a contradiction: $j_1$ does enter OPEN with $d_{j_1} = a_{sj_1}$, its shortest distance from $s$, when $s$ is removed at the first iteration