
Dynamic optimization: A tool kit

Manuel Wälti
This draft: September 2002

Contents
1 Introduction

2 Optimal control
   2.1 Discrete time
       2.1.1 Finite horizon
       2.1.2 Infinite horizon
   2.2 Continuous time
       2.2.1 Finite horizon
       2.2.2 Infinite horizon with discounting
   2.3 Digression: Continuous versus discrete time

3 Dynamic programming
   3.1 Discrete time - deterministic setting
       3.1.1 Finite horizon
       3.1.2 Infinite horizon
   3.2 Continuous time

1 Introduction

In our microeconomics courses we have learned standard tools for maximizing an objective function (such as utility) subject to equality and inequality constraints (such as static budget constraints). Under appropriate convexity assumptions the solution to such problems usually consists of a single optimal magnitude for every choice variable, e.g. the optimal demand of a particular good, given prices and income.

In contrast, a dynamic optimization problem raises the question of what is the optimal magnitude of a choice variable in each period of time within a given time span. We could define dynamic optimization as the process of determining the paths of control variables and state variables for a dynamic system over a finite or infinite time horizon to maximize a criterion function. There may be constraints on the final states of the system and on the in-flight states and controls.

* I wish to thank Esther Brügger and Martin Wagner for helpful comments. Any errors are my own.

To illustrate this rather abstract definition let's look at an example of a dynamic optimization problem.¹ Once upon a time there was a little girl who got a cake. The girl decided to eat that cake all alone. But she was undetermined when she wanted to eat it. First, she thought of eating the whole cake right away. But then, nothing would be left for tomorrow and the day after tomorrow. Well, on the one hand, she thought by herself, eating cake today is better than eating it tomorrow. On the other hand, eating too much at the same time might not be the best thing to do. She imagined that the first mouthful of cake is a real treat, the second is great, the third is also nice. But the more you eat, the less you enjoy it. In the end you're almost indifferent, she thought. So, she decided to eat only a bit of the cake every day. Then, she could eat every day another first mouthful of cake. The girl knew that the cake would be spoiled if she kept it more than nine days. Therefore, she would eat the cake in the first ten days. Yet, how much should she eat every day? She thought of eating every day a piece of the same size. But if eating cake today is better than waiting for tomorrow, how can it possibly be the best to do the same today as tomorrow? If I ate just a little bit less tomorrow and a little bit more today I would be better off, she concluded. And she would eat every day a bit less than the previous day and the cake would last ten days long and nothing would be left in the end.
The girl's problem can be stated as follows. The girl maximizes

\[ V(c_0, c_1, \ldots, c_T) = \sum_{t=0}^{T} \beta^t u(c_t) \]

subject to

\[ k_{t+1} - k_t = -c_t \]
\[ k_0 \ \text{given} \]
\[ k_{T+1} \geq 0 \]

where $c_t$ is the amount of cake consumed in period $t$ (in the context of our dynamic optimization problem, $c_t$ represents the control variable). $c_t$ yields instantaneous utility $u(c_t)$, where $u(c_t)$ satisfies the usual assumptions, i.e., $u'(\cdot) > 0$ and $u''(\cdot) < 0$. Future consumption is discounted with discount factor $0 \leq \beta \leq 1$. The present value in period 0 of the whole consumption path equals $V$. (The time-separable lifetime utility, $V$, represents the criterion function of our dynamic optimization problem.) $T$ is the last day with consumption. In our story $T$ is 9 as today is 0. The cake size (which represents the state variable in our dynamic optimization problem) is denoted by $k$. A first constraint on the in-flight state variable requires that the cake size in period $t$ is the previous size less the previous consumption. The original size of the cake, $k_0$, is given. A final constraint requires that the cake size at the terminal date must be nonnegative.
The girl's problem now is to determine the optimal path of $c_t$. As it stands, this problem could be solved numerically, e.g. with the help of the solver in Excel. However, it may also be solved analytically. The remainder of this handout provides you with the most relevant tools to do so, namely optimal control and dynamic programming. It is written in the style of a cookbook and explicitly does not deal with the very advanced mathematics behind dynamic optimization.

¹ The following parable is taken from Kurt Schmidheiny and Manuel Wälti, Doing economics with the computer, Session 5.

This handout is mainly based on King [3], Barro and Sala-i-Martin [1], and Leonard and Van Long [2].
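As an illustration of the numerical route, the following sketch feeds the girl's problem to a generic constrained optimizer, much as one would do with the Excel solver. Log utility, $\beta = 0.95$, $k_0 = 1$ and $T = 9$ are assumptions chosen for this example, not values prescribed by the text.

    # A minimal numerical sketch of the cake-eating problem, assuming
    # u(c) = ln(c), beta = 0.95, k0 = 1 and T = 9 (ten days, t = 0,...,9).
    import numpy as np
    from scipy.optimize import minimize

    beta, k0, T = 0.95, 1.0, 9
    t = np.arange(T + 1)

    def neg_V(c):
        # criterion function V = sum_t beta^t u(c_t) with u(c) = ln(c)
        return -np.sum(beta**t * np.log(c))

    # constraint: total consumption must not exceed the cake (k_{T+1} >= 0)
    cons = {"type": "ineq", "fun": lambda c: k0 - np.sum(c)}
    bounds = [(1e-9, k0)] * (T + 1)
    res = minimize(neg_V, x0=np.full(T + 1, k0 / (T + 1)),
                   method="SLSQP", bounds=bounds, constraints=cons)

    print(res.x)          # optimal path: every day a bit less than the day before
    print(res.x.sum())    # the whole cake is eaten: the c_t sum to k0

The decreasing consumption path mirrors the girl's reasoning: with discounting, doing the same today as tomorrow cannot be optimal.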

2 Optimal control

2.1 Discrete time

We all know the standard method of optimization with constraints, the Kuhn-Tucker Theorem.² To solve a dynamic optimization problem we basically apply the same theorem.
2.1.1 Finite horizon

The typical problem that we want to solve takes the following form. An agent chooses or controls a number of control variables, $c_t$,³ so as to maximize an objective function subject to some constraints. These constraints are dynamic in that they describe the evolution of the state of the economy, as represented by a set of endogenous state variables which we denote by $k_t$. The evolution of these endogenous state variables is affected through the economic agent's choice of the control variables; they are also influenced by the variation in some exogenous state variables, $x_t$.
At the heart of this dynamic system are the equations describing the dynamic behavior of the states; they take the form

\[ k_{t+1} - k_t = g(c_t, k_t, x_t) \]

We write these so-called accumulation equations as involving changes in the state variables because this eases the conversion to continuous time in our discussion below. We assume that the initial values of the state variables, $k_0$, are given. Moreover, there are terminal conditions on the state variables, which take the form

\[ k_{T+1} \geq \bar{k} \]

The criterion (or objective) function is assumed to be a discounted sequence of flow returns, $u(c_t, k_t, x_t)$, which can represent profits, utilities, and so on. It takes the form

\[ \sum_{t=0}^{T} \beta^t u(c_t, k_t, x_t) \]

This is a maximization problem with a (possibly large) number of choice variables subject to a (possibly large) number of constraints, most of which are equality constraints. Accordingly we can form the Lagrangian

\[ L = \sum_{t=0}^{T} \beta^t u(c_t, k_t, x_t) + \sum_{t=0}^{T} \beta^t \lambda_t \left[ g(c_t, k_t, x_t) + k_t - k_{t+1} \right] + \beta^{T+1} \lambda_{T+1} \left[ k_{T+1} - \bar{k} \right] \]

where $\lambda_t$ denotes current valued multipliers (or current valued co-state variables). With the choice of present valued multipliers, $\mu_t$, the Lagrangian takes the form

\[ L = \sum_{t=0}^{T} \beta^t u(c_t, k_t, x_t) + \sum_{t=0}^{T} \mu_t \left[ g(c_t, k_t, x_t) + k_t - k_{t+1} \right] + \mu_{T+1} \left[ k_{T+1} - \bar{k} \right] \]

To change from one concept to the other we use the following transformation of the co-states

\[ \beta^t \lambda_t = \mu_t, \qquad \beta^{T+1} \lambda_{T+1} = \mu_{T+1} \]

² If not: an excellent treatment can be found in the mathematical appendix to Mas-Colell, Whinston, and Green (1995).
³ $c_t$ can be considered as a vector.

Let's look at the case of current valued multipliers and let's write $\partial u(c_t, k_t, x_t)/\partial c_t$ as $\partial u_t/\partial c_t$, etc. For the FOCs we derive the Lagrangian with respect to the controls, $c_t$, the states and the co-states:

\[ \frac{\partial L}{\partial c_t} = 0 = \beta^t \frac{\partial u_t}{\partial c_t} + \beta^t \lambda_t \frac{\partial g_t}{\partial c_t} \tag{1} \]

\[ \frac{\partial L}{\partial k_{t+1}} = 0 = -\beta^t \lambda_t + \beta^{t+1} \lambda_{t+1} \left[ \frac{\partial g_{t+1}}{\partial k_{t+1}} + 1 \right] + \beta^{t+1} \frac{\partial u_{t+1}}{\partial k_{t+1}} \tag{2} \]

\[ \frac{\partial L}{\partial k_{T+1}} = 0 = -\beta^T \lambda_T + \beta^{T+1} \lambda_{T+1} \tag{3} \]

\[ \frac{\partial L}{\partial (\beta^t \lambda_t)} = 0 = g(c_t, k_t, x_t) + k_t - k_{t+1} \tag{4} \]

The conditions in (1) and (4) hold for $t = 0, 1, \ldots, T$, while those in (2) hold for $t = 0, 1, \ldots, T-1$. Moreover, the complementary slackness conditions have to be met:

\[ \frac{\partial L}{\partial (\beta^{T+1} \lambda_{T+1})} = k_{T+1} - \bar{k} \geq 0 \]

\[ \beta^{T+1} \lambda_{T+1} \, \frac{\partial L}{\partial (\beta^{T+1} \lambda_{T+1})} = \beta^{T+1} \lambda_{T+1} \left[ k_{T+1} - \bar{k} \right] = 0 \]

The complementary slackness conditions say that if the value of a given state variable at the terminal date is positive (i.e., $k_{T+1} - \bar{k} > 0$), then its current valued shadow price must be zero. Alternatively, if its current valued shadow price at the terminal date is positive, then the agent must leave $k_{T+1} - \bar{k} = 0$.
Application: Cake eating problem in discrete time. Consider the following simplified version of the cake eating problem mentioned in Section 1. The girl chooses $c_0, c_1, \ldots, c_T$ that maximize

\[ \sum_{t=0}^{T} u(c_t) \]

where $u(c_t)$ satisfies $u'(\cdot) > 0$ and $u''(\cdot) < 0$, subject to

\[ k_{t+1} - k_t = -c_t \]
\[ k_0 \ \text{given} \]
\[ k_{T+1} \geq 0 \]

Note that in contrast to the story above we assume here that the girl isn't impatient, so that $\beta = 1$ (i.e., there is no discounting). There is one control ($c_t$), one endogenous state ($k_t$), and no exogenous state variable. Furthermore, $u(c_t, k_t, x_t) = u(c_t)$ and $g(c_t, k_t, x_t) = -c_t$. Thus, the optimality conditions are given by
\[ \frac{\partial L}{\partial c_t}: \quad 0 = \frac{\partial u_t}{\partial c_t} - \lambda_t \]

\[ \frac{\partial L}{\partial k_{t+1}}: \quad 0 = -\lambda_t + \lambda_{t+1} \]

\[ \frac{\partial L}{\partial k_{T+1}}: \quad 0 = -\lambda_T + \lambda_{T+1} \]

\[ \frac{\partial L}{\partial \lambda_t}: \quad 0 = -c_t + k_t - k_{t+1} \]

\[ \frac{\partial L}{\partial \lambda_{T+1}}: \quad k_{T+1} - \bar{k} \geq 0 \]

\[ \lambda_{T+1} \frac{\partial L}{\partial \lambda_{T+1}} = \lambda_{T+1} \left[ k_{T+1} - \bar{k} \right] = 0 \]

We end up with a set of non-linear (difference) equations. We have to solve this system choosing a value of $\lambda_0$ (or, equivalently, of $c_0$) so that $k_{T+1} - \bar{k} = 0$ (here $\bar{k} = 0$), since $\lambda_{T+1} > 0$ due to $\partial u_t/\partial c_t > 0$ for any finite $c_t$. The most convenient way to do this is to write the above set of difference equations in the form of a nonlinear state space system (or, alternatively, as an approximated linear state space system).
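To make this concrete, here is a small sketch that treats the conditions above as a difference-equation system and shoots on $c_0$: guess $c_0$, iterate the Euler equation $\lambda_{t+1} = \lambda_t$ together with the accumulation equation forward, and adjust $c_0$ until $k_{T+1} = 0$. Log utility, $k_0 = 1$ and $T = 9$ are assumed for illustration; with $\beta = 1$ the procedure simply recovers the equal split $c_t = k_0/(T+1)$.

    # Shooting on c_0 for the undiscounted cake problem, assuming u(c) = ln(c),
    # k0 = 1 and T = 9.
    from scipy.optimize import brentq

    k0, T = 1.0, 9

    def terminal_cake(c0):
        # with beta = 1 the Euler equation keeps lambda_t, and hence
        # consumption, constant; iterate the accumulation equation forward
        lam = 1.0 / c0                 # lambda_0 = u'(c_0) for u(c) = ln(c)
        k = k0
        for t in range(T + 1):
            c = 1.0 / lam              # invert u'(c_t) = lambda_t
            k = k - c                  # k_{t+1} = k_t - c_t
        return k                       # k_{T+1}, to be driven to zero

    c0 = brentq(terminal_cake, 1e-6, k0)   # choose c_0 so that k_{T+1} = 0
    print(c0, k0 / (T + 1))                # both equal 0.1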
2.1.2 Infinite horizon

Most models considered in economics involve economic agents with infinite planning horizons. The typical problem takes the form

\[ \max_{c_t} \sum_{t=0}^{\infty} \beta^t u(c_t, k_t, x_t) \]

subject to

\[ k_{t+1} - k_t = g(c_t, k_t, x_t) \]
\[ k_0 \ \text{given} \]
\[ \lim_{t \to \infty} \beta^t \lambda_t k_t = 0 \]

where $\lambda_t$, as before, is the current valued multiplier. The terminal condition now says that $k_t$ can be negative and grow forever in magnitude, as long as its rate of growth is less than the rate at which $\beta^t \lambda_t$ converges to zero; it is called the transversality condition.⁴ Benveniste and Scheinkman (1979)⁵ have shown that if

- either there is discounting
- or utility is finite,

then in an infinite horizon dynamic optimization problem the transversality condition is necessary.

⁴ An alternative form of the transversality condition, called Michel's condition, is given by $\lim_{t \to \infty} \beta^t V_t = 0$.
⁵ Benveniste, L.M. and J.A. Scheinkman (1979), On the Differentiability of the Value Function in Dynamic Models of Economics, Econometrica, 47(3), May, pp. 727-732.

With the choice of the current valued multipliers, the Lagrangian becomes

\[ L = \sum_{t=0}^{\infty} \beta^t u(c_t, k_t, x_t) + \sum_{t=0}^{\infty} \beta^t \lambda_t \left[ g(c_t, k_t, x_t) + k_t - k_{t+1} \right] \]

The optimality conditions are given by the FOCs

\[ \frac{\partial L}{\partial c_t}: \quad 0 = \beta^t \frac{\partial u_t}{\partial c_t} + \beta^t \lambda_t \frac{\partial g_t}{\partial c_t} \]

\[ \frac{\partial L}{\partial k_{t+1}}: \quad 0 = -\beta^t \lambda_t + \beta^{t+1} \lambda_{t+1} \left[ \frac{\partial g_{t+1}}{\partial k_{t+1}} + 1 \right] + \beta^{t+1} \frac{\partial u_{t+1}}{\partial k_{t+1}} \]

\[ \frac{\partial L}{\partial (\beta^t \lambda_t)}: \quad 0 = g(c_t, k_t, x_t) + k_t - k_{t+1} \]

plus the initial condition and the transversality condition

\[ k_0 \ \text{given}, \qquad \lim_{t \to \infty} \beta^t \lambda_t k_t = 0 \]

Application: The neoclassical growth model. The benevolent social planner's problem is

\[ \max \sum_{t=0}^{\infty} \beta^t u(C_t, 1 - N_t) \]

\[ \text{s.t.} \quad K_{t+1} = A F(K_t, N_t) - C_t + (1 - \delta) K_t \]

A solution to this problem exists because we are maximizing a continuous function over a compact set. The solution is unique since we are maximizing a concave function over a ($\infty$-dimensional) convex set. A Lagrangian can be formed

\[ L = \sum_{t=0}^{\infty} \beta^t u(C_t, 1 - N_t) + \sum_{t=0}^{\infty} \beta^t \lambda_t \left[ A F(K_t, N_t) - C_t + (1 - \delta) K_t - K_{t+1} \right] \]

The optimality conditions are given by the FOCs (due to the convexity of the problem these are also sufficient)

\[ \frac{\partial L}{\partial C_t}: \quad 0 = \beta^t u_1(C_t, 1 - N_t) - \beta^t \lambda_t \]

\[ \frac{\partial L}{\partial N_t}: \quad 0 = \beta^t u_2(C_t, 1 - N_t) - \beta^t \lambda_t A F_2(K_t, N_t) \]

\[ \frac{\partial L}{\partial K_{t+1}}: \quad 0 = -\beta^t \lambda_t + \beta^{t+1} \lambda_{t+1} \left[ A F_1(K_{t+1}, N_{t+1}) + (1 - \delta) \right] \]

\[ \frac{\partial L}{\partial \lambda_t}: \quad 0 = A F(K_t, N_t) - C_t + (1 - \delta) K_t - K_{t+1} \]

plus the boundary conditions, which are given by the initial capital stock, $K_0$, and the transversality condition, $\lim_{t \to \infty} \beta^t \lambda_t K_{t+1} = 0$.

We end up with a set of non-linear (difference) equations. For an illustration of how such a system can be solved see e.g. Macroeconomics II, Summary 3 (Economic growth), Section 3.⁶

⁶ http://www-vwi.unibe.ch/amakro/Lectures/macroii/macro.htm
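For a concrete feel of what these difference equations imply, the fragment below computes the model's steady state under standard functional-form assumptions that are not part of the text: Cobb-Douglas production $A F(K, N) = A K^{\alpha} N^{1-\alpha}$, utility $u(C, 1-N) = \ln C + \theta \ln(1-N)$, and illustrative parameter values. In a steady state the FOC for $K_{t+1}$ collapses to $1 = \beta \left[ A F_1(K^*, N^*) + 1 - \delta \right]$.

    # Steady state of the discrete-time growth model, assuming Cobb-Douglas
    # production A*K^alpha*N^(1-alpha), utility ln(C) + theta*ln(1-N), and
    # illustrative parameters; none of these choices come from the handout.
    from scipy.optimize import fsolve

    A, alpha, beta, delta, theta = 1.0, 0.33, 0.95, 0.1, 1.8

    def steady_state(z):
        K, N, C = z
        f  = A * K**alpha * N**(1 - alpha)
        f1 = alpha * f / K            # A*F_1: marginal product of capital
        f2 = (1 - alpha) * f / N      # A*F_2: marginal product of labor
        return [
            beta * (f1 + 1 - delta) - 1,     # FOC for K_{t+1} in steady state
            theta * C / (1 - N) - f2,        # FOC for N_t combined with FOC for C_t
            f - C - delta * K,               # resource constraint with K_{t+1} = K_t
        ]

    K, N, C = fsolve(steady_state, x0=[1.0, 0.3, 0.5])
    print(K, N, C)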

2.2 Continuous time

We use the Kuhn-Tucker Theorem with an infinitesimally short time period. This application is called the Maximum Principle of Pontryagin.

2.2.1 Finite horizon

The maximization problem now looks like

\[ \max \int_0^T u[c(t), k(t), x(t)] \, e^{-\rho t} \, dt \tag{5} \]

s.t.

\[ \frac{d}{dt} k(t) = g[c(t), k(t), x(t)] \]
\[ k(0) = k_0 > 0 \ \text{given} \]
\[ k(T) \geq \bar{k} \]

As in the discrete time setting, equation (5) is called the criterion (or objective) function. The expression for $\frac{d}{dt} k(t)$ is called the accumulation (or transition) equation. Next we have the initial condition and the final constraint. For simplicity let's assume that there is just one control, one endogenous state, and one exogenous state, although a multitude of these variables could readily be included.
Digression on the discount rate in continuous time. The discount factor in discrete time is $\beta^t$, where $\beta = \frac{1}{1+\rho}$. $\rho$ denotes the rate of time preference; it expresses the impatience of an economic agent. In continuous time, $\beta$ needs to be broken down to $\rho$ since time does not jump in units of 1 anymore. This can be done as follows

\[ \beta^t = e^{t \ln \beta} = e^{\left( \ln \frac{1}{1+\rho} \right) t} = e^{[\ln 1 - \ln(1+\rho)] t} = e^{[0 - \ln(1+\rho)] t} \approx e^{-\rho t} \]

since $\ln(1+x) \approx x$ for small $x$.
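A quick numerical check of this approximation, with an illustrative $\rho = 0.05$ and $t = 10$ (values chosen here for the example, not taken from the text):

    # Compare the discrete-time discount factor beta^t with its continuous-time
    # counterpart e^(-rho*t), assuming rho = 0.05 and t = 10 for illustration.
    import math

    rho, t = 0.05, 10
    beta = 1 / (1 + rho)
    print(beta**t, math.exp(-rho * t))   # 0.6139... vs 0.6065..., close for small rho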
The present value Hamiltonian H. To apply the Maximum Principle we use the following cookbook procedure.⁷

1. Construct the present value Hamiltonian $H$

\[ H = u[c(t), k(t), x(t)] \, e^{-\rho t} + \mu(t) \, g[c(t), k(t), x(t)] \]

where $\mu(t)$ again is called the present valued co-state variable or multiplier.

2. Take the derivative of the Hamiltonian w.r.t. the control variable and set it to 0

\[ \frac{\partial H}{\partial c(t)} = 0 \tag{6} \]

3. Take the derivative of the Hamiltonian w.r.t. the state variable (the variable that appears in the differential equation above) and set it to equal the negative of the derivative of the multiplier w.r.t. time

\[ \frac{\partial H}{\partial k(t)} = -\frac{d}{dt} \mu(t) \tag{7} \]

4. Take the derivative of the Hamiltonian w.r.t. the co-state variable and set it to the derivative of the state variable w.r.t. time

\[ \frac{\partial H}{\partial \mu(t)} = \frac{d}{dt} k(t) \tag{8} \]

5. Transversality condition: Set the product of the shadow price and the state variable at the end of the planning horizon to 0⁸

\[ \mu(T) \left[ k(T) - \bar{k} \right] = 0 \]

If we combine equations (6) and (7) with equation (8) (which represents nothing else than the transition equation) then we can form a system of two differential equations in the variables $\mu$ and $k$. The final step is to find a solution to this differential equation system. For an illustrative example compare Barro and Sala-i-Martin [1], Appendix on mathematical methods, section 1.3.

⁷ For a heuristic derivation see Barro and Sala-i-Martin [1].
⁸ Generally, the transversality condition implies the following behavior for the co-state:
   Case 1: $k(T) = \bar{k}$: $\mu(T)$ free
   Case 2: $k(T) \geq \bar{k}$: $\mu(T) \geq 0$, with $\mu(T) = 0$ if $k(T) > \bar{k}$
   Case 3: $k(T)$ free: $\mu(T) = 0$
   with identical conditions for $k(0)$.

The current value Hamiltonian $\hat{H}$.

1. Construct the following current value Hamiltonian $\hat{H}$

\[ \hat{H} = H e^{\rho t} = u[c(t), k(t), x(t)] + \lambda(t) \, g[c(t), k(t), x(t)] \]
\[ \lambda(t) = \mu(t) \, e^{\rho t} \]

2. Take the derivative of the Hamiltonian w.r.t. the control variable and set it to 0

\[ \frac{\partial \hat{H}}{\partial c(t)} = 0 \tag{9} \]
3. Take the derivative of the Hamiltonian w.r.t. the state variable (the variable that appears in the differential equation above) minus $\rho \lambda(t)$, and set the sum of the two terms to equal the negative of the derivative of the multiplier w.r.t. time

\[ \frac{\partial \hat{H}}{\partial k(t)} - \rho \lambda(t) = -\frac{d}{dt} \lambda(t) \tag{10} \]

where $\rho > 0$ (otherwise multiply it by $(-1)$).


4. Take the derivative of the Hamiltonian w.r.t. the co-state variable and set it to the derivative of the state variable w.r.t. time

\[ \frac{\partial \hat{H}}{\partial \lambda(t)} = \frac{d}{dt} k(t) \]

5. Transversality condition: Set the product of the shadow price and the state variable at the end of the planning horizon to 0⁹

\[ \lambda(T) \left[ k(T) - \bar{k} \right] = 0 \]

⁹ Generally, the transversality condition implies the following behavior for the co-state:
   Case 1: $k(T) = \bar{k}$: $\lambda(T)$ free
   Case 2: $k(T) \geq \bar{k}$: $\lambda(T) \geq 0$, with $\lambda(T) = 0$ if $k(T) > \bar{k}$
   Case 3: $k(T)$ free: $\lambda(T) = 0$
   with identical conditions for $k(0)$.

If we combine equations (9) and (10) with the transition equation then we can form a system of two differential equations in the variables $\lambda$ and $k$. The final step is to find a solution to this differential equation system.
Application: The cake eating problem in continuous time. Consider a household who has an initial stock of cake, $k(0)$, which can be consumed over the continuous interval $0 \leq t \leq T$. The consumption of the cake generates utility $u[c(t)]$. The stock of the cake evolves through time as

\[ \frac{d}{dt} k(t) = -c(t) \]

and the terminal condition is

\[ k(T) \geq \bar{k} \]

where $\bar{k}$ is some positive amount of cake. Suppose that the household values cake consumption according to the utility expression

\[ \int_0^T u[c(t)] \, dt \]

Note that in contrast to the story in Section 1 we assume here that the household is not impatient and, hence, $e^{-\rho t} = e^{-0t} = 1$. To solve this problem let's make use of the cookbook procedure given above. Step 1 leads to the Hamiltonian

\[ H = u[c(t)] - \mu(t) \, c(t) \]

Step 2 leads to the condition

\[ \frac{\partial H}{\partial c(t)} = u'[c(t)] - \mu(t) = 0 \]

Step 3 leads to the condition

\[ \frac{\partial H}{\partial k(t)} = 0 = -\dot{\mu}(t) \]

where $\dot{\mu}(t) = \frac{d}{dt} \mu(t)$.

Step 4 leads to the condition

\[ \frac{\partial H}{\partial \mu(t)} = -c(t) = \frac{d}{dt} k(t) \]

Finally, the transversality condition (step 5) is given by

\[ \mu(T) \left[ k(T) - \bar{k} \right] = 0 \]
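The two conditions $\dot{\mu}(t) = 0$ and $\dot{k}(t) = -c(t)$ can be solved numerically by shooting on the constant co-state. The sketch below does this for log utility (so that $c(t) = 1/\mu$), with illustrative values $k(0) = 1$, $\bar{k} = 0.2$ and $T = 10$; these functional-form and parameter choices are assumptions for the example only.

    # Shooting on the (constant) co-state mu for the continuous-time cake problem,
    # assuming u(c) = ln(c), k(0) = 1, kbar = 0.2 and T = 10.
    from scipy.integrate import solve_ivp
    from scipy.optimize import brentq

    k0, kbar, T = 1.0, 0.2, 10.0

    def terminal_gap(mu):
        # with u'(c) = 1/c = mu, consumption is the constant c = 1/mu,
        # and dk/dt = -c is integrated over [0, T]
        sol = solve_ivp(lambda t, k: [-1.0 / mu], (0.0, T), [k0])
        return sol.y[0, -1] - kbar          # k(T) - kbar, to be driven to zero

    mu = brentq(terminal_gap, 1.0, 1e3)      # find mu so that k(T) = kbar
    print(1.0 / mu, (k0 - kbar) / T)         # constant consumption rate, both 0.08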
2.2.2 Infinite horizon with discounting

In case of an infinite horizon with discounting we can apply the same procedure as for the finite horizon except that we change the transversality condition to

\[ \lim_{t \to \infty} \mu(t) \, k(t) = 0 \qquad \text{resp.} \qquad \lim_{t \to \infty} e^{-\rho t} \lambda(t) \, k(t) = 0 \]

This means, again, that the value of the capital stock must be asymptotically 0, otherwise something valuable would be left over: If the quantity, $k(t)$, remains positive asymptotically, then the price, $\mu(t)$, must approach 0 asymptotically. If $k(t)$ grows forever at a positive rate then the price $\mu(t)$ must approach 0 at a faster rate so that the product, $\mu(t) k(t)$, goes to 0.
Application: The neoclassical growth model with fixed labor supply. Consider the following continuous time model

\[ \max U = \int_0^{\infty} e^{-\rho t} \, \frac{C(t)^{1-\theta}}{1-\theta} \, dt \]

\[ \text{s.t.} \quad \dot{K}(t) = A K(t)^{1-\alpha} N^{\alpha} - C(t) - \delta K(t) \]
\[ K(0) = K_0 > 0 \ \text{given} \]

To solve this problem let's make use of our cookbook procedure. Step 1 leads to the current value Hamiltonian

\[ \hat{H} = \frac{C(t)^{1-\theta}}{1-\theta} + \lambda(t) \left[ A K(t)^{1-\alpha} N^{\alpha} - C(t) - \delta K(t) \right] \]

Step 2 leads to the condition

\[ \frac{\partial \hat{H}}{\partial C} = C(t)^{-\theta} - \lambda(t) = 0 \]

Step 3 leads to the condition

\[ \frac{\partial \hat{H}}{\partial K} - \rho \lambda(t) = \lambda(t) \left[ (1-\alpha) A K(t)^{-\alpha} N^{\alpha} - \delta \right] - \rho \lambda(t) = -\dot{\lambda}(t) \]

where $\rho$ is supposed to be $> 0$.

Step 4 leads to the condition

\[ \frac{\partial \hat{H}}{\partial \lambda} = A K(t)^{1-\alpha} N^{\alpha} - C(t) - \delta K(t) = \dot{K}(t) \]

Finally, the transversality condition (step 5) is given by

\[ \lim_{t \to \infty} e^{-\rho t} \lambda(t) \, K(t) = 0 \]
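Combining steps 2 to 4 gives the familiar two-equation system $\dot{C}/C = \left[ (1-\alpha) A K^{-\alpha} N^{\alpha} - \delta - \rho \right]/\theta$ and $\dot{K} = A K^{1-\alpha} N^{\alpha} - C - \delta K$. As an illustrative sketch, the code below integrates this system and picks the initial consumption level by bisection so that the path heads towards the steady state rather than exploding; all parameter values are assumptions made for the example, not taken from the handout.

    # Saddle-path shooting for the continuous-time growth model with fixed labor,
    # assuming A = 1, N = 1, alpha = 0.67, theta = 2, rho = 0.04, delta = 0.05.
    from scipy.integrate import solve_ivp

    A, N, alpha, theta, rho, delta = 1.0, 1.0, 0.67, 2.0, 0.04, 0.05
    K_star = N * ((1 - alpha) * A / (rho + delta)) ** (1 / alpha)   # steady-state K
    K0, T = 0.5 * K_star, 100.0

    def rhs(t, y):
        K, C = y
        mpk = (1 - alpha) * A * K ** (-alpha) * N ** alpha
        return [A * K ** (1 - alpha) * N ** alpha - C - delta * K,  # accumulation
                C * (mpk - delta - rho) / theta]                    # Euler equation

    def leave(t, y):                       # stop if K leaves the band (0, 2 K*)
        return (y[0] - 1e-3) * (2 * K_star - y[0])
    leave.terminal = True

    lo, hi = 1e-3, A * K0 ** (1 - alpha) * N ** alpha               # bracket for C(0)
    for _ in range(50):                                             # bisection on C(0)
        C0 = 0.5 * (lo + hi)
        sol = solve_ivp(rhs, (0, T), [K0, C0], events=leave, max_step=0.25)
        K_end = sol.y[0, -1]
        lo, hi = (C0, hi) if K_end > K_star else (lo, C0)           # too little vs too much
    print(C0, K_star)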

2.3 Digression: Continuous versus discrete time

Given that the two methods are closely related, it is interesting to ask why they are both used. Continuous and discrete time differ in the following important ways:

- Phase planes: when a dynamic model is stated in continuous time one can study the qualitative dynamics (of the resulting system of differential equations) using certain graphical techniques that are not available in discrete time.

- Culture: some groups of economists learned one way and others learned another, with persisting differences.

- Solutions: whether a particular model is stated in continuous time or discrete time may lead to different solutions, i.e. there may be mathematical differences in the respective solutions.

- Closed form solutions: a closed form solution is a solution that can be arrived at by solving an equation or a set of equations, as opposed to the use of numerical methods. There are different forms of closed form solutions in discrete and continuous time. For example, in the basic growth model there is a discrete time closed form for the case with log utility and complete depreciation (see the sketch below). In the continuous time model, there is a closed form which has other restrictions.
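As an aside, the discrete-time closed form just mentioned can be checked numerically. With $u(c) = \ln c$, production $A k^{\alpha}$ and full depreciation, the optimal policy is known to be $k_{t+1} = \alpha \beta A k_t^{\alpha}$ and $c_t = (1 - \alpha \beta) A k_t^{\alpha}$, and the snippet verifies that this policy satisfies the Euler equation $1/c_t = \beta \, \alpha A k_{t+1}^{\alpha - 1} / c_{t+1}$ at arbitrary points. The parameter values are illustrative.

    # Check the discrete-time closed form for the growth model with log utility,
    # production A*k^alpha and full depreciation:
    # policy k' = alpha*beta*A*k^alpha, c = (1 - alpha*beta)*A*k^alpha.
    A, alpha, beta = 1.2, 0.3, 0.95

    def policy(k):
        y = A * k**alpha
        return alpha * beta * y, (1 - alpha * beta) * y   # (k_next, c)

    for k in (0.1, 0.5, 1.0, 2.0):
        k1, c0 = policy(k)
        k2, c1 = policy(k1)
        euler_gap = 1 / c0 - beta * alpha * A * k1**(alpha - 1) / c1
        print(k, euler_gap)        # zero up to floating point error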


3 Dynamic programming

In section 2 we considered discrete optimal control theory, which is based on the familiar Kuhn-Tucker Theorem. An alternative method of solving this type of problems is the dynamic programming approach.

Recall the typical problem (notation is the same as in section 2): Find $c_0, c_1, \ldots, c_T$ that maximize

\[ V = \sum_{t=0}^{T} \beta^t u(c_t, k_t, x_t) \tag{11} \]

subject to

\[ k_{t+1} - k_t = g(c_t, k_t, x_t) \]
\[ k_0 \ \text{given} \]
\[ k_{T+1} \geq 0 \]
Dynamic programming exploits two fundamental properties of this type of problems, namely separability and additivity over time periods. More precisely,

- for any $t$, the functions $u_t$ and $g_t$ depend on $t$ and on the state and control variables, but not on their past or future values;

- the maximand $V$ is the sum of the net momentary utilities.

Using these two properties, Bellman (1957) enunciates an important theorem about the nature of any optimal solution of problem (11). This theorem is known as the principle of optimality. Roughly speaking, it says that an optimal policy has the property that at any stage $t$, the remaining decisions $c_t, c_{t+1}, \ldots, c_T$ must be optimal with regard to the current state $k_t$, which results from the initial state $k_0$ and the earlier decisions $c_0, c_1, \ldots, c_{t-1}$. This property is obviously sufficient for optimality since we require it to hold for all $t$: when we put $t = 1$, we have the definition of an optimal policy. Furthermore, the property is also necessary, since any deviation from the optimal policy, even in the last period, is clearly suboptimal.

It was left to Bellman's genius to transform this rather trite, nearly tautological observation into an efficient method of solution. We now state the result formally.

3.1 Discrete time - deterministic setting

3.1.1 Finite horizon

The problem setup is given by

\[ V(k_t, \theta_t, a_t) = \max_{c_t, k_{t+1}} \left\{ u(c_t, k_t, x(\theta_t)) + \beta V(k_{t+1}, \theta_{t+1}, a_{t+1}) \right\} \tag{12} \]

s.t.

\[ k_{t+1} - k_t = g(c_t, k_t, x(\theta_t)) \tag{13} \]
\[ a_{t+1} = a_t + 1 \tag{14} \]
\[ x_t = x(\theta_t) \quad \text{with} \quad \theta_{t+1} = m(\theta_t) \tag{15} \]

(12) is the Bellman Equation, (13) is the accumulation equation, (14) is the age equation, and (15) gives the law of motion of the exogenous variable $x_t$ as a function of a set of exogenous state variables, $\theta_t$, that evolve according to the (possibly nonlinear) difference equation system $\theta_{t+1} = m(\theta_t)$. For an example compare the application of stochastic dynamic programming below.
Note that we have converted the many-period optimization problem given above into a two period optimization problem, which involves trading off between the current return $u(c_t, k_t, x(\theta_t))$ and the future value $V(k_{t+1}, \theta_{t+1}, a_{t+1})$.

To solve the problem, we begin at the terminal value and proceed by backward induction. This process is frequently called value iteration, as it involves taking an initial value function $V(k_{t+1}, \theta_{t+1}, a_{t+1})$, finding the optimal level of the right hand side of the Bellman equation at each $k_t$, $\theta_t$ and thereby constructing a new value function $V(k_t, \theta_t, a_t)$. We can also write now

\[ V(k, \theta, a) = \max_{c, k'} \left\{ u(c, k, x(\theta)) + \beta V(k', \theta', a') \right\} \]

s.t.

\[ k' - k = g(c, k, x(\theta)), \qquad a' = a + 1, \qquad \theta' = m(\theta) \]
On the Bellman Equation, we now apply the standard method:

\[ L = u(c, k, x(\theta)) + \beta V(k', \theta', a') + \lambda \left[ g(c, k, x(\theta)) + k - k' \right] \]

where $\lambda$ denotes the shadow price. The FOCs are

\[ 0 = \frac{\partial u(c, k, x(\theta))}{\partial c} + \lambda \frac{\partial g(c, k, x(\theta))}{\partial c} \]

\[ 0 = -\lambda + \beta \frac{\partial V(k', \theta', a')}{\partial k'} \]

\[ 0 = g(c, k, x(\theta)) + k - k' \]

The first FOC then is the derivative of $L$ with respect to the control, the second with respect to the state at time $t+1$, and the third with respect to the Lagrange multiplier, yielding the constraint of state accumulation. To get an expression for $\partial V(k', \theta', a')/\partial k'$ we need the envelope theorem (for a rigorous exposition compare the relevant literature)

\[ \frac{\partial V(k, \theta, a)}{\partial k} = \frac{\partial u(c, k, x(\theta))}{\partial k} + \lambda \left[ \frac{\partial g(c, k, x(\theta))}{\partial k} + 1 \right] \]
Before plugging in, we change the subscripts since we need $\partial V(k, \theta, a)/\partial k$ of the subsequent period.
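A compact numerical illustration of this backward induction for a finite-horizon problem (the cake-eating example again, with log utility, $\beta = 0.95$, $T = 9$ and a grid on the cake size; all of these choices are assumptions for the sketch, not part of the text):

    # Backward induction ("value iteration") for the finite-horizon cake problem,
    # assuming u(c) = ln(c), beta = 0.95, T = 9 and a grid on the cake size k.
    import numpy as np

    beta, T = 0.95, 9
    grid = np.linspace(1e-3, 1.0, 501)                 # admissible cake sizes
    V = np.zeros(len(grid))                            # V_{T+1}(k) = 0: leftovers are worthless

    for t in range(T, -1, -1):                         # t = T, T-1, ..., 0
        V_new = np.empty(len(grid))
        policy = np.empty(len(grid))
        for i, k in enumerate(grid):
            kprime = grid[grid <= k]                   # feasible next-period cake sizes
            c = k - kprime                             # consumption implied by each choice
            values = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf) \
                     + beta * np.interp(kprime, grid, V)
            j = np.argmax(values)
            V_new[i], policy[i] = values[j], c[j]
        V = V_new

    print(policy[-1])    # optimal c_0 when the girl starts with a whole cake (k_0 = 1)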
3.1.2 Infinite horizon

In an infinite time horizon setting, we replace the terminal condition in the typical problem above by the transversality condition

\[ \lim_{t \to \infty} \beta^t \lambda_t k_t = 0 \]

Apart from this new condition, infinite horizon optimization is not different from the finite horizon case. Bellman's principle of optimality at once tells us why. Consider any finite horizon subproblem with the initial and terminal conditions fixed by the larger problem. For the subproblem, the maximum principle conditions apply. But the initial and terminal times of the subproblem could be arbitrary, so the conditions must in fact hold for the entire range $(0, \infty)$.
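In practice the infinite-horizon Bellman equation is often solved by iterating on the value function until it converges, the infinite-horizon analogue of the backward induction sketched above. A minimal sketch for the infinite-horizon cake problem with log utility and $\beta = 0.9$, where the exact policy is known to be $c = (1-\beta)k$ so the iteration can be checked against it; these functional-form and parameter choices are assumptions for the example:

    # Value function iteration for the infinite-horizon cake problem, assuming
    # u(c) = ln(c) and beta = 0.9. The known closed form is c(k) = (1 - beta)*k.
    import numpy as np

    beta = 0.9
    grid = np.linspace(1e-3, 1.0, 401)
    V = np.zeros(len(grid))

    for it in range(2000):                              # iterate the Bellman operator
        V_new = np.empty(len(grid))
        policy = np.empty(len(grid))
        for i, k in enumerate(grid):
            kprime = grid[grid < k]                     # next-period cake (strictly smaller)
            if len(kprime) == 0:
                kprime = grid[:1]
            c = np.maximum(k - kprime, 1e-12)
            values = np.log(c) + beta * np.interp(kprime, grid, V)
            j = np.argmax(values)
            V_new[i], policy[i] = values[j], c[j]
        if np.max(np.abs(V_new - V)) < 1e-8:            # stop at (approximate) fixed point
            break
        V = V_new

    print(policy[-1], (1 - beta) * grid[-1])            # numerical vs exact c(k) at k = 1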
13

Application: Non-stochastic dynamic programming. We consider a non-stochastic baseline model of investment where firms face a perfectly elastic supply of capital goods and can adjust their capital stocks costlessly. Suppose that the profits of the firm can be written as

\[ p(\theta_t) f(k_t) - \phi(\theta_t)\, i_t \equiv u(i_t, k_t, \theta_t) \]

where $p(\theta_t)$ may be interpreted as an output price or a productivity shock; $f(k_t)$ is a positive, increasing and strictly concave production function (i.e. we abstract from labor within the scope of this model); $\phi(\theta_t)$ is the investment good price; and $i_t$ is the quantity of investment expenditure. The firm's capital accumulation will be described by

\[ k_{t+1} - k_t = i_t - d\, k_t \equiv g(i_t, k_t) \]

The Bellman equation for the infinite horizon problem is

\[ V(k_t, \theta_t) = \max_{i_t, k_{t+1}} \left\{ u(i_t, k_t, \theta_t) + \beta V(k_{t+1}, \theta_{t+1}) \right\} \]

where maximization takes place subject to $k_{t+1} - k_t = g(i_t, k_t)$ and the dynamic equations for the exogenous states follow $\theta_{t+1} = m(\theta_t)$.

Since this is a constrained optimization problem, we may form the Lagrangian

\[ L = \left\{ u(i_t, k_t, \theta_t) + \beta V(k_{t+1}, \theta_{t+1}) \right\} + \lambda_t \left[ g(i_t, k_t) - k_{t+1} + k_t \right] \]

The FOCs are

\[ i_t: \quad 0 = \frac{\partial u(i_t, k_t, \theta_t)}{\partial i_t} + \lambda_t \frac{\partial g(i_t, k_t)}{\partial i_t} \]

\[ k_{t+1}: \quad 0 = -\lambda_t + \beta \frac{\partial V(k_{t+1}, \theta_{t+1})}{\partial k_{t+1}} \]

\[ \lambda_t: \quad 0 = g(i_t, k_t) - k_{t+1} + k_t \]

Specifically, the first FOC is $-\phi(\theta_t) + \lambda_t = 0$ or, equivalently, $\lambda_t = \phi(\theta_t)$.

To get an expression for $\partial V(k_{t+1}, \theta_{t+1})/\partial k_{t+1}$ we need the envelope theorem

\[ \frac{\partial V(k_t, \theta_t)}{\partial k_t} = \frac{\partial u(i_t, k_t, \theta_t)}{\partial k_t} + \lambda_t \left[ \frac{\partial g(i_t, k_t)}{\partial k_t} + 1 \right] \]

where the second term on the RHS is the derivative of the budget constraint with respect to $k_t$. $\partial V(k_{t+1}, \theta_{t+1})/\partial k_{t+1}$ is computed by updating the resulting expression to $t+1$. This yields

\[ \frac{\partial V(k_{t+1}, \theta_{t+1})}{\partial k_{t+1}} = p(\theta_{t+1}) \frac{\partial f(k_{t+1})}{\partial k_{t+1}} + \lambda_{t+1} (1 - d) \]
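Combining the first two FOCs with the envelope condition gives the investment Euler equation $\phi(\theta_t) = \beta \left[ p(\theta_{t+1}) f'(k_{t+1}) + \phi(\theta_{t+1}) (1-d) \right]$. With constant prices this pins down a stationary capital stock; the snippet below solves it for an assumed $f(k) = k^{\alpha}$ and illustrative parameter values.

    # Stationary capital stock implied by the investment Euler equation, assuming
    # f(k) = k^alpha and constant prices p and phi; parameter values are illustrative.
    alpha, beta, d, p, phi = 0.3, 0.95, 0.1, 1.0, 1.0

    # phi = beta * (p * alpha * k**(alpha - 1) + phi * (1 - d)), solved for k:
    k_star = (alpha * beta * p / (phi * (1 - beta * (1 - d)))) ** (1 / (1 - alpha))
    print(k_star)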
Application: Stochastic dynamic programming. Dynamic programming is particularly well suited to optimization problems that combine time and uncertainty.

Consider a representative agent with preferences over consumption, $c$, and leisure, $l$, that are represented by the expected utility function

\[ E_0 \sum_{t=0}^{\infty} \beta^t u(c_t, l_t) \]

The momentary utility, $u(c_t, l_t)$, satisfies standard assumptions.

The representative agent faces three constraints at each date. The first says that the total uses of output for consumption and investment do not exceed total output, which is produced from capital and labor via a production function that is shifted by productivity shocks, $a_t$

\[ c_t + i_t = a_t f(k_t, n_t) \]

The second says that labor plus leisure is equal to the time endowment (normalized to unity)

\[ n_t + l_t = 1 \]

The third says that the future capital stock, $k_{t+1}$, evolves according to the net result of investment and depreciation

\[ k_{t+1} - k_t = i_t - d\, k_t \]

Note that the capital stock at date $t$, $k_t$, is a predetermined variable, i.e. the result of prior investment decisions.

To study the representative agent's optimality decisions, it is necessary to be explicit about how the exogenous variable, $a_t$, evolves through time. The general approach is to assume that $a_t$ is a function of a variable $\theta_t$, i.e. $a_t = a(\theta_t)$. We also assume that $\theta_t$ is a Markov process, so that the conditional distribution of $\theta_{t+1}$ depends only on $\theta_t$ and not on any additional past history. We call $\theta_t$ the exogenous state variable of the model. For instance, assume that $a_t = a_{t-1}^{\rho}\, \varepsilon_t$, where $\varepsilon_t$ is white noise.

We are now ready to set up the dynamic program solved by the representative agent:

\[ V(k_t, \theta_t) = \max_{i_t, k_{t+1}} \left\{ u(c_t, l_t) + \beta E_t V(k_{t+1}, \theta_{t+1} | \theta_t) \right\} \]

subject to

\[ \lambda_{1,t}: \quad 0 = a_t f(k_t, n_t) - c_t - i_t \equiv g_1[c_t, i_t, n_t, k_t, a(\theta_t)] \]
\[ \lambda_{2,t}: \quad 0 = 1 - n_t - l_t \equiv g_2[n_t, l_t] \]
\[ \lambda_{3,t}: \quad k_{t+1} - k_t = i_t - d\, k_t \equiv g_3[i_t, k_t] \]

Since this is a constrained optimization problem, we may form the Lagrangian

\[ L = \left\{ u(c_t, l_t) + \beta E_t V(k_{t+1}, \theta_{t+1}) \right\} + \lambda_{1,t} \left[ a_t f(k_t, n_t) - c_t - i_t \right] + \lambda_{2,t} \left[ 1 - n_t - l_t \right] + \lambda_{3,t} \left[ i_t - d\, k_t - k_{t+1} + k_t \right] \]

The derivation of the FOCs for the two control variables $c_t$ and $l_t$ raises no problems

\[ c_t: \quad 0 = \frac{\partial u(c_t, l_t)}{\partial c_t} - \lambda_{1,t} \]
\[ l_t: \quad 0 = \frac{\partial u(c_t, l_t)}{\partial l_t} - \lambda_{2,t} \]

The FOC for the control variable $n_t$ is derived as follows. Recall that in general the FOC for a control variable is given by¹⁰

\[ \frac{\partial u(c_t, k_t, x_t)}{\partial c_t} + \lambda_t \frac{\partial g(c_t, k_t, x_t)}{\partial c_t} = 0 \]

In the case at hand, the control variable $n_t$ does not show up in the momentary utility of the representative agent, $u(c_t, l_t)$. Moreover, $n_t$ appears in accumulation equations $g_1[c_t, i_t, n_t, k_t, a(\theta_t)]$ and $g_2[n_t, l_t]$. It follows that the FOC is given by

\[ 0 + \lambda_{1,t} \frac{\partial g_1[c_t, i_t, n_t, k_t, a(\theta_t)]}{\partial n_t} + \lambda_{2,t} \frac{\partial g_2[n_t, l_t]}{\partial n_t} = 0 \]

or, more specifically,

\[ n_t: \quad 0 = \lambda_{1,t}\, a_t \frac{\partial f(k_t, n_t)}{\partial n_t} - \lambda_{2,t} \]

A similar logic applies to the control variable $i_t$, which appears in accumulation equations $g_1[c_t, i_t, n_t, k_t, a(\theta_t)]$ and $g_3[i_t, k_t]$ (but not in the momentary utility function). The FOC is given by

\[ i_t: \quad 0 = -\lambda_{1,t} + \lambda_{3,t} \]
In general the FOC for a state variable is given by¹¹

\[ -\lambda_t + \beta E_t \left[ \frac{\partial V(k_{t+1}, \theta_{t+1})}{\partial k_{t+1}} \right] = 0 \]

To get an expression for $\partial V(k_{t+1}, \theta_{t+1})/\partial k_{t+1}$ we need the envelope theorem:

\[ \frac{\partial V(k_t, \theta_t)}{\partial k_t} = \frac{\partial u(c_t, k_t, x_t)}{\partial k_t} + \lambda_t \left[ \frac{\partial g(c_t, k_t, x_t)}{\partial k_t} + 1 \right] \]

In the case at hand, the state variable $k_t$ does not show up in the momentary utility function of the representative household, $u(c_t, l_t)$. Moreover, $k_t$ appears in accumulation equations $g_1[c_t, i_t, n_t, k_t, a(\theta_t)]$ and $g_3[i_t, k_t]$. Thus,

\[ \frac{\partial V(k_t, \theta_t)}{\partial k_t} = 0 + \lambda_{1,t} \frac{\partial g_1[c_t, i_t, n_t, k_t, a(\theta_t)]}{\partial k_t} + \lambda_{3,t} \left[ \frac{\partial g_3[i_t, k_t]}{\partial k_t} + 1 \right] \]

¹⁰ Be aware of the following point: The general function $u(c_t, k_t, x(\theta_t))$ and the momentary utility function of the problem at hand, $u(c_t, l_t)$, use the same notation. Also, in the general problem $c_t$ stands for control variables and $k_t$ stands for state variables, whereas in the problem at hand $c_t$ denotes consumption (just one of several control variables) and $k_t$ denotes physical capital (the only endogenous state variable).
¹¹ Compare the previous footnote.

Why has the term $+1$ been skipped in the expression $\lambda_{1,t}\, \partial g_1[c_t, i_t, n_t, k_t, a(\theta_t)]/\partial k_t$? Well, as you can see above, $k_t$ does not show up on the LHS of accumulation equation 1. Changing the subscripts and substituting yields

\[ k_{t+1}: \quad 0 = -\lambda_{3,t} + \beta E_t \left[ \lambda_{1,t+1}\, a_{t+1} \frac{\partial f(k_{t+1}, n_{t+1})}{\partial k_{t+1}} + \lambda_{3,t+1} (1 - d) \right] \]
The last three FOCs are

\[ \lambda_{1,t}: \quad 0 = a_t f(k_t, n_t) - c_t - i_t \]
\[ \lambda_{2,t}: \quad 0 = 1 - n_t - l_t \]
\[ \lambda_{3,t}: \quad 0 = i_t - d\, k_t - k_{t+1} + k_t \]
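To see how the expectation $E_t V(k_{t+1}, \theta_{t+1})$ is handled in practice, here is a small value-function-iteration sketch for a stripped-down stochastic version of the model with fixed labor ($n_t = 1$), log utility in consumption only, $f(k, n) = k^{\alpha}$, full depreciation and a two-state Markov chain for the productivity shock $a_t$. All functional forms, the transition matrix and the parameter values are assumptions made for this illustration.

    # Stochastic value function iteration with a two-state Markov productivity
    # shock, assuming u(c) = ln(c), f(k, n) = k^alpha with n fixed at 1, full
    # depreciation and beta = 0.95; shock values and transitions are illustrative.
    import numpy as np

    beta, alpha = 0.95, 0.3
    a_vals = np.array([0.9, 1.1])                  # low and high productivity states
    P = np.array([[0.8, 0.2],                      # transition probabilities P[i, j]
                  [0.2, 0.8]])
    kgrid = np.linspace(0.05, 0.5, 200)
    V = np.zeros((2, len(kgrid)))                  # V[shock state, capital]

    for it in range(1000):
        EV = P @ V                                 # E_t V(k', theta') for each current state
        V_new = np.empty_like(V)
        for s, a in enumerate(a_vals):
            for i, k in enumerate(kgrid):
                y = a * k**alpha                   # resources available this period
                c = y - kgrid                      # consumption implied by each k' choice
                values = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf) \
                         + beta * EV[s]
                V_new[s, i] = values.max()
        if np.max(np.abs(V_new - V)) < 1e-7:
            break
        V = V_new

    print(it, V[:, 100])                           # iterations used and V at a mid-grid k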

3.2 Continuous time

To be written.

References
[1] Barro, R.J. and X. Sala-i-Martin (1995), Economic growth, McGraw-Hill.
[2] Leonard, D. and N. Van Long (1992), Optimal control theory and static optimization in economics, Cambridge University Press.
[3] King, R.G. (?), Notes on dynamic optimization, handout.

