Bayesian DSGE-VAR Model Analysis
Frank Schorfheide
Department of Economics, University of Pennsylvania
Frank Schorfheide, University of Pennsylvania: Bayesian Methods 2
• Compare fit of DSGE model to that of a VAR based on marginal data densities.
• Mechanics are non-trivial: under a very diffuse prior for the VAR coefficients, the marginal data density of the VAR can be driven arbitrarily low.
– DSGE-VARs: Del Negro and Schorfheide (2004, 2005), Del Negro, Schorfheide, Smets, and Wouters (2005).
• Write VAR as Y = XΦ + U , where Y is T × n and X is T × k.
• Solution: tilt estimates toward a point in the parameter space. Example: Minnesota prior.
Example
• MLE of µ:
  µ̂_ML = (1/n) Σ_{t=1}^{n} y_t .
Example
• MSE of MLE:
  IE_µ[(µ − µ̂_ML)²] = 0²  +  1/n .
                     (Bias²)  (Variance)
• MSE of Bayes Estimator:
  IE_µ[(µ − µ̂_B)²] = [ (1/τ²)/(n + 1/τ²) ]² µ²  +  n/(n + 1/τ²)² .
                     (Bias²)                       (Variance)
• If µ² is small, i.e., if the discrepancy between the a priori expected value and the “true” mean is small, then the Bayes estimator attains a smaller MSE than the MLE.
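The bias–variance tradeoff above is easy to verify by simulation. A minimal sketch, assuming y_t ~ iid N(µ, 1) with prior µ ~ N(0, τ²); the values of n, τ², and µ below are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau2, mu_true = 20, 0.5, 0.3      # illustrative values, not from the slides
reps = 200_000

y = rng.normal(mu_true, 1.0, size=(reps, n))   # y_t ~ iid N(mu, 1)
mu_ml = y.mean(axis=1)                         # MLE: sample mean
mu_b = n * mu_ml / (n + 1 / tau2)              # posterior mean under mu ~ N(0, tau2)

mse_ml = np.mean((mu_ml - mu_true) ** 2)
mse_b = np.mean((mu_b - mu_true) ** 2)

# Analytical MSEs from the slides: 1/n for the MLE,
# squared bias plus variance for the Bayes estimator.
mse_ml_theory = 1 / n
mse_b_theory = ((1 / tau2) / (n + 1 / tau2)) ** 2 * mu_true ** 2 \
    + n / (n + 1 / tau2) ** 2
```

With µ² small relative to the sampling variance, the simulated MSE of the Bayes estimator comes out below that of the MLE, matching the discussion above.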
• Let IE^D_θ[·] be the expectation under the DSGE model and define the autocovariance matrices
  Γ_XX(θ) = IE^D_θ[x_t x_t′],  Γ_XY(θ) = IE^D_θ[x_t y_t′].
• Replace sample moments Y∗′Y∗ by IE^D_θ[Y∗′Y∗] = λT Γ_YY(θ), etc.
• Define
  Φ∗(θ) = Γ_XX(θ)^{−1} Γ_XY(θ),  Σ∗(θ) = Γ_YY(θ) − Γ_YX(θ) Γ_XX(θ)^{−1} Γ_XY(θ). (4)
• Prior distribution:
  Σ | θ ∼ IW( λT Σ∗(θ), λT − k, n ),  (5)
  Φ | Σ, θ ∼ N( Φ∗(θ), [ λT Σ^{−1} ⊗ Γ_XX(θ) ]^{−1} ).
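A sketch of drawing from this prior in Python. The population moments below are stand-ins for the model-implied Γ·(θ); the dimensions and λ are illustrative assumptions:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(1)
k, n, lam, T = 3, 2, 1.0, 120        # illustrative dimensions and hyperparameters

# Stand-ins for the DSGE-implied moments Gamma_XX(theta), Gamma_XY(theta), Gamma_YY(theta)
G_XX = np.eye(k) + 0.2               # positive definite
G_XY = 0.1 * rng.normal(size=(k, n))
G_YY = np.eye(n) + G_XY.T @ np.linalg.solve(G_XX, G_XY)   # keeps Sigma* p.d.

# Restriction functions, eq. (4)
Phi_star = np.linalg.solve(G_XX, G_XY)                    # Gamma_XX^{-1} Gamma_XY
Sigma_star = G_YY - G_XY.T @ Phi_star

# Prior draw, eq. (5): Sigma | theta ~ IW, then Phi | Sigma, theta ~ N
Sigma = invwishart(df=int(lam * T) - k,
                   scale=lam * T * Sigma_star).rvs(random_state=rng)
cov = np.kron(Sigma, np.linalg.inv(lam * T * G_XX))       # cov of vec(Phi)
vec_phi = rng.multivariate_normal(Phi_star.ravel(order="F"), cov)
Phi = vec_phi.reshape(k, n, order="F")
```

The Kronecker structure Σ ⊗ (λT Γ_XX)^{−1} means each equation's coefficients share the same precision matrix λT Γ_XX, scaled by the innovation covariance.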
• Alternative motivation...
[Figure: Prior contours. The curve Φ(θ) traces the cross-equation restriction for a given value of θ; the prior contours for the misspecification parameter Φ∆ are centered on the restriction, so that Φ = Φ(θ) + Φ∆.]
• There is a vector θ and matrices Φ∆ and Σ∆ such that the data are generated from the VAR with coefficients Φ∗(θ) + Φ∆ and innovation covariance Σ∗(θ) + Σ∆.
• Our prior for Φ∆ has the property that its density is proportional to the expected likelihood ratio.
• Likelihood ratio:
  ln [ L(Φ∗ + Φ∆, Σ∗_u | Y∗, X∗) / L(Φ∗, Σ∗_u | Y∗, X∗) ]  (7)
  = −(1/2) tr[ Σ∗_u^{−1} ( Φ∆′ X∗′X∗ Φ∆ + 2 Φ∗′ X∗′X∗ Φ∆ − 2 (Φ∗ + Φ∆)′ X∗′Y∗ + 2 Φ∗′ X∗′Y∗ ) ].
• We now choose a prior density that is proportional (∝) to the expected likelihood ratio:
  p(Φ∆ | Σ∗_u) ∝ exp{ −(1/2) tr[ λT Σ∗_u^{−1} Φ∆′ Γ_XX Φ∆ ] }.  (9)
• Again, we obtain:
  Σ | θ ∼ IW( λT Σ∗(θ), λT − k, n ),  (11)
  Φ | Σ, θ ∼ N( Φ∗(θ), [ λT Σ^{−1} ⊗ Γ_XX(θ) ]^{−1} ).
• Φ∆ is “local” misspecification: the DSGE model provides a good, albeit not perfect, approximation to reality.
DSGE-VARs: Posteriors
• The joint posterior density of VAR and DSGE model parameters can be factorized as pλ(Φ, Σ, θ | Y) = pλ(Φ, Σ | Y, θ) pλ(θ | Y).
DSGE-VARs: Posteriors
• The posterior distribution of Φ and Σ is also of the Inverted Wishart – Normal form:
  Σ | Y, θ ∼ IW( (1 + λ)T Σ̂_b(θ), (1 + λ)T − k, n ),  (14)
  Φ | Y, Σ, θ ∼ N( Φ̂_b(θ), Σ ⊗ (λT Γ_XX(θ) + X′X)^{−1} ).
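A sketch of one posterior draw from (14), treating the λT-weighted DSGE moments as artificial observations mixed with the actual sample. The data and moments below are random stand-ins for illustration:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(2)
T, k, n, lam = 120, 3, 2, 1.0        # illustrative sizes

X = rng.normal(size=(T, k))          # stand-in regressors
Y = rng.normal(size=(T, n))          # stand-in observables

# Stand-ins for the model-implied moments at a given theta
G_XX, G_XY, G_YY = np.eye(k), np.zeros((k, n)), np.eye(n)

# Mix lambda*T artificial observations with the sample moments
M_XX = lam * T * G_XX + X.T @ X
M_XY = lam * T * G_XY + X.T @ Y
M_YY = lam * T * G_YY + Y.T @ Y

Phi_b = np.linalg.solve(M_XX, M_XY)                       # posterior mean of Phi
Sigma_b = (M_YY - M_XY.T @ Phi_b) / ((1 + lam) * T)       # posterior scale of Sigma

Sigma = invwishart(df=int((1 + lam) * T) - k,
                   scale=(1 + lam) * T * Sigma_b).rvs(random_state=rng)
cov = np.kron(Sigma, np.linalg.inv(M_XX))                 # Sigma (x) (lam*T*G_XX + X'X)^{-1}
vec_phi = rng.multivariate_normal(Phi_b.ravel(order="F"), cov)
Phi = vec_phi.reshape(k, n, order="F")
```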
DSGE-VARs: Posteriors
• The marginal posterior density of θ can be obtained by evaluating the marginal likelihood
  pλ(Y | θ) = [ |λT Γ_XX(θ) + X′X|^{−n/2} |(1 + λ)T Σ̂_b(θ)|^{−((1+λ)T−k)/2} ] / [ |λT Γ_XX(θ)|^{−n/2} |λT Σ∗(θ)|^{−(λT−k)/2} ]
    × (2π)^{−nT/2} [ 2^{n((1+λ)T−k)/2} ∏_{i=1}^{n} Γ( ((1+λ)T − k + 1 − i)/2 ) ] / [ 2^{n(λT−k)/2} ∏_{i=1}^{n} Γ( (λT − k + 1 − i)/2 ) ].
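This formula is best evaluated on the log scale via log-determinants and log-Gamma functions to avoid overflow. A sketch implementing only the arithmetic of the displayed expression (the moment inputs are stand-ins):

```python
import numpy as np
from scipy.special import gammaln

def ln_p_lambda(Y, X, G_XX, G_XY, G_YY, lam):
    """Evaluate ln p_lambda(Y | theta) from the determinant / Gamma-function formula."""
    T, n = Y.shape
    k = X.shape[1]
    lT = lam * T
    ld = lambda A: np.linalg.slogdet(A)[1]   # log |A|

    Phi_star = np.linalg.solve(G_XX, G_XY)
    Sigma_star = G_YY - G_XY.T @ Phi_star

    M_XX = lT * G_XX + X.T @ X
    M_XY = lT * G_XY + X.T @ Y
    M_YY = lT * G_YY + Y.T @ Y
    Sigma_b = (M_YY - M_XY.T @ np.linalg.solve(M_XX, M_XY)) / ((1 + lam) * T)

    i = np.arange(1, n + 1)
    out = -0.5 * n * ld(M_XX) - 0.5 * ((1 + lam) * T - k) * ld((1 + lam) * T * Sigma_b)
    out += 0.5 * n * ld(lT * G_XX) + 0.5 * (lT - k) * ld(lT * Sigma_star)
    out += -0.5 * n * T * np.log(2 * np.pi)
    out += 0.5 * n * ((1 + lam) * T - k) * np.log(2) \
        + gammaln(((1 + lam) * T - k + 1 - i) / 2).sum()
    out -= 0.5 * n * (lT - k) * np.log(2) + gammaln((lT - k + 1 - i) / 2).sum()
    return out

rng = np.random.default_rng(3)
T, k, n = 120, 3, 2
val = ln_p_lambda(rng.normal(size=(T, n)), rng.normal(size=(T, k)),
                  np.eye(k), np.zeros((k, n)), np.eye(n), lam=1.0)
```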
DSGE-VARs: Posteriors
1. Use the RWM Algorithm to generate draws θ(s) from the marginal posterior distribution pλ(θ|Y ).
2. Use the output to compute a numerical approximation of p̂λ(Y ).
3. For each draw θ(s) generate a pair Φ(s), Σ(s) by sampling from the IW–N distribution.
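Step 1 can be sketched generically. The target below is a toy stand-in for ln pλ(θ|Y); any function returning a log posterior kernel works in its place:

```python
import numpy as np

def rwm(log_post, theta0, n_draws, step, rng):
    """Random-Walk Metropolis: propose theta' = theta + step * N(0, I),
    accept with probability min(1, posterior ratio)."""
    theta = np.array(theta0, dtype=float)
    lp = log_post(theta)
    draws = np.empty((n_draws, theta.size))
    for s in range(n_draws):
        prop = theta + step * rng.normal(size=theta.size)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws[s] = theta                 # repeat current draw on rejection
    return draws

rng = np.random.default_rng(4)
# Toy stand-in for the log posterior kernel of theta
draws = rwm(lambda th: -0.5 * np.sum(th ** 2), np.zeros(2), 20_000, 0.8, rng)
```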
DSGE-VARs: Posterior of θ
• Where does the information about θ come from? Rewrite the posterior in terms of the discrepancy between Φ̂mle, Σ̂mle and the restriction functions Φ∗(θ), Σ∗(θ):
  ln p∗(Y | θ) = −(T/2) vech(Σ̂mle − Σ∗(θ))′ D (Σ̂mle^{−1} ⊗ Σ̂mle^{−1}) D′ vech(Σ̂mle − Σ∗(θ))
    − (1/2) vec(Φ̂mle − Φ∗(θ))′ (Σ̂mle^{−1} ⊗ X′X) vec(Φ̂mle − Φ∗(θ))
    + const + small.  (18)
DSGE-VARs: Posterior of θ
• As λ → ∞, inference about θ becomes equivalent to making inference based on the quasi-likelihood function p∗(Y |θ), which is expressed in the discrepancy between Φ̂mle, Σ̂mle and the restriction functions Φ∗(θ), Σ∗(θ).
• We will study the fit of the DSGE model by examining the marginal likelihood function
of the hyperparameter λ:
  p(Y |λ) = ∫ p(Y |θ, Σ, Φ) pλ(θ, Σ, Φ) d(θ, Σ, Φ).  (19)
• Maximum / mode: λ̂ = argmax_λ p(Y |λ).
• It is common in the literature to use marginal data densities to document the fit of DSGE models relative to VARs with diffuse priors. In our framework this corresponds to comparing the two extremes λ → 0 (near-diffuse prior) and λ = ∞ (DSGE model).
[Figure: prior and likelihood as functions of Φ; for λ = ∞ the prior concentrates at Φ∗.]
[Figure: prior and likelihood as functions of Φ; as λ → 0 the prior centered at Φ∗ flattens relative to the λ = ∞ case.]
• Prior simplifies to
  φ ∼ N( φ∗, 1/(λT γ0) ).  (21)
  ln p(Y |λ, φ∗) = −(T/2) ln(2π) − (T/2) σ̃²(λ, φ∗) − (1/2) c(λ, φ∗).  (22)
• The term σ̃ 2(λ, φ∗) measures the in-sample one-step-ahead forecast error:
  lim_{λ→0} σ̃²(λ, φ∗) = (1/T) Σ_t (y_t − φ̂ y_{t−1})²,   lim_{λ→∞} σ̃²(λ, φ∗) = (1/T) Σ_t (y_t − φ∗ y_{t−1})².
• The third term in (22) can be interpreted as a penalty for model complexity and is of
the form
  c(λ, φ∗) = ln( 1 + γ̂0/(λγ0) ).
• Maximizing over λ yields
  λ̂ = γ0 γ̂0² / [ T (γ̂0 γ1 − γ0 γ̂1)² − γ0² γ̂0 ].  (23)
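Instead of the closed form, one can also maximize the two λ-dependent terms of (22) on a grid. A sketch for an AR(1) example; the conjugate-shrinkage form of φ̄(λ) below (OLS as λ → 0, φ∗ as λ → ∞) is an assumption of this illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
T, phi_true, phi_star = 120, 0.5, 0.5    # illustrative AR(1) and prior mean

y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = phi_true * y[t - 1] + rng.normal()

g0_hat = np.mean(y[:-1] ** 2)            # sample gamma_0
g1_hat = np.mean(y[1:] * y[:-1])         # sample gamma_1
g0 = 1.0 / (1.0 - phi_star ** 2)         # model-implied gamma_0 (unit shock variance)

def log_ml_terms(lam):
    # Shrinkage estimate: OLS as lam -> 0, phi_star as lam -> infinity
    phi_bar = (lam * g0 * phi_star + g1_hat) / (lam * g0 + g0_hat)
    sig2 = np.mean((y[1:] - phi_bar * y[:-1]) ** 2)        # sigma-tilde^2
    c = np.log(1.0 + g0_hat / (lam * g0))                  # complexity penalty
    return -0.5 * T * sig2 - 0.5 * c

grid = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0, 100.0])
lam_hat = grid[np.argmax([log_ml_terms(l) for l in grid])]
```

Because the penalty c(λ, φ∗) diverges as λ → 0, the grid search never selects an arbitrarily small λ.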
• As λ approaches zero, the marginal log likelihood function tends to minus infinity.
– For small values of λ the goodness-of-fit terms are essentially identical. Marginal likelihood differences are then dominated by the penalty term c(λ, φ∗).
DSGE-VARs: Posterior of λ
• Goal of IRF comparisons is to document in which dimensions the DSGE model dynamics differ from those of the less restrictive DSGE-VAR.
• To what extent does the VAR satisfy key structural equations implied by the DSGE?
• Examples: Cogley and Nason (1994), Rotemberg and Woodford (1997), Schorfheide
(2000), Boivin and Giannoni (2003), and Christiano, Eichenbaum, and Evans (2004),
to name a few.
• Important issue: estimation and the identification of the VAR that serves as a benchmark.
• In our framework: compare (i) DSGE-VAR(∞) and DSGE-VAR(λ̂) IRFs; (ii) DSGE model (state-space) and DSGE-VAR(∞) IRFs.
DSGE-VARs: Identification
• So far the DSGE-VAR is reduced form. For most applications we would like a mapping from reduced-form innovations into structural shocks.
• The DSGE model is identified: there is a matrix Ω∗(θ) that maps a factorization of the variance-covariance matrix of the forecast errors into responses to the structural shocks.
DSGE-VARs: Identification
1. Use the RWM Algorithm to generate draws θ(s) from the marginal posterior distribution pλ(θ|Y ).
2. Use the output to compute a numerical approximation of p̂λ(Y ).
3. For each draw θ(s) generate a pair Φ(s), Σ(s) by sampling from the IW–N distribution.
• How well is the state-space representation of the linearized DSGE model approximated by the DSGE-VAR(λ = ∞)?
• For each θ draw compare responses of the state-space version of the DSGE to the DSGE-VAR(λ = ∞) version.
Notes: DSGE model responses computed from state-space representation: posterior mean
(solid); DSGE-VAR(λ = ∞) responses: posterior mean (dashed) and pointwise 90% proba-
bility bands (dotted).
• How different are the IRFs of the VAR that is estimated subject to the DSGE model restrictions from the IRFs of the VAR in which the restrictions are relaxed?
• For each (Φ, Σ, θ) draw compare responses of the state-space version of the DSGE to those of the DSGE-VAR(λ).
• Moreover, for each draw we compute the difference between DSGE-VAR(λ) and DSGE-
VAR(λ = ∞). We use these differences to compute a posterior mean and 90% proba-
bility bands.
• For instance, in response to a monetary policy shock, the right-hand-side of the Euler
Notes: DSGE model responses: posterior mean (solid); DSGE-VAR responses: posterior
mean (dashed) and pointwise 90% probability bands (dotted).
• U.S. data; unless noted otherwise thirty years of quarterly observations (T = 120).
• Best fit in terms of Bayesian marginal likelihood and out-of-sample forecasting performance.
[Figure: ln p(Y |λ) as a function of λ ∈ {0.33, 0.5, 0.75, 1, 1.25, 1.5, 2, 5, ∞} and for the DSGE model; values range from a peak near −1049 down to about −1123.]
[Figure: ln p(Y |λ) over the same λ grid and for the DSGE model; values range from about −1077 down to about −1162.]
[Figures: impulse response panels (output, W, Inflation, R) over horizons 0–16.]
[Figure: ln p(Y |λ) for the Baseline, No Indexation, and No Habit specifications; values range from about −1101 (Baseline peak) down to about −1230 (No Habit).]
[Figures: impulse response panels (output, W, Inflation, R) over horizons 0–16 for an alternative specification.]
[Figures: impulse response panels (output, W, Inflation, R) over horizons 0–16.]
Discussion
Question: If we did not have the baseline model at hand, but only the alternative specification, could we learn something from our procedure about what is missing?
• No Habit: for the money shock (but also technology), clearly something is amiss!
• . . . however DSGE-VAR(λ̂)’s IRFs are close to those of the Baseline model: Even if
the DSGE model w/o Habit is grossly misspecified, the DSGE-VAR(λ̂) is not too bad
as a benchmark.
• No Indexation: no clear evidence from money and/or technology IRFs that a feature of the model is missing.
[Figure: log marginal likelihoods of λ for the Baseline, No Indexation, and No Habit specifications over the λ grid.]
Discussion
• Roughly: DSGE model and unrestricted VAR are comparable. DSGE-VAR improves upon both.
• This suggests:
DSGE-VARs: Extensions
• Start from a DSGE model with an interest-rate feedback rule; allow for deviations from the cross-equation restrictions to obtain identified VARs.
  y′_{2,t} = x′_t Ψ∗(θ) + u′_{2,t},  where y_{2,t} = [inflation, output gap]′.  (25)
• VAR approximation (25) is in general not exact, yet quite accurate with four lags in our application.
• (24) and (25) comprise a partially identified (based on exclusion restrictions) VAR.
DSGE-VARs: Extensions
DSGE-VARs: Extensions
• There is a vector θ and matrices Ψ∆ and Σ∆ such that the data are generated from the VAR with coefficients Ψ∗(θ) + Ψ∆.
• The prior for Ψ∆ is proportional to the expected likelihood ratio of Ψ evaluated at its (misspecified) restricted value Ψ∗(θ).
DSGE-VARs: Extensions
• The forecast error u_{2,t} is a function of the structural shocks: u′_{2,t} = ε_{1,t} A_1 + ε′_{2,t} A_2.
• After some matrix algebra we can determine A_1 and A′_2 A_2, which identifies monetary policy shocks but does not separate technology from demand shocks.
• We follow the idea in Del Negro and Schorfheide (2004) and decompose the DSGE model response as A^D_2(θ) = A^D_{2,tr}(θ) Ω∗(θ).
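A sketch of this triangular–orthonormal decomposition and its use for rotating a VAR covariance draw; the DSGE impact matrix below is a random stand-in for A^D_2(θ):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3

A_D = rng.normal(size=(n, n))        # stand-in for the DSGE impact matrix

# Decompose A_D = A_tr @ Omega_star with A_tr lower triangular (positive
# diagonal) and Omega_star orthonormal, via a QR factorization of A_D':
Q, R = np.linalg.qr(A_D.T)
S = np.diag(np.sign(np.diag(R)))     # fix signs so diag(A_tr) > 0
A_tr, Omega_star = (S @ R).T, (Q @ S).T

# Rotate the Cholesky factor of a VAR covariance draw by the DSGE's Omega_star
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 1.0, 0.2],
                  [0.1, 0.2, 1.0]])
impact = np.linalg.cholesky(Sigma) @ Omega_star   # structural impact responses
```

Since Ω∗ is orthonormal, the rotated impact matrix still reproduces the reduced-form covariance: (Σ_tr Ω∗)(Σ_tr Ω∗)′ = Σ.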
• Now we have a collection of identified VARs. Del Negro and Schorfheide (2005) study
• Large body of empirical work on the Bayesian estimation / evaluation of DSGE models.
• An and Schorfheide (2005) illustrate these techniques based on a New Keynesian model.
• Model size / dimensionality of the parameter space pose a challenge for MCMC methods.
• Model misspecification is and will remain a concern in empirical work with DSGE models. The DSGE-VAR framework reflects this misspecification in the measures of uncertainty constructed for forecasts and policy recommendations.