Kelejian 2017
1. There is a vast literature dealing with, or related to, spatial panel data models. Some important
references are Anselin et al. (2008), Arellano (2003), Baltagi (2005), Baltagi et al. (2003), Blundell
and Bond (1998), Elhorst (2010, 2014), Kapoor et al. (2007), Lee and Yu (2010a,b, 2012a), Mutl
and Pfaffermayr (2011), Pesaran and Tosetti (2011), and Piras (2013, 2014). See also Chapter 13 in
the new edition of Baltagi’s book and Chapter 30 in Pesaran (2015).
\[ J_T = e_T e_T', \]
where ⊗ denotes the Kronecker product, $e_T$ is the T × 1 vector of ones, and let
\[ Q_1 = \frac{J_T}{T} \otimes I_N. \tag{15.2.2} \]
Let $F_t$ be any N × 1 vector, and let $F = (F_1', ..., F_T')'$. Let $\bar{F} = T^{-1}(F_1 + ... + F_T)$,
i.e., if t relates to time, then $\bar{F}$ is the time average of the T vectors,
F1 , ..., FT . Then, the matrix Q0 is such that
Q1 F = (eT ⊗ F̄ ), (15.2.4)
Q0 Q0 = Q0 , (15.2.5)
Q1 Q1 = Q1 ,
Q0 + Q1 = IN T ,
Q0 Q1 = 0,
Q0 (eT ⊗ G) = 0.
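The properties in (15.2.4) and (15.2.5) are easy to confirm numerically. The following is a minimal sketch, assuming small illustrative dimensions (N = 4, T = 3) and randomly generated F and G; these choices are ours and are not part of the text.

```python
import numpy as np

N, T = 4, 3                                # illustrative panel dimensions (assumed)
e_T = np.ones((T, 1))                      # T x 1 vector of ones
J_T = e_T @ e_T.T                          # J_T = e_T e_T'
Q1 = np.kron(J_T / T, np.eye(N))           # Q1 = (J_T / T) kron I_N, as in (15.2.2)
Q0 = np.eye(N * T) - Q1                    # Q0 = I_NT - Q1

rng = np.random.default_rng(0)
F_blocks = rng.normal(size=(T, N))         # row t holds the N x 1 vector F_t
F = F_blocks.reshape(-1)                   # stacked vector (F_1', ..., F_T')'
F_bar = F_blocks.mean(axis=0)              # time average of F_1, ..., F_T
G = rng.normal(size=(N, 2))                # any matrix with N rows

checks = {
    "Q1 F = e_T kron F_bar": np.allclose(Q1 @ F, np.kron(np.ones(T), F_bar)),
    "Q0 symmetric": np.allclose(Q0, Q0.T),
    "Q0 Q0 = Q0": np.allclose(Q0 @ Q0, Q0),
    "Q1 Q1 = Q1": np.allclose(Q1 @ Q1, Q1),
    "Q0 + Q1 = I": np.allclose(Q0 + Q1, np.eye(N * T)),
    "Q0 Q1 = 0": np.allclose(Q0 @ Q1, 0.0),
    "Q0 (e_T kron G) = 0": np.allclose(Q0 @ np.kron(e_T, G), 0.0),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

In words, Q1 replaces each unit's observations by their time average, while Q0 subtracts that average, which is why Q0 annihilates anything that is constant over time.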
general model in this case might contain nonparametrically specified error terms
which allow for heteroskedasticity as well as spatial correlation.
In this section we start with the simplest random effects model which only
has exogenous regressors, and a structurally specified error term. Generaliza-
tions will be straightforward.
Consider the panel data model
\[
\begin{aligned}
y_t &= X_t \beta + u_t, \\
u_t &= \rho_2 W u_t + \varepsilon_t, \quad |\rho_2| < 1, \\
\varepsilon_t &= \mu + v_t, \quad t = 1, ..., T
\end{aligned}
\tag{15.3.1}
\]
y = Xβ + u, (15.3.3)
u = (IT ⊗ ρ2 W )u + ε,
ε = (eT ⊗ IN )μ + v.
with $\Omega_\varepsilon$ being the VC matrix of ε. Since μ and v are independent, it then follows
from the third line in (15.3.3) and (15.2.5) that
where $\Omega_\varepsilon^{-1/2} \Omega_\varepsilon \Omega_\varepsilon^{-1/2} = I_{NT}$. The third line in (15.3.8) follows immediately by
multiplying the second line across by $\sigma_v$, and then setting $Q_0 = I_{NT} - Q_1$.
Suppose $\rho_2$, $\sigma_v^2$, $\sigma_\mu^2$ were known. Given this, $\Omega_\varepsilon^{-1/2}$ would also be known
via (15.3.7). In this case one would estimate (15.3.3) by first transforming the
model to eliminate the spatial correlation induced by the second line in (15.3.3),
and then transforming the resulting model by premultiplying it across by $\Omega_\varepsilon^{-1/2}$.
Specifically, let
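Given the VC matrix $\Omega_\varepsilon$ of ε in (15.3.7) and the results in (15.3.8), one would
then premultiply (15.3.10) across by $\Omega_\varepsilon^{-1/2}$ to obtain
\[
\begin{aligned}
\Omega_\varepsilon^{-1/2} y(\rho_2) &= \Omega_\varepsilon^{-1/2} X(\rho_2)\beta + \Omega_\varepsilon^{-1/2}\varepsilon \\
&= \Omega_\varepsilon^{-1/2} X(\rho_2)\beta + \psi
\end{aligned}
\tag{15.3.11}
\]
where $\psi = \Omega_\varepsilon^{-1/2}\varepsilon$. It follows from (15.3.7) and (15.3.8) that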
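\[
\begin{aligned}
E(\psi) &= 0, \\
E(\psi\psi') &= \Omega_\varepsilon^{-1/2} E(\varepsilon\varepsilon') \Omega_\varepsilon^{-1/2} \\
&= \Omega_\varepsilon^{-1/2} \Omega_\varepsilon \Omega_\varepsilon^{-1/2} \\
&= I_{NT}.
\end{aligned}
\tag{15.3.12}
\]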
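Thus, if $\rho_2$, $\sigma_v^2$, $\sigma_\mu^2$ were known, the estimation of β in the model in (15.3.3)
may now be evident. Specifically, let
\[
\begin{aligned}
y^* &= \Omega_\varepsilon^{-1/2} y(\rho_2), \\
X^* &= \Omega_\varepsilon^{-1/2} X(\rho_2),
\end{aligned}
\tag{15.3.13}
\]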
Given that ρ2 , σv2 , and σμ2 are known, the estimator of β would just be the OLS
estimator based on (15.3.14) which can be expressed as a GLS estimator based
on (15.3.10), namely
Under standard conditions given in Kapoor et al. (2007), β̂GLS is consistent and
asymptotically normal with the anticipated distribution. In particular,
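\[
\begin{aligned}
(NT)^{1/2}(\hat{\beta}_{GLS} - \beta) &\xrightarrow{D} N(0, VC), \\
VC &= \lim_{N\to\infty} (NT)[X(\rho_2)' \Omega_\varepsilon^{-1} X(\rho_2)]^{-1} \\
&= \lim_{N\to\infty} (NT)(X^{*\prime} X^*)^{-1}.
\end{aligned}
\tag{15.3.16}
\]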
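The equivalence between OLS on the transformed model and GLS on the spatially filtered model can be checked numerically. The sketch below rests on assumptions that are ours: the error-components form $\Omega_\varepsilon = \sigma_\mu^2 (J_T \otimes I_N) + \sigma_v^2 I_{NT}$ with $\sigma_1^2 = \sigma_v^2 + T\sigma_\mu^2$ (the specification of Kapoor et al., 2007), a "circle" weighting matrix, and arbitrary parameter values. The names `Omega_inv_half`, `A`, and so on are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, k = 6, 4, 2
rho2, sig_v2, sig_mu2 = 0.4, 1.0, 0.5      # treated as known, as in the text

# Row-normalized "two neighbors on a circle" weighting matrix (assumption).
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

I_N, I_NT = np.eye(N), np.eye(N * T)
Q1 = np.kron(np.ones((T, T)) / T, I_N)
Q0 = I_NT - Q1

# Assumed VC matrix of eps = (e_T kron I_N) mu + v, and its inverse square root.
Omega = sig_mu2 * np.kron(np.ones((T, T)), I_N) + sig_v2 * I_NT
sig_1 = np.sqrt(sig_v2 + T * sig_mu2)
Omega_inv_half = Q0 / np.sqrt(sig_v2) + Q1 / sig_1

# Simulate y = X beta + u with u = (I_T kron rho2 W) u + eps.
X = rng.normal(size=(N * T, k))
beta = np.array([1.0, -2.0])
mu = rng.normal(scale=np.sqrt(sig_mu2), size=N)
eps = np.kron(np.ones(T), mu) + rng.normal(scale=np.sqrt(sig_v2), size=N * T)
A = np.kron(np.eye(T), I_N - rho2 * W)     # I_NT - (I_T kron rho2 W)
u = np.linalg.solve(A, eps)
y = X @ beta + u

# Spatially filtered variables y(rho2) = A y and X(rho2) = A X.
y_r, X_r = A @ y, A @ X

# OLS on the starred variables of (15.3.13).
y_star, X_star = Omega_inv_half @ y_r, Omega_inv_half @ X_r
beta_ols_star = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

# GLS on the filtered model, using Omega^{-1} directly.
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X_r.T @ Omega_inv @ X_r, X_r.T @ Omega_inv @ y_r)

print(beta_ols_star, beta_gls)
```

The two coefficient vectors coincide up to rounding because $\Omega_\varepsilon^{-1/2}\Omega_\varepsilon^{-1/2} = \Omega_\varepsilon^{-1}$ under the assumed form.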
with
\[
\begin{aligned}
X(\hat{\rho}_2) &= X - (I_T \otimes \hat{\rho}_2 W)X, \\
y(\hat{\rho}_2) &= y - (I_T \otimes \hat{\rho}_2 W)y.
\end{aligned}
\]
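Then, under reasonable conditions, Kapoor et al. (2007) show that $\hat{\beta}_{FGLS}$ is
consistent and asymptotically normal with the same distribution as that of $\hat{\beta}_{GLS}$.
In particular,
\[
\begin{aligned}
(NT)^{1/2}(\hat{\beta}_{FGLS} - \beta) &\xrightarrow{D} N(0, VC), \\
VC &= \lim_{N\to\infty} (NT)[X(\rho_2)' \Omega_\varepsilon^{-1} X(\rho_2)]^{-1}.
\end{aligned}
\tag{15.3.22}
\]
\[ \hat{\beta}_{FGLS} \simeq N(\beta, [X(\hat{\rho}_2)' \hat{\Omega}_\varepsilon^{-1} X(\hat{\rho}_2)]^{-1}). \tag{15.3.23} \]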
those given in Section 2.2.4 in reference to the GMM procedure for the estima-
tion of ρ2 .
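In reference to u and ε in (15.3.3), let
\[
\begin{aligned}
\bar{u} &= (I_T \otimes W)u, \\
\bar{\bar{u}} &= (I_T \otimes W)\bar{u}, \\
\varepsilon &= u - \rho_2 \bar{u}, \\
\bar{\varepsilon} &= \bar{u} - \rho_2 \bar{\bar{u}}.
\end{aligned}
\tag{15.3.24}
\]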
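If the quadratic forms in (15.3.27) are multiplied out, e.g.,
$(\hat{u} - \rho_2 \hat{\bar{u}})' Q_0 (\hat{u} - \rho_2 \hat{\bar{u}}) = \hat{u}' Q_0 \hat{u} + \rho_2^2 \hat{\bar{u}}' Q_0 \hat{\bar{u}} - 2\rho_2 \hat{u}' Q_0 \hat{\bar{u}}$,
the six equations in (15.3.27) can be
expressed in a form that is more conducive to estimation. Specifically, let
\[
\gamma = \begin{bmatrix} \rho_2 \\ \rho_2^2 \\ \sigma_v^2 \\ \sigma_1^2 \end{bmatrix}, \quad
\hat{\delta} = \begin{bmatrix} \hat{\delta}_1 \\ \hat{\delta}_2 \\ \hat{\delta}_3 \\ \hat{\delta}_4 \\ \hat{\delta}_5 \\ \hat{\delta}_6 \end{bmatrix}, \quad
S = \begin{bmatrix}
\frac{1}{N(T-1)} \hat{u}' Q_0 \hat{u} \\
\frac{1}{N(T-1)} \hat{\bar{u}}' Q_0 \hat{\bar{u}} \\
\frac{1}{N(T-1)} \hat{u}' Q_0 \hat{\bar{u}} \\
\frac{1}{N} \hat{u}' Q_1 \hat{u} \\
\frac{1}{N} \hat{\bar{u}}' Q_1 \hat{\bar{u}} \\
\frac{1}{N} \hat{u}' Q_1 \hat{\bar{u}}
\end{bmatrix},
\tag{15.3.28}
\]
\[
M = \begin{bmatrix}
\frac{2}{N(T-1)} \hat{u}' Q_0 \hat{\bar{u}} & \frac{-1}{N(T-1)} \hat{\bar{u}}' Q_0 \hat{\bar{u}} & 1 & 0 \\
\frac{2}{N(T-1)} \hat{\bar{\bar{u}}}' Q_0 \hat{\bar{u}} & \frac{-1}{N(T-1)} \hat{\bar{\bar{u}}}' Q_0 \hat{\bar{\bar{u}}} & \frac{1}{N} Tr(W'W) & 0 \\
\frac{1}{N(T-1)} [\hat{u}' Q_0 \hat{\bar{\bar{u}}} + \hat{\bar{u}}' Q_0 \hat{\bar{u}}] & \frac{-1}{N(T-1)} \hat{\bar{u}}' Q_0 \hat{\bar{\bar{u}}} & 0 & 0 \\
\frac{2}{N} \hat{u}' Q_1 \hat{\bar{u}} & \frac{-1}{N} \hat{\bar{u}}' Q_1 \hat{\bar{u}} & 0 & 1 \\
\frac{2}{N} \hat{\bar{\bar{u}}}' Q_1 \hat{\bar{u}} & \frac{-1}{N} \hat{\bar{\bar{u}}}' Q_1 \hat{\bar{\bar{u}}} & 0 & \frac{1}{N} Tr(W'W) \\
\frac{1}{N} [\hat{u}' Q_1 \hat{\bar{\bar{u}}} + \hat{\bar{u}}' Q_1 \hat{\bar{u}}] & \frac{-1}{N} \hat{\bar{u}}' Q_1 \hat{\bar{\bar{u}}} & 0 & 0
\end{bmatrix}.
\]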
Then, the six equations in (15.3.27) can be expressed as
S = Mγ + δ̂. (15.3.29)
Kapoor et al. (2007) show that consistent estimators of σv2 and ρ2 can be
obtained by nonlinear least squares based only on the first three equations
in (15.3.29), namely by finding
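\[ \arg\min_{\sigma_v^2, \rho_2} [\hat{\delta}_1^2 + \hat{\delta}_2^2 + \hat{\delta}_3^2]. \tag{15.3.30} \]
Let $\hat{\rho}_2$ and $\hat{\sigma}_v^2$ be the resulting estimators of $\rho_2$ and $\sigma_v^2$. Then a consistent
estimator of $\sigma_1^2$ can be obtained from the fourth equation in (15.3.29):
\[ \hat{\sigma}_1^2 = \frac{1}{N} (\hat{u} - \hat{\rho}_2 \hat{\bar{u}})' Q_1 (\hat{u} - \hat{\rho}_2 \hat{\bar{u}}). \tag{15.3.31} \]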
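The estimation based on the first three equations, followed by (15.3.31), can be sketched in a few lines. This is an illustration under our own assumptions rather than the procedure of Kapoor et al. (2007) as implemented in any package: the nonlinear least squares step is replaced by a grid search over $\rho_2$ with $\sigma_v^2$ concentrated out, the disturbances are simulated and used in place of estimated residuals, and the function names (`time_demean`, `spatial_lag`, `gm_first_three`) are hypothetical.

```python
import numpy as np

def time_demean(x, N, T):
    """Apply Q0 to a stacked NT-vector without forming the NT x NT matrix."""
    xb = x.reshape(T, N)
    return (xb - xb.mean(axis=0)).reshape(-1)

def spatial_lag(x, W, T):
    """Apply (I_T kron W) to a stacked NT-vector."""
    N = W.shape[0]
    return (x.reshape(T, N) @ W.T).reshape(-1)

def gm_first_three(u_hat, W, T, grid=np.linspace(-0.99, 0.99, 397)):
    N = W.shape[0]
    ub = spatial_lag(u_hat, W, T)          # u-bar
    ubb = spatial_lag(ub, W, T)            # u-bar-bar
    q0 = lambda a, b: a @ time_demean(b, N, T) / (N * (T - 1))
    # First three rows of S and the first three columns of M in (15.3.28).
    S = np.array([q0(u_hat, u_hat), q0(ub, ub), q0(u_hat, ub)])
    M = np.array([
        [2 * q0(u_hat, ub), -q0(ub, ub), 1.0],
        [2 * q0(ubb, ub), -q0(ubb, ubb), np.trace(W.T @ W) / N],
        [q0(u_hat, ubb) + q0(ub, ub), -q0(ub, ubb), 0.0],
    ])
    best = None
    for rho in grid:
        r = S - M[:, 0] * rho - M[:, 1] * rho**2
        s2 = max(M[:, 2] @ r / (M[:, 2] @ M[:, 2]), 0.0)   # LS value of sigma_v^2 given rho
        obj = np.sum((r - M[:, 2] * s2) ** 2)              # delta_1^2 + delta_2^2 + delta_3^2
        if best is None or obj < best[0]:
            best = (obj, rho, s2)
    _, rho_hat, sig_v2_hat = best
    # Fourth equation, as in (15.3.31): Q1 e = e - Q0 e.
    e = u_hat - rho_hat * ub
    sig_12_hat = e @ (e - time_demean(e, N, T)) / N
    return rho_hat, sig_v2_hat, sig_12_hat

# Simulated disturbances (assumed design: rho2 = 0.5, sigma_v^2 = sigma_mu^2 = 1).
rng = np.random.default_rng(0)
N, T, rho_true = 400, 5, 0.5
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5
eps = np.tile(rng.normal(size=N), T) + rng.normal(size=N * T)
u = np.linalg.solve(np.eye(N) - rho_true * W, eps.reshape(T, N).T).T.reshape(-1)

rho_hat, sig_v2_hat, sig_12_hat = gm_first_three(u, W, T)
print(rho_hat, sig_v2_hat, sig_12_hat)
```

In this design the population value of $\sigma_1^2$ is $\sigma_v^2 + T\sigma_\mu^2 = 6$, so the third estimate should be well above the second.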
Kapoor et al. (2007) also suggested more efficient GLS-type estimators
of $\rho_2$, $\sigma_v^2$, and $\sigma_1^2$, which are based on all six equations in (15.3.29). Assuming
normality of the innovation vector ε in (15.3.3), $\varepsilon \sim N(0, \Omega_\varepsilon)$ where $\Omega_\varepsilon$ is given
in (15.3.7), they obtained the VC matrix of $\delta = (\delta_1, ..., \delta_6)'$ in (15.3.26), namely
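\[
V_\delta = \begin{bmatrix} \frac{1}{T-1}\sigma_v^4 & 0 \\ 0 & \sigma_1^4 \end{bmatrix} \otimes D,
\tag{15.3.32}
\]
Panel Data Models Chapter | 15 315
\[
D = \begin{bmatrix}
2 & 2Tr\left(\frac{W'W}{N}\right) & 0 \\
2Tr\left(\frac{W'W}{N}\right) & 2Tr\left(\frac{W'WW'W}{N}\right) & Tr\left(\frac{W'W(W'+W)}{N}\right) \\
0 & Tr\left(\frac{W'W(W'+W)}{N}\right) & Tr\left(\frac{WW+W'W}{N}\right)
\end{bmatrix}.
\]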
Let $\hat{V}_\delta$ be identical to $V_\delta$ except that $\sigma_v^4$ and $\sigma_1^4$ are now replaced by their
corresponding consistent estimators obtained from (15.3.29) and (15.3.31). Then the
nonlinear GLS-type estimators are determined by finding
2. To describe the within and between estimators, consider the model in (A). Using evident notation
let
since α and μi cancel, and where ε̌it is the transformed error term.
The within estimator of β is the OLS estimator based on (B). The between estimator of β is the
OLS estimator of β in the regression of ȳi on the constant and x̄i. , i = 1, ..., N .
316 Spatial Econometrics
trolling for state specific effects, public capital is no longer significant and plays
no role in production. The same data set was also used in Millo and Piras (2012)
to illustrate their R library dealing with spatial panel data models. The spatial
weighting matrix used by Millo and Piras (2012) in their illustration is a sim-
ple row-standardized binary contiguity matrix. This weighting matrix is also the
one we use in this illustration.
The estimates obtained using (15.3.1) and following the procedure described
in the previous section are reported below:
\[
\begin{aligned}
\widehat{\ln(gsp)} = {} & 2.227\,(0.135) + 0.054\,(0.022)\,\ln(pcap) + 0.257\,(0.021)\,\ln(pc) \\
& + 0.728\,(0.025)\,\ln(emp) - 0.004\,(0.001)\,unemp.
\end{aligned}
\]
The estimated value for $\rho_2$ is 0.548, and $\sigma_v^2 = 0.001$, $\sigma_1^2 = 0.088$, and θ =
0.887. All the coefficients in the regression equation are significant and have
the expected sign. This means that when the error term accounts for spatial
correlation as specified in (15.3.1), the variable reflecting public sector capital
has a positive and significant effect. The value of the spatial coefficient ρ2 is
positive (and its magnitude is quite large!). However, the procedure discussed
above does not determine the statistical significance of the estimator of ρ2 .
Typically, the within estimator is considered in a fixed effects framework (Section 15.5 below); the
between estimator is typically in a random effects framework.
We then multiply the model across by Q1 and estimate Q1 u using the instrument
matrix H+ where
H# = (H∗ , H+ ).
Q0 u = Q0 y − Q0 Z γ̂∗ . (15.4.6)
Q1 u = Q1 y − Q1 Z γ̃+ . (15.4.9)
and
where θ̌ = 1 − σ̌v /σ̌1 . Note that in these transformations one is first transforming
to account for spatial correlation, and then transforming again to account for the
covariance matrix of ε.
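Let $P_\# = H_\#(H_\#' H_\#)^{-1} H_\#'$ and
\[ \check{\gamma} = (\check{Z}_{\check{\rho}_2,\check{\theta}}' \check{Z}_{\check{\rho}_2,\check{\theta}})^{-1} \check{Z}_{\check{\rho}_2,\check{\theta}}' y_{\check{\rho}_2,\check{\theta}}. \tag{15.4.11} \]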
wage in manufacturing sector). One thing is worth noticing here. As it was for
cross-sectional models, the presence of ρ1 complicates the interpretation of the
other coefficients. In a model without spatial lags, and without additional en-
dogenous variables, the coefficients would be interpreted as elasticities. On the
other hand, in the absence of additional endogenous variables, for models such
as that in (15.4.1), the interpretation of the coefficients is a bit different.4 Some
of the software that implements the estimation of spillover effects in models in-
volving spatial lags, but no additional endogenous variables, is available in R or
Matlab, among other packages.
yt = Xt β1 + ρ1 Wyt + Yt β2 + μ + ut , (15.5.1)
ut = ρ2 W ut + vt , t = 1, ..., T
4. Spillover effects in models which have additional endogenous variables are more complex. The
reason for this is that the system involving these variables also involves the dependent variable of
the model being considered, as well as exogenous variables. Therefore spillover effects relate not to
the single equation being considered, but to the complete system. At present, there are no results in
the spatial literature relating to this.
A Note on Identification
Before continuing we note that in a fixed effects model, the coefficients of re-
gressors whose values do not vary over time are not identified. This is the case
whether those variables are exogenous or endogenous. For example, suppose
(15.5.1) were extended to
where S is an N × ks regressor matrix whose values do not vary over time. Since
Sβ0 is a time invariant N × 1 vector the model in (15.5.2) reduces to
The properties of μ̂ are easily determined by substituting (15.5.5) into the first
line of (15.5.6):
Clearly,
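\[
\begin{aligned}
E(\hat{\mu}) &= \mu, \\
VC_{\hat{\mu}} &= \sigma_v^2 [(e_T \otimes I_N)'(e_T \otimes I_N)]^{-1} \\
&= \frac{\sigma_v^2}{T} I_N.
\end{aligned}
\tag{15.5.8}
\]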
It should be evident that issues relating to the consistency of $\hat{\mu}$ involve T. For
example, if N is assumed to be given, and T → ∞, then by (15.5.8) $VC_{\hat{\mu}} \to 0$,
and since $\hat{\mu}$ is unbiased, Chebyshev’s inequality in Section A.3 of Appendix A
implies that $\hat{\mu}$ is consistent, $\hat{\mu} \xrightarrow{P} \mu$. Now let $\hat{\mu}_i$, i = 1, ..., N be the ith element
and
5. In this case N → ∞ and we are not saying that $\hat{\mu} \xrightarrow{P} \mu$ as N → ∞ because this “limit” makes
no sense. The reason for this is that μ is an N × 1 vector and so, in the limit, μ cannot even be
defined – there is no upper limit to ∞.
Estimation
The estimation procedure takes place in three steps. In the first step a consistent
but inefficient estimator, say γ̂ , of γ in (15.5.9) is determined. Then, γ̂ is used
to estimate the error vector in (15.5.9), namely Q0 u. In the second step the
estimator of Q0 u is used to estimate the parameters ρ2 and σv2 . In the third step
the model in (15.5.9) is transformed to account for the spatial correlation, and
then a more efficient estimator of γ is obtained. An expression is then given
which enables finite sample inferences.
Step 1
Let $P_{H_*} = H_*(H_*' H_*)^{-1} H_*'$ and $\hat{Z}_* = P_{H_*} Q_0 Z$. Then, the 2SLS estimator of γ
in (15.5.9), based on the instruments in (15.5.13), is
\[ \hat{\gamma} = (\hat{Z}_*' \hat{Z}_*)^{-1} \hat{Z}_*' Q_0 y. \tag{15.5.14} \]
Under standard conditions, $\hat{\gamma}$ can easily be shown to be consistent, $\hat{\gamma} \xrightarrow{P} \gamma$.
Given $\hat{\gamma}$, the evident estimator of $Q_0 u$ in (15.5.9) is
\[ \widehat{Q_0 u} = Q_0 y - Q_0 Z \hat{\gamma}. \tag{15.5.15} \]
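Step 1 can be sketched as follows. This is a minimal illustration, assuming the observations are stacked by time period and that the instrument matrix passed in has already been premultiplied by Q0; the functions `q0` and `within_2sls` and the example data are hypothetical. The example is noise-free so that the effect of the Q0 transformation on the fixed effects is easy to see.

```python
import numpy as np

def q0(a, N, T):
    """Apply Q0 (subtract each unit's time average) to a stacked NT-row array."""
    a3 = a.reshape(T, N, -1)
    return (a3 - a3.mean(axis=0)).reshape(a.shape)

def within_2sls(y, Z, H, N, T):
    """2SLS on the Q0-transformed model; H is assumed to be Q0-transformed already."""
    y0, Z0 = q0(y, N, T), q0(Z, N, T)
    Z_hat = H @ np.linalg.lstsq(H, Z0, rcond=None)[0]    # P_H Q0 Z
    gamma_hat = np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y0)
    resid = y0 - Z0 @ gamma_hat                          # estimate of Q0 u, as in (15.5.15)
    return gamma_hat, resid

# Hypothetical noise-free example: only the fixed effects separate y from Z gamma.
rng = np.random.default_rng(2)
N, T = 5, 4
Z = rng.normal(size=(N * T, 2))
extra = rng.normal(size=(N * T, 1))                      # an additional instrument (assumed)
gamma = np.array([0.5, -1.0])
mu = rng.normal(size=N)                                  # fixed effects
y = Z @ gamma + np.tile(mu, T)
H = q0(np.hstack([Z, extra]), N, T)

gamma_hat, resid = within_2sls(y, Z, H, N, T)
print(gamma_hat)
```

Because Q0 removes anything that is constant over time, the fixed effects drop out and the coefficients are recovered exactly in this artificial case.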
Step 2
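Given $\widehat{Q_0 u}$, the parameters $\rho_2$ and $\sigma_v^2$ can be consistently estimated using the
first three equations in Kapoor et al. (2007). For example, noting from (15.2.5)
that $Q_0' = Q_0$ and $Q_0^2 = Q_0$, the empirical form of the first three equations in
their paper can be expressed in terms of $\widehat{Q_0 u}$ in (15.5.15) as
\[
\begin{aligned}
\frac{1}{N(T-1)} (Q_0\hat{u} - \rho_2 Q_0\hat{\bar{u}})'(Q_0\hat{u} - \rho_2 Q_0\hat{\bar{u}}) &= \sigma_v^2 + \hat{\delta}_1, \\
\frac{1}{N(T-1)} (Q_0\hat{\bar{u}} - \rho_2 Q_0\hat{\bar{\bar{u}}})'(Q_0\hat{\bar{u}} - \rho_2 Q_0\hat{\bar{\bar{u}}}) &= \sigma_v^2 \frac{1}{N} Tr(W'W) + \hat{\delta}_2,
\end{aligned}
\tag{15.5.17}
\]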
6. Again, because the model has additional endogenous variables, maximum likelihood or Bayesian
methods cannot be implemented unless the entire system generating the endogenous variables is
known!
\[ \frac{1}{N(T-1)} (Q_0\hat{\bar{u}} - \rho_2 Q_0\hat{\bar{\bar{u}}})'(Q_0\hat{u} - \rho_2 Q_0\hat{\bar{u}}) = 0 + \hat{\delta}_3 \]
where δ̂i , i = 1, 2, 3 are error terms.7 The estimators of ρ2 and σv2 , say ρ̌2 and
σ̌v2 , are then obtained by nonlinear least squares, namely by finding
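Let
\[
\begin{aligned}
\hat{\bar{u}}_0 &= (I_T \otimes W)\hat{u}_0, \\
\hat{\bar{\bar{u}}}_0 &= (I_T \otimes W)\hat{\bar{u}}_0, \\
\hat{u}_0 &= \widehat{Q_0 u}.
\end{aligned}
\]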
Note that in light of (15.5.16) the three equations in (15.5.17) can also be ex-
pressed as
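\[
\begin{aligned}
\frac{1}{N(T-1)} (\hat{u}_0 - \rho_2 \hat{\bar{u}}_0)'(\hat{u}_0 - \rho_2 \hat{\bar{u}}_0) &= \sigma_v^2 + \hat{\delta}_1, \\
\frac{1}{N(T-1)} (\hat{\bar{u}}_0 - \rho_2 \hat{\bar{\bar{u}}}_0)'(\hat{\bar{u}}_0 - \rho_2 \hat{\bar{\bar{u}}}_0) &= \sigma_v^2 \frac{1}{N} Tr(W'W) + \hat{\delta}_2, \\
\frac{1}{N(T-1)} (\hat{\bar{u}}_0 - \rho_2 \hat{\bar{\bar{u}}}_0)'(\hat{u}_0 - \rho_2 \hat{\bar{u}}_0) &= 0 + \hat{\delta}_3,
\end{aligned}
\tag{15.5.19}
\]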
since Q0 (IT ⊗ ρ2 W ) = (IT ⊗ ρ2 W )Q0 .
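The commutation property used here holds because $Q_0 = (I_T - J_T/T) \otimes I_N$ acts only on the time dimension while $I_T \otimes \rho_2 W$ acts only on the cross-sectional dimension. A small numerical check, with dimensions, W, and the residual vector chosen arbitrarily for illustration:

```python
import numpy as np

N, T, rho2 = 4, 3, 0.3                     # illustrative values (assumed)
rng = np.random.default_rng(3)
W = rng.random((N, N))
np.fill_diagonal(W, 0.0)                   # zero diagonal, as is usual for a weighting matrix

Q0 = np.kron(np.eye(T) - np.ones((T, T)) / T, np.eye(N))
IW = np.kron(np.eye(T), W)                 # I_T kron W

# Q0 (I_T kron rho2 W) = (I_T kron rho2 W) Q0
commutes = np.allclose(Q0 @ (rho2 * IW), (rho2 * IW) @ Q0)

# Consequently Q0 u - rho2 Q0 u_bar equals u_0 - rho2 u_bar_0, where u_0 = Q0 u.
u_hat = rng.normal(size=N * T)
lhs = Q0 @ u_hat - rho2 * (Q0 @ (IW @ u_hat))
u0 = Q0 @ u_hat
rhs = u0 - rho2 * (IW @ u0)

print(commutes, np.allclose(lhs, rhs))
```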
Step 3
Finally, one needs to transform the variables in (15.5.9) in order to account for
spatial correlation. Specifically, let
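7. For example, since $Q_0$ is symmetric and idempotent, any quadratic form such as $M'Q_0M$ can
be expressed as
\[
\begin{aligned}
M'Q_0M &= M'Q_0'Q_0M \\
&= (Q_0M)'(Q_0M).
\end{aligned}
\]
Using this, the first equation in Kapoor et al. (2007) can be written as
\[
\begin{aligned}
\frac{1}{N(T-1)} \varepsilon'Q_0\varepsilon &= \frac{1}{N(T-1)} (Q_0\varepsilon)'(Q_0\varepsilon) \\
&= \sigma_v^2 + \delta_1
\end{aligned}
\]
where $\varepsilon = u - \rho(I_T \otimes W)u$ is the innovation vector, and $E(\delta_1) = 0$. Thus, to estimate $\varepsilon'Q_0\varepsilon$ one
only has to estimate $Q_0\varepsilon$.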
where σ̂v2 is a consistent estimator of σv2 . One such estimator is σ̌v2 which is
determined by the GMM approach described by (15.5.18). Another one is based
on (15.5.21). Let
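\[ \widehat{Q_0 v} = y_{\check{\rho}_2} - Z_{\check{\rho}_2}\check{\gamma}. \tag{15.5.25} \]
Then, under reasonable conditions, another consistent estimator of $\sigma_v^2$ is $\hat{\sigma}_v^2$
where
\[ \hat{\sigma}_v^2 = \frac{1}{N(T-1)} (\widehat{Q_0 v})'(\widehat{Q_0 v}). \tag{15.5.26} \]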
Illustration 15.5.1: A fixed effects version of the model of crime
We consider again the model of crime in North Carolina and the three-step pro-
cedure described above for the fixed effects model. Furthermore, for that model
we consider the two additional instruments (offense mix and per capita tax ratio)
to control for the endogeneity of police per capita and the probability of arrest.
Results from the estimation are reported below:
\[
\begin{aligned}
\widehat{lcrmrte} = {} & 0.427\,(0.106)\,lpolpc - 0.250\,(0.113)\,lprbarr \\
& - 0.246\,(0.065)\,lprbconv - 0.142\,(0.048)\,lprbpris \\
& - 0.007\,(0.027)\,lavgsen + 0.417\,(0.289)\,ldensity
\end{aligned}
\]
where $\Delta y_t = y_t - y_{t-1}$, etc.9 The error vector $\Delta u_t$ in (15.5.28) can be expressed
as
\[ \Delta u_t = [I_N - \rho_2 W]^{-1} \Delta v_t. \tag{15.5.29} \]
Since $E(\Delta v_t y_{t-s}') = 0$ for all s ≤ t − 2 implies $E(\Delta u_t y_{t-s}') = 0$ for all s ≤ t − 2,
in their framework Baltagi et al. (2014b) suggest the use of time lagged de-
pendent as well as exogenous variables as instruments to estimate their model.
Many steps in their procedure would carry over to the estimation of (15.5.28).
The overall procedure is interesting, but a little bit tedious. It also depends cru-
cially on the assumption that the elements of vt are i.i.d. (0, σv2 ) over both
i = 1, ..., N and t = 1, ..., T . We do not describe the details of the procedure
because in Section 15.6 we present a general panel data fixed effects model
which contains both (15.5.1) and (15.5.27) as special cases.
in order to reduce the number of parameters.10 At the time, the general un-
derstanding was that without an assumption such as (15.6.1) the variance–
9. Baltagi et al. (2014b) eliminate their random effects vector because its elements are correlated
with the time lagged dependent variable. So, although their model is quite different than ours in that
they have random effects while our model here has fixed effects, the approach taken for estimation
is quite similar.
10. Note that in a panel context, the expression in (15.6.1) can be further complicated if, for exam-
ple, time lagged variables are considered.
Since consistent estimators of the elements of the fixed effects vector μ are not
possible, the fixed effects vector will be eliminated from the model. As we saw
in Section 15.5, there are at least two “typical” ways of doing this. One is to
take time differences; the other is to premultiply the model across by Q0 . In
this section we eliminate the fixed effects by premultiplying the model across
by Q0 . Specifically,
\[
\begin{aligned}
Q_0 y &= Q_0 Z\gamma + Q_0 u, \\
y^* &= Z^*\gamma + u^*,
\end{aligned}
\tag{15.6.4}
\]
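\[ u = R\varepsilon, \tag{15.6.6} \]
\[
\begin{bmatrix} u_1 \\ \vdots \\ u_T \end{bmatrix} =
\begin{bmatrix}
R_{11} & 0 & \cdots & \cdots & 0 \\
R_{21} & R_{22} & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
\vdots & \vdots & & \ddots & 0 \\
R_{T1} & R_{T2} & \cdots & \cdots & R_{TT}
\end{bmatrix}
\begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_T \end{bmatrix}
\]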
where R is the N T × N T matrix whose (i, j )th block is the N × N matrix Rij ,
i, j = 1, ..., T .
Let be an N T × h, h ≥ kY matrix of observations on exogenous variables
that are not in (15.6.3), but are in the system determining Y . As described in
earlier models, the variables in may only be a subset of the variables in that
system.
The parameters in the model in (15.6.3) cannot be estimated by maximum
likelihood or by Bayesian techniques unless all the equations determining Y
are known. Therefore, as for earlier models the estimation procedure will be
instrumental variables.
Let H∗ = [Q0 X, Q0 P , Q0 ]. Then the IV matrix is H , where
11. Again, this does not account for triangular arrays; see Section A.15 in the appendix on large
sample theory, or Kapoor et al. (2007) for specifications that do account for triangular arrays.
where $\Omega_{H'H}$, $\Omega_{H'RR'H}$, and $\Omega_{Z'Q_0Z}$ are nonsingular finite matrices, and $\Omega_{H'Z}$ has
full column rank.
again, since Q20 = Q0 . Note that there are no parameters of the error term that
have to be estimated in order to obtain γ̂ .
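The large sample distribution of $\hat{\gamma}$ is
\[
\begin{aligned}
(NT)^{1/2}(\hat{\gamma} - \gamma) &\xrightarrow{D} N(0, VC_{\hat{\gamma}}), \\
VC_{\hat{\gamma}} &= S\,\Omega_{H'RR'H}\,S', \\
S &= [\Omega_{H'Z}' \Omega_{H'H}^{-1} \Omega_{H'Z}]^{-1} \Omega_{H'Z}' \Omega_{H'H}^{-1}, \\
\Omega_{H'RR'H} &= \lim_{N\to\infty} (NT)^{-1} H'RR'H.
\end{aligned}
\tag{15.6.11}
\]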
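\[ \hat{\gamma} \simeq N[\gamma, (NT)^{-1}\widehat{VC}_{\hat{\gamma}}] \tag{15.6.12} \]
where
\[
\begin{aligned}
\widehat{VC}_{\hat{\gamma}} &= \hat{S}\,\hat{\Omega}_{H'RR'H}\,\hat{S}', \\
\hat{S} &= [(NT)^{-1} Z'H(H'H)^{-1}H'Z]^{-1} Z'H(H'H)^{-1},
\end{aligned}
\]
and where $\hat{\Omega}_{H'RR'H}$ is the HAC estimator of $\Omega_{H'RR'H}$. In constructing this HAC
estimator, $RR'$ should be viewed as the VC matrix of the error term u; see
Chapter 9.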
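The structure of the variance expression in (15.6.12) can be illustrated with a short sketch. Two caveats apply. First, the middle matrix below is a simple White-type estimator that we substitute for the spatial HAC estimator of Chapter 9, so it ignores spatial and time correlation. Second, the function name `tsls_sandwich` and the simulated data, which stand in for the Q0-transformed variables and instruments, are hypothetical.

```python
import numpy as np

def tsls_sandwich(y_s, Z_s, H):
    n = len(y_s)                                   # n plays the role of NT
    HtH_inv = np.linalg.inv(H.T @ H)
    ZtH = Z_s.T @ H
    A = ZtH @ HtH_inv @ ZtH.T                      # Z'H (H'H)^{-1} H'Z
    gamma_hat = np.linalg.solve(A, ZtH @ HtH_inv @ (H.T @ y_s))
    u_hat = y_s - Z_s @ gamma_hat
    # S-hat = [(NT)^{-1} Z'H (H'H)^{-1} H'Z]^{-1} Z'H (H'H)^{-1}
    S_hat = np.linalg.solve(A / n, ZtH @ HtH_inv)
    # White-type stand-in for the HAC estimator (assumption, not the text's estimator).
    Omega_hat = (H * u_hat[:, None] ** 2).T @ H / n
    VC_hat = S_hat @ Omega_hat @ S_hat.T
    return gamma_hat, VC_hat / n                   # approximate VC matrix of gamma-hat

# Hypothetical data standing in for y*, Z*, and H.
rng = np.random.default_rng(4)
n = 200
H = rng.normal(size=(n, 3))
Z_s = H[:, :2] + 0.5 * rng.normal(size=(n, 2))
y_s = Z_s @ np.array([1.0, 2.0]) + rng.normal(size=n)

gamma_hat, V = tsls_sandwich(y_s, Z_s, H)
print(gamma_hat, np.sqrt(np.diag(V)))              # estimates and standard errors
```

Replacing `Omega_hat` by a spatial HAC estimate would leave the rest of the calculation unchanged.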
frame. They also considered different ways of modeling the bootlegging effect.
In fact, they studied the sensitivity of their results by replacing their minimum
price variable with a maximum neighboring price variable.
In this example we formulate a slightly modified version of the model considered
by Baltagi and Levin (1992) in which we replace their minimum price
with an average price variable based on the six nearest neighboring states; we also
consider the spatial lag of cigarette consumption.
More specifically, the model that we estimate in this example is
A glance at the results shows that the coefficients of the (time) lagged con-
sumption variable, and of price and income have the expected signs and are also
statistically significant. In fact, one would expect that consumption habits are
persistent, the price effect on demand is negative, and the income effect is posi-
tive. However, it should be stressed once more that these coefficients cannot be
interpreted as elasticities because of the presence of the spatially lagged depen-
dent variable whose coefficient is positive and significant.12 The average price
12. Obtaining the elasticity for this dynamic panel data model would be even more complicated
than usual. For an example of a dynamic panel, see Parent and LeSage (2010).
of the six nearest neighboring states is not statistically significant at the usual 5%
level.
A final point relates to statistical inference. Standard errors are produced
using the spatial HAC estimator with a Parzen kernel. In doing this we specify
a variable bandwidth based on the distance to the six nearest neighbors.
done in Section 15.6. In this section we use the Q0 method, which turns out to
be convenient.
Premultiplying the fixed effects model in (15.7.1) by Q0 yields
\[
\begin{aligned}
Q_0 y &= Q_0 Z\gamma + Q_0 u, \\
y^* &= Z^*\gamma + u^*,
\end{aligned}
\tag{15.7.3}
\]
where, using evident notation, XJ and YJ are respectively the N T × kJ,x and
N T × kJ,Y matrices of observations on the exogenous and endogenous variables
in the J th alternative model, WJ is the corresponding weighting matrix, etc. The
unit specific vector, μJ , can be either a random or a fixed effects vector. Note
that some alternative models may only differ from the null in terms of their
weighting matrix, others may only differ in their set of regressors, while others
may differ in both!
As in Chapter 12, the J -test is based on augmenting the null model with
predictions of the dependent variable based on the alternative models, and then
testing for the significance of those augmenting variables. The dependent vector
in the final form of the null model is y ∗ = Q0 y. Therefore, the J -test in this
panel data framework is based on testing for the significance of predictions of
Q0 y based on the alternative models.
Premultiplying (15.7.4) by Q0 yields
least two ways of predicting the dependent vector based on the J th model. One
is just the estimated right-hand side of that model based on γ̂J . The other is
based on the reduced form. Under reasonable conditions, Kelejian and Piras
(2016b) show that, in a panel data framework, if there is only one alternative,
G = 1, the asymptotic power of the test is the same for these two types of pre-
dictors. They also give Monte Carlo results which suggest that in finite samples
the power is roughly the same for these two types of predictors even if G > 1.
Because the predictor based on the estimated right-hand side is computationally
simpler in that it does not involve inverting a matrix, we suggest its use.
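Let $\hat{y}_J^* = Z_J^*\hat{\gamma}_J = Q_0 Z_J\hat{\gamma}_J$ be the predicted value of the dependent vector
based on the Jth model, J = 1, ..., G. Let
\[
\begin{aligned}
\hat{Y}_{1,G}^* &= (\hat{y}_1^*, ..., \hat{y}_G^*), \\
\delta &= (\delta_1, ..., \delta_G)'
\end{aligned}
\tag{15.7.6}
\]
\[
\begin{aligned}
y^* &= Z^*\gamma + \hat{Y}_{1,G}^*\delta + u^* \\
&= M^*F + u^*
\end{aligned}
\tag{15.7.7}
\]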
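The construction of the augmented model can be sketched for the simplest case of one alternative, G = 1. The sketch makes several simplifying assumptions of our own: all regressors are exogenous, the instruments are the Q0-transformed regressors of both models, and a White-type variance replaces the spatial HAC estimator used in the text. The function names (`q0`, `tsls`, `j_test`) and the simulated data are hypothetical.

```python
import numpy as np

def q0(a, N, T):
    """Apply Q0 (subtract each unit's time average) to a stacked NT-row array."""
    a3 = a.reshape(T, N, -1)
    return (a3 - a3.mean(axis=0)).reshape(a.shape)

def tsls(y, M, H):
    """2SLS of y on M with instrument matrix H."""
    M_hat = H @ np.linalg.lstsq(H, M, rcond=None)[0]
    F_hat = np.linalg.solve(M_hat.T @ M_hat, M_hat.T @ y)
    return F_hat, M_hat

def j_test(y, Z, Z_alt, H, N, T):
    y_s, Z_s, ZJ_s = q0(y, N, T), q0(Z, N, T), q0(Z_alt, N, T)
    gamma_J, _ = tsls(y_s, ZJ_s, H)            # estimate the alternative model
    y_hat_J = ZJ_s @ gamma_J                   # predictor of Q0 y from the alternative
    M_s = np.column_stack([Z_s, y_hat_J])      # M* = (Z*, Y-hat*), as in (15.7.7)
    F_hat, M_hat = tsls(y_s, M_s, H)
    resid = y_s - M_s @ F_hat
    # White-type variance (assumption; the text relies on a spatial HAC estimator).
    bread = np.linalg.inv(M_hat.T @ M_hat)
    meat = (M_hat * resid[:, None] ** 2).T @ M_hat
    V = bread @ meat @ bread
    delta_hat = F_hat[-1]
    wald = delta_hat**2 / V[-1, -1]            # compare with a chi-squared(1) critical value
    return delta_hat, wald

# Hypothetical data generated under the null model.
rng = np.random.default_rng(5)
N, T = 50, 4
X = rng.normal(size=(N * T, 2))                # regressors of the null model
X_alt = rng.normal(size=(N * T, 1))            # regressor of the alternative model
y = X @ np.array([1.0, -0.5]) + np.tile(rng.normal(size=N), T) + rng.normal(size=N * T)
H = q0(np.hstack([X, X_alt]), N, T)

delta_hat, wald = j_test(y, X, X_alt, H, N, T)
print(delta_hat, wald)
```

With G > 1 alternatives, one predictor column per alternative would be appended and the Wald statistic would involve the full block of V corresponding to δ.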
The Instruments
In a manner similar to Chapter 6, let J be the N T × hJ matrix of observations
on the exogenous variables the researcher knows to be in the system determin-
ing YJ , and assume that hJ ≥ kJ,Y . Also, let XJ,− and J,− be identical to
XJ and J except that each element now is lagged by one time period. Let
J = (XJ , J , XJ,− , J,− ) and let
J = (J , (IT ⊗ WJ )J , ..., (IT ⊗ WJr )J )LI , J = 1, ..., G, (15.7.9)
where typically r = 2. Then, the instrument matrix for estimating the augmented
model is
H = Q0 (, 1 , ..., G )LI . (15.7.11)
Assumptions
The assumptions relating to the augmented model are quite similar to those
in Section 15.6. Specifically, with respect to the augmented model in (15.7.7)
assume Assumptions 15.1, 15.2, 15.3, and 15.4. Assumption 15.5 is replaced
by13
Assumption 15.6. The large sample theory relates to N → ∞, while T remains
fixed and finite, and
\[ \hat{F} \simeq N(F, \hat{S}\,\hat{\Omega}_{H'RR'H}\,\hat{S}'), \tag{15.7.13} \]
13. The assumptions below are “high” level assumptions which should be more than adequate to
convince the reader of the large sample result given below. More technical readers should see the
list of assumptions given in Kelejian and Piras (2016a and 2016b).
14. See Fingleton and Palombi (2015) for an alternative approach based on bootstrap methods.
\[ \hat{S} = (\hat{M}^{*\prime}\hat{M}^*)^{-1} M^{*\prime}H, \]
where the variable description can be found in Illustration 15.6.1, and p̄ is the
minimum price used by Baltagi and Levin (1992).
The model under H1 was identical to the one in Illustration 15.6.1. Following
the J -test procedure described in this section Kelejian and Piras (2016b) found
that at the 5% level, the J-test rejected the null model since the chi-squared
variable = 19.063 > $\chi_1^2$ = 3.841. They concluded that the cross-state purchases
are better captured by the model under the alternative that includes the spatial
lag of the dependent variable!
SUGGESTED PROBLEMS
1. Demonstrate the results given in (15.2.5), namely
Q0 Q0 = Q0 , (15.2.5)
Q1 Q1 = Q1 ,
Q0 + Q1 = IN T ,
Q0 Q1 = 0,
Q0 (eT ⊗ G) = 0.
aQ0 + bQ1
is
$a^{-1}Q_0 + b^{-1}Q_1$.
4. In reference to model (15.4.1),
(a) What would be required in order for the model in (15.4.1) to be estimated
by maximum likelihood?
(b) Suppose r < h in (15.4.1). Can the model still be estimated? Explain
why, or why not.