Wang Ho 2010 FeSFA
Journal of Econometrics
of inefficiency is the same for all individuals, so the problem of inseparable inefficiency and individual heterogeneity remains.

In all these models, the inability to separate inefficiency and individual heterogeneity is likely to limit their applicability in empirical studies. This point is lucidly made in Greene (2005), which conducts a cross-country comparison of health care service efficiency and argues that the (in)efficiency effect and the time-invariant country-specific effect are different and should be accounted for separately in the estimation. If, for example, the country-specific heterogeneity is not adequately controlled for, then the estimated inefficiency may be picking up country-specific heterogeneity in addition to or even instead of inefficiency. In this way, the inability of a model to estimate individual effects in addition to the inefficiency effect poses a problem for empirical research. Greene then proposes the "true fixed-effect" model, which is essentially a standard fixed-effect panel data model augmented by the inefficiency effect ($u_{it}$). The latter effect is allowed to change over time and across individuals in the model.

However, including both the inefficiency effect and fixed individual effects in the model significantly complicates its estimation. For a fixed-effect model, the number of fixed-effect parameters (also called incidental parameters since their values are usually not of direct interest) increases with the number of individuals ($N$). In this situation, the conventional asymptotic result, which relies on $N \to \infty$, cannot be applied and estimates of the incidental parameters are necessarily inconsistent for a fixed $T$ (number of observations per individual). For many estimators, inconsistency may also contaminate the estimates of the model's other parameters; the issue is referred to as the incidental parameters problem (Neyman and Scott, 1948). For instance, for linear models with normal errors, the maximum likelihood estimator (MLE) of the slope coefficients is still consistent, but that of the variance–covariance matrix is inconsistent (Kendall and Stuart, 1973; Mak, 1982). For nonlinear models, such as the binomial logit, the MLE of all of the model parameters is inconsistent in general. Incidental parameters would not be an issue and MLE would be consistent if $T \to \infty$, but this condition is seldom met in empirical applications.

Aside from the statistical issue, there is also a related computational problem. It arises because the number of parameters to be estimated is at least $N$, and so maximizing the model's log-likelihood function may be difficult when $N$ is large.1

1 Recent developments in computer algorithms have relaxed this constraint to some extent. For instance, the algorithm adopted by Greene (2005) is able to handle large problems and is available in the LimDep package.

The literature proposes some solutions to the incidental parameters problem for some of the models. The key to these solutions usually lies in removing the incidental parameters before estimation. One popular approach, which is widely used in linear models, is to transform the model by first-differencing or by within-transformation and then obtaining the marginal MLE (MMLE). Alternatively, a conditional likelihood may be formed if a sufficient statistic exists for the fixed effects, yielding a conditional MLE (CMLE). The likelihood functions of MMLE and CMLE do not contain incidental parameters, and the estimators are thus consistent (e.g., Cornwell and Schmidt, 1992).

These methods, however, are not readily applicable to stochastic frontier models. For the MMLE, the transformation is usually intractable because of the nonlinearity of the model. For the CMLE, the sufficient statistic is yet to be found. On the other hand, Greene (2005) suggested that the model may be estimated by MLE where individual dummies are included for the fixed effects. The numerical issue of estimating a large number of parameters is then handled by an advanced numerical maximization algorithm. Using a Monte Carlo experiment on a cross-country health-care data set, Greene (2005) found that the incidental parameters problem does not affect the slope coefficients of a stochastic frontier model, while there is also evidence suggesting that the variance parameters are more likely to be affected when $T$ is not large.

In this paper, we propose a different panel stochastic frontier model that has the true fixed-effect model specification and yet allows model transformations to be done while keeping the likelihood function tractable. After transforming the model by either first-difference or within-transformation, the fixed effects are removed before estimation, based on which we obtain a consistent MMLE for the panel stochastic frontier model. Our model differs from Greene's in three aspects. (1) Removing the fixed-effect parameters avoids the incidental parameters problem entirely, and consistency can be obtained by $N \to \infty$. (2) The model we consider is flexible in the sense that it allows the pre-truncation mean of the inefficiency variable to be non-zero (e.g., truncated-normal) and it accommodates exogenous determinants of inefficiency in the model. (3) No special maximization routine is required.

The proposed model shares important characteristics of the scaling-property model proposed by Wang and Schmidt (2002). That paper discusses the theoretical appeal of the scaling property in the context of cross-sectional data. Alvarez et al. (2006) discuss the use of the scaling-property model in the panel data context. Here, we show that the property can be manipulated such that model transformations of either first-difference or within-transformation can be performed analytically. We conduct a Monte Carlo experiment to evaluate the performance of the estimator, paying particular attention to the effects of different values of $N$ and $T$. We also compare the results to those of the dummy-variable based approach. Finally, we illustrate the use of the estimator in a capital investment model with a financing constraint, using data from Taiwan.

The rest of the paper is organized as follows. Section 2 presents the model and shows how the first-difference and within-transformation of the model can be performed. This section also provides marginal likelihood functions and the formula for estimating the inefficiency index for both of the transformed models. Section 3 provides Monte Carlo results on the models, which is followed by an empirical example in Section 4. Conclusions of the paper are given in Section 5.

2. The model

Consider a stochastic frontier model with the following specifications:

\[ y_{it} = \alpha_i + x_{it}\beta + \varepsilon_{it}, \tag{1} \]
\[ \varepsilon_{it} = v_{it} - u_{it}, \tag{2} \]
\[ v_{it} \sim N(0, \sigma_v^2), \tag{3} \]
\[ u_{it} = h_{it} \cdot u_i^*, \tag{4} \]
\[ h_{it} = f(z_{it}\delta), \tag{5} \]
\[ u_i^* \sim N^+(\mu, \sigma_u^2), \quad i = 1, \ldots, N, \; t = 1, \ldots, T. \tag{6} \]

In this setup, $\alpha_i$ is individual $i$'s fixed unobservable effect, $x_{it}$ is a $1 \times K$ vector of explanatory variables, $v_{it}$ is a zero-mean random error, $u_{it}$ is a stochastic variable measuring inefficiency, and $h_{it}$ is a positive function of a $1 \times L$ vector of non-stochastic inefficiency determinants ($z_{it}$). Neither $x_{it}$ nor $z_{it}$ contains constants (intercepts) because they are not identified. The notation "+" indicates that the underlying distribution is truncated from below at zero so that realized values of the random variable $u_i^*$ are positive. If we set $\mu$ equal to 0, then $u_i^*$ follows a half-normal distribution. The random variable $u_i^*$ is independent of all $T$ observations on $v_{it}$, and both $u_i^*$ and $v_{it}$ are independent of all $T$ observations on $\{x_{it}, z_{it}\}$. For example, in a study of technical inefficiency of production, $y_{it}$ is the log of output, $x_{it}$ is a vector of log inputs and other factors
288 H.-J. Wang, C.-W. Ho / Journal of Econometrics 157 (2010) 286–296
affecting production, $u_{it}$ is the technical inefficiency, which measures the percentage (when multiplied by 100) of output loss due to inefficiency, and $z_{it}$ is a vector of variables explaining the inefficiency.

The above model can be seen as a panel extension of the cross-sectional model of Wang and Schmidt (2002) (which attributed the idea to Simar et al., 1994). The extension shows up in the inclusion of the individual effects ($\alpha_i$) and in the specification of the time-invariant "basic" distribution $u_i^*$. As will be shown later, the time-invariant assumption of $u_i^*$ holds the key to a tractable model transformation.2

2 Alvarez et al. (2006) assumed that the basic distribution is $u_{it}^*$, although they briefly mentioned $u_i^*$ as another possible modeling strategy for panel data (p. 205).

The above model exhibits the "scaling property" that, conditional on $z_{it}$, the one-sided error term equals a scaling function $h_{it}$ multiplied by a one-sided error distributed independently of $z_{it}$. With this property, the shape of the underlying distribution of inefficiency is the same for all individuals, but the scale of the distribution is stretched or shrunk by observation-specific factors $z_{it}$. The time-invariant specification of $u_i^*$ allows the inefficiency $u_{it}$ to be correlated over time for a given individual. Compared to the independence assumption of $u_{it}$ used in some other panel models, the correlated inefficiency is another appealing property of the current model. Wang and Schmidt (2002) and Alvarez et al. (2006) discussed other advantages of the scaling property.

Whether the scaling property holds in the data is ultimately an empirical question. Nevertheless, note that the specification nests some of the models in the literature as special cases. By setting $\mu = 0$, the model is the same as that in Reifschneider and Stevenson (1991), Caudill and Ford (1993) and Caudill et al. (1995). Using a time trend variable in the place of $z_{it}$, i.e., $f(z_{it}\delta) = f(z_t\delta)$, the model essentially mimics the one proposed by Kumbhakar (1990) and Battese and Coelli (1992).

In the next two sections, we show that the fixed individual effect $\alpha_i$ can be removed from the model by either first-differencing or within-transforming the model.

The matrix has $2\sigma_v^2$ on the diagonal and $-\sigma_v^2$ on the off-diagonals.

3 The condition requires that $z_{it}$ contains at least one variable which changes values over time.

It is noteworthy that the model in (7)–(11) looks similar to the cross-sectional model of Wang and Schmidt (2002) except for the multivariate normal distribution and the obvious transformation of variables. More importantly, the truncated normal distribution of $u_i^*$ is not affected by the transformation. This key aspect of the model leads to a tractable likelihood function. Alternatively, consider a typical and simple model specification in which (4)–(6) are replaced by $u_{it} \sim N^+(0, \check{\sigma}_u^2)$. Even in its simplest form, first-differencing $u_{it}$ will not result in a known distribution, and the joint distribution involving the $\Delta v_{it}$ terms would be intractable.

After tedious but straightforward derivation, the marginal log-likelihood function of panel $i$ in the model is

\[ \ln L_i^D = -\frac{1}{2}(T-1)\ln(2\pi) - \frac{1}{2}\ln(T) - \frac{1}{2}(T-1)\ln(\sigma_v^2) - \frac{1}{2}\Delta\tilde{\varepsilon}_i' \Sigma^{-1} \Delta\tilde{\varepsilon}_i + \frac{1}{2}\left(\frac{\mu_*^2}{\sigma_*^2} - \frac{\mu^2}{\sigma_u^2}\right) + \ln\left(\sigma_* \Phi\left(\frac{\mu_*}{\sigma_*}\right)\right) - \ln\left(\sigma_u \Phi\left(\frac{\mu}{\sigma_u}\right)\right), \tag{13} \]

where

\[ \mu_* = \frac{\mu/\sigma_u^2 - \Delta\tilde{\varepsilon}_i' \Sigma^{-1} \Delta\tilde{h}_i}{\Delta\tilde{h}_i' \Sigma^{-1} \Delta\tilde{h}_i + 1/\sigma_u^2}, \tag{14} \]
\[ \sigma_*^2 = \frac{1}{\Delta\tilde{h}_i' \Sigma^{-1} \Delta\tilde{h}_i + 1/\sigma_u^2}, \tag{15} \]
\[ \Delta\tilde{\varepsilon}_i = \Delta\tilde{y}_i - \Delta\tilde{x}_i \beta. \tag{16} \]

In the expressions, $\Phi$ is the cumulative distribution function of a standard normal distribution. The marginal log-likelihood function of the model is obtained by summing the above function over $i = 1, \ldots, N$. The model parameters are estimated by numerically maximizing the marginal log-likelihood function of the model.

which is evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$.

2.2. Within-transformation

By within-transformation, the sample mean of each panel is subtracted from every observation in the panel. The transformation thus removes the time-invariant individual effect from the
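model. The following notation is helpful in discussing the model: $w_{i\cdot} = (1/T)\sum_{t=1}^{T} w_{it}$, $w_{it\cdot} = w_{it} - w_{i\cdot}$, and the stacked vector of $w_{it\cdot}$ for a given $i$ is $\tilde{w}_i = (w_{i1\cdot}, w_{i2\cdot}, \ldots, w_{iT\cdot})'$. The model after the transformation is

\[ \tilde{y}_i = \tilde{x}_i \beta + \tilde{\varepsilon}_i, \tag{18} \]
\[ \tilde{\varepsilon}_i = \tilde{v}_i - \tilde{u}_i, \tag{19} \]
\[ \tilde{v}_i \sim MN(0, \Pi), \tag{20} \]
\[ \tilde{u}_i = \tilde{h}_i u_i^*, \tag{21} \]
\[ u_i^* \sim N^+(\mu, \sigma_u^2), \quad i = 1, \ldots, N. \tag{22} \]

The variance–covariance matrix of $\tilde{v}_i$ is

\[ \Pi = \begin{pmatrix} \sigma_v^2(1 - 1/T) & \sigma_v^2(-1/T) & \cdots & \sigma_v^2(-1/T) \\ \sigma_v^2(-1/T) & \sigma_v^2(1 - 1/T) & \cdots & \sigma_v^2(-1/T) \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_v^2(-1/T) & \sigma_v^2(-1/T) & \cdots & \sigma_v^2(1 - 1/T) \end{pmatrix} = \sigma_v^2 \left( I_T - \frac{\iota\iota'}{T} \right) = \sigma_v^2 M, \tag{23} \]

where $\iota$ is a $T \times 1$ vector of 1's. For (21), note that

\[ u_{it\cdot} = u_{it} - u_{i\cdot} = h_{it} u_i^* - u_i^* \left( \frac{1}{T} \sum_{t=1}^{T} h_{it} \right) = (h_{it} - h_{i\cdot}) u_i^* = h_{it\cdot} u_i^*. \tag{24} \]

Eq. (21) is the stacked vector of $u_{it\cdot}$.

The above model is complicated by the fact that $M$ is a singular idempotent matrix and is not invertible. Here we use the singular multivariate normal distribution of Khatri (1968) to solve the problem. The density function of the vector $\tilde{v}_i$, which is defined on a $(T-1)$-dimensional subspace, is

\[ g(\tilde{v}_i) = \frac{1}{(\sqrt{2\pi})^{(T-1)} \sqrt{\sigma_v^{2(T-1)}}} \exp\left\{ -\frac{1}{2} \tilde{v}_i' \Pi^{-} \tilde{v}_i \right\}, \tag{25} \]

where $\Pi^{-}$ indicates the generalized inverse of $\Pi$, and $\sigma_v^{2(T-1)}$ is the product of nonzero eigenvalues of $\Pi$.4 The model's marginal likelihood function is then derived based on the joint distribution of $\tilde{v}_i$ and $\tilde{u}_i$. The marginal log-likelihood function of the $i$th panel is

\[ \ln L_i^W = -\frac{1}{2}(T-1)\ln(2\pi) - \frac{1}{2}(T-1)\ln(\sigma_v^2) - \frac{1}{2}\tilde{\varepsilon}_i' \Pi^{-} \tilde{\varepsilon}_i + \frac{1}{2}\left( \frac{\mu_{**}^2}{\sigma_{**}^2} - \frac{\mu^2}{\sigma_u^2} \right) + \ln\left( \sigma_{**} \Phi\left( \frac{\mu_{**}}{\sigma_{**}} \right) \right) - \ln\left( \sigma_u \Phi\left( \frac{\mu}{\sigma_u} \right) \right), \tag{26} \]

where

\[ \mu_{**} = \frac{\mu/\sigma_u^2 - \tilde{\varepsilon}_i' \Pi^{-} \tilde{h}_i}{\tilde{h}_i' \Pi^{-} \tilde{h}_i + 1/\sigma_u^2}, \tag{27} \]
\[ \sigma_{**}^2 = \frac{1}{\tilde{h}_i' \Pi^{-} \tilde{h}_i + 1/\sigma_u^2}, \tag{28} \]
\[ \tilde{\varepsilon}_i = \tilde{y}_i - \tilde{x}_i \beta. \tag{29} \]

The marginal log-likelihood function of the model is obtained by summing the above function over $i = 1, \ldots, N$.

4 Eigenvalues of an idempotent matrix are either 0 or 1, with the number of eigenvalues that are 1 equal to the rank of the matrix. The rank of the matrix $M$ is $T - 1$, so there is a total of $T - 1$ eigenvalues equal to $\sigma_v^2$ for the matrix $\Pi = \sigma_v^2 M$.

2.2.1. The inefficiency index

As we discussed in the case of the first-differenced model, the formula of Jondrow et al. (1982) can be applied here after $\hat{\alpha}_i$ is recovered to obtain an observation-specific inefficiency index. The estimator may not work very well for small samples because of the large sample assumption used in recovering $\hat{\alpha}_i$. Again, we propose a modified estimator which does not require $\hat{\alpha}_i$ and thus does not suffer from the approximation problem. The estimator is based on the conditional expectation of $u_{it}$ on $\tilde{\varepsilon}_i = \tilde{y}_i - \tilde{x}_i \beta$:

\[ E(u_{it} | \tilde{\varepsilon}_i) = h_{it} \left[ \mu_{**} + \frac{\phi\left( \frac{\mu_{**}}{\sigma_{**}} \right) \sigma_{**}}{\Phi\left( \frac{\mu_{**}}{\sigma_{**}} \right)} \right], \tag{30} \]

which is evaluated at $\tilde{\varepsilon}_i = \hat{\tilde{\varepsilon}}_i$.

2.2.2. Recovering values of individual fixed effects

Although the individual effects $\alpha_i$'s are not estimated in the model, their values can be recovered after the model's other parameters are estimated by either of the transformed models proposed above. A $T$-consistent estimator of $\alpha_i$ may be obtained by solving the first-order condition for $\alpha_i$ from the untransformed log-likelihood function of the model assuming all other parameters are known. Doing so, we have

\[ \hat{\alpha}_i = y_{i\cdot} - x_{i\cdot} \hat{\beta} + \hat{\mu}_{***} \hat{h}_{i\cdot} + \hat{\sigma}_{***} \hat{h}_{i\cdot} \frac{\phi\left( \frac{\hat{\mu}_{***}}{\hat{\sigma}_{***}} \right)}{\Phi\left( \frac{\hat{\mu}_{***}}{\hat{\sigma}_{***}} \right)}, \tag{31} \]

where

\[ \hat{\mu}_{***} = \frac{\hat{\mu} \hat{\sigma}_u^{-2} - \hat{\sigma}_v^{-2} \sum_{t}^{T} \hat{\varepsilon}_{it} \hat{h}_{it}}{\hat{\sigma}_v^{-2} \sum_{t}^{T} \hat{h}_{it}^2 + \hat{\sigma}_u^{-2}}, \tag{32} \]
\[ \hat{\sigma}_{***}^2 = \frac{\hat{\sigma}_v^2}{\sum_{t}^{T} \hat{h}_{it}^2 + \hat{\sigma}_v^2 \hat{\sigma}_u^{-2}}. \tag{33} \]

The hat symbol indicates the values estimated from either the first-difference model or the within-transformation model.

2.3. Equivalence of the two models

Although the two models proposed above may seem different, the likelihood functions are actually the same ($L_i^D \propto L_i^W$). To prove the equivalence of the estimates, we first observe that the models' likelihood functions, as stated in (13)–(16) and (26)–(29), differ only in terms involving the inverse of the variance–covariance matrices. In particular, if the following equations can be established, then the equivalence of the likelihood functions is obtained (for generality, $\Pi^{-1}$ is used in lieu of $\Pi^{-}$ in this section):

\[ \tilde{\varepsilon}_i' \Pi^{-1} \tilde{\varepsilon}_i = \Delta\tilde{\varepsilon}_i' \Sigma^{-1} \Delta\tilde{\varepsilon}_i, \tag{34} \]
\[ \tilde{h}_i' \Pi^{-1} \tilde{h}_i = \Delta\tilde{h}_i' \Sigma^{-1} \Delta\tilde{h}_i, \tag{35} \]
\[ \tilde{\varepsilon}_i' \Pi^{-1} \tilde{h}_i = \Delta\tilde{\varepsilon}_i' \Sigma^{-1} \Delta\tilde{h}_i. \tag{36} \]

A proof is sketched as follows.

Let $D$ be a $(T-1) \times T$ matrix of the first-difference projection matrix,

\[ D = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{pmatrix}. \tag{37} \]

The first-difference model is obtained by projecting the original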
Table 1
T = 5.
β = 0.5, δ = 0.5, µ = 0.5, σv2 = 0.1
σu2 /σv2 = 2 σu2 /σv2 = 1.5
N = 100 N = 100
First-difference Within-transf. First-difference Within-transf.
Mean MSE Mean MSE Mean MSE Mean MSE
β̂ 0.500 3.0 × 10−4 0.500 3.0 × 10−4 0.500 2.9 × 10−4 0.500 2.9 × 10−4
(0.017) (0.017) (0.017) (0.017)
δ̂ 0.502 0.007 0.502 0.007 0.502 0.008 0.502 0.008
(0.084) (0.084) (0.088) (0.087)
µ̂ 0.491 0.032 0.489 0.035 0.497 0.027 0.496 0.027
(0.180) (0.186) (0.164) (0.164)
σ̂u2 0.232 0.027 0.233 0.029 0.177 0.019 0.176 0.018
(0.163) (0.167) (0.134) (0.133)
σ̂v2 0.099 6.4 × 10−5 0.099 6.5 × 10−5 0.099 6.4 × 10−5 0.099 6.4 × 10−5
(0.008) (0.008) (0.008) (0.008)
E (uit |Θ )a 0.709 0.104 0.708 0.104 0.670 0.096 0.669 0.095
(0.137) (0.137) (0.137) (0.136)
corra 0.871 0.871 0.860 0.860
N = 200 N = 200
First-difference Within-transf. First-difference Within-transf.
Mean MSE Mean MSE Mean MSE Mean MSE
β̂ 0.499 1.6 × 10−4 0.499 1.6 × 10−4 0.499 1.6 × 10−4 0.499 1.6 × 10−4
(0.013) (0.013) (0.013) (0.013)
δ̂ 0.497 0.003 0.497 0.003 0.497 0.004 0.498 0.004
(0.057) (0.057) (0.060) (0.060)
µ̂ 0.500 0.014 0.497 0.013 0.502 0.011 0.501 0.011
(0.117) (0.115) (0.105) (0.105)
σ̂u2 0.218 0.011 0.219 0.011 0.165 0.007 0.165 0.007
(0.103) (0.103) (0.081) (0.082)
σ̂v2 0.100 3.0 × 10−5 0.100 3.0 × 10−5 0.100 3.0 × 10−5 0.100 2.9 × 10−5
(0.006) (0.005) (0.005) (0.005)
E (uit | Θ )a 0.705 0.091 0.703 0.091 0.665 0.083 0.663 0.083
(0.093) (0.091) (0.091) (0.091)
corra 0.874 0.875 0.864 0.864
N = 300 N = 300
First-difference Within-transf. First-difference Within-transf.
Mean MSE Mean MSE Mean MSE Mean MSE
β̂ 0.500 9.9 × 10−5 0.500 9.9 × 10−5 0.500 9.9 × 10−5 0.500 9.7 × 10−5
(0.010) (0.010) (0.010) (0.010)
δ̂ 0.499 0.002 0.500 0.002 0.499 0.002 0.501 0.002
(0.047) (0.048) (0.049) (0.050)
µ̂ 0.498 0.009 0.496 0.009 0.499 0.007 0.495 0.007
(0.097) (0.096) (0.085) (0.085)
σ̂u2 0.211 0.006 0.210 0.006 0.159 0.004 0.159 0.004
(0.078) (0.078) (0.061) (0.062)
σ̂v2 0.100 2.2 × 10−5 0.100 2.2 × 10−5 0.100 2.2 × 10−5 0.100 2.2 × 10−5
(0.005) (0.005) (0.005) (0.005)
E (uit |Θ )a 0.699 0.088 0.697 0.088 0.659 0.080 0.656 0.080
(0.073) (0.074) (0.071) (0.072)
corra 0.875 0.875 0.865 0.865
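a $E(u_{it}|\Theta) = E(u_{it}|\Delta\tilde{\varepsilon}_i)$ evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$ for the first-difference model, $E(u_{it}|\Theta) = E(u_{it}|\tilde{\varepsilon}_i)$ evaluated at $\tilde{\varepsilon}_i = \hat{\tilde{\varepsilon}}_i$ for the within-transformation model. corr = corr($E(u_{it}|\Theta)$, $u_{it}$). The standard deviations are in the parentheses.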
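The near-identical columns of Table 1 are in line with the equivalence stated in (34)–(36). A minimal numerical check of those three identities can be sketched in Python as follows. It assumes that $\Sigma = \sigma_v^2 DD'$ (which has $2\sigma_v^2$ on the diagonal and $-\sigma_v^2$ next to it) and uses $M/\sigma_v^2$ as the generalized inverse of $\Pi = \sigma_v^2 M$; the function names are hypothetical and not taken from the paper.

```python
import numpy as np


def first_difference_matrix(T):
    """(T-1) x T first-difference matrix D as laid out in (37)."""
    D = np.zeros((T - 1, T))
    for t in range(T - 1):
        D[t, t] = -1.0
        D[t, t + 1] = 1.0
    return D


def quadratic_forms(e, h, sigma_v2):
    """Return both sides of (34)-(36) for one panel.

    e, h : length-T vectors standing in for the errors and the scaling function.
    Assumption: Sigma = sigma_v2 * D D' and Pi^- = M / sigma_v2.
    """
    T = e.shape[0]
    D = first_difference_matrix(T)
    M = np.eye(T) - np.ones((T, T)) / T      # within (demeaning) projector
    Sigma_inv = np.linalg.inv(sigma_v2 * D @ D.T)
    Pi_ginv = M / sigma_v2                   # M is symmetric and idempotent

    e_w, h_w = M @ e, M @ h                  # within-transformed vectors
    e_d, h_d = D @ e, D @ h                  # first-differenced vectors

    within = (e_w @ Pi_ginv @ e_w, h_w @ Pi_ginv @ h_w, e_w @ Pi_ginv @ h_w)
    first_diff = (e_d @ Sigma_inv @ e_d, h_d @ Sigma_inv @ h_d, e_d @ Sigma_inv @ h_d)
    return within, first_diff


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    e = rng.normal(size=5)
    h = np.exp(rng.normal(size=5))
    within, first_diff = quadratic_forms(e, h, sigma_v2=0.1)
    print(np.allclose(within, first_diff))
```

The check works because $D'(DD')^{-1}D$ is the projector onto the space orthogonal to $\iota$, which is $M$ itself, so both transformations lead to the same quadratic forms.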
correlation coefficients between the estimated and the true values of the inefficiency index. Table 3 presents the results of models with $T = 15$. Regardless of the size of $N$, all parameters are estimated very well.

Finally, we add Table 4, which reports the results from models with larger values of $\sigma_u^2$. It is obvious from the table that a larger $\sigma_u^2$ causes $\mu$ and $\sigma_u^2$, both of which are parameters of $u_i^*$, to be estimated less precisely. On the other hand, the expected inefficiency index ($E(u_{it})$) is computed conditional on the composed error $\varepsilon_{it} = v_{it} - u_{it}$, and so the conditional information is more useful if $u_{it}$ accounts for a larger share of $\varepsilon_{it}$'s variance. The result is a higher correlation between the true and the estimated inefficiency index when $\sigma_u^2$ increases. The above observations are also found in the preceding tables.

3.2. Dummy variable models: A comparison

As shown in the previous subsection, the parameters are estimated very well in transformed models, and the estimation consistency improves with increases in either $N$ or $T$. To further understand the performance of the estimators, in this subsection we provide the simulation results of the model in which the fixed individual effects are estimated by dummy variables.6 The model

6 This is similar to the "true fixed-effect" model of Greene (2005) in that the fixed effects are estimated by dummy variables. The only difference is in the specification of $u_{it}$, which has a truncated-normal distribution with exogenous determinants in the current model. Greene assumes an i.i.d. half-normal distribution. Further investigation is needed to determine to what extent the results observed here pertain to Greene's model.
Table 2
T = 10.
β = 0.5, δ = 0.5, µ = 0.5, σv2 = 0.1
σu2 /σv2 = 2 σu2 /σv2 = 1.5
N = 100 N = 200 N = 100 N = 200
Mean MSE Mean MSE Mean MSE Mean MSE
β̂ 0.499 1.2 × 10−4 0.500 6.5 × 10−5 0.499 1.2 × 10−4 0.500 6.4 × 10−5
(0.011) (0.008) (0.011) (0.008)
δ̂ 0.502 0.002 0.501 0.001 0.502 0.003 0.502 0.001
(0.049) (0.034) (0.052) (0.036)
µ̂ 0.490 0.019 0.496 0.007 0.494 0.013 0.497 0.005
(0.136) (0.081) (0.112) (0.071)
σ̂u2 0.213 0.009 0.204 0.003 0.160 0.005 0.154 0.002
(0.096) (0.059) (0.072) (0.046)
σ̂v2 0.099 2.5 × 10−5 0.100 1.2 × 10−5 0.099 2.5 × 10−5 0.100 1.2 × 10−5
(0.005) (0.003) (0.005) (0.003)
E (uit |Θ )a 0.697 0.053 0.693 0.049 0.657 0.050 0.653 0.046
(0.083) (0.058) (0.081) (0.056)
corra 0.931 0.933 0.922 0.924
N = 300 N = 300
Mean MSE Mean MSE
β̂ 0.500 4.0 × 10−5 0.500 4.0 × 10−5
(0.006) (0.006)
δ̂ 0.502 0.001 0.502 0.001
(0.029) (0.031)
µ̂ 0.498 0.005 0.499 0.003
(0.068) (0.059)
σ̂u2 0.202 0.002 0.151 0.001
(0.049) (0.037)
σ̂v2 0.100 8.6 × 10−6 0.100 8.6 × 10−6
(0.003) (0.003)
E (uit |Θ )a 0.692 0.048 0.652 0.046
(0.049) (0.048)
corra 0.933 0.924
a $E(u_{it}|\Theta) = E(u_{it}|\Delta\tilde{\varepsilon}_i)$ evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$ for the first-difference model. corr = corr($E(u_{it}|\Theta)$, $u_{it}$). The standard deviations are in the parentheses.
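The first-difference estimates reported in the tables come from maximizing the marginal log-likelihood (13)–(16). The Python sketch below evaluates that function for a single panel. It is an illustration under stated assumptions rather than the authors' implementation: the scaling function is taken to be $h_{it} = \exp(z_{it}\delta)$ (the text only requires a positive $f$), $\Sigma$ is taken to be $\sigma_v^2 DD'$, and the function and argument names are hypothetical.

```python
import numpy as np
from math import erf, log, pi, sqrt


def _log_norm_cdf(x):
    """Log of the standard normal CDF (not tuned for very negative x)."""
    return log(0.5 * (1.0 + erf(x / sqrt(2.0))))


def loglik_first_difference(y, X, Z, beta, delta, mu, sigma_u2, sigma_v2):
    """Marginal log-likelihood ln L_i^D of one panel, following (13)-(16).

    y : (T,) outcomes, X : (T, K) regressors, Z : (T, L) inefficiency determinants.
    Assumptions: h_it = exp(z_it delta) and Sigma = sigma_v2 * D D'.
    """
    T = y.shape[0]
    D = np.diff(np.eye(T), axis=0)             # (T-1) x T first-difference matrix
    Sigma_inv = np.linalg.inv(sigma_v2 * D @ D.T)

    h = np.exp(Z @ delta)                      # assumed scaling function
    d_eps = D @ (y - X @ beta)                 # (16); the fixed effect drops out here
    d_h = D @ h

    denom = d_h @ Sigma_inv @ d_h + 1.0 / sigma_u2
    mu_star = (mu / sigma_u2 - d_eps @ Sigma_inv @ d_h) / denom    # (14)
    sigma_star = sqrt(1.0 / denom)                                 # (15)
    sigma_u = sqrt(sigma_u2)

    return (
        -0.5 * (T - 1) * log(2.0 * pi)
        - 0.5 * log(T)
        - 0.5 * (T - 1) * log(sigma_v2)
        - 0.5 * float(d_eps @ Sigma_inv @ d_eps)
        + 0.5 * (mu_star ** 2 / sigma_star ** 2 - mu ** 2 / sigma_u2)
        + log(sigma_star) + _log_norm_cdf(mu_star / sigma_star)
        - log(sigma_u) - _log_norm_cdf(mu / sigma_u)
    )
```

Summing this function over panels and passing the negative of the sum to a generic numerical optimizer would give the MMLE described in Section 2; because only differenced data enter, adding any constant to a panel's $y_{it}$ leaves the value unchanged.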
Table 3
T = 15.
β = 0.5, δ = 0.5, µ = 0.5, σv2 = 0.1
σu2 /σv2 = 2 σu2 /σv2 = 1.5
N = 100 N = 200 N = 100 N = 200
Mean MSE Mean MSE Mean MSE Mean MSE
β̂ 0.500 7.6 × 10−5 0.500 3.5 × 10−5 0.500 7.6 × 10−5 0.500 3.5 × 10−5
(0.009) (0.006) (0.009) (0.006)
δ̂ 0.499 0.001 0.498 0.001 0.499 0.002 0.498 0.001
(0.037) (0.025) (0.040) (0.026)
µ̂ 0.500 0.011 0.497 0.005 0.502 0.008 0.498 0.004
(0.106) (0.073) (0.089) (0.061)
σ̂u2 0.207 0.005 0.207 0.003 0.156 0.003 0.156 0.002
(0.073) (0.051) (0.055) (0.038)
σ̂v2 0.100 1.4 × 10−5 0.100 7.9 × 10−6 0.100 1.4 × 10−5 0.100 7.8 × 10−6
(0.004) (0.003) (0.004) (0.003)
E (uit |Θ )a 0.700 0.035 0.695 0.033 0.660 0.034 0.654 0.032
(0.070) (0.048) (0.068) (0.046)
corra 0.954 0.955 0.947 0.948
N = 300 N = 300
Mean MSE Mean MSE
β̂ 0.500 2.4 × 10−5 0.500 2.4 × 10−5
(0.005) (0.005)
δ̂ 0.500 4.6 × 10−4 0.500 0.001
(0.022) (0.023)
µ̂ 0.497 0.004 0.497 0.003
(0.060) (0.051)
σ̂u2 0.203 0.002 0.152 0.001
(0.040) (0.029)
σ̂v2 0.100 5.0 × 10−6 0.100 5.1 × 10−6
(0.002) (0.002)
E (uit |Θ )a 0.692 0.032 0.651 0.031
(0.041) (0.040)
corra 0.955 0.948
a $E(u_{it}|\Theta) = E(u_{it}|\Delta\tilde{\varepsilon}_i)$ evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$ for the first-difference model. corr = corr($E(u_{it}|\Theta)$, $u_{it}$). The standard deviations are in the parentheses.
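The rows labelled $E(u_{it}|\Theta)$ and corr rest on a conditional-mean estimator of inefficiency. As an illustration, the Python sketch below evaluates the within-transformation version, Eq. (30), with $\mu_{**}$ and $\sigma_{**}$ from (27) and (28). It assumes $\Pi^{-} = M/\sigma_v^2$ and takes the levels $h_{it}$ and the demeaned residuals as given inputs; the function name is hypothetical.

```python
import numpy as np
from math import erf, exp, pi, sqrt


def conditional_mean_inefficiency(eps_within, h, mu, sigma_u2, sigma_v2):
    """E(u_it | eps~_i) from (30) for one panel.

    eps_within : (T,) within-transformed residuals y~_i - x~_i beta.
    h          : (T,) scaling-function values h_it in levels.
    Assumption: the generalized inverse of Pi = sigma_v2 * M is M / sigma_v2.
    """
    T = h.shape[0]
    M = np.eye(T) - np.ones((T, T)) / T
    Pi_ginv = M / sigma_v2
    h_within = M @ h

    denom = h_within @ Pi_ginv @ h_within + 1.0 / sigma_u2
    mu_ss = (mu / sigma_u2 - eps_within @ Pi_ginv @ h_within) / denom   # (27)
    sigma_ss = sqrt(1.0 / denom)                                        # (28)

    r = mu_ss / sigma_ss
    pdf = exp(-0.5 * r * r) / sqrt(2.0 * pi)      # standard normal density
    cdf = 0.5 * (1.0 + erf(r / sqrt(2.0)))        # standard normal CDF
    return h * (mu_ss + sigma_ss * pdf / cdf)     # (30), one value per period
```

When $h_{it}$ does not vary over time the data carry no information about $u_i^*$, and the expression collapses to $h_{it}$ times the unconditional mean of the truncated normal, which is a convenient sanity check.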
Table 4
Larger σu2 .
β̂ 0.500 3.0 × 10−4 0.500 7.7 × 10−5 0.500 1.0 × 10−4 0.500 2.4 × 10−5
(0.017) (0.009) (0.010) (0.005)
δ̂ 0.501 0.006 0.499 0.001 0.499 0.002 0.500 3.7 × 10−4
(0.077) (0.033) (0.043) (0.019)
µ̂ 0.480 0.049 0.499 0.020 0.496 0.014 0.494 0.007
(0.221) (0.141) (0.117) (0.081)
σ̂u2 0.343 0.051 0.308 0.012 0.312 0.012 0.304 0.004
(0.221) (0.109) (0.107) (0.061)
σ̂v2 0.099 6.6 × 10−5 0.100 1.4 × 10−5 0.100 2.2 × 10−5 0.100 5.0 × 10−6
(0.008) (0.004) (0.005) (0.002)
E (uit |Θ )a 0.783 0.116 0.775 0.037 0.773 0.100 0.766 0.034
(0.139) (0.073) (0.075) (0.045)
corra 0.888 0.964 0.892 0.964
β̂ 0.500 3.0 × 10−4 0.500 7.7 × 10−5 0.500 1.0 × 10−4 0.500 2.5 × 10−5
(0.017) (0.009) (0.010) (0.005)
δ̂ 0.501 0.005 0.499 0.001 0.500 0.002 0.500 3.1 × 10−4
(0.072) (0.030) (0.040) (0.018)
µ̂ 0.469 0.075 0.495 0.030 0.494 0.019 0.493 0.010
(0.272) (0.172) (0.139) (0.098)
σ̂u2 0.453 0.082 0.411 0.020 0.413 0.018 0.406 0.007
(0.282) (0.143) (0.134) (0.080)
σ̂v2 0.099 6.6 × 10−5 0.100 1.5 × 10−5 0.100 2.2 × 10−5 0.100 5.0 × 10−6
(0.008) (0.004) (0.005) (0.002)
E (uit |Θ )a 0.849 0.125 0.842 0.038 0.839 0.109 0.834 0.035
(0.141) (0.076) (0.077) (0.046)
corra 0.901 0.970 0.904 0.970
a $E(u_{it}|\Theta) = E(u_{it}|\Delta\tilde{\varepsilon}_i)$ evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$ for the first-difference model. corr = corr($E(u_{it}|\Theta)$, $u_{it}$). The standard deviations are in the parentheses.
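Tables 1 to 4 summarize estimates from data simulated under model (1)–(6). The Python sketch below shows one way such a panel could be drawn with a single regressor and a single inefficiency determinant. The design details are assumptions made for illustration only (standard normal $x_{it}$ and $z_{it}$, standard normal fixed effects, $h_{it} = \exp(z_{it}\delta)$, rejection sampling for the truncated normal) and all names are hypothetical.

```python
import numpy as np


def draw_truncated_normal(rng, mu, sigma, size):
    """Draw from N(mu, sigma^2) truncated below at zero by rejection sampling."""
    out = np.empty(size)
    filled = 0
    while filled < size:
        draws = rng.normal(mu, sigma, size=size)
        keep = draws[draws > 0][: size - filled]
        out[filled: filled + keep.size] = keep
        filled += keep.size
    return out


def simulate_panel(N, T, beta=0.5, delta=0.5, mu=0.5, sigma_u2=0.2, sigma_v2=0.1, seed=0):
    """Simulate y_it = alpha_i + x_it beta + v_it - h_it u_i^* (illustrative design)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(N, T))                  # assumed regressor design
    z = rng.normal(size=(N, T))                  # assumed inefficiency determinant
    alpha = rng.normal(size=N)                   # assumed fixed effects
    u_star = draw_truncated_normal(rng, mu, np.sqrt(sigma_u2), N)
    h = np.exp(delta * z)                        # assumed scaling function
    u = h * u_star[:, None]                      # (4): u_it = h_it * u_i^*
    v = rng.normal(0.0, np.sqrt(sigma_v2), size=(N, T))
    y = alpha[:, None] + beta * x + v - u        # (1)-(2)
    return y, x, z, alpha, u


if __name__ == "__main__":
    y, x, z, alpha, u = simulate_panel(N=100, T=5)
    # First-differencing removes alpha_i: the differenced data do not depend on it.
    print(np.allclose(np.diff(y, axis=1), np.diff(y - alpha[:, None], axis=1)))
```

The default parameter values mirror the table headers ($\beta = \delta = \mu = 0.5$, $\sigma_v^2 = 0.1$, $\sigma_u^2/\sigma_v^2 = 2$); everything else about the design is a placeholder.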
suffers from the incidental parameters problem, and the simulation results show the consequences of not removing incidental parameters prior to estimation.

Since we wish to observe how values of $N$ and $T$ affect estimation, we simulate models with different configurations of $N = 100, 200, 300$ and $T = 5, 10, 15$. We choose $\sigma_u^2 = 0.2$ for all the models and keep other parameters the same as those used in the previous section. The results are presented in Table 5 (for selected models) and in Fig. 1. For comparison, we also reproduce the results of the corresponding first-difference models in the table and the figure.

For Model 1 ($N = 100$ and $T = 5$), $\beta$ is estimated very well and the estimate of $\sigma_v^2$ is also reasonably sound. The rest of the parameters, however, are very poorly estimated: $\hat{\delta} = 0.810$ ($\delta = 0.5$), $\hat{\mu} = 0.232$ ($\mu = 0.5$), and $\hat{\sigma}_u^2 = 0.056$ ($\sigma_u^2 = 0.20$). Note that these are all parameters of $u_{it}$. The biases are large and significant. The correlation coefficient between the estimated inefficiency ($E(u_{it}|\varepsilon_{it})$ evaluated at $\varepsilon_{it} = \hat{\varepsilon}_{it}$) and the true inefficiency is 0.711, clearly smaller than the value of 0.871 obtained from the first-difference model.

The above finding is consistent with Greene (2005), in which the author showed that the incidental parameters problem does not cause bias to the slope coefficients. The estimation problem arises mainly in the error variances estimation. However, since estimated inefficiency of a stochastic frontier model is based on the error variance, the empirical consequence of the incidental parameters problem cannot be ignored.

Model 2 keeps $N$ the same and increases $T$ to 15. As discussed earlier, larger $T$ helps the dummy-variable model gain consistency. The table shows that the estimation indeed improves with $T$ equal to 15, but the overall result is still unsatisfactory. For example, while $\hat{\delta}$ falls from 0.810 in Model 1 to 0.627 in Model 2, it still overestimates the true value by 25%. Given that effects of the inefficiency determinants often play an important role in empirical stochastic frontier analysis, this result should be alarming to empirical researchers. $\hat{\sigma}_u^2$ also suffers from a large bias of about −47%. The correlation coefficient between the true and the estimated index is 0.726, which is only a slight increase from the value of 0.711 obtained from Model 1. The first-difference model, on the other hand, reaps substantial gains from a larger $T$ as shown in the table.

Model 3 increases $N$ to 300 and keeps $T$ at 5. As expected, the dummy variable model does not benefit from an increase in $N$. There is no appreciable change in the parameter estimation compared with Model 1. The first-difference model, in contrast, improves substantially from an increase in $N$. Model 4 increases both $N$ and $T$. The results of the dummy-variable model show improvements when compared to Model 1, which is likely due to the effect of the large $T$.

It is worthwhile noting that, for all the cases presented in Table 5, the dummy variable model tends to overestimate the importance of exogenous determinant of inefficiency ($\delta$ too large) while underestimating $\sigma_u^2$. Because the inefficiency index is fundamentally affected by the variance parameters, large biases in $\delta$, $\mu$, and $\sigma_u^2$ have negative consequences on the estimated inefficiency. Using the figures in Table 5, it can be shown that the sample mean of the inefficiency index from the dummy variable model is about 18% to 40% smaller than that of the first-difference model. The correlation coefficient between the true and the estimated inefficiency index is also much smaller with the dummy variable model.

Fig. 1 plots the point estimates of $\hat{\delta}$, $\hat{\mu}$, and $\hat{\sigma}_u^2$ from all possible combinations of $N$ and $T$ in the simulation. Graphs in the left column have $N$ fixed at 100 while $T$ changes from 5 to 10 to 15.
Table 5
Dummy variable model: A comparison.
Model 1: N = 100, T = 5 Model 2: N = 100, T = 15
Dummy First-difference Dummy First-difference
Mean MSE Mean MSE Mean MSE Mean MSE
(std.) (std.) (std.) (std.)
β̂ 0.498 0.001 0.500 3.0 × 10−4 0.500 9.2 × 10−5 0.500 7.6 × 10−5
(0.027) (0.017) (0.010) (0.009)
δ̂ 0.810 0.115 0.502 0.007 0.627 0.018 0.499 0.001
(0.140) (0.084) (0.043) (0.037)
µ̂ 0.232 0.218 0.491 0.032 0.371 0.031 0.500 0.011
(0.383) (0.180) (0.120) (0.106)
σ̂u2 0.056 0.029 0.232 0.027 0.106 0.013 0.207 0.005
(0.092) (0.163) (0.066) (0.073)
σ̂v2 0.090 2.3 × 10−4 0.099 6.4 × 10−5 0.095 5.2 × 10−5 0.100 1.4 × 10−5
(0.011) (0.008) (0.005) (0.004)
E (u|Θ )a 0.449 1.331 0.709 0.104 0.545 0.192 0.700 0.035
(0.502) (0.137) (0.062) (0.070)
corra 0.711 0.871 0.726 0.954
Model 3: N = 300, T = 5 Model 4: N = 300, T = 15
Dummy First-difference Dummy First-difference
Mean MSE Mean MSE Mean MSE Mean MSE
(std.) (std.) (std.) (std.)
β̂ 0.498 1.6 × 10−4 0.500 9.9 × 10−5 0.499 3.2 × 10−5 0.500 2.4 × 10−5
(0.012) (0.010) (0.006) (0.005)
δ̂ 0.823 0.113 0.499 0.002 0.626 0.017 0.500 4.6 × 10−4
(0.091) (0.047) (0.033) (0.022)
µ̂ 0.255 0.085 0.498 0.009 0.371 0.028 0.497 0.004
(0.157) (0.097) (0.106) (0.060)
σ̂u2 0.041 0.029 0.211 0.006 0.106 0.013 0.203 0.002
(0.068) (0.078) (0.062) (0.040)
σ̂v2 0.090 1.7 × 10−4 0.100 2.2 × 10−5 0.095 4.0 × 10−5 0.100 5.0 × 10−6
(0.008) (0.005) (0.003) (0.002)
E (u|Θ )a 0.422 0.251 0.699 0.088 0.542 0.194 0.692 0.032
(0.078) (0.073) (0.052) (0.041)
corra 0.718 0.875 0.724 0.955
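a $E(u_{it}|\Theta) = E(u_{it}|\Delta\tilde{\varepsilon}_i)$ evaluated at $\Delta\tilde{\varepsilon}_i = \Delta\hat{\tilde{\varepsilon}}_i$ for the first-difference model, $E(u_{it}|\Theta) = E(u_{it}|\varepsilon_{it})$ evaluated at $\varepsilon_{it} = \hat{\varepsilon}_{it}$ for the dummy model. corr = corr($E(u_{it}|\Theta)$, $u_{it}$). The standard deviations are in the parentheses.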
Graphs in the right column have $T$ fixed at 5 as $N$ changes from 100 to 200 to 300. The figure clearly shows that the estimates of the dummy-variable model do not benefit from increases in $N$ (because of the incidental parameters problem). While the point estimates improve with larger $T$, with $T = 15$ the performance is still inferior to that of the first-difference model.
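The percentage figures quoted in this comparison follow from simple arithmetic on the entries of Table 5. The short Python sketch below redoes that arithmetic; the numbers are copied from the table, while the function and variable names are hypothetical.

```python
def relative_bias(estimate, truth):
    """Relative bias of a point estimate, as a fraction of the true value."""
    return (estimate - truth) / truth


# True values used in the simulation (delta = 0.5, sigma_u^2 = 0.2).
TRUE_DELTA, TRUE_SIGMA_U2 = 0.5, 0.2

# Dummy-variable estimates for Model 2 (N = 100, T = 15), from Table 5.
bias_delta = relative_bias(0.627, TRUE_DELTA)
bias_sigma_u2 = relative_bias(0.106, TRUE_SIGMA_U2)

# Mean estimated inefficiency as (dummy, first-difference) for Models 1 to 4.
MEAN_INEFFICIENCY = {
    1: (0.449, 0.709),
    2: (0.545, 0.700),
    3: (0.422, 0.699),
    4: (0.542, 0.692),
}
# Fraction by which the dummy-variable mean falls short of the first-difference mean.
shortfall = {m: 1.0 - dummy / fd for m, (dummy, fd) in MEAN_INEFFICIENCY.items()}

if __name__ == "__main__":
    print(f"delta bias: {bias_delta:+.1%}, sigma_u^2 bias: {bias_sigma_u2:+.1%}")
    for model, value in shortfall.items():
        print(f"Model {model}: dummy mean is {value:.1%} below first-difference mean")
```

The two biases reproduce the 25% and −47% quoted for Model 2. For the four models reported in Table 5 the shortfall computed this way runs from roughly 22% to 40%, so the lower end of the range quoted in the text may rest on a slightly different comparison.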
Table 7
Empirical results.
Model 1 (dummy) Model 2 (within)
Coeff. Std. err. Coeff. Std. err.
Frontier
with the extent of financing constraints. In any case, the validity of using cash flow to gauge financing constraints is controversial in the literature (see, e.g., Fazzari et al., 2000, Kaplan and Zingales, 2000) and the empirical results are mixed.

As for the dummy-variable model (Model 1; log-likelihood value = −1512.152), the coefficients of the $Q$ and sales variables are quite close to those of Model 2. On the other hand, the coefficient of the log of asset is much larger in size compared to Model 2 while the estimate of $\sigma_u^2$ is much smaller. The mean of the conditional expectation of $\exp(-u_{it})$ also shows that the dummy-variable model implies a higher investment efficiency (lower financing constraints).

These observations are consistent with the simulation results (see, in particular, Model 3 in Table 5 for similar $N$ and $T$), which show that the $\beta$ coefficients are always similar in both models while the dummy-variable model tends to overestimate the impact of the inefficiency determinants (e.g., $\ln \mathrm{Assets}_{it}$) and underestimate the size of $\sigma_u^2$.

We end this section by discussing the model's time-varying characteristic of the inefficiency index. As mentioned earlier, the model's inefficiency is a product of a time-varying function $f(z_{it}\delta)$ and a time-invariant random variable $u_i^*$. This combination yields a specification that is in between the time-constant assumption of inefficiency ($u_{it} = u_i$, i.e., Schmidt and Sickles, 1984) and

model. Our models' desirable statistical properties and their ease of estimation should appeal to empirical researchers.

Similar to Greene's (2005) finding, our Monte Carlo results indicate that while the incidental parameters problem does not affect the estimation of slope coefficients, it does introduce bias to the estimated model residuals. The situation cannot be remedied with a larger $N$, and can only be improved by increasing $T$. Since the inefficiency estimation is based on model residuals and the estimation is often at the core of a stochastic frontier analysis study, the incidental parameters problem should concern empirical researchers, particularly when $T$ is not large.

References

Alvarez, A., Amsler, C., Orea, L., Schmidt, P., 2006. Interpreting and testing the scaling property in models where inefficiency depends on firm characteristics. Journal of Productivity Analysis 25, 201–212.
Battese, G.E., Coelli, T.J., 1992. Frontier production functions, technical efficiency and panel data: With application to paddy farmers in India. Journal of Productivity Analysis 3, 153–169.
Carpenter, R.E., Fazzari, S.M., Petersen, B.C., 1994. Inventory investment, internal-finance fluctuations, and the business cycle. Brookings Papers on Economic Activity 2, 75–137.
Caudill, S.B., Ford, J.M., 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters 41, 17–20.
Caudill, S.B., Ford, J.M., Gropper, D.M., 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business
and Economic Statistics 13, 105–111.
the observation-independent assumption (e.g., Greene, 2005). In Chen, N.-K., Wang, H.-J., 2008. Identifying the demand and supply effects of financial
a related context, Greene (2002, 2005) found that the difference crises on bank credit—Evidence from Taiwan. Southern Economic Journal 75,
between the predictions of models with time-varying and time- 26–49.
Chirinko, R.S., 1993. Business fixed investment spending: Modeling strategies,
invariant inefficiency is vast and unsettling. empirical results, and policy implications. Journal of Economic Literature 31,
To see if the time-varying or the time-constant properties of the 1875–1911.
model dominates in the estimated efficiency, we compute, as an Cornwell, C., Schmidt, P., 1992. Models for which the MLE and the conditional MLE
coincide. Empirical Economics 17, 67–75.
approximation, the mean and the standard deviation of the effi- Fazzari, S.M., Hubbard, R.G., Petersen, B.C., 2000. Investment-cash flow sensitivities
ciency index within each firm. In particular, we assess the cross- are useful a comment on Kaplan and Zingales. Quarterly Journal of Economics
time variation of the efficiency index by calculating the standard 115, 695–705.
Gertler, M., Gilchrist, S., 1994. Monetary policy, business cycles, and the behavior of
deviation of the index for each firm (separately) and then average small manufacturing firms. Quarterly Journal of Economics 109, 309–340.
the figures across firms. This yields a mean standard deviation of Gilchrist, S., Himmelberg, C.P., 1995. Evidence on the role of cash flow for
0.020 (which would be 0 if the model has a time-constant speci- investment. Journal of Monetary Economics 36, 541–572.
Greene, W., 2002. Fixed and random effects in stochastic frontier models, Stern
fication of inefficiency). On the other hand, the mean of the effi- School of Business, New York University.
ciency index across firms is 0.571. A one standard deviation above Greene, W., 2005. Reconsidering heterogeneity in panel data estimators of the
and below the mean puts the efficiency index between 0.551 to stochastic frontier model. Journal of Econometrics 126, 269–303.
Hayashi, F., 1985. Corporate finance side of the Q theory of investment. Journal of
0.591 for this sample. Although not a precise measure, these num- Public Economics 27, 261–280.
bers suggest that the time variation of inefficiency is on the lower Jondrow, J., Lovell, C.A.K., Materov, I.S., Schmidt, P., 1982. On the estimation of
side. However, the numbers are not totally unreasonable given that technical inefficiency in the stochastic frontier production function model.
Journal of Econometrics 19, 233–238.
they are from firms in a six-year span. Whether the low time vari- Kaplan, S.N., Zingales, L., 2000. Investment-cash flow sensitivities are not valid
ation is due to data or is a property intrinsic to the proposed model measures of financing constraints. Quarterly Journal of Economics 115,
specification remains an issue for further investigation. 707–712.
Kendall, M.G., Stuart, A., 1973. The Advanced Theory of Statistics. Griffin, London.
Khatri, C.G., 1968. Some results for the singular normal multivariate regression
5. Conclusion models. Sankhya 30, 267–280.
Kumbhakar, S.C., 1990. Production frontiers, panel data, and time-varying technical
inefficiency. Journal of Econometrics 46, 201–211.
Recent literature has emphasized the importance of separating Lewellen, W.G., Badrinath, S.G., 1997. On the measurement of Tobin’s q. Journal of
inefficiency and fixed individual effects in a panel stochastic Financial Economics 44, 77–122.
Mak, T.K., 1982. Estimation in the presence of incidental parameters. The Canadian
frontier model. In this paper, we propose a class of panel stochastic
Journal of Statistics 10, 121–132.
frontier models that take account of both time-varying inefficiency Neyman, J., Scott, E.L., 1948. Consistent estimation from partially consistent
and time-invariant individual effects. An important feature of these observations. Econometrica 16, 1–32.
Osterberg, W.P., 1989. Tobin’s q, investment, and the endogenous adjustment of
models is that simple transformations can be performed to remove
financial structure. Journal of Public Economics 40, 293–318.
the fixed individual effects prior to estimation. The first-difference Reifschneider, D., Stevenson, R., 1991. Systematic departures from the frontier: A
and within-transformation methods, which cannot normally be framework for the analysis of firm inefficiency. International Economic Review
used on stochastic frontier models due to their complicated error 32, 715–723.
Schmidt, P., Sickles, R.C., 1984. Production frontiers and panel data. Journal of
structure, eliminate the problem of incidental parameters brought Business and Economic Statistics 2, 367–374.
about by the inclusion of fixed individual effects in the model. Simar, L., Lovell, C.A.K., Eeckaut, P.V., 1994. Stochastic frontiers incorporating
The transformed models proposed in this paper in general per- exogenous influences on efficiency, Discussion Papers No.9403. Institut de
Statistique, University Catholique de Louvain.
formed quite well in our Monte Carlo study. Most importantly, con- Wang, H.-J., 2003. A stochastic frontier analysis of financing constraints on
sistency of the parameter estimates can be improved by increasing investment: The case of financial liberalization in Taiwan. Journal of Business
either T or N (or both). In addition, because the fixed individual and Economic Statistics 21, 406–419.
Wang, H.-J., Schmidt, P., 2002. One-step and two-step estimation of the effects
effects are removed by model transformations, the number of pa- of exogenous variables on technical efficiency levels. Journal of Productivity
rameters to be estimated is no more than that of a cross-sectional Analysis 18, 129–144.