$$
\ell_{\mathrm{DHM}} = \sum_{i=1}^{n} Y_{i,t_i} \log\left[ \frac{h(t_i, x_{i,t_i}, z_{i,t_i})}{1 - h(t_i, x_{i,t_i}, z_{i,t_i})} \right] + \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log\{1 - h(j, x_{i,j}, z_{i,j})\}.
$$
Here, $h(j, x_{i,j}, z_{i,j})$ is the value of the hazard function, indicating the probability that bankruptcy occurs at time $j$ for the $i$th company given that it is non-bankrupt before time $j$, for each $j = 1, \ldots, t_i$ and $i = 1, \ldots, n$.
Note that the hazard function $h(t, x, z)$ in $\ell_{\mathrm{DHM}}$ can be of any functional form with values in the interval $(0, 1)$.
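Shumway (2001) considered a linear logistic function for the hazard function:
$$
h(t, x, z) = \frac{\exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \delta_1 z\}}{1 + \exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \delta_1 z\}},
$$
where $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ are $1 \times 1$, $1 \times 1$, $1 \times d$, and $1 \times q$ vectors of parameters, respectively. Given the linear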
1056 K. F. Cheng et al.
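logistic hazard function, the resulting log-likelihood of the panel data becomes
$$
\ell_{\mathrm{DHM}} = \sum_{i=1}^{n} Y_{i,t_i} \{\alpha_1 + \beta_1 \log(t_i) + \gamma_1 x_{i,t_i} + \delta_1 z_{i,t_i}\} - \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log[1 + \exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}].
$$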
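The step from the general log-likelihood to the logistic form can be checked term by term. The shorthand $\eta_{i,j}$ for the linear predictor is introduced here for convenience only and is not part of the original notation.

```latex
\eta_{i,j} = \alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j},
\qquad
h(j, x_{i,j}, z_{i,j}) = \frac{e^{\eta_{i,j}}}{1 + e^{\eta_{i,j}}}
\;\Longrightarrow\;
\log\frac{h}{1-h} = \eta_{i,j},
\qquad
\log(1-h) = -\log\{1 + e^{\eta_{i,j}}\}.
```

Substituting the first identity into the bankruptcy term and the second into the double sum gives the expression above.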
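The maximum likelihood estimates of parameters $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ can be simply obtained by solving the normal equations:
$$
0 = \sum_{i=1}^{n} Y_{i,t_i}
\begin{bmatrix} 1 \\ \log(t_i) \\ x_{i,t_i} \\ z_{i,t_i} \end{bmatrix}
- \sum_{i=1}^{n} \sum_{j=1}^{t_i}
\frac{\exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}}{1 + \exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}}
\begin{bmatrix} 1 \\ \log(j) \\ x_{i,j} \\ z_{i,j} \end{bmatrix}.
$$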
Based on the maximum likelihood estimates $\hat{\alpha}_1$, $\hat{\beta}_1$, $\hat{\gamma}_1$, and $\hat{\delta}_1$, if a firm has predictor values $(x_0, z_0)$ at time $t_0$, then its predicted instant bankruptcy probability can be given by
$$
\hat{h}(t_0, x_0, z_0) = \frac{\exp\{\hat{\alpha}_1 + \hat{\beta}_1 \log(t_0) + \hat{\gamma}_1 x_0 + \hat{\delta}_1 z_0\}}{1 + \exp\{\hat{\alpha}_1 + \hat{\beta}_1 \log(t_0) + \hat{\gamma}_1 x_0 + \hat{\delta}_1 z_0\}}.
$$
Cox and Oakes (1984) showed that the maximum likelihood estimates $\hat{\alpha}_1$, $\hat{\beta}_1$, $\hat{\gamma}_1$, and $\hat{\delta}_1$ are consistent for $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$, respectively. Thus, the resulting predicted instant bankruptcy probability $\hat{h}(t_0, x_0, z_0)$ converges to the true instant bankruptcy probability $h(t_0, x_0, z_0)$. This result shows that DHM should be an efficient bankruptcy prediction model if the hazard function is correctly specified.
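As a rough illustration of the estimation just described, the sketch below fits the linear logistic hazard by Newton-Raphson. It relies on the observation that the DHM log-likelihood equals an ordinary logistic log-likelihood over firm-period rows in which the response is 0 on every row except possibly a firm's last. The function names, the synthetic data, and the parameter values are assumptions made for illustration; the rows are simulated independently rather than drawn from a real panel.

```python
import numpy as np

def hazard(eta):
    """Logistic hazard exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

def dhm_loglik(theta, D, y):
    """DHM log-likelihood over firm-period rows.

    D: (N, p) rows [1, log(j), x_ij, z_ij], one per firm-period.
    y: (N,) equals Y_{i,t_i} on a firm's last row and 0 on earlier rows.
    """
    eta = D @ theta
    return float(y @ eta - np.sum(np.logaddexp(0.0, eta)))

def fit_dhm(D, y, n_iter=50, tol=1e-10):
    """Solve the normal equations by Newton-Raphson, starting from zero."""
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        h = hazard(D @ theta)
        score = D.T @ (y - h)                          # right-hand side of the normal equations
        info = (D * (h * (1.0 - h))[:, None]).T @ D    # negative Hessian
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Synthetic firm-period rows (illustrative values only).
rng = np.random.default_rng(0)
N = 4000
t = rng.integers(1, 11, size=N)
x = rng.normal(size=N)
z = rng.integers(0, 2, size=N)
D = np.column_stack([np.ones(N), np.log(t), x, z])
true_theta = np.array([-2.0, 0.3, 0.8, 0.5])
y = (rng.random(N) < hazard(D @ true_theta)).astype(float)

theta_hat = fit_dhm(D, y)
# Predicted instant bankruptcy probability at t0 = 5, x0 = 0.2, z0 = 1.
h0 = hazard(np.array([1.0, np.log(5.0), 0.2, 1.0]) @ theta_hat)
```

Any standard logistic regression routine applied to the same stacked rows should give the same estimates; the explicit Newton loop is shown only to mirror the normal equations.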
2.2. DSHM
The main advantage of DHM lies in its simplicity of
computation and interpretation, but the linear logistic
function for modeling the hazard function may not be
proper. If one chooses a parametric hazard function that
is not appropriate, then the resulting model-based instant
bankruptcy probability prediction might not correctly
estimate the true probability, and there is a danger of
coming to an erroneous prediction.
The limitation of DHM can be addressed by removing the restriction that the hazard function belongs to a particular parametric family. In this paper, we suggest a DSHM, which is more flexible in modeling the hazard function. The DSHM is constructed by replacing the parametric hazard function in DHM with a semiparametric hazard function. That is, we assume the hazard function belongs to the family
$$
h^{*}(t, x, z) = \frac{\exp\{\beta \log(t) + m(x) + \delta z\}}{1 + \exp\{\beta \log(t) + m(x) + \delta z\}}.
$$
Here, $\beta$ and $\delta$ are unknown parameters, and $m(x)$ is an unknown but smooth function of the value $x$ of the $d$-dimensional continuous predictor $X$. Following the same development of $\ell_{\mathrm{DHM}}$, the corresponding log-likelihood function of the panel data based on our DSHM is expressed by
$$
\ell^{*} = \sum_{i=1}^{n} Y_{i,t_i} \{\beta \log(t_i) + m(x_{i,t_i}) + \delta z_{i,t_i}\} - \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log[1 + \exp\{\beta \log(j) + m(x_{i,j}) + \delta z_{i,j}\}].
$$
For a company with predictor values $(x_0, z_0)$ at time $t_0$, if $\beta$, $m(x_0)$, and $\delta$ can be efficiently estimated by $\hat{\beta}$, $\hat{m}(x_0)$, and $\hat{\delta}$, respectively, then the firm's instant bankruptcy probability can be predicted by
$$
\hat{h}^{*}(t_0, x_0, z_0) = \frac{\exp\{\hat{\beta} \log(t_0) + \hat{m}(x_0) + \hat{\delta} z_0\}}{1 + \exp\{\hat{\beta} \log(t_0) + \hat{m}(x_0) + \hat{\delta} z_0\}}.
$$
In sections 2.3 and 2.4, we show how to estimate parameters $\beta$, $m(x_0)$, and $\delta$ using a local likelihood method. The advantage of this approach will be seen from empirical studies given in section 3.
2.3. A local likelihood method
There exist many well-known methods for estimating $\beta$, $m(x_0)$, and $\delta$, where $x_0$ is any given value of the $d$-dimensional continuous predictor $X$. One of the simplest is the local likelihood method; see, for example, Tibshirani and Hastie (1987), Staniswalis (1989), Fan et al. (1995), and Hwang et al. (2007). The basic rationale of the local likelihood method is to center the data around $x_0$ and weight the likelihood in such a way that it places more emphasis on those observations nearest to $x_0$.
The idea of the local likelihood method can be simply explained by first introducing a neighborhood $S(x_0) = \{x = (x_1, \ldots, x_d)^T : \|x - x_0\| \le b\}$ of $x_0$. Here $b$ is a positive constant, called the bandwidth, to be determined later from the sampled data. The notation $\|x\|$ denotes the Euclidean norm of the given vector $x$. If the value of $b$ is small enough and $x_{i,j}$ belongs to $S(x_0)$, then Taylor's first-order expansion states that
$$
m(x_{i,j}) \approx m(x_0) + m^{(1)}(x_0)^T (x_{i,j} - x_0),
$$
and such $m(x_{i,j})$ in the likelihood can be written as $\eta_0 + \eta_1 (x_{i,j} - x_0)$, where we denote $m(x_0)$ by $\eta_0$ and $m^{(1)}(x_0)^T$ by $\eta_1$. Note that $\eta_0$ is a scalar parameter and $\eta_1$ is a $1 \times d$ vector of parameters.
Predicting bankruptcy using the discrete-time semiparametric hazard model 1057
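To make an inference about $\boldsymbol{\alpha} = (\eta_0, \beta, \eta_1, \delta)$, we suggest modifying the likelihood function $\ell^{*}$ to the following local (weighted) log-likelihood function:
$$
\ell^{*}_{0}(\boldsymbol{\alpha}; x_0) = \sum_{i=1}^{n} Y_{i,t_i} \{\eta_0 + \beta \log(t_i) + \eta_1 (x_{i,t_i} - x_0) + \delta z_{i,t_i}\} W(x_{i,t_i}) - \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log[1 + \exp\{\eta_0 + \beta \log(j) + \eta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}] W(x_{i,j}).
$$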
Here $W(x)$ is called the weight function, and the simplest weight $W(x_{i,j})$ assigned to the observation $(Y_{i,j}, x_{i,j}, z_{i,j})$ is the indicator value $I\{x_{i,j} \in S(x_0)\}$. However, a more general weighting scheme can be used for defining the local likelihood. This can be achieved, for example, by introducing a symmetric and unimodal probability density function $K_b(x)$, and defining $W(x_{i,j}) = K_b(x_{i,j} - x_0)$. In the paper, we suggest that $K_b(x)$ be taken as the joint probability density function of $d$ independent normal random variables $N(0, b^2)$. Given such $K_b(x)$, we point out that if the value of $b^{-1}\|x_{i,j} - x_0\|$ becomes larger because of choosing a smaller value of $b$, then the effect of the observation $(Y_{i,j}, x_{i,j}, z_{i,j})$ on estimating the important parameters in DSHM will tend to be smaller or even non-existent. This indicates that the value of $b$ can be used to control which sample observations are included in the analysis. The results in the literature show that the choice of bandwidth $b$ plays an important role in the analysis. Some discussions of the above weighting method can be found in the monographs of Eubank (1988), Müller (1988), Härdle (1990, 1991), Scott (1992), Wand and Jones (1995), Fan and Gijbels (1996), and Simonoff (1996). In this paper, we select $W(x_{i,j}) = K_b(x_{i,j} - x_0)$ in all analyses.
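To make the two weighting schemes concrete, here is a minimal sketch of the indicator weight and of the Gaussian product kernel $K_b$. The function names and the three example points are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_product_kernel(X, x0, b):
    """K_b(x - x0): joint density of d independent N(0, b^2) variables."""
    X = np.atleast_2d(X)
    d = X.shape[1]
    u = (X - x0) / b
    return np.exp(-0.5 * np.sum(u**2, axis=1)) / ((2.0 * np.pi) ** (d / 2) * b**d)

def indicator_weight(X, x0, b):
    """Simplest weight: 1 if the Euclidean distance to x0 is at most b, else 0."""
    X = np.atleast_2d(X)
    return (np.linalg.norm(X - x0, axis=1) <= b).astype(float)

x0 = np.array([0.0, 0.0])
X = np.array([[0.0, 0.0], [0.5, 0.0], [2.0, 2.0]])
w_wide = gaussian_product_kernel(X, x0, b=1.0)
w_narrow = gaussian_product_kernel(X, x0, b=0.25)
w_ind = indicator_weight(X, x0, b=1.0)
```

Comparing `w_narrow` with `w_wide` shows the effect described above: shrinking the bandwidth reduces the relative weight of the distant third point far more than that of the nearby ones.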
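Set $\tilde{\boldsymbol{\alpha}} = (\tilde{\eta}_0, \tilde{\beta}, \tilde{\eta}_1, \tilde{\delta})$ as the maximizer of $\ell^{*}_{0}(\boldsymbol{\alpha}; x_0)$. The maximum local likelihood estimate $\tilde{\boldsymbol{\alpha}}$ can also be equivalently obtained by solving a system of weighted normal equations
$$
0 = \sum_{i=1}^{n} Y_{i,t_i}
\begin{bmatrix} 1 \\ \log(t_i) \\ x_{i,t_i} - x_0 \\ z_{i,t_i} \end{bmatrix}
K_b(x_{i,t_i} - x_0)
- \sum_{i=1}^{n} \sum_{j=1}^{t_i}
\frac{\exp\{\eta_0 + \beta \log(j) + \eta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}}{1 + \exp\{\eta_0 + \beta \log(j) + \eta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}}
\begin{bmatrix} 1 \\ \log(j) \\ x_{i,j} - x_0 \\ z_{i,j} \end{bmatrix}
K_b(x_{i,j} - x_0).
$$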
We define $\tilde{m}(x_0) = \tilde{\eta}_0$ to indicate that it is an estimate of $m(x_0)$. We also point out that $\beta$ and $\delta$ are global parameters and their corresponding estimates produced from $\tilde{\boldsymbol{\alpha}}$ may not be efficient, since such estimates are derived by maximizing a local log-likelihood depending on $x_0$. In section 2.4, we show how more efficient estimates of $\beta$, $m(x_0)$, and $\delta$ can be achieved.
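The local likelihood estimate of this section can be sketched as a kernel-weighted logistic fit centered at $x_0$. The code below is an illustration under stated assumptions: the function name `local_fit`, the synthetic data, and the choice $m(x) = -2 + x^2$ are hypothetical, and the kernel's normalizing constant is dropped because it does not change the maximizer.

```python
import numpy as np

def local_fit(x0, b, t, X, Z, y, n_iter=100, tol=1e-10):
    """Maximize the kernel-weighted log-likelihood at x0 by Newton-Raphson.

    Returns (eta0, beta, eta1, delta); eta0 is the local estimate of m(x0).
    """
    n, d = X.shape
    u = (X - x0) / b
    w = np.exp(-0.5 * np.sum(u**2, axis=1))            # unnormalized Gaussian product kernel
    D = np.column_stack([np.ones(n), np.log(t), X - x0, Z])
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        h = 1.0 / (1.0 + np.exp(-(D @ theta)))
        score = D.T @ (w * (y - h))                    # weighted normal equations
        info = (D * (w * h * (1.0 - h))[:, None]).T @ D
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta[0], theta[1], theta[2:2 + d], theta[2 + d:]

# Synthetic firm-period rows with a nonlinear m(x) (illustrative only).
rng = np.random.default_rng(0)
n = 3000
t = rng.integers(1, 11, size=n)
X = rng.normal(size=(n, 1))
Z = rng.integers(0, 2, size=n).astype(float)
eta_true = 0.3 * np.log(t) + (-2.0 + X[:, 0] ** 2) + 0.5 * Z
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

m_tilde, beta_tilde, eta1_tilde, delta_tilde = local_fit(np.array([1.0]), 0.5, t, X, Z, y)
```

Here the true value is $m(1) = -1$, so `m_tilde` should land in that vicinity, up to sampling noise and the smoothing bias introduced by the bandwidth.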
2.4. More powerful estimates of parameters in DSHM
More powerful estimates of $\beta$, $m(x_0)$, and $\delta$ can be derived using the following two-step procedure. We first note that, for each value $x_{i,j}$, an initial estimate $\tilde{m}(x_{i,j})$ of $m(x_{i,j})$ can be obtained by the method outlined in section 2.3. The two-step procedure includes the following.
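Step 1: $\beta$ and $\delta$ are estimated by maximizing the pseudo log-likelihood
$$
\ell^{*}_{1}(\beta, \delta) = \sum_{i=1}^{n} Y_{i,t_i} \{\beta \log(t_i) + \tilde{m}(x_{i,t_i}) + \delta z_{i,t_i}\} - \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log[1 + \exp\{\beta \log(j) + \tilde{m}(x_{i,j}) + \delta z_{i,j}\}],
$$
or, equivalently, solving equations
$$
0 = \sum_{i=1}^{n} Y_{i,t_i}
\begin{bmatrix} \log(t_i) \\ z_{i,t_i} \end{bmatrix}
- \sum_{i=1}^{n} \sum_{j=1}^{t_i}
\frac{\exp\{\beta \log(j) + \tilde{m}(x_{i,j}) + \delta z_{i,j}\}}{1 + \exp\{\beta \log(j) + \tilde{m}(x_{i,j}) + \delta z_{i,j}\}}
\begin{bmatrix} \log(j) \\ z_{i,j} \end{bmatrix}.
$$
Let the estimates of $(\beta, \delta)$ be $(\hat{\beta}, \hat{\delta})$, the maximizer of $\ell^{*}_{1}(\beta, \delta)$. Here $\ell^{*}_{1}(\beta, \delta)$ is obtained by replacing each $m(x_{i,j})$ in $\ell^{*}$ with its initial estimate $\tilde{m}(x_{i,j})$.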
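Step 2: $m(x_0)$ is estimated by maximizing the pseudo local log-likelihood
$$
\ell^{*}_{2}(\eta_0, \eta_1; x_0) = \sum_{i=1}^{n} Y_{i,t_i} \{\eta_0 + \hat{\beta} \log(t_i) + \eta_1 (x_{i,t_i} - x_0) + \hat{\delta} z_{i,t_i}\} K_g(x_{i,t_i} - x_0) - \sum_{i=1}^{n} \sum_{j=1}^{t_i} \log[1 + \exp\{\eta_0 + \hat{\beta} \log(j) + \eta_1 (x_{i,j} - x_0) + \hat{\delta} z_{i,j}\}] K_g(x_{i,j} - x_0),
$$
or, equivalently, solving equations
$$
0 = \sum_{i=1}^{n} Y_{i,t_i}
\begin{bmatrix} 1 \\ x_{i,t_i} - x_0 \end{bmatrix}
K_g(x_{i,t_i} - x_0)
- \sum_{i=1}^{n} \sum_{j=1}^{t_i}
\frac{\exp\{\eta_0 + \hat{\beta} \log(j) + \eta_1 (x_{i,j} - x_0) + \hat{\delta} z_{i,j}\}}{1 + \exp\{\eta_0 + \hat{\beta} \log(j) + \eta_1 (x_{i,j} - x_0) + \hat{\delta} z_{i,j}\}}
\begin{bmatrix} 1 \\ x_{i,j} - x_0 \end{bmatrix}
K_g(x_{i,j} - x_0).
$$
Set $(\hat{\eta}_0, \hat{\eta}_1)$ as the maximizer of $\ell^{*}_{2}(\eta_0, \eta_1; x_0)$. The estimate of $m(x_0)$ is given by $\hat{m}(x_0) = \hat{\eta}_0$. Here $\ell^{*}_{2}(\eta_0, \eta_1; x_0)$ is obtained by replacing $\beta$ and $\delta$ in $\ell^{*}_{0}(\boldsymbol{\alpha}; x_0)$ with their estimates produced in step 1.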
We note that in step 2 we have used a different bandwidth $g$ in the local likelihood method. We allow $b$ and $g$ to be different in the analysis but emphasize that both values will be determined by the sampled data (see our proposal given in section 2.6). We suggest that the final estimates of $\beta$, $m(x_0)$, and $\delta$ be defined by $\hat{\beta}$, $\hat{m}(x_0)$, and $\hat{\delta}$. Also, at time $t_0$, the predicted instant bankruptcy
probability of a firm with predictor values $(x_0, z_0)$ is suggested to be defined by
$$
\hat{h}^{*}(t_0, x_0, z_0) = \frac{\exp\{\hat{\beta} \log(t_0) + \hat{m}(x_0) + \hat{\delta} z_0\}}{1 + \exp\{\hat{\beta} \log(t_0) + \hat{m}(x_0) + \hat{\delta} z_0\}}.
$$
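The two-step procedure can be sketched with a single weighted logistic solver that accepts a fixed offset: step 1 plugs the initial estimates $\tilde{m}(x_{i,j})$ in as an offset, and step 2 holds $\hat{\beta}\log(j) + \hat{\delta} z_{i,j}$ fixed as an offset while refitting locally with bandwidth $g$. All function names, the synthetic data, and the bandwidth values below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fit_logit(D, y, w=None, offset=None, n_iter=100, tol=1e-9):
    """Newton-Raphson for a weighted logistic likelihood with a fixed offset."""
    n = len(y)
    w = np.ones(n) if w is None else w
    offset = np.zeros(n) if offset is None else offset
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        h = 1.0 / (1.0 + np.exp(-(offset + D @ theta)))
        score = D.T @ (w * (y - h))
        info = (D * (w * h * (1.0 - h))[:, None]).T @ D
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

def kern(X, x0, bw):
    """Unnormalized Gaussian product kernel centered at x0."""
    return np.exp(-0.5 * np.sum(((X - x0) / bw) ** 2, axis=1))

def two_step(x0, b, g, t, X, Z, y):
    """Return (beta_hat, delta_hat, m_hat(x0)) from the two-step procedure."""
    n = X.shape[0]
    G = np.column_stack([np.log(t), Z])                 # columns for beta and delta
    # Initial estimates m~(x_ij): one local fit (section 2.3) per observed x.
    m_tilde = np.empty(n)
    for r in range(n):
        D = np.column_stack([np.ones(n), X - X[r], G])
        m_tilde[r] = fit_logit(D, y, w=kern(X, X[r], b))[0]
    # Step 1: global parameters, with m~ entering as a known offset.
    gd = fit_logit(G, y, offset=m_tilde)
    # Step 2: local fit at x0 with bandwidth g, global part held fixed.
    L = np.column_stack([np.ones(n), X - x0])
    eta = fit_logit(L, y, w=kern(X, x0, g), offset=G @ gd)
    return gd[0], gd[1:], eta[0]

# Small synthetic example (illustrative only).
rng = np.random.default_rng(1)
n = 400
t = rng.integers(1, 11, size=n)
X = rng.uniform(-1.0, 1.0, size=(n, 1))
Z = rng.integers(0, 2, size=n).astype(float)
eta_true = 0.3 * np.log(t) + (-1.0 + X[:, 0] ** 2) + 0.5 * Z
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

beta_hat, delta_hat, m_hat = two_step(np.array([0.5]), 0.6, 0.6, t, X, Z, y)
# Predicted instant bankruptcy probability at t0 = 5, z0 = 1.
h_hat = 1.0 / (1.0 + np.exp(-(beta_hat * np.log(5.0) + m_hat + delta_hat @ np.array([1.0]))))
```

The loop over all observed $x_{i,j}$ makes the initial stage the costly part, which is one reason a small sample is used in this sketch.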
2.5. Selecting parametric hazard function using $\hat{m}(x)$
The estimated function $\hat{m}(x)$ of $m(x)$ can be used to determine the functional form of the logit of the hazard function. We recall that in the usual DHM a linear logistic function is assumed for the hazard function. That is, the logit transformation of the hazard function is a linear function of the $d$-dimensional continuous predictors. We note that, for the $j$th predictor value $x_j$ in $x$, the relation between the logit-transformed hazard function and $x_j$ can be determined visually by plotting $\{x_j, \hat{m}(x)\}$, for each $j = 1, \ldots, d$. Here $x$ in the plot of $\{x_j, \hat{m}(x)\}$ has the $j$th component as $x_j$, but all other components are fixed at their sample median levels, since the distributions of explanatory variables in the financial field are usually fat-tailed and skewed. Using the plots, a proper functional form of the logit of the hazard function can be determined. For example, if the plot of $\{x_j, \hat{m}(x)\}$, for some $j$, presents a cubic relation, then the relation between the logit-transformed hazard function and $x_j$ should be an order-three polynomial. In the empirical examples discussed in section 3, we apply this strategy to propose a new parametric hazard function. We denote the parametric hazard function derived from using the plots of $\{x_j, \hat{m}(x)\}$, for $j = 1, \ldots, d$, by $h^{\#}(t, x, z)$. The DHM based on such a data-based parametric hazard function $h^{\#}(t, x, z)$ is denoted DHM$^{\#}$ in the analysis.
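The evaluation points behind each plot of $\{x_j, \hat{m}(x)\}$ can be built as follows: vary the $j$th component and hold the others at their sample medians. The helper name `profile_grid` is hypothetical, and the 0.5 and 99.5 percentile end points follow the plotting range described in section 3.2.

```python
import numpy as np

def profile_grid(X, j, n_points=50, lo_pct=0.5, hi_pct=99.5):
    """Points whose jth component varies while the others sit at sample medians."""
    med = np.median(X, axis=0)
    lo, hi = np.percentile(X[:, j], [lo_pct, hi_pct])
    grid = np.tile(med, (n_points, 1))
    grid[:, j] = np.linspace(lo, hi, n_points)
    return grid

# Illustrative predictor matrix with three continuous predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
grid = profile_grid(X, j=1)
```

Evaluating an estimate of $m$ at each row of `grid` and plotting it against `grid[:, 1]` gives the marginal curve for the second predictor.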
2.6. Bankruptcy prediction
Theoretical argument shows $\hat{h}^{*}(t_0, x_0, z_0)$ to be a consistent estimator of the instant bankruptcy probability. This means that a reliable bankruptcy prediction system can be established based on the estimate $\hat{h}^{*}(t_0, x_0, z_0)$. In this paper, we suggest that if a firm has predictor values $x_0$ and $z_0$ at time $t_0$ and the calculated probability $\hat{h}^{*}(t_0, x_0, z_0)$ is no more than a given cut-off value $p$, then this firm is classified as healthy. Otherwise, it is classified as bankrupt.
To decide a proper cut-off value $p$, usually one would use all of the panel data to evaluate the performance of the classification scheme. For simplicity of computation, we suggest only using the dataset $\{(Y_{i,t_i}, x_{i,t_i}, z_{i,t_i}),\ i = 1, \ldots, n\}$, collected at the last observation time of each company in the sampling period. There are two types of in-sample error rates occurring in this evaluation:
$$
\text{Type I error rate}\quad \alpha_{\mathrm{in}}(p) = \frac{\sum_{i=1}^{n} Y_{i,t_i}\, I\{\hat{h}^{*}(t_i, x_{i,t_i}, z_{i,t_i}) \le p\}}{\sum_{i=1}^{n} Y_{i,t_i}},
$$
and
$$
\text{Type II error rate}\quad \beta_{\mathrm{in}}(p) = \frac{\sum_{i=1}^{n} (1 - Y_{i,t_i})\, I\{\hat{h}^{*}(t_i, x_{i,t_i}, z_{i,t_i}) > p\}}{\sum_{i=1}^{n} (1 - Y_{i,t_i})},
$$
where $p \in [0, 1]$ and $I(\cdot)$ stands for the indicator function. Using the cut-off value $p$, $\alpha_{\mathrm{in}}(p)$ is the rate of misclassifying a bankrupt company as a healthy company, and $\beta_{\mathrm{in}}(p)$ is the rate of misclassifying a healthy company as a bankrupt company.
To keep these two error rates as small as possible, we determine a proper cut-off value $p^{*}$ for the bankruptcy prediction method based on DSHM such that
$$
\tau_{\mathrm{in}}(p^{*}) = \alpha_{\mathrm{in}}(p^{*}) + \beta_{\mathrm{in}}(p^{*}) = \min_{p \in [0,1],\ \alpha_{\mathrm{in}}(p) \le u} \{\alpha_{\mathrm{in}}(p) + \beta_{\mathrm{in}}(p)\},
$$
for each $u \in [0, 1]$. That is, the in-sample type I error rate $\alpha_{\mathrm{in}}(p)$ is controlled to be at most $u$, and subject to that constraint the sum of the two in-sample error rates is minimized. Controlling the magnitude of $\alpha_{\mathrm{in}}(p)$ is essential if the type I error would cause much more severe losses to the investors. On the other hand, if classifying healthy firms as being bankrupt would cause more severe losses to the investor, we might control the in-sample type II error rate $\beta_{\mathrm{in}}(p)$ instead. In practice, the value of $u$ is determined by the investor. If there is no restriction on the magnitude of $\alpha_{\mathrm{in}}(p)$ and $\beta_{\mathrm{in}}(p)$, then we simply take $u = 1$ (Altman 1968, Ohlson 1980, Begley et al. 1996).
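The cut-off rule above can be sketched directly from the two error-rate definitions. The function names, the six predicted probabilities, and the candidate grid are assumptions made up for illustration; ties in the total error rate are broken in favor of the smallest cut-off, which is a choice of this sketch.

```python
import numpy as np

def error_rates(h_hat, y, p):
    """In-sample type I and type II error rates at cut-off p."""
    h_hat = np.asarray(h_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    type1 = np.sum(y * (h_hat <= p)) / np.sum(y)              # bankrupt classified healthy
    type2 = np.sum((1.0 - y) * (h_hat > p)) / np.sum(1.0 - y)  # healthy classified bankrupt
    return type1, type2

def choose_cutoff(h_hat, y, u, grid):
    """Cut-off minimizing type I + type II subject to type I <= u."""
    best_p, best_total = None, np.inf
    for p in grid:
        a, b = error_rates(h_hat, y, p)
        if a <= u and a + b < best_total:
            best_p, best_total = p, a + b
    return best_p, best_total

# Hypothetical predicted probabilities and bankruptcy indicators.
h_hat = np.array([0.03, 0.12, 0.22, 0.61, 0.83, 0.91])
y = np.array([0, 0, 1, 0, 1, 1])
grid = np.linspace(0.0, 1.0, 21)

p_star, total = choose_cutoff(h_hat, y, u=0.0, grid=grid)
```

With these toy values no bankrupt firm is missed at the selected cut-off, and one of the three healthy firms is flagged.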
Recall that the DSHM also depends on the bandwidths $b$ and $g$. Thus we need to generalize the previous method for defining $p^{*}$. We suggest considering the in-sample type I and II error rates as functions of $p$, $b$, and $g$, denoted, respectively, as $\alpha_{\mathrm{in}}(p, b, g)$ and $\beta_{\mathrm{in}}(p, b, g)$. For each given $u \in [0, 1]$, the proper cut-off value $p^{*}$ and bandwidths $b$ and $g$ are then determined simultaneously by minimizing
$$
\tau_{\mathrm{in}}(p, b, g) = \alpha_{\mathrm{in}}(p, b, g) + \beta_{\mathrm{in}}(p, b, g)
$$
with respect to $(p, b, g)$ under the constraints: $p \in [0, 1]$, $b > 0$, $g > 0$, and $\alpha_{\mathrm{in}}(p, b, g) \le u$. Such values for $p^{*}$, $b$, and $g$ are denoted, respectively, as $\hat{p}(u)$, $\hat{b}(u)$, and $\hat{g}(u)$.
2.7. Measuring prediction performance
The performance of the bankruptcy prediction rule based on DSHM is measured by the out-of-sample error rates. To compute these error rates, the out-of-sample data are selected. In contrast, the panel data used to build the bankruptcy prediction rule are considered as the in-sample data. The out-of-sample data are generated similarly to the panel data for building prediction models. The out-of-sample period is from January 2001 to December 2004. The out-of-sample companies include all healthy firms in the panel data and the new firms beginning their listing on the New York Stock Exchange, American Stock Exchange, or NASDAQ during the out-of-sample period. Assume that there are $n_0$ out-of-sample companies. All predictor values occurring at the last observation time of the $n_0$ out-of-sample companies in the out-of-sample period were also collected from both
COMPUSTAT and CRSP databases. The out-of-sample data are denoted by
$$
(\tilde{Y}_{k,\tilde{t}_k},\ \tilde{x}_{k,\tilde{t}_k},\ \tilde{z}_{k,\tilde{t}_k}), \quad k = 1, \ldots, n_0.
$$
Here, for the $k$th out-of-sample company, $\tilde{t}_k \in \{1, \ldots, T_0\}$ denotes the length of duration, where $T_0$ is a positive integer indicating the length of the out-of-sample period. At the last observation time $\tilde{t}_k$, $\tilde{Y}_{k,\tilde{t}_k} = 1$ indicates that the $k$th company is bankrupt, and $\tilde{Y}_{k,\tilde{t}_k} = 0$ otherwise. Further, $\tilde{x}_{k,\tilde{t}_k}$ and $\tilde{z}_{k,\tilde{t}_k}$ are values of explanatory variables $X$ and $Z$ collected at time $\tilde{t}_k$, respectively.
Given each value of $u \in [0, 1]$, the out-of-sample error rates for the bankruptcy prediction rule based on DSHM are defined by
$$
\text{Type I error rate}\quad \alpha_{\mathrm{out}}(u) = \frac{\sum_{k=1}^{n_0} \tilde{Y}_{k,\tilde{t}_k}\, I\{\hat{h}^{*}(\tilde{t}_k, \tilde{x}_{k,\tilde{t}_k}, \tilde{z}_{k,\tilde{t}_k}) \le \hat{p}(u)\}}{\sum_{k=1}^{n_0} \tilde{Y}_{k,\tilde{t}_k}},
$$
$$
\text{Type II error rate}\quad \beta_{\mathrm{out}}(u) = \frac{\sum_{k=1}^{n_0} (1 - \tilde{Y}_{k,\tilde{t}_k})\, I\{\hat{h}^{*}(\tilde{t}_k, \tilde{x}_{k,\tilde{t}_k}, \tilde{z}_{k,\tilde{t}_k}) > \hat{p}(u)\}}{\sum_{k=1}^{n_0} (1 - \tilde{Y}_{k,\tilde{t}_k})},
$$
and the total error rate is $\tau_{\mathrm{out}}(u) = \alpha_{\mathrm{out}}(u) + \beta_{\mathrm{out}}(u)$. Given the out-of-sample data, the out-of-sample error rates can be similarly defined for the bankruptcy prediction rules based on DHM and DHM$^{\#}$.
3. Empirical studies
In this section, empirical studies are conducted to compare the performance of the prediction rules based on DHM, DHM$^{\#}$, and DSHM.
3.1. The data
Four panel datasets were considered for empirical studies.
The predictors considered were the accounting variables
and market-driven variables suggested by Ohlson (1980)
and Shumway (2001). Ohlson (1980) suggested using nine
accounting variables:
WCTA = Working capital divided by total assets,
TLTA = Total liabilities divided by total assets,
NITA = Net income divided by total assets,
CLCA = Current liabilities divided by current assets,
FUTL = Funds provided by operations divided by total liabilities,
CHIN = $(NI_t - NI_{t-1}) / (|NI_t| + |NI_{t-1}|)$, where $NI_t$ is net income for the most recent period,
SIZE = Logarithm of total assets divided by GNP price-level index, where the index assumes a base value of 100 for 1984,
INTWO = One if net income was negative for the last two years, zero otherwise,
OENEG = One if total liabilities exceed total assets, zero otherwise.
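Of the variables listed, CHIN is the only one defined by a formula rather than a plain ratio or indicator, so a small worked sketch may help. The function name and the income figures are hypothetical; the guard for a zero denominator is an assumption of this sketch, since the source does not say how that case is handled.

```python
def chin(ni_t, ni_prev):
    """CHIN = (NI_t - NI_{t-1}) / (|NI_t| + |NI_{t-1}|), which lies in [-1, 1]."""
    denom = abs(ni_t) + abs(ni_prev)
    if denom == 0:
        raise ValueError("CHIN is undefined when both net incomes are zero")
    return (ni_t - ni_prev) / denom

# Net income rising from 100 to 120.
chin_up = chin(120.0, 100.0)
# Net income swinging from +50 to -50 hits the lower bound.
chin_down = chin(-50.0, 50.0)
```

The scaling by absolute values keeps the measure bounded, which matches the range of CHIN reported in the summary statistics.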
Shumway (2001) suggested using only two accounting variables, TLTA and NITA, in the model. Besides accounting variables, Shumway (2001) and Chava and Jarrow (2004) further suggested using market-driven variables such as
RSIZE = Logarithm of each firm's market equity value divided by the total NYSE/AMEX/NASDAQ market equity value,
EXRET = Monthly return on the firm minus the value-weighted CRSP NYSE/AMEX/NASDAQ index return, cumulated to obtain the yearly return,
as well as the variable
LNAGE = Logarithm of firm age
for prediction. Here the firm age is defined as the number of calendar years it has been traded during the sampling period on the New York Stock Exchange, American Stock Exchange, or NASDAQ (Shumway 2001).
Based on these predictors, we studied the performance
of the prediction rules using combinations of accounting
and market-driven variables. We considered two studies,
with and without market-driven variables, for each set of
accounting variables. The variable LNAGE was always
included in the prediction models, since the models
considered in this paper depend on the hazard function
(see definitions of DHM and DSHM). Later, we shall
report the empirical results of the prediction rules using
the four different sets of panel data.
The sampling period of each of the four panel datasets
(for building prediction model) was taken from January
1984 to December 2000. The out-of-sample period (for
measuring prediction performance) was from January
2001 to December 2004. All firms starting their listing
on the New York Stock Exchange, American Stock
Exchange, or NASDAQ during both sampling periods
are included in the studies, except that the financial
institutions were eliminated from the sample due to the
unique capital requirements and regulatory structure in
that industry group. All panel and out-of-sample datasets
were selected from both COMPUSTAT and CRSP
databases. Companies that were delisted with CRSP delisting codes 400–490, 572, or 574 were considered bankrupt; all other companies were considered healthy.
Note that COMPUSTAT and CRSP databases contain many missing values for the predictors in each study. In the analysis we therefore only considered those companies in the dataset with complete predictor values. The problem of missing data is not unusual in applications, especially when there are many predictive variables used in the model. But as long as the missingness occurs at random, the complete-data analysis will not introduce systematic biases (Allison 2001, Little and Rubin 2002). Here we have no reason to doubt that the values missing from the COMPUSTAT and CRSP databases are missing at random.
In each study, the DHM with linear logistic hazard
function h(t, x, z) and the DSHM with semiparametric
logistic hazard function $h^{*}(t, x, z)$ were considered. A modified DHM, denoted by DHM$^{\#}$, using the data-based parametric hazard function $h^{\#}(t, x, z)$ suggested from the result of DSHM, is also included in the analysis so that a comparison can be made.
3.2. Computational procedures
In computing the DSHM, the values of the continuous predictors were first divided by their respective sample standard deviations so that all variables have the same scale. This is important, since it avoids the influence of a predictor with a very large range in estimating the optimal values of $(p, b, g)$ and in reading the plots of $\{x_j, \hat{m}(x)\}$, for $j = 1, \ldots, d$.
A grid-search approach was used in computing the optimal values of $(p, b, g)$. First, the values of $\tau_{\mathrm{in}}(p, b, g)$ on an equally spaced logarithmic grid of $1001 \times 51 \times 51$ values of $(p, b, g)$ in $[10^{-5}, 1] \times [0.5, 5] \times [0.5, 5]$ were computed. See Marron and Wand (1992) for a discussion that an equally spaced grid of parameters is typically not a very efficient design for this type of grid search. Given each value of $u \in [0, 1]$, the global minimizer $\{\hat{p}(u), \hat{b}(u), \hat{g}(u)\}$ of $\tau_{\mathrm{in}}(p, b, g)$ on the grid points with restriction $\alpha_{\mathrm{in}}(p, b, g) \le u$ was taken as the optimal values of $(p, b, g)$. Based on these optimal values, the out-of-sample error rates, functions of $u$, can then be computed according to our previous definitions.
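The grid described above can be generated with log-spaced points, and the constrained minimization then reduces to an exhaustive loop. In the sketch below, `alpha_in` and `tau_in` are hypothetical callables standing in for the in-sample error rates; the toy versions at the bottom exist only to exercise the search on a small grid and carry no empirical meaning.

```python
import numpy as np

def make_grids(n_p=1001, n_b=51, n_g=51):
    """Log-spaced grids for the cut-off p and the bandwidths b and g."""
    p_grid = np.logspace(-5.0, 0.0, n_p)                        # [1e-5, 1]
    b_grid = np.logspace(np.log10(0.5), np.log10(5.0), n_b)     # [0.5, 5]
    g_grid = np.logspace(np.log10(0.5), np.log10(5.0), n_g)     # [0.5, 5]
    return p_grid, b_grid, g_grid

def grid_search(alpha_in, tau_in, u, grids):
    """Minimize tau_in over the grid subject to alpha_in <= u."""
    p_grid, b_grid, g_grid = grids
    best, best_tau = None, np.inf
    for b in b_grid:
        for g in g_grid:
            for p in p_grid:
                if alpha_in(p, b, g) <= u:
                    tau = tau_in(p, b, g)
                    if tau < best_tau:
                        best, best_tau = (p, b, g), tau
    return best, best_tau

p_grid, b_grid, g_grid = make_grids()

# Toy stand-ins for the error-rate functions, searched on a coarse grid.
toy_alpha = lambda p, b, g: p
toy_tau = lambda p, b, g: (np.log10(p) + 2.0) ** 2 + (b - 1.0) ** 2 + (g - 2.0) ** 2
best, best_tau = grid_search(toy_alpha, toy_tau, u=1.0, grids=make_grids(11, 5, 5))
```

In a real run the model only needs refitting when $b$ or $g$ changes, so looping over $p$ innermost, as here, keeps the expensive fits to the outer loops.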
In addition, using $\{\hat{p}(1), \hat{b}(1), \hat{g}(1)\}$, the plot of $\{x_j, \hat{m}(x)\}$ can be produced for each continuous predictor. We note that, in the plot of $\{x_j, \hat{m}(x)\}$, we have taken the left and the right boundary points of its horizontal axis as the 0.5 and 99.5 percentiles of the values of the $j$th component of the continuous predictor $X$, for each $j$. These plots are used to visually check the adequacy of the order-one polynomial function assumed for each continuous predictor in the linear logistic hazard function of DHM. The empirical results given below show that, sometimes, the order-one polynomial functions should be replaced by order-two or order-three polynomials in order to yield better predictive power.
3.3. Results based on using Ohlson's accounting variables with and without market-driven variables included
Given the two datasets with and without the market-driven variables included, table 1 reports the summary statistics and the estimated coefficients of DHM, and figure 1 presents the plot of $\{x_j, \hat{m}(x)\}$ for each continuous predictor. Table 1 shows that the values of the estimated coefficients for variables TLTA, NITA, CLCA, and SIZE in panel A, and those for variables TLTA, FUTL, and SIZE in panel B do not agree with their expected signs. This result indicates that the linear logit of the hazard function of DHM for each of the two datasets might not be suitable. The slope of each curve in figure 1 agrees with
Table 1. Summary statistics of the panel dataset and the estimated coefficients of DHM using Ohlson's accounting variables with and without market-driven variables.
Variable  Mean  Median  Standard deviation  Minimum  Maximum  Estimated coefficient of DHM (p-value)
Panel A: Without market-driven variables
78 bankrupt companies, 2275 healthy companies, and 14,066 firm years
Intercept 5.397 (0.001)
WCTA 0.299 0.299 1.733 202 0.995 1.272 (0.002)
TLTA 0.471 0.431 1.736 0.001 203 0.681 (0.091)
NITA 0.182 0.036 14.501 1719 1.421 0.031 (0.779)
CLCA 0.650 0.444 3.326 0.002 215.667 0.001 (0.908)
FUTL 0.136 0.116 4.285 38.061 464.448 0.031 (0.585)
CHIN 0.072 0.097 0.644 1 1 0.931 (0.001)
SIZE 0.170 0.238 1.224 7.462 4.742 0.055 (0.586)
INTWO 0.254 0 0.435 0 1 0.774 (0.006)
OENEG 0.028 0 0.164 0 1 1.942 (0.001)
LNAGE 1.283 1.386 0.776 0 2.708 0.171 (0.287)
Panel B: With market-driven variables
77 bankrupt companies, 2192 healthy companies, and 13,400 firm years
Intercept 12.557 (0.001)
WCTA 0.304 0.302 1.771 202 0.987 1.164 (0.007)
TLTA 0.464 0.430 1.774 0.001 203 0.554 (0.178)
NITA 0.168 0.038 14.854 1719 1.421 0.134 (0.226)
CLCA 0.607 0.441 2.707 0.002 203 0.018 (0.077)
FUTL 0.083 0.124 4.327 38.061 464.448 0.001 (0.975)
CHIN 0.077 0.101 0.643 1 1 0.823 (0.001)
SIZE 0.114 0.188 1.205 7.462 4.742 0.443 (0.001)
INTWO 0.239 0 0.426 0 1 0.764 (0.006)
OENEG 0.022 0 0.146 0 1 2.047 (0.001)
RSIZE 4.793 4.803 0.755 8.584 1.451 1.428 (0.001)
EXRET 0.900 0.358 16.761 10.893 867.761 0.117 (0.211)
LNAGE 1.275 1.386 0.775 0 2.708 0.051 (0.820)
the expected direction of the corresponding variable effect, except variable SIZE in panel (n). Panels (e) and (g) of figure 1 indicate that the order-one polynomials for the two variables FUTL and SIZE in DHM should be replaced by order-two polynomials if only Ohlson's accounting variables were considered in the study. Panels (l) and (n) of figure 1 indicate that the order-two polynomials should be applied on the variables FUTL and SIZE in the DHM if the market-driven variables are also included in the analysis. Figure 2 shows the out-of-sample error rates of the three prediction rules based on DHM, DHM$^{\#}$, and DSHM for the two given datasets. We first see that the prediction models based on using parametric hazard functions are in general conservative in the sense of having smaller type I error rates than the expected upper bound $u$. In contrast, the type I error rates of the DSHM are close to the designed upper bounds in the cases of $u \le 0.20$. On the other hand, the type II and the total error rates of the DSHM are much smaller than those of the parametric models when $u \le 0.20$. One can see that, in the case of solely using Ohlson's accounting variables for analysis, the largest percentage decrease of the total error rate of the DSHM over the DHM is 55%. In the case of including market-driven variables in the analysis, the largest percentage decrease becomes 63%. We also point out that, in the two cases considered in figure 2, the improvement of DHM with the data-based parametric hazard function $h^{\#}(t, x, z)$ over that with the linear logistic hazard function $h(t, x, z)$ is limited when $u \le 0.20$. This result suggests that the DHM with the data-based hazard function may not always improve the performance of DHM with a simple linear logistic hazard function.
3.4. Results based on using Shumway's accounting variables with and without market-driven variables included
We next report the results of the prediction models DHM, DHM$^{\#}$, and DSHM using Shumway's accounting variables with and without market-driven variables included.
[Figure 1: sixteen panels, each plotting $\hat{m}(x)$ (vertical axis) against one continuous predictor (horizontal axis). Panels (a)–(g): WCTA, TLTA, NITA, CLCA, FUTL, CHIN, SIZE (Ohlson variables). Panels (h)–(n): WCTA, TLTA, NITA, CLCA, FUTL, CHIN, SIZE (Ohlson variables). Panels (o) and (p): RSIZE, EXRET (market variables).]
Figure 1. Plots of marginal relations between the logit-transformed hazard function and predictors. Panels (a)–(g) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM solely using Ohlson's accounting variables. Panels (h)–(p) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM using Ohlson's accounting variables and market-driven variables. The value of $x$ in the plot of $\{x_j, \hat{m}(x)\}$ has the $j$th component as $x_j$, but all other components fixed at their sample median level.
Table 2 reports the summary statistics and the estimated coefficients of DHM. It shows that the values of the estimated coefficients of Shumway's accounting and the two market-driven variables all agree with their expected signs in the study. Figure 3 presents a plot of $\{x_j, \hat{m}(x)\}$ for each continuous predictor, and shows that the slope of the curve in each panel agrees with the expected direction of the variable effect. Panels (a) and (b) of figure 3 show that the order-one polynomials for the two accounting variables in DHM are proper in the study without market-driven variables. However, simply for comparison, we naively used order-two polynomials for variables TLTA and NITA to define DHM$^{\#}$. On the other hand, from panels (c)–(f) of figure 3, we see that the order-one polynomial for the variable RSIZE in DHM should be replaced by an order-three polynomial. Figure 4 shows the out-of-sample error rates of the prediction rules based on DHM, DHM$^{\#}$, and DSHM. Inspecting the results given in figure 4, we see that, in the range $u \le 0.20$, the type I error rates of DHM and DSHM are basically very similar. However, the total error rate of the DSHM is in general smaller than that of the DHM, for all $u \in [0, 1]$. The largest percentage decrease of the total error rate by the DSHM over the DHM is 17% when the market-driven variables are not included in the analysis, and 21% when the market-driven variables are included. The figure also shows that the improvement of DHM$^{\#}$ over DHM is minimal in the case without including the market-driven variables. This result is reasonable, since the corresponding order-one polynomials for the accounting variables are proper for modeling the hazard function. However, the figure shows that the improvement of DHM$^{\#}$ over DHM is significant when the market-driven variables are included in the model.
4. Concluding remarks
In this paper, a bankruptcy prediction method based on
DSHM is proposed. This is an extension of the DHM
[Figure 2: six panels (a)–(f), each plotting an out-of-sample error rate (vertical axis, 0 to 1) against $u$ (horizontal axis, 0 to 1) for DHM, DHM$^{\#}$, and DSHM.]
Figure 2. The performance of the three prediction rules based on DHM (dashed curve), DHM$^{\#}$ (dotted curve), and DSHM (solid curve) using Ohlson's accounting variables with and without market-driven variables. Panels (a), (c), and (e) show, respectively, the out-of-sample type I, type II, and total error rates of the prediction methods solely using Ohlson's accounting variables. Panels (b), (d), and (f) show the three out-of-sample error rates using Ohlson's accounting variables and market-driven variables. In each panel, the data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM$^{\#}$ was identical to the linear logistic hazard function $h(t, x, z)$ of DHM except that the order-one polynomials for the two variables FUTL and SIZE were replaced by order-two polynomials.
proposed by Shumway (2001) and Chava and Jarrow
(2004). The DHM assumes that the logit transformation
of the hazard function is a linear function of the
predictors. In contrast, the DSHM only assumes that
the transformed function is a smooth function of the
continuous predictors. This gives the prediction model
more freedom in modeling the underlying hazard function.
We point out that the estimates in the DSHM are
derived using the local likelihood method. It can be
shown that, under very general conditions, the computed
instant bankruptcy probability using DSHM consistently
estimates the true instant bankruptcy probability. Thus
the DSHM is a reliable prediction rule.
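To make the idea of local likelihood concrete, the following is a minimal sketch of a kernel-weighted (local linear) logistic fit at a single point, in the spirit of Tibshirani and Hastie (1987) and Fan et al. (1995). It is a simplified one-predictor version: the log(t) and z terms of the DSHM are omitted, and the Gaussian kernel, the bandwidth, the function name `local_logistic_fit`, and the simulated data are all assumptions for illustration rather than the authors' implementation.

```python
import numpy as np

def local_logistic_fit(x, y, x0, bandwidth, n_iter=25):
    """Local linear logistic fit at x0; returns the estimated logit at x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian kernel weights centred at x0 (the kernel choice is an assumption).
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
    # Local design: intercept plus slope in the centred predictor.
    Z = np.column_stack([np.ones_like(x), x - x0])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ beta)))
        # Gradient and negative Hessian of the kernel-weighted log-likelihood.
        grad = Z.T @ (w * (y - p))
        hess = (Z * (w * p * (1.0 - p))[:, None]).T @ Z
        # A small ridge term keeps the Newton step well defined.
        step = np.linalg.solve(hess + 1e-8 * np.eye(2), grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    # The local intercept is the estimate of the logit at x0.
    return beta[0]

# Simulated binary outcomes whose true logit is the smooth function -1 + x^2.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 5000)
true_logit = -1.0 + x ** 2
y = rng.random(5000) < 1.0 / (1.0 + np.exp(-true_logit))

# Estimate the logit at x0 = 1, where the true value is 0.
est = local_logistic_fit(x, y, x0=1.0, bandwidth=0.3)
```

Repeating the fit over a grid of x0 values traces out an estimate of the smooth function without assuming a linear form, which is the extra freedom the semiparametric model provides.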
One additional advantage of using DSHM is that by
plotting $\{x_j, \hat{m}(x)\}$ for each continuous predictor, one can
visually check the adequacy of the parametric DHM. If
the parametric model is not proper, the results from
DSHM can also guide us on how to make a better
selection of parametric model. Sometimes, using para-
metric modeling is important, particularly when one has
too many predictor variables to be considered simulta-
neously and does not have enough sample data to
estimate them non-parametrically.
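The diagnostic plot described above can be sketched as follows: the jth predictor is varied over a grid while all other predictors are held at their sample medians, and the fitted smooth function is evaluated along that path. The function `marginal_curve` and the stand-in `toy_m_hat` are hypothetical names introduced for this example; in practice `m_hat` would be the function estimated by DSHM.

```python
import numpy as np

def marginal_curve(m_hat, X, j, n_grid=50):
    """Evaluate m_hat along predictor j, holding the others at their sample medians."""
    X = np.asarray(X, dtype=float)
    grid = np.linspace(X[:, j].min(), X[:, j].max(), n_grid)
    # Start every evaluation point at the vector of sample medians.
    points = np.tile(np.median(X, axis=0), (n_grid, 1))
    # Overwrite only the jth component with the grid values.
    points[:, j] = grid
    values = np.array([m_hat(p) for p in points])
    return grid, values

# Stand-in for a fitted smooth function (hypothetical, for illustration only).
def toy_m_hat(x):
    return -3.0 + 2.0 * x[0] - x[1] ** 2

# Toy sample of two continuous predictors; the values are made up.
X = np.array([[0.1, 0.0],
              [0.5, 0.2],
              [0.9, 0.4]])
grid, values = marginal_curve(toy_m_hat, X, j=0, n_grid=3)
```

Plotting `values` against `grid` gives one panel of the diagnostic; a curve that is close to a straight line suggests that an order-one term for that predictor is adequate, while visible curvature suggests a higher-order polynomial.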
We have considered four studies to investigate the finite
sample performance of the DSHM. The four studies were
based on the accounting and market-driven variables
proposed by Ohlson (1980) and Shumway (2001). The
results of the four studies demonstrate that the DSHM
improves the performance of DHM in the prediction of
bankruptcy. The DSHM generally has smaller out-of-
sample total error rates in all studies. Such an advantage
of the DSHM over the DHM in the case of
solely using accounting variables is more significant
than that in the case of employing both accounting and
Table 2. Summary statistics of the panel dataset and the estimated coefficients of DHM using Shumway's accounting variables with
and without market-driven variables.

Variable    Mean     Median   Standard deviation   Minimum   Maximum    Estimated coefficient of DHM (p-value)

Panel A: Without market-driven variables
92 bankrupt companies, 2368 healthy companies, and 14,846 firm years
Intercept                                                               5.712 (0.001)
TLTA        0.478    0.441    1.692                0.001     203        0.959 (0.001)
NITA        0.174    0.035    14.115               1719      1.421      0.096 (0.267)
LNAGE       1.290    1.386    0.777                0         2.708      0.055 (0.696)

Panel B: With market-driven variables
91 bankrupt companies, 2281 healthy companies, and 14,140 firm years
Intercept                                                               12.197 (0.001)
TLTA        0.471    0.440    1.728                0.001     203        1.342 (0.001)
NITA        0.161    0.037    14.460               1719      1.421      0.080 (0.448)
RSIZE       4.797    4.808    0.753                8.584     1.451      1.258 (0.001)
EXRET       0.818    0.368    16.328               10.893    867.761    0.186 (0.026)
LNAGE       1.282    1.386    0.777                0         2.708      0.258 (0.180)
Figure 3. Plots of marginal relations between the logit-transformed hazard function and predictors; each panel plots $\hat{m}(x)$ against one predictor. Panels (a) TLTA and (b) NITA show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM solely using Shumway's accounting variables. Panels (c) TLTA, (d) NITA, (e) RSIZE, and (f) EXRET show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM using Shumway's accounting variables and market-driven variables. The value of x in the plot of $\{x_j, \hat{m}(x)\}$ has the jth component as $x_j$, with all other components fixed at their sample median level.
market-driven variables. This result is particularly useful
when applying DSHM to those companies not listed in
stock exchanges. Shumway (2001) pointed out that, in
general, the prediction performance of DHM using both
accounting and market-driven variables is better than that
solely employing accounting variables. Our empirical
results confirm that this advantage of market-driven
variables also holds for DSHM.
Note that, in our development of DSHM, we have used
a logistic hazard function h(t, x, z) as a basis. We remark
that Allison (1982) considered a discrete-time propor-
tional hazard function defined by
$$j(t, x, z) = 1 - \exp\left[-\exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \delta_1 z\}\right]$$
in the analysis. The discrete-time proportional hazard
function was derived from the well-known proportional
hazard function of Cox (1972). Using the same rationale
as given in section 2, we can also modify the discrete-time
proportional hazard function as the discrete-time semi-
parametric proportional hazard function
$$j^{+}(t, x, z) = 1 - \exp\left[-\exp\{\beta \log(t) + m(x) + \delta z\}\right]$$
for bankruptcy prediction. Our unreported empirical
results from the four panel datasets studied in this paper
show that the performance of DHM using j(t, x, z) is
similar to that employing h(t, x, z). The same remark also
applies to DSHM when replacing $h^{+}(t, x, z)$ by $j^{+}(t, x, z)$.
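The two link functions discussed above can be compared numerically with a short sketch. Here `eta` stands for the linear (or semiparametric) index inside the braces, and the grid of index values is an arbitrary choice for illustration. Feeding the same index to both functions is a simplification, since the fitted coefficients would generally differ between the two models.

```python
import numpy as np

def logistic_hazard(eta):
    """Logistic hazard: exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

def cloglog_hazard(eta):
    """Discrete-time proportional hazard: 1 - exp(-exp(eta))."""
    return 1.0 - np.exp(-np.exp(eta))

# Arbitrary index values; strongly negative values correspond to rare events.
eta = np.array([-6.0, -4.0, -2.0, 0.0])
h = logistic_hazard(eta)
j = cloglog_hazard(eta)

# Largest gap between the two hazards over the rare-event part of the grid.
max_gap_rare = np.max(np.abs(h[:2] - j[:2]))
```

Both functions map the index into (0, 1), and both behave like exp(eta) when the index is strongly negative, so they nearly coincide when the hazard is small. This may be one reason for the similar empirical performance reported above, although the paper does not state this explanation.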
More investigation of DSHM is necessary. Firstly, in
applications, it is not clear how long a sampling period
should be used so that a powerful prediction model can be
developed. This is important, since if it is long, then there
will be many missing data. Secondly, in some practical
Figure 4. The performance of the three prediction rules based on DHM (dashed curve), DHM# (dotted curve), and DSHM (solid curve) using Shumway's accounting variables with and without market-driven variables. Each panel plots an out-of-sample error rate against u on [0, 1]. Panels (a), (c), and (e) show, respectively, the out-of-sample type I, type II, and total error rates of the prediction methods solely using Shumway's accounting variables. The data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM# in each of panels (a), (c), and (e) used order-two polynomials for variables TLTA and NITA (this model is over-parameterized, since DHM is approximately correct, but included here simply for comparison). Panels (b), (d), and (f) show the three out-of-sample error rates using Shumway's accounting variables and market-driven variables. The data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM# in each of panels (b), (d), and (f) was identical to the linear logistic hazard function $h(t, x, z)$ in DHM except that the order-one polynomial for the variable RSIZE was replaced by an order-three polynomial.
applications, such as credit rating, we are interested
in predicting the rating of a particular company. Thus it
is useful to study how to extend the prediction methods
to such a situation. Thirdly, in this paper, the perfor-
mance of DSHM was only studied using firm-specific
variables including accounting and market-driven vari-
ables. Other important firm-specific variables, such as
the KMV-Merton default probability, and industry
effects and macroeconomic variables have been consid-
ered by Chava and Jarrow (2004), Hillegeist et al. (2004),
Duffie et al. (2007), Bharath and Shumway (2008),
and Chava et al. (2008). It would be of interest to study
the effects of these variables on our semiparametric
approach in the future. Further, to account for the
heterogeneity, a latent variable method can also be
considered; see Duffie et al. (2009) and Chava et al.
(2008). Finally, we remark that the DSHM depends on
the logistic hazard function. The robustness of the use of
this particular hazard function is still not clear. If it is
not robust, then the local quasi-likelihood approach (Fan
et al. 1995) or the local semilikelihood approach
(Claeskens and Aerts 2000, Claeskens and Keilegom
2003) can be considered.
Acknowledgements
The authors thank the two referees for their kind
suggestions, which greatly improved the presentation of
this paper. This research was supported by the National
Science Council, Taiwan, Republic of China.
References
Allison, P.D., Discrete-time methods for the analysis of event
histories. Sociol. Methodol., 1982, 13, 61–98.
Allison, P.D., Missing Data, 2001 (SAGE Publications:
London).
Altman, E.I., Financial ratios, discriminant analysis, and the
prediction of corporate bankruptcy. J. Finan., 1968, 23,
589–609.
Atiya, A.F., Bankruptcy prediction for credit risk using neural
networks: a survey and new results. IEEE Trans. Neural
Ntwks, 2001, 12, 929–935.
Begley, J., Ming, J. and Watts, S., Bankruptcy classification
errors in the 1980s: an empirical analysis of Altman's and
Ohlson's models. Rev. Account. Stud., 1996, 1, 267–284.
Bharath, S.T. and Shumway, T., Forecasting default with the
Merton distance to default model. Rev. Finan. Stud., 2008, 21,
1339–1369.
Chava, S. and Jarrow, R.A., Bankruptcy prediction with
industry effects. Rev. Finan., 2004, 8, 537–569.
Chava, S., Stefanescu, C. and Turnbull, S., Modeling the loss
distribution, 2008. Available online at: http://faculty.london.edu/cstefanescu/research.html
(accessed 21 April 2008).
Claeskens, G. and Aerts, M., Bootstrapping local polynomial
estimators in likelihood-based models. J. Statist. Plann. Infer.,
2000, 86, 63–80.
Claeskens, G. and Keilegom, I.V., Bootstrap confidence bands
for regression curves and their derivatives. Ann. Statist., 2003,
31, 1852–1884.
Cox, D.R., Regression models and life-tables (with discussion).
J. R. Statist. Soc., Ser. B, 1972, 34, 187–220.
Cox, D.R. and Oakes, D., Analysis of Survival Data, 1984
(Chapman and Hall: London).
Duffie, D., Credit risk modeling with affine process. J. Bank.
Finan., 2005, 29, 2751–2802.
Duffie, D., Eckner, A., Horel, G. and Saita, L., Frailty
correlated default, 2009. Available online at: http://www.afajof.org/afa/forthcoming/5279.pdf
(accessed 12 January 2009).
Duffie, D., Saita, L. and Wang, K., Multi-period corporate
default prediction with stochastic covariates. J. Finan. Econ.,
2007, 83, 635–665.
Eubank, R.L., Spline Smoothing and Nonparametric Regression,
1988 (Marcel Dekker: New York).
Fan, J. and Gijbels, I., Local Polynomial Modeling and its
Application–Theory and Methodologies, 1996 (Chapman and
Hall: New York).
Fan, J., Heckman, N.E. and Wand, M.P., Local
polynomial kernel regression for generalized linear models
and quasi-likelihood functions. J. Am. Statist. Assoc., 1995,
90, 141–150.
Härdle, W., Applied Nonparametric Regression, 1990
(Cambridge University Press: Cambridge).
Härdle, W., Smoothing Techniques: with Implementation in S,
1991 (Springer: Berlin).
Härdle, W., Moro, R.A. and Schäfer, D., Graphical data
representation in bankruptcy analysis. In Handbook of Data
Visualization, edited by C.H. Chen, W. Härdle and A. Unwin,
pp. 853–872, 2008 (Springer: Berlin).
Hillegeist, S.A., Keating, E.K., Cram, D.P. and
Lundstedt, K.G., Assessing the probability of bankruptcy.
Rev. Account. Stud., 2004, 9, 5–34.
Hwang, R.C., Cheng, K.F. and Lee, J.C., A semiparametric
method for predicting bankruptcy. J. Forecast., 2007, 26,
317–342.
Little, R.J.A. and Rubin, D.B., Statistical Analysis with Missing
Data, 2002 (Wiley: New York).
Marron, J.S. and Wand, M.P., Exact mean integrated square
error. Ann. Statist., 1992, 20, 712–736.
Merton, R.C., On the pricing of corporate debt: the risk
structure of interest rates. J. Finan., 1974, 29, 449–470.
Müller, H.G., Nonparametric Regression Analysis of
Longitudinal Data, 1988 (Springer: Berlin).
Ohlson, J., Financial ratios and the probabilistic prediction of
bankruptcy. J. Account. Res., 1980, 18, 109–131.
Scott, D.W., Multivariate Density Estimation: Theory, Practice,
and Visualization, 1992 (Wiley: New York).
Shumway, T., Forecasting bankruptcy more accurately: a simple
hazard model. J. Bus., 2001, 74, 101–124.
Simonoff, J.S., Smoothing Methods in Statistics, 1996 (Springer:
New York).
Staniswallis, J.G., The kernel estimate of a regression function
in likelihood-based models. J. Am. Statist. Assoc., 1989, 84,
276–283.
Sun, L. and Shenoy, P.P., Using Bayesian networks for
bankruptcy prediction: some methodological issues. Eur. J.
Oper. Res., 2007, 180, 738–753.
Tibshirani, R. and Hastie, T., Local likelihood estimation.
J. Am. Statist. Assoc., 1987, 82, 559–568.
Vassalou, M. and Xing, Y., Default risk in equity returns.
J. Finan., 2004, 59, 831–868.
Wand, M.P. and Jones, M.C., Kernel Smoothing, 1995
(Chapman and Hall: London).
Zmijewski, M.E., Methodological issues related to the estimation
of financial distress prediction models. J. Account. Res.,
1984, 22, 59–82.