Model Hazard

This article proposes a discrete-time semiparametric hazard model (DSHM) for bankruptcy prediction using panel data from firms. The DSHM is an extension of the discrete-time hazard model (DHM) that allows for a more flexible choice of hazard function without assuming a parametric form. The article illustrates the DSHM using four real panel datasets, comparing its out-of-sample error rates to the DHM and a modified DHM. The DSHM is shown to have better predictive performance in all cases, demonstrating its potential as a powerful bankruptcy prediction model.

This article was downloaded by: [Central U Library of Bucharest]

On: 23 January 2013, At: 22:02


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
House, 37-41 Mortimer Street, London W1T 3JH, UK
Quantitative Finance
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/rquf20
Predicting bankruptcy using the discrete-time semiparametric hazard model
K. F. Cheng (a), C. K. Chu (b) & Ruey-Ching Hwang (c)
(a) Biostatistics Center, China Medical University, Taichung, Taiwan, and Institute of Statistics, National Central University, Jhongli, Taiwan
(b) Department of Applied Mathematics, National Dong Hwa University, Hualien, Taiwan
(c) Department of Finance, National Dong Hwa University, Hualien, Taiwan
Version of record first published: 23 Jul 2009.
To cite this article: K. F. Cheng, C. K. Chu & Ruey-Ching Hwang (2010): Predicting bankruptcy using the discrete-time semiparametric hazard model, Quantitative Finance, 10:9, 1055-1066
To link to this article: http://dx.doi.org/10.1080/14697680902814274
Quantitative Finance, Vol. 10, No. 9, November 2010, 1055-1066
Predicting bankruptcy using the discrete-time semiparametric hazard model
K. F. CHENG†, C. K. CHU‡ and RUEY-CHING HWANG*§
†Biostatistics Center, China Medical University, Taichung, Taiwan, and Institute of Statistics, National Central University, Jhongli, Taiwan
‡Department of Applied Mathematics, National Dong Hwa University, Hualien, Taiwan
§Department of Finance, National Dong Hwa University, Hualien, Taiwan
*Corresponding author.
(Received 25 October 2007; in final form 6 February 2009)
The usual bankruptcy prediction models are based on single-period data from firms. These
models ignore the fact that the characteristics of firms change through time, and thus they may
suffer from a loss of predictive power. In recent years, a discrete-time parametric hazard
model has been proposed for bankruptcy prediction using panel data from firms. This model
has been demonstrated by many examples to be more powerful than the traditional models.
In this paper, we propose an extension of this approach allowing for a more flexible choice of
hazard function. The new method does not require the assumption of a parametric model for
the hazard function. In addition, it also provides a tool for checking the adequacy of the
parametric model, if necessary. We use real panel datasets to illustrate the proposed method.
The empirical results confirm that the new model compares favorably with the well-known
discrete-time parametric hazard model.
Keywords: Discrete-time hazard model; Local likelihood; Out-of-sample error rate; Panel
data; Semiparametric model
1. Introduction
Bankruptcy prediction has been routinely applied by
academics, practitioners, and regulators. The well-known
prediction models include the discriminant analysis model
(Altman 1968), the KMV-Merton model (Merton 1974,
Vassalou and Xing 2004), the linear logit model (Ohlson
1980), and the probit model (Zmijewski 1984), to name
only a few. The common principle of these approaches
is that the models are developed using only single-period
data from the studied firms. Shumway (2001) criticized
such prediction processes as static in nature, since
they ignore the changing characteristics of firms through
time. In order to avoid the possible loss of predictive
power due to the use of static models, Shumway (2001)
and Chava and Jarrow (2004) suggested that a discrete-
time hazard model (DHM) could be used for bankruptcy
prediction. Their analyses are based on applying the
idea of survival analysis (Cox and Oakes 1984). This
novel model has the advantage of using all available
information of firms to build up a prediction system
so that each firm's bankruptcy risk at each time point can
be determined. Thus the model is a dynamic forecasting
model. Other bankruptcy forecasting models based
on multiple-period data include, for example, Hillegeist
et al. (2004), Bharath and Shumway (2008), and Chava
et al. (2008) using the same idea of survival analysis, and
Duffie (2005) and Duffie et al. (2007) making use
of different ideas in point processes. Approaches based
on neural networks (Atiya 2001), support vector machines
(Härdle et al. 2008), and Bayesian networks (Sun and
Shenoy 2007), etc., have also been introduced in the
literature for bankruptcy prediction.
According to Shumway (2001) and Chava and Jarrow
(2004), the important parameters in DHM are determined
by maximizing a log-likelihood function. However, their
approach is not flexible enough for modeling the hazard
function. One major assumption needed in their DHM
is that the hazard function has to be a parametric function
such as a simple linear logistic function. Unfortunately,
the parametric model assumption is not always true in
all applications. Härdle et al. (2008) also pointed out that
many well-known parametric models for bankruptcy
prediction are not proper. In general, it is difficult
to ensure that conclusions based on the parametric
model are meaningful unless we have a large dataset
and powerful lack-of-fit tests to confirm that
the parametric model is most appropriate in bankruptcy
prediction. The latter often seems to be impossible,
particularly when one does not have a sufficient number
of sample observations for analysis. To avoid this
potential pitfall, we show in this paper that the idea of
the semiparametric logit model (Hwang et al. 2007) can
be directly extended to the DHM using panel data.
Specifically, we shall propose a discrete-time semipara-
metric hazard model (DSHM) for bankruptcy prediction.
This model is built on the work of DHM but need not
assume any parametric form for the hazard function.
If necessary, the result of the proposed modeling strategy
can also guide us as to how to determine the most
appropriate parametric form for the hazard function.
In the literature, two types of hazard function have been
considered. The first type of hazard function, after taking
logit transformation, is a linear function of predictors; see
Shumway (2001) and Chava and Jarrow (2004). The
second type uses the discrete-time proportional hazard
function; see Allison (1982). Our semiparametric approach
can be applied to these two types of hazard function.
However, our development of DSHM is mainly based on
the logistic hazard function, since this function is often
used for predicting bankruptcy.
The remainder of this paper is organized as follows.
In section 2, we first outline the basic idea of DHM. We
then point out that the rationale of DSHM is similar to
that of the semiparametric logit model of Hwang et al.
(2007). The method is developed under the concept of
local likelihood, and it turns out that the important
estimators needed in DSHM can be derived from solving
a simple system of weighted normal equations. Thus,
the required computation is as simple as that in DHM.
In section 3, we illustrate our method using four panel
datasets based on the predictors suggested by Ohlson
(1980) and Shumway (2001). Each panel dataset was
analysed using DSHM, DHM, and modified DHM. The
modified DHM is an improved parametric version
of DHM using results developed from DSHM.
The predictive power of each method was measured
by out-of-sample error rates. Based on the error rates
summarized in section 3, we conclude that DSHM has
better performance in all cases. Sometimes, the improve-
ment of both DSHM and modified DHM over DHM is
very significant, depending on the predictors selected in
the study. This shows that DSHM has potential as
a powerful bankruptcy prediction model. Finally, con-
cluding remarks appear in section 4.
2. Methodology
In this section, we describe the basic idea of DSHM,
develop estimating equations for unknown quantities and
introduce a tool for visually checking the adequacy of the
linear logistic hazard function in DHM. Before doing this,
we first briefly review the basic steps for deriving DHM.
2.1. DHM
The DHM can be formally defined from the log-
likelihood function of the panel data. The model has the
advantage of using all available information to predict
each firm's bankruptcy risk at each point in time. In the
following, we describe the structure of the panel data used
in the prediction model.
The panel data are determined by two factors. They are
the sampling period and sampling criteria. In this paper,
the panel datasets analysed in section 3 were sampled
from January 1984 to December 2000, and all firms
starting their listing on the New York Stock Exchange,
American Stock Exchange, or NASDAQ during the
sampling period were recruited in the sample. All
information at the discrete time points during the
sampling period was collected from both
COMPUSTAT and CRSP databases. Assume that there
are n selected companies under the particular sampling
scheme. We denote the panel data by
$$\{(Y_{i,j}, x_{i,j}, z_{i,j}),\ j = 1, \ldots, t_i,\ i = 1, \ldots, n\}.$$
Here, for the ith firm in the dataset, we denote
$t_i \in \{1, \ldots, T\}$ to be the length of the firm's duration
during the sampling period, and $T$ is a positive integer
indicating the total length of the sampling period. At
the last observation time $t_i$, $Y_{i,t_i} = 1$ indicates that the ith
company is bankrupt, and $Y_{i,t_i} = 0$ otherwise. At the
observation time $j < t_i$, we always have $Y_{i,j} = 0$. Finally, we
let $x_{i,j}$ and $z_{i,j}$ be values of the $d \times 1$ continuous and $q \times 1$
discrete explanatory variables X and Z collected at time j,
respectively.
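The panel layout can be made concrete with a small Python sketch that expands one record per firm into firm-period rows. The function name `expand_panel`, the dictionary keys, and the toy values are illustrative assumptions, not part of the original study.

```python
def expand_panel(firms):
    """Expand per-firm records into firm-period rows (i, j, Y_ij, x_ij, z_ij).

    Each firm is a dict with hypothetical keys:
      'bankrupt' : True if the firm is bankrupt at its last observation time
      'x', 'z'   : lists of covariate values, one entry per period j = 1..t_i
    """
    rows = []
    for i, firm in enumerate(firms, start=1):
        t_i = len(firm["x"])  # duration of firm i in the sampling period
        for j in range(1, t_i + 1):
            # Y_ij is 0 before the last period; at j = t_i it flags bankruptcy.
            y = 1 if (j == t_i and firm["bankrupt"]) else 0
            rows.append((i, j, y, firm["x"][j - 1], firm["z"][j - 1]))
    return rows


# Two made-up firms: the first goes bankrupt in period 3, the second survives.
firms = [
    {"bankrupt": True, "x": [0.2, 0.5, 0.9], "z": [0, 0, 1]},
    {"bankrupt": False, "x": [0.1, 0.1], "z": [0, 0]},
]
panel = expand_panel(firms)
```

Only the last row of a bankrupt firm carries an indicator of one, which is the property the likelihood relies on.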
The log-likelihood function of the panel data has been
given in (21) of Allison (1982). It is expressed as
$$\ell_{\mathrm{DHM}} = \sum_{i=1}^{n} Y_{i,t_i} \log\left\{\frac{h(t_i, x_{i,t_i}, z_{i,t_i})}{1 - h(t_i, x_{i,t_i}, z_{i,t_i})}\right\} + \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log\{1 - h(j, x_{i,j}, z_{i,j})\}.$$
Here, $h(j, x_{i,j}, z_{i,j})$ is the value of the hazard function
indicating the probability of bankruptcy instantly occurring
at time j for the ith company which is non-bankrupt
before time j, for each $j = 1, \ldots, t_i$ and $i = 1, \ldots, n$.
Note that the hazard function $h(t, x, z)$ in $\ell_{\mathrm{DHM}}$ can be
of any functional form with values in the interval (0, 1).
Shumway (2001) considered a linear logistic function for
the hazard function:
$$h(t, x, z) = \frac{\exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \delta_1 z\}}{1 + \exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \delta_1 z\}},$$
where $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ are $1 \times 1$, $1 \times 1$, $1 \times d$, and $1 \times q$
vectors of parameters, respectively. Given the linear
logistic hazard function, the resulting log-likelihood of the
panel data becomes
$$\ell = \sum_{i=1}^{n} Y_{i,t_i}\{\alpha_1 + \beta_1 \log(t_i) + \gamma_1 x_{i,t_i} + \delta_1 z_{i,t_i}\} - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log[1 + \exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}].$$
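As a check on this simplification, the short Python sketch below evaluates the log-likelihood both in its general form and in its linear logistic form and confirms that the two agree. It assumes a single continuous and a single discrete predictor; the function names, parameter values, and toy data are made up for illustration.

```python
import math


def hazard(t, x, z, a, b, g, d):
    """Linear logistic hazard with scalar x and z (one predictor of each type)."""
    eta = a + b * math.log(t) + g * x + d * z
    return math.exp(eta) / (1.0 + math.exp(eta))


def loglik_general(firms, params):
    """General form: Y * log{h / (1 - h)} at the last period plus the sum of log(1 - h)."""
    total = 0.0
    for y_last, xs, zs in firms:
        t_i = len(xs)
        h_last = hazard(t_i, xs[-1], zs[-1], *params)
        total += y_last * math.log(h_last / (1.0 - h_last))
        for j in range(1, t_i + 1):
            total += math.log(1.0 - hazard(j, xs[j - 1], zs[j - 1], *params))
    return total


def loglik_logistic(firms, params):
    """Simplified form obtained by substituting the linear logistic hazard."""
    a, b, g, d = params
    total = 0.0
    for y_last, xs, zs in firms:
        t_i = len(xs)
        total += y_last * (a + b * math.log(t_i) + g * xs[-1] + d * zs[-1])
        for j in range(1, t_i + 1):
            eta = a + b * math.log(j) + g * xs[j - 1] + d * zs[j - 1]
            total -= math.log(1.0 + math.exp(eta))
    return total


# Toy data: (Y at last period, x per period, z per period); values are made up.
firms = [(1, [0.2, 0.5, 0.9], [0, 0, 1]), (0, [0.1, 0.1], [0, 0])]
params = (-2.0, 0.3, 1.5, 0.7)
gap = abs(loglik_general(firms, params) - loglik_logistic(firms, params))
```

The agreement follows because the logit of the logistic hazard is the linear predictor itself, and log(1 - h) equals minus log(1 + exp(linear predictor)).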
The maximum likelihood estimates of parameters $\alpha_1$, $\beta_1$,
$\gamma_1$, and $\delta_1$ can be simply obtained by solving the normal
equations:
$$0 = \sum_{i=1}^{n} Y_{i,t_i} \begin{bmatrix} 1 \\ \log(t_i) \\ x_{i,t_i} \\ z_{i,t_i} \end{bmatrix} - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \frac{\exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}}{1 + \exp\{\alpha_1 + \beta_1 \log(j) + \gamma_1 x_{i,j} + \delta_1 z_{i,j}\}} \begin{bmatrix} 1 \\ \log(j) \\ x_{i,j} \\ z_{i,j} \end{bmatrix}.$$
Based on the maximum likelihood estimates $\hat\alpha_1$, $\hat\beta_1$, $\hat\gamma_1$,
and $\hat\delta_1$, if a firm has predictor values $(x_0, z_0)$ at time $t_0$,
then its predicted instant bankruptcy probability can be
given by
$$\hat h(t_0, x_0, z_0) = \frac{\exp\{\hat\alpha_1 + \hat\beta_1 \log(t_0) + \hat\gamma_1 x_0 + \hat\delta_1 z_0\}}{1 + \exp\{\hat\alpha_1 + \hat\beta_1 \log(t_0) + \hat\gamma_1 x_0 + \hat\delta_1 z_0\}}.$$
Cox and Oakes (1984) showed that the maximum
likelihood estimates $\hat\alpha_1$, $\hat\beta_1$, $\hat\gamma_1$, and $\hat\delta_1$ are consistent for
$\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$, respectively. Thus, the resulting
predicted instant bankruptcy probability $\hat h(t_0, x_0, z_0)$
converges to the true instant bankruptcy probability
$h(t_0, x_0, z_0)$. This result shows that DHM should be an
efficient bankruptcy prediction model if the hazard
function is correctly specified.
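The normal equations of the linear logistic DHM can be solved numerically. The following is a minimal Python sketch using Newton-Raphson on firm-period rows, assuming NumPy is available; the function name `fit_dhm`, the simulated data, and the "true" coefficients are illustrative assumptions and do not reproduce the paper's computations.

```python
import numpy as np


def fit_dhm(t, x, z, y, n_iter=50, tol=1e-10):
    """Newton-Raphson for the linear logistic DHM on firm-period rows.

    t, y : 1-D arrays (period index j and indicator Y_ij), one entry per row
    x, z : 2-D arrays of continuous and discrete predictors, one row per entry
    Returns the stacked estimate (intercept, log-time, x, z coefficients).
    """
    design = np.column_stack([np.ones(len(t)), np.log(t), x, z])
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        h = 1.0 / (1.0 + np.exp(-design @ coef))
        # Since Y_ij = 0 before the last period, summing over all rows gives
        # the same score as the normal equations written per firm.
        score = design.T @ (y - h)
        if np.linalg.norm(score) < tol:
            break
        info = design.T @ (design * (h * (1.0 - h))[:, None])
        coef = coef + np.linalg.solve(info, score)
    return coef


# Simulated panel (made-up coefficients): a firm leaves the sample once bankrupt.
rng = np.random.default_rng(0)
true = np.array([-2.0, 0.5, 1.0, -0.5])
rows = []
for _ in range(300):
    for j in range(1, 9):
        xj, zj = rng.normal(), rng.integers(0, 2)
        eta = true @ np.array([1.0, np.log(j), xj, zj])
        yj = int(rng.random() < 1.0 / (1.0 + np.exp(-eta)))
        rows.append((j, xj, zj, yj))
        if yj == 1:
            break
data = np.array(rows, dtype=float)
t, x, z, y = data[:, 0], data[:, 1:2], data[:, 2:3], data[:, 3]
coef_hat = fit_dhm(t, x, z, y)
```

Plain Newton steps are used here for brevity; a production fit would normally add step-halving or rely on an established logistic regression routine.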
2.2. DSHM
The main advantage of DHM lies in its simplicity of
computation and interpretation, but the linear logistic
function for modeling the hazard function may not be
proper. If one chooses a parametric hazard function that
is not appropriate, then the resulting model-based instant
bankruptcy probability prediction might not correctly
estimate the true probability, and there is a danger of
coming to an erroneous prediction.
The limitation of DHM can be overcome by removing
the restriction that the hazard function belongs to a
particular parametric family. In this paper, we suggest
a DSHM, which is more flexible in modeling the hazard
function. The DSHM is constructed by replacing
the parametric hazard function in DHM with
a semiparametric hazard function. That is, we assume
the hazard function belongs to the family
$$h^*(t, x, z) = \frac{\exp\{\beta \log(t) + m(x) + \delta z\}}{1 + \exp\{\beta \log(t) + m(x) + \delta z\}}.$$
Here, $\beta$ and $\delta$ are unknown parameters, and $m(x)$ is an
unknown but smooth function of the value x of the
d-dimensional continuous predictor X. Following the
same development of $\ell$, the corresponding log-likelihood
function of the panel data based on our DSHM is
expressed by
$$\ell^* = \sum_{i=1}^{n} Y_{i,t_i}\{\beta \log(t_i) + m(x_{i,t_i}) + \delta z_{i,t_i}\} - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log[1 + \exp\{\beta \log(j) + m(x_{i,j}) + \delta z_{i,j}\}].$$
For a company with predictor values $(x_0, z_0)$ at time $t_0$, if
$\beta$, $m(x_0)$, and $\delta$ can be efficiently estimated by $\hat\beta$, $\hat m(x_0)$,
and $\hat\delta$, respectively, then the firm's instant bankruptcy
probability can be predicted by
$$\hat h^*(t_0, x_0, z_0) = \frac{\exp\{\hat\beta \log(t_0) + \hat m(x_0) + \hat\delta z_0\}}{1 + \exp\{\hat\beta \log(t_0) + \hat m(x_0) + \hat\delta z_0\}}.$$
In sections 2.3 and 2.4, we show how to estimate
parameters $\beta$, $m(x_0)$, and $\delta$ using a local likelihood
method. The advantage of this approach will be seen from
empirical studies given in section 3.
2.3. A local likelihood method
There exist many well-known methods for estimating $\beta$,
$m(x_0)$, and $\delta$, where $x_0$ is any given value of the
d-dimensional continuous predictor X. One of these
methods with a simple idea is the local likelihood
method; see, for example, Tibshirani and Hastie (1987),
Staniswalis (1989), Fan et al. (1995), and Hwang et al.
(2007). The basic rationale of the local likelihood method
is to center the data around $x_0$ and weight the likelihood
in such a way that it places more emphasis on those
observations nearest to $x_0$.
The idea of the local likelihood method can be simply
explained by first introducing a neighborhood
$S(x_0) = \{x = (x_1, \ldots, x_d)^T : \|x - x_0\| \le b\}$ of $x_0$. Here b is
some positive constant to be determined later by the
sampled data, and called the bandwidth. The notation
$\|x\|$ denotes the Euclidean norm of the given vector x.
If the value of b is small enough and $x_{i,j}$ belongs to $S(x_0)$,
then Taylor's first-order expansion states that
$$m(x_{i,j}) \approx m(x_0) + m^{(1)}(x_0)^T (x_{i,j} - x_0),$$
and such $m(x_{i,j})$ in the likelihood can be written as
$\theta_0 + \theta_1 (x_{i,j} - x_0)$, where we denote $m(x_0)$ by $\theta_0$ and
$m^{(1)}(x_0)^T$ by $\theta_1$. Note that $\theta_0$ is a scalar parameter and
$\theta_1$ is a $1 \times d$ vector of parameters.
To make an inference about $\alpha = (\theta_0, \beta, \theta_1, \delta)$, we
suggest modifying the likelihood function $\ell^*$ to the
following local (weighted) log-likelihood function:
$$\ell^*_0(\alpha; x_0) = \sum_{i=1}^{n} Y_{i,t_i}\{\theta_0 + \beta \log(t_i) + \theta_1 (x_{i,t_i} - x_0) + \delta z_{i,t_i}\} W(x_{i,t_i}) - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log[1 + \exp\{\theta_0 + \beta \log(j) + \theta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}] W(x_{i,j}).$$
Here W(x) is called the weight function and the simplest
weight $W(x_{i,j})$ assigned to the observation $(Y_{i,j}, x_{i,j}, z_{i,j})$
is the indicator value $I\{x_{i,j} \in S(x_0)\}$. However, a more
general weighting scheme can be used for defining the
local likelihood. This can be achieved, for example, by
introducing a symmetric and unimodal probability
density function $K_b(x)$, and defining $W(x_{i,j}) = K_b(x_{i,j} - x_0)$.
In the paper, we suggest that $K_b(x)$ be
taken as the joint probability density function of d
independent normal random variables $N(0, b^2)$. Given
such $K_b(x)$ we point out that if the value of $b^{-1}\|x_{i,j} - x_0\|$
becomes larger because of choosing a smaller value
of b, then the effect of the observation $(Y_{i,j}, x_{i,j}, z_{i,j})$ on
estimating the important parameters in DSHM will tend
to be smaller or even non-existent. This indicates that the
value of b can be used to control sample observations for
inclusion in the analysis. The results in the literature show
that the choice of bandwidth b plays an important role
in the analysis. Some discussions of the above weighting
method can be found in the monographs of Eubank
(1988), Müller (1988), Härdle (1990, 1991), Scott (1992),
Wand and Jones (1995), Fan and Gijbels (1996), and
Simonoff (1996), etc. In this paper, we select
$W(x_{i,j}) = K_b(x_{i,j} - x_0)$ in all analyses.
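The Gaussian product kernel weight can be sketched in a few lines of Python. The function names `gaussian_product_kernel` and `weight` are illustrative assumptions; the formula itself is the joint density of d independent normal variables with mean zero and standard deviation b, as described above.

```python
import math


def gaussian_product_kernel(u, b):
    """Joint density of d independent N(0, b^2) variables evaluated at vector u."""
    d = len(u)
    sq_norm = sum(c * c for c in u)
    return math.exp(-0.5 * sq_norm / b**2) / ((math.sqrt(2.0 * math.pi) * b) ** d)


def weight(x_ij, x0, b):
    """Weight of an observation with continuous predictors x_ij when fitting at x0."""
    return gaussian_product_kernel([a - c for a, c in zip(x_ij, x0)], b)


# An observation one unit away from x0 gets less weight under a smaller bandwidth.
w_wide = weight([1.0, 0.0], [0.0, 0.0], b=1.0)
w_narrow = weight([1.0, 0.0], [0.0, 0.0], b=0.5)
```

With these inputs the narrower bandwidth yields the smaller weight, which illustrates how b controls which observations effectively enter the local fit.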
Set $\tilde\alpha = (\tilde\theta_0, \tilde\beta, \tilde\theta_1, \tilde\delta)$ as the maximizer of $\ell^*_0(\alpha; x_0)$.
The maximum local likelihood estimate $\tilde\alpha$ can also be
equivalently obtained by solving a system of weighted
normal equations
$$0 = \sum_{i=1}^{n} Y_{i,t_i} \begin{bmatrix} 1 \\ \log(t_i) \\ x_{i,t_i} - x_0 \\ z_{i,t_i} \end{bmatrix} K_b(x_{i,t_i} - x_0) - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \frac{\exp\{\theta_0 + \beta \log(j) + \theta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}}{1 + \exp\{\theta_0 + \beta \log(j) + \theta_1 (x_{i,j} - x_0) + \delta z_{i,j}\}} \begin{bmatrix} 1 \\ \log(j) \\ x_{i,j} - x_0 \\ z_{i,j} \end{bmatrix} K_b(x_{i,j} - x_0).$$
We define $\tilde m(x_0) = \tilde\theta_0$ to indicate that it is an estimate
of $m(x_0)$. We also point out that $\beta$ and $\delta$ are global
parameters and their corresponding estimates produced
from $\tilde\alpha$ may not be efficient, since such estimates are
derived by maximizing a local log-likelihood depending
on $x_0$. In section 2.4, we show how more efficient
estimates of $\beta$, $m(x_0)$, and $\delta$ can be achieved.
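A minimal Python sketch of the local likelihood fit at a single point follows, assuming NumPy is available. It solves the kernel-weighted normal equations by Newton-Raphson; the function name `local_fit`, the simulated data, and the made-up curve m(x) = -2 + x^2 are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def local_fit(x0, t, x, z, y, b, n_iter=50, tol=1e-10):
    """Maximize the kernel-weighted log-likelihood around x0 by Newton-Raphson.

    Returns (local intercept, log-time coefficient, local slope, z coefficient);
    the local intercept is the initial estimate of m(x0).
    """
    diff = x - x0  # centered continuous predictors
    # Gaussian weights; the normalizing constant is dropped because it does
    # not change the solution of the weighted normal equations.
    w = np.exp(-0.5 * np.sum(diff**2, axis=1) / b**2)
    design = np.column_stack([np.ones(len(t)), np.log(t), diff, z])
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        h = 1.0 / (1.0 + np.exp(-design @ coef))
        score = design.T @ (w * (y - h))  # weighted normal equations
        if np.linalg.norm(score) < tol:
            break
        info = design.T @ (design * (w * h * (1.0 - h))[:, None])
        coef = coef + np.linalg.solve(info, score)
    d = x.shape[1]
    return coef[0], coef[1], coef[2:2 + d], coef[2 + d:]


# Simulated panel with a nonlinear effect of x (all values are made up).
rng = np.random.default_rng(1)
rows = []
for _ in range(400):
    for j in range(1, 7):
        xj, zj = rng.uniform(-1.5, 1.5), rng.integers(0, 2)
        eta = 0.3 * np.log(j) + (-2.0 + xj**2) + 0.5 * zj
        yj = int(rng.random() < 1.0 / (1.0 + np.exp(-eta)))
        rows.append((j, xj, zj, yj))
        if yj == 1:
            break
data = np.array(rows, dtype=float)
t, x, z, y = data[:, 0], data[:, 1:2], data[:, 2:3], data[:, 3]
x0 = np.array([0.0])
theta0, beta, theta1, delta = local_fit(x0, t, x, z, y, b=0.5)
```

Repeating the call over a grid of points traces out an initial estimate of the whole curve, at the cost of one weighted fit per point.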
2.4. More powerful estimates of parameters in DSHM
More powerful estimates of $\beta$, $m(x_0)$, and $\delta$ can be derived
using the following two-step procedure. We first note
that, for each value $x_{i,j}$, an initial estimate $\tilde m(x_{i,j})$ of $m(x_{i,j})$
can be obtained by the method outlined in section 2.3.
The two-step procedure includes the following.
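Step 1: $\beta$ and $\delta$ are estimated by maximizing the pseudo
log-likelihood
$$\ell^*_1(\beta, \delta) = \sum_{i=1}^{n} Y_{i,t_i}\{\beta \log(t_i) + \tilde m(x_{i,t_i}) + \delta z_{i,t_i}\} - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log[1 + \exp\{\beta \log(j) + \tilde m(x_{i,j}) + \delta z_{i,j}\}],$$
or, equivalently, solving equations
$$0 = \sum_{i=1}^{n} Y_{i,t_i} \begin{bmatrix} \log(t_i) \\ z_{i,t_i} \end{bmatrix} - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \frac{\exp\{\beta \log(j) + \tilde m(x_{i,j}) + \delta z_{i,j}\}}{1 + \exp\{\beta \log(j) + \tilde m(x_{i,j}) + \delta z_{i,j}\}} \begin{bmatrix} \log(j) \\ z_{i,j} \end{bmatrix}.$$
Let the estimates of $(\beta, \delta)$ be $(\hat\beta, \hat\delta)$, the maximizer of
$\ell^*_1(\beta, \delta)$. Here $\ell^*_1(\beta, \delta)$ is obtained by replacing each $m(x_{i,j})$
in $\ell^*$ with its initial estimate $\tilde m(x_{i,j})$.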
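Step 2: $m(x_0)$ is estimated by maximizing the pseudo
local log-likelihood
$$\ell^*_2(\theta_0, \theta_1; x_0) = \sum_{i=1}^{n} Y_{i,t_i}\{\theta_0 + \hat\beta \log(t_i) + \theta_1 (x_{i,t_i} - x_0) + \hat\delta z_{i,t_i}\} K_g(x_{i,t_i} - x_0) - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \log[1 + \exp\{\theta_0 + \hat\beta \log(j) + \theta_1 (x_{i,j} - x_0) + \hat\delta z_{i,j}\}] K_g(x_{i,j} - x_0),$$
or, equivalently, solving equations
$$0 = \sum_{i=1}^{n} Y_{i,t_i} \begin{bmatrix} 1 \\ x_{i,t_i} - x_0 \end{bmatrix} K_g(x_{i,t_i} - x_0) - \sum_{i=1}^{n}\sum_{j=1}^{t_i} \frac{\exp\{\theta_0 + \hat\beta \log(j) + \theta_1 (x_{i,j} - x_0) + \hat\delta z_{i,j}\}}{1 + \exp\{\theta_0 + \hat\beta \log(j) + \theta_1 (x_{i,j} - x_0) + \hat\delta z_{i,j}\}} \begin{bmatrix} 1 \\ x_{i,j} - x_0 \end{bmatrix} K_g(x_{i,j} - x_0).$$
Set $(\hat\theta_0, \hat\theta_1)$ as the maximizer of $\ell^*_2(\theta_0, \theta_1; x_0)$. The
estimate of $m(x_0)$ is given by $\hat m(x_0) = \hat\theta_0$. Here
$\ell^*_2(\theta_0, \theta_1; x_0)$ is obtained by replacing $\beta$ and $\delta$ in
$\ell^*_0(\alpha; x_0)$ with their estimates produced in step 1.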
We note that in step 2 we have used a different
bandwidth g in the local likelihood method. We allow b
and g to be different in the analysis but emphasize that
both values will be determined by the sampled data (see
our proposal given in section 2.6). We suggest that the
final estimates of $\beta$, $m(x_0)$, and $\delta$ be defined by $\hat\beta$, $\hat m(x_0)$,
and $\hat\delta$. Also, at time $t_0$, the predicted instant bankruptcy
probability of a firm with predictor values $(x_0, z_0)$ is
suggested to be defined by
$$\hat h^*(t_0, x_0, z_0) = \frac{\exp\{\hat\beta \log(t_0) + \hat m(x_0) + \hat\delta z_0\}}{1 + \exp\{\hat\beta \log(t_0) + \hat m(x_0) + \hat\delta z_0\}}.$$
2.5. Selecting parametric hazard function using $\hat m(x)$
The estimated function $\hat m(x)$ of $m(x)$ can be used to
determine the functional form of the logit of hazard
function. We recall in the usual DHM that a linear
logistic function is assumed for the hazard function.
That is, the logit transformation of the hazard function is
a linear function of the d-dimensional continuous
predictors. We note that, for the jth predictor value $x_j$
in x, the relation between the logit-transformed hazard
function and $x_j$ can be determined visually by plotting
$\{x_j, \hat m(x)\}$, for each $j = 1, \ldots, d$. Here x in the plot of
$\{x_j, \hat m(x)\}$ has the jth component as $x_j$, but all other
components are fixed at their sample median levels, since
the distribution of the explanatory variable in the
financial field is usually fat-tailed and skewed. Using the
plots, a proper functional form of the logit of the hazard
function can be determined. For example, if the plot of
$\{x_j, \hat m(x)\}$, for some j, presents a cubic relation, then the
relation between the logit-transformed hazard function
and $x_j$ should be an order-three polynomial. In the
empirical examples discussed in section 3, we apply this
strategy to propose a new parametric hazard function. We
denote the parametric hazard function derived from using
the plots of $\{x_j, \hat m(x)\}$, for $j = 1, \ldots, d$, by $h^{\#}(t, x, z)$. The
DHM based on such a data-based parametric hazard
function $h^{\#}(t, x, z)$ is denoted DHM$^{\#}$ in the analysis.
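The points behind such a diagnostic plot can be assembled as in the Python sketch below. The function name `profile_points`, the stand-in estimate `toy_m_hat`, and the toy data are hypothetical; note also that the component index is zero-based in the code, whereas the text counts predictors from one.

```python
import statistics


def profile_points(x_rows, j, m_hat, n_grid=5):
    """Return (x_j, m_hat(x)) pairs with all other components fixed at sample medians.

    x_rows : list of observed d-dimensional continuous predictor vectors
    j      : zero-based index of the component to vary
    m_hat  : callable returning the estimated function value at a vector x
    """
    d = len(x_rows[0])
    medians = [statistics.median(row[k] for row in x_rows) for k in range(d)]
    lo = min(row[j] for row in x_rows)
    hi = max(row[j] for row in x_rows)
    points = []
    for s in range(n_grid):
        x = list(medians)
        x[j] = lo + (hi - lo) * s / (n_grid - 1)  # evenly spaced grid over the observed range
        points.append((x[j], m_hat(x)))
    return points


# Stand-in for a fitted estimate; a real one would come from the two-step procedure.
toy_m_hat = lambda x: x[0] ** 3 - x[1]
x_rows = [[-1.0, 0.0], [0.0, 2.0], [1.0, 4.0]]
pts = profile_points(x_rows, j=0, m_hat=toy_m_hat, n_grid=3)
```

A cubic shape in the resulting points would suggest adding squared and cubed terms of that predictor to the parametric hazard.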
2.6. Bankruptcy prediction
Theoretical argument shows $\hat h^*(t_0, x_0, z_0)$ to be a consistent
estimator of the instant bankruptcy probability. This
means that a reliable bankruptcy prediction system can be
established based on using estimate $\hat h^*(t_0, x_0, z_0)$. In this
paper, we suggest that if a firm has predictor values $x_0$
and $z_0$ at time $t_0$ and the calculated probability
$\hat h^*(t_0, x_0, z_0)$ is no more than a given cut-off value p,
then this firm is classified to be in a healthy
status. Otherwise, it is classified to be in a bankruptcy
status.
To decide a proper cut-off value p, usually one would
use all of the panel data to evaluate the performance of
the classification scheme. For simplicity of computation,
we suggest only using the dataset $\{(Y_{i,t_i}, x_{i,t_i}, z_{i,t_i}),\ i = 1, \ldots, n\}$,
collected at the last observation time of
each company in the sampling period. There are two
types of in-sample error rates occurring in this
evaluation:
$$\text{Type I error rate } \alpha_{\mathrm{in}}(p) = \frac{\sum_{i=1}^{n} Y_{i,t_i}\, I\{\hat h^*(t_i, x_{i,t_i}, z_{i,t_i}) \le p\}}{\sum_{i=1}^{n} Y_{i,t_i}},$$
and
$$\text{Type II error rate } \beta_{\mathrm{in}}(p) = \frac{\sum_{i=1}^{n} (1 - Y_{i,t_i})\, I\{\hat h^*(t_i, x_{i,t_i}, z_{i,t_i}) > p\}}{\sum_{i=1}^{n} (1 - Y_{i,t_i})},$$
where $p \in [0, 1]$ and $I(\cdot)$ stands for the indicator function.
Using the cut-off value p, $\alpha_{\mathrm{in}}(p)$ is the rate of
misclassifying a bankrupt company as a healthy company,
and $\beta_{\mathrm{in}}(p)$ is the rate of misclassifying a healthy
company as a bankrupt company.
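The two in-sample error rates translate directly into code. The Python sketch below is a minimal illustration; the function name `error_rates` and the toy probabilities are assumptions made for the example.

```python
def error_rates(y, h_hat, p):
    """In-sample type I and type II error rates at cut-off p.

    y     : 0/1 bankruptcy indicators at each firm's last observation time
    h_hat : predicted instant bankruptcy probabilities for the same firms
    """
    n_bankrupt = sum(y)
    n_healthy = len(y) - n_bankrupt
    # Type I: a bankrupt firm is classified as healthy (probability <= p).
    type1 = sum(1 for yi, hi in zip(y, h_hat) if yi == 1 and hi <= p) / n_bankrupt
    # Type II: a healthy firm is classified as bankrupt (probability > p).
    type2 = sum(1 for yi, hi in zip(y, h_hat) if yi == 0 and hi > p) / n_healthy
    return type1, type2


# Six made-up firms: two bankrupt, four healthy.
y = [1, 1, 0, 0, 0, 0]
h_hat = [0.80, 0.30, 0.40, 0.10, 0.05, 0.20]
alpha_in, beta_in = error_rates(y, h_hat, p=0.35)
```

At the cut-off 0.35, one of the two bankrupt firms and one of the four healthy firms are misclassified, giving rates of 0.5 and 0.25.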
To keep these two error rates as small as possible, we
determine a proper cut-off value $p^*$ for the bankruptcy
prediction method based on DSHM such that
$$\tau_{\mathrm{in}}(p^*) = \alpha_{\mathrm{in}}(p^*) + \beta_{\mathrm{in}}(p^*) = \min_{p \in [0,1],\ \alpha_{\mathrm{in}}(p) \le u} \{\alpha_{\mathrm{in}}(p) + \beta_{\mathrm{in}}(p)\},$$
for each $u \in [0, 1]$. That is to control the in-sample type I
error rate $\alpha_{\mathrm{in}}(p)$ to be at most u, so that the sum of
the two in-sample error rates is minimal. Controlling the
magnitude of $\alpha_{\mathrm{in}}(p)$ is essential if the type I error would
cause much more severe losses to the investors. On the
other hand, if classifying healthy firms as being bankrupt
would cause more severe losses to the investor, we might
control the in-sample type II error rate $\beta_{\mathrm{in}}(p)$ instead.
In practice, the value of u is determined by the investor.
If there is no restriction on the magnitude of $\alpha_{\mathrm{in}}(p)$ and
$\beta_{\mathrm{in}}(p)$, then we simply take u = 1 (Altman 1968, Ohlson
1980, Begley et al. 1996).
Recall that the DSHM also depends on the bandwidths
b and g. Thus we need to generalize the previous method
for defining $p^*$. We suggest considering the in-sample type
I and II error rates as functions of p, b, and g, denoted,
respectively, as $\alpha_{\mathrm{in}}(p, b, g)$ and $\beta_{\mathrm{in}}(p, b, g)$. For each given
$u \in [0, 1]$, the proper cut-off value $p^*$ and bandwidths b
and g are then determined simultaneously by minimizing
$$\tau_{\mathrm{in}}(p, b, g) = \alpha_{\mathrm{in}}(p, b, g) + \beta_{\mathrm{in}}(p, b, g)$$
with respect to (p, b, g) under the constraints: $p \in [0, 1]$,
$b > 0$, $g > 0$, and $\alpha_{\mathrm{in}}(p, b, g) \le u$. Such values for $p^*$, b, and
g are denoted, respectively, as $\hat p(u)$, $\hat b(u)$, and $\hat g(u)$.
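For fixed bandwidths, the constrained choice of the cut-off reduces to a search over finitely many candidates, because both error rates change only when p crosses an observed predicted probability. The Python sketch below illustrates this; the function name `choose_cutoff` and the toy data are assumptions, and the joint search over the two bandwidths (for example, over a grid) is not shown.

```python
def choose_cutoff(y, h_hat, u):
    """Pick p minimizing type I + type II error subject to type I error <= u.

    Candidate cut-offs are 0 and the observed predicted probabilities, since
    both error rates are step functions that change only at those values.
    Returns (p, total error, type I error, type II error).
    """
    n1 = sum(y)
    n0 = len(y) - n1
    best = None
    for p in sorted(set(h_hat) | {0.0}):
        alpha = sum(1 for yi, hi in zip(y, h_hat) if yi == 1 and hi <= p) / n1
        beta = sum(1 for yi, hi in zip(y, h_hat) if yi == 0 and hi > p) / n0
        if alpha <= u and (best is None or alpha + beta < best[1]):
            best = (p, alpha + beta, alpha, beta)
    return best


# Same made-up firms as a small illustration: two bankrupt, four healthy.
y = [1, 1, 0, 0, 0, 0]
h_hat = [0.80, 0.30, 0.40, 0.10, 0.05, 0.20]
best_unrestricted = choose_cutoff(y, h_hat, u=1.0)
```

With no restriction on the type I error (u = 1), the search settles on the cut-off 0.2, where no bankrupt firm and one healthy firm are misclassified.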
2.7. Measuring prediction performance
The performance of the bankruptcy prediction rule based
on DSHM is measured by the out-of-sample error rates.
To compute these error rates, the out-of-sample data are
selected. In contrast, the panel data used to build the
bankruptcy prediction rule are considered as the in-
sample data. The out-of-sample data are generated
similarly to the panel data for building prediction
models. The out-of-sample period is from January 2001
to December 2004. The out-of-sample companies include
all healthy firms in the panel data and the new firms
beginning their listing on the New York Stock Exchange,
American Stock Exchange, or NASDAQ during the
out-of-sample period. Assume that there are $n_0$ out-of-sample
companies. All predictor values occurring at the last
observation time of the $n_0$ out-of-sample companies in the
out-of-sample period were also collected from both
COMPUSTAT and CRSP databases. The out-of-sample
data are denoted by
$$\{(\tilde Y_{k,\tilde t_k}, \tilde x_{k,\tilde t_k}, \tilde z_{k,\tilde t_k}),\ k = 1, \ldots, n_0\}.$$
Here, for the kth out-of-sample company,
$\tilde t_k \in \{1, \ldots, T_0\}$ denotes the length of duration, where $T_0$
is a positive integer indicating the length of the
out-of-sample period. At the last observation time $\tilde t_k$, $\tilde Y_{k,\tilde t_k} = 1$
indicates that the kth company is bankrupt, and $\tilde Y_{k,\tilde t_k} = 0$
otherwise. Further, $\tilde x_{k,\tilde t_k}$ and $\tilde z_{k,\tilde t_k}$ are values of explanatory
variables X and Z collected at time $\tilde t_k$, respectively.
Given each value of $u \in [0, 1]$, the out-of-sample error
rates for the bankruptcy prediction rule based on DSHM
are defined by
$$\text{Type I error rate } \alpha_{\mathrm{out}}(u) = \frac{\sum_{k=1}^{n_0} \tilde Y_{k,\tilde t_k}\, I\{\hat h^*(\tilde t_k, \tilde x_{k,\tilde t_k}, \tilde z_{k,\tilde t_k}) \le \hat p(u)\}}{\sum_{k=1}^{n_0} \tilde Y_{k,\tilde t_k}},$$
$$\text{Type II error rate } \beta_{\mathrm{out}}(u) = \frac{\sum_{k=1}^{n_0} (1 - \tilde Y_{k,\tilde t_k})\, I\{\hat h^*(\tilde t_k, \tilde x_{k,\tilde t_k}, \tilde z_{k,\tilde t_k}) > \hat p(u)\}}{\sum_{k=1}^{n_0} (1 - \tilde Y_{k,\tilde t_k})},$$
and the total error rate is $\tau_{\mathrm{out}}(u) = \alpha_{\mathrm{out}}(u) + \beta_{\mathrm{out}}(u)$.
Given the out-of-sample data, the out-of-sample error
rates can be similarly defined for the bankruptcy
prediction rules based on DHM and DHM$^{\#}$.
3. Empirical studies
In this section, empirical studies are conducted to
compare the performance of the prediction rules based
on DHM, DHM$^{\#}$, and DSHM.
3.1. The data
Four panel datasets were considered for empirical studies.
The predictors considered were the accounting variables
and market-driven variables suggested by Ohlson (1980)
and Shumway (2001). Ohlson (1980) suggested using nine
accounting variables:
WCTA =Working capital divided by total assets,
TLTA =Total liabilities divided by total assets,
NITA =Net income divided by total assets,
CLCA =Current liabilities divided by current assets,
FUTL =Funds provided by operations divided by
total liabilities,
CHIN =$(NI_t - NI_{t-1})/(|NI_t| + |NI_{t-1}|)$, where $NI_t$ is
net income for the most recent period,
SIZE =Logarithm of total assets divided by GNP
price-level index, where the index assumes
a base value of 100 for 1984,
INTWO =One if net income was negative for the last
two years, zero otherwise,
OENEG =One if total liabilities exceed total assets, zero
otherwise.
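As a worked illustration of the CHIN definition in the list above, the Python sketch below computes the scaled change in net income. The function name `chin` is hypothetical, and returning 0.0 when both incomes are zero is an assumption, since the source does not say how that case is handled.

```python
def chin(ni_t, ni_prev):
    """Scaled change in net income: (NI_t - NI_{t-1}) / (|NI_t| + |NI_{t-1}|)."""
    denom = abs(ni_t) + abs(ni_prev)
    if denom == 0:
        # Both incomes are zero, so the ratio is undefined; 0.0 is an assumed fallback.
        return 0.0
    return (ni_t - ni_prev) / denom


# Net income rising from 10 to 30 gives (30 - 10) / (30 + 10).
example = chin(30.0, 10.0)
```

Because the denominator is the sum of absolute values, the measure always lies between -1 and 1, which keeps it comparable across firms of different size.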
Shumway (2001) suggested using only two accounting
variables, TLTA and NITA, in the model. Besides
accounting variables, Shumway (2001) and Chava and
Jarrow (2004) further suggested using market-driven
variables such as
RSIZE=Logarithm of each firm's market equity value
divided by the total NYSE/AMEX/
NASDAQ market equity value,
EXRET =Monthly return on the firm minus the value-
weighted CRSP NYSE/AMEX/NASDAQ
index return cumulated to obtain the yearly
return,
as well as the variable
LNAGE =Logarithm of firm age
for prediction. Here the firm age is defined as the number
of calendar years it has been traded during the sampling
period on the New York Stock Exchange, American
Stock Exchange, or NASDAQ (Shumway 2001).
Based on these predictors, we studied the performance
of the prediction rules using combinations of accounting
and market-driven variables. We considered two studies,
with and without market-driven variables, for each set of
accounting variables. The variable LNAGE was always
included in the prediction models, since the models
considered in this paper depend on the hazard function
(see definitions of DHM and DSHM). Later, we shall
report the empirical results of the prediction rules using
the four different sets of panel data.
The sampling period of each of the four panel datasets
(used for building the prediction models) ran from January
1984 to December 2000. The out-of-sample period (used for
measuring prediction performance) ran from January
2001 to December 2004. All firms that started their listing
on the New York Stock Exchange, American Stock
Exchange, or NASDAQ during these two periods
are included in the studies, except that financial
institutions were eliminated from the sample because of the
unique capital requirements and regulatory structure in
that industry group. All panel and out-of-sample datasets
were selected from the COMPUSTAT and CRSP
databases. Companies that were delisted with CRSP
delisting codes 400–490, 572, or 574 were considered
bankrupt; all other companies were considered healthy.
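The delisting-code rule, together with the firm-year layout commonly used for discrete-time hazard models, can be sketched as follows. The sketch assumes that each firm contributes one record per year and that only the final year of a firm delisted with a bankruptcy code is labeled 1. This labeling convention and all names are assumptions for illustration, not a description of the authors' data pipeline.

```python
# CRSP delisting codes treated as bankruptcy: 400 to 490, 572, and 574.
BANKRUPT_CODES = set(range(400, 491)) | {572, 574}

def firm_year_labels(years_listed, delisting_code=None):
    """Return one 0/1 response per firm-year.

    Only the last observed year of a firm delisted with a bankruptcy code
    is labeled 1; every other firm-year is labeled 0.
    """
    labels = [0] * years_listed
    if years_listed > 0 and delisting_code in BANKRUPT_CODES:
        labels[-1] = 1
    return labels
```

For example, `firm_year_labels(3, 574)` gives `[0, 0, 1]`, while a firm delisted for another reason, or still listed, contributes only zeros.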
Note that the COMPUSTAT and CRSP databases contain
many missing values for the predictors in each study.
In the analysis we only considered those
companies in the dataset with complete predictor values.
The problem of missing data is not unusual in applications,
especially when many predictive variables are
used in the model. As long as the missingness occurs
at random, the complete-data analysis will not introduce
systematic biases (Allison 2001, Little and Rubin 2002).
We have no reason to doubt that the values missing
from the COMPUSTAT and CRSP
databases are missing at random.
In each study, the DHM with linear logistic hazard
function h(t, x, z) and the DSHM with semiparametric
logistic hazard function $h^{+}(t, x, z)$ were considered. A
modified DHM, denoted by DHM$^{\#}$, using the data-based
parametric hazard function $h^{\#}(t, x, z)$ suggested from the
result of DSHM, is also included in the analysis so that
a comparison can be made.
3.2. Computational procedures
In computing the DSHM, the values of the continuous
predictors were first divided by their respective sample
standard deviations so that all variables have the same
scale. This is important, since it prevents a predictor with
a very large range from dominating the estimation of the optimal
values of $(p, b, g)$ and the reading of the plots of $\{x_j, \hat{m}(x)\}$,
for $j = 1, \ldots, d$.
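A minimal sketch of this rescaling step is given below. It divides each continuous predictor by its sample standard deviation without centering, as described above; the function name and the column-dictionary layout are assumptions made for illustration.

```python
import statistics

def scale_by_sd(columns):
    """Divide each continuous predictor by its sample standard deviation.

    `columns` maps a predictor name to its list of observed values.
    """
    scaled = {}
    for name, values in columns.items():
        sd = statistics.stdev(values)  # sample SD with n - 1 denominator
        scaled[name] = [v / sd for v in values]
    return scaled
```

After this step every predictor has unit sample standard deviation, so bandwidth-type tuning values can be shared across predictors on a common scale.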
A grid-search approach was used in computing the
optimal values of $(p, b, g)$. First, the values of $\tau_{in}(p, b, g)$
on an equally spaced logarithmic grid of $1001 \times 51 \times 51$
values of $(p, b, g)$ in $[10^{-5}, 1] \times [0.5, 5] \times [0.5, 5]$ were
computed. See Marron and Wand (1992) for a discussion
of why an equally spaced grid of parameters is typically not
a very efficient design for this type of grid search. Given
each value of $u \in [0, 1]$, the global minimizer
$\{\hat{p}(u), \hat{b}(u), \hat{g}(u)\}$ of $\tau_{in}(p, b, g)$ on the grid points, subject to the
restriction $\alpha_{in}(p, b, g) \le u$, was taken as the optimal values
of $(p, b, g)$. Based on these optimal values, the out-of-sample
error rates, which are functions of $u$, can then be computed
according to our previous definitions.
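The constrained grid search described above can be sketched as follows. The callable `in_sample_errors` is hypothetical: it stands in for fitting the model at a given triple and returning the in-sample type I and total error rates. The sketch only illustrates the search logic (minimize the total error rate subject to an upper bound on the type I error rate) and is not the authors' implementation.

```python
import math
from itertools import product

def log_grid(lo, hi, n):
    """Return n points from lo to hi, equally spaced on the log scale."""
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(n)]

def constrained_grid_search(in_sample_errors, u, p_grid, b_grid, g_grid):
    """Minimize the total error rate subject to type I error rate <= u.

    `in_sample_errors(p, b, g)` is a hypothetical callable returning the
    pair (type_I_rate, total_rate). Returns (best_triple, best_total), or
    (None, inf) when no grid point satisfies the restriction.
    """
    best, best_total = None, math.inf
    for p, b, g in product(p_grid, b_grid, g_grid):
        type_i, total = in_sample_errors(p, b, g)
        if type_i <= u and total < best_total:
            best, best_total = (p, b, g), total
    return best, best_total
```

With `log_grid(1e-5, 1.0, 1001)` for the first component and `log_grid(0.5, 5.0, 51)` for the other two, the search visits the same number of grid points as stated in the text, so in practice the error rates would be cached or computed in a vectorized way.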
In addition, using $\{\hat{p}(1), \hat{b}(1), \hat{g}(1)\}$, the plot of $\{x_j, \hat{m}(x)\}$
can be produced for each continuous predictor. We note
that, in the plot of $\{x_j, \hat{m}(x)\}$, we have taken the left and
the right boundary points of its horizontal axis as the 0.5
and 99.5 percentiles of the values of the $j$th component of
the continuous predictor $X$, for each $j$. These plots are
used to visually check the adequacy of the order-one
polynomial function assumed for each continuous predictor
in the linear logistic hazard function of DHM. The
empirical results given below show that, sometimes,
the order-one polynomial functions should be replaced
by order-two or order-three polynomials in order to yield better
predictive power.
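The evaluation points behind such a plot can be sketched as follows: the $j$th component runs from the 0.5 to the 99.5 percentile of its observed values, while every other component is held at its sample median (as stated in the figure captions). The percentile rule used here, linear interpolation between order statistics, is an assumption, as are the function names; the estimate $\hat{m}$ itself is not computed.

```python
import statistics

def percentile(values, q):
    """q-th percentile (0 to 100), interpolating between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def profile_points(X, j, n_points=50):
    """Points x at which m_hat(x) would be evaluated for the j-th plot.

    `X` is a list of rows of continuous predictors. Component j varies
    over [0.5th, 99.5th percentile]; the others stay at their medians.
    """
    d = len(X[0])
    medians = [statistics.median(row[k] for row in X) for k in range(d)]
    column = [row[j] for row in X]
    left, right = percentile(column, 0.5), percentile(column, 99.5)
    points = []
    for i in range(n_points):
        x = list(medians)
        x[j] = left + (right - left) * i / (n_points - 1)
        points.append(x)
    return points
```

Trimming the horizontal axis at these percentiles keeps a few extreme observations from compressing the region where most of the data lie.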
3.3. Results based on using Ohlson's accounting variables with and without market-driven variables included
Given the two datasets with and without the market-
driven variables included, table 1 reports the summary
statistics and the estimated coefficients of DHM, and
figure 1 presents the plot of $\{x_j, \hat{m}(x)\}$ for each continuous
predictor. Table 1 shows that the values of the estimated
coefficients for variables TLTA, NITA, CLCA, and SIZE
in panel A, and those for variables TLTA, FUTL, and
SIZE in panel B do not agree with their expected signs.
This result indicates that the linear logit of the hazard
function of DHM for each of the two datasets might not
be suitable. The slope of each curve in figure 1 agrees with
Table 1. Summary statistics of the panel dataset and the estimated coefficients of DHM using Ohlson's accounting variables with and without market-driven variables.
Variable  Mean  Median  Standard deviation  Minimum  Maximum  Estimated coefficient of DHM (p-value)
Panel A: Without market-driven variables
78 bankrupt companies, 2275 healthy companies, and 14,066 firm years
Intercept 5.397 (0.001)
WCTA 0.299 0.299 1.733 202 0.995 1.272 (0.002)
TLTA 0.471 0.431 1.736 0.001 203 0.681 (0.091)
NITA 0.182 0.036 14.501 1719 1.421 0.031 (0.779)
CLCA 0.650 0.444 3.326 0.002 215.667 0.001 (0.908)
FUTL 0.136 0.116 4.285 38.061 464.448 0.031 (0.585)
CHIN 0.072 0.097 0.644 1 1 0.931 (0.001)
SIZE 0.170 0.238 1.224 7.462 4.742 0.055 (0.586)
INTWO 0.254 0 0.435 0 1 0.774 (0.006)
OENEG 0.028 0 0.164 0 1 1.942 (0.001)
LNAGE 1.283 1.386 0.776 0 2.708 0.171 (0.287)
Panel B: With market-driven variables
77 bankrupt companies, 2192 healthy companies, and 13,400 firm years
Intercept 12.557 (0.001)
WCTA 0.304 0.302 1.771 202 0.987 1.164 (0.007)
TLTA 0.464 0.430 1.774 0.001 203 0.554 (0.178)
NITA 0.168 0.038 14.854 1719 1.421 0.134 (0.226)
CLCA 0.607 0.441 2.707 0.002 203 0.018 (0.077)
FUTL 0.083 0.124 4.327 38.061 464.448 0.001 (0.975)
CHIN 0.077 0.101 0.643 1 1 0.823 (0.001)
SIZE 0.114 0.188 1.205 7.462 4.742 0.443 (0.001)
INTWO 0.239 0 0.426 0 1 0.764 (0.006)
OENEG 0.022 0 0.146 0 1 2.047 (0.001)
RSIZE 4.793 4.803 0.755 8.584 1.451 1.428 (0.001)
EXRET 0.900 0.358 16.761 10.893 867.761 0.117 (0.211)
LNAGE 1.275 1.386 0.775 0 2.708 0.051 (0.820)
the expected direction of the corresponding variable
effect, except variable SIZE in panel (n). Panels (e) and
(g) of figure 1 indicate that the order-one polynomials
for the two variables FUTL and SIZE in DHM should
be replaced by order-two polynomials if only Ohlson's
accounting variables were considered in the study.
Panels (l) and (n) of figure 1 indicate that order-two
polynomials should also be applied to the variables FUTL
and SIZE in the DHM if the market-driven variables
are included in the analysis. Figure 2 shows the
out-of-sample error rates of the three prediction rules based
on DHM, DHM$^{\#}$, and DSHM for the two given datasets.
We first see that the prediction models based on
parametric hazard functions are in general
conservative in the sense of having smaller type I error rates
than the expected upper bound $u$. In contrast, the type I
error rates of the DSHM are close to the designed
upper bounds in the cases of $u \le 0.20$. On the other hand,
the type II and the total error rates of the DSHM
are much smaller than those of the parametric models
when $u \le 0.20$. One can see that, when only
Ohlson's accounting variables are used in the analysis,
the largest percentage decrease of the total error rate
of the DSHM over the DHM is 55%. When
market-driven variables are included in the analysis,
the largest percentage decrease becomes 63%. We also
point out that, in the two cases considered in figure 2,
the improvement of DHM with the data-based
parametric hazard function $h^{\#}(t, x, z)$ over that with the
linear logistic hazard function $h(t, x, z)$ is limited when
$u \le 0.20$. This result suggests that the DHM with the
data-based hazard function may not always improve on
the performance of DHM with a simple linear logistic
hazard function.
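For readers who wish to reproduce comparisons of this kind, the error rates and the percentage decrease can be computed as in the sketch below. It assumes that a type I error is a bankrupt company classified as healthy, that a type II error is a healthy company classified as bankrupt, and that the total error rate is the share of all companies misclassified. These conventions and the function names are assumptions for illustration; the paper's own definitions are given in an earlier section.

```python
def error_rates(y_true, y_pred):
    """Return (type I, type II, total) error rates.

    Labels are 1 for bankrupt and 0 for healthy. Both classes must be
    present in `y_true`.
    """
    n_bankrupt = sum(y_true)
    n_healthy = len(y_true) - n_bankrupt
    missed_bankrupt = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    missed_healthy = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (
        missed_bankrupt / n_bankrupt,
        missed_healthy / n_healthy,
        (missed_bankrupt + missed_healthy) / len(y_true),
    )

def percentage_decrease(baseline, improved):
    """Percentage decrease of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline
```

As a usage example with made-up numbers, a total error rate that falls from 0.40 under one model to 0.30 under another corresponds to `percentage_decrease(0.40, 0.30)`, a decrease of about 25%.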
3.4. Results based on using Shumway's accounting variables with and without market-driven variables included
We next report the results of the prediction models DHM,
DHM$^{\#}$, and DSHM using Shumway's accounting variables
with and without market-driven variables included.
[Figure 1: sixteen panels, each plotting $\hat{m}(x)$ (vertical axis) against one predictor (horizontal axis). Panels (a)–(g): WCTA, TLTA, NITA, CLCA, FUTL, CHIN, SIZE (Ohlson variables). Panels (h)–(n): WCTA, TLTA, NITA, CLCA, FUTL, CHIN, SIZE (Ohlson variables). Panels (o) and (p): RSIZE and EXRET (market variables).]
Figure 1. Plots of marginal relations between the logit-transformed hazard function and predictors. Panels (a)–(g) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM solely using Ohlson's accounting variables. Panels (h)–(p) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM using Ohlson's accounting variables and market-driven variables. The value of $x$ in the plot of $\{x_j, \hat{m}(x)\}$ has the $j$th component as $x_j$, but all other components fixed at their sample median level.
Table 2 reports the summary statistics and the estimated
coefficients of DHM. It shows that the values of
the estimated coefficients of Shumway's accounting variables
and the two market-driven variables all agree with their
expected signs in the study. Figure 3 presents a plot
of $\{x_j, \hat{m}(x)\}$ for each continuous predictor, and shows
that the slope of the curve in each panel agrees with
the expected direction of the variable effect. Panels (a)
and (b) of figure 3 show that the order-one polynomials
for the two accounting variables in DHM are proper
in the study without market-driven variables. However,
simply for comparison, we naively used order-two
polynomials for variables TLTA and NITA to define
DHM$^{\#}$. On the other hand, from panels (c)–(f) of figure 3,
we see that the order-one polynomial for the variable
RSIZE in DHM should be replaced by an order-three
polynomial. Figure 4 shows the out-of-sample error
rates of the prediction rules based on DHM, DHM$^{\#}$,
and DSHM. Inspecting the results given in figure 4,
we see that, in the range $u \le 0.20$, the type I error rates
of DHM and DSHM are very similar.
However, the total error rate of the DSHM is in general
smaller than that of the DHM, for all $u \in [0, 1]$. The
largest percentage decrease of the total error rate by
the DSHM over the DHM is 17% when the market-driven
variables are not included in the analysis, and
21% when the market-driven variables are included.
The figure also shows that the improvement of DHM$^{\#}$
over DHM is minimal in the case without
the market-driven variables. This result is reasonable,
since the corresponding order-one polynomials for
the accounting variables are proper for modeling
the hazard function. However, the figure shows that the
improvement of DHM$^{\#}$ over DHM is significant
when the market-driven variables are included in the
model.
4. Concluding remarks
In this paper, a bankruptcy prediction method based on
DSHM is proposed. This is an extension of the DHM
[Figure 2: six panels, (a)–(f), each plotting an out-of-sample error rate (vertical axis, 0 to 1) against $u$ (horizontal axis, 0 to 1) for DHM, DHM$^{\#}$, and DSHM.]
Figure 2. The performance of the three prediction rules based on DHM (dashed curve), DHM$^{\#}$ (dotted curve), and DSHM (solid curve) using Ohlson's accounting variables with and without market-driven variables. Panels (a), (c), and (e) show, respectively, the out-of-sample type I, type II, and total error rates of the prediction methods solely using Ohlson's accounting variables. Panels (b), (d), and (f) show the three out-of-sample error rates using Ohlson's accounting variables and market-driven variables. In each panel, the data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM$^{\#}$ was identical to the linear logistic hazard function $h(t, x, z)$ of DHM except that the order-one polynomials for the two variables FUTL and SIZE were replaced by order-two polynomials.
proposed by Shumway (2001) and Chava and Jarrow
(2004). The DHM assumes that the logit transformation
of the hazard function is a linear function of the
predictors. In contrast, the DSHM only assumes that
the transformed function is a smooth function of the
continuous predictors. This gives the prediction model
more freedom in modeling the underlying hazard function.
We point out that the estimates in the DSHM are
derived using the local likelihood method. It can be
shown that, under very general conditions, the
instant bankruptcy probability computed using DSHM consistently
estimates the true instant bankruptcy probability. Thus
the DSHM is a reliable prediction rule.
One additional advantage of using DSHM is that by
plotting $\{x_j, \hat{m}(x)\}$ for each continuous predictor, one can
visually check the adequacy of the parametric DHM. If
the parametric model is not proper, the results from
DSHM can also guide us on how to make a better
selection of parametric model. Sometimes, using parametric
modeling is important, particularly when one has
too many predictor variables to be considered simultaneously
and does not have enough sample data to
estimate them non-parametrically.
We have considered four studies to investigate the finite
sample performance of the DSHM. The four studies were
based on the accounting and market-driven variables
proposed by Ohlson (1980) and Shumway (2001). The
results of the four studies demonstrate that the DSHM
improves the performance of DHM in the prediction of
bankruptcy. The DSHM generally has smaller
out-of-sample total error rates in all studies. Such an advantage
of the DSHM over the DHM in the case of
solely using accounting variables is more significant
than that in the case of employing both accounting and
Table 2. Summary statistics of the panel dataset and the estimated coefficients of DHM using Shumway's accounting variables with and without market-driven variables.
Variable  Mean  Median  Standard deviation  Minimum  Maximum  Estimated coefficient of DHM (p-value)
Panel A: Without market-driven variables
92 bankrupt companies, 2368 healthy companies, and 14,846 firm years
Intercept 5.712 (0.001)
TLTA 0.478 0.441 1.692 0.001 203 0.959 (0.001)
NITA 0.174 0.035 14.115 1719 1.421 0.096 (0.267)
LNAGE 1.290 1.386 0.777 0 2.708 0.055 (0.696)
Panel B: With market-driven variables
91 bankrupt companies, 2281 healthy companies, and 14,140 firm years
Intercept 12.197 (0.001)
TLTA 0.471 0.440 1.728 0.001 203 1.342 (0.001)
NITA 0.161 0.037 14.460 1719 1.421 0.080 (0.448)
RSIZE 4.797 4.808 0.753 8.584 1.451 1.258 (0.001)
EXRET 0.818 0.368 16.328 10.893 867.761 0.186 (0.026)
LNAGE 1.282 1.386 0.777 0 2.708 0.258 (0.180)
[Figure 3: six panels, each plotting $\hat{m}(x)$ (vertical axis) against one predictor (horizontal axis). Panels (a) and (b): TLTA and NITA (Shumway variables). Panels (c) and (d): TLTA and NITA (Shumway variables). Panels (e) and (f): RSIZE and EXRET (market variables).]
Figure 3. Plots of marginal relations between the logit-transformed hazard function and predictors. Panels (a) and (b) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM solely using Shumway's accounting variables. Panels (c)–(f) show the plots of $\{x_j, \hat{m}(x)\}$ resulting from DSHM using Shumway's accounting variables and market-driven variables. The value of $x$ in the plot of $\{x_j, \hat{m}(x)\}$ has the $j$th component as $x_j$, but all other components fixed at their sample median level.
market-driven variables. This result is particularly useful
when applying DSHM to companies that are not listed on
stock exchanges. Shumway (2001) pointed out that, in
general, the prediction performance of DHM using both
accounting and market-driven variables is better than that
solely employing accounting variables. Our empirical
results confirm that this advantage of market-driven
variables also applies to DSHM.
Note that, in our development of DSHM, we have used
a logistic hazard function h(t, x, z) as a basis. We remark
that Allison (1982) considered a discrete-time propor-
tional hazard function defined by
$$j(t, x, z) = 1 - \exp[-\exp\{\alpha_1 + \beta_1 \log(t) + \gamma_1 x + \theta_1 z\}]$$
in the analysis. The discrete-time proportional hazard
function was derived from the well-known proportional
hazard function of Cox (1972). Using the same rationale
as given in section 2, we can also modify the discrete-time
proportional hazard function as the discrete-time semi-
parametric proportional hazard function
$$j^{+}(t, x, z) = 1 - \exp[-\exp\{\beta \log(t) + m(x) + \theta z\}]$$
for bankruptcy prediction. Our unreported empirical
results from the four panel datasets studied in this paper
show that the performance of DHM using j(t, x, z) is
similar to that employing h(t, x, z). The same remark also
applies to DSHM when replacing $h^{+}(t, x, z)$ by $j^{+}(t, x, z)$.
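The similarity of the two hazard forms when bankruptcy is rare can be checked numerically. The sketch below evaluates a logistic hazard and a discrete-time proportional (complementary log-log) hazard at the same value `eta` of the linear or semiparametric predictor. The function names are hypothetical, and the comparison at a common `eta` is only an illustration, since in practice each model is fitted separately and has its own coefficients.

```python
import math

def logistic_hazard(eta):
    """Hazard whose logit transformation equals eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def proportional_hazard(eta):
    """Discrete-time proportional hazard: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

if __name__ == "__main__":
    for eta in (-8.0, -6.0, -4.0, -2.0, 0.0):
        print(eta, logistic_hazard(eta), proportional_hazard(eta))
```

For strongly negative `eta` both functions are close to `exp(eta)`, so the two hazards nearly coincide in the low-probability range that is relevant for yearly bankruptcy, and they separate only as `eta` approaches zero.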
More investigation of DSHM is necessary. Firstly, in
applications, it is not clear how long a sampling period
should be used so that a powerful prediction model can be
developed. This is important, since if it is long, then there
will be many missing data. Secondly, in some practical
[Figure 4: six panels, (a)–(f), each plotting an out-of-sample error rate (vertical axis, 0 to 1) against $u$ (horizontal axis, 0 to 1) for DHM, DHM$^{\#}$, and DSHM.]
Figure 4. The performance of the three prediction rules based on DHM (dashed curve), DHM$^{\#}$ (dotted curve), and DSHM (solid curve) using Shumway's accounting variables with and without market-driven variables. Panels (a), (c), and (e) show, respectively, the out-of-sample type I, type II, and total error rates of the prediction methods solely using Shumway's accounting variables. The data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM$^{\#}$ in each of panels (a), (c), and (e) used order-two polynomials for variables TLTA and NITA (this model is over-parameterized, since DHM is approximately correct, but is included here simply for comparison). Panels (b), (d), and (f) show the three out-of-sample error rates using Shumway's accounting variables and market-driven variables. The data-based parametric hazard function $h^{\#}(t, x, z)$ of DHM$^{\#}$ in each of panels (b), (d), and (f) was identical to the linear logistic hazard function $h(t, x, z)$ in DHM except that the order-one polynomial for the variable RSIZE was replaced by an order-three polynomial.
applications, such as credit rating, we are interested
in predicting the rating of a particular company. Thus it
is useful to study how to extend the prediction methods
to such a situation. Thirdly, in this paper, the perfor-
mance of DSHM was only studied using firm-specific
variables including accounting and market-driven vari-
ables. Other important firm-specific variables, such as
the KMV-Merton default probability, and industry
effects and macroeconomic variables have been consid-
ered by Chava and Jarrow (2004), Hillegeist et al. (2004),
Duffie et al. (2007), Bharath and Shumway (2008),
and Chava et al. (2008). It would be of interest to study
the effects of these variables on our semiparametric
approach in the future. Further, to account for the
heterogeneity, a latent variable method can also be
considered; see Duffie et al. (2009) and Chava et al.
(2008). Finally, we remark that the DSHM depends on
the logistic hazard function. The robustness of the use of
this particular hazard function is still not clear. If it is
not robust, then the local quasi-likelihood approach (Fan
et al. 1995) or the local semilikelihood approach
(Claeskens and Aerts 2000, Claeskens and Keilegom
2003) can be considered.
Acknowledgements
The authors thank the two referees for their kind
suggestions, which greatly improved the presentation of
this paper. This research was supported by the National
Science Council, Taiwan, Republic of China.
References
Allison, P.D., Discrete-time methods for the analysis of event histories. Sociol. Methodol., 1982, 13, 61–98.
Allison, P.D., Missing Data, 2001 (SAGE Publications: London).
Altman, E.I., Financial ratios, discriminant analysis, and the prediction of corporate bankruptcy. J. Finan., 1968, 23, 589–609.
Atiya, A.F., Bankruptcy prediction for credit risk using neural networks: a survey and new results. IEEE Trans. Neural Ntwks, 2001, 12, 929–935.
Begley, J., Ming, J. and Watts, S., Bankruptcy classification errors in the 1980s: an empirical analysis of Altman's and Ohlson's models. Rev. Account. Stud., 1996, 1, 267–284.
Bharath, S.T. and Shumway, T., Forecasting default with the Merton distance to default model. Rev. Finan. Stud., 2008, 21, 1339–1369.
Chava, S. and Jarrow, R.A., Bankruptcy prediction with industry effects. Rev. Finan., 2004, 8, 537–569.
Chava, S., Stefanescu, C. and Turnbull, S., Modeling the loss distribution, 2008. Available online at: http://faculty.london.edu/cstefanescu/research.html (accessed 21 April 2008).
Claeskens, G. and Aerts, M., Bootstrapping local polynomial estimators in likelihood-based models. J. Statist. Plann. Infer., 2000, 86, 63–80.
Claeskens, G. and Keilegom, I.V., Bootstrap confidence bands for regression curves and their derivatives. Ann. Statist., 2003, 31, 1852–1884.
Cox, D.R., Regression models and life-tables (with discussion). J. R. Statist. Soc., Ser. B, 1972, 34, 187–220.
Cox, D.R. and Oakes, D., Analysis of Survival Data, 1984 (Chapman and Hall: London).
Duffie, D., Credit risk modeling with affine process. J. Bank. Finan., 2005, 29, 2751–2802.
Duffie, D., Eckner, A., Horel, G. and Saita, L., Frailty correlated default, 2009. Available online at: http://www.afajof.org/afa/forthcoming/5279.pdf (accessed 12 January 2009).
Duffie, D., Saita, L. and Wang, K., Multi-period corporate default prediction with stochastic covariates. J. Finan. Econ., 2007, 83, 635–665.
Eubank, R.L., Spline Smoothing and Nonparametric Regression, 1988 (Marcel Dekker: New York).
Fan, J. and Gijbels, I., Local Polynomial Modeling and its Application – Theory and Methodologies, 1996 (Chapman and Hall: New York).
Fan, J., Heckman, N.E. and Wand, M.P., Local polynomial kernel regression for generalized linear models and quasi-likelihood functions. J. Am. Statist. Assoc., 1995, 90, 141–150.
Härdle, W., Applied Nonparametric Regression, 1990 (Cambridge University Press: Cambridge).
Härdle, W., Smoothing Techniques: with Implementation in S, 1991 (Springer: Berlin).
Härdle, W., Moro, R.A. and Schäfer, D., Graphical data representation in bankruptcy analysis. In Handbook of Data Visualization, edited by C.H. Chen, W. Härdle and A. Unwin, pp. 853–872, 2008 (Springer: Berlin).
Hillegeist, S.A., Keating, E.K., Cram, D.P. and Lundstedt, K.G., Assessing the probability of bankruptcy. Rev. Account. Stud., 2004, 9, 5–34.
Hwang, R.C., Cheng, K.F. and Lee, J.C., A semiparametric method for predicting bankruptcy. J. Forecast., 2007, 26, 317–342.
Little, R.J.A. and Rubin, D.B., Statistical Analysis with Missing Data, 2002 (Wiley: New York).
Marron, J.S. and Wand, M.P., Exact mean integrated square error. Ann. Statist., 1992, 20, 712–736.
Merton, R.C., On the pricing of corporate debt: the risk structure of interest rates. J. Finan., 1974, 29, 449–470.
Müller, H.G., Nonparametric Regression Analysis of Longitudinal Data, 1988 (Springer: Berlin).
Ohlson, J., Financial ratios and the probabilistic prediction of bankruptcy. J. Account. Res., 1980, 18, 109–131.
Scott, D.W., Multivariate Density Estimation: Theory, Practice, and Visualization, 1992 (Wiley: New York).
Shumway, T., Forecasting bankruptcy more accurately: a simple hazard model. J. Bus., 2001, 74, 101–124.
Simonoff, J.S., Smoothing Methods in Statistics, 1996 (Springer: New York).
Staniswallis, J.G., The kernel estimate of a regression function in likelihood-based models. J. Am. Statist. Assoc., 1989, 84, 276–283.
Sun, L. and Shenoy, P.P., Using Bayesian networks for bankruptcy prediction: some methodological issues. Eur. J. Oper. Res., 2007, 180, 738–753.
Tibshirani, R. and Hastie, T., Local likelihood estimation. J. Am. Statist. Assoc., 1987, 82, 559–568.
Vassalou, M. and Xing, Y., Default risk in equity returns. J. Finan., 2004, 59, 831–868.
Wand, M.P. and Jones, M.C., Kernel Smoothing, 1995 (Chapman and Hall: London).
Zmijewski, M.E., Methodological issues related to the estimation of financial distress prediction models. J. Account. Res., 1984, 22, 59–82.