
Knowledge-Based Systems 24 (2011) 1380–1388

Contents lists available at ScienceDirect

Knowledge-Based Systems
journal homepage: www.elsevier.com/locate/knosys

The random subspace binary logit (RSBL) model for bankruptcy prediction
Hui Li a,d,⇑, Young-Chan Lee b, Yan-Chun Zhou c, Jie Sun a
a School of Economics and Management, Zhejiang Normal University, P.O. Box 62, 688 YingBinDaDao, Jinhua, Zhejiang 321004, China
b Division of Economics & Commerce, Dongguk University, GyeongJu Campus, Gyeongju, Gyeongbuk 780-714, Republic of Korea
c School of Business, Ningbo University, 818 FengHuaLu Road, Ningbo, Zhejiang 315211, China
d College of Engineering, The Ohio State University, 470 Hitchcock Hall, 2070 Neil Avenue, Columbus, OH 43210, USA

Article info

Article history:
Received 6 January 2011
Received in revised form 21 June 2011
Accepted 21 June 2011
Available online 29 June 2011

Keywords:
Bankruptcy prediction
Random subspace binary logit
Group decision of predictive models
Corporate failure prediction
Probit
Multivariate discriminant analysis

Abstract

This paper proposes the random subspace binary logit (RSBL) model (or random subspace binary logistic regression analysis) by taking the random subspace approach and using the classical logit model to generate a group of diverse logit decision agents from various perspectives for predictive problems. These diverse logit models are then combined for a more accurate analysis. The proposed RSBL model takes advantage of both logit (or logistic regression) and random subspace approaches. The random subspace approach generates diverse sets of variables to represent the current problem as different masks. Different logit decision agents from these masks, instead of a single logit model, are constructed. To verify its performance, we used the proposed RSBL model to forecast corporate failure in China. The results indicate that this model significantly improves the predictive ability of classical statistical models such as multivariate discriminant analysis, logit model, and probit model. Thus, the proposed model should make logit model more suitable for predictive problems in academic and industrial uses.
© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Investors, creditors, bankers, stockholders, and managers need effective tools for managing various risks associated with their decisions. Research on this topic includes Cho [9] and Pai et al. [33], among others. One such tool is bankruptcy prediction, which refers to the prediction of business failure through financial variables [11,14,15,26–28,34,36,38]. Early studies of bankruptcy prediction typically employed discriminant analysis. Fitzpatrick [10] provided an in-depth interpretation of bankruptcy variables and trends (i.e., he employed a type of multiple variable analysis); Beaver [3] proposed a framework for employing univariate analysis for bankruptcy prediction; and Altman [1] employed multivariate discriminant analysis (MDA) to predict bankruptcy. Because discriminant analysis is easy to understand, interpret, and explain, it has been suitable for industrial use. MDA assumes dichotomous data, a multivariate normal distribution, equal variance-covariance matrices across two groups, a specified prior probability of two groups, and the absence of multicollinearity [2]. However, the assumption of multivariate normality is often violated for bankruptcy data without effective solutions (e.g., only several financial variables distribute normally for bankruptcy data from China). Further, univariate normality is not a sufficient condition for multivariate normality. To address this problem, Martin [25] and Ohlson [29] used a conditional probability model to forecast bankruptcy. This type of model uses the nonlinear maximum likelihood method for estimating the probability of bankruptcy. According to the assumption about the probability distribution, logit and probit models assume a logistic distribution and a cumulative normal distribution, respectively. The relationship between financial variables and the probability of bankruptcy is typically assumed to be linear. Thus, the linear logit model has typically been used. These two classical statistical models have been widely used for forecasting bankruptcy for many years [17,18,39,16]. Even now, firms frequently employ logit model to calculate the probability of bankruptcy for their customers. Previous studies have attempted to improve the effectiveness of these two classical models by optimizing a single model. For example, Refs. [32,31] proposed the Tabu method, a type of feature selection method, for discriminant analysis and logit model.

In the past two decades, a number of studies have investigated the performance of intelligent models on bankruptcy prediction (e.g., case-based reasoning with the k-nearest neighbor as the heart, classification and regression tree, support vector machine, among others). However, the two statistical models are still very popular (particularly for industrial use) because they are well-known models for bankruptcy prediction and are easy to model, interpret, and explain. Logit model is used more frequently because it is less demanding than MDA. However, logit model has a drawback. Most of the recent studies (e.g., [22,23,40,19]) have demonstrated that the predictive abilities of classical statistical models (e.g., logit model) are relatively weak.

⇑ Corresponding author at: School of Economics and Management, Zhejiang Normal University, P.O. Box 62, 688 YingBinDaDao, Jinhua, Zhejiang 321004, China. Tel.: +86 579 8229 8602.
E-mail address: lihuihit@gmail.com (H. Li).

0950-7051/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.knosys.2011.06.015

Predictive accuracy is one of the most important indicators of model effectiveness. If logit model is relatively weak in predictive accuracy, the loss saved by using logit in bankruptcy prediction in industry will be smaller. However, it is difficult to introduce new models for industrial users for the following reasons. (1) Intelligent models are complex approaches for the problem, compared with statistical models. (2) Industrial practitioners are already familiar with the classical approaches. One is not likely to replace a familiar tool as long as it continues to perform reasonably well. Thus, the shortcoming of logit model (relatively poor performance in terms of bankruptcy prediction) should be addressed to make it more suitable for industrial use. One may argue that the loss resulting from the use of logit model instead of intelligent models for bankruptcy prediction is not substantial because the difference in reductions in predictive error between statistical models and intelligent models is commonly 3%. However, a decision tool that can reduce predictive error by 3% could potentially save the industry approximately $1.2 billion annually [41]. Because logit model is widely used for bankruptcy prediction, we need to enhance its predictive ability for forecasting bankruptcy while preserving its advantages.

From the perspective of management science and ensemble learning, using various decision agents with diverse opinions is effective in improving the performance of predictive models [30,20]. The predictive ability of a committee of decision agents may exceed that of a single decision agent. Logit model can be regarded as a decision agent for bankruptcy prediction. The current problem can be described and represented by some variables as a mask from which a decision agent will be constructed. The random subspace method is an approach for producing various representations (i.e., masks) that can be used to generate different decision agents. This approach injects randomness into problem representation by randomly selecting variables with replacement. This means that different variable sets are used to construct logit model.

Thus, to improve the analysis performance of logit model, the present study combines random subspace approach with binary logit model to generate the simple random subspace binary logit (RSBL) model that takes into account different decision agents' opinions. The results of practical application verifying its ability to forecast corporate failure in China indicate that the proposed model can forecast corporate failure significantly better than classical statistical models (i.e., MDA, logit model, and probit model). The paper is organized as follows. Section 2 introduces binary logit model and random subspace approach. Section 3 proposes the RSBL model. Section 4 uses the proposed model to forecast corporate failure in China, and Section 5 concludes.

2. Binary logit model and random subspace approach

2.1. Binary logit model for bankruptcy prediction

Logit model is used to forecast the probability of an event by fitting data (represented by some variables) to the logistic curve. The term "logit," introduced by Berkson [4], is borrowed from probit model, a similar model introduced by Bliss [5]. In the field of bankruptcy prediction, the occurrence of an event refers to corporate failure. Further, the variables include financial variables calculated from a firm's public financial statements. For example, the probability of a firm declaring bankruptcy in the future can be determined by analyzing various variables for the firm's profitability (e.g., various financial ratios). Assume that the result of bankruptcy prediction is either corporate failure or non-failure. The number of $X_i$ is known, and the probability $pro_i$ is unknown. Here $X_i$ refers to the ith object (observation), and a total of m observations exist. For each observation, a set of variables can be used to inform the probability of an event (e.g., a bankruptcy). Assume that there are k variables, namely $x_1, x_2, \ldots, x_k$. The logit value of the unknown binomial probability is modeled as a linear function:

$$\operatorname{logit}(\mathrm{probability}_i) = \ln\left(\frac{\mathrm{probability}_i}{1-\mathrm{probability}_i}\right) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik}. \tag{1}$$

The unknown parameter $\beta_j$ can be estimated through maximum likelihood estimation of generalized linear models. The greater the value of $\beta_j$ is, the more the jth variable contributes to prediction. Further, $\beta_0$ refers to an intercept, and $\beta_j$ (j = 1, 2, ..., k), the regression coefficient of the jth variable. Finally,

$$\mathrm{probability}_i = \frac{1}{1+e^{-(\beta_0+\beta_1 x_{i1}+\beta_2 x_{i2}+\cdots+\beta_k x_{ik})}}. \tag{2}$$

Assume that

$$z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k. \tag{3}$$

The following function, which is dependent on z, is referred to as logistic regression:

$$f(z) = \frac{1}{1+e^{-z}}, \tag{4}$$

where $f(z) \in (0, 1)$ represents the probability of an event. Typically, the cutoff value is 0.5. Logistic regression is useful for describing the relationship between financial variables and the probability of corporate failure.

2.2. Random subspace approach

Random subspace approach refers to the construction of decision models through random selection of a number of variables from a given set of variables [13]. This approach reflects a type of ensemble method and is known to be able to improve predictive accuracy [6,7,8]. Each time a random subspace is generated, a decision agent will be produced by constructing a model on top of the representation of the current problem. Finally, all decision agents are integrated by a simple vote by the committee. This approach is described as follows. Consider the condition with m observations in a k-dimensional space:

$$\{(x_{1,j}, x_{2,j}, \ldots, x_{i,j}, \ldots, x_{k,j}) \mid x_{i,j}\}, \quad \text{where } i \in \{1, m\},\ j \in \{1, k\}. \tag{5}$$

i is the number of observations; j is the number of variables; and $x_{i,j}$ refers to the value of the ith observation for the jth variable. The random subspace represented by the randomly selected variables is expressed as follows:

$$\{(x_{1,j}, x_{2,j}, \ldots, x_{m,j}) \mid x_{i,j}\}, \quad \text{where } x_{i,j} = x_{i,j} \text{ for } i \in I, \text{ and } x_{i,j} = \text{Null for } i \notin I. \tag{6}$$

I is the $k'$-dimensional subset of {1, 2, ..., k}; and $k' \le k$. Here we assume that $k' = k$. Thus, the total number of variables in subspaces is the same as that of those variables in the initial space. Because some of the same variables are selected, the actual number of variables is less than k. The creation of random space is repeated P times. Each time, a random generator from 1 to $k'$ is used to select a variable to be used in the subspace, and the process is repeated $k'$ times. The source of randomness is based on re-sampling P times with replacement the blocks of m observations for each k variable. As a result, P random subspaces are created as different masks for the current problem. The same algorithm is implemented on top of these masks to generate diverse decision agents. According to Ho [13], this approach reflects a type of stochastic discrimination to increase

predictive accuracy by combining weak models that do not have full discriminative power for the same problem. Combining invariant models has a chance to increase their discriminative power.

The randomly selected variables represent different masks for the current problem, which means that different masks have different component variables reflecting different perspectives of the classification problem. Thus, the corresponding models based on different masks focus on different perspectives of the current problem. These models have different discriminant abilities, which are founded on their perspectives. This random subspace approach attempts to derive only the different perspectives from the initial representation of the current problem. These perspectives represent the interests of different committees. This is consistent with the operation of committees in the real world, that is, committees include representatives from different groups. Finally, the voting mechanism is used to produce a consensus inside the committee of voters or decision agents.

3. The random subspaces binary logit model

The RSBL model is constructed by combining logit model with random subspace approach. To generate a committee of logit decision agents, we take the random subspace approach to produce diverse masks of the current problem for logit model. Based on Eq. (6), we assume that the rth mask is expressed as follows:

$$\{(x^{r}_{1,j}, x^{r}_{2,j}, \ldots, x^{r}_{m,j}) \mid x^{r}_{i,j}\}, \quad \text{where } i \in \{1, m\},\ j \in \{1, k'\}. \tag{7}$$

According to this expression, the current problem is represented by a mask using $k'$ variables from the initial set of k variables. Some variables in the initial set of variables may emerge more than once in the rth mask, whereas others may not emerge at all. The $k'$ variables are selected as follows: First, a random number generator is used to produce a random number between (0, 1). Second, the random number is multiplied by the total number of variables, namely k, to be a feature indicator. The feature indicator is rounded up to the next integer to identify the selected feature. Third, this process is repeated k times, resulting in k features. These k features are used to represent the sample as a mask. The whole procedure is repeated P times. Finally, P masks are obtained for the sample representation. After the sets of subspace variables are selected, the logit decision agents are constructed on the m observations with P masks.

By constructing Eq. (4) on the basis of Eq. (7), we transfer logit model as follows:

$$f(z_r) = \frac{1}{1+e^{-z_r}}, \quad \text{where } z_r = \beta_{r0} + \beta_{r1} x_{r1} + \beta_{r2} x_{r2} + \cdots + \beta_{rk'} x_{rk'}. \tag{8}$$

If a variable emerges more than once in the rth mask, then the coefficient of the variable indicates the number of times it emerges. By this means, the effectiveness of randomness is used to influence the results. Further, $f(z_r)$ is in the range of (0, 1) because x ranges over $(-\infty, +\infty)$.

By using P masks for the current problem, we can obtain P different logit decision agents based on Eq. (8). The set of logit decision agents is expressed as follows:

$$\{(f(z_1), f(z_2), \ldots, f(z_P)) \mid f(z_r)\}, \quad \text{where } r \in \{1, P\}. \tag{9}$$

Here all $f(z_r)$ are in the range of (0, 1). The P logit decision agents constitute the model committee for the current problem (e.g., bankruptcy prediction). Because Eq. (7) is used to inject the randomly generated masks for the initial problem into the modeling process of logit model, committee members of decision agents in Eq. (9) model different perspectives of the current problem. Randomness makes different logit decision agents produce diverse decisions. As a result, their diversity is assured. Note that if all masks from Eq. (7) are the same, then all the logit decision agents are the same. As a result, combining the results for different logit agents' decisions produces no difference. Thus, using randomness to increase the diversity of committee members is critical. When logit agents have their decisions, the committee will reach a consensus through the following mechanism:

$$Z = \sum_{r=1}^{P} w_r \cdot f(z_r) = \sum_{r=1}^{P} \frac{w_r}{1+\exp(-z_r)} = \sum_{r=1}^{P} \frac{w_r}{1+\exp\bigl(-(\beta_{r0} + \beta_{r1} x_{r1} + \beta_{r2} x_{r2} + \cdots + \beta_{rk'} x_{rk'})\bigr)}, \tag{10}$$

where $i \in \{1, m\}$, $j \in \{1, k'\}$, $x_{ri} \in (-\infty, +\infty)$, $w_r \in [0, 1]$.

$w_r$ is the importance of the opinion of the rth logit decision agent; and $\sum w_r = 1$. Further, $Z \in (0, 1)$ indicates the score for whether an event is likely to occur. Z is an integration of P logit decision agents. Because randomness is injected into the proposed RSBL model, the weight of each logit decision agent can be the same, that is, 1/P.

By this use of the logit decision committee to produce the probability of an event (e.g., bankruptcy prediction), the analysis performance of logit model can be enhanced. The cutoff value is still 0.5. If $Z \ge 0.5$, then the consensus of the logit decision committee (i.e., the RSBL model) is that the event will happen with the probability of Z. Otherwise, the consensus is that the event will not occur.

According to Hastie et al. [12] and Lim [24], combining several weak decision agents can produce a powerful committee. We use a total of P logit decision agents. These agents are independent of one another in that randomness is injected into the construction of the proposed model. For a simple vote by the committee, we can assume that P is odd. Let $X_r$ express a random variable reflecting the prediction of the rth logit model. Suppose that the performances of P logit decision agents are expressed as the variable accuracy, which is derived based on Eq. (10). Thus,

$$X_r \sim \mathrm{Bernoulli}(\mathit{accuracy}). \tag{11}$$

The variable accuracy is determined by the proportion of correct decisions by the logit agent in all decisions. The number of accurate predictions by the RSBL model (i.e., the committee of logit decision agents) is defined as follows:

$$Y = \sum_{r=1}^{P} X_r \sim \mathrm{Binomial}(P, \mathit{accuracy}). \tag{12}$$

Because we assume that P is odd, we express P as P = 2K + 1, where K is a nonnegative integer. Let $Z_P = \mathrm{Probability}(Y \ge K + 1)$. Then the predictive accuracy of the RSBL model by voting is calculated as follows:

$$Z_P = \sum_{r=K+1}^{P} {}^{P}C_r \, \mathit{accuracy}^{\,r} (1 - \mathit{accuracy})^{P-r}, \tag{13}$$

where ${}^{P}C_r$ is the binomial coefficient, namely $P!/(r!\,(P-r)!)$. Based on Lam and Suen [21], we will obtain the following results:

$$Z_{2K+1} \begin{cases} < 0.5 & \text{when } \mathit{accuracy} < 0.5, \\ = 0.5 & \text{when } \mathit{accuracy} = 0.5, \\ > 0.5 & \text{when } \mathit{accuracy} > 0.5. \end{cases} \tag{14}$$

The RSBL model follows the following procedure:

Step 1: Randomly generate P masks from the original variables with replacement. These masks are used to represent the

Fig. 1. Mechanism of the RSBL model.

available observations from different perspectives. By constructing diverse masks for the current problem, we will understand and model the problem more effectively.
Step 2: Fit P logit decision agents by using all available masks for observations. These P logit decision agents represent P different perspectives on the initial data. This means that diverse masks holding knowledge of classification are transferred to a committee of logit decision agents.
Step 3: Make P predictions about the same target by using P different logit decision agents. New problems will be analyzed by using the committee of logit decision agents, each of which fulfils the analysis independently.
Step 4: Produce a consensus for the RSBL model (i.e., the committee of logit decision agents). A simple vote by the committee of logit decision agents is used to integrate the opinions of each logit model. This produces the results for the RSBL model.

Fig. 1 shows the mechanism of the RSBL model.

4. Empirical design and results

4.1. Bankruptcy prediction for Chinese firms

To verify the feasibility and effectiveness of the RSBL model, we examined its ability to predict corporate failure in China. We obtained the data from the Shanghai Stock Exchange and the Shenzhen Stock Exchange, which included a total of 135 pairs of firms representing those that declared bankruptcy and those that did not. There is no rigid mathematical definition of an outlier. Typically, approximately 1 in 370 deviates by three times the standard deviation for normally distributed data [35]. This study defines those firms with variable values deviating from the mean by more than three times the standard deviation as outliers. In addition, there is no rigid mathematical definition of corporate failure, and models of bankruptcy prediction identify only a certain part of corporate failure. Further, an effective model for identifying outlier failures may not be the same as that for identifying common failures. Thus, removing outliers from data is useful for making a model for identifying common failures more effective, that is, if outliers are included, then the accuracy of the model should decrease not only for common failures but also for outlier failures. In this regard, we excluded outliers and observations with missing values and obtained 153 observations for short-term bankruptcy prediction and 216 observations for medium-term bankruptcy prediction [37]. For short-term bankruptcy prediction, we used data one year prior to corporate failure, and for medium-term bankruptcy prediction, we used data two years prior to corporate failure.

We used a total of 30 financial variables as the initial ratios to represent the observations; none of these variables had missing values. To determine whether the RSBL model can significantly improve predictive accuracy, we employed two feature selection approaches (stepwise MDA and a two-tailed t-test at the 95% confidence level) to produce four more data sets. Thus, we obtained a total of six data sets. Finally, as shown in Appendix, we used the variables to initially represent the observations for the two types of data sets [22]. Table 1 summarizes the six data sets.

Table 1
The six datasets for bankruptcy prediction.

Dataset  Description
1        Data set for short-term prediction with all variables
2        Data set for short-term prediction with t-test variables
3        Data set for short-term prediction with MDA variables
4        Data set for medium-term prediction with all variables
5        Data set for medium-term prediction with t-test variables
6        Data set for medium-term prediction with MDA variables

Using the initial data sets, we compared the predictive performance of the RSBL model with that of the classical statistical models (i.e., MDA, logit model, and probit model). We set the total number of logit decision agents as 11 because Breiman [6] noted that most of the improvements from bagging will be achieved within 10 replications. However, there will be no consensus on the binary classification without additional processes if each group has five votes. To address this problem, we used an odd number (11). We then tested the following null hypotheses:

• H1: Predictive performance of the RSBL model is significantly better than that of MDA.
• H2: Predictive performance of the RSBL model is significantly better than that of logit model.
• H3: Predictive performance of the RSBL model is significantly better than that of probit model.

To test these hypotheses, we divided all the observations into two groups. We used one group (approximately 70% of observations) as the training and calibration data set for the modeling and the other group (approximately 30% of observations) as the unseen data set for testing the predictive performance of the models. We repeated this split 200 times to make the results statistically meaningful. Thus, we employed the multiple holdout method, which reduces the bias in performance estimation of models much better than the one-time holdout method. A one-tailed paired-samples t test was employed to verify the hypotheses.

4.2. Results

The performance measure, i.e., accuracy, is the proportion of right decisions of a logit agent on testing samples in total decisions. The values for statistical analysis are the accuracies of models on the 30% testing dataset. Table 2 shows the results for predictive performance of the RSBL model and the three classical statistical models on the six data sets (the six statistical indices refer to the mean, the median, the minimum, the maximum, the standard deviation, and the range). The best-time index refers to the number of times the model provided the optimal value. The values in rectangles indicate the best performance result. Table 3 shows the results of the t-test for the three hypotheses.

Table 2
Results of bankruptcy prediction (six data sets).

We compared the performance of the RSBL model with that of the other models. As shown in Table 4, the RSBL model predicted corporate failure much better than the classical models. The second column indicates the error rate, and the third column refers to the error rate for the RSBL model. The fourth, fifth, and sixth columns refer to the differences in the error rate between MDA and the RSBL model; logit model and the RSBL model; and probit model and the RSBL model, respectively. A positive value means that the classical model made more errors in bankruptcy prediction than the RSBL model, whereas a negative value means that the classical model made fewer errors. The seventh, eighth, and ninth columns indicate the superiority of the RSBL model over MDA, Logit, and Probit, respectively, in percentage terms. The underlined values indicate the cases in which MDA, logit model, and probit model performed better than the RSBL model.
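For readers who want to experiment with this comparison, the four-step procedure of Section 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`fit_logit`, `draw_mask`, `fit_rsbl`, `predict_rsbl`), the plain gradient-ascent fitting routine with its learning rate and epoch count, and the choice to keep a repeated variable as a repeated column are all assumptions made here. The committee uses the simple majority vote of Step 4 with P = 11 agents.

```python
import math
import random


def sigmoid(z):
    """Eq. (4): f(z) = 1 / (1 + e^(-z)), arranged to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, y, lr=0.1, epochs=500):
    """Fit one logit agent by gradient ascent on the log-likelihood.

    Assumption: a simple stand-in for the maximum likelihood estimation
    mentioned in Section 2.1. beta[0] is the intercept.
    """
    k = len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            err = target - sigmoid(z)
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def draw_mask(k, rng):
    """Step 1: k column indices drawn with replacement.

    Each index is ceil(u * k) for u in (0, 1), shifted to 0-based.
    """
    return [max(1, math.ceil(rng.random() * k)) - 1 for _ in range(k)]


def fit_rsbl(X, y, P=11, seed=0):
    """Steps 1 and 2: draw P masks and fit one logit agent per mask."""
    rng = random.Random(seed)
    k = len(X[0])
    agents = []
    for _ in range(P):
        mask = draw_mask(k, rng)
        masked = [[row[j] for j in mask] for row in X]
        agents.append((mask, fit_logit(masked, y)))
    return agents


def predict_rsbl(agents, row, cutoff=0.5):
    """Steps 3 and 4: each agent votes; the majority decides (1 = failure)."""
    votes = 0
    for mask, beta in agents:
        z = beta[0] + sum(b * row[j] for b, j in zip(beta[1:], mask))
        votes += int(sigmoid(z) >= cutoff)
    return int(votes > len(agents) / 2)
```

Calling `fit_rsbl(X, y)` on a list of feature rows and 0/1 labels returns the 11 (mask, coefficients) pairs, and `predict_rsbl(agents, row)` returns 1 when more than half of the agents score the row at or above the 0.5 cutoff. An odd P avoids tied votes, which is the reason the study uses 11 rather than 10 agents.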

Table 3
Significance test (one-tailed test).

*** Significant at the 1% level.
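The one-tailed paired-samples t test behind Table 3 compares, split by split, the accuracies of two models over the 200 holdout repetitions. A minimal sketch follows. The function names are hypothetical, and the p-value uses a normal approximation to the t distribution, an assumption made here to stay within the Python standard library. That approximation is reasonable for 199 degrees of freedom but too optimistic for very small samples.

```python
import math
import statistics


def paired_t_statistic(acc_a, acc_b):
    """t statistic and degrees of freedom for paired accuracies.

    acc_a and acc_b hold the accuracies of two models on the same splits.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def one_tailed_p_normal_approx(t_stat):
    """Upper-tail p-value, approximating the t distribution by a normal."""
    return 1.0 - statistics.NormalDist().cdf(t_stat)


# Toy accuracies (in %) on four splits; the study uses 200 splits.
rsbl = [85, 80, 90, 88]
logit = [80, 78, 85, 80]
t_stat, df = paired_t_statistic(rsbl, logit)
p_value = one_tailed_p_normal_approx(t_stat)
```

With the 200 splits of the study, the hypothesis that one model is better would be supported at the 1% level when the resulting p-value falls below 0.01.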

Table 4
Comparative results using random subspaces binary logit model as a benchmark.

Data set                 RSBL    MDA-RSBL  Logit-RSBL  Probit-RSBL  (MDA-RSBL)/RSBL (%)  (Logit-RSBL)/RSBL (%)  (Probit-RSBL)/RSBL (%)
1        Mean error      15.81   1.37      5.49        5.52         8.67                 34.72                  34.91
         Median error    15.56   0.00      5.55        6.66         0.00                 35.67                  42.80
         SD              4.89    0.46      1.22        1.28         9.41                 24.95                  26.18
         Max. error      28.89   4.44      8.89        8.89         15.37                30.77                  30.77
         Min. error      4.44    0.00      2.23        4.45         0.00                 50.23                  100.23
         Range           24.44   4.45      6.67        4.45         18.21                27.29                  18.21
2        Mean error      14.83   0.89      5.23        5.17         6.00                 35.27                  34.86
         Median error    15.56   0.00      4.44        4.44         0.00                 28.53                  28.53
         SD              4.31    0.22      1.75        1.82         5.10                 40.60                  42.23
         Max. error      24.44   2.23      15.56       11.12        9.12                 63.67                  45.50
         Min. error      2.22    0.00      2.22        2.22         0.00                 100.00                 100.00
         Range           22.22   2.22      13.34       8.89         9.99                 60.04                  40.01
3        Mean error      11.54   0.20      0.96        1.23         1.73                 8.32                   10.66
         Median error    11.11   0         1.11        2.22         0                    9.99                   19.98
         SD              4       0.07      0.3         0.29         1.75                 7.50                   7.25
         Max. error      22.22   2.22      2.22        2.22         9.99                 9.99                   9.99
         Min. error      4.44    4.44      4.44        4.44         100.00               100.00                 100.00
         Range           17.78   6.66      6.66        6.66         37.46                37.46                  37.46
4        Mean error      19.37   1.57      4.58        4.78         8.11                 23.64                  24.68
         Median error    18.75   1.56      4.69        6.25         8.32                 25.01                  33.33
         SD              4.27    0.20      0.92        0.71         4.68                 21.55                  16.63
         Max. error      31.25   1.56      7.81        3.12         4.99                 24.99                  9.98
         Min. error      6.25    3.12      1.56        3.12         49.92                24.96                  49.92
         Range           25      1.56      6.25        0            6.24                 25.00                  0.00
5        Mean error      18.14   2.09      3.41        3.64         11.52                18.80                  20.07
         Median error    18.75   1.56      3.12        3.12         8.32                 16.64                  16.64
         SD              3.98    0.43      0.96        1.07         10.80                24.12                  26.88
         Max. error      28.12   4.69      7.82        7.82         16.68                27.81                  27.81
         Min. error      9.37    1.56      0.00        1.56         16.65                0.00                   16.65
         Range           18.75   6.25      7.81        9.38         33.33                41.65                  50.03
6        Mean error      16.50   0.20      0.62        1.01         1.21                 3.76                   6.12
         Median error    16.41   0.79      0.78        0.78         4.81                 4.75                   4.75
         SD              4.35    0.29      0.15        0.1          6.67                 3.45                   2.30
         Max. error      29.69   0         0           0            0                    0.00                   0.00
         Min. error      6.25    0         0           1.56         0                    0.00                   24.96
         Range           23.44   0         0           1.56         0                    0.00                   6.66
Average                  -       -         -           -            4.18                 21.98                  22.79
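The percentage columns of Table 4 appear to be the error-rate difference divided by the RSBL error rate. This reading is inferred from the tabulated values rather than given as a formula in the text, and the function name below is hypothetical. A small sketch that reproduces the mean-error row of data set 1:

```python
def relative_superiority(rsbl_error, classical_minus_rsbl):
    """Error-rate difference as a percentage of the RSBL error rate."""
    return 100.0 * classical_minus_rsbl / rsbl_error


# Mean-error row of data set 1: RSBL 15.81; differences 1.37, 5.49, 5.52
row = [round(relative_superiority(15.81, d), 2) for d in (1.37, 5.49, 5.52)]
print(row)  # [8.67, 34.72, 34.91]
```

Under this reading, a value of 34.72 means that logit model made about 35% more errors than the RSBL model, relative to the RSBL error rate.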

4.3. Discussion

4.3.1. Absolute results

This study proposes the RSBL model to improve the predictive ability of logit model for corporate failure because logit model is preferred by industrial users. Table 2 shows the resampling results. The RSBL model performed much better than all the classical statistical models (e.g., MDA, logit model, and probit model). The RSBL model produced the best mean accuracy for the first five data sets (84.19%, 85.17%, 88.46%, 80.63%, and 81.86%). For the sixth data set, the RSBL model was outperformed by MDA, but there was little difference in their performance (83.50% vs. 83.70%, respectively). Thus, the results for mean accuracy suggest that the RSBL model is more suitable for bankruptcy prediction than MDA, logit model, and probit model. In terms of median accuracy for short-term bankruptcy prediction, both the RSBL model and MDA produced the best ratio for the first three data sets (84.44%, 84.44%, and 88.89%, respectively). For medium-term bankruptcy prediction, the RSBL model showed the best ratio for the fourth and fifth data sets (81.25%). These results indicate that the RSBL model performed at least as well as the other models. In terms of minimum accuracy (i.e., the maximum error rate), the RSBL model produced the best ratio for the six data sets (71.11%, 75.56%, 77.78%, 68.75%, 71.88%, and 70.31%, respectively). Thus, the RSBL model outperformed MDA, logit model, and probit model in terms of maximum accuracy (i.e., the minimum error rate). The accuracy results for the mean, the median, the minimum, and the maximum suggest that the RSBL model is more accurate than MDA, logit model, and probit model for bankruptcy prediction.

We considered the standard deviation and the range to examine the stability of the models. In terms of the standard deviation, the RSBL model generated the best ratio on the datasets represented by all variables and t-test variables: 4.89 and 4.31 for the first and second data sets, respectively, and 4.37 and 3.98 for the fourth and fifth data sets, respectively. However, MDA produced the best standard deviation for the third (3.91) and sixth (4.06) data sets. These results suggest that the RSBL model performs better than the other models. In terms of the range, the RSBL model showed the best ratio for the first (24.44), second (22.22), third (17.78), fifth (18.75), and sixth (23.44) data sets. Thus, these results indicate that the RSBL model is superior to and thus more suitable than MDA, logit model, and probit model for bankruptcy prediction.

The best-time index refers to the number of times the model provided the best ratio. As shown in Table 2, the RSBL model provided the best ratio for the six data sets 6, 6, 4, 5, 5, and 3 times, respectively. This indicates that the RSBL model is more accurate than MDA, logit model, and probit model on five occasions and less accurate than MDA on one occasion. Thus, these absolute results indicate that the RSBL model can dramatically improve the predictive ability of statistical models, including logit model.
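The six statistical indices and the best-time index discussed above can be computed directly from the per-split accuracies. The sketch below is illustrative only. The function names are hypothetical, and three choices are assumptions made here rather than statements from the paper: ties count as "best" (consistent with the shared median results reported for RSBL and MDA), a smaller standard deviation or range is treated as better, and the sample standard deviation is used.

```python
import statistics

LOWER_IS_BETTER = {"sd", "range"}  # assumption: smaller spread is better


def six_indices(accuracies):
    """Mean, median, minimum, maximum, standard deviation, and range."""
    return {
        "mean": statistics.mean(accuracies),
        "median": statistics.median(accuracies),
        "min": min(accuracies),
        "max": max(accuracies),
        "sd": statistics.stdev(accuracies),
        "range": max(accuracies) - min(accuracies),
    }


def best_time_index(results, model):
    """Count the indices on which `model` is best (ties included).

    results maps a model name to its list of accuracies on the same splits.
    """
    indices = {name: six_indices(acc) for name, acc in results.items()}
    count = 0
    for key, value in indices[model].items():
        candidates = [idx[key] for idx in indices.values()]
        best = min(candidates) if key in LOWER_IS_BETTER else max(candidates)
        if value == best:
            count += 1
    return count


# Toy accuracies (in %) on three splits for two models.
results = {"RSBL": [80, 90, 85], "Logit": [70, 90, 75]}
rsbl_best = best_time_index(results, "RSBL")
logit_best = best_time_index(results, "Logit")
```

In this toy input the two models tie on maximum accuracy, so that index is counted for both of them.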

As shown in Table 2, the RSBL model performed slightly better than the other models for the initial data sets and performed slightly better or even worse when the feature selection method was used. The feature selection method improved the predictive performance of classical models, and thus, all the models performed better with this method. More specifically, when this method was used, the RSBL model increased the accuracy of the classical models just slightly or even had a negative effect. This indicates that group decisions by logit agents are more useful when a single model does not provide excellent performance. If a single model produces excellent performance, then the use of group decisions by logit agents may reduce the accuracy of the model.

4.3.2. Test of significance

We tested the three hypotheses, which addressed whether there is any difference in predictive performance between the RSBL model and MDA, logit model, and probit model. As shown in Table 3, H1 was accepted at the 1% level of significance for the first, second, fourth, and fifth data sets. H2 and H3 were also accepted at the 1% level of significance for the first, second, fourth, and fifth data sets. Thus, H1, H2, and H3 were accepted four times. For the third data set, the test did not accept H1 but accepted H2 and H3 at the 1% level of significance. For the sixth data set, the test did not accept H1 but accepted H2 and H3 at the 1% level of significance. Thus, H2 and H3 were accepted two more times. In sum, H1 was accepted four times, whereas H2 and H3 were accepted six times. These results indicate that the RSBL model has significant positive effects on predictive performance of the three classical statistical models for bankruptcy prediction, verifying the effectiveness of the proposed RSBL model.

4.3.3. Results of performance improvement

Although we have provided some evidence that RSBL significantly improves on the predictive ability of MDA, logit, and probit, one may wonder how much worse the three classical statistical models are compared with RSBL. As shown in Table 4, almost all of the percentage-point indicators of error improvement are positive. This means that MDA, logit, and probit were inferior to RSBL because they produced higher error rates. MDA, logit, and probit were inferior to RSBL by 4.18%, 21.98%, and 22.79%, respectively, from the perspective of average performance of the six statistical indices. Note that MDA, logit, and probit outperformed RSBL by 100% in terms of minimum error since these three models produced 100% accuracies in maximum accuracy while RSBL produced 95.56% in this index. Without this outlier, MDA, logit, and probit were worse than RSBL by 7.16%, 25.47%, and 26.30%, respectively, in terms of predictive performance. RSBL was significantly better than logit and probit in terms of predictive ability. RSBL was also significantly better than MDA in predictive ability since either 4.18% or 7.16% is a significant improvement on predictive performance. These results provide clear evidence that the proposed model can dramatically improve the predictive ability of logit model.

5. Conclusion and limitations

We combined logit model and random subspace approach to propose the RSBL model and used it to forecast corporate failure in China. The results indicate that the proposed model performs significantly better than the classical models (i.e., MDA, logit model, and probit model) in predicting corporate failure. Predictive abilities of MDA, logit, and probit are worse than that of RSBL by 4.18%, 21.98%, and 22.79%, respectively (or 7.16%, 25.47%, and 26.30% after deleting outliers). Thus, these results suggest that the RSBL model helps industrial users to minimize loss from corporate bankruptcy in uncertain environments and demonstrate the feasibility and effectiveness of the RSBL for bankruptcy prediction. The RSBL model is not complex: It injects randomness and committee decisions into logit model. Industrial users typically construct logit models for bankruptcy prediction by using all available data. In this regard, the RSBL model extends logit model slightly by randomly subspacing the available data and then inducing a vote inside the committee of logit models. The results indicate that this treatment allows logit model to retain its advantages.

This study has some limitations. We attempted to integrate bagging with logit model for bankruptcy prediction, but this type of integration did not increase predictive performance for corporate failure in China. Thus, future research should focus on making this combination work. The random subspace approach cannot be directly integrated with MDA because the replaced variables disobey the assumptions of MDA. Thus, future research should examine how MDA could be integrated with random subspace approach for bankruptcy prediction. Further, data from other countries (e.g., the US and UK) should be used for a better understanding of RSBL. Although this study focuses on binary logit model, there are several types of logit models, including multinomial logit, mixed logit, nested logit, exploded logit, and ordered logit models. Thus, future research should extend the RSBL model to other types of logit models.

The RSBL model is limited by its computational burden, which exceeds that of a single logit model. However, this computation can be implemented with little difficulty if some computer-aided tools are available. Logit model can be used for real-world risk management through manual computation or computer-aided computation. When a single logit model is used with manual computation, the user simply develops a logit model from available data and uses the model to predict the risk associated with the current problem. However, users typically employ some computer-aided tools (e.g., SPSS, SAS, and Matlab) to implement the single logit model. In this regard, the RSBL model should not be used manually because it is not easy to generate random numbers manually. With computer-aided tools, the RSBL model does not require 11 times the work that a single logit model requires just because there are 11 logit decision agents. First, the whole data set should be re-featured with replacement into 11 data sets. Further, those computer-aided tools that can generate a random number in (0, 1) should be used. By integrating the random number with the total number of features, a computer-aided tool can retrieve a randomly selected feature. By repeating the selection process as many times as there are features, we can obtain a randomly generated mask, i.e., a set of features for the data representation, and obtain the corresponding data set. By repeating this process 11 times with a computer-aided tool, we can obtain 11 data sets and produce 11 logit decision agents for the data sets, which are used to generate 11 decision outputs. We can induce a consensus for the logit decision group through a vote, and for this, computer-aided tools are ideal. Although the computational burden associated with the RSBL model is likely to be similar to that associated with a single logit model, the proposed model does require more computing resources than statistical models. In this regard, future research should focus on developing a software package.

Acknowledgements

This study was supported in part by the National Natural Science Foundation of China (No. 70801055) and the Zhejiang Provincial Natural Science Foundation of China (No. Y7100008). The authors are grateful to Prof. Hamido Fujita (the Editor in Chief) and the three anonymous referees for their helpful comments and recommendations.
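The computer-aided procedure described in Section 5 (draw a random number in (0, 1), scale it by the number of features to pick a feature, repeat once per feature to form a mask, build 11 logit decision agents, and take a vote) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `random_mask`, `fit_logit`, `predict_logit`, `fit_rsbl`, and `predict_rsbl` are hypothetical, duplicate draws are assumed to collapse into one feature, the logit agents are fitted with plain gradient ascent as a stand-in for a statistical package, and the toy data are invented.

```python
import math
import random


def random_mask(n_features, rng):
    """Draw n_features indices with replacement; keep the distinct ones."""
    drawn = [int(rng.random() * n_features) for _ in range(n_features)]
    return sorted(set(drawn))  # assumption: duplicates collapse into one feature


def fit_logit(X, y, lr=0.1, epochs=200):
    """Binary logit fitted by gradient ascent (stand-in for SPSS/SAS/Matlab)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def predict_logit(w, x):
    """Predict class 1 when the fitted probability is at least 0.5."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if z >= 0 else 0


def fit_rsbl(X, y, n_agents=11, seed=0):
    """Build one logit agent per random feature mask."""
    rng = random.Random(seed)
    agents = []
    for _ in range(n_agents):
        mask = random_mask(len(X[0]), rng)
        X_sub = [[row[j] for j in mask] for row in X]
        agents.append((mask, fit_logit(X_sub, y)))
    return agents


def predict_rsbl(agents, x):
    """Majority vote over the committee of logit agents."""
    votes = sum(predict_logit(w, [x[j] for j in mask]) for mask, w in agents)
    return 1 if votes * 2 > len(agents) else 0


# Invented toy data with three features; label 1 marks one class, 0 the other.
X = [[1, 1, 1], [2, 1, 2], [1, 2, 1], [-1, -1, -1], [-2, -1, -2], [-1, -2, -1]]
y = [1, 1, 1, 0, 0, 0]

agents = fit_rsbl(X, y)
print(predict_rsbl(agents, [1.5, 1.5, 1.5]), predict_rsbl(agents, [-1.5, -1.5, -1.5]))
```

An odd committee size such as 11 avoids tied votes in the binary case. In practice the gradient-ascent fit would likely be replaced by the logit routine of whichever statistical package is already in use.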
Appendix A. Variables used for bankruptcy prediction of Chinese companies

No. Names of 30 variables Short-term prediction Medium-term prediction


MDA variables t-test variables MDA variables t-test variables
1 Gross income/sales O O
2 Net income/sales O O
3 EBIT/total assets O O
4 Net profit/total assets O O O
5 Net profit/current assets O O O
6 Net profit/fixed assets O
7 Profit margin O O
8 Net profit/equity O O O
9 Account receivable turnover
10 Inventory turnover
11 Account payable turnover O
12 Total assets turnover O O O
13 Current assets turnover O O O
14 Fixed assets turnover O O
15 Current ratio O O
16 Cash/current liability O O
17 Asset-liability ratios O O O O
18 Equity/debt ratio O O
19 Liability/tangible net asset O O
20 Liability/equity market value O O
21 Interest coverage ratio O
22 Growth rate of primary business O
23 Growth rate of total assets O O O
24 Current assets/total assets O
25 Fixed assets/total assets O
26 Equity/fixed assets
27 Current liability/total liability O O
28 Earning per share O O O
29 Net assets per share O O
30 Cash flow per share O O

References [17] K. Keasey, R. Watson, Non-financial symptoms and the prediction of small
company failure: Atest of Argenti’s hypotheses, Journal of Business Finance
and Accounting 14 (3) (1987) 335–354.
[1] E.I. Altman, Financial ratios discriminant analysis and the prediction of
[18] K. Keasey, P. McGuiness, H. Short, Multilogit approach to predicting corporate
corporate bankruptcy, The Journal of Finance 23 (4) (1968) 589–609.
failure – further analysis and the issue of signal consistency, Omega 18 (1)
[2] S. Balcaen, H. Ooghe, 35 years of studies on business failure: an overview of the
(1990) 85–94.
classical statistical methodologies and their related problems, The British
[19] P.R. Kumar, V. Ravi, Bankruptcy prediction in banks and firms via statistical
Accounting Review 38 (2006) 63–93.
and intelligent techniques – a review, European Journal of Operational
[3] W. Beaver, Financial ratios predictors of failure, Journal of Accounting Research
Research 180 (2007) 1–28.
4 (1966) 71–111.
[20] L.I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms, Wiley,
[4] J. Berkson, Application of the logistic function to bio-assay, Journal of the
NJ, 2004.
American Statistical Association 39 (1944) 357–365.
[21] L. Lam, C.Y. Suen, Application of majority voting to pattern recognition: an
[5] C.I. Bliss, The method of probits, Science 79 (1934) 409–410.
analysis of its behavior and performance, IEEE Transaction on Systems, Man,
[6] L. Breiman, Bagging predictors, Machine Learning 24 (1996) 123–140.
and Cybernetics 27 (1997) 553–568.
[7] L. Breiman, Arcing classifiers, Annals of Statistics 26 (1998) 801–849.
[22] H. Li, J. Sun, Ranking-order case-based reasoning for financial distress
[8] L. Breiman, Random forecast, Machine Learning 45 (2001) 5–32.
prediction, Knowledge-Based Systems 21 (8) (2008) 868–878.
[9] V. Cho, MISMIS – a comprehensive decision support system for stock market
[23] H. Li, J. Sun, Forecasting business failure in China using case-based reasoning
investment, Knowledge-Based Systems 23 (6) (2010) 626–633.
with hybrid case representation, Journal of Forecasting 29 (5) (2010) 486–501.
[10] P.J. Fitzpatrick, A comparison of ratios of successful industrial enterprises with
[24] N. Lim, Classification by ensembles from random partitions using logistic
those of failed firms, Certified Public Accountant, New York, 1932.
regression models, PhD Thesis, Department of Applied Mathematics and
[11] W. Hardle, Y. Lee, D. Schafer, et al., Variable selection and oversampling in the
Statistics, Stony Brook University, 2007.
use of smooth support vector machines for predicting the default risk of
[25] D. Martin, Early warning of bank failure: a logit regression approach, Journal of
companies, Journal of Forecasting 28 (6) (2009) 512–534.
Banking and Finance 1 (1977) 249–276.
[12] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data
[26] T.E. McKee, Rough sets bankruptcy prediction models versus auditor signaling
Mining Inference and Prediction, Springer, New York, NY, 2001.
rates, Journal of Forecasting 22 (8) (2003) 569–586.
[13] T.K. Ho, The random space method for constructing decision forests, IEEE
[27] T.E. McKee, M. Greenstein, Predicting bankruptcy using recursive partitioning
Transactions on Pattern Analysis and Machine Intelligence 20 (8) (1998) 832–
and a realistically proportioned data set, Journal of Forecasting 19 (3) (2000)
844.
219–230.
[14] Y. Hu, J. Ansell, Retail default prediction by using sequential minimal
[28] C. Nam, T. Kim, N. Park, et al., Bankruptcy prediction using a discrete-time
optimization technique, Journal of Forecasting 28 (8) (2009) 651–666.
duration model incorporating temporal and macroeconomic dependencies,
[15] R. Hwang, K. Cheng, J. Lee, A semiparametric method for predicting
Journal of Forecasting 27 (6) (2008) 493–506.
bankruptcy, Journal of Forecasting 26 (5) (2010) 317–342.
[29] J. Ohlson, Financial ratios and the probabilistic prediction of bankruptcy,
[16] S. Jones, D.A. Hensher, Predicting firm financial distress: a mixed logit model,
Journal of Accounting Research 18 (1) (1980) 109–131.
Accounting Review 79 (4) (2004) 1011–1038.
1388 H. Li et al. / Knowledge-Based Systems 24 (2011) 1380–1388

[30] D.E. O’Leary, Knowledge acquisition from multiple experts: an empirical study, [36] J. Sun, H. Li, Data mining method for listed companies’ financial distress
Management Science 44 (1998) 1049–1058. prediction, Knowledge-Based Systems 21 (1) (2008) 1–5.
[31] J. Pacheco, S. Casado, L. Nunez, A variable selection method based on tabu [37] J. Sun, H. Li, Financial distress prediction based on serial combination of
search for logistic regression models, European Journal of Operational multiple classifiers, Expert Systems with Applications 36 (4) (2009) 8659–
Research 199 (2) (2009) 506–511. 8666.
[32] J. Pacheco, S. Casado, L. Núñez, O. Gómez, Analysis of new variable selection [38] C.F. Tsai, Feature selection in bankruptcy prediction, Knowledge-Based
methods for discriminant analysis, Computational Statistics and Data Analysis Systems 22 (2) (2009) 120–127.
51 (3) (2006) 1463–1478. [39] F.M. Tseng, L. Lin, A quadratic interval logit model for forecasting bankruptcy,
[33] P.F. Pai, M.F. Hsu, M.C. Wang, A support vector machine-based model for Omega 33 (2005) 85–91.
detecting top management fraud, Knowledge-Based Systems 24 (2) (2011) [40] F.M. Tseng, Y.C. Hu, Comparing four bankruptcy prediction models: logit,
314–321. quadratic interval logit, neural and fuzzy neural networks, Expert Systems
[34] P. Ravisankar, V. Ravi, Financial distress prediction in banks using group with Applications 37 (3) (2010) 1846–1853.
method of data handling neural network, counter propagation neural network [41] D. West, S. Dellana, J. Qian, Neural network ensemble strategies for financial
and fuzzy ARTMAP, Knowledge-Based Systems 23 (8) (2010) 823–831. decision applications, Computers and Operations Research 32 (2005)
[35] D. Ruan, G. Chen, E.E. Kerre, G. Wets (Eds.), Intelligent Data Mining: 2543–2559.
Techniques and Applications. Studies in Computational Intelligence, vol. 5,
Springer, 2005.
