
The current issue and full text archive of this journal is available on Emerald Insight at:

https://www.emerald.com/insight/0955-534X.htm

Corporate failure prediction models in the twenty-first century: a review
David Veganzones
INSEEC U Research Center, ESCE International Business School, Paris, France, and
Eric Severin
RimeLab, EA7396, Université de Lille, Lille, France

Received 9 December 2018
Revised 24 February 2019
3 June 2019
Accepted 7 June 2019

Abstract
Purpose – Corporate failure remains a critical financial concern, with implications for both firms and
financial institutions; this paper aims to review the literature that proposes corporate failure prediction
models for the twenty-first century.
Design/methodology/approach – This paper gathers information from 106 published articles that contain
corporate failure prediction models. The focus of the analysis is on the elements needed to design corporate failure
prediction models (definition of failure, sample approach, prediction methods, variables and evaluation metrics
and performance). The in-depth review creates a synthesis of current trends, from the view of those elements.
Findings – Both consensus and divergences emerge regarding the design of corporate failure prediction
models. On the one hand, authors agree about the use of bankruptcy as a definition of failure and that at least
two evaluation metrics are needed to examine model performance for each class, individually and in general.
On the other hand, they disagree about data collection procedures. Although several explanatory variables
have been considered, all of them serve as complements for the primarily used financial information. Finally,
the selection of prediction methods depends entirely on the research objective. These discrepancies suggest
fundamental advances in discovery and establish valuable ideas for further research.
Originality/value – This paper reveals some caveats and provides extensive, comprehensible guidelines
for corporate failure prediction, which researchers can leverage as they continue to investigate this critical
financial subject. It also suggests fruitful directions to develop further experiments.
Keywords Finance, Literature review, Financial distress, Corporate failure
Paper type Literature review

1. Introduction
Since the beginning of the twenty-first century, corporate failure has completely changed,
resulting in an increasing number of firms suffering from financial difficulties and/or
having to cease their activities altogether. The modern financial crisis revealed the
vulnerability of financial systems worldwide and created tremendous instability, virtually
paralyzing some financial markets. The ability to predict corporate failure thus remains a
crucial need, especially for ensuring the relationship between financial institutions and
firms. Studies of corporate failure in turn have increased exponentially, emerging as a major
research stream in the extended domain of corporate finance.
In particular, this crisis has posed new challenges to researchers investigating corporate
failure prediction. That is why researchers have rushed to develop adequate prediction models
corresponding to the new conditions. In this regard, the new standpoint relies on paying more
attention to artificial intelligence prediction methods as well as to variable selection and data
characteristics. This research has yielded fruitful results in terms of developing more capable
corporate failure prediction models. Nonetheless, the corporate failure field has lacked insights into
how those prediction models have been designed, in spite of the importance of this design. Indeed, the
experimental design should be taken into account because it may significantly affect the
results.

The authors are very grateful to the two anonymous reviewers for their substantial contribution to
the improvement of this paper.
European Business Review, © Emerald Publishing Limited, 0955-534X, DOI 10.1108/EBR-12-2018-0209

Consequently, a thorough review of the literature on corporate failure models is needed. To
date, few studies undertake this effort. Balcaen and Ooghe (2006) review classic statistical
methodologies (univariate analysis, discriminant analysis [DA] and conditional probabilities
models) and related issues, providing clear documentation about a classical corporate failure
paradigm (their review covers articles published between 1966 and 2001). Kumar and Ravi
(2007) analyze statistical and artificial intelligence prediction methods published between
1968 and 2005, and Verikas et al. (2010) present a comprehensive review of novel techniques
applied to forecast corporate failure, including hybrid and ensemble techniques. These reviews have
paved the way to a better understanding of ways to forecast corporate failure, yet they
suffer from two fundamental limitations. First, they review studies from before the emergence of
a new perspective on corporate failure prediction, which emerged after the catastrophic
consequences of the dot-com bust in 2000 and the global financial crisis in 2008. Thus, they
cannot address current elements that have led to the design of new models to forecast
corporate failure in accordance with new accounting and regulatory requirement scenarios
(e.g. Basel II and III). Second, these prior reviews mostly focus on evaluating prediction
methods. The evolution of computing power has initiated a new era when it comes to
developing accurate prediction methods, and its relevance for designing effective corporate
failure models is irrefutable, yet using a univariate approach based on this tool limits
investigations of corporate failure from a comprehensive view. Zhou (2013) describes
various elements that should inform the design of corporate failure prediction models
(definition of failure, sample approach, prediction methods, variables and evaluation
metrics, and performance), which these extant studies do not address.
In response, this study presents a review of high-quality research since the beginning of
the twenty-first century by analyzing 106 studies devoted to corporate failure prediction, and it
pursues three contributions. First, an overview of all the elements related to designing
corporate failure prediction models and their issues fills a prominent literature gap, because
no previous reviews provide such a perspective. With this overview, this article provides
thorough methodological guidance for academics and practitioners who hope to address
this challenging corporate issue. Second, in recent years, many researchers and practitioners
have proposed novel models to understand the causes of corporate failure and, eventually,
predict whether a firm will fail. Therefore, it is timely and relevant to investigate
fundamental advances of discovery devoted to corporate failure in this century, which in turn
presents clear insights into more recent corporate failure prediction models. Third, this study
highlights elements and limits of corporate failure models, so it reveals new ideas for
developing future research that can clarify this critical corporate concern even better. In sum,
the current review is different from the abovementioned reviews for the following reasons:
• None of the existing reviews provides deep insight into the design process of
corporate failure prediction models. This paper fills this gap by providing an
organized and thorough review of key elements (definition of failure, sample
approach, prediction methods, variables and evaluation metrics, and performance)
to ensure a complete understanding of corporate failure prediction models.
• It surveys the advances in corporate failure prediction in the twenty-first century;
these recent discoveries are relevant to keep in mind when defining an
experimental methodology.
• It discusses valuable topics in current real-world corporate failure prediction
practices that need to be further investigated.
• It provides an essential guideline so that researchers can understand and be
introduced to the corporate failure domain.

Section 2 therefore starts with a literature review methodology, followed by Sections 3, 4 and
5, which review the definition of failure, the data sample selection and prediction methods,
respectively. Section 6 lists the variables used in previous corporate failure models, and then
Section 7 contains the evaluation metrics and performance measures. Finally, Section 8
concludes and outlines some proposed research opportunities.

2. Literature review methodology


To review the literature, we first ran a structured search to collect an initial list of
publications on corporate failure prediction models in the twenty-first century.
In particular, we searched for related papers in the ISI Web of
Science and ScienceDirect bibliographic databases, which include the titles of major
publishers such as Elsevier, Springer and Wiley. The following keywords were used in the
search: “bankruptcy prediction”, “corporate failure” and “financial distress”. Because we explore
the advances of the current century, the search is restricted to the years 2000 to
2017. After obtaining the initial search results, we refined them further to
eliminate papers that do not present experimental research on corporate failure
prediction. We consider that empirical studies reflect the
interacting factors in corporate failure, so these studies can be seen as an analysis
process. Finally, to ensure the quality of the collected papers, we also checked whether
those papers are listed in the French Centre National de la Recherche Scientifique (CNRS)
journal classification to make the final selection[1]. The final list includes 106 papers, from
which we recorded the journal name, year of publication and the research objective, along
with all the abovementioned key elements used to design the corporate failure prediction
models.
The final papers are gathered from more than 30 scientific journals, which are related to
the information systems, operations research, econometrics, finance, accounting, economics,
innovation and entrepreneurship, and general management fields. Table I provides a
description of the journals along with the number of publications that belong to each
journal. It can be observed that 74 out of 106 papers belong to the information systems and
operations research fields. This can be explained by the fact that corporate failure prediction
is mostly treated as a binary classification problem in which each sample of the data set
belongs to one of the predefined classes (failed and non-failed), and the prediction
methods are then applied with the objective of separating one class from the other
with the minimum amount of error. Indeed, as can be observed in Table II, the objective of
69 out of 106 gathered papers is quantitative, which is closely related to the information systems
and operations research fields. Table II also shows that qualitative papers have gained
interest, especially after researchers realized that the performance of corporate failure
models depends not only on the complexity of the prediction method but also on data characteristics
and variable selection.
Table I. Description of the gathered publications

                                                           No. of publications
Journal category / journal name                   2000-2005  2006-2011  2012-2017  Total
Information Systems (IS)                                  7         22         21     50
  Expert Systems with Applications                        7         13         11     31
  Knowledge-Based Systems                                 –          4          5      9
  Decision Support Systems                                –          2          3      5
  Information and Management                              –          3          1      4
  Other journals*                                         –          –          1      1
Operations Research (OR)                                 11          8          6     23
  European Journal of Operational Research                3          3          2      8
  Computers and Operations Research                       2          2          1      5
  Omega                                                   3          –          –      3
  Annals of Operations Research                           1          –          1      2
  Other journals*                                         2          3          2      7
Econometrics (EM)                                         1          3          4      8
  Journal of Forecasting                                  1          3          4      8
Finance (FI)                                              2          2          4      8
  Journal of Banking and Finance                          1          –          1      2
  Review of Quantitative Finance and Accounting           1          –          1      2
  Other journals*                                         –          2          2      4
Accounting (AC)                                           2          3          1      6
  Review of Accounting Studies                            1          1          –      2
  Other journals*                                         1          2          1      4
Economics (EC)                                            –          –          4      4
  Computational Economics                                 –          –          2      2
  Economic Modelling                                      –          –          2      2
Innovation and Entrepreneurship (IE)                      2          –          –      2
  Journal of Business Venturing                           2          –          –      2
General management or other category (OT)                 2          –          2      4
  Journal of Business Research                            –          –          2      2
  Other journals*                                         2          –          –      2
Total                                                    27         39         40    106

Notes: *Other journals includes: ABACUS (AC); Applied Soft Computing (OR); European Accounting
Review (AC); IEEE Transactions on Neural Networks (OR); Information Sciences (IS); International Review
of Financial Analysis (FI); International Journal of Management (OT); Journal of Business Finance and
Accounting (AC); Journal of Economics and Business (OT); Journal of Empirical Finance (FI); Journal of
Financial Research (FI); Journal of International Financial Markets, Institutions and Money (FI);
Management Accounting Research (AC); Neurocomputing (OR)

Table II. Research objective of the gathered publications

                             No. of publications
Research objective  2000-2005  2006-2011  2012-2017  Total
Quantitative               19         27         23     69
Qualitative                 6         12         16     34
Other                       2          –          1      3
Total                      27         39         40    106

Notes: Quantitative refers to a paper that investigates a prediction method in corporate failure; Qualitative
refers to a paper that investigates an explanatory variable, variable selection method or data sample
selection; Other refers to a paper that investigates model interpretation and knowledge extraction

3. Definition of corporate failure
By defining corporate failure, this section establishes the first methodological choice
underlying this study and reveals which experimental samples are most appropriate for
constructing a model. No universally accepted definition exists, and corporate failure has been
studied from various perspectives (e.g. juridical, economic, financial, econometric), leading
to multiple conceptualizations that might provide a representation of failure. Since Altman’s
(1968) early empirical study though, corporate failure often has been treated as a binary
classification, such that each sample contains members that belong to a predefined class
(failed or non-failed firms). This definition of failure relies on a firm condition and aims to
provide a criterion to distinguish between failed and non-failed firms.
In the twenty-first century, the main definition of failure refers to bankruptcy. This
definition, which is juridical in nature, refers to an ultimate, severe form of failure, which
may lead to the disappearance of firms with serious liquidity and solvency problems (Joos
et al., 1995). The popularity of this bankruptcy metric likely arises because it offers a well-
defined dichotomy that provides an objective assessment of firms’ financial condition and
also can be dated precisely. Firms thus can be categorized clearly into two populations in a
given period: failed (those with a bankruptcy procedure) and non-failed (those without a
bankruptcy procedure). Bankruptcy measures thus dominate, but they suffer a drawback, in
that they do not account for whether the bankruptcy declaration originates due to financial
signs of failure or other causes (Balcaen and Ooghe, 2006), which may produce sudden
bankruptcy samples[2]. Such contaminated samples would be detrimental for model
construction, because they cannot be investigated from a financial perspective. If the
inclusion of sudden bankruptcy samples needs to be avoided, is there an alternative
definition of failure that might be more effective?
The studies included in the current review reveal a few that consider financial distress as
a definition of failure, indicating that an enterprise encounters financial difficulties or
struggles to fulfill its obligations. Platt and Platt (2002) define financial distress as several
years of negative net operating income, suspension of dividend payments, major
restructuring, or layoffs. However, this definition raises a concern, because financial distress
is arbitrary in nature (Keasey and Watson, 1991), so the criterion to discriminate between
firms and assign them to a category for a given period is subjective and depends entirely on
the author’s concept of financial distress. Empirical research that uses this definition of
failure thus requires cautious evaluation, because the subjective nature of financial distress
may produce biased results. In some cases, the concept of financial distress appears useful
not because it provides a better dichotomy but rather because it aligns with the Chinese
institutional definition of failure[3]. Therefore, bankruptcy is the most suitable
representation of corporate failure, because it offers an objective discrimination criterion
from an empirical research perspective that can be used for further data development.

4. Data sample selection


Having established this definition of failure, it is necessary to set guidelines for constructing
the data samples from which to extract the models. The data collection is central to any
prediction model because it informs the preliminary investigation of the data and also
facilitates effective model construction (Tian et al., 2015). The review of prior literature
reveals two main approaches to data collection.
First, some data collection efforts seek samples with which to perform experiments. For
example, authors often select single-country firms, with the recognition that juridical
and accounting systems vary widely, such that each country has its own corporate failure
and accounting rules. Only Korol (2013), investigating firms from both Latin America and
Central Europe, compares early warning systems for bankruptcy risk across several
countries. Yet this study highlights concerns about the comparability of different country
samples, due to the challenges of creating a single concept of failure and a measurement
criterion for countries with different juridical and accounting systems.
Another tactic within this stream of research selects samples according to industry
sector or type of firm. As Figure 1 indicates, authors often choose listed firms to build their
data sets, mainly because these firms are required to publish annual accounts, which offer
open accessibility to their financial information. Nonetheless, considering the diverse
samples in Figure 1, it appears pertinent to question whether the likelihood of failure might
vary with the type of firms and their diverse financial characteristics. Chava and Jarrow
(2004) indicate that industry groupings significantly improve the performance of hazard
models. Ciampi (2015), in line with Altman and Sabato (2007), also argues that small and
medium-sized enterprises (SMEs) possess specific organizational and strategic
characteristics, which are unlike those of large firms, so specific corporate failure prediction
models need to be constructed for SMEs. Such evidence makes it clear that a corporate
failure model that relies solely on a specific type of firm likely produces efficient
classification rules, which may explain why so many authors collect data from specific firm
types, such as SMEs (Altman and Sabato, 2007; Gordini, 2014; Ciampi, 2015), banks (Ravi
and Pramodh, 2008; Boyacioglu et al., 2009; Iturriaga and Sanz, 2015) or manufacturers
(Barniv et al., 2000; Shin et al., 2005; Cho et al., 2010).
Second, some data collection efforts are more insightful with regard to the size of the
data. Because bankruptcy prediction models are dichotomous (i.e. failed versus non-failed
firms), they must be designed using information that describes the two groups of firms as
well as possible. The data generation therefore uses stratified samples, based on random
selection from the population, and then groups multiple firms from each class. A key
determinant is the proportion of each type of firm in the data. Starting with Altman (1968),
bankruptcy prediction models generally have been based on balanced samples, in which the
proportion of failed and non-failed firms is equal, which offers two major advantages. First,
it allows the models to concentrate equally on both types of firms to design the classification
rules. Second, accuracy rate, one of the simplest and most popular evaluation metrics, can be
properly applied only with this procedure. However, this sample representation suffers
criticisms, due to the overrepresentation of failed firms in the data relative to real-world
populations. For example, Zmijewski (1984) demonstrates that if the failed and non-failed
proportions do not reflect real-world populations, biased parameters result, distorting the
estimation results. Yet if the sample proportion is representative of the overall population of
firms, such that the failed firms samples are clearly outnumbered by non-failed ones (i.e.
imbalanced data set), the models cannot represent the data characteristics accurately, which
may lead to a suboptimal classification model that provides poor predictions across data
classes (Fernández et al., 2010), because these models would seek to classify the majority
class (non-failed firms) accurately but ignore the minority class (failed firms). Overall,
models cannot recognize failed firms correctly, as demonstrated by McKee and Greenstein
(2000) in their tests of three bankruptcy prediction methods (Interactive Dichotomizer 3
[ID3], logistic regression, and neural networks [NN]) in five imbalanced data sets. They
show that imbalanced proportions lead to poor classification performance, especially for
failed firms. The performance of corporate failure prediction models is a function of the
equilibrium between each class in the data, which eliminates class bias. Models built in
balanced proportions instead tend to reduce the gap between classes and have greater
propensity to predict failed firms.

Figure 1. Sample proportions [bar chart; categories: listed firms, SMEs, banks, manufacturing firms, others]

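To illustrate why overall accuracy can mislead on imbalanced samples, the following minimal Python sketch scores a degenerate model that labels every firm as non-failed. The labels (1 = failed, 0 = non-failed), the 5/95 split and the function name `per_class_rates` are hypothetical choices made for illustration; they are not taken from the reviewed studies.

```python
def per_class_rates(y_true, y_pred):
    """Return (accuracy, recall on failed firms, recall on non-failed firms)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_failed = sum(y_true)
    n_non_failed = len(y_true) - n_failed
    accuracy = (tp + tn) / len(y_true)
    recall_failed = tp / n_failed if n_failed else float("nan")
    recall_non_failed = tn / n_non_failed if n_non_failed else float("nan")
    return accuracy, recall_failed, recall_non_failed


# Hypothetical imbalanced sample: 5 failed firms and 95 non-failed firms.
y_true = [1] * 5 + [0] * 95
# A degenerate model that predicts "non-failed" for every firm.
y_pred = [0] * 100

accuracy, recall_failed, recall_non_failed = per_class_rates(y_true, y_pred)
print(accuracy, recall_failed, recall_non_failed)
```

In this toy case the accuracy is high although no failed firm is detected, which is why a per-class metric is reported next to the overall rate.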
Yet a balanced distribution also can have detrimental effects on data size, due to the
scarcity of bankrupt firms, inconsistent bankruptcy rates and a lack of accessibility to these
firms’ information, such that it is difficult and costly to gather information about
failed firms (Tian et al., 2015). For a data set built using a balanced distribution, in which
failed firms are paired with firms that did not fail, the capacity to collect failed firms
becomes a key condition for the data size. The limiting factor for sample size is the number
of failed firms, and few studies feature more than about a thousand failed firms (Kumar and
Ravi, 2007). This limit might explain why, as Figure 2 shows, recent studies mostly
employ data sets containing fewer than 400 firms, which is rather small.
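The dependence of a balanced sample on the number of failed firms can be sketched as follows. This is a simplified illustration under stated assumptions: firms are hypothetical dictionaries with a `failed` flag, and each failed firm is paired with a randomly drawn non-failed firm. Matching on size, industry or year, which a real study might apply, is deliberately left out.

```python
import random


def balanced_sample(firms, seed=0):
    """Draw equal numbers of failed and non-failed firms at random."""
    failed = [f for f in firms if f["failed"]]
    non_failed = [f for f in firms if not f["failed"]]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # The smaller class (usually the failed firms) caps the sample size.
    k = min(len(failed), len(non_failed))
    return rng.sample(failed, k) + rng.sample(non_failed, k)


# Hypothetical population: 8 failed firms among 200.
firms = [{"id": i, "failed": i < 8} for i in range(200)]
sample = balanced_sample(firms)
print(len(sample), sum(f["failed"] for f in sample))
```

With only 8 failed firms available, the balanced sample cannot exceed 16 firms regardless of how many non-failed firms exist.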
With regard to the samples used to design a model, three concepts thus emerge from the
current review. First, samples are delimited geographically, despite a lack of consensus
about which kinds of firms should be included in the sample. Some studies include firms
across sectors, whereas others focus on a specific industry sector. Second, data sets should
include a balanced proportion of failed and non-failed firms to achieve more accurate results.
Third, large samples are needed to obtain more reliable results and robustness, though the
size tends to be conditional on the number of failed firms available.

5. Prediction methods
Prediction methods are the core of corporate failure models; their objective is to separate
failed firms from non-failed ones with minimum error. The vast number of studies devoted
to prediction methods has made this element a cornerstone of corporate failure prediction
studies, which work to develop more complex prediction methods that can produce more
accurate predictions. Thus, extant literature offers diverse, extensive methods intended to
predict corporate failure. It is possible to classify those methods into three broad groups.
First, single statistical methods are prominent. Notably, DA, one of the first methods
employed for corporate failure prediction, is still widely used (Cambas et al., 2005; Serrano-
Cinca and Gutiérrez-Nieto, 2013), though logistic regression (LR) is the dominant prediction
method (Barniv et al., 2000; Foreman, 2003; Tseng and Lin, 2005). The LR method is less
statistically demanding than DA, though it still requires a lack of multicollinearity among
the independent variables (Tucker, 1996). Well-known concepts from statistical decision
theory can establish discrimination boundaries between the two classes of firms, according to
analyses, summaries, and interpretations of the data. These methods usually are simple and
easy to use, offering both efficiency and robustness.

Figure 2. Data size proportion [bar chart; categories: 400 firms or fewer, more than 400 firms]

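As a rough illustration of a single statistical method, the sketch below fits a logistic regression by plain batch gradient descent using only the Python standard library. The two ratios, the six toy firms, the learning rate and the number of epochs are all hypothetical assumptions chosen for the example; the reviewed studies rely on their own data and estimation software.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Estimated failure probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Minimize the average log-loss with batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


# Hypothetical ratios per firm: [debt / total assets, net income / total assets]
X = [[0.90, -0.10], [0.80, -0.05], [0.85, -0.20],
     [0.30, 0.10], [0.40, 0.08], [0.20, 0.15]]
y = [1, 1, 1, 0, 0, 0]  # 1 = failed, 0 = non-failed

w = fit_logistic(X, y)
for xi, yi in zip(X, y):
    print(yi, round(predict_proba(w, xi), 3))
```

The fitted coefficients can be read directly (for instance, a positive weight on the leverage ratio raises the estimated failure probability), which is the kind of interpretability usually credited to statistical methods.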
Second, artificial intelligence methods have grown popular in recent years, especially with
the support of advanced computing and informatics technology. This group comprises many
techniques, among which NN are dominant (Charalambous et al., 2000; Atiya, 2001; Wang
et al., 2015), along with case-based reasoning (CBR) (Park and Han, 2002; Li and Sun, 2008),
decision trees (DT) (Kim and Upneja, 2014; Liang et al., 2015), and support vector machines
(SVM) (Min and Lee, 2005; Sun and Li, 2012). These methods do not require any specific
assumptions and focus on learning directly from the data, which makes their predictions
more reliable than those of models that seek to understand underlying phenomena. The
reliance on nonlinear approaches also offers extended possibilities for testing complex data.
Third, ensemble methods combine several classifiers for data analysis and prediction[4].
For example, Chandra et al. (2009) combine NN, DT, LR and SVM to design a more accurate
prediction method; Sun et al. (2011) employ an AdaBoost technique in combination with DT;
and Chuang (2013) integrates rough set theory with CBR. Such methods offer a dynamic
vision of failure that can be captured by the combination of classifiers, which might lead to
improved performance.
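The idea of combining several classifiers can be sketched with simple majority voting. This is a stand-in technique chosen for brevity, not the AdaBoost or rough-set combinations used in the cited studies, and the three threshold rules below are hypothetical placeholders for trained base classifiers.

```python
from collections import Counter


# Hypothetical base classifiers: each returns 1 (failed) or 0 (non-failed).
def leverage_rule(firm):
    return 1 if firm["debt_to_assets"] > 0.7 else 0


def profitability_rule(firm):
    return 1 if firm["return_on_assets"] < 0.0 else 0


def liquidity_rule(firm):
    return 1 if firm["current_ratio"] < 1.0 else 0


def majority_vote(classifiers, firm):
    """Predict the class chosen by most base classifiers.

    Ties (possible only with an even number of classifiers) resolve to 0.
    """
    votes = Counter(clf(firm) for clf in classifiers)
    return 1 if votes[1] > votes[0] else 0


ensemble = [leverage_rule, profitability_rule, liquidity_rule]
distressed = {"debt_to_assets": 0.8, "return_on_assets": -0.05, "current_ratio": 1.5}
healthy = {"debt_to_assets": 0.3, "return_on_assets": 0.10, "current_ratio": 2.0}

print(majority_vote(ensemble, distressed), majority_vote(ensemble, healthy))  # prints: 1 0
```

The distressed firm is flagged even though its liquidity rule disagrees, which shows how an ensemble can absorb the error of one base classifier.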
The twenty-first century also exhibits two periods reflecting the evolution of prediction
methods. As Figure 3 shows, artificial intelligence and statistical methods are the most
popular prediction methods up until 2007. The preference for artificial intelligence likely
emerges because these methods outperform statistical approaches, due to their ability to deal
with non-linear distributions and their freedom from stringent data
assumptions. Lin (2009), in line with Tseng and Hu (2010) and Lee and Choi (2013), compares
artificial intelligence and statistical methods and finds that the former achieve better
accuracy than the latter. Yet the benefits gained from using artificial intelligence methods
may be minor, with only slightly superior performance compared with simple statistical
methods, even as they demand substantially more time. Furthermore, the determination of
parameters associated with these classifiers is not straightforward.
After 2007, prediction methods changed notably, and the exponential increase of
ensemble methods moved them into a position of prominence. This gain came at the expense
of statistical methods, which remain useful mainly to compare results among methods
(du Jardin, 2010; Tsai and Hsu, 2013). Stand-alone methods always could be added to
ensembles, so authors have realized that a properly designed ensemble method outperforms
approaches based on a single classifier, because combining diverse, independent classifiers
produces better results (Kainulainen et al., 2014). For example, Sun and Li (2009) combine six
classifiers. Thus, ensemble methods have gained attention due to their effectiveness and
feasibility with regard to combining different classifiers to predict corporate failure.

Figure 3. Prediction method proportions [two pie charts; categories: statistical methods, artificial intelligence, ensemble]
Notes: Left: prediction methods before and including 2007; right: prediction methods after 2007

A broad observation is that it is impossible to ensure the superiority of any one method,
because all prediction methods possess particular characteristics that make them relevant
for predicting corporate failure. The selection of a prediction method thus is arbitrary and
depends on the researchers’ objective. In this regard, artificial intelligence and ensemble
methods may be adequate if the research goal is to predict corporate failure more accurately,
because they are not subject to any of the stringent data assumptions. They can adapt to
any data characteristic and thus are good at performing function classification tasks. In
contrast, statistical methods create good classification rules, which perform as well as
artificial intelligence methods, but they provide more interpretable results. This
interpretability benefits users of corporate failure prediction models, helping them
understand which factors influence firms’ failure likelihood.

6. Explanatory variables
Corporate failure can be analyzed from several perspectives, and many variables might
influence a firm’s failure probability. These variables can be assigned to five major groups
(du Jardin, 2009), as detailed in Figure 4.
Financial ratios, which reflect the relationship between two figures derived from the
financial statements or other sources of financial information, absolutely dominate corporate
failure prediction. This finding that financial ratios are the most used variable is not
surprising; since Beaver (1966) first showed that financial ratios have predictive power, the
relation between financial information and corporate failure has been irrefutable. Financial
ratios offer objective measures, based on publicly available information (Micha, 1984), so
many studies rely solely on financial ratios as explanatory variables (Kaski et al., 2001;
Chauhan et al., 2009), with the assumption that these ratios contain all relevant information
for predicting corporate failure. However, several authors question this assumption. Argenti
(1976) expresses doubts about the capacity of a model to predict failure with evidence from
only financial ratios, and Zavgren (1985) claims that any model containing only financial
information cannot predict failure with certainty.
In response, in the early twenty-first century, a new perspective emerged, and studies
began to investigate how to include alternative or complementary variables. For example,
market variables refer to the financial situation of firms according to their stock market
value and offer a good complement to financial ratios. Beaver et al. (2005) explain that
including market variables adds significant value, because they reflect financial information
that is not contained in accounting statements. Tinoco and Wilson (2013) also show that
market-based variables can provide a direct assessment of volatility and growth
opportunities, measures that provide powerful predictors of bankruptcy risk. However,
these variables are available only for listed firms, which represents a major disadvantage.

Figure 4. Variable use proportions [bar chart; categories: financial ratios, market variables, non-financial variables, variation of financial ratios, economic variables]

Non-financial variables also have attracted some interest, because they indicate a wider
dimension of failure and may be relevant for models’ predictive performance. Wu (2004)
finds that results obtained from a combination of financial and non-financial ratios are more
accurate than those achieved with just financial ratios. Business efficiency, which offers a
measure of firms’ management, is a recurring non-financial variable, calculated using data
envelopment analysis, which is an efficient performance evaluation tool that accounts for all
dimensions of corporate activity. Yeh et al. (2010) and Xu and Wang (2009) offer evidence
that including business efficiency as a predictor, together with financial ratios, leads to more
accurate corporate failure prediction models. Among the other non-financial factors
considered, Ciampi (2015) highlights that exploiting corporate governance along with
financial ratios improves the performance of corporate failure models for SMEs, and
Tobback et al. (2017) demonstrate that relational data give more reliable results. Thus, non-
financial variables offer another perspective on failure with good predictive power, though
they also tend to be expensive and difficult to gather.
Variations of financial ratios also encompass distinct statistical or mathematical approaches
(e.g. mean, median, logarithm) that reflect the evolution of financial ratios across periods.
Volkov et al. (2017) show that introducing sequential information from time-series of financial
ratios leads to better classification performance, yet studies mainly include this group of
variables by establishing a logarithm transformation of a financial ratio (Kolari et al., 2002;
Min and Lee, 2005), such as the total assets logarithm popularized by Altman (1968).
Finally, economic variables that reflect macroeconomic factors also have been
contemplated, due to their potential influence on the accuracy of predictive models. Mare
(2015) shows that variations in economic cycles are positively related to failure probability.
Nonetheless, the gain in accuracy obtained by including these variables may be minimal.
Considering these diverse approaches, it seems that there is no real limit on the design of
explanatory variables. However, including variables that are irrelevant or redundant may
lead to suboptimal models, so variable selection remains a crucial step in building a model.
Authors must select the most relevant variables to reduce redundancy as much as possible
but still retain the necessary information for predictions. The variable selection process in
turn is essential for establishing model parsimony, accuracy and generalization. Variable
selection methods should be considered in conjunction with the explanatory variables; no
theory confirms which variables are the best predictors (Scott, 1981). The methodologies
used across the reviewed studies can be grouped into four main categories.
The first variable selection technique relies on filter methods, such as t-tests (Xiao et al.,
2012), stepwise analysis (Gordini, 2014), and stepwise logistic regression (Charitou et al.,
2004). These methods leverage statistical concepts to select the best set of variables and are
entirely independent of the classifier chosen to predict failure. They are computationally
efficient and statistically differentiable, which is particularly helpful when there are many
variables under consideration (Guyon and Elisseeff, 2003), yet they are prone to unexpected
failures and do not model variable dependency (Fogel, 2006).
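As a minimal sketch of a filter method, the snippet below ranks variables by the absolute value of Welch's t statistic between failed and non-failed firms, without reference to any classifier. The variable names and the toy values are hypothetical, and the reviewed studies may use other test variants or cut-off rules.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_by_t(failed, healthy):
    """Rank variables by |t|, largest first.

    Both arguments map a variable name to the list of values observed
    for that group of firms.
    """
    scores = {name: abs(welch_t(failed[name], healthy[name])) for name in failed}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: two ratios observed for four failed and four healthy firms.
failed = {"liquidity": [0.8, 0.9, 0.7, 1.0], "turnover": [1.1, 0.9, 1.3, 1.0]}
healthy = {"liquidity": [1.6, 1.8, 1.5, 1.7], "turnover": [1.0, 1.2, 0.9, 1.1]}

ranking = rank_by_t(failed, healthy)
print(ranking)
```

Because each variable is scored on its own, the ranking cannot detect that two highly ranked ratios carry the same information, which is the dependency limitation noted above.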
The second technique selects variables according to their popularity in prior literature;
the selected variables have been used by other authors previously. Li and Miu (2010), in line
with Hu (2008) and Quintana et al. (2007), include variables from Altman’s (1968) study:
working capital/total assets, retained earnings/total assets, earnings before interest and
taxes/total assets, market value of equity/book value of total debt and sales/total assets. The
predictive power of these variables has been established, yet selecting variables according to
this criterion may be problematic, because even popular ratios can be unreliable, and relying
on them forecloses better selection options (Dirickx and Van Landeghem, 1994).
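For illustration only, the five ratios listed above can be computed from statement items as follows. The dictionary keys and the figures for the example firm are hypothetical, and the sketch stops at the ratios themselves rather than any scoring function.

```python
def altman_ratios(f):
    """Return the five ratios listed above from a dict of statement items."""
    ta = f["total_assets"]
    return {
        "working_capital/total_assets": f["working_capital"] / ta,
        "retained_earnings/total_assets": f["retained_earnings"] / ta,
        "ebit/total_assets": f["ebit"] / ta,
        "market_value_equity/book_value_total_debt": (
            f["market_value_equity"] / f["book_value_total_debt"]
        ),
        "sales/total_assets": f["sales"] / ta,
    }


# Hypothetical firm, figures in arbitrary currency units.
firm = {
    "total_assets": 200.0,
    "working_capital": 40.0,
    "retained_earnings": 30.0,
    "ebit": 20.0,
    "market_value_equity": 150.0,
    "book_value_total_debt": 100.0,
    "sales": 300.0,
}

ratios = altman_ratios(firm)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```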
Wrapper methods are based on a heuristic technique that selects variables according to
their usefulness for a given classifier. Yu et al. (2014) employ sequential forward selection;
Jeong et al. (2012) use a generalized adaptive approach. Sun and Shenoy (2007) and Chuang
(2013) select the most relevant variables through a heuristic method based on a naïve Bayes
model and CBR, respectively. Wrapper methods take into consideration how the classifier
might improve model performance. However, they involve a search process for a good
variable set, which is time consuming, and they are prone to overfitting.
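The following sketch illustrates the wrapper idea with sequential forward selection: variables are added one at a time, and each candidate set is scored by the classifier itself. The nearest-centroid classifier, the leave-one-out scoring and the toy data are assumptions made for brevity, not the setup of any cited study.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(col) for col in zip(*rows)]


def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier using only `cols`."""
    hits = 0
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        cents = {}
        for lab in set(y):
            rows = [[x[c] for c in cols] for x, l in train if l == lab]
            cents[lab] = centroid(rows)
        point = [X[i][c] for c in cols]
        pred = min(
            cents,
            key=lambda lab: sum((p - q) ** 2 for p, q in zip(point, cents[lab])),
        )
        hits += pred == y[i]
    return hits / len(X)


def forward_select(X, y, max_vars):
    """Greedily add the column that most improves the classifier's score."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_vars:
        score, col = max((loo_accuracy(X, y, selected + [c]), c) for c in remaining)
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best


# Hypothetical toy data: column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [0.10, 5.0, 1.0],
    [0.20, 3.0, 2.0],
    [0.15, 4.0, 1.5],
    [0.90, 4.5, 1.2],
    [0.80, 3.5, 1.8],
    [0.85, 5.0, 1.4],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = failed, 0 = non-failed

selected, best = forward_select(X, y, max_vars=2)
print(selected, best)
```

Every candidate set requires refitting and rescoring the classifier, which is where the search cost mentioned above comes from; scoring on so few observations also shows how easily the search can overfit.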
Finally, feature extraction methods do not select the best variables but instead combine
the variables to create a small set. Specifically, feature extraction identifies a mapping that
reduces variables’ dimensionality, which can be categorized as linear or non-linear. These
methods include principal component analysis (PCA), which is very popular (Chen and Du,
2009; Hu and Jansell, 2009; Chen, 2011). Because feature extraction combines variables, it
may eliminate overfitting issues and generate better discriminatory power. Nonetheless, the
variable transformation tends to be very time consuming.
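As a rough illustration of linear feature extraction, the sketch below computes the first principal component of two correlated ratios by power iteration on the sample covariance matrix. It only centers the data (standardizing first is a common alternative), and the function name and data are hypothetical.

```python
from math import sqrt


def first_component(X, iters=200):
    """Return the leading principal direction and the projected scores."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Z = [[row[j] - means[j] for j in range(d)] for row in X]  # centered data
    cov = [
        [sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iters):  # power iteration converges to the top eigenvector
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(z[j] * v[j] for j in range(d)) for z in Z]
    return v, scores


# Hypothetical toy data: the second ratio is roughly twice the first.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]

direction, scores = first_component(X)
print(direction)
print(scores)
```

The single extracted score replaces both original ratios, so the reduced variable is a combination that no longer corresponds to one interpretable ratio.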
Across this literature stream, studies thus implement different variable selection strategies,
suggesting that there is no widely accepted strategy for performing variable selection, even
though the methods influence classifier performance. Thus, du Jardin (2010, p. 11) concluded:
“model accuracy depends on part on the intrinsic characteristics of prediction method and in part
of the fit of this method and the variable selection procedure involved in its design.” In
recognition of the importance of variable selection methods, several studies place this process at
the heart of their research. Some of them investigate how the most commonly used variable
selection techniques influence the performance of prediction methods. For example, Tsai (2009)
compares five variable selection techniques – t-test, stepwise regression, correlation matrix,
factor analysis, and PCA – to determine which one fits a NN classifier best and finds that t-tests
are most suitable for NN. In a similar study, du Jardin (2010) indicates that NN instead achieves
better performance if a wrapper method selects the variables. Liang et al. (2015) explore three
filter and two wrapper-based feature selection techniques, combined with six prediction methods.
They argue there is no single best combination. With another approach, Xu et al. (2014) develop a
new variable selection technique that integrates soft set theory with LR. Their results indicate
that this technique provides superior performance in terms of both accuracy and stability.
Although annual account information, in the form of financial ratios, is the primary
information used to predict bankruptcy, collected studies concur that financial ratios alone
cannot capture all the dynamics underlying corporate failures. Alternative variables thus join
financial ratios to predict failure. Yet no established feature selection framework is available to
select the most significant explanatory variables. Thus, the explanatory variables ultimately
included in corporate failure prediction models should be selected using multiple methods, to
reveal the optimal solution according to an evaluation of all possible combinations. As Murray
(1977) argues, the value of a variable for further classification can be measured only by the
number of times that it would be selected by different variable selection techniques.
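One simple way to operationalize the idea attributed to Murray (1977) is to count how many selection techniques retain each variable. In the sketch below, the technique names, the variables and the threshold of two votes are hypothetical.

```python
from collections import Counter


def selection_counts(selections):
    """Count, per variable, how many selection techniques retained it."""
    return Counter(var for chosen in selections.values() for var in set(chosen))


# Hypothetical outputs of three variable selection techniques.
selections = {
    "t_test": ["liquidity", "leverage", "profitability"],
    "stepwise": ["liquidity", "profitability"],
    "wrapper": ["liquidity", "turnover"],
}

counts = selection_counts(selections)
# Keep variables chosen by at least two techniques (an arbitrary threshold).
consensus = [var for var, c in counts.most_common() if c >= 2]
print(counts)
print(consensus)
```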

7. Evaluation metrics
The repercussions of misestimating corporate failure can be catastrophic for financial markets
that rely on proper assignments of limited financial resources, so evaluation metrics are
fundamental. This element establishes the predictive ability of corporate failure models, and
the accuracy rate, in combination with type errors, provides the most frequently used
evaluation metric. The accuracy rate reflects the total percentage of correct classifications, which is a
simple way to describe a classifier’s performance on a given data set, whereas Type-I and
Type-II errors measure the percentage of failed and non-failed firms that have been
misclassified, respectively, and thus evaluate each type of firm individually. The cost of these
two errors is asymmetric; that is, the cost of misclassifying a failed firm is completely different
from that of misclassifying a non-failed firm. Predicting that a firm is healthy when it will fail
leads to a loss in capital; predicting that a firm is failed when it is healthy only involves the loss
of a commercial bargain. Thus, Type-I error is more important than Type-II error for financial
institutions. These metrics can verify model performance, in that they provide capacity
predictions for failed and non-failed firms individually and globally, which likely explains their
vast popularity (Tang and Chi, 2005; Chen et al., 2011; Yu et al., 2014; Gepp and Kumar, 2015).
However, they only provide useful model performance information when the data include two
types of samples with balanced proportions, which is an apparent limitation. As noted
previously, bankrupt firms are relatively rare, so the data may be skewed by an imbalanced
data set, in which case evaluation metrics do not provide adequate information about prediction
performance with respect to the type of classification required, because the sizes of the groups
of firms are uneven (He and Garcia, 2009). According to Lopez et al. (2013), such evaluation
metrics are often biased toward the majority class, leading to a higher misclassification rate for
minority class instances. This issue prevents their application for corporate failure prediction,
which demands the accurate prediction of the minority class (failed firms).
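The bias toward the majority class can be seen in a small worked example. The sketch below follows the definitions used in this section (Type-I error: share of failed firms misclassified; Type-II error: share of non-failed firms misclassified); the 5/95 class split and the trivial "always healthy" predictor are hypothetical.

```python
def type_errors(y_true, y_pred):
    """Return (accuracy, Type-I error, Type-II error); 1 = failed, 0 = non-failed."""
    failed = [p for t, p in zip(y_true, y_pred) if t == 1]
    healthy = [p for t, p in zip(y_true, y_pred) if t == 0]
    type_1 = failed.count(0) / len(failed)    # failed firms predicted as healthy
    type_2 = healthy.count(1) / len(healthy)  # healthy firms predicted as failed
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, type_1, type_2


# Hypothetical imbalanced sample: 5 failed firms and 95 non-failed firms.
y_true = [1] * 5 + [0] * 95
# A trivial model that predicts "non-failed" for every firm.
y_pred = [0] * 100

accuracy, type_1, type_2 = type_errors(y_true, y_pred)
print(accuracy, type_1, type_2)
```

The accuracy rate looks high even though every failed firm is missed, which is why accuracy alone is uninformative on imbalanced samples.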
The area under the curve (AUC) option provides a more suitable evaluation metric for
imbalanced data sets, because it creates a visible representation of classifiers’ performance
and is completely insensitive to imbalanced distributions. The visual representation reflects
the trade-off between true positives (failed firms that have been correctly classified) and
false positives (non-failed firms that have been incorrectly classified as failed). The area under
the curve should be as close as possible to a value of 1, which represents a perfect true positive
classification without false positives. This measure includes misclassification costs in conjunction
with corporate failure models’ performance regardless of the sample proportion. Accordingly, AUC measures
have recently attracted more attention and have become more widespread in the corporate failure
domain, regardless of data proportions (Ravi and Pramodh, 2008; Horta and Camanho, 2013;
Pal et al., 2016; du Jardin et al., 2017). Complementary metrics in imbalanced data sets also
provide more intuitive, practical evaluations of the prediction performance in each class. Both
Zhou (2013) and Iturriaga and Sanz (2015), investigating bankruptcy predictions in
imbalanced data sets, thus use sensitivity and specificity metrics, which reflect the
percentages of failed and non-failed samples correctly classified, respectively.
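As a minimal sketch, the AUC can be computed as the share of (failed, non-failed) pairs in which the failed firm receives the higher failure score, which is equivalent to the area under the ROC curve. Sensitivity and specificity follow the definitions given above. The scores, the labels and the 0.45 threshold are hypothetical.

```python
def auc(y_true, scores):
    """AUC as the probability that a failed firm outscores a non-failed firm."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a failed and a non-failed firm count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Share of failed and of non-failed firms classified correctly."""
    tp = sum(t == 1 and s >= threshold for t, s in zip(y_true, scores))
    tn = sum(t == 0 and s < threshold for t, s in zip(y_true, scores))
    return tp / y_true.count(1), tn / y_true.count(0)


# Hypothetical imbalanced sample: 2 failed firms (label 1) and 6 non-failed firms.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.90, 0.40, 0.50, 0.30, 0.20, 0.10, 0.35, 0.05]

area = auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.45)
print(area, sens, spec)
```

The AUC summarizes ranking quality across all thresholds, whereas sensitivity and specificity describe each class at one chosen threshold, so the two views complement each other.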
Overall then, no single metric appears more relevant than others to evaluate classifiers’
performance; this evaluation requires the consideration of conditional factors such as data
proportions (Raeder et al., 2012). The selection of evaluation metrics also may satisfy special
requirements in ways that avoid distorted conclusions (Hand, 2012). In addition, more than
one evaluation metric should be calculated to evaluate the performance of prediction
methods, because no single metric can capture the individual classification for each class
and global performance.

8. Conclusion and research opportunities


In the twenty-first century, the corporate failure prediction domain has become a cornerstone
of corporate finance research, and studies from disparate academic disciplines have shifted
focus to investigate this critical corporate issue. Therefore, the current study undertakes a
profound analysis of the advances in this domain in the current century. The review of the
elements that encompass corporate failure models (definition of corporate failure, data sample
selection, prediction methods, explanatory variables, and evaluation metrics) represents a
novel contribution to extant literature. The reviewed literature reveals in turn that empirical
investigations into corporate failure prediction should define corporate failure as bankruptcy,
which provides the most appropriate dichotomy to group firms of different status, and should
apply four key conditions to the data collection. First, samples should be from a single
country, to ensure their uniform juridical and accounting systems. Second, one data set should
belong to a concrete type of industry or sector of activity, so that the model design can reflect
specific financial characteristics. Then another data set can feature firms across sectors, to test
the model’s capacity to create good prediction rules. Third, data sets should include equal
numbers of failed and non-failed firms to achieve the best classification performance
across classes. Fourth, the data should be as large as possible, keeping the previous
requirements in mind, to obtain more reliable results with greater precision and robustness. In
addition, because no prediction method clearly outperforms the others, researchers should
select these methods according to their research goals and the interpretability (statistical
methods) versus accuracy (artificial intelligence or ensemble methods) trade-off. Financial
ratios should be used to design the model, but complementary variables also should be
integrated to provide a broader view on failure and more accurate results. Finally, at least one
evaluation metric should measure the classification performance of each class individually,
and then another to evaluate global performance. Table III summarizes the key findings so
that these guidelines can help researchers and practitioners predict corporate failure.
Despite several advances with regard to corporate failure prediction, forecasting
limitations still remain. The imperfections in the designs of corporate failure models are not
totally surprising, considering the number of corporate failures. They also imply some
pertinent research opportunities, all rooted in one of two principles: improving
corporate failure prediction models or understanding corporate failure.
The former principle is based on the issues related to corporate failure prediction models
highlighted in this review. In this regard, the main issues that need to be further
investigated are presented below.
The collected studies point in different directions with respect to the data sample, which
indicates vastly discrepant conclusions. This point demands further research, because prior
literature has not investigated issues related to data construction. On the one hand, data tend to
be small in corporate failure models, but larger data sizes might be more beneficial. In turn, this
factor could exert a powerful influence on corporate failure prediction models, so a more precise
understanding of the optimal, efficient sample size that produces accurate and consistent
results is needed. On the other hand, imbalanced data sets also might be detrimental to models’
performance, but no evidence exists to specify the degree of data imbalance at which
performance starts to be jeopardized. These data sample selection issues seldom have been
investigated, though they appear crucial for practical applications of corporate failure models.
The development of more accurate corporate failure prediction models is a common
research goal, yet the majority of studies published in the twenty-first century have
achieved only minimal improvements in accuracy, and corporate failure forecasting
accuracy ranges from 80 to 85 per cent. In this context, du Jardin and Séverin (2011) criticize
models with very short forecasting horizons; prediction rates often are good one year before
failure, but less so as the horizon extends to three years or more. Their study and du Jardin
(2015) both use self-organized maps to design failure trajectories and produce stable
predictions across a longer forecasting horizon – the only existing efforts to improve the
forecast horizon of corporate failure models. Few investigations seek accurate results for
midterm predictions, despite their importance for financial institutions that need to assess the
risk of an investment or debt. Midterm prediction thus constitutes an important research direction.

Table III. Summary of key findings

Bankruptcy as definition of corporate failure
  Advantages:
    – Well-defined dichotomy
    – Objective criteria to separate failed and non-failed firms in a given period
  Disadvantages:
    – Introduction of non-financial bankruptcy samples may be detrimental for model construction

Data sample belong to a concrete type of industry/sector of activity
  Advantages:
    – Take into account specific organizational and strategic characteristics
    – The specific type of firms models likely produce more efficient prediction rules
  Disadvantages:
    – Numerous specific prediction models should be built
    – Impossibility to design unique prediction rule for its practical application

Equal number of failed and non-failed firms in the data set
  Advantages:
    – Eliminate the prediction bias toward the majority class
    – Models can represent data characteristics adequately
    – Achieve more accurate predictions
  Disadvantages:
    – Ignore real-world conditions, the proportion of bankrupt firms is very low
    – May lead to suboptimal models that might provide unfavorable predictions across real-world data

Large data set
  Advantages:
    – Increase model’s generalization capacity
    – Provide more reliable results with greater precision and robustness
  Disadvantages:
    – Scarcity of bankruptcy firms
    – Lack of accessibility to firms information

Variety prediction methods
  Advantages:
    – Wide range of prediction methods (statistical techniques, artificial intelligence, ensemble methods)
    – The selection depends on the research goals
  Disadvantages:
    – A trade-off between interpretability and accuracy
    – Conditioned to data characteristics (statistical techniques) and determination of parameters (artificial intelligence and ensemble methods)

Financial ratios and complementary variables to design the model
  Advantages:
    – Broader dimension underlying corporate failures
    – Better capacity to create prediction rules
    – More accurate prediction
  Disadvantages:
    – Financial information can be manipulated (earnings management)
    – Lack of robustness of the complementary variables
    – Large marginal cost of collecting complementary variables in comparison with the increase of model accuracy

Variable selection method in conjugation with prediction method
  Advantages:
    – Consider the characteristics of the prediction method to optimize the variable selection
    – Essential for the accuracy of the model
  Disadvantages:
    – No single best combination
    – Time-consuming, compare various variable selection methods

At least two evaluation metrics
  Advantages:
    – Evaluate global performance and individual class classification
  Disadvantages:
    – Conditioned to data set characteristics (class proportion)

Finally, this literature review reveals that financial information is limited in its ability to
encapsulate all the factors that may affect failure likelihood. Other variables can
complement this information. Nonetheless, further research is needed to understand firms’
failure processes. In particular, an in-depth analysis of non-accounting information could
provide insights into how to address failure dynamics more accurately.
In contrast to the principle of proposing a new, improved model that will perform the
classification task, further research opportunities might provide insights into understanding
failures. Research seldom interprets models to contribute toward the development of
a business failure theory or to shed light on the effects and causes of corporate failure. We
therefore outline further opportunities in this direction.
Despite numerous research studies on developing empirical corporate failure models, a
generally accepted corporate failure theory has not yet been proposed. Nonetheless,
empirical studies reveal an aspect of reality based on the analysis process of the theoretical
frameworks. In this regard, two theories have been especially considered. On the one hand,
the decomposition measures theoretical framework examines changes in the structure of
balance sheets. More precisely, variables have been built based on the decomposition of
aggregate figures (assets and liabilities) to provide discrimination power with respect to
failed and non-failed firms. On the other hand, cash-based variables have also been
considered based on cash management theory, which indicates that an imbalance between
cash inflows and outflows may lead to corporate failure. On the whole, theoretical
frameworks have highlighted failure patterns, which have been reflected in significant
variables for the corporate failure model development. Moreover, financial distress largely
has been discarded, due to its limitations in empirical research. Yet it offers significant
theoretical value because financial distress is an ongoing process in which firms experience
diverse states. Indeed, a corporate failure theory could derive from the concept of financial
distress and structural inertia theory (Hannan and Freeman, 1984). This theory suggests
that organizations are subject to strong inertia forces, that is, they seldom succeed in
making radical changes in the face of threats. It thus suggests that, under the threat of
financial distress, which results from abnormalities in business operations over a
continuous period of time, inertia leaves firms unable to change this dynamic. Therefore, this view
could support the elaboration of a theoretical failure analysis that might enrich knowledge
about this issue.
Even though the number of corporate failures is increasing, relatively little research has
addressed the general causes of failure. This is a significant gap in the literature that needs
to be filled. What exactly are these causes? According to related literature (Carter
and Van Auken, 2006; Amankwah-Amoah, 2015), the causes of failure can be grouped into
two categories: external factors, which stem from environmental causes or are not directly
related to business activities and are unpredictable, and internal factors, which leave traces
in the accounts and are predictable. Table IV
summarizes some of the causal factors leading to failure[5]. From these categories, the latter
requires further research. In particular, as those factors are traceable, one could separate the
moment when the factor occurs and the moment of failure. Taking into account the lapse of
time that can elapse between these two moments, corrective actions can be taken to avoid
failure. Thus, further research can shed light on the possible strategies to avoid failure
regarding the underlying internal factors. Moreover, as Kücher et al. (2018) claimed, the internal
reasons for failure should be investigated alongside firms’ age, size or life cycle so that the
likelihood of suffering that kind of failure can be addressed considering firms’ characteristics.

Table IV. Summary of corporate failure causes

External factors:
  – Accidents (natural disaster, ...)
  – Departure of person in a key role
  – Litigation with a partner/public administration
  – Owner personal issues (death, disease, ...)
  – Public policy less favorable to the sector
  – Unsuccessful implementation or fraud against the firm (project/investment failures, ...)

Internal factors:
  – Financial difficulties (lack of liquidity or funds, ...)
  – Loss of market shares (loss of customers, suboptimal sell-price decision, ...)
  – Management issues (deficient accounting system or management strategy, ...)
  – Obsolete technology or production process
  – Operating cost (high remuneration or production cost, ...)
  – Supplier issues (loss of suppliers, refusal to accept late payments, ...)

Finally, it is well known that corporate failure has severe effects on business, the
economy and society as a whole because it may result in a traumatic experience for
employees and shareholders and in devastating consequences for the economy. Nonetheless,
business failure may also have positive aspects because it leads to potentially valuable
opportunities. Indeed, Amankwah-Amoah (2016) affirmed that corporate failure has two
categories of effects: contagion effects and competitive effects. The former unleashes a range
of negative consequences, which is based on the concept that corporate failure is a
discrediting label that creates negative perceptions. In this respect, internal stakeholders
(employees, owners, ...) suffer the immediate consequences of failure in the form of financial,
social and psychological costs, which can be manifested as reduction of income, depression
or diminished relationships as the result of the stigma associated with failure (Singh et al.,
2007). Moreover, counterparties are also exposed to contagion effects. The propagation
mechanism relies on direct ties between firms because counterparties could be clients,
vendors or dealers of the failed firm. This negative effect on the counterparty occurs
when a failed firm causes losses so large that they drive its counterparty into insolvency, which
in turn could cause a third default or a cluster of defaults (Das et al., 2007). In contrast, competitive
effects highlight the potential advantages of failure. On the one hand, failure can encourage
learning because individuals are more likely to develop new knowledge by drawing on prior
failure experience (Coelho and McClure, 2005). Besides, learnings from failure may foster a
range of higher-level learning capabilities. Thus, internal stakeholders make an effort to
identify and exploit new opportunities. On the other hand, business failure is beneficial because it
releases knowledge and resources from failed firms and eliminates uncompetitive firms
from the market. In this sense, firms can profit from this circumstance to hire ready-made talent
and to tap clients of the failed firms (Pe’er and Vertinsky, 2008; Isenberg, 2011).
Bearing these effects in mind, if the costs of contagion effects are too high in comparison
to the benefits from competitive effects, stakeholders may exit their business career. Thus,
further research on investigating the termination and rebuilding of relationship ties after
business failure is needed. More precisely, one should investigate potential policies to reduce
contagion effects while facilitating and fostering competitive effects.
Keeping in mind these suggested future research directions, the following panels summarize these
opportunities and implications so that they can serve as a catalyst for more research not only in
improving corporate failure prediction models but also in understanding how business failure
influences firms’ dynamics and its consequences:
Panel A: Corporate failure prediction model:
(1) Data size:
    • Opportunity:
        – is sample size relevant for models’ performance;
        – determine an efficient and effective sample size for data construction; and
        – is the optimal data size equal for all prediction methods (statistical methods,
          artificial intelligence, and ensemble methods).
    • Implications:
        – reduce the time/cost of model design; and
        – develop a more efficient model to balance the trade-off between accuracy
          and resource efficiency.
(2) Imbalanced datasets:
    • Opportunity:
        – analyze what degree of imbalance disturbs a model’s predictive performance;
        – analyze the improvement capacity of treatment techniques; and
        – can a corporate failure prediction model perform steadily regardless of an
          imbalanced data distribution.
    • Implications:
        – predict failure accurately in a real-world scenario (imbalanced datasets).
(3) Prediction over time:
    • Opportunity:
        – improve prediction models’ stability over time; and
        – represent the evolution of firms’ financial health over time.
    • Implications:
        – understand the influence of the time dimension on the business failure process;
        – improve the model prediction horizon; and
        – assess the risk associated with the firm in the mid-term.
(4) Explanatory variable:
    • Opportunity:
        – analyze variables that reflect failure factors; and
        – analyze variables that deal with unreliable financial ratios (earnings
          management).
    • Implications:
        – apply to any type of firm and prove to improve model accuracy.

Panel B: Corporate failure understanding:
(1) Failure theory:
    • Opportunity:
        – create a theoretical failure analysis based on financial distress and
          structural inertia theory.
    • Implications:
        – understand the determinants of failure.
(2) Failure causes:
    • Opportunity:
        – define strategies to avoid failure caused by internal factors; and
        – identify the possible internal failure factors with regard to firms’ characteristics
          (age, size and life cycle).
    • Implications:
        – establish a mechanism to escape from business failure; and
        – enrich our understanding of firms’ attitudes toward business failure.
(3) Failure effects:
    • Opportunity:
        – analyze the attitudes to recover after a business failure.
    • Implications:
        – governments can adopt policies to provide a second chance condition.

Notes
1. CNRS journal classification list is considered a highly prestigious reference in France to evaluate
the quality of publications. The list is intended to include top rank journals in management or in
economics (further information in https://www.gate.cnrs.fr/spip.php?rubrique31). Moreover, 16 out
of 106 papers were included in the final list of papers even though they are not listed in CNRS
journal classification. Those papers are of interest to our study and are published in computer
science journals with a Thomson-Reuters impact factor score of more than 0.5 (Applied Soft
Computing, IEEE Transactions on neural networks, Knowledge-based systems, Neurocomputing).
2. Hill et al. (2011) identify sudden bankruptcies as those that occur due to unexpected events, such
as natural disasters.
3. The Chinese national security management institution considers a firm in financial distress when
its profits continue to be negative for two consecutive years or its per share net assets are lower
than per share stock value (Sun et al., 2014).
4. Ensemble methods sometimes are divided into two sub-groups: hybrid methods and ensemble-
based techniques. A hybrid method integrates a classifier in the learning phase to select
parameters or features for another classifier that makes the prediction; an ensemble method
combines multiple methods to make a final decision. Both sub-groups are based on the
combination of classifiers, but they adopt distinct methodologies; the current study categorizes
them together in a more general group of ensemble methods. Further information about each sub-
group is available from Verikas et al. (2010).
5. For further information, see Bradley (2004).

References
Altman, E.I. (1968), “Financial ratios, discriminant analysis and the prediction of corporate
bankruptcy”, Journal of Finance, Vol. 23 No. 4, pp. 589-609.
Altman, E.I. and Sabato, G. (2007), “Modelling credit risk for SMEs: evidence from the US market”,
Abacus, Vol. 43 No. 3, pp. 332-357.
Amankwah-Amoah, J. (2015), “Where will the axe fall? An integrative framework for understanding
attributions after a business failure”, European Business Review, Vol. 27 No. 4, pp. 409-429.
Amankwah-Amoah, J. (2016), “An integrative process model of organisational failure”, Journal of
Business Research, Vol. 69 No. 9, pp. 3388-3397.
Argenti, J. (1976), Corporate Collapse: The Causes and Symptoms, McGraw-Hill, London.
Atiya, A.F. (2001), “Bankruptcy prediction for credit risk using neural networks: a survey and new
results”, IEEE Transactions on Neural Networks, Vol. 12 No. 4, pp. 929-935.
Balcaen, S. and Ooghe, H. (2006), “35 years of studies on business failure: an overview of the classic
statistical methodologies and their related problems”, The British Accounting Review, Vol. 38
No. 1, pp. 63-93.
Barniv, R., Mehrez, A. and Kline, D.M. (2000), “Confidence intervals for controlling the probability of
bankruptcy”, Omega, Vol. 28 No. 5, pp. 555-565.
Beaver, W. (1966), “Financial ratios as predictors of failure”, Journal of Accounting Research, Vol. 4,
pp. 71-111.
Beaver, W.H., McNichols, M.F. and Rhie, J.-W. (2005), “Have financial statements become less
informative? Evidence from the ability of financial ratios to predict bankruptcy”, Review of
Accounting Studies, Vol. 10 No. 1, pp. 93-122.
Bradley, D.B. (2004), “Small business: causes of bankruptcy”, Small Business Advancement National
Center, University of Central AR, College of Business Administration.
Boyacioglu, M.A., Kara, Y. and Baykan, Ö.K. (2009), “Predicting bank financial failures using neural
networks, support vector machines and multivariate statistical methods: a comparative analysis
in the sample of savings deposit insurance fund (SDIF) transferred banks in Turkey”, Expert
Systems with Applications, Vol. 36 No. 2, pp. 3355-3366.
Canbas, S., Cabuk, A. and Kilic, S.B. (2005), “Prediction of commercial bank failure via multivariate
statistical analysis of financial structures: the Turkish case”, European Journal of Operational
Research, Vol. 166 No. 2, pp. 528-546.
Carter, R. and Van Auken, H. (2006), “Small firm bankruptcy”, Journal of Small Business Management,
Vol. 44 No. 4, pp. 493-512.
Chandra, D.K., Ravi, V. and Bose, I. (2009), “Failure prediction of dotcom companies using hybrid
intelligent techniques”, Expert Systems with Applications, Vol. 36 No. 3, pp. 4830-4837.
Charalambous, C., Charitou, A. and Kaourou, F. (2000), “Comparative analysis of artificial neural
network models: application in bankruptcy prediction”, Annals of Operations Research, Vol. 99
Nos 1/4, pp. 403-425.
Charitou, A., Neophytou, E. and Charalambous, C. (2004), “Predicting corporate failure: empirical
evidence for the UK”, European Accounting Review, Vol. 13 No. 3, pp. 465-497.
Chauhan, N., Ravi, V. and Chandra, D.K. (2009), “Differential evolution trained wavelet neural
networks: application to bankruptcy prediction in banks”, Expert Systems with Applications,
Vol. 36 No. 4, pp. 7659-7665.
Chava, S. and Jarrow, R.A. (2004), “Bankruptcy prediction with industry effects”, European Finance
Review, Vol. 8 No. 4, pp. 537-569.
Chen, M.Y. (2011), “Predicting corporate financial distress based on integration of decision tree
classification and logistic regression”, Expert Systems with Applications, Vol. 38 No. 9,
pp. 11261-11272.
Chen, W.S. and Du, Y.K. (2009), “Using neural networks and data mining techniques for the financial
distress prediction model”, Expert Systems with Applications, Vol. 36 No. 2, pp. 4075-4086.
Chen, H.L., Yang, B., Wang, G., Liu, J., Xu, X., Wang, S.J. and Liu, D.-Y. (2011), “A novel bankruptcy
prediction model based on an adaptive fuzzy k-nearest neighbor method”, Knowledge-Based
Systems, Vol. 24 No. 8, pp. 1348-1359.
Cho, S., Hong, H. and Ha, B.C. (2010), “A hybrid approach based on the combination of variable
selection using decision trees and case-based reasoning using the Mahalanobis distance: for
bankruptcy prediction”, Expert Systems with Applications, Vol. 37 No. 4, pp. 3482-3488.
Chuang, C.L. (2013), “Application of hybrid case-based reasoning for enhanced performance in
bankruptcy prediction”, Information Sciences, Vol. 236, pp. 174-185.
Ciampi, F. (2015), “Corporate governance characteristics and default prediction modeling for small
enterprises: an empirical analysis of Italian firms”, Journal of Business Research, Vol. 68 No. 5,
pp. 1012-1025.
Coelho, P.R.P. and McClure, J.E. (2005), “Learning from failure”, Mid-American Journal of Business,
Vol. 20 No. 1, pp. 13-20.
Das, S., Duffie, D., Kapadia, N. and Saita, L. (2007), “Common failings: how corporate defaults are
correlated”, The Journal of Finance, Vol. 62 No. 1, pp. 93-117.
Dirickx, Y. and Van Landeghem, G. (1994), “Statistical failure prevision problems”, Tijdschrift Voor
Economie en Management, Vol. 39 No. 4, pp. 429-462.
Du Jardin, P. (2009), “Bankruptcy prediction models: how to choose the most relevant variables?”,
Bankers, Markets and Investors, Vol. 98, pp. 39-46.
Du Jardin, P. (2010), “Predicting bankruptcy using neural networks and other classification methods:
the influence of variable selection techniques on model accuracy”, Neurocomputing, Vol. 73
Nos 10/12, pp. 2047-2060.
Du Jardin, P. (2015), “Bankruptcy prediction using terminal failure processes”, European Journal of
Operational Research, Vol. 242 No. 1, pp. 286-303.
Du Jardin, P. and Séverin, E. (2011), “Predicting corporate bankruptcy using a self-organizing map: an
empirical study to improve the forecasting horizon of a financial failure model”, Decision
Support Systems, Vol. 51 No. 3, pp. 701-711.
Du Jardin, P., Veganzones, D. and Séverin, E. (2017), “Forecasting corporate bankruptcy using accrual-
based models”, Computational Economics, Vol. 54 No. 1, pp. 1-37.
Fernández, A., del Jesus, M.J. and Herrera, F. (2010), “On the 2-tuples based genetic tuning performance
for fuzzy rule based classification systems in imbalanced data-sets”, Information Sciences,
Vol. 180 No. 8, pp. 1268-1291.
Fogel, D.B. (2006), Evolutionary Computation: Toward a New Philosophy of Machine Intelligence, John
Wiley and Sons.
Foreman, R.D. (2003), “A logistic analysis of bankruptcy within the US local telecommunications
industry”, Journal of Economics and Business, Vol. 55 No. 2, pp. 135-166.
Gepp, A. and Kumar, K. (2015), “Predicting financial distress: a comparison of survival analysis and
decision tree techniques”, Procedia Computer Science, Vol. 54, pp. 396-404.
Gordini, N. (2014), “A genetic algorithm approach for SMEs bankruptcy prediction: empirical evidence
from Italy”, Expert Systems with Applications, Vol. 41 No. 14, pp. 6433-6445.
Guyon, I. and Elisseeff, A. (2003), “An introduction to variable and feature selection”, Journal of
Machine Learning Research, Vol. 3, pp. 1157-1182.
Hand, D.J. (2012), “Assessing the performance of classification methods”, International Statistical
Review, Vol. 80 No. 3, pp. 400-414.
Hannan, M.T. and Freeman, J. (1984), “Structural inertia and organizational change”, American
Sociological Review, Vol. 49 No. 2, pp. 149-164.
He, H. and Garcia, E.A. (2009), “Learning from imbalanced data”, IEEE Transactions on Knowledge and
Data Engineering, Vol. 21 No. 9, pp. 1263-1284.
Hill, N.T., Perry, S.E. and Andes, S. (2011), “Evaluating firms in financial distress: an event history
analysis”, Journal of Applied Business Research (JABR), Vol. 12 No. 3, pp. 60-71.
Horta, I.M. and Camanho, A.S. (2013), “Company failure prediction in the construction industry”, Expert
Systems with Applications, Vol. 40 No. 16, pp. 6253-6257.
Hu, Y.C. (2008), “Incorporating a non-additive decision making method into multi-layer neural
networks and its application to financial distress analysis”, Knowledge-Based Systems, Vol. 21
No. 5, pp. 383-390.
Hu, Y.C. and Ansell, J. (2009), “Retail default prediction by using sequential minimal optimization
technique”, Journal of Forecasting, Vol. 28 No. 8, pp. 651-666.
Isenberg, D. (2011), “Entrepreneurs and the cult of failure”, Harvard Business Review, Vol. 89 No. 4, p. 36.
Iturriaga, F.J.L. and Sanz, I.P. (2015), “Bankruptcy visualization and prediction using neural networks:
a study of US commercial banks”, Expert Systems with Applications, Vol. 42 No. 6, pp. 2857-2869.
Jeong, C., Min, J.H. and Kim, M.S. (2012), “A tuning method for the architecture of neural network
models incorporating GAM and GA as applied to bankruptcy prediction”, Expert Systems with
Applications, Vol. 39 No. 3, pp. 3650-3658.
Joos, P., De Bourdeaudhuij, C. and Ooghe, H. (1995), “Financial distress models in Belgium: the results of a
decade of empirical research”, The International Journal of Accounting, Vol. 30 No. 3, pp. 245-274.
Kainulainen, L., Miche, Y., Eirola, E., Yu, Q., Frénay, B., Séverin, E. and Lendasse, A. (2014),
“Ensembles of local linear models for bankruptcy analysis and prediction”, Case Studies in
Business Industry and Government Statistics, Vol. 4 No. 2, pp. 116-133.
Kaski, S., Sinkkonen, J. and Peltonen, J. (2001), “Bankruptcy analysis with self-organizing maps in
learning metrics”, IEEE Transactions on Neural Networks, Vol. 12 No. 4, pp. 936-947.
Keasey, K. and Watson, R. (1991), “Financial distress models: a review of their usefulness”, British
Journal of Management, Vol. 2 No. 2, pp. 89-102.
Kim, S.Y. and Upneja, A. (2014), “Predicting restaurant financial distress using decision tree and
AdaBoosted decision tree models”, Economic Modelling, Vol. 36, pp. 354-362.
Kolari, J., Glennon, D., Shin, H. and Caputo, M. (2002), “Predicting large US commercial bank failures”,
Journal of Economics and Business, Vol. 54 No. 4, pp. 361-387.
Korol, T. (2013), “Early warning models against bankruptcy risk for Central European and Latin
American enterprises”, Economic Modelling, Vol. 31, pp. 22-30.
Kücher, A., Mayr, S., Mitter, C., Duller, C. and Feldbauer-Durstmüller, B. (2018), “Firm age dynamics
and causes of corporate bankruptcy: age dependent explanations for business failure”, Review of
Managerial Science, pp. 1-29, available at: https://doi.org/10.1007/s11846-018-0303-2
Kumar, P.R. and Ravi, V. (2007), “Bankruptcy prediction in banks and firms via statistical and intelligent
techniques – a review”, European Journal of Operational Research, Vol. 180 No. 1, pp. 1-28.
Lee, S. and Choi, W.S. (2013), “A multi-industry bankruptcy prediction model using back-propagation
neural network and multivariate discriminant analysis”, Expert Systems with Applications,
Vol. 40 No. 8, pp. 2941-2946.
Li, H. and Sun, J. (2008), “Ranking-order case-based reasoning for financial distress prediction”,
Knowledge-Based Systems, Vol. 21 No. 8, pp. 868-878.
Li, M.Y.L. and Miu, P. (2010), “A hybrid bankruptcy prediction model with dynamic loadings on
accounting-ratio-based and market-based information: a binary quantile regression approach”,
Journal of Empirical Finance, Vol. 17 No. 4, pp. 818-833.
Liang, D., Tsai, C.F. and Wu, H.T. (2015), “The effect of feature selection on financial distress
prediction”, Knowledge-Based Systems, Vol. 73, pp. 289-297.
Lin, T.H. (2009), “A cross model study of corporate financial distress prediction in Taiwan: multiple
discriminant analysis, logit, probit and neural networks models”, Neurocomputing, Vol. 72
Nos 16/18, pp. 3507-3516.
Lopez, V., Fernández, A., García, S., Palade, V. and Herrera, F. (2013), “An insight into classification
with imbalanced data: empirical results and current trends on using data intrinsic
characteristics”, Information Sciences, Vol. 250, pp. 113-141.
McKee, T.E. and Greenstein, M. (2000), “Predicting bankruptcy using recursive partitioning and a
realistically proportioned data set”, Journal of Forecasting, Vol. 19 No. 3, pp. 219-230.
Mare, D.S. (2015), “Contribution of macroeconomic factors to the prediction of small bank failures”,
Journal of International Financial Markets, Institutions and Money, Vol. 39, pp. 25-39.
Micha, B. (1984), “Analysis of business failures in France”, Journal of Banking and Finance, Vol. 8 No. 2,
pp. 281-291.
Min, J.H. and Lee, Y.C. (2005), “Bankruptcy prediction using support vector machine with optimal choice
of kernel function parameters”, Expert Systems with Applications, Vol. 28 No. 4, pp. 603-614.
Murray, G.D. (1977), “A cautionary note on selection of variables in discriminant analysis”, Applied
Statistics, Vol. 26 No. 3, pp. 246-250.
Pal, R., Kupka, K., Aneja, A.P. and Militky, J. (2016), “Business health characterization: a hybrid regression
and support vector machine analysis”, Expert Systems with Applications, Vol. 49, pp. 48-59.
Park, C.S. and Han, I. (2002), “A case-based reasoning with the feature weights derived by analytic hierarchy
process for bankruptcy prediction”, Expert Systems with Applications, Vol. 23 No. 3, pp. 255-264.
Pe’er, A. and Vertinsky, I. (2008), “Firm exits as a determinant of new entry: is there evidence of local
creative destruction?”, Journal of Business Venturing, Vol. 23 No. 3, pp. 280-306.
Platt, H.D. and Platt, M.B. (2002), “Predicting corporate financial distress: reflections on choice-based
sample bias”, Journal of Economics and Finance, Vol. 26 No. 2, pp. 184-199.
Quintana, D., Saez, Y., Mochon, A. and Isasi, P. (2007), “Early bankruptcy prediction using ENPC”,
Applied Intelligence, Vol. 29 No. 2, pp. 157-161.
Raeder, T., Forman, G. and Chawla, N. (2012), “Learning from imbalanced data: evaluation matters”, in
Holmes D.E. and Jain L.C. (Eds), Data Mining: Foundations and Intelligent Paradigms, Springer-
Verlag, Berlin Heidelberg.
Ravi, V. and Pramodh, C. (2008), “Threshold accepting trained principal component neural network and
feature subset selection: application to bankruptcy prediction in banks”, Applied Soft
Computing, Vol. 8 No. 4, pp. 1539-1548.
Scott, J. (1981), “The probability of bankruptcy: a comparison of empirical predictions and theoretical
models”, Journal of Banking and Finance, Vol. 5 No. 3, pp. 317-344.
Serrano-Cinca, C. and Gutiérrez-Nieto, B. (2013), “Partial least square discriminant analysis for
bankruptcy prediction”, Decision Support Systems, Vol. 54 No. 3, pp. 1245-1255.
Shin, K.S., Lee, T.S. and Kim, H.J. (2005), “An application of support vector machines in bankruptcy
prediction model”, Expert Systems with Applications, Vol. 28 No. 1, pp. 127-135.
Singh, S., Corner, P. and Pavlovich, K. (2007), “Coping with entrepreneurial failure”, Journal of
Management and Organization, Vol. 13 No. 4, pp. 331-344.
Sun, J. and Li, H. (2009), “Financial distress prediction based on serial combination of multiple
classifiers”, Expert Systems with Applications, Vol. 36 No. 4, pp. 8659-8666.
Sun, J. and Li, H. (2012), “Financial distress prediction using support vector machines: ensemble vs
individual”, Applied Soft Computing, Vol. 12 No. 8, pp. 2254-2265.
Sun, J., Jia, M.Y. and Li, H. (2011), “AdaBoost ensemble for financial distress prediction: an empirical
comparison with data from Chinese listed companies”, Expert Systems with Applications, Vol. 38
No. 8, pp. 9305-9312.
Sun, J., Li, H., Huang, Q.H. and He, K.Y. (2014), “Predicting financial distress and corporate failure: a
review from the state-of-the-art definitions, modeling, sampling, and featuring approaches”,
Knowledge-Based Systems, Vol. 57, pp. 41-56.
Sun, L. and Shenoy, P.P. (2007), “Using Bayesian networks for bankruptcy prediction: some
methodological issues”, European Journal of Operational Research, Vol. 180 No. 2, pp. 738-753.
Tang, T.-C. and Chi, L.-C. (2005), “Neural networks analysis in business failure prediction of Chinese
importers: a between-countries approach”, Expert Systems with Applications, Vol. 29 No. 2, pp. 244-255.
Tian, S., Yu, Y. and Zhou, M. (2015), “Data sample selection issues for bankruptcy prediction”, Risk,
Hazards and Crisis in Public Policy, Vol. 6 No. 1, pp. 91-116.
Tinoco, M.H. and Wilson, N. (2013), “Financial distress and bankruptcy prediction among listed
companies using accounting, market and macroeconomic variables”, International Review of
Financial Analysis, Vol. 30, pp. 394-419.
Tobback, E., Bellotti, T., Moeyersoms, J., Stankova, M. and Martens, D. (2017), “Bankruptcy prediction
for SMEs using relational data”, Decision Support Systems, Vol. 102, pp. 69-81.
Tsai, C.F. (2009), “Feature selection in bankruptcy prediction”, Knowledge-Based Systems, Vol. 22 No. 2,
pp. 120-127.
Tsai, C.F. and Hsu, Y.F. (2013), “A meta-learning framework for bankruptcy prediction”, Journal of
Forecasting, Vol. 32 No. 2, pp. 167-179.
Tseng, F.M. and Hu, Y.C. (2010), “Comparing four bankruptcy prediction models: logit, quadratic
interval logit, neural and fuzzy neural networks”, Expert Systems with Applications, Vol. 37
No. 3, pp. 1846-1853.
Tseng, F.M. and Lin, L. (2005), “A quadratic interval logit model for forecasting bankruptcy”, Omega,
Vol. 33 No. 1, pp. 85-91.
Tucker, J. (1996), “Neural networks versus logistic regression in financial modelling: a methodological
comparison”, Proceedings of the 1996 World First Online Workshop on Soft Computing (WSC1),
Nagoya University, August 19-30.
Verikas, A., Kalsyte, Z., Bacauskiene, M. and Gelzinis, A. (2010), “Hybrid and ensemble-based soft computing
techniques in bankruptcy prediction: a survey”, Soft Computing, Vol. 14 No. 9, pp. 995-1010.
Volkov, A., Benoit, D.F. and Van den Poel, D. (2017), “Incorporating sequential information in
bankruptcy prediction with predictors based on Markov for discrimination”, Decision Support
Systems, Vol. 98, pp. 59-68.
Wang, D., Song, X., Yin, W. and Yuan, J. (2015), “Forecasting core business transformation risk using
the optimal rough set and the neural network”, Journal of Forecasting, Vol. 34 No. 6, pp. 478-491.
Wu, C.Y. (2004), “Using non-financial information to predict bankruptcy: a study of public companies
in Taiwan”, International Journal of Management, Vol. 21 No. 2, p. 194.
Xiao, Z., Yang, X., Pang, Y. and Dang, X. (2012), “The prediction for listed companies’ financial distress
by using multiple prediction methods with rough set and Dempster–Shafer evidence theory”,
Knowledge-Based Systems, Vol. 26, pp. 196-206.
Xu, X. and Wang, Y. (2009), “Financial failure prediction using efficiency as a predictor”, Expert
Systems with Applications, Vol. 36 No. 1, pp. 366-373.
Xu, W., Xiao, Z., Dang, X., Yang, D. and Yang, X. (2014), “Financial ratio selection for business failure
prediction using soft set theory”, Knowledge-Based Systems, Vol. 63, pp. 59-67.
Yeh, C.C., Chi, D.J. and Hsu, M.F. (2010), “A hybrid approach of DEA, rough set and support vector machines
for business failure prediction”, Expert Systems with Applications, Vol. 37 No. 2, pp. 1535-1541.
Yu, Q., Miche, Y., Séverin, E. and Lendasse, A. (2014), “Bankruptcy prediction using extreme learning
machine and financial expertise”, Neurocomputing, Vol. 128, pp. 296-302.
Zavgren, C.V. (1985), “Assessing the vulnerability to failure of American industrial firms: a logistic
analysis”, Journal of Business Finance and Accounting, Vol. 12 No. 1, pp. 19-45.
Zhou, L. (2013), “Performance of corporate bankruptcy prediction models on imbalanced dataset: the
effect of sampling methods”, Knowledge-Based Systems, Vol. 41, pp. 16-25.
Zmijewski, M.E. (1984), “Methodological issues related to the estimation of financial distress prediction
models”, Journal of Accounting Research, Vol. 22, pp. 59-82.

About the authors


David Veganzones is an Assistant Professor of Finance at ESCE International Business School. He
obtained his PhD in Management and Finance at the Institut d’Administration des Entreprises
(IAE), University of Lille, Lille, France. He is interested in various domains of bankruptcy prediction
and the application of machine learning to corporate finance. David Veganzones is the corresponding
author and can be contacted at: david.veganzones@gmail.com
Eric Severin is a Professor of Finance at IAE Lille (University of Lille) and a specialist in
corporate finance. His research interests are twofold: bankruptcy prediction and the relationship
between economics and finance.

