Expert Systems With Applications: Wei-Sen Chen, Yin-Kuan Du

This paper discusses the development of a financial distress prediction model using artificial neural networks (ANN) and data mining techniques. It highlights the limitations of traditional statistical methods and demonstrates that the ANN approach yields better prediction accuracy, achieving an 82.14% correct percentage for predicting financial distress two seasons prior to its occurrence. The study emphasizes the importance of timely financial information for investors and proposes an AI-based methodology as a more effective alternative for predicting potential financial crises in companies.

Expert Systems with Applications 36 (2009) 4075–4086

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: [Link]/locate/eswa

Using neural networks and data mining techniques for the financial distress prediction model
Wei-Sen Chen *, Yin-Kuan Du
Industrial Technology Research Institute, #195, Sec. 4, Chung-Hsing Rd., Chutung 310, HsinChu, Taiwan, ROC

Article info

Keywords:
Financial distress prediction model
Artificial neural network
Data mining

Abstract

The operating status of an enterprise is disclosed periodically in a financial statement. As a result, investors usually only get information about the financial distress a company may be in after the formal financial statement has been published. If company executives intentionally package financial statements with the purpose of hiding the actual status of the company, then investors will have even less chance of obtaining the real financial information. For example, a company can manipulate its current ratio by up to 200% so that its liquidity deficiency will not show up as a financial distress in the short run. To improve the accuracy of the financial distress prediction model, this paper adopted the operating rules of the Taiwan stock exchange corporation (TSEC) which were violated by those companies that were subsequently stopped and suspended, as the range of the analysis of this research. In addition, this paper also used financial ratios, other non-financial ratios, and factor analysis to extract adaptable variables. Moreover, the artificial neural network (ANN) and data mining (DM) techniques were used to construct the financial distress prediction model. The empirical experiment with a total of 37 ratios and 68 listed companies as the initial samples obtained a satisfactory result, which testifies to the feasibility and validity of our proposed methods for the financial distress prediction of listed companies.

This paper makes four critical contributions: (1) The more factor analysis we used, the less accuracy we obtained by the ANN and DM approach. (2) The closer we get to the actual occurrence of financial distress, the higher the accuracy we obtain, with an 82.14% correct percentage for two seasons prior to the occurrence of financial distress. (3) Our empirical results show that factor analysis increases the error of classifying companies that are in a financial crisis as normal companies. (4) By developing a financial distress prediction model, the ANN approach obtains better prediction accuracy than the DM clustering approach. Therefore, this paper proposes that the artificial intelligence (AI) approach could be a more suitable methodology than traditional statistics for predicting the potential financial distress of a company.

Crown Copyright © 2008 Published by Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +886 3 5820100; fax: +886 3 5610616.
E-mail address: wschen@[Link] (W.-S. Chen).

0957-4174/$ - see front matter Crown Copyright © 2008 Published by Elsevier Ltd. All rights reserved.
doi:10.1016/[Link].2008.03.020

1. Introduction

In Taiwan, domestic and foreign capital markets have developed rapidly in recent years, gradually giving people the idea of making a financial investment. There are various financial investment objects, such as stocks, futures, options, bond funds etc., and stock investment is the most widely accepted in society. However, capital markets are volatile, and most investors only know that a company is in financial trouble after the financial statement of the company has been made public. Therefore, forecasting corporate financial distress plays an increasingly important role in today's society since it has a significant impact on lending decisions and the profitability of financial institutions. The ability to make accurate bankruptcy predictions is of critical importance to various professionals, such as bank loan officers, creditors, stockholders, bondholders, financial analysts, governmental officials, as well as the general public, as it provides them with timely warnings (Ko & Lin, 2006).

Financial failure occurs when a firm suffers chronic and serious losses or when the firm becomes insolvent with liabilities that are disproportionate to its assets (Hua, Wang, Xu, Zhang, & Liang, 2007). Common causes and symptoms of financial failure include lack of financial knowledge, failure to set capital plans, poor debt management, inadequate protection against unforeseen events and difficulties in adhering to proper operating discipline in the financial market. The common assumption underlying bankruptcy prediction is that a firm's financial statements appropriately reflect the above characteristics. Several classification techniques have been suggested to predict financial distress using ratios and data originating from these financial statements, e.g., univariate approaches (Beaver, 1966), multivariate approaches, linear multiple

discriminant approaches (MDA) (Altman, 1968; Altman, Edward, Haldeman, & Narayanan, 1977), multiple regression (Meyer & Pifer, 1970), logistic regression (Dimitras, Zanakis, & Zopounidis, 1996), factor analysis (Blum, 1974), and stepwise (Laitinen & Laitinen, 2000). However, strict assumptions of traditional statistics such as linearity, normality, independence among predictor variables and pre-existing functional form relating to the criterion variable and the predictor variable limit their application in the real world (Hua et al., 2007).

With radical changes taking place in corporate finance and the global economic environment, critical financial ratios can change dynamically (John & Robert, 2001). This means that it is both important and necessary to develop an evolutionary approach for coping with future dynamic financial environments. Therefore, this paper proposes a model of financial distress prediction integrating artificial neural network (ANN) and data mining (DM) techniques. The main objectives of this paper are to (1) adopt ANN and DM techniques to construct a financial distress prediction model, (2) use financial and non-financial ratios to enhance the accuracy of the financial distress prediction model, (3) employ a traditional statistical method (factor analysis) to compare the degree of accuracy with that of the artificial intelligence (AI) approach, and (4) expand this model so that it will work within a financial distress prediction system to provide information to investors as well as investment monitoring organizations. The data for our experiment were collected from the Taiwan stock exchange corporation (TSEC) database.

The rest of this paper is organized as follows. A literature review of related studies is provided in Section 2. Section 3 describes our proposed approach and the functionalities of each process. Section 4 presents the process for selecting suitable indicators by factor analysis. To prove the prediction performance of our approach, we carried out several experiments which are described in Section 5. In Section 6, we compare the results of the ANN and DM approaches. Finally, in Section 7 we draw our conclusions about financial distress forecasting and discuss future work.

2. Literature review

2.1. Artificial neural network

The ANN is composed of richly interconnected non-linear nodes that communicate in parallel. The connection weights are modifiable, allowing ANN to learn directly from examples without requiring or providing an analytical solution to the problem. The most popular forms of learning are:

• Supervised learning: Patterns for which both their inputs and outputs are known are presented to the ANN. The task of the supervised learner is to predict the value of the function for any valid input object after having seen a number of training examples. ANN employing supervised learning has been widely utilized for the solution of function approximation and classification problems.
• Unsupervised learning: Patterns are presented to the ANN in the form of feature values. It is distinguished from supervised learning by the fact that there is no a priori output. ANN employing unsupervised learning has been successfully employed for data mining and classification tasks. The self-organizing map (SOM) and adaptive resonance theory (ART) constitute the most popular exemplars of this class.

A back-propagation network (BPN) is a neural network that uses a supervised learning method and feed-forward architecture. A BPN is one of the most frequently utilized neural network techniques for classification and prediction (Wu, Yang, & Liang, 2006), and is considered an advanced multiple regression analysis that can accommodate complex and non-linear data relationships (Jost, 1993). It was first described by Werbos (1974), and further developed by Ronald, Rumelhart, and Hinton (1986). The details for the back-propagation learning algorithm can be found in Medsker and Liebowitz (1994).

Fig. 1 shows the $l \times m \times n$ ($l$ denotes input neurons, $m$ denotes hidden neurons, and $n$ denotes output neurons) architecture of a BPN model (Panda, Chakraborty, & Pal, 2007). The input layer can be considered the model stimuli and the output layer the input stimuli outcome. The hidden layer determines the mapping relationships between input and output layers, whereas the relationships between neurons are stored as weights of the connecting links. The input signals are modified by the interconnection weight, known as weight factor $w_{ji}$, which represents the interconnection of the $i$th node of the first layer to the $j$th node of the second layer. The sum of the modified signals (total activation) is then modified by a sigmoid transfer function ($f$). Similarly, the output signals of the hidden layer are modified by interconnection weight $w_{kj}$ of the $k$th node of the output layer to the $j$th node of the hidden layer. The sum of the modified signals is then modified by the sigmoid transfer function ($f$) and the output is collected at the output layer.

Fig. 1. Back-propagation network architecture.

Let $I_p = (I_{p1}, I_{p2}, \ldots, I_{pl})$, $p = 1, 2, \ldots, N$, be the $p$th pattern among $N$ input patterns, and let $w_{ji}$ and $w_{kj}$ be the connection weights between the $i$th input neuron and the $j$th hidden neuron, and between the $j$th hidden neuron and the $k$th output neuron, respectively (Panda et al., 2007).

Output from a neuron in the input layer is

$$O_{pi} = I_{pi}, \quad i = 1, 2, \ldots, l \tag{1}$$

Output from a neuron in the hidden layer is

$$O_{pj} = f(\mathrm{NET}_{pj}) = f\left(\sum_{i=0}^{l} w_{ji} O_{pi}\right), \quad j = 1, 2, \ldots, m \tag{2}$$

Output from a neuron in the output layer is

$$O_{pk} = f(\mathrm{NET}_{pk}) = f\left(\sum_{j=0}^{m} w_{kj} O_{pj}\right), \quad k = 1, 2, \ldots, n \tag{3}$$

where $f(\cdot)$ is the sigmoid transfer function given by $f(x) = 1/(1 + e^{-x})$.

BPN has been applied to various areas, such as investigating long-term tidal predictions (Lee, 2004), improving customer satisfaction (Deng, Chen, & Pei, 2007), predicting flank wear in drills (Panda et al., 2007), enhancing job completion time prediction in the semiconductor fabrication factory (Chen, 2007), and providing
the required accuracy for focal ventricular arrhythmias diagnosis (Yılmaz & Cunedioglu, 2007).

Based on the above literature, many studies have employed BPN techniques in a variety of applications. However, few of them used BPN to carry out empirical investigations of financial distress prediction. Therefore, in this study we use the BPN technique to forecast a potential crisis in the bankruptcy prediction domain. We hope that the results of our proposed approach will provide a useful methodology for investors as well as supervisory organizations to predict, and avoid investing in, a company that may go bankrupt in the near future.

2.2. Data mining

Data mining (DM), also known as "knowledge discovery in databases" (KDD), is the process of discovering meaningful patterns in huge databases (Han & Kamber, 2001). In addition, it is also an application that can provide significant competitive advantages for making the right decision (Huang, Chen, & Lee, 2007). DM is an explorative and complicated process involving multiple iterative steps. Fig. 2 shows an overview of the data mining process (Han & Kamber, 2001). It is interactive and iterative, involving the following steps:

• Step 1. Application domain identification: Investigate and understand the application domain and the relevant prior knowledge. In addition, identify the goal of the KDD from the administrators' or users' point of view.
• Step 2. Target dataset selection: Select a suitable dataset, or focus on a subset of variables or data samples where data relevant to the analysis task are retrieved from the database.
• Step 3. Data preprocessing: The DM basic operations include 'data clean' and 'data reduction'. In the 'data clean' process, we remove the noise data, or respond to the missing data field. In the 'data reduction' process, we reduce the unnecessary dimensionality or adopt useful transformation methods. The primary objective is to improve the effective number of variables under consideration.
• Step 4. Data mining: This is an essential process, where AI methods are applied in order to search for meaningful or desired patterns in a particular representational form, such as association rule mining, classification trees, and clustering techniques.
• Step 5. Knowledge extraction: Based on the above steps it is possible to visualize the extracted patterns or visualize the data depending on the extraction models. Besides, this process also checks for or resolves any potential conflicts with previously believed knowledge.
• Step 6. Knowledge application: Here, we apply the found knowledge directly into the current application domain or in other fields for further action.
• Step 7. Knowledge evaluation: Here, we identify the most interesting patterns representing knowledge, based on some measure of interest. Moreover, it allows us to improve the accuracy and efficiency of the mined knowledge.

Fig. 2. Data mining phases (Han & Kamber, 2001): application domain identification, target dataset selection, data preprocessing, data mining, knowledge extraction, knowledge applying, and knowledge evaluation.

A particular data mining algorithm is usually an instantiation of the model preference search components. The more common model functions in the current data mining process include the following (Mitra, Pal, & Mitra, 2002).

• Classification: Classifies a data item into one of several predefined categories.
• Regression: Maps a data item to a real-valued prediction variable.
• Clustering: Maps a data item into a cluster, where clusters are natural groupings of data items based on similarity metrics or probability density models.
• Association rules: Describes association relationships among different attributes.
• Summarization: Provides a compact description for a subset of data.
• Dependency modeling: Describes significant dependencies among variables.
• Sequence analysis: Models sequential patterns, like time-series analysis. The goal is to model the state of the process generating the sequence or to extract and report deviations and trends over time.
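As one illustration of the clustering function listed above, the following Python sketch groups data items by similarity. It is a minimal sketch under stated assumptions: k-means is used only because it is a familiar partitioning method (no specific algorithm is named at this point in the paper), Euclidean distance stands in for the similarity metric, and the function name `kmeans` and the toy two-ratio data are hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Partition points into len(centroids) clusters, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster with the nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its current members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:  # no centroid moved, so the partition is stable
            break
        centroids = updated
    return labels, centroids


# Toy data: two made-up ratios per firm (illustrative values, not from the paper).
firms = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(firms, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No class labels are supplied to the routine, which is the practical difference from classification: the groups emerge from the similarity metric and the starting centroids alone.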


In the recent past many research contributions have applied data mining techniques to many applications. DM has been successfully applied to several financial problem domains. Recent examples are as follows. Huang, Hsu, and Wang (2007) adopted the time-series mining approach to simulate human intelligence and discover financial database patterns automatically. Kirkos, Spathis, and Manolopoulos (2007) used classification mining to identify fraudulent financial statements. Chun and Park (2006) integrated regression analysis and case-based reasoning for predicting the stock market index. However, few of these studies focused on the data clustering approach, and even fewer empirical investigations were made of financial distress prediction. Therefore, we use data clustering to enhance the accuracy of predicting bankruptcy in a capital market.

3. Research methodology

In this study we integrate ANN and DM techniques for financial distress prediction (FDP). The research methodology is shown in Fig. 3. In the first phase we deal with the dataset, which is the original large set of records from the TSEC and which is subjected to data pre-processing. The data sets undergo cleaning and preprocessing to remove discrepancies and inconsistencies and so improve their quality. The goal in this phase is to select the suitable indicators, including financial and non-financial ratios, by means of factor analysis. After the above processes, the next phase loads these indicators and discovers prediction rule sets that are ready to be used in ANN and DM clustering. The "FDP Selecting" phase is discussed in detail in the following sections.

In the FDP Modeling phase we collect the financial statement data sets for ANN and DM processing. In the ANN approach, we use the BPN algorithm to discover the rules and predict the FDP. In the DM approach, we use the clustering technique to classify and predict the FDP. Next, the selected data set is analyzed by applying algorithms in order to identify the patterns among the data that represent a relationship. The BPN and clustering algorithms are applied separately to determine the financial distress prediction patterns or rules.

In the FDP Comparison phase, we compare the prediction accuracy of BPN and clustering mining across repeated rounds of factor analysis (non-factor analysis, 1st factor analysis, and 2nd factor analysis). Then, the intelligent financial distress prediction model is constructed and initiated to validate the new data sets of the financial statement from the TSEC.

Fig. 3. Research methodology.

4. The FDP selecting phase

4.1. Data

Our sample contained data from 68 Taiwan firms listed in the TSEC. The period of sampling was from January 1999 through October 2006, amounting to 7 years and 10 months. The 34 firms in financial distress were matched with 34 non-bankruptcy firms. These firms were characterized as non-bankruptcy based on the absence of any indication or proof concerning the issuing of financial distress in the auditors' reports, in the financial and taxation databases and in the TSEC. This of course did not guarantee that the financial statements of these firms were not falsified or that the financial distress of these firms would not be revealed in the future. It only guaranteed that no firms in financial distress had been found during an extensive search. All the variables used in the sample were extracted from formal financial statements, such as balance sheets and income statements. This implies that the usefulness of this study is not restricted by the fact that only data from Taiwanese companies was used.

4.2. Variables

The selection of variables to be used as candidates for participation in the input vector was based upon prior research work linked to the topic of financial distress prediction. The work carried out by Kirkos et al. (2007), Spathis (2002), Spathis, Doumpos, and Zopounidis (2002), Fanning and Cogger (1998), Persons (1995), Stice (1991), Feroz, Park, and Pastena (1991), Loebbecke, Eining, and Willingham (1989) and Kinney and McDaniel (1989) contained the suggested indicators of financial distress prediction. Therefore, this paper adopted the related variables based on prior research, the Taiwanese Economic Journal (TEJ), and the Taiwanese economic database. Moreover, this paper selected 37 variables and categorized them as six major types: earning ability, financial structure ability, management efficiency ability, management


performance, debt-repaying ability, and non-financial factors. The indicators belonging to each type are listed as follows:

• Earning ability: Including pretax margin, return on total assets, return on equity, earnings per share, and gross margin ratios.
• Financial structure ability: Including debt to assets, times interest earned, book value per share, financial leverage ratio, debt to equity, short term and long term debt to book value ratio, fixed assets to total assets ratio, gross margin to total assets ratio, inventory to total assets ratio, inventory to sales ratio, investment ratio, and current assets to total assets ratios.
• Management efficiency ability: Including turnover rate of inventory, turnover rate of account receivable, turnover rate of fixed assets, turnover rate of total assets, turnover rate of equity, and turnover rate of working capital ratios.
• Management performance: Including pretax margin growth ratio, gross margin growth ratio, and sales growth ratio.
• Debt-repaying ability: Including current ratio, acid-test ratio, cash ratio, cash flow ratio, cash flow to long term debt, cash flow to total debt, and cash flow to short term and long term debt ratio.
• Non-financial factors: Including dividend payout ratio, price-book ratio, the proportion of collateralized shares by the board of directors, and the insider holding ratio.

4.3. Factor analysis

This paper collected the samples of 34 pairs of financial distress and non-bankruptcy firms listed in the TSEC, between 1999 and 2006. The main variables are 37 ratios for the predictive financial distress model factors. This research used the SPSS statistical software to conduct factor analysis and principal component analysis (PCA) with varimax rotation (VARIMAX), in order to make the factor structure easier and simpler to explain. Factors and variables were selected following Kaiser's criterion: a factor with an eigenvalue greater than 1 is treated as a common factor, and a variable is considered suitable when the absolute value of its factor loading is greater than 0.5 and its communality is greater than 0.8.

In total, we compiled 33 financial ratios and 4 non-financial ratios. In an attempt to reduce dimensionality, we ran a factor analysis to test whether the differences between these 37 variables were significant for each variable. If the difference was not significant (low factor loadings or communality values), the variable was considered to be non-informative. Table 1 shows the factor loadings, communality, the eigenvalues and the explained variance for each variable. As a result, 18 variables presented high factor loadings or communality values. These variables were chosen to be used in the input vector, while the remaining 19 variables were discarded. In addition, the total explained variance was 75.776%.

Table 1. 1st Factor analysis results

| Factors | Variables | Factor loadings | Communality | Eigenvalues | Explained variance |
|---|---|---|---|---|---|
| 1 | Earnings per share (EPS) | 0.870 | 0.919 | 4.023 | 10.874 |
| | Return on equity (ROE) | 0.862 | 0.889 | | |
| | Return on asset (ROA) | 0.850 | 0.866 | | |
| | Pretax margin growth ratio | 0.641 | 0.524 | | |
| | Margin before interest and tax (BEFM) | 0.638 | 0.814 | | |
| 2 | Current ratio | 0.762 | 0.877 | 3.858 | 10.428 |
| | Acid-test ratio | 0.742 | 0.833 | | |
| | Equity per share | 0.631 | 0.742 | | |
| | Cash ratio | 0.624 | 0.503 | | |
| | Gross margin ratio | 0.609 | 0.660 | | |
| | Price-book ratio (PBR) | 0.352 | 0.462 | | |
| 3 | Gearing ratio | 0.949 | 0.969 | 3.738 | 10.103 |
| | Debt to equity ratio (DEBE) | 0.948 | 0.968 | | |
| | Debt/equity (DE) | 0.923 | 0.962 | | |
| | Debt ratio | 0.625 | 0.820 | | |
| 4 | Turnover rate of total assets | 0.858 | 0.824 | 2.886 | 7.800 |
| | Turnover rate of equity | 0.798 | 0.793 | | |
| | Turnover rate of fixed assets | 0.635 | 0.803 | | |
| | Gross margin to total assets ratio | 0.479 | 0.716 | | |
| 5 | Inventory to total assets ratio | 0.899 | 0.889 | 2.558 | 6.912 |
| | Inventory to sales ratio | 0.848 | 0.802 | | |
| | Current assets to total assets | 0.578 | 0.871 | | |
| | The proportion of collateralized shares by the board of directors | 0.422 | 0.397 | | |
| 6 | Cash flow ratio | 0.859 | 0.873 | 2.476 | 6.693 |
| | Cash flow to total debt ratio | 0.830 | 0.823 | | |
| | Dividend payout ratio | 0.514 | 0.579 | | |
| 7 | Insider holding ratio | 0.756 | 0.635 | 2.039 | 5.510 |
| | Investment ratio | 0.635 | 0.755 | | |
| | Fixed assets to total assets ratio | 0.607 | 0.772 | | |
| 8 | Times interest earned | 0.836 | 0.778 | 2.012 | [missing] |
| | Cash flow to long term debt | 0.827 | 0.732 | | |
| 9 | Turnover rate of working capital | 0.788 | 0.777 | [missing] | 4.571 |
| | Turnover rate of inventory | 0.714 | 0.736 | | |
| 10 | Turnover rate of account receivable | 0.782 | 0.693 | 1.646 | 4.450 |
| | Gross margin growth ratio | 0.665 | 0.526 | | |
| | Sales revenue growth ratio | 0.548 | 0.641 | | |
| 11 | Cash flow to short term and long term debt ratio | 0.871 | 0.813 | 1.110 | 2.999 |
| Total explained variance | | | | | 75.776 |

We used the factor analysis to process the experiment a second time. Table 2 shows that 5 variables were discarded, and that the total explained variance was 85.288%. Due to the better performance in the total explained variance value, we can assume that the factor analysis is not yet the optimal solution. Therefore, we used the factor analysis to process the experiment a third time. Table 3 shows that two variables were discarded, and that the total explained variance was 91.876%. Therefore we used the factor analysis to process the experiment a fourth time. However, Table 4 shows there were no suitable variables to be discarded, and the total explained variance was down to 88.228%. Therefore, we can be sure that the optimal factor analysis was the one we carried out the third time, where the performance was the highest at 91.876%.
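The selection rule described in this section can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure (they used SPSS): it assumes the three thresholds are applied jointly to every variable, the names `TABLE1_ROWS` and `select_variables` are hypothetical, and the rows are transcribed from Factors 1 and 2 of Table 1 only.

```python
# Rows transcribed from Factors 1 and 2 of Table 1:
# (variable, factor loading, communality, eigenvalue of the variable's factor)
TABLE1_ROWS = [
    ("Earnings per share (EPS)", 0.870, 0.919, 4.023),
    ("Return on equity (ROE)", 0.862, 0.889, 4.023),
    ("Return on asset (ROA)", 0.850, 0.866, 4.023),
    ("Pretax margin growth ratio", 0.641, 0.524, 4.023),
    ("Margin before interest and tax (BEFM)", 0.638, 0.814, 4.023),
    ("Current ratio", 0.762, 0.877, 3.858),
    ("Acid-test ratio", 0.742, 0.833, 3.858),
    ("Equity per share", 0.631, 0.742, 3.858),
    ("Cash ratio", 0.624, 0.503, 3.858),
    ("Gross margin ratio", 0.609, 0.660, 3.858),
    ("Price-book ratio (PBR)", 0.352, 0.462, 3.858),
]


def select_variables(rows, min_eigenvalue=1.0, min_loading=0.5, min_communality=0.8):
    """Keep a variable only if its factor has eigenvalue > 1 and the variable has
    |factor loading| > 0.5 and communality > 0.8 (assumed to apply jointly)."""
    return [
        name
        for name, loading, communality, eigenvalue in rows
        if eigenvalue > min_eigenvalue
        and abs(loading) > min_loading
        and communality > min_communality
    ]


selected = select_variables(TABLE1_ROWS)
```

Under this reading, six of the eleven rows pass (EPS, ROE, ROA, BEFM, current ratio and acid-test ratio), and those six are among the variables that reappear in Table 2. Pretax margin growth ratio, for example, is dropped because its communality of 0.524 is below 0.8 even though its loading exceeds 0.5.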
Table 2. 2nd Factor analysis results

| Factors | Variables | Factor loadings | Communality | Eigenvalues | Explained variance |
|---|---|---|---|---|---|
| 1 | Return on asset (ROA) | 0.913 | 0.904 | 3.497 | 19.427 |
| | Return on equity (ROE) | 0.903 | 0.923 | | |
| | Earnings per share (EPS) | 0.897 | 0.897 | | |
| | Margin before interest and tax (BEFM) | 0.759 | 0.717 | | |
| 2 | Gearing ratio | 0.962 | 0.978 | 3.492 | 19.401 |
| | Debt to equity ratio | 0.961 | 0.976 | | |
| | Debt/equity (DE) | 0.939 | 0.968 | | |
| | Debt ratio | 0.656 | 0.789 | | |
| 3 | Current ratio | 0.896 | 0.939 | 2.296 | 12.755 |
| | Acid-test ratio | 0.892 | 0.940 | | |
| 4 | Inventory to total assets ratio | 0.909 | 0.868 | [missing] | 11.659 |
| | Inventory to sales ratio | 0.891 | 0.854 | | |
| 5 | Turnover rate of fixed assets | 0.840 | 0.775 | 2.076 | 11.536 |
| | Turnover rate of total assets | 0.811 | 0.761 | | |
| | Current assets to total assets | 0.649 | 0.869 | | |
| 6 | Cash flow to total debt ratio | 0.830 | 0.835 | 1.892 | 10.509 |
| | Cash flow ratio | 0.811 | 0.869 | | |
| | Cash flow to short term and long term debt ratio | 0.642 | 0.489 | | |
| Total explained variance | | | | | 85.288 |

Table 3. 3rd Factor analysis results

| Factors | Variables | Factor loadings | Communality | Eigenvalues | Explained variance |
|---|---|---|---|---|---|
| 1 | Debt to equity ratio | 0.970 | 0.993 | 3.011 | 23.164 |
| | Gearing ratio | 0.969 | 0.994 | | |
| | Debt/equity (DE) | 0.943 | 0.973 | | |
| 2 | Return on asset (ROA) | 0.923 | 0.935 | 2.759 | 21.227 |
| | Earnings per share (EPS) | 0.917 | 0.930 | | |
| | Return on equity (ROE) | 0.909 | 0.930 | | |
| 3 | Current ratio | 0.894 | 0.929 | 2.106 | 16.203 |
| | Acid-test ratio | 0.877 | 0.945 | | |
| | Current assets to total assets | 0.602 | 0.763 | | |
| 4 | Cash flow to total debt ratio | 0.950 | 0.934 | 2.038 | 15.673 |
| | Cash flow ratio | 0.940 | 0.943 | | |
| 5 | Inventory to total assets ratio | 0.927 | 0.889 | 2.029 | 15.609 |
| | Inventory to sales ratio | 0.874 | 0.788 | | |
| Total explained variance | | | | | 91.876 |

Table 4. 4th Factor analysis results

| Factors | Variables | Factor loadings | Communality | Eigenvalues | Explained variance |
|---|---|---|---|---|---|
| 1 | Gearing ratio | 0.962 | 0.988 | 2.969 | 26.989 |
| | Debt to equity ratio | 0.961 | 0.986 | | |
| | Debt/equity (DE) | 0.942 | 0.973 | | |
| 2 | Return on asset (ROA) | 0.923 | 0.929 | 2.779 | 25.259 |
| | Earnings per share (EPS) | 0.918 | 0.923 | | |
| | Return on equity (ROE) | 0.910 | 0.931 | | |
| 3 | Cash flow to total debt ratio | 0.937 | 0.915 | 2.047 | 18.613 |
| | Cash flow ratio | 0.920 | 0.916 | | |
| | Inventory to total assets ratio | 0.393 | 0.194 | | |
| 4 | Current ratio | 0.936 | 0.975 | 1.910 | 17.366 |
| | Acid-test ratio | 0.921 | 0.975 | | |
| Total explained variance | | | | | 88.228 |

After the three times factor analysis, 13 variables presented higher factor loadings or communality values. These variables were chosen to be used in the input vector, while the remaining 24 variables were discarded. The selected variables were debt to equity ratio, gearing ratio, debt/equity (DE), return on asset (ROA), earnings per share (EPS), return on equity (ROE), current ratio, acid-test ratio, current assets to total assets, cash flow to total debt ratio, cash flow ratio, inventory to total assets ratio, and inventory to sales ratio.

5. The FDP modeling phase

5.1. ANN experiments and results

This process uses the financial and non-financial ratios, and constructs a financial distress prediction model after carrying out the factor analysis a second time. The variables are then loaded as ANN input nodes. In addition, we also apply these experiment parameters to investigate the past 2 seasons, the past 4 seasons, the past 6 seasons, and the past 8 seasons before the financial distress occurred, for the sake of prediction accuracy. In this experiment, we use the BPN as the ANN algorithm. In addition, the training sample and the testing sample adopt the 80:20 ratio.

In terms of bankruptcy prediction, whether or not the prediction is accurate is routinely measured by three quantities: Type I Error Rate, Type II Error Rate, and Total Error Rate. The "Type I Error Rate" is the rate at which a normal company is not categorized as a normal company, the "Type II Error Rate" is the rate at which a bankruptcy company is not categorized as a bankruptcy company, and the "Total Error Rate" combines the "Type I Error Rate" and the "Type II Error Rate". Table 5 shows the relationship among these three error rate types. The formula for each error rate is listed as follows:

$$\text{Type I Error Rate} = \frac{Y_2}{Y_3} \tag{4}$$

$$\text{Type II Error Rate} = \frac{Y_4}{Y_6} \tag{5}$$

$$\text{Total Error Rate} = \frac{Y_2 + Y_4}{Y_9} \tag{6}$$

Table 5. The relationship with type I, II, and total error rates

| | Prediction: Normal | Prediction: Bankruptcy | Sum |
|---|---|---|---|
| Actually: Normal | Y1 | Y2 | Y3 |
| Actually: Bankruptcy | Y4 | Y5 | Y6 |
| Sum | Y7 | Y8 | Y9 |
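The two computations this section relies on, the BPN forward pass of Eqs. (1)–(3) in Section 2.1 and the error rates of Eqs. (4)–(6), can be sketched together in Python. This is a minimal sketch under stated assumptions rather than the authors' implementation: the index-0 term of each sum is treated as a bias unit fixed at 1 (the paper does not spell this out), training by back-propagation is omitted, and the function names and the label strings `"normal"` and `"bankruptcy"` are hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid transfer function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights, inputs):
    """One layer of Eq. (2) or Eq. (3). weights[j][0] multiplies the index-0 unit,
    assumed here to be a bias fixed at 1; weights[j][i] multiplies input i."""
    extended = [1.0] + list(inputs)
    return [sigmoid(sum(w * o for w, o in zip(row, extended))) for row in weights]


def forward(pattern, w_hidden, w_output):
    """Forward pass of an l-m-n network for one input pattern."""
    o_input = list(pattern)              # Eq. (1): O_pi = I_pi
    o_hidden = layer(w_hidden, o_input)  # Eq. (2): hidden-layer outputs O_pj
    return layer(w_output, o_hidden)     # Eq. (3): output-layer outputs O_pk


def error_rates(actual, predicted):
    """Type I, Type II and total error rates, using the cell names of Table 5.
    Assumes at least one actual 'normal' and one actual 'bankruptcy' case."""
    pairs = list(zip(actual, predicted))
    y2 = sum(a == "normal" and p == "bankruptcy" for a, p in pairs)
    y4 = sum(a == "bankruptcy" and p == "normal" for a, p in pairs)
    y3 = actual.count("normal")      # row sum of actually normal firms
    y6 = actual.count("bankruptcy")  # row sum of actually bankrupt firms
    y9 = len(actual)                 # grand total
    return y2 / y3, y4 / y6, (y2 + y4) / y9
```

In use, each test-set firm's ratios would be passed through `forward`, the output would be turned into a label (for example by thresholding at 0.5, which is an assumption here, not a rule stated in the paper), and the resulting labels would be scored with `error_rates`.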
W.-S. Chen, Y.-K. Du / Expert Systems with Applications 36 (2009) 4075–4086 4081

5.1.1. The experiment using a non-factor analysis

This experiment uses the 37 original ratio variables, before any factor analysis is applied. As shown in Table 6, the testing data reach an estimated accuracy rate as high as 82.14%, with an error rate of 17.86%, for the past 2 seasons. However, the accuracy rate falls to 60%, and the error rate rises to 40%, when measured over the past 8 seasons. The closer the financial crisis, the higher the accuracy.

5.1.2. The experiment with the 1st factor analysis

This experiment uses the 18 original ratio variables retained after the 1st factor analysis. As shown in Table 7, the testing data reach an estimated accuracy rate as high as 78.57%, with an error rate of 21.43%, for the past 2 seasons. However, the accuracy rate falls to 66.36%, and the error rate rises to 33.64%, when measured over the past 8 seasons. As in the previous experiment, the closer the financial crisis, the higher the accuracy.

5.1.3. The experiment with 2nd factor analysis

This experiment uses the 13 original ratio variables retained after the 2nd factor analysis. As shown in Table 8, the testing data reach an estimated accuracy rate as high as 75%, with an error rate of 25%, for the past 2 seasons. However, the accuracy rate falls to 65.45%, and the error rate rises to 34.55%, when measured over the past 8 seasons. As in the previous experiment, the closer the financial crisis, the higher the accuracy.

5.2. DM experiments and results

Clustering analysis finds groups, each very different from the other. However, within a group all members are very similar. Unlike classification, the class label of each group is not known. Clustering is a way to naturally segment data into groups, whereas classification is a way to segment data by assigning it to predefined groups. Briefly, a good clustering method will produce high quality clusters with high intra-class similarity and low inter-class similarity (Chen & Chen, 2006). However, how good a cluster is ultimately depends on the opinion of the user. In our experiment, we used the partitioning methods to cluster the datasets for the financial distress prediction model. The partitioning methods construct a partition of a database of N objects into a set of k clusters. Usually, they start with an initial partition and then use an iterative control strategy to optimize an objective function.

The K-means algorithm (Han & Kamber, 2001) is a well-known and commonly used clustering algorithm. It takes input parameter k and partitions data into k clusters. First, we select
k objects to represent the cluster centers. The remaining objects are then assigned to the cluster whose center is closest to the object. Then, it computes the mean value for each cluster as new cluster centers. This process is iterated until the criterion function converges.

As with the ANN experiment, this process also uses financial and non-financial ratios, and constructs the financial distress prediction model after the second factor analysis. We apply the K-means algorithm to the past 2 seasons, the past 4 seasons, the past 6 seasons, and the past 8 seasons before the occurrence of financial distress to ensure prediction accuracy. After implementing the K-means algorithm, we decided to adopt 10–15 clusters to analyze the prediction accuracy.

5.2.1. The experiment with non-factor analysis

This experiment uses the 37 original ratio variables, before any factor analysis is applied. As shown in Table 9, the data reach an estimated accuracy rate as high as 78.57%, with an error rate of 21.43%, for the past 2 seasons. However, the accuracy rate falls to 56.36%, and the error rate rises to 43.64%, when measured over the past 8 seasons. The closer the financial crisis, the higher the accuracy.

5.2.2. The experiment with 1st factor analysis

This experiment uses the 18 original ratio variables retained after the 1st factor analysis. As shown in Table 10, the data reach an estimated accuracy rate as high as 75%, with an error rate of 25%, for the past 2 seasons. However, the accuracy rate falls to 56.36%, and the error rate rises to 43.64%, when measured over the past 8 seasons. As in the previous experiment, the closer the financial crisis, the higher the accuracy.

5.2.3. The experiment with 2nd factor analysis

This experiment uses the 13 original ratio variables retained after the 2nd factor analysis. As shown in Table 11, the testing data reach an estimated accuracy rate as high as 75%, with an error rate of 25%, for the past 2 seasons. However, the accuracy rate falls to 56.36%, and the error rate rises to 43.64%, when measured over the past 8 seasons. As in the previous experiment, the closer the financial crisis, the higher the accuracy.

Table 6
The accuracy for the ANN model with non-factor analysis

                        Training data        Testing data
Seasons  Measure        Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy rate  87.03%   94.44%      92.86%   71.43%
         Average        90.74%               82.14%
4        Accuracy rate  89.91%   92.67%      100.00%  55.56%
         Average        91.28%               77.78%
6        Accuracy rate  91.41%   95.71%      87.80%   65.85%
         Average        93.56%               76.83%
8        Accuracy rate  95.85%   93.55%      74.55%   45.45%
         Average        94.70%               60.00%

Table 7
The accuracy for the ANN model with 1st factor analysis

                        Training data        Testing data
Seasons  Measure        Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy rate  90.74%   84.48%      85.71%   71.43%
         Average        86.11%               78.57%
4        Accuracy rate  87.16%   85.32%      88.89%   48.15%
         Average        86.24%               68.52%
6        Accuracy rate  83.44%   86.50%      65.85%   68.29%
         Average        84.97%               67.07%
8        Accuracy rate  93.09%   88.48%      67.27%   65.45%
         Average        90.78%               66.36%

Table 8
The accuracy for the ANN model with 2nd factor analysis

                        Training data        Testing data
Seasons  Measure        Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy rate  87.04%   77.78%      78.57%   71.43%
         Average        82.41%               75.00%
4        Accuracy rate  86.24%   86.24%      92.59%   51.85%
         Average        86.24%               72.22%
6        Accuracy rate  87.73%   83.44%      80.49%   48.78%
         Average        85.58%               64.63%
8        Accuracy rate  86.18%   81.57%      78.18%   52.73%
         Average        83.87%               65.45%
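The K-means procedure described in Section 5.2 (select k centers, assign each object to the closest center, recompute each center as the cluster mean, repeat until convergence) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the function name `kmeans`, the random initialization, the Euclidean distance, and the "centers no longer change" stopping rule are all assumptions, and the sample ratio vectors are invented for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Partition `points` (a list of equal-length tuples) into k clusters."""
    rng = random.Random(seed)
    # Step 1: select k objects as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign every object to the cluster whose center is closest.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 3: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Step 4: stop once the centers no longer change (assumed criterion).
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical two-ratio feature vectors, invented for illustration.
ratios = [(0.10, 0.20), (0.15, 0.22), (0.90, 0.80), (0.88, 0.85)]
centers, labels = kmeans(ratios, k=2)
print(labels)
```

In the experiments k ranges from 10 to 15; the sketch only shows the partitioning step and does not cover how clusters are mapped to the normal or bankruptcy class, which the text does not specify.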

Table 9
The accuracy for the clustering model with non-factor analysis

                   10 Clusters          11 Clusters          12 Clusters          13 Clusters          14 Clusters          15 Clusters
Seasons  Measure   Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy  100.00%  57.14%      50.00%   85.71%      50.00%   85.71%      100.00%  57.14%      100.00%  57.14%      85.71%   57.14%
         Average   78.57%               67.86%               67.86%               78.57%               78.57%               71.43%
4        Accuracy  74.07%   77.78%      88.89%   55.56%      88.89%   55.56%      62.96%   70.37%      88.89%   55.56%      88.89%   55.56%
         Average   75.93%               72.22%               72.22%               66.67%               72.22%               72.22%
6        Accuracy  51.22%   85.37%      100.00%  48.78%      100.00%  51.22%      100.00%  51.22%      97.56%   51.22%      60.98%   73.17%
         Average   68.29%               64.63%               75.61%               75.61%               74.39%               67.07%
8        Accuracy  47.27%   87.27%      85.45%   36.36%      41.82%   78.18%      40.00%   78.18%      50.91%   76.36%      96.36%   16.36%
         Average   67.27%               60.91%               60.00%               59.09%               63.64%               56.36%

Table 10
The accuracy for the clustering model with 1st factor analysis

                   10 Clusters          11 Clusters          12 Clusters          13 Clusters          14 Clusters          15 Clusters
Seasons  Measure   Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy  100.00%  35.71%      100.00%  50.00%      100.00%  50.00%      100.00%  50.00%      100.00%  50.00%      100.00%  50.00%
         Average   67.86%               75.00%               75.00%               75.00%               75.00%               75.00%
4        Accuracy  70.37%   51.85%      100.00%  40.74%      100.00%  37.04%      100.00%  40.74%      100.00%  40.74%      100.00%  37.04%
         Average   61.11%               70.37%               68.52%               70.37%               70.37%               68.52%
6        Accuracy  92.68%   29.27%      100.00%  19.51%      100.00%  19.51%      53.66%   51.22%      100.00%  34.15%      100.00%  34.15%
         Average   60.98%               59.76%               59.76%               52.44%               67.07%               67.07%
8        Accuracy  72.73%   40.00%      100.00%  14.55%      100.00%  21.82%      100.00%  23.64%      100.00%  21.82%      100.00%  21.82%
         Average   56.36%               57.27%               60.91%               61.82%               60.91%               60.91%

Table 11
The accuracy for the clustering model with 2nd factor analysis

                   10 Clusters          11 Clusters          12 Clusters          13 Clusters          14 Clusters          15 Clusters
Seasons  Measure   Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy  Normal   Bankruptcy
2        Accuracy  100.00%  21.43%      100.00%  35.71%      100.00%  50.00%      71.43%   50.00%      71.43%   50.00%      92.86%   57.14%
         Average   60.71%               67.86%               75.00%               60.71%               60.71%               75.00%
4        Accuracy  100.00%  37.04%      100.00%  37.04%      100.00%  37.04%      100.00%  25.93%      100.00%  51.85%      70.37%   74.07%
         Average   68.52%               68.52%               68.52%               62.96%               75.93%               72.22%
6        Accuracy  82.93%   65.85%      82.93%   65.85%      82.93%   65.85%      82.93%   65.85%      68.29%   68.29%      68.29%   68.29%
         Average   74.39%               74.39%               74.39%               74.39%               68.29%               68.29%
8        Accuracy  58.18%   56.36%      83.64%   29.09%      65.45%   60.00%      65.45%   60.00%      61.82%   61.82%      74.55%   56.36%
         Average   57.27%               56.36%               62.73%               62.73%               61.82%               65.45%
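The summary values plotted later in Figs. 5 and 6 can be traced back to these tables with simple arithmetic. The sketch below rests on two readings of the numbers that the text does not state explicitly, so both should be treated as assumptions: the clustering accuracy per horizon in Fig. 5 appears to be the mean of the six "Average" entries (10 to 15 clusters) in Table 9, and the BPN Type II error rate in Fig. 6 appears to be 100% minus the bankruptcy-class testing accuracy in Table 6. The variable names are hypothetical.

```python
# "Average" rows of Table 9 (non-factor analysis) for 10..15 clusters.
# The 6-season row is left out: as printed, its mean (about 70.93%) does
# not reproduce the Fig. 5 value (72.56%) the way the other rows do.
table9_average = {
    2: [78.57, 67.86, 67.86, 78.57, 78.57, 71.43],
    4: [75.93, 72.22, 72.22, 66.67, 72.22, 72.22],
    8: [67.27, 60.91, 60.00, 59.09, 63.64, 56.36],
}

# Bankruptcy-class testing accuracy of the ANN model, from Table 6.
table6_bankruptcy_testing = {2: 71.43, 4: 55.56, 6: 65.85, 8: 45.45}


def mean(values):
    """Arithmetic mean of a non-empty list of percentages."""
    return sum(values) / len(values)


# Assumed derivation of the "None" column of Fig. 5.
clustering_accuracy = {
    seasons: round(mean(averages), 2)
    for seasons, averages in table9_average.items()
}

# Assumed derivation of the "None" column of Fig. 6.
bpn_type_ii = {
    seasons: round(100.0 - accuracy, 2)
    for seasons, accuracy in table6_bankruptcy_testing.items()
}

print(clustering_accuracy)  # {2: 73.81, 4: 71.91, 8: 61.21}
print(bpn_type_ii)          # {2: 28.57, 4: 44.44, 6: 34.15, 8: 54.55}
```

Both results agree with the "None" columns of Figs. 5 and 6 for the rows shown, which supports reading the Type II error rate as the miss rate on bankrupt companies in the testing sample.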

6. The FDP comparing phase

After implementing the FDP modeling phase, we compare the BPN and clustering approaches in terms of the accuracy rate, the Type II error rate, and factor analysis. The details are discussed in the following sections.

6.1. The accuracy rate for BPN and clustering

As is evident from the results in Fig. 4, the BPN model presents the prediction performance with non-factor analysis, after the first factor analysis, and after the second factor analysis. The results show that the accuracy rate worsens from the past 2 seasons to the past 8 seasons prior to the occurrence of the financial crisis. In other words, the closer the crisis, the higher the accuracy rate of the BPN model.

As seen from the results in Fig. 5, the clustering model shows the prediction performance with non-factor analysis, after the first factor analysis, and after the second factor analysis. The accuracy rate shows the same worsening trend as in the BPN model. In addition, the clustering model becomes more accurate the closer the crisis.

                 None      1st       2nd
past 2 seasons   82.14%    78.57%    75.00%
past 4 seasons   77.78%    68.52%    72.22%
past 6 seasons   76.83%    67.07%    64.63%
past 8 seasons   60.00%    66.36%    65.45%

Fig. 4. The accuracy rate for the BPN.

                 None      1st       2nd
past 2 seasons   73.81%    74.31%    66.67%
past 4 seasons   71.91%    68.21%    69.36%
past 6 seasons   72.56%    61.18%    72.36%
past 8 seasons   61.21%    59.70%    61.06%

Fig. 5. The accuracy rate for clustering.

6.2. The type II error rate for BPN and clustering

As seen from the results in Fig. 6, the BPN model presents the Type II error rate with non-factor analysis, after the first factor analysis, and after the second factor analysis. Under each factor-analysis setting, the Type II error rate increases, while the accuracy rate decreases, from the past 2 seasons to the past 8 seasons prior to the financial crisis. In other words, the closer the crisis, the more accurate the BPN model and the lower its Type II error rate.

As seen from the results in Fig. 7, the clustering model presents the Type II error rate with non-factor analysis, after the first factor analysis, and after the second factor analysis. The Type II error rate has approximately the same increasing trend as in the BPN model, while the accuracy rate decreases as in the BPN model. The only exception is the Type II error rate over the past 6 seasons, which is better with the 2nd factor analysis than with non-factor analysis. Nevertheless, in summary, the closer the crisis point, the lower the Type II error rate in the clustering model.

6.3. The factor analysis for BPN and clustering

In this comparison, we average the accuracy rate of the BPN and the clustering model for each factor analysis and over 2, 4, 6, and 8 seasons. In Fig. 8, we can see that the accuracy rate (non-factor analysis) with the BPN model is better than with the clustering model, with the exception of the past 8 seasons. In Fig. 9, we can see that the accuracy rates (1st factor analysis) with the BPN model are all better than with the clustering model. In Fig. 10, we can see that the accuracy rate (2nd factor analysis) with the BPN model is better than with the clustering model, with the exception of the past 6 seasons.

7. Conclusions

This research focused on the financial and non-financial ratios in the financial statement, and used the BPN and the clustering model to compare the performance of financial distress predictions, in order to find a better early-warning method. This research took 34 companies that were facing a financial crisis, and matched them with 34 normal companies from similar industries. In addition, we obtained the necessary dataset from the TSEC database and sampled it at the past 2, 4, 6, and 8 seasons prior to the occurrence of the financial crisis. This data was then used to carry out a statistical factor analysis, with each resulting set of ratio variables fed into the BPN and clustering methods in order to make a comparison.

After the experiments, we summarized four critical contributions. First, the more times we applied factor analysis, the less accurate

                 None      1st       2nd
past 2 seasons   28.57%    28.57%    28.57%
past 4 seasons   44.44%    51.85%    48.15%
past 6 seasons   34.15%    31.71%    51.22%
past 8 seasons   54.55%    34.55%    47.27%

Fig. 6. The type 2 error rate for the BPN.

                 None      1st       2nd
past 2 seasons   33.34%    52.38%    55.95%
past 4 seasons   38.27%    58.64%    56.17%
past 6 seasons   39.84%    74.71%    33.34%
past 8 seasons   37.88%    76.36%    46.06%

Fig. 7. The type 2 error rate for clustering.

the results for the BPN and clustering approaches. In our experiments, we found that when we applied all 37 variables without factor analysis to the BPN and clustering models, we obtained a better prediction performance, except for the past 8 seasons in the BPN model and for the past 2 seasons in the clustering model.

Second, the closer we get to the time of the actual financial distress, the more accurate the prediction will be. For example, the accuracy rate with non-factor analysis for 2 seasons before the financial distress occurs is 82.14% in the BPN, while it is only 60% for 8 seasons. The results are similar for the clustering model, where the accuracy rates with non-factor analysis for 2 and 8 seasons before the occurrence of financial distress are 73.81% and 61.21%, respectively.

Third, most investors are concerned with the Type II error rate, because they want to avoid investing in distressed companies that a model classifies as normal. Our empirical results show that factor analysis increases the rate at which companies with a potential financial crisis are classified as normal companies. Moreover, we also found that the average Type II error rate in the clustering model is higher than in the BPN model. Therefore, the prediction performance of the clustering approach is more strongly affected than that of the BPN model.

Finally, the BPN approach obtains a better prediction accuracy than the DM clustering approach in developing a financial distress prediction model, with the exception that the accuracy rate (non-factor analysis) for the past 8 seasons and the accuracy rate (2nd factor analysis) for the past 6 seasons are lower with the BPN model.

[Chart: BPN vs. clustering model; accuracy rate at 2, 4, 6, and 8 seasons before the occurrence of financial distress.]
Fig. 8. The accuracy rate with non-factor analysis for the BPN and clustering comparison.

[Chart: BPN vs. clustering model; accuracy rate at 2, 4, 6, and 8 seasons before the occurrence of financial distress.]
Fig. 9. The accuracy rate with 1st analysis for the BPN and clustering comparison.

[Chart: BPN vs. clustering model; accuracy rate at 2, 4, 6, and 8 seasons before the occurrence of financial distress.]
Fig. 10. The accuracy rate with 2nd analysis for the BPN and clustering comparison.
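The comparison summarized in Figs. 8–10 can be reproduced from the accuracy values reported in Figs. 4 and 5. The sketch below is only an illustration: the dictionary layout and the helper name `clustering_wins` are hypothetical, and the numbers are transcribed from the data tables of Figs. 4 and 5.

```python
# Testing accuracy (%) at 2, 4, 6 and 8 seasons before distress,
# transcribed from Fig. 4 (BPN) and Fig. 5 (clustering).
SEASONS = (2, 4, 6, 8)
bpn = {
    "None": (82.14, 77.78, 76.83, 60.00),
    "1st": (78.57, 68.52, 67.07, 66.36),
    "2nd": (75.00, 72.22, 64.63, 65.45),
}
clustering = {
    "None": (73.81, 71.91, 72.56, 61.21),
    "1st": (74.31, 68.21, 61.18, 59.70),
    "2nd": (66.67, 69.36, 72.36, 61.06),
}


def clustering_wins(setting):
    """Return the horizons (in seasons) where clustering beats the BPN."""
    return [
        seasons
        for seasons, b, c in zip(SEASONS, bpn[setting], clustering[setting])
        if c > b
    ]


# Horizons at which the BPN is NOT the more accurate model, per setting.
exceptions = {setting: clustering_wins(setting) for setting in bpn}
print(exceptions)  # {'None': [8], '1st': [], '2nd': [6]}
```

The two non-empty entries correspond to the two exceptions named in the text: the past 8 seasons with non-factor analysis and the past 6 seasons with the 2nd factor analysis.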

In future research, additional artificial intelligence techniques, such as other neural network models, classification mining, genetic algorithms, and others, could also be applied. And certainly, researchers could expand the system so as to deal with more financial datasets.

Acknowledgements

We also gratefully acknowledge the Editor and anonymous reviewers for their valuable comments and constructive suggestions.

References

Altman, E. L. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23(3), 589–609.
Altman, E. L., Edward, I., Haldeman, R., & Narayanan, P. (1977). A new model to identify bankruptcy risk of corporations. Journal of Banking and Finance, 1, 29–54.
Beaver, W. (1966). Financial ratios as predictors of failure, empirical research in accounting: Selected studies. Journal of Accounting Research, 71–111.
Blum, M. (1974). Failing company discriminant analysis. Journal of Accounting Research, 1–25.
Chen, T. (2007). Incorporating fuzzy c-means and a back-propagation network ensemble to job completion time prediction in a semiconductor fabrication factory. Fuzzy Sets and Systems, 158(19), 2153–2168.
Chen, A. P., & Chen, C. C. (2006). A new efficient approach for data clustering in electronic library using ant colony clustering algorithm. The Electronic Library, 24(4), 548–559.
Chun, S. H., & Park, Y. J. (2006). A new hybrid data mining technique using a regression case based reasoning: Application to financial forecasting. Expert Systems with Applications, 31(2), 329–336.
Deng, W. J., Chen, W. C., & Pei, W. (2007). Back-propagation neural network based importance–performance analysis for determining critical service attributes. Expert Systems with Applications. doi: 10.1016/[Link].2006.12.016.
Dimitras, A. I., Zanakis, S. H., & Zopounidis, C. (1996). A survey of business failure with an emphasis on prediction methods and industrial applications. European Journal of Operational Research, 90(3), 487–513.
Fanning, K., & Cogger, K. (1998). Neural network detection of management fraud using published financial data. International Journal of Intelligent Systems in Accounting, Finance and Management, 7(1), 21–24.
Feroz, E., Park, K., & Pastena, V. (1991). The financial and market effects of the SECs accounting and auditing enforcement releases. Journal of Accounting Research, 29(Suppl.), 107–142.
Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. San Francisco, CA, USA: Morgan Kaufmann.
Huang, M. J., Chen, M. Y., & Lee, S. C. (2007). Integrating data mining with case-based reasoning for chronic diseases prognosis and diagnosis. Expert Systems with Applications, 32(3), 856–867.
Huang, Y. P., Hsu, C. C., & Wang, S. H. (2007). Pattern recognition in time series database: A case study on financial database. Expert Systems with Applications, 33(1), 199–205.
Hua, Z., Wang, Y., Xu, X., Zhang, B., & Liang, L. (2007). Predicting corporate financial distress based on integration of support vector machine and logistic regression. Expert Systems with Applications, 33(2), 434–440.
John, S. G., & Robert, W. I. (2001). Tests of the generalizability of Altman’s bankruptcy prediction model. Journal of Business Research, 54, 53–61.
Jost, A. (1993). Neural networks: A logical progression in credit and marketing decision system. Credit World, 81(4), 26–33.
Kinney, W., & McDaniel, L. (1989). Characteristics of firms correcting previously reported quarterly earnings. Journal of Accounting and Economics, 11(1), 71–93.
Kirkos, E., Spathis, C., & Manolopoulos, Y. (2007). Data mining techniques for the detection of fraudulent financial statements. Expert Systems with Applications, 32(4), 995–1003.
Ko, P. C., & Lin, P. C. (2006). An evolution-based approach with modularized evaluations to forecast financial distress. Knowledge-Based Systems, 19(1), 84–91.
Laitinen, E. K., & Laitinen, T. (2000). Bankruptcy prediction application of the Taylor’s expansion in logistic regression. International Review of Financial Analysis, 9, 327–349.
Lee, T. L. (2004). Back-propagation neural network for long-term tidal predictions. Ocean Engineering, 31(2), 225–238.
Loebbecke, J., Eining, M., & Willingham, J. (1989). Auditor’s experience with material irregularities: Frequency, nature and detectability. Auditing: A Journal of Practice and Theory, 9, 1–28.
Medsker, L., & Liebowitz, J. (1994). Design and development of expert systems and neural networks. New York: Macmillan.
Meyer, P. A., & Pifer, H. (1970). Prediction of bank failures. The Journal of Finance, 25, 853–868.
Mitra, S., Pal, S. K., & Mitra, P. (2002). Data mining in soft computing framework: A survey. IEEE Transactions on Neural Networks, 13(1), 3–14.
Panda, S. S., Chakraborty, D., & Pal, S. K. (2007). Flank wear prediction in drilling using back-propagation neural network and radial basis function network. Applied Soft Computing. doi:10.1016/[Link].2007.07.003.
Persons, O. (1995). Using financial statement data to identify factors associated with fraudulent financial reporting. Journal of Applied Business Research, 11(3), 38–46.
Ronald, J. W., Rumelhart, D. E., & Hinton, G. E. (1986). Learning internal representations by error propagation. In E. David Rumelhart & J. A. McClelland (Eds.). Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1). Cambridge: MIT Press/Bradford Books.
Spathis, C. (2002). Detecting false financial statements using published data: Some evidence from Greece. Managerial Auditing Journal, 17(4), 179–191.
Spathis, C., Doumpos, M., & Zopounidis, C. (2002). Detecting falsified financial statements: A comparative study using multicriteria analysis and multivariate statistical techniques. The European Accounting Review, 11(3), 509–535.
Stice, J. (1991). Using financial and market information to identify pre-engagement market factors associated with lawsuits against auditors. The Accounting Review, 66(3), 516–533.
Werbos, P. (1974). Beyond regression: New tools for prediction and analysis in the behavioral science. Ph.D. Thesis, Committee on Applied Mathematics, Harvard University, Cambridge, MA.
Wu, D., Yang, Z., & Liang, L. (2006). Using DEA-neural network approach to evaluate branch efficiency of a large Canadian bank. Expert Systems with Applications, 31, 108–115.
Yılmaz, B., & Cunedioglu, U. (2007). Source localization of focal ventricular arrhythmias using linear estimation, correlation, and back-propagation networks. Computers in Biology and Medicine, 37(10), 1437–1445.
