Appl Intell
DOI 10.1007/s10489-010-0259-7

The use of genetic programming for the construction of a financial management model in an enterprise
Wen-Tsao Pan
© Springer Science+Business Media, LLC 2010
Abstract The fast development of China’s economy has caused the rapid expansion of its domestic market. Since many economists are not optimistic about the bubble economy of China, Taiwanese businessmen need an in-depth understanding of the business operational performance and financial situation of enterprises in China, so as to reduce the risk of a potential investment. In this article, data from the China Economic Research Database (CCER), the financial database of financial corporations, are collected and analyzed to investigate the business operation and management performance and the financial characteristics of enterprises in China. Grey relational analysis is applied first in order to investigate the business operational performance of 600 enterprises in China. Afterwards, a more recent clustering technique is used to divide the enterprises into two groups based on their financial characteristics. Finally, three models, namely genetic programming, Back-Propagation Neural Network and Logistic Regression, are adopted to construct an Enterprise Operational Performance model and an Enterprise Finance Characteristic model, respectively. Based on the results, it can be concluded that genetic programming yielded the best classification and forecast performance, compared to the other two techniques.

Keywords Grey relational analysis · Cluster analysis · Genetic programming · Back-propagation neural network · Data mining

W.-T. Pan
Department of Information Management, Fooyin University, 3F., No. 12, Lane 271, Longjiang Rd., Jhongshan District, Taipei City 104, Taiwan, ROC
e-mail: teacherp0162@yahoo.com.tw

1 Introduction

In recent years, due to the migration effects of globalization, the economy has been going through a period of severe fluctuation. Many large enterprises located in Mainland China may have to face the risks of bankruptcy, debt or contract violation, or even default on stock settlement in the stock market. This situation clearly indicates that many companies fail to manage the emerging risks in an appropriate manner or, even worse, do not seem to have a clear understanding of these risks. At this moment, Taiwan is signing an Economic Cooperation Framework Agreement (ECFA) with Mainland China. In the future, even more of Taiwan’s enterprises are going to cooperate with firms located in Mainland China, in many sectors. Therefore, the risk entailed in this kind of business between Taiwan’s and Mainland China’s enterprises, as well as the risk borne by individual Taiwanese investors who invest in financial products in Mainland China, is going to increase. As a result, many experts and scholars working on risk management in Mainland China have started to pay greater attention to the operation and management issues of Mainland China’s enterprises. Meanwhile, complicated actuarial and quantitative techniques are used to construct assessment models of business operation and management performance.

In this article, the operation and management performance of enterprises in Mainland China is investigated. First, financial data of 600 enterprises located in Mainland China are collected. Then, grey relational analysis is adopted in order to study their operation and management performance and to rank them. In addition, to investigate the financial characteristics of these 600 enterprises, stability and activity are selected from the financial five forces analysis. Based on the financial indexes of stability and activity, cluster analysis is performed in order to investigate the financial characteristics of the enterprises. There
might be two main results based on the clustering analysis: in the first, the two groups are enterprises whose financial stability and activity are both good and enterprises whose stability and activity are both bad; in the second, the two groups are enterprises in which one of the two factors is good and the other one is bad. Finally, the grey relational analysis result and the cluster analysis result are each combined with the financial indexes as sample data. Following Dain [1], who used genetic programming to construct mobile robot navigation strategies, Wu and Tsai [2], who used neural networks for spam filtering classification, and Jang et al. [3], who used neural networks to predict stock price trends, Genetic Programming (GP), Back-Propagation Neural Network (BPN) and Logistic Regression (LR) are adopted to construct the Enterprise Operation Performance (EOP) and Enterprise Finance Characteristics (EFC) models. It is hoped that the research result can be used by enterprises as a business operational and management reference.

This study is structured as follows. In Sect. 1 some introductory comments, as well as the main objective of this paper, are presented. In Sect. 2 some methodological issues regarding grey relational analysis, the cluster models and genetic programming are analyzed. In Sect. 3 results from the empirical analysis are provided. Finally, in Sect. 4 some concluding remarks and suggestions for further research directions are discussed.

2 Research method

2.1 Grey relational analysis

Grey theory was proposed by professor Deng [4] of Mainland China and has been developed for over 20 years. The theory mainly aims at performing system relational analysis and model construction when information is incomplete and the system model is unclear. Grey Relational Analysis (GRA) is a part of grey theory, and it is a factor analysis method [5]. It is mainly used to analyze the relational grade between divergent events. The grade of grey relation describes how the relation between two factors, of two systems or of one system, changes over time. If the relation is close, then the two systems, or the changes of the factors, approach a consistent value. Otherwise, the system is not consistent. Grey relational analysis has been widely applied to the assessment of business operational performance [6, 7]. Research on prediction problems using grey system theory includes Huang et al. [8], who presented a procedure for software effort estimation that integrates GRA with GA and compared its accuracy with case-based reasoning, classification and regression trees, and artificial neural networks. Their findings indicated that the GRA-based predictions were superior to those of the other three models. Wang and Hsu [9] applied GAs to enhance the GM(1,1) model of grey system theory and used Taiwan’s IC industry as an empirical case study. The resulting GAGM model was found to generate smaller forecasting errors than the BGM(1,1) model. Interested readers can refer to the publications of professor Deng and of other scholars for a detailed account of the theoretical basis of grey relational analysis.

In this article, grey relational analysis is used to study in detail the business operation and management performance of enterprises in Mainland China and, at the same time, to rank them. Dichotomy is then used to split the ranked enterprises into good and bad firms with regard to their operation performance.

2.2 Kmedoid cluster

Kmedoid, as compared to Kmeans, is a rather new hard clustering method. The main difference between Kmeans and Kmedoid lies in how the cluster centers are calculated: in Kmedoid, the new cluster center is the data point nearest to the mean of the cluster points. Hard partitioning methods are simple and popular, though their results are not always reliable and these algorithms have numerical problems as well. From an N × n dimensional data set, these algorithms allocate each data point to one of c clusters so as to minimize the within-cluster sum of squares:

$$\sum_{i=1}^{c} \sum_{k \in A_i} \| x_k - v_i \|^2$$

where $A_i$ is the set of objects (data points) in the i-th cluster, $v_i$ is the mean of the points in cluster i, and $\| x_k - v_i \|$ is a distance norm. In Kmeans clustering the $v_i$ are called the cluster prototypes, i.e. the cluster centers:

$$v_i = \frac{\sum_{k=1}^{N_i} x_k}{N_i}, \quad x_k \in A_i$$

where $N_i$ is the number of objects in $A_i$.

In Kmedoid clustering the cluster centers are the objects nearest to the mean of the data in each cluster, $V = \{ v_i \in X \mid 1 \le i \le c \}$. This is useful when each data point denotes a position of a system, so that there is no continuity in the data space. In such cases, the mean of the points in one set does not exist.

2.3 GK cluster

Following Bahrampour et al. [10], this article uses the Gustafson-Kessel [11] clustering algorithm (GK cluster), which extends the Fuzzy C-means (FCM) algorithm by employing an
adaptive distance norm in order to detect clusters with different geometrical shapes in the data set. The GK cluster algorithm obtains the membership function matrix U and the cluster centers V by optimizing an objective function. The objective function of the GK cluster algorithm is:

$$J(Z, V, U) = \sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m \, \| z_j - \upsilon_i \|^2_{A_i}$$

The distance norm adopted is:

$$D_{ij}^2 = (z_j - \upsilon_i)^T A_i (z_j - \upsilon_i)$$

In the above formula, $D_{ij}^2$ is the squared inner-product distance norm, and $A_i$ is a matrix determined by the cluster covariance matrix $F_i$:

$$F_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m (z_j - \upsilon_i)(z_j - \upsilon_i)^T}{\sum_{j=1}^{N} (\mu_{ij})^m}$$

$$A_i = \det(\rho_i F_i)^{\frac{1}{d+1}} F_i^{-1}$$

Using Lagrange multipliers, the objective function is optimized to obtain the condition under which (U, V) is a minimum point:

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} (D_{ij} / D_{kj})^{2/(m-1)}}$$

$$\upsilon_i = \frac{\sum_{j=1}^{N} (\mu_{ij})^m z_j}{\sum_{j=1}^{N} (\mu_{ij})^m}$$

In these formulas, i = 1, 2, ..., c, and the fuzzy exponent m is an adjustable parameter that represents the degree of fuzziness of the clusters. The larger m is, the greater the overlap among clusters; generally, m = 2 is taken. $\rho_i$ is a constant for each cluster.

2.4 Genetic programming and GPLAB

Genetic Programming (GP) is a recent data mining technique developed by professor Koza [12, 13] and is based on the genetic algorithm of Holland [14]. Genetic programming shares some concepts with genetic algorithms, such as the chromosome, fitness function, reproduction, crossover and mutation. The difference is that genetic programming replaces the genes within chromosomes (0 and 1) by a syntax tree. Therefore, each individual within the population represents a computer program. These programs, like genes, can evolve through natural selection into optimal programs. Another difference between genetic algorithms and genetic programming is that the latter uses tree structures that vary widely in size, shape and structure to represent chromosomes, each of which represents a different formula. This is shown in Fig. 1.

Fig. 1 The syntax tree of genetic programming

In Fig. 1, the plus and minus signs are internal nodes, and the remaining terminal nodes (X, Y and 4) are elements of the terminal set defined according to the problem. The formula represented by the syntax tree is X + (4 − Y). For the theory of genetic programming, readers can refer to the books of professor Koza. In this article, the Matlab GPLAB toolbox is adopted to construct the genetic programming model. The main features of this toolbox include initial parameters that users can set themselves. During execution, the tree depth and node number are adjusted dynamically, and the crossover rate and mutation rate are adjusted automatically too. Finally, the execution result is presented as graphs, etc. GPLAB and its documentation are released under the GNU General Public License and are freely available for download at http://www.itgb.unl.pt:1111/gplab/.

3 Empirical study

3.1 Sample data and variable

In this study, data on the financial ratios of enterprises in Mainland China for the year 2008 were collected from the China Economic Research Database; there were 736 enterprises in all. Enterprises with defective data were excluded and the sample was rounded to a whole number, so the final dataset comprised 600 firms. In order to analyze the stability and activity of these enterprises, specific financial ratios were chosen based on the stability and activity dimensions of the financial five forces. The main objective is to analyze the performance of each enterprise in stability and activity. The selected ratios are the current ratio (X1), debt ratio (X2), inventory turnover ratio (X3), accounts receivable turnover rate (X4) and asset turnover rate (X5). Their formulas are:

current ratio (X1) = current assets/current liabilities
debt ratio (X2) = total liabilities/total assets
inventory turnover ratio (X3) = sales/inventories
accounts receivable turnover rate (X4) = net sales/average accounts receivable
asset turnover rate (X5) = net sales/total assets.

The descriptive statistical values of these ratio data are shown in Table 1.

Table 1 The descriptive statistical values of financial ratios of 600 enterprises in Mainland China

        X1        X2        X3         X4        X5
Max     22.447    11.468    295.159    21.436    3.025
Min     0.023     0.001     0.014      0.042     0.001
Avg     1.383     1.418     7.884      7.659     0.669
Std     1.611     1.266     23.301     4.826     0.459
N       600       600       600        600       600

3.2 Enterprise’s operational and management performance analysis

In this article, all variables are considered as assessment indexes of the enterprise’s operational and management performance, so that grey relational analysis can be applied to investigate the business operation and management performance of enterprises in Mainland China in 2008. For all the index values except the debt ratio, the larger the better. For all the assessment indexes, grey relational analysis as proposed by professor Deng is used. The grey relational Matlab toolbox developed by Wen et al. [15] is adopted to find the Grey Relational Grade and to perform the ranking. Dichotomy is then used to separate the 600 enterprises into two categories: 300 companies with good operation and management performance and the other 300 companies with bad operation and management performance. The grey relational analysis result is shown in Fig. 2. In this article, the minimal value of the debt ratio (X2) and the maximal values of the other indexes (X1, X3, X4, X5) are used as the standard sequence. In Fig. 2, the bold dotted line represents the Standard Sequence, and the fine lines represent the Inspected Sequences, that is, the rest of the data. Each data record has five nodes representing the five space indexes of that enterprise. The closer an Inspected Sequence gets to the Standard Sequence, the better the business operational and management performance of that enterprise.

Fig. 2 Linear sequence chart of grey relational analysis

Fig. 3 Genetic programming output results for constructing the Enterprise Operational Performance model

It is found that the enterprises with the top three rankings in business operational and management performance are Fujian Expressway (600033), Hongsheng Technology Co., Ltd. (600817) and Orient International Enterprise, Ltd. (600278), respectively. The last three are Beisheng Pharmaceutical (600556), Jintai Group Co., Ltd. (600385) and Tianyi Science & Technology (600703), respectively. Next, these results are used to construct the Enterprise Operation Performance (EOP) model, to be used as a reference for business operational and management performance by other enterprises. In order to assess whether there is any problem with the business operation and management of an enterprise, the corresponding performance indexes of this enterprise are fed into the model. In this article, the first 300 enterprises in the ranking represent enterprises with good business operational and management performance (represented by 0), and the last 300 enterprises represent enterprises with bad business operational and management performance (represented by 1). This label is treated as the dependent variable and is associated with the five space indexes (independent variables) in order to construct three Enterprise Operation Performance (EOP) models: Genetic Programming, Back-Propagation Neural Network and Logistic Regression.
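The ranking and dichotomy steps of this section can be illustrated with a short sketch. The article relies on the Matlab toolbox of Wen et al. [15] and does not print the grey relational formulas, so the code below assumes the commonly used Deng formulation (min-max normalization, a standard sequence made of the best value of each index, and a distinguishing coefficient of 0.5). The function names and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def grey_relational_grades(data, smaller_is_better=(), zeta=0.5):
    """Grey relational grade of each row against the ideal (standard) sequence.

    Assumed formulation (not printed in the article): Deng's coefficient
    (d_min + zeta * d_max) / (delta + zeta * d_max), averaged over the indexes.
    """
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    norm = (data - lo) / span                 # larger-the-better scaling to [0, 1]
    for k in smaller_is_better:
        norm[:, k] = 1.0 - norm[:, k]         # e.g. debt ratio: smaller is better
    delta = 1.0 - norm                        # distance to the standard sequence (all ones)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:                          # all rows identical to the standard sequence
        return np.ones(len(data))
    coeff = (d_min + zeta * d_max) / (delta + zeta * d_max)
    return coeff.mean(axis=1)

def dichotomy_labels(grades):
    """Label the better-ranked half 0 (good) and the rest 1 (bad)."""
    order = np.argsort(-grades, kind="stable")
    labels = np.ones(len(grades), dtype=int)
    labels[order[: len(grades) // 2]] = 0
    return labels

# Hypothetical toy data: four firms, columns X1..X5 (X2 is the debt ratio).
toy = [[2.0, 0.3, 8.0, 9.0, 0.9],
       [1.0, 0.6, 4.0, 5.0, 0.5],
       [0.5, 0.9, 2.0, 3.0, 0.2],
       [1.5, 0.4, 6.0, 7.0, 0.7]]
grades = grey_relational_grades(toy, smaller_is_better=(1,))
labels = dichotomy_labels(grades)
print(grades, labels)
```

With 600 enterprises the same split would give 300 firms labelled 0 and 300 labelled 1, which is the dependent variable used for the EOP models.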
Firstly, the construction of the genetic programming model is demonstrated. The node mathematical functions in the model architecture are plus, mypower, mydivide, square, times, minus and mylog. The initial evolution parameters are 100 generations and 30 chromosomes. During execution, the crossover rate, the mutation rate and the depth and node number of the syntax tree are adjusted dynamically. Also shown are the numbers of reproductions and of clonings resulting from failed genetic operators. The performance assessment data are divided into 6 sets, each containing 100 records. Five sets are used as the training sample data to construct the model. One set is used as test data to test the model stability, and eventually cross-validation is carried out. Figure 3 shows the output result obtained when the first five sets of training data are used to construct the genetic programming model.

Figure 3 shows the syntax tree of genetic programming, which can be represented as the following symbolic regression:

mylog(plus(plus(square(plus(mylog(plus(mylog(X4),
plus(square(mylog(X4)),square(mydivide(mylog(X4),
X4))))),mylog(X4))),square(mydivide(mylog(square
(plus(X2,X1))),X2))),square(X1)))

Next, the Back-Propagation Neural Network is adopted to construct the Enterprise Operational Performance model. For the selection of the neural network architecture, this article refers to the publication of Yeh [16]. For general problems, one hidden layer can be adopted, and the number of neurons in the hidden layer can be set to the number of input-layer neurons (5) plus the number of output-layer neurons (1), divided by 2, which is 3. By doing so, good forecast results can be obtained. The network architecture is shown in Fig. 4.

Fig. 4 BPN network architecture of Enterprise Operational Performance model and Financial Characteristic Detection model

Finally, Logistic Regression, a traditional statistical model, was also adopted to construct the Enterprise Operational Performance model so as to compare the advantages and disadvantages of the three models.

3.3 Financial characteristic analysis in the enterprise’s operational and management performance

In this article, all the assessment indexes of the enterprise’s operational and management performance are used again to perform the financial characteristic analysis of the enterprises. Among the financial five forces, stability represents the enterprise’s short term liquidity and level of debt, and activity
represents whether the production management and inventory management of an enterprise are appropriate or not, or whether there is any drawback in the accreditation policy. Cluster analysis is performed using the recent Kmedoid technique and the fuzzy cluster method GK cluster, with the assessment indexes as input data. The data are divided into 2 clusters. Moreover, based on the work of Liu and Xu [17], a recent Fuzzy Sammon nonlinear mapping algorithm is used in order to map the cluster result, which is shown in Fig. 5.

Fig. 5 Two cluster model output results of the Financial Characteristic Detection model of enterprises in Mainland China

The cluster analysis model efficiency indexes include the following.

Partition Index (PI) is, after clustering analysis, the sum over the groups of the ratio of the closeness of each group to its group center to the divergence among the group centers; for this value, the smaller the better. Its formula is:

$$PI = \sum_{i=1}^{c} \frac{\sum_{j=1}^{N} (\mu_{ij})^m \, \| x_j - v_i \|^2}{N_i \sum_{k=1}^{c} \| v_k - v_i \|^2}$$

Separation Index (SI) is the ratio of the summed closeness of each group to its group center to the minimal distance between the group centers; for this value, the smaller the better. Its formula is:

$$SI = \frac{\sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^2 \, \| x_j - v_i \|^2}{N \min_{i,k} \| v_k - v_i \|^2}$$

Xie and Beni’s Index (XBI) is the ratio of the summed closeness of each group to its group center to the minimal closeness of a data point to a group center; for this value, the smaller the better. Its formula is:

$$XBI = \frac{\sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^m \, \| x_j - v_i \|^2}{N \min_{i,j} \| x_j - v_i \|^2}$$

where $\mu_{ij}$ is the membership of data point j in cluster i.

The cluster centers obtained are as follows. The cluster centers of Kmedoid are:

v1 = [0.0596 0.1743 0.0353 0.6417 0.3099]
v2 = [0.0591 0.0997 0.0207 0.2333 0.1805]

The cluster centers of GK Cluster are:

v1 = [1.0526 1.3826 21.3463 11.6378 0.6699]
v2 = [1.4917 1.3515 3.5481 5.6227 0.6245]

From the comparison of the cluster analysis results, it is found that one group consists of enterprises with both good stability and good activity, 263 in number (represented by 0). The other group consists of enterprises with both bad stability and bad activity, 337 in number (represented by 1). With reference to the cluster indexes in Bensaid et al. [18] and Xie and Beni [19], it is found that for the hard cluster method Kmedoid, the PI (Partition Index) is 2.1828, the SI (Separation Index) is 0.0036 and the XBI (Xie and Beni’s Index) is 7.1744, which are all higher than those of the fuzzy cluster method GK Cluster, namely a PI of 0.0182, an SI of 3.0340e−005 and an XBI of 3.5714. Since the smaller the PI, SI and XBI, the better the cluster result, the GK cluster analysis result is used for the subsequent analysis in this article.

First, the cluster result is used as the dependent variable and the five space indexes are used as independent variables to construct the Enterprise Finance Characteristic (EFC) models using genetic programming, Back-Propagation Network and Logistic Regression. It is expected that the result can be used as a reference by enterprises, so that managers can check the financial status of their enterprise at any time and make appropriate adjustments.

As before, five sets are used as training sample data to construct the model. One set is used as test data to test the model stability. Finally, cross-validation is carried out. The model parameter selection for genetic programming was the same as in Sect. 3.2. Figure 6 shows the output result obtained when the first five sets of training data are used to construct the Genetic Programming model.

Figure 6 shows the syntax tree of genetic programming, which can be represented as the following symbolic regression:
mydivide(square(X5),plus(mydivide(mydivide(X1,
plus(mydivide(mydivide(square(X5),plus(X5,X2)),X4),
plus(X4,square(X5)))),plus(plus(X1,times(X4,X3)),
plus(X4,square(plus(plus(X5,square(X5)),
square(X5)))))),plus(X4,square(plus(X4,square(plus
(X1,square(X5))))))))

Fig. 6 The genetic programming output result for constructing the Financial Characteristic Detection model of enterprises in Mainland China

Then the Back-Propagation Neural Network is adopted again to construct the model, and the network architecture is shown in Fig. 4. Finally, Logistic Regression, a traditional statistical model, is adopted to construct the model so as to compare the advantages and disadvantages of the three Financial Characteristic Detection models.

3.4 General comparison of classification forecast competence of the three models

In the output results of the genetic programming model and the Back-Propagation Neural Network model in this article, values smaller than or equal to 0.5 are classified as 0 and values larger than 0.5 are classified as 1, so as to observe the forecast competences of the three models. After cross-validation of the 6 sets of data with the results generated by the three models, SPSS statistical software is used to plot the ROC curve, which is shown in Fig. 7.

Fig. 7 The ROC curve of the classification forecast result of the Enterprise Operational Performance model and the Enterprise Finance Characteristic detection model

The figure shows the ROC curves of the classification forecast results from the cross-validation of the six sets of data for the Enterprise Operational Performance model and the Financial Characteristic Detection model. Bradley [20] pointed out that the larger the area between the curve and the reference line (that is, the larger the area under the curve), the more accurate the classification forecast competence of that model. It can be seen from the figure that, for both the Enterprise Operational Performance model and the Financial Characteristic Detection model, the genetic programming model shows the best classification competence. Table 2 gives the ROC curve analysis output result, wherein Sensitivity
(Sen) means the percentage of cases with a real value of 1 that receive a forecast result of 1, and Specificity (Spe) means the percentage of cases with a real value of 0 that receive a forecast result of 0. Moreover, Hand and Till [21] pointed out that Gini Index = 2 × AUC − 1. For all these index values, the larger the better. As seen in the table, for the genetic programming model the sensitivity, specificity, area under the curve (AUC), Gini Index and overall mean value are all higher than those of the other models. Hence, it has a very good alarming and detection competence.

Table 2 The analysis output result of ROC curve

Model         Sen      Spe      AUC      Gini     AVG
EOP   GP      0.943    0.953    0.948    0.896    0.935
      BPN     0.843    0.859    0.851    0.702    0.814
      LR      0.783    0.835    0.809    0.618    0.761
EFC   GP      0.947    0.947    0.947    0.894    0.934
      BPN     0.869    0.886    0.878    0.756    0.847
      LR      0.840    0.886    0.863    0.726    0.829

Finally, this article further examined BPN schemes with different numbers of neurons in the hidden layer, from 1 to 5. This was done to verify the statement of Yeh [16] that one hidden layer can be adopted and that the number of hidden neurons can be set to the input layer (5) plus the output layer (1) divided by 2, which is 3, in order to obtain the best predictions. From Table 3, we can see that the classification capability of the BPN with 3 hidden neurons is indeed better than that of the other BPN models. During the training stage, these 5 BPN models were each trained with a learning rate of 0.5 and 1000 learning iterations, and learning was terminated when the deviation fell below 0.005.

The classification capability indexes include:

$$\text{Recall (R)} = \frac{N_A}{N_G} \times 100\%$$

$$\text{Precision (P)} = \frac{N_A}{N_A + N_B} \times 100\%$$

$$\text{F Value (F)} = \frac{2RP}{R + P} \times 100\%$$

Recall in Table 3 represents the rate at which bad companies are correctly forecast; a high rate means that fewer wrong judgments are made when forecasting the companies at risk. Precision means the rate of correct judgment among the companies identified as bad. It reflects the capability to find bad companies without identifying normal companies as bad. F Value is the harmonic mean of the recall rate and the precision rate. It combines the recall rate and the precision rate into a general judgment index.

Since for these three indexes the higher the better, the three index values of the Genetic Programming model in the table show that its classification forecast capability is clearly superior to that of the five Back-Propagation Neural Network models and the Logistic Regression model. The three index values of the five Back-Propagation Neural Network models and of the Logistic Regression model are relatively close, with no significant difference among them, and this point can serve as a reference for future researchers.

4 Conclusion and suggestion

The business operation and management performance of enterprises in Mainland China has a great impact on Taiwanese enterprises. It is thus necessary to understand in depth the business operation, management performance and financial characteristics of enterprises in Mainland China, and at the same time the development of a proper detection model is required. Most of the existing literature on the occurrence of enterprise financial crises constructs models for financial pre-warning and credit rating; the purpose of this study, however, was different from those in
Table 3 The analysis output result of three classification capability index

Model        GP       BPN       BPN       BPN       BPN       BPN       LR
                      (5.1.1)   (5.2.1)   (5.3.1)   (5.4.1)   (5.5.1)
EFC
Recall       0.9911   0.9644    0.9703    0.9852    0.9703    0.9318    0.9733
Precision    0.9056   0.8783    0.8828    0.8956    0.8850    0.8587    0.8895
F Value      0.9464   0.9193    0.9245    0.9382    0.9257    0.8937    0.9295
EOP
Recall       0.9867   0.9367    0.9367    0.9833    0.9300    0.9433    0.9733
Precision    0.9834   0.8949    0.9243    0.9736    0.9426    0.9042    0.9669
F Value      0.9850   0.9153    0.9305    0.9784    0.9362    0.9233    0.9701
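The index formulas of Sect. 3.4 can be checked against the reported tables with a few lines of code. The sketch below is only illustrative: the article does not define N_A, N_B and N_G explicitly, so the code assumes that N_A is the number of bad companies correctly flagged, N_B the number of good companies wrongly flagged and N_G the number of actual bad companies. The function names are hypothetical, and rates are kept as fractions rather than percentages.

```python
def recall(n_a, n_g):
    """Share of actual bad companies that were flagged (assumed meaning of N_A / N_G)."""
    return n_a / n_g

def precision(n_a, n_b):
    """Share of flagged companies that are actually bad (assumed meaning of N_A / (N_A + N_B))."""
    return n_a / (n_a + n_b)

def f_value(r, p):
    """F Value as printed in the article: 2RP / (R + P), the harmonic mean of R and P."""
    return 2 * r * p / (r + p)

def gini_index(auc):
    """Gini Index = 2 * AUC - 1 (Hand and Till [21])."""
    return 2 * auc - 1

# Consistency check against the reported GP rows.
f_eop_gp = f_value(0.9867, 0.9834)   # Table 3, EOP, GP: reported F Value 0.9850
f_efc_gp = f_value(0.9911, 0.9056)   # Table 3, EFC, GP: reported F Value 0.9464
gini_eop_gp = gini_index(0.948)      # Table 2, EOP, GP: reported Gini 0.896
print(round(f_eop_gp, 4), round(f_efc_gp, 4), round(gini_eop_gp, 3))
```

Recomputing the indexes this way reproduces the reported GP values to the printed precision, which suggests the tables follow the formulas as stated.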
the past literature, in that it constructed models for enterprise performance and for financial characteristics. Before these two models can be constructed effectively, data mining must first be used to uncover the enterprise performance (by using GRA or DEA) or the financial characteristics (by using Cluster Analysis). This accounts for why similar model construction approaches were rarely found in the past literature. Hence, this study possesses a certain degree of originality. The main contribution of this study is to apply grey relational analysis in order to investigate the business operation and management performance of 600 enterprises in Mainland China and to construct an Enterprise Operational Performance model. In addition, this article also adopts cluster analysis to perform the financial characteristic analysis of business operation and management and to construct an enterprise’s financial characteristic detection model. It is hoped that the results can be used by enterprises as a reference for enhancing business operation and management performance and for improving the enterprise’s financial situation. Through cross-validation of six sets of data, for both the Enterprise Operational Performance model and the Financial Characteristic Detection model, the Genetic Programming model has the best classification forecast performance of all three models.

Genetic programming, Back-Propagation Neural Network and Logistic Regression were used for the model construction in this article. In the future, we suggest that other models (for example, fuzzy neural networks and decision trees) be adopted to further investigate the detection competences of such models.

Acknowledgements The author appreciated the time and effort of the reviewers.

References

1. Dain RA (1998) Developing mobile robot Wall-Following algorithms using genetic programming. Appl Intell 8(1):33–41
2. Wu CH, Tsai CH (2008) Robust classification for spam filtering by back-propagation neural networks using behavior-based features. Appl Intell 31(2):107–121
3. Jang GS, Lai F, Jiang BW, Parng TM, Chien LH (1993) Intelligent stock trading system with price trend prediction and rever-
7. Lin SY (2004) Evaluation of business reputation in information service industry-an application of grey relational analysis. J Inf Technol Soc (2):79–95
8. Huang SJ, Chiu NH, Chen LW (2008) Integration of the grey relational analysis with genetic algorithm for software effort estimation. Eur J Oper Res 188(3):898–909
9. Wang CH, Hsu LC (2008) Using genetic algorithms grey theory to forecast high technology industrial output. Appl Math Comput 195(1):256–263
10. Bahrampour S, Moshiri B, Salahshoor K (2010) Weighted and constrained possibilistic C-means clustering for online fault detection and isolation. Appl Intell. doi:10.1007/s10489-010-0219-2. Available online 13 March 2010
11. Gustafson DE, Kessel WC (1979) Fuzzy clustering with fuzzy covariance matrix. In: Proceedings of the IEEE CDC, San Diego, pp 761–766
12. Koza JR (1992) Genetic programming I: on the programming of computers by means of natural selection. MIT Press, Cambridge
13. Koza JR (1992) Genetic programming II: automatic discovery of reusable programs. MIT Press, Cambridge
14. Holland J (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor
15. Wen KL, Chang-Chien SK, Yeh CK, Wang CW, Lin HS (2006) Apply MATLAB in grey system theory. Chuan Hwa Book CO, Ltd
16. Yeh YC (2001) The model application and practice of artificial neural network. Scholars Books Co, Ltd
17. Liu L, Xu W (2006) UOFC-AINet: A fuzzy immune network for unsupervised optimal clustering. In: International conference on computational intelligence for modelling control and automation and international conference on intelligent agents web technologies and international commerce (CIMCA’06), p 196
18. Bensaid AM, Hall LO, Bezdek JC, Clarke LP, Silbiger ML, Arrington JA, Murtagh RF (1996) Validity-guided (Re) clustering with applications to image segmentation. IEEE Trans Fuzzy Syst 4:112–123
19. Xie XL, Beni GA (1991) Validity measure for fuzzy clustering. IEEE Trans Pattern Anal Mach Intell 3(8):841–846
20. Bradley AP (1997) The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit 30(7):1145–1159
21. Hand DJ, Till RJ (2001) A simple generalisation of the area under the ROC curve to multiple class classification problems. Mach Learn 45(2):171–186

Wen-Tsao Pan works at the Department of Information Management, Fooyin University, Taiwan ROC. His current research interests in-
sal recognition using dual-module neural networks. Appl Intell clude machine learning, data min-
3(3):225–248 ing, financial prediction and compu-
4. Deng J (1982) The control problems of grey system. Syst Control tational intelligence. He is interna-
Lett (5):288–294 tional journal referee of Economic
5. Lin CT, Chi LW, Chiu YS (1996) The application of grey relational Modelling, Knowledge-Based Sys-
analysis in physical education training. Thesis. In: the first grey tems, etc. . . . . His papers have been
system theory and application forum, pp 333–340 appeared in Expert Systems with
6. Wang YJ (2006) Applying grey relation analysis to find the repre- Applications, Neural Computing
sentative indicators of financial ratios for evaluating financial per- and Applications, etc.
formance of container shipping companies on Taiwan. Marit Q
15(1):1–17

