
The 5th International Conference on Computer Science & Education, Hefei, China.

August 24-27, 2010

ThP9.9

An Application of Factor Analysis on Road Traffic Accident


Yang Haixia
Business College of Shanxi University, Taiyuan, China

Nan Zhihong
School of Information Management, Shanxi University of Finance & Economics, Taiyuan, China

Abstract: This paper uses factor analysis to analyze the causes of 372 traffic accidents that occurred in China. Five main factors are extracted and interpreted, which can provide strategic support for traffic control departments as well as warnings to perpetrators.

Index Terms: Road traffic, Factor analysis, Major factor

I. INTRODUCTION

In recent years, with the improvement of living standards and road conditions, the number of motor vehicles has been increasing, and so has the number of traffic accidents. It is therefore vital to uncover the valuable hidden information and little-known factors behind these shocking accidents so as to give warnings to people. Avoiding road traffic accidents is an interactive process involving people, vehicles and roads, and the accident rate is an unpredictable function constrained by these three factors. Previous research on traffic accidents has mainly used mathematical methods such as frequency analysis, probability analysis, correlation analysis and cluster analysis [1]. In-depth research has not been carried out, which is a real drawback for drivers and the relevant departments.

Data mining is a technology that extracts valuable information from large volumes of recorded data [2]; the extracted information can help those concerned make rational decisions. Factor analysis is an important data mining technique: it can identify the different causes of traffic accidents using a smaller number of comprehensive indicators, so that accident prevention and management become more effective. In view of this, this paper takes traffic news as its information source, analyzes the causes of traffic accidents by factor analysis, and extracts five main factors.

II. METHODS

A. Selection of samples and indicators
In this paper, 372 news reports about traffic accidents were selected as samples. By reading them, descriptions associated with the causes of accidents and the road conditions can be found, for example breaking through bars illegally, drunken driving, dense fog, slippery snow, slippery rain, manned trucks, steep slopes, sharp curves and so on [3]. After analysis, 16 indicators were selected, as shown in Table I, which basically reflect all the factors of traffic accidents [4] [5].

TABLE I. TABLE OF INDICATOR VARIABLES

No.  Name       Meaning
1    CZHD       Illegal road occupation, including crossing the road, driving against traffic, breaking through bars illegally, etc.
2    CTQ        Weather conditions, including dense fog, snow, rain, etc.
3    C3         Improper operation
4    C5         Fatigue driving
5    C6         Tire puncture
6    C7         Overloading
7    C9         Drunken driving
8    C10        Braking failure
9    C11        Overspeeding
10   C13        Driving without a license
11   CURVE      Corners, sharp curves, etc.
12   PATH       Hill road
13   SLOPE      Steep slopes
14   ROADSTATE  Road damage
15   CARRYMAN   Manned vehicles
16   CLIFF      Cliff, glen, waterfront, etc.

B. The characteristics of the samples
1) The Pearson [6] correlation coefficient formula is used to calculate the correlation matrix of the 16 indicators in the SPSS environment. According to the correlation matrix shown in Table II, most variables are relatively strongly correlated, so it is appropriate to carry out a factor analysis.

978-1-4244-6005-2/10/$26.00 2010 IEEE


TABLE II. CORRELATION COEFFICIENT MATRIX
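As a rough illustration of the correlation step, the following Python sketch computes a Pearson correlation matrix with NumPy. The paper performed this step in SPSS; the function name `pearson_correlation_matrix` and the small 0/1 data set are hypothetical and are not the paper's 372 samples, whose coding is not described.

```python
import numpy as np

def pearson_correlation_matrix(data):
    """Return the Pearson correlation matrix of the columns of `data`.

    `data` has one row per accident report and one column per indicator.
    """
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    # Sample covariance of every pair of columns.
    cov = centered.T @ centered / (x.shape[0] - 1)
    std = np.sqrt(np.diag(cov))
    # Dividing by the product of standard deviations gives correlations.
    return cov / np.outer(std, std)

# Hypothetical indicator data: 6 reports (rows) and 3 indicators (columns),
# where 1 means the indicator was mentioned in the report.
toy_data = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 1],
])
toy_corr = pearson_correlation_matrix(toy_data)
```

Off-diagonal entries of `toy_corr` with a large absolute value would indicate indicators that tend to occur together, which is the pattern that motivates a factor analysis.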

2) Bartlett Test of Sphericity and KMO (Kaiser-Meyer-Olkin) Test.


TABLE III. KMO TEST AND BARTLETT SPHERICITY TEST RESULTS
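The two suitability checks in this step can be sketched in Python as follows. This is a minimal sketch based on the standard textbook formulas for the KMO measure and Bartlett's statistic, not the SPSS implementation used in the paper; the function names and the 3x3 correlation matrix are assumptions made for illustration.

```python
import numpy as np

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    p2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + p2)

def bartlett_sphericity(corr, n_samples):
    """Return Bartlett's chi-square statistic and its degrees of freedom."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n_samples - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return statistic, dof

# Hypothetical correlation matrix for three indicators (not the paper's data).
toy_corr = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
kmo_value = kmo(toy_corr)
statistic, dof = bartlett_sphericity(toy_corr, n_samples=372)
```

The accompanying probability is the upper tail of a chi-square distribution with `dof` degrees of freedom evaluated at `statistic` (for example via `scipy.stats.chi2.sf`, which is not imported here). A small probability rejects the hypothesis that the correlation matrix is an identity matrix.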

It can be concluded from the test results shown in Table III that the KMO value of 0.734 makes the data suitable for factor analysis according to the standard given by the statistician Kaiser. Bartlett's test of sphericity gives an accompanying probability of 0.000, which is less than the significance level of 0.05; the null hypothesis of Bartlett's test should therefore be rejected, so the data are suitable for factor analysis [7].

3) The communality of each of the 16 variables is calculated before and after the common factors are extracted. The result is shown in Table IV.

TABLE IV. VARIABLES COMMON LEVEL

The communality of a variable is the percentage of the original variable's information that is reflected by the common factors, that is, the contribution of all common factors to the total variance of $x_i$. As the Extraction column shows, most of the communalities are relatively large, which means that most of the information is retained when the variable space is transformed into the factor space. Therefore, the effect of factor analysis is remarkable.

C. To obtain factor variables
1) There are a variety of methods for extracting factor variables. In this paper, factors are extracted by principal component analysis. The cumulative contribution rate of the first 5 factors reaches 78.082%, which explains most of the information contained in the original variables [8], as shown in Table V.

TABLE V. TOTAL VARIANCE EXPLAINED

2) Calculation of the factor loading matrix. As Table VI shows, five common factors are extracted. Let $F_1, F_2, \ldots, F_5$ denote the first, second, ..., fifth main factors; the factor model is as follows:

$$
\begin{aligned}
\mathrm{CZHD} &= 0.437 F_1 + 0.718 F_2 - 0.165 F_3 - 0.009 F_4 + 0.413 F_5 \\
\mathrm{CTQ} &= 0.672 F_1 + 0.075 F_2 + 0.260 F_3 - 0.206 F_4 - 0.170 F_5 \\
&\;\;\vdots \\
\mathrm{CLIFF} &= 0.806 F_1 - 0.390 F_2 - 0.216 F_3 - 0.122 F_4 - 0.056 F_5
\end{aligned}
$$
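The extraction step and the loadings that appear as coefficients in the factor model can be illustrated with a short Python sketch. It assumes the usual principal component approach (eigen-decomposition of the correlation matrix, with loadings equal to eigenvectors scaled by the square roots of the eigenvalues); the function name `extract_factors` and the 4x4 correlation matrix are hypothetical, and the paper's own values come from SPSS.

```python
import numpy as np

def extract_factors(corr, n_factors):
    """Principal component extraction from a correlation matrix.

    Returns the eigenvalues (descending), the cumulative contribution
    rates, the loading matrix, and the communality of every variable.
    """
    corr = np.asarray(corr, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Share of total variance explained by the first k factors.
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Loadings: eigenvectors scaled by sqrt(eigenvalue); signs are arbitrary.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    # Communality: sum of squared loadings across the retained factors.
    communalities = np.sum(loadings ** 2, axis=1)
    return eigvals, cumulative, loadings, communalities

# Hypothetical correlation matrix with two groups of related indicators.
toy_corr = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
# Number of eigenvalues greater than 1 (the Kaiser criterion).
n_keep = int(np.sum(np.linalg.eigvalsh(toy_corr) > 1.0))
eigvals, cumulative, loadings, communalities = extract_factors(toy_corr, n_keep)
```

Each row of `loadings` plays the role of one equation in the factor model: the entries are the coefficients of the retained factors for that variable, and `cumulative[n_keep - 1]` is the cumulative contribution rate of the retained factors.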


TABLE VI. FACTOR LOADING MATRIX
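The loadings reported in Table VI are unrotated. As a hedged illustration of the variance-maximizing (varimax) rotation that the paper goes on to apply, the sketch below implements the common SVD-based varimax iteration without Kaiser normalization. The paper's SPSS settings are not stated, and the function names and the 4x2 loading matrix are hypothetical.

```python
import numpy as np

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    squared = np.asarray(loadings, dtype=float) ** 2
    return float(np.sum(np.var(squared, axis=0)))

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally to increase the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    base = np.asarray(loadings, dtype=float)
    n_vars, n_factors = base.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = base @ rotation
        column_ss = np.sum(rotated ** 2, axis=0)
        # Gradient of the varimax criterion with respect to the loadings.
        gradient = rotated ** 3 - rotated * (column_ss / n_vars)
        u, s, vt = np.linalg.svd(base.T @ gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient step
        current = float(np.sum(s))
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return base @ rotation, rotation

# Hypothetical unrotated loadings: every variable loads on both factors.
toy_loadings = np.array([
    [0.7, 0.5],
    [0.8, 0.4],
    [0.6, -0.5],
    [0.7, -0.6],
])
rotated_loadings, rotation_matrix = varimax(toy_loadings)
```

Because the rotation is orthogonal, the communality of each variable is unchanged; only the distribution of the loadings across factors changes, which is what makes the rotated factors easier to name.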

The five factor variables are composites of the original variables: they are obtained by linear transformation of the original variables and carry new meanings. In practice, the analysis is conducted mainly by examining the values in the loading matrix in order to obtain the correlation between the factor variables and the original variables. However, a row or a column of the loading matrix may contain more than one large factor loading, in which case the meaning of the factor variable is unclear, so factor rotation is necessary. In this paper, the maximum variance (varimax) method is used for factor rotation, and the factor loading matrix after rotation [9] [10] is shown in Table VII.

TABLE VII. FACTOR LOADING MATRIX AFTER ROTATION

3) The factor score covariance matrix is calculated. As shown in Table VIII, the factor score covariance matrix is an identity matrix, which indicates that the 5 extracted common factors are uncorrelated with each other.

TABLE VIII. THE COVARIANCE MATRIX OF FACTOR SCORE

D. Visual results of factor analysis for traffic accidents
As illustrated in Fig. 1, the vertical axis of the scree plot [11] is the eigenvalue, while the horizontal axis is the factor number. The eigenvalues of the first 5 factors are all greater than 1 and the polyline is steep over that range, while the line flattens out from the sixth factor onward. Visual inspection therefore also supports selecting 5 factors.

Figure 1. Scree plot

III. ANALYSIS OF RESULTS

As shown in Table VII, the five main factors extracted from traffic accidents through factor analysis are as follows:

1) Performance factor
The first principal factor is mainly determined by the variables CLIFF, PATH, C7 and SLOPE, whose loadings on this factor are 0.922, 0.884, 0.855 and 0.769 respectively. These variables relate road traffic to road performance and truck performance, so this factor may be called the performance factor.

2) Strain factor
The second principal factor is mainly determined by the variables C5 and C3, whose loadings on this factor are 0.821 and 0.750. These variables are related to the resilience of the drivers, so this factor may be called the strain factor.

3) Emergency factor
The third principal factor is mainly determined by the variables CZHD and C6, whose loadings on this factor are 0.894 and 0.861. The situations corresponding to these variables cannot be predicted by the driver while driving, so this factor may be called the emergency factor.

4) Quality factor
The fourth principal factor is mainly determined by the variables C13 and C9, whose loadings on this factor are 0.753 and 0.746. These variables are related to the quality of the drivers, so this factor may be called the quality factor.

5) State factor
The fifth principal factor is mainly determined by the variables ROADSTATE and C10, whose loadings on this factor are 0.814 and 0.532. These variables are related to road condition and vehicle condition, so this factor may be called the state factor. Of course, a more accurate interpretation and naming would still require consulting an expert on traffic accidents.

IV. COUNTERMEASURE AND SUGGESTIONS

In summary, when the implicit knowledge in unforeseeable, uncertain and incidental accidents is combined with unpredictable environmental variation and people's habits, the implicit knowledge can be turned into explicit knowledge, which offers warnings about traffic accidents.

1) Improve the technical standard of road construction, and make accident avoidance an important criterion in road design and evaluation.
2) Carry out accident-warning programs and emphasize the condition of vehicles and other hardware; make vehicle inspection more effective in terms of regulations, standards and principles; and explore advanced sensing technology for accident warning, such as warnings of tire explosion and malfunction.
3) Enhance drivers' education and sense of responsibility, strengthen the people-oriented approach [15], and handle traffic accidents from a comprehensive view involving people, vehicles, roads and the environment.
4) Improve road signage, especially on riskier and more dangerous roads, where multiple measures involving sound, light and electricity are a better way to warn drivers; furthermore, the soft environment of accident-prone areas should be optimized to achieve a more effective result.
5) Devote more scientific research to data, information and knowledge about accident avoidance, and apply effective research results to society through various channels to enhance public safety awareness.

ACKNOWLEDGMENT

The authors appreciate the help of colleagues at the Business College of Shanxi University and Shanxi University of Finance & Economics.

REFERENCES

[1] Xu Hongguo, He Biao. Road Traffic Accident Analysis and Reproduction. Beijing: Police Education Press, 1996.
[2] Han Jiawei, Micheline Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2000.
[3] http://www.122cn.com.
[4] Ron Kohavi. Data Mining and Visualization. National Academy of Engineering, USA, Frontiers of Engineering, 2000.
[5] Chen Jingmin, eds. Data Warehouse and Data Mining. First Edition. Electronic Industry Press, 2002.
[6] Fan Yuan, Hao Liren, Hao Zheou, eds. SPSS Statistical Analysis Practical. First Edition. China Water Power Press, 2003.
[7] Li Wei, Wang Wei, Deng Wei. Information Visualization and Software Design in Provincial Highway Network Management System. Technology of Highway Traffic, 2005, 22(7).
[8] Luo Yonglian. How to Exclude the Noise of Emergency Corpus and Eliminate Duplication in Web Page. Master's thesis, Shanxi University, 2005.
[9] Ma Qingguo. Management Statistics, Data Acquisition, Statistical Theory, SPSS Tools and Application. First Edition. Science Press, 2002.
[10] Ma Hongbo, Chen Zhibo, Lu Shouyi. Database Visualization Technology in the Field of Statistical Data Management Application. Computer Applications, 2004, 19(2).
[11] Su Jinming, Fu Ronghua, Zhou Jianbin, Zhang Lianhua. Statistical Software SPSS Series of Secondary Development. First Edition. Electronic Industry Press, 2003.
[12] [U.S.] Tom Soukup, Ian Davidson; translated by Zhu Jianqiu, Cai Weijie. Visual Data Mining. First Edition. Electronic Industry Press, 2004.
[13] Xuewei. Data Mining Overview. Statistics and Actuarial, 2001.
[14] Yu Jianying, He Xuhong. Data Statistical Analysis and SPSS Application. First Edition. Posts & Telecom Press, 2003.
[15] Chen Yongde. Road Traffic Accident Analysis and Prevention. Beijing: China Communications Press, 1999.
