MANAGERIAL STATISTICS
PROJECT REPORT
In this study, we apply statistical concepts to estimate the probability of accidents
occurring at the various stations of the Mumbai railway network (Western Line). We also
test whether the attributes considered in the study are independent of one another. We began
by collecting accident data for passengers from August 2018 to August 2019. The attributes
considered are Age, Day, Month, Gender, Type of Accident and the Nearest Station to where the
accident occurred. We first calculated a prior probability of an accident happening at each
station. Using Bayes' Theorem, we then used these prior probabilities to determine, given that
an accident has occurred, the probability that it occurred near a particular station. Next, we
applied the test of independence to Gender and Type of Accident to find out whether the type
of accident (minor, major or death) is independent of gender. Finally, we used a regression
model to examine the relationship between the type of accident and the other attributes.
2. Introduction
Formerly known as the Bombay Suburban Railway, the Mumbai Suburban Railway spans 390 km
and carries more than 7.5 million commuters daily. One of the busiest rail networks, it
offers approximately 2342 train services and carries around 2.64 billion commuters a year.
The system is therefore overcrowded throughout the year, with services running from 4:00 a.m.
until 1:00 a.m.
The history of this prestigious railway network dates to April 16th, 1853, when the first train
left Bori Bunder for Thane and covered 34 km in around an hour and a half. Started as an
experiment, it soon became popular with local people for the ease of travel it provided
from one place to another. The lines kept expanding and today it is the second-largest
railway network in Asia.
Indian Railways handles operations and splits them between two divisions, Western Railway
and Central Railway. Our primary focus is the Western Railway network, which runs from
Churchgate parallel to the west coast of the Mumbai Metropolitan Region, Maharashtra. It
consists of 37 railway stations, from Dahanu Road in the north to Churchgate station in the
south. The Western Railway's Electric Multiple Unit, which was approved in the 2012-13
Railway Budget, runs on 25 kV AC power and uses 9-car rakes. The conversion from DC to AC
was carried out to improve the punctuality and energy efficiency of the network, allowing
the trains to reach 100 km/h. Slow trains halt at all stations, while fast trains stop only
at the important ones.
While the railway services make transportation easier for people, this extensive network
also has many negative aspects. The most important is the number of accidents that take
place daily. Reports show that about 2000 passengers die annually, and that between 2002
and 2012 more than 36000 people met with fatal accidents and around 37000 passengers were
severely injured. Although overcrowding is the main cause of accidents, there are several
others. Trains halt for a mere 10 seconds, and passengers hastily trying to board or get off
within that window account for a major share of the accidents. Some passengers die when
they travel on the roof to avoid the crowd and accidentally touch the high-voltage wires.
Open doors and windows are another cause: people travel hanging off the edge of the
footboard or off door ledges, and often lose their grip or slip off the train. Teenagers
performing stunts in the doorways are a further cause. People also often lose their lives
crossing the tracks on foot, simply to save time by avoiding the footbridges.
2.1 Problem Statement
This report studies the number of accidents that occurred at each station of the Western
Line during the period August 2018-August 2019, the demographics of the passengers who met
with accidents, the type of injuries and the frequency of such incidents every month. We
also want to determine the independence of, and the relationships between, the attributes
considered in our study. Our objectives are:
1. To determine the probability of accidents occurring at each station during the period
August 2018-August 2019.
2. To apply Bayes' Theorem to determine, given that an accident has occurred, the
probability that it occurred near a particular station.
3. To apply the Test of Independence and determine whether gender and type of accident
are independent of each other.
4. To determine the relationship between the type of accident and gender, age group and
day of the week through a Regression model for all types of accidents.
3. Methodology
$$P(A \mid B) = \frac{P(A) \, P(B \mid A)}{P(B)}$$
Where,
P(A) = Probability of event A occurring
P(B) = Probability of event B occurring
P(A | B) = Probability of event A given that event B occurred.
P(B | A) = Probability of event B given that event A occurred.
A total of 1356 accidents occurred from August 2018 to August 2019. After tabulating the
data in Excel, our first objective was to determine the probability of an accident taking
place at any randomly selected railway station in our study. We use these probabilities as
prior probabilities. Next, we applied Bayes' theorem to these prior probabilities to
determine, given that an accident has occurred, the probability that it occurred near a
specific station. We assumed the conditional probabilities from the data on accidents that
occurred near each station: the stations with the most accidents were given higher
conditional probabilities and the stations with the fewest accidents were given lower
values. This analysis helped us estimate, for any accident that has occurred, the
probability that it occurred near a specific station.
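The Bayes' theorem step described above can be sketched in Python as follows. This is only an illustrative sketch: the three station priors are the rounded values from the probability table in this report, the "OTHER" bucket (all remaining stations combined) and every likelihood value are hypothetical placeholders, since the assumed conditional probabilities are not listed in the report.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem: P(A_i | B) = P(A_i) * P(B | A_i) / P(B).

    P(B) is obtained from the law of total probability over all stations.
    """
    # Numerator of Bayes' theorem for every station
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    # P(B): sum of the joint probabilities
    evidence = sum(joint.values())
    return {s: joint[s] / evidence for s in joint}


# Priors: rounded values from the report's table; "OTHER" is the remainder.
priors = {"KANDIVALI": 0.069, "BORIVALI": 0.064, "MAHALAXMI": 0.009, "OTHER": 0.858}

# Hypothetical conditional probabilities (NOT from the report): higher for
# stations with more recorded accidents, lower for stations with fewer.
likelihoods = {"KANDIVALI": 0.9, "BORIVALI": 0.8, "MAHALAXMI": 0.2, "OTHER": 0.5}

post = posterior(priors, likelihoods)
for station, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{station:<10} {p:.3f}")
```

With these placeholder likelihoods, the posterior for a high-accident station rises above its prior and the posterior for a low-accident station falls below it, which is the qualitative behaviour the method relies on.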
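Step 2: Select a random sample and record the observed frequency, f_ij, for each cell of the table.
$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$
where f_ij is the observed frequency and e_ij is the expected frequency for the cell in row i and column j.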
Step 5: Determine the rejection rule
If p-value ≤ α, or equivalently χ2 ≥ χ2α, then reject Ho; otherwise do not reject Ho.
Here α is the significance level, χ2α is the critical value, and there are (n-1)*(m-1)
degrees of freedom (for a table with n rows and m columns).
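As a rough illustration of these steps, the sketch below computes the expected frequencies, the χ2 statistic and the degrees of freedom for a contingency table using only the Python standard library. The observed counts are hypothetical and are not the report's data. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(-χ2/2); for other table sizes a statistics library would be needed.

```python
import math


def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_ij in enumerate(row):
            # Expected frequency under independence
            e_ij = row_totals[i] * col_totals[j] / n
            stat += (f_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Hypothetical counts: rows = Minor, Major, Dead; columns = Male, Female
observed = [[120, 30], [90, 20], [150, 40]]
alpha = 0.05

stat, df = chi_square_independence(observed)
p_value = math.exp(-stat / 2)  # closed form, valid only when df == 2
print(f"chi2 = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
print("Reject Ho" if p_value <= alpha else "Do not reject Ho")
```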
We apply a test of independence to find out whether the type of accident (death, major or
minor) has any significant relationship with gender. The general perception is that male
passengers are more susceptible to injuries from accidents in and around the Western
line of the Mumbai railway network.
As per the data extracted from the Western Railway website [1], the maximum number of
accidents from August 2018 to August 2019 occurred near Kandivali station.
[Figure: bar chart of the number of accidents at or near each station; x-axis: Station]
From the above chart, it can be inferred that the maximum number of accidents occurred at
or near Kandivali station, followed by Borivali station.
The probability of an accident occurring at or near each station was calculated as the
number of accidents at that station divided by the total of 1356. The maximum probability
is 0.069 at or near Kandivali station and the minimum is 0.009 at or near Mahalaxmi station.
The table below shows the probability calculation for the number of accidents at or near each station.
Station No. of Accidents Probability
ANDHERI 75 0.055
BANDRA 67 0.049
BHAYANDER 61 0.045
BORIVALI 87 0.064
CHARNI ROAD 26 0.019
CHURCHGATE 23 0.017
DADAR 59 0.044
DAHISAR 60 0.044
ELPHINSTONE 15 0.011
GOREGAON 72 0.053
GRANT ROAD 20 0.015
JOGESHWARI 85 0.063
KANDIVALI 94 0.069
KHAR ROAD 33 0.024
LOWER PAREL 48 0.035
MAHALAXMI 12 0.009
MAHIM 49 0.036
MALAD 60 0.044
MARINE LINES 18 0.013
MATUNGA RD 18 0.013
MIRA RD 26 0.019
MUMBAI CENTRAL (L) 37 0.027
NAIGAON 49 0.036
NALASOPARA 61 0.045
SANTA CRUZ 44 0.032
VASAI RD 58 0.043
VILE PARLE 45 0.033
VIRAR 54 0.040
Total 1356 1.000
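The probability column in the table above can be reproduced with a short Python sketch. The counts are copied from the table; the variable names are illustrative only.

```python
# Accident counts per station, copied from the table above
accidents = {
    "ANDHERI": 75, "BANDRA": 67, "BHAYANDER": 61, "BORIVALI": 87,
    "CHARNI ROAD": 26, "CHURCHGATE": 23, "DADAR": 59, "DAHISAR": 60,
    "ELPHINSTONE": 15, "GOREGAON": 72, "GRANT ROAD": 20, "JOGESHWARI": 85,
    "KANDIVALI": 94, "KHAR ROAD": 33, "LOWER PAREL": 48, "MAHALAXMI": 12,
    "MAHIM": 49, "MALAD": 60, "MARINE LINES": 18, "MATUNGA RD": 18,
    "MIRA RD": 26, "MUMBAI CENTRAL (L)": 37, "NAIGAON": 49, "NALASOPARA": 61,
    "SANTA CRUZ": 44, "VASAI RD": 58, "VILE PARLE": 45, "VIRAR": 54,
}

total = sum(accidents.values())

# Prior probability of an accident at each station = count / total
prior = {station: count / total for station, count in accidents.items()}

highest = max(prior, key=prior.get)
lowest = min(prior, key=prior.get)
print(f"Total accidents: {total}")
print(f"Highest: {highest} ({prior[highest]:.3f})")
print(f"Lowest:  {lowest} ({prior[lowest]:.3f})")
```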
We further analysed the data by gender and found that most of the accidents involve men.
The chart below shows the number of accidents by type of accident and gender.
[Figure: bar chart of the number of accidents by type (Minor, Major, Dead) for Male and Female passengers]
4.2 Bayes’ Theorem
Using Bayes' theorem, we determine, given that an accident has occurred, the probability
that it occurred near a specific station. The table below shows the analysis by Bayes' theorem.
Step 1:
Ho: Type of accident is independent of gender
Ha: Type of accident is not independent of gender
Step 2:
Level of significance alpha = 0.05
Step 3:
The table below lists the types of accident with respect to gender.
The test statistic is calculated from the data in this table. Observed frequency (OF) is
the actual frequency of each combination, and Expected frequency (EF) is the frequency that
would be expected for that combination if type of accident and gender were independent.
Observed Frequency (OF)   Expected Frequency (EF)   OF - EF   (OF - EF)^2   (OF - EF)^2 / EF
132                       135.93                    -3.93     15.43         0.11
Step 4:
The P-value calculated = 0.16
Step 5:
As the p-value (0.16) is greater than alpha (0.05), we cannot reject Ho. Hence, the data do
not provide sufficient evidence that the type of accident depends on gender.
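The decision can be cross-checked with the critical-value form of the rejection rule. The sketch below assumes the contingency table is 3 x 2 (three accident types by two genders, as the categories named in this report suggest), which gives 2 degrees of freedom. For 2 degrees of freedom the chi-square upper-tail probability is exp(-χ2/2), so the value of χ2 for a given tail probability is -2 ln(probability). The statistic is back-calculated from the reported p-value and is not a figure stated in the report.

```python
import math

ALPHA = 0.05      # level of significance used in the report
P_VALUE = 0.16    # p-value reported in Step 4
DF = (3 - 1) * (2 - 1)  # assumed 3 x 2 table: accident types x genders


def chi2_from_tail(prob):
    """Chi-square value whose upper-tail probability is prob (df = 2 only)."""
    return -2.0 * math.log(prob)


critical_value = chi2_from_tail(ALPHA)        # about 5.99
implied_statistic = chi2_from_tail(P_VALUE)   # about 3.67

reject_by_p_value = P_VALUE <= ALPHA
reject_by_critical_value = implied_statistic >= critical_value

print(f"df = {DF}, critical value = {critical_value:.2f}, "
      f"implied statistic = {implied_statistic:.2f}")
print("Reject Ho" if reject_by_p_value else "Do not reject Ho")
```

Both forms of the rule lead to the same decision: the implied statistic is below the critical value, so Ho is not rejected.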
Regression Statistics
Multiple R 0.196
R Square 0.039
Observations 1356
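The R Square value is very low (approx. 0.039), which means the model explains only about
3.9% of the variation in the Y variable. Such a low R Square indicates that the linear
relationship between the X variables and the Y variable is very weak.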
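For reference, R Square can be computed from observed and fitted values as 1 - SS_res / SS_tot, and for a least-squares model with an intercept, Multiple R is its square root. The sketch below is a minimal illustration with toy numbers, not the report's regression; the function name `r_squared` is illustrative. The last lines check that the reported Multiple R and R Square agree with each other up to rounding.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


# Toy data (not from the report): a perfect fit and a mean-only fit
y = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(y, y)                 # model explains everything
mean_only_fit = r_squared(y, [2.5] * len(y))  # model explains nothing

# Consistency check on the reported regression statistics
reported_multiple_r = 0.196
implied_r_square = reported_multiple_r ** 2   # close to the reported 0.039

print(perfect_fit, mean_only_fit, round(implied_r_square, 4))
```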
5. Conclusion and Discussion
1. https://wr.indianrailways.gov.in/index.jsp