
Math Internal Assessment

The Relationship Between the Total Points Earned and the Number of Yellow Cards
Received During the 2003 – 2004, 2004 – 2005, 2005 – 2006, 2006 – 2007, and 2007 –
2008 Seasons of the English Premier League

Alisha Narula
IB Math Studies
Mr. Clement
International School of Bangkok
November 27, 2009
000307 - 149
Alisha Narula
November 17, 2009
Math Studies

Math Internal Assessment


Title:
The Relationship Between the Total Points Earned and the Number of
Yellow Cards Received During the 2003 – 2004, 2004 – 2005, 2005 – 2006,
2006 – 2007, and 2007 – 2008 Seasons of the English Premier League

Introduction:
The managers of Premier League teams need to review the statistics from
previous seasons in order to make predictions and plan for the coming seasons.
The buying and selling of players is also informed by previous statistics such
as points earned, yellow cards, and red cards. Red cards are the more serious
matter, and referees have been criticized over them in many cases. Yellow cards,
however, are mild and frequent enough to be related to the total number of
points earned by each team. Comparing the two main criteria, the total points
earned and the total number of yellow cards received, should give managers and
committee members enough information to make logical and important decisions
in order to succeed in the forthcoming seasons.
Common sense suggests that the teams receiving more yellow cards are the
teams not playing a fair enough game, and that they end up with fewer points
over a season. Conversely, the better teams, say the top ten, would be expected
to have far fewer yellow cards, reflecting better players and more efficient
strategies. This is an intuitive reading of the situation. To test whether the
relationship holds, a statistical investigation is conducted in which the
statistics for the 2003 – 2008 seasons are examined in detail using tables,
graphs, and calculations, from which clear predictions can be made.
The investigation starts with the simpler statistics and moves on to the
more sophisticated ones. The simpler part consists of finding the mean, median,
mode, lower and upper quartiles, range, and standard deviation, with scatter
plots where possible. The more complicated calculations are Pearson's
Correlation Coefficient (the r value), the $r^2$ value, and the linear
regression line with the scatter plot, leading finally to the Chi-Squared
value, where a hypothesis is stated and tested to see whether the two variables
are independent or not.

Task:
I would like to find whether or not there exists a relationship between
the total points earned and the yellow cards obtained by each team in various
seasons of the English Premier League.
Table # 1: Season 2003 – 2004 Showing Total Points and Yellow Cards Obtained by
20 Teams of the English Premier League

Caption # 1: The table above depicts the 20 teams of the English Premier
League, with their total points earned and total number of yellow cards obtained
during the 2003 – 2004 season, and also including the mean of both x and y.

Mathematical Process
Mean Number of Total Points:
$\bar{x} = \frac{\sum x}{n}$

= 51.6 (3 significant figures)

Using the GDC Statistical Program the following calculations are made
Minimum Value: 33
Lower Quartile: 42.5
Median: 49
Upper Quartile: 56
Max Value: 90

Mean of the Total Number of Yellow Cards:
$\bar{y} = \frac{\sum y}{n}$

= 64.6 (3 significant figures)

Using the GDC Statistical Program the following calculations are made
Minimum Value: 40
Lower Quartile: 58
Median: 63
Upper Quartile: 71.5
Max Value: 89
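
The summary values above can also be reproduced outside the GDC. The following is a minimal sketch in Python, assuming the quartiles are taken as the medians of the lower and upper halves of the sorted data. The list `points` is hypothetical illustration data, not the values from Table # 1, and the function name is an assumption of this sketch.

```python
from statistics import mean, median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical points totals for eight teams (NOT the data from Table # 1).
points = [33, 39, 44, 48, 52, 56, 75, 90]

print("mean:", mean(points))
print("min, Q1, median, Q3, max:", five_number_summary(points))
```

With the 20 values of a full season in place of the hypothetical list, the same function would return the five values listed under each variable.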

The dispersion, or spread, of the data is another important aspect, and it is
measured by the standard deviation. The standard deviation states how far the data
are spread about the mean.

The formula used for this mathematical calculation is:

Standard Deviation: $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$, where $\bar{x}$ is
the mean and n = 20 is the number of teams.
Standard Deviation Table # 2: Standard Deviation Table for the Total Points Earned
(x) during the Season 2003 – 2004 of the English Premier League

Standard Deviation

= $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
= 14.8 (3 significant figures)
Therefore, the SD of (x) is 14.8, which shows the deviation from the mean on both sides.
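
The standard deviation calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the population form of the formula (division by n rather than n - 1); the data list is hypothetical and is not taken from the standard deviation tables.

```python
import math


def population_sd(values):
    """Standard deviation with division by n (population form)."""
    n = len(values)
    mean_value = sum(values) / n
    squared_deviations = [(v - mean_value) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / n)


# Hypothetical data (NOT from the tables): the mean is 5 and the
# squared deviations sum to 32, so the variance is 32 / 8 = 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(sample))
```

Note that the square root is the last step: the quantity before it is the variance, which is in squared units and is usually much larger than the standard deviation.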

Caption # 2: The table and calculations above depict the mathematical process of
Standard Deviation Calculation for the total points earned (x) during the season
2003 – 2004 of the English Premier League.
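Standard Deviation Table # 3: Standard Deviation Table for the Total Number of
Yellow Cards (y) during the Season 2003 – 2004 of the English Premier League

Variance = $\frac{\sum (y - \bar{y})^2}{n}$

= 119 (3 significant figures)

Standard Deviation = $\sqrt{119}$ = 10.9 (3 significant figures)

Therefore, the SD of (y) is 10.9, which shows the deviation from the mean on both sides.
The value 119 is the variance: a standard deviation of 119 is not possible for data
that range from 40 to 89.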

Caption # 3: The table & calculations above depict the mathematical process of
Standard Deviation Calculation for the total number of yellow cards (y) obtained
during the season 2003 – 2004, English Premier League.
Table # 4: Season 2004 – 2005 Showing Total Points and Yellow Cards Obtained by
20 Teams of the English Premier League

Caption # 4: The table above depicts 20 various teams part of the English Premier
League, with their total points earned and total number of yellow cards obtained
during the 2004 – 2005 season, and also including the mean of both x and y.
Standard Deviation Table # 5: Standard Deviation Table for the Total Points Earned
(x) during the Season 2004 – 2005 of the English Premier League
Standard Deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

= 16.7 (3 significant figures)

Therefore, the SD of (x) is 16.7, which shows the deviation from the mean on both sides.

Caption # 5: The table and calculations above depict the mathematical process of
Standard Deviation Calculation for the total earned points (x) during the season 2004
– 2005, English Premier League.

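Standard Deviation Table # 6: Standard Deviation Table for the Total Number of
Yellow Cards (y) during the Season 2004 – 2005 of the English Premier League

Variance = $\frac{\sum (y - \bar{y})^2}{n}$

= 81.4 (3 significant figures)

Standard Deviation = $\sqrt{81.4}$ = 9.02 (3 significant figures)

Therefore, the SD of (y) is 9.02, which shows the deviation from the mean on both sides.
The value 81.4 is the variance: a standard deviation of 81.4 is not possible for
yellow card totals that all lie below 80.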
Caption # 6: The table & calculations above depict the mathematical process of
Standard Deviation Calculation for the total number of yellow cards (y) obtained
during the 2004 – 2005 season, in the English Premier League.

Table # 7: Season 2005 – 2006 Showing Total Points and Yellow Cards Obtained by
20 Teams of the English Premier League

Caption # 7: The table above depicts 20 various teams part of the English Premier League, with
their total points earned and total number of yellow cards obtained during the 2005 – 2006 season, and
also including the mean of both x and y.

Standard Deviation Table # 8: Standard Deviation Table for the Total Points Earned
(x) during the Season 2005 – 2006 of the English Premier League.
Standard Deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

= 17.6 (3 significant figures)

Therefore, the SD of (x) is 17.6, which shows the deviation from the mean on both sides.

Caption # 8: The table and calculations above depict the mathematical process of
Standard Deviation Calculation for the total earned points (x) during the season 2005
– 2006, English Premier League.

Standard Deviation Table # 9: Standard Deviation Table for the Total Number of
Yellow Cards (y) during the Season 2005 – 2006 of the English Premier League

Standard Deviation = $\sqrt{\frac{\sum (y - \bar{y})^2}{n}}$

= 8.38 (3 significant figures)

Therefore, the SD of (y) is 8.38, which shows the deviation from the mean on both sides.
Caption # 9: The table & calculations above depict the mathematical process of
Standard Deviation Calculation for the total number of yellow cards (y) obtained
during the 2005 – 2006 season, in the English Premier League.

Table # 10: Season 2006 – 2007 Showing Total Points and Yellow Cards Obtained
by 20 Teams of the English Premier League

Caption # 10: The table above depicts the 20 teams of the English Premier
League, with their total points earned and total number of yellow cards obtained
during the 2006 – 2007 season, and also including the mean of both x and y.
Standard Deviation Table # 11: Standard Deviation Table for the Total Points
Earned (x) during the Season 2006 – 2007 of the English Premier League.

Standard Deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

= 15.4 (3 significant figures)

Therefore, the SD of (x) is 15.4, which shows the deviation from the mean on both sides.

Caption # 11: The table and calculations above depict the mathematical process of
Standard Deviation Calculation for the total earned points (x) during the season 2006
– 2007, English Premier League.
Standard Deviation Table # 12: Standard Deviation Table for the Total Number of
Yellow Cards (y) during the Season 2006 – 2007 of the English Premier League

Standard Deviation = $\sqrt{\frac{\sum (y - \bar{y})^2}{n}}$

= 12.9 (3 significant figures)

Therefore, the SD of (y) is 12.9, which shows the deviation from the mean on both sides.

Caption # 12: The table & calculations above depict the mathematical process of
Standard Deviation Calculation for the total number of yellow cards (y) obtained
during the 2006 – 2007 season, in the English Premier League.
Table # 13: Season 2007 – 2008 Showing Total Points and Yellow Cards Obtained
by 20 Teams of the English Premier League

Caption # 13: The table above depicts 20 various teams part of the English Premier
League, with their total points earned and total number of yellow cards obtained
during the 2007 – 2008 season, and also including the mean of both x and y.
Standard Deviation Table # 14: Standard Deviation Table for the Total Points
Earned (x) during the Season 2007 – 2008 of the English Premier League.

Standard Deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

= 19.2 (3 significant figures)

Therefore, the SD of (x) is 19.2, which shows the deviation from the mean on both sides.

Caption # 14: The table and calculations above depict the mathematical process of
Standard Deviation Calculation for the total earned points (x) during the season 2007
– 2008, English Premier League.
Standard Deviation Table # 15: Standard Deviation Table for the Total Number of
Yellow Cards (y) during the Season 2007 – 2008 of the English Premier League

Standard Deviation = $\sqrt{\frac{\sum (y - \bar{y})^2}{n}}$

= 10.4 (3 significant figures)

Therefore, the SD of (y) is 10.4, which shows the deviation from the mean on both sides.

Caption # 15: The table & calculations above depict the mathematical process of
Standard Deviation Calculation for the total number of yellow cards (y) obtained
during the 2007 – 2008 season, in the English Premier League.
MEASURING CORRELATION
For a linear association, a concept called correlation is used to measure its
strength and direction. The correlation coefficient lies between -1 and 1.
An r-value of 0 shows no linear association at all, while -1 and 1 show perfect
negative and positive correlation respectively. A positive correlation means that an
increase in one variable goes with an increase in the other. A negative correlation
means that an increase in one variable goes with a decrease in the other. Using the
data x (the total points earned) and y (the total number of yellow cards), Pearson's
Correlation Coefficient is calculated, showing the degree of linearity between the two
variables x and y. In order to do that, a table of values showing x, y, xy, $x^2$, and
$y^2$ is created, and the formula given below is used to calculate the r-value manually.
The results are then checked using the GDC function.
Formula for Pearson's Correlation Coefficient (r-value):
$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$

Table # 16: The Table of Values depicting the values from the 2003 – 2004 season of
the English Premier League

r-value:
r = -0.592
$r^2$ = 0.350 = 35.0%
The r-value of -0.592 shows a moderate negative relationship between the total points
earned, x, and the total number of yellow cards obtained, y. Furthermore, the $r^2$
value shows that 35.0% of the variation in y is accounted for by its linear
relationship with x.
Likewise, the r-values for the other seasons are calculated using the statistics mode
on the GDC.
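
The manual calculation from a table of x, y, xy, x squared, and y squared can be sketched in Python as follows. This is a minimal illustration that assumes the sums-based form of Pearson's formula; the two short lists are hypothetical and do not come from Table # 16.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from the column sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical values (NOT from Table # 16).
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
print(r, r ** 2)
```

Whatever data are used, the result must fall between -1 and 1; a value outside that range signals an arithmetic slip or a misread calculator display.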

Seasons        r values      $r^2$ values

2003 – 2004    -0.592        0.350 = 35.0%
2004 – 2005    -0.00832      0.0000692 = 0.00692%
2005 – 2006    0.0378        0.0015 = 0.15%
2006 – 2007    -0.148        0.0218 = 2.18%
2007 – 2008    -0.428        0.183 = 18.3%

Caption # 16: The tables and calculations above depict Pearson's Correlation
Coefficient, also known as the r-value, which is -0.592 for the 2003 – 2004 season.
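
As a quick check on the relation between the two columns, each r-value can be squared directly. The sketch below is illustrative only: it uses the r-values quoted for four of the seasons, and the helper name `r_squared` is an assumption of this sketch. It also refuses any r outside the range -1 to 1, since such a value cannot be a correlation coefficient.

```python
def r_squared(r):
    """Square a correlation coefficient after checking it is a valid r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"r must lie between -1 and 1, got {r}")
    return r * r


# r-values quoted above for four of the seasons.
r_values = {
    "2003-2004": -0.592,
    "2005-2006": 0.0378,
    "2006-2007": -0.148,
    "2007-2008": -0.428,
}

for season, r in r_values.items():
    r2 = r_squared(r)
    print(f"{season}: r = {r:+.4f}, r^2 = {r2:.4f} ({r2:.2%})")
```

Small differences in the last digit can arise because the GDC squares the unrounded r rather than the three-figure value.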

LEAST SQUARES REGRESSION


The next part of the investigation consists of finding the least squares
regression line, also known as the line of best fit. Together with the scatter plot,
it depicts at a glance a clear picture of the relationship between the variables.
It also yields an equation, which is important for calculating values outside the
range of the graph drawn, that is, extrapolation. Interpolation is not a problem, as
it can be read directly from the graph. Once again, a similar table of values is used
to find the equation of the least squares regression line.
Table # 17: The Table of Values depicting the values from the 2003 – 2004 season of
the English Premier League.
Caption # 17: Using this chart, values will be used in order to calculate the line of
regression.

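Linear Regression Formula:
$y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$

The details to work out the equation:
$s_{xy} = \frac{\sum xy}{n} - \bar{x}\bar{y}$, with $\bar{x} = 51.6$, $\bar{y} = 64.6$, and $s_x = 14.8$
(3 significant figures)

Putting all the details in the main equation gives: $y = -0.4374x + 87.167$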
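
The regression calculation can be sketched in Python as follows. This is a minimal illustration, assuming the slope is the covariance $s_{xy}$ divided by the variance of x, and that the line passes through the point of means; the data lists are hypothetical and do not come from Table # 17.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance in the "mean of products minus product of means" form.
    s_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    s_xx = sum(x * x for x in xs) / n - mean_x ** 2
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)
    return slope, intercept


# Hypothetical points (x) and yellow cards (y), NOT from Table # 17.
xs = [30, 40, 50, 60, 70]
ys = [70, 66, 62, 58, 54]

slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.2f}")
```

A negative slope here means that teams with more points tend to have fewer yellow cards, which is the direction the investigation expects.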

The scatter plot showing the line of best fit is shown as follows:

Graph # 1: Scatter Plot of Data from Season 2003 – 2004 with a Linear Regression
Line & its Equation

[Scatter plot: Yellow Cards vs. Goal Points Earned for 2003-2004. x-axis: Goal Points (0 to 100); y-axis: Number of Yellow Cards (0 to 100); regression line: y = -0.4374x + 87.167]

Caption # 1: As seen above, the scatter plot depicts a negative correlation of
moderate strength (r = -0.592), with a linear regression line of y= -0.437x + 87.16
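
Once the equation is known, it can be used for interpolation and extrapolation. The following minimal sketch uses the slope and intercept printed on Graph # 1; the function name and the example inputs are assumptions chosen only for illustration.

```python
# Coefficients of the 2003-2004 regression line as printed on Graph # 1.
SLOPE = -0.4374
INTERCEPT = 87.167


def predicted_yellow_cards(points):
    """Predicted number of yellow cards for a team with the given points total."""
    return SLOPE * points + INTERCEPT


# At the mean number of points (51.6) the line should return roughly the
# mean number of yellow cards, because the line passes through the means.
print(round(predicted_yellow_cards(51.6), 1))

# Extrapolation beyond the largest observed total (90 points) is less reliable.
print(round(predicted_yellow_cards(100), 1))
```

The first value agrees with the mean of 64.6 yellow cards found earlier, which is a useful consistency check on the equation.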
Furthermore, using the GDC function, the table of values of x, y, xy, $x^2$, and
$y^2$, and the linear regression line formula, the linear regression line equation
was found for the other seasons, specifically 2004-2005, 2005-2006, 2006-2007, and
2007-2008 of the English Premier League.

Season 2004 – 2005:


Table # 18: The Table of Values depicting the values from the 2004 – 2005 season
of the English Premier League.

Caption #18: Using this chart, values will be used in order to calculate the line of
regression. (On the calculator)
Graph # 2: Scatter Plot of Data from Season 2004 – 2005 with a Linear Regression
Line & its Equation

[Scatter plot: Yellow Cards vs. Goal Points Earned for 2004-2005. x-axis: Goal Points Earned (0 to 100); y-axis: Yellow Cards (0 to 80); regression line: y = 0.0022x + 51.454]
Caption # 2: As seen above, the scatter plot depicts almost no correlation,
with a nearly flat linear regression line of y= 0.002x + 51.45.
Season 2005 – 2006
Table # 19: The Table of Values depicting the values from the 2005 – 2006 season
of the English Premier League.

Caption #19: Using this chart, values will be used in order to calculate the line of
regression. (On the calculator)
Graph # 3: Scatter Plot of Data from Season 2005 – 2006 with a Linear Regression
Line & its Equation

[Scatter plot: Yellow Cards vs. Goal Points Earned for season 2005-2006. x-axis: Goal Points Earned (0 to 100); y-axis: Yellow Cards (0 to 80); regression line: y = -0.0118x + 59.182]

Caption #3: As seen above, the scatter plot depicts a negative and weak correlation,
having a linear regression line of y= -0.011x + 59.18
Season 2006 – 2007
Table # 20: The Table of Values depicting the values from the 2006 – 2007 season
of the English Premier League.

Caption #20: Using this chart, values will be used in order to calculate the line of
regression. (On the calculator)
Graph # 4: Scatter Plot of Data from Season 2006 – 2007 with a Linear Regression
Line & its Equation

[Scatter plot: Yellow Cards vs. Goal Points Earned for 2006-2007. x-axis: Goal Points (0 to 100); y-axis: Yellow Cards (0 to 100); regression line: y = -0.1034x + 66.618]
Caption # 4: As seen above, the scatter plot depicts a negative and weak correlation,
having a linear regression line of y= -0.103x + 66.61.
Season 2007 – 2008:
Table # 21: The Table of Values depicting the values from the 2007 – 2008 season of
the English Premier League.

Caption #21: Using this chart, values will be used in order to calculate the line of
regression. (On the calculator)
Graph # 5: Scatter Plot of Data from Season 2007 – 2008 with a Linear Regression
Line & its Equation

[Scatter plot of the 2007-2008 data (Series1) with its linear trend line. x-axis: 0 to 100; y-axis: 0 to 90; regression line: y = -0.2304x + 72.481]

Caption # 5: As seen above, the scatter plot depicts a negative and weak correlation,
having a linear regression line of y= -0.230x + 72.48
Table # 22: This table depicts the least squares regression lines for the seasons
from 2003 – 2004 to 2007 – 2008.

Seasons        Least Squares Regression

2003 – 2004    y= -0.437x + 87.16
2004 – 2005    y= 0.002x + 51.45
2005 – 2006    y= -0.011x + 59.18
2006 – 2007    y= -0.103x + 66.61
2007 – 2008    y= -0.230x + 72.48

Caption # 22: From the table above it is noted that the least squares regression
equations for the seasons from 2003 to 2008 have negative or nearly zero slopes,
which corresponds to negative and rather weak correlations.

Test of Independence
This test is done to show whether the two sets of data are independent of each other.
The question to be answered is whether one set affects the other. The test measures
the difference between the observed and expected values by using the formula:

Test of Independence Formula:
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

where $f_o$ is an observed frequency and $f_e$ is an expected frequency.

Table # 23: This is the Contingency Table of the Observed Values for annual seasons
beginning from 2003 – 2008 in the English Premier League.

Observed Values Table

                Yellow Cards        TOTAL
Total Points    20        30        50
                32        18        50
TOTAL           52        48        100

Caption # 23: The contingency table above depicts the observed values.
Table # 24: This is the Contingency Table of the Expected Values and the Process for
finding Expected Values for annual seasons beginning from 2003 – 2008 in the
English Premier League.

Expected Values Process & Formula Table

Expected value = $\frac{\text{row total} \times \text{column total}}{\text{grand total}}$

                Yellow Cards                                                              TOTAL
Total Points    (a × b)/100 = (52 × 50)/100 = 26    (c × b)/100 = (48 × 50)/100 = 24    b = 50
                (a × d)/100 = (52 × 50)/100 = 26    (c × d)/100 = (48 × 50)/100 = 24    d = 50
TOTAL           a = 52                              c = 48                              100

Caption # 24: Through the table above, the process for finding the expected values is
clearly depicted.
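
The expected values can be reproduced with a short Python sketch. It uses the observed counts from the contingency table above and assumes the usual rule that each expected value is the row total times the column total divided by the grand total; the function name is an assumption of this sketch.

```python
def expected_frequencies(table):
    """Expected counts under independence for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [row_total * column_total / grand_total for column_total in column_totals]
        for row_total in row_totals
    ]


# Observed values from the contingency table above.
observed = [[20, 30],
            [32, 18]]

for row in expected_frequencies(observed):
    print(row)
```

Because both row totals are 50, the two rows of expected values are identical.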
Table # 25: This is the Contingency Table of the Expected Values for annual seasons
beginning from 2003 – 2008 in the English Premier League.

Expected Values Table

                Yellow Cards        TOTAL
Total Points    26        24        50
                26        24        50
TOTAL           52        48        100

Caption # 25: The contingency table above depicts expected values.


The calculation of the contingency tables leads to the calculation of the $\chi^2$
value manually. For the calculation, the table is created as follows:

Table # 26: The following table is created to manually calculate the $\chi^2$ value.

$f_o$    $f_e$    $(f_o - f_e)$    $(f_o - f_e)^2$    $\frac{(f_o - f_e)^2}{f_e}$
20       26       (20 – 26)        36                 1.385
30       24       (30 – 24)        36                 1.5
32       26       (32 – 26)        36                 1.385
18       24       (18 – 24)        36                 1.5
Total: 5.77
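
The table can be checked with a few lines of Python. This minimal sketch takes the observed and expected values listed in the rows above and sums the contributions of the four cells; the variable names are assumptions of this sketch.

```python
# Observed and expected frequencies, in the row order of the table above.
observed = [20, 30, 32, 18]
expected = [26, 24, 26, 24]

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)

for o, e, c in zip(observed, expected, contributions):
    print(o, e, o - e, (o - e) ** 2, round(c, 3))
print("Total:", round(chi_squared, 2))
```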
Null Hypothesis ($H_0$):
The total points earned and the number of yellow cards obtained are independent.

Alternative Hypothesis ($H_1$):
The total points earned and the number of yellow cards obtained are not independent.

Degrees of Freedom
Formula:
(r – 1)(c – 1), where r is the number of rows and c is the number of columns
= (2 – 1)(2 – 1)
= 1 × 1
= 1
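
With one degree of freedom, the decision can be sketched in Python using only the standard library. This is an illustration under stated assumptions: the test is carried out at the 5% significance level, the critical value 3.841 is the standard table value for one degree of freedom at that level, and the p-value uses the identity that a chi-squared variable with one degree of freedom is the square of a standard normal variable. The function and constant names are assumptions of this sketch.

```python
import math


def chi2_p_value_one_df(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    # For 1 degree of freedom, P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


CHI2_CALCULATED = 5.77        # total from the manual chi-squared table
CRITICAL_5_PERCENT = 3.841    # table value for 1 degree of freedom at the 5% level
ALPHA = 0.05

p_value = chi2_p_value_one_df(CHI2_CALCULATED)
reject_null = CHI2_CALCULATED > CRITICAL_5_PERCENT and p_value < ALPHA

print("p-value:", round(p_value, 4))
print("reject the null hypothesis:", reject_null)
```

Both criteria lead to the same decision here: the calculated value exceeds the critical value, and the p-value is below 0.05.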

Through this investigation it is clear that the total number of points earned and the
number of yellow cards are related, which is an important factor for managers to
weigh when deciding which players to buy and when judging their quality of play.
The number of yellow cards matters all the more because yellow cards can lead to
red cards.
Validity:
At the beginning of the project, once the scatter plots and lines of linear
regression were drawn, they showed evidence of almost no relationship between
the total number of points earned and the yellow cards. Also, Pearson's
Correlation Coefficient was quite weak in most cases. However, working through the
test of independence, it is noted that there is a relationship between the two
factors, which is of great importance to all English Premier League managers.
Furthermore, the probability value to

Conclusion:
To conclude, this investigation has aided me in expanding my
understanding of various mathematical processes, such as the $\chi^2$ test, least
squares regression, and other mathematical processes.
Firstly, the conclusion drawn from the least squares regression equations was that
the two factors, yellow cards and total points earned, had almost no correlation, as
the equations consistently showed negative or nearly zero slopes:
y= -0.437x + 87.16, y= 0.002x + 51.45, y= -0.011x + 59.18, y= -0.103x + 66.61,
y= -0.230x + 72.48. Therefore, after the least squares regression was completed,
the conclusion was that the data had basically no correlation. However, after the
$\chi^2$ table was completed, it is noted that the total number of yellow cards and
the total points earned during the seasons are in reality related. This is further
supported as the null hypothesis is rejected and the alternative hypothesis is
accepted, since the $\chi^2$ value of 5.77 is greater than the critical value. In
addition, the probability value is 0.0163, which is less than 0.05. That is further
evidence that the null hypothesis is rejected.
More importantly, this investigation demonstrated that there is an evident
relationship between the yellow cards and the total points earned. These results are
highly valuable for managers, as they can make sure to include various techniques
that prevent players from obtaining more yellow cards.
