Sports have become an integral part of our constantly evolving society. All across the world, almost every
country can claim a national sport that it is proud of. However, sports have grown to
become more than that. Probably owing to continuing globalization, some sports have been
embraced worldwide.
A perfect example of such a sport is tennis. First played in Birmingham,
England, sometime between 1859 and 1865, tennis has amassed a staggering 1 billion fans
according to the latest studies. This figure is even more impressive when set against the world’s population (7.7 billion
as of November 2018).
What is more, the Australian Open Men’s Final in 2017 drew 15.2
million simultaneous viewers, becoming Eurosport’s most watched tennis match of all time.
Furthermore, the ESPN ratings reports revealed that this final’s ratings grew by
over 80% compared with the previous year’s final and that it will most likely remain ESPN’s most watched
sports event in that particular time slot (Sawe, 2018).
All of these facts raise the question of what drives these impressive figures.
One clear answer is the level of play that the top-ranked players in the world
have consistently showcased, along with their longevity and their ability to become icons worthy of people’s
admiration.
Nowadays, people are not content with just attending a tennis tournament and witnessing a good
match. No, they want something extraordinary, something special, brilliant even, from the players,
which will make them feel as though the money spent on tickets for major tournaments was not wasted.
What is more, some of them pay incredibly high prices in order to witness history being made. After all,
which tennis fan wouldn’t have liked to be in the stands when Arthur Ashe became the first
African-American man to win Wimbledon in 1975? Which tennis fan wouldn’t have liked to be there when
Roger Federer beat the record for most Grand Slam tournaments in 2009 after a heavily contested
Wimbledon Final against Andy Roddick? Which tennis fan wouldn’t have liked to witness Andy Murray
become the first British man to triumph at Wimbledon after almost 80 years? (Tennis View Mag, 2015)
Well, when it comes to these kinds of events, the answer is almost always no one. It is not
necessarily the big tournaments that draw in the crowds; in most cases it is the players
themselves. Some of these players have the magnetic ability to sell out arenas of thousands of people
only by announcing their attendance at a tournament.
One of these players is without a doubt Roger Federer. The Swiss is widely regarded as the greatest
tennis player of all time, having achieved such incredible feats, even now at the age of 37, that some are
left to wonder just how he does it. Over the years in which Roger Federer has played
professional tennis, he has won more Grand Slam tournaments than anybody else, with a total
of 20, and he has won more than 1080 matches against only 234 losses.
Furthermore, Roger Federer holds 27 Guinness World Records and 99 career singles titles, and he holds
dozens of other tennis records. Looking at these facts, there is no
doubt that Roger Federer is a figure worthy of the admiration he receives at each tournament he plays,
given his remarkable contributions to the sport. Not only is he a safe bet for
a sold-out night, but he is in almost all cases a guarantee of memorable
moments that will keep the fans coming back for more.
Moreover, Roger Federer and the other top-ranked players in the world have brought significant
improvements to the game itself. Through their fame and the dedication of their fans, they have been
one of the contributing factors in the increase of prize money awarded at all competitions. As an
example, we can take the US Open, where the total prize money was $53 million in 2018, and the
Australian Open, where the organizers committed $42.85 million for the 2019 edition
(Total Sportek, 2019). These increases have also been beneficial to other, lower-ranked players,
helping them with the expenses incurred by playing tennis and enabling them to participate in more
tournaments worldwide and raise their rankings.
Well, in this case, if Roger Federer and a handful of others always bring out a great crowd and create
legions of fans, the question remains whether he has anything left to improve or whether he has given his all
and is now on a downward slope towards retirement. This project will provide an analysis of Roger
Federer’s career: his wins and losses over the years at each type of tournament, his
earnings across his career, and his game. To provide a thorough view of Roger
Federer’s career and its progress in these areas, we have gathered secondary data from multiple sources
specialized in tennis.
2. Wins and losses over the course of his career at each ATP tournament type
In the following chapter we will try to statistically analyze Roger Federer’s wins and losses at each type
of tournament.
We start by analyzing how Roger Federer performed at Grand Slams, the 4 most important tournaments
in tennis. As can be seen in the table, in his first year as a professional tennis player he did not enter
the main draw at any Grand Slam, while in the next he recorded only 2 losses. However, starting from
2000, he slowly started to grow and make his way to the top.
Year Wins(xi) Losses Total (Grand Slam matches)
1998 0 0 0
1999 0 2 2
2000 7 4 11
2001 13 4 17
2002 6 4 10
2003 13 3 16
2004 23 1 24
2005 24 2 26
2006 27 1 28
2007 27 1 28
2008 24 3 27
2009 26 2 28
2010 20 3 23
2011 20 4 24
2012 21 3 24
2013 13 4 17
2014 19 4 23
2015 18 4 22
2016 10 2 12
2017 18 1 19
2018 14 2 16
Table 1-Grand Slam Wins/Losses
a) The arithmetic mean (or mean or average) is the most commonly used and
readily understood central tendency measure (Serban et al., 2003). The
formula of the arithmetic mean for simple data series is:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Equation 1-Arithmetic mean formula
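Where $x_i$ is the number of Grand Slam matches won in year $i$ and $n = 21$ years:
$$\bar{x} = \frac{\sum_{i=1}^{21} x_i}{21} = \frac{343}{21} = 16.33 \text{ Grand Slam matches won/year}$$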
Consequently, we can say that 16.33 is the average number of wins/ year in a Grand Slam that Roger
Federer has achieved over the course of his career.
Using the same formula for the losses that Federer sustained in a Grand Slam tournament we obtain the
following arithmetic mean:
$$\bar{x} = \frac{54}{21} = 2.5714 \text{ Grand Slam matches lost/year}$$
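The two means can be reproduced directly from Table 1. The following is a minimal sketch, assuming the yearly wins and losses are typed in by hand from the table; the list names `wins` and `losses` are illustrative and not part of the original analysis.

```python
from statistics import fmean

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]

# Arithmetic mean = sum of the observations / number of observations.
mean_wins = fmean(wins)      # 343 / 21
mean_losses = fmean(losses)  # 54 / 21

print(f"total wins = {sum(wins)}, mean wins/year = {mean_wins:.2f}")
print(f"total losses = {sum(losses)}, mean losses/year = {mean_losses:.4f}")
```

Running it should give totals of 343 and 54, which are the numerators used in the two formulas above.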
$$Me_{\text{place, wins in a Grand Slam}} = \frac{21 + 1}{2} = 11$$
Therefore, the median is found in the 11th place of the observations arranged in ascending order and it is
18, which means that in half of the cases Roger Federer won more than 18 matches in a
Grand Slam/year and in the other half he won fewer than 18 matches in a Grand Slam/year.
Regarding the losses, the median place remains the 11th because the number of observations is still 21, and the
median is 3, which means that in half of the cases Roger Federer lost more than 3 matches in a
Grand Slam/year and in the other half he lost fewer than 3 matches in a Grand Slam/year.
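The median-place rule can be sketched in a few lines. This is an illustration under the assumption that the number of observations is odd, as it is here (n = 21); the helper name `median_place` is hypothetical.

```python
from statistics import median

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def median_place(n: int) -> int:
    """1-based position of the median in an ordered series with odd n."""
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of observations")
    return (n + 1) // 2


place = median_place(len(wins))            # (21 + 1) / 2
median_wins = sorted(wins)[place - 1]      # lists are 0-based, places are 1-based
median_losses = sorted(losses)[place - 1]

print(f"median place = {place}")
print(f"median wins = {median_wins} (library check: {median(wins)})")
print(f"median losses = {median_losses} (library check: {median(losses)})")
```

Sorting first is the step that is easy to forget: the 11th value of the table in chronological order is not the median.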
d) The variance is a computed measure whose value is affected by the value of every observation in a
series, and it therefore reflects the dispersion of all the observations.
It is calculated as the mean of the squared deviations of the individual values from the mean
(with n - 1 in the denominator for small samples), and the formula is as follows:
$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Equation 2-Variance Formula
In order to find the variance for the number of matches won in Grand Slams over the course of the years
we did as follows:
Step 1: we computed the squared deviation of each observation from the mean and summed the squares.
Step 2: we divided the sum by the number of observations minus 1, because n<31.
Using the same steps to calculate the variance for the losses in Grand Slams we obtained the following:
$$\sigma^2 = \frac{33.14371536}{20} = 1.65719$$
e) Standard deviation is one of the most frequently encountered measures of dispersion and it is
calculated as the square root of the variance:
$$\sigma = \sqrt{\sigma^2}$$
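The two steps and the square root can be sketched as follows. This is a minimal illustration that assumes the data of Table 1 and the n - 1 denominator from the variance formula; the function name `sample_variance` is illustrative.

```python
from math import sqrt
from statistics import variance

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]  # Step 1
    return sum(squared_deviations) / (n - 1)                # Step 2


var_wins = sample_variance(wins)
var_losses = sample_variance(losses)
sd_wins = sqrt(var_wins)
sd_losses = sqrt(var_losses)

# statistics.variance also divides by n - 1, so it serves as a cross-check.
print(f"wins:   variance = {var_wins:.4f} (library: {variance(wins):.4f}), sd = {sd_wins:.4f}")
print(f"losses: variance = {var_losses:.4f} (library: {variance(losses):.4f}), sd = {sd_losses:.4f}")
```

Small differences in the last decimals compared with the hand calculation are expected, because the hand calculation works with a rounded mean.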
For the matches won in a Grand Slam, the coefficient of variation is: $\nu = \frac{8.2178738126}{16.33} \times 100 = 50.32\%$.
According to the rules, if the coefficient of variation is below 35%, the data series is
homogeneous and the average is representative. However, since in our case the coefficient of variation is
50.32%, which is above 35%, we can say that the data is not homogeneous and that the average is not
representative. Therefore, regarding the matches won in a Grand Slam, Roger Federer did not win
a similar number of matches in each of the 21 years studied.
For the matches lost in a Grand Slam, the coefficient of variation is: $\nu = \frac{1.28731}{2.5714} \times 100 = 0.5006 \times 100 = 50.06\%$.
According to the rules, if the coefficient of variation is below 35%, the data
series is homogeneous and the average is representative. Since in our case the coefficient of variation is
50.06%, which is above 35%, we can say that the data is not homogeneous and that the average is not
representative. Therefore, as with the matches won in a Grand Slam, Roger Federer did not lose
a similar number of Grand Slam matches in each of the 21 years
studied.
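The coefficient of variation and the 35% rule can be expressed as two small functions. This is a sketch under the assumptions already stated in the text (sample standard deviation, 35% limit); the function names and the `limit` parameter are illustrative.

```python
from statistics import fmean, stdev

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    return stdev(values) / fmean(values) * 100


def is_homogeneous(cv_percent, limit=35.0):
    """Rule used in the text: below the limit the series counts as homogeneous."""
    return cv_percent < limit


for label, series in (("wins", wins), ("losses", losses)):
    cv = coefficient_of_variation(series)
    verdict = "homogeneous" if is_homogeneous(cv) else "not homogeneous"
    print(f"Grand Slam {label}: CV = {cv:.2f}% -> {verdict}")
```

Because the sketch uses the unrounded mean, the value for the wins may differ from 50.32% in the second decimal; the conclusion (above 35%) is unaffected.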
g) The coefficient of skewness measures the degree of skewness of a distribution or curve. It is
denoted by Sk and defined by the formula below; the result usually varies between -3 (negative
skewness) and +3 (positive skewness), and the sign indicates the direction of skewness. We used the formula with
the median instead of the mode because the modal value was repeated only three times while
other values were repeated twice, so the mode was not very representative of the data set.
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$$
Equation 5-Coefficient of skewness formula
By applying this formula to the wins recorded in a Grand Slam we obtained the following:
$$Sk = \frac{3(16.33 - 18)}{8.2178738126} = -0.61$$
Therefore, we can say that because the result is negative, we have a low negative skewness regarding
the number of wins in a Grand Slam over the course of his career.
Using the same formula for the losses that Federer recorded in a Grand Slam throughout his career we
obtained the result:
$$Sk = \frac{3(2.5714 - 3)}{1.28731} = -0.998834$$
Therefore, we can say that because the result is negative, we have a low to medium negative skewness
regarding the number of losses in a Grand Slam over the course of his career.
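The median-based skewness coefficient can be computed from the raw series in one expression. A minimal sketch, assuming the data of Table 1 and the sample standard deviation; the function name `median_skewness` is illustrative.

```python
from statistics import fmean, median, stdev

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def median_skewness(values):
    """Sk = 3 * (mean - median) / standard deviation."""
    return 3 * (fmean(values) - median(values)) / stdev(values)


sk_wins = median_skewness(wins)
sk_losses = median_skewness(losses)

# A negative sign means the mean lies below the median (tail towards low values).
print(f"Sk wins   = {sk_wins:.2f}")
print(f"Sk losses = {sk_losses:.4f}")
```

In both series the mean is pulled below the median by the early-career years with very few matches, which is why both coefficients come out negative.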
h) The range of a set of observations is the difference between the largest and the smallest
observations: $R = x_{max} - x_{min}$.
In the case of the Grand Slam wins we can say that the absolute range is: $R_{GrandSlamWins} = 27 - 0 = 27$.
However, the use of the absolute range is limited because it only considers the extreme values and
ignores all other observations. As a result, we also calculate the relative range:
$$R_{GrandSlamWins}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{27 - 0}{16.33} \times 100 = 165.33\%$$
Because our relative range is above 100-120%, we can say that the data regarding the Grand Slam wins
is not homogeneous and that the average is not representative.
When we calculate the absolute range for Grand Slam losses we obtain: $R_{GrandSlamLosses} = 4 - 0 = 4$. Just like
above, we also calculate the relative range because of the limitations of the absolute range, and the result is:
$$R_{GrandSlamLosses}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{4 - 0}{2.5714} \times 100 = 1.5557 \times 100 = 155.57\%$$
Because our relative range is above 100-120%, we can say that the data regarding the Grand Slam losses
is not homogeneous and that the average is not representative.
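Both ranges follow directly from the extreme values and the mean. The sketch below assumes the data of Table 1; the function names are illustrative.

```python
from statistics import fmean

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1.
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def absolute_range(values):
    """R = x_max - x_min; only the two extreme observations matter."""
    return max(values) - min(values)


def relative_range_percent(values):
    """Absolute range expressed as a percentage of the mean."""
    return absolute_range(values) / fmean(values) * 100


for label, series in (("wins", wins), ("losses", losses)):
    print(f"Grand Slam {label}: R = {absolute_range(series)}, "
          f"R(%) = {relative_range_percent(series):.2f}%")
```

With the unrounded mean the percentages can differ from the hand-calculated ones in the second decimal; both remain far above the 100-120% limit.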
Next we will analyze how Roger Federer competed in the tournaments belonging to the Masters 1000
category. In the table below we can see how Roger Federer’s performance differed over the course of
the 21 years in which he played at Masters 1000 tournaments. His progress is clearly visible, from his
debut year, when he did not succeed in reaching the main draw in any tournament of this level, to the
following years, when he was a very strong competitor except in cases in which he suffered from injuries.
Because our relative range is above 100-120%, we can say that the data regarding the Masters 1000
wins is not homogeneous and that the average is not representative.
For the Masters 1000 losses we applied the same formula for the absolute range:
$R_{Masters1000Losses} = x_{max} - x_{min} = 9 - 0 = 9$. Because the absolute range only considers the
extreme values and ignores all other observations, we also calculated the
relative range by applying the same formula as before:
$$R_{Masters1000Losses}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{9}{5} \times 100 = 1.8 \times 100 = 180\%$$
Because our relative range is above 100-120%, we can say that the data regarding the Masters 1000
losses is not homogeneous and that the average is not representative.
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where:
x= the number of wins in an ATP 500/ year
n= the number of years
Therefore, for the number of wins in ATP 500 tournaments we will have
$\sum_{i=1}^{21} x_i / 21$, where $n = 21$ years:
$$\bar{x} = \frac{165}{21} = 7.8571 \text{ ATP 500 wins/year}$$
From this number we can observe that the average number of wins that Roger Federer
managed to obtain in a year in an ATP 500 level tournament is 7.8571.
For the number of losses in ATP 500 level tournaments we apply the same formula
for the mean as with the wins,
$\sum_{i=1}^{21} x_i / 21$, where $n = 21$ years:
$$\bar{x} = \frac{25}{21} = 1.1904761$$
From this number we can observe that the average number of losses that Roger Federer
sustained in a year in ATP 500 level tournaments is 1.1904761.
b) The mode for the ATP 500 wins is the value that occurs most often: 5 wins, which
occurs 4 times. For the
losses, the mode is 1 loss, which occurs 10 times.
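Finding the mode amounts to counting how often each value occurs. The yearly ATP 500 counts are not reproduced in this section, so the sketch below uses the Grand Slam figures from Table 1 instead; the function name `mode_with_count` is illustrative.

```python
from collections import Counter
from statistics import multimode

# Grand Slam wins and losses per year, 1998-2018, copied from Table 1
# (used here only because the ATP 500 series is not listed in the text).
wins = [0, 0, 7, 13, 6, 13, 23, 24, 27, 27, 24,
        26, 20, 20, 21, 13, 19, 18, 10, 18, 14]
losses = [0, 2, 4, 4, 4, 3, 1, 2, 1, 1, 3,
          2, 3, 4, 3, 4, 4, 4, 2, 1, 2]


def mode_with_count(values):
    """Most frequent value and how many times it occurs.

    If several values tie, Counter.most_common returns the one seen first;
    statistics.multimode lists all tied values.
    """
    value, count = Counter(values).most_common(1)[0]
    return value, count


print("wins:  ", mode_with_count(wins), "all modes:", multimode(wins))
print("losses:", mode_with_count(losses), "all modes:", multimode(losses))
```

For the Grand Slam wins the most frequent value occurs only three times, which matches the reason given earlier for preferring the median in the skewness formula.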
c) The place of the median is computed with the formula $\frac{n + 1}{2}$ because the number of
observations is odd. Therefore, the median place is the 11th place of the observations arranged in
ascending order, and the median is 8 wins, which means that in half of the cases
observed Roger Federer won fewer than 8 matches/year in ATP 500 tournaments and in the
other half he won more than 8 matches/year in ATP 500 tournaments. Regarding the ATP
500 losses, the median place remains the 11th place of the observations arranged in ascending order,
and the median is 1 loss, which means that in half of the cases observed Roger
Federer lost 1 match or fewer per year in ATP 500 tournaments and in the other half he lost
1 match or more per year in ATP 500 tournaments.
d) The variance will be computed with the formula:
For the number of wins in ATP 500 tournaments: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{390.571}{20} = 19.5286$
For the number of losses in ATP 500 tournaments: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{29.2381}{20} = 1.4619$
e) The standard deviation has the following formula:
$\sigma = \sqrt{\sigma^2}$
For the number of matches won in an ATP 500 level tournament, the standard deviation is:
$\sigma = \sqrt{19.5286} = 4.4191$
For the number of matches lost in an ATP 500 level tournament, the standard deviation is:
$\sigma = \sqrt{1.4619} = 1.209$
f) The coefficient of variation was calculated by using the following formula:
For the wins in ATP 500 tournaments: $\nu = \frac{\sigma}{\bar{x}} \times 100 = \frac{4.4191}{7.8571} \times 100 = 56.2433\%$
The coefficient of variation is above 35%, the limit below which the data is
considered homogeneous and the average representative, so
we can say that the data is not homogeneous and the average is not
representative. Consequently, the number of matches Roger Federer won per year in ATP 500
tournaments over the course of his 21-year career was not similar from year to year.
For the losses in ATP 500 tournaments: $\nu = \frac{\sigma}{\bar{x}} \times 100 = \frac{1.209}{1.1904761} \times 100 = 101.556\%$
The coefficient of variation is above 35%, the limit below which the data is
considered homogeneous and the average representative, so
we can say that the data is not homogeneous and the average is not
representative. Consequently, the number of matches Roger Federer lost per year in ATP 500
tournaments over the course of his 21-year career was not similar from year to year.
g) The coefficient of skewness was calculated with the following formula:
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}} = \frac{3(7.8571 - 8)}{4.4191} = -0.097$$
Therefore, we can say that because the result is negative, we have a very low negative
skewness regarding the number of wins in matches played in ATP 500 tournaments over
the course of his career.
For the losses suffered in ATP 500 we applied the same formula, using the mean, median and
standard deviation of the losses, and obtained:
$$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}} = \frac{3(1.1904761 - 1)}{1.209} = 0.4726$$
Therefore, we can say that because the result is positive, we have a low positive skewness
regarding the number of losses in matches played in ATP 500 tournaments over the course
of his career.
h) The range: we calculated the absolute range with the formula $R_{ATP500Wins} = x_{max} - x_{min} = 15 - 0 = 15$.
However, the use of the absolute range is limited because it only considers the extreme values and
ignores all other observations, so we also
calculated the relative range, using the following formula:
$$R_{ATP500Wins}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{15}{7.8571} \times 100 = 190.91\%$$
Because our relative range is above 100-120%, we can say that the data regarding the ATP
500 wins is not homogeneous and that the average is not representative.
For the ATP 500 losses we calculated the absolute range with the formula
$R_{ATP500Losses} = x_{max} - x_{min} = 4 - 0 = 4$. Because the absolute range only considers the extreme
values and ignores all other observations,
we also calculated the relative range, using the following formula:
$$R_{ATP500Losses}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{4}{1.1904761} \times 100 = 336\%$$
Because our relative range is above 100-120%, we can say that the data regarding the ATP
500 losses is not homogeneous and that the average is not representative.
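Since the yearly ATP 500 figures are not listed, the chain from d) to h) can only be re-derived from the aggregates quoted in the text. The sketch below does that; the function name `from_aggregates` and its parameters are illustrative, and the inputs are the sums of squared deviations, means, medians and ranges reported above.

```python
from math import sqrt


def from_aggregates(sum_sq_dev, n, mean, median, value_range):
    """Rebuild the dispersion measures from already aggregated figures."""
    variance = sum_sq_dev / (n - 1)   # d) variance with n - 1
    sd = sqrt(variance)               # e) standard deviation
    return {
        "variance": variance,
        "sd": sd,
        "cv_percent": sd / mean * 100,                      # f)
        "skewness": 3 * (mean - median) / sd,               # g)
        "relative_range_percent": value_range / mean * 100, # h)
    }


# Aggregates as reported in the text for the ATP 500 category (n = 21 years).
atp500_wins = from_aggregates(390.571, 21, 165 / 21, 8, 15)
atp500_losses = from_aggregates(29.2381, 21, 25 / 21, 1, 4)

for label, stats in (("wins", atp500_wins), ("losses", atp500_losses)):
    print(f"ATP 500 {label}:")
    for name, value in stats.items():
        print(f"  {name} = {value:.4f}")
```

The sign of the skewness depends only on whether the mean is above or below the median, so it is worth checking that each series is paired with its own mean, median and standard deviation.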
b) The mode for the ATP 250 wins is the value that occurs most often: 15 wins, which
occurs 4 times. For the
losses in the ATP 250, the mode is 0, which occurs 7 times.
c) The place of the median is computed with the formula $\frac{n + 1}{2}$ because the number of
observations is odd. Therefore, the median place is the 11th place of the observations arranged in
ascending order, and the median is 7 wins, which means that in half of the cases
observed Roger Federer won fewer than 7 matches/year in ATP 250 tournaments and in the
other half he won more than 7 matches/year in ATP 250 tournaments.
For the ATP 250 losses the median place is again the 11th place of the observations arranged in ascending order, and
the median is 1 loss, which means that in half of the cases observed Roger
Federer lost 1 match or fewer per year in ATP 250 tournaments and in the other half he lost
1 match or more per year in ATP 250 tournaments.
d) The variance will be computed with the formula:
For the number of wins in ATP 250: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{832}{20} = 41.6$
For the number of losses in ATP 250: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{194.952}{20} = 9.7476$
e) The standard deviation has the following formula:
$\sigma = \sqrt{\sigma^2}$
For the number of matches won in an ATP 250 level tournament, the standard deviation is:
$\sigma = \sqrt{41.6} = 6.4498$
For the number of matches lost in an ATP 250 level tournament, the standard deviation is:
$\sigma = \sqrt{9.7476} = 3.1221$
f) The coefficient of variation was calculated by using the following formula:
For the number of wins in ATP 250: $\nu = \frac{\sigma}{\bar{x}} \times 100 = \frac{6.4498}{9.8095} \times 100 = 65.75\%$
The coefficient of variation is above 35%, the limit below which the data is considered
homogeneous and the average representative, so we can say
that the data is not homogeneous and the average is not representative. Consequently, the
number of matches Roger Federer won per year in ATP 250 tournaments over the course of his
21-year career was not similar from year to year.
For the number of losses in ATP 250: $\nu = \frac{\sigma}{\bar{x}} \times 100 = \frac{3.1221}{2.3809} \times 100 = 131.13\%$
The coefficient of variation is above 35%, the limit below which the data is considered
homogeneous and the average representative, so we can say
that the data is not homogeneous and the average is not representative. Consequently, the
number of matches Roger Federer lost per year in ATP 250 tournaments over the course of his
21-year career was not similar from year to year.
g) The coefficient of skewness was calculated with the following formula:
For the wins in ATP 250: $Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}} = \frac{3(9.8095 - 7)}{6.4498} = 1.30678$
Therefore, we can say that because the result is positive, we have a medium positive
skewness regarding the number of wins in matches played in ATP250 tournaments over the
course of his career.
For the losses in ATP 250: $Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}} = \frac{3(2.3809 - 1)}{3.1221} = 1.3269$
Therefore, we can say that because the result is positive, we have a medium positive skewness
regarding the number of losses in matches played in ATP250 tournaments over the course
of his career.
h) The range: we calculated the absolute range with the formula $R_{ATP250Wins} = x_{max} - x_{min} = 22 - 0 = 22$.
However, the use of the absolute range is limited because it only considers the extreme values and
ignores all other observations, so we also calculated the
relative range, using the following formula:
$$R_{ATP250Wins}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{22}{9.8095} \times 100 = 224.27\%$$
Because our relative range is above 100-120%, we can say that the data regarding the ATP 250
wins is not homogeneous and that the average is not representative.
For the number of losses in ATP 250 we calculated the absolute range in the following manner:
$R_{ATP250Losses} = x_{max} - x_{min} = 11 - 0 = 11$. Because the absolute range only considers the
extreme values and ignores all other observations,
we also calculated the relative range, using the following formula:
$$R_{ATP250Losses}(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \times 100 = \frac{11}{2.3809} \times 100 = 462.01\%$$
2.5 Pie chart representing total wins obtained throughout Roger Federer’s career at
all types of tournaments mentioned before.
[Pie chart "Wins": Grand Slam 32%, Masters 1000 34%, ATP 500 15%, ATP 250 19%]
In the pie chart above we have presented the percentage of wins in each category of tournaments out of
the 1082 matches that Roger Federer has won throughout his career at these types of events.
We can see that the highest percentage belongs to the Masters 1000 category, followed
closely by Grand Slams. These tournaments represent the most important and
prestigious categories of tennis tournaments, which explains why, as his career progressed, Roger
Federer chose to focus on them. What is more, this could also explain why he continues to hold the
Guinness World Record for the most weeks spent at World No.1 and the highest count of Grand Slams in
the Open Era.
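The shares in the pie chart are simple proportions of the 1082 career wins. The sketch below recomputes the two shares whose totals are stated explicitly in the text (343 Grand Slam wins and 165 ATP 500 wins); the names `TOTAL_WINS`, `known_wins` and `share_percent` are illustrative.

```python
TOTAL_WINS = 1082  # career wins across the four categories, as stated in the text

# Category totals quoted in the text (numerators of the arithmetic means).
known_wins = {"Grand Slam": 343, "ATP 500": 165}


def share_percent(part, total):
    """Share of a category in the total, in percent."""
    return part / total * 100


shares = {name: share_percent(count, TOTAL_WINS) for name, count in known_wins.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% of all wins")
```

Rounded to whole percentages, these two shares can be compared with the corresponding slices of the chart.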
3. A statistical analysis of the prize money earned by Roger Federer
throughout his career
Year Earnings (USD)
1998 27 955
1999 225 139
2000 623 782
2001 865 425
2002 1 995 027
2003 4 000 680
2004 6 357 547
2005 6 137 018
2006 8 343 885
2007 10 130 620
2008 5 886 879
2009 8 768 110
2010 7 698 289
2011 6 369 576
2012 8 584 842
2013 3 203 637
2014 9 343 988
2015 8 682 892
2016 1 527 269
2017 13 054 856
2018 8 629 233
Total 120 456 649
Table 5- Prize Money earned
In the table above we have gathered data regarding the prize money won by Roger Federer in
each year of his career, from 1998 until 2018. These figures represent only the money earned
through tennis, without taking into account the yearly endorsements which are also an
impressively growing figure each year.
In order to create a better image of his career earnings, we have calculated the arithmetic mean
for the data showcased above. The formula for the mean that we used is:
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where:
x= the earnings/year
n= the number of years
Therefore, for the earnings in each year from tournaments we will have
$\sum_{i=1}^{21} x_i / 21$, where $n = 21$ years:
$$\bar{x} = \frac{120456649}{21} = 5736030.9 \text{ USD/year}$$
From this number we can observe that the average amount that Roger Federer earned
in a year from tournaments is 5 736 030.9 USD.
[Figure: prize money earned per year (USD, vertical axis from 0 to 14 000 000) plotted against the year (horizontal axis from 1995 to 2020), with the linear trend line y = 403236x - 8E+08, R² = 0.438]
The figure above shows that his earnings followed an increasing trend over the course of the 21
years analyzed. The value of R² = 0.438 estimates how close the points are to the trend line: the closer
R² is to 1, the better the fit. Because performance in sports can be
unpredictable at times, and Roger Federer is no exception, seeing as he was plagued by injuries as well,
the points lying far from the trend line show that forecasting his future
performance can prove difficult. What is more, the further the points are from the trend line, the
more clearly you can see in which years he was at his physical peak and in which years he was still
learning or wasn’t physically well.
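The trend line and R² can be recomputed from Table 5 with ordinary least squares. The sketch below is one way to do it in plain Python, assuming a straight-line fit of earnings against the calendar year; the function name `linear_trend` is illustrative and the chart's own fitting tool is not known.

```python
# Prize money per year in USD, 1998-2018, copied from Table 5.
years = list(range(1998, 2019))
earnings = [27955, 225139, 623782, 865425, 1995027, 4000680, 6357547,
            6137018, 8343885, 10130620, 5886879, 8768110, 7698289,
            6369576, 8584842, 3203637, 9343988, 8682892, 1527269,
            13054856, 8629233]


def linear_trend(xs, ys):
    """Least-squares line y = slope * x + intercept and its R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # for a simple line, R^2 = r^2
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_trend(years, earnings)
print(f"y = {slope:.0f}x + ({intercept:.3g}),  R^2 = {r_squared:.3f}")
```

The slope can be read as the average yearly increase in prize money in USD, while an R² well below 1 reflects years such as 2013 and 2016 that sit far from the line.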
4. Evolution of Roger Federer’s abilities until the present moment
In this chapter we will provide an analysis of how Roger Federer’s abilities have progressed or regressed
over the course of the years until the present moment. We will see in which areas he has succeeded in
improving and also try to glean what could make his game even better.
[Figure: two-level pie chart "Winners in matches". Categories: forehand winners, backhand winners, net winners, aces. Extracted percentage labels: 26%, 27%, 34%, 35%, 25%, 13%, 23%, 17%]
The pie chart above presents how the distribution of his winners in matches has evolved. We have split the
winners into 4 different shot types. The first level of the pie chart, the one towards the outside,
represents the past, while the level towards the inside is the present. We can see that over
the course of the years Roger Federer has increased his forehand winners by 1% and his net winners by
1.8%, while his backhand winners have decreased by 4% and his aces by 1%. Looking at his game, these
facts can be explained by his going for net winners more often. Also, in the most recent years
he tends to attack more on the forehand side, which is why his backhand winners have decreased the
most.
[Figure: two-level pie chart "Unforced errors in matches". Categories: forehand unforced errors, backhand unforced errors, net unforced errors, double faults. Extracted percentage labels: 10%, 4%, 6%, 5%, 48%, 46%, 41%, 40%]
In the pie chart above we have put together data regarding Roger Federer’s unforced errors. Like above,
the level towards the outside is the past and the one towards the inside is the present. The
forehand unforced errors grew by almost 2%, the backhand unforced errors grew by
approximately 1% and the net unforced errors grew by a little more than 1%. The most significant
change was the decrease in double faults, which went down by almost 4%. By correlating both charts,
we can see that although he has increased his forehand winners, the unforced errors also grew due to
his more aggressive style, the same applying to the net winners. However, a problem occurs with his
backhand, where his unforced errors grew while his winners dipped, signifying that he hits his
target less often when using the one-handed backhand. Regarding aces, their share is a bit lower, by
approximately 1%, but the double faults decreased by a much more significant percentage: almost 4%. This
can be explained by his serve, which has improved considerably over the course of the last years and
which we will analyze in the next subchapter.
One of Roger Federer’s main weapons in the matches played against his opponents is his serve. It is
extremely consistent, very well placed and the variation that he uses when he serves always keeps his
opponents guessing. In the table below we have presented some of the career stats regarding his serve
and the percentage of points that he has won over the course of the years.
The result of this tells us that Roger Federer wins 71.12% of the total number of points that he plays.
The second value we wanted to calculate was the relative speed of his serve. In order to calculate this,
we used the following formula:
$$\text{Relative speed} = \frac{\text{serve speed}}{\text{average 1st serve speed}} \times 100$$
The third and final value that we computed concerning Roger Federer’s serve is the
relationship that exists between his serve speed and the percentage of points that he won on serve. The
formula that we used to calculate this relationship is the following:
$$\text{Relationship} = \frac{\%\text{ of points won}}{\text{speed of the serve}} \times 100$$
Applying this for both the first and the second serve we obtained the following results:
For the 1st serve: $\frac{\%\text{ of points won on 1st serve}}{\text{speed of the 1st serve}} \times 100 = \frac{78}{125} \times 100 = 0.624 \times 100 = 62.4$.
The comment for these values is that the result is around 62 for both the 1st serve and the 2nd serve,
which suggests a close relationship between the speed of Roger Federer’s serve and the
percentage of points that he wins on serve. In other words, the percentage of points Federer wins
on serve is approximately 0.62 times the speed of the serve (m/h).
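The ratio for the 1st serve can be reproduced with a one-line function. This is a minimal sketch using only the two figures quoted above (78% of points won, serve speed of 125); the function name `points_per_speed` is illustrative.

```python
def points_per_speed(points_won_percent, serve_speed):
    """Percentage of points won divided by serve speed, multiplied by 100."""
    return points_won_percent / serve_speed * 100


# 1st serve figures quoted in the text: 78% of points won at a speed of 125.
first_serve_ratio = points_per_speed(78, 125)
print(f"1st serve ratio: {first_serve_ratio:.1f}")

# Reading the ratio the other way round, as in the comment above:
# points won (%) is roughly (ratio / 100) times the serve speed.
implied_points_won = first_serve_ratio / 100 * 125
print(f"implied % of points won at speed 125: {implied_points_won:.0f}")
```

The second print simply inverts the formula, so it returns the 78% that went in; it says nothing about how the percentage would change at other speeds.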
5. Conclusion
After a statistical analysis of all of the data gathered regarding Roger Federer’s wins/losses, earnings,
winners, unforced errors and serve, we can clearly see the improvements that he made as he
progressed in his career. He went from winning only 2 matches in an ATP 250 level tournament in his
first year of play to reaching his greatest success at the most important categories of tournaments,
Grand Slams and Masters 1000, towards which his focus clearly shifted in the last years of
his career.
In spite of the fact that 2018 was on a slightly lower level than 2017, it is clear that Roger Federer
still has the will, passion and, most importantly, the ability to be one of the main contenders at the big
tournaments. What is more, the statistics we have analyzed in the project reveal what is perhaps his weaker
spot, the backhand; if he improved it, he would be well placed to extend his record
tally of Grand Slams.
All in all, Roger Federer has little left to work on regarding his style of play. Taking his age into
consideration, the areas that he should focus on are his physical fitness, to ensure
that he can compete against the younger players in matches that can last 4-5 hours or even longer, and
remaining as healthy as possible, to avoid a repeat of 2016, when he needed
knee surgery and his wins and losses dipped considerably.
Bibliography
Sawe, B. E., 2018. The Most Popular Sports In The World. [Online]
Available at: https://www.worldatlas.com/articles/what-are-the-most-popular-sports-in-the-world.html
Curwin, J., Slater, R. & Eadson, D., 2013. Quantitative methods for business decisions. 7th ed. Andover:
Cengage Learning.
Serban, D., Mitrut, C. & Mitrut, C. A., 2003. Statistics for business administration. Bucharest: Editura
ASE.
Tennis View Mag, 2015. Memorable Moments in Tennis History Timeline. [Online]
Available at: http://www.tennisviewmag.com/memorable-moments-tennis-history-timeline
Total Sportek, 2019. Highest Prize Money In Tennis Grand Slams. [Online]
Available at: https://www.totalsportek.com/money/highest-prize-money-in-tennis-grand-slams/