josh_manzano@dlsu.edu.ph
Abstract. We build a dataset and models for predicting ranked match
outcomes in the popular online multiplayer game League of Legends.
Features are extracted from the data that the Riot Games API exposes, including
champions picked for the game, player role information, and mastery
levels for the players’ champions (pre-game knowledge), as well as
in-game event statistics and player statistics. We find that pre-match
knowledge (champions, masteries, roles, spells) does not affect the outcome
of the match as much as in-game statistics such as objectives secured and
tower/champion kills.
1 Introduction
Due to the prevalence of online multiplayer games such as League of
Legends, predicting match outcomes from pre-game statistics has already
been done multiple times. This study focuses on using both
pre-game statistics and in-game statistics as features to classify and predict the
outcomes of ranked matches. Both feature groups will be tested for how much they affect
the outcome of the game. Since a regular game of League of Legends generates so
much data, it would be infeasible to use all of it as features. To
keep the study feasible, the only statistics used are those that
are most likely to affect the game, such as first kills, objectives secured, and inhibitors
destroyed. Decision Trees, Naive-Bayes, and Deep Learning will be employed
to create the different models. These models will then be compared to see
which one works best in predicting the outcome of a match.
2 Review of Related Studies
This study employs methods similar to those used by other studies, differing
in the algorithms used. A study by Lin, L. called “League Of
Legends Match Outcome Prediction” [1] uses gradient boosted trees and gradient boosted
trees with logistic regression to predict match outcomes. Other
websites and software, as well as the League of Legends API itself, provide the means to run
automated basic analysis of a user’s match and player statistics. Their
limitation is the lack of a more in-depth analysis of thousands of
matches, since the scope they can analyze is very small. This study will also
differentiate between the effects of pre-game statistics and in-game statistics
on the outcome of a match.
3 Methodology
This study comprises three processes: (1) dataset building, (2) feature
selection, and (3) model building and classification.
For dataset building and feature selection, most of the data and the
feature selection were obtained from Kaggle.com [2], where the author states that most of
the data was gathered through the Riot Games API, which makes it easy to gather
other users’ ranked game history and statistics.
A labeled data set containing 51,491 matches (only 10,000 of which will be
utilized) will be used to build a model that can predict the outcome of a match.
It contains pre-game (champions selected) and in-game statistics. This
data set will be used to analyze which features affect the outcome of the game the
most and which ones affect predictability.
Due to the aforementioned infeasibility of using all of the statistics in
analyzing the outcome of a match, the chosen features are limited to 20
pre-game and 8 in-game statistic features. Features such as number of last hits,
number of player kills, and jungle creeps killed will not be included. The
included features are as follows:
In-Game Features:
gameDuration: Duration of the game.
firstBlood: Which team got the first kill in the game.
firstTower: Which team got the first tower kill in the game.
firstInhibitor: Which team got the first inhibitor kill in the game.
firstBaron: Which team got the first baron kill in the game.
firstDragon: Which team got the first dragon kill in the game.
firstRiftHerald: Which team got the first rift herald kill in the game.
t1/t2_towerKills: The number of tower kills a team got in the game.
t1/t2_inhibitorKills: The number of inhibitor kills a team got in the game.
t1/t2_baronKills: The number of baron kills a team got in the game.
t1/t2_dragonKills: The number of dragon kills a team got in the game.
t1/t2_riftHerald: The number of rift herald kills a team got in the game.
Pre-Game Features:
champID (t1/t2 and 1-5): Champions picked by each team.
ban (t1/t2 and 1-5): Champions banned by each team.
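The shorthand in the list above (t1/t2, slots 1-5) can be expanded into one column name per team and slot. The following is a minimal sketch in Python, not the procedure used in the study. The in-game names follow the list above. The pick and ban column names (t1_champ1id, t1_ban1, and so on) and the helper functions are hypothetical and only illustrate the expansion.

```python
# Expand the shorthand feature list into one column name per team / slot.
TEAMS = ("t1", "t2")

# Game-level in-game features, named as in the list above.
GAME_LEVEL = [
    "gameDuration", "firstBlood", "firstTower", "firstInhibitor",
    "firstBaron", "firstDragon", "firstRiftHerald",
]

# Per-team counters, written "t1/t2_<name>" in the list above.
TEAM_COUNTERS = ["towerKills", "inhibitorKills", "baronKills",
                 "dragonKills", "riftHerald"]


def in_game_columns():
    """Game-level columns followed by one counter column per team."""
    per_team = [f"{team}_{name}" for team in TEAMS for name in TEAM_COUNTERS]
    return GAME_LEVEL + per_team


def pre_game_columns():
    """Pick and ban columns for both teams, slots 1 to 5."""
    # ASSUMPTION: hypothetical naming scheme for picks and bans.
    picks = [f"{team}_champ{slot}id" for team in TEAMS for slot in range(1, 6)]
    bans = [f"{team}_ban{slot}" for team in TEAMS for slot in range(1, 6)]
    return picks + bans


def select_features(row, columns):
    """Keep only the chosen columns from one match record (a dict)."""
    return {name: row[name] for name in columns}
```

Calling select_features(match, pre_game_columns()) on a match record would keep only the pick and ban columns, which is one way to train separate models on pre-game and in-game knowledge.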
As stated, one data set will be used. Cross-validation will be applied to all three
algorithms to obtain a reliable estimate of accuracy. The
data set will be subjected to the three algorithms: Decision Tree,
Naive-Bayes, and Deep Learning.
The cross-validation process for the three algorithms is as follows:
The results and accuracy of the models built from the data set will
then be analyzed to see which features are most important and which
model is most accurate.
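As a rough illustration of the cross-validation step, the sketch below splits the matches into k folds, trains on k-1 of them, and tests on the remaining fold. It is a generic k-fold routine in plain Python, not the tool configuration used in this study. The fold count k=10, the function names, and the toy majority-class learner are assumptions made for the example. The sketch returns the mean and spread of the per-fold accuracies together with the accuracy pooled over all held-out samples, which is presumably what the "+/-" and "mikro" values in the results correspond to.

```python
import random
from statistics import mean, pstdev


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, train_fn, k=10, seed=0):
    """Return (mean fold accuracy, std of fold accuracy, pooled accuracy).

    train_fn(train_rows, train_labels) must return a predict(row) function.
    k=10 is an assumed default, not a value reported in the paper.
    Requires k <= len(rows) so that no fold is empty.
    """
    folds = k_fold_indices(len(rows), k, seed)
    fold_accuracies = []
    total_correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        predict = train_fn([rows[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        correct = sum(predict(rows[i]) == labels[i] for i in held_out)
        fold_accuracies.append(correct / len(held_out))
        total_correct += correct
    pooled = total_correct / len(rows)
    return mean(fold_accuracies), pstdev(fold_accuracies), pooled


def train_majority(train_rows, train_labels):
    """Toy baseline: always predict the most common training label."""
    majority = max(set(train_labels), key=train_labels.count)
    return lambda row: majority
```

Any of the three learners can be plugged in through train_fn. The majority-class baseline is included only so that the routine can be run end to end.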
4 Results and Discussion
Decision Tree
Cross Validation:
PerformanceVector:
accuracy: 95.90% +/- 0.35% (mikro: 95.90%)
ConfusionMatrix:
True: Blue Red
Blue: 4880 252
Red: 158 4710
According to the performance vector, the decision tree is the most
accurate of the three algorithms, with an accuracy of 95.90 percent. According to
its confusion matrix, it identified blue-team wins noticeably
better than red-team wins. This indicates that it
may be slightly inconsistent and biased towards one team: its two error counts
(252 and 158) differ by 94 samples.
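The figures discussed above can be recomputed directly from the confusion matrix. The sketch below assumes that each row of the matrix is a predicted class and each column a true class, as the "True:" header suggests. The function name and the dictionary layout are illustrative only.

```python
def summarize(confusion):
    """Compute accuracy and per-class recall.

    confusion[predicted][true] holds the count of matches, i.e. rows are
    predictions and columns are true classes (an assumed reading of the
    matrix layout).
    """
    classes = list(confusion)
    total = sum(confusion[p][t] for p in classes for t in classes)
    correct = sum(confusion[c][c] for c in classes)
    recall = {}
    for c in classes:
        actual = sum(confusion[p][c] for p in classes)  # all true-c matches
        recall[c] = confusion[c][c] / actual
    return correct / total, recall


# Decision Tree confusion matrix from the results above.
decision_tree = {
    "Blue": {"Blue": 4880, "Red": 252},
    "Red": {"Blue": 158, "Red": 4710},
}

accuracy, recall = summarize(decision_tree)
# Difference between the two off-diagonal (error) counts.
margin = abs(decision_tree["Blue"]["Red"] - decision_tree["Red"]["Blue"])
```

Under that reading, accuracy is 9,590 / 10,000 = 95.90 percent, blue wins are recovered at 4,880 / 5,038 (about 96.9 percent), red wins at 4,710 / 4,962 (about 94.9 percent), and the margin between the two error counts is 252 - 158 = 94.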
Naive-Bayes
In Naive-Bayes, we are able to see more directly how much each feature
affected the outcome of the match. Getting the first kill, first
tower kill, or first objective made the match outcome lean significantly towards the
team that got it.
As we saw in the decision tree, game duration and other factors did not
sway the outcome much.
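One way to see why Naive-Bayes exposes the influence of each feature is that the model is built from per-feature conditional probabilities such as P(firstTower = Blue | winner = Blue). The sketch below is a minimal categorical Naive-Bayes with Laplace smoothing, written in plain Python. It is not the implementation used in this study, and the function names and the smoothing constant alpha are assumptions.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive-Bayes model with Laplace smoothing (alpha).

    rows is a list of dicts mapping feature name -> categorical value.
    Returns (predict, likelihood).
    """
    class_counts = Counter(labels)
    # value_counts[(feature, label)][value] -> number of training matches
    value_counts = defaultdict(Counter)
    values_seen = defaultdict(set)
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(feature, label)][value] += 1
            values_seen[feature].add(value)

    def likelihood(feature, value, label):
        """Smoothed estimate of P(feature = value | winner = label)."""
        count = value_counts[(feature, label)][value]
        n_values = len(values_seen[feature])
        return (count + alpha) / (class_counts[label] + alpha * n_values)

    def predict(row):
        """Return the label with the highest log-posterior score."""
        scores = {}
        for label, n in class_counts.items():
            score = math.log(n / len(labels))  # log prior
            for feature, value in row.items():
                score += math.log(likelihood(feature, value, label))
            scores[label] = score
        return max(scores, key=scores.get)

    return predict, likelihood
```

Comparing likelihood("firstTower", "Blue", "Blue") with likelihood("firstTower", "Blue", "Red") shows how strongly that single feature separates the two outcomes. A feature whose likelihoods are nearly equal for both winners contributes little to the prediction.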
Cross Validation Results:
PerformanceVector:
accuracy: 93.81% +/- 0.49% (mikro: 93.81%)
ConfusionMatrix:
True: Blue Red
Blue: 4775 356
Red: 263 4606
According to the performance vector, Naive-Bayes had the lowest
accuracy of the three, at 93.81 percent. It is similar to
the decision tree in terms of consistency, with a 93 sample margin.
Deep Learning
PerformanceVector:
accuracy: 95.55% +/- 0.69% (mikro: 95.55%)
ConfusionMatrix:
True: Blue Red
Blue: 4820 227
Red: 218 4735
Deep Learning is the most consistent in determining which
team has won, with only a 9 sample margin. This suggests that deep learning
can predict the outcome of a match more reliably, as the features/match
statistics are stable enough to correctly determine the winner.
5 Conclusion and Future Work
In conclusion, match statistics influence a ranked match in
League of Legends enough to safely predict its outcome. Based on the Decision Tree
result, features such as tower kills and inhibitor kills affect the outcome the most.
While the outcome may hinge on tower kills and inhibitor
kills, other features such as first kill, first tower kill, or first objective still affect the
outcome, albeit not as much as the two aforementioned features.
To predict a match, it is best to use Deep Learning for its consistency. While
Decision Trees may be slightly more accurate, they are also volatile. Naive-Bayes
yielded the lowest accuracy when predicting the outcome of a match and should not
be used for prediction, as it is not well suited to the data set.
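As a rough check on how far apart the three accuracies are, the reported "+/-" values can be treated as ranges around each accuracy and tested for overlap. The sketch below uses only the figures from the results section. Treating the "+/-" value as a spread across cross-validation folds is an assumption, and an overlap check of this kind is not a formal significance test.

```python
# Reported cross-validation accuracy (percent) and its +/- spread.
reported = {
    "Decision Tree": (95.90, 0.35),
    "Naive-Bayes": (93.81, 0.49),
    "Deep Learning": (95.55, 0.69),
}


def interval(name):
    """Return (low, high) of the reported accuracy range for a model."""
    centre, spread = reported[name]
    return centre - spread, centre + spread


def overlaps(a, b):
    """True when the +/- ranges of models a and b share any value."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a
```

With these figures, the Decision Tree and Deep Learning ranges overlap, while the Naive-Bayes range lies below both. This is consistent with treating the first two as close in accuracy and choosing between them on other grounds, such as the sample margin.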
The data set used is built from 10,000 different ranked games from various
players of League of Legends. It could be refined by adding other features,
where feasible, such as the playing time of each player,
in-game ranks of players, and/or gold differences.
To further the goals of the study, a more comprehensive
machine learning algorithm could be used on a larger data set to
increase accuracy and improve understanding of which factors affect a game’s outcome.
The best-case scenario would be a data set from players around the world and a
supercomputer able to compute the neural connections needed to
make nearly perfect predictions.
References
[1] Lin, L. (n.d.). League Of Legends Match Outcome Prediction. Retrieved from
http://cs229.stanford.edu/proj2016/report/Lin-LeagueOfLegendsMatchOutcomePrediction-report.pdf
[2] Retrieved from https://www.kaggle.com/datasnaek/league-of-legends