
Computers & Education 118 (2018) 25–37


To gamify or not to gamify? An experimental field study of the influence of badges on motivation, activity, and performance in an online learning course

Elias Kyewski∗, Nicole C. Krämer
University of Duisburg-Essen, Germany
∗ Corresponding author. E-mail addresses: elias.kyewski@uni-due.de (E. Kyewski), nicole.kraemer@uni-due.de (N.C. Krämer).
https://doi.org/10.1016/j.compedu.2017.11.006
Received 1 April 2017; Received in revised form 14 November 2017; Accepted 16 November 2017; Available online 21 November 2017
0360-1315/© 2017 Elsevier Ltd. All rights reserved.

ARTICLE INFO

Keywords: Gamification, Badges, Motivation, e-learning, Social comparison

ABSTRACT

Over the last few years, the implementation of game elements like badges in non-game environments has become increasingly popular (Butler, 2014). In this study, we tested whether badges, which could be received for successful task performance and specific activities within an e-learning course in a higher education setting, had an impact on students' motivation and performance. In a between-subjects experimental field study, students were randomly assigned to three different conditions (no badges, badges visible to peers, badges only visible to students themselves). The results show that badges have less impact on motivation and performance than is commonly assumed. Independent of condition, students' intrinsic motivation decreased over time. Contrary to expectation, the badges that could only be viewed by the students themselves were evaluated more positively than those that could also be viewed by others.

1. Introduction

Gamification in non-game environments such as educational settings has been viewed with great interest
(Domínguez et al., 2013; Seaborn & Fels, 2015) and is used as a method to increase student participation in classrooms (Hanus & Fox,
2015). Gamification describes the use of game elements and game design elements in non-game contexts (Deterding, Dixon, Khaled,
& Nacke, 2011). It aims to combine intrinsic motivation with extrinsic motivation in order to foster engagement and motivation to
participate actively (Mishra & Kotecha, 2017; Muntean, 2011). The use of badges is a typical method of gamification (Hakulinen,
Auvinen, & Korhonen, 2013) and has been popularized as badgification (Butler, 2014). Badges are visual displays of users' progress
(Hakulinen et al., 2013; Hanus & Fox, 2015), which, for example, indicate the achieved competence level (Boticki, Baksa, Seow, &
Looi, 2015), give immediate feedback (Turan, Avinc, Kara, & Goktas, 2016), and constitute one form of extrinsic rewards (Hanus &
Fox, 2015). Symbols of progression can have a huge influence on a person's behavior, with a notable example being the usage of
badges in military organizations (Butler, 2014). Studies have shown that badges increase motivation and have significant effects on
engagement and activity (e.g., Anderson, Huttenlocher, Kleinberg, & Leskovec, 2014; Ruipérez-Valiente, Muñoz-Merino, & Kloos,
2016b, pp. 1–8) as well as on motivation for learning (Gibson, Ostashewski, Flintoff, Grant, & Knight, 2015; Santos et al., 2013),
which is especially important as motivation has been shown to be an important factor in learning (Eales, Hall, & Bannon, 2002).
Furthermore, badges can be used to showcase one's performance to peers (Hakulinen et al., 2013).
Although the employment of badges in the field of education has been described in numerous papers, the number of studies that
empirically assess the effects of badges is still limited. Specifically, there is a lack of studies that attempt to unravel the mechanisms behind the potential effects of badges. At least two such mechanisms are conceivable: On the one hand, badges might function as
(extrinsic) rewards, leading people to perform specific activities in order to receive the badges. On the other hand, and perhaps more
crucially, an individual might be motivated to collect badges in order to compare his/her own achievements positively with the
achievements of others – as social comparison has been shown to be a powerful driver of behavior (Festinger, 1954). To disentangle
these mechanisms, we varied badge visibility on three levels: no badges visible (no reward, no social comparison), badges visible only
to the person receiving them (reward, no social comparison), badges visible for everyone (reward, social comparison).
In this study, we examined the influence of badges awarded over a period of five weeks on participants’ motivation, activity, and
performance during a large, open-access online course conducted over one semester. Thus, potential influences of badges over a
longer period were analyzed. As we were able to conduct an experimental field study in a real-life context (i.e. the study took place in
a real higher education setting with students who were aspiring to earn credits for passing the course), our study extends previous
small-scale, lab-based research. We therefore hope to answer the question of whether badges have a relevant impact in real-life
situations and determine the mechanism (reward or social comparison) through which they affect participants.

2. Theoretical background

Motivation is conceivably the most important factor in learning (Eales et al., 2002). The question of motivation gains additional
relevance in online courses, in which it is even easier to skip participation than it is in face-to-face learning settings. Therefore,
especially in the realm of massive open online courses (MOOCs), in which only a small percentage of those people who start a course
actually complete it successfully, the question has been raised as to how students can be motivated to continue (Khalil & Ebner, 2014).
Awarding badges for successful task performance has been proposed as one potential tool in this respect, since badges have been
demonstrated to positively increase learner motivation (Fischer, Heinz, Schlenker, & Follert, 2016; Santos et al., 2013) and to affect
behavior (Hamari, 2017). However, so far, the mechanisms through which badges are able to influence motivation and performance
have been neither discussed nor empirically tested. Potential aspects which can be proposed in this regard are reward (in the sense
that badges can be perceived as additional extrinsic reward) and social comparison (in the sense that when badges are visible for
everyone, it is possible to assess one's own achievements relative to the group, increasing the motivation to either stay on top or reach
the top with regard to performance). In the following, we will first discuss reward as an explanation for the effectiveness of badges
and then focus on social comparison processes.

2.1. Extrinsic and intrinsic rewards

Studies have shown that incentive systems like badges increase motivation (e.g., Anderson et al., 2014; Mekler, Brühlmann,
Opwis, & Tuch, 2013). In order to be able to explain potential motivation gains from badges, one needs to distinguish between
different forms of motivation. Here, Ryan and Deci (2000) suggested a self-determination continuum in which they differentiate
between amotivation (nonself-determined behavior with no regulations), extrinsic motivation which “refers to doing something
because it leads to a separable outcome” (p. 55) (including the increasingly autonomous subtypes external regulation, introjected
regulation, identified regulation, and integrated regulation) and intrinsic motivation which refers to “[…] doing something because it
is inherently interesting or enjoyable […]” (p. 55). As amotivation is not a relevant concept for the present study, we focus on
intrinsic and extrinsic motivation in the following (without explicitly distinguishing the subtypes of extrinsic motivation).
A central goal of education and learning is to motivate students to engage and be active in a course (Hanus & Fox, 2015). If
students are lacking in motivation to learn, there is the possibility to motivate them extrinsically (Hamari, 2017). Accordingly,
incentive systems have long been prevalent in schools and are used to motivate student learning (Deci, Koestner, & Ryan, 2001). In
this line, reward systems based on badges should improve extrinsic motivation (Domínguez et al., 2013). However, previous studies
have shown that extrinsic rewards have a negative effect on students’ intrinsic motivation (Deci et al., 2001). To explain this,
Nicholson (2012) stated that the decrease of intrinsic motivation is based on “the controlling aspect of these rewards” (Nicholson,
2012, p. 2) and that intrinsic motivation is replaced by external motivation. Domínguez et al. (2013) reported that the use of
incentive systems can cause a feeling of manipulation and that the induced behavior vanishes as soon as the reward is ended.
Furthermore, numerous studies (e.g., Filsecker & Hickey, 2014; Hanus & Fox, 2015) describe that adding rewards to tasks which one
already finds interesting leads to a decrease of motivation and that rewards may cause a shift from intrinsic to extrinsic motivation,
which is in line with cognitive evaluation theory (Deci & Ryan, 1985). However, assumptions regarding the influence of rewards on
both intrinsic and extrinsic motivation are divergent: Given specific conditions, Ryan and Deci (2000) assume that interpersonal
factors such as badges can enhance intrinsic motivation as long as extrinsic rewards lead to satisfaction of the psychological need for
competence (Ryan & Deci, 2000). However, in order for this effect to emerge, individuals must experience not only competence, but
also autonomy, in the sense that they experience their behavior to be self-determined (for more detailed information see Deci & Ryan,
1985; Ryan & Deci, 2000). In sum, although research has demonstrated that rewards can – in the long run – lead to diminishing
intrinsic motivation (Deci, Koestner, & Ryan, 1999a, 1999b; Deci et al., 2001), there are also findings that extrinsic gratifications
which reward specific accomplishments foster not only extrinsic but also intrinsic motivation (Cameron & Pierce, 1994).
Likewise, the discussion around gamification is relatively divergent (e.g., van Roy & Zaman, 2017) and opinions differ regarding
the use of gamification elements such as badges to improve intrinsic motivation. Therefore, caution is warranted when assuming that
it is possible to increase intrinsic motivation by rewards (Haaranen, Ihantola, Hakulinen, & Korhonen, 2014). Furthermore, the level
of intrinsic motivation prior to the task or course should be taken into consideration: In persons with an initially high level of intrinsic
motivation, the addition of extrinsic motivations like badges may lead to a decrease of intrinsic motivation. By contrast, unmotivated persons or those with an initially low level of intrinsic motivation may be motivated by badges, leading to greater engagement and
participation (Glover, 2013).
Therefore, it is difficult to derive hypotheses with regard to intrinsic motivation, since on the one hand, rewards like badges might
decrease intrinsic motivation, and on the other hand, the reward is given in such a way that it is mostly tied to accomplishments and
not mere activity, which has been shown to foster intrinsic motivation (Cameron & Pierce, 1994). Moreover, given that participants
are merely rewarded over a short period of time, we assume that:
H1. Participants' intrinsic motivation will increase when they receive badges for their accomplishments.
Nevertheless, the participants' prerequisites in terms of intrinsic motivation prior to the course (see Glover, 2013) have to be taken
into account:
H2. In particular, the intrinsic motivation of participants who report low intrinsic motivation prior to the course will increase in the badge
conditions, while the intrinsic motivation of those who report high intrinsic motivation will decrease in the badge conditions.
Based on the assumption that students' motivation (either intrinsic or extrinsic) will rise due to the reward given, we also expect
an effect on students' actual activity and performance. Numerous studies have shown an impact of the use of badges and other game
elements like leaderboards on students' interest in the course (e.g., Todor & Piticǎ, 2013), their motivation (e.g., Fischer et al., 2016;
Santos, Almeida, Pedro, Aresta, & Koch-Grunberg, 2013; Villagrasa & Duran, 2013) and on their engagement, participation, and
performance (e.g., Barata, Gama, Jorge, & Gonçalves, 2013; Gibson et al., 2015; O'Donovan, Gain, & Marais, 2013). Therefore, we
state:
H3. Students in the badge conditions will participate more actively in the course than students in the control group.
H4. Students in the badge conditions will show a better performance (grades and quiz results) than students in the control group.
In order to understand why people show specific reactions while using the learning platform in the different conditions, it is also
important to assess users’ perceptions. Therefore, we are interested in whether the awarding of badges influences the
evaluation of the online platform in terms of ease of use versus complexity or regularity of use. Accordingly, we ask:
RQ1: How do students evaluate the online platform depending on conditions?

2.2. Social comparison

Besides reward, however, there are further mechanisms that might be responsible for the effects of badges: People tend to
evaluate their own opinions and abilities by comparing them with those of others (e.g., Zuckerman & Gal-Oz, 2014). Festinger (1954)
termed this phenomenon social comparison. According to Festinger (1954), people compare themselves with similar others. In general,
people have a tendency to continuously enhance their abilities and aspects of their self, and strive to be better than others. Moreover,
the comparison with others can reduce uncertainty and dissonance (Festinger, 1954). A central aspect of social comparison theory is
downward or upward comparison. Downward comparison can affect people positively and make them feel superior, whereas upward
comparison can lower people's self-concept and thus affect them negatively (Hanus & Fox, 2015). Social comparison is known to be
relevant for learning processes and motivation (Malzahn, Ganster, Sträfling, Krämer, & Hoppe, 2013) and to be a major feature of the
classroom (Huguet, Dumas, Monteil, & Genestoux, 2001), due to the reward system based on academic performance (Dijkstra,
Kuyper, van der Werf, Buunk, & van der Zee, 2008). Therefore, in most cases, social comparison in the classroom is not a choice but
happens automatically (Dijkstra et al., 2008). People are also able to compare their achievements in online learning contexts, for
instance through points, grades, badges, or leaderboards. In this regard, badges have been described as social markers, which provide
a social validation that the achievement activity is worthwhile (Hamari, 2017). Following this line of thought and based on assumptions derived from informational social influence (Deutsch & Gerard, 1955), it has been hypothesized that seeing badges of
fellow students will lead to increased engagement and activity since “individuals are more likely to engage in behaviors that they
perceive others are also engaged in.” (Hamari, 2017, p. 470). Other students thus serve as a source of information. This is followed by
conformity if the behavior of others is interpreted to be appropriate (Deutsch & Gerard, 1955). Indeed, previous research has already
shown that badges which are visible to all have significant effects on course activity (Anderson et al., 2014), a finding which has been
explained as follows: “allowing students to show their badges to others could make them more desirable and might motivate more
students to pursue the badges” (Hakulinen et al., 2013, p. 4). Therefore, we derive the following hypotheses:
H5. Students who see their own and other people's badges will be more active than those who can only see their own badges.
H6. Students who see their own and other people's badges will show a better performance (grades and quiz results) than those who can only see
their own badges.
With regard to the question of how different forms of badges (i.e. those that can be seen by others versus those that are only visible
to oneself) are perceived, it is more difficult to derive a hypothesis since – to our knowledge – no study so far has compared the
evaluation of different forms of badges. The theoretical background of social comparison – which is the aspect concerning which the
two forms differ – also does not provide any hints on which form would be preferred. While some people might not enjoy engaging in
social comparison (especially when others are not doing well), other people might value the opportunity to learn about their own
status. Therefore, we ask:


RQ2: Do the participants differ with regard to their evaluation of the different forms of badges?

3. Method

This study was conducted during an online seminar at a German university over a period of one semester. The topic of the online
seminar was “Basic psychological mechanisms of computer-mediated communication: learning and teaching”. As the platform, we
used Moodle, which is commonly used by universities and is designed to support both teaching and learning. The platform provides a large number of learner-centric tools and collaborative learning environments, can be customized and integrated with external applications, is open-source, and can be deployed on a private server.

3.1. Sample

324 students from a wide range of subjects registered for the online course. In total, 159 students (105 female, 54 male) participated in at least one survey round. Only 106 students reported their age, which ranged from 18 to 39 (M = 24.21, SD = 3.15). During the course, 173 (53.4%) students dropped out, leaving 151 students who completed the course. Because of these dropouts and students’ varying willingness to participate in the different survey rounds, the samples and distributions differ between the data collection time points, resulting in different samples for the calculations. Survey data of 126 students were used in the further analysis; as these students had completed the course and participated in at least one survey round, we were able to refer to their logfiles as well as to their survey data. Students who participated in surveys but did not complete the course were not considered in any further calculations. Of the included students, 65 participated in only one of the three questionnaires (time point 1: n = 59; time point 2: n = 15; time point 3: n = 1), thirty-one participated in at least two questionnaires (time points 1 & 2: n = 26; time points 1 & 3: n = 1; time points 2 & 3: n = 4), and twenty participated in all three questionnaires (time points 1, 2, & 3: n = 20).

3.2. Study design

The study employed a between-subjects experimental design in which the students were randomly and automatically divided into
three different conditions. Students’ characteristics such as study program, age, sex, or knowledge of the topic were not considered, as
we were aiming for completely randomized groups. Two conditions were treatment conditions and one condition served as control
group. Students in the condition own-others-badges were able to see their own badges as well as those of fellow students (treatment
condition). Students in the condition own-badges were only able to see their own awarded badges and not those of other students
(treatment condition). Students in the control group (no-badges) did not receive achievement badges. Badges could be earned within
the first five weeks of the course.
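Purely for illustration, such a completely randomized, automatic assignment could look like the following sketch (the condition labels mirror those above, but the function and identifiers are hypothetical and do not describe Moodle's actual assignment mechanism):

```python
import random

CONDITIONS = ["own-others-badges", "own-badges", "no-badges"]

def assign_conditions(student_ids, seed=42):
    """Randomly assign students to the three conditions, deliberately ignoring
    characteristics such as study program, age, or sex (complete randomization)."""
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    # Round-robin over the shuffled list yields roughly equal group sizes.
    return {sid: CONDITIONS[i % len(CONDITIONS)] for i, sid in enumerate(shuffled)}

assignment = assign_conditions([f"student_{n}" for n in range(324)])
```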

3.3. Badges

During the first five weeks of the course, students in the treatment conditions were awarded badges for different activities and
achievements. Different badges were designed (see Table 1). One badge was a hidden badge, in the sense that students did not know
beforehand that they would be awarded a badge if they completed the respective activity (in this case reading further literature, see
Table 2). All other badges were announced to the students in advance. Both treatment groups were given a description of the achievement badges, with the difference
that students in the own-others-badges condition were informed that they would be able to see their own badges and those of fellow
students, and fellow students would also be able to see their badges. In the second treatment condition (own-badges), the students
were informed that only they would be able to see their achieved badges. The hidden badge and the quiz badge were ranked into
gold, silver and bronze depending on the students' activities and performance (e.g., gold signified superior performance in a quiz
compared to other students). By making this overview of achievement badges available to students, we hoped to call attention to the
badges and their related activities. The students did not receive a detailed description of how to achieve the badges (except for the
peer review badge, since the students were mostly not familiar with peer feedback and needed a precise description of the process).
The other badges were merely given catchy names in order to clarify the activity for which they would be awarded.
For each thematic block, the students were tested in a short quiz on the knowledge they had acquired through the literature and videos. The quiz results yielded a relative score for each student compared to the other students’ scores. The quiz badge was awarded in gold, silver and bronze to students who answered a higher percentage of questions correctly than other students. Moreover, this badge could change from week to week. To illustrate: a student who answers all questions correctly in week one and performs better than the other students receives the gold badge. In the following week, the student does not get full marks on the quiz, but other students do; as a consequence, the badge changes from gold to silver. This
adjustment is made for every thematic block. It is therefore possible to earn the same badge as in the previous week. Earning a gold
badge becomes increasingly difficult from week to week. The quiz bronze badge was awarded 19 times, the quiz silver badge was
awarded 63 times and the quiz gold badge was awarded 32 times.
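To illustrate the weekly tier logic described above, the sketch below assigns tiers from the percentage bands listed in Table 1. It is an illustration only: the paper does not specify exactly how the results of other students were incorporated, and all names in the sketch are hypothetical.

```python
from typing import Dict, Optional

# Percentage bands as listed in Table 1 (gold 90-100%, silver 70-89%, bronze 50-69%).
TIERS = [(90.0, "gold"), (70.0, "silver"), (50.0, "bronze")]

def quiz_badge_tier(percent_correct: float) -> Optional[str]:
    """Map a quiz score (in percent) to a badge tier, or None below the bronze band."""
    for threshold, tier in TIERS:
        if percent_correct >= threshold:
            return tier
    return None

def assign_weekly_quiz_badges(scores: Dict[str, float]) -> Dict[str, Optional[str]]:
    """Recompute every student's quiz badge for the current thematic block.
    Because the badge is reassigned each week, a gold badge earned in week one
    can drop to silver in week two."""
    return {student: quiz_badge_tier(score) for student, score in scores.items()}

print(assign_weekly_quiz_badges({"student_a": 100.0, "student_b": 75.0}))  # week 1
print(assign_weekly_quiz_badges({"student_a": 82.0, "student_b": 95.0}))   # week 2
```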
The hidden badge, which was awarded for reading additional literature, was awarded 21 times in bronze, 9 times in silver and 7
times in gold.
Another badge was awarded for participation in the discussion forum. To earn this badge, students had to comment on at least three posts and had to start a new discussion topic. This badge was awarded to 19 out of 88 students.

Table 1
Different badges.

Name                       Description (not visible to students)
Participation in forum     Comment on at least three forum postings and open a thread
Peer feedback              Be among the best 20% of reviewers
Quiz gold badge            90–100% of quiz answers correct, incorporating the results of other students
Quiz silver badge          70–89% of quiz answers correct, incorporating the results of other students
Quiz bronze badge          50–69% of quiz answers correct, incorporating the results of other students

Table 2
Hidden badge conferred in gold, silver and bronze.

Name                          Description
Additional literature gold    Read all of the additional, voluntary literature of weeks 1–6
Additional literature silver  Read 6/9 of the additional, voluntary literature of weeks 1–6
Additional literature bronze  Read 3/9 of the additional, voluntary literature of weeks 1–6
To earn the peer feedback badge, students had to submit an assignment, which was then reviewed by other students. In a further
step, students rated the received reviews regarding constructiveness, helpful hints, concreteness, argumentation, appropriate choice
of words and mentioning of strengths and weaknesses. Based on this rating, the best 20% of the reviewers received the peer feedback
badge. This badge was awarded to 15 students (n = 88).
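A rough sketch of how the "best 20% of reviewers" could be determined from the ratings of the received reviews is shown below; the column names and the aggregation by mean rating are assumptions, since the paper does not specify the exact scoring rule.

```python
import pandas as pd

# Hypothetical ratings: one row per received review, rated by its recipient (1-5).
ratings = pd.DataFrame({
    "reviewer_id": ["r1", "r1", "r2", "r3", "r3", "r4"],
    "rating": [4.5, 4.0, 2.5, 5.0, 4.5, 3.0],
})

# Average the ratings per reviewer and award the badge to the top 20%.
mean_rating = ratings.groupby("reviewer_id")["rating"].mean()
cutoff = mean_rating.quantile(0.80)
peer_feedback_badge = mean_rating[mean_rating >= cutoff].index.tolist()
print(peer_feedback_badge)
```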
After a period of five weeks, it was no longer possible to earn badges. We therefore aimed to determine a) the direct effects of
badges by focusing on the effects within the first five weeks, and b) transfer effects by analyzing the effect in weeks 6–14.


3.4. Dependent variables

3.4.1. Logfiles
After registering on the platform and before participating in the course, students were asked to provide consent for data on their
user behavior on the platform to be logged. They were informed that if they consented to this, the data would be saved anonymously.
Furthermore, it was made clear to the students that there were no inherent advantages or disadvantages to agreeing or refusing to have their data logged. 157 (90.2%) students expressed agreement.
The logfiles of these students provided information about the login, resource accesses, (e.g., literature and video access), quiz
attempts, quiz results, and participation in questionnaires. As the logfiles provided information about a student's last login, we were able to see when they logged in and how much time had elapsed since their last login. The data on video access provided information about how often a
student clicked on a video (although this did not reveal whether the student actually watched the video). This was also the case for
literature access: The system was only capable of logging whether or not a student clicked on literature but could not track scrolling
behavior; therefore, there was no guarantee that students read the literature after accessing it. Furthermore, students were able to
download literature; these data also provided information about activity and resource access. Data about students' quiz attempts and
results were also logged, enabling us to take a closer look at students' performance (see above) and to draw conclusions about
improvement.
All logfiles were saved on a weekly basis. Moreover, the order of accesses was logged. Based on the personal platform ID of every
single student, which was transmitted to the surveys, it was possible to assign logfiles to surveys, thus providing the opportunity to
analyze relations between self-report data and logfiles.
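As an illustration of this last step, matching weekly log data to survey responses via the platform ID could look roughly like the sketch below; the column names (`platform_id`, `week`, `active_days`) are hypothetical, as the paper does not describe the export format at this level of detail.

```python
import pandas as pd

# Hypothetical weekly activity export: one row per student and week.
logs = pd.DataFrame({
    "platform_id": ["a1", "a1", "b2"],
    "week": [1, 2, 1],
    "active_days": [3, 1, 4],
})

# Hypothetical survey export: the platform ID was transmitted to the survey tool.
survey = pd.DataFrame({
    "platform_id": ["a1", "b2"],
    "intrinsic_motivation_t1": [3.4, 4.0],
})

# Aggregate activity over the badge period (weeks 2-6) and join with self-reports.
activity = (logs[logs["week"].between(2, 6)]
            .groupby("platform_id", as_index=False)["active_days"].mean())
merged = survey.merge(activity, on="platform_id", how="left")
print(merged)
```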

3.5. Measures

Students participated in several online questionnaires on a voluntary basis. The questionnaires encompassed socio-demographic
characteristics (e.g., age and sex), as well as badge impact (BIS), the System Usability Scale (SUS) and an adapted and extended
version of the academic self-regulation questionnaire (SRQ-A).
Badge-related questions were posed via different scales, including items from the Badge Impact Survey (BIS) by Biles, Plass, and
Homer (2014), such as “Do you like to earn badges?” (with responses ranging from 1 = never to 5 = all the time), and “How
important is it for you to earn badges?” (with responses ranging from 1 = not important to 5 = very important). To collect feedback
on specific badges, we extended the questionnaire with self-developed items (e.g., “Badges have motivated me”, “I find badges
interesting and tried to collect them”, with responses ranging from 1 = strongly disagree to 5 = strongly agree). Items of the BIS and
self-developed items were all rated on a 5-point Likert scale. In total, we used 27 badge-related items to evaluate the badges. These
items can be divided into the following five aspects: what the badges indicate (7 items), importance of the meaning of badges (7
items), attitude towards badges (7 items), earning badges (5 items) and comparison of badges (1 item).
For further analysis, we conducted an exploratory factor analysis using Horn's parallel analysis method (Fabrigar, Wegener,
MacCallum, & Strahan, 1999). Seven items regarding what the badges indicated were included in the calculations (e.g., “I am making
progress”, “I am among the best”, with responses ranging from 1 = strongly disagree to 5 = strongly agree). Two items were
eliminated because they failed to meet the minimum criterion of having a primary factor loading of 0.4 or above (Costello & Osborne,
2005). Two factors emerged. The first factor consisted of two items thematically related to “among the best” (Cronbach's α = 0.80),
and the second factor consisted of three items (Cronbach's α = 0.62) thematically related to “progress” (see Table 3).
Furthermore, we wanted to know how important the meanings of the badges were to the students (e.g., “I managed a task
successfully”, “I am better than someone else”, with responses ranging from 1 = strongly disagree to 5 = strongly agree). Therefore,
we conducted another exploratory factor analysis using Horn's parallel analysis method (Fabrigar et al., 1999). Seven items regarding
the importance of the badges' meaning were included. The factor analysis yielded two factors. The first factor consisted of two items
thematically related to the importance of being “among the best” (Cronbach's α = 0.93) and the second factor consisted of five
items related to importance of “progress” (Cronbach's α = 0.86) (see Table 3).
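For readers unfamiliar with Horn's parallel analysis, the generic sketch below shows the underlying idea: eigenvalues of the observed correlation matrix are retained only if they exceed those obtained from random data of the same dimensions. It is an illustration only, not the authors' analysis script; `item_scores` (a respondents × items array of Likert ratings) is a hypothetical input.

```python
import numpy as np

def parallel_analysis(item_scores: np.ndarray, n_iter: int = 1000, seed: int = 0) -> int:
    """Count eigenvalues of the observed correlation matrix that exceed the
    95th percentile of eigenvalues from random data of the same shape."""
    rng = np.random.default_rng(seed)
    n_obs, n_items = item_scores.shape
    observed = np.linalg.eigvalsh(np.corrcoef(item_scores, rowvar=False))[::-1]

    random_eigs = np.empty((n_iter, n_items))
    for i in range(n_iter):
        random_data = rng.normal(size=(n_obs, n_items))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(random_data, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, 95, axis=0)

    return int(np.sum(observed > threshold))

# Example with simulated two-factor data (seven items, 200 respondents):
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 7))
item_scores = latent @ loadings + rng.normal(scale=0.8, size=(200, 7))
print(parallel_analysis(item_scores))  # typically 2 for this simulated structure
```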

Table 3
Factor loadings from principal axis extraction with promax rotation for badge meaning and importance of badges.

                                 Badge meaning                        Importance of badges
Item                             "Among the best"   "Progress"        "Progress"   "Among the best"
I completed a task                   -0.166            0.370             0.700          0.014
I managed a task successfully         0.095            0.392             0.651          0.232
I am better than someone else         0.970           -0.071             0.016          0.886
I have fun                            0.297            0.678             0.627         -0.045
I have discovered something          -0.145            0.595             0.831         -0.150
I am making progress                 -0.157            0.538             0.827          0.102
I am among the best                   0.683            0.028            -0.072          1.019

Note. The first two items were not included in further calculations because they failed to meet the minimum criterion of having a primary factor loading of 0.4 or above.


Table 4
Sample size per hypothesis and research question.

Hypothesis / research question                               N overall   Condition 1   Condition 2   Condition 3
Hypothesis 1                                                    20           10             2             8
Hypothesis 2                                                    13           10             3             –
Hypothesis 3                                                   116           81 (conditions 1 + 2)        35
Hypothesis 4 – quiz                                            138           96 (conditions 1 + 2)        42
Hypothesis 4 – grade                                            85           59 (conditions 1 + 2)        26
Research question 1                                             65           26            23            16
  Participants who earned badges                                51           21            14            16
Hypothesis 5 – active days (weeks 2–6)                         118           46            35            37
  Participants who earned badges (active days, weeks 2–6)       94           32            25            37
Hypothesis 6                                                    59           33            26             –
Research question 2                                             49           26            23             –

Note: For hypotheses 1, 3, and 4, conditions 1 and 2 were cumulated as the gamification conditions. For hypothesis 2, only participants of the gamification conditions were included.

The System Usability Scale (SUS) – an improved German translation of Brooke’s (1986) questionnaire by Lohmann and Schäffer
(2013) – consists of 10 items (e.g., “I think that I would like to use the system frequently”) rated on a 5-point Likert scale from
1 = “strongly disagree” to 5 = “strongly agree”. The internal consistency (Cronbach's α) was 0.85 (n = 65).
The adapted and extended version of the academic self-regulation questionnaire (SRQ-A) by Müller, Hanfstingl, and Andreitz
(2007) consists of the four dimensions intrinsic regulation (5 items), identified regulation (4 items), introjected regulation (4 items)
and external regulation (4 items). This questionnaire serves to measure the motivational regulation of students' learning and was
administered at three different time points: in the first week of the course (t1: n = 106, Cronbach's α = 0.88), after week 5 (t2:
n = 65, Cronbach's α = 0.87) and almost at the end of the course (t3: n = 27, Cronbach's α = 0.90). In the following, only the
intrinsic regulation (e.g., “I am learning in this online lecture … because I find it fun”, with responses ranging from 1 = strongly
disagree to 5 = strongly agree) will be considered.
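The internal consistencies reported in this section follow the standard Cronbach's alpha formula, α = k/(k − 1) · (1 − Σ item variances / variance of the total score). A small illustrative implementation (with a purely hypothetical item-score matrix) is given below.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical ratings of five intrinsic-regulation items by four respondents.
example = np.array([[4, 5, 4, 4, 5],
                    [3, 3, 2, 3, 3],
                    [5, 4, 5, 5, 4],
                    [2, 2, 3, 2, 2]])
print(round(cronbach_alpha(example), 2))
```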

4. Results

The main purpose of this study was to examine the impact of badges on students' motivation, activity, and performance.
Therefore, we conducted several analyses depending on conditions. As mentioned above, due to dropouts as well as students’ varying
willingness to participate in the different survey rounds, the sample and the distribution differ for each time point of data collection,
therefore resulting in different samples for calculations. Table 4 shows the sample size for each hypothesis and research question.
Hypothesis 1 stated that participants’ intrinsic motivation would increase when they receive badges for their accomplishments. To
examine this hypothesis, we conducted a two-factor analysis of variance with repeated measures on one factor to reveal whether a
significant effect of condition (badge (treatment conditions) vs. no-badge (control group)) on intrinsic motivation could be shown.
The within-subjects factor was intrinsic motivation measured at the three time points. The between-subjects factor was the condition.
The results showed no significant effect of condition on intrinsic motivation over time (F (2,36) = 1.69, p = 0.20, η2 = 0.09). No
interaction effect was found between condition (badges vs. no badges) and intrinsic motivation depending on time point. A significant
main effect of time on intrinsic motivation emerged (F (2,36) = 4.77, p = 0.015, η2 = 0.21). Pairwise comparisons revealed a
significant difference between t1 and t3 (p = 0.034). Students’ intrinsic motivation decreased from the beginning to the end of the
course (see Table 5).
Even a closer look at each condition (own-others-badges vs. own-badges vs. control group) revealed no significant effect of the
conditions on intrinsic motivation over time (F (4,34) = 0.92, p = 0.46, η2 = 0.10). Thus, no interaction effect was found between condition and intrinsic motivation depending on the time point. Therefore, hypothesis 1 is not supported. The means and
standard deviations are presented in Table 5.
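The analysis reported above is a mixed (split-plot) ANOVA with time as within-subjects factor and condition as between-subjects factor. A generic sketch of this type of test in Python, using the pingouin package and hypothetical long-format data, is shown below; it illustrates the test, not the authors' actual SPSS analysis or data.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per student and measurement time point.
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6],
    "time": ["t1", "t2", "t3"] * 6,
    "condition": ["badges"] * 9 + ["no_badges"] * 9,
    "intrinsic_motivation": [3.5, 3.3, 2.6, 3.6, 3.3, 3.0, 3.4, 3.2, 2.8,
                             3.3, 3.2, 3.1, 3.2, 3.1, 3.0, 3.0, 3.1, 2.9],
})

# Two-factor ANOVA with repeated measures on the time factor.
aov = pg.mixed_anova(data=df, dv="intrinsic_motivation", within="time",
                     subject="subject", between="condition")
print(aov[["Source", "F", "p-unc", "np2"]])

# Pairwise comparisons over time (e.g., t1 vs. t3); the function is called
# pairwise_ttests in older pingouin versions.
posthoc = pg.pairwise_tests(data=df, dv="intrinsic_motivation", within="time",
                            subject="subject", padjust="bonf")
print(posthoc)
```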
To test hypothesis 2, which stated that in particular, the intrinsic motivation of participants who reported low intrinsic motivation
prior to the course would increase in the badge conditions, while the intrinsic motivation of those who reported initially high intrinsic
motivation would decrease in the badge conditions, we conducted a moderation analysis using PROCESS for SPSS (Hayes, 2012) with
intrinsic motivation at t3 as dependent variable, the condition as independent variable and intrinsic motivation at t1 as moderator
(n = 13; only students in the badge conditions were included). The results of the overall model showed a significant effect (F
(3,9) = 12.03, p = 0.002, R2 = 0.68).
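The moderation model behind this analysis is an ordinary regression with an interaction term: intrinsic motivation at t3 regressed on condition, intrinsic motivation at t1, and their product. A rough Python equivalent of PROCESS model 1 (with hypothetical, made-up data and column names) might look as follows.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: condition dummy-coded 0/1, motivation scores on the 1-5 scale.
df = pd.DataFrame({
    "condition": [0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1],
    "im_t1": [2.2, 3.1, 4.0, 2.5, 3.6, 4.2, 2.9, 3.3, 3.8, 2.4, 4.5, 3.0, 2.7],
    "im_t3": [2.0, 2.8, 3.9, 2.6, 3.2, 4.0, 2.5, 3.0, 3.6, 2.2, 4.3, 2.9, 2.4],
})

# Moderation = regression with an interaction term; centering the moderator
# makes the conditional effects easier to interpret.
df["im_t1_c"] = df["im_t1"] - df["im_t1"].mean()
model = smf.ols("im_t3 ~ condition * im_t1_c", data=df).fit()
print(model.summary())

# Conditional (simple) effects of condition at low/average/high t1 motivation can be
# obtained by re-centering the moderator at one SD below the mean, the mean,
# and one SD above the mean, and refitting the model.
```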
The results also revealed that intrinsic motivation at t1 was a significant predictor of intrinsic motivation at t3 (b = 0.98, t
(9) = 4.43, p = 0.002). This means that for every 1-unit increase in intrinsic motivation at t1, intrinsic motivation at t3 is predicted to increase by 0.98 units. A non-significant effect was found for condition (b = 0.35, t (9) = 0.85, p = 0.42). Moreover, there was no
significant effect for the interaction between condition and intrinsic motivation at t1 (b = −0.48, t (9) = −0.75, p = 0.47). A more detailed look at the conditional effect of condition on intrinsic motivation at t3 in terms of the values of the moderator (intrinsic motivation at t1) revealed a significant effect for low intrinsic motivation at t1 (b = 0.73, t (9) = 3.03, p = 0.014). This means that for students who reported low intrinsic motivation at t1, there was a significant effect on intrinsic motivation at t3. For average intrinsic motivation at t1 (M = 3.28) and for those with high intrinsic motivation, no significant effect on intrinsic motivation at t3 was found. The Johnson-Neyman technique revealed a significant association between having an intrinsic motivation below 2.9 at t1 and intrinsic motivation at t3 (b = 0.52, t (9) = 2.26, p = 0.05), indicating a linear decline of intrinsic motivation over time. Thus, low intrinsic motivation at t1 exerted an effect on intrinsic motivation at t3, but there was no interaction effect between condition and intrinsic motivation at t1 on intrinsic motivation at t3. Therefore, hypothesis 2 is not supported.

Table 5
Descriptive statistics for intrinsic regulation at time 1, time 2, and time 3 according to condition (the grouped badges condition comprises the own-others-badges and own-badges conditions).

Time period   Condition                    M      SD     N
Time 1        own-others-badges            3.50   0.80   10
              own-badges                   3.60   0.85    2
              no-badges                    3.28   0.88    8
              grouped badges conditions    3.52   0.77   12
Time 2        own-others-badges            3.34   0.58   10
              own-badges                   3.30   0.14    2
              no-badges                    3.18   0.69    8
              grouped badges conditions    3.33   0.53   12
Time 3        own-others-badges            2.64   1.01   10
              own-badges                   3.00   0.57    2
              no-badges                    3.05   1.03    8
              grouped badges conditions    2.70   0.99   12
Hypothesis 3 predicted that students in the badge conditions (own-others-badges and own-badges) would participate more actively
in the course than students in the control group (no-badges). To examine this hypothesis, the mean of the logfiles of active days (weeks
2–6, the time period in which it was possible to earn badges) was computed and served as dependent variable. Results of a one-way analysis of variance (ANOVA) showed that students in the badge conditions (n = 81) did not participate more actively than
students in the no-badges condition (n = 35): F (1,79) = 0.56, p = 0.58, η2 < 0.01. Therefore, hypothesis 3 is not supported.
Hypothesis 4 stated that students in the badge conditions would show a better performance (grades and quiz results) than students
in the control group. To examine this hypothesis, we conducted a one-way analysis of variance. No significant differences emerged
between the badge conditions and the control group regarding quiz results (F (1,136) = 0.19, p = 0.67, η2 < 0.001). To test
whether students in the badge conditions showed a better performance regarding course grade, a one-way analysis of variance was
conducted. There were no significant differences between the badge conditions and the control group regarding the course grade (F
(1,83) = 1.08, p = 0.30, η2 = 0.013). Therefore, hypothesis 4 is not supported.
Research question 1 focused on the perception of the online platform depending on conditions. To investigate this research
question, we conducted a one-way analysis of variance (ANOVA). Results showed a marginal, non-significant difference between the
three conditions (F (2,62) = 3.01, p = 0.057, η2 = 0.09). When only those students in the badge conditions (own-others-badges and own-badges) who had actually earned a badge were considered, the results showed a significant difference between the conditions (F (2,48) = 4.02,
p = 0.024, η2 = 0.14). Post hoc comparison using Scheffé test indicated that the mean score for the condition own-others-badges
(n = 21, M = 3.15, SD = 0.75) differed significantly from the mean score for the condition no-badges (n = 16, M = 3.76,
SD = 0.49). Thus, students in the no-badges condition perceived the platform as easier to use and less complex, and indicated that they would use the platform more regularly. No significant differences were found between the condition own-others-badges and the condition own-badges, or between the conditions own-badges and no-badges.
Hypothesis 5 predicted that students who were able to see their own badges and other people's badges would participate more
actively in the course than those who could only see their own badges. Here, the mean of the active days (weeks 2–6) served as
dependent variable. Results of a one-way analysis of variance (ANOVA) showed no significant differences between the conditions (F
(2,115) = 0.556, p = 0.58, η2 = 0.01). There was also no significant difference between the conditions when only students who had
actually earned a badge were included; F (2,91) = 0.054, p = 0.95, η2 = 0.001. In summary, these results suggest that students
showed the same level of activity regardless of their condition. Therefore, hypothesis 5 is not supported.
Hypothesis 6 predicted that students who could see their own and other people's badges would show a better performance (grades
and quiz results (all weeks)) than those who could only see their own badges. Results of a multivariate analysis of variance
(MANOVA) showed no significant differences between the conditions regarding grades and quiz results (Wilks' λ = 0.96, F
(2,56) = 1.20, p = 0.31). Therefore, hypothesis 6 is not supported.
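The type of test used here can be sketched generically as follows (hypothetical columns `grade`, `quiz`, and `condition`; the numbers are made up and do not reflect the study's data):

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Hypothetical data: course grade and quiz percentage per student and condition.
df = pd.DataFrame({
    "condition": ["own_others"] * 5 + ["own"] * 5,
    "grade": [2.0, 1.7, 2.3, 2.7, 1.3, 2.0, 2.3, 1.7, 3.0, 2.0],
    "quiz": [85, 78, 90, 70, 95, 82, 75, 88, 66, 80],
})

# Wilks' lambda (and related statistics) for the effect of condition on both outcomes.
manova = MANOVA.from_formula("grade + quiz ~ condition", data=df)
print(manova.mv_test())
```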
With regard to research question 2, we analyzed students' perception of the badges descriptively. Only students from the two
treatment conditions were included, as they were the only ones who could see badges and therefore badge-related questions were
only given to these students. As a manipulation check, we first tested whether students had perceived badges. Descriptive results of
the surveys revealed that only two (4.1%) out of 49 students reported that they saw badges of fellow students (47 students (95.9%)
did not see badges of fellow students). As only 26 students were in the condition own-others-badges, only two out of the 26 students (7.7%) for whom this was possible by design actually saw the badges of fellow students. Thirty-five students (71.4%) perceived their own badges on the platform (M = 4.12, SD = 1.29, n = 49). Table 6 shows the descriptive results of students' perception of the badges regardless of condition. The mean values show that even the highest-rated items lie around the midpoint of the scale. Overall, it is notable that students evaluated the badges neither positively nor negatively. Table 6 also shows that students did not rate earning badges as desirable. In general, students’ activity did not seem to be affected by the badges. Furthermore, only two students reported having ever compared their achieved badges with those of fellow students (M = 1.04, SD = 0.20).

Table 6
Descriptive values for badge-related items.

Item                                                                                                                                M      SD
I have managed a task successfully.                                                                                                3.31   1.31
How important is it for you to earn badges?                                                                                        3.27   1.40
I think badges are useless.                                                                                                        3.22   1.37
Do you like getting badges?                                                                                                        3.04   1.40
Badges motivated me.                                                                                                               2.61   1.26
I like that fellow students are able to see my commitment in the course.                                                           2.39   1.19
I find badges interesting and tried to collect them.                                                                               2.31   1.36
When badges are available to earn, do you try to get all possible badges?                                                          2.20   1.17
Do badges make you want to try an activity?                                                                                        1.78   0.99
Have you ever chosen an activity based on the badges or achievements you could receive?                                            1.39   0.86
In general, would you do things differently (or do something you wouldn't normally do) based on whether or not you'd get a badge?  1.29   0.58
Do you compare badges you have earned with your friends?                                                                           1.04   0.20

Note: Items are presented in descending order of mean (n = 49).
Taking the conditions into account revealed significant differences between the conditions own-others-badges and own-badges. Results of an
ANOVA regarding the aspect attitude towards badges showed that students in the condition own-badges (n = 23, M = 3.61, SD = 1.16)
liked the badges significantly more than students in the condition own-others-badges (n = 26, M = 2.54, SD = 1.42) (F (1,47) = 8.22,
p = 0.006). Moreover, students in the condition own-badges (M = 3.17, SD = 1.07) were more motivated by badges than students in
the condition own-others-badges (M = 2.12, SD = 1.21; F (1,47) = 10.37, p = 0.042). A significant difference also emerged between
students in the conditions own-others-badges (M = 3.73, SD = 1.37) and own-badges (M = 2.65, SD = 1.15) regarding the uselessness
of badges (F (1,47) = 8.74, p = 0.005). The results also showed that students in the condition own-others-badges (M = 3.88,
SD = 1.31) were not as interested in the badges as students in the condition own-badges (M = 2.57, SD = 1.12; F (1,47) = 14.20,
p < 0.001). Regarding the aspect of what badges indicate, students in the condition own-badges (M = 3.74, SD = 0.81) liked the fact
that the badges indicated that they had managed a task successfully significantly more than did students in the condition own-others-
badges (M = 2.92, SD = 1.55; F (1,47) = 5.14, p = 0.028). Overall, students in the condition own-badges rated badges more
positively.
The following section describes the aspect of what badges signify. To this end, an evaluation of the badges in the conditions own-others-badges and own-badges was conducted. We asked students in these two conditions what badges signify by means of
items of the Badge Impact Survey (aspect of indication of badges). A subsequent independent samples t-test revealed no significant
differences between the two conditions for factor 1 “among the best” (t (47) = −0.69, p = 0.50) or factor 2 “progress” (t
(47) = −1.20, p = 0.24). Examining the means for factor 1, it can be noted that students in the condition own-others-badges (n = 26,
M = 2.35, SD = 1.11) and students in the condition own-badges (n = 23, M = 2.57, SD = 1.12) rated items of factor 1 as “disagree”,
indicating that they did not think that badges signified being “among the best”. Examining the means for factor 2, it can be noted that
students in the condition own-others-badges (n = 26, M = 3.03, SD = 0.79) and students in the condition own-badges (n = 23,
M = 3.25, SD = 0.46) rated items of factor 2 as “neither agree nor disagree”. Furthermore, we wanted to know how important the
meanings of the badges were to the students (aspect of importance of the meaning of badges). A subsequent independent samples t-test
revealed no significant differences between the two conditions for factor 1 (t (47) = −0.69, p = 0.50) or for factor 2 (t
(47) = −1.64, p = 0.11). Examining the means for factor 1, it can be noted that students in the condition own-others-badges (n = 26,
M = 2.35, SD = 1.11) and students in the condition own-badges (n = 23, M = 2.57, SD = 1.12) rated items of factor 1 as “disagree”,
indicating that it was not important to them that badges signified that they are among the best. Examining the means for factor 2, it
can be noted that students in the condition own-others-badges (n = 26, M = 2.97, SD = 1.25) and students in the condition own-
badges (n = 23, M = 3.46, SD = 0.77) rated items of factor 2 as “neither agree nor disagree”, indicating that it was not important to
them that badges stand for having made progress.
The following section describes the aspect of earning badges. Here, we were interested in which students (condition own-others
badges or condition own-badges) liked earning badges more, and for whom it was more important to earn badges. An independent
samples t-test was conducted to compare these two conditions in terms of liking and importance of earning badges. A significant
difference emerged between the condition own-others-badges (n = 26, M = 2.69, SD = 1.41) and the condition own-badges (n = 23,
M = 3.74, SD = 1.18) in terms of liking to earn badges; t (47) = −2.80, p = 0.007, with students in the condition own-badges liking
to earn badges more. Likewise, a significant difference was found between the condition own-others-badges (n = 26, M = 2.04,
SD = 1.11) and the condition own-badges (n = 23, M = 2.83, SD = 1.15) regarding the importance of earning badges; t
(47) = −2.43, p = 0.019, with students in the condition own-badges attaching a greater amount of importance to earning badges.
Furthermore, students in the condition own-badges (M = 2.78, SD = 1.35) found badges more interesting and tried to collect them more often than students in the condition own-others-badges (M = 1.88, SD = 1.24; t (47) = −2.43, p = 0.019). When only
considering those students who actually earned a badge, the following significant differences prevailed: Students in the condition
own-badges who earned a badge (n = 14, M = 3.93, SD = 1.14) liked to earn badges more than students in the condition own-others-
badges who earned a badge (n = 21, M = 2.62, SD = 1.47); t (33) = −2.82, p = 0.008. Likewise, a significant difference was found
between students in the condition own-others-badges who earned a badge (n = 21, M = 2.10, SD = 1.18) and students in the
condition own-badges who earned a badge (n = 14, M = 3.00, SD = 1.11) regarding the importance of earning badges; t
(33) = −2.28, p = 0.029. The results showed that students in the condition own-badges liked to earn badges more and found it more
important to earn badges compared to students in the condition own-others-badges (with or without having actually earned badges).

5. Discussion

As little is known about the impact of gamification elements in online higher education courses, this study examined whether
badges which are awarded for specific behavior have an influence on students' motivation, activity, and performance. We were
especially interested in uncovering the mechanisms behind the potential effects of earning badges, and therefore compared the effects
of badges which are only seen by the students themselves, and thus only rely on reward mechanisms, with the effects of badges that
can be seen by all course participants, and thus provide both reward and the opportunity for social comparison. Hypothesis 1 stated
that participants' intrinsic motivation would increase when they received badges for their accomplishments. Previous studies found that
intrinsic motivation can be replaced by extrinsic motivation and that extrinsic rewards can negatively affect students’ intrinsic
motivation (Deci et al., 2001; Mekler et al., 2013; Nicholson, 2012). From cognitive evaluation theory (Ryan & Deci, 2000), one can
derive the assumption that badges can enhance intrinsic motivation when they are awarded for the accomplishment of specific
activities. The results of this study revealed that badges do not have an influence on intrinsic motivation, neither increasing nor
decreasing it. Instead, intrinsic motivation declines over time in the course, independent of condition. However, this finding is not
uncommon and is in line with previous research (e.g., Zusho, Pintrich, & Coppola, 2003). In order to test whether the null results were
merely due to the fact that people with different levels of intrinsic motivation at the start of the course show opposing patterns, which
led to a canceling out of the effect, we analyzed hypothesis 2.
Hypothesis 2 predicted that in particular, the intrinsic motivation of those participants who reported low intrinsic motivation
prior to the course would increase in the badge conditions, while the intrinsic motivation of those with initially high intrinsic
motivation would decrease in the badge conditions. The results revealed no differential influence of badges for the different groups,
either increasing or decreasing. Therefore, the lack of effects of the badges cannot be attributed to the fact that some people might
benefit from them with respect to intrinsic motivation while others will not. However, this should not be interpreted as refuting the
established findings from cognitive evaluation theory (Ryan & Deci, 2000). Rather, it might be due to the fact that the badges
themselves were not perceived as a reward. This is also supported by the descriptive results, as the mean values indicate that
participants rather disliked receiving badges and did not feel motivated by them. Moreover, the non-significant results might also
have been caused by the diminishing number of participants (106 at t1 to 27 at t3). This further highlights the decreasing willingness
to participate on a voluntary basis and therefore provides additional information on decreasing intrinsic motivation.
In line with these results, the analysis of hypothesis 3 showed that students in the gamification conditions (own-others-badges and
own-badges) were obviously not motivated by badges to be more active in the course. Furthermore, even though there were no
significant differences between the conditions, the means revealed that students in the no-badges condition, who could not earn badges,
were even more active than students in the gamification conditions. This contradicts previous research findings that badges increase
motivation and have significant effects on engagement and activity (e.g., Anderson et al., 2014) as well as on motivation for learning
(Gibson et al., 2015) and affect learner motivation positively (Santos et al., 2013). Moreover, there was no effect of badges on
performance: In line with previous studies (e.g., Glover, 2013) and based on the assumption that students' motivation (either intrinsic
or extrinsic) will rise due to the reward given, we assumed that students in the badge conditions would show a better performance
(grades and quiz results) than students in the control group (hypothesis 4). The results showed no significant differences between the
badge conditions and the control group. Besides the fact that these results might possibly indicate that under field conditions, badges
are not as effective in terms of increasing activity and performance as has previously been shown and assumed, there are three
possible explanations for this finding. First, as we did not find an effect on motivation, one might argue that the prerequisite for
increasing activity and performance was not met. Second, the null results might have occurred because the quizzes were optional and
did not influence participants’ grades. Third, a further problem specifically regarding the null results on performance might have
been that the time period between awarding the badges and the final exam was too long. As reported by Domínguez et al. (2013), the
use of incentive systems can cause a feeling of manipulation, and the behavior vanishes as soon as the reward is ended – in line with
theoretical assumptions about extrinsic rewards potentially interfering with intrinsic motivation. Thus, especially for performance,
null results might have occurred since the badges were awarded within the first five weeks of the course and more than two months
before the final exam took place. Hence, the time span might have been too long to find effects. Therefore, future studies should
reduce the time interval between awarding badges and the final performance test.
Future research needs to disentangle why the badges we used were less effective than those used in previous studies and what
made them insufficient for influencing behavior. Manipulation checks showed that the badges were noticed and perceived, but they
were not liked, and they did not have an effect on motivation, performance, or activity.
Research question 1 focused on how students perceive the online platform depending on conditions. The results showed a
marginal, non-significant difference between the conditions. It can be summarized that students perceived the online platform
equally and that badges did not have an impact on the students’ perception of the platform. When only considering those students who actually earned a badge (conditions own-others-badges and own-badges), significant differences emerged, insofar as students in the
condition no-badges who were not able to earn badges perceived the online platform as easier to use. This might be due to the fact that
students in the gamification conditions received more guidelines and information about badges and how to earn them. Thus, they had
more to read and more information to digest. Moreover, they received a congratulatory message if they had earned a badge, and their
badges were displayed on the platform, which might have led to a more complex presentation of the platform. This “overload” might
explain the less favorable usability evaluation. Thus, it might be advisable to keep badges as simple and intuitive as possible, so that
there is no need for additional explanations on top of the already necessary announcements within an online course.
Hypothesis 5 stated that students who could see their own and other people's badges would be more active than those who could
only see their own badges. This assumption is based on social comparison. The results revealed no significant differences between the
conditions. This may be due to the fact that most of the students in the own-others-badges condition did not actually see the badges of
their fellow students. Surprisingly, only two out of 26 participants reported that they checked on the badges of fellow students. While
this low number might have been partly attributable to social desirability when answering the question, it is still likely that students
were not interested in looking up their peers' badges. An explanation for this might be that it was not sufficiently relevant for them to
go to a peer's profile in order to check his or her badges. We did ensure in the instructions that all participants in the respective
condition were aware of the possibility to check on others' badges. Nevertheless, as most chose not to do so, we can conclude that the opportunity for social comparison was not sufficiently attractive to them and was not perceived as beneficial.
Despite the potential explanations for why participants did not actually check on others' badges, we need to regard the manipulation for the condition own-others-badges as having failed. The results on this condition cannot be interpreted as providing
information on the question of whether social comparison adds to the potential effects. Future research should ensure that badges are
more immediately visible instead of representing a possibility for social comparison that has to be actively sought out. The results
might have been different if participants had been unable to avoid seeing others' badges. A permanently visible leaderboard that automatically triggers social comparison might be more successful in this respect.
Furthermore, it is helpful to take a closer look at the results of research question 2, which revealed that students in the condition
own-badges liked the badges more and were more motivated by them. However, due to the low means, these findings should be
interpreted with great caution, as participants in both conditions (own-others-badges and own-badges) on average neither agreed nor disagreed that badges motivated them or that they liked them. Particularly interesting are the results which refer to comparison:
Students in both conditions (own-others-badges and own-badges) strongly disagreed that they compared the badges they had earned
with those of their fellow students.
Hypothesis 6, which predicted that students who see their own badges and other people's badges will show a better performance
(grades and quiz results) than those who can only see their own badges, was consequently not supported. No significant differences
emerged between students who could see their own badges as well as those of others and students who could only see their own
badges. This is consistent with the reasoning outlined for hypothesis 5. Again, the limitations of the study design discussed above apply here. Nevertheless,
as the instructions made the students aware that their and others' badges would be visible to all, it can at least be concluded that the
knowledge of being in a competitive setting did not motivate the participants.
Research question 2 addressed how students in the conditions own-others-badges and own-badges evaluated badges. On the whole,
the results showed low mean values. Students who were only able to see their own badges were more motivated by badges. In
general, students in the condition own-badges evaluated badges more positively than students in the condition own-others-badges. This
result is in line with previous studies (e.g., Codish & Ravid, 2014), which demonstrated that students were not interested in competing publicly and perceived public competition as demotivating. Students in the condition own-badges were not able to compare their badges, and thus their achievements, with those of fellow students. For them, earning badges served their own purposes rather than demonstrating engagement in the course or enabling social comparison; instead, badges could be used to monitor their own progress.
In terms of the mean values, it can be seen that earning badges was not important to students. This is in line with findings by
Ruipérez-Valiente, Muñoz-Merino, and Kloos (2016a) as well as Pirker, Riffnaller-Schiefer, and Gütl (2014), who showed that students were not very interested in earning badges. For students in the condition own-others-badges, the fact that other students would
be able to see their badges as well may have created a feeling of external control and even social pressure. Therefore, the results of
research question 2 might be explained by the fact that public competition was undesired and even demotivating.

6. Limitations

This study has several limitations that should be acknowledged. First, it has to be noted that not all students who participated in the course took part in the survey, and those who did take part did not all fill in every questionnaire, resulting in different samples at the different measurement points. However, given that the research took place within an actual course that was relevant for students' final degree, the possibility to force or incentivize students to fill in the accompanying questionnaires was limited. Indeed, a similar study conducted in the following semester resulted in even lower participation. In line with this, and as in other e-learning courses, there was a certain dropout rate; dropout rates of up to 70% have been reported (Park & Choi, 2009). In this course, the dropout rate was 53%, which affected the number of students who agreed to data collection and the number who participated in the surveys. Therefore, especially the survey towards the
end of the course was only filled in by a small number of participants. A further limitation is that the period in which badges were awarded was relatively short, comprising only the first five weeks of the course. It can therefore be assumed that the badges were not very salient. However, answers from survey respondents suggest that they did perceive their own badges (conditions own-others-badges and own-badges) but not those of other students (condition own-others-badges). Further studies should extend the period
in which badges can be earned and should make badges more visible. Additionally, badges might be tied to actual rewards. For instance, instead of providing merely symbolic value, badges could be tied to bonus points that influence students' grades in the final exam of the course. This could make badges more valuable and attractive for students and may have an impact on their motivation, activity, and performance. In addition, only two out of 49 students reported that they saw the badges of fellow students, demonstrating that badges should be presented more saliently in the condition own-others-badges. This small number of participants who actually saw the badges of fellow students might also have influenced the results.
A further limitation of this study is the inclusion of self-developed items to collect feedback on specific badges. As these items have
not been validated previously, further research is needed to render the results of this study more amenable to comparison.
As students in this course were enrolled in different study programs, it can be assumed that they did not know each other, although this cannot be completely ruled out. It is therefore likely that the different conditions went undetected until debriefing. Moreover, most students only had to pass the e-learning course and did not receive a grade for it, which might have influenced their overall motivation to participate.
With regard to motivation, this study examined intrinsic and extrinsic motivation; however, subtypes such as external regulation, introjected regulation, identified regulation, integrated regulation, and amotivation were not analyzed in greater depth. Future studies should include these subtypes in order to achieve more fine-grained insights.
Finally, it should be kept in mind that our results are only valid for this particular course. Nevertheless, since gamification in non-game contexts is becoming increasingly prevalent, the present study provides some initial insight into the link between gamification and the motivation to learn and to actively participate in online courses by investigating one of the most commonly used gamification elements: badges.

7. Conclusion

It has been argued that badges can increase motivation to participate in courses but can also decrease intrinsic motivation. We
collected and analyzed data from surveys and from logs to determine whether or not badges influence students' motivation, activity, and performance. Based on the observations and results of our study, the general conclusion is that the way in which we awarded badges did not seem to influence students' motivation, activity, or performance. Our results revealed that badges neither increased nor decreased students' motivation and activity during the course. Furthermore, the results showed that badges did not influence grades or quiz results. Instead, we found a general – and well-known – trend that students became less
intrinsically motivated over time.
Despite the rather large number of null results, we believe that this study points out important aspects regarding gamification, and we strongly believe that more evaluations of badges are required. Studies on cooperative, individualistic, and randomly awarded badges should be conducted in order to reveal the influence of badges on students' motivation, activity, and performance in a more comprehensive way. With regard to the question of whether to gamify or not to gamify, we can draw the preliminary conclusion that although we were unable to show that badges help to motivate, foster activity, and increase learning results, they also do not appear to hinder these processes – especially when participants only see their own badges.

Funding

This work was supported by the Mercator Research Center Ruhr (MERCUR) under grant Pr-2014-0023, "Pedagogical and Technological Concepts for Collaborative Learning in MOOCs".

References

Anderson, A., Huttenlocher, D., Kleinberg, J., & Leskovec, J. (2014, April). Engaging with massive online courses. Proceedings of the 23rd international conference on
World wide web (pp. 687–698). ACM.
Barata, G., Gama, S., Jorge, J., & Gonçalves, D. (2013). Engaging engineering students with gamification. Games and virtual worlds for serious applications (VS-GAMES),
Proceedings of the 5th international conference on games and virtual worlds for serious applications (pp. 1–8). IEEE.
Biles, M. L., Plass, J., & Homer, B. D. (2014). Good badges, evil badges? An empirical inquiry into the impact of digital badge design on goal orientation and learning.
Retrieved from http://create.nyu.edu/wordpress/wp-content/uploads/2015/02/HASTAC-Report-Badges-and-Learning-CREATE.pdf.
Boticki, I., Baksa, J., Seow, P., & Looi, C. K. (2015). Usage of a mobile social learning platform with virtual badges in a primary school. Computers & Education, 86,
120–136. http://dx.doi.org/10.1016/j.compedu.2015.02.015.
Brooke, J. (1986). System usability scale (SUS): A quick-and-dirty method of system evaluation user information. Reading, UK: Digital Equipment Co Ltd.
Butler, C. (2014). A framework for evaluating the effectiveness of gamification techniques by personality type. In F. F. Nah (Ed.). HCI in business (pp. 381–389).
Springer International Publishing.
Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis. Review of Educational Research, 64, 363–423. http://dx.doi.org/
10.3102/00346543064003363.
Codish, D., & Ravid, G. (2014). Personality based gamification: How different personalities perceive gamification.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical
Assessment, Research & Evaluation, 10(7), 1–9.
Deci, E. L., Koestner, R., & Ryan, R. M. (1999a). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological
Bulletin, 125, 627–668. https://doi.org/10.1037/0033-2909.125.6.627.
Deci, E. L., Koestner, R., & Ryan, R. M. (1999b). The undermining effect is a reality after all—extrinsic rewards, task interest, and self-determination: Reply to Eisenberger, Pierce, and Cameron (1999) and Lepper, Henderlong, and Gingras (1999). Psychological Bulletin, 125, 692–700. https://doi.org/10.1037/0033-2909.125.6.692.
Deci, E. L., Koestner, R., & Ryan, R. M. (2001). Extrinsic rewards and intrinsic motivation in education: Reconsidered once again. Review of Educational Research, 71,
1–27. http://dx.doi.org/10.3102/00346543071001001.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Springer Science & Business Media.
Deterding, S., Dixon, D., Khaled, R., & Nacke, L. (2011, September). From game design elements to gamefulness: Defining gamification. Proceedings of the 15th
international academic MindTrek conference: Envisioning future media environments (pp. 9–15). ACM.


Deutsch, M., & Gerard, H. B. (1955). A study of normative and informational social influences upon individual judgment. The Journal of Abnormal and Social Psychology,
51, 629–636. http://dx.doi.org/10.1037/h0046408.
Dijkstra, P., Kuyper, H., van der Werf, G., Buunk, A. P., & van der Zee, Y. G. (2008). Social comparison in the classroom: A review. Review of Educational Research, 78,
828–879. http://dx.doi.org/10.3102/0034654308321210.
Domínguez, A., Saenz-de-Navarrete, J., De-Marcos, L., Fernández-Sanz, L., Pagés, C., & Martínez-Herráiz, J. J. (2013). Gamifying learning experiences: Practical
implications and outcomes. Computers & Education, 63, 380–392. http://dx.doi.org/10.1016/j.compedu.2012.12.020.
Eales, R. T., Hall, T., & Bannon, L. J. (2002, January). The motivation is the message: Comparing CSCL in different settings. Proceedings of the conference on computer
support for collaborative Learning: Foundations for a CSCL community (pp. 310–317). International Society of the Learning Sciences.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological
Methods, 4, 272–299. https://doi.org/10.1037/1082-989X.4.3.272.
Festinger, L. (1954). A theory of social comparison processes. Human Relations, 7, 117–140. http://dx.doi.org/10.1177/001872675400700202.
Filsecker, M., & Hickey, D. T. (2014). A multilevel analysis of the effects of external rewards on elementary students' motivation, engagement and learning in an
educational game. Computers & Education, 75, 136–148. https://doi.org/10.1016/j.compedu.2014.02.008.
Fischer, H., Heinz, M., Schlenker, L., & Follert, F. (2016). Gamifying higher education. Beyond badges, points and leaderboards. Proceedings of the 11th international
forum on knowledge asset dynamics (pp. 15–17). IFKAD.
Gibson, D., Ostashewski, N., Flintoff, K., Grant, S., & Knight, E. (2015). Digital badges in education. Education and Information Technologies, 20(2), 403–410. https://
doi.org/10.1007/s10639-013-9291-7.
Glover, I. (2013). Play as you learn: Gamification as a technique for motivating learners. In J. Herrington, A. Couros, & V. Irvine (Eds.). Proceedings of world conference
on educational multimedia, hypermedia and telecommunications 2013. Chesapeake, VA: AACE.
Haaranen, L., Ihantola, P., Hakulinen, L., & Korhonen, A. (2014, March). How (not) to introduce badges to online exercises. Proceedings of the 45th ACM technical
symposium on Computer science education (pp. 33–38). ACM.
Hakulinen, L., Auvinen, T., & Korhonen, A. (2013, March). Empirical study on the effect of achievement badges in TRAKLA2 online learning environment. Proceedings
of learning and teaching in computing and engineering (LaTiCE), 2013 (pp. 47–54). IEEE.
Hamari, J. (2017). Do badges increase user activity? A field experiment on the effects of gamification. Computers in Human Behavior, 71, 469–478. http://dx.doi.org/
10.1016/j.chb.2015.03.036.
Hanus, M. D., & Fox, J. (2015). Assessing the effects of gamification in the classroom: A longitudinal study on intrinsic motivation, social comparison, satisfaction,
effort, and academic performance. Computers & Education, 80, 152–161. http://dx.doi.org/10.1016/j.compedu.2014.08.019.
Hayes, A. F. (2012). PROCESS: A Versatile Computational Tool for Observed Variable Mediation, Moderation, and Conditional Process Modeling [White paper]. Retrieved
from http://www.afhayes.com/public/process2012.pdf.
Huguet, P., Dumas, F., Monteil, J. M., & Genestoux, N. (2001). Social comparison choices in the classroom: Further evidence for students' upward comparison tendency
and its beneficial impact on performance. European Journal of Social Psychology, 31, 557–578. http://dx.doi.org/10.1002/ejsp.81.
Khalil, H., & Ebner, M. (2014, June). MOOCs completion rates and possible methods to improve retention – a literature review. Proceedings of world conference on educational multimedia, hypermedia and telecommunications (Vol. 2014, No. 1, pp. 1305–1313).
Lohmann, K., & Schäffer, J. (2013). System usability scale (SUS)–An improved German translation of the questionnaire. Retrieved from http://minds.coremedia.com/
2013/09/18/sus-scale-an-improved-german-translation-questionnaire/.
Malzahn, N., Ganster, T., Sträfling, N., Krämer, N., & Hoppe, H. U. (2013). Motivating students or teachers? In D. Hernández-Leo, T. Ley, R. Klamma, & A. Harrer
(Eds.). Scaling up learning for sustained impact (pp. 191–204). Springer Berlin Heidelberg.
Mekler, E. D., Brühlmann, F., Opwis, K., & Tuch, A. N. (2013). Do points, levels and leaderboards harm intrinsic motivation? An empirical analysis of common gamification elements. Proceedings of the first international conference on gameful design, research and applications (pp. 66–73). ACM.
Mishra, R., & Kotecha, K. (2017). Students engagement through gamification in education gamifying formative assessment. Journal of Engineering Education Transformations. http://dx.doi.org/10.16920/jeet/2017/v0i0/111751. Available at: http://www.journaleet.org/index.php/jeet/article/view/111751 (Accessed 20 November 2017).
Müller, F. H., Hanfstingl, B., & Andreitz, I. (2007). Skalen zur motivationalen Regulation beim Lernen von Schülerinnen und Schülern. Adaptierte und ergänzte Version des Academic Self-Regulation Questionnaire (SRQ-A) nach Ryan & Connell [Scales for the motivational regulation of students' learning. Adapted and extended version of the Academic Self-Regulation Questionnaire (SRQ-A) by Ryan & Connell]. Wissenschaftliche Beiträge aus dem Institut für Unterrichts- und Schulentwicklung (IUS), Vol. 1.
Muntean, C. I. (2011, October). Raising engagement in e-learning through gamification. Proceedings of the 6th international conference on virtual learning ICVL (pp. 323–
329). .
Nicholson, S. (2012). A user-centered theoretical framework for meaningful gamification. Games+Learning+Society, 8(1). Retrieved from http://scottnicholson.com/
pubs/meaningfulframework.pdf.
O'Donovan, S., Gain, J., & Marais, P. (2013). A case study in the gamification of a university-level games development course. Proceedings of the South African institute
for computer scientists and information technologists conference (pp. 242–251). ACM.
Park, J. H., & Choi, H. J. (2009). Factors influencing adult learners' decision to drop out or persist in online learning. Journal of Educational Technology & Society, 12(4),
207–217.
Pirker, J., Riffnaller-Schiefer, M., & Gütl, C. (2014, June). Motivational active learning: Engaging university students in computer science education. Proceedings of the
2014 conference on innovation & technology in computer science education (pp. 297–302). ACM.
van Roy, R., & Zaman, B. (2017). Why gamification fails in education and how to make it successful: Introducing nine gamification heuristics based on Self-
Determination Theory. In M. Ma, & A. Oikonomou (Eds.). Serious games and edutainment applications (pp. 485–509). Springer International Publishing.
Ruipérez-Valiente, J. A., Muñoz-Merino, P. J., & Kloos, C. D. (2016a). Analyzing students' intentionality towards badges within a case study using Khan academy.
Proceedings of the sixth international conference on learning analytics & knowledge (pp. 536–537). ACM.
Ruipérez-Valiente, J. A., Muñoz-Merino, P. J., & Kloos, C. D. (2016b). An analysis of the use of badges in an educational experiment. Frontiers in education conference
(FIE), 2016 IEEE (pp. 1–8). IEEE.
Ryan, R. M., & Deci, E. L. (2000). Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 25, 54–67. http://dx.
doi.org/10.1006/ceps.1999.1020.
Santos, C., Almeida, S., Pedro, L., Aresta, M., & Koch-Grunberg, T. (2013a). Students' perspectives on badges in educational social media platforms: The case of SAPO
campus tutorial badges. Advanced learning technologies (ICALT), Proceedings of the 13th international conference on advanced learning technologies (pp. 351–353).
IEEE.
Santos, J. L., Charleer, S., Parra, G., Klerkx, J., Duval, E., & Verbert, K. (2013b). Evaluating the use of open badges in an open learning environment. In D. Hernández-
Leo, T. Ley, R. Klamma, & A. Harrer (Eds.). Scaling up learning for sustained impact (pp. 314–327). Springer Berlin Heidelberg.
Seaborn, K., & Fels, D. I. (2015). Gamification in theory and action: A survey. International Journal of Human-computer Studies, 74, 14–31. https://doi.org/10.1016/j.
ijhcs.2014.09.006.
Todor, V., & Pitică, D. (2013). The gamification of the study of electronics in dedicated e-learning platforms. Electronics technology (ISSE), Proceedings of the 36th
international spring seminar on (pp. 428–431). IEEE.
Turan, Z., Avinc, Z., Kara, K., & Goktas, Y. (2016). Gamification and education: Achievements, cognitive loads, and views of students. International Journal of Emerging
Technologies in Learning, 11, 64–69. https://doi.org/10.3991/ijet.v11i07.5455.
Villagrasa, S., & Duran, J. (2013). Gamification for learning 3D computer graphics arts. Proceedings of the first international conference on technological ecosystem for
enhancing multiculturality (pp. 429–433). ACM.
Zuckerman, O., & Gal-Oz, A. (2014). Deconstructing gamification: Evaluating the effectiveness of continuous measurement, virtual rewards, and social comparison for
promoting physical activity. Personal and Ubiquitous Computing, 18, 1705–1719. https://doi.org/10.1007/s00779-014-0783-2.
Zusho, A., Pintrich, P. R., & Coppola, B. (2003). Skill and will: The role of motivation and cognition in the learning of college chemistry. International Journal of Science
Education, 25, 1081–1094. http://dx.doi.org/10.1080/0950069032000052207.
