A Report Submitted
In Partial Fulfillment of the Requirements for the Degree of
Bachelor of Technology in Computer Science & Engineering

Rishabh Tyagi (11-1-5-098)
Nikhil Kharode (11-1-5-001)
Nabajyoti Hazarika (11-1-5-088)
Rajiv Mandal (10-1-5-038)
DECLARATION
...........................................
Date:
CERTIFICATE
This is to certify that the project work entitled "A Recommender Evaluation: Towards a Production Recommender System", submitted by Rishabh Tyagi (11-1-5-098), Nikhil Kharode (11-1-5-001), Nabajyoti Hazarika (11-1-5-088) and Rajiv Mandal (10-1-5-038) in partial fulfillment of the award of the degree of Bachelor of Technology in Computer Science & Engineering at National Institute of Technology, Silchar, was done under the guidance and supervision of Prabhakar Sarma Neog. The matter presented in the report has not been submitted for the award of any other degree of this or any other institute/university. I wish them all success in life.
.........................................
Date
ABSTRACT
ACKNOWLEDGMENTS
We take this occasion to render our deep sense of gratitude and tribute to our supervisor, Prabhakar Sarma Neog, Assistant Professor, for his constant and valuable guidance in the truest sense throughout the course of the work. It was his encouragement and support, from the initial to the final stage, that enabled us to develop an understanding of the subject. Every time we had a problem, we rushed to him for advice, and he never let us down. His timely suggestions helped us to circumvent all sorts of hurdles that we faced throughout our work. We are deeply indebted to him for his inspiration, motivation and guidance.
Rishabh Tyagi
Nikhil Kharode
Nabajyoti Hazarika
Rajiv Mandal
Contents

ABSTRACT
ACKNOWLEDGMENTS
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Objective
2 Literature Review
3 Background
  3.1 Content-based Recommendation
4 Issues in Recommender System
5 SVD++ (Singular Value Decomposition and Latent Features)
  5.1 Baseline Estimates
6 Data sets
  6.1 MovieLense Dataset (100k)
7 Implementation
  7.1 About Apache Mahout
  7.2 Introduction to recommendation in Apache Mahout
  7.3 Installation of Apache Mahout
8 Evaluation Metrics
  8.1 Predictive Accuracy Metrics
9 Results and Explanation
10 Conclusion
Appendix A Netflix Prize Competition
Bibliography

List of Figures

3.1 Ted.com uses a top-k item recommendation approach to rank items
8.1 User based vs Item based, varying training data set size

List of Tables

3.1 Movie rating scenario, user ratings on a 1-5 scale
3.2 Example for n = 8 items with k = 3
3.3 Five users have given ratings (1 to 5) on three items
3.4 Rearranging the rating table with lower-to-higher ranking
Chapter 1
Introduction
1.1 Motivation
Today we are living in the era of the Internet and digitization. Together, these are bringing about a unification of the world and its boundaries. Digital media and technology form one of the fastest-growing fields in the world and have changed the way we do just about everything. Due to this digital revolution and web technologies, the amount of information in the world is increasing with high volume, velocity and variety. Processing this overloaded data is very necessary, given the limited intake capacity of humans and the limited time available to make decisions.
Recommender systems try to process this overloaded data in a way personalized to the particular interests of a user. Put another way, recommender systems are intelligent systems which can capture the effort behind the decisions of some available users in order to help a large community of other users make their own decisions quickly. In this way they also help users locate possible items of interest more quickly by filtering and ranking them in a personalized way. Some of these systems provide the end user not only with such a personalized item list but also with an explanation which describes why a specific item is recommended and why the system supposes that the user will like it [1]. We feel this field has both the power of research and a means to support people.
In recent years the popularity of recommender systems has been increasing day by day, as they are incorporated in a variety of applications such as movies, music [2], news, books, research articles [3], search queries, social tags [4], and products in general. Nowadays recommender systems are also applied to experts [5], jokes, restaurants [6], financial services, life insurance, persons (online dating), and Twitter [7].
Another accelerator for research in the field of recommender systems was the Netflix Prize competition, started in October 2006. The competition was held by Netflix, an online DVD rental service, and sought to improve the accuracy of predictions about how much someone is going to like a movie based on their preferences. Many scientists, students, engineers and enthusiasts were attracted by the freely accessible large-scale data set and the publicly announced $1,000,000 prize for the winner of the competition [8].
1.2 Objective
Our first objective is to study the current research trends in recommender systems. We investigate the existing recommendation algorithms, which are mainly collaborative filtering methods, along with their mechanisms. For experimentation we produce some tools and techniques. We analyze the similarity methods mathematically and implement them in Apache Mahout.
Chapter 2
Literature Review
Herbert Simon, in his book "The Sciences of the Artificial", mentioned the necessity of recommender systems as follows: "As of the mid-1990s the lesson has still not been learned. An information superhighway is proclaimed without any concern about the traffic jams it can produce or the parking spaces it will require. Nothing in the new technology increases the number of hours in the day or the capacities of human beings to absorb information. The real design problem is not to provide more information to people but to allocate the time they have available for receiving information so that they will get only the information that is most important and relevant to the decisions they will make. The task is not to design information-distributing systems but intelligent information-filtering systems." [9; 10]
In the year 1992, Goldberg et al. [11] built an experimental mail system named Tapestry. Their main aim was to tackle the overloaded amount of documents in electronic mail systems, which were hugely popular at that time. Other mail systems of that period used the contents of a document as the basis for filtering, but Goldberg and his colleagues used a new method, alongside content-based filtering, in which human judgment is involved. They named that method collaborative filtering, and defined it as follows: "Collaborative filtering simply means that people collaborate to help one another perform filtering by recording their reactions to documents they read." However, other names were also suggested in the beginning of recommender system research, such as social filtering [12] and social information filtering [13]. As mentioned, the more general term recommender system was coined by Resnick and Varian in 1997 and subsequently became the most popular term for these systems [10]. The Tapestry system has a limitation in that it was designed for small workgroups whose members more or less know each other.
In the survey of [19] it is found that the authors of [20] first proposed predicting the missing values in the U-I matrix by using the side information of items (e.g., the title and genre of a movie) and then deploying user-based CF to generate recommendations.
Chapter 3
Background
The generation of information is boundless these days. Though it is helpful, in some cases people feel more pain than gain. To overcome the pain caused by the countless information around us, people take the help of machines and have created machine-learning tools. Content-based techniques deal with the contents of documents and produce a categorized list out of them, which makes it somewhat easier to deal with large amounts of data. Search engines, where a query can be issued to find some information, are examples of content-based filtering. Another available solution related to content-based filtering is top-k recommendation. In this approach a list of the most popular items, common to all users, is maintained. For example, www.ted.com is a website where the most popular talks can be found in a top-k list. Users can sort items based on different approaches, such as overall popularity (most viewed), popularity in the past week (most emailed this week), or popularity in the past month (most popular this month), among others [28]. A figure is given for the system. The main problem is that in both cases these recommendations are not customized to a user's interests.
In the mid-1990s the first papers on collaborative filtering appeared; later the field became better known as recommender systems. Traditional collaborative filtering techniques take earlier user-to-item relations, in the form of ratings, to predict the future interests of that user or of others. In content-based filtering, the methodologies use some type of similarity score to match a query describing the content with the individual titles or items, and then present the user with a ranked list of suggestions [29]. Collaborative filtering methodologies, on the other hand, do not use any information regarding the actual content (e.g., words, authors, description) of the items, but are rather based on the usage or preference patterns of other users [30; 29]. The collaborative filtering method is based on a data structure known as the user-item matrix, holding the users' rating scores for the items.
Figure 3.1: Ted.com uses a top-k item recommendation approach to rank items.
where the maximum is computed over the frequencies f_{z,j} of all terms t_z that occur in document d_j. In order for the weights to fall in the [0,1] interval and for the documents to be represented by vectors of equal length, the weights obtained by Equation (3.1) are usually normalized by cosine normalization [28].
         Titanic   Inception   Toystory   Taken   Skyfall   Matrix
Alice       5          ?           3        ?        ?         1
Bob         ?          1           ?        4        ?         ?
Jim         2          4           ?        ?        ?         5
Kate        ?          2           ?        ?        3         ?

Table 3.1: Movie rating scenario, user ratings on a 1-5 scale ("?" = unknown)
In a user-based approach, the recommender ranks users based on the similarity among them and uses suggestions provided by the most similar users to recommend new items [75]. The user-based approach of collaborative filtering is not as preferred as an item-based approach, due to the instability of the relationships between users. For a system which handles a large user base, even the smallest change in the user data is likely to reset the entire group of similar users. As an example of user-based collaborative filtering, suppose Alice, Bob and Joe are similar to Jack based on their history (the movies on the left-hand side of the figure). Now we want to recommend to Jack a new movie from the list on the right-hand side. Based on the past preferences of Alice, Bob and Joe, "The Da Vinci Code" is liked by all three, while "Black Swan" is liked by only two of them. Therefore "The Da Vinci Code" is likely to be recommended to Jack.
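As an illustrative sketch (not the exact method of [75]): using the movie rating scenario above, with cosine similarity over co-rated items and a similarity-weighted average as our illustrative choices, a user-based prediction can be written as:

```python
from math import sqrt

# Ratings from the movie rating scenario (1-5 scale); "?" entries are absent.
ratings = {
    "Alice": {"Titanic": 5, "Toystory": 3, "Matrix": 1},
    "Bob":   {"Inception": 1, "Taken": 4},
    "Jim":   {"Titanic": 2, "Inception": 4, "Matrix": 5},
    "Kate":  {"Inception": 2, "Skyfall": 3},
}

def cosine_sim(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    den = (sqrt(sum(ratings[u][i] ** 2 for i in common))
           * sqrt(sum(ratings[v][i] ** 2 for i in common)))
    return num / den if den else 0.0

def predict(user, item):
    """Similarity-weighted average of the ratings other users gave `item`."""
    pairs = [(cosine_sim(user, v), ratings[v][item])
             for v in ratings if v != user and item in ratings[v]]
    total = sum(abs(s) for s, _ in pairs)
    return sum(s * r for s, r in pairs) / total if total else None

print(round(predict("Alice", "Inception"), 2))  # 4.0 (only Jim overlaps with Alice)
```

Here only Jim has items in common with Alice, so his rating of 4 dominates the weighted average.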
Item Based CF
Item-based CF [32; 36] is a model-based approach which produces recommendations based on the relationships between items inferred from the rating matrix. The assumption behind this approach is that users will prefer items that are similar to other items they like. The model-building step consists of calculating a similarity matrix containing all item-to-item similarities using a given similarity measure; popular choices are again Pearson correlation and cosine similarity. All pairwise similarities are stored in an n x n similarity matrix S. To reduce the model size to n x k with k << n, for each item only a list of the k most similar items and their similarity values is stored. The k items most similar to item i are denoted by the set S(i), which can be seen as the neighborhood of size k of the item. Retaining only k similarities per item improves the space and time complexity significantly but potentially sacrifices some recommendation quality [32].
      i1    i2    i3    i4    i5    i6    i7    i8    estimate
i1     -   0.1   0     0.3   0.2   0.4   0     0.1      -
i2   0.1    -    0.8   0.9   0     0.2   0.1   0       0.0
i3   0     0.8    -    0     0.7   0.1   0.3   0.9     4.6
i4   0.3   0.9   0      -    0     0.3   0     0.1     2.8
i5   0.2   0     0.4   0      -    0.1   0     0        -
i6   0.4   0.2   0.1   0.3   0.2    -    0     0.1     2.7
i7   0     0.1   0.3   0     0.1   0      -    0       0.0
i8   0.1   0     0.5   0.1   0     0.1   0      -       -
ua    2     ?     ?     ?     4     ?     ?     5
Table 3.2: An example for n = 8 items with k = 3. For the similarity matrix S only the k = 3 largest entries are stored per row. For the example we assume that we have ratings by the active user for items i1, i5 and i8 (bottom row). We can now compute the weighted sum using the similarities (only the reduced matrix with the k = 3 highest similarities per row is used) and the user's ratings. The result (the last column of the table) shows that i3 has the highest estimated rating for the active user.
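Under the assumption that the weighted sum is normalized by the sum of the similarities actually used, the 4.6 estimate for i3 can be reproduced in a few lines:

```python
# Row i3 of the similarity matrix S and the active user's known ratings.
sim_i3 = {"i1": 0.0, "i2": 0.8, "i4": 0.0, "i5": 0.7,
          "i6": 0.1, "i7": 0.3, "i8": 0.9}
user_ratings = {"i1": 2, "i5": 4, "i8": 5}

# Keep only the k = 3 most similar items to i3.
k = 3
neighbors = dict(sorted(sim_i3.items(), key=lambda kv: kv[1], reverse=True)[:k])

# Normalized weighted sum over the neighbors the user has rated.
num = sum(s * user_ratings[i] for i, s in neighbors.items() if i in user_ratings)
den = sum(s for i, s in neighbors.items() if i in user_ratings)
print(round(num / den, 1))  # 4.6
```

The top-3 neighbors of i3 are i8 (0.9), i2 (0.8) and i5 (0.7); of these, the user has rated i5 and i8, giving (0.9*5 + 0.7*4) / 1.6 = 4.56.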
Euclidean distance similarity is computed between users. In this case users are considered as points in a space of many dimensions (the dimensions correspond to the items). This method computes the Euclidean distance d between two such user points. This value alone doesn't constitute a valid similarity metric, because larger values mean more-distant, and therefore less similar, users: the distance is smaller when users are more similar. Therefore 1 / (1 + d) is used as the similarity. When the distance is 0, users have identical preferences and the result is 1, decreasing towards 0 as d increases. This similarity method never returns a negative value, and larger values mean more similarity. [37]
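The 1 / (1 + d) transformation can be sketched as follows (the rating vectors are hypothetical):

```python
from math import sqrt

def euclidean_similarity(u, v):
    """Users as points in item space; similarity = 1 / (1 + d)."""
    d = sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + d)

# Identical preferences -> distance 0 -> similarity 1; the value decays
# toward 0 as the distance grows and is never negative.
print(euclidean_similarity([5, 3, 1], [5, 3, 1]))  # 1.0
print(round(euclidean_similarity([5, 3, 1], [1, 3, 5]), 3))
```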
Pearson correlation can also be used to measure similarity between user profiles; it is based on the standard Pearson r correlation coefficient. The possible values of the Pearson coefficient range from -1 to +1, including 0. Values near -1 indicate a negative correlation, while values close to +1 indicate a positive correlation; a value of 0 shows no correlation at all. Once the Pearson coefficient has been calculated, the recommendation can be done as before by averaging the values of the most similar profiles. One important characteristic of this algorithm is that it takes into account not only positive but also negative correlation to make predictions.
There are two issues with computing similarity between users using the Pearson correlation. One is the question of what to do with items that one user has rated but the other has not. The straightforward, statistically correct way to handle this is to only consider items that both users have rated, and to do this consistently. The other issue is that a correlation computed over only a few co-rated items is unreliable; a common remedy is significance weighting, which damps the correlation by the factor

min(|Iu ∩ Iv|, 50) / 50

where Iu is the set of items rated by u.
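A sketch of the Pearson correlation restricted to co-rated items, together with the min(|Iu ∩ Iv|, 50)/50 damping factor (function names are illustrative, and the two example users have exactly opposite preferences):

```python
from math import sqrt

def pearson(u, v):
    """Pearson r over the items both users rated (dicts item -> rating)."""
    common = sorted(set(u) & set(v))
    n = len(common)
    if n < 2:
        return None  # undefined with fewer than two co-rated items
    xs = [u[i] for i in common]
    ys = [v[i] for i in common]
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sqrt(sum((x - mx) ** 2 for x in xs))
           * sqrt(sum((y - my) ** 2 for y in ys)))
    return num / den if den else None

def weighted_pearson(u, v, cutoff=50):
    """Damp correlations built on few co-rated items: min(|Iu ∩ Iv|, 50)/50."""
    r = pearson(u, v)
    if r is None:
        return None
    return r * min(len(set(u) & set(v)), cutoff) / cutoff

u = {101: 3.0, 102: 2.0, 103: 1.0}
v = {101: 1.0, 102: 2.0, 103: 3.0}
print(round(pearson(u, v), 6))           # -1.0
print(round(weighted_pearson(u, v), 6))  # -1.0 * 3/50 = -0.06
```

With only three co-rated items, the damping factor shrinks the (perfectly negative) correlation heavily, reflecting how little evidence it rests on.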
The Spearman rank correlation is an interesting variant of the Pearson correlation for our purposes. Rather than computing a correlation based on the original preference values, it computes a correlation based on the relative ranks of the preference values. Imagine that, for each user, their least-preferred item's preference value is overwritten with a 1, their next-least-preferred with a 2, and so on.
         Item 101   Item 102   Item 103
User 1     3.0        2.0        1.0
User 2     1.0        2.0        3.0
User 3     1.0        2.0        1.0
User 4     2.0         -         1.0
User 5     3.0         -          -

Table 3.3: Five users have given ratings (1 to 5) on three items
User 2 (y_i)     Rank (x_i)   Rank (y_i)   d_i = x_i - y_i   d_i^2
Item 103 (3.0)       1            3              -2             4
Item 102 (2.0)       2            2               0             0
Item 101 (1.0)       3            1               2             4

Table 3.4: Rearranging the rating table with lower-to-higher ranking
In this example the number of items n is 3. These values are substituted into the Spearman equation:

rho = 1 - (6 * sum(d_i^2)) / (n * (n^2 - 1)) = 1 - 48/24 = -1
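The calculation of Table 3.4 can be reproduced with a short sketch (user 1's ratings are assumed to be 3.0, 2.0, 1.0, consistent with the ranks x_i):

```python
def spearman(u, v):
    """Spearman rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)) over co-rated items."""
    items = sorted(set(u) & set(v))
    n = len(items)
    # Rank each user's co-rated items from lowest (1) to highest rating.
    rank_u = {i: r for r, i in enumerate(sorted(items, key=lambda i: u[i]), 1)}
    rank_v = {i: r for r, i in enumerate(sorted(items, key=lambda i: v[i]), 1)}
    d2 = sum((rank_u[i] - rank_v[i]) ** 2 for i in items)
    return 1 - 6 * d2 / (n * (n * n - 1))

user1 = {101: 3.0, 102: 2.0, 103: 1.0}
user2 = {101: 1.0, 102: 2.0, 103: 3.0}
print(spearman(user1, user2))  # 1 - 6*8 / (3*8) = -1.0
```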
         Item 101   Item 102   Item 103   Correlation to user 1
User 1     3.0        2.0        1.0              1.0
User 2     1.0        2.0        3.0             -1.0
User 3     1.0        2.0        1.0              1.0
User 4     2.0         -         1.0              1.0
User 5     3.0         -          -                -
Tanimoto coefficient similarity: there are also user similarity methods that ignore the preference values entirely. The Tanimoto coefficient considers only the sets of items two users have expressed a preference for, and is the ratio of the size of the intersection to the size of the union of those sets. The similarity of each user to user 1:

         similarity to user 1
User 1          1.0
User 2          0.75
User 3          0.17
User 4          0.4
User 5          0.5
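A sketch of the Tanimoto (Jaccard) coefficient; the rated-item sets below are hypothetical, chosen so the overlap gives 0.75:

```python
def tanimoto(items_u, items_v):
    """Tanimoto (Jaccard) coefficient: |intersection| / |union|.
    It ignores rating values and looks only at which items overlap."""
    a, b = set(items_u), set(items_v)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical rated-item sets.
user1 = {101, 102, 103}
user2 = {101, 102, 103, 104}
print(tanimoto(user1, user2))  # 3/4 = 0.75
```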
Chapter 4
Issues in Recommender System
Some common issues of recommender systems are discussed here; this report does not provide solutions for these problems. They are some of the open issues that researchers can explore in the future.
Cold-start problem for the system - A new system may face both of the above-mentioned problems (new users as well as new items), and so this case is named system cold start.
4.2 Evaluation
To date, in the field of recommender systems it remains challenging to identify the best algorithm for a given purpose, as researchers disagree on which attributes should be measured and on which metrics should be used for each attribute. In the paper of Herlocker et al. [39], the authors discuss this problem broadly and point out three things. First, different algorithms may be better or worse on different data sets. Second, the goals for which an evaluation is performed may differ. Third, it is challenging to decide what combination of measures to use in a comparative evaluation.
the attacker rates highly will probably be recommended to the target user. Since current RSs are mainly centralized servers, creating a fake identity is a time-consuming activity, and hence these attacks are not currently heavily carried out or studied. However, we believe that as soon as the publishing of ratings and opinions becomes more decentralized (for example, with Semantic Web formats such as RVW [2] or FOAF [3]), these types of attacks will become more and more of an issue. Creating such attacks will basically become as widespread as spam is today, or at least as easy.
Web Of Trust
The webs of trust of all the users can be aggregated in a global trust network, or social network (Figure 1), and a graph-walking algorithm can be used to predict the importance of a certain node of the network. This intuition is exploited, for example, by PageRank [11], one of the algorithms powering the search engine Google.com. In this analysis, the Web is a network of content without centralized quality control, and PageRank tries to infer the authority of every single page by examining the structure of the network. PageRank follows a simple idea: if a link from page A to page B represents a positive vote issued by A about B, then the global rank of a page depends on the number (and quality) of its incoming links.
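The idea can be sketched with a small power iteration on a hypothetical four-page link graph (the damping factor 0.85 is the conventional choice):

```python
# links[page] = pages it votes for (an edge A -> B is a positive vote by A).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages = sorted(links)
d = 0.85                            # damping factor
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):                 # power iteration
    new = {p: (1 - d) / len(pages) for p in pages}
    for p, outs in links.items():
        for q in outs:              # p's rank is shared among its votes
            new[q] += d * rank[p] / len(outs)
    rank = new

best = max(rank, key=rank.get)
print(best)  # C, which collects the most (and best-ranked) incoming votes
```

The same iteration carries over to a trust network by replacing pages with users and links with trust statements.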
The same intuition can be extended from web pages to users: if users are allowed to cast trust values on other users, then these values can be used to predict the trustworthiness of unknown users. For example, the consumer opinion site Epinions.com, where users can express opinions and ratings on items, also allows users to express their degree of trust in other users. Precisely, the Epinions.com FAQ suggests that users should add to their web of trust reviewers whose reviews and ratings they have consistently found to be valuable.
Fig. 1: Trust network. Nodes are users and edges are trust statements. The dotted edge is one of the undefined but predictable trust statements.
Trust metrics [3,14,8] have precisely the goal of predicting, given a certain user, trust in unknown users based on the complete trust network. For example, in Figure 1, a trust metric can predict the level of trust of A in D. Trust metrics can be divided into local and global. Local trust metrics take into account the very personal and subjective views of the users and end up predicting, for every single user, different values of trust in other users. Global trust metrics instead predict a global reputation value that approximates how the community as a whole considers a certain user. In this way, they don't take into account the subjective opinions of each user but average them into standardized global values. PageRank [11], for example, is a global metric. In general, however, local trust metrics are computationally more expensive because they must be computed for each user, whereas global ones are just run once for the whole community.
In the following, we argue that trust-awareness can overcome all these weaknesses. Precisely, trust propagation allows us to compute a relevance measure, alternative to user similarity, that can be used as an additional or complementary weight when calculating recommendation predictions. In [9] we have shown how this predicted trust value, thanks to trust propagation, is computable for many more users than the user-similarity value. CF systems have problems scaling up because calculating the neighbour set requires computing the user similarity of the current user against every other user. However, we can significantly reduce the number of users the RS has to consider by prefiltering users based on their predicted trust value. For example, it would be possible to consider only users at a small distance in the social network from the current user, or only users with a predicted trust higher than a certain threshold. Moreover, trust metrics can be attack-resistant [8], i.e. they can be used to spot malicious users and to take into account only reliable users and their ratings. It should be kept in mind, however, that there isn't a global view of which user is reliable or trustworthy, so that, for example, a user can be considered trustworthy by one user and untrustworthy by another.
Chapter 5
SVD++ (Singular Value Decomposition and Latent Features)
5.1 Baseline Estimates
Typical CF data exhibit large user and item effects, i.e., systematic tendencies for some users to give higher ratings than others, and for some items to receive higher ratings than others. It is customary to adjust the data by accounting for these effects, which we encapsulate within the baseline estimates. Denote by mu the overall average rating. A baseline estimate for an unknown rating r_ui is denoted by b_ui and accounts for the user and item effects:

b_ui = mu + b_u + b_i

The parameters b_u and b_i indicate the observed deviations of user u and item i, respectively, from the average. For example, suppose that we want a baseline estimate for the rating of the movie Titanic by user Joe. Say that the average rating over all movies, mu, is 3.7 stars. Furthermore, Titanic is better than an average movie, so it tends to be rated 0.5 stars above the average. On the other hand, Joe is a critical user, who tends to rate 0.3 stars lower than the average. Thus, the baseline estimate for Titanic's rating by Joe would be 3.7 - 0.3 + 0.5 = 3.9 stars. In order to estimate b_u and b_i one can solve the least squares problem:

min over b  of  sum_{(u,i) in K} (r_ui - mu - b_u - b_i)^2 + lambda * (sum_u b_u^2 + sum_i b_i^2)
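The Titanic/Joe example can be checked with a trivial sketch of b_ui = mu + b_u + b_i:

```python
mu = 3.7         # overall average rating
b_user = -0.3    # Joe tends to rate 0.3 stars below the average
b_item = 0.5     # Titanic tends to be rated 0.5 stars above the average

def baseline(mu, b_u, b_i):
    """Baseline estimate b_ui = mu + b_u + b_i."""
    return mu + b_u + b_i

print(round(baseline(mu, b_user, b_item), 1))  # 3.9
```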
possible to the ground truth r_{u,j}. Formally, we can learn the user- and item-feature matrices P and Q by minimizing the following regularized squared-error loss (objective) function over the observed ratings K:

L(P, Q) = sum_{(u,j) in K} (r_{u,j} - p_u^T q_j)^2 + lambda * (||p_u||^2 + ||q_j||^2)
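A minimal stochastic-gradient sketch of this factorization, with a toy matrix and hypothetical hyperparameters (p_u and q_j are rows of P and Q):

```python
import random

# Toy rating matrix (0 = unknown); we factorize R ~ P Q^T with f latent features.
R = [[5, 3, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [0, 1, 5, 4]]
n_users, n_items, f = len(R), len(R[0]), 2
lr, lam, epochs = 0.01, 0.02, 2000   # hypothetical hyperparameters

random.seed(0)
P = [[random.uniform(0, 0.5) for _ in range(f)] for _ in range(n_users)]
Q = [[random.uniform(0, 0.5) for _ in range(f)] for _ in range(n_items)]

for _ in range(epochs):
    for u in range(n_users):
        for i in range(n_items):
            if R[u][i] == 0:
                continue                      # only observed ratings enter the loss
            e = R[u][i] - sum(P[u][k] * Q[i][k] for k in range(f))
            for k in range(f):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (e * qi - lam * pu)   # gradient step with L2 penalty
                Q[i][k] += lr * (e * pu - lam * qi)

# Observed entries should now be reconstructed closely; the 0 cells get
# predictions "for free" from the learned latent features.
err = max(abs(R[u][i] - sum(P[u][k] * Q[i][k] for k in range(f)))
          for u in range(n_users) for i in range(n_items) if R[u][i])
print(err < 1.0)
```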
Chapter 6
Data sets
6.1 MovieLense Dataset (100k)
Description
The 100k MovieLense ratings data set. The data was collected through the
MovieLens web site (movielens.umn.edu) during the seven-month period from
September 19th, 1997 through April 22nd, 1998. The data set contains about
100,000 ratings (1-5) from 943 users on 1664 movies.

R> MovieLense
(a) Raw rating distribution for MovieLense. (b) Normalized rating distribution for MovieLense.
R> MSWeb
32710 x 285 rating matrix of class binaryRatingMatrix with 98653 ratings.

We took a portion of the dataset for evaluation.

R> MSWeb10 <- sample(MSWeb[rowCounts(MSWeb) > 10], 100)
100 x 285 rating matrix of class binaryRatingMatrix with 1381 ratings.

R> hist(rowCounts(MSWeb10))
R> hist(colCounts(MSWeb10))
Chapter 7
Implementation
Many organizations have developed different platforms to implement recommender systems. Some of the open-source projects are listed in the following table. These platforms give researchers an easy path for studying, building and experimenting with different recommender system algorithms. Among them we chose Apache Mahout (from The Apache Software Foundation).
Apache
Mahout
Cofi
Crab
easyrec
LensKit
Description
Machine learning library
includes collaborative filtering
Collaborative filtering library
Components to create
recommender systems
Recommender for Web pages
Collaborative filtering algorithms
from GroupLens Research
Language URL
Java
http://mahout.apache.org/
Java
Java
http://www.nongnu.org/cofi/
https://github.com/
muricoca/crab
http://easyrec.org/
Java
http://lenskit.grouplens.org/
Python
Recomm
Testing,developing environment R
enderlab
http://R-Forge.R-project.org/
projects/recommenderlab/
These platforms provide interfaces to connect and conduct users' customized experiments. For development we used Eclipse, the most popular free Java IDE, whose recent version, Luna, supports Java 8. The project was created from the Maven archetype org.apache.maven.archetypes:maven-archetype-quickstart.
For data input we used DataModel, which gives us the facility to store and access all the preference, item and user data needed in the computation. Next, we need UserSimilarity implementations, which provide the different similarity-measure techniques to compute user similarity needed in collaborative filtering. A UserNeighborhood implementation gives the concept of a group of users most similar to a given user. Finally, a Recommender implementation combines all these concepts to recommend items to users.
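The interplay of these four abstractions can be mirrored in a small self-contained Python sketch (a conceptual illustration with hypothetical data, not the Mahout API itself):

```python
from math import sqrt

# DataModel: preferences stored as user -> {item: rating} (hypothetical data).
data_model = {
    "u1": {"i1": 5.0, "i2": 3.0},
    "u2": {"i1": 4.0, "i2": 3.5, "i3": 4.0},
    "u3": {"i2": 2.0, "i3": 5.0},
}

# UserSimilarity: Euclidean-distance similarity, 1 / (1 + d).
def user_similarity(u, v):
    common = set(data_model[u]) & set(data_model[v])
    if not common:
        return 0.0
    d = sqrt(sum((data_model[u][i] - data_model[v][i]) ** 2 for i in common))
    return 1.0 / (1.0 + d)

# UserNeighborhood: the n users most similar to `user`.
def neighborhood(user, n=2):
    others = [v for v in data_model if v != user]
    return sorted(others, key=lambda v: user_similarity(user, v), reverse=True)[:n]

# Recommender: rank unseen items by similarity-weighted neighbor ratings.
def recommend(user, how_many=1):
    scores = {}
    for v in neighborhood(user):
        s = user_similarity(user, v)
        for item, r in data_model[v].items():
            if item not in data_model[user]:
                scores[item] = scores.get(item, 0.0) + s * r
    return sorted(scores, key=scores.get, reverse=True)[:how_many]

print(recommend("u1"))  # ['i3'], the only item u1 has not yet rated
```

In Mahout each of these four roles is an interface with several interchangeable implementations, which is what makes swapping similarity measures and neighborhood definitions so convenient.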
In our experimentation process we use the MovieLense 100k data set. MovieLense data sets of various sizes can be found, but these data files come in .dat format. To feed a data file to our FileDataModel, we convert the .dat file into a delimited format that FileDataModel can read.
Chapter 8
Evaluation Metrics
In the literature, different accuracy measures for recommender systems can be found since 1994 [15]. Some of the most popular measures are classified into three classes by the authors of [39], namely: predictive accuracy metrics, classification accuracy metrics and rank accuracy metrics.
Problems may occur when these metrics are applied to real data sets, due to data sparsity. To handle this problem in collaborative filtering, items which have no rating are left out of the top list which is generated for recommendation.
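The two most common predictive accuracy metrics can be sketched directly (the rating vectors below are hypothetical):

```python
def average_absolute_difference(predicted, actual):
    """MAE: mean of |predicted - actual| over the test ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root-mean-square error, which penalizes large mistakes more."""
    return (sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)) ** 0.5

pred = [3.5, 4.0, 2.0]   # hypothetical predictions
true = [4.0, 4.0, 1.0]   # held-out true ratings
print(average_absolute_difference(pred, true))  # 0.5
print(round(rmse(pred, true), 3))
```

The average absolute difference is the error measure used for the comparisons in the results chapter.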
                Relevant   Non-relevant
Retrieved          a            b
Not retrieved      c            d

Table 7.1: Confusion matrix of two classes when considering the retrieval of documents
A ROC curve plots recall against fallout. The objective of ROC curve analysis is to return all of the relevant documents without returning the irrelevant ones. It does so (see Fig.) by maximizing recall (called the true positive rate) while minimizing the fallout (false positive rate).
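With the cell counts a, b, c, d of Table 7.1, the ROC quantities (and precision) can be computed as follows; the counts below are hypothetical:

```python
def metrics(a, b, c, d):
    precision = a / (a + b)   # fraction of retrieved documents that are relevant
    recall = a / (a + c)      # true positive rate (y-axis of the ROC curve)
    fallout = b / (b + d)     # false positive rate (x-axis of the ROC curve)
    return precision, recall, fallout

p, r, f = metrics(a=8, b=2, c=4, d=86)   # hypothetical counts
print(p, round(r, 3), round(f, 3))
```

Sweeping the length of the recommended top list and plotting (fallout, recall) at each cut-off traces out the ROC curve.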
Chapter 9
Results and Explanation
Comparison of User based and Item based method
Figure 8.1: User based Vs Item based varying training data set size
Figure 8.1 is a comparison graph between user-based and item-based recommendation on the MovieLense data set. The graph shows the prediction error, measured as the average absolute difference, with increasing training set size. When there is very little training data, both the user-based and the item-based method give a high error value; its significance is that both recommendation methods suffer from the cold-start problem at the initial stage. But as the training set size increases, both methods give apparently good results (less error). It is noticeable that the item-based method gets saturated after some time, while in contrast the user-based method can still give better performance.
Similarity      n=1   n=3   n=5   n=7   n=9   n=11  n=13  n=15  n=17  n=19
Pearson         NaN   0.89  0.83  0.92  0.77  0.90  0.80  0.87  0.90  0.87
Loglikelihood   NaN   1.02  0.83  0.84  0.83  0.88  0.91  0.84  0.86  0.82
Euclidean       NaN   0.68  0.78  0.86  0.77  0.83  0.85  0.80  0.71  0.76

Table 8.1: Comparison of similarity metrics with a varying number of fixed-size nearest neighbors (average absolute difference)
Some values are not a number, or undefined, and are denoted by Java's NaN symbol. In user-based collaborative filtering there are two approaches to defining the neighborhood. In Table 8.1 we can see the different error values, with a varying fixed-size neighborhood, for three different similarity methods. At a neighborhood of size 1 no similarity method can provide a prediction; therefore we get "not a number" in the implementation. In the graph that space also shows blank; the reason is that at neighborhood = 1 user-based collaborative filtering faces the cold-start problem. But as the number of neighbors increases we obtain error values, meaning that user-based collaborative filtering starts predicting. At a neighborhood of seventeen we get a lower error value with Euclidean-based similarity. The Euclidean distance similarity metric may be a little better than Pearson, though their results are quite similar. It also appears that using a small neighborhood is better than a large one; the best evaluations occur when using a neighborhood of seventeen people. From this we can also infer that, in the case of the MovieLense data set, users' preferences may be truly quite personal, and incorporating too many others in the computation does not help.
Similarity      t=0.7   t=0.8   t=0.9
Pearson         0.87    0.89    0.87
Loglikelihood   0.82    0.80    0.80
Euclidean       0.87    0.86    0.84
Table 8.2: Comparison with different similarity metrics using threshold neighborhood
Table 8.2 holds the error values produced by the average absolute difference at different threshold neighborhoods, using user-based collaborative filtering on the MovieLense data set. It is noticeable that at threshold values 0.4 and 0.5 the Euclidean distance similarity method produces less error than the other metrics.
In user-based collaborative filtering, "cosine similarity" and a nearest neighborhood of 50 are considered. As the rating range of Jester is +10 to -10, for evaluation purposes we have to consider 5 as a good rating value. Again, user-based collaborative filtering gives higher precision-recall values than popular and item-based collaborative filtering. The Jester and MovieLense data sets are similar in structure but dissimilar in rating range. So the inference is that user-based collaborative filtering again gives higher precision-recall values than the other methods, and the dissimilarity of the rating scales makes no difference to its performance. Random recommendation is the worst of the four, while popular recommendation can be placed right after UBCF; this means that jokes rated high are usually liked by everyone and are safe recommendations, though this might be different for other data sets. IBCF finds what is common between the items, but it may fail for users whose tastes differ from the general trend. Another point is that the ROC curve cannot explain all viewpoints: though UBCF does better, it is more expensive at recommendation time, as it uses the whole rating matrix, whereas IBCF stores only the k closest items per item. But if we want serendipity, UBCF does the better job.
Chapter 10
Conclusion
Selecting appropriate algorithms for building a recommender system is a tricky job. There is no single way to apply these algorithms in a general manner.
Recommender systems are applied in various contexts and applications. If one method performs better on one domain, it is not guaranteed that it will perform the same on another system, because several factors influence the performance of a method, and sometimes the information required for prediction may not be available. Therefore, a series of experiments should be done to find the best method for a specific domain.
We have done some experiments on data sets and found the better methods for prediction, but still better methods may be available for those data sets.
For collaborative filtering, the rating matrix is the most prominent source of information. Along with the ratings, some data sets also provide side information (e.g., gender and age of users; title and actors of movies). If all this extra information can be incorporated into the prediction mechanism, we will surely get better prediction values. We think there is a lot of future scope to improve collaborative filtering techniques.
Appendix A
Netflix Prize Competition
In 2006, the online DVD rental company Netflix announced a contest to improve the state of its recommender system. To enable this, the company released a training set of more than 100 million ratings spanning about 500,000 anonymous customers and their ratings on more than 17,000 movies, each movie being rated on a scale of 1 to 5 stars. Participating teams submit predicted ratings, and Netflix calculates a root-mean-square error (RMSE) based on the held-out truth. The first team that could improve on the Netflix algorithm's RMSE performance by 10 percent or more would win a $1 million prize. If no team reached the 10 percent goal, Netflix would give a $50,000 Progress Prize to the team in first place after each year of the competition.
The contest created a buzz within the collaborative filtering field. Until this point, the only publicly available data for collaborative filtering research was orders of magnitude smaller. The release of this data and the competition's allure spurred a burst of energy and activity. According to the contest website (www.netflixprize.com), more than 48,000 teams from 182 different countries have downloaded the data [50].
Bibliography
[1] F. Gedikli, D. Jannach, and M. Ge, "How should I explain? A comparison of different explanation types for recommender systems," Int. J. Hum.-Comput. Stud., vol. 72, no. 4, pp. 367-382, Apr. 2014. [Online]. Available: http://dx.doi.org/10.1016/j.ijhcs.2013.12.007
Retrieval, ser. SIGIR '09. New York, NY, USA: ACM, 2009, pp. 532-539. [Online]. Available: http://doi.acm.org/10.1145/1571941.1572033
Committee, 2013, pp. 505-514. [Online]. Available: http://dl.acm.org/citation.cfm?id=2488388.2488433
tions and the Internet, 2006. SAINT 2006. International Symposium on.
filtering recommendation algorithms, in
[48] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, "Analysis of recommendation algorithms for e-commerce," in Proceedings of the 2nd ACM Conference on Electronic Commerce. ACM, 2000, pp. 158-167.