
2011 International Conference on Computational and Information Sciences

A Case Study of Recommendation Algorithms

Xiwei Wang, Dept. of Computer Science, University of Kentucky, United States, xiwei@netlab.uky.edu
Erik von der Osten, Dept. of Computer Science, Universität Heidelberg, Germany, eosten@ix.urz.uni-heidelberg.de
Xuzi Zhou, Dept. of Computer Science, University of Kentucky, United States, xuzizhou@netlab.uky.edu
Hui Lin, Dept. of Electrical and Computer Engineering, University of Kentucky, United States, hui.lin@uky.edu
Jinze Liu, Dept. of Computer Science, University of Kentucky, United States, liuj@netlab.uky.edu
compare, and buy specific items or groups of items without being physically present. To sell their products better, most online shopping websites provide recommendation information to the customers¹ who have visited their website before. Recommender systems are mechanisms that help users make purchase decisions.

A recommender system is a program that uses algorithms to predict customers' purchase interests by profiling their shopping patterns. Research on recommender systems has been published since the mid-1990s [1], and different approaches and models have been proposed and applied to real-world industrial applications. The most popular recommendation technique is the Collaborative Filtering (CF) model [2, 3]. In CF, previous transactions are analyzed to establish connections between users and products. When recommending items to a user, CF-based recommender systems look for information related to the current user and compute a rating for every candidate item; the items with the highest scores are then presented to the user.

In recommender systems research, most models work with rating datasets, such as the Netflix movie rating data [21]. Rating information is very important for good prediction accuracy because it precisely indicates users' preferences and their degree of interest in certain items. However, rating information is not always available: some websites have no rating mechanism, so their users cannot leave rating feedback on products. This situation requires evaluating implicit information, which lowers the prediction accuracy of recommender systems. The datasets from the retargeting company² are such data; they contain

Abstract-Recommender systems are very popular among online service providers. Among all sorts of recommender systems, top-N recommendation for online shopping systems has drawn increasing attention from researchers. Most existing papers about recommendation algorithms use public datasets, e.g. Netflix and MovieLens, as their experiment data. These datasets, containing users' ratings of movies, have been carefully tweaked and are therefore very suitable for algorithm study. In real applications such as online shopping websites, however, the data may not be tweaked and may contain no explicit rating information, yet it is still used for recommender systems. Fortunately, we were invited by an American retargeting company to study the effects of recommendation algorithms on their datasets and to find a good strategy for selecting algorithms for particular websites. In this paper, several typical recommendation algorithms are studied: the popularity-based model, the item similarity-based model, the SVD model, and the bipartite graph model. The filtering step of the popularity-based model is also applied to the other models for further comparison. Experiments with these methods are performed on four different browsing history datasets from this retargeting company to reveal the advantages and disadvantages of each approach. The experimental results show that there is no perfect or dominating model for all datasets. Nevertheless, we have found a fairly reliable selection strategy.

Keywords-Recommender Systems, Collaborative Filtering, Case Study

I. INTRODUCTION

Over the past 20 years, the Internet has served as the major technology connecting our world. Economists have discovered the great potential that lies in this technology and have been trying to find suitable ways to make it as easy and pleasant as possible to spend money while surfing the Internet. Almost every shop now has an online presence that makes it possible to search,
978-0-7695-4501-1/11 $26.00 © 2011 IEEE
DOI 10.1109/ICCIS.2011.20

1 The terms product and item, and customer and user, will be used interchangeably.
2 We are not allowed to disclose the company's name.


no monotonic relation between error metrics and accuracy metrics.

Sarwar et al. [6] investigated several techniques on e-commerce data for analyzing large-scale purchase and preference information to produce useful recommendations to users. They applied dimensionality reduction to two neighborhood models to study their performance. Their experimental results on e-commerce data show that the models working in the lower-dimensional space have a slightly smaller F1 score [5] than the ones working in the original space. However, this accuracy difference is trivial, and the former case saves a significant amount of running time.

no rating information. The information provided includes the user ID, the product ID, and the users' click history with corresponding dates. Therefore, some state-of-the-art recommendation algorithms should be re-examined and tweaked to better suit non-rating data.

In this paper, we compare the following recommendation algorithms: the item popularity-based model, the item similarity-based model, the bipartite graph model, and the SVD-based latent factor model, by analyzing and performing experiments on browsing history datasets of online shopping websites.

The contributions of this paper are three-fold: (1) we conduct the experiments on datasets provided by the retargeting company, which differ from the Netflix data in that they have no rating information and have never been tweaked; (2) our experimental results show that no model is perfect in all cases - some are good for some datasets but unsuitable for others. A list of seven models ranked by prediction accuracy is presented, and some suggestions for selecting models in different cases are proposed; (3) we apply the filtering step of the item popularity-based model to the other models and test the performance of the filtered models. We found that the filtering step can greatly improve the prediction accuracy of the item similarity-based model and the bipartite graph model, but has no effect on the SVD-based model.

The remainder of this paper is organized as follows. Section II gives the related work. Section III describes the main idea of each approach. Section IV presents the experiments on the datasets from the retargeting company and discusses their results. Some concluding remarks and future work are given in Section V.
II. RELATED WORKS

In [4], Breese et al. did an empirical analysis of collaborative filtering algorithms on three datasets, i.e. MS Web [4], Nielsen³, and EachMovie⁴. The first two datasets have binary rating values, while the last one has a rating scale from 0 to 5. They performed two classes of experiments: the All-but-1 test and the Given test. The All-but-1 test withholds a single randomly selected vote for each user in the test set and tries to predict its value given all the other votes the user has cast. The Given test randomly selects a certain number of votes from each test user as the observed votes and then attempts to predict the remaining votes. Our evaluation strategy is very similar to the All-but-1 test, with the only difference that we use the last vote as the one to be predicted instead of selecting one at random.

Cremonesi et al. [11] studied the performance of recommender algorithms on top-N recommendation tasks. They adopted accuracy metrics (recall, precision) instead of error metrics (RMSE, MAE) as their evaluation standard, since most commercial recommender systems display only the recommended items and not the predicted values. They also showed that there is no monotonic relation between error metrics and accuracy metrics.

III. DESCRIPTION OF THE MODELS

A. Notational Conventions

In this paper, we use a matrix to store the clicking relationships between users and items, called the user-item clicking matrix, denoted by R = [r_ij], i = 1, ..., m, j = 1, ..., n, where m is the number of users and n is the number of items. The entry r_ij is the click count that user i left on item j in the given time period. Thus each row corresponds to a user and each column corresponds to an item. Though there is no definite proof that a user who clicked on an item will buy it, a high click count may imply that the user is interested in it.

For user u and item i, we denote the existing click count from u to i as r_ui and the predicted one as r̂_ui. The recommended items should be interesting to the user, i.e. the user is likely to click on them.

B. Item Popularity-based Model

Item popularity-based approaches are very traditional ones for recommender systems. Their main idea is to recommend the most popular, most viewed, or best selling items to users. Although item popularity-based models overlook users' preferences, this kind of model is still effective to a certain degree, and is adopted as an auxiliary component in the recommender systems of many famous online shopping sites, such as Amazon.com, ebay.com, etc.

In our item popularity-based model, a popularity list is maintained for each dataset, denoted by L = {t_k}, k = 1, 2, ..., n. The elements in L are items in descending order with respect to their view counts, denoted by np_k.

For a simple implementation of popularity-based top-N recommendation, the items contained in the first N elements of list L are used as the results. However, these recommended items may not be interesting to a user, which means a less accurate prediction. Thus, a further filtering step is adopted in this model to improve the prediction results. The filtering step introduces a new parameter h_u into the model, where

h_u = (number of distinct items viewed by user u) / (total number of items viewed by user u)

The value of h_u can reveal some information about the user's browsing habits, such as whether the user prefers to view an item just once, or prefers to view an item a couple of

3 This dataset was made by Nielsen Media Research.
4 http://www.grouplens.org/node/76

times during browsing. In the former case, the model should not recommend items that have already been viewed by the user. In the latter case, such items can also be presented to the user. A threshold h_t is set to determine whether a further filtering step is needed for a user u:

i) if h_u < h_t, recommend the top-N items in L to the user;

ii) if h_u ≥ h_t, perform the filtering step.
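As an illustration only (not the authors' code), the popularity list and the h_u filtering step can be sketched as follows; the data layout, function name, and threshold value are our assumptions:

```python
from collections import Counter

def popularity_recommend(clicks, user_clicks, N=10, h_t=0.9):
    """clicks: all (user, item) click events; user_clicks: one user's clicked items."""
    view_counts = Counter(item for _, item in clicks)
    L = [item for item, _ in view_counts.most_common()]   # popularity list

    total = len(user_clicks)
    distinct = len(set(user_clicks))
    h_u = distinct / total if total else 1.0              # ratio of distinct views

    if h_u < h_t:                   # case i): the user tends to re-view items
        return L[:N]
    viewed = set(user_clicks)       # case ii): drop items the user already viewed
    return [t for t in L if t not in viewed][:N]
```

For a user whose clicks are almost all distinct (h_u ≥ h_t), already-viewed items are filtered out before the top-N cut.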

D. SVD-based Latent Factor Model

Latent factor models [22] focus on reducing the dimensionality of the user-item rating matrix in order to discover latent factors. These factors should explain user preferences with the least noise. Basically, they are approximations of the original rating information.

Most latent factor models are based on SVD [8]. The main idea of SVD-based models is to factorize the user-item rating matrix into two lower-rank matrices, i.e. a user-factor matrix P and an item-factor matrix Q. Thus, each user u and item i can be represented by f-dimensional factor vectors p_u (the u-th row of P) and q_i (the i-th row of Q), respectively (p_u, q_i ∈ R^f) [11]. The prediction of a rating from user u to item i is made in the following way:

r̂_ui = p_u q_i^T    (3)

In order to obtain the user and item factor vectors, SVD is applied to the huge sparse matrix R with all the missing values set to zero:

R_{m×n} = U_{m×r} Σ_{r×r} Q^T_{n×r}    (4)

where U and Q are orthonormal matrices, Σ is a diagonal matrix with the singular values on its diagonal, and r is the rank of R.

With SVDLIBC (an SVD package) [12], the dimension f (f ≤ r) can easily be specified when decomposing the rating matrix. Hence, the user-factor matrix is represented by

P_{m×f} = U_{m×f} Σ_{f×f} ≈ R_{m×n} Q_{n×f}    (5)
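Equations (3)-(7) can be sketched with numpy in place of SVDLIBC; the toy matrix below is our own assumption, not data from the paper:

```python
import numpy as np

# Toy user-item click matrix (missing entries stored as zeros); values made up.
R = np.array([[3., 0., 1.],
              [2., 1., 0.],
              [0., 0., 4.],
              [3., 1., 1.]])
f = 2                                    # number of latent factors, f <= rank(R)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
Q = Vt[:f].T                             # item-factor matrix Q (n x f)
P = R @ Q                                # Eq. (6): p_u = r_u Q, so P = R Q

R_hat = P @ Q.T                          # Eq. (7): r_hat_ui = r_u Q q_i^T
```

Because the columns of Q are orthonormal, P = RQ equals U_f Σ_f, so R_hat is exactly the rank-f SVD approximation of R.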

C. Item Similarity-based Model

Among all recommender systems, similarity-based models are simple to implement and thus widely used. Papagelis et al. [7] showed that, in most cases, item similarity-based models achieve better prediction accuracy than user similarity-based models. In the item similarity-based model [7], when recommending items to a user u, the system first retrieves the neighbors of the items that have been viewed by u. Then it picks the N most similar neighbors and recommends them to user u.

In real-world applications, a notable challenge for a recommender system is the cold start problem [9]. It often occurs when new users have just signed up or users have expressed very few opinions. In this paper, we incorporate the item popularity factor into the similarity-based model to address the new-user problem. We use the following formula to predict the relationship between user u and item i:

r̂_ui = γ · (1 / |S(i;u)|) · Σ_{j∈S(i;u)} ρ_ij + (1 − γ) · (np_i / N)    (1)

and so R ≈ P Q^T.

If we use r_u to denote the u-th row of the rating matrix R, then the user factor vector p_u can be obtained via

p_u = r_u Q    (6)

From (3) and (6), we have

r̂_ui = r_u Q q_i^T    (7)

Essentially, the matrix Q is the only one of the decomposed matrices that is used in prediction.

The first tier is the similarity score and the second tier is the popularity score. S(i;u) is the set of items that were viewed by user u and are similar to item i. ρ_ij is the Pearson correlation coefficient [10] between item i and item j. The popularity score is the ratio between the view count of item i (denoted by np_i) and the global maximum view count (denoted by N). We use γ ∈ [0, 1] to control the weight of each part.

When computing the Pearson correlation coefficient, we slightly modified the formula to provide better relationships between two items:

ρ_ij = Σ_{k=1}^{d} (x'_ik − x̄_i)(x'_jk − x̄_j) / ( sqrt(Σ_{k=1}^{d} (x'_ik − x̄_i)²) · sqrt(Σ_{k=1}^{d} (x'_jk − x̄_j)²) )    (2)

where x'_ik = x_ik · (1 + 1 / (1 + log n_ck)) is a variation of x_ik (n_ck is the number of items that user k has viewed) and d is the number of dimensions of the vector x_i which represents an item (each user's click count on this item corresponds to an entry in the vector).

The modification of x_ik is based on the premise that users who have clicked fewer items should contribute more to the similarity computation than users who have clicked many more items.
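A minimal sketch of the modified coefficient (our illustration; the data shapes and function name are assumptions):

```python
import math

def modified_pearson(x_i, x_j, n_c):
    """x_i, x_j: per-user click counts for items i and j (length d);
    n_c[k]: number of items viewed by user k (assumed >= 1)."""
    d = len(x_i)
    # Damping factor from Eq. (2): x'_ik = x_ik * (1 + 1 / (1 + log n_ck))
    w = [1 + 1 / (1 + math.log(n_c[k])) for k in range(d)]
    xi = [x_i[k] * w[k] for k in range(d)]
    xj = [x_j[k] * w[k] for k in range(d)]
    mi, mj = sum(xi) / d, sum(xj) / d
    num = sum((xi[k] - mi) * (xj[k] - mj) for k in range(d))
    den = math.sqrt(sum((v - mi) ** 2 for v in xi)) * \
          math.sqrt(sum((v - mj) ** 2 for v in xj))
    return num / den if den else 0.0
```

Users who viewed only a few items get a damping factor close to 2, so their clicks weigh more in the correlation than those of heavy clickers.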

Case ii) means that the number of distinct items viewed by user u is close to the total number of items viewed by u. Then the items in the last transaction of user u will be excluded from the recommendation results (generated in the first step).

The filtering step is also applicable to other models (called the filtered models), such as the item similarity-based model. To utilize the filtering step, the other models are required to generate an ordered top-(2N) item list for top-N recommendation. The top-(2N) list takes the place of the popularity list L.
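The filtered-model procedure can be sketched as follows (function and parameter names are ours, for illustration): the base model supplies an ordered top-(2N) list, already-viewed items are dropped, and the top-N cut is taken from what remains.

```python
def filtered_top_n(ranked_2n, viewed, N=10):
    """ranked_2n: ordered top-(2N) list from the base model;
    viewed: items the user has already viewed (the ones to exclude)."""
    return [item for item in ranked_2n if item not in viewed][:N]
```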

E. Bipartite Graph Model

In this graph model [13], users and items are modeled as the vertices of a graph. All vertices can be divided into two disjoint sets, I (the item set) and U (the user set). Every edge connects a vertex in U with one in I and corresponds to an entry r_ij in the user-item rating matrix R, as shown in Fig. 1.


Figure 1. A bipartite graph with item nodes t_1, ..., t_n and user nodes u_1, ..., u_m

Sets I and U in the bipartite graph model are independent sets [14]. Therefore, the transition probability between each item pair t_i and t_j can be obtained by

P(t_i | t_j) = Σ_{k=1}^{m} ( P(t_i | u_k) · P(u_k | t_j) )    (8)

where

P(t_i | u_k) = r_ki / Σ_{j=1}^{n} r_kj,    P(u_k | t_j) = r_kj / Σ_{k=1}^{m} r_kj

All item nodes now form a finite Markov chain with transition matrix P = [p_ij], i, j = 1, ..., n, where p_ij = P(t_i | t_j) [15], i.e. the probability that this chain ends in the specific item node t_i given initial node t_j is p_ij. Therefore, given the previous click history of user u, we can predict the probability that u might be interested in a certain item i:

r̂_ui = Σ_{j=1}^{n} ( p_ij · T_u(t_j) )    (9)

where T_u is the initial state vector for user u in the Markov chain and T_u(t_j) is the component corresponding to item j:

T_u(t_j) = r_uj / Σ_{i=1}^{n} r_ui    (10)

In order to penalize the users (or items) with a large number of clicks, the penalization parameter α is introduced into the model [13, 16]. It is based on a premise similar to the one used by the item similarity-based model when computing item similarities. The transition probabilities with the penalization parameter are:

P(t_i | u_k) = r_ki^α / Σ_{j=1}^{n} r_kj^α,    P(u_k | t_j) = r_kj^α / Σ_{k=1}^{m} r_kj^α    (11)
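Equations (8)-(11) can be sketched as follows (our illustration, assuming every user row and item column of R has at least one click; names are ours):

```python
import numpy as np

def predict_bipartite(R, u, alpha=1.0):
    """R: user-item click count matrix (m x n); u: user row index;
    alpha: penalization exponent of Eq. (11). Returns one score per item."""
    Rf = np.asarray(R, dtype=float)
    Ra = Rf ** alpha
    # P(t_i | u_k): each user's clicks normalized over items (rows)
    p_item_given_user = Ra / Ra.sum(axis=1, keepdims=True)
    # P(u_k | t_j): each item's clicks normalized over users (columns)
    p_user_given_item = Ra / Ra.sum(axis=0, keepdims=True)
    # Eq. (8): p_ij = sum_k P(t_i | u_k) * P(u_k | t_j)  (n x n transition matrix)
    P = p_item_given_user.T @ p_user_given_item
    T_u = Rf[u] / Rf[u].sum()        # Eq. (10): initial state vector of user u
    return P @ T_u                   # Eq. (9): predicted interest in each item
```

Each column of the item-item transition matrix sums to one, so the returned scores form a probability distribution over items.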

TABLE I. STATISTICS ON DATASETS

Site ID | # of users | # of items | # of clicks
3699    | 20,471     | 499        | 134,982
5202    | 148,409    | 1,004      | 300,757
8631    | 112,738    | 94         | 1,559,529
9093    | 70,049     | 2,303      | 120,836

B. Evaluation Strategy

As stated before, our datasets are different from the Netflix datasets; therefore, we do not evaluate by prediction error (e.g. RMSE, MAE) but by hit rate (also referred to as recall rate or precision).

To compare the prediction accuracies, we apply the models described above to the four datasets to get top-N recommendations with N = 10 for the users in the test user set. We call the recommended item set the predicted set. As the evaluation criterion we adopt the hit rate of the recommendations (the higher, the better), i.e.


Each dataset is divided into three subsets, namely the training set, the test set, and the last transaction set. The training set is obtained from the original dataset by removing 1,000 users (called test users) and their accompanying data. In order to make sure that the items viewed by the test users also exist in the training set, each such item must occur at least 15 times in the training set after the test users' data has been removed. The last transactions of the removed test users form the last transaction set, and their remaining data forms the test set.

Our goal is to use the training set to train the models and then apply the models to the data in the test set to predict the last transaction of the test users.
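The split described above can be sketched as follows (the record format and function name are our assumptions; the "last transaction" is taken to be all of a user's rows on their latest date):

```python
from collections import Counter

def split_dataset(rows, test_users, min_item_count=15):
    """rows: (user, item, date) transactions; test_users: held-out users."""
    train = [r for r in rows if r[0] not in test_users]
    counts = Counter(item for _, item, _ in train)
    keep = {i for i, c in counts.items() if c >= min_item_count}

    test, last = [], {}
    for u in test_users:
        u_rows = sorted((r for r in rows if r[0] == u and r[1] in keep),
                        key=lambda r: r[2])
        if u_rows:
            last_date = u_rows[-1][2]                 # latest transaction date
            last[u] = [r for r in u_rows if r[2] == last_date]
            test.extend(r for r in u_rows if r[2] != last_date)
    return train, test, last
```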



HitRate = (number of correctly predicted test users) / (number of test users)
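A small sketch of this criterion (our illustration; we assume a test user counts as correctly predicted when the top-N list intersects the user's last transaction):

```python
def hit_rate(recommendations, last_transactions):
    """recommendations: {user: top-N item list};
    last_transactions: {user: set of items in the user's last transaction}."""
    hits = sum(1 for u, recs in recommendations.items()
               if set(recs) & last_transactions[u])
    return hits / len(recommendations)
```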

The hit rates of all models are also tested for different values of N on these datasets.

IV. EXPERIMENT STUDY

A. Description of Datasets

The data that we used was gathered by a retargeting company for research use. It consists of the browsing history of 139 online shopping sites over one week (08/08/2010 to 08/14/2010). In this dataset, each row represents a transaction, which has four attributes, namely product ID, website ID, user ID, and date.

We selected four sub-datasets from the whole data as our testing datasets. Statistics are shown in Table I.

Figure 2. Hit rate with different γ on site 5202⁵

5 We use "the website with site ID xxxx" and "site xxxx" interchangeably; they refer to the same datasets in this context.

TABLE II. HIT RATE WITH DIFFERENT α ON SITE 5202

α        | 0.5   | ...   | ...
Hit Rate | 20.0% | 21.2% | 14.2%

the range of the rating matrix (from the minimum to the maximum rating value) is not very large.
2) Prediction on datasets

We first run the models on site 3699. The hit rates are shown in Fig. 3. Note that we use IP to denote the item popularity-based model, IS the item similarity-based model, f-IS the filtered item similarity-based model, BG the bipartite graph model, f-BG the filtered bipartite graph model, SVD the SVD-based latent factor model, and f-SVD the filtered SVD-based latent factor model.

On this site, the SVD-based model (with 60 factors) achieved the highest prediction accuracy, much higher than the other models. We also tested the SVD model with more factors, but no better results could be obtained. This means the first 60 factors capture the most critical latent properties of the items in this data.

The bipartite graph model reaches an accuracy of 46.7%, which is close to the results of the item similarity-based model. Essentially, BG follows a similar principle to IS, since both need to build an item-item matrix. The difference lies in the interpretation of the entries in this matrix: transition probabilities in BG versus item similarities in IS. Indeed, in some IS models the similarities are obtained by computing conditional probabilities between items and users.

The item popularity-based model performs worst on this dataset. This is reasonable, as there are 499 items and only 10 items are recommended to each test user. However, the filtering step of IP works quite well with IS and BG. The hit rate is 67.3% for f-IS (70.9% for f-BG), where the top-20 recommendation by the IS model got a hit rate of 67.6% (71.0% for BG). These results surprised us to some extent: the filtering step removed almost all the incorrect items and retained the correct ones. By analyzing the users' browsing habits, we found that most users click distinct items just once. Thus, they may not be interested in the items they have already clicked.

Note that in the IS model, the neighbors of the items that have been viewed by a user may themselves have been viewed by this user. Therefore, the IS-based recommendation is not entirely suitable for these kinds of users, and a further filtering step is needed.

Within the item popularity-based model, an item-popularity list is constructed by collecting statistics from the browsing history. The filtering step is applied to this list to obtain the final recommended items.

In the item similarity-based model, the parameter γ is tweaked to get the best ratio of the similarity-based score and the popularity-based score. γ is chosen from the interval [0, 1] with step size 0.1.

For the filtered item similarity-based model, we perform the filtering step of the item popularity-based model on the ordered top-(2N) recommendation list generated by the item similarity-based model to produce a new top-N list. The same step is applied to the other models.

With the bipartite graph model, we first build a probability transition matrix with (8) and (11). The prediction is computed with the Markov chain defined by this matrix.


C. Results and Discussion

Before comparing the models, the parameters of the item similarity-based model and the bipartite graph model are studied on the website with site ID 5202.

1) Parameter Study

a) γ in the item similarity-based model

The curve in Fig. 2 shows that with increasing γ, better hit rates are reached. The popularity-based score cannot contribute to the accuracy in a way that is more positive than the similarity-based score. Nevertheless, as stated before, the popularity-based score is mainly used to provide recommendation information for new users, who have almost no preferences. In our experiments, the hit rate is tested by applying the models to users that have information in both the test set and the last transaction set. This test methodology does not focus on the new-user problem. Thus, the popularity-based score was eliminated by setting γ to 1.0, which means this model recommends items based on the similarity-based score only.

b) α in the bipartite graph model

In this model, α penalizes the users or items with many clicks. Thus, with a larger α, the corresponding probabilities in (11) become smaller. Table II shows the hit rates with different α.

In the test datasets, the number of distinct items that a user has clicked does not vary a lot. Hence the effect of α as a penalty is not as obvious as in grocery shopping [16], where a customer purchasing a large quantity of a certain product reflects a higher interest in that product. Generally speaking, α = 1 should be suitable for cases where

Figure 3. Hit rate on site 3699


Figure 6. Hit rate on site 8631

Nevertheless, the filtering step has no effect on the SVD-based model. It can be inferred that the latent factors in SVD capture not only the users' click count information but also the clicking patterns, i.e. the browsing habits, so no filtering step is needed.

For the website with site ID 5202, the results shown in Fig. 4 are quite different from those on 3699. The filtered models (except f-SVD) perform better than the others. IS and BG, as well as f-IS and f-BG, have very similar hit rates. IP got the worst prediction accuracy once again. However, SVD (with 70 factors), the champion of the previous experiment, only got a hit rate of 18.6%. The latent factors do not seem to capture the correlation between users and items very well; thus, latent factors do not work in all cases. f-SVD again brings no improvement over SVD.

Fig. 5 shows the results on the website with site ID 9093. The filtered bipartite graph model performs best. SVD (with 100 factors) has a hit rate similar to that on site 5202. The results show that in some datasets the local relationships among items (captured by BG and IS-like models) play a more crucial role in predicting the next item, while in other datasets capturing the global effects (as SVD-like latent factor models do) is more important.

The prediction results on site 8631 (Fig. 6) look quite different. The SVD and IP models have very high hit rates compared to the others. In this case, we used 94 factors (the same as the number of items) for the SVD model. Note that this website has specific properties: it has quite few items (94) and many users (112,738). When examining the recommended item lists of the SVD model, we found that it only generates one item for each user. In other words, the top-1 recommendation of SVD on this site predicts 95.9% of the users correctly! It works well because it successfully captures almost all characteristics of the items with the latent factors. The item popularity-based model performs even better, since the popular items are welcomed by most users; this differs from the situation in the first dataset. f-SVD is not employed on this site because there is no top-20 list for the filtering step to work on.

An interesting question is why BG and IS performed much worse on this dataset than on the others. In [20], Huang, Z. et al. discussed the sparsity of the user-item rating matrix, which can explain the problem in our experiments. Due to the small number of items and the large number of users, most customers have only visited a few items. Then the number of edges in the BG model connecting to these users is small. Hence, the transition probabilities no longer represent the similarity of products. The same happens in the IS model: an item is similar to almost all other items, with very close similarity values. If the customers in this dataset tend to be interested in several popular items, the prediction accuracy of these models will be very low. This also explains the good results of the IP model in this case.

Figure 4. Hit rate on site 5202

Figure 5. Hit rate on site 9093

TABLE III. PERFORMANCE LIST (PERFORMANCE RANK PER DATASET; LOWER IS BETTER)

Model | 3699 | 5202 | 8631 | 9093 | TOTAL
IP    | 6    | ...  | ...  | ...  | 19 (6)
IS    | 5    | ...  | ...  | ...  | 18 (5)
f-IS  | 3    | ...  | ...  | ...  | 11 (2)
BG    | 4    | ...  | ...  | ...  | 15 (4)
f-BG  | 2    | ...  | ...  | ...  | 8 (1)
SVD   | 1    | ...  | ...  | ...  | 13 (3)

Figure 7. Top-N prediction on site 3699

Figure 8. Top-N prediction on site 8631

Figure 9. Top-N prediction on site 5202

Figure 10. Top-N prediction on site 9093

The figures show increasing trends with greater N for all models on all sites. The differences lie in the slopes of the curves. The filtered models f-BG and f-IS show about a 20% to 50% accuracy improvement over the original models (BG and IS). Note that the candidate lists of the filtered models for top-15 (top-20) recommendation are the top-30 (top-40) result lists from the original models, which puts more correctly predicted items on the candidate list.

In this experiment, 8631 is still a special website compared to the other three sites. The hit rates of SVD and IP reach 95.9% and 98.6% when N = 5, respectively. That means that, for these two models, the accurately recommended items fall in the top-5 list, and most of them even occur in the top-1 list (the hit rates are 95.2% and 98.5% when N = 1, respectively). So on this website, top-5 recommendation is preferable for the SVD and IP models, since the number of items recommended to users is required to be small (users may not be interested in a top-50 or top-100 recommendation list, as there are too many items).

Furthermore, if the dataset has many items but very few customers who have viewed many items, then the number of edges connected to most customers will be very large. Thus, most entries in the transition matrix have small and close values, which prevents the model from finding good items.

As a summary of the top-10 recommendations, Table III gives the performance statistics of the seven models⁶ on the four datasets. The models are ranked by their hit rates: the model that performs best is ranked 1 and the one that performs worst is ranked 6. It is expected that the IP model has the lowest rank (rank 6 in total), since it is only based on the popularity of items. The f-BG and f-IS models take the first two places. SVD also works well in most cases. It can be seen that some models predict very accurately only for certain datasets, while the f-BG and f-IS models have higher average hit rates than the others. Thus, the filtered models, especially f-BG, which take human behavior patterns into consideration, can be suggested as the "won't-be-wrong" model for most recommendation tasks.

Finally, we studied the influence of different values of N on the hit rate in top-N recommendation; the predictions with the seven models are performed on the four sites. Figs. 7-10 show the variation.

V. CONCLUSION AND FUTURE WORK

Recommender systems have been studied for many years, and researchers have proposed models with different mechanisms in this area. Some try to find a model that fits all cases. However, in real-world applications this seems to be impossible, since models have their own preferences: they may perform

6 Since SVD and f-SVD have the same hit rate, we use SVD to stand for both in this context.


quite well on some data but have poor prediction accuracies


on other data. Moreover, the choice of the dataset is an
important issue in this research as well. To our best
knowledge, most papers states that they use public datasets
as their experiment data, such as Netflix, MovieLens, etc.
These datasets have been carefully tweaked and have
explicit rating information in it. Thus they are very suitable
for algorithm study and can provide standard benchmarks.
Nonetheless, this is not always the case for datasets from
real world applications.
Thus, in this paper, we conducted some experiments on
the datasets from an American retargeting company, to find
a good strategy in model selection for specific datasets. The
datasets we adopted are the clicking history of online
shopping websites. They contain no explicit rating
information and have never been tweaked. The experimental
results show that, if a dataset has few items but lots of users,
the SVD-based model and item popularity-based model may
be good choices; if a dataset has many items but less users,
the bipartite graph model and filtered models (except
filtered SVD) are suitable to it. However, if the
recommender system designer has very little knowledge of
the relationship between the number of users and items or is
not sure about which models may fit a particular dataset, the
filtered bipartite graph model should be a good choice due
to its best average performance on our data. The
experimental results also show that the filtering step has no
effect on SVD-based model which indicates that the latent
factors can capture both rating information and user clicking
pattern. This should be a superior point of SVD-based
model over other models.
To enhance the recommendation models discussed in this paper, two or more models could be combined through a linear combination (or some other blending method) of the top-20 recommendation lists generated by the different models. User behavior patterns may also be studied further and incorporated into existing collaborative filtering models. Furthermore, due to the limited information available (no category data, and data collected over a very short time), some characteristics of each model have not been revealed. We believe that more interesting discoveries can be made given richer information.
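A linear combination of per-model top-N lists could be sketched as follows; the rank-based scoring, the weights, and the example lists are illustrative assumptions rather than a blending scheme evaluated in this paper:

```python
def blend_top_n(ranked_lists, weights, n=20):
    """Linearly combine several best-first ranked item lists into one.

    An item's contribution from one list is weight * (list_length - rank),
    so higher-ranked items receive larger scores.
    """
    scores = {}
    for items, w in zip(ranked_lists, weights):
        for rank, item in enumerate(items):
            scores[item] = scores.get(item, 0.0) + w * (len(items) - rank)
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical top-5 lists from two models, blended with weights 0.7 / 0.3.
svd_top = ["a", "b", "c", "d", "e"]
graph_top = ["c", "a", "f", "b", "g"]
blended = blend_top_n([svd_top, graph_top], [0.7, 0.3], n=5)
```

Items endorsed by both models rise to the top of the blended list, while items unique to one model survive only if that model's weight is large enough.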

REFERENCES
[1] G. Adomavicius and A. Tuzhilin, "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734-749, June 2005.
[2] D. Goldberg, D. Nichols, B.M. Oki, and D. Terry, "Using Collaborative Filtering to Weave an Information Tapestry," Communications of the ACM, vol. 35, pp. 61-70, 1992.
[3] J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon, and J. Riedl, "GroupLens: Applying Collaborative Filtering to Usenet News," Communications of the ACM, vol. 40, pp. 77-87, 1997.
[4] J. Breese, D. Heckerman, and C. Kadie, "Empirical Analysis of Predictive Algorithms for Collaborative Filtering," Technical Report MSR-TR-98-12, Microsoft Research, 1998.
[5] G. Kowalski, Information Retrieval Systems: Theory and Implementation, Kluwer Academic Publishers, ISBN 0792399269.
[6] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, "Analysis of Recommendation Algorithms for E-commerce," in Proceedings of the 2nd ACM Conference on Electronic Commerce (EC '00), ACM New York, Oct. 2000, pp. 158-167.
[7] M. Papagelis and D. Plexousakis, "Qualitative Analysis of User-Based and Item-Based Prediction Algorithms for Recommendation Agents," Engineering Applications of Artificial Intelligence, vol. 18, no. 7, pp. 781-789, Oct. 2005.
[8] A. Paterek, "Improving Regularized Singular Value Decomposition for Collaborative Filtering," in Proceedings of KDD Cup and Workshop (KDDCup '07), ACM New York, Aug. 2007, pp. 2-5.
[9] S. Park, D. Pennock, O. Madani, N. Good, and D. DeCoste, "Naïve Filterbots for Robust Cold-Start Recommendations," in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06), ACM New York, Aug. 2006, pp. 699-705.
[10] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, "GroupLens: An Open Architecture for Collaborative Filtering of Netnews," in Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work (CSCW '94), ACM New York, Oct. 1994, pp. 175-186.
[11] P. Cremonesi, Y. Koren, and R. Turrin, "Performance of Recommender Algorithms on Top-N Recommendation Tasks," in Proceedings of the Fourth ACM Conference on Recommender Systems (RecSys '10), ACM New York, Sep. 2010, pp. 39-46.
[12] M.W. Berry, "Large-Scale Sparse Singular Value Computations," International Journal of Supercomputer Applications, vol. 6, no. 1, pp. 13-49, 1992.
[13] C. Hsu, H. Chung, and H. Huang, "Mining Skewed and Sparse Transaction Data for Personalized Shopping Recommendation," Machine Learning, vol. 57, pp. 35-59, 2004.
[14] J.A. Bondy, Graph Theory with Applications, North-Holland, ISBN 0-444-19451-7, 1976.
[15] G.F. Lawler, Introduction to Stochastic Processes, 2nd ed., Chapman & Hall, ISBN 978-1584886518, 2006.
[16] M. Li, B.M. Dias, I. Jarman, W. El-Deredy, and P. Lisboa, "Grocery Shopping Recommendations Based on Basket-Sensitive Random Walk," in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09), ACM New York, Jun. 2009, pp. 1215-1223.
[17] Q. Zhao and S. Bhowmick, "Sequential Pattern Mining: A Survey," Technical Report, CAIS, Nanyang Technological University, Singapore.
[18] P. Huang, "Mining Closed Sequential Patterns in Data Stream Environment," Master's Thesis, Ming Chuan University, Taiwan, 2009.
[19] J. Wang, J. Han, and C. Li, "Frequent Closed Sequence Mining without Candidate Maintenance," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 8, pp. 1001-1015, 2007.
[20] Z. Huang, D. Zeng, and H. Chen, "A Link Analysis Approach to Recommendation under Sparse Data," in Proceedings of the Tenth Americas Conference on Information Systems (AMCIS '04), ACM New York, Aug. 2004, pp. 1997-2005.
[21] J. Bennett and S. Lanning, "The Netflix Prize," in KDD Cup and Workshop, 2007. www.netflixprize.com
[22] Y. Koren, "Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model," in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), ACM New York, Aug. 2008, pp. 426-434.