1
Introduction to
Recommendation System
Presented by HongBo Deng
Nov 14, 2006
Refer to the PPT from Stanford: Anand Rajaraman, Jeffrey D.
Ullman
2
Netflix Prize - $1,000,000 Prize
Netflix recently announced
the Netflix Prize, which will
award $1 million for an
algorithm that outperforms its
recommendation approach,
Cinematch, by 10%.
3
Outline
 What are recommendation systems?
 Three recommendation approaches
 Content-based
 Collaborative
 Hybrid approach
 Conclusions
 Review of my previous work
4
What are recommendation systems?
Items
Search Recommendations
Products, web sites, blogs, news items, …
Recommendation systems are
programs which attempt to predict
items that a user may be interested in
5
Recommendation Types
 Editorial
 Simple aggregates
 Top 10, Most Popular, Recent Uploads
 Tailored to individual users
 Amazon, Netflix, …
 Books, CDs, other products at amazon.com
 Movies by Netflix, MovieLens
6
Formal Model
 C = set of Customers
 S = set of Items, e.g. books, movies
 The space S of possible items and the
user space C can be very large.
 Utility function $u : C \times S \to R$
 R = set of ratings
 R is a totally ordered set
 e.g., 0-5 stars, real number in [0,1]
7
Utility Matrix
[Matrix figure: rows Alice, Bob, Carol, David; columns King Kong, LOTR, Matrix, Nacho Libre; known ratings 0.4, 1, 0.2, 0.3, 0.5, 0.2, 1; all other cells empty]
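A sparse utility matrix of this kind could be stored as a nested dictionary that holds only the known ratings. The sketch below is a minimal illustration: the cell values and the helper names (`utility`, `unrated_items`, `sparsity`) are assumptions, not the layout of the slide's figure.

```python
# Sparse utility matrix: only known ratings are stored.
# The cell values are illustrative assumptions, not the slide's layout.
ratings = {
    "Alice": {"King Kong": 1.0, "Matrix": 0.2},
    "Bob": {"LOTR": 0.5, "Nacho Libre": 0.3},
    "Carol": {"King Kong": 0.2, "Matrix": 1.0},
    "David": {"Nacho Libre": 0.4},
}
ITEMS = ["King Kong", "LOTR", "Matrix", "Nacho Libre"]


def utility(user, item):
    """Return the known rating u(user, item), or None if unrated."""
    return ratings.get(user, {}).get(item)


def unrated_items(user):
    """Items for which u(user, item) still has to be estimated."""
    return [item for item in ITEMS if utility(user, item) is None]


def sparsity():
    """Fraction of the user-item cells that are empty."""
    known = sum(len(row) for row in ratings.values())
    return 1 - known / (len(ratings) * len(ITEMS))
```

Storing only the known cells keeps memory proportional to the number of ratings rather than to |C| × |S|, which matters when both sets are very large.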
8
Recommendation Process
 Collecting “known” ratings for matrix
 Extrapolate unknown ratings from
known ratings
 Estimate ratings for the items that have not
been seen by a user
 Recommend the items with the highest
estimated ratings to a user
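The last two steps of this process can be sketched independently of the extrapolation method. In the sketch below, `recommend` and `global_item_mean` are hypothetical names, and the item-mean estimator is only a placeholder baseline (an assumption), standing in for the content-based or collaborative estimators discussed later.

```python
def recommend(user, known, estimate, items, n=2):
    """Return the n unseen items with the highest estimated rating.

    known:    dict user -> {item: rating} of collected ratings
    estimate: callable (user, item) -> float, any extrapolation method
    """
    seen = known.get(user, {})
    candidates = [(estimate(user, item), item) for item in items if item not in seen]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in candidates[:n]]


def global_item_mean(known):
    """Baseline estimator (an assumption): predict each item's mean known rating."""
    per_item = {}
    for row in known.values():
        for item, r in row.items():
            per_item.setdefault(item, []).append(r)
    all_ratings = [r for rs in per_item.values() for r in rs]
    default = sum(all_ratings) / len(all_ratings)  # used for never-rated items
    means = {item: sum(rs) / len(rs) for item, rs in per_item.items()}
    return lambda user, item: means.get(item, default)
```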
9
Collecting Ratings
 Explicit data collection
 Ask people to rate items
 Doesn’t work well in practice – people can’t
be bothered
 Implicit data collection
 Learn ratings from user actions
 e.g., purchase implies high rating
 What about low ratings?
10
Extrapolating Utilities
 Key problem: matrix U is sparse
 most people have not rated most items
 Three approaches
 Content-based recommendation
 Collaborative recommendation
 Hybrid recommendation
11
Content-based recommendations
 Main idea: recommend items to
customer C similar to previous items
rated highly by C
 Movie recommendations
 recommend movies with same actor(s),
director, genre, …
 Websites, blogs, news
 recommend other sites with “similar”
content
12
Plan of action
[Diagram: user likes items (red, circles, triangles) → build item profiles → build user profile → match → recommend]
13
Item Profiles
 For each item, create an item profile
 Profile is a set of features
 movies: author, title, actor, director,…
 text: set of “important” words in document
 How to pick important words?
 Usual heuristic is TF.IDF (Term Frequency
times Inverse Doc Frequency)
14
TF.IDF
$f_{ij}$ = frequency of term $t_i$ in document $d_j$
$TF_{ij} = f_{ij} / \max_k f_{kj}$
$n_i$ = number of docs that mention term $i$
$N$ = total number of docs
$IDF_i = \log(N / n_i)$
TF.IDF score $w_{ij} = TF_{ij} \times IDF_i$
Doc profile = set of words with highest
TF.IDF scores, together with their scores
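A document profile built from TF.IDF scores could be computed roughly as follows. This is a sketch under assumptions: it uses the common definitions TF = f_ij / max_k f_kj and IDF = log(N / n_i), it expects every document to contain at least one term, and the function name `tf_idf_profiles` and the `top_k` parameter are hypothetical.

```python
import math
from collections import Counter


def tf_idf_profiles(docs, top_k=3):
    """docs: dict doc_id -> non-empty list of terms.

    Returns doc_id -> {term: score} holding the top_k highest TF.IDF terms.
    """
    n_docs = len(docs)  # N
    # n_i: number of documents that mention term i
    doc_freq = Counter(term for terms in docs.values() for term in set(terms))
    profiles = {}
    for doc_id, terms in docs.items():
        counts = Counter(terms)            # f_ij
        max_count = max(counts.values())   # max_k f_kj
        scores = {
            term: (f / max_count) * math.log(n_docs / doc_freq[term])
            for term, f in counts.items()
        }
        top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        profiles[doc_id] = dict(top)
    return profiles
```

A term that occurs in every document gets IDF = log(1) = 0, so it drops out of the profile however often it appears.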
15
User profiles and prediction
 User profile possibilities:
 Weighted average of rated item profiles
 Variation: weight by difference from average
rating for item
 …
 Traditional heuristic
 Given user profile c and item profile s,
estimate u(c,s) = cos(c,s) = c.s/(|c||s|)
 Need efficient method to find items with
high utility
 E.g.
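As one illustration of the heuristic, the sketch below builds a user profile as a rating-weighted average of item profiles and scores a candidate item by cosine similarity. The dictionary-based sparse vectors and the names `build_user_profile` and `cosine` are assumptions made for the example.

```python
import math


def build_user_profile(rated_items, item_profiles):
    """Rating-weighted average of the profiles of the items a user rated.

    rated_items:   dict item -> rating (non-negative)
    item_profiles: dict item -> {feature: weight}
    """
    total = sum(rated_items.values())
    if total == 0:
        return {}
    profile = {}
    for item, rating in rated_items.items():
        for feature, weight in item_profiles[item].items():
            profile[feature] = profile.get(feature, 0.0) + rating * weight / total
    return profile


def cosine(c, s):
    """cos(c, s) = c.s / (|c||s|) for sparse vectors stored as dicts."""
    dot = sum(w * s.get(f, 0.0) for f, w in c.items())
    norm_c = math.sqrt(sum(w * w for w in c.values()))
    norm_s = math.sqrt(sum(w * w for w in s.values()))
    if norm_c == 0 or norm_s == 0:
        return 0.0
    return dot / (norm_c * norm_s)
```

Scoring every item this way is linear in the catalogue size per user, which is why the slide asks for an efficient method to find high-utility items.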
16
Model-based approaches
 For each user, learn a classifier that
classifies items into rating classes
 liked by user and not liked by user
 e.g., Bayesian, regression, SVM
 Apply classifier to each item to find
recommendation candidates
 Problem: scalability
17
Limitations of content-based approach
 Finding the appropriate features
 e.g., images, movies, music
 Overspecialization
 Never recommends items outside user’s
content profile
 People might have multiple interests
 Recommendations for new users
 How to build a profile?
 A new user, having very few ratings, would
not be able to get accurate
recommendations.
18
Collaborative Filtering
 Consider user c
 Find set D of other users whose ratings
are “similar” to c’s ratings
 Estimate user’s ratings based on ratings
of users in D
[Diagram: the ratings of user c are compared with the ratings of a set of other users; the similar users' ratings are used to estimate c's ratings]
19
Similar users
 Let $r_x$ be the vector of user x’s ratings
 Cosine similarity measure
 $sim(x,y) = \cos(r_x, r_y)$
 Pearson correlation coefficient
 $S_{xy}$ = items rated by both users x and y
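Both similarity measures can be sketched over rating dictionaries. The sketch assumes that each measure is restricted to the co-rated items S_xy and that the Pearson means are taken over S_xy as well; other formulations use each user's overall mean. The function names are hypothetical.

```python
import math


def cosine_sim(rx, ry):
    """cos(r_x, r_y), restricted to the co-rated items S_xy."""
    common = set(rx) & set(ry)
    if not common:
        return 0.0
    dot = sum(rx[s] * ry[s] for s in common)
    norm_x = math.sqrt(sum(rx[s] ** 2 for s in common))
    norm_y = math.sqrt(sum(ry[s] ** 2 for s in common))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def pearson(rx, ry):
    """Pearson correlation of two users' ratings over S_xy."""
    common = set(rx) & set(ry)  # S_xy
    if len(common) < 2:
        return 0.0
    mean_x = sum(rx[s] for s in common) / len(common)
    mean_y = sum(ry[s] for s in common) / len(common)
    num = sum((rx[s] - mean_x) * (ry[s] - mean_y) for s in common)
    den_x = math.sqrt(sum((rx[s] - mean_x) ** 2 for s in common))
    den_y = math.sqrt(sum((ry[s] - mean_y) ** 2 for s in common))
    if den_x == 0 or den_y == 0:
        return 0.0
    return num / (den_x * den_y)
```

Pearson subtracts each user's mean first, so a generous rater and a harsh rater with the same preferences still come out as similar, whereas plain cosine on raw ratings does not correct for that.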
20
Rating predictions
 Let D be the set of k users that are the
most similar to c and who have rated
item s
 Possibilities for prediction function (item s):
 $r_{cs} = \frac{1}{k} \sum_{d \in D} r_{ds}$
 $r_{cs} = \frac{\sum_{d \in D} sim(c,d) \times r_{ds}}{\sum_{d \in D} sim(c,d)}$
 Other options?
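The two prediction functions on this slide can be sketched as below. The helper names are hypothetical, `sim` is any user-user similarity passed in as a callable, and the fallback to the plain mean when the weights sum to zero is an assumption of the sketch.

```python
def top_k_neighbors(user, item, ratings, sim, k):
    """D: the k users most similar to `user` among those who rated `item`."""
    candidates = [d for d in ratings if d != user and item in ratings[d]]
    candidates.sort(key=lambda d: sim(user, d), reverse=True)
    return candidates[:k]


def predict_mean(neighbors, item, ratings):
    """r_cs = (1/k) * sum of r_ds over the neighbours d in D."""
    values = [ratings[d][item] for d in neighbors]
    return sum(values) / len(values)


def predict_weighted(user, neighbors, item, ratings, sim):
    """r_cs = sum(sim(c,d) * r_ds) / sum(sim(c,d)) over d in D."""
    weights = [sim(user, d) for d in neighbors]
    total = sum(weights)
    if total == 0:
        return predict_mean(neighbors, item, ratings)
    return sum(w * ratings[d][item] for w, d in zip(weights, neighbors)) / total
```

With a similarity that can be negative, such as Pearson correlation, the denominator may shrink or change sign; a common variant sums absolute similarities instead.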
21
Complexity
 Expensive step is finding k most similar
customers
 O(|U|)
 Too expensive to do at runtime
 Need to pre-compute
 Naïve precomputation takes time O(N|U|)
 Simple trick gives some speedup
 Can use clustering, partitioning as
alternatives, but quality degrades
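The naïve precomputation can be sketched as an offline pass that stores each user's k nearest neighbours, so that only a table lookup remains at runtime. The function name and the table layout are assumptions; the unspecified "simple trick" from the slide is not reproduced here.

```python
def precompute_neighbors(ratings, sim, k):
    """Offline step: for every user, store the k most similar other users.

    ratings: dict user -> {item: rating}
    sim:     callable (ratings_of_x, ratings_of_y) -> float
    """
    table = {}
    users = list(ratings)
    for x in users:
        # One full scan of all other users per user: the quadratic naive pass.
        scored = [(sim(ratings[x], ratings[y]), y) for y in users if y != x]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        table[x] = [y for _, y in scored[:k]]
    return table
```

The table can be rebuilt periodically rather than on every request, since a user's nearest neighbours usually change slowly.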
22
Item-Item Collaborative Filtering
 So far: User-user collaborative filtering
 Another view
 For item s, find other similar items
 Estimate rating for item based on ratings for
similar items
 Can use same similarity metrics and
prediction functions as in user-user model
 In practice, it has been observed that
item-item often works better than
user-user
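The item-item view can be sketched by transposing the rating data and reusing the weighted-average prediction. Cosine similarity over item columns and the parameter `k` are choices made for this sketch, not something the slide prescribes, and the function names are hypothetical.

```python
import math


def item_vectors(ratings):
    """Transpose user -> {item: r} into item -> {user: r}."""
    columns = {}
    for user, row in ratings.items():
        for item, r in row.items():
            columns.setdefault(item, {})[user] = r
    return columns


def item_cosine(a, b):
    """Cosine similarity between two item columns (dicts user -> rating)."""
    common = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def predict_item_item(user, item, ratings, k=2):
    """Estimate r(user, item) from the user's ratings of the k most similar items."""
    columns = item_vectors(ratings)
    target = columns.get(item, {})
    scored = [(item_cosine(target, columns[other]), r)
              for other, r in ratings[user].items() if other != item]
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(s for s, _ in scored)
    if total == 0:
        return None  # no similar rated item, so no prediction
    return sum(s * r for s, r in scored) / total
```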
23
Pros and cons of collaborative
filtering
 Works for any kind of item
 No feature selection needed
 New user problem
 The same problem as with content-based
system
 New item problem
 Sparsity of rating matrix
24
Hybrid Methods
 Implement two separate recommenders
and combine their predictions
 Add content-based methods to
collaborative approach
 item profiles for new item problem
 deal with sparsity-related problems
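The first hybrid option, combining the predictions of two separate recommenders, could look like the linear blend below. The `weight` parameter, the function name, and the fallback behaviour are assumptions of this sketch.

```python
def hybrid_predict(user, item, collaborative, content_based, weight=0.5):
    """Linear blend of two recommenders; fall back to whichever one can predict.

    collaborative, content_based: callables (user, item) -> float or None.
    weight: share given to the collaborative prediction.
    """
    cf = collaborative(user, item)
    cb = content_based(user, item)
    if cf is None:
        return cb  # e.g. a new item that nobody has rated yet
    if cb is None:
        return cf
    return weight * cf + (1 - weight) * cb
```

The fallback is what addresses the new item problem: an item with a profile but no ratings still receives a content-based estimate.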
25
Evaluating Recommendations
 Precision
 Accuracy of predictions
 Compare predictions with known ratings, e.g.
root-mean-square error (RMSE)
 Receiver operating characteristic (ROC)
 Tradeoff curve between false positives and false
negatives
 Recommendation Quality
 Top-n measures (e.g., Breese score)
 Item-Set Coverage
 Number of items/users for which system can
make predictions
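Three of these measures are simple enough to sketch directly. The function names and data layouts are assumptions, and precision at n is used here as a plain top-n measure; it is not the Breese score.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error over the (user, item) keys present in both dicts."""
    keys = predicted.keys() & actual.keys()
    return math.sqrt(sum((predicted[k] - actual[k]) ** 2 for k in keys) / len(keys))


def coverage(predict, pairs):
    """Fraction of (user, item) pairs for which predict returns a value."""
    made = sum(1 for user, item in pairs if predict(user, item) is not None)
    return made / len(pairs)


def precision_at_n(recommended, relevant, n):
    """Share of the top-n recommended items that the user actually liked."""
    top = recommended[:n]
    return sum(1 for item in top if item in relevant) / n
```

RMSE is computed on held-out known ratings, so it measures prediction accuracy only for items users chose to rate.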
26
Conclusions
 Content-based
 The user will be recommended items similar to the
ones the user preferred in the past
 Collaborative
 The user will be recommended items that people
with similar tastes and preferences liked in the past
 Hybrid
 Combine collaborative and content-based methods
27
Review of my previous work
28
Facial Expression Recognition
Preprocessing procedure
Rotate to line up
eye coordinates
Locate & Crop
Face Region
Geometrical
Normalize
Gabor Feature
Extraction
Normalize
PCA&LDA
Translation Matrix
Train Phase
Templates
Test Phase
Histogram
Equalization
Distance
Classifier
29
Image Stitching
Feature Points
extraction
Correlation
Match
RANSAC to eliminate
false match
points
Build the Model
Perspective model
Image alignment
Image Stitching
Demo
30
Any questions or suggestions
 Thank you


Editor's Notes

  • #3 Netflix, the DVD rental company, recently announced the Netflix Prize, which will award $1 million for an algorithm that outperforms its recommendation approach, Cinematch, by 10%. To qualify for the $1,000,000 Grand Prize, the accuracy of the submitted predictions on the qualifying set must be at least 10% better than the accuracy Cinematch can achieve on the same training data set at the start of the contest.
  • #5 Users usually rely on search engines to find information. Recommendation systems are a useful alternative to search algorithms, since they help users discover items they might not have found by themselves.
  • #6 The most traditional recommendations are editorial. Another kind is based on simple aggregates. More recent recommendation systems tailor their recommendations to individual users.
  • #7 Let C be the set of all users and let S be the set of all possible items that can be recommended, such as books, movies, or restaurants. The space S of possible items can be very large, ranging in hundreds of thousands or even millions of items in some applications, such as recommending books or CDs. Similarly, the user space can also be very large, millions in some cases. Let u be a utility function that measures the usefulness of item s to user c, i.e., $u : C \times S \to R$, where R is a totally ordered set (e.g., nonnegative integers or real numbers within a certain range).
  • #8 Here is an example of a user-item rating matrix for a movie recommendation application. Some of the ratings are empty, which means that the users have not rated the corresponding movies. In its most common formulation, the recommendation problem is reduced to the problem of estimating ratings for the items that have not been seen by a user.
  • #9 In recommender systems, utility is typically represented by ratings and is initially defined only on the items previously rated by the users. As demonstrated above in the utility Matrix, some of the ratings are empty, which means that the users have not rated the corresponding movies. Therefore, the recommendation engine should be able to estimate (predict) the ratings of the nonrated movie/user combinations and issue appropriate recommendations based on these predictions.
  • #10 Examples of explicit data collection include: asking a user to rate an item on a sliding scale; asking a user to rank a collection of items from favorite to least favorite; presenting two items to a user and asking him/her to choose the better one; asking a user to create a list of items that he/she likes. Examples of implicit data collection include: observing the items that a user views in an online store; analyzing item/user viewing times; keeping a record of the items that a user purchases online; obtaining a list of items that a user has listened to or watched on his/her computer.
  • #11 Recommender systems are usually classified into the following categories, based on how recommendations are made: content-based, collaborative, and hybrid.
  • #12 Recommendations are based on the items that the same user has previously rated highly. For example, in a movie recommendation application, in order to recommend movies to user c, the content-based recommender system tries to understand the commonalities among the movies user c has rated highly in the past (specific actors, directors, genres, subject matter, etc.). Then, only the movies that have a high degree of similarity to the user’s preferences would be recommended.
  • #14 Item profile is defined with a set of features. For example, in a movie recommendation application, each movie can be represented by its author, title, actor, director, year of release, etc. One of the best-known measures for specifying keyword weights in Information Retrieval is the Term Frequency/Inverse Document Frequency measure.
  • #15 $TF_{ij}$, the term frequency (or normalized frequency) of keyword $k_i$ in document $d_j$, is defined as $TF_{ij} = f_{ij} / \max_k f_{kj}$, where the maximum is taken over all keywords that appear in $d_j$. The inverse document frequency for keyword $k_i$ is usually defined as $IDF_i = \log(N / n_i)$.
  • #16 After we get the item profiles, how can we build the user profiles? As stated earlier, content-based systems recommend items similar to those that a user liked in the past, so an averaging approach, such as a weighted average of the rated item profiles, can be used to build the user profile. Once the user profile is built, one traditional heuristic is to define the utility function u(c,s) as the cosine similarity between the user profile and the item profile. For example, if user c reads many online articles on the topic of bioinformatics, then content-based recommendation techniques will be able to recommend other bioinformatics articles to user c.
  • #17 Besides the traditional heuristics that are based mostly on information retrieval methods, other techniques for content-based recommendation have also been used, such as Bayesian classifiers and various machine learning techniques.
  • #18 The user has to rate a lot of items before a content-based recommender system can really understand the user’s preferences and present the user with reliable recommendations. Therefore, a new user, having very few ratings, would not be able to get accurate recommendations.
  • #19 Unlike content-based recommendation methods, collaborative recommender systems (or collaborative filtering systems) try to predict the utility of items for a particular user based on the items previously rated by other users. For example: in a movie recommendation application, in order to recommend movies to user c, the collaborative recommender system tries to find the “peers” of user c, i.e., other users that have similar tastes in movies (rate the same movies similarly). Then, only the movies that are most liked by the “peers” of user c would be recommended.
  • #20 Various approaches have been used to compute the similarity sim(x,y) between users in collaborative recommender systems. In most of the approaches, the similarity between two users is based on their ratings of items that both users have rated. The two most popular approaches are cosine and correlation based. To present them, let … Note that both the content-based and the collaborative approaches use the same cosine measure from information retrieval literature. However, in content-based recommender systems, it is used to measure the similarity between vectors of TF-IDF weights, whereas, in collaborative systems, it measures the similarity between vectors of the actual user-specified ratings. Sxy is the intersection of sets Sx and Sy.
  • #21 The aggregation can be a simple average. However, the most common aggregation approach is to use the weighted sum: the more similar users c and d are, the more weight rating $r_{ds}$ will carry in the prediction of $r_{cs}$.
  • #22 One common strategy is to calculate all user similarities sim(x, y) (including the calculation of Sxy) in advance and recalculate them only once in a while (since the network of peers usually does not change dramatically in a short time). Then, whenever the user asks for a recommendation, the ratings can be efficiently calculated on demand using precomputed similarities.
  • #23 Item-based algorithms first determine the similarities between the various items and then use them to identify the set of items to be recommended. The key steps in this class of algorithms are (i) the method used to compute the similarity between the items, and (ii) the method used to combine these similarities in order to compute the similarity between a basket of items and a candidate item to recommend. A published experimental evaluation on eight real datasets shows that these item-based algorithms are up to two orders of magnitude faster than the traditional user-neighborhood based recommender systems and provide recommendations with comparable or better quality.
  • #24 Collaborative systems can deal with any kind of content and recommend any items, even the ones that are dissimilar to those seen in the past. However, collaborative systems have their own limitations. New user problem: It is the same problem as with content-based systems. In order to make accurate recommendations, the system must first learn the user’s preferences from the ratings that the user gives. New item problem: New items are added regularly to recommender systems. Collaborative systems rely solely on users’ preferences to make recommendations. Therefore, until the new item is rated by a substantial number of users, the recommender system would not be able to recommend it. Sparsity of rating matrix: In any recommender system, the number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted. For example, in the movie recommendation system, there may be many movies that have been rated by only a few people, and these movies would be recommended very rarely, even if those few users gave high ratings to them. One way to overcome the problem of rating sparsity is to use user profile information when calculating user similarity.
  • #25 Several recommendation systems use a hybrid approach that combines collaborative and content-based methods, which helps to avoid certain limitations of each. Here are two different ways to combine collaborative and content-based methods:
  • #27 In this talk, we gave a brief introduction to what a recommendation system is and reviewed three recommendation approaches.