Introduction to recommendation system

1
Introduction to
Recommendation System
Presented by HongBo Deng
Nov 14, 2006
Based on the Stanford slides by Anand Rajaraman and Jeffrey D.
Ullman

2
Netflix Prize - $1,000,000 Prize
Netflix recently announced
the Netflix Prize: it will
award $1 million for an
algorithm that outperforms
its recommendation system,
Cinematch, by 10%.

3
Outline
 What are recommendation systems?
 Three recommendation approaches
 Content-based
 Collaborative
 Hybrid approach
 Conclusions
 Review of my previous work

4
What are recommendation systems?
Items: products, web sites, blogs, news items, …
Users reach items through search or recommendations
Recommendation systems are
programs which attempt to predict
items that a user may be interested in

5
Recommendation Types
 Editorial
 Simple aggregates
 Top 10, Most Popular, Recent Uploads
 Tailored to individual users
 Amazon, Netflix, …
 Books, CDs, other products at amazon.com
 Movies by Netflix, MovieLens

6
Formal Model
 C = set of Customers
 S = set of Items, e.g. books, movies
 The space S of possible items and the
user space C can be very large.
 Utility function u: C × S → R
 R = set of ratings
 R is a totally ordered set
 e.g., 0-5 stars, real number in [0,1]
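A minimal sketch of this model, assuming ratings in [0,1] stored sparsely in a dictionary keyed by (customer, item). The names `Utility`, `rating`, `density` and the sample data are illustrative assumptions, not part of the slides.

```python
from typing import Dict, Optional, Set, Tuple

# Sparse utility function u: C x S -> R, stored only where a rating is known.
# The customers, items, and ratings below are made up for illustration.
Utility = Dict[Tuple[str, str], float]

u: Utility = {
    ("alice", "book_a"): 0.9,
    ("alice", "movie_b"): 0.2,
    ("bob", "movie_b"): 0.7,
}


def rating(u: Utility, customer: str, item: str) -> Optional[float]:
    """Return the known rating u(c, s), or None if the pair is unrated."""
    return u.get((customer, item))


def density(u: Utility, customers: Set[str], items: Set[str]) -> float:
    """Fraction of the |C| x |S| matrix that is actually filled in."""
    return len(u) / (len(customers) * len(items))
```

Storing only known pairs keeps memory proportional to the number of ratings rather than to |C| × |S|, which matters when both sets are very large.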

7
Utility Matrix
Rows (customers): Alice, Bob, Carol, David
Columns (movies): King Kong, LOTR, Matrix, Nacho Libre
Known ratings (most cells are empty): 0.4, 1, 0.2, 0.3, 0.5, 0.2, 1

8
Recommendation Process
 Collect “known” ratings for matrix
 Extrapolate unknown ratings from
known ratings
 Estimate ratings for the items that have not
been seen by a user
 Recommend the items with the highest
estimated ratings to a user
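The last two steps can be sketched as follows, assuming some rating estimator is already available. The function `recommend` and its parameters are hypothetical names for illustration; `estimate` stands in for any of the approaches discussed later.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def recommend(
    user: str,
    items: Iterable[str],
    known: Dict[Tuple[str, str], float],
    estimate: Callable[[str, str], float],
    n: int = 3,
) -> List[Tuple[str, float]]:
    """Estimate ratings for items the user has not rated; return the top n."""
    unseen = [s for s in items if (user, s) not in known]
    scored = [(s, estimate(user, s)) for s in unseen]
    # Highest estimated rating first; ties broken by item name for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]
```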

9
Collecting Ratings
 Explicit data collection
 Ask people to rate items
 Doesn’t work well in practice – people can’t
be bothered
 Implicit data collection
 Learn ratings from user actions
 e.g., purchase implies high rating
 What about low ratings?
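One possible way to derive ratings from user actions is sketched below. The action names and their weights in `ACTION_RATING` are assumptions chosen for illustration only; the slides do not prescribe any mapping.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical action weights: assumptions for illustration, not from the slides.
ACTION_RATING = {"purchase": 1.0, "add_to_cart": 0.6, "view": 0.2}


def implicit_ratings(
    events: Iterable[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], float]:
    """Turn (user, item, action) events into ratings, keeping the strongest signal."""
    ratings: Dict[Tuple[str, str], float] = {}
    for user, item, action in events:
        value = ACTION_RATING.get(action)
        if value is None:
            continue  # unknown action: no evidence either way
        key = (user, item)
        ratings[key] = max(ratings.get(key, 0.0), value)
    return ratings
```

Note that a missing event leaves the pair unrated rather than rated low, which is exactly the open question about low ratings raised above.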

10
Extrapolating Utilities
 Key problem: matrix U is sparse
 most people have not rated most items
 Three approaches
 Content-based recommendation
 Collaborative recommendation
 Hybrid recommendation

11
Content-based recommendations
 Main idea: recommend items to
customer C similar to previous items
rated highly by C
 Movie recommendations
 recommend movies with same actor(s),
director, genre, …
 Websites, blogs, news
 recommend other sites with “similar”
content

12
Plan of action
Diagram: user likes items (Red, Circles, Triangles) → build item profiles → build user profile → match → recommend

13
Item Profiles
 For each item, create an item profile
 Profile is a set of features
 movies: author, title, actor, director,…
 text: set of “important” words in document
 How to pick important words?
 Usual heuristic is TF.IDF (Term Frequency
times Inverse Doc Frequency)

14
TF.IDF
f_ij = frequency of term t_i in document d_j
n_i = number of docs that mention term t_i
N = total number of docs
TF.IDF score w_ij = TF_ij × IDF_i
Doc profile = set of words with highest
TF.IDF scores, together with their scores
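A minimal sketch of the scoring, assuming the common definitions TF_ij = f_ij / max_k f_kj and IDF_i = log(N / n_i). The slide text does not spell these two formulas out, so they are an assumption here, as are the function names `tf_idf` and `profile`.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: w_ij} dict per tokenized document."""
    n_docs = len(docs)
    # n_i: number of documents that mention term i
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)  # f_ij for this document
        max_count = max(counts.values(), default=1)
        scores.append({
            # Assumed: TF_ij = f_ij / max_k f_kj, IDF_i = log(N / n_i)
            term: (f / max_count) * math.log(n_docs / doc_freq[term])
            for term, f in counts.items()
        })
    return scores


def profile(weights: Dict[str, float], k: int) -> Dict[str, float]:
    """Doc profile: the k terms with the highest TF.IDF scores."""
    top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)
```

A term that appears in every document gets IDF = log(1) = 0, so it never enters a profile ahead of rarer terms.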

15
User profiles and prediction
 User profile possibilities:
 Weighted average of rated item profiles
 Variation: weight by difference from average
rating for item
 …
 Traditional heuristic
 Given user profile c and item profile s,
estimate u(c,s) = cos(c,s) = c·s / (|c| |s|)
 Need efficient method to find items with
high utility
 E.g.
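The weighted-average profile and the cosine heuristic can be sketched as follows, assuming profiles are sparse feature-to-weight dictionaries. `Vector`, `user_profile`, and `cosine` are illustrative names, and weighting by the raw rating is just one of the possibilities listed above.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # feature -> weight


def user_profile(rated: List[Tuple[Vector, float]]) -> Vector:
    """Rating-weighted average of the profiles of items the user rated."""
    total = sum(r for _, r in rated)
    profile: Vector = {}
    if total == 0:
        return profile  # no positive ratings: empty profile
    for item, r in rated:
        for feature, w in item.items():
            profile[feature] = profile.get(feature, 0.0) + r * w / total
    return profile


def cosine(c: Vector, s: Vector) -> float:
    """cos(c, s) = c.s / (|c| |s|); 0.0 if either vector is all zeros."""
    dot = sum(w * s.get(f, 0.0) for f, w in c.items())
    norm_c = math.sqrt(sum(w * w for w in c.values()))
    norm_s = math.sqrt(sum(w * w for w in s.values()))
    if norm_c == 0 or norm_s == 0:
        return 0.0
    return dot / (norm_c * norm_s)
```

This scores one item at a time; it does not address the need for an efficient search over all items.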

16
Model-based approaches
 For each user, learn a classifier that
classifies items into rating classes
 liked by user and not liked by user
 e.g., Bayesian, regression, SVM
 Apply classifier to each item to find
recommendation candidates
 Problem: scalability
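As one illustration of a per-user classifier, here is a small naive Bayes model over sets of item features with Laplace smoothing. The class `LikeClassifier` and its two rating classes (liked, not liked) are assumptions for this sketch; a regression or SVM model would fill the same role.

```python
import math
from collections import defaultdict
from typing import List, Set, Tuple


class LikeClassifier:
    """Bernoulli naive Bayes over item feature sets; train one per user."""

    def fit(self, examples: List[Tuple[Set[str], bool]]) -> "LikeClassifier":
        self.class_count = {True: 0, False: 0}
        self.feature_count = {True: defaultdict(int), False: defaultdict(int)}
        self.vocab: Set[str] = set()
        for features, liked in examples:
            self.class_count[liked] += 1
            for f in features:
                self.feature_count[liked][f] += 1
                self.vocab.add(f)
        return self

    def _log_score(self, features: Set[str], liked: bool) -> float:
        n = self.class_count[liked]
        total = sum(self.class_count.values())
        score = math.log((n + 1) / (total + 2))  # smoothed class prior
        for f in self.vocab:
            p = (self.feature_count[liked][f] + 1) / (n + 2)  # Laplace smoothing
            score += math.log(p if f in features else 1 - p)
        return score

    def predict(self, features: Set[str]) -> bool:
        """True if the item is more likely liked than not liked."""
        return self._log_score(features, True) > self._log_score(features, False)
```

Because every user needs a separate model and every candidate item must be scored, the cost grows with users × items, which is the scalability problem noted above.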

17
Limitations of content-based approach
 Finding the appropriate features
 e.g., images, movies, music
 Overspecialization
 Never recommends items outside user’s
content profile
 People might have multiple interests
 Recommendations for new users
 How to build a profile?
 A new user, having very few ratings, would
not be able to get accurate
recommendations.

18
Collaborative Filtering
 Consider user c
 Find set D of other users whose ratings
are “similar” to c’s ratings
 Estimate user’s ratings based on ratings
of users in D
Set of other users
Similar
Ratings
Ratings
Estimate

19
Similar users
 Let r_x be the vector of user x’s ratings
 Cosine similarity measure
 sim(x,y) = cos(r_x, r_y)
 Pearson correlation coefficient
 S_xy = items rated by both users x and y
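Both measures can be sketched as below. The Pearson formula is not spelled out in the slide text, so this uses the standard correlation restricted to S_xy, with each user's mean taken over the co-rated items; that choice, and the names `cosine_sim` and `pearson_sim`, are assumptions.

```python
import math
from typing import Dict

Ratings = Dict[str, float]  # item -> rating, for one user


def cosine_sim(rx: Ratings, ry: Ratings) -> float:
    """Cosine of the two rating vectors, treating unrated items as 0."""
    dot = sum(r * ry[s] for s, r in rx.items() if s in ry)
    nx = math.sqrt(sum(r * r for r in rx.values()))
    ny = math.sqrt(sum(r * r for r in ry.values()))
    return dot / (nx * ny) if nx and ny else 0.0


def pearson_sim(rx: Ratings, ry: Ratings) -> float:
    """Pearson correlation over S_xy, the items rated by both users."""
    s_xy = [s for s in rx if s in ry]
    if len(s_xy) < 2:
        return 0.0  # correlation is undefined on fewer than two points
    mx = sum(rx[s] for s in s_xy) / len(s_xy)
    my = sum(ry[s] for s in s_xy) / len(s_xy)
    cov = sum((rx[s] - mx) * (ry[s] - my) for s in s_xy)
    vx = sum((rx[s] - mx) ** 2 for s in s_xy)
    vy = sum((ry[s] - my) ** 2 for s in s_xy)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0
```

Unlike cosine, Pearson subtracts each user's mean, so a generous rater and a harsh rater with the same relative preferences still come out as similar.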

20
Rating predictions
 Let D be the set of k users that are the
most similar to c and who have rated
item s
 Possibilities for prediction function (item
s):
 r_cs = (1/k) ∑_{d∈D} r_ds
 r_cs = (∑_{d∈D} sim(c,d) × r_ds) / (∑_{d∈D} sim(c,d))
 Other options?
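The two prediction functions can be sketched as follows, assuming ratings are stored per user and a similarity function is supplied by the caller. The names `neighbours`, `predict_mean`, and `predict_weighted` are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def neighbours(c: str, s: str, ratings: Ratings,
               sim: Callable[[str, str], float], k: int) -> List[str]:
    """D: the k users most similar to c among those who rated item s."""
    candidates = [d for d in ratings if d != c and s in ratings[d]]
    candidates.sort(key=lambda d: (-sim(c, d), d))
    return candidates[:k]


def predict_mean(D: List[str], s: str, ratings: Ratings) -> Optional[float]:
    """r_cs = (1/k) * sum of r_ds over d in D."""
    if not D:
        return None
    return sum(ratings[d][s] for d in D) / len(D)


def predict_weighted(c: str, D: List[str], s: str, ratings: Ratings,
                     sim: Callable[[str, str], float]) -> Optional[float]:
    """r_cs = sum(sim(c,d) * r_ds) / sum(sim(c,d)) over d in D."""
    weight = sum(sim(c, d) for d in D)
    if weight == 0:
        return None  # no usable neighbours
    return sum(sim(c, d) * ratings[d][s] for d in D) / weight
```

The weighted form lets closer neighbours count for more, whereas the plain mean treats all k neighbours equally.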

21
Complexity
 Expensive step is finding k most similar
customers
 O(|U|)
 Too expensive to do at runtime
 Need to pre-compute
 Naïve precomputation takes time O(N|U|)
 Simple trick gives some speedup
 Can use clustering, partitioning as
alternatives, but quality degrades
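A naive precomputation pass might look like the sketch below: it compares every user with every other user offline and stores the k nearest, so that lookups at recommendation time are cheap. `precompute_neighbours` is a hypothetical name, and this is the brute-force baseline rather than the speedup trick mentioned above.

```python
import heapq
from typing import Callable, Dict, List


def precompute_neighbours(users: List[str],
                          sim: Callable[[str, str], float],
                          k: int) -> Dict[str, List[str]]:
    """Offline pass: for every user, store the k most similar other users.

    Each user is compared with every other user, so the pass is quadratic
    in the number of users; a lookup afterwards is a dictionary access.
    """
    table: Dict[str, List[str]] = {}
    for c in users:
        others = (d for d in users if d != c)
        table[c] = heapq.nlargest(k, others, key=lambda d: sim(c, d))
    return table
```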

22
Item-Item Collaborative Filtering
 So far: User-user collaborative filtering
 Another view
 For item s, find other similar items
 Estimate rating for item based on ratings for
similar items
 Can use same similarity metrics and
prediction functions as in user-user model
 In practice, it has been observed that
item-item often works better than
user-user
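The item-item view can be sketched by reusing the weighted prediction function with the roles of users and items swapped. `predict_item_item` and its `item_sim` parameter are illustrative names; the item similarity itself is assumed to come from one of the same metrics used for users.

```python
from typing import Callable, Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def predict_item_item(c: str, s: str, ratings: Ratings,
                      item_sim: Callable[[str, str], float],
                      k: int = 2) -> Optional[float]:
    """Estimate r_cs from c's own ratings of the k items most similar to s."""
    rated = [t for t in ratings.get(c, {}) if t != s]
    rated.sort(key=lambda t: (-item_sim(s, t), t))
    nearest = rated[:k]
    weight = sum(item_sim(s, t) for t in nearest)
    if weight == 0:
        return None  # user has rated nothing similar to s
    return sum(item_sim(s, t) * ratings[c][t] for t in nearest) / weight
```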

23
Pros and cons of collaborative
filtering
 Works for any kind of item
 No feature selection needed
 New user problem
 The same problem as with content-based
system
 New item problem
 Sparsity of rating matrix

24
Hybrid Methods
 Implement two separate recommenders
and combine their predictions
 Add content-based methods to
collaborative approach
 item profiles for new item problem
 deal with sparsity-related problems
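One simple way to combine two separate recommenders is a weighted blend with a fallback, sketched below. The `hybrid` function and the mixing weight `alpha` are assumptions for illustration; the slides do not specify how the predictions are combined.

```python
from typing import Callable, Optional

# A predictor maps (user, item) to an estimated rating, or None if it cannot answer.
Predictor = Callable[[str, str], Optional[float]]


def hybrid(content: Predictor, collaborative: Predictor,
           alpha: float = 0.5) -> Predictor:
    """Blend two recommenders; fall back to whichever one can answer."""
    def predict(user: str, item: str) -> Optional[float]:
        a = content(user, item)
        b = collaborative(user, item)
        if a is None:
            return b
        if b is None:
            return a  # e.g. a new item that nobody has rated yet
        return alpha * a + (1 - alpha) * b
    return predict
```

The fallback branch is where the content-based side covers the new item problem: an item with a profile but no ratings still gets a prediction.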

25
Evaluating Recommendations
 Precision
 Accuracy of predictions
 Compare predictions with known ratings, Root-
mean-square error (RMSE)
 Receiver operating characteristic (ROC)
 Tradeoff curve between false positives and false
negatives
 Recommendation Quality
 Top-n measures (e.g., Breese score)
 Item-Set Coverage
 Number of items/users for which system can
make predictions
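Three of these measures can be sketched as follows, assuming predictions and held-out ratings are dictionaries keyed by (user, item). The function names are illustrative, and `precision_at_n` is a generic top-n measure rather than the Breese score.

```python
import math
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[str, str]  # (user, item)


def rmse(predicted: Dict[Pair, float], actual: Dict[Pair, float]) -> float:
    """Root-mean-square error over the pairs present in both dicts."""
    common = [p for p in actual if p in predicted]
    if not common:
        raise ValueError("no overlapping (user, item) pairs")
    squared = sum((predicted[p] - actual[p]) ** 2 for p in common)
    return math.sqrt(squared / len(common))


def coverage(predicted: Dict[Pair, Optional[float]]) -> float:
    """Fraction of requested pairs for which a prediction was produced."""
    if not predicted:
        return 0.0
    return sum(v is not None for v in predicted.values()) / len(predicted)


def precision_at_n(recommended: List[str], relevant: Set[str], n: int) -> float:
    """Share of the top-n recommended items that the user actually liked."""
    top = recommended[:n]
    return sum(item in relevant for item in top) / len(top) if top else 0.0
```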

26
Conclusions
 Content-based
 The user will be recommended items similar to the
ones the user preferred in the past
 Collaborative
 The user will be recommended items that people
with similar tastes and preferences liked in the past
 Hybrid
 Combine collaborative and content-based methods

28
Facial Expression Recognition
Preprocessing procedure
Rotate to line up
eye coordinates
Locate & Crop
Face Region
Geometrical
Normalize
Gabor Feature
Extraction
Normalize
PCA&LDA
Translation Matrix
Train Phase
Templates
Test Phase
Histogram
Equalization
Distance
Classifier

29
Image Stitching
Feature Points
extraction
Correlation
Match
RANSAC: eliminate
false matches
Build the Model
Perspective model
Image alignment
Image Stitching
Demo

30
Any questions or suggestions?
 Thank you
