0% found this document useful (0 votes)
6 views28 pages

Collaborative Filtering Techniques Overview

The document discusses collaborative filtering (CF) methods used in recommendation systems, highlighting two main types: memory-based and model-based approaches. It addresses challenges such as data sparsity, cold start problems, and scalability, while also presenting potential solutions like clustering and hybrid systems. The summary concludes with the pros and cons of CF, noting its ability to provide personalized recommendations without feature selection, but also its limitations in certain scenarios.

Uploaded by

nguy4176
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views28 pages

Collaborative Filtering Techniques Overview

The document discusses collaborative filtering (CF) methods used in recommendation systems, highlighting two main types: memory-based and model-based approaches. It addresses challenges such as data sparsity, cold start problems, and scalability, while also presenting potential solutions like clustering and hybrid systems. The summary concludes with the pros and cons of CF, noting its ability to provide personalized recommendations without feature selection, but also its limitations in certain scenarios.

Uploaded by

nguy4176
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd

Recommendation

Systems
Collaborative Filtering
Yijun Lin

Thanks for source slides and material


J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets
[Link] 1
Recall: Collaborative Filtering
Overview
• CF works by collecting user feedback on items: e.g.,
ratings
• Two classes of CF algorithms:
• Neighborhood-based or Memory-based approaches
• User-based CF

• Item-based CF

2
Recall: Collaborative Filtering
Overview
• CF works by collecting user feedback on items: e.g.,
ratings
• Two classes of CF algorithms:
• Neighborhood-based or Memory-based approaches
• User-based CF
• Item-based CF
• Model-based approaches
• Estimate parameters of statistical models
• Latent factor and matrix factorization models

3
Model-Based Collaborative Filtering
• Provide recommendations by estimating parameters of
(statistical) models for predicting user ratings
• Developing models to learn complex patterns
• Based on training set – Supervised Learning
• Then make predictions for CF tasks with the learned models
• Example models:
• Bayesian models
• Classification algorithms (if users’ ratings are in categories)
• Regression models and SVD (singular value decomposition ) for
numerical ratings

4
Regression Models and SVD
Methods
• Numerical ratings are common in real recommender
systems
• Regression and SVD methods are good at making
predictions for numerical values
• Linear regression methods
• Latent factor models
• Learn model parameters to minimize the error between
predictions and ground truth on training data
• After developing such a model, fitting the model to a
new data sample to make prediction
5
UV-Decomposition
• For a utility matrix with rows and columns (i.e., users
and items)
• Find a matrix with rows and columns and a matrix
with rows and columns
• Such that closely approximates in those entries
where is nonblank

𝑀 𝑈 𝑉 6
UV-Decomposition
• For a utility matrix with rows and columns (i.e., users
and items)
• Find a matrix with rows and columns and a matrix
with rows and columns
• Such that closely approximates in those entries
where is nonblank => Minimize reconstruction error on
known ratings

𝑀 𝑈 𝑉 7
UV-Decomposition
• For a utility matrix with rows and columns (i.e., users
and items)
• Find a matrix with rows and columns and a matrix
with rows and columns
• Such that closely approximates in those entries
where is nonblank => Use to estimate the missing
rating

𝑀 𝑈 𝑉 8
Characteristics and Challenges of CF
• Data Sparsity
• Cold Start Problem
• Synonyms
• Scalability
• Gray Sheep
•…

9
Characteristics and Challenges of CF
(1)
• Data Sparsity
• Many commercial recommender systems deal with large
user/item sets
• Most users do not rate most items: Utility matrix is extremely
sparse
• Limitations?

10
Characteristics and Challenges of CF
(1)
• Data Sparsity
• Many commercial recommender systems deal with large
user/item sets
• Most users do not rate most items: Utility matrix is extremely
sparse
• This reduces probability of finding set of users with similar
ratings

11
Characteristics and Challenges of CF
(1)
• Data Sparsity
• Many commercial recommender systems deal with large
user/item sets
• Most users do not rate most items: Utility matrix is extremely
sparse
• This reduces probability of finding set of users with similar
ratings
• Potential Solutions: Clustering / Dimensionality Reduction

12
Characteristics and Challenges of CF
(1)
• Data Sparsity
• Many commercial recommender systems deal with large
user/item sets
• Most users do not rate most items: Utility matrix is extremely
sparse
• This reduces probability of finding Revise
set ofthe
users with similar
utility matrix so the columns
Harry-Potter
ratings Star-
Wars
represent clusters of items
• Potential Solutions: Clustering / Dimensionality Reduction

13
Characteristics and Challenges of CF
(1)
• Data Sparsity
• Many commercial recommender systems deal with large
user/item sets
• Most users do not rate most items: Utility matrix is extremely
sparse
• This reduces probability of finding set of users with similar
ratings
• Potential Solutions: Clustering / Dimensionality Reduction
• Principle Component Analysis (PCA), Singular Value
Decomposition (SVD)
• Remove unrepresentative or insignificant users/items to reduce size of
the matrix 14
Characteristics and Challenges of CF
(2)
• Cold Start Problem
• When a new item has just entered the system
• Limitations?

15
Characteristics and Challenges of CF
(2)
• Cold Start Problem
• When a new item has just entered the system
• Can’t be recommended until some users rate it
• Also called “first-rater problem”
• When a new user has just entered the system
• Limitations?

16
Characteristics and Challenges of CF
(2)
• Cold Start Problem
• When a new item has just entered the system
• Can’t be recommended until some users rate it
• Also called “first-rater problem”
• When a new user has just entered the system
• Cannot give good recommendations because of lack of rating or
purchase history
• Potential Solutions:
• Content-based systems do not rely on ratings from other
users
• Hybrid CF (content-boosted CF): External content
information can be used to produce predictions for new users 17
Characteristics and Challenges of CF
(3)
• Synonyms:
• Same or very similar items that have different names or
entries
• Limitations?

18
Characteristics and Challenges of CF
(3)
• Synonyms:
• Same or very similar items that have different names or entries
• Most recommender systems are unable to discover this latent
association
• These products are treated differently
• Potential Solutions:
• Automatic term expansion or construction of thesaurus
• Construct a semantic space where terms and documents that
are closely associated

19
Characteristics and Challenges of CF
(4)
• Scalability
• Traditional CF systems suffer scalability problems at very large
scale
• Limitations?

20
Characteristics and Challenges of CF
(4)
• Scalability
• Traditional CF systems suffer scalability problems at very large
scale
• Expensive computation, cannot do online/real-time
recommendation

• Potential Solutions:
• Dimensionality reduction techniques
• Item-based CF algorithms: Instead of calculating similarities
between all pairs of items, calculate similarity only between
pairs of co-rated items by a user => Trick for HW4 21
Characteristics and Challenges of CF
(5)
• Gray Sheep
• There are users whose opinions do not consistently agree or
disagree with any group of people
• Limitations?

22
Characteristics and Challenges of CF
(5)
• Gray Sheep
• There are users whose opinions do not consistently agree or
disagree with any group of people
• Do not benefit from collaborative filtering

• Potential Solutions:
• Hybrid approach combining content-based and CF
recommendations

23
Characteristics and Challenges of CF
(5)
• Black sheep
• Idiosyncratic tastes
• Make recommendations nearly impossible
• Considered an acceptable failure

24
Summary of Collaborative Filtering
Methods
• Does CF require feature selection?
• Can CF do personalized recommendation?
• Can CF learn latent patterns from user interactions?

25
Summary of Collaborative Filtering
Methods
• Pros:
• No feature selection needed
• Provide personalized recommendation
• Consider user interactions to capture complex user
preferences

26
Summary of Collaborative Filtering
Methods
• Can CF recommend for new users?
• Can CF recommend for new items?
• Can CF recommend if there is no common users/items?
• Does CF tend to recommend items to a user with unique
taste?

27
Summary of Collaborative Filtering
Methods
• Pros:
• No feature selection needed
• Provide personalized recommendation
• Consider user interactions to capture complex user
preferences
• Cons:
• Cold start: Need enough users in the system to find a match
• First rater: Cannot recommend an item that has not been
previously rated
• Sparsity: Hard to find users that have rated the same items
• Popularity bias: Cannot recommend items to someone with 28

You might also like