0% found this document useful (0 votes)

11 views58 pages

Recommender Systems Overview and Techniques

The document discusses recommender systems, which are information filtering systems that predict user ratings for various items such as music, movies, and books. It outlines different paradigms of recommender systems, including collaborative filtering, content-based filtering, and hybrid approaches, highlighting their advantages and disadvantages. The document also delves into techniques for measuring user similarity and making predictions, emphasizing the importance of understanding user preferences and item characteristics in enhancing user experience and increasing sales for providers.

Uploaded by

milake3794

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views58 pages

Recommender Systems Overview and Techniques

Uploaded by

milake3794

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Information Retrieval and

Web Analytics

Andreas Kaltenbrunner & Ricardo Baeza-Yates

Recommender systems
(Part I)
References

• Mining of Massive Datasets (Chapter 9).

[Link]

• Recommender Systems Handbook

Francesco Ricci, Lior Rokach, Bracha Shapira
Springer
Introduction

• Recommender System is a form of information filtering

system which seeks to predict user ratings of items.

• Recommender System finds application in music,

movies, news, books, research articles and many other.
Application areas
In the Social Web
Why using Recommender Systems?

• Value for the customer

– Find things that are interesting
– Narrow down the set of choices
– Help to explore the space of options
– Discover new things
– Entertainment
– …
• Value for the provider
– Offer unique personalized service for the customer
– Increase trust and customer loyalty
– Increase sales, click through rates, conversion, etc.
– Opportunities for promotion, persuasion
– Obtain more knowledge about customers
– …
Jeff Bezos

• “If I have 3 million

customers on the Web,
I should have 3 million
stores on the Web”
– Jeff Bezos, CEO of
[Link]
– Degree in Computer
Science
– US$83.6 billion
estimated (November
2020), ranked no. 1 in
the Forbes list of
world’s billionaires
Long-Tails Graph in RS

Distribution of popularity among items or products in a

marketplace.
Problem domain

• Recommendation systems (RS) help to match users with items

– Ease information overload
– Sales assistance (guidance, advisory, persuasion, …)

RS are software agents that elicit the interests and preferences of

individual consumers […] and make recommendations accordingly.
They have the potential to support and improve the quality of the
decisions consumers make while searching for and selecting
products online.
• [Xiao & Benbasat, MISQ, 2007]

• Different system designs / paradigms

– Based on availability of exploitable data
– Implicit and explicit user feedback
– Domain characteristics
Recommender systems

• RS seen as a function
• Given:
– User model (e.g., ratings, preferences, demographics,
situational context)
– Items (with or without description of item
characteristics)
• Find:
– Relevance score. Used for ranking.
• Finally:
– Recommend items that are assumed to be relevant
• But:
– Remember that relevance might be context-dependent
– Characteristics of the list itself might be important
(diversity)
Paradigms of recommender systems

Collaborative: "Tell me what's

popular among my peers"

Collaborative filtering models can recommend an item to

user A based on the interests of a similar user B.
Paradigms of recommender systems

Content-based: "Show me
more of the same I've liked"

Content-based filtering uses item features to recommend

other items similar to what the user likes, based on their
previous actions or explicit feedback.
Paradigms of recommender systems

Content-based: "Show me more of the same I've liked"

Content-based filtering uses item features to recommend

other items similar to what the user likes, based on their
previous actions or explicit feedback.
Paradigms of recommender systems

Knowledge-based: "Tell me
what fits based on my needs"
Paradigms of recommender systems

Hybrid: combinations of various

inputs and/or composition of
different mechanism
Recommender systems: basic techniques

Pros Cons
Collaborative No knowledge-engineering Requires some form of rating
effort, serendipity of results, feedback, cold start for new
learns market segments users and new items

Content-based No community required, Content descriptions

comparison between items necessary, cold start for
possible new users, no surprises

Knowledge-based Deterministic Knowledge engineering

recommendations, assured effort to bootstrap, basically
quality, no cold-start, can static, does not react to
resemble sales dialogue short-term trends
COLLABORATIVE FILTERING
Collaborative Filtering (CF)

• The most prominent approach to generate recommendations

– used by large, commercial e-commerce sites
– well-understood, various algorithms and variations exist
– applicable in many domains (book, movies, DVDs, ..)
• Approach
– use the "wisdom of the crowd" to recommend items
• Basic assumption and idea
– Users give ratings to catalog items (implicitly or explicitly)
Explicit
• users specify how much they liked for example a particular movie by
providing a numerical rating.
• this approach usually does not scale because only a small fraction
of users leave ratings and reviews
Implicit:
• E.g., a user watches a movie => system infers that she/he likes it.
– Customers who had similar tastes in the past, will have similar
tastes in the future
User-based nearest-neighbor collaborative
filtering (1)

• The basic technique:

– Given an "active user" (Alice) and an item I not yet
seen by Alice
– Goal is to estimate Alice's rating for item I, e.g., by
• find a set of users (peers) who liked the same items as
Alice in the past and who have rated item I
• use, e.g., the average of their ratings to predict, if Alice
will like item I
• do this for all items Alice has not seen and
recommend the best-rated
Item1 Item2 Item3 Item4 Item5
Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
User-based nearest-neighbor collaborative
filtering (2)

• Some first questions

– How do we measure similarity?
– How many neighbors should we consider?
– How do we generate a prediction from the
neighbors' ratings?

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
Measuring user similarity

Item1 Item2 Item3 Item4 Item5

Alice 5 3 4 4 ?
User1 3 1 2 3 3 sim = 0,85
User2 4 3 4 3 5 sim = 0,71

User3 3 3 1 5 4 sim = 0,00

User4 1 5 5 2 1 sim = -0,79

Making predictions

• is the average rating of target user a.

• N is the neighborhood of most similar users.
• d is the average rating of neighbour user b.
• is the similarity between user a and b.
• is the rating assigned by user b to item p.
Item-based collaborative filtering

• Basic idea:
– Use the similarity between items (and not
users) to make predictions
• Example:
– Look for items that are similar to Item5
– Take Alice's ratings for these items to
predict the rating for Item5
Item1 Item2 Item3 Item4 Item5
Alice 5 3 4 4 ?
User1 3 1 2 3 3
User2 4 3 4 3 5
User3 3 3 1 5 4
User4 1 5 5 2 1
Item-based Collaborative Filtering

sij: similarity of items i and j

rxj: rating of user x on item j
N(i;x): set items rated by x similar to i
Item-based CF (|N|=2)

users
movies

- unknown rating - rating between 1 to 5

Item-based CF (|N|=2)
movies

- estimate rating of movie 1 by user 5

Item-based CF (|N|=2)

Here we use Pearson correlation as similarity:

Neighbor selection: 1) Subtract mean rating mi from each movie i
Identify movies similar to m1 = (1+3+5+5+4)/5 = 3.6
row 1: [-2.6, 0, -0.6, 0, 0, 1.4, 0, 0, 1.4, 0, 0.4, 0]
movie 1, rated by user 5 2) Compute Pearson correlation (i.e., cosine similarity if avg
Item-based CF (|N|=2)

Compute similarity weights:

s1,3=0.41, s1,6=0.59
Item-based CF (|N|=2)

Predict by taking weighted average:

r1.5 = (0.41*2 + 0.59*3) / (0.41+0.59) = 2.6
Before:
CF: Common Practice

• Define similarity sij of items i and j

• Select k nearest neighbors N(i; x)
– Items most similar to i, that were rated by x
• Estimate rating rxi as the weighted average:

baseline estimate for rxi ◼ μ = overall mean movie rating

◼ bx = rating deviation of user x
= (avg. rating of user x) – μ
◼ bi = rating deviation of movie i
Item-Item vs. User-User

Avatar LOTR Matrix Pirates

Alice

Bob

Carol

David

◼ In practice, it has been observed that item-item often works

better than user-user
◼ Why? Items are simpler, users have multiple tastes
Example CF: Item vs Item
ITEMS
1 2 3 4

1 5 1 3
b3=(1+1+4+3)/4-μ=2.25-3= -0.75
U b4=(3+2+3+5)/4-μ=3.25-3= 0.25
2 1 2
S b63=μ+b6+b3=3+1-0.75= 3.25
E b64=μ+b6+b4=3+1+0.25= 4.25
R 3
r61= b61+(s13·(r63-b63)
S +s14·(r64-b64)) /(s13+s14)
4 1 r61=3.5+(0.48·(3-3.25)
+0.32·(5-4.25))/(0.48+0.32)

r61=3.68
5 4 4 3

6 ? 4 3 5

Average ratings: s12=0 s13=0.48 s14=0.32

μ = (5+1+3+2+1+4+3+5+1+4+3+4)/12 = 36/12 = 3
bx= b6=(4+3+5)/3-μ = 12/3-3 = 1
bi = b1=(1+4)/2-μ = 2.5-3 = -0.5
bxi = b61=μ+bx+bi = 3+1-0.5 = 3.5
Memory-based and model-based
approaches

• User-based CF is said to be "memory-based"

– the rating matrix is directly used to find neighbors /
make predictions
– does not scale for most real-world scenarios
– large e-commerce sites have tens of millions of
customers and millions of items
• Model-based approaches
– based on an offline pre-processing or "model-learning"
phase
– at run-time, only the learned model is used to make
predictions
– models are updated / re-trained periodically
– large variety of techniques used
– model-building and updating can be computationally
expensive
Model-based approaches

• Plethora of different techniques proposed in the last

years, e.g.,
– Matrix factorization techniques, statistics
• singular value decomposition, principal component
analysis
– Association rule mining
• compare: shopping basket analysis
– Probabilistic models
• clustering models, Bayesian networks, probabilistic
Latent Semantic Analysis
– Various other machine learning approaches
• Costs of pre-processing
– Usually not discussed
– Incremental updates possible?
The Netflix Prize

•
The Netflix Utility Matrix R
480,000 users
Matrix R 1 3 4
3 5 5
4 5 5
3
17,700
movies 3
2 2 2
5
2 1 1
3 3
1
The Netflix Utility Matrix R
480,000 users
Matrix R 1 3 4
3 5 5
4 5 5
3
17,700
movies 3
Training Data Set 2 ? ? Test Data Set
?
2 1 ?
3 ? True rating of
user x on item i
1

Predicted rating
BellKor Recommender System

• The winner of the Netflix Challenge!

• Multi-scale modeling of the data: Global effects
Combine top level, “regional”
modeling of the data, with
a refined, local view:
– Global: Factorization
• Overall deviations of users/movies
– Factorization:
• Addressing “regional” effects Collaborative
filtering
– Collaborative filtering:
• Extract local patterns
Modeling Local & Global Effects

• Global:
– Mean movie rating: 3.7 stars
– The Sixth Sense is 0.5 stars above avg.
– Joe rates 0.2 stars below avg.
⇒ Baseline estimation:
Joe will rate The Sixth Sense 4 stars
• Local neighborhood (CF/NN):
– Joe didn’t like related movie Signs
– ⇒ Final estimate:
Joe will rate The Sixth Sense 3.8 stars
Recap: Collaborative Filtering (CF)

• Earliest and most popular collaborative

filtering method
• Derive unknown ratings from those of “similar”
movies (item-item variant)
• Define similarity measure sij of items i and j
• Select k-nearest neighbors, compute the rating
– N(i; x): items most similar to i that were rated
by x

sij… similarity of items i and j

rxj…rating of user x on item j
N(i;x)… set of items similar to
item i that were rated by x
Recap: deviations in classic CF

• In practice we get better estimates if we

model deviations:

baseline estimate for rxi

Problems/Issues:
1) Similarity measures are “arbitrary”
2) Pairwise similarities neglect
interdependencies among users
μ = overall mean rating 3) Taking a weighted average can be
bx = rating deviation of user x restricting
= (avg. rating of user x) – μ Solution: Instead of sij use wij that we
bi = (avg. rating of movie i) – μ estimate directly from data
Recommendations via Optimization 1 3 4
3 5 5
4 5 5
3
3
2 2 2
5
2 1 1
• Goal: Make good recommendations 3 3
1
– Quantify goodness using RMSE:
Lower RMSE ⇒ better recommendations
– Want to make good recommendations on
items that user has not yet seen.

– Let’s set build a system such that it works

well on known (user, item) ratings
And hope the system will also predict well the
unknown ratings
Recommendations via Optimization

SSE = sum of squared errors

Predicted rating True

rating
Latent Factor Models

• “SVD” on Netflix data: R ≈ Q · PT SVD: A = U Σ VT

users factors

users
items

factors
≈items

PT
R Q
• For now, let’s assume we can approximate the rating
matrix R as a product of “thin” Q · PT
– R has missing entries but let’s ignore that for now!
• Basically, we will want the reconstruction error to be small on
known ratings and we don’t care about the values on the missing
ones
Latent Factor Models (e.g., SVD)
Serious Braveheart
The Color Amadeus
Purple

Lethal Weapon
Sense and
Sensibility Ocean’s 11
Geared Geared
towards towards
females males

The Lion King

The Princess Independence

Diaries Day
Dumb and
Funny Dumber
Ratings as Products of Factors

• How to estimate the missing rating of

user x for item i ?

qi = row i of Q
px = column x of PT
Ratings as Products of Factors

• How to estimate the missing rating of

user x for item i ?

qi = row i of Q
px = column x of PT
Ratings as Products of Factors

• How to estimate the missing rating of

user x for item i ?

qi = row i of Q
px = column x of PT

ȓxi=-0.5·(-2)+0.6·0.3+0.5·2.4=2.38
Latent Factor Models
Serious Braveheart
The Color Amadeus
Purple

Lethal Weapon
Sense and
Sensibility Ocean’s 11
Geared Factor 1Geared
towards towards
females males

The Lion King

Factor 2

The Princess Independence

Diaries Day
Dumb and
Funn Dumber
y
Latent Factor Models
Serious Braveheart
The Color Amadeus
Purple

Lethal Weapon
Sense and
Sensibility Ocean’s 11
Geared Factor 1Geared
towards towards
females males

The Lion King

Factor 2

The Princess Independence

Diaries Day
Dumb and
Funn Dumber
y
Recap: SVD
n n
• Remember SVD:
– A: Input data matrix
V T
Σ
– U: Left singular vecs m
– V: Right singular vecs
A ≈m

– Σ: Singular values
– U,V are orthonormal. U
We do not need this here, we do not care, we just
want something that works

• In our case:
“SVD” on Netflix data: R ≈ Q · PT
A = R, Q = U, PT = Σ VT
Latent Factor Models

•
FINDING THE LATENT
FACTORS
Back to Our Problem

• Want to minimize SSE for unseen test data

• Idea: Minimize SSE on training data
– Want large k (# of factors) to capture all the
signals
– But SSE on test data begins to rise for k > 2
1 3 4
3 5 5

• This is a classical example of overfitting:

4 5 5
3
3
2 ? ?

– With too much freedom (too many free 2 1

3
?

?
?

parameters) the model starts fitting noise

• I.e., it fits too well the training data and thus not
generalizing well to unseen test data
Detour: Minimizing a function

•
Dealing with Missing Entries
1 3 4
3 5 5
4 5 5
3
3

• To solve overfitting, we introduce 2 ?

2 1
?
?

regularization:
3 ?
1

– Allow rich model where there is sufficient

data
– Shrink aggressively where data is scarce

“error” “length”
λ1, λ2 … user set regularization parameters

Note: We do not care about the “raw” value of the objective function,
but we care in P,Q that achieve the minimum of the objective
Summary: Recommender Systems

Main paradigms:
• Collaborative filtering: based on user or
item similarity.
• Content-based filtering: relies on item
features and user profiles.
• Knowledge-based and hybrid
approaches: combine structured knowledge
and multiple methods.
Summary: Recommender Systems

• Collaborative filtering (CF):recommending

items based on similar users or items.
• Two types: user-based and item-based CF.
• CF enables serendipity and captures
evolving community preferences.
•Model-based methods, such as matrix
factorization (SVD), have advanced scalability
and accuracy.
•Key challenges remain:
•Cold-start problem
•Data sparsity
•Balancing accuracy, novelty, and diversity

Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
31 pages
Introduction to Recommender Systems
No ratings yet
Introduction to Recommender Systems
54 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
33 pages
Recommender Systems & Matrix Factorization
No ratings yet
Recommender Systems & Matrix Factorization
49 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
95 pages
Introduction to Recommender Systems
No ratings yet
Introduction to Recommender Systems
86 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
46 pages
User-Based Collaborative Filtering Guide
No ratings yet
User-Based Collaborative Filtering Guide
131 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
24 pages
CCS360 RS Unit 3
No ratings yet
CCS360 RS Unit 3
34 pages
Recommender Systems Overview
No ratings yet
Recommender Systems Overview
36 pages
Introduction to Recommender Systems
No ratings yet
Introduction to Recommender Systems
48 pages
Overview of Recommender Systems
No ratings yet
Overview of Recommender Systems
26 pages
Understanding Recommender Systems in ML
No ratings yet
Understanding Recommender Systems in ML
33 pages
Collaborative Filtering in Recommender Systems
No ratings yet
Collaborative Filtering in Recommender Systems
31 pages
Understanding Collaborative Filtering
No ratings yet
Understanding Collaborative Filtering
34 pages
Understanding Collaborative Filtering Techniques
No ratings yet
Understanding Collaborative Filtering Techniques
52 pages
Unit III-collaborative Filtering - Final
No ratings yet
Unit III-collaborative Filtering - Final
87 pages
Understanding Collaborative Filtering Techniques
No ratings yet
Understanding Collaborative Filtering Techniques
65 pages
Collaborative Filtering in Recommender Systems
No ratings yet
Collaborative Filtering in Recommender Systems
14 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
30 pages
Overview of Recommender Systems
No ratings yet
Overview of Recommender Systems
58 pages
CS361 - Applied ML - Lec06 Unsupervised Learning - Intro. To Recommendation Systems
No ratings yet
CS361 - Applied ML - Lec06 Unsupervised Learning - Intro. To Recommendation Systems
45 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
12 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
52 pages
3 - Collaborative Filtering
No ratings yet
3 - Collaborative Filtering
13 pages
Introduction to Recommender Systems
No ratings yet
Introduction to Recommender Systems
25 pages
Understanding Recommendation Systems
No ratings yet
Understanding Recommendation Systems
50 pages
Building Effective Recommendation Systems
No ratings yet
Building Effective Recommendation Systems
40 pages
Collaborative vs Content-Based Filtering
No ratings yet
Collaborative vs Content-Based Filtering
9 pages
Understanding Collaborative Filtering Techniques
No ratings yet
Understanding Collaborative Filtering Techniques
31 pages
Recommendation Systems - Lecture 1
No ratings yet
Recommendation Systems - Lecture 1
21 pages
Recommender Systems: Encyclopedia of Machine Learning Chapter No: 00338 Page Proof Page 1 22-4-2010 #1
No ratings yet
Recommender Systems: Encyclopedia of Machine Learning Chapter No: 00338 Page Proof Page 1 22-4-2010 #1
9 pages
Rsunit 2
No ratings yet
Rsunit 2
12 pages
Understanding Recommendation Systems
No ratings yet
Understanding Recommendation Systems
52 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
25 pages
Tutorial IJCAI 2013
No ratings yet
Tutorial IJCAI 2013
144 pages
UNIMAS E-Recruitment: Recommender Systems
No ratings yet
UNIMAS E-Recruitment: Recommender Systems
46 pages
Recommender Systems: Techniques & Importance
No ratings yet
Recommender Systems: Techniques & Importance
27 pages
CSE545: Recommendation Systems Overview
No ratings yet
CSE545: Recommendation Systems Overview
72 pages
Disadvantages of Hybrid Recommender Systems
No ratings yet
Disadvantages of Hybrid Recommender Systems
23 pages
Hybrid Recommender System Framework
No ratings yet
Hybrid Recommender System Framework
19 pages
Unit 3 RS
No ratings yet
Unit 3 RS
6 pages
CH 5
No ratings yet
CH 5
47 pages
Lecture 4-6 RecommenderSystems
No ratings yet
Lecture 4-6 RecommenderSystems
71 pages
Understanding Recommendation Systems
No ratings yet
Understanding Recommendation Systems
37 pages
Collaborative Filtering in Recommender Systems
No ratings yet
Collaborative Filtering in Recommender Systems
12 pages
IDEA - Collaborative Filtering Techniques in Recommendation Systems
No ratings yet
IDEA - Collaborative Filtering Techniques in Recommendation Systems
11 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
12 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
68 pages
Understanding Recommender Systems
No ratings yet
Understanding Recommender Systems
19 pages
Recommender Systems: Approaches & Examples
No ratings yet
Recommender Systems: Approaches & Examples
57 pages
Overview of Recommender Systems
No ratings yet
Overview of Recommender Systems
24 pages
DSA 7-Recommender Systems
No ratings yet
DSA 7-Recommender Systems
24 pages
Recommender Systems Overview and Types
No ratings yet
Recommender Systems Overview and Types
16 pages
Data Work Ow Frameworks
No ratings yet
Data Work Ow Frameworks
71 pages
Recommender Systems Overview and Techniques
No ratings yet
Recommender Systems Overview and Techniques
40 pages
Passive Keyless Entry Vulnerabilities
No ratings yet
Passive Keyless Entry Vulnerabilities
21 pages
Software Process Models Overview
No ratings yet
Software Process Models Overview
55 pages
MCA Entrance Test Questions and Answers
No ratings yet
MCA Entrance Test Questions and Answers
11 pages
A22 Pro Mini WiFi Camera Manual
100% (1)
A22 Pro Mini WiFi Camera Manual
17 pages
Accounting Information Systems Overview
No ratings yet
Accounting Information Systems Overview
85 pages
Big Data and Data Analytics Curriculum
No ratings yet
Big Data and Data Analytics Curriculum
4 pages
CSE202 Calendar Software Project
No ratings yet
CSE202 Calendar Software Project
14 pages
PDF Compressor Free Version Guide
0% (1)
PDF Compressor Free Version Guide
131 pages
CodeX Security Assessment Report
No ratings yet
CodeX Security Assessment Report
5 pages
Overview of Computing Paradigms
0% (1)
Overview of Computing Paradigms
18 pages
ManualsLib Download Overview
No ratings yet
ManualsLib Download Overview
92 pages
Introduction To Research and Research Methods
No ratings yet
Introduction To Research and Research Methods
83 pages
OOAD Experiment Journal Entries
No ratings yet
OOAD Experiment Journal Entries
3 pages
General Computing Quiz for JHS 1-3
No ratings yet
General Computing Quiz for JHS 1-3
4 pages
Inv-17402.PDF Cit Electronic Compra
No ratings yet
Inv-17402.PDF Cit Electronic Compra
2 pages
Understanding Data Flow Diagrams
No ratings yet
Understanding Data Flow Diagrams
7 pages
Caroline Jones: Art Admin Resume
No ratings yet
Caroline Jones: Art Admin Resume
2 pages
Redwood and Zipline Widgets Guide
No ratings yet
Redwood and Zipline Widgets Guide
176 pages
UsbFix Antivirus Scan Report
100% (1)
UsbFix Antivirus Scan Report
3 pages
SCADA Applications in Power Systems
No ratings yet
SCADA Applications in Power Systems
11 pages
Free Customizable Solo RPG Tool
No ratings yet
Free Customizable Solo RPG Tool
1 page
OHS Policies & Computer Maintenance Guide
No ratings yet
OHS Policies & Computer Maintenance Guide
10 pages
Inference Detector Hosting Design Overview
No ratings yet
Inference Detector Hosting Design Overview
29 pages
3DFY.ai: Generating 3D Models from Text
No ratings yet
3DFY.ai: Generating 3D Models from Text
5 pages
Mainframe Software Engineer Profile
No ratings yet
Mainframe Software Engineer Profile
2 pages
Overview of Project Management Methods
No ratings yet
Overview of Project Management Methods
7 pages
Unix Calendar and Date Commands Guide
No ratings yet
Unix Calendar and Date Commands Guide
4 pages
AWS and DevOps Course Overview
No ratings yet
AWS and DevOps Course Overview
3 pages
OOP Basics in C++ Explained
No ratings yet
OOP Basics in C++ Explained
8 pages
Introduction to Python Programming Basics
No ratings yet
Introduction to Python Programming Basics
21 pages

Recommender Systems Overview and Techniques

Uploaded by

Recommender Systems Overview and Techniques

Uploaded by

Information Retrieval and

Andreas Kaltenbrunner & Ricardo Baeza-Yates

• Mining of Massive Datasets (Chapter 9).

• Recommender Systems Handbook

• Recommender System is a form of information filtering

• Recommender System finds application in music,

• Value for the customer

• “If I have 3 million

Distribution of popularity among items or products in a

• Recommendation systems (RS) help to match users with items

RS are software agents that elicit the interests and preferences of

• Different system designs / paradigms

Collaborative: "Tell me what's

Collaborative filtering models can recommend an item to

Content-based filtering uses item features to recommend

Content-based: "Show me more of the same I've liked"

Content-based filtering uses item features to recommend

Hybrid: combinations of various

Content-based No community required, Content descriptions

Knowledge-based Deterministic Knowledge engineering

• The most prominent approach to generate recommendations

• The basic technique:

• Some first questions

Item1 Item2 Item3 Item4 Item5

Item1 Item2 Item3 Item4 Item5

User3 3 3 1 5 4 sim = 0,00

User4 1 5 5 2 1 sim = -0,79

• is the average rating of target user a.

sij: similarity of items i and j

- unknown rating - rating between 1 to 5

- estimate rating of movie 1 by user 5

Here we use Pearson correlation as similarity:

Compute similarity weights:

Predict by taking weighted average:

• Define similarity sij of items i and j

baseline estimate for rxi ◼ μ = overall mean movie rating

Avatar LOTR Matrix Pirates

◼ In practice, it has been observed that item-item often works

Average ratings: s12=0 s13=0.48 s14=0.32

• User-based CF is said to be "memory-based"

• Plethora of different techniques proposed in the last

• The winner of the Netflix Challenge!

• Earliest and most popular collaborative

sij… similarity of items i and j

• In practice we get better estimates if we

baseline estimate for rxi

– Let’s set build a system such that it works

SSE = sum of squared errors

Predicted rating True

• “SVD” on Netflix data: R ≈ Q · PT SVD: A = U Σ VT

The Lion King

The Princess Independence

• How to estimate the missing rating of

• How to estimate the missing rating of

• How to estimate the missing rating of

The Lion King

The Princess Independence

The Lion King

The Princess Independence

• Want to minimize SSE for unseen test data

• This is a classical example of overfitting:

– With too much freedom (too many free 2 1

parameters) the model starts fitting noise

• To solve overfitting, we introduce 2 ?

– Allow rich model where there is sufficient

• Collaborative filtering (CF):recommending

You might also like