Unit III-collaborative Filtering - Final
A systematic approach, nearest-neighbor collaborative filtering (CF), user-based and item-based CF,
components of neighborhood methods (rating normalization, similarity weight computation, and
neighborhood selection).
Suggested Activities:
• Practical learning – Implement collaborative filtering concepts
• Assignment of security aspects of recommender systems
Suggested Evaluation Methods:
• Quiz on collaborative filtering
• Seminar on security measures of recommender systems
What Is Collaborative Filtering?
• Collaborative filtering filters information by using the interactions and data collected by the
system from other users. It’s based on the idea that people who agreed in their evaluation of
certain items are likely to agree again in the future.
• The concept is simple: when we want to find a new movie to watch we’ll often ask our friends
for recommendations. Naturally, we have greater trust in the recommendations from friends who
share tastes similar to our own.
• Most collaborative filtering systems apply the so-called similarity index-based technique.
In the neighborhood-based approach, a number of users are selected based on their
similarity to the active user. Inference for the active user is made by calculating a
weighted average of the ratings of the selected users.
• Collaborative-filtering systems focus on the relationship between users and items. The similarity
of items is determined by the similarity of the ratings of those items by the users who have
rated both items.
• Collaborative filtering recommender systems have played a significant role in the rise of web
services and content platforms such as Amazon, Netflix, and YouTube in recent years. In this age
of information, knowing what the customer wants before they even know it themselves is
nothing short of a superpower. As the name suggests, recommender system algorithms are used
to offer relevant content or products to consumers based on their tastes or previous choices.
• Collaborative filtering comes in two main variants:
o User-based, which measures the similarity between the target user and other users.
o Item-based, which measures the similarity between the items that the target user has rated or
interacted with and other items.
Why do we need recommender systems?
• Back in 2006, Netflix offered a prize to solve a simple problem that had been around for
years. It was to find the best collaborative algorithm to predict user ratings for films that they
haven't watched yet, based on previous ratings of other movies.
• Today, e-commerce giants continue to try to solve this problem in a better way by
observing users’ past behavior to predict what other things the same user will like.
• Recommendations also help customers discover new products and offers that they’re not
explicitly looking for, thus speeding up the search process. This allows companies to send out
personalized newsletters via email that offer new TV shows, movies, products, and services
that are better suited for them.
• One of the most significant advantages of modern recommendation algorithms is their
ability to take implicit feedback and suggest new content/products, thus staying up-to-date
with customers’ preferences. This enables businesses to continue catering to customers even
if their tastes change over time.
User-item interaction matrix
• In collaborative filtering, we ignore the features of an individual item. Instead, we focus on
a similar group of people using the item and recommend other items that the group likes.
• Similar users are divided into small clusters and are recommended new items according
to the preferences of that cluster. Let’s understand this with an easy movie recommendation
example:
• Users 1 and 3 have opposite tastes.
• Users 3 and 4 both disliked Movie 2, so there’s a high chance User 4 will also dislike
Movie 4.
• User 3 might dislike Movie 1.
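As a minimal sketch of what a user-item interaction matrix looks like in code, the snippet below builds one from (user, movie, rating) triples. The ratings are hypothetical and are not the values of the movie example above; the function name `build_matrix` is also just an illustrative choice. Missing interactions are stored as `None`.

```python
# Hypothetical (user, movie, rating) triples; not the values from the example figure.
interactions = [
    ("User 1", "Movie 1", 5), ("User 1", "Movie 2", 1),
    ("User 2", "Movie 1", 4), ("User 2", "Movie 3", 5),
    ("User 3", "Movie 2", 2), ("User 3", "Movie 3", 4),
]

def build_matrix(triples):
    """Return (users, items, matrix) where matrix[user][item] is a rating or None."""
    users = sorted({u for u, _, _ in triples})
    items = sorted({i for _, i, _ in triples})
    # Start with every cell empty, then fill in the observed interactions.
    matrix = {u: {i: None for i in items} for u in users}
    for u, i, r in triples:
        matrix[u][i] = r
    return users, items, matrix

users, items, matrix = build_matrix(interactions)
for u in users:
    print(u, [matrix[u][i] for i in items])
```

Most cells of a real interaction matrix are empty, which is why the sparsity issues discussed in this unit matter in practice.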
Collaborative filtering: Advantages and disadvantages
Advantages
• No domain knowledge is required since all the features are learned automatically.
• Can help users discover new interests even if they’re not actively searching for them by
recommending new items similar to what they’re interested in.
• Does not require in-detail features and contextual data of products or items. It only needs the
user-item interaction matrix to train the matrix factorization model.
Disadvantages
• Data sparsity can lead to difficulty in recommending new products or users since the
suggestions are based on historic data and interactions.
• As the user base grows, the algorithms suffer due to high data volume and lack of scalability.
• Lack of diversity in the long run. This might seem counterintuitive since the whole point of
collaborative filtering is to recommend new items to the user. However, since the algorithms
function based on historical ratings, it will not recommend items with little or limited data.
Popular products will be more popular in the long run and there will be a lack of new and
diverse options.
A systematic approach to collaborative filtering involves the following steps:
1. Data Collection: Gather user-item interaction data, such as ratings, reviews, purchases, or clicks.
2. Data Preprocessing: Clean and prepare the data for analysis, including handling missing values,
outliers, and data normalization.
3. User or Item Representation: Encode user preferences or item features into a suitable
representation, such as user-item matrices or item-attribute vectors.
4. Similarity Calculation: Compute similarity scores between users or items based on their
respective representations.
5. Nearest Neighbor Identification: Identify the nearest neighbors of each user or item based on the
calculated similarity scores.
6. Prediction Generation: Predict the rating or preference of a user for an item based on the ratings or
preferences of their nearest neighbors.
7. Evaluation and Optimization: Evaluate the performance of the CF algorithm using appropriate
metrics and refine the model parameters to improve accuracy.
8. Deployment and Maintenance: Integrate the CF algorithm into the recommender system and
monitor its performance over time, making adjustments as needed.
Effective collaborative filtering relies on the quality and quantity of user-item interaction data.
Additionally, the choice of similarity measures, nearest neighbor identification techniques, and
prediction algorithms can significantly impact the performance of the CF system.
• Unlike content-based approaches, which use the content of items previously rated by a user u,
collaborative (or social) filtering approaches rely on the ratings of u as well as those of other
users in the system.
• The key idea is that the rating of u for a new item i is likely to be similar to that of another user
v, if u and v have rated other items in a similar way. Likewise, u is likely to rate two
items i and j in a similar fashion, if other users have given similar ratings to these two
items.
• Collaborative filtering methods can be grouped into two general classes: neighborhood-based
and model-based methods.
• In neighborhood-based (memory-based or heuristic-based) collaborative filtering, the user-item
ratings stored in the system are used directly to predict ratings for new items.
• This can be done in two ways, known as user-based and item-based recommendation.
o User-based systems, such as GroupLens (Social Computing Research at the
University of Minnesota), Bellcore video (Library Toolkit, a set of tools for
constructing and browsing libraries of digital video), and Ringo (Social Information
Filtering for Music Recommendation), evaluate the interest of a user u for an item i using
the ratings for this item by other users, called neighbors, that have similar rating
patterns. The neighbors of user u are typically the users v whose ratings on the items
rated by both u and v, i.e. I_uv, are most correlated to those of u.
o Item-based approaches, on the other hand, predict the rating of a user u for an item i
based on the ratings of u for items similar to i. In such approaches, two items are similar
if several users of the system have rated these items in a similar fashion.
• Model-based approaches, in contrast, use these ratings to learn a predictive model. The general idea
is to model the user-item interactions with factors representing latent characteristics of the users
and items in the system, like the preference class of users and the category class of items. This
model is then trained using the available data, and later used to predict ratings of users for new
items. Model-based approaches for the task of recommending items are numerous and include
Bayesian Clustering, Latent Semantic Analysis, Latent Dirichlet Allocation, Maximum
Entropy, Boltzmann Machines, Support Vector Machines, and Singular Value Decomposition.
Advantages of Memory-Based Collaborative Filtering:
• Simplicity: Memory-based approaches are intuitive and simple to implement, making them a
viable option for solving problems with moderately big datasets in a short amount of time.
• Transparency: Memory-Based systems’ suggestions are easy to understand since they are
grounded in the user’s and the item’s direct interactions.
• Serendipity: Memory-based filtering has the potential to provide serendipitous
recommendations, in which users stumble onto previously unknown but potentially
fascinating content through shared relationships with other users
Drawbacks of Memory-Based Collaborative Filtering:
• Sparsity and Scalability: Since the frequency of user-item interactions tends to decrease as the
dataset expands, it becomes more difficult to discover trustworthy neighbours and might cause
scaling problems.
• Cold Start: Memory-based systems struggle when new users or new items have too few
interactions to support reliable suggestions.
• Limited Representation: Memory-based approaches may provide subpar results because they
fail to fully capture complicated patterns in the data.
Model-based collaborative approach
• Instead of using a predetermined set of rules, model-based collaborative filters use a statistical
or machine learning model to identify and exploit hidden links and patterns in the data. These
models, trained on past interactions between users and items, are then used to estimate users’
preferences for unseen items.
• In the model-based approach, machine learning models are used to predict and rank
interactions between users and the items they haven’t interacted with yet. These models are
trained using the interaction information already available from the interaction matrix by
deploying different algorithms like matrix factorization, deep learning, clustering, etc.
Matrix factorization
Matrix factorization is used to generate latent features by decomposing the sparse user-item
interaction matrix into two smaller and dense matrices of user and item entities.
Matrix factorization is a popular technique used in Collaborative Filtering (CF) for recommendation
systems. CF is a method to predict a user's interests by collecting preferences or behavior
information from many users. Matrix factorization is particularly effective in collaborative filtering
because it can handle the sparsity of user-item interaction data.
Here's how matrix factorization works in the context of collaborative filtering:
1. Understanding the Data Matrix:
• Assume you have a matrix R representing user-item interactions. Rows correspond to
users, columns correspond to items, and the entries R_ui represent user u's interaction
(like rating, purchase, or view) with item i. However, most entries are unknown
(missing) because not all users interact with all items.
2. Objective of Matrix Factorization:
• The goal of matrix factorization in CF is to decompose this sparse matrix R into the
product of two lower-dimensional matrices U and I:
$$R \approx U I^{T}$$
• Here, U (an m × k matrix) represents user embeddings, where each row u (out of m
rows) corresponds to a user's latent factors in a k-dimensional space.
• I (an n × k matrix) represents item embeddings, where each row i (out of n rows)
corresponds to an item's latent factors in the same k-dimensional space.
3. Matrix Factorization Process:
• Matrix factorization aims to learn the matrices U and I by minimizing the reconstruction
error between R and $U I^{T}$ on the observed entries. This is typically achieved through optimization
techniques like gradient descent, stochastic gradient descent, or alternating least squares.
• The objective function can be formulated as:
$$\min_{U,\,I} \sum_{(u,i) \in \text{observed}} \left(R_{ui} - (U I^{T})_{ui}\right)^{2} + \lambda \left(\lVert U \rVert^{2} + \lVert I \rVert^{2}\right)$$
where λ is a regularization parameter to prevent overfitting.
4. Prediction and Recommendations:
• Once the matrices U and I are learned, the missing entries in R can be estimated by the
corresponding entries of $U I^{T}$.
• Recommendations for a user u can be made by suggesting the items that have the
highest predicted scores (entries in row u of $U I^{T}$) but that the user has not
interacted with yet.
5. Key Advantages:
• Matrix factorization is effective in handling sparsity because it leverages latent factors
to capture user and item interactions.
• It can provide personalized recommendations even for users with very few interactions.
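The factorization described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the six observed ratings are hypothetical, the learning rate, regularization strength, number of epochs, and k = 2 are arbitrary choices, and stochastic gradient descent is used simply because it is one of the optimizers mentioned above. The helper names (`init_factors`, `train`, `mse`) are not from any library.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def init_factors(n_users, n_items, k, seed=0):
    """Small random starting values for the user matrix U and item matrix I."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    I = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    return U, I

def mse(observed, U, I):
    """Mean squared reconstruction error on the observed entries only."""
    return sum((r - dot(U[u], I[i])) ** 2 for u, i, r in observed) / len(observed)

def train(observed, U, I, lr=0.01, reg=0.02, epochs=2000):
    """SGD on the regularized squared error; updates U and I in place."""
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - dot(U[u], I[i])
            for f in range(len(U[u])):
                uf, itf = U[u][f], I[i][f]
                U[u][f] += lr * (err * itf - reg * uf)
                I[i][f] += lr * (err * uf - reg * itf)
    return U, I

# Hypothetical observed entries (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 2, 5), (2, 1, 2), (2, 2, 4)]
U, I = init_factors(n_users=3, n_items=3, k=2)
mse_before = mse(observed, U, I)
train(observed, U, I)
mse_after = mse(observed, U, I)
print(round(mse_before, 3), round(mse_after, 3))
print(round(dot(U[0], I[2]), 2))  # estimate for the unobserved pair (user 0, item 2)
```

The last line reads a missing entry of R off the product of the learned factors, which is exactly how predictions are produced in step 4.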
Advantages of Model-Based Collaborative Filtering:
• Scalability: Model-Based approaches outperform Memory-Based ones in dealing with big
and sparse datasets because they learn underlying patterns without making direct comparisons
of users or items.
• Cold Start Mitigation: By using supplementary data or a hybrid method, model-based
filtering may help with the cold start issue.
• Flexibility: Model-based methods may use a wide variety of data and attributes, allowing for
the incorporation of context to enhance suggestions.
Drawbacks of Model-Based Collaborative Filtering:
• Complexity: Because the models involved are more complex, developing and tuning
model-based approaches often takes more time and skill.
• Black Box: High accuracy is possible with Model-Based filtering, although the
models’ inner workings may be less visible and interpretable than those of Memory-Based
approaches.
• Overfitting: Model-based systems can overfit when there is insufficient data, which
may result in suggestions that are weighted too heavily towards prior interactions.
Hybrid Approaches
Hybrid approaches in Collaborative Filtering (CF) combine different methods or techniques to
overcome limitations and enhance the performance of recommendation systems. These approaches
leverage the strengths of multiple recommendation strategies, such as collaborative filtering (CF) and
content-based filtering (CBF), to provide more accurate and diverse recommendations. Here's a
breakdown of hybrid approaches in CF:
1. Collaborative Filtering (CF):
• Collaborative Filtering methods recommend items based on user-item interactions or
similarities between users. This can be user-based CF (recommending items liked by
similar users) or item-based CF (recommending similar items to those a user has liked).
2. Content-Based Filtering (CBF):
• Content-Based Filtering recommends items based on their features or attributes. It
analyzes item descriptions or user profiles to suggest items that are similar in content
to previously liked items.
3. Types of Hybrid Approaches:
a. Weighted Hybrid:
• In this approach, predictions from different recommendation techniques (e.g., CF and
CBF) are combined using weighted averages or other blending methods. The weights
can be fixed or learned based on data.
b. Feature Combination:
• Features derived from both CF and CBF methods are combined to create a unified
feature representation. Machine learning algorithms can then use this combined feature
representation to make recommendations.
c. Cascade or Switch Hybrid:
• Recommendations from one method (e.g., CF) are used to filter or augment
recommendations from another method (e.g., CBF). This can improve recommendation
accuracy by leveraging the strengths of both methods.
d. Meta-Level Hybrid:
• In this approach, predictions from different recommendation algorithms are treated as
input features to a meta-learner (e.g., a machine learning model). The meta-learner then
combines these predictions to generate final recommendations.
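To make the weighted hybrid concrete, here is a minimal sketch that blends CF and CBF scores with a fixed weight. The scores, the movie names, and the weight of 0.7 are hypothetical; the rule that an item scored by only one method keeps that single score is an assumption of this sketch, not something prescribed above.

```python
def weighted_hybrid(cf_scores, cbf_scores, w_cf=0.7):
    """Blend two {item: score} dicts; returns (item, score) pairs, best first."""
    blended = {}
    for item in set(cf_scores) | set(cbf_scores):
        if item in cf_scores and item in cbf_scores:
            blended[item] = w_cf * cf_scores[item] + (1 - w_cf) * cbf_scores[item]
        else:
            # Only one method produced a score (e.g. a new item with no ratings).
            blended[item] = cf_scores.get(item, cbf_scores.get(item))
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical predicted scores for one user.
cf = {"Movie A": 4.0, "Movie B": 2.0}
cbf = {"Movie A": 3.0, "Movie B": 5.0, "Movie C": 4.0}
ranked = weighted_hybrid(cf, cbf)
print(ranked)
```

In a learned variant, `w_cf` would be tuned on held-out data instead of being fixed by hand.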
Advantages of Hybrid Approaches:
• Improved Accuracy: By combining multiple methods, hybrid approaches can mitigate
weaknesses and improve recommendation accuracy.
• Diversity: Hybrid methods can provide more diverse recommendations by leveraging
different recommendation strategies.
• Robustness: They are more robust to data sparsity and the cold start problem compared to
individual CF or CBF methods.
• Improved Performance: Hybrid techniques may provide higher overall
performance by combining the capabilities of memory-based and model-based methods.
• Cold Start Mitigation: Cold-start difficulties may be mitigated by a hybrid
approach.
Examples:
• Netflix's recommendation system uses a hybrid approach, combining collaborative
filtering (based on user ratings) with content-based filtering (analyzing movie
attributes like genre).
• Amazon's recommendation system also uses a hybrid approach, combining
user-item interactions with item attributes and user demographics.
Movie Recommendation System
Data:
• User Preferences: User ratings for movies.
• Movie Attributes: Genre, director, actors, release year, etc.
Hybrid Approach Components:
1. Collaborative Filtering (CF):
• Idea: Recommend movies based on user behavior and preferences.
• Implementation:
• Use matrix factorization (e.g., Singular Value Decomposition) to learn latent
factors from user-item interactions (ratings).
• Predict ratings for unseen movies from the learned user and item factors.
2. Content-Based Filtering (CBF):
• Idea: Recommend movies based on the attributes or content of the items.
• Implementation:
• Extract features from movies such as genre, director, actors, release year.
• Build a profile for each user based on their rated movies.
• Recommend movies that are similar in content to the ones a user has liked.
3. Hybridization:
• Combining CF and CBF:
• Weighted Approach: Combine scores from CF and CBF using a weighted
sum or other fusion techniques.
• Switching Strategy: Use CF for some users and CBF for others based on
data availability or performance metrics.
• Feature Combination: Include content-based features (e.g., movie genres,
director) as additional input to the collaborative filtering model.
Recommendation Process:
• For a New User:
• If the user has not rated any movies yet:
• Use CBF to recommend movies based on their provided preferences (e.g.,
preferred genres).
• Once the user rates some movies:
• Incorporate these ratings into the CF model to provide personalized
recommendations.
• For Existing Users:
• Use the hybrid approach to generate recommendations:
• Combine CF predictions (based on user-item interactions) with CBF
recommendations (based on movie attributes).
• Present the top-rated hybrid recommendations to the user.
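The recommendation process above can be sketched as a simple switch: content-based ranking for a user with no ratings, hybrid scores otherwise. Everything in this snippet is hypothetical (the catalog, the genres, the hybrid scores, and the function name `recommend`); genre overlap stands in for a real content-based model.

```python
def recommend(user_ratings, preferred_genres, catalog, hybrid_scores, k=2):
    """catalog: movie -> set of genres; hybrid_scores: movie -> blended CF/CBF score."""
    if not user_ratings:
        # New user: fall back to content, ranking movies by overlap with stated genres.
        ranked = sorted(catalog, key=lambda m: len(catalog[m] & preferred_genres),
                        reverse=True)
    else:
        # Existing user: rank the movies not yet rated by their hybrid score.
        unseen = [m for m in catalog if m not in user_ratings]
        ranked = sorted(unseen, key=lambda m: hybrid_scores.get(m, 0.0), reverse=True)
    return ranked[:k]

catalog = {"M1": {"action"}, "M2": {"drama"}, "M3": {"action", "comedy"}}
hybrid_scores = {"M1": 4.5, "M2": 4.2, "M3": 3.8}

new_user = recommend({}, {"action", "comedy"}, catalog, hybrid_scores)
existing_user = recommend({"M2": 5}, set(), catalog, hybrid_scores)
print(new_user, existing_user)
```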
Benefits of Hybrid Approach:
• Increased Accuracy: Combining multiple recommendation techniques can lead to more
accurate predictions.
• Improved Coverage: Content-based filtering can recommend items even when user-item
interactions are sparse (cold start problem).
• Enhanced Personalization: Incorporating user preferences (CBF) along with user-item
interactions (CF) leads to more personalized recommendations.
In this movie recommendation system example, the hybrid approach leverages both collaborative
filtering and content-based filtering techniques to provide diverse and accurate movie
recommendations tailored to individual users' tastes and preferences. Hybridization allows for a more
robust recommendation system that can handle various scenarios and user behaviours effectively.
NEAREST NEIGHBOR COLLABORATIVE FILTERING
• Neighborhood-based recommender systems fall under the collaborative filtering umbrella
and focus on using behavioral patterns, such as movies that users have watched in the past,
to identify similar users (i.e., users who demonstrate similar preferences), or similar items (i.e.,
items that receive similar interest from the same users).
• Nearest Neighbors Collaborative Filtering (NNCF) is a technique used in recommendation
systems to predict user preferences based on the similarity between users or items. It falls
under the umbrella of Collaborative Filtering (CF), which utilizes the collective wisdom of
users to make recommendations.
• User-based Collaborative Filtering (UBCF):
o Predict a user's preference for an item by finding similar users based on their historical
ratings.
• Item-based Collaborative Filtering (IBCF):
o Predict a user's preference for an item by finding similar items based on how users have
rated them.
Steps Involved in Nearest Neighbors Collaborative Filtering
Step-1: Data Representation: Represent user-item interactions as a matrix R, where rows correspond
to users and columns correspond to items. Each entry R_ui represents user u's rating of (or interaction
with) item i.
Step-2: Similarity Calculation: Compute similarity between users (for UBCF) or items (for IBCF)
based on their rating patterns. Common similarity metrics include cosine similarity, Pearson
correlation, or Jaccard similarity.
Step-3: Nearest Neighbors Selection: For a given user u (or item i), identify the k most similar users
(or items) based on the computed similarity scores. Nearest Neighbors are typically selected based
on the highest similarity scores.
Step-4: Prediction:
• UBCF Prediction: Predict user u's rating for item i by averaging the ratings of the k nearest
Neighbors who have rated item i, weighted by their similarity to user u.
• IBCF Prediction: Predict user u's rating for item i by combining ratings of items similar to
item i, weighted by the similarity between items.
• We refer to the technique that computes similar users as user-based and to the technique that
focuses on computing similar items as item-based.
• An example of the item-based technique is Netflix’s “Because you watched…” feature,
which recommends movies or shows based on examples that users previously showed
interest in.
• An example of a user-based recommender system is [Link], which recommends
destinations based on the historical behavior of other users with similar travel history.
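Steps 2 and 3 (similarity calculation and nearest-neighbor selection) can be sketched for the user-based case as follows. The three users and their ratings are hypothetical, and computing cosine similarity only over the items both users have rated is one common convention assumed here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two {item: rating} dicts over their co-rated items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k_neighbors(target, ratings, k):
    """Return the k users most similar to `target` as (user, similarity) pairs."""
    sims = [(cosine(ratings[target], r), v) for v, r in ratings.items() if v != target]
    sims.sort(reverse=True)
    return [(v, s) for s, v in sims[:k]]

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u": {"a": 5, "b": 1},
    "v": {"a": 4, "b": 1},
    "w": {"a": 1, "b": 5},
}
neighbors = top_k_neighbors("u", ratings, k=1)
print(neighbors)
```

The item-based variant is obtained by transposing the data, so that each item is described by the ratings it received.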
Pipeline Overview
The image below summarizes the pipeline for our implementation of item-based and user-based
recommender systems in our declarative language, Rel. Without loss of generality, we focus on a
movie recommendation use case, where we are given interactions between users and movies.
Step 1: We convert user-item interactions to a bipartite graph.
The first step is to convert the input interactions data to a bipartite graph that contains two types of
nodes: Users and Movies, as shown in the image below.
The two node types are connected by an edge that we call watched. In Rel, Users and Movies are
represented by entity types, and their attributes, such as id and name, are represented by value types.
Step 2: MovieLens Graph. We use user-item interactions to compute item-item and user-user similarities
by leveraging the functions supported by the graph analytics library.
• Once we define the entity and value types, the next step is to populate the entities with data
from the original MovieLens dataset.
• Assuming we have a relation called watched_train(user, movie) that represents the train
subset of the MovieLens data and contains the watch history for the users, and a relation called
movie_info(movie, movie_name) that contains the movie names, we create the Movie entity
as follows:
• The User entity is created similarly. Finally, we add an additional edge called watched that
connects the movie entity to the user entity.
Step 3: Similarity Computation.
• We use the similarities to predict the scores for all (user, movie) pairs. Each score is an
indication of how likely it is for a user to interact with a movie.
• Now that we have modeled our data as a graph, we can compute item-item and user-user
similarities using the user-item interactions: movies that have been watched by the same users
will have a high similarity value, while movies that have been watched by different users will
have a low similarity value.
• Here, we focus on the item-based method. The approach for the user-based method is very
similar. There are several similarity metrics that can be used for this task. Currently, the Rel
graph library provides the cosine_similarity and jaccard_similarity relations.
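To illustrate what these two metrics measure on watch data, here is a small Python sketch (not Rel, and not the Rel graph library's implementation). Each movie is represented by the set of users who watched it; the watch sets and the function names `jaccard` and `set_cosine` are hypothetical.

```python
import math

def jaccard(a, b):
    """Shared viewers divided by all viewers of either movie."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def set_cosine(a, b):
    """Cosine similarity of two binary watch vectors, written in terms of sets."""
    return len(a & b) / math.sqrt(len(a) * len(b)) if a and b else 0.0

# Hypothetical watch history: movie -> set of users who watched it.
watched_by = {
    "M1": {"u1", "u2", "u3"},
    "M2": {"u1", "u2"},
    "M3": {"u4"},
}
print(jaccard(watched_by["M1"], watched_by["M2"]))
print(set_cosine(watched_by["M1"], watched_by["M2"]))
print(jaccard(watched_by["M1"], watched_by["M3"]))
```

Movies watched by the same users (M1 and M2) get a high similarity, while movies with disjoint audiences (M1 and M3) get zero.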
Step 4: Scoring
• We sort the scores for every user in order to generate top-k recommendations.
• Using the similarities calculated in the previous step, we then compute the (user, movie) scores
for all pairs. We predict that a user will watch movies that are similar to the movies they have
watched in the past (item-based approach).
• The score for a pair (user, movie) indicates how likely it is for a user to watch a movie.
Pearson correlation similarity is used in collaborative filtering (recommender systems) to
measure the similarity between two users (or items).
Pearson Similarity Formula
$$\mathrm{Sim}(a, b) = \frac{\sum_{p} (r_{ap} - \bar{r}_a)(r_{bp} - \bar{r}_b)}{\sqrt{\sum_{p} (r_{ap} - \bar{r}_a)^{2}} \, \sqrt{\sum_{p} (r_{bp} - \bar{r}_b)^{2}}}$$
Meaning of Symbols
Symbol              Meaning
$\mathrm{Sim}(a, b)$   Similarity between user a and user b
$r_{ap}$            Rating given by user a to item p
$r_{bp}$            Rating given by user b to item p
$\bar{r}_a$         Average rating of user a
$\bar{r}_b$         Average rating of user b
$p$                 Items rated by both users
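A minimal Python sketch of this formula is given below. The two rating dictionaries use the Alice and U1 ratings from the worked example later in this unit (item labels I1 to I5); following that example, each user's average is taken over all the ratings that user has given, while the sums run only over items rated by both users. The function name `pearson` is an illustrative choice.

```python
import math

def pearson(ra, rb):
    """Pearson similarity between two {item: rating} dicts."""
    mean_a = sum(ra.values()) / len(ra)
    mean_b = sum(rb.values()) / len(rb)
    common = set(ra) & set(rb)  # items p rated by both users
    num = sum((ra[p] - mean_a) * (rb[p] - mean_b) for p in common)
    den_a = math.sqrt(sum((ra[p] - mean_a) ** 2 for p in common))
    den_b = math.sqrt(sum((rb[p] - mean_b) ** 2 for p in common))
    return num / (den_a * den_b) if den_a and den_b else 0.0

alice = {"I1": 5, "I2": 4, "I3": 1, "I4": 4}
u1 = {"I1": 3, "I2": 1, "I3": 2, "I4": 3, "I5": 3}
print(round(pearson(alice, u1), 3))
```

Because Alice has not rated I5, that item is automatically left out of the sums.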
Step 2: Prediction of the missing rating of an item
The target user may be very similar to some users and much less similar to others. Hence, the ratings
given to a particular item by the more similar users should be given more weight than those given by
less similar users. This is done with a weighted average: each neighbor's rating is weighted by the
similarity computed with the formula above. The missing rating is then calculated with the
Resnick prediction formula, which is widely used in User-Based Collaborative Filtering (UBCF) in
recommender systems.
Resnick Prediction Formula
$$r_{up} = \bar{r}_u + \frac{\sum_{i \in \text{users}} \mathrm{sim}(u, i)\,(r_{ip} - \bar{r}_i)}{\sum_{i \in \text{users}} \lvert \mathrm{sim}(u, i) \rvert}$$
Meaning of Symbols
Symbol                 Meaning
$r_{up}$               Predicted rating of user u for item p
$\bar{r}_u$            Average rating of user u
$\mathrm{sim}(u, i)$   Similarity between user u and user i
$r_{ip}$               Rating given by user i for item p
$\bar{r}_i$            Average rating of user i
$i$                    Neighbor users similar to user u
User    I1   I2   I3   I4   I5 (BBC)
Alice   5    4    1    4    ?
U1      3    1    2    3    3
U2      4    3    4    3    5
U3      3    3    1    5    4
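In User-Based Collaborative Filtering, when computing similarity, we exclude the item whose rating
we want to predict. So I5 (BBC) is excluded from the similarity sums, which run over I1 to I4.

Mean ratings (over all ratings each user has given)
Alice
$$\bar{R}_A = \frac{5 + 4 + 1 + 4}{4} = 3.5$$
U1
$$\bar{R}_{U1} = \frac{3 + 1 + 2 + 3 + 3}{5} = 2.4$$
U2
$$\bar{R}_{U2} = \frac{4 + 3 + 4 + 3 + 5}{5} = 3.8$$
U3
$$\bar{R}_{U3} = \frac{3 + 3 + 1 + 5 + 4}{5} = 3.2$$

Similarity between Alice and U1
Numerator
$$(1.5)(0.6) + (0.5)(-1.4) + (-2.5)(-0.4) + (0.5)(0.6) = 0.9 - 0.7 + 1.0 + 0.3 = 1.5$$
Denominator
$$\sqrt{1.5^{2} + 0.5^{2} + (-2.5)^{2} + 0.5^{2}} = \sqrt{2.25 + 0.25 + 6.25 + 0.25} = \sqrt{9} = 3$$
$$\sqrt{0.6^{2} + (-1.4)^{2} + (-0.4)^{2} + 0.6^{2}} = \sqrt{0.36 + 1.96 + 0.16 + 0.36} = \sqrt{2.84} = 1.685$$
$$\mathrm{Sim}(A, U1) = \frac{1.5}{3 \times 1.685} = 0.297$$

Similarity between Alice and U2
Numerator
$$(1.5)(0.2) + (0.5)(-0.8) + (-2.5)(0.2) + (0.5)(-0.8) = 0.3 - 0.4 - 0.5 - 0.4 = -1$$
Denominator
$$\sqrt{9} = 3$$
$$\sqrt{0.2^{2} + (-0.8)^{2} + 0.2^{2} + (-0.8)^{2}} = \sqrt{1.36} = 1.166$$
$$\mathrm{Sim}(A, U2) = \frac{-1}{3 \times 1.166} = -0.286$$

Similarity between Alice and U3
Numerator
$$(1.5)(-0.2) + (0.5)(-0.2) + (-2.5)(-2.2) + (0.5)(1.8) = -0.3 - 0.1 + 5.5 + 0.9 = 6$$
Denominator
$$\sqrt{9} = 3$$
$$\sqrt{0.04 + 0.04 + 4.84 + 3.24} = \sqrt{8.16} = 2.857$$
$$\mathrm{Sim}(A, U3) = \frac{6}{3 \times 2.857} = 0.700$$

Prediction of Alice's rating for I5 (BBC)
Numerator
U1: 0.297 × (3 − 2.4) = 0.297 × 0.6 = 0.178
U2: −0.286 × (5 − 3.8) = −0.286 × 1.2 = −0.343
U3: 0.700 × (4 − 3.2) = 0.700 × 0.8 = 0.560
Total: 0.178 − 0.343 + 0.560 = 0.395
Denominator
|0.297| + |−0.286| + |0.700| = 1.283
Final Result
$$r_{A,I5} = 3.5 + \frac{0.395}{1.283} = 3.5 + 0.308 \approx 3.81 \approx 4$$
Predicted rating of Alice for BBC (I5) = 3.81 ≈ 4
Thus, BBC is recommended to Alice.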
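The final prediction step can be checked with a short Python sketch of the Resnick formula. The similarities are passed in as plain numbers (the Pearson values of this example rounded to three decimals), together with each neighbor's rating for I5 and each neighbor's mean; the function name `resnick` and the tuple layout are illustrative choices.

```python
def resnick(mean_u, neighbors):
    """neighbors: list of (similarity, neighbor's rating for the item, neighbor's mean)."""
    num = sum(s * (r - m) for s, r, m in neighbors)
    den = sum(abs(s) for s, _, _ in neighbors)
    # With no usable neighbors, fall back to the user's own mean.
    return mean_u if den == 0 else mean_u + num / den

# (sim to Alice, rating for I5, mean rating) for U1, U2, U3.
neighbors = [(0.297, 3, 2.4), (-0.286, 5, 3.8), (0.700, 4, 3.2)]
prediction = resnick(3.5, neighbors)
print(round(prediction, 2))
```

Note that U2, who is negatively correlated with Alice, still contributes: a rating above U2's mean pulls Alice's prediction down.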
b) Item-to-Item Based Collaborative Filtering
• Collaborative Filtering is a technique or a method to predict a user’s taste and find the items
that a user might prefer on the basis of information collected from various other users having
similar tastes or preferences.
• It takes into consideration the basic fact that if person X and person Y have a certain reaction
for some items then they might have the same opinion for other items too.
• The two most popular forms of collaborative filtering are:
• User Based: Here, we look for the users who have rated various items in the same way and
then find the rating of the missing item with the help of these users.
• Item Based: Here, we explore the relationship between pairs of items (users who
bought Y also bought Z). We find the missing rating with the help of the ratings the user has
given to other items.
• The similarity between item pairs can be found in different ways. One of the most common
methods is to use cosine similarity.
• We then generate predictions based on the ratings of similar products, using a formula that
computes the rating for a particular item as a weighted sum of the user's ratings of the
other, similar products.
The item-based collaborative filtering prediction formula is:
$$\mathrm{rating}(U, I_i) = \frac{\sum_{j} \mathrm{rating}(U, I_j) \times s_{ij}}{\sum_{j} s_{ij}}$$
Meaning of Symbols
• 𝑟𝑎𝑡𝑖𝑛𝑔(𝑈, 𝐼𝑖 )→ predicted rating of user U for item 𝐼𝑖
• 𝑟𝑎𝑡𝑖𝑛𝑔(𝑈, 𝐼𝑗 )→ rating given by the user to similar item 𝐼𝑗
• 𝑠𝑖𝑗 → similarity between item 𝐼𝑖 and item 𝐼𝑗
• ∑→ summation over all similar items
User     Item 1   Item 2   Item 3
User_1   2        –        3
User_2   5        2        –
User_3   3        3        1
User_4   –        2        2
To find the missing ratings using Item-Based Collaborative Filtering (IBCF), we compute similarity
between items and then predict the missing value using ratings of similar items.
$$r_{ui} = \frac{\sum_{j} r_{uj}\, s_{ij}}{\sum_{j} s_{ij}}$$
Where
• $r_{ui}$ = predicted rating of user u for item i
• $r_{uj}$ = rating given by user u to item j
• $s_{ij}$ = similarity between items i and j
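Cosine similarity is computed over the users who have rated both items.

Similarity between items 1 and 2 (both rated by User_2 and User_3)
Numerator
$$5 \times 2 + 3 \times 3 = 19$$
Denominator
$$\sqrt{5^{2} + 3^{2}} \times \sqrt{2^{2} + 3^{2}} = \sqrt{34} \times \sqrt{13} = 5.83 \times 3.61 = 21.05$$
$$s_{12} = 19 / 21.05 = 0.90$$

Similarity between items 1 and 3 (both rated by User_1 and User_3)
Numerator
$$2 \times 3 + 3 \times 1 = 9$$
Denominator
$$\sqrt{2^{2} + 3^{2}} \times \sqrt{3^{2} + 1^{2}} = \sqrt{13} \times \sqrt{10} = 3.61 \times 3.16 = 11.41$$
$$s_{13} = 9 / 11.41 = 0.79$$

Similarity between items 2 and 3 (both rated by User_3 and User_4)
Numerator
$$3 \times 1 + 2 \times 2 = 7$$
Denominator
$$\sqrt{3^{2} + 2^{2}} \times \sqrt{1^{2} + 2^{2}} = \sqrt{13} \times \sqrt{5} = 3.61 \times 2.24 = 8.09$$
$$s_{23} = 7 / 8.09 = 0.87$$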
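Predicted rating of User_1 for item 2
Numerator
$$2 \times 0.90 + 3 \times 0.87 = 1.8 + 2.61 = 4.41$$
Denominator
$$0.90 + 0.87 = 1.77$$
$$r_{1,2} = 4.41 / 1.77 = 2.49 \approx 2$$

Predicted rating of User_2 for item 3
Numerator
$$5 \times 0.79 + 2 \times 0.87 = 3.95 + 1.74 = 5.69$$
Denominator
$$0.79 + 0.87 = 1.66$$
$$r_{2,3} = 5.69 / 1.66 = 3.43 \approx 3$$

Predicted rating of User_4 for item 1
Numerator
$$2 \times 0.90 + 2 \times 0.79 = 1.8 + 1.58 = 3.38$$
Denominator
$$0.90 + 0.79 = 1.69$$
$$r_{4,1} = 3.38 / 1.69 = 2$$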
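The item-based calculation above can be reproduced with a short Python sketch. The ratings are those of the four-user table in this example, with items identified by the integers 1 to 3; the function names `item_cosine` and `predict` are illustrative. Because the code does not round the similarities before using them, results can differ from the hand calculation in the last decimal.

```python
import math

# user -> {item: rating}; missing ratings are simply absent.
ratings = {
    "User_1": {1: 2, 3: 3},
    "User_2": {1: 5, 2: 2},
    "User_3": {1: 3, 2: 3, 3: 1},
    "User_4": {2: 2, 3: 2},
}

def item_cosine(i, j):
    """Cosine similarity between items i and j over the users who rated both."""
    pairs = [(r[i], r[j]) for r in ratings.values() if i in r and j in r]
    dot = sum(a * b for a, b in pairs)
    norm_i = math.sqrt(sum(a * a for a, _ in pairs))
    norm_j = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0

def predict(user, item):
    """Similarity-weighted average of the user's ratings for the other items."""
    rated = ratings[user]
    num = sum(r * item_cosine(item, j) for j, r in rated.items())
    den = sum(item_cosine(item, j) for j in rated)
    return num / den if den else None

print(round(item_cosine(1, 2), 2), round(item_cosine(1, 3), 2), round(item_cosine(2, 3), 2))
print(round(predict("User_1", 2), 2), round(predict("User_2", 3), 2), round(predict("User_4", 1), 2))
```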
Advantages:
• Simple and intuitive approach to collaborative filtering.
• Effective in scenarios where users/items have sparse interactions.
• Can capture complex user-item relationships based on similarity metrics.
Challenges and Considerations:
• Data Sparsity: Nearest Neighbors CF may struggle with sparse datasets, where not all
users have rated many items.
• Scalability: Computing pairwise similarities can be computationally expensive for
large datasets.
• Cold Start Problem: Nearest Neighbors CF may face challenges when dealing with new
users or items with few ratings.
COMPONENTS OF NEIGHBORHOOD METHODS
The three very important considerations in the implementation of a neighborhood-based
recommender system are
1) the normalization of ratings,
2) the computation of the similarity weights, and
3) the selection of neighbors.
• Neighborhood Selection: Once the similarity between users or items is computed, the next step
is to select a subset of neighbors that are most similar to the target user or item. This subset is
known as the neighborhood. The size of the neighborhood, i.e., the number of nearest neighbors
to consider, can be fixed or adaptive.
• Rating Prediction: After selecting the neighborhood, the algorithm predicts the rating of a target
user for an item by aggregating the ratings of its neighbors for that item. This can be done using
various aggregation functions such as weighted average, weighted sum, or regression-based
methods.
• Sparse Data Handling: Neighborhood methods often face the challenge of dealing with sparse
data, where many user-item pairs have no ratings. Various strategies such as neighborhood
expansion, imputation, or incorporating auxiliary information may be employed to handle sparse
data and improve recommendation quality.
• When it comes to assigning a rating to an item, each user has its own personal scale. Even if
an explicit definition of each of the possible ratings is supplied (e.g., 1=“strongly disagree”,
2=“disagree”, 3=“neutral”, etc.), some users might be reluctant to give high/low scores to items
they liked/disliked.
• Two of the most popular rating normalization schemes that have been proposed to
convert individual ratings to a more universal scale are mean-centering and Z-score
normalization.
I. Mean-centering
Mean Centering in Recommender Systems
Mean centering is a normalization technique used in Collaborative Filtering to remove the bias of
users or items when calculating similarities or predicting ratings.
Different users rate items differently:
• Some users usually give high ratings
• Some users usually give low ratings
Mean centering adjusts ratings so that we measure true preference instead of rating habits.
Mean Centering Formula - For User Mean Centering
$r'_{u,i} = r_{u,i} - \bar{r}_u$
Where
• $r_{u,i}$ = rating given by user $u$ to item $i$
• $\bar{r}_u$ = average rating of user $u$
• $r'_{u,i}$ = mean-centered rating
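John (Wall-E is unrated, so the mean is taken over his four known ratings)
$\bar{r}_{John} = \frac{5 + 3 + 4 + 4}{4} = \frac{16}{4} = 4$
Lucy
$\bar{r}_{Lucy} = \frac{3 + 1 + 2 + 3 + 3}{5} = \frac{12}{5} = 2.4$
Eric
$\bar{r}_{Eric} = \frac{4 + 3 + 4 + 3 + 5}{5} = \frac{19}{5} = 3.8$
Diane
$\bar{r}_{Diane} = \frac{3 + 3 + 4 + 5 + 4}{5} = \frac{19}{5} = 3.8$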
Mean-Centered Matrix
User | Matrix | Titanic | Die Hard | Forrest Gump | Wall-E
John | 5−4 = 1 | 3−4 = −1 | 4−4 = 0 | 4−4 = 0 | ?
Lucy | 3−2.4 = 0.6 | 1−2.4 = −1.4 | 2−2.4 = −0.4 | 3−2.4 = 0.6 | 3−2.4 = 0.6
Eric | 4−3.8 = 0.2 | 3−3.8 = −0.8 | 4−3.8 = 0.2 | 3−3.8 = −0.8 | 5−3.8 = 1.2
Diane | 3−3.8 = −0.8 | 3−3.8 = −0.8 | 4−3.8 = 0.2 | 5−3.8 = 1.2 | 4−3.8 = 0.2
Interpretation
Example: Diane
Original ratings
Movie Rating
Titanic 3
Forrest Gump 5
Mean rating: 𝟑. 𝟖
Mean-centered values
Movie Mean-Centered
Titanic −0.8
Forrest Gump +1.2
This shows:
• Titanic → below Diane's average preference
• Forrest Gump → above Diane's average preference
Even though Titanic has rating 3, it becomes negative preference after mean-centering.
Simple Intuition
Rating Situation Meaning
Positive value User likes item more than average
Zero Neutral preference
Negative value User likes item less than average
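The mean-centering step above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available and that John's unknown Wall-E rating is stored as `np.nan`; the array layout and the helper name `mean_center_by_user` are choices made here for illustration, not part of the original notes.

```python
import numpy as np

# Ratings from the worked example; np.nan marks John's unknown Wall-E rating.
# Columns: Matrix, Titanic, Die Hard, Forrest Gump, Wall-E
users = ["John", "Lucy", "Eric", "Diane"]
ratings = np.array([
    [5, 3, 4, 4, np.nan],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 4, 5, 4],
], dtype=float)

def mean_center_by_user(r):
    """Subtract each user's mean (over rated items only) from that user's ratings."""
    user_means = np.nanmean(r, axis=1, keepdims=True)
    return r - user_means, user_means.ravel()

centered, means = mean_center_by_user(ratings)
for name, mean, row in zip(users, means, centered):
    print(name, round(float(mean), 2), np.round(row, 2))
```

Missing ratings stay `nan` after centering, so they are never mistaken for a neutral (zero) preference.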
Detailed step-by-step solution to predict John’s rating for Wall-E using User-Based Collaborative
Filtering with Mean-Centered Ratings (Resnick Formula).
User Matrix Titanic Die Hard Forrest Gump
John 1 -1 0 0
Lucy 0.6 -1.4 -0.4 0.6
Eric 0.2 -0.8 0.2 -0.8
Diane -0.8 -0.8 0.2 1.2
Compute Similarity (Pearson) Between John and Other Users
We use only co-rated movies
(Matrix, Titanic, Die Hard, Forrest Gump)
Similarity (John, Lucy)
Numerator
$(1 \times 0.6) + (-1 \times -1.4) + (0 \times -0.4) + (0 \times 0.6) = 0.6 + 1.4 + 0 + 0 = 2$
Denominator
John part
$\sqrt{1^2 + (-1)^2 + 0^2 + 0^2} = \sqrt{2} = 1.414$
Lucy part
$\sqrt{0.6^2 + (-1.4)^2 + (-0.4)^2 + 0.6^2} = \sqrt{0.36 + 1.96 + 0.16 + 0.36} = \sqrt{2.84} = 1.685$
Similarity
$sim(J, L) = \frac{2}{1.414 \times 1.685} = \frac{2}{2.383} = 0.84$
Similarity (John, Eric)
Numerator
$(1 \times 0.2) + (-1 \times -0.8) + (0 \times 0.2) + (0 \times -0.8) = 0.2 + 0.8 = 1$
Denominator
John
$\sqrt{2} = 1.414$
Eric
$\sqrt{0.2^2 + (-0.8)^2 + 0.2^2 + (-0.8)^2} = \sqrt{1.36} = 1.166$
Similarity
$sim(J, E) = \frac{1}{1.414 \times 1.166} = \frac{1}{1.649} = 0.61$
Similarity (John, Diane)
Numerator
$(1 \times -0.8) + (-1 \times -0.8) + (0 \times 0.2) + (0 \times 1.2) = -0.8 + 0.8 = 0$
Similarity
$sim(J, D) = 0$. So Diane does not influence the prediction.
Ratings for Wall-E
User Rating Mean Deviation
Lucy 3 2.4 0.6
Eric 5 3.8 1.2
Diane 4 3.8 0.2
Apply Resnick Prediction Formula
$r_{u,i} = \bar{r}_u + \frac{\sum_v sim(u, v)(r_{v,i} - \bar{r}_v)}{\sum_v |sim(u, v)|}$
Calculate Numerator
$(0.84 \times 0.6) + (0.61 \times 1.2) + (0 \times 0.2) = 0.504 + 0.732 = 1.236$
Calculate Denominator
$|0.84| + |0.61| + |0| = 1.45$
Final Prediction
$r_{John,WallE} = 4 + \frac{1.236}{1.45} = 4 + 0.85 = 4.85$
Final Result
Predicted Rating for John on Wall-E ≈ 4.85
Interpretation
• Eric and Lucy are similar to John
• Both rated Wall-E above their own average rating
• Therefore, John is predicted to like Wall-E
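The user-based prediction above can be checked with a short script. This is a sketch under stated assumptions: similarity is computed as in the worked example (cosine of the mean-centered ratings over the co-rated movies, with each user's mean taken over all of that user's ratings), missing ratings are `np.nan`, and the function name `predict_user_based` is hypothetical.

```python
import numpy as np

# Columns: Matrix, Titanic, Die Hard, Forrest Gump, Wall-E
# Rows: John, Lucy, Eric, Diane; np.nan marks the rating to predict.
ratings = np.array([
    [5, 3, 4, 4, np.nan],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 4, 5, 4],
], dtype=float)

def predict_user_based(r, u, i):
    """Resnick-style prediction of user u's rating for item i."""
    means = np.nanmean(r, axis=1)      # per-user mean over rated items
    dev = r - means[:, None]           # mean-centered ratings
    num = den = 0.0
    for v in range(r.shape[0]):
        if v == u or np.isnan(r[v, i]):
            continue                   # a neighbor must have rated item i
        common = ~np.isnan(r[u]) & ~np.isnan(r[v])   # co-rated items
        a, b = dev[u, common], dev[v, common]
        norm = np.linalg.norm(a) * np.linalg.norm(b)
        sim = float(a @ b / norm) if norm > 0 else 0.0
        num += sim * dev[v, i]
        den += abs(sim)
    return float(means[u]) if den == 0 else float(means[u] + num / den)

pred = predict_user_based(ratings, u=0, i=4)
print(round(pred, 2))
```

Running it on the example matrix gives a value close to the hand-computed prediction; small differences in the last digit come from rounding the intermediate similarities by hand.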
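Mean Centering Formula - For Item Mean Centering
$r'_{u,i} = r_{u,i} - \bar{r}_i$
Where
• $r_{u,i}$ = rating given by user $u$ to item $i$
• $\bar{r}_i$ = average rating of item $i$
• $r'_{u,i}$ = item mean-centered rating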
Given Rating Matrix
User Matrix Titanic Die Hard Forrest Gump
John 5 1 – 2
Lucy 1 5 2 5
Eric 2 ? 3 5
Diane 4 3 5 3
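Step 1: Calculate Item Mean
Matrix
$\bar{r}_{Matrix} = \frac{5 + 1 + 2 + 4}{4} = \frac{12}{4} = 3$
Titanic
$\bar{r}_{Titanic} = \frac{1 + 5 + 3}{3} = \frac{9}{3} = 3$
Die Hard
$\bar{r}_{DieHard} = \frac{2 + 3 + 5}{3} = \frac{10}{3} = 3.33$
Forrest Gump
$\bar{r}_{Forrest} = \frac{2 + 5 + 5 + 3}{4} = \frac{15}{4} = 3.75$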
Titanic
User Calculation Result
John 1 − 3 −2
Lucy 5 − 3 2
Diane 3 − 3 0
(Eric unknown)
Die Hard
User Calculation Result
Lucy 2 − 3.33 −1.33
Eric 3 − 3.33 −0.33
Diane 5 − 3.33 1.67
Forrest Gump
User Calculation Result
John 2 − 3.75 −1.75
Lucy 5 − 3.75 1.25
Eric 5 − 3.75 1.25
Diane 3 − 3.75 −0.75
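Interpretation
Example: Titanic
Average rating of Titanic
$\bar{r}_{Titanic} = 3$
Now we will predict Eric's missing rating for Titanic (?) using Item-Based Collaborative Filtering
with Item Mean-Centering.
Similarity (Titanic, Matrix)
Common users: John, Lucy, Diane
Numerator
$(-2)(2) + (2)(-2) + (0)(1) = -8$
Denominator
$\sqrt{(-2)^2 + 2^2 + 0^2} \times \sqrt{2^2 + (-2)^2 + 1^2} = \sqrt{8} \times \sqrt{9} = 2.83 \times 3 = 8.49$
$sim(Titanic, Matrix) = -8 / 8.49 = -0.94$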
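Similarity (Titanic, Die Hard)
Common users: Lucy, Diane
Numerator
$(2)(-1.33) + (0)(1.67) = -2.66$
Denominator
$\sqrt{4} \times \sqrt{4.56} = 2 \times 2.14 = 4.28$
$sim(Titanic, DieHard) = -2.66 / 4.28 = -0.62$
Similarity (Titanic, Forrest Gump)
Common users: John, Lucy, Diane
Numerator
$(-2)(-1.75) + (2)(1.25) + (0)(-0.75) = 3.5 + 2.5 = 6$
Denominator
$\sqrt{8} \times \sqrt{5.19} = 2.83 \times 2.28 = 6.45$
$sim(Titanic, ForrestGump) = 6 / 6.45 = 0.93$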
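Prediction Formula
$\hat{r}_{u,i} = \bar{r}_i + \frac{\sum_j sim(i, j)\, r'_{u,j}}{\sum_j |sim(i, j)|}$
Where
$r'_{u,j}$ = mean-centered rating.
Eric’s Known Ratings
Movie Mean-centered rating
Matrix -1
Die Hard -0.33
Forrest Gump 1.25
Compute Numerator
Matrix contribution = $(-1)(-0.94) = 0.94$
Die Hard contribution = $(-0.33)(-0.62) = 0.205$
Forrest Gump contribution = $(1.25)(0.93) = 1.1625$
Total = $0.94 + 0.205 + 1.1625 \approx 2.307$
Denominator
$|-0.94| + |-0.62| + |0.93| = 0.94 + 0.62 + 0.93 = 2.49$
Predicted Mean-Centered Rating
$r' = \frac{2.307}{2.49} = 0.93$
Final Rating
Add Titanic mean
$r_{Eric,Titanic} = 3 + 0.93 = 3.93$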
Completed Matrix
User Matrix Titanic Die Hard Forrest Gump
John 5 1 – 2
Lucy 1 5 2 5
Eric 2 4 3 5
Diane 4 3 5 3
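The item-based calculation for Eric and Titanic can be reproduced with the sketch below. It assumes the same conventions as the worked example: item means are taken over the users who rated the item, similarities use only users who rated both items, and missing ratings are `np.nan`. The function name `predict_item_based` is a hypothetical helper.

```python
import numpy as np

# Columns: Matrix, Titanic, Die Hard, Forrest Gump
# Rows: John, Lucy, Eric, Diane; np.nan marks a missing rating.
ratings = np.array([
    [5, 1, np.nan, 2],
    [1, 5, 2, 5],
    [2, np.nan, 3, 5],
    [4, 3, 5, 3],
], dtype=float)

def predict_item_based(r, u, i):
    """Predict user u's rating for item i from item-mean-centered ratings."""
    item_means = np.nanmean(r, axis=0)   # per-item mean over known ratings
    dev = r - item_means                 # item mean-centered matrix
    num = den = 0.0
    for j in range(r.shape[1]):
        if j == i or np.isnan(r[u, j]):
            continue                     # use only items the user has rated
        common = ~np.isnan(r[:, i]) & ~np.isnan(r[:, j])  # users who rated both
        a, b = dev[common, i], dev[common, j]
        norm = np.linalg.norm(a) * np.linalg.norm(b)
        sim = float(a @ b / norm) if norm > 0 else 0.0
        num += sim * dev[u, j]
        den += abs(sim)
    return float(item_means[i]) if den == 0 else float(item_means[i] + num / den)

pred = predict_item_based(ratings, u=2, i=1)   # Eric, Titanic
print(round(pred, 2))
```

The result rounds to the 4 shown for Eric in the completed matrix.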
Calculate Mean and Standard Deviation: Compute the mean and standard deviation of each
item's ratings across all users.
Step 1: Compute mean (μ) and std deviation (σ) per user
User 1 Ratings: [5, 3, 0]
Mean $\mu_1 = \frac{5 + 3 + 0}{3} = 2.67$
Std dev $\sigma_1 = \sqrt{\frac{(5 - 2.67)^2 + (3 - 2.67)^2 + (0 - 2.67)^2}{3}} = \sqrt{4.22} \approx 2.05$
User 2 Ratings: [4, 0, 0]
Mean $\mu_2 = \frac{4 + 0 + 0}{3} = 1.33$
Std dev $\sigma_2 = \sqrt{\frac{(4 - 1.33)^2 + (0 - 1.33)^2 + (0 - 1.33)^2}{3}} = \sqrt{3.56} \approx 1.89$
User 3 Ratings: [1, 1, 0]
Mean $\mu_3 = \frac{1 + 1 + 0}{3} = 0.67$
Std dev $\sigma_3 = \sqrt{\frac{(1 - 0.67)^2 + (1 - 0.67)^2 + (0 - 0.67)^2}{3}} = \sqrt{0.22} \approx 0.47$
Step 2: Apply Z-score Formula
Item 1 Item 2 Item 3
User 1 (5−2.67)/2.05 ≈ 1.14 (3−2.67)/2.05 ≈ 0.16 (0−2.67)/2.05 ≈ -1.30
User 2 (4−1.33)/1.89 ≈ 1.41 (0−1.33)/1.89 ≈ -0.71 (0−1.33)/1.89 ≈ -0.71
User 3 (1−0.67)/0.47 ≈ 0.70 (1−0.67)/0.47 ≈ 0.70 (0−0.67)/0.47 ≈ -1.41
Where:
Symbol | Meaning
$\bar{r}_i$ | average rating of item i
$\sigma_i$ | standard deviation of item ratings
Predicted rating:
$\hat{r}_{ui} = \bar{r}_i + \sigma_i \frac{\sum_{j \in N(i)} sim(i, j)\, z_{uj}}{\sum_{j \in N(i)} |sim(i, j)|}$
Item 1: [5, 4, 1]
Mean = $\mu_1 = (5 + 4 + 1) / 3 = 3.33$
Standard Deviation = $\sigma_1 = \sqrt{\frac{(5 - 3.33)^2 + (4 - 3.33)^2 + (1 - 3.33)^2}{3}} \approx \sqrt{2.89} \approx 1.70$
Item 2: [3, 0, 1]
Mean = $\mu_2 = (3 + 0 + 1) / 3 = 1.33$
Standard Deviation = $\sigma_2 = \sqrt{\frac{(3 - 1.33)^2 + (0 - 1.33)^2 + (1 - 1.33)^2}{3}} \approx \sqrt{1.56} \approx 1.25$
Item 3: [0, 0, 0]
Mean = $\mu_3 = 0$
Standard Deviation = $\sigma_3 = 0$
We'll handle division by 0 using a common rule: set z-score to 0 where standard deviation is 0 (no
variation).
Step 2: Apply Z-score
Item 1 Item 2 Item 3
User 1 (5−3.33)/1.70 ≈ 0.98 (3−1.33)/1.25 ≈ 1.33 0
User 2 (4−3.33)/1.70 ≈ 0.39 (0−1.33)/1.25 ≈ -1.06 0
User 3 (1−3.33)/1.70 ≈ -1.37 (1−1.33)/1.25 ≈ -0.27 0
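Both Z-score tables (per user and per item) can be generated by one small function. The sketch below assumes NumPy, treats the 0 entries as actual ratings exactly as the worked example does, uses the population standard deviation (dividing by n), and applies the rule stated above of setting the Z-score to 0 where the standard deviation is 0. The name `zscore` is a local helper defined here, not a library call.

```python
import numpy as np

# Rows: User 1..3, columns: Item 1..3 (values from the worked example).
ratings = np.array([
    [5, 3, 0],
    [4, 0, 0],
    [1, 1, 0],
], dtype=float)

def zscore(r, axis):
    """Z-score along an axis: axis=1 normalizes per user, axis=0 per item."""
    mean = r.mean(axis=axis, keepdims=True)
    std = r.std(axis=axis, keepdims=True)      # population std (divides by n)
    safe_std = np.where(std == 0, 1.0, std)    # avoid division by zero
    return np.where(std == 0, 0.0, (r - mean) / safe_std)

user_z = zscore(ratings, axis=1)
item_z = zscore(ratings, axis=0)
print(np.round(user_z, 2))
print(np.round(item_z, 2))
```

Values may differ from the hand-computed tables in the second decimal place, because the tables round the mean and standard deviation before dividing.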
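where $I_{uv}$ once more denotes the items rated by both u and v. A problem
with this measure is that it does not consider the differences in the mean and
variance of the ratings made by users u and v.
A popular measure that compares ratings where the effects of mean and
variance have been removed is the Pearson Correlation (PC) similarity:
$PC(u, v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2} \sqrt{\sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}}$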
Example: Suppose we have a small dataset representing user ratings for movies:
To calculate the Pearson correlation similarity between User 1 and User 2 based on the provided
ratings for movies, we'll follow the steps outlined earlier:
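Movie | U1 | U2 | Dev1 = U1 - 2.25 | Dev2 = U2 - 1.25 | Product
M1 | 5 | 4 | 2.75 | 2.75 | 7.5625
M2 | 3 | 0 | 0.75 | -1.25 | -0.9375
M3 | 0 | 0 | -2.25 | -1.25 | 2.8125
M4 | 1 | 1 | -1.25 | -0.25 | 0.3125
Sum of products: $7.5625 - 0.9375 + 2.8125 + 0.3125 = 9.75$
$Cov(U1, U2) = \frac{9.75}{3} = 3.25$
$sim(User1, User2) = \frac{9.75}{3.84 \times 3.28} = \frac{9.75}{12.58} \approx 0.775$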
So, the Pearson correlation similarity between User 1 and User 2 is approximately 0.775. This
indicates a moderate positive correlation between their ratings on the shared movies.
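As a cross-check of the hand calculation, the Pearson similarity for User 1 and User 2 can be computed directly. This is a minimal sketch in plain Python; it assumes both users rated all four movies (as in the table) and the function name `pearson` is chosen here for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Numerator: sum of products of deviations from each user's mean.
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the two deviation norms.
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)) * \
          math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return num / den if den else 0.0

u1 = [5, 3, 0, 1]
u2 = [4, 0, 0, 1]
print(round(pearson(u1, u2), 3))
```

Returning 0 when a user's ratings have no variance is a convention assumed here; the correlation is undefined in that case.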
Calculate deviations from the mean:
$sim(User1, User3) = \frac{-2.75}{3.84 \times 3.84} = \frac{-2.75}{14.75} \approx -0.186$
So, the Pearson correlation similarity between User 1 and User 3 is approximately -0.186. This
indicates a weak negative correlation between their ratings on the shared movies.
Sum of products (numerator):
$-6.875 - 1.125 - 5.625 - 1.875 = -15.5$
$sim(User1, User4) = \frac{-15.5}{3.84 \times 4.12} = \frac{-15.5}{15.8208} \approx -0.98$
So, the Pearson correlation similarity between User 1 and User 4 is approximately -0.98. This is a
very strong negative correlation, meaning their tastes are almost opposite.
IV. Mean Squared Difference (MSD)
The Mean Squared Difference (MSD) is a statistical measure used to quantify the average
squared difference between two sets of values. It is commonly employed in various fields,
including statistics, machine learning, and signal processing, to assess the similarity or
dissimilarity between datasets.
Mean Squared Difference (MSD): Definition and Calculation
Given two sets of values X = {x1, x2, …, xn} and Y = {y1, y2, …, yn}, where n is the number of
elements in each set, the Mean Squared Difference (MSD) is calculated as follows.
1. Compute Differences
Calculate the difference between corresponding elements of X and Y:
Difference = (xi − yi) for i = 1, 2, …, n
2. Square Differences
Square each difference obtained in step 1:
Squared Difference = (xi − yi)² for i = 1, 2, …, n
3. Calculate Mean Squared Difference (MSD)
Compute the average (mean) of the squared differences:
$MSD = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$
Interpretation
The Mean Squared Difference (MSD) provides a measure of the average discrepancy or error between
corresponding values of X and Y.
It quantifies how much X and Y deviate from each other on average, with larger differences resulting in
higher squared values and thus contributing more to the overall MSD.
MSD is commonly used as a loss function in regression problems to assess the goodness of fit of a
model’s predictions compared to the actual values.
Example:
Suppose we have three users (User X, User Y, and User Z) and their ratings for four movies
(Movie 1, Movie 2, Movie 3, and Movie 4). Here are the ratings:
User X: [4, 3, 5, 2]
User Y: [3, 2, 4, 3]
User Z: [5, 4, 3, 2]
To calculate the Mean Squared Difference (MSD) between User X and User Y for these
movies, we follow these steps:
• Compute the squared difference between corresponding ratings of User X and User Y for each
movie.
• Calculate the mean of these squared differences.
Step 1: Ratings Table
Movie User X User Y
Movie 1 4 3
Movie 2 3 2
Movie 3 5 4
Movie 4 2 3
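Step 2: Calculate the Difference for Each Movie
$Difference = (X_i - Y_i)$: (4 − 3), (3 − 2), (5 − 4), (2 − 3) = (1, 1, 1, −1)
Step 3: Square Each Difference
(1, 1, 1, 1)
Step 4: Compute the Mean
$MSD(X, Y) = \frac{1 + 1 + 1 + 1}{4} = 1$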
Interpretation
• MSD = 0 → perfectly similar ratings
• Higher MSD → more difference between users
Here MSD = 1, meaning User X and User Y have relatively similar rating patterns.
Similarly, you can calculate the MSD between other pairs of users or for different sets of movies.
MSD is a simple metric that gives you an idea of how similar or dissimilar the ratings of two users
are. A lower MSD indicates greater similarity in ratings.
To calculate the Mean Squared Difference (MSD) between User X and User Z, we'll follow
the same steps:
Step 1: Ratings Given
Item User X User Z
1 4 5
2 3 4
3 5 3
4 2 2
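Step 2: Compute the Difference for Each Item
$Difference = (X_i - Z_i)$: (4 − 5), (3 − 4), (5 − 3), (2 − 2) = (−1, −1, 2, 0)
Step 3: Square Each Difference
(1, 1, 4, 0)
Step 4: Compute the Mean
$MSD(X, Z) = \frac{1 + 1 + 4 + 0}{4} = 1.5$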
A lower MSD indicates greater similarity in ratings. In this case, the MSD between User X
and User Z is higher than the MSD between User X and User Y (which was 1), suggesting that
User X's ratings are more similar to User Y's ratings than to User Z's ratings.
To calculate the Mean Squared Difference (MSD) between User Y and User Z, follow the same procedure.
Step 1: Ratings Table
Movie User Y User Z
Movie 1 3 5
Movie 2 2 4
Movie 3 4 3
Movie 4 3 2
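The same procedure applies to every pair, so it is convenient to wrap it in a function. The sketch below is a plain-Python illustration using the three rating lists from this example; the function name `msd` is a helper introduced here.

```python
def msd(x, y):
    """Mean Squared Difference between two equally long rating lists."""
    if len(x) != len(y) or not x:
        raise ValueError("both lists must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

user_x = [4, 3, 5, 2]
user_y = [3, 2, 4, 3]
user_z = [5, 4, 3, 2]

print("X-Y:", msd(user_x, user_y))
print("X-Z:", msd(user_x, user_z))
print("Y-Z:", msd(user_y, user_z))
```

For User Y and User Z the differences are (−2, −2, 1, 1), the squares are (4, 4, 1, 1), and the mean is 10 / 4 = 2.5, so among the three pairs X and Y are the most similar and Y and Z the least.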
NEIGHBORHOOD SELECTION
The selection of the neighbors used in the recommendation of items is normally done in
two steps:
1) a global filtering step where only the most likely candidates are kept
2) a per prediction step which chooses the best candidates for this prediction.
PRE – FILTERING OF NEIGHBORS
The pre-filtering of neighbors is an essential step that makes neighborhood-based
approaches practicable by reducing the number of similarity weights to store, and limiting
the number of candidate neighbors to consider in the predictions. There are several ways
in which this can be accomplished:
• Top-N filtering: For each user or item, only a list of the N nearest-neighbors and their respective similarity
weight is kept. To avoid problems with efficiency or accuracy, N should be chosen carefully. Thus, if N is too
large, an excessive amount of memory will be required to store the neighborhood lists and predicting ratings will
be slow. On the other hand, selecting a too small value for N may reduce the coverage of the recommendation
method, which causes some items to be never recommended.
• Threshold filtering: Instead of keeping a fixed number of nearest-neighbors, this approach keeps all the
neighbors whose similarity weight has a magnitude greater than a given threshold 𝑤𝑚𝑖𝑛. While
this is more flexible than the previous filtering technique, as only the most significant neighbors
are kept, the right value of wmin may be difficult to determine.
• Negative filtering: In general, negative rating correlations are less reliable than positive ones.
Intuitively, this is because strong positive correlation between two users is a good indicator of
their belonging to a common group (e.g., teenagers, science-fiction fans, etc.). However,
although negative correlation may indicate membership to different groups, it does not tell how
different these groups are, or whether these groups are compatible for other categories of items.
While experimental investigations have found negative correlations to provide no significant
improvement in the prediction accuracy, whether such correlations can be discarded depends
on the data.
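The three pre-filtering strategies can be combined in one helper. The sketch below is illustrative only: the function name `prefilter_neighbors`, its parameters, and the sample similarity weights are assumptions made here, and the order in which the filters are applied is one reasonable choice rather than a prescribed one.

```python
def prefilter_neighbors(similarities, top_n=None, w_min=None, drop_negative=False):
    """Return (neighbor, weight) pairs kept after pre-filtering.

    similarities: dict mapping neighbor id -> similarity weight.
    top_n: keep only the N neighbors with the largest |weight| (Top-N filtering).
    w_min: keep only neighbors with |weight| > w_min (threshold filtering).
    drop_negative: discard neighbors with non-positive weight (negative filtering).
    """
    items = list(similarities.items())
    if drop_negative:
        items = [(k, w) for k, w in items if w > 0]
    if w_min is not None:
        items = [(k, w) for k, w in items if abs(w) > w_min]
    # Strongest correlations first, regardless of sign.
    items.sort(key=lambda kw: abs(kw[1]), reverse=True)
    if top_n is not None:
        items = items[:top_n]
    return items

sims = {"B": 0.9, "C": -0.7, "D": 0.2, "E": 0.05}   # hypothetical weights
print(prefilter_neighbors(sims, top_n=2))
print(prefilter_neighbors(sims, w_min=0.1))
print(prefilter_neighbors(sims, top_n=2, drop_negative=True))
```

With negative filtering switched on, neighbor C is dropped even though its correlation is strong in magnitude, which is the trade-off described in the last bullet.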
NEIGHBORS IN THE PREDICTIONS
Calculate Similarity: Use a similarity metric (such as Pearson correlation, cosine similarity, or
Jaccard similarity) to measure the similarity between users or items based on their ratings or
features.
Identify Neighbors: Select the top-k most similar users or items as neighbors. The value of k can
be predefined or determined dynamically.
Make Predictions: Use the ratings of the neighbors to predict ratings for the target user or item. This
can be done by taking a weighted average of the ratings given by neighbors, where the weights are
the similarities between the neighbors and the target user (or item).
Recommendation: Once predictions are made, recommend items with the highest predicted ratings
to the target user.
Suppose we have three users (User A, User B, and User C) and their ratings for movies (Movie 1,
Movie 2, and Movie 3). We want to predict the rating of Movie 3 for User A.
• Identify Neighbors: Let's say we choose User B and User C as neighbors based on
their high similarity scores.
• Make Predictions: We can predict the rating of Movie 3 for User A by taking a weighted
average of the ratings given by User B and User C for Movie 3, where the weights are their
similarities with User A.
• In practice, recommendation systems use more sophisticated algorithms and techniques, but
the basic idea remains the same: identify similar users or items as neighbors and use their
preferences to make predictions or recommendations.
User B
𝐵ˉ = (4 + 5 + 3)/3 = 4
User C
𝐶ˉ = (3 + 2 + 4)/3 = 3
Denominator:
$\sqrt{0.5^2 + (-0.5)^2} \times \sqrt{0^2 + 1^2} = \sqrt{0.5} \times 1 = 0.707$
Similarity:
$Sim(A, B) = -0.5 / 0.707 = -0.707$
Denominator:
√0.5 × √1
= 0.707
Similarity:
𝑆𝑖𝑚(𝐴, 𝐶) = 0.5/0.707 = 0.707
Where
𝑢= neighbors (B, C)
Numerator
0.707 + 0.707 = 1.414
Denominator
∣ −0.707 ∣ +∣ 0.707 ∣= 1.414
Final Result
User Movie 3 Predicted Rating
User A ≈ 5
Movie 3 should be recommended to User A because the predicted rating is very high.
SECURITY ASPECTS OF RECOMMENDER SYSTEMS
Two Marks
4. Why do we need recommender systems?
7. What is user based Collaborative Filtering?
To suggest new recommendations to a particular user, a group of similar
users (nearest neighbors) is created based on the interactions of the reference
user. The items that are most popular in this group, but new to the target user,
are used for the suggestions.
User-based CF algorithms recommend items to a user based on the preferences of
similar users. The algorithm first identifies a set of similar users, also known as
nearest neighbors, based on their past interactions with items. The similarity
between users is typically measured using similarity measures such as cosine
similarity or Pearson correlation. Once the nearest neighbors are identified, the
algorithm predicts the rating of an item for the active user by aggregating the
ratings of that item from the nearest neighbors.
recommends to the active user items that are similar to items that the user
has liked in the past
12. What is Matrix factorization?
The score for a pair (user, movie) indicates how likely it is for a user to watch a
movie and is calculated as follows: Where:
the score is the sum of the similarity scores of the target movie’s nearest neighbors
that have been watched by the target user
18. Write the formula to find the similarity between a user and an item
calculated based on their rating patterns, and recommendations are made by identifying
users similar to the target user and recommending items they have liked.
• Rating Normalization: To improve the accuracy of predictions, rating normalization
techniques may be applied. These techniques adjust the ratings to account for user or
item biases, such as users who tend to rate items more positively or items that are
consistently rated higher or lower than others.
• Sparse Data Handling: Neighborhood methods often face the challenge of dealing
with sparse data, where many user-item pairs have no ratings. Various strategies such
as neighborhood expansion, imputation, or incorporating auxiliary information may be
employed to handle sparse data and improve recommendation quality.
$z = \frac{x - \mu}{\sigma}$
Where:
z is the Z-score.
x is the original value.
μ is the mean of the distribution.
σ is the standard deviation of the distribution
21. Write the formula to find out the similarity between two users
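$CV(u, v) = \cos(x_u, x_v) = \frac{\sum_{i \in I_{uv}} r_{ui} r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2} \sqrt{\sum_{i \in I_v} r_{vi}^2}}$
Meaning of Symbols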
• 𝐶𝑉(𝑢, 𝑣)– Cosine similarity between user u and user v
• 𝑥𝑢 , 𝑥𝑣 – Rating vectors of users u and v
• 𝑟𝑢𝑖 – Rating given by user u to item i
• 𝑟𝑣𝑖 – Rating given by user v to item i
• 𝐼𝑢𝑣 – Set of items rated by both users u and v
• 𝐼𝑢 – Set of items rated by user u
• 𝐼𝑣 – Set of items rated by user v
In simple terms
• The numerator computes the dot product of common ratings.
• The denominator normalizes using the magnitude of each user's rating vector.
• The result ranges from 0 to 1 for non-negative ratings (or from −1 to 1 when ratings can be
negative, for example after mean-centering), indicating how similar the users are.
22. How similarity weight computation is carried out in item based collaborative
filtering
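Cosine Similarity
$sim(i, j) = \frac{\sum_{u \in U} R_{ui} R_{uj}}{\sqrt{\sum_{u \in U} R_{ui}^2} \sqrt{\sum_{u \in U} R_{uj}^2}}$
Meaning of Symbols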
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈– Set of all users who rated the items
• 𝑅𝑢𝑖 – Rating given by user u to item i
• 𝑅𝑢𝑗 – Rating given by user u to item j
Explanation
• The numerator calculates the dot product of rating vectors of items 𝑖and 𝑗.
• The denominator normalizes the values using the magnitude of each item vector.
• The result gives the cosine of the angle between the two item vectors.
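Pearson Correlation
$sim(i, j) = \frac{\sum_{u \in U} (R_{ui} - \bar{R}_i)(R_{uj} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{ui} - \bar{R}_i)^2} \sqrt{\sum_{u \in U} (R_{uj} - \bar{R}_j)^2}}$
Meaning of Symbols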
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈– Set of users who rated both items
• 𝑅𝑢𝑖 – Rating given by user u for item i
• 𝑅𝑢𝑗 – Rating given by user u for item j
• 𝑅ˉ 𝑖 – Mean rating of item i
• 𝑅ˉ𝑗 – Mean rating of item j
Explanation
• Measures the linear correlation between two items.
• Ratings are mean-centered by subtracting the average rating.
• This reduces rating bias when users rate items on different scales.
Key Property
• Value range: −1 to +1
o +1 → Perfect positive similarity
o 0 → No correlation
o −1 → Opposite preference
Jaccard Similarity
$sim(i, j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}$
Meaning of Symbols
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈𝑖 – Set of users who interacted with item i
• 𝑈𝑗 – Set of users who interacted with item j
• ∣ 𝑈𝑖 ∩ 𝑈𝑗 ∣– Number of users who interacted with both items
• ∣ 𝑈𝑖 ∪ 𝑈𝑗 ∣– Total number of users who interacted with either item
Explanation
• Measures the overlap between two user sets.
• Commonly used for implicit feedback data such as:
o clicks
o purchases
o views
o likes
Key Properties
• Value range: 0 to 1
o 0 → No common users
o 1 → Both items interacted by exactly the same users
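The overlap measure described above maps directly onto Python sets. The sketch below is a minimal illustration for implicit feedback; the function name `jaccard` and the sample user ids are hypothetical.

```python
def jaccard(users_i, users_j):
    """Jaccard similarity between the sets of users who interacted with two items."""
    union = users_i | users_j
    if not union:
        return 0.0   # convention assumed here when neither item has any interaction
    return len(users_i & users_j) / len(union)

# Hypothetical click data: which users interacted with each item.
item_a_users = {"u1", "u2", "u3"}
item_b_users = {"u2", "u3", "u4"}

print(jaccard(item_a_users, item_b_users))   # 2 common users out of 4 in total
```

Because only set membership matters, the measure ignores how strongly a user liked an item, which is why it suits clicks, purchases, views, and likes rather than explicit ratings.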
• rxy=1 indicates a perfect positive correlation, meaning that the ratings of users
x and y are perfectly linearly related (i.e., when one user rates an item highly,
the other user also tends to rate it highly).
• rxy=−1 indicates a perfect negative correlation, meaning that the ratings of
users x and y are perfectly inversely related (i.e., when one user rates an
item highly, the other user tends to rate it poorly).
• rxy =0 indicates no linear correlation between the ratings of users x and y.
23. What is the MSD of two sets of values X={3,5,7,9} and Y={4,6,8,10}.
The Mean Squared Difference (MSD) equation written clearly is:
$MSD = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$
Meaning of Symbols
• MSD – Mean Squared Difference
• 𝑛– Number of paired observations
• 𝑥𝑖 – Value of the first vector (or item/user) at position i
• 𝑦𝑖 – Value of the second vector at position i
• (𝑥𝑖 −𝑦𝑖 )2 – Squared difference between corresponding values
Example
Given vectors:
• X = (3, 5, 7, 9)
• Y = (4, 6, 8, 10)
Step 1: Compute differences
(3 − 4), (5 − 6), (7 − 8), (9 − 10) = (−1, −1, −1, −1)
Step 2: Square the differences
(1, 1, 1, 1)
Step 3: Compute the mean
$MSD = \frac{1 + 1 + 1 + 1}{4} = 1$
Result:
The Mean Squared Difference between X and Y is 1.
On average, the squared difference between corresponding values of X and Y is 1,
indicating their level of dissimilarity.
Part -B
1. List the difference between collaborative recommendation engine
and content- based recommendation engine.
Aspect | Collaborative Recommendation Engine | Content-Based Recommendation Engine
Basic Idea | Recommends items based on similar users’ preferences | Recommends items based on similar item features that a user liked before
Data Used | Uses user–item interaction data (ratings, clicks, purchases) | Uses item attributes or content features (keywords, genre, category)
Working Principle | “Users who liked this item also liked these items” | “Items similar to what you liked earlier are recommended”
Dependency | Depends heavily on other users’ behavior | Depends mainly on user’s own past preferences
Similarity Calculation | Computes similarity between users or items | Computes similarity between item features
Example Algorithm | User-based CF, Item-based CF, Matrix Factorization | TF-IDF, Cosine similarity, Feature matching
Cold Start Problem | Suffers from new user and new item problem | Works better for new items if item features are known
Diversity of Recommendations | Can recommend different types of items liked by similar users | Recommendations are usually similar to previously liked items
Need for Item Information | Does not require detailed item descriptions | Requires detailed item attributes or metadata
Scalability | Can be difficult with very large user-item datasets | Generally easier if item features are available
Explainability | Harder to explain why an item is recommended | Easier to explain (based on item features)
Example Applications | Amazon “Customers who bought this also bought”, Netflix recommendations | News recommendation, article suggestion, music genre recommendation
3. Explain Memory based Collaborative filtering in detail (8 Marks)
4. Explain Model based collaborative filtering in detail (8 Marks)
5. Explain the steps to be followed in Nearest Neighbor Collaborative filtering
6. Explain User based Collaborative Filtering with an example.
7. Explain Item-to-Item Based Collaborative Filtering with an example
8. How is rating normalization done using mean-centering in neighborhood methods?
9. How is rating normalization done using Z-score normalization in neighborhood methods?
10. If a user gives same ratings to similar types of movies and if the user
missed to give rating for the new movie which comes under the prior
rated category. Is it possible to predict the rating of a new movie for
that specific user item based collaborative filtering? Describe the
procedure step by step in detail. (April-May-2024)
Predicting a User’s Rating for a New Movie Using Item-Based
Collaborative Filtering (IBCF)
Yes, Item-Based Collaborative Filtering (IBCF) can predict a rating for a
new movie that a user hasn't rated, based on the ratings they have given
to similar movies. The procedure involves computing item similarities
and using them to infer the missing rating.
If we calculate similarity between Movie 1 and Movie 2 (both Action
movies), and Movie 1 and Movie 4, we get:
1. Movie 1 & Movie 2
Result:
The cosine similarity between Movie 1 and Movie 2 is approximately 0.976, which
indicates very high similarity between the two movies.
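Step 1: Numerator
$(2 \times 4) + (3 \times 5) = 8 + 15 = 23$
Step 2: Denominator
$\sqrt{2^2 + 3^2} = \sqrt{4 + 9} = \sqrt{13} \approx 3.61$
$\sqrt{4^2 + 5^2} = \sqrt{16 + 25} = \sqrt{41} \approx 6.40$
$sim(3, 2) = \frac{23}{\sqrt{13} \times \sqrt{41}} \approx \frac{23}{23.1} \approx 0.99$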
Result:
Similarity between Movie 3 and Movie 2 ≈ 0.99 (very high similarity).
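Step 1: Numerator
$4 \times 4 = 16$
Step 2: Denominator
$\sqrt{4^2} = \sqrt{16} = 4$
$\sqrt{4^2 + 5^2} = \sqrt{16 + 25} = \sqrt{41}$
$sim(4, 2) = \frac{16}{\sqrt{16} \times \sqrt{41}} = \frac{16}{\sqrt{656}} = \frac{16}{25.61} \approx 0.62$
• Result:
Similarity between Movie 4 and Movie 2 ≈ 0.62.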
Example Calculation:
Assume User C rated:
• Movie 1 = 3
• Movie 4 = 3
So, the predicted rating for User C on Movie 2 is 3.0.
Step 4: Use the Prediction for Recommendation
• If predicted rating ≥ threshold (e.g., 3.5), recommend the movie.
• If the rating is low, the system won’t recommend it.
Nearest neighbor methods are applied in two main ways:
A. User-Based Collaborative Filtering (UBCF)
• Finds users similar to a target user.
• Predicts ratings by averaging ratings of similar users.
Steps:
1. Compute similarity between users (e.g., Cosine Similarity or Pearson
Correlation).
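1. Cosine Similarity: User1 & User2
Common movies: A, D
Step 1: Numerator
$(5 \times 4) + (4 \times 5) = 20 + 20 = 40$
Step 2: Denominator
$\sqrt{5^2 + 4^2} = \sqrt{25 + 16} = \sqrt{41}$
$\sqrt{4^2 + 5^2} = \sqrt{16 + 25} = \sqrt{41}$
$sim(U1, U2) = \frac{40}{\sqrt{41} \sqrt{41}} = \frac{40}{41} \approx 0.976$
2. Cosine Similarity: User1 & User3
Common movies: A, C
$sim(U1, U3) = \frac{(5 \times 3) + (3 \times 5)}{\sqrt{5^2 + 3^2} \sqrt{3^2 + 5^2}}$
Step 1: Numerator
$(5 \times 3) + (3 \times 5) = 15 + 15 = 30$
Step 2: Denominator
$\sqrt{5^2 + 3^2} = \sqrt{25 + 9} = \sqrt{34}$
$\sqrt{3^2 + 5^2} = \sqrt{9 + 25} = \sqrt{34}$
$sim(U1, U3) = \frac{30}{34} \approx 0.882$
3. Cosine Similarity: User2 & User3
Common movies: A, B
Step 1: Numerator
$(4 \times 3) + (2 \times 4) = 12 + 8 = 20$
Step 2: Denominator
$\sqrt{4^2 + 2^2} = \sqrt{16 + 4} = \sqrt{20}$
$\sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25}$
$sim(U2, U3) = \frac{20}{\sqrt{20} \times \sqrt{25}} = \frac{20}{22.36} \approx 0.894$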
4. Similarity Table
User Pair Cosine Similarity
U1 – U2 0.976
U1 – U3 0.882
U2 – U3 0.894
Step 3: Predict Missing Ratings
Prediction 1: User 1's Rating for Movie B
Numerator
$0.976(-1.67) + 0 = -1.63$
Denominator
$0.976 + 0.882 = 1.858$
Prediction
$R_{1B} = 4 + \frac{-1.63}{1.858} = 4 - 0.88 \approx 3.12$
Predicted Rating
User1 (Movie B) ≈ 3
Prediction 2: User 2's Rating for Movie C
Numerator
$0.976(-1) + 0.894(1) = -0.976 + 0.894 = -0.082$
Denominator
$0.976 + 0.894 = 1.87$
Prediction
$R_{2C} = 3.67 + \frac{-0.082}{1.87} \approx 3.63$
Predicted Rating
User2 (Movie C) ≈ 4
Prediction 3: User 3's Rating for Movie D
Numerator
$0 + 0.894(1.33) = 1.19$
Denominator
$0.882 + 0.894 = 1.776$
Prediction
$R_{3D} = 4 + \frac{1.19}{1.776} \approx 4.67$
Predicted Rating
User3 (Movie D) ≈ 4.7
Final Predicted Matrix
User A B C D
U1 5 3.1 3 4
U2 4 2 3.6 5
U3 3 4 5 4.7
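The whole user-based pipeline for this 3 × 4 example (cosine similarity on co-rated movies, then a mean-centered weighted average) can be sketched as follows. NumPy is assumed, missing ratings are encoded as `np.nan`, and the names `cosine_co_rated` and `predict` are illustrative helpers rather than functions from any library.

```python
import numpy as np

# Rows: U1, U2, U3; columns: movies A, B, C, D; np.nan marks a missing rating.
R = np.array([
    [5, np.nan, 3, 4],
    [4, 2, np.nan, 5],
    [3, 4, 5, np.nan],
], dtype=float)

def cosine_co_rated(a, b):
    """Cosine similarity using only the items both users rated."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if not both.any():
        return 0.0
    x, y = a[both], b[both]
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def predict(R, u, i):
    """Mean-centered weighted average over the other users who rated item i."""
    means = np.nanmean(R, axis=1)
    num = den = 0.0
    for v in range(R.shape[0]):
        if v == u or np.isnan(R[v, i]):
            continue
        s = cosine_co_rated(R[u], R[v])
        num += s * (R[v, i] - means[v])
        den += abs(s)
    return float(means[u]) if den == 0 else float(means[u] + num / den)

# Fill in every missing cell of the matrix.
for u, i in zip(*np.where(np.isnan(R))):
    print(f"U{u + 1}, movie {'ABCD'[i]}: {predict(R, u, i):.2f}")
```

The printed values agree with the final predicted matrix up to rounding of the intermediate similarities.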
Computing Cosine Similarity Between Movies
1. Compute Similarity Between Movie A and Other Movies
We calculate similarity for Movie A with B, C, and D using their common
user ratings.
(b) Movie A & Movie C Similarity
Only User 1 and User 3 have rated both movies:
Movie A vector: [5,3]
Movie C vector: [3,5]
$Sim(A, C) = \frac{30}{5.83 \times 5.83} = \frac{30}{34} \approx 0.882$
(c) Movie A & Movie D Similarity
• Movie A vector: [5,4]
• Movie D vector: [4,5]
Let's compute the cosine similarity between Movie B and Movie C step
by step.
Final Answer
Cosine Similarity (B,C)=1.0
Final Answer
Cosine Similarity (B,D)=1.0
Likewise, cosine similarity of Movie C and D
Prediction 2: User 2's Rating for Movie C
Final Predictions
User/Movie Movie Movie Movie Movie
A B C D
User 1 5 3.9 3 4
User 2 4 2 4.1 5
User 3 3 4 5 4.5
Conclusion
1. Movie B is predicted to be rated 3.9 by User 1.
2. Movie C is predicted to be rated 4.1 by User 2.
3. Movie D is predicted to be rated 4.5 by User 3
Comparison with Content-Based Filtering (CBF)
Content-Based Filtering (CBF) Overview
• Uses movie attributes (genre, actors, director, etc.) to make
recommendations.
• Does not depend on user interaction history with other users.
Formula for Content-Based Score:
12. Illustrate the working principle of neighborhood methods and discuss the
components used (April-May-2025)
1. Construct the User–Item Rating Matrix
The first step is to collect ratings from users and form a user–item matrix.
User / Item Item1 Item2 Item3 Item4
User1 5 ? 3 4
User2 4 2 ? 5
User3 3 4 5 ?
• Rows represent users
• Columns represent items
• Missing values represent ratings to be predicted
2. Compute Similarity Between Users or Items
The system measures similarity between users or items.
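Cosine Similarity
$sim(u, v) = \frac{\sum_i r_{ui} r_{vi}}{\sqrt{\sum_i r_{ui}^2} \sqrt{\sum_i r_{vi}^2}}$
Where
• $r_{ui}$ = rating of user $u$ on item $i$
• $r_{vi}$ = rating of user $v$ on item $i$
This computes the cosine of the angle between two rating vectors.
Pearson Correlation Similarity
$sim(u, v) = \frac{\sum_i (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_i (r_{ui} - \bar{r}_u)^2} \sqrt{\sum_i (r_{vi} - \bar{r}_v)^2}}$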
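Rating Prediction Formula
$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$
Where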
• 𝑅𝑢𝑖 = predicted rating
• 𝑅ˉ𝑢 = average rating of user 𝑢
• 𝑠𝑖𝑚(𝑢, 𝑣)= similarity between users
• 𝑁(𝑢)= set of nearest neighbors
5. Generate Recommendations
After predicting ratings, the system recommends items with the highest
predicted scores.
Example:
Item Predicted Rating
Movie B 3.2
Movie C 4.6
Thus Movie C will be recommended.
2. Similarity Computation
Similarity measures determine how closely two users or two items are
related. It is a crucial step in identifying neighbors.
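Cosine Similarity
$sim(u, v) = \frac{\sum_i r_{ui} r_{vi}}{\sqrt{\sum_i r_{ui}^2} \sqrt{\sum_i r_{vi}^2}}$
Where
$sim(u, v)$ – Similarity between user $u$ and user $v$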
𝑟𝑢𝑖 – Rating given by user 𝑢to item 𝑖
𝑟𝑣𝑖 – Rating given by user 𝑣to item 𝑖
∑𝑖 – Summation over all items rated by both users
This computes the cosine of the angle between two rating vectors.
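Pearson Correlation
$sim(u, v) = \frac{\sum_i (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_i (r_{ui} - \bar{r}_u)^2} \sqrt{\sum_i (r_{vi} - \bar{r}_v)^2}}$
Where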
𝑠𝑖𝑚(𝑢, 𝑣)– Similarity between user 𝑢and user 𝑣
𝑟𝑢𝑖 – Rating given by user 𝑢for item 𝑖
𝑟𝑣𝑖 – Rating given by user 𝑣for item 𝑖
𝑟ˉ𝑢 – Average rating of user 𝑢
𝑟ˉ𝑣 – Average rating of user 𝑣
∑𝑖 – Summation over all commonly rated items
3. Neighborhood Formation
After computing similarity values, the system selects the nearest
neighbors.
Types
• Top-K Neighbors – choose the K most similar users/items.
• Threshold-Based Neighbors – choose neighbors whose similarity
exceeds a threshold.
Example:
User Pair Similarity
U1 – U2 0.97
U1 – U3 0.88
Thus User2 becomes the nearest neighbor of User1.
Importance
• Reduces computation.
• Improves recommendation accuracy.
4. Prediction Function
The prediction function estimates unknown ratings using the ratings of
neighbors.
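Rating Prediction Formula
$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$
Where: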
• 𝑅𝑢𝑖 = predicted rating of user 𝑢for item 𝑖
• 𝑅ˉ𝑢 = average rating of user 𝑢
• 𝑠𝑖𝑚(𝑢, 𝑣)= similarity between users
• 𝑁(𝑢)= neighborhood of user 𝑢
Purpose
• Calculates the expected rating for an unrated item.
5. Recommendation Generation
Once ratings are predicted, the system recommends items with the highest
predicted ratings.
Example:
Item Predicted Rating
Movie B 3.2
Movie C 4.5
Thus the system recommends Movie C.
Item-Based Neighborhood
• Finds similar items.
• Recommends items similar to those already liked by the user.
Example:
Item similarity → recommend related items.
8. Model Evaluation
Evaluation measures the quality of recommendations.
Common metrics include:
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
• Precision and Recall
These metrics help determine prediction accuracy.
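The two error metrics can be computed in a few lines. This is a small sketch in plain Python; the function names `mae` and `rmse` and the sample values are illustrative assumptions, with "actual" standing for held-out true ratings and "predicted" for the system's estimates.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: like MAE, but penalizes large errors more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical held-out ratings and the corresponding predictions.
actual = [4, 3, 5]
predicted = [3.5, 3, 4]

print("MAE :", mae(actual, predicted))
print("RMSE:", round(rmse(actual, predicted), 3))
```

RMSE is never smaller than MAE on the same data, and the gap between them grows when a few predictions are badly wrong.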
Neighborhood methods rely on several key components including rating
matrices, similarity measures, neighbor selection, prediction
functions, and recommendation generation. By leveraging similarities
between users or items, these methods effectively predict missing ratings
and provide personalized recommendations.
13. Consider that you are a data scientist in Amazon. Your team is tasked with
designing a hybrid recommendation engine that personalizes product
suggestions based on browsing history, purchase history, and textual
reviews. Develop the system architecture and explain the mathematical
models used such as collaborative filtering, content-based filtering.
Discuss challenges in scalability and cold start and propose solutions.
(April-May-2025)
Hybrid Recommendation Engine for Amazon (16 Marks)
The goal is to design a hybrid recommendation system for Amazon that
suggests products by combining browsing history, purchase history, and
textual reviews. Hybrid systems combine collaborative filtering and
content-based filtering to improve recommendation accuracy.
Components of Architecture
1. Data Sources
• Browsing history (clicked products)
• Purchase history
• Product ratings
• Product metadata (category, price, brand)
• Textual reviews
2. Data Storage
• User database
• Product catalog database
• Interaction logs
3. Feature Engineering
Extract useful features such as:
• User preferences
• Product attributes
• Review sentiment scores
4. Recommendation Models
Two main models are used:
• Collaborative Filtering
• Content-Based Filtering
5. Hybrid Recommendation Layer
Combines predictions from both models.
6. Ranking and Recommendation
Products are ranked based on predicted scores and recommended to
users.
Cosine Similarity
$$\mathrm{sim}(u, v) = \frac{\sum_i r_{ui}\, r_{vi}}{\sqrt{\sum_i r_{ui}^2}\;\sqrt{\sum_i r_{vi}^2}}$$
Where
• $r_{ui}$ = rating of user $u$ on item $i$
This finds similar users or items.
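A minimal sketch of cosine similarity between two users, summing over the items both have rated. The rating dictionaries and the function name are hypothetical.

```python
import math


def cosine_similarity(ratings_u, ratings_v):
    """Cosine similarity of two users over their commonly rated items."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0  # no overlap: treat the users as unrelated
    dot = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


user_1 = {"A": 5, "B": 3, "C": 4}
user_2 = {"A": 4, "B": 2, "D": 5}

# Only items A and B are shared, so only they enter the sums.
similarity = cosine_similarity(user_1, user_2)
print(round(similarity, 3))
```

The same function works for item-item similarity if each dictionary maps users to the ratings one item received.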
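Rating Prediction
$$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} \mathrm{sim}(u, v)\,(r_{vi} - \bar{r}_v)}{\sum_{v \in N(u)} |\mathrm{sim}(u, v)|}$$
Where
• $R_{ui}$ = predicted rating
• $N(u)$ = nearest neighbors of user $u$
• $\bar{R}_u$, $\bar{r}_v$ = average ratings of user $u$ and neighbor $v$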
This predicts the rating of a user for a product.
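TF-IDF
$$\mathrm{TFIDF}(t, d) = TF(t, d) \times \log\frac{N}{df(t)}$$
Where
• $TF(t, d)$ = term frequency of term $t$ in document $d$
• $df(t)$ = document frequency (number of documents containing $t$)
• $N$ = number of documents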
TF-IDF converts textual reviews into numerical vectors.
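The conversion from reviews to vectors can be sketched as below. This assumes raw term counts for TF and the natural logarithm for IDF, which is one common variant. The reviews and the function name are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw term frequency (assumed TF definition)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in counts.items()})
    return vectors


reviews = ["good battery good price", "bad battery"]
vectors = tf_idf(reviews)

# "battery" occurs in every review, so its weight is 0;
# "good" occurs twice in the first review only, so it gets the largest weight.
print(vectors[0])
```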
Cosine Similarity for Product Features
$$\mathrm{sim}(i, j) = \frac{\vec{f}_i \cdot \vec{f}_j}{\|\vec{f}_i\|\,\|\vec{f}_j\|}$$
Where
• $\vec{f}_i$ = feature vector of product $i$
This finds similar products.
4. Hybrid Recommendation Strategy
The hybrid system combines both approaches.
Weighted Hybrid Model
$$\mathrm{Score}(u, i) = \alpha\, CF(u, i) + (1 - \alpha)\, CB(u, i)$$
Where
• $CF(u, i)$ = collaborative filtering score
• $CB(u, i)$ = content-based score
• $\alpha$ = weight parameter
This improves accuracy and robustness.
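A small sketch of the weighted hybrid score. It assumes both component scores are already on the same scale and that α lies between 0 and 1. The product names, scores, and default α are hypothetical.

```python
def hybrid_score(cf_score, cb_score, alpha=0.7):
    """Weighted blend: alpha * CF + (1 - alpha) * CB."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * cf_score + (1 - alpha) * cb_score


# Hypothetical (CF, CB) scores for two candidate products, both on a 1-5 scale.
candidates = {
    "Headphones": (4.0, 3.0),
    "Phone case": (2.0, 5.0),
}

blended = {name: hybrid_score(cf, cb, alpha=0.5) for name, (cf, cb) in candidates.items()}
ranking = sorted(blended, key=blended.get, reverse=True)
print(blended)  # {'Headphones': 3.5, 'Phone case': 3.5}
```

A larger α trusts the collaborative signal more. For users with little interaction history, lowering α shifts weight toward the content-based score.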
5. Handling Textual Reviews
Natural Language Processing techniques are used:
Steps:
1. Text preprocessing
2. Tokenization
3. Stop-word removal
4. TF-IDF feature extraction
5. Sentiment analysis
This helps understand user opinions about products.
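The preprocessing steps can be illustrated with the standard library alone. The stop-word list and the sentiment lexicon below are tiny, made-up examples; a production system would likely rely on an NLP library and a trained sentiment model instead.

```python
import re

# Tiny illustrative word lists (assumptions, not a real lexicon).
STOP_WORDS = {"the", "is", "a", "an", "and", "but", "this", "it", "very"}
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def preprocess(review):
    """Lowercase, strip punctuation and digits, tokenize, and remove stop words."""
    text = re.sub(r"[^a-z\s]", " ", review.lower())
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]


def sentiment_score(tokens):
    """Score in [-1, 1]: (positive - negative) / (positive + negative)."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


tokens = preprocess("The battery is GREAT, but the screen is poor!")
print(tokens)                   # ['battery', 'great', 'screen', 'poor']
print(sentiment_score(tokens))  # 0.0
```

The cleaned tokens feed the TF-IDF step, and the sentiment score can be stored as an extra feature for each review.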
6. Scalability Challenges
Large e-commerce platforms like Amazon handle millions of users and
products.
Issues
• Large user-item matrix
• High computation cost
• Real-time recommendations
Solutions
• Distributed computing (Spark, Hadoop)
• Matrix factorization
• Approximate nearest neighbor search
• Incremental model updates
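To illustrate matrix factorization, the sketch below replaces the large user-item matrix with two small factor tables learned by stochastic gradient descent. It is a toy version under assumed hyperparameters (`n_factors`, `lr`, `reg`, `epochs`) and made-up ratings; at scale this is typically run on a distributed framework rather than in plain Python.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def factorize(ratings, n_factors=2, epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that dot(P[u], Q[i]) ~ rating."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def squared_error(ratings, P, Q):
    return sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)


# Hypothetical (user, item, rating) triples.
observed = [
    ("U1", "A", 5), ("U1", "B", 3),
    ("U2", "A", 4), ("U2", "C", 2),
    ("U3", "B", 4), ("U3", "C", 5),
]

P, Q = factorize(observed)
estimate_u1_c = dot(P["U1"], Q["C"])  # estimate for a pair that was never rated
```

Storage drops from one cell per user-item pair to `n_factors` numbers per user and per item, and each prediction is a single short dot product.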
7. Cold Start Problem
Cold start occurs when the system has too little interaction data about
a user or an item to make reliable recommendations.
Types
1. New User Problem
User has no interaction history.
Solution
• Use browsing behavior
• Ask users to rate products
• Use demographic information
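One possible demographic fallback for a new user is to recommend the items rated highest by existing users in the same group. The group labels, products, ratings, and function name below are all hypothetical.

```python
from collections import defaultdict


def demographic_recommendations(new_user_group, ratings, user_groups, n=2):
    """Rank items by their average rating among users in the given group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, item, rating in ratings:
        if user_groups.get(user) == new_user_group:
            totals[item] += rating
            counts[item] += 1
    averages = {item: totals[item] / counts[item] for item in totals}
    return sorted(averages, key=averages.get, reverse=True)[:n]


# Hypothetical demographic groups and ratings of existing users.
user_groups = {"U1": "student", "U2": "student", "U3": "retiree"}
ratings = [
    ("U1", "Laptop", 5), ("U2", "Laptop", 4),
    ("U1", "Backpack", 3), ("U3", "Radio", 5),
]

print(demographic_recommendations("student", ratings, user_groups))
# ['Laptop', 'Backpack']
```

Once the new user starts browsing or rating, these group-level suggestions can be phased out in favor of personalized predictions.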