
Unit III-collaborative Filtering - Final

The document discusses collaborative filtering (CF), a method used in recommender systems that leverages user interactions to predict preferences for items. It outlines two main types of CF: user-based and item-based, along with their advantages and disadvantages, such as data sparsity and scalability issues. Additionally, it covers systematic approaches to CF, including data collection, preprocessing, similarity calculation, and the use of matrix factorization for generating recommendations.

Uploaded by

hod.ai
Copyright
© All Rights Reserved

UNIT III COLLABORATIVE FILTERING

A systematic approach, nearest-neighbor collaborative filtering (CF), user-based and item-based CF,
components of neighborhood methods (rating normalization, similarity weight computation, and
neighborhood selection).
Suggested Activities:
• Practical learning – Implement collaborative filtering concepts
• Assignment of security aspects of recommender systems
Suggested Evaluation Methods:
• Quiz on collaborative filtering
• Seminar on security measures of recommender systems
What Is Collaborative Filtering?

• Collaborative filtering filters information by using the interactions and data collected by the
system from other users. It’s based on the idea that people who agreed in their evaluation of
certain items are likely to agree again in the future.
• The concept is simple: when we want to find a new movie to watch we’ll often ask our friends
for recommendations. Naturally, we have greater trust in the recommendations from friends who
share tastes similar to our own.
• Most collaborative filtering systems apply the so-called similarity index-based technique.
In the neighborhood-based approach, a number of users are selected based on their
similarity to the active user. Inference for the active user is made by calculating a
weighted average of the ratings of the selected users.
• Collaborative-filtering systems focus on the relationship between users and items. The similarity
of items is determined by the similarity of the ratings of those items by the users who have
rated both items.
• Collaborative filtering recommender systems have played a significant role in the rise of web
services and content platforms like Amazon, Netflix, YouTube, etc. in recent years. In this age
of information, knowing what the customer wants before they even know it themselves is
nothing short of a superpower. As the name suggests, recommender system algorithms are used
to offer relevant content or product to the consumer based on their taste or previous choices

There are two classes of Collaborative Filtering:

• User-based, which measures the similarity between target users and other users.
• Item-based, which measures the similarity between the items that target users rate or interact
with and other items.
Why do we need recommender systems?

• Back in 2006, Netflix offered a prize to solve a simple problem that had been around for
years. It was to find the best collaborative algorithm to predict user ratings for films that they
haven't watched yet, based on previous ratings of other movies.
• Today, e-commerce giants continue to try to solve this problem in a better way by
observing users’ past behavior to predict what other things the same user will like.
• Recommendations also help customers discover new products and offers that they’re not
explicitly looking for, thus speeding up the search process. This allows companies to send out
personalized newsletters via email that offer new TV shows, movies, products, and services
that are better suited for them.
• One of the most significant advantages of modern recommendation algorithms is their
ability to take implicit feedback and suggest new content/products, thus staying up-to-date
with customers’ preferences. This enables businesses to continue catering to customers even
if their tastes change over time.
User-item interaction matrix
• In collaborative filtering, we ignore the features of an individual item. Instead, we focus on
a similar group of people using the item and recommend other items that the group likes.
• Similar users are divided into small clusters and are recommended new items according
to the preferences of that cluster. Let’s understand this with an easy movie recommendation
example:

What we can infer from this user-item matrix is:


• Users 1 and 2 liked Movie 1. Since User 1 liked movies 2 and 4 a lot, there’s a high
chance of User 2 enjoying the same.

• Users 1 and 3 have opposite tastes.
• Users 3 and 4 both disliked Movie 2, so there’s a high chance User 4 will also dislike
Movie 4.
• User 3 might dislike Movie 1.
Collaborative filtering: Advantages and disadvantages
Advantages
• No domain knowledge is required since all the features are learned automatically.
• Can help users discover new interests even if they’re not actively searching for them by
recommending new items similar to what they’re interested in.
• Does not require in-detail features and contextual data of products or items. It only needs the
user-item interaction matrix to train the matrix factorization model.
Disadvantages
• Data sparsity and cold start: since suggestions are based on historical data and interactions, it is
difficult to make recommendations for new products or new users that have few or no interactions.
• As the user base grows, the algorithms suffer due to high data volume and lack of scalability.
• Lack of diversity in the long run. This might seem counterintuitive since the whole point of
collaborative filtering is to recommend new items to the user. However, since the algorithms
function based on historical ratings, it will not recommend items with little or limited data.
Popular products will be more popular in the long run and there will be a lack of new and
diverse options.

Types of collaborative filtering


The two types of collaborative filtering approaches are:

• Memory-based collaborative approach


• Model-based collaborative approach

A systematic approach to collaborative filtering involves the following steps:
1. Data Collection: Gather user-item interaction data, such as ratings, reviews, purchases, or clicks.

2. Data Preprocessing: Clean and prepare the data for analysis, including handling missing values,
outliers, and data normalization.
3. User or Item Representation: Encode user preferences or item features into a suitable
representation, such as user-item matrices or item-attribute vectors.
4. Similarity Calculation: Compute similarity scores between users or items based on their
respective representations.
5. Nearest Neighbor Identification: Identify the nearest neighbors for each user or item based on the
calculated similarity scores.
6. Prediction Generation: Predict the rating or preference of a user for an item based on the ratings or
preferences of their nearest neighbors.
7. Evaluation and Optimization: Evaluate the performance of the CF algorithm using appropriate
metrics and refine the model parameters to improve accuracy.
8. Deployment and Maintenance: Integrate the CF algorithm into the recommender system and
monitor its performance over time, making adjustments as needed.

Effective collaborative filtering relies on the quality and quantity of user-item interaction data.
Additionally, the choice of similarity measures, nearest neighbor identification techniques, and
prediction algorithms can significantly impact the performance of the CF system.
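
Steps 1 to 3 above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the record layout `(user, item, rating)`, the function names `build_matrix` and `mean_center`, and the sample data are all assumptions made for the example. Mean-centering is shown as one common form of rating normalization.

```python
from collections import defaultdict

def build_matrix(records):
    """Turn (user, item, rating) records into a nested dict: user -> {item: rating}."""
    matrix = defaultdict(dict)
    for user, item, rating in records:
        matrix[user][item] = rating  # a later record overwrites an earlier one
    return dict(matrix)

def mean_center(matrix):
    """Subtract each user's mean rating from that user's ratings."""
    centered = {}
    for user, ratings in matrix.items():
        mean = sum(ratings.values()) / len(ratings)
        centered[user] = {item: r - mean for item, r in ratings.items()}
    return centered

# Hypothetical interaction data: missing (user, item) pairs are simply absent.
records = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4), ("u2", "m3", 2)]
matrix = build_matrix(records)
centered = mean_center(matrix)
```

Storing only the observed ratings keeps the representation sparse, which matters because most user-item pairs have no interaction.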

• Unlike content-based approaches, which use the content of items previously rated by a user u,
collaborative (or social) filtering approaches rely on the ratings of u as well as those of other
users in the system.
• The key idea is that the rating of u for a new item i is likely to be similar to that of another user
v if u and v have rated other items in a similar way. Likewise, u is likely to rate two
items i and j in a similar fashion if other users have given similar ratings to these two
items.

Collaborative approaches overcome some of the limitations of content-based ones.


• Items for which the content is not available or difficult to obtain can still be recommended to
users through the feedback of other users.
• Collaborative recommendations are based on the quality of items as evaluated by peers,
instead of relying on content that may be a bad indicator of quality.
• Collaborative filtering approaches can recommend items with very different content, as long as
other users have already shown interest in these different items.

• Collaborative filtering methods can be grouped into two general classes: neighborhood-based
and model-based methods.
• In neighborhood-based (memory-based or heuristic-based) collaborative filtering, the user-item
ratings stored in the system are directly used to predict ratings for new items.
• This can be done in two ways, known as user-based and item-based recommendation.
o User-based systems, such as GroupLens (Social Computing Research at the
University of Minnesota), Bellcore video (Library Toolkit, a set of tools for
constructing and browsing libraries of digital video), and Ringo (Social Information
Filtering for Music Recommendation), evaluate the interest of a user u for an item i using
the ratings for this item by other users, called neighbors, that have similar rating
patterns. The neighbors of user u are typically the users v whose ratings on the items
rated by both u and v, i.e. $I_{uv}$, are most correlated to those of u.
o Item-based approaches, on the other hand, predict the rating of a user u for an item i
based on the ratings of u for items similar to i. In such approaches, two items are similar
if several users of the system have rated these items in a similar fashion.
• Model-based approaches, in contrast, use these ratings to learn a predictive model. The general idea
is to model the user-item interactions with factors representing latent characteristics of the users
and items in the system, like the preference class of users and the category class of items. This
model is then trained using the available data, and later used to predict ratings of users for new
items. Model-based approaches for the task of recommending items are numerous and include
Bayesian Clustering, Latent Semantic Analysis, Latent Dirichlet Allocation, Maximum
Entropy, Boltzmann Machines, Support Vector Machines and Singular Value Decomposition.
Memory-based collaborative approach


• Memory-based collaborative filtering (also called neighborhood-based or user-item filtering)
is based on the assumption that users with similar historical preferences will continue to
display similar preferences in the future. In this method, item ratings are computed in a
straightforward manner by factoring in the ratings of similar users or items.
• In memory-based collaborative filtering, only the user-item interaction matrix is utilized to
make new recommendations to users. The whole process is based on the users’ previous ratings
and interactions.
• Memory-based filtering consists of 2 methods: user-based collaborative filtering and
item-based collaborative filtering.
User-based collaborative filtering
• To suggest new recommendations to a particular user, a group of similar users (nearest
neighbors) is created based on the interactions of the reference user. The items that are
most popular in this group, but new to the target user, are used for the suggestions.
• User-based CF algorithms recommend items to a user based on the preferences of similar
users. The algorithm first identifies a set of similar users, also known as nearest neighbors,
based on their past interactions with items. The similarity between users is typically measured
using similarity measures such as cosine similarity or Pearson correlation. Once the nearest
neighbors are identified, the algorithm predicts the rating of an item for the active user by
aggregating the ratings of that item from the nearest neighbors.
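
The neighbor-finding step of user-based CF can be sketched as follows. This is an illustrative sketch under stated assumptions: ratings are held as `user -> {item: rating}` dictionaries, the cosine similarity is computed over co-rated items only (one of several conventions), and the names `cosine_similarity`, `nearest_neighbors`, and the sample users are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dicts, over co-rated items only."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_neighbors(ratings, target, k):
    """Return the k users most similar to target as (user, similarity) pairs."""
    sims = [(other, cosine_similarity(ratings[target], ratings[other]))
            for other in ratings if other != target]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]

# Hypothetical ratings: ann and bob agree on m1 and m2, cy disagrees.
ratings = {
    "ann": {"m1": 5, "m2": 1},
    "bob": {"m1": 5, "m2": 1, "m3": 4},
    "cy": {"m1": 1, "m2": 5, "m3": 2},
}
neighbors = nearest_neighbors(ratings, "ann", k=1)
```

Here bob is ann's nearest neighbor, so bob's rating of m3 would carry the most weight when predicting ann's rating for it.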

Item-based collaborative filtering


• In item-based filtering, new recommendations are selected based on the old
interactions of the target user. First, all the items that the user has already liked are
considered. Then, similar products are computed and clusters are made (nearest
neighbors). New items from these clusters are suggested to the user.
• Item-based CF algorithms recommend items to a user based on the similarity of items to
items that the user has interacted with in the past. The algorithm first identifies a set of
similar items based on how users have rated or interacted with them, rather than on item
attributes or features. The similarity between items is typically
measured using similarity measures such as Jaccard similarity or cosine
similarity. Once the similar items are identified, the algorithm recommends to the active
user items that are similar to items that the user has liked in the past.
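
A small sketch of item-based prediction is given below. It is an assumption-laden illustration, not a reference implementation: item vectors are built from the ratings themselves, cosine similarity treats a missing rating as 0, and the prediction is a similarity-weighted average of the user's own ratings on the k most similar items. The function names and the sample data are hypothetical.

```python
import math

def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items

def item_cosine(a, b):
    """Cosine similarity of two item vectors; a missing rating counts as 0."""
    dot = sum(a[u] * b[u] for u in set(a) & set(b))
    norm = (math.sqrt(sum(r * r for r in a.values()))
            * math.sqrt(sum(r * r for r in b.values())))
    return dot / norm if norm else 0.0

def predict_item_based(ratings, user, target, k=2):
    """Weighted average of the user's ratings on the k items most similar to target."""
    items = item_vectors(ratings)
    scored = [(item_cosine(items[target], items[j]), r)
              for j, r in ratings[user].items() if j != target]
    top = sorted(scored, reverse=True)[:k]
    denom = sum(abs(s) for s, _ in top)
    return sum(s * r for s, r in top) / denom if denom else None

# Hypothetical data: u1 has rated only item "a"; we predict u1's rating for "b".
ratings = {
    "u1": {"a": 4},
    "u2": {"a": 5, "b": 3},
    "u3": {"a": 2, "b": 1},
}
prediction = predict_item_based(ratings, "u1", "b")
```

Because u1 has rated a single item, the weighted average reduces to that one rating; with more rated items the similarities decide how much each one contributes.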

Advantages of Memory-Based Collaborative Filtering:
• Simplicity: Memory-based approaches are intuitive and simple to implement, making them a
viable option for solving problems with moderately big datasets in a short amount of time.
• Transparency: Memory-Based systems’ suggestions are easy to understand since they are
grounded in the user’s and the item’s direct interactions.
• Serendipity: Memory-based filtering has the potential to provide serendipitous
recommendations, in which users stumble onto previously unknown but potentially
fascinating content through shared relationships with other users
Drawbacks of Memory-Based Collaborative Filtering:
• Sparsity and Scalability: Since the density of user-item interactions tends to decrease as the
dataset expands, it becomes more difficult to discover trustworthy neighbours, and comparing
all pairs of users or items might cause scaling problems.
• Cold Start: Memory-Based systems struggle when there are too few interactions with new users
or items to make reliable suggestions.
• Limited Representation: Memory-based approaches may provide subpar results because they
fail to fully capture complicated patterns in the data.
Model-based collaborative approach
• Instead of using a predetermined set of rules, model-based collaborative filters use a statistical
or machine learning model to identify and exploit hidden links and patterns in the data. These
models are then used to estimate users’ preferences for unseen items based on training
data of past interactions between users and items.
• In the model-based approach, machine learning models are used to predict and rank
interactions between users and the items they haven’t interacted with yet. These models are
trained using the interaction information already available from the interaction matrix by
deploying different algorithms like matrix factorization, deep learning, clustering, etc.

Matrix factorization
Matrix factorization is used to generate latent features by decomposing the sparse user-item
interaction matrix into two smaller and dense matrices of user and item entities.
Matrix factorization is a popular technique used in Collaborative Filtering (CF) for recommendation
systems. CF is a method to predict a user's interests by collecting preferences or behavior
information from many users. Matrix factorization is particularly effective in collaborative filtering
because it can handle the sparsity of user-item interaction data.
Here's how matrix factorization works in the context of collaborative filtering:
1. Understanding the Data Matrix:
• Assume you have a matrix R representing user-item interactions. Rows correspond to
users, columns correspond to items, and the entries $R_{ui}$ represent user u's interaction
(like rating, purchase, or view) with item i. However, most entries are unknown
(missing) because not all users interact with all items.
2. Objective of Matrix Factorization:
• The goal of matrix factorization in CF is to decompose this sparse matrix R into the
product of two lower-dimensional matrices U and I:
$$R \approx U I^{T}$$
• Here, U (an $m \times k$ matrix) represents user embeddings, where each row u (out of m
rows) corresponds to a user's latent factors in a k-dimensional space.
• I (an $n \times k$ matrix) represents item embeddings, where each row i (out of n rows)
corresponds to an item's latent factors in the same k-dimensional space.
3. Matrix Factorization Process:
• Matrix factorization aims to learn the matrices U and I by minimizing the reconstruction
error between R and $U I^{T}$ on the observed entries. This is typically achieved through optimization techniques
like gradient descent, alternating least squares, or stochastic gradient descent.
• The objective function could be formulated as:
$$\min_{U, I} \sum_{(u,i) \in \text{observed}} \left(R_{ui} - (U I^{T})_{ui}\right)^{2} + \lambda \left(\lVert U \rVert^{2} + \lVert I \rVert^{2}\right)$$
where λ is a regularization parameter to prevent overfitting.
4. Prediction and Recommendations:
• Once the matrices U and I are learned, the missing entries in R can be estimated from the
corresponding entries of $U I^{T}$.
• Recommendations for a user u can be made by suggesting items that have the
highest predicted scores (entries in $U I^{T}$) for that user, but have not been
interacted with yet.
5. Key Advantages:
• Matrix factorization is effective in handling sparsity because it leverages latent factors
to capture user and item interactions.
• It can provide personalized recommendations even for users with very few interactions.
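
The factorization described above can be sketched with plain stochastic gradient descent. This is a minimal illustration under stated assumptions: the hyperparameters (`k`, `lr`, `reg`, `epochs`), the initialization, and the sample ratings are arbitrary choices for the example, and the item matrix is named `V` in the code (the notes call it I) to avoid confusion with the item index `i`.

```python
import random

def predict(U, V, u, i):
    """Predicted score: dot product of user u's and item i's latent factors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

def rmse(observed, U, V):
    """Root mean squared error over the observed (user, item, rating) triples."""
    total = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in observed)
    return (total / len(observed)) ** 0.5

def factorize(observed, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=3000, seed=0):
    """Learn U (n_users x k) and V (n_items x k) from observed ratings with SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - predict(U, V, u, i)
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error plus L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

# Hypothetical observed entries of R as (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
U, V = factorize(observed, n_users=3, n_items=3)
estimate = predict(U, V, 0, 2)  # an entry of R that was never observed
```

Only the observed entries enter the loss, which is how the method sidesteps the missing values; the unobserved entry `(0, 2)` is then filled in from the learned factors.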
Advantages of Model-Based Collaborative Filtering:
• Scalability: Model-Based approaches outperform Memory-Based ones in dealing with big
and sparse datasets because they learn underlying patterns without making direct comparisons
of users or items.
• Cold Start Mitigation: By using supplementary data or a hybrid method, model-based
filtering may help with the cold start issue.
• Flexibility: Model-based methods may use a wide variety of data and attributes, allowing for
the incorporation of context to enhance suggestions.
Drawbacks of Model-Based Collaborative Filtering:
• Complexity: Due to the complexity of the models they need, the development and tuning of
model-based approaches often take more time and skill.
• Black Box: High accuracy is possible with Model-Based filtering, although the
models’ inner workings may be less visible and interpretable than those of Memory-Based
approaches.
• Overfitting: Overfitting is a problem in Model-Based systems when there is insufficient
data, and this may result in suggestions that are too heavily weighted towards prior interactions.
Hybrid Approaches
Hybrid approaches in Collaborative Filtering (CF) combine different methods or techniques to
overcome limitations and enhance the performance of recommendation systems. These approaches
leverage the strengths of multiple recommendation strategies, such as collaborative filtering (CF) and
content-based filtering (CBF), to provide more accurate and diverse recommendations. Here's a
breakdown of hybrid approaches in CF:
1. Collaborative Filtering (CF):
• Collaborative Filtering methods recommend items based on user-item interactions or
similarities between users. This can be user-based CF (recommending items liked by
similar users) or item-based CF (recommending similar items to those a user has liked).
2. Content-Based Filtering (CBF):
• Content-Based Filtering recommends items based on their features or attributes. It
analyzes item descriptions or user profiles to suggest items that are similar in content
to previously liked items.
3. Types of Hybrid Approaches:
a. Weighted Hybrid:
• In this approach, predictions from different recommendation techniques (e.g., CF and
CBF) are combined using weighted averages or other blending methods. The weights
can be fixed or learned based on data.
b. Feature Combination:
• Features derived from both CF and CBF methods are combined to create a unified
feature representation. Machine learning algorithms can then use this combined feature
representation to make recommendations.
c. Cascade or Switch Hybrid:
• Recommendations from one method (e.g., CF) are used to filter or augment
recommendations from another method (e.g., CBF). This can improve recommendation
accuracy by leveraging the strengths of both methods.
d. Meta-Level Hybrid:
• In this approach, predictions from different recommendation algorithms are treated as
input features to a meta-learner (e.g., a machine learning model). The meta-learner then
combines these predictions to generate final recommendations.
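
The weighted hybrid (type a above) is the simplest of these to illustrate. The sketch below assumes both recommenders already produce scores on the same scale, and it treats an item missing from one recommender as scoring 0 there; both are assumptions made for the example, as are the function name and the sample scores.

```python
def weighted_hybrid(cf_scores, cbf_scores, cf_weight=0.7):
    """Blend CF and CBF scores with a fixed weight; missing scores count as 0."""
    items = set(cf_scores) | set(cbf_scores)
    return {
        item: cf_weight * cf_scores.get(item, 0.0)
        + (1 - cf_weight) * cbf_scores.get(item, 0.0)
        for item in items
    }

# Hypothetical per-item scores from the two recommenders for one user.
cf_scores = {"m1": 4.0, "m2": 2.0}
cbf_scores = {"m1": 2.0, "m3": 4.0}
blended = weighted_hybrid(cf_scores, cbf_scores, cf_weight=0.5)
ranked = sorted(blended, key=blended.get, reverse=True)
```

In practice the weight could be tuned on held-out data rather than fixed by hand, which corresponds to the "learned" weights mentioned above.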
Advantages of Hybrid Approaches:
• Improved Accuracy: By combining multiple methods, hybrid approaches can mitigate
weaknesses and improve recommendation accuracy.
• Diversity: Hybrid methods can provide more diverse recommendations by leveraging
different recommendation strategies.
• Robustness: They are more robust to data sparsity and the cold start problem compared to
individual CF or CBF methods.
• Improved Performance: Hybrid techniques can provide higher overall
performance by combining the capabilities of Memory-Based and Model-Based methodologies.
• Cold Start Mitigation: Cold-start difficulties can be mitigated by falling back on content
features when interaction data is scarce.
Examples:
• Netflix's recommendation system uses a hybrid approach, combining collaborative
filtering (based on user ratings) with content-based filtering (analyzing movie
attributes like genre).
• Amazon's recommendation system also uses a hybrid approach, combining
user-item interactions with item attributes and user demographics.
Movie Recommendation System
Data:
• User Preferences: User ratings for movies.
• Movie Attributes: Genre, director, actors, release year, etc.
Hybrid Approach Components:
1. Collaborative Filtering (CF):
• Idea: Recommend movies based on user behavior and preferences.
• Implementation:
• Use matrix factorization (such as Singular Value Decomposition) to learn
latent factors from user-item interactions (ratings).
• Predict ratings for unseen movies from the learned user and movie factors.
2. Content-Based Filtering (CBF):
• Idea: Recommend movies based on the attributes or content of the items.
• Implementation:
• Extract features from movies such as genre, director, actors, release year.
• Build a profile for each user based on their rated movies.
• Recommend movies that are similar in content to the ones a user has liked.
3. Hybridization:
• Combining CF and CBF:
• Weighted Approach: Combine scores from CF and CBF using a weighted
sum or other fusion techniques.
• Switching Strategy: Use CF for some users and CBF for others based on
data availability or performance metrics.
• Feature Combination: Include content-based features (e.g., movie genres,
director) as additional input to the collaborative filtering model.
Recommendation Process:
• For a New User:
• If the user has not rated any movies yet:
• Use CBF to recommend movies based on their provided preferences (e.g.,
preferred genres).
• Once the user rates some movies:
• Incorporate these ratings into the CF model to provide personalized
recommendations.
• For Existing Users:
• Use the hybrid approach to generate recommendations:
• Combine CF predictions (based on user-item interactions) with CBF
recommendations (based on movie attributes).
• Present the top-rated hybrid recommendations to the user.
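
The recommendation process above amounts to a switching rule, which might be sketched as follows. The threshold `min_ratings`, the function names, and the two stand-in recommenders are hypothetical and only serve to show the control flow; a real system would plug in its own CBF and hybrid components.

```python
def recommend(user_ratings, cbf_recommend, hybrid_recommend, min_ratings=3):
    """Use CBF for users with too few ratings, the hybrid recommender otherwise."""
    if len(user_ratings) < min_ratings:
        return cbf_recommend(user_ratings)   # cold start: rely on content only
    return hybrid_recommend(user_ratings)    # enough history: combine CF and CBF

# Stand-in recommenders that return fixed lists, for illustration only.
def cbf_recommend(user_ratings):
    return ["content-pick"]

def hybrid_recommend(user_ratings):
    return ["hybrid-pick"]

new_user = recommend({}, cbf_recommend, hybrid_recommend)
existing_user = recommend({"m1": 5, "m2": 3, "m3": 4}, cbf_recommend, hybrid_recommend)
```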
Benefits of Hybrid Approach:
• Increased Accuracy: Combining multiple recommendation techniques can lead to more
accurate predictions.
• Improved Coverage: Content-based filtering can recommend items even when user-item
interactions are sparse (cold start problem).
• Enhanced Personalization: Incorporating user preferences (CBF) along with user-item
interactions (CF) leads to more personalized recommendations.
In this movie recommendation system example, the hybrid approach leverages both collaborative
filtering and content-based filtering techniques to provide diverse and accurate movie
recommendations tailored to individual users' tastes and preferences. Hybridization allows for a more
robust recommendation system that can handle various scenarios and user behaviours effectively.
NEAREST NEIGHBOR COLLABORATIVE FILTERING
• Neighborhood-based recommender systems fall under the collaborative filtering umbrella
and focus on using behavioral patterns, such as movies that users have watched in the past,
to identify similar users (i.e., users who demonstrate similar preferences), or similar items (i.e.,
items that receive similar interest from the same users).
• Nearest Neighbors Collaborative Filtering (NNCF) is a technique used in recommendation
systems to predict user preferences based on the similarity between users or items. It falls
under the umbrella of Collaborative Filtering (CF), which utilizes the collective wisdom of
users to make recommendations.
• User-based Collaborative Filtering (UBCF):
o Predict a user's preference for an item by finding similar users based on their historical
ratings.
• Item-based Collaborative Filtering (IBCF):
o Predict a user's preference for an item by finding similar items based on how users have
rated them.
Steps Involved in Nearest Neighbors Collaborative Filtering
Step-1: Data Representation: Represent user-item interactions as a matrix R, where rows correspond
to users and columns correspond to items. Each entry $R_{ui}$ represents user u's rating of (or interaction
with) item i.
Step-2: Similarity Calculation: Compute similarity between users (for UBCF) or items (for IBCF)
based on their rating patterns. Common similarity metrics include cosine similarity, Pearson
correlation, or Jaccard similarity.
Step-3: Nearest Neighbors Selection: For a given user u (or item i), identify the k most similar users
(or items) based on the computed similarity scores. Nearest Neighbors are typically selected based
on the highest similarity scores.
Step-4: Prediction:
• UBCF Prediction: Predict user u's rating for item i by averaging the ratings of the k nearest
Neighbors who have rated item i, weighted by their similarity to user u.
• IBCF Prediction: Predict user u's rating for item i by combining ratings of items similar to
item i, weighted by the similarity between items.
• We refer to the technique that computes similar users as user-based and to the technique that
focuses on computing similar items as item-based.
• An example of the item-based technique is Netflix’s “Because you watched…” feature,
which recommends movies or shows based on examples that users previously showed
interest in.
• An example of a user-based recommender system is [Link], which recommends
destinations based on the historical behavior of other users with similar travel history.
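
The two Step-4 predictions described above can be written as similarity-weighted averages. The notation here is an assumption introduced for illustration: $N_k(u; i)$ denotes the k users most similar to u who have rated item i, and $N_k(i; u)$ denotes the k items most similar to i that user u has rated.

```latex
\hat{r}_{ui} = \frac{\sum_{v \in N_k(u;\, i)} sim(u, v)\, R_{vi}}
                    {\sum_{v \in N_k(u;\, i)} \lvert sim(u, v) \rvert}
\qquad \text{(UBCF)}

\hat{r}_{ui} = \frac{\sum_{j \in N_k(i;\, u)} sim(i, j)\, R_{uj}}
                    {\sum_{j \in N_k(i;\, u)} \lvert sim(i, j) \rvert}
\qquad \text{(IBCF)}
```

Dividing by the sum of absolute similarities keeps the prediction on the same scale as the ratings.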

Pipeline Overview

The image below summarizes the pipeline for our implementation of item-based and user-based
recommender systems in our declarative language, Rel. Without loss of generality, we focus on a
movie recommendation use case, where we are given interactions between users and movies.
Step 1: We convert user-item interactions to a bipartite graph.
The first step is to convert the input interactions data to a bipartite graph that contains two types of
nodes: Users and Movies, as shown in the image below.

The two node types are connected by an edge that we call watched. In Rel, Users and Movies are
represented by entity types, and their attributes, such as id and name, are represented by value types

Step 2: MovieLens Graph. We use user-item interactions to compute item-item and user-user similarities
by leveraging the functions supported by the graph analytics library.
• Once we define the entity and value types, the next step is to populate the entities with data
from the original MovieLens dataset.
• Assuming we have a relation called watched_train(user, movie) that represents the train
subset of the MovieLens data and contains the watch history for the users, and a relation called
movie_info(movie, movie_name) that contains the movie names, we create the Movie entity
as follows:
• The User entity is created similarly. Finally, we add an additional edge called watched that
connects the movie entity to the user entity.
Step 3: Similarity Computation.
• We use the similarities to predict the scores for all (user, movie) pairs. Each score is an
indication of how likely it is for a user to interact with a movie.
• Now that we have modeled our data as a graph, we can compute item-item and user-user
similarities using the user-item interactions: movies that have been watched by the same users
will have a high similarity value, while movies that have been watched by different users will
have a low similarity value.
• Here, we focus on the item-based method. The approach for the user-based method is very
similar. There are several similarity metrics that can be used for this task. Currently, the Rel
graph library provides the cosine_similarity and jaccard_similarity relations
Step 4: Scoring
• We sort the scores for every user in order to generate top-k recommendations.
• Using the similarities calculated in the previous step, we then compute the (user, movie) scores
for all pairs. We predict that a user will watch movies that are similar to the movies they have
watched in the past (item-based approach).
• The score for a pair (user, movie) indicates how likely it is for a user to watch a movie and is
calculated as follows:

$$Score_{u,i} = \sum_{n \in N[i] \cap W[u]} S_{i,n}$$

Where:
• $Score_{u,i}$ is the predicted score for user u and item i
• N[i] is the set of item i’s nearest neighbors
• W[u] is the set of items watched by user u
• $S_{i,n}$ is the similarity score between items i and n
That is, the score is the sum of the similarity scores of the target movie’s nearest neighbors that have been
watched by the target user.
• The pred relation takes the following inputs:
• neighborhood_size: The number of similar movies (neighbors) we use to predict the score
• M: The relation containing (movie, user) pairs
• S: The similarity metric, e.g., jaccard, cosine
• T: The relation that selects the top neighborhood_size most similar movies to the
target movie (i.e., the nearest neighbors).
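
The scoring rule can be mirrored outside Rel in a few lines of Python. This sketch is not the Rel `pred` relation; it only reproduces the same idea (sum the similarities of the target movie's nearest neighbors that the user has watched) with Jaccard similarity. The function names and the watch history are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def invert(watched):
    """Turn user -> set of movies into movie -> set of users."""
    watched_by = {}
    for user, movies in watched.items():
        for movie in movies:
            watched_by.setdefault(movie, set()).add(user)
    return watched_by

def score(user, movie, watched, neighborhood_size):
    """Sum of similarities of movie's nearest neighbors that user has watched."""
    watched_by = invert(watched)
    sims = [(jaccard(watched_by[movie], watched_by[other]), other)
            for other in watched_by if other != movie]
    neighbors = sorted(sims, reverse=True)[:neighborhood_size]   # N[i]
    return sum(s for s, other in neighbors if other in watched[user])

# Hypothetical watch history.
watched = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"C"}}
score_u1_C = score("u1", "C", watched, neighborhood_size=2)
```

Sorting each user's scores in descending order and keeping the first k movies they have not watched then yields the top-k recommendations.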
Step 5: We evaluate performance using evaluation metrics that are widely used for recommender
systems
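
The notes do not name the metrics used here. As an assumption, the sketch below shows two that are common for top-k recommendation, precision@k and recall@k, computed against a held-out set of items the user actually interacted with. The function names and sample data are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

# Hypothetical ranked recommendations and held-out relevant items for one user.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "d", "e"}
p_at_2 = precision_at_k(recommended, relevant, 2)
r_at_4 = recall_at_k(recommended, relevant, 4)
```

These per-user values are usually averaged over all test users to obtain a single figure for the recommender.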

a) User-Based Collaborative Filtering


User-Based Collaborative Filtering is a technique used to predict the items that a user might like on
the basis of ratings given to those items by other users whose tastes are similar to those of the target
user. Many websites use collaborative filtering to build their recommendation systems.
Step 1: Finding the similarity of users to the target user U. The similarity between any two users ‘a’ and ‘b’
can be calculated with the Pearson correlation similarity, which is used in Collaborative Filtering
(Recommender Systems) to measure the similarity between two users (or items).
Pearson Similarity Formula
$$Sim(a, b) = \frac{\sum_{p} (r_{ap} - \bar{r}_{a})(r_{bp} - \bar{r}_{b})}{\sqrt{\sum_{p} (r_{ap} - \bar{r}_{a})^{2}}\; \sqrt{\sum_{p} (r_{bp} - \bar{r}_{b})^{2}}}$$

Meaning of Symbols
Symbol          Meaning
$Sim(a, b)$     Similarity between user a and user b
$r_{ap}$        Rating given by user a to item p
$r_{bp}$        Rating given by user b to item p
$\bar{r}_{a}$   Average rating of user a
$\bar{r}_{b}$   Average rating of user b
$p$             Items rated by both users
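
The Pearson similarity can be sketched in Python as below. One convention has to be chosen and is an assumption here: each user's average is taken over all of that user's ratings, while the sums run only over the items rated by both users. The function name and the sample ratings are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson similarity between two users given as {item: rating} dicts."""
    common = set(a) & set(b)          # items p rated by both users
    if not common:
        return 0.0
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    num = sum((a[p] - mean_a) * (b[p] - mean_b) for p in common)
    den = (math.sqrt(sum((a[p] - mean_a) ** 2 for p in common))
           * math.sqrt(sum((b[p] - mean_b) ** 2 for p in common)))
    return num / den if den else 0.0

# Hypothetical users: b rates in step with a, c rates in the opposite direction.
a = {"p1": 1, "p2": 2, "p3": 3}
b = {"p1": 2, "p2": 4, "p3": 6}
c = {"p1": 3, "p2": 2, "p3": 1}
sim_ab = pearson(a, b)
sim_ac = pearson(a, c)
```

Because the ratings are mean-centered first, a user who rates everything one point higher than another still comes out as highly similar, which cosine similarity on raw ratings does not guarantee.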

Step 2: Prediction of the missing rating of an item. The target user might be very similar to some users and
not very similar to others. Hence, the ratings given to a particular item by the more similar users
should be given more weight than those given by less similar users. This can be done
with a weighted average approach: each neighbor's rating, taken as a deviation from that neighbor's
average rating, is multiplied by the similarity factor calculated using the above-mentioned formula.
The missing rating can then be calculated with the Resnick prediction formula, which is widely used in
User-Based Collaborative Filtering (UBCF) in recommender systems.
Resnick Prediction Formula
$$r_{up} = \bar{r}_{u} + \frac{\sum_{i \in users} sim(u, i)\,(r_{ip} - \bar{r}_{i})}{\sum_{i \in users} \lvert sim(u, i) \rvert}$$

16
Meaning of Symbols
Symbol Meaning
𝑟𝑢𝑝 Predicted rating of user u for item p
𝑟ˉ𝑢 Average rating of user u
𝑠𝑖𝑚(𝑢, 𝑖) Similarity between user u and user i
𝑟𝑖𝑝 Rating given by user i for item p
𝑟ˉ𝑖 Average rating of user i
( sim(u,i)
𝑖 Neighbor users similar to user u

Example: User-Based Collaborative Filtering


Consider a matrix in which four users (Alice, U1, U2 and U3) rate different news apps. Ratings range
from 1 to 5 according to how much the user likes the news app. A '?' indicates that the user has not
rated the app.

Step 1: Given Rating Matrix

| User | Inshorts (I1) | HT (I2) | NYT (I3) | TOI (I4) | BBC (I5) |
|---|---|---|---|---|---|
| Alice | 5 | 4 | 1 | 4 | ? |
| U1 | 3 | 1 | 2 | 3 | 3 |
| U2 | 4 | 3 | 4 | 3 | 5 |
| U3 | 3 | 3 | 1 | 5 | 4 |

Goal: Predict Alice's rating for I5 (BBC).
In User-Based Collaborative Filtering, when computing similarity, we exclude the item whose rating
we want to predict. So I5 (BBC) is excluded, and similarity is calculated using I1, I2, I3 and I4 only.
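Step 2: Mean Rating of Each User
Formula
$$\bar{R}_u = \frac{\sum_i R_{u,i}}{N}$$
where N is the number of items rated by user u.

Alice
$$\bar{R}_A = \frac{5+4+1+4}{4} = 3.5$$

U1
$$\bar{R}_{U1} = \frac{3+1+2+3+3}{5} = 2.4$$

U2
$$\bar{R}_{U2} = \frac{4+3+4+3+5}{5} = 3.8$$

U3
$$\bar{R}_{U3} = \frac{3+3+1+5+4}{5} = 3.2$$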

Step 3: Pearson Similarity Formula
$$Sim(A,U) = \frac{\sum_i (R_{A,i}-\bar{R}_A)(R_{U,i}-\bar{R}_U)}{\sqrt{\sum_i (R_{A,i}-\bar{R}_A)^2}\,\sqrt{\sum_i (R_{U,i}-\bar{R}_U)^2}}$$

Only I1–I4 are used.

Step 4: Similarity Between Alice and U1 (mean of U1 = 2.4)
| Item | Alice | U1 | $A-\bar{A}$ | $U1-\bar{U1}$ | Product |
|---|---|---|---|---|---|
| I1 | 5 | 3 | 1.5 | 0.6 | 0.9 |
| I2 | 4 | 1 | 0.5 | -1.4 | -0.7 |
| I3 | 1 | 2 | -2.5 | -0.4 | 1.0 |
| I4 | 4 | 3 | 0.5 | 0.6 | 0.3 |
Numerator
$$0.9 - 0.7 + 1.0 + 0.3 = 1.5$$

Denominator
$$\sqrt{1.5^2+0.5^2+(-2.5)^2+0.5^2} = \sqrt{2.25+0.25+6.25+0.25} = \sqrt{9} = 3$$
$$\sqrt{0.6^2+(-1.4)^2+(-0.4)^2+0.6^2} = \sqrt{0.36+1.96+0.16+0.36} = \sqrt{2.84} = 1.685$$
$$Sim(A,U1) = \frac{1.5}{3 \times 1.685} \approx 0.296$$

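Step 5: Similarity Between Alice and U2 (mean of U2 = 3.8)
| Item | Alice | U2 | $A-\bar{A}$ | $U2-\bar{U2}$ | Product |
|---|---|---|---|---|---|
| I1 | 5 | 4 | 1.5 | 0.2 | 0.3 |
| I2 | 4 | 3 | 0.5 | -0.8 | -0.4 |
| I3 | 1 | 4 | -2.5 | 0.2 | -0.5 |
| I4 | 4 | 3 | 0.5 | -0.8 | -0.4 |
Numerator
$$0.3 - 0.4 - 0.5 - 0.4 = -1$$

Denominator
$$\sqrt{9} = 3$$
$$\sqrt{0.2^2+(-0.8)^2+0.2^2+(-0.8)^2} = \sqrt{1.36} = 1.166$$
$$Sim(A,U2) = \frac{-1}{3 \times 1.166} \approx -0.286$$

Step 6: Similarity Between Alice and U3 (mean of U3 = 3.2)
| Item | Alice | U3 | $A-\bar{A}$ | $U3-\bar{U3}$ | Product |
|---|---|---|---|---|---|
| I1 | 5 | 3 | 1.5 | -0.2 | -0.3 |
| I2 | 4 | 3 | 0.5 | -0.2 | -0.1 |
| I3 | 1 | 1 | -2.5 | -2.2 | 5.5 |
| I4 | 4 | 5 | 0.5 | 1.8 | 0.9 |
Numerator
$$-0.3 - 0.1 + 5.5 + 0.9 = 6$$

Denominator
$$\sqrt{9} = 3$$
$$\sqrt{0.04+0.04+4.84+3.24} = \sqrt{8.16} = 2.857$$
$$Sim(A,U3) = \frac{6}{3 \times 2.857} \approx 0.701$$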

Step 7: Rating Prediction (Resnick Formula)
$$P(A,I5) = \bar{R}_A + \frac{\sum_U Sim(A,U)\,(R_{U,I5}-\bar{R}_U)}{\sum_U |Sim(A,U)|}$$

Numerator
U1: $0.296\,(3-2.4) = 0.296 \times 0.6 = 0.178$
U2: $-0.286\,(5-3.8) = -0.286 \times 1.2 = -0.343$
U3: $0.701\,(4-3.2) = 0.701 \times 0.8 = 0.561$
Total: $0.178 - 0.343 + 0.561 = 0.396$
Denominator
$$|0.296| + |-0.286| + |0.701| = 1.283$$

Step 8: Final Prediction
$$P(A,I5) = 3.5 + \frac{0.396}{1.283} = 3.5 + 0.309 = 3.81$$

Final Result
Predicted rating of Alice for BBC (I5) = 3.81 ≈ 4
Thus, BBC is recommended to Alice.
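The worked example above can be checked with a short Python sketch. It follows the same conventions as the hand calculation (each user's mean is taken over all items that user has rated, and similarity is computed only over co-rated items). The function names and data layout are assumptions made for illustration.

```python
from math import sqrt

# Ratings from the worked example; Alice has not rated I5.
ratings = {
    "Alice": {"I1": 5, "I2": 4, "I3": 1, "I4": 4},
    "U1": {"I1": 3, "I2": 1, "I3": 2, "I4": 3, "I5": 3},
    "U2": {"I1": 4, "I2": 3, "I3": 4, "I4": 3, "I5": 5},
    "U3": {"I1": 3, "I2": 3, "I3": 1, "I4": 5, "I5": 4},
}

def mean(user):
    """Average over every item the user has rated."""
    values = ratings[user].values()
    return sum(values) / len(values)

def pearson(a, b):
    """Pearson similarity over the items rated by both users."""
    common = sorted(ratings[a].keys() & ratings[b].keys())
    da = [ratings[a][p] - mean(a) for p in common]
    db = [ratings[b][p] - mean(b) for p in common]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da)) * sqrt(sum(y * y for y in db))
    return num / den if den else 0.0

def predict(user, item):
    """Resnick prediction using every other user who rated the item."""
    neighbors = [v for v in ratings if v != user and item in ratings[v]]
    sims = {v: pearson(user, v) for v in neighbors}
    num = sum(s * (ratings[v][item] - mean(v)) for v, s in sims.items())
    den = sum(abs(s) for s in sims.values())
    return mean(user) + (num / den if den else 0.0)

print(round(predict("Alice", "I5"), 2))  # 3.81
```

Because the sketch does not round intermediate similarities, its individual values can differ from the hand calculation in the third decimal place, but the final prediction agrees.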

b) Item-to-Item Based Collaborative Filtering

• Collaborative Filtering is a technique or a method to predict a user’s taste and find the items
that a user might prefer on the basis of information collected from various other users having
similar tastes or preferences.
• It takes into consideration the basic fact that if person X and person Y have a certain reaction
for some items then they might have the same opinion for other items too.
• The two most popular forms of collaborative filtering are:
• User Based: Here, we look for the users who have rated various items in the same way and
then find the rating of the missing item with the help of these users.
• Item Based: Here, we explore the relationship between the pair of items (the user who
bought Y, also bought Z). We find the missing rating with the help of the ratings given to the
other items by the user.
• The similarity between item pairs can be found in different ways. One of the most common
methods is to use cosine similarity

• Prediction Computation: The second stage generates the prediction. It uses the items (already
rated by the user) that are most similar to the missing item to generate the rating.

• We therefore generate predictions based on the ratings of similar products, using a formula that
computes the rating for a particular item as a weighted sum of the ratings of the other similar
products.
The Item-Based Collaborative Filtering prediction formula is:
$$rating(U, I_i) = \frac{\sum_{j} rating(U, I_j) \times s_{ij}}{\sum_{j} s_{ij}}$$

Meaning of Symbols
• $rating(U, I_i)$ → predicted rating of user U for item $I_i$
• $rating(U, I_j)$ → rating given by the user to similar item $I_j$
• $s_{ij}$ → similarity between item $I_i$ and item $I_j$
• $\sum$ → summation over all similar items

Example: Item-to-Item Based Collaborative Filtering


The table below lists some items and the users who have rated them. The ratings are explicit, on a
scale of 1 to 5. Each entry denotes the rating given by the i-th user to the j-th item. In most real
cases the majority of cells are empty, because a user rates only a few items. Here we take 4 users and
3 items, and we need to find the missing rating for each user.

To find the missing ratings using Item-Based Collaborative Filtering (IBCF), we compute the similarity
between items and then predict each missing value using the user's ratings of similar items.

1. Given Rating Matrix

| User / Item | Item₁ | Item₂ | Item₃ |
|---|---|---|---|
| User₁ | 2 | – | 3 |
| User₂ | 5 | 2 | – |
| User₃ | 3 | 3 | 1 |
| User₄ | – | 2 | 2 |

Missing ratings:
• $r_{1,2}$ (User₁, Item₂)
• $r_{2,3}$ (User₂, Item₃)
• $r_{4,1}$ (User₄, Item₁)
2. Item-Based CF Prediction Formula
$$r_{ui} = \frac{\sum_{j} r_{uj}\, s_{ij}}{\sum_{j} |s_{ij}|}$$

Where
• $r_{ui}$ = predicted rating of user u for item i
• $r_{uj}$ = rating given by user u to item j
• $s_{ij}$ = similarity between items i and j

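3. Compute Item Similarities (Cosine Similarity)
$$s_{ij} = \frac{\sum_u r_{ui}\, r_{uj}}{\sqrt{\sum_u r_{ui}^2}\,\sqrt{\sum_u r_{uj}^2}}$$
where the sums run over the users who rated both items.

Similarity between Item₁ and Item₂
Common users → User₂, User₃
| User | Item₁ | Item₂ |
|---|---|---|
| U₂ | 5 | 2 |
| U₃ | 3 | 3 |
Numerator
$$(5 \times 2) + (3 \times 3) = 10 + 9 = 19$$
Denominator
$$\sqrt{5^2+3^2} \times \sqrt{2^2+3^2} = \sqrt{34} \times \sqrt{13} = \sqrt{442} \approx 21.02$$
$$s_{12} = 19 / 21.02 \approx 0.90$$

Similarity between Item₁ and Item₃
Common users → User₁, User₃
| User | Item₁ | Item₃ |
|---|---|---|
| U₁ | 2 | 3 |
| U₃ | 3 | 1 |
Numerator
$$(2 \times 3) + (3 \times 1) = 6 + 3 = 9$$
Denominator
$$\sqrt{2^2+3^2} \times \sqrt{3^2+1^2} = \sqrt{13} \times \sqrt{10} = \sqrt{130} \approx 11.40$$
$$s_{13} = 9 / 11.40 \approx 0.79$$

Similarity between Item₂ and Item₃
Common users → User₃, User₄
| User | Item₂ | Item₃ |
|---|---|---|
| U₃ | 3 | 1 |
| U₄ | 2 | 2 |
Numerator
$$(3 \times 1) + (2 \times 2) = 3 + 4 = 7$$
Denominator
$$\sqrt{3^2+2^2} \times \sqrt{1^2+2^2} = \sqrt{13} \times \sqrt{5} = \sqrt{65} \approx 8.06$$
$$s_{23} = 7 / 8.06 \approx 0.87$$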

4. Predict Missing Ratings

Predict $r_{1,2}$ (User₁ → Item₂)
User₁ rated:
• Item₁ = 2
• Item₃ = 3
$$r_{1,2} = \frac{(2 \times 0.90) + (3 \times 0.87)}{0.90 + 0.87} = \frac{1.8 + 2.61}{1.77} = \frac{4.41}{1.77} = 2.49 \approx 2$$

Predict $r_{2,3}$ (User₂ → Item₃)
User₂ rated:
• Item₁ = 5
• Item₂ = 2
$$r_{2,3} = \frac{(5 \times 0.79) + (2 \times 0.87)}{0.79 + 0.87} = \frac{3.95 + 1.74}{1.66} = \frac{5.69}{1.66} = 3.43 \approx 3$$

Predict $r_{4,1}$ (User₄ → Item₁)
User₄ rated:
• Item₂ = 2
• Item₃ = 2
$$r_{4,1} = \frac{(2 \times 0.90) + (2 \times 0.79)}{0.90 + 0.79} = \frac{1.8 + 1.58}{1.69} = \frac{3.38}{1.69} = 2$$

5. Final Predicted Ratings

| User | Item | Predicted |
|---|---|---|
| User₁ | Item₂ | 2.49 ≈ 2 |
| User₂ | Item₃ | 3.43 ≈ 3 |
| User₄ | Item₁ | 2 |

6. Final Completed Matrix
| User / Item | Item₁ | Item₂ | Item₃ |
|---|---|---|---|
| User₁ | 2 | 2 | 3 |
| User₂ | 5 | 2 | 3 |
| User₃ | 3 | 3 | 1 |
| User₄ | 2 | 2 | 2 |
This is the Item-Based Collaborative Filtering solution.
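The item-based example can be reproduced with the following Python sketch. It assumes, as in the hand calculation, that cosine similarity is computed only over users who rated both items and that every item the user has rated is used as a neighbor. The names `R`, `cosine` and `predict` are illustrative.

```python
from math import sqrt

# Rating matrix from the example; missing ratings are simply absent.
R = {
    "User1": {"Item1": 2, "Item3": 3},
    "User2": {"Item1": 5, "Item2": 2},
    "User3": {"Item1": 3, "Item2": 3, "Item3": 1},
    "User4": {"Item2": 2, "Item3": 2},
}

def cosine(i, j):
    """Cosine similarity between items i and j over their common raters."""
    pairs = [(row[i], row[j]) for row in R.values() if i in row and j in row]
    num = sum(x * y for x, y in pairs)
    den = sqrt(sum(x * x for x, _ in pairs)) * sqrt(sum(y * y for _, y in pairs))
    return num / den if den else 0.0

def predict(user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    rated = R[user]
    num = sum(r * cosine(item, j) for j, r in rated.items())
    den = sum(abs(cosine(item, j)) for j in rated)
    return num / den if den else 0.0

for user, item in [("User1", "Item2"), ("User2", "Item3"), ("User4", "Item1")]:
    print(user, item, round(predict(user, item), 2))
```

User₄ rated both known items 2, so any weighted average of those ratings is 2 regardless of the similarity values.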

Advantages:
• Simple and intuitive approach to collaborative filtering.
• Effective in scenarios where users/items have sparse interactions.
• Can capture complex user-item relationships based on similarity metrics.
Challenges and Considerations:
• Data Sparsity: Nearest Neighbors CF may struggle with sparse datasets, where not all
users have rated many items.
• Scalability: Computing pairwise similarities can be computationally expensive for
large datasets.
• Cold Start Problem: Nearest Neighbors CF may face challenges when dealing with new
users or items with few ratings.

COMPONENTS OF NEIGHBORHOOD METHODS
The three very important considerations in the implementation of a neighborhood-based
recommender system are
1) the normalization of ratings,
2) the computation of the similarity weights, and
3) the selection of neighbors.

• Neighborhood methods, a class of collaborative filtering algorithms, rely on the concept of
finding a "neighborhood" of users or items similar to a target user or item. These methods
are based on the idea that users who have similar preferences tend to like similar items, and
vice versa. The key components of neighborhood methods include:
• Similarity Measure: Neighborhood methods use a similarity measure to quantify the similarity
between users or items. Common similarity measures include cosine similarity, Pearson
correlation coefficient, and Jaccard similarity. The choice of similarity measure can significantly
affect the performance of the algorithm.

• Neighborhood Selection: Once the similarity between users or items is computed, the next step
is to select a subset of neighbors that are most similar to the target user or item. This subset is
known as the neighborhood. The size of the neighborhood, i.e., the number of nearest neighbors
to consider, can be fixed or adaptive.

• Rating Prediction: After selecting the neighborhood, the algorithm predicts the rating of a target
user for an item by aggregating the ratings of its neighbors for that item. This can be done using
various aggregation functions such as weighted average, weighted sum, or regression-based
methods.

• Item or User-Based Approach: Neighborhood methods can be either item-based or user-based.
In item-based approaches, similarities between items are computed based on the ratings given by
users, and recommendations are made by finding items similar to those the user has liked. In
user-based approaches, similarities between users are calculated based on their rating patterns, and
recommendations are made by identifying users similar to the target user and recommending
items they have liked.

• Rating Normalization: To improve the accuracy of predictions, rating normalization techniques
may be applied. These techniques adjust the ratings to account for user or item biases, such as
users who tend to rate items more positively or items that are consistently rated higher or lower
than others.

• Sparse Data Handling: Neighborhood methods often face the challenge of dealing with sparse
data, where many user-item pairs have no ratings. Various strategies such as neighborhood
expansion, imputation, or incorporating auxiliary information may be employed to handle sparse
data and improve recommendation quality.
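As a small illustration of the neighborhood selection step, the sketch below keeps the k most similar neighbors and optionally drops those at or below a similarity threshold. The function name, the `min_sim` threshold and the example similarity values are assumptions for this example. Thresholding is one common variant, not something prescribed by these notes.

```python
import heapq

def select_neighbors(similarities, k, min_sim=0.0):
    """Return up to k neighbor ids with the highest similarity above min_sim."""
    candidates = {n: s for n, s in similarities.items() if s > min_sim}
    return heapq.nlargest(k, candidates, key=candidates.get)

# Hypothetical similarities of three users to a target user
sims = {"U1": 0.30, "U2": -0.29, "U3": 0.70}
print(select_neighbors(sims, k=2))  # ['U3', 'U1']
```

A fixed k keeps the cost of prediction predictable, while the threshold prevents weakly or negatively correlated users from entering the neighborhood when few good neighbors exist.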

Components of neighborhood methods -Rating Normalization

• When it comes to assigning a rating to an item, each user has its own personal scale. Even if
an explicit definition of each of the possible ratings is supplied (e.g., 1=“strongly disagree”,
2=“disagree”, 3=“neutral”, etc.), some users might be reluctant to give high/low scores to items
they liked/disliked.
• Two of the most popular rating normalization schemes that have been proposed to
convert individual ratings to a more universal scale are mean-centering and Z-score

I. Mean-centering
Mean Centering in Recommender Systems

Mean centering is a normalization technique used in Collaborative Filtering to remove the bias of
users or items when calculating similarities or predicting ratings.
Different users rate items differently:
• Some users usually give high ratings
• Some users usually give low ratings
Mean centering adjusts ratings so that we measure true preference instead of rating habits.
Mean Centering Formula - For User Mean Centering
$$r'_{u,i} = r_{u,i} - \bar{r}_u$$

Where
• $r_{u,i}$ = rating given by user u to item i
• $\bar{r}_u$ = average rating of user u
• $r'_{u,i}$ = mean-centered rating

Example Dataset - Original Rating Matrix

| User | Matrix | Titanic | Die Hard | Forrest Gump | Wall-E |
|---|---|---|---|---|---|
| John | 5 | 3 | 4 | 4 | ? |
| Lucy | 3 | 1 | 2 | 3 | 3 |
| Eric | 4 | 3 | 4 | 3 | 5 |
| Diane | 3 | 3 | 4 | 5 | 4 |

Calculate Mean Rating of Each User

John
$$\bar{r}_{John} = \frac{5+3+4+4}{4} = \frac{16}{4} = 4$$

Lucy
$$\bar{r}_{Lucy} = \frac{3+1+2+3+3}{5} = \frac{12}{5} = 2.4$$

Eric
$$\bar{r}_{Eric} = \frac{4+3+4+3+5}{5} = \frac{19}{5} = 3.8$$

Diane
$$\bar{r}_{Diane} = \frac{3+3+4+5+4}{5} = \frac{19}{5} = 3.8$$

Mean-Centered Ratings - Formula
$$r'_{u,i} = r_{u,i} - \bar{r}_u$$

Mean-Centered Matrix
| User | Matrix | Titanic | Die Hard | Forrest Gump | Wall-E |
|---|---|---|---|---|---|
| John | 5−4 = 1 | 3−4 = -1 | 4−4 = 0 | 4−4 = 0 | ? |
| Lucy | 3−2.4 = 0.6 | 1−2.4 = -1.4 | 2−2.4 = -0.4 | 3−2.4 = 0.6 | 3−2.4 = 0.6 |
| Eric | 4−3.8 = 0.2 | 3−3.8 = -0.8 | 4−3.8 = 0.2 | 3−3.8 = -0.8 | 5−3.8 = 1.2 |
| Diane | 3−3.8 = -0.8 | 3−3.8 = -0.8 | 4−3.8 = 0.2 | 5−3.8 = 1.2 | 4−3.8 = 0.2 |

Interpretation
Example: Diane
Original ratings
Movie Rating
Titanic 3
Forrest Gump 5
Mean rating: 𝟑. 𝟖

Mean-centered values
Movie Mean-Centered
Titanic −0.8
Forrest Gump +1.2
This shows:
• Titanic → below Diane's average preference
• Forrest Gump → above Diane's average preference
Even though Titanic has rating 3, it becomes negative preference after mean-centering.

Why Mean-Centering is Important


It removes user rating bias.
Example
User Rating Style
John gives high ratings
Lucy gives low ratings
Mean-centering normalizes their rating behaviour before calculating similarity.
Types of Mean Centering
| Type | Formula | Purpose |
|---|---|---|
| User Mean Centering | $r_{ui} - \bar{r}_u$ | Removes user bias |
| Item Mean Centering | $r_{ui} - \bar{r}_i$ | Removes item popularity bias |

Simple Intuition
| Mean-centered value | Meaning |
|---|---|
| Positive value | User likes item more than average |
| Zero | Neutral preference |
| Negative value | User likes item less than average |
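User mean-centering is a one-line transformation per user. The sketch below applies it to the example rating matrix. The dictionary layout and the function name `mean_center` are assumptions made for illustration, and the results are rounded to two decimals only to keep the printed values readable.

```python
# Original ratings from the example; John's unknown Wall-E rating is omitted.
ratings = {
    "John":  {"Matrix": 5, "Titanic": 3, "Die Hard": 4, "Forrest Gump": 4},
    "Lucy":  {"Matrix": 3, "Titanic": 1, "Die Hard": 2, "Forrest Gump": 3, "Wall-E": 3},
    "Eric":  {"Matrix": 4, "Titanic": 3, "Die Hard": 4, "Forrest Gump": 3, "Wall-E": 5},
    "Diane": {"Matrix": 3, "Titanic": 3, "Die Hard": 4, "Forrest Gump": 5, "Wall-E": 4},
}

def mean_center(user_ratings):
    """Subtract the user's own mean rating from each of their ratings."""
    mu = sum(user_ratings.values()) / len(user_ratings)
    return {item: round(r - mu, 2) for item, r in user_ratings.items()}

centered = {user: mean_center(row) for user, row in ratings.items()}
print(centered["Diane"])
```

For Diane this gives -0.8 for Titanic and 1.2 for Forrest Gump, matching the interpretation above: the same raw rating of 3 reads as below average for a user whose mean is 3.8.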

A) User-Based Collaborative Filtering with Mean-Centered Ratings (Resnick Formula).

The following step-by-step solution predicts John's rating for Wall-E using User-Based Collaborative
Filtering with mean-centered ratings (Resnick formula).

Mean-centered ratings on the co-rated movies:
| User | Matrix | Titanic | Die Hard | Forrest Gump |
|---|---|---|---|---|
| John | 1 | -1 | 0 | 0 |
| Lucy | 0.6 | -1.4 | -0.4 | 0.6 |
| Eric | 0.2 | -0.8 | 0.2 | -0.8 |
| Diane | -0.8 | -0.8 | 0.2 | 1.2 |

Compute Similarity (Pearson) Between John and Other Users
We use only the co-rated movies (Matrix, Titanic, Die Hard, Forrest Gump).

Similarity (John, Lucy)
Numerator
$$(1 \times 0.6) + (-1 \times -1.4) + (0 \times -0.4) + (0 \times 0.6) = 0.6 + 1.4 + 0 + 0 = 2$$
Denominator
John part
$$\sqrt{1^2 + (-1)^2 + 0^2 + 0^2} = \sqrt{2} = 1.414$$
Lucy part
$$\sqrt{0.6^2 + (-1.4)^2 + (-0.4)^2 + 0.6^2} = \sqrt{0.36+1.96+0.16+0.36} = \sqrt{2.84} = 1.685$$
Similarity
$$sim(J,L) = \frac{2}{1.414 \times 1.685} = \frac{2}{2.383} = 0.84$$

Similarity (John, Eric)
Numerator
$$(1 \times 0.2) + (-1 \times -0.8) + (0 \times 0.2) + (0 \times -0.8) = 0.2 + 0.8 = 1$$
Denominator
John part
$$\sqrt{2} = 1.414$$
Eric part
$$\sqrt{0.2^2 + (-0.8)^2 + 0.2^2 + (-0.8)^2} = \sqrt{1.36} = 1.166$$
Similarity
$$sim(J,E) = \frac{1}{1.414 \times 1.166} = \frac{1}{1.649} = 0.61$$

Similarity (John, Diane)
Numerator
$$(1 \times -0.8) + (-1 \times -0.8) + (0 \times 0.2) + (0 \times 1.2) = -0.8 + 0.8 = 0$$
Similarity
$sim(J,D) = 0$, so Diane does not influence the prediction.

Ratings for Wall-E
| User | Rating | Mean | Deviation |
|---|---|---|---|
| Lucy | 3 | 2.4 | 0.6 |
| Eric | 5 | 3.8 | 1.2 |
| Diane | 4 | 3.8 | 0.2 |

Apply Resnick Prediction Formula
$$r_{u,i} = \bar{r}_u + \frac{\sum_v sim(u,v)\,(r_{v,i} - \bar{r}_v)}{\sum_v |sim(u,v)|}$$

Calculate Numerator
$$(0.84 \times 0.6) + (0.61 \times 1.2) + (0 \times 0.2) = 0.504 + 0.732 = 1.236$$
Calculate Denominator
$$|0.84| + |0.61| + |0| = 1.45$$
Final Prediction
$$r_{John,WallE} = 4 + \frac{1.236}{1.45} = 4 + 0.85 = 4.85$$

Final Result
Predicted Rating for John on Wall-E ≈ 4.85

Interpretation
• Eric and Lucy are similar to John
• Both rated Wall-E highly
• Therefore, John is predicted to like Wall-E

b) Item-Based Mean Centering in Collaborative Filtering


In Item-Based Collaborative Filtering, we normalize ratings by subtracting the average rating of each
item instead of the user average.
This removes the popularity bias of items.
Some items naturally receive higher ratings, while others receive lower ratings.
Item mean centering helps us compare ratings fairly.
Item Mean Centering Formula

$$r'_{u,i} = r_{u,i} - \bar{r}_i$$

Where
• $r_{u,i}$ = rating given by user u to item i
• $\bar{r}_i$ = average rating of item i
• $r'_{u,i}$ = item mean-centered rating

Given Rating Matrix
User Matrix Titanic Die Hard Forrest Gump
John 5 1 – 2
Lucy 1 5 2 5
Eric 2 ? 3 5
Diane 4 3 5 3
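Step 1: Calculate Item Mean
Matrix
$$\bar{r}_{Matrix} = \frac{5+1+2+4}{4} = \frac{12}{4} = 3$$

Titanic
$$\bar{r}_{Titanic} = \frac{1+5+3}{3} = \frac{9}{3} = 3$$

Die Hard
$$\bar{r}_{DieHard} = \frac{2+3+5}{3} = \frac{10}{3} = 3.33$$

Forrest Gump
$$\bar{r}_{Forrest} = \frac{2+5+5+3}{4} = \frac{15}{4} = 3.75$$

Step 2: Subtract Item Mean

Matrix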


User Calculation Result
John 5 − 3 2
Lucy 1 − 3 −2
Eric 2−3 −1
Diane 4 − 3 1

Titanic
User Calculation Result
John 1 − 3 −2
Lucy 5 − 3 2
Diane 3 − 3 0
(Eric unknown)

Die Hard
User Calculation Result
Lucy 2 − 3.33 −1.33
Eric 3 − 3.33 −0.33
Diane 5 − 3.33 1.67

Forrest Gump
User Calculation Result
John 2 − 3.75 −1.75
Lucy 5 − 3.75 1.25
Eric 5 − 3.75 1.25
Diane 3 − 3.75 −0.75

Item Mean-Centered Matrix


User Matrix Titanic Die Hard Forrest Gump
John 2 -2 – -1.75
Lucy -2 2 -1.33 1.25
Eric -1 ? -0.33 1.25
Diane 1 0 1.67 -0.75

Interpretation
Example: Titanic
Average rating of Titanic
𝑟ˉ𝑇𝑖𝑡𝑎𝑛𝑖𝑐 = 3

Diane rated Titanic = 3


3−3=0

So the item mean-centered rating = 0 (neutral).


This means Diane’s rating is exactly equal to the average rating of Titanic.

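Now we will predict Eric's missing rating for Titanic (?) using Item-Based Collaborative Filtering
with item mean-centering.

Compute Item Similarity

We compute the similarity between Titanic and the other movies, using the item mean-centered
ratings $r'$ of the users who rated both movies.
Similarity Formula (Cosine)
$$sim(i,j) = \frac{\sum_u r'_{u,i}\, r'_{u,j}}{\sqrt{\sum_u r'^{\,2}_{u,i}}\,\sqrt{\sum_u r'^{\,2}_{u,j}}}$$

Similarity (Titanic, Matrix)
Common users: John, Lucy, Diane
Numerator
$$(-2)(2) + (2)(-2) + (0)(1) = -4 - 4 + 0 = -8$$
Denominator
$$\sqrt{(-2)^2 + 2^2 + 0^2} \times \sqrt{2^2 + (-2)^2 + 1^2} = \sqrt{8} \times \sqrt{9} = 2.83 \times 3 = 8.49$$
$$sim(Titanic, Matrix) = \frac{-8}{8.49} = -0.94$$

Similarity (Titanic, Die Hard)
Common users: Lucy, Diane
Numerator
$$(2)(-1.33) + (0)(1.67) = -2.66$$
Denominator
$$\sqrt{2^2 + 0^2} \times \sqrt{(-1.33)^2 + 1.67^2} = \sqrt{4} \times \sqrt{4.56} = 2 \times 2.14 = 4.28$$
$$sim(Titanic, DieHard) = \frac{-2.66}{4.28} = -0.62$$

Similarity (Titanic, Forrest Gump)
Common users: John, Lucy, Diane
Numerator
$$(-2)(-1.75) + (2)(1.25) + (0)(-0.75) = 3.5 + 2.5 = 6$$
Denominator
$$\sqrt{8} \times \sqrt{(-1.75)^2 + 1.25^2 + (-0.75)^2} = \sqrt{8} \times \sqrt{5.19} = 2.83 \times 2.28 = 6.45$$
$$sim(Titanic, ForrestGump) = \frac{6}{6.45} = 0.93$$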

Prediction Formula (Item-Based CF)
$$\hat{r}_{u,i} = \bar{r}_i + \frac{\sum_j r'_{u,j} \times sim(i,j)}{\sum_j |sim(i,j)|}$$

Where $r'_{u,j}$ = item mean-centered rating of user u for item j.
Eric's Known Ratings
| Movie | Mean-centered rating |
|---|---|
| Matrix | -1 |
| Die Hard | -0.33 |
| Forrest Gump | 1.25 |
Compute Numerator
Matrix contribution = $(-1)(-0.94) = 0.94$
Die Hard contribution = $(-0.33)(-0.62) = 0.205$
Forrest Gump contribution = $(1.25)(0.93) = 1.162$
Total numerator = $0.94 + 0.205 + 1.162 = 2.307$

Denominator
$$|-0.94| + |-0.62| + |0.93| = 0.94 + 0.62 + 0.93 = 2.49$$

Predicted Mean-Centered Rating
$$r' = \frac{2.307}{2.49} = 0.93$$

Final Rating
Add the Titanic mean:
$$r_{Eric,Titanic} = 3 + 0.93 = 3.93$$

Final Prediction: Eric will likely rate Titanic ≈ 4

Completed Matrix
User Matrix Titanic Die Hard Forrest Gump
John 5 1 – 2
Lucy 1 5 2 5
Eric 2 4 3 5
Diane 4 3 5 3
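The item mean-centered prediction for Eric can be checked with the sketch below. It mirrors the hand calculation: item means are taken over the users who rated each item, similarities are cosine similarities of centered ratings over common raters, and the item mean is added back at the end. All names (`R`, `C`, `item_mean`, `sim`, `predict`) are illustrative assumptions.

```python
from math import sqrt

# Rating matrix from the example; missing entries are simply absent.
R = {
    "John":  {"Matrix": 5, "Titanic": 1, "Forrest Gump": 2},
    "Lucy":  {"Matrix": 1, "Titanic": 5, "Die Hard": 2, "Forrest Gump": 5},
    "Eric":  {"Matrix": 2, "Die Hard": 3, "Forrest Gump": 5},
    "Diane": {"Matrix": 4, "Titanic": 3, "Die Hard": 5, "Forrest Gump": 3},
}

items = sorted({i for row in R.values() for i in row})
item_mean = {}
for i in items:
    column = [row[i] for row in R.values() if i in row]
    item_mean[i] = sum(column) / len(column)

# Item mean-centered ratings r'_{u,i} = r_{u,i} - mean of item i
C = {u: {i: r - item_mean[i] for i, r in row.items()} for u, row in R.items()}

def sim(i, j):
    """Cosine similarity of centered ratings over users who rated both items."""
    pairs = [(c[i], c[j]) for c in C.values() if i in c and j in c]
    num = sum(x * y for x, y in pairs)
    den = sqrt(sum(x * x for x, _ in pairs)) * sqrt(sum(y * y for _, y in pairs))
    return num / den if den else 0.0

def predict(user, item):
    """Item mean plus the similarity-weighted average of centered ratings."""
    num = sum(C[user][j] * sim(item, j) for j in C[user])
    den = sum(abs(sim(item, j)) for j in C[user])
    return item_mean[item] + (num / den if den else 0.0)

print(round(predict("Eric", "Titanic"), 2))  # 3.93
```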

When choosing between the implementation of a user-based and an item-based neighborhood
recommender system, five criteria should be considered:
• Accuracy: The accuracy of neighborhood recommendation methods depends mostly on the ratio
between the number of users and items in the system. The similarity between two users in user-
based methods, which determines the neighbors of a user, is normally obtained by comparing the
ratings made by these users on the same items. On the other hand, an item-based method usually
computes the similarity between two items by comparing ratings made by the same user on these
items.
• Efficiency: The memory and computational efficiency of recommender systems also depends on
the ratio between the number of users and items. Thus, when the number of users exceeds the number
of items, as is most often the case, item-based recommendation approaches require much less
memory and time to compute the similarity weights (training phase) than user-based ones, making
them more scalable. However, the time complexity of the online recommendation phase, which
depends only on the number of available items and the maximum number of neighbors, is the same
for user-based and item-based methods.
• Stability: The choice between a user-based and an item-based approach also depends on the
frequency and amount of change in the users and items of the system. If the list of available items
is fairly static in comparison to the users of the system, an item-based method may be preferable.
On the contrary, in applications where the list of available items is constantly changing, e.g., an
online article recommender, user-based methods could prove to be more stable.
• Justifiability: An advantage of item-based methods is that they can easily be used to justify a
recommendation. User-based methods, however, are less amenable to this process because the
active user does not know the other users serving as neighbors in the recommendation.
• Serendipity: In item-based methods, the rating predicted for an item is based on the ratings given
to similar items. Consequently, recommender systems using this approach will tend to recommend
to a user items that are related to those usually appreciated by this user.

II. Z – SCORE NORMALIZATION


In recommender systems, different users use rating scales differently.
Some users give very high ratings, while others give moderate or low ratings even for items
they like.
To handle this rating bias, normalization techniques are used.
Two common techniques are:
• Mean-centering
• Z-score normalization
While mean-centering removes the average bias, Z-score normalization also considers the
variation (spread) in ratings.
Consider two users A and B:
| User | Rating Pattern | Average Rating |
|---|---|---|
| User A | Ratings alternate between 1 and 5 | 3 |
| User B | Almost always gives rating 3 | 3 |
Both users have the same average rating (3).
However:
• User A has large variation in ratings.
• User B has almost no variation.
Now suppose User B gives rating 5 to an item.
This rating is very unusual for B because B usually gives 3.
Therefore:
➡ Rating 5 from B indicates stronger preference than rating 5 from A.
Z-score normalization captures this difference in rating spread. Z-score normalization, also known
as standard score normalization, is a statistical technique used to rescale a distribution of values to have
a mean of zero and a standard deviation of one. This normalization technique is often applied to features
or variables in data preprocessing to ensure that they are on a comparable scale, which can be beneficial
for certain machine learning algorithms.
The formula for calculating the Z-score of a data point x is:
$$z = \frac{x - \mu}{\sigma}$$
Where:
• z is the Z-score.
• x is the original value.
• μ is the mean of the distribution.
• σ is the standard deviation of the distribution.
Here's how Z-score normalization works:
a. Calculate Mean and Standard Deviation: Compute the mean (μ) and standard
deviation (σ) of the data distribution.
b. Normalize Data: For each data point, subtract the mean (μ) and then divide by the
standard deviation (σ). This centers the data distribution around zero and scales it to
have a standard deviation of one.
Z-score normalization is particularly useful when values are measured on different scales or with
different spreads. By rescaling the data to have a mean of zero and a standard deviation of one,
Z-score normalization ensures that all features (or, in recommender systems, all users' ratings)
contribute on a comparable scale to the analysis.
Two caveats are worth noting. Z-scores are easiest to interpret when the data distribution is
approximately Gaussian (normal); if the distribution is significantly non-normal, other
normalization techniques may be more appropriate. In addition, because the mean and standard
deviation are themselves affected by extreme values, Z-score normalization is sensitive to outliers,
so preprocessing steps such as outlier removal or transformation may be necessary before
normalization.

Example of how Z-score normalization can be applied in collaborative filtering:

a) Z-Score in User-Based Recommendation
In user-based collaborative filtering, normalization is done per user.
Normalized rating:
$$z_{ui} = \frac{r_{ui} - \bar{r}_u}{\sigma_u}$$

Prediction is computed using similar users' normalized ratings.

Predicted rating:
$$\hat{r}_{ui} = \bar{r}_u + \sigma_u \frac{\sum_{v \in N(u)} sim(u,v)\, z_{vi}}{\sum_{v \in N(u)} |sim(u,v)|}$$
Where:
| Symbol | Meaning |
|---|---|
| $sim(u,v)$ | similarity between users |
| $N(u)$ | neighborhood of similar users |
Thus:
1. Normalize ratings
2. Predict normalized value
3. Convert back to original scale
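The three steps can be sketched as a single function. The inputs here are hypothetical: `neighbors` is a list of (similarity, z-score) pairs for users who rated the item, and the numbers in the usage lines are invented for illustration. The final clamp to the rating scale is an added assumption, one simple way to keep predictions inside the valid range, and is not part of the formula above.

```python
def predict_zscore(user_mean, user_std, neighbors, scale=(1.0, 5.0)):
    """Predict a rating from neighbors' z-scores.

    neighbors: list of (similarity, neighbor's z-score for the item).
    scale: (min, max) of the rating scale, used only for clamping.
    """
    num = sum(sim * z for sim, z in neighbors)
    den = sum(abs(sim) for sim, _ in neighbors)
    z_hat = num / den if den else 0.0       # step 2: predicted normalized value
    raw = user_mean + user_std * z_hat      # step 3: back to the original scale
    low, high = scale
    return min(max(raw, low), high)         # assumption: clamp to the scale

# Hypothetical target user with mean 3.0 and standard deviation 1.0
print(predict_zscore(3.0, 1.0, [(0.8, 1.0), (0.4, -0.5)]))
# A prediction that would exceed the scale is clamped to the maximum
print(predict_zscore(4.5, 1.0, [(1.0, 2.0)]))
```

In the first call the weighted z-score is about 0.5, so the prediction is about 3.5. In the second, the unclamped value would be 6.5 on a 1–5 scale.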

Example: Suppose we have a matrix of user ratings for items:

| User | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | 5 | 3 | 0 |
| User 2 | 4 | 0 | 0 |
| User 3 | 1 | 1 | 0 |

Calculate Mean and Standard Deviation: Compute the mean and standard deviation of each
user's ratings across all items.

Step 1: Compute mean (μ) and std deviation (σ) per user
User 1 Ratings: [5, 3, 0]
$$\mu_1 = \frac{5+3+0}{3} = 2.67$$
$$\sigma_1 = \sqrt{\frac{(5-2.67)^2 + (3-2.67)^2 + (0-2.67)^2}{3}} = \sqrt{4.22} \approx 2.05$$
User 2 Ratings: [4, 0, 0]
$$\mu_2 = \frac{4+0+0}{3} = 1.33$$
$$\sigma_2 = \sqrt{\frac{(4-1.33)^2 + (0-1.33)^2 + (0-1.33)^2}{3}} = \sqrt{3.56} \approx 1.89$$
User 3 Ratings: [1, 1, 0]
$$\mu_3 = \frac{1+1+0}{3} = 0.67$$
$$\sigma_3 = \sqrt{\frac{(1-0.67)^2 + (1-0.67)^2 + (0-0.67)^2}{3}} = \sqrt{0.22} \approx 0.47$$

Step 2: Apply Z-score Formula
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | (5−2.67)/2.05 ≈ 1.14 | (3−2.67)/2.05 ≈ 0.16 | (0−2.67)/2.05 ≈ -1.30 |
| User 2 | (4−1.33)/1.89 ≈ 1.41 | (0−1.33)/1.89 ≈ -0.71 | (0−1.33)/1.89 ≈ -0.71 |
| User 3 | (1−0.67)/0.47 ≈ 0.70 | (1−0.67)/0.47 ≈ 0.70 | (0−0.67)/0.47 ≈ -1.41 |

Z-Score Normalized Matrix:

| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | 1.14 | 0.16 | -1.30 |
| User 2 | 1.41 | -0.71 | -0.71 |
| User 3 | 0.70 | 0.70 | -1.41 |

b) Z-Score in Item-Based Recommendation


In item-based collaborative filtering, normalization is done per item.
Normalized rating:
$$z_{ui} = \frac{r_{ui} - \bar{r}_i}{\sigma_i}$$

Where:
| Symbol | Meaning |
|---|---|
| $\bar{r}_i$ | average rating of item i |
| $\sigma_i$ | standard deviation of item ratings |
Predicted rating:
$$\hat{r}_{ui} = \bar{r}_i + \sigma_i \frac{\sum_{j \in N(i)} sim(i,j)\, z_{uj}}{\sum_{j \in N(i)} |sim(i,j)|}$$

Here the system:
1. Normalizes ratings by item statistics
2. Uses similar items
3. Converts normalized predictions back to ratings.

Here we consider item-based normalization of the same rating matrix:

| User | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | 5 | 3 | 0 |
| User 2 | 4 | 0 | 0 |
| User 3 | 1 | 1 | 0 |

Step 1: Compute mean (µ) and standard deviation (σ) per item
Item 1: [5, 4, 1]
$$\mu_1 = \frac{5+4+1}{3} = 3.33$$
$$\sigma_1 = \sqrt{\frac{(5-3.33)^2 + (4-3.33)^2 + (1-3.33)^2}{3}} \approx \sqrt{2.89} \approx 1.70$$
Item 2: [3, 0, 1]
$$\mu_2 = \frac{3+0+1}{3} = 1.33$$
$$\sigma_2 = \sqrt{\frac{(3-1.33)^2 + (0-1.33)^2 + (1-1.33)^2}{3}} \approx \sqrt{1.56} \approx 1.25$$
Item 3: [0, 0, 0]
$$\mu_3 = 0, \qquad \sigma_3 = 0$$

We'll handle division by 0 using a common rule: set the z-score to 0 where the standard deviation is 0
(no variation).
Step 2: Apply Z-score
| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | (5−3.33)/1.70 ≈ 0.98 | (3−1.33)/1.25 ≈ 1.33 | 0 |
| User 2 | (4−3.33)/1.70 ≈ 0.39 | (0−1.33)/1.25 ≈ -1.06 | 0 |
| User 3 | (1−3.33)/1.70 ≈ -1.37 | (1−1.33)/1.25 ≈ -0.27 | 0 |

Final Z-score Normalized Matrix (Item-Based):

| | Item 1 | Item 2 | Item 3 |
|---|---|---|---|
| User 1 | 0.98 | 1.33 | 0 |
| User 2 | 0.39 | -1.06 | 0 |
| User 3 | -1.37 | -0.27 | 0 |
Now, all ratings are standardized such that they have a mean of approximately zero and a
standard deviation of approximately one. This normalization allows for fair comparison of
ratings across users and items, which is useful in collaborative filtering algorithms where
similarities between users or items are computed based on these ratings.
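The item-based normalization, including the rule for a zero standard deviation, can be sketched with the standard library. The function name `zscore_columns` is illustrative. `pstdev` is the population standard deviation (dividing by the number of ratings), which is the convention the worked example uses.

```python
from statistics import mean, pstdev

# Rows are users, columns are items (same matrix as the example).
rows = [
    [5, 3, 0],
    [4, 0, 0],
    [1, 1, 0],
]

def zscore_columns(matrix):
    """Z-score each column; a column with zero spread maps to all zeros."""
    columns = list(zip(*matrix))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [
        [0.0 if sd == 0 else (v - mu) / sd for v, (mu, sd) in zip(row, stats)]
        for row in matrix
    ]

Z = zscore_columns(rows)
for row in Z:
    print([round(z, 2) for z in row])
```

Because the code keeps full precision in the means and standard deviations, some printed values can differ from the hand-rounded table in the last digit.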

Advantages of Z-Score Normalization


1. Considers Rating Variance
Unlike mean-centering, it also considers spread of ratings.
2. Handles Different Rating Behaviors
Some users give extreme ratings, others give moderate ratings.
Z-score adjusts these differences.
3. Useful for Wide Rating Scales
Works well when rating scale is:
• large (1–10)
• continuous ratings

Disadvantages of Z-Score Normalization


1. Sensitive to Standard Deviation
If the standard deviation is very small, normalization may produce very large values.
2. Predictions May Fall Outside Rating Scale
Since normalization involves multiplication/division, predicted ratings may be:
• less than minimum rating
• greater than maximum rating
Example:
Predicted rating = 6.2 when rating scale is 1–5.
3. More Sensitive than Mean-Centering
Because the standard deviation values can vary widely.

Comparison: Mean-Centering vs Z-Score


Feature Mean-Centering Z-Score Normalization
Removes rating bias Yes Yes
Considers rating variance No Yes
Uses standard deviation No Yes
Accuracy Moderate Better in many cases
Sensitivity Less More
III. SIMILARITY WEIGHT COMPUTATION
The similarity weights play a double role in neighborhood-based
recommendation methods:
1) they allow the selection of trusted neighbors whose ratings are used in the prediction
2) they provide the means to give more or less importance to these neighbors in the prediction.
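A measure of the similarity between two objects a and b, often used in
information retrieval, consists in representing these objects in the form of two
vectors $x_a$ and $x_b$ and computing the Cosine Vector (CV) (or Vector Space)
similarity [7, 8, 44] between these vectors:
$$\cos(x_a, x_b) = \frac{x_a^{\top} x_b}{\lVert x_a \rVert\, \lVert x_b \rVert}$$

The similarity between two users u and v would then be computed as
$$CV(u,v) = \cos(x_u, x_v) = \frac{\sum_{i \in I_{uv}} r_{ui}\, r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2 \, \sum_{j \in I_v} r_{vj}^2}}$$

where $I_{uv}$ once more denotes the items rated by both u and v, and $I_u$ and $I_v$ denote the items
rated by u and by v respectively. A problem with this measure is that it does not consider the
differences in the mean and variance of the ratings made by users u and v.
A popular measure that compares ratings where the effects of mean and
variance have been removed is the Pearson Correlation (PC) similarity:
$$PC(u,v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2 \, \sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}}$$

The Pearson correlation coefficient $r_{xy}$ between two users x and y is calculated as:
$$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$
where $x_i$ and $y_i$ are the ratings given by users x and y to item i, and $\bar{x}$ and $\bar{y}$ are their
mean ratings.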

The Pearson correlation coefficient ranges from -1 to 1:


• 𝑟𝑥𝑦=1 indicates a perfect positive correlation, meaning that the ratings of users x and y are
perfectly linearly related (i.e., when one user rates an item highly, the other user also tends to
rate it highly).
• 𝑟𝑥𝑦=−1 indicates a perfect negative correlation, meaning that the ratings of users x and y are
perfectly inversely related (i.e., when one user rates an item highly, the other user tends to
rate it poorly).
• 𝑟𝑥𝑦 =0 indicates no linear correlation between the ratings of users x and y.
• In collaborative filtering, Pearson correlation similarity is used to identify similar users or items
based on their rating patterns. Users or items with higher Pearson correlation coefficients are
considered more similar, and their ratings can be used to make recommendations for each other.

Example Of Pearson Correlation (PC) Similarity In Collaborative Filtering


• Identify the common movies rated by both users.
• Calculate the mean ratings for both users based on the common movies.
• Calculate the deviations from the mean for both users.
• Compute the covariance between the deviations.
• Calculate the standard deviations for both users.
• Finally, compute the Pearson correlation coefficient.
Pearson correlation similarity is a measure used in collaborative filtering to determine the similarity
between two users (or items) based on their rating patterns. It measures the linear correlation between
the ratings given by two users (or items), taking into account the mean rating of each user. A positive
correlation indicates similar rating patterns, while a negative correlation indicates dissimilar rating
patterns.

Example: Suppose we have a small dataset representing user ratings for movies:

Movie 1 Movie 2 Movie 3 Movie 4


User 1 5 3 0 1
User 2 4 0 0 1
User 3 1 1 0 5
User 4 0 1 5 4

To calculate the Pearson correlation similarity between User 1 and User 2 based on the provided
ratings for movies, we'll follow the steps outlined earlier:

Movies rated by both User 1 and User 2:


Movies rated: Movie 1, Movie 4
Thus:
User 1 ratings → 5, 1
User 2 ratings → 4, 1
Calculate mean ratings for User 1 and User 2:
Mean rating for User 1: (5 + 3 + 0 + 1) / 4 = 2.25
Mean rating for User 2: (4 + 0 + 0 + 1) / 4 = 1.25

Calculate deviations from the mean:

38
Movie U1 U2 Dev1 = U1 - 2.25 Dev2 = U2 - 1.25 Product
M1 5 4 2.75 2.75 7.5625
M2 3 0 0.75 -1.25 -0.9375
M3 0 0 -2.25 -1.25 2.8125
M4 1 1 -1.25 -0.25 0.3125

Sum = 7.5625 -0.9375 + 2.8125 + 0.3125 = 9.75

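Sample covariance:

$$\mathrm{Cov}(U_1, U_2) = \frac{9.75}{3} = 3.25$$

Sample standard deviation of User 1:

$$\sigma_{U_1} = \sqrt{\frac{(2.75)^2 + (0.75)^2 + (-2.25)^2 + (-1.25)^2}{3}} = \sqrt{\frac{7.5625 + 0.5625 + 5.0625 + 1.5625}{3}} = \sqrt{\frac{14.75}{3}} = \sqrt{4.9167} \approx 2.22$$

Sample standard deviation of User 2:

$$\sigma_{U_2} = \sqrt{\frac{(2.75)^2 + (-1.25)^2 + (-1.25)^2 + (-0.25)^2}{3}} = \sqrt{\frac{7.5625 + 1.5625 + 1.5625 + 0.0625}{3}} = \sqrt{\frac{10.75}{3}} = \sqrt{3.5833} \approx 1.89$$

The factor 1/3 appears in both the covariance and the product of the two standard deviations, so it
cancels. The correlation can therefore be computed directly from the sums of squared deviations:

User 1 sum of squared deviations: $(2.75)^2 + (0.75)^2 + (-2.25)^2 + (-1.25)^2 = 7.5625 + 0.5625 + 5.0625 + 1.5625 = 14.75$, so $\sqrt{14.75} \approx 3.84$
User 2 sum of squared deviations: $(2.75)^2 + (-1.25)^2 + (-1.25)^2 + (-0.25)^2 = 7.5625 + 1.5625 + 1.5625 + 0.0625 = 10.75$, so $\sqrt{10.75} \approx 3.28$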

Pearson correlation similarity:

$$sim(\text{User 1}, \text{User 2}) = \frac{9.75}{\sqrt{14.75} \times \sqrt{10.75}} \approx \frac{9.75}{12.59} \approx 0.774$$

So, the Pearson correlation similarity between User 1 and User 2 is approximately 0.774. This
indicates a fairly strong positive correlation between their ratings.

Movies rated by both User 1 and User 3:

Movies rated: Movie 1, Movie 2, Movie 4


Calculate mean ratings for User 1 and User 3:

Mean rating for User 1: (5 + 3 + 0 + 1) / 4 = 2.25


Mean rating for User 3: (1 + 1 + 0 + 5) / 4 = 1.75

Calculate deviations from the mean:

Sum of products (numerator):


−2.0625 − 0.5625 + 3.9375 − 4.0625 = −2.75

User 1 sum of squared deviations: $(2.75)^2 + (0.75)^2 + (-2.25)^2 + (-1.25)^2 = 7.5625 + 0.5625 + 5.0625 + 1.5625 = 14.75$, so $\sqrt{14.75} \approx 3.84$
User 3 sum of squared deviations: $(-0.75)^2 + (-0.75)^2 + (-1.75)^2 + (3.25)^2 = 0.5625 + 0.5625 + 3.0625 + 10.5625 = 14.75$, so $\sqrt{14.75} \approx 3.84$

Pearson correlation similarity:

$$sim(\text{User 1}, \text{User 3}) = \frac{-2.75}{\sqrt{14.75} \times \sqrt{14.75}} = \frac{-2.75}{14.75} \approx -0.186$$

So, the Pearson correlation similarity between User 1 and User 3 is approximately -0.186. This
indicates a weak negative correlation between their ratings.

Movies rated by both User 1 and User 4:

Movies rated: Movie 2, Movie 4


Calculate mean ratings for User 1 and User 4:

Mean rating for User 1: (5 + 3 + 0 + 1) / 4 = 2.25


Mean rating for User 4: (0 + 1 + 5 + 4) / 4 = 2.5

Calculate deviations from the mean:

Sum of products (numerator):
−6.875 − 1.125 − 5.625 − 1.875 = −15.5

User 1 sum of squared deviations: $(2.75)^2 + (0.75)^2 + (-2.25)^2 + (-1.25)^2 = 7.5625 + 0.5625 + 5.0625 + 1.5625 = 14.75$, so $\sqrt{14.75} \approx 3.84$
User 4 sum of squared deviations: $(-2.5)^2 + (-1.5)^2 + (2.5)^2 + (1.5)^2 = 6.25 + 2.25 + 6.25 + 2.25 = 17.0$, so $\sqrt{17.0} \approx 4.12$

Calculate the Pearson correlation similarity:

$$sim(\text{User 1}, \text{User 4}) = \frac{-15.5}{3.84 \times 4.12} = \frac{-15.5}{15.82} \approx -0.98$$

So, the Pearson correlation similarity between User 1 and User 4 is approximately -0.98. This is a
very strong negative correlation, meaning their tastes are almost opposite.
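
The three calculations above can be checked with a short script. This is a minimal sketch rather than a reference implementation: the function name `pearson` and the dictionary layout are illustrative choices, and, as in the worked example, a 0 entry is treated as an ordinary rating value over all four movies.

```python
from math import sqrt

# Rating matrix from the example. As in the worked calculation, a 0 entry
# is treated as an ordinary rating value, not as "not rated".
ratings = {
    "User 1": [5, 3, 0, 1],
    "User 2": [4, 0, 0, 1],
    "User 3": [1, 1, 0, 5],
    "User 4": [0, 1, 5, 4],
}


def pearson(a, b):
    """Pearson correlation between two equal-length rating lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    # Numerator: sum of products of deviations from the mean.
    numerator = sum(da * db for da, db in zip(dev_a, dev_b))
    # Denominator: product of the square roots of the sums of squared deviations.
    denominator = sqrt(sum(d * d for d in dev_a)) * sqrt(sum(d * d for d in dev_b))
    if denominator == 0:
        return 0.0  # no variation in at least one list; correlation undefined
    return numerator / denominator


for other in ("User 2", "User 3", "User 4"):
    print(other, round(pearson(ratings["User 1"], ratings[other]), 3))
```

Run as written, this yields roughly 0.774, -0.186 and -0.979 for User 2, User 3 and User 4. Restricting the sums to co-rated movies only, as the step list suggests, would give different values.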
IV. Mean Squared Difference (MSD)
The Mean Squared Difference (MSD) is a statistical measure used to quantify the average
squared difference between two sets of values. It is commonly employed in various fields,
including statistics, machine learning, and signal processing, to assess the similarity or
dissimilarity between datasets.
Mean Squared Difference (MSD): Definition and Calculation
Given two sets of values X = {x1, x2, …, xn} and Y = {y1, y2, …, yn}, where n is the number of
elements in each set, the Mean Squared Difference (MSD) is calculated as follows.
1. Compute Differences
Calculate the difference between corresponding elements of X and Y:
Difference = (xi − yi) for i = 1, 2, …, n
2. Square Differences
Square each difference obtained in step 1:
Squared Difference = (xi − yi)² for i = 1, 2, …, n
3. Calculate Mean Squared Difference (MSD)
Compute the average (mean) of the squared differences:


$$MSD = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$$


Interpretation
The Mean Squared Difference (MSD) provides a measure of the average discrepancy or error between
corresponding values of X and Y.
It quantifies how much X and Y deviate from each other on average, with larger differences resulting in
higher squared values and thus contributing more to the overall MSD.
MSD is commonly used as a loss function in regression problems to assess the goodness of fit of a
model’s predictions compared to the actual values.

Example:
Suppose we have three users (User X, User Y, and User Z) and their ratings for four movies
(Movie 1, Movie 2, Movie 3, and Movie 4). Here are the ratings:

User X: [4, 3, 5, 2]
User Y: [3, 2, 4, 3]
User Z: [5, 4, 3, 2]
To calculate the Mean Squared Difference (MSD) between User X and User Y for these
movies, we follow these steps:
• Compute the squared difference between corresponding ratings of User X and User Y for each
movie.
• Calculate the mean of these squared differences.
Step 1: Ratings Table
Movie User X User Y
Movie 1 4 3
Movie 2 3 2
Movie 3 5 4
Movie 4 2 3
Step 2: Calculate the Difference for Each Movie
𝐷𝑖𝑓𝑓𝑒𝑟𝑒𝑛𝑐𝑒 = (𝑋𝑖 − 𝑌𝑖 )

Movie User X User Y Difference


Movie 1 4 3 1
Movie 2 3 2 1
Movie 3 5 4 1
Movie 4 2 3 -1
Step 3: Square the Differences
$(X_i - Y_i)^2$

Movie Difference Squared Difference


Movie 1 1 1
Movie 2 1 1
Movie 3 1 1
Movie 4 -1 1
Step 4: Sum the Squared Differences
1+1+1+1= 4

Step 5: Compute the Mean Squared Difference


Number of movies $n = 4$

$$MSD = \frac{\sum (X_i - Y_i)^2}{n} = \frac{4}{4} = 1$$

Mean Squared Difference (MSD) between User X and User Y = 1

Interpretation
• MSD = 0 → perfectly similar ratings
• Higher MSD → more difference between users
Here MSD = 1, meaning User X and User Y have relatively similar rating patterns.

Similarly, you can calculate the MSD between other pairs of users or for different sets of movies.
MSD is a simple metric that gives you an idea of how similar or dissimilar the ratings of two users
are. A lower MSD indicates greater similarity in ratings.

To calculate the Mean Squared Difference (MSD) between User X and User Z, we'll follow
the same steps:
Step 1: Ratings Given
Item User X User Z
1 4 5
2 3 4
3 5 3
4 2 2
Step 2: Compute the Difference for Each Item
𝑫𝒊𝒇𝒇𝒆𝒓𝒆𝒏𝒄𝒆 = (𝑿𝒊 − 𝒁𝒊 )

Item User X User Z Difference


1 4 5 -1
2 3 4 -1
3 5 3 2
4 2 2 0
Step 3: Square the Differences
$(X_i - Z_i)^2$

Item Difference Squared Difference


1 -1 1
2 -1 1
3 2 4
4 0 0
Step 4: Sum the Squared Differences
𝟏+𝟏+𝟒+𝟎=𝟔

Step 5: Divide by Number of Co-rated Items


Number of items $n = 4$

$$MSD = \frac{\sum (X_i - Z_i)^2}{n} = \frac{6}{4} = 1.5$$

Mean Squared Difference (MSD) between User X and User Z = 1.5


Interpretation
• Lower MSD → users are more similar
• Higher MSD → users are less similar
Here 1.5 indicates moderate difference in ratings.

A lower MSD indicates greater similarity in ratings. In this case, the MSD between User X
and User Z is higher than the MSD between User X and User Y (which was 1), suggesting that
User X's ratings are more similar to User Y's ratings than to User Z's ratings.

To calculate the Mean Squared Difference (MSD) between User Y and User Z, follow the same procedure.
Step 1: Ratings Table
Movie User Y User Z
Movie 1 3 5
Movie 2 2 4
Movie 3 4 3
Movie 4 3 2

Step 2: Compute the Difference


𝐷𝑖𝑓𝑓𝑒𝑟𝑒𝑛𝑐𝑒 = (𝑌𝑖 − 𝑍𝑖 )

Movie User Y User Z Difference


Movie 1 3 5 -2
Movie 2 2 4 -2
Movie 3 4 3 1
Movie 4 3 2 1

Step 3: Square the Differences


$(Y_i - Z_i)^2$

Movie Difference Squared Difference


Movie 1 -2 4
Movie 2 -2 4
Movie 3 1 1
Movie 4 1 1

Step 4: Sum the Squared Differences


4 + 4 + 1 + 1 = 10

Step 5: Compute MSD


Number of movies $n = 4$

$$MSD = \frac{\sum (Y_i - Z_i)^2}{n} = \frac{10}{4} = 2.5$$

Mean Squared Difference (MSD) between User Y and User Z = 2.5


Final Comparison of Users
Users Compared MSD
User X – User Y 1
User X – User Z 1.5
User Y – User Z 2.5
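Interpretation:
• Lowest MSD → Most similar users. Here User X and User Y (MSD = 1) are the most similar pair, and
User Y and User Z (MSD = 2.5) are the least similar.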
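
The three MSD calculations can be reproduced in a few lines. This is a small sketch: the function name `msd` is an illustrative choice, and it assumes both users have rated every movie, as in the example.

```python
def msd(a, b):
    """Mean Squared Difference between two equal-length rating lists."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    # Average of the squared differences between corresponding ratings.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Ratings from the example (Movie 1 .. Movie 4).
user_x = [4, 3, 5, 2]
user_y = [3, 2, 4, 3]
user_z = [5, 4, 3, 2]

print("X-Y:", msd(user_x, user_y))
print("X-Z:", msd(user_x, user_z))
print("Y-Z:", msd(user_y, user_z))
```

The printed values are 1.0, 1.5 and 2.5, matching the comparison table.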

NEIGHBORHOOD SELECTION
The selection of the neighbors used in the recommendation of items is normally done in
two steps:
1) a global filtering step where only the most likely candidates are kept
2) a per prediction step which chooses the best candidates for this prediction.
PRE – FILTERING OF NEIGHBORS
The pre-filtering of neighbors is an essential step that makes neighborhood-based
approaches practicable by reducing the number of similarity weights to store and limiting
the number of candidate neighbors to consider in the predictions. There are several ways
in which this can be accomplished:
• Top-N filtering: For each user or item, only a list of the N nearest-neighbors and their respective similarity
weight is kept. To avoid problems with efficiency or accuracy, N should be chosen carefully. Thus, if N is too
large, an excessive amount of memory will be required to store the neighborhood lists and predicting ratings will
be slow. On the other hand, selecting too small a value for N may reduce the coverage of the recommendation
method, so that some items are never recommended.
• Threshold filtering: Instead of keeping a fixed number of nearest-neighbors, this approach keeps all the
neighbors whose similarity weight has a magnitude greater than a given threshold 𝑤𝑚𝑖𝑛. While
this is more flexible than the previous filtering technique, as only the most significant neighbors
are kept, the right value of wmin may be difficult to determine.
• Negative filtering: In general, negative rating correlations are less reliable than positive ones.
Intuitively, this is because strong positive correlation between two users is a good indicator of
their belonging to a common group (e.g., teenagers, science-fiction fans, etc.). However,
although negative correlation may indicate membership to different groups, it does not tell how
different these groups are, or whether these groups are compatible for other categories of items.
While experimental investigations have found negative correlations to provide no significant
improvement in the prediction accuracy, whether such correlations can be discarded depends
on the data.
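
The three pre-filtering strategies can be sketched as small functions over a dictionary of similarity weights. The function names and the example weights are hypothetical, chosen only for illustration. Top-N here ranks neighbors by signed weight, which is one possible reading of "nearest".

```python
def top_n_filter(weights, n):
    """Keep only the n neighbors with the largest similarity weight."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:n])


def threshold_filter(weights, w_min):
    """Keep neighbors whose weight magnitude is greater than w_min."""
    return {v: w for v, w in weights.items() if abs(w) > w_min}


def negative_filter(weights):
    """Discard neighbors with a negative (or zero) similarity weight."""
    return {v: w for v, w in weights.items() if w > 0}


# Hypothetical similarity weights of three candidate neighbors.
weights = {"User 2": 0.77, "User 3": -0.19, "User 4": -0.98}

print(top_n_filter(weights, 2))
print(threshold_filter(weights, 0.5))
print(negative_filter(weights))
```

With these weights, threshold filtering keeps User 4 because only the magnitude of the weight is tested, whereas negative filtering drops it.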
NEIGHBORS IN THE PREDICTIONS

To find neighbors in the context of predictions, particularly in collaborative filtering-based
recommendation systems, we often use similarity metrics to identify users or items that are similar
to each other. These similar users or items are considered neighbors. Once we identify neighbors,
we can use their ratings or preferences to make predictions for a user or item.

Here's a basic outline of the process:

Calculate Similarity: Use a similarity metric (such as Pearson correlation, cosine similarity, or
Jaccard similarity) to measure the similarity between users or items based on their ratings or
features.

Identify Neighbors: Select the top-k most similar users or items as neighbors. The value of k can
be predefined or determined dynamically.

Make Predictions: Use the ratings of the neighbors to predict ratings for the target user or item. This
can be done by taking a weighted average of the ratings given by neighbors, where the weights are
the similarities between the neighbors and the target user (or item).

Recommendation: Once predictions are made, recommend items with the highest predicted ratings
to the target user.

Here's a simplified example:

Suppose we have three users (User A, User B, and User C) and their ratings for movies (Movie 1,
Movie 2, and Movie 3). We want to predict the rating of Movie 3 for User A.

User A: [5, 4, -] (User A hasn't rated Movie 3)


User B: [4, 5, 3]
User C: [3, 2, 4]
• Calculate Similarity: We can use a similarity metric (e.g., Pearson correlation) to
calculate the similarity between User A and each of the other users.

• Identify Neighbors: Let's say we choose User B and User C as neighbors, since both
have rated Movie 3.

• Make Predictions: We can predict the rating of Movie 3 for User A by taking a weighted
average of the ratings given by User B and User C for Movie 3, where the weights are their
similarities with User A.

• Recommendation: Recommend Movie 3 to User A if the predicted rating is above a certain
threshold.

• In practice, recommendation systems use more sophisticated algorithms and techniques, but
the basic idea remains the same: identify similar users or items as neighbors and use their
preferences to make predictions or recommendations.

Step 1: Rating Matrix


User Movie 1 Movie 2 Movie 3
User A 5 4 ?
User B 4 5 3
User C 3 2 4
User A has not rated Movie 3, so we predict it using similar users.

Step 2: Compute Mean Rating of Each User


Mean rating = (Sum of ratings) / (Number of rated movies)
User A: $\bar{A} = (5 + 4)/2 = 4.5$

User B: $\bar{B} = (4 + 5 + 3)/3 = 4$

User C: $\bar{C} = (3 + 2 + 4)/3 = 3$

Step 3: Pearson Similarity


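Pearson formula:

$$Sim(A, B) = \frac{\sum_i (A_i - \bar{A})(B_i - \bar{B})}{\sqrt{\sum_i (A_i - \bar{A})^2}\,\sqrt{\sum_i (B_i - \bar{B})^2}}$$

We compute similarity using co-rated movies (Movie 1 and Movie 2). The sums run over these two
movies only, while the means are the per-user means from Step 2.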

Step 4: Similarity Between User A and User B


Movie | A | B | $A - \bar{A}$ | $B - \bar{B}$ | Product
M1 | 5 | 4 | 0.5 | 0 | 0
M2 | 4 | 5 | -0.5 | 1 | -0.5

Numerator:
$0 + (-0.5) = -0.5$

Denominator:
$\sqrt{0.5^2 + (-0.5)^2} \times \sqrt{0^2 + 1^2} = \sqrt{0.5} \times 1 = 0.707$

Similarity:
$Sim(A, B) = -0.5 / 0.707 = -0.707$
Step 5: Similarity Between User A and User C


Movie | A | C | $A - \bar{A}$ | $C - \bar{C}$ | Product
M1 | 5 | 3 | 0.5 | 0 | 0
M2 | 4 | 2 | -0.5 | -1 | 0.5

Numerator:
$0 + 0.5 = 0.5$

Denominator:
$\sqrt{0.5} \times \sqrt{1} = 0.707$

Similarity:
$Sim(A, C) = 0.5 / 0.707 = 0.707$

Step 6: Predict Rating for Movie 3 (Resnick Formula)


Prediction formula:

$$P(A, 3) = \bar{A} + \frac{\sum_u Sim(A, u)\,(R_{u,3} - \bar{u})}{\sum_u |Sim(A, u)|}$$

Where $u$ ranges over the neighbors (B, C).

Contribution from User B


(−0.707)(3 − 4) = (−0.707)(−1) = 0.707

Contribution from User C


(0.707)(4 − 3)
= 0.707

Numerator
0.707 + 0.707 = 1.414

Denominator
∣ −0.707 ∣ +∣ 0.707 ∣= 1.414

Step 7: Final Prediction


𝑃(𝐴, 3) = 4.5 + (1.414/1.414)
𝑃(𝐴, 3) = 4.5 + 1
𝑃(𝐴, 3) = 5.5

Since the rating scale is 1–5, we cap the prediction at the maximum:


𝑃𝑟𝑒𝑑𝑖𝑐𝑡𝑒𝑑 𝑅𝑎𝑡𝑖𝑛𝑔 = 5

Final Result
User Movie 3 Predicted Rating
User A ≈ 5
Movie 3 should be recommended to User A because the predicted rating is very high.
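
Steps 1 to 7 can be sketched in code as follows. This is a minimal illustration under the same assumptions as the worked example: similarity sums run over co-rated movies, deviations are taken from each user's mean over all of their rated movies, and the result is clipped to the 1–5 scale. The names `mean`, `pearson` and `predict` are illustrative, not part of any library.

```python
from math import sqrt

# Rating matrix from Step 1; Movie 3 is missing for User A.
ratings = {
    "A": {"M1": 5, "M2": 4},
    "B": {"M1": 4, "M2": 5, "M3": 3},
    "C": {"M1": 3, "M2": 2, "M3": 4},
}


def mean(user):
    """Mean of all ratings given by a user (Step 2)."""
    values = ratings[user].values()
    return sum(values) / len(values)


def pearson(u, v):
    """Pearson similarity over co-rated items, centered on per-user means."""
    common = ratings[u].keys() & ratings[v].keys()
    mu, mv = mean(u), mean(v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)) * sqrt(
        sum((ratings[v][i] - mv) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(user, item, low=1, high=5):
    """Mean-centered weighted prediction, returned raw and clipped to the scale."""
    neighbors = [v for v in ratings if v != user and item in ratings[v]]
    num = sum(pearson(user, v) * (ratings[v][item] - mean(v)) for v in neighbors)
    den = sum(abs(pearson(user, v)) for v in neighbors)
    raw = mean(user) + (num / den if den else 0.0)
    return raw, min(high, max(low, raw))


raw, clipped = predict("A", "M3")
print(round(raw, 3), clipped)
```

The raw prediction is about 5.5, which is clipped to 5 as in Step 7.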

SECURITY ASPECTS OF RECOMMENDER SYSTEMS

• Security is a critical aspect of recommender systems, especially given the
sensitive nature of user data and the potential for malicious actors to exploit
vulnerabilities. Here are some key security considerations for recommender
systems:
• Privacy Protection: Recommender systems often rely on user data to generate
recommendations. It's essential to implement robust privacy protection mechanisms
to safeguard sensitive user information. Techniques such as data anonymization,
differential privacy, and secure multiparty computation can be employed to protect
user privacy while still providing effective recommendations.
• Data Integrity: Ensuring the integrity of data is crucial to prevent unauthorized
tampering or manipulation of recommendation algorithms. Employing cryptographic
techniques, access controls, and data validation mechanisms can help maintain the
integrity of data used by recommender systems.
• Authentication and Authorization: Implement strong authentication and
authorization mechanisms to control access to recommender system resources and
functionalities. This helps prevent unauthorized access to user data and system
components, reducing the risk of data breaches and malicious activities.
• Secure Communication: Secure communication protocols, such as HTTPS, should
be used to encrypt data transmitted between clients and recommender system servers.
This protects sensitive user information from eavesdropping and interception by
unauthorized parties.
• Model Robustness: Ensure that recommendation algorithms are robust against
adversarial attacks and manipulation attempts. Adversarial training, model validation,
and robustness testing can help identify and mitigate vulnerabilities in
recommendation models.
• User Trust and Transparency: Promote user trust and transparency by providing
clear explanations of how recommendations are generated and how user data is used.
Offering users control over their data and preferences, as well as transparent opt-
in/opt-out mechanisms, can enhance trust in the recommender system.
• Monitoring and Auditing: Implement monitoring and auditing mechanisms to
detect anomalous behavior, security incidents, and unauthorized access attempts in
real-time. Regular security audits and penetration testing can help identify and
address security vulnerabilities proactively.
• Regulatory Compliance: Ensure compliance with relevant data protection
regulations (e.g., GDPR, CCPA) and industry standards to protect user privacy and
data rights. This includes obtaining explicit user consent for data processing,
providing users with access to their data, and adhering to data retention and deletion
policies.

Two Marks

1. What is Collaborative Filtering?


Collaborative filtering is a recommendation technique that predicts users'
preferences by analyzing their past interactions and similarities with other users
or items. It is commonly used in recommender systems to suggest items or content
that users are likely to enjoy based on the behaviour of similar users.
2. What are the types of collaborative filtering?
There are generally two types of collaborative filtering
methods:

User-based Collaborative Filtering:


In user-based collaborative filtering, the system recommends items to a target
user based on the preferences and behaviors of other users who are similar to that
user.
The process involves finding users who have similar preferences to the target
user, based on items they have liked or interacted with.
For example, if User A and User B have liked or purchased similar items in the
past, then when User A likes a new item, User B might also like that item.

Item-based Collaborative Filtering:


In item-based collaborative filtering, the system recommends items that are
similar to the items that a target user has liked or interacted with in the past.
This method identifies items that are frequently liked or interacted with by users
who have also liked or interacted with the same items as the target user.
For instance, if a user has liked certain movies, item-based collaborative filtering
will recommend other movies that are often liked by users who liked the same
movies

3. State the difference between content based and collaborative filtering.

4. Why do we need recommender systems?

5. What are the two approaches in collaborative filtering?

6. Draw a neat sketch of the steps involved in a systematic approach to
collaborative filtering.

7. What is user based Collaborative Filtering?
To suggest new recommendations to a particular user, a group of similar
users (nearest neighbors) is created based on the interactions of the reference
user. The items that are most popular in this group, but new to the target user,
are used for the suggestions.
User-based CF algorithms recommend items to a user based on the
preferences of similar users. The algorithm first identifies a set of similar
users, also known as nearest neighbors, based on their past interactions with
items. The similarity between users is typically measured using similarity
measures such as cosine similarity or Pearson correlation. Once the nearest
neighbors are identified, the algorithm predicts the rating of an item for the
active user by aggregating the ratings of that item from the nearest neighbors.

8. What is item based Collaborative Filtering?(April/May 2024)


• In item-based filtering, new recommendations are selected based on the
old interactions of the target user. First, all the items that the user has
already liked are considered. Then, similar products are computed and
clusters are made (nearest neighbors). New items from these clusters are
suggested to the user.
• Item-based CF algorithms recommend items to a user based on the
similarity of items to items that the user has interacted with in the past.
The algorithm first identifies a set of similar items based on how users
have rated or interacted with them. The similarity between items is typically
measured using similarity measures such as Jaccard similarity
or cosine similarity. Once the similar items are identified, the algorithm
recommends to the active user items that are similar to items that the user
has liked in the past.

9. List the advantages of Memory-Based Collaborative Filtering


Simplicity: Memory-based approaches are intuitive and simple to implement,
making them a viable option for solving problems with moderately big datasets
in a short amount of time.
Transparency: Memory-Based systems’ suggestions are easy to understand
since they are grounded in the user’s and the item’s direct interactions.
Serendipity: Memory-based filtering has the potential to provide
serendipitous recommendations, in which users stumble onto previously
unknown but potentially fascinating content through shared relationships with
other users

10. List the drawbacks of Memory-Based Collaborative Filtering


• Sparsity and Scalability: As the dataset grows, the user-item matrix tends
to become sparser, so it becomes more difficult to find reliable neighbours,
and computing similarities over many users and items can cause scaling problems.
• Cold Start: Memory-based systems struggle when new users or items have too
few interactions to make reliable suggestions.
• Limited Representation: Memory-based approaches may provide subpar
results because they fail to fully capture complicated patterns in the data.
11. What is model based Collaborative filtering?

12. What is Matrix factorization?

Matrix factorization is a popular technique used in Collaborative Filtering
(CF) for recommendation systems. CF is a method to predict a user's interests
by collecting preferences or behavior information from many users. Matrix
factorization is particularly effective in collaborative filtering because it
can handle the sparsity of user-item interaction data. Matrix factorization
aims to learn a user-factor matrix P and an item-factor matrix Q by minimizing
the reconstruction error between the rating matrix R and its approximation
$P \times Q^T$. This is typically achieved through optimization techniques like
gradient descent, alternating least squares, or stochastic gradient descent.
The objective function could be formulated as:
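
A commonly used form of this objective is the regularized squared error over the observed ratings. It is given here as a sketch and may differ from the formula in the original notes. The symbols $K$ (the set of observed user-item pairs), $p_u$ and $q_i$ (the rows of P and Q for user u and item i) and $\lambda$ (the regularization weight) are assumptions introduced for illustration.

```latex
\min_{P,\,Q} \sum_{(u,i) \in K} \left( r_{ui} - p_u^{T} q_i \right)^2
  + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)
```

The first term is the reconstruction error between R and $P \times Q^T$ on known ratings. The second term penalizes large factor values to limit overfitting on sparse data.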

13. What is meant by Hybrid Approaches?


Hybrid approaches in Collaborative Filtering (CF) combine different methods or
techniques to overcome limitations and enhance the performance of recommendation
systems. These approaches leverage the strengths of multiple recommendation
strategies, such as collaborative filtering (CF) and content-based filtering (CBF), to
provide more accurate and diverse recommendations.
14. List the Hybrid Approach Components.
1. Collaborative Filtering (CF):
Idea: Recommend movies based on user behavior and preferences.
Implementation: Use matrix factorization (for example, Singular Value Decomposition)
to learn latent factors from user-item interactions (ratings).
Predict ratings for unseen movies based on similar users' preferences.
2. Content-Based Filtering (CBF):
Idea: Recommend movies based on the attributes or content of the items.
Implementation: Extract features from movies such as genre, director,
actors, release year. Build a profile for each user based on their rated movies.
Recommend movies that are similar in content to the ones a user has liked.
3. Hybridization:
Combining CF and CBF:
o Weighted Approach: Combine scores from CF and CBF using a weighted
sum or other fusion techniques.
o Switching Strategy: Use CF for some users and CBF for others based
on data availability or performance metrics.
o Feature Combination: Include content-based features (e.g., movie genres,
director) as additional input to the collaborative filtering model.
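
The weighted and switching strategies can be illustrated with two small functions. This is only a sketch: the function names, the weight `alpha` and the threshold `min_ratings` are hypothetical parameters, and the CF and CBF scores are assumed to be on the same scale.

```python
def weighted_hybrid(cf_score, cbf_score, alpha=0.7):
    """Weighted approach: blend CF and CBF scores with weight alpha on CF."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * cf_score + (1 - alpha) * cbf_score


def switching_hybrid(cf_score, cbf_score, n_user_ratings, min_ratings=5):
    """Switching strategy: use CF only when the user has enough ratings."""
    return cf_score if n_user_ratings >= min_ratings else cbf_score


# Hypothetical scores for one (user, movie) pair.
print(weighted_hybrid(4.0, 3.0, alpha=0.5))
print(switching_hybrid(4.0, 3.0, n_user_ratings=2))
```

In this example the switching strategy falls back to the content-based score because the user has fewer than `min_ratings` ratings, which is one way to handle the cold-start case.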

15. What are the approaches used in nearest neighbor CF?


• Nearest Neighbors Collaborative Filtering (NNCF) is a technique used in
recommendation systems to predict user preferences based on the similarity between
users or items. It falls under the umbrella of Collaborative Filtering (CF), which
utilizes the collective wisdom of users to make recommendations.
• User-based Collaborative Filtering (UBCF): Predict a user's preference for an
item by finding similar users based on their historical ratings.
• Item-based Collaborative Filtering (IBCF): Predict a user's preference for an
item by finding similar items based on how users have rated them.
16. List the Steps Involved in Nearest Neighbors Collaborative Filtering
Step-1: Data Representation: Represent user-item interactions as a matrix R, where
rows correspond to users and columns correspond to items. Each entry Rui represents
a user u's rating (or interaction) with item i.
Step-2: Similarity Calculation: Compute similarity between users (for UBCF) or items
(for IBCF) based on their rating patterns. Common similarity metrics include cosine
similarity, Pearson correlation, or Jaccard similarity.
Step-3: Nearest Neighbors Selection: For a given user u (or item i), identify the
k most similar users (or items) based on the computed similarity scores.
Nearest neighbors are typically selected based on the highest similarity
scores.
Step-4: Prediction:
• UBCF Prediction: Predict user u's rating for item i by averaging the
ratings of the k nearest neighbors who have rated item i, weighted by their
similarity to user u.
• IBCF Prediction: Predict user u's rating for item i by combining ratings
of items similar to item i, weighted by the similarity between items

17. How to find the score for a pair (user, movie)?

The score for a pair (user, movie) indicates how likely it is for a user to watch a
movie. The score is the sum of the similarity scores of the target movie's nearest
neighbors that have been watched by the target user.

• The pred relation takes the following inputs:


• neighborhood_size: The number of similar movies (neighbors) we use to
predict the score
• M: The relation containing (movie, user) pairs
• S: The similarity metric, e.g., Jaccard, cosine
• T: The relation that selects the top neighborhood_size most similar movies
to the target movie (i.e., the nearest neighbors).
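
The scoring rule can be sketched in plain Python. This is an illustrative stand-in, not the `pred` relation itself: the `watched` data is hypothetical, Jaccard is assumed as the similarity metric S, and `nearest_neighbors` plays the role of T.

```python
# Hypothetical data: movie -> set of users who watched it (the relation M).
watched = {
    "M1": {"u1", "u2", "u3"},
    "M2": {"u1", "u2"},
    "M3": {"u2", "u3", "u4"},
    "M4": {"u4"},
}


def jaccard(a, b):
    """Jaccard similarity between the audiences of movies a and b."""
    union = watched[a] | watched[b]
    return len(watched[a] & watched[b]) / len(union) if union else 0.0


def nearest_neighbors(movie, neighborhood_size):
    """Top neighborhood_size movies most similar to the target movie."""
    others = [m for m in watched if m != movie]
    # Sort by descending similarity; break ties by movie name for determinism.
    ranked = sorted(others, key=lambda m: (-jaccard(movie, m), m))
    return ranked[:neighborhood_size]


def score(user, movie, neighborhood_size):
    """Sum of similarities of the movie's nearest neighbors the user watched."""
    return sum(
        jaccard(movie, n)
        for n in nearest_neighbors(movie, neighborhood_size)
        if user in watched[n]
    )


print(score("u4", "M1", neighborhood_size=2))
```

Here the nearest neighbors of M1 are M2 and M3. User u4 has watched only M3, so the score is the similarity between M1 and M3, which is 0.5.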

18. Write the formula to find the similarity between a user and an item

19. What are the key components of neighborhood methods?


• Similarity Measure: Neighborhood methods use a similarity measure to quantify the
similarity between users or items. Common similarity measures include cosine
similarity, Pearson correlation coefficient, and Jaccard similarity.
• Neighborhood Selection: Once the similarity between users or items is computed, the
next step is to select a subset of neighbors that are most similar to the target user or item.
This subset is known as the neighborhood. The size of the neighborhood, i.e.,
the number of nearest neighbors to consider, can be fixed or adaptive.
• Rating Prediction: After selecting the neighborhood, the algorithm predicts the rating
of a target user for an item by aggregating the ratings of its neighbors for that item. This
can be done using various aggregation functions such as weighted average, weighted
sum, or regression-based methods.
• Item or User-Based Approach: Neighborhood methods can be either item-based or
user-based. In item-based approaches, similarities between items are computed based
on the ratings given by users, and recommendations are made by finding items similar
to those the user has liked. In user-based approaches, similarities between users are

calculated based on their rating patterns, and recommendations are made by identifying
users similar to the target user and recommending items they have liked.
• Rating Normalization: To improve the accuracy of predictions, rating normalization
techniques may be applied. These techniques adjust the ratings to account for user or
item biases, such as users who tend to rate items more positively or items that are
consistently rated higher or lower than others.
• Sparse Data Handling: Neighborhood methods often face the challenge of dealing
with sparse data, where many user-item pairs have no ratings. Various strategies such
as neighborhood expansion, imputation, or incorporating auxiliary information may be
employed to handle sparse data and improve recommendation quality.

20. What is Z Score normalization?


Z-score normalization, also known as standard score normalization, is a statistical
technique used to rescale a distribution of values to have a mean of zero and a
standard deviation of one. This normalization technique is often applied to features or
variables in data preprocessing to ensure that they are on a comparable scale, which can
be beneficial for certain machine learning algorithms.

The formula for calculating the Z-score of a data point x is:

$$z = \frac{x - \mu}{\sigma}$$

Where:
z is the Z-score.
x is the original value.
μ is the mean of the distribution.
σ is the standard deviation of the distribution
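
As a small illustration, Z-score normalization can be applied to one user's ratings. This sketch assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation instead would give slightly different values. The function name `z_scores` and the sample ratings are illustrative.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All values are identical; every deviation from the mean is zero.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Illustrative ratings from a single user.
user_ratings = [4, 3, 5, 2]
normalized = z_scores(user_ratings)
print([round(z, 3) for z in normalized])
```

After normalization, a positive value means the user rated the item above their own average, which makes ratings comparable between generous and strict raters.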

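21. Write the formula to find out the similarity between two users

$$CV(u, v) = \cos(x_u, x_v) = \frac{\sum_{i \in I_{uv}} r_{ui}\, r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2}\,\sqrt{\sum_{i \in I_v} r_{vi}^2}}$$

Meaning of Symbols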
• 𝐶𝑉(𝑢, 𝑣)– Cosine similarity between user u and user v
• 𝑥𝑢 , 𝑥𝑣 – Rating vectors of users u and v
• 𝑟𝑢𝑖 – Rating given by user u to item i
• 𝑟𝑣𝑖 – Rating given by user v to item i
• 𝐼𝑢𝑣 – Set of items rated by both users u and v
• 𝐼𝑢 – Set of items rated by user u
• 𝐼𝑣 – Set of items rated by user v
In simple terms
• The numerator computes the dot product of common ratings.
• The denominator normalizes using the magnitude of each user's rating
vector.

• The result ranges from 0 to 1 (or −1 to 1 depending on ratings), indicating how
similar the users are.
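
The description above can be turned into a short function. This is a sketch under the assumption that each user's ratings are stored as a dictionary from item to rating. The function name `cosine_users` and the sample ratings are hypothetical.

```python
from math import sqrt


def cosine_users(ratings_u, ratings_v):
    """Cosine similarity between two users' rating dictionaries."""
    # Numerator: dot product over items rated by both users.
    common = ratings_u.keys() & ratings_v.keys()
    numerator = sum(ratings_u[i] * ratings_v[i] for i in common)
    # Denominator: magnitude of each user's full rating vector.
    norm_u = sqrt(sum(r * r for r in ratings_u.values()))
    norm_v = sqrt(sum(r * r for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return numerator / (norm_u * norm_v)


# Hypothetical ratings: item -> rating.
u = {"a": 5, "b": 3, "d": 1}
v = {"a": 4, "d": 1}
print(round(cosine_users(u, v), 3))
```

With non-negative ratings the result lies between 0 and 1. Negative values can only appear if ratings are mean-centered first.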

22. How similarity weight computation is carried out in item based collaborative
filtering

• Construct an Item-User Rating Matrix


• Rows represent items, and columns represent users.
• Entries contain user ratings or interactions with items.
• Choose a Similarity Metric
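• Cosine similarity (most common for numerical ratings).

$$sim(i, j) = \frac{\sum_{u \in U} R_{ui}\, R_{uj}}{\sqrt{\sum_{u \in U} R_{ui}^2}\,\sqrt{\sum_{u \in U} R_{uj}^2}}$$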

Meaning of Symbols
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈– Set of all users who rated the items
• 𝑅𝑢𝑖 – Rating given by user u to item i
• 𝑅𝑢𝑗 – Rating given by user u to item j
Explanation
• The numerator calculates the dot product of rating vectors of items 𝑖and 𝑗.
• The denominator normalizes the values using the magnitude of each item vector.
• The result gives the cosine of the angle between the two item vectors.

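• Pearson correlation (if users have different rating behaviors).

$$sim(i, j) = \frac{\sum_{u \in U} (R_{ui} - \bar{R}_i)(R_{uj} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{ui} - \bar{R}_i)^2}\,\sqrt{\sum_{u \in U} (R_{uj} - \bar{R}_j)^2}}$$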

Meaning of Symbols
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈– Set of users who rated both items
• 𝑅𝑢𝑖 – Rating given by user u for item i
• 𝑅𝑢𝑗 – Rating given by user u for item j
• $\bar{R}_i$ – Mean rating of item i
• $\bar{R}_j$ – Mean rating of item j
Explanation
• Measures the linear correlation between two items.
• Ratings are mean-centered by subtracting the average rating.
• This reduces rating bias when users rate items on different scales.
Key Property
• Value range: −1 to +1
o +1 → Perfect positive similarity
o 0 → No correlation
o −1 → Opposite preference

• Jaccard similarity (for implicit feedback data).

$$sim(i, j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}$$

Meaning of Symbols
• 𝑠𝑖𝑚(𝑖, 𝑗)– Similarity between item i and item j
• 𝑈𝑖 – Set of users who interacted with item i
• 𝑈𝑗 – Set of users who interacted with item j
• ∣ 𝑈𝑖 ∩ 𝑈𝑗 ∣– Number of users who interacted with both items
• ∣ 𝑈𝑖 ∪ 𝑈𝑗 ∣– Total number of users who interacted with either item
Explanation
• Measures the overlap between two user sets.
• Commonly used for implicit feedback data such as:
o clicks
o purchases
o views
o likes
Key Properties
• Value range: 0 to 1
o 0 → No common users
o 1 → Both items interacted by exactly the same users

• Compute Similarity Weights


1. Compute pairwise similarity for all items.
2. Store them in an item similarity matrix.
• Use Similarity for Recommendations
• Predict user ratings based on similar items.
• Recommend items that are most similar to what a user has interacted with.
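
The steps of building an item-user matrix, computing pairwise similarity and storing the result in an item similarity matrix can be sketched as follows. The data and the function names are hypothetical, and cosine similarity is used as the metric.

```python
from itertools import combinations
from math import sqrt

# Hypothetical item-user rating matrix: item -> {user: rating}.
item_user = {
    "I1": {"u1": 5, "u2": 4, "u3": 1},
    "I2": {"u1": 3, "u3": 1, "u4": 1},
    "I3": {"u4": 5},
}


def cosine_items(ratings_i, ratings_j):
    """Cosine similarity between two items' rating vectors."""
    common = ratings_i.keys() & ratings_j.keys()
    numerator = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = sqrt(sum(r * r for r in ratings_i.values()))
    norm_j = sqrt(sum(r * r for r in ratings_j.values()))
    return numerator / (norm_i * norm_j) if norm_i and norm_j else 0.0


def similarity_matrix(matrix):
    """Pairwise item similarities stored as a symmetric dict of dicts."""
    sim = {i: {i: 1.0} for i in matrix}
    for i, j in combinations(matrix, 2):
        weight = cosine_items(matrix[i], matrix[j])
        sim[i][j] = weight
        sim[j][i] = weight
    return sim


def most_similar(sim, item, k):
    """The k items most similar to the given item."""
    others = [j for j in sim[item] if j != item]
    return sorted(others, key=lambda j: sim[item][j], reverse=True)[:k]


sim = similarity_matrix(item_user)
print(round(sim["I1"]["I2"], 3), most_similar(sim, "I2", 1))
```

Because the matrix is symmetric, each pair is computed once and stored in both directions. In a real system this matrix would usually be pruned with one of the neighbor pre-filtering strategies.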

23. What is meant by Pearson coefficient?


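The Pearson correlation coefficient $r_{xy}$ between two users x and y is
calculated as:

$$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the ratings given by users x and y to item i, and
$\bar{x}$ and $\bar{y}$ are their mean ratings.

The Pearson correlation coefficient ranges from -1 to 1: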

• rxy=1 indicates a perfect positive correlation, meaning that the ratings of users
x and y are perfectly linearly related (i.e., when one user rates an item highly,
the other user also tends to rate it highly).
• rxy=−1 indicates a perfect negative correlation, meaning that the ratings of
users x and y are perfectly inversely related (i.e., when one user rates an
item highly, the other user tends to rate it poorly).
• rxy =0 indicates no linear correlation between the ratings of users x and y.
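
As a rough illustration of the three cases above, the following sketch computes the Pearson coefficient for two users. The rating lists are invented, the means are taken over the co-rated items only, and returning 0.0 when a user has constant ratings is an assumption made here because the coefficient is undefined in that case.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of co-rated ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = (math.sqrt(sum((a - mean_x) ** 2 for a in x))
           * math.sqrt(sum((b - mean_y) ** 2 for b in y)))
    # Undefined when one user gives the same rating everywhere; 0.0 is an assumed fallback.
    return num / den if den else 0.0

# Hypothetical ratings of two users on the same four items.
print(round(pearson([1, 2, 3, 4], [2, 4, 6, 8]), 3))  # ratings rise together
print(round(pearson([1, 2, 3, 4], [8, 6, 4, 2]), 3))  # ratings move in opposite directions
```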

24. What is the MSD of two sets of values X = {3, 5, 7, 9} and Y = {4, 6, 8, 10}?

The Mean Squared Difference (MSD) is defined as:

$$MSD = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$$

Meaning of Symbols
• MSD – Mean Squared Difference
• 𝑛– Number of paired observations
• x_i – Value of the first vector (or item/user) at position i
• y_i – Value of the second vector at position i
• (x_i − y_i)^2 – Squared difference between corresponding values
Example
Given vectors:
• X = (3, 5, 7, 9)
• Y = (4, 6, 8, 10)
Step 1: Compute differences
(3 − 4), (5 − 6), (7 − 8), (9 − 10) = (−1, −1, −1, −1)

Step 2: Square the differences
(−1)^2, (−1)^2, (−1)^2, (−1)^2 = (1, 1, 1, 1)

Step 3: Compute MSD
$$MSD = \frac{1}{4}(1 + 1 + 1 + 1) = \frac{4}{4} = 1$$

✅ Result:
The Mean Squared Difference between X and Y is 1.
On average, the squared difference between corresponding values of X and Y is 1,
indicating their level of dissimilarity.
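
The same calculation can be checked with a short Python sketch. The function name is illustrative; the inputs are the X and Y vectors from the question.

```python
def mean_squared_difference(x, y):
    """Mean of the squared element-wise differences between two equal-length vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

X = [3, 5, 7, 9]
Y = [4, 6, 8, 10]
print(mean_squared_difference(X, Y))  # 1.0
```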

Part -B
1. List the differences between a collaborative recommendation engine
and a content-based recommendation engine.
Aspect | Collaborative Recommendation Engine | Content-Based Recommendation Engine
Basic Idea | Recommends items based on similar users' preferences | Recommends items whose features are similar to items the user liked before
Data Used | Uses user–item interaction data (ratings, clicks, purchases) | Uses item attributes or content features (keywords, genre, category)
Working Principle | "Users who liked this item also liked these items" | "Items similar to what you liked earlier are recommended"
Dependency | Depends heavily on other users' behavior | Depends mainly on the user's own past preferences
Similarity Calculation | Computes similarity between users or items | Computes similarity between item features
Example Algorithm | User-based CF, Item-based CF, Matrix Factorization | TF-IDF, Cosine similarity, Feature matching
Cold Start Problem | Suffers from new user and new item problem | Works better for new items if item features are known
Diversity of Recommendations | Can recommend different types of items liked by similar users | Recommendations are usually similar to previously liked items
Need for Item Information | Does not require detailed item descriptions | Requires detailed item attributes or metadata
Scalability | Can be difficult with very large user-item datasets | Generally easier if item features are available
Explainability | Harder to explain why an item is recommended | Easier to explain (based on item features)
Example Applications | Amazon "Customers who bought this also bought", Netflix recommendations | News recommendation, article suggestion, music genre recommendation

2. Explain the steps involved in the systematic approach to collaborative filtering
with a neat sketch. (8 marks)
3. Explain Memory-based Collaborative Filtering in detail. (8 marks)
4. Explain Model-based Collaborative Filtering in detail. (8 marks)
5. Explain the steps to be followed in Nearest Neighbor Collaborative Filtering.
6. Explain User-based Collaborative Filtering with an example.
7. Explain Item-to-Item Based Collaborative Filtering with an example.
8. How is rating normalization done using mean-centering in
neighborhood methods?
9. How is rating normalization done using Z-score normalization in
neighborhood methods?
10. If a user gives the same ratings to similar types of movies but has not
rated a new movie that falls under a previously rated category, is it
possible to predict that user's rating for the new movie using item-based
collaborative filtering? Describe the procedure step by step in detail.
(April-May-2024)
Predicting a User's Rating for a New Movie Using Item-Based
Collaborative Filtering (IBCF)

Yes, Item-Based Collaborative Filtering (IBCF) can predict a rating for a
new movie that a user hasn't rated, based on the ratings they have given
to similar movies. The procedure involves computing item similarities
and using them to infer the missing rating.

Step-by-Step Procedure to Predict the Rating


Step 1: Create the Item-User Rating Matrix
Construct a matrix where:
• Rows represent movies (items).
• Columns represent users.
• Cells contain ratings given by users to movies.
If a user hasn’t rated a movie, the value is missing (null).
Example:
Movie / User | User A | User B | User C
Movie 1 (Action) | 5 | 4 | 3
Movie 2 (Action) | 4 | 5 | ?
Movie 3 (Comedy) | 2 | 3 | 4
Movie 4 (Action) | ? | 4 | 3
Here, User C hasn't rated Movie 2, but they have rated Movie 1 and Movie 4,
which belong to the same genre (Action).
Step 2: Compute Similarity Between Movies
To find the similarity between movies, we use Cosine Similarity or Pearson
Correlation.
Cosine Similarity Formula:

$$sim(i, j) = \frac{\sum_{u \in U} R_{ui} R_{uj}}{\sqrt{\sum_{u \in U} R_{ui}^2}\,\sqrt{\sum_{u \in U} R_{uj}^2}}$$

where U is the set of users who rated both movies.

Example of Similarity Calculation:
We compute the similarity between Movie 2 (the target) and each of
Movies 1, 3, and 4:
1. Movie 1 & Movie 2

Users A and B rated both movies.

Compute the numerator (dot product)
(5 × 4) + (4 × 5) = 20 + 20 = 40

Compute the denominator (vector magnitudes)
First vector magnitude: √(5² + 4²) = √(25 + 16) = √41
Second vector magnitude: √(4² + 5²) = √(16 + 25) = √41

Final similarity value
$$sim(1,2) = \frac{40}{\sqrt{41} \times \sqrt{41}} = \frac{40}{41} \approx 0.976$$

Result:
The cosine similarity between Movie 1 and Movie 2 is approximately 0.976, which
indicates very high similarity between the two movies.

2. Movie 3 & Movie 2

Users A and B rated both movies.

Step 1: Numerator (dot product)
(2 × 4) + (3 × 5) = 8 + 15 = 23

Step 2: Denominator
√(2² + 3²) = √(4 + 9) = √13 ≈ 3.61
√(4² + 5²) = √(16 + 25) = √41 ≈ 6.40

Step 3: Final value
$$sim(3,2) = \frac{23}{\sqrt{13} \times \sqrt{41}} \approx \frac{23}{23.09} \approx 0.996$$

Result:
Similarity between Movie 3 and Movie 2 ≈ 0.996 (very high similarity).

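3. Movie 4 & Movie 2

Only User B rated both movies (Movie 4 = 4, Movie 2 = 5), so each vector has a single entry.

Step 1: Numerator
4 × 5 = 20

Step 2: Denominator
√(4²) = 4
√(5²) = 5

Step 3: Final value
$$sim(4,2) = \frac{20}{4 \times 5} = 1.0$$

• Result:
Similarity between Movie 4 and Movie 2 = 1.0. With only one co-rating user, cosine
similarity is always 1, so this value should be treated as unreliable.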

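Step 3: Predict the Missing Rating

To predict User C's rating for Movie 2, we use a weighted average of the ratings the
user has given to similar movies:

$$R_{C,2} = \frac{\sum_{j \in N} sim(2, j)\, R_{C,j}}{\sum_{j \in N} |sim(2, j)|}$$

where N is the set of neighboring movies that User C has rated.

Example Calculation:
User C rated:
• Movie 1 = 3
• Movie 4 = 3

$$R_{C,2} = \frac{sim(2,1) \times 3 + sim(2,4) \times 3}{sim(2,1) + sim(2,4)} = 3.0$$

Because both neighbor ratings equal 3, the weighted average is 3 whatever the
similarity weights are.
So, the predicted rating for User C on Movie 2 is 3.0.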
Step 4: Use the Prediction for Recommendation
• If predicted rating ≥ threshold (e.g., 3.5), recommend the movie.
• If the rating is low, the system won’t recommend it.
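
The steps above can be sketched in Python as follows. This is a minimal sketch, not a production implementation: it assumes cosine similarity is computed only over users who rated both movies, and it uses Movie 1 and Movie 4 as the neighbors of Movie 2, as in the example. The function names `cosine_corated` and `predict` are illustrative.

```python
import math

# Ratings from the example table, keyed movie -> user (missing ratings omitted).
ratings = {
    "Movie 1": {"A": 5, "B": 4, "C": 3},
    "Movie 2": {"A": 4, "B": 5},
    "Movie 3": {"A": 2, "B": 3, "C": 4},
    "Movie 4": {"B": 4, "C": 3},
}

def cosine_corated(item_i, item_j):
    """Cosine similarity of two movies over the users who rated both."""
    common = ratings[item_i].keys() & ratings[item_j].keys()
    if not common:
        return 0.0
    dot = sum(ratings[item_i][u] * ratings[item_j][u] for u in common)
    norm_i = math.sqrt(sum(ratings[item_i][u] ** 2 for u in common))
    norm_j = math.sqrt(sum(ratings[item_j][u] ** 2 for u in common))
    return dot / (norm_i * norm_j)

def predict(user, target, neighbors):
    """Similarity-weighted average of the user's ratings on the neighbor movies."""
    pairs = [(cosine_corated(target, m), ratings[m][user])
             for m in neighbors if user in ratings[m]]
    weight_total = sum(abs(s) for s, _ in pairs)
    if weight_total == 0:
        return None  # no usable neighbor
    return sum(s * r for s, r in pairs) / weight_total

print(round(cosine_corated("Movie 1", "Movie 2"), 3))             # 0.976
print(round(predict("C", "Movie 2", ["Movie 1", "Movie 4"]), 2))  # 3.0
```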

Advantages of This Approach


✅ Works well when users rate many similar items.
✅ Handles new movies well if enough similarity data exists.
✅ Can be extended for implicit feedback data (e.g., clicks, watch
history).

Challenges & Limitations


⚠️ Cold Start Problem – Doesn’t work if a movie has zero ratings.
⚠️ Data Sparsity – If users rate very few items, similarity calculations
may be unreliable.
⚠️ Scalability Issues – Large datasets require efficient computation
techniques.
11. How can movie recommendations and rating predictions be made using
nearest neighbor collaborative filtering algorithms? Elaborate on
whether content-based filtering can yield better results than
this method, and justify the response.
(April-May-2024)
Nearest Neighbor Collaborative Filtering (CF) is a technique that
recommends movies based on similar users (User-Based CF) or similar
items (Item-Based CF). It operates by finding the K nearest neighbors
to a user or item and using their interactions (ratings) to generate
predictions.

1. Steps for Nearest Neighbor Collaborative Filtering


Step 1: Construct the User-Item Rating Matrix
• Each row represents a user.
• Each column represents a movie.
• Entries contain user ratings for movies.
Example Matrix:
User / Movie | Movie A | Movie B | Movie C | Movie D
User 1 | 5 | ? | 3 | 4
User 2 | 4 | 2 | ? | 5
User 3 | 3 | 4 | 5 | ?

Step 2: Choose the Collaborative Filtering Approach
Nearest neighbor methods are applied in two main ways:
A. User-Based Collaborative Filtering (UBCF)
• Finds users similar to a target user.
• Predicts ratings by averaging ratings of similar users.
Steps:
1. Compute similarity between users (e.g., Cosine Similarity or Pearson
Correlation), using only the movies both users have rated.

Cosine Similarity: User1 & User2
Common movies: A, D

Step 1: Numerator
(5 × 4) + (4 × 5) = 20 + 20 = 40

Step 2: Denominator
√(5² + 4²) = √(25 + 16) = √41
√(4² + 5²) = √(16 + 25) = √41

Step 3: Final Similarity
$$sim(U_1, U_2) = \frac{40}{\sqrt{41}\,\sqrt{41}} = \frac{40}{41} \approx 0.976$$

Cosine Similarity: User1 & User3
Common movies: A, C

Step 1: Numerator
(5 × 3) + (3 × 5) = 15 + 15 = 30

Step 2: Denominator
√(5² + 3²) = √(25 + 9) = √34
√(3² + 5²) = √(9 + 25) = √34

Step 3: Final Similarity
$$sim(U_1, U_3) = \frac{30}{\sqrt{34}\,\sqrt{34}} = \frac{30}{34} \approx 0.882$$

Cosine Similarity: User2 & User3
Common movies: A, B

Step 1: Numerator
(4 × 3) + (2 × 4) = 12 + 8 = 20

Step 2: Denominator
√(4² + 2²) = √(16 + 4) = √20 ≈ 4.472
√(3² + 4²) = √(9 + 16) = √25 = 5

Step 3: Final Similarity
$$sim(U_2, U_3) = \frac{20}{\sqrt{20} \times 5} = \frac{20}{22.36} \approx 0.894$$

Similarity Table
User Pair Cosine Similarity
U1 – U2 0.976
U1 – U3 0.882
U2 – U3 0.894
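
The similarity table can be reproduced with a short Python sketch. It assumes, as in the hand calculation, that cosine similarity is taken only over movies both users rated; the dictionary layout and function name are illustrative choices.

```python
import math
from itertools import combinations

# User -> movie ratings from the example matrix (missing ratings omitted).
ratings = {
    "U1": {"A": 5, "C": 3, "D": 4},
    "U2": {"A": 4, "B": 2, "D": 5},
    "U3": {"A": 3, "B": 4, "C": 5},
}

def cosine_corated(u, v):
    """Cosine similarity of two users over their commonly rated movies."""
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return 0.0
    dot = sum(ratings[u][m] * ratings[v][m] for m in common)
    norm_u = math.sqrt(sum(ratings[u][m] ** 2 for m in common))
    norm_v = math.sqrt(sum(ratings[v][m] ** 2 for m in common))
    return dot / (norm_u * norm_v)

similarity = {(u, v): round(cosine_corated(u, v), 3)
              for u, v in combinations(ratings, 2)}
print(similarity)
# {('U1', 'U2'): 0.976, ('U1', 'U3'): 0.882, ('U2', 'U3'): 0.894}
```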
Step 3: Predict Missing Ratings

2. Select K nearest neighbors (users most similar to the target user).
3. Compute a weighted sum of their mean-centered ratings to predict missing values.
Formula for Rating Prediction:

$$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)\,(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$$

User mean ratings (over the movies each user rated):
• User1 = (5 + 3 + 4) / 3 = 4
• User2 = (4 + 2 + 5) / 3 ≈ 3.67
• User3 = (3 + 4 + 5) / 3 = 4

Predict User1 → Movie B

Neighbors who rated Movie B
• User2 = 2
• User3 = 4
Prediction equation
$$R_{1B} = 4 + \frac{0.976(2 - 3.67) + 0.882(4 - 4)}{|0.976| + |0.882|}$$

Numerator
0.976(−1.67) + 0 = −1.63

Denominator
0.976 + 0.882 = 1.858

Prediction
$$R_{1B} = 4 + \frac{-1.63}{1.858} = 4 - 0.88 \approx 3.12$$

✅ Predicted Rating
User1(MovieB) ≈ 3.1

Step 4: Predict User2 → Movie C

Neighbors who rated Movie C
• User1 = 3
• User3 = 5
$$R_{2C} = 3.67 + \frac{0.976(3 - 4) + 0.894(5 - 4)}{0.976 + 0.894}$$

Numerator
0.976(−1) + 0.894(1) = −0.976 + 0.894 = −0.082

Denominator
0.976 + 0.894 = 1.87

Prediction
$$R_{2C} = 3.67 + \frac{-0.082}{1.87} \approx 3.63$$

✅ Predicted Rating
User2(MovieC) ≈ 3.6

Step 5: Predict User3 → Movie D

Neighbors who rated Movie D
• User1 = 4
• User2 = 5
$$R_{3D} = 4 + \frac{0.882(4 - 4) + 0.894(5 - 3.67)}{0.882 + 0.894}$$

Numerator
0 + 0.894(1.33) = 1.19

Denominator
0.882 + 0.894 = 1.776

Prediction
$$R_{3D} = 4 + \frac{1.19}{1.776} \approx 4.67$$

✅ Predicted Rating
User3(MovieD) ≈ 4.7

Final Predicted Matrix
User | A | B | C | D
U1 | 5 | 3.1 | 3 | 4
U2 | 4 | 2 | 3.6 | 5
U3 | 3 | 4 | 5 | 4.7

✅ Final Predicted Ratings


• User1(MovieB) ≈ 3.1
• User2(MovieC) ≈ 3.6
• User3(MovieD) ≈ 4.7
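
The three predictions can be reproduced with the sketch below. It takes the similarity values from the table as given constants and computes each user's mean over the movies that user rated; both choices mirror the hand calculation, and the function names are illustrative. Small differences in the second decimal place can appear because the hand calculation rounds the mean 3.67 before use.

```python
# User -> movie ratings from the example matrix (missing ratings omitted).
ratings = {
    "U1": {"A": 5, "C": 3, "D": 4},
    "U2": {"A": 4, "B": 2, "D": 5},
    "U3": {"A": 3, "B": 4, "C": 5},
}

# Cosine similarities copied from the similarity table.
similarity = {
    frozenset(("U1", "U2")): 0.976,
    frozenset(("U1", "U3")): 0.882,
    frozenset(("U2", "U3")): 0.894,
}

def mean_rating(user):
    values = ratings[user].values()
    return sum(values) / len(values)

def predict(user, movie):
    """Mean-centered, similarity-weighted prediction from all users who rated the movie."""
    num = den = 0.0
    for other in ratings:
        if other == user or movie not in ratings[other]:
            continue
        s = similarity[frozenset((user, other))]
        num += s * (ratings[other][movie] - mean_rating(other))
        den += abs(s)
    if den == 0:
        return mean_rating(user)  # assumed fallback: no neighbor rated the movie
    return mean_rating(user) + num / den

for user, movie in [("U1", "B"), ("U2", "C"), ("U3", "D")]:
    print(user, movie, round(predict(user, movie), 1))
# U1 B 3.1
# U2 C 3.6
# U3 D 4.7
```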

B. Item-Based Collaborative Filtering (IBCF)


• Finds similar movies rather than users.
• Predicts ratings by considering how similar movies were rated by a user.
Steps:
1. Compute similarity between movies using past user ratings (Cosine
Similarity or Pearson Correlation).
2. Select K nearest neighbor movies (most similar to the target movie).
3. Predict missing ratings based on ratings of similar movies.
Formula for Rating Prediction:

Step 1: Given User-Movie Rating Matrix


User / Movie | Movie A | Movie B | Movie C | Movie D
User 1 | 5 | ? | 3 | 4
User 2 | 4 | 2 | ? | 5
User 3 | 3 | 4 | 5 | ?
Our goal is to predict the missing ratings (?) using item-based
collaborative filtering.

Step 2: Compute Similarity Between Movies


Choosing Similarity Metric

Computing Cosine Similarity Between Movies
1. Compute Similarity Between Movie A and Other Movies
We calculate similarity for Movie A with B, C, and D using their common
user ratings.

(a) Movie A & Movie B Similarity
Only User 2 and User 3 have rated both movies:
• Movie A vector: [4, 3]
• Movie B vector: [2, 4]

A ⋅ B = (4 × 2) + (3 × 4) = 8 + 12 = 20
‖A‖ = √(16 + 9) = 5, ‖B‖ = √(4 + 16) = √20 ≈ 4.47
Sim(A, B) = 20 / (5 × 4.47) ≈ 0.894

(b) Movie A & Movie C Similarity
Only User 1 and User 3 have rated both movies:
• Movie A vector: [5, 3]
• Movie C vector: [3, 5]

A ⋅ C = (5 × 3) + (3 × 5) = 15 + 15 = 30
Sim(A, C) = 30 / (5.83 × 5.83) = 30 / 34 ≈ 0.882

(c) Movie A & Movie D Similarity
Only User 1 and User 2 have rated both movies:
• Movie A vector: [5, 4]
• Movie D vector: [4, 5]

A ⋅ D = (5 × 4) + (4 × 5) = 20 + 20 = 40
Sim(A, D) = 40 / (6.40 × 6.40) = 40 / 41 ≈ 0.976

Let's compute the cosine similarity between Movie B and Movie C step
by step.

Step 1: Extract Relevant Ratings

From the given data:
• Movie A: [5, 4, 3]
• Movie B: [?, 2, 4]
• Movie C: [3, ?, 5]
• Movie D: [4, 5, ?]
User | Movie B | Movie C
User 1 | ? (missing) | 3
User 2 | 2 | ? (missing)
User 3 | 4 | 5
Since User 1's rating for Movie B and User 2's rating for Movie C are
missing, we only use User 3's ratings.
Thus, we use:
• Movie B vector: [4]
• Movie C vector: [5]

Step 2: Cosine Similarity Formula
$$sim(B, C) = \frac{B \cdot C}{\|B\|\,\|C\|}$$

Step 3: Compute the Dot Product
B ⋅ C = (4 × 5) = 20

Step 4: Compute Magnitudes
‖B‖ = √(4²) = 4, ‖C‖ = √(5²) = 5

Step 5: Compute Cosine Similarity
sim(B, C) = 20 / (4 × 5) = 1.0

Final Answer
Cosine Similarity (B, C) = 1.0

Let's compute the cosine similarity between Movie B and Movie D
step by step.

Step 1: Extract Relevant Ratings

From the given data:
• Movie A: [5, 4, 3]
• Movie B: [?, 2, 4]
• Movie C: [3, ?, 5]
• Movie D: [4, 5, ?]
User | Movie B | Movie D
User 1 | ? (missing) | 4
User 2 | 2 | 5
User 3 | 4 | ? (missing)
Since User 1's rating for Movie B and User 3's rating for Movie D are
missing, we only use User 2's ratings.
Thus, we use:
• Movie B vector: [2]
• Movie D vector: [5]

Step 2: Cosine Similarity Formula
$$sim(B, D) = \frac{B \cdot D}{\|B\|\,\|D\|}$$

Step 3: Compute the Dot Product
B ⋅ D = (2 × 5) = 10

Step 4: Compute Magnitudes
‖B‖ = √(2²) = 2, ‖D‖ = √(5²) = 5

Step 5: Compute Cosine Similarity
sim(B, D) = 10 / (2 × 5) = 1.0

Final Answer
Cosine Similarity (B, D) = 1.0
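Likewise, for Movie C and Movie D only User 1 rated both movies (C = 3, D = 4), so
Cosine Similarity (C, D) = (3 × 4) / (3 × 4) = 1.0.
Any pair of movies with a single co-rating user gets a cosine similarity of 1.0,
so these values carry little information on their own.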

Step 3: Predict Missing Ratings


Predict missing ratings based on ratings of similar movies.
Formula for Rating Prediction:

Prediction 1: User 1's Rating for Movie B

Prediction 2: User 2's Rating for Movie C

Prediction 3: User 3's Rating for Movie D

Final Predictions
User / Movie | Movie A | Movie B | Movie C | Movie D
User 1 | 5 | 3.9 | 3 | 4
User 2 | 4 | 2 | 4.1 | 5
User 3 | 3 | 4 | 5 | 4.5
Conclusion
1. Movie B is predicted to be rated 3.9 by User 1.
2. Movie C is predicted to be rated 4.1 by User 2.
3. Movie D is predicted to be rated 4.5 by User 3.
Comparison with Content-Based Filtering (CBF)
Content-Based Filtering (CBF) Overview
• Uses movie attributes (genre, actors, director, etc.) to make
recommendations.
• Does not depend on other users' interaction history.
Formula for Content-Based Score:

Which Approach Yields Better Results?


Feature | Collaborative Filtering (CF) | Content-Based Filtering (CBF)
Cold Start (New Users) | Bad | Better (uses movie features)
Cold Start (New Movies) | Bad | Good (uses attributes)
Personalized Recommendations | Strong | Medium
Scalability | Slower (large dataset) | Faster
Data Required | Requires user ratings | Works without ratings
Diversity | Recommends variety | Recommends similar content
Justification:
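✅ CBF is better for new movies because it relies on item attributes rather
than past ratings; for a new user it still needs a few of that user's own
preferences, but no ratings from other users.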
✅ CF gives more diverse and personalized recommendations by
learning from user preferences.
✅ Hybrid models (combining CF + CBF) provide the best results,
mitigating each method's weaknesses.

12. Illustrate the working principle of neighborhood methods and discuss the
components used (April-May-2025)

a) Working Principle of Neighborhood Methods (Collaborative Filtering)
Neighborhood methods are memory-based recommendation
techniques that predict a user's preference by analyzing the preferences
of similar users (user-based) or similar items (item-based). The basic
idea is that users with similar tastes will rate items similarly.

1. Construct the User–Item Rating Matrix
The first step is to collect ratings from users and form a user–item matrix.
User / Item | Item1 | Item2 | Item3 | Item4
User1 | 5 | ? | 3 | 4
User2 | 4 | 2 | ? | 5
User3 | 3 | 4 | 5 | ?
• Rows represent users
• Columns represent items
• Missing values represent ratings to be predicted
2. Compute Similarity Between Users or Items
The system measures similarity between users or items.
Cosine Similarity
$$sim(u, v) = \frac{\sum_i r_{ui}\, r_{vi}}{\sqrt{\sum_i r_{ui}^2}\,\sqrt{\sum_i r_{vi}^2}}$$

Where
• r_ui = rating of user u on item i
• r_vi = rating of user v on item i
This computes the cosine of the angle between two rating vectors.
Pearson Correlation Similarity
$$sim(u, v) = \frac{\sum_i (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_i (r_{ui} - \bar{r}_u)^2}\,\sqrt{\sum_i (r_{vi} - \bar{r}_v)^2}}$$

This method considers rating deviations from the average rating.


3. Identify the Neighborhood
After computing similarity, the system selects the nearest neighbors.
Example:
User Pair Similarity
U1 – U2 0.97
U1 – U3 0.88
Thus User2 becomes the nearest neighbor of User1.
Methods used:
• Top-K neighbors
• Similarity threshold
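
Both selection methods can be sketched in Python. The similarity values are the ones from the example; the function names and the threshold of 0.9 are illustrative assumptions.

```python
def top_k_neighbors(similarities, k):
    """Return the k users with the highest similarity to the target user."""
    return sorted(similarities, key=similarities.get, reverse=True)[:k]

def threshold_neighbors(similarities, threshold):
    """Return every user whose similarity exceeds the threshold."""
    return [user for user, s in similarities.items() if s > threshold]

# Similarities of the other users to User1, from the example above.
sims_to_u1 = {"U2": 0.97, "U3": 0.88}

print(top_k_neighbors(sims_to_u1, k=1))       # ['U2']
print(threshold_neighbors(sims_to_u1, 0.9))   # ['U2']
```

Top-K always returns a fixed number of neighbors, even weak ones, while a threshold may return none; which trade-off is preferable depends on how sparse the data is.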

4. Predict Missing Ratings


The predicted rating is computed using the weighted average of
neighbor ratings.
$$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)\,(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$$

Where
• R_ui = predicted rating
• R̄_u = average rating of user u
• sim(u, v) = similarity between users
• N(u) = set of nearest neighbors
5. Generate Recommendations
After predicting ratings, the system recommends items with the highest
predicted scores.
Example:
Item Predicted Rating
Movie B 3.2
Movie C 4.6
Thus Movie C will be recommended.

6. Types of Neighborhood Methods


User-Based Collaborative Filtering
• Finds similar users
• Uses their ratings to predict unknown ratings.
Example workflow:
User → Similar Users → Rating Prediction → Recommendation

Item-Based Collaborative Filtering


• Finds similar items
• Recommends items similar to those the user already liked.
Example workflow:
Item → Similar Items → Rating Prediction → Recommendation
Neighborhood methods work by computing similarity between users or
items, selecting the nearest neighbors, predicting missing ratings
using weighted averages, and recommending items with the highest
predicted ratings. These methods are widely used in recommender
systems such as Netflix, Amazon, and Spotify.

b) Components of Neighborhood Methods in Recommender Systems

Neighborhood methods are memory-based collaborative filtering
approaches that recommend items by identifying similar users or
similar items and using their preferences to predict unknown ratings. The
effectiveness of neighborhood methods depends on several key
components.

1. User–Item Rating Matrix


The user–item rating matrix is the fundamental component of
neighborhood-based recommendation systems. It represents the
interaction between users and items.
Example:
User / Item | Item1 | Item2 | Item3 | Item4
User1 | 5 | ? | 3 | 4
User2 | 4 | 2 | ? | 5
User3 | 3 | 4 | 5 | ?
Characteristics
• Rows represent users.
• Columns represent items.
• Each cell contains the rating given by a user to an item.
• Missing entries represent unknown ratings to be predicted.
Importance
• Forms the input data for recommendation algorithms.
• Often sparse because users rate only a few items.

2. Similarity Computation
Similarity measures determine how closely two users or two items are
related. It is a crucial step in identifying neighbors.
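Cosine Similarity

$$sim(u, v) = \frac{\sum_i r_{ui}\, r_{vi}}{\sqrt{\sum_i r_{ui}^2}\,\sqrt{\sum_i r_{vi}^2}}$$

Where
• sim(u, v) – Similarity between user u and user v
• r_ui – Rating given by user u to item i
• r_vi – Rating given by user v to item i
• Σ_i – Summation over all items rated by both users

This computes the cosine of the angle between two rating vectors.

Pearson Correlation

$$sim(u, v) = \frac{\sum_i (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_i (r_{ui} - \bar{r}_u)^2}\,\sqrt{\sum_i (r_{vi} - \bar{r}_v)^2}}$$

Where
• sim(u, v) – Similarity between user u and user v
• r_ui – Rating given by user u for item i
• r_vi – Rating given by user v for item i
• r̄_u – Average rating of user u
• r̄_v – Average rating of user v
• Σ_i – Summation over all commonly rated items

• Accounts for rating scale differences between users.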

Jaccard Similarity (Implicit Feedback)

$$sim(i, j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}$$

Used when data consists of clicks, views, or purchases instead of ratings.

3. Neighborhood Formation
After computing similarity values, the system selects the nearest
neighbors.
Types
• Top-K Neighbors – choose the K most similar users/items.
• Threshold-Based Neighbors – choose neighbors whose similarity
exceeds a threshold.
Example:
User Pair Similarity
U1 – U2 0.97
U1 – U3 0.88
Thus User2 becomes the nearest neighbor of User1.
Importance
• Reduces computation.
• Improves recommendation accuracy.

4. Prediction Function
The prediction function estimates unknown ratings using the ratings of
neighbors.
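Rating Prediction Formula

$$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)\,(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$$

Where:
• R_ui = predicted rating of user u for item i
• R̄_u = average rating of user u
• sim(u, v) = similarity between users
• N(u) = neighborhood of user u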
Purpose
• Calculates the expected rating for an unrated item.

5. Recommendation Generation
Once ratings are predicted, the system recommends items with the highest
predicted ratings.
Example:
Item Predicted Rating
Movie B 3.2
Movie C 4.5
Thus the system recommends Movie C.

6. Neighborhood Model Types


Neighborhood methods operate in two main ways:
User-Based Neighborhood
• Finds similar users.
• Recommendations come from ratings of similar users.
Example:
User1 → similar users → predict ratings.

Item-Based Neighborhood
• Finds similar items.
• Recommends items similar to those already liked by the user.
Example:
Item similarity → recommend related items.

7. Data Preprocessing and Normalization


Before computing similarity, ratings are often normalized.
Example (mean-centering):

$$r'_{ui} = r_{ui} - \bar{r}_u$$

Normalization removes user rating bias.
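
A minimal Python sketch of mean-centering for one user is shown below. The ratings are User1's row from the example matrix; the function name is illustrative.

```python
def mean_center(user_ratings):
    """Subtract the user's mean rating from each of that user's ratings."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {item: rating - mean for item, rating in user_ratings.items()}

# User1 rated Item1 = 5, Item3 = 3, Item4 = 4, so the mean is 4.
print(mean_center({"Item1": 5, "Item3": 3, "Item4": 4}))
# {'Item1': 1.0, 'Item3': -1.0, 'Item4': 0.0}
```

A positive value now means "liked more than this user's average", which makes ratings from generous and strict raters comparable.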

8. Model Evaluation
Evaluation measures the quality of recommendations.
Common metrics include:
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
• Precision and Recall
These metrics help determine prediction accuracy.
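
As an illustration of the two error metrics, the sketch below computes MAE and RMSE on a small made-up set of held-out ratings. The numbers are hypothetical and only serve to show the arithmetic.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the average squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical held-out ratings and the system's predictions for them.
actual = [4, 3, 5]
predicted = [3.5, 3, 4]

print(mae(actual, predicted))             # (0.5 + 0 + 1) / 3 = 0.5
print(round(rmse(actual, predicted), 3))  # sqrt(1.25 / 3) ≈ 0.645
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging.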
Neighborhood methods rely on several key components including rating
matrices, similarity measures, neighbor selection, prediction
functions, and recommendation generation. By leveraging similarities
between users or items, these methods effectively predict missing ratings
and provide personalized recommendations.

13. Consider that you are a data scientist in Amazon. Your team is tasked with
designing a hybrid recommendation engine that personalizes product
suggestions based on browsing history, purchase history, and textual
reviews. Develop the system architecture and explain the mathematical
models used such as collaborative filtering, content-based filtering.
Discuss challenges in scalability and cold start and propose solutions.
(April-May-2025)
Hybrid Recommendation Engine for Amazon (16 Marks)
As a data scientist in Amazon, the goal is to design a hybrid
recommendation system that suggests products by combining browsing
history, purchase history, and textual reviews. Hybrid systems combine
collaborative filtering and content-based filtering to improve
recommendation accuracy.

1. System Architecture of Hybrid Recommendation Engine


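Architecture Flow
Data Sources → Data Storage → Feature Engineering → Recommendation Models
(Collaborative Filtering + Content-Based Filtering) → Hybrid Recommendation
Layer → Ranking and Recommendation
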
Components of Architecture
1. Data Sources
• Browsing history (clicked products)
• Purchase history
• Product ratings
• Product metadata (category, price, brand)
• Textual reviews
2. Data Storage
• User database
• Product catalog database
• Interaction logs
3. Feature Engineering
Extract useful features such as:
• User preferences
• Product attributes
• Review sentiment scores
4. Recommendation Models
Two main models are used:
• Collaborative Filtering
• Content-Based Filtering
5. Hybrid Recommendation Layer
Combines predictions from both models.
6. Ranking and Recommendation
Products are ranked based on predicted scores and recommended to
users.

2. Collaborative Filtering Model


Collaborative filtering recommends items based on similar users or
similar items.

Cosine Similarity
$$sim(u, v) = \frac{\sum_i r_{ui}\, r_{vi}}{\sqrt{\sum_i r_{ui}^2}\,\sqrt{\sum_i r_{vi}^2}}$$

Where
• r_ui = rating of user u on item i
This finds similar users or items.

Rating Prediction Formula

$$R_{ui} = \bar{R}_u + \frac{\sum_{v \in N(u)} sim(u, v)\,(R_{vi} - \bar{R}_v)}{\sum_{v \in N(u)} |sim(u, v)|}$$

Where
• R_ui = predicted rating
• N(u) = nearest neighbors
This predicts the rating of a user for a product.

3. Content-Based Filtering Model


Content-based filtering recommends items based on product features
and user preferences.
Example features:
• Product category
• Brand
• Keywords from reviews
• Product description
TF-IDF Representation
$$TF\text{-}IDF(t, d) = TF(t, d) \times \log\left(\frac{N}{df(t)}\right)$$

Where
• 𝑇𝐹(𝑡, 𝑑)= term frequency
• 𝑑𝑓(𝑡)= document frequency
• 𝑁= number of documents
TF-IDF converts textual reviews into numerical vectors.
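
The TF-IDF formula can be sketched in plain Python as below. Several details are assumptions made for illustration: TF is taken as the term's relative frequency within the review, the logarithm is natural, tokenization is a simple lowercase whitespace split, and the three reviews are invented.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using TF(t, d) * log(N / df(t))."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # df(t): number of documents that contain term t at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical product reviews.
reviews = ["great battery life", "battery died fast", "great screen"]
vectors = tf_idf(reviews)

# "life" appears in one review, "great" in two, so "life" gets the higher weight.
print({term: round(w, 3) for term, w in vectors[0].items()})
```

Terms that occur in every document get weight 0 because log(N / df) = log(1) = 0, which is how TF-IDF downweights uninformative words.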
Cosine Similarity for Product Features
$$sim(i, j) = \frac{\vec{f}_i \cdot \vec{f}_j}{\|\vec{f}_i\|\,\|\vec{f}_j\|}$$

Where
• 𝑓𝑖 = feature vector of product 𝑖
This finds similar products.
4. Hybrid Recommendation Strategy
The hybrid system combines both approaches.
Weighted Hybrid Model
𝑆𝑐𝑜𝑟𝑒(𝑢, 𝑖) = 𝛼 𝐶𝐹(𝑢, 𝑖) + (1 − 𝛼) 𝐶𝐵(𝑢, 𝑖)

Where
• CF(u, i) = collaborative filtering score
• CB(u, i) = content-based score
• α = weight parameter (0 ≤ α ≤ 1)
This improves accuracy and robustness.
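
A minimal sketch of the weighted hybrid score follows. The product IDs, the CF and CB scores, and the choice of α = 0.5 are all hypothetical; in practice α would be tuned on validation data.

```python
def hybrid_score(cf_score, cb_score, alpha=0.5):
    """Weighted hybrid: alpha * CF(u, i) + (1 - alpha) * CB(u, i)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * cf_score + (1 - alpha) * cb_score

# Hypothetical (CF score, CB score) pairs for two candidate products and one user.
candidates = {"P1": (4.2, 3.0), "P2": (3.5, 4.8)}

ranked = sorted(candidates,
                key=lambda p: hybrid_score(*candidates[p], alpha=0.5),
                reverse=True)
print(ranked)  # ['P2', 'P1']
```

Setting α close to 1 trusts collaborative filtering more, which suits users with rich history; a smaller α leans on content features, which helps for new items.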
5. Handling Textual Reviews
Natural Language Processing techniques are used:
Steps:
1. Text preprocessing
2. Tokenization
3. Stop-word removal
4. TF-IDF feature extraction
5. Sentiment analysis
This helps understand user opinions about products.
6. Scalability Challenges
Large e-commerce platforms like Amazon handle millions of users and
products.
Issues
• Large user-item matrix
• High computation cost
• Real-time recommendations
Solutions
• Distributed computing (Spark, Hadoop)
• Matrix factorization
• Approximate nearest neighbor search
• Incremental model updates
7. Cold Start Problem
Cold start occurs when there is insufficient data.
Types
1. New User Problem
User has no interaction history.
Solution
• Use browsing behavior
• Ask users to rate products
• Use demographic information

2. New Item Problem


New products have no ratings.
Solution
• Use product metadata
• Use content-based filtering
• Promote items initially

8. Advantages of Hybrid Recommendation Systems


• Higher recommendation accuracy
• Handles cold start better
• Combines strengths of multiple models
• Improves personalization

A hybrid recommendation engine combining collaborative filtering and
content-based filtering enables Amazon to deliver highly personalized
product recommendations using browsing data, purchase history, and
textual reviews. By addressing challenges such as scalability and cold
start, the system can provide efficient and accurate recommendations to
millions of users.

14. Explain the role of Similarity Weight Computation in neighborhood-based
recommendation systems with an example.
15. Explain how to calculate similarities using Mean Squared Difference
(MSD) with an example.
16. Explain how neighborhood selection is made in collaborative filtering-based RS.
17. Explain the security aspects to be considered in RS.