0% found this document useful (0 votes)

77 views66 pages

Recommender Systems Course Overview

The document outlines the course CCS360 on Recommender Systems at Arunachala College of Engineering for Women, detailing course outcomes, educational objectives, and program outcomes. It includes a list of experiments related to data similarity measures and dimensionality reduction techniques, along with coding examples in Python. Additionally, it presents the college's vision, mission, and guidelines for laboratory conduct.

Uploaded by

Sunitha Sekar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

77 views66 pages

Recommender Systems Course Overview

Uploaded by

Sunitha Sekar

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

ARUNACHALA COLLEGE OF ENGINEERING

FOR WOMEN, MANAVILAI

DEPARTMENT OF COMPUTER SCIENCE

AND ENGINEERING

CCS360 & RECOMMENDER SYSTEMS

(Regulations-2021)

Name of the Student: _______________________________

Register Number :_______________________________
Year &Branch :_______________________________
Semester :_______________________________

1
Subject Code & Name : CCS360 & RECOMMENDER SYSTEMS

Branch : CSE
Year/Semester : III / VI
Course Outcomes :
On the successful completion of the course, the students will be able to
Cos Knowledge Course Outcomes
Level
CO1 K1 Understand the basic concepts of recommender systems.
CO2 K3 Implement machine-learning and data-mining algorithms in
recommender systems data sets.
CO3 K3 Implementation of Collaborative Filtering and carrying out
performance evaluation of recommender systems based on
various metrics.
CO4 K3 Design and implement a simple recommender system.
CO5 K4 Learn about advanced topics of recommender systems.

2
COLLEGE VISION & MISSION STATEMENT
Vision
To incubate value based technical education and produce outstanding women graduate
to compete with the technological challenges with right attitude towards social
empowerment.
Mission
 To equip necessary resources and to establish sufficient infrastructure for a beneficial
process of learning that paves the way for making ideal technocrats.
 To educate and make the students efficient with necessary skills and to make them
industry ready engineers.
 To establish high-level learning and research skills to confront technological
scenarios.
 To provide valuable resources for social empowerment and lifelong learning process.

DEPARTMENT VISION & MISSION STATEMENT

Vision
To provide skill based technical education in the field of Computer Science with
evolving technologies and to produce employable individuals in the society.
Mission
 To create an environment for student-centric learning and impart quality technical
education for professionals.
 To provide training in latest technological trends and advancements in the world of
computing technology.
 To empower the students with required skills to solve the complex technical
problems.
 To enhance the creativity in research and to develop the competency of the students in
technological field.
PROGRAM EDUCATIONAL OBJECTIVES (PEO’s)
 To direct the students in analytical, design and implementation skills for solving
computational problems.
 To train the students to become a software professional with social responsibilities
and ethical values.
 To enable the graduates be effective team member and infuse leadership qualities.
PROGRAM OUTCOMES (PO'S)

3
1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of data, and
synthesis of the information to provide valid conclusions.
5. Modern tool usage:
Create, select, and apply appropriate techniques, resources, and modern engineering and IT
tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and need
for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give and
receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.

4
12. Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.
PROGRAM SPECIFIC OUTCOMES PROGRAM (PSO’S)
 Able to solve problems in the broad area of programming concepts, appraise
environmental and social issues with ethics and manage different projects.
 Apply the acquired knowledge to design and develop the computer software and
hardware.
 Create solutions by adapting emerging technologies for real time applications of
industry.

LIST OF EXPERIMENTS WITH COs, POs & PSOs

[Link] NAME OF EXPERIMENTS COs POs PSOs

1 Implement Data similarity measures using CO1 1,2,3,4,5,6,9,10,11, 1,2,3

5
Python 12
Implement dimension reduction techniques for CO1 1,2,3,4,5,6,9,10,11, 1,2,3
2
recommender systems 12
Implement user profile learning CO2 1,2,3,4,5,6,9,10,11, 1,2,3
3
12
Implement content-based recommendation CO4 1,2,3,4,5,6,9,10,11, 1,2,3
4
systems 12
Implement collaborative filter techniques CO3 1,2,3,4,5,6,9,10,11, 1,2,3
5
12
Create an attack for tampering with CO5 1,2,3,4,5,6,9,10,11, 1,2,3
6
recommender systems 12
Implement accuracy metrics like Receiver CO3 1,2,3,4,5,6,9,10,11, 1,2,3
7
Operated Characteristic curves 12
ADVANCED EXPERIMENTS
Build a Movie Recommendation System CO3 1,2,3,4,5,6,9,10,11, 1,2,3
8
12
Restaurant Recommendation System CO4 1,2,3,4,5,6,9,10,11, 1,2,3
9
12
ADDITIONAL EXPERMINETS
Write a program to pre-process the dataset for CO1 1,2,3,4,5,6,9,10,11, 1,2,3
10
analysis. 12
Write a program to visualizing the Ratings in CO1 1,2,3,4,5,6,9,10,11, 1,2,3
11
the Data Set 12

INSTRUCTIONS TO STUDENTS

DOs:

6
 Always sit on assigned computer.
 Enter laboratory in time and work quitely.
 Use the computer properly to keep it in good working condition.
 Wear id cards and lab coats before entering the laboratory.
 Report the problems identifies in the computer to the staff in charge.
 Shut down the computer properly before leaving the lab.

DON’Ts:

 Do not eat in the lab.

 D not wander around the room and distract other students.
 Do not remove anything from the computer laboratory without permission.
 Avoid stepping on electrical wires or any other computer cables
 Do not change the computer settings.
 Unauthorized access to any software is prohibited.
 Do not connect pen drive or any other storing devices.
 Do not download any software from unauthorized sites.
 Do not send any emails to unauthorized persons without permission.

INDEX
Sl.
No. Date Topic Page No Signature

7
Ex No: 1 DATA SIMILARITY MEASURES USING PYTHON
Date:

Aim:

8
Implement Python program to calculate Data similarity measures.
Similarity Measures
The similarity measure is the measure of how much alike two data objects are. A
similarity measure is a data mining or machine learning context is a distance with dimensions
representing features of the objects. If the distance is small, the features are having a high
degree of similarity. Whereas a large distance will be a low degree of similarity.
Similarity measure usage is more in the text related preprocessing techniques, Also
the similarity concepts used in advanced word embedding techniques. For example, two fruits
are similar because of color or size or taste.
Generally, similarities are measured in the range 0 to 1 [0,1]. In the machine learning
world, this score in the range of [0, 1] is called the similarity score. Two main considerations
of similarity:
Similarity = 1 if X = Y (Where X, Y are two objects)
Similarity = 0 if X ≠ Y
That’s all about similarity let’s drive to five most popular similarity distance measures.
1. Euclidean distance
Euclidean distance is the most common use of distance measure. In most cases when
people say about distance, they will refer to Euclidean distance. Euclidean distance is also
known as simply distance. The Euclidean distance between two points is the length of the
path connecting them. The Pythagorean theorem gives this distance between two points.

Program:
from math import*
def euclidean_distance(x,y):
return sqrt(sum(pow(a-b,2) for a, b in zip(x, y)))
print (euclidean_distance([0,3,4,5],[7,6,3,-1]))

9
Output:
9.746794344808963
2. Manhattan distance:
Manhattan distance is a metric in which the distance between two points is calculated as the
sum of the absolute differences of their Cartesian coordinates. In a simple way of saying it is
the total sum of the difference between the x-coordinates and y-coordinates.

In a plane with p1 at (x1, y1) and p2 at (x2, y2).

Manhattan distance = |x1 – x2| + |y1 – y2|
This Manhattan distance metric is also known as Manhattan length, rectilinear distance, L1
distance or L1 norm, city block distance, Minkowski’s L1 distance, taxi-cab metric, or city
block distance.
Program:
from math import*
def manhattan_distance(x,y):
return sum(abs(a-b) for a,b in zip(x,y))
print (manhattan_distance([10,20,10],[10,20,20]))
Output:
10
3. Minkowski distance
The Minkowski distance is a generalized metric form of Euclidean distance and Manhattan
distance.

10
In the equation, d^MKD is the Minkowski distance between the data record i and j, k the
index of a variable, n the total number of variables in y and λ the order of the Minkowski
metric. Although it is defined for any λ > 0, it is rarely used for values other than 1, 2, and ∞.
Synonyms of Minkowski
Different names for the Minkowski distance or Minkowski metric arise from the order:
 λ = 1 is the Manhattan distance. Synonyms are L1-Norm, Taxicab, or City-Block
distance. For two vectors of ranked ordinal variables, the Manhattan distance is
sometimes called Foot-ruler distance.
 λ = 2 is the Euclidean distance. Synonyms are L2-Norm or Ruler distance. For two
vectors of ranked ordinal variables, the Euclidean distance is sometimes called Spear-
man distance.
 λ = ∞ is the Chebyshev distance. Synonyms are Lmax-Norm or Chessboard distance.

Program:
from math import*
from decimal import Decimal

def nth_root(value, n_root):

root_value = 1/float(n_root)
return round (Decimal(value) ** Decimal(root_value),3)

def minkowski_distance(x,y,p_value):
return nth_root(sum(pow(abs(a-b),p_value) for a,b in zip(x, y)),p_value)

print (minkowski_distance([0,3,4,5],[7,6,3,-1],3))
Output:
8.373

4. Cosine Similarity

11
The cosine similarity metric finds the normalized dot product of the two attributes. By
determining the cosine similarity, we would effectively try to find the cosine of the angle
between the two objects. The cosine of 0° is 1, and it is less than 1 for any other angle.
It is thus a judgment of orientation and not magnitude. Two vectors with the same
orientation have a cosine similarity of 1, two vectors at 90° have a similarity of 0. Whereas
two vectors diametrically opposed having a similarity of -1, independent of their magnitude.
Cosine similarity is particularly used in positive space, where the outcome is neatly
bounded in [0,1]. One of the reasons for the popularity of cosine similarity is that it is very
efficient to evaluate, especially for sparse vectors.

Program:
from math import*

def square_rooted(x):
return round(sqrt(sum([a*a for a in x])),3)

def cosine_similarity(x,y):
numerator = sum(a*b for a,b in zip(x,y))
denominator = square_rooted(x)*square_rooted(y)
return round(numerator/float(denominator),3)

print (cosine_similarity([3, 45, 7, 2], [2, 54, 13, 15]))

Output:
0.972
5. Jaccard similarity:
Jaccard similarity finds similarity between sets.
Sets - A set is (unordered) collection of objects {a,b,c}. we use the notation as elements
separated by commas inside curly brackets { }. They are unordered so {a,b} = { b,a }.
Cardinality - The cardinality of A denoted by |A| which counts how many elements are in A.

12
Intersection - The intersection between two sets A and B is denoted A ∩ B and reveals all
items which are in both sets A, B.
Union - The union between two sets A and B is denoted A ∪ B and reveals all items which
are in either set.

The Jaccard similarity measures the similarity between finite sample sets and is
defined as the cardinality of the intersection of sets divided by the cardinality of the union of

the ration of the cardinality of A ∩ B and A ∪ B.

the sample sets. Suppose you want to find Jaccard similarity between two sets A and B it is

= 2 / 7 = 0.286
Program:
from math import*

def jaccard_similarity(x,y):
intersection_cardinality = len([Link](*[set(x), set(y)]))
union_cardinality = len([Link](*[set(x), set(y)]))
return intersection_cardinality/float(union_cardinality)

print (jaccard_similarity([0,1,2,5,6],[0,2,3,5,7,9]))
Output:
0.375

Reference: [Link]
in-python/
Result:
Thus, the Python program to calculate Data similarity measures were implemented
successfully.

13
Ex No: 2 DIMENSIONALITY REDUCTION
Date:

Aim:
Implement Python program to demonstrate Dimensionality reduction techniques.
Missing Value Ratio
What if we have too many missing values (say more than 50%)? Should we impute
the missing values or drop the variable? A better option is to drop the variable since it will not
have much information. However, this isn’t set in stone. We can set a threshold value and if
the percentage of missing values in any variable is more than that threshold, we will drop the
variable.
First download the csv file from the link given below and then upload the csv file into
the “Colab Notebook” folder in google drive.
[Link]
Train_UWu5bXk.csv
Program:
import pandas as pd
import numpy as np
import [Link] as plt
from [Link] import drive
[Link]('/content/gdrive')

train=pd.read_csv("/content/gdrive/MyDrive/Colab
Notebooks/Train_UWu5bXk.csv")
# checking the percentage of missing values in each variable
a = [Link]().sum()/len(train)*100
print(a)
Output:
Item_Identifier 0.000000 Outlet_Identifier 0.000000
Item_Weight 17.165317 Outlet_Establishment_Year 0.000000
Item_Fat_Content 0.000000 Outlet_Size 28.276428
Item_Visibility 0.000000 Outlet_Location_Type 0.000000
Item_Type 0.000000 Outlet_Type 0.000000
Item_MRP 0.000000 Item_Outlet_Sales 0.000000
dtype: float64

As you can see in the above table, there aren’t too many missing values (just 2
variables have them actually). We can impute the values using appropriate methods, or we
can set a threshold of, say 20%, and remove the variable having more than 20% missing
values.
Program:
# saving in ‘variable’ after dropping columns have > 20% data is missing
variables = [Link]
14
variable = [ ]
for i in range(1,12):
if a[i]<=20: #setting the threshold as 20%
[Link](variables[i])
print(variables)
print(variable)
Output:
Index(['Item_Identifier', 'Item_Weight', 'Item_Fat_Content', 'Item_Visibility',
'Item_Type', 'Item_MRP', 'Outlet_Identifier',
'Outlet_Establishment_Year', 'Outlet_Size', 'Outlet_Location_Type',
'Outlet_Type', 'Item_Outlet_Sales'],
dtype='object')

['Item_Weight', 'Item_Fat_Content', 'Item_Visibility', 'Item_Type', 'Item_MRP',

'Outlet_Identifier', 'Outlet_Establishment_Year', 'Outlet_Location_Type', 'Outlet_Type',
'Item_Outlet_Sales']
Low Variance Filter
Consider a variable in our dataset where all the observations have the same value, say
1. If we use this variable, do you think it can improve the model we will build? The answer is
no, because this variable will have zero variance. So, we need to calculate the variance of
each variable we are given. Then drop the variables having low variance as compared to other
variables in our dataset.
Let’s first impute the missing values in the Item_Weight column using the median
value of the known Item_Weight observations. For the Outlet_Size column, we will use the
mode of the known Outlet_Size values to impute the missing values. Then check whether all
the missing values have been filled:
Program:
train['Item_Weight'].fillna(train['Item_Weight'].median(), inplace=True)
train['Outlet_Size'].fillna(train['Outlet_Size'].mode()[0], inplace=True)
a = [Link]().sum()/len(train)*100
print(a)
Output:
Item_Identifier 0.0 Outlet_Identifier 0.0
Item_Weight 0.0 Outlet_Establishment_Year 0.0
Item_Fat_Content 0.0 Outlet_Size 0.0
Item_Visibility 0.0 Outlet_Location_Type 0.0
Item_Type 0.0 Outlet_Type 0.0
Item_MRP 0.0 Item_Outlet_Sales 0.0
dtype: float64

As the above output shows, the variance of Item_Visibility is very less as compared to
the other variables. We can safely drop this column. This is how we apply low variance filter.

15
Program:
numeric = train[['Item_Weight', 'Item_Visibility', 'Item_MRP',
'Outlet_Establishment_Year']]
var = [Link]()
print(“Variance : \n”, var)
numeric = [Link]
variable = [ ]
for i in range(len(var)):
if var[i]>=10: #setting the threshold as 10%
[Link](numeric[i])
print(numeric)
print(variable)
Output:
Variance:
Item_Weight 17.869561
Item_Visibility 0.002662
Item_MRP 3878.183909
Outlet_Establishment_Year 70.086372
dtype: float64
Index(['Item_Weight', 'Item_Visibility', 'Item_MRP', 'Outlet_Establishment_Year'],
dtype='object')
['Item_Weight', 'Item_MRP', 'Outlet_Establishment_Year']

Random Forest
Random Forest is one of the most widely used algorithms for feature selection. We
need to convert the data into numeric form by applying one hot encoding, as Random Forest
(Scikit-Learn Implementation) takes only numeric inputs. We can pick the top-most three
features to reduce the dimensionality in our dataset.
Program:
from [Link] import RandomForestRegressor

# Drop the columns with missing values and string data types
df=train[['Item_Visibility', 'Item_MRP', 'Outlet_Establishment_Year']]
[Link]()

model = RandomForestRegressor(random_state=1, max_depth=10)

[Link](df,train.Item_Outlet_Sales)

features = [Link]
importances = model.feature_importances_
indices = [Link](importances)[-3:] # top 3 features
[Link]('Feature Importances')
[Link](range(len(indices)), importances[indices], color='b', align='center')
[Link](range(len(indices)), [features[i] for i in indices])
[Link]('Relative Importance')
[Link]()
16
Output:

Reference: [Link]
techniques-python/
Factor Analysis
Factor analysis is a dimensionality reduction technique commonly used in statistics. It
is an unsupervised machine-learning technique. It uses the user generated biochemist dataset
and performs a FA that creates analysis between two components. There are two types of
factor analysis
1. Exploratory Factor Analysis - It is used to find structures among a set of attributes.
The number of factors/components is not specified on hand by the researchers or the
scientists. The overall values need to be derived as well.
2. Confirmatory Factor Analysis - It is used for ground-level hypotheses and is based
on existing theories or concepts. Here, the researchers already have an expected
(hypothesized) structure of the data. So the purpose of CFA is to determine the extent
to which the proven data fits the expected data.
Applications of Factor Analysis
1. To reduce the number of variables used to analyze data
2. To detect the structure of the relationship between two set of variables.
First create the bioChemist dataset in MS Excel with 15 rows and upload in drive.
index art sex mar kids phd mentor
1 0 women single 0 2 6
2 0 women single 0 4 6
3 0 men married 1 2 3
4 0 women single 0 4 26
5 0 women married 2 4 2
6 0 women married 0 4 3
7 0 men married 2 4 4
8 0 men single 0 3 6
9 0 women married 0 5 0
10 0 men single 0 2 14
11 0 women single 0 3 13
12 0 women married 1 1 3
13 0 women single 0 4 4
14 0 men married 0 4 0
15 0 women single 2 2 2

17
Program:
import numpy as np
import pandas as pd
from [Link] import FactorAnalysis
import [Link] as plt

from [Link] import drive

[Link]('/content/gdrive')

df =pd.read_csv("/content/gdrive/MyDrive/Colab
Notebooks/[Link]")
df = [Link][1:15]
print(df)

x = df[['art', 'kids', 'phd', 'mentor']]

print(x)

fact_2c = FactorAnalysis(n_components = 2)
x_factor = fact_2c.fit_transform(x)

thisdict = {"single" : "0" , "married" : "1"}

z = [Link]([Link](thisdict), dtype = int)
colors = [Link](["blue", "purple"])
print(z)

Output:
[0 1 0 1 1 1 0 1 0 0 1 0 1 0]

18
Program:
[Link]('Marital Status: Single - Blue & Married - Purple')
[Link]("Factor 1")
[Link]("Factor 2")
[Link](x_factor[:,0], x_factor[:,1], c = colors[z])

Output:

Reference: [Link]
Principal Component Analysis (PCA)
In this article, we will cluster the wine datasets with K-Means Clustering and
visualize them after dimensionality reductions with PCA. K-Means Clustering is an
unsupervised learning algorithm that tries to cluster data into K number of clusters based on
their similarity.
In k means clustering, we specify the number of clusters we want the data to be
grouped into. The algorithm randomly assigns each observation to a set and finds the centroid
of each set. Then, the algorithm iterates through two steps: Reassign data points to the cluster
whose centroid is closest. Calculate the new centroid of each cluster. These two steps are
repeated until the within-cluster variation cannot be reduced further. The within-cluster
deviation is calculated as the sum of the Euclidean distance between the data points and their
respective cluster centroids.
[Link] data set is the result of a chemical analysis of wines grown in the same
region in Italy but derived from three different cultivars. The analysis determined the
quantities of 13 constituents found in each of the three types of wines.
Program:
import pandas as pd
import seaborn as sns
import [Link] as plt
from [Link] import StandardScaler
from [Link] import load_wine
from [Link] import KMeans
from [Link] import PCA

19
df = load_wine(as_frame=True)
df = [Link]
[Link]('target', axis =1, inplace=True)
[Link]()
Output:
alcohol malic_acid ash alcalinity magnesium phenols flavanoid nonflavanoid pro intensity hue od280 proline

14.23 1.71 2.43 15.6 127.0 2.80 3.06 0.28 2.29 5.64 1.04 3.92 1065.0

13.20 1.78 2.14 11.2 100.0 2.65 2.76 0.26 1.28 4.38 1.05 3.40 1050.0

13.16 2.36 2.67 18.6 101.0 2.80 3.24 0.30 2.81 5.68 1.03 3.17 1185.0

14.37 1.95 2.50 16.8 113.0 3.85 3.49 0.24 2.18 7.80 0.86 3.45 1480.0

13.24 2.59 2.87 21.0 118.0 2.80 2.69 0.39 1.82 4.32 1.04 2.93 735.0

Program:
scaler =StandardScaler()
features =[Link](df)
features =[Link](df)
# Convert to pandas Dataframe
scaled_df =[Link](features,columns=[Link])
# Print the scaled data
scaled_df.head(2)

X=scaled_df.values
wcss = {}
for i in range(1, 11):
kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42)
[Link](X)
wcss[i] = kmeans.inertia_

[Link]([Link](), [Link](), 'gs-')

[Link]("Values of 'k'")
[Link]('WCSS')
[Link]()
kmeans=KMeans(n_clusters=3)
[Link](X)

pca=PCA(n_components=2)
reduced_X=[Link](data=pca.fit_transform(X),columns=['PCA1','PCA2'])
reduced_X.head()
20
centers=[Link](kmeans.cluster_centers_)
[Link](figsize=(7,5))

# Scatter plot
[Link](reduced_X['PCA1'],reduced_X['PCA2'],c=kmeans.labels_)
[Link](centers[:,0],centers[:,1],marker='x',s=100,c='red')
[Link]('PCA1')
[Link]('PCA2')
[Link]('Wine Cluster')
plt.tight_layout()
Output:

Reference: [Link]
Singular Value Decomposition (SVD)
Singular Value Decomposition is very vastly used in the field of computation
engineering and machine learning for feature extraction, linear regression problems with least
squares, dimension reduction, etc.
According to wikipedia "In linear algebra, the singular value decomposition (SVD) is
a factorization of a real or complex matrix. It generalizes the eigendecomposition of a square
normal matrix with an orthonormal eigenbasis to any matrix. It is related to the polar
decomposition".
Program:
import requests
import cv2
import numpy as np
import [Link] as plt
from [Link] import imread, imshow
from [Link] import drive
[Link]('/content/gdrive')

21
gray_image = imread("/content/gdrive/MyDrive/Colab Notebooks/[Link]",
as_gray=True)

# Calculating the SVD

u, s, v = [Link](gray_image, full_matrices=False)

# inspect shapes of the matrices

print(f'[Link]:{[Link]},[Link]:{[Link]},[Link]:{[Link]}')

Output:
[Link]:(180, 180),[Link]:(180,),[Link]:(180, 240)

Program:
# plot images with different number of components
comps = [180, 1, 5, 10, 15, 20]
[Link](figsize=(12, 6))

for i in range(len(comps)):
low_rank = u[:, :comps[i]] @ [Link](s[:comps[i]]) @ v[:comps[i], :]
if(i == 0):
[Link](2, 3, i+1),
[Link](low_rank, cmap='gray'),
[Link](f'Actual Image with n_components = {comps[i]}')
else:
[Link](2, 3, i+1),
[Link](low_rank, cmap='gray'),
[Link](f'n_components = {comps[i]}')

References:
[Link]
[Link]

Result:
Thus the Python programs to demonstrate Dimensionality reduction techniques were
implemented successfully.

22
Ex No: 3 USER PROFILE LEARNING
Date:

Aim:
To write Python programs to implement different User Profile Learning techniques.
Naive Bayes Classifier
Naive Bayes is a statistical classification technique based on Bayes Theorem. It is one
of the simplest supervised learning algorithms. Naive Bayes classifier is the fast, accurate and
reliable algorithm. Naive Bayes classifiers have high accuracy and speed on large datasets.

Naive Bayes classifier assumes that the effect of a particular feature in a class is
independent of other features. For example, a loan applicant is desirable or not depending on
his/her income, previous loan and transaction history, age, and location. Even if these features
are interdependent, these features are still considered independently. This assumption
simplifies computation, and that's why it is considered as naive. This assumption is called
class conditional independence.
Naive Bayes classifier calculates the probability of an event in the following steps:
Step 1: Calculate the prior probability for given class labels
Step 2: Find Likelihood probability with each attribute for each class
Step 3: Put this value in Bayes Formula and calculate posterior probability.
Step 4: See which class has a higher probability, given the input belongs to the higher
probability class.
Download loan_data.csv dataset from [Link]
data and upload in Google drive.
Program:
import pandas as pd
import seaborn as sns
import [Link] as plt

from [Link] import drive

[Link]('/content/gdrive')

df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/loan_data.csv')
[Link]()

Output:
credit purpose [Link] install log dti fico days bal util inq delinq [Link]
[Link]

23
1 d e b t _ c o n s o l i d a t i o n 0 . 11 8 9 829.10 11 . 3 5 0 4 0 7 1 9 . 4 8 737 5639.958333 28854 52.1 0 0 0
0
1 credit_card 0.1071 228.22 11 . 0 8 2 1 4 3 1 4 . 2 9 707 2760.000000 33623 76.7 0 0 0 0
1 debt_consolidation 0.1357 366.86 1 0 . 3 7 3 4 9 1 11 . 6 3 682 4710.000000 3 5 11 25.6 1 0 0 0
1 debt_consolidation 0.1008 162.34 11 . 3 5 0 4 0 7 8 . 1 0 7 1 2 2699.958333 33667 73.2 1 0 0 0
1 credit_card 0.1426 102.92 11 . 2 9 9 7 3 2 1 4 . 9 7 667 4066.000000 4740 39.5 0 1 0 0
[Link](data=df,x='purpose',hue='[Link]')
[Link](rotation=45, ha='right');

pre_df = pd.get_dummies(df,columns=['purpose'],drop_first=True)
pre_df.head()

from sklearn.model_selection import train_test_split

X = pre_df.drop('[Link]', axis=1)
y = pre_df['[Link]']

X_train, X_test, y_train, y_test = train_test_split(

X, y, test_size=0.33, random_state=125)

from sklearn.naive_bayes import GaussianNB

model = GaussianNB()

[Link](X_train, y_train);

from [Link] import (

accuracy_score,
confusion_matrix,
ConfusionMatrixDisplay,
f1_score,
classification_report,
)

y_pred = [Link](X_test)

accuray = accuracy_score(y_pred, y_test)

f1 = f1_score(y_pred, y_test, average="weighted")

print("Accuracy:", accuray)
print("F1 Score:", f1)

labels = ["Fully Paid", "Not fully Paid"]

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
[Link]()

24
Output:

Reference: [Link]
Rule Based Classifier
Rule-based classifiers are just another type of classifier which makes the class decision
depending by using various “if..else” rules. These rules are easily interpretable and thus these
classifiers are generally used to generate descriptive models. The condition used with “if” is
called the antecedent and the predicted class of each rule is called the consequent.
Properties of rule-based classifiers:
1. The percentage of records which satisfy the antecedent conditions of a particular rule.
2. The rules generated by the rule-based classifiers are generally not mutually exclusive,
i.e. many rules can cover the same record.
3. The rules generated by the rule-based classifiers may not be exhaustive, i.e. there may
be some records which are not covered by any of the rules.
4. The decision boundaries created by them is linear, but these can be much more
complex than the decision tree because the many rules are triggered for the same
record.
You can get [Link] data from this link [Link]
classification/tree/main/data
Program:
import pandas as pd
from [Link] import drive
[Link]('/content/gdrive')

df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/[Link]')
[Link]()

Output:
PRICE SOURCE SEX COUNTRY AGE
0 39 android male bra 17
1 39 android male bra 17
2 49 android male bra 17
3 29 android male tur 17

25
4 49 android male tur 17

Program:
[Link]().[Link]() # Returns any value is missing in DataFrame
[Link]().sum()

Output:
PRICE 0
SOURCE 0
SEX 0
COUNTRY 0
AGE 0
dtype: int64

Program:
df["SOURCE"].nunique() # Count number of distinct SOURCE elements
df["SOURCE"].value_counts()# Returns counts of SOURCE rows
df["COUNTRY"].value_counts() # Returns counts of COUNTRY rows

Output:
usa 2065
bra 1496
deu 455
tur 451
fra 303
can 230
Name: COUNTRY, dtype: int64

Program:
# Country breakdown of income averages
[Link]("COUNTRY")["PRICE"].agg({"mean"})

Output:
mean
COUNTRY
bra 34.327540 fra 33.587459
can 33.608696 tur 34.787140
deu 34.032967 usa 34.007264

Program:
# Country and Source breakdown of income averages
[Link](["COUNTRY", 'SOURCE'])["PRICE"].mean()

Output:
COUNTRY SOURCE
bra android 34.387029 ios 34.268817
ios 34.222222 fra android 34.312500
can android 33.330709 ios 32.776224
ios 33.951456 tur android 36.229437
deu android 33.869888 ios 33.272727

26
usa android 33.760357 ios 34.371703
Name: PRICE, dtype: float64

Program:
# Average income on the basis of variables
agg_df = [Link](["COUNTRY", 'SOURCE', "SEX", "AGE"])
["PRICE"].mean().sort_values(ascending=False)
agg_df.head()

Output:
COUNTRY SOURCE SEX AGE
bra android male 46 59.0 usa ios male 32 54.0
usa android male 36 59.0 deu android female 36 49.0
fra android female 24 59.0
Name: PRICE, dtype: float64

Program:
# Convert the index names to variable names
agg_df = agg_df.reset_index()
agg_df.head()

Output:
COUNTRY SOURCE SEX AGE PRICE
0 bra android male 46 59.0
1 usa android male 36 59.0
2 fra android female24 59.0
3 usa ios male 32 54.0
4 deu android female36 49.0

Program:
# Convert AGE variable to categorical variable and adding it to agg_df
my_labels = ['0_18', '19_23', '24_30', '31_40', '41_70']
agg_df["AGE_CUT"] = [Link](x=agg_df["AGE"], bins=[0, 18, 23, 30, 40,
70], labels=my_labels)
agg_df.tail(10)

Output:
COUNTRY SOURCE SEX AGE PRICE AGE_CUT
338 bra android male 23 21.5 19_23
339 tur android male 21 19.0 19_23
340 tur ios male 47 19.0 41_70
341 bra ios female 34 19.0 31_40
342 bra ios male 47 19.0 41_70
343 usa ios female 38 19.0 31_40
344 usa ios female 30 19.0 24_30
345 can android female 27 19.0 24_30
346 fra android male 18 19.0 0_18
347 deu android male 26 9.0 24_30

Program:
# Identify new level-based customers (Personas)

27
agg_df["customers_level_based"] = [f"{i[0]}_{i[1]}_{i[2]}_{i[-1]}" for i in
agg_df.values]
agg_df["customers_level_based"].head()

Output:
0 bra_android_male_41_70 3 usa_ios_male_31_40
1 usa_android_male_31_40 4 deu_android_female_31_40
2 fra_android_female_24_30
Name: customers_level_based, dtype: object

Program:
# Segment new customers (Personas)
agg_df["SEGMENT"] = [Link](agg_df["PRICE"], 4, labels=["D", "C", "B",
"A"])
agg_df.head()

Output:
COU SOURCE SEX AGE PRICE AGE_CUT customers_level
SEGMENT
bra android male 46 59.0 41_70 bra_android_male_41_70 A
usa android male 36 59.0 31_40 usa_android_male_31_40 A
fra android female24 59.0 24_30 fra_android_female_24_30 A
usa ios male 32 54.0 31_40 usa_ios_male_31_40 A
deu android female36 49.0 31_40 deu_android_female_31_40 A

Program:
# Describe the segments and especially "C"
agg_df.groupby(["SEGMENT"]).agg({"PRICE": ["mean", "max", "sum"]})
agg_df[agg_df["SEGMENT"] == "C"].describe()

Output:
AGE PRICE
count 95.000000 95.000000 25% 19.000000 32.333333
mean 26.663158 32.933339 50% 24.000000 32.913043
std 10.075893 0.877933 75% 32.000000 33.861004
min 15.000000 31.173913 max 54.000000 34.000000
Program:
new_user = "fra_android_male_24_30"
print(agg_df[agg_df["customers_level_based"] == new_user])

Output:
COU SOURCE SEX AGE PRICE AGE_CUT customers_level SEGMENT
fra android male 25 33.0 24_30 fra_android_male_24_30 C

Reference: [Link]
classification-problem-6088c0e405d4
Result:
The Python programs to implement different User Profile Learning techniques was
implemented successfully.

28
Ex No: 4 CONTENT-BASED RECOMMENDATION SYSTEM
Date:

Aim:
To implement a Content based Recommender System in Python.
Python Recommendation Systems
Python Recommendation Systems employs a data-driven methodology to offer
customers tailored recommendations. It uses user data and algorithms to forecast and suggest
goods, services, or content that a user is probably going to find interesting.
Recommender System is of different types:
 Content-Based Recommendation: It is supervised machine learning used to induce a
classifier to discriminate between interesting and uninteresting items for the user.
 Collaborative Filtering: Collaborative Filtering recommends items based on similarity
measures between users and/or items. The basic assumption behind the algorithm is
that users with similar interests have common preferences.
Content-Based Recommendation System
Content-based systems recommend items to the customer which are previously high-
rated items by other customer. It uses the features and properties of the item. From these
properties, it can calculate the similarity between the items.

Program:
import numpy as np
import pandas as pd
import sklearn
import [Link] as plt
import seaborn as sns
from [Link] import NearestNeighbors

import warnings
[Link](action='ignore', category=FutureWarning)

#loading rating dataset

ratings = pd.read_csv("[Link]
tutorial/[Link]")
print([Link]())

Output:
userId movieId rating timestamp
0 1 1 4.0 964982703
1 1 3 4.0 964981247
2 1 6 4.0 964982224
3 1 47 5.0 964983815
4 1 50 5.0 964982931

29
Program:
# loading movie dataset
movies = pd.read_csv("[Link]
tutorial/[Link]")
print([Link]())

Output:

movieId title genres

Program (Optional):
n_ratings = len(ratings)
n_movies = len(ratings['movieId'].unique())
n_users = len(ratings['userId'].unique())

print(f"Number of ratings: {n_ratings}")

print(f"Number of unique movieId's: {n_movies}")
print(f"Number of unique users: {n_users}")
print(f"Average ratings per user: {round(n_ratings/n_users, 2)}")
print(f"Average ratings per movie: {round(n_ratings/n_movies, 2)}")

Output:
Number of ratings: 100836 Average ratings per user: 165.3
Number of unique movieId's: 9724 Average ratings per movie: 10.37
Number of unique users: 610

Program (Optional):
user_freq = ratings[['userId', 'movieId']].groupby(
'userId').count().reset_index()
user_freq.columns = ['userId', 'n_ratings']
print(user_freq.head())

Output:
userId n_ratings
0 1 232
1 2 29
2 3 39
3 4 216
4 5 44

Program (Optional):
# Find Lowest and Highest rated movies:
mean_rating = [Link]('movieId')[['rating']].mean()
# Lowest rated movies

30
lowest_rated = mean_rating['rating'].idxmin()
[Link][movies['movieId'] == lowest_rated]
# Highest rated movies
highest_rated = mean_rating['rating'].idxmax()
[Link][movies['movieId'] == highest_rated]
# show number of people who rated movies highest
ratings[ratings['movieId']==highest_rated]
# show number of people who rated movies lowest
ratings[ratings['movieId']==lowest_rated]

# the above movies has very low dataset. We will use bayesian average.
movie_stats = [Link]('movieId')[['rating']].agg(['count', 'mean'])
movie_stats.columns = movie_stats.[Link]()

Program:
# Now, we create user-item matrix using scipy csr matrix
from [Link] import csr_matrix

def create_matrix(df):
N = len(df['userId'].unique())
M = len(df['movieId'].unique())

# Map Ids to indices

user_mapper = dict(zip([Link](df["userId"]), list(range(N))))
movie_mapper = dict(zip([Link](df["movieId"]), list(range(M))))

# Map indices to IDs

user_inv_mapper = dict(zip(list(range(N)), [Link](df["userId"])))
movie_inv_mapper = dict(zip(list(range(M)),
[Link](df["movieId"])))

user_index = [user_mapper[i] for i in df['userId']]

movie_index = [movie_mapper[i] for i in df['movieId']]

X= csr_matrix((df["rating"], (movie_index, user_index)), shape=(M, N))

return X, user_mapper, movie_mapper, user_inv_mapper,

movie_inv_mapper

X, user_mapper, movie_mapper, user_inv_mapper, movie_inv_mapper =

create_matrix(ratings)

"""
Find similar movies using KNN
"""
def find_similar_movies(movie_id, X, k, metric='cosine',
show_distance=False):

neighbour_ids = []
movie_ind = movie_mapper[movie_id]

31
movie_vec = X[movie_ind]
k+=1
kNN = NearestNeighbors(n_neighbors=k, algorithm="brute",
metric=metric)
[Link](X)
movie_vec = movie_vec.reshape(1,-1)
neighbour = [Link](movie_vec,
return_distance=show_distance)
for i in range(0,k):
n = [Link](i)
neighbour_ids.append(movie_inv_mapper[n])
neighbour_ids.pop(0)
return neighbour_ids

movie_titles = dict(zip(movies['movieId'], movies['title']))

movie_id = 3
similar_ids = find_similar_movies(movie_id, X, k=10)
movie_title = movie_titles[movie_id]

print(f"Since you watched {movie_title}")

for i in similar_ids: print(movie_titles[i])

Output:
Since you watched Grumpier Old Men (1995)
Grumpy Old Men (1993)
Striptease (1996)
Nutty Professor, The (1996)
Twister (1996)
Father of the Bride Part II (1995)
Broken Arrow (1996)
Bio-Dome (1996)
Truth About Cats & Dogs, The (1996)
Sabrina (1995)
Birdcage, The (1996)

Program:
def recommend_movies(user_id, X, user_mapper, movie_mapper,
movie_inv_mapper, k=10):
df1 = ratings[ratings['userId'] == user_id]
if [Link]:
print(f"User with ID {user_id} does not exist.")
return

movie_id = df1[df1['rating'] == max(df1['rating'])]['movieId'].iloc[0]

movie_titles = dict(zip(movies['movieId'], movies['title']))
similar_ids = find_similar_movies(movie_id, X, k)
movie_title = movie_titles.get(movie_id, "Movie not found")
if movie_title == "Not found":
print(f"Movie with ID {movie_id} not found.")

32
return
print(f"Since you watched {movie_title}, you might also like:")
for i in similar_ids:
print(movie_titles.get(i, "Not found"))

user_id = 150 # Replace with the desired user ID

recommend_movies(user_id, X, user_mapper, movie_mapper,
movie_inv_mapper, k=10)

Output:
Since you watched Twelve Monkeys (a.k.a. 12 Monkeys) (1995), you might also like:
Pulp Fiction (1994)
Terminator 2: Judgment Day (1991)
Independence Day (a.k.a. ID4) (1996)
Seven (a.k.a. Se7en) (1995)
Fargo (1996)
Fugitive, The (1993)
Usual Suspects, The (1995)
Jurassic Park (1993)
Star Wars: Episode IV - A New Hope (1977)
Heat (1995)

Reference: [Link]

Result:
The Python program to implement a Content based Recommender System was
implemented successfully.

33
Ex No: 5 COLLABORATIVE FILTERING TECHNIQUES
Date:

Aim:
To implement different Collaborative Filtering Techniques in Python.
Python Recommendation Systems
Python Recommendation Systems employs a data-driven methodology to offer
customers tailored recommendations. It uses user data and algorithms to forecast and suggest
goods, services, or content that a user is probably going to find interesting. Recommender
System is of different types:
 Content-Based Recommendation: It is supervised machine learning used to induce a
classifier to discriminate between interesting and uninteresting items for the user.
 Collaborative Filtering: Collaborative Filtering recommends items based on similarity
measures between users and/or items. The basic assumption behind the algorithm is
that users with similar interests have common preferences.

Neighborhood-based collaborative filtering algorithms:

Neighborhood-based collaborative filtering algorithms, also referred to as memory-

based algorithms, were algorithms based on the fact that similar users display similar patterns
of rating behavior and similar items receive similar ratings. There are two primary types of
neighborhood-based algorithms:

1. User-based collaborative filtering.

2. Item-based collaborative filtering.

User-Based Collaborative Filtering:

Similar users have similar ratings on the same item. Therefore, if Alice and Bob have
rated movies in a similar way in the past, then one can use Alice’s observed ratings on the
movie Terminator to predict Bob’s unobserved ratings on this movie. Use following
[Link] file.
user_0 user_1 user_2 user_3 user_4 user_5 user_6 user_7 user_8 user_9
movie_0 0 0 3 4 2 1 2 0 5 1
movie_1 3 0 1 3 0 0 0 0 0 0
movie_2 0 3 0 4 0 2 0 0 0 2
movie_3 5 2 3 2 0 4 3 3 0 0
movie_4 0 5 5 0 0 0 0 0 5 4
movie_5 0 0 0 0 4 0 4 2 3 0
movie_6 4 4 0 0 4 4 3 4 0 4
movie_7 5 0 4 2 3 0 3 3 3 3
movie_8 0 3 0 0 5 5 0 4 0 0
movie_9 2 0 0 0 0 0 0 0 4 0

34
Program:
import numpy as np
import pandas as pd
from [Link] import cosine_similarity

from [Link] import drive

[Link]('/content/gdrive')

df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/[Link]')
[Link]()

Output:
Unnamed: 0user_0 user_1 user_2 user_3 user_4 user_5 user_6 user_7 user_8 user_9
0 movie_0 0 0 3 4 2 1 2 0 5 1
1 movie_1 3 0 1 3 0 0 0 0 0 0
2 movie_2 0 3 0 4 0 2 0 0 0 2
3 movie_3 5 2 3 2 0 4 3 3 0 0
4 movie_4 0 5 5 0 0 0 0 0 5 4

Program:
[Link]([Link][0], axis=1, inplace=True)
matrix = df[0:10].to_numpy()
item_similarity = cosine_similarity(matrix.T)

# Calculate item scores based on user's interactions and item similarity

item_scores = [Link](item_similarity)
# Sort items by score and recommend the top-n
n = 10
recommended_items = [Link](item_scores)[:n]
print(recommended_items)

Output:
0 1 2 3 4 5 6 7 8 9 5 3 1 2 5 9 0 8 7 4 6
0 5 1 7 0 4 9 3 6 8 2 6 8 3 2 0 9 1 6 4 5 7
1 4 1 8 5 7 9 6 2 3 0 7 3 5 1 8 4 9 7 2 0 6
2 8 6 4 0 7 2 3 5 9 1 8 8 3 2 0 9 6 1 4 5 7
3 8 3 9 2 1 4 5 6 0 7 9 5 1 7 4 3 0 9 6 2 8
4 5 4 3 7 0 6 1 8 2 9

Program:
# Display recommendations
user_id = int(input("Enter user id as integer : "))
print("Top 5 Items recommended for user_", user_id)
for i in range(5):
print("Recommendation ", i+1, " : movie_", recommended_items[i]
[user_id])

Output:

35
Enter user id as integer : 4
Top 5 Items recommended for user_ 4
Recommendation 1 : movie_ 5
Recommendation 2 : movie_ 4
Recommendation 3 : movie_ 3
Recommendation 4 : movie_ 7
Recommendation 5 : movie_ 0

Item-based collaborative filtering:

In order to make recommendations for target item B, the first step is to determine a set
S of items, which are most similar to item B. Then, to predict the rating of any user A for item
B, the ratings in set S, which are specified by A, are determined. The weighted average of
these ratings is used to compute the predicted rating of user A for item B. Use following
[Link] file.
user_0 user_1 user_2 user_3 user_4 user_5 user_6 user_7 user_8 user_9 description
movie_0 0 0 3 4 2 1 2 0 5 1 Adventure|Animation|Children|Comedy
movie_1 3 0 1 3 0 0 0 0 0 0 Adventure|Children|Fantasy
movie_2 0 3 0 4 0 2 0 0 0 2 Comedy|Romance
movie_3 5 2 3 2 0 4 3 3 0 0 Comedy|Drama|Romance
movie_4 0 5 5 0 0 0 0 0 5 4 Comedy
movie_5 0 0 0 0 4 0 4 2 3 0 Adventure|Comedy|Fantasy
movie_6 4 4 0 0 4 4 3 4 0 4 Animation|Children|Fantasy
movie_7 5 0 4 2 3 0 3 3 3 3 Children|Comedy
movie_8 0 3 0 0 5 5 0 4 0 0 Animation|Romance
movie_9 2 0 0 0 0 0 0 0 4 0 Comedy|Drama

Program:
import pandas as pd
import numpy as np
from [Link] import cosine_similarity
from sklearn.feature_extraction.text import TfidfVectorizer

from [Link] import drive

[Link]('/content/gdrive')

df = pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/[Link]')
[Link]()
Output:
Unnamed: 0user_0 user_1 user_2 user_3 user_4 user_5 user_6 user_7 user_8 user_9
description
0 movie_0 0 0 3 4 2 1 2 0 5 1 Adventure|Animation|Children|
Comedy|Fantasy
1 movie_1 3 0 1 3 0 0 0 0 0 0 Adventure|Children|Fantasy
2 movie_2 0 3 0 4 0 2 0 0 0 2 Comedy|Romance
3 movie_3 5 2 3 2 0 4 3 3 0 0 Comedy|Drama|Romance
4 movie_4 0 5 5 0 0 0 0 0 5 4 Comedy

Program:
# Extract features from text descriptions
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(df['description'])

36
user_profile = [Link](tfidf_matrix.shape[1])
[Link]([Link][0], axis=1, inplace=True)
[Link]([Link][10], axis=1, inplace=True)
matrix = df[0:10].to_numpy()
print(matrix)
Output:
[[0 0 3 4 2 1 2 0 5 1] [0 0 0 0 4 0 4 2 3 0]
[3 0 1 3 0 0 0 0 0 0] [4 4 0 0 4 4 3 4 0 4]
[0 3 0 4 0 2 0 0 0 2] [5 0 4 2 3 0 3 3 3 3]
[5 2 3 2 0 4 3 3 0 0] [0 3 0 0 5 5 0 4 0 0]
[0 5 5 0 0 0 0 0 5 4] [2 0 0 0 0 0 0 0 4 0]]

Program:
rating = [Link](axis = 1)
for i in range(10):
user_profile += tfidf_matrix[i].toarray()[0] * rating[i]

# Calculate cosine similarity between the user profile and item features
similarities = cosine_similarity([user_profile], tfidf_matrix)
# Get recommended item IDs
recommended_items = [Link](similarities)[:10]
print(recommended_items)
Output:
[[9 8 3 1 4 5 2 6 7 0]]

Program:
# Display recommendations
print("Top 5 Recommended Items:")
for i in range(5):
print("Recommendation ", i+1, " : movie_", recommended_items[0][i])
Output:
Top 5 Recommended Items:
Recommendation 1 : movie_ 9
Recommendation 2 : movie_ 8
Recommendation 3 : movie_ 3
Recommendation 4 : movie_ 1
Recommendation 5 : movie_ 4

References:
[Link]
[Link]
recommendation-systems-836e5e2fe152
[Link]
item-collaborative-filtering-in-python-3baae5179c52
Result:

37
Thus the Python programs to implement different Collaborative Filtering Techniques was
executed successfully.
Ex No: 6 RECEIVER OPERATED CHARACTERISTIC CURVES
Date:

Aim:
To implement Receiver Operated Characteristic curves in Python.
Receiver Operated Characteristic curves

The AUC-ROC curve, or Area Under the Receiver Operating Characteristic curve, is a
graphical representation of the performance of a binary classification model at various
classification thresholds. It is commonly used in machine learning to assess the ability of a
model to distinguish between two classes, typically the positive class (e.g., presence of a
disease) and the negative class (e.g., absence of a disease).

ROC: Receiver Operating Characteristics

AUC: Area Under Curve

ROC Curve

ROC stands for Receiver Operating Characteristics, and the ROC curve is the graphical
representation of the effectiveness of the binary classification model. It plots the true positive
rate (TPR) vs the false positive rate (FPR) at different classification thresholds.

AUC Curve:

AUC stands for Area Under the Curve, and the AUC curve represents the area under the
ROC curve. It measures the overall performance of the binary classification model. As both
TPR and FPR range between 0 to 1, So, the area will always lie between 0 and 1, and A
greater value of AUC denotes better model performance. Our main goal is to maximize this
area in order to have the highest TPR and lowest FPR at the given threshold. The AUC
measures the probability that the model will assign a randomly chosen positive instance a
higher predicted probability compared to a randomly chosen negative instance. It represents
the probability with with our model is able to distinguish between the two classes which are
present in our target.

38
TPR – True Positive Rate

FPR – False Positive Rate

TPR and FPR

Basically, the ROC curve is a graph that shows the performance of a classification model
at all possible thresholds (threshold is a particular value beyond which you say a point
belongs to a particular class). The curve is plotted between two parameters

Let us quickly look at the confusion matrix.

Confusion Matrix for a Classification Task

True Positive: Actual Positive and Predicted as Positive

True Negative: Actual Negative and Predicted as Negative

False Positive (Type I Error): Actual Negative but predicted as Positive

False Negative (Type II Error): Actual Positive but predicted as Negative

Program:
import [Link] as plt
from [Link] import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from [Link] import roc_curve
from [Link] import roc_auc_score

# generate 2 class dataset

X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
# split into train/test sets
trainX, testX, trainy, testy = train_test_split(X, y, test_size=0.5,
random_state=2)
# generate a no skill prediction (majority class)
ns_probs = [0 for _ in range(len(testy))]
# fit a model
model = LogisticRegression(solver='lbfgs')

39
[Link](trainX, trainy)
# predict probabilities
lr_probs = model.predict_proba(testX)
# keep probabilities for the positive outcome only
lr_probs = lr_probs[:, 1]
# calculate scores
ns_auc = roc_auc_score(testy, ns_probs)
lr_auc = roc_auc_score(testy, lr_probs)
# summarize scores
print('No Skill: ROC AUC=%.3f' % (ns_auc))
print('Logistic: ROC AUC=%.3f' % (lr_auc))
# calculate roc curves
ns_fpr, ns_tpr, _ = roc_curve(testy, ns_probs)
lr_fpr, lr_tpr, _ = roc_curve(testy, lr_probs)
# plot the roc curve for the model
[Link](ns_fpr, ns_tpr, linestyle='--', label='No Skill')
[Link](lr_fpr, lr_tpr, marker='.', label='Logistic')
# axis labels
[Link]('False Positive Rate')
[Link]('True Positive Rate')
# show the legend
[Link]()
# show the plot
[Link]()
Output:

Reference: [Link]
classification-in-python/
Result:
Thus the Python program to implement Receiver Operated Characteristic curves was
executed successfully.

40
Ex No: 7 ATTACKS ON RECOMMENDER SYSTEM
Date:

Aim:
To implement attack on Recommender System in Python.
Attacks on Recommender Systems
 Recommender systems have been shown vulnerable to adversarial attacks that force
the models to produce misleading recommendations.
 The person making the attack on the recommender system is also referred to as the
adversary.
 A fake profile refers to a set of ratings corresponding to a fake user created by the
adversary. The number of injected profiles may depend on the specific
recommendation algorithm being attacked, and the approach used to attack it.
Major Classification of Attacks
 An attack that requires a smaller number of injected profiles is referred to as an
efficient attack because such attacks are often difficult to detect.
 On the other hand, if an attack requires a large number of injected profiles, then such
an attack is inefficient attack because most systems should be able to detect a sudden
injection of a large number of ratings about a small number of items.
Attacks can also be classified based on the amount of knowledge required attack successfully.
 Some attacks require only limited knowledge about the ratings distribution. Such
attacks are referred to as low-knowledge attacks.
 On the other hand, attacks that require a large amount of knowledge about the ratings
distribution are referred to as high knowledge attacks.
Example of recommender system attacks:
Amazon product’s reviews is distorted with thousands of fake ones. False reviews were
helping unknown brands dominate searches for popular items. Hundreds of unverified five-
star reviews were being posted on product pages in a single day. Many product pages also
included positive reviews for completely different items.
Push Attack
The manufacturer of an item or the author of a book might submit fake positive reviews
on Amazon in order to maximize sales. Such attacks are referred to as Product push attacks.
Program:

41
import numpy as np
import pandas as pd

from [Link] import drive

[Link]('/content/gdrive')

df=pd.read_csv('/content/gdrive/MyDrive/Colab Notebooks/[Link]')
print(df)
Output
users title genre rating
0 user_0 Leo Action | Comedy | Romance 4
1 user_1 Mark Antony Comedy 5
2 user_2 Por Thozhil Action | Detective 4
3 user_3 PS2 Action | Romance 4
4 user_0 Dada Romance 4
5 user_3 Thalaivi Biography 5
6 user_0 PS2 Action | Romance 3
7 user_4 Dada Romance 5
8 user_5 Mark Antony Comedy 5
9 user_6 Mark Antony Comedy 5
10 user_7 Mark Antony Comedy 5
11 user_0 Mark Antony Comedy 5
12 user_7 Leo Action | Comedy | Romance 3

Program:
#Naive Push Attack
import csv

# List that we want to add as a new row

List = ["Jawan", "Action | Romance", 5]

with open('/content/gdrive/MyDrive/Colab Notebooks/[Link]', 'r')

as inp, open('/content/gdrive/MyDrive/Colab Notebooks/[Link]', 'w') as out:
writer_obj = [Link](out)
for row in [Link](inp):
writer_obj.writerow(row)
print(row)
for i in range(1,8):
user = "user_"+str(i)
List1 = [user] + List
writer_obj.writerow(List1)
print(List1)
[Link]()
[Link]()

Output:
['users', 'title', 'genre', 'rating']
['user_0', 'Leo', 'Action | Comedy | Romance', '4']
['user_1', 'Mark Antony', 'Comedy', '5']

42
['user_2', 'Por Thozhil', 'Action | Detective', '4']
['user_3', 'PS2', 'Action | Romance', '4']
['user_0', 'Dada', 'Romance', '4']
['user_3', 'Thalaivi', 'Biography', '5']
['user_0', 'PS2', 'Action | Romance', '3']
['user_4', 'Dada', 'Romance', '5']
['user_5', 'Mark Antony', 'Comedy', '5']
['user_6', 'Mark Antony', 'Comedy', '5']
['user_7', 'Mark Antony', 'Comedy', '5']
['user_0', 'Mark Antony', 'Comedy', '5']
['user_7', 'Leo', 'Action | Comedy | Romance', '3']
['user_1', 'Jawan', 'Action | Romance', 5]
['user_2', 'Jawan', 'Action | Romance', 5]
['user_3', 'Jawan', 'Action | Romance', 5]
['user_4', 'Jawan', 'Action | Romance', 5]
['user_5', 'Jawan', 'Action | Romance', 5]
['user_6', 'Jawan', 'Action | Romance', 5]
['user_7', 'Jawan', 'Action | Romance', 5]

Nuke Attack
The competitor of an item manufacturer might submit malicious reviews about the item.
Such attacks are referred to as Nuke attacks.

Program:
#Naive Nuke Attack
import csv

# List that we want to add as a new row

List = ["Jawan", "Action | Romance", 0]

with open('/content/gdrive/MyDrive/Colab Notebooks/[Link]', 'r')

Output:
['users', 'title', 'genre', 'rating']
['user_0', 'Leo', 'Action | Comedy | Romance', '4']
['user_1', 'Mark Antony', 'Comedy', '5']

43
['user_2', 'Por Thozhil', 'Action | Detective', '4']
['user_3', 'PS2', 'Action | Romance', '4']
['user_0', 'Dada', 'Romance', '4']
['user_3', 'Thalaivi', 'Biography', '5']
['user_0', 'PS2', 'Action | Romance', '3']
['user_4', 'Dada', 'Romance', '5']
['user_5', 'Mark Antony', 'Comedy', '5']
['user_6', 'Mark Antony', 'Comedy', '5']
['user_7', 'Mark Antony', 'Comedy', '5']
['user_0', 'Mark Antony', 'Comedy', '5']
['user_7', 'Leo', 'Action | Comedy | Romance', '3']
['user_1', 'Jawan', 'Action | Romance', 0]
['user_2', 'Jawan', 'Action | Romance', 0]
['user_3', 'Jawan', 'Action | Romance', 0]
['user_4', 'Jawan', 'Action | Romance', 0]
['user_5', 'Jawan', 'Action | Romance', 0]
['user_6', 'Jawan', 'Action | Romance', 0]
['user_7', 'Jawan', 'Action | Romance', 0]

Bandwagon Attack
The basic idea of the bandwagon attack is to leverage the fact that a small number of
items are very popular in terms of the number of ratings they receive. For example, a
blockbuster movie or a widely used textbook might receive many ratings. Therefore, if these
items are always rated in the fake user profile, it increases the chance of a fake user profile
being similar to the target user.

Program:
#Bandwagon Attack
import csv

# Find Highest rated movie

mean_rating = [Link]('title')[['rating']].mean()
highest_rated = mean_rating['rating'].idxmax()
print("Highest rated Movie : ", highest_rated)

with open('/content/gdrive/MyDrive/Colab Notebooks/[Link]', 'r')

as inp, open('/content/gdrive/MyDrive/Colab Notebooks/[Link]', 'w') as out:
writer_obj = [Link](out)
for row in [Link](inp):
if(row[1] == highest_rated):
gen = row[2]
print(row[1], row[2])
break
break
List = [highest_rated, gen, 5]
print(List1)
for row in [Link](inp):
writer_obj.writerow(row)
print(row)

44
for i in range(1,8):
user = "user_"+str(i)
List1 = [user] + List
writer_obj.writerow(List1)
print(List1)
[Link]()
[Link]()
Output:
Highest rated Movie : Mark Antony
['user_7', 'Mark Antony', 'Comedy', 5]
['user_0', 'Leo', 'Action | Comedy | Romance', '4']
['user_1', 'Mark Antony', 'Comedy', '5']
['user_2', 'Por Thozhil', 'Action | Detective', '4']
['user_3', 'PS2', 'Action | Romance', '4']
['user_0', 'Dada', 'Romance', '4']
['user_3', 'Thalaivi', 'Biography', '5']
['user_0', 'PS2', 'Action | Romance', '3']
['user_4', 'Dada', 'Romance', '5']
['user_5', 'Mark Antony', 'Comedy', '5']
['user_6', 'Mark Antony', 'Comedy', '5']
['user_7', 'Mark Antony', 'Comedy', '5']
['user_0', 'Mark Antony', 'Comedy', '5']
['user_7', 'Leo', 'Action | Comedy | Romance', '3']
['user_1', 'Mark Antony', 'Comedy', 5]
['user_2', 'Mark Antony', 'Comedy', 5]
['user_3', 'Mark Antony', 'Comedy', 5]
['user_4', 'Mark Antony', 'Comedy', 5]
['user_5', 'Mark Antony', 'Comedy', 5]
['user_6', 'Mark Antony', 'Comedy', 5]
['user_7', 'Mark Antony', 'Comedy', 5]

Result:
Thus the Python program to implement simple attacks on Recommender System was
executed successfully.

45
ADVANCED EXPERIMENTS

Ex No: 8 BUILD A MOVIE RECOMMENDATION SYSTEM

Date:

Aim:
To implement a program in Python to build a Movie Recommendation system using
NumPy and Pandas.

Movie Recommendation:
Our recommendation system functions based on the similarities between movies. More
specifically, it will recommend movies to you that other users with similar taste have enjoyed.
To demonstrate this, we'll select two movies from the data set:
Toy Story (1995)
Returns of the Jedi (1983)

The first thing we need to do is create matrices that contain the user ratings for each movie in
the data set. These movie matrices will allow you to see how each user rated every movie in
the data set. Let's examine what's stored in the toy_story_user_ratings and
star_wars_user_ratings variables.
A value of NaN is stored if a specific user has not provided a rating for the Toy Story
(1995) movie. The user ID of the user who provided the rating is stored as the index of the
Series. Next, we will use the corrwith method to calculate the correlation between the
toy_story_user_ratings and star_wars_user_ratings data sets. This will allow us to see if the
movies are similar, since their ratings distribution among users will be highly correlated if so!
First, a pandas Series is created using ratings_matrix.corrwith(toy_story_user_ratings)
that shows the correlation of user ratings between the Toy Story (1995) movie and every
other movie in the data set. Next, the specific correlation for Return of the Jedi (1983) is
pulled from the data structure by passing in the name of the movie in square brackets.
Let's try and find a movie that _is _highly similar to the Return of the Jedi (1983) movie.
To do this, let's build a pandas DataFrame that stores the correlation of every movie's user
ratings with the Return of the Jedi (1983) user ratings.
 The first line of code creates a pandas DataFrame with a single column that shows
the correlation of every movie's user ratings with the user ratings of Return of the
Jedi (1983)
 The dropna method removes null values from the DataFrame
 The sort_values method combined with the arguments 0 and ascending = False
modifies the DataFrame so the most similar movies are shown at the top
 The head(15) method shows only the 15 entries at the top of the DataFrame.
Let's filter out movies that have less than 50 reviews to improve the basic recommendation
system that we have built in this tutorial so far. To start this process, we'll want to add the
number of ratings from each movie to our ratings_matrix data structure.

46
Program:
import pandas as pd
import numpy as np
import [Link] as plt
import seaborn as sns
%matplotlib inline

#Import the data

raw_data = pd.read_csv('[Link]', sep = '\t', names = ['user_id', 'item_id', 'rating', 'timestamp'])
movie_titles_data = pd.read_csv('Movie_Id_Titles')

#Merge our two data sources

merged_data = [Link](raw_data, movie_titles_data, on='item_id')
merged_data.columns

#Calculate aggregate data

merged_data.groupby('title')['rating'].mean().sort_values(ascending = False)
merged_data.groupby('title')['rating'].count().sort_values(ascending = False)

#Create a DataFrame and add the number of ratings to is using a count method
ratings_data = [Link](merged_data.groupby('title')['rating'].mean())
ratings_data['# of ratings'] = merged_data.groupby('title')['rating'].count()

#Make some visualizations

[Link](ratings_data['# of ratings'])
[Link](ratings_data['rating'])

#Create the ratings matrix and get user ratings for `Return of the Jedi (1983)` and `Toy Story
(1995)`
ratings_matrix = merged_data.pivot_table(index='user_id',columns='title',values='rating')

47
star_wars_user_ratings = ratings_matrix['Return of the Jedi (1983)']
toy_story_user_ratings = ratings_matrix['Toy Story (1995)']
ratings_matrix.corrwith(toy_story_user_ratings)['Return of the Jedi (1983)']

#Calculate correlations and source recommendations

correlation_with_star_wars =
[Link](ratings_matrix.corrwith(star_wars_user_ratings))
correlation_with_star_wars.dropna().sort_values(0, ascending = False).head(15)

#Add the number of ratings and rename columns

correlation_with_star_wars = correlation_with_star_wars.join(ratings_data['# of ratings'])
correlation_with_star_wars.columns = ['Corr. With SW Ratings', '# of Ratings']
correlation_with_star_wars.[Link] = ['Movie Title']

#Get new recommendations from movies that have more than 50 ratings
correlation_with_star_wars[correlation_with_star_wars['# of Ratings'] >
50].sort_values('Corr. With SW Ratings', ascending = False).head(10)
Output:

Result:

A program in Python to build a Movie Recommendation system using NumPy and

Pandas was implemented successfully.

48
Ex No: 9
RESTAURANT RECOMMENDATION SYSTEM
Date:

Aim:
To implement a program in Python to build a Restaurant Recommendation system using
NumPy and Pandas.

Restaurant Recommendation:
These are active information filtering systems that personalize the information provided
to a user based on their interests, relevance of the information, etc. Recommendation systems
are widely used to recommend movies, items, restaurants, places to visit, items to buy, etc.

There are two types of recommendation systems:

1. Content-based filtering
2. Collaborative filtering

Start the task of Restaurant Recommendation System by importing the necessary Python
Libraries. Before that download dataset from
[Link]
ml/input?select=[Link]

In [1]:
import numpy as np
import pandas as pd
import seaborn as sb
import [Link] as plt
import seaborn as sns

49
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from [Link] import classification_report
from [Link] import confusion_matrix
from [Link] import r2_score
import warnings
[Link]('always')
[Link]('ignore')
import re
from [Link] import stopwords
from [Link] import linear_kernel
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
Now, I will load and read the dataset:

In [2]:
zomato_real=pd.read_csv("../input/zomato-bangalore-dataset/[Link]")
zomato_real.head() # prints the first 5 rows of the dataset
Out[2]:

Now the next step is data cleaning and feature engineering for this step we need to do a lot of
stuff with the data such as:
1. Deleting Unnecessary Columns
2. Removing the Duplicates
3. Remove the NaN values from the dataset
4. Changing the column names
5. Data Transformations
6. Data Cleaning
1. Adjust the column names Now, let’s perform all the above steps in our data:
In [3]:
#Deleting Unnnecessary Columns

50
zomato=zomato_real.drop(['url','dish_liked','phone'],axis=1) #Dropping the column
"dish_liked", "phone", "url" and saving the new dataset as "zomato"

#Removing the Duplicates

[Link]().sum()
zomato.drop_duplicates(inplace=True)

#Remove the NaN values from the dataset

[Link]().sum()
[Link](how='any',inplace=True)

#Changing the column names

zomato = [Link](columns={'approx_cost(for two
people)':'cost','listed_in(type)':'type', 'listed_in(city)':'city'})

#Some Transformations
zomato['cost'] = zomato['cost'].astype(str) #Changing the cost to string
zomato['cost'] = zomato['cost'].apply(lambda x: [Link](',','.')) #Using lambda function to
replace ',' from cost
zomato['cost'] = zomato['cost'].astype(float)
#Removing '/5' from Rates
zomato = [Link][[Link] !='NEW']
zomato = [Link][[Link] !='-'].reset_index(drop=True)
remove_slash = lambda x: [Link]('/5', '') if type(x) == [Link] else x
[Link] = [Link](remove_slash).[Link]().astype('float')

# Adjust the column names

[Link] = [Link](lambda x:[Link]())
zomato.online_order.replace(('Yes','No'),(True, False),inplace=True)
zomato.book_table.replace(('Yes','No'),(True, False),inplace=True)
restaurants = list(zomato['name'].unique())
zomato['Mean Rating'] = 0

for i in range(len(restaurants)):
zomato['Mean Rating'][zomato['name'] == restaurants[i]] = zomato['rate'][zomato['name']
== restaurants[i]].mean()

from [Link] import MinMaxScaler

scaler = MinMaxScaler(feature_range = (1,5))
zomato[['Mean Rating']] = scaler.fit_transform(zomato[['Mean Rating']]).round(2)
Now the next step is to perform some text preprocessing steps which include:
1. Lower casing
2. Removal of Punctuations
3. Removal of Stopwords
4. Removal of URLs
5. Spelling correction
In [4]:
## Lower Casing
zomato["reviews_list"] = zomato["reviews_list"].[Link]()

51
## Removal of Puctuations
import string
PUNCT_TO_REMOVE = [Link]
def remove_punctuation(text):
"""custom function to remove the punctuation"""
return [Link]([Link]('', '', PUNCT_TO_REMOVE))

zomato["reviews_list"] = zomato["reviews_list"].apply(lambda text:

remove_punctuation(text))

## Removal of Stopwords
from [Link] import stopwords
STOPWORDS = set([Link]('english'))
def remove_stopwords(text):
"""custom function to remove the stopwords"""
return " ".join([word for word in str(text).split() if word not in STOPWORDS])
zomato["reviews_list"] = zomato["reviews_list"].apply(lambda text:
remove_stopwords(text))

## Removal of URLS
def remove_urls(text):
url_pattern = [Link](r'https?://\S+|www\.\S+')
return url_pattern.sub(r'', text)

zomato["reviews_list"] = zomato["reviews_list"].apply(lambda text: remove_urls(text))

zomato[['reviews_list', 'cuisines']].sample(5)
Out[4]:
reviews_list cuisines
rated 40 ratedn hi allnni visited place South Indian, North Indian, Chinese,
19691
friend... Street Food
rated 40 ratedn got friday nightnot
35018 Mediterranean, Italian, Asian
crowded go...
rated 10 ratedn bad experience air
22624 Mexican, Continental, Italian, Chinese
conditionin...
rated 10 ratedn packed drinage food
32489 North Indian, Biryani, Chinese
delivered ...
rated 40 ratedn hello regular adda week
38093 Cafe
visit ...
In [5]:
# RESTAURANT NAMES:
restaurant_names = list(zomato['name'].unique())
def get_top_words(column, top_nu_of_words, nu_of_word):
vec = CountVectorizer(ngram_range= nu_of_word, stop_words='english')
bag_of_words = vec.fit_transform(column)

52
sum_words = bag_of_words.sum(axis=0)
words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
words_freq =sorted(words_freq, key = lambda x: x[1], reverse=True)
return words_freq[:top_nu_of_words]

zomato=[Link](['address','rest_type', 'type', 'menu_item', 'votes'],axis=1)

import pandas

# Randomly sample 60% of your dataframe

df_percent = [Link](frac=0.5)
TF-IDF Vectorization
TF-IDF (Term Frequency-Inverse Document Frequency) vectors for each document. This
will give you a matrix where each column represents a word in the general vocabulary (all
words that appear in at least one document) and each column represents a restaurant, as
before. TF-IDF is the statistical method of assessing the meaning of a word in a given
document. Now, I will use the TF-IDF vectorization on the dataset:

In [6]:
df_percent.set_index('name', inplace=True)
indices = [Link](df_percent.index)

# Creating tf-idf matrix

tfidf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=0,
stop_words='english')
tfidf_matrix = tfidf.fit_transform(df_percent['reviews_list'])

cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)

Now the last step for creating a Restaurant Recommendation System is to write a function
that will recommend restaurants:

In [7]:
def recommend(name, cosine_similarities = cosine_similarities):

# Create a list to put top restaurants

recommend_restaurant = []

# Find the index of the hotel entered

idx = indices[indices == name].index[0]

# Find the restaurants with a similar cosine-sim value and order them from bigges number
score_series = [Link](cosine_similarities[idx]).sort_values(ascending=False)

# Extract top 30 restaurant indexes with a similar cosine-sim value

top30_indexes = list(score_series.iloc[0:31].index)

# Names of the top 30 restaurants

for each in top30_indexes:
recommend_restaurant.append(list(df_percent.index)[each])

# Creating the new data set to show similar restaurants

53
df_new = [Link](columns=['cuisines', 'Mean Rating', 'cost'])

# Create the top 30 similar restaurants with some of their columns

for each in recommend_restaurant:
df_new = df_new.append([Link](df_percent[['cuisines','Mean Rating', 'cost']]
[df_percent.index == each].sample()))

# Drop the same named restaurants and sort only the top 10 by the highest rating
df_new = df_new.drop_duplicates(subset=['cuisines','Mean Rating', 'cost'], keep=False)
df_new = df_new.sort_values(by='Mean Rating', ascending=False).head(10)

print('TOP %s RESTAURANTS LIKE %s WITH SIMILAR REVIEWS: ' %

(str(len(df_new)), name))

return df_new
recommend('Pai Vihar')
TOP 10 RESTAURANTS LIKE Pai Vihar WITH SIMILAR REVIEWS:
Out[7]:
Mean
cuisines cost
Rating
Atithi North Indian, Chinese, Street Food 3.63 800.0
Atithi North Indian 3.63 750.0
Samosa Singh Street Food, Fast Food, Rolls, Desserts 3.60 200.0
Fast Food, North Indian, Chinese,
Magix'S Parattha Roll 3.52 400.0
Mughlai, Rolls
Prasiddhi Food Corner Fast Food, North Indian, South Indian 3.45 200.0
Shrusti Coffee Cafe, South Indian 3.45 150.0
South Indian, North Indian, Chinese, Street
Shanthi Sagar 3.44 400.0
Fo...
Mayura Sagar Chinese, North Indian, South Indian 3.32 250.0
Vasanth Vihar - Since
South Indian, Street Food 3.32 150.0
1965
Marwa Restaurant North Indian, Chinese, Fast Food, BBQ 3.19 600.0

Result:

Thus the Machine Learning project on Restaurant Recommendation system with Python
programming language was executed successfully.

54
ADDITIONAL EXPERIMENTS

Ex No: 10 PREPROCESS DATASET IN PYTHON

Date:

Aim:
To implement a program in Python on data preprocessing using Python, NumPy and
Pandas.

Data Preprocessing

For machine learning algorithms to work, it’s necessary to convert raw data into a
clean data set, which means we must convert the data set to numeric data. We do this by
encoding all the categorical labels to column vectors with binary values. Missing values, or
NaNs (not a number) in the data set is an annoying problem. You have to either drop the
missing rows or fill them up with a mean or interpolated values. Kaggle provides two data
sets: training data and results data. Both data sets must have the same dimensions for the
model to produce accurate results.
Load Data in Pandas
To work on the data, you can either load the CSV in Excel or in Pandas. For the purposes of
this tutorial, we’ll load the CSV data in Pandas.
Program:
import pandas as pd

df = pd.read_csv('C:\\Users\\ADMIN\\Downloads\\[Link]')
[Link]()
Output:

Let’s take a look at the data format.

Program:
[Link]()
Output:
<class '[Link]'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):

55
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Pclass 891 non-null int64
3 Name 891 non-null object
4 Sex 891 non-null object
5 Age 714 non-null float64
6 SibSp 891 non-null int64
7 Parch 891 non-null int64
8 Ticket 891 non-null object
9 Fare 891 non-null float64
10 Cabin 204 non-null object
11 Embarked 889 non-null object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

Drop Columns that aren’t Useful

Let’s try to drop some of the columns that won’t contribute much to our machine learning
model. We’ll start with Name, Ticket and Cabin.
Program:
cols = ['Name', 'Ticket', 'Cabin']
df = [Link](cols, axis=1)
We dropped three columns.
Drop Rows with Missing Values
Next we can drop all rows in the data that have missing values (NaNs).
Program:
df = [Link]()
[Link]()
Output:
<class '[Link]'>
Index: 712 entries, 0 to 890
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 712 non-null int64
1 Survived 712 non-null int64
2 Pclass 712 non-null int64
3 Sex 712 non-null object
4 Age 712 non-null float64
5 SibSp 712 non-null int64
6 Parch 712 non-null int64
7 Fare 712 non-null float64
8 Embarked 712 non-null object
dtypes: float64(2), int64(5), object(2)
memory usage: 55.6+ KB

Problem With Dropping Rows

56
After dropping rows with missing values, we find the data set is reduced to 712 rows
from 891, which means we are wasting data. Machine learning models need data to train and
perform well. So, let’s preserve the data and make use of it as much as we can.
Creating Dummy Variables
Instead of wasting our data, let’s convert the Pclass, Sex and Embarked to columns in
Pandas and drop them after conversion.
Program:
df = pd.read_csv('C:\\Users\\ADMIN\\Downloads\\[Link]')
[Link]()
Output:
<class '[Link]'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Pclass 891 non-null int64
3 Name 891 non-null object
4 Sex 891 non-null object
5 Age 714 non-null float64
6 SibSp 891 non-null int64
7 Parch 891 non-null int64
8 Ticket 891 non-null object
9 Fare 891 non-null float64
10 Cabin 204 non-null object
11 Embarked 889 non-null object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

Program:
dummies = []
cols = ['Pclass', 'Sex', 'Embarked']
for col in cols:
[Link](pd.get_dummies(df[col]))
titanic_dummies = [Link](dummies, axis=1)

[Link]()
Output:
<class '[Link]'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 28 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Pclass 891 non-null int64
3 Name 891 non-null object

57
4 Sex 891 non-null object
5 Age 714 non-null float64
6 SibSp 891 non-null int64
7 Parch 891 non-null int64
8 Ticket 891 non-null object
9 Fare 891 non-null float64
10 Cabin 204 non-null object
11 Embarked 889 non-null object
12 1 891 non-null bool
13 2 891 non-null bool
14 3 891 non-null bool
15 female 891 non-null bool
16 male 891 non-null bool
17 C 891 non-null bool
18 Q 891 non-null bool
19 S 891 non-null bool
20 1 891 non-null bool
21 2 891 non-null bool
22 3 891 non-null bool
23 female 891 non-null bool
24 male 891 non-null bool
25 C 891 non-null bool
26 Q 891 non-null bool
27 S 891 non-null bool
dtypes: bool(16), float64(2), int64(5), object(5)
memory usage: 97.6+ KB

Drop the data

Now that we converted Pclass, Sexand Embarked values into columns, we drop the
redundant columns from the data frame.
Program:
df = [Link](['Pclass', 'Sex', 'Embarked'], axis=1)
[Link]()
Output:
<class '[Link]'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 25 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Name 891 non-null object
3 Age 714 non-null float64
4 SibSp 891 non-null int64
5 Parch 891 non-null int64
6 Ticket 891 non-null object
7 Fare 891 non-null float64
8 Cabin 204 non-null object
9 1 891 non-null bool
10 2 891 non-null bool

58
11 3 891 non-null bool
12 female 891 non-null bool
13 male 891 non-null bool
14 C 891 non-null bool
15 Q 891 non-null bool
16 S 891 non-null bool
17 1 891 non-null bool
18 2 891 non-null bool
19 3 891 non-null bool
20 female 891 non-null bool
21 male 891 non-null bool
22 C 891 non-null bool
23 Q 891 non-null bool
24 S 891 non-null bool
dtypes: bool(16), float64(2), int64(4), object(3)
memory usage: 76.7+ KB

Take Care of Missing Data

Everything’s clean now, except Age, which has lots of missing values. Let’s compute a
median or interpolate() all the ages and fill those missing age values. Pandas has an
interpolate() function that will replace all the missing NaNs to interpolated values.
Program:
df['Age'] = df['Age'].interpolate()
[Link]()
Output:
<class '[Link]'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 25 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Name 891 non-null object
3 Age 891 non-null float64
4 SibSp 891 non-null int64
5 Parch 891 non-null int64
6 Ticket 891 non-null object
7 Fare 891 non-null float64
8 Cabin 204 non-null object
9 1 891 non-null bool
10 2 891 non-null bool
11 3 891 non-null bool
12 female 891 non-null bool
13 male 891 non-null bool
14 C 891 non-null bool
15 Q 891 non-null bool
16 S 891 non-null bool
17 1 891 non-null bool
18 2 891 non-null bool
19 3 891 non-null bool

59
20 female 891 non-null bool
21 male 891 non-null bool
22 C 891 non-null bool
23 Q 891 non-null bool
24 S 891 non-null bool
dtypes: bool(16), float64(2), int64(4), object(3)
memory usage: 76.7+ KB
[46]:

Convert the Data Frame to NumPy

Now that we’ve converted all the data to integers, it’s time to prepare the data for machine
learning models. This is where scikit-learn and NumPy come into play:
X= Input set with 14 attributes
y = Small y output, in this case Survived
Now we convert our data frame from Pandas to NumPy and we assign input and output. X
still has Survived values in it, which should not be there. So we drop in the NumPy column,
which is the first column.
Program:
X = [Link]
y = df['Survived'].values

import numpy as np
X = [Link](X, 1, axis=1)

Divide the Data Set Into Training Data and Test Data
Now that we’re ready with X and y, let's split the data set: we’ll allocate 70 percent for
training and 30 percent for tests using scikit model_selection.
Program:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

Result:
Now you can preprocess data on your own. Go on and try it for yourself to start building
your own models and making predictions.

60
Ex No: 11 VISUALIZING THE RATINGS IN THE DATA SET
Date:

Aim:
To implement a program in Python on visualize data using Python, NumPy and Pandas.

The Libraries We Need For This Tutorial

This tutorial will make use of a number of open-source Python libraries, including
NumPy, pandas, and matplotlib. We'll import these libraries now. To start, open a Jupyter
Notebook in the directory you'd like to work in. Here are the imports that we will start our
Python script with:
Program:
#Data imports
import pandas as pd
import numpy as np
#Visualization imports
import [Link] as plt
import seaborn as sns
%matplotlib inline

Now that our imports have been executed, we can move on to importing our movie database.

Importing Our Movie Database

The first thing you'll need to do is download the files that contain our data set. Click the
following three links to download a version of each file to your Downloads folder:
 Movie database download 1 :
[Link]
 Movie database download 2 :
[Link]
 Movie database download 3 :
[Link]

This will download three files named:

61
Movie_Id_Titles, [Link], [Link]

Move these files into the directory that you'd like to work in for this tutorial. This needs to be
the same folder that you opened your Jupyter Notebook in earlier. Then, you'll need to import
the data into a pandas DataFrame.

The actual data for our movie database lies within the [Link] file. Here is the command
required to import the data into a DataFrame:

Program:
raw_data = pd.read_csv('[Link]', sep = '\t', names = ['user_id', 'item_id', 'rating', 'timestamp'])

You will notice that this DataFrame has four columns and none of them contain the title
of the movie. This data lies in a separate that we downloaded previously named
Movie_Id_Titles. You will need to import this data and merge it with our existing raw_data
DataFrame before [Link], let's import the movie title data. Then let's merge the two
DataFrames together into one DataFrame by merging them on the item_id column.

Program:

movie_titles_data = pd.read_csv('Movie_Id_Titles')

merged_data = [Link](raw_data, movie_titles_data, on='item_id')

You can get a sense of what the new DataFrame contains by running merged_data.columns,
which returns.
Program:
Index(['user_id', 'item_id', 'rating', 'timestamp', 'title'], dtype='object')

Exploratory Data Analysis

Exploratory data analysis is the process of learning more about a data set by calculating
aggregate statistics or creating visualizations. Let's dig in to our merged movies data set
before building our recommendation system later in this tutorial.

Calculating The Movies With The Highest Average Rating

For every movie in our data set, there are a number of different ratings that are submitted by
the different users of the database. Let's start by calculating the average rating for every
movie in the database with the following command.

Program:

merged_data.groupby('title')['rating'].mean().sort_values(ascending = False)

This will return a pandas Series that orders the movies from the highest average rating to the
lowest average rating. It will look something like this.

62
Calculating The Movies With The Most Ratings

You can list the movies in order of their number of ratings with the following command.

Program:

merged_data.groupby('title')['rating'].count().sort_values(ascending = False)

This generates the following output:

Visualizing the Ratings in Our Data Set

Now visualize the distribution of movie ratings in our data set. It will be helpful to store our
ratings in a simpler data structure first. Accordingly, let's quickly create a pandas DataFrame
that contains the average rating and the number of ratings for every movie in the data set.
Let's start the DataFrame with just the average rating by movie with the following statement.

Program:

ratings_data = [Link](merged_data.groupby('title')['rating'].mean())

Next let's add another column to this DataFrame that contains the number of ratings for every
movie in the data set.

Program:

ratings_data['# of ratings'] = merged_data.groupby('title')['rating'].count()

63
We can now use this DataFrame to create some nice visualizations. First, let's visualize the
distribution of number of ratings by movie using seaborn's distplot function.

[Link](ratings_data['# of ratings'])

As you can see, most movies seem to have either 0 ratings or 1 rating. This makes sense -
very few movies have the mass appeal to receive many ratings from watchers. Let's create a
similar visualization for the actual rating assign to the movies.

[Link](ratings_data['rating'])

64
As you can tell, most movies seem to be distributed around a rating of 3 or so, with peaks
at 1, 2, 4, and 5 - which are presumably movies with only one rating.

The Relationship Between Average Rating and Number of Ratings

Let's create one last visualization that explores the relationship between a movie's average
rating and its number of ratings. The seaborn jointplot is a nice visualization for this. We can
create our jointplot with the following command.

[Link](x = ratings_data['rating'], y = ratings_data['# of ratings'])

This generates the following plot:

65
It seems like there seems to be some positive relationship between the number of ratings and
the average rating. Said differently, movies with high average ratings tend to have more
ratings, and vice versa.

Result:

We have now spent some time on exploratory data analysis, which ensures that we have a
good sense of the structure of our data before building our recommendation system.

Common questions

Collaborative filtering techniques in recommendation systems include user-based and item-based collaborative filtering. User-based collaborative filtering suggests items by finding users with similar rating patterns and using their preferences to predict another user's interests . Item-based collaborative filtering, on the other hand, involves identifying a set of items similar to the target item, then using their known ratings to predict a user's potential rating of a target item based on a weighted average . The fundamental difference is that user-based focuses on similarities between users, while item-based focuses on similarities between items.

Content-based filtering in recommendation systems works by recommending items to users based on the features of items and the user’s previous interactions with similar items . This approach relies on machine learning principles to induce a classifier that can differentiate between interesting and uninteresting items for the user. This is often implemented using supervised machine learning to model user preferences and the characteristics of items, allowing the system to generate recommendations based on content similarity .

CS3491 AI & ML Exam Resources
No ratings yet
CS3491 AI & ML Exam Resources
11 pages
B.Tech AI & Data Science Curriculum 2023
No ratings yet
B.Tech AI & Data Science Curriculum 2023
88 pages
OEE351 Renewable Energy Syllabus
No ratings yet
OEE351 Renewable Energy Syllabus
3 pages
CS8691 AI Model Exam Paper 2021
No ratings yet
CS8691 AI Model Exam Paper 2021
2 pages
Sathyabama CSE-AI Syllabus 2023
No ratings yet
Sathyabama CSE-AI Syllabus 2023
12 pages
M.Tech Data Science Exam Questions 2025
No ratings yet
M.Tech Data Science Exam Questions 2025
1 page
Moratuwa AI Degree Syllabus Overview
No ratings yet
Moratuwa AI Degree Syllabus Overview
120 pages
Ethics in AI Laboratory Record 2024-2025
No ratings yet
Ethics in AI Laboratory Record 2024-2025
31 pages
Optimization Techniques Course Overview
No ratings yet
Optimization Techniques Course Overview
1 page
Digital Principles Lab Manual 2025
No ratings yet
Digital Principles Lab Manual 2025
63 pages
CSE 2022 International Conference Details
No ratings yet
CSE 2022 International Conference Details
2 pages
Machine Learning Applications in IoT
No ratings yet
Machine Learning Applications in IoT
4 pages
Database Management Exam Questions
No ratings yet
Database Management Exam Questions
2 pages
Software Engineering Lab Manual 2025-26
No ratings yet
Software Engineering Lab Manual 2025-26
182 pages
AI Course Syllabus for Mechatronics
No ratings yet
AI Course Syllabus for Mechatronics
3 pages
Environmental Management Plan Overview
No ratings yet
Environmental Management Plan Overview
21 pages
Module - 5 Notes - AI (BAD402)
No ratings yet
Module - 5 Notes - AI (BAD402)
22 pages
Types and Benefits of Parallel Computing
No ratings yet
Types and Benefits of Parallel Computing
11 pages
Ethical AI Practices and Decision-Making
No ratings yet
Ethical AI Practices and Decision-Making
3 pages
Understanding Biological and Machine Vision
No ratings yet
Understanding Biological and Machine Vision
53 pages
M.E. Structural Engineering Syllabus 2021
No ratings yet
M.E. Structural Engineering Syllabus 2021
81 pages
Understanding Perceptrons in Deep Learning
No ratings yet
Understanding Perceptrons in Deep Learning
39 pages
Roboethics: Ethics in Robotics and AI
No ratings yet
Roboethics: Ethics in Robotics and AI
10 pages
Ethical Case Studies in Healthcare & AI
No ratings yet
Ethical Case Studies in Healthcare & AI
19 pages
IIT Kanpur - Artificial Intelligence & Machine Learning in Business Applications
No ratings yet
IIT Kanpur - Artificial Intelligence & Machine Learning in Business Applications
7 pages
CP4292 Parallel Programming Lab Manual
No ratings yet
CP4292 Parallel Programming Lab Manual
39 pages
CS8591 Computer Networks Syllabus
0% (1)
CS8591 Computer Networks Syllabus
5 pages
Problem Solving Agents in AI
No ratings yet
Problem Solving Agents in AI
80 pages
CS3351: Digital Principles Overview
No ratings yet
CS3351: Digital Principles Overview
37 pages
AL3391 Artificial Intelligence Overview
No ratings yet
AL3391 Artificial Intelligence Overview
151 pages
ICNEXT'25: Engineering & Tech Conference
No ratings yet
ICNEXT'25: Engineering & Tech Conference
4 pages
B.Tech CSE Course Structure & Syllabus R23
No ratings yet
B.Tech CSE Course Structure & Syllabus R23
20 pages
CCS364 Soft Computing Lab Manual
No ratings yet
CCS364 Soft Computing Lab Manual
30 pages
OCS351 Lab Manual: AI Algorithms
No ratings yet
OCS351 Lab Manual: AI Algorithms
9 pages
AI & ML Lab Exam Questions 2024
No ratings yet
AI & ML Lab Exam Questions 2024
1 page
Key Questions for AL3391 Course
No ratings yet
Key Questions for AL3391 Course
8 pages
Comparing Merge and Quick Sort Algorithms
No ratings yet
Comparing Merge and Quick Sort Algorithms
7 pages
Types of Artificial Intelligence Explained
No ratings yet
Types of Artificial Intelligence Explained
112 pages
Key Requirements for Computer Networks
No ratings yet
Key Requirements for Computer Networks
5 pages
Parallel Computing Concepts and Applications
No ratings yet
Parallel Computing Concepts and Applications
1 page
Energy-Efficient ML for IoT Devices
No ratings yet
Energy-Efficient ML for IoT Devices
1 page
CD3291 Data Structures Lesson Plan
No ratings yet
CD3291 Data Structures Lesson Plan
4 pages
CS3391 OOP Important Questions Guide
No ratings yet
CS3391 OOP Important Questions Guide
44 pages
Image Processing Exam Guide 2024
No ratings yet
Image Processing Exam Guide 2024
2 pages
CS3551 Distributed Computing Syllabus
No ratings yet
CS3551 Distributed Computing Syllabus
2 pages
Particle Swarm Optimization Explained
No ratings yet
Particle Swarm Optimization Explained
20 pages
CSE Department Overview and Achievements
No ratings yet
CSE Department Overview and Achievements
68 pages
CS8691: Artificial Intelligence Course Plan
No ratings yet
CS8691: Artificial Intelligence Course Plan
9 pages
HSST Computer Science Rank List 2024
No ratings yet
HSST Computer Science Rank List 2024
38 pages
AI Lab Manual for TE Computer Engineering
No ratings yet
AI Lab Manual for TE Computer Engineering
34 pages
Computer Graphics & Image Processing 21CS63
No ratings yet
Computer Graphics & Image Processing 21CS63
151 pages
SCH - Mcs104a - Dec Jan 25
No ratings yet
SCH - Mcs104a - Dec Jan 25
14 pages
Digital Image Processing Concepts
No ratings yet
Digital Image Processing Concepts
39 pages
CS3353 C Programming & Data Structures
No ratings yet
CS3353 C Programming & Data Structures
5 pages
M.Tech - Computer Vision and Image Processing
No ratings yet
M.Tech - Computer Vision and Image Processing
21 pages
Ethics in Artificial Intelligence Course
No ratings yet
Ethics in Artificial Intelligence Course
37 pages
VTU Big Data Analytics Lab Manual
No ratings yet
VTU Big Data Analytics Lab Manual
62 pages
Vision and Mission of Engineering Institute
No ratings yet
Vision and Mission of Engineering Institute
5 pages
DevOps Lab Manual for Computer Engineering
No ratings yet
DevOps Lab Manual for Computer Engineering
100 pages
Data Structures Course Overview and Objectives
No ratings yet
Data Structures Course Overview and Objectives
49 pages
Food Booking System Project Report
No ratings yet
Food Booking System Project Report
20 pages
Geometric Transformations in 2D
No ratings yet
Geometric Transformations in 2D
55 pages
Understanding Distributed Systems Basics
No ratings yet
Understanding Distributed Systems Basics
40 pages
飞鸽电话产品型号概览
No ratings yet
飞鸽电话产品型号概览
4 pages
e - 20250520 Badi Cogs Split
No ratings yet
e - 20250520 Badi Cogs Split
2 pages
Ethics in ICT: Responsibilities and Principles
No ratings yet
Ethics in ICT: Responsibilities and Principles
2 pages
Pwani Innovation Week 2023 Schedule
No ratings yet
Pwani Innovation Week 2023 Schedule
16 pages
Six Sigma Certification Programs Overview
No ratings yet
Six Sigma Certification Programs Overview
12 pages
School Registration Form Template
No ratings yet
School Registration Form Template
1 page
Orbit N1/N2 Specifications Overview
No ratings yet
Orbit N1/N2 Specifications Overview
4 pages
Bid Proposal: Two-Storey Home Construction
No ratings yet
Bid Proposal: Two-Storey Home Construction
10 pages
Fire Safety Training and Protocols
No ratings yet
Fire Safety Training and Protocols
27 pages
Formula Bharat 2027 Guide
No ratings yet
Formula Bharat 2027 Guide
12 pages
Mechatronic Systems Design Assignment Guide
No ratings yet
Mechatronic Systems Design Assignment Guide
10 pages
All Informations and Sizes For A Paper Sizes A0, A1, A2, A3, A4, A5 ...
No ratings yet
All Informations and Sizes For A Paper Sizes A0, A1, A2, A3, A4, A5 ...
2 pages
E-Pass Ticket Booking Guidelines
No ratings yet
E-Pass Ticket Booking Guidelines
2 pages
Missionaries Cannibals Assignment
No ratings yet
Missionaries Cannibals Assignment
5 pages
MathematicsSampleProgram - Year 9
No ratings yet
MathematicsSampleProgram - Year 9
51 pages
JavaScript and CSS Basics Explained
No ratings yet
JavaScript and CSS Basics Explained
15 pages
UK National Annex to Eurocode 1 Fire Actions
No ratings yet
UK National Annex to Eurocode 1 Fire Actions
10 pages
Optimize Your HUMAN Website Usage
No ratings yet
Optimize Your HUMAN Website Usage
3 pages
Bond Graphs for Dynamic System Modeling
No ratings yet
Bond Graphs for Dynamic System Modeling
43 pages
Itel 25v 200ah Lithium Battery Price in Pakistan
No ratings yet
Itel 25v 200ah Lithium Battery Price in Pakistan
1 page
AMK: Innovating Transportation in Saudi Arabia
No ratings yet
AMK: Innovating Transportation in Saudi Arabia
26 pages
BMW ABS Fault Codes and Diagnostics
No ratings yet
BMW ABS Fault Codes and Diagnostics
12 pages
Nested Loops in Computer Applications
No ratings yet
Nested Loops in Computer Applications
19 pages
Exam Date Sheet Scheduler Project Report
No ratings yet
Exam Date Sheet Scheduler Project Report
32 pages
Learn SQL Beginner To Advanced
No ratings yet
Learn SQL Beginner To Advanced
131 pages
Weekly Equipment Safety Checklists
No ratings yet
Weekly Equipment Safety Checklists
11 pages
Computer Security Exam Questions 2024
No ratings yet
Computer Security Exam Questions 2024
2 pages