Machine Learning Lab 146 Certificate
(203105403)
7th SEMESTER
7A10 (CSE)
Name: ............................................................................................
Year/Sem: ....................................................................................
Enrollment No: ..................................................................
Course: ........................................................................................
CERTIFICATE
Mr./Ms..............................................................................................................
with enrolment no. ..........................................................................................
has successfully completed his/her laboratory experiments in the
Machine Learning Laboratory (203105403)
From the Department of ...............................................................
PRACTICAL 1
AIM: Write a program to demonstrate the working of the decision tree
based ID3 algorithm.
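1. from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
2. dataset = datasets.load_iris()
x = dataset.data
y = dataset.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=3)
3. clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy = metrics.accuracy_score(y_test, y_pred)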
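ID3 picks each split by information gain, that is, the drop in entropy from parent node to children. scikit-learn's DecisionTreeClassifier is an optimised CART implementation whose default criterion is Gini impurity, so passing criterion="entropy" is the closest setting to ID3. The sketch below only illustrates the quantity ID3 maximises; the helper names entropy and information_gain and the toy labels are made up for this example and are not part of the practical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Toy labels, invented for illustration.
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # children are pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children look like the parent

print(entropy(parent))                           # 1.0
print(information_gain(parent, perfect_split))   # 1.0
print(information_gain(parent, useless_split))   # 0.0
```

A split that separates the classes completely recovers the full one bit of entropy, while a split that leaves the class mix unchanged gains nothing.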
2203051057146(Aayush Vaghela)
Faculty of Engineering & Technology
Machine Learning (203105403)
B. Tech CSE 4th Year/ 7th Semester
PRACTICAL 2
AIM: Build an Artificial Neural Network by implementing the
Backpropagation algorithm and test the same using appropriate data sets.
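For reference, a minimal from-scratch sketch of backpropagation follows. It is not the code used in this practical (the steps below rely on a library model); the network size, learning rate, seed, and the XOR truth table used as data are all assumptions chosen to keep the example small.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, hidden=3, epochs=3000, lr=0.5, seed=0):
    """Train a 2-input, one-hidden-layer, 1-output sigmoid network with batch
    gradient descent and return the squared-error loss recorded at each epoch."""
    rng = random.Random(seed)
    # Each hidden unit has two input weights plus a bias (last entry).
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # The output unit has one weight per hidden unit plus a bias (last entry).
    w_out = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        grad_hidden = [[0.0, 0.0, 0.0] for _ in range(hidden)]
        grad_out = [0.0] * (hidden + 1)
        loss = 0.0
        for (x1, x2), target in data:
            # Forward pass.
            h = [sigmoid(w[0] * x1 + w[1] * x2 + w[2]) for w in w_hidden]
            out = sigmoid(sum(w_out[j] * h[j] for j in range(hidden)) + w_out[hidden])
            loss += 0.5 * (out - target) ** 2
            # Backward pass: error signal at the output, then pushed back to the hidden layer.
            delta_out = (out - target) * out * (1.0 - out)
            for j in range(hidden):
                delta_h = delta_out * w_out[j] * h[j] * (1.0 - h[j])
                grad_out[j] += delta_out * h[j]
                grad_hidden[j][0] += delta_h * x1
                grad_hidden[j][1] += delta_h * x2
                grad_hidden[j][2] += delta_h
            grad_out[hidden] += delta_out
        # Gradient-descent update with the gradients accumulated over the whole batch.
        for j in range(hidden):
            w_out[j] -= lr * grad_out[j]
            for k in range(3):
                w_hidden[j][k] -= lr * grad_hidden[j][k]
        w_out[hidden] -= lr * grad_out[hidden]
        losses.append(loss)
    return losses

# XOR truth table as a tiny stand-in data set.
xor_data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
losses = train(xor_data)
print(f"loss at first epoch: {losses[0]:.4f}, at last epoch: {losses[-1]:.4f}")
```

The output error is multiplied by the sigmoid derivative and by each outgoing weight to obtain the hidden-layer error, which is the chain rule that gives the algorithm its name.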
1. import pandas as pd
2. url = '[Link]
dbe/bigdatacertification/master/dataset/churn_trasnsformed_new.csv'
df_csv = pd.read_csv(url, sep=',',)
df_csv.head()
3. df = df_csv.drop("Unnamed: 0",axis=1)
[Link]()
[Link]()
5. feature = ['Churn']
train_feature = df.drop(feature, axis=1)
train_target = df["Churn"]
train_feature.head(5)
• Number of layer = 3
Number of Iteration = 137
Current loss computed with the loss function = 0.41786698381624476
• array([[1393, 192],
[ 217, 311]], dtype=int64)
12. acc_mlp = metrics.accuracy_score(y_test, y_predmlp)
prec_mlp = metrics.precision_score(y_test, y_predmlp)
rec_mlp = metrics.recall_score(y_test, y_predmlp)
f1_mlp = metrics.f1_score(y_test, y_predmlp)
kappa_mlp = metrics.cohen_kappa_score(y_test, y_predmlp)
print("Accuracy:",acc_mlp)
print("Precision:",prec_mlp)
print("Recall:",rec_mlp)
print("F1 Score:",f1_mlp)
print("Cohens kappa Score",kappa_mlp)
• Accuracy: 0.8064363464268812
Precision: 0.6182902584493042
Recall: 0.5890151515151515
F1 Score: 0.6032977691561591
Cohens kappa Score 0.4753847881578428
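The scores above can be checked by hand from the confusion matrix printed earlier. The sketch below assumes scikit-learn's layout (rows are true classes, columns are predicted classes, class 0 first) and uses only the four counts from that matrix.

```python
# Confusion matrix as printed above: rows are true classes, columns are predicted classes.
tn, fp = 1393, 192
fn, tp = 217, 311
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Cohen's kappa: observed agreement corrected for the agreement expected by chance.
p_observed = accuracy
p_expected = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / total ** 2
kappa = (p_observed - p_expected) / (1 - p_expected)

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
print("Cohens kappa Score", kappa)
```

Accuracy is high mainly because the negative class is large; precision and recall on the Churn class are both close to 0.6, which the kappa score of about 0.48 also reflects.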
15. plt.figure(figsize=(8,6))
PRACTICAL 3
Aim: Write a program to implement the naïve Bayesian classifier for a
sample training data set stored as a .CSV file. Compute the accuracy of the
classifier, considering a few test data sets.
1. import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
3. playgolf_data
temperature_encoder = LabelEncoder()
playgolf_data['Temperature'] = temperature_encoder.fit_transform(playgolf_data['Temperature'])
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
print("classification report:\n", classification_report(y_test, y_pred))
confusion = confusion_matrix(y_test, y_pred)
# Rows of the confusion matrix are true labels, columns are predicted labels.
confusion_df = pd.DataFrame(confusion, index=['Actual No', 'Actual Yes'], columns=['Predicted No', 'Predicted Yes'])
print("Confusion Matrix:")
print(confusion_df)
sns.heatmap(confusion_df, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.title('Confusion Matrix')
plt.show()
PRACTICAL 4
Aim: Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task.
documents = [
{"text": "The Yankees won the game last night", "label": "sports"},
{"text": "The president gave a speech today", "label": "politics"},
{"text": "The Lakers are on a winning streak", "label": "sports"},
{"text": "The government announced a new policy", "label": "politics"},
{"text": "The Cowboys lost to the Eagles", "label": "sports"},
{"text": "The economy is growing rapidly", "label": "politics"},
{"text": "The Bulls are on a losing streak", "label": "sports"},
{"text": "The president is visiting a foreign country", "label": "politics"},
{"text": "The Packers won the Super Bowl", "label": "sports"},
{"text": "The government is facing a crisis", "label": "politics"}
]
# fit the vectorizer to the training data and transform both the training and testing data
x_train = vectorizer.fit_transform(train_docs)
y_train = train_labels
x_test = vectorizer.transform(test_docs)
y_pred = [Link](x_test)
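To show what the classifier computes on these documents, here is a from-scratch sketch of multinomial naïve Bayes with Laplace (add-one) smoothing. It is a stand-in for illustration, not the vectorizer-based pipeline used above; the function names, the whitespace tokenizer, and the two test sentences are assumptions.

```python
import math
from collections import Counter, defaultdict

documents = [
    ("The Yankees won the game last night", "sports"),
    ("The president gave a speech today", "politics"),
    ("The Lakers are on a winning streak", "sports"),
    ("The government announced a new policy", "politics"),
    ("The Cowboys lost to the Eagles", "sports"),
    ("The economy is growing rapidly", "politics"),
    ("The Bulls are on a losing streak", "sports"),
    ("The president is visiting a foreign country", "politics"),
    ("The Packers won the Super Bowl", "sports"),
    ("The government is facing a crisis", "politics"),
]

def tokenize(text):
    return text.lower().split()

def train(docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def predict(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log prior plus summed log likelihoods."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(documents)
print(predict("The Lakers won the game", *model))            # sports
print(predict("The government announced a speech", *model))  # politics
```

Working in log space avoids underflow when many small probabilities are multiplied, and the add-one term keeps a word that is absent from one class from forcing that class's probability to zero.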
PRACTICAL 5
Aim: Write a program to construct a Bayesian network considering medical data.
Use this model to demonstrate the diagnosis of heart patients using standard
Heart Disease Data Set.
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
# BayesianModel is the class name in older pgmpy releases; newer releases rename it BayesianNetwork.
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination
data = pd.read_csv("/content/[Link]")
heart_disease = pd.DataFrame(data)
print(heart_disease)
model = BayesianModel([
    ('age', 'Lifestyle'),
    ('Gender', 'Lifestyle'),
    ('Family', 'heartdisease'),
    ('diet', 'cholestrol'),
    ('Lifestyle', 'diet'),
    ('cholestrol', 'heartdisease')
])
model.fit(heart_disease, estimator=MaximumLikelihoodEstimator)
HeartDisease_infer = VariableElimination(model)
q = HeartDisease_infer.query(variables=['heartdisease'], evidence=
{
'age': int(input('Enter Age: ')),
'Gender': int(input('Enter Gender: ')),
'Family': int(input('Enter Family History: ')),
'diet': int(input('Enter Diet: ')),
'Lifestyle': int(input('Enter Lifestyle: ')),
'cholestrol': int(input('Enter Cholesterol: '))
})
print(q)
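Maximum-likelihood estimation fills each conditional probability table with relative frequencies counted from the data. The sketch below shows that counting for a single edge, cholestrol to heartdisease, on a handful of invented rows; in the network above heartdisease also depends on Family, which is left out here to keep the example short. The function name estimate_cpt is hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (cholestrol, heartdisease) observations, invented for illustration only.
rows = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (1, 1), (0, 0)]

def estimate_cpt(rows):
    """Return P(child | parent) as relative frequencies of the observed pairs."""
    pair_counts = Counter(rows)
    parent_counts = Counter(parent for parent, _ in rows)
    return {
        (parent, child): Fraction(count, parent_counts[parent])
        for (parent, child), count in pair_counts.items()
    }

cpt = estimate_cpt(rows)
for (parent, child), prob in sorted(cpt.items()):
    print(f"P(heartdisease={child} | cholestrol={parent}) = {prob}")
```

Each entry is simply count(parent, child) divided by count(parent), so the probabilities for a fixed parent value sum to one.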
PRACTICAL 6
AIM: Apply EM algorithm to cluster a set of data stored in a .CSV file. Use
the same data set for clustering using k-Means algorithm.
2. df = pd.read_csv("C:\\Users\\Uditi\\Desktop\\income [Link]")
df.head()
3. plt.scatter(df.Age,df['Income($)'])
plt.xlabel('Age')
plt.ylabel('Income($)')
4. km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income($)']])
y_predicted
• array([1, 1, 2, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2])
5. df['cluster']=y_predicted
df.head()
6. km.cluster_centers_
• array([[3.82857143e+01, 1.50000000e+05],
[3.40000000e+01, 8.05000000e+04],
[3.29090909e+01, 5.61363636e+04]])
7. df1 = df[df.cluster==0]
df2 = df[df.cluster==1]
df3 = df[df.cluster==2]
plt.scatter(df1.Age,df1['Income($)'],color='green')
plt.scatter(df2.Age,df2['Income($)'],color='red')
plt.scatter(df3.Age,df3['Income($)'],color='black')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],color='purple',marker='*',label='centroid')
plt.xlabel('Age')
plt.ylabel('Income ($)')
plt.legend()
8. scaler = MinMaxScaler()
scaler.fit(df[['Income($)']])
df['Income($)'] = scaler.transform(df[['Income($)']])
scaler.fit(df[['Age']])
df['Age'] = scaler.transform(df[['Age']])
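Scaling matters here because income is in the tens of thousands while age is in the tens, so unscaled Euclidean distances are dominated by income. The sketch below shows the min-max formula on made-up values; the helper name min_max_scale is hypothetical and it assumes the column is not constant (max greater than min).

```python
def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Made-up ages and incomes, for illustration only.
ages = [27, 29, 40, 42]
incomes = [70000, 90000, 155000, 160000]

print(min_max_scale(ages))
print(min_max_scale(incomes))
print(min_max_scale([0, 5, 10]))  # [0.0, 0.5, 1.0]
```

After scaling, both columns lie in the range 0 to 1 and contribute comparably to the distance used by k-means.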
9. df.head()
10. km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income($)']])
y_predicted
• array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2])
11. df['cluster']=y_predicted
df.head()
12. km.cluster_centers_
• array([[0.1372549 , 0.11633428],
[0.72268908, 0.8974359 ],
[0.85294118, 0.2022792 ]])
1. sse = []
k_rng = range(1,10)
for k in k_rng:
    km = KMeans(n_clusters=k)
    km.fit(df[['Age','Income($)']])
    sse.append(km.inertia_)
2. plt.xlabel('K')
plt.ylabel('Sum of squared error')
plt.plot(k_rng,sse)
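The value plotted on the y-axis, km.inertia_, is the sum of squared distances from each sample to its closest cluster centre. A small sketch of that quantity on made-up points follows; the function name and the points are assumptions for illustration.

```python
def sum_squared_error(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total

# Two well-separated pairs of points, invented for illustration.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_cluster = sum_squared_error(points, [0, 0, 0, 0], [(5.0, 1.0)])
two_clusters = sum_squared_error(points, [0, 0, 1, 1], [(0.0, 1.0), (10.0, 1.0)])
print(one_cluster, two_clusters)  # 104.0 4.0
```

The error drops sharply when K reaches the natural number of groups and flattens afterwards; that bend is the "elbow" read off the plot.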
PRACTICAL 7
1. import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
2. iris.feature_names
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
3. iris.target_names
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
5. df['target'] = iris.target
df.head()
6. df[df.target==1].head()
7. df[df.target==2].head()
8. df['flower_names']=df.target.apply(lambda x: iris.target_names[x])
df.head()
9. df[45:55]
15. x=df.drop(['target','flower_names'],axis='columns')
y=df.target
16. x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=1)
17. len(x_train)
• 120
18. len(x_test)
• 30
21. [Link](x_test,y_test)
• 0.9666666666666667
22. [Link]([[4.8,3.0,1.5,0.3]])
• array([0])
PRACTICAL 8
1. import pandas as pd
import numpy as np
from sklearn import linear_model
import matplotlib.pyplot as plt
2. df = pd.read_csv("C:\\Users\\Uditi\\Desktop\\homeprices [Link]")
df
3. %matplotlib inline
plt.xlabel('area')
plt.ylabel('price')
plt.scatter(df.area,df.price,color='red',marker='+')
4. new_df = df.drop('price',axis='columns')
new_df
5. price = df.price
price
6. reg = linear_model.LinearRegression()
reg.fit(new_df,price)
7. reg.predict([[3300]])
• array([628715.75342466])
8. reg.coef_
• array([135.78767123])
9. reg.intercept_
• 180616.43835616432
10. # Y = m * X + b
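The prediction in step 7 can be reproduced by plugging the printed coefficient and intercept into the line equation. The sketch below uses only the values shown in steps 8 and 9; the function name predict_price is hypothetical.

```python
# Coefficient and intercept as printed in steps 8 and 9.
m = 135.78767123
b = 180616.43835616432

def predict_price(area):
    """Evaluate the fitted line price = m * area + b."""
    return m * area + b

print(predict_price(3300))  # agrees with the value from step 7 up to the rounding of m
```

The tiny difference from the value in step 7 comes from the coefficient being printed to eight decimal places.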
12. plt.xlabel('area')
plt.ylabel('price')
plt.scatter(df.area,df.price,color='red',marker='+')
price=reg.coef_ *new_df + reg.intercept_
plt.plot(new_df, price, color='red', label='Linear Regression Line')
LOGISTIC REGRESSION
1. import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
2. df=pd.read_csv('C:\\Users\\Uditi\\Desktop\\insurance_data [Link]')
df.head()
3. plt.scatter(df.age,df.bought_insurance,marker='+',color='red')
6. X_test
7. X_train
9. model.fit(X_train , y_train)
12. model.score(X_test,y_test)
• 0.6666666666666666
13. y_predicted
• array([1, 1, 0, 1, 1, 0], dtype=int64)
14. model.coef_
• array([[0.33860165]])
15. model.intercept_
• array([-10.45240375])
18. age=35
prediction_function(age)
• 0.7989910002494708
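The definition of prediction_function is not reproduced in this record. A minimal sketch follows, assuming it applies the standard sigmoid to a linear term with the coefficient and intercept rounded to 0.338 and -10.45. Those rounded values are an assumption, chosen because they reproduce the probability printed above; the full-precision values from steps 14 and 15 would give roughly 0.802 instead.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def prediction_function(age, coef=0.338, intercept=-10.45):
    """Probability of buying insurance; coef and intercept are ASSUMED rounded values."""
    return sigmoid(coef * age + intercept)

print(prediction_function(35))
```

A probability above 0.5 corresponds to the model predicting class 1 (bought insurance) for that age.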
PRACTICAL 9
Aim: Compare the various supervised learning algorithms by using an
appropriate dataset.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import os
print([Link]())
import warnings
warnings.filterwarnings('ignore')
dataset = pd.read_csv("/content/[Link]")
[Link]
[Link](5)
[Link]()
[Link]()
info = ["age",
    "1: male, 0: female",
    "chest pain type, 1: typical angina, 2: atypical angina, 3: non-anginal pain, 4: asymptomatic",
    "resting blood pressure",
    "serum cholesterol in mg/dl",
    "fasting blood sugar > 120 mg/dl",
    "resting electrocardiographic results (values 0,1,2)",
    "maximum heart rate achieved",
    "exercise induced angina",
    "oldpeak = ST depression induced by exercise relative to rest",
    "the slope of the peak exercise ST segment",
    "number of major vessels (0-3) colored by fluoroscopy",
    "thal: 3 = normal; 6 = fixed defect; 7 = reversible defect"]
for i in range(len(info)):
    print(dataset.columns[i]+":\t\t\t"+info[i])
dataset["target"].describe()
dataset["target"].unique()
print(dataset.corr()["target"].abs().sort_values(ascending=False))
y = dataset["target"]
sns.countplot(y)
target_temp = dataset.target.value_counts()
print(target_temp)
dataset["sex"].unique()
[Link](dataset["sex"])
X_train.shape
X_test.shape
Y_train.shape
Y_test.shape
Model Fitting
Logistic Regression
Y_pred_lr.shape
# scikit-learn metrics take the true labels first, then the predictions.
score_lr = round(accuracy_score(Y_test,Y_pred_lr)*100,2)
print("The accuracy score achieved using Logistic Regression is: "+str(score_lr)+" %")
print(classification_report(Y_test,Y_pred_lr))
Naive Bayes
nb = GaussianNB()
nb.fit(X_train,Y_train)
Y_pred_nb = nb.predict(X_test)
Y_pred_nb.shape
score_nb = round(accuracy_score(Y_test,Y_pred_nb)*100,2)
print("The accuracy score achieved using Naive Bayes is: "+str(score_nb)+" %")
print(classification_report(Y_test,Y_pred_nb))
Y_pred_svm.shape
score_svm = round(accuracy_score(Y_test,Y_pred_svm)*100,2)
print("The accuracy score achieved using Linear SVM is: "+str(score_svm)+" %")
print(classification_report(Y_test,Y_pred_svm))
K Nearest Neighbours
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train,Y_train)
Y_pred_knn=knn.predict(X_test)
Y_pred_knn.shape
score_knn = round(accuracy_score(Y_test,Y_pred_knn)*100,2)
print("The accuracy score achieved using KNN is: "+str(score_knn)+" %")
Decision Tree
max_accuracy = 0
# Note: the seed is chosen by test-set accuracy, so the reported score is optimistic.
for x in range(200):
    dt = DecisionTreeClassifier(random_state=x)
    dt.fit(X_train,Y_train)
    Y_pred_dt = dt.predict(X_test)
    current_accuracy = round(accuracy_score(Y_test,Y_pred_dt)*100,2)
    if(current_accuracy>max_accuracy):
        max_accuracy = current_accuracy
        best_x = x
#print(max_accuracy)
#print(best_x)
dt = DecisionTreeClassifier(random_state=best_x)
dt.fit(X_train,Y_train)
Y_pred_dt = dt.predict(X_test)
print(Y_pred_dt.shape)
score_dt = round(accuracy_score(Y_test,Y_pred_dt)*100,2)
print("The accuracy score achieved using Decision Tree is: "+str(score_dt)+" %")
Random Forest
max_accuracy = 0
# Note: the seed is chosen by test-set accuracy, so the reported score is optimistic.
for x in range(2000):
    rf = RandomForestClassifier(random_state=x)
    rf.fit(X_train,Y_train)
    Y_pred_rf = rf.predict(X_test)
    current_accuracy = round(accuracy_score(Y_test,Y_pred_rf)*100,2)
    if(current_accuracy>max_accuracy):
        max_accuracy = current_accuracy
        best_x = x
#print(max_accuracy)
#print(best_x)
rf = RandomForestClassifier(random_state=best_x)
rf.fit(X_train,Y_train)
Y_pred_rf = rf.predict(X_test)
Y_pred_rf.shape
score_rf = round(accuracy_score(Y_test,Y_pred_rf)*100,2)
print("The accuracy score achieved using Random Forest is: "+str(score_rf)+" %")
print(classification_report(Y_test,Y_pred_rf))
Y_pred_xgb = xgb_model.predict(X_test)
Y_pred_xgb.shape
score_xgb = round(accuracy_score(Y_test,Y_pred_xgb)*100,2)
print("The accuracy score achieved using XGBoost is: "+str(score_xgb)+" %")
print(classification_report(Y_test,Y_pred_xgb))
scores = [score_lr,score_nb,score_svm,score_knn,score_dt,score_rf,score_xgb]
algorithms = ["Logistic Regression","Naive Bayes","Support Vector Machine","K-Nearest Neighbors","Decision Tree","Random Forest","XGBoost"]
for i in range(len(algorithms)):
    print("The accuracy score achieved using "+algorithms[i]+" is: "+str(scores[i])+" %")
PRACTICAL 10
Aim: Compare the various unsupervised learning algorithms by using the
appropriate datasets.
print(f"Accuracy: {hierarchical_metrics[0]:.4f}")
print(f"Precision: {hierarchical_metrics[1]:.4f}")
print(f"Recall: {hierarchical_metrics[2]:.4f}")
print(f"F1 Score: {hierarchical_metrics[3]:.4f}")
Using DBSCAN:-
# Apply DBSCAN
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan_labels = dbscan.fit_predict(X)
# Function to compute evaluation metrics
def compute_metrics(y_true, y_pred):
    y_pred_matched = cluster_labels_to_true_labels(y_true, y_pred)[y_pred]
    accuracy = accuracy_score(y_true, y_pred_matched)
    precision = precision_score(y_true, y_pred_matched, average='weighted')
    recall = recall_score(y_true, y_pred_matched, average='weighted')
    f1 = f1_score(y_true, y_pred_matched, average='weighted')
    return accuracy, precision, recall, f1
# Compute metrics for GMM clustering
gmm_metrics = compute_metrics(y, gmm_labels)
# Plot the results
df = pd.DataFrame(X, columns=iris.feature_names)
df['Cluster'] = gmm_labels
df['True Label'] = y
sns.pairplot(df, hue='Cluster', palette='viridis')
[Link]('GMM Clustering')
plt.show()
print(f"Accuracy: {gmm_metrics[0]:.4f}")
print(f"Precision: {gmm_metrics[1]:.4f}")
print(f"Recall: {gmm_metrics[2]:.4f}")
print(f"F1 Score: {gmm_metrics[3]:.4f}")
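Cluster ids are arbitrary, so they must be mapped to class labels before accuracy-style metrics mean anything. The helper cluster_labels_to_true_labels called in compute_metrics is not shown in this record; the sketch below is one plausible stand-in based on majority voting. It returns a dictionary, whereas the indexing in compute_metrics implies the original returned a NumPy array, and the toy labels are invented.

```python
from collections import Counter

def cluster_labels_to_true_labels(y_true, y_pred):
    """Map each cluster id to the most common true label among its members."""
    mapping = {}
    for cluster in set(y_pred):
        members = [t for t, p in zip(y_true, y_pred) if p == cluster]
        mapping[cluster] = Counter(members).most_common(1)[0][0]
    return mapping

# Toy example: three classes, cluster ids permuted, one sample in the wrong cluster.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [2, 2, 2, 0, 0, 1, 1, 1, 1]

mapping = cluster_labels_to_true_labels(y_true, y_pred)
y_matched = [mapping[p] for p in y_pred]
accuracy = sum(t == m for t, m in zip(y_true, y_matched)) / len(y_true)
print(mapping)
print(accuracy)
```

With DBSCAN, the noise label -1 would be treated by this sketch as just another cluster id, which may or may not be the behaviour wanted when comparing algorithms.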