Experiment 1:
How to Install PIP on Windows?
Before installing pip for Python on Windows, here is a brief introduction to Python. Python is a
widely used, general-purpose, high-level programming language that lets you work quickly and
integrate systems efficiently.
pip is a package management system used to install and manage software packages and libraries
written in Python. These packages are stored in a large online repository called the Python
Package Index (PyPI).
pip uses PyPI as the default source for packages and their dependencies.
Download and Install pip:
pip can be downloaded and installed from the command line using the following steps:
Download the get-pip.py file and store it in the directory where Python is installed.
In the command line, change the current directory to the directory that contains the
downloaded file.
Run the command given below:
python get-pip.py
and wait for the installation to finish.
pip is now installed on your system.
Verification of the Installation process:
To verify that pip has been installed correctly, check its version. Open the command line and
run the following command:
pip -V
To Install Various Packages using PIP :
Syntax : pip install <package_name>
pip will look for that package on PyPI and if found, it will download and install the package on your
local system.
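After installing packages with pip, a script can confirm that each one is importable before relying on it. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and chosen for illustration. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None

# Report the status of the packages used in this manual.
for name in ["numpy", "scipy", "matplotlib", "sklearn"]:
    status = "installed" if is_installed(name) else "missing"
    print(name, "->", status)
```

Any package reported as missing can then be installed with pip install <package_name>.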
Packages :
a) NumPy:
NumPy is a Python package. The name stands for 'Numerical Python'. It is a library consisting of
multidimensional array objects and a collection of routines for processing those arrays.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was
also developed, offering some additional functionality. In 2005, Travis Oliphant created the NumPy
package by incorporating the features of Numarray into the Numeric package. There are many
contributors to this open source project.
Operations using NumPy
Using NumPy, a developer can perform the following operations −
Mathematical and logical operations on arrays.
Fourier transforms and routines for shape manipulation.
Operations related to linear algebra. NumPy has in-built functions for linear algebra and
random number generation.
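The operations listed above can be sketched in a few lines. The arrays and values below are hypothetical and chosen only so that the results are easy to check by hand.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# Mathematical and logical operations act element by element.
total = a + b
mask = a > 2

# Shape manipulation: flatten the 2x2 array into 4 elements.
flat = a.reshape(4)

# Fourier transform of a constant signal: all energy sits in bin 0.
spectrum = np.fft.fft(np.ones(4))

# Linear algebra: solve a @ x = [5, 11] for x.
x = np.linalg.solve(a, np.array([5.0, 11.0]))

# Random number generation with a fixed seed for repeatability.
rng = np.random.default_rng(0)
sample = rng.random(3)

print(total, mask, flat, spectrum, x, sample, sep="\n")
```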
NumPy – A Replacement for MATLAB
NumPy is often used along with packages like SciPy (Scientific Python) and Matplotlib (plotting
library). This combination is widely used as a replacement for MATLAB, a popular platform for
technical computing. However, the Python alternative to MATLAB is now seen as a more modern and
complete programming language.
NumPy is also open source, which is an added advantage.
b) SciPy:
SciPy, pronounced "Sigh Pie", is an open-source scientific Python library, distributed under the BSD
license, for performing mathematical, scientific and engineering computations.
The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array
manipulation. The SciPy library is built to work with NumPy arrays and provides many user-friendly
and efficient numerical routines, such as those for numerical integration and optimization.
Together, they run on all popular operating systems, are quick to install and are free of charge.
NumPy and SciPy are easy to use, yet powerful enough to be depended upon by some of the world's
leading scientists and engineers.
SciPy Sub-packages
SciPy is organized into sub-packages covering different scientific computing domains. These are
summarized in the following table −
scipy.cluster        Vector quantization / K-means
scipy.constants      Physical and mathematical constants
scipy.fftpack        Fourier transform
scipy.integrate      Integration routines
scipy.interpolate    Interpolation
scipy.io             Data input and output
scipy.linalg         Linear algebra routines
scipy.ndimage        n-dimensional image package
scipy.odr            Orthogonal distance regression
scipy.optimize       Optimization
scipy.signal         Signal processing
scipy.sparse         Sparse matrices
scipy.spatial        Spatial data structures and algorithms
scipy.special        Any special mathematical functions
scipy.stats          Statistics
c) Matplotlib
Commonly used functions (with matplotlib.pyplot imported as plt):
plt.plot(x, y): plots x and y using the default line style and color.
plt.axis([xmin, xmax, ymin, ymax]): sets the x-axis and y-axis limits to the given minimum and
maximum values.
plt.plot(x, y, color='green', marker='o', linestyle='dashed', linewidth=2, markersize=12): marks
the x and y coordinates with circular markers of size 12 and joins them with a green dashed
line of width 2.
plt.xlabel('X-axis'): names the x-axis.
plt.ylabel('Y-axis'): names the y-axis.
plt.plot(x, y, label='Sample line'): labels the plotted line 'Sample line'; the label is displayed
when a legend is shown.
d) scikit-learn
Scikit-Learn, also known as sklearn is a python library to implement machine learning models and
statistical modelling. Through scikit-learn, we can implement various machine learning models for
regression, classification, clustering, and statistical tools for analyzing these models. It also provides
functionality for dimensionality reduction, feature selection, feature extraction, ensemble techniques,
and inbuilt datasets. We will be looking into these features one by one.
This library is built upon NumPy, SciPy, and Matplotlib.
Write a program to read two numbers from user and display the result using bitwise &
, | and ^ operators on the numbers
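a = int(input("Enter first number: "))
b = int(input("Enter second number: "))
# Apply each of the three bitwise operators named in the task
print("Bitwise AND Operation of", a, "and", b, "=", a & b)
print("Bitwise OR Operation of", a, "and", b, "=", a | b)
print("Bitwise XOR Operation of", a, "and", b, "=", a ^ b)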
Write a program to calculate the sum of numbers from 1 to 20 which are not
divisible by 2, 3 or 5.
def find_sum(limit):
    # Add up every number from 1 to limit that is not divisible by 2, 3 or 5
    total = 0
    for num in range(1, limit + 1):
        if num % 2 != 0 and num % 3 != 0 and num % 5 != 0:
            total += num
    return total
# Driver code
print("Sum =", find_sum(20))
Write a program to find the maximum of two numbers using functions.
def maximum(a, b):
    if a >= b:
        return a
    else:
        return b
# Driver code
a = 2
b = 4
print(maximum(a, b))
Implement slicing operation on strings and lists.
# String slicing
String ='ASTRING'
# Using slice constructor
s1 = slice(3)
s2 = slice(1, 5, 2)
s3 = slice(-1, -12, -2)
print("String slicing")
print(String[s1])
print(String[s2])
print(String[s3])
# Initialize list
Lst = [50, 70, 30, 20, 90, 10, 50]
# Display list
print(Lst[-7::1])
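The same slices can be written with the more common bracket notation start:stop:step. The sketch below reuses the string and list from the program above; the variable names `text` and `numbers` are chosen for illustration.

```python
text = "ASTRING"
numbers = [50, 70, 30, 20, 90, 10, 50]

first_three = text[:3]        # characters at index 0, 1 and 2
every_other = text[1:5:2]     # characters at index 1 and 3
reversed_text = text[::-1]    # whole string, back to front
middle = numbers[2:5]         # elements at index 2, 3 and 4
last_three = numbers[-3:]     # final three elements
alternate = numbers[::2]      # every second element

print(first_three, every_other, reversed_text)
print(middle, last_three, alternate)
```

Omitting start or stop means "from the beginning" or "to the end", and a negative step walks the sequence in reverse.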
Experiment 2:
Implement python program to load structured data onto Data Frame and perform
exploratory data analysis
import pandas as pd
import matplotlib.pyplot as plt
Df = pd.read_csv('[Link]')
print([Link]())
print(Df["Education"].value_counts())
print(Df.groupby(['Education', 'Age']).mean(numeric_only=True))
y = list([Link])
[Link](y)
plt.show()
Implement python program for data preparation activities such as filtering, grouping,
ordering and joining of datasets.
import pandas as pd
df = pd.read_csv('[Link]')
# Filter: keep only the rows where Age is at least 60
df = df[df['Age'] >= 60]
print(df)
Merging
# import module
import pandas as pd
# creating DataFrame for Student Details
details = pd.DataFrame({
    'ID': [101, 102, 103, 104, 105, 106,
           107, 108, 109, 110],
    'NAME': ['Jagroop', 'Praveen', 'Harjot',
             'Pooja', 'Rahul', 'Nikita',
             'Saurabh', 'Ayush', 'Dolly', "Mohit"],
    'BRANCH': ['CSE', 'CSE', 'CSE', 'CSE', 'CSE',
               'CSE', 'CSE', 'CSE', 'CSE', 'CSE']})
# printing details
print(details)
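The four data preparation activities named in this experiment (filtering, grouping, ordering and joining) can be sketched together on two small tables. The `marks` table, the MARKS column and the ECE branch values are hypothetical and introduced only for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
details = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "NAME": ["Jagroop", "Praveen", "Harjot", "Pooja"],
    "BRANCH": ["CSE", "CSE", "ECE", "ECE"],
})
marks = pd.DataFrame({
    "ID": [101, 102, 103, 105],
    "MARKS": [82, 67, 91, 74],
})

# Joining: an inner merge keeps only the IDs present in both tables.
merged = pd.merge(details, marks, on="ID", how="inner")

# Filtering: students who scored at least 80.
top = merged[merged["MARKS"] >= 80]

# Grouping: mean marks per branch.
branch_mean = merged.groupby("BRANCH")["MARKS"].mean()

# Ordering: highest marks first.
ranked = merged.sort_values("MARKS", ascending=False)

print(merged, top, branch_mean, ranked, sep="\n\n")
```

Changing how="inner" to "left", "right" or "outer" controls which unmatched IDs are kept in the joined table.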
Experiment 3:
Implement Python program to prepare plots such as bar plot, histogram, distribution
plot, box plot, scatter plot.
Histogram:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20
# Creating distribution
x = np.random.randn(N_points)
y = .8 ** x + np.random.randn(10000) + 25
# Creating histogram
fig, axs = plt.subplots(1, 1, figsize=(10, 7), tight_layout=True)
axs.hist(x, bins=n_bins)
# Show plot
plt.show()
barplot:
import numpy as np
import matplotlib.pyplot as plt
# creating the dataset
data = {'C':20, 'C++':15, 'Java':30,
        'Python':35}
courses = list(data.keys())
values = list(data.values())
fig = plt.figure(figsize = (10, 5))
# creating the bar plot
plt.bar(courses, values, color ='maroon',
        width = 0.4)
plt.xlabel("Courses offered")
plt.ylabel("No. of students enrolled")
plt.title("Students enrolled in different courses")
plt.show()
scatter plot:
import matplotlib.pyplot as plt
# dataset-1
x1 = [89, 43, 36, 36, 95, 10,66, 34, 38, 20]
y1 = [21, 46, 3, 35, 67, 95,53, 72, 58, 10]
# dataset2
x2 = [26, 29, 48, 64, 6, 5,36, 66, 72, 40]
y2 = [26, 34, 90, 33, 38,20, 56, 2, 47, 15]
plt.scatter(x1, y1, c ="pink", linewidths = 2, marker ="s", edgecolor ="green", s = 50)
plt.scatter(x2, y2, c ="yellow", linewidths = 2, marker ="^", edgecolor ="red", s = 200)
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
boxplot:
# Import libraries
import matplotlib.pyplot as plt
import numpy as np
# Creating dataset
np.random.seed(10)
data_1 = np.random.normal(100, 10, 200)
data_2 = np.random.normal(90, 20, 200)
data_3 = np.random.normal(80, 30, 200)
data_4 = np.random.normal(70, 40, 200)
data = [data_1, data_2, data_3, data_4]
fig = plt.figure(figsize =(10, 7))
# Creating axes instance
ax = fig.add_axes([0, 0, 1, 1])
# Creating plot
bp = ax.boxplot(data)
# show plot
plt.show()
Distribution plot:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20
# Creating distribution
x = np.random.randn(N_points)
y = .8 ** x + np.random.randn(10000) + 25
legend = ['distribution']
# Creating histogram
fig, axs = plt.subplots(1, 1, figsize =(10, 7), tight_layout = True)
# Remove axes spines
for s in ['top', 'bottom', 'left', 'right']:
    axs.spines[s].set_visible(False)
# Remove x, y ticks
axs.xaxis.set_ticks_position('none')
axs.yaxis.set_ticks_position('none')
# Add padding between axes and labels
axs.xaxis.set_tick_params(pad = 5)
axs.yaxis.set_tick_params(pad = 10)
# Add x, y gridlines
axs.grid(True, color ='grey', linestyle ='-.', linewidth = 0.5, alpha = 0.6)
# Add Text watermark
fig.text(0.9, 0.15, 'Jeeteshgavande30', fontsize = 12, color ='red', ha ='right', va ='bottom',
         alpha = 0.7)
# Creating histogram
N, bins, patches = axs.hist(x, bins = n_bins)
# Setting color: scale each bar's colour by its relative height
fracs = ((N**(1 / 5)) / N.max())
norm = colors.Normalize(fracs.min(), fracs.max())
for thisfrac, thispatch in zip(fracs, patches):
    color = plt.cm.viridis(norm(thisfrac))
    thispatch.set_facecolor(color)
# Adding extra features
plt.xlabel("X-axis")
plt.ylabel("y-axis")
plt.legend(legend)
plt.title('Customized histogram')
# Show plot
plt.show()
Experiment 4:
Implement Simple Linear regression algorithm in Python.
import numpy as np
import matplotlib.pyplot as plt
def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)
def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color = "m",
                marker = "o", s = 30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color = "g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()
def main():
    # observations / data
    x = np.array([i for i in range(11)])
    y = np.array([2*i for i in range(11)])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)
if __name__ == "__main__":
    main()
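One way to check the closed-form estimates is to compare them with NumPy's own least-squares fit. The sketch below repeats the formulas from estimate_coef on hypothetical data that lies exactly on the line y = 2x + 1, then compares the result with np.polyfit.

```python
import numpy as np

x = np.arange(11, dtype=float)
y = 2 * x + 1          # hypothetical data lying exactly on y = 2x + 1

# Closed-form least-squares estimates, as in estimate_coef.
n = x.size
ss_xy = np.sum(x * y) - n * x.mean() * y.mean()
ss_xx = np.sum(x * x) - n * x.mean() ** 2
b_1 = ss_xy / ss_xx
b_0 = y.mean() - b_1 * x.mean()

# np.polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(x, y, 1)

print("closed form:", b_0, b_1)
print("np.polyfit :", intercept, slope)
```

Both approaches should agree up to floating-point rounding, since they minimise the same squared error.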
Implement Gradient Descent algorithm for the above linear regression model.
# Implementation of gradient descent in linear regression
import numpy as np
import matplotlib.pyplot as plt
class Linear_Regression:
    def __init__(self, X, Y):
        self.X = X
        self.Y = Y
        self.b = [0, 0]
    def update_coeffs(self, learning_rate):
        Y_pred = self.predict()
        Y = self.Y
        m = len(Y)
        self.b[0] = self.b[0] - (learning_rate * ((1/m) * np.sum(Y_pred - Y)))
        self.b[1] = self.b[1] - (learning_rate * ((1/m) * np.sum((Y_pred - Y) * self.X)))
    def predict(self, X=None):
        Y_pred = np.array([])
        # fall back to the training inputs when no X is given
        if X is None:
            X = self.X
        b = self.b
        for x in X:
            Y_pred = np.append(Y_pred, b[0] + (b[1] * x))
        return Y_pred
    def get_current_accuracy(self, Y_pred):
        p, e = Y_pred, self.Y
        n = len(Y_pred)
        return 1-sum([abs(p[i]-e[i])/e[i] for i in range(n) if e[i] != 0])/n
    def compute_cost(self, Y_pred):
        m = len(self.Y)
        # mean squared error cost: square each residual before summing
        J = (1 / (2*m)) * np.sum((Y_pred - self.Y)**2)
        return J
    def plot_best_fit(self, Y_pred, fig):
        f = plt.figure(fig)
        plt.scatter(self.X, self.Y, color='b')
        plt.plot(self.X, Y_pred, color='g')
        f.show()
def main():
    X = np.array([i for i in range(11)])
    Y = np.array([2*i for i in range(11)])
    regressor = Linear_Regression(X, Y)
    iterations = 0
    steps = 100
    learning_rate = 0.01
    costs = []
    #original best-fit line
    Y_pred = regressor.predict()
    regressor.plot_best_fit(Y_pred, 'Initial Best Fit Line')
    while 1:
        Y_pred = regressor.predict()
        cost = regressor.compute_cost(Y_pred)
        costs.append(cost)
        regressor.update_coeffs(learning_rate)
        iterations += 1
        if iterations % steps == 0:
            print(iterations, "epochs elapsed")
            print("Current accuracy is :", regressor.get_current_accuracy(Y_pred))
            stop = input("Do you want to stop (y/*)??")
            if stop == "y":
                break
    #final best-fit line
    regressor.plot_best_fit(Y_pred, 'Final Best Fit Line')
    #plot to verify cost function decreases
    h = plt.figure('Verification')
    plt.plot(range(iterations), costs, color='b')
    # keep all figure windows open until they are closed
    plt.show()
    # if user wants to predict using the regressor:
    regressor.predict([i for i in range(10)])
if __name__ == '__main__':
    main()
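For comparison, here is a compact, non-interactive sketch of the same batch gradient descent, written with vectorized NumPy operations. The function name `gradient_descent` and the fixed epoch count are assumptions made for illustration; they are not part of the program above.

```python
import numpy as np

def gradient_descent(X, Y, learning_rate=0.01, epochs=5000):
    """Fit Y ~ b0 + b1 * X by batch gradient descent on the squared-error cost."""
    b0, b1 = 0.0, 0.0
    m = len(Y)
    costs = []
    for _ in range(epochs):
        error = (b0 + b1 * X) - Y
        costs.append(np.sum(error ** 2) / (2 * m))
        # Partial derivatives of the cost with respect to b0 and b1.
        b0 -= learning_rate * np.sum(error) / m
        b1 -= learning_rate * np.sum(error * X) / m
    return b0, b1, costs

X = np.arange(11, dtype=float)
Y = 2 * X
b0, b1, costs = gradient_descent(X, Y)
print("b0 =", b0, "b1 =", b1)
print("first cost =", costs[0], "last cost =", costs[-1])
```

The recorded costs should fall steadily; if they grow instead, the learning rate is too large for the scale of X.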
Experiment 5:
Implement Multiple linear regression algorithm using Python.
import numpy as np
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
def generate_dataset(n):
    x = []
    y = []
    random_x1 = np.random.rand()
    random_x2 = np.random.rand()
    for i in range(n):
        x1 = i
        x2 = i/2 + np.random.rand()*n
        x.append([1, x1, x2])
        y.append(random_x1 * x1 + random_x2 * x2 + 1)
    return np.array(x), np.array(y)
x, y = generate_dataset(200)
mpl.rcParams['legend.fontsize'] = 12
# create a 3-D axes for the two features and the response
ax = plt.axes(projection ='3d')
ax.scatter(x[:, 1], x[:, 2], y, label ='y', s = 5)
ax.legend()
ax.view_init(45, 0)
plt.show()
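The program above generates and plots a dataset with two features. One way to fit the multiple linear regression itself is an ordinary least-squares solve, sketched below with np.linalg.lstsq. The coefficient values in `true_coef` and the noise level are hypothetical, chosen so that the estimate can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = np.arange(n, dtype=float)
x2 = x1 / 2 + rng.random(n) * n
X = np.column_stack([np.ones(n), x1, x2])   # first column models the intercept

true_coef = np.array([1.0, 0.5, 0.3])       # hypothetical coefficients
y = X @ true_coef + rng.normal(0, 0.01, n)  # small noise

# Least-squares solution of X @ coef = y.
coef, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print("Estimated coefficients:", coef)
```

The leading column of ones lets the intercept be estimated together with the two slopes.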
Experiment 6:
Implement a Python program to build logistic regression and decision tree models
using the Python packages statsmodels and scikit-learn (sklearn).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
col_names = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
             'DiabetesPedigreeFunction', 'Age', 'Outcome']
# load dataset
pima = pd.read_csv("[Link]", header=None, names=col_names)
feature_cols = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
                'DiabetesPedigreeFunction', 'Age']
X = pima[feature_cols] # Features
y = pima.Outcome # Target variable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
logreg = LogisticRegression()
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
cnf_matrix = metrics.confusion_matrix(y_test, y_pred)
print(cnf_matrix)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
print("Precision:", metrics.precision_score(y_test, y_pred))
print("Recall:", metrics.recall_score(y_test, y_pred))
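The accuracy, precision and recall printed above all derive from the four cells of the confusion matrix. The following sketch computes them by hand on a small set of hypothetical labels, using only built-in Python.

```python
# Hypothetical true and predicted labels (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_hat))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are right
precision = tp / (tp + fp)          # share of predicted positives that are right
recall = tp / (tp + fn)             # share of actual positives that were found

print("TP, TN, FP, FN =", tp, tn, fp, fn)
print("Accuracy:", accuracy, "Precision:", precision, "Recall:", recall)
```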
6b) Decision tree
import pandas as pd
from sklearn.tree import DecisionTreeClassifier # Import Decision Tree Classifier
from sklearn.model_selection import train_test_split # Import train_test_split function
from sklearn import metrics
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
# load dataset
pima = pd.read_csv("[Link]", header=None, names=col_names)
feature_cols = ['pregnant', 'insulin', 'bmi', 'age','glucose','bp','pedigree']
X = pima[feature_cols] # Features
y = pima.label # Target variable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
clf = DecisionTreeClassifier()
clf = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
Experiment 7:
Write a Python program to implement k-Nearest Neighbour algorithm to classify
the iris data set. Print both correct and wrong predictions
# k-Nearest Neighbour algorithm (lab)
from sklearn.datasets import load_iris
iris = load_iris()
print("Feature Names:", iris.feature_names, "Iris Data:", iris.data,
      "Target Names:", iris.target_names, "Target:", iris.target)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size = .25)
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier()
clf.fit(X_train, y_train)
print(" Accuracy=", clf.score(X_test, y_test))
print("Predicted Data")
prediction = clf.predict(X_test)
print(prediction)
print("Test data :")
print(y_test)
# a non-zero difference marks a wrong prediction
diff = prediction - y_test
print("Result is ")
print(diff)
print('Total no of samples misclassified =', (diff != 0).sum())
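To show what the classifier does internally, and to print each prediction as correct or wrong, the following is a minimal from-scratch sketch of k-nearest-neighbour voting. The function `knn_predict` and the small 2-D dataset are hypothetical; this is not scikit-learn's implementation.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, point, k=3):
    """Classify one point by majority vote among its k nearest neighbours."""
    distances = np.linalg.norm(X_train - point, axis=1)   # Euclidean distance
    nearest = np.argsort(distances)[:k]
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Hypothetical 2-D data: class 0 near the origin, class 1 near (5, 5).
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.5, 0.5], [5.5, 5.5], [4.0, 4.0]])
y_test = np.array([0, 1, 0])   # last label deliberately disagrees with its neighbours

results = []
for point, actual in zip(X_test, y_test):
    predicted = knn_predict(X_train, y_train, point)
    verdict = "correct" if predicted == actual else "wrong"
    results.append(verdict)
    print(point, "predicted:", predicted, "actual:", actual, "->", verdict)
```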
Experiment 8:
Implement Support vector Machine algorithm on any data set
# Support Vector Machine (lab)
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn import metrics
cancer_data = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(cancer_data.data, cancer_data.target,
                                                    test_size=0.4, random_state=109)
#create a classifier
cls = svm.SVC(kernel="linear")
#train the model
cls.fit(X_train, y_train)
#predict the response
pred = cls.predict(X_test)
#accuracy
print("accuracy:", metrics.accuracy_score(y_test, y_pred=pred))
#precision score
print("precision:", metrics.precision_score(y_test, y_pred=pred))
#recall score
print("recall:", metrics.recall_score(y_test, y_pred=pred))
print(metrics.classification_report(y_test, y_pred=pred))
Experiment 9:
Write a program to implement the naive Bayesian classifier for a sample training
data set stored as a .csv file. Compute the accuracy of the classifier, considering
few test data sets
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
data = pd.read_csv('[Link]')
print("The first 5 values of data is :\n", data.head())
# copy the feature columns so they can be encoded in place
X = data.iloc[:,:-1].copy()
print("\nThe First 5 values of train data is\n", X.head())
y = data.iloc[:,-1]
print("\nThe first 5 values of Train output is\n", y.head())
# encode each categorical column as integers
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the Train data is :\n", X.head())
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n", y)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
classifier = GaussianNB()
classifier.fit(X_train, y_train)
from sklearn.metrics import accuracy_score
print("Accuracy is:", accuracy_score(y_test, classifier.predict(X_test)))
Experiment 10:
Write a Python program to construct a Bayesian network considering medical data. Use
this model to demonstrate the diagnosis of heart patients using standard Heart Disease
Data Set
# Bayesian network (lab)
import bayespy as bp
import numpy as np
import csv
from colorama import init
from colorama import Fore, Back, Style
init()
# Map each categorical value to an integer code
ageEnum = {'SuperSeniorCitizen':0, 'SeniorCitizen':1, 'MiddleAged':2, 'Youth':3,'Teen':4}
genderEnum = {'Male':0, 'Female':1}
familyHistoryEnum = {'Yes':0, 'No':1}
dietEnum = {'High':0, 'Medium':1, 'Low':2}
lifeStyleEnum = {'Athlete':0, 'Active':1, 'Moderate':2, 'Sedetary':3}
cholesterolEnum = {'High':0, 'BorderLine':1, 'Normal':2}
heartDiseaseEnum = {'Yes':0, 'No':1}
with open('[Link]') as csvfile:
    lines = csv.reader(csvfile)
    dataset = list(lines)
    data = []
    for x in dataset:
        data.append([ageEnum[x[0]], genderEnum[x[1]], familyHistoryEnum[x[2]], dietEnum[x[3]],
                     lifeStyleEnum[x[4]], cholesterolEnum[x[5]], heartDiseaseEnum[x[6]]])
data = np.array(data)
N = len(data)
# One Dirichlet prior and one observed categorical node per attribute
p_age = bp.nodes.Dirichlet(1.0*np.ones(5))
age = bp.nodes.Categorical(p_age, plates=(N,))
age.observe(data[:,0])
p_gender = bp.nodes.Dirichlet(1.0*np.ones(2))
gender = bp.nodes.Categorical(p_gender, plates=(N,))
gender.observe(data[:,1])
p_familyhistory = bp.nodes.Dirichlet(1.0*np.ones(2))
familyhistory = bp.nodes.Categorical(p_familyhistory, plates=(N,))
familyhistory.observe(data[:,2])
p_diet = bp.nodes.Dirichlet(1.0*np.ones(3))
diet = bp.nodes.Categorical(p_diet, plates=(N,))
diet.observe(data[:,3])
p_lifestyle = bp.nodes.Dirichlet(1.0*np.ones(4))
lifestyle = bp.nodes.Categorical(p_lifestyle, plates=(N,))
lifestyle.observe(data[:,4])
p_cholesterol = bp.nodes.Dirichlet(1.0*np.ones(3))
cholesterol = bp.nodes.Categorical(p_cholesterol, plates=(N,))
cholesterol.observe(data[:,5])
# Heart disease depends on all six attributes
p_heartdisease = bp.nodes.Dirichlet(np.ones(2), plates=(5, 2, 2, 3, 4, 3))
heartdisease = bp.nodes.MultiMixture([age, gender, familyhistory, diet, lifestyle, cholesterol],
                                     bp.nodes.Categorical, p_heartdisease)
heartdisease.observe(data[:,6])
p_heartdisease.update()
m = 0
while m == 0:
    print("\n")
    res = bp.nodes.MultiMixture([
        int(input('Enter Age: ' + str(ageEnum))),
        int(input('Enter Gender: ' + str(genderEnum))),
        int(input('Enter FamilyHistory: ' + str(familyHistoryEnum))),
        int(input('Enter Diet: ' + str(dietEnum))),
        int(input('Enter LifeStyle: ' + str(lifeStyleEnum))),
        int(input('Enter Cholesterol: ' + str(cholesterolEnum)))],
        bp.nodes.Categorical, p_heartdisease).get_moments()[0][heartDiseaseEnum['Yes']]
    print("Probability(HeartDisease) = " + str(res))
    m = int(input("Enter for Continue:0, Exit :1 "))
Experiment 11:
Assuming a set of documents that need to be classified, use the naive Bayesian Classifier
model to perform this task. Built-in library classes/APIs can be used to write the program.
Calculate the accuracy, precision and recall for your data set
from sklearn.datasets import fetch_20newsgroups
twenty_train = fetch_20newsgroups(subset='train', shuffle=True)
print("Length of the twenty_train--------->", len(twenty_train.data))
print("**First Line of the First Data File**")
print(twenty_train.data[0].split("\n")[0])
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(twenty_train.data)
print('dim=', X_train_counts.shape)
from sklearn.feature_extraction.text import TfidfTransformer
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
print(X_train_tfidf.shape)
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB().fit(X_train_tfidf, twenty_train.target)
from sklearn.pipeline import Pipeline
text_clf = Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer()), ('clf', MultinomialNB())])
text_clf = text_clf.fit(twenty_train.data, twenty_train.target)
# Performance of NB Classifier
import numpy as np
twenty_test = fetch_20newsgroups(subset='test', shuffle=True)
predicted = text_clf.predict(twenty_test.data)
accuracy = np.mean(predicted == twenty_test.target)
print("Predicted Accuracy = ", accuracy)
#To Calculate Accuracy,Precision,Recall
from sklearn import metrics
print("Accuracy= ", metrics.accuracy_score(twenty_test.target, predicted))
print("Precision=", metrics.precision_score(twenty_test.target, predicted, average=None))
print("Recall=", metrics.recall_score(twenty_test.target, predicted, average=None))
print(metrics.classification_report(twenty_test.target, predicted,
                                    target_names=twenty_test.target_names))
Experiment 12:
Implement PCA on any Image dataset for dimensionality reduction and classification of
images into different classes
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
import cv2
import matplotlib.image as mpimg
# OpenCV loads images in BGR order; convert to RGB for display
img = cv2.cvtColor(cv2.imread('[Link]'), cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()
print(img.shape)
#Splitting into channels (img is in RGB order after the conversion)
red, green, blue = cv2.split(img)
# Plotting the images
fig = plt.figure(figsize = (15, 7.2))
fig.add_subplot(131)
plt.title("Blue Channel")
plt.imshow(blue)
fig.add_subplot(132)
plt.title("Green Channel")
plt.imshow(green)
fig.add_subplot(133)
plt.title("Red Channel")
plt.imshow(red)
plt.show()
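The program above stops after splitting the image into channels. A minimal sketch of the dimensionality-reduction step follows, applying PCA to a single channel through NumPy's singular value decomposition rather than scikit-learn's PCA class. The function `pca_compress`, the component count and the synthetic 64x64 channel are assumptions made for illustration; a real channel such as `red` could be passed in the same way after conversion to float.

```python
import numpy as np

def pca_compress(channel, n_components):
    """Project the rows of a 2-D array onto its top principal components
    and reconstruct an approximation from that reduced representation."""
    mean = channel.mean(axis=0)
    centered = channel - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    reduced = centered @ components.T          # shape: (rows, n_components)
    restored = reduced @ components + mean     # back to the original shape
    return reduced, restored

# Hypothetical 64x64 single-channel "image": a smooth gradient plus noise.
rng = np.random.default_rng(0)
rows = np.linspace(0, 1, 64)[:, None]
cols = np.linspace(0, 1, 64)[None, :]
channel = 255 * rows * cols + rng.normal(0, 1, (64, 64))

reduced, restored = pca_compress(channel, n_components=5)
error = np.abs(channel - restored).mean()
print(reduced.shape, restored.shape, "mean abs error:", error)
```

Each row is stored with 5 numbers instead of 64, and the reduced rows can serve as input features for a classifier.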
Experiment 13:
Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs
# Locally Weighted Regression algorithm (lab)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# kernel smoothing function
def kernel(point, xmat, k):
    m,n = np.shape(xmat)
    weights = np.asmatrix(np.eye((m)))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp((diff * diff.T).item() / (-2.0 * k**2))
    return weights
# function to return local weight of each training example
def localWeight(point, xmat, ymat, k):
    wt = kernel(point, xmat, k)
    W = (xmat.T * (wt*xmat)).I * (xmat.T * wt * ymat.T)
    return W
# root function that drives the algorithm
def localWeightRegression(xmat, ymat, k):
    m,n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = (xmat[i] * localWeight(xmat[i], xmat, ymat, k)).item()
    return ypred
#import data
data = pd.read_csv('[Link]')
colA = np.array(data.total_bill)
colB = np.array(data.tip)
mcolA = np.asmatrix(colA)
mcolB = np.asmatrix(colB)
m = np.shape(mcolB)[1]
one = np.ones((1, m), dtype = int)
X = np.hstack((one.T, mcolA.T))
print(X.shape)
# predicting values using LWLR
ypred = localWeightRegression(X, mcolB, 0.8)
# plotting the predicted graph
xsort = X.copy()
xsort.sort(axis=0)
plt.scatter(colA, colB, color='red')
plt.plot(xsort[:, 1], ypred[X[:, 1].argsort(0)], color='green', linewidth=5)
plt.xlabel('Total Bill')
plt.ylabel('Tip')
plt.show()
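The same idea can be sketched with plain NumPy arrays and no CSV file. The version below fits a Gaussian-weighted least-squares line around each query point; the function `lwr_predict`, the bandwidth `tau` and the noiseless sine data are assumptions made for illustration.

```python
import numpy as np

def lwr_predict(x_query, X, y, tau):
    """Predict y at x_query with a Gaussian-weighted least-squares fit.

    X has shape (m, 2) with a leading column of ones; tau is the bandwidth.
    """
    q = np.array([1.0, x_query])
    w = np.exp(-((X[:, 1] - x_query) ** 2) / (2.0 * tau ** 2))
    XtW = X.T * w                      # same as X.T @ diag(w), without the m x m matrix
    theta = np.linalg.solve(XtW @ X, XtW @ y)
    return q @ theta

# Hypothetical noiseless data on a sine curve.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
X = np.column_stack([np.ones_like(x), x])

y_fit = np.array([lwr_predict(xi, X, y, tau=0.3) for xi in x])
max_error = np.max(np.abs(y_fit - y))
print("max abs error:", max_error)
```

A smaller tau follows the data more closely, while a larger tau smooths the curve towards a single straight line.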