Table of Contents
Program to implement simple linear regression
Program to Implement Multiple Linear Regression
Gradient Descent Algorithm Implementation
Backpropagation Algorithm Implementation
Decision Tree Implementation
Naive Bayes Algorithm Implementation
SVM Algorithm Implementation
KNN algorithm Implementation
K-Mean Cluster algorithm Implementation
Single Linkage Clustering Implementation
Program to implement simple linear regression
Theory:
Simple linear regression is a regression model that estimates the relationship between one
independent variable and one dependent variable using a straight line. Here, both
variables should be quantitative.
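For a single predictor, the best-fitting straight line can also be computed directly from the least-squares formulas. The sketch below is only an illustration: the helper name fit_line and the toy data are assumptions, not part of the original program.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least-squares fit of y ≈ slope * x + intercept (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = covariance(x, y) / variance(x)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # the fitted line always passes through the point of means
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Toy data lying exactly on y = 2x + 1 (illustrative values only)
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

On data that lies exactly on a line, the recovered slope and intercept should match that line; on noisy data they give the line that minimises the sum of squared vertical errors.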
Here is the Python implementation of simple linear regression:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Study hours (feature) and exam scores (target)
X = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.8, 6.0, 4.0, 4.9,
              6.5]).reshape(-1, 1)
y = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41, 44, 56])

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model and predict on the test set
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the predictions
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("--- Simple Linear Regression Results ---")
print(f"Coefficient (Slope): {model.coef_[0]:.2f}")
print(f"Intercept: {model.intercept_:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R-squared Score (R²): {r2:.2f}")

# Plot the data points and the fitted regression line
plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='blue', label='Actual Data Points')
plt.plot(X, model.predict(X), color='red', linewidth=2, label='Regression Line')
plt.title('Simple Linear Regression: Study Hours vs. Exam Score')
plt.xlabel('Study Hours (X)')
plt.ylabel('Exam Score (Y)')
plt.legend()
plt.grid(True)
plt.show()
Output:
--- Simple Linear Regression Results ---
Coefficient (Slope): 9.47
Intercept: 0.85
Mean Squared Error (MSE): 11.08
R-squared Score (R²): 0.97
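The two reported metrics follow standard definitions: MSE is the mean of the squared residuals, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below recomputes both by hand; the function names mse and r2 and the sample values are illustrative assumptions, not the report's data.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def r2(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Illustrative values only
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
mse_value = mse(y_true, y_pred)
r2_value = r2(y_true, y_pred)
print(mse_value, r2_value)
```

An R² close to 1 means the model explains most of the variation in the target, while MSE is expressed in squared units of the target.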
Program to Implement Multiple Linear Regression
Theory:
Multiple linear regression is a statistical technique that uses two or more
independent variables to predict the outcome of a dependent variable. It lets
analysts measure how much of the variation in the outcome the model explains and the
relative contribution of each independent variable to that explained variance.
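With several predictors, the same least-squares idea can be sketched with a design matrix that has a leading column of ones for the intercept. The helper name fit_multiple and the toy data below are assumptions made for illustration; np.linalg.lstsq is used here as one possible solver, not necessarily what scikit-learn does internally.

```python
import numpy as np

def fit_multiple(X, y):
    """Least-squares fit of y ≈ intercept + X @ coefs (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first fitted parameter is the intercept
    X_design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return beta[0], beta[1:]

# Toy data generated from y = 1 + 2*x1 + 3*x2 (illustrative values only)
X_toy = np.array([[1, 0], [0, 1], [1, 1], [2, 1], [1, 3]])
y_toy = 1 + 2 * X_toy[:, 0] + 3 * X_toy[:, 1]
intercept, coefs = fit_multiple(X_toy, y_toy)
print(intercept, coefs)
```

Each fitted coefficient is the expected change in the target for a one-unit change in that feature with the other features held fixed.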
Source code
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the California housing data as pandas objects
housing = fetch_california_housing(as_frame=True)
X = housing.data
y = housing.target

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit the model and predict on the test set
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the predictions
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("--- Multiple Linear Regression Results (California Housing) ---")
print(f"Intercept: {model.intercept_:.4f}")
print("\nCoefficients:")
for feature, coef in zip(X.columns, model.coef_):
    print(f" {feature}: {coef:.4f}")
print(f"\nMean Squared Error (MSE): {mse:.4f}")
print(f"R-squared Score (R²): {r2:.4f}")

# --- Figure for Q2: Predicted vs. Actual Plot ---
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, color='k', alpha=0.6, label='Predicted Values')
max_val = max(y_test.max(), y_pred.max())
min_val = min(y_test.min(), y_pred.min())
# Diagonal reference line where predicted equals actual
plt.plot([min_val, max_val], [min_val, max_val], color='k', linestyle='--', linewidth=2,
         label='Ideal Fit')
plt.title('Predicted vs. Actual Housing Prices (Multiple Linear Regression)')
plt.xlabel('Actual Price ($100,000s)')
plt.ylabel('Predicted Price ($100,000s)')
plt.legend()
plt.grid(True)
plt.show()
Output:
--- Multiple Linear Regression Results (California Housing) ---
Intercept: -37.0562
Coefficients:
MedInc: 0.4458
HouseAge: 0.0097
AveRooms: -0.1221
AveBedrms: 0.7786
Population: -0.0000
AveOccup: -0.0034
Latitude: -0.4185
Longitude: -0.4337
Mean Squared Error (MSE): 0.5306
R-squared Score (R²): 0.5958
Gradient Descent Algorithm Implementation
Theory:
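Gradient descent (GD) is an iterative first-order optimisation algorithm used to find a
local minimum of a differentiable function by repeatedly stepping in the direction of the
negative gradient (stepping along the positive gradient instead, known as gradient ascent,
finds a local maximum). It is commonly used in machine learning (ML) and deep learning (DL)
to minimise a cost/loss function.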
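Before applying the method to regression, the update rule can be seen on a one-variable function. The sketch below minimises f(x) = (x - 3)^2 with a fixed step size; the function name minimise, the starting point, and the learning rate are illustrative assumptions.

```python
def minimise(grad, start, learning_rate=0.1, steps=100):
    """Plain gradient descent: repeatedly step against the gradient (hypothetical helper)."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3) and its minimum at x = 3
x_min = minimise(lambda x: 2 * (x - 3), start=0.0)
print(x_min)
```

With this step size the distance to the minimum shrinks by a constant factor on every iteration; a step size that is too large would make the iterates overshoot and diverge instead.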
Source code
import numpy as np
import matplotlib.pyplot as plt

def mean_squared_error(y_true, y_predicted):
    # Average squared difference between targets and predictions
    cost = np.sum((y_true - y_predicted)**2) / len(y_true)
    return cost

def gradient_descent(x, y, iterations=1000, learning_rate=0.0001,
                     stopping_threshold=1e-6):
    current_weight = 0.1
    current_bias = 0.01
    n = float(len(x))
    costs = []
    weights = []
    previous_cost = None
    for i in range(iterations):
        y_predicted = (current_weight * x) + current_bias
        current_cost = mean_squared_error(y, y_predicted)
        # Stop early once the cost changes by no more than the threshold
        if previous_cost is not None and abs(previous_cost - current_cost) <= stopping_threshold:
            break
        previous_cost = current_cost
        costs.append(current_cost)
        weights.append(current_weight)
        # Partial derivatives of the MSE with respect to the weight and the bias
        weight_derivative = -(2/n) * sum(x * (y - y_predicted))
        bias_derivative = -(2/n) * sum(y - y_predicted)
        # Step against the gradient
        current_weight = current_weight - (learning_rate * weight_derivative)
        current_bias = current_bias - (learning_rate * bias_derivative)
        print(f"Iteration {i+1}: Cost {current_cost}, Weight {current_weight}, Bias {current_bias}")
    # Plot how the cost changes as the weight is updated
    plt.figure(figsize=(8,6))
    plt.plot(weights, costs)
    plt.scatter(weights, costs, marker='o')
    plt.title("Cost vs Weights")
    plt.ylabel("Cost")
    plt.xlabel("Weight")
    plt.show()
    return current_weight, current_bias

def main():
    X = np.array([12.5, 15.2, 18.7, 22.1, 25.6,
                  28.4, 30.2, 33.1, 35.9, 38.4,
                  40.8, 43.2, 45.5, 48.3, 50.9,
                  53.4, 55.7, 58.2, 60.9, 63.5])
    Y = np.array([25.1, 30.4, 36.8, 41.2, 48.5,
                  52.3, 55.6, 60.1, 63.8, 67.2,
                  70.9, 74.5, 77.1, 81.6, 85.3,
                  88.2, 91.5, 95.8, 99.4, 103.2])
    estimated_weight, estimated_bias = gradient_descent(X, Y, iterations=2000)
    print(f"Estimated Weight: {estimated_weight}\nEstimated Bias: {estimated_bias}")
    Y_pred = estimated_weight * X + estimated_bias
    # Plot the data and the fitted line
    plt.figure(figsize=(8,6))
    plt.scatter(X, Y, marker='o')
    plt.plot([min(X), max(X)], [min(Y_pred), max(Y_pred)], linestyle='dashed')
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.show()

if __name__ == "__main__":
    main()
Output:
Iteration 1: Cost 4485.8906050000005, Weight 0.66017274, Bias 0.0227025
Iteration 2: Cost 1897.0640702528005, Weight 1.023818908825526, Bias 0.031030311264300003
Iteration 3: Cost 806.0722044868571, Weight 1.2598856953900024, Bias 0.036518198618663913
Iteration 4: Cost 346.30261175807516, Weight 1.4131316263442328, Bias 0.04016248712642121
Iteration 5: Cost 152.5447266254931, Weight 1.5126126463833116, Bias 0.04260996228537919
Iteration 6: Cost 70.89034545165576, Weight 1.5771910368488695, Bias 0.04428049858790036
Iteration 7: Cost 36.478981195867775, Weight 1.619111646812047, Bias 0.045446666445577356
Iteration 8: Cost 21.976920126624293, Weight 1.6463234791563963, Bias 0.04628541070892021
Iteration 9: Cost 15.865101249407957, Weight 1.6639867973098792, Bias 0.04691159887196276
Iteration 10: Cost 13.289125066996766, Weight 1.6754514954713082, Bias 0.04739979959918476
Iteration 11: Cost 12.203235029455731, Weight 1.6828922283457954, Bias 0.04779842071711136
Iteration 12: Cost 11.745301873628902, Weight 1.6877207190522807, Bias 0.048138887190729
Estimated Weight: 1.686624305312527
Estimated Bias: 0.49939851519938316
Backpropagation Algorithm Implementation
Theory:
Backpropagation is the essence of neural network training. It is the method of fine-tuning
the weights of a neural network based on the error rate obtained in the previous epoch
(i.e., iteration). Proper tuning of the weights allows you to reduce error rates and make the
model reliable by increasing its generalization.
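The core of backpropagation is the chain rule: the gradient of the loss with respect to a weight is the product of the local derivatives along the path from that weight to the loss. The sketch below works this out for a single sigmoid neuron with a squared-error loss and compares the result with a finite-difference estimate. All names and numeric values in it are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    """Squared-error loss of one sigmoid neuron on one sample."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2

def backprop_grads(w, b, x, y):
    """Gradients of the loss via the chain rule (hypothetical helper)."""
    a = sigmoid(w * x + b)       # forward pass: activation
    d_loss_d_a = a - y           # derivative of 0.5 * (a - y)^2 w.r.t. a
    d_a_d_z = a * (1 - a)        # sigmoid derivative written via its output
    delta = d_loss_d_a * d_a_d_z # error signal at the neuron's input
    return delta * x, delta      # dL/dw, dL/db

# Illustrative values only
w, b, x, y = 0.5, -0.2, 1.5, 1.0
grad_w, grad_b = backprop_grads(w, b, x, y)

# Central finite differences as an independent check
eps = 1e-6
num_grad_w = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
num_grad_b = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
print(grad_w, num_grad_w)
print(grad_b, num_grad_b)
```

The analytic and numerical gradients should agree closely; the same comparison is a common way to check a hand-written backward pass in a larger network.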
Source code
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is expected to be a sigmoid output, so the derivative is x * (1 - x)
    return x * (1 - x)

def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred)**2)

# Truth-table inputs and targets for the OR and AND gates
X = np.array([[0,0],[0,1],[1,0],[1,1]])
y_or = np.array([[0],[1],[1],[1]])
y_and = np.array([[0],[0],[0],[1]])

input_neurons = 2
hidden_neurons = 2
output_neurons = 1
learning_rate = 0.5
epochs = 10000

# Random initial weights and biases (np.random.rand is assumed here)
np.random.seed(42)
weights_input_hidden = np.random.rand(input_neurons, hidden_neurons)
weights_hidden_output = np.random.rand(hidden_neurons, output_neurons)
bias_hidden = np.random.rand(1, hidden_neurons)
bias_output = np.random.rand(1, output_neurons)

def train(X, y):
    global weights_input_hidden, weights_hidden_output, bias_hidden, bias_output
    losses = []
    for epoch in range(epochs):
        # Forward pass
        hidden_input = np.dot(X, weights_input_hidden) + bias_hidden
        hidden_output = sigmoid(hidden_input)
        final_input = np.dot(hidden_output, weights_hidden_output) + bias_output
        final_output = sigmoid(final_input)
        loss = mean_squared_error(y, final_output)
        losses.append(loss)
        # Backward pass: propagate the output error to the hidden layer
        error = y - final_output
        d_output = error * sigmoid_derivative(final_output)
        error_hidden = d_output.dot(weights_hidden_output.T)
        d_hidden = error_hidden * sigmoid_derivative(hidden_output)
        # Update weights and biases
        weights_hidden_output += hidden_output.T.dot(d_output) * learning_rate
        bias_output += np.sum(d_output, axis=0, keepdims=True) * learning_rate
        weights_input_hidden += X.T.dot(d_hidden) * learning_rate
        bias_hidden += np.sum(d_hidden, axis=0, keepdims=True) * learning_rate
    return losses, final_output

# Train on the OR gate
loss_or, output_or = train(X, y_or)
print("OR Gate Output:\n", np.round(output_or,2))
print("Weights Input-Hidden:\n", weights_input_hidden)
print("Weights Hidden-Output:\n", weights_hidden_output)
print("Bias Hidden:\n", bias_hidden)
print("Bias Output:\n", bias_output)
plt.plot(loss_or)
plt.title("Loss Curve OR Gate")
plt.xlabel("Epochs")
plt.ylabel("MSE")
plt.show()

# Re-initialise the parameters, then train on the AND gate
np.random.seed(42)
weights_input_hidden = np.random.rand(input_neurons, hidden_neurons)
weights_hidden_output = np.random.rand(hidden_neurons, output_neurons)
bias_hidden = np.random.rand(1, hidden_neurons)
bias_output = np.random.rand(1, output_neurons)
loss_and, output_and = train(X, y_and)
print("\nAND Gate Output:\n", np.round(output_and,2))
print("Weights Input-Hidden:\n", weights_input_hidden)
print("Weights Hidden-Output:\n", weights_hidden_output)
print("Bias Hidden:\n", bias_hidden)
print("Bias Output:\n", bias_output)
plt.plot(loss_and)
plt.title("Loss Curve AND Gate")
plt.xlabel("Epochs")
plt.ylabel("MSE")
plt.show()
Output
OR Gate Output:
[[0.02]
 [0.99]
 [0.99]
 [1.  ]]
Weights Input-Hidden:
[[4.41966416 3.44081689]
 [4.48823328 3.36048634]]
Weights Hidden-Output:
[[7.01617582]
 [4.64249603]]
Bias Hidden:
[[-2.40845672 -1.91084141]]
Bias Output:
[[-5.27789397]]
AND Gate Output:
[[0.  ]
 [0.01]
 [0.01]
 [0.98]]
Weights Input-Hidden:
[[4.43461661 0.76298874]
 [4.42058908 0.55635497]]
Weights Hidden-Output:
[[10.61089559]
 [-1.33167572]]
Bias Hidden:
[[-6.50872719 1.7636314 ]]
Bias Output:
[[-4.32266379]]
Decision Tree Implementation
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

# Load the Iris data set
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the tree and evaluate it on the test set
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

# Draw the fitted tree
plt.figure(figsize=(12,8))
plot_tree(clf, feature_names=iris.feature_names, class_names=iris.target_names,
          filled=True)
plt.show()
Output
Naive Bayes Algorithm Implementation
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Iris data set
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit Gaussian Naive Bayes and evaluate it on the test set
nb = GaussianNB()
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

# Plot the confusion matrix as a heatmap
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6,4))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=iris.target_names,
            yticklabels=iris.target_names)
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
Output
Accuracy: 1.0
Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
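The figures in the report can be recovered from the confusion matrix alone. The sketch below recomputes accuracy and the per-class precision, recall, and F1 from the matrix printed above, assuming rows are actual classes and columns are predicted classes (the scikit-learn convention). The variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: actual, columns: predicted)
cm = np.array([[10, 0, 0],
               [0, 9, 0],
               [0, 0, 11]])

correct = np.diag(cm)                    # correctly classified samples per class
accuracy = np.trace(cm) / cm.sum()       # all correct / all samples
precision = correct / cm.sum(axis=0)     # correct / predicted as that class
recall = correct / cm.sum(axis=1)        # correct / actually that class
f1 = 2 * precision * recall / (precision + recall)

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1:", f1)
```

Because every off-diagonal entry is zero, all four metrics come out as 1.0 for each class, which matches the report.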