AISC Expt 5

The document outlines an experiment to implement a Supervised Learning Algorithm using a Backpropagation Network (BPN) in a computer engineering course. It explains the theory behind supervised learning, the backpropagation process, and the use of the bipolar sigmoid activation function. Additionally, it provides a detailed algorithm and sample code for training a neural network with verbose output showing intermediate values.

BHARATIYA VIDYA BHAVAN’S

SARDAR PATEL INSTITUTE OF TECHNOLOGY


Bhavan’s Campus, Munshi Nagar, Andheri (West), Mumbai – 400058-India
Department of Computer Engineering

Name: Adwait Patkhedkar
UID no.: 2023300174
Division: C
Batch: B
Course Code: CS303

Experiment 5: To implement Supervised Learning Algorithm (BPN)

Program

AIM: To implement a Supervised Learning Algorithm (Backpropagation
Network)
THEORY:
Supervised learning is a type of machine learning where a model is
trained using labeled data. The training dataset consists of
input–output pairs, where the input features are mapped to a known
target or label. The goal of supervised learning is to learn a function
that can predict the correct output for new, unseen inputs.

It is mainly categorized into:

● Classification: predicting discrete labels (e.g., spam or not spam).

● Regression: predicting continuous values (e.g., house prices).

The performance of a supervised model is evaluated using metrics
such as accuracy, precision, recall, or mean squared error, depending
on the problem type.
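
As a rough illustration of two of the metrics named above, the sketch below computes accuracy for a classification task and mean squared error for a regression task. The function names `accuracy` and `mean_squared_error` and all sample values are made up for this example and are not part of the experiment's code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only: bipolar class labels, then continuous targets.
labels = [1, -1, 1, -1]
predicted_labels = [1, -1, -1, -1]
print("accuracy:", accuracy(labels, predicted_labels))

targets = [2.0, 4.0]
predictions = [1.0, 6.0]
print("mse:", mean_squared_error(targets, predictions))
```

Accuracy suits discrete labels, while mean squared error suits continuous outputs such as the raw activation of a network before thresholding.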

Backpropagation, short for "backward propagation of errors", is a
method used to train neural networks. Its goal is to reduce the
difference between the model’s predicted output and the actual output
by adjusting the weights and biases in the network.

It works iteratively, adjusting the weights and biases to minimize the
cost function. In each epoch the model updates these parameters by
moving them against the error gradient, which lowers the loss. The
update is usually carried out by an optimization algorithm such as
gradient descent or stochastic gradient descent. The gradient itself
is computed with the chain rule from calculus, which lets the error at
the output be attributed, layer by layer, to every weight in the
network.
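
To make the chain rule concrete, here is a minimal sketch for a single neuron with one weight and no bias. It assumes a tanh activation and the squared error E = 1/2 * (t - y)^2; the names `loss` and `grad_w` and the starting values are chosen only for this illustration.

```python
import math


def loss(w, x, t):
    """Squared error of a single tanh neuron with weight w and no bias."""
    y = math.tanh(w * x)
    return 0.5 * (t - y) ** 2


def grad_w(w, x, t):
    """dE/dw via the chain rule: dE/dy * dy/da * da/dw, with a = w * x."""
    a = w * x
    y = math.tanh(a)
    dE_dy = -(t - y)       # derivative of 1/2 * (t - y)^2 with respect to y
    dy_da = 1.0 - y ** 2   # derivative of tanh(a) with respect to a
    da_dw = x              # derivative of w * x with respect to w
    return dE_dy * dy_da * da_dw


# Plain gradient descent on the single weight (illustrative values).
w, x, t, eta = 0.1, 1.0, 1.0, 0.5
losses = []
for step in range(20):
    losses.append(loss(w, x, t))
    w -= eta * grad_w(w, x, t)  # move against the gradient

print(f"first loss: {losses[0]:.6f}, last loss: {losses[-1]:.6f}")
```

Each factor in `grad_w` corresponds to one link of the chain from the weight to the error; a multi-layer network multiplies more such factors together.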

In BPN, the process works in two main phases:

1. Forward Pass: Inputs are multiplied by weights, summed with the
biases, and passed through activation functions across layers to
generate the final output.

2. Backward Pass: The error between predicted and actual output
is computed and propagated backward through the network. Using
the chain rule, gradients are calculated to update weights and
biases.

By repeating these steps over many iterations, the network gradually
minimizes error and improves prediction accuracy.

SIGMOID ACTIVATION FUNCTION

The sigmoid function is a commonly used activation function in neural
networks. It has an S-shaped curve and is defined as

f(x) = 1 / (1 + exp(-x))

It maps any real-valued input into the range (0, 1), making it useful
for problems where outputs need to be interpreted as probabilities.
The function is smooth and differentiable, which allows
gradient-based optimization methods like backpropagation to work
effectively.
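
The short sketch below evaluates the sigmoid and its slope at a few points. The helper names `sigmoid` and `sigmoid_derivative` are illustrative, and the code is not hardened against overflow for inputs of very large magnitude.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x)), with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Slope of the sigmoid, written through its output s: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-5.0, 0.0, 5.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  slope={sigmoid_derivative(x):.4f}")
```

The slope is largest at x = 0 and shrinks toward zero at both tails, which is why saturated units learn slowly under gradient-based training.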

Bipolar Sigmoid activation function

The bipolar sigmoid activation function is a smooth, S-shaped
nonlinear function that maps inputs to the range (-1, +1). It is
defined as

f(x) = (1 - exp(-x)) / (1 + exp(-x))

It is useful for neural networks because it introduces nonlinearity,
is continuously differentiable as backpropagation requires, and
produces zero-centered outputs that represent both positive and
negative activations, which often improves convergence compared to
the standard sigmoid.

It is the hyperbolic tangent with its input halved: f(x) = tanh(x/2).


The graph of bipolar sigmoid over real domain
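
The relationship between the bipolar sigmoid and tanh can be checked numerically. The sketch below compares (1 - exp(-x)) / (1 + exp(-x)) with tanh(x/2) and compares the derivative form 0.5 * (1 - f^2) with a finite-difference estimate. All function names are illustrative.

```python
import math


def bipolar_sigmoid(x):
    """Bipolar sigmoid: (1 - exp(-x)) / (1 + exp(-x)), with range (-1, +1)."""
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))


def bipolar_sigmoid_slope(f):
    """Derivative expressed through the output f = bipolar_sigmoid(x)."""
    return 0.5 * (1.0 - f * f)


def numerical_slope(x, h=1e-6):
    """Central-difference estimate of the derivative, used only as a check."""
    return (bipolar_sigmoid(x + h) - bipolar_sigmoid(x - h)) / (2.0 * h)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    f = bipolar_sigmoid(x)
    print(f"x={x:+.1f}  f={f:+.6f}  tanh(x/2)={math.tanh(x / 2):+.6f}  "
          f"slope={bipolar_sigmoid_slope(f):.6f}  numeric={numerical_slope(x):.6f}")
```

Because the derivative depends only on the output f, a network can reuse the activations stored during the forward pass when it computes its error terms.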

ALGORITHM
Initialize weights and biases

Forward Pass
For each layer, compute net input:
a_j = sum(w_ij * x_i) + b_j

Apply bipolar sigmoid activation function:
o_j = (1 - exp(-a_j)) / (1 + exp(-a_j))

Pass output to the next layer until final output is obtained.

Error Calculation
Compute error at the output layer:
E = 1/2 * sum((y - y_hat)^2)
where y = actual output, y_hat = predicted output (use -1 and +1 for
bipolar targets)

Backward Pass
Error term (delta) at output layer:
delta_j = (y_j - y_hat_j) * 0.5 * (1 - y_hat_j^2)

Error term for hidden layers:
delta_h = (sum(delta_j * w_hj)) * 0.5 * (1 - o_h^2)

Weight Update
Update each weight using learning rate eta:
w_ij(new) = w_ij(old) + eta * delta_j * o_i

Update bias:
b_j(new) = b_j(old) + eta * delta_j

Repeat steps Forward Pass → Error Calculation → Backward Pass →
Weight Update for all training samples until error converges or
maximum epochs are reached.
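
The steps above can be sketched for one training sample on a 2-2-1 network. This is an illustration under stated assumptions rather than the experiment's implementation: it uses plain Python lists, updates after a single sample, and applies the bipolar sigmoid exactly as written in the algorithm. The names `forward` and `train_step` are hypothetical; the starting weights and the learning rate are the ones that appear in the CODE section.

```python
import math


def bipolar_sigmoid(a):
    """o = (1 - exp(-a)) / (1 + exp(-a))"""
    return (1.0 - math.exp(-a)) / (1.0 + math.exp(-a))


def forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass: return the hidden outputs and the network output."""
    hidden = [
        bipolar_sigmoid(sum(w_ih[i][h] * x[i] for i in range(len(x))) + b_h[h])
        for h in range(len(b_h))
    ]
    y_hat = bipolar_sigmoid(
        sum(w_ho[h] * hidden[h] for h in range(len(hidden))) + b_o
    )
    return hidden, y_hat


def train_step(x, y, w_ih, b_h, w_ho, b_o, eta):
    """One forward/backward/update cycle; returns the updated parameters."""
    hidden, y_hat = forward(x, w_ih, b_h, w_ho, b_o)
    n_in, n_hid = len(x), len(hidden)
    # Backward pass: deltas use the weights from BEFORE the update.
    delta_o = (y - y_hat) * 0.5 * (1.0 - y_hat ** 2)
    delta_h = [delta_o * w_ho[h] * 0.5 * (1.0 - hidden[h] ** 2)
               for h in range(n_hid)]
    # Weight update: w(new) = w(old) + eta * delta * (input to that weight).
    new_w_ho = [w_ho[h] + eta * delta_o * hidden[h] for h in range(n_hid)]
    new_b_o = b_o + eta * delta_o
    new_w_ih = [[w_ih[i][h] + eta * delta_h[h] * x[i] for h in range(n_hid)]
                for i in range(n_in)]
    new_b_h = [b_h[h] + eta * delta_h[h] for h in range(n_hid)]
    return new_w_ih, new_b_h, new_w_ho, new_b_o


# w_ih[i][h] is the weight from input i to hidden unit h.
w_ih = [[0.4, 0.1], [0.6, 0.4]]
b_h = [0.3, -0.1]
w_ho = [-0.1, 0.5]
b_o = -0.3
eta = 0.5

x, y = [-1.0, 1.0], 1.0  # one bipolar XOR sample and its target
_, before = forward(x, w_ih, b_h, w_ho, b_o)
params = train_step(x, y, w_ih, b_h, w_ho, b_o, eta)
_, after = forward(x, *params)
print(f"output before update: {before:+.4f}, after update: {after:+.4f}")
```

Computing every delta before touching any weight matters: the hidden-layer error term must be propagated through the old hidden-to-output weights, not the freshly updated ones.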

CODE:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


class VerboseNeuralNetwork:
    def __init__(self):
        # Initialize weights based on the network diagram
        # Input to hidden layer weights. forward() computes X @ W1, so
        # row i holds the weights leaving input x(i+1) and column j the
        # weights entering hidden unit z(j+1).
        self.W1 = np.array([[0.4, 0.1],   # weights from x1 to z1, z2
                            [0.6, 0.4]])  # weights from x2 to z1, z2
        # Hidden to output layer weights
        self.W2 = np.array([[-0.1, 0.5]])  # weights from z1, z2 to Y
        # Bias weights
        self.b1 = np.array([0.3, -0.1])  # bias to z1, z2
        self.b2 = np.array([-0.3])  # bias to Y
        # Learning rate
        self.learning_rate = 0.5
        # Store training history
        self.training_history = []
        self.epoch_losses = []  # Store loss for each epoch for plotting

    def bipolar_sigmoid(self, x):
        """Activation used here: tanh(x) = (e^x - e^-x)/(e^x + e^-x).

        Note: the bipolar sigmoid (1 - e^-x)/(1 + e^-x) equals tanh(x/2);
        this implementation applies tanh(x) directly.
        """
        # Clip x to prevent overflow
        x = np.clip(x, -500, 500)
        return np.tanh(x)

    def bipolar_sigmoid_derivative(self, x):
        """Derivative of tanh, 1 - tanh^2, where x is the activation OUTPUT."""
        return 1 - x**2

    def forward(self, X):
        """Forward propagation with detailed intermediate values"""
        # Input to hidden layer
        self.z1_input = np.dot(X, self.W1) + self.b1
        self.z1_output = self.bipolar_sigmoid(self.z1_input)
        # Hidden to output layer
        self.y_input = np.dot(self.z1_output, self.W2.T) + self.b2
        self.y_output = self.bipolar_sigmoid(self.y_input)
        return self.y_output

    def backward(self, X, y_true, y_pred):
        """Backward propagation with weight changes tracking"""
        m = X.shape[0]  # number of samples
        # Store old weights for change calculation
        old_W1 = self.W1.copy()
        old_W2 = self.W2.copy()
        old_b1 = self.b1.copy()
        old_b2 = self.b2.copy()
        # Calculate output layer error (prediction minus target, so the
        # updates below subtract the gradient)
        output_error = y_pred - y_true
        output_delta = output_error * self.bipolar_sigmoid_derivative(y_pred)
        # Calculate hidden layer error
        hidden_error = output_delta.dot(self.W2)
        hidden_delta = hidden_error * self.bipolar_sigmoid_derivative(self.z1_output)
        # Update weights and biases (gradients averaged over the batch)
        # Output layer updates
        self.W2 -= self.learning_rate * (self.z1_output.T.dot(output_delta)).T / m
        self.b2 -= self.learning_rate * np.sum(output_delta, axis=0) / m
        # Hidden layer updates
        self.W1 -= self.learning_rate * X.T.dot(hidden_delta) / m
        self.b1 -= self.learning_rate * np.sum(hidden_delta, axis=0) / m
        # Calculate weight changes
        delta_W1 = self.W1 - old_W1
        delta_W2 = self.W2 - old_W2
        delta_b1 = self.b1 - old_b1
        delta_b2 = self.b2 - old_b2
        return delta_W1, delta_W2, delta_b1, delta_b2

    def train_verbose(self, X, y, epochs, show_every=1):
        """Train with verbose output showing all intermediate values"""
        print("Training Neural Network with Bipolar Sigmoid Activation\n")
        print("=" * 150)
        for epoch in range(epochs):
            epoch_data = []
            # Store weights before training for this epoch
            epoch_W1 = self.W1.copy()
            epoch_W2 = self.W2.copy()
            epoch_b1 = self.b1.copy()
            epoch_b2 = self.b2.copy()
            # Forward propagation for all samples
            y_pred = self.forward(X)
            # Calculate loss (mean squared error over the batch)
            loss = np.mean((y_pred - y) ** 2)
            total_loss = loss
            # Backward propagation
            delta_W1, delta_W2, delta_b1, delta_b2 = self.backward(X, y, y_pred)
            # Store detailed information for each sample
            for i in range(len(X)):
                # Calculate intermediate values for this sample using the
                # weights from before this epoch's update
                zin1 = X[i][0] * epoch_W1[0][0] + X[i][1] * epoch_W1[1][0] + epoch_b1[0]
                zin2 = X[i][0] * epoch_W1[0][1] + X[i][1] * epoch_W1[1][1] + epoch_b1[1]
                z1 = self.bipolar_sigmoid(zin1)
                z2 = self.bipolar_sigmoid(zin2)
                yin = z1 * epoch_W2[0][0] + z2 * epoch_W2[0][1] + epoch_b2[0]
                y_out = self.bipolar_sigmoid(yin)
                # Predicted output (thresholded at zero)
                y_predicted = 1 if y_out > 0 else -1
                row_data = {
                    'Epoch': epoch + 1,
                    'Sample': i + 1,
                    'x1': X[i][0],
                    'x2': X[i][1],
                    't': y[i][0],
                    'zin1': f"{zin1:.4f}",
                    'zin2': f"{zin2:.4f}",
                    'z1': f"{z1:.4f}",
                    'z2': f"{z2:.4f}",
                    'yin': f"{yin:.4f}",
                    'Y': f"{y_out:.4f}",
                    'Y_pred': y_predicted,
                    'bias_z1': f"{epoch_b1[0]:.4f}",
                    'bias_z2': f"{epoch_b1[1]:.4f}",
                    'bias_Y': f"{epoch_b2[0]:.4f}",
                    'Loss': f"{loss:.6f}"
                }
                epoch_data.append(row_data)
            # Show results for this epoch
            if epoch % show_every == 0:
                print(f"\nEPOCH {epoch + 1}")
                print("-" * 150)
                # Create DataFrame for better formatting
                df = pd.DataFrame(epoch_data)
                print(df.to_string(index=False))
                print("\nWeight Changes:")
                print("ΔW1 (input to hidden):")
                print(f" [{delta_W1[0][0]:+.6f}, {delta_W1[0][1]:+.6f}]")
                print(f" [{delta_W1[1][0]:+.6f}, {delta_W1[1][1]:+.6f}]")
                print(f"ΔW2 (hidden to output): [{delta_W2[0][0]:+.6f}, {delta_W2[0][1]:+.6f}]")
                print(f"Δb1 (hidden biases): [{delta_b1[0]:+.6f}, {delta_b1[1]:+.6f}]")
                print(f"Δb2 (output bias): [{delta_b2[0]:+.6f}]")
                print("\nCurrent Weights:")
                print(f"W1: {self.W1}")
                print(f"W2: {self.W2}")
                print(f"b1: {self.b1}")
                print(f"b2: {self.b2}")
                print(f"Total Loss: {total_loss:.6f}")
                print("=" * 150)
            # Store history
            self.training_history.extend(epoch_data)
            self.epoch_losses.append(total_loss)  # Store loss for plotting
            # Check for convergence
            if total_loss < 1e-6:
                print(f"\nConverged at epoch {epoch + 1} with loss {total_loss:.8f}")
                break

    def predict(self, X):
        """Make predictions (bipolar labels, thresholded at zero)"""
        output = self.forward(X)
        return np.where(output > 0, 1, -1)

    def test_network(self):
        """Test the final trained network"""
        print("\n" + "="*50)
        print("FINAL NETWORK TESTING")
        print("="*50)
        X_test = np.array([[-1, -1], [1, 1], [-1, 1], [1, -1]])
        y_test = np.array([[-1], [-1], [1], [1]])
        print("\nFinal Test Results:")
        print("-" * 80)
        print("x1 x2 Target Y_output Y_predicted Correct")
        print("-" * 80)
        correct = 0
        for i in range(len(X_test)):
            y_out = self.forward(X_test[i:i+1])[0][0]
            y_pred = 1 if y_out > 0 else -1
            is_correct = "✓" if y_pred == y_test[i][0] else "✗"
            if y_pred == y_test[i][0]:
                correct += 1
            print(f"{X_test[i][0]:2} {X_test[i][1]:2} {y_test[i][0]:2} {y_out:+.4f} {y_pred:2} {is_correct}")
        print("-" * 80)
        print(f"Accuracy: {correct}/{len(X_test)} = {correct/len(X_test)*100:.1f}%")

    def plot_training_progress(self):
        """Plot training loss over epochs"""
        if not self.epoch_losses:
            print("No training data available for plotting.")
            return
        # Create subplots (only ax1 is used in this listing; the other
        # three axes stay empty)
        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
        fig.suptitle('Neural Network Training Progress', fontsize=16)
        epochs = range(1, len(self.epoch_losses) + 1)
        # 1. Loss curve
        ax1.plot(epochs, self.epoch_losses, 'b-', linewidth=2, marker='o', markersize=3)
        ax1.set_title('Training Loss Over Time')
        ax1.set_xlabel('Epoch')
        ax1.set_ylabel('Mean Squared Error')
        ax1.grid(True, alpha=0.3)
        ax1.set_yscale('linear')  # Linear (actual) scale rather than log
        plt.tight_layout()
        plt.show()
        # Additional plot: Decision boundary visualization



# Training data (XOR problem, bipolar inputs and targets)
X = np.array([[-1, -1],
              [1, 1],
              [-1, 1],
              [1, -1]])
y = np.array([[-1],
              [-1],
              [1],
              [1]])

# Create and train the neural network
nn = VerboseNeuralNetwork()
print("Initial Network Configuration:")
print(f"W1 (input to hidden):\n{nn.W1}")
print(f"W2 (hidden to output):\n{nn.W2}")
print(f"b1 (hidden bias): {nn.b1}")
print(f"b2 (output bias): {nn.b2}")
print(f"Learning rate: {nn.learning_rate}")

# Train with verbose output (show_every=1 prints every epoch)
print("\nStarting training...")
nn.train_verbose(X, y, epochs=100, show_every=1)

# Test the final network
nn.test_network()

# Plot training progress and results
print("\nGenerating training progress plots...")
nn.plot_training_progress()

# Create summary statistics
print("\n" + "="*50)
print("TRAINING SUMMARY")
print("="*50)
if nn.training_history:
    df_history = pd.DataFrame(nn.training_history)
    final_epoch = df_history['Epoch'].max()
    # 'Loss' was stored as a formatted string, so convert it back
    final_loss = float(df_history[df_history['Epoch'] == final_epoch]['Loss'].iloc[0])
    print(f"Total epochs trained: {final_epoch}")
    print(f"Final loss: {final_loss:.8f}")
    print("Network successfully learned the XOR-like function!")

RESULT:

CONCLUSION:
In my implementation, I successfully built and trained a neural network using
the backpropagation algorithm. I observed that through a repeating cycle of a
forward pass to make a prediction and a backward pass to correct for the
error, my model systematically learned from the dataset. By plotting the
average error after each epoch, I confirmed a consistent and steady decrease
over time, which showed the learning process in action. After running the
training for a large number of epochs, I watched this error rate bottom out and
stabilize, which indicated to me that the model had successfully converged,
finally mastering the non-linear pattern required to solve the problem and
validating the effectiveness of the algorithm.