8 Python Programs for Data Classification

The document contains a series of programming exercises that demonstrate various machine learning techniques using Python. Each program utilizes different datasets and algorithms, including binary classification with make_circles and make_moons, clustering with make_blobs, decision trees with ID3, Q-learning, neural networks, and locally weighted regression. The examples include code snippets and visualizations to illustrate the results of each implementation.

Uploaded by

ririn79475


Program – 1

▪ Write a program that generates 2D binary classification data with make_circles(), which produces two classes with a spherical decision boundary.

# Import necessary libraries
from sklearn.datasets import make_circles
import matplotlib.pyplot as plt

# Generate 2D binary classification dataset
X, y = make_circles(n_samples=200, shuffle=True,
                    noise=0.1, random_state=42)

# Plot the generated dataset, colored by class label
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()

Output: [figure: scatter plot of the generated points, colored by class label]

Program – 2
▪ Write a program that generates 2D binary classification data with make_moons(), which produces two interlocking half circles.

# Import the necessary libraries
from sklearn.datasets import make_moons
import matplotlib.pyplot as plt

# Generate 2D binary classification dataset
X, y = make_moons(n_samples=500, shuffle=True,
                  noise=0.15, random_state=42)

# Plot the generated dataset, colored by class label
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()

Output: [figure: scatter plot of the generated points, colored by class label]

Program – 3

▪ Write a program that generates data with make_blobs(), which produces blobs of points that can be used for clustering.

# Import the necessary libraries
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate a 2D dataset of three blobs for clustering
X, y = make_blobs(n_samples=500, centers=3, n_features=2, random_state=23)

# Plot the generated dataset, colored by blob label
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()

Output: [figure: scatter plot of the generated points, colored by blob label]

Program – 4

▪ Write a program that generates data with make_classification(). The n_informative, n_redundant and n_repeated parameters must be balanced against n_features and n_classes; without shuffling, all useful features are contained in the columns X[:, :n_informative + n_redundant + n_repeated].

# Import the necessary libraries
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt

# Generate 2D classification dataset with three classes
X, y = make_classification(n_samples=100,
                           n_features=2,
                           n_redundant=0,
                           n_informative=2,
                           n_repeated=0,
                           n_classes=3,
                           n_clusters_per_class=1)

# Plot the generated dataset, colored by class label
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.show()

Output: [figure: scatter plot of the generated points, colored by class label]

Program – 5
▪ Write a Program for the implementation of Q-Learning (Reinforcement
Learning)

import numpy as np
import pylab as pl
import networkx as nx

# Edges of an undirected graph; each node is a state
edges = [(0, 1), (1, 5), (5, 6), (5, 4), (1, 2),
         (1, 3), (9, 10), (2, 4), (0, 6), (6, 7),
         (8, 9), (7, 8), (1, 7), (3, 9)]

goal = 10  # target node

# Build the graph and draw its nodes, edges and labels
G = nx.Graph()
G.add_edges_from(edges)
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos)
pl.show()

Output: [figure: drawing of the graph with labeled nodes]
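
The listing above only builds and draws the graph; the Q-learning update itself is not shown. The following is a minimal sketch of tabular Q-learning on the same edge list, not the author's implementation. The reward scheme (100 for a move that enters the goal node, 0 otherwise), the discount factor gamma, the learning rate alpha, the number of update steps, and the helper name greedy_path are all assumptions chosen for illustration.

```python
import random

# Same edge list and goal as in the program above
edges = [(0, 1), (1, 5), (5, 6), (5, 4), (1, 2),
         (1, 3), (9, 10), (2, 4), (0, 6), (6, 7),
         (8, 9), (7, 8), (1, 7), (3, 9)]
goal = 10

# Adjacency list of the undirected graph: an action is "move to a neighbour"
neighbours = {}
for a, b in edges:
    neighbours.setdefault(a, []).append(b)
    neighbours.setdefault(b, []).append(a)

gamma = 0.8   # discount factor (assumed value)
alpha = 0.8   # learning rate (assumed value)
rng = random.Random(0)

# Q[s][a] is the estimated return of moving from node s to neighbour a
Q = {s: {a: 0.0 for a in nbrs} for s, nbrs in neighbours.items()}
states = [s for s in neighbours if s != goal]

for _ in range(5000):
    s = rng.choice(states)            # explore: random state ...
    a = rng.choice(neighbours[s])     # ... and random action
    reward = 100.0 if a == goal else 0.0   # assumed reward scheme
    # The goal is terminal, so no future value is added once it is reached
    future = 0.0 if a == goal else max(Q[a].values())
    # Q-learning update: move Q[s][a] towards reward + gamma * best next value
    Q[s][a] += alpha * (reward + gamma * future - Q[s][a])


def greedy_path(start, max_steps=20):
    """Follow the highest-valued action from start until the goal is reached."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        s = path[-1]
        path.append(max(Q[s], key=Q[s].get))
    return path


path = greedy_path(0)
print("Greedy path from node 0:", path)
```

With these assumed settings, the greedy path should follow the shortest route through nodes 0, 1, 3, 9 and 10. Sampling random state and action pairs keeps the sketch short; an episode-based loop with an epsilon-greedy policy is a common alternative.
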
Program – 6
▪ Write a program to demonstrate the working of the decision tree
based ID3 algorithm. Use an appropriate data set for building the
decision tree and apply this knowledge to classify a new sample.

# Import necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load a dataset (for example, the iris dataset)
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create the decision tree classifier. scikit-learn implements an optimized
# version of CART rather than ID3 itself; criterion="entropy" selects splits
# by information gain, the measure ID3 uses.
clf = DecisionTreeClassifier(criterion="entropy")

# Train the classifier using the training data
clf.fit(X_train, y_train)

# Make predictions on the testing data
y_pred = clf.predict(X_test)

# Evaluate the performance of the classifier
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

# Now, let's classify a new sample
new_sample = [[5.1, 3.5, 1.4, 0.2]]  # example values for the new sample
predicted_class = clf.predict(new_sample)
print("\nPredicted class for new sample:", predicted_class)

Output: [console output: accuracy, confusion matrix, classification report, and the predicted class for the new sample]

Program – 7
▪ Build an Artificial Neural Network by implementing the Backpropagation
algorithm and test the same using appropriate data sets.

import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.learning_rate = 0.1
        # Randomly initialise the two weight matrices
        self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)
        self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x is expected to be a sigmoid output, so the derivative is x * (1 - x)
        return x * (1 - x)

    def forward_propagation(self, input_data):
        self.hidden_input = np.dot(input_data, self.weights_input_hidden)
        self.hidden_output = self.sigmoid(self.hidden_input)
        # Apply the sigmoid at the output layer too, so that the
        # sigmoid_derivative used in backward_propagation matches the activation
        self.output = self.sigmoid(np.dot(self.hidden_output, self.weights_hidden_output))
        return self.output

    def backward_propagation(self, input_data, target):
        # Error and delta at the output layer
        error = target - self.output
        d_output = error * self.sigmoid_derivative(self.output)
        # Propagate the error back to the hidden layer
        error_hidden = d_output.dot(self.weights_hidden_output.T)
        d_hidden = error_hidden * self.sigmoid_derivative(self.hidden_output)
        # Update the weights
        self.weights_hidden_output += self.hidden_output.T.dot(d_output) * self.learning_rate
        self.weights_input_hidden += input_data.T.dot(d_hidden) * self.learning_rate

    def train(self, input_data, target, epochs):
        for _ in range(epochs):
            self.forward_propagation(input_data)
            self.backward_propagation(input_data, target)

    def predict(self, input_data):
        return self.forward_propagation(input_data)

# Example usage: the XOR truth table
input_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
target = np.array([[0], [1], [1], [0]])

# Create a neural network with 2 input nodes, 2 hidden nodes, and 1 output node
nn = NeuralNetwork(2, 2, 1)

# Train the neural network
nn.train(input_data, target, epochs=10000)

# Test the neural network
print("Predictions:")
for i in range(len(input_data)):
    print(f"Input: {input_data[i]}, Predicted Output: {nn.predict(input_data[i])}")

Output: [console output: predicted value for each of the four inputs]

Program – 8
▪ Implement the non-parametric Locally Weighted Regression
algorithm in order to fit datapoints. Select appropriate data set for
your experiment and draw graphs.

import numpy as np
import matplotlib.pyplot as plt

# Generate a sample dataset: a noisy straight line
np.random.seed(0)
X = np.linspace(0, 10, 100)
y = 2 * X + 1 + np.random.normal(scale=2, size=100)

# Locally Weighted Regression function
def lowess(x, y, tau=0.1):
    n = len(x)
    y_hat = np.zeros(n)
    for i in range(n):
        # Gaussian kernel weights centred on x[i]; tau is the bandwidth
        weights = np.exp(-0.5 * ((x - x[i]) / tau) ** 2)
        # Normal equations of the weighted least-squares fit of a local line
        b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
        A = np.array([[np.sum(weights), np.sum(weights * x)],
                      [np.sum(weights * x), np.sum(weights * x * x)]])
        theta = np.linalg.solve(A, b)
        y_hat[i] = theta[0] + theta[1] * x[i]
    return y_hat

# Fit the data using Locally Weighted Regression
tau = 0.3  # Bandwidth parameter
y_lowess = lowess(X, y, tau)

# Plot the original data and the fitted curve
plt.scatter(X, y, label='Original data', color='b')
plt.plot(X, y_lowess, label=f'LOWESS (tau={tau})', color='r')
plt.legend()
plt.show()

Output: [figure: scatter plot of the original data with the fitted LOWESS curve]

Common questions

When choosing between deterministic methods like Decision Trees and probabilistic models, consider the nature of the data, interpretability needs, and training complexity. Decision Trees provide clear rule-based classifications, which suits cases where transparency matters. Probabilistic models handle uncertainty and produce predictive distributions, so they suit data with inherent variance. Scalability, handling of missing data, prediction speed, and assumptions about the data distribution also affect model selection.

The make_circles function generates two concentric circles, a pattern with a spherical decision boundary that is well suited to testing kernel-based algorithms. In contrast, make_moons produces two interlocking half circles. Neither dataset is linearly separable, so both are useful for testing algorithms that can learn nonlinear boundaries.
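
To illustrate why concentric circles defeat a straight-line boundary but yield to a nonlinear feature, here is a small sketch. It uses a hand-rolled NumPy stand-in for make_circles rather than the scikit-learn function, and the radii, noise level and thresholds are assumptions. It compares one simple linear rule (a threshold on the first coordinate) with a threshold on the squared distance from the origin, the kind of feature a polynomial or RBF kernel can exploit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # points per class

# Hand-rolled concentric circles: class 0 on radius 1.0, class 1 on radius 0.5
angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radii = np.concatenate([np.full(n, 1.0), np.full(n, 0.5)])
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
X += rng.normal(scale=0.05, size=X.shape)   # small Gaussian noise
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# A linear rule on a raw coordinate: predict class 1 when x1 > 0
linear_pred = (X[:, 0] > 0).astype(int)
linear_acc = float(np.mean(linear_pred == y))

# Nonlinear feature: squared distance from the origin
r2 = np.sum(X ** 2, axis=1)
# Threshold halfway between the two radii (0.75 squared)
radial_pred = (r2 < 0.75 ** 2).astype(int)
radial_acc = float(np.mean(radial_pred == y))

print(f"accuracy with threshold on x1:  {linear_acc:.2f}")
print(f"accuracy with threshold on r^2: {radial_acc:.2f}")
```

The linear rule should land near chance level, while the radial threshold should separate the two rings almost perfectly.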

Reinforcement learning, particularly Q-Learning, can be applied to graph structures by modeling states as nodes and actions as edges. The goal can be a specific node, and Q-Learning finds the path with the highest cumulative reward from a starting node to the goal. This setup is effective for problems like maze solving, network routing, and game tactics, where decisions must be made sequentially over graph-based state spaces.

Synthetic datasets generated by functions like make_circles and make_blobs provide a controlled environment to test and understand algorithm behavior under known conditions. Their simplified nature, such as clean separation or clustering, allows for fundamental insights into model performance without the noise and unpredictability of real-world data. Because they are so tidy, results on them may not carry over to complex real-world scenarios.

The parameter 'tau' in Locally Weighted Regression acts as a bandwidth that controls how much of the dataset influences the regression estimate at each point. A small 'tau' gives a fit that follows individual datapoints closely and captures more variability (overfitting), whereas a large 'tau' gives a smoother fit that captures the overall trend but may miss finer details (underfitting).
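
The effect of the bandwidth can be checked numerically. The sketch below reuses the normal-equation approach of Program 8 on the same kind of noisy straight line (with different random draws) and compares three values of tau. The helper names lwr_fit and roughness, the roughness measure itself (sum of squared second differences), and the chosen tau values are assumptions made for illustration.

```python
import numpy as np


def lwr_fit(x, y, tau):
    """Locally weighted linear regression evaluated at every training point."""
    y_hat = np.zeros(len(x))
    for i in range(len(x)):
        w = np.exp(-0.5 * ((x - x[i]) / tau) ** 2)   # Gaussian kernel weights
        A = np.array([[np.sum(w), np.sum(w * x)],
                      [np.sum(w * x), np.sum(w * x * x)]])
        b = np.array([np.sum(w * y), np.sum(w * y * x)])
        theta = np.linalg.solve(A, b)                # weighted least squares
        y_hat[i] = theta[0] + theta[1] * x[i]
    return y_hat


def roughness(curve):
    """Sum of squared second differences: larger means a wigglier curve."""
    return float(np.sum(np.diff(curve, n=2) ** 2))


# Noisy straight line, as in Program 8
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2 * x + 1 + rng.normal(scale=2, size=100)

results = {}
for tau in (0.1, 0.3, 2.0):
    fit = lwr_fit(x, y, tau)
    mse = float(np.mean((fit - y) ** 2))
    results[tau] = (mse, roughness(fit))
    print(f"tau={tau}: training MSE={mse:.2f}, roughness={results[tau][1]:.2f}")
```

A small tau should give a lower training error but a much rougher curve, while a large tau should give a smooth curve whose training error is close to the noise variance.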

The learning rate in Backpropagation determines the step size at each iteration while moving toward a minimum of the loss function. A high learning rate may converge quickly but risks overshooting the minimum, while a low learning rate takes smaller, more precise steps but lengthens training and can leave the optimizer stuck in local minima. A well-chosen learning rate balances convergence speed against accuracy.
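
The trade-off can be seen on a one-dimensional toy problem. The sketch below runs plain gradient descent on f(w) = w**2, which stands in for a network's loss; the function, the starting point, the step count and the three learning rates are assumptions chosen only to show the three regimes.

```python
def gradient_descent(lr, steps=50, w0=5.0):
    """Minimise f(w) = w**2 with plain gradient descent; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad      # step against the gradient
    return w


w_small = gradient_descent(lr=0.01)   # small steps: still far from 0
w_good = gradient_descent(lr=0.1)     # moderate steps: close to 0
w_large = gradient_descent(lr=1.1)    # steps too large: overshoots and diverges

print(f"lr=0.01 -> w={w_small:.4f}")
print(f"lr=0.1  -> w={w_good:.6f}")
print(f"lr=1.1  -> w={w_large:.1f}")
```

Each step multiplies w by (1 - 2 * lr), so the iterate shrinks slowly for lr=0.01, shrinks quickly for lr=0.1, and grows in magnitude for lr=1.1.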

When using make_classification, it is essential to balance the number of informative, redundant, and repeated features, whose total must not exceed n_features. Informative features carry the class signal, redundant features are linear combinations of the informative ones, and repeated features are duplicates drawn at random from the informative and redundant features. Specifying the number of classes and clusters per class then tailors the complexity of the generated dataset.
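
The balancing constraints can be observed directly. The sketch below assumes scikit-learn is installed and relies on make_classification raising ValueError when the feature counts are inconsistent; the variable names and the specific invalid combinations are chosen for illustration.

```python
from sklearn.datasets import make_classification

# Valid: 2 informative + 0 redundant + 0 repeated <= 2 features,
# and 3 classes * 1 cluster per class <= 2**2
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, n_repeated=0, n_classes=3,
                           n_clusters_per_class=1, random_state=0)
print("valid configuration, X shape:", X.shape)

# Invalid: the defaults n_informative=2 and n_redundant=2 need at least
# 4 features, so asking for only 2 features is rejected
try:
    make_classification(n_samples=100, n_features=2, random_state=0)
    too_many_features_error = None
except ValueError as exc:
    too_many_features_error = str(exc)
print("defaults with n_features=2:", too_many_features_error)

# Invalid: 3 classes do not fit when there is only 1 informative feature
try:
    make_classification(n_samples=100, n_features=2, n_informative=1,
                        n_redundant=0, n_repeated=0, n_classes=3,
                        n_clusters_per_class=1, random_state=0)
    too_few_informative_error = None
except ValueError as exc:
    too_few_informative_error = str(exc)
print("n_informative=1 with 3 classes:", too_few_informative_error)
```

This is why Program 4 sets n_redundant=0 explicitly when it asks for only two features.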

The 'gini' index and 'entropy' are both criteria used to split nodes in a Decision Tree; ID3 uses entropy through information gain. The 'gini' index measures impurity without the logarithm that 'entropy' requires, which makes it slightly cheaper to compute. 'Entropy' measures impurity in information-theoretic terms, and the split that reduces it most has the highest information gain. In practice, both perform similarly in terms of accuracy, but may yield different tree structures.
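
Both measures are short enough to compute by hand. The sketch below defines them on lists of class counts and evaluates the information gain of one hypothetical split; the helper names and the example counts are assumptions for illustration.

```python
from math import log2


def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (in bits) of a node given its class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = [5, 5]              # 5 samples of each class
split = [[4, 1], [1, 4]]     # hypothetical split into two child nodes

print(f"gini(parent)    = {gini(parent):.3f}")
print(f"entropy(parent) = {entropy(parent):.3f}")
print(f"information gain of the split = {information_gain(parent, split):.3f}")
```

For a balanced two-class node the Gini impurity is 0.5 and the entropy is 1 bit, the maximum of each measure for two classes.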

The graphical representation of a Decision Tree provides intuitive insights into decision rules and data structure, making it easier to see which variables are most important and how decisions are reached. This representation helps in understanding and communicating the model's logic, facilitates error analysis, and supports domain experts in validating and refining decision-making criteria.
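
A text rendering gives much of the same insight without a plotting backend. The sketch below assumes scikit-learn is installed and uses its export_text helper on a tree fitted to the iris data; the max_depth=2 limit and random_state=0 are assumptions that keep the printed rules short and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow entropy-based tree so the printed rules stay readable
clf = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Render the fitted tree as indented if/else rules
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line is a threshold test on one feature, and each leaf line names the predicted class, so the path to any prediction can be read off directly.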

Datasets created by make_blobs are useful for clustering tasks because they let different clustering algorithms be tested under controlled conditions. They offer well-separated clusters that help in evaluating a clustering algorithm's ability to identify distinct groups in a dataset. This controlled complexity allows for testing model robustness and evaluating cluster validity metrics such as silhouette scores.
