AIML Lab Record
Machine learning is a subset of artificial intelligence in the field of computer science that often
uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve
performance on a specific task) with data, without being explicitly programmed. In the past
decade, machine learning has given us self-driving cars, practical speech recognition, effective
web search, and a vastly improved understanding of the human genome.
Machine learning tasks are typically classified into two broad categories, depending on whether
there is a learning "signal" or "feedback" available to a learning system:
1. Supervised learning: The computer is presented with example inputs and their desired
outputs, given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs.
As special cases, the input signal can be only partially available, or restricted to special feedback:
3. Active learning: the computer can only obtain training labels for a limited set of instances
(based on a budget), and also has to optimize its choice of objects to acquire labels for. When
used interactively, these can be presented to the user for labeling.
4. Reinforcement learning: training data (in form of rewards and punishments) is given only as
feedback to the program's actions in a dynamic environment, such as driving a vehicle or playing
a game against an opponent.
5. Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to
find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden
patterns in data) or a means towards an end (feature learning).
In classification, inputs are divided into two or more classes, and the learner must produce a
model that assigns unseen inputs to one or more (multi-label classification) of these classes. This
is typically tackled in a supervised manner. Spam filtering is an example of classification, where
the inputs are email (or other) messages and the classes are "spam" and "not spam".
In regression, also a supervised problem, the outputs are continuous rather than discrete. In
clustering, a set of inputs is to be divided into groups. Unlike in classification, the groups are not
known beforehand, making this typically an unsupervised task. Density estimation finds the
distribution of inputs in some space.
Dimensionality reduction simplifies inputs by mapping them into a lower dimensional space.
Topic modeling is a related problem, where a program is given a list of human language
documents and is tasked with finding out which documents cover similar topics.
2. Association rule learning
Association rule learning is a method for discovering interesting relations between variables in
large databases.
4. Deep learning
Falling hardware prices and the development of GPUs for personal use in the last few years have
contributed to the development of deep learning, which uses artificial neural networks with
multiple hidden layers. This approach tries to model the way the human brain
processes light and sound into vision and hearing. Successful applications of deep learning
include computer vision and speech recognition.
of two categories, an SVM training algorithm builds a model that predicts whether a new
example falls into one category or the other.
7. Clustering
Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that
observations within the same cluster are similar according to some pre designated criterion or
criteria, while observations drawn from different clusters are dissimilar. Different clustering
techniques make different assumptions on the structure of the data, often defined by some
similarity metric and evaluated for example by internal compactness (similarity between
members of the same cluster) and separation between different clusters. Other methods are based
on estimated density and graph connectivity. Clustering is a method of unsupervised learning,
and a common technique for statistical data analysis.
8. Bayesian networks
A Bayesian network, belief network or directed acyclic graphical model is a probabilistic
graphical model that represents a set of random variables and their conditional independencies
via a directed acyclic graph (DAG). For example, a Bayesian network could represent the
probabilistic relationships between diseases and symptoms. Given symptoms, the network can be
used to compute the probabilities of the presence of various diseases. Efficient algorithms exist
that perform inference and learning.
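As a rough illustration of the disease and symptom example, the sketch below applies Bayes' rule to a two-node network (Disease -> Symptom). All probabilities are made-up values chosen only for this example, and the function name is hypothetical.

```python
from fractions import Fraction

# Hypothetical numbers for a two-node network: Disease -> Symptom
p_disease = Fraction(1, 100)                 # prior P(D)
p_symptom_given_disease = Fraction(9, 10)    # P(S | D)
p_symptom_given_healthy = Fraction(5, 100)   # P(S | not D)


def posterior_disease_given_symptom():
    """Return P(D | S) by Bayes' rule, summing over D to obtain P(S)."""
    joint_d = p_disease * p_symptom_given_disease
    joint_not_d = (1 - p_disease) * p_symptom_given_healthy
    return joint_d / (joint_d + joint_not_d)


print(posterior_disease_given_symptom())  # 2/13, about 0.154
```

Even with a fairly reliable symptom, the posterior stays low here because the assumed prior P(D) is small.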
9. Reinforcement learning
Reinforcement learning is concerned with how an agent ought to take actions in an environment
so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt
to find a policy that maps states of the world to the actions the agent ought to take in those states.
Reinforcement learning differs from the supervised learning problem in that correct input/output
pairs are never presented, nor sub-optimal actions explicitly corrected.
10. Similarity and metric learning
In this problem, the learning machine is given pairs of examples that are considered similar and
pairs of less similar objects. It then needs to learn a similarity function (or a distance metric
function) that can predict if new objects are similar. It is sometimes used in recommendation
systems.
11. Genetic algorithms
A genetic algorithm (GA) is a search heuristic that mimics the process of natural selection, and
uses methods such as mutation and crossover to generate new genotypes in the hope of finding
good solutions to a given problem. In machine learning, genetic algorithms found some uses in
the 1980s and 1990s. Conversely, machine learning techniques have been used to improve the
performance of genetic and evolutionary algorithms.
Ex. No: 1 IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS
DATE:
1.1 AIM:
To implement uninformed search algorithms (BFS and DFS).
1.3 ALGORITHM
BFS Algorithm
Breadth-First Search (BFS) is an algorithm for traversing graphs or trees, where
traversing means visiting each node. BFS is usually implemented iteratively with a
queue: it visits all nodes at the current depth before moving to the next level. In
Python, the graph can be stored as a dictionary that maps each node to a list of its
neighbours. BFS on a tree and on a graph is almost the same; the only difference is
that a graph may contain cycles, so visited nodes must be recorded to avoid
processing the same node again.
Step 1: Enqueue the starting node. The first step is to enqueue the starting node into a
queue data structure. ...
Step 2: Dequeue a node and mark it as visited. ...
Step 3: Enqueue all adjacent nodes of the dequeued node that are not yet visited. ...
Step 4: Repeat steps 2-3 until the queue is empty.
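The steps above can be sketched as follows. This is a minimal illustration, not the record's original listing: the function signature and `example_graph` are assumptions made for the example. Nodes are marked as visited when they are enqueued rather than when they are dequeued, which gives the same visiting order and avoids queueing a node twice.

```python
from collections import deque


def bfs(graph, start):
    """Return the nodes reachable from `start` in breadth-first order."""
    visited = {start}          # nodes already discovered
    queue = deque([start])     # Step 1: enqueue the starting node
    order = []
    while queue:               # Step 4: repeat until the queue is empty
        node = queue.popleft()                 # Step 2: dequeue a node
        order.append(node)
        for neighbour in graph[node]:          # Step 3: enqueue unvisited neighbours
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


# Hypothetical example graph (adjacency lists in a dictionary)
example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}
print(bfs(example_graph, 'A'))  # ['A', 'B', 'C', 'D', 'E', 'F']
```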
DFS Algorithm
Depth-First Search (DFS) can be written recursively, which relies on the call stack,
or iteratively with an explicit stack. A standard DFS implementation puts every
vertex of the graph into one of two categories: 1) Visited 2) Not Visited. The purpose
of the algorithm is to visit every vertex of the graph while avoiding cycles.
Step 1: Start by putting any one of the graph's vertices on top of the stack.
Step 2: Take the top item off the stack and add it to the visited list.
Step 3: Create a list of that vertex's adjacent nodes. Add the ones that are not in
the visited list to the top of the stack.
Step 4: Keep repeating steps 2 and 3 until the stack is empty.
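A minimal sketch of the stack-based steps above, as an alternative to the recursive version. The function name and `example_graph` are assumptions made for illustration. Neighbours are pushed in reverse so that the first-listed neighbour is explored first.

```python
def dfs_iterative(graph, start):
    """Return the nodes reachable from `start` in depth-first order."""
    visited = []               # visited list, in visiting order
    stack = [start]            # Step 1: put the start vertex on the stack
    while stack:               # Step 4: repeat until the stack is empty
        node = stack.pop()     # Step 2: take the top item of the stack
        if node in visited:
            continue           # already handled via another path
        visited.append(node)
        # Step 3: push the unvisited adjacent nodes onto the stack
        for neighbour in reversed(graph[node]):
            if neighbour not in visited:
                stack.append(neighbour)
    return visited


# Hypothetical example graph (adjacency lists in a dictionary)
example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': [],
}
print(dfs_iterative(example_graph, 'A'))  # ['A', 'B', 'D', 'E', 'F', 'C']
```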
1.4 PROGRAM & OUTPUT
# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5') # function calling
OUTPUT
def dfs(visited, graph, node):  # function for dfs
    if node not in visited:
        print(node)
        visited.add(node)  # assumes visited is a set defined alongside the graph
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, '5')
OUTPUT
1.5 PROCEDURE
1.6 RESULT
Ex. No: 2 IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS (A* & AO*)
DATE:
2.1 AIM:
To implement informed search algorithms (A* and AO*).
2.3 ALGORITHM
A* Search Algorithm:
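A* Search Algorithm is a path-finding algorithm. It is a best-first search: at each
step it expands the node with the lowest f(n) = g(n) + h(n), where g(n) is the actual
cost from Source_node to n and h(n) is the heuristic estimate of the cost from n to
Dest_node. With an admissible heuristic (one that never overestimates), it finds the
shortest path from Source_node to Dest_node.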
Real-life Examples
Maps
Games
Difference between A* and AO* algorithms
An A* algorithm represents an OR graph algorithm that is used to find a single solution (either this
or that). An AO* algorithm represents an AND-OR graph algorithm that is used to find more than
one solution by ANDing more than one branch.
Real-life Examples
Maps
Games
Formula for AO* Algorithm
h(n) = heuristic_value
g(n) = actual_cost
f(n) = actual_cost + heuristic_value
f(n) = g(n) + h(n)
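To show how f(n) = g(n) + h(n) drives the search, here is a compact A* sketch built on a priority queue. It is an illustration under assumptions, not the program of this exercise: the function name `a_star`, the graph, and the heuristic table are all hypothetical.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (path, cost) of the cheapest path found, or (None, inf)."""
    # Each heap entry is (f, g, node, path); the smallest f is expanded first.
    open_heap = [(heuristic[start], 0, start, [start])]
    best_g = {start: 0}        # cheapest known actual cost g(n) per node
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        if g > best_g.get(node, float('inf')):
            continue           # stale entry: a cheaper route was found later
        for neighbour, weight in graph.get(node, []):
            new_g = g + weight
            if new_g < best_g.get(neighbour, float('inf')):
                best_g[neighbour] = new_g
                new_f = new_g + heuristic[neighbour]   # f(n) = g(n) + h(n)
                heapq.heappush(open_heap, (new_f, new_g, neighbour, path + [neighbour]))
    return None, float('inf')


# Hypothetical weighted graph and admissible heuristic values
toy_graph = {
    'S': [('A', 1), ('B', 4)],
    'A': [('B', 2), ('G', 6)],
    'B': [('G', 3)],
    'G': [],
}
toy_heuristic = {'S': 5, 'A': 4, 'B': 3, 'G': 0}
print(a_star(toy_graph, toy_heuristic, 'S', 'G'))  # (['S', 'A', 'B', 'G'], 6)
```

The direct edge A to G (total cost 7) is skipped because the route through B has a lower f value.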
Program :
                # for each node m, compare its distance from start, i.e. g(m), to the
                # distance from start through node n
                else:
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n

                while parents[n] != n:
                    path.append(n)
                    n = parents[n]
                path.append(start_node)
                path.reverse()
                print('Path found: {}'.format(path))
                return path
def heuristic(n):
    H_dist = {
        'A': 10,
        'B': 8,
        'C': 5,
        'D': 7,
        'E': 3,
        'F': 6,
        'G': 5,
        'H': 3,
        'I': 1,
        'J': 0
    }
    return H_dist[n]
}
aStarAlgo('A', 'J')
Output
Program
class Graph:
    def __init__(self, graph, heuristicNodeList, startNode):
        # instantiate graph object with graph topology, heuristic values, start node
        self.graph = graph
        self.H = heuristicNodeList
        self.start = startNode
        self.parent = {}
        self.status = {}
        self.solutionGraph = {}

    def printSolution(self):
        print("FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE STARTNODE:", self.start)
        print("------------------------------------------------------------")
        print(self.solutionGraph)
        print("------------------------------------------------------------")
            if flag == True:  # initialize Minimum Cost with the cost of first set of child node/s
                minimumCost = cost
                costToChildNodeListDict[minimumCost] = nodeList  # set the Minimum Cost child node/s
                flag = False
            else:  # checking the Minimum Cost nodes with the current Minimum Cost
                if minimumCost > cost:
                    minimumCost = cost
                    costToChildNodeListDict[minimumCost] = nodeList  # set the Minimum Cost child node/s
    def aoStar(self, v, backTracking):  # AO* algorithm for a start node and backTracking status flag
        print("-----------------------------------------------------------------------------------------")
            if solved == True:  # if the Minimum Cost nodes of v are solved, set the current node status as solved(-1)
                self.setStatus(v, -1)
                self.solutionGraph[v] = childNodeList  # update the solution graph with the solved nodes which may be a part of solution
            if v != self.start:  # check the current node is the start node for backtracking the current node value
                self.aoStar(self.parent[v], True)  # backtracking the current node value with backtracking status set to true
            if backTracking == False:  # check the current call is not for backtracking
                for childNode in childNodeList:  # for each Minimum Cost child node
                    self.setStatus(childNode, 0)  # set the status of child node to 0(needs exploration)
                    self.aoStar(childNode, False)  # Minimum Cost child node is further explored with backtracking status as false
h1 = {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1, 'T': 3}
graph1 = {
    'A': [[('B', 1), ('C', 1)], [('D', 1)]],
    'B': [[('G', 1)], [('H', 1)]],
    'C': [[('J', 1)]],
    'D': [[('E', 1), ('F', 1)]],
    'G': [[('I', 1)]]
}
G1 = Graph(graph1, h1, 'A')
G1.applyAOStar()
G1.printSolution()
h2 = {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}  # Heuristic values of nodes
graph2 = {  # Graph of nodes and edges
    'A': [[('B', 1), ('C', 1)], [('D', 1)]],  # Neighbors of node 'A': B, C & D with respective weights
    'B': [[('G', 1)], [('H', 1)]],  # Neighbors are included in a list of lists
    'D': [[('E', 1), ('F', 1)]]  # Each sublist indicates an "OR" node or "AND" nodes
}
G2 = Graph(graph2, h2, 'A')  # Instantiate Graph object with graph, heuristic values and start node
G2.applyAOStar()  # Run the AO* algorithm
G2.printSolution()  # Print the solution graph found by the AO* search
Output:
HEURISTIC VALUES : {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : B
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : G
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 10, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : B
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 10, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 7, 'J': 1,
'T': 3}
SOLUTION GRAPH : {}
PROCESSING NODE : I
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 8, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': []}
PROCESSING NODE : G
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 12, 'B': 8, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I']}
PROCESSING NODE : B
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 12, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : C
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 1,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G']}
PROCESSING NODE : J
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': []}
PROCESSING NODE : C
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 6, 'B': 2, 'C': 1, 'D': 12, 'E': 2, 'F': 1, 'G': 1, 'H': 7, 'I': 0, 'J': 0,
'T': 3}
SOLUTION GRAPH : {'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J']}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE STARTNODE: A
------------------------------------------------------------
{'I': [], 'G': ['I'], 'B': ['G'], 'J': [], 'C': ['J'], 'A': ['B', 'C']}
------------------------------------------------------------
HEURISTIC VALUES : {'A': 1, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : D
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 4, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {}
PROCESSING NODE : E
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 10, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : D
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 11, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 4, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': []}
PROCESSING NODE : F
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 6, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': []}
PROCESSING NODE : D
-----------------------------------------------------------------------------------------
HEURISTIC VALUES : {'A': 7, 'B': 6, 'C': 12, 'D': 2, 'E': 0, 'F': 0, 'G': 5, 'H': 7}
SOLUTION GRAPH : {'E': [], 'F': [], 'D': ['E', 'F']}
PROCESSING NODE : A
-----------------------------------------------------------------------------------------
FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE STARTNODE: A
------------------------------------------------------------
{'E': [], 'F': [], 'D': ['E', 'F'], 'A': ['D']}
------------------------------------------------------------
2.5 PROCEDURE
Open python 3.0 IDLE / Colab
Write the program
Run the program
Observe the output and take the hard copy
Write the program for various example / application , Observe the output and take the
hard copy
2.6 RESULT
3.3 ALGORITHM
Conditional probability is the likelihood of an event or outcome occurring, given
that a previous event or outcome has occurred. For events A and B it is written
P(B|A) and calculated as P(A and B) / P(A). Equivalently, the joint probability
P(A and B) is obtained by multiplying the probability of the preceding event, P(A),
by the conditional probability of the succeeding event, P(B|A).
Bayes’ Rule
Bayes’ theorem, due to Thomas Bayes, a British mathematician, and published in 1763,
provides a means of calculating the probability of an event given some information.
Mathematically, Bayes’ theorem can be stated as:
P(H|E) = P(E|H) · P(H) / P(E)
Naive Bayes
Bayes’ rule provides us with the formula for the probability of Y given some feature X. In
real-world problems, we hardly find any case where there is only one feature. To handle
several features, we extend Bayes’ rule to what is called Naive Bayes, which assumes that
the features are independent of one another given the class: changing the value of one
feature doesn’t influence the values of the others. This assumption is why the algorithm is
called “naive”. Naive Bayes can be used for tasks such as face recognition, weather
prediction, medical diagnosis, news classification, sentiment analysis, and more.
When there are multiple X variables, we simplify it by assuming that X’s are independent, so
different values now. Also, the probability density function (PDF) of a normal distribution
with mean μ and standard deviation σ is given by:
f(x) = (1 / (σ √(2π))) · exp(−(x − μ)² / (2σ²))
We can use this formula to compute the likelihoods when our data is continuous.
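As a small sketch of how a likelihood can be computed for a continuous feature, the function below evaluates the normal density using only the standard library. The function name and the sample values are assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation at x."""
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * std ** 2)
    return coeff * math.exp(exponent)


# Density of the standard normal at its mean, about 0.3989
print(gaussian_pdf(0.0, 0.0, 1.0))
```

In a Gaussian Naive Bayes classifier, one such density per feature and per class would typically be multiplied together with the class prior.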
Problem statement:
– Given features X1, X2, …, Xn
– Predict a label Y
Example: X = (Rainy, Hot, High, False), y = No
Or
Consider a random experiment of tossing 2 coins. The sample space here will be:
S = {HH, HT, TH, TT}
P(H) is the probability of hypothesis H being true. This is known as the prior
probability.
P(E) is the probability of the evidence(regardless of the hypothesis).
P(E|H) is the probability of the evidence given that hypothesis is true.
P(H|E) is the probability of the hypothesis given that the evidence is there.
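The four quantities above can be checked on the two-coin sample space by simple counting. In this sketch the choice of hypothesis (both coins show heads) and evidence (at least one coin shows heads) is an assumption made for illustration, as are the function names.

```python
from fractions import Fraction

sample_space = ['HH', 'HT', 'TH', 'TT']  # equally likely outcomes of two tosses


def probability(event):
    """P(event) = number of favourable outcomes / number of all outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))


def hypothesis(outcome):   # assumed hypothesis: both coins show heads
    return outcome == 'HH'


def evidence(outcome):     # assumed evidence: at least one coin shows heads
    return 'H' in outcome


p_h = probability(hypothesis)     # prior P(H) = 1/4
p_e = probability(evidence)       # P(E) = 3/4
p_e_given_h = Fraction(1)         # if both are heads, at least one is
p_h_given_e = p_e_given_h * p_h / p_e
print(p_h_given_e)  # 1/3
```

Counting directly gives the same value: one of the three outcomes with at least one head is HH.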
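## import libraries
import math
import random
import pandas as pd
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic three-class dataset
X, y = make_classification(
    n_features=6,
    n_classes=3,
    n_samples=800,
    n_informative=2,
    random_state=1,
    n_clusters_per_class=1,
)

# Train/test split (this step is not shown in the record;
# test_size and random_state are assumed values)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=125
)

# Model training (GaussianNB and the name `model` are assumed)
model = GaussianNB()
model.fit(X_train, y_train)

# Predict Output
predicted = model.predict([X_test[6]])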
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    ConfusionMatrixDisplay,
    f1_score,
)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_pred, y_test)
f1 = f1_score(y_pred, y_test, average="weighted")
print("Accuracy:", accuracy)
print("F1 Score:", f1)

labels = [0, 1, 2]
cm = confusion_matrix(y_test, y_pred, labels=labels)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
disp.plot()
OUTPUT
Accuracy: 0.8484848484848485
F1 Score: 0.8491119695890328
3.5 PROCEDURE
Open python 3.0 IDLE / Colab
Write the program
Run the program
Observe the output and take the hard copy
Write the program for various example / application , Observe the output and take the
hard copy
3.6 RESULT
Ex. No: 4 IMPLEMENTATION OF BAYESIAN NETWORKS
DATE:
4.1 AIM:
To implement Bayesian networks.
4.3 ALGORITHM
This section will be about obtaining a Bayesian network, given a set of sample data. Learning a
Bayesian network can be split into two problems:
Parameter learning: Given a set of data samples and a DAG that captures the dependencies
between the variables, estimate the (conditional) probability distributions of the individual
variables.
Structure learning: Given a set of data samples, estimate a DAG that captures the dependencies
between the variables.
This notebook aims to illustrate how parameter learning and structure learning can be done with
pgmpy. Currently, the library supports:
Score-based structure estimation (BIC/BDeu/K2 score; exhaustive search, hill
climb/tabu search)
Constraint-based structure estimation (PC)
Hybrid structure estimation (MMHC)
The Bayesian Parameter Estimator starts with already existing prior CPDs, that express
our beliefs about the variables before the data was observed. Those "priors" are then
updated, using the state counts from the observed data.
One can think of the priors as consisting in pseudo state counts, that are added to the
actual counts before normalization. Unless one wants to encode specific beliefs about
the distributions of the variables, one commonly chooses uniform priors, i.e. ones that
deem all states equiprobable.
A very simple prior is the so-called K2 prior, which simply adds 1 to the count of every
single state. A somewhat more sensible choice of prior is BDeu (Bayesian Dirichlet
equivalent uniform prior). For BDeu we need to specify an equivalent sample size N and
then the pseudo-counts are the equivalent of having observed N uniform samples of each
variable (and each parent configuration).
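The effect of pseudo counts can be sketched without any library. The helper below is a hypothetical illustration, not pgmpy's implementation: it adds the same pseudo count to every state before normalizing. For a variable without parents, a BDeu prior with equivalent sample size N is assumed here to spread N evenly over the states. The observations are made-up values.

```python
from collections import Counter


def estimate_with_pseudo_counts(observations, states, pseudo_count):
    """Estimate state probabilities after adding `pseudo_count` to every state count."""
    counts = Counter(observations)
    total = len(observations) + pseudo_count * len(states)
    return {s: (counts[s] + pseudo_count) / total for s in states}


observed = ['yes', 'yes', 'yes', 'no']   # hypothetical observations
states = ['yes', 'no']

# No prior (maximum likelihood): yes = 3/4
print(estimate_with_pseudo_counts(observed, states, 0))
# K2 prior: add 1 to every state count, yes = 4/6
print(estimate_with_pseudo_counts(observed, states, 1))
# BDeu-style prior with equivalent sample size N = 10: 10 / 2 = 5 per state, yes = 8/14
print(estimate_with_pseudo_counts(observed, states, 10 / len(states)))
```

The larger the pseudo counts, the more the estimate is pulled towards the uniform distribution.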
Parameter Learning
Parameter learning is the task to estimate the values of the conditional probability distributions
(CPDs), for the variables fruit, size, and tasty.
data = pd.DataFrame(data={
    'fruit': ["banana", "apple", "banana", "apple", "banana", "apple", "banana",
              "apple", "apple", "apple", "banana", "banana", "apple", "banana"],
    'tasty': ["yes", "no", "yes", "yes", "yes", "yes", "yes",
              "yes", "yes", "yes", "yes", "no", "no", "no"],
    'size': ["large", "large", "large", "small", "large", "large", "large",
             "small", "large", "large", "large", "large", "small", "small"],
})
Program :
!pip install pgmpy
!pip install pandas
!pip install numpy
import pandas as pd
data = pd.DataFrame(data={
    'fruit': ["banana", "apple", "banana", "apple", "banana", "apple", "banana",
              "apple", "apple", "apple", "banana", "banana", "apple", "banana"],
    'tasty': ["yes", "no", "yes", "yes", "yes", "yes", "yes",
              "yes", "yes", "yes", "yes", "no", "no", "no"],
    'size': ["large", "large", "large", "small", "large", "large", "large",
             "small", "large", "large", "large", "large", "small", "small"],
})
print(data)
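from pgmpy.models import BayesianNetwork  # assumed import; the class name varies across pgmpy versions
from pgmpy.estimators import MaximumLikelihoodEstimator
model = BayesianNetwork([('fruit', 'tasty'), ('size', 'tasty')])  # assumed structure, consistent with the CPD output
mle = MaximumLikelihoodEstimator(model, data)
print(mle.estimate_cpd('fruit'))  # unconditional
print(mle.estimate_cpd('tasty'))  # conditional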
OUTPUT
fruit
apple 7
banana 7
+---------------+-----+
| fruit(apple) | 0.5 |
+---------------+-----+
| fruit(banana) | 0.5 |
+---------------+-----+
+------------+--------------+--------------------+---------------------+---------------+
| fruit      | fruit(apple) | fruit(apple)       | fruit(banana)       | fruit(banana) |
+------------+--------------+--------------------+---------------------+---------------+
| size       | size(large)  | size(small)        | size(large)         | size(small)   |
+------------+--------------+--------------------+---------------------+---------------+
| tasty(no)  | 0.25         | 0.3333333333333333 | 0.16666666666666666 | 1.0           |
+------------+--------------+--------------------+---------------------+---------------+
| tasty(yes) | 0.75         | 0.6666666666666666 | 0.8333333333333334  | 0.0           |
+------------+--------------+--------------------+---------------------+---------------+
+------------+---------------------+--------------------+--------------------+---------------------+
| fruit      | fruit(apple)        | fruit(apple)       | fruit(banana)      | fruit(banana)       |
+------------+---------------------+--------------------+--------------------+---------------------+
| size       | size(large)         | size(small)        | size(large)        | size(small)         |
+------------+---------------------+--------------------+--------------------+---------------------+
| tasty(no)  | 0.34615384615384615 | 0.4090909090909091 | 0.2647058823529412 | 0.6428571428571429  |
+------------+---------------------+--------------------+--------------------+---------------------+
| tasty(yes) | 0.6538461538461539  | 0.5909090909090909 | 0.7352941176470589 | 0.35714285714285715 |
+------------+---------------------+--------------------+--------------------+---------------------+
4.5 PROCEDURE
4.6 RESULT
Ex. No: 5 BUILD REGRESSION MODELS
DATE:
5.1 AIM:
To build regression models.
5.3 ALGORITHM
Regression analysis is a commonly used statistical technique for predicting the relationship between a
dependent variable and one or more independent variables. In the field of machine learning, regression
algorithms are used to make predictions about continuous variables, such as housing prices, student scores,
or medical outcomes. Python, being one of the most widely used programming languages in data science
and machine learning, has a variety of powerful libraries for implementing regression algorithms.
1. Multiple linear regression is a statistical method used to model the relationship between a
dependent variable and two or more independent variables. It is an extension of simple linear
regression, where only one independent variable is used to predict the dependent variable.
2. Polynomial regression is a form of regression analysis in which the relationship between the
independent variable x and the dependent variable y is modeled as an nth degree polynomial. It
allows more flexibility to model non-linear relationships between variables, unlike linear
regression, which assumes that the relationship is linear. The generalized equation for
polynomial regression of order k, where y is the dependent variable and x is the independent
variable, is:
y = b0 + b1·x + b2·x^2 + ... + bk·x^k + e
Here b0, ..., bk are the coefficients to be estimated and e is the error term. We could expand
this by choosing a higher order k, and we could also include interaction terms when there are
several independent variables.
3. Ridge Regression is a variation of linear regression that addresses some of its issues.
Linear regression can be prone to overfitting when the number of independent variables is
large, because the coefficients can become very large, leading to a complex model that fits
the noise in the data. Ridge Regression addresses this by adding an L2 regularization term,
also known as the Ridge penalty, to the linear regression cost function: the sum of the
squares of the coefficients multiplied by a regularization parameter lambda.
4. LASSO (Least Absolute Shrinkage And Selection Operator) is another variation of linear
regression that addresses some of the issues of linear regression. It is used to solve the problem of
overfitting when the number of independent variables is large. Lasso Regression adds a term to
the linear regression equation called L1 regularization term, also known as Lasso Penalty, which
is the sum of the absolute values of the coefficients multiplied by a regularization parameter
lambda.
5. Elastic Net Regression is a hybrid of Ridge Regression and Lasso Regression that combines
the strengths of both. It addresses the problem of overfitting when the number of independent
variables is large by adding both L1 and L2 regularization terms to the linear regression equation.
6. Decision tree based regression is a method that uses decision trees to model the relationship
between a dependent variable and one or more independent variables. Decision trees are widely
used machine learning algorithms that can be applied to both classification and regression
problems. A decision tree is a tree-like structure where each internal node represents a test on
an attribute, each branch represents an outcome of the test, and each leaf node represents a
predicted value or class.
7. Support Vector Regression (SVR) is the regression counterpart of the Support Vector Machine
(SVM), a supervised learning algorithm. In classification, an SVM looks for the hyperplane that
separates two classes with the largest margin. In SVR, the goal is instead to fit a function
that is as flat as possible while keeping the deviation between the predicted value and the true
value of the dependent variable within a margin epsilon; only deviations larger than epsilon are
penalized. The optimization problem of SVR can be formulated as:
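One common way to write this problem is the epsilon-insensitive primal form below. The symbols are assumptions introduced here, since the record's own equation is not available: w and b are the model parameters, phi is the feature map, the xi terms are slack variables, and C and epsilon play the same role as the C and epsilon arguments of scikit-learn's SVR.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right) \\
\text{subject to}\quad & y_i - w^\top \phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^\top \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0, \qquad i = 1,\dots,n.
\end{aligned}
```

The first term keeps the function flat, and the slack variables measure how far each prediction falls outside the epsilon tube.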
reg.coef_
reg.intercept_
reg.predict(np.array([[3, 5]]))
OUTPUT:
array([16.])
2. Polynomial regression
# polynomial
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
X = np.arange(6).reshape(3, 2)
X
poly = PolynomialFeatures(2)
poly.fit_transform(X)
poly = PolynomialFeatures(interaction_only=True)
poly.fit_transform(X)
OUTPUT:
array([[ 1., 0., 1., 0.],
[ 1., 2., 3., 6.],
[ 1., 4., 5., 20.]])
3. Ridge regression
from sklearn.linear_model import Ridge
import numpy as np
n_samples, n_features = 10, 5
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
clf = Ridge(alpha=1.0)
clf.fit(X, y)
OUTPUT:
Ridge()
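To make the shrinking effect of the Ridge penalty concrete, here is a hedged one-feature sketch. It assumes a model y ≈ w·x with no intercept and the cost sum((y − w·x)²) + lam·w², whose minimizer is w = sum(x·y) / (sum(x²) + lam). The function name and data are hypothetical; in scikit-learn's Ridge the penalty strength is the alpha parameter.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for a one-feature model y ~ w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # exactly y = 2x

print(ridge_slope(xs, ys, 0.0))    # 2.0, ordinary least squares
print(ridge_slope(xs, ys, 14.0))   # 1.0, the penalty shrinks the slope towards zero
```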
4. Lasso regression
#lasso
from sklearn import linear_model
clf = linear_model.Lasso(alpha=0.1)
clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
print(clf.coef_)
print(clf.intercept_)
OUTPUT:
[0.85 0. ]
0.15000000000000002
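One possible way to see where 0.85 and 0.15 in the output above come from: scikit-learn documents the Lasso objective as (1/(2n))·||y − Xw||² + alpha·||w||₁. The two feature columns in the example are identical, so this sketch assumes the problem reduces to a single centred feature, for which the minimizer is a soft-thresholding formula. The helper names are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_1d(xs, ys, alpha):
    """One-feature Lasso with an unpenalized intercept (closed form)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    xc = [x - x_mean for x in xs]          # centred feature
    yc = [y - y_mean for y in ys]          # centred target
    rho = sum(a * b for a, b in zip(xc, yc)) / n
    z = sum(a * a for a in xc) / n
    w = soft_threshold(rho, alpha) / z
    intercept = y_mean - w * x_mean
    return w, intercept


w, b = lasso_1d([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], alpha=0.1)
print(round(w, 2), round(b, 2))  # 0.85 0.15
```

The L1 penalty moves the slope from 1.0 to 0.85, and the intercept absorbs the difference.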
from sklearn.linear_model import ElasticNet
from sklearn.datasets import make_regression
X, y = make_regression(n_features=2, random_state=0)
regr = ElasticNet(random_state=0)
regr.fit(X, y)
print(regr.coef_)
print(regr.intercept_)
print(regr.predict([[0, 0]]))
OUTPUT:
[18.83816048 64.55968825]
1.4512607561653996
[1.45126076]
OUTPUT:
array([-0.39292219, -0.46749346, 0.02768473, 0.06441362, -0.50323135,
0.16437202, 0.11242982, -0.73798979, -0.30953155, -0.00137327])
#SVR
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
import numpy as np
n_samples, n_features = 10, 5
rng = np.random.RandomState(0)
y = rng.randn(n_samples)
X = rng.randn(n_samples, n_features)
regr = make_pipeline(StandardScaler(), SVR(C=1.0, epsilon=0.2))
regr.fit(X, y)
OUTPUT:
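Application: Linear Regression
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.model_selection import train_test_split
data_url = "[Link]"
raw_df = pd.read_csv(data_url, sep=r"\s+",
                     skiprows=22, header=None)
# each record spans two rows of the raw file; the target is the third value of the second row
X = np.hstack([raw_df.values[::2, :],
               raw_df.values[1::2, :2]])
y = raw_df.values[1::2, 2]
X_train, X_test,\
    y_train, y_test = train_test_split(X, y,
                                       test_size=0.4,
                                       random_state=1)
reg = linear_model.LinearRegression()
reg.fit(X_train, y_train)
# regression coefficients
print('Coefficients: ', reg.coef_)
# variance score (R^2 on the test set): 1 means perfect prediction
print('Variance score: {}'.format(reg.score(X_test, y_test)))
# plotting legend
plt.legend(loc='upper right')
# plot title
plt.title("Residual errors")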
OUTPUT:
Coefficients: [-8.95714048e-02 6.73132853e-02 5.04649248e-02 2.18579583e+00
-1.72053975e+01 3.63606995e+00 2.05579939e-03 -1.36602886e+00
2.89576718e-01 -1.22700072e-02 -8.34881849e-01 9.40360790e-03
-5.04008320e-01]
Variance score: 0.7209056672661748
5.5 PROCEDURE
5.6 RESULT
6.3 ALGORITHM
A decision tree is a supervised machine-learning algorithm that can be used for
both classification and regression problems. The algorithm builds its model as a
tree made of decision nodes, which test a feature, and leaf nodes, which hold a
prediction. A decision tree is simply a series of sequential decisions made to
reach a specific result.
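The sequence of decisions can be made visible by training a small tree and printing its rules. The sketch below uses scikit-learn's DecisionTreeClassifier on the bundled Iris dataset rather than the TensorFlow Decision Forests setup used in this experiment; the dataset, the depth limit and the variable names are assumptions chosen to keep the example self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# A shallow tree keeps the printed rules short
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

# Each indented line is a decision node (feature <= threshold) or a leaf (class)
print(export_text(tree, feature_names=list(iris.feature_names)))

test_accuracy = tree.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

Reading the printed rules from top to bottom follows one path from the root decision node to a leaf.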
The Palmer Penguins dataset
This Colab uses the Palmer Penguins dataset, which contains size measurements for
three penguin species:
Chinstrap
Gentoo
Adelie
This is a classification problem: the goal is to predict the species of a penguin
from the measurements in the Palmer Penguins dataset. Let’s meet the penguins.
Application Ex: Bank loan approval
[Link]
import numpy as np
import pandas as pd
import tensorflow_decision_forests as tfdf
path = "[Link]mer_penguins/[Link]"
pandas_dataset = pd.read_csv(path)
label = "species"
classes = list(pandas_dataset[label].unique())
print(f"Label classes: {classes}")
# >> Label classes: ['Adelie', 'Gentoo', 'Chinstrap']
# Encode the species names as integer class indices
pandas_dataset[label] = pandas_dataset[label].map(classes.index)
np.random.seed(1)
# Use ~10% of the examples as the testing set
# and the remaining ~90% of the examples as the training set.
test_indices = np.random.rand(len(pandas_dataset)) < 0.1
pandas_train_dataset = pandas_dataset[~test_indices]
pandas_test_dataset = pandas_dataset[test_indices]
tf_train_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(pandas_train_dataset, label=label)
# Single decision tree (CART) learner; the class name was lost in extraction and is assumed here
model = tfdf.keras.CartModel()
model.fit(tf_train_dataset)
tfdf.model_plotter.plot_model_in_colab
tfdf.model_plotter.plot_model_in_colab(model, max_depth=10)
# Illustration of a single split on bill_depth_mm at the threshold 16.35
bill_depth_mm = 16.35
if bill_depth_mm > 16.35:
    classes = list(pandas_dataset[label].unique())
    print(f"Label classes: {classes}")
else:
    # bill_depth_mm <= 16.35
    classes = list(pandas_dataset[label].unique())
    print(f"Label classes: {classes}")
model.compile("accuracy")
print("Train evaluation: ", model.evaluate(tf_train_dataset,
                                           return_dict=True))
# >> Train evaluation: {'loss': 0.0, 'accuracy': 0.96116}
tf_test_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(pandas_test_dataset, label=label)
print("Test evaluation: ", model.evaluate(tf_test_dataset,
                                          return_dict=True))
# >> Test evaluation: {'loss': 0.0, 'accuracy': 0.97142}
OUTPUT:
tensorflow_decision_forests.component.model_plotter.model_plotter.plot_model_in_colab
def plot_model_in_colab(model: InferenceCoreModel, **kwargs)
/usr/local/lib/python3.10/dist-packages/tensorflow_decision_forests/component/model_plotter/model_plotter.py
Plots a model structure in colab.
Args:
model: The model to plot.
**kwargs: Arguments passed to "plot_model".
Returns:
A Colab HTML element showing the model.
6.5 PROCEDURE
6.6 RESULT
[Link] : 7 BUILD SVM MODELS
DATE:
7.1 AIM:
To build SVM Models
7.3 ALGORITHM
The main objective is to separate the classes in the given dataset as well as possible. The
distance between the separating hyperplane and the nearest data points of either class is known as
the margin, and those nearest points are the support vectors. The objective is to select the
hyperplane with the maximum possible margin. SVM searches for the maximum-margin hyperplane in the
following steps:
1. Generate hyperplanes that separate the classes. The left-hand figure shows three
hyperplanes: black, blue and orange. Here, the blue and orange hyperplanes have a higher
classification error, while the black one separates the two classes correctly.
2. Select the hyperplane with the maximum distance from the nearest data points of either
class, as shown in the right-hand figure.
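The two steps above are what a library SVM performs during fitting. The following is a minimal sketch with scikit-learn's SVC on the breast cancer dataset used in this experiment. The linear kernel and the StandardScaler step are assumptions added here for a quick, stable fit; they are not confirmed by the listing that follows.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)

# Scale the features, then fit a maximum-margin linear classifier
clf = make_pipeline(StandardScaler(), svm.SVC(kernel="linear"))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = metrics.accuracy_score(y_test, y_pred)
precision = metrics.precision_score(y_test, y_pred)
recall = metrics.recall_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)

# The support vectors are the training points that define the margin
n_support_vectors = clf.named_steps["svc"].support_vectors_.shape[0]
print("Number of support vectors:", n_support_vectors)
```

Only the support vectors influence the position of the hyperplane, so their number is usually much smaller than the size of the training set.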
import math
import random
import pandas as pd
import numpy as np
from sklearn import datasets
import requests
# Load dataset
cancer = datasets.load_breast_cancer()
# Kernel implementation
def K(x, xi):
    # Choose one of the following implementations:
    # Linear kernel
    # return sum(x * xi)
    # Gaussian kernel
    gamma = 1  # Set the kernel parameter
    return math.exp(-gamma * sum((x_i - xi_i)**2 for x_i, xi_i in zip(x, xi)))
# print the names of the features and of the labels
print("Features: ", cancer.feature_names)
print("Labels: ", cancer.target_names)
# print data (feature) shape
cancer.data.shape
from sklearn.model_selection import train_test_split
# Split dataset: 70% training and 30% test
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target,
                                                    test_size=0.3, random_state=109)
# Import svm model
from sklearn import svm
OUTPUT:
Features: ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
'mean smoothness' 'mean compactness' 'mean concavity'
'mean concave points' 'mean symmetry' 'mean fractal dimension'
'radius error' 'texture error' 'perimeter error' 'area error'
'smoothness error' 'compactness error' 'concavity error'
'concave points error' 'symmetry error' 'fractal dimension error'
'worst radius' 'worst texture' 'worst perimeter' 'worst area'
'worst smoothness' 'worst compactness' 'worst concavity'
'worst concave points' 'worst symmetry' 'worst fractal dimension']
Labels: ['malignant' 'benign']
[1.969e+01 2.125e+01 1.300e+02 1.203e+03 1.096e-01 1.599e-01 1.974e-01
1.279e-01 2.069e-01 5.999e-02 7.456e-01 7.869e-01 4.585e+00 9.403e+01
6.150e-03 4.006e-02 3.832e-02 2.058e-02 2.250e-02 4.571e-03 2.357e+01
2.553e+01 1.525e+02 1.709e+03 1.444e-01 4.245e-01 4.504e-01 2.430e-01
3.613e-01 8.758e-02]
[1.142e+01 2.038e+01 7.758e+01 3.861e+02 1.425e-01 2.839e-01 2.414e-01
1.052e-01 2.597e-01 9.744e-02 4.956e-01 1.156e+00 3.445e+00 2.723e+01
9.110e-03 7.458e-02 5.661e-02 1.867e-02 5.963e-02 9.208e-03 1.491e+01
2.650e+01 9.887e+01 5.677e+02 2.098e-01 8.663e-01 6.869e-01 2.575e-01
6.638e-01 1.730e-01]
[2.029e+01 1.434e+01 1.351e+02 1.297e+03 1.003e-01 1.328e-01 1.980e-01
1.043e-01 1.809e-01 5.883e-02 7.572e-01 7.813e-01 5.438e+00 9.444e+01
1.149e-02 2.461e-02 5.688e-02 1.885e-02 1.756e-02 5.115e-03 2.254e+01
1.667e+01 1.522e+02 1.575e+03 1.374e-01 2.050e-01 4.000e-01 1.625e-01
2.364e-01 7.678e-02]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 0
1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 1
1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 0 1
1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 0 0 0 1 0
1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1
1 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 1 1 1 1 0 0 0 1 1
1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0
0 1 0 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1
1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 1 0 1 1
0 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1
1 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1 0 1 0 0
1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 0 0 0 0 0 0 1]
Accuracy: 0.9649122807017544
Precision: 0.9811320754716981
Recall: 0.9629629629629629
7.5 PROCEDURE
7.6 RESULT
[Link] : 8 IMPLEMENT ENSEMBLING TECHNIQUES
DATE:
8.1 AIM:
To Implement Ensembling Techniques
8.3 ALGORITHM
The steps of the EM algorithm are as follows:
1. We first choose a set of starting parameters, given a set of incomplete (observed) data, and
assume that the observed data come from a specific model.
2. We then use the model to “estimate” the missing data. In other words, with the current
parameters fixed, we use the model to compute the expected values of the missing data. This
step is called the expectation step.
3. We then use the “complete” data (observed plus estimated) to update the parameters: we find
the most likely parameters given both the observed and the estimated missing data, which gives
the modified model. This is called the maximization step.
4. We repeat steps 2 and 3 until convergence, that is, until the parameters of the model no
longer change and the estimated model fits the observed data.
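The four steps above can be sketched for a two-coin problem in which each sequence of tosses comes from one of two coins with unknown biases. This is a minimal illustration, not the record's own listing: the function bodies are assumptions, both coins are assumed equally likely to be picked, and the starting values 0.6 and 0.5 are one possible initial guess.

```python
def coin_likelihood(roll, bias):
    """Likelihood of a toss sequence such as 'HTTH' for a coin with P(H) = bias."""
    heads = roll.count("H")
    return bias ** heads * (1 - bias) ** (len(roll) - heads)

def e_step(rolls, theta_A, theta_B):
    """Expectation: split each sequence's head/tail counts between the two coins."""
    heads_A = tails_A = heads_B = tails_B = 0.0
    for roll in rolls:
        like_A = coin_likelihood(roll, theta_A)
        like_B = coin_likelihood(roll, theta_B)
        p_A = like_A / (like_A + like_B)  # posterior that coin A produced this roll
        p_B = 1.0 - p_A
        heads = roll.count("H")
        tails = len(roll) - heads
        heads_A += p_A * heads
        tails_A += p_A * tails
        heads_B += p_B * heads
        tails_B += p_B * tails
    return heads_A, tails_A, heads_B, tails_B

def m_step(heads_A, tails_A, heads_B, tails_B):
    """Maximization: re-estimate each bias from the expected counts."""
    return heads_A / (heads_A + tails_A), heads_B / (heads_B + tails_B)

def run_em(rolls, theta_A, theta_B, max_iter=10):
    for _ in range(max_iter):
        theta_A, theta_B = m_step(*e_step(rolls, theta_A, theta_B))
    return theta_A, theta_B

rolls = ["HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH", "HTHTTTHHTT", "THHHTHHHTH"]
theta_A, theta_B = run_em(rolls, 0.6, 0.5)
print("theta_A = %.2f, theta_B = %.2f" % (theta_A, theta_B))
```

The E step never decides which coin produced a sequence; it only weights the counts by the posterior probability of each coin, and the M step turns those weighted counts into new bias estimates.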
The EM strategy can be explained with a coin toss example, which is used in the rest of this
section to explain the complete flow of the EM algorithm. In this example we toss a number of
coins sequentially to obtain a sequence of Heads and Tails. The context of the coin toss example
is given in Table 33.1. The problem is defined by X, the observed sequence of Heads and Tails;
Y, the identifier of the coin tossed at each position in the sequence, which is hidden; and θ,
the parameter vector holding the probabilities associated with the observed and hidden data.
If we assume three coins are tossed, λ is the probability of coin 0 showing H (so 1 − λ is
the probability of it showing T), p1 is the probability of coin 1 showing H, and p2 is the
probability of coin 2 showing H.
Parameters of EM
import random
def coin_em(rolls, theta_A=None, theta_B=None, maxiter=10):
    # Initial guess
    theta_A = theta_A or random.random()
    theta_B = theta_B or random.random()
    thetas = [(theta_A, theta_B)]
    # Iterate
    for c in range(maxiter):
        print("#%d:\t%0.2f %0.2f" % (c, theta_A, theta_B))
        # E step: expected head and tail counts attributed to each coin
        heads_A, tails_A, heads_B, tails_B = e_step(rolls, theta_A,
                                                    theta_B)
        # M step: re-estimate both biases from the expected counts
        theta_A, theta_B = m_step(heads_A, tails_A, heads_B, tails_B)
        thetas.append((theta_A, theta_B))
    return thetas, (theta_A, theta_B)
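# Function name assumed; the original definition line is missing from the listing
def coin_likelihood(roll, bias):
    # Likelihood of one toss sequence for a coin with the given bias
    numHeads = roll.count("H")
    flips = len(roll)
    return pow(bias, numHeads) * pow(1-bias, flips-numHeads)
# Call the functions
rolls = [ "HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH",
          "HTHTTTHHTT", "THHHTHHHTH" ]
thetas, _ = coin_em(rolls, 0.6, 0.5, maxiter=10)
type(thetas)
thet=thetas[1]
print(thetas)
# New toss sequence to be attributed to coin A or coin B
rolls_p = "HHHTTHTHTH"
numHeads_p = rolls_p.count('H')
print('No. of Heads', numHeads_p)
flips_p = len(rolls_p)
print(flips_p)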
OUTPUT:
9
10
9
10
8
10
8
10
4
10
4
10
7
10
7
10
#8: 0.80 0.52
5
10
5
10
9
10
9
10
8
10
8
10
4
10
4
10
7
10
7
10
#9: 0.80 0.52
5
10
5
10
9
10
9
10
8
10
8
10
4
10
4
10
7
10
7
10
No. of Heads 6
10
Lilelihood of A coin : 0.0004366017976005356
Lilelihood of B coin : 0.0010483562262602218
Probability of A coin : 0.2940162553991999
Probability of B coin : 0.7059837446008002
The boy tossed Coin B
8.5 PROCEDURE
8.6 RESULT
[Link] : 9 IMPLEMENT CLUSTERING ALGORITHMS
DATE:
9.1 AIM:
To Implement Clustering Algorithms
9.3 ALGORITHM
Kmeans and EM algorithm
K-means can be explained as an EM algorithm. First we initialize the k means (mk) of the K-means
algorithm. In the E step we assign each point to the cluster with the nearest mean, and in the
M step, given the clusters, we recompute the mean mk of each cluster k. This process is repeated
until the change in the means is small.
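The E step and M step of K-means can be written out directly in NumPy. The sketch below uses the six points that appear in the OUTPUT of this experiment; taking the first and third points as initial means is an assumption made here so that the run is deterministic.

```python
import numpy as np

X = np.array([[1, 2], [2, 4], [10, 12], [11, 15], [3, 2], [12, 13]], dtype=float)

# Assumed initialization: use two of the data points as the starting means
centers = X[[0, 2]].copy()

for _ in range(100):
    # E step: assign every point to the cluster with the nearest mean
    distances = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # M step: recompute each mean from the points assigned to it
    new_centers = np.array([X[labels == k].mean(axis=0) for k in range(len(centers))])
    if np.allclose(new_centers, centers):
        break  # the means no longer change
    centers = new_centers

print("Cluster Labels:", labels)
print("Cluster Centers:", centers)
```

With this initialization the loop stops after the second pass, because the assignments from the first pass already give the final means.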
K-means and Mixture of Gaussians
K-means is essentially a “hard” classifier: each point belongs to exactly one cluster, and the
only parameters to fit to the data are the means µk, as already discussed above. A mixture of
Gaussians, in contrast, is a probability model that defines a “soft” classifier, in which each
point belongs to every component with some probability. The parameters to be fitted to the data
are the means µk and covariances Σk, which define the Gaussian distributions, and the mixing
coefficients πk. The task is therefore: given the data set, find the mixing coefficients, means
and covariances. If we knew which component generated each data point, the maximum likelihood
solution would involve fitting each component to the corresponding cluster. However, the data
set is unlabelled, so the component assignments are hidden.
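A mixture of Gaussians can be fitted with EM using scikit-learn's GaussianMixture, which estimates the mixing coefficients, means and covariances described above. The two synthetic blobs and all variable names in this sketch are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic, unlabelled data drawn from two assumed Gaussian blobs
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(6.0, 1.0, size=(100, 2)),
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)

print("Mixing coefficients:", gmm.weights_)
print("Means:\n", gmm.means_)
print("Covariances shape:", gmm.covariances_.shape)

# Soft assignment: one membership probability per point and component
responsibilities = gmm.predict_proba(X)
print(responsibilities[:3])
```

Unlike the hard labels of K-means, each row of the responsibilities sums to one and expresses how strongly the point belongs to each component.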
# Generate sample data
###[Link](0)
#X = [Link](19, 2)
u_labels = np.unique(label)
import matplotlib.pyplot as plt
# plotting the results:
for i in u_labels:
    plt.scatter(X[label == i, 0], X[label == i, 1], label=i)
plt.legend()
plt.title("K-Means Clustering")
plt.show()
OUTPUT
[[ 1 2]
[ 2 4]
[10 12]
[11 15]
[ 3 2]
[12 13]]
array([[ 2. , 2.66666667],
[11. , 13.33333333]])
array([1], dtype=int32)
Cluster Labels: [0 0 1 1 0 1]
Cluster Centers: [[ 2. 2.66666667]
[11. 13.33333333]]
9.5 PROCEDURE
9.6 RESULT
[Link] : 10 IMPLEMENT EM FOR BAYESIAN NETWORKS
DATE:
10.1 AIM:
To Implement EM for Bayesian Networks
10.3 ALGORITHM
Here the E-step or expectation step is so named because it involves updating our
expectation of which cluster each point belongs to. The M-step or maximization
step is so named because it involves maximizing some fitness function that
defines the locations of the cluster centers—in this case, that maximization is
accomplished by taking a simple mean of the data in each cluster.
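The E-step and M-step described above can be sketched as a short function. The name find_clusters and the argument rseed mirror the call that appears in this experiment's listing, but the body below is an assumed implementation; the make_blobs data, the iteration cap and the handling of empty clusters are additions for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances_argmin

def find_clusters(X, n_clusters, rseed=2, max_iter=100):
    # 1. Randomly choose data points as the initial cluster centers
    rng = np.random.RandomState(rseed)
    idx = rng.permutation(X.shape[0])[:n_clusters]
    centers = X[idx]
    for _ in range(max_iter):
        # 2a. E-step: assign labels based on the closest center
        labels = pairwise_distances_argmin(X, centers)
        # 2b. M-step: move each center to the mean of its points
        #     (a center with no points is left where it is)
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        # 2c. Stop when the centers no longer move
        if np.allclose(centers, new_centers):
            break
        centers = new_centers
    return centers, labels

# Assumed synthetic data: 300 points in four blobs
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
centers, labels = find_clusters(X, 4, rseed=0)
print(centers)
```

The result depends on the random initialization, so different values of rseed can converge to different clusterings.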
%matplotlib inline
import matplotlib.pyplot as plt
# on Matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
import numpy as np
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200);
while True:
    # 2a. Assign labels based on closest center
    labels = pairwise_distances_argmin(X, centers)
centers, labels = find_clusters(X, 4, rseed=0)
plt.scatter(X[:, 0], X[:, 1], c=labels,
            s=50, cmap='viridis');
Figure: 1
Figure: 2
Figure: 3
Figure: 4
Figure: 5
Figure: 6
Figure: 7
10.5 PROCEDURE
10.6 RESULT
[Link] : 11 BUILD NEURAL NETWORK MODELS
DATE:
11.1 AIM:
To build Neural Network (BP) Models
11.3 ALGORITHM
Neural networks are computational models loosely inspired by the way the human brain
processes information. They consist of interconnected nodes, or neurons, that process and
learn from data, enabling tasks such as pattern recognition and decision making in machine
learning. In this experiment a small network is trained with backpropagation (BP).
import numpy as np
# Inputs (assumed: the original definition is missing; these values reproduce the normalized input shown in the output)
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X, axis=0)  # scale each column of X by its maximum
y = y/100  # scale the targets to [0, 1]; without this the sigmoid output saturates near 1
# Sigmoid function
def sigmoid(x):
    return 1/(1 + np.exp(-x))
# Derivative of the sigmoid, written in terms of the sigmoid output x
def derivatives_sigmoid(x):
    return x * (1 - x)
# Variable initialization
epoch = 7000  # number of training iterations
lr = 0.1  # learning rate
inputlayer_neurons = 2  # number of features in data set
hiddenlayer_neurons = 3  # number of hidden layer neurons
output_neurons = 1  # number of neurons at output layer
# Weight and bias initialization: uniform random numbers of the given shape
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))
for i in range(epoch):
    # Forward propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)
    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    # how much the hidden layer weights contributed to the error
    d_hiddenlayer = EH * hiddengrad
    # dot product of the next layer's error and the current layer's output
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr
print("Input: \n", X)
print("Actual Output: \n", y)
print("Predicted Output: \n", output)
OUTPUT:
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[92.]
[86.]
[89.]]
Predicted Output:
[[0.99999894]
[0.99999822]
[0.99999887]]
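The helper derivatives_sigmoid in Model 1 computes x * (1 - x), which is the derivative of the sigmoid only when x is the already activated value sigmoid(z), not the raw input z. The sketch below checks this numerically with a central finite difference; the test points and the step size h are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def derivatives_sigmoid(a):
    # a is the sigmoid output, a = sigmoid(z)
    return a * (1 - a)

z = np.linspace(-3, 3, 7)   # raw inputs
a = sigmoid(z)              # activations

# Central finite difference of sigmoid with respect to z
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
analytic = derivatives_sigmoid(a)

max_gap = np.max(np.abs(numeric - analytic))
print("largest difference:", max_gap)
```

This is why the training loop passes output and hlayer_act, the activated values, to derivatives_sigmoid.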
Model: 2
import numpy as np
# square input matrix (n = m)
X = np.array([[1, 2, 3],
              [3, 4, 1],
              [2, 5, 3]])
# target values
y = np.array([[.5, .3, .2]])
# transpose of y
y = y.T
# sigmoid activation
def sigmoid(z):
    return 1/(1 + np.exp(-z))
# sigma value (first-layer weights, starts as a scalar)
sigm = 2
# delta value (second-layer weights); random initialization assumed, the original definition is missing
delt = np.random.random((3, 3)) - 1
for j in range(100):
    # forward pass: hidden and output activations
    hidden = sigmoid(np.dot(X, sigm))
    out = sigmoid(np.dot(hidden, delt))
    # find matrix 1: output error times the sigmoid derivative (100 iterations, not 100 layers)
    m1 = (y - out) * (out * (1 - out))
    # find matrix 2: error propagated back to the hidden layer
    m2 = m1.dot(delt.T) * (hidden * (1 - hidden))
    # find delta
    delt = delt + hidden.T.dot(m1)
    # find sigma
    sigm = sigm + X.T.dot(m2)
# print the hidden-layer activations for the final sigma
print(sigmoid(np.dot(X, sigm)))
OUTPUT:
[[0.99999319 0.99999374 0.99999351]
[0.99999988 0.99999989 0.99999988]
[1. 1. 1. ]]
11.5 PROCEDURE
11.6 RESULT
[Link] : 12 BUILD DEEP LEARNING MODELS
DATE:
12.1 AIM:
To build deep Neural Network Models
12.3 ALGORITHM
Simple Convolutional Neural Network (CNN) to classify CIFAR images
The CIFAR10 dataset contains 60,000 color images in 10 classes, with 6,000 images in each
class. The dataset is divided into 50,000 training images and 10,000 testing images. The classes
are mutually exclusive and there is no overlap between them.
The 6 lines of code below define the convolutional base using a common pattern: a stack
of Conv2D and MaxPooling2D layers.
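import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
# Load CIFAR10 and scale the pixel values to the range [0, 1]
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
# Show the first 25 training images with their class names
plt.figure(figsize=(8,8))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    # The CIFAR labels happen to be arrays,
    # which is why we need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()
# Convolutional base
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.summary()
# Dense classifier on top of the convolutional base
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
model.summary()
# Adam is a widely used default among the adaptive optimizers
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
# Train for 10 epochs, validating on the test images after each epoch
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
test_loss, test_acc = model.evaluate(test_images,
                                     test_labels,
                                     verbose=2)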
OUTPUT:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 30, 30, 32) 896
=================================================================
Total params: 56320 (220.00 KB)
Trainable params: 56320 (220.00 KB)
Non-trainable params: 0 (0.00 Byte)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 30, 30, 32) 896
=================================================================
Total params: 122570 (478.79 KB)
Trainable params: 122570 (478.79 KB)
Non-trainable params: 0 (0.00 Byte)
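The parameter totals in the two summaries above can be checked by hand. The sketch below recomputes them in plain Python from the layer settings in the program (3x3 kernels, 'valid' padding, 2x2 pooling); the helper names are assumptions introduced only for this check.

```python
def conv2d_params(kernel, in_channels, filters):
    # one kernel x kernel x in_channels weight block plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters

def conv_out(size, kernel):
    # 'valid' padding with stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def pool_out(size, pool):
    # non-overlapping pooling windows, remainder dropped
    return size // pool

size, channels = 32, 3          # CIFAR10 images are 32 x 32 x 3
conv_base_params = 0
# (filters, followed_by_pooling) for the three Conv2D layers
for filters, pooled in [(32, True), (64, True), (64, False)]:
    conv_base_params += conv2d_params(3, channels, filters)
    size = conv_out(size, 3)
    channels = filters
    if pooled:
        size = pool_out(size, 2)

flattened = size * size * channels
dense_params = (flattened + 1) * 64 + (64 + 1) * 10
total_params = conv_base_params + dense_params

print("Convolutional base:", conv_base_params)
print("Flattened features:", flattened)
print("Whole model:", total_params)
```

The first Conv2D layer alone accounts for (3*3*3 + 1) * 32 = 896 parameters, which matches the single row shown in the summary.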
Epoch 1/10
1563/1563 [==============================] - 38s 24ms/step - loss: 1.5158 -
accuracy: 0.4451 - val_loss: 1.1963 - val_accuracy: 0.5729
Epoch 2/10
1563/1563 [==============================] - 37s 24ms/step - loss: 1.1395 -
accuracy: 0.5940 - val_loss: 1.0595 - val_accuracy: 0.6315
Epoch 3/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.9965 -
accuracy: 0.6494 - val_loss: 1.0275 - val_accuracy: 0.6462
Epoch 4/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.8967 -
accuracy: 0.6829 - val_loss: 0.9410 - val_accuracy: 0.6737
Epoch 5/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.8347 -
accuracy: 0.7081 - val_loss: 0.8940 - val_accuracy: 0.6955
Epoch 6/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.7794 -
accuracy: 0.7269 - val_loss: 0.8578 - val_accuracy: 0.7054
Epoch 7/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.7296 -
accuracy: 0.7446 - val_loss: 0.8526 - val_accuracy: 0.7099
Epoch 8/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.6908 -
accuracy: 0.7581 - val_loss: 0.8534 - val_accuracy: 0.7132
Epoch 9/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.6577 -
accuracy: 0.7676 - val_loss: 0.8581 - val_accuracy: 0.7144
Epoch 10/10
1563/1563 [==============================] - 36s 23ms/step - loss: 0.6192 -
accuracy: 0.7823 - val_loss: 0.8458 - val_accuracy: 0.7162
12.5 PROCEDURE
hard copy
12.6 RESULT
DATE:
13.1 AIM:
To build deep Neural Network for digit classification
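model.fit(X_train, y_train, epochs=5, batch_size=32)
# evaluate on the held-out test set
loss, accuracy = model.evaluate(X_test, y_test)
print('Loss:', loss)
print('Accuracy:', accuracy)
# predicted class probabilities for the first sample, next to its one-hot label
pred = model.predict(X[0, :].reshape(1, -1))
print(pred)
print(y[0, :])
dgts = load_digits()
print(dgts.data.shape)
import matplotlib.pyplot as plt
plt.gray()
plt.matshow(dgts.images[0])
plt.show()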
OUTPUT
Epoch 1/5
45/45 [==============================] - 1s 2ms/step - loss: 5.0013 -
accuracy: 0.2289
Epoch 2/5
45/45 [==============================] - 0s 2ms/step - loss: 1.1459 -
accuracy: 0.6354
Epoch 3/5
45/45 [==============================] - 0s 2ms/step - loss: 0.5060 -
accuracy: 0.8462
Epoch 4/5
45/45 [==============================] - 0s 2ms/step - loss: 0.3240 -
accuracy: 0.9040
Epoch 5/5
45/45 [==============================] - 0s 2ms/step - loss: 0.2358 -
accuracy: 0.9283
12/12 [==============================] - 0s 2ms/step - loss: 0.2600 -
accuracy: 0.9250
Loss: 0.2599562704563141
Accuracy: 0.925000011920929
1/1 [==============================] - 0s 52ms/step
[[9.9925131e-01 8.5686344e-07 7.1382141e-07 4.9433766e-06 6.5290674e-06
5.4754998e-04 2.5058553e-06 3.0487461e-06 3.8330847e-05 1.4414983e-04]]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
(1797, 64)
<Figure size 640x480 with 0 Axes>
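The model definition is not part of the listing above. For comparison, the same digit classification task can be sketched end to end with scikit-learn's MLPClassifier instead of Keras. This is a swapped-in technique, and the hidden layer size, iteration limit and 80/20 split are assumptions chosen for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X = digits.data / 16.0          # pixel values range from 0 to 16
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One hidden layer of 64 units; softmax output over the 10 digit classes
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
mlp.fit(X_train, y_train)

test_accuracy = mlp.score(X_test, y_test)
print("Test accuracy:", test_accuracy)

# Class probabilities for the first image, comparable to model.predict in Keras
probs = mlp.predict_proba(X[:1])
print(probs.round(3), "true label:", y[0])
```

MLPClassifier works directly with integer labels, so the one-hot encoding used with the Keras model is not needed here.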