0% found this document useful (0 votes)
104 views13 pages

AIML Lab Programs Overview

The document describes several AI and machine learning algorithms: 1. A* search algorithm for pathfinding with heuristics. 2. AO* search algorithm which is an improvement on A* with better time complexity. 3. Candidate elimination algorithm for learning concepts from examples. 4. ID3 decision tree algorithm for classification which uses information gain as the splitting criterion. 5. Backpropagation algorithm for training neural networks by minimizing error between actual and predicted output. 6. Naive Bayes classifier which is a simple probabilistic classifier based on Bayes' theorem.

Uploaded by

Neeraja Rajesh
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
104 views13 pages

AIML Lab Programs Overview

The document describes several AI and machine learning algorithms: 1. A* search algorithm for pathfinding with heuristics. 2. AO* search algorithm which is an improvement on A* with better time complexity. 3. Candidate elimination algorithm for learning concepts from examples. 4. ID3 decision tree algorithm for classification which uses information gain as the splitting criterion. 5. Backpropagation algorithm for training neural networks by minimizing error between actual and predicted output. 6. Naive Bayes classifier which is a simple probabilistic classifier based on Bayes' theorem.

Uploaded by

Neeraja Rajesh
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

AIML LAB PROGRAMS

1. ASTAR
def aStarAlgo(start_node, stop_node):
open_set = set(start_node)
closed_set = set()
g = {}
parents = {}
g[start_node] = 0
parents[start_node] = start_node

while len(open_set) > 0 :


n = None

for v in open_set:
if n == None or g[v] + heuristic(v) < g[n] + heuristic(n):
n=v

if n == stop_node or Graph_nodes[n] == None:


Pass

else:
for (m, weight) in get_neighbors(n):

if m not in open_set and m not in closed_set:


open_set.add(m)
parents[m] = n
g[m] = g[n] + weight

else:
if g[m] > g[n] + weight:
g[m] = g[n] + weight
parents[m] = n

if m in closed_set:
closed_set.remove(m)
open_set.add(m)

if n == None:
print ('Path does not exist!')
return None
if n == stop_node:
path = []

while parents[n] != n:
[Link](n)
n = parents[n]
[Link](start_node)
[Link]()
print('Path found: {}'.format(path))
return path
open_set.remove(n)# {'F','B'} len=2
closed_set.add(n) #{A} len=1
print('Path does not exist!')
return None

def get_neighbors(v):
if v in Graph_nodes:
return Graph_nodes[v]
else:
return None

def heuristic(n):
H_dist = { 'A': 10,'B': 8,'C': 5,'D': 7, 'E': 3,'F': 6,'G': 5,'H': 3,'I': 1,'J': 0}
return H_dist[n]

Graph_nodes = {

'A': [('B', 6), ('F', 3)],


'B': [('C', 3), ('D', 2)],
'C': [('D', 1), ('E', 5)],
'D': [('C', 1), ('E', 8)],
'E': [('I', 5), ('J', 5)],
'F': [('G', 1),('H', 7)] ,
'G': [('I', 3)],
'H': [('I', 2)],
'I': [('E', 5), ('J', 3)],

}
aStarAlgo('A', 'J')
2. AOSTAR
class Graph:
def __init__(self, graph, heuristicNodeList, startNode):

[Link] = graph
self.H=heuristicNodeList
[Link]=startNode
[Link]={}
[Link]={}
[Link]={}

def applyAOStar(self):
[Link]([Link], False)

def getNeighbors(self, v):


return [Link](v,'')

def getStatus(self,v):
return [Link](v,0)

def setStatus(self,v, val):


[Link][v]=val

def getHeuristicNodeValue(self, n):


return [Link](n,0)

def setHeuristicNodeValue(self, n, value):


self.H[n]=value

def printSolution(self):
print("FOR GRAPH SOLUTION, TRAVERSE THE GRAPH FROM THE
STARTNODE:",[Link])
print([Link])

def computeMinimumCostChildNodes(self, v):


minimumCost=0
costToChildNodeListDict={}
costToChildNodeListDict[minimumCost]=[]
flag=True

for nodeInfoTupleList in [Link](v):


cost=0
nodeList=[]
for c, weight in nodeInfoTupleList:
cost=cost+[Link](c)+weight
[Link](c)

if flag==True:
minimumCost=cost
costToChildNodeListDict[minimumCost]=nodeList
flag=False

else:
if minimumCost>cost:
minimumCost=cost
costToChildNodeListDict[minimumCost]=nodeList

return minimumCost, costToChildNodeListDict[minimumCost]

def aoStar(self, v, backTracking):


print("HEURISTIC VALUES :", self.H)
print("SOLUTION GRAPH :", [Link])
print("PROCESSING NODE :", v)

if [Link](v) >= 0:
minimumCost, childNodeList [Link](v)
[Link](v, minimumCost)
[Link](v,len(childNodeList))

solved=True

for childNode in childNodeList:


[Link][childNode]=v
if [Link](childNode)!=-1:
solved=solved & False

if solved==True:
[Link](v,-1)
[Link][v]=childNodeList

if v!=[Link]:
[Link]([Link][v], True)

if backTracking==False:
for childNode in childNodeList:
[Link](childNode,0)
[Link](childNode, False)

h1 = {'A': 1, 'B': 6, 'C': 2, 'D': 12, 'E': 2, 'F': 1, 'G': 5, 'H': 7, 'I': 7, 'J':1, 'T': 3}
graph1 = {
'A': [[('B', 1), ('C', 1)], [('D', 1)]], 'B': [[('G', 1)], [('H', 1)]],'C': [[('J', 1)]],'D': [[('E', 1), ('F',
1)]],'G': [[('I', 1)]]}
G1= Graph(graph1, h1, 'A')
[Link]()
[Link]()
[Link] ELIMINATION ALGORITHM

import csv

with open("[Link]") as f:
csv_file = [Link](f)
data = list(csv_file)

specific = data[1][:-1]
general = [['?' for i in range(len(specific))] for j in range(len(specific))]

for i in data:
if i[-1] == "Yes":
for j in range(len(specific)):
if i[j] != specific[j]:
specific[j] = "?"
general[j][j] = "?"

elif i[-1] == "No":


for j in range(len(specific)):
if i[j] != specific[j]:
general[j][j] = specific[j]
else:
general[j][j] = "?"

print("\nStep " + str([Link](i)+1) + " of Candidate Elimination Algorithm")


print(specific)
print(general)

gh = []

for i in general:
for j in i:
if j != '?':
[Link](i)
break
print("\nFinal Specific hypothesis:\n", specific)
print("\nFinal General hypothesis:\n", gh)
4. ID3 ALGORITHM
import pandas as pd
import math

def base_entropy(dataset):
p=0
n=0
target = [Link][:, -1]
targets = list(set(target))
for i in target:
if i == targets[0]:
p=p+1
else:
n=n+1
if p == 0 or n == 0:
return 0
elif p == n:
return 1
else:
entropy = 0 - (
((p / (p + n)) * (math.log2(p / (p + n))) + (n / (p + n)) * (math.log2(n/ (p +
n)))))
return entropy

def entropy(dataset, feature, attribute):


p=0
n=0
target = [Link][:, -1]
targets = list(set(target))
for i, j in zip(feature, target):
if i == attribute and j == targets[0]:
p=p+1
elif i == attribute and j == targets[1]:
n=n+1
if p == 0 or n == 0:
return 0
elif p == n:
return 1
else:
entropy = 0 - (
((p / (p + n)) * (math.log2(p / (p + n))) + (n / (p + n)) * (math.log2(n/ (p +
n)))))
return entropy
def counter(target, attribute, i):
p=0
n=0
targets = list(set(target))
for j, k in zip(target, attribute):
if j == targets[0] and k == i:
p=p+1
elif j == targets[1] and k == i:
n=n+1
return p, n

def Information_Gain(dataset, feature):


Distinct = list(set(feature))
Info_Gain = 0
for i in Distinct:
Info_Gain = Info_Gain + [Link](i) / len(feature) *
entropy(dataset,feature, i)
Info_Gain = base_entropy(dataset) - Info_Gain
return Info_Gain

def generate_childs(dataset, attribute_index):


distinct = list([Link][:, attribute_index])
childs = dict()
for i in distinct:
childs[i] = counter([Link][:, -1], [Link][:, attribute_index], i)
return childs

def modify_data_set(dataset,index, feature, impurity):


size = len(dataset)
subdata = dataset[dataset[feature] == impurity]
del (subdata[[Link][index]])
return subdata

def greatest_information_gain(dataset):
max = -1
attribute_index = 0
size = len([Link]) - 1
for i in range(0, size):
feature = list([Link][:, i])
i_g = Information_Gain(dataset, feature)
if max < i_g:
max = i_g
attribute_index = i
return attribute_index

def construct_tree(dataset, tree):


target = [Link][:, -1]
impure_childs = []
attribute_index = greatest_information_gain(dataset)
childs = generate_childs(dataset, attribute_index)
tree[[Link][attribute_index]] = childs
targets = list(set([Link][:, -1]))
for k, v in [Link]():
if v[0] == 0:
tree[k] = targets[1]
elif v[1] == 0:
tree[k] = targets[0]
elif v[0] != 0 or v[1] != 0:
impure_childs.append(k)
for i in impure_childs:
sub = modify_data_set(dataset,attribute_index,
[Link][attribute_index], i)
tree = construct_tree(sub, tree)
return tree

def main():
df = pd.read_csv("[Link]")
tree = dict()
result = construct_tree(df, tree)
for key, value in [Link]():
print(key, " => ", value)

if __name__ == "__main__":
main()
5. BACK PROPOGATION ALGORITHM
import numpy as np
X = [Link](([2, 9], [1, 5], [3, 6]), dtype=float)
y = [Link](([92], [86], [89]), dtype=float)
X = X/[Link](X,axis=0)
y = y/100

def sigmoid (x):


return 1/(1 + [Link](-x))

#Derivative of Sigmoid Function


def derivatives_sigmoid(x):
return x * (1 - x)

#Variable initialization
epoch=5000
lr=0.1
inputlayer_neurons = 2
hiddenlayer_neurons = 3
output_neurons = 1

wh=[Link](size=(inputlayer_neurons,hiddenlayer_neurons))
bh=[Link](size=(1,hiddenlayer_neurons))
wout=[Link](size=(hiddenlayer_neurons,output_neurons))
bout=[Link](size=(1,output_neurons))

for i in range(epoch):

#Forward Propogation
hinp1=[Link](X,wh)
hinp=hinp1 + bh
hlayer_act = sigmoid(hinp)
outinp1=[Link](hlayer_act,wout)
outinp= outinp1+ bout
output = sigmoid(outinp)

#Backpropagation
EO = y-output
outgrad = derivatives_sigmoid(output)
d_output = EO* outgrad
EH = d_output.dot(wout.T)
hiddengrad = derivatives_sigmoid(hlayer_act)
d_hiddenlayer = EH * hiddengrad
wout += hlayer_act.[Link](d_output) *lr
wh += [Link](d_hiddenlayer) *lr

print("Input: \n" + str(X))


print("Actual Output: \n" + str(y))
print("Predicted Output: \n" ,output)
6. NAIVE BAYES CLASSIFIER

import pandas as pd
from sklearn import tree
from [Link] import LabelEncoder
from sklearn.naive_bayes import GaussianNB

data = pd.read_csv('[Link]')
print("The first 5 Values of data is :\n", [Link]())

X = [Link][:, :-1]
print("\nThe First 5 values of the train data is\n", [Link]())

y = [Link][:, -1]
print("\nThe First 5 values of train output is\n", [Link]())

le_outlook = LabelEncoder()
[Link] = le_outlook.fit_transform([Link])

le_Temperature = LabelEncoder()
[Link] = le_Temperature.fit_transform([Link])

le_Humidity = LabelEncoder()
[Link] = le_Humidity.fit_transform([Link])

le_Windy = LabelEncoder()
[Link] = le_Windy.fit_transform([Link])

print("\nNow the Train output is\n", [Link]())

le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n",y)

from sklearn.model_selection import train_test_split


X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.20)

classifier = GaussianNB()
[Link](X_train, y_train)

from [Link] import accuracy_score


print("Accuracy is:", accuracy_score([Link](X_test), y_test))
7. EM-KMEANS ALGORITHM

import [Link] as plt


from sklearn import datasets
from [Link] import KMeans
import pandas as pd
import numpy as np

iris = datasets.load_iris()
X = [Link]([Link])
[Link] = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y = [Link]([Link])
[Link] = ['Targets']

model = KMeans(n_clusters=3)
[Link](X) # model.labels_ : Gives cluster no for which samples belongs to

[Link](figsize=(14,7))
colormap = [Link](['red', 'lime', 'black'])

[Link](1, 3, 1)
[Link](X.Petal_Length, X.Petal_Width, c=colormap[[Link]], s=40)
[Link]('Real Clusters')
[Link]('Petal Length')
[Link]('Petal Width')

[Link](1, 3, 2)
[Link](X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
[Link]('K-Means Clustering')
[Link]('Petal Length')
[Link]('Petal Width')

from sklearn import preprocessing

scaler = [Link]()
[Link](X)
xsa = [Link](X)
xs = [Link](xsa, columns = [Link])
from [Link] import GaussianMixture
gmm = GaussianMixture(n_components=40)
[Link](xs)
[Link](1, 3, 3)
[Link](X.Petal_Length, X.Petal_Width, c=colormap[0], s=40)
[Link]('GMM Clustering')
[Link]('Petal Length')
[Link]('Petal Width')

print('Observation: The GMM using EM algorithm based clustering matched the


true labels more closely than the Kmeans.')
8. KNN ALGORITHM

from sklearn.model_selection import train_test_split


from [Link] import KNeighborsClassifier
from sklearn import datasets
iris=datasets.load_iris()
print("Iris Data set loaded...")
x_train, x_test, y_train, y_test = train_test_split([Link],[Link],test_size=0.1)

for i in range(len(iris.target_names)):
print("Label", i , "-",str(iris.target_names[i]))

classifier = KNeighborsClassifier(n_neighbors=2)
[Link](x_train, y_train)
y_pred=[Link](x_test)
print("Results of Classification using K-nn with K=1 ")

for r in range(0,len(x_test)):
print(" Sample:", str(x_test[r]), " Actual-label:", str(y_test[r])," Predicted-label:",
str(y_pred[r]))
print("Classification Accuracy :" , [Link](x_test,y_test));
9. LOCALLY WEIGHTED REGRESSIONALGORITHM

import numpy as np
import [Link] as plt

def local_regression(x0, X, Y, tau):


x0 = [1, x0]
X = [[1, i] for i in X]
X = [Link](X)
xw = (X.T) * [Link]([Link]((X - x0) ** 2, axis=1) / (-2 * tau))
beta = [Link](xw @ X) @ xw @ Y @ x0
return beta

def draw(tau):
prediction = [local_regression(x0, X, Y, tau) for x0 in domain]
[Link](X, Y, 'o', color='black')
[Link](domain, prediction, color='red')
[Link]()

X = [Link](-3, 3, num=1000)
domain = X
Y = [Link]([Link](X ** 2 - 1) + .5)

draw(10)
draw(0.1)
draw(0.01)
draw(0.001)

Common questions

Powered by AI

The A* algorithm is used for finding the single shortest path from a start node to a goal node by exploring paths using a heuristic to optimize the search process. It expands each path by reaching nodes that minimize the total cost to goal. In contrast, AO* is suited for finding the lowest cost of reaching a goal considering AND-OR graphs, which might require expanding multiple paths simultaneously based on logical operations and costs. AO* directly manipulates the heuristic values and recursively backtracks to ensure optimization across the graph, handling problems where nodes can lead to several outcome paths .

Backpropagation strengthens neural network training by adjusting the weights of the network through the calculation of gradient descent. Essential components include the forward pass, where inputs pass through the network to generate predictions, and the backward pass, where the error terms are calculated as the difference between actual and predicted values. These errors are then propagated backward through the network using the derivative of the activation function (commonly the sigmoid function) to update the weights in the direction that minimizes the overall error, optimized further by a learning rate .

KMeans clustering algorithm partitions data into k clusters by assigning each data point to the nearest cluster center and iteratively recalculating centers until convergence. It relies on Euclidean distance and hard assignments, meaning each point belongs strictly to one cluster. EM, on the other hand, extends beyond assigning points by assuming a probabilistic model with Gaussian distributions. It involves two steps: expectation, which assigns probabilities (soft assignments) to data points for each cluster, and maximization, which updates the model parameters to better fit the data over iterations. EM is more flexible than KMeans for clusters of varying shapes and convergence characteristics .

The Candidate Elimination Algorithm is a supervised learning method used to find the most specific and general hypotheses from a given set of training data. It starts with the most general hypotheses ('general' set) and the most specific hypotheses ('specific' set). With each positive instance, it makes the specific set more general by replacing differing attributes with '?'. With each negative instance, it refines the general set to exclude that instance by replacing mismatched attributes. The goal is to road map the hypothesis space by progressively narrowing it down to a version space that still contains a valid hypothesis consistent with all examples provided .

In the AO* algorithm, heuristic node values represent the estimated cost from that node to the goal node. As the algorithm processes nodes, it evaluates the cost of achieving a solution through different child nodes and updates the heuristic value for the node to reflect the minimum cost among its successors. During the recursive search, the algorithm backtracks and recalculates heuristic values, which guides the search to more promising paths by minimizing overall solution cost in AND-OR graphs .

The A* algorithm uses heuristic functions to prioritize which nodes to explore. It maintains two sets: open_set for nodes to be evaluated and closed_set for nodes already evaluated. The current f-cost of a node is the sum of its g-cost (cost from the start node to the current node) and its heuristic cost (h-cost), which is an estimate of the cost from the current node to the goal. This heuristic is typically calculated using a predefined distance function, such as a straight-line distance in a spatial graph .

Locally weighted regression (LWR) differs from traditional regression methods by focusing on fitting a model that gives more weight to data points near the target point for prediction, thus offering a localized fit. In contrast, traditional regression applies a global fit over the entire dataset. LWR adapts to the local structure of the data, which allows it to model nonlinear relationships more flexibly compared to global methods, greatly enhancing prediction accuracy when data exhibits non-linearity or varying patterns over different ranges .

Gaussian Naive Bayes simplifies handling classification tasks involving multiple categorical features by assuming independence between every pair of features. This classifier estimates the posterior probabilities using the Gaussian distribution and processes them using Bayesian inference to determine the likely class for each input data point. The Gaussian assumption about the distribution of continuous data allows it to effectively manage mixed categorical inputs, scaling efficiently with increased attribute numbers by treating each feature's influence separately .

Primary considerations for KNN include the choice of k, the number of neighbors, which directly impacts the classifier's bias-variance tradeoff. A small k can make the model sensitive to noise, thereby high variance but low bias, while a large k increases bias, lowering variance. Scaling and normalization of data are crucial since KNN relies on distance measurements. The algorithm's performance is also impacted by the choice of distance metric (e.g., Euclidean, Manhattan). KNN's simplicity and versatility in handling nonlinear data distributions make it popular, however, it is computationally expensive with increasing data size .

The ID3 algorithm constructs decision trees by selecting the attribute that provides the highest information gain at each step. Information gain is calculated by measuring the difference between the entropy of the current dataset and the weighted average entropy after the dataset is split by the attribute. The attribute with the highest information gain is chosen as the decision node to partition the dataset further, leading to the construction of branches until leaves represent pure classifications or further splitting does not provide added informational value .

You might also like