Python AI Practical Work Guide

This document presents practical work on artificial intelligence in Python for a third year of engineering school. It contains sections on uninformed and informed search, linear regression, classification, and the use of scikit-learn.

Mohammadia School of Engineers, Computer Engineering

Computer Science Department, 3rd Year

Artificial Intelligence
Practical Work in Python

Asmae EL KASSIRI – Driss NAMLY - Karim BOUZOUBAA 2022-2023

Table of contents
1 Technical environment
1.1 Python
1.2 Development environment
1.3 Executing a Python program
1.3.1 The code editor
1.3.2 Add the display package
2 Uninformed search: DFS and BFS
2.1 8-puzzle problem
2.2 DFS Strategy
2.2.1 Algorithm
2.2.2 Implementation
2.2.3 Output
2.3 BFS Strategy
2.3.1 Algorithm
2.3.2 Implementation
2.3.3 Output
3 Informed search
3.1 BestFirst Strategy
3.1.1 Algorithm
3.1.2 Implementation
3.1.3 Output
3.2 A* Strategy
3.2.1 Algorithm
3.2.2 Implementation
3.2.3 Output
4 Machine Learning: Linear Regression
4.1 Global Objective
4.2 Provided Functions
4.2.1 Display Function
4.2.2 Call code
4.3 Functions to implement
4.3.1 Arithmetic Mean
4.3.2 Covariance
4.3.3 Variance
4.3.4 Calculation of coefficients
4.3.5 Calculation of RMSE
4.3.6 Linear regression
4.4 Outputs
5 Machine Learning: Classification
5.1 Global Objective
5.2 Call Program
5.3 Functions to implement
5.3.1 Load the data
5.3.2 Calculate the average
5.3.3 Calculate the standard deviation
5.3.4 Synthesize the dataset
5.3.5 Model the dataset
5.3.6 Calculate the probability distribution of x
5.3.7 Calculate the probability for a line
5.3.8 Make the prediction
5.4 Outputs
6 Machine Learning: Scikit-Learn
6.1 Global Objective
6.1.1 Classification
6.1.2 Regression
6.1.3 Clustering
6.2 Work Environment
6.2.1 Scikit-learn
6.2.2 Pandas
6.2.3 Six and IPython
6.3 Classification
6.3.1 Logistic Regression (Logistic_R.py)
6.3.2 SVM ([Link])
6.3.3 Naive Bayes ([Link])
6.3.4 Decision Tree ([Link])
6.3.5 Logistic Regression with Cross validation (Logistic_R-[Link])
6.4 Linear Regression ([Link])
6.5 Clustering
6.5.1 K-means ([Link])
6.5.2 Mean-shift ([Link])
7 NLP: Natural Language Processing
7.1 Pipeline
7.2 Rule-based sentiment analysis
7.3 Fake-News detection

1 Technical Environment
1.1 Python
Python is a portable, dynamic, extensible, free language that allows (without imposing) a
modular and object-oriented approach to programming. Python has been developed since 1989
by Guido van Rossum and many volunteer contributors. It is distributed under the Python
Software Foundation License, a free license compatible with the GPL, works on all platforms
(Windows, Linux, OSX, etc.) and was designed to be a readable language. Accordingly, comments
are introduced by the hash character (#), blocks are identified by indentation, and the
language has a set of keywords and objects of different types (int, float, bool, tuple, list,
string ...).
1.2 Development Environment
We need to install Python and a code editor, then add the necessary packages
as the labs progress.
1. Download the latest version of Python from its official site:
[Link], then start the installation.
2. After completing the installation, add Python to your environment variables.
On Windows: in the "Start" menu, search for "Advanced system settings".
In the "Advanced system settings" window, click the "Environment Variables" button,
then edit the "Path" variable to add the Python installation folder,
which can be "C:\Program Files (x86)\Pythonx" for a 32-bit
installation or "C:\Program Files\Pythonx" for a 64-bit installation (depending on the version
of Python that you have installed).
On Linux: edit the environment variables from your terminal:

export PYTHONPATH=$HOME/test:$PYTHONPATH
export PATH=$HOME/test/bin:$PATH
3. Check that Python has been installed successfully, for example by running
'python --version' in a terminal.

1.3 Executing a Python Program

After writing and saving your program with the ".py" extension, you can
run it with the command 'python program_name.py'.
1.3.1 The code editor
Use a simple text editor such as Notepad.
1.3.2 Add the display package
For the first sessions, it is recommended to have a display package that makes it possible
to visualize the path found. To do this, we use the Graphviz package, available through the link:
[Link]

2 Uninformed search: DFS and BFS

2.1 8-puzzle Problem
The 8-puzzle problem is modeled by a matrix of 9 cells (3 rows and 3 columns). Starting
from any initial state, an artificial intelligence algorithm is used to reach
a final state. The algorithm must return the sequence of steps that leads to this solution.
The basic program consists of:
The Node class:
o __init__: the constructor initializes the problem by defining the initial state,
the final state, an action to take, and the depth limit;
o __repr__: returns a string that represents the current state of the
problem;
The possible_moves function: generates the possible actions based on
the state passed as a parameter;
The generate_state function: generates the next state from the previous state
passed as a parameter and an action (move);
The create_node function: creates a node from a given state passed as a parameter, an
action, and a depth;
The expand_node function: generates the possible nodes from the node given
as a parameter by reusing the expand_node function;
The display function: generates a PDF file containing the path
found by the solving algorithm.
Execution example:
Initial state: [2, 8, 3, 1, 6, 4, 7, 0, 5]
Final state: [1, 2, 3, 8, 0, 4, 7, 6, 5]
Execution tree: the result depends on the algorithm used
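
To make the state representation concrete, here is a minimal sketch of what possible_moves and generate_state might look like. It assumes that a state is the list of 9 integers shown in the execution example, read row by row, with 0 marking the empty cell. The move names and the OFFSETS table are illustrative assumptions, not the lab's actual code.

```python
# Illustrative sketch: the move names and the OFFSETS table are assumptions.
# A state is a list of 9 integers read row by row; 0 marks the empty cell.
OFFSETS = {"up": -3, "down": 3, "left": -1, "right": 1}


def possible_moves(state):
    """Return the moves of the empty cell that stay inside the 3x3 grid."""
    row, col = divmod(state.index(0), 3)
    moves = []
    if row > 0:
        moves.append("up")
    if row < 2:
        moves.append("down")
    if col > 0:
        moves.append("left")
    if col < 2:
        moves.append("right")
    return moves


def generate_state(state, move):
    """Return a new state obtained by sliding the empty cell in the given direction."""
    blank = state.index(0)
    target = blank + OFFSETS[move]
    new_state = list(state)  # copy, so the previous state is left untouched
    new_state[blank], new_state[target] = new_state[target], new_state[blank]
    return new_state


initial = [2, 8, 3, 1, 6, 4, 7, 0, 5]
print(possible_moves(initial))        # ['up', 'left', 'right']
print(generate_state(initial, "up"))  # [2, 8, 3, 1, 0, 4, 7, 6, 5]
```

Returning a copy rather than modifying the state in place keeps every node of the search tree independent of its parent.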

2.2 DFS Strategy

2.2.1 Algorithm
This function implements the DFS (depth-first search) algorithm, which can be written in
iterative or recursive form:

inDepth_resolution(Current_State, History, [Mvts]) :-
    if is_goal(Current_State): stop,
    for each Mvt in legal_Mouvement :
        New_State = apply_operator(Current_State, Mvt)
        if legal_state(New_State) and not(member(New_State, History)) :
            add_member(New_State, History)
            update(Current_State, Mvt, New_State),
            inDepth_resolution(New_State, History, Mvts).

DFS (node):
    Visited //History
    If goal is true:
        Stop
    Else:
        If node not in Visited:
            [Link](node)
            If node is goal:
                goal = true
                Stop
            for each Mvt in legal_Mouvement :
                New_State = apply_operator(Current_State, Mvt),
                if not(member(New_State, History)) :
                    add_member(New_State, History),
                    update(Current_State, Mvt, New_State),
                    inDepth_resolution(New_State, History).

2.2.2 Implementation
def dfs(node):
    # Declaration of variables: the stack of nodes to explore and the history (visited)
    stack = []
    visited = []
    visited_str = []
    depth_limit = 5
    [Link](create_node(initial, "283164705", None, 0))
    # Stopping condition: the loop ends when the stack is empty
    while len(stack) > 0:
        # Skip the node if it already exists in the history: not(member(New_State, History))
        node = [Link](0)
        if [Link] in visited_str:
            continue
        else:
            [Link](node)
            visited_str.append([Link])
        # Test whether the current state is the final state: is_goal(Current_State)
        if [Link] == goal:
            return visited
        # Apply the operator: apply_operator(Current_State, Mvt),
        # keep the legal states: legal_state(New_State) and not(member(New_State, History)),
        # then update the state: update(Current_State, Mvt, New_State)
        if [Link] < depth_limit:
            expanded_nodes = expand_node(node)
            # The new nodes are placed in front of the stack (depth first)
            expanded_nodes.extend(stack)
            stack = expanded_nodes
    # The stack is empty: no solution was found within depth_limit
    return None
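
As a complement, the following self-contained sketch shows one possible iterative depth-limited DFS on the 8-puzzle. It does not reuse the lab's Node class: the successors helper, the tuple representation of states and the move names are assumptions made for the example. Unlike the listing above, it only rejects states that are already on the current path, so that a state first reached by a long path does not hide a shorter path to the same state within the depth limit.

```python
def successors(state):
    """Yield (move, next_state) pairs; a state is a tuple of 9 ints, 0 is the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    candidates = (("up", -3, row > 0), ("down", 3, row < 2),
                  ("left", -1, col > 0), ("right", 1, col < 2))
    for move, offset, allowed in candidates:
        if allowed:
            nxt = list(state)
            nxt[blank], nxt[blank + offset] = nxt[blank + offset], nxt[blank]
            yield move, tuple(nxt)


def dfs(initial, goal, depth_limit=5):
    """Depth-limited DFS; returns the list of moves leading to the goal, or None."""
    initial, goal = tuple(initial), tuple(goal)
    # Each stack entry holds a state, the moves leading to it and the states on its path.
    stack = [(initial, [], (initial,))]
    while stack:
        state, moves, path = stack.pop()  # LIFO: the most recent node is explored first
        if state == goal:
            return moves
        if len(moves) < depth_limit:
            for move, nxt in successors(state):
                if nxt not in path:  # avoid cycles along the current path
                    stack.append((nxt, moves + [move], path + (nxt,)))
    return None


solution = dfs([2, 8, 3, 1, 6, 4, 7, 0, 5], [1, 2, 3, 8, 0, 4, 7, 6, 5])
print(solution)
```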

2.2.3 Output

2.3 BFS Strategy

2.3.1 Algorithm
This function must implement the BFS (breadth-first search) algorithm. Its implementation is
work to be submitted.
BFS (G, s):
    let Q be a queue
    [Link]( s )
    mark s as visited
    while (Q is not empty)
        v = [Link]()
        for all neighbors w of v in Graph G
            if w is not visited
                [Link]( w )
                mark w as visited
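
The pseudocode above could be transcribed for the 8-puzzle roughly as follows. This is only a sketch under assumptions: states are tuples, the successors helper and the parent dictionary are names introduced for the example, and collections.deque plays the role of the queue Q.

```python
from collections import deque


def successors(state):
    """Yield (move, next_state) pairs; a state is a tuple of 9 ints, 0 is the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    candidates = (("up", -3, row > 0), ("down", 3, row < 2),
                  ("left", -1, col > 0), ("right", 1, col < 2))
    for move, offset, allowed in candidates:
        if allowed:
            nxt = list(state)
            nxt[blank], nxt[blank + offset] = nxt[blank + offset], nxt[blank]
            yield move, tuple(nxt)


def bfs(initial, goal):
    """Breadth-first search; returns a shortest list of moves, or None."""
    initial, goal = tuple(initial), tuple(goal)
    queue = deque([initial])
    parent = {initial: None}  # also plays the role of the "visited" marks
    while queue:
        state = queue.popleft()  # FIFO: the oldest node is explored first
        if state == goal:
            moves = []
            while parent[state] is not None:  # walk back to the initial state
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for move, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, move)
                queue.append(nxt)
    return None


path = bfs([2, 8, 3, 1, 6, 4, 7, 0, 5], [1, 2, 3, 8, 0, 4, 7, 6, 5])
print(len(path), path)
```

Because BFS explores the states level by level, the first time the goal is dequeued the path found has the minimum number of moves.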

2.3.2 Implementation
Work to be submitted

2.3.3 Output :

3 Informed search
Informed search uses information about the nodes, the arcs, or both.
3.1 BestFirst Strategy
3.1.1 Algorithm
The BestFirst algorithm can be implemented as follows:

1. Create 2 empty lists: OPEN and CLOSED.
2. Start from the initial node (say N) and put it in the 'ordered' OPEN list.
3. Repeat the next steps until the GOAL node is reached:
a. If the OPEN list is empty, then exit the loop returning 'False'.
b. Select the first/top node (say N) in the OPEN list and move it to the
CLOSED list. Also capture the information of the parent node.
c. If N is a GOAL node, then move the node to the CLOSED list and exit
the loop returning 'True'. The solution can be found by backtracking the
path.
d. If N is not the GOAL node, expand node N to generate the 'immediate'
next nodes linked to node N and add all of them to the OPEN list.
e. Reorder the nodes in the OPEN list in ascending order according to an
evaluation function f(n).
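
The steps above can be sketched for the 8-puzzle as follows. The evaluation function is an assumption: the text does not fix f(n), so this example takes f(n) = h(n), the sum of the Manhattan distances of the tiles to their goal cells. A heap (heapq) keeps the OPEN list ordered instead of re-sorting it at every iteration. The helper names are hypothetical.

```python
import heapq


def successors(state):
    """Yield (move, next_state) pairs; a state is a tuple of 9 ints, 0 is the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    candidates = (("up", -3, row > 0), ("down", 3, row < 2),
                  ("left", -1, col > 0), ("right", 1, col < 2))
    for move, offset, allowed in candidates:
        if allowed:
            nxt = list(state)
            nxt[blank], nxt[blank + offset] = nxt[blank + offset], nxt[blank]
            yield move, tuple(nxt)


def manhattan(state, goal):
    """Sum of the Manhattan distances of the 8 tiles to their goal cells."""
    total = 0
    for tile in range(1, 9):
        r1, c1 = divmod(state.index(tile), 3)
        r2, c2 = divmod(goal.index(tile), 3)
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


def best_first(initial, goal):
    """Greedy best-first search: OPEN is ordered by f(n) = h(n) only."""
    initial, goal = tuple(initial), tuple(goal)
    open_list = [(manhattan(initial, goal), initial)]
    parent = {initial: None}  # parent information of every generated node
    closed = set()            # kept to mirror the CLOSED list of the algorithm
    while open_list:
        _, state = heapq.heappop(open_list)  # node with the smallest f(n)
        closed.add(state)
        if state == goal:
            moves = []
            while parent[state] is not None:  # backtrack the path
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        for move, nxt in successors(state):
            if nxt not in parent:  # neither in OPEN nor in CLOSED
                parent[nxt] = (state, move)
                heapq.heappush(open_list, (manhattan(nxt, goal), nxt))
    return None


path = best_first([2, 8, 3, 1, 6, 4, 7, 0, 5], [1, 2, 3, 8, 0, 4, 7, 6, 5])
print(path)
```

Since only h(n) guides the search, the path returned is a valid solution but is not guaranteed to be the shortest one.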

3.1.2 Implementation
Work to be done

3.1.3 Output

3.2 A* Strategy
3.2.1 Algorithm
The A* algorithm (A Star search) can be implemented as follows:

1 Put node_start in the OPEN list with f(node_start) = h(node_start) (initialization)
2 while the OPEN list is not empty {
3     Take from the OPEN list the node node_current with the lowest
4     f(node_current) = g(node_current) + h(node_current)
5     if node_current is node_goal we have found the solution; break
6     Generate each state node_successor that comes after node_current
7     for each node_successor of node_current {
8         Set successor_current_cost = g(node_current) + w(node_current, node_successor)
9         if node_successor is in the OPEN list {
10            if g(node_successor) ≤ successor_current_cost continue (to line 20)
11        } else if node_successor is in the CLOSED list {
12            if g(node_successor) ≤ successor_current_cost continue (to line 20)
13            Move node_successor from the CLOSED list to the OPEN list
14        } else {
15            Add node_successor to the OPEN list
16            Set h(node_successor) to be the heuristic distance to node_goal
17        }
18        Set g(node_successor) = successor_current_cost
19        Set the parent of node_successor to node_current
20    }
21    Add node_current to the CLOSED list
22 }
23 if(node_current != node_goal) exit with error (the OPEN list is empty)
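
A compact Python sketch of this pseudocode for the 8-puzzle is given below. It rests on assumptions that the text does not state: every move costs w = 1, h is the Manhattan distance, and the OPEN list is a heap in which outdated entries are simply skipped when popped. The helper names are hypothetical.

```python
import heapq


def successors(state):
    """Yield (move, next_state) pairs; a state is a tuple of 9 ints, 0 is the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    candidates = (("up", -3, row > 0), ("down", 3, row < 2),
                  ("left", -1, col > 0), ("right", 1, col < 2))
    for move, offset, allowed in candidates:
        if allowed:
            nxt = list(state)
            nxt[blank], nxt[blank + offset] = nxt[blank + offset], nxt[blank]
            yield move, tuple(nxt)


def manhattan(state, goal):
    """Heuristic h: sum of the Manhattan distances of the 8 tiles to their goal cells."""
    total = 0
    for tile in range(1, 9):
        r1, c1 = divmod(state.index(tile), 3)
        r2, c2 = divmod(goal.index(tile), 3)
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


def a_star(initial, goal):
    """A* search with f(n) = g(n) + h(n); returns a list of moves, or None."""
    initial, goal = tuple(initial), tuple(goal)
    g = {initial: 0}          # best known cost from the initial state
    parent = {initial: None}
    open_list = [(manhattan(initial, goal), initial)]
    closed = set()
    while open_list:
        _, state = heapq.heappop(open_list)  # lowest f(n) first
        if state == goal:
            moves = []
            while parent[state] is not None:
                state, move = parent[state]
                moves.append(move)
            return moves[::-1]
        if state in closed:
            continue  # outdated heap entry, the state was already expanded
        closed.add(state)
        for move, nxt in successors(state):
            cost = g[state] + 1  # w(node_current, node_successor) = 1
            if nxt not in g or cost < g[nxt]:
                g[nxt] = cost
                parent[nxt] = (state, move)
                closed.discard(nxt)  # reopen the node if a cheaper path was found
                heapq.heappush(open_list, (cost + manhattan(nxt, goal), nxt))
    return None


path = a_star([2, 8, 3, 1, 6, 4, 7, 0, 5], [1, 2, 3, 8, 0, 4, 7, 6, 5])
print(len(path), path)
```

The Manhattan distance never overestimates the number of remaining moves, which is what allows A* to return a shortest solution here.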

3.2.2 Implementation
Work to be done
3.2.3 Output

4 Machine Learning: Linear Regression

4.1 Global Objective
The overall objective of this lab is to implement linear regression. The main linear regression
function reuses a set of functions that you will implement. These functions are
applied to a use case that consists of explaining the revenue of a
company by its advertising expenses over 12 months.
The task involves implementing simple linear regression by implementing the following
functions: mean, covariance, variance, regression coefficients, and rmse_metric for the
calculation of the root mean squared error.
Linear regression is one of the most commonly used statistical methods. One usually distinguishes
simple regression (one explanatory variable) from multiple regression (several explanatory
variables), although the conceptual framework and calculation methods are identical.
The principle of linear regression is to model a quantitative dependent variable Y
through a linear combination of p quantitative explanatory variables X1, X2, …, Xp.
The deterministic model (not taking randomness into account) is expressed for an observation i:

$y_i = a_1 x_{i1} + a_2 x_{i2} + \dots + a_p x_{ip} + e_i$

where $y_i$ is the observed value of the dependent variable for observation i, $x_{ij}$ is the value
taken by variable j for observation i, and $e_i$ is the model's error.
4.2 Provided Functions
4.2.1 Display Function
This function allows you to generate a graph.
def plot_graph(x, y, predicted):
    [Link](x, y, c = 'red')
    [Link](x, predicted, marker = 'o', color = 'blue')
    [Link]()
4.2.2 Call code
We declare two series on which we perform linear regression, display the
regression graph, and then calculate the RMSE (Root Mean
Square Error).
# Advertising costs (*1000 DH)
x = [25, 17, 18, 28, 22, 20, 19, 22, 30, 30, 27, 24]
# Revenue (*1000 DH)
y = [280, 250, 255, 292.5, 265, 260, 262.5, 280, 285, 296, 285, 270]
yp = simple_linear_regression(x, y)
plot_graph(x, y, yp)
rmse = rmse_metric(y, yp)
print('RMSE: %.3f' % (rmse))

4.3 Functions to implement

4.3.1 Arithmetic Mean
The arithmetic mean of a series of data is its average value:
the sum of the values divided by the number of values.
def mean(values):
    #insert code
4.3.2 Covariance
The covariance between two random variables is a number that quantifies their
joint deviations from their respective expectations.

# Calculate covariance between x and y
def covariance(x, mean_x, y, mean_y):
    #insert code
4.3.3 Variance
Variance is a measure of the dispersion of the values of a sample or of a probability
distribution. It is the average of the squares of the deviations from the mean, which is also equal to the
difference between the average of the squares of the values of the variable and the square of the average.

# Calculate the variance of a list of numbers
def variance(values, mean):
    #insert code
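
As an illustration of the three definitions above, here is one possible way to fill in these functions. It is a sketch, not the expected solution: it assumes the population convention (division by the number of values) for both covariance and variance, and it renames the second parameter of variance to mean_value so that it does not hide the mean function.

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def covariance(x, mean_x, y, mean_y):
    """Average of the products of the deviations of x and y from their means."""
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / len(x)


def variance(values, mean_value):
    """Average of the squared deviations from the mean."""
    return sum((v - mean_value) ** 2 for v in values) / len(values)


data = [1, 2, 3, 4]
m = mean(data)
print(m)                             # 2.5
print(variance(data, m))             # 1.25
print(covariance(data, m, data, m))  # 1.25, the covariance of a series with itself is its variance
```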
4.3.4 Calculation of coefficients
This function calculates the coefficients $b_0$ and $b_1$ of the linear equation

$y = b_0 + b_1 x$

with $b_1 = \mathrm{cov}(x, y) / \mathrm{var}(x)$ and $b_0 = \bar{y} - b_1 \bar{x}$, where $\bar{x}$ and $\bar{y}$ are the means of x and y.
# Calculate coefficients
def coefficients(x, y):
    #insert code

4.3.5 Calculation of RMSE

This function calculates the RMSE (Root Mean Square Error).

# Calculate root mean squared error
def rmse_metric(actual, predicted):
    #insert code
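
A possible implementation, given as a sketch: the RMSE is the square root of the average of the squared differences between the actual and the predicted values. The two lists are assumed to have the same length.

```python
from math import sqrt


def rmse_metric(actual, predicted):
    """Square root of the mean of the squared prediction errors."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(actual))


print(rmse_metric([1, 2, 3], [1, 2, 3]))  # 0.0, perfect predictions
print(rmse_metric([0, 0], [3, 4]))        # sqrt((9 + 16) / 2)
```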
4.3.6 Linear regression
This function implements linear regression using the calculated coefficients:
# Simple linear regression algorithm
def simple_linear_regression(x, y):
    #insert code
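
Putting the pieces together, the whole lab could be sketched as below, on the two series of the call code. This is one possible solution under assumptions: the sums of products and of squares are computed directly inside coefficients (the common factor 1/n cancels in the ratio covariance / variance), and the plot is left out so that the example runs without any plotting library.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def coefficients(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx  # same value as covariance / variance
    b0 = mean_y - b1 * mean_x
    return b0, b1


def simple_linear_regression(x, y):
    """Fit the line on (x, y) and return the predicted value for every x."""
    b0, b1 = coefficients(x, y)
    return [b0 + b1 * xi for xi in x]


def rmse_metric(actual, predicted):
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Advertising costs (*1000 DH)
x = [25, 17, 18, 28, 22, 20, 19, 22, 30, 30, 27, 24]
# Revenue (*1000 DH)
y = [280, 250, 255, 292.5, 265, 260, 262.5, 280, 285, 296, 285, 270]
yp = simple_linear_regression(x, y)
print('RMSE: %.3f' % rmse_metric(y, yp))
```

A quick sanity check for any implementation: with an intercept in the model, the residuals y - yp of a least-squares fit sum to zero (up to rounding).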
4.4 Outputs
Example of result:

5 Machine Learning: Classification

5.1 Global Objective
In this lab, we are interested in predicting the class (name) of a flower based on its
characteristics.
The dataset consists of 150 rows, where each row gives the characteristics and the class
(name) of a flower. There are 3 classes of flowers (names), and the characteristics are (in
cm) the length of the sepals, the width of the sepals, the length of the petals and the width of the
petals. Class 0 corresponds to the orchid flower, 1 to lavender, and 2 to tulip.
Naive Bayes classification is a simple type of probabilistic Bayesian classification
based on Bayes' theorem with strong (so-called naive) independence assumptions.
In simple terms, a naive Bayes classifier assumes that the existence of a characteristic
for a class is independent of the existence of the other characteristics. A fruit can be
considered an apple if it is red, round, and about ten centimeters in size. Even
if these characteristics are related in reality, a naive Bayes classifier will determine that
the fruit is an apple by considering the characteristics of color, shape and size
independently.

The probabilistic model for a classifier is the conditional model $p(C \mid F_1, \dots, F_n)$, where C
is a dependent class variable whose instances or classes are few in number,
conditioned by several characteristic variables $F_1, \dots, F_n$.
5.2 Call Program
We provide the path of the file containing the data, perform the learning, then
predict the class of a record (row).
# Make a prediction with Naive Bayes on the flower dataset
filename = '[Link]'
dataset = load_csv(filename)
# fit model
model = summarize_by_class(dataset)
# define a new record
row = [5.6,2.8,4.3,1.4]
# predict the label
label = predict(model, row)
print('Data=%s, Predicted: %s' % (row, label))

5.3 Functions to implement

The work involves implementing naive Bayes by implementing the following functions:
5.3.1 Load the data
This function loads the data from the CSV file and groups the rows by class:
from csv import reader

def load_csv(filename):
    # Dictionary mapping each class value to the list of its rows
    separated = dict()
    with open(filename, 'r') as file:
        csv_reader = reader(file)
        for row in csv_reader:
            # Skip empty lines
            if not row:
                continue
            rowF = [0]*len(row)
            # Every column except the last one is a numeric characteristic
            for i in range(len(row)-1):
                rowF[i] = float(row[i].strip())
            # The last column is the class, stored as an integer
            rowF[-1] = int(row[-1].strip())
            class_value = rowF[-1]
            if (class_value not in separated):
                separated[class_value] = list()
            separated[class_value].append(rowF)
    return separated

5.3.2 Calculate the average

This function computes the arithmetic mean of the numbers provided as parameters:

def mean(numbers):
    #insert code

5.3.3 Calculate the standard deviation

This function computes the standard deviation using the following formula:

$\sigma = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$

def stdev(numbers):
    #insert code

5.3.4 Synthesize the dataset

Compute the mean, the standard deviation, and the number of rows for each column of
the data:
def summarize_dataset(dataset):
    #insert code

5.3.5 Model the dataset

Organize the records by class and calculate the statistics of each class using the function
summarize_dataset:
def summarize_by_class(dataset):
    #insert code

5.3.6 Calculate the probability distribution of x

Calculate the probability of x under the distribution described by the given mean and standard deviation:

def calculate_probability(x, mean, stdev):
    #insert code
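
The text does not say which distribution is used. A common choice for continuous characteristics, assumed here, is the Gaussian (normal) density $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2 \sigma^2)}$. Under that assumption, the function could be sketched as:

```python
from math import exp, pi, sqrt


def calculate_probability(x, mean, stdev):
    """Gaussian probability density of x for the given mean and standard deviation."""
    exponent = exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (sqrt(2 * pi) * stdev)


print(calculate_probability(1.0, 1.0, 1.0))  # density at the mean, 1 / sqrt(2 * pi)
print(calculate_probability(2.0, 1.0, 1.0))  # lower density one standard deviation away
```

Strictly speaking the value returned is a density, not a probability, so it can exceed 1 when stdev is small. This is harmless for classification, where only the comparison between classes matters.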

5.3.7 Calculate the probability for a line

Calculate the prediction probabilities for each class for a given row:
def calculate_class_probabilities(summaries, row):
    #insert code

5.3.8 Make the prediction

Predict the class for a row:
def predict(summaries, row):
    #insert code
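
For reference, the functions of this section could fit together as in the sketch below. It is not the expected solution and rests on several assumptions: a Gaussian density for each characteristic, the sample standard deviation (division by n - 1), a dataset given as the dictionary {class: rows} returned by load_csv, and a tiny made-up dataset in place of the flower file.

```python
from math import exp, pi, sqrt


def mean(numbers):
    return sum(numbers) / len(numbers)


def stdev(numbers):
    avg = mean(numbers)
    return sqrt(sum((x - avg) ** 2 for x in numbers) / (len(numbers) - 1))


def summarize_dataset(rows):
    """Return (mean, stdev, count) for every column except the last one (the class)."""
    columns = list(zip(*rows))[:-1]
    return [(mean(col), stdev(col), len(col)) for col in columns]


def summarize_by_class(dataset):
    """dataset maps each class to its rows; summarize every class separately."""
    return {label: summarize_dataset(rows) for label, rows in dataset.items()}


def calculate_probability(x, mean_value, stdev_value):
    exponent = exp(-((x - mean_value) ** 2) / (2 * stdev_value ** 2))
    return exponent / (sqrt(2 * pi) * stdev_value)


def calculate_class_probabilities(summaries, row):
    """Prior of each class multiplied by the density of every characteristic."""
    total_rows = sum(stats[0][2] for stats in summaries.values())
    probabilities = {}
    for label, stats in summaries.items():
        probabilities[label] = stats[0][2] / total_rows  # prior P(class)
        for (avg, sd, _), value in zip(stats, row):
            probabilities[label] *= calculate_probability(value, avg, sd)
    return probabilities


def predict(summaries, row):
    """Return the class with the highest score."""
    probabilities = calculate_class_probabilities(summaries, row)
    return max(probabilities, key=probabilities.get)


# Made-up data: two characteristics followed by the class (0 or 1)
dataset = {
    0: [[1.0, 1.2, 0], [0.8, 0.9, 0], [1.2, 1.0, 0]],
    1: [[5.0, 5.1, 1], [4.8, 5.3, 1], [5.2, 4.9, 1]],
}
model = summarize_by_class(dataset)
print(predict(model, [1.1, 0.9]))  # 0
print(predict(model, [5.2, 4.8]))  # 1
```

The scores returned by calculate_class_probabilities are not normalized; they are only meant to be compared with each other.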

5.4 Outputs
Example of result:

6 Machine Learning: Scikit-Learn

6.1 Global Objective
In this practical session, we will use Scikit-Learn to perform
classification, linear regression, and clustering.
For any ML process, we divide the dataset into two parts:
The training set: the part that allows the algorithm to learn.
The test set: the part that allows you to verify the effectiveness of the learning.
6.1.1 Classification
Classification assigns each data item to one of a set of classes.
[Link] Confusion matrix
A confusion matrix, or contingency table, is a summary of the prediction results
of a classification problem. The correct and incorrect predictions are counted
and broken down by class, then compared with the actual values. It helps
to understand how the classification model gets confused when it makes
predictions.
Calculation of the confusion matrix:
Based on the predictions made by the trained model, the matrix indicates
the number of correct and incorrect predictions for each class. Each row of the
table corresponds to a predicted class, and each column corresponds to an actual class.
Each cell therefore counts the predictions of one class for the samples of one actual class. A
prediction can be a correct positive prediction ('true positive'),
a correct negative prediction ('true negative'),
an incorrect positive prediction ('false positive') or an
incorrect negative prediction ('false negative'):
o TP (True Positive): the cases where the prediction is positive, and where the actual value is
indeed positive. Example: the doctor informs you that you are pregnant, and
you are indeed pregnant.
o TN (True Negative): the cases where the prediction is negative, and where the actual value is
indeed negative. Example: the doctor tells you that you are not
pregnant, and you are indeed not pregnant.
o FP (False Positive): the cases where the prediction is positive, but the actual value is
negative. Example: the doctor informs you that you are pregnant, but you are
not pregnant.
o FN (False Negative): the cases where the prediction is negative, but the actual value is
positive. Example: the doctor tells you that you are not pregnant, but
you are pregnant.

[Link] Performance measures

From the confusion matrix, we can derive a whole set of performance criteria.
Here are some commonly used performance metrics:
Recall, or sensitivity, is the rate of true
positives, that is to say the proportion of actual positives that were correctly identified: TP / (TP + FN).
Precision is the proportion of correct predictions among the points
predicted as positive: TP / (TP + FP).
The F-measure evaluates the trade-off between recall and precision; it is their
harmonic mean.
Specificity is the rate of true negatives: TN / (TN + FP). It is a
measure complementary to sensitivity. Support, by contrast, is the number of actual samples of each class.
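
To make these definitions concrete, here is a small pure-Python sketch that counts TP, TN, FP and FN on made-up binary labels and derives the four measures. The labels and the confusion_counts helper are assumptions introduced for the example. With scikit-learn, the same quantities are available through functions of sklearn.metrics such as confusion_matrix, precision_score, recall_score and f1_score, whose confusion matrix places the actual classes in rows and the predicted classes in columns.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


# Made-up labels: 1 is the positive class
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
recall = tp / (tp + fn)        # sensitivity
precision = tp / (tp + fp)
specificity = tn / (tn + fp)
f_measure = 2 * precision * recall / (precision + recall)

print(tp, tn, fp, fn)                    # 3 2 2 1
print(recall, precision, specificity)    # 0.75 0.6 0.5
print(round(f_measure, 4))               # 0.6667
```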

6.1.2 Regression
Regression in machine learning covers mathematical methods that
allow scientists to predict a continuous outcome (y) based on the value of one or
several predictive variables (x). Linear regression is probably the most popular form of
regression analysis due to its ease of use for prediction and
forecasting.
To evaluate a regression model:
We can calculate the distance between predicted values and true values. This gives us:
the sum of squared residuals (RSS);
the average of this sum (MSE);
the square root of this average (RMSE).
We may prefer to measure the correlation between predicted values and true values:
the relative squared error (RSE);
the coefficient of determination (R2).
[Link] Performance measures: RSS and MSE
The RSS (Residual Sum of Squares) is the sum of the squares of the residuals. For
each point $x_i$ of the test set, we compute the distance between its label $y_i$ and the predicted value $\hat{y}_i$, and we sum
the squares of these distances:

$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

The problem with the RSS is that it grows with the number of data points. For this
reason, it is normalized by the number n of points in the test set, which gives the MSE
(Mean Squared Error):

$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

[Link] RMSE and RMSLE

The RMSE, or Root Mean Squared Error, is the square root of the MSE, which brings it back to the unit of y.
The RMSLE (Root Mean Squared Log Error) is the same measure computed on the logarithms of the values.
Indeed, the RMSE does not behave well when the labels can take values
that span several orders of magnitude. Imagine making an error of 100 units on a
label worth 4; the corresponding squared term in the RMSE is $100^2 = 10000$. It is exactly
the same as for an error of 100 units on a label worth 8000. To
take this into account, we can apply the logarithm to the predicted values and the true values before
calculating the RMSE, which gives the RMSLE.

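[Link] Performance measures: RSE and R2

The relative squared error (RSE) is the RSS normalized by the sum of the squares of the distances between
the labels and their average: $RSE = \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$, where $\hat{y}_i$ is the
predicted value and $\bar{y}$ the average of the labels. It is the complement to 1 of the coefficient of
determination, $R^2 = 1 - RSE$, which, for a least-squares linear regression, is the square of the Pearson
correlation between predicted and true values.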
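
The measures of this section can be computed in a few lines, as in the sketch below on made-up values. One detail is an assumption: the RMSLE is computed on log(1 + value), a common convention that keeps the logarithm defined when a value is 0.

```python
from math import log, sqrt


def regression_metrics(y_true, y_pred):
    """Return RSS, MSE, RMSE, RMSLE, RSE and R2 as a dictionary."""
    n = len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = rss / n
    # RMSLE: RMSE of the logarithms; the "1 +" is an assumed convention
    rmsle = sqrt(sum((log(1 + t) - log(1 + p)) ** 2
                     for t, p in zip(y_true, y_pred)) / n)
    mean_true = sum(y_true) / n
    rse = rss / sum((t - mean_true) ** 2 for t in y_true)
    return {"RSS": rss, "MSE": mse, "RMSE": sqrt(mse),
            "RMSLE": rmsle, "RSE": rse, "R2": 1 - rse}


# Made-up labels and predictions
metrics = regression_metrics([3, 5, 7, 9], [2, 5, 8, 9])
for name, value in metrics.items():
    print(name, round(value, 4))
```

On this example the residuals are 1, 0, -1 and 0, so RSS = 2, MSE = 0.5, RSE = 2 / 20 = 0.1 and R2 = 0.9.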

6.1.3 Clustering
Clustering is a machine learning method that consists of grouping data points
by similarity or distance. It is an unsupervised learning method and
a popular technique for statistical data analysis. For a given set of points,
clustering algorithms can be used to assign each data point
to a specific group.
There are different functions with which we can evaluate the performance of
clustering algorithms.
[Link] Adjusted Rand Index
The Rand index is a function that calculates a measure of similarity between two clusterings.
To compute it, the Rand index considers all pairs of samples and counts the pairs
that are assigned to the same or to different clusters in the predicted and true clusterings.
The raw Rand index score is then 'adjusted for chance' to give the Adjusted Rand Index score.

[Link] Score based on mutual information

Mutual information is a function that measures the agreement of the two assignments, ignoring
permutations. The following versions are available:
– Normalized Mutual Information (NMI): Scikit-learn
provides the [Link].normalized_mutual_info_score function.
– Adjusted Mutual Information (AMI): Scikit-learn
provides the [Link].adjusted_mutual_info_score function.
[Link] Fowlkes-Mallows Score
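The Fowlkes-Mallows function measures the similarity of two clusterings of a set of
points. It can be defined as the geometric mean of pairwise precision and
recall.

Mathematically,
$FMI = \frac{TP}{\sqrt{(TP + FP)(TP + FN)}}$
where TP, FP and FN are the numbers of true positive, false positive and false negative pairs of points.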
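
Both the Rand index and the Fowlkes-Mallows score rely on counting pairs of points. The sketch below shows this counting on made-up labels. It computes the raw (unadjusted) Rand index, and the function names are assumptions introduced for the example. In scikit-learn, the corresponding functions of sklearn.metrics are adjusted_rand_score and fowlkes_mallows_score.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_true, labels_pred):
    """Classify every pair of points according to the two clusterings."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1   # grouped together in both clusterings
        elif same_pred:
            fp += 1   # grouped together only in the predicted clustering
        elif same_true:
            fn += 1   # grouped together only in the true clustering
        else:
            tn += 1   # separated in both clusterings
    return tp, fp, fn, tn


def rand_index(labels_true, labels_pred):
    """Raw Rand index: proportion of pairs on which the two clusterings agree."""
    tp, fp, fn, tn = pair_counts(labels_true, labels_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def fowlkes_mallows(labels_true, labels_pred):
    """Geometric mean of pairwise precision and recall."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    return tp / sqrt((tp + fp) * (tp + fn)) if tp else 0.0


labels_true = [0, 0, 1, 1]
labels_pred = [0, 0, 1, 2]
print(pair_counts(labels_true, labels_pred))      # (1, 0, 1, 4)
print(rand_index(labels_true, labels_pred))       # 5 agreeing pairs out of 6
print(fowlkes_mallows(labels_true, labels_pred))  # 1 / sqrt(2)
```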
[Link] Silhouette Coefficient
The Silhouette function calculates the average silhouette coefficient of all samples,
using the average intra-cluster distance and the average distance to the nearest cluster for
each sample.

Mathematically,
$S = (b - a) / \max(a, b)$

Here, a is the average intra-cluster distance, and b is the average distance to the nearest cluster.
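
The formula can be checked by hand on a very small example, as in this sketch. It assumes Euclidean distance (math.dist, available from Python 3.8), at least two clusters, and a coefficient of 0 for a point that is alone in its cluster. The points are made up.

```python
from math import dist


def silhouette_score(points, labels):
    """Average of (b - a) / max(a, b) over all points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for a single-point cluster
            continue
        # a: mean distance to the other points of the same cluster
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster
        b = min(sum(dist(point, other) for other in members) / len(members)
                for other_label, members in clusters.items()
                if other_label != label)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two well separated groups of two points each
points = [(0, 0), (0, 1), (5, 0), (5, 1)]
labels = [0, 0, 1, 1]
print(round(silhouette_score(points, labels), 3))
```

A value close to 1 indicates compact and well separated clusters, while a value close to 0 or negative indicates overlapping clusters or badly assigned points.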
[Link] Contingency matrix
This matrix reports the intersection cardinality for each (true,
predicted) pair of clusters. The confusion matrix for classification problems is a square
contingency matrix.
6.2 Work Environment
To carry out these labs, the following libraries must be installed:
6.2.1 Scikit-learn
Scikit-learn is a free Python library for machine learning. It is developed
by many contributors, notably in the academic world by French higher education
and research institutes such as Inria. It is written in Python, with some
essential algorithms written in Cython to optimize performance.
The installation of Scikit-learn involves the following steps:
1. Installing pip:
a. Download [Link] to a folder on your computer.
b. Open the command prompt and navigate to the folder containing the
[Link] installer.
c. Run the following command: python [Link]
d. pip is now installed. You can check that pip was installed correctly
by opening the command prompt and entering the following command:
pip -V
2. Installing Scikit-Learn:
pip install -U scikit-learn
python -m pip show scikit-learn # to see which version of scikit-learn is installed and where
python -m pip freeze # to see all packages installed in the active virtualenv
python -c "import sklearn; sklearn.show_versions()"
6.2.2 Pandas
Pandas is a library written in Python for data manipulation and analysis.
In particular, it offers data structures and operations for manipulating
numerical tables and time series.
It allows you to:
– Manipulate data tables with labels for the variables (columns) and for the
individuals (rows);
– These tables are called DataFrames;
– Easily read and write these DataFrames from or to a tab-delimited file;
– Easily plot graphs from these DataFrames using matplotlib.
Installing Pandas:

pip install pandas

or:
py -m pip install pandas
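Once pandas is installed, the DataFrame features listed in this section can be tried with a short sketch like the one below. The column names and values are invented for the example, and an in-memory buffer stands in for a tab-delimited file on disk.

```python
import io

import pandas as pd

# A small table: rows are individuals, columns are variables.
df = pd.DataFrame(
    {"height_cm": [170, 182, 165], "weight_kg": [65.0, 80.5, 58.2]},
    index=["alice", "bob", "chloe"],
)

# Write the DataFrame as tab-delimited text, then read it back.
buffer = io.StringIO()
df.to_csv(buffer, sep="\t")
buffer.seek(0)
df_back = pd.read_csv(buffer, sep="\t", index_col=0)

print(df_back)
print(df_back.describe())
```

With matplotlib installed, `df.plot()` draws the columns of the DataFrame directly.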
6.2.3 Six and IPython
IPython is an interactive terminal, or shell, for the Python programming language that
offers features such as introspection, additional syntax, completion
and a rich history.
Installation: pip install ipython
Six is a Python 2 and 3 compatibility library. It provides utility functions that
bridge the differences between Python versions in order to write Python code that is
compatible with both versions of Python.
Installation: pip install six
6.3 Classification
6.3.1 Logistic Regression (Logistic_R.py)
Logistic regression is a binomial regression model. It aims to fit a simple mathematical
model as closely as possible to a large number of real observations; in other words,
to associate a vector of random variables (x1, x2, …, xk) with a binomial random variable
generically denoted y.
The goal is to run the program without adding code. The program uses a
dataset available in scikit-learn through the load_digits() function ([Link]
[Link]/stable/modules/generated/[Link].load_digits.html), which loads a dataset of
images of handwritten digits (10 classes). We then test the model and display the confusion
matrix and the accuracy.
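A minimal sketch of such a program is given below. It is not the content of Logistic_R.py: the split ratio, `random_state` and `max_iter` values are assumptions chosen for the example.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 images of handwritten digits, flattened to 64 features
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=5000)  # generous iteration budget for convergence
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: true digit, columns: predicted digit
acc = accuracy_score(y_test, y_pred)
print(cm)
print("Accuracy:", acc)
```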
6.3.2 SVM ([Link])
In short, Support Vector Machines (SVMs) are a set of supervised learning techniques
that aim to find, in a space of dimension N > 1,
the hyperplane that best divides a dataset into two. In their basic form, SVMs are linear
separators, that is, the boundary separating the classes is a straight line.
The objective is to run the program without adding any code. The program uses a
dataset available in scikit-learn through the function datasets.load_iris() ([Link]
[Link]/stable/modules/generated/[Link].load_iris.html), which represents the problem
of classifying flowers into 3 classes (as seen in the Machine Learning lab:
Classification).
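The following sketch shows one way to train a linear SVM on the iris data with scikit-learn's `SVC`. The kernel choice, split ratio and `random_state` are assumptions for the example and may differ from the lab's program.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = datasets.load_iris()  # 150 flowers, 4 features, 3 classes
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

svm = SVC(kernel="linear")  # linear separator between the classes
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)

acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)
```

Since iris has three classes, `SVC` combines several two-class separators internally (one-vs-one scheme).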
6.3.3 Naive Bayes ([Link])
The Naive Bayes classifier is a popular Machine Learning algorithm. It is a supervised
classification algorithm that is particularly useful for text classification problems.
A typical use of Naive Bayes is the spam filter.

The goal is to run the program after adding code to it. The program uses a dataset
available in scikit-learn through the function datasets.load_breast_cancer() ([Link]
[Link]/stable/modules/generated/[Link].load_breast_cancer.html), which represents
the classification problem of a breast cancer dataset from Wisconsin,
USA. The goal is to implement the split of the dataset into training and test sets, make the
prediction, and compute the accuracy and the confusion matrix.
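One possible shape for the code to add is sketched below, assuming a Gaussian Naive Bayes model (`GaussianNB`), which suits the continuous features of this dataset. The split ratio and `random_state` are arbitrary choices for the example.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

data = datasets.load_breast_cancer()  # binary labels: malignant / benign

# Split the dataset into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=0
)

# Train the model and predict on the held-out set.
model = GaussianNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate the predictions.
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Accuracy:", acc)
print(cm)
```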
6.3.4 Decision Tree ([Link])
A decision tree is a visual representation of an algorithm that classifies
data according to different criteria, which we will call decisions (or nodes).
The objective is to run the program without adding code. The program uses the provided
dataset [Link]. The goal is to load and split the data, perform training and testing,
then evaluate the model using the confusion matrix and the classification report, which
displays precision, recall, F1-score, support, and accuracy.
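The evaluation steps can be sketched as follows. Because the name of the lab's dataset file is not recoverable here, this example substitutes `load_iris()` as a stand-in; the split and `random_state` values are also assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset (assumption): the lab uses its own provided file instead.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

acc = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)  # precision, recall, F1-score, support
print(confusion_matrix(y_test, y_pred))
print(report)
```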
6.3.5 Logistic Regression with Cross validation (Logistic_R-[Link])
The objective is to run an example of Logistic Regression using K-Fold
cross-validation. Cross-validation helps evaluate machine learning models.
This statistical method helps to compare and select models in applied machine
learning. The dataset is divided into K subsets (folds), and the procedure is
repeated until each fold has been used once as the test set.
To understand the concept, consider 5-fold cross-validation (K=5): the method divides
the dataset into five folds. In the first iteration, the model uses the first fold
to test the model and the remaining folds to train it. In the
second iteration, the second fold is used for testing and the other folds for
training. The same process repeats until each of the
five folds has been used as the test set.
We will therefore run the program without adding any code. The code reuses the provided dataset
[Link].
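To make the fold rotation concrete, here is a plain-Python sketch that only generates the train/test indices for each iteration. The helper name `k_fold_indices` is hypothetical; in scikit-learn the same role is played by `sklearn.model_selection.KFold`, and `cross_val_score` runs the whole evaluation loop.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 5))
for iteration, (train, test) in enumerate(folds, start=1):
    print(f"Iteration {iteration}: test={test} train={train}")
```

Each sample appears in exactly one test fold, so averaging the K test scores uses every observation once for evaluation.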
6.4 Linear Regression ([Link])
The linear regression algorithm is a supervised learning algorithm, that is to say
that, given the target variable or variable to be explained (Y), the model aims to make
a prediction using so-called explanatory or predictive variables (X) (see Machine
Learning: Linear Regression).
The goal is to run the program after adding code to it. The program uses a dataset
available in scikit-learn through the function datasets.load_diabetes() ([Link]
[Link]/stable/modules/generated/[Link].load_diabetes.html), which represents the
problem of predicting the progression of diabetes one year later based on 10
features. The objective is to implement the split of the dataset into training and test sets,
make the prediction, and display the result in the form of a graph.
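A possible outline of the code to add is sketched below, under the assumption that all 10 features are used and that an 80/20 split is acceptable. The plotting step is left out of the block so that it runs without a display.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = datasets.load_diabetes(return_X_y=True)  # 10 features, continuous target

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model and predict on the test set.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

rmse = mean_squared_error(y_test, y_pred) ** 0.5
r2 = r2_score(y_test, y_pred)
print("RMSE:", rmse)
print("R2:", r2)
```

For the graph, a scatter plot of `y_test` against `y_pred` with matplotlib (`plt.scatter(y_test, y_pred)`) is one simple option.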

6.5 Clustering
Clustering is an unsupervised learning method. In the input data, each
row represents an individual (an observation). Once clustering has been applied, we
retrieve this data grouped by similarity. Clustering groups individuals/objects into
several families (clusters) based on their characteristics. Thus, the individuals
found in the same cluster are similar to one another, while those found in another
cluster are not.
There are two types of clustering:
– Hierarchical clustering
– Non-hierarchical clustering (partitioning)
The goal is to run the programs without adding code.
6.5.1 K-means ([Link])
K-means is a non-hierarchical unsupervised clustering algorithm. It
groups the observations of the dataset into K distinct clusters, so that similar
data end up in the same cluster. Furthermore, an observation can belong to
only one cluster at a time (exclusive membership): the same observation
cannot belong to two different clusters.
The goal is to run the program without adding code. The program uses a dataset
available in scikit-learn through the function datasets.make_blobs() ([Link]
[Link]/stable/modules/generated/[Link].make_blobs.html), which generates
isotropic Gaussian blobs for clustering. We load the data, display it,
fit K-Means, test the prediction, and finally plot
the cluster centers.
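The steps of this lab can be sketched as follows, without the plotting part. The number of samples, the number of centers and the other parameter values are assumptions for the example, not the lab's settings.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate isotropic Gaussian blobs around 4 centers.
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Fit K-Means with K = 4 and assign each observation to exactly one cluster.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
centers = kmeans.cluster_centers_

# Predict the cluster of new, unseen points.
new_points = [[0.0, 0.0], [2.0, 8.0]]
print(kmeans.predict(new_points))
print(centers)
```

With matplotlib, `plt.scatter(X[:, 0], X[:, 1], c=labels)` followed by a second scatter of `centers` reproduces the graphical display of the clusters and their centers.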
6.5.2 Mean-shift ([Link])
Mean Shift, also known as a mode-seeking algorithm, relies on Kernel
Density Estimation (KDE). It assigns data points to clusters by iteratively
shifting the data points towards the nearest high-density area. The point of highest
density in a region is called the mode of that region. Mean Shift is widely used in the
field of computer vision and image segmentation.
KDE is a method for estimating the distribution of data points. It works by
placing a kernel on each data point. In mathematical terms, the kernel is a
weighting function that applies weights to individual data points.
Summing all the individual kernels gives the estimated probability density.
The objective is to run the program without adding code. We reuse the same dataset generator
as in the previous lab, datasets.make_blobs(). We import the plotting style and the dataset
generator, create the dataset by defining the initial centers and generating clusters around
those centers, display the initial dataset, run the learning step, then display the
cluster centers and, finally, the clusters.
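A minimal sketch of these steps, without the plotting, is shown below. The centers, sample count and `quantile` value are illustrative assumptions; `estimate_bandwidth` is used here to pick the kernel width from the data.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Create the dataset from chosen initial centers.
initial_centers = [[2, 2], [7, 7], [2, 9]]
X, _ = make_blobs(n_samples=300, centers=initial_centers, cluster_std=0.6, random_state=0)

# Estimate the kernel bandwidth, then shift points towards the density modes.
bandwidth = estimate_bandwidth(X, quantile=0.2, random_state=0)
ms = MeanShift(bandwidth=bandwidth)
ms.fit(X)

n_clusters = len(ms.cluster_centers_)
print("Estimated number of clusters:", n_clusters)
print(ms.cluster_centers_)
```

Unlike K-means, the number of clusters is not given in advance: it follows from the number of modes found for the chosen bandwidth.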

7 NLP: Natural Language Processing

7.1 Pipeline
A Machine Learning pipeline is used to help automate ML workflows. Pipelines
work by allowing a sequence of data to be transformed and correlated together
into a model that can be tested and evaluated to obtain a result, whether positive or
negative.
Machine learning pipelines consist of several steps to train a
model. They are iterative because each step is repeated
to continuously improve the model's accuracy and obtain an efficient algorithm. To
create better machine learning models and make the most of them,
accessible, scalable, and sustainable storage solutions are imperative, paving the way for
on-site object storage.
7.2 Rule-based sentiment analysis
The lab consists of running a program that detects the sentiment of a sentence
(positive/negative).
The pipeline consists of tokenization, normalization, removal of stop words,
stemming, lemmatization, word occurrence counting, calculation of the number and
percentages of positive and negative words, and the decision on whether the sentence is
positive or negative.
Positive and negative words are counted using a dictionary of positive words
and a dictionary of negative words.
The program consists of:
1. Declare a text string to analyze: new_text
2. Convert the text to lowercase and split it into tokens.
3. Normalize the text: remove non-alphanumeric symbols.
4. Stemming: reduce each word to its stem or root form.
5. Lemmatization: reduce words to a normalized form.
6. Count the positive words and the negative words.
7. Compute the percentages of positive and negative words.
8. Decide whether the text being analyzed is positive or negative.
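The steps above can be sketched in plain Python as follows. This is a simplified illustration: the word dictionaries and the stop-word list are tiny made-up sets, the "neutral" result for ties is an added assumption, and stemming and lemmatization are omitted (NLTK's `PorterStemmer` and `WordNetLemmatizer` are common choices for those two steps).

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries; a real lab would load much larger word lists.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "sad", "awful"}
STOP_WORDS = {"the", "a", "an", "is", "this", "i", "and", "but", "it", "was"}


def analyze_sentiment(text):
    """Return (label, percent_positive, percent_negative) for a text."""
    tokens = text.lower().split()                            # lowercase + tokenize
    tokens = [re.sub(r"[^a-z0-9]", "", t) for t in tokens]   # normalize
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    counts = Counter(tokens)                                 # word occurrences
    n_pos = sum(c for w, c in counts.items() if w in POSITIVE_WORDS)
    n_neg = sum(c for w, c in counts.items() if w in NEGATIVE_WORDS)
    total = n_pos + n_neg
    pct_pos = 100 * n_pos / total if total else 0.0
    pct_neg = 100 * n_neg / total if total else 0.0
    if n_pos > n_neg:
        label = "positive"
    elif n_neg > n_pos:
        label = "negative"
    else:
        label = "neutral"  # tie or no sentiment word found (assumption)
    return label, pct_pos, pct_neg


new_text = "I love this great movie, but the ending was bad."
label, pct_pos, pct_neg = analyze_sentiment(new_text)
print(label, pct_pos, pct_neg)
```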
7.3 Fake-News detection
This lab consists of detecting whether a news item is false (fake) or true.
The pipeline consists of reading a dataset '[Link]' of fake news and another '[Link]' of real
news, then preprocessing this news: flag the data, concatenate the dataframes, delete the
date and title, convert to lowercase, remove stop words, and plot some statistics.
The data is then split into train/test sets and modeled using five models.
Random Forest is already implemented using Scikit-learn, while you need to implement
(using Scikit-learn) Naive Bayes, Logistic Regression, Decision Tree and SVM. Finally, the
five models are compared to select the most accurate one.
The objective is to program Naive Bayes, Logistic Regression, Decision Tree, and SVM.

The program consists of:

1. Read the dataset
2. Declare the Fake and True flags
3. Concatenate the data from the dataset
4. Shuffle the data to reduce variance and ensure that the
models remain general and overfit less
5. Display the data
6. Remove the date and the title from the data
7. Convert to lowercase
8. Remove the punctuation
9. Remove stop words
10. Calculate the number of articles per subject
11. Calculate the number of fake and real articles.
12. Count the frequent words
13. Count the frequent words in fake news and in real news
14. Plot the confusion matrix
15. Divide the data
16. Code to insert
17. Display the result of the Random Forest
18. Code to insert
19. Display the graph
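As a rough guide for the code to insert, the sketch below trains the four requested models on TF-IDF features with a scikit-learn `Pipeline` and compares their accuracies. The sentences are invented toy data standing in for the lab's dataset, and the choice of `MultinomialNB`, `LinearSVC` and `TfidfVectorizer` is an assumption rather than the lab's prescribed solution.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Toy, made-up data: the lab uses its own fake/true news files instead.
train_texts = [
    "shocking secret cure doctors hate", "aliens built the city overnight",
    "miracle pill melts fat instantly", "celebrity clone spotted on the moon",
    "parliament votes on the new budget", "central bank keeps interest rates stable",
    "city council approves road repairs", "university publishes annual report",
]
train_labels = ["fake"] * 4 + ["true"] * 4
test_texts = ["secret miracle cure spotted", "council votes on budget report"]
test_labels = ["fake", "true"]

models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": LinearSVC(),
}

accuracies = {}
for name, classifier in models.items():
    # Each pipeline vectorizes the text, then trains the classifier.
    pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("model", classifier)])
    pipeline.fit(train_texts, train_labels)
    accuracies[name] = pipeline.score(test_texts, test_labels)

best_model = max(accuracies, key=accuracies.get)
print(accuracies)
print("Most accurate model:", best_model)
```

Wrapping the vectorizer and the classifier in one pipeline guarantees that the test texts are transformed with the vocabulary learned on the training set only.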
