Introduction to Machine Learning Basics

Uploaded by

Amrita P


INTRODUCTION TO MACHINE LEARNING

MODULE I
AI refers to the development of programs that behave intelligently
and mimic human intelligence through a set of algorithms.
Machine learning is a subset of AI, which uses algorithms that learn
from data to make predictions. These predictions can be generated
through supervised learning, where algorithms learn patterns from
existing data, or unsupervised learning, where they discover general
patterns in data. ML models can predict numerical values based on
historical data, categorize events as true or false, and cluster data
points based on commonalities.
That is, machine learning (ML) is a discipline of artificial intelligence
(AI) that gives machines the ability to learn automatically
from data and past experience, identifying patterns to
make predictions with minimal human intervention.
Deep learning, on the other hand, is a subfield of machine learning
dealing with algorithms based essentially on multi-layered artificial
neural networks (ANN) that are inspired by the structure of the
human brain. Unlike conventional machine learning algorithms, deep
learning algorithms are less linear, more complex, and hierarchical,
capable of learning from enormous amounts of data, and able to
produce highly accurate results. Language translation, image
recognition, and personalized medicine are some examples of deep
learning applications.
Traditional Programming

Traditional programming is a manual process: a person (the
programmer) creates the program by manually formulating and coding
the rules.

In machine learning, on the other hand, the algorithm automatically
formulates the rules from the data.

Machine Learning Programming

Unlike traditional programming, machine learning is an automated
process. It can increase the value of your embedded analytics in many
areas, including data prep, natural language interfaces, automatic
outlier detection, recommendations, and causality and significance
detection. All of these features help speed user insights and reduce
decision bias.
For example, if you feed in customer demographics and transactions
as input data and use historical customer churn rates as your output
data, the algorithm will formulate a program that can predict if a
customer will churn or not. That program is called a predictive
model.

You can use this model to predict business outcomes in any situation
where you have input and historical output data:

1. Identify the business question you would like to ask.

2. Identify the historical input.
3. Identify the historically observed output (i.e., data samples for
when the condition is true and for when it’s false).
For instance, if you want to predict who will pay their bills late,
identify the input (customer demographics, bills) and the output (pay
late or not), and let the machine learning algorithm use this data to
create your model.
In summary, traditional programming is rule-based and deterministic,
relying on human-crafted logic, whereas machine learning is data-
driven and probabilistic, relying on patterns learned from data.
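
The contrast above can be sketched in a few lines. This is a minimal illustration, not a real churn or billing model: the data, the feature (days overdue last year), and the hand-picked threshold of 30 are all hypothetical, and the "learning algorithm" is deliberately reduced to choosing a single threshold.

```python
# Traditional programming: a person writes the rule by hand.
def handwritten_rule(days_overdue_last_year):
    # The threshold 30 is a hypothetical, hand-picked value.
    return days_overdue_last_year > 30

# Machine learning: the rule (here, one threshold) is derived from data.
def learn_threshold(inputs, outputs):
    """Return the threshold t that classifies the most labeled examples
    correctly when predicting True for every input greater than t."""
    best_t, best_correct = None, -1
    for t in sorted(set(inputs)):
        correct = sum((x > t) == y for x, y in zip(inputs, outputs))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Hypothetical historical data: days overdue last year -> paid late this year?
history_x = [0, 2, 5, 8, 12, 20, 25, 40]
history_y = [False, False, False, False, True, True, True, True]

threshold = learn_threshold(history_x, history_y)

def predictive_model(days_overdue_last_year):
    # The "program" produced by learning: same shape as the handwritten
    # rule, but its threshold came from the data.
    return days_overdue_last_year > threshold
```

With this toy data the learned threshold differs from the hand-picked one, which is the point: the rule follows the historical input and output rather than the programmer's guess.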

A machine can learn if it can gain more data to improve its
performance.

How does Machine Learning work

A machine learning system builds prediction models, learns from
previous data, and predicts the output of new data whenever it
receives it. The accuracy of the predicted output depends on the
amount of data: more data generally helps to build a better model
that predicts the output more accurately.

The Machine Learning algorithm's operation is depicted in the
following block diagram:

[Figure: block diagram of the operation of a machine learning algorithm]

Classification of Machine Learning

At a broad level, machine learning can be classified into three types:

1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
In supervised learning, sample labeled data are provided to the
machine learning system for training, and the system then predicts the
output based on the training data.

The system uses the labeled data to build a model that captures the
relationship between the inputs and their labels. After the training
and processing are done, we test the model with sample data to see if
it can accurately predict the output.

The objective of supervised learning is to map the input data to the
output data.

Supervised learning can be grouped further into two categories of
algorithms:

o Classification
o Regression

Regression is when the variable to predict is numerical, whereas
classification is when the variable to predict is categorical. For
example, regression would use age to predict income, while
classification would use age to predict a category, such as whether a
person makes a specific purchase.

Within supervised learning, various algorithms are used:

• Linear regression
• Logistic regression
• Decision trees
• Random forest
• Gradient boosting
• Artificial neural networks

2) Unsupervised Learning
Unsupervised learning is a learning method in which a machine learns
without any supervision.

The machine is trained with a set of data that has
not been labeled, classified, or categorized, and the algorithm needs to
act on that data without any supervision. The goal of unsupervised
learning is to restructure the input data into new features or a group of
objects with similar patterns.

These algorithms are tasked with finding patterns and
relationships within the data without any prior knowledge of the
data’s meaning. Unsupervised machine learning algorithms find
hidden patterns in data without any human intervention, i.e.,
we don’t give the expected output to our model. The model is
trained with only input values and discovers the groups or patterns
on its own.

In unsupervised learning, we don't have a predetermined result. The
machine tries to find useful insights from the huge amount of data. It
can be further classified into two categories of algorithms:

o Clustering
o Association
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in
which a learning agent gets a reward for each right action and gets a
penalty for each wrong action. The agent learns automatically from
this feedback and improves its performance. In reinforcement
learning, the agent interacts with the environment and explores it. The
goal of an agent is to get the most reward points, and hence, it
improves its performance.

Unlike supervised learning, which relies on a training dataset with
predefined answers, RL involves learning through experience. In RL,
an agent learns to achieve a goal in an uncertain, potentially complex
environment by performing actions and receiving feedback through
rewards or penalties (trial and error).
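
The reward-and-penalty loop can be sketched as follows. This is a minimal, assumed setup rather than a full RL algorithm: the environment has only two hypothetical actions, "right" always earns +1 and "left" always earns -1, and the agent uses a simple epsilon-greedy strategy (mostly pick the action with the best estimated reward, occasionally explore).

```python
import random

def run_agent(steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = ["left", "right"]

    # Hypothetical environment: "right" is the correct action.
    def reward(action):
        return 1.0 if action == "right" else -1.0

    value = {a: 0.0 for a in actions}  # estimated reward of each action
    count = {a: 0 for a in actions}    # how often each action was tried
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                # try every action once
        elif rng.random() < epsilon:
            action = rng.choice(actions)          # explore
        else:
            action = max(actions, key=value.get)  # exploit best estimate
        r = reward(action)
        count[action] += 1
        # Running average of the rewards seen for this action.
        value[action] += (r - value[action]) / count[action]
    return value, count

value, count = run_agent()
best_action = max(value, key=value.get)
```

After a few steps the agent's estimates separate the rewarded action from the penalized one, and it chooses the rewarded action most of the time.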
STEPS IN ML

Machine learning (ML) involves several steps, which help to create a
model that can learn from data and make predictions or decisions.
Here's a general overview of the key steps in the machine learning
process:

1. Define the Problem

• Understand the business or research problem.
• Determine the type of problem (e.g., classification, regression,
clustering).
• Set clear goals for the ML model, e.g., a target accuracy or
overall performance. Accuracy is a specific metric that measures
the proportion of correct predictions made by the model out of
all predictions. Performance is a broader term that encompasses
various aspects of how well a model works. It often refers to the
overall effectiveness of the model in solving the problem and
includes multiple metrics, not just accuracy: precision, recall,
F1 score, etc.

2. Collect and Prepare the Data

• Data Collection: Gather relevant data from different sources
(databases, APIs, sensors, etc.).
• Data Cleaning: Handle missing values, remove duplicates,
correct errors, and normalize or standardize the data (convert
features to a common scale).
• Data Exploration: Perform exploratory data analysis (EDA) to
understand the dataset (e.g., summary statistics, visualizations).
• Feature Engineering: Create new features or modify existing
ones to improve model performance.

3. Split the Data

• Divide the data into training, validation, and testing datasets.
o Training Data: Used to train the model.
o Validation Data: Used to tune the model’s
hyperparameters and select the best model.
o Testing Data: Used to evaluate the final model's
performance.
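
A three-way split can be sketched with the standard library alone. The 70/15/15 proportions, the fixed seed, and the function name `split_data` are assumptions made for illustration; the notes do not prescribe specific ratios.

```python
import random

def split_data(rows, train=0.7, validation=0.15, seed=42):
    """Shuffle the rows and split them into training, validation, and
    testing lists. The default proportions are illustrative only."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # shuffle so each part is a random sample
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * validation)
    train_part = rows[:n_train]
    val_part = rows[n_train:n_train + n_val]
    test_part = rows[n_train + n_val:]  # whatever remains is the test set
    return train_part, val_part, test_part

# Hypothetical dataset of 100 row ids.
train_set, val_set, test_set = split_data(range(100))
```

Every row ends up in exactly one of the three parts, so the test data stays unseen during training and tuning.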

4. Select a Model

• Choose an appropriate machine learning algorithm or model for
the problem.
o For supervised learning, models include linear regression,
decision trees, random forests, support vector machines,
etc.
o For unsupervised learning, models like k-means clustering,
hierarchical clustering, and PCA may be chosen.
o For deep learning, neural networks can be considered.

5. Train the Model

• Use the training dataset to fit the model and learn the
relationships between the features and target variables.
• The model parameters are adjusted to minimize error or
optimize a specific objective (e.g., minimize a loss function).
6. Evaluate the Model

• Assess the model's performance using the validation dataset.
• Common evaluation metrics include accuracy, precision, recall,
F1 score, mean squared error (MSE), etc.
• Analyze if the model is underfitting (too simple) or overfitting
(too complex).
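
The classification metrics named above can be computed directly from the counts of true/false positives and negatives. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative); the label lists are made-up examples.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 or 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical validation labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here 8 of 10 predictions are correct, while precision and recall are both 3/4, which shows why accuracy alone does not describe performance fully.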

7. Tune the Model

• Adjust hyperparameters to improve model performance (e.g.,
learning rate, regularization parameters).
• Techniques like grid search, random search, or Bayesian
optimization can be used for hyperparameter tuning.

8. Test the Model

• Evaluate the final model on the test dataset to check its
performance on unseen data.
• Ensure the model generalizes well to new, real-world data.

9. Deploy the Model

• Integrate the trained model into a production environment for
real-time predictions or batch processing.
• Set up monitoring and logging to track the model’s performance
in production.

10. Monitor and Maintain the Model

• Continuously monitor the model’s performance, detect drift in
data or performance, and retrain the model if necessary.
• Update the model periodically as new data becomes available or
as business requirements evolve.

Each of these steps is iterative, and the process may involve revisiting
earlier steps based on insights gained from later ones.
Applications of ML
• Image Recognition
• Speech Recognition (AI-based vehicle commands, Alexa)
• Recommender Systems (a very common example is YouTube, which
recommends new videos and content based on the user’s past
searches)
• Self-Driving Cars
• Online Fraud Detection
• Stock Market Trading
• Spam Detection
• Medical Diagnosis
• Traffic Prediction
• Virtual Personal Assistants (virtual assistance)
Feature Selection Techniques
Feature selection:
Feature selection is a process that chooses a subset of features from
the original features by removing the redundant, irrelevant, or
noisy features so that the feature space is optimally reduced
according to a certain criterion.
While developing a machine learning model, only a few
variables in the dataset are useful for building the model, and the
remaining features are either redundant or irrelevant. If we input the
dataset with all these redundant and irrelevant features, it may
reduce the overall performance and accuracy
of the model. Hence it is very important to identify and select the
most appropriate features from the data and remove the irrelevant
or less important features, which is done with the help of feature
selection in machine learning.
A feature is an attribute that has an impact on a problem or is
useful for the problem, and choosing the important features for the
model is known as feature selection.
Each machine learning process depends on feature engineering,
which mainly consists of two processes:
• Feature Selection
• Feature Extraction
Feature selection selects a subset of the original
feature set.
Feature extraction creates new features. Feature selection is a way
of reducing the input variables for the model by using only relevant
data in order to reduce overfitting in the model. (Overfitting
happens when a model learns too much from the training data,
including details that don’t matter, like noise or outliers.)
Here, given a set of features, we assign a score or
weight to each feature with the help of a criterion such as entropy,
variance, or the capacity to maintain local similarity.
The features with the highest weights are selected and the others
are removed from the feature set.
Alternatively, given a set of features, we
generate subsets of features. We give each
subset to the machine learning algorithm, repeat this
for all the subsets, and take the one
that gives the maximum performance as the
final one.
To generate subsets of features there are 2 methods:
forward wrapper methods and backward wrapper methods.
Forward wrapper method
We start with an empty feature set. We then add one feature at a
time to the feature set, train a model on it, and measure
the performance of the model.
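
A greedy version of the forward wrapper idea can be sketched as below. The stopping rule (stop when no added feature improves the score) and the `toy_score` function are assumptions; in practice `score` would train a model on the given feature subset and return its measured performance.

```python
def forward_selection(features, score):
    """Start with an empty feature set and repeatedly add the feature
    that improves score(subset) the most; stop when nothing improves."""
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        # Score every candidate subset that adds one more feature.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best:
            break  # no remaining feature helps
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Hypothetical stand-in for "train a model and measure its performance":
# useful features raise the score, any other feature lowers it slightly.
USEFUL = {"age": 0.20, "income": 0.15}

def toy_score(subset):
    return 0.5 + sum(USEFUL.get(f, -0.01) for f in subset)

chosen, final_score = forward_selection(
    ["age", "income", "shoe_size"], toy_score)
```

With this toy score the irrelevant feature is never added, because including it would lower the measured performance.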
In another approach, given a set of features, we try to identify the
correlation of each feature with the target variable (e.g. if x
increases, y also increases).
If a feature is highly correlated with the target variable, we retain
it; otherwise it is eliminated.
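
Correlation-based selection can be sketched with the Pearson correlation coefficient. The cut-off of 0.5, the column names, and the data are hypothetical; the notes do not say which correlation measure or threshold to use.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def filter_by_correlation(columns, target, threshold=0.5):
    """Keep the features whose absolute correlation with the target is
    at least `threshold` (an assumed, illustrative cut-off)."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical features: x1 rises with the target, x2 does not.
columns = {"x1": [1, 2, 3, 4, 5], "x2": [2, 1, 2, 1, 2]}
target = [2, 4, 6, 8, 10]
kept = filter_by_correlation(columns, target)
```

Using the absolute value keeps strongly negative correlations too, since a feature that falls as the target rises is just as informative.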

Some machine learning algorithms select features while the model
is being built, e.g. decision trees.
While building a decision tree, we start with the
feature that has the highest information gain. (Information gain is a
measure used to determine which feature should be used to split the
data at each internal node of the decision tree.)
At the next level, we again select the feature with
the highest information gain.
So rather than considering all features, we only consider the
features that have the most importance when
building the ML model.

Advantages and Disadvantages of Feature Selection

Entropy
Entropy is a concept from information theory that
measures the impurity of the sample values. It is defined by
the following formula:

Entropy(S) = – Σ p(c) log2 p(c), summed over all classes c in S

where:
• S represents the data set for which entropy is calculated
• c represents the classes in set S
• p(c) represents the proportion of data points that belong to class
c to the number of total data points in set S

For a two-class problem, entropy values fall between 0 and 1. If all
samples in data set S belong to one class, then entropy will equal
zero. If half of the
samples are classified as one class and the other half are in another
class, entropy will be at its highest at 1. In order to select the best
feature to split on and find the optimal decision tree, the attribute
whose split leaves the smallest amount of entropy should be used.
Information gain represents the difference in entropy before and
after a split on a given attribute. The attribute with the highest
information gain will produce the best split as it’s doing the best
job at classifying the training data according to its target
classification. Information gain is usually represented with the
following formula, where a is the attribute being split on, v ranges
over the values of a, and S_v is the subset of S in which a has the
value v:

Gain(S, a) = Entropy(S) – Σ (|S_v| / |S|) Entropy(S_v), summed over all values v of a

Imagine that we have the following arbitrary dataset:

For this dataset, the entropy is 0.94. This can be calculated by finding
the proportion of days where “Play Tennis” is “Yes”, which is 9/14,
and the proportion of days where “Play Tennis” is “No”, which is
5/14. Then, these values can be plugged into the entropy formula
above.

Entropy (Tennis) = -(9/14) log2(9/14) – (5/14) log2 (5/14) = 0.94

We can then compute the information gain for each of the attributes
individually. For example, the information gain for the attribute,
“Humidity” would be the following:

Gain (Tennis, Humidity) = (0.94) – (7/14)(0.985) – (7/14)(0.592) = 0.151

As a recap,
- 7/14 represents the proportion of values where humidity equals
“high” to the total number of humidity values. In this case, the
number of values where humidity equals “high” is the same as the
number of values where humidity equals “normal”.

- 0.985 is the entropy when Humidity = “high”

- 0.592 is the entropy when Humidity = “normal”

Then, repeat the calculation for information gain for each attribute in
the table above, and select the attribute with the highest information
gain to be the first split point in the decision tree. In this case, outlook
produces the highest information gain. From there, the process is
repeated for each subtree.
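
The entropy and information gain calculations can be checked with a short sketch. The class counts for the Humidity split (3 "Yes" and 4 "No" for "high", 6 "Yes" and 1 "No" for "normal") are an assumption: they are chosen because they reproduce the entropies 0.985 and 0.592 quoted in the text, since the dataset table itself is not shown here.

```python
import math

def entropy(counts):
    """Entropy of a set, given the number of samples in each class."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total)
                for c in counts if c > 0)

def information_gain(parent_counts, child_counts):
    """Entropy before the split minus the weighted entropy after it."""
    total = sum(parent_counts)
    after = sum(sum(child) / total * entropy(child)
                for child in child_counts)
    return entropy(parent_counts) - after

play_tennis = [9, 5]               # 9 "Yes" days, 5 "No" days
humidity_split = [[3, 4], [6, 1]]  # assumed counts: high, normal

tennis_entropy = entropy(play_tennis)
humidity_gain = information_gain(play_tennis, humidity_split)
```

Without rounding the intermediate entropies, the gain comes out at roughly 0.152; the 0.151 in the text results from using the rounded values 0.94, 0.985, and 0.592.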

Descriptive Statistics
The first step of any data-related process is the collection of data.
After data collection, data can be sorted, analyzed, and used in
various methods and formats, depending on the project’s needs.
While analyzing a dataset, we use statistical methods to arrive at a
conclusion. Two types of statistical methods are widely used in
data analysis: descriptive and inferential.
Descriptive statistics is a means of describing features of a data set by
yielding summaries about the data samples. It aids in improving data
analysis and identifying the dataset’s trend.
Measures of central tendency, including mean, median, and mode, are
important statistical concepts used in machine learning for several
purposes. They help summarize and understand the data, which can
influence how machine learning models are trained and evaluated.
Types of Descriptive Statistics

There are various dimensions in which this data can be described.

The three main dimensions used for describing data are the central
tendency, dispersion, and the shape of the data.

Descriptive Statistics Based on the Central Tendency of Data

The central tendency of data is the center of the distribution of
data. It describes where the data is located. The three most widely
used measures of the “center” of the data are Mean, Median, and Mode.

Mean

The “Mean” is the average of the data. The average can be
identified by summing up all the numbers and then dividing the sum
by the number of observations.

Mean = (X1 + X2 + X3 + … + Xn) / n

Example:
Data – 10,20,30,40,50 and Number of observations = 5
Mean = [ 10+20+30+40+50 ] / 5
Mean = 30

The central tendency of the data may be influenced by outliers. An
outlier is a data point that differs significantly from other
observations. It can cause serious problems in analysis.

Example:

Data – 10,20,30,40,200
Mean = [ 10+20+30+40+200 ] / 5
Mean = 60

Solution for the outliers problem: Removing the outliers before
taking averages can give a more representative result.

Median

It is the 50th percentile of the data. In other words, it is exactly the
center point of the data. The median is identified by
ordering the data, splitting it into two equal parts, and then finding
the number in the middle. It is a robust way to find the center of the
data.

Note that, in this case, the central tendency of the data is not
affected by outliers.

Example:

Odd number of Data – 10,20,30,40,50

Median is 30.
Even number of Data – 10,20,30,40,50,60

Find the middle 2 data and take the mean of those two values.
Here, 30 and 40 are middle values.

Now, add them and divide the result by 2

(30+40) / 2 = 35
Median is 35

Mode

The mode of the data is the most frequently occurring data or
element in a dataset. If an element occurs the highest number of
times, it is the mode of that data. If no number in the data is
repeated, then that data has no mode. There can be more than one
mode in a dataset if two values have the same frequency, which is
also the highest frequency.

Example:

Data – 1,3,4,6,7,3,3,5,10, 3
Mode is 3, because 3 has the highest frequency (4 times)
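
The three worked examples above can be reproduced with Python's standard `statistics` module. This is only a check of the arithmetic; the variable names are illustrative.

```python
import statistics

clean = [10, 20, 30, 40, 50]
with_outlier = [10, 20, 30, 40, 200]
even_count = [10, 20, 30, 40, 50, 60]
repeated = [1, 3, 4, 6, 7, 3, 3, 5, 10, 3]

mean_clean = statistics.mean(clean)            # sum divided by count
mean_outlier = statistics.mean(with_outlier)   # pulled up by the outlier
median_outlier = statistics.median(with_outlier)  # middle value, unaffected
median_even = statistics.median(even_count)    # mean of the two middle values
modes = statistics.multimode(repeated)         # all most-frequent values
```

Comparing `mean_outlier` with `median_outlier` shows the effect described in the text: a single extreme value shifts the mean but leaves the median where it was.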

Descriptive Statistics Based on the Dispersion of Data

The dispersion is the “spread of the data”. It measures how far the
data is spread. In some datasets, the data values are closely
located near the mean. In other datasets, the values spread
widely from the mean. You can measure this dispersion
using the Interquartile Range (IQR), range, standard
deviation, and variance.

Measures of variability, also called measures of dispersion, help
quantify the spread or distribution of observations in a dataset.

1. Inter Quartile Range (IQR)

The Interquartile Range (IQR) is the difference between the third
quartile (Q3) and the first quartile (Q1):

IQR = Q3 – Q1

Quartiles are special percentiles.

1st Quartile Q1 is the same as the 25th percentile.
2nd Quartile Q2 is the same as the 50th percentile.
3rd Quartile Q3 is the same as the 75th percentile.

Quartiles

Quartiles are values that divide a dataset into four equal parts, which
makes them useful for understanding the spread and central tendency
of the data. There are three quartiles:
• First Quartile (Q1) or Lower Quartile: This is the 25th
percentile of the data. It represents the value below which 25%
of the data fall. It’s the median of the lower half of the dataset.
• Second Quartile (Q2) or Median: This is the 50th percentile of
the data. It represents the middle value of the dataset, splitting
the data into two halves. If the dataset has an odd number of
elements, it’s the middle value; if even, it’s the average of the
two middle values.
• Third Quartile (Q3) or Upper Quartile: This is the 75th
percentile of the data. It represents the value below which 75%
of the data fall, and above which 25% of the data fall. It's the
median of the upper half of the dataset.

Q1 = [(n+1)/4]th item
Q2 = [(n+1)/2]th item
Q3 = [3(n+1)/4]th item

Quartiles Examples
Question 1: Find the quartiles of the following data: 4, 6, 7, 8, 10,
23, 34.
Solution: Here the numbers are arranged in ascending order and the
number of items, n = 7
Lower quartile, Q1 = [(n+1)/4]th item
Q1 = (7+1)/4 = 2nd item = 6
Median, Q2 = [(n+1)/2]th item
Q2 = (7+1)/2 = 4th item = 8
Upper Quartile, Q3 = [3(n+1)/4]th item
Q3 = 3(7+1)/4 = 6th item = 23

For this data, the Interquartile Range is the difference between the
third quartile (Q3) and the first quartile (Q1):

IQR = Q3 – Q1 = 23 – 6 = 17
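
The [(n+1)/4]th-item rule can be written as a small function. Linear interpolation for fractional positions is an assumption added here, since the worked example only involves whole-number positions; other quartile conventions exist and can give slightly different values.

```python
def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, 3) using the [k(n+1)/4]-th item rule.
    Fractional positions are linearly interpolated (an assumption)."""
    n = len(sorted_data)
    position = k * (n + 1) / 4        # 1-based position in the ordered data
    lower = int(position)             # whole part of the position
    fraction = position - lower       # fractional part, 0 for exact items
    lower = min(max(lower, 1), n)     # keep the position inside the data
    upper = min(lower + 1, n)
    low_value = sorted_data[lower - 1]
    high_value = sorted_data[upper - 1]
    return low_value + fraction * (high_value - low_value)

data = sorted([4, 6, 7, 8, 10, 23, 34])
q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
iqr = q3 - q1
```

For the seven values in Question 1 the positions are the 2nd, 4th, and 6th items, so no interpolation is needed.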

Percentile

In statistics, a percentile describes how a score compares
to other scores from the same set. It is expressed as the percentage of
values in a set of data scores that fall below a given value.

That is, it is a measure of the position of a particular data point.

If the ordered data points are divided into 100 equal parts, there
are 99 percentiles separating them.

Example 1: The scores obtained by 10 students are 38, 47, 49, 58, 60,
65, 70, 79, 80, 92. Using the percentile formula, calculate the
percentile for the score 70.

Solution:

Given:
Scores obtained by students are 38, 47, 49, 58, 60, 65, 70, 79, 80, 92

Number of scores below 70 = 6

Using the percentile formula,

Percentile = (Number of Values Below “x” / Total Number of Values) × 100

Percentile of 70

= (6/10) × 100

= 0.6 × 100 = 60

Therefore, the score 70 is at the 60th percentile.

Example 2: The weights of 10 people were recorded in kg as 35, 41,
42, 56, 58, 62, 70, 71, 90, 77. Find the percentile for the weight
58 kg.

Solution:

Given:

Weights of the people in ascending order are 35, 41, 42, 56, 58, 62,
70, 71, 77, 90

Number of people with weight below 58 kg = 4

Using the formula for percentile,

Percentile = (Number of Values Below “x” / Total Number of Values) × 100

Percentile for weight 58 kg

= (4/10) × 100

= 0.4 × 100 = 40

Therefore, the weight 58 kg is at the 40th percentile.

Example 3: In a college, a list of scores of 10 students is announced.
The scores are 56, 45, 69, 78, 72, 94, 82, 80, 63, 59. Using the
percentile formula, find the 70th percentile.

Solution: Arrange the data in ascending order - 45, 56, 59, 63, 69, 72,
78, 80, 82, 94

Find the rank,

Rank = Percentile ÷ 100

Rank = 70 ÷ 100 = 0.7

So, the rank is 0.7

Use the rank to find how many values must lie below the percentile,

Number of values below = Rank × Total number of values in the data set

Number of values below = 0.7 × 10

Number of values below = 7

Counting 7 values from left to right, we reach 78, so the next value,
80, has exactly 7 values below it. In other words, 70% of the values
are below 80.

Therefore, the 70th percentile is 80.
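
The three percentile examples can be checked with the sketch below. It follows the definition used in these notes (the percentage of values strictly below a given value); the function names are illustrative, and other percentile conventions can give different answers.

```python
def percentile_rank(values, x):
    """Percentage of the values that fall strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

def value_at_percentile(values, p):
    """Smallest value in the data that has at least p percent of the
    values below it; None if no value qualifies."""
    for v in sorted(values):
        if percentile_rank(values, v) >= p:
            return v
    return None

scores = [38, 47, 49, 58, 60, 65, 70, 79, 80, 92]
weights = [35, 41, 42, 56, 58, 62, 70, 71, 90, 77]
college = [56, 45, 69, 78, 72, 94, 82, 80, 63, 59]

score_percentile = percentile_rank(scores, 70)     # Example 1
weight_percentile = percentile_rank(weights, 58)   # Example 2
p70_value = value_at_percentile(college, 70)       # Example 3
```

Note that `percentile_rank` does not require sorted input, while `value_at_percentile` sorts the data first, matching the "arrange in ascending order" step of Example 3.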

2. Range: describes the difference between the largest and smallest
data point in our data set.

Range = Largest data value – smallest data value

Example: range of visits to the library in the past year
Ordered data set: 0, 3, 3, 12, 15, 24
Range: 24 – 0 = 24
3. Standard deviation
The standard deviation (s or SD) is the average amount of variability
in your dataset. It tells you, on average, how far each score lies from
the mean. The larger the standard deviation, the more variable the
data set is.

There are six steps for finding the standard deviation:

1. List each score and find their mean.
2. Subtract the mean from each score to get the deviation from the
mean.
3. Square each of these deviations.
4. Add up all of the squared deviations.
5. Divide the sum of the squared deviations by N – 1, where N is the
number of scores.
6. Find the square root of the number you found.

For the library-visits data set (0, 3, 3, 12, 15, 24), these steps
give s = 9.18, so you can say that on average, each score
deviates from the mean by 9.18 points.

4. Variance
The variance is the average of squared deviations from the mean.
Variance reflects the degree of spread in the data set. The more spread
the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The
symbol for variance is s².

Variance of visits to the library in the past year
Data set: 15, 3, 12, 0, 24, 3
s = 9.18
s² = 84.3
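
The range, variance, and standard deviation of the library-visits data can be reproduced by following the six steps literally. The sketch divides by N – 1, as in step 5; the function name is illustrative.

```python
import math

visits = [15, 3, 12, 0, 24, 3]

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(data)
    mean = sum(data) / n                        # step 1
    deviations = [x - mean for x in data]       # step 2
    squared = [d ** 2 for d in deviations]      # step 3
    return sum(squared) / (n - 1)               # steps 4 and 5

variance = sample_variance(visits)
std_dev = math.sqrt(variance)                   # step 6
data_range = max(visits) - min(visits)
```

Computing the variance first and taking its square root is equivalent to squaring the standard deviation, and it avoids the small rounding difference you get from squaring 9.18.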

In ML, measures of central tendency are used for the following:

b. If there is a significant difference between the mean and
median, it could indicate the presence of skewed data or
outliers.
c. When you have missing values in your dataset, you might
replace missing data with the mean or median (for
numerical data) or the mode (for categorical data). This
helps preserve the overall distribution of the data.
d. The mean is used in normalization or standardization to
scale features in a dataset. Subtracting the mean from each
data point and dividing by the standard deviation ensures
that the data is centered around zero and has a consistent
scale.
e. Measures of central tendency can be used to create new
features that capture the central value of different subsets
of the data. For instance, you might create features like
"average purchase amount per customer" or "average
temperature for a region.”
f. Assessing Model Performance: Measures like mean
squared error (MSE) or mean absolute error (MAE)
are based on the central tendency (mean) and help in
evaluating how well a model fits the data.
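
Two of the uses listed above, filling missing values with the mean and standardizing a feature, can be sketched as follows. Representing missing values as `None`, using the sample standard deviation (N – 1), and the example numbers are all assumptions made for illustration.

```python
import math

def impute_mean(values):
    """Replace missing entries (None) with the mean of the known ones."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

raw = [10, 20, None, 40, 50]   # hypothetical feature with one gap
filled = impute_mean(raw)      # the gap becomes the mean of the rest
scaled = standardize(filled)   # centered on zero, unit spread
```

For categorical data the same idea would use the mode instead of the mean, as described in item c.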
MODULE II
Regression
Regression in machine learning refers to a supervised learning
technique where the goal is to predict a continuous numerical
value based on one or more independent features. It finds
relationships between variables so that predictions can be made.
There are two types of variables in regression:

Dependent Variable (Target): The variable we are trying to
predict, e.g. house price.
Independent Variables (Features): The input variables that
influence the prediction, e.g. locality, number of rooms.
Variables: A variable is any characteristic, number, or
quantity that can be measured or counted.

Types of variables:

1. Numerical variables.

2. Categorical Variables.

3. Mixed Variables.

1. Numerical Variables:

Numerical variables store numerical values. They are further divided
into 2 categories based on the type of numerical values stored.
• Continuous Variable: This variable stores continuous
numerical values, like Salary (10000 $), Height (5.8 feet),
Price (10.50 $)

• Discrete Variable: This variable stores whole
numbers or counts. It does not store floating-point
numbers. Examples: Number of apples, Number of
items.

2. Categorical Variables:

They store categorical or string values. They are further divided
into 3 categories.

• Ordinal Variable: The values have some order.
Examples: Grades (A, B, C), where A>B>C; Size (S, M, L), where
S<M<L.

• Nominal Variable: The values have no inherent order.
Example: City (Mumbai, Delhi, Pune)

• Date Time Variable: These variables store Date only,
Time only, or both Date and Time.

3. Mixed Variables:

These variables store data that is a combination of both
numeric and categorical values. Example: Seat
Number (A10), Postal Code (XX123).
A problem is a regression problem if the output variable is a
real or continuous value such as “salary” or “weight”.
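
As a minimal sketch of regression with one independent variable, the code below fits a straight line by ordinary least squares. This is one common way to fit a regression line, not necessarily the method these notes go on to use, and the house-price numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: number of rooms (feature) -> house price (target).
rooms = [1, 2, 3, 4, 5]
price = [150, 200, 250, 300, 350]

slope, intercept = fit_line(rooms, price)

def predict(n_rooms):
    # Continuous output: any real value, not a category.
    return slope * n_rooms + intercept
```

Here the number of rooms is the independent variable and the price is the dependent variable; `predict` returns a continuous value for inputs that were not in the data.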

Machine Learning: Types and Techniques
No ratings yet
Machine Learning: Types and Techniques
77 pages
AI and Machine Learning Overview
No ratings yet
AI and Machine Learning Overview
25 pages
Comprehensive Machine Learning Guide
No ratings yet
Comprehensive Machine Learning Guide
7 pages
Introduction to Machine Learning Basics
No ratings yet
Introduction to Machine Learning Basics
11 pages
Machine Learning Algorithms Overview
No ratings yet
Machine Learning Algorithms Overview
29 pages
UNIT-3 Course Material
No ratings yet
UNIT-3 Course Material
8 pages
Overview of Machine Learning Types
No ratings yet
Overview of Machine Learning Types
3 pages
Machine Learning Techniques Overview
No ratings yet
Machine Learning Techniques Overview
70 pages
Understanding Machine Learning Basics
No ratings yet
Understanding Machine Learning Basics
38 pages
Understanding Interleave Division Multiple Access
No ratings yet
Understanding Interleave Division Multiple Access
37 pages
Valid Definitions of Mealy Machines
No ratings yet
Valid Definitions of Mealy Machines
29 pages
Random Number Tables in Probability Theory
No ratings yet
Random Number Tables in Probability Theory
9 pages
Sorting and Searching Algorithms
No ratings yet
Sorting and Searching Algorithms
16 pages
Survey of LLMs in Software Engineering
No ratings yet
Survey of LLMs in Software Engineering
57 pages
Understanding Linear Regression in ML
No ratings yet
Understanding Linear Regression in ML
41 pages
Screenshot 2025-10-29 at 5.05.42 AM
No ratings yet
Screenshot 2025-10-29 at 5.05.42 AM
72 pages
Sign Language Recognition Literature Review
No ratings yet
Sign Language Recognition Literature Review
8 pages
AI Applications in Vision & Language
No ratings yet
AI Applications in Vision & Language
14 pages
Performance Analysis of Machine Learning Algorithms For House Price Prediction
No ratings yet
Performance Analysis of Machine Learning Algorithms For House Price Prediction
10 pages
Atul Resume...
No ratings yet
Atul Resume...
1 page
MCS 230
No ratings yet
MCS 230
2 pages
Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
No ratings yet
Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning
8 pages
Testbank Beginning Anomaly Detection Using PythonBased Deep Learning 2nd Edition Suman Kalyan Adari Download
No ratings yet
Testbank Beginning Anomaly Detection Using PythonBased Deep Learning 2nd Edition Suman Kalyan Adari Download
242 pages
Disentangled Color-Texture Image Generation
No ratings yet
Disentangled Color-Texture Image Generation
35 pages
Travel Behavior Analysis in Hanoi Using SVM
No ratings yet
Travel Behavior Analysis in Hanoi Using SVM
17 pages
Optimized Framework for Face Anti-Spoofing
No ratings yet
Optimized Framework for Face Anti-Spoofing
16 pages
Disparity Estimation in Stereo Images
No ratings yet
Disparity Estimation in Stereo Images
6 pages
Comparative Study of UEC Food Datasets
No ratings yet
Comparative Study of UEC Food Datasets
6 pages
Multimodal Fake Review Detection System
No ratings yet
Multimodal Fake Review Detection System
6 pages
Generative AI: Multimodal Creativity
No ratings yet
Generative AI: Multimodal Creativity
74 pages
AlexNet CNN in MATLAB Guide
No ratings yet
AlexNet CNN in MATLAB Guide
12 pages
Deep Learning Models For Advanced Intrusion Detection in Next-Generation Networks
No ratings yet
Deep Learning Models For Advanced Intrusion Detection in Next-Generation Networks
8 pages
N+ Implementation
No ratings yet
N+ Implementation
42 pages
Machine Learning Foundations Overview
No ratings yet
Machine Learning Foundations Overview
89 pages
AI & ML Certification Program Overview
No ratings yet
AI & ML Certification Program Overview
34 pages
Understanding Perceptrons and Activation Functions
No ratings yet
Understanding Perceptrons and Activation Functions
18 pages
Machine Learning Course Proposal QIP
No ratings yet
Machine Learning Course Proposal QIP
3 pages
Enhancing LLM Performance in MT
No ratings yet
Enhancing LLM Performance in MT
189 pages
Neural Networks Overview and Types
No ratings yet
Neural Networks Overview and Types
7 pages
FPGA-Accelerated Fully Quantized BERT
No ratings yet
FPGA-Accelerated Fully Quantized BERT
4 pages
Self-Supervised Learning in Medical Imaging
No ratings yet
Self-Supervised Learning in Medical Imaging
5 pages
Diffusion Models As Optimizers
No ratings yet
Diffusion Models As Optimizers
24 pages
Solar Cell Defect Detection with YOLOv5
No ratings yet
Solar Cell Defect Detection with YOLOv5
5 pages

INTRODUCTION TO MACHINE LEARNING

Traditional programming is a manual process—meaning a person

In machine learning, on the other hand, the algorithm automatically

Machine Learning Programming

Unlike traditional programming, machine learning is an automated

1. Identify the business question you would like to ask.

A machine can learn if it can gain more data to improve its

How does Machine Learning work

A machine learning system builds prediction models, learns from

The Machine Learning algorithm's operation is depicted in the

Classification of Machine Learning

At a broad level, machine learning can be classified into three types:

Supervised learning can be grouped further in two categories of

Regression is when the variable to predict is numerical, whereas

Within supervised learning, various algorithms are used-

• Artificial neural networks

These algorithms are tasked with finding patterns and

In unsupervised learning, we don't have a predetermined result. The

Unlike supervised learning, which relies on a training dataset with

Machine learning (ML) involves several steps, which help to create a

1. Define the Problem

• Understand the business or research problem.

• Data Collection: Gather relevant data from different sources

3. Split the Data

• Divide the data into training, validation, and testing datasets.

• Choose an appropriate machine learning algorithm or model for

5. Train the Model

• Assess the model's performance using the validation dataset.

7. Tune the Model

• Adjust hyperparameters to improve model performance (e.g.,

8. Test the Model

• Evaluate the final model on the test dataset to check its

9. Deploy the Model

• Integrate the trained model into a production environment for

10. Monitor and Maintain the Model

• Continuously monitor the model's performance, detect drift in

Some machine learning algorithms use this

Advantages and Disadvantages of Feature Selection

Imagine that we have the following arbitrary dataset:

Entropy (Tennis) = -(9/14) log2(9/14) – (5/14) log2 (5/14) = 0.94
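The entropy value above can be checked with a short calculation. The following is a minimal sketch, assuming the 9 and 5 in the formula are the counts of the two classes among 14 examples; the function name `entropy` is chosen here for illustration.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    # Classes with a zero count contribute nothing and are skipped.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# 9 examples of one class and 5 of the other, as in the formula above.
tennis_entropy = entropy([9, 5])
print(round(tennis_entropy, 2))  # 0.94
```

An evenly split set (for example 7 and 7) gives an entropy of 1 bit, and a set containing a single class gives 0.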

Gain (Tennis, Humidity) = (0.94)-(7/14)*(0.985) – (7/14)*(0.592) =


- 0.985 is the entropy when Humidity = “high”

- 0.592 is the entropy when Humidity = “normal”
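The gain expression can be evaluated in the same way. This sketch assumes the class counts of the commonly used 14-row Play Tennis dataset (Humidity = high: 3 yes and 4 no; Humidity = normal: 6 yes and 1 no), which reproduce the two subset entropies quoted above; those counts and the helper names are assumptions, not taken from the text.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, subsets):
    """Parent entropy minus the size-weighted entropy of each subset."""
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in subsets)
    return entropy(parent_counts) - weighted


# Assumed counts as [yes, no] for each Humidity value.
humidity_high = [3, 4]
humidity_normal = [6, 1]

gain = information_gain([9, 5], [humidity_high, humidity_normal])
print(round(entropy(humidity_high), 3))    # 0.985
print(round(entropy(humidity_normal), 3))  # 0.592
print(round(gain, 2))                      # 0.15
```

Under these assumptions the gain comes out at roughly 0.15. Working from the rounded entropies in the formula gives a slightly different third decimal place.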

There are various dimensions in which this data can be described.

Descriptive Statistics Based on the Central Tendency of Data

The central tendency of data is the center of the distribution of

The “Mean” is the average of the data. The average can be

The central tendency of the data may be influenced by outliers. An

Solution for the outliers problem: Removing the outliers while

It is the 50th percentile of the data. In other words, it is exactly the

Odd number of Data – 10,20,30,40,50

Now, add them and divide the result by 2
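The two cases of the median (odd and even number of values) can be sketched as follows. The odd-length data comes from the example above; the even-length list is a hypothetical extension added only for illustration.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count: add the two middle elements and divide by 2.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [10, 20, 30, 40, 50]
even_data = [10, 20, 30, 40, 50, 60]  # hypothetical even-length example

print(median(odd_data))   # 30
print(median(even_data))  # 35.0
```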

The mode of the data is the most frequently occurring data or

Descriptive Statistics Based on the Dispersion of Data

Measures of variability, also called measures of dispersion, help quantify the spread or distribution of observations in a dataset.

The interquartile range (IQR) is the difference between the third

Quartiles are special percentiles.
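As a rough illustration, the quartiles and the interquartile range can be computed with Python's standard `statistics` module. The data below is hypothetical, and `statistics.quantiles` uses its default "exclusive" method here; other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical sample, used only to illustrate the calculation.
data = [10, 20, 30, 40, 50, 60, 70]

# n=4 asks for quartiles: the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(q1, q2, q3)  # 20.0 40.0 60.0
print(iqr)         # 40.0
```

The second quartile coincides with the median of the data.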

In statistics, a percentile is a term that describes how a score compares

That is, it is a measure of the position of a particular data point.

Number of scores below 70 = 6

Using the percentile formula,

Percentile = (Number of Values Below “x” / Total Number of Values)

Therefore, the percentile for score 70 = 60%

Example 2: The weights of 10 people were recorded in kg as 35, 41,

Number of people with weight below 58 kg = 4

Using the formula for percentile,

Percentile = (Number of Values Below “x” / Total Number of Values)

Percentile for weight 58 kg

= 0.4 × 100 = 40%

Therefore, the percentile for weight 58 kg = 40%
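The percentile formula used in both examples can be sketched in a few lines. The full data lists are not reproduced here, so the two lists below are hypothetical stand-ins chosen only to have the same counts: 6 of 10 scores below 70, and 4 of 10 weights below 58 kg.

```python
def percentile_of(x, values):
    """Percentage of values strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


# Hypothetical data matching the counts in the two examples.
scores = [40, 45, 50, 55, 60, 65, 70, 75, 80, 90]
weights = [35, 41, 45, 50, 58, 61, 66, 70, 74, 80]

print(percentile_of(70, scores))   # 60.0
print(percentile_of(58, weights))  # 40.0
```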

Find the rank,

Rank = Percentile ÷ 100

Rank = 70 ÷ 100 = 0.7

So, the rank is 0.7
