UNIT III-Machine Learning Full Notes

This document provides an overview of machine learning, detailing the modeling process and types of machine learning, including supervised, unsupervised, and semi-supervised learning. It outlines the steps involved in data science modeling, such as defining objectives, data collection, and model evaluation, as well as explaining classification and regression techniques. Additionally, it discusses clustering methods and outlier analysis, emphasizing the importance of data quality and the characteristics of outliers.

Uploaded by

vetrivendan7177

UNIT III

MACHINE LEARNING

The modeling process - Types of machine learning - Supervised learning - Unsupervised learning - Semi-supervised learning - Classification, regression - Clustering - Outliers and Outlier Analysis

THE MODELING PROCESS:

What is Data Science Modelling?

Data science modeling is a set of steps from defining the problem to deploying the model in
reality.
Data Science Modelling Steps
 1. Define Your Objective
 2. Collect Data
 3. Clean Your Data
 4. Explore Your Data
 5. Split Your Data
 6. Choose a Model
 7. Train Your Model
 8. Evaluate Your Model
 9. Improve Your Model
 10. Deploy Your Model
1. Define Your Objective
First, define very clearly what problem you are going to solve. Whether the goal is predicting
customer churn, improving product recommendations, or finding patterns in data, you first need
to know your direction. A clear objective guides the choice of data, algorithms, and evaluation
metrics.
2. Collect Data
Gather data relevant to your objective. This can include internal data from your company,
publicly available datasets, or data purchased from external sources. Ensure you have enough
data to train your model effectively.
3. Clean Your Data
Data cleaning is a critical step to prepare your dataset for modeling. It involves handling
missing values, removing duplicates, and correcting errors. Clean data improves the reliability
of your model's predictions.
4. Explore Your Data
Data exploration, or exploratory data analysis (EDA), involves summarizing the main
characteristics of your dataset. Use visualizations and statistics to uncover patterns,
anomalies, and relationships between variables.
5. Split Your Data
Divide your dataset into training and testing sets. The training set is used to train your model,
while the testing set evaluates its performance. A common split ratio is 80% for training and
20% for testing.
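
The 80/20 split described above can be sketched with the Python standard library alone. This is only an illustration: the function name `train_test_split_simple`, the toy records, and the fixed seed are assumptions and do not come from the notes.

```python
import random

def train_test_split_simple(records, test_ratio=0.2, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: 10 (feature, label) pairs.
data = [(x, x % 2) for x in range(10)]
train, test = train_test_split_simple(data)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting keeps any ordering in the raw data from ending up entirely in one of the two sets.
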
6. Choose a Model
Select a model that suits your problem type (e.g., regression, classification) and data.
Beginners can start with simpler models like linear regression or decision trees before
moving on to more complex models like neural networks.
7. Train Your Model
Feed your training data into the model. This process involves the model learning from the
data, adjusting its parameters to minimize errors. Training a model can take time, especially
with large datasets or complex models.
8. Evaluate Your Model
After training, assess your model's performance using the testing set. Common evaluation
metrics for classification include accuracy, precision, recall, and F1 score; regression models
are usually judged by error measures such as mean squared error. Evaluation helps you
understand how well your model will perform on unseen data.
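
As a rough sketch of how the four metrics named above are computed for a binary classifier, the following counts true/false positives and negatives by hand. The function name and the label lists are hypothetical and chosen only for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of predicted positives, how many are right
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of actual positives, how many were found
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these made-up labels there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all four metrics come out to 0.75.
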
9. Improve Your Model
Based on the evaluation, you may need to refine your model. This can involve tuning
hyperparameters, choosing a different model, or going back to data cleaning and preparation
for further improvements.
10. Deploy Your Model
Once satisfied with your model's performance, deploy it for real-world use. This could mean
integrating it into an application or using it for decision-making within your organization.

MACHINE LEARNING:
Machine learning is the branch of Artificial Intelligence that focuses on developing
models and algorithms that let computers learn from data and improve from previous
experience without being explicitly programmed for every task.

Types of Machine Learning:

1. Supervised Machine Learning

In supervised learning, a model is trained on a labelled dataset.

Labelled datasets contain both the input parameters and the corresponding correct outputs.
Supervised learning algorithms learn a mapping from inputs to correct outputs. Both the
training and the validation datasets are labelled.
Example: Suppose you have to build an image classifier that differentiates between cats
and dogs. If you feed labelled images of dogs and cats to the algorithm, the machine learns
to classify a new image as a dog or a cat from these labelled examples.

There are two main categories of supervised learning:

i. Classification
ii. Regression

Classification: Classification deals with predicting categorical target variables, which represent discrete classes or labels.

Here are some classification algorithms:

Logistic Regression, Support Vector Machine, Random Forest, Decision Tree.

Regression: Regression, on the other hand, deals with predicting continuous target
variables, which represent numerical values.

Here are some regression algorithms:

• Linear Regression

• Polynomial Regression

• Decision Tree

• Random Forest

Advantages of Supervised Machine Learning

• Supervised learning models can reach high accuracy because they are trained on labelled
data.

Disadvantages of Supervised Machine Learning

• It can be time-consuming and costly, because it relies entirely on labelled data.

• The model may generalize poorly to new data that differs from the training data.

Unsupervised Machine Learning

Unsupervised learning is a type of machine learning technique in which an algorithm
discovers patterns and relationships using unlabelled data. The primary goal of unsupervised
learning is often to discover hidden patterns, similarities, or clusters within the data.
There are two main categories of unsupervised learning:

• Clustering

• Association

Clustering

Clustering is the process of grouping data points into clusters based on their similarity.

Example of a clustering algorithm:

• K-Means Clustering algorithm

Association

Association rule learning is a technique for discovering relationships between items in a dataset.

Example of an association rule learning algorithm:

• Apriori Algorithm
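
Relationships between items are usually scored with support (how often an itemset occurs) and confidence (how often a rule holds when its left-hand side occurs). The sketch below computes only these two measures and is not a full Apriori implementation. The shopping baskets and function names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 0.5
print(confidence(baskets, {"bread"}, {"milk"}))  # roughly 0.667
```

Here bread and milk appear together in 2 of 4 baskets, and milk appears in 2 of the 3 baskets that contain bread.
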

Reinforcement Machine Learning:

Reinforcement learning is a learning method in which an agent interacts with its
environment by producing actions and discovering errors or rewards. Trial-and-error search
and delayed reward are the most relevant characteristics of reinforcement learning.

Semi-Supervised Learning: Supervised + Unsupervised Learning

Semi-supervised learning sits between supervised and unsupervised learning: it uses both
labelled and unlabelled data for training.

CLASSIFICATION IN MACHINE LEARNING:

Classification teaches a machine to sort things into categories. It learns by looking at examples with labels.

For example, a classification model might be trained on a dataset of images labelled as
either dogs or cats. It can then be used to predict the class of new, unseen images as
dogs or cats based on features such as color, texture and shape.

Types of Classification

When we talk about classification in machine learning, we’re talking about the process
of sorting data into categories based on specific features or characteristics.

There are three main classification types in machine learning:

1. Binary Classification.

2. Multiclass Classification.

3. Multilabel Classification.

1. Binary Classification

This is the simplest kind of classification. In binary classification, the goal is to sort the
data into two distinct categories. Think of it like a simple choice between two options.
Imagine a system that sorts emails into either spam or not spam.

2. Multiclass Classification

Here, instead of just two categories, the data needs to be sorted into more than two
categories. The model picks the one category that best matches the input. Think of an image
recognition system that sorts pictures of animals into categories like cat, dog, and bird.

3. Multi-Label Classification

In multi-label classification, a single piece of data can belong to multiple categories at
once. Unlike multiclass classification, where each data point belongs to only one class,
multi-label classification allows data points to belong to several classes. For example, a
movie recommendation system could tag a movie as both action and comedy.

How does Classification in Machine Learning Work?

Classification involves training a model using a labeled dataset, where each input is
paired with its correct output label.

1. Data Collection: You start with a dataset where each item is labelled with the
correct class (for example, “cat” or “dog”).

2. Feature Extraction: The system identifies features (like color, shape, or texture)
that help distinguish one class from another. These features are what the model
uses to make predictions.

3. Model Training: The classification algorithm uses the labelled data to learn how
to map the features to the correct class. It looks for patterns and relationships
in the data.

4. Model Evaluation: Once the model is trained, it is tested on new, unseen data to
check how accurately it can classify the items. Evaluation is a key step in machine
learning: it shows how well the model performs and how well it handles new,
unseen data.

5. Prediction: After being trained and evaluated, the model can be used to predict
the class of new data based on the features it has learned.
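
The workflow above can be sketched end to end with a very simple classifier. The example uses a nearest-centroid rule, chosen here only because it is short; the notes do not prescribe it. The two features (weight and height), the sample values, and the function names are all hypothetical.

```python
import math

def train_centroids(samples):
    """Model training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    """Prediction: return the class whose centroid is closest."""
    return min(centroids,
               key=lambda label: math.dist(centroids[label], features))

# Data collection + feature extraction: (weight_kg, height_cm) -> class.
train = [((4.0, 25.0), "cat"), ((5.0, 23.0), "cat"),
         ((20.0, 55.0), "dog"), ((30.0, 60.0), "dog")]
# Held-out examples the model has not seen during training.
test = [((4.5, 24.0), "cat"), ((25.0, 58.0), "dog")]

model = train_centroids(train)
correct = sum(predict(model, x) == y for x, y in test)
print(f"accuracy on unseen data: {correct / len(test):.2f}")  # 1.00
```
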

Regression in Machine Learning:

Regression in machine learning refers to a supervised learning technique where the goal
is to predict a continuous numerical value based on one or more independent features. It
finds relationships between variables so that predictions can be made. There are two types
of variables in regression:

• Dependent Variable (Target): The variable we are trying to predict, e.g. house price.

• Independent Variables (Features): The input variables that influence the prediction,
e.g. locality, number of rooms.

A problem is a regression problem when the output variable is a real or continuous
value, such as “salary” or “weight”.

Different Types of Regression:

Simple Linear Regression:

Simple Linear Regression is a statistical method used to model the relationship between
two variables:

• Independent Variable (X) → The input or predictor variable.

• Dependent Variable (Y) → The output or target variable.

It assumes that there is a linear relationship between X and Y, which means that an
increase or decrease in X leads to a proportional change in Y.
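
A linear relationship of this kind is commonly written Y = b0 + b1 * X, and the usual way to estimate b0 and b1 is ordinary least squares. The sketch below assumes that fitting method. The symbol names `b0` and `b1`, the function name, and the data points are illustrative assumptions.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimising squared error for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x), both left unnormalised.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(xs, ys)
print(b0, b1)  # 1.0 2.0
```

Once fitted, a prediction for a new X is simply `b0 + b1 * x`.
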

Problems :
Multiple Linear Regression:

Clustering in Machine Learning:

Clustering is a way of grouping data points into different clusters, each consisting of
similar data points. Objects with possible similarities remain in a group that has few or no
similarities with other groups.

Types of Clustering Methods

Clustering methods are broadly divided into hard clustering (each data point belongs
to only one group) and soft clustering (a data point can also belong to other groups).
Various other approaches to clustering also exist. Below are the main clustering
methods used in machine learning:

1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering

Partitioning Clustering
It is a type of clustering that divides the data into non-hierarchical groups. It is also known as
the centroid-based method. The most common example of partitioning clustering is the
K-Means Clustering algorithm.

In this type, the dataset is divided into a set of K groups, where K is the pre-defined number
of groups. The cluster centers are chosen so that each data point is closer to the centroid of
its own cluster than to the centroid of any other cluster.
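
A minimal sketch of K-Means, under stated assumptions: centroids are initialised naively from the first K points (real implementations usually pick them at random or with a smarter scheme), and the loop runs for a fixed number of iterations. The sample points and the function name are made up.

```python
import math

def k_means(points, k, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points (assumption)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its points.
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0),
          (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

On this toy data the three points near the origin end up in one cluster and the three points near (8.5, 9) in the other.
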

Density-Based Clustering

The density-based clustering method connects highly dense areas into clusters, so
clusters of arbitrary shape can form as long as the dense regions are connected. The
algorithm identifies the dense regions of the dataset and treats each connected dense
region as one cluster.

Distribution Model-Based Clustering

In the distribution model-based clustering method, the data is divided based on the
probability that a data point belongs to a particular distribution. The grouping is
done by assuming some family of distributions, most commonly the Gaussian distribution.

Hierarchical Clustering

Hierarchical clustering can be used as an alternative to partitioning clustering,
as there is no requirement to pre-specify the number of clusters to be created. In
this technique, the dataset is divided into clusters to create a tree-like structure,
which is also called a dendrogram.
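
One common way to build such a tree is bottom-up (agglomerative) merging. The sketch below assumes single linkage, where the distance between two clusters is the distance between their closest members, and works on one-dimensional values for brevity. The sequence of merges it returns is the information a dendrogram draws. The function name and the values are illustrative.

```python
def agglomerative_single_linkage(values):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [[v] for v in values]  # start: every point is its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (i, j)] + [merged]
    return merges

merges = agglomerative_single_linkage([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

Cutting the merge history at a chosen distance gives a flat clustering, which is why the number of clusters does not have to be fixed in advance.
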

Fuzzy Clustering

Fuzzy clustering is a type of soft method in which a data object may belong to more
than one group or cluster. Each data point has a set of membership coefficients, which
express its degree of membership in each cluster. The Fuzzy C-means algorithm is an
example of this type of clustering; it is sometimes also known as the Fuzzy k-means
algorithm.
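
To make the idea of membership coefficients concrete, the sketch below computes them for one point using the standard Fuzzy C-means membership update with fuzzifier m. It covers only that single step and not the full iterative algorithm. The cluster centers, the point, and the function name are assumptions for illustration.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster (fuzzy c-means update, fuzzifier m > 1)."""
    distances = [math.dist(point, c) for c in centers]
    if 0.0 in distances:  # the point sits exactly on a center
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)); closer centers get larger u_j.
    return [1.0 / sum((d_j / d_k) ** power for d_k in distances)
            for d_j in distances]

centers = [(0.0, 0.0), (4.0, 0.0)]             # hypothetical cluster centers
u = fuzzy_memberships((1.0, 0.0), centers)
print(u)  # approximately [0.9, 0.1]
```

The coefficients sum to 1, so the point belongs mostly to the nearer cluster while keeping a small share in the other one.
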

Outliers and Outlier Analysis

What is an Outlier?

An outlier is a data point that significantly differs from the rest of the dataset. It deviates
so much from the other observations that it raises suspicion of being generated by a
different process or containing errors.

Types of Outliers

1. Global Outliers (Point Outliers)

   o A single data point is far removed from the rest of the dataset.
   o Example: In a dataset of student ages (18-22), an entry of 80 years is a
     global outlier.

2. Contextual Outliers (Conditional Outliers)

   o A data point is an outlier only in a specific context.
   o Example: A temperature of 30°C is normal in summer but an outlier in
     winter.

3. Collective Outliers

   o A group of data points together deviate from the expected pattern.
   o Example: A sudden drop in stock prices that does not match market
     trends.

Detecting Outliers Using the IQR Method:
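
A minimal sketch of the IQR rule, assuming the common convention that a value is an outlier when it lies below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 - Q1. The 1.5 multiplier, the quartile method, the function name, and the sample ages (modelled on the student-age example above) are assumptions.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical student ages with one suspicious entry.
ages = [18, 19, 19, 20, 20, 21, 21, 22, 80]
outliers = iqr_outliers(ages)
print(outliers)  # [80]
```

For these ages Q1 = 19 and Q3 = 21, so the IQR is 2 and the accepted range is 16 to 24; only the entry of 80 falls outside it.
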
