Unsupervised Learning: Clustering Explained

Unsupervised learning is a machine learning approach where algorithms identify patterns in unlabelled data without prior guidance. Key techniques include clustering, such as K-means and hierarchical clustering, which group similar data points based on inherent characteristics. The process involves collecting unlabelled data, selecting an algorithm, training the model, grouping data, and interpreting the results for insights.

Uploaded by

disego9711

Unsupervised: clustering, association

Unsupervised learning is a type of machine learning where a computer tries to find patterns in data without
being told what the right answers are.
For example:
• In supervised learning, you teach the computer using examples like “this is a cat” and “this is a
dog.”
• In unsupervised learning, you just give the computer a bunch of pictures of animals — without
labels — and it tries to group similar ones together or find patterns on its own.
So basically, unsupervised learning = finding hidden patterns or groups in data without labels or
guidance.

The image shows a set of animals, such as elephants, camels and cows, that represents the raw data the
unsupervised learning algorithm will process.
• The "Interpretation" stage signifies that the algorithm has no predefined labels or categories
for the data. It must work out how to group or organize the data based on inherent patterns.
• The algorithm stands for the unsupervised learning process, which can be clustering, dimensionality
reduction or anomaly detection, used to identify patterns in the data.
• The processing stage shows the algorithm working on the data.

The output shows the results of the unsupervised learning process. In this case, the algorithm might have
grouped the animals into clusters based on their species (elephants, camels, cows).
Working of Unsupervised Learning
The working of unsupervised machine learning can be explained in these steps:
1. Collect Unlabelled Data
• Gather a dataset without predefined labels or categories.
• Example: Images of various animals without any tags.
2. Select an Algorithm
• Choose a suitable unsupervised algorithm, such as K-Means for clustering or Apriori for
association rule learning.
3. Train the Model on Raw Data
• Feed the entire unlabelled dataset to the algorithm.
• The algorithm looks for similarities, relationships or hidden structures within the data.
4. Group or Transform Data
• The algorithm organizes data into groups (clusters), rules or lower-dimensional forms without
human input.
• Example: It may group similar animals together or extract key patterns from large datasets.
5. Interpret and Use Results
• Analyze the discovered groups, rules or features to gain insights or use them for further tasks like
visualization, anomaly detection or as input for other models.
Unsupervised Learning Algorithms
Types of unsupervised learning algorithms
1. Clustering Algorithms
Clustering is an unsupervised machine learning technique that groups similar data points together
into clusters based on their characteristics, without using any labelled data. The objective is to
ensure that data points within the same cluster are more similar to each other than to those in
different clusters, enabling the discovery of natural groupings and hidden patterns in complex
datasets.
• Goal: Discover the natural grouping or structure in unlabelled data without predefined categories.
• How: Data points are assigned to clusters based on similarity or distance measures.
• Similarity Measures: Can include Euclidean distance, cosine similarity or other metrics depending
on data type and clustering method.
• Output: Each group is assigned a cluster ID, representing shared characteristics within the cluster.
• For example, if we have customer purchase data, clustering can group customers with similar
shopping habits. These clusters can then be used for targeted marketing, personalized
recommendations or customer segmentation.
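The two similarity measures named above can be compared with a minimal sketch. The function names and the customer spending figures are hypothetical, chosen only for illustration. The sketch shows that Euclidean distance reacts to how much each customer spends, while cosine similarity only looks at the proportions.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (close to 1 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical customers described by (grocery spend, electronics spend).
small_basket = [10.0, 20.0]
large_basket = [100.0, 200.0]  # same proportions, ten times the spend

print(euclidean_distance(small_basket, large_basket))  # large: the amounts differ a lot
print(cosine_similarity(small_basket, large_basket))   # about 1: the proportions match
```

Which measure fits better depends on whether the size of the values or only their pattern should decide which points count as similar.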

The following image shows an example of how clustering works.

The left side of the image shows uncategorized data. On the right side, data has been grouped into clusters
that consist of similar attributes.

Types of clustering in unsupervised machine learning

The main types of clustering in unsupervised machine learning include:

• K-means
• Hierarchical clustering
K-means
K-Means Clustering is an unsupervised machine learning algorithm that helps group data points into clusters
based on their inherent similarity. Unlike supervised learning, where we train models using labelled data, K-
Means is used when we have data that is not labelled and the goal is to uncover hidden patterns or structures.
For example, an online store can use K-Means to segment customers into groups like "Budget Shoppers,"
"Frequent Buyers," and "Big Spenders" based on their purchase history.

Working of K-Means Clustering

Suppose we are given a dataset of items, where each item is described by a vector of feature values.
The task is to sort those items into groups. To achieve this, we will use the K-means algorithm. "k"
represents the number of groups or clusters we want to divide our items into.
The algorithm will divide the items into "k" clusters of similar items. To measure that similarity
we will use the Euclidean distance.
The algorithm works as follows:
1. Choose the number of clusters (K):
o The letter “K” in K-Means means the number of groups you want to form.
o For example, if K=2, the algorithm will divide the data into 2 clusters.
2. Initialize centroids:
o Pick K random points from the dataset as the starting centers of the clusters.
o These points are called centroids.
3. Assign data points to the nearest centroid:
o Each data point is assigned to the cluster whose centroid is closest to it (based on distance,
usually Euclidean distance).
4. Update the centroids:
o For each cluster, calculate the average position of all points in that cluster.
o This new average point becomes the new centroid.
5. Repeat:
o Steps 3 and 4 are repeated until the centroids no longer move much, meaning the clusters
are now stable.
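The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function name `k_means`, the parameters `max_iters`, `tol` and `seed`, the rule for empty clusters and the sample data are all assumptions made for this sketch.

```python
import math
import random


def k_means(points, k, max_iters=100, tol=1e-6, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 2: pick K distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iters):
        # Step 3: assign every point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: move each centroid to the mean of the points assigned to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 5: stop once no centroid moved by more than `tol`.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels


# Hypothetical 2-D data with two visibly separate groups.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = k_means(data, k=2)
print(labels)     # one cluster ID per point; which group gets ID 0 depends on the seed
print(centroids)  # the final cluster centers
```

Because the starting centroids are random, different seeds can give different cluster IDs, and on harder data even different clusters, so running the algorithm several times is a common precaution.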
Euclidean Distance Formula
Consider two points (x1, y1) and (x2, y2) in a 2-dimensional space; the Euclidean distance between them is
given by the formula:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

Where,
• d is the Euclidean distance,
• (x1, y1) is the coordinate of the first point,
• (x2, y2) is the coordinate of the second point.
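As a small worked illustration of the Euclidean distance, take the hypothetical points (1, 2) and (4, 6), which are chosen here only because the arithmetic comes out cleanly:

```latex
d = \sqrt{(4 - 1)^2 + (6 - 2)^2}
  = \sqrt{3^2 + 4^2}
  = \sqrt{9 + 16}
  = \sqrt{25}
  = 5
```

So these two points are 5 units apart. In K-means, this is the number compared across centroids to decide which one is closest to a data point.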
Hierarchical Clustering in Machine Learning
Hierarchical clustering is an unsupervised learning technique used to group similar data points into clusters
by building a hierarchy (tree-like structure).
The algorithm builds clusters step by step either by progressively merging smaller clusters or by splitting a
large cluster into smaller ones. The process is often visualized using a dendrogram, which helps to
understand data similarity.
Dendrogram
A dendrogram is like a family tree for clusters. It shows how individual data points or groups of data merge
together. The bottom shows each data point as its own group and as we move up, similar groups are
combined. The lower the merge point, the more similar the groups are. It helps us see how things are
grouped step by step.

• At the bottom of the dendrogram the points A, B, C, D, E and F are all separate.
• As we move up, the closest points are merged into a single group.
• The lines connecting the points show how they are progressively merged based on similarity.
• The height at which they are connected shows how similar the points are to each other; the lower
the connection, the more similar they are.

Types of Hierarchical Clustering

1. Agglomerative Clustering
2. Divisive Clustering

Hierarchical Agglomerative Clustering (HAC) is a bottom-up clustering method.

1. Start with individual points: Each data point begins as its own cluster.
2. Find the closest pair: The algorithm finds the two clusters that are most similar (closest to each
other).
3. Merge them: Combine those two clusters into one bigger cluster.
4. Repeat: Keep finding and merging the closest clusters again and again.
5. Finish: Continue until only one big cluster remains that contains all the data.
6. Create a dendrogram: As the process continues, we can visualize the merging of clusters using a
tree-like diagram called a dendrogram. It shows the hierarchy of how clusters are merged.
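The bottom-up steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `agglomerative` and the sample points are hypothetical, and the distance between two clusters is taken to be the distance between their closest members (single link), which is only one possible choice.

```python
import math


def agglomerative(points):
    """Single-link agglomerative clustering; returns the merge history."""
    # Step 1: every point starts as its own cluster (stored as a list of indices).
    clusters = [[i] for i in range(len(points))]
    merges = []
    # Steps 4 and 5: keep merging until one cluster holds all the data.
    while len(clusters) > 1:
        # Step 2: find the closest pair of clusters.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumption (single link): cluster distance = closest pair of members.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Step 3: merge the pair and record the distance at which it happened.
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return merges


# Hypothetical 1-D data, written as one-element tuples.
points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
merges = agglomerative(points)
for left, right, height in merges:
    print(left, "+", right, "merged at distance", height)
```

The recorded merge distances are the heights at which the branches of a dendrogram would join (step 6), so the merge history is enough to draw one.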

Workflow for Hierarchical Divisive clustering:

1. Start with all data points in one cluster: Treat the entire dataset as a single large cluster.
2. Split the cluster: Divide the cluster into two smaller clusters. The division is typically done by
finding the two most dissimilar points in the cluster and using them to separate the data into two
parts.
3. Repeat the process: Among the current clusters, choose the one with the most dissimilar points
and split it again into two smaller clusters.
4. Stop: Continue this process until every data point is in its own cluster or a stopping condition
(such as a predefined number of clusters) is met.
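The top-down workflow above can be sketched in the same way. The function names `diameter` and `divisive`, the stopping parameter `n_clusters` and the sample points are assumptions made for this illustration. Each split sends every point to the nearer of the two most dissimilar points, as described in step 2.

```python
import math
from itertools import combinations


def diameter(points, cluster):
    """Largest pairwise distance inside a cluster and the pair that attains it."""
    if len(cluster) < 2:
        return 0.0, None
    return max(
        ((math.dist(points[i], points[j]), (i, j)) for i, j in combinations(cluster, 2)),
        key=lambda item: item[0],
    )


def divisive(points, n_clusters):
    """Top-down clustering that stops once `n_clusters` clusters exist."""
    # Step 1: all points start in one cluster (stored as a list of indices).
    clusters = [list(range(len(points)))]
    while len(clusters) < n_clusters:
        # Step 3: choose the cluster whose two farthest points are most dissimilar.
        target = max(clusters, key=lambda c: diameter(points, c)[0])
        dist, pair = diameter(points, target)
        if pair is None or dist == 0.0:
            break  # nothing left that can be split
        i, j = pair
        # Step 2: each point joins whichever of the two extreme points is nearer.
        left = [p for p in target
                if math.dist(points[p], points[i]) <= math.dist(points[p], points[j])]
        right = [p for p in target if p not in left]
        clusters.remove(target)
        clusters += [left, right]
    return clusters


# Hypothetical 1-D data, written as one-element tuples.
points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
result = divisive(points, n_clusters=3)
print(sorted(result))
```

Setting `n_clusters` to the number of points carries the splitting all the way down to single-point clusters, which is the other stopping case named in step 4.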
Use the distance matrix in Table 1 to perform single-link and complete-link hierarchical clustering. Show your
results by drawing a dendrogram. The dendrogram should clearly show the order in which the points are
merged.

Combine P1 and P2
Distances after combining P1 and P2
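How the distances change after one merge can be sketched as follows. The distance values below are hypothetical and are not the values of Table 1; the function name `merge_clusters` is also an assumption. With single link the merged cluster keeps the smaller of the two old distances to every other point, and with complete link it keeps the larger one.

```python
def merge_clusters(dist, a, b, linkage="single"):
    """Return a new distance table in which clusters `a` and `b` are merged."""
    # Single link keeps the minimum distance, complete link keeps the maximum.
    combine = min if linkage == "single" else max
    merged = a + "+" + b
    others = [c for c in dist if c not in (a, b)]
    # Distances between untouched clusters are copied unchanged.
    new = {c: {d: dist[c][d] for d in others if d != c} for c in others}
    new[merged] = {}
    for c in others:
        value = combine(dist[a][c], dist[b][c])
        new[merged][c] = value
        new[c][merged] = value
    return new


# Hypothetical symmetric distance matrix (NOT the values from Table 1).
dist = {
    "P1": {"P2": 1.0, "P3": 4.0, "P4": 6.0},
    "P2": {"P1": 1.0, "P3": 3.0, "P4": 7.0},
    "P3": {"P1": 4.0, "P2": 3.0, "P4": 2.0},
    "P4": {"P1": 6.0, "P2": 7.0, "P3": 2.0},
}

single = merge_clusters(dist, "P1", "P2", linkage="single")
complete = merge_clusters(dist, "P1", "P2", linkage="complete")
print(single["P1+P2"])    # {'P3': 3.0, 'P4': 6.0}
print(complete["P1+P2"])  # {'P3': 4.0, 'P4': 7.0}
```

Repeating this update after every merge, and always merging the pair with the smallest remaining distance, gives the merge order needed for the dendrogram.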
