0% found this document useful (0 votes)

2 views62 pages

Unsupervised Learning: Clustering Methods

Uploaded by

Sravani Nanubala

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views62 pages

Unsupervised Learning: Clustering Methods

Uploaded by

Sravani Nanubala

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Machine Learning

1
UNIT - III

UNIT III 9 Hrs

Unsupervised Learning: Clustering: Introduction to clustering, K-
means clustering, K-Mode Clustering, Hierarchal clustering, Anomaly
Detection, Association: Apriori Algorithm, F-P Growth Algorithm, Eclat
Algorithm, Ensemble method: Bootstrap Aggregation, Boosting,
Gradient Boosting Machines, Stacking.

2
Unsupervised Learning
Introduction
• In machine learning, the problem of unsupervised learning is that of trying to find
hidden structure in unlabeled data.

• Since the examples given to the learner are unlabeled, there is no error or reward
signal to evaluate the goodness of a potential solution. This distinguishes
unsupervised from supervised learning.

• Unsupervised learning is defined as the task performed by algorithms that learn

from a training set of unlabeled or unannotated examples, using the features of the
inputs to categorize them according to some geometric or statistical criteria.

• Unsupervised learning encompasses many techniques that seek to summarize and

explain key features or structures of the data. 3
Unsupervised Learning
Introduction
• Many methods employed in unsupervised learning are based on data mining methods
used to preprocess data.

• Most unsupervised learning techniques can be summarized as those that tackle the
following four groups of problems.
• Clustering:
• Has as a goal to partition the set of examples into groups.
• Dimensionality reduction:
• Aims to reduce the dimensionality of the data. Here, we encounter techniques such as Principal
Component Analysis (PCA), independent component analysis, and nonnegative matrix factorization.
• Outlier detection:
• Has as a purpose to find unusual events (e.g., a malfunction), that distinguish part of the data from
the rest according to certain criteria.
• Novelty detection:
• Deals with cases when changes occur in the data (e.g., in streaming data). 4
Unsupervised Learning
Clustering
• The most common unsupervised task is clustering.
• Clustering is a process of grouping similar objects together; i.e., to partition
unlabeled examples into disjoint subsets of clusters, such that:
• Examples within a cluster are similar (in this case, we speak of high intraclass
similarity).
• Examples in different clusters are different (in this case, we speak of low
interclass similarity).
• When we denote data as similar and dissimilar, we should define a measure for this
similarity/dissimilarity.
• Note:
• Grouping similar data together can help in discovering new categories in an
unsupervised manner, even when no sample category labels are provided.
5
Unsupervised Learning
Types of Clustering

6
K-Means Clustering Algorithm
What is K-Means Clustering ?
• K-Means clustering is the most popular unsupervised learning algorithm.

• It is used when we have unlabeled data which is data without defined categories or
groups.

• The algorithm follows an easy or simple way to classify a given data set through a
certain number of clusters.

• K-Means algorithm works iteratively to assign each data point to one of K groups
based on the features that are provided. Data points are clustered based on feature
similarity.

7
K-Means Clustering Algorithm
What is K-Means Clustering ?
• K-Means Clustering groups the unlabeled dataset into different clusters. Here K
defines the number of pre-defined clusters that need to be created in the process,
as if K=2, there will be two clusters, and for K=3, there will be three clusters, and so
on.

“It is an iterative algorithm that divides the unlabeled dataset into k different clusters
in such a way that each dataset belongs only one group that has similar properties”.

• It allows us to cluster the data into different groups and a convenient way to
discover the categories of groups in the unlabeled dataset on its own without the
need for any training.
• It is a centroid-based algorithm, where each cluster is associated with a centroid.
The main aim of this algorithm is to minimize the sum of distances between the
data point and their corresponding clusters. 8
K-Means Clustering Algorithm
What is K-Means Clustering ?
• The algorithm takes the unlabeled dataset as input, divides the dataset into k-
number of clusters, and repeats the process until it does not find the best clusters.
The value of k should be predetermined in this algorithm.

• The k-means clustering algorithm mainly performs two tasks:

• Determines the best value for K center points or centroids by an iterative process.

• Assigns each data point to its closest k-center. Those data points which are near to the
particular k-center, create a cluster.

• Hence each cluster has datapoints with some commonalities, and it is

away from other clusters.
9
K-Means Clustering Algorithm
How does the K-Means Algorithm Work

10
K-Means Clustering Algorithm
How does the K-Means Algorithm Work?
The working of the K-Means algorithm is explained in the below steps:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select random K points or centroids. (It can be other from the input
dataset).
Step-3: Assign each data point to their closest centroid, which will form the
predefined K clusters.
Step-4: Calculate the variance and place a new centroid of each cluster.
Step-5: Repeat the third steps, which means reassign each datapoint to the
new closest centroid of each cluster.
Step-6: If any reassignment occurs, then go to step-4 else go to FINISH.
Step-7: The model is ready.

11
K-Means Clustering Algorithm
How to choose the value of "K number of clusters" in K-means Clustering?
• The performance of the K-means clustering algorithm depends upon highly
efficient clusters that it forms. But choosing the optimal number of clusters is a big
task. There are some different ways to find the optimal number of clusters, but
here we are discussing the most appropriate method to find the number of
clusters or value of K. The method is given below:

12
K-Means Clustering Algorithm
Elbow Method
• The Elbow method is one of the most popular ways to find the optimal number of
clusters. This method uses the concept of WCSS value. WCSS stands for Within
Cluster Sum of Squares, which defines the total variations within a cluster. The
formula to calculate the value of WCSS (for 3 clusters) is given below:

WCSS= ∑Pi in Cluster1 distance(Pi C1)2 +∑Pi in Cluster2distance(Pi C2)2+∑Pi in CLuster3 distance(Pi C3)2

• In the above formula of WCSS,

• ∑Pi in Cluster1 distance(Pi C1)2: It is the sum of the square of the distances between
each data point and its centroid within a cluster1 and the same for the other two
terms.

13
K-Means Clustering Algorithm
Elbow Method
• To measure the distance between data points and centroid, we can use any
method such as Euclidean distance or Manhattan distance.
• To find the optimal value of clusters, the elbow method follows the below
steps:
• It executes the K-means clustering on a given dataset for different K
values (ranges from 1-10).
• For each value of K, calculates the WCSS value.
• Plots a curve between calculated WCSS values and the number of clusters
K.
• The sharp point of bend or a point of the plot looks like an arm, then that
point is considered as the best value of K.

14
K-Means Clustering Algorithm
Elbow Method
• Since the graph shows the sharp bend, which looks like an elbow, hence it is known
as the elbow method. The graph for the elbow method looks like the below image:

15
K-Means Clustering Algorithm
Applications of clustering
• K-Means clustering is the most common unsupervised machine
learning algorithm.

• It is widely used for many applications which include-

• Image segmentation
• Customer segmentation
• Species clustering
• Anomaly detection
• Clustering languages

16
Hierarchical Clustering Algorithm
• Hierarchical clustering is another unsupervised machine learning
algorithm, which is used to group the unlabeled datasets into a cluster
and also known as hierarchical cluster analysis or HCA.

• In this algorithm, we develop the hierarchy of clusters in the form of a

tree, and this tree-shaped structure is known as the dendrogram.

• Sometimes the results of K-means clustering and hierarchical

clustering may look similar, but they both differ depending on how they
work. As there is no requirement to predetermine the number of
clusters as we did in the K-Means algorithm.

17
Hierarchical Clustering Algorithm
• The hierarchical clustering technique has two approaches:

[Link]: Agglomerative is a bottom-up approach, in which the

algorithm starts with taking all data points as single clusters and
merging them until one cluster is left.

[Link]: Divisive algorithm is the reverse of the agglomerative

algorithm as it is a top-down approach.

18
Hierarchical Clustering Algorithm
Why hierarchical clustering?
• As we already have other clustering algorithms such as K-Means
Clustering, then why we need hierarchical clustering?
• We have seen in the K-means clustering that there are some challenges with
this algorithm, which are a predetermined number of clusters, and it always
tries to create the clusters of the same size.

• To solve these two challenges, we can opt for the hierarchical clustering
algorithm because, in this algorithm, we don't need to have knowledge about
the predefined number of clusters.

19
Hierarchical Clustering Algorithm
Agglomerative Hierarchical clustering
• The agglomerative hierarchical clustering algorithm is a popular
example of HCA.

• To group the datasets into clusters, it follows the bottom-up

approach. It means, this algorithm considers each dataset as a single
cluster at the beginning, and then start combining the closest pair of
clusters together.

• It does this until all the clusters are merged into a single cluster that
contains all the datasets.

• This hierarchy of clusters is represented in the form of 20 the

Hierarchical Clustering Algorithm
How the Agglomerative Hierarchical clustering Work?
The working of the AHC algorithm can be explained using the below
steps:
Step-1: Create each data point as a single cluster. Let's say there are N
data points, so the number of clusters will also be N.

21
Hierarchical Clustering Algorithm
How the Agglomerative Hierarchical clustering Work?
Step-2: Take two closest data points or clusters and merge them to form
one cluster. So, there will now be N-1 clusters..

22
Hierarchical Clustering Algorithm
How the Agglomerative Hierarchical clustering Work?
Step-3: Again, take the two closest clusters and merge them together to
form one cluster. There will be N-2 clusters.

23
Hierarchical Clustering Algorithm
How the Agglomerative Hierarchical clustering Work?
Step-4: Repeat Step 3 until only one cluster left. So, we will get the
following clusters. Consider the below images:

24
Hierarchical Clustering Algorithm
How the Agglomerative Hierarchical clustering Work?
• Step-5: Once all the clusters are combined into one big cluster,
develop the dendrogram to divide the clusters as per the problem.

Note:
To better understand hierarchical clustering, it is advised to have a look
on k-means clustering

25
Hierarchical Clustering Algorithm
Measure for the distance between two clusters
• As we have seen, the closest distance between the two clusters is
crucial for the hierarchical clustering.

• There are various ways to calculate the distance between two clusters,
and these ways decide the rule for clustering.

• These measures are called Linkage methods. Some of the popular

linkage methods are given below:

26
Hierarchical Clustering Algorithm
1. Single Linkage:
It is the Shortest Distance between the closest points of the clusters.
Consider the below image:

27
Hierarchical Clustering Algorithm
2. Complete Linkage:
• It is the farthest distance between the two points of two different
clusters. It is one of the popular linkage methods as it forms tighter
clusters than single-linkage.

28
Hierarchical Clustering Algorithm
3. Average Linkage:
• It is the linkage method in which the distance between each pair of
datasets is added up and then divided by the total number of datasets
to calculate the average distance between two clusters.

• It is also one of the most popular linkage methods.

29
Hierarchical Clustering Algorithm
4. Centroid Linkage:
• It is the linkage method in which the distance between the centroid of
the clusters is calculated. Consider the below image:

• From the above-given approaches, we can apply any of them

according to the type of problem or business requirement. 30
Hierarchical Clustering Algorithm
Woking of Dendrogram in Hierarchical clustering
• The dendrogram is a tree-like structure that is mainly used to store
each step as a memory that the HC algorithm performs.
• In the dendrogram plot, the Y-axis shows the Euclidean distances
between the data points, and the x-axis shows all the data points of the
given dataset. The working of the dendrogram can be explained using
the below diagram:

31
Hierarchical Clustering Algorithm
Woking of Dendrogram in Hierarchical clustering

• In the above diagram, the left part is showing how clusters are created
in agglomerative clustering, and the right part is showing the
corresponding dendrogram. 32
Hierarchical Clustering Algorithm
Woking of Dendrogram in Hierarchical clustering
• As we have discussed above, firstly, the datapoints P2 and P3
combine together and form a cluster, correspondingly a dendrogram is
created, which connects P2 and P3 with a rectangular shape. The
hight is decided according to the Euclidean distance between the data
points.
• In the next step, P5 and P6 form a cluster, and the corresponding
dendrogram is created. It is higher than of previous, as the Euclidean
distance between P5 and P6 is a little bit greater than the P2 and P3.
• Again, two new dendrograms are created that combine P1, P2, and P3
in one dendrogram, and P4, P5, and P6, in another dendrogram.
• At last, the final dendrogram is created that combines all the data
points together.
33
K-means Vs Hierarchical

34
K-means Vs Hierarchical

35
K-Mode Clustering Algorithm
• K-Modes clustering is one of the unsupervised Machine Learning
algorithms that is used to cluster categorical variables.

• Unlike traditional clustering algorithms that use distance metrics, K-

Modes works by identifying the modes or most frequent values within
each cluster to determine its centroid.

• K-Modes is ideal for clustering categorical data such as customer

demographics, market segments, or survey responses.

• It is a powerful tool for data analysts and scientists to gain insights into
their data and make informed decisions.
36
K-Mode Clustering Algorithm
K-Modes vs K-Means
• K-Means uses mathematical measures (distance) to cluster continuous
data. The lesser the distance, the more similar our data points are.
Centroids are updated by Means.

• But for categorical data points, we cannot calculate the distance. So

we go for K-Modes algorithm. It uses the dissimilarities(total
mismatches) between the data points. The lesser the dissimilarities the
more similar our data points. It uses Modes instead of means.

37
K-Mode Clustering Algorithm
How Does the K-Modes Algorithm Work?
• Unlike K-Means clustering methods, we need to upfront specify the K.
[Link] K observations at random and use them as leaders/clusters.

[Link] the dissimilarities and assign each observation to its closest

cluster.

[Link] new modes for the clusters.

[Link] 2–3 steps until there are is no re-assignment required

38
K-Mode Clustering Algorithm
Example
Imagine we have a dataset that has the information about hair color, eye color, and
skin color of persons. We aim to group them based on the available information

we have the sample data now. Let us proceed by defining the number
39
of
clusters(K)=3
K-Mode Clustering Algorithm
Example
Step 1: Pick K observations at random and use them as leaders/clusters
I am choosing P1, P7, P8 as leaders/clusters

40
K-Mode Clustering Algorithm
Example
Step 2: Calculate the dissimilarities(no. of mismatches) and assign each observation
to its closest cluster
• Iteratively compare the cluster data points to each of the observations. Similar data
points give 0, dissimilar data points give 1.

41
K-Mode Clustering Algorithm
Example
Step 2: Calculate the dissimilarities(no. of mismatches) and assign each observation
to its closest cluster
• Iteratively compare the cluster data points to each of the observations. Similar data
points give 0, dissimilar data points give 1.

• Comparing leader/Cluster P1 to
the observation

• P1 gives 0 dissimilarities.

42
K-Mode Clustering Algorithm
Example

• Comparing leader/cluster P1 to
the observation P2 gives 3(1+1+1)
dissimilarities.

43
K-Mode Clustering Algorithm
Example

• Likewise, calculate all the

dissimilarities and put them in
a matrix as shown below and
assign the observations to
their closest cluster(cluster
that has the least
dissimilarity)

44
K-Mode Clustering Algorithm
Example
• After step 2, the observations P1, P2, P5 are assigned to cluster 1; P3, P7 are
assigned to Cluster 2; and P4, P6, P8 are assigned to cluster 3.

• Note:
• If all the clusters have the same dissimilarity with an observation, assign to any
cluster randomly.

• In our case, the observation P2 has 3 dissimilarities with all the leaders. I
randomly assigned it to Cluster 1.

45
K-Mode Clustering Algorithm
Example
Step 3: Define new modes for the clusters
• Mode is simply the most observed value.
• Mark the observations according to the cluster they belong to. Observations of
Cluster 1 are marked in Yellow, Cluster 2 are marked in Brick red, and Cluster 3
are marked in Purple.

46
• Considering one cluster at a time, for each feature, look for the Mode and update
K-Mode Clustering Algorithm
Example
• Cluster 1 observations(P1, P2, P5) has brunette as the most observed hair color,
amber as the most observed eye color, and fair as the most observed skin color.

Note: If you observe the same occurrence of values, take the mode randomly. In our
case, the observations of Cluster 3(P3, P7) have one occurrence of brown, fair skin
color. I randomly chose brown as the mode.
• Below are our new leaders after the update.

• Obtained new leaders; Repeat steps 2–4 47

K-Mode Clustering Algorithm
Example
• After obtaining the new leaders, again calculate the dissimilarities between the
observations and the newly obtained leaders.

48
K-Mode Clustering Algorithm
Example
• Comparing Cluster 1 to the observation P1 gives 1 dissimilarity.

49
K-Mode Clustering Algorithm
Example
• Comparing Cluster 1 to the observation P2 gives 2 dissimilarities.
• Likewise, calculate all the dissimilarities and put them in a matrix. Assign each
observation to its closest cluster..

• The observations P1, P2, P5 are assigned to Cluster 1; P3, P7 are assigned to
Cluster 2; and P4, P6, P8 are assigned to Cluster 3.
50
• We stop here as we see there is no change in the assignment of observations.
Association Rule Learning
• Association rule learning is a type of unsupervised learning technique
that checks for the dependency of one data item on another data item
and maps accordingly so that it can be more profitable.

• It tries to find some interesting relations or associations among the

variables of dataset.

• It is based on different rules to discover the interesting relations

between variables in the database.

• The association rule learning is one of the very important concepts of

machine learning, and it is employed in Market Basket analysis, Web
usage mining, continuous production, etc.
51
Association Rule Learning
• Here market basket analysis is a technique used by the various big retailer
to discover the associations between items.

• We can understand it by taking an example of a supermarket, as in a

supermarket, all products that are purchased together are put together.

• For example, if a customer buys bread, he most likely can also buy butter,
eggs, or milk, so these products are stored within a shelf or mostly nearby.

52
Association Rule Learning
• Consider the below diagram:

• Association rule learning can be divided into three types of algorithms:

1. Apriori
2. Eclat
3. F-P Growth Algorithm
53
Association Rule Learning
How does Association Rule Learning work?
• Association rule learning works on the concept of If and Else Statement, such as if
A then B.

• Here the If element is called antecedent, and then statement is called as

Consequent. These types of relationships where we can find out some association
or relation between two items is known as single cardinality.

• It is all about creating rules, and if the number of items increases, then cardinality
also increases accordingly. So, to measure the associations between thousands of
data items, there are several metrics.
54
Association Rule Learning
How does Association Rule Learning work?
• These metrics are given below:
• Support
• Confidence
• Lift

Support
• Support is the frequency of A or how frequently an item appears in the dataset. It is
defined as the fraction of the transaction T that contains the itemset X. If there are
X datasets, then for transactions T, it can be written as:

55
Association Rule Learning
How does Association Rule Learning work?
Confidence
• Confidence indicates how often the rule has been found to be true. Or how often
the items X and Y occur together in the dataset when the occurrence of X is
already given. It is the ratio of the transaction that contains X and Y to the number
of records that contain X.

Lift
• It is the strength of any rule, which can be defined as below formula:

56
Association Rule Learning
How does Association Rule Learning work?
• It is the ratio of the observed support measure and expected support if X and Y are
independent of each other.

• It has three possible values:

• If Lift= 1: The probability of occurrence of antecedent and consequent is
independent of each other.

• Lift>1: It determines the degree to which the two item-sets are dependent to
each other.

• Lift<1: It tells us that one item is a substitute for other items, which means one
item has a negative effect on another.
57
Association Rule Learning
Types of Association Rule Learning
Association rule learning can be divided into three algorithms:
1. Apriori Algorithm
• This algorithm uses frequent datasets to generate association rules. It is designed
to work on the databases that contain transactions.

• This algorithm uses a breadth-first search and Hash Tree to calculate the itemset
efficiently.

• It is mainly used for market basket analysis and helps to understand the products
that can be bought together.

• It can also be used in the healthcare field to find drug reactions for patients. 58
Association Rule Learning
Types of Association Rule Learning
2. F-P Growth Algorithm
• The F-P growth algorithm stands for Frequent Pattern, and it is the improved
version of the Apriori Algorithm.

• It represents the database in the form of a tree structure that is known as a

frequent pattern or tree.

• The purpose of this frequent tree is to extract the most frequent patterns.

59
Association Rule Learning
Types of Association Rule Learning
3. Eclat Algorithm
• Eclat algorithm stands for Equivalence Class Transformation.

• This algorithm uses a depth-first search technique to find frequent itemsets in a

transaction database.

• It performs faster execution than Apriori Algorithm.

60
Association Rule Learning
Applications of Association Rule Learning
• It has various applications in machine learning and data mining. Below are some
popular applications of association rule learning:
• Market Basket Analysis: It is one of the popular examples and applications of
association rule mining. This technique is commonly used by big retailers to
determine the association between items.

• Medical Diagnosis: With the help of association rules, patients can be cured
easily, as it helps in identifying the probability of illness for a particular disease.

• Protein Sequence: The association rules help in determining the synthesis of

artificial Proteins.

• It is also used for the Catalog Design and Loss-leader Analysis and many more
other applications. 61
Unit – III (Functions) 62

ML Unsupervised Clustering 13
No ratings yet
ML Unsupervised Clustering 13
15 pages
Unsupervised Learning Algorithms Overview
No ratings yet
Unsupervised Learning Algorithms Overview
88 pages
Unsupervised Learning: K-Means & Clustering
No ratings yet
Unsupervised Learning: K-Means & Clustering
125 pages
Understanding Clustering in Unsupervised Learning
No ratings yet
Understanding Clustering in Unsupervised Learning
63 pages
Unit-3 1
No ratings yet
Unit-3 1
24 pages
Understanding Unsupervised Learning Techniques
No ratings yet
Understanding Unsupervised Learning Techniques
31 pages
Unsupervised Learning: Clustering Techniques
No ratings yet
Unsupervised Learning: Clustering Techniques
82 pages
Understanding Clustering in Unsupervised Learning
No ratings yet
Understanding Clustering in Unsupervised Learning
43 pages
Unsupervised Learning and Clustering Techniques
No ratings yet
Unsupervised Learning and Clustering Techniques
10 pages
Unsupervised Learning: Clustering Methods
No ratings yet
Unsupervised Learning: Clustering Methods
51 pages
Unsupervised Learning: K-Means Clustering
No ratings yet
Unsupervised Learning: K-Means Clustering
96 pages
Clustering Techniques in Machine Learning
No ratings yet
Clustering Techniques in Machine Learning
41 pages
Unsupervised Learning: K-Means & DBSCAN
No ratings yet
Unsupervised Learning: K-Means & DBSCAN
46 pages
Clustering Techniques and Use Cases
No ratings yet
Clustering Techniques and Use Cases
12 pages
Unsupervised Learning in Machine Learning
No ratings yet
Unsupervised Learning in Machine Learning
51 pages
Unsupervised Learning: K-Means Clustering
No ratings yet
Unsupervised Learning: K-Means Clustering
18 pages
Unsupervised Machine Learning Overview
No ratings yet
Unsupervised Machine Learning Overview
18 pages
Clustering Techniques in Machine Learning
No ratings yet
Clustering Techniques in Machine Learning
13 pages
Unsupervised Learning: Clustering Methods
No ratings yet
Unsupervised Learning: Clustering Methods
23 pages
Hierarchical Reinforcement Learning Overview
No ratings yet
Hierarchical Reinforcement Learning Overview
32 pages
Unsupervised Learning in Machine Learning
No ratings yet
Unsupervised Learning in Machine Learning
50 pages
K-Means Clustering Explained
No ratings yet
K-Means Clustering Explained
43 pages
Understanding Clustering in Machine Learning
No ratings yet
Understanding Clustering in Machine Learning
20 pages
AI CH 04 Lecture 01 Notes
No ratings yet
AI CH 04 Lecture 01 Notes
6 pages
Unsupervised Learning Techniques Overview
No ratings yet
Unsupervised Learning Techniques Overview
154 pages
K Means Clustering
No ratings yet
K Means Clustering
25 pages
Distance and Rule-Based Models in ML
No ratings yet
Distance and Rule-Based Models in ML
113 pages
Clustering Techniques and K-Means Guide
No ratings yet
Clustering Techniques and K-Means Guide
23 pages
K-Means Clustering in Unsupervised Learning
No ratings yet
K-Means Clustering in Unsupervised Learning
14 pages
Unsupervised Learning in Machine Learning
No ratings yet
Unsupervised Learning in Machine Learning
32 pages
Clustering and Ensemble Methods Overview
No ratings yet
Clustering and Ensemble Methods Overview
28 pages
K-Means Clustering Overview
No ratings yet
K-Means Clustering Overview
22 pages
Unsupervised Clustering Techniques Explained
No ratings yet
Unsupervised Clustering Techniques Explained
29 pages
Chapter 4
No ratings yet
Chapter 4
96 pages
Unsupervised Learning: Clustering Methods
No ratings yet
Unsupervised Learning: Clustering Methods
56 pages
K-Means Clustering Explained
No ratings yet
K-Means Clustering Explained
98 pages
PCA Variants in Unsupervised Learning
No ratings yet
PCA Variants in Unsupervised Learning
42 pages
Unsupervised Learning in Machine Learning
No ratings yet
Unsupervised Learning in Machine Learning
96 pages
Understanding Clustering Algorithms in ML
No ratings yet
Understanding Clustering Algorithms in ML
47 pages
Unsupervised Learning: Clustering Methods
No ratings yet
Unsupervised Learning: Clustering Methods
27 pages
Unsupervised Learning: Clustering Basics
No ratings yet
Unsupervised Learning: Clustering Basics
7 pages
K-Means Clustering in Unsupervised Learning
No ratings yet
K-Means Clustering in Unsupervised Learning
23 pages
DSUP Exp5 A23 Merged
No ratings yet
DSUP Exp5 A23 Merged
12 pages
K-Means Clustering Explained
No ratings yet
K-Means Clustering Explained
25 pages
Unsupervised Machine Learning Methods
No ratings yet
Unsupervised Machine Learning Methods
8 pages
Understanding Clustering in Unsupervised Learning
No ratings yet
Understanding Clustering in Unsupervised Learning
14 pages
Clustering Techniques in Machine Learning
No ratings yet
Clustering Techniques in Machine Learning
37 pages
Understanding K-Means Clustering Basics
No ratings yet
Understanding K-Means Clustering Basics
6 pages
K-Means Clustering Explained
No ratings yet
K-Means Clustering Explained
18 pages
K-Means and Hierarchical Clustering Guide
No ratings yet
K-Means and Hierarchical Clustering Guide
11 pages
K-Means Clustering in Machine Learning
No ratings yet
K-Means Clustering in Machine Learning
11 pages
Understanding Unsupervised Learning Techniques
No ratings yet
Understanding Unsupervised Learning Techniques
9 pages
KMeans Report
No ratings yet
KMeans Report
33 pages
Unit 4
No ratings yet
Unit 4
124 pages
Clustering Techniques in Unsupervised Learning
No ratings yet
Clustering Techniques in Unsupervised Learning
7 pages
13b K Means Clustering Clustering Concept PPTX Lyst7529
No ratings yet
13b K Means Clustering Clustering Concept PPTX Lyst7529
13 pages
K-Means Clustering Analysis in Jaipur
100% (1)
K-Means Clustering Analysis in Jaipur
26 pages
Disadvantages of K-Means Clustering
No ratings yet
Disadvantages of K-Means Clustering
59 pages
Understanding Clustering Techniques
No ratings yet
Understanding Clustering Techniques
56 pages
Hand Gesture Calculator with OpenCV
No ratings yet
Hand Gesture Calculator with OpenCV
2 pages
Create SmartAgent in NS-2 with C++ TCL
No ratings yet
Create SmartAgent in NS-2 with C++ TCL
7 pages
Theory of Computation Overview
No ratings yet
Theory of Computation Overview
159 pages
EmpowerU: AI Chatbot for Student Guidance
No ratings yet
EmpowerU: AI Chatbot for Student Guidance
1 page
Quicksort Algorithm: Analysis & Techniques
No ratings yet
Quicksort Algorithm: Analysis & Techniques
51 pages
Warshall's Algorithm for Reachability Matrix
No ratings yet
Warshall's Algorithm for Reachability Matrix
3 pages
MATLAB Optimization Methods Explained
No ratings yet
MATLAB Optimization Methods Explained
11 pages
Wavelet Galerkin Method for PDEs
No ratings yet
Wavelet Galerkin Method for PDEs
12 pages
Quadratic Polynomial Discriminants Explained
No ratings yet
Quadratic Polynomial Discriminants Explained
2 pages
Regression Analysis Summary Output
No ratings yet
Regression Analysis Summary Output
19 pages
Solving Diet Problems with Linear Programming
No ratings yet
Solving Diet Problems with Linear Programming
18 pages
History of MLPs and Backpropagation
No ratings yet
History of MLPs and Backpropagation
57 pages
Regression Analysis of Sales Data
No ratings yet
Regression Analysis of Sales Data
8 pages
Understanding Simple Linear Regression
No ratings yet
Understanding Simple Linear Regression
5 pages
Quadratic Functions and Polynomials Overview
No ratings yet
Quadratic Functions and Polynomials Overview
11 pages
Design and Analysis of Algorithms Questions
100% (1)
Design and Analysis of Algorithms Questions
70 pages
Python BFS Algorithm Implementation
No ratings yet
Python BFS Algorithm Implementation
3 pages
Variable Neighborhood Search Overview
No ratings yet
Variable Neighborhood Search Overview
9 pages
SF2822 Nonlinear Optimization Exam Solutions
No ratings yet
SF2822 Nonlinear Optimization Exam Solutions
3 pages
Grade 7 Math Summative Test Term 3
No ratings yet
Grade 7 Math Summative Test Term 3
3 pages
Linear Programming Optimization Methods
No ratings yet
Linear Programming Optimization Methods
3 pages
CD3281 Lab Manual: Python ADTs
No ratings yet
CD3281 Lab Manual: Python ADTs
34 pages
15forteen Computational Methods Overview
No ratings yet
15forteen Computational Methods Overview
6 pages
Algorithm Analysis: Efficiency & Complexity
No ratings yet
Algorithm Analysis: Efficiency & Complexity
7 pages
G10 Mathematics Long Quiz 1
No ratings yet
G10 Mathematics Long Quiz 1
4 pages
Searching and Sorting Algorithms Guide
No ratings yet
Searching and Sorting Algorithms Guide
48 pages
Polynomial Operations with Linked Lists
No ratings yet
Polynomial Operations with Linked Lists
18 pages
Algorithm Analysis and Efficiency Methods
No ratings yet
Algorithm Analysis and Efficiency Methods
8 pages
Scilab Numerical Techniques Manual
No ratings yet
Scilab Numerical Techniques Manual
33 pages
Traveling Salesman Problem Algorithms Guide
No ratings yet
Traveling Salesman Problem Algorithms Guide
6 pages
Logistic Regression Model Example
No ratings yet
Logistic Regression Model Example
4 pages
Quadratic Polynomial Zeroes Assessment
No ratings yet
Quadratic Polynomial Zeroes Assessment
2 pages
Shellsort - Wikipedia
No ratings yet
Shellsort - Wikipedia
5 pages
Class 9 Summer Homework: Number System & Polynomials
No ratings yet
Class 9 Summer Homework: Number System & Polynomials
2 pages

Unsupervised Learning: Clustering Methods

Uploaded by

Unsupervised Learning: Clustering Methods

Uploaded by

Machine Learning

UNIT III 9 Hrs

• Unsupervised learning is defined as the task performed by algorithms that learn

• Unsupervised learning encompasses many techniques that seek to summarize and

• The k-means clustering algorithm mainly performs two tasks:

• Hence each cluster has datapoints with some commonalities, and it is

• In the above formula of WCSS,

• It is widely used for many applications which include-

• In this algorithm, we develop the hierarchy of clusters in the form of a

• Sometimes the results of K-means clustering and hierarchical

[Link]: Agglomerative is a bottom-up approach, in which the

[Link]: Divisive algorithm is the reverse of the agglomerative

• To group the datasets into clusters, it follows the bottom-up

• This hierarchy of clusters is represented in the form of 20 the

• These measures are called Linkage methods. Some of the popular

• It is also one of the most popular linkage methods.

• From the above-given approaches, we can apply any of them

• Unlike traditional clustering algorithms that use distance metrics, K-

• K-Modes is ideal for clustering categorical data such as customer

• But for categorical data points, we cannot calculate the distance. So

[Link] the dissimilarities and assign each observation to its closest

[Link] new modes for the clusters.

[Link] 2–3 steps until there are is no re-assignment required

• Likewise, calculate all the

• Obtained new leaders; Repeat steps 2–4 47

• It tries to find some interesting relations or associations among the

• It is based on different rules to discover the interesting relations

• The association rule learning is one of the very important concepts of

• We can understand it by taking an example of a supermarket, as in a

• Association rule learning can be divided into three types of algorithms:

• Here the If element is called antecedent, and then statement is called as

• It has three possible values:

• It represents the database in the form of a tree structure that is known as a

• This algorithm uses a depth-first search technique to find frequent itemsets in a

• It performs faster execution than Apriori Algorithm.

• Protein Sequence: The association rules help in determining the synthesis of

You might also like