Clustering Techniques in Data Mining Lab

The document outlines a lab experiment for TE CSE students at Finolex Academy, focusing on clustering using open-source tools like Weka. It details lab objectives, outcomes, practical applications, and provides examples of K-means clustering in both one-dimensional and two-dimensional cases. The conclusion emphasizes the relevance of clustering algorithms in industry and engineering, along with the skills developed through the experiment.

Uploaded by

sakwarefaisal
Hope Foundation’s

Finolex Academy of Management and Technology, Ratnagiri

Department of Computer Science and Engineering (AIML)

Subject name: Data Warehousing and Mining Lab    Subject Code: CSL503

Class: TE CSE    Semester: VI (CBCGS)    Academic year: 2024-25

Name of Student: Zain Munawar Solkar    Quiz Score:

Roll No.: 75    Experiment No.: 04
Title: Perform clustering using open-source tools.

1. Lab objectives applicable:


LOB4: To make students well versed in all data mining algorithms, methods, and tools.
2. Lab outcomes applicable:
LO3: Demonstrate an understanding of the importance of data mining.
LO6: Implement the appropriate data mining methods like classification, clustering or Frequent Pattern mining on large
data sets.
3. Learning Objectives:
1. To determine similarity and dissimilarity among elements and create clusters accordingly.
4. Practical applications of the assignment/experiment:
Clustering algorithms group similar data points together to uncover patterns and relationships, enhancing data
analysis and decision-making.
5. Prerequisites:
NA
6. Minimum Hardware Requirements:
1. Intel Core i-series processor, 4 GB RAM
7. Software Requirements:
1. Weka 3.8
8. Quiz Questions
[Link]
wform?usp=sf_link
9. Experiment/Assignment Evaluation:
Sr. No.   Parameters                                                                Marks obtained   Out of
1         Technical Understanding (assessment may be based on Q & A or any other                     6
          relevant method; the teacher should mention the other method used)
2         Lab Performance                                                                            2
3         Punctuality                                                                                2
          Date of performance (DOP):                         Total marks obtained:                   10

Signature of Faculty

Department of Computer Science and Engineering


10. Theory:

Solve by hand the examples that are given as input to Weka: one one-dimensional and one two-dimensional K-means problem.
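The data sets of the two examples can be written out in ARFF, the plain-text input format that Weka reads. The following is only a minimal sketch, not the listing used in the lab: the helper `to_arff`, the relation names and the attribute names are hypothetical, and only numeric attributes are written.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text with numeric attributes only (hypothetical helper)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


# One-dimensional example (Q.1): a single numeric attribute.
values = [13, 16, 29, 78, 21, 43, 56, 90, 21, 8, 88, 60, 34]
arff_1d = to_arff("kmeans_1d", ["value"], [[v] for v in values])

# Two-dimensional example (Q.2): weight index and pH of the four medicines.
medicines = [(1, 1), (2, 1), (4, 3), (5, 4)]
arff_2d = to_arff("medicines", ["weight_index", "ph"], medicines)

print(arff_2d)
```

The resulting text can be saved with an `.arff` extension and opened in the Weka Explorer before choosing a clusterer.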

Q.1) Implement K-means clustering to form 2 clusters from the following data:

{13, 16, 29, 78, 21, 43, 56, 90, 21, 8, 88, 60, 34}

Solution: -
Step 1 –
K=2
Let the two clusters be K1 and K2 with means M1 and M2 respectively
M1=29, M2=13
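Step 2 –
Iteration 1 (M1 = 29, M2 = 13): assign each value to the nearer mean. The value 21 is equidistant from both means (|21 − 29| = |21 − 13| = 8); it is placed in K1 here.
Cluster K1: {29, 78, 21, 43, 56, 90, 21, 88, 60, 34}
Cluster K2: {13, 16, 8}
New M1 = (29 + 78 + 21 + 43 + 56 + 90 + 21 + 88 + 60 + 34) / 10 = 520 / 10 = 52.0
New M2 = (13 + 16 + 8) / 3 = 37 / 3 ≈ 12.33

Iteration 2 (M1 = 52.0, M2 ≈ 12.33): 29 is now closer to M2 (|29 − 12.33| ≈ 16.67) than to M1 (|29 − 52| = 23), so it moves to K2 together with both 21s.
Cluster K1: {78, 43, 56, 90, 88, 60, 34}
Cluster K2: {13, 16, 29, 21, 21, 8}
New M1 = (78 + 43 + 56 + 90 + 88 + 60 + 34) / 7 = 449 / 7 ≈ 64.14
New M2 = (13 + 16 + 29 + 21 + 21 + 8) / 6 = 108 / 6 = 18.0

Iteration 3 (M1 ≈ 64.14, M2 = 18.0): 34 is closer to M2 (16.00) than to M1 (≈ 30.14), so it moves to K2.
Cluster K1: {78, 43, 56, 90, 88, 60}
Cluster K2: {13, 16, 29, 21, 21, 8, 34}
New M1 = (78 + 43 + 56 + 90 + 88 + 60) / 6 = 415 / 6 ≈ 69.17
New M2 = (13 + 16 + 29 + 21 + 21 + 8 + 34) / 7 = 142 / 7 ≈ 20.29

Iteration 4 (M1 ≈ 69.17, M2 ≈ 20.29): 43 is closer to M2 (≈ 22.71) than to M1 (≈ 26.17), so it moves to K2.
Cluster K1: {78, 56, 90, 88, 60}
Cluster K2: {13, 16, 29, 21, 21, 8, 34, 43}
New M1 = (78 + 56 + 90 + 88 + 60) / 5 = 372 / 5 = 74.4
New M2 = (13 + 16 + 29 + 21 + 21 + 8 + 34 + 43) / 8 = 185 / 8 ≈ 23.13

Iteration 5 (M1 = 74.4, M2 ≈ 23.13):
Cluster K1: {78, 56, 90, 88, 60}
Cluster K2: {13, 16, 29, 21, 21, 8, 34, 43}

No changes in the clusters, so the algorithm has converged.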

Step 3 –
The final clusters are:
K1 (Mean ≈ 74.4): {78, 56, 90, 88, 60}
K2 (Mean ≈ 23.13): {13, 16, 29, 21, 21, 8, 34, 43}
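
The hand calculation above can be checked with a short script. This is a minimal sketch in plain Python rather than Weka's implementation; the function name `kmeans_1d` is hypothetical, and it assumes that a value equidistant from both means goes to the first mean (M1), as 21 does in the first pass of the worked solution.

```python
def kmeans_1d(points, means, max_iter=100):
    """K-means on a list of numbers; ties go to the earlier mean."""
    means = list(means)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest mean.
        new_clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)), key=lambda i: abs(p - means[i]))
            new_clusters[nearest].append(p)
        # Stop when the assignment no longer changes.
        if new_clusters == clusters:
            break
        clusters = new_clusters
        # Update step: recompute each mean (assumes no cluster is empty).
        means = [sum(c) / len(c) for c in clusters]
    return clusters, means


values = [13, 16, 29, 78, 21, 43, 56, 90, 21, 8, 88, 60, 34]
clusters, means = kmeans_1d(values, [29, 13])  # M1 = 29, M2 = 13
print(clusters)
print(means)  # [74.4, 23.125]
```

The clusters are printed in input order, so K2 appears as [13, 16, 29, 21, 43, 21, 8, 34]; the membership is the same as in the final answer above.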

Q.2) Apply K-means clustering to form 2 clusters.

Object      Attribute 1 (X): Weight index   Attribute 2 (Y): pH
MedicineA   1                               1
MedicineB   2                               1
MedicineC   4                               3
MedicineD   5                               4

Solution: -
Step 1 –
K=2
Let the two clusters be K1 and K2 with means M1 and M2 respectively
M1=MedicineC (4,3), M2=MedicineA (1,1)


Step 2 –

Object      Coordinates   Distance to M1 (4,3)                   Distance to M2 (1,1)                     Assigned cluster
MedicineA   (1,1)         √((4 − 1)² + (3 − 1)²) = √13 ≈ 3.61    √((1 − 1)² + (1 − 1)²) = 0.00            K2
MedicineB   (2,1)         √((4 − 2)² + (3 − 1)²) = √8 ≈ 2.83     √((1 − 2)² + (1 − 1)²) = √1 = 1.00       K2
MedicineC   (4,3)         √((4 − 4)² + (3 − 3)²) = 0.00          √((1 − 4)² + (1 − 3)²) = √13 ≈ 3.61      K1
MedicineD   (5,4)         √((4 − 5)² + (3 − 4)²) = √2 ≈ 1.41     √((1 − 5)² + (1 − 4)²) = √25 = 5.00      K1
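
The distances in this table can be recomputed with a few lines of Python. This is an illustrative sketch only: the function name `distance_table` and the dictionary layout are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

medicines = {
    "MedicineA": (1, 1),
    "MedicineB": (2, 1),
    "MedicineC": (4, 3),
    "MedicineD": (5, 4),
}
initial_means = {"K1": (4, 3), "K2": (1, 1)}  # M1 = MedicineC, M2 = MedicineA


def distance_table(objects, means):
    """Map each object to its Euclidean distances and its nearest cluster."""
    table = {}
    for name, point in objects.items():
        distances = {label: math.dist(point, mean) for label, mean in means.items()}
        table[name] = (distances, min(distances, key=distances.get))
    return table


for name, (distances, cluster) in distance_table(medicines, initial_means).items():
    rounded = {label: round(d, 2) for label, d in distances.items()}
    print(name, rounded, cluster)
```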

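K1: {MedicineC (4, 3), MedicineD (5, 4)}
K2: {MedicineA (1, 1), MedicineB (2, 1)}
Updated means (the coordinate-wise average of each cluster's members):
M1 = ((4 + 5) / 2, (3 + 4) / 2) = (4.5, 3.5)
M2 = ((1 + 2) / 2, (1 + 1) / 2) = (1.5, 1)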

Object      Coordinates   Distance to M1 (4.5,3.5)                      Distance to M2 (1.5,1)                         Assigned cluster
MedicineA   (1,1)         √((4.5 − 1)² + (3.5 − 1)²) = √18.5 ≈ 4.30     √((1.5 − 1)² + (1 − 1)²) = √0.25 = 0.50        K2
MedicineB   (2,1)         √((4.5 − 2)² + (3.5 − 1)²) = √12.5 ≈ 3.54     √((1.5 − 2)² + (1 − 1)²) = √0.25 = 0.50        K2
MedicineC   (4,3)         √((4.5 − 4)² + (3.5 − 3)²) = √0.5 ≈ 0.71      √((1.5 − 4)² + (1 − 3)²) = √10.25 ≈ 3.20       K1
MedicineD   (5,4)         √((4.5 − 5)² + (3.5 − 4)²) = √0.5 ≈ 0.71      √((1.5 − 5)² + (1 − 4)²) = √21.25 ≈ 4.61       K1

K1: {MedicineC (4, 3), MedicineD (5, 4)}
K2: {MedicineA (1, 1), MedicineB (2, 1)}
Updated means (unchanged):
M1 = (4.5, 3.5)
M2 = (1.5, 1)

No changes in the clusters, so the algorithm has converged.

Step 3 –
The final clusters are:
K1: {MedicineC (4, 3), MedicineD (5, 4)}
K2: {MedicineA (1, 1), MedicineB (2, 1)}
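
The two-dimensional solution can be checked the same way. The following is a minimal sketch in plain Python, not Weka's implementation; the function name `kmeans_2d` is hypothetical, cluster label 0 stands for K1 and label 1 for K2, and `math.dist` requires Python 3.8 or later.

```python
import math


def kmeans_2d(points, means, max_iter=100):
    """K-means on coordinate tuples; returns one cluster label per point."""
    means = [tuple(m) for m in means]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest mean by Euclidean distance.
        new_labels = [
            min(range(len(means)), key=lambda i: math.dist(p, means[i]))
            for p in points
        ]
        # Stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: coordinate-wise average (assumes no cluster is empty).
        for i in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            means[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, means


# MedicineA, MedicineB, MedicineC, MedicineD as (weight index, pH).
points = [(1, 1), (2, 1), (4, 3), (5, 4)]
labels, means = kmeans_2d(points, [(4, 3), (1, 1)])  # M1 = MedicineC, M2 = MedicineA
print(labels)  # [1, 1, 0, 0]
print(means)   # [(4.5, 3.5), (1.5, 1.0)]
```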


11. Outcome –

K-means 1D -
Source code:


Output:


K-means 2D -
Source code:


Output:


12. Learning Outcomes Achieved

1. Students are able to cluster the given data into k clusters, where k is a known number.

13. Conclusion:

1. Applications of the Studied Technique in Industry


Clustering algorithms, such as K-means or hierarchical clustering, are widely used in industry for customer
segmentation, market analysis, and anomaly detection. These techniques help businesses tailor marketing strategies,
optimize resource allocation, and identify unusual patterns or trends in large datasets.

2. Engineering Relevance

Clustering algorithms are crucial in engineering for solving complex problems related to pattern recognition, image
processing, and system optimization. They enable engineers to group similar data points, improve model accuracy, and
make informed decisions based on data-driven insights.

3. Skills Developed

The experiment with clustering algorithms enhances skills in data preprocessing, algorithm implementation, and result
interpretation. It also develops expertise in applying statistical techniques to solve real-world problems, as well as
proficiency in using data mining tools and software for effective data analysis.

