Clustering

Uploaded by

Priyam Ranjan

Clustering

• Clustering, or cluster analysis, is a machine learning technique that groups the data
points of an unlabelled dataset.

• It can be defined as a way of grouping data points into clusters of similar points:
objects that resemble each other stay in the same group, which has little or no
similarity to the other groups.

• It does so by finding similar patterns in the unlabelled dataset, such as shape, size,
color, or behavior, and dividing the data points according to the presence or absence of
those patterns.

• It is an unsupervised learning method: no supervision is provided to the algorithm,
and it works on unlabelled data.

• After clustering, each cluster or group is assigned a cluster ID. An ML system can use
this ID to simplify the processing of large and complex datasets.

• The clustering technique is commonly used for statistical data analysis.

Example: Let's understand the clustering technique with the real-world example of a shopping mall:

• When we visit a shopping mall, we can observe that items with similar uses are
grouped together.

• For example, t-shirts are grouped in one section and trousers in another. Similarly,
in the fruit section, apples, bananas, mangoes, etc. are kept in separate groups, so
that we can easily find what we need.

• The clustering technique works in the same way. Another example of clustering is
grouping documents by topic.

• Clustering is used in a wide range of tasks. Some of the most common uses of this
technique are:
• Market Segmentation
• Statistical data analysis
• Social network analysis
• Image segmentation
• Anomaly detection, etc.

• Apart from these general uses, Amazon uses it in its recommendation system to
recommend products based on a user's past searches.

• Netflix also uses this technique to recommend movies and web series to its users
based on their watch history.
Types of Clustering Methods

• Clustering methods are broadly divided into hard clustering (each data point belongs to
exactly one group) and soft clustering (a data point can belong to more than one group).

• Various other approaches to clustering also exist. Below are the main clustering
methods used in machine learning:
1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering
Partitioning Clustering

• It is a type of clustering that divides the data into
non-hierarchical groups. It is also known as the
centroid-based method.

• The most common example of partitioning
clustering is the K-Means Clustering algorithm.

• In this type, the dataset is divided into k groups,
where k is the number of groups, chosen in
advance.

• The cluster centers are chosen so that each data
point is closer to the centroid of its own cluster
than to the centroid of any other cluster.
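
The assign-then-update idea behind K-Means can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function name `kmeans`, the sample `points`, the fixed number of iterations, and the choice of the first k points as starting centroids are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=10):
    """Lloyd-style k-means on a list of (x, y) tuples."""
    # Assumption: the first k points are used as the starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids

# Hypothetical sample data: two well-separated groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)     # one cluster ID per point
print(centroids)  # one centroid per cluster
```

The returned `labels` play the role of the cluster IDs, and k has to be supplied up front.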
Density-Based Clustering

• The density-based clustering method connects
highly dense areas into clusters; clusters of
arbitrary shape can form as long as the dense
regions can be connected.

• It does this by identifying distinct clusters in the
dataset and connecting areas of high density into
clusters.

• The dense areas in the data space are separated
from each other by sparser areas.

• These algorithms can have difficulty clustering
the data points if the dataset has varying densities
or high dimensionality.
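
The text above does not name a specific algorithm, so the sketch below uses DBSCAN, a well-known density-based method, as a stand-in. The names `dbscan`, `eps` (neighbourhood radius), and `min_pts` (minimum neighbours for a dense point), as well as the sample data, are assumptions for illustration only.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # sparse region; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a dense point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is itself dense: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical sample data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Unlike the partitioning approach, the number of clusters is not given in advance; it follows from how the dense regions connect, and points in sparse areas are left as noise.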
Distribution Model-Based Clustering

• In the distribution model-based clustering
method, the data is divided based on the
probability that a data point belongs to a
particular distribution.

• The grouping is done by assuming some
distribution, most commonly the Gaussian
distribution.

• An example of this type is the
Expectation-Maximization clustering
algorithm, which uses Gaussian Mixture Models
(GMM).
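
As a rough sketch of how Expectation-Maximization fits a Gaussian mixture, the example below works on one-dimensional data. The function names, the starting means, the unit starting variances, the equal starting weights, and the sample data are all assumptions chosen to keep the example small.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, iterations=50):
    """Fit a 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, responsibilities), where the
    responsibilities come from the last E-step.
    """
    k = len(means)
    means = list(means)
    variances = [1.0] * k      # assumption: unit starting variances
    weights = [1.0 / k] * k    # assumption: equal starting weights
    resp = []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[c] * gaussian_pdf(x, means[c], variances[c])
                    for c in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / n_c
            spread = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / n_c
            variances[c] = max(spread, 1e-6)  # floor avoids a zero variance
    return weights, means, variances, resp

# Hypothetical sample data drawn around two values.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
weights, means, variances, resp = em_gmm_1d(data, means=[0.0, 6.0])
hard_labels = [row.index(max(row)) for row in resp]
print(means, hard_labels)
```

Each row of `resp` is a set of probabilities that sums to 1; taking the most probable component per point turns the soft assignment into a hard one.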
Hierarchical Clustering

• Hierarchical clustering can be used as an
alternative to partitioning clustering, as
there is no need to pre-specify the
number of clusters to be created.

• In this technique, the dataset is divided into
clusters that form a tree-like structure, which
is called a dendrogram.

• Any desired number of clusters can be
obtained by cutting the tree at the
appropriate level.

• The most common example of this method is
the agglomerative hierarchical clustering algorithm.
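
The bottom-up merging used by agglomerative clustering can be sketched as follows. Single linkage (distance between the closest members) is an assumption made here; the text does not specify a linkage rule. The function name, the sample data, and the idea of "cutting the tree" by stopping at `n_clusters` are likewise illustrative choices.

```python
import math

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering.

    Returns (clusters, history): clusters are lists of point indices and
    history records every merge, which is the information a dendrogram shows.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    history = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            ((x, y) for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        history.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters, history

# Hypothetical sample data.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
three, _ = agglomerative(points, n_clusters=3)
two, history = agglomerative(points, n_clusters=2)
print(sorted(sorted(c) for c in three))
print(sorted(sorted(c) for c in two))
```

Asking for three clusters or two clusters corresponds to cutting the same merge tree at different levels; the merges themselves do not change.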
Fuzzy Clustering

• Fuzzy clustering is a type of soft clustering in which a data object may belong to more
than one group or cluster.

• Each data point has a set of membership coefficients, which give its degree of
membership in each cluster.

• The Fuzzy C-means algorithm is an example of this type of clustering; it is sometimes
also known as the Fuzzy k-means algorithm.
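
A minimal sketch of Fuzzy C-means shows how the membership coefficients arise. The function name, the fuzzifier `m=2.0`, the fixed starting centers, the iteration count, and the sample data are assumptions for illustration.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    """Fuzzy c-means on (x, y) tuples; returns (memberships, centers)."""
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers get larger coefficients,
        # and the coefficients of each point sum to 1.
        memberships = []
        for p in points:
            # Small floor avoids division by zero when a point sits on a center.
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            row = [
                1.0 / sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center update: mean of all points, weighted by membership ** m.
        for j in range(len(centers)):
            w = [row[j] ** m for row in memberships]
            centers[j] = tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                for d in range(len(points[0]))
            )
    return memberships, centers

# Hypothetical sample data: two groups plus one point in between.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0), (5.0, 5.0)]
memberships, centers = fuzzy_c_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
for p, row in zip(points, memberships):
    print(p, [round(u, 3) for u in row])
```

Points near a center receive a coefficient close to 1 for that cluster, while the in-between point keeps a substantial share in both, which is what distinguishes this from hard clustering.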
