Unit 4-L2

Uploaded by

Vanshika Tyagi

Unit 4 : Types of machine learning

Lecture 4: Unsupervised Learning: Clustering
Table of Contents
• Clustering
• Types of Clustering Methods
• Clustering Algorithms
• Applications of Clustering
Clustering
• Clustering groups data points into clusters so that points in the same cluster are similar
to one another and have little or no similarity to points in other clusters.

Figure: 1
• It is an unsupervised learning method: the algorithm receives no supervision and works
on an unlabeled dataset.
• After clustering, each cluster (group) is assigned a cluster ID. An ML system can use this
ID to simplify the processing of large and complex datasets.

Types of Clustering Methods

• Clustering methods are broadly divided into:
• Hard clustering (each data point belongs to exactly one group) and
• Soft clustering (a data point can belong to more than one group).
• Several other approaches to clustering also exist:
• Partitioning Clustering
• Density-Based Clustering
• Distribution Model-Based Clustering
• Hierarchical Clustering
• Fuzzy Clustering
• Partitioning Clustering: It is a type of clustering that divides the data into non-
hierarchical groups.
• It is also known as the centroid-based method.

Figure: 2
• The most common example of partitioning clustering is the K-Means clustering
algorithm.
• In this type, the dataset is divided into K groups, where K is the number of groups
chosen in advance.
• The cluster centers are chosen so that each data point is closer to the centroid of its
own cluster than to the centroid of any other cluster.
• Density-Based Clustering: The density-based clustering method connects highly
dense areas into clusters, so clusters of arbitrary shape can form as long as
the dense regions can be connected.
• The algorithm identifies areas of high density in the dataset and joins neighboring
dense areas into a single cluster.
• The dense areas in the data space are separated from each other by sparser areas.
• These algorithms can have difficulty clustering the data points if the dataset has
varying densities or high dimensionality.

Figure: 3
• Distribution Model-Based Clustering: In the distribution model-based clustering
method, the data is divided based on the probability that a data point belongs to a
particular distribution.

Figure: 4
• The grouping is done by assuming a distribution, most commonly the Gaussian distribution.
• An example of this type is the Expectation-Maximization clustering algorithm, which uses
Gaussian Mixture Models (GMM).
• Hierarchical Clustering: Hierarchical clustering can be used as an alternative to
partitioning clustering because the number of clusters does not need to be specified
in advance. In this technique, the dataset is divided into clusters that form a tree-like
structure, which is called a dendrogram.

Figure: 5
• Any number of clusters can be obtained by cutting the tree at the
appropriate level.
• The most common example of this method is the Agglomerative Hierarchical algorithm.
• Fuzzy Clustering: Fuzzy clustering is a soft method in which a data object may
belong to more than one group or cluster.
• Each data point has a set of membership coefficients, one per cluster, that give its
degree of membership in that cluster.
• The Fuzzy C-means algorithm is an example of this type of clustering; it is sometimes also
known as the Fuzzy k-means algorithm.
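The membership coefficients described above can be illustrated with a small sketch. It assumes the standard Fuzzy C-means membership formula with fuzzifier m = 2 and fixed, hand-picked cluster centres; the function name `fuzzy_memberships` and the sample values are hypothetical, and a full algorithm would also update the centres.

```python
import math

def fuzzy_memberships(point, centres, m=2.0):
    """Membership of `point` in each cluster, given fixed centres (sums to 1)."""
    dists = [math.dist(point, c) for c in centres]
    if 0.0 in dists:
        # The point sits exactly on a centre: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)); closer centres get larger u_j.
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]

# Illustrative values: two centres on a line, and a point nearer the first.
centres = [(0.0, 0.0), (4.0, 0.0)]
u = fuzzy_memberships((1.0, 0.0), centres)
print([round(x, 3) for x in u])
```

Here the point is three times closer to the first centre, so it belongs mostly, but not entirely, to the first cluster. A hard method would simply assign it to that cluster.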
Clustering Algorithms
• Clustering Algorithms: Clustering algorithms can be grouped according to the models
explained above.
• The choice of clustering algorithm depends on the kind of data being used.
• For example, some algorithms need the number of clusters in the dataset to be specified
or estimated, whereas others work from the minimum distance between observations in
the dataset.
• Some popular Clustering algorithms:
• K-Means algorithm
• Mean-shift algorithm
• DBSCAN Algorithm
• Expectation-Maximization Clustering using GMM
• Agglomerative Hierarchical algorithm
• Affinity Propagation
• K-Means algorithm: The k-means algorithm is one of the most popular clustering
algorithms.
• It partitions the dataset by assigning the samples to clusters of roughly equal
variance.
• The number of clusters must be specified in advance. It is fast and needs few
computations: for a fixed number of clusters, each iteration takes time linear in the
number of samples, O(n).
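As a rough illustration of k-means, the sketch below alternates an assignment step and an update step (often called Lloyd's algorithm). The function name `kmeans`, the toy points, and the hand-picked initial centroids are assumptions for this example; practical implementations choose the starting centroids more carefully.

```python
import math

def kmeans(points, centroids, iters=100):
    """Minimal k-means sketch; returns (centroids, labels)."""
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return centroids, labels

# Two well-separated groups; the initial centroids are picked by hand.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, centroids=[(1.0, 1.0), (5.0, 5.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each pass over the data visits every point once per centroid, which is where the linear cost per iteration in the number of samples comes from.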
• Mean-shift algorithm: The mean-shift algorithm tries to find the dense areas in a smooth
density of data points.
• It is an example of a centroid-based model: it repeatedly updates each candidate
centroid to be the mean of the points within a given region.
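The centroid update described above can be sketched with a flat (uniform) kernel, where the "given region" is a ball of radius `bandwidth` around the candidate. The function name, the bandwidth value, and the rule for merging nearby candidates are assumptions made for this illustration.

```python
import math

def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean-shift sketch; returns (centres, labels)."""
    modes = []
    for start in points:
        centre = start
        for _ in range(iters):
            # Points inside the window around the current candidate.
            window = [p for p in points if math.dist(p, centre) <= bandwidth]
            # Shift the candidate to the mean of those points.
            new_centre = tuple(sum(x) / len(window) for x in zip(*window))
            shift = math.dist(new_centre, centre)
            centre = new_centre
            if shift < tol:
                break
        modes.append(centre)
    # Candidates that ended up close together are treated as one cluster.
    centres, labels = [], []
    for mode in modes:
        for idx, c in enumerate(centres):
            if math.dist(mode, c) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return centres, labels

points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centres, labels = mean_shift(points, bandwidth=1.0)
print(len(centres), labels)  # 2 [0, 0, 0, 1, 1, 1]
```

Unlike k-means, the number of clusters is not passed in; it falls out of the bandwidth, which is the parameter that has to be chosen instead.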
• DBSCAN Algorithm: DBSCAN stands for Density-Based Spatial Clustering of Applications with
Noise. It is a density-based model similar to mean-shift, but with some notable
advantages; for example, points in sparse regions are treated as noise rather than forced
into a cluster. In this algorithm, areas of high density are separated by areas of low
density, so clusters of arbitrary shape can be found.
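A compact sketch of the idea follows. It assumes the usual two DBSCAN parameters, a neighbourhood radius `eps` and a minimum neighbour count `min_pts`, neither of which is named in the slides; the toy data and parameter values are illustrative only.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense itself
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is in a dense area, so keep expanding
                queue.extend(nbrs)
    return labels

points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 0.0)]  # the last point is isolated
labels = dbscan(points, eps=0.6, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The isolated point is labelled -1 (noise) instead of being attached to the nearest cluster, and the number of clusters is discovered rather than specified.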
• Expectation-Maximization Clustering using GMM: This algorithm can be used as an
alternative to the k-means algorithm, or in cases where k-means fails.
• In a GMM, the data points are assumed to be drawn from a mixture of Gaussian distributions.
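To make the Gaussian assumption concrete, here is a minimal one-dimensional EM sketch for a mixture of Gaussians. The function names, the toy data, and the starting parameters are all assumptions for illustration; real implementations work in log space and handle multi-dimensional covariances.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, iters=30):
    """EM for a 1-D Gaussian mixture; returns (means, variances, weights, resp)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    resp = []
    for _ in range(iters):
        # E-step: probability that each component generated each point.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                    for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights, resp

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights, resp = em_gmm_1d(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
# Hard labels, if needed, are the most probable component for each point.
hard = [max(range(2), key=lambda j: r[j]) for r in resp]
```

The responsibilities in `resp` are soft assignments, so this is also an example of soft clustering; taking the most probable component turns them into hard labels.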
• Agglomerative Hierarchical algorithm: The agglomerative hierarchical algorithm
performs bottom-up hierarchical clustering.
• Each data point is treated as a single cluster at the outset, and the clusters are then
successively merged. The cluster hierarchy can be represented as a tree structure.
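The bottom-up merging can be sketched as follows. The example assumes single linkage (the distance between two clusters is the distance between their closest members), which is only one of several possible linkage rules; the function name and data are hypothetical.

```python
import math

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative sketch; returns (clusters, merges)."""
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []  # record of (cluster_a, cluster_b, distance) in merge order

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters that are closest to each other.
        a, b = min(
            ((x, y) for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[a], clusters[b],
                       linkage(clusters[a], clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges

points = [(1.0,), (1.5,), (5.0,), (5.4,), (9.0,)]
clusters, merges = agglomerative(points, n_clusters=3)
print(clusters)  # [[0, 1], [2, 3], [4]]
```

The `merges` list is the information a dendrogram draws, and stopping at `n_clusters` corresponds to cutting that tree at a chosen level.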
• Affinity Propagation: It differs from other clustering algorithms in that it does not
require the number of clusters to be specified.
• Messages are exchanged between pairs of data points until
convergence.
• Its main drawback is its O(N²T) time complexity, where N is the number of samples and T
is the number of iterations.
Applications of Clustering
• In Identification of Cancer Cells: Clustering algorithms are widely used for the
identification of cancerous cells.
• They divide the cancerous and non-cancerous data into different groups.
• In Search Engines: Search engines also use clustering. Search
results are returned based on the objects closest to the search query.
• This is done by grouping similar data objects in one group that is far from
dissimilar objects.
• The accuracy of a query's results depends on the quality of the clustering algorithm
used.
• Customer Segmentation: It is used in market research to segment customers based
on their choices and preferences.
• In Biology: It is used in biology to classify different species of plants and
animals using image recognition techniques.
• In Land Use: Clustering is used to identify areas of similar land use
in a GIS database.
• This helps determine the purpose for which a particular piece of land is
most suitable.
Thank You
