
K-Means Clustering Guide

K-means clustering is an algorithm to classify objects into K number of groups by minimizing the sum of squares of distances between data and the corresponding cluster centroid. The algorithm works by initializing centroids, assigning objects to the closest centroid, recomputing centroids, and repeating until convergence is reached. An example shows how k-means clustering with K=2 partitions a sample dataset into two clusters.

Uploaded by

Muneeba Hussain


K-MEANS CLUSTERING

INTRODUCTION
What is clustering?

• Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait, often according to some defined distance measure.
Common distance measures:

• The distance measure determines how the similarity of two elements is calculated, and it influences the shape of the clusters.

They include:
1. The Euclidean distance (also called the 2-norm distance), given by:
   $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
2. The Manhattan distance (also called the taxicab norm or 1-norm), given by:
   $d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
3. The maximum norm, given by:
   $d(x, y) = \max_{1 \le i \le n} |x_i - y_i|$
4. The Mahalanobis distance, which corrects data for different scales and correlations in the variables.
5. Inner product space: the angle between two vectors can be used as a distance measure when clustering high-dimensional data.
6. The Hamming distance, which measures the minimum number of substitutions required to change one member into another of the same length. It is sometimes loosely called edit distance, although edit distance in general also allows insertions and deletions.
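The measures listed above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names are chosen here and are not from the slides, and the Mahalanobis distance is left out because it needs an inverse covariance matrix estimated from data.

```python
import math


def euclidean(x, y):
    """2-norm distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """1-norm (taxicab) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def maximum_norm(x, y):
    """Maximum norm: largest absolute difference along any coordinate."""
    return max(abs(a - b) for a, b in zip(x, y))


def angle(x, y):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    cos = dot / (math.hypot(*x) * math.hypot(*y))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.acos(max(-1.0, min(1.0, cos)))


def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(a != b for a, b in zip(s, t))
```

For instance, `manhattan((1.0, 1.0), (5.0, 7.0))` adds the coordinate differences 4 and 6, while `maximum_norm` on the same pair keeps only the larger of the two.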
K-MEANS CLUSTERING
• The k-means algorithm clusters n objects, based on their attributes, into k partitions, where k < n.
• It is similar to the expectation-maximization algorithm for mixtures of Gaussians in that both attempt to find the centers of natural clusters in the data.
• It assumes that the object attributes form a vector space.
• Simply put, k-means clustering is an algorithm that classifies or groups objects, based on their attributes/features, into K groups.
• K is a positive integer.
• The grouping is done by minimizing the sum of squared distances between the data points and the corresponding cluster centroid.
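The quantity being minimized can be written out explicitly. The notation below is introduced here for illustration and does not appear on the slides: C_j is the set of objects in cluster j, \mu_j is its centroid, and distances are assumed to be Euclidean.

```latex
J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

For a fixed assignment of objects to clusters, choosing each \mu_j as the cluster mean gives the smallest possible J, which is why the centroid is recomputed as a mean after every reassignment.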
How does the K-Means clustering algorithm work?
• Step 1: Begin with a decision on the value of k, the number of clusters.
• Step 2: Choose any initial partition that classifies the data into k clusters. You may assign the training samples randomly, or systematically as follows:
1. Take the first k training samples as single-element clusters.
2. Assign each of the remaining (N-k) training samples to the cluster with the nearest centroid. After each assignment, recompute the centroid of the gaining cluster.
• Step 3: Take each sample in sequence and compute its distance from the centroid of each of the clusters. If a sample is not currently in the cluster with the closest centroid, switch it to that cluster and update the centroids of both the cluster gaining the sample and the cluster losing it.
• Step 4: Repeat Step 3 until convergence is achieved, that is, until a pass through the training samples causes no new assignments.
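The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the names `sequential_kmeans`, `centroid`, `nearest` and `max_passes` are chosen here, Euclidean distance is assumed, and the rule that a cluster is never emptied is an added safeguard that the slides do not mention.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(sample, centroids):
    """Index of the centroid closest to sample (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: math.dist(sample, centroids[j]))


def sequential_kmeans(samples, k, max_passes=100):
    # Step 2, part 1: the first k samples start as single-element clusters.
    labels = list(range(k)) + [None] * (len(samples) - k)
    members = [[samples[j]] for j in range(k)]
    centroids = [centroid(m) for m in members]
    # Step 2, part 2: assign each remaining sample to the nearest centroid
    # and recompute the centroid of the gaining cluster immediately.
    for i in range(k, len(samples)):
        j = nearest(samples[i], centroids)
        labels[i] = j
        members[j].append(samples[i])
        centroids[j] = centroid(members[j])

    # Steps 3 and 4: sweep over the samples until a full pass moves nothing.
    for _ in range(max_passes):
        moved = False
        for i, s in enumerate(samples):
            old, new = labels[i], nearest(s, centroids)
            # Assumption: never empty a cluster, so its centroid stays defined.
            if new != old and len(members[old]) > 1:
                members[old].remove(s)
                members[new].append(s)
                labels[i] = new
                centroids[old] = centroid(members[old])
                centroids[new] = centroid(members[new])
                moved = True
        if not moved:
            break
    return labels, centroids
```

Calling `sequential_kmeans(samples, k)` with a list of coordinate tuples returns one cluster index per sample together with the final centroids. Because centroids are updated after every single move, the result can depend on the order of the samples.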
A simple example showing the implementation of the k-means algorithm (using K=2)
Step 1:
Initialization: We randomly choose the following two centroids (k=2) for the two clusters. In this case the two centroids are m1=(1.0,1.0) and m2=(5.0,7.0).
Step 2:
• Thus, we obtain two clusters containing {1,2,3} and {4,5,6,7}.
• Their new centroids are:
Step 3:
• Now, using these centroids, we compute the Euclidean distance of each object, as shown in the table.
• Therefore, the new clusters are {1,2} and {3,4,5,6,7}.
• The next centroids are m1=(1.25,1.5) and m2=(3.9,5.1).
Step 4:
• The clusters obtained are {1,2} and {3,4,5,6,7}.
• Therefore, there is no change in the clusters.
• Thus, the algorithm comes to a halt here, and the final result consists of 2 clusters: {1,2} and {3,4,5,6,7}.
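The halting condition in Step 4 can be checked mechanically. The slide's data table did not survive in this copy, so the seven points below are an ASSUMED dataset, chosen only because it is consistent with the initial centroids and the final centroids quoted above; the original values may differ. The sketch recomputes the centroids of the final clusters and confirms that every object is nearest to its own centroid.

```python
import math

# ASSUMED data (not from the slides): seven 2-D objects consistent with
# m1=(1.0,1.0), m2=(5.0,7.0) and the reported final centroids.
points = {
    1: (1.0, 1.0), 2: (1.5, 2.0), 3: (3.0, 4.0), 4: (5.0, 7.0),
    5: (3.5, 5.0), 6: (4.5, 5.0), 7: (3.5, 4.5),
}
clusters = [{1, 2}, {3, 4, 5, 6, 7}]  # final clusters from Step 4


def centroid(ids):
    """Mean of the points whose ids are given."""
    pts = [points[i] for i in ids]
    return tuple(sum(c) / len(pts) for c in zip(*pts))


centroids = [centroid(ids) for ids in clusters]

# Converged if each object is closest to the centroid of its own cluster.
stable = all(
    min(range(len(centroids)),
        key=lambda j: math.dist(points[i], centroids[j])) == own
    for own, ids in enumerate(clusters)
    for i in ids
)
print(centroids, stable)
```

Under that assumption the recomputed centroids agree with m1=(1.25,1.5) and m2=(3.9,5.1), and no object would switch clusters on a further pass.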
[Plots: k-means with K=3, Step 1 and Step 2]
Real-life numerical example of K-Means clustering
We have 4 medicines as our training data objects, and each medicine has 2 attributes. Each attribute represents a coordinate of the object. We have to determine which medicines belong to cluster 1 and which medicines belong to the other cluster.

Object        Attribute 1 (X): weight index    Attribute 2 (Y): pH
Medicine A    1                                1
Medicine B    2                                1
Medicine C    4                                3
Medicine D    5                                4
Step 1:
• Initial value of centroids: Suppose we use medicine A and medicine B as the first centroids.
• Let c1 and c2 denote the coordinates of the centroids; then c1=(1,1) and c2=(2,1).
We get the final grouping as the result:

Object        Feature 1 (X): weight index    Feature 2 (Y): pH    Group (result)
Medicine A    1                              1                    1
Medicine B    2                              1                    1
Medicine C    4                              3                    2
Medicine D    5                              4                    2
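The grouping in the table can be reproduced with a short script. The intermediate slides are missing from this copy, so the sketch below assumes the common batch variant of k-means (reassign every object, then recompute both centroids), with Euclidean distance and the Step 1 centroids as the starting point; the variable names are chosen here for illustration.

```python
import math

medicines = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1.0, 1.0), (2.0, 1.0)]  # c1 and c2 from Step 1

groups = None
while True:
    # Assign every medicine to the index of its nearest centroid.
    new_groups = {
        name: min(range(len(centroids)),
                  key=lambda j: math.dist(xy, centroids[j]))
        for name, xy in medicines.items()
    }
    if new_groups == groups:
        break  # no assignment changed, so the algorithm has converged
    groups = new_groups
    # Recompute each centroid as the mean of its members.
    # (No cluster becomes empty for this data, so the mean is defined.)
    for j in range(len(centroids)):
        members = [medicines[n] for n, g in groups.items() if g == j]
        centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))

for name, g in groups.items():
    print(f"Medicine {name}: group {g + 1}")
```

With these inputs the loop settles after the first pass puts only medicine A in group 1 and the second pass moves medicine B across, leaving A and B in group 1 and C and D in group 2, as in the table.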
