CLASSIFICATION
Arvind Deshpande
K-Nearest Neighbours (KNN)
• Eager learning - the classification algorithm constructs a
classification model before receiving new data to classify.
• Lazy learning (instance-based learning) - the training data is
simply stored (or given only minor processing), and the learner
waits until it is given a test tuple to classify.
• The KNN algorithm is an instance-based, lazy learner.
• Unlike eager learning methods, lazy learners do less work
when a training tuple is presented and more work when
making a classification or numeric prediction.
KNN classifier
• KNN classifiers are based on learning by analogy, that is,
by comparing a given test tuple with training tuples that
are similar to it.
• The training tuples are described by n attributes.
• Each tuple represents a point in an n-dimensional space.
In this way, all the training tuples are stored in an n-
dimensional pattern space.
• When given an unknown tuple, a k-nearest-neighbor
classifier searches the pattern space for the k training
tuples that are closest to the unknown tuple.
• These k training tuples are the k “nearest neighbors” of
the unknown tuple.
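To make this concrete, here is a minimal sketch using scikit-learn's KNeighborsClassifier; the Iris dataset, the train/test split, and k = 3 are illustrative choices, not part of the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Training" only stores the tuples; the real work happens at predict time,
# when the classifier searches the pattern space for the k nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))  # fraction of correctly classified test tuples
```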
KNN classifier
• “Closeness” is defined in terms of a distance metric, such
as Euclidean distance. The Euclidean distance between
two points or tuples, say, $X_1 = (x_{11}, x_{12}, \ldots, x_{1n})$ and
$X_2 = (x_{21}, x_{22}, \ldots, x_{2n})$, is

$$\mathit{dist}(X_1, X_2) = \sqrt{\sum_{i=1}^{n} (x_{1i} - x_{2i})^2}$$
• For categorical attributes, the two tuples are compared value
by value: if the two values of an attribute are identical, the
difference is taken as 0; otherwise the difference is 1, as in the
sketch below.
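A small sketch of such a mixed distance function, combining squared numeric differences with the 0/1 rule for categorical attributes described above; the function name and example values are illustrative.

```python
import math

def mixed_distance(x1, x2, categorical):
    """Euclidean-style distance: squared differences for numeric attributes,
    plus a 0/1 mismatch term for categorical ones. `categorical` is the set
    of attribute indices to treat as categorical."""
    total = 0.0
    for i, (a, b) in enumerate(zip(x1, x2)):
        if i in categorical:
            total += 0.0 if a == b else 1.0  # identical -> 0, different -> 1
        else:
            total += (a - b) ** 2
    return math.sqrt(total)

print(mixed_distance((1.0, 2.0, "red"), (4.0, 6.0, "blue"), categorical={2}))
# sqrt(9 + 16 + 1) = sqrt(26) ≈ 5.10
```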
KNN classifier
• Typically, we normalize the values of each attribute before
using the equation. This helps prevent attributes with
initially large ranges (e.g., income) from outweighing
attributes with initially smaller ranges (e.g., binary
attributes).
• Min-max normalization, for example, can be used to
transform a value v of a numeric attribute A to v’ in the
range [0, 1] by computing
$$v' = \frac{v - \mathit{min}_A}{\mathit{max}_A - \mathit{min}_A}$$

where $\mathit{min}_A$ and $\mathit{max}_A$ are the minimum and maximum
values of attribute A.
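A minimal sketch of min-max normalization as defined above; the income values are made up for illustration, and the handling of a constant attribute is an added assumption.

```python
def min_max_normalize(values):
    """Rescale numeric attribute values to [0, 1] via
    v' = (v - min_A) / (max_A - min_A)."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant attribute: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

incomes = [30000, 48000, 54000, 120000]
print(min_max_normalize(incomes))  # [0.0, 0.2, 0.266..., 1.0]
```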
KNN Algorithm
1. Calculate d(x, xᵢ) for i = 1, 2, …, n, where d denotes the
Euclidean distance between the points.
2. Arrange the calculated Euclidean distances in non-
decreasing order.
3. Let k be a positive integer; take the first k distances from
this sorted list.
4. Find the k points corresponding to these k distances.
5. Let kᵢ denote the number of points belonging to the ith
class among these k points (so kᵢ ≥ 0).
6. If kᵢ > kⱼ for all j ≠ i, put x in class i (see the sketch after
these steps).
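A short Python sketch of steps 1-6, assuming numeric attributes and plain Euclidean distance; the function name and toy data are illustrative.

```python
import math
from collections import Counter

def knn_classify(x, training_points, labels, k=3):
    """Steps 1-6 above: compute Euclidean distances, sort them,
    take the k nearest points, and return the majority class."""
    # Step 1: d(x, xi) for every training point xi
    dists = [math.dist(x, xi) for xi in training_points]
    # Steps 2-4: sort indices by distance and keep the k nearest points
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # Steps 5-6: count class membership among the k neighbours and
    # assign x to the class with the largest count
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

train = [(1, 1), (1, 2), (4, 4), (5, 4), (5, 5)]
y = ["A", "A", "B", "B", "B"]
print(knn_classify((2, 2), train, y, k=3))  # "A": 2 of the 3 nearest are class A
```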
Choosing the right value of K
• Run the KNN algorithm several times with different values of K
and choose the K that minimizes the number of errors while
preserving the algorithm's ability to make accurate predictions
on data it has not seen before (see the sketch after this list).
• As we decrease the value of K towards 1, our predictions
become less stable.
• Conversely, as we increase the value of K, our predictions
become more stable due to majority voting / averaging, and are
thus more likely to be accurate (up to a certain point).
• Eventually, we begin to witness an increasing number of errors.
It is at this point that we know we have pushed the value of K too far.
• In cases where we take a majority vote among labels (e.g.,
picking the mode in a classification problem), we usually
make K an odd number to have a tiebreaker.
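A sketch of this K sweep using scikit-learn; the dataset, the hold-out split, and the range of K values are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

best_k, best_score = None, -1.0
for k in range(1, 26, 2):  # odd K values act as a tiebreaker in voting
    score = (KNeighborsClassifier(n_neighbors=k)
             .fit(X_train, y_train)
             .score(X_val, y_val))
    if score > best_score:
        best_k, best_score = k, score
print(best_k, best_score)  # K with the lowest validation error
```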
Advantages
• The algorithm is simple and easy to implement.
• It is a lazy learning algorithm and therefore requires no
training prior to making real-time predictions.
• There's no need to build a model, tune several
parameters, or make additional assumptions. This makes
the KNN algorithm much faster than algorithms that
require training, such as SVM or linear regression.
• Since the algorithm requires no training before making
predictions, new data can be added seamlessly.
• Only two parameters are required to implement
KNN: the value of K and the distance function.
• The algorithm is versatile. It can be used for classification,
regression and search.
Disadvantages
• It doesn't work well with high-dimensional data because,
with a large number of dimensions, it becomes difficult for
the algorithm to calculate meaningful distances in each dimension.
• The KNN algorithm has a high prediction cost for large
datasets, because the cost of calculating the distance
between a new point and every existing point grows with
the dataset size.
• The KNN algorithm doesn't work well with categorical features,
since it is difficult to define a distance between dimensions
with categorical values.
• The algorithm gets significantly slower as the number of
examples and/or predictors (independent variables)
increases.
Disadvantages
• When making a classification or numeric prediction, lazy
learners can be computationally expensive.
• They require efficient storage techniques and are well
suited to implementation on parallel hardware.
• They offer little explanation or insight into the data’s
structure.
• They can suffer from poor accuracy when given noisy or
irrelevant attributes.
• Nearest-neighbor classifiers can be extremely slow when
classifying test tuples.