Similar instances have similar classifications
Training phase (Model construction): a model is constructed from the training instances.
◦ the classification algorithm finds relationships between predictors and targets
◦ these relationships are summarised in a model
Testing phase:
◦ test the model on a test sample whose class labels
are known but not used for training the model
Usage phase (Model usage):
◦ use the model for classification on new data whose
class labels are unknown
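A minimal sketch of the three phases, using scikit-learn's KNeighborsClassifier on the Iris data (the library and dataset are illustrative assumptions, not prescribed by the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Training phase: construct the model from the training instances.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Testing phase: evaluate on held-out instances whose labels are known.
print("test accuracy:", model.score(X_test, y_test))

# Usage phase: classify new data whose labels are unknown.
new_instance = [[5.1, 3.5, 1.4, 0.2]]
print("predicted class:", model.predict(new_instance))
```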
For some classifiers there is no clear separation between these phases; this is called lazy classification, as opposed to eager classification
Examples:
◦ Rote-learner
Memorizes the entire training data and performs classification only if the attributes of a record exactly match one of the training examples
◦ Nearest neighbor
Uses the k “closest” points (nearest neighbors) to perform classification
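As a toy illustration of the rote-learner idea (a hypothetical sketch, not code from the course):

```python
# Memorize the training data and classify only on an exact attribute match.
def rote_learner(train_X, train_y):
    memory = {tuple(x): y for x, y in zip(train_X, train_y)}
    def classify(x):
        # Returns None when no training example matches exactly.
        return memory.get(tuple(x))
    return classify

classify = rote_learner([(1, 0), (0, 1)], ["A", "B"])
print(classify((1, 0)))  # "A": exact match found
print(classify((1, 1)))  # None: no exact match, so no classification
```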
Eager Classification:
◦ Model is computed before classification
◦ Model is independent of the test instance
◦ Test instance is not included in the training data
◦ Avoids too much work at classification time
◦ Model is not accurate for each individual instance

Lazy Classification:
◦ Model is computed during classification
◦ Model is dependent on the test instance
◦ Test instance is included in the training data
◦ High accuracy at the level of each individual instance
Basic idea:
◦ If it walks like a duck and quacks like a duck, then it’s probably a duck
[Diagram: compute the distance from the test record to the training records, then choose the k “nearest” records]
Requires three things:
– The set of labeled training records
– A distance metric to compute the distance between records
– The value of k, the number of nearest neighbors to retrieve

To classify an unknown record:
– Compute its distance to all training records
– Identify the k nearest neighbors
– Use the class labels of the nearest neighbors to determine the class label of the unknown record (e.g., by taking a majority vote)
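These requirements and steps translate almost line by line into code; a minimal NumPy sketch (Euclidean distance and the function names here are assumptions for illustration):

```python
import numpy as np

def knn_classify(train_X, train_y, x, k=3):
    """Classify one unknown record x by majority vote of its k nearest neighbors.
    Sketch only: assumes numeric features and Euclidean distance."""
    train_X = np.asarray(train_X, dtype=float)
    # 1. Compute the distance from x to every training record.
    dists = np.sqrt(((train_X - np.asarray(x, dtype=float)) ** 2).sum(axis=1))
    # 2. Identify the k nearest neighbors.
    nearest = np.argsort(dists)[:k]
    # 3. Majority vote over the neighbors' class labels.
    labels = [train_y[i] for i in nearest]
    return max(set(labels), key=labels.count)

X = [[1.0, 1.0], [1.2, 0.8], [8.0, 9.0], [9.0, 8.5]]
y = ["duck", "duck", "goose", "goose"]
print(knn_classify(X, y, [1.1, 0.9], k=3))  # "duck"
```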
[Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor]
The k-nearest neighbors of a record x are the data points that have the k smallest distances to x
Voronoi Diagram: the 1-nearest-neighbor rule partitions the space into cells, predicting the same value/class as the nearest instance in the training set
Problem: measure similarity between instances
◦ different types of data: numbers, colours, geolocation,
booleans, etc.
Solution: convert all features of the instances into
numerical values
◦ represent instances as vectors of features in an n-
dimensional space
Euclidean distance:
$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$

Manhattan distance:
$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$

Minkowski distance:
$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$
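The three measures above in plain Python, showing Euclidean and Manhattan as the p = 2 and p = 1 special cases of the Minkowski distance (a sketch; the function names are chosen here for illustration):

```python
def minkowski(x, y, p):
    # General Minkowski distance between two equal-length vectors.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def euclidean(x, y):
    return minkowski(x, y, 2)  # p = 2

def manhattan(x, y):
    return minkowski(x, y, 1)  # p = 1

print(euclidean([0, 0], [3, 4]))     # 5.0
print(manhattan([0, 0], [3, 4]))     # 7.0
print(minkowski([0, 0], [3, 4], 3))  # ~4.498
```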
Determine the class from the nearest-neighbor list
◦ Take the majority vote of class labels among the k nearest neighbors
◦ Or weigh each vote according to distance, e.g. with weight factor $w = 1/d^2$
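A minimal sketch of the distance-weighted vote with $w = 1/d^2$ (the epsilon guard is an added assumption to handle exact matches with d = 0):

```python
from collections import defaultdict

def weighted_vote(neighbor_labels, neighbor_dists):
    """Pick the class whose neighbors carry the largest total weight 1/d^2."""
    scores = defaultdict(float)
    for label, d in zip(neighbor_labels, neighbor_dists):
        scores[label] += 1.0 / (d ** 2 + 1e-12)  # epsilon avoids division by zero
    return max(scores, key=scores.get)

# One very close "A" neighbor outweighs two distant "B" neighbors.
print(weighted_vote(["A", "B", "B"], [0.1, 1.0, 1.2]))  # "A"
```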
Let k be the number of nearest neighbors and D be the set of training examples. For each test example, compute its distance to every example in D, select the k closest, and assign the class receiving the most (possibly distance-weighted) votes.
[Figure: illustration of kNN for a 3-class problem with k = 5]
Choosing the value of k: classification is sensitive to the correct selection of k
◦ if k is too small ⇒ overfitting
the algorithm performs too well on the training set compared to its true performance on unseen test data
➔ small k → less stable, influenced by noise
➔ large k → less precise, higher bias
◦ a common rule of thumb: $k = \sqrt{n}$
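Consistent with this trade-off, k is often chosen empirically; a minimal sketch selecting k by accuracy on held-out validation data (scikit-learn is used here for brevity and is an assumption, not prescribed by the slides):

```python
from sklearn.neighbors import KNeighborsClassifier

def choose_k(X_train, y_train, X_val, y_val, candidates=(1, 3, 5, 7, 9)):
    """Return the candidate k with the best validation accuracy."""
    def val_accuracy(k):
        return KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).score(X_val, y_val)
    return max(candidates, key=val_accuracy)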
Scaling issues
◦ Attributes may have to be scaled to prevent
distance measures from being dominated by one of
the attributes
◦ Example:
height of a person may vary from 1.5m to 1.8m
weight of a person may vary from 90lb to 300lb
income of a person may vary from $10K to $1M
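A sketch of one common remedy, min-max scaling of every attribute to [0, 1] (the helper below is a hypothetical illustration, using the slide's height/weight/income example):

```python
import numpy as np

def min_max_scale(X):
    """Rescale each attribute to [0, 1] so no single attribute
    (e.g., income in dollars) dominates the distance computation."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Constant columns are left at 0 rather than dividing by zero.
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

X = [[1.5, 90, 10_000], [1.8, 300, 1_000_000]]  # height (m), weight (lb), income ($)
print(min_max_scale(X))  # every column now spans [0, 1]
```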
Selection of the right similarity measure is critical:
111111111110 vs 011111111111
000000000001 vs 100000000000
The Euclidean distance is 1.4142 for both pairs
Solution: normalize the vectors to unit length
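A quick sketch confirming the fix: after normalizing to unit length, the two pairs are no longer equidistant (NumPy is used here for illustration):

```python
import numpy as np

def unit_normalize(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

a = unit_normalize([1] * 11 + [0]); b = unit_normalize([0] + [1] * 11)
c = unit_normalize([0] * 11 + [1]); d = unit_normalize([1] + [0] * 11)

print(np.linalg.norm(a - b))  # ~0.43: the vectors share many 1s
print(np.linalg.norm(c - d))  # ~1.41: the vectors share no 1s
```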
Pros:
◦ Simple to implement and use
◦ Robust to noisy data, since the prediction averages over the k nearest neighbors
◦ kNN classification is based solely on local information
◦ The decision boundaries can have arbitrary shapes
Cons:
◦ Curse of dimensionality: the distance can be dominated by irrelevant attributes
◦ O(n) distance computations for each instance to be classified
◦ Classifying a new instance is more expensive than with an eager, model-based classifier