KNN and HMM
K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a supervised learning algorithm most often used for
classification.
It is a simple algorithm that stores all available cases and classifies new
cases by a majority vote of their k nearest neighbors.
A new case is assigned to the class most common among its K
nearest neighbors, as measured by a distance function.
Common distance functions include the Euclidean, Manhattan, Minkowski, and
Hamming distances.
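To make these concrete, here is a minimal sketch of those four distance functions in Python; the plain-tuple point format is our assumption:

```python
import math

def euclidean(a, b):
    # Straight-line distance: square root of the sum of squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences along each axis.
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):
    # Generalization: p=1 gives Manhattan, p=2 gives Euclidean.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def hamming(a, b):
    # Number of positions at which the values differ (for categorical data).
    return sum(x != y for x, y in zip(a, b))
```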
What is K?
In the algorithm, for each test data point we look at the K nearest
training data points, take the most frequently occurring class among them, and assign
that class to the test point. Therefore, K represents the number of training data
points lying in proximity to the test data point that we use to
determine its class (see the implementation sketch after the algorithm steps below).
The algorithm measures the distance from the query point to each training point
using some sort of function (usually Euclidean), then analyzes those results and assigns
the query point to the group containing the points closest to it.
You can use KNN for both classification and regression problems. However, it
is more widely used for classification problems in industry.
Let’s take a simple case to understand this algorithm. Imagine a spread of
red circles (RC) and green squares (GS) on a plane.
We intend to find out the class of a new point, the blue star (BS). BS can either be RC or
GS and nothing else. The “K” in the KNN algorithm is the number of nearest neighbors we
wish to take a vote from.
Let’s say K = 3. Hence, we now draw a circle with BS as its center, just big
enough to enclose exactly three data points on the plane.
Suppose the three closest points to BS are all RC. Then, with a good level of
confidence, we can say that BS should belong to the class RC.
Here, the choice became very obvious, as all three votes from the closest
neighbors went to RC. The choice of the parameter K is very crucial in this
algorithm.
Algorithm
Load the data
Initialise the value of K
To get the predicted class, iterate over every training data point:
Calculate the distance between the test data and each row of training data. Here we
use Euclidean distance as our distance metric, since it is the most popular
choice. Other metrics that can be used are Chebyshev, cosine, etc.
Sort the calculated distances in ascending order based on distance values
Take the top K rows from the sorted array
Find the most frequent class among these rows
Return the predicted class
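These steps translate almost line-for-line into Python. The sketch below is illustrative, not a production implementation; the (point, label) data layout and the sample coordinates echoing the RC/GS/BS example are our assumptions:

```python
import math
from collections import Counter

def knn_predict(training_data, test_point, k):
    """Predict the class of test_point from its k nearest training points."""
    # Compute the Euclidean distance to every training point.
    distances = []
    for point, label in training_data:
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(point, test_point)))
        distances.append((d, label))
    # Sort by distance, ascending, and take the top k rows.
    distances.sort(key=lambda pair: pair[0])
    top_k = distances[:k]
    # Return the most frequent class among them.
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]

# Made-up coordinates echoing the RC/GS/BS example above:
training = [((1, 1), "RC"), ((2, 1), "RC"), ((1.5, 2), "RC"),
            ((5, 5), "GS"), ((6, 5), "GS"), ((5, 6), "GS")]
print(knn_predict(training, (1.6, 1.4), k=3))  # -> "RC"
```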
Choosing the right value for K
To select the K that’s right for your data, we run the KNN
algorithm several times with different values of K and
choose the K that minimizes the number of errors we
encounter while maintaining the algorithm’s ability to
make accurate predictions on data it hasn’t seen before.
Here are some things to keep in mind:
As we decrease the value of K to 1, our predictions become less
stable.
Inversely, as we increase the value of K, our predictions
become more stable due to majority voting / averaging, and are
thus more likely to be accurate (up to a certain point).
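One common way to run this search is to hold out a validation set and sweep over candidate values of K, keeping the one with the fewest errors. The sketch below is illustrative and reuses the knn_predict function and training list from the earlier sketch; the validation points are made up:

```python
def choose_k(training_data, validation_data, k_values):
    """Return the K with the fewest misclassifications on held-out data."""
    best_k, best_errors = None, float("inf")
    for k in k_values:
        errors = sum(knn_predict(training_data, point, k) != label
                     for point, label in validation_data)
        if errors < best_errors:
            best_k, best_errors = k, errors
    return best_k

# Made-up held-out points; odd K values avoid ties in two-class voting.
validation = [((1.2, 1.1), "RC"), ((5.5, 5.2), "GS")]
print(choose_k(training, validation, k_values=[1, 3, 5]))
```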
You will have to note the following points before selecting
KNN:
KNN is computationally expensive.
Variables should be normalized, or else variables with larger ranges can
bias the distances (see the normalization sketch below).
More work is needed in the pre-processing stage before applying KNN, such as
outlier and noise removal.
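To address the normalization point, here is a minimal min-max rescaling sketch; the list-of-tuples layout is our assumption:

```python
def min_max_normalize(points):
    """Rescale each feature to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*points))          # one tuple per feature column
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(p, mins, maxs))
            for p in points]
```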
Summary
The k-nearest neighbors (KNN) algorithm is a simple, supervised machine
learning algorithm that can be used to solve both classification and
regression problems. It’s easy to implement and understand, but has a
major drawback of becoming significantly slower as the size of the data in
use grows.
KNN works by finding the distances between a query and all the examples
in the data, selecting the specified number of examples (K) closest to the
query, then voting for the most frequent label (in the case of
classification) or averaging the labels (in the case of regression).
In the case of classification and regression, we saw that choosing the right
K for our data is done by trying several Ks and picking the one that works
best.
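For the regression case, only the final step changes: instead of voting, we average the neighbors' numeric targets. A minimal sketch under the same assumed (point, target) data layout:

```python
import math

def knn_regress(training_data, test_point, k):
    """Average the numeric targets of the k nearest training points."""
    distances = sorted(
        (math.sqrt(sum((x - y) ** 2 for x, y in zip(point, test_point))), target)
        for point, target in training_data)
    # Mean of the targets of the K closest points.
    return sum(target for _, target in distances[:k]) / k
```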
Hidden Markov Model
A Hidden Markov Model (HMM) is a statistical Markov model
(chain) in which the system being modeled is assumed to
be a Markov process with hidden (unobserved) states.
In simpler Markov models (like a Markov chain), the state
is directly visible to the observer, and therefore the state
transition probabilities are the only parameters,
while in a hidden Markov model the state is not directly
visible; only the output (in the form of data or “tokens”),
which depends on the state, is visible.
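To make the hidden/visible split concrete, an HMM is specified by initial-state probabilities, state-transition probabilities, and emission (output) probabilities. The states, tokens, and numbers below are all made-up illustrations, not values from any source:

```python
states = ["Sunny", "Cloudy"]                 # hidden: never observed directly
observations = ["Walk", "Shop", "Stay in"]   # visible tokens

# P(first hidden state)
initial = {"Sunny": 0.6, "Cloudy": 0.4}

# P(next state | current state): the Markov chain over hidden states.
transition = {
    "Sunny":  {"Sunny": 0.7, "Cloudy": 0.3},
    "Cloudy": {"Sunny": 0.4, "Cloudy": 0.6},
}

# P(observation | state): what the observer actually sees.
emission = {
    "Sunny":  {"Walk": 0.6, "Shop": 0.3, "Stay in": 0.1},
    "Cloudy": {"Walk": 0.2, "Shop": 0.4, "Stay in": 0.4},
}
```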
What is a Markov Property?
A stochastic process (a random process, i.e. a collection of
random variables that changes through time) satisfies the
Markov property if the probability of future states of the process
depends only upon the present state, not on the sequence
of states preceding it.
This is commonly referred to as the memoryless property.
Any random process that satisfies the Markov property is
known as a Markov process.
Markov chain
A Markov chain is the simplest such model: a sequence of states in which
the probability of moving to the next state depends only on the current
state, so it is described entirely by its state-transition probabilities.
Applications
DNA sequence analysis
Prediction of genes
Horizontal gene transfer
Radiation hybrid mapping
Speech recognition
Vehicle trajectory prediction
Positron emission tomography (PET)
Digital communication
Music analysis
Optical signal detection
Gesture learning for human-robot interfaces
Example:
Using this Markov chain, what is the probability that Wednesday will be
cloudy if today (Monday) is sunny?
Answer:
The following are the different transition paths that can result in a cloudy
Wednesday given that today (Monday) is sunny.
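Formally, P(cloudy on Wednesday | sunny on Monday) is the sum, over every possible Tuesday state s, of P(s | sunny) · P(cloudy | s). The transition probabilities come from the Markov chain diagram, which is not reproduced here, so the matrix below uses placeholder numbers; the computation itself is the general method:

```python
# Placeholder transition matrix P(next | current); the real values come
# from the Markov chain diagram in the original slides.
P = {
    "Sunny":  {"Sunny": 0.6, "Cloudy": 0.3, "Rainy": 0.1},
    "Cloudy": {"Sunny": 0.3, "Cloudy": 0.4, "Rainy": 0.3},
    "Rainy":  {"Sunny": 0.2, "Cloudy": 0.4, "Rainy": 0.4},
}

# Two-step probability: sum over every possible Tuesday state.
# Paths: Sunny->Sunny->Cloudy, Sunny->Cloudy->Cloudy, Sunny->Rainy->Cloudy.
prob = sum(P["Sunny"][tue] * P[tue]["Cloudy"] for tue in P)
print(prob)  # 0.18 + 0.12 + 0.04 = 0.34 with these placeholder numbers
```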