LAHORE UNIVERSITY OF MANAGEMENT SCIENCES
Syed Babar Ali School of Science and Engineering
EE514/CS535 Machine Learning
Spring Semester 2021
Programming Assignment 2 – k-Nearest Neighbors Classification
Issued: Tuesday 9th February, 2021
Total Marks: 100
Submission: 11:55 pm, Thursday 18th February, 2021.
Contribution to Final Percentage: 4%
Goal
The goal of this assignment is to familiarize you with k-NN classification and to give
you hands-on experience with the basic Python tools and libraries used to implement
the algorithm. You will learn the following:
- Feature engineering, i.e., extracting features from raw data using different techniques.
- Implementation of the k-NN algorithm from scratch.
- Classification of images using the k-NN algorithm.
- Evaluation of the performance of your classifier using different evaluation metrics such as the confusion matrix, F1 score, and accuracy.
Instructions
- Submit your code both as a notebook file (.ipynb) and a Python script (.py) on LMS. The name of both files should be your roll number. Failing to submit either one will result in a deduction of marks.
- The code MUST be implemented independently. Any plagiarism or cheating, whether from others or the internet, will be immediately referred to the DC.
- There is a 10% penalty per day for up to 3 days after the due date. No submissions will be accepted after that.
- Use a procedural programming style and comment your code properly.
Part 1: Implement a k-NN classifier from scratch (35 Marks)
NOTE:
You are not allowed to use scikit-learn or any other machine learning toolkit for this part.
You have to implement your own k-NN classifier from scratch. You may use Pandas,
NumPy, Matplotlib, and other standard Python libraries.
Dataset:
The dataset contains 10,000 images of dogs and cats, which have already been split
(80%, 20%) into training and test data. There are two top-level directories [test set/,
training set/] corresponding to the test set and training set respectively. Each of these
directories further contains two directories [cats/, dogs/] comprising images of cats and
dogs. The class label of each image corresponds to the directory it is contained in, i.e.,
cat/dog.
Dataset: Dogs and Cats Images
Feature Extraction:
In the feature extraction step, you first have to read the images, which gives you a
multi-dimensional array containing the RGB pixel intensities of each image. Using raw
pixel values is the simplest way to create features from an image, but for this part we will
use the HOG (Histogram of Oriented Gradients) feature descriptor to extract features from
the image data. The HOG descriptor focuses on the structure or shape of an object. It
identifies whether a pixel is an edge or not, as well as the edge direction, by extracting the
gradient and orientation (or magnitude and direction) of the edges. You can use the
skimage.feature module to extract HOG features from an image.
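A minimal sketch of this step might look as follows, assuming the images are read and processed with skimage; the function name, the target size, and the HOG parameters (9 orientation bins, 8×8-pixel cells, 2×2-cell blocks) are illustrative choices, not requirements.

```python
# A sketch of HOG feature extraction with skimage; paths and parameters
# are illustrative, not prescribed by the assignment.
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.transform import resize
from skimage.feature import hog

def extract_hog_features(image_path, size=(128, 128)):
    """Read an image, convert to grayscale, and return its HOG descriptor."""
    image = imread(image_path)   # multi-dimensional RGB pixel array
    image = rgb2gray(image)      # HOG here operates on a single channel
    image = resize(image, size)  # fixed size so all vectors have equal length
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))
```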
Tasks:
1. Create your own k-Nearest Neighbors classifier function by performing the following
tasks (a sketch of this function appears after the task list):
   - For a test data point, find its distance from all training instances.
   - Sort the calculated distances in ascending order.
   - Choose the k training samples with the minimum distances from the test data point.
   - Return the most frequent class among these samples.
   (Your function should work with Euclidean distance as well as Manhattan distance;
   pass the distance metric as a parameter to the k-NN classifier function.
   Your function should also be general enough to work with any value of k.)
2. Implement an evaluation function that calculates the classification accuracy, F1
score, and confusion matrix of your classifier on the test set (see the evaluation
sketch after the task list). What significance does the F1 score hold, and why is it
a better metric than accuracy?
3. Run your k-NN function for the values of k = 1, 2, 3, 4, 5, 6, 7. Do this for both the
Euclidean distance and the Manhattan distance for each value of k. Formulas for
both are listed below:
Euclidean Distance:
d(\vec{p}, \vec{q}) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2}
Manhattan Distance:
d(\vec{p}, \vec{q}) = |p_1 - q_1| + |p_2 - q_2| + \cdots + |p_n - q_n|
4. For the even values of k given in the above task, break ties by backing off to k − 1
(assume that you take the k = 4 nearest neighbors of a particular image, and two of
them have the label “cat” while the other two have the label “dog”; in this case you
break the tie by backing off to k = 3 neighbors). The classifier sketch after this list
includes this back-off.
5. Present the results as a graph with k values on the x-axis and F1 score on the
y-axis. Use a single plot to compare the two versions of the classifier (one using the
Euclidean and the other the Manhattan distance metric); a plotting sketch is also
given below. The graph should be properly labelled.
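A minimal sketch of the classifier described in Tasks 1 and 4, assuming NumPy feature matrices; the function name knn_predict, the variable names, and the structure of the back-off loop are illustrative, not prescribed.

```python
# A sketch of Tasks 1 and 4: distance computation, sorting, majority vote,
# and tie-breaking by backing off to k - 1. X_train is an (m, d) NumPy array
# of training feature vectors and y_train the corresponding labels.
import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k, metric="euclidean"):
    """Predict the label of a single test point x."""
    if metric == "euclidean":
        distances = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
    elif metric == "manhattan":
        distances = np.sum(np.abs(X_train - x), axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    order = np.argsort(distances)  # indices sorted by ascending distance
    while k >= 1:
        votes = Counter(y_train[i] for i in order[:k]).most_common()
        if len(votes) == 1 or votes[0][1] > votes[1][1]:
            return votes[0][0]     # unambiguous majority (or k = 1)
        k -= 1                     # tie: back off to k - 1 neighbors
```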
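And a sketch of the evaluation function from Task 2, written here for the binary cat/dog case; treating “dog” as the positive class is an assumption, and the confusion-matrix layout [[TP, FN], [FP, TN]] is one of several common conventions.

```python
# A sketch of Task 2: accuracy, F1 score, and confusion matrix from scratch.
import numpy as np

def evaluate(y_true, y_pred, positive="dog"):
    """Return (accuracy, F1, confusion matrix) for binary labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    confusion = np.array([[tp, fn], [fp, tn]])
    return accuracy, f1, confusion
```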
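Finally, a sketch of the Task 5 plot with Matplotlib; the marker styles and title are illustrative, and the two F1-score lists are assumed to hold the seven values you measured for each metric.

```python
# A sketch of Task 5: F1 score vs. k for both distance metrics in one plot.
import matplotlib.pyplot as plt

def plot_f1_vs_k(ks, f1_euclidean, f1_manhattan):
    """Compare the two classifier variants on a single labelled plot."""
    plt.plot(ks, f1_euclidean, marker="o", label="Euclidean")
    plt.plot(ks, f1_manhattan, marker="s", label="Manhattan")
    plt.xlabel("k")            # k values on the x-axis
    plt.ylabel("F1 score")     # F1 score on the y-axis
    plt.title("k-NN: Euclidean vs. Manhattan distance")
    plt.legend()
    plt.show()
```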
Part 2: k-NN classifier using scikit-learn (20 marks)
In this part, you have to use scikit-learn’s k-NN implementation to train and test your
classifier on the dataset used in Part 1. Run the k-NN classifier again for values of
k = 1, 2, 3, 4, 5, 6, 7 using both Euclidean and Manhattan distance. Use scikit-learn’s
accuracy_score function to calculate the accuracy, f1_score to calculate the F1 score,
and confusion_matrix to calculate the confusion matrix on the test data. Also present
the results as a graph with k values on the x-axis and F1 score on the y-axis for both
distance metrics in a single plot.
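For reference, a minimal sketch of this part, assuming the feature matrices X_train/X_test and label arrays y_train/y_test come from the Part 1 feature-extraction step; passing pos_label="dog" to f1_score is an assumption about how the labels are encoded.

```python
# A sketch of Part 2 with scikit-learn's k-NN and metric functions.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

def run_sklearn_knn(X_train, y_train, X_test, y_test):
    """Train and evaluate k-NN for k = 1..7 with both distance metrics."""
    for metric in ("euclidean", "manhattan"):
        for k in range(1, 8):
            clf = KNeighborsClassifier(n_neighbors=k, metric=metric)
            clf.fit(X_train, y_train)
            y_pred = clf.predict(X_test)
            print(f"{metric}, k={k}: "
                  f"accuracy={accuracy_score(y_test, y_pred):.3f}, "
                  f"F1={f1_score(y_test, y_pred, pos_label='dog'):.3f}")
            print(confusion_matrix(y_test, y_pred))
```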
Part 3: Implement a k-NN classifier for a multi-class data set (45 marks)
Dataset:
You will be using the weather data set for this part, which can be found here. There are
two top-level directories [Test data/, Training data/] corresponding to the test set and
training set respectively. There are 899 images in the training data and 224 images in the
test data. Each of these directories further contains four directories [Cloudy/, Rain/,
Shine/, Sunrise/] comprising images of the relevant weather conditions. The class label of
each image corresponds to the directory it is contained in, i.e., Cloudy/Rain/Shine/Sunrise.
Feature Extraction:
In the feature extraction step, you have to read the images and then resize them to a
fixed size of (32, 32), which can be done using this function. After that, flatten the RGB
pixel intensities of the images (which are the multi-dimensional arrays obtained by reading
the images) into a single list of numbers, i.e., a one-dimensional array. Doing so, you will
get a NumPy array of shape (32 × 32 × 3,) = (3072,) for each image, which will serve as
the feature vector for that image. You can use cv2 and numpy to implement these
steps; a minimal sketch follows.
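The sketch below assumes OpenCV; the function name and file-path argument are illustrative. Note that cv2.imread returns the channels in BGR order, which does not affect nearest-neighbor distances as long as it is consistent across all images.

```python
# A sketch of the Part 3 feature extraction: read, resize to (32, 32),
# and flatten the pixel intensities into a 1-D feature vector.
import cv2

def extract_raw_pixel_features(image_path, size=(32, 32)):
    """Return a flattened (3072,) pixel-intensity feature vector."""
    image = cv2.imread(image_path)   # multi-dimensional BGR pixel array
    image = cv2.resize(image, size)  # fixed 32 x 32 spatial size
    return image.flatten()           # shape (32 * 32 * 3,) = (3072,)
```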
Tasks:
In this part, you have to implement a generalized form of the k-NN classifier, which can
classify a data set with N classes. You have to repeat all the steps that you implemented
in Part 1, this time ensuring that all the steps are scaled up from binary classification
(c = 2 classes) to the generalized form (c = N). The evaluation function should now
provide the macro-averaged F1 score as an output, along with accuracy and the confusion
matrix; a sketch of the macro F1 computation follows.
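This sketch generalizes the Part 1 evaluation by treating each class as the positive class in turn and averaging the per-class F1 scores; the function name is illustrative.

```python
# A sketch of macro-averaged F1 for an N-class problem.
import numpy as np

def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores over all classes present in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1_scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1_scores.append(2 * precision * recall / (precision + recall)
                         if precision + recall else 0.0)
    return np.mean(f1_scores)
```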