Data Mining
Lazy Learners (Instance-Based Learners)
Outline
• Introduction
• k-Nearest-Neighbor Classifiers
Introduction
• Lazy vs. eager learning
– Eager learning
◆ e.g. decision tree induction, Bayesian classification, rule-
based classification
◆ Given a training set, constructs a classification model
before receiving new (e.g., test) data to classify
– Lazy learning
◆ e.g., k-nearest-neighbor classifiers, case-based reasoning
classifiers
◆ Simply stores the training data (or does only minor processing) and
waits until it is given a new instance to classify
• Lazy: less time in training but more time in
predicting
Introduction
• Lazy learners store training examples and delay the
processing (“lazy evaluation”) until a new instance
must be classified
• Accuracy
– Lazy method effectively uses a richer hypothesis space
since it uses many local linear functions to form its implicit
global approximation to the target function
– Eager: must commit to a single hypothesis that covers the
entire instance space
Example Problem: Face Recognition
• We have a database of (say) 1 million face
images
• We are given a new image and want to find the
most similar images in the database
• Represent faces by (relatively) invariant values,
e.g., ratio of nose width to eye width
• Each image represented by a large number of
numerical features
• Problem: given the features of a new face, find those
in the DB that are close in at least ¾ (say) of the
features
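A minimal sketch of this matching criterion in Python; the function names, the per-feature tolerance, and the representation of a face as a list of normalized numeric features are illustrative assumptions, not part of the example above:

def is_close_match(face1, face2, tolerance=0.05, fraction=0.75):
    # Count the features on which the two faces differ by at most `tolerance`
    # and require agreement on at least `fraction` (here 3/4) of the features.
    close = sum(1 for a, b in zip(face1, face2) if abs(a - b) <= tolerance)
    return close >= fraction * len(face1)

def find_similar(database, new_face):
    # Linear scan of the database for faces close to the new image
    # on at least 3/4 of the features.
    return [face for face in database if is_close_match(face, new_face)]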
Introduction
• Typical approaches
– k-nearest neighbor approach
◆ Instances represented as points in a Euclidean space.
– Case-based reasoning
◆ Uses symbolic representations and knowledge-based
inference
k-Nearest-Neighbor Classifiers
• All instances correspond to points in an n-dimensional space.
• The training tuples are described by n attributes.
• Each tuple represents a point in an n-dimensional space.
• A k-nearest-neighbor classifier searches the
pattern space for the k training tuples that are
closest to the unknown tuple.
k-Nearest-Neighbor Classifiers
• Example:
– We are interested in classifying the type of drug a
patient should be prescribed
– Based on the age of the patient and the patient’s
sodium/potassium ratio (Na/K)
– Dataset includes 200 patients
Scatter plot
On the scatter plot, light gray points indicate drug Y; medium gray points indicate drug A or
X; dark gray points indicate drug B or C.
Close-up of neighbors to new patient 2
• k = 1 => drug B or C (dark gray)
• k = 2 => ?
• k = 3 => drug A or X (medium gray)
• Main questions:
– How many neighbors should we consider? That is,
what is k?
– How do we measure distance?
– Should all points be weighted equally, or should some
points have more influence than others?
k-Nearest-Neighbor Classifiers
• The nearest neighbors are defined in terms of
Euclidean distance, dist(X1, X2)
• The Euclidean distance between two points or tuples, say,
X1 = (x11, x12, … , x1n) and X2 = (x21, x22, ... , x2n), is:
dist(X1, X2) = sqrt( Σ i=1..n (x1i − x2i)² )
– Nominal attributes: distance either 0 or 1
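A minimal sketch of this distance in Python, assuming each tuple is a list of already-normalized numeric values, with strings standing in for nominal attributes (names and representation are illustrative):

from math import sqrt

def euclidean_distance(x1, x2):
    # Numeric attributes contribute their squared difference;
    # nominal attributes contribute 0 if identical, 1 otherwise.
    total = 0.0
    for a, b in zip(x1, x2):
        if isinstance(a, str) or isinstance(b, str):
            total += 0.0 if a == b else 1.0
        else:
            total += (a - b) ** 2
    return sqrt(total)

print(euclidean_distance([0.2, 0.7, "blue"], [0.5, 0.1, "red"]))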
k-Nearest-Neighbor Classifiers
• Typically, we normalize the values of each attribute in
advance.
• This helps prevent attributes with initially large ranges
(such as income) from outweighing attributes with
initially smaller ranges (such as binary
attributes).
Min-max normalization: v' = (v − min_A) / (max_A − min_A)
– all attribute values lie between 0 and 1
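A sketch of min-max normalization applied column-wise to a list of numeric tuples, assuming no missing values (the helper name is illustrative):

def min_max_normalize(data):
    # Rescale every attribute (column) to [0, 1]: v' = (v - min_A) / (max_A - min_A)
    n_attrs = len(data[0])
    mins = [min(row[i] for row in data) for i in range(n_attrs)]
    maxs = [max(row[i] for row in data) for i in range(n_attrs)]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0  # guard constant columns
             for v, lo, hi in zip(row, mins, maxs)]
            for row in data]

# Example: income no longer outweighs age after normalization.
print(min_max_normalize([[23, 20000.0], [45, 90000.0], [31, 50000.0]]))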
k-Nearest-Neighbor Classifiers
• Common policy for missing values: assumed to be
maximally distant (given normalized attributes)
• Other popular metric: Manhattan (city-block)
metric
– Taking the absolute value of the differences rather than
squaring them
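The Manhattan variant only changes the per-attribute term; a short sketch for normalized numeric tuples:

def manhattan_distance(x1, x2):
    # Sum of absolute differences instead of squared differences.
    return sum(abs(a - b) for a, b in zip(x1, x2))

print(manhattan_distance([0.2, 0.7], [0.5, 0.1]))  # |0.2-0.5| + |0.7-0.1| ≈ 0.9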
k-Nearest-Neighbor Classifiers
• For k-nearest-neighbor classification, the unknown tuple is
assigned the most common class among its k nearest
neighbors.
• When k = 1, the unknown tuple is assigned the class of
the training tuple that is closest to it in pattern space.
• Nearest-neighbor classifiers can also be used for
prediction, that is, to return a real-valued prediction for
a given unknown tuple.
– In this case, the classifier returns the average value of the real-
valued labels associated with the k nearest neighbors of the
unknown tuple.
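A sketch of both uses, classification by majority vote and numeric prediction by averaging, assuming the training data is a list of (attribute tuple, label) pairs and plain Euclidean distance (all names are illustrative):

from collections import Counter
from math import sqrt

def euclidean(x1, x2):
    return sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def k_nearest(train, x, k):
    # train: list of (attribute_tuple, label) pairs
    return sorted(train, key=lambda pair: euclidean(pair[0], x))[:k]

def knn_classify(train, x, k):
    # Assign the most common class among the k nearest neighbors.
    votes = Counter(label for _, label in k_nearest(train, x, k))
    return votes.most_common(1)[0][0]

def knn_predict(train, x, k):
    # Numeric prediction: average the real-valued labels of the k neighbors.
    labels = [label for _, label in k_nearest(train, x, k)]
    return sum(labels) / len(labels)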
Categorical Attributes
• A simple method is to compare the corresponding
value of the attribute in tuple X1 with that in tuple X2.
• If the two are identical (e.g., tuples X1 and X2
both have the color blue), then the difference
between the two is taken as 0, otherwise 1.
• Other methods may incorporate more sophisticated
schemes for differential grading (e.g., where a larger
difference score is assigned, say, for blue and white
than for blue and black).
Missing Values
• In general, if the value of a given attribute A is
missing in tuple X1 and/or in tuple X2, we assume
the maximum possible difference.
• For categorical attributes, we take the difference
value to be 1 if either one or both of the
corresponding values of A are missing.
• If A is numeric and missing from both tuples X1 and
X2, then the difference is also taken to be 1.
– If only one value is missing and the other (which we’ll call
v’) is present and normalized, then we can take the
difference to be either |1 - v’| or |0 – v’| , whichever is
greater.
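A sketch of these per-attribute rules, assuming numeric values are already normalized to [0, 1] and missing values are represented by None:

def attribute_difference(v1, v2, nominal=False):
    # Missing-value policy: assume the maximum possible difference.
    if v1 is None and v2 is None:
        return 1.0                           # both values missing
    if v1 is None or v2 is None:
        if nominal:
            return 1.0                       # categorical: one value missing
        v = v1 if v1 is not None else v2     # the present, normalized value v'
        return max(abs(1 - v), abs(0 - v))   # whichever is greater
    if nominal:
        return 0.0 if v1 == v2 else 1.0      # identical categories -> 0, else 1
    return abs(v1 - v2)                      # both present and numeric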
Determining a good value for k
• k can be determined experimentally.
• Starting with k = 1, we use a test set to estimate the
error rate of the classifier.
• This process can be repeated each time by
incrementing k to allow for one more neighbor.
• The k value that gives the minimum error rate
may be selected.
• In general, the larger the number of training
tuples is, the larger the value of k will be.
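A sketch of this experimental procedure, assuming a labeled hold-out set and a classify(train, x, k) function such as the knn_classify sketch shown earlier:

def select_k(train, holdout, classify, max_k=25):
    # Try k = 1, 2, ..., max_k and keep the k with the lowest error rate
    # on the hold-out set.
    best_k, best_error = 1, float("inf")
    for k in range(1, max_k + 1):
        errors = sum(1 for x, label in holdout if classify(train, x, k) != label)
        error_rate = errors / len(holdout)
        if error_rate < best_error:
            best_k, best_error = k, error_rate
    return best_k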
Finding nearest neighbors efficiently
• Simplest way of finding nearest neighbor: linear
scan of the data
– Classification takes time proportional to the product of the
number of instances in training and test sets
• Nearest-neighbor search can be done more
efficiently using appropriate data structures
• There are two methods that represent the training data in
a tree structure:
– kD-trees (k-dimensional trees)
– Ball trees
kD-trees
• kD-tree is a binary tree that divides the input
space with a hyperplane and then splits each
partition again, recursively.
• The data structure is called a kD-tree because it
stores a set of points in k-dimensional space, k
being the number of attributes.
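A minimal sketch of such a tree, assuming each point is a list of numbers; for simplicity this version cycles through the attributes when choosing the splitting dimension (better split choices are discussed on a later slide):

class KDNode:
    def __init__(self, point, axis, left=None, right=None):
        self.point = point   # the training instance stored at this node
        self.axis = axis     # attribute whose hyperplane splits the space here
        self.left = left     # subtree with point[axis] below the split value
        self.right = right   # subtree with point[axis] at or above it

def build_kd_tree(points, depth=0):
    # Recursively split the point set with axis-aligned hyperplanes.
    if not points:
        return None
    axis = depth % len(points[0])      # cycle through the k dimensions
    points = sorted(points, key=lambda p: p[axis])
    median = len(points) // 2          # split at the median point
    return KDNode(points[median], axis,
                  build_kd_tree(points[:median], depth + 1),
                  build_kd_tree(points[median + 1:], depth + 1))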
kD-tree example
Using kD-trees: example
• The target, which is not one of the instances in the tree, is
marked by a star.
• The leaf node of the region containing the target is
colored black.
• To determine whether a closer one exists, first check whether it is
possible for a closer neighbor to lie within the node's sibling.
• Then back up to the parent node and check its sibling.
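A sketch of that search, reusing the KDNode structure from the sketch above: descend to the region containing the target, then back up, visiting a sibling only when the splitting hyperplane is closer than the best neighbor found so far:

from math import sqrt

def nearest_neighbor(node, target, best=None, best_dist=float("inf")):
    if node is None:
        return best, best_dist
    d = sqrt(sum((a - b) ** 2 for a, b in zip(node.point, target)))
    if d < best_dist:
        best, best_dist = node.point, d
    # Descend first into the subtree on the target's side of the hyperplane.
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best, best_dist = nearest_neighbor(near, target, best, best_dist)
    # Backtrack: search the sibling subtree only if a closer neighbor could
    # possibly lie on the other side of the splitting hyperplane.
    if abs(diff) < best_dist:
        best, best_dist = nearest_neighbor(far, target, best, best_dist)
    return best, best_dist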
More on kD-trees
• Complexity depends on depth of tree
• Amount of backtracking required depends on
quality of tree
• How to build a good tree? Need to find good split
point and split direction
– Split direction: direction with greatest variance
– Split point: median value or value closest to mean
along that direction
• Can apply this recursively
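A sketch of these two heuristics, assuming points are lists of numbers; they could replace the simple axis-cycling rule in the earlier build sketch:

def best_split_direction(points):
    # Choose the attribute (dimension) with the greatest variance.
    n = len(points)
    variances = []
    for axis in range(len(points[0])):
        values = [p[axis] for p in points]
        mean = sum(values) / n
        variances.append(sum((v - mean) ** 2 for v in values) / n)
    return variances.index(max(variances))

def best_split_point(points, axis):
    # Split at the median value along the chosen direction.
    values = sorted(p[axis] for p in points)
    return values[len(values) // 2]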
Building trees incrementally
• Big advantage of instance-based learning:
classifier can be updated incrementally
– Just add new training instance!
• We can do the same with kD-trees
• Heuristic strategy:
– Find leaf node containing new instance
– Place instance into leaf if leaf is empty
– Otherwise, split leaf
• Tree should be rebuilt occasionally
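A simplified sketch of incremental insertion, reusing the KDNode structure from the earlier kD-tree sketch; here every node stores a single instance, so the new instance is pushed down to the region that would contain it and attached as a new leaf:

def insert(node, point, depth=0):
    # Find the region that contains the new instance and attach it there.
    if node is None:
        return KDNode(point, axis=depth % len(point))
    if point[node.axis] < node.point[node.axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node

# After many insertions the tree can become unbalanced,
# so it should be rebuilt from scratch occasionally.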