
NPTEL

Video Course on Machine Learning

Professor Carl Gustaf Jansson, KTH

Week 4: Inductive Learning based on Symbolic Representations and Weak Theories

Video 4.4 Instance Based Learning Part 1


Subtopics for this lecture

- Instance Based Learning in general
- The structure of the instance space
- K-Nearest Neighbor algorithm
- Distance and Similarity metrics
- Weighted Nearest Neighbor algorithm
- Binary Linear Classifier
- Support Vector Machines
- Kernel Methods
- The Kernel Trick enabling binary non-linear classification
Instance Based Learning
Synonym: Memory-based learning

Instance-based learning is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. It is called instance-based because it evaluates test cases directly against the training instances themselves.

Instance based learning is a kind of lazy learning, where the evaluation is only
approximated locally and all computation is deferred until classification.

In the worst case, a hypothesis is a list of n training items and the computational
complexity of classifying a single new instance is O(n).

One advantage that instance based learning has over other methods of machine
learning is its high flexibility to adapt its model to previously unseen data.

Instance-based learners may store a new instance and/or throw an old instance away.
The structure of the instance space
In many machine learning approaches the internal structure of the instance space
is not explicitly considered.

However, the character of the instance space will always implicitly influence the
performance of learning algorithms even if it is not explicitly considered in the
algorithm design.

In contrast, for instance based learning, the character of the instance space is of
key importance.

In earlier lectures, a few crucial structural aspects of the instance space have been
mentioned:
- The number of features
- The value set of features
- Instances with special status: prototypes, outliers and near misses
- Similarity or distance measures
- Structural properties of the whole space such as sparseness, density etc.
These aspects will come into play now.
K-Nearest Neighbor Algorithm (KNN)
In the k-nearest neighbors algorithm (k-NN) the analysis is based on the k closest
training examples in the instance space.

k is a predefined positive integer, typically small and odd. An optimal k can potentially be found by special techniques (hyperparameter optimization techniques).

The typical representation of an instance x is a feature vector (a_1(x), a_2(x), ..., a_n(x)) together with a target function value f(x). The training phase is simply the storage of the feature vectors of all training instances in a data structure.

A distance metric is always needed. A default metric is the Euclidean distance
d(x_i, x_j) = sqrt( Sum_{r=1..n} (a_r(x_i) - a_r(x_j))^2 ).
A metric is typically manually defined but can also be learned.

The k-NN algorithm can be used for both classification and regression:

• In k-NN classification, the output is a class membership. A query instance x_q is assigned the class label most common among its k nearest neighbors. If k = 1, the instance is simply assigned the class of that single nearest neighbor.
• In k-NN regression, the output is the property value for the query instance x_q, computed as the average of the values of its k nearest neighbors.

For the examples we will use binary classification in a two-dimensional feature space.
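
As a concrete illustration, here is a minimal sketch of the k-NN prediction step, assuming Euclidean distance and a small hand-made data set (the function name and toy data are illustrative, not taken from the lecture):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3, mode="classify"):
    """Predict the label (classification) or value (regression) of x_query."""
    # Euclidean distance from the query to every stored training instance
    dists = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]                # indices of the k closest instances
    if mode == "classify":
        # majority vote among the k nearest neighbors
        return Counter(y_train[nearest]).most_common(1)[0][0]
    # regression: average of the neighbors' target values
    return y_train[nearest].mean()

# Toy two-dimensional binary classification problem
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.2], [3.1, 2.9]])
y = np.array(["BLUE", "BLUE", "BLUE", "RED", "RED"])
print(knn_predict(X, y, np.array([1.1, 1.0]), k=3))   # -> BLUE
```

The same neighbor search supports both modes: a majority vote for classification and an average for regression.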
Illustration of a simple classification application of the K-nearest neighbor algorithm

[Figure: the circles represent instances of the algorithm for values of K = 1, 3, 5. Legend: query instance, instance classified as BLUE, instance classified as RED]

In the example the query instance is classified as follows:
- In the k=1 case: BLUE
- In the k=3 case: RED
- In the k=5 case: BLUE
Implicit representation or visualization of
the Hypothesis space
It is obvious that in the case of instance-based learning there is only one
explicit space, the instance space. A hypothesis is never explicitly built
up.

The 'hypotheses' are implicit in the structure of the instance space.

One form of such an implicit representation is the so-called Voronoi Diagram.

A Voronoi Diagram is a partitioning of the decision surface into convex polyhedral surroundings of the training instances.

Each polyhedron covers the potential query instances positively determined by a training instance. Query points outside a specific polyhedron are closer to another training instance.

The approach can be extended to more than two dimensions.
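
For a two-dimensional instance space the diagram can be computed and drawn directly; a minimal sketch using SciPy's scipy.spatial.Voronoi, with purely illustrative random training points:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

# Illustrative 2D training instances (not from the lecture)
rng = np.random.default_rng(0)
points = rng.uniform(0, 10, size=(12, 2))

# Each Voronoi cell is the region of query points closest to one training
# instance, i.e. the decision region of a 1-NN classifier for that instance.
vor = Voronoi(points)
voronoi_plot_2d(vor)
plt.title("Voronoi cells of the training instances (1-NN regions)")
plt.show()
```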


Normed and Inner Product Euclidean Vector Spaces
In this presentation we only consider vectors in a Euclidean space.

A normed vector space is a vector space over the real or complex numbers on which a norm, or length, is defined. A norm is a real-valued function that has the following properties:
1. A norm is written as d(x) or |x|, where x is a vector
2. d(x) >= 0
3. d(k*x) = |k| * d(x)
4. d(x + y) <= d(x) + d(y)
5. A Euclidean norm is written as ||x|| and equals sqrt( Sum_{r=1..n} a_r(x)^2 )
A norm applied to the difference of two vectors is called a distance: d(x - y).

An inner product space is a normed Euclidean vector space on which an inner product, or dot product, is defined. The inner product associates each pair of vectors in the space with a scalar quantity. Inner products allow the introduction of the intuitive geometrical notion of the angle between two vectors.
6. An inner product or dot product is written as (x.y), where x and y are vectors
7. (x.y) = Sum_{r=1..n} a_r(x) * a_r(y)
8. (x.y) = d(x) * d(y) * cos(angle between x and y)
9. If the angle is 90 degrees, (x.y) = d(x) * d(y) * cos(90°) = 0, i.e. x and y are orthogonal
10. A Euclidean norm ||x|| = sqrt( (x.x) ).
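
A short NumPy sketch of properties 5-10 for two concrete vectors (the example vectors are purely illustrative):

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([4.0, 3.0])

norm_x = np.sqrt(np.dot(x, x))        # Euclidean norm ||x|| = sqrt((x.x)) = 5.0
dot_xy = np.dot(x, y)                 # inner product (x.y) = Sum a_r(x)*a_r(y) = 24.0
cos_xy = dot_xy / (np.linalg.norm(x) * np.linalg.norm(y))  # (x.y) = ||x|| ||y|| cos(angle)
dist_xy = np.linalg.norm(x - y)       # norm of a difference of vectors = a distance
print(norm_x, dot_xy, cos_xy, dist_xy)   # 5.0 24.0 0.96 1.414...
```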
Distance and Similarity Metrics
A distance metric (measure, function) is typically a real-valued function that quantifies
the distance between two objects:
• distances between a point and itself are zero: d(x,x) = 0;
• all other distances are larger than zero: d(x, y) > 0
• distances are symmetric: d(y,x) = d(x, y)
• detours can not shorten the distance: d(x, z) <= d(x, y)+d(y, z)

Distance metrics and similarity metrics have been developed more or less
independently for different purposes, but intuitively specific similarity metrics are
inverses of corresponding distance metrics and can be transformed into each
other.

Typically, similarity metrics take values in the range -1...0...1, where 1 means that the objects are regarded as identical and -1 corresponds to the maximum distance considered by the corresponding distance metric. Distance metrics can take arbitrary values from 0 to infinity. Through suitable transforms and normalizations, distance and similarity metrics can be made comparable.

We will exemplify this with metrics in a normed Euclidean vector space and metrics based on overlapping elements.
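
One common form of such a transform (an assumption for illustration; the lecture does not prescribe a particular one) maps an unbounded distance to a bounded similarity, for example s = 1 / (1 + d):

```python
def distance_to_similarity(d):
    """Map a distance in [0, infinity) to a similarity in (0, 1]; d = 0 gives similarity 1."""
    return 1.0 / (1.0 + d)

for d in [0.0, 0.5, 2.0, 10.0]:
    print(d, "->", round(distance_to_similarity(d), 3))   # 1.0, 0.667, 0.333, 0.091
```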
Metrics in Normed and Inner product vector spaces
Minkowski distance
The Minkowski distance is a metric in a normed Euclidean vector space.
d(x_i, x_j) = ( Sum_{r=1..n} |a_r(x_i) - a_r(x_j)|^k )^(1/k), range: 0..infinity

Manhattan or taxicab distance = the Minkowski distance with k=1
d(x_i, x_j) = Sum_{r=1..n} |a_r(x_i) - a_r(x_j)|, range: 0..infinity
The sum of the absolute differences of the Cartesian coordinates of the two vectors.

Euclidean distance = the Minkowski distance with k=2
||x_i - x_j|| = d(x_i, x_j) = sqrt( Sum_{r=1..n} (a_r(x_i) - a_r(x_j))^2 ), range: 0..infinity
The classic Euclidean distance according to the theorem of Pythagoras.

Chebyshev or chessboard distance = the Minkowski distance as k -> infinity
d(x_i, x_j) = lim_{k->infinity} ( Sum_{r=1..n} |a_r(x_i) - a_r(x_j)|^k )^(1/k) = max_{r=1..n} |a_r(x_i) - a_r(x_j)|, range: 0..infinity
The greatest of the differences along any coordinate dimension; the minimum number of moves a king requires to move between two squares on a chessboard.

Cosine similarity measure
s(x_i, x_j) = cos(angle between x_i and x_j) = ( Sum_{r=1..n} a_r(x_i) * a_r(x_j) ) / ( sqrt( Sum_{r=1..n} a_r(x_i)^2 ) * sqrt( Sum_{r=1..n} a_r(x_j)^2 ) ), range: -1..1
The cosine measure disregards the magnitude of the vectors, which is preferable for certain data sets.
Different metrics give rise to different Voronoi diagrams
Example of Cosine similarity
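
The Minkowski family and the cosine measure fit in a few lines of NumPy; a minimal sketch (function names and example vectors are illustrative):

```python
import numpy as np

def minkowski(x, y, k):
    """Minkowski distance of order k; k=1 Manhattan, k=2 Euclidean, k=inf Chebyshev."""
    if np.isinf(k):
        return np.max(np.abs(x - y))
    return np.sum(np.abs(x - y) ** k) ** (1.0 / k)

def cosine_similarity(x, y):
    """Cosine of the angle between x and y; ignores the vectors' magnitudes."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(minkowski(x, y, 1), minkowski(x, y, 2), minkowski(x, y, np.inf))  # 6.0 3.742 3.0
print(cosine_similarity(x, y))   # 1.0: same direction, different magnitude
```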
Metrics based on overlapping elements
One category of metrics measures the degree of overlap of elements in sets, arrays or vectors. Elements can be binary digits, numbers or words.

Levenshtein Distance
A string metric for measuring the difference between two sequences.
Informally, the Levenshtein distance between two words is the minimum
number of single-character edits (insertions, deletions or substitutions)
required to change one word into the other.

Jaccard Similarity, Index or Coefficient
The Jaccard index, or Jaccard coefficient, measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets.

Hamming distance
The Hamming distance between two strings of equal length is the number of
positions at which the corresponding symbols are different. In other words, it
measures the minimum number of substitutions required to change one string
into the other.
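
Minimal sketches of the three overlap-based metrics in plain Python (the example strings and sets are illustrative):

```python
def hamming(s1, s2):
    """Number of positions at which two equal-length strings differ."""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

def jaccard(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def levenshtein(s1, s2):
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        cur = [i]
        for j, c2 in enumerate(s2, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (c1 != c2)))    # substitution
        prev = cur
    return prev[-1]

print(hamming("karolin", "kathrin"))     # 3
print(jaccard({1, 2, 3}, {2, 3, 4}))     # 0.5
print(levenshtein("kitten", "sitting"))  # 3
```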
Issues to consider for the k-nearest neighbor algorithm
In binary (two class) classification problems, it is helpful to choose k to be an odd number as
this avoids tied votes. One way of choosing the empirically optimal k in this setting is via the
bootstrap method.

The principle of "majority voting" for deciding the class labels can be problematic when the
class distribution is skewed.

Instances of a more frequent class tend to dominate the prediction of the new examples, because
they tend to be more common among the k nearest neighbors due to their large number.

Irrelevant features within a large feature set tend to degrade performance.

The simple model where all instances are treated fairly using the same distance metric may be
inadequate e.g. in sparse instance spaces.
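
The weighted nearest neighbor algorithm listed among the subtopics addresses this last point; as a preview, a minimal sketch of distance-weighted voting, assuming inverse-distance weights (one common choice, not necessarily the lecture's exact formulation):

```python
import numpy as np
from collections import defaultdict

def weighted_knn_classify(X_train, y_train, x_query, k=5):
    """Classify x_query with votes weighted by the inverse distance to the query."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = defaultdict(float)
    for i in nearest:
        votes[y_train[i]] += 1.0 / (dists[i] + 1e-12)   # closer neighbors count more
    return max(votes, key=votes.get)
```

Closer neighbors then outweigh a larger number of more distant ones, which mitigates the effect of a skewed class distribution.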
The Bootstrap method
The bootstrap method is a statistical technique for estimating quantities about a population by averaging estimates computed from multiple small data samples.

Importantly, samples are constructed by drawing observations from a large data sample one at a time and returning them to the data sample after they have been chosen. This allows a given observation to be included in a given small sample more than once. This approach to sampling is called sampling with replacement.

The bootstrap method can be used to estimate a quantity of a population. This is done by repeatedly taking small samples, calculating the statistic, and taking the average of the calculated statistics.

We can summarize this procedure as follows:


1. Choose a number of bootstrap samples to perform
2. Choose a sample size
3. For each bootstrap sample
a) Draw a sample with replacement with the chosen size
b) Calculate the statistic on the sample
4. Calculate the mean of the calculated sample statistics.
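
A minimal sketch of this procedure in Python, assuming NumPy and the mean as the statistic of interest (the data values are illustrative):

```python
import numpy as np

def bootstrap_estimate(data, statistic, n_samples=1000, sample_size=None, seed=0):
    """Average a statistic over bootstrap samples drawn with replacement."""
    rng = np.random.default_rng(seed)
    sample_size = sample_size or len(data)                         # step 2
    estimates = []
    for _ in range(n_samples):                                     # step 3
        sample = rng.choice(data, size=sample_size, replace=True)  # step 3a
        estimates.append(statistic(sample))                        # step 3b
    return np.mean(estimates)                                      # step 4

data = np.array([2.3, 1.9, 2.7, 3.1, 2.0, 2.5, 1.8, 2.9])
print(bootstrap_estimate(data, np.mean))   # close to the sample mean of `data`
```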
To be continued in Part 2
