Day & Time: Monday (10am-11am & 3pm-4pm)
Tuesday (10am-11am)
Wednesday (10am-11am & 3pm-4pm)
Friday (9am-10am, 11am-12pm, 2pm-3pm)
Dr. Srinivasa L. Chakravarthy
&
Smt. Jyotsna Rani Thota
Department of CSE
GITAM Institute of Technology (GIT)
Visakhapatnam – 530045
Email: [email protected] & [email protected]
Course: EID 403 - Machine Learning
Course objectives
● Explore the various disciplines connected with ML.
● Explore the efficiency of learning with inductive bias.
● Explore ML algorithms such as decision tree learning.
● Explore algorithms such as artificial neural networks, genetic
programming, Bayesian learning, nearest neighbour, and hidden
Markov models.
Learning Outcomes
● Identify the various applications connected with ML.
● Classify the efficiency of ML algorithms using the inductive bias
technique.
● Discriminate the purpose of each ML algorithm.
● Analyze an application and correlate it with the available ML
algorithms.
● Choose an ML algorithm to develop a project.
Syllabus
Reference book 1. Title: Machine Learning; Author: Tom M. Mitchell
Reference book 2. Title: Introduction to Machine Learning; Author: Ethem Alpaydin
Module 3 (Chapter 7)
It includes:
Chapter 6 - Bayesian Learning
&
Chapter 9 - Computational Learning Theory
● Probably learning an approximately correct hypothesis
● Sample complexity
● Finite and infinite hypothesis spaces
● Mistake bound model of learning
Introduction
This chapter introduces:
- A characterization of the difficulty of several types of ML problems.
- The capabilities of several types of ML algorithms.
- Two specific frameworks for analysing ML algorithms:
1. Probably approximately correct (PAC) framework - here we identify classes of
hypotheses that can and cannot be learned from a polynomial number of
training examples.
2. Mistake bound framework - here we examine the number of training errors
made by a learner before it determines the correct hypothesis.
Introduction (cont.)
The goal in this chapter is to answer questions from computational learning theory
such as:
1. Sample complexity - How many training examples are needed for a
learner to converge (with high probability) to a successful hypothesis?
2. Computational complexity - How much computational effort is needed for
a learner to converge (with high probability) to a successful hypothesis?
3. Mistake bound - How many training examples will the learner misclassify
before converging to a successful hypothesis?
Probably learning an approximately correct hypothesis-
The Problem Setting-
● Set of instances X, described by attributes such as age (young/old) and height (short/tall)
● Set of hypotheses H
● Set of possible target concepts C (i.e., c ∈ C where c: X → {0, 1})
● Training instances generated by drawing instances from X according to a fixed, unknown probability distribution D
For example, D might be the distribution of instances generated by observing
“people who walk out of the largest sports store.”
Probably learning an approximately correct hypothesis(cont.)
The Problem Setting-(cont.)
The learner L considers some set H of possible hypotheses when attempting
to learn the target concept.
After observing a sequence of training examples of the target concept c, L must
output some hypothesis h from H, which is its estimate of c.
We evaluate the success of L by the performance of h over new instances
drawn randomly from X according to D.
With this setting, we are interested in characterizing the performance of various
learners L using various hypothesis spaces H, when learning individual target concepts
drawn from various classes C.
Probably learning an approximately correct hypothesis(cont.)
Error of a Hypothesis-
To identify how closely the learner's output hypothesis h approximates the actual
target concept c, we define the true error of a hypothesis h with respect to target
concept c and distribution D as
error_D(h) ≡ Pr_{x∈D}[ c(x) ≠ h(x) ],
i.e., the probability that h misclassifies an instance drawn at random according to D.
Probably learning an approximately correct hypothesis(cont.)
Error of a Hypothesis-(cont.)
● Two notions of error: the training error of h (the fraction of the training examples misclassified by h) and the true error of h (the probability that h misclassifies an instance drawn at random according to D).
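As a rough sketch (notation assumed for illustration): the training error of h over a set of m training examples is the fraction of those examples that h misclassifies,
error_train(h) = (1/m) |{ x in the training set : h(x) ≠ c(x) }|,
while the true error is error_D(h) = Pr_{x∈D}[ h(x) ≠ c(x) ]. The training error is observable to the learner; the true error is the quantity PAC learning seeks to bound.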
Probably learning an approximately correct hypothesis(cont.)
PAC Learnability-
To characterize classes of target concepts that can be reliably learned,
what kind of statements about learnability can we expect to hold TRUE?
Generally, there are two difficulties:
1. Unless we provide training examples corresponding to every possible instance
in X, there may be multiple hypotheses consistent with the training data, and
the learner cannot be certain to pick the one corresponding to the target concept.
2. Given that the training examples are drawn randomly, there is always a
nonzero probability that the training examples encountered by the learner will be
misleading.
Probably learning an approximately correct hypothesis(cont.)
PAC Learnability-(cont.)
To overcome these difficulties:
1. We do not require the learner to output a zero-error hypothesis; instead we
require only that its error be bounded by some constant ε that can be made
arbitrarily small.
2. We do not require the learner to succeed for every sequence of randomly
drawn training examples; instead we require only that its probability of failure be
bounded by some constant δ that can be made arbitrarily small.
In short, we require only that the learner probably learn a hypothesis that is
approximately correct - hence the term probably approximately correct (PAC) learning.
Probably learning an approximately correct hypothesis(cont.)
PAC Learnability-(cont.)
Definition: C is PAC-learnable by L using H if, for all c ∈ C, all distributions D over X,
all ε with 0 < ε < 1/2 and all δ with 0 < δ < 1/2, learner L will with probability at least
(1 - δ) output a hypothesis h ∈ H such that error_D(h) ≤ ε, in time that is polynomial
in 1/ε, 1/δ, n, and size(c).
Here, n is the size of instances in X and size(c) is the encoding length of c in C.
For example, if C is the class of conjunctions of k boolean features, then size(c) is the
number of boolean features actually used to describe c.
Sample complexity for Finite hypothesis spaces-
PAC learnability is largely determined by the number of training examples required by the learner.
The growth in the required number of training examples with problem size is called the
"sample complexity" of the learning problem.
In practice, the factor that most limits the success of a learner is the limited availability of
training data.
We can derive a general bound on the sample complexity for a very broad class of learners,
called consistent learners. A learner is consistent if it outputs hypotheses that perfectly
fit the training data.
Sample complexity for Finite hypothesis spaces-(cont.)
Recall that the version space VS_{H,D} is the set of all hypotheses h ∈ H that
correctly classify the training examples D.
Every consistent learner outputs a hypothesis belonging to the version space.
Therefore, to bound the number of examples needed by any consistent learner, we
need only bound the number of examples needed to assure that the version space
contains no unacceptable hypotheses.
Sample complexity for Finite hypothesis spaces-(cont.)
Exhausting the Version space
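A sketch of the key definition and theorem, following the standard treatment:
The version space VS_{H,D} is said to be ε-exhausted with respect to c and D if every hypothesis h in VS_{H,D} has true error less than ε with respect to c and D.
Theorem (ε-exhausting the version space): if the hypothesis space H is finite, and D is a sequence of m ≥ 1 independent, randomly drawn examples of some target concept c, then for any 0 ≤ ε ≤ 1, the probability that VS_{H,D} is not ε-exhausted (with respect to c) is at most |H| e^(-εm).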
Sample complexity for Finite hypothesis spaces-(cont.)
How many examples will ε-exhaust the VS?
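Setting |H| e^(-εm) ≤ δ and solving for m gives the standard sample-complexity result (a sketch, with δ denoting the allowed probability of failure):
m ≥ (1/ε)(ln|H| + ln(1/δ)).
This many randomly drawn training examples suffice to ensure that, with probability at least (1 - δ), every hypothesis remaining in the version space has true error less than ε.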
Sample complexity for Finite hypothesis spaces-(cont.)
Conjunctions of Boolean Literals Are PAC-Learnable
Applying the sample-complexity bound above:
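A sketch of the argument, under the standard counting assumption that each of the n variables may appear as a positive literal, as a negated literal, or not at all, so that |H| = 3^n: substituting into the bound above gives
m ≥ (1/ε)(n ln 3 + ln(1/δ)),
which is polynomial in n, 1/ε and 1/δ, so the class is PAC-learnable by any consistent learner (e.g. FIND-S). A small Python helper (hypothetical function name, for illustration only) to evaluate the bound numerically:

import math

def pac_sample_bound(n_literals, epsilon, delta):
    # m >= (1/eps) * (n ln 3 + ln(1/delta)) for conjunctions of up to n boolean literals
    m = (1.0 / epsilon) * (n_literals * math.log(3) + math.log(1.0 / delta))
    return math.ceil(m)

print(pac_sample_bound(10, epsilon=0.1, delta=0.05))  # -> 140 examples suffice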
Sample complexity for Finite hypothesis spaces-(cont.)
Agnostic Learning
If H does not contain the target concept c, then a zero-error hypothesis cannot
always be found.
In this case, we might ask our learner to output the hypothesis from H that has the
minimum error over the training examples.
A learner that makes no assumption that the target concept is representable by H
and that simply finds the hypothesis with minimum training error is often called an
agnostic learner.
Sample complexity for Finite hypothesis spaces-(cont.)
Agnostic Learning
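A sketch of the corresponding bound, based on the Hoeffding-style argument standard for this case: to guarantee with probability at least (1 - δ) that the hypothesis with minimum training error has true error within ε of its training error, it suffices to have
m ≥ (1/(2ε^2))(ln|H| + ln(1/δ)).
Note that m now grows as the square of 1/ε, rather than linearly in 1/ε.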
Sample complexity for Infinite hypothesis spaces-
So far we have seen that the sample complexity for PAC learning grows with the
logarithm of the size of the hypothesis space. However, the drawbacks of
characterizing sample complexity in terms of |H| are:
1. It can lead to quite weak bounds.
2. In the case of infinite hypothesis spaces we cannot apply the bound
m ≥ (1/ε)(ln|H| + ln(1/δ)) at all.
Here we consider a second measure of the complexity of H, called the Vapnik-Chervonenkis
dimension, or VC dimension, VC(H).
We use VC(H) instead of |H| to state bounds on sample complexity.
Sample complexity for Infinite hypothesis spaces-(cont.)
Shattering a set of Instances
The VC dimension measures the complexity of the hypothesis space H not by the
number of distinct hypotheses |H|, but by the number of distinct instances
from X that can be completely discriminated using H.
Definition: a set of instances S is shattered by hypothesis space H if and only if,
for every dichotomy of S, there exists some hypothesis in H consistent with this dichotomy.
[Figure] A set of 3 instances shattered by eight hypotheses: for every possible
dichotomy of the instances, there exists a corresponding hypothesis.
Sample complexity for Infinite hypothesis spaces-(cont.)
Vapnik-Chervonenkis Dimension
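A sketch of the standard definition: the Vapnik-Chervonenkis dimension, VC(H), of hypothesis space H defined over instance space X is the size of the largest finite subset of X shattered by H. If arbitrarily large finite subsets of X can be shattered by H, then VC(H) ≡ ∞.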
Note: for any finite H, VC(H) ≤ log2|H|. To see this, suppose VC(H) = d; then H must
contain at least 2^d distinct hypotheses in order to shatter d instances.
Hence, 2^d ≤ |H| and d = VC(H) ≤ log2|H|.
Sample complexity for Infinite hypothesis spaces-(cont.)
Example: VC dimension for conjunctions of boolean literals
Suppose each instance in X is described by the conjunction of exactly 3
boolean literals, and
suppose that each hypothesis in H is described by a conjunction of up to 3
boolean literals. What is VC(H)?
We can show that VC(H) ≥ 3 by exhibiting a set of 3 instances that H shatters.
Represent each instance by a 3-bit string, where bit i corresponds to literal l_i:
Instance 1: 100
Instance 2: 010
Instance 3: 001
Sample complexity for Infinite hypothesis spaces-(cont.)
Example: VC dimension for linear decision surfaces
The VC dimension for linear decision surfaces in the x, y plane is 3.
[Figure] (a) A set of 3 points that can be shattered using linear decision surfaces.
(b) A set of 3 points that cannot be shattered.
NOTE: Returning to the boolean literal example above, the VC dimension for
conjunctions of n boolean literals is at least n. In fact it is exactly n, though showing
this is more difficult because it requires demonstrating that no set of n+1 instances
can be shattered.
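To make the boolean literal example concrete, here is a minimal Python sketch (helper names are illustrative, not from the chapter) that brute-forces every dichotomy of the three instances {100, 010, 001} and confirms that conjunctions of up to 3 literals shatter them:

from itertools import combinations, product

def conjunction_hypotheses(n):
    # All conjunctions of up to n literals; each chosen variable is required
    # to be 1 (positive literal) or 0 (negated literal).
    hyps = []
    for k in range(n + 1):
        for idxs in combinations(range(n), k):
            for values in product([1, 0], repeat=k):
                hyps.append(lambda x, idxs=idxs, values=values:
                            all(x[i] == v for i, v in zip(idxs, values)))
    return hyps

def shatters(hyps, instances):
    # H shatters the instances iff its hypotheses realize all 2^|S| dichotomies.
    dichotomies = {tuple(h(x) for x in instances) for h in hyps}
    return len(dichotomies) == 2 ** len(instances)

instances = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(shatters(conjunction_hypotheses(3), instances))  # True, so VC(H) >= 3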
Sample complexity for Infinite hypothesis spaces-(cont.)
Sample complexity and VC Dimension
Earlier, we considered the question: how many randomly drawn training examples
suffice to probably approximately learn any target concept in C?
The answer was m ≥ (1/ε)(ln|H| + ln(1/δ)); here the number of training examples grows
logarithmically in |H|.
Now, using VC(H) as the measure of the complexity of H, it is possible to derive an
alternative bound, namely
m ≥ (1/ε)(4 log2(2/δ) + 8 VC(H) log2(13/ε)).
Here the number of training examples grows as (1/ε) times log(1/ε).
Mistake Bound model
So far, the learning settings we have discussed differ in aspects such as:
1. How the training examples are generated.
2. Noise in the data.
3. The definition of success (whether the target concept must be learned exactly,
or only probably and approximately).
4. The measure according to which the learner is evaluated.
Now consider the mistake bound model of learning, in which
the learner is evaluated by the total number of mistakes it makes before it converges
to the correct hypothesis.
Mistake Bound model(cont.)
The mistake bound learning problem may be studied in various specific settings.
For example, we might count the number of mistakes made before PAC-learning the
target concept.
In the examples below, we instead count the number of mistakes made before
learning the target concept exactly, meaning converging to a hypothesis h such that
(∀x) h(x) = c(x).
Mistake Bound for FIND-S Algorithm
Assume C ⊆ H and the training data is noise-free; then FIND-S converges to a hypothesis that makes no errors. The question is how many mistakes it makes along the way.
Mistake Bound for FIND-S Algorithm(cont.)
FIND-S begins with the most specific hypothesis and generalizes it as positive
training examples are observed.
Therefore, FIND-S can never mistakenly classify a negative example as positive.
So, to bound the number of mistakes made by FIND-S, we need only count the number
of times it misclassifies truly positive examples as negative. For conjunctions of n
boolean literals this is at most n + 1: the first mistake removes up to n of the 2n
literals in the initial hypothesis, and each subsequent mistake removes at least one more literal.
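As a minimal sketch (function and variable names are illustrative, not from the chapter), FIND-S for conjunctions of boolean literals can be written as:

def find_s(examples, n):
    # FIND-S over n boolean attributes.
    # examples: list of (x, label) pairs, x a tuple of n bits, label True/False.
    # The hypothesis stores, per attribute: 'empty' (maximally specific, nothing
    # accepted yet), 1 or 0 (required literal value), or None (literal dropped).
    h = ['empty'] * n
    for x, label in examples:
        if not label:
            continue                      # negative examples are ignored
        for i in range(n):
            if h[i] == 'empty':
                h[i] = x[i]               # first positive example fixes the literal
            elif h[i] is not None and h[i] != x[i]:
                h[i] = None               # conflicting value: drop the literal
    return h

print(find_s([((1, 0, 1), True), ((1, 1, 1), True), ((0, 0, 0), False)], 3))
# -> [1, None, 1], i.e. the conjunction l1 AND l3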
Mistake Bound for Halving Algorithm
● The Halving algorithm maintains the version space and classifies each new instance
by a majority vote of the version space hypotheses: if the majority classify the new
instance as positive, then this prediction is output by the learner.
● The Halving algorithm makes a mistake only when the majority of hypotheses in its
current version space misclassify the new instance.
● In that case, once the correct classification is revealed, all of those (majority)
incorrect hypotheses are eliminated, so the version space is reduced to at most half
its current size. Hence the number of mistakes made before exactly learning the
target concept is at most log2|H|.
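A minimal sketch of the idea in Python (names are illustrative; hypotheses are assumed to be plain callables over instances):

def halving_algorithm(hypotheses, stream):
    # hypotheses: finite list of callables h(x) -> bool (the initial version space).
    # stream: iterable of (x, true_label) pairs presented online.
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        votes = sum(h(x) for h in version_space)
        prediction = 2 * votes >= len(version_space)   # majority vote (ties -> positive)
        if prediction != y:
            mistakes += 1      # majority was wrong: at least half the version space goes
        version_space = [h for h in version_space if h(x) == y]
    return version_space, mistakes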
Optimal Mistake Bounds
Therefore, M_FIND-S(C) = n + 1 and M_Halving(C) ≤ log2(|C|).
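A brief sketch of the surrounding definition and result, as usually stated: the optimal mistake bound Opt(C) is the minimum, over all possible learning algorithms, of the maximum number of mistakes made to exactly learn any concept in C. Littlestone (1987) showed that for any non-empty concept class C,
VC(C) ≤ Opt(C) ≤ M_Halving(C) ≤ log2(|C|).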
END OF CHAPTER 7 & MODULE 3