UNIT-IV
Computational Learning
Probability Learning
Probability learning is a machine
learning technique that uses
probability theory to make
predictions and decisions.
It's a statistical approach that
models uncertainty in data by using
probability distributions.
A probability
distribution describes the possible
values and the corresponding
likelihoods that a random variable
can take. For example, the
probabilities of observing 0, 1, 2, …,
100 heads, respectively, in 100
tosses of a coin.
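The heads-count example can be made concrete with the binomial distribution; this is a minimal sketch (the helper name `binomial_pmf` is ours, not from the text) that computes the probability of each possible head count in 100 fair-coin tosses:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n tosses of a coin with P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The distribution over the number of heads in 100 tosses of a fair coin:
dist = {k: binomial_pmf(k, 100, 0.5) for k in range(101)}

print(round(dist[50], 4))            # 0.0796 -- the most likely count
print(round(sum(dist.values()), 4))  # 1.0    -- probabilities sum to 1
```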
A frequentist calculates probabilities
from the relative frequencies of
specific events out of the total
number of trials. For example, after
observing 56 heads in 100 tosses,
P(Head) = 56/100 = 0.56.
Bayesian
A Bayesian updates a prior belief with
current experiment data using
Bayesian inference. For example, a
Bayesian can combine a prior belief
that the coin is fair with the current
experiment data (56 heads out of
100) to form a new, posterior belief.
A Frequentist estimates the most
likely value for P(head) (a point
estimate). But a Bayesian tracks all
possibilities with the corresponding
certainties. This calculation is
complex but contains richer
information for further computation.
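The contrast can be sketched in code. The Beta(50, 50) prior below is one assumed way to encode "the coin is fair"; the text does not specify a prior, so the numbers are for illustration only:

```python
# Frequentist point estimate vs. Bayesian posterior for P(head),
# using the 56-heads-out-of-100 experiment from the text.
heads, tosses = 56, 100

# Frequentist: the single most likely value (maximum likelihood).
p_mle = heads / tosses  # 0.56

# Bayesian: start from a Beta(a, b) prior encoding "the coin is fair"
# and update it with the data. The Beta prior is conjugate to coin
# flips, so the posterior is again a Beta distribution.
a_prior, b_prior = 50, 50            # a fairly strong belief in fairness
a_post = a_prior + heads             # 106
b_post = b_prior + (tosses - heads)  # 94

# The posterior mean sits between the prior (0.5) and the data (0.56):
posterior_mean = a_post / (a_post + b_post)
print(p_mle, round(posterior_mean, 3))  # 0.56 0.53
```

The Bayesian answer is a whole distribution (Beta(106, 94)), not just the mean, which is the "richer information" mentioned above.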
Hypothesis
The hypothesis is one of the
commonly used concepts of
statistics in Machine Learning.
It is specifically used in Supervised
Machine learning, where an ML
model learns a function that best
maps the input to corresponding
outputs with the help of an available
dataset.
Hypothesis
There are some common methods
used to find a possible hypothesis
from the hypothesis space, where
the hypothesis space is represented
by uppercase-h (H) and a hypothesis
by lowercase-h (h). These are
defined as follows:
The hypothesis (h) can be formulated in
machine learning as follows:
y = mx + c
Where,
y: range (the predicted output)
m: slope of the line which divides the test
data, i.e. the change in y divided by the
change in x
x: domain (the input)
c: intercept (constant)
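As a minimal illustration, each choice of the parameters m and c picks out one hypothesis h from the space of all lines:

```python
def hypothesis(x: float, m: float, c: float) -> float:
    """One hypothesis h from the space of lines: y = m*x + c."""
    return m * x + c

# Two different hypotheses drawn from the same hypothesis space H:
print(hypothesis(2.0, m=3.0, c=1.0))   # 7.0
print(hypothesis(2.0, m=-1.0, c=0.5))  # -1.5
```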
Hypothesis space (H):
Hypothesis space is defined as a
set of all possible legal
hypotheses; hence it is also
known as a hypothesis set.
Sample complexity
Sample complexity is a concept
in machine learning that determines the
number of data samples required to
achieve a certain level of learning
performance.
Its importance lies in its ability to assess
the efficiency of a learning algorithm.
A more efficient algorithm needs fewer
samples to learn effectively, reducing the
resources required for data acquisition
and storage.
There are two types of sample complexities
that are often referenced: worst-case sample
complexity and average-case sample
complexity.
Worst-case sample complexity refers to the
maximum number of samples required to
reach a specific learning goal, irrespective of
the data distribution.
Average-case sample complexity, on the
other hand, considers the average number of
samples needed, assuming the data follows a
certain distribution.
Mathematical backbone of
sample complexity
Probably Approximately Correct
(PAC) learning theory provides a
framework to relate VC dimension to
sample complexity. PAC learning
seeks to identify the minimum
sample size that will, with high
probability, produce a hypothesis
within a specified error tolerance of
the best possible hypothesis.
The PAC learning bound is given by:
N >= (1/ε) * (ln|H| + ln(1/δ)),
Where,
N is the sample size,
ε is the maximum acceptable error
(the 'approximately correct' part),
|H| is the size of the hypothesis
space (related to VC dimension),
δ is the acceptable failure probability
(the 'probably' part).
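The bound can be evaluated directly; `pac_sample_size` below is a hypothetical helper that simply plugs values into the formula above, using natural logarithms:

```python
from math import ceil, log

def pac_sample_size(h_size: int, epsilon: float, delta: float) -> int:
    """Smallest integer N satisfying N >= (1/eps) * (ln|H| + ln(1/delta))."""
    return ceil((1.0 / epsilon) * (log(h_size) + log(1.0 / delta)))

# |H| = 2^10 hypotheses, 5% error tolerance, 95% confidence (delta = 0.05):
print(pac_sample_size(2**10, epsilon=0.05, delta=0.05))  # 199
```

Note how N grows only logarithmically in |H|: doubling the hypothesis space adds a constant number of samples.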
Finite hypothesis space
A finite hypothesis space consists of
a limited, countable set of
hypotheses (models or functions)
that can be selected to explain or
predict outcomes based on the input
data.
Example: Decision trees with a
limited depth, a fixed number of
linear classifiers, or a specific set of
rules in rule-based learning.
Implications
Easier to manage and evaluate since
the number of hypotheses is small.
Risk of underfitting if the hypothesis
space is too constrained and does
not capture the underlying data
distribution.
It can be easier to ensure
generalization since overfitting can
be less of a concern due to the
limited complexity.
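A small finite hypothesis space can even be enumerated exhaustively. This sketch (the labelled dataset is made up for illustration) lists all 8 monotone conjunctions over 3 boolean inputs and keeps those consistent with the data:

```python
from itertools import combinations

# A finite hypothesis space: all monotone conjunctions over 3 boolean
# inputs (including the empty conjunction, which always predicts True).
VARS = (0, 1, 2)
H = [subset for r in range(4) for subset in combinations(VARS, r)]
print(len(H))  # 8 hypotheses in total

def predict(h, x):
    """h is a tuple of variable indices; predict True iff all are set in x."""
    return all(x[i] for i in h)

# Keep every hypothesis consistent with a small labelled dataset:
data = [((1, 1, 0), True), ((1, 0, 0), False), ((0, 1, 1), False)]
consistent = [h for h in H if all(predict(h, x) == y for x, y in data)]
print(consistent)  # [(0, 1)]  -- only "x1 AND x2" fits all three examples
```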
Infinite hypothesis space
An infinite hypothesis space contains
an unbounded or uncountable set of
hypotheses. This means there are
potentially limitless models that could
be considered for fitting the data.
Example: Linear regression with any
real-valued coefficients, neural
networks with varying architectures
and parameters, or kernel methods in
support vector machines (SVMs).
Implications
Greater flexibility and the ability to
capture complex patterns in the
data.
Higher risk of overfitting, as the
model can become too complex and
fit noise in the data rather than the
underlying distribution.
Requires more advanced techniques
for regularization and validation to
ensure that the model generalizes
well.
Mistake bound model
The mistake bound (MB) model in machine
learning is a model that evaluates a learner
based on the total number of mistakes it
makes before reaching the correct
hypothesis.
The model is used in online learning
scenarios, where the learning process is
made up of rounds.
In each round, the learner is asked about an
aspect of the learned phenomenon, makes a
prediction, and is told if it was correct.
The goal of the learner is to make
the minimum number of mistakes
possible in the learning process.
Mistake bound model
algorithm
An algorithm A is said to learn C in the
mistake bound model if for any concept
c ∈ C, and for any ordering of
examples consistent with c, the total
number of mistakes ever made by A is
bounded by p(n, size(c)), where p is a
polynomial. We say that A is a
polynomial time learning algorithm if
its running time per stage is also
polynomial in n and size(c).
Conjunctions
Let us assume that we know that the
target concept c will be a
conjunction of a set of (possibly
negated) variables, with an example
space of n-bit strings. Consider the
following algorithm:
Algorithm for MB
1. Initialize the hypothesis h to the
conjunction of every literal and its
negation: x1 ¬x1 x2 ¬x2 . . . xn ¬xn.
2. Predict using h(x).
3. If the prediction is False but the label is
actually True, remove all the literals in h
which are False in x. (So if the first mistake
is on 1001, the new h will be x1 ¬x2 ¬x3 x4.)
4. If the prediction is True but the label is
actually False, then output “no consistent
conjunction”.
5. Return to step 2.
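The steps above can be sketched in Python; representing each literal as a pair `(i, v)` ("bit i must equal v") is an implementation choice, not part of the original algorithm statement:

```python
def learn_conjunction(examples, n):
    """Mistake-bound learner for conjunctions over n-bit strings."""
    # Step 1: start with both literals for every bit, so h predicts
    # False on everything until the first positive example arrives.
    h = {(i, v) for i in range(n) for v in (0, 1)}
    mistakes = 0
    for x, label in examples:
        pred = all(x[i] == v for i, v in h)          # step 2: predict
        if pred != label:
            mistakes += 1
            if label:                                # step 3: False on a True example
                h = {(i, v) for i, v in h if x[i] == v}  # drop falsified literals
            else:                                    # step 4: True on a False example
                raise ValueError("no consistent conjunction")
    return h, mistakes

# Target concept: x1 AND NOT x2 (0-indexed: bit 0 set, bit 1 clear).
examples = [((1, 0, 0), True), ((1, 0, 1), True), ((0, 0, 1), False)]
h, mistakes = learn_conjunction(examples, n=3)
print(sorted(h), mistakes)  # [(0, 1), (1, 0)] 2
```

Each mistake on a positive example removes at least one literal, and there are only 2n literals to remove, which is where the polynomial mistake bound comes from.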
Lower Bound: In fact no deterministic
algorithm can achieve a mistake
bound better than n in the worst
case.
This can be seen by considering the
sequence of n examples in which the
ith example has all bits except the
ith bit set to 1.
The target concept c will be a monotone
conjunction constructed by including xi only
if the algorithm predicts the ith example to
be True (in which case the ith example’s
label will be False).
(If the algorithm predicts the ith example to
be False, then the target concept will not
include xi , and so the true label will be
True.) The algorithm will have made n
mistakes by the time all of these n examples
are processed.
Learning set of rules
A learning rule set in machine learning
is a collection of rules that describe a
dataset. The process of creating these
rules from data is called rule learning.
Rule types
The most common type of rule learning
is inductive rule learning, also known as
rule induction. Other types of rules
include association rules, which are used
to express relationships in large datasets.
Rule form
The basic form of a rule is "IF
PREMISE THEN CONSEQUENT". This
means that the consequent is true
whenever the premise is true.
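A rule of this form can be represented directly in code; the attribute names used here (`outlook`, `humidity`, `play`) are made-up illustration values, not from the text:

```python
# A rule "IF PREMISE THEN CONSEQUENT" as a premise predicate
# paired with a consequent (attribute, value) conclusion.
rule = {
    "premise": lambda r: r["outlook"] == "sunny" and r["humidity"] == "high",
    "consequent": ("play", "no"),
}

def apply_rule(rule, record):
    """Return the consequent if the premise holds for the record, else None."""
    return rule["consequent"] if rule["premise"](record) else None

record = {"outlook": "sunny", "humidity": "high"}
print(apply_rule(rule, record))                              # ('play', 'no')
print(apply_rule(rule, {"outlook": "rain", "humidity": "high"}))  # None
```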
Learning strategies
Some strategies for learning rule
sets include:
Learn-One-Rule: Searches from
general to specific.
Find-S: Searches from specific to
general.
FOIL: Learns one rule at a time,
removing positive examples covered
by the learned rule before
attempting to learn another rule.
Sequential Covering
Algorithm
Sequential Covering is a popular
algorithm based on Rule-Based
Classification used for learning a
disjunctive set of rules.
The basic idea here is to learn one
rule, remove the data that it covers,
then repeat the same process.
In this way, it covers all the rules
involved with it in a sequential
manner during the training phase.
Sequential Covering
Algorithm
The algorithm involves a set of ‘ordered
rules’ or ‘list of decisions’ to be made.
Step 1 – create an empty decision list, ‘R’.
Step 2 – The ‘Learn-One-Rule’ algorithm
extracts the best rule for a particular class
‘y’, where a rule covers a subset of the
training examples.
In the beginning,
Step 2.a – if a training example ∈ class
‘y’, it is treated as a positive example.
Step 2.b – else, if a training example ∉
class ‘y’, it is treated as a negative
example.
Step 3 – The rule becomes
‘desirable’ when it covers a
majority of the positive examples.
Step 4 – When this rule is obtained,
delete all the training examples
covered by that rule (i.e. when the
rule is applied to the dataset, the
examples it covers are removed).
Step 5 – The new rule is added to
the bottom of decision list, ‘R’.
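The five steps can be sketched as follows; `learn_one_rule` here is a deliberately crude stand-in that greedily picks the single attribute test covering the most positive and no negative examples, and the weather data is made up for illustration:

```python
def learn_one_rule(data):
    """Best single (attribute, value) test covering only positives."""
    best, best_cover = None, 0
    tests = {(a, r[a]) for r, _ in data for a in r}
    for a, v in tests:
        covered = [(r, y) for r, y in data if r[a] == v]
        if covered and all(y for _, y in covered) and len(covered) > best_cover:
            best, best_cover = (a, v), len(covered)
    return best

def sequential_covering(data):
    R = []                                   # step 1: empty decision list
    while any(y for _, y in data):           # while positives remain
        rule = learn_one_rule(data)          # step 2: best rule for class y
        if rule is None:
            break                            # no rule covers only positives
        R.append(rule)                       # step 5: add rule to the list
        a, v = rule
        data = [(r, y) for r, y in data if r[a] != v]  # step 4: remove covered
    return R

data = [
    ({"sky": "sunny", "wind": "weak"}, True),
    ({"sky": "sunny", "wind": "strong"}, True),
    ({"sky": "rainy", "wind": "weak"}, False),
]
print(sequential_covering(data))  # [('sky', 'sunny')]
```

One rule suffices here because a single test separates all positives; on harder data the loop would keep learning rules until no positives remain uncovered.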