
25 April 2013

Bayesian Decision
Theory
Dr. Anto Satriyo Nugroho,
M.Eng
Center for Information & Communication Technology
Agency for the Assessment & Application of Technology
URL: http://asnugroho.net Email: [email protected]

Introduction
The sea bass / salmon example
State of nature, prior
State of nature is a random variable
The catch of salmon and sea bass is equiprobable:
P(ω1) = P(ω2) (uniform priors)
P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)

Decision Rules
Decision rule with only the prior information:
Decide ω1 if P(ω1) > P(ω2), otherwise decide ω2
Use of the class-conditional information:
P(x | ω1) and P(x | ω2) describe the difference in lightness between the populations of sea bass and salmon

Probability Density

Posterior, likelihood, evidence


P(ωj | x) = P(x | ωj) P(ωj) / P(x)

where, in the case of two categories,

P(x) = Σ (j = 1..2) P(x | ωj) P(ωj)

Posterior = (Likelihood × Prior) / Evidence

Error probability
Decision given the posterior probabilities
x is an observation for which:
  if P(ω1 | x) > P(ω2 | x), the true state of nature is ω1
  if P(ω1 | x) < P(ω2 | x), the true state of nature is ω2

Therefore, whenever we observe a particular x, the probability of error is:
P(error | x) = P(ω1 | x) if we decide ω2
P(error | x) = P(ω2 | x) if we decide ω1

Minimizing the probability of error


Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2
Therefore:
P(error | x) = min [ P(ω1 | x), P(ω2 | x) ]
(Bayes decision)
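A minimal sketch of this two-class decision rule in C; the likelihood and prior values below are made-up placeholders, not taken from the sea bass / salmon example:

#include <stdio.h>

int main(void)
{
    double like1 = 0.30, like2 = 0.10;   /* assumed P(x | w1), P(x | w2) */
    double prior1 = 0.50, prior2 = 0.50; /* assumed P(w1), P(w2) */
    double evidence = like1 * prior1 + like2 * prior2;
    double post1 = like1 * prior1 / evidence;   /* P(w1 | x) */
    double post2 = like2 * prior2 / evidence;   /* P(w2 | x) */

    /* Bayes decision: pick the class with the larger posterior;
       the error probability is the smaller posterior. */
    printf("decide %s, P(error | x) = %g\n",
           post1 > post2 ? "w1" : "w2",
           post1 > post2 ? post2 : post1);
    return 0;
}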

Bayesian Classifier
Consider each attribute and class label as random variables
Given a record with attributes (A1, A2, ..., An)
Goal is to predict class C
Specifically, we want to find the value of C that maximizes P(C | A1, A2, ..., An)
Can we estimate P(C | A1, A2, ..., An) directly from the data?

Naive Bayes Classifier


Approach:
compute the posterior probability P(C | A1, A2, ..., An) for all values of C using the Bayes theorem:

P(C | A1 A2 ... An) = P(A1 A2 ... An | C) P(C) / P(A1 A2 ... An)

Choose the value of C that maximizes P(C | A1, A2, ..., An)

Equivalent to choosing the value of C that maximizes

P(A1, A2, ..., An | C) P(C)

How can we estimate the value of P(A1, A2, ..., An | C)?

Naive Bayes Classifier


Assume independence among the attributes Ai when the class is given:
P(A1, A2, ..., An | C) = P(A1 | Cj) P(A2 | Cj) ... P(An | Cj)
We can estimate the value of P(Ai | Cj) for all Ai and Cj.
A new point is classified to Cj if P(Cj) Π P(Ai | Cj) is maximal (see the sketch below).
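A minimal C sketch of this decision rule, assuming the prior and the per-attribute conditional probabilities have already been estimated; the numbers below are hypothetical placeholders:

#include <stdio.h>

#define NCLASS 2
#define NATTR  3

int main(void)
{
    /* Assumed estimates: prior[c] = P(Cc), cond[c][i] = P(Ai = observed value | Cc). */
    double prior[NCLASS]       = { 0.6, 0.4 };
    double cond[NCLASS][NATTR] = { { 0.5, 0.2, 0.7 },
                                   { 0.1, 0.6, 0.3 } };
    int best = 0;
    double best_score = -1.0;

    for (int c = 0; c < NCLASS; c++) {
        double score = prior[c];           /* start with P(Cc) */
        for (int i = 0; i < NATTR; i++)
            score *= cond[c][i];           /* multiply by P(Ai | Cc) */
        printf("class %d: P(C) * prod P(Ai | C) = %g\n", c, score);
        if (score > best_score) { best_score = score; best = c; }
    }
    printf("predicted class: %d\n", best);
    return 0;
}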

How can we estimate the probabilities from the data?
Case 1: Discrete data

Class prior: P(C) = Nc / N
e.g., P(No) = 7/10, P(Yes) = 3/10
For discrete attributes:
P(Ai | Ck) = |Aik| / Nc
where |Aik| is the number of instances having attribute value Ai and belonging to class Ck
Examples:
P(Status=Married | No) = 4/7
P(Refund=Yes | Yes) = 0
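The counting itself is straightforward. A small C sketch, using a made-up training table (the slide's original table is not reproduced here, so the rows below are illustrative only):

#include <stdio.h>
#include <string.h>

struct row { const char *status; const char *cls; };

int main(void)
{
    /* Hypothetical records: one discrete attribute (Status) and the class label. */
    struct row data[] = {
        { "Married", "No" },  { "Single", "No" },   { "Married", "No" },
        { "Married", "No" },  { "Divorced", "No" }, { "Single", "No" },
        { "Married", "No" },
        { "Single", "Yes" },  { "Divorced", "Yes" },{ "Single", "Yes" }
    };
    int n = sizeof data / sizeof data[0];
    int n_no = 0, n_married_no = 0;

    for (int i = 0; i < n; i++) {
        if (strcmp(data[i].cls, "No") == 0) {
            n_no++;
            if (strcmp(data[i].status, "Married") == 0)
                n_married_no++;
        }
    }
    /* P(Status=Married | No) = |Aik| / Nc */
    printf("P(Status=Married | No) = %d/%d = %g\n",
           n_married_no, n_no, (double)n_married_no / n_no);
    return 0;
}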

How can we estimate the probabilities from the data?
Case 2: Continuous data

For continuous attributes:
Discretize the range into bins
  one ordinal attribute per bin
  violates the independence assumption
Two-way split: (A < v) or (A > v)
  choose only one of the two splits as the new attribute
Probability density estimation:
  Assume the attribute follows a normal distribution
  Use the data to estimate the parameters of the distribution (e.g., mean and standard deviation)
  Once the probability distribution is known, it can be used to estimate the conditional probability P(Ai | c)

Normal distribution:

P(Ai | cj) = 1 / (sqrt(2π) σij) · exp( −(Ai − μij)² / (2 σij²) )

One distribution for each (Ai, cj) pair


For (Income, Class=No):
If Class=No:
  sample mean = 110
  sample variance = 2975

P(Income=120 | No) = 1 / (sqrt(2π) · 54.54) · exp( −(120 − 110)² / (2 · 2975) ) ≈ 0.0072
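Working the numbers through (σ = √2975 ≈ 54.54): 1 / (2.5066 × 54.54) ≈ 0.00731 and exp(−100 / 5950) ≈ 0.983, so the product is about 0.0072.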


Example of Naive Bayes Classification

Given a test record:
X = (Refund = No, Married, Income = 120K)

P(X | Class=No) = P(Refund=No | Class=No) × P(Married | Class=No) × P(Income=120K | Class=No)
               = 4/7 × 4/7 × 0.0072 = 0.0024

P(X | Class=Yes) = P(Refund=No | Class=Yes) × P(Married | Class=Yes) × P(Income=120K | Class=Yes)
               = 1 × 0 × 1.2 × 10⁻⁹ = 0

Since P(X | No) P(No) > P(X | Yes) P(Yes),
P(No | X) > P(Yes | X)
=> Class = No
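Concretely, with the priors from the earlier slide (P(No) = 7/10, P(Yes) = 3/10): P(X | No) P(No) = 0.0024 × 0.7 ≈ 0.0017, while P(X | Yes) P(Yes) = 0 × 0.3 = 0, so the record is assigned to class No.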


Characteristics of Naive
Bayes Classifier

Robust to isolated noise points
Handles missing values by ignoring the instance during probability estimate calculations
Robust to irrelevant attributes
The independence assumption may not hold for some attributes
  Use other techniques such as Bayesian Belief Networks (BBN)


Example with Iris Dataset

Training set:
  Iris Setosa (ω1): 25 samples (first half of the original dataset)
  Iris Versicolor (ω2): 25 samples (first half of the original dataset)
  Iris Virginica (ω3): 25 samples (first half of the original dataset)
Testing set:
  Iris Setosa (ω1): 25 samples (second half of the original dataset)
  Iris Versicolor (ω2): 25 samples (second half of the original dataset)
  Iris Virginica (ω3): 25 samples (second half of the original dataset)

Suppose we want to classify a datum from the testing set with the following characteristics (the actual class is Iris Versicolor):
  Sepal length: 5.7
  Sepal width: 2.6
  Petal length: 3.5
  Petal width: 1

Solution:
1. Calculate the prior probability
2. Calculate the mean & variance of each feature
3. Calculate the likelihood
4. Calculate prior probability × likelihood
5. Make the decision based on the posterior probability

POSTERIOR = (PRIOR × LIKELIHOOD) / EVIDENCE
(Step 1 gives the prior, Steps 2 & 3 give the likelihood, and Step 4 gives the numerator.)

Step 1: Prior Probability Calculation

P(ω1) = number of ω1 samples / total samples = 25/75 = 0.33
P(ω2) = number of ω2 samples / total samples = 25/75 = 0.33
P(ω3) = number of ω3 samples / total samples = 25/75 = 0.33


Step 2: Mean & Variance Calculation

Iris has continuous attributes, so to calculate the likelihood we have to calculate the mean (μ) and variance (σ²) of each attribute for each class, as in the sketch below.

Step 3: Likelihood Calculation

Suppose we want to classify the datum from the testing set with the following characteristics (the actual class is Iris Versicolor):
  A1: Sepal length = 5.7
  A2: Sepal width = 2.6
  A3: Petal length = 3.5
  A4: Petal width = 1

ω1: Iris Setosa, ω2: Iris Versicolor, ω3: Iris Virginica

P(Ai | ωj) = 1 / (sqrt(2π) σij) · exp( −(Ai − μij)² / (2 σij²) )


C code for Likelihood Calculation

Compile with:
gcc programfilename.c -o programfilename -lm

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    float x, m, var;

    while (1) {
        printf("attribute value: ");
        scanf("%f", &x);
        printf("attribute mean: ");
        scanf("%f", &m);
        printf("attribute variance: ");
        scanf("%f", &var);
        /* Gaussian density: 1/sqrt(2*pi*var) * exp(-(x-m)^2 / (2*var)) */
        printf("%g\n", 1.0 / sqrt(2 * M_PI * var) * exp(-(x - m) * (x - m) / (2 * var)));
    }
    return 0;
}


Example of how to compile the program and use it to calculate the likelihood P(sepal length=5.7 | Iris Setosa):

$ gcc calculate_likelihood.c -o calculate_likelihood -lm
$ ./calculate_likelihood
attribute value: 5.7
attribute mean: 5.028
attribute variance: 0.16043333
0.243805
(to exit, press CTRL+C)


Likelihood Calculation Results
P(sepal length=5.7 | Iris Setosa)     = 0.241763
P(sepal width=2.6  | Iris Setosa)     = 0.0625788
P(petal length=3.5 | Iris Setosa)     = 1.7052 × 10⁻²³ ≈ 0
P(petal width=1    | Iris Setosa)     = 2.23877 × 10⁻¹¹
P(sepal length=5.7 | Iris Versicolor) = 0.619097
P(sepal width=2.6  | Iris Versicolor) = 0.998687
P(petal length=3.5 | Iris Versicolor) = 0.16855
P(petal width=1    | Iris Versicolor) = 0.481618
P(sepal length=5.7 | Iris Virginica)  = 0.265044
P(sepal width=2.6  | Iris Virginica)  = 0.731322
P(petal length=3.5 | Iris Virginica)  = 0.00256255
P(petal width=1    | Iris Virginica)  = 0.000360401


Step 4: Prior × Likelihood Calculation

prior(Iris Setosa) × P(sepal length=5.7 | Iris Setosa) × P(sepal width=2.6 | Iris Setosa) × P(petal length=3.5 | Iris Setosa) × P(petal width=1 | Iris Setosa) ≈ 0
prior(Iris Versicolor) × P(sepal length=5.7 | Iris Versicolor) × P(sepal width=2.6 | Iris Versicolor) × P(petal length=3.5 | Iris Versicolor) × P(petal width=1 | Iris Versicolor) = 0.016730091
prior(Iris Virginica) × P(sepal length=5.7 | Iris Virginica) × P(sepal width=2.6 | Iris Virginica) × P(petal length=3.5 | Iris Virginica) × P(petal width=1 | Iris Virginica) = 5.96711 × 10⁻⁸
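These products can be checked with a few lines of C, reusing the likelihoods from Step 3 and the prior 25/75:

#include <stdio.h>

int main(void)
{
    /* Likelihoods from Step 3: rows = Setosa, Versicolor, Virginica;
       columns = sepal length, sepal width, petal length, petal width. */
    double like[3][4] = {
        { 0.241763, 0.0625788, 1.7052e-23, 2.23877e-11 },
        { 0.619097, 0.998687,  0.16855,    0.481618    },
        { 0.265044, 0.731322,  0.00256255, 0.000360401 }
    };
    double prior = 25.0 / 75.0;
    const char *name[3] = { "Setosa", "Versicolor", "Virginica" };

    for (int c = 0; c < 3; c++) {
        double score = prior;
        for (int i = 0; i < 4; i++)
            score *= like[c][i];
        printf("prior x likelihood for Iris %s = %g\n", name[c], score);
    }
    return 0;
}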


Step 5: Decision Based on the Posterior Probability

Posterior(Iris Setosa | Sepal length: 5.7, Sepal width: 2.6, Petal length: 3.5, Petal width: 1) = 0 / evidence
Posterior(Iris Versicolor | Sepal length: 5.7, Sepal width: 2.6, Petal length: 3.5, Petal width: 1) = 0.016730091 / evidence
Posterior(Iris Virginica | Sepal length: 5.7, Sepal width: 2.6, Petal length: 3.5, Petal width: 1) = 5.96711 × 10⁻⁸ / evidence

Of the three posterior values above, the second one is the largest. Thus the class of the datum with sepal length 5.7, sepal width 2.6, petal length 3.5 and petal width 1 is Iris Versicolor.

POSTERIOR = (PRIOR × LIKELIHOOD) / EVIDENCE
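Since the evidence is the same denominator for all three classes, it does not change the ranking; for completeness, evidence = 0 + 0.016730091 + 5.96711 × 10⁻⁸ ≈ 0.01673, giving Posterior(Iris Versicolor) ≈ 0.016730091 / 0.01673 ≈ 0.999996.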

Final Examination 2011

The following dataset is part of a microarray dataset used to design a classifier (thus, it is the training set) for classifying and diagnosing cancers using gene expression. The original dataset consists of the expression of 6567 genes (attributes) and 63 training samples of 4 classes: neuroblastoma (NB), rhabdomyosarcoma (RMS), non-Hodgkin lymphoma (NHL) and the Ewing family of tumors (EWS). For the sake of simplicity, only the expression of four genes is shown.

Determine the class of the following datum:

gene-1    gene-2    gene-3    gene-4
0.4964    0.2509    2.714     0.1805
