Bayesian Decision Theory
Dr. Anto Satriyo Nugroho, M.Eng
Center for Information & Communication Technology
Agency for the Assessment & Application of Technology
URL: http://asnugroho.net Email: [email protected]
Introduction
The sea bass/salmon example
State of nature, prior
The state of nature ω is a random variable (ω = ω1 for sea bass, ω = ω2 for salmon)
The catch of salmon and sea bass is equiprobable:
P(ω1) = P(ω2) (uniform priors)
P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity)
Decision Rules
Decision rule with only the prior information:
Decide ω1 if P(ω1) > P(ω2), otherwise decide ω2
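Note that this rule always makes the same decision, whatever we observe; since it always decides the more probable class, its probability of error is min[P(ω1), P(ω2)].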
Use of the class-conditional information
P(x | ω1) and P(x | ω2) describe the difference in lightness between the sea bass and salmon populations
Probability Density
P(x) = Σj=1..2 P(x | ωj) P(ωj)
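Bayes' formula then turns the prior and the class-conditional likelihood into the posterior used by the decision rules below:
P(ωj | x) = P(x | ωj) P(ωj) / P(x)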
Error probability
Decision given the posterior probabilities
x is an observation for which:
if P(ω1 | x) > P(ω2 | x), decide ω1
if P(ω1 | x) < P(ω2 | x), decide ω2
Therefore, whenever we observe a particular x, the probability of error is:
P(error | x) = P(ω1 | x) if we decide ω2
P(error | x) = P(ω2 | x) if we decide ω1
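Deciding for the class with the larger posterior therefore minimizes the error at every observation: P(error | x) = min[P(ω1 | x), P(ω2 | x)]. This is the Bayes decision rule.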
Bayesian Classifier
Consider each attribute and class label as random variables
Given a record with attributes (A1, A2, …, An)
Goal is to predict class C
Specifically, we want to find the value of C that maximizes P(C | A1, A2, …, An)
Can we estimate P(C | A1, A2, …, An) directly from data?
Apply Bayes' theorem:
P(C | A1, A2, …, An) = P(A1, A2, …, An | C) P(C) / P(A1, A2, …, An)
Choose the value of C that maximizes P(C | A1, A2, …, An); since the denominator does not depend on C, this is equivalent to maximizing P(A1, A2, …, An | C) P(C)
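Naive Bayes assumes the attributes are conditionally independent given the class:
P(A1, A2, …, An | C) = P(A1 | C) × P(A2 | C) × … × P(An | C)
so each factor can be estimated separately from the data, as in the worked example below.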
Normal distribution:
P(Ai | cj) = 1 / √(2π σij²) · exp(−(Ai − μij)² / (2 σij²))
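Here μij and σij² are the sample mean and variance of attribute Ai computed over the training records of class cj; one (μij, σij²) pair is estimated for each attribute-class combination.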
For (Income, Class=No), with sample mean 110 and sample variance 2975 (σ = 54.54):
P(Income=120 | No) = 1 / (√(2π) · 54.54) · exp(−(120 − 110)² / (2 · 2975)) = 0.0072
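As an arithmetic check: 1 / (√(2π) · 54.54) ≈ 0.00731 and exp(−100/5950) ≈ 0.9833, so the product is ≈ 0.0072.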
Given a test record X = (Refund=No, Married, Income=120K):
P(X | Class=No) = P(Refund=No | Class=No) × P(Married | Class=No) × P(Income=120K | Class=No)
= 4/7 × 4/7 × 0.0072 = 0.0024
=> Class = No
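This computation is easy to reproduce in C. The sketch below (not part of the original slides) hard-codes the counts 4/7 and the Gaussian parameters read off the formula above (mean 110, variance 2975):

#include <stdio.h>
#include <math.h>

/* Gaussian likelihood, same formula as the calculator program
   shown later in these slides */
static double gauss(double x, double m, double var)
{
    return 1.0 / sqrt(2.0 * M_PI * var) * exp(-(x - m) * (x - m) / (2.0 * var));
}

int main(void)
{
    /* 4 of 7 Class=No records have Refund=No; 4 of 7 are Married;
       Income under Class=No is Gaussian with mean 110, variance 2975 */
    double p = (4.0 / 7.0) * (4.0 / 7.0) * gauss(120.0, 110.0, 2975.0);
    printf("P(X|Class=No) = %g\n", p);   /* prints ~0.00235, i.e. 0.0024 */
    return 0;
}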
Characteristics of Naive Bayes Classifier
Testing set
Iris Setosa (ω1): 25 samples (second half of the original dataset)
Iris Versicolor (ω2): 25 samples (second half of the original dataset)
Iris Virginica (ω3): 25 samples (second half of the original dataset)
Suppose we want to classify a datum from the testing set with the following characteristics (the actual class is Iris Versicolor):
Sepal length: 5.7
Sepal width: 2.6
Petal length: 3.5
Petal width: 1
Solution:
[Slide figure: the solution proceeds in five numbered steps; the figure annotates Step 1, Steps 2 & 3, and Step 4. Step 3, the likelihood calculation, is detailed below.]
Step 3: Likelihood Calculation
P(Ai | ωj) = 1 / √(2π σij²) · exp(−(Ai − μij)² / (2 σij²))
A1: sepal length = 5.7
A2: sepal width = 2.6
A3: petal length = 3.5
A4: petal width = 1
/* Interactive Gaussian likelihood calculator: reads an attribute
   value, its class mean, and its class variance, then prints the
   normal density. Compile with:
   gcc programfilename.c -o programfilename -lm */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    float x, m, var;
    while (1) {
        printf("attribute value: ");
        scanf("%f", &x);
        printf("attribute mean: ");
        scanf("%f", &m);
        printf("attribute variance: ");
        scanf("%f", &var);
        /* 1/sqrt(2*pi*var) * exp(-(x-m)^2 / (2*var)) */
        printf("%g\n", 1.0/sqrt(2*M_PI*var)*exp(-(x-m)*(x-m)/(2*var)));
    }
    return 0;
}
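For the Income example earlier, entering attribute value 120, mean 110, and variance 2975 prints ≈ 0.00719, the 0.0072 used in the worked example.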
Likelihood Calculation Results
P(sepal length=5.7 | Iris Setosa) = 0.241763
P(sepal width=2.6 | Iris Setosa) = 0.0625788
P(petal length=3.5 | Iris Setosa) = 1.7052e-23 ≈ 0
P(petal width=1 | Iris Setosa) = 2.23877e-11
P(sepal length=5.7 | Iris Versicolor) = 0.619097
P(sepal width=2.6 | Iris Versicolor) = 0.998687
P(petal length=3.5 | Iris Versicolor) = 0.16855
P(petal width=1 | Iris Versicolor) = 0.481618
P(sepal length=5.7 | Iris Virginica) = 0.265044
P(sepal width=2.6 | Iris Virginica) = 0.731322
P(petal length=3.5 | Iris Virginica) = 0.00256255
P(petal width=1 | Iris Virginica) = 0.000360401
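To finish the classification (a minimal sketch, not from the slides): with 25 training samples per class the priors are equal, so it suffices to compare the per-class products of the four likelihoods:

#include <stdio.h>

int main(void)
{
    const char *cls[3] = { "Iris Setosa", "Iris Versicolor", "Iris Virginica" };
    /* Likelihoods listed above, in attribute order A1..A4 */
    double like[3][4] = {
        { 0.241763, 0.0625788, 1.7052e-23, 2.23877e-11 },
        { 0.619097, 0.998687,  0.16855,    0.481618    },
        { 0.265044, 0.731322,  0.00256255, 0.000360401 }
    };
    int best = 0;
    double bestp = 0.0;
    for (int j = 0; j < 3; j++) {
        double p = 1.0;
        for (int i = 0; i < 4; i++)
            p *= like[j][i];                 /* naive Bayes product */
        printf("P(x|%s) = %g\n", cls[j], p);
        if (p > bestp) { bestp = p; best = j; }
    }
    printf("=> Class = %s\n", cls[best]);
    return 0;
}

The products are ≈ 5.78e-36 (Setosa), ≈ 0.0502 (Versicolor), and ≈ 1.79e-7 (Virginica), so the datum is assigned to Iris Versicolor, matching its actual class.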