PR January 2005 (PDF)

- The document describes Bayesian classifiers and how they can be used for classification problems.
- Bayesian classifiers estimate the posterior probability P(C | A1, A2, …, An) for each class C using Bayes' theorem; the probabilities are estimated from training data.
- The naïve Bayes classifier assumes attribute independence given the class, allowing easier estimation of probabilities from data.
- Probabilities such as P(C) and P(Ai | C) can be estimated from data by calculating relative frequencies. For continuous attributes, discretization or density estimation may be used.


CSE 473

Pattern Recognition

Instructor:
Dr. Md. Monirul Islam
Bayesian Classifier and its Variants
Classification Example

(Refund and Marital Status are categorical attributes; Taxable Income is continuous; Evade is the class.)

Tid  Refund  Marital Status  Taxable Income  Evade
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

• A married person with income 120K did not refund the loan previously
• Can we trust him?
Bayesian Classifiers

• We have multiple attributes (A1, A2, …, An)
  – Goal is to predict class C
  – Specifically, we want to find the value of C that maximizes P(C | A1, A2, …, An)

• Can we estimate P(C | A1, A2, …, An) directly from data?
Bayesian Classifiers

• Approach:
  – compute the posterior probability P(C | A1, A2, …, An) for all values of C using Bayes' theorem:

        P(C | A1, A2, …, An) = P(A1, A2, …, An | C) P(C) / P(A1, A2, …, An)

  – Choose the value of C that maximizes P(C | A1, A2, …, An)
  – Equivalent to choosing the value of C that maximizes P(A1, A2, …, An | C) P(C), since the denominator P(A1, A2, …, An) is the same for every class

• How to estimate P(A1, A2, …, An | C)?
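The Bayes' theorem step above can be checked numerically. This is a minimal sketch with invented likelihoods and priors (the numbers are placeholders, not from the slides): the posterior for each class is likelihood times prior, normalized by the evidence P(A).

```python
# Toy two-class Bayes' theorem check: posterior = likelihood * prior / evidence.
# All numbers here are invented for illustration.
p_a_given_c = {"c1": 0.9, "c2": 0.2}   # P(A | C), assumed values
prior = {"c1": 0.3, "c2": 0.7}         # P(C), assumed values

evidence = sum(p_a_given_c[c] * prior[c] for c in prior)   # P(A)
posterior = {c: p_a_given_c[c] * prior[c] / evidence for c in prior}

print(posterior)                          # posteriors sum to 1
print(max(posterior, key=posterior.get))  # class with largest posterior
```

Note that the evidence term only rescales both posteriors, which is why maximizing P(A|C) P(C) gives the same decision.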


Naïve Bayes Classifier

• Assume independence among attributes Ai when the class is given:

  – P(A1, A2, …, An | Cj) = P(A1 | Cj) P(A2 | Cj) … P(An | Cj)

  – can estimate P(Ai | Cj) for all Ai and Cj

  – the new pattern is classified as Cj if P(Cj) · ∏i P(Ai | Cj) is maximum
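The decision rule above can be sketched in a few lines. The probability tables here are illustrative placeholders (not from the slides); the point is only the shape of the computation: multiply each class prior by the per-attribute conditionals and pick the maximizer.

```python
# Naive Bayes decision rule sketch: argmax_Cj P(Cj) * prod_i P(Ai | Cj).
# The priors and likelihoods below are invented for illustration.
from math import prod

priors = {"yes": 0.3, "no": 0.7}    # P(Cj), assumed
likelihoods = {                     # P(Ai | Cj) for the observed attribute values, assumed
    "yes": [0.2, 0.5],
    "no":  [0.6, 0.4],
}

def classify(priors, likelihoods):
    scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    return max(scores, key=scores.get), scores

label, scores = classify(priors, likelihoods)
print(label, scores)
```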
How to Estimate Probabilities from Data?

(Using the loan table above.)

• Class prior: P(C) = Nc / N
  – e.g., P(No) = 7/10, P(Yes) = 3/10

• For discrete attributes: P(Ai | Ck) = |Aik| / Nc
  – where |Aik| is the number of instances having attribute value Ai that belong to class Ck, and Nc is the number of instances in class Ck
  – Examples:
    P(Status=Married | No) = 4/7
    P(Refund=Yes | Yes) = 0
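These relative-frequency estimates can be computed directly from the 10-record loan table on the earlier slide. The sketch below encodes the categorical columns and reproduces P(No) = 7/10, P(Status=Married | No) = 4/7, and P(Refund=Yes | Yes) = 0.

```python
# Relative-frequency estimates P(C) = Nc/N and P(Ai | Ck) = |Aik| / Nc,
# using the (Refund, Marital Status, Evade) columns of the slide's loan table.
from collections import Counter

records = [
    ("Yes", "Single", "No"), ("No", "Married", "No"), ("No", "Single", "No"),
    ("Yes", "Married", "No"), ("No", "Divorced", "Yes"), ("No", "Married", "No"),
    ("Yes", "Divorced", "No"), ("No", "Single", "Yes"), ("No", "Married", "No"),
    ("No", "Single", "Yes"),
]

class_counts = Counter(evade for _, _, evade in records)
n = len(records)

def prior(c):
    """P(C) = Nc / N."""
    return class_counts[c] / n

def cond(attr_index, value, c):
    """P(Ai | Ck) = |Aik| / Nc."""
    match = sum(1 for r in records if r[attr_index] == value and r[2] == c)
    return match / class_counts[c]

print(prior("No"))               # 0.7
print(cond(1, "Married", "No"))  # 4/7
print(cond(0, "Yes", "Yes"))     # 0.0
```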
How to Estimate Probabilities from Data?

• For continuous attributes:


– Discretize the range into bins
• one ordinal attribute per bin

– Two-way split: (A < v) or (A > v)


• choose only one of the two splits as new attribute
– Probability density estimation:
• Assume attribute follows a normal distribution
• Use data to estimate parameters of distribution
(e.g., mean and standard deviation)
• Once probability distribution is known, can use it to
estimate the conditional probability P(Ai|c)
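The discretization option above can be sketched as a simple binning function: each continuous value maps to one ordinal bin, which is then treated like any other discrete attribute value. The bin edges here are invented for illustration, not taken from the slides.

```python
# Discretization sketch: map a continuous attribute (income, in K) to
# ordinal bins. The edges (80, 120) are illustrative assumptions.
def to_bin(income_k, edges=(80, 120)):
    """Return the ordinal bin 'low' / 'mid' / 'high' for an income value."""
    if income_k < edges[0]:
        return "low"
    if income_k < edges[1]:
        return "mid"
    return "high"

print([to_bin(x) for x in (70, 95, 125)])   # ['low', 'mid', 'high']
```

After binning, P(Ai | C) is estimated by relative frequency over the bins, exactly as for a discrete attribute.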
How to Estimate Probabilities from Data?

• Normal distribution:

      P(Ai | cj) = 1 / sqrt(2π σij²) · exp( −(Ai − μij)² / (2 σij²) )

  – One for each (Ai, cj) pair

• For (Income, Class=No):
  – sample mean = 110K
  – sample variance = 2975

      P(Income=120 | No) = 1 / (sqrt(2π) · 54.54) · exp( −(120 − 110)² / (2 · 2975) ) = 0.0072
Example of Naïve Bayes Classifier

Given a test record: X = (Refund=No, Married, Income=120K)

Naïve Bayes probabilities:
  P(Refund=Yes | No) = 3/7                 P(Refund=Yes | Yes) = 0
  P(Refund=No | No) = 4/7                  P(Refund=No | Yes) = 1
  P(Marital Status=Single | No) = 2/7      P(Marital Status=Single | Yes) = 2/7
  P(Marital Status=Divorced | No) = 1/7    P(Marital Status=Divorced | Yes) = 1/7
  P(Marital Status=Married | No) = 4/7     P(Marital Status=Married | Yes) = 0

For taxable income:
  If class=No:  sample mean = 110, sample variance = 2975
  If class=Yes: sample mean = 90,  sample variance = 25

P(X | Class=No)  = P(Refund=No | No) × P(Married | No) × P(Income=120K | No)
                 = 4/7 × 4/7 × 0.0072 = 0.0024
P(X | Class=Yes) = P(Refund=No | Yes) × P(Married | Yes) × P(Income=120K | Yes)
                 = 1 × 0 × 1.2×10⁻⁹ = 0

Since P(X|No) P(No) > P(X|Yes) P(Yes), P(No|X) > P(Yes|X)  =>  Class = No
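The whole computation on this slide can be reproduced in code. All probabilities and Gaussian parameters below are the ones listed above (priors 7/10 and 3/10 from the class counts).

```python
# Reproduce the slide's scoring for X = (Refund=No, Married, Income=120K).
from math import sqrt, pi, exp

def gaussian_pdf(x, mean, var):
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

# Conditionals for the observed discrete values, per class (from the slide)
p_refund_no = {"No": 4 / 7, "Yes": 1.0}
p_married   = {"No": 4 / 7, "Yes": 0.0}
income_params = {"No": (110, 2975), "Yes": (90, 25)}   # (mean, variance)
prior = {"No": 7 / 10, "Yes": 3 / 10}

score = {}
for c in ("No", "Yes"):
    mean, var = income_params[c]
    likelihood = p_refund_no[c] * p_married[c] * gaussian_pdf(120, mean, var)
    score[c] = likelihood * prior[c]   # P(X | C) P(C)

print(score)
print(max(score, key=score.get))   # No
```

The Yes score collapses to exactly zero because P(Married | Yes) = 0; a common refinement (not on this slide) is Laplace smoothing to avoid such zero factors.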
Example-2 of Naïve Bayes Classifier
Name Give Birth Can Fly Live in Water Have Legs Class
human yes no no yes mammals
python no no no no non-mammals
salmon no no yes no non-mammals
whale yes no yes no mammals
frog no no sometimes yes non-mammals
komodo no no no yes non-mammals
bat yes yes no yes mammals
pigeon no yes no yes non-mammals
cat yes no no yes mammals
leopard shark yes no yes no non-mammals
turtle no no sometimes yes non-mammals
penguin no no sometimes yes non-mammals
porcupine yes no no yes mammals
eel no no yes no non-mammals
salamander no no sometimes yes non-mammals
gila monster no no no yes non-mammals
platypus no no no yes mammals
owl no yes no yes non-mammals
dolphin yes no yes no mammals
eagle no yes no yes non-mammals

Give Birth Can Fly Live in Water Have Legs Class


yes no yes no ?
Example-2 of Naïve Bayes Classifier (continued)

A: attributes, M: mammals, N: non-mammals

For the query (Give Birth=yes, Can Fly=no, Live in Water=yes, Have Legs=no):

  P(A | M) = 6/7 × 6/7 × 2/7 × 2/7 = 0.06
  P(A | N) = 1/13 × 10/13 × 3/13 × 4/13 = 0.0042

  P(A | M) P(M) = 0.06 × 7/20 = 0.021
  P(A | N) P(N) = 0.0042 × 13/20 = 0.0027

Since P(A|M) P(M) > P(A|N) P(N)  =>  Mammals
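The arithmetic above (likelihoods from the 7 mammals and 13 non-mammals in the table, priors from the class counts) checks out as follows:

```python
# Reproduce the mammals-vs-non-mammals scores from the slide.
from math import prod

p_given_M = [6 / 7, 6 / 7, 2 / 7, 2 / 7]       # P(attribute value | mammals)
p_given_N = [1 / 13, 10 / 13, 3 / 13, 4 / 13]  # P(attribute value | non-mammals)
prior_M, prior_N = 7 / 20, 13 / 20             # class priors from the 20 animals

score_M = prod(p_given_M) * prior_M
score_N = prod(p_given_N) * prior_N
print(round(score_M, 3), round(score_N, 4))    # 0.021 0.0027
print("mammals" if score_M > score_N else "non-mammals")
```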
Sample Data for Sessional on Bayesian Classification

Feature 1  Feature 2  Class
1.7044     3.6651     1
1.6726     4.6705     1
1.4597     4.1940     1
1.9761     4.1965     1
1.9126     3.4987     1
1.5214     3.9072     1
2.6463     3.4730     1
2.2205     3.9642     1
6.8104     10.0517    2
7.5809     9.8897     2
8.1287     9.8605     2
7.9081     9.6332     2
7.9162     9.9677     2
7.9415     9.2780     2
8.0842     10.3062    2
7.7494     9.3382     2
8.1146     9.9617     2
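A Gaussian naïve Bayes classifier for this sample data can be sketched end to end: fit a per-class, per-feature normal distribution, take priors from the class counts, and score test points. The two test points are assumptions chosen near each cluster, not part of the data.

```python
# Gaussian naive Bayes fitted to the sessional sample data above.
from math import sqrt, pi, exp

data = {
    1: [(1.7044, 3.6651), (1.6726, 4.6705), (1.4597, 4.1940), (1.9761, 4.1965),
        (1.9126, 3.4987), (1.5214, 3.9072), (2.6463, 3.4730), (2.2205, 3.9642)],
    2: [(6.8104, 10.0517), (7.5809, 9.8897), (8.1287, 9.8605), (7.9081, 9.6332),
        (7.9162, 9.9677), (7.9415, 9.2780), (8.0842, 10.3062), (7.7494, 9.3382),
        (8.1146, 9.9617)],
}

def stats(xs):
    """Sample mean and sample variance of a list of values."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v

def pdf(x, m, v):
    return exp(-(x - m) ** 2 / (2 * v)) / sqrt(2 * pi * v)

n = sum(len(pts) for pts in data.values())
params = {c: [stats([p[i] for p in pts]) for i in range(2)] for c, pts in data.items()}

def classify(point):
    def score(c):
        s = len(data[c]) / n                    # prior P(C)
        for i, (m, v) in enumerate(params[c]):  # times P(feature_i | C)
            s *= pdf(point[i], m, v)
        return s
    return max(data, key=score)

print(classify((2.0, 4.0)), classify((7.8, 9.9)))
```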
Naïve Bayes (Summary)

• Robust to isolated noise points
• Handles missing values by ignoring the instance during probability estimate calculations
• Robust to irrelevant attributes
• Independence assumption may not hold for some attributes
  – Use other techniques such as Bayesian Belief Networks (BBN)
Bayesian Belief Networks

• Suppose we have l random variables x1, x2, …, xl

• By the chain rule, the joint probability is given by

      p(x1, x2, …, xl) = p(xl | xl−1, …, x1) · p(xl−1 | xl−2, …, x1) · … · p(x2 | x1) · p(x1)
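The chain-rule factorization above can be verified numerically on a toy joint distribution over three binary variables (the distribution is randomly generated here, purely for illustration): the product of the conditional factors recovers the joint probability exactly.

```python
# Numeric check of the chain rule p(x1,x2,x3) = p(x3|x2,x1) p(x2|x1) p(x1)
# on a random toy joint distribution over three binary variables.
import itertools
import random

random.seed(0)
outcomes = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

def marginal(assign):
    """P(x1..xk) for a prefix assignment (x1, ..., xk)."""
    return sum(p for o, p in joint.items() if o[:len(assign)] == assign)

x = (1, 0, 1)
chain = (marginal(x) / marginal(x[:2])) \
      * (marginal(x[:2]) / marginal(x[:1])) \
      * marginal(x[:1])

assert abs(chain - joint[x]) < 1e-12
print("chain rule holds:", round(chain, 6) == round(joint[x], 6))
```

A belief network then exploits conditional independencies to shorten each factor's conditioning set, which is what makes the representation compact.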