Naive Bayesian Classifiers
The naïve Bayes classifier technique is particularly suited to problems in which the dimensionality of the inputs is high. Despite its simplicity, naïve Bayes can often outperform more sophisticated classification methods. For example, a naïve Bayes model can identify the characteristics of patients with heart disease, showing the probability of each input attribute for the predicted state.
A conditional probability is the likelihood of some conclusion, C, given some evidence/observation, E, where a dependence relationship exists between C and E. This probability is denoted P(C|E), where

P(C|E) = P(E|C) P(C) / P(E)
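As a quick numeric illustration, the following sketch plugs assumed values into this formula; all three input probabilities are made up for the example and do not come from the text.

```python
# Bayes' rule: P(C|E) = P(E|C) * P(C) / P(E)

def posterior(p_e_given_c, p_c, p_e):
    """Return P(C|E) given the likelihood, the prior, and P(E)."""
    return p_e_given_c * p_c / p_e

# Assumed numbers: evidence seen in 80% of C cases, a 10% prior on C,
# and the evidence occurring 15% of the time overall.
print(posterior(0.8, 0.1, 0.15))  # ~0.533
```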
Suppose that there are m classes, C1, C2, …, Cm. Given a tuple X, the classifier will predict that X belongs to the class having the highest posterior probability, conditioned on X. That is, the naïve Bayesian classifier predicts that tuple X belongs to the class Ci if and only if

P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m, j ≠ i

Thus we maximize P(Ci|X). The class Ci for which P(Ci|X) is maximized is called the maximum posteriori hypothesis. By Bayes' theorem,

P(Ci|X) = P(X|Ci) P(Ci) / P(X)
As P(X) is constant for all classes, only P(X|Ci)P(Ci) need be maximized. If the class prior probabilities are not known, then it is commonly assumed that the classes are equally likely, that is, P(C1) = P(C2) = … = P(Cm), and we would therefore maximize P(X|Ci). Otherwise, we maximize P(X|Ci)P(Ci). Note that the class prior probabilities may be estimated by P(Ci) = |Ci,D| / |D|, where |Ci,D| is the number of training tuples of class Ci in D.
Given data sets with many attributes, it would be extremely computationally expensive to compute P(X|Ci) directly. In order to reduce computation in evaluating P(X|Ci), the naïve assumption of class-conditional independence is made. This presumes that the values of the attributes are conditionally independent of one another, given the class label of the tuple (i.e., that there are no dependence relationships among the attributes). Thus, for a tuple X = (x1, x2, …, xn),

P(X|Ci) = P(x1|Ci) x P(x2|Ci) x … x P(xn|Ci)

where each factor P(xk|Ci) can easily be estimated from the training tuples, as sketched below.
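The following sketch shows how the score P(X|Ci)P(Ci) might be computed under the independence assumption; the function name and the (attribute, value) dictionary layout are illustrative choices, not anything fixed by the method itself.

```python
# Hypothetical sketch: score one class Ci for a tuple X under the naive
# independence assumption, i.e. P(X|Ci) * P(Ci) = prior * product of P(xk|Ci).

def class_score(x, cond_probs, prior):
    """x: dict mapping attribute name -> observed value (the tuple X).
    cond_probs: dict mapping (attribute, value) -> P(value | Ci),
        estimated beforehand by counting over the training tuples.
    prior: P(Ci), e.g. |Ci,D| / |D|.
    """
    score = prior
    for attr, value in x.items():
        score *= cond_probs[(attr, value)]  # multiply in P(xk | Ci)
    return score
```

In practice, a product of many small probabilities can underflow, so real implementations usually sum log-probabilities instead of multiplying raw ones.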
Illustration with an Example
Consider the following simple training data set, where Age is coded Y = Youth, M = Middle-aged, S = Senior, and Income is coded H = High, M = Medium, L = Low:
S.No.   Age   Income   Govt. Employee   Credit Rating   Loan (Yes/No)
1       Y     H        No               Fair            No
2       Y     H        No               Excellent       No
3       M     H        No               Fair            Yes
4       S     M        No               Fair            Yes
5       S     L        Yes              Fair            Yes
6       S     L        Yes              Excellent       No
7       M     L        Yes              Excellent       Yes
8       Y     M        No               Fair            No
9       Y     L        Yes              Fair            Yes
10      S     M        Yes              Fair            Yes
11      Y     M        Yes              Excellent       Yes
12      M     M        No               Excellent       Yes
13      M     H        Yes              Fair            Yes
14      S     M        No               Excellent       No
We wish to predict the class label of a tuple using naïve Bayesian classification, given the training data in the table above. The data tuples are described by the attributes Age, Income, Govt_Employee, and Credit_Rating. The class label attribute, Loan, has two distinct values (namely, {Yes, No}). Let C1 correspond to the class Loan = Yes and C2 correspond to Loan = No.
The tuple we wish to classify is
X = (Age = Youth, Income = Medium, Govt_Employee = Yes, Credit_Rating = Fair)
We need to maximize P(X|Ci)P(Ci) for i = 1, 2. P(Ci), the prior probability of each class, can be computed based on the training tuples:

P(Loan = Yes) = 9/14 = 0.643
P(Loan = No) = 5/14 = 0.357

To compute P(X|Ci) for i = 1, 2, we compute the following conditional probabilities from the table:

P(Age = Youth | Loan = Yes) = 2/9 = 0.222
P(Age = Youth | Loan = No) = 3/5 = 0.600
P(Income = Medium | Loan = Yes) = 4/9 = 0.444
P(Income = Medium | Loan = No) = 2/5 = 0.400
P(Govt_Employee = Yes | Loan = Yes) = 6/9 = 0.667
P(Govt_Employee = Yes | Loan = No) = 1/5 = 0.200
P(Credit_Rating = Fair | Loan = Yes) = 6/9 = 0.667
P(Credit_Rating = Fair | Loan = No) = 2/5 = 0.400

Using these probabilities, we obtain

P(X | Loan = Yes) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X | Loan = No) = 0.600 x 0.400 x 0.200 x 0.400 = 0.019

To find the class Ci that maximizes P(X|Ci)P(Ci), we compute

P(X | Loan = Yes) P(Loan = Yes) = 0.044 x 0.643 = 0.028
P(X | Loan = No) P(Loan = No) = 0.019 x 0.357 = 0.007

Therefore, the naïve Bayesian classifier predicts Loan = Yes for tuple X.
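To tie the example together, here is a minimal end-to-end sketch that reproduces the computation above by counting over the table; the tuple encoding and variable names are illustrative choices.

```python
from collections import Counter

# Training tuples from the table: (Age, Income, Govt_Employee, Credit_Rating, Loan)
data = [
    ("Y", "H", "No", "Fair", "No"),        ("Y", "H", "No", "Excellent", "No"),
    ("M", "H", "No", "Fair", "Yes"),       ("S", "M", "No", "Fair", "Yes"),
    ("S", "L", "Yes", "Fair", "Yes"),      ("S", "L", "Yes", "Excellent", "No"),
    ("M", "L", "Yes", "Excellent", "Yes"), ("Y", "M", "No", "Fair", "No"),
    ("Y", "L", "Yes", "Fair", "Yes"),      ("S", "M", "Yes", "Fair", "Yes"),
    ("Y", "M", "Yes", "Excellent", "Yes"), ("M", "M", "No", "Excellent", "Yes"),
    ("M", "H", "Yes", "Fair", "Yes"),      ("S", "M", "No", "Excellent", "No"),
]

# X = (Age=Youth, Income=Medium, Govt_Employee=Yes, Credit_Rating=Fair)
x = ("Y", "M", "Yes", "Fair")

class_counts = Counter(row[-1] for row in data)  # |Ci,D| for each class

for label, count in class_counts.items():
    rows = [row for row in data if row[-1] == label]
    score = count / len(data)  # prior P(Ci) = |Ci,D| / |D|
    for k, value in enumerate(x):
        # P(xk|Ci): fraction of class-Ci tuples having this attribute value
        score *= sum(1 for row in rows if row[k] == value) / len(rows)
    print(label, round(score, 3))

# Output: No 0.007, then Yes 0.028 -> the classifier predicts Loan = Yes
```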