CSCI 5582
Artificial Intelligence
Lecture 14
Jim Martin
CSCI 5582 Fall 2006
Today 10/17
Review basics
More on independence
Break
Bayesian Belief Nets
Review
Joint Distributions
Atomic Events
Independence assumptions
Review: Joint Distribution
               Toothache=True   Toothache=False
Cavity True         0.04             0.06
Cavity False        0.01             0.89
Each cell represents a conjunction of the variables in
the model.
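As a minimal sketch, the table above can be stored as a dictionary and queried by summing cells. The function names are illustrative, not from the lecture; the four numbers are the ones on the slide.

```python
# Joint distribution from the slide, keyed by (cavity, toothache).
joint = {
    (True, True): 0.04,
    (True, False): 0.06,
    (False, True): 0.01,
    (False, False): 0.89,
}

def marginal_cavity(joint, value=True):
    """P(Cavity=value): sum every cell where cavity matches."""
    return sum(p for (cavity, _), p in joint.items() if cavity == value)

def marginal_toothache(joint, value=True):
    """P(Toothache=value): sum every cell where toothache matches."""
    return sum(p for (_, toothache), p in joint.items() if toothache == value)

def cavity_given_toothache(joint):
    """P(Cavity | Toothache) = P(Cavity ^ Toothache) / P(Toothache)."""
    return joint[(True, True)] / marginal_toothache(joint, True)

total = sum(joint.values())                      # atomic events sum to 1
p_cavity = marginal_cavity(joint)                # 0.04 + 0.06
p_toothache = marginal_toothache(joint)          # 0.04 + 0.01
p_cavity_given_toothache = cavity_given_toothache(joint)
```

Every question about these two variables reduces to adding up the right cells and, for conditionals, dividing.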
Atomic Events
The entries in the table represent
the probabilities of atomic events
Events where the values of all the
variables are specified
Independence
Two variables A and B are
independent iff P(A|B) = P(A). In
other words, knowing B gives you no
information about A.
Or P(A^B)=P(A|B)P(B)=P(A)P(B)
E.g., two coin tosses
Mental Exercise
With a fair coin which of the
following two sequences is more
likely?
HHHHHTTTTT
HTTHHHTHTT
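One way to check the exercise is to multiply out the per-toss probabilities. This sketch assumes the tosses are independent and the coin is fair; `sequence_probability` is a hypothetical helper name.

```python
def sequence_probability(seq, p_heads=0.5):
    """Probability of an exact H/T sequence under independent tosses."""
    prob = 1.0
    for toss in seq:
        prob *= p_heads if toss == "H" else (1.0 - p_heads)
    return prob

p_first = sequence_probability("HHHHHTTTTT")
p_second = sequence_probability("HTTHHHTHTT")
```

Under independence each specific sequence of ten tosses is an atomic event with probability 0.5^10, so the two come out equal.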
Conditional Independence
Consider the dentist problem with 3
variables: cavity, toothache, catch
If I have a cavity, then the chances
that there will be a catch is
independent of whether or not I have
a toothache as well. I.e.
P(Catch|Cavity^Toothache)=
P(Catch|Cavity)
Conditional Independence
Remember that having the joint
distribution over N variables allows
you to answer all the questions
involving those variables.
Exploiting conditional independence
allows us to represent the complete
joint distribution with fewer entries.
I.e., fewer than the 2^N normally needed
Conditional Independence
P(Cavity,Catch,Toothache)
= P(Cavity)P(Catch,Toothache|Cavity)
=P(Cavity)P(Catch|Cavity)P(Toothache|Cavity)
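A rough sketch of the factorization above: five numbers are enough to rebuild all eight joint entries. The probabilities here are made up for illustration and do not come from the lecture.

```python
from itertools import product

# Illustrative numbers only (assumptions, not from the slides).
p_cavity = 0.1
p_catch_given_cavity = {True: 0.9, False: 0.2}       # keyed by cavity
p_toothache_given_cavity = {True: 0.4, False: 0.05}  # keyed by cavity

def bern(p_true, value):
    """Probability that a Boolean variable takes `value`."""
    return p_true if value else 1.0 - p_true

# P(Cavity, Catch, Toothache) = P(Cavity) P(Catch|Cavity) P(Toothache|Cavity)
joint = {}
for cavity, catch, toothache in product([True, False], repeat=3):
    joint[(cavity, catch, toothache)] = (
        bern(p_cavity, cavity)
        * bern(p_catch_given_cavity[cavity], catch)
        * bern(p_toothache_given_cavity[cavity], toothache)
    )

# Recover P(Catch | Cavity ^ Toothache) from the joint; it should match
# P(Catch | Cavity), which is the conditional independence assumption.
p_catch_given_cavity_and_toothache = joint[(True, True, True)] / (
    joint[(True, True, True)] + joint[(True, False, True)]
)
```

The rebuilt table has 8 entries but was specified with 5 parameters (1 + 2 + 2).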
Conditional Independence
P(Cavity,Catch,Toothache)
= P(Catch)P(Cavity,Toothache|Catch)
Huh?
Bayesian Belief Nets
A compact notation for representing
conditional independence assumptions
and hence a compact way of
representing a joint distribution.
Syntax:
A directed acyclic graph, one node per
variable
Each node augmented with local
conditional probability tables
Bayesian Belief Nets
Nodes with no incoming arcs (root
nodes) simply have priors associated
with them
Nodes with incoming arcs have tables
enumerating the
P(Node|Conjunction of Parents)
Where parent means the node at the
other end of the incoming arc
Alarm Example
Variables: Burglar, MaryCalls,
JohnCalls, Earthquake, Alarm
Network topology captures the
domain causality (conditional
independence assumptions).
Alarm Example
Bayesian Belief Nets:
Semantics
The full joint distribution for the N
variables in a Belief Net can be recovered
from the information in the tables.
P(X_1, \ldots, X_N) = \prod_{i=1}^{N} P(X_i \mid Parents(X_i))
Belief Net Semantics
Alarm Example
What are the chances of John calls,
Mary calls, alarm is going off, no
burglary, no earthquake?
Alarm Example
Alarm Example
P(J^M^A^~B^~E)=
P(J|A)*P(M|A)*P(A|~B^~E)*P(~B)*P(~E)
= 0.9 * 0.7 * 0.001 * 0.999 * 0.998
In other words, the probability of atomic
events can be read right off the network as
the product of the probability of the entries
for each variable
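The product above can be checked directly. This sketch uses only the five numbers on the slide; the variable names are illustrative.

```python
# Table entries read off the slide for the event J ^ M ^ A ^ ~B ^ ~E.
p_j_given_a = 0.9
p_m_given_a = 0.7
p_a_given_not_b_not_e = 0.001
p_not_b = 0.999
p_not_e = 0.998

# One factor per variable, each conditioned on its parents' values.
p_event = p_j_given_a * p_m_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e
```

The result is roughly 0.00063, small because the alarm almost never sounds without a burglary or an earthquake.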
Events
What about non-atomic events?
Remember to partition. Any event can
be defined as a combination of other
more well-specified events.
P(A) = P(A^B)+P(A^~B)
So what's the probability that Mary
calls out of the blue?
Events
P(M^J^E^B^A) +
P(M^J^E^B^~A) +
P(M^J^E^~B^A) +
... (one term for each assignment to J, E, B, and A)
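The partition for P(M) can be sketched as a sum over all 16 assignments to the other four variables. Only 0.9, 0.7, 0.001, 0.999, and 0.998 appear in these slides; the remaining table entries are assumed from the usual textbook version of the alarm network and may differ from the figure shown in class.

```python
from itertools import product

# Conditional probability tables. Entries not shown in the slides are
# assumptions taken from the common textbook version of this network.
P_B = 0.001                      # P(Burglary)
P_E = 0.002                      # P(Earthquake)
P_A = {                          # P(Alarm | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls | Alarm)

def bern(p_true, value):
    """Probability that a Boolean variable takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Atomic event probability: one factor per node given its parents."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

# P(M) = sum of the 16 atomic events in which Mary calls.
p_mary_calls = sum(
    joint(b, e, a, j, True)
    for b, e, a, j in product([True, False], repeat=4)
)
```

With these assumed tables the sum comes to a little over 0.01, dominated by the cases where Mary calls with no alarm.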
Events
How about P(M|Alarm)?
Trick question: that's something we know
How about P(M|Earthquake)?
Not directly in the network
rewrite as
P(M^Earthquake)/P(Earthquake)
Simpler Examples
Let's say we have two variables A and B, and
we know B influences A.
What's P(A^B)?
Network: B -> A
B: P(B)
A: P(A|B), P(A|~B)
Simple Example
Now I tell you that B has happened.
What's your belief in A?
Network: B -> A
B: P(B)
A: P(A|B), P(A|~B)
Simple Example
Suppose instead I say A has happened.
What's your belief in B?
Network: B -> A
B: P(B)
A: P(A|B), P(A|~B)
Simple Example
P(B|A) = P(B^A)/P(A)
= P(B^A) / (P(A^B) + P(A^~B))
= P(B)P(A|B) / (P(B)P(A|B) + P(~B)P(A|~B))
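The derivation above translates into a few lines of code. This is a sketch for the two-node network B -> A; the function name and the example numbers are illustrative assumptions.

```python
def posterior_b_given_a(p_b, p_a_given_b, p_a_given_not_b):
    """P(B|A) from the three numbers stored in the two-node network."""
    numerator = p_b * p_a_given_b                           # P(B ^ A)
    evidence = numerator + (1.0 - p_b) * p_a_given_not_b    # P(A)
    return numerator / evidence

# Hypothetical tables: P(B)=0.2, P(A|B)=0.9, P(A|~B)=0.1.
p_b_given_a = posterior_b_given_a(0.2, 0.9, 0.1)
```

With these made-up numbers, observing A raises the belief in B from 0.2 to about 0.69.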
Chain Rule Basis
P(B,E,A,J,M)
= P(M|B,E,A,J)P(B,E,A,J)
P(B,E,A,J) = P(J|B,E,A)P(B,E,A)
P(B,E,A) = P(A|B,E)P(B,E)
P(B,E) = P(B|E)P(E)
Chain Rule Basis
P(B,E,A,J,M)
= P(M|B,E,A,J)P(J|B,E,A)P(A|B,E)P(B|E)P(E)
= P(M|A)P(J|A)P(A|B,E)P(B)P(E)
Alarm Example
Details
Where do the graphs come from?
Initially, the intuitions of domain experts
Where do the numbers come from?
Hopefully, from hard data
Sometimes from experts' intuitions
How can we compute things efficiently?
Exactly, by not redoing things unnecessarily
By approximating things
Break
Readings for probability
Chapter 13: All
Chapter 14: pp. 492-498, 500, Sec. 14.4
Noisy-Or
Even with the reduction in the number
of probabilities needed, it's hard to
accumulate all the numbers you need.
Especially true when some evidence
variables are shared among many
causes.
The Noisy-Or hack is a useful shortcut.
P(A|C1^C2^C3)
Noisy-Or
Network: Cold, Flu, and Malaria are each parents of Fever
Noisy Or
P(Fever|Cold)    P(Fever|Malaria)    P(Fever|Flu)
P(~Fever|Cold)   P(~Fever|Malaria)   P(~Fever|Flu)
Noisy Or
What does it mean for a blocker to
occur?
It means the cause was true and the
symptom didn't happen
What's the probability of that?
P(~Fever|Cause)
P(~Fever|Flu), etc.
Noisy Or
If all three causes are true and you don't
have a fever, then all three blockers
are in effect
What's the probability of that?
P(~Fever|flu,cold,malaria)
= P(~Fever|flu)P(~Fever|cold)P(~Fever|malaria)
But 1 - that = P(Fever|causes)
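A minimal sketch of the Noisy-Or calculation: multiply the per-cause probabilities of no fever for whichever causes are present, then subtract from 1. The three numbers are illustrative assumptions, and the sketch assumes fever has no other causes (no leak term).

```python
# P(~Fever | only this cause is present). Values are made up for illustration.
P_NO_FEVER_GIVEN = {"cold": 0.6, "flu": 0.2, "malaria": 0.1}

def p_fever(active_causes):
    """Noisy-Or: fever fails to occur only if every active cause is blocked."""
    p_all_blocked = 1.0
    for cause in active_causes:
        p_all_blocked *= P_NO_FEVER_GIVEN[cause]
    return 1.0 - p_all_blocked

p_fever_all_three = p_fever(["cold", "flu", "malaria"])   # 1 - 0.6*0.2*0.1
p_fever_cold_and_flu = p_fever(["cold", "flu"])           # 1 - 0.6*0.2
p_fever_no_cause = p_fever([])                            # 0 under this model
```

Three numbers stand in for the eight rows of the full table P(Fever|Cold,Flu,Malaria), which is the point of the shortcut.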
Computing with BBNs
Normal scenario
You have a belief net consisting of a bunch
of variables
Some of which you know to be true (evidence)
Some of which you're asking about (query)
Some you haven't specified (hidden)
Example
Probability that there's a burglary
given that John and Mary are calling
P(B|J,M)
B is the query variable
J and M are evidence variables
A and E are hidden variables
Example
Probability that there's a burglary given that John
and Mary are calling
P(B|J,M) = alpha P(B,J,M)
= alpha * [
P(B,J,M,A,E) +
P(B,J,M,~A,E) +
P(B,J,M,A,~E) +
P(B,J,M,~A,~E) ]
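The enumeration above can be sketched as follows: sum out the hidden variables A and E for each value of B, then normalize (that is what alpha does). Table entries other than 0.9, 0.7, 0.001, 0.999, and 0.998 do not appear in these slides and are assumed from the usual textbook version of the alarm network.

```python
from itertools import product

# Conditional probability tables. Entries not shown in the slides are
# assumptions taken from the common textbook version of this network.
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (b, e)
P_J = {True: 0.90, False: 0.05}                      # keyed by a
P_M = {True: 0.70, False: 0.01}                      # keyed by a

def bern(p_true, value):
    """Probability that a Boolean variable takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Atomic event probability read off the network."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

def unnormalized(b):
    """P(B=b, J, M): sum over the hidden variables A and E."""
    return sum(joint(b, e, a, True, True)
               for a, e in product([True, False], repeat=2))

alpha = 1.0 / (unnormalized(True) + unnormalized(False))
p_burglary_given_calls = alpha * unnormalized(True)
```

With these assumed tables the posterior is about 0.28: both calls make a burglary far more likely than its 0.001 prior, but still less likely than not.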
From the Network
\sum_e \sum_a P(B) P(E) P(A \mid B,E) P(J \mid A) P(M \mid A)
= P(B) \sum_e P(E) \sum_a P(A \mid B,E) P(J \mid A) P(M \mid A)
Expression Tree
Speedups
Don't recompute things.
Dynamic programming
Don't compute some things at all
Ignore variables that can't affect the
outcome.
Example
John calls given burglary: P(J|B)
P(B) \sum_e P(E) \sum_a P(A \mid B,E) P(J \mid A) \sum_m P(M \mid A)
Variable Elimination
Every variable that is not an ancestor
of a query variable or an evidence
variable is irrelevant to the query
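A small sketch of why MaryCalls can be dropped when computing P(J|B): M is not an ancestor of J or B, and its inner sum always equals 1. Table entries other than 0.9, 0.7, and 0.001 are assumed from the usual textbook version of the alarm network, not taken from these slides.

```python
# Tables needed for P(J | B=true). Entries not shown in the slides are
# assumptions taken from the common textbook version of this network.
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # keyed by (b, e)
P_J = {True: 0.90, False: 0.05}                      # keyed by a
P_M = {True: 0.70, False: 0.01}                      # keyed by a

def bern(p_true, value):
    """Probability that a Boolean variable takes `value`."""
    return p_true if value else 1.0 - p_true

def p_john_given_burglary(sum_over_m):
    """P(J | B=true), optionally carrying the redundant sum over M."""
    total = 0.0
    for e in (True, False):
        for a in (True, False):
            term = bern(P_E, e) * bern(P_A[(True, e)], a) * P_J[a]
            if sum_over_m:
                # sum_m P(m | a) is 1 for either value of a.
                term *= sum(bern(P_M[a], m) for m in (True, False))
            total += term
    return total

with_m = p_john_given_burglary(sum_over_m=True)
without_m = p_john_given_burglary(sum_over_m=False)
```

Both versions give the same answer, so the sum over M is wasted work that pruning avoids.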
Next Time
Finish Chapters 13 and 14