Chapter 3 (Part 1) of The Book of Why: From Evidence To Causes - Reverend Bayes Meets Mr. Holmes
Tomás Aragón and James Duren (Version F; Last compiled August 4, 2019)
June 6, 2019, San Francisco, CA
Health Officer, City & County of San Francisco
Director, Population Health Division (PHD)
San Francisco Department of Public Health
https://taragonmd.github.io/ (GitHub page)
A patient presents with chest pain to a clinical provider.
The patient has a history of coronary artery disease.
[Diagram: CAD → MI → TT; MI → CP ← GERD]
Figure 1: A patient with a history of coronary artery disease (CAD) presents to a provider
complaining of prolonged chest pain (CP). The provider’s differential diagnoses (hypotheses) are
myocardial infarction (MI) and gastroesophageal reflux disease (GERD). The provider sends a
blood specimen for a Troponin Test (TT) to "rule out" an MI.
Sherlock Holmes: deduction vs. induction
Cause • → • Effect
Hypothesis • → • Evidence
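Reasoning from cause to effect (hypothesis to evidence) is also called "predictive" reasoning.
Reasoning from effect to cause (evidence to hypothesis) is also called "diagnostic" reasoning.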
Bayes’ theorem for causal Hypothesis • → • Evidence
For Bayes’ theorem, just substitute the probability expressions from the table above:
$$(4) = \frac{(1)\,(2)}{(3)}$$
Bayes’ theorem for causal (Hypothesis) → (Evidence)
$$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)}$$
Exposure • → • Disease
Reverend Thomas Bayes’ pool table example
L (cause) → x (effect)
Forward probability
P(x | L)
"Inverse" probability
P(L | x )
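The contrast between forward and inverse probability can be sketched numerically. This is an illustration under stated assumptions, not Bayes' original calculation: it assumes a hypothetical discrete set of candidate table lengths with a uniform prior, and a ball that is equally likely to stop anywhere along the table, so that the forward density is P(x | L) = 1/L for 0 ≤ x ≤ L. The function names and the numbers are made up for this example.

```python
def forward_density(x, length):
    """Forward probability density P(x | L): ball stops uniformly on [0, L] (assumed)."""
    return 1.0 / length if 0.0 <= x <= length else 0.0


def inverse_probability(x, candidate_lengths):
    """Inverse probability P(L | x) over candidate lengths, assuming a uniform prior."""
    prior = 1.0 / len(candidate_lengths)
    weights = {L: prior * forward_density(x, L) for L in candidate_lengths}
    total = sum(weights.values())  # plays the role of P(x)
    if total == 0.0:
        raise ValueError("x is impossible under every candidate length")
    return {L: w / total for L, w in weights.items()}


# Hypothetical candidate table lengths (in feet) and an observed stopping point.
posterior = inverse_probability(x=5.0, candidate_lengths=[4.0, 6.0, 8.0, 10.0])
for length, prob in posterior.items():
    print(f"P(L = {length} | x = 5.0) = {prob:.3f}")
```

Under these assumptions a table shorter than the stopping point is ruled out, and shorter tables among the remaining candidates receive more posterior weight because they make the observed x more likely.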
Tea-Scones example (default: assume probabilistic dependence)
P(Tea) = P(T )
P(Scone) = P(S)
$$P(\text{Tea} \cap \text{Scone}) = P(T, S) = P(S, T)$$
[Venn diagram: sets T and S overlapping in the region T ∩ S]
$$P(S, T) = P(T)\,P(S \mid T) \tag{1}$$
$$P(T, S) = P(S)\,P(T \mid S) \tag{2}$$
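Equations (1) and (2) can be checked with a small table of counts. The sketch below uses hypothetical counts for 12 customers (illustrative numbers that are not given on the slide) and exact fractions to confirm that both factorizations give the same joint probability.

```python
from fractions import Fraction

# Hypothetical counts for 12 customers (illustrative, not taken from the slide).
counts = {
    ("tea", "scone"): 4,
    ("tea", "no scone"): 4,
    ("no tea", "scone"): 1,
    ("no tea", "no scone"): 3,
}
n = sum(counts.values())

p_t = Fraction(counts[("tea", "scone")] + counts[("tea", "no scone")], n)  # P(T)
p_s = Fraction(counts[("tea", "scone")] + counts[("no tea", "scone")], n)  # P(S)
p_joint = Fraction(counts[("tea", "scone")], n)                            # P(S, T)

p_s_given_t = p_joint / p_t  # P(S | T)
p_t_given_s = p_joint / p_s  # P(T | S)

# Equations (1) and (2): both factorizations recover the same joint probability.
assert p_t * p_s_given_t == p_joint
assert p_s * p_t_given_s == p_joint

# Equating (1) and (2) and dividing by P(T) yields P(S | T).
print(p_s * p_t_given_s / p_t)
```

Because the two products are equal, either conditional probability can be obtained from the other, which is the step that leads to Bayes' theorem.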
Bayes’ theorem
$$P(S \mid T) = \frac{P(S)\,P(T \mid S)}{P(T)}$$
$$P(\text{Hypothesis} \mid \text{Evidence}) = \frac{P(\text{Hypothesis})\,P(\text{Evidence} \mid \text{Hypothesis})}{P(\text{Evidence})}$$
Bayes’ theorem
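$$P(\text{Hypothesis} \mid \text{Evidence}) = \frac{P(\text{Hypothesis})\,P(\text{Evidence} \mid \text{Hypothesis})}{P(\text{Evidence})}$$
P(Evidence | Hypothesis): causal reasoning (from hypothesis to evidence)
P(Hypothesis | Evidence): evidential reasoning (from evidence to hypothesis)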
Bayes’ theorem
$$P(S \mid T) = \frac{P(S, T)}{P(T)} = \frac{P(S)\,P(T \mid S)}{P(T)}$$
$$P(\text{Hypothesis} \mid \text{Evidence}) = \frac{P(\text{Hypothesis})\,P(\text{Evidence} \mid \text{Hypothesis})}{P(\text{Evidence})}$$
Cause • → • Effect
Disease • → • Test
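$$
\begin{aligned}
P(\text{Disease} \mid \text{Test}) &= \frac{P(\text{Disease}, \text{Test})}{P(\text{Test})} \\
&= \frac{P(\text{Disease})\,P(\text{Test} \mid \text{Disease})}{P(\text{Test})} \\
&= P(\text{Disease}) \left( \frac{P(\text{Test} \mid \text{Disease})}{P(\text{Test})} \right) \\
&= P(\text{Disease}) \times (\text{Likelihood Ratio})
\end{aligned}
$$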
Bayes’ theorem example: Mammogram for breast cancer screening
Cause • → • Effect
Disease • → • Test
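$$
\begin{aligned}
P(\text{Disease} \mid \text{Test}) &= \frac{P(\text{Disease}, \text{Test})}{P(\text{Test})} \\
&= \frac{P(\text{Disease})\,P(\text{Test} \mid \text{Disease})}{P(\text{Test})} \\
&= \frac{P(D)\,P(T \mid D)}{P(D)\,P(T \mid D) + P(\bar{D})\,P(T \mid \bar{D})}
\end{aligned}
$$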
Bayes’ theorem example: Mammogram for breast cancer screening
Disease • → • Test
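A 43-year-old woman has a positive mammogram (T+). What is the probability of breast cancer
given the positive test, P(D+ | T+)? What do we know? The prior probability is P(D+) = 1/700,
the true-positive rate is TP = P(T+ | D+) = 0.73, and the false-positive rate is
FP = P(T+ | D−) = 0.12.
$$
\begin{aligned}
P(D^{+} \mid T^{+}) &= \frac{P(D^{+})\,(\mathrm{TP})}{P(D^{+})\,(\mathrm{TP}) + P(D^{-})\,(\mathrm{FP})} \\
&= \frac{(1/700)(0.73)}{(1/700)(0.73) + (1 - 1/700)(0.12)} \approx 0.009
\end{aligned}
$$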
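The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the three numbers on the slide; the function and variable names are chosen for this example. It also checks that the "prior times likelihood ratio" form, with the ratio P(T+ | D+) / P(T+), gives the same answer.

```python
def posterior_given_positive(prior, true_positive_rate, false_positive_rate):
    """P(D+ | T+) by Bayes' theorem, with P(T+) expanded by total probability."""
    p_positive = prior * true_positive_rate + (1.0 - prior) * false_positive_rate
    return prior * true_positive_rate / p_positive


prior = 1 / 700   # P(D+), from the slide
tp_rate = 0.73    # P(T+ | D+), from the slide
fp_rate = 0.12    # P(T+ | D-), from the slide

posterior = posterior_given_positive(prior, tp_rate, fp_rate)

# Equivalent form: posterior = prior * likelihood ratio, with ratio P(T+ | D+) / P(T+).
p_positive = prior * tp_rate + (1 - prior) * fp_rate
likelihood_ratio = tp_rate / p_positive
assert abs(posterior - prior * likelihood_ratio) < 1e-12

print(f"P(D+ | T+) = {posterior:.4f}")
```

The result rounds to the 0.009 shown above: even after a positive test, the probability of disease stays below 1 in 100 because the prior is so small.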
Bayes’ theorem: Review sensitivity and specificity of a diagnostic test
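As a quick review, sensitivity and specificity can be computed from a 2×2 table of test results against true disease status. The sketch below uses hypothetical counts, chosen only so that the rates match the 0.73 and 0.12 used in the mammogram example; the counts themselves are not from the slides.

```python
def sensitivity(tp, fn):
    """Sensitivity = P(T+ | D+) = TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Specificity = P(T- | D-) = TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts: 100 people with disease, 100 without (illustrative only).
tp, fn = 73, 27   # diseased: test positive / test negative
tn, fp = 88, 12   # not diseased: test negative / test positive

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
false_positive_rate = 1 - spec  # P(T+ | D-)

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec:.2f}")
print(f"false-positive rate = {false_positive_rate:.2f}")
```

Sensitivity is the true-positive rate that multiplies the prior in the numerator of Bayes' theorem, and one minus specificity is the false-positive rate that appears in the denominator.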
Bayes’ theorem
Bayes’ rule is a distillation of the scientific method (TBoW, p. 108)
Bayes’ theorem for causal (Disease) → (Test)
$$P(D \mid T) = \frac{P(D)\,P(T \mid D)}{P(T)}$$
Here is a non-causal Bayesian network. Smelling smoke increases the credibility (belief)
of a fire nearby, but smoke does not cause fire.
Here is a causal Bayesian network. Fire causes smoke. Smoke is evidence of a fire
(cause). Causal BNs have causal and evidential implications.
Fire → Smoke
Causal Bayesian networks are represented as directed acyclic graphs (DAGs). The mammography
example was a causal Bayesian network.
Disease → Test
Bayesian networks generalize Bayes’ theorem for complex causal graphs
Core DAG patterns for three nodes and two edges
(a) X → Y → Z
(b) Y ← X → Z
(c) X → Z ← Y
Figure 2: Core DAG patterns for three nodes and two edges: (a) chain (sequential cause), (b)
fork (common cause), and (c) collider (common effect).
Recap: A patient presents with chest pain to a clinical provider.
The patient has a history of coronary artery disease.
[Diagram: CAD → MI → TT; MI → CP ← GERD]
Figure 3: A patient with a history of coronary artery disease (CAD) presents to a provider
complaining of prolonged chest pain (CP). The provider’s differential diagnoses (hypotheses) are
myocardial infarction (MI) and gastroesophageal reflux disease (GERD). The provider sends a
blood specimen for a Troponin Test (TT) to "rule out" an MI. The pattern CAD → MI → TT is
a chain (sequential cause); TT ← MI → CP is a fork (common cause or confounder); and
MI → CP ← GERD is a collider (common effect). Providers reason like Sherlock Holmes.
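The three patterns named in the caption can be read off mechanically from the edge list. The sketch below is one possible way to do it, assuming the DAG is stored as a set of (parent, child) pairs; the function name `three_node_patterns` is made up for this example.

```python
from itertools import combinations

# Edges of the chest-pain DAG, as described in the figure caption.
edges = {("CAD", "MI"), ("MI", "TT"), ("MI", "CP"), ("GERD", "CP")}


def three_node_patterns(edges):
    """List chains, forks, and colliders formed by two edges sharing a middle node."""
    patterns = {"chain": [], "fork": [], "collider": []}
    nodes = sorted({node for edge in edges for node in edge})
    for mid in nodes:
        parents = sorted(a for a, b in edges if b == mid)
        children = sorted(b for a, b in edges if a == mid)
        # Chain: one edge into the middle node, one edge out of it.
        for p in parents:
            for c in children:
                patterns["chain"].append(f"{p} -> {mid} -> {c}")
        # Fork: two edges out of the middle node (common cause).
        for c1, c2 in combinations(children, 2):
            patterns["fork"].append(f"{c1} <- {mid} -> {c2}")
        # Collider: two edges into the middle node (common effect).
        for p1, p2 in combinations(parents, 2):
            patterns["collider"].append(f"{p1} -> {mid} <- {p2}")
    return patterns


patterns = three_node_patterns(edges)
for name, found in patterns.items():
    print(name, found)
```

Besides the three patterns in the caption, the same edge list also contains the chain CAD -> MI -> CP, since MI has two children.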