
Bayesian Statistics

Course notes by Robert Piché, Tampere University of Technology
based on notes by Antti Penttinen, University of Jyväskylä
version: February 27, 2009

5.3 Poisson model for count data


Let $\#I$ denote the number of occurrences of some phenomenon that are observed
in an interval $I$ (of time, usually). For example, $\#I$ could be the number of traffic
accidents on a given stretch of highway, the number of particles emitted in the
radioactive decay of an isotope sample, the number of outbreaks of a given disease
in a given city, etc. The number $y$ of occurrences per unit time is often modelled
as $y \mid \theta \sim \mathrm{Poisson}(\theta)$, which has the pmf
$$P(\#(t_0, t_0+1] = y \mid \theta) = \frac{\theta^y}{y!} e^{-\theta} \qquad (y \in \{0, 1, 2, \ldots\}).$$
The Poisson model can be derived as follows. Assume that the events are
relatively rare and occur at a constant rate $\theta$, that is,
$$P(\#(t, t+h] = 1 \mid \theta) = \theta h + o(h), \qquad P(\#(t, t+h] \ge 2 \mid \theta) = o(h),$$
where $o(h)$ denotes a term such that $\lim_{h \to 0} \frac{o(h)}{h} = 0$. Assume also that the numbers of occurrences in
distinct intervals are independent given $\theta$. Letting $P_k(t) := P(\#(0,t] = k \mid \theta)$, we
have

$$\begin{aligned}
P_0(t+h) &= P(\#(0,t] = 0 \;\&\; \#(t,t+h] = 0 \mid \theta) \\
&= P(\#(t,t+h] = 0 \mid \#(0,t] = 0, \theta)\, P(\#(0,t] = 0 \mid \theta) \\
&= (1 - \theta h + o(h))\, P_0(t).
\end{aligned}$$
Letting $h \to 0$ gives the differential equation $P_0'(t) = -\theta P_0(t)$, which with the
initial condition $P_0(0) = 1$ has the solution $P_0(t) = e^{-\theta t}$. Similarly, for $k > 0$ we
have

$$\begin{aligned}
P_k(t+h) &= P(\#(0,t] = k \;\&\; \#(t,t+h] = 0 \mid \theta) \\
&\quad + P(\#(0,t] = k-1 \;\&\; \#(t,t+h] = 1 \mid \theta) \\
&= (1 - \theta h + o(h))\, P_k(t) + (\theta h + o(h))\, P_{k-1}(t),
\end{aligned}$$
which in the limit $h \to 0$ gives the differential equations
$$P_k'(t) = -\theta P_k(t) + \theta P_{k-1}(t).$$
Solving these with the initial conditions $P_k(0) = 0$ ($k > 0$) gives
$$P_k(t) = \frac{(\theta t)^k}{k!} e^{-\theta t} \qquad (k \in \{0, 1, 2, \ldots\}),$$
which for $t = 1$ is the Poisson pmf.
Thus, a Poisson-distributed random variable $y \mid \theta \sim \mathrm{Poisson}(\theta)$ has the pmf
$$P(y = k \mid \theta) = \frac{\theta^k}{k!} e^{-\theta} \qquad (k \in \{0, 1, 2, \ldots\})$$
and the summary statistics
$$E(y \mid \theta) = \theta, \qquad V(y \mid \theta) = \theta.$$
The likelihood pmf of a sequence $y_1, \ldots, y_n$ of Poisson-distributed counts on
unit-length intervals, assumed to be mutually independent conditional on $\theta$, is
$$p(y_{1:n} \mid \theta) = \prod_{i=1}^n \frac{\theta^{y_i}}{y_i!} e^{-\theta} \propto \theta^s e^{-n\theta},$$
where $s = \sum_{i=1}^n y_i$.
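As a quick numerical check (not part of the original notes), the following Python sketch with made-up counts confirms that two data sets with the same $n$ and $s$ give likelihoods differing only by a constant factor, so the data enter the inference only through $s$:

import numpy as np
from scipy.stats import poisson

y1 = np.array([2, 0, 3, 1])  # hypothetical counts: s = 6, n = 4
y2 = np.array([1, 1, 1, 3])  # different counts, same s and n

for theta in [0.5, 1.0, 2.0, 5.0]:
    L1 = poisson.pmf(y1, theta).prod()
    L2 = poisson.pmf(y2, theta).prod()
    print(theta, L1 / L2)  # ratio is 0.5 for every theta: same theta-dependence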
The conjugate prior for the Poisson distribution is the $\mathrm{Gamma}(\alpha, \beta)$ distribution, which has the pdf
$$p(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta\theta} \qquad (\theta > 0).$$
The distribution gets its name from the normalisation factor of its pdf. The mean,
variance and mode of $\mathrm{Gamma}(\alpha, \beta)$ are
$$E(\theta) = \frac{\alpha}{\beta}, \qquad V(\theta) = \frac{\alpha}{\beta^2}, \qquad \mathrm{mode}(\theta) = \frac{\alpha - 1}{\beta}.$$
The formula for the mean can be derived as follows:
$$\begin{aligned}
E(\theta) &= \int_0^\infty \theta \, \frac{\beta^\alpha}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta\theta} \, d\theta \\
&= \frac{\Gamma(\alpha+1)}{\beta\,\Gamma(\alpha)} \underbrace{\int_0^\infty \frac{\beta^{\alpha+1}}{\Gamma(\alpha+1)} \theta^{(\alpha+1)-1} e^{-\beta\theta} \, d\theta}_{=1} \\
&= \frac{\alpha\,\Gamma(\alpha)}{\beta\,\Gamma(\alpha)} \cdot 1 = \frac{\alpha}{\beta}.
\end{aligned}$$
The parameter $\beta > 0$ is a scaling factor (note that some tables and software use
$1/\beta$ instead of $\beta$ to specify the gamma distribution); the parameter $\alpha > 0$ determines the shape:
[Figure: pdfs $p(\theta)$ of $\mathrm{Gamma}(\alpha, 1)$ for $\alpha = 1, 2, 5$, plotted for $0 \le \theta \le 10$.]
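A small SciPy check of these summary formulas (my addition; the parameter values are made up). Note that SciPy follows the $1/\beta$ scale convention mentioned above:

from scipy.stats import gamma

alpha, beta = 5.0, 2.0            # hypothetical parameter values
d = gamma(a=alpha, scale=1/beta)  # SciPy uses scale = 1/beta
print(d.mean(), alpha / beta)     # both 2.5
print(d.var(), alpha / beta**2)   # both 1.25
print((alpha - 1) / beta)         # mode: 2.0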

With the likelihood pdf $p(y_{1:n} \mid \theta) \propto \theta^s e^{-n\theta}$ and the prior pdf $p(\theta) \propto \theta^{\alpha-1} e^{-\beta\theta}$,
Bayes's formula gives the posterior pdf
$$p(\theta \mid y_{1:n}) \propto \theta^{\alpha+s-1} e^{-(\beta+n)\theta},$$
that is, $\theta \mid y_{1:n} \sim \mathrm{Gamma}(\alpha + s, \beta + n)$. The $\alpha$ and $\beta$ parameters in the prior's
gamma distribution are thus updated to $\alpha + s$ and $\beta + n$ in the posterior's gamma
distribution. The summary statistics are updated similarly; in particular, the posterior mean and posterior mode (MAP estimate) are
$$E(\theta \mid y_{1:n}) = \frac{\alpha + s}{\beta + n}, \qquad \mathrm{mode}(\theta \mid y_{1:n}) = \frac{\alpha + s - 1}{\beta + n}.$$
As $n \to \infty$, both the posterior mean and posterior mode tend to $\bar{y} = s/n$.
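The conjugate update is a one-liner in code. A minimal sketch (function name, variable names and data are mine, not from the notes):

import numpy as np

def poisson_gamma_update(y, alpha, beta):
    # Conjugate update: Gamma(alpha, beta) prior, Poisson counts y.
    y = np.asarray(y)
    return alpha + y.sum(), beta + len(y)

a_post, b_post = poisson_gamma_update([3, 1, 4, 2], alpha=2.0, beta=1.0)
print(a_post, b_post)                          # 12.0 5.0
print(a_post / b_post, (a_post - 1) / b_post)  # posterior mean 2.4, mode 2.2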

The prior predictive distribution (marginal distribution of data) has the pmf
$$\begin{aligned}
P(y = k) &= \int_0^\infty P(y = k \mid \theta)\, p(\theta)\, d\theta = \int_0^\infty \frac{\theta^k}{k!} e^{-\theta} \frac{\beta^\alpha}{\Gamma(\alpha)} \theta^{\alpha-1} e^{-\beta\theta} \, d\theta \\
&= \frac{\beta^\alpha\, \Gamma(\alpha+k)}{k!\,(\beta+1)^{\alpha+k}\,\Gamma(\alpha)} \underbrace{\int_0^\infty \frac{(\beta+1)^{\alpha+k}}{\Gamma(\alpha+k)} \theta^{\alpha+k-1} e^{-(\beta+1)\theta} \, d\theta}_{=1} \\
&= \frac{(\alpha+k-1)(\alpha+k-2)\cdots(\alpha)\,\Gamma(\alpha)}{\Gamma(\alpha)\,k!} \left(\frac{\beta}{\beta+1}\right)^{\alpha} \left(\frac{1}{\beta+1}\right)^{k} \\
&= \binom{\alpha+k-1}{\alpha-1} \left(\frac{\beta}{\beta+1}\right)^{\alpha} \left(\frac{1}{\beta+1}\right)^{k}.
\end{aligned}$$

This is the pmf of the negative binomial distribution. The summary statistics of
$y \sim \mathrm{NegBin}(\alpha, \beta)$ are
$$E(y) = \frac{\alpha}{\beta}, \qquad V(y) = \frac{\alpha}{\beta} + \frac{\alpha}{\beta^2}.$$
The negative binomial distribution also happens to model the number of Bernoulli
failures occurring before the $\alpha$th success when the probability of success is $p = \frac{\beta}{\beta+1}$.
For this reason, many software packages (including Matlab, R and WinBUGS) use $p$ instead of $\beta$ as the second parameter to specify the negative binomial distribution.
The posterior predictive distribution can be derived similarly to the prior predictive, and is
$$\tilde{y} \mid y_{1:n} \sim \mathrm{NegBin}(\alpha + s, \beta + n).$$
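A numerical check (my addition) that the Gamma-Poisson mixture integral really produces this negative binomial pmf, using SciPy's nbinom with the $p = \beta/(\beta+1)$ convention described above; the prior values are those of the moose example below:

import numpy as np
from scipy.stats import nbinom, gamma, poisson
from scipy.integrate import quad

alpha, beta = 4.0, 0.5
p = beta / (beta + 1)

for k in range(5):
    mix, _ = quad(lambda th: poisson.pmf(k, th) * gamma.pdf(th, a=alpha, scale=1/beta),
                  0, np.inf)
    print(k, mix, nbinom.pmf(k, alpha, p))  # the two columns agree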

Example: moose counts. A region is divided into equal-area (100 km²) squares
and the moose in each square are counted. The prior distribution is $\theta \sim \mathrm{Gamma}(4, 0.5)$,
which corresponds to the prior predictive pmf $y \sim \mathrm{NegBin}(4, 0.5)$.

[Figure: prior pdf $p(\theta)$ and prior predictive pmf $P(y = k)$, for $0 \le \theta, k \le 20$.]

On a certain day the following moose counts are collected from an aerial survey
of 15 squares:

  5  7  7 12  2
 14  7  8  5  6
 18  6  4  1  4

($n = 15$, mean $7.07$, sd $4.51$). [Figure: histogram of the counts.]
The posterior distribution for the rate $\theta$ (i.e. number of moose per 100 km²) is
$\theta \mid y_{1:15} \sim \mathrm{Gamma}(110, 15.5)$, for which
$$E(\theta \mid y_{1:15}) = 7.0968, \qquad V(\theta \mid y_{1:15}) = 0.6767^2, \qquad \mathrm{mode}(\theta \mid y_{1:15}) = 7.0323,$$
and the 95% credibility interval is $[5.83, 8.48]$. (The normal approximation
gives the interval $[5.77, 8.42]$.) The posterior predictive distribution is
$\tilde{y} \mid y_{1:n} \sim \mathrm{NegBin}(110, 15.5)$.
[Figure: posterior pdf $p(\theta \mid y)$ and posterior predictive pmf $P(\tilde{y} = k \mid y)$, for $0 \le \theta, k \le 20$.]
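The example's numbers can be reproduced without MCMC; here is a SciPy sketch (my addition):

import numpy as np
from scipy.stats import gamma

y = np.array([5, 7, 7, 12, 2, 14, 7, 8, 5, 6, 18, 6, 4, 1, 4])
alpha, beta = 4.0, 0.5
a_post, b_post = alpha + y.sum(), beta + len(y)  # 110.0, 15.5

post = gamma(a=a_post, scale=1/b_post)
print(post.mean(), post.std(), (a_post - 1) / b_post)  # 7.0968 0.6767 7.0323
print(post.ppf([0.025, 0.975]))                        # about [5.83, 8.48]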
A WinBUGS model for this problem is

model {
  for (i in 1:n) { y[i] ~ dpois(theta) }
  theta ~ dgamma(4, 0.5)
  ypred ~ dpois(theta)
}

The data is entered as

list(y=c(5,7,7,12,2,14,7,8,5,6,18,6,4,1,4), n=15)

The results are

node   mean   sd      2.5%  median  97.5%
theta  7.107  0.6608  5.85  7.101   8.482
ypred  7.098  2.838   2.0   7.0     13.0

A more general Poisson model can be used for counts of occurrences in intervals of
different sizes. The model is
$$y_i \mid \theta \sim \mathrm{Poisson}(\theta t_i),$$
where the $t_i$ are known positive values, sometimes called exposures. Assuming as usual
that the counts are mutually independent given $\theta$, the likelihood is
$$p(y_{1:n} \mid \theta) \propto \theta^s e^{-\theta T},$$
where $s = \sum_{i=1}^n y_i$ and $T = \sum_{i=1}^n t_i$. With the conjugate prior $\theta \sim \mathrm{Gamma}(\alpha, \beta)$, the posterior is
$$\theta \mid y_{1:n} \sim \mathrm{Gamma}(\alpha + s, \beta + T),$$
with
$$E(\theta \mid y_{1:n}) = \frac{\alpha + s}{\beta + T}, \qquad \mathrm{mode}(\theta \mid y_{1:n}) = \frac{\alpha + s - 1}{\beta + T}.$$
As $n \to \infty$, both the posterior mean and posterior mode tend towards $s/T$.
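A minimal sketch of the exposure version of the update (function name and data are mine, for illustration):

import numpy as np

def poisson_exposure_update(y, t, alpha, beta):
    # Gamma(alpha, beta) prior, counts y over exposures t.
    y, t = np.asarray(y), np.asarray(t, dtype=float)
    return alpha + y.sum(), beta + t.sum()

# hypothetical counts over intervals of different lengths
a_post, b_post = poisson_exposure_update(y=[3, 10, 7], t=[0.5, 2.0, 1.5],
                                         alpha=2.0, beta=1.0)
print(a_post, b_post, a_post / b_post)  # 22.0 5.0 4.4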

5.4 Exponential model for lifetime data

Consider a non-negative random variable $y$ used to model intervals such as the time-to-failure of machine components or a patient's survival time. In such applications, it is
typical to specify the probability distribution using a hazard function, from which the cdf
and pdf can be deduced (and vice versa).
The hazard function is defined by
$$h(t)\,dt = P(\underbrace{t < y \le t + dt}_{\text{fail in } (t, t+dt]} \mid \underbrace{t < y}_{\text{OK at } t}) = \frac{p(t)\,dt}{P(t < y)} = \frac{p(t)\,dt}{S(t)},$$
where $p$ is the pdf of $y$ and $S(t) := P(t < y)$ is called the reliability function. Now, because
$p(t) = -S'(t)$, we have the differential equation $h(t) = -\frac{S'(t)}{S(t)}$ with initial condition $S(0) = 1$, which can be solved to give
$$S(t) = e^{-\int_0^t h(\tau)\,d\tau}.$$
In particular, for constant hazard $h(t) = \theta$ the reliability is $S(t) = e^{-\theta t}$ and the density is
the exponential distribution pdf
$$p(t) = \theta e^{-\theta t}.$$

Suppose a component has worked without failure for $s$ time units. Then according to the
constant-hazard model, the probability that it will survive at least $t$ time units more is
$$P(y > s + t \mid y > s) = \frac{P(y > s \;\&\; y > s + t)}{P(y > s)} = \frac{P(y > s + t)}{P(y > s)} = \frac{e^{-\theta(t+s)}}{e^{-\theta s}} = e^{-\theta t},$$
which is the same probability as for a new component! This is the lack-of-memory or
no-aging property of the constant-hazard model.
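A quick numerical illustration of the lack-of-memory property (my addition; rate and times are made up), using SciPy's exponential distribution, which is parameterised by scale $= 1/\theta$:

from scipy.stats import expon

theta = 0.8  # hypothetical rate
d = expon(scale=1/theta)
s, t = 1.3, 2.0
print(d.sf(s + t) / d.sf(s))  # P(y > s+t | y > s)
print(d.sf(t))                # P(y > t): the same value, exp(-theta*t)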
For an exponentially distributed random variable $y \mid \theta \sim \mathrm{Exp}(\theta)$ the mean and variance are
$$E(y \mid \theta) = \frac{1}{\theta}, \qquad V(y \mid \theta) = \frac{1}{\theta^2}.$$
The exponential distribution also models the durations (waiting times) between consecutive Poisson-distributed occurrences.
For exponentially distributed samples $y_i \mid \theta \sim \mathrm{Exp}(\theta)$ that are mutually independent
given $\theta$, the likelihood is
$$p(y_{1:n} \mid \theta) = \prod_{i=1}^n \theta e^{-\theta y_i} = \theta^n e^{-\theta s},$$
where $s = \sum_{i=1}^n y_i$. Using the conjugate prior $\theta \sim \mathrm{Gamma}(\alpha, \beta)$, the posterior pdf is
$$p(\theta \mid y_{1:n}) \propto \theta^{\alpha-1} e^{-\beta\theta} \cdot \theta^n e^{-\theta s} = \theta^{\alpha+n-1} e^{-(\beta+s)\theta},$$
that is, $\theta \mid y_{1:n} \sim \mathrm{Gamma}(\alpha + n, \beta + s)$, for which
$$E(\theta \mid y_{1:n}) = \frac{\alpha + n}{\beta + s}, \qquad \mathrm{mode}(\theta \mid y_{1:n}) = \frac{\alpha + n - 1}{\beta + s}, \qquad V(\theta \mid y_{1:n}) = \frac{\alpha + n}{(\beta + s)^2}.$$
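In code, this update takes the same one-line form as in the Poisson case; a sketch with made-up data (names are mine):

import numpy as np

def exp_gamma_update(y, alpha, beta):
    # Conjugate update for exponential lifetimes y.
    y = np.asarray(y)
    return alpha + len(y), beta + y.sum()

a_post, b_post = exp_gamma_update([0.5, 1.2, 0.3, 2.0], alpha=2.0, beta=1.0)
print(a_post, b_post)  # 6.0 5.0
print(a_post / b_post, (a_post - 1) / b_post, a_post / b_post**2)  # mean, mode, variance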

It often happens that lifetime or survival studies are ended before all the samples
have failed or died. Then, in addition to $k$ observations $y_1, \ldots, y_k \in [0, L]$, we have $n - k$
samples whose lifetimes are known to be $y_j > L$, but are otherwise unknown. This is
called a censored data set. The censored observations can be modelled as Bernoulli trials
with success ($z_j = 1$), corresponding to $y_j > L$, having the probability
$$P(y_j > L \mid \theta) = e^{-\theta L}.$$
The likelihood of the censored data is
$$p(y_{1:k}, z_{1:n-k} \mid \theta) = \prod_{i=1}^k \theta e^{-\theta y_i} \prod_{j=1}^{n-k} e^{-\theta L} = \theta^k e^{-\theta(s_k + (n-k)L)},$$
where $s_k = \sum_{i=1}^k y_i$. With the conjugate prior $\theta \sim \mathrm{Gamma}(\alpha, \beta)$, the posterior pdf is
$$p(\theta \mid y_{1:k}, z_{1:n-k}) \propto \theta^{\alpha-1} e^{-\beta\theta} \cdot \theta^k e^{-\theta(s_k + (n-k)L)} = \theta^{\alpha+k-1} e^{-(\beta + s_k + (n-k)L)\theta},$$
that is, $\theta \mid y_{1:k}, z_{1:n-k} \sim \mathrm{Gamma}(\alpha + k, \beta + s_k + (n-k)L)$.
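The censored update differs from the fully observed case only in what is added to $\beta$; a sketch (function name and data are hypothetical, not from the notes):

import numpy as np

def exp_gamma_update_censored(y_obs, n_cens, L, alpha, beta):
    # k fully observed lifetimes plus n-k lifetimes censored at L.
    y_obs = np.asarray(y_obs)
    return alpha + len(y_obs), beta + y_obs.sum() + n_cens * L

# hypothetical: 3 observed failures, 2 units still alive at L = 2.0
a_post, b_post = exp_gamma_update_censored([0.4, 1.1, 0.9], n_cens=2, L=2.0,
                                           alpha=1.0, beta=1.0)
print(a_post, b_post)  # 4.0 7.4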

Example: Censored lifetime data. In a two-year survival study of 15 cancer patients, the observed lifetimes (in years) are
1.54, 0.70, 1.23, 0.82, 0.99, 1.33, 0.38, 0.99, 1.97, 1.10, 0.40,
and 4 patients are still alive at the end of the study. Assuming mutually independent
$y_i \mid \theta \sim \mathrm{Exp}(\theta)$ conditional on $\theta$, and choosing the prior $\theta \sim \mathrm{Gamma}(2, 1)$, we obtain the
posterior
$$\theta \mid y_{1:11}, z_{1:4} \sim \mathrm{Gamma}(2 + 11, 1 + 11.45 + 4 \cdot 2) = \mathrm{Gamma}(13, 20.45),$$
which has mean $0.636$, variance $0.176^2$, and 95% credibility interval $(0.338, 1.025)$.
The normal approximation has 95% credibility interval $(0.290, 0.981)$.

[Figure: prior and posterior pdfs of $\theta$, plotted for $0 \le \theta \le 4$.]
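These numbers can be checked directly against the exact posterior with SciPy (my addition); close agreement with the MCMC output below is expected:

import numpy as np
from scipy.stats import gamma

y = np.array([1.54, 0.70, 1.23, 0.82, 0.99, 1.33, 0.38, 0.99, 1.97, 1.10, 0.40])
alpha, beta, L, n_cens = 2.0, 1.0, 2.0, 4
a_post = alpha + len(y)               # 13
b_post = beta + y.sum() + n_cens * L  # 20.45

post = gamma(a=a_post, scale=1/b_post)
print(post.mean(), post.std())        # about 0.636 and 0.176
print(post.ppf([0.025, 0.975]))       # about [0.338, 1.025]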
A WinBUGS model for this problem is

model {
  theta ~ dgamma(2,1)
  for (i in 1:n) { y[i] ~ dexp(theta)I(L[i],) }
}

Censoring is represented by appending the I(lower,upper) modifier to the distribution
specification. The data is entered as

list(y=c(1.54,0.70,1.23,0.82,0.99,1.33,0.38,0.99,1.97,1.10,0.40,
         NA,NA,NA,NA), n=15, L=c(0,0,0,0,0,0,0,0,0,0,0,2,2,2,2))
where the censored observations are represented by NA. The results after 2000 simulation
steps are

node   mean    sd      2.5%    median  97.5%
theta  0.6361  0.1789  0.3343  0.6225  1.045

The initial value is entered as

list(theta=0.6)

Note: press "gen inits" to initialise the chain for the censored y[12:15].
