
Poisson Estimation & Inference

This document discusses point and interval estimation methods and one sample inference for the Poisson distribution. It provides an example analyzing childhood leukemia rates in Woburn, MA in the 1970s. The point estimate and 95% confidence interval for the incidence rate λ are calculated. Another example analyzes mortality from Hodgkin's disease in rubber workers compared to the US rate. Both the critical value method and p-value method are used to test if the rates are significantly different, finding no significant difference in the rubber workers example.

Uploaded by

jiawei tu

PH 1700 Session 4b: Poisson - Point and Interval Estimates and One Sample Inference
Rosner 6.9 & 7.10
Point Estimation and Exact Interval Methods for the Poisson
Chapter 6 Section 9
Summary
• Poisson
• Point Estimation
• Exact Interval Estimation
• One Sample Inference
• SMR
Estimation for Poisson
• The Poisson distribution is used to model the number of events occurring in a given time period.
• We can often apply the Poisson distribution to events occurring among people followed over a given time.
• A common unit of follow-up time is the person-year: 1 person followed for 1 year.
• Example: a study in which 10 people are each followed for 2 years has a total of 20 person-years.
Point Estimate for Poisson
• Assume the number of events $X$ over $T$ person-years is Poisson distributed with parameter $\mu = \lambda T$. An unbiased estimator of $\lambda$ is given by $\hat{\lambda} = X/T$, where $X$ is the observed number of events over $T$ person-years (the entire study).
• If $\lambda$ is the incidence rate per person-year, $T$ is the number of person-years of follow-up, and we assume a Poisson distribution, then $E(X) = \lambda T$ and therefore $\hat{\lambda}$ is unbiased:
$$E(\hat{\lambda}) = E(X)/T = \lambda T / T = \lambda$$
Example: Woburn, MA, 1970s: Excess risk of childhood leukemia
• In the book A Civil Action, the people in the town feared that a contaminated water supply caused cancer
• Translating the question to a statistical framework:
• 12 cases of childhood leukemia (<19 y.o.) diagnosed from 1970-1979
• Total of 12,000 childhood residents (<19 y.o.)
• National average is 5 cases per 100,000 person-years
• Is the cancer risk in the town different from the national average?
Finding the Point Estimate for the town
• 10 years, 12,000 people; the study covered
$T = 10 \times 12{,}000 = 120{,}000$ person-years
(an approximation; more later when we study Survival Analysis)
• Estimating incidence: $\hat{\lambda} = 12/120{,}000 = 0.0001$ events per person-year
• Rescale: $T = 1.2$ in units of 100,000 person-years, and $\hat{\lambda} = 12/1.2 = 10$ events per 100,000 person-years
• What is the uncertainty around that estimate?
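The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the calculation described above; the variable names are illustrative and not part of the course material.

```python
# Point estimate of a Poisson incidence rate: lambda_hat = X / T.
cases = 12            # observed childhood leukemia cases, 1970-1979
residents = 12_000    # children (<19 y.o.) living in the town
years = 10            # length of the observation window

# Approximate person-years: assumes every child is followed for all 10 years.
person_years = residents * years

rate_hat = cases / person_years       # events per person-year
rate_per_100k = rate_hat * 100_000    # events per 100,000 person-years

print(f"T = {person_years} person-years")
print(f"lambda_hat = {rate_hat:.4f} events per person-year")
print(f"lambda_hat = {rate_per_100k:.1f} events per 100,000 person-years")
```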
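Exact Interval Estimation for Poisson
• An exact $100\% \times (1 - \alpha)$ confidence interval for the Poisson parameter $\lambda$ is given by $(\mu_1/T, \mu_2/T)$, where $\mu_1$ and $\mu_2$ satisfy
$$\Pr(X \ge x \mid \mu = \mu_1) = \frac{\alpha}{2} = \sum_{k=x}^{\infty} \frac{e^{-\mu_1}\mu_1^k}{k!} = 1 - \sum_{k=0}^{x-1} \frac{e^{-\mu_1}\mu_1^k}{k!}$$
and
$$\Pr(X \le x \mid \mu = \mu_2) = \frac{\alpha}{2} = \sum_{k=0}^{x} \frac{e^{-\mu_2}\mu_2^k}{k!}$$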
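The two defining equations can be solved numerically. The sketch below is one possible approach, not the method used in the slides (which rely on a printed table and Stata): it finds $\mu_1$ and $\mu_2$ by bisection on the Poisson tail probabilities. The function names `poisson_cdf`, `bisect`, and `exact_poisson_ci` are hypothetical helpers written for this illustration, and the inputs (12 events, 1.2 units of 100,000 person-years) are the leukemia figures from the earlier slides.

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu); returns 0 when x < 0."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # k = 0 term
    total = term
    for k in range(1, x + 1):
        term *= mu / k            # e^{-mu} mu^k / k! built recursively
        total += term
    return total

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(x, alpha=0.05):
    """Exact two-sided CI (mu1, mu2) for the Poisson mean given count x."""
    half = alpha / 2
    # Lower limit: Pr(X >= x | mu1) = alpha/2 (taken as 0 when x = 0).
    if x == 0:
        mu1 = 0.0
    else:
        mu1 = bisect(lambda m: 1 - poisson_cdf(x - 1, m) - half, 0.0, float(x))
    # Upper limit: Pr(X <= x | mu2) = alpha/2; grow the bracket until it contains the root.
    upper = float(x) + 1.0
    while poisson_cdf(x, upper) > half:
        upper *= 2
    mu2 = bisect(lambda m: poisson_cdf(x, m) - half, float(x), upper)
    return mu1, mu2

mu_lo, mu_hi = exact_poisson_ci(12)
T = 1.2  # follow-up in units of 100,000 person-years
print(f"95% CI for mu:     ({mu_lo:.2f}, {mu_hi:.2f})")
print(f"95% CI for lambda: ({mu_lo / T:.1f}, {mu_hi / T:.1f}) per 100,000 person-years")
```

Dividing both limits by $T$ turns the interval for $\mu$ into an interval for the rate $\lambda$, as stated in the bullet above.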
Example: Leukemia
• 12 cases in 1.2 units of 100,000 person-years
• What is the 95% confidence interval for µ? For λ?
• With only 12 observed cases the count is small, so use exact methods.
• For µ, use Table 7, page 881. Look under the 95% column in the row where x = 12.
Piece of Table 7, page 881
Example: Leukemia - Confidence Intervals
• The 95% CI for 𝜇 is (6.20, 20.96)
• For λ = µ/T, convert from µ: T = 1.2, so the 95% CI for λ is
(6.20/1.2, 20.96/1.2) = (5.2, 17.5)
cases per 100,000 person-years
• Since our 95% CI does not include the national rate of 5 per 100,000
person years, 5 is not a plausible value for the rate of the town. We
can say the town rate of leukemia is significantly higher than the
national rate.
• This can also be done with Stata
Example: Leukemia - Using Stata for Poisson Distribution Estimation
• cii means 120000 12, poisson

                                        -- Poisson Exact --
Variable |   Exposure        Mean    Std. Err.    [95% Conf. Interval]
         |     120000       .0001    .0000289     .0000517    .0001747

• From Stata, the 95% CI for λ is (5.2/10^5, 17.5/10^5) cases per person-year, or equivalently (5.2, 17.5) cases per 10^5 person-years.
One-Sample Inference for Poisson Distribution
Section 7.10

Example (page 259) 7.57: Occupational Health: Rubber workers
• Starting on January 1, 1964, 8418 white male rubber workers ages 40-84 were followed for 10 years for various mortality outcomes, and their mortality was compared with the US white male mortality rates in 1968. 4 deaths due to Hodgkin's disease were observed, compared with 3.3 deaths expected from US mortality rates. Is this difference significant?

Example (page 259) 7.57: Occupational Health: Rubber workers Continued
• Let
$X$ = total number of deaths in the study population
$Y_i$ = 1 if individual $i$ dies in the study period, 0 otherwise
$p_i$ = probability of death for the $i$th individual
Therefore $Y_i \sim \text{Bernoulli}(p_i)$
$$X = \sum_{i=1}^{n} Y_i, \quad n = 8418$$
$$E(X) = E\left(\sum_i Y_i\right) = \sum_i E(Y_i) = \sum_i p_i$$
• Under $H_0$: death rate of rubber workers = death rate of the US general population, the expected number of events is $\mu_0$:
$$H_0: E(X) = \mu_0$$
• If the disease is rare, then the number of events $X$ is approximately Poisson distributed with mean $\mu$, so we test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$

So if we have...
• One variable of interest? _Yes_
• One sample? _Yes_
• Can we assume that the underlying distribution is normal or that the CLT holds? _No_
• Is the underlying distribution binomial? _No_
• Is the underlying distribution Poisson? _Yes_
• Then we have a one-sample Poisson test.

Mortality rates and the Poisson
• The Poisson distribution can be used as a model for the counts of
events, such as death, occurring rarely in a population.
• Useful when the probability of the event, such as mortality, is not
constant for everyone in the population being considered
• When the probability is not constant, then the binomial distribution
does not apply
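To see why the Poisson model is reasonable when each person has a different small probability of the event, the sketch below compares the exact distribution of a sum of non-identical Bernoulli variables with a Poisson distribution having the same mean. The individual probabilities `p` are entirely hypothetical (the slides do not give them); they are scaled only so that their sum equals the 3.3 expected deaths from the rubber workers example. The bound used at the end is Le Cam's inequality, a standard result that is not part of the slides.

```python
import math

n = 8418            # cohort size from the rubber workers example
target_mean = 3.3   # expected deaths under H0, from the example

# Hypothetical, non-constant individual risks p_i, scaled so that sum(p_i) = 3.3.
weights = [1 + (i % 5) for i in range(n)]
scale = target_mean / sum(weights)
p = [w * scale for w in weights]
mu = sum(p)         # E(X) = sum of the p_i

# Exact pmf of X = sum of independent Bernoulli(p_i) for k = 0..K (dynamic programming).
K = 30
exact = [1.0] + [0.0] * K
for p_i in p:
    for k in range(K, 0, -1):
        exact[k] = exact[k] * (1 - p_i) + exact[k - 1] * p_i
    exact[0] *= 1 - p_i

# Poisson(mu) pmf for the same k values.
poisson = [math.exp(-mu) * mu**k / math.factorial(k) for k in range(K + 1)]

abs_diff = sum(abs(a - b) for a, b in zip(exact, poisson))
le_cam_bound = 2 * sum(p_i * p_i for p_i in p)

print(f"mean = {mu:.2f}")
print(f"sum |exact - Poisson| = {abs_diff:.6f}")
print(f"Le Cam bound          = {le_cam_bound:.6f}")
```

When all the $p_i$ are small, the bound $2\sum_i p_i^2$ is small, so the Poisson distribution with mean $\sum_i p_i$ is a close approximation even though no single binomial distribution applies.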

One sample test: Poisson Distribution Critical Value Method (as opposed to p-value method)
• Use the Poisson exact confidence interval method
• If X is a Poisson random variable with expected value µ, then to test the hypothesis
H0: µ = µ0 versus H1: µ ≠ µ0
using a two-sided test of level α:
• Construct the two-sided 100% × (1 − α) confidence interval for µ, based on the observed value x: (c1, c2)
• If µ0 < c1 or µ0 > c2 (outside the interval), then reject H0
• If c1 ≤ µ0 ≤ c2 (inside the confidence interval), then do not reject H0
Example: Occupational Health Rubber workers - Critical value Method
• µ0 = 3.3; x = 4; α = 0.05, 1 − α = 0.95
• The exact 95% confidence interval for µ with x = 4 is (c1, c2) = (1.09, 10.24)
• 1.09 < 3.3 < 10.24, therefore we fail to reject H0.
• The rate of Hodgkin's disease among rubber workers is not significantly different from the national mortality rate of Hodgkin's disease.
Example: Occupational Health Rubber workers Confidence Interval for Poisson in Stata
. cii means 84180 4, poisson

                                        -- Poisson Exact --
Variable |   Exposure        Mean    Std. Err.    [95% Conf. Interval]
-------------+---------------------------------------------------------------
         |      84180    .0000475    .0000238     .0000129    .0001217

• Stata gives the CI in terms of λ as a rate, with exposure T = 8418 workers × 10 years = 84,180 person-years. To make it a CI for µ, multiply by T:
• 0.0000129 × 84180 person-years = 1.0859
• 0.0001217 × 84180 person-years = 10.2447
• CI for µ: (1.09, 10.24)
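As a quick check of the conversion and of the critical value decision rule, the sketch below rescales the rate interval reported by Stata to an interval for the expected count and compares it with µ0 = 3.3. The variable names are illustrative; the numbers are taken from the Stata output and the example statement.

```python
# Convert Stata's CI for the rate (per person-year) into a CI for the expected count mu.
T = 84_180                                  # person-years of exposure
rate_lo, rate_hi = 0.0000129, 0.0001217     # 95% CI for lambda from the Stata output
mu0 = 3.3                                   # expected deaths under H0

c1 = rate_lo * T
c2 = rate_hi * T

# Critical value method: reject H0 only if mu0 falls outside (c1, c2).
reject_h0 = mu0 < c1 or mu0 > c2

print(f"95% CI for mu: ({c1:.2f}, {c2:.2f})")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```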
Small Sample inference - p-value
• Let $\mu$ be the expected value of a Poisson distribution. To test the hypothesis $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$,
• Compute $x$ = observed number of deaths in the study population
• Under $H_0$, the random variable $X$ will follow a Poisson distribution with parameter $\mu_0$ and the exact p-value is given by
• $\min[2 \Pr(X \le x),\ 1]$ if $x < \mu_0$
• $\min[2 \Pr(X \ge x),\ 1]$ if $x \ge \mu_0$
• With $\Pr(X = x) = \dfrac{e^{-\mu_0}\mu_0^x}{x!}$ = display poissonp($\mu_0$, $x$)
• And $\Pr(X \le x)$ = display poisson($\mu_0$, $x$)

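Equivalently (Rosner pages 260 and 261),
$$\min\left[2 \times \sum_{k=0}^{x} \frac{e^{-\mu_0}\mu_0^k}{k!},\ 1\right] \quad \text{if } x < \mu_0$$
$$\min\left[2 \times \left(1 - \sum_{k=0}^{x-1} \frac{e^{-\mu_0}\mu_0^k}{k!}\right),\ 1\right] \quad \text{if } x \ge \mu_0$$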
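The two cases of the exact p-value translate directly into code. The sketch below is a plain-Python illustration, not the Stata workflow used in the slides; `poisson_cdf` and `exact_poisson_pvalue` are hypothetical helper names. It is applied to the rubber workers figures given earlier (x = 4 observed deaths, µ0 = 3.3 expected).

```python
import math

def poisson_cdf(x, mu):
    """Pr(X <= x) for X ~ Poisson(mu); returns 0 when x < 0."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)          # k = 0 term
    total = term
    for k in range(1, x + 1):
        term *= mu / k
        total += term
    return total

def exact_poisson_pvalue(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0 given an observed count x."""
    if x < mu0:
        tail = poisson_cdf(x, mu0)            # Pr(X <= x)
    else:
        tail = 1 - poisson_cdf(x - 1, mu0)    # Pr(X >= x)
    return min(2 * tail, 1.0)

p_value = exact_poisson_pvalue(4, 3.3)
print(f"exact two-sided p-value = {p_value:.3f}")
```

Because x = 4 is at least µ0 = 3.3, the function uses the upper tail $\Pr(X \ge 4) = 1 - \Pr(X \le 3)$.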

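Example: Occupational Health - p-value
• Back to the rubber workers, with 4 deaths observed and 3.3 expected from the US mortality rate.

$x = 4 > \mu_0 = 3.3$; therefore we use the upper-tail form, with $x - 1 = 3$:

$$p\text{-value} = \min\left[2 \times \left(1 - \sum_{k=0}^{x-1} \frac{e^{-\mu_0}\mu_0^k}{k!}\right),\ 1\right] = \min\left[2 \times \left(1 - \sum_{k=0}^{3} \frac{e^{-3.3}\, 3.3^k}{k!}\right),\ 1\right]$$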

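Example Occupational Health - p-value in Stata
• Compute each term of the sum separately:
display poissonp(3.3,0)
display poissonp(3.3,1)
display poissonp(3.3,2)
display poissonp(3.3,3)
• Thus the p-value is min[2 × (1 − sum of these four probabilities), 1] = 0.839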

Example: Occupational Health-
Interpretation of p-value
• With a p value of 0.839, there is no evidence for the mortality from
Hodgkin’s disease among rubber workers being significantly different
from the US mortality rate.

Standardized mortality ratio (SMR)
• Another way to compare the mortality rate of a sample with that of the population is the standardized mortality ratio (SMR).
• The standardized mortality ratio is 100% × (observed number of deaths / expected number of deaths).
• The expected number of deaths is computed assuming no difference between the sample and the general population.
• Standardized morbidity ratio is an alternate name for the standardized mortality ratio when the conditions do not result in death.

SMR interpretation
• Similar to an Odds ratio
• SMR >100% implies increased risk in the sample
• SMR < 100% implies decreased risk in the sample
• And SMR = 100% implies neither an increased nor decreased risk in the
sample compared to the general population

Example: Occupational Health - SMR
• Recall we observed 4 deaths in our sample and expected 3.3 based on the US mortality rates for Hodgkin's disease.
• SMR = 100% × (4/3.3) = 121%
• We can reframe our Poisson test in terms of the SMR:
• H0: SMR = 100%
• H1: SMR ≠ 100%
• Since we already performed the test, we can say the SMR is not significantly different from 100% for Hodgkin's disease in our sample of rubber workers.
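The SMR itself is a one-line calculation. The sketch below uses a hypothetical helper `smr` to reproduce the value for this example and to show that H0: SMR = 100% is the same statement as H0: µ = µ0.

```python
def smr(observed, expected):
    """Standardized mortality ratio, expressed as a percentage."""
    return 100.0 * observed / expected

observed, expected = 4, 3.3     # Hodgkin's disease deaths: observed vs. expected
value = smr(observed, expected)

# SMR = 100% exactly when observed == expected, i.e. when x equals mu0.
print(f"SMR = {value:.0f}%")
print("SMR above 100%: more deaths than expected" if value > 100 else
      "SMR at or below 100%: no excess deaths")
```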

One Sample Inference for the Poisson (large sample)
• If the expected number of events (deaths) under the null hypothesis is large enough, we can approximate the Poisson distribution with a normal distribution
• This approximation is useful only if µ0 > 10
• It uses a statistic that follows the chi-squared distribution

Large sample test for Poisson µ
• Compute $x$ = number of observed events in the study sample
• Compute the test statistic (see next slide):
$$X^2 = \frac{(x - \mu_0)^2}{\mu_0} = \mu_0\left(\frac{\text{SMR}}{100} - 1\right)^2 \sim \chi^2_1 \ \text{under } H_0$$
• For a two-sided test at level $\alpha$, we reject $H_0$ if $X^2 > \chi^2_{1,1-\alpha}$
• And fail to reject $H_0$ if $X^2 \le \chi^2_{1,1-\alpha}$
• The p-value is given by $\Pr(\chi^2_1 > X^2)$

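Just a note:
Remember $\text{SMR} = 100 \times \dfrac{x}{\mu_0}$.
$$\mu_0\left(\frac{\text{SMR}}{100} - 1\right)^2 = \mu_0\left(\frac{100\,x/\mu_0}{100} - 1\right)^2 = \mu_0\left(\frac{x}{\mu_0} - 1\right)^2 = \mu_0\left(\frac{x - \mu_0}{\mu_0}\right)^2 = \frac{(x - \mu_0)^2}{\mu_0} = X^2$$
Remember if $x \sim N(\mu_0, \sigma^2 = \mu_0)$, then
$$\frac{x - \mu_0}{\sqrt{\mu_0}} \sim N(0, 1) \quad \text{and therefore} \quad \frac{(x - \mu_0)^2}{\mu_0} \sim \chi^2_1$$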
(& (&

Large Sample Approximate CI
• The 100% × (1 − α) confidence interval for µ can be approximated by
$$x \pm z_{1-\alpha/2}\sqrt{x}$$
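A minimal sketch of this approximate interval, using only Python's standard library for the normal quantile. The helper name `approx_poisson_ci` is hypothetical, and the count x = 21 is borrowed from the bladder cancer example in this session purely as an illustration.

```python
import math
from statistics import NormalDist

def approx_poisson_ci(x, alpha=0.05):
    """Large-sample CI for the Poisson mean: x +/- z_{1-alpha/2} * sqrt(x)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(x)
    return x - half_width, x + half_width

ci_lo, ci_hi = approx_poisson_ci(21)
print(f"approximate 95% CI for mu: ({ci_lo:.2f}, {ci_hi:.2f})")
```

The interval is symmetric around x, unlike the exact interval, which is one reason the approximation is reserved for larger counts.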

Example: Occupational Health - Rubber workers and Bladder cancer
• We observe 21 cases in our sample, and the expected number of deaths based on the US mortality rate is 18.1. Are there significantly more deaths in our sample?
• SMR = 100% × (21/18.1) = 116%
• x = 21
• Exact method (p-value):
$$\min\left[2 \times \left(1 - \sum_{k=0}^{20} \frac{e^{-18.1}\, 18.1^k}{k!}\right),\ 1\right] = \min[2 \times (1 - \text{poisson}(18.1, 20)),\ 1]$$
where poisson(18.1,20) is evaluated in Stata with display poisson(18.1,20)

Example Occupational Health - p-value
$$\min\left[2 \times \left(1 - \sum_{k=0}^{20} \frac{e^{-18.1}\, 18.1^k}{k!}\right),\ 1\right] = \min[2 \times (1 - \text{poisson}(18.1, 20)),\ 1],$$

so we get:

$$\min\left[2 \times \left(1 - \sum_{k=0}^{20} \frac{e^{-18.1}\, 18.1^k}{k!}\right),\ 1\right] = \min(2 \times (1 - 0.72270),\ 1) = 0.5546$$

Thus by the exact test, the mortality rate from bladder cancer in our sample of rubber workers is not significantly different from the general population.
Example: Occupational Health - Comparing Methods
• Using the approximate method, we get
$$X^2 = \frac{(x - \mu_0)^2}{\mu_0} = \frac{(21 - 18.1)^2}{18.1} = \frac{2.9^2}{18.1} = 0.46464$$
• Critical value for the test:
• invchi2(1,0.95) = 3.8414
• Or p-value for the test: chi2tail(1,0.46464) = 0.4955
• By the approximate method, there is no significant difference.
• Compare to exact: p-value = 0.5546
• In general, exact methods are strongly preferred for inference concerning the Poisson distribution
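The approximate test can be reproduced without a statistics package, because the upper tail of a chi-squared distribution with 1 degree of freedom satisfies $\Pr(\chi^2_1 > t) = \operatorname{erfc}(\sqrt{t/2})$. The sketch below relies on that identity (a standard fact, not something shown in the slides); the helper name `large_sample_poisson_test` is hypothetical, and the critical value 3.8414 is the one quoted above.

```python
import math

def large_sample_poisson_test(x, mu0):
    """Chi-squared (1 df) statistic and p-value for H0: mu = mu0."""
    stat = (x - mu0) ** 2 / mu0
    # Upper tail of chi-squared with 1 df: Pr(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_approx = large_sample_poisson_test(21, 18.1)
critical_value = 3.8414        # invchi2(1, 0.95), as quoted in the slide
p_exact = 0.5546               # exact p-value reported in the slide

print(f"X^2 = {stat:.5f}, approximate p-value = {p_approx:.4f}")
print(f"Reject H0 at the 5% level: {stat > critical_value}")
print(f"Exact p-value for comparison: {p_exact}")
```

Both methods lead to the same decision here, but the p-values differ noticeably (about 0.50 versus 0.55), which illustrates why the exact method is preferred.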
Checklist for tests of hypothesis
• Identify the variable of interest
• Identify the parameter(s) of interest
• State the null and alternative hypotheses
• Identify the type I error level
• Identify the test statistic (you can use the flow chart in the back of the
textbook)
• Identify the distribution of the test statistic (a known probability
distribution)
• Determine the decision rule (do a graph!)
• Calculate the test statistic
• Report the test statistic, df, CI & p-value
• Make a decision
• Conclude and interpret
When do I use Poisson, Revisited
• Testing mortality/morbidity ratios or rare incidence rates
• Mortality rate varies across the sample
• Any other situation where the Poisson applies and we are testing the Poisson parameter (e.g., modeling counts)
• Use the Poisson test
• Exact method: small sample (µ0 < 10)
• Large sample approximation with chi-square (not as accurate): (µ0 > 10)
• However, the exact method is often preferred, even if the approximation is okay.

Summary
• Poisson
• Point Estimation
• Exact Interval Estimation
• One Sample Inference
• SMR
