This is Tutorial 4 for the course PH6205 at City University of Hong Kong.

PH6205 R Tutorial 4

Haoran Xue

(Last Updated February 25, 2025)

1 Scottish Lip Cancer Data

We analyze the Scottish Lip Cancer Data. First we load the data into R:
Lip.data = read.csv("/Users/x/Dropbox/PH6205/Slides4/SC_lip_cancer.csv")
head(Lip.data)

##   id      district obs    pop propag
## 1  1     Caithness  11  83190     10
## 2  2    Sutherland   5  37521     16
## 3  3 Ross-Cromarty  15 129271     10
## 4  4  Banff-Buchan  39 231337     16
## 5  5         Nairn   3  29374     10
## 6  6 Skye-Lochalsh   9  28324     16

1.1 Poisson Regression

Next, we fit a Poisson Regression of obs on propag, with log(pop) as the offset, so that the model describes the rate of cases per person:
obs = Lip.data$obs
pop = Lip.data$pop
propag = Lip.data$propag
poisson.model = glm(obs ~ propag, offset = log(pop), family = poisson(link = "log"))
summary(poisson.model)

##
## Call:
## glm(formula = obs ~ propag, family = poisson(link = "log"), offset = log(pop))
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -10.82457    0.07006 -154.50   <2e-16 ***
## propag        0.08114    0.00603   13.46   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 423.92 on 55 degrees of freedom
## Residual deviance: 256.92 on 54 degrees of freedom
## AIC: 468.89
##
## Number of Fisher Scoring iterations: 5
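
Because the model uses a log link with log(pop) as the offset, it can be written as log E(obs) = log(pop) + β0 + β1 · propag, so exp(β1) is the multiplicative change in the lip cancer rate for a one-unit increase in propag. The following is a minimal sketch of that interpretation. The two numbers are typed in by hand from the summary output above (they are not extracted from the fitted object), and the Wald interval is an addition for illustration, not part of the original analysis.

```r
# Estimate and standard error for propag, typed in from the summary output
beta.propag = 0.08114
se.propag = 0.00603

# Rate ratio per one-unit increase in propag
rate.ratio = exp(beta.propag)

# Approximate 95% Wald confidence interval on the rate-ratio scale
ci = exp(beta.propag + c(-1, 1) * qnorm(0.975) * se.propag)

rate.ratio
ci
```

With the fitted object available, exp(coef(poisson.model)) should give the same rate ratio directly.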

1.2 Plot for Checking Overdispersion
We can draw the scatter plot of $(y - \hat{\lambda})^2$ versus $\hat{\lambda}$ to visually check for overdispersion:
lambda.hat = fitted(poisson.model)
plot(lambda.hat, (obs - lambda.hat)^2)
abline(0,1,col = "red",lwd = 2)
[Figure: scatter plot of (obs - lambda.hat)^2 (vertical axis) against lambda.hat (horizontal axis), with the red line y = x.]
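Here the red solid line is $y = x$. Under a Poisson model the variance equals the mean, so the points should scatter around this line. Most of the points lie above it, i.e. $(y - \hat{\lambda})^2 > \hat{\lambda}$ in general, which indicates overdispersion.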
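We can also estimate the dispersion parameter as the Pearson chi-square statistic divided by the residual degrees of freedom (56 observations minus 2 estimated coefficients):
phi = sum((obs - lambda.hat)^2/lambda.hat)/(56-1-1)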
phi

## [1] 5.430562
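
The estimate above is the Pearson chi-square statistic divided by the residual degrees of freedom, so it can also be obtained from the Pearson residuals of the fitted model. The sketch below checks this equivalence on simulated overdispersed counts, since the lip cancer file is not bundled here. The names sim.x, sim.y and sim.model, and the simulation settings, are assumptions made for this illustration only.

```r
# Simulated overdispersed counts (hypothetical data, not the lip cancer data)
set.seed(1)
n = 200
sim.x = runif(n)
sim.y = rnbinom(n, mu = exp(1 + sim.x), size = 2)

sim.model = glm(sim.y ~ sim.x, family = poisson(link = "log"))
mu.hat = fitted(sim.model)

# Manual formula as in the tutorial: Pearson statistic / (n - number of coefficients)
phi.manual = sum((sim.y - mu.hat)^2 / mu.hat) / (n - 2)

# Same quantity from the built-in extractors
phi.builtin = sum(residuals(sim.model, type = "pearson")^2) / df.residual(sim.model)

c(phi.manual, phi.builtin)
```

Using df.residual() avoids hard-coding the sample size and the number of coefficients.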

1.3 Quasi-Poisson Regression

We fit a Quasi-Poisson Regression to deal with the overdispersion.
Quasipoisson.model = glm(obs ~ propag, offset = log(pop), family = quasipoisson)
summary(Quasipoisson.model)

##
## Call:
## glm(formula = obs ~ propag, family = quasipoisson, offset = log(pop))
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -10.82457    0.16326 -66.301  < 2e-16 ***
## propag        0.08115    0.01405   5.774 3.91e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasipoisson family taken to be 5.430563)
##
## Null deviance: 423.92 on 55 degrees of freedom
## Residual deviance: 256.92 on 54 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 5
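
The quasi-Poisson fit inflates the Poisson standard errors by the square root of the estimated dispersion parameter. As a quick arithmetic check, the sketch below rescales the Poisson standard errors by hand. All numbers are typed in from the two summary outputs above, so the results match the quasi-Poisson standard errors only up to rounding.

```r
# Values typed in from the two summary outputs above
phi = 5.430563
se.poisson = c(intercept = 0.07006, propag = 0.00603)

# Quasi-Poisson standard errors are the Poisson ones scaled by sqrt(phi)
se.quasi = se.poisson * sqrt(phi)
round(se.quasi, 5)
```

The rescaled values agree with the reported 0.16326 and 0.01405, which is why the test statistics shrink and the p-value for propag becomes larger.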

1.4 Negative Binomial Regression

We can also fit a Negative Binomial Regression to deal with the overdispersion. We use the function
glm.nb() from the R package MASS to fit the Negative Binomial Regression model.
library(MASS)
NB.model = glm.nb(formula = obs ~ propag + offset(log(pop)))
summary(NB.model)

##
## Call:
## glm.nb(formula = obs ~ propag + offset(log(pop)), init.theta = 2.531752157,
## link = log)
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept) -10.66858    0.15792 -67.558  < 2e-16 ***
## propag        0.08499    0.01408   6.036 1.58e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(2.5318) family taken to be 1)
##
## Null deviance: 93.005 on 55 degrees of freedom
## Residual deviance: 63.170 on 54 degrees of freedom
## AIC: 356.58
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 2.532
## Std. Err.: 0.642
##
## 2 x log-likelihood: -350.575
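
In the parameterization used by glm.nb(), the variance is Var(Y) = μ + μ²/θ, so a smaller θ means stronger overdispersion relative to the Poisson variance μ. The sketch below evaluates this variance function at the estimated θ and reproduces the reported AIC from the log-likelihood. The helper name nb.var and the chosen values of mu are assumptions for illustration; the other numbers are typed in from the output above.

```r
# Negative binomial variance function in the glm.nb parameterization
# (nb.var is a helper name introduced for this illustration)
nb.var = function(mu, theta) mu + mu^2 / theta

theta.hat = 2.532   # typed in from the output above
mu = c(1, 10, 40)   # hypothetical mean values
cbind(mu, poisson.var = mu, nb.var = nb.var(mu, theta.hat))

# AIC = -(2 x log-likelihood) + 2 * (number of parameters);
# here there are 3 parameters: intercept, propag and theta
aic.nb = 350.575 + 2 * 3
aic.nb
```

The result, 356.575, matches the reported AIC of 356.58 up to rounding, and it is well below the Poisson AIC of 468.89 reported earlier.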

2 Vietnam Highly Pathogenic Avian Influenza Data

We use the Vietnam Highly Pathogenic Avian Influenza Data to illustrate the Zero-inflated Poisson
Regression model. First we load the data:
VN.new = read.csv("/Users/x/Dropbox/PH6205/Slides4/Vietnam2003HPAI.csv")
head(VN.new)

##   obs pop  pop.dens  road.dens  river.dens elevation
## 1   0   1 0.5731923 -0.9532340 -0.58138623 0.4925043
## 2   0   1 0.7068947 -0.7887648 -0.72589578 0.4782148
## 3   0   1 0.4386707  0.3990558 -0.24816734 0.3940719
## 4   0   1 0.7793921  0.2686765  0.40521494 0.4497510
## 5   0   1 0.6405167 -0.7609713  0.12034335 0.4716434
## 6   0   1 0.2975146 -0.1706405 -0.06887103 0.3827288

obs = VN.new$obs
pop = VN.new$pop
pop.dens = VN.new$pop.dens
road.dens = VN.new$road.dens
river.dens = VN.new$river.dens
elevation = VN.new$elevation

We can use the table() function to show the counts for different values of obs:
table(obs)

## obs
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
## 1407 156 67 39 23 18 15 14 10 8 3 7 5 1 1 1
## 16 17 18 21
## 2 1 2 1
Out of 1781 districts, 1407 (about 79%) report obs as 0. Next we draw a bar plot for better
visualization. (Note: the code for drawing the bar plot is not required.)
barplot(c((table(obs))[1:19],0,0,(table(obs))[20]),names.arg = 0:21,xlab = "obs",ylab = "Count")
[Figure: bar plot of Count (vertical axis) against obs (horizontal axis, 0 to 21).]
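
To see why an ordinary Poisson model is a poor fit here, one can compare the observed number of zeros with the number that a single Poisson distribution with the same mean would produce. The sketch below retypes the counts from the table() output above. It ignores all covariates, so it is only a rough check and not part of the original analysis.

```r
# Counts typed in from the table(obs) output above
values = c(0:18, 21)
counts = c(1407, 156, 67, 39, 23, 18, 15, 14, 10, 8, 3, 7, 5, 1, 1, 1, 2, 1, 2, 1)

n = sum(counts)                       # number of districts
mean.obs = sum(values * counts) / n   # sample mean of obs

# Observed zeros versus zeros expected under Poisson(mean.obs)
zeros.observed = counts[1]
zeros.expected = n * dpois(0, lambda = mean.obs)

c(n = n, mean = mean.obs, observed = zeros.observed, expected = zeros.expected)
```

The expected number of zeros under a single Poisson distribution is far below the 1407 observed, which motivates a model with an explicit zero-inflation component.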
We fit a Zero-inflated Poisson Regression model. We use the function zeroinfl() from the R package
pscl.
library(pscl)

## Classes and Methods for R originally developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University (2002-2015),
## by and under the direction of Simon Jackman.
## hurdle and zeroinfl functions by Achim Zeileis.

ZIP.VN = zeroinfl(obs ~ road.dens + river.dens + elevation | pop.dens,
offset = log(pop),dist = c("poisson"))
summary(ZIP.VN)

##
## Call:
## zeroinfl(formula = obs ~ road.dens + river.dens + elevation | pop.dens,
## offset = log(pop), dist = c("poisson"))
##
## Pearson residuals:
##      Min       1Q   Median       3Q      Max
## -4.10910 -0.37482 -0.17808 -0.06144 17.42802
##
## Count model coefficients (poisson with log link):
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.97356    0.07965 -24.778  < 2e-16 ***
## road.dens   -0.12954    0.02251  -5.755 8.64e-09 ***
## river.dens   0.28272    0.08009   3.530 0.000416 ***
## elevation   -1.00963    0.09860 -10.239  < 2e-16 ***
##
## Zero-inflation model coefficients (binomial with logit link):
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   1.0195     0.1690   6.032 1.62e-09 ***
## pop.dens      3.1881     0.3105  10.268  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Number of iterations in BFGS optimization: 13
## Log-likelihood: -1425 on 6 Df
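
In the Zero-inflated Poisson model, an observation is a structural zero with probability π (modelled on the logit scale) and otherwise follows a Poisson distribution with mean λ (modelled on the log scale), so P(obs = 0) = π + (1 − π)e^(−λ) and E(obs) = (1 − π)λ. The sketch below evaluates these quantities from the coefficients typed in from the output above. The district is hypothetical: all covariates are set to 0 and pop is assumed to be 1, so the offset is 0.

```r
# Coefficients typed in from the summary output above
count.coef = c(-1.97356, -0.12954, 0.28272, -1.00963)  # intercept, road.dens, river.dens, elevation
zero.coef = c(1.0195, 3.1881)                           # intercept, pop.dens

# Hypothetical district: all covariates equal to 0 and pop = 1 (offset log(pop) = 0)
x.count = c(1, 0, 0, 0)
x.zero = c(1, 0)

lambda = exp(sum(count.coef * x.count))     # Poisson mean of the count component
pi.zero = plogis(sum(zero.coef * x.zero))   # probability of a structural zero

p.zero = pi.zero + (1 - pi.zero) * dpois(0, lambda)   # P(obs = 0)
mean.obs = (1 - pi.zero) * lambda                      # E(obs)

c(lambda = lambda, pi = pi.zero, p.zero = p.zero, mean = mean.obs)
```

With the fitted object available, predict(ZIP.VN, type = "count") and predict(ZIP.VN, type = "zero") should return the corresponding λ and π for each district.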
