Analyzing Two-Way Contingency Tables

This document provides information on statistical methods for analyzing two-way contingency tables, including differences and ratios of proportions, relative risk, odds ratios, chi-square tests of independence, and residuals. It defines key terms, provides formulas for estimates and confidence intervals, and includes examples applying the methods to data on aspirin use and risk of heart attacks.

Uploaded by

Harley Garcia


INFERENCES ON TWO-WAY CONTINGENCY TABLES
DIFFERENCE OF PROPORTIONS
Suppose $\pi_i$ denotes the (conditional) probability
of success for row $i$. Then the difference of
proportions ($\pi_i - \pi_j$) compares the success
probabilities in the two rows, $i$ and $j$.
Note: $-1 \le \pi_i - \pi_j \le 1$
DIFFERENCE OF PROPORTIONS

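$p_1 - p_2$ estimates the true difference $\pi_1 - \pi_2$, with standard error
$$SE(p_1 - p_2) = \sqrt{\frac{p_1(1-p_1)}{n_{1+}} + \frac{p_2(1-p_2)}{n_{2+}}}$$
[Large-sample] $100(1-\alpha)\%$ CI [due to Wald]:
$$(p_1 - p_2) \pm z_{\alpha/2}\, SE(p_1 - p_2)$$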

EXAMPLE # 1
The following table is from a report on the relationship between
aspirin use and myocardial infarction (heart attacks) by the
Physicians' Health Study Research Group at Harvard Medical
School. The Physicians' Health Study was a five-year
randomized study testing whether regular intake of aspirin
reduces mortality from cardiovascular disease. Every other day,
the male physicians participating in the study took either one
aspirin tablet or a placebo. The study was blind: the
physicians in the study did not know which type of pill they
were taking.
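[Table: myocardial infarction (yes/no) by group (placebo, aspirin); the counts are not reproduced in this text]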
EXAMPLE # 1
(a) Estimate the probability of suffering myocardial
infarction (MI) for both placebo and aspirin groups.
(b) Construct a 95% CI for the true difference of
probabilities of heart attack between male physicians who
took placebo and those who took aspirin. From this,
determine whether aspirin is effective in diminishing the risk of
heart attack.
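A minimal Python sketch of how parts (a) and (b) could be worked out with the Wald interval. The counts (189 MI cases among 11,034 physicians on placebo, 104 among 11,037 on aspirin) are the commonly published figures for this study and are an assumption here, since the table is not reproduced in this text. The function name `wald_diff_ci` is hypothetical.

```python
import math

def wald_diff_ci(x1, n1, x2, n2, z=1.96):
    """Wald CI for pi_1 - pi_2 from two independent binomial samples."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference of two sample proportions
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return p1, p2, diff, (diff - z * se, diff + z * se)

# ASSUMED counts (not in the source text): MI cases, group size
p_placebo, p_aspirin, diff, (ci_low, ci_high) = wald_diff_ci(189, 11034, 104, 11037)
print(f"placebo: {p_placebo:.4f}, aspirin: {p_aspirin:.4f}")
print(f"difference: {diff:.4f}, 95% CI: ({ci_low:.4f}, {ci_high:.4f})")
```

Under these assumed counts the interval is roughly (0.005, 0.011). It lies entirely above 0, which would suggest that aspirin reduces the risk of MI.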
RELATIVE RISK
For 2-by-2 tables, the relative risk (RR) is the ratio
$$RR = \frac{\pi_1}{\pi_2}$$
where it can be any non-negative number. RR = 1.0 iff $\pi_1 = \pi_2$.
$p_1/p_2$ estimates the true ratio (RR) $\pi_1/\pi_2$.
RELATIVE RISK
RR matters because a difference of a given fixed size carries more
weight when the proportions of success (at all levels of X) are close
to 0 or 1. For instance, the same difference of 0.009 is observed for
(a) 0.010 and 0.001 and (b) 0.410 and 0.401, yet (a) is more striking
since one proportion is 10 times the other (RR = 10), whereas in (b)
the ratio is only about 1.02. RR may therefore give a better
interpretative meaning for public health implications than relying on
the difference of proportions alone (which may be misleading if
$\pi_i \approx$ 0 or 1).
RELATIVE RISK
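The sampling distribution of the sample RR is highly skewed
unless the sample sizes are quite large. For large samples, an
approximate [due to Wald] $100(1-\alpha)\%$ CI for the true log RR is given by:
$$\log\left(\frac{p_1}{p_2}\right) \pm z_{\alpha/2}\sqrt{\frac{1-p_1}{n_{1+}\,p_1} + \frac{1-p_2}{n_{2+}\,p_2}}$$
Exponentiating the endpoints gives a CI for RR itself.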
EXAMPLE # 2
Refer to the aspirin use and myocardial infarction (heart
attacks) study by the Physicians' Health Study Research
Group at Harvard Medical School.
(a) Estimate and interpret the RR of heart attack
between male physicians who took placebo and those
who took aspirin.
(b) Construct a 95% CI for the true RR of heart attack
between male physicians who took placebo and those
who took aspirin.
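The following Python sketch shows one way Example # 2 could be computed: the sample RR and a Wald interval built on the log scale, then exponentiated. The counts (189 of 11,034 on placebo, 104 of 11,037 on aspirin) are the commonly published figures and are assumed here because the table is not reproduced in this text. `relative_risk_ci` is a hypothetical helper name.

```python
import math

def relative_risk_ci(x1, n1, x2, n2, z=1.96):
    """Sample relative risk p1/p2 with a Wald CI computed on the log scale."""
    p1, p2 = x1 / n1, x2 / n2
    rr = p1 / p2
    # Large-sample standard error of log(p1/p2)
    se_log = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    log_rr = math.log(rr)
    return rr, (math.exp(log_rr - z * se_log), math.exp(log_rr + z * se_log))

# ASSUMED counts (not in the source text): placebo first, aspirin second
rr_hat, (rr_low, rr_high) = relative_risk_ci(189, 11034, 104, 11037)
print(f"RR = {rr_hat:.2f}, 95% CI: ({rr_low:.2f}, {rr_high:.2f})")
```

With these assumed counts the estimated RR is about 1.82, i.e. the sample proportion of MI cases was roughly 82% higher in the placebo group, and the interval (about 1.43 to 2.31) excludes 1.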
ODDS RATIO
For a probability of success $\pi$, the odds (of success)
are defined to be
$$\text{odds} = \pi/(1-\pi)$$
from which we can get
$$\pi = \text{odds}/(\text{odds}+1)$$
ODDS RATIO
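For 2-by-2 tables, the odds ratio ($\theta$) is the ratio
$$\theta = \frac{\text{odds}_1}{\text{odds}_2} = \frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)}$$
where it can be any non-negative number.
Sample odds ratio ($\hat{\theta}$) [through ML under multinomial
assumption, or independent binomial assumption]:
$$\hat{\theta} = \frac{n_{11}\,n_{22}}{n_{12}\,n_{21}}$$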
ODDS RATIO
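$X$ and $Y$ independent $\iff \theta = 1.0$.
$\theta > 1.0$: higher success rate for row [X level] 1
$\theta < 1.0$: higher success rate for row [X level] 2
Values of $\theta$ farther from 1.0 in either direction represent stronger
association between $X$ and $Y$.
$\theta$ is orientation invariant (unlike RR).
$\theta$ may be viewed as a cross-product ratio of joint probabilities,
$\theta = \pi_{11}\pi_{22}/(\pi_{12}\pi_{21})$, when interdependence between $X$ and $Y$
(rather than dependence of $Y$ on $X$) is of interest.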
ODDS RATIO
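The sampling distribution of $\hat{\theta}$ is highly skewed unless
the sample sizes are quite large. For large samples, an
approximate $100(1-\alpha)\%$ CI for the
true $\log\theta$ [which is symmetric about 0] is given by:
$$\log\hat{\theta} \pm z_{\alpha/2}\sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}}$$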
ODDS RATIO
If some cell counts ($n_{ij}$) are 0, then $\hat{\theta}$ can either be 0 or $\infty$,
or even undefined if both entries in a row or column are 0. To
adjust for this, an amended estimator is given by
$$\tilde{\theta} = \frac{(n_{11}+0.5)(n_{22}+0.5)}{(n_{12}+0.5)(n_{21}+0.5)}$$
i.e., an adjustment of 0.5 is made to each cell count (this also
applies to $SE(\log\tilde{\theta})$ when estimating a $100(1-\alpha)\%$ CI).
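As an illustration of the amended estimator, the sketch below adds 0.5 to every cell before forming the cross-product ratio and the standard error of its log. The 2-by-2 table used (5, 0 / 3, 7) is purely hypothetical, chosen only because it contains a zero cell. `adjusted_odds_ratio` is a made-up name.

```python
import math

def adjusted_odds_ratio(n11, n12, n21, n22, z=1.96):
    """Odds ratio and Wald CI after adding 0.5 to each cell count."""
    a, b, c, d = (n + 0.5 for n in (n11, n12, n21, n22))
    theta = (a * d) / (b * c)
    # The same 0.5 adjustment is applied inside the standard error
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_theta = math.log(theta)
    return theta, (math.exp(log_theta - z * se_log), math.exp(log_theta + z * se_log))

# HYPOTHETICAL table with a zero cell; the unadjusted estimate would be infinite
theta_tilde, (adj_low, adj_high) = adjusted_odds_ratio(5, 0, 3, 7)
print(f"adjusted odds ratio = {theta_tilde:.2f}")
print(f"95% CI: ({adj_low:.2f}, {adj_high:.2f})")
```

The very wide interval reflects how little information a table this small carries about $\theta$.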

EXAMPLE # 3
Refer to the aspirin use and myocardial infarction (heart
attacks) study by the Physicians' Health Study Research
Group at Harvard Medical School.
(a) Estimate and interpret the $\theta$ of heart attack between
male physicians who took placebo and those who took
aspirin.
(b) Construct a 95% CI for the true $\theta$ of heart attack
between male physicians who took placebo and those
who took aspirin.
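A possible way to carry out Example # 3 in Python is sketched below. The cell counts (placebo: 189 MI, 10,845 no MI; aspirin: 104 MI, 10,933 no MI) are the commonly published figures for this study and are assumed here because the table is not reproduced in this text. `odds_ratio_ci` is a hypothetical helper.

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Sample odds ratio (cross-product ratio) with a Wald CI on the log scale."""
    theta = (n11 * n22) / (n12 * n21)
    se_log = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    log_theta = math.log(theta)
    return theta, (math.exp(log_theta - z * se_log), math.exp(log_theta + z * se_log))

# ASSUMED counts (not in the source text):
# row 1 = placebo (MI, no MI), row 2 = aspirin (MI, no MI)
theta_hat, (or_low, or_high) = odds_ratio_ci(189, 10845, 104, 10933)
print(f"odds ratio = {theta_hat:.2f}, 95% CI: ({or_low:.2f}, {or_high:.2f})")
```

Under these assumed counts the estimated odds of MI are about 1.83 times higher in the placebo group, with an interval of roughly (1.44, 2.33) that excludes 1.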
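RELATIONSHIP BETWEEN
ODDS RATIO AND RELATIVE RISK
$$\theta = \frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)} = RR \times \frac{1-\pi_2}{1-\pi_1}$$
Hence, whenever direct estimation of RR is not
possible, one can estimate $\theta$ instead, and use it to
approximate RR, as long as $\pi_1 \approx 0$ and $\pi_2 \approx 0$.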
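The relationship between the odds ratio and the relative risk can be checked numerically. The short sketch below uses sample proportions in place of the true probabilities, with the assumed aspirin-study counts (189 of 11,034 on placebo, 104 of 11,037 on aspirin, which do not appear in this text). Both proportions are below 2%, so the two measures should be close.

```python
# ASSUMED counts (not in the source text): MI cases and group sizes
x1, n1 = 189, 11034   # placebo
x2, n2 = 104, 11037   # aspirin

p1, p2 = x1 / n1, x2 / n2
rr = p1 / p2

# Odds ratio computed directly from the two sets of odds
theta_direct = (p1 / (1 - p1)) / (p2 / (1 - p2))
# Odds ratio recovered from RR via the identity theta = RR * (1 - p2) / (1 - p1)
theta_via_rr = rr * (1 - p2) / (1 - p1)

print(f"RR = {rr:.4f}")
print(f"odds ratio (direct) = {theta_direct:.4f}")
print(f"odds ratio (via RR) = {theta_via_rr:.4f}")
```

The two odds-ratio values agree up to rounding error, and both differ from RR only in the second decimal place, which is the rare-outcome approximation at work.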
ODDS RATIO AND
CASE-CONTROL STUDIES
In most case-control studies, the marginal distribution of the
response variable is fixed by the sampling design.
Because such a study is retrospective, one can construct conditional
distributions only for the explanatory variable, within levels of
the response outcome of interest. In this case, only $\theta$ can
be estimated, owing to its symmetric orientation (invariance).
Thus, for relatively rare successes [usually rare diseases],
RR is usually approximated by $\hat{\theta}$.
TESTS OF INDEPENDENCE
Consider $H_0$: the cell probabilities equal certain fixed values $\{\pi_{ij}\}$.
For a sample of size $n$ with cell counts $\{n_{ij}\}$, the values $\{\mu_{ij} = n\pi_{ij}\}$
are expected frequencies, i.e. $\{E(n_{ij})\}$ when $H_0$ is true.

To arrive at a decision, $\{n_{ij}\}$ is compared with $\{\mu_{ij}\}$: if $H_0$
is true, $\{n_{ij} - \mu_{ij}\}$ must be small, i.e. larger differences provide
stronger evidence against $H_0$.

Test statistics used to make such comparisons have large-sample $\chi^2$ distributions.

TESTS OF INDEPENDENCE

$\chi^2(df)$
Mean: $df$
Variance: $2\,df$
$\chi^2(df) = \text{Gamma}(df/2,\ 2)$ [shape $df/2$, scale 2]

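PEARSON $\chi^2$ STATISTIC
$$X^2 = \sum_{i,j} \frac{(n_{ij} - \mu_{ij})^2}{\mu_{ij}}$$
score statistic
minimum at 0 if all $n_{ij} = \mu_{ij}$
p-value: $P[\chi^2 \ge X^2_{obs}]$
$\{\mu_{ij} > 5\}$ for decent approximation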

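LIKELIHOOD-RATIO STATISTIC
$$G^2 = 2\sum_{i,j} n_{ij}\log\left(\frac{n_{ij}}{\mu_{ij}}\right)$$
likelihood-ratio statistic [based on multinomial assumption]
minimum at 0 if all $n_{ij} = \mu_{ij}$
p-value: $P[\chi^2 \ge G^2_{obs}]$
$\{\mu_{ij} > 5\}$ for decent approximation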

TESTS OF INDEPENDENCE
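In two-way tables, the null hypothesis of statistical independence
has the form
$$H_0: \pi_{ij} = \pi_{i+}\,\pi_{+j} \quad \text{for all } i, j$$
Note: $\{\mu_{ij} = n\pi_{i+}\pi_{+j}\}$ is estimated by the estimated expected frequencies
$$\hat{\mu}_{ij} = n\,p_{i+}\,p_{+j} = \frac{n_{i+}\,n_{+j}}{n}$$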

TESTS OF INDEPENDENCE
For testing independence in $I \times J$ contingency tables,
the $X^2$ and $G^2$ statistics are used, with both having a
large-sample $\chi^2$ distribution with degrees of
freedom $df = (I-1)(J-1)$.
$X^2$ converges in distribution more quickly than $G^2$.

TESTS OF INDEPENDENCE
Recall:
The degrees of freedom are obtained by taking the difference
between the number of parameters [cell probabilities] under the alternative
[for which there are $IJ - 1$ non-redundant parameters] and null
[for which there are $(I-1)+(J-1)$ non-redundant parameters]
hypotheses, i.e.,
$$df = (IJ - 1) - [(I-1) + (J-1)] = (I-1)(J-1)$$
EXAMPLE # 4
The following table, from the 2000 General Social
Survey, cross-classifies gender and political party
identification. Subjects indicated whether they identified
more strongly with the Democratic or Republican party
or as Independents. The table also contains estimated
expected frequencies for $H_0$: Independence between
Gender and Political Party Identification.
Determine whether a significant association between gender
and political party identification exists.
[Table: gender by political party identification, with observed counts and estimated expected frequencies; not reproduced in this text]
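The test in Example # 4 could be sketched in Python as follows. The counts (females: 762 Democrat, 327 Independent, 468 Republican; males: 484, 239, 477) are the commonly published figures for this General Social Survey example and are an assumption here, since the table is not reproduced in this text. `independence_test` is a hypothetical helper. The closed-form p-value `exp(-x/2)` is valid only because df = 2 for a 2 x 3 table; other degrees of freedom would need a chi-squared table or a statistics library.

```python
import math

def independence_test(table):
    """Return estimated expected frequencies, Pearson X^2, G^2 and df."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    # mu_hat_ij = n_i+ * n_+j / n
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    pairs = [(o, e) for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row)]
    x2 = sum((o - e) ** 2 / e for o, e in pairs)
    g2 = 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return expected, x2, g2, df

# ASSUMED counts (not in the source text); columns: Democrat, Independent, Republican
gss = [[762, 327, 468],   # females
       [484, 239, 477]]   # males
expected, x2, g2, df = independence_test(gss)

# Upper-tail chi-squared probability; this closed form holds for df = 2 only
p_x2 = math.exp(-x2 / 2)
print(f"X^2 = {x2:.2f}, G^2 = {g2:.2f}, df = {df}, p-value (X^2) = {p_x2:.2e}")
```

With these assumed counts both statistics are close to 30 on 2 degrees of freedom, giving very strong evidence against independence.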
RESIDUALS FOR CELLS
A cell-by-cell comparison of observed and estimated expected
frequencies helps us better understand the nature of the
evidence.

However, it is insufficient to rely simply on the
raw cell differences $n_{ij} - \hat{\mu}_{ij}$ [due to the inherent
magnitude of the counts].
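STANDARDIZED RESIDUAL
$$\frac{n_{ij} - \hat{\mu}_{ij}}{\sqrt{\hat{\mu}_{ij}\,(1 - p_{i+})(1 - p_{+j})}}$$
follows a [large-sample] standard normal distribution under $H_0$.
Large absolute values (as compared to 0): evidence towards lack of fit of $H_0$,
i.e., at a significance level $\alpha$, one expects about $100\alpha\%$ of the
standardized residuals to exceed $z_{\alpha/2}$ in absolute value (about 2 for
$\alpha = 0.05$; use about 3 if there are many cells) by chance
alone under $H_0$.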
EXAMPLE # 5
Refer to the gender and political party identification
example. The following table shows the standardized
residuals for testing independence in the previous
example. Interpret the computed standardized
residuals in relation to the observed global result for
testing independence between gender and political party
identification.
[Table: standardized residuals for gender by political party identification; not reproduced in this text]
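A sketch of how the standardized residuals for Example # 5 might be computed is given below. As before, the counts (females: 762, 327, 468; males: 484, 239, 477 for Democrat, Independent, Republican) are the commonly published figures and are assumed, since the table is not reproduced in this text. `standardized_residuals` is a hypothetical helper.

```python
import math

def standardized_residuals(table):
    """(n_ij - mu_hat_ij) / sqrt(mu_hat_ij * (1 - p_i+) * (1 - p_+j)) for each cell."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    result = []
    for i, row in enumerate(table):
        res_row = []
        for j, n_ij in enumerate(row):
            mu = row_tot[i] * col_tot[j] / n
            se = math.sqrt(mu * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
            res_row.append((n_ij - mu) / se)
        result.append(res_row)
    return result

# ASSUMED counts (not in the source text); columns: Democrat, Independent, Republican
gss = [[762, 327, 468],   # females
       [484, 239, 477]]   # males
residuals = standardized_residuals(gss)
for label, row in zip(("females", "males"), residuals):
    print(label, [round(r, 2) for r in row])
```

Under these assumed counts the Democrat and Republican cells have residuals well beyond 3 in absolute value (more female Democrats and more male Republicans than independence predicts), while the Independent cells are below 1, which indicates where the significant global result comes from.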
STANDARDIZED RESIDUALS
Notice that the residuals for the females are the negatives
of those for the males. In general, the raw residuals in each
column must sum to 0, as the observed counts and the
expected frequencies are constrained by the same row
and column totals. In particular, for $2 \times J$ tables,
$$n_{1j} - \hat{\mu}_{1j} = -(n_{2j} - \hat{\mu}_{2j})$$
PARTITIONING $\chi^2$
Recall: Let $X_1^2$ and $X_2^2$ be independent $\chi^2$ RVs with degrees of
freedom $df_1$ and $df_2$, respectively. Then
$$X^2 = X_1^2 + X_2^2 \sim \chi^2(df_1 + df_2)$$
In essence, this enables one to separate/collapse rows or columns
of $I \times J$ tables into several sub-tables, and obtain $X^2$ or $G^2$ statistics for
which the partitioned statistics sum to the
global statistic.
PARTITIONING $\chi^2$
Consider: For a test of independence in a $2 \times J$ table, a
$\chi^2$ statistic can be broken down into $J - 1$ components: [1] the
first two columns; [2] collapsing of the first two columns, then
compared with the 3rd column; [3] collapsing of the first three
columns, then compared with the 4th column, etc. until the $J$th
column is considered. In particular, this is true for $G^2$.
PARTITIONING $\chi^2$
While it might seem more natural to obtain a $G^2$ statistic for each
$2 \times 2$ pairing, note that the sum of these individual statistics will
not total the global $G^2$. [Issues due to non-independence]
$G^2$ has exact partitionings; $X^2$ does not (at least, algebraically).
Nevertheless, partitioning $\chi^2$ is valid for both statistics as long as
independence of the partitions is met.
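The exactness of the $G^2$ partitioning can be illustrated numerically. The sketch below splits a 2 x 3 table into the two components described above (first two columns; then the first two columns collapsed against the third) and compares the sum of the component statistics with the global one. The counts are the assumed gender by party identification figures (762, 327, 468 / 484, 239, 477), which are not reproduced in this text, and the helper names are hypothetical.

```python
import math

def _expected(table):
    """Estimated expected frequencies n_i+ * n_+j / n under independence."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    return [[r * c / n for c in col_tot] for r in row_tot]

def g2_stat(table):
    """Likelihood-ratio statistic for independence."""
    exp = _expected(table)
    return 2 * sum(o * math.log(o / e)
                   for orow, erow in zip(table, exp)
                   for o, e in zip(orow, erow) if o > 0)

def x2_stat(table):
    """Pearson statistic for independence."""
    exp = _expected(table)
    return sum((o - e) ** 2 / e
               for orow, erow in zip(table, exp)
               for o, e in zip(orow, erow))

# ASSUMED counts (not in the source text)
full = [[762, 327, 468], [484, 239, 477]]
part1 = [[row[0], row[1]] for row in full]            # columns 1 vs 2
part2 = [[row[0] + row[1], row[2]] for row in full]   # columns (1+2) vs 3

g2_sum = g2_stat(part1) + g2_stat(part2)
x2_sum = x2_stat(part1) + x2_stat(part2)
print(f"G^2: components sum to {g2_sum:.4f}, global {g2_stat(full):.4f}")
print(f"X^2: components sum to {x2_sum:.4f}, global {x2_stat(full):.4f}")
```

The $G^2$ components add up to the global value up to floating-point error, whereas the $X^2$ components need not match the global $X^2$ exactly.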
SOME COMMENTS ON $\chi^2$ TESTS
These tests likewise require a large sample size $n$
relative to $IJ$. Moreover, $G^2$ converges poorly compared with $X^2$
when the sample size is small relative to the number of cells, i.e. for large $I$ or $J$; $X^2$
still provides a decent approximation even if some expected
frequencies are as small as 1.
SOME COMMENTS ON $\chi^2$ TESTS
$\chi^2$ tests merely indicate the degree of evidence for an association;
they say nothing about the strength or the nature of the
association.
Both $X^2$ and $G^2$ are orientation invariant, i.e. they do not change
values with reorderings of rows or columns. However, both are
powerful only when associations between nominal variables are of concern.
For ordinal variables, more powerful tests exist.
FISHER'S EXACT TEST
Recall: For $2 \times 2$ tables, independence $\iff \theta = 1$.
Consider the cell counts $\{n_{11}, n_{12}, n_{21}, n_{22}\}$. A small-sample null
probability distribution for the cell counts that does not
depend on unknown parameters results from considering the
set of tables having the same row and column totals. Under this
condition, the cell counts $\{n_{ij}\}$ then have the hypergeometric
distribution.
FISHER'S EXACT TEST
It is sufficient to know $n_{11}$ alone to determine all other cell
counts. Under the null hypothesis of independence $H_0: \theta = 1$,
$n_{11}$ is hypergeometric with
$$P(n_{11} = t) = \frac{\binom{n_{1+}}{t}\binom{n_{2+}}{n_{+1} - t}}{\binom{n}{n_{+1}}}$$
Hence, the p-value equals the sum of hypergeometric probabilities
for outcomes at least as favorable to $H_a$ as the observed outcome.

EXAMPLE # 6
In his 1935 book, The Design of Experiments, Fisher described
the following experiment: When drinking tea, a colleague of
Fisher's at Rothamsted Experiment Station near London
claimed she could distinguish whether milk or tea was added
to the cup first. To test her claim, Fisher designed an
experiment in which she tasted eight cups of tea. Four cups
had milk added first, and the other four had tea added first.
EXAMPLE # 6
She was told there were four cups of each type and she
should try to select the four that had milk added first. The
cups were presented to her in random order. The following
table shows a potential result of the experiment. Perform a
test to check whether there is evidence of a positive
association between the true order of the pouring and her
guess. Compute the exact p-value of the test.
[Table: true pouring order (milk first, tea first) by guess; not reproduced in this text]
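A minimal sketch of the exact one-sided test for Example # 6, using only the standard library. The observed outcome is an assumption, since the table is not reproduced in this text: it is taken to be the commonly cited result in which 3 of the 4 milk-first cups were correctly identified ($n_{11} = 3$, with all row and column totals equal to 4). The function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(t, n1, n2, c1):
    """P(n11 = t) given row totals n1, n2 and first-column total c1."""
    return comb(n1, t) * comb(n2, c1 - t) / comb(n1 + n2, c1)

def fisher_one_sided(n11, n1, n2, c1):
    """P(n11 >= observed): evidence for a positive association (theta > 1)."""
    t_max = min(n1, c1)
    return sum(hypergeom_pmf(t, n1, n2, c1) for t in range(n11, t_max + 1))

# ASSUMED outcome (not in the source text): 3 correct milk-first guesses
# out of 4 milk-first cups, 4 tea-first cups, and 4 "milk first" guesses
p_value = fisher_one_sided(3, 4, 4, 4)
print(f"exact one-sided p-value = {p_value:.4f}")
```

Under this assumed outcome the p-value is 17/70, about 0.243, so the result would not provide much evidence against independence. Only the perfect outcome $n_{11} = 4$ gives a smaller p-value (1/70).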
CONSERVATISM OF
FISHER'S EXACT TEST
Because the distribution of $n_{11}$ is discrete, the exact test is conservative, i.e. the
actual error rate when the null hypothesis of independence is
true is smaller than the intended one. This holds in particular
for one-sided alternatives. Hence, the mid p-value is
preferred as an alternative to diminish the conservativeness.
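The mid p-value counts only half the probability of the observed outcome, plus the full probability of more extreme outcomes. The sketch below applies that definition to the tea-tasting setting, again assuming (since the table is not reproduced in this text) an observed $n_{11} = 3$ with all margins equal to 4. The function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(t, n1, n2, c1):
    """P(n11 = t) given row totals n1, n2 and first-column total c1."""
    return comb(n1, t) * comb(n2, c1 - t) / comb(n1 + n2, c1)

def mid_p_one_sided(n11, n1, n2, c1):
    """Half the probability of the observed table plus P(n11 > observed)."""
    t_max = min(n1, c1)
    more_extreme = sum(hypergeom_pmf(t, n1, n2, c1)
                       for t in range(n11 + 1, t_max + 1))
    return 0.5 * hypergeom_pmf(n11, n1, n2, c1) + more_extreme

# ASSUMED outcome (not in the source text): n11 = 3, all margins equal to 4
mid_p = mid_p_one_sided(3, 4, 4, 4)
print(f"mid p-value = {mid_p:.4f}")
```

For this assumed outcome the mid p-value is 9/70, about 0.129, compared with 17/70 for the ordinary exact p-value.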
SMALL-SAMPLE
CONFIDENCE INTERVAL FOR $\theta$
It is also possible to construct small-sample confidence
intervals for the odds ratio. The procedure is a
generalization of Fisher's exact test that tests an arbitrary value,
$H_0: \theta = \theta_0$. Hence, a 95% CI would then
contain all values of $\theta_0$ for which the exact p-value of
$H_0: \theta = \theta_0$ is greater than 0.05. This can also be constructed
using the mid p-value to reduce conservatism.
SMALL-SAMPLE
CONFIDENCE INTERVAL FOR $\theta$
For the tea-tasting experiment, a 95% CI for $\theta$ can be
computed as follows:
Exact p-value: (0.21, 626.17)
Mid p-value: (0.31, 308.55)
