INFERENCES ON TWO-WAY CONTINGENCY TABLES
DIFFERENCE OF PROPORTIONS
Let $\pi_i$ denote the (conditional) probability of success for row $i$.
Then the difference of proportions, $\pi_i - \pi_j$, compares the success
probabilities in two rows, $i$ and $j$.
Note: $-1 \le \pi_i - \pi_j \le 1$
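The sample difference $p_i - p_j$ estimates the true difference $\pi_i - \pi_j$,
where $p_i$ is the sample proportion of successes in row $i$ and $n_{i+}$ is the
total count in row $i$. Its estimated standard error is
$$SE(p_i - p_j) = \sqrt{\frac{p_i(1-p_i)}{n_{i+}} + \frac{p_j(1-p_j)}{n_{j+}}}$$
[Large-sample] $100(1-\alpha)\%$ CI [due to Wald]:
$$(p_i - p_j) \pm z_{\alpha/2}\, SE(p_i - p_j)$$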
EXAMPLE # 1
The following table is from a report on the relationship between
aspirin use and myocardial infarction (heart attacks) by the
Physicians' Health Study Research Group at Harvard Medical
School. The Physicians' Health Study was a five-year
randomized study testing whether regular intake of aspirin
reduces mortality from cardiovascular disease. Every other day,
the male physicians participating in the study took either one
aspirin tablet or a placebo. The study was blind: the
physicians in the study did not know which type of pill they
were taking.
(a) Estimate the probability of suffering myocardial
infarction (MI) for both placebo and aspirin groups.
(b) Construct a 95% CI for the true difference of
probabilities of heart attack between male physicians who
took placebo and those who took aspirin. From this,
determine whether aspirin is effective in reducing the risk of
heart attack.
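The computation asked for in (a) and (b) can be sketched as follows. This is a minimal illustration of the large-sample Wald interval, not a definitive analysis. The table itself is not reproduced in this text, so the counts are an assumption: 189 MI cases among 11,034 physicians on placebo and 104 among 11,037 on aspirin, the figures commonly reported for this study. The function name `wald_diff_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_diff_ci(x1, n1, x2, n2, level=0.95):
    """Wald interval for pi_1 - pi_2 given successes x and row totals n."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Estimated standard error of the difference of two independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{alpha/2}
    return p1, p2, diff, (diff - z * se, diff + z * se)


# ASSUMED counts (table not shown in the text): MI cases, group size
p_placebo, p_aspirin, diff, ci = wald_diff_ci(189, 11034, 104, 11037)
print(f"p_placebo = {p_placebo:.4f}, p_aspirin = {p_aspirin:.4f}")
print(f"difference = {diff:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Under these assumed counts the interval runs from roughly 0.005 to 0.011. An interval lying entirely above 0 indicates a higher MI probability in the placebo group.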
RELATIVE RISK
For 2-by-2 tables, the relative risk (RR) is the ratio
$$RR = \frac{\pi_1}{\pi_2}$$
which can be any non-negative number. RR = 1.0 iff $\pi_1 = \pi_2$.
The sample ratio $p_1/p_2$ estimates the true ratio (RR).
RR matters because a difference of a fixed size is more important
when the proportions of success (at all levels of $X$) are close to
0 or 1. For example, the same difference of 0.009 is observed for
(a) 0.010 and 0.001 and (b) 0.410 and 0.401, but (a) is more striking
since one proportion is 10 times the other (RR = 10), whereas in (b)
RR $\approx$ 1.02. This shows that RR may be more meaningful for public
health interpretation than the difference of proportions alone
(which may be misleading if $\pi_i \approx 0$ or 1).
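The sampling distribution of the sample RR, $p_1/p_2$, is highly skewed
unless the sample sizes are quite large. For large samples, an
approximate [due to Wald] $100(1-\alpha)\%$ CI for the true log RR is given by:
$$\log\frac{p_1}{p_2} \pm z_{\alpha/2}\sqrt{\frac{1-p_1}{n_{1+}p_1} + \frac{1-p_2}{n_{2+}p_2}}$$
Exponentiating the endpoints gives a CI for RR.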
EXAMPLE # 2
Refer to the aspirin use and myocardial infarction (heart
attacks) study by the Physicians' Health Study Research
Group at Harvard Medical School.
(a) Estimate and interpret the RR of heart attack
between male physicians who took placebo and those
who took aspirin.
(b) Construct a 95% CI for the true RR of heart attack
between male physicians who took placebo and those
who took aspirin.
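A minimal sketch of the relative-risk computation, using the standard large-sample standard error of log RR and exponentiating the endpoints. The table is not reproduced in this text, so the counts are an assumption (189 of 11,034 on placebo, 104 of 11,037 on aspirin, as commonly reported for this study). The function name `relative_risk_ci` is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(x1, n1, x2, n2, level=0.95):
    """Sample RR = p1/p2 and a Wald CI built on the log scale."""
    p1, p2 = x1 / n1, x2 / n2
    rr = p1 / p2
    # Large-sample standard error of log(p1/p2)
    se_log = sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lo, hi = log(rr) - z * se_log, log(rr) + z * se_log
    return rr, (exp(lo), exp(hi))  # back-transform to the RR scale


# ASSUMED counts (table not shown in the text): MI cases, group size
rr, rr_ci = relative_risk_ci(189, 11034, 104, 11037)
print(f"RR = {rr:.2f}, 95% CI = ({rr_ci[0]:.2f}, {rr_ci[1]:.2f})")
```

With these assumed counts the sample RR is about 1.82, i.e. the estimated MI probability is about 82% higher on placebo. The interval is not symmetric about the estimate because it is built on the log scale.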
ODDS RATIO
For a probability of success $\pi$, the odds (of success)
are defined to be
$$\text{odds} = \frac{\pi}{1-\pi}$$
from which we can get
$$\pi = \frac{\text{odds}}{\text{odds}+1}$$
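For 2-by-2 tables, the odds ratio ($\theta$) is the ratio
$$\theta = \frac{\text{odds}_1}{\text{odds}_2} = \frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)}$$
which can be any non-negative number.
Sample odds ratio ($\hat\theta$) [through ML under multinomial
assumption, or independent binomial assumption]:
$$\hat\theta = \frac{p_1/(1-p_1)}{p_2/(1-p_2)} = \frac{n_{11}n_{22}}{n_{12}n_{21}}$$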
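$X$ and $Y$ independent $\iff \theta = 1$.
$\theta > 1.0$: higher success rate for row [X level] 1
$\theta < 1.0$: higher success rate for row [X level] 2
Values of $\theta$ farther from 1.0 in either direction represent stronger
association between $X$ and $Y$.
$\theta$ is orientation invariant (unlike RR): its value does not change when
rows and columns are interchanged.
$\theta$ may be viewed as a cross-product ratio of joint probabilities,
$\theta = \pi_{11}\pi_{22}/(\pi_{12}\pi_{21})$, if both variables are treated as
responses (interdependence).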
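The sampling distribution of $\hat\theta$ is highly skewed unless
the sample sizes are quite large. For large samples, an
approximate $100(1-\alpha)\%$ CI for the
true $\log\theta$ [which is symmetric about 0] is given by:
$$\log\hat\theta \pm z_{\alpha/2}\sqrt{\frac{1}{n_{11}}+\frac{1}{n_{12}}+\frac{1}{n_{21}}+\frac{1}{n_{22}}}$$
Exponentiating the endpoints gives a CI for $\theta$.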
If some cell counts ($n_{ij}$) are 0, then $\hat\theta$ can be either 0 or $\infty$,
or even undefined if both entries in a row or column are 0. To
adjust for this, an amended estimator is given by
$$\tilde\theta = \frac{(n_{11}+0.5)(n_{22}+0.5)}{(n_{12}+0.5)(n_{21}+0.5)}$$
i.e., 0.5 is added to each cell count (the same adjustment
applies to $SE(\log\hat\theta)$ when estimating a $100(1-\alpha)\%$ CI).
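The effect of the 0.5 adjustment can be seen in a small sketch. The counts below are hypothetical and chosen only so that one cell is zero. The function names are illustrative, and returning `inf` or `nan` for the unadjusted ratio is a design choice of this sketch rather than part of the method.

```python
from math import inf, nan, sqrt


def sample_odds_ratio(n11, n12, n21, n22, adjust=0.0):
    """Cross-product ratio; `adjust` is added to every cell count first."""
    a, b, c, d = (n + adjust for n in (n11, n12, n21, n22))
    num, den = a * d, b * c
    if den == 0:
        # 0/0 is undefined; a positive numerator over 0 is infinite
        return nan if num == 0 else inf
    return num / den


def se_log_odds_ratio(n11, n12, n21, n22, adjust=0.5):
    """Standard error of the log odds ratio with adjusted cell counts."""
    return sqrt(sum(1 / (n + adjust) for n in (n11, n12, n21, n22)))


table = (10, 0, 5, 7)  # HYPOTHETICAL counts with one empty cell
raw = sample_odds_ratio(*table)                  # infinite
amended = sample_odds_ratio(*table, adjust=0.5)  # finite
se = se_log_odds_ratio(*table)
print(raw, round(amended, 3), round(se, 3))
```

The unadjusted estimate is infinite for this table, while the amended estimator and its standard error are finite, so a Wald interval on the log scale can still be formed.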
EXAMPLE # 3
Refer to the aspirin use and myocardial infarction (heart
attacks) study by the Physicians' Health Study Research
Group at Harvard Medical School.
(a) Estimate and interpret the odds ratio $\theta$ of heart attack between
male physicians who took placebo and those who took
aspirin.
(b) Construct a 95% CI for the true $\theta$ of heart attack
between male physicians who took placebo and those
who took aspirin.
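A minimal sketch of the odds-ratio estimate and its large-sample interval on the log scale. The table is not reproduced in this text, so the cell counts are an assumption (placebo: 189 MI, 10,845 no MI; aspirin: 104 MI, 10,933 no MI, as commonly reported for this study). The function name `odds_ratio_ci` is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(n11, n12, n21, n22, level=0.95):
    """Sample odds ratio and Wald CI obtained by exponentiating log-scale limits."""
    theta_hat = (n11 * n22) / (n12 * n21)
    se_log = sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lo, hi = log(theta_hat) - z * se_log, log(theta_hat) + z * se_log
    return theta_hat, (exp(lo), exp(hi))


# ASSUMED counts (table not shown in the text):
# rows = placebo, aspirin; columns = MI, no MI
theta_hat, theta_ci = odds_ratio_ci(189, 10845, 104, 10933)
print(f"odds ratio = {theta_hat:.2f}, "
      f"95% CI = ({theta_ci[0]:.2f}, {theta_ci[1]:.2f})")
```

With these assumed counts the estimated odds of MI are about 1.83 times as high on placebo as on aspirin. Because MI is rare in both groups, this value is close to the sample relative risk.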
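RELATIONSHIP BETWEEN
ODDS RATIO AND RELATIVE RISK
$$\theta = \frac{\pi_1/(1-\pi_1)}{\pi_2/(1-\pi_2)} = RR \times \frac{1-\pi_2}{1-\pi_1}$$
Hence, whenever direct estimation of RR is not
possible, one can estimate $\theta$ instead and use it to
approximate RR, as long as $\pi_1 \approx 0$ and $\pi_2 \approx 0$,
so that the factor $(1-\pi_2)/(1-\pi_1)$ is close to 1.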
ODDS RATIO AND
CASE-CONTROL STUDIES
In most case-control studies, the marginal distribution of the
response variable is fixed by the sampling design.
Because the design is retrospective, one can construct conditional
distributions for the explanatory variable within levels of
the response, but not for the response within levels of the
explanatory variable. In this case, only $\theta$ can
be estimated, since it takes the same value whichever variable is
treated as the response (orientation invariance).
Thus, for relatively rare successes [usually rare diseases],
RR is usually approximated by $\hat\theta$.
TESTS OF INDEPENDENCE
Consider $H_0$: the cell probabilities equal certain fixed values $\{\pi_{ij}\}$.
For a sample of size $n$ with cell counts $\{n_{ij}\}$, the values
$\{\mu_{ij} = n\pi_{ij}\}$ are the expected frequencies, i.e. $\{E(n_{ij})\}$
when $H_0$ is true.
To arrive at a decision, $\{n_{ij}\}$ is compared with $\{\mu_{ij}\}$: if $H_0$
is true, the differences $\{n_{ij} - \mu_{ij}\}$ should be small, i.e. larger
differences provide stronger evidence against $H_0$.
Test statistics used to make such comparisons have large-sample
$\chi^2$ distributions.
$\chi^2(df)$ distribution
Mean: $df$
Variance: $2\,df$
$\chi^2(df) = \text{Gamma}(df/2,\ 2)$ (shape $df/2$, scale 2)
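PEARSON STATISTIC
$$X^2 = \sum_{i,j}\frac{(n_{ij}-\mu_{ij})^2}{\mu_{ij}}$$
score statistic
minimum at 0 if all $n_{ij} = \mu_{ij}$
p-value: $P[\chi^2_{df} \ge X^2_{obs}]$
$\{\mu_{ij} \ge 5\}$ for decent approximation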
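LIKELIHOOD-RATIO STATISTIC
$$G^2 = 2\sum_{i,j} n_{ij}\log\frac{n_{ij}}{\mu_{ij}}$$
likelihood-ratio statistic [based on multinomial assumption]
minimum at 0 if all $n_{ij} = \mu_{ij}$
p-value: $P[\chi^2_{df} \ge G^2_{obs}]$
$\{\mu_{ij} \ge 5\}$ for decent approximation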
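In two-way tables, the null hypothesis of statistical independence
has the form
$$H_0: \pi_{ij} = \pi_{i+}\pi_{+j} \quad \text{for all } i, j$$
so that $\mu_{ij} = n\pi_{ij} = n\pi_{i+}\pi_{+j}$.
Note: $\{\mu_{ij}\}$ is estimated by the estimated expected frequencies
$$\hat\mu_{ij} = n\,p_{i+}\,p_{+j} = \frac{n_{i+}\,n_{+j}}{n}$$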
For testing independence in $I \times J$ contingency tables,
the $X^2$ and $G^2$ statistics are used, with both having a
large-sample $\chi^2$ distribution with degrees of
freedom $df = (I-1)(J-1)$.
$X^2$ converges in distribution more quickly than $G^2$.
Recall:
The degrees of freedom are obtained by taking the difference
between the number of parameters [cell probabilities] under the alternative
[for which there are $IJ - 1$ non-redundant parameters] and null
[for which there are $(I-1)+(J-1)$ non-redundant parameters]
hypotheses, i.e.,
$$df = (IJ - 1) - [(I-1) + (J-1)] = (I-1)(J-1)$$
EXAMPLE # 4
The following table, from the 2000 General Social
Survey, cross classifies gender and political party
identification. Subjects indicated whether they identified
more strongly with the Democratic or Republican party
or as Independents. The table also contains the estimated
expected frequencies for $H_0$: independence between
gender and political party identification.
Determine if a significant association between gender
and political party identification exists or not.
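A minimal sketch of the test, computing the estimated expected frequencies and both statistics. The table is not reproduced in this text, so the counts are an assumption (females: 762 Democrat, 327 Independent, 468 Republican; males: 484, 239, 477, as commonly reported for this survey table). Function names are illustrative. The p-value uses the closed form of the $\chi^2$ tail that holds only for $df = 2$; other degrees of freedom would need a statistics library.

```python
from math import exp, log


def expected_frequencies(table):
    """Estimated expected counts n_{i+} n_{+j} / n under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_x2(table):
    mu = expected_frequencies(table)
    return sum((n - m) ** 2 / m
               for row, mu_row in zip(table, mu)
               for n, m in zip(row, mu_row))


def likelihood_ratio_g2(table):
    mu = expected_frequencies(table)
    # Cells with n = 0 contribute 0 to the sum
    return 2 * sum(n * log(n / m)
                   for row, mu_row in zip(table, mu)
                   for n, m in zip(row, mu_row) if n > 0)


# ASSUMED counts (table not shown in the text):
# rows = females, males; columns = Democrat, Independent, Republican
gss = [[762, 327, 468], [484, 239, 477]]
df = (len(gss) - 1) * (len(gss[0]) - 1)
x2, g2 = pearson_x2(gss), likelihood_ratio_g2(gss)
# For df = 2 ONLY, P(chi2 > x) = exp(-x / 2)
p_x2, p_g2 = exp(-x2 / 2), exp(-g2 / 2)
print(f"df = {df}, X2 = {x2:.2f}, G2 = {g2:.2f}, p-values = {p_x2:.1e}, {p_g2:.1e}")
```

With these assumed counts both statistics are close to 30 on 2 degrees of freedom, which gives very small p-values and strong evidence of an association.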
RESIDUALS FOR CELLS
A cell-by-cell comparison of observed and estimated expected
frequencies helps us better understand the nature of the
evidence.
However, the raw cell differences $n_{ij} - \hat\mu_{ij}$ are
insufficient for this, because they tend to be larger in cells with
larger expected frequencies [due to the inherent
magnitude of the counts].
STANDARDIZED RESIDUAL
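$$r_{ij} = \frac{n_{ij} - \hat\mu_{ij}}{\sqrt{\hat\mu_{ij}\,(1-p_{i+})(1-p_{+j})}}$$
follows a [large-sample] standard normal distribution under $H_0$.
Large $|r_{ij}|$ (as compared to 0): evidence towards lack of fit of $H_0$ in that cell,
i.e., at a significance level $\alpha$, one expects about $100\alpha\%$ of the
standardized residuals to exceed $z_{\alpha/2}$ in absolute value (about 2 for
$\alpha = 0.05$; about 3 is a common cutoff if there are many cells) by chance
alone under $H_0$.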
EXAMPLE # 5
Refer to the gender and political party identification
example. The following table shows the standardized
residuals for testing independence in the previous
example. Try to make sense of the computed standardized
residuals in relation to the observed global result for
testing independence between gender and political party
identification.
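The standardized residuals can be computed with a short sketch. The table of residuals is not reproduced in this text, and the counts below are an assumption (females: 762, 327, 468; males: 484, 239, 477 for Democrat, Independent, Republican). The function name is illustrative.

```python
from math import sqrt


def standardized_residuals(table):
    """(n_ij - mu_ij) / sqrt(mu_ij (1 - p_i+)(1 - p_+j)) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        p_row = row_totals[i] / n
        out = []
        for j, n_ij in enumerate(row):
            p_col = col_totals[j] / n
            mu = n * p_row * p_col  # estimated expected frequency
            out.append((n_ij - mu) / sqrt(mu * (1 - p_row) * (1 - p_col)))
        residuals.append(out)
    return residuals


# ASSUMED counts: rows = females, males; columns = Dem., Indep., Rep.
gss = [[762, 327, 468], [484, 239, 477]]
res = standardized_residuals(gss)
for label, row in zip(("female", "male"), res):
    print(label, [round(r, 2) for r in row])
```

With these assumed counts the Democrat and Republican cells have residuals well beyond 3 in absolute value, with opposite signs for females and males, while the Independent cells are close to 0. The departure from independence is therefore concentrated in the Democrat and Republican columns.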
STANDARDIZED RESIDUALS
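Notice that the residuals for the females are the negatives
of those for the males. In general, the raw differences
$n_{ij} - \hat\mu_{ij}$ in each column must sum to 0, because the observed
counts and the expected frequencies are constrained by the same row
and column totals. In particular, for 2 x J tables,
$$n_{1j} - \hat\mu_{1j} = -(n_{2j} - \hat\mu_{2j})$$
and the two standardized residuals in each column have the same
magnitude and opposite signs.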
PARTITIONING
Recall: Let $X_1^2$ and $X_2^2$ be independent $\chi^2$ RVs w/ degrees of
freedom $df_1$ and $df_2$, respectively. Then $X_1^2 + X_2^2$ is $\chi^2$ with
$$df = df_1 + df_2$$
In essence, this enables one to separate/collapse rows or columns
of $I \times J$ tables into several sub-tables and obtain $X^2$ or $G^2$
statistics for the sub-tables, whose sum is the global statistic
(exactly so for $G^2$).
Consider: For a test of independence in a 2 x J table, a $\chi^2$
statistic can be broken down into $J - 1$ components: [1] the
first two columns; [2] the first two columns collapsed, then
compared with the 3rd column; [3] the first three columns
collapsed, then compared with the 4th column, etc. until the Jth
column is considered. In particular, this is true for $G^2$.
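The partitioning described above can be checked numerically with a short sketch. The 2 x 3 counts are an assumption (the gender by party identification figures commonly reported for the 2000 General Social Survey), and the function names are illustrative. The sketch forms the $J - 1$ sub-tables by successively collapsing columns, then compares the sums of the component $G^2$ and $X^2$ values with the global statistics.

```python
from math import log


def expected(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]


def g2(table):
    mu = expected(table)
    return 2 * sum(n * log(n / m) for r, mr in zip(table, mu)
                   for n, m in zip(r, mr) if n > 0)


def x2(table):
    mu = expected(table)
    return sum((n - m) ** 2 / m for r, mr in zip(table, mu)
               for n, m in zip(r, mr))


def partition_2xJ(table):
    """Sub-tables: columns 1 vs 2, then (1+2) vs 3, then (1+2+3) vs 4, ..."""
    top, bottom = table
    cum_top, cum_bottom = top[0], bottom[0]
    subtables = []
    for j in range(1, len(top)):
        subtables.append([[cum_top, top[j]], [cum_bottom, bottom[j]]])
        cum_top += top[j]
        cum_bottom += bottom[j]
    return subtables


# ASSUMED counts: rows = females, males; columns = Dem., Indep., Rep.
gss = [[762, 327, 468], [484, 239, 477]]
parts = partition_2xJ(gss)
g2_parts = [g2(t) for t in parts]
x2_parts = [x2(t) for t in parts]
print("G2 components:", [round(v, 2) for v in g2_parts],
      "sum:", round(sum(g2_parts), 2), "global:", round(g2(gss), 2))
print("X2 components:", [round(v, 2) for v in x2_parts],
      "sum:", round(sum(x2_parts), 2), "global:", round(x2(gss), 2))
```

The $G^2$ components add up to the global $G^2$ up to floating-point error, while the $X^2$ components come close to the global $X^2$ but do not match it exactly.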
While it might seem more natural to obtain a $G^2$ statistic for each
2 x 2 pairing of columns, note that these individual statistics are not
independent and will not sum to the global $G^2$.
$G^2$ has exact partitionings; $X^2$ does not (at least, algebraically).
Nevertheless, partitioning $\chi^2$ is valid for both statistics as long as
the components of the partition are independent.
SOME COMMENTS ON $\chi^2$ TESTS
These tests require a large sample size $n$
relative to the number of cells $IJ$. Moreover, $G^2$ converges to its
$\chi^2$ limit more slowly than $X^2$ when the sample size is small.
For large $I$ or $J$, $X^2$ still provides a decent approximation even
if some expected frequencies are as small as 1.
$\chi^2$ tests merely indicate the degree of evidence for an association;
they say nothing about the strength or the nature of the
association.
Both $X^2$ and $G^2$ are invariant to ordering, i.e. they do not change
value when rows or columns are reordered. They therefore treat both
variables as nominal. When at least one variable is ordinal, more
powerful tests exist.
FISHER'S EXACT TEST
Recall: For 2 x 2 tables, independence $\iff \theta = 1$.
Consider the cell counts $\{n_{11}, n_{12}, n_{21}, n_{22}\}$. A small-sample null
probability distribution for the cell counts that does not
depend on unknown parameters results from considering the
set of tables having the same row and column totals as the observed
table. Under this condition, each $n_{ij}$ has a hypergeometric
distribution.
It is sufficient to know $n_{11}$ alone to determine all other cell
counts. Under the null hypothesis of independence
$H_0: \theta = 1$, $n_{11}$ is hypergeometric with
$$P(n_{11} = t) = \frac{\binom{n_{1+}}{t}\binom{n_{2+}}{n_{+1}-t}}{\binom{n}{n_{+1}}}$$
Hence, the p-value equals the sum of hypergeometric probabilities
for outcomes at least as favorable to $H_a$ as the observed outcome.
EXAMPLE # 6
In his 1935 book, The Design of Experiments, Fisher described
the following experiment: When drinking tea, a colleague of
Fisher's at Rothamsted Experiment Station near London
claimed she could distinguish whether milk or tea was added
to the cup first. To test her claim, Fisher designed an
experiment in which she tasted eight cups of tea. Four cups
had milk added first, and the other four had tea added first.
She was told there were four cups of each type and she
should try to select the four that had milk added first. The
cups were presented to her in random order. The following
table shows a potential result of the experiment. Perform a
test to check whether there is evidence of a positive
association between the true order of pouring and her
guess. Compute the exact p-value of the test.
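A minimal sketch of the exact one-sided p-value, summing hypergeometric probabilities over outcomes at least as favorable to a positive association as the observed one. The table is not reproduced in this text, so the counts are an assumption: she correctly identifies 3 of the 4 milk-first cups, giving cells 3, 1, 1, 3. Function names are illustrative.

```python
from math import comb


def hypergeom_pmf(t, n1p, n2p, np1):
    """P(n11 = t) given row totals n1p, n2p and first-column total np1."""
    return comb(n1p, t) * comb(n2p, np1 - t) / comb(n1p + n2p, np1)


def fisher_one_sided(n11, n12, n21, n22):
    """P(n11 >= observed) under H0: theta = 1 (positive association)."""
    n1p, n2p, np1 = n11 + n12, n21 + n22, n11 + n21
    t_max = min(n1p, np1)  # largest n11 compatible with the margins
    return sum(hypergeom_pmf(t, n1p, n2p, np1) for t in range(n11, t_max + 1))


# ASSUMED table: rows = poured first (milk, tea); columns = guess (milk, tea)
p_value = fisher_one_sided(3, 1, 1, 3)
print(f"exact one-sided p-value = {p_value:.4f}")
```

For this assumed table only $n_{11} = 3$ and $n_{11} = 4$ are at least as extreme as the observed result, with probabilities 16/70 and 1/70, so the p-value is 17/70, about 0.243.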
CONSERVATISM OF
FISHER'S EXACT TEST
Because the hypergeometric distribution is discrete, the exact test is
conservative, i.e. the actual error rate when the null hypothesis of
independence is true is at most, and often much smaller than, the
intended one. This holds in particular for one-sided alternatives.
Hence, the mid p-value (half the probability of the observed table plus
the probability of more extreme tables) is preferred as an alternative
to diminish the conservativeness.
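The mid p-value counts only half of the probability of the observed table. A minimal sketch for the one-sided case follows; the counts are an assumption (a tea-tasting table with cells 3, 1, 1, 3), and the function name is illustrative.

```python
from math import comb


def mid_p_one_sided(n11, n12, n21, n22):
    """0.5 * P(n11 = observed) + P(n11 > observed) under H0: theta = 1."""
    n1p, n2p, np1 = n11 + n12, n21 + n22, n11 + n21
    total = comb(n1p + n2p, np1)

    def pmf(t):
        return comb(n1p, t) * comb(n2p, np1 - t) / total

    t_max = min(n1p, np1)
    more_extreme = sum(pmf(t) for t in range(n11 + 1, t_max + 1))
    return 0.5 * pmf(n11) + more_extreme


# ASSUMED table: 3 of the 4 milk-first cups identified correctly
mid_p = mid_p_one_sided(3, 1, 1, 3)
print(f"mid p-value = {mid_p:.4f}")
```

For this assumed table the mid p-value is 0.5(16/70) + 1/70 = 9/70, about 0.129, compared with 17/70 for the ordinary exact p-value.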
SMALL-SAMPLE
CONFIDENCE INTERVAL FOR $\theta$
It is also possible to construct small-sample confidence
intervals for the odds ratio. The procedure is a
generalization of Fisher's exact test that tests an arbitrary value,
$H_0: \theta = \theta_0$. A 95% CI then
contains all values of $\theta_0$ for which the exact p-value of
$H_0: \theta = \theta_0$ is greater than 0.05. The interval can also be
constructed using the mid p-value to reduce conservativeness.
For the tea-tasting experiment, 95% CIs for $\theta$ are
computed as follows:
Exact p-value: (0.21, 626.17)
Mid p-value: (0.31, 308.55)
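One way to obtain such intervals is to invert the conditional test numerically. The sketch below rests on three assumptions that the text does not spell out. First, the observed table has cells 3, 1, 1, 3. Second, the conditional distribution of $n_{11}$ under odds ratio $\theta$ is proportional to $\binom{n_{1+}}{t}\binom{n_{2+}}{n_{+1}-t}\theta^t$. Third, the interval is equal-tailed, so each one-sided p-value is set equal to $\alpha/2$. Function names are illustrative, and the root finding is a plain bisection on $\log\theta$.

```python
from math import comb, exp, log


def tail_probs(theta, n11, n1p, n2p, np1, weight):
    """Upper and lower tail p-values for n11 when the odds ratio is theta.

    weight = 1.0 counts the observed table fully (exact p-value);
    weight = 0.5 counts half of it (mid p-value).
    """
    lo, hi = max(0, np1 - n2p), min(n1p, np1)
    w = {t: comb(n1p, t) * comb(n2p, np1 - t) * theta ** t
         for t in range(lo, hi + 1)}
    total = sum(w.values())
    upper = (sum(v for t, v in w.items() if t > n11) + weight * w[n11]) / total
    lower = (sum(v for t, v in w.items() if t < n11) + weight * w[n11]) / total
    return upper, lower


def solve_increasing(f, target, lo=1e-8, hi=1e8, iters=200):
    """Bisection on log(theta) for a function f that increases with theta."""
    a, b = log(lo), log(hi)
    for _ in range(iters):
        mid = (a + b) / 2
        if f(exp(mid)) < target:
            a = mid
        else:
            b = mid
    return exp((a + b) / 2)


def small_sample_ci(n11, n12, n21, n22, level=0.95, weight=1.0):
    # If n11 sits at the edge of its range, one limit is 0 or infinity;
    # the bisection then simply returns the search bound.
    n1p, n2p, np1 = n11 + n12, n21 + n22, n11 + n21
    alpha = 1 - level
    args = (n11, n1p, n2p, np1, weight)
    lower = solve_increasing(lambda th: tail_probs(th, *args)[0], alpha / 2)
    upper = solve_increasing(lambda th: 1 - tail_probs(th, *args)[1],
                             1 - alpha / 2)
    return lower, upper


# ASSUMED tea-tasting table: 3, 1, 1, 3
exact_ci = small_sample_ci(3, 1, 1, 3, weight=1.0)
mid_p_ci = small_sample_ci(3, 1, 1, 3, weight=0.5)
print("exact:", exact_ci)
print("mid-p:", mid_p_ci)
```

Under these assumptions the limits should land close to the intervals quoted above. Small differences in the last digits can arise from the root-finding tolerance used by different software.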