Matched-pair hypothesis testing
We’ve recently been looking at hypothesis testing for the difference of
means when we take two independent samples from one or two
populations. Technically, we say that we have independent samples when
there’s no relationship between the observations we find for each sample.
But sometimes we’ll want to run a hypothesis test on the difference of
means between dependent samples, which are samples for which the
observations from one sample are related to an observation from the
other sample.
Matched-pair tests
When we do hypothesis testing with dependent samples, we often call it a
matched-pair test, because each subject in the second sample matches
with a particular subject in the first sample.
It’s common to run a matched-pair test that compares some new
technique or method to an old one, or looks at a before-and-after change.
For instance, a weight-loss study could define Population 1 as the set of
starting weights for each participant, and Population 2 as the set of ending
weights for each participant. Each participant's starting and ending weights
(from Populations 1 and 2, respectively) form a matched pair for that
individual.
In this example, there’s an advantage to using a matched-pair test, instead
of a difference of means test with independent samples. If we took the
independent samples approach, sample 1 could be taken from the
population before the weight loss study begins, and sample 2 could be
taken from the population after the weight loss study ends. This approach
introduces extra variability unnecessarily because we’ll get different
people in both samples.
But if we take the matched-pair approach, we keep the people the same
across both samples, creating a matched pair of each person’s starting
and ending weights.
In general, hypothesis testing with dependent samples follows much the
same process as the one we’ve used for the difference of means with
independent samples, except that we’ll create one variable as the
difference between the paired observations, and we’ll perform the
hypothesis test with just this one variable, instead of with two variables.
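The reduction to a single variable of differences can be sketched in Python. This is only a minimal illustration: the weights are made-up values that don't come from the text, and the helper name `paired_differences` is hypothetical.

```python
# Hypothetical starting and ending weights (kg) for four participants.
# These numbers are illustrative only and do not come from the text.
before = [82.0, 95.5, 77.0, 101.0]
after = [79.5, 93.0, 77.5, 96.0]


def paired_differences(first, second):
    """Return second[i] - first[i] for each matched pair."""
    if len(first) != len(second):
        raise ValueError("matched pairs require samples of equal length")
    return [b - a for a, b in zip(first, second)]


# One difference per participant; the test is then run on this one list.
differences = paired_differences(before, after)
print(differences)
```

Once the differences are in hand, the two original samples are no longer needed for the test itself.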
Let’s work through an example so that we can see how to use dependent
samples in a matched-pair hypothesis test.
Example
A fast food restaurant is implementing new workplace policies with the
goal of increasing employee satisfaction by 2 points on a scale of 1 to 10.
The restaurant surveys 10 employees, asking them both before and after
the policies are enacted to rate their workplace satisfaction on the 1 − 10
scale, and records the results in the table below.
Employee            1   2   3   4   5   6   7   8   9  10
Before, x1          3   3   5   7   1   0   2   6   6   5
After, x2           3   6   9   7   3   5   5   5   9   9
Difference, d       0   3   4   0   2   5   3  -1   3   4
d^2                 0   9  16   0   4  25   9   1   9  16
Can the restaurant say at the 5% significance level that the policies
increased employee satisfaction by more than 2 points?
The restaurant will define the “before” responses as Population 1, and the
“after” responses as Population 2. The samples are dependent because
each “after” response comes from the same employee who gave the matching
“before” response, so the two observations in each pair are related.
Then their null and alternative hypotheses will be
$$H_0: \mu_2 - \mu_1 \le 2$$
$$H_a: \mu_2 - \mu_1 > 2$$
where μ1 is the mean employee satisfaction before the new workplace
policies are implemented, and μ2 is the mean employee satisfaction after
the new workplace policies are implemented. And because μ2 − μ1 is the
difference in employee ratings, the hypothesis statements could also be
written as
$$H_0: \mu_d \le 2$$
$$H_a: \mu_d > 2$$
where μd is the mean difference between the two populations.
To find the mean difference, we’ll sum the differences and divide by the
number of matched pairs in our sample, n = 10.
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} = \frac{0 + 3 + 4 + 0 + 2 + 5 + 3 + (-1) + 3 + 4}{10} = \frac{23}{10} = 2.3$$
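As a quick check of this arithmetic, here is a small Python sketch that rebuilds the differences from the “before” and “after” rows of the table and averages them. The variable names are our own choices for illustration.

```python
# Survey responses copied from the table above.
before = [3, 3, 5, 7, 1, 0, 2, 6, 6, 5]  # x1
after = [3, 6, 9, 7, 3, 5, 5, 5, 9, 9]   # x2

# Difference for each employee, taken as after minus before.
d = [x2 - x1 for x1, x2 in zip(before, after)]

n = len(d)
d_bar = sum(d) / n  # sample mean of the differences
print(d, d_bar)
```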
So the sample mean tells us that employee satisfaction increases by about
2.3 on a scale of 1 to 10. Then the sample standard deviation is
$$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
To calculate this, we’ll first find
$$\begin{aligned}
\sum_{i=1}^{n} (d_i - \bar{d})^2 &= (0 - 2.3)^2 + (3 - 2.3)^2 + (4 - 2.3)^2 + (0 - 2.3)^2 + (2 - 2.3)^2 \\
&\quad + (5 - 2.3)^2 + (3 - 2.3)^2 + (-1 - 2.3)^2 + (3 - 2.3)^2 + (4 - 2.3)^2 \\
&= (-2.3)^2 + 0.7^2 + 1.7^2 + (-2.3)^2 + (-0.3)^2 + 2.7^2 + 0.7^2 + (-3.3)^2 + 0.7^2 + 1.7^2 \\
&= 5.29 + 0.49 + 2.89 + 5.29 + 0.09 + 7.29 + 0.49 + 10.89 + 0.49 + 2.89 \\
&= 36.1
\end{aligned}$$
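The d^2 row of the table supports a common shortcut for this same sum, namely Σ(d − d̄)² = Σd² − (Σd)²/n. The sketch below compares the two routes in Python; the variable names are ours, and the comparison uses a tolerance because floating-point sums are not exact.

```python
import math

# Differences from the table.
d = [0, 3, 4, 0, 2, 5, 3, -1, 3, 4]
n = len(d)
d_bar = sum(d) / n

# Route 1: sum of squared deviations from the mean, as in the text.
ss_definition = sum((di - d_bar) ** 2 for di in d)

# Route 2: shortcut using the d^2 row: sum(d^2) - (sum(d))^2 / n.
sum_d_squared = sum(di ** 2 for di in d)
ss_shortcut = sum_d_squared - sum(d) ** 2 / n

# Both routes should agree up to floating-point rounding.
print(math.isclose(ss_definition, ss_shortcut))
```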
Then the sample standard deviation is
$$s_d = \sqrt{\frac{36.1}{9}}$$
$$s_d \approx \sqrt{4.011}$$
$$s_d \approx 2.003$$
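The same value can be reproduced with Python's standard library. This is a sketch for checking the hand calculation; `statistics.stdev` uses the n − 1 denominator, which matches the sample standard deviation formula used here.

```python
import statistics

# Differences from the table.
d = [0, 3, 4, 0, 2, 5, 3, -1, 3, 4]

# Sample standard deviation of the differences (n - 1 in the denominator).
s_d = statistics.stdev(d)
print(round(s_d, 3))
```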
Because the population standard deviation of the differences is unknown,
and/or because the sample of differences is small, n < 30, the test statistic will be
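$$t = \frac{\bar{d} - \mu_d}{\frac{s_d}{\sqrt{n}}}$$
$$t \approx \frac{2.3 - 2}{\frac{2.003}{\sqrt{10}}}$$
$$t \approx 0.3 \cdot \frac{\sqrt{10}}{2.003}$$
$$t \approx 0.474$$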
and the degrees of freedom are
df = n − 1 = 10 − 1 = 9
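A short Python sketch of the test statistic and degrees of freedom follows. The name `mu_0` for the hypothesized mean difference of 2 is our own label, not notation from the text.

```python
import math
import statistics

# Differences from the table.
d = [0, 3, 4, 0, 2, 5, 3, -1, 3, 4]
mu_0 = 2  # hypothesized mean difference under the null hypothesis

n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)

# t = (d_bar - mu_0) / (s_d / sqrt(n)), with n - 1 degrees of freedom.
standard_error = s_d / math.sqrt(n)
t_stat = (d_bar - mu_0) / standard_error
df = n - 1
print(round(t_stat, 3), df)
```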
At a significance level of 5% for an upper-tail test, we read the t-table at
upper-tail probability p = 0.05 and df = 9, which gives a critical value of 1.833.
Upper-tail probability p
df    0.25   0.20   0.15   0.10   0.05   0.025  0.01   0.005  0.001  0.0005
 8    0.706  0.889  1.108  1.397  1.860  2.306  2.896  3.355  4.501  5.041
 9    0.703  0.883  1.100  1.383  1.833  2.262  2.821  3.250  4.297  4.781
10    0.700  0.879  1.093  1.372  1.812  2.228  2.764  3.169  4.144  4.587
      50%    60%    70%    80%    90%    95%    98%    99%    99.8%  99.9%
Confidence level C
The restaurant’s t-test statistic t ≈ 0.474 doesn’t exceed the critical value
t = 1.833 for an upper-tail test at 5% significance with df = 9, so the
critical value approach tells them that they can’t reject the null hypothesis,
and therefore can’t conclude that the new workplace policies increased
employee satisfaction by more than 2 points.
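The critical value approach can be sketched as a simple comparison in Python. The critical value here is read from the df = 9 row of the t-table at upper-tail probability 0.05 and hard-coded, which is an assumption of this sketch rather than something computed.

```python
import math
import statistics

# Differences from the table.
d = [0, 3, 4, 0, 2, 5, 3, -1, 3, 4]
mu_0 = 2  # hypothesized mean difference under the null hypothesis

n = len(d)
t_stat = (statistics.mean(d) - mu_0) / (statistics.stdev(d) / math.sqrt(n))

# Critical value copied from the t-table: df = 9, upper-tail p = 0.05.
t_critical = 1.833

# Upper-tail test: reject the null only if t_stat exceeds the critical value.
reject_null = t_stat > t_critical
print(reject_null)
```

If SciPy happens to be available, `scipy.stats.t.ppf(0.95, df=9)` should give the same critical value to three decimals, but that dependency isn't needed for this check.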
Confidence intervals for matched-pair tests
If the restaurant from the previous example had known the population
standard deviation σd, they could have calculated a confidence interval
around the difference d̄ using
$$(a, b) = \bar{d} \pm z_{\alpha/2}\,\sigma_{\bar{d}}$$
$$(a, b) = \bar{d} \pm z_{\alpha/2}\,\frac{\sigma_d}{\sqrt{n}}$$
If, instead, the restaurant had an unknown population standard deviation
σd and/or a small sample n < 30, to find a confidence interval around the
difference d̄ they would have used
$$(a, b) = \bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}} \quad \text{with } df = n - 1$$
Let’s continue with the previous example in order to calculate the
confidence interval.
Example (cont’d)
Find a 95% confidence interval around d̄ using the information in the
previous example.
From the previous example, we see that the population standard deviation σd
is unknown, and we have a small sample n = 10 < 30, so we’ll calculate the
confidence interval as
$$(a, b) = \bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$$
$$(a, b) \approx 2.3 \pm 2.262 \cdot \frac{2.003}{\sqrt{10}}$$
$$(a, b) \approx 2.3 \pm 1.433$$
So the margin of error is 1.433 and the confidence interval is
$$(a, b) \approx (2.3 - 1.433,\ 2.3 + 1.433)$$
$$(a, b) \approx (0.867,\ 3.733)$$
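The interval can be reproduced with a few lines of Python. As a sketch, the value 2.262 is hard-coded from the df = 9 row of the t-table at the 95% confidence level, and the variable names are ours.

```python
import math
import statistics

# Differences from the table in the example.
d = [0, 3, 4, 0, 2, 5, 3, -1, 3, 4]

n = len(d)
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)

# Table value for df = 9 at 95% confidence (upper-tail probability 0.025).
t_crit = 2.262

# Margin of error and the two endpoints of the interval.
margin = t_crit * s_d / math.sqrt(n)
interval = (d_bar - margin, d_bar + margin)
print(round(margin, 3), tuple(round(x, 3) for x in interval))
```

The interval contains the hypothesized value of 2, which is consistent with the earlier decision not to reject the null hypothesis.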
Therefore, we can be 95% confident that the mean change in employee
satisfaction is between 0.867 points and 3.733 points.