Hypothesis Testing - Z and T-Tests
I. Testing a Claim about the Mean Using Large and Small Samples (One sample group)
When a hypothesis test involves a claim about a population mean, we draw a sample and use the sample
mean to test the claim. If the sample drawn is large enough (n ≥ 30), then the Central Limit Theorem applies, and the
distribution of sample means is approximately normal. Note: Since the null hypothesis must be of one of the following
types: μ = μ0, μ ≥ μ0, or μ ≤ μ0, where μ0 is a constant, we will always assume for the purpose of our test that μ = μ0.
Example 1: Suppose we believe that the mean body temperature of healthy adults is less than the commonly accepted
measurement of 98.6°F, with a standard deviation of 1.1°. A sample of 60 healthy adults is drawn with an
average temperature of x̄ = 98.2°F. Test the claim at the 5% level of significance.
Step 1: Ho: The mean body temperature of healthy adults is greater than or equal to 98.6° (Ho: μ ≥ 98.6)
Ha: The mean body temperature of healthy adults is less than 98.6° (Ha: μ < 98.6)
Step 2: Level of significance, α = 0.05, and sample size, n = 60.
Step 3: Choice of Statistic/Type of Test: Z– test for mean, one–tail test (left tail)
Step 4: Critical Value: since Ha indicates left tail test, Critical Value, CV = – 1.65.
Step 5: Computation:
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{98.2 - 98.6}{1.1 / \sqrt{60}} = \frac{-0.4}{0.142} \approx -2.82 \]
Step 6: Decision/Conclusion: Since the computed z value (–2.82) falls within the rejection region, we reject the
null hypothesis. In other words, since –2.82 < –1.65, we reject Ho. The data provide significant evidence
that the mean body temperature of healthy adults is less than the commonly accepted 98.6°. Thus
the claim is supported at the 5% level of significance.
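As a rough cross-check of Example 1, the critical-value steps can be sketched in Python. This is only a minimal sketch: the helper name `z_test_mean` is hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` instead of the z table, so it comes out near –1.645 rather than the rounded –1.65 used in Step 4.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, sigma, n):
    """z statistic for a one-sample test of the mean (sigma known, large n)."""
    return (xbar - mu0) / (sigma / sqrt(n))


alpha = 0.05
z = z_test_mean(xbar=98.2, mu0=98.6, sigma=1.1, n=60)

# Left-tail test: the rejection region is z < critical.
critical = NormalDist().inv_cdf(alpha)
reject_h0 = z < critical

print(f"z = {z:.2f}, critical value = {critical:.3f}, reject Ho: {reject_h0}")
```

For a right-tail test the same sketch would use `NormalDist().inv_cdf(1 - alpha)` and the comparison `z > critical`.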
Example 2: The manager of a certain large department store wants to determine if the mean sale amount of all sales on a
promotion day is the same as that on a regular day, which is $78 with a standard deviation of $25. On a special
holiday promotion day, the purchase amounts of 80 different randomly selected sales have a mean of $82. Test the
hypothesis at the 10% significance level.
Step 1: Ho: The mean sale amount of all sales on a promotion day is equal to that on a regular day, which is $78.
(Ho: μ = $78)
Ha: The mean sale amount of all sales on a promotion day is not equal to that on a regular day, which is $78.
(Ha: μ ≠ $78)
Step 2: Level of significance, α = 0.10, and sample size, n = 80.
Step 3: Choice of Statistic/Type of Test: Z– test for mean, two–tail test
Step 4: Critical Value: since Ha is non-directional, Critical Values, CV = ±1.65.
Step 5: Computation:
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{82 - 78}{25 / \sqrt{80}} \approx 1.43 \]
Step 6: Decision/Conclusion: Since the computed z value (1.43) falls within the acceptance region, we do not reject the
null hypothesis. In other words, since 1.43 < 1.65, we fail to reject Ho. There is insufficient evidence that the mean
sale amount of all sales on a promotion day differs from that on a regular day, which is $78. Based on the samples
selected, the mean sale amounts on the two kinds of days are comparable at the 10% significance level.
ACTIVITY # 1
KINDLY REFER TO ITEM NOS. 9.46 (A–C) AND 9.47A IN PAGE 318 OF OUR TEXTBOOK. FOR SPSS ACTIVITY, KINDLY
REFER TO 9.61 IN PAGE 324 AND 9.65 IN PAGE 325.
Example 1: A company claims that the mean selling price for a certain type of imported sports car is $42,000. A survey of 16
randomly selected owners of such cars shows that they actually paid a mean of $44,200 with a standard deviation of
$6,000 for their cars. Test if the company’s claim is TOO LOW at 1% level of significance.
Step 1: Ho: The mean selling price for a certain type of imported sports car is $42,000. (Ho: μ = $42,000)
Ha: The mean selling price for a certain type of imported sports car is greater than $42,000. (Ha: μ > $42,000)
Step 2: Level of significance, α = 0.01, and sample size, n = 16.
Step 3: Choice of Statistic/Type of Test: t– test for mean, one–tail test
Step 4: Critical Value: since Ha indicates an upper – tail test, Critical Value, CV = 2.6025, with df = 15.
Step 5: Computation:
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{44200 - 42000}{6000 / \sqrt{16}} = \frac{2200}{1500} \approx 1.47 \]
Step 6: Decision/Conclusion: Since the computed t value (1.47) falls within the acceptance region, we do not reject the
null hypothesis. In other words, since 1.47 < 2.6025, we fail to reject Ho. There is insufficient evidence that the
mean selling price for this type of imported sports car exceeds $42,000. Therefore, the data do not show that the
company’s claim is too low at the 1% significance level.
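The small-sample computation above can be sketched the same way. This is an illustrative sketch only: `t_statistic` is a hypothetical helper name, and because the Python standard library has no t distribution, the critical value 2.6025 (df = 15, upper tail, 1%) is typed in from the t table as in Step 4 rather than computed.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """Return (t, df) for a one-sample t-test of the mean."""
    t = (xbar - mu0) / (s / sqrt(n))
    return t, n - 1


t, df = t_statistic(xbar=44200, mu0=42000, s=6000, n=16)

# Assumption: critical value read from a t table, not computed here.
t_critical = 2.6025
reject_h0 = t > t_critical  # upper-tail test

print(f"t = {t:.2f}, df = {df}, reject Ho: {reject_h0}")
```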
ACTIVITY # 2
KINDLY REFER TO ITEM NOS. 9.55 & 9.57A IN PAGE 323 OF OUR TEXTBOOK. FOR SPSS ACTIVITY, KINDLY REFER TO
9.58A AND 9.59A IN PAGE 324.
Example 1: Suppose we believe that the mean body temperature of healthy adults is less than the commonly accepted
measurement of 98.6°F, with a standard deviation of 1.1°. A sample of 60 healthy adults is drawn with an
average temperature of x̄ = 98.2°F. Test the claim at the 5% level of significance.
Step 1: Ho: The mean body temperature of healthy adults is greater than or equal to 98.6° (Ho: μ ≥ 98.6)
Ha: The mean body temperature of healthy adults is less than 98.6° (Ha: μ < 98.6)
Step 2: Level of significance, α = 0.05, and sample size, n = 60.
Step 3: Choice of Statistic/Type of Test: Z– test for mean, one–tail test (left tail)
Step 4: Computation:
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{98.2 - 98.6}{1.1 / \sqrt{60}} = \frac{-0.4}{0.142} \approx -2.82 \]
Step 5: The P–value is the probability that z would be less than –2.82, that is, P(z < –2.82) = 0.0024.
Step 6: Decision/Conclusion: Since the p-value (0.0024) is less than α = 0.05, we reject the null hypothesis. There
is significant evidence that the mean body temperature of healthy adults is less than the commonly
accepted 98.6°. Thus the claim is supported at the 5% level of significance.
Example 2: The manager of a certain large department store wants to determine if the mean sale amount of all sales on a
promotion day is the same as that on a regular day, which is $78 with a standard deviation of $25. On a special
holiday promotion day, the purchase amounts of 80 different randomly selected sales have a mean of $82. Test the
hypothesis at the 10% significance level.
Step 1: Ho: The mean sale amount of all sales on a promotion day is equal to that on a regular day, which is $78.
(Ho: μ = $78)
Ha: The mean sale amount of all sales on a promotion day is not equal to that on a regular day, which is $78.
(Ha: μ ≠ $78)
Step 2: Level of significance, α = 0.10, and sample size, n = 80.
Step 3: Choice of Statistic/Type of Test: Z– test for mean, two–tail test
Step 4: Computation:
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{82 - 78}{25 / \sqrt{80}} \approx 1.43 \]
Step 5: Since this is a two–tail test, we find the P–value as the probability that z would be either less than –1.43 or
greater than 1.43, that is, P(z < –1.43 or z > 1.43) = 0.0764 + 0.0764 = 0.1528.
Step 6: Decision/Conclusion: Since the p–value (0.1528) is greater than α (0.10), we do not reject the null hypothesis.
The data do not provide significant evidence that the mean sale amount of all sales on a promotion day differs
from that on a regular day, which is $78. Based on the samples selected, the mean sale amounts on the two kinds
of days are comparable at the 10% significance level.
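The P-value lookups in the two examples above can be approximated with the standard normal CDF. The sketch below assumes a hypothetical helper `z_p_value` and uses `statistics.NormalDist` from the Python standard library; because it works with the unrounded z values, its results can differ in the third or fourth decimal place from the table values quoted in Step 5.

```python
from math import sqrt
from statistics import NormalDist


def z_p_value(z, tail):
    """P-value of a z statistic for a 'left', 'right', or 'two' tail test."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Example 1: body temperature, left-tail test at alpha = 0.05.
z_temp = (98.2 - 98.6) / (1.1 / sqrt(60))
p_temp = z_p_value(z_temp, "left")

# Example 2: promotion-day sales, two-tail test at alpha = 0.10.
z_sales = (82 - 78) / (25 / sqrt(80))
p_sales = z_p_value(z_sales, "two")

print(f"Example 1: p = {p_temp:.4f}, reject Ho: {p_temp < 0.05}")
print(f"Example 2: p = {p_sales:.4f}, reject Ho: {p_sales < 0.10}")
```

The decision rule is the same in both cases: reject Ho only when the P-value is smaller than α.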
ACTIVITY # 3
KINDLY REFER TO ITEM NOS. 9.57B IN PAGE 323 AND 9.63B IN PAGE 325 OF OUR TEXTBOOK. FOR SPSS ACTIVITY,
KINDLY REFER TO 9.58B AND 9.59 IN PAGE 324.
Example 1. A cereal company claims that two-thirds of all children prefer Rice Crunchies to Rice Flakies. In a sample of
100 children, 55 prefer Rice Crunchies. Test if the company’s claim is overstated at the 5% level of significance.
Step 1: Ho: Two-thirds of all children prefer Rice Crunchies to Rice Flakies. (Ho: P = 2/3)
Ha: Less than two–thirds of all children prefer Rice Crunchies to Rice Flakies. (Ha: P < 2/3)
Step 2: Level of significance, α = 0.05, and sample size, n = 100.
Step 3: Choice of Statistic/Type of Test: Z-test proportion/one tail test
Step 4: Critical Value: since Ha indicates lower tail test, Critical Value, CV = -1.65 .
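Step 5: Computation: Since x = 55 and n = 100, p = 55% = 0.55, and P = 2/3 ≈ 0.667, so
\[ z = \frac{p - P}{\sqrt{\frac{P(1 - P)}{n}}} = \frac{0.55 - 0.667}{\sqrt{\frac{0.667(1 - 0.667)}{100}}} \approx \frac{-0.117}{0.047} \approx -2.49 \]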
Step 6: Decision/Conclusion: Since the computed z–value (–2.49) is less than the critical value (–1.65), it falls in the
rejection region, so we reject Ho. Therefore, based on the sample data, we conclude that the company’s claim regarding
the proportion of children who prefer Rice Crunchies to Rice Flakies is overstated at the 5% level of significance.
Example 2. A drug rehabilitation center claims that at most 22% of its patients who are certified as drug–free suffer a
relapse within 2 years. A study of 35 randomly selected graduates of the program shows that 10 have gone
back to drugs within 2 years. Test if the claim is overstated at the 1% level of significance.
Step 1: Ho: At most 22% of a drug rehabilitation center’s patients who are certified as drug–free suffer a relapse
within 2 years. (Ho: P ≤ 22%)
Ha: More than 22% of a drug rehabilitation center’s patients who are certified as drug–free suffer a relapse
within 2 years. (Ha: P > 22%)
Step 2: Level of significance, α = 0.01, and sample size, n = 35.
Step 3: Choice of Statistic/Type of Test: Z-test proportion/one tail test
Step 4: Critical Value: since Ha indicates upper tail test, Critical Value, CV = 2.33 .
Step 5: Computation: Since x = 10 and n = 35, p = 10/35 ≈ 28.57% and P = 22%, so
\[ z = \frac{p - P}{\sqrt{\frac{P(1 - P)}{n}}} = \frac{0.2857 - 0.22}{\sqrt{\frac{0.22(1 - 0.22)}{35}}} \approx 0.94 \]
Step 6: Decision/Conclusion: Since the computed z–value (0.94) is less than the critical value (2.33), we do not reject Ho.
Therefore, based on the sample data, there is insufficient evidence that the drug rehabilitation center’s claim regarding
the proportion of its patients who are certified as drug–free yet still suffer a relapse within 2 years is
overstated at the 1% level of significance. The data gathered are consistent with a true proportion of at most 22%.
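Both proportion examples follow the same arithmetic, which can be sketched as below. The function name `z_test_proportion` is hypothetical. Note one assumption that differs from the hand computation: the sketch uses the exact value 2/3 instead of the rounded 0.667, so the cereal z value comes out near –2.47 rather than the hand-rounded figure in Step 5.

```python
from math import sqrt


def z_test_proportion(x, n, p0):
    """z statistic for a one-sample proportion test against p0."""
    p_hat = x / n                       # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)  # standard error under Ho
    return (p_hat - p0) / std_error


# Example 1: 55 of 100 children, claimed proportion 2/3, lower-tail test.
z_cereal = z_test_proportion(x=55, n=100, p0=2 / 3)
reject_cereal = z_cereal < -1.65  # table critical value for alpha = 0.05

# Example 2: 10 of 35 patients, claimed proportion 0.22, upper-tail test.
z_rehab = z_test_proportion(x=10, n=35, p0=0.22)
reject_rehab = z_rehab > 2.33     # table critical value for alpha = 0.01

print(f"cereal: z = {z_cereal:.2f}, reject Ho: {reject_cereal}")
print(f"rehab:  z = {z_rehab:.2f}, reject Ho: {reject_rehab}")
```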
ACTIVITY # 4
KINDLY REFER TO ITEM NOS. 9.71 AND 9.75 IN PAGE 329 OF OUR TEXTBOOK. FOR SPSS ACTIVITY, KINDLY REFER TO
9.71 AND 9.75 IN PAGE 329.
This permits us to compare two different groups to see whether they represent different populations or whether they are
essentially equivalent with respect to a particular parameter.
In this case, we’ll have two different populations with potentially different means, μ1 and μ2. These two populations
have two standard deviations, σ1 and σ2. We seek to determine which of the following hypotheses is
supported: Ho: μ1 = μ2; Ho: μ1 ≥ μ2; Ho: μ1 ≤ μ2; Ha: μ1 ≠ μ2; Ha: μ1 > μ2; or Ha: μ1 < μ2.
To perform this hypothesis test, we need two separate independent random samples, one drawn from each
population. Thus the sample data give n1, x̄1, s1 from the first group and n2, x̄2, s2 from the second group. We
consider the difference between the means of these two samples: x̄1 – x̄2. The set of all such differences of sample
means from the two populations forms a new sampling distribution known as the distribution of differences of
sample means.
CASE 1: Z – TEST FOR THE DIFFERENCE BETWEEN TWO MEANS
( n1 ≥ 30 and n2 ≥ 30, and σ1 and σ2 are known )
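The distribution of differences of sample means is approximately normal with mean
\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \]
and standard deviation
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}, \]
provided that both n1 ≥ 30 and n2 ≥ 30.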
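When we work with this normal distribution for differences of sample means, the z values are calculated according to
the formula
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}, \quad \text{where} \quad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}. \]
Since the null hypothesis asserts that μ1 = μ2, or μ1 – μ2 = 0, the formula reduces to
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}. \]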
EXAMPLES
1. A study is made comparing the prices asked for existing one – family homes in two adjacent communities. In College
Heights, the mean asking price for a random sample of 50 homes is $142,000 with a standard deviation of $30,000. In
University Gardens, the mean asking price for a random sample of 35 homes is $168,000 with a standard deviation of
$40,000. Test whether there is a difference in the mean asking prices for homes in these two areas at the 5% level of
significance.
Hypotheses
Ho: There is no significant difference in the mean asking prices for homes in these two adjacent communities.
Ho: μ1 = μ2
Ha: There is a significant difference in the mean asking prices for homes in these two adjacent communities.
Ha: μ1 ≠ μ2
Level of Significance and sample sizes
α = 0.05, n1 = 50 and n2 = 35
Choice of Statistic and Type of Test
Since both n1 > 30 and n2 > 30, we use the Z–test for the difference between two means, and since Ha indicates that
μ1 ≠ μ2, the problem entails a two–tail test.
Critical Value
Using α = 0.05, two tail test, Critical Values are ±1.96.
Computation
Given Data:
College Heights: n1 = 50, x̄1 = 142,000, s1 = 30,000
University Gardens: n2 = 35, x̄2 = 168,000, s2 = 40,000
\[ z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{142{,}000 - 168{,}000}{\sqrt{\frac{30{,}000^2}{50} + \frac{40{,}000^2}{35}}} \approx -3.26 \]
Decision/Conclusion
Since the computed z (–3.26) is less than the critical value –1.96, it falls in the rejection region, so we reject Ho.
There is a significant difference in the mean asking prices for homes in these two adjacent communities. Since the
computed z is negative (–3.26), the samples indicate that the mean asking price for homes in College Heights is
lower than that in University Gardens.
2. A study is made comparing wages paid to women and men holding comparable jobs in a large company. A random sample
of 100 women are paid a mean hourly wage of P315.25 with a standard deviation of P71.34 while a random sample of 75
men are paid a mean hourly wage of P350.61 with a standard deviation of P80.48. Do these data constitute “proof” that,
on the average, women are paid less than men at the 5% level of significance?
Hypotheses
Ho: There is no significant difference in the mean hourly wage paid to women and men who are holding comparable
jobs in a large company. In symbols, Ho: μw = μm.
Ha: On the average, women are paid less than men. In symbols, Ha: μw < μm.
Level of Significance and sample sizes
α = 0.05, nw = 100 and nm = 75
Choice of Statistic and Type of Test
Since both nw > 30 and nm > 30, we use the Z–test for the difference between two means, and since Ha indicates that
μw < μm, the problem entails a one–tail test.
Critical Value
Using α = 0.05, one tail test, Critical Value is –1.65.
Computation
Given Data:
Women: nw = 100, x̄w = 315.25, sw = 71.34
Men: nm = 75, x̄m = 350.61, sm = 80.48
\[ z = \frac{\bar{x}_w - \bar{x}_m}{\sqrt{\frac{s_w^2}{n_w} + \frac{s_m^2}{n_m}}} = \frac{315.25 - 350.61}{\sqrt{\frac{71.34^2}{100} + \frac{80.48^2}{75}}} \approx -3.02 \]
Decision/Conclusion
Since the computed z (–3.02) is less than the critical value –1.65, reject Ho. That is, at the 5% level of significance,
the data indicate that women are paid lower wages, on the average, than men in this organization.
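The two worked examples for Case 1 can be reproduced with a short Python sketch. The helper name `z_test_two_means` is hypothetical, and, as in the hand computations, the sample standard deviations are used in place of σ1 and σ2.

```python
from math import sqrt


def z_test_two_means(x1, s1, n1, x2, s2, n2):
    """z statistic for the difference between two means (large samples)."""
    std_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) / std_error


# Example 1: College Heights vs. University Gardens, two-tail test.
z_homes = z_test_two_means(142_000, 30_000, 50, 168_000, 40_000, 35)
reject_homes = abs(z_homes) > 1.96   # table critical value, alpha = 0.05

# Example 2: women vs. men hourly wages, lower-tail test.
z_wages = z_test_two_means(315.25, 71.34, 100, 350.61, 80.48, 75)
reject_wages = z_wages < -1.65       # table critical value, alpha = 0.05

print(f"homes: z = {z_homes:.2f}, reject Ho: {reject_homes}")
print(f"wages: z = {z_wages:.2f}, reject Ho: {reject_wages}")
```

Comparing `abs(z)` with the positive critical value is one simple way to express the two-tail rejection region.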
CASE 2: POOLED – VARIANCE T-TEST FOR THE DIFFERENCE BETWEEN TWO MEANS
Preferably used when n1 < 30 and n2 < 30, though this can still be used for large samples as long as the
population variances are assumed equal.
The t-value for this case is computed by the formula
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \quad \text{where} \quad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \]
with degrees of freedom df = n1 + n2 – 2. The term sp² arises because of the assumption that the two populations
have equal variances. To estimate this common variance, it is necessary to combine, or pool, the information from
both samples to get one single estimate of the unknown variance, σ². We call this the pooled variance, sp².
EXAMPLES
1. A waitress at a yacht club restaurant conducts a study to compare the average amount of tips she receives per person on a
$20 dinner from sailboaters versus motorboaters. In a random sample of 28 sailboaters, she receives a mean of $3.80 with a
standard deviation of $1.22. In a random sample of 24 motorboaters, she receives a mean of $4.22 with a standard
deviation of $1.04. Test if there is a difference in the average tip at the 5% level of significance. (Assume that the population
variances are equal).
Hypotheses
Ho: There is no significant difference in the average amount of tip received by the waitress from sailboaters and
motorboaters on a $20 dinner. In symbols, Ho: μs = μm.
Ha: There is a significant difference in the average amount of tip received by the waitress from sailboaters and
motorboaters on a $20 dinner. In symbols, Ha: μs ≠ μm.
Level of Significance and sample sizes
α = 0.05, ns = 28 and nm = 24
Choice of Statistic and Type of Test
Since both ns < 30 and nm < 30, we use the pooled–variance t–test for the difference between two means, and since Ha
indicates that μs ≠ μm, the problem entails a two–tail test.
Critical Value
Using α = 0.05, two tail test, where df = 28 + 24 – 2 = 50, the critical values are ±2.0086.
Computation
Given Data:
Sailboaters: ns = 28, x̄s = 3.80, ss = 1.22
Motorboaters: nm = 24, x̄m = 4.22, sm = 1.04
Computing sp² first, we have
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(28 - 1)(1.22)^2 + (24 - 1)(1.04)^2}{28 + 24 - 2} = \frac{40.1868 + 24.8768}{50} \approx 1.30. \]
Thus the t–value is
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{3.80 - 4.22}{\sqrt{1.30 \left( \frac{1}{28} + \frac{1}{24} \right)}} \approx -1.32. \]
Decision/Conclusion
Since the absolute value of the computed t (1.32) is less than the critical value 2.0086, we do not reject Ho. There is
no significant difference in the average amount of tip received by the waitress from sailboaters and motorboaters on a
$20 dinner at the 5% level of significance. That is, the data do not show that sailboaters and motorboaters tip
different amounts on the average.
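The pooled-variance arithmetic is easy to slip on by hand, so a small Python sketch that recomputes it from the sample summaries may help. The helper name `pooled_t_test` is hypothetical, and the critical value 2.0086 is taken from the t table (df = 50) as in the example, since the standard library has no t distribution.

```python
from math import sqrt


def pooled_t_test(x1, s1, n1, x2, s2, n2):
    """Return (pooled variance, t, df) assuming equal population variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (x1 - x2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df


# Sailboaters vs. motorboaters, two-tail test at alpha = 0.05.
sp2, t, df = pooled_t_test(3.80, 1.22, 28, 4.22, 1.04, 24)

t_critical = 2.0086  # assumption: read from a t table, not computed
reject_h0 = abs(t) > t_critical

print(f"sp^2 = {sp2:.2f}, t = {t:.2f}, df = {df}, reject Ho: {reject_h0}")
```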
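However, there are many real-life situations where σ1 cannot be assumed equal to σ2. In this case, the t-value is
computed by the formula
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]
with degrees of freedom
\[ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \frac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}. \]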
EXAMPLES
1. A waitress at a yacht club restaurant conducts a study to compare the average amount of tips she receives per person on a
$20 dinner from sailboaters versus motorboaters. In a random sample of 28 sailboaters, she receives a mean of $3.80 with a
standard deviation of $1.22. In a random sample of 24 motorboaters, she receives a mean of $4.22 with a standard
deviation of $1.04. Test if there is a difference in the average tip at the 5% level of significance. (Suppose we cannot assume
that the population variances are equal).
Hypotheses
Ho: There is no significant difference in the average amount of tip received by the waitress from sailboaters and
motorboaters on a $20 dinner. In symbols, Ho: μs = μm.
Ha: There is a significant difference in the average amount of tip received by the waitress from sailboaters and
motorboaters on a $20 dinner. In symbols, Ha: μs ≠ μm.
Level of Significance and sample sizes
α = 0.05, ns = 28 and nm = 24
Choice of Statistic and Type of Test
Since both ns < 30 and nm < 30, we use the separate–variance t–test for the difference between two means, and since Ha
indicates that μs ≠ μm, the problem entails a two–tail test.
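Critical Value
Using α = 0.05, two tail test, with
\[ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \frac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}} = \frac{\left( \frac{1.22^2}{28} + \frac{1.04^2}{24} \right)^2}{\frac{\left( 1.22^2 / 28 \right)^2}{27} + \frac{\left( 1.04^2 / 24 \right)^2}{23}} \approx 50, \]
the critical values are ±2.0086.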
Computation
Given Data:
Sailboaters: ns = 28, x̄s = 3.80, ss = 1.22
Motorboaters: nm = 24, x̄m = 4.22, sm = 1.04
\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{3.80 - 4.22}{\sqrt{\frac{1.22^2}{28} + \frac{1.04^2}{24}}} \approx -1.34 \]
Decision/Conclusion
Since the absolute value of the computed t (1.34) is less than the critical value 2.0086, we do not reject Ho. There is
no significant difference in the average amount of tip received by the waitress from sailboaters and motorboaters on a
$20 dinner at the 5% level of significance. That is, the data do not show that sailboaters and motorboaters tip
different amounts on the average.
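The separate-variance formulas, including the degrees-of-freedom expression, can be sketched in Python as follows. The helper name `separate_variance_t_test` is hypothetical; the critical value 2.0086 is again taken from the t table for df = 50 rather than computed.

```python
from math import sqrt


def separate_variance_t_test(x1, s1, n1, x2, s2, n2):
    """Return (t, df) without assuming equal population variances."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t = (x1 - x2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Sailboaters vs. motorboaters, two-tail test at alpha = 0.05.
t, df = separate_variance_t_test(3.80, 1.22, 28, 4.22, 1.04, 24)

t_critical = 2.0086  # assumption: table value for df = 50
reject_h0 = abs(t) > t_critical

print(f"t = {t:.2f}, df = {df:.2f}, reject Ho: {reject_h0}")
```

The computed df is generally not a whole number; rounding it before using a t table, as done in the example, is a common convention.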
ACTIVITY # 5
KINDLY REFER TO ITEM NOS. 10.7 PAGE 355, 10.11 IN PAGE 356 AND 10.12 IN PAGE 356 OF OUR TEXTBOOK. FOR
SPSS ACTIVITY, KINDLY REFER TO 10.14 AND 10.18 IN PAGE 357.
V. THE PAIRED – DATA TEST
This test is applied when two samples are linked or paired, that is, when they are dependent.
In this test, we are concerned with the differences d between the two linked samples, that is, d = x – y. The problem
then reduces to considering the mean of all possible differences d = x – y for all members of the population to see
whether that mean is actually zero.
To formalize these ideas, we introduce the following notation: μd is the mean of all the possible d’s in the population
and σd is their standard deviation. The null hypothesis is then formulated as Ho: μd = 0. The alternative hypothesis
is formulated in one of the following forms, depending on the type of test: Ha: μd ≠ 0, Ha: μd > 0, or
Ha: μd < 0.
This can also be used to determine whether there is any difference between two sets of measurements on the same
individuals or items.
The test to be used for this case is known as the paired–data test or the paired t–test. It is based on the test
statistic
\[ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}, \]
where d̄ is the mean of all differences and
\[ s_d = \sqrt{\frac{n \sum d^2 - \left( \sum d \right)^2}{n(n - 1)}}, \]
with degrees of freedom df = n – 1.
Note: The paired t–test can still be used when n > 30.
EXAMPLES
1. A study is conducted to compare the pricing at two competing food stores. Twelve common grocery items are chosen at
random and the price (in dollars) of each is noted at the two stores, as follows:
Item 1 2 3 4 5 6 7 8 9 10 11 12
Store A 0.89 0.59 1.29 1.50 2.49 0.65 0.99 1.99 2.25 0.50 1.99 1.79
Store B 0.95 0.55 1.49 1.69 2.39 0.79 0.99 1.79 2.39 0.59 2.19 1.99
Test whether there is a mean difference in the prices at the two stores at the 2% level of significance.
Hypotheses
Ho: There is no significant mean difference in the prices at the two stores.
Ho: μd = 0
Ha: There is a significant mean difference in the prices at the two stores.
Ha: μd ≠ 0
Level of significance and sample size
α = 0.02, n = 12
Choice of Statistic/Type of Test
Since we want to determine whether there is any difference between the two sets of measurements (prices) for the same
items, we use the paired t–test with a two–tail test.
Critical Value
With df = n – 1 = 11, the critical values are ±2.718.
Computation
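Here x is the Store A price and y is the Store B price.
Item    d = x – y    d²
1       -0.06        0.0036
2        0.04        0.0016
3       -0.20        0.0400
4       -0.19        0.0361
5        0.10        0.0100
6       -0.14        0.0196
7        0           0
8        0.20        0.0400
9       -0.14        0.0196
10      -0.09        0.0081
11      -0.20        0.0400
12      -0.20        0.0400
Total   Σd = -0.88   Σd² = 0.2586
\[ \bar{d} = \frac{\sum d}{n} = \frac{-0.88}{12} \approx -0.073 \]
From this table, the standard deviation is
\[ s_d = \sqrt{\frac{n \sum d^2 - \left( \sum d \right)^2}{n(n - 1)}} = \sqrt{\frac{12(0.2586) - (-0.88)^2}{12(12 - 1)}} \approx 0.133 \]
and the t–value is
\[ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-0.073 - 0}{0.133 / \sqrt{12}} \approx -1.90 \]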
Decision/Conclusion
Since the absolute value of the computed t (1.90) is less than the critical value 2.718, we cannot reject the null
hypothesis. There is insufficient evidence to indicate any difference between the average prices at the two stores at
the 2% significance level.
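The paired computation can be checked directly from the price table with a short Python sketch. The variable names are illustrative only, and `statistics.stdev` (the sample standard deviation) is used in place of the computational formula for sd; the two are algebraically equivalent. Because no intermediate value is rounded, the sketch gives t of about –1.91, slightly different from the hand result obtained with the rounded –0.073 and 0.133.

```python
from math import sqrt
from statistics import mean, stdev

store_a = [0.89, 0.59, 1.29, 1.50, 2.49, 0.65, 0.99, 1.99, 2.25, 0.50, 1.99, 1.79]
store_b = [0.95, 0.55, 1.49, 1.69, 2.39, 0.79, 0.99, 1.79, 2.39, 0.59, 2.19, 1.99]

# Differences d = x - y for each grocery item (Store A minus Store B).
d = [a - b for a, b in zip(store_a, store_b)]
n = len(d)

d_bar = mean(d)   # mean difference
s_d = stdev(d)    # sample standard deviation of the differences
t = (d_bar - 0) / (s_d / sqrt(n))  # Ho: mu_d = 0

t_critical = 2.718  # assumption: table value for df = 11, alpha = 0.02, two-tail
reject_h0 = abs(t) > t_critical

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.2f}, reject Ho: {reject_h0}")
```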
ACTIVITY # 6
KINDLY REFER TO ITEM NOS. 10.24 PAGE 366, AND 10.27 IN PAGE 367 OF OUR TEXTBOOK. FOR SPSS ACTIVITY,
KINDLY REFER TO 10.24 AND 10.27 IN PAGE 366 – 367.
Prepared By:
Mrs. Jennefer Mirafuentes Piramide
BA100A Teacher