Chapter VIII:
Test of Hypothesis for a Single Sample
MATH 22
Engineering Data Analysis
Objectives:
At the end of the lesson, the students are expected to
Structure engineering decision-making problems as hypothesis
tests;
Test hypotheses on the mean / difference in means of a normal
distribution using either a Z-test or a t-test procedure;
Test hypotheses on the variance or standard deviation of a normal
distribution;
Test hypotheses on a population proportion / difference in two
population proportions;
Use the P-value approach for making decisions in hypothesis tests;
Compute power and type II error probability;
Explain and use the relationship between confidence intervals and
hypothesis tests.
Statistical Hypothesis
A statistical hypothesis is a statement about the parameters of one or
more populations.
It may also be thought of as a statement about the probability distribution of
a random variable.
A hypothesis is an assumption about a population parameter.
✓ A parameter is a population characteristic such as a mean or a proportion
✓ The parameter must be identified before the analysis begins
The null hypothesis, 𝐻0, states the (numerical) assumption to be tested
[e.g., the average number of TV sets in US homes is at least 3 (𝐻0 : 𝜇 ≥ 3)]. We begin
with the assumption that the null hypothesis is TRUE (similar to the
notion of innocent until proven guilty). It refers to the status quo and
always contains the ‘=’ sign. The null hypothesis may or may not be
rejected.
The alternative hypothesis, 𝐻1, is the opposite of the null hypothesis
[e.g., the average number of TV sets in US homes is less than 3 (𝐻1 : 𝜇 < 3)]. It
challenges the status quo and never contains the ‘=’ sign. The alternative
hypothesis may or may not be accepted.
Statistical Hypothesis
For example, consider the aircrew escape system. Suppose that we are
interested in the burning rate of the solid propellant. Now, the burning rate is
a random variable that can be described by a probability distribution.
For the burning rate, specifically, suppose we are interested in deciding whether or
not the mean burning rate is 50 centimeters per second. We may express this
formally as
H0: μ = 50 centimeters per second
H1: μ ≠ 50 centimeters per second
H0 is called the null hypothesis and H1 is called the alternative hypothesis. Because
H1 uses the ≠ symbol, it is a two-sided alternative hypothesis.
One-sided alternative hypothesis
H0: μ = 50 centimeters per second
H1: μ < 50 centimeters per second or
H0: μ = 50 centimeters per second
H1: μ > 50 centimeters per second
It is important that hypotheses are always statements about the population or
distribution under study, not statements about the sample.
Tests of Statistical Hypotheses
Test of Statistical Hypothesis
Identify the problem
To write the null and alternative hypotheses, translate the claim made
about the population parameter from a verbal statement to a
mathematical statement
State a Hypothesis
State the null hypothesis, then state its opposite, the alternative
hypothesis. The two hypotheses are mutually exclusive and exhaustive.
Sometimes it is easier to form the alternative hypothesis first.
Hypothesis Testing Process (whether or not to reject 𝐻0)
Assume the population mean age is 50 (the null hypothesis, H0: μ = 50). Suppose the
mean of the sample turns out to be x̄ = 20. Is x̄ = 20 consistent with μ = 50? No, not
likely, so we REJECT the null hypothesis.
Reason for Rejecting 𝐻0
(Figure: sampling distribution of the mean centered at μ = 50, with x̄ = 20 far in the
tail.) It is unlikely that we would get a sample mean of this value if the population
mean really were 50; therefore, we reject the null hypothesis that μ = 50.
Test of Statistical Hypothesis
Example: H0: μ = 50 centimeters per second
H1: μ ≠ 50 centimeters per second
Suppose that a sample of n = 10 specimens is tested and that the sample mean burning rate
x̄ is observed. The sample mean is an estimate of the true population mean μ. A value of
the sample mean x̄ that falls close to the hypothesized value of μ = 50 centimeters per
second does not conflict with the null hypothesis that the true mean μ is really 50
centimeters per second. On the other hand, a sample mean that is considerably different
from 50 centimeters per second is evidence in support of the alternative hypothesis H1.
Thus, the sample mean is the test statistic in this case.
The sample mean can take on many different values. Suppose we adopt the following decision rule:
Critical region - values in the interval x̄ < 48.5 or x̄ > 51.5
Acceptance region - values in the interval 48.5 ≤ x̄ ≤ 51.5
Test of Statistical Hypothesis
Critical region
- also known as the rejection region: the set of values of the test statistic
for which the null hypothesis is rejected in favor of the alternative hypothesis
Acceptance region
- the interval of values of the test statistic that is consistent with
the null hypothesis
- values in this region lead to failing to reject the null hypothesis
Critical values
- the boundaries between the critical region and the acceptance
region
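As a minimal sketch (not part of the original slides), the fixed-critical-value decision rule for the burning-rate example can be written as a tiny Python function; the cutoffs 48.5 and 51.5 are the critical values quoted above, and the function name is an arbitrary choice.

```python
# Minimal sketch of the fixed-critical-value decision rule for the burning-rate
# example (critical values 48.5 and 51.5 taken from the slides above).
def decision(xbar, lower=48.5, upper=51.5):
    if xbar < lower or xbar > upper:
        return "reject H0"          # xbar falls in the critical (rejection) region
    return "fail to reject H0"      # xbar falls in the acceptance region

print(decision(51.3))   # fail to reject H0
print(decision(48.1))   # reject H0
```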
Errors in Making Decisions
Type I Error
✓ Rejecting the null hypothesis H0 when it is true
✓ For example, the true mean burning rate of the propellant could be equal to
50 centimeters per second. However, for the randomly selected propellant
specimens that are tested, we could observe a value of the test statistic x̄
that falls into the critical region. We would then reject the null hypothesis
H0 in favor of the alternative H1, when in fact, H0 is really true.
Type II Error
✓ Failing to reject the null hypothesis when it is false
✓ Suppose that the true mean burning rate is different from 50 centimeters
per second, yet the sample mean x̄ falls in the acceptance region. In this
case we would fail to reject H0 when it is false.
Errors in Making Decisions
Probability of Type I Error
α = P(type I error) = P(reject H0 when H0 is true)
✓ Sometimes called the significance level, or the α-error, or the size of the
test.
In the propellant burning rate example, a type I error will occur when either x̄ >
51.5 or x̄ < 48.5 when the true mean burning rate really is μ = 50 centimeters
per second. Suppose that the standard deviation of burning rate is σ = 2.5
centimeters per second and that the distribution of the sample
mean is approximately normal with mean μ = 50 and standard deviation
σ/√n = 2.5/√10 = 0.79. The probability of making a type I error (or
the significance level of our test) is equal to the sum of the areas
in the two tails of this normal distribution:
α = P(X̄ < 48.5 when μ = 50) + P(X̄ > 51.5 when μ = 50)
Errors in Making Decisions
Computing the Type I Error Probability
The z-values that correspond to the critical values 48.5 and 51.5 are
z1 = (48.5 − 50)/0.79 = −1.90 and z2 = (51.5 − 50)/0.79 = 1.90
Therefore,
α = P(Z < −1.90) + P(Z > 1.90) = 0.0287 + 0.0287 = 0.0574
This implies that 5.74% of all random samples would lead to rejection of the hypothesis H0: μ =
50 centimeters per second when the true mean burning rate is really 50 centimeters per second.
We can reduce α by widening the acceptance region. For example, if we make the critical values
48 and 52, the value of α is
α = P(Z < (48 − 50)/0.79) + P(Z > (52 − 50)/0.79) = P(Z < −2.53) + P(Z > 2.53)
= 0.0057 + 0.0057 = 0.0114 (originally 0.0574)
The Impact of Sample Size
We could also reduce α by increasing the sample size. If n = 16, σ/√n = 2.5/√16 = 0.625,
and using the original critical region (48.5 and 51.5), we find
z1 = (48.5 − 50)/0.625 = −2.40 and z2 = (51.5 − 50)/0.625 = 2.40
Therefore, α = P(Z < −2.40) + P(Z > 2.40) = 0.0082 + 0.0082 = 0.0164 (originally 0.0574).
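For readers who want to check these tail areas numerically, the following is a hedged sketch (not from the slides) that computes α with scipy.stats.norm; the function name and argument layout are illustrative choices.

```python
# Illustrative sketch: the type I error probability alpha for the propellant
# example, computed as the two tail areas of the sampling distribution of Xbar.
from math import sqrt
from scipy.stats import norm

def type_i_error(lower, upper, mu0, sigma, n):
    """alpha = P(Xbar < lower) + P(Xbar > upper) when the true mean is mu0."""
    se = sigma / sqrt(n)                                  # standard error of Xbar
    return norm.cdf(lower, mu0, se) + norm.sf(upper, mu0, se)

print(type_i_error(48.5, 51.5, mu0=50, sigma=2.5, n=10))  # ~0.058 (slides: 0.0574, z rounded to 1.90)
print(type_i_error(48.0, 52.0, mu0=50, sigma=2.5, n=10))  # ~0.011 (wider acceptance region)
print(type_i_error(48.5, 51.5, mu0=50, sigma=2.5, n=16))  # ~0.016 (larger sample size)
```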
Errors in Making Decisions
Probability of Type II Error
β = P(type II error) = P(fail to reject H0 when H0 is false)
Computing the Probability of Type II Error
To calculate β we must have a specific alternative value of the mean in mind; suppose the
true mean burning rate is μ = 52 centimeters per second. In the accompanying figure, the normal
distribution on the left is the distribution of the test statistic X̄ when the null
hypothesis H0: μ = 50 is true (this is what is meant by the expression “under H0: μ
= 50”), and the normal distribution on the right is the distribution of X̄ when the
alternative hypothesis is true and the value of the mean is 52 (or “under H1: μ =
52”). A type II error is committed if X̄ falls in the acceptance region when μ = 52:
β = P(48.5 ≤ X̄ ≤ 51.5 when μ = 52)
The z-values corresponding to 48.5 and 51.5 when μ = 52 are
z1 = (48.5 − 52)/0.79 = −4.43 and z2 = (51.5 − 52)/0.79 = −0.63
Therefore,
β = P(−4.43 ≤ Z ≤ −0.63) = P(Z ≤ −0.63) − P(Z ≤ −4.43) = 0.2643 − 0.0000 = 0.2643
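As a hedged sketch (not part of the slides), the same calculation can be done with scipy.stats.norm; the helper below also reproduces the β values on the following slides when the true mean or the sample size is changed.

```python
# Illustrative sketch: the type II error probability beta for the propellant
# example, i.e. the probability that Xbar lands in the acceptance region when
# the true mean is mu_true.
from math import sqrt
from scipy.stats import norm

def type_ii_error(lower, upper, mu_true, sigma, n):
    se = sigma / sqrt(n)
    return norm.cdf(upper, mu_true, se) - norm.cdf(lower, mu_true, se)

print(type_ii_error(48.5, 51.5, mu_true=52.0, sigma=2.5, n=10))  # ~0.264
print(type_ii_error(48.5, 51.5, mu_true=50.5, sigma=2.5, n=10))  # ~0.891 (next slide: 0.8923, rounded z's)
print(type_ii_error(48.5, 51.5, mu_true=52.0, sigma=2.5, n=16))  # ~0.212 (effect of larger n)
```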
Errors in Making Decisions
Computing the Probability of Type II Error
The probability of making a type II error β increases rapidly as the true value
of μ approaches the hypothesized value. For example, suppose that the true value
of the mean is μ = 50.5 and the hypothesized value is H0: μ = 50.
β = P(48.5 ≤ X̄ ≤ 51.5 when μ = 50.5)
The z-values corresponding to 48.5 and 51.5 when μ = 50.5 are
z1 = (48.5 − 50.5)/0.79 = −2.53 and z2 = (51.5 − 50.5)/0.79 = 1.27
Therefore, β = P(−2.53 ≤ Z ≤ 1.27) = P(Z ≤ 1.27) − P(Z ≤ −2.53)
= 0.8980 − 0.0057 = 0.8923
Errors in Making Decisions
Effect of Sample Size on β
In the accompanying figure, the normal distribution on the left is the distribution of X̄
when the mean is μ = 50, and the normal distribution on the right is the distribution
of X̄ when μ = 52. As before, β = P(48.5 ≤ X̄ ≤ 51.5 when μ = 52).
When n = 16, the standard deviation of X̄ is σ/√n = 2.5/√16 = 0.625, and the z-values
corresponding to 48.5 and 51.5 when μ = 52 are
z1 = (48.5 − 52)/0.625 = −5.60 and z2 = (51.5 − 52)/0.625 = −0.80
Therefore, β = P(−5.60 ≤ Z ≤ −0.80) = P(Z ≤ −0.80) − P(Z ≤ −5.60)
= 0.2119 − 0.0000 = 0.2119 (originally 0.2643)
Increasing the sample size results in a decrease in the probability of type II error.
Errors in Making Decisions
These results and a few other similar calculations are summarized in the following table; the
critical values are adjusted to maintain equal α for n = 10 and n = 16. (The results in boxes
were not calculated.) This display and the discussion reveal four important points.
1. The size of the critical region, and consequently the probability of a type I error α, can
always be reduced by appropriate selection of the critical values.
2. Type I and type II errors are related. A decrease in the probability of one type of error always
results in an increase in the probability of the other, provided that the sample size n does not
change.
3. An increase in sample size reduces β, provided that α is held constant.
4. When the null hypothesis is false, β increases as the true value of the parameter approaches
the value hypothesized in the null hypothesis, and decreases as the difference
between the true mean and the hypothesized value increases.
Test of Statistical Hypothesis
Generally, the analyst controls the type I error probability α when he or she selects the critical
values. Thus, it is usually easy for the analyst to set the type I error probability at (or near) any
desired value. Since the analyst can directly control the probability of wrongly rejecting H0, we
always think of rejection of the null hypothesis H0 as a strong conclusion.
Classical Method of Hypothesis Testing
Fixed significance level testing
A widely used procedure in hypothesis testing is to use a type I error or significance level of α
= 0.05. This value has evolved through experience, and may not be appropriate for all
situations.
On the other hand, the probability of type II error β is not a constant, but depends on the true
value of the parameter. It also depends on the sample size that we have selected. Because the
type II error probability β is a function of both the sample size and the extent to which the null
hypothesis H0 is false, it is customary to think of the decision to accept H0 as a weak
conclusion unless we know that β is acceptably small.
“Fail to reject H0” is preferred to “accept H0”.
Failing to reject H0 does not necessarily mean that there is a high probability that H0 is true.
It may simply mean that more data are required to reach a strong conclusion.
Power
The power of a statistical test is the probability of rejecting the null hypothesis H0 when the
alternative hypothesis is true. It is computed as 1 − β and can be interpreted
as the probability of correctly rejecting a false null hypothesis.
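A minimal sketch (not from the slides) computes the power for the propellant example when the true mean is 52, reusing the acceptance region 48.5 to 51.5 from the earlier slides.

```python
# Power = 1 - beta = P(reject H0 | mu = mu_true) for the propellant example.
from math import sqrt
from scipy.stats import norm

se = 2.5 / sqrt(10)                                    # sigma / sqrt(n)
beta = norm.cdf(51.5, 52, se) - norm.cdf(48.5, 52, se) # P(Xbar in acceptance region | mu = 52)
print(round(1 - beta, 3))                              # ~0.736: chance of correctly rejecting H0
```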
P-Values in Hypothesis Tests
The P-value (or probability value) of a hypothesis test is the probability,
assuming the null hypothesis H0 is true, of obtaining a sample statistic with a
value as extreme as or more extreme than the one determined from the sample
data.
The P-value is the smallest level of significance that would lead to
rejection of H0 with the given data.
The P-value approach has been adopted widely in practice to avoid the difficulties
associated with fixed significance level testing.
The P-value of a hypothesis test depends on the nature of the test.
There are three types of hypothesis tests – a left-, right-, or two-tailed test.
The type of test depends on the region of the sampling distribution that
favors a rejection of H0. This region is indicated by the alternative
hypothesis.
P-Values in Hypothesis Tests
Identifying types of Hypothesis Test
1. If the alternative hypothesis contains the less-than inequality symbol (<), the
hypothesis test is a left-tailed test. H0: μ ≥ k ; H1: μ < k.
P is the area to the left of the test statistic.
2. If the alternative hypothesis contains the greater-than symbol (>), the
hypothesis test is a right-tailed test. H0: μ ≤ k ; Ha: μ > k.
P is the area to the right of the test statistic.
3. If the alternative hypothesis contains the not-equal-to symbol (≠), the
hypothesis test is a two-tailed test. In a two-tailed test, each tail has an area of ½P:
P is twice the area to the left of the negative test statistic, or equivalently
twice the area to the right of the positive test statistic.
P-Values in Hypothesis Tests
Identifying types of HypothesisTest
Example:
For each claim, state H0 and Ha. Then determine whether the hypothesis test is
a left-tailed, right-tailed, or two-tailed test.
a.) A cigarette manufacturer claims that less than one-eighth of the US adult
population smokes cigarettes.
H0: p ≥ 0.125   Ha: p < 0.125 (Claim)   Left-tailed test
b.) A local telephone company claims that the average length of a phone
call is 8 minutes.
H0: μ = 8 (Claim)   Ha: μ ≠ 8   Two-tailed test
Decision Rule Based on P-value
To use a P-value to make a conclusion in a hypothesis test, compare the P-value
with α. If P ≤ α, then reject H0. If P > α, then fail to reject H0.
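The three tail types and the decision rule can be combined into a short sketch. This is illustrative only (the function names are arbitrary) and assumes the test statistic follows a standard normal distribution under H0.

```python
# P-value of a standard normal test statistic for each type of test, plus the
# "reject H0 if P <= alpha" decision rule described above.
from scipy.stats import norm

def p_value(z0, tail):
    if tail == "left":
        return norm.cdf(z0)          # area to the left of the test statistic
    if tail == "right":
        return norm.sf(z0)           # area to the right of the test statistic
    return 2 * norm.sf(abs(z0))      # two-tailed: twice the one-tail area

def decide(p, alpha=0.05):
    return "reject H0" if p <= alpha else "fail to reject H0"

p = p_value(-2.10, "left")
print(round(p, 4), decide(p))        # 0.0179 -> reject H0 at alpha = 0.05
```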
General Procedure for Hypothesis Tests
1. Parameter of interest: From the problem context, identify the parameter of
interest.
2. Null hypothesis, H0: State the null hypothesis, H0.
3. Alternative hypothesis, H1: Specify an appropriate alternative hypothesis,
H1.
4. Test statistic: Determine an appropriate test statistic.
5. Reject H0 if: State the rejection criteria for the null hypothesis.
6. Computations: Compute any necessary sample quantities, substitute these into
the equation for the test statistic, and compute that value.
7. Draw conclusions: Decide whether or not H0 should be rejected and report
that in the problem context.
Only three steps are really required:
1. Specify the test statistic to be used (such as Z0).
2. Specify the location of the critical region (two-tailed, upper-tailed, or lower-
tailed)
3. Specify the criteria for rejection (typically, the value of α, or the P-value at which
the rejection should occur).
Connection Between HTs and CIs
There is a close relationship between the test of a hypothesis about any
parameter, say θ, and the confidence interval for θ. If [l, u] is a 100(1 −
α)% confidence interval for the parameter θ, the test of size α of the
hypothesis
H0: θ = θ0
H1: θ ≠ θ0
will lead to rejection of H0 if and only if θ0 is not in the
100(1 − α)% CI [l, u].
A confidence interval provides a range of likely values for the parameter at a stated
confidence level, whereas hypothesis testing is a framework for displaying the risk
levels, such as the P-value, associated with a specific decision.
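A hedged sketch of this duality for a normal mean with known σ (all numbers below are hypothetical): the 100(1 − α)% two-sided confidence interval excludes μ0 exactly when the two-sided z-test rejects H0 at level α.

```python
# CI / hypothesis-test duality for a normal mean with known sigma.
from math import sqrt
from scipy.stats import norm

xbar, sigma, n, mu0, alpha = 51.3, 2.5, 10, 50.0, 0.05   # hypothetical numbers
se = sigma / sqrt(n)
zcrit = norm.ppf(1 - alpha / 2)

l, u = xbar - zcrit * se, xbar + zcrit * se              # 95% confidence interval [l, u]
z0 = (xbar - mu0) / se                                   # two-sided z-test statistic
rejects = abs(z0) > zcrit
outside_ci = mu0 < l or mu0 > u

print((round(l, 2), round(u, 2)), rejects, outside_ci)   # rejection <=> mu0 outside the CI
```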
Tests on the Mean of a Normal
Distribution, σ2 Known
Suppose that we wish to test the hypotheses
H0: μ = μ0
H1: μ ≠ μ0
where μ0 is a specified constant.
Test Statistic
Standardize the sample mean.
Reference distribution is the standard normal distribution.
Usually called a z-test.
Z0 = (X̄ − μ0) / (σ/√n)
Tests on the Mean of a ND, σ2 Known
Figure: the P-value for a z-test, for (a) the two-sided alternative H1: μ ≠ μ0;
(b) the one-sided alternative H1: μ > μ0; (c) the one-sided alternative H1: μ < μ0.
Tests on the Mean of a ND, σ2 Known
Figure: the distribution of Z0 when H0: μ = μ0 is true, with the critical region for
(a) the two-sided alternative H1: μ ≠ μ0; (b) the one-sided alternative H1: μ > μ0;
(c) the one-sided alternative H1: μ < μ0.
Tests on the Mean of a ND, σ2 Known
Large-Sample Test
If n is large (say, n ≥ 40) the sample standard deviation s can be substituted for σ in the
test procedures with little effect.
Although we have developed the test for the mean of a normal distribution with known σ2,
it is easily converted into a large-sample test procedure for unknown σ2 that is valid
regardless of the form of the population distribution.
This conversion relies on the central limit theorem.
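The z-test just described can be sketched in a few lines. This is an illustrative implementation (the sample summary values are hypothetical), using the fixed-significance-level approach with critical values from scipy.stats.norm.ppf.

```python
# Fixed-significance-level z-test on a mean with sigma known (or s substituted
# for sigma when n >= 40, per the large-sample note above).
from math import sqrt
from scipy.stats import norm

def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    z0 = (xbar - mu0) / (sigma / sqrt(n))                # test statistic Z0
    if alternative == "two-sided":
        reject = abs(z0) > norm.ppf(1 - alpha / 2)
    elif alternative == "greater":
        reject = z0 > norm.ppf(1 - alpha)
    else:                                                # "less"
        reject = z0 < norm.ppf(alpha)
    return z0, reject

print(z_test(xbar=51.3, mu0=50, sigma=2.5, n=10))        # (~1.64, False): fail to reject H0
```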
Tests on the Mean of a ND, σ2 Unknown
Test Statistic
T0 = (X̄ − μ0) / (S/√n)
If the null hypothesis is true, T0 has a t distribution with n − 1 degrees of freedom. When
we know the distribution of the test statistic when H0 is true (this is often called the
reference distribution or the null distribution), we can calculate the P-value from
this distribution, or, if we use a fixed significance level approach, we can locate the critical
region to control the type I error probability at the desired level.
Figure: calculating the P-value for a t-test for (a) H1: μ ≠ μ0; (b) H1: μ > μ0; (c) H1: μ < μ0.
For t0 = 2.8 in an upper-tailed test, the P-value is shown to lie between 0.005 and 0.01.
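A hedged sketch of the one-sample t-test with made-up data: scipy.stats.ttest_1samp returns T0 and the two-sided P-value, and the upper-tailed P-value can be taken from the t distribution directly.

```python
# One-sample t-test (sigma unknown) on hypothetical data, H0: mu = 50.
from scipy import stats

sample = [48.9, 50.2, 51.4, 49.7, 50.8, 52.1, 49.3, 50.5]   # made-up measurements
t0, p_two_sided = stats.ttest_1samp(sample, popmean=50.0)
p_upper = stats.t.sf(t0, df=len(sample) - 1)                # P-value for H1: mu > 50
print(round(t0, 3), round(p_two_sided, 4), round(p_upper, 4))
```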
Tests on the variance of a Normal
Distribution
Tests on the σ2 of a ND
To test
H0: σ² = σ0²
H1: σ² ≠ σ0²
we will use the test statistic:
χ0² = (n − 1)S² / σ0²
Figure: the reference distribution (a chi-square distribution with n − 1 degrees of freedom)
for the test of H0: σ² = σ0², with critical region values for
(a) H1: σ² ≠ σ0², (b) H1: σ² > σ0², and (c) H1: σ² < σ0².
Tests on the σ2 of a ND
Tests on the Variance of a Normal Distribution
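As an illustrative sketch of the test defined above (all numbers are hypothetical), the test statistic and an upper-tailed P-value can be computed with scipy.stats.chi2.

```python
# Chi-square test on a normal variance: H0: sigma^2 = sigma0^2 vs H1: sigma^2 > sigma0^2.
from scipy.stats import chi2

n, s2, sigma0_sq = 20, 0.0153, 0.01        # hypothetical sample size and variances
chi2_0 = (n - 1) * s2 / sigma0_sq          # test statistic chi0^2
p_upper = chi2.sf(chi2_0, df=n - 1)        # P-value for the upper-tailed alternative
print(round(chi2_0, 2), round(p_upper, 4))
```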
Tests on a p
We will consider testing
H0: p = p0
H1: p ≠ p0
Test Statistic
Z0 = (X − np0) / √(np0(1 − p0))
Summary of Approximate Tests on a Binomial Proportion
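A hedged sketch of the large-sample (normal-approximation) test on a proportion, with hypothetical counts; the lower-tailed alternative is chosen just for illustration.

```python
# z-test on a binomial proportion: H0: p = p0 vs H1: p < p0 (lower-tailed).
from math import sqrt
from scipy.stats import norm

x, n, p0 = 4, 200, 0.05                          # hypothetical: 4 successes in 200 trials
z0 = (x - n * p0) / sqrt(n * p0 * (1 - p0))      # test statistic Z0
p_value = norm.cdf(z0)                           # lower-tailed P-value
print(round(z0, 2), round(p_value, 4))
```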
Summary
Type II error is incurred by failing to reject a null hypothesis when
it is actually false (also called the β-error).
A one-tailed (one-sided) test is one in which the alternative is H1: θ > θ0 or H1: θ < θ0.
A two-tailed (two-sided) test is one in which the alternative is H1: θ ≠ θ0.
The power of a statistical test is the probability that the test rejects
the null hypothesis when the null hypothesis is indeed false. Thus,
the power is equal to one minus the probability of type II error (1
− β).
The P-value is the smallest level of significance that would lead to
rejection of the null hypothesis H0 with the given data.
Fixed significance level (α) testing is also known as classical
method.
Summary
ONE-SAMPLE TESTS
Mean, Variance Known / Large-Sample Test (n ≥ 40):
Z0 = (X̄ − μ0) / (σ/√n)
Mean, Variance Unknown:
T0 = (X̄ − μ0) / (S/√n)
Variance:
χ0² = (n − 1)S² / σ0²
Proportion:
Z0 = (X − np0) / √(np0(1 − p0))
References:
Larson & Farber. Elementary Statistics: Picturing the World, 3rd ed. © 1999
Mapúa University, Mathematics Department. https://mapuanfiles.weebly.com
Montgomery & Runger. Applied Statistics and Probability for Engineers, 5th ed. © 2011
Ross, et al. Introduction to Probability and Statistics for Engineers and Scientists, 3rd ed. © 2004
Walpole, et al. Probability and Statistics for Engineers and Scientists, 9th ed. © 2012