Session 7
Mathematics for Engineers IV
Outline
Chapter III: Probability and statistics
Unit 11: Hypothesis Testing
Researchers are interested in answering many types of questions. For example,
1. A scientist might want to know whether the earth is warming up.
2. A physician might want to know whether a new medication will lower a
person’s blood pressure.
3. An educator might wish to see whether a new teaching technique is better
than a traditional one.
4. A retail merchant might want to know whether the public prefers a certain
color in a new line of fashion.
5. Automobile manufacturers are interested in determining whether seat belts
will reduce the severity of injuries caused by accidents.
These types of questions can be addressed through statistical hypothesis
testing, which is a decision-making process for evaluating claims about a
population.
11.1 Definition
Hypothesis testing, or significance testing, is a method of testing a claim or
hypothesis about a parameter in a population, using data measured in a sample.
Whenever we have a decision to make about a population characteristic, we make
a hypothesis. There are two types of statistical hypotheses for each situation: the
null hypothesis and the alternative hypothesis.
The null hypothesis, symbolized by H0 , is a statistical hypothesis that states
that there is no difference between a parameter and a specific value, or that there
is no difference between two parameters.
The alternative hypothesis, symbolized by H1 , is a statistical hypothesis that
states the existence of a difference between a parameter and a specific value, or
states that there is a difference between two parameters.
Examples:
1. A medical researcher is interested in finding out whether a new medication
will have any undesirable side effects. The researcher is particularly concerned
with the pulse rate of the patients who take the medication. Will the pulse rate
increase, decrease, or remain unchanged after a patient takes the medication?
Since the researcher knows that the mean pulse rate for the population under
study is 82 beats per minute, the hypotheses for this situation are
H0 : µ = 82 vs H1 : µ ≠ 82 (1)
The null hypothesis specifies that the mean will remain unchanged, and the
alternative hypothesis states that it will be different.
2. Suppose that we want to test the claim that µ is greater than 120 mmHg. Then the
null hypothesis is H0 : µ = 120 mmHg and the alternative hypothesis is H1 : µ > 120 mmHg.
For the null hypothesis we always use equality, since we are comparing µ with a
previously determined mean. For the alternative hypothesis, we have the choices:
<, > or ≠.
11.2 Types of error in Testing hypothesis
We define a Type I error as the event of rejecting the null hypothesis when it
was true. The probability of a type I error (α) is called the significance level. We
define a Type II error (with probability β) as the event of failing to reject the
null hypothesis when it was false.
Note: Larger α results in smaller β, and smaller α results in larger β.
In most hypothesis-testing situations, β cannot be easily computed; however, α
and β are related in that decreasing one increases the other. Statisticians generally
agree on using three arbitrary significance levels: the 0.10, 0.05, and 0.01 levels.
That is, if the null hypothesis is rejected, the probability of a type I error will be
10%, 5%, or 1%, depending on which level of significance is used. Here is another
way of putting it: when α = 0.10, there is a 10% chance of rejecting a true null
hypothesis; when α = 0.05, there is a 5% chance of rejecting a true null hypothesis;
and when α = 0.01, there is a 1% chance of rejecting a true null hypothesis. In a
hypothesis-testing situation, the researcher decides what level of significance to use.
These errors are summarized in table 4.
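The trade-off between α and β can be made concrete with a short calculation. The following is a minimal sketch, assuming a right-tailed z test with known σ and a specific true mean µ1 under the alternative; the function name `type_ii_error` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for a right-tailed z test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # critical value z_alpha
    shift = (mu1 - mu0) / (sigma / sqrt(n))      # standardized distance from mu0 to mu1
    # H0 is not rejected when Z <= z_crit; under mu1, Z is centred at `shift`.
    return std_normal.cdf(z_crit - shift)


# Hypothetical setting: mu0 = 100, true mean 103, sigma = 10, n = 25
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=100, mu1=103, sigma=10, n=25)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

In this setting the printed β grows as α shrinks, which is the behaviour described in the note above.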
11.3 Five steps of hypothesis testing
Consider the following hypotheses
H0 : µ = µ0 vs H1 : µ ≠ µ0 (2)
Five steps of hypothesis testing are as follows:
Step 1. State the hypotheses and identify the claim.
Step 2. Set the criteria for a decision (i.e., find the critical value using the
appropriate table).
Step 3. Compute the test statistic (i.e., the test value).
Step 4. Make a decision.
Step 5. Summarize the results.
Some definitions:
a. A statistical test uses the data obtained from a sample to make a decision
about whether the null hypothesis should be rejected. The numerical value
obtained from a statistical test is called the test value or test statistic.
If the hypothesized mean is µ0 , then the test statistic (z-test) for the mean
is given by
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \] (3)
with the hypothesis rejected at significance level α if |Z| > z_{α/2}.
b. The significance level can be regarded as the probability of false rejection,
an error of type I.
c. The critical value separates the critical region from the noncritical region.
The symbol for critical value is C.V.
d. The critical or rejection region is the range of values of the test value that
indicates that there is a significant difference and that the null hypothesis
should be rejected.
e. The noncritical or nonrejection region is the range of values of the test
value that indicates that the difference was probably due to chance and that
the null hypothesis should not be rejected.
f. A one-tailed test indicates that the null hypothesis should be rejected when
the test value is in the critical region on one side of the mean. A one-tailed
test is either a right-tailed test or a left-tailed test, depending on the direction
of the inequality of the alternative hypothesis.
The hypotheses of a one-tailed test can be written as
H0 : µ = µ0 vs H1 : µ < µ0 (left-tailed test) (4)
H0 : µ = µ0 vs H1 : µ > µ0 (right-tailed test) (5)
Figure 1: Left and right tailed test for α = 0.01, respectively.
g. In a two-tailed test, the null hypothesis should be rejected when the test
value is in either of the two critical regions. The hypotheses of a two-tailed
test can be written as
H0 : µ = µ0 vs H1 : µ ≠ µ0 (two-tailed test) (6)
The critical region is shown in Figure 2.
Figure 2: Critical and Noncritical Regions for α = 0.01
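The z statistic of equation (3) and the rejection rules for left-tailed, right-tailed and two-tailed tests can be sketched in a few lines. This is an illustrative sketch only, assuming σ is known; the function name `z_test`, its `tail` parameter and the sample numbers are hypothetical and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, tail="two"):
    """Return (test value, critical value, reject H0?) for a z test of the mean."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # equation (3)
    std_normal = NormalDist()
    if tail == "two":                             # H1: mu != mu0
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif tail == "right":                         # H1: mu > mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    elif tail == "left":                          # H1: mu < mu0
        crit = std_normal.inv_cdf(alpha)
        reject = z < crit
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, crit, reject


# Hypothetical data: sample mean 52 from n = 36, testing mu0 = 50 with sigma = 5
z, crit, reject = z_test(52.0, 50.0, 5.0, 36, alpha=0.05)
print(f"z = {z:.2f}, critical value = ±{crit:.2f}, reject H0: {reject}")
```

The critical value is computed from the inverse of the standard normal distribution function rather than read from a table, so it may differ from table values in the third decimal place.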
Example 1: Using the table of the standard normal distribution, find the
critical value(s) for each situation and draw the appropriate figure, showing
the critical region.
a. A left-tailed test with α = 0.1
b. A two-tailed test with α = 0.02
c. A right-tailed test with α = 0.005
Solution a.
Step 1: Draw the figure and indicate the appropriate area. Since this is
a left-tailed test, the area of 0.10 is located in the left tail, as shown in
Figure 3.
Step 2: Find the area closest to 0.1000 in the table of standard normal
distribution. That is,
P (Z < k) = 0.1000
In this case, it is 0.1003. Find the z value which corresponds to the area
0.1003. It is k = −1.28.
Solution b.
Step 1: Draw the figure and indicate the appropriate area. In this case,
there are two areas, each equal to α/2 = 0.02/2 = 0.01.
Figure 3: Critical Value and Critical Region for part a of Example 1.
Step 2: For the left z critical value, find the area closest to
α/2 = 0.02/2 = 0.01. That is,
P(Z < k1) = α/2 = 0.01.
In this case, it is 0.0099. Find the z value which corresponds to the area
0.0099. It is k1 = −2.33.
For the right z critical value, find the area closest to
1 − α/2 = 1 − 0.02/2 = 0.9900. That is,
P(Z > k2) = 1 − P(Z ≤ k2) = α/2
⇒ P(Z ≤ k2) = 1 − α/2 = 0.9900.
In this case, it is 0.9901. Find the z values for each of the areas. For
0.0099, k1 = −2.33. For the area of 0.9901, k2 = 2.33. See Figure 4.
Figure 4: Critical Value and Critical Region for part b of Example 1.
Solution c.
Step 1: Draw the figure and indicate the appropriate area. Since this
is a right-tailed test, the area 0.005 is located in the right tail, as shown
in Figure 5.
Figure 5: Critical Value and Critical Region for part c of Example 1.
Step 2: Find the area closest to 1 − α = 1 − 0.005 = 0.9950. That is,
P(Z > k3) = 1 − P(Z ≤ k3) = α
⇒ P(Z ≤ k3) = 1 − α = 0.9950.
In this case, it is 0.9949 or 0.9951.
The two z values corresponding to 0.9949 and 0.9951 are 2.57 and 2.58.
Since 0.9950 is halfway between these two areas, take the average of the
two z values: (2.57 + 2.58)/2 = 2.575. However, 2.58 is most often used.
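The table look-ups in Example 1 can be cross-checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`, whose `inv_cdf` method returns the z value with a given area to its left; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# a. Left-tailed test, alpha = 0.1: area 0.10 to the left of k
k_a = std_normal.inv_cdf(0.10)

# b. Two-tailed test, alpha = 0.02: area 0.01 in each tail
k1 = std_normal.inv_cdf(0.01)
k2 = std_normal.inv_cdf(1 - 0.01)

# c. Right-tailed test, alpha = 0.005: area 0.995 to the left of k3
k3 = std_normal.inv_cdf(1 - 0.005)

print(f"a. k  = {k_a:.2f}")               # a. k  = -1.28
print(f"b. k1 = {k1:.2f}, k2 = {k2:.2f}")  # b. k1 = -2.33, k2 = 2.33
print(f"c. k3 = {k3:.2f}")               # c. k3 = 2.58
```

Rounded to two decimals, the computed values agree with the table values found above, including the commonly used 2.58 for part c.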
Example 2: A machine fills cartons of liquid; the mean fill is adjustable
but the dial on the gauge is not very accurate. The standard deviation
of the quantity of fill is 6 ml. A sample of 30 cartons gave a measured
average content of 570 ml. For this situation, test the claim that the
mean fill of liquid is more than 568 ml. Use α = 10%.
Solution:
The following steps are used:
Step 1: Formulation of hypotheses and identification of claim:
H0 : µ = 568 vs H1 : µ > 568 (claim).
Step 2: Use α = 10% = 0.1 to find the critical value and critical region.
Since H1 : µ > 568, this is a right-tailed test, so the whole area α lies
in the right tail:
P(Z > z_α) = α ⇒ P(Z ≤ z_α) = 1 − α = 0.90 ⇒ z_α ≈ 1.28
Step 3: The value of the test statistic
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{570 - 568}{6/\sqrt{30}} \approx 1.83 \]
Step 4: Make a decision. Since the test value, 1.83, is greater than the
critical value, 1.28, and is in the critical region, the decision is to
reject the null hypothesis.
Step 5: Summarize the results. There is enough evidence to support
the claim that the mean fill of liquid is more than 568 ml at a
significance level of 0.1.
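The arithmetic of Example 2 can be checked with a short script. This is a sketch under the example's own data (σ = 6, n = 30, sample mean 570, µ0 = 568, α = 0.10); it treats the test as right-tailed, so the whole of α is placed in the right tail, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 570, 568, 6, 30, 0.10

z = (x_bar - mu0) / (sigma / sqrt(n))      # test value
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value z_alpha
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
# z = 1.83, critical value = 1.28, reject H0: True
```

The test value also exceeds 1.645, the right-tailed critical value at the stricter level α = 0.05, so the decision to reject does not hinge on the choice between these two cut-offs.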
EXERCISE XI.1
1. A researcher claims that the average wind speed in a certain city is
8 miles per hour. A sample of 32 days has an average wind speed
of 8.2 miles per hour. The standard deviation of the population is
0.6 mile per hour. At α = 0.05, is there enough evidence to reject the
claim?
2. A researcher wishes to test the claim that the average cost of tuition
and fees at a four year public college is greater than $5700. She
selects a random sample of 36 four-year public colleges and finds the
mean to be $5950. The population standard deviation is $659. Is
there evidence to support the claim at α = 0.05?
3. A store manager hypothesizes that the average number of pages a
person copies on the store’s copy machine is less than 40. A sample of
50 customers’ orders is selected. At α = 0.01, is there enough evidence
to support the claim? Assume σ = 30.9.
Remarks:
1. The z test is used when n ≥ 30.
2. When the population standard deviation is unknown, the z test is
not normally used for testing hypotheses involving means.
11.4 Student t distribution
The Student t distribution, or simply the t distribution, is similar to
the standard normal distribution in the following ways.
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal to 0 and are located at the
center of the distribution.
4. The curve never touches the x axis.
The t-distribution differs from the standard normal distribution in
the following ways.
1. The variance is greater than 1.
2. The t distribution is a family of curves based on the degrees of
freedom, which is a number related to sample size.
The degrees of freedom are the number of values that are free to
vary after a sample statistic has been computed, and they tell the
researcher which specific curve to use when a distribution consists
of a family of curves.
3. As the sample size increases, the t distribution approaches the
normal distribution.
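Differences 1 and 3 can be illustrated numerically. The sketch below relies on a standard result that is not stated in these notes: for more than 2 degrees of freedom, the variance of the t distribution is d.f./(d.f. − 2). The function name `t_variance` is hypothetical.

```python
def t_variance(df):
    """Variance of the t distribution with df degrees of freedom (requires df > 2)."""
    if df <= 2:
        raise ValueError("the variance is finite only for df > 2")
    return df / (df - 2)


# The variance is always above 1 and shrinks toward 1 as d.f. grows
for df in (3, 5, 10, 30, 100):
    print(f"d.f. = {df:3d}  ->  variance = {t_variance(df):.3f}")
```

The values stay above 1, the variance of the standard normal distribution, and approach it as the degrees of freedom increase.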
The t test is a statistical test for the mean of a population and is used
when the population is normally or approximately normally
distributed and σ is unknown. The formula for the t test is
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
The degrees of freedom are d.f. = n − 1. When you test hypotheses by
using the t test (traditional method), follow the same procedure as for
the z test.
Step 1. State the hypotheses and identify the claim.
Step 2. Find the critical value(s).
Step 3. Compute the test value.
Step 4. Make the decision to reject or not reject the null hypothesis.
Step 5. Summarize the results.
Remember that the t test should be used when the population is
approximately normally distributed and the population standard
deviation is unknown.
Example: A medical investigation claims that the average number of
infections per week at a hospital in southwestern Pennsylvania is 16.3.
A random sample of 10 weeks had a mean number of 17.7 infections.
The sample standard deviation is 1.8. Is there enough evidence to
reject the investigator’s claim at α = 0.05?
Solution:
Step 1: Formulation of hypotheses and identification of the claim.
H0 : µ = 16.3 (claim) vs H1 : µ ≠ 16.3
Step 2: The critical values are −t_{α/2, n−1} = −2.262 and t_{α/2, n−1} = 2.262 for
α = 0.05 and d.f. = n − 1 = 10 − 1 = 9, because it is a two-tailed test of
hypothesis.
Step 3: The test value is
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{17.7 - 16.3}{1.8/\sqrt{10}} \approx 2.46 \]
Step 4: Reject the null hypothesis, since 2.46 > 2.262.
Step 5: There is enough evidence to reject the claim that the average
number of infections is 16.3.
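The test value in this example can be reproduced as follows. This is a minimal sketch using only the standard library: the critical value 2.262 is taken from the t table as in Step 2, not computed, and the function name `t_statistic` is hypothetical.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Test value T = (x_bar - mu0) / (s / sqrt(n)) with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))


t_value = t_statistic(sample_mean=17.7, mu0=16.3, s=1.8, n=10)
t_crit = 2.262                       # table value for d.f. = 9, two-tailed, alpha = 0.05
reject_h0 = abs(t_value) > t_crit    # two-tailed rejection rule

print(f"T = {t_value:.2f}, reject H0: {reject_h0}")
# T = 2.46, reject H0: True
```

If SciPy is installed, `scipy.stats.t.ppf(0.975, 9)` should give the same critical value without a table; that dependency is an assumption and is not needed for the sketch above.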
EXERCISES XI.2
1. An educator claims that the average salary of substitute teachers in
school districts in Allegheny County, Pennsylvania, is less than $60
per day. A random sample of eight school districts is selected, and
the daily salaries (in dollars) are shown as 60 56 60 55 70 55 60 55.
Is there enough evidence to support the educator’s claim at α = 0.10?
2. A physician claims that joggers’ maximal volume oxygen uptake is
greater than the average of all adults. A sample of 15 joggers has a
mean of 40.6 milliliters per kilogram (ml/kg) and a standard deviation
of 6 ml/kg. If the average of all adults is 36.7 ml/kg, is there
enough evidence to support the physician’s claim at α = 0.05?
Applications of elementary probability and statistics in engineering
1. Analysis of engine performance data.
2. Statistical quality control.
3. Etc
For further explanation of these applications, see Advanced Modern
Engineering Mathematics, Fourth Edition, 2011, by Glyn James.