EST - Statistical Inference
Statistical inference is concerned with making
decisions or predictions about population
parameters.
Parameters discussed thus far are the
population mean µ, the population standard
deviation σ, and the binomial proportion p.
FRANCISCO CHAMERA ESTIMATION November 27, 2022 2 / 58
EST - Statistical Inference cont...
Two methods for making inferences about
population parameters are:
Estimation: Estimating or predicting the
value of the parameter.
Hypothesis testing: Making a decision
about the value of a parameter based on some
preconceived idea about what its value might
be.
EST - Estimators
An estimator is a rule, usually expressed as a
formula, that tells us how to calculate an
estimate based on information in the sample.
Since estimators are calculated using
information from the sample observations, they
are also statistics.
EST - Unbiased Point Estimator
Since an estimator is calculated from sample
values, it varies from sample to sample
according to its sampling distribution.
An estimator is unbiased if the mean of its
sampling distribution equals the parameter of
interest. Otherwise it is said to be biased.
An unbiased estimator does not systematically
overestimate or underestimate the target
parameter.
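The idea of bias can be illustrated with a small simulation. The sketch below is only an illustration under assumed values (a normal population with µ = 10 and σ = 2, samples of size 30, and a fixed random seed, none of which come from the slides). It averages three estimators over many samples: the sample mean, the sample variance with divisor n − 1, and the variance with divisor n. The first two should average close to their targets, while the third should come out systematically low.

```python
import random
import statistics

def simulate_bias(mu=10.0, sigma=2.0, n=30, reps=5000, seed=1):
    """Average three estimators over many samples (all inputs are assumed values)."""
    rng = random.Random(seed)
    means, var_n_minus_1, var_n = [], [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        var_n_minus_1.append(statistics.variance(sample))  # divisor n - 1
        var_n.append(statistics.pvariance(sample))         # divisor n
    return (statistics.fmean(means),
            statistics.fmean(var_n_minus_1),
            statistics.fmean(var_n))

avg_mean, avg_var_n_minus_1, avg_var_n = simulate_bias()
print("average of sample means:       ", round(avg_mean, 3))
print("average of variances (n - 1):  ", round(avg_var_n_minus_1, 3))
print("average of variances (n):      ", round(avg_var_n, 3))
```

With these assumed settings the targets are µ = 10 and σ² = 4, so the last average falling below 4 is the systematic underestimation that the definition of bias describes.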
EST - Unbiased Point Estimator cont...
Of all the unbiased estimators, we prefer the
estimator whose sampling distribution has the
smallest spread or variability (as measured by
the variance).
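As a rough illustration of this preference, the sketch below compares two estimators of the centre of a normal population: the sample mean and the sample median. Both are unbiased for a symmetric population, so the comparison comes down to spread. The population (standard normal), the sample size of 25 and the seed are assumptions chosen only for the demonstration.

```python
import random
import statistics

def spread_of_estimators(n=25, reps=4000, seed=2):
    """Return the simulated variances of the sample mean and the sample median."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means), statistics.variance(medians)

var_mean, var_median = spread_of_estimators()
print("variance of sample mean:  ", round(var_mean, 4))
print("variance of sample median:", round(var_median, 4))
```

Under these assumptions the sample mean shows the smaller variance, so it would be the preferred estimator of the two.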
EST - Goodness of an Estimator
The distance between an estimate and the true
value of the parameter is called the error of
estimation.
In this topic, the sample sizes are large,
n ≥ 30.
Hence, by the Central Limit Theorem, the
sampling distributions of our unbiased
estimators will be approximately normal.
EST - Margin of Error
Using standard normal tables, we have,
P(−1.96 < z < 1.96) = 0.95.
Hence, for unbiased estimators with normal
sampling distributions, 95% of all point
estimates will lie within 1.96 standard
deviations of the parameter of interest.
Therefore, with probability 0.95, the
difference between the unbiased point
estimator and the true value of the parameter
will be less than 1.96 standard deviations,
that is, 1.96 standard errors (SE).
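The table lookup can be checked numerically. This short sketch uses `statistics.NormalDist` from the Python standard library in place of printed tables; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve between -1.96 and 1.96
central_area = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
print(round(central_area, 4))
```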
EST - Margin of Error cont...
The 95% margin of error is the maximum
error of estimation, calculated as
1.96 × standard error of the estimator.
It provides a practical upper bound for the
error of estimation.
It is possible that the error of estimation will
exceed this margin of error, but that is very
unlikely.
EST - Point Estimators for µ and p
The point estimator for the population mean µ
is the sample mean x̄, which is unbiased with
standard error estimated as
$SE = \dfrac{s}{\sqrt{n}}$,
where s is the sample standard deviation.
The point estimator for the population
proportion p is the sample proportion p̂ with
$SE = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$,
where q̂ = 1 − p̂.
EST - Point Estimators for µ and p cont...
Parameter         µ                               p
Estimator         x̄                               p̂
Standard Error    $s/\sqrt{n}$                    $\sqrt{\hat{p}\hat{q}/n}$
Margin of Error   $\pm 1.96 \times s/\sqrt{n}$    $\pm 1.96 \times \sqrt{\hat{p}\hat{q}/n}$
Assumption        n ≥ 30                          np̂ > 5, nq̂ > 5
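The summary table translates directly into a few helper functions. This is a minimal sketch rather than a reference implementation: the function names are invented for illustration, and the numbers in the usage lines (s = 12, n = 36, and 10 successes in 200 trials) are hypothetical.

```python
import math

Z_95 = 1.96  # from P(-1.96 < z < 1.96) = 0.95

def se_mean(s, n):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

def se_proportion(x, n):
    """Estimated standard error of the sample proportion: sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

def margin_of_error(se, z=Z_95):
    """95% margin of error by default: 1.96 standard errors."""
    return z * se

def large_sample_ok_mean(n):
    """Check the assumption n >= 30."""
    return n >= 30

def large_sample_ok_proportion(x, n):
    """Check the assumptions n * p_hat > 5 and n * q_hat > 5."""
    p_hat = x / n
    return n * p_hat > 5 and n * (1 - p_hat) > 5

# Hypothetical usage
print(se_mean(12, 36), margin_of_error(se_mean(12, 36)))
print(large_sample_ok_proportion(10, 200))
```

Keeping the assumption checks next to the formulas makes it harder to apply the large-sample margin of error where it is not justified.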
EST - Example 1
1. A home owner randomly samples 64 homes
similar to her own and finds that the average
selling price is K 252,000 with a standard
deviation of K 15,000. Find the point
estimator for the population mean µ and the
margin of error.
2. A random sample of n = 900 observations
from a binomial population produced x = 655
successes. Estimate the binomial proportion p
and calculate the margin of error.
EST - Solution to Example 1
1. n = 64 (> 30), x̄ = 252000 and s = 15000.
Point estimator of µ is x̄ = 252000.
The standard error is
$SE = \dfrac{s}{\sqrt{n}} = \dfrac{15000}{\sqrt{64}} = \dfrac{15000}{8} = 1875$.
The margin of error is
±1.96 × SE = ±1.96 × 1875 = ±3675.
EST - Solution to Example 1 cont...
2. n = 900, $\hat{p} = \dfrac{655}{900} \approx 0.73$, q̂ = 0.27.
Now np̂ = 655 and nq̂ = 245, both greater than 5.
Point estimator of p is p̂ = 0.73.
The standard error is
$SE = \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \sqrt{\dfrac{0.73 \times 0.27}{900}} \approx 0.015$.
The margin of error is
±1.96 × SE = ±1.96 × 0.015 = ±0.0294.
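The hand calculation rounds p̂ and SE at each step. As a check, the sketch below repeats the calculation for x = 655 and n = 900 without intermediate rounding; the variable names are illustrative. The result agrees with the slide to about three decimal places, the small gap being due to rounding.

```python
import math

x, n = 655, 900          # successes and sample size from the example
p_hat = x / n            # unrounded sample proportion
q_hat = 1 - p_hat

se = math.sqrt(p_hat * q_hat / n)   # estimated standard error
me = 1.96 * se                      # 95% margin of error

print("p_hat =", round(p_hat, 4))
print("SE    =", round(se, 4))
print("ME    =", round(me, 4))
```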
EST - Example 2
1. A random sample of n = 50 observations from
a quantitative population produced x̄ = 56.4
and s² = 2.6. Give the best point estimate for
the population mean µ, and calculate the
margin of error.
2. A technician wants to estimate the proportion
of soda cans that are underfilled. He randomly
samples 200 cans of soda and finds 10
underfilled cans. Find the point estimator for
the population proportion p and the margin of
error.
EST - Solution to Example 2
1. n = 50 (> 30), x̄ = 56.4 and $s = \sqrt{2.6} \approx 1.6$.
Point estimator of µ is x̄ = 56.4.
The standard error is
$SE = \dfrac{s}{\sqrt{n}} = \dfrac{1.6}{\sqrt{50}} = \dfrac{1.6}{7.07} = 0.226$.
The margin of error is
±1.96 × SE = ±1.96 × 0.226 = ±0.443.
EST - Solution to Example 2 cont...
2. n = 200, $\hat{p} = \dfrac{10}{200} = 0.05$, q̂ = 0.95.
Now np̂ = 10 and nq̂ = 190.
Point estimator of p is p̂ = 0.05.
The standard error is
$SE = \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \sqrt{\dfrac{0.05 \times 0.95}{200}} = 0.0154$.
The margin of error is
±1.96 × SE = ±1.96 × 0.0154 = ±0.0302.
EST - Interval Estimation
We create an interval (a, b) so that we are
fairly sure that the parameter lies between
these two values.
Here ‘fairly sure’ means ‘with high probability’.
The probability that a confidence interval will
contain the estimated parameter is called the
confidence coefficient, designated by 1 − α.
EST - Interval Estimation cont...
For example, a 95% confidence interval is an
interval whose confidence coefficient, or the
probability that the interval will contain the
estimated parameter, is 1 − α = 0.95.
We can increase or decrease the amount of
certainty by changing the confidence
coefficient.
Other values typically used by experimenters
are 0.90, 0.98 and 0.99.
EST - Constructing Confidence Intervals
Suppose the confidence coefficient is
1 − α = 0.95.
We know that, of all possible values of the
estimator that we might select, 95% of them
will be in the interval
Parameter ± 1.96 × SE .
EST - Constructing Confidence Intervals cont...
Since we do not know the value of the
parameter, we consider constructing the
interval
Estimator ± 1.96 × SE .
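The claim behind this construction is that about 95% of such intervals capture the parameter. A rough simulation can make that concrete. The sketch below assumes a normal population with µ = 50 and σ = 8, samples of size 50 and a fixed seed; these values are illustrative and are not taken from the slides.

```python
import math
import random
import statistics

def coverage(mu=50.0, sigma=8.0, n=50, reps=2000, z=1.96, seed=3):
    """Fraction of simulated intervals x_bar +/- z * SE that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)  # SE estimated from the sample
        if x_bar - z * se < mu < x_bar + z * se:
            hits += 1
    return hits / reps

rate = coverage()
print("fraction of intervals containing mu:", rate)
```

The observed fraction should land near 0.95. It will not equal 0.95 exactly, partly because of simulation noise and partly because s stands in for σ.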
EST - Changing the Confidence Level
To change the confidence coefficient from
(1 − α) = 0.95 to another level, we need to
change the value Z = 1.96, which locates an
area 0.95 at the center of the standard normal
curve, to a different value.
Since the total area is 1, the remaining area in
the two tails is α.
Hence the area for each tail is α/2.
EST - Changing the Confidence Level cont...
The value Z that has tail area α/2 to its right
is denoted Zα/2 and the area between −Zα/2
and Zα/2 is the confidence coefficient 1 − α.
EST - Changing the Confidence Level cont...
The table below gives values of Zα/2 which are
commonly used.
1−α     α       α/2      Zα/2
0.90    0.10    0.05     1.645
0.95    0.05    0.025    1.96
0.98    0.02    0.01     2.33
0.99    0.01    0.005    2.58
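The tabulated values can also be computed rather than looked up. The sketch below uses the inverse cumulative distribution function of the standard normal from Python's `statistics` module; `z_critical` is an illustrative name. To three decimals the last two values come out as 2.326 and 2.576, which the table rounds to 2.33 and 2.58.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Value of Z with area alpha/2 to its right, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```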
EST - Confidence Intervals
The (1 − α)100% large sample confidence
interval is given by
Point Estimator ± Zα/2 × SE,
where Zα/2 is the value with an area α/2 in
the right tail of a standard normal distribution
and SE is the standard error of the estimator.
This formula generates two values: the lower
confidence limit (LCL) and the upper
confidence limit (UCL).
EST - CI for Means and Proportions
For a quantitative population, the confidence
interval for the mean µ is
$\bar{x} \pm Z_{\alpha/2} \times \dfrac{s}{\sqrt{n}}$.
For a binomial population, the confidence
interval for a population proportion p is
$\hat{p} \pm Z_{\alpha/2} \times \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$.
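Both formulas follow the same pattern, point estimator ± Zα/2 × SE, so they can be written as two small functions. The sketch below is one possible arrangement; the function names and the numbers in the usage lines (x̄ = 100, s = 15, n = 100, and 60 successes in 100 trials) are hypothetical.

```python
import math
from statistics import NormalDist

def z_half_alpha(confidence):
    """Critical value with area alpha/2 in the right tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def ci_mean(x_bar, s, n, confidence=0.95):
    """Large-sample confidence interval for a population mean."""
    margin = z_half_alpha(confidence) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def ci_proportion(x, n, confidence=0.95):
    """Large-sample confidence interval for a binomial proportion."""
    p_hat = x / n
    margin = z_half_alpha(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical usage
print(ci_mean(100, 15, 100))
print(ci_proportion(60, 100))
```

Each function returns the pair (LCL, UCL). Because the critical value is computed to full precision, results can differ from table-based hand calculations in the last reported digit.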
EST - Example 3
1. A random sample of n = 50 males showed a
mean average daily intake of dairy products
equal to 756 grams with a standard deviation
of 35 grams. Find a
(a) 95% confidence interval for µ.
(b) 99% confidence interval for µ.
EST - Solution to Example 3
(a) n = 50, x̄ = 756 and s = 35.
$Z_{\alpha/2} = 1.96$ and $SE = \dfrac{s}{\sqrt{n}} = \dfrac{35}{\sqrt{50}} = 4.95$.
Hence the 95% confidence interval is
756 ± 1.96 × 4.95
756 ± 9.7
746.3 < µ < 765.7.
EST - Solution to Example 3 cont...
(b) n = 50, x̄ = 756 and s = 35.
$Z_{\alpha/2} = 2.58$ and $SE = \dfrac{s}{\sqrt{n}} = \dfrac{35}{\sqrt{50}} = 4.95$.
Hence the 99% confidence interval is
756 ± 2.58 × 4.95
756 ± 12.77
743.23 < µ < 768.77.
EST - Example 4
Of a random sample of n = 150 college students,
104 of the students said that they had played on a
soccer team during their primary school years.
Estimate the proportion of college students who
played soccer in their youth with a 98%
confidence interval.
EST - Solution to Example 4
n = 150, $\hat{p} = \dfrac{104}{150} = 0.69$ and q̂ = 0.31.
$Z_{\alpha/2} = 2.33$ and
$SE = \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \sqrt{\dfrac{0.69 \times 0.31}{150}} = 0.0378$.
Hence the 98% confidence interval is
0.69 ± 2.33 × 0.0378
0.69 ± 0.088
0.602 < p < 0.778.
EST - Example 5
A random sample of 985 voters was polled during
an opinion poll conducted by an aspiring BSU
candidate. Of those surveyed, 592 indicated that
they intended to vote for the said candidate in the
upcoming election. Construct a 90% confidence
interval for p, the proportion of voters in the
population who intend to vote for the candidate.
Based on this information, can you conclude that
the candidate will win the election?
EST - Solution to Example 5
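n = 985, $\hat{p} = \dfrac{592}{985} = 0.601$ and q̂ = 0.399.
$Z_{\alpha/2} = 1.645$ and
$SE = \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \sqrt{\dfrac{0.601 \times 0.399}{985}} = 0.0156$.
Hence the 90% confidence interval is
0.601 ± 1.645 × 0.0156
0.601 ± 0.0257
0.5753 < p < 0.6267.
The whole interval lies above 0.5, so the poll
suggests that more than half of the voters
favor the candidate. Provided the sample is
representative of those who actually vote, we
can conclude with 90% confidence that the
candidate will win.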
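The decision step in this example amounts to checking whether the lower confidence limit exceeds 0.5. The sketch below repeats the calculation for x = 592 and n = 985 at 90% confidence without intermediate rounding. The 0.5 threshold is an assumption that the candidate wins with a simple majority, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, confidence = 592, 985, 0.90
p_hat = x / n

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645
se = math.sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - z * se, p_hat + z * se

# Assumed winning rule: more than half of the votes
majority_supported = lower > 0.5

print(round(lower, 4), round(upper, 4), majority_supported)
```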
EST - Estimating the Difference Between Two Means
Sometimes we are interested in comparing the
means of two populations. For example:
The average growth of plants fed using two
different nutrients.
The average scores for students taught with
two different teaching methods.
EST - Estimating the Difference Between Two Means cont...
To make this comparison:
A random sample of size n1 is drawn from
population 1 with mean µ1 and standard
deviation σ1.
Another random sample of size n2 is drawn
from population 2 with mean µ2 and standard
deviation σ2.
EST - Estimating the Difference Between Two Means cont...
We compare the two averages by making
inferences about µ1 − µ2, the difference in the
two population averages.
If the two population averages are the same,
then µ1 − µ2 = 0.
The best estimate of µ1 − µ2 is the difference
between the two sample means x̄1 − x̄2.
EST - The Sampling Distribution of x̄1 − x̄2
The mean of x̄1 − x̄2 is µ1 − µ2.
The standard deviation of x̄1 − x̄2 is
$\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$, which is estimated as $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
If the sampled populations are normally
distributed, then the sampling distribution of
x̄1 − x̄2 is exactly normally distributed,
regardless of the sample size.
EST - The Sampling Distribution of x̄1 − x̄2 cont...
If the sampled populations are not normally
distributed, then the sampling distribution of
x̄1 − x̄2 is approximately normally distributed
when n1 and n2 are both 30 or more, due to
the CLT.
EST - Estimating µ1 − µ2
Point estimate for µ1 − µ2 is x̄1 − x̄2.
The margin of error is $\pm 1.96 \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
A (1 − α)100% large sample confidence
interval for µ1 − µ2 is
$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
EST - Example 6
Average Daily Intakes        Men    Women
Sample size                  50     50
Sample Mean                  756    762
Sample Standard Deviation    35     30
1. Construct a 95% confidence interval for
µ1 − µ2.
2. Could you conclude, based on this confidence
interval, that there is a difference in the
average daily intake of dairy products for men
and women?
EST - Solution to Example 6
1. n1 = n2 = 50, x̄1 = 756, x̄2 = 762, s1 = 35
and s2 = 30.
The 95% confidence interval is
$(\bar{x}_1 - \bar{x}_2) \pm 1.96 \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$(756 - 762) \pm 1.96 \sqrt{\dfrac{35^2}{50} + \dfrac{30^2}{50}}$
−6 ± 12.78 ⇒ −18.78 < µ1 − µ2 < 6.78.
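The calculation above can be reproduced with a short function. This is a sketch with illustrative names (`ci_diff_means`, `contains_zero`); it takes men as population 1 and women as population 2, matching the order of subtraction used in the solution.

```python
import math
from statistics import NormalDist

def ci_diff_means(x1, s1, n1, x2, s2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se

# Men: mean 756, sd 35, n 50; Women: mean 762, sd 30, n 50
lower, upper = ci_diff_means(756, 35, 50, 762, 30, 50)

# If 0 lies inside the interval, equal means cannot be ruled out
contains_zero = lower < 0 < upper

print(round(lower, 2), round(upper, 2), contains_zero)
```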
EST - Solution to Example 6 cont...
2. The confidence interval contains the value
µ1 − µ2 = 0. Therefore, it is possible that
µ1 = µ2. You would not want to conclude that
there is a difference in average daily intake of
dairy products for men and women.
EST - Estimating the Difference Between Two Proportions
Sometimes we are interested in comparing the
proportion of ‘successes’ in two binomial
populations. For example:
The germination rates of untreated seeds and
seeds treated with a fungicide.
The proportion of male and female voters who
favor a particular candidate for governor.
EST - Estimating the Difference Between Two Proportions cont...
To make this comparison:
A random sample of size n1 is drawn from
binomial population 1 with parameter p1.
Another random sample of size n2 is drawn
from binomial population 2 with parameter p2.
EST - The Sampling Distribution of p̂1 − p̂2
The mean of $\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$ is
p1 − p2.
The standard deviation of p̂1 − p̂2 is
$\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$,
which is estimated as
$\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$.
EST - Estimating p1 − p2
Point estimate for p1 − p2 is p̂1 − p̂2.
The margin of error is $\pm 1.96 \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$.
A (1 − α)100% large sample confidence
interval for p1 − p2 is
$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$.
EST - Example 7
Youth Soccer     Male    Female
Sample size      80      70
Played Soccer    65      39
1. Construct a 99% confidence interval for
p1 − p2.
2. Can you conclude that there is a difference in
the proportion of male and female college
students who said that they had played on a
soccer team during their primary school years?
EST - Solution to Example 7
1. n1 = 80, n2 = 70, $\hat{p}_1 = \dfrac{65}{80} = 0.813$,
q̂1 = 0.187, $\hat{p}_2 = \dfrac{39}{70} = 0.557$ and q̂2 = 0.443.
The standard error is
$SE = \sqrt{\dfrac{0.813 \times 0.187}{80} + \dfrac{0.557 \times 0.443}{70}}$
$= \sqrt{0.0019 + 0.0035}$
$= \sqrt{0.0054} = 0.0735$.
EST - Solution to Example 7 cont...
The 99% confidence interval is
(0.813 − 0.557) ± Zα/2 × SE
(0.813 − 0.557) ± 2.58 × 0.0735
0.256 ± 0.190
0.066 < p1 − p2 < 0.446.
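As a cross-check on the hand calculation, the sketch below computes the same interval from the raw counts (65 of 80 males, 39 of 70 females). It works from unrounded sample proportions and an unrounded critical value, so its limits can differ from the hand-calculated ones in the third decimal place. The function name is illustrative.

```python
import math
from statistics import NormalDist

def ci_diff_proportions(x1, n1, x2, n2, confidence=0.99):
    """Large-sample confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

lower, upper = ci_diff_proportions(65, 80, 39, 70)

# An interval entirely above 0 points to p1 > p2
print(round(lower, 3), round(upper, 3), lower > 0)
```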
EST - Solution to Example 7 cont...
2. The confidence interval does not contain the
value p1 − p2 = 0.
Therefore, it is not likely that p1 = p2.
We conclude that there is a difference in the
proportions for males and females i.e., a higher
proportion of males than females played soccer
in their youth.
EST - One Sided Confidence Bounds
Confidence intervals are by their nature
two-sided since they produce upper and lower
bounds for the parameter.
One-sided bounds can be constructed simply
by using a value of Z that puts α rather than
α/2 in the tail of the Z distribution.
The value of Z that puts α in the tail of the Z
distribution is called Zα .
EST - One Sided Confidence Bounds cont...
A (1 − α)100% lower confidence bound (LCB)
is given by
Point Estimator − Zα × SE
where SE is the standard error of the
estimator.
A (1 − α)100% upper confidence bound
(UCB) is given by
Point Estimator + Zα × SE .
EST - Example 8
Find a 95% one-sided upper confidence bound for
the population mean µ given that n = 40, s² = 65
and x̄ = 75.
EST - Solution to Example 8
Upper or lower confidence bound:
x̄ ± Zα × SE.
$Z_{\alpha} = Z_{0.05} = 1.645$, $s = \sqrt{65} = 8.062$
$SE = \dfrac{s}{\sqrt{n}} = \dfrac{8.062}{\sqrt{40}} = \dfrac{8.062}{6.325} = 1.275$.
LCB: 75 − 1.645 × 1.275 = 72.903.
UCB: 75 + 1.645 × 1.275 = 77.097.
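One-sided bounds for a mean can be sketched as two small functions. The critical value here is Zα with α = 0.05 in a single tail, which is about 1.645 (the value listed against a tail area of 0.05 in the table of common Z values). The function names are illustrative, and the inputs are those of the example: x̄ = 75, s² = 65, n = 40.

```python
import math
from statistics import NormalDist

def upper_confidence_bound(x_bar, s, n, confidence=0.95):
    """Point estimator + Z_alpha * SE, with all of alpha in one tail."""
    z_alpha = NormalDist().inv_cdf(confidence)
    return x_bar + z_alpha * s / math.sqrt(n)

def lower_confidence_bound(x_bar, s, n, confidence=0.95):
    """Point estimator - Z_alpha * SE, with all of alpha in one tail."""
    z_alpha = NormalDist().inv_cdf(confidence)
    return x_bar - z_alpha * s / math.sqrt(n)

s = math.sqrt(65)  # sample standard deviation from s^2 = 65
ucb = upper_confidence_bound(75, s, 40)
lcb = lower_confidence_bound(75, s, 40)

print(round(lcb, 3), round(ucb, 3))
```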
EST - Exercise
1. Find a 90% one-sided upper confidence bound
for the population mean µ given that n = 100,
s = 2.3 and x̄ = 1.6.
2. Find a 99% lower confidence bound for the
binomial proportion p when a random sample
of n = 400 trials produced x = 196 successes.
EST - Exercise Cont...
3. Independent random samples of size 50 are
drawn from two quantitative populations,
producing the sample information in the table.
Find a 95% upper confidence bound for the
difference in the two population means.
Sample 1 Sample 2
Sample size 50 50
Sample Mean 12 10
Sample Standard Deviation 5 7