Confidence Interval Estimation
• Point estimation and confidence interval
estimation
• Confidence interval estimates for the mean
• Confidence interval estimates for the proportion
• Sample size decision in estimating population
mean
• Sample size decision in estimating population
proportion
Statistical Estimation
• We take data from a sample and say something about the
population from which the sample was drawn
• A sample statistic is used to estimate an unknown parameter.
• There are two types of estimation:
• Point Estimation:
Calculation of a single value of a sample statistic
• Confidence Interval Estimation:
Calculation of an interval using a sample statistic
This interval is calculated at a desired level of confidence
• E.g. 95% confidence or 99% confidence; it cannot be 100%
Sample-to-sample variation (standard error) is also taken
into consideration.
Confidence Interval Estimates
• Let θ be an unknown parameter.
• Suppose T is the point estimate of θ
• Also, E(T) = θ, i.e. T is unbiased
• Fix the confidence level at (1 - α) x 100%.
• Suppose that the confidence interval estimate of θ is obtained
as
[T - h, T + h]
• It means that P(T - h ≤ θ ≤ T + h) = 1 - α
• α is the probability of "error".
• (1 - α) is called the confidence coefficient.
• Thus, for a 95% confidence level, α = 0.05.
• The general formula for all confidence intervals is:
[UE ± CV x SE]
• where
UE = Unbiased (Point) Estimate of the unknown parameter
CV = Critical Value (will be discussed later)
SE = Standard Error of the estimator
• i.e., Lower Confidence Limit = UE - CV x SE
• Upper Confidence Limit = UE + CV x SE
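The general formula above can be sketched as a small helper. This is only an illustration: the function name `confidence_interval` and the numbers in the usage line are hypothetical and do not come from the slides.

```python
def confidence_interval(estimate, critical_value, standard_error):
    """Return (lower, upper) limits computed as UE -/+ CV x SE."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin


# Hypothetical inputs: UE = 10.0, CV = 1.96, SE = 0.5
lower, upper = confidence_interval(10.0, 1.96, 0.5)
print(lower, upper)
```

The width of the interval is `upper - lower`, i.e. twice the margin CV x SE.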
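[Figure: the point estimate lies at the centre of the interval, between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]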
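• Using the Central Limit Theorem, for large sample size,
$$Z = \frac{\text{Unbiased Estimator} - \text{Parameter}}{\text{Standard Error}} \sim N(0,1)$$
• Fix the confidence level at (1 - α) x 100%
• The critical value $z_{\alpha/2}$ is defined as below
• For Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.
[Figure: N(0,1) density with central area 1 - α between $-z_{\alpha/2}$ and $z_{\alpha/2}$.]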
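• Since
$$Z = \frac{\text{Unbiased Estimator} - \text{Parameter}}{\text{Standard Error}} \sim N(0,1)$$
• And $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where Z ~ N(0,1).
• This implies
$$P\left(-z_{\alpha/2} < \frac{UE - \text{Parameter}}{SE} < z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(-z_{\alpha/2}\, SE < UE - \text{Parameter} < z_{\alpha/2}\, SE\right) = 1 - \alpha$$
or
$$P\left(UE - z_{\alpha/2}\, SE < \text{Parameter} < UE + z_{\alpha/2}\, SE\right) = 1 - \alpha$$
• Thus the (1 - α) x 100% confidence interval estimate of the
unknown parameter is given by
• $[UE - z_{\alpha/2} \times SE,\ UE + z_{\alpha/2} \times SE]$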
Confidence Interval for Population Mean μ
(σ Known)
• When
Population standard deviation σ is known
Population is normally distributed
If population is not normal, sample size is large
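• The (1 - α) x 100% confidence interval estimate of μ is
given by
$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• where $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, Z ~ N(0,1).
[Figure: N(0,1) density for α = 0.05: central area 1 - α = 0.95 between $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$, with area α/2 = 0.025 in each tail.]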
Commonly used confidence levels and corresponding critical values
(N(0,1) Distribution)
Confidence Level   Confidence Coefficient (1 - α)   α       Critical Value
80%                0.80                             0.20    1.28
90%                0.90                             0.10    1.645
95%                0.95                             0.05    1.96
98%                0.98                             0.02    2.33
99%                0.99                             0.01    2.58
99.8%              0.998                            0.002   3.09
99.9%              0.999                            0.001   3.29
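Critical values such as these can also be computed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided N(0,1) critical value z_{alpha/2} for a confidence coefficient."""
    alpha = 1.0 - confidence
    # z_{alpha/2} cuts off an upper-tail area of alpha/2
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```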
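Sampling Distribution of the Mean
• The sample mean $\bar{x}$ is (approximately) normally distributed with mean $\mu_{\bar{x}} = \mu$ and standard error $\sigma/\sqrt{n}$.
• Different samples give different values of $\bar{x}$ and hence different confidence intervals
$$\left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• (1 - α) x 100% of such intervals will contain μ.
[Figure: sampling distribution of the mean with area 1 - α in the centre and α/2 in each tail, shown above the confidence intervals obtained from different samples.]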
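The claim that about (1 - α) x 100% of the intervals contain μ can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal, the values of `mu`, `sigma`, `n`, the number of trials and the seed are arbitrary choices, and the function name `coverage` is hypothetical.

```python
import random
from statistics import NormalDist, mean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean
        xbar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage(mu=2.2, sigma=0.35, n=11, confidence=0.95, trials=2000)
print(rate)  # expected to be close to 0.95
```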
• Example:
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• We know from past testing that the population standard
deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
• Ans.
$$\bar{x} \pm z_{0.025}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068$$
i.e. (1.9932, 2.4068)
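The same calculation can be reproduced in a few lines. A minimal sketch, assuming the hypothetical helper name `z_interval`; the critical value is computed with `statistics.NormalDist` instead of being read from the table, so it is 1.95996 rather than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """CI for the mean when the population standard deviation is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms, 95% confidence
lower, upper = z_interval(2.20, 0.35, 11, 0.95)
print(round(lower, 4), round(upper, 4))
```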
Confidence Interval for Population Mean μ
(σ Unknown)
• Estimate σ by $s_1$, based on the unbiased estimator of σ², given by
$$s_1 = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
• Case 1: n is small
The value of s1 varies from sample to sample
This introduces extra variability
The normal distribution cannot be used
We use the t distribution
• Case 2: n is large
When n is large, the t distribution approaches the normal distribution
We use the N(0,1) distribution
Case 1: σ is unknown and n is small
• Assumption: Population has normal distribution
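• The (1 - α) x 100% confidence interval estimate of μ is given
by
$$\left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$
• where $t_{\alpha/2}$ is such that
• For T ~ t(n-1), $P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1 - \alpha$.
[Figure: t(n-1) density with central area 1 - α between $-t_{\alpha/2}$ and $t_{\alpha/2}$, and area α/2 in each tail.]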
Some Critical Values of t(n-1) distribution for given α and d.f. (n-1)
d.f. (n-1)   Critical Value at α = 0.05   Critical Value at α = 0.10
1            12.706                       6.314
2            4.303                        2.920
3            3.182                        2.353
4            2.776                        2.132
5            2.571                        2.015
6            2.447                        1.943
7            2.365                        1.895
• Consider the same example
• A sample of 11 circuits from a large normal population
has a mean resistance of 2.20 ohms.
• Population standard deviation is not known.
• Sample standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean
resistance of the population.
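• Ans. With n = 11 there are n - 1 = 10 d.f., for which $t_{0.025} = 2.228$
$$\bar{x} \pm t_{0.025}\frac{s_1}{\sqrt{n}} = 2.20 \pm 2.228\,(0.35/\sqrt{11}) = 2.20 \pm 0.2351$$
i.e. (1.9649, 2.4351)
• If we are given s² (the sample variance with divisor n), we can use the following formula
$$s_1^2 = \frac{n}{n-1}\,s^2$$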
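A short sketch of the small-sample calculation follows. With n = 11 the t distribution has n - 1 = 10 degrees of freedom, and the standard t table gives 2.228 for α = 0.05; that value is passed in as a constant here because the Python standard library has no t quantile function. The names `t_interval`, `s1_from_s` and `T_CRIT_DF10` are assumptions made for this illustration.

```python
from math import sqrt


def t_interval(xbar, s1, n, t_crit):
    """CI for the mean when sigma is unknown; t_crit comes from a t(n-1) table."""
    margin = t_crit * s1 / sqrt(n)
    return xbar - margin, xbar + margin


def s1_from_s(s, n):
    """Convert a divisor-n standard deviation s into the divisor-(n-1) value s1."""
    return s * sqrt(n / (n - 1))


T_CRIT_DF10 = 2.228  # t table value, alpha = 0.05 (two-sided), 10 d.f.

# Circuit example: n = 11, mean 2.20 ohms, s1 = 0.35 ohms
lower, upper = t_interval(2.20, 0.35, 11, T_CRIT_DF10)
print(round(lower, 4), round(upper, 4))
```

The interval is wider than the one obtained with z = 1.96, which reflects the extra variability from estimating σ.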
Case 2: σ is unknown and n is large
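• Assumption: Population has normal distribution
• This assumption is not essential when n is large; the population may have any distribution.
• The (1 - α) x 100% confidence interval estimate of μ is given
by
$$\left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right]$$
• where $z_{\alpha/2}$ is such that
• For Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.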
Confidence Interval Estimate of μ
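Case                  Population distribution   CI estimate of μ
σ known, n small      Normal                    $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
σ known, n large      Any                       $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
σ unknown, n small    Normal                    $\bar{x} \pm t_{\alpha/2}\,s_1/\sqrt{n}$
σ unknown, n large    Any                       $\bar{x} \pm z_{\alpha/2}\,s_1/\sqrt{n}$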
Confidence Intervals for Population Proportion P
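• We know, for large n, that
$$Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1), \qquad Q = 1 - P$$
• For Z ~ N(0,1), we have
$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$
or
$$P\left(-z_{\alpha/2} < \frac{p - P}{\sqrt{PQ/n}} < z_{\alpha/2}\right) = 1 - \alpha$$
or
$$P\left(p - z_{\alpha/2}\sqrt{PQ/n} < P < p + z_{\alpha/2}\sqrt{PQ/n}\right) = 1 - \alpha$$
• Thus the (1 - α) x 100% CI estimate of P is given by
$$\left[p - z_{\alpha/2}\sqrt{PQ/n},\ p + z_{\alpha/2}\sqrt{PQ/n}\right]$$
• This expression itself contains P, which is unknown
• So this CI estimate cannot be computed as it stands.
• We replace P by its unbiased estimate, the sample proportion p
• Then, the (1 - α) x 100% CI estimate of P is given by
$$\left[p - z_{\alpha/2}\sqrt{pq/n},\ p + z_{\alpha/2}\sqrt{pq/n}\right]$$
• where q = 1 - p.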
• Example:
• A random sample of 100 people shows that 25
have opened IRA (individual retirement
arrangement) this year.
• Construct a 95% confidence interval for the true
proportion of population who have opened IRA.
• Ans.
$$p \pm z_{0.025}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433)$$
i.e. (0.1651, 0.3349)
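The proportion interval can be sketched the same way. The helper name `proportion_interval` is hypothetical, and the critical value is computed with `statistics.NormalDist` rather than taken from the table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample CI for a population proportion, using p in place of P."""
    p = successes / n
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * q / n)
    return p - margin, p + margin


# IRA example: 25 of 100 people, 95% confidence
lower, upper = proportion_interval(25, 100, 0.95)
print(round(lower, 4), round(upper, 4))
```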
Sample Size Decision
(when Estimating μ)
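• We have seen (for sufficiently large n) that $\bar{x}$ is approximately normal with mean μ and standard error $\sigma/\sqrt{n}$, or
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
• Error of estimation: $e = |\bar{x} - \mu|$
• Fix the confidence level at (1 - α) x 100%
• Obtain the critical value $z_{\alpha/2}$ using N(0,1) such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
• Then, we have
$$e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad\text{or}\quad n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
• Thus the sample size for estimating the population mean μ is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
• The critical value $z_{\alpha/2}$ can be taken from the table.
• The estimation error (e) should be fixed by the researcher in
advance.
• Clearly, e ≠ 0
• The population standard deviation σ can be estimated from
some other small sample or pilot survey as
• Range/6 or by the sample standard deviation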
• Example:
• In a pilot survey, it is observed that the smallest
observation is 6 and the largest observation is 276.
• What should be the sample size needed to estimate the
population mean within ± 5 with 90% confidence level?
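• Ans.
Estimate of population standard deviation: $\hat{\sigma} = (276 - 6)/6 = 45$
Estimation error: e = 5
For 90% confidence level, critical value $z_{0.05} = 1.645$
So,
$$n = \left(\frac{\hat{\sigma}\, z_{0.05}}{e}\right)^2 = \left(\frac{45 \times 1.645}{5}\right)^2 = 219.19$$
which is rounded up to n = 220 so that the error does not exceed 5.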
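The sample size calculation can be sketched as follows. The function name `sample_size_for_mean` is an assumption, and the result is rounded up with `math.ceil` so that the stated error bound is not exceeded; rounding to the nearest integer would give one observation fewer here.

```python
from math import ceil


def sample_size_for_mean(sigma, e, z):
    """Smallest n with z * sigma / sqrt(n) <= e."""
    return ceil((z * sigma / e) ** 2)


# Pilot survey example: range 276 - 6, sigma estimated as Range/6
sigma_hat = (276 - 6) / 6
n = sample_size_for_mean(sigma_hat, e=5, z=1.645)
print(sigma_hat, n)
```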
Sample Size Decision
(when Estimating P)
• Similarly, the sample size for estimating the population
proportion P is given by
$$n = \frac{PQ\,(z_{\alpha/2})^2}{e^2}$$
• For a fixed confidence coefficient (1 - α), the critical value $z_{\alpha/2}$ can
be taken from the normal table.
• The estimation error (e = |p - P|) should be fixed by the
researcher in advance. Clearly, e ≠ 0
• The population proportion P can be estimated from some other
small sample or pilot survey.
• If no information is available, it can be decided by the
researcher using past experience or can be taken as 0.5.
• Example:
• How large a sample would be necessary to
estimate the true proportion defective in a large
population within ±3%, with 95% confidence?
• (Assume a pilot sample yields p = 0.12)
• Ans.
Estimate of population proportion: p = 0.12
Estimation error: e = 3/100 = 0.03
For 95% confidence level, critical value $z_{0.025} = 1.96$
So,
$$n = \frac{pq\,(z_{0.025})^2}{e^2} = \frac{0.12 \times 0.88 \times 1.96 \times 1.96}{0.03 \times 0.03} = 450.75 \approx 451$$
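A sketch of the same calculation, again rounding up. The function name `sample_size_for_proportion` is hypothetical; the second call illustrates the fallback P = 0.5 mentioned for the case where no pilot information is available.

```python
from math import ceil


def sample_size_for_proportion(p, e, z):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= e."""
    return ceil(p * (1 - p) * z ** 2 / e ** 2)


# Pilot estimate p = 0.12, error within 3 percentage points, 95% confidence
n = sample_size_for_proportion(0.12, 0.03, 1.96)

# With no prior information, p = 0.5 gives the largest required sample
n_worst_case = sample_size_for_proportion(0.5, 0.03, 1.96)
print(n, n_worst_case)
```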
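• Estimating Total:
• In auditing, one is often more interested in estimating the
population total amount.
• The point estimate of it is given by $N\bar{x}$
• The CI estimate at (1 - α) x 100% confidence level is given by
$$N\bar{x} \pm N\, t_{\alpha/2}\frac{s_1}{\sqrt{n}} \quad \text{(small sample size, normal distribution)}$$
$$N\bar{x} \pm N\, z_{\alpha/2}\frac{s_1}{\sqrt{n}} \quad \text{(large sample size)}$$
• The finite population correction (fpc) should be used when n / N > 0.05
$$N\bar{x} \pm N\, t_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(small sample size, normal distribution)}$$
$$N\bar{x} \pm N\, z_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(large sample size)}$$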
Example: A firm has a population of 1000 accounts and
wishes to estimate the total population value.
• A sample of 80 accounts is selected with average
balance of $87.6 and standard deviation of $22.3.
• Find the 95% confidence interval estimate of the total
balance.
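• Ans: N = 1000, n = 80, $\bar{x}$ = 87.6, $s_1$ = 22.3
Since n / N = 80 / 1000 = 0.08 > 0.05, we use fpc
$$N\bar{x} \pm N\, z_{0.025}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = (1000)(87.6) \pm (1000)(1.96)\frac{22.3}{\sqrt{80}}\sqrt{\frac{1000-80}{1000-1}} = 87{,}600 \pm 4{,}689.51$$
i.e. (82,910.49, 92,289.51)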
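The total estimate with the finite population correction can be sketched as below. The function name `total_interval` and its `fpc_threshold` parameter are assumptions; the critical value is passed in explicitly, here z = 1.96 as in the large-sample formula.

```python
from math import sqrt


def total_interval(N, n, xbar, s1, crit, fpc_threshold=0.05):
    """CI for a population total N * xbar; applies the fpc when n / N is large."""
    se = s1 / sqrt(n)
    if n / N > fpc_threshold:
        se *= sqrt((N - n) / (N - 1))  # finite population correction
    point = N * xbar
    margin = N * crit * se
    return point - margin, point + margin


# Accounts example: N = 1000, n = 80, mean 87.6, s1 = 22.3, z = 1.96
lower, upper = total_interval(1000, 80, 87.6, 22.3, 1.96)
print(round(lower, 2), round(upper, 2))
```

Passing a t critical value for n - 1 degrees of freedom instead of 1.96 gives the small-sample version of the same interval.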
• Estimating Total Difference:
• An auditor may wish to estimate the magnitude of
errors
• An error is the difference of the values reached
during audit and the original values recorded.
• A sample of size n items is collected.
• Let Di denote the error in the ith item (i=1,2,…,n).
Di = 0, if the auditor finds that the original value is correct
Di > 0, if the audited value is larger than the original value
Di < 0, if the audited value is smaller than the original value
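• Define:
$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i \quad\text{and}\quad s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(D_i - \bar{D})^2}$$
• Point estimate of total difference is $N\bar{D}$
• CI estimate of total difference
$$N\bar{D} \pm N\, t_{\alpha/2}\frac{s_D}{\sqrt{n}} \quad \text{(for small samples, normal distribution)}$$
$$N\bar{D} \pm N\, z_{\alpha/2}\frac{s_D}{\sqrt{n}} \quad \text{(for large samples)}$$
• The fpc should be used when n / N > 0.05
$$N\bar{D} \pm N\, t_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(for small samples, normal distribution)}$$
$$N\bar{D} \pm N\, z_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(for large samples)}$$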
• Example:
• Econe Dresses has 1200 inventory items.
• In the past 15% items were incorrectly priced.
• A sample of 120 items was selected.
• Historical cost of each item was compared with
the audited value.
• 15 items differ in their historical costs and
audited values.
• These values are as follows:
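n = 120, N = 1200
$\bar{D}$ = 0.95833
$s_D$ = 25.24482
n / N = 120 / 1200 = 0.1 > 0.05,
so we use fpc
95% CI is
$$N\bar{D} \pm N\, z_{0.025}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1200\,(0.95833) \pm 1200\,(1.96)\frac{25.24482}{\sqrt{120}}\sqrt{\frac{1200-120}{1200-1}}$$
which evaluates to approximately 1,150.00 ± 5,144.24, i.e. (-3,994.24, 6,294.24)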
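The total-difference interval can be sketched from the summary statistics quoted in the example. The function name `total_difference_interval` is hypothetical. The `toy` list is invented purely to show how the mean and standard deviation of the differences would be obtained from raw data (items with no error contribute 0); it is not the data of the example, and its critical value 2.365 is the t value for 7 d.f. at α = 0.05.

```python
from math import sqrt
from statistics import mean, stdev


def total_difference_interval(N, n, d_bar, s_d, crit):
    """CI for the total difference N * d_bar, with fpc when n / N > 0.05."""
    se = s_d / sqrt(n)
    if n / N > 0.05:
        se *= sqrt((N - n) / (N - 1))  # finite population correction
    margin = N * crit * se
    return N * d_bar - margin, N * d_bar + margin


# Summary statistics from the example: N = 1200, n = 120, z = 1.96
lower, upper = total_difference_interval(1200, 120, 0.95833, 25.24482, 1.96)
print(round(lower, 2), round(upper, 2))

# Hypothetical raw differences for a population of 100 items, sample of 8
toy = [0.0] * 6 + [5.0, -3.0]
toy_interval = total_difference_interval(100, len(toy), mean(toy), stdev(toy), 2.365)
print(toy_interval)
```

`statistics.stdev` uses the divisor n - 1, which matches the definition of the standard deviation of the differences.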
Summary
• Point estimation and confidence interval estimation
• CI estimates for the population mean (σ known)
• CI estimates for the population mean (σ unknown)
• CI estimates for the population proportion
• Sample size decision in estimating population mean
• Sample size decision in estimating population
proportion
• CI estimates for Population Total
• CI estimates for Total Difference