
Confidence Interval Estimation

The document discusses confidence interval estimation and provides details on: 1) Confidence interval estimates can be calculated for the mean and proportion of a population based on a sample. 2) The general formula for a confidence interval is [point estimate ± critical value × standard error], where the critical value depends on the desired confidence level. 3) For large samples from a normal population, the confidence interval for the mean uses the normal distribution. For small samples or unknown variance, the t-distribution is used.

Uploaded by

Saurabh Sharma

Confidence Interval Estimation

• Point estimation and confidence interval estimation
• Confidence interval estimates for the mean
• Confidence interval estimates for the proportion
• Sample size decision in estimating population mean
• Sample size decision in estimating population proportion
Statistical Estimation
• We take data from a sample and draw conclusions about the population from which the sample was drawn.
• A sample statistic is used to estimate an unknown parameter.
• There are two types of estimation:
• Point estimation
  ◦ Calculation of a single value of a sample statistic
• Confidence interval estimation
  ◦ Calculation of an interval using a sample statistic
  ◦ The interval is calculated at a desired level of confidence, e.g. 95% or 99%; the level cannot be 100%
  ◦ Sample-to-sample variation (the standard error) is also taken into consideration
Confidence Interval Estimates
• Let θ be an unknown parameter.
• Suppose T is the point estimate of θ.
• Also, E(T) = θ, i.e. T is unbiased.
• Fix the confidence level at (1 - α) × 100%.
• Suppose that the confidence interval estimate of θ is obtained as [T - h, T + h].
• This means that P(T - h ≤ θ ≤ T + h) = 1 - α.
• α is the probability of "error".
• (1 - α) is called the confidence coefficient.
• Thus, for a 95% confidence level, α = 0.05.
• The general formula for all confidence intervals is:
[UE ± CV × SE]
• where
  ◦ UE = unbiased (point) estimate of the unknown parameter
  ◦ CV = critical value (discussed later)
  ◦ SE = standard error of the estimator
• i.e., Lower Confidence Limit = UE - CV × SE
• Upper Confidence Limit = UE + CV × SE
[Figure: the point estimate lies midway between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval.]
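As a minimal sketch of the general formula, the helper below turns a point estimate, a critical value and a standard error into the two confidence limits. The function name `confidence_interval` and the numbers in the usage line are made up for illustration; they do not come from the slides.

```python
def confidence_interval(ue, cv, se):
    """Return (lower, upper) limits of UE ± CV × SE."""
    margin = cv * se  # half-width of the interval
    return ue - margin, ue + margin


# Illustrative (made-up) inputs: UE = 50, CV = 1.96, SE = 2
lower, upper = confidence_interval(ue=50.0, cv=1.96, se=2.0)
print(f"[{lower:.2f}, {upper:.2f}]  width = {upper - lower:.2f}")
```

The width of the interval is always 2 × CV × SE, so it shrinks when the standard error shrinks and grows when a higher confidence level (larger critical value) is requested.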
• Using the Central Limit Theorem, for large sample size,
$$ Z = \frac{\text{Unbiased Estimator} - \text{Parameter}}{\text{Standard Error}} \sim N(0,1) $$
• Fix the confidence level at (1 - α) × 100%.
• The critical value $z_{\alpha/2}$ is defined as follows:
• For Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.
• Applying this to Z = (UE - Parameter)/SE implies
$$ P\left(-z_{\alpha/2} < \frac{UE - \text{Parameter}}{SE} < z_{\alpha/2}\right) = 1 - \alpha $$
or
$$ P\left(-z_{\alpha/2} \times SE < UE - \text{Parameter} < z_{\alpha/2} \times SE\right) = 1 - \alpha $$
or
$$ P\left(UE - z_{\alpha/2} \times SE < \text{Parameter} < UE + z_{\alpha/2} \times SE\right) = 1 - \alpha $$
• Thus the (1 - α) × 100% confidence interval estimate of the unknown parameter is given by
• $[UE - z_{\alpha/2} \times SE,\ UE + z_{\alpha/2} \times SE]$
Confidence Interval for Population Mean μ
(σ Known)
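• When
  ◦ the population standard deviation σ is known
  ◦ the population is normally distributed
  ◦ or, if the population is not normal, the sample size is large
• the (1 - α) × 100% confidence interval estimate of μ is given by
$$ \left[\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] $$
• where $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, Z ~ N(0,1).
[Figure: N(0,1) density for 1 - α = 0.95, with area α/2 = 0.025 in each tail and critical values -z_{α/2} = -1.96 and z_{α/2} = 1.96 on either side of 0.]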


Commonly used confidence levels and corresponding critical values (N(0,1) distribution)
Confidence Level   Confidence Coefficient (1 - α)   α       Critical Value z_{α/2}
80%                0.80                             0.20    1.28
90%                0.90                             0.10    1.645
95%                0.95                             0.05    1.96
98%                0.98                             0.02    2.33
99%                0.99                             0.01    2.58
99.8%              0.998                            0.002   3.09
99.9%              0.999                            0.001   3.29
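Critical values like those tabulated here can be recomputed rather than looked up. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical. It returns the value z with P(-z < Z < z) equal to the requested confidence coefficient.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided N(0,1) critical value z_{alpha/2} for a confidence coefficient."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in the upper tail, so take the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```

Printed tables usually round these quantiles to two or three decimals, so small differences in the last digit are expected.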
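Sampling Distribution of the Mean
[Figure: normal sampling distribution of the sample mean x̄, centered at μ_x̄ = μ, with central area 1 - α and area α/2 in each tail. Below it, confidence intervals $\left[\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}\right]$ computed from different samples are drawn against the value of x̄ for each sample; (1 - α) × 100% of such intervals will contain μ.]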
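The statement that (1 - α) × 100% of intervals contain μ can be checked by simulation. The sketch below is an illustration under assumed conditions: a normal population with hypothetical parameters μ = 2.2 and σ = 0.35, samples of size 11, and σ treated as known. The function name `coverage` is also hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 2.2, sigma = 0.35; samples of size 11
print(coverage(mu=2.2, sigma=0.35, n=11, confidence=0.95, trials=10_000))
```

The printed fraction should be close to 0.95. Any single interval either contains μ or does not; the confidence level describes the long-run proportion across samples.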
• Example:
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms.
• We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.
• Ans.
$$ \bar{x} \pm z_{0.025}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 $$
i.e. (1.9932, 2.4068)
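The same calculation can be sketched in Python. The function name `mean_ci_known_sigma` is hypothetical, and the critical value is taken from the standard library's `NormalDist` rather than from a table, so it is 1.95996... instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """z-interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # critical value × standard error
    return xbar - margin, xbar + margin


# Circuit example: n = 11, mean 2.20 ohms, sigma = 0.35 ohms
low, high = mean_ci_known_sigma(xbar=2.20, sigma=0.35, n=11)
print(f"({low:.4f}, {high:.4f})")
```

To four decimals this agrees with the hand calculation, since the difference between 1.96 and the unrounded quantile is negligible here.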
Confidence Interval for Population Mean μ
(σ Unknown)
• Estimate σ by the sample standard deviation s1, whose square is the unbiased estimate of σ²:
$$ s_1 = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} $$
• Case 1: n is small
  ◦ The value of s1 varies from sample to sample
  ◦ This introduces extra variability
  ◦ The normal distribution cannot be used
  ◦ We use the t distribution
• Case 2: n is large
  ◦ When n is large, the t distribution approaches the normal distribution
  ◦ We use the N(0,1) distribution
Case 1: σ is unknown and n is small
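• Assumption: the population has a normal distribution
• The (1 - α) × 100% confidence interval estimate of μ is given by
$$ \left[\bar{x} - t_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}\frac{s_1}{\sqrt{n}}\right] $$
• where $t_{\alpha/2}$ is such that
• for T ~ t(n-1), $P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1 - \alpha$.
[Figure: t(n-1) density centered at 0, with central area 1 - α between -t_{α/2} and t_{α/2} and area α/2 in each tail.]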
Some critical values t_{α/2} of the t(n-1) distribution for given α and d.f. (n-1)
d.f. (n-1)   Critical value at α = 0.05   Critical value at α = 0.10
1            12.706                       6.314
2            4.303                        2.920
3            3.182                        2.353
4            2.776                        2.132
5            2.571                        2.015
6            2.447                        1.943
7            2.365                        1.895
8            2.306                        1.860
9            2.262                        1.833
10           2.228                        1.812
• Consider the same example
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms.
• The population standard deviation is not known.
• The sample standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.
• Ans. With n - 1 = 10 d.f., the critical value is $t_{0.025} = 2.228$, so
$$ \bar{x} \pm t_{0.025}\frac{s_1}{\sqrt{n}} = 2.20 \pm 2.228\,(0.35/\sqrt{11}) = 2.20 \pm 0.2351 $$
i.e. (1.9649, 2.4351)
• If we are given s² (the sample variance computed with divisor n), we can use the following formula:
$$ s_1^2 = \frac{n}{n-1}\,s^2 $$
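A minimal sketch of the t-interval follows. Python's standard library has no t quantile function, so the critical value is passed in as an argument; the value 2.228 used below is the standard tabulated two-sided 5% critical value for n - 1 = 10 degrees of freedom, which is the d.f. that a sample of 11 implies. The function names are hypothetical.

```python
from math import sqrt


def mean_ci_t(xbar, s1, n, t_crit):
    """t-interval for the mean; t_crit is t_{alpha/2} with n - 1 d.f. (from a table)."""
    margin = t_crit * s1 / sqrt(n)
    return xbar - margin, xbar + margin


def s1_from_divisor_n_variance(s_squared, n):
    """Convert a variance computed with divisor n into s1 (divisor n - 1)."""
    return sqrt(n / (n - 1) * s_squared)


# Circuit example: n = 11, so 10 d.f.; tabulated t_{0.025} = 2.228
low, high = mean_ci_t(xbar=2.20, s1=0.35, n=11, t_crit=2.228)
print(f"({low:.4f}, {high:.4f})")
```

The interval is wider than the z-interval for the same data, reflecting the extra variability from estimating σ by s1.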
Case 2: σ is unknown and n is large
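• Assumption: the population has a normal distribution
• For large n this assumption is not critical, since the Central Limit Theorem applies.
• The (1 - α) × 100% confidence interval estimate of μ is given by
$$ \left[\bar{x} - z_{\alpha/2}\frac{s_1}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s_1}{\sqrt{n}}\right] $$
• where $z_{\alpha/2}$ is such that
• for Z ~ N(0,1), $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.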
Confidence Interval Estimate of μ

σ         n       Population distribution   Interval
known     small   normal                    $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
known     large   any                       $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$
unknown   small   normal                    $\bar{x} \pm t_{\alpha/2}\,s_1/\sqrt{n}$
unknown   large   any                       $\bar{x} \pm z_{\alpha/2}\,s_1/\sqrt{n}$
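The case selection for the mean can be written as a small decision function. This is only a sketch: the slides do not state where "small" ends and "large" begins, so the cutoff `large_n=30` below is an assumption (a common rule of thumb), and the function name and returned labels are hypothetical.

```python
def choose_interval(sigma_known, n, population_normal, large_n=30):
    """Return which interval formula applies, or None if no case covers the inputs.

    large_n = 30 is an assumed rule-of-thumb cutoff, not a value from the slides.
    """
    n_is_large = n >= large_n
    if not n_is_large and not population_normal:
        return None  # small sample from a non-normal population: not covered
    if sigma_known:
        return "z with sigma"
    return "z with s1" if n_is_large else "t with s1"


print(choose_interval(sigma_known=True, n=11, population_normal=True))
print(choose_interval(sigma_known=False, n=11, population_normal=True))
print(choose_interval(sigma_known=False, n=80, population_normal=False))
```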
Confidence Intervals for Population Proportion P

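• We know that, for large n,
$$ Z = \frac{p - P}{\sqrt{PQ/n}} \sim N(0,1) $$
where p is the sample proportion and Q = 1 - P.
• For Z ~ N(0,1), we have
$$ P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha $$
or
$$ P\left(-z_{\alpha/2} < \frac{p - P}{\sqrt{PQ/n}} < z_{\alpha/2}\right) = 1 - \alpha $$
or
$$ P\left(p - z_{\alpha/2}\sqrt{PQ/n} < P < p + z_{\alpha/2}\sqrt{PQ/n}\right) = 1 - \alpha $$
• Thus the (1 - α) × 100% CI estimate of P is given by
$$ \left[p - z_{\alpha/2}\sqrt{PQ/n},\ p + z_{\alpha/2}\sqrt{PQ/n}\right] $$
• This expression itself contains P, which is unknown, so the interval cannot be computed as it stands.
• We therefore replace P by its unbiased estimate, the sample proportion p.
• Then the (1 - α) × 100% CI estimate of P is given by
$$ \left[p - z_{\alpha/2}\sqrt{pq/n},\ p + z_{\alpha/2}\sqrt{pq/n}\right] $$
• where q = 1 - p.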
• Example:
• A random sample of 100 people shows that 25 have opened an IRA (individual retirement arrangement) this year.
• Construct a 95% confidence interval for the true proportion of the population who have opened an IRA.
• Ans.
$$ p \pm z_{0.025}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25\,(0.75)/100} = 0.25 \pm 1.96\,(0.0433) $$
i.e. (0.1651, 0.3349)
Sample Size Decision
(when Estimating μ)
• We have seen (for sufficiently large n) that
$$ \bar{x} \sim N(\mu, \sigma^2/n) \quad\text{or}\quad Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) $$
• Error of estimation: e = |x̄ - μ|
• Fix the confidence level at (1 - α) × 100%.
• Obtain the critical value $z_{\alpha/2}$ using N(0,1) such that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$.
• Then we have
$$ z_{\alpha/2} = \frac{e}{\sigma/\sqrt{n}} \quad\text{or}\quad n = \left(\frac{\sigma\, z_{\alpha/2}}{e}\right)^2 $$
• Thus the sample size for estimating the population mean μ is
$$ n = \left(\frac{\sigma\, z_{\alpha/2}}{e}\right)^2 $$
• The critical value $z_{\alpha/2}$ can be taken from the table.
• The estimation error e should be fixed by the researcher in advance.
• Clearly, e ≠ 0.
• The population standard deviation σ can be estimated from some other small sample or pilot survey as
• Range/6, or by the sample standard deviation.
• Example:
• In a pilot survey, it is observed that the smallest observation is 6 and the largest observation is 276.
• What sample size is needed to estimate the population mean within ±5 at the 90% confidence level?
• Ans.
Estimate of the population standard deviation: $\hat{\sigma} = (276 - 6)/6 = 45$
Estimation error: e = 5
For the 90% confidence level, the critical value is $z_{0.05} = 1.645$
$$ n = \left(\frac{\hat{\sigma}\, z_{0.05}}{e}\right)^2 = \left(\frac{45 \times 1.645}{5}\right)^2 = 219.19 $$
So n = 220, rounding up so that the error stays within ±5.
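A minimal sketch of the sample-size calculation for the mean follows. The function name `sample_size_mean` is hypothetical. The result is rounded up with `ceil`, which is an explicit choice: a sample size rounded down would leave the estimation error slightly above the target e.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, e, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((sigma * z / e) ** 2)


# Pilot survey: range 6..276, so sigma is estimated as Range/6
sigma_hat = (276 - 6) / 6
n = sample_size_mean(sigma_hat, e=5, confidence=0.90)
print(sigma_hat, n)
```

Because n grows with the square of σ/e, halving the allowed error roughly quadruples the required sample size.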
Sample Size Decision
(when Estimating P)
• Similarly, the sample size for estimating the population proportion P is given by
$$ n = \frac{PQ\,(z_{\alpha/2})^2}{e^2} $$
• For a fixed confidence coefficient (1 - α), the critical value $z_{\alpha/2}$ can be taken from the normal table.
• The estimation error (e = |p - P|) should be fixed by the researcher in advance. Clearly, e ≠ 0.
• The population proportion P can be estimated from some other small sample or pilot survey.
• If no information is available, P can be chosen by the researcher from past experience, or taken as 0.5, which maximizes PQ and so gives the most conservative (largest) sample size.
• Example:
• How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
• (Assume a pilot sample yields p = 0.12)
• Ans.
Estimate of the population proportion: p = 0.12
Estimation error: e = 3/100 = 0.03
For the 95% confidence level, the critical value is $z_{0.025} = 1.96$
$$ n = \frac{pq\,(z_{0.025})^2}{e^2} = \frac{0.12 \times 0.88 \times 1.96 \times 1.96}{0.03 \times 0.03} = 450.75 \approx 451 $$
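The proportion version of the sample-size formula can be sketched as below. The function name `sample_size_proportion` is hypothetical, and the default `p=0.5` encodes the "no information available" case described earlier; the result is rounded up.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(e, confidence, p=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p * (1 - p) / n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * z ** 2 / e ** 2)


# Defective-proportion example: pilot p = 0.12, error 0.03, 95% confidence
print(sample_size_proportion(e=0.03, confidence=0.95, p=0.12))
# With no pilot information, fall back to p = 0.5
print(sample_size_proportion(e=0.03, confidence=0.95))
```

The second call shows how much larger the sample must be when nothing is known about P in advance.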
• Estimating Total:
• In auditing, one is often more interested in estimating the population total amount.
• Its point estimate is $N\bar{x}$, where N is the population size.
• The CI estimate at the (1 - α) × 100% confidence level is given by
$$ N\bar{x} \pm N\,t_{\alpha/2}\frac{s_1}{\sqrt{n}} \quad \text{(small sample size, normal distribution)} $$
$$ N\bar{x} \pm N\,z_{\alpha/2}\frac{s_1}{\sqrt{n}} \quad \text{(large sample size)} $$
• The finite population correction (fpc) should be used when n/N > 0.05:
$$ N\bar{x} \pm N\,t_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(small sample size, normal distribution)} $$
$$ N\bar{x} \pm N\,z_{\alpha/2}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(large sample size)} $$
• Example: A firm has a population of 1000 accounts and wishes to estimate the total population value.
• A sample of 80 accounts is selected, with an average balance of $87.6 and a standard deviation of $22.3.
• Find the 95% confidence interval estimate of the total balance.
• Ans: N = 1000, n = 80, x̄ = 87.6, s1 = 22.3. Since n/N = 0.08 > 0.05, the fpc is used:
$$ N\bar{x} \pm N\,z_{0.025}\frac{s_1}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = (1000)(87.6) \pm (1000)(1.96)\frac{22.3}{\sqrt{80}}\sqrt{\frac{1000-80}{1000-1}} = 87,600 \pm 4,689.51 $$
i.e. (82,910.49, 92,289.51)
• Estimating Total Difference:
• An auditor may wish to estimate the magnitude of errors.
• An error is the difference between the value reached during the audit and the original value recorded.
• A sample of n items is collected.
• Let D_i denote the error in the i-th item (i = 1, 2, …, n).
  ◦ D_i = 0 if the auditor finds that the original value is correct
  ◦ D_i > 0 if the audited value is larger than the original value
  ◦ D_i < 0 if the audited value is smaller than the original value
• Define:
$$ \bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i \quad\text{and}\quad s_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(D_i - \bar{D})^2} $$
• The point estimate of the total difference is $N\bar{D}$.
• CI estimate of the total difference:
$$ N\bar{D} \pm N\,t_{\alpha/2}\frac{s_D}{\sqrt{n}} \quad \text{(for small samples, normal distribution)} $$
$$ N\bar{D} \pm N\,z_{\alpha/2}\frac{s_D}{\sqrt{n}} \quad \text{(for large samples)} $$
• The fpc should be used when n/N > 0.05:
$$ N\bar{D} \pm N\,t_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(for small samples, normal distribution)} $$
$$ N\bar{D} \pm N\,z_{\alpha/2}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(for large samples)} $$
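The sketch below computes D̄ and s_D from a list of per-item differences and then forms the interval for the total difference. Everything in the usage lines is hypothetical: the ten differences, the population size N = 500 and the function name `total_difference_ci` are invented for illustration. The critical value 2.262 is the standard tabulated two-sided 5% t value for 9 d.f., matching the sample of 10.

```python
from math import sqrt
from statistics import mean, stdev


def total_difference_ci(diffs, N, cv, fpc_threshold=0.05):
    """CI for the total difference N*D_bar, with fpc when n/N > threshold."""
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation, divisor n - 1
    se = s_d / sqrt(n)
    if n / N > fpc_threshold:
        se *= sqrt((N - n) / (N - 1))  # finite population correction
    point = N * d_bar
    margin = N * cv * se
    return point, point - margin, point + margin


# Hypothetical audit sample: most items are correct (difference 0)
diffs = [0.0, 0.0, 4.5, 0.0, -2.0, 0.0, 0.0, 1.5, 0.0, 0.0]
point, low, high = total_difference_ci(diffs, N=500, cv=2.262)
print(f"point = {point:.2f}, CI = ({low:.2f}, {high:.2f})")
```

Items with no error still count toward n and enter the calculation as zeros, which is why the full list of differences is needed rather than only the nonzero ones.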
• Example:
• Econe Dresses has 1200 inventory items.
• In the past, 15% of the items were incorrectly priced.
• A sample of 120 items was selected.
• The historical cost of each item was compared with the audited value.
• 15 items differ in their historical costs and audited values.
• The summary statistics for the sample are:
n = 120, N = 1200, D̄ = 0.95833, s_D = 25.24482
• Since n/N = 120/1200 = 0.1 > 0.05, we use the fpc.
• The 95% CI is
$$ N\bar{D} \pm N\,z_{0.025}\frac{s_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1200 \times (0.95833) \pm 1200 \times 1.96 \times \frac{25.24482}{\sqrt{120}}\sqrt{\frac{1200-120}{1200-1}} \approx 1,150.00 \pm 5,144.24 $$
i.e. approximately (-3,994.24, 6,294.24)
Summary
• Point estimation and confidence interval estimation
• CI estimates for the population mean (σ known)
• CI estimates for the population mean (σ unknown)
• CI estimates for the population proportion
• Sample size decision in estimating population mean
• Sample size decision in estimating population proportion
• CI estimates for Population Total
• CI estimates for Total Difference
