Dr. Kannan A
Department of Chemical Engineering
Indian Institute of Technology Madras
Point Estimation
Montgomery, D. C., G.C. Runger, Applied Statistics and
Probability for Engineers. 5th ed. New Delhi: Wiley-India,
2011.
❖ We have an unknown, possibly abstract population whose
members differ widely in quantifiable features (height,
weight, marks, income, etc.)
❖ The center value of this population is the mean ($\mu$) and the
spread is characterized by the standard deviation ($\sigma$).
❖ Usually, these parameters $\mu$ and $\sigma$ are not known.
❖ Our job is to estimate them.
❖ Hence we take a sample from the population taking
care to ensure that the sample is sufficiently
representative of the population.
The sample elements should have the following features
❖ Randomness
❖ Independence
❖ Identical distribution
❖ Preferably large in number
❖ We have the sample mean and variance with us and
we hope that they are reasonable estimates of
population mean and variance.
❖ We need to find the expected value taken by these
sample statistics.
❖ It is usually preferable that these estimators give unbiased
estimates of the population parameters.
❖ We take random samples to draw inferences about a
population. Let a parameter of this population be $\theta$.
❖ The objective of point estimation is to obtain the most
plausible single numerical value from a sample, which
represents the estimate of the population parameter.
❖ This numerical value, calculated from the sample statistic is
often referred to as the point estimate of the parameter.
❖ Reiterating, let us denote the n random variables in a random
sample from the population as $X_1, X_2, \ldots, X_n$.
❖ The statistic given below is a function of these random
variables and is called a point estimator of $\theta$.
$\hat{\Theta} = h(X_1, X_2, \ldots, X_n)$
❖ After the sample has been selected, the point estimator
takes on a numerical value and yields a point estimate
denoted as $\hat{\theta}$.
❖ This is the point estimate of the population parameter $\theta$.
The statistics
❖ sample mean ($\bar{X}$): $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
❖ sample variance ($S^2$): $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
are the point estimators of the unknown population mean ($\mu$)
and unknown population variance ($\sigma^2$) respectively.
❖ After the sample has been selected, the sample mean $\bar{x}$
and $s^2$ are point estimates ($\hat{\mu}$ and $\hat{\sigma}^2$) of $\mu$ and $\sigma^2$
respectively.
❖ Suppose the sample values are 20, 30, 45, 55, 65, 67, 80.
Then the sample mean (51.71) and sample variance ($21.38^2 \approx 457.2$) are
the point estimates of the unknown population mean ($\mu$)
and population variance ($\sigma^2$) respectively.
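The point estimates quoted for this sample can be checked with a few lines of Python. This is only a minimal sketch using the standard library `statistics` module; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Sample values quoted in the text.
sample = [20, 30, 45, 55, 65, 67, 80]

x_bar = statistics.mean(sample)      # point estimate of the population mean
s2 = statistics.variance(sample)     # sample variance, divisor n - 1
s = statistics.stdev(sample)         # sample standard deviation, sqrt(s2)

print(f"sample mean     = {x_bar:.2f}")
print(f"sample variance = {s2:.2f}")
print(f"sample std dev  = {s:.2f}")
```

The sample standard deviation comes out at about 21.38, so the sample variance is its square, about 457.2.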
❖ If the population's distribution is normal with mean ($\mu$)
and variance ($\sigma^2$), then the sampling distribution of the
sample mean is also normal with mean ($\mu$) and variance ($\sigma^2/n$).
❖ Even if the population probability distribution is not
normal, the sampling distribution of the sample mean tends
to be normal provided the sample size n is reasonably large (n > 30).
❖ The mean of a random sample is also a random
variable and it has a probability distribution.
❖ It would be nice to know the type of probability distribution the
sample mean follows.
❖ However, the parent population's probability distribution is
usually not known. Hence it is not easy to derive the
sample statistic's sampling distribution.
❖ However, if we take a reasonably large sample
size, then the sampling distribution of the mean is still
approximately normal with mean ($\mu$) and variance $\sigma^2/n$.
❖ Even if the parent population were not normal, the large
sample size makes the distribution of the sample
means approximately normal.
The Central Limit Theorem simplifies matters a lot by
stating that even if the original probability distribution of the
population is not normal (i.e. Gaussian), the distribution of the
sample mean tends towards normality provided the sample size is
large (say n > 30).
❖ Also, the normal approximation from the central limit theorem
still works reasonably well for smaller samples if the parent
population distribution does not deviate too much from
normality.
❖ We are indeed fortunate to have the central limit theorem.
❖ If $X_1, X_2, \ldots, X_n$ is a random sample of size n taken from
any (i.e. not necessarily normal) population with mean $\mu$
and variance $\sigma^2$ and if $\bar{X}$ is the sample mean, the limiting
form of the distribution of
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
as $n \to \infty$ is the standard normal distribution.
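The statement above can be explored numerically. The sketch below is an illustration only: the exponential parent population, the sample size of 40, the number of replicates and the seed are all assumptions chosen for the demonstration, not values from the slides. It standardizes many sample means and checks how often Z falls within ±1.96, which would be about 95% for a standard normal variable.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable (assumed value)

def standardized_mean(n):
    """Draw n values from Exponential(rate=1) and return Z = (x_bar - mu) / (sigma / sqrt(n))."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    x_bar = sum(random.expovariate(1.0) for _ in range(n)) / n
    return (x_bar - mu) / (sigma / math.sqrt(n))

n = 40              # sample size (assumed, > 30)
replicates = 20000  # number of independent samples (assumed)
z_values = [standardized_mean(n) for _ in range(replicates)]

# For a standard normal variable, about 95% of values fall within +/- 1.96.
inside = sum(1 for z in z_values if abs(z) <= 1.96) / replicates
print(f"fraction of Z within +/-1.96: {inside:.3f}")
```

The parent population here is strongly skewed, yet the fraction should come out close to 0.95, which is what the central limit theorem leads us to expect for a sample of this size.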
Illustration of the Central Limit Theorem
$\bar{X}$ Outcomes Count $P(\bar{X} = \bar{x})$
1 (1,1) 1 0.027778
1.5 (1,2),(2,1) 2 0.055556
2 (1,3),(3,1),(2,2) 3 0.083333
2.5 (1,4),(4,1),(2,3),(3,2) 4 0.111111
3 (1,5),(5,1),(2,4),(4,2),(3,3) 5 0.138889
3.5 (1,6),(2,5),(3,4),(4,3),(5,2),(6,1) 6 0.166667
4 (2,6),(3,5),(4,4),(5,3),(6,2) 5 0.138889
4.5 (3,6),(4,5),(5,4),(6,3) 4 0.111111
5 (4,6),(5,5),(6,4) 3 0.083333
5.5 (5,6),(6,5) 2 0.055556
6 (6,6) 1 0.027778
Sum 36 1
[Figure: Probability distribution of sample mean when 2 dice are rolled]
Illustration of the Central Limit Theorem
outcome Mean Possible Outcomes
3 1 111
4 1.333 121
5 1.667 221 113
6 2 114 123 222
7 2.333 115 124 133 223
8 2.667 116 125 134 224 233
9 3 126 135 144 225 234 333
10 3.333 136 145 226 235 244 334
11 3.667 326 335 344 425 461 515
12 4 156 246 354 444 525 633
13 4.333 166 256 346 355 445
14 4.667 266 356 446 455
15 5 366 465 555
16 5.333 466 556
17 5.667 566
18 6 666
Illustration of the Central Limit Theorem
outcome Mean Frequency of Occurrence Total Probability
3 1 1 1 0.00463
4 1.333 3 3 0.013889
5 1.667 3 3 6 0.02778
6 2 3 6 1 10 0.04630
7 2.333 3 6 3 3 15 0.06944
8 2.667 3 6 6 3 3 21 0.09722
9 3 6 6 3 3 6 1 25 0.11574
10 3.333 6 6 3 6 3 3 27 0.125
11 3.667 6 3 3 6 6 3 27 0.125
12 4 6 6 6 1 3 3 25 0.11574
13 4.333 3 6 6 3 3 21 0.09722
14 4.667 3 6 3 3 15 0.06944
15 5 3 6 1 10 0.04630
16 5.333 3 3 6 0.027778
17 5.667 3 3 0.01389
18 6 1 1 0.00463
Sum 216 1
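The counts and probabilities in the dice tables above can be reproduced by brute-force enumeration. The following is a minimal sketch; the function name `mean_distribution` is illustrative and not from the slides. It uses exact fractions, so no rounding enters the probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def mean_distribution(num_dice, faces=6):
    """Exact distribution of the mean of `num_dice` fair dice, by enumerating all ordered outcomes."""
    counts = Counter(sum(roll) for roll in product(range(1, faces + 1), repeat=num_dice))
    total = faces ** num_dice
    # Map each possible mean to (number of ordered outcomes, probability).
    return {Fraction(s, num_dice): (c, Fraction(c, total)) for s, c in sorted(counts.items())}

for k in (2, 3):
    print(f"{k} dice")
    for mean, (count, prob) in mean_distribution(k).items():
        print(f"  mean={float(mean):.3f}  count={count:3d}  p={float(prob):.5f}")
```

Increasing `num_dice` further shows the distribution of the mean becoming progressively more bell-shaped, which is the point of the illustration.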
[Figure: Probability distribution for averages of outcomes when tossing 3 fair dice; x-axis: Mean Value, y-axis: Probability]
❖ If the sample is large, then the sampling distribution of the
mean is approximately normal even if the original population is not normal.
❖ For small sample sizes, the sampling distribution is also
approximately normal provided the parent population does
not exhibit a great deviation from normality.
❖ If the parent population is normal, the sampling distribution
is also normal even for small n.
We wish to compare sample statistics taken from two
independent normal populations having parameters {$\mu_1$; $\sigma_1$},
{$\mu_2$; $\sigma_2$}. It may be shown that a linear function of the
random variables from these two independent populations
is also normally distributed.
To use the following, $\sigma$ is taken to be known and we usually
speculate on the value of $\mu$.
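If the linear function of these independent sample statistics is
the difference between the sample means, then
$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_{\bar{X}_1} - \mu_{\bar{X}_2} = \mu_1 - \mu_2$
$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$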
The resulting sampling distribution of the difference in
means is also normal with the above mean and variance.
❖ If the two populations are not normally distributed, then
the sample size comes into play.
❖ Here it is assumed that the populations from which the
random samples were drawn do not deviate greatly
from the normal distribution.
If $n_1 > 30$ and $n_2 > 30$, then we assume that the two independent
sampling distributions are approximately normal and a linear
combination of them also behaves approximately normally with
mean and variance as given previously.
Consider two independent populations with parameters
{$\mu_1$, $\sigma_1$} and {$\mu_2$, $\sigma_2$}. Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means
of two independent random samples of
sizes $n_1$ and $n_2$ drawn from these two populations.
Then the sampling distribution of
$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
is approximately standard normal if the conditions of the
central limit theorem apply. If the two populations are normal,
then the sampling distribution of Z is exactly standard normal.
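The mean, variance and Z statistic for the difference of two sample means can be sketched as a small helper. The function name `diff_of_means_z` and every number passed to it are hypothetical, chosen only so that the arithmetic is easy to follow; both populations are assumed normal with known standard deviations.

```python
import math

def diff_of_means_z(xbar1, xbar2, mu1, mu2, sigma1, sigma2, n1, n2):
    """Return (mean, variance, Z) for the difference of two independent sample means."""
    mean_diff = mu1 - mu2                        # mean of (X1_bar - X2_bar)
    var_diff = sigma1**2 / n1 + sigma2**2 / n2   # variances add for independent samples
    z = ((xbar1 - xbar2) - mean_diff) / math.sqrt(var_diff)
    return mean_diff, var_diff, z

# Hypothetical inputs (not from the slides).
mean_diff, var_diff, z = diff_of_means_z(
    xbar1=55.0, xbar2=50.0, mu1=53.0, mu2=50.0,
    sigma1=3.0, sigma2=4.0, n1=9, n2=16,
)
print(mean_diff, var_diff, round(z, 4))
```

With these inputs each sample mean has variance 1, so the difference has variance 2 and the observed difference of 5 sits about 1.41 standard deviations above its expected value of 3.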
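Sample size   Parent distribution                 Statistic   Mean    Variance       Sampling distribution
Large (>30)   Normal                              $\bar{X}$   $\mu$   $\sigma^2/n$   Normal
Small (<30)   Normal                              $\bar{X}$   $\mu$   $\sigma^2/n$   Normal
Large (>30)   Different from normal               $\bar{X}$   $\mu$   $\sigma^2/n$   Approximately normal
Small (<30)   Only slightly deviant from normal   $\bar{X}$   $\mu$   $\sigma^2/n$   Approximately normal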
To use these, $\sigma$ is taken to be known and we usually speculate on the
value of $\mu$.
❖ The sample mean and variance give us estimates of the
population's mean and variance respectively.
❖ They are not meant to give us estimates of the sampling
distribution's parameters.
Hence the sample mean is expected to give us the population
mean ($\mu$) and the sample variance is expected to give us the
population variance ($\sigma^2$) and, remember, not the sampling
distribution variance, viz. $\sigma^2/n$. Hence
$E(\bar{X}) = \mu$
$E(S^2) = \sigma^2$
❖ The sample mean and variance are determined from
the available sample data.
The bias of a point estimator $\hat{\Theta}$ is given by
$E(\hat{\Theta}) - \theta$
This shows the difference between the expected value of the
point estimator $\hat{\Theta}$ and the actual population parameter $\theta$.
Hence if the estimator, i.e. the sample mean, is an unbiased
estimator, then $E(\bar{X}) = \mu$ and the bias is zero.
Similarly the estimator $S^2$ is also an unbiased estimator for
the population variance $\sigma^2$ (not the sampling distribution
variance, which is $\sigma^2/n$) because
$E(S^2) = \sigma^2$
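The unbiasedness of the sample mean and of $S^2$ can be verified exactly on a small example. The sketch below assumes a population consisting of the six faces of a fair die and samples of size 3 drawn with replacement; these choices and the function name `expected_values` are illustrative assumptions, not taken from the slides. Exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]                # assumed population: one fair die
mu = Fraction(sum(faces), len(faces))     # population mean
sigma2 = sum((Fraction(x) - mu) ** 2 for x in faces) / len(faces)  # population variance

def expected_values(n):
    """Exact E(X_bar), E(S^2) and E(SS/n) over all equally likely samples of size n."""
    samples = list(product(faces, repeat=n))   # i.i.d. sampling with replacement
    e_mean = e_s2 = e_biased = Fraction(0)
    for sample in samples:
        x_bar = Fraction(sum(sample), n)
        ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squared deviations
        e_mean += x_bar
        e_s2 += ss / (n - 1)       # divisor n - 1: the sample variance S^2
        e_biased += ss / n         # divisor n: shown for comparison
    m = len(samples)
    return e_mean / m, e_s2 / m, e_biased / m

e_mean, e_s2, e_biased = expected_values(3)
print(e_mean == mu, e_s2 == sigma2, e_biased == sigma2 * Fraction(2, 3))
```

Averaged over all possible samples, the divisor n - 1 recovers the population variance exactly, whereas the divisor n underestimates it by the factor (n - 1)/n.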
❖ Let us estimate any population parameter from a
sample using its appropriate estimator.
❖ We also need to report the precision of the
estimate.
❖ The standard error is usually reported as the precision
of the estimate.
The standard error of an estimator is nothing but the
standard deviation of the estimator, i.e.
$\sigma_{\hat{\Theta}} = \sqrt{V(\hat{\Theta})}$
❖ However, we do not know the value of the point
estimator's variance.
❖ That is, the standard error ($\sigma_{\hat{\Theta}}$) contains
unknown parameters, here $V(\hat{\Theta})$.
❖ If it may be estimated, then substitution of the
estimated value gives us the estimated standard
error $\hat{\sigma}_{\hat{\Theta}}$.
❖ Let us have a normal distribution with mean $\mu$ and
variance $\sigma^2$.
❖ We know that the sampling distribution of the mean
($\bar{X}$) is also normal with
❖ mean $E(\bar{X}) = \mu$ and
❖ variance $V(\bar{X}) = \sigma^2/n$.
We are now reporting the estimate ($\bar{X}$) of the population mean, viz. $\mu$.
Hence the standard error of the sample mean is
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Note that $\sqrt{V(\bar{X})} = \sigma_{\bar{X}}$
❖ Since $\sigma$ is unknown, we take the next available option,
viz. substitute $S^2$ for the population variance $\sigma^2$, so that
we get the estimated value of the standard error for the
sample mean ($\bar{X}$) as $\hat{\sigma}_{\bar{X}} = \frac{S}{\sqrt{n}}$
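As a quick numerical illustration, the estimated standard error can be computed for the seven sample values quoted earlier in the text (20, 30, 45, 55, 65, 67, 80). This is a minimal sketch using the standard library; the variable names are illustrative.

```python
import math
import statistics

sample = [20, 30, 45, 55, 65, 67, 80]   # sample values quoted earlier in the text

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # square root of S^2 (divisor n - 1)
se_hat = s / math.sqrt(n)               # estimated standard error of the sample mean

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, estimated standard error = {se_hat:.2f}")
```

For this sample the estimated standard error is about 8.08, which would be reported alongside the point estimate of 51.71.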
When the estimator, here the sample mean for instance,
follows the normal distribution, either because the parent
population is normal or courtesy of the central limit theorem
(for sample size > 30), then we can be reasonably confident that
the true value of the population mean lies within
2 standard errors of the estimate, viz. $\bar{X}$.
When the estimator is not normally distributed but is
unbiased, then by Chebyshev's inequality the estimate of the
parameter will deviate from the true value by as much as 4
standard errors at most 1/16, i.e. about 6%, of the time.
❖ Hence a highly conservative statement is that the true
value of the parameter differs from the point estimate by
at most 4 standard errors.
❖ It is important that we minimize this standard error to get
precise estimates of the population parameter.
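The two rules of thumb above can be illustrated on the sample quoted earlier in the text. This is a sketch only: with just seven observations and an estimated rather than known standard error, the bands are indicative and not exact probability statements. The helper name `band` is illustrative.

```python
import math
import statistics

sample = [20, 30, 45, 55, 65, 67, 80]   # sample values quoted earlier in the text
x_bar = statistics.mean(sample)
se_hat = statistics.stdev(sample) / math.sqrt(len(sample))  # estimated standard error

def band(k):
    """Interval x_bar +/- k estimated standard errors."""
    return x_bar - k * se_hat, x_bar + k * se_hat

# Chebyshev: P(|estimate - true value| >= 4 standard errors) <= 1 / 4^2
chebyshev_bound = 1 / 4**2

print("within 2 standard errors:", band(2))
print("within 4 standard errors:", band(4))
print("Chebyshev bound for 4 standard errors:", chebyshev_bound)
```

The 4 standard error band is twice as wide as the 2 standard error band, which is the price paid for not assuming normality; a smaller standard error narrows both.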