Probability and Statistics
Chapter 7
Fundamental Sampling Distributions
Dr. Yehya Mesalam 1
Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean is the probability
distribution of the population of the sample means obtainable
from all possible samples of size n from a population of size N
[Figure: histogram of the population; horizontal axis 17 to 25, vertical axis 1 to 6]
Distribution of Sample Means from Samples of Size n = 2
x      f(x)    x·f(x)   x²·f(x)
18     0.25    4.5      81
20     0.25    5        100
22     0.25    5.5      121
24     0.25    6        144
sum    1       21       446
$\mu = E(X) = \sum_x x\,P(x) = 21$
$\sigma^2 = V(X) = E(X^2) - [E(X)]^2 = 446 - 21^2 = 5$
$\sigma = \sqrt{5} = 2.236068 \approx 2.24$
Distribution of Sample Means from Samples of Size n = 2
Sample #   Scores    Mean ($\bar{x}$)
1          18, 18    18
2          18, 20    19
3          18, 22    20
4          18, 24    21
5          20, 18    19
6          20, 20    20
7          20, 22    21
8          20, 24    22
9          22, 18    20
10         22, 20    21
11         22, 22    22
12         22, 24    23
13         24, 18    21
14         24, 20    22
15         24, 22    23
16         24, 24    24
Distribution of Sample Means from Samples of Size n = 2
x̄      f     f(x̄)     x̄·f(x̄)   x̄²·f(x̄)
18     1     0.0625   1.125    20.25
19     2     0.125    2.375    45.125
20     3     0.1875   3.75     75
21     4     0.25     5.25     110.25
22     3     0.1875   4.125    90.75
23     2     0.125    2.875    66.125
24     1     0.0625   1.5      36
sum    16    1        21       443.5
Distribution of Sample Means from Samples of Size n = 2
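$\mu_{\bar{x}} = E(\bar{X}) = \sum_{\bar{x}} \bar{x}\, f(\bar{x}) = 21$
$\mu_{\bar{x}} = \mu = 21$
$\sigma_{\bar{x}}^2 = V(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2 = 443.5 - 21^2 = 2.5$
$\sigma_{\bar{x}} = \sqrt{2.5} = 1.581139$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236068}{\sqrt{2}} = 1.581139$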
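The enumeration can be checked mechanically. The sketch below assumes sampling with replacement and equally likely population values, as in the table of 16 samples; the function name `sampling_distribution_of_mean` is illustrative and not part of the slides.

```python
from itertools import product
from math import isclose, sqrt


def sampling_distribution_of_mean(population, n):
    """Mean and variance of the sample mean over all ordered samples of
    size n drawn with replacement from equally likely population values."""
    means = [sum(sample) / n for sample in product(population, repeat=n)]
    mu_xbar = sum(means) / len(means)
    var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)
    return mu_xbar, var_xbar


population = [18, 20, 22, 24]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

mu_xbar, var_xbar = sampling_distribution_of_mean(population, n=2)
print("population mean, variance:", mu, sigma2)
print("sample-mean mean, variance:", mu_xbar, var_xbar)
# The standard error should agree with sigma / sqrt(n).
print("matches sigma/sqrt(n):", isclose(sqrt(var_xbar), sqrt(sigma2) / sqrt(2)))
```

Changing `population` to `[2, 4, 6, 8]` or `n` to 3 reuses the same check for other cases.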
Sampling Distribution of the Sample Mean
[Figure: histogram of the population; horizontal axis 1 to 9, vertical axis 1 to 6]
Distribution of Sample Means from Samples of Size n = 2
x      f(x)    x·f(x)   x²·f(x)
2      0.25    0.5      1
4      0.25    1        4
6      0.25    1.5      9
8      0.25    2        16
sum    1       5        30
$\mu = E(X) = \sum_x x\,P(x) = 5$
$\sigma^2 = V(X) = E(X^2) - [E(X)]^2 = 30 - 25 = 5$
$\sigma = \sqrt{5} = 2.236068 \approx 2.24$
Distribution of Sample Means from Samples of Size n = 2
Sample #   Scores   Mean ($\bar{x}$)
1          2, 2     2
2          2, 4     3
3          2, 6     4
4          2, 8     5
5          4, 2     3
6          4, 4     4
7          4, 6     5
8          4, 8     6
9          6, 2     4
10         6, 4     5
11         6, 6     6
12         6, 8     7
13         8, 2     5
14         8, 4     6
15         8, 6     7
16         8, 8     8
Distribution of Sample Means from Samples of Size n = 2
x̄      f     f(x̄)     x̄·f(x̄)   x̄²·f(x̄)
2      1     0.0625   0.125    0.25
3      2     0.125    0.375    1.125
4      3     0.1875   0.75     3
5      4     0.25     1.25     6.25
6      3     0.1875   1.125    6.75
7      2     0.125    0.875    6.125
8      1     0.0625   0.5      4
sum    16    1        5        27.5
Distribution of Sample Means from Samples of Size n = 2
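$\mu_{\bar{x}} = E(\bar{X}) = \sum_{\bar{x}} \bar{x}\, f(\bar{x}) = 5$
$\mu_{\bar{x}} = \mu = 5$
$\sigma_{\bar{x}}^2 = V(\bar{X}) = E(\bar{X}^2) - [E(\bar{X})]^2 = 27.5 - 25 = 2.5$
$\sigma_{\bar{x}} = \sqrt{2.5} = 1.581139$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.236068}{\sqrt{2}} = 1.581139$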
Distribution of Sample Means from Samples of Size n = 2
[Figure: histogram of the sample means; horizontal axis: sample mean (1 to 9), vertical axis 1 to 6]
We can use the distribution of sample means to answer probability questions about sample means.
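Distribution of Individuals in Population: $\mu = 5$, $\sigma = 2.24$
[Figure: histogram of the population; horizontal axis 1 to 9, vertical axis 1 to 6]
Distribution of Sample Means: $\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.58$
[Figure: histogram of the sample means; horizontal axis: sample mean (1 to 9), vertical axis 1 to 6]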
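Distribution of Individuals in Population: $\mu = 5$, $\sigma = 2.24$
Distribution of Sample Means: $\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.58$
[Figure: the population histogram and the sample-mean histogram on the same 1 to 9 horizontal scale]
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.24}{\sqrt{2}} = 1.58$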
Sampling Distribution (n = 3)
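[Figure: histogram of the sample means for n = 3; horizontal axis: sample mean (1 to 9), vertical axis 2 to 24]
$\mu_{\bar{X}} = 5$, $\sigma_{\bar{X}} = 1.29$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.24}{\sqrt{3}} = 1.29$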
Distribution of Sample Means
[Figure: histogram of the sample means; horizontal axis: sample mean (1 to 9), vertical axis 1 to 6]
Things to Notice
1. The sample means tend to pile up around the population mean.
2. The distribution of sample means is approximately normal in shape, even though the population distribution was not.
3. The distribution of sample means has less variability than does the population distribution.
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Central Limit Theorem
For any population with mean $\mu$ and standard deviation $\sigma$, the distribution of sample means for sample size n …
1. will have a mean of $\mu$
2. will have a standard deviation of $\sigma/\sqrt{n}$
3. will approach a normal distribution as n approaches infinity
The mean of the sampling distribution:
$\mu_{\bar{X}} = \mu$
The standard deviation of the sampling distribution (“standard error of the mean”):
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
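The three statements can be explored with a small simulation. This is only a sketch: it assumes the four-value population 2, 4, 6, 8 used earlier, draws with replacement, and uses a fixed seed so the run is repeatable. The function name `simulate_sample_means` and the number of trials are illustrative choices.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def simulate_sample_means(population, n, trials, seed=0):
    """Draw `trials` random samples of size n (with replacement) and
    return the list of their sample means."""
    rng = random.Random(seed)
    return [fmean(rng.choices(population, k=n)) for _ in range(trials)]


population = [2, 4, 6, 8]
mu = fmean(population)
sigma = pstdev(population)  # population standard deviation

for n in (2, 3, 30):
    means = simulate_sample_means(population, n, trials=20000)
    # Compare the simulated mean and spread with mu and sigma / sqrt(n).
    print(n, round(fmean(means), 3), round(pstdev(means), 3),
          round(sigma / sqrt(n), 3))
```

As n grows, the simulated standard deviation of the means should track $\sigma/\sqrt{n}$ and a histogram of `means` should look increasingly bell-shaped.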
Clarifying Formulas
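                      Population                      Sample                               Distribution of Sample Means
Mean                  $\mu = \frac{\sum X}{N}$        $\bar{X} = \frac{\sum X}{n}$         $\mu_{\bar{X}} = \mu$
Standard deviation    $\sigma = \sqrt{\frac{SS}{N}}$  $s = \sqrt{\frac{SS}{n-1}}$          $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Notice: $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$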
Confidence Level, (1 − α)
• Suppose the confidence level is 95%
• Also written (1 − α) = 0.95
• A relative frequency interpretation:
  – From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
  – No probability is involved in a specific interval
Confidence Interval for μ
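– Population variance σ² is known: use Z
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• Confidence interval estimate:
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
(where $z_{\alpha/2}$ is the standard normal value that leaves a probability of α/2 in each tail)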
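Finding the Reliability Factor, $z_{\alpha/2}$
• Consider a 95% confidence interval: 1 − α = .95, so α/2 = .025 in each tail
[Figure: standard normal curve with central area 1 − α = .95 and area α/2 = .025 in each tail. Z units: z = −1.96, 0, z = 1.96. X units: lower confidence limit, point estimate, upper confidence limit]
Find $z_{.025} = 1.96$ from the standard normal distribution table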
Common Levels of Confidence
• Commonly used confidence levels are 90%,
95%, and 99%
Confidence Level    Confidence Coefficient, 1 − α    $z_{\alpha/2}$ value
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.58
99.8%               .998                             3.09
99.9%               .999                             3.29
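The tabulated reliability factors can be reproduced from the inverse standard normal CDF. A minimal sketch using Python's standard library follows; the helper name `z_critical` is an illustrative choice, and the printed values may differ from a rounded table in the last digit.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided reliability factor z_{alpha/2} for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```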
Example
• A sample of 11 circuits from a large normal population has a
mean resistance of 2.20 ohms. We know from past testing that
the population standard deviation is 0.35 ohms. Determine a
95% confidence interval for the true mean resistance of the
population.
• Solution:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068$
$1.9932 < \mu < 2.4068$
We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
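The same interval can be computed in a few lines. This sketch assumes σ is known and the population is normal, as the example states; `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Circuit example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms.
low, high = z_interval(2.20, 0.35, 11)
print(round(low, 4), round(high, 4))
```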
Confidence Interval for μ
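• If the population standard deviation σ is unknown and n > 30, use Z:
$z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$
• If σ is unknown and n ≤ 30, use the t distribution:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
where $t_{\alpha/2,\,n-1}$ is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail.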
Choice of Sample Size
• The sample size needed to estimate μ with confidence level (1 − α) is
$n = \left[\frac{z_{\alpha/2}\,\sigma}{E}\right]^2$
• where E is the error of the estimate, $E = |\bar{x} - \mu|$
Example
• Assuming the population standard deviation σ = 3, how large should a sample be to estimate the population mean with a margin of error not exceeding 0.5?
$n = \left[\frac{z_{\alpha/2}\,\sigma}{E}\right]^2$
Solution
• where α = 0.05
• Then from the table, $z_{\alpha/2} = z_{0.025} = 1.96$
• Error: E = 0.5
• Then $n = \left[\frac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\frac{1.96 \times 3}{0.5}\right]^2 = 138.3$
• We need a sample of size at least 139
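The sample-size rule, including the rounding up to the next whole observation, can be sketched as follows. The function name `required_sample_size` is illustrative; the 95% default mirrors α = 0.05 in the solution.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


print(required_sample_size(sigma=3, margin=0.5))
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.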
Student’s t Distribution
• Consider a random sample of n observations
  – with mean $\bar{x}$ and standard deviation s
  – from a normally distributed population with mean μ
• Then the variable
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
follows the Student’s t distribution with (n − 1) degrees of freedom:
d.f. = n − 1
Student’s t Distribution
Note: t → Z as n increases
[Figure: density curves centered at 0 on the t axis for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Example
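A random sample of n = 25 has $\bar{x} = 50$ and s = 8. Form a 95% confidence interval for μ.
• Solution
d.f. = n − 1 = 24, so $t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.0639$
The confidence interval is
$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
$50 - (2.0639)\frac{8}{\sqrt{25}} < \mu < 50 + (2.0639)\frac{8}{\sqrt{25}}$
$46.698 < \mu < 53.302$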
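The arithmetic of the t interval is easy to script. Python's standard library has no inverse t distribution, so this sketch takes the critical value from the table as an input (2.0639 for 24 d.f., as quoted in the solution); `t_interval` is a hypothetical helper name.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean using a tabulated t critical value
    with n - 1 degrees of freedom."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# n = 25, sample mean 50, s = 8, t_{0.025, 24} = 2.0639 (from the t table).
low, high = t_interval(50, 8, 25, t_crit=2.0639)
print(round(low, 3), round(high, 3))
```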
Confidence Intervals for the Population Variance
The random variable
$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$
follows a chi-square distribution with (n − 1) degrees of freedom,
where the chi-square value $\chi^2_{\alpha,\,n-1}$ denotes the number for which
$P(\chi^2_{n-1} > \chi^2_{\alpha,\,n-1}) = \alpha$
Confidence Intervals for the Population Variance
The 100(1 − α)% confidence interval for the population variance is
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
Example
You are testing the speed of a batch of computer processors. You collect the following data (in MHz):
Sample size       17
Sample mean       3004
Sample std dev    74
Assume the population is normal.
Determine the 95% confidence interval for σ².
Solution
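• n = 17, so the chi-square distribution has (n − 1) = 16 degrees of freedom
• α = 0.05, so use the chi-square values with area 0.025 in each tail:
$\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.025,\,16} = 28.85$
$\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{0.975,\,16} = 6.91$
[Figure: chi-square density with 16 degrees of freedom; probability α/2 = .025 lies below $\chi^2_{16} = 6.91$ and probability α/2 = .025 lies above $\chi^2_{16} = 28.85$]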
Solution
• The 95% confidence interval is
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
$\frac{(17-1)(74)^2}{28.85} < \sigma^2 < \frac{(17-1)(74)^2}{6.91}$
$3037 < \sigma^2 < 12683$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.
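A short script makes the variance interval and its conversion to a standard deviation explicit. It is a sketch under the same normality assumption; the chi-square critical values are passed in from a table because the standard library has no inverse chi-square function. The three-decimal table values 28.845 and 6.908 for 16 d.f. are used here, and `variance_interval` is an illustrative name.

```python
from math import sqrt


def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence interval for sigma^2 from tabulated chi-square values:
    chi2_upper has area alpha/2 to its right, chi2_lower has area
    alpha/2 to its left (both with n - 1 degrees of freedom)."""
    sum_sq = (n - 1) * s ** 2
    return sum_sq / chi2_upper, sum_sq / chi2_lower


# Processor example: n = 17, s = 74 MHz, 16 degrees of freedom.
var_low, var_high = variance_interval(74, 17, chi2_upper=28.845, chi2_lower=6.908)
print(round(var_low), round(var_high))                    # limits on sigma^2
print(round(sqrt(var_low), 1), round(sqrt(var_high), 1))  # limits on sigma
```

The larger critical value goes in the lower limit's denominator, which is why the arguments are named by position in the tail rather than by limit.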
Example
The lapping process, which is used to grind certain silicon wafers to the proper thickness, is acceptable only if the population standard deviation of the thickness of dice cut from the wafers is at most 0.50 mil. If the thicknesses of 17 dice cut from such wafers have a standard deviation of 0.78 mil, find 95% confidence limits on σ.
Solution
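$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
$\frac{16 \times 0.78^2}{28.845} < \sigma^2 < \frac{16 \times 0.78^2}{6.908}$
$0.3374 < \sigma^2 < 1.40912$
Taking square roots gives the limits on σ: $0.581 < \sigma < 1.187$ mil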
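Confidence Interval for the Difference between Two Means
σ₁² and σ₂² known: use Z
$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
The confidence interval for μ1 − μ2 is:
$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
σ₁² and σ₂² unknown and n1 + n2 > 30: use Z
The confidence interval for μ1 − μ2 is:
$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$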
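Confidence Interval for the Difference between Two Means
σ₁² and σ₂² unknown and n1 + n2 ≤ 30: use t
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
The confidence interval for μ1 − μ2 is:
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
where
$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
is the pooled standard deviation ($s_p^2$ is the pooled variance).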
Example
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):
                  CPU1    CPU2
Number tested     16      13
Sample mean       3004    2538
Sample std dev    74      56
Assume both populations are normal with equal variances, and use 95% confidence.
Solution
The pooled variance is:
$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{(16-1)74^2 + (13-1)56^2}{16+13-2} = 4436.00$
$s_p = \sqrt{4436.00} = 66.603$
The t value for a 95% confidence interval is:
$t_{\alpha/2,\,n_1+n_2-2} = t_{0.025,\,27} = 2.052$
• The 95% confidence interval is
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$(3004 - 2538) - (2.052)(66.603)\sqrt{\frac{1}{16} + \frac{1}{13}} < \mu_1 - \mu_2 < (3004 - 2538) + (2.052)(66.603)\sqrt{\frac{1}{16} + \frac{1}{13}}$
$414.97 < \mu_1 - \mu_2 < 517.03$
We are 95% confident that the mean difference in CPU speed is between 414.97 and 517.03 MHz.
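Because the pooled calculation has several steps, it is worth rechecking by machine. The sketch below takes the summary statistics from the example's data table (n1 = 16, n2 = 13) and the tabulated $t_{0.025,\,27} = 2.052$ as inputs; `pooled_t_interval` is a hypothetical helper, and equal population variances are assumed as the example states.

```python
from math import sqrt


def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Pooled-variance interval for mu1 - mu2 from summary statistics.
    t_crit is the tabulated t value with n1 + n2 - 2 degrees of freedom."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    half_width = t_crit * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp2, diff - half_width, diff + half_width


sp2, low, high = pooled_t_interval(3004, 74, 16, 2538, 56, 13, t_crit=2.052)
print("pooled variance:", round(sp2, 2))
print("pooled std dev:", round(sqrt(sp2), 3))
print("interval:", round(low, 2), round(high, 2))
```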
Example
As part of an industrial training program, some trainees are instructed by Method 1, which is straight teaching-machine instruction, and some are instructed by Method 2, which also involves the personal attention of an instructor. Random samples are taken from large groups of trainees instructed by each of these two methods; the population standard deviations of the scores are 6.06 and 5.58, respectively. The scores obtained in an appropriate achievement test are:
Method 1   71 75 65 69 73 66 69 75 74 87 68
Method 2   72 77 84 78 69 70 77 81 65 77 75
• Use the 0.05 level of significance to find (1 − α)100% confidence limits on μ1 − μ2.
Solution
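     ΣX     ΣX²      Mean   Variance   S.D.
A    792    57392    72     36.8       6.0663
B    825    62183    75     30.8       5.549775
$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$(72 - 75) - 1.96\sqrt{\frac{6.06^2}{11} + \frac{5.58^2}{11}} < \mu_1 - \mu_2 < (72 - 75) + 1.96\sqrt{\frac{6.06^2}{11} + \frac{5.58^2}{11}}$
$-7.8682 < \mu_1 - \mu_2 < 1.8682$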
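The known-variance interval for two means can be recomputed directly from the stated population standard deviations. This is a sketch; `two_sample_z_interval` is an illustrative name, and the inputs are the sample means 72 and 75 with σ1 = 6.06, σ2 = 5.58, n1 = n2 = 11.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(x1, sigma1, n1, x2, sigma2, n2, confidence=0.95):
    """Interval for mu1 - mu2 when both population sigmas are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * std_error, diff + z * std_error


low, high = two_sample_z_interval(72, 6.06, 11, 75, 5.58, 11)
print(round(low, 4), round(high, 4))
```

Since the interval contains 0, the data do not rule out equal mean scores at this confidence level.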
Example
As part of an industrial training program, some trainees are instructed by Method 1, which is straight teaching-machine instruction, and some are instructed by Method 2, which also involves the personal attention of an instructor. Random samples are taken from large groups of trainees instructed by each of these two methods, and the scores which they obtained in an appropriate achievement test are:
Method 1   71 75 65 69 73 66 69 75 74 87 68
Method 2   72 77 84 78 69 70 77 81 65 77 75
• Use the 0.05 level of significance to find (1 − α)100% confidence limits on μ1 − μ2.
Solution
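     ΣX     ΣX²      Mean   Variance   S.D.
A    792    57392    72     36.8       6.0663
B    825    62183    75     30.8       5.549775
$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$
$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
$s_p = \sqrt{\frac{10 \times 6.0663^2 + 10 \times 5.5497^2}{11 + 11 - 2}} = 5.813$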
Solution
$(72 - 75) - 2.086 \times 5.813\sqrt{\frac{1}{11} + \frac{1}{11}} < \mu_1 - \mu_2 < (72 - 75) + 2.086 \times 5.813\sqrt{\frac{1}{11} + \frac{1}{11}}$
$-8.1705 < \mu_1 - \mu_2 < 2.1705$
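Starting from the raw scores rather than the summary table gives an end-to-end check of the pooled t interval. The sketch assumes equal population variances and takes $t_{0.025,\,20} = 2.086$ from the table; `pooled_interval_from_data` is a hypothetical name. Small differences in the last digits against hand-rounded values are expected.

```python
from math import sqrt
from statistics import fmean, variance

method1 = [71, 75, 65, 69, 73, 66, 69, 75, 74, 87, 68]
method2 = [72, 77, 84, 78, 69, 70, 77, 81, 65, 77, 75]


def pooled_interval_from_data(a, b, t_crit):
    """Pooled-variance interval for mu_a - mu_b from raw observations.
    t_crit is the tabulated t value with len(a) + len(b) - 2 d.f."""
    n1, n2 = len(a), len(b)
    # statistics.variance uses the n - 1 divisor (sample variance).
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    half_width = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = fmean(a) - fmean(b)
    return sqrt(sp2), diff - half_width, diff + half_width


sp, low, high = pooled_interval_from_data(method1, method2, t_crit=2.086)
print(round(sp, 3), round(low, 3), round(high, 3))
```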