Sampling Distribution
The sampling distribution of a given statistic is the
distribution of the values taken by the statistic in all
possible samples of the same size from the same
population.
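As a small illustration of this definition, the sketch below lists every possible sample of size 2 from a tiny population and tabulates the sample means. The population values and the sample size are hypothetical, chosen only to keep the enumeration short.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population (assumption, not from the notes)
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# The statistic (here the mean) computed for every possible sample of size n
sample_means = [mean(s) for s in combinations(population, n)]

# The sampling distribution: each value of the statistic and how often it occurs
sampling_distribution = Counter(sample_means)
for value, count in sorted(sampling_distribution.items()):
    print(f"mean = {value}: {count} of {len(sample_means)} samples")

print("Mean of sample means:", mean(sample_means))
print("Population mean:", mean(population))
```

In this example the mean of all the sample means coincides with the population mean.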
Five common sampling methods are:
(i) Random sampling
(ii) Systematic sampling
(iii) Convenience sampling
(iv) Cluster sampling
(v) Stratified sampling
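The five methods listed above can be sketched in a few lines each. This is only an illustration: the population of 100 numbered units, the cluster size of 10, and the two strata ("low" and "high") are all assumptions made for the example.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 numbered units
n = 10                            # desired sample size

# (i) Random sampling: every unit has the same chance of selection
simple_random = rng.sample(population, n)

# (ii) Systematic sampling: random start, then every k-th unit
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# (iii) Convenience sampling: whatever is easiest to reach (here, the first n units)
convenience = population[:n]

# (iv) Cluster sampling: split into clusters, pick whole clusters at random
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [unit for c in rng.sample(clusters, 1) for unit in c]

# (v) Stratified sampling: sample separately within each stratum
strata = {"low": population[:50], "high": population[50:]}
stratified = [u for group in strata.values() for u in rng.sample(group, n // 2)]
```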
Hypothesis Testing
Hypothesis testing is a statistical method used to
determine whether sample data provide enough evidence
to draw conclusions about a population.
Every test involves two hypotheses:
(i) Null hypothesis (H0)
(ii) Alternative hypothesis (Ha or H1)
Null Hypothesis (H0): The null hypothesis is a statement about a
population parameter, such as the population mean, that is
assumed to be true.
Alternative Hypothesis (Ha or H1): The alternative hypothesis
is a statement that directly contradicts the null hypothesis by
stating that the actual value of a population parameter is
less than, greater than, or not equal to the value stated in the
null hypothesis.
Level of Significance
The level of significance, or significance level, is the
criterion of judgment upon which a decision is made regarding
the value stated in a null hypothesis. It is denoted by $\alpha$.
The level of significance is typically set at 5% (0.05) or
1% (0.01).
Types of Errors
Type I error: Rejecting H0 (accepting H1) when H0 is true is
called a Type I error. The probability of committing a Type I
error is called the level of significance and is denoted by $\alpha$.
Example: Convicting the defendant when he is innocent.
Type II error: Failing to reject H0 when H1 is true (H0 is false) is
called a Type II error. The probability of committing a Type II
error is denoted by $\beta$.
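The link between the level of significance and the Type I error can be checked by simulation. The sketch below repeatedly draws samples from a population where H0 is actually true and counts how often H0 is rejected at the 5% level. It assumes a two-tailed z-test with known standard deviation, and the values of the mean, standard deviation, sample size, and number of trials are hypothetical.

```python
import random
from statistics import NormalDist

rng = random.Random(42)            # fixed seed for repeatability
mu0, sigma, n = 50.0, 10.0, 30     # hypothetical population under H0 and sample size
alpha = 0.05                       # level of significance
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value

trials = 5000
rejections = 0
for _ in range(trials):
    # H0 is true here: the data really come from a population with mean mu0
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:            # rejecting a true H0 is a Type I error
        rejections += 1

type_i_rate = rejections / trials
print(f"Estimated Type I error rate: {type_i_rate:.3f}")
```

The estimated rate should come out close to the chosen level of significance, 0.05.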
Confidence Interval
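A confidence interval is a range of values, bounded above
and below the sample mean, that is likely to contain an
unknown population parameter.
The 95% confidence interval (CI) is given as
$$CI = \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$$
where $\bar{X}$ is the sample mean, $\sigma$ is the standard
deviation, and $n$ is the sample size.
For the 99% confidence interval (CI), use 2.576 in place of 1.96:
$$CI = \bar{X} \pm 2.576 \frac{\sigma}{\sqrt{n}}$$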
Note: Analysts most often use confidence levels of
95% or 99%.
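The multipliers 1.96 and 2.576 are the two-tailed critical values of the standard normal distribution. A minimal sketch of where they come from, using Python's standard library (the helper name `z_critical` is made up for this example):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed standard normal critical value for a given confidence level."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

z95 = z_critical(0.95)
z99 = z_critical(0.99)
print(f"95%: z = {z95:.3f}")
print(f"99%: z = {z99:.3f}")
```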
Problem 2: Explain confidence interval. A random sample of
1000 students was taken to estimate the average number of
hours they spend studying per week. The sample mean was 16
hours, and the standard deviation was 4 hours. Calculate a 95%
confidence interval for the true mean number of hours students
spend studying per week.
Solution:
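Given:
Mean $\bar{X} = 16$
Standard deviation $\sigma = 4$
Number of observations $n = 1000$
Confidence level: 95%
$$CI = \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$$
$$CI = 16 \pm (1.96) \frac{4}{\sqrt{1000}}$$
$$CI = 16 \pm (1.96)(0.12649)$$
$$CI = 16 \pm 0.24792$$
The upper limit is 16.2479 and the lower limit is 15.7521, so
CI = (15.7521, 16.2479)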
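The arithmetic in this solution can be checked with a short script. The function name `confidence_interval` is made up for this sketch. It uses the z-based formula from this section, with 1.96 as the default multiplier.

```python
import math

def confidence_interval(sample_mean, std_dev, n, z=1.96):
    """Return (lower, upper) for mean ± z * std_dev / sqrt(n)."""
    margin = z * std_dev / math.sqrt(n)  # margin of error
    return sample_mean - margin, sample_mean + margin

# Values from Problem 2: mean 16 hours, standard deviation 4 hours, n = 1000
lower, upper = confidence_interval(16, 4, 1000)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

Passing `z=2.576` instead gives the 99% interval for the same data.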
t-Test
A t-test is a statistical test that is used to compare the
means of two groups. It is often used in hypothesis
testing to determine whether a process actually has an
effect on the population or whether two groups are
different from one another.
One Sample t-test
(Student's t-Test)
Student's t-test is a method of testing a hypothesis about the mean of
a small sample drawn from a population where the standard
deviation of the population is unknown.
In such a case, we take the null hypothesis as:
H0: The difference between the sample mean $\bar{x}$ and the population
mean $\mu$ is not significant.
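Test statistic:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
with degrees of freedom (df) = n - 1, where $s$ is the sample
standard deviation.
Let $t_{\alpha}$ be the tabulated value of t at the chosen level of
significance for n - 1 degrees of freedom. If the calculated value
of t is such that
$|t| \le t_{\alpha}$, H0 is accepted
$|t| > t_{\alpha}$, H0 is rejected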
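A minimal sketch of the one-sample t-test steps follows. The ten sample values and the hypothesized mean of 13 are hypothetical. The critical value 2.262 is the tabulated two-tailed value of t at the 5% level for 9 degrees of freedom, hard-coded here because the Python standard library has no t-distribution.

```python
import math
from statistics import mean, stdev

# Hypothetical small sample and hypothesized population mean (assumptions)
sample = [12, 14, 15, 13, 16, 14, 15, 13, 14, 14]
mu0 = 13

n = len(sample)
x_bar = mean(sample)   # sample mean
s = stdev(sample)      # sample standard deviation (divides by n - 1)
df = n - 1             # degrees of freedom

# Test statistic: t = (x_bar - mu) / (s / sqrt(n))
t = (x_bar - mu0) / (s / math.sqrt(n))

# Tabulated two-tailed critical value at the 5% level for df = 9
t_crit = 2.262

reject_h0 = abs(t) > t_crit
print(f"t = {t:.4f}, df = {df}")
print("H0 is rejected" if reject_h0 else "H0 is accepted")
```

For other sample sizes or significance levels, `t_crit` has to be replaced with the matching value from a t-table.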