SMP2

sampling types

Uploaded by

aapatil.sknsits
Testing of Hypothesis

Dr. Sachin M. Patil


Assistant Professor
Department of Statistics
Shivaji University, Kolhapur.
How Do I Begin???
Is a newly introduced drug effective in reducing sugar levels in
diabetic patients?
IDENTIFY THE PROBLEM
• Steps:
• State the Null Hypothesis
• State its opposite, the Alternative Hypothesis
• Hypotheses are mutually exclusive &
exhaustive

➢ Hypothesis for previous example
(μ1: mean sugar level before the drug, μ2: mean sugar level after the drug)

➢ Null hypothesis (H0): There is no significant difference
in sugar levels after drug consumption.
H0: μ1 = μ2

➢ Alternative hypothesis (H1): There is a significant
reduction in sugar levels after drug consumption.
H1: μ1 > μ2
Types of tests:
1. Left tailed test
• 𝐻0 : 𝜇1 = 𝜇2
• 𝐻1 : 𝜇1 < 𝜇2

2. Right tailed test


• 𝐻0 : 𝜇1 = 𝜇2
• 𝐻1 : 𝜇1 > 𝜇2

3. Two tailed test


• 𝐻0 : 𝜇1 = 𝜇2
• 𝐻1 : 𝜇1 ≠ 𝜇2
ERRORS IN MAKING DECISIONS

Experimenter's decision   Reality: H0 is true        Reality: H1 is true
Reject H0                 Type I error (α)           Correct decision (Power = 1 − β)
Don't reject H0           Correct decision (1 − α)   Type II error (β)

α and β have an inverse relationship: for a fixed sample size,
reducing the probability of one error increases the probability
of the other.

One possibility for reducing both: increase the sample size.
Critical Region
The critical region (or rejection region) is the set of
all values of the test statistic that cause us to reject
the null hypothesis.

[Figure: Acceptance and rejection regions in case of a two-tailed test
with 5% significance level. Acceptance region: 95% of area, centred on
µH0; accept H0 if the sample mean falls in this region. Rejection
region: 0.025 of area in each tail; reject H0 if the sample mean falls
in either of these regions.]
PROCEDURE FOR TESTING OF HYPOTHESIS:
Step 1: Set up the null hypothesis H0.
Step 2: Set up the alternative hypothesis H1. This indicates
whether we have to use a one tailed or a two
tailed test.
Step 3: Choose the appropriate level of significance (α).
Step 4: Compute the value of the test statistic "Z".
$$Z = \frac{\text{Observed difference}}{\text{Standard error}}$$
Step 5: Obtain the table value at the given level of significance,
or the p-value.
Step 6: Compare the calculated value of Z with the table
value, or the p-value with the level of significance (α).
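As a rough illustration of Steps 3 to 6, the sketch below computes Z, the table value and the p-value with Python's standard library. The function name `z_test` and the numbers (an observed difference of 4.0 with standard error 1.6) are hypothetical, chosen only to show the mechanics, and the one-tailed branch assumes a right-tailed alternative.

```python
from statistics import NormalDist

def z_test(observed_difference, standard_error, alpha=0.05, two_tailed=True):
    """Return (z, table_value, p_value, reject) for a Z test."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z = observed_difference / standard_error           # Step 4
    if two_tailed:
        table_value = std_normal.inv_cdf(1 - alpha / 2)  # Step 5
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:  # assumed right-tailed alternative
        table_value = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z)
    reject = p_value < alpha                            # Step 6
    return z, table_value, p_value, reject

# Hypothetical inputs, for illustration only.
z, table_value, p_value, reject = z_test(observed_difference=4.0, standard_error=1.6)
print(f"Z = {z:.2f}, table value = {table_value:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Comparing Z with the table value and comparing the p-value with α lead to the same decision; the sketch uses the p-value route.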
STEPS FOR HYPOTHESIS TESTING
1. Formulate H0 and H1
2. Select appropriate test
3. Choose level of significance, α
4. Calculate test statistic TSCAL
5. Then follow one of two routes:
   a. Determine the probability associated with the test statistic
      and compare it with the level of significance, α
   b. Determine the critical value of the test statistic, TSCR, and
      determine whether TSCAL falls into the (non-)rejection region
6. Reject / do not reject H0
7. Draw marketing research conclusion



P-Value

The P-value (or p-value or probability value)
is the probability of getting a value of the test
statistic that is at least as extreme as the one
representing the sample data, assuming that
the null hypothesis is true. The null
hypothesis is rejected if the P-value is no larger
than the chosen level of significance α, for example 0.05.
Two-tailed Test

If the alternative hypothesis contains the not-equal-to symbol
(≠), the hypothesis test is a two-tailed test. In a two-tailed
test, each tail has an area of ½P.

H0: μ = k
Ha: μ ≠ k

[Figure: normal curve with axis from −3 to 3 and a test statistic
marked in each tail. P is twice the area to the left of the negative
test statistic, or equivalently twice the area to the right of the
positive test statistic.]
Right-tailed Test

If the alternative hypothesis contains the greater-than
symbol (>), the hypothesis test is a right-tailed test.

H0: μ = k
Ha: μ > k

[Figure: normal curve with axis from −3 to 3. P is the area to
the right of the test statistic.]
Left-tailed Test

If the alternative hypothesis contains the less-than
inequality symbol (<), the hypothesis test is a left-tailed
test.

H0: μ = k
Ha: μ < k

[Figure: normal curve with axis from −3 to 3. P is the area to
the left of the test statistic.]
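The three tail rules above can be sketched in a few lines of Python. This assumes the test statistic follows a standard normal distribution (as the −3 to 3 axis of the figures suggests); the function name `p_value` and the `tail` labels are illustrative choices, not part of the slides.

```python
from statistics import NormalDist

def p_value(test_statistic, tail):
    """P-value of a standard normal test statistic.

    tail: "left"  -> Ha: mu < k (area to the left of the statistic)
          "right" -> Ha: mu > k (area to the right of the statistic)
          "two"   -> Ha: mu != k (twice the area beyond |statistic|)
    """
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "left":
        return cdf(test_statistic)
    if tail == "right":
        return 1 - cdf(test_statistic)
    if tail == "two":
        return 2 * (1 - cdf(abs(test_statistic)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

for tail in ("left", "right", "two"):
    print(tail, round(p_value(-1.96, tail), 3))
```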
Test For Single Population Proportion
➢ Use when observations in a sample are divided into
two groups such as male or female, pass or fail, died
or survived, vaccinated or unvaccinated.
➢ Null Hypothesis (H0): There is no difference between the
specified and the sample proportion.
➢ Test Statistic:
$$Z = \frac{p - P}{SE(p)} = \frac{p - P}{\sqrt{pq/n}}$$
where p is the sample proportion, P is the specified proportion,
q = 1 − p and n is the sample size.
The Z statistic follows a normal distribution,
so we use cut-off values for comparison
based on the normal distribution.
➢ Decision: at 5% level of significance
✓ For two tailed test: Reject Null if |Z| > 1.96, or p-value < 0.05
  (area 0.025 in each tail, beyond −1.96 and 1.96).
✓ For one tailed test: Reject Null if |Z| > 1.64, or p-value < 0.05
  (area 0.05 in one tail, beyond 1.64).

➢ The probable limits of the observed proportion of
success are
$$p \pm 1.96\sqrt{\frac{pq}{n}}$$
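A minimal sketch of the single-proportion test follows, using the standard error √(pq/n) as written on the slide (many textbooks instead use the specified proportion, √(PQ/n), under H0). The function name and the data (120 successes in a sample of 200, specified proportion 0.5) are hypothetical.

```python
from math import sqrt

def one_proportion_z(successes, n, specified_p):
    """Return (z, probable_limits) for a single-proportion Z test."""
    p = successes / n          # sample proportion
    q = 1 - p
    se = sqrt(p * q / n)       # standard error as on the slide
    z = (p - specified_p) / se
    limits = (p - 1.96 * se, p + 1.96 * se)  # 95% probable limits
    return z, limits

# Hypothetical data, for illustration only.
z, limits = one_proportion_z(successes=120, n=200, specified_p=0.5)
reject_two_tailed = abs(z) > 1.96  # 5% level, two tailed
print(f"Z = {z:.3f}, limits = ({limits[0]:.3f}, {limits[1]:.3f}), reject H0: {reject_two_tailed}")
```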
Test For Difference between
two Population Proportions
➢ For comparing two population proportions possessing
a certain attribute.
➢ Null Hypothesis (H0): There is no difference
between two population proportions.
➢ Test Statistic:
$$Z = \frac{p_1 - p_2}{SE(p_1 - p_2)} = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Where,
$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \quad \text{and} \quad Q = 1 - P$$
➢ Decision:
✓ For one tailed test: Reject Null if |Z| > 1.64
✓ For two tailed test: Reject Null if |Z| > 1.96
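The pooled-proportion statistic above could be computed along these lines. The function name and the counts (60 of 100 versus 45 of 100) are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, pooled_p) for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled_p = (n1 * p1 + n2 * p2) / (n1 + n2)  # P
    pooled_q = 1 - pooled_p                     # Q
    se = sqrt(pooled_p * pooled_q * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, pooled_p

# Hypothetical counts, for illustration only.
z, pooled_p = two_proportion_z(x1=60, n1=100, x2=45, n2=100)
print(f"P = {pooled_p:.3f}, Z = {z:.3f}, reject H0 (two tailed, 5%): {abs(z) > 1.96}")
```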
Test For Single Population Mean
➢ For testing the significance of the difference between the sample mean
and the population mean.
➢ Null Hypothesis (H0): There is no difference between sample mean
and population mean.
➢ Test Statistic:
$$z = \frac{\bar{x} - \mu_0}{SD/\sqrt{n}}$$
Where,
if $n < 30$ then $SD^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, and if $n \ge 30$ then $SD^2 = \frac{\sum (x - \bar{x})^2}{n}$
➢ Decision:
✓ For one tailed test: Reject Null if $|z| > z_{\alpha}$
✓ For two tailed test: Reject Null if $|z| > z_{\alpha/2}$

➢ $(1 - \alpha)100\%$ confidence interval for mean is
$$\text{mean} \pm z_{\alpha/2} \times SE \text{ of mean} = \bar{x} \pm z_{\alpha/2} \times \frac{SD}{\sqrt{n}}$$
where SE: standard error
➢ 95% confidence interval for mean is
$$\bar{x} \pm 1.96 \times SD/\sqrt{n}$$
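As a sketch of the large-sample test and its confidence interval, the snippet below works from summary statistics. The function name and the figures (mean 52, SD 6, n = 36, hypothesised mean 50) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sd, n, alpha=0.05):
    """Return (z, confidence_interval) for a single-mean Z test."""
    se = sd / sqrt(n)                              # standard error of mean
    z = (sample_mean - mu0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return z, ci

# Hypothetical summary statistics, for illustration only.
z, ci = z_test_mean(sample_mean=52.0, mu0=50.0, sd=6.0, n=36)
print(f"z = {z:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Here the hypothesised mean of 50 lies just outside the interval, which agrees with |z| exceeding 1.96.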
2. Test For Two Population Means
➢ Used only when two independent samples come from normal
populations.
➢ Null Hypothesis (H0): There is no difference between two population
means.
➢ Test Statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
➢ Decision:
✓ For one tailed test: Reject Null if $|z| > z_{\alpha}$
✓ For two tailed test: Reject Null if $|z| > z_{\alpha/2}$

➢ $(1 - \alpha)100\%$ confidence interval for difference between means is
$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
➢ 95% confidence interval for difference between means is
$$\bar{x}_1 - \bar{x}_2 \pm 1.96 \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
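The two-sample statistic and the 95% interval above might be computed as follows. The function name and the summary statistics are hypothetical and serve only to show the calculation.

```python
from math import sqrt

def two_mean_z(mean1, s1, n1, mean2, s2, n2):
    """Return (z, ci95) for the difference between two means."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # SE of the difference
    diff = mean1 - mean2
    z = diff / se
    ci95 = (diff - 1.96 * se, diff + 1.96 * se)
    return z, ci95

# Hypothetical summary statistics, for illustration only.
z, ci95 = two_mean_z(mean1=75.0, s1=8.0, n1=64, mean2=72.0, s2=6.0, n2=36)
print(f"z = {z:.3f}, 95% CI = ({ci95[0]:.3f}, {ci95[1]:.3f})")
```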
Assumptions for t-test:

1. Sample size is less than 30


2. The sample must be random.
3. The population standard deviation is not known.
4. The distribution of population from which sample
is drawn is normal.
t-Test For Single Mean
➢ For testing the significance of the difference between the sample mean
and the population mean.
➢ Null Hypothesis (H0): There is no difference between sample mean
and population mean.
➢ Test Statistic:
$$t = \frac{\bar{x} - \mu_0}{SD/\sqrt{n}}$$
Where,
$$SD^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
➢ Decision:
✓ For one tailed test: Reject Null if $|t| > t_{\alpha,\, n-1}$
✓ For two tailed test: Reject Null if $|t| > t_{\alpha/2,\, n-1}$
➢ $(1 - \alpha)100\%$ confidence interval for mean is
$$\text{mean} \pm t_{\alpha/2,\, n-1} \times SE = \bar{x} \pm t_{\alpha/2,\, n-1} \times \frac{SD}{\sqrt{n}}$$
where SE: standard error of mean
• 95% confidence interval for mean is (if n > 30)
$$\bar{x} \pm 1.96 \times SD/\sqrt{n}$$
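A small sketch of the single-mean t-test on raw data is given below. The ten observations and the hypothesised mean of 12 are hypothetical, and the critical value 2.262 is the usual t-table entry for a two tailed 5% test with 9 degrees of freedom, passed in by hand because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0, table_value):
    """Return (t, ci) using a t-table value supplied by the caller."""
    n = len(data)
    x_bar = mean(data)
    sd = stdev(data)          # divides by n - 1, as in the SD^2 formula
    se = sd / sqrt(n)         # standard error of mean
    t = (x_bar - mu0) / se
    ci = (x_bar - table_value * se, x_bar + table_value * se)
    return t, ci

# Hypothetical sample, for illustration only.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
T_TABLE_0025_DF9 = 2.262  # t(alpha/2 = 0.025, df = 9) from a t-table
t, ci = one_sample_t(sample, mu0=12, table_value=T_TABLE_0025_DF9)
print(f"t = {t:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {abs(t) > T_TABLE_0025_DF9}")
```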
2. Unpaired t-test
➢ Used only when two independent samples come from normal
populations.
➢ Null Hypothesis (H0): There is no difference between two
population means.
➢ Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{SD\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Where,
$$SD^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}$$
➢ Decision:
✓ For one tailed test: Reject Null if $|t| > t_{\alpha,\, n_1+n_2-2}$
✓ For two tailed test: Reject Null if $|t| > t_{\alpha/2,\, n_1+n_2-2}$
➢ $(1 - \alpha)100\%$ confidence interval for difference between means is
$$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\, n_1+n_2-2} \times SD\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
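The pooled-variance calculation above can be illustrated with a short sketch. The two samples are hypothetical, and 2.306 is the usual t-table entry for a two tailed 5% test with 8 degrees of freedom, hard-coded because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean

def unpaired_t(sample1, sample2):
    """Return (t, degrees_of_freedom) using the pooled SD."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)  # sum of squared deviations
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_sd = sqrt((ss1 + ss2) / df)
    t = (m1 - m2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical samples, for illustration only.
group1 = [10, 12, 14, 16, 18]
group2 = [8, 9, 10, 11, 12]
T_TABLE_0025_DF8 = 2.306  # t(alpha/2 = 0.025, df = 8) from a t-table
t, df = unpaired_t(group1, group2)
print(f"t = {t:.3f}, df = {df}, reject H0: {abs(t) > T_TABLE_0025_DF8}")
```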
3. Paired t-test
➢ The two samples are not independent; the sample observations are paired
together. In this case we have n1 = n2 = n.
➢ Null Hypothesis (H0): There is no real difference between means of
before and after observations.
➢ Test Statistic:
Let d = x − y be the difference of paired observations.
$$t = \frac{\bar{d}}{SD_d/\sqrt{n}}$$
Where,
$$SD_d^2 = \frac{\sum (d - \bar{d})^2}{n - 1}$$
➢ Decision:
✓ For one tailed test: Reject Null if $|t| > t_{\alpha,\, n-1}$
✓ For two tailed test: Reject Null if $|t| > t_{\alpha/2,\, n-1}$

➢ $(1 - \alpha)100\%$ confidence interval for difference between means is
$$\bar{d} \pm t_{\alpha/2,\, n-1} \times SD_d/\sqrt{n}$$
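To connect this with the opening question about sugar levels before and after a drug, here is a hedged sketch of the paired t-test. The five before/after readings are invented for illustration, and 2.132 is the usual t-table entry for a one tailed 5% test with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees_of_freedom) for paired observations."""
    differences = [x - y for x, y in zip(before, after)]  # d = x - y
    n = len(differences)
    d_bar = mean(differences)
    sd_d = stdev(differences)      # divides by n - 1
    t = d_bar / (sd_d / sqrt(n))
    return t, n - 1

# Hypothetical sugar levels, for illustration only.
before = [140, 150, 160, 145, 155]
after = [135, 148, 150, 140, 147]
T_TABLE_005_DF4 = 2.132  # t(alpha = 0.05, df = 4), one tailed, from a t-table
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}, reject H0: {t > T_TABLE_005_DF4}")
```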
Chi – square Tests
Independence of attributes
Goodness of fit test
Assumptions:
1.Data must be random
2.A sufficiently large sample size
3.Adequate cell sizes
4.Independence
5.Normal distribution of deviations
Independence of attributes
Step 1: Write down the null hypothesis.
Step 2: Obtain the expected frequencies.
Step 3: Compute the test statistic
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
Step 4: Find the degrees of freedom, (r − 1) × (c − 1).
Step 5: Compare the test statistic with the Chi-square table value.
Reject the null hypothesis if the value of χ² is greater than the table value.
Example
Consider the following distribution of persons according to sex and blood groups.

              Blood Group
Sex        O      A      B     AB    Total
Male      105     50     45     15     215
Female    115     60     40     10     225
Total     220    110     85     25     440

Expected Frequency Table (row total × column total / grand total):

              Blood Group
Sex        O                        A                        B                       AB
M     215 × 220/440 = 107.5    215 × 110/440 = 53.75    215 × 85/440 = 41.53    215 × 25/440 = 12.22
F     225 × 220/440 = 112.5    225 × 110/440 = 56.25    225 × 85/440 = 43.47    225 × 25/440 = 12.78

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} \approx 2.43$$
$$\chi^2_{\alpha,\,(r-1)\times(c-1)} = \chi^2_{0.05,\,3} = 7.81 \quad \text{and} \quad p\text{-value} \approx 0.49$$
Since 2.43 < 7.81, do not reject the null hypothesis.
Conclusion: Blood group is independent of sex
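The example can be reproduced with a short sketch that follows Steps 2 to 5. The observed counts and the table value 7.81 are those of the example; the function name is an illustrative choice. Computed from unrounded expected frequencies, the statistic comes out at about 2.43.

```python
def chi_square_independence(observed):
    """Return (chi_square, expected, degrees_of_freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected frequency = row total x column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, expected, df

observed = [
    [105, 50, 45, 15],  # Male:   O, A, B, AB
    [115, 60, 40, 10],  # Female: O, A, B, AB
]
TABLE_VALUE = 7.81  # chi-square table value at alpha = 0.05, df = 3
chi_square, expected, df = chi_square_independence(observed)
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {chi_square > TABLE_VALUE}")
```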
2x 2 contingency table

                  B
A          Present   Absent    Total
Present       a         b       a+b
Absent        c         d       c+d
Total        a+c       b+d      N = a+b+c+d

$$\chi^2 = \frac{N(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$

Degrees of freedom = 1

➢ If any cell frequency is less than 5, use Yates' correction,
which adds 0.5 to the minimum frequency. The corrected statistic is

$$\chi^2 = \frac{N\left(|ad - bc| - N/2\right)^2}{(a + b)(c + d)(a + c)(b + d)}$$
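A minimal sketch of the 2×2 shortcut formula, with and without Yates' correction, is shown below. The corrected form assumed here is the common one, N(|ad − bc| − N/2)² over the product of the marginal totals; the function name and the cell counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for a 2x2 contingency table (1 degree of freedom)."""
    n = a + b + c + d
    cross_difference = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by N/2 before squaring.
        cross_difference = max(cross_difference - n / 2, 0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * cross_difference ** 2 / denominator

# Hypothetical cell counts, for illustration only.
plain = chi_square_2x2(12, 8, 5, 15)
corrected = chi_square_2x2(12, 8, 5, 15, yates=True)
print(f"chi-square = {plain:.3f}, with Yates' correction = {corrected:.3f}")
```

The correction always gives a value no larger than the uncorrected statistic, which makes the test more conservative for small frequencies.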
2. Goodness of fit test

➢ Used to test whether a given distribution or model is a good fit to
the given data
➢ Null hypothesis
H0: The given distribution is a good fit for the given data
➢ Test statistic
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
➢ Degrees of freedom = n − k − 1
Where n: No. of observed frequencies (classes),
k: No. of estimated parameters
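As an illustration of the goodness of fit test, the sketch below checks whether a die is fair. The counts from 120 hypothetical rolls are invented, n is taken here as the number of classes (six faces), no parameters are estimated (k = 0), and 11.07 is the usual chi-square table value at the 5% level with 5 degrees of freedom.

```python
def goodness_of_fit(observed, expected, estimated_parameters=0):
    """Return (chi_square, degrees_of_freedom) for a goodness of fit test."""
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - estimated_parameters - 1  # n - k - 1
    return chi_square, df

# Hypothetical counts for faces 1 to 6 in 120 rolls, for illustration only.
observed = [18, 22, 20, 25, 15, 20]
expected = [sum(observed) / 6] * 6   # fair die: 20 per face
TABLE_VALUE = 11.07  # chi-square table value at alpha = 0.05, df = 5
chi_square, df = goodness_of_fit(observed, expected)
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {chi_square > TABLE_VALUE}")
```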
ANOVA

• Three teachers are equally effective (effectiveness is measured
through marks obtained).
• All the income groups have equal mean stress.
• A group of psychiatric patients is trying three different therapies:
counseling, medication and biofeedback. You want to see if one
therapy is better than the others.
• A manufacturer has three different processes to make light bulbs.
They want to know if one process is better than the others.
• Students from different colleges take the same exam. You want to see
if one college outperforms the others.
EXAMPLES OF SOME COMMONLY USED STATISTICAL TESTS

Level of Measurement

Number of groups       | Nominal                                      | Ordinal                          | Interval/Ratio
1 group                | Single proportion test                       | Kolmogorov-Smirnov 1 sample test | t-test of sample mean vs. known population value
2 independent groups   | χ² test, independent samples proportion test | Mann-Whitney U test              | Independent samples t-test
2 dependent groups     | McNemar test                                 | Wilcoxon test                    | Paired t-test
>2 independent groups  | χ² test                                      | Kruskal-Wallis ANOVA             | ANOVA
>2 dependent groups    | Cochran Q test                               | Friedman ANOVA by ranks          | Repeated measures ANOVA
