
Statistics in Research

This document discusses various parametric statistical tests used in research including t-tests, ANOVA, Pearson's correlation, and z-tests. It provides details on assumptions, calculations, and applications of each test. Student's t-test is used to compare two means of small samples or a sample mean to a population mean. ANOVA compares multiple group means. Parametric tests require continuous, normally distributed data measured on interval or ratio scales.

Uploaded by

doitmrnags

Research Aptitude

Copyright © 2014-2023 TestBook Edu Solutions Pvt. Ltd.: All rights reserved

Parametric Tests

Used for Quantitative Data

Used for continuous variables

Used when data are measured on approximately interval or ratio scales of measurement

Data should follow a normal distribution

Parametric Tests
1. t-test (n<30)

2. ANOVA (Analysis of Variance)

3. Pearson's r Correlation
4. Z test for large samples (n>30)
STUDENT'S T-TEST

Developed by W. S. Gosset in 1908, who published statistical papers under the pen name 'Student'. Thus the test is known as Student's 't' test.

Indications for the test:

1. When samples are small


2. Population variance is not known.
Uses
1. Two means of small independent samples
2. Sample mean and population mean


Assumptions made in the use of 't' test


1. Samples are randomly selected
2. Data utilised are quantitative
3. The variable follows a normal distribution
4. Sample variances are approximately the same in both groups under study
5. Samples are small, mostly smaller than 30

A t-test compares the difference between two means of different groups to determine whether that difference is statistically significant.

Student's 't' test for different purposes

't' test for one sample

't' test for unpaired two samples

't' test for paired two samples

ONE SAMPLE T-TEST

When comparing the mean of a single group of observations with a specified value

In a one-sample t-test, we know the population mean. We draw a random sample from the population and then compare the sample mean with the population mean.

Calculation

$$t = \frac{\bar{x} - \mu}{SE}$$

Where $\bar{x}$ = sample mean
µ = population mean
SE = standard error

$$SE = \frac{s}{\sqrt{n}}, \qquad s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Where x = element of sample
$\bar{x}$ = sample mean
n – 1 = degrees of freedom

Now we compare calculated value with table value at a certain level of significance (generally 5% or 1%)

If the absolute value of 't' obtained is greater than the table value, reject the null hypothesis; if it is less than the table value, the null hypothesis is not rejected.

EXAMPLE
Research Problem: Comparison of mean dietary intake of a particular group of individuals with the recommended daily intake.
DATA: Average daily energy intake (ADEI) over 10 days of 11 healthy women

Mean ADEI value = 6753.6 kJ
SD ADEI value = 1142.1 kJ

What can we say about the energy intake of these women in relation to a recommended daily intake of 7725 kJ?
Research Hypothesis
State the null hypothesis and the alternative hypothesis:
H0: there is no difference between the population mean and the sample mean
OR
H0: µ = 7725 kJ
H1: there is a difference between the population mean and the sample mean
OR
H1: µ ≠ 7725 kJ

Set the level of significance α = .05, .01 or .001

Calculate the value of the proper statistic

State the rule for rejecting the null hypothesis:

Reject H0 if t ≥ +ve tabulated value
OR
Reject H0 if t ≤ -ve tabulated value
Or we can say that p < .05

In the above example, we have

t = -2.82, whose absolute value is greater than the tabulated value of 2.23 (df = 10, α = .05)

The p-value (p < .05) suggests that the dietary intake of these women was significantly less than the recommended level (7725 kJ)
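
As a rough check of the arithmetic, the one-sample statistic can be recomputed from the summary figures quoted in the example. This is a minimal sketch, assuming the reported SD is the sample standard deviation (n – 1 divisor) and taking the two-sided critical value for 10 degrees of freedom from a standard t table; the function name `one_sample_t` is illustrative only.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (t, df) for a one-sample t-test computed from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - mu0) / standard_error
    return t, n - 1

# Figures from the ADEI example: 11 women, mean 6753.6, SD 1142.1, reference 7725
t_value, df = one_sample_t(6753.6, 1142.1, 11, 7725.0)

# Assumption: two-sided critical value for df = 10 at alpha = .05, from a t table
T_CRITICAL = 2.228

reject_h0 = abs(t_value) > T_CRITICAL
print(f"t = {t_value:.3f}, df = {df}, reject H0: {reject_h0}")
```

Working from summary statistics keeps the sketch short; with the raw observations one would first compute the mean and SD from the data.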

Two-Sample 't' Test

A. Unpaired two-sample 't' test

The unpaired t-test is used when we wish to compare two means

Used when the two independent random samples come from normal populations having unknown but equal variances

We test the null hypothesis that the two population means are the same, i.e. μ1 = μ2, against an appropriate one-sided or two-sided alternative hypothesis.

Assumptions

The samples are random & independent of each other

The distribution of the dependent variables is normal.

The variances are equal in both the groups

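FORMULA
The test statistic is given by

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad S^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

Where $S_1$ and $S_2$ are the SDs of the first and second group respectively, and the statistic has $n_1 + n_2 - 2$ degrees of freedom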

Research Problem
A study was conducted to compare the birth weights of children born to 15 non-smoking mothers with those of children born to 14 heavy-smoking mothers.

Non-smoking mothers (n = 15)    Heavy-smoking mothers (n = 14)
3.99                            3.18
3.79                            2.84
3.60                            2.90
3.73                            3.27
3.21                            3.85
3.60                            3.52
4.08                            3.23
3.61                            2.76
3.83                            3.60
3.31                            3.75
4.13                            3.59
3.26                            3.63
3.54                            2.38
3.51                            2.34
2.71

Research Hypothesis: State null hypothesis and alternative hypothesis

H0 = there is no difference between the birth weights of children born to non-smoking and heavy-smoking mothers

H1 = there is a difference between the birth weights of children born to non-smoking and smoking mothers

Set the level of significance α =.05, .01 or .001

Calculate the value of proper statistic

State the rule for rejecting the null hypothesis

If t_cal > t_tab, we can say that P < .05; we then reject the null hypothesis and accept the alternative hypothesis.

Decision
If we reject the null hypothesis, we can say that children born to non-smokers are heavier than children born to heavy smokers.
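
The birth-weight comparison can be worked through numerically. The sketch below assumes the equal-variance (pooled) form of the unpaired t-test, which matches the stated assumption that the variances are equal in both groups; the critical value for 27 degrees of freedom is an assumption taken from a standard t table, and `pooled_t` is an illustrative name.

```python
import math
from statistics import mean, variance

# Birth weights from the research problem above
non_smoking = [3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61,
               3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71]
heavy_smoking = [3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23,
                 2.76, 3.60, 3.75, 3.59, 3.63, 2.38, 2.34]

def pooled_t(a, b):
    """Return (t, df) for an unpaired t-test with a pooled variance estimate."""
    n1, n2 = len(a), len(b)
    # statistics.variance uses the n - 1 divisor (sample variance)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, n1 + n2 - 2

t_value, df = pooled_t(non_smoking, heavy_smoking)

# Assumption: two-sided critical value for df = 27 at alpha = .05, from a t table
T_CRITICAL = 2.052

print(f"t = {t_value:.2f}, df = {df}, reject H0: {abs(t_value) > T_CRITICAL}")
```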

PAIRED TWO-SAMPLE T-TEST

Used when we have paired observations from one sample only, i.e. when each individual gives a pair of observations.

Same individuals are studied more than once in different circumstances, e.g. measurements made on the same people before and after interventions

Assumptions

The outcome variable should be continuous

The difference between pre-post measurements should be normally distributed
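
A paired test reduces to a one-sample t-test on the within-individual differences. The following sketch illustrates this with hypothetical before/after readings (the numbers are invented for illustration and do not come from the source); `paired_t` is likewise an illustrative name.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired t-test on the differences after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 divisor
    return mean(diffs) / standard_error, n - 1

# Hypothetical measurements on the same six people before and after an intervention
before = [140, 152, 138, 147, 160, 155]
after = [135, 150, 136, 140, 154, 151]

t_value, df = paired_t(before, after)
print(f"t = {t_value:.3f}, df = {df}")
```

The calculated t is then compared with the tabulated value for n – 1 degrees of freedom, exactly as in the one-sample case.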

Z Test

This test is used for testing the significance of the difference between two means (n > 30).

Assumptions to apply Z test

The sample must be randomly selected

Data must be quantitative

Samples should be larger than 30

Data should follow normal distribution

Sample variances should be almost the same in both the groups of study

If the SD of the populations is known, a Z test can be applied even if the sample is smaller than 30

Indications for Z Test

To compare sample mean with population mean

To compare two sample means

Steps
1. Define the problem
2. State the null hypothesis (H0) & alternate hypothesis (H1)

3. Find Z value

4. Fix the level of significance


5. Compare the calculated Z value with the value in the Z table at the corresponding significance level.

If the observed Z value is greater than the theoretical Z value, Z is significant: reject the null hypothesis and accept the alternate hypothesis
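
The steps above can be sketched for the simplest case, comparing a sample mean with a population mean when the population SD is known. All numbers below are hypothetical, and the two-sided form of the test is an assumption; the critical value comes from the standard normal distribution in Python's `statistics` module rather than from a printed Z table.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided Z test of a sample mean against mu0 with known population SD."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_critical, p_value

# Hypothetical figures: sample of 36 with mean 52, population mean 50, SD 6
z, z_crit, p = z_test(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"Z = {z:.2f}, critical Z = {z_crit:.2f}, p = {p:.4f}, reject H0: {abs(z) > z_crit}")
```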

One tailed and Two tailed Z tests

Z values on each side of mean are calculated as +Z or as -Z.

A result larger than the mean gives +Z and a result smaller than the mean gives -Z

E.g. for two tailed:

In a test of significance, when one wants to determine whether the mean IQ of malnourished children is different from that of well nourished and do

Conclusion


Tests of significance play an important role in conveying the results of any research & thus the choice of an appropriate statistical test is very impor

Hence the emphasis placed on tests of significance in clinical research must be tempered with an understanding that they are tools for analyzing dat

Analysis of Variance(ANOVA)

Given by Sir Ronald Fisher

The principal aim of statistical models is to explain the variation in measurements.

The statistical model involving a test of significance of the difference in mean values of the variable between two groups is the student's ‘t’ test If there

Assumptions for ANOVA


1. The sampled populations can be approximated by a normal distribution.
2. All populations have the same Standard Deviation.
3. Individuals in the population are selected randomly.
4. Independent samples

ANOVA compares variance by means of a simple ratio, called the F-ratio

$$F = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$

The resulting F statistic is then compared with a critical value of F (F_crit), obtained from F tables in much the same way as was done with 't'

If the calculated value exceeds the critical value for the appropriate level of α, the null hypothesis will be rejected.

An F test is therefore a test of the Ratio of Variances F Tests can also be used on their own, independently of the ANOVA technique, to test hypotheses a

In ANOVA, the F test is used to establish whether a statistically significant difference exists in the data being tested.
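
To make the F-ratio concrete, the sketch below computes it for a one-way layout as the between-group mean square divided by the within-group mean square. The three groups are hypothetical data chosen for easy arithmetic, and the critical value quoted in the comment is an assumption taken from a standard F table.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical measurements for three groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio, df_between, df_within = one_way_anova_f(groups)

# Assumption: critical F for (2, 6) degrees of freedom at alpha = .05 is about 5.14
print(f"F = {f_ratio:.2f} with ({df_between}, {df_within}) df")
```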


ANOVA can be

One Way ANOVA

If the various experimental groups differ in terms of only one factor at a time, a one-way ANOVA is used

e.g. A study to assess the effectiveness of four different antibiotics on S. sanguis

Two Way ANOVA

If the various groups differ in terms of two factors at a time, then a two-way ANOVA is performed

e.g. A study to assess the effectiveness of four different antibiotics on S. sanguis in three different age groups

Pearson's Correlation Coefficient


Karl Pearson's method is the most popular and widely used; it measures correlation quantitatively, within specified limitations, through an ideal measure of covariance. The coe
indicates no correlation at all. It is popularly called Karl Pearson's Coefficient of Correlation or Pearsonian Correlation. The formulas used under this meth

By Direct method (Actual mean)

$$r = \frac{\sum xy}{n\,\sigma_1 \sigma_2}$$

Where: r = Karl Pearson's Coefficient of Correlation
x and y = deviations of individual items of the series from their means
n = the number of terms of a series
σ1 and σ2 = standard deviations of the first and second series
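
The direct (actual mean) method can be illustrated as the sum of the products of deviations divided by n times the two standard deviations. This is a minimal sketch with hypothetical data; it assumes the standard deviations use the divisor n, which is what makes this form of the formula consistent.

```python
import math

def pearson_r(xs, ys):
    """Pearson's coefficient by the direct method: sum(xy) / (n * sigma1 * sigma2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations of the first series
    dy = [y - mean_y for y in ys]          # deviations of the second series
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sigma1 = math.sqrt(sum(a * a for a in dx) / n)   # SD with divisor n
    sigma2 = math.sqrt(sum(b * b for b in dy) / n)
    return sum_xy / (n * sigma1 * sigma2)

# Hypothetical paired observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or no linear correlation.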

The Kruskal-Wallis H Test

The Kruskal-Wallis H Test is a non-parametric procedure that can be used to compare more than two populations in a completely randomized design


All n = n1 + n2 + ... + nk measurements are jointly ranked (i.e. treat as one large sample).

We use the sums of the ranks of the k samples to compare the distributions.

Rank the total measurements in all k samples from 1 to n. Tied observations are assigned the average of the ranks they would have received if not tied.

Calculate

Ti = rank sum for the i-th sample, i = 1, 2, ..., k

And the test statistic

$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$$

H0: the k distributions are identical versus

Ha: at least one distribution is different

Test statistic: Kruskal-Wallis H

When H0 is true, the test statistic H has an approximate chi-square distribution with df = k - 1.
Use a right-tailed rejection region or p-value based on the chi-square distribution.
Example
Four groups of students were randomly assigned to be taught with four different techniques, and their achievement test scores were recorded. Are the dis

Teaching Methods

1         2         3         4
65        75        59        94
87        69        78        89
73        83        67        80
79        81        62        88

Teaching Methods (rank of each score in parentheses)

     1         2         3         4
     65 (3)    75 (7)    59 (1)    94 (16)
     87 (13)   69 (5)    78 (8)    89 (15)
     73 (6)    83 (12)   67 (4)    80 (10)
     79 (9)    81 (11)   62 (2)    88 (14)
Ti   31        35        15        55
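
The rank sums and the H statistic for the teaching-methods data can be reproduced with a short script. This is a sketch only: it applies the usual H formula without a tie correction (there are no ties in these scores), and the critical value for 3 degrees of freedom is an assumption taken from a standard chi-square table.

```python
def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks over ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Tied values occupy positions i+1 .. j (1-based); use their average
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks

def kruskal_wallis_h(samples):
    """Return (H, rank_sums) for k independent samples (no tie correction)."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sums = [sum(ranks[x] for x in s) for s in samples]
    h = 12 / (n * (n + 1)) * sum(t ** 2 / len(s) for t, s in zip(rank_sums, samples)) - 3 * (n + 1)
    return h, rank_sums

# Achievement scores for the four teaching methods
methods = [
    [65, 87, 73, 79],
    [75, 69, 83, 81],
    [59, 78, 67, 62],
    [94, 89, 80, 88],
]
h, rank_sums = kruskal_wallis_h(methods)

# Assumption: chi-square critical value for df = k - 1 = 3 at alpha = .05
CHI2_CRITICAL = 7.815

print(f"rank sums = {rank_sums}, H = {h:.2f}, reject H0: {h > CHI2_CRITICAL}")
```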

Key Concepts
I. Nonparametric Methods
These methods can be used when the data cannot be measured on a quantitative scale, or when

the numerical scale of measurement is arbitrarily set by the researcher, or when

the parametric assumptions such as normality or constant variance are seriously violated.

Kruskal-Wallis H Test: Completely Randomized Design
1. Jointly rank all the observations in the k samples (treat as one large sample of size n, say). Calculate the rank sums, Ti = rank sum of sample i, and the test

2. If the null hypothesis of equality of distributions is false, H will be unusually large, resulting in a one-tailed test.
3. For sample sizes of five or greater, the rejection region for H is based on the chi-square distribution with (k – 1) degrees of freedom.

Mann-Whitney U test:

Nonparametric equivalent of a t-test for two independent samples
Use when:

• Data do not support means (ordinal)
• Data are not normally distributed.

1) Rank all data.
2) Evaluate if ranks tend to cluster within a group.

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2$$

Where: n1 = size of sample one
n2 = size of sample two
R1, R2 = rank sums of samples one and two

Evaluation of Mann-Whitney U

1) Choose the smaller of the two U values.
2) Find the critical value (Mann-Whitney table).
3) When the computed value is smaller than the critical value, the outcome is significant.

group 1   group 2
24        28
18        42
45        63
57        57
12        90
30        68

Step One: Rank all data across groups

group 1   rank    group 2   rank
24        3       28        4
18        2       42        6
45        7       63        10
57        8.5     57        8.5
12        1       90        12
30        5       68        11

Step Two: Sum the ranks for each group

R1 = 3 + 2 + 7 + 8.5 + 1 + 5 = 26.5
R2 = 4 + 6 + 10 + 8.5 + 12 + 11 = 51.5

Check the rankings: R1 + R2 = n(n + 1)/2, i.e. 26.5 + 51.5 = 78 = (12 × 13)/2

Step Three: Compute U1

U1 = 36 + 21– 26.5

U1 = 30.5

Step Four: Compute U2

U2 = 36+21–51.5

U2 = 5.5

Step Five: Compare U1 to U2

U1 = 30.5, U2 = 5.5

5.5 < 30.5, so U = 5.5

Critical Value = 5
Since U = 5.5 is not smaller than the critical value, this is a nonsignificant outcome
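
The five steps can be retraced in code using the same two groups. This is a minimal sketch: the critical value of 5 is the one quoted in the example, and the helper name `mann_whitney_u` is illustrative only.

```python
def mann_whitney_u(group1, group2):
    """Return (U1, U2, R1, R2) using average ranks for tied values."""
    pooled = sorted(group1 + group2)

    def rank(value):
        first = pooled.index(value) + 1      # 1-based position of first occurrence
        count = pooled.count(value)          # number of tied observations
        return first + (count - 1) / 2       # average rank over the tie

    n1, n2 = len(group1), len(group2)
    r1 = sum(rank(v) for v in group1)
    r2 = sum(rank(v) for v in group2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, r1, r2

group1 = [24, 18, 45, 57, 12, 30]
group2 = [28, 42, 63, 57, 90, 68]

u1, u2, r1, r2 = mann_whitney_u(group1, group2)
u = min(u1, u2)

CRITICAL_VALUE = 5  # value quoted in the example
significant = u < CRITICAL_VALUE
print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}, U = {u}, significant: {significant}")
```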

Chi-square Test
Chi-square is a test statistic used to test a hypothesis that provides a set of theoretical frequencies with which the observed frequencies are compared.
Chi-square, symbolically written as χ², enables us to test and compare whether more than two population proportions can be considered equal.


Hence, it is a non-parametric test of statistical significance, which compares observed data with expected data and tests the null hypothesis, which states t
The chi-square (χ²) is computed by using the following formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O represents the observed frequency and E represents the expected frequency.

Whether or not a calculated value of χ² is significant can be ascertained by looking at the tabulated values of χ² for a given degree of freedom at a certa
between the observed and expected frequencies is taken as significant but if the table value is more than the calculated value of χ², then the difference is
ignored.

Area of Application of Chi-square Test


The chi-square test technique is used in a number of problems. Some of them are
As a Test of Goodness of Fit: Karl Pearson developed a test for significance called the chi-square test of goodness of fit, which is used to test whether o
deviations, if any, between the observed and estimated values can be because of chance or some other inadequacies.

As a Test of Homogeneity: The χ² test helps us in stating whether different samples come from the same universe. Through this test, we can also explain whe
results fail to support the given hypothesis.

As a Test of Population Variance: Chi-square is also used to test the significance of population variance through confidence intervals, especially in the case of s

Conditions for the Applicability of Test


The following conditions should be satisfied before the test can be applied
• Observations are recorded and collected on a random basis.
• All the members in the sample must be independent
• No group should contain very few items.

• The overall number of items must be reasonably large.


• The constraints must be linear. Constraints, which involve linear equations in the cell frequencies of a contingency table are known as linear constraints
Steps Involved in Finding the Value of Chi-square
The process of computing the value of χ² involves the following steps

1. Set up the null hypothesis and alternative hypothesis.
2. List the observed frequencies.
3. Calculate the expected frequencies, if the data followed a given theoretical distribution.
4. Obtain the difference between the observed and corresponding expected frequencies.
5. Express the square of the difference as a fraction of the corresponding expected frequency.
6. Now add all the fractions obtained.
7. Then compare the value with the appropriate χ² value from the tables at the predetermined level of significance.
8. Accept the null hypothesis if the value thus computed for the given degrees of freedom and level of significance is less than the tabulated value, othe
Illustration: The following table depicts the expected sales (E) and actual sales (O) of television sets for a company. Test whether there is a substantial differe
Actual and Expected Sales of Television Sets
Actual Sales (O) Expected Sales (E)
57 59
69 76
51 55
83 75
44 39
48 53
35 30
37 48


Solution
Computation of Test Statistic

O      E      O – E    (O – E)²    (O – E)² / E
57     59     –2       4           0.068
69     76     –7       49          0.645
51     55     –4       16          0.291
83     75     8        64          0.853
44     39     5        25          0.641
48     53     –5       25          0.472
35     30     5        25          0.833
37     48     –11      121         2.521
Total                              6.324

The critical value of chi-square for (8 – 1) = 7 degrees of freedom at the 0.05 level of significance is 14.067

Since the calculated value of χ² is less than the critical value, it does not fall within the critical region and the null hypothesis cannot be rejected. That is, there is no significant difference between the actual and expected values of sales.
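
The computation in the illustration can be checked by summing (O – E)² / E over the eight rows. This is a sketch under stated assumptions: the degrees of freedom are taken as 8 – 1 = 7 as in the illustration, and the upper-tail critical value at the 0.05 level (14.067) is taken from a standard chi-square table.

```python
# Actual (O) and expected (E) sales from the illustration
observed = [57, 69, 51, 83, 44, 48, 35, 37]
expected = [59, 76, 55, 75, 39, 53, 30, 48]

def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

chi2 = chi_square(observed, expected)
df = len(observed) - 1

# Assumption: upper-tail critical value for df = 7 at alpha = .05 from a chi-square table
CHI2_CRITICAL = 14.067

reject_h0 = chi2 > CHI2_CRITICAL
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

Printing the per-row terms alongside the total is a simple way to catch sign or squaring slips in a hand-built table.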
