A Simple Introduction to
ANOVA
1Amemou Franck Elyse Yao (21msm3009), 2Kaleab Hailemariam (21msm3160), 3Venkat Kumar (21msm3080), 4Vishwajeet Shankar Goswami
yaoamemou996@[Link], [Link]
Chandigarh University, Gharuan, Punjab
Abstract
In this paper, we'll go over the various ANOVA methodologies that can be used to support sound decision-making. We'll look at a few examples and work through how to obtain and interpret the results. To follow this topic, you need a basic understanding of statistics; a working knowledge of t-tests and hypothesis testing would be advantageous.
Contents
1. Introduction to ANOVA
2. Terminologies related to ANOVA
¤ Grand Mean
¤ Hypotheses
¤ Between Group Variability
¤ Within Group Variability
¤ F-statistic
3. One-way ANOVA
Limitations of One-way ANOVA
4. Two-way ANOVA
Introduction to ANOVA:
Suppose three groups of patients each receive a different drug treatment, and we record the number of days it takes the patients in each group to be cured. Analyzing these recovery times is a popular method of determining a reliable treatment strategy. We may compare the three treatment samples using a statistical technique that shows how dissimilar they are from one another. The term "ANOVA" refers to a procedure that compares samples based on their means.
The analysis of variance (ANOVA) is a statistical technique for determining if the
means of two or more groups differ significantly. ANOVA compares the means of
different samples to determine the impact of one or more factors.
ANOVA can thus be used to test whether all of the drug treatments were equally beneficial.
The t-test is another method for comparing samples. When we have only two samples, the t-test and ANOVA produce the same findings. When there are more than two samples, however, repeated t-tests become unreliable: with each additional pairwise comparison, the probability of making at least one Type I error (a false positive) compounds, so the overall error rate of the analysis grows well beyond the nominal significance level.
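To see how quickly this compounds, here is a minimal sketch (assuming a per-test significance level of α = 0.05 and independent tests) of how the familywise error rate grows with the number of pairwise t-tests; the numbers are illustrative, not from the text above:

# Familywise error rate for m independent tests, each at significance level alpha:
# P(at least one false positive) = 1 - (1 - alpha)**m
alpha = 0.05  # per-test significance level (assumed)
for m in (1, 3, 6, 10):  # comparing 3 samples already requires 3 pairwise tests
    fwer = 1 - (1 - alpha) ** m
    print(f"{m} tests -> familywise error rate = {fwer:.3f}")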
When might we use ANOVA?
We might use ANOVA (Analysis of Variance) when, as a marketer for example, we want to test a particular hypothesis. We would use ANOVA to understand how different groups respond, with the null hypothesis for the test being that the means of the groups are equal. A statistically significant result means that at least two of the populations are different (or unequal).
How does ANOVA work?
Like other forms of statistical tests, ANOVA evaluates the means of different groups and shows you whether there are any statistical differences between them. ANOVA is a type of omnibus test statistic: it can't tell you which groups were statistically different from each other, only that at least two of them were.
The main ANOVA research question is whether the sample means come from different populations. The ANOVA test is based on two assumptions:
First, the observations within each sampled population are normally distributed, regardless of the data-gathering technique used.
Second, the populations under study have a common variance σ².
The formula for ANOVA is
F = MST / MSE
where F = ANOVA coefficient, MST = Mean Sum of Squares due to Treatment, and MSE = Mean Sum of Squares due to Error.
Terminologies related to ANOVA:
Before we get into the applications of ANOVA, I'd like to define a few terms that
are commonly used in the field.
Grand Mean -
A simple or arithmetic average of a set of values is called a mean. Separate sample means (μ1, μ2, μ3) and the grand mean (μ) are the two types of means that we employ in ANOVA computations.
The grand mean is the mean of all observations pooled together; equivalently, it is the average of the sample means weighted by their sample sizes.
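As a minimal sketch (the three samples here are hypothetical, used only for illustration), the relationship between the sample means and the grand mean can be computed as follows:

import numpy as np

# Three hypothetical samples of unequal sizes
a = np.array([4, 6, 8])
b = np.array([5, 7, 9, 11])
c = np.array([6, 8, 10])

samples = [a, b, c]
sample_means = [s.mean() for s in samples]      # mu_1, mu_2, mu_3
grand_mean = np.concatenate(samples).mean()     # mean of all pooled observations
print(sample_means, grand_mean)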
Hypothesis -
In the situation of the medication we discussed before, we may expect one of two
outcomes: either the medication will have an effect on the patients or it will not.
Hypothesis is the term used to describe these statements. An educated assumption
about something in the world around us is referred to as a hypothesis. It should be
observable or testable by experiment.
ANOVA uses a Null hypothesis and an Alternate hypothesis, much like any other
type of hypothesis you could have studied in statistics. When all of the sample
means are equal or have no significant difference, the null hypothesis in ANOVA
is valid. As a result, they can be considered part of a broader population. When at
least one of the sample means differs from the rest of the sample means, however,
the alternate hypothesis is correct. They can be expressed mathematically as:
H0: μ1 = μ2 = … = μL (Null Hypothesis)
H1: μl ≠ μm for some pair l, m (Alternative Hypothesis)
where μl and μm belong to any two sample means out of all the samples considered for the test. To put it another way, the null hypothesis states that all sample means are equal, or that the factor had no significant impact on the results. The alternative hypothesis asserts that at least one of the sample means differs from the others. But we're still stumped as to which one it is.
Between Group Variability -
Consider two sample distributions that overlap substantially. Because these samples overlap, their individual means will not differ significantly, and so the disparity between the individual means and the grand mean will be insignificant.
Now consider two sample distributions that are widely separated. Because the samples differ by such a large margin, their individual means diverge as well, and the discrepancy between the individual means and the grand mean is large.
Between-group variability is the term for this type of variation between distributions. It refers to differences between the distributions of the individual groups (or levels), arising because the group means differ from one another.
To compute this variability, each sample is examined and the difference between its mean and the grand mean is calculated. The grand mean will be similar to the individual means if the distributions overlap or are near, but the distance between the means and the grand mean will be substantial if the distributions are widely apart.
We'll treat Between Group Variability much as we did the standard deviation, computing it from the sample means and the grand mean. For each sample, we take the squared deviation of its mean from the grand mean. We also want to take the sample size into account when calculating each squared deviation: a deviation that comes from a larger sample is given more weight. So we multiply each squared deviation by its sample size before adding them together. This is known as the sum of squares for between-group variability (SSbetween):
SSbetween = Σ nk (μk − μ)²
where nk and μk are the size and mean of the k-th sample, and μ is the grand mean.
We still need to do one more thing to get a good measure of between-group variability. Recall how we arrived at the sample standard deviation:
We divide the sum of the squared deviations by the degrees of freedom. So we take each squared deviation, weight it by its sample size, add them all up, and divide by the degrees of freedom (dfbetween), which in the case of between-group variability is the number of sample means (k) minus 1. The resulting quotient is the mean square for between-group variability: MSbetween = SSbetween / dfbetween.
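Continuing the sketch with the same hypothetical samples, the between-group sum of squares and mean square can be computed as:

import numpy as np

samples = [np.array([4, 6, 8]), np.array([5, 7, 9, 11]), np.array([6, 8, 10])]
grand_mean = np.concatenate(samples).mean()

# Weight each sample mean's squared deviation from the grand mean by its sample size
ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
df_between = len(samples) - 1                   # k - 1
ms_between = ss_between / df_between
print(ss_between, df_between, ms_between)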
Within Group Variability -
Consider three sample distributions. As each sample's spread (variability) grows, their distributions overlap, and they come to look like part of one larger population.
Now consider a different arrangement of the same three samples, this time with less variance within each. Even though the means of the samples are comparable to those before, the samples now appear to belong to different populations.
Within-group variability refers to such variations within a sample. Because not all of the values within each group are the same, it captures the variation produced by differences within the individual groups (or levels). Each sample is examined separately, and the variability between the sample's individual points is calculated. In other words, no interactions between samples are taken into account.
Within-group variability can be calculated by measuring how far each value departs from its sample mean. We add up the squared deviations of each observation from its respective sample mean; this sum of squares, SSwithin, represents within-group variability.
As we did for between-group variability, we divide the sum of squared deviations by the degrees of freedom (dfwithin) to find a less biased estimator of the average squared deviation. This quotient is once again called the mean square, this time for within-group variability: MSwithin = SSwithin / dfwithin.
This time, the degrees of freedom are equal to the sum of the sample sizes (N) minus the number of samples (k). In other words, take the total number of values (N) and subtract the number of samples (k): dfwithin = N − k.
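The within-group quantities follow the same pattern; a minimal sketch with the same hypothetical samples:

import numpy as np

samples = [np.array([4, 6, 8]), np.array([5, 7, 9, 11]), np.array([6, 8, 10])]

# Sum the squared deviations of each observation from its own sample mean
ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)
df_within = sum(len(s) for s in samples) - len(samples)   # N - k
ms_within = ss_within / df_within
print(ss_within, df_within, ms_within)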
F-statistic -
The F-ratio is a statistic that determines whether or not the means of distinct samples are significantly different. The lower the F-ratio, the closer the sample means are, and in that case we cannot reject the null hypothesis.
F = Between group variability / Within group variability
This formula is relatively self-explanatory. The numerator term in the F-statistic computation is the between-group variability. As previously stated, as between-group variability increases, the sample means become increasingly dissimilar. To put it another way, the samples are more likely to come from completely distinct populations.
To reach a conclusion, the F-statistic calculated here is compared to the F-critical value. In our pharmaceutical example, if the calculated F-statistic is greater than the F-critical value (for a given significance level), we reject the null hypothesis and conclude that the therapy was effective.
The F-distribution, unlike the z and t-distributions, does not have any negative
values because the variance between and within groups is always positive due to
squaring each deviation.
As a result, there is only one critical region, in the right tail of the F-distribution. If the F-statistic falls within the critical region, we can conclude that the means are significantly different and reject the null hypothesis. To establish the critical region's cut-off, we must again find the critical value. For this, we make use of the F-table.
The F-critical value is a function of two quantities, dfbetween and dfwithin, so for each alpha/significance level we need to look up a distinct F-value for each pair of degrees of freedom.
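Instead of a printed F-table, the critical value can also be looked up programmatically; a minimal sketch using scipy (the degrees of freedom below are illustrative):

from scipy import stats

alpha = 0.05
df_between, df_within = 2, 7   # k - 1 = 2 and N - k = 7 for three samples of sizes 3, 4, 3
f_critical = stats.f.ppf(1 - alpha, df_between, df_within)  # right-tail cutoff
print(f_critical)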
One Way ANOVA:
Now that we've covered the fundamentals of ANOVA, let's look at some real-world instances of how it's done.
According to a recent study, playing music in class improves attentiveness and
hence helps students retain more information.
What if it had a negative impact on the pupils' grades? Or, alternatively, what type
of music would be appropriate for this? Given all of this, having some proof that it
works would be really beneficial.
To figure this out, we decided to implement it on a smaller group of randomly
selected students from three different classes. The idea is similar to conducting a
survey. We take three different groups of ten randomly selected students (all of
the same age) from three different classrooms. Each classroom was provided with
a different environment for students to study. Classroom A had constant music
being played in the background, classroom B had variable music being played and
classroom C was a regular class with no music playing. After one month, we
conducted a test for all the three groups and collected their test scores. The test
scores that we obtained were as follows:
Now we will calculate the sample means and the grand mean for the three groups.
Looking at the data above, we can deduce that the mean score of students in
Group A is clearly higher than the other two groups, implying that the treatment is
effective. Perhaps this is true, but there's also a chance that we chose the best
pupils from class A, resulting in higher test scores (remember, the selection was
done at random). This raises a few questions, such as:
1. How do we know that the differences in performance between these three
groups were caused by the varied settings and not by chance?
2. How distinct are these three samples from one another statistically?
3. What are the chances that students in group A perform so differently than
the other two groups?
To answer all these questions, we first calculate the F-statistic, which can be expressed as the ratio of Between Group Variability to Within Group Variability. Let's complete the ANOVA test for our example with α = 0.05, as sketched below.
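Here is a minimal sketch of how such a one-way test could be run in Python; the scores below are hypothetical stand-ins, since the paper's actual score table is not reproduced here:

import numpy as np
from scipy import stats

# Hypothetical test scores for the three classrooms (10 students each)
group_a = np.array([82, 85, 88, 90, 86, 84, 89, 91, 87, 83])  # constant music
group_b = np.array([75, 78, 74, 80, 77, 76, 79, 73, 78, 75])  # variable music
group_c = np.array([74, 77, 75, 79, 76, 78, 73, 80, 74, 76])  # no music

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
alpha = 0.05
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis: at least two group means differ.")
else:
    print("Fail to reject the null hypothesis.")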
Limitations of one-way ANOVA -
A one-way ANOVA tells us that at least two groups are different from each other, but it won't tell us which groups are different. If our test returns a significant F-statistic, we may need to run a post-hoc test to tell us exactly which groups have a difference in means, as sketched below.
So, after performing a post-hoc test, we can say that the constant music had a significant effect on the performance of students.
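One common post-hoc choice is Tukey's HSD test; a minimal sketch using statsmodels, reusing the hypothetical classroom scores from the sketch above:

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

group_a = np.array([82, 85, 88, 90, 86, 84, 89, 91, 87, 83])
group_b = np.array([75, 78, 74, 80, 77, 76, 79, 73, 78, 75])
group_c = np.array([74, 77, 75, 79, 76, 78, 73, 80, 74, 76])

scores = np.concatenate([group_a, group_b, group_c])
labels = ["A"] * 10 + ["B"] * 10 + ["C"] * 10

# Tukey's HSD compares every pair of groups while controlling the familywise error rate
result = pairwise_tukeyhsd(endog=scores, groups=labels, alpha=0.05)
print(result)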
Two Way ANOVA:
Using one-way ANOVA, we discovered that the music treatment helped our
students improve their test scores. However, this treatment was carried out on
students of the same age group. What if the treatment had differing effects on
different age groups of students? Or perhaps the treatment had different results
depending on the teacher who led the lesson.
Furthermore, how can we be certain which factor(s) has the most impact on the
kids' grades? Perhaps a student's age group has a greater impact on their
performance than the music treatment.
When the outcome or dependent variable (in this case, test scores) is influenced
by two independent variables/factors, we employ a slightly modified technique
known as two-way ANOVA.
We discovered that the groups exposed to 'varying music' and 'no music at all'
performed similarly in the one-way ANOVA test. It means that the children were
not affected by the varied music treatment in any way.
For the sake of simplicity, we shall ignore the "varying music" treatment when
running two-way ANOVA. Rather, a new variable, age, will be introduced to see
how well the treatment works with pupils of various ages. This time, our data is as
follows:
Here, there are two factors – class group and age group with two and three levels
respectively. So we now have six different groups of students based on different
permutations of class groups and age groups and each different group has a
sample size of 5 students.
A few questions that two-way ANOVA can answer about this dataset are:
1. Is music treatment the main factor affecting performance? In other words,
do groups subjected to different music differ significantly in their test
performance?
2. Is age the main factor affecting performance? In other words, do students of
different age differ significantly in their test performance?
3. Is there a significant interaction between the factors? In other words, how
do age and music interact with regard to a student’s test performance? For
example, it might be that younger students and elder students reacted
differently to such a music treatment.
4. Can any differences in one factor be found within another factor? In other
words, can any differences in music and test performance be found in
different age groups?
Two-way ANOVA tells us about the main effect and the interaction effect. The
main effect is similar to a one-way ANOVA where the effect of music and age
would be measured separately. Whereas, the interaction effect is the one where
both music and age are considered at the same time.
That’s why a two-way ANOVA can have up to three hypotheses, which are as
follows:
Two null hypotheses will be tested if we have placed only one observation in each
cell. For this example, those hypotheses will be:
H1: All the music treatment groups have equal mean scores.
H2: All the age groups have equal mean scores.
For multiple observations in cells, we would also be testing a third hypothesis:
H3: The factors are independent or the interaction effect does not exist.
An F-statistic is computed for each hypothesis we are testing.
Before we proceed with the calculation, consider how the data is laid out. The table used is known as a contingency table. Here, Al represents the total of the samples based only on factor 1, and similarly Am represents the total of the samples based only on factor 2. We will see shortly that these two totals are responsible for the main effects produced. A term Alm is also introduced, which represents the subtotal for a combination of one level of factor 1 with one level of factor 2. This term is responsible for the interaction effect produced when both factors are considered at the same time. And we are already familiar with the Grand Total G, which is the sum of all the observations (test scores), irrespective of the factors.
From such a table we obtain all the means: the sound (music) class means, the age group means, and the mean of every group combination.
Next, we calculate the sum of squares (SS) and degrees of freedom (df) for the sound class factor, the age group factor, and the interaction between the factors.
We already know how to calculate SS(within)/df(within) from the one-way ANOVA section, but in two-way ANOVA the formulas are different. The standard computational formulas for the two-way ANOVA calculation are sketched below.
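Using the totals defined in the contingency table, and assuming factor 1 has p levels, factor 2 has q levels, and each cell holds n observations (so N = pqn), the standard computational formulas are:

C = G² / N (the correction term)
SStotal = Σ x² − C (summing over all individual observations x)
SSfactor1 = Σ Al² / (qn) − C
SSfactor2 = Σ Am² / (pn) − C
SSinteraction = Σ Alm² / n − C − SSfactor1 − SSfactor2
SSwithin = SStotal − SSfactor1 − SSfactor2 − SSinteraction

The degrees of freedom are dffactor1 = p − 1, dffactor2 = q − 1, dfinteraction = (p − 1)(q − 1), and dfwithin = pq(n − 1); each mean square is the corresponding SS divided by its df.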
In two-way ANOVA, we also calculate SSinteraction and dfinteraction, which capture the combined effect of the two factors.
Since we have more than one source of variation (the main effects and the interaction effect), we naturally have more than one F-statistic as well.
Now, using these variances, we compute the value of the F-statistic for each main effect and for the interaction effect. For our example, the values of the F-statistics are:
F1 = 12.16
F2 = 15.98
F12 = 0.36
We can read the critical values from the F-table:
Fcrit1 = 4.25
Fcrit2 = 3.40
Fcrit12 = 3.40
The F-values for factor 1 (music) and factor 2 (age) are higher than their respective F-critical values. This means that both factors have a significant effect on the students' results, and we can therefore reject the null hypotheses for the factors. The F-value for the interaction effect, however, is well below its F-critical value, so we can conclude that music and age did not have any significant combined effect on the population. A minimal sketch of how such an analysis could be run appears below.
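For completeness, here is a minimal sketch of how a two-way ANOVA of this kind could be run in Python using statsmodels; the scores in the data frame are hypothetical stand-ins, since the paper's actual score table is not reproduced here:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical scores: 2 music levels x 3 age groups, 5 students per cell
data = pd.DataFrame({
    "score": [85, 88, 90, 84, 87,  75, 78, 74, 77, 76,
              80, 83, 85, 81, 84,  70, 73, 71, 74, 72,
              78, 81, 79, 82, 80,  68, 71, 69, 72, 70],
    "music": (["constant"] * 5 + ["none"] * 5) * 3,
    "age":   ["young"] * 10 + ["middle"] * 10 + ["older"] * 10,
})

# Fit a model with both main effects and their interaction
model = ols("score ~ C(music) * C(age)", data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)  # F and p-values for music, age, and the music:age interaction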