Data Analysis

Complete detailed notes for BS Education students

Uploaded by

nbc2693

Data Analysis B.Ed (Hons) QAED KOT ADDU

Data Analysis
in Education
A Practical Guide for Students and Researchers

Table of Contents

Unit 1: Introduction to Statistics

• Statistics in Education
• Importance of Statistics in Education

Unit 2: Graphic Representation of Data

• Histogram
• Frequency Polygon
• Frequency Curve
• Pie Chart/Graph

Unit 3: Measures of Central Tendency

• Mean
• Median
• Mode

Unit 4: Measures of Dispersion

• Range
• Quartile Deviation
• Standard Deviation

Unit 5: Measures of Relationship

• Correlation
• Normal Distribution
• Percentiles & Percentile Ranks
• Tests of Significance
• Parametric Tests
• Non-Parametric Tests

Unit 6: Measurement Scales

• Nominal
• Ordinal/Ranking
• Interval
• Ratio

Unit 7: Random Variables and Probability Distribution

• Random Sampling
• Random Variables and Their Distribution
• Binomial Distribution

Unit 8: Normal and Sampling Distributions

• Normal Distribution
• Interpreting Scores in terms of Z-Scores and Percentile Ranks

Unit 9: Statistical Inferences: One Sample

• Introduction to Hypothesis Testing
• One-Sample T-Test for a Mean
• Confidence Interval for a Mean
• One-Sample Z-Test and Confidence Interval for a Proportion
• One-Sample T-Test and Confidence Interval for Means using Independent & Dependent Samples

Unit 10: Analysis of Variance and Co-Variance

• Introduction to Analysis of Variance
• Basic Concepts in ANOVA
• Basic Concepts in ANCOVA
• Multiple Comparison Procedures

Unit 11: Statistical Inference for Frequency Data

• One-Sample Chi-Square Test
• Testing Goodness of Fit
• Testing Independence
• Testing Equality of Proportions

Unit 12: Statistical Inference for Ranked Data

• Assumption-Free Tests
• Mann-Whitney U Test for Two Independent Samples
• Wilcoxon Test for Dependent Samples

Unit 1: Introduction to Statistics

Learning Objectives

By the end of this unit, students will be able to:

• Define statistics and explain its scope.
• Understand the role of statistics in education.
• Differentiate between descriptive and inferential statistics.
• Recognize the importance of statistics for teachers, researchers, and policymakers.

1.1 What is Statistics?

The term Statistics comes from the Latin word status meaning "state." It originally
referred to data collected by governments, but today it is widely used in almost every
field.

Definition:
Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in decision-making.

✅ Example:
If we want to know the average marks of students in a class, we use statistics to calculate it.

1.2 Two Main Branches of Statistics

1. Descriptive Statistics
   ◦ Deals with collecting, summarizing, and presenting data.
   ◦ Examples: mean, median, mode, graphs, and charts.
2. Inferential Statistics
   ◦ Deals with drawing conclusions and making predictions about a population based on sample data.
   ◦ Uses hypothesis testing, t-tests, ANOVA, chi-square, etc.

✅ Example:

• Descriptive: “The average score in the test is 65.”
• Inferential: “Based on this sample, we can expect a similar average in the whole population.”

1.3 Statistics in Education

In education, statistics plays a vital role in:

• Measuring Student Performance → averages, percentages, and grades.
• Curriculum Development → analyzing needs and outcomes.
• Examinations and Assessments → test construction, reliability, and validity.
• Educational Research → hypothesis testing and drawing conclusions.
• Policy Decisions → literacy rate surveys, dropout studies, resource allocation.

✅ Example:
If the dropout rate in schools is 25%, statistics helps identify causes and propose
solutions.

1.4 Importance of Statistics in Education

1. Understanding Learners – Teachers can identify weak and strong areas of students.
2. Improving Teaching Methods – Feedback from test results helps refine teaching.
3. Decision-Making – School administrators use statistics to plan class sizes, allocate resources, and evaluate performance.
4. Research in Education – Provides a scientific basis for analyzing problems.
5. Evaluation of Curriculum – Helps determine the effectiveness of teaching strategies.
6. Predicting Trends – Enrollment projections, literacy growth, or exam success rates.

1.5 Role of Teachers and Researchers in Using Statistics

• Teachers use statistics for grading, classroom assessments, and tracking student progress.
• Researchers use statistics to test hypotheses, analyze surveys, and interpret findings.
• Administrators use statistics for decision-making in budgets, policies, and planning.

1.6 Limitations of Statistics

• Statistics deals mainly with quantitative data; qualitative aspects such as emotions or attitudes can be analyzed only after they have been measured on a scale.
• Misuse or misinterpretation of data may lead to wrong conclusions.
• It requires careful data collection and analysis; otherwise, results may be biased.

1.7 Educational Application Example

Imagine a teacher wants to check whether a new teaching method improves performance:

1. Divide the class into two groups, ideally at random.
2. Group A uses the new method; Group B uses traditional teaching.
3. Collect test scores and apply a statistical test.
4. If Group A performs significantly better, the results support the claim that the new method is more effective.
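
As a rough illustration of step 3, the sketch below computes Welch's t statistic for two small groups using only Python's standard library. The scores and the function name `welch_t` are hypothetical, and Welch's test is only one reasonable choice of test here, not necessarily the one a teacher would use.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical test scores, invented for illustration only.
group_a = [72, 75, 78, 80, 85]  # new method
group_b = [65, 68, 70, 74, 73]  # traditional teaching

t_value = welch_t(group_a, group_b)
print(f"mean A = {mean(group_a)}, mean B = {mean(group_b)}, t = {t_value:.2f}")
```

The further t is from zero, the stronger the evidence of a real difference. Turning t into a p-value needs a t-table or a statistics library, which this sketch leaves out.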

Summary of Unit 1

• Statistics = science of data collection, organization, analysis, and interpretation.
• Two branches: Descriptive (summarizing) and Inferential (predicting).
• In education, statistics helps in research, teaching improvement, assessment, and decision-making.
• It has limitations, but when used correctly, it is a powerful tool.

Self-Assessment Questions

1. Define statistics in your own words.
2. Differentiate between descriptive and inferential statistics with examples.
3. Give three uses of statistics in education.
4. How can a teacher use statistics in classroom assessment?
5. What are two limitations of statistics?

Unit 2: Graphic Representation of Data


Learning Objectives

By the end of this unit, students will be able to:

• Understand the importance of graphical presentation in statistics.
• Draw and interpret histograms, frequency polygons, frequency curves, and pie charts.
• Select appropriate graphs for different types of data.
• Apply graphic representation in educational research and classroom contexts.

2.1 Introduction

Numerical data in tables may be difficult to understand. Graphs and charts present data in a visual form that makes interpretation easier.

✅ Example: A table showing exam scores may look complex, but a bar chart or pie chart quickly shows performance trends.

2.2 Histogram

Definition

A histogram is a bar graph that represents the frequency distribution of continuous data.

Features

• Bars are drawn without gaps.
• X-axis = class intervals, Y-axis = frequencies.
• The area of each bar is proportional to frequency.

✅ Example:
Marks scored by students in a test:

Marks Range    Frequency
0–10           5
10–20          8
20–30          12
30–40          10
40–50          5

A histogram will show 5 bars, each representing the frequency of students in each range.
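
Before drawing the histogram on paper, the frequency table above can be turned into a quick text version. This is a minimal sketch: the function name `text_histogram` and the use of `#` symbols for bar length are illustrative choices, not part of the original notes.

```python
# Frequency table from the marks example above.
intervals = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [5, 8, 12, 10, 5]

def text_histogram(labels, freqs):
    """Return one text row per class interval; bar length equals the frequency."""
    return [f"{label:>6} | {'#' * f} ({f})" for label, f in zip(labels, freqs)]

rows = text_histogram(intervals, frequencies)
print("\n".join(rows))
```

The longest row (20–30) marks the class with the most students, which is what the tallest bar shows in a drawn histogram.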

2.3 Frequency Polygon

Definition

A frequency polygon is a line graph that shows the shape of a distribution.

Steps to Draw

1. Draw a histogram first.
2. Mark the midpoint of the top of each bar.
3. Connect the midpoints with straight lines.
4. Extend the line to the X-axis at both ends, at the midpoints of the empty class intervals just before the first bar and just after the last bar.
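
The steps above can be sketched as a small calculation of the points to plot. The example assumes the class intervals from the histogram example in Section 2.2 (width 10, starting at 0); the function name `polygon_points` is hypothetical.

```python
def polygon_points(lower_bounds, width, freqs):
    """Return (x, y) points of a frequency polygon, closed on the X-axis at both ends."""
    midpoints = [lower + width / 2 for lower in lower_bounds]
    # Add the midpoints of the empty classes before the first and after the last bar.
    xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))

points = polygon_points([0, 10, 20, 30, 40], 10, [5, 8, 12, 10, 5])
for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon; the first and last points sit on the X-axis.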

Use in Education

• Useful for comparing two or more distributions (e.g., boys vs. girls exam scores).

2.4 Frequency Curve


Definition

A frequency curve is a smooth curve drawn through the points of a frequency polygon.

Features

• Represents the general trend of data.
• Used in large data sets for better visualization.
• The most famous frequency curve is the Normal Curve (bell-shaped).

✅ Example: Students’ IQ scores often form a normal curve where most students are
average, a few are very high or very low.

2.5 Pie Chart (or Pie Diagram)


Definition

A pie chart is a circular diagram divided into slices to represent proportions of a whole.

Steps to Draw

1. Convert each category into an angle using the formula:

\[ \text{Angle} = \frac{\text{Frequency}}{\text{Total Frequency}} \times 360^\circ \]

2. Draw a circle and divide it into sectors accordingly.

Example:
Survey of students’ favorite subjects:

Subject    Students
English    20
Math       30
Science    25
History    15
Total      90

• English = (20/90) × 360 = 80°
• Math = (30/90) × 360 = 120°
• Science = (25/90) × 360 = 100°
• History = (15/90) × 360 = 60°

✅ A pie chart will show a circle divided into 4 slices.
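
The angle formula can be checked with a few lines of Python. This is only a sketch using the survey numbers above; the function name `pie_angles` is an illustrative choice.

```python
def pie_angles(counts):
    """Convert category frequencies into pie-chart angles in degrees."""
    total = sum(counts.values())
    return {category: count * 360 / total for category, count in counts.items()}

subjects = {"English": 20, "Math": 30, "Science": 25, "History": 15}
angles = pie_angles(subjects)
print(angles)
print(sum(angles.values()))  # the sectors should add up to a full circle
```

A useful self-check when drawing by hand: the angles must always sum to 360°.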



2.6 Importance of Graphs in Education

• Make data easy to understand.
• Help in comparing groups (e.g., class performance in different subjects).
• Useful in research reports and presentations.
• Encourage students’ visual learning and critical thinking.

Summary of Unit 2

• Histogram: bars without gaps; shows frequency distribution.
• Frequency Polygon: line graph connecting midpoints of histogram bars.
• Frequency Curve: smooth curve showing general trend.
• Pie Chart: circular diagram showing proportions.
• Graphs make complex data simple, visual, and easy to compare.

Self-Assessment Questions

1. What is the difference between a histogram and a bar chart?
2. Explain the steps of drawing a frequency polygon.
3. Why is the normal curve important in educational research?
4. Construct a pie chart for this data: Boys = 40, Girls = 60.
5. Give two advantages of using graphs in teaching statistics.

Here are four illustrative graphs that visually capture key concepts from Unit 2: Graphic Representation of Data:

1. Histogram + Frequency Polygon Overlay – A histogram (colored bars) with a red frequency polygon superimposed illustrates how both representations can be used together to show distribution shapes. (Image 1)
2. Mixed Graph Types – A layout showing multiple chart types, including bar charts and pie charts, demonstrates how different visuals operate together. (Image 2)
3. Frequency Curve – A smooth curve drawn over histogram data, highlighting the overall trend of a distribution. (Image 3)
4. Pie Chart Example – A colorful pie chart showing proportional data distribution, useful for illustrating relative shares at a glance. (Image 4)

How These Visuals Support Unit 2 Concepts

• Histogram (Image 1): Ideal for depicting frequency distribution across continuous intervals; bars are contiguous, representing ranges like exam scores.
• Frequency Polygon (Image 1 & 3): Line graph connecting midpoints of the histogram’s bars for a clearer trend shape and comparison of multiple datasets.
• Frequency Curve (Image 3): A smooth variant of the frequency polygon, excellent for large datasets and showing the "shape" of distributions.
• Pie Chart (Image 4): Shows how a whole splits into parts, perfect for categorical data like preferences or subject popularity.

Suggested Use in Your Book

• Captions & Explanation: Each graph should have a caption and be referenced in your book text, e.g.,
   ◦ Figure 2.1: Histogram showing score distribution,
   ◦ Figure 2.2: Frequency polygon illustrating score trends, etc.
• Combined Examples: Present a dataset and show how it is represented in each graph type; this helps readers see how different visuals emphasize different aspects of the data.
• DIY Activity: Include an exercise. Provide a small frequency table and ask students to draw a histogram, then overlay a polygon, then convert it into a pie chart.

Unit 3: Measures of Central Tendency

Learning Objectives

By the end of this unit, students will be able to:

• Define and explain measures of central tendency.
• Calculate mean, median, and mode from raw and grouped data.
• Understand the advantages and limitations of each measure.
• Apply measures of central tendency in educational settings.

3.1 Introduction

In statistics, we often deal with large sets of numbers. To make sense of the data, we need a single representative value that describes the “center” of the data.

These representative values are called Measures of Central Tendency.

✅ Examples:

• Average marks of a class.
• Median household income.
• Most common grade obtained by students.

The three most commonly used measures are: Mean, Median, and Mode.

3.2 Mean (Arithmetic Average)

Definition

The mean is the sum of all values divided by the number of values.

Formula

\[ \bar{X} = \frac{\sum X}{N} \]

Where:

• ∑X = sum of all values
• N = number of values

Example (Raw Data):
Marks: 10, 20, 30, 40, 50

\[ \bar{X} = \frac{10+20+30+40+50}{5} = \frac{150}{5} = 30 \]

Example (Grouped Data):

Marks Range    Midpoint (X)    Frequency (f)    fX
0–10           5               4                20
10–20          15              6                90
20–30          25              10               250
30–40          35              5                175

\[ \sum f = 25, \quad \sum fX = 535 \]

\[ \bar{X} = \frac{\sum fX}{\sum f} = \frac{535}{25} = 21.4 \]

✅ Use in Education: Average marks in an exam.
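
Both worked examples above can be reproduced in Python. This is a minimal sketch; the function name `grouped_mean` is illustrative and assumes each class is represented by its midpoint, as in the table.

```python
def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum of f*X divided by sum of f."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

marks = [10, 20, 30, 40, 50]
raw_mean = sum(marks) / len(marks)

# Midpoints and frequencies from the grouped-data table above.
table_mean = grouped_mean([5, 15, 25, 35], [4, 6, 10, 5])

print(raw_mean)    # 30.0
print(table_mean)  # 21.4
```

The grouped mean is an estimate, because every score in a class is treated as if it equals the class midpoint.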

3.3 Median

Definition

The median is the middle value when the data is arranged in order.

Steps (Raw Data):

1. Arrange data in ascending order.
2. If N is odd → middle value.
3. If N is even → average of two middle values.

Example:

Data: 5, 7, 8, 10, 12
Median = 8 (middle value).

Data: 5, 7, 8, 10, 12, 15
Median = (8+10)/2 = 9.

Formula (Grouped Data):

\[ \text{Median} = L + \left(\frac{\frac{N}{2} - CF}{f}\right) \times h \]

Where:

• L = lower boundary of median class
• N = total frequency
• CF = cumulative frequency before median class
• f = frequency of median class
• h = class size

✅ Use in Education: Median income of families; shows the central tendency without
being affected by extreme values.
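
The raw-data steps and the grouped formula can both be tried in Python. Section 3.3 gives no grouped example, so this sketch reuses the frequency table from Section 3.2 as an assumption; the function name `grouped_median` is hypothetical and assumes equal class widths.

```python
from statistics import median

def grouped_median(lower_bounds, width, freqs):
    """Median of grouped data: L + ((N/2 - CF) / f) * h."""
    half = sum(freqs) / 2
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")

print(median([5, 7, 8, 10, 12]))      # 8
print(median([5, 7, 8, 10, 12, 15]))  # 9.0

# Table from Section 3.2: classes 0-10, 10-20, 20-30, 30-40.
table_median = grouped_median([0, 10, 20, 30], 10, [4, 6, 10, 5])
print(table_median)  # 22.5
```

Here N/2 = 12.5 falls in the 20–30 class (CF = 10, f = 10), so the median is 20 + (2.5/10) × 10 = 22.5.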

3.4 Mode

Definition

The mode is the value that occurs most frequently in a dataset.

Examples (Raw Data):

• Data: 2, 4, 4, 5, 6 → Mode = 4
• Data: 10, 15, 20, 20, 25, 25, 25, 30 → Mode = 25

Formula (Grouped Data):

\[ \text{Mode} = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h \]

Where:

• L = lower boundary of modal class
• f_1 = frequency of modal class
• f_0 = frequency before modal class
• f_2 = frequency after modal class
• h = class size

✅ Use in Education: Most common grade obtained in an exam.
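
A short sketch of both cases follows. The raw-data examples use `statistics.multimode` (Python 3.8+); the grouped formula is applied to the frequency table from Section 3.2, which is an assumption since this section gives no grouped example. The function name `grouped_mode` is hypothetical.

```python
from statistics import multimode

def grouped_mode(lower_bounds, width, freqs):
    """Mode of grouped data: L + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    i = freqs.index(max(freqs))                    # position of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0              # frequency before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0 # frequency after the modal class
    return lower_bounds[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width

print(multimode([2, 4, 4, 5, 6]))                     # [4]
print(multimode([10, 15, 20, 20, 25, 25, 25, 30]))    # [25]

table_mode = grouped_mode([0, 10, 20, 30], 10, [4, 6, 10, 5])
print(round(table_mode, 2))  # 24.44
```

`multimode` returns a list because a dataset may have more than one mode. The grouped sketch simply takes the first class with the highest frequency and does not handle the case where the denominator is zero.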

3.5 Comparison of Mean, Median, and Mode

Measure   Advantages                          Limitations                   Best Use
Mean      Uses all data; easy to calculate    Affected by extreme values    For normal distribution
Median    Not affected by extreme values      Ignores some data             Skewed distributions
Mode      Easiest to understand               May not be unique             Categorical data (like grades)

3.6 Educational Applications

• Mean → To find the average performance of students.
• Median → To understand the "middle student" performance, useful for scholarships.
• Mode → To identify the most common mistakes or scores in tests.

Summary of Unit 3

• Mean: average value, best for roughly symmetric data without extreme values.
• Median: middle value, useful when data has extremes.
• Mode: most frequent value, useful for categorical data.
• Each measure has strengths and weaknesses; often all three are used together.

Self-Assessment Questions

1. Define mean, median, and mode.
2. Calculate the mean of marks: 15, 20, 25, 30, 35.
3. Find the median of the dataset: 6, 8, 12, 15, 20, 25.
4. Which measure is most affected by extreme values?
5. Give one educational example each of mean, median, and mode.

Unit 4: Measures of Dispersion

Learning Objectives

By the end of this unit, students will be able to:

• Define dispersion and explain its importance in statistics.
• Calculate Range, Quartile Deviation, and Standard Deviation.
• Compare measures of dispersion and identify their uses in education.
• Apply these measures to classroom and research situations.

4.1 Introduction to Dispersion

So far, we have learned how to calculate central tendency (mean, median, mode). But
these measures do not tell us how spread out the data is.

✅ Example:

• Class A: Marks = 48, 50, 52 → Mean = 50
• Class B: Marks = 20, 50, 80 → Mean = 50

Although both classes have the same mean, the spread of marks is very different.
This spread is measured by dispersion.

Dispersion = the degree to which values deviate from the central value.

4.2 Range

Definition

The range is the difference between the largest and smallest value.

Formula

\[ \text{Range} = L - S \]

Where:

• L = Largest value
• S = Smallest value

Example

Data: 15, 18, 20, 25, 30
Range = 30 – 15 = 15

✅ Use in Education: Quick measure of score variation in a test.

Limitations: Only considers extreme values, ignores all others.

4.3 Quartile Deviation (Semi-Interquartile Range)

Definition

Quartile Deviation (Q.D.) measures the spread of the middle 50% of data.

Formula

\[ Q.D. = \frac{Q_3 - Q_1}{2} \]

Where:

• Q_1 = First Quartile (25th percentile)
• Q_3 = Third Quartile (75th percentile)

Example

Data: 5, 7, 8, 10, 12, 14, 16, 18, 20

• Q_1 = 8 (the 3rd value), Q_3 = 16 (the 7th value). Quartile conventions differ slightly between textbooks, so other methods may give slightly different values.

\[ Q.D. = \frac{16 - 8}{2} = \frac{8}{2} = 4 \]

✅ Use in Education: Better than range as it ignores extreme values.
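
The example above can be reproduced with Python's `statistics.quantiles`. The sketch assumes the `"inclusive"` method, which happens to match the quartiles used in this example (Q1 = 8, Q3 = 16); the default `"exclusive"` method would give different values. The function name `quartile_deviation` is illustrative.

```python
from statistics import quantiles

def quartile_deviation(data):
    """Half the distance between the third and first quartiles."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return (q3 - q1) / 2

data = [5, 7, 8, 10, 12, 14, 16, 18, 20]
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q3)                    # 8.0 16.0
print(quartile_deviation(data))  # 4.0
print(max(data) - min(data))     # range of the same data, for comparison
```

Because only Q1 and Q3 are used, changing the smallest or largest value (for example, 20 to 200) leaves the quartile deviation unchanged, while the range changes a lot.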

4.4 Standard Deviation (SD)

Definition

Standard Deviation is the most important and widely used measure of dispersion. It shows how far values typically lie from the mean: it is the square root of the average squared deviation from the mean.

Formula (Ungrouped Data)

\[ SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \]

Where:

• X = each value
• \bar{X} = mean
• N = number of values

Example

Data: 2, 4, 6

\[ \bar{X} = 4 \]

\[ SD = \sqrt{\frac{(2-4)^2 + (4-4)^2 + (6-4)^2}{3}} = \sqrt{\frac{4 + 0 + 4}{3}} = \sqrt{\frac{8}{3}} \approx 1.63 \]

Formula (Grouped Data)

\[ SD = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2} \]

✅ Use in Education:

• Comparing score variability across classes.
• Checking reliability of test results.
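
Both formulas can be checked in Python. `statistics.pstdev` divides by N, which matches the ungrouped formula in this unit. The grouped example reuses the frequency table from Section 3.2 as an assumption, since no grouped example is given here; the function name `grouped_sd` is hypothetical.

```python
from math import sqrt
from statistics import pstdev

def grouped_sd(midpoints, freqs):
    """SD of grouped data: sqrt(sum(f*X^2)/sum(f) - (sum(f*X)/sum(f))^2)."""
    n = sum(freqs)
    mean_of_squares = sum(f * x * x for x, f in zip(midpoints, freqs)) / n
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sqrt(mean_of_squares - mean ** 2)

raw_sd = pstdev([2, 4, 6])
print(round(raw_sd, 2))    # 1.63

# Midpoints and frequencies from the table in Section 3.2.
table_sd = grouped_sd([5, 15, 25, 35], [4, 6, 10, 5])
print(round(table_sd, 2))  # 9.75
```

Note that `statistics.stdev` (without the leading p) divides by N − 1 instead and is meant for samples, so it would give a slightly larger value than the formula used here.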

4.5 Comparison of Measures of Dispersion

Measure   Advantages                      Limitations              Best Use
Range     Simple, quick                   Affected by extremes     Small datasets
Q.D.      Ignores extremes                Uses only 50% of data    Skewed distributions
S.D.      Most reliable, uses all data    Complex calculation      Research, exam analysis

4.6 Educational Applications

• Range → Spread of student marks in a test.
• Quartile Deviation → Middle 50% performance in standardized tests.
• Standard Deviation → Comparing two classes: smaller SD means more consistent performance.

✅ Example:

• Class A mean = 70, SD = 5 → scores are consistent.
• Class B mean = 70, SD = 20 → scores vary widely.

Summary of Unit 4

• Dispersion = spread of data around a central value.
• Range = difference between highest & lowest values.
• Quartile Deviation = half the difference between Q3 and Q1.
• Standard Deviation = typical (root-mean-square) deviation from the mean, most reliable.
• Dispersion helps in comparing performance and consistency in education.

Self-Assessment Questions

1. Define dispersion in statistics.
2. Calculate the range of: 12, 15, 18, 25, 30.
3. Explain quartile deviation with an example.
4. Why is standard deviation considered the best measure of dispersion?
5. In education, how can standard deviation help compare class performance?

Unit 5: Measures of Relationship


Learning Objectives

By the end of this unit, students will be able to:

• Define correlation and explain its significance in education.
• Interpret and calculate measures of correlation.
• Understand normal distribution and its role in educational assessment.
• Explain percentiles and percentile ranks.
• Differentiate between parametric and non-parametric tests.
• Apply statistical tests of significance in research.

5.1 Introduction

In educational research, we often want to know whether two or more variables are related. For example:

• Is study time related to exam performance?
• Is there a relationship between attendance and grades?

Measures of Relationship help us answer such questions scientifically.

5.2 Correlation

Definition

Correlation measures the degree of relationship between two variables.

Types of Correlation

1. Positive Correlation – both variables increase together.
   ◦ Example: More study hours → higher grades.
2. Negative Correlation – one variable increases while the other decreases.
   ◦ Example: More absences → lower grades.
3. Zero Correlation – no relationship.
   ◦ Example: Shoe size and intelligence.

Karl Pearson’s Correlation Coefficient (r):

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \cdot \sum (Y - \bar{Y})^2}} \]

• Value of r lies between –1 and +1.
   ◦ r = +1 → Perfect positive correlation
   ◦ r = −1 → Perfect negative correlation
   ◦ r = 0 → No linear correlation

✅ Educational Example: Correlation between students’ IQ and academic achievement.
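
Pearson's formula translates almost term by term into Python. The data below (study hours and test scores for five students) are hypothetical and chosen only to show a strong positive relationship; the function name `pearson_r` is illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient, following the formula above."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sqrt(sum((x - mean_x) ** 2 for x in xs)
                       * sum((y - mean_y) ** 2 for y in ys))
    return numerator / denominator

# Hypothetical data, invented for illustration only.
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 58, 65, 70, 79]

r = pearson_r(study_hours, test_scores)
print(round(r, 3))
```

A value of r close to +1 here only says that scores rise with study hours in this made-up sample; correlation by itself does not prove that one variable causes the other.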

5.3 Normal Distribution

Definition

The normal distribution is a bell-shaped curve that shows how scores are distributed
around the mean.

Characteristics

• Symmetrical about the mean.
• Mean = Median = Mode.
• About 68% of data lies within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean.

✅ Educational Use:

• Standardized testing.
• IQ distribution (most students average, few very high/low).
• Grading on a curve.
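
The 68–95–99.7 figures can be verified with the error function from Python's `math` module. This is a small sketch; the function name `share_within` is illustrative.

```python
from math import erf, sqrt

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))  # about 68.3, 95.4, 99.7
```

The familiar 68%, 95%, and 99.7% are rounded versions of these values, which is why they are usually quoted as approximate.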

5.4 Percentiles and Percentile Ranks

Definition

• Percentile: A value below which a given percentage of observations fall.
• Percentile Rank: Percentage of scores below a particular score.

Formula

\[ P_k = L + \left(\frac{\frac{kN}{100} - CF}{f}\right) \times h \]

Where:

• P_k = kth percentile
• N = total frequency
• L = lower boundary of the percentile class
• CF = cumulative frequency before the percentile class
• f = frequency of the percentile class
• h = class size

✅ Educational Example: If a student is at the 75th percentile, the student performed better than 75% of peers.
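
Both ideas can be sketched in Python. The grouped formula is applied to the frequency table from Section 3.2 (an assumption, since this section gives no table), and the list of ten scores is hypothetical. The function names `grouped_percentile` and `percentile_rank` are illustrative.

```python
def grouped_percentile(k, lower_bounds, width, freqs):
    """k-th percentile of grouped data: L + ((kN/100 - CF) / f) * h."""
    target = k * sum(freqs) / 100
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("k must be between 0 and 100")

def percentile_rank(score, scores):
    """Percentage of scores strictly below the given score."""
    return 100 * sum(s < score for s in scores) / len(scores)

p75 = grouped_percentile(75, [0, 10, 20, 30], 10, [4, 6, 10, 5])
print(p75)  # 28.75

# Hypothetical class scores, invented for illustration only.
class_scores = [40, 50, 55, 60, 65, 70, 75, 80, 85, 90]
rank_of_80 = percentile_rank(80, class_scores)
print(rank_of_80)  # 70.0
```

The two functions answer opposite questions: the first starts from a percentage and returns a score, the second starts from a score and returns a percentage.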

5.5 Tests of Significance

Statistical tests help us decide whether the observed results are due to chance or a real
relationship.

Parametric Tests

• Assume the data follow (approximately) a normal distribution.
• Examples: t-test, z-test, ANOVA.
• Used for interval/ratio scale data.

Non-Parametric Tests

• Do not assume a normal distribution.
• Examples: Chi-square, Mann-Whitney U test.
• Used for nominal/ordinal data.

✅ Educational Example:

• Parametric: Comparing mean scores of two classes.
• Non-Parametric: Comparing students’ ranks in a debate competition.

5.6 Practical Educational Applications

• Correlation → Helps find relationships (e.g., study habits and performance).
• Normal Distribution → Used in standardized testing and grading.
• Percentiles → Reporting student performance in exams.
• Tests of Significance → Validating research findings in education.

Summary of Unit 5

• Correlation measures relationships between variables (positive, negative, zero).
• Normal distribution is a bell curve used in testing and research.
• Percentiles and ranks show relative performance.
• Tests of significance help judge reliability of results.
• Parametric tests = based on normality; Non-parametric tests = make fewer assumptions about the distribution.

Self-Assessment Questions

1. Define correlation and give one educational example.
2. What are the characteristics of a normal distribution?
3. Differentiate between percentiles and percentile ranks.
4. Give two examples each of parametric and non-parametric tests.
5. Why are tests of significance important in educational research?

Unit 6: Measurement Scales

Learning Objectives

By the end of this unit, students will be able to:

• Define and differentiate between the four scales of measurement.
• Identify examples of nominal, ordinal, interval, and ratio data.
• Understand the importance of measurement scales in educational research.
• Select appropriate statistical techniques based on measurement scales.

6.1 Introduction

Measurement is a fundamental part of statistics. In education, we measure student performance, attitudes, and behaviors. But not all measurements are of the same type.

Psychologist S.S. Stevens (1946) classified measurement into four scales:

1. Nominal
2. Ordinal
3. Interval
4. Ratio

Each scale provides different levels of information and determines which statistical methods can be used.

6.2 Nominal Scale

Definition

• The simplest level of measurement.
• Data are categories without any order or numerical meaning.

Examples

• Gender (Male, Female).
• Subjects (English, Math, Science).
• Blood group (A, B, AB, O).

Educational Use

• Classifying students into sections (A, B, C).
• Recording types of learning styles (Visual, Auditory, Kinesthetic).

✅ Statistics Used: Mode, Chi-square.

6.3 Ordinal Scale

Definition

• Data are arranged in order or rank, but the differences between ranks are not necessarily equal.

Examples

• Position in class (1st, 2nd, 3rd).
• Grades (A, B, C, D).
• Level of satisfaction (Satisfied, Neutral, Unsatisfied).

Educational Use

• Ranking students by performance.
• Rating teachers’ effectiveness.

✅ Statistics Used: Median, Rank-order correlation, Non-parametric tests.

6.4 Interval Scale

Definition

• Data are measured on a scale with equal intervals but no true zero point.

Examples

• Temperature in Celsius or Fahrenheit.
• Test scores (e.g., IQ test with no absolute zero).

Educational Use

• Standardized test scores (SAT, IQ tests).
• Measuring attitudes with Likert scales (often treated as interval data).

✅ Statistics Used: Mean, Standard Deviation, Correlation, t-tests, ANOVA.

6.5 Ratio Scale

Definition

• Highest level of measurement.
• Data have equal intervals and a true zero point.

Examples

• Height, weight, age.
• Marks in an exam (0 means absence of marks).
• Time taken to solve a problem.

Educational Use

• Measuring actual performance in exams.
• Recording time students take to complete assignments.

✅ Statistics Used: All statistical techniques, including geometric mean and coefficient of variation.

6.6 Comparison of Measurement Scales

Scale      Characteristics                 Examples              Statistics Used
Nominal    Categories only                 Gender, Subjects      Mode, Chi-square
Ordinal    Order, no equal intervals       Ranks, Grades         Median, Rank correlation
Interval   Equal intervals, no true zero   IQ, Temperature       Mean, SD, t-test, ANOVA
Ratio      Equal intervals, true zero      Age, Marks, Height    All statistical methods

6.7 Importance in Education

• Helps decide which statistical techniques are appropriate.
• Ensures accurate interpretation of student data.
• Provides clarity in research methodology.

✅ Example:

• If student scores are recorded as "Pass/Fail," that is Nominal.
• If ranked 1st, 2nd, 3rd → Ordinal.
• If measured on a test with equal intervals (0–100) → Interval/Ratio.

Summary of Unit 6

• Four scales of measurement: Nominal, Ordinal, Interval, Ratio.
• Nominal = categories, Ordinal = order, Interval = equal units without true zero, Ratio = equal units with true zero.
• The level of measurement determines what type of statistical analysis can be applied.

Self-Assessment Questions

1. Differentiate between nominal and ordinal scales with examples.
2. Why does the interval scale not have a true zero?
3. Give one educational example each for interval and ratio scales.
4. Which scale allows the use of the most statistical techniques?
5. Arrange the four scales of measurement in order of complexity.

Unit 7: Random Variables and Probability Distribution

Learning Objectives

By the end of this unit, students will be able to:

• Define random sampling and random variables.
• Differentiate between discrete and continuous random variables.
• Understand the concept of probability distribution.
• Explain and apply the Binomial Distribution.
• Recognize the importance of probability distributions in education.

7.1 Random Sampling

Definition

Random sampling is a method in which each member of a population has an equal chance of being selected.

Types of Random Sampling

1. Simple Random Sampling – Lottery method, random numbers.
2. Systematic Sampling – Selecting every kth element.
3. Stratified Sampling – Dividing population into groups (strata) and sampling from each.
4. Cluster Sampling – Dividing into clusters, then randomly selecting clusters.

✅ Educational Use: Selecting students for a survey on study habits.
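
As a minimal sketch, the first two sampling types can be tried with Python's `random` module. The roster of 100 student IDs is hypothetical, and the seed is fixed only so the result can be repeated; a real survey would normally not fix it.

```python
import random

# Hypothetical roster of 100 students: S001 ... S100.
students = [f"S{i:03d}" for i in range(1, 101)]
rng = random.Random(42)  # fixed seed so the example is repeatable

# Simple random sampling: every student has the same chance of selection.
simple_sample = rng.sample(students, 10)

# Systematic sampling: random start, then every k-th student.
k = len(students) // 10
start = rng.randrange(k)
systematic_sample = students[start::k]

print(simple_sample)
print(systematic_sample)
```

Stratified and cluster sampling follow the same idea, but first split the roster into groups (for example by class section) before drawing.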

7.2 Random Variables


Definition

A random variable is a variable whose values are numerical outcomes of a random experiment.

Types

1. Discrete Random Variable – Takes specific values (countable).
   ◦ Example: Number of students present in a class.
2. Continuous Random Variable – Takes any value within a range.
   ◦ Example: Students’ heights and weights (test scores are often treated as continuous too).

✅ Educational Example: Number of correct answers in a multiple-choice test (discrete).

7.3 Probability Distribution

Definition

A probability distribution is a table or function that shows the probability of different outcomes of a random variable.

Properties

1. Probabilities are between 0 and 1.
2. The sum of all probabilities = 1.

✅ Example: Probability distribution of tossing a fair coin:

Outcome    Probability
Head       0.5
Tail       0.5

7.4 Binomial Distribution

Definition

A binomial distribution is the probability distribution of the number of successes in a
fixed number of independent trials, each with the same probability of success.

Formula
P(X = r) = \binom{n}{r} p^r q^{n-r}

Where:

 n = number of trials
 r = number of successes
 p = probability of success
 q = 1 - p = probability of failure

Example

If a student has a 0.6 probability of passing a test, find the probability that he passes
exactly 2 times in 3 attempts.

P(X = 2) = \binom{3}{2} (0.6)^2 (0.4)^1

= 3 \times 0.36 \times 0.4 = 0.432

✅ Answer: Probability = 43.2%
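The same calculation can be checked with a short Python sketch. The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the binomial coefficient.

```python
from math import comb

def binomial_pmf(n, r, p):
    """P(X = r): probability of exactly r successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, r) * p**r * q**(n - r)

# Worked example from the text: n = 3 attempts, p = 0.6, exactly r = 2 passes.
p_two_passes = binomial_pmf(3, 2, 0.6)
print(round(p_two_passes, 3))  # 0.432

# Full distribution for r = 0..3; the probabilities should sum to 1.
distribution = {r: binomial_pmf(3, r, 0.6) for r in range(4)}
print(distribution)
```

Listing the whole distribution also illustrates the two properties from Section 7.3: each probability lies between 0 and 1, and together they sum to 1.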

7.5 Importance in Education

 Random Sampling ensures fairness in educational surveys.
 Random Variables model outcomes like test scores.

 Probability Distributions help in predicting student performance.
 Binomial Distribution is useful for analyzing yes/no outcomes, e.g., pass/fail,
correct/incorrect answers.
Summary of Unit 7

 Random sampling ensures unbiased representation.


 Random variables can be discrete or continuous.

 Probability distribution gives the likelihood of outcomes.


 Binomial distribution deals with repeated trials of success/failure.

 Probability concepts are essential for research and assessment.

Self-Assessment Questions

1. Define random sampling and give one example from education.


2. Differentiate between discrete and continuous random variables.
3. State two properties of probability distribution.
4. A coin is tossed 3 times. Find the probability of getting exactly 2 heads.
5. Why is binomial distribution useful in educational research?

Unit 8: Normal and Sampling Distributions


Learning Objectives

By the end of this unit, students will be able to:

 Explain the concept of the Normal Distribution and its properties.


 Interpret scores in terms of Z-scores and percentile ranks.
 Define a sampling distribution and explain its significance.
 Understand the Central Limit Theorem and its role in statistics.
 Apply concepts of normal and sampling distributions in educational research.

8.1 Normal Distribution

Definition

The normal distribution is a bell-shaped curve that describes how data values are
distributed around the mean.

Characteristics
1. Symmetrical about the mean.
2. Mean = Median = Mode.
3. About 68% of values lie within 1 SD of the mean, 95% within 2 SD, and 99.7% within 3 SD.
4. Total area under the curve = 1 (100%).

Example (Education):

In a large class, most students score around the average mark, while fewer score very high or
very low, so exam scores often approximate a normal distribution.
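The 68%, 95%, and 99.7% figures in the characteristics above can be reproduced with Python's `statistics.NormalDist`. The helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The printed shares are the exact values that the rounded 68%, 95%, 99.7% rule summarizes.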

8.2 Z-Scores (Standard Scores)

Definition

A Z-score indicates how many standard deviations a value is from the mean.

Formula
Z = \frac{X - \bar{X}}{SD}

Where:

 X = raw score
 \bar{X} = mean
 SD = standard deviation

Example

Mean score = 70, SD = 10, Student’s score = 85.

Z = \frac{85 - 70}{10} = \frac{15}{10} = 1.5

✅ The student scored 1.5 SD above the mean, which (if scores are normally distributed) is better than about 93% of students.
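A minimal Python sketch of this example follows. The "93%" figure comes from the standard normal curve, so the code assumes the scores are normally distributed; the function name is illustrative.

```python
from statistics import NormalDist

def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

z = z_score(85, mean=70, sd=10)
# Share of scores below z, assuming the scores are normally distributed.
share_below = NormalDist().cdf(z)
print(z, round(share_below * 100, 1))  # 1.5 93.3
```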

8.3 Percentile Ranks

Definition

The percentile rank tells us the percentage of scores below a particular score.

 Example: If a student is in the 80th percentile, they performed better than 80% of
students.

✅ Educational Use: Used in standardized tests (e.g., SAT, GRE) to compare
performance.
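Using the definition above (percentage of scores below a given score), a percentile rank can be computed directly from a list of marks. The marks are hypothetical, and some textbooks also count half of the tied scores, which this sketch does not.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the group that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical marks for a class of 10 students.
marks = [42, 55, 58, 61, 64, 67, 70, 73, 78, 90]
print(percentile_rank(marks, 78))  # 80.0
```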

8.4 Sampling Distribution



Definition

A sampling distribution is the distribution of a statistic (like mean, proportion) obtained
from all possible samples of the same size from a population.

Key Concept: Central Limit Theorem (CLT)

 Regardless of population shape, the sampling distribution of the mean approaches
a normal distribution as sample size increases (a common rule of thumb is n ≥ 30).

✅ Importance: Allows researchers to make inferences about the population from a
sample.
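The Central Limit Theorem can be illustrated with a small simulation. This sketch assumes a right-skewed exponential population, with a seed and sample counts chosen only for illustration. It draws many samples of size 30 and checks that the sample means centre on the population mean with a spread close to sigma / sqrt(n).

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30                  # size of each sample
num_samples = 5000      # how many samples to draw

# Population: exponential with rate 1 (right-skewed; mean 1, SD 1).
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print("mean of sample means:", round(mean(sample_means), 3))
print("SD of sample means:  ", round(stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(1 / n ** 0.5, 3))
```

Even though individual values are strongly skewed, a histogram of `sample_means` would look close to a bell curve.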

8.5 Applications in Education

 Normal Distribution: Used in grading, IQ tests, and standardized assessments.


 Z-Scores: Help compare student scores from different tests.
 Percentile Ranks: Used in competitive exams for ranking students.
 Sampling Distributions: Basis for hypothesis testing in educational research.

Summary of Unit 8

 Normal distribution = bell curve, most values around mean.


 Z-scores show how far a score is from mean in SD units.
 Percentile ranks compare performance relative to others.
 Sampling distributions describe variation of statistics across samples.
 CLT: for sufficiently large samples, sample means are approximately normally distributed.

Self-Assessment Questions

1. Define normal distribution and list its characteristics.
2. What does a Z-score of –2.0 mean in exam results?
3. A student scored in the 90th percentile. What does this indicate?
4. Explain the Central Limit Theorem in your own words.
5. Why are sampling distributions important in educational research?

Unit 9: Statistical Inferences – One Sample



Learning Objectives

By the end of this unit, students will be able to:

 Understand the concept of statistical inference.


 Explain and apply hypothesis testing.
 Conduct and interpret a one-sample t-test.
 Calculate and interpret confidence intervals.
 Differentiate between z-test and t-test in one-sample problems.
 Apply these tests in educational research.

9.1 Introduction to Statistical Inference

Statistics not only describes data but also allows us to make inferences about
populations based on samples.

✅ Example: Instead of testing all students in a university, we take a sample of 100
students and draw conclusions about the whole population.

Two major tools of inference:

1. Hypothesis Testing
2. Confidence Intervals

9.2 Hypothesis Testing

Steps in Hypothesis Testing

1. State the hypotheses
o Null Hypothesis (H₀): No difference or no effect.
o Alternative Hypothesis (H₁): There is a difference/effect.
2. Set significance level (α): Common values = 0.05 or 0.01.
3. Select appropriate test (z-test, t-test).
4. Compute test statistic.
5. Make decision: Reject or fail to reject H₀.

9.3 One-Sample t-Test for a Mean


Definition

Used when we want to test whether the mean of a sample differs from a known
population mean.

Formula
t = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n}}}

Where:

 \bar{X} = sample mean
 \mu = population mean
 s = sample standard deviation
 n = sample size

Example

A sample of 25 students scored a mean of 68 on a math test. The population mean is 70,
and sample SD = 10. Test at 0.05 level.

t = \frac{68 - 70}{\frac{10}{\sqrt{25}}} = \frac{-2}{2} = -1

At df = 24, critical value (0.05, two-tailed) ≈ ±2.064.
Since –1 lies between –2.064 and +2.064, fail to reject H₀.

✅ Conclusion: No significant difference between sample and population mean.
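The worked example can be reproduced with a short Python sketch. The function name is illustrative, and the critical value 2.064 is taken from the text rather than computed, since Python's standard library has no t-distribution.

```python
from math import sqrt

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """t statistic and degrees of freedom for a one-sample t-test."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error, n - 1

t, df = one_sample_t(sample_mean=68, pop_mean=70, sample_sd=10, n=25)

# Two-tailed critical value for alpha = 0.05 and df = 24, taken from the text
# (the standard library has no t-distribution, so it is not computed here).
T_CRITICAL = 2.064
reject_h0 = abs(t) > T_CRITICAL
print(t, df, reject_h0)  # -1.0 24 False
```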

9.4 Confidence Intervals for a Mean

Definition

A confidence interval (CI) provides a range of values within which the population mean
is likely to fall.

Formula
CI = \bar{X} \pm Z \cdot \frac{\sigma}{\sqrt{n}}

or, if the population SD is unknown:

CI = \bar{X} \pm t \cdot \frac{s}{\sqrt{n}}


Example

Sample mean = 50, sample SD = 8, n = 16. Find the 95% CI (df = 15, so t ≈ 2.131).


CI = 50 \pm 2.131 \cdot \frac{8}{4}

= 50 \pm 2.131 \times 2 = 50 \pm 4.26

✅ 95% CI = (45.74, 54.26).
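A minimal sketch of the t-based interval from this example. The function name is illustrative, and the critical value 2.131 (95% confidence, df = 15) is passed in as a tabled value rather than computed.

```python
from math import sqrt

def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """CI for a mean when the population SD is unknown (t-based)."""
    margin = t_critical * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# t_critical = 2.131 is the tabled value for 95% confidence and df = 15.
low, high = t_confidence_interval(50, 8, 16, t_critical=2.131)
print(round(low, 2), round(high, 2))  # 45.74 54.26
```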

9.5 One-Sample z-Test for a Proportion

Formula
z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}

Where:

 \hat{p} = sample proportion
 p = population proportion
 q = 1 - p
 n = sample size

Example

Suppose 60% of students nationally pass an exam. In a sample of 100 students from one
school, 70 passed.

z = \frac{0.70 - 0.60}{\sqrt{\frac{0.6 \times 0.4}{100}}} = \frac{0.10}{0.049} = 2.04

At the 0.05 level (two-tailed), critical z = ±1.96. Since 2.04 > 1.96 → Reject H₀.

✅ Conclusion: The school has a significantly higher pass rate.
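The proportion test above can be sketched as follows. The function name is illustrative, and the two-tailed p-value is an addition that uses `statistics.NormalDist` instead of a printed z-table.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_proportion(successes, n, p0):
    """z statistic for testing a sample proportion against p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

z = one_sample_z_proportion(successes=70, n=100, p0=0.60)
# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), p_value < 0.05)  # 2.04 True
```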

9.6 Educational Applications

 Testing whether the average score of a class is equal to the national average.
 Determining whether a new teaching method significantly changes test
performance.
 Evaluating whether the proportion of students passing is significantly different
from a benchmark.

Summary of Unit 9

 Statistical inference allows generalization from samples to populations.



 Hypothesis testing involves stating H₀ and H₁, setting α, calculating test statistic,
and making a decision.
 One-sample t-test → Used when population SD unknown.
 One-sample z-test → Used for large samples or known population variance.
 Confidence intervals give a range of likely population values.

Self-Assessment Questions

1. Define null and alternative hypotheses with an example.


2. Write the formula for a one-sample t-test.
3. A sample of 36 students scored a mean of 72, population mean = 70, SD = 12.
Test at 0.05 level (use z-test).

4. What does a 95% confidence interval mean in practical terms?


5. When would you use a t-test instead of a z-test?

Unit 10: Analysis of Variance and Co-Variance


Learning Objectives

By the end of this unit, students will be able to:

 Understand the need for ANOVA when comparing more than two groups.
 Explain the basic concepts of between-group and within-group variance.
 Conduct and interpret a one-way ANOVA.
 Explain the principle of ANCOVA and its importance.

 Apply ANOVA/ANCOVA in educational research.

10.1 Introduction

When we want to compare the means of two groups, we can use a t-test.
But what if there are three or more groups? Conducting multiple t-tests increases the
chance of a Type I error (wrongly rejecting a true H₀).

✅ Solution: Use Analysis of Variance (ANOVA).


ANOVA tests whether there are significant differences among group means by
comparing variance between groups with variance within groups.



10.2 Concept of Variance

 Between-group variance: How much the group means differ from the overall
mean.
 Within-group variance: How much scores differ inside each group.

If between-group variance is large enough relative to within-group variance (that is, the
F-ratio exceeds the critical value), at least one group mean is significantly different.

10.3 One-Way ANOVA


Formula
F = \frac{\text{Mean Square Between (MSB)}}{\text{Mean Square Within (MSW)}}

Where:

 MSB = Between-group variance


 MSW = Within-group variance

Steps

1. State hypotheses:
o H₀: All group means are equal.
o H₁: At least one group mean is different.
2. Calculate SSB (Sum of Squares Between) and SSW (Sum of Squares Within).
3. Find MSB = SSB / dfB, MSW = SSW / dfW.
4. Compute F = MSB / MSW.
5. Compare with critical F-value.

Example

Three teaching methods were tested on students:


Method   Scores (Mean)
A        70
B        75
C        85

After ANOVA calculations → F = 5.67, Critical F (0.05, df=2,27) = 3.35.

✅ Since 5.67 > 3.35 → Reject H₀.


At least one teaching method is significantly different.
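The notes give only the group means and the final F value, so the intermediate sums of squares cannot be reproduced from them. The sketch below instead walks through the five steps on small hypothetical scores. All data and names are assumptions chosen so the group means match 70, 75, and 85; they are not the data behind F = 5.67.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # SSB: spread of the group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # SSW: spread of individual scores around their own group mean.
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three methods (not the data behind F = 5.67).
method_a = [68, 70, 72]
method_b = [73, 75, 77]
method_c = [83, 85, 87]
f_ratio, df_b, df_w = one_way_anova([method_a, method_b, method_c])
print(round(f_ratio, 2), df_b, df_w)
```

The resulting F would then be compared with the tabled critical value for (df_b, df_w) degrees of freedom, as in step 5.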

10.4 Two-Way ANOVA

Used when there are two independent variables (factors).

 Example: Effect of teaching method (A, B, C) and gender (male, female) on
student achievement.

Two-way ANOVA tests:

1. Main effect of factor 1.


2. Main effect of factor 2.
3. Interaction effect (whether the effect of one factor depends on the level of the other).

10.5 ANCOVA (Analysis of Co-Variance)

Definition

ANCOVA is a combination of ANOVA + Regression.

It compares group means after controlling for the effect of a covariate.

Example
Suppose two classes are taught with different methods, but one class has higher pre-test
scores. ANCOVA adjusts for pre-test scores before comparing post-test means.

✅ This makes comparisons fairer and more accurate.



10.6 Educational Applications



 One-Way ANOVA: Comparing effectiveness of different teaching methods.


 Two-Way ANOVA: Studying combined effect of gender and method on
achievement.
 ANCOVA: Controlling for prior knowledge (pre-test) when evaluating new
teaching strategies.

Summary of Unit 10

 ANOVA is used to compare means of 3 or more groups.


 F-ratio = Variance between groups ÷ Variance within groups.
 One-Way ANOVA → One factor.

 Two-Way ANOVA → Two factors (main + interaction effects).


 ANCOVA → Controls the influence of covariates before comparing groups.

Self-Assessment Questions

1. Why is ANOVA preferred over multiple t-tests?


2. Explain between-group and within-group variance.
3. Write the formula for the F-ratio in one-way ANOVA.
4. Give an educational example of ANCOVA.
5. What are the three types of effects tested in a two-way ANOVA?

Unit 11: Statistical Inference for Frequency Data

Learning Objectives

By the end of this unit, students will be able to:

 Define the Chi-Square test and its types.

 Apply the Chi-Square test for goodness of fit.
 Use the Chi-Square test for independence.
 Test equality of proportions with Chi-Square.
 Understand the importance of Chi-Square in educational research.

11.1 Introduction

Many times, research data are in the form of frequencies (counts) rather than continuous
scores.
Examples:

 Number of boys vs. girls choosing science.


 Number of students passing/failing under two teaching methods.

For such categorical data, we use the Chi-Square (χ²) test.



11.2 Chi-Square Test


Formula
\chi^2 = \sum \frac{(O - E)^2}{E}

Where:

 O = Observed frequency
 E = Expected frequency

Steps

1. State H₀ and H₁.


2. Calculate expected frequencies.
3. Compute χ².
4. Compare with critical χ² from table at given df and α.
5. Decision: Reject or fail to reject H₀.

11.3 Chi-Square Test of Goodness of Fit

Definition
Used to test if observed frequencies match expected frequencies.

Example

A teacher believes students’ preferences for subjects are equally distributed (Math,
English, Science, Social Studies).

Observed: 25, 30, 20, 25



Expected: 25, 25, 25, 25

\chi^2 = \frac{(25-25)^2}{25} + \frac{(30-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(25-25)^2}{25} = 0 + 1 + 1 + 0 = 2

df = 3, critical χ² (0.05) = 7.815 → 2 < 7.815 → Fail to reject H₀.

✅ There is no significant evidence that students’ preferences differ from an equal distribution.
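The goodness-of-fit calculation above can be sketched in Python. The function name is illustrative, and the critical value 7.815 is the tabled figure quoted in the text, not something the code derives.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [25, 30, 20, 25]          # Math, English, Science, Social Studies
expected = [sum(observed) / 4] * 4   # equal preference: 25 per subject
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1

CHI2_CRITICAL = 7.815  # tabled value for alpha = 0.05, df = 3 (from the text)
print(chi2, df, chi2 > CHI2_CRITICAL)  # 2.0 3 False
```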



11.4 Chi-Square Test of Independence


Definition

Used to check if two categorical variables are independent.

Example

A study checks if gender (Male/Female) is related to subject choice (Science/Arts).

Gender   Science   Arts   Total
Male     40        20     60
Female   30        30     60
Total    70        50     120

Expected frequency (Male-Science) = (60 × 70) / 120 = 35.

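The expected frequencies are 35 (Science) and 25 (Arts) for each gender, so
χ² = 25/35 + 25/25 + 25/35 + 25/25 ≈ 3.43, df = 1, critical χ² (0.05) = 3.84 → 3.43 < 3.84 → Fail to reject H₀.

✅ Conclusion: At the 0.05 level, there is no significant evidence that gender and subject choice are related.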
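As a check, the statistic can be computed directly from the observed table. This is a minimal sketch with an illustrative function name and no continuity correction. For the counts above, each expected frequency is (row total × column total) / grand total, and the statistic works out to about 3.43.

```python
def chi_square_independence(table):
    """Chi-square statistic, df, and expected counts for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count for each cell = (row total x column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Rows: Male, Female; columns: Science, Arts (counts from the table above).
chi2, df, expected = chi_square_independence([[40, 20], [30, 30]])
print(round(chi2, 2), df, expected[0][0])  # 3.43 1 35.0
```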

11.5 Chi-Square Test for Equality of Proportions



Used to check whether proportions across categories are equal.



✅ Educational Example:
Testing if the proportion of students passing an exam is the same in four schools.

11.6 Educational Applications

 Goodness of Fit → Testing whether student choices match expectations.


 Independence → Checking relationship between teaching method and pass/fail
rates.
 Equality of Proportions → Comparing proportions of success across schools or
classes.

Summary of Unit 11

 Chi-Square test is used for categorical (frequency) data.


 Formula: \chi^2 = \sum \frac{(O - E)^2}{E}.
 Goodness of Fit: Compares observed vs. expected frequencies.
 Independence: Tests relationship between two categorical variables.
 Equality of proportions: Compares proportions across groups.

Self-Assessment Questions

1. Write the formula for the Chi-Square test.


2. Give one educational example of Chi-Square goodness of fit.
3. When do we use Chi-Square test of independence?
4. A die is rolled 60 times with outcomes: 8, 10, 12, 14, 8, 8. Test if it is fair at 0.05
level.

5. Why is Chi-Square test called a non-parametric test?

Unit 12: Statistical Inference for Ranked Data
Learning Objectives

By the end of this unit, students will be able to:

 Understand why non-parametric tests are useful for ranked data.



 Apply the Mann-Whitney U Test for two independent samples.


 Apply the Wilcoxon Signed-Rank Test for dependent samples.
 Recognize the advantages and limitations of ranked data analysis in educational
research.

12.1 Introduction

Many times, data collected in education are not precise measurements, but ranks or
ordered categories.
Examples:

 Ranking students in a competition.


 Rating satisfaction with teaching methods.

For such data, we use non-parametric tests, which do not assume normal distribution
and are suitable for ordinal (ranked) data.

12.2 Assumption-Free Tests

Definition

Non-parametric tests are often called "assumption-free" (more precisely, distribution-free) because they do not require:

 Normal distribution.
 Equal variances.
 Interval or ratio scales.

✅ Useful for small samples or non-numeric data.

12.3 Mann-Whitney U Test (Two Independent Samples)

Purpose

Tests whether two independent groups differ in their ranks.
Steps

1. Combine scores from both groups and rank them.


2. Calculate sum of ranks for each group.
3. Compute U:

U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1

Where:

 n_1, n_2 = sizes of groups
 R_1 = sum of ranks for group 1

4. Compare U (the smaller of the U values for the two groups) with the critical value; H₀ is rejected when U is less than or equal to it.

Example

A researcher compares achievement ranks of students taught with Method A (n=5) and
Method B (n=5).
After ranking, Mann-Whitney U is calculated → U = 4, critical value = 2.

✅ Since 4 > 2, fail to reject H₀ → No significant difference.
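The ranking and U calculation can be sketched in Python. The scores are hypothetical (the notes do not give the data behind U = 4), the function names are illustrative, and tied scores are assumed to share the average of their ranks.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1          # rank of first occurrence
        last = first + ordered.count(v) - 1   # rank of last occurrence
        ranks[v] = (first + last) / 2
    return ranks

def mann_whitney_u(group1, group2):
    """Smaller of U1 and U2 for two independent samples."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[x] for x in group1)        # sum of ranks for group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return min(u1, u2)

# Hypothetical achievement scores (not the data behind U = 4 in the text).
method_a = [55, 60, 62, 70, 75]
method_b = [58, 65, 72, 80, 85]
u = mann_whitney_u(method_a, method_b)
print(u)
```

With n = 5 in each group, the resulting U would be compared with the tabled critical value of 2 quoted in the example.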

12.4 Wilcoxon Signed-Rank Test (Two Dependent Samples)

Purpose

Used when the same group is tested twice (before-after studies).

Steps

1. Compute differences between paired scores.


2. Rank absolute differences (ignoring zero differences).
3. Assign signs (+ or –) based on direction.
4. Compute test statistic W (the smaller of the positive and negative rank sums).
5. Compare with the critical value; reject H₀ if W is less than or equal to it.

Example

10 students are given a pre-test and post-test after a new teaching method.
Wilcoxon test shows W = 8, critical value = 6.

✅ Since 8 > 6, fail to reject H₀ → No significant improvement.
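The steps of the signed-rank test can be sketched in Python. The pre-test and post-test scores are hypothetical (the notes do not list the data behind W = 8), the function name is illustrative, and tied absolute differences are assumed to share the average rank.

```python
def wilcoxon_w(before, after):
    """Wilcoxon signed-rank statistic W (smaller signed-rank sum)."""
    # Step 1: paired differences, dropping zero differences.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Step 2: rank the absolute differences (ties share the average rank).
    ordered = sorted(abs(d) for d in diffs)
    def rank(value):
        first = ordered.index(value) + 1
        last = first + ordered.count(value) - 1
        return (first + last) / 2
    # Steps 3-4: sum the ranks separately for positive and negative differences.
    w_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    w_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return min(w_plus, w_minus)

# Hypothetical pre-test and post-test scores for 10 students.
pre_test = [50, 55, 60, 62, 65, 68, 70, 72, 75, 80]
post_test = [56, 54, 66, 60, 73, 68, 75, 69, 82, 89]
w = wilcoxon_w(pre_test, post_test)
print(w)
```

One pair here has a zero difference and is dropped, so the critical value would be looked up for the remaining number of pairs.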
12.5 Educational Applications

 Mann-Whitney U Test: Comparing ranks of boys and girls in a spelling contest.


 Wilcoxon Test: Measuring impact of remedial teaching by comparing pre-test
and post-test ranks.
 Non-parametric tests in general: Useful when data are ordinal, non-normal, or
small samples.

12.6 Advantages and Limitations

Advantages

 Fewer assumptions.
 Suitable for ordinal data.
 Useful with small samples.

Limitations

 Less powerful than parametric tests when assumptions are met.


 Provide less detailed information compared to mean/SD-based tests.

Summary of Unit 12

 Ranked data often arise in educational settings.


 Non-parametric tests (assumption-free) are used for such data.
 Mann-Whitney U test compares two independent groups.
 Wilcoxon Signed-Rank test compares paired (dependent) samples.
 These methods are widely used in educational research where parametric
assumptions are not valid.

Self-Assessment Questions

1. Why are non-parametric tests suitable for ranked data?
2. Differentiate between Mann-Whitney U and Wilcoxon tests.
3. A teacher wants to test whether students’ rankings in Math differ by gender.
Which test should be used?
4. In a before-and-after training program, which test is most appropriate?
5. List two advantages and two limitations of non-parametric tests.
