
Revised Lectures 2,3 and 4

The document discusses measures of central tendency, including mean, median, and mode, explaining how to calculate each for both discrete and grouped data. It also covers measures of dispersion such as range, variance, and standard deviation, providing formulas and examples for clarity. Overall, it serves as a comprehensive guide to understanding and calculating these statistical concepts.

Measure of Central Tendency

A measure of central tendency is a single value that indicates where the observations are
concentrated, that is, the central part of the distribution. It serves as a representative
value for the whole data set. The three common measures of central tendency are:
1. Mean
2. Median
3. Mode

Mean: The mean is the arithmetic average of all the observations in the data.
For discrete data,
Let $x_1, x_2, \dots, x_n$ be the observations. Then the mean is calculated as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where $\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n$ and $n$ is the number of observations.

For grouped data,


Let $x_1, x_2, \dots, x_n$ be the values (for class intervals, the mid-values of the classes) and
$f_1, f_2, \dots, f_n$ the associated frequencies. Then the mean is calculated as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{N}$$
where $N = \sum_{i=1}^{n} f_i$ is the total frequency.

Example: Find the mean of the given data.


Discrete data: 5, 3, 4, 7, 8, 4, 3
Solution: Here, $n = 7$ and $\sum_{i=1}^{n} x_i = 5 + 3 + 4 + 7 + 8 + 4 + 3 = 34$, so
$$\bar{x} = \frac{5 + 3 + 4 + 7 + 8 + 4 + 3}{7} = \frac{34}{7} \approx 4.86$$
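The calculation above can be checked with a short Python sketch. The variable names are illustrative and not part of the lecture; `statistics.mean` is the standard-library function for the arithmetic mean.

```python
from statistics import mean

data = [5, 3, 4, 7, 8, 4, 3]

# Mean by the definition: sum of the observations divided by their count.
manual_mean = sum(data) / len(data)

# The standard library computes the same quantity.
library_mean = mean(data)

print(round(manual_mean, 2))   # rounded to two decimals, as in the worked example
print(round(library_mean, 2))
```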
Grouped data:
Class Interval Frequency(f)
10-20 3
20-30 5
30-40 7

Solution:
Class Interval    Mid value (x)    Frequency (f)    xf
10-20             15               3                45
20-30             25               5                125
30-40             35               7                245
Total                              N = 15           ∑xf = 415
From the table, we get $N = 15$ and $\sum x f = 415$.
Then, the mean is calculated as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{N} = \frac{415}{15} \approx 27.67$$
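The grouped-data calculation can be sketched in Python as follows. The function name `grouped_mean` and the `(lower, upper, frequency)` tuple layout are assumptions made for this illustration, not notation from the lecture.

```python
def grouped_mean(classes):
    """Mean of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    Each class is represented by its mid-value (lower + upper) / 2.
    """
    total_frequency = sum(f for _, _, f in classes)            # N
    weighted_sum = sum((lo + hi) / 2 * f for lo, hi, f in classes)  # sum of x*f
    return weighted_sum / total_frequency


classes = [(10, 20, 3), (20, 30, 5), (30, 40, 7)]
print(round(grouped_mean(classes), 2))
```

Because every observation in a class is replaced by the class mid-value, the result is an approximation of the mean of the underlying raw data.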
Median
The median is the value that divides the ordered data into two equal parts: half of the
observations lie at or below it and half lie at or above it. It is a positional value.

For discrete data,


Let $x_1, x_2, \dots, x_n$ be the observations, arranged in ascending order. Two cases arise:
Case i: When $n$ is an odd number, the median is obtained as
$$x_M = \text{value of the } \left(\frac{n+1}{2}\right)^{\text{th}} \text{ term of the arranged data}$$

Case ii: When $n$ is an even number, the median is obtained as
$$x_M = \frac{\text{value of the } \left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \text{value of the } \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$$
For grouped data
The median is calculated as
$$x_M = l + \left(\frac{\frac{N}{2} - cf}{f}\right) \times h$$

Where,
$l$ = lower limit of the median class
$N$ = total frequency
$cf$ = cumulative frequency of the class preceding the median class
$f$ = frequency of the median class
$h$ = class length (width of the median class)
Example: Find the median of the given data.
Discrete data:
Case i: 23, 11, 25, 14, 32, 44, 15
Solution: Here, $n = 7$ is an odd number. Arranging the data in ascending order gives
11, 14, 15, 23, 25, 32, 44
Then the median is obtained as
$$x_M = \text{value of the } \left(\frac{7+1}{2}\right)^{\text{th}} \text{ term} = \text{value of the } 4^{\text{th}} \text{ term} = 23$$

Case ii: 23, 11, 25, 14, 32, 44, 15, 55


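Solution: Here, $n = 8$ is an even number. Arranging the data in ascending order gives
11, 14, 15, 23, 25, 32, 44, 55
Then the median is obtained as
$$x_M = \frac{\text{value of the } \left(\frac{8}{2}\right)^{\text{th}} \text{ term} + \text{value of the } \left(\frac{8}{2}+1\right)^{\text{th}} \text{ term}}{2} = \frac{\text{value of the } 4^{\text{th}} \text{ term} + \text{value of the } 5^{\text{th}} \text{ term}}{2} = \frac{23 + 25}{2} = \frac{48}{2} = 24$$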
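The two cases can be expressed as one small Python function. This is a minimal sketch; the name `median_of` is chosen for this illustration only.

```python
def median_of(values):
    """Median of raw (ungrouped) data, following the odd/even cases."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)      # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th term, which is index mid when counting from 0.
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th terms.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([23, 11, 25, 14, 32, 44, 15]))        # Case i
print(median_of([23, 11, 25, 14, 32, 44, 15, 55]))    # Case ii
```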
Grouped data:
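Class Interval    f    F (cumulative frequency)
10-20             3    3
20-30             5    8
30-40             7    15
40-50             2    17
Total             N = 17

Solution: First, find the median class. Here,
$$\frac{N}{2} = \frac{17}{2} = 8.5$$
The median class is the first class whose cumulative frequency is at least $\frac{N}{2}$. The
cumulative frequency is 8 up to the class 20-30 and 15 up to the class 30-40. Since
$8 < 8.5 \le 15$, the median class is 30-40.

Now, $l = 30$, $f = 7$, $cf = 8$, $N = 17$ and $h = 10$.

The median is calculated as
$$x_M = 30 + \left(\frac{8.5 - 8}{7}\right) \times 10 = 30 + \left(\frac{0.5}{7}\right) \times 10 = 30 + \frac{5}{7} \approx 30 + 0.71 = 30.71$$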
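The steps of the grouped-data median (locate the median class, then interpolate within it) can be sketched in Python. The function name `grouped_median` and the `(lower, upper, frequency)` tuple layout are assumptions of this example.

```python
def grouped_median(classes):
    """Median of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples,
    ordered by lower limit.
    """
    total = sum(f for _, _, f in classes)       # N
    if total <= 0:
        raise ValueError("total frequency must be positive")
    half = total / 2                            # N / 2
    cumulative = 0                              # cf of the classes before the current one
    for lower, upper, freq in classes:
        # The median class is the first class whose cumulative frequency reaches N / 2.
        if cumulative + freq >= half:
            width = upper - lower               # h
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")


classes = [(10, 20, 3), (20, 30, 5), (30, 40, 7), (40, 50, 2)]
print(round(grouped_median(classes), 2))
```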
Mode
The mode is the value that has the highest frequency, that is, the value repeated the largest
number of times in a data set. A data set can have no mode, one mode (unimodal), two modes
(bimodal) or more than two modes (multimodal).
For discrete data: Find the value that is repeated the maximum number of times; this value is
the mode of the data set. The mode is denoted by $x_{M_0}$.
Example: 3, 3, 5, 5, 5, 5, 2, 2, 2, 2, 2, 7, 7, 8, 11, 11, 13, 13.
Solution: First of all, we make frequency table of the given data as
x     f
2     5
3     2
5     4
7     2
8     1
11    2
13    2

In the above table, $x = 2$ has the highest frequency (5) in the whole data set. Therefore,
the mode is
$$x_{M_0} = 2$$
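Counting frequencies and picking the most frequent value can be sketched with `collections.Counter` from the Python standard library. The function name `modes` is illustrative, and returning an empty list when no value repeats is a convention assumed here for the "no mode" case.

```python
from collections import Counter


def modes(values):
    """Return every value that attains the highest frequency, in ascending order."""
    counts = Counter(values)            # frequency table: value -> frequency
    highest = max(counts.values())
    if highest == 1:
        return []                       # assumed convention: no value repeats, so no mode
    return sorted(v for v, c in counts.items() if c == highest)


data = [3, 3, 5, 5, 5, 5, 2, 2, 2, 2, 2, 7, 7, 8, 11, 11, 13, 13]
print(modes(data))
```

Returning a list rather than a single number lets the same function report bimodal and multimodal data sets.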

For grouped data: The mode is calculated as
$$x_{M_0} = l + \left(\frac{f_1 - f_0}{2 f_1 - f_0 - f_2}\right) \times h$$
Where
$l$ = lower limit of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_1$ = frequency of the modal class
$f_2$ = frequency of the class succeeding the modal class
$h$ = class length (width of the modal class)
Example: Find the mode of the given data below.
Class Interval f
10-20 3
20-30 5
30-40 7
40-50 2
N = 17

Solution: The class interval 30-40 has the highest frequency, so it is the modal class.
Now, $l = 30$, $f_0 = 5$, $f_1 = 7$, $f_2 = 2$ and $h = 10$.
Thus, the mode can be calculated as
$$x_{M_0} = 30 + \left(\frac{7 - 5}{2 \times 7 - 5 - 2}\right) \times 10 = 30 + \left(\frac{2}{14 - 7}\right) \times 10 = 30 + \frac{20}{7} \approx 30 + 2.86 = 32.86$$
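The grouped-data mode formula can be sketched in Python as below. The function name `grouped_mode` is an assumption of this example, and treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is a convention adopted here, not something stated in the lecture.

```python
def grouped_mode(classes):
    """Mode of grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples,
    ordered by lower limit.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                       # first class with the highest frequency
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0                 # assumed 0 if there is no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumed 0 if there is no succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring frequencies equal the modal frequency")
    return lower + (f1 - f0) / denominator * (upper - lower)


classes = [(10, 20, 3), (20, 30, 5), (30, 40, 7), (40, 50, 2)]
print(round(grouped_mode(classes), 2))
```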
Measure of Dispersion
A measure of dispersion describes the scatter, or variability, of the values in a data set.
Some common measures of dispersion are:
1. Range
2. Variance
3. Standard Deviation

Range: The range of a variable is the simplest measure of its dispersion. It is defined as
the difference between the largest and the smallest of its given values.
If the data are given as a grouped frequency distribution, the range can be taken as the
difference between the largest upper boundary and the smallest lower boundary.
Thus, the range can be calculated as
$$R = x_{max} - x_{min}$$
Where,
$R$ = range
$x_{max}$ = largest value of the variable $x$
$x_{min}$ = smallest value of the variable $x$

Example: Find the range of the data given below.

Discrete data: 23, 11, 25, 14, 32, 44, 15, 55


Solution: The smallest value is 11 and the largest value is 55.
Thus, we get $x_{max} = 55$ and $x_{min} = 11$.
Then, the range can be calculated as
$$R = x_{max} - x_{min} = 55 - 11 = 44$$
Grouped data:
Class Interval f
10-20 3
20-30 5
30-40 7
40-50 2
N = 17

From the table, the largest upper boundary is $x_{max} = 50$ and the smallest lower boundary
is $x_{min} = 10$.

Now, the range can be calculated as
$$R = x_{max} - x_{min} = 50 - 10 = 40$$
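Both range calculations can be sketched in a few lines of Python. The function names `value_range` and `grouped_range` are illustrative choices for this example.

```python
def value_range(values):
    """Range of raw data: largest value minus smallest value."""
    return max(values) - min(values)


def grouped_range(classes):
    """Range of grouped data: largest upper boundary minus smallest lower boundary.

    classes: list of (lower_limit, upper_limit, frequency) tuples.
    """
    largest_upper = max(upper for _, upper, _ in classes)
    smallest_lower = min(lower for lower, _, _ in classes)
    return largest_upper - smallest_lower


print(value_range([23, 11, 25, 14, 32, 44, 15, 55]))
print(grouped_range([(10, 20, 3), (20, 30, 5), (30, 40, 7), (40, 50, 2)]))
```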
Variance: Variance measures the variability in the data set as the average squared deviation
of the observations from their mean.
For discrete data:
Let $x_1, x_2, \dots, x_n$ be the observations of a data set. Then, the variance is defined as
$$Var = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
where $n$ is the number of observations in the data and $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.

For grouped data:

Let $x_1, x_2, \dots, x_n$ be the values and $f_1, f_2, \dots, f_n$ the corresponding
frequencies. Then, the variance is defined as
$$Var = \frac{1}{N} \sum_{i=1}^{n} (x_i - \bar{x})^2 f_i$$
where $N = \sum_{i=1}^{n} f_i$ and $\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{N}$.
Standard Deviation: The positive square root of the variance is called the standard deviation
of the data. It also measures the variability in the data.
For discrete data:
Let $x_1, x_2, \dots, x_n$ be the observations of a data set. Then, the standard deviation is
defined as
$$SD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where $n$ is the number of observations in the data and $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.

For grouped data:

Let $x_1, x_2, \dots, x_n$ be the values and $f_1, f_2, \dots, f_n$ the corresponding
frequencies. Then, the standard deviation is defined as
$$SD = \sqrt{\frac{1}{N} \sum_{i=1}^{n} (x_i - \bar{x})^2 f_i}$$
where $N = \sum_{i=1}^{n} f_i$ and $\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{N}$.

Example: Find the standard deviation of the data given below.


Discrete data: 3, 2, 4, 5, 6
Solution: The mean is $\bar{x} = \frac{3 + 2 + 4 + 5 + 6}{5} = 4$. We make a table as
x        $x - \bar{x}$    $(x - \bar{x})^2$
3        -1               1
2        -2               4
4        0                0
5        1                1
6        2                4
Total                     10

From the table, we get $n = 5$ and $\sum (x - \bar{x})^2 = 10$.

Now, the standard deviation is calculated as
$$SD = \sqrt{\frac{1}{5} \times 10} = \sqrt{2} \approx 1.41$$
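The variance and standard deviation formulas for raw data can be sketched in Python as follows. The function name `population_variance` is illustrative. The formulas in these notes divide by $n$, which corresponds to `statistics.pstdev` in the standard library; `statistics.stdev` divides by $n - 1$ instead and would give a different value.

```python
from math import sqrt
from statistics import pstdev


def population_variance(values):
    """Variance with divisor n, as defined in the notes."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [3, 2, 4, 5, 6]
variance = population_variance(data)
sd = sqrt(variance)

print(variance)
print(round(sd, 2))
print(round(pstdev(data), 2))   # standard-library equivalent (divisor n)
```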
Grouped data:
x f
2 2
4 4
6 5
8 3

Solution:
x        f         xf              $x - \bar{x}$    $(x - \bar{x})^2$    $(x - \bar{x})^2 f$
2        2         4               -3.3             10.89                21.78
4        4         16              -1.3             1.69                 6.76
6        5         30              0.7              0.49                 2.45
8        3         24              2.7              7.29                 21.87
Total    N = 14    ∑xf = 74                                              52.86

From the table, we get $N = 14$, $\sum (x - \bar{x})^2 f = 52.86$ and
$\bar{x} = \frac{\sum x f}{N} = \frac{74}{14} \approx 5.3$.

Now, the standard deviation is calculated as
$$SD = \sqrt{\frac{1}{14} \times 52.86} = \sqrt{\frac{52.86}{14}} = \sqrt{3.78} \approx 1.94$$
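The frequency-weighted calculation can be sketched in Python. The function name `grouped_sd` is an assumption of this example. The code keeps the mean unrounded, whereas the table above rounds it to 5.3; the two approaches agree to two decimal places here.

```python
from math import sqrt


def grouped_sd(values, frequencies):
    """Standard deviation of a frequency distribution (divisor N)."""
    total = sum(frequencies)                                          # N
    mean = sum(x * f for x, f in zip(values, frequencies)) / total    # sum(x*f) / N
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / total
    return sqrt(variance)


x = [2, 4, 6, 8]
f = [2, 4, 5, 3]
print(round(grouped_sd(x, f), 2))
```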
Graphical Representation of data
Graphical representation of data, also known as data visualization, is the process of creating
graphical displays to communicate information and insight about data.
Types of Graphical Representations
1. Bar diagram
2. Histogram
3. Pie Chart
4. Frequency Polygon
5. Ogive.
Bar Diagram: A bar diagram, also known as a bar chart, is a graphical representation of data
that uses rectangular bars to display qualitative (categorical) data. The length or height of
each bar represents the magnitude or frequency of the data.

Example: The GDP growth at constant price for different years is given in the table below.
Draw a bar diagram of the data.
Years GDP growth at constant price
2005 131
2010 284
2015 319
2020 461

Graph:
Histogram: A histogram is a graphical representation of data that shows the distribution of
a continuous variable by displaying the frequency of each class interval.
Example: The new tax rates (%) and the corresponding taxable income slabs (lakh) are given
in the table below. Draw a histogram of the data.
Taxable Income Slabs (Lakh) New Tax Rates (%)
2.5-5 5
5-7.5 10
7.5-10 15
10-12.5 20
12.5-15 25

Graph:
Pie Chart: A pie chart is a circular graphical representation of data that shows how different
categories contribute to a whole. It is divided into slices or sectors, each representing a
category, with the size of each slice or sector proportional to its contribution.
Example: The shares of income from various sources for the Indian central government for the
year 2019-20 are given in the table below. Draw a pie chart of the data.
Sources of Income Share in percentage
Goods & Service Tax 18%
Union Excise Duties 7%
Customs 4%
Income-Tax 17%
Corporate Tax 18%
Borrowing & Other Liabilities 20%
Non-Debt capital Receipts 6%
Non-Tax Revenue 10%

Graph:
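Since each sector is proportional to its share of the whole, the central angle of a sector is its share of the total multiplied by 360 degrees. The following Python sketch computes these angles for the table above; the function name `sector_angles` is an illustrative choice, and the code computes only the angles, not the drawing itself.

```python
def sector_angles(shares):
    """Central angle (in degrees) of each pie-chart sector.

    shares: dict mapping category name -> share (any unit, e.g. percent).
    """
    total = sum(shares.values())
    return {name: share * 360 / total for name, share in shares.items()}


income_shares = {
    "Goods & Service Tax": 18,
    "Union Excise Duties": 7,
    "Customs": 4,
    "Income-Tax": 17,
    "Corporate Tax": 18,
    "Borrowing & Other Liabilities": 20,
    "Non-Debt capital Receipts": 6,
    "Non-Tax Revenue": 10,
}

angles = sector_angles(income_shares)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```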
Frequency Polygon: A frequency polygon is a graphical representation of data that shows
the frequency of different values or ranges of values. It is a line graph that connects the
midpoints of the tops of histogram bars.
Graph:

Ogive: An ogive, also known as a cumulative frequency curve, is a graphical representation of
less-than cumulative frequencies or greater-than cumulative frequencies. A less-than ogive
shows the number of observations less than or equal to a specific value; a greater-than ogive
shows the number of observations greater than or equal to a specific value.
Graph:
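The points plotted on an ogive are cumulative frequencies. As a sketch, the following Python code computes them for the grouped data used in the earlier examples (classes 10-20, 20-30, 30-40 and 40-50); the variable names are illustrative. Less-than cumulative frequencies are paired with the upper class boundaries and greater-than cumulative frequencies with the lower class boundaries.

```python
from itertools import accumulate

# (lower_limit, upper_limit, frequency)
classes = [(10, 20, 3), (20, 30, 5), (30, 40, 7), (40, 50, 2)]
freqs = [f for _, _, f in classes]
total = sum(freqs)

# Less-than ogive: running total of frequencies at each upper boundary.
less_than = list(zip((upper for _, upper, _ in classes), accumulate(freqs)))

# Greater-than ogive: observations remaining at each lower boundary,
# i.e. the total minus the frequencies of all earlier classes.
earlier = accumulate([0] + freqs[:-1])
greater_than = list(zip((lower for lower, _, _ in classes),
                        (total - c for c in earlier)))

print(less_than)
print(greater_than)
```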
