NOTES
UNIT I
MEASURES OF CENTRAL TENDENCY
In general terms, central tendency is a statistical measure that determines a single value
that accurately describes the center of the distribution and represents the entire distribution
of scores.
The goal of central tendency is to identify the single value that is the best representative
for the entire set of data.
By identifying the "average score," central tendency allows researchers to summarize or
condense a large set of data into a single value.
In addition, it is possible to compare two (or more) sets of data by simply comparing the
average score (central tendency) for one set versus the average score for another set.
According to Prof. Bowley, “Measures of central tendency (averages) are statistical constants
which enable us to comprehend in a single effort the significance of the whole.”
The main objectives of measures of central tendency are:
1. To condense data into a single value.
2. To facilitate comparisons between data.
There are different types of averages, and each has its own advantages and disadvantages.
Measures of central tendency
1. Mathematical Average
   Arithmetic Mean (AM), Geometric Mean (GM), Harmonic Mean (HM)
2. Positional Average
   Partition Values: Median, Quartile, Decile, Percentile
   Mode
MATHEMATICAL AVERAGES:
ARITHMETIC MEAN (AM)
Defined for: Ungrouped Data, Frequency Distribution
Uses:
1. To compare the class averages.
2. Average spending habits of individuals.
GEOMETRIC MEAN (GM)
Defined for: Ungrouped Data, Frequency Distribution
Uses:
1. Averaging ratios, rates and percentages.
2. Average rate under compound interest, depreciation of machines.
3. In economics and business, in the construction of index numbers.
HARMONIC MEAN (HM)
Defined for: Ungrouped Data, Frequency Distribution
Uses:
1. HM gives the largest weight to the smallest item and the smallest weight to the largest item (when there are a few extremely large or small values, HM is preferable).
2. Averages involving time, speed, rate and price.
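The three mathematical averages can be sketched in a few lines of Python. The block below assumes the usual textbook definitions (AM = Σfx / Σf, GM = antilog of Σf log x / Σf, HM = Σf / Σ(f/x)). The function names and the sample figures are illustrative and do not come from the notes.

```python
import math

def arithmetic_mean(values, freqs=None):
    """AM = sum(f*x) / sum(f); with no frequencies every f is taken as 1."""
    freqs = freqs if freqs is not None else [1] * len(values)
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

def geometric_mean(values, freqs=None):
    """GM computed through logarithms; assumes every value is positive."""
    freqs = freqs if freqs is not None else [1] * len(values)
    log_mean = sum(f * math.log(x) for x, f in zip(values, freqs)) / sum(freqs)
    return math.exp(log_mean)

def harmonic_mean(values, freqs=None):
    """HM = sum(f) / sum(f/x); assumes no value is zero."""
    freqs = freqs if freqs is not None else [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

# Hypothetical example: a vehicle covers the same distance at 40 and at 60 km/h.
# The average speed is the harmonic mean, not the arithmetic mean.
speeds = [40, 60]
print(arithmetic_mean(speeds))           # 50.0
print(round(harmonic_mean(speeds), 6))   # 48.0
```

Passing a second list of frequencies evaluates the same averages for a frequency distribution.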
POSITIONAL AVERAGES:
MODE
Ungrouped Data:
Mode is determined by locating the value that occurs the maximum number of times.
Frequency Distribution:
$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$
Where,
$L$ = lower limit of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_1$ = frequency of the modal class
$f_2$ = frequency of the class succeeding the modal class
$h$ = width of the modal class
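A minimal sketch of the frequency-distribution formula for the mode, assuming equal class widths and a single modal class. The function name `grouped_mode` and the class frequencies are hypothetical and chosen only for illustration.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for the modal class."""
    # Index of the modal class: the class with the highest frequency.
    m = max(range(len(freqs)), key=lambda i: freqs[i])
    f1 = freqs[m]
    # A missing neighbour (modal class at either end) is treated as frequency 0.
    f0 = freqs[m - 1] if m > 0 else 0
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0
    return lower_limits[m] + (f1 - f0) / (2 * f1 - f0 - f2) * width

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
# Modal class is 20-30: L = 20, f0 = 8, f1 = 12, f2 = 9, h = 10
print(grouped_mode(lower_limits, 10, freqs))  # 20 + (4/7)*10, about 25.71
```

The denominator is zero when the modal class and both neighbours have the same frequency; this sketch does not handle that case.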
PARTITION VALUES
Partition values are the median, the quartiles, the deciles and the percentiles. Here k denotes which quartile, decile or percentile is required.
Individual series:
Median = value of the $\frac{n+1}{2}$th item (odd case); mean of the $\frac{n}{2}$th and $\left(\frac{n}{2} + 1\right)$th items (even case)
$Q_k$ = value of the $k\left(\frac{n+1}{4}\right)$th item
$D_k$ = value of the $k\left(\frac{n+1}{10}\right)$th item
$P_k$ = value of the $k\left(\frac{n+1}{100}\right)$th item
Discrete series:
Find the required position; the value whose cumulative frequency (CF) is just greater gives the median, quartile, decile or percentile respectively.
Continuous series:
$M = L + \frac{\frac{N}{2} - cf}{f} \times h$
$Q_k = L + \frac{\frac{kN}{4} - cf}{f} \times h$
$D_k = L + \frac{\frac{kN}{10} - cf}{f} \times h$
$P_k = L + \frac{\frac{kN}{100} - cf}{f} \times h$
Where, for the class containing the required value (the median class, kth quartile class, kth decile class or kth percentile class),
$L$ = lower limit of that class
$cf$ = cumulative frequency preceding that class
$f$ = frequency of that class
$h$ = width of that class
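The four continuous-series formulas share one pattern: locate the class in which the cumulative frequency first reaches kN/p (p = 2, 4, 10 or 100), then interpolate inside it. A minimal sketch under that reading follows; the function name and the data are hypothetical, and equal class widths are assumed.

```python
def grouped_partition_value(lower_limits, width, freqs, k, parts):
    """Return L + (k*N/parts - cf) / f * h for a continuous series.

    parts = 2 gives the median (k = 1), 4 the quartiles,
    10 the deciles and 100 the percentiles.
    """
    total = sum(freqs)              # N
    target = k * total / parts      # kN/4, kN/10, kN/100 or N/2
    cf = 0                          # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cf + f >= target:
            return lower + (target - cf) / f * width
        cf += f
    raise ValueError("k must not exceed parts")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50 with N = 40
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
median = grouped_partition_value(lower_limits, 10, freqs, 1, 2)  # class 20-30, cf = 13
q1 = grouped_partition_value(lower_limits, 10, freqs, 1, 4)      # class 10-20, cf = 5
print(q1)  # 16.25
```

Using one function for all four measures keeps the class-location step identical, which is the only part of the calculation that differs from an ordinary interpolation.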
MEAN OF COMPOSITE GROUP
If two groups contain $n_1$ and $n_2$ observations with means $\bar{x}_1$ and $\bar{x}_2$ respectively, then the mean
($\bar{x}$) of the composite group of $N = n_1 + n_2$ observations is given by the relation
$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{N}$
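As a quick check of the relation, consider illustrative figures (not taken from the notes): a group of 30 observations with mean 50 and a group of 20 observations with mean 60.

```latex
n_1 = 30,\quad \bar{x}_1 = 50,\qquad n_2 = 20,\quad \bar{x}_2 = 60
\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}
        = \frac{30(50) + 20(60)}{50}
        = \frac{2700}{50}
        = 54
```

The composite mean lies closer to 50 than to 60 because the first group is larger.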
PROPERTIES OF A GOOD MEASURE OF CENTRAL TENDENCY:
1. It should be rigidly defined.
2. It should be simple to understand & easy to calculate.
3. It should be based upon all values of given data.
4. It should be capable of further mathematical treatment.
5. It should have sampling stability.
6. It should not be unduly affected by extreme values.
MEASURES OF DISPERSION
Measures of dispersion are descriptive statistics that describe how similar a set of scores are to each
other.
A quantity that measures the variability among the data, or how the data are dispersed about the
average, is known as a measure of dispersion, scatter, or variation.
The more similar the scores are to each other, the lower the measure of dispersion will be.
The less similar the scores are to each other, the higher the measure of dispersion will be.
In general, the more spread out a distribution is, the larger the measure of dispersion will be.
Measures of Dispersion
1. Absolute Measures
   Range, Quartile Deviation, Mean Deviation, Standard Deviation
2. Relative Measures
   Coefficient of Range, Coefficient of Quartile Deviation, Coefficient of Mean Deviation, Coefficient of Variation
Absolute Measures
RANGE: Highest value - Smallest value
QUARTILE DEVIATION
MEAN DEVIATION: defined for Raw Data and Grouped Data, where A = Mean, Median or Mode
STANDARD DEVIATION: defined for Raw Data and Grouped Data
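The absolute measures can be sketched for raw data as follows. Only the range is written out explicitly in these notes; the other definitions used here are the usual textbook ones and should be read as assumptions: quartile deviation = (Q3 - Q1) / 2, mean deviation about A = Σ|x - A| / n, and standard deviation = √(Σ(x - mean)² / n), that is, dividing by n rather than n - 1. Function names and data are illustrative.

```python
import math

def value_range(data):
    """Range = highest value - smallest value."""
    return max(data) - min(data)

def quartile_deviation(q1, q3):
    """Assumed definition: half the inter-quartile range."""
    return (q3 - q1) / 2

def mean_deviation(data, about):
    """Mean of absolute deviations from A (A = mean, median or mode)."""
    return sum(abs(x - about) for x in data) / len(data)

def standard_deviation(data):
    """Assumed population form: square root of the mean squared deviation."""
    mean = sum(data) / len(data)
    return math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical raw data, mean = 5
print(value_range(data))               # 7
print(mean_deviation(data, about=5))   # 1.5
print(standard_deviation(data))        # 2.0
```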
Relative Measures
COEFFICIENT OF RANGE
$\text{Coeff. of Range} = \frac{H - S}{H + S}$, where H = highest value and S = smallest value
COEFFICIENT OF QUARTILE DEVIATION
COEFFICIENT OF MEAN DEVIATION
A = Mean, Median or Mode
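Relative measures are unit-free, which makes them suitable for comparing series measured in different units. The sketch below implements the coefficient of range as given above. It also includes the coefficient of variation under the usual definition (standard deviation / mean × 100), which is an assumption here because the notes name it without giving a formula. Names and data are hypothetical.

```python
import math

def coefficient_of_range(data):
    """(H - S) / (H + S), with H the highest and S the smallest value."""
    h, s = max(data), min(data)
    return (h - s) / (h + s)

def coefficient_of_variation(data):
    """Assumed definition: population standard deviation / mean * 100."""
    mean = sum(data) / len(data)
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
    return sd / mean * 100

data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical raw data
print(coefficient_of_range(data))       # 7/11, about 0.636
print(coefficient_of_variation(data))   # 40.0
```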
STANDARD DEVIATION OF COMPOSITE GROUP
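If two groups contain $n_1$ and $n_2$ observations with means $\bar{x}_1$ and $\bar{x}_2$, and standard deviations
$\sigma_1$ and $\sigma_2$ respectively, then the standard deviation ($\sigma$) of the composite group is given by
$\sigma = \sqrt{\frac{n_1\sigma_1^2 + n_2\sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{N}}$
Where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$, $N = n_1 + n_2$, and $\bar{x}$ is the mean of the composite group, given by
$N\bar{x} = n_1\bar{x}_1 + n_2\bar{x}_2$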
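The composite-group relation can be checked numerically: the value obtained from the group summaries should match the standard deviation of the pooled observations. The sketch below assumes standard deviations computed with divisor n (the population form, `statistics.pstdev`); the function name `combined_sd` and the two groups are hypothetical.

```python
import math
import statistics

def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Standard deviation of the composite group from the group summaries."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n   # mean of the composite group
    d1 = mean1 - mean
    d2 = mean2 - mean
    return math.sqrt((n1 * sd1**2 + n2 * sd2**2 + n1 * d1**2 + n2 * d2**2) / n)

group1 = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
group2 = [1, 3, 5, 7]

from_summary = combined_sd(
    len(group1), statistics.fmean(group1), statistics.pstdev(group1),
    len(group2), statistics.fmean(group2), statistics.pstdev(group2),
)
direct = statistics.pstdev(group1 + group2)   # computed from the pooled data
print(abs(from_summary - direct) < 1e-9)      # True
```

The two terms in $d_1$ and $d_2$ account for the spread between the group means; leaving them out would understate the composite standard deviation whenever the means differ.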
OBJECT AND PURPOSE OF MEASURING DISPERSION
1. To find the average distance of the items from an average.
2. To know the structure of the series.
3. To gauge the reliability of an average. When the dispersion is small, the average is reliable.
4. To know the limits of the items.
5. To serve as a basis for control of the variability itself.
6. To compare two or more series with regard to their variability.
SKEWNESS:
The skewness of a distribution is defined as the lack of symmetry.
In a symmetrical distribution, mean, median, and mode are equal to each other.
The presence of extreme observations on the right-hand side of a distribution makes it
positively skewed.
We shall in fact have Mean > Median > Mode when a distribution is positively skewed.
On the other hand, the presence of extreme observations on the left-hand side of a
distribution makes it negatively skewed, and the relationship between mean, median and
mode is: Mean < Median < Mode.
MEASURES OF SKEWNESS
1. Karl Pearson's coefficient of skewness
2. Bowley's coefficient of skewness
3. Kelly's coefficient of skewness
4. Coefficient of skewness based on moments
Empirical Relation: Mode = 3 Median - 2 Mean
Here Sk denotes the coefficient of skewness.
If Sk = 0, the frequency distribution is symmetrical about the mean.
If Sk > 0, the frequency distribution is positively skewed.
If Sk < 0, the frequency distribution is negatively skewed.
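Two of the listed coefficients can be sketched briefly. The formulas used are the common textbook forms and are assumptions as far as these notes are concerned: Karl Pearson's Sk = (Mean - Mode) / standard deviation, and Bowley's Sk = (Q3 + Q1 - 2 Median) / (Q3 - Q1). Function names and data are illustrative.

```python
import statistics

def pearson_skewness(data):
    """Assumed form: (mean - mode) / population standard deviation."""
    mean = statistics.fmean(data)
    mode = statistics.mode(data)   # assumes the data has a single mode
    return (mean - mode) / statistics.pstdev(data)

def bowley_skewness(q1, median, q3):
    """Assumed form: (Q3 + Q1 - 2*Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data: mean 5, median 4.5, mode 4
sk = pearson_skewness(data)
print(sk)                          # 0.5, so the data is positively skewed
print(statistics.fmean(data) > statistics.median(data) > statistics.mode(data))  # True
```

The second printed line confirms the ordering Mean > Median > Mode stated earlier for a positively skewed distribution.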
KURTOSIS:
Kurtosis is another measure of the shape of a distribution. Whereas skewness measures the lack
of symmetry of the frequency curve of a distribution, kurtosis is a measure of the relative
peakedness of its frequency curve. Various frequency curves can be divided into three categories
depending upon the shape of their peak.
Measures of Kurtosis
A measure of kurtosis is given by $\beta_2 = \frac{\mu_4}{\mu_2^2}$, a coefficient given by Karl Pearson, where $\mu_2$ and $\mu_4$
are the second and fourth central moments. The value of $\beta_2 = 3$ for a mesokurtic curve. When
$\beta_2 > 3$, the curve is more peaked than the mesokurtic curve and is termed leptokurtic.
Similarly, when $\beta_2 < 3$, the curve is less peaked than the mesokurtic curve and is called platykurtic.
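A minimal sketch of the classification, assuming Karl Pearson's coefficient is computed from the second and fourth central moments of raw data (the fourth moment divided by the square of the second). The function names, the tolerance and the data are hypothetical.

```python
def central_moment(data, r):
    """r-th moment about the mean: sum((x - mean)^r) / n."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** r for x in data) / len(data)

def beta2(data):
    """Kurtosis coefficient: fourth central moment / (second central moment)^2."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

def kurtosis_type(b2, tol=1e-9):
    """Compare the coefficient with 3, the value for a mesokurtic curve."""
    if abs(b2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if b2 > 3 else "platykurtic"

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
b2 = beta2(data)                   # second moment = 4, fourth moment = 44.5
print(b2)                          # 2.78125
print(kurtosis_type(b2))           # platykurtic
```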