Module tl
CHAPTER 6
Statistics
6.1 INTRODUCTION
Statistics is a branch of science which deals with the methods for collection, classification and analysis
of numerical data while conducting experiments, either in the laboratory or outside, with a view to drawing
valid conclusions and making reasonable decisions about some phenomenon.
6.2 BASIC TERMINOLOGY
1. Variable (or Variate). A quantity which takes up different values during a certain physical investigation
is called a variable or variate, e.g., heights, weights, ages, wages of persons, rainfall records of cities,
etc.
The totality of the values of a variable during some investigation is called the domain or range
of the variable.
Quantities which can assume any numerical value within a certain range are called continuous
variables, e.g., as a child grows, his/her height takes all possible values from 50 cm to 100 cm.
Quantities which are incapable of taking all possible values are called discrete or discontinuous
variables, e.g., the population of a region at any time is a discrete variable as it can take up only
non-negative integral values.
2. Data or Observations. The values taken by a variable are called data or observations. In
statistics these are also known as ‘statistical data’ or ‘statistical observations’.
Example: Let X = number of goals scored during each game of a football team in the last season.
Then X can assume the data 0, 2, 3, 4, ...
These data can be presented as
X: 0, 2, 3, 4, ...
3. Frequency. Frequency of a variate value is the number of occurrences of that particular value
of the variable in the given data set. The variate value can be a number or a category, depending on the
type of the variable: quantitative or qualitative respectively.
Example: (i) Let X denote the marks obtained by 20 students of a Diploma semester class
in a college test in Mathematics.
Let the values of X according to roll numbers be:
75 60 8 gy ‘0
65 70 5 oe
2B 75 65 0 aI
84 B 8g ‘
Here we see that X assumes the value 80 four times, i.e., the frequency of the data 80 is 4.
(ii) A sample of 15 Higher Secondary students were asked about their future plan regarding their
choice of honours subjects at degree level. The responses are as follows:
Chemistry    Mathematics  Physics      Economics  Statistics
Physics      Statistics   Mathematics  Chemistry  Mathematics
Economics    Chemistry    Mathematics  Physics    Mathematics
Here the frequency of Mathematics is 5, the frequency of Physics is 3 and so on.
In statistics the observations assumed by a variable, together with their frequencies, presented
in a table are known as the Frequency Distribution of the variable. There are two types of frequency distribution,
namely
(i) Simple Frequency Distribution
(ii) Grouped Frequency Distribution
These are discussed below.
4. Simple Frequency Distribution. The table where the observed values (data) assumed by a
variable are arranged in order of magnitude together with the respective frequency of each observation
is shown side by side is called Simple Frequency Distribution.
A simple frequency distribution tells us what values the variable can take and how often it takes
these values in a data set. It reflects the pattern of variation of a variable.
Example: Let X be a variable which takes the values:
2 1 0 4 5 6 7
8 0 1 9 3 4 0
4 1 2 3 7 8 6
7 2 1 3 5 0 4
Then the frequency distribution of X is
x :  0  1  2  3  4  5  6  7  8  9  Total
f :  4  4  3  3  4  2  2  3  2  1  28
This table can also be written columnwise.
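The counting in the example above can be sketched in Python. This is only an illustration: the function name `simple_frequency_distribution` is a hypothetical helper, not something defined in the text, and the data are the 28 values of X listed in the example.

```python
from collections import Counter

# The 28 observations of X from the example above.
data = [2, 1, 0, 4, 5, 6, 7,
        8, 0, 1, 9, 3, 4, 0,
        4, 1, 2, 3, 7, 8, 6,
        7, 2, 1, 3, 5, 0, 4]


def simple_frequency_distribution(values):
    """Return (value, frequency) pairs arranged in order of magnitude."""
    counts = Counter(values)        # number of occurrences of each value
    return sorted(counts.items())   # ascending order of the variate values


if __name__ == "__main__":
    table = simple_frequency_distribution(data)
    for x, f in table:
        print(f"{x:>2}  {f:>2}")
    print("Total", sum(f for _, f in table))
```

The sorted pairs correspond to the two rows (x and f) of the table, and the sum of the frequencies gives the total number of observations.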
5. Grouped Frequency Distribution. When a variable takes a large number of data or the
variable is continuous in nature (like height, weight, etc.), we divide the entire range of data (smallest
to largest) into a number of non-overlapping intervals (known as classes).
In this case we count the number of observations that fall into each class. These counts are called class
frequencies. The intervals (classes) are shown in a table and the frequency of the data falling in
each class (i.e., class frequency) is shown side by side in that table. This table is called the Grouped
Frequency Distribution of the variable.
A TEXTBOOK OF ENGINEERING MATHEMATICS
Example: Following observations show the percentage of population of 65 years old or above in
50 different regions:
* ai N2 153 94 136 14.2 Id 15.2 12.2 9.3
PDI 12 99 96 «14d IST 149 ILO 91 2s
Tt M2 26 BS 9.2 (14S 1S. ISP 10.2 106
R2 BS it 6 107 135 12 96 98 10.1
152 156 134 14S 9.8 OT ISL 4D 9.2 ND
Make a grouped frequency distribution of the above data.
Solution: We group the data into the non-overlapping intervals 9.0-9.9, 10.0-10.9,
11.0-11.9, ...
Class-limits    Tally Marks      Frequency
9.0-9.9         ||||/ ||||/      10
10.0-10.9       ||||/ |          6
11.0-11.9       ||||/ ||         7
12.0-12.9       ||||/            5
13.0-13.9       ||||/ ||         7
14.0-14.9       ||||/ |||        8
15.0-15.9       ||||/ ||         7
Total                            50
Note: Each tally mark (|) represents one frequency. Tally marks are recorded in groups of five (shown here as ||||/, four strokes crossed by a fifth) in
order to avoid error in counting a large number of tallies.
Basic Terminologies Relating to Grouped Frequency Distribution
Class interval: The data are grouped into a number of suitable non-overlapping intervals called
class intervals.
In the previous example 9.0-9.9, 10.0-10.9, etc. are class intervals.
Class frequency: The number of observations falling in the class is called the class frequency
of that class.
We may use ‘tally marks’ for counting the class frequencies from the raw data.
Class limits: The two extreme values specifying a class interval, i.e., the lowest and highest
values of the variable that can be included in the class, are called class limits. These are the apparent
limits of the classes, where the smaller one is known as the lower class limit (LCL) and the upper one
the upper class limit (UCL) of the class.
In the previous example the LCL and UCL of the class 10.0-10.9 are respectively 10.0 and 10.9.
Class boundaries: The class boundaries of a class are defined as
\[ \text{Lower class boundary (LCB)} = \text{LCL of the class} - \frac{d}{2}, \]
where d = LCL of the present class - UCL of the previous class, and
\[ \text{Upper class boundary (UCB)} = \text{UCL of the class} + \frac{d}{2}, \]
where d = LCL of the next class - UCL of the present class.
In the previous example, the LCB of the class 12.0-12.9 is $12 - \frac{12 - 11.9}{2} = 11.95$
and the UCB of the class 12.0-12.9 is $12.9 + \frac{13 - 12.9}{2} = 12.95$.
Width of a class (or class-width): Width of a class = UCB - LCB of the class.
It is also known as class size.
In the previous example, the width of the class 12.0-12.9 is 12.95 - 11.95 = 1.
Class mark or mid value: It is the mid-point of the class interval.
\[ \text{Class mark of a class} = \frac{1}{2}(\text{LCL of the class} + \text{UCL of the class}) = \frac{1}{2}(\text{LCB of the class} + \text{UCB of the class}). \]
Class mark is the representative value of the class.
In the previous example, class mark of the class 12.0-12.9 = $\frac{1}{2}(12 + 12.9) = 12.45$.
Note: In any frequency distribution with class intervals presented in terms of class limits, it is better to
convert them to class boundaries before proceeding to any further calculation or statistical analysis.
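The conversion from class limits to class boundaries, width and class mark can be sketched as follows. The function name `class_summary` is hypothetical, and the sketch assumes that the first and last classes use the same gap d as their only neighbour, since the definitions above need a previous (or next) class.

```python
def class_summary(limits):
    """Boundaries, width and class mark for consecutive classes.

    limits: list of (LCL, UCL) pairs in ascending order (at least two classes).
    Assumption: the first class reuses the gap to the next class and the last
    class reuses the gap to the previous class.
    """
    rows = []
    last = len(limits) - 1
    for k, (lcl, ucl) in enumerate(limits):
        # d = LCL of this class - UCL of the previous class
        gap_prev = lcl - limits[k - 1][1] if k > 0 else limits[1][0] - ucl
        # d = LCL of the next class - UCL of this class
        gap_next = limits[k + 1][0] - ucl if k < last else lcl - limits[k - 1][1]
        lcb = round(lcl - gap_prev / 2, 6)   # rounding hides floating-point noise
        ucb = round(ucl + gap_next / 2, 6)
        rows.append({
            "LCB": lcb,
            "UCB": ucb,
            "width": round(ucb - lcb, 6),
            "mark": round((lcb + ucb) / 2, 6),
        })
    return rows


# Class limits of the grouped frequency distribution in the previous example.
limits = [(9.0, 9.9), (10.0, 10.9), (11.0, 11.9), (12.0, 12.9),
          (13.0, 13.9), (14.0, 14.9), (15.0, 15.9)]

if __name__ == "__main__":
    print(class_summary(limits)[3])   # the class 12.0-12.9
```

For the class 12.0-12.9 this reproduces the boundaries 11.95 and 12.95, the width 1 and the class mark 12.45 worked out above.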
6. Cumulative Frequency
Less-than type:
(i) For a simple frequency distribution the total frequency of the observations (data) less than or
equal to an observation (data) is called the ‘less than type (≤)’ cumulative frequency of the observation
(data).
(ii) For a grouped frequency distribution the total frequency of the observations (data) less than or
equal to the upper class boundary of a class is called the ‘less than type (≤)’ cumulative frequency of
that class.
Greater-than type (or More-than type):
(i) For a simple frequency distribution the total frequency of the observations (data) greater than or
equal to an observation (data) is called the ‘greater than type (≥)’ cumulative frequency of the observation
(data).
(ii) For a grouped frequency distribution the total frequency of the observations (data) greater than or
equal to the lower class boundary of a class is called the ‘greater than type (≥)’ cumulative frequency of
that class.
Example: The class boundaries, class marks, class widths, cumulative frequencies of less than
type and greater than type for the data given in the previous example are tabulated below:
Class          Class    Class    Class        Cumulative frequency
boundaries     mark     width    frequency    Less than type    Greater than type
8.95-9.95      9.45     1.00     10           10                50
9.95-10.95     10.45    1.00     6            16                40
10.95-11.95    11.45    1.00     7            23                34
11.95-12.95    12.45    1.00     5            28                27
12.95-13.95    13.45    1.00     7            35                22
13.95-14.95    14.45    1.00     8            43                15
14.95-15.95    15.45    1.00     7            50                7
Total                            50
Consider the cumulative frequencies for the class 10.95-11.95. We may say that
there are 23 regions (out of 50) for which the percentage of people aged 65 years or above is 11.95 or
less and 34 regions for which this percentage is 10.95 or more.
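Both types of cumulative frequency are running totals of the class frequencies, taken from the top or from the bottom of the table. A minimal sketch, with hypothetical function names and the class frequencies 10, 6, 7, 5, 7, 8, 7 of the example:

```python
from itertools import accumulate

# Class frequencies of the grouped distribution (classes in ascending order).
freqs = [10, 6, 7, 5, 7, 8, 7]


def less_than_cumulative(frequencies):
    """Running total from the lowest class upwards (less than type)."""
    return list(accumulate(frequencies))


def greater_than_cumulative(frequencies):
    """Running total from the highest class downwards (greater than type)."""
    return list(accumulate(frequencies[::-1]))[::-1]


if __name__ == "__main__":
    print(less_than_cumulative(freqs))
    print(greater_than_cumulative(freqs))
```

The third entries of the two lists are the 23 and 34 regions quoted for the class 10.95-11.95.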
6.3 MEASURES OF CENTRAL TENDENCY
Central tendency of a group of data (or, observations) is the tendency of the data to cluster or to centre
about some typical value called a central value. This value is used to represent the whole group and is
known as a measure of central tendency. In this article we shall discuss the following three measures
of central tendency:
(i) Arithmetic Mean (or, Simply Mean)
(ii) Median
(iii) Mode
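1. Mean:
I. Direct Method
(i) If $x_1, x_2, \ldots, x_n$ are n values of a variable (or variate), then the arithmetic mean (or simply
mean) is denoted and defined as
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad \ldots(1) \]
(ii) If the observations have the frequencies as shown below:
Variable (x)  :  x_1  x_2  x_3  ...  x_n
Frequency (f) :  f_1  f_2  f_3  ...  f_n
then the mean is
\[ \bar{x} = \frac{1}{N}(f_1 x_1 + f_2 x_2 + \cdots + f_n x_n) = \frac{1}{N}\sum_{i=1}^{n} f_i x_i \qquad \ldots(2) \]
where $N = f_1 + f_2 + \cdots + f_n = \sum_{i=1}^{n} f_i$.
(iii) In a grouped frequency distribution, if $x_1, x_2, \ldots, x_n$ be the class marks (or mid values) of the class
intervals having frequencies $f_1, f_2, \ldots, f_n$ respectively, then
the mean is
\[ \bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i, \quad \text{where } N = f_1 + f_2 + \cdots + f_n = \sum_{i=1}^{n} f_i \qquad \ldots(3) \]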
Example: (i) The systolic blood pressure (mm Hg) of 10 workers of a factory are given below:
127, 124, 136, 116, 132, 128, 120, 123, 130, 121.
The mean systolic blood pressure is
\[ \frac{1}{10}(127 + 124 + 136 + 116 + 132 + 128 + 120 + 123 + 130 + 121) = 125.70 \text{ mm Hg.} \]
(ii) If
Number of goals (x)     :  0  1  2   3  4  Total
Number of matches (f_i) :  1  5  12  5  1  24
be the frequency distribution of number of goals scored per match, then the average number of goals per
match is
\[ \bar{x} = \frac{1}{N}(f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4 + f_5 x_5) = \frac{1}{24}(1 \times 0 + 5 \times 1 + 12 \times 2 + 5 \times 3 + 1 \times 4) = \frac{48}{24} = 2. \]
(iii) Following table shows the calculation of mean life time (in hours) for a frequency distribution
of life-hours of 70 light bulbs:
Life-hours    Frequency (f_i)    Class-mark or mid-value (x_i)    x_i f_i
400-500       2                  450                              900
500-600       4                  550                              2200
600-700       10                 650                              6500
700-800       30                 750                              22500
800-900       16                 850                              13600
900-1000      8                  950                              7600
Total         70                                                  53300
The mean life hour is $\bar{x}$ = 53300/70 = 761.43 hr.
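The direct method for a grouped frequency distribution can be sketched in a few lines of Python. The function name `grouped_mean` is hypothetical; the classes and frequencies are those of the light-bulb example.

```python
def grouped_mean(classes, freqs):
    """Arithmetic mean of a grouped distribution by the direct method.

    classes: list of (lower, upper) class intervals
    freqs:   class frequencies in the same order
    """
    marks = [(lo + hi) / 2 for lo, hi in classes]          # class marks x_i
    total = sum(freqs)                                     # N = sum of f_i
    return sum(f * x for f, x in zip(freqs, marks)) / total


# Life-hours of 70 light bulbs, as tabulated in example (iii).
classes = [(400, 500), (500, 600), (600, 700),
           (700, 800), (800, 900), (900, 1000)]
freqs = [2, 4, 10, 30, 16, 8]

if __name__ == "__main__":
    print(round(grouped_mean(classes, freqs), 2))
```

Each class is represented by its class mark, so the result is an approximation to the mean of the raw data rather than its exact value.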
II. Short-cut method
In this method the mean ($\bar{x}$) is computed by applying the formula
\[ \bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} \qquad \ldots(4) \]
where $d_i = x_i - A$, the deviation of $x_i$ from an arbitrary origin (assumed mean) A.
Proof: Now, $\sum f_i x_i = \sum f_i (A + d_i) = A \sum f_i + \sum f_i d_i$. Dividing both sides by $\sum f_i$ gives (4).
Example: Calculate arithmetic mean by the short-cut method from the following data:
Marks           :  0-10  10-20  20-30  30-40  40-50  50-60
No. of students :  7     8      25     30     18     12
Solution: Let A = 35 be the assumed mean.
Marks    Mid-value (x_i)    No. of students (f_i)    d_i = x_i - 35    f_i d_i
0-10     5                  7                        -30               -210
10-20    15                 8                        -20               -160
20-30    25                 25                       -10               -250
30-40    35                 30                       0                 0
40-50    45                 18                       10                180
50-60    55                 12                       20                240
Total                       Σf_i = 100                                 Σf_i d_i = -200
Average marks = $A + \frac{\sum f_i d_i}{\sum f_i} = 35 - \frac{200}{100} = 33$.
III. Step-deviation method
In this method the mean ($\bar{x}$) is computed by applying the formula
\[ \bar{x} = A + h\,\frac{\sum f_i u_i}{\sum f_i} \qquad \ldots(5) \]
where $u_i = (x_i - A)/h$, A is the assumed mean and h is the equal class interval.
Proof: Put $u_i = d_i/h$, or, $d_i = h u_i$. Substituting this value in (4), we get (5).
Example: From the data of previous example, compute arithmetic mean by step-deviation method.
Solution: Here h = 10 and let A = 35 be the assumed mean.
Marks    Mid-value (x_i)    No. of students (f_i)    d_i = x_i - 35    u_i = d_i/10    f_i u_i
0-10     5                  7                        -30               -3              -21
10-20    15                 8                        -20               -2              -16
20-30    25                 25                       -10               -1              -25
30-40    35                 30                       0                 0               0
40-50    45                 18                       10                1               18
50-60    55                 12                       20                2               24
Total                       Σf_i = 100                                                 Σf_i u_i = -20
Average marks = $A + h\,\frac{\sum f_i u_i}{\sum f_i} = 35 + 10 \times \frac{(-20)}{100} = 35 - 2 = 33$.
Notes: (i) All the three methods of finding arithmetic mean in continuous series give us the same result.
(ii) The direct method, though the simplest one, involves more calculations when mid-values and frequencies
are very large in magnitude. For example, consider the following data:
Income in Rs.  :  10,000-20,000  20,000-30,000  30,000-40,000  40,000-50,000
No. of Persons :  570            481            155            320
Here step-
1139 21200. 2116 ara
Solution:
Sl. No.    Wages arranged in ascending order
1          21000
2          21080
3          21120
4          21160
5          21200
6          21400
7          21500
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{7+1}{2}$th item
= size of the 4th item = 21160.
Median wage = Rs. 21160.
Example: Compute the value of median from the following data:
421, 222, 351, 302, 432, 212, 320, 890.
Solution:
Sl. No.    Data arranged in ascending order
1          212
2          222
3          302
4          320
5          351
6          421
7          432
8          890
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{8+1}{2}$th item = size of the 4.5th item
= $\frac{1}{2}$(4th item + 5th item) = $\frac{1}{2}(320 + 351) = 335.5$.
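For individual observations the rule "size of the (n + 1)/2-th item" can be sketched as below. The function name `median_of_observations` is hypothetical; when n is even, the (n + 1)/2-th item falls halfway between two observations and their average is taken, as in the second example.

```python
def median_of_observations(values):
    """Median of individual observations via the (n + 1)/2-th item rule."""
    ordered = sorted(values)            # arrange in ascending order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd n: the (n + 1)/2-th item exists (index `middle`, counting from 0)
        return ordered[middle]
    # even n: average of the (n/2)-th and (n/2 + 1)-th items
    return (ordered[middle - 1] + ordered[middle]) / 2


wages = [21000, 21080, 21120, 21160, 21200, 21400, 21500]
data = [421, 222, 351, 302, 432, 212, 320, 890]

if __name__ == "__main__":
    print(median_of_observations(wages))
    print(median_of_observations(data))
```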
Case 2. (Simple frequency distribution)
Steps: (i) Arrange the given values of observations (i.e., given data) in ascending order of
magnitude.
(ii) Find out the ‘less than type (≤)’ cumulative frequencies.
(iii) Now look at the cumulative frequency column and find that total which is either equal to
$\frac{N+1}{2}$ (where N is the total frequency) or next higher (if $\frac{N+1}{2}$ is not a cumulative frequency) and
determine the value of the observation corresponding to it. This gives the value of the median.
Example: From the following data find the value of median:
Income (Rs.)   :  18000  8000  10000  15000  20000  25000
No. of persons :  30     16    24     26     20     6
Solution:
Income arranged in    No. of persons (f_i)    Cumulative frequency
ascending order                               (≤ type)
8000                  16                      16
10000                 24                      40
15000                 26                      66
18000                 30                      96
20000                 20                      116
25000                 6                       122
Here N = 122, so $\frac{N+1}{2} = 61.5$.
We see that there is no cumulative frequency ‘61.5’ in the third column. The next higher cumulative
frequency in the third column is 66.
The required median is the observation corresponding to the cumulative frequency 66, i.e., Rs. 15000.
Case 3. (Grouped frequency distribution)
Steps: (i) Arrange the given values of observations (i.e., given data) in ascending order.
(ii) Find out the ‘less than type (≤)’ cumulative frequencies.
(iii) Now look at the cumulative frequency column and find that total which is either equal to $\frac{N}{2}$
(where N is the total frequency) or next higher (if $\frac{N}{2}$ is not a cumulative frequency) and determine the
corresponding class (called median class).
(iv)
\[ \text{Median} = l_m + \frac{\frac{N}{2} - F}{f_m} \times i \qquad \ldots(6) \]
where $l_m$ = lower class boundary of the median class,
N = total frequency,
F = cumulative frequency of the class preceding the median class, or sum of the frequencies
of all classes lower than the median class,
$f_m$ = simple frequency of the median class,
i = width of the median class.
Example: Calculate the median for the following frequency distribution:
Marks    No. of students    Marks    No. of students
45-50    10                 20-25    32
40-45    15                 15-20    20
35-40    26                 10-15    12
30-35    30                 5-10     5
25-30    50
Solution:
Marks    Frequency    Cumulative frequency (≤)
5-10     5            5
10-15    12           17
15-20    20           37
20-25    32           69
25-30    50           119
30-35    30           149
35-40    26           175
40-45    15           190
45-50    10           200
Here $\frac{N}{2} = \frac{200}{2} = 100$. There is no cumulative frequency 100 in the third column of the above
table; the next higher c.f. is 119, corresponding to the median class 25-30.
$l_m$ = lower boundary of the median class = 25,
F = cumulative frequency of the class preceding the median class = 69,
$f_m$ = simple frequency of the median class = 50,
i = width of the median class = 30 - 25 = 5.
\[ \text{Median} = l_m + \frac{\frac{N}{2} - F}{f_m} \times i = 25 + \frac{100 - 69}{50} \times 5 = 28.1. \]
Example: Compute the median from the following data:
Weight in gms    No. of mangoes
310-319          12
320-329          20
330-339          44
340-349          54
350-359          42
360-369          20
370-379          8
Solution: Here the class intervals are given in terms of class limits. We should convert them to
class boundaries by deducting 0.5 from the lower limits and adding 0.5 to the upper limits.
Weight in gms    Frequency    Cumulative frequency
309.5-319.5      12           12
319.5-329.5      20           32
329.5-339.5      44           76
339.5-349.5      54           130
349.5-359.5      42           172
359.5-369.5      20           192
369.5-379.5      8            200
Here, $\frac{N}{2} = \frac{200}{2} = 100$. There is no cumulative frequency 100 in the third column of the table;
the next higher c.f. is 130, corresponding to the median class 339.5-349.5.
$l_m$ = lower boundary of the median class = 339.5,
F = cumulative frequency of the class preceding the median class = 76,
$f_m$ = simple frequency of the median class = 54,
i = width of the median class = 349.5 - 339.5 = 10.
\[ \text{Median} = l_m + \frac{\frac{N}{2} - F}{f_m} \times i = 339.5 + \frac{100 - 76}{54} \times 10 = 343.94. \]
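The steps for the median of a grouped frequency distribution can be sketched as follows. The function name `grouped_median` is hypothetical, and the sketch assumes the classes are already given by their boundaries in ascending order. The mango frequencies 44 and 42 used here are inferred from the cumulative totals of the worked solution.

```python
from itertools import accumulate


def grouped_median(boundaries, freqs):
    """Median of a grouped distribution: l_m + (N/2 - F) / f_m * i.

    boundaries: list of (lower, upper) class boundaries, ascending
    freqs:      class frequencies in the same order
    """
    total = sum(freqs)                                   # N
    cumulative = list(accumulate(freqs))                 # less than type c.f.
    # median class: first class whose c.f. equals N/2 or is next higher
    m = next(k for k, cf in enumerate(cumulative) if cf >= total / 2)
    lower, upper = boundaries[m]
    preceding = cumulative[m - 1] if m > 0 else 0        # F
    return lower + (total / 2 - preceding) / freqs[m] * (upper - lower)


# Marks of 200 students.
marks = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30),
         (30, 35), (35, 40), (40, 45), (45, 50)]
marks_f = [5, 12, 20, 32, 50, 30, 26, 15, 10]

# Weights of 200 mangoes, in class boundaries.
weights = [(309.5, 319.5), (319.5, 329.5), (329.5, 339.5), (339.5, 349.5),
           (349.5, 359.5), (359.5, 369.5), (369.5, 379.5)]
weights_f = [12, 20, 44, 54, 42, 20, 8]

if __name__ == "__main__":
    print(round(grouped_median(marks, marks_f), 2))
    print(round(grouped_median(weights, weights_f), 2))
```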
3. Mode. The mode or the modal value is that value in a collection of observations which
has the maximum frequency. Therefore, in a frequency distribution the observation having the greatest
frequency is called the mode.
Computation of Mode
Case 1. (Individual observations)
To find the mode, count the number of times the various values repeat themselves. The value occurring
the maximum number of times is the modal value.
Calculate the mode from the following data of marks obtained by ten students:
SEN, i Marks obtained
p SiN |
f 6 1s
5 7 20
3 8 26
4
‘ 9 2
5 10 15
Solu
Marks No. of times it occurs Marks No. of times it vccurs
10 1
12 1
1s 1
18 1
(ii) When there are two or more values having the same greatest frequency, one cannot say which is the
modal value. In this case the mode is said to be ill-defined and such a series of observations (or, distribution) is known
as bi-modal or multi-modal.
Case 2. (Simple frequency distribution)
In a simple frequency distribution the mode is found just by inspection, i.e., by looking to that
value of the observation around which the items are most heavily concentrated.
Example: Consider the simple frequency distribution:
Size of garment 320 31S 0G
No. of persons wearing =: = 25° 2015, gw
In this distribution we see the observation 30 has the highest frequency 60. So the mode of the
distribution is 30.
Case 3. (Grouped frequency distribution)
In a grouped frequency distribution the mode can be found if there exists a unique class with
maximum frequency and if every class interval has equal width.
First find the modal class, i.e., the class having maximum frequency; then
\[ \text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2 f_0 - f_{-1} - f_1} \times i \]
where $l_1$ = lower class boundary of the modal class,
$f_0$ = frequency of the modal class,
$f_{-1}$ = frequency of the class preceding the modal class,
$f_1$ = frequency of the class succeeding the modal class,
i = width of each class.
Example: Find the value of mode from the data given below:
Woight (Ky) No of suulenty Weight (Ke) No, of sttulents
Wd? 4 olor 2
W-82 4 on 2 4
Sha? Mn wy §
$8-02 » WWD so
Solution: By inspection the mode lies in the class 58-62; in terms of class boundaries it becomes
57.5-62.5.
\[ \text{Mode} = l_1 + \frac{f_0 - f_{-1}}{2 f_0 - f_{-1} - f_1} \times i \qquad \ldots(7) \]
where $l_1$ = lower class boundary of the modal class = 57.5,
$f_0$ = frequency of the modal class = 22,
$f_{-1}$ = frequency of the class preceding the modal class = 15,
$f_1$ = frequency of the class succeeding the modal class = 12,
i = width of each class = 62.5 - 57.5 = 5.
\[ \text{Mode} = 57.5 + \frac{22 - 15}{2 \times 22 - 15 - 12} \times 5 = 57.5 + \frac{35}{17} = 59.56. \]
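The mode computation for a grouped distribution is a one-line formula once the modal class has been located. A minimal sketch, with a hypothetical function name and the values of the example (modal class 57.5-62.5, frequencies 15, 22, 12, width 5):

```python
def grouped_mode(lower_boundary, f_modal, f_prev, f_next, width):
    """Mode of a grouped distribution with equal class widths.

    lower_boundary: lower class boundary of the modal class
    f_modal:        frequency of the modal class
    f_prev, f_next: frequencies of the preceding and succeeding classes
    width:          common class width
    """
    return lower_boundary + (f_modal - f_prev) / (2 * f_modal - f_prev - f_next) * width


if __name__ == "__main__":
    print(round(grouped_mode(57.5, 22, 15, 12, 5), 2))
```

The denominator vanishes only when the modal class has the same frequency as both neighbours; in that case there is no unique class with maximum frequency and the formula does not apply.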
Relation among Mean, Median and Mode
For a distribution having single mode the relation is
Mode = 3 Median - 2 Mean ...(8)
Note: Where the mode is ill-defined, its value may be ascertained by the formula (8). This measure is called
the empirical mode.
6.4 STANDARD DEVIATION
Mean, Median and Mode represent the entire series of observations. The degree to which these observations
tend to spread about these measures of central tendency is usually measured by Standard Deviation
(S.D.). The standard deviation concept was introduced by Karl Pearson in 1893 and it is the most
important and widely used measure for studying dispersion. It is computed as the positive square root of
the mean of the squares of the differences of the variate values (observations) from their mean, and so it
is also known as root mean square deviation. Standard deviation (S.D.) is usually denoted by the
small Greek letter σ (read as sigma).
Variance: The square of the Standard Deviation, i.e., σ², is called variance.
Calculation of Standard Deviation
Case 1. (Individual observations)
I. Direct Method
If $x_1, x_2, \ldots, x_n$ be the observations (data) then their Standard Deviation (S.D.) is
\[ \sigma = +\sqrt{\frac{1}{n}\left\{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right\}} = +\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad \ldots(1) \]
where $\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$.
Example: Let x assume the values 1, 3, 7, 9. Then
\[ \bar{x} = \frac{1}{4}(1 + 3 + 7 + 9) = 5. \]
S.D. of x is
\[ \sigma = \sqrt{\frac{1}{4}\left[(1-5)^2 + (3-5)^2 + (7-5)^2 + (9-5)^2\right]} = \sqrt{\frac{1}{4}(16 + 4 + 4 + 16)} = \sqrt{10}. \]
Now, observe that
\[ \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - 2\bar{x}\cdot\frac{1}{n}\sum_{i=1}^{n} x_i + \bar{x}^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2. \]
Therefore, from (1) we get
\[ \sigma = +\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2} \qquad \ldots(2) \]
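Formulas (1) and (2) give the same value; a small Python sketch makes this easy to check on the example values 1, 3, 7, 9. The two function names are hypothetical.

```python
import math


def sd_from_deviations(values):
    """Formula (1): root of the mean of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_from_squares(values):
    """Formula (2): root of (mean of squares minus square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


if __name__ == "__main__":
    xs = [1, 3, 7, 9]
    print(sd_from_deviations(xs), sd_from_squares(xs))
```

Formula (2) avoids computing each deviation separately. In floating-point arithmetic it can lose accuracy when the mean is large compared with the spread, which is one reason to keep formula (1) available as a check.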
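Case 2. (Simple frequency distribution)
Let $x_1, x_2, \ldots, x_n$ be a series of values of a variable x with respective frequencies $f_1, f_2, \ldots, f_n$; then the
Standard Deviation (S.D.) is
\[ \sigma = +\sqrt{\frac{1}{N}\left\{f_1(x_1 - \bar{x})^2 + f_2(x_2 - \bar{x})^2 + \cdots + f_n(x_n - \bar{x})^2\right\}} = +\sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2} \qquad \ldots(3) \]
where $\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i$ and $N = f_1 + f_2 + \cdots + f_n$.
Now,
\[ \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2 - \bar{x}^2. \]
Therefore, from (3) we get
\[ \sigma = +\sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i x_i^2 - \bar{x}^2} \qquad \ldots(4) \]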
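Case 3. (Grouped frequency distribution)
If the observations are given in a grouped frequency distribution, then let $x_1, x_2, \ldots, x_n$ be the class
marks (or mid values) of the class intervals and $f_1, f_2, \ldots, f_n$ the corresponding frequencies. The Standard
Deviation (S.D.) is given by (3) or (4).
Example: Obtain standard deviation for the frequency distribution given in Example (iii) of
art. 6.3.
Solution:
Life-hours    Frequency (f_i)    Class-mark (x_i)    x_i f_i    f_i x_i^2
400-500       2                  450                 900        405000
500-600       4                  550                 2200       1210000
600-700       10                 650                 6500       4225000
700-800       30                 750                 22500      16875000
800-900       16                 850                 13600      11560000
900-1000      8                  950                 7600       7220000
Total         70 = N                                 53300      41495000
Mean, $\bar{x}$ = 53300/70 = 761.43 hr.
Variance,
\[ \sigma^2 = \frac{1}{N}\sum f_i x_i^2 - \bar{x}^2 = \frac{41495000}{70} - (761.43)^2 = 592785.71 - 579775.64 = 13010.07, \]
so that σ ≈ 114.1 hr.
II. Short-cut method
In this method the standard deviation (σ) is computed by applying the formula
\[ \sigma = \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2} \qquad \ldots(5) \]
where $d_i = x_i - A$, the deviation of $x_i$ from an arbitrary origin (assumed mean) A.
Proof: Now, $(x_i - \bar{x}) = (x_i - A) - (\bar{x} - A) = d_i - (\bar{x} - A)$.
\[ \sum f_i (x_i - \bar{x})^2 = \sum f_i \{d_i - (\bar{x} - A)\}^2 = \sum f_i d_i^2 + (\bar{x} - A)^2 \sum f_i - 2(\bar{x} - A)\sum f_i d_i \]
\[ = \sum f_i d_i^2 + \frac{(\sum f_i d_i)^2}{N} - \frac{2(\sum f_i d_i)^2}{N} = \sum f_i d_i^2 - \frac{(\sum f_i d_i)^2}{N}, \]
since $\bar{x} - A = \sum f_i d_i / N$ and $\sum f_i = N$. Dividing by N and taking the positive square root gives (5).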
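Example: Calculate the standard deviation from the data given below:
Size of item :  3.5  4.5  5.5  6.5  7.5  8.5  9.5
Frequency    :  2    8    21   52   91   36   5
Solution:
Size of item (x_i)    Frequency (f_i)    d_i = x_i - 6.5    f_i d_i    f_i d_i^2
3.5                   2                  -3                 -6         18
4.5                   8                  -2                 -16        32
5.5                   21                 -1                 -21        21
6.5                   52                 0                  0          0
7.5                   91                 1                  91         91
8.5                   36                 2                  72         144
9.5                   5                  3                  15         45
Total                 N = 215                               Σf_i d_i = 135    Σf_i d_i^2 = 351
\[ \text{S.D.} = \sigma = \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2}, \]
where $\sum f_i d_i^2 = 351$, $N = \sum f_i = 215$, $\sum f_i d_i = 135$.
\[ \sigma = \sqrt{\frac{351}{215} - \left(\frac{135}{215}\right)^2} = 1.113. \]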
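The short-cut method can be sketched as follows; the function name `sd_assumed_mean` is hypothetical and the data are those of the example, with assumed mean A = 6.5.

```python
import math


def sd_assumed_mean(values, freqs, assumed_mean):
    """Standard deviation from deviations d_i = x_i - A (short-cut method)."""
    total = sum(freqs)                                        # N
    deviations = [x - assumed_mean for x in values]           # d_i
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))    # sum of f_i d_i
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, deviations))
    return math.sqrt(sum_fd2 / total - (sum_fd / total) ** 2)


sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
freqs = [2, 8, 21, 52, 91, 36, 5]

if __name__ == "__main__":
    print(round(sd_assumed_mean(sizes, freqs, 6.5), 3))
```

Any value of A gives the same result, because a change of origin does not affect the standard deviation; choosing A near the middle of the data only keeps the hand arithmetic small.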
III. Step-deviation method
In this method the standard deviation (σ) is computed by applying the formula
\[ \sigma = h\sqrt{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2} \qquad \ldots(6) \]
where $u_i = (x_i - A)/h = d_i/h$, A is the assumed mean and h is the equal class interval.
Proof: Put $u_i = d_i/h$, or, $d_i = h u_i$. Substituting this value in (5), we get (6).
Example 1: Obtain standard deviation for the frequency distribution given in the Example of
art. 6.3 (II).
Solution:
Marks    Mid-value (x_i)    No. of students (f_i)    u_i = (x_i - 35)/10    f_i u_i    f_i u_i^2
0-10     5                  7                        -3                     -21        63
10-20    15                 8                        -2                     -16        32
20-30    25                 25                       -1                     -25        25
30-40    35                 30                       0                      0          0
40-50    45                 18                       1                      18         18
50-60    55                 12                       2                      24         48
Total                       N = Σf_i = 100                                  Σf_i u_i = -20    Σf_i u_i^2 = 186
Here, A = 35, h = 10.
\[ \text{Standard Deviation, } \sigma = h\sqrt{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2} = 10\sqrt{\frac{186}{100} - \left(\frac{-20}{100}\right)^2} = 10\sqrt{1.82} = 13.49. \]
Example 2: Find the standard deviation from the following data:
Age under            :  10  20  30  40  50   60   70   80
No. of persons dying :  12  30  55  70  102  114  115  125
Solution: Computation of Standard Deviation
Age      Mid-value (x_i)    Frequency (f_i)    u_i = (x_i - 35)/10    f_i u_i    f_i u_i^2
0-10     5                  12                 -3                     -36        108
10-20    15                 18                 -2                     -36        72
20-30    25                 25                 -1                     -25        25
30-40    35                 15                 0                      0          0
40-50    45                 32                 1                      32         32
50-60    55                 12                 2                      24         48
60-70    65                 1                  3                      3          9
70-80    75                 10                 4                      40         160
Total                       N = 125                                   Σf_i u_i = 2    Σf_i u_i^2 = 454
Here, A = 35, h = 10.
\[ \text{Standard Deviation, } \sigma = h\sqrt{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2} = 10\sqrt{\frac{454}{125} - \left(\frac{2}{125}\right)^2} = 19.057. \]
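The step-deviation method, including the conversion of "age under" cumulative figures into class frequencies, can be sketched as below. Both function names are hypothetical; the class marks, A = 35 and h = 10 are those of the examples.

```python
import math


def frequencies_from_cumulative(cumulative):
    """Turn less-than cumulative frequencies into class frequencies."""
    return [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]


def sd_step_deviation(marks, freqs, assumed_mean, h):
    """sigma = h * sqrt(sum(f u^2)/N - (sum(f u)/N)^2), with u = (x - A)/h."""
    total = sum(freqs)
    us = [(x - assumed_mean) / h for x in marks]
    sum_fu = sum(f * u for f, u in zip(freqs, us))
    sum_fu2 = sum(f * u * u for f, u in zip(freqs, us))
    return h * math.sqrt(sum_fu2 / total - (sum_fu / total) ** 2)


# Example 2: persons dying under a given age.
age_marks = [5, 15, 25, 35, 45, 55, 65, 75]
age_freqs = frequencies_from_cumulative([12, 30, 55, 70, 102, 114, 115, 125])

# Example 1: marks of 100 students.
mark_marks = [5, 15, 25, 35, 45, 55]
mark_freqs = [7, 8, 25, 30, 18, 12]

if __name__ == "__main__":
    print(age_freqs)
    print(round(sd_step_deviation(age_marks, age_freqs, 35, 10), 3))
    print(round(sd_step_deviation(mark_marks, mark_freqs, 35, 10), 2))
```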
Properties of Standard Deviation
Property 1: The standard deviation for a collection of data is zero if and only if all values in the
collection are equal.
Proof: If part:
Suppose $x_i$ = constant = c (say), for all i = 1, 2, ..., n.
By Property 1 of mean (see art. 6.3), $\bar{x} = c$.
\[ \text{Variance, } \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n}(c - c)^2 = 0. \]
Hence, s.d., σ = 0.
Only if part:
Suppose σ = 0 for a collection of values $x_1, x_2, \ldots, x_n$,
i.e.,
\[ \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = 0. \]
This is possible only if each term of the summation is zero, i.e., $x_i = \bar{x}$, i.e., all $x_i$ (i = 1, 2, ..., n)
are equal.
Hence the result.
Property 2: If x and y are linearly related as y = a + bx, where a, b are real constants, then their standard
deviations have the following relation:
\[ \sigma_y = |b|\,\sigma_x \]
where $\sigma_x$, $\sigma_y$ are the s.d. of x, y respectively.
Proof:
\[ \sigma_y^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \frac{1}{n}\sum_{i=1}^{n}(a + b x_i - a - b\bar{x})^2 \quad \text{[By Property 3 of mean, art. 6.3]} \]
\[ = b^2 \cdot \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = b^2 \sigma_x^2. \]
∴ $\sigma_y = |b|\,\sigma_x$.
Notes: (i) It can also be written as:
If $y = \frac{x - a}{b}$, where a, b are real constants, then $\sigma_y = \frac{\sigma_x}{|b|}$.
(ii) The above results make the calculation of s.d. easier.
(iii) Standard Deviation is affected by the change of scale, but the change of base has no effect on it.
Property 3: If there are two groups of data consisting of $n_1$ and $n_2$ observations respectively,
with respective means $\bar{x}_1$ and $\bar{x}_2$ and standard deviations $\sigma_1$ and $\sigma_2$, then the composite variance for
the collection of $(n_1 + n_2)$ observations is given by
\[ \sigma^2 = \frac{n_1\sigma_1^2 + n_2\sigma_2^2 + n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2}{n_1 + n_2}, \]
where $\bar{x}$ is the composite mean.
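Proof: Suppose $x_{11}, x_{12}, \ldots, x_{1n_1}$ be the observations for the first group (having mean $\bar{x}_1$, s.d. $\sigma_1$)
and $x_{21}, x_{22}, \ldots, x_{2n_2}$ be the observations for the second group (having mean $\bar{x}_2$, s.d. $\sigma_2$). Then
\[ \sigma_1^2 = \frac{1}{n_1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2, \qquad \sigma_2^2 = \frac{1}{n_2}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2 \qquad \ldots(1) \]
and the composite variance
\[ \sigma^2 = \frac{1}{n_1 + n_2}\left\{\sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x})^2\right\} \qquad \ldots(2) \]
Now,
\[ \sum_{i=1}^{n_1}(x_{1i} - \bar{x})^2 = \sum_{i=1}^{n_1}\{(x_{1i} - \bar{x}_1) + (\bar{x}_1 - \bar{x})\}^2 \]
\[ = \sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 + n_1(\bar{x}_1 - \bar{x})^2 + 2(\bar{x}_1 - \bar{x})\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1) \]
\[ = \sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 + n_1(\bar{x}_1 - \bar{x})^2 \quad \text{[By Property 2 of Mean, art. 6.3]} \]
\[ = n_1\sigma_1^2 + n_1(\bar{x}_1 - \bar{x})^2 \quad \text{[By (1)]} \qquad \ldots(3) \]
Similarly,
\[ \sum_{i=1}^{n_2}(x_{2i} - \bar{x})^2 = n_2\sigma_2^2 + n_2(\bar{x}_2 - \bar{x})^2 \qquad \ldots(4) \]
From (2), (3) and (4), the result follows.
Example: The means of two samples of size 50 and 100 respectively are 54.1 and 50.3 and the
standard deviations are 8 and 7. Find the mean and the standard deviation of the sample of size 150
obtained by combining the two samples.
Solution: Let $\bar{x}$ and σ represent the composite mean and composite standard deviation of the
combined collection of the given two samples.
\[ \bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \frac{50 \times 54.1 + 100 \times 50.3}{50 + 100} \quad \text{[By Property 5 of Mean, art. 6.3]} \]
= 51.57.
\[ \sigma^2 = \frac{1}{n_1 + n_2}\left[n_1\sigma_1^2 + n_2\sigma_2^2 + n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2\right] \quad \text{[By Property 3 of S.D., art. 6.4]} \]
\[ = \frac{1}{50 + 100}\left[50 \times 8^2 + 100 \times 7^2 + 50(54.1 - 51.57)^2 + 100(50.3 - 51.57)^2\right] \]
= 57.21.
∴ σ = 7.56.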
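Property 3 translates directly into code. The sketch below uses a hypothetical function name and keeps the composite mean unrounded, so intermediate figures can differ slightly from a hand computation that rounds the mean to 51.57 first.

```python
import math


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Composite mean and standard deviation of two groups (Property 3)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (n1 * sd1 ** 2 + n2 * sd2 ** 2
                + n1 * (mean1 - mean) ** 2
                + n2 * (mean2 - mean) ** 2) / n
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    mean, sd = combine_groups(50, 54.1, 8, 100, 50.3, 7)
    print(round(mean, 2), round(sd, 2))
```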
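6.5 COEFFICIENT OF VARIATION
The ratio of the standard deviation to the mean, i.e., $\sigma / \bar{x}$, is known as the coefficient of variation. Note that this
ratio is used for comparing the variations between two groups. It is often expressed as
a percentage:
\[ \text{Coefficient of variation} = \frac{\sigma}{\bar{x}} \times 100. \]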
19 119 36 B49
ae a4 37 #88
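Who is the better score getter and who is more consistent?
Solution: Let x denote the score of X
and y that of Y.
x      u_i = x_i - 36    u_i^2    y     v_i = y_i - 37    v_i^2
6      -30               900      0     -37               1369
7      -29               841      4     -33               1089
12     -24               576      12    -25               625
19     -17               289      13    -24               576
29     -7                49       16    -21               441
36     0                 0        37    0                 0
73     37                1369     42    5                 25
84     48                2304     47    10                100
115    79                6241     48    11                121
119    83                6889     51    14                196
Total  Σu_i = 140        Σu_i^2 = 19458     Σv_i = -100   Σv_i^2 = 4542
For Player X:
\[ \text{Mean} = \bar{x} = 36 + \frac{\sum u_i}{10} = 36 + \frac{140}{10} = 50, \]
\[ \sigma_x = \sqrt{\frac{\sum u_i^2}{10} - \left(\frac{\sum u_i}{10}\right)^2} = \sqrt{\frac{19458}{10} - \left(\frac{140}{10}\right)^2} = 41.8. \]
∴ Coefficient of variation = $\frac{\sigma_x}{\bar{x}} \times 100 = \frac{41.8}{50} \times 100 = 83.6\%$.
For Player Y:
\[ \text{Mean} = \bar{y} = 37 + \frac{\sum v_i}{10} = 37 + \frac{(-100)}{10} = 27, \]
\[ \sigma_y = \sqrt{\frac{\sum v_i^2}{10} - \left(\frac{\sum v_i}{10}\right)^2} = \sqrt{\frac{4542}{10} - \left(\frac{-100}{10}\right)^2} = 18.8. \]
∴ Coefficient of variation = $\frac{\sigma_y}{\bar{y}} \times 100 = \frac{18.8}{27} \times 100 = 69.6\%$.
Since $\bar{x} > \bar{y}$, it follows that X is a better score getter (i.e., more efficient) than Y.
Since the coefficient of variation of Y < the coefficient of variation of X, it means that Y is more
consistent than X. Thus even though X is a better player, he is less consistent.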
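The comparison of the two players can be checked with a short Python sketch. The function name is hypothetical. The score lists are read from the solution table; the second score of X (7) and the third score of Y (12) are treated as assumptions recovered from the deviations and column totals. Without rounding σ to one decimal place first, the coefficients come out as about 83.7% and 69.7%, marginally above the 83.6% and 69.6% obtained by hand.

```python
import math


def coefficient_of_variation(values):
    """Return (mean, standard deviation, coefficient of variation in %)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, sd, sd / mean * 100


# Scores read from the solution table (assumed values: 7 for X, 12 for Y).
scores_x = [6, 7, 12, 19, 29, 36, 73, 84, 115, 119]
scores_y = [0, 4, 12, 13, 16, 37, 42, 47, 48, 51]

if __name__ == "__main__":
    for name, scores in (("X", scores_x), ("Y", scores_y)):
        mean, sd, cv = coefficient_of_variation(scores)
        print(name, round(mean, 2), round(sd, 2), round(cv, 1))
```

The player with the larger mean is the better score getter; the player with the smaller coefficient of variation is the more consistent one.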
Teme Hid east
Objective type
Example 1: Answer with minimum justification
(i) The scores of nine students are respectively 9, 8, 4, 6, 7, 4, 11, 13, 10. The median of the scores is
(a) 9 (b) 8 (c) 8.5 (d) None of these
(ii) If $y = 3x - 100$ and $\bar{x} = 50$, then the value of $\bar{y}$ is
(a) 60 (b) 30 (c) 100 (d) 50
(iii) If var (x) = 5 and y = 5x + 6, then var (y) is equal to
(a) 125 (b) 150 (c) 6 (d) None
(iv) Find the standard deviation of the following numbers:
1, 2, 3, 4, 5, 6, 7, 8, 9.
(v) The mean, median, mode of the data 0, 5, -1, 5, 2, 3, -1, 4, 3, 0, 0, 3, 3 are
(a) 2, 3, 3 (b) 2, 2, 3 (c) 3, 3, 3 (d) None
(vi) Compute the median from the following data:
10, 5, 9, 4, 8, 7, 6.
(vii) The diameters of six circular gaskets in mm are 8.3, 8.2, 8.5, 7.9, 8.0 and 8.1. Find the mean and median.
(viii) Lengths of 4 bolts in mm are 6.1, 6.0, 6.2, 6.3. The S.D. is
(a) 0.012 mm (b) 0.1095 mm (c) 6.15 mm (d) None of these
(ix) Find the mean of the following data:
Dia. (x) in mm :  12.25  12.50  12.75  13.00
Frequency (f)  :  3      7      6      4
(x) The mean of the following data:
Age ‘x’ in year :  12  13  14  15
No. of boys ‘f’ :  2   3   2   1
is
(a) 12.5 yrs (b) 13 yrs (c) 13.25 yrs (d) None of these
(xi) The median of the following data:
3.1, 2.6, 5.0, 4.7, 4.2, 3.9, 5.1, 3.6 is
(a) 3.75 (b) 4.7 (c) 3.6 (d) None of these
(xii) The mean and standard deviation of the following data:
x :  1  2  3  4
f :  1  2  3  4
are respectively
(a) 3, 1 (b) 1, 3 (c) 3, 2 (d) 2, 1
(xiii) The relation between two groups of observations $\{x_i\}$ and $\{y_i\}$ is $2x_i + 3y_i = 9$ and if $\bar{x} = 3$, then $\bar{y}$ =
(a) 3 (b) 2 (c) 1 (d) None of these
(xiv) The mean of a set of 20 measurements was calculated to be 50 cm. But later it was found that a mistake
had been made in one of the measurements, which was recorded as 64 cm but should have been 61 cm.
The correct mean is
(a) 50.15 (b) 49.85 (c) 49.9 (d) 49.8
Solution: (i) Arranging the data in ascending order of magnitudes, we get
4, 4, 6, 7, 8, 9, 10, 11, 13.
Number of values is n = 9.
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{9+1}{2}$th item
= size of the 5th item = 8.
Ans. (b)
(ii) Given y = 3x - 100.
∴ $\bar{y} = 3\bar{x} - 100$ [by Property 2 of Mean, art. 6.3]
= 3 × 50 - 100 (∵ $\bar{x}$ = 50)
= 50.
Ans. (d)
(iii) We know that if y = a + bx, then
Var (y) = b² Var (x). [see Property 2 of art. 6.4]
Given, Var (x) = 5 and y = 5x + 6,
∴ Var (y) = 5² Var (x) = 25 × 5 = 125.
Ans. (a)
(iv) Here, Σx_i = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45,
Σx_i² = 1² + 2² + 3² + 4² + 5² + 6² + 7² + 8² + 9²
= 1 + 4 + 9 + 16 + 25 + 36 + 49 + 64 + 81 = 285.
\[ \text{S.D.} = \left\{\frac{1}{9}\sum x_i^2 - \left(\frac{1}{9}\sum x_i\right)^2\right\}^{1/2} = \left(\frac{285}{9} - 5^2\right)^{1/2} = 2.58. \]
(v) Arranging the data in ascending order of magnitudes, we get
-1, -1, 0, 0, 0, 2, 3, 3, 3, 3, 4, 5, 5.
Number of values is n = 13.
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{13+1}{2}$th item
= size of the 7th item = 3.
Frequency distribution is
x_i :  -1  0  2  3  4  5
f_i :  2   3  1  4  1  2
Mean = $\frac{1}{13}(-2 + 0 + 2 + 12 + 4 + 10) = \frac{26}{13} = 2$.
Mode = value corresponding to the highest frequency = 3.
Ans. (a)
(vi) Arranging the data in ascending order of magnitudes, we get
4, 5, 6, 7, 8, 9, 10.
Number of values is n = 7.
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{7+1}{2}$th item
= size of the 4th item = 7.
(vii) Mean = $\frac{1}{6}(8.3 + 8.2 + 8.5 + 7.9 + 8.0 + 8.1) = \frac{49}{6} = 8.17$.
Arranging the data in increasing order of magnitudes, we get
7.9, 8.0, 8.1, 8.2, 8.3, 8.5.
Number of values is n = 6.
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{6+1}{2}$th item = size of the 3.5th item
= $\frac{1}{2}$(3rd item + 4th item)
= $\frac{1}{2}(8.1 + 8.2) = 8.15$.
(viii) Here, Σx_i = 6.1 + 6.0 + 6.2 + 6.3 = 24.6,
$\bar{x} = \frac{24.6}{4} = \frac{123}{20} = 6.15$,
Σx_i² = (6.1)² + 6² + (6.2)² + (6.3)² = 151.34.
\[ \text{S.D.} = \left\{\frac{1}{4}\sum x_i^2 - \bar{x}^2\right\}^{1/2} = \left\{\frac{151.34}{4} - (6.15)^2\right\}^{1/2} = 0.1118. \]
Ans. (d)
(ix) Here, Σf_i x_i = 3 × 12.25 + 7 × 12.50 + 6 × 12.75 + 4 × 13
= 252.75,
Σf_i = 3 + 7 + 6 + 4 = 20.
Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{252.75}{20} = 12.6375$.
(x) Here, Σf_i x_i = 2 × 12 + 3 × 13 + 2 × 14 + 1 × 15 = 106,
Σf_i = 2 + 3 + 2 + 1 = 8.
Mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{106}{8} = 13.25$.
(xi) Arranging the data in increasing order of magnitudes, we get
2.6, 3.1, 3.6, 3.9, 4.2, 4.7, 5.0, 5.1.
Number of values is n = 8.
Median = size of the $\frac{n+1}{2}$th item = size of the $\frac{8+1}{2}$th item = size of the 4.5th item
= $\frac{1}{2}$(4th item + 5th item) = $\frac{1}{2}(3.9 + 4.2) = 4.05$.
(xii) Here, N = Σf_i = 1 + 2 + 3 + 4 = 10,
Σf_i x_i = 1 × 1 + 2 × 2 + 3 × 3 + 4 × 4 = 1 + 4 + 9 + 16 = 30,
Σf_i x_i² = 1 × 1² + 2 × 2² + 3 × 3² + 4 × 4² = 1 + 8 + 27 + 64 = 100.
Mean, $\bar{x} = \frac{30}{10} = 3$.
\[ \text{S.D.} = \left\{\frac{1}{N}\sum f_i x_i^2 - \bar{x}^2\right\}^{1/2} = \left(\frac{100}{10} - 3^2\right)^{1/2} = 1. \]
Ans. (a)
(xiii) Given, $2x_i + 3y_i = 9$.
∴ $2\bar{x} + 3\bar{y} = 9$ [by Property 4 of Mean, art. 6.3]
or, $2 \times 3 + 3\bar{y} = 9$ (∵ $\bar{x}$ = 3),
∴ $\bar{y} = 1$.
Ans. (c)
(xiv) Incorrect Σx_i = n × given mean = 20 × 50 = 1000.
Correct Σx_i = Incorrect Σx_i - wrong figure + correct figure
= 1000 - 64 + 61 = 997.
Correct mean = $\frac{997}{20} = 49.85$.
Ans. (b)
Subjective type
Example 2: The table below gives the weights of 50 school boys in the age group 15-17 years
nearest to kilogram:
Wein Kg. (x): 40-4200 43-45 46-48 49-5]
No. of boys (f) = 3 6 ° 8
Wr in Kg. (x): 52-54 55-57 58-60
No. of boys (f): 8 7 4
Calculate mean and median.
Solution: We are given class intervals in terms of class limits; let us convert them to class
boundaries by deducting 0.5 from the lower limits and adding 0.5 to the upper limits.

Class boundary   Mid-value   No. of boys   Cumulative   $u_i = (x_i - 50)/3$   $f_i u_i$
                 ($x_i$)     ($f_i$)       frequency
39.5-42.5        41          3             3            -3                     -9
42.5-45.5        44          6             9            -2                     -12
45.5-48.5        47          9             18           -1                     -9
48.5-51.5        50          13            31           0                      0
51.5-54.5        53          8             39           1                      8
54.5-57.5        56          7             46           2                      14
57.5-60.5        59          4             50           3                      12
Total                        $N = \sum f_i = 50$                               $\sum f_i u_i = 4$

Here A = 50, h = 3.
Mean $= A + h\,\frac{\sum f_i u_i}{N} = 50 + 3 \times \frac{4}{50} = 50.24$.
Here, $\frac{N}{2} = 25$. There is no cumulative frequency 25 in the fourth column of the table; the
next higher c.f. is 31, corresponding to the median class 48.5 - 51.5.
Median $= l_m + \frac{\frac{N}{2} - F}{f_m} \times i$,
where $l_m$ = lower boundary of the median class = 48.5,
F = cumulative frequency of the class preceding the median class = 18,
$f_m$ = simple frequency of the median class = 13,
i = width of the median class = 51.5 − 48.5 = 3.
$\therefore$ Median $= 48.5 + \frac{25 - 18}{13} \times 3 = 50.12$.
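The median interpolation of Example 2 can be sketched as a small function. This is a minimal illustration under the assumption that classes are supplied as (lower boundary, upper boundary) pairs; `grouped_median` is a hypothetical name.

```python
def grouped_median(boundaries, freqs):
    """Median of a grouped distribution: l + (N/2 - F) / f * i.

    boundaries: list of (lower, upper) class boundaries, in increasing order
    freqs:      class frequencies in the same order
    """
    half = sum(freqs) / 2
    cumulative = 0  # F: cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("distribution has no observations")


boundaries = [(39.5, 42.5), (42.5, 45.5), (45.5, 48.5), (48.5, 51.5),
              (51.5, 54.5), (54.5, 57.5), (57.5, 60.5)]
freqs = [3, 6, 9, 13, 8, 7, 4]

median_weight = grouped_median(boundaries, freqs)
print(round(median_weight, 2))
```

The loop stops at the first class whose cumulative frequency reaches N/2, which is the "next higher c.f." rule used in the solution.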
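Example 3: Find the standard deviations of the lengths of the belts produced by two different machines
and compare these standard deviations.
Length of the belts (in mm)
Machine A : 24 27 26 28 30 31 22 29 27 26
Machine B : 28 26 25 26 28 29 27 26 27 28
Solution:
Machine A: $\bar{x}_A = \frac{1}{10}(24 + 27 + 26 + 28 + 30 + 31 + 22 + 29 + 27 + 26) = 27$
Machine B: $\bar{x}_B = \frac{1}{10}(28 + 26 + 25 + 26 + 28 + 29 + 27 + 26 + 27 + 28) = 27$

        Machine A                          Machine B
x     x − 27    (x − 27)^2         x     x − 27    (x − 27)^2
24    -3        9                  28    1         1
27    0         0                  26    -1        1
26    -1        1                  25    -2        4
28    1         1                  26    -1        1
30    3         9                  28    1         1
31    4         16                 29    2         4
22    -5        25                 27    0         0
29    2         4                  26    -1        1
27    0         0                  27    0         0
26    -1        1                  28    1         1
Total           66                                 14

Machine A: $\sigma_A = \left\{\frac{1}{10} \times 66\right\}^{1/2} = 2.569$
Machine B: $\sigma_B = \left\{\frac{1}{10} \times 14\right\}^{1/2} = 1.183$
$\sigma_B < \sigma_A$ shows that the deviations of the lengths of the belts from the mean value produced
by machine B are less than those produced by machine A, i.e., regarding production of belts, machine B is
more consistent than machine A.
Example 4: Calculate the mean and standard deviation from the following data:
Size of the item : 6  7  8  9  10  11  12
Frequency        : 3  6  9  13  8   5   4
Solution:
Size of the item   Frequency   $d_i = x_i - 9$   $f_i d_i$   $f_i d_i^2$
($x_i$)            ($f_i$)
6                  3           -3                -9          27
7                  6           -2                -12         24
8                  9           -1                -9          9
9                  13          0                 0           0
10                 8           1                 8           8
11                 5           2                 10          20
12                 4           3                 12          36
Total              $N = \sum f_i = 48$           $\sum f_i d_i = 0$   $\sum f_i d_i^2 = 124$

Here A = 9.
Mean $= A + \frac{\sum f_i d_i}{N} = 9 + \frac{0}{48} = 9$.
Standard deviation $= \left\{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2\right\}^{1/2} = \left\{\frac{124}{48} - \left(\frac{0}{48}\right)^2\right\}^{1/2} \approx 1.607$.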
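The assumed-mean method of Example 4 is sketched below. The same helper is reused for the two machines of Example 3 by giving every observation frequency 1. The function name `stats_from_assumed_mean` is a hypothetical choice for this illustration.

```python
from math import sqrt


def stats_from_assumed_mean(values, freqs, assumed_mean):
    """Mean and S.D. via deviations d = x - A from an assumed mean A."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, freqs))
    mean = assumed_mean + sum_fd / n
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd


# Example 4
sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]
mean_size, sd_size = stats_from_assumed_mean(sizes, freqs, 9)

# Example 3: each length occurs once, so all frequencies are 1
machine_a = [24, 27, 26, 28, 30, 31, 22, 29, 27, 26]
machine_b = [28, 26, 25, 26, 28, 29, 27, 26, 27, 28]
mean_a, sd_a = stats_from_assumed_mean(machine_a, [1] * 10, 27)
mean_b, sd_b = stats_from_assumed_mean(machine_b, [1] * 10, 27)

print(mean_size, round(sd_size, 3), round(sd_a, 3), round(sd_b, 3))
```

The choice of A affects only the size of the intermediate numbers, not the result: any other assumed mean gives the same mean and standard deviation.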
Example 5: A random variable has the following probability distribution:
x           : 4    5    6    8
Probability : 0.1  0.3  0.4  0.2
Find the expectation and the standard deviation of the random variable.
Solution: Here,
$x_i$ : 4    5    6    8
$p_i$ : 0.1  0.3  0.4  0.2
$\therefore$ Exp(X) $= \sum p_i x_i = 0.1 \times 4 + 0.3 \times 5 + 0.4 \times 6 + 0.2 \times 8 = 5.9$
Var(X) $= \sum p_i x_i^2 - [\text{Exp}(X)]^2$
$= 0.1 \times 4^2 + 0.3 \times 5^2 + 0.4 \times 6^2 + 0.2 \times 8^2 - (5.9)^2$
$= 1.6 + 7.5 + 14.4 + 12.8 - (5.9)^2 = 1.49$
$\therefore$ Standard Deviation of X $= \sigma = \sqrt{1.49} = 1.2206$.
Example 6: Find out the standard deviation from the following table giving the age distribution
of 540 members of parliament.
Age in years      : 30   40   50   60   70
Number of members : 64   132  153  140  51
Solution:
Age       No. of members   $u_i = (x_i - 50)/10$   $f_i u_i$   $f_i u_i^2$
($x_i$)   ($f_i$)
30        64               -2                      -128        256
40        132              -1                      -132        132
50        153              0                       0           0
60        140              1                       140         140
70        51               2                       102         204
Total     N = 540                                  $\sum f_i u_i = -18$   $\sum f_i u_i^2 = 732$

Here A = 50, h = 10.
Standard Deviation $= h\left\{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2\right\}^{1/2}$
$= 10\left\{\frac{732}{540} - \left(\frac{-18}{540}\right)^2\right\}^{1/2} = 11.638$
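The step-deviation computation of Example 6 can be sketched as follows. The helper name `step_deviation_stats` is hypothetical. It rebuilds the $u_i$, $f_i u_i$ and $f_i u_i^2$ columns of the table and then applies the formula above.

```python
from math import sqrt


def step_deviation_stats(values, freqs, assumed_mean, width):
    """Mean and S.D. using step deviations u = (x - A) / h."""
    n = sum(freqs)
    u = [(x - assumed_mean) / width for x in values]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))
    mean = assumed_mean + width * sum_fu / n
    sd = width * sqrt(sum_fu2 / n - (sum_fu / n) ** 2)
    return sum_fu, sum_fu2, mean, sd


ages = [30, 40, 50, 60, 70]
members = [64, 132, 153, 140, 51]

sum_fu, sum_fu2, mean_age, sd_age = step_deviation_stats(ages, members, 50, 10)
print(sum_fu, sum_fu2, round(sd_age, 3))
```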
Example 7: The table below gives the marks obtained in a test in Mathematics:
Marks (x)           : 1-10  11-20  21-30  31-40  41-50  51-60
No. of students (f) :  3     16     26     31     16     8
Calculate mean and standard deviation of the distribution.
Solution: We are given class intervals in terms of class limits; let us convert them to class
boundaries by deducting 0.5 from the lower limits and adding 0.5 to the upper limits.

Marks              Mid-value   $u_i = (x_i - 25.5)/10$   No. of students   $f_i u_i$   $f_i u_i^2$
(Class boundary)   ($x_i$)                               ($f_i$)
0.5-10.5           5.5         -2                        3                 -6          12
10.5-20.5          15.5        -1                        16                -16         16
20.5-30.5          25.5        0                         26                0           0
30.5-40.5          35.5        1                         31                31          31
40.5-50.5          45.5        2                         16                32          64
50.5-60.5          55.5        3                         8                 24          72
Total                                                    $N = \sum f_i = 100$   $\sum f_i u_i = 65$   $\sum f_i u_i^2 = 195$

Here A = 25.5, h = 10.
Mean $= A + h\,\frac{\sum f_i u_i}{N} = 25.5 + 10 \times \frac{65}{100} = 32$.
Standard Deviation $= h\left\{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2\right\}^{1/2}$
$= 10\left\{\frac{195}{100} - \left(\frac{65}{100}\right)^2\right\}^{1/2} = 12.36$
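As a cross-check of Example 7, the sketch below converts the class limits to boundaries and mid-values, then computes the mean and standard deviation directly from the mid-values, without step deviations. It assumes, as in the example, that consecutive classes are separated by a gap of 1, so the correction is 0.5.

```python
from math import sqrt

class_limits = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]
freqs = [3, 16, 26, 31, 16, 8]

# Class boundaries: lower limit - 0.5 and upper limit + 0.5
boundaries = [(low - 0.5, high + 0.5) for low, high in class_limits]
mid_values = [(low + high) / 2 for low, high in boundaries]

n = sum(freqs)
mean_marks = sum(f * x for f, x in zip(freqs, mid_values)) / n
variance = sum(f * (x - mean_marks) ** 2 for f, x in zip(freqs, mid_values)) / n
sd_marks = sqrt(variance)

print(mid_values, mean_marks, round(sd_marks, 2))
```

The direct route should reproduce the step-deviation results, because the change of origin and scale is undone by the factors A and h in the formulas.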
Example 8: Given below are the marks obtained by a batch of 16 students in a certain class test
in Physics and Chemistry:
Roll No. : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Marks in Physics $8 32-30 35 50 54 61 45 32 57 75 GH 71 Ho Jp wu
Marks in Chemistry : 58 62 34 51 68 50 65 37 44 54 69 63 76 87 55 45
In which subject is the level of knowledge of the students higher?
Solution: The subject for which the value of the median is higher will be the subject in which the
level of knowledge of the students is higher. To find the median in each case, we arrange the marks in
increasing order of magnitudes:
Serial No.         : 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
Marks in Physics   : 30 32 35 39 43 45 48 50 52 54 57 61 68 71 75 82
Serial No.         : 1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
Marks in Chemistry : 34 37 44 45 50 51 54 55 58 62 63 65 68 69 76 87
Median marks in Physics = size of the $\frac{n+1}{2}$th item = $\frac{16+1}{2}$th item
= 8.5th item = size of $\frac{1}{2}$(8th item + 9th item)
$= \frac{1}{2}(50 + 52) = 51$.
Median marks in Chemistry = size of the $\frac{n+1}{2}$th item = $\frac{16+1}{2}$th item
= 8.5th item = size of $\frac{1}{2}$(8th item + 9th item)
$= \frac{1}{2}(55 + 58) = 56.5$.
Since the median marks in Chemistry are greater than the median marks in Physics, the level of
knowledge in Chemistry is higher.
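The comparison in Example 8 can be reproduced with the standard library. The lists below are the ordered marks from the solution; `statistics.median` averages the two middle items when the number of observations is even.

```python
from statistics import median

physics = [30, 32, 35, 39, 43, 45, 48, 50, 52, 54, 57, 61, 68, 71, 75, 82]
chemistry = [34, 37, 44, 45, 50, 51, 54, 55, 58, 62, 63, 65, 68, 69, 76, 87]

median_physics = median(physics)
median_chemistry = median(chemistry)

# The subject with the higher median is taken as the one with the higher level
higher_subject = "Chemistry" if median_chemistry > median_physics else "Physics"
print(median_physics, median_chemistry, higher_subject)
```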
Example 9: An incomplete frequency distribution is given below:
Variable  : 10-20  20-30  30-40  40-50  50-60  60-70  70-80
Frequency :  12     30     ?      65     ?      25     18
Given that the total frequency is 230 and the median is 44, find the missing frequencies.
Solution: Let $f_1$, $f_2$ be the missing frequencies of the classes 30-40 and 50-60 respectively.
We know that
Median $= l_m + \frac{\frac{N}{2} - F}{f_m} \times i$
where $l_m$ = lower class boundary of the median class,
N = total frequency,
F = cumulative frequency of the class preceding the median class,
$f_m$ = simple frequency of the median class,
i = width of the median class.
Since the median lies in the class 40-50, we have
$44 = 40 + \frac{\frac{230}{2} - (12 + 30 + f_1)}{65} \times 10$
or $26 = 115 - (12 + 30 + f_1)$
or $26 = 73 - f_1$. $\therefore\ f_1 = 47$.
$\therefore\ f_2 = 230 - (12 + 30 + 47 + 65 + 25 + 18) = 33$
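Example 9 inverts the median formula: with the median known, the cumulative frequency F before the median class is solved for, which fixes $f_1$, and $f_2$ then follows from the total. A sketch under those assumptions (the function and parameter names are hypothetical):

```python
def missing_frequencies(total, median, lower, width, f_median,
                        known_before, known_after):
    """Solve median = lower + (total/2 - F) / f_median * width for F,
    then recover the two missing class frequencies."""
    half = total / 2
    cumulative_before = half - (median - lower) * f_median / width   # F
    f1 = cumulative_before - sum(known_before)
    f2 = total - (sum(known_before) + f1 + f_median + sum(known_after))
    return f1, f2


# Median class 40-50 with frequency 65; known frequencies 12, 30 before it
# and 25, 18 after the second missing class.
f1, f2 = missing_frequencies(230, 44, 40, 10, 65, [12, 30], [25, 18])
print(f1, f2)
```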
Example 10: A collar manufacturer is considering the production of a new style of collar to attract
young men. The following statistics of neck circumferences are available, based upon the measurements of
a typical group of college students:
Mid value       : 12.5  13.0  13.5  14.0  14.5  15.0  15.5  16.0  16.5
No. of students :  4     19    30    63    66    29    18    1     1
Compute the standard deviation and use the criterion $\mu \pm 3\sigma$, where $\sigma$ is the standard deviation
and $\mu$ is the arithmetic mean, to determine the largest and smallest size of collars he should make in
order to meet the needs of practically all his customers, bearing in mind that a collar is worn, on average,
3/4 unit longer than the neck size.
Solution:
Mid value   Frequency   $u_i = (x_i - 14.5)/0.5$   $f_i u_i$   $f_i u_i^2$
($x_i$)     ($f_i$)
12.5        4           -4                         -16         64
13.0        19          -3                         -57         171
13.5        30          -2                         -60         120
14.0        63          -1                         -63         63
14.5        66          0                          0           0
15.0        29          1                          29          29
15.5        18          2                          36          72
16.0        1           3                          3           9
16.5        1           4                          4           16
Total       $N = \sum f_i = 231$                   $\sum f_i u_i = -124$   $\sum f_i u_i^2 = 544$

Here A = 14.5, h = 0.5.
Mean $= \mu = A + h\,\frac{\sum f_i u_i}{N} = 14.5 + 0.5 \times \left(\frac{-124}{231}\right) = 14.23$ unit
S.D. $= \sigma = h\left\{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2\right\}^{1/2} = 0.5\left\{\frac{544}{231} - \left(\frac{-124}{231}\right)^2\right\}^{1/2}$
$= 0.72$
Maximum neck size $= \mu + 3\sigma = 14.23 + 3 \times 0.72 = 16.39$
and minimum neck size $= \mu - 3\sigma = 14.23 - 3 \times 0.72 = 12.07$.
Hence, largest collar size = 16.39 + 3/4 = 17.14 unit
and smallest collar size = 12.07 + 3/4 = 12.82 unit.
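The $\mu \pm 3\sigma$ criterion of Example 10 is sketched below, computed directly from the mid-values. Because this sketch carries unrounded values of $\mu$ and $\sigma$ through the calculation, its results can differ in the last decimal place from the hand computation, which rounds at each step. The name `allowance` for the 3/4 unit is an illustrative assumption.

```python
from math import sqrt

mid_values = [12.5, 13.0, 13.5, 14.0, 14.5, 15.0, 15.5, 16.0, 16.5]
freqs = [4, 19, 30, 63, 66, 29, 18, 1, 1]

n = sum(freqs)
mu = sum(f * x for f, x in zip(freqs, mid_values)) / n
sigma = sqrt(sum(f * (x - mu) ** 2 for f, x in zip(freqs, mid_values)) / n)

allowance = 0.75  # a collar is worn 3/4 unit longer than the neck size

# Neck sizes covering practically all customers: mu - 3*sigma to mu + 3*sigma
smallest_collar = mu - 3 * sigma + allowance
largest_collar = mu + 3 * sigma + allowance

print(round(mu, 2), round(sigma, 2))
print(round(smallest_collar, 2), round(largest_collar, 2))
```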
1. If y = 5x − 20 and $\bar{x}$ = 30 then the value of $\bar{y}$ is
(a) 130 (b) 140 (c) 30 (d) None of these.
2. Find the mean, median and mode of the following:
33, 32, 33.5, 33, 31.5, 32, 31, 32, 33, 32
3. The diameters of five circular gaskets are given below:
8.3, 8.2, 8.0, 7.9, 8.1
Its mean and median are respectively
(a) 8.2, 8.1 (b) 8.1, 8.1 (c) 8.0, 8.1 (d) None of these.
4. The mean of the following data:
Numbers   : 9  10  15  20
Frequency : 5  8   8   4   is
(a) 10 (b) (c) 12 (d) 13.
5. For a collection of observations $x_1, x_2, \ldots, x_n$ with respective frequencies $f_1, f_2, \ldots, f_n$ the standard deviation
is defined as,
wn . >
@ aca (» FAD (9 Fiat (@) None of these
Lf Lh Zhi
6. The standard deviation of the following numbers:
24, 36, 48, 53, 64 is
(a) 13 (b) 13.83 (c) 12 (d) None of these.
ANSWERS
1. (a) 2. Mean = 32.3, Median = 32, Mode = 32.
3. (b) 4. (d) 5. (a) 6. (b)
PROBLEMS
1. The number of telephone calls received in 245 successive intervals at an exchange are shown in the
following frequency distribution:
p23 4 So
No.ofealls 0 a 40
Frequency es
Calculate the mean.
2. According to the census of 1971, the following are the population figures (in thousands) of 10 cities:
2000, 1180, 1785, 1500, 560, 1200, 782, 385, 222, 1123
Find the median.
3. Find the median of the data given in Q. 1.
4. Find the median of the frequency distribution:
tio oto? 3 4 5 6 7 8 8
fo: 8 1 Me 16 20 25 15 9 6
5. Find the mode of the following series:
Size      : 15   25  35  45   55   65  75
Frequency : 185  77  34  180  136  23  50
6. Find the mean, median and mode for the following:
Mid Value : 15  20  25  30  35  40  45  50  55
Frequency : 2   22  19  14  3   4   6   1   1
7. Show that the variance of the first n positive integers is $\sigma^2 = \frac{1}{12}(n^2 - 1)$.
8. The mean of five items of an observation is 4 and the variance is 5.2. If three of the items are 1, 2 and 6,
then find the other two.
9. The following table shows the marks obtained by 100 candidates in an examination. Calculate the mean,
median and standard deviation.
Marks obtained    : 1-10  11-20  21-30  31-40  41-50  51-60
No. of candidates :  3     16     26     31     16     8
10. The marks obtained by students appearing in a test in Mathematics were tabulated as:
Marks           : 30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of students :  2      3      11     20     32     25     7
Find the mean and standard deviation of the marks.
11. The following table shows the marks obtained by 100 candidates in an examination. Calculate the mean,
median and standard deviation.
Marks obtained    : 1-10  11-20  21-30  31-40  41-50  51-60
No. of candidates :  3     16     26     31     16     8
12. Find the standard deviation of the breaking strength of 86 test pieces of a certain alloy from the following:
Breaking strength: 44-46 4648 48-50 50-52
No. of pieces : 3 24 27 2 “
5
the unit being given to the nearest thousand kgf per square metre.
13. Compute the mean and standard deviation of the age of drug addicts:
Age (in years) > ‘14.5-19.5 195-245 245-205 095-348 we one
Frequency : Bw mM @ 6
20
14. The table below gives the weights of 50 school boys to the nearest kg.
We in kg (x) 40-42 43-45
2 Sat
No. of boys \) 3 , “s esl est 55-57
Palowtate set 7rans 247
16. From the following distribution table, compute the mean and s.d. of the weights of 100 students:
Wt. in kg (x)       : 60-62  63-65  66-68  69-71  72-74
No. of students (f) : 5 18 42 7
ANSWERS
2. 1151.5 thousand
4. 5
1 Mean = 25.9, S.D.=5.5
14. Mean = 49.4, S.D. = 3/2.