Statistics
Syllabus
E9.1
Collect, classify and tabulate statistical data.
Read, interpret and draw simple inferences from
tables and statistical diagrams.
E9.2
Construct and read bar charts, pie charts,
pictograms, simple frequency distributions,
histograms with equal and unequal intervals and
scatter diagrams.
E9.3
Calculate the mean, median, mode and range
for individual and discrete data and distinguish
between the purposes for which they are used.
E9.4
Calculate an estimate of the mean for grouped
and continuous data.
Identify the modal class from a grouped
frequency distribution.
E9.5
Construct and use cumulative frequency
diagrams.
Estimate and interpret the median, percentiles,
quartiles and inter-quartile range.
E9.6
Understand what is meant by positive, negative
and zero correlation with reference to a scatter
diagram.
E9.7
Draw a straight line of best fit by eye.
Contents
Chapter 37 Mean, median, mode and range (E9.3, E9.4)
Chapter 38 Collecting and displaying data (E9.1, E9.2, E9.6, E9.7)
Chapter 39 Cumulative frequency (E9.5)
Mean, median, mode and range
Average
'Average' is a word which in general use is taken to mean
somewhere in the middle. For example, a woman may describe
herself as being of average height. A student may think he or
she is of average ability in maths. Mathematics is more exact
and uses three principal methods to measure average.
• The mode is the value occurring most often.
• The median is the middle value when all the data is arranged
in order of size.
• The mean is found by adding together all the values of the
data and then dividing that total by the number of data values.
Spread
It is often useful to know how spread out the data is. It is
possible for two sets of data to have the same mean and median
but very different spreads.
The simplest measure of spread is the range. The range is
simply the difference between the largest and smallest values in
the data.
Another measure of spread is known as the inter-quartile
range. This is covered in more detail in Chapter 39.
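The point about spread can be illustrated with a short Python sketch. The two data sets below are hypothetical, chosen only so that their mean and median agree while their ranges differ; `value_range` is a helper name introduced here for illustration.

```python
from statistics import mean, median

# Hypothetical data sets (not from the text): same mean and median,
# but very different spreads.
narrow = [4, 5, 5, 5, 6]
wide = [1, 2, 5, 8, 9]


def value_range(data):
    """Range: the largest value minus the smallest value."""
    return max(data) - min(data)


for name, data in (("narrow", narrow), ("wide", wide)):
    print(name, "mean:", mean(data), "median:", median(data),
          "range:", value_range(data))
```

Both sets have a mean of 5 and a median of 5, but the range is 2 for the first set and 8 for the second.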
Worked examples
a) i) Find the mean, median and mode of the data listed
below.
1, 0, 2, 4, 1, 2, 1, 1, 2, 5, 5, 0, 1, 2, 3
Mean = (1 + 0 + 2 + 4 + 1 + 2 + 1 + 1 + 2 + 5 + 5 + 0 + 1 + 2 + 3) / 15
= 30 / 15
= 2
Arranging all the data in order and then picking out the
middle number gives the median:
0, 0, 1, 1, 1, 1, 1, (2), 2, 2, 2, 3, 4, 5, 5
The median is therefore 2.
The mode is the number which appeared most often.
Therefore the mode is 1.
ii) Calculate the range of the data.
Largest value = 5
Smallest value = 0
Therefore the range = 5 - 0 = 5
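The results of worked example a) can be checked with a minimal Python sketch. It uses the standard library `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median, mode

# Data from worked example a)
data = [1, 0, 2, 4, 1, 2, 1, 1, 2, 5, 5, 0, 1, 2, 3]

data_mean = mean(data)               # sum of the values divided by how many there are
data_median = median(data)           # middle value once the data is sorted
data_mode = mode(data)               # value that occurs most often
data_range = max(data) - min(data)   # largest value minus smallest value

print(sorted(data))
print(data_mean, data_median, data_mode, data_range)
```

The sorted list printed first matches the ordered arrangement used to pick out the median by hand.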
b) i) The frequency chart (below) shows the score out of 10
achieved by a class in a maths test.
Calculate the mean, median and mode for this data.
[Frequency chart: frequency against test score, 0 to 10]
Transferring the results to a frequency table gives:
Test score | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total
Frequency | 1 | 2 | 3 | 2 | 3 | 5 | 4 | 6 | 4 | 1 | 1 | 32
Frequency × score | 0 | 2 | 6 | 6 | 12 | 25 | 24 | 42 | 32 | 9 | 10 | 168
In the total column we can see the number of students
taking the test, i.e. 32, and also the total number of
marks obtained by all the students, i.e. 168.
Therefore the mean score = 168 / 32 = 5.25
Arranging all the scores in order gives:
0, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, (5, 6), 6, 6, 6, 7, 7, 7,
7, 7, 7, 8, 8, 8, 8, 9, 10
Because there is an even number of students there isn't
one middle number. There is a middle pair. The
median is (5 + 6) / 2 = 5.5.
The mode is 7 as it is the score which occurs most often.
ii) Calculate the range of the data.
Largest value = 10 Smallest value = 0
Therefore the range = 10 - 0 = 10
Exercise 37.1
In questions 1-5, find the mean, median, mode and range for
each set of data.
1. A hockey team plays 15 matches. Below is a list of the
numbers of goals scored in these matches.
1, 0, 2, 4, 0, 1, 1, 1, 2, 5, 3, 0, 1, 2, 2
2. The total scores when two dice are thrown 20 times are:
7, 4, 5, 7, 3, 2, 8, 6, 8, 7, 6, 5, 11, 9, 7, 3, 8, 7, 6, 5
3. The ages of a group of girls are:
14 years 3 months, 14 years 5 months,
13 years 11 months, 14 years 3 months,
14 years 7 months, 14 years 3 months,
14 years 1 month
4. The numbers of students present in a class over a three-
week period are:
28, 24, 25, 28, 23, 28, 27, 26, 27, 25, 28, 28, 28, 26, 25
5. An athlete keeps a record in seconds of her training times
for the 100 m race:
14.0, 14.3, 14.1, 14.3, 14.2, 14.0, 13.9, 13.8, 13.9, 13.8, 13.8,
13.7, 13.8, 13.8, 13.8
6. The mean mass of the 11 players in a football team is
80.3 kg. The mean mass of the team plus a substitute is
81.2 kg. Calculate the mass of the substitute.
7. After eight matches a basketball player had scored a mean
of 27 points. After three more matches his mean was 29.
Calculate the total number of points he scored in the last
three games.
Exercise 37.2
1. An ordinary dice was rolled 60 times. The results are shown
in the table below. Calculate the mean, median, mode and
range of the scores.
Score | 1 | 2 | 3 | 4 | 5 | 6
Frequeney | 2] |e [2] 7
2. Two dice were thrown 100 times. Each time their
combined score was recorded. Below is a table of the
results. Calculate the mean score.
Score | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
Frequency | 5 | 6 [7] 9 [is[isfisfu[s[7]3
3. Sixty flowering bushes are planted. At their flowering peak,
the number of flowers per bush is counted and recorded.
The results are shown in the table below.
Flowers per bush | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Frequency | 0 | 0 | 0 | 6 | 4 | 6 | 10 | 16 | 18
a) Calculate the mean, median, mode and range of the
number of flowers per bush.
b) Which of the mean, median and mode would be most
useful when advertising the bush to potential buyers?
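For data presented in a frequency table, as in worked example b) and the questions of Exercise 37.2, the same calculations can be sketched in Python. The frequencies below (1, 2, 3, 2, 3, 5, 4, 6, 4, 1, 1 for scores 0 to 10) are an assumed reading of the worked example's table, chosen to agree with its stated totals of 32 students and 168 marks; the variable names are illustrative.

```python
from statistics import median

# Assumed frequency table for worked example b): score -> frequency.
# These values reproduce the stated totals (32 students, 168 marks).
frequencies = {0: 1, 1: 2, 2: 3, 3: 2, 4: 3, 5: 5,
               6: 4, 7: 6, 8: 4, 9: 1, 10: 1}

total_frequency = sum(frequencies.values())
total_marks = sum(score * freq for score, freq in frequencies.items())
mean_score = total_marks / total_frequency

# Expand the table back into an ordered list of individual scores.
scores = [score for score, freq in sorted(frequencies.items())
          for _ in range(freq)]

median_score = median(scores)                        # averages the middle pair
modal_score = max(frequencies, key=frequencies.get)  # score with the highest frequency
score_range = max(scores) - min(scores)

print(total_frequency, total_marks, mean_score)
print(median_score, modal_score, score_range)
```

Multiplying each score by its frequency and summing is the "frequency × score" row of the table; dividing by the total frequency gives the mean.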
The mean for grouped data
The mean for grouped data can only be an estimate as the
position of the data within a group is not known. An estimate is
made by calculating the mid-interval value for a group and then
assigning all of the data within the group that mid-interval value.
Worked example
The history test scores for a group of 40 students are shown in
the grouped frequency table below.
Score, S | Frequency | Mid-interval value | Frequency × mid-interval value
i) Calculate an estimate for the mean test result.
ii) What is the modal class?
This refers to the class with the greatest frequency, if
the class width is constant. Therefore the modal class is
60 ≤ S ≤ 79.
Exercise 37.3
1. The heights of 50 basketball players attending a tournament
are recorded in the grouped frequency table.
Note: 1.8- means 1.8 ≤ H < 1.9.
a) Copy the table and complete it to include the necessary
data with which to calculate the mean height of the
players.
b) Estimate the mean height of the players.
2. The number of hours of overtime worked by employees at a
factory over a period of a month is given in the table (left).
a) Calculate an estimate for the mean number of hours of
overtime worked by the employees that month.
b) What is the modal class?
3. The length of the index finger of 30 students in a class is
measured. The results were recorded and are shown in the
grouped frequency table.
a) Calculate an estimate for the mean index finger length
of the students.
b) What is the modal class?
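The mid-interval method for grouped data described in this chapter can be sketched in Python as follows. The grouped table here is hypothetical (it is not one of the tables from the text), and `mid_interval` is a helper name introduced for illustration. The modal class is taken as the class with the greatest frequency, which assumes equal class widths.

```python
# Hypothetical grouped frequency table: ((lower bound, upper bound), frequency).
# All classes have the same width, so the modal class is simply the class
# with the greatest frequency.
groups = [((0, 10), 3), ((10, 20), 7), ((20, 30), 6), ((30, 40), 4)]


def mid_interval(lower, upper):
    """Mid-interval value: halfway between the class boundaries."""
    return (lower + upper) / 2


total_frequency = sum(freq for _, freq in groups)

# Treat every value in a class as if it were equal to the mid-interval value.
weighted_total = sum(mid_interval(lower, upper) * freq
                     for (lower, upper), freq in groups)

estimated_mean = weighted_total / total_frequency
modal_class = max(groups, key=lambda group: group[1])[0]

print(total_frequency, weighted_total, estimated_mean, modal_class)
```

The result is only an estimate of the mean, because the true position of each value inside its class is not known.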
Student assessment 1
1. A rugby team scores the following number of points in 12 matches:
, 12, 15, 18, 42, 18, 24, 6, 12,3
Calculate for the 12 matches:
a) the mean score,
b) the median score,
c) the mode,
d) the range.
2. The bar chart (left) shows the marks out of 10 for an
English test taken by a class of students.
a) Calculate the number of students who took the test.
b) Calculate for the class:
i) the mean test result,
ii) the median test result,
iii) the modal test result,
iv) the range of the test results.
[Bar chart: frequency against test score]
3. Fifty sacks of grain are weighed as they are unloaded
from a truck. The mass of each is recorded in the grouped
frequency table.
a) Calculate the mean mass of the 50 sacks.
IS=M
[oo [x2 [o> |
Find the mean, median, mode and range of her throws.
2. The bar chart shows the marks out of 10 for a Maths test
taken by a class of students.
a) Calculate the number of students who took the test.
b) Calculate for the class:
i) the mean test result,
ii) the median test result,
iii) the modal test result,
iv) the range of the test results.
3. A hundred sacks of coffee with a stated mass of 10 kg are
unloaded from a train. The mass of each sack is checked
and the results are presented in the table.
a) Calculate an estimate for the mean mass.
b) What is the modal class?
98=M<99
99
Collecting and displaying data
[Scatter diagram: horizontal axis, Height (cm)]
Types of correlation
There are several types of correlation, depending on the
arrangement of the points plotted on the scatter diagram.
A strong positive correlation
between the variables x and y.
The points lie very close to the line of
best fit.
As x increases, so does y.
A weak positive correlation.
Although there is direction to the way the
points are lying, they are not tightly packed
around the line of best fit.
As x increases, y tends to increase too.
A strong negative correlation.
The points lie close around the
line of best fit.
As x increases, y decreases.
A weak negative correlation.
The points are not tightly packed around the
line of best fit.
As x increases, y tends to decrease.
No correlation.
As there is no pattern to the way in which
the points are lying, there is no correlation
between the variables x and y. As a result
there can be no line of best fit.
Exercise 38.3
1. State what type of correlation you might expect, if any, if
the following data was collected and plotted on a scatter
diagram. Give reasons for your answer.
a) A student's score in a maths exam and their score in a
science exam.
b) A student's hair colour and the distance they have to
travel to school.
c) The outdoor temperature and the number of cold drinks
sold by a shop.
d) The age of a motorcycle and its second-hand selling
price.
e) The number of people living in a house and the number
of rooms the house has.
f) The number of goals your opponents score and the
number of times you win.
g) A child's height and the child's age.
h) A car's engine size and its fuel consumption.
2. A website gives average monthly readings for the number
of hours of sunshine and the amount of rainfall in
millimetres for several cities in Europe. The table below is a
summary for July.
Place | Hours of sunshine | Rainfall (mm)
Copenhagen
Dubrovnik
Edinburgh
Frankfurt
Geneva
Helsinki
Innsbruck
Krakow
Marseilles
Naples
Oslo
Plovdiv
Reykjavik
Sofia
Tallinn
Valletta
York
Zurich
a) Plot a scatter diagram of the number of hours of
sunshine against the amount of rainfall. Use a
spreadsheet if possible.
b) What type of correlation, if any, is there between the
two variables? Comment on whether this is what you
would expect.
3. The United Nations keeps an up-to-date database of
statistical information on its member countries.
The table below shows some of the information available.
Country | Life expectancy at birth (years, 2005-2010) | Adult illiteracy rate (%, 2009) | Infant mortality rate (per 1000 births, 2005-2010)
10
m4
130
Nepal
Portugal
Russian Federation
Saudi Arabia 3 7 5 9
South Africa 33 50 12 9
United Kingdom a 7 D
United States of America al 7 O é
a) By plotting a scatter diagram, decide if there is a correlation
between the adult illiteracy rate and the infant mortality
rate.
b) Are your findings in part a) what you expected? Explain
your answer.
c) Without plotting a graph, decide if you think there is likely
to be a correlation between male and female life expectancy
at birth. Explain your reasons.
d) Plot a scatter diagram to test if your predictions for part c)
were correct.
4. A gardener plants 10 tomato plants. He wants to see if there
is a relationship between the number of tomatoes the plant
produces and its height in centimetres.
The results are presented in the scatter diagram below. The
line of best fit is also drawn.
[Scatter diagram with line of best fit: number of tomatoes produced against height (cm)]
a) Describe the correlation (if any) between the height of a
plant and the number of tomatoes it produced.
b) The gardener has another plant grown in the same
conditions as the others. If the height is 85 cm, estimate
from the graph the number of tomatoes he can expect it
to produce.
c) Another plant only produces 15 tomatoes. Deduce its
height from the graph.
Histograms
A histogram displays the frequency of either continuous or grouped
discrete data in the form of bars. There are several important
features of a histogram which distinguish it from a bar chart.
• The bars are joined together.
• The bars can be of varying width.
• The frequency of the data is represented by the area of the
bar and not the height (though in the case of bars of equal
width, the area is directly proportional to the height of
the bar and so the height is usually used as the measure of
frequency).
Worked example
The table (left) shows the marks out of 100 in a maths test for a
class of 32 students. Draw a histogram representing this data.
Test marks | Frequency
1-10 | 0
11-20 | 5
21-30 | 1
31-40 |
41-50 | 5
51-60 | 8
61-70 |
71-80 |
81-90 | 2
91-100 | 1
All the class intervals are the same. As a result the bars of the
histogram will all be of equal width, and the frequency can be
plotted on the vertical axis. The histogram is shown below.
[Histogram: frequency against test score, 0 to 100]
Exercise 38.4
1. The table (below) shows the distances travelled to school
by a class of 30 students. Represent this information on a
histogram.
2. The heights of students in a class were measured. The
results are shown in the table (below). Draw a histogram to
represent this data.
145— 1
150— 2
iss 7
160 7
[_16s- 6
[oa 7
15 2
[———.
Note that both questions in Exercise 38.4 deal with continuous
data. In these questions equal class intervals are represented
in different ways. However, they mean the same thing.
In question 2, 145— means the students whose heights fall in
the range 145
[Histogram: horizontal axis, Sunflower height (m)]
This graph is misleading because it leads people to the
conclusion that most of the sunflowers were under 1 m, simply
because the area of the bar is so great. In actual fact only
approximately one quarter of the sunflowers were under 1 m.
When class intervals are different it is the area of the bar which
represents the frequency, not the height. Instead of frequency
being plotted on the vertical axis, frequency density is plotted.
Frequency density = frequency / class width
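The frequency density formula can be applied with a short Python sketch. The class intervals and frequencies below are hypothetical (they are not the sunflower measurements from the text), and `frequency_density` is a helper name introduced for illustration.

```python
# Hypothetical grouped data with unequal class widths:
# ((lower bound, upper bound), frequency)
classes = [((0.0, 1.0), 6), ((1.0, 1.5), 3), ((1.5, 1.75), 4), ((1.75, 2.0), 3)]


def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


densities = [frequency_density(lower, upper, freq)
             for (lower, upper), freq in classes]

# The area of each bar (class width x frequency density) gives back the frequency.
areas = [(upper - lower) * density
         for ((lower, upper), _), density in zip(classes, densities)]

for ((lower, upper), freq), density in zip(classes, densities):
    print(f"{lower} to {upper}: frequency {freq}, frequency density {density}")
```

In this sketch the first two classes have different frequencies (6 and 3) but the same frequency density, so their bars would be the same height while the wider bar has twice the area.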
The results of the sunflower measurements in the example
above can therefore be wi
LO=nS 1S
15 ~~ | 1 | t—_|
a) Copy the table and complete it by calculating the
frequency density.
b) Represent the information on a histogram.
2. On Sundays Maria helps her father feed their chickens.
Over a period of one year she kept a record of how long it
took. Her results are shown in the table below.
Time (min) | Frequency | Frequency density