
Merged OCR Stats

Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. The total number of marks for this paper is 72.

Uploaded by

Koirala Prijun
Copyright
© Attribution Non-Commercial (BY-NC)

ADVANCED SUBSIDIARY GCE UNIT

4732/01

MATHEMATICS
Probability & Statistics 1 FRIDAY 12 JANUARY 2007
Additional Materials: Answer Booklet (8 pages) List of Formulae (MF1)

Morning Time: 1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72.

ADVICE TO CANDIDATES

Read each question carefully and make sure you know what you have to do before starting your answer. You are reminded of the need for clear presentation in your answers.

This document consists of 6 printed pages and 2 blank pages.


© OCR 2007 [K/102/2696] OCR is an exempt Charity

[Turn over

www.XtremePapers.net

1

Part of the probability distribution of a variable, X, is given in the table.

x            1       2      3
P(X = x)    3/10    1/5    2/5

(i) Find P(X = 0). [2]
(ii) Find E(X). [2]

2

The table contains data concerning five households selected at random from a certain town.

Number of people in the household                       2   3   3   5   7
Number of cars belonging to people in the household     1   1   3   2   4

(i) Calculate the product moment correlation coefficient, r, for the data in the table. [5]

(ii) Give a reason why it would not be sensible to use your answer to draw a conclusion about all the households in the town. [1]

3

The digits 1, 2, 3, 4 and 5 are arranged in random order, to form a five-digit number.
(i) How many different five-digit numbers can be formed? [1]
(ii) Find the probability that the five-digit number is
(a) odd, [2]
(b) less than 23 000. [3]

OCR 2007

4732/01 Jan07


4

Each of the variables W, X, Y and Z takes eight integer values only. The probability distributions are illustrated in the following diagrams.

(i) For which one or more of these variables is
(a) the mean equal to the median, [1]
(b) the mean greater than the median? [1]
(ii) Give a reason why none of these diagrams could represent a geometric distribution. [1]

(iii) Which one of these diagrams could not represent a binomial distribution? Explain your answer briefly. [2]

5

A chemical solution was gradually heated. At five-minute intervals the time, x minutes, and the temperature, y °C, were noted.

x     0     5     10    15     20     25     30     35
y    0.8   3.0   6.8   10.9   15.6   19.6   23.4   26.7

[n = 8, Σx = 140, Σy = 106.8, Σx² = 3500, Σy² = 2062.66, Σxy = 2685.0.]

(i) Calculate the equation of the regression line of y on x. [4]
(ii) Use your equation to estimate the temperature after 12 minutes. [2]

(iii) It is given that the value of the product moment correlation coefficient is close to +1. Comment on the reliability of using your equation to estimate y when
(a) x = 17,
(b) x = 57. [2]
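
As an illustration of how the bracketed summary statistics for the heating data feed into a least-squares line of y on x, the sketch below applies the standard formulas b = Sxy/Sxx and a = (mean of y) − b × (mean of x). The function name `regression_from_sums` is a hypothetical helper introduced here, not something from the paper, and the printed values are offered as a check on working rather than as a mark-scheme answer.

```python
def regression_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    """Return (a, b) for the least-squares line y = a + b*x.

    Hypothetical helper: works only from the summary totals
    n, sum of x, sum of y, sum of x squared and sum of xy.
    """
    s_xx = sum_xx - sum_x ** 2 / n       # corrected sum of squares of x
    s_xy = sum_xy - sum_x * sum_y / n    # corrected sum of products
    b = s_xy / s_xx                      # gradient
    a = sum_y / n - b * sum_x / n        # intercept: line passes through the means
    return a, b


# Summary statistics quoted for the chemical solution data.
a, b = regression_from_sums(8, 140, 106.8, 3500, 2685.0)
print(f"y = {a:.3f} + {b:.3f}x")
print(f"estimate at x = 12: {a + b * 12:.2f}")
```

The same helper could be reused for any question that quotes n, Σx, Σy, Σx² and Σxy; whether an estimate is trustworthy still depends on whether x lies inside the range of the data.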


6

A coin is biased so that the probability that it will show heads on any throw is 2/3. The coin is thrown repeatedly. The number of throws up to and including the first head is denoted by X. Find
(i) P(X = 4), [3]
(ii) P(X < 4), [3]
(iii) E(X). [2]

7

A bag contains three 1p coins and seven 2p coins. Coins are removed at random one at a time, without replacement, until the total value of the coins removed is at least 3p. Then no more coins are removed.
(i) Copy and complete the probability tree diagram. [5]

Find the probability that
(ii) exactly two coins are removed, [3]
(iii) the total value of the coins removed is 4p. [3]


8

In the 2001 census, the household size (the number of people living in each household) was recorded. The percentages of households of different sizes were then calculated. The table shows the percentages for two wards, Withington and Old Moat, in Manchester.

Household size     1      2      3      4      5     6     7 or more
Withington        34.1   26.1   12.7   12.8   8.2   4.0   2.1
Old Moat          35.1   27.1   14.7   11.4   7.6   2.8   1.3

(i) Calculate the median and interquartile range of the household size for Withington. [3]

(ii) Making an appropriate assumption for the last class, which should be stated, calculate the mean and standard deviation of the household size for Withington. Give your answers to an appropriate degree of accuracy. [6]

The corresponding results for Old Moat are as follows.

Median                 2
Interquartile range    2
Mean                   2.4
Standard deviation     1.5

(iii) State one advantage of using the median rather than the mean as a measure of the average household size. [1]
(iv) By comparing the values for Withington with those for Old Moat, explain briefly why the interquartile range may be less suitable than the standard deviation as a measure of the variation in household size. [1]
(v) For one of the above wards, the value of Spearman's rank correlation coefficient between household size and percentage is −1. Without any calculation, state which ward this is. Explain your answer. [2]

9

A variable X has the distribution B(11, p).

(i) Given that p = 3/4, find P(X = 5). [2]
(ii) Given that P(X = 0) = 0.05, find p. [4]
(iii) Given that Var(X) = 1.76, find the two possible values of p. [5]


Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every reasonable effort has been made by the publisher (UCLES) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the publisher will be pleased to make amends at the earliest possible opportunity. OCR is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.

ADVANCED SUBSIDIARY GCE UNIT

4732/01

MATHEMATICS
Probability & Statistics 1 TUESDAY 5 JUNE 2007
Additional Materials: Answer Booklet (8 pages) List of Formulae (MF1)

Afternoon Time: 1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72.

ADVICE TO CANDIDATES

Read each question carefully and make sure you know what you have to do before starting your answer. You are reminded of the need for clear presentation in your answers.

This document consists of 6 printed pages and 2 blank pages.


© OCR 2007 [K/102/2696] OCR is an exempt Charity

1

The table shows the probability distribution for a random variable X.

x            0     1     2     3
P(X = x)    0.1   0.2   0.3   0.4

Calculate E(X) and Var(X). [5]


2

Two judges each placed skaters from five countries in rank order.

Position   1st      2nd      3rd      4th      5th
Judge 1    UK       France   Russia   Poland   Canada
Judge 2    Russia   Canada   France   UK       Poland

Calculate Spearman's rank correlation coefficient, r_s, for the two judges' rankings. [5]

3

(i) How many different teams of 7 people can be chosen, without regard to order, from a squad of 15? [2]
(ii) The squad consists of 6 forwards and 9 defenders. How many different teams containing 3 forwards and 4 defenders can be chosen? [2]

4

A bag contains 6 white discs and 4 blue discs. Discs are removed at random, one at a time, without replacement.
(i) Find the probability that
(a) the second disc is blue, given that the first disc was blue, [1]
(b) the second disc is blue, [3]
(c) the third disc is blue, given that the first disc was blue. [3]

(ii) The random variable X is the number of discs which are removed up to and including the first blue disc. State whether the variable X has a geometric distribution. Explain your answer briefly. [1]

4732/01 Jun07

5

The numbers of births, in thousands, to mothers of different ages in England and Wales, in 1991 and 2001 are illustrated by the cumulative frequency curves.

(i) In which of these two years were there more births? How many more births were there in this year? [2] (ii) The following quantities were estimated from the diagram.
Year 1991 2001 Median age (years) 27.5 Interquartile range (years) 7.3 Proportion of mothers giving birth aged below 25 33% Proportion of mothers giving birth aged 35 or above 9% 18%

(a) Find the values missing from the table.

[5]

(b) Did the women who gave birth in 2001 tend to be younger or older or about the same age as the women who gave birth in 1991? Using the table and your values from part (a), give two reasons for your answer. [3]


6

A machine with artificial intelligence is designed to improve its efficiency rating with practice. The table shows the values of the efficiency rating, y, after the machine has carried out its task various numbers of times, x.

x    0   1   2   3    4    7    13   30
y    0   4   8   10   11   12   13   14

These data are illustrated in the scatter diagram.

[n = 8, Σx = 60, Σy = 72, Σx² = 1148, Σy² = 810, Σxy = 767.]

(i) (a) Calculate the value of r, the product moment correlation coefficient. [3]

(b) Without calculation, state with a reason the value of r_s, Spearman's rank correlation coefficient. [2]
(ii) A researcher suggests that the data for x = 0 and x = 1 should be ignored. Without calculation, state with a reason what effect this would have on the value of
(a) r, [2]
(b) r_s. [2]
(iii) Use the diagram to estimate the value of y when x = 29. [1]

(iv) Jack finds the equation of the regression line of y on x for all the data, and uses it to estimate the value of y when x = 29. Without calculation, state with a reason whether this estimate or the one found in part (iii) will be the more reliable. [2]
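
To make the contrast between the two coefficients in this question concrete, the following sketch computes the product moment correlation coefficient from the tabulated pairs and then applies the same formula to the ranks, which is one common way of obtaining Spearman's coefficient when there are no ties. The helper names `pmcc` and `ranks` are hypothetical and introduced only for this illustration; the question itself asks for r_s without calculation.

```python
from math import sqrt

# Data pairs as listed in the table for the machine's efficiency rating.
xs = [0, 1, 2, 3, 4, 7, 13, 30]
ys = [0, 4, 8, 10, 11, 12, 13, 14]


def pmcc(xs, ys):
    """Product moment correlation coefficient from raw data."""
    n = len(xs)
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return s_xy / sqrt(s_xx * s_yy)


def ranks(values):
    """Rank values from 1 upwards; assumes no ties, which holds for this data."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


r = pmcc(xs, ys)                     # measures linear association
r_s = pmcc(ranks(xs), ranks(ys))     # same formula applied to the ranks
print(f"r = {r:.3f}, r_s = {r_s:.3f}")
```

Because y increases whenever x increases, the two rank lists are identical, which is why the rank-based coefficient differs from r for data that rise steadily but not in a straight line.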


7

On average, 25% of the packets of a certain kind of soup contain a voucher. Kim buys one packet of soup each week for 12 weeks. The number of vouchers she obtains is denoted by X.
(i) State two conditions needed for X to be modelled by the distribution B(12, 0.25). [2]

In the rest of this question you should assume that these conditions are satisfied.
(ii) Find P(X 6). [2]

In order to claim a free gift, 7 vouchers are needed.

(iii) Find the probability that Kim will be able to claim a free gift at some time during the 12 weeks. [1]
(iv) Find the probability that Kim will be able to claim a free gift in the 12th week but not before. [4]

8

(i) A biased coin is thrown twice. The probability that it shows heads both times is 0.04. Find the probability that it shows tails both times. [3]
(ii) Another coin is biased so that the probability that it shows heads on any throw is p. The probability that the coin shows heads exactly once in two throws is 0.42. Find the two possible values of p. [5]

9

(i) A random variable X has the distribution Geo(1/5). Find
(a) E(X), [2]
(b) P(X = 4), [2]
(c) P(X > 4). [2]

(ii) A random variable Y has the distribution Geo(p), and q = 1 − p.
(a) Show that P(Y is odd) = p + q²p + q⁴p + ... . [1]
(b) Use the formula for the sum to infinity of a geometric progression to show that

P(Y is odd) = 1/(1 + q). [4]
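
A quick numerical check can build confidence in the stated result before attempting the algebra. The sketch below, which is not a proof, sums the first terms of the series p + q²p + q⁴p + ... and compares the total with 1/(1 + q) for a few illustrative values of p. The function name `prob_odd` and the chosen values of p are assumptions made for this example.

```python
def prob_odd(p, terms=2000):
    """Partial sum of p + q^2 p + q^4 p + ... with q = 1 - p.

    Hypothetical helper: truncates the infinite series after `terms` terms,
    which is ample for the values of p tried below.
    """
    q = 1 - p
    return sum(p * q ** (2 * k) for k in range(terms))


for p in (0.2, 0.5, 0.9):
    q = 1 - p
    print(f"p = {p}: series = {prob_odd(p):.6f}, 1/(1+q) = {1 / (1 + q):.6f}")
```

The agreement reflects the fact that the series is geometric with first term p and common ratio q², so its sum to infinity is p/(1 − q²), and 1 − q² factorises as (1 − q)(1 + q) with 1 − q = p.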


OXFORD CAMBRIDGE AND RSA EXAMINATIONS
Advanced Subsidiary General Certificate of Education
Advanced General Certificate of Education

MATHEMATICS 4732
Probability & Statistics 1
Tuesday 18 JANUARY 2005 Afternoon 1 hour 30 minutes

Additional materials: Answer booklet Graph paper List of Formulae (MF1)

TIME

1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. Questions carrying smaller numbers of marks are printed earlier in the paper, and questions carrying larger numbers of marks later in the paper. You are reminded of the need for clear presentation in your answers.

This question paper consists of 4 printed pages.


© OCR 2005 [K/102/2696] Registered Charity Number: 1066969

1

The scatter diagrams below illustrate three sets of bivariate data, A, B and C.

State, with an explanation in each case, which of the three sets of data has
(i) the largest,
(ii) the smallest,

value of the product moment correlation coefficient. [4]

2

The back-to-back stem-and-leaf diagram below shows the number of hours of television watched per week by each of 15 boys and 15 girls. Boys 8 7 7 6 6 4 4 3 2 2 0 6 5 4 5 0 1 2 3 Girls 0 5 5 6 6 7 7 8 8 9 0 0 4 2 7

Key: 4 | 2 | 2 means a boy who watched 24 hours and a girl who watched 22 hours of television per week.
(i) Find the median and the quartiles of the results for the boys. [3]

(ii) Give a reason why the median might be preferred to the mean in using an average to compare the two data sets. [1]
(iii) State one advantage, and one disadvantage, of using stem-and-leaf diagrams rather than box-and-whisker plots to represent the data. [2]

4732/Jan05


3

Two commentators gave ratings out of 100 for seven sports personalities. The ratings are shown in the table below.

Personality        A    B    C    D    E    F    G
Commentator I      73   76   78   65   86   82   91
Commentator II     77   78   79   80   86   89   95

(i) Calculate Spearman's rank correlation coefficient for these ratings. [5]
(ii) State what your answer tells you about the ratings given by the two commentators. [1]

4

The table below shows the probability distribution of the random variable X.

x            2      1      1      2
P(X = x)    1/4    1/5    2/5    1/10

(i) Find the value of the constant k. [2]
(ii) Calculate the values of E(X) and Var(X). [5]

5

On average 1 in 20 members of the population of this country has a particular DNA feature. Members of the population are selected at random until one is found who has this feature.
(i) Find the probability that the first person to have this feature is
(a) the sixth person selected, [3]
(b) not among the first 10 people selected. [3]
(ii) Find the expected number of people selected. [2]

6

Louise and Marie play a series of tennis matches. It is given that, in any match, the probability that Louise wins the first two sets is 3/8.
(i) Find the probability that, in 5 randomly chosen matches, Louise wins the first two sets in exactly 2 of the matches. [3]

It is also given that Louise and Marie are equally likely to win the first set.
(ii) Show that P(Louise wins the second set, given that she won the first set) = 3/4. [2]
(iii) The probability that Marie wins the first two sets is 1/3. Find

P(Marie wins the second set, given that she won the first set). [2]


7

It is known that, on average, one match box in 10 contains fewer than 42 matches. Eight boxes are selected, and the number of boxes that contain fewer than 42 matches is denoted by Y.
(i) State two conditions needed to model Y by a binomial distribution. [2]

Assume now that a binomial model is valid.

(ii) Find
(a) P(Y = 0), [2]
(b) P(Y ≥ 2). [2]

(iii) On Wednesday 8 boxes are selected, and on Thursday another 8 boxes are selected. Find the probability that on one of these days the number of boxes containing fewer than 42 matches is 0, and that on the other day the number is 2 or more. [3]

8

An examination paper consists of 8 questions, of which one is on geometric distributions and one is on binomial distributions.
(i) If the 8 questions are arranged in a random order, find the probability that the question on geometric distributions is next to the question on binomial distributions. [3]

Four of the questions, including the one on geometric distributions, are worth 7 marks each, and the remaining four questions, including the one on binomial distributions, are worth 9 marks each. The 7-mark questions are the first four questions on the paper, but are arranged in random order. The 9-mark questions are the last four questions, but are arranged in random order. Find the probability that
(ii) the questions on geometric distributions and on binomial distributions are next to one another, [3]
(iii) the questions on geometric distributions and on binomial distributions are separated by at least 2 other questions. [4]

9

Five observations of bivariate data produce the following results, denoted as (x_i, y_i) for i = 1, 2, 3, 4, 5.

(13, 2.7)   (13, 4.0)   (18, 2.8)   (23, 3.3)   (23, 2.2)

[Σx = 90, Σy = 15.0, Σx² = 1720, Σy² = 46.86, Σxy = 264.0.]

(i) Show that the regression line of y on x has gradient −0.06, and find its equation in the form y = a + bx. [4]
(ii) The regression line is used to estimate the value of y corresponding to x = 20, but the value x = 20 is accurate only to the nearest whole number. Calculate the difference between the largest and the smallest values that the estimated value of y could take. [3]

The numbers e_1, e_2, e_3, e_4, e_5 are defined by

e_i = a + bx_i − y_i

for i = 1, 2, 3, 4, 5.

(iii) The values of e_1, e_2 and e_3 are 0.6, −0.7 and 0.2 respectively. Calculate the values of e_4 and e_5. [2]
(iv) Calculate the value of e_1² + e_2² + e_3² + e_4² + e_5² and explain the relevance of this quantity to the regression line found in part (i). [2]
(v) Find the mean and the variance of e_1, e_2, e_3, e_4, e_5. [4]


OXFORD CAMBRIDGE AND RSA EXAMINATIONS
Advanced Subsidiary General Certificate of Education
Advanced General Certificate of Education

MATHEMATICS 4732
Probability & Statistics 1
Thursday 9 JUNE 2005 Morning 1 hour 30 minutes

Additional materials: Answer booklet Graph paper List of Formulae (MF1)

TIME

1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. Questions carrying smaller numbers of marks are printed earlier in the paper, and questions carrying larger numbers of marks later in the paper. You are reminded of the need for clear presentation in your answers.

This question paper consists of 5 printed pages and 3 blank pages.


© OCR 2005 [K/102/2696] Registered Charity Number: 1066969

1

(i) Calculate the value of Spearman's rank correlation coefficient between the two sets of rankings, A and B, shown in Table 1. [4]

A   1   2   3   4   5
B   4   1   3   2   5

Table 1

(ii) The value of Spearman's rank correlation coefficient between the set of rankings B and a third set of rankings, C, is known to be 1. Copy and complete Table 2 showing the set of rankings C. [2]

B
C

Table 2

2

The probability that a certain sample of radioactive material emits an alpha-particle in one unit of time is 0.14. In one unit of time no more than one alpha-particle can be emitted. The number of units of time up to and including the first in which an alpha-particle is emitted is denoted by T.
(i) Find the value of
(a) P(T = 5), [3]
(b) P(T < 8). [3]
(ii) State the value of E(T). [2]

3

In a supermarket the proportion of shoppers who buy washing powder is denoted by p. 16 shoppers are selected at random.
(i) Given that p = 0.35, use tables to find the probability that the number of shoppers who buy washing powder is
(a) at least 8, [3]
(b) between 4 and 9 inclusive. [2]

(ii) Given instead that p = 0.38, find the probability that the number of shoppers who buy washing powder is exactly 6. [3]

4732/S05


4

The table shows the latitude, x (in degrees correct to 3 significant figures), and the average rainfall y (in cm correct to 3 significant figures) of five European cities.

City              x       y
Berlin           52.5    58.2
Bucharest        44.4    58.7
Moscow           55.8    53.3
St Petersburg    60.0    47.8
Warsaw           52.3    56.6

[n = 5, Σx = 265.0, Σy = 274.6, Σx² = 14 176.54, Σy² = 15 162.22, Σxy = 14 464.10.]

(i) Calculate the product moment correlation coefficient. [3]

(ii) The values of y in the table were in fact obtained from measurements in inches and converted into centimetres by multiplying by 2.54. State what effect it would have had on the value of the product moment correlation coefficient if it had been calculated using inches instead of centimetres. [1]
(iii) It is required to estimate the annual rainfall at Bergen, where x = 60.4. Calculate the equation of an appropriate line of regression, giving your answer in simplified form, and use it to find the required estimate. [5]


5

The examination marks obtained by 1200 candidates are illustrated on the cumulative frequency graph, where the data points are joined by a smooth curve.

Use the curve to estimate

(i) the interquartile range of the marks, [3]
(ii) x, if 40% of the candidates scored more than x marks, [3]
(iii) the number of candidates who scored more than 68 marks. [2]

Five of the candidates are selected at random, with replacement.

(iv) Estimate the probability that all five scored more than 68 marks. [3]

It is subsequently discovered that the candidates' marks in the range 35 to 55 were evenly distributed, that is, roughly equal numbers of candidates scored 35, 36, 37, ..., 55.
(v) What does this information suggest about the estimate of the interquartile range found in part (i)? [2]


6

Two bags contain coloured discs. At first, bag P contains 2 red discs and 2 green discs, and bag Q contains 3 red discs and 1 green disc. A disc is chosen at random from bag P, its colour is noted and it is placed in bag Q. A disc is then chosen at random from bag Q, its colour is noted and it is placed in bag P. A disc is then chosen at random from bag P. The tree diagram shows the different combinations of three coloured discs chosen.

(i) Write down the values of a, b, c, d, e and f. [4]

The total number of red discs chosen, out of 3, is denoted by R. The table shows the probability distribution of R.

r            0      1     2      3
P(R = r)    1/10    k    9/20   1/5

(ii) Show how to obtain the value P(R = 2) = 9/20. [3]
(iii) Find the value of k. [2]
(iv) Calculate the mean and variance of R. [5]

7

A committee of 7 people is to be chosen at random from 18 volunteers.

(i) In how many different ways can the committee be chosen? [2]

The 18 volunteers consist of 5 people from Gloucester, 6 from Hereford and 7 from Worcester. The committee is to be chosen randomly. Find the probability that the committee will
(ii) consist of 2 people from Gloucester, 2 people from Hereford and 3 people from Worcester, [4]
(iii) include exactly 5 people from Worcester, [4]
(iv) include at least 2 people from each of the three cities. [4]


ADVANCED SUBSIDIARY GCE

4732/01

MATHEMATICS
Probability & Statistics 1 TUESDAY 15 JANUARY 2008
Additional materials: Answer Booklet (8 pages) List of Formulae (MF1)

Morning Time: 1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Read each question carefully and make sure you know what you have to do before starting your answer. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. You are reminded of the need for clear presentation in your answers.

This document consists of 6 printed pages and 2 blank pages.


© OCR 2008 [K/102/2696] OCR is an exempt Charity

1

(i) The letters A, B, C, D and E are arranged in a straight line.
(a) How many different arrangements are possible? [2]
(b) In how many of these arrangements are the letters A and B next to each other? [3]

(ii) From the letters A, B, C, D and E, two different letters are selected at random. Find the probability that these two letters are A and B. [2]

2

A random variable T has the distribution Geo(1/5). Find

(i) P(T = 4), [2]
(ii) P(T > 4), [2]
(iii) E(T). [1]

3

A sample of bivariate data was taken and the results were summarised as follows.

n = 5    Σx = 24    Σx² = 130    Σy = 39    Σy² = 361    Σxy = 212

(i) Show that the value of the product moment correlation coefficient r is 0.855, correct to 3 significant figures. [2]
(ii) The ranks of the data were found. One student calculated Spearman's rank correlation coefficient r_s, and found that r_s = 0.7. Another student calculated the product moment coefficient, R, of these ranks. State which one of the following statements is true, and explain your answer briefly.

(A) R = 0.855
(B) R = 0.7
(C) It is impossible to give the value of R without carrying out a calculation using the original data. [2]
(iii) All the values of x are now multiplied by a scaling factor of 2. State the new values of r and r_s. [2]

4

A supermarket has a large stock of eggs. 40% of the stock are from a firm called Eggzact. 12% of the stock are brown eggs from Eggzact. An egg is chosen at random from the stock. Calculate the probability that
(i) this egg is brown, given that it is from Eggzact, [2]
(ii) this egg is from Eggzact and is not brown. [2]

4732/01 Jan08

5

(i) 20% of people in the large town of Carnley support the Residents' Party. 12 people from Carnley are selected at random. Out of these 12 people, the number who support the Residents' Party is denoted by U.

Find
(a) P(U 5), [2]
(b) P(U 3). [3]

(ii) 30% of people in Carnley support the Commerce Party. 15 people from Carnley are selected at random. Out of these 15 people, the number who support the Commerce Party is denoted by V.

Find P(V = 4). [3]

6

The probability distribution for a random variable Y is shown in the table.

y            1     2     3
P(Y = y)    0.2   0.3   0.5

(i) Calculate E(Y) and Var(Y). [5]

Another random variable, Z, is independent of Y. The probability distribution for Z is shown in the table.

z            1     2      3
P(Z = z)    0.1   0.25   0.65

One value of Y and one value of Z are chosen at random. Find the probability that
(ii) Y + Z = 3, [3]
(iii) Y Z is even. [3]

7

(i) Andrew plays 10 tennis matches. In each match he either wins or loses.
(a) State, in this context, two conditions needed for a binomial distribution to arise. [2]

(b) Assuming these conditions are satisfied, define a variable in this context which has a binomial distribution. [1]
(ii) The random variable X has the distribution B(21, p), where 0 < p < 1.

Given that P(X = 10) = P(X = 9), find the value of p. [5]


8

The stem-and-leaf diagram shows the age in completed years of the members of a sports club.

        Male |   | Female
        8876 | 1 | 66677889
    76553321 | 2 | 1334578899
       98443 | 3 | 23347
         521 | 4 | 018
          90 | 5 | 0

Key: 1 | 4 | 0 represents a male aged 41 and a female aged 40.

(i) Find the median and interquartile range for the males. [3]

(ii) The median and interquartile range for the females are 27 and 15 respectively. Make two comparisons between the ages of the males and the ages of the females. [2]
(iii) The mean age of the males is 30.7 and the mean age of the females is 27.5, each correct to 1 decimal place. Give one advantage of using the median rather than the mean to compare the ages of the males with the ages of the females. [1]

A record was kept of the number of hours, X, spent by each member at the club in a year. The results were summarised by

n = 49,   Σ(x − 200) = 245,   Σ(x − 200)² = 9849.

(iv) Calculate the mean and standard deviation of X. [6]
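
Coded summaries of this kind can be decoded mechanically, and the sketch below shows one way to do so. It assumes the two totals are Σ(x − 200) and Σ(x − 200)², so the coding is a shift of 200, and it uses the divisor n for the variance. Both points are assumptions stated for this example, and the function name `decode_mean_sd` is hypothetical.

```python
from math import sqrt


def decode_mean_sd(n, sum_d, sum_d2, shift):
    """Mean and standard deviation of x from sums of d = x - shift.

    Assumes the variance uses divisor n. A shift changes the mean
    by the same amount but leaves the spread unchanged.
    """
    mean_d = sum_d / n
    variance = sum_d2 / n - mean_d ** 2
    return shift + mean_d, sqrt(variance)


# Totals quoted for the hours spent at the club (coding assumed to be x - 200).
mean, sd = decode_mean_sd(49, 245, 9849, 200)
print(f"mean = {mean:.1f}, standard deviation = {sd:.2f}")
```

Working with the coded values keeps the numbers small; only the mean needs adjusting back by the shift at the end.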


9   It is thought that the pH value of sand (a measure of the sand's acidity) may affect the extent to which a particular species of plant will grow in that sand. A botanist wished to determine whether there was any correlation between the pH value of the sand on certain sand dunes, and the amount of each of two plant species growing there. She chose random sections of equal area on each of eight sand dunes and measured the pH values. She then measured the area within each section that was covered by each of the two species. The results were as follows.

Dune                                  A     B     C     D     E     F     G     H
pH value, x                           8.5   8.5   9.5   8.5   6.5   7.5   8.5   9.0
Area, y cm², covered by Species P     150   150   575   330   45    15    340   330
Area, y cm², covered by Species Q     170   15    80    230   75    25    0     0

The results for species P can be summarised by

n = 8,   Σx = 66.5,   Σx² = 558.75,   Σy = 1935,   Σy² = 711 275,   Σxy = 17 082.5.

(i) Give a reason why it might be appropriate to calculate the equation of the regression line of y on x rather than x on y in this situation. [1]
(ii) Calculate the equation of the regression line of y on x for species P, in the form y = a + bx, giving the values of a and b correct to 3 significant figures. [4]
(iii) Estimate the value of y for species P on sand where the pH value is 7.0. [2]

The values of the product moment correlation coefficient between x and y for species P and Q are rP = 0.828 and rQ = 0.0302.
(iv) Describe the relationship between the area covered by species Q and the pH value. [1]
(v) State, with a reason, whether the regression line of y on x for species P will provide a reliable estimate of the value of y when the pH value is
(a) 8, [1]
(b) 4. [1]
(vi) Assume that the equation of the regression line of y on x for species Q is also known. State, with a reason, whether this line will provide a reliable estimate of the value of y when the pH value is 8. [1]
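The regression line of y on x for species P can be obtained from the summary values with the usual least-squares formulae. The sketch below is an illustrative check, not part of the paper; it assumes the summaries are Σx = 66.5, Σx² = 558.75, Σy = 1935 and Σxy = 17 082.5, and the function name predict is hypothetical.

```python
n = 8
sum_x, sum_x2 = 66.5, 558.75
sum_y, sum_xy = 1935, 17082.5

# Sums of squares and products about the means.
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least-squares regression of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n


def predict(ph):
    """Estimated area (cm^2) covered by species P at the given pH value."""
    return a + b * ph


print(round(a, 2), round(b, 2), round(predict(7.0), 2))
```

Using the unrounded a and b gives a slightly different estimate at pH 7.0 from the one obtained after rounding both to 3 significant figures, because a and b are large and nearly cancel.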


Permission to reproduce items where third-party owned material protected by copyright is included has been sought and cleared where possible. Every reasonable effort has been made by the publisher (OCR) to trace copyright holders, but if any items requiring clearance have unwittingly been included, the publisher will be pleased to make amends at the earliest possible opportunity. OCR is part of the Cambridge Assessment Group. Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.

ADVANCED SUBSIDIARY GCE

4732/01

MATHEMATICS
Probability & Statistics 1

FRIDAY 23 MAY 2008


Additional materials (enclosed): None
Additional materials (required): Answer Booklet (8 pages), List of Formulae (MF1)

Morning Time: 1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name in capital letters, your Centre Number and Candidate Number in the spaces provided on the Answer Booklet. Read each question carefully and make sure you know what you have to do before starting your answer. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. You are reminded of the need for clear presentation in your answers.

This document consists of 6 printed pages and 2 blank pages.


OCR 2008 [K/102/2696] OCR is an exempt Charity


1   (i) State the value of the product moment correlation coefficient for each of the following scatter diagrams. [2]

(a) [scatter diagram not reproduced]   (b) [scatter diagram not reproduced]

(ii) Calculate the value of Spearman's rank correlation coefficient for the following data. [5]

x    3.8   4.1   4.5   5.3
y    1.4   0.8   0.7   1.2

A class consists of 7 students from Ashville and 8 from Bewton. A committee of 5 students is chosen at random from the class.
(i) Find the probability that 2 students from Ashville and 3 from Bewton are chosen.

[3]

(ii) In fact 2 students from Ashville and 3 from Bewton are chosen. In order to watch a video, all 5 committee members sit in a row. In how many different orders can they sit if no two students from Bewton sit next to each other? [2]

(i) A random variable X has the distribution B(8, 0.55). Find
(a) P(X < 7), [1]
(b) P(X = 5), [2]
(c) P(3 ≤ X < 6). [3]
(ii) A random variable Y has the distribution B(10, 5/12). Find
(a) P(Y = 2), [2]
(b) Var(Y). [1]

OCR 2008

4732/01 Jun08


4   At a fairground stall, on each turn a player receives prize money with the following probabilities.

Prize money    £0.00    £0.50    £5.00
Probability    17/20    1/10     1/20

(i) Find the probability that a player who has two turns will receive a total of £5.50 in prize money. [3]
(ii) The stall-holder wishes to make a profit of 20p per turn on average. Calculate the amount the stall-holder should charge for each turn. [4]
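Both parts of this question reduce to short calculations with the prize distribution. The sketch below is an illustrative check, not part of the paper; it assumes the prizes are 0.00, 0.50 and 5.00 in pounds with probabilities 17/20, 1/10 and 1/20, and that the two turns are independent.

```python
from fractions import Fraction

# Prize money in pounds -> probability, as read from the table.
prizes = {
    Fraction(0): Fraction(17, 20),
    Fraction(1, 2): Fraction(1, 10),
    Fraction(5): Fraction(1, 20),
}

# Two independent turns: add up every ordered pair of prizes totalling 5.50.
target = Fraction(11, 2)
p_total_5_50 = sum(
    p1 * p2
    for a, p1 in prizes.items()
    for b, p2 in prizes.items()
    if a + b == target
)

# Average payout per turn, and the charge that leaves 20p profit on average.
expected_prize = sum(a * p for a, p in prizes.items())
charge = expected_prize + Fraction(20, 100)

print(p_total_5_50, expected_prize, charge)
```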

(i) A bag contains 12 red discs and 10 black discs. Two discs are removed at random, without replacement. Find the probability that both discs are red. [2]
(ii) Another bag contains 7 green discs and 8 blue discs. Three discs are removed at random, without replacement. Find the probability that exactly two of these discs are green. [3]
(iii) A third bag contains 45 discs, each of which is either yellow or brown. Two discs are removed at random, without replacement. The probability that both discs are yellow is 1/15. Find the number of yellow discs which were in the bag at first. [4]


6   (i) The numbers of males and females in Year 12 at a school are illustrated in the pie chart. The number of males in Year 12 is 128.

[Pie chart titled "Year 12": two sectors labelled Males and Females, with an angle of 120° marked]

(a) Find the number of females in Year 12. [1]
(b) On a corresponding pie chart for Year 13, the angle of the sector representing males is 150°. Explain why this does not necessarily mean that the number of males in Year 13 is more than 128. [1]

(ii) All the Year 12 students took a General Studies examination. The results are illustrated in the box-and-whisker plots.

[Box-and-whisker plots for Year 12 Females and Year 12 Males, on a Mark axis from 0 to 100]

(a) One student said "The Year 12 pie chart shows that there are more females than males, but the box-and-whisker plots show that there are more males than females." Comment on this statement. [1]
(b) Give two comparisons between the overall performance of the females and the males in the General Studies examination. [2]
(c) Give one advantage and one disadvantage of using box-and-whisker plots rather than histograms to display the results. [2]

(iii) The mean mark for 102 of the male students was 51. The mean mark for the remaining 26 male students was 59. Calculate the mean mark for all 128 male students. [3]


7   Once each year, Paula enters a lottery for a place in an annual marathon. Each time she enters the lottery, the probability of her obtaining a place is 0.3. Find the probability that
(i) the first time she obtains a place is on her 4th attempt, [3]
(ii) she does not obtain a place on any of her first 6 attempts, [2]
(iii) she needs fewer than 10 attempts to obtain a place, [3]
(iv) she obtains a place exactly twice in her first 5 attempts. [3]
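The four probabilities in this question follow from geometric and binomial reasoning. The sketch below is illustrative only and not part of the paper; it assumes the attempts are independent with success probability 0.3 each time, and the variable names are hypothetical.

```python
from math import comb

p = 0.3      # probability of obtaining a place on any one attempt
q = 1 - p    # probability of not obtaining a place

# (i) Three failures followed by a success.
first_place_on_4th = q ** 3 * p

# (ii) Six failures in a row.
no_place_in_first_6 = q ** 6

# (iii) Fewer than 10 attempts needed: at least one success in the first 9.
fewer_than_10_attempts = 1 - q ** 9

# (iv) Exactly two successes in five attempts (binomial).
exactly_two_in_5 = comb(5, 2) * p ** 2 * q ** 3

print(first_place_on_4th, no_place_in_first_6,
      fewer_than_10_attempts, exactly_two_in_5)
```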

A city council attempted to reduce traffic congestion by introducing a congestion charge. The charge was set at £4.00 for the first year and was then increased by £2.00 each year. For each of the first eight years, the council recorded the average number of vehicles entering the city centre per day. The results are shown in the table.

Charge, £x                                       4     6     8     10    12    14    16    18
Average number of vehicles per day, y million    2.4   2.5   2.2   2.3   2.0   1.8   1.7   1.5

[n = 8, Σx = 88, Σy = 16.4, Σx² = 1136, Σy² = 34.52, Σxy = 168.6.]

(i) Calculate the product moment correlation coefficient for these data. [3]
(ii) Explain why x is the independent variable. [1]
(iii) Calculate the equation of the regression line of y on x. [4]
(iv) (a) Use your equation to estimate the average number of vehicles which will enter the city centre per day when the congestion charge is raised to £20.00. [2]
(b) Comment on the reliability of your estimate. [2]
(v) The council wishes to estimate the congestion charge required to reduce the average number of vehicles entering the city per day to 1.0 million. Assuming that a reliable estimate can be made by extrapolation, state whether they should use the regression line of y on x or the regression line of x on y. Give a reason for your answer. [2]
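The correlation coefficient and the regression line of y on x can both be computed from the bracketed summary values. The sketch below is an illustrative check, not part of the paper; it assumes the summaries are Σx = 88, Σy = 16.4, Σx² = 1136, Σy² = 34.52 and Σxy = 168.6, with the charge measured in pounds.

```python
from math import sqrt

n = 8
sum_x, sum_y = 88, 16.4
sum_x2, sum_y2, sum_xy = 1136, 34.52, 168.6

# Sums of squares and products about the means.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

# Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Estimate for a charge of 20, which lies outside the observed range 4 to 18.
estimate_at_20 = a + b * 20

print(round(r, 3), round(a, 3), round(b, 4), round(estimate_at_20, 2))
```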


OXFORD CAMBRIDGE AND RSA EXAMINATIONS
Advanced Subsidiary General Certificate of Education
Advanced General Certificate of Education

MATHEMATICS
Probability & Statistics 1

4732

Wednesday 24 MAY 2006   Afternoon   1 hour 30 minutes

Additional materials: 8 page answer booklet Graph paper List of Formulae (MF1)

TIME

1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. Questions carrying smaller numbers of marks are printed earlier in the paper, and questions carrying larger numbers of marks later in the paper. You are reminded of the need for clear presentation in your answers.

This question paper consists of 4 printed pages.


OCR 2006 [K/102/2696] Registered Charity Number: 1066969


1   Some observations of bivariate data were made and the equations of the two regression lines were found to be as follows.

y on x :  y = −0.6x + 13.0
x on y :  x = −1.6y + 21.0

(i) State, with a reason, whether the correlation between x and y is negative or positive. [1]
(ii) Neither variable is controlled. Calculate an estimate of the value of x when y = 7.0. [2]
(iii) Find the values of x̄ and ȳ. [3]

A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from the bag. Find the probability that
(i) the second disc is black, given that the first disc was black, [1]
(ii) the second disc is black, [3]
(iii) the two discs are of different colours. [3]

Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a row.
(i) How many different arrangements of the letters are possible? [3]
(ii) In how many of these arrangements are all three Ds together? [2]

The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
(iii) Find the probability that at least one of these 2 cards has D printed on it. [3]

(i) The random variable X has the distribution B(25, 0.2). Using the tables of cumulative binomial probabilities, or otherwise, find P(X ≥ 5). [2]
(ii) The random variable Y has the distribution B(10, 0.27). Find P(Y = 3). [2]
(iii) The random variable Z has the distribution B(n, 0.27). Find the smallest value of n such that P(Z ≥ 1) > 0.95. [3]

5

The probability distribution of a discrete random variable, X, is given in the table.

x           0     1     [further columns not recovered]
P(X = x)    1/3   1/4   [entries involving p and q not recovered]

It is given that the expectation, E(X), is 1¼.
(i) Calculate the values of p and q. [5]
(ii) Calculate the standard deviation of X. [4]
4732/S06


6   The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.

Agent                 A     B     C     D     E     F     G
Distance travelled    18    15    12    14    16    24    13
Commission earned     18    45    19    24    27    22    23

(i) (a) Calculate Spearman's rank correlation coefficient, rs, for these data. [5]
(b) Comment briefly on your value of rs with reference to this context. [1]
(c) After these data were collected, agent A found that he had made a mistake. He had actually travelled 19 000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same. [2]

The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, y, together with the data for distance travelled, x, are illustrated in the scatter diagram below.

(ii) For this scatter diagram, what can you say about the value of
(a) Spearman's rank correlation coefficient, [1]
(b) the product moment correlation coefficient? [1]
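Spearman's rank correlation coefficient for the distance and commission data in part (i) can be checked with a short script. The sketch below is illustrative and not part of the paper; it assumes the table is read as distances 18, 15, 12, 14, 16, 24, 13 and commissions 18, 45, 19, 24, 27, 22, 23 for agents A to G, and it relies on there being no tied values.

```python
# Data as read from the table (thousands of miles, thousands of pounds).
distance = {"A": 18, "B": 15, "C": 12, "D": 14, "E": 16, "F": 24, "G": 13}
commission = {"A": 18, "B": 45, "C": 19, "D": 24, "E": 27, "F": 22, "G": 23}


def ranks(values):
    """Rank 1 for the smallest value; valid only when there are no ties."""
    ordered = sorted(values, key=values.get)
    return {agent: position + 1 for position, agent in enumerate(ordered)}


rank_x = ranks(distance)
rank_y = ranks(commission)

n = len(distance)
sum_d2 = sum((rank_x[agent] - rank_y[agent]) ** 2 for agent in distance)

# r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(r_s, 4))
```

Because only the ranks enter the formula, changing agent A's distance from 18 to 19 would alter r_s only if it changed A's rank.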

[Questions 7 and 8 are printed overleaf.]


7   In a UK government survey in 2000, smokers were asked to estimate the time between their waking and their having the first cigarette of the day. For heavy smokers, the results were as follows.

Time between waking and first cigarette    Percentage of smokers
1 to 4 minutes                             31
5 to 14 minutes                            27
15 to 29 minutes                           19
30 to 59 minutes                           14
At least 60 minutes                        9

Times are given correct to the nearest minute.

(i) Assuming that "At least 60 minutes" means "At least 60 minutes but less than 240 minutes", calculate estimates for the mean and standard deviation of the time between waking and first cigarette for these smokers. [6]
(ii) Find an estimate for the interquartile range of the time between waking and first cigarette for these smokers. Give your answer correct to the nearest minute. [4]
(iii) The meaning of "At least 60 minutes" is now changed to "At least 60 minutes but less than 480 minutes". Without further calculation, state whether this would cause an increase, a decrease or no change in the estimated value of
(a) the mean, [1]
(b) the standard deviation, [1]
(c) the interquartile range. [1]

Henry makes repeated attempts to light his gas fire. He makes the modelling assumption that the probability that the fire will light on any attempt is 1/3. Let X be the number of attempts at lighting the fire, up to and including the successful attempt.
(i) Name the distribution of X, stating a further modelling assumption needed. [2]

In the rest of this question, you should use the distribution named in part (i).
(ii) Calculate
(a) P(X = 4), [3]
(b) P(X < 4). [3]
(iii) State the value of E(X). [1]
(iv) Henry has to light the fire once a day, starting on March 1st. Calculate the probability that the first day on which fewer than 4 attempts are needed to light the fire is March 3rd. [3]


OXFORD CAMBRIDGE AND RSA EXAMINATIONS
Advanced Subsidiary General Certificate of Education
Advanced General Certificate of Education

MATHEMATICS
Probability & Statistics 1

4732

Thursday 12 JANUARY 2006   Afternoon   1 hour 30 minutes

Additional materials: 8 page answer booklet Graph paper List of Formulae (MF1)

TIME

1 hour 30 minutes

INSTRUCTIONS TO CANDIDATES

Write your name, centre number and candidate number in the spaces provided on the answer booklet. Answer all the questions. Give non-exact numerical answers correct to 3 significant figures unless a different degree of accuracy is specified in the question or is clearly appropriate. You are permitted to use a graphical calculator in this paper.

INFORMATION FOR CANDIDATES

The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72. Questions carrying smaller numbers of marks are printed earlier in the paper, and questions carrying larger numbers of marks later in the paper. You are reminded of the need for clear presentation in your answers.

This question paper consists of 5 printed pages and 3 blank pages.


OCR 2006 [K/102/2696] Registered Charity Number: 1066969


1   Jenny and John are each allowed two attempts to pass an examination.
(i) Jenny estimates that her chances of success are as follows.
    The probability that she will pass on her first attempt is 2/3.
    If she fails on her first attempt, the probability that she will pass on her second attempt is 3/4.
Calculate the probability that Jenny will pass. [3]
(ii) John estimates that his chances of success are as follows.
    The probability that he will pass on his first attempt is 2/3.
    Overall, the probability that he will pass is 5/6.
Calculate the probability that if John fails on his first attempt, he will pass on his second attempt. [3]

For each of 50 plants, the height, h cm, was measured and the value of (h − 100) was recorded. The mean and standard deviation of (h − 100) were found to be 24.5 and 4.8 respectively.
(i) Write down the mean and standard deviation of h. [2]

The mean and standard deviation of the heights of another 100 plants were found to be 123.0 cm and 5.1 cm respectively.
(ii) Describe briefly how the heights of the second group of plants compare with the first. [2]
(iii) Calculate the mean height of all 150 plants. [2]

In Mr Kendall's cupboard there are 3 tins of baked beans and 2 tins of pineapple. Unfortunately his daughter has removed all the labels for a school project and so the tins are identical in appearance. Mr Kendall wishes to use both tins of pineapple for a fruit salad. He opens tins at random until he has opened the two tins of pineapples. Let X be the number of tins that Mr Kendall opens.
(i) Show that P(X = 3) = 1/5. [4]
(ii) The probability distribution of X is given in the table below.

x           2      3     4      5
P(X = x)    1/10   1/5   3/10   2/5

Find E(X) and Var(X). [5]
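The distribution in the table, and the values of E(X) and Var(X), can be checked by listing where the two pineapple tins could fall in the opening order. The sketch below is illustrative and not part of the paper; it assumes every pair of positions for the pineapple tins is equally likely, and it is a numerical check rather than the written argument part (i) asks for.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Positions (1 to 5) of the two pineapple tins in the opening order.
# X is the position of the later pineapple tin.
pairs = list(combinations(range(1, 6), 2))
counts = Counter(max(pair) for pair in pairs)
dist = {x: Fraction(count, len(pairs)) for x, count in counts.items()}

mean = sum(x * p for x, p in dist.items())
variance = sum(x * x * p for x, p in dist.items()) - mean ** 2

for x in sorted(dist):
    print(x, dist[x])
print(mean, variance)
```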

4732/Jan06


4   Each day, the Research Department of a retail firm records the firm's daily income, to be used for statistical analysis. The results are summarised by recording the number of days on which the daily income is within certain ranges.
(i) The histogram shows the results for 300 days. By considering the total area of the histogram,
(a) find the number of days on which the daily income was between 4000 and 6000, [4]
(b) calculate an estimate of the number of days on which the daily income was between 2700 and 3200. [3]
(ii) The Research Department offers to provide any of the following statistical diagrams: histogram, frequency polygon, box-and-whisker plot, cumulative frequency graph, stem-and-leaf diagram and pie chart.
Which one of these statistical diagrams would most easily enable managers to
(a) read off the median and quartile values of the daily income, [1]
(b) find the range of the top 10% of values of the daily income? [1]

Andrea practises shots at goal. For each shot the probability of her scoring a goal is 2/5. Each shot is independent of other shots.
(i) Find the probability that she scores her first goal
(a) on her 5th shot, [2]
(b) before her 5th shot. [3]
(ii) (a) Find the probability that she scores exactly 1 goal in her first 5 shots. [3]
(b) Hence find the probability that she scores her second goal on her 6th shot. [2]


6   An examination paper consists of two parts. Section A contains questions A1, A2, A3 and A4. Section B contains questions B1, B2, B3, B4, B5, B6 and B7. Candidates must choose three questions from section A and four questions from section B. The order in which they choose the questions does not matter.
(i) In how many ways can the seven questions be chosen? [3]
(ii) Assuming that all selections are equally likely, find the probability that a particular candidate chooses question A1 but does not choose question B1. [3]
(iii) Following a change of syllabus, the form of the examination remains the same except that candidates who choose question A1 are not allowed to choose question B1. In how many ways can the seven questions now be chosen? [3]

Past experience has shown that when seeds of a certain type are planted, on average 90% will germinate. A gardener plants 10 of these seeds in a tray and waits to see how many will germinate.
(i) Name an appropriate distribution with which to model the number of seeds that germinate, giving the value(s) of any parameters. State any assumption(s) needed for the model to be valid. [4]
(ii) Use your model to find the probability that fewer than 8 seeds germinate. [2]

Later the gardener plants 20 trays of seeds, with 10 seeds in each tray.
(iii) Calculate the probability that there are at least 19 trays in each of which at least 8 seeds germinate. [4]


8   The table shows the population, x million, of each of nine countries in Western Europe together with the population, y million, of its capital city.

Country            x       y
Germany            82.1    3.5
United Kingdom     59.2    7.0
France             59.1    9.0
Italy              56.7    2.7
Spain              39.2    2.9
The Netherlands    15.9    0.8
Portugal           9.9     0.7
Austria            8.1     1.6
Switzerland        7.3     0.1

[n = 9, Σx = 337.5, Σx² = 18 959.11, Σy = 28.3, Σy² = 161.65, Σxy = 1533.76.]

(i) (a) Calculate Spearman's rank correlation coefficient, rs. [5]
(b) Explain what your answer indicates about the populations of these countries and their capital cities. [1]
(ii) Calculate the product moment correlation coefficient, r. [2]

The data are illustrated in the scatter diagram.

(iii) By considering the diagram, state the effect on the value of the product moment correlation coefficient, r, if the data for France and the United Kingdom were removed from the calculation. [1]
(iv) In a certain country in Africa, most people live in remote areas and hence the population of the country is unknown. However, the population of the capital city is known to be approximately 1 million. An official suggests that the population of this country could be estimated by using a regression line drawn on the above scatter diagram.
(a) State, with a reason, whether the regression line of y on x or the regression line of x on y would need to be used. [2]
(b) Comment on the reliability of such an estimate in this situation. [2]
