SEHH2031 Statistics Exercises for Chapter 11
SEHH2031 Statistics - Chapter 11
Tutorial Exercises
(Correct the final answers to 4 decimal places whenever appropriate.)
1. A researcher has conducted a survey on the relationship between the age of a copy
machine (𝑋) and its monthly maintenance cost (𝑌). A sample of 10 copy machines
was selected and the data are summarized as follows.
Machine     1    2    3    4    5    6    7    8    9    10
Age (𝑋)     1    2    3    4    4    6    5    6    3    4
Cost (𝑌)    64   80   72   92   91   101  95   98   65   92
Assume a linear relationship between the age of a copy machine and its monthly
maintenance cost.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Interpret the meaning of the slope of the least squares regression line. (2 marks)
(c) Predict the average maintenance cost of a 3.5-year-old copy machine. (2 marks)
(d) Determine the coefficient of correlation and interpret its meaning. (4 marks)
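A hand calculation for Question 1 can be checked with a short Python sketch. It assumes the usual textbook formulas b1 = Sxy / Sxx and b0 = ȳ − b1·x̄; the helper name fit_line is hypothetical and not part of the course material.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    # Corrected sum of squares of x and corrected sum of cross-products
    s_xx = sum_xx - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    b1 = s_xy / s_xx                     # slope
    b0 = sum_y / n - b1 * sum_x / n      # intercept
    return b0, b1


# Data from Question 1 (age in years, monthly maintenance cost)
age = [1, 2, 3, 4, 4, 6, 5, 6, 3, 4]
cost = [64, 80, 72, 92, 91, 101, 95, 98, 65, 92]

b0, b1 = fit_line(age, cost)
print(f"y-hat = {b0:.4f} + {b1:.4f} x")
print(f"predicted cost at x = 3.5: {b0 + b1 * 3.5:.4f}")
```

The same helper should work for any of the questions that list raw (𝑋, 𝑌) pairs; only the two lists need to change.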
2. A study was conducted to find out the relationship between the driver’s age (𝑋) and
the number of accidents he or she has over a one-year period (𝑌). Data were collected
from 10 drivers and are presented in the following table.
Driver                     1    2    3    4    5    6    7    8    9    10
Driver’s age (𝑋)           15   25   19   18   24   25   33   17   39   45
Number of accidents (𝑌)    4    2    7    5    0    1    1    2    1    1
Assume a linear relationship between a driver’s age and the number of accidents he
or she has over a one-year period.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Interpret the meaning of the slope of the least squares regression line. (2 marks)
(c) Predict the number of accidents of a driver who is 32. (2 marks)
(d) Determine the coefficient of determination and interpret its meaning. (4 marks)
3. A researcher wanted to determine the relationship between the typing speed (in
words per minute) of a clerk (𝑋) and the time (in hours) that the clerk takes to learn a
new word processing program (𝑌). A random sample of ten clerks was selected. The
summary statistics are presented below.
∑ 𝑥 = 712, ∑ 𝑦 = 42, ∑ 𝑥² = 52566, ∑ 𝑦² = 209.5, ∑ 𝑥𝑦 = 2757
Assume a linear relationship between the typing speed of a clerk and the time that
the clerk takes to learn a new word processing program.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Interpret the meaning of the slope of the least squares regression line. (2 marks)
(c) If a clerk has a typing speed of 72 words per minute, predict the time that the
clerk takes to learn a new word processing program. (2 marks)
(d) Determine the coefficient of determination and interpret its meaning. (4 marks)
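When only summary statistics are given, as in Question 3, the same line can be obtained directly from the sums. The sketch below assumes the standard formulas Sxx = ∑x² − (∑x)²/n, Syy = ∑y² − (∑y)²/n, Sxy = ∑xy − (∑x)(∑y)/n and r² = Sxy² / (Sxx·Syy); the function name line_from_sums is a hypothetical helper.

```python
def line_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Return (b0, b1, r_squared) computed from summary statistics only."""
    s_xx = sum_xx - sum_x ** 2 / n
    s_yy = sum_yy - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    b1 = s_xy / s_xx                      # slope
    b0 = sum_y / n - b1 * sum_x / n       # intercept
    r_squared = s_xy ** 2 / (s_xx * s_yy) # coefficient of determination
    return b0, b1, r_squared


# Summary statistics from Question 3 (n = 10 clerks)
b0, b1, r_squared = line_from_sums(10, 712, 42, 52566, 209.5, 2757)
print(f"y-hat = {b0:.4f} + ({b1:.4f}) x")
print(f"prediction at x = 72: {b0 + b1 * 72:.4f}")
print(f"r^2 = {r_squared:.4f}")
```

Questions 4, 5, 9 and 10 supply the same five sums, so they can be checked by changing the arguments (remember that n is 8 in some of them).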
4. A medical researcher has conducted a study on the relationship between the number
of grams of fat a dieter consumes per day (𝑋) and the dieter’s cholesterol level (𝑌). A random
sample of eight dieters was selected. The summary statistics are presented below.
∑ 𝑥 = 67.2, ∑ 𝑦 = 1744, ∑ 𝑥² = 584.74, ∑ 𝑦² = 388848, ∑ 𝑥𝑦 = 14899.5
Assume a linear relationship between the number of grams of fat a dieter
consumes per day and the dieter’s cholesterol level.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Interpret the meaning of the slope of the least squares regression line.
(c) Predict the cholesterol level of a dieter who consumes 8.7 grams of fat per day. (2 marks)
(d) Determine the coefficient of determination and interpret its meaning. (4 marks)
5. A doctor is interested in the association between the dosage (in grams) of a medicine
(𝑋) and the time required (in hours) for a patient’s recovery (𝑌). A random sample of
ten patients was selected. The summary statistics are presented below.
∑ 𝑥 = 14.3, ∑ 𝑦 = 250, ∑ 𝑥² = 21.41, ∑ 𝑦² = 7300, ∑ 𝑥𝑦 = 334.9
Assume a linear relationship between the dosage of a medicine and the time required
for a patient’s recovery.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Predict the recovery time for a patient given 1.1 grams of the drug.
(c) Find the prediction error if a patient was given 1.1 grams of the drug and his
recovery time was 26 hours. (2 marks)
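The prediction error asked for in Question 5 can be illustrated with the sketch below. It assumes the common convention that the error (residual) is the observed value minus the predicted value; if the course defines it the other way round, the sign flips. All variable names are illustrative.

```python
# Summary statistics from Question 5 (n = 10 patients)
n = 10
sum_x, sum_y = 14.3, 250
sum_xx, sum_xy = 21.41, 334.9

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
b1 = s_xy / s_xx                    # slope
b0 = sum_y / n - b1 * sum_x / n     # intercept

x_new = 1.1        # dosage in grams
y_observed = 26    # observed recovery time in hours

y_predicted = b0 + b1 * x_new
# Assumed convention: error = observed - predicted
prediction_error = y_observed - y_predicted

print(f"predicted recovery time: {y_predicted:.4f} hours")
print(f"prediction error: {prediction_error:.4f} hours")
```

A negative error under this convention means the patient recovered faster than the fitted line predicts.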
6. A research assistant would like to know whether there is any relationship between
the number of cups of coffee an adult drinks per day (𝑋) and the adult’s stress level (𝑌).
Stress level is measured on a scale of 1 to 10; a higher score indicates a higher stress
level. The data are provided in the following table.
Adult                           1    2    3    4    5    6    7    8    9    10
Number of cups of coffee (𝑋)    2    4    5    6    5    1    7    6    2    4
Stress level (𝑌)                3    5    3    9    3    2    10   8    3    8
Assume a linear relationship between the number of cups of coffee an adult drinks
per day and the adult’s stress level.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Predict the stress level if an adult drinks 2 cups of coffee per day.
(c) Find the prediction error if an adult drinks 2 cups of coffee per day and his stress
level is 3.
(d) Compute the coefficient of determination.
7. The following data show the IQ scores and Economics exam scores of 10 students.
Student                       1    2    3    4    5    6    7    8    9    10
IQ scores (𝑋)                 142  98   105  120  119  114  102  112  111  117
Economics exam scores (𝑌)     79   38   45   72   64   55   38   45   38   46
Assume a linear relationship between the IQ scores and Economics exam scores of the
students.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) Predict the Economics exam score of a student who has an IQ score of 98.
(c) Find the prediction error if the student has an IQ score of 98 and an Economics exam
score of 38.
(d) What percentage of the variation in the Economics exam scores can be explained
by the IQ scores?
8. Dr. Chan wants to know whether a newborn baby’s weight (in pounds) (𝑌) can be
predicted from the weight of the baby’s father (in pounds) (𝑋). Data were collected and
are presented in the following table.
Baby                           1    2    3    4    5    6    7    8
Weight of baby’s father (𝑋)    178  162  189  212  195  140  201  211
Newborn baby’s weight (𝑌)      6.8  8.9  9.1  7.5  8.2  9.6  7.0  8.5
Assume a linear relationship between a newborn baby’s weight and the weight of the
baby’s father.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) What percentage of the variation in the newborn baby’s weight can be explained
by the weight of the baby’s father?
(c) What percentage of the variation in the newborn baby’s weight cannot be
explained by the weight of the baby’s father?
(d) Find the standard error of estimate.
9. The following summary shows the number of absences (𝑋) and the mid-term test score in
a statistics subject (𝑌) for 8 selected students.
∑ 𝑥 = 44, ∑ 𝑦 = 560, ∑ 𝑥² = 340, ∑ 𝑦² = 45050, ∑ 𝑥𝑦 = 2383
Assume a linear relationship between the number of absences and the mid-term test
score in the statistics subject.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) What percentage of the variation in the mid-term test score in the statistics
subject can be explained by the number of absences?
(c) What percentage of the variation in the mid-term test score in the statistics
subject cannot be explained by the number of absences?
(d) Find the standard error of estimate.
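The explained and unexplained percentages and the standard error of estimate in Question 9 can be sketched as follows. This assumes the common textbook definitions SSE = Syy − b1·Sxy and s_e = √(SSE / (n − 2)); if the course notes use a different divisor, adjust accordingly. Variable names are illustrative.

```python
import math

# Summary statistics from Question 9 (n = 8 students)
n = 8
sum_x, sum_y = 44, 560
sum_xx, sum_yy, sum_xy = 340, 45050, 2383

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b1 = s_xy / s_xx
r_squared = s_xy ** 2 / (s_xx * s_yy)   # proportion of variation explained
unexplained = 1 - r_squared             # proportion of variation not explained

# Assumed definition: the residual sum of squares divided by n - 2
sse = s_yy - b1 * s_xy
std_error = math.sqrt(sse / (n - 2))

print(f"explained:   {100 * r_squared:.2f}%")
print(f"unexplained: {100 * unexplained:.2f}%")
print(f"standard error of estimate: {std_error:.4f}")
```

Parts (b) and (c) always sum to 100%, which gives a quick consistency check on a hand calculation.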
10. The following summary shows the stress level (𝑋) and average number of daily
calories (𝑌) in the past week of 8 selected adults. Stress level ranges from 1 to 10, with
higher scores indicating a higher level of stress.
∑ 𝑥 = 40, ∑ 𝑦 = 11500, ∑ 𝑥² = 252, ∑ 𝑦² = 18040000, ∑ 𝑥𝑦 = 65400
Assume a linear relationship between the stress level and average number of daily
calories.
(a) Find the least squares regression line for the data shown above. (8 marks)
(b) What percentage of the variation in the average number of daily calories in the
past week can be explained by the stress level of an adult?
(c) What percentage of the variation in the average number of daily calories in the
past week cannot be explained by the stress level of an adult?
(d) Find the standard error of estimate.
Supplementary Exercises
(Correct the final answers to 4 decimal places whenever appropriate.)
1. A business manager wants to develop a regression equation to predict customers’
loyalty to the company’s products (Y), measured as an index, from their age (X). Data were
collected from 10 customers and are presented in the following table.
x 35.2 42.7 36 34.1 45.2 37.5 41.2 29.3 36.6 37.3
y 64.2 70.4 74.5 57.3 71.8 57.6 65.8 45.8 68.6 65.2
(a) Assuming a linear relationship between the age of customers and their loyalty,
find the least squares regression line for the data shown above. (8 marks)
(b) Interpret the meaning of the slope of the least squares regression line. (2 marks)
(c) Predict the average loyalty of a customer who is aged 38. (2 marks)
(d) Determine the coefficient of correlation and interpret its meaning. (4 marks)
2. A secondary school teacher has conducted a survey on the relationship between the age of
secondary school students (X) and their weight in kg (Y). A sample of eight
students was selected. The data are summarized as follows.
Age 19 16 14 13 15 16 18 14
Weight (kg) 65 60 55 45 58 63 75 53
(a) Assuming a linear relationship between the age of students and their weight, find
the least squares regression line for the data shown above.
(b) Interpret the meaning of the slope of the least squares regression line.
(c) Predict the average weight of a student who is aged 17.
(d) Determine the coefficient of determination and interpret its meaning.
3. A class teacher has conducted a survey on the relationship between the amount of
time spent on revisions (in hours) and the corresponding exam marks. A sample of
ten students was selected. The data are summarized as follows.
Time Spent (x)    4    9    10   14   4    7    12   22   1    17
Exam Marks (y)    31   58   65   73   37   44   60   91   21   84
(a) Find the equation of the least squares line that approximates the relationship
between the exam marks and the time spent on revision.
(b) Interpret the meaning of the slope of the least squares regression line.
(c) Predict the average exam mark of a person who spent 14 hours on revisions.
(d) Determine the coefficient of correlation and interpret its meaning.
4. A physiologist has conducted a survey on the relationship between the age of
students and their Intelligence Quotient (IQ) score. A sample of eight students was
selected and the data are summarized as follows.
Age 15 13 8 6 18 9 10 7
IQ Score 108 113 104 90 117 98 101 96
(a) Assuming a linear relationship between the age of students and their IQ score,
find the least squares regression line for the data shown above.
(b) Interpret the meaning of the slope of the least squares regression line.
(c) Predict the average IQ score of a 19-year-old student.
(d) Determine the coefficient of correlation and interpret its meaning.
5. A marketing analyst at a small sports company believes that the sales (Y) in
dollars for a particular month are directly related to the staff performance commission
(X) in dollars in that month. A random sample of 8 pairs of measurements has been
collected in this regard and the summary statistics are presented below.
∑ x = 237, ∑ y = 2320, ∑ x² = 8641, ∑ xy = 73880, ∑ y² = 689600
Assume a linear relationship between the sales for a particular month and the staff
performance commission.
(a) Find the least squares regression line for the data shown above.
(b) Interpret the meaning of the slope of the least squares regression line.
(c) If the analyst proposes to spend $38 on staff performance commission in the coming
month, what are the expected sales?
(d) Determine the coefficient of correlation and interpret its meaning.
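For the coefficient of correlation in Question 5, a minimal sketch is given below. It assumes the summary line is read as ∑x = 237, ∑y = 2320, ∑x² = 8641, ∑xy = 73880 and ∑y² = 689600 with n = 8, and uses the standard formula r = Sxy / √(Sxx·Syy). Variable names are illustrative.

```python
import math

# Assumed reading of the summary statistics in Supplementary Question 5
n = 8
sum_x, sum_y = 237, 2320
sum_xx, sum_yy, sum_xy = 8641, 689600, 73880

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# r carries the same sign as Sxy, and therefore the same sign as the slope
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.4f}")
```

A value of r close to +1 or −1 indicates a strong linear relationship, and squaring r gives the coefficient of determination asked for in other questions.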
6. The following data show the production volumes and total cost data for a
manufacturing operation:
Volume 400 450 550 600 700 750
Cost 4000 5000 5400 5900 6400 7000
(a) Use the data to develop the least squares regression equation that could be used
to predict the total cost for a given production volume.
(b) What is the cost per unit produced?
(c) Compute the coefficient of determination.
(d) What percentage of the variation in total cost can be explained by production
volume?
7. Suppose an engineer’s salary level (in $1,000) is related to the corresponding years of
experience as shown in the following table:
X 1 3 4 4 6 8 10 10 11 13
Y 40 49 46 51 52 56 60 62 59 68
(a) Use the data to develop a least squares regression equation that can be used to
predict the salary of an engineer given the years of experience.
(b) Predict the salary of an engineer with 9 years of experience.
(c) Compute the coefficient of determination and interpret its meaning.
8. The following data show the quality rating and the price of a certain product:
Rating 4 4 4 3 2.5 4 3 2 3
Price ($) 531 466 630 265 200 665 402 194 349
(a) Use the data to develop a least squares regression equation.
(b) Provide an interpretation for the slope of the estimated regression.
(c) Predict the price for a certain product with a quality rating of 3.5.
(d) Compute the coefficient of determination and interpret its meaning.
9. The following data show the late arrival and late departure rates in a certain airport:
Arrivals (%) 24 21 30 20 16 23 18 20 18
Departures (%) 22 22 29 19 16 23 19 16 18
(a) Use the data to develop a least squares regression equation.
(b) Provide an interpretation for the slope of the least squares line.
(c) Suppose the percentage of late arrivals was 22%. What is an estimate of the
percentage of late departures?
(d) Compute the coefficient of determination and interpret its meaning.
10. A mobile phone store manager wants to analyse and predict monthly revenues (Y)
from the unit price of a certain brand of new smartphone (X). Data were collected
from the past 6 months and are presented in the following table.
Monthly revenues (Y, in $1,000) 522 674 605 686 561 542
Unit price of new smartphone (X, in $1,000) 8.42 7.05 7.21 6.63 8.32 7.64
(a) Assuming a linear relationship between monthly revenues and the unit price of
new smartphone, find the least squares regression equation for the above data.
(b) Interpret the slope of the least squares line obtained in (a).
(c) Predict the monthly revenues when the unit price of new smartphone is $7,500.
(d) Find the coefficient of determination and interpret its meaning.