ISOM2002 STATISTICS AND DATA ANALYSIS
Supplementary Exercises
(Part A)
1. For each of the following, (1) indicate whether the type of data collected is categorical or
numerical; and (2) if it is numerical, determine whether it is discrete or continuous.
a. The amount of time that a student spends studying.
b. The prefix-code (e.g. ISOM2002) of the university courses.
c. The faculty that a student registers in.
d. The temperature for the summer months in Macau.
e. The number of courses that students take in one semester.
2. The Computer Security Institute (CSI) conducts an annual survey of computer crime at
Macao businesses. CSI sends survey questionnaires to computer security personnel at all
Macao corporations and government agencies. In 2006, 616 organizations responded to
the CSI survey. Fifty-two percent of the respondents admitted to unauthorized use of
computer systems at their firms during the year.
a. Identify the population of interest to CSI.
b. Is the variable measured in the CSI survey quantitative or qualitative?
c. What is the relevant sample?
3. A researcher at LFC University would like to study the food services provided by the
three residential colleges at the university. Eighty students living in those three residential
colleges were randomly selected and asked to classify the quality of their college food
service as “poor”, “average”, “good” or “excellent”. Among the data that the
researcher has collected, 37 of the 80 respondents indicated that the food services
were “poor”.
a. Identify the population of interest to the researcher.
b. Identify the variable of interest to the researcher.
c. Is the variable stated in part (b) numerical or categorical?
d. Based on the data collected in the study, what is the value of the sample statistic that
the researcher can use in the analysis?
e. What statistical inference can the researcher make based on the sample statistic
in part (d)?
4. The amounts spent (in $) in a supermarket by a random sample of 45 customers are given
below.
113 123 135 117 119 127 120 118 113
103 125 132 104 108 125 112 107 103
90 139 148 93 95 139 96 94 89
151 134 120 111 96 142 127 117 106
94 152 136 124 112 97 152 143 138
a. Construct a “less than” frequency distribution for the data. Use equal class intervals of
$15, starting at $85.
b. Plot a frequency polygon.
c. Set up a cumulative percentage distribution for the data.
d. Plot a cumulative percentage polygon.
e. Based on the results of (a) through (d), what can you conclude about the spending of
the customers if the supermarket expects its customers to spend at least $100?
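A hand-built table for parts (a) and (c) can be cross-checked with a short script. The following is a minimal sketch in Python, assuming classes of the form "lower limit but less than upper limit"; the names class_counts, START and WIDTH are illustrative and not part of the exercise.

```python
from itertools import accumulate

# Spending data from Question 4 (45 customers, in $).
spending = [
    113, 123, 135, 117, 119, 127, 120, 118, 113,
    103, 125, 132, 104, 108, 125, 112, 107, 103,
    90, 139, 148, 93, 95, 139, 96, 94, 89,
    151, 134, 120, 111, 96, 142, 127, 117, 106,
    94, 152, 136, 124, 112, 97, 152, 143, 138,
]


def class_counts(values, start, width):
    """Count integer values per class [lower, lower + width), beginning at start."""
    n_classes = (max(values) - start) // width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - start) // width] += 1
    return counts


START, WIDTH = 85, 15  # class width and starting point stated in part (a)
counts = class_counts(spending, START, WIDTH)
cumulative = list(accumulate(counts))  # running totals for part (c)

for i, (count, cum) in enumerate(zip(counts, cumulative)):
    lower = START + i * WIDTH
    pct = 100 * cum / len(spending)
    print(f"{lower} but less than {lower + WIDTH}: {count:2d}   cumulative {pct:5.1f}%")
```

The integer floor division assigns a value equal to a class boundary to the upper class, which matches the "less than" convention. The helper assumes integer data, so it would need adjusting for the decimal times in Question 5.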
5. The amount of time (in seconds) needed to complete a critical task on an assembly line
was measured for a sample of 50 assemblies. These data are as follows:
30.3 31.9 34.4 32.8 31.1 32.2 30.7 36.8 30.5 30.6
34.5 33.1 30.1 31.0 30.7 30.9 30.7 30.2 30.6 37.9
31.1 31.1 34.6 30.2 33.1 32.1 30.6 31.5 30.2 30.3
30.9 30.0 31.6 30.2 34.4 34.2 30.2 30.1 31.4 34.1
33.7 32.7 32.4 32.8 31.0 30.7 33.4 35.7 30.7 30.4
a. Construct a “less than” frequency distribution. Use an equal class width of 1.2,
starting at 30.0.
b. Draw a frequency histogram.
c. Describe the skewness of the data.
6. The following is a “less than” frequency distribution of the number of daily automobile
accidents reported in a given month.
Number of Daily Accidents     Number of Days
0 but less than 4             12
4 but less than 8              9
8 but less than 12             6
12 but less than 16            2
16 but less than 20            1
a. Draw the histogram for the given frequency distribution, labelling all important
values and axis titles.
b. Construct the corresponding cumulative percentage distribution and draw the
percentage ogive.
c. Based on your percentage ogive only, estimate the percentage of days on which five or
more accidents occur.
7. The Hillside Bowling Alley manager has selected a random sample of his league
customers. He asked them to record the number of lines they bowl during the month of
December, including both league and open bowling. The sample of 20 people produced
the following data:
13 22 12 9 16 17 16 12 12 20
11 16 15 12 12 14 32 12 18 15
a. Are these data qualitative or quantitative? If they are quantitative, determine whether
they are discrete or continuous.
b. Find the mean, median and mode for these data.
c. For these data, which measure provides the best measure of the central tendency of
the data? Discuss.
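The three measures in part (b) can be verified against a hand calculation with Python's standard statistics module. This is only a sketch for checking; the variable name lines_bowled is an illustrative choice.

```python
import statistics

# Lines bowled in December by the 20 sampled customers (Question 7).
lines_bowled = [13, 22, 12, 9, 16, 17, 16, 12, 12, 20,
                11, 16, 15, 12, 12, 14, 32, 12, 18, 15]

mean = statistics.mean(lines_bowled)      # arithmetic mean
median = statistics.median(lines_bowled)  # average of the two middle values when n is even
mode = statistics.mode(lines_bowled)      # most frequent value

print(f"mean = {mean}, median = {median}, mode = {mode}")
```

Comparing the three results, and noting how far the single value 32 sits from the rest of the data, is a useful starting point for the discussion in part (c).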
8. The following table summarizes the number of cups of coffee consumed on a particular
day by a sample of 80 men.
Number of Cups of Coffee     Number of Men
0                            24
1                            32
2                            16
3                             6
4                             2
a. Compute the mean and the standard deviation of the number of cups of coffee
consumed for the 80 men.
b. For another sample of 50 women, the mean and the standard deviation of the number
of cups of coffee consumed on the same day are 0.85 and 0.812 respectively. Which
sample has a more stable consumption of coffee on that day? Justify your answer.
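For data given as a frequency table, the mean and sample standard deviation weight each value by its frequency. The sketch below assumes the sample (n - 1) formula is intended and that "more stable" is judged by the coefficient of variation (standard deviation divided by mean); both are assumptions about the intended method, not statements from the exercise.

```python
from math import sqrt

# Frequency table from Question 8: cups of coffee -> number of men.
cups = [0, 1, 2, 3, 4]
men = [24, 32, 16, 6, 2]

n = sum(men)
mean_men = sum(x * f for x, f in zip(cups, men)) / n

# Sample variance (divisor n - 1), weighting each squared deviation by its frequency.
var_men = sum(f * (x - mean_men) ** 2 for x, f in zip(cups, men)) / (n - 1)
sd_men = sqrt(var_men)

# Coefficient of variation: spread relative to the mean (assumed stability measure).
cv_men = sd_men / mean_men
cv_women = 0.812 / 0.85  # summary figures given in part (b)

print(f"men:   mean = {mean_men:.3f}, sd = {sd_men:.3f}, CV = {cv_men:.1%}")
print(f"women: CV = {cv_women:.1%}")
```

A relative measure is used because the two samples have different means, so their standard deviations are not directly comparable.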
9. The following table shows the number of used cars sold per month by Holden’s Auto in a
sample of 30 months.
4 8 6 12 7 4 3 3 17 7
9 5 8 4 5 3 3 9 10 5
5 23 7 4 5 7 10 5 4 8
a. Compute the mean and median of this sample.
b. Comment on the shape of this distribution by comparing the mean to the median.
c. Another company, Bill’s Used Cars, has a sample mean of 6.5 cars sold per month and a
sample standard deviation of 2.3 cars over the same 30-month period. Which of the
two companies had more sales in these 30 months? Why?
d. Which of the two companies had more consistent sales in these 30 months? Why?
10. A company is considering changing its starting time from 8:00 a.m. to 7:30 a.m. A census of
the company’s 1,200 office and production workers shows that 370 of its 750 production
workers favor the change and that a total of 715 workers favor the change. To further assess
worker opinion, the regional manager decides to talk with randomly selected workers.
a. Summarize the data into a contingency table.
b. What is the probability a randomly selected worker will be in favor of the change?
c. What is the probability a randomly selected worker will be against the change and be
an office worker?
d. Are job type and opinion statistically independent? Explain.
11. Of 10,000 students at a college, 2,500 have a Mastercard, 4,000 have a VISA, and 1,000
have both.
a. Construct and fill in a Venn diagram summarizing the credit card data.
b. Construct and fill in a contingency table summarizing the credit card data.
c. Of those who have a Mastercard, what is the probability that a randomly selected
student also has a VISA?
12. A serious disease occurs in 1% of a certain population. A diagnostic test is available to
help screen for the disease. A positive test result is taken as an indication that the
disease may be present and that further tests are necessary; a negative result suggests
that no further tests are necessary. It has been found that 95% of those with the disease
have a positive test result and 5% of those who do not have the disease have a positive
test result.
a. Produce a decision tree diagram based on the information provided above and list the
probabilities along all the branches.
b. Based on your tree diagram only, calculate the probability that a randomly chosen
member of the population has a positive test.
c. Based on your tree diagram only, calculate the probability that a person has the
disease given that this person’s test result is negative.
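The tree-diagram arithmetic for parts (b) and (c) can be checked numerically: each path probability is the product of the probabilities along its branches, and the conditional probability follows from Bayes' rule. The following is a minimal sketch; all variable names are illustrative.

```python
# Branch probabilities stated in Question 12.
p_disease = 0.01             # P(D)
p_pos_given_disease = 0.95   # P(+ | D)
p_pos_given_healthy = 0.05   # P(+ | D')

# Joint probability of each of the four paths through the tree.
p_d_pos = p_disease * p_pos_given_disease
p_d_neg = p_disease * (1 - p_pos_given_disease)
p_h_pos = (1 - p_disease) * p_pos_given_healthy
p_h_neg = (1 - p_disease) * (1 - p_pos_given_healthy)

# (b) Total probability of a positive test: add the two "positive" paths.
p_pos = p_d_pos + p_h_pos

# (c) Bayes' rule: P(D | -) = P(D and -) / P(-).
p_disease_given_neg = p_d_neg / (p_d_neg + p_h_neg)

print(f"P(+) = {p_pos:.4f}")
print(f"P(D | -) = {p_disease_given_neg:.6f}")
```

The four path probabilities sum to 1, which is a quick consistency check on any tree drawn by hand.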
13. In a local university, ⅖ of the students are business majors, ⅖ of the students are
science/social science majors, and the remaining ⅕ are liberal arts majors. ⅔ of the
business majors, ⅓ of the science/social science majors, and half of the liberal arts
majors are women. A student is randomly selected from this university. Find the
following probabilities.
a. The probability that the selected student is a man.
b. The probability that the selected student is a woman and studying liberal arts.
c. If the selected student is a woman, the probability that she is a business major.
d. The probability that the selected student is either a woman or studying business.
14. Many medical researchers have conducted experiments to examine the relationship
between cigarette smoking and cancer. Consider an individual randomly selected from
the adult male population. Let A represent the event that the individual smokes. Let B
represent the event that the individual develops cancer. Then the four events associated
with the experiment and their probabilities for a certain city are given in the following
joint probability table.
                       Develops Cancer
                       Yes, B     No, B’
Smoker     Yes, A      0.05       0.20
           No, A’      0.03       0.72
a. What is the probability that a randomly selected adult male neither smokes nor
develops cancer?
b. What is the probability that a randomly selected adult male develops cancer given
that he smokes?
c. What is the probability that a randomly selected adult male develops cancer given
that he does not smoke?
d. Are the events “Smoker” and “Develops Cancer” mutually exclusive? Justify your
answer statistically.
e. Are the events “Smoker” and “Develops Cancer” independent? Justify your answer
statistically.
f. Suppose that 20 adult male smokers are chosen at random. What is the probability that
fewer than three of them develop cancer?
15. The manager of Nu-Look Car Wash would like to know the summary statistics of the
number of cars that arrived at the car wash each day over the last 100 days. Based on his
observations, he has found the following frequency distribution.
Cars                       Number of Days
0 but less than 10          8
10 but less than 20        16
20 but less than 30        35
30 but less than 40        25
40 but less than 50        16
a. Using the midpoint of each class to represent the values in the class, set up a
probability distribution for the number of cars arriving at the car wash.
b. Determine the expected number of cars to arrive at the car wash.
c. Determine the standard deviation.
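The expected value and standard deviation of a discrete distribution follow from E(X) = sum of x * P(x) and Var(X) = sum of (x - E(X))^2 * P(x). The sketch below applies these to the class midpoints, treating the 100 observed days as the relative frequencies that define the probabilities; the variable names are illustrative.

```python
from math import sqrt

# Class midpoints and day counts from the table in Question 15.
midpoints = [5, 15, 25, 35, 45]
days = [8, 16, 35, 25, 16]

total_days = sum(days)
probs = [d / total_days for d in days]  # relative frequencies used as P(x)

# Expected value: probability-weighted average of the midpoints.
expected = sum(x * p for x, p in zip(midpoints, probs))

# Variance of the distribution (no n - 1 correction for a probability distribution).
variance = sum(p * (x - expected) ** 2 for x, p in zip(midpoints, probs))
std_dev = sqrt(variance)

for x, p in zip(midpoints, probs):
    print(f"x = {x:2d}   P(x) = {p:.2f}")
print(f"E(X) = {expected:.2f}, SD(X) = {std_dev:.4f}")
```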
16. According to the regulation of a certain university, an undergraduate student may take a
course more than once and only the course grade of the last attempt will appear on the
student’s transcript. Assume that a student took the course “Business Mathematics”
several times and that the probability of him passing the course on each attempt is 0.45.
a. What is the probability that he passed the course exactly twice in five attempts?
b. What is the probability that he passed the course for the first time on his third attempt?
c. What is the probability that he passed the course at least two times out of five
attempts?
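Parts (a) and (c) can be modeled with the binomial distribution and part (b) with a sequence of two failures followed by a success, assuming attempts are independent with a constant pass probability of 0.45. That independence is an assumption of this sketch, and the helper name binomial_pmf is illustrative.

```python
from math import comb

P_PASS = 0.45  # probability of passing on any single attempt (from the question)
N = 5          # number of attempts considered in parts (a) and (c)


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# (a) Exactly two passes in five attempts.
p_exactly_two = binomial_pmf(2, N, P_PASS)

# (b) First pass on the third attempt: fail, fail, then pass.
p_first_pass_third = (1 - P_PASS) ** 2 * P_PASS

# (c) At least two passes: complement of zero or one pass.
p_at_least_two = 1 - binomial_pmf(0, N, P_PASS) - binomial_pmf(1, N, P_PASS)

print(f"(a) {p_exactly_two:.4f}")
print(f"(b) {p_first_pass_third:.4f}")
print(f"(c) {p_at_least_two:.4f}")
```

The same helper applies to the binomial questions elsewhere in this sheet (Question 14 part (f) and Question 17 part (h)) once the appropriate success probability has been identified.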
17. The number of students taking the SAT has risen over the years. Students are allowed to
repeat the test in hopes of improving the score that is sent to college and university
admission offices. The following table summarizes the number of times the SAT was
taken and the number of students in the year 2016.
Number of Times     Number of Students
1                   721,769
2                   601,325
3                   166,736
4                    22,299
5                     6,730
a. Let X be a random variable indicating the number of times a student takes the SAT.
Set up the probability distribution for this random variable.
b. What is the population in this study?
c. What is the population size?
d. What is the probability that a student takes the SAT more than one time?
e. What is the probability that a student takes the SAT three or more times?
f. What is the expected value of the number of times the SAT is taken? What is your
interpretation of this expected value?
g. What is the standard deviation of this probability distribution?
h. If 10 individuals were selected at random from this group of students, what is the
probability that more than 2 of them took the SAT more than once?