Basic Probability

The document discusses basic probability concepts, focusing on their application in understanding consumer behavior at Fredco Warehouse Club. It covers topics such as conditional probability, types of probability, and the use of survey data to inform marketing strategies. The document also includes examples and explanations of simple and joint probabilities, emphasizing the importance of statistical principles in predicting purchase behaviors.

Uploaded by

mdalif15603

4
Basic Probability

CONTENTS
USING STATISTICS: Probable Outcomes at Fredco Warehouse Club
4.1 Basic Probability Concepts
4.2 Conditional Probability
4.3 Ethical Issues and Probability
4.4 Bayes’ Theorem
CONSIDER THIS: Divine Providence and Spam
4.5 Counting Rules (online)
Probable Outcomes at Fredco Warehouse Club, Revisited
EXCEL GUIDE

OBJECTIVES
■ Understand basic probability concepts
■ Understand conditional probability

USING STATISTICS
Probable Outcomes at Fredco Warehouse Club

As a Fredco Warehouse Club electronics merchandise manager, you oversee the process
that selects, purchases, and markets electronics items that the club sells, and you seek
to satisfy customers’ needs. Noting new trends in the television marketplace, you have
worked with company marketers to conduct an intent-to-purchase study. The study asked the
heads of 1,000 households about their intentions to purchase a large TV, a screen size of at least
65 inches, sometime during the next 12 months.
Now 12 months later, you plan a follow-up survey with the same people to see if they did
purchase a large TV in the intervening time. For households that made such a purchase, you
would like to know if the TV purchased was HDR (high dynamic range) capable, whether they
also purchased a streaming media player in the intervening time, and whether they were satis-
fied with their large TV purchase.
You plan to use survey results to form a new marketing strategy that will enhance sales and
better target those households likely to purchase multiple or more expensive products. What
questions can you ask in this survey? How can you express the relationships among the various
intent-to-purchase responses of individual households?


The principles of probability help bridge the worlds of descriptive statistics and inferential
statistics. Probability principles are the foundation for the probability distribution,
the concept of mathematical expectation, and the binomial and Poisson distributions.
Applying probability to intent-to-purchase survey responses answers purchase behavior ques-
tions such as:
• What is the probability that a household is planning to purchase a large TV in the next year?
• What is the probability that a household will actually purchase a large TV?
• What is the probability that a household is planning to purchase a large TV and actually
purchases the TV?
• Given that the household is planning to purchase a large TV, what is the probability that
the purchase is made?
• Does knowledge of whether a household plans to purchase a large TV change the
likelihood of predicting whether the household will purchase a large TV?
• What is the probability that a household that purchases a large TV will purchase an
HDR-capable TV?
• What is the probability that a household that purchases a large TV with HDR will also
purchase a streaming media player?
• What is the probability that a household that purchases a large TV will be satisfied with
the purchase?
Answers to these questions will provide a basis to form a marketing strategy. One can consider
whether to target households that have indicated an intent to purchase or to focus on selling
TVs that have HDR or both. One can also explore whether households that purchase large TVs
with HDR can be easily persuaded to also purchase streaming media players.

4.1 Basic Probability Concepts


In everyday usage, probability, according to the Oxford English Dictionary, indicates the extent
to which something is likely to occur or exist but can also mean the most likely cause of some-
thing. If storm clouds form, the wind shifts, and the barometric pressure drops, the probability
of rain coming soon increases (first meaning). If one observes people entering an office building
with wet clothes or otherwise drenched, there is a strong probability that it is currently raining
outside (second meaning).
In statistics, probability is a numerical value that expresses the ratio between the value
sought and the set of all possible values that could occur. A six-sided die has faces for 1, 2, 3,
4, 5, and 6. Therefore, for one roll of a fair six-sided die, the set of all possible values consists of the
values 1 through 6. If the value sought is “a value greater than 4,” then the values 5 or 6 would
be sought. One would say the probability of this event is 2 outcomes divided by 6 outcomes
or 1/3.
Consider tossing a fair coin two times, with each toss landing heads (H) or tails (T). What is the probability of tossing two
tails? The set of possible values for tossing a fair coin twice is HH, TT, HT, and TH. Therefore,
the probability of tossing two tails is 1/4 because only one value (TT) matches what is being
sought, and the set of all possible values has four values.
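
The two calculations above can be checked by listing every outcome and counting the ones being sought. The following is a minimal sketch, assuming equally likely outcomes; the helper name `probability` is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """Ratio of outcomes that satisfy `event` to all equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

# One roll of a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]
p_greater_than_4 = probability(lambda face: face > 4, die)

# Two tosses of a fair coin: HH, HT, TH, TT.
two_tosses = ["".join(pair) for pair in product("HT", repeat=2)]
p_two_tails = probability(lambda tosses: tosses == "TT", two_tosses)

print(p_greater_than_4)  # 1/3
print(p_two_tails)       # 1/4
```

Using `Fraction` keeps the results as exact ratios, which matches how the text states them (2 out of 6, 1 out of 4).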

Events and Sample Spaces


When discussing probability, one uses outcomes in place of values and calls the set of all possible
outcomes the sample space. Events are subsets of the sample space, the set of all outcomes
that produce a specific result. For tossing a fair coin twice, the event “toss at least one head” is
the subset of outcomes HH, HT, and TH, and the event “toss two tails” is the subset TT. Both
events are examples of a joint event, an event that has two or more characteristics. In contrast,
a simple event has only one characteristic, an outcome that cannot be further subdivided. The
event “rolling a value greater than 4” in the first example results in the subset of outcomes 5
and 6 and is an example of a simple event because “5” and “6” represent one characteristic and
cannot be further divided.

student TIP
Events are represented by letters of the alphabet.

The complement of an event A, noted by the symbol A', is the subset of outcomes that are not
part of the event. For tossing a fair coin twice, the complement of the event “toss at least one head”
is the subset TT, while the complement of the event “toss two tails” is HH, HT, and TH.
A set of events are mutually exclusive if they cannot occur at the same time. The events “roll a
value greater than 4” and “roll a value less than 3” are mutually exclusive when rolling one fair
die. However, the events “roll a value greater than 4” and “roll a value greater than 5” are not
because both share the outcome of rolling a 6.
A set of events are collectively exhaustive if one of the events must occur. For rolling
a fair six-sided die, the events “roll a value of 3 or less” and “roll a value of 4 or more” are
collectively exhaustive because these two subsets include all possible outcomes in the sample
space. However, the set of events “roll a value of 3 or less” and “roll a value greater than 4” is
not because this set does not include the outcome of rolling a 4.

student TIP
By definition, an event and its complement are always both mutually exclusive and collectively
exhaustive.

Not all sets of collectively exhaustive events are mutually exclusive. For rolling a fair
six-sided die, the set of events “roll a value of 3 or less,” “roll an even-numbered value,” and
“roll a value greater than 4” is collectively exhaustive but is not mutually exclusive as, for
example, “a value of 3 or less” and “an even-numbered value” could both occur if a 2 is rolled.
Certain and impossible events represent special cases. A certain event is an event that is
sure to occur such as “roll a value greater than 0” for rolling one fair die. Because the subset of
outcomes for a certain event is the entire set of outcomes in the sample space, a certain event has a
probability of 1. An impossible event is an event that has no chance of occurring, such as “roll
a value greater than 6” for rolling one fair die. Because the subset of outcomes for an impossible
event is empty (it contains no outcomes), an impossible event has a probability of 0.

student TIP
A probability cannot be negative or greater than 1.
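
Because events are subsets of the sample space, the definitions above map directly onto set operations. The sketch below uses Python sets for one roll of a fair die; the function names are illustrative choices, not terms from the text.

```python
sample_space = {1, 2, 3, 4, 5, 6}

def complement(event):
    """Outcomes in the sample space that are not part of the event."""
    return sample_space - event

def mutually_exclusive(*events):
    """True if no outcome belongs to more than one of the events."""
    seen = set()
    for event in events:
        if seen & event:
            return False
        seen |= event
    return True

def collectively_exhaustive(*events):
    """True if the events together cover the whole sample space."""
    return set().union(*events) == sample_space

greater_than_4 = {5, 6}
less_than_3 = {1, 2}
three_or_less = {1, 2, 3}
four_or_more = {4, 5, 6}
even = {2, 4, 6}

print(mutually_exclusive(greater_than_4, less_than_3))               # True
print(collectively_exhaustive(three_or_less, four_or_more))          # True
print(collectively_exhaustive(three_or_less, greater_than_4))        # False, 4 is missing
print(collectively_exhaustive(three_or_less, even, greater_than_4))  # True
print(mutually_exclusive(three_or_less, even, greater_than_4))       # False, 2 is shared

# An event and its complement are both mutually exclusive and collectively exhaustive.
print(mutually_exclusive(even, complement(even)),
      collectively_exhaustive(even, complement(even)))               # True True
```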

Types of Probability
The concepts and vocabulary related to events and sample spaces are helpful to understanding
how to calculate probabilities. Also affecting such calculations is the type of probability being
used: a priori, empirical, or subjective.
In a priori probability, the probability of an occurrence is based on having prior knowl-
edge of the outcomes that can occur. Consider a standard deck of cards that has 26 red cards
and 26 black cards. The probability of selecting a black card is 26/52 = 0.50 because there
are 26 black cards and 52 total cards. What does this probability mean? If each card is replaced
after it is selected, this probability does not mean that one out of the next two cards selected will
be black. One cannot say for certain what will happen on the next several selections. However,
one can say that in the long run, if this selection process is continually repeated, the proportion
of black cards selected will approach 0.50. Example 4.1 shows another example of computing
an a priori probability.

EXAMPLE 4.1 Finding A Priori Probabilities
A standard six-sided die has six faces. Each face of the die contains either one, two, three, four,
five, or six dots. If you roll a die, what is the probability that you will get a face with five dots?
SOLUTION Each face is equally likely to occur. Because there are six faces, the probability of
getting a face with five dots is 1/6.

The preceding examples use the a priori probability approach because the number of ways
the event occurs and the total number of possible outcomes are known from the composition of
the deck of cards or the faces of the die.
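
The long-run interpretation of the 0.50 card probability can be explored with a small simulation. This is a sketch under stated assumptions: cards are drawn with replacement, Python's pseudorandom generator stands in for a fair shuffle, and the function name and seed are arbitrary choices made for illustration.

```python
import random

def long_run_proportion(num_draws, seed=0):
    """Proportion of black cards in `num_draws` selections with replacement."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    deck = ["black"] * 26 + ["red"] * 26
    black_count = sum(rng.choice(deck) == "black" for _ in range(num_draws))
    return black_count / num_draws

# Small samples can stray far from 0.50; large samples tend to settle near it.
for n in (10, 1_000, 100_000):
    print(n, long_run_proportion(n))
```

The exact proportions printed depend on the seed, which is why none are quoted here. The point is only that the proportion for a handful of draws need not be 0.50, while the proportion over many draws tends to be close to it.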
In the empirical probability approach, the probabilities are based on observed data, not on
prior knowledge of how the outcomes can occur. Surveys are often used to generate empirical
probabilities. Examples of this type of probability are the proportion of individuals in the Fredco
Warehouse Club scenario who actually purchase a large TV, the proportion of registered voters
who prefer a certain political candidate, and the proportion of students who have part-time jobs.
For example, if one conducts a survey of students, and 60% state that they have part-time jobs, then
there is a 0.60 probability that an individual student has a part-time job.

The third approach to probability, subjective probability, differs from the other two
approaches because subjective probability differs from person to person. For example, the devel-
opment team for a new product may assign a probability of 0.60 to the chance of success for
the product, while the president of the company may be less optimistic and assign a probability
of 0.30. The assignment of subjective probabilities to various outcomes is usually based on a
combination of an individual’s past experience, personal opinion, and analysis of a particular
situation. Subjective probability is especially useful in making decisions in situations in which
one cannot use a priori probability or empirical probability.

Summarizing Sample Spaces


Sample spaces can be presented in tabular form using contingency tables (see Section 2.1).
Table 4.1 in Example 4.2 summarizes a sample space as a contingency table. When used for
probability, each cell in a contingency table represents one joint event, analogous to the one joint
response when these tables are used to summarize categorical variables. For example, 200 of
the respondents correspond to the joint event “planned to purchase a large TV and subsequently
did purchase the large TV.”

EXAMPLE 4.2 Events and Sample Spaces
The Fredco Warehouse Club scenario on page 182 involves analyzing the results of an intent-to-purchase
study. Table 4.1 presents the results of the sample of 1,000 households surveyed in
terms of purchase behavior for large TVs.

TABLE 4.1 Purchase behavior for large TVs

                         ACTUALLY PURCHASED
PLANNED TO PURCHASE      Yes      No       Total
Yes                      200      50       250
No                       100      650      750
Total                    300      700      1,000

What is the sample space? Give examples of simple events and joint events.
SOLUTION The sample space consists of the 1,000 respondents. Simple events are “planned to
purchase,” “did not plan to purchase,” “purchased,” and “did not purchase.” The complement of
the event “planned to purchase” is “did not plan to purchase.” The event “planned to purchase
and actually purchased” is a joint event because in this joint event, the respondent must plan to
purchase the TV and actually purchase it.

Simple Probability
Simple probability is the probability of occurrence of a simple event A, P(A), in which each outcome
is equally likely to occur. Equation (4.1) defines the probability of occurrence for simple probability.

PROBABILITY OF OCCURRENCE

$$\text{Probability of occurrence} = \frac{X}{T} \tag{4.1}$$

where
X = number of outcomes in which the event occurs
T = total number of possible outcomes

Equation (4.1) represents what some people wrongly think is the probability of occurrence for
all probability problems. (Not all probability problems can be solved by Equation (4.1) as later
examples in this chapter illustrate.) In the Fredco Warehouse Club scenario, the collected survey
data represent an example of empirical probability. Therefore, one can use Equation (4.1) to
determine answers to questions that can be expressed as a simple probability.
For example, one question asked respondents if they planned to purchase a large TV. Using
the responses to this question, how can one determine the probability of selecting a household
that planned to purchase a large TV? From the Table 4.1 contingency table, determine the
value of X as 250, the total of the Planned-to-purchase Yes row and determine the value of T as
1,000, the overall total of respondents located in the lower right corner cell of the table. Using
Equation (4.1) and Table 4.1:

$$\text{Probability of occurrence} = \frac{X}{T}$$

$$P(\text{Planned to purchase}) = \frac{\text{Number who planned to purchase}}{\text{Total number of households}} = \frac{250}{1{,}000} = 0.25$$

Thus, there is a 0.25 (or 25%) chance that a household planned to purchase a large TV.
Example 4.3 illustrates another application of simple probability.
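
The same calculation can be sketched in code by storing the Table 4.1 cell counts and summing the row of interest. The dictionary layout and variable names below are illustrative choices; only the counts come from Table 4.1.

```python
# Counts from Table 4.1, keyed by (planned to purchase, actually purchased).
table_4_1 = {
    ("yes", "yes"): 200,
    ("yes", "no"): 50,
    ("no", "yes"): 100,
    ("no", "no"): 650,
}

# T: total number of possible outcomes (all surveyed households).
total = sum(table_4_1.values())

# X: number of outcomes in which the event "planned to purchase" occurs.
planned_yes = sum(count for (planned, _), count in table_4_1.items() if planned == "yes")

p_planned = planned_yes / total
print(planned_yes, total, p_planned)  # 250 1000 0.25
```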

EXAMPLE 4.3 Computing the Probability That the Large TV Purchased is HDR-capable
In another Fredco Warehouse Club follow-up survey, additional questions were asked of the
300 households that actually purchased large TVs. Table 4.2 indicates the consumers’ responses
to whether the TV purchased was HDR-capable and whether they also purchased a streaming
media player in the past 12 months.
Find the probability that if a household that purchased a large TV is randomly selected, the
television purchased was HDR-capable.

TABLE 4.2 Purchase behavior about purchasing an HDR-capable television and a streaming media player

                     STREAMING MEDIA PLAYER
HDR FEATURE          Yes      No       Total
HDR-capable          38       42       80
Not HDR-capable      70       150      220
Total                108      192      300

SOLUTION Using the following definitions:

A = purchased an HDR-capable TV
A′ = purchased a television not HDR-capable
B = purchased a streaming media player
B′ = did not purchase a streaming media player

$$P(\text{HDR-capable}) = \frac{\text{Number of HDR-capable TVs purchased}}{\text{Total number of TVs}} = \frac{80}{300} = 0.267$$

There is a 26.7% chance that a randomly selected large TV purchased is HDR-capable.

Joint Probability
Whereas simple probability refers to the probability of occurrence of simple events, joint
probability refers to the probability of an occurrence involving two or more events. An
example of joint probability is the probability that one will get heads on the first toss of a coin
and heads on the second toss of a coin.
In Table 4.1 on page 185, the count of the group of individuals who planned to purchase
and actually purchased a large TV corresponds to the cell that represents Planned to purchase
Yes and Actually purchased Yes, the upper left numerical cell. Because this group consists of
200 households, the probability of picking a household that planned to purchase and actually
purchased a large TV is

$$P(\text{Planned to purchase and actually purchased}) = \frac{\text{Planned to purchase and actually purchased}}{\text{Total number of respondents}} = \frac{200}{1{,}000} = 0.20$$

Example 4.4 also demonstrates how to determine joint probability.

EXAMPLE 4.4 Determining the Joint Probability That a Household Purchased an HDR-capable Large TV and Purchased a Streaming Media Player
In Table 4.2 on page 186, the purchases are cross-classified as being HDR-capable or not and
whether the household purchased a streaming media player. Find the probability that a randomly
selected household that purchased a large TV also purchased an HDR-capable TV and purchased
a streaming media player.
SOLUTION Using Equation (4.1) on page 185 and Table 4.2,

$$P(\text{HDR-capable TV and streaming media player}) = \frac{\text{Number that purchased an HDR-capable TV and purchased a streaming media player}}{\text{Total number of large TV purchasers}} = \frac{38}{300} = 0.127$$

Therefore, there is a 12.7% chance that a randomly selected household that purchased a large
TV purchased an HDR-capable TV and purchased a streaming media player.
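
Each cell of a contingency table is one joint event, so every joint probability can be obtained the same way. The sketch below does this for Table 4.2; the dictionary keys and variable names are illustrative, and only the counts are taken from the table.

```python
from fractions import Fraction

# Counts from Table 4.2, keyed by (HDR feature, purchased streaming media player).
table_4_2 = {
    ("HDR-capable", "yes"): 38,
    ("HDR-capable", "no"): 42,
    ("not HDR-capable", "yes"): 70,
    ("not HDR-capable", "no"): 150,
}

total = sum(table_4_2.values())  # 300 large TV purchasers

# Joint probability of each cell: cell count divided by the overall total.
joint = {cell: Fraction(count, total) for cell, count in table_4_2.items()}

p_hdr_and_player = joint[("HDR-capable", "yes")]
print(round(float(p_hdr_and_player), 3))  # 0.127

# The four cells cover every purchaser exactly once,
# so their joint probabilities sum to 1.
print(sum(joint.values()) == 1)  # True
```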

Marginal Probability
The marginal probability of an event consists of a set of joint probabilities. You can deter-
mine the marginal probability of a particular event by using the concept of joint probability
just discussed. For example, if B consists of two events, B1 and B2, then P(A), the probability
of event A, consists of the joint probability of event A occurring with event B1 and the joint
probability of event A occurring with event B2. Use Equation (4.2) to calculate marginal
probabilities.

MARGINAL PROBABILITY

$$P(A) = P(A \text{ and } B_1) + P(A \text{ and } B_2) + \cdots + P(A \text{ and } B_k) \tag{4.2}$$

where $B_1, B_2, \ldots, B_k$ are k mutually exclusive and collectively exhaustive events

student TIP
Mutually exclusive events cannot occur simultaneously. In a collectively exhaustive set of events,
one of the events must occur.

Using Equation (4.2) to calculate the marginal probability of “planned to purchase” a large
TV gets the same result as adding the number of outcomes that make up the simple event
“planned to purchase”:

$$\begin{aligned}
P(\text{Planned to purchase}) &= P(\text{Planned to purchase and purchased}) \\
&\quad + P(\text{Planned to purchase and did not purchase}) \\
&= \frac{200}{1{,}000} + \frac{50}{1{,}000} = \frac{250}{1{,}000} = 0.25
\end{aligned}$$
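
Equation (4.2) amounts to summing joint probabilities over the categories of the other variable. The following sketch applies it to the joint probabilities from Table 4.1; the helper `marginal` and the key labels are hypothetical names used only for this illustration.

```python
from fractions import Fraction

# Joint probabilities from Table 4.1, keyed by (planning status, purchase status).
joint = {
    ("planned", "purchased"): Fraction(200, 1000),
    ("planned", "did not purchase"): Fraction(50, 1000),
    ("did not plan", "purchased"): Fraction(100, 1000),
    ("did not plan", "did not purchase"): Fraction(650, 1000),
}

def marginal(joint_probs, position, value):
    """Sum the joint probabilities whose key has `value` at index `position`."""
    return sum(p for key, p in joint_probs.items() if key[position] == value)

p_planned = marginal(joint, 0, "planned")        # 200/1000 + 50/1000
p_purchased = marginal(joint, 1, "purchased")    # 200/1000 + 100/1000
print(float(p_planned))    # 0.25
print(float(p_purchased))  # 0.3
```

The second result matches the "Actually purchased: Yes" column total of 300 out of 1,000 in Table 4.1.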

General Addition Rule

The probability of event “A or B” considers the occurrence of either event A or event B or both A
and B. For example, how can one determine the probability that a household planned to purchase
or actually purchased a large TV?

student TIP
The key word when using the addition rule is or.

The event “planned to purchase or actually purchased” includes all households that planned
to purchase and all households that actually purchased a large TV. Examine each cell of the
Table 4.1 contingency table on page 185 to determine whether it is part of this event. From
Table 4.1, the cell “planned to purchase and did not actually purchase” is part of the event
because it includes respondents who planned to purchase. The cell “did not plan to purchase and
actually purchased” is included because it contains respondents who actually purchased. Finally,
the cell “planned to purchase and actually purchased” has both characteristics of interest. There-
fore, one way to calculate the probability of “planned to purchase or actually purchased” is
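
$$\begin{aligned}
P(\text{Planned to purchase or actually purchased}) &= P(\text{Planned to purchase and did not actually purchase}) \\
&\quad + P(\text{Did not plan to purchase and actually purchased}) \\
&\quad + P(\text{Planned to purchase and actually purchased}) \\
&= \frac{50}{1{,}000} + \frac{100}{1{,}000} + \frac{200}{1{,}000} \\
&= \frac{350}{1{,}000} = 0.35
\end{aligned}$$
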
Often, it is easier to determine P(A or B), the probability of the event A or B, by using the
general addition rule, defined in Equation (4.3).

GENERAL ADDITION RULE


The probability of A or B is equal to the probability of A plus the probability
of B minus the probability of A and B.

$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B) \tag{4.3}$$

Applying Equation (4.3) to the previous example produces the following result:

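$$\begin{aligned}
P(\text{Planned to purchase or actually purchased}) &= P(\text{Planned to purchase}) + P(\text{Actually purchased}) \\
&\quad - P(\text{Planned to purchase and actually purchased}) \\
&= \frac{250}{1{,}000} + \frac{300}{1{,}000} - \frac{200}{1{,}000} \\
&= \frac{350}{1{,}000} = 0.35
\end{aligned}$$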

The general addition rule consists of taking the probability of A and adding it to the probabil-
ity of B and then subtracting the probability of the joint event A and B from this total because the
joint event has already been included in computing both the probability of A and the probability
of B. For example, in Table 4.1, if the outcomes of the event “planned to purchase” are added
to those of the event “actually purchased,” the joint event “planned to purchase and actually
purchased” has been included in each of these simple events. Therefore, because this joint
event has been included twice, you must subtract it to compute the correct result. Example 4.5
illustrates another application of the general addition rule.
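
The double-counting argument can be verified numerically by computing P(A or B) two ways from the Table 4.1 counts: once with the general addition rule and once by adding every cell that belongs to the event. The helper `prob` and the boolean keys are illustrative conventions, not notation from the text.

```python
from fractions import Fraction

# Table 4.1 counts keyed by (planned to purchase, actually purchased).
counts = {
    (True, True): 200,
    (True, False): 50,
    (False, True): 100,
    (False, False): 650,
}
total = sum(counts.values())

def prob(predicate):
    """Probability of the cells for which predicate(planned, purchased) is true."""
    matching = sum(c for cell, c in counts.items() if predicate(*cell))
    return Fraction(matching, total)

p_a = prob(lambda planned, purchased: planned)                    # 250/1000
p_b = prob(lambda planned, purchased: purchased)                  # 300/1000
p_a_and_b = prob(lambda planned, purchased: planned and purchased)  # 200/1000

by_rule = p_a + p_b - p_a_and_b                                   # Equation (4.3)
by_cells = prob(lambda planned, purchased: planned or purchased)  # direct count

print(float(by_rule), float(by_cells))  # 0.35 0.35
```

Omitting the subtraction would give 0.55, because the 200 households in the joint cell would be counted in both P(A) and P(B).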

EXAMPLE 4.5 Using the General Addition Rule for the Households That Purchased Large TVs
In Example 4.3 on page 186, the purchases were cross-classified in Table 4.2 as TVs that were
HDR-capable or not and whether the household purchased a streaming media player. Find the
probability that among households that purchased a large TV, they purchased an HDR-capable
TV or purchased a streaming media player.
SOLUTION Using Equation (4.3):

$$\begin{aligned}
P(\text{HDR-capable TV or purchased a streaming media player}) &= P(\text{HDR-capable TV}) \\
&\quad + P(\text{purchased a streaming media player}) \\
&\quad - P(\text{HDR-capable TV and purchased a streaming media player}) \\
&= \frac{80}{300} + \frac{108}{300} - \frac{38}{300} \\
&= \frac{150}{300} = 0.50
\end{aligned}$$

Therefore, of households that purchased a large TV, there is a 50% chance that a randomly
selected household purchased an HDR-capable TV or purchased a streaming media player.

PROBLEMS FOR SECTION 4.1

LEARNING THE BASICS
4.1 A die is rolled twice.
a. Give an example of a simple event.
b. Give an example of a joint event.
c. What is the complement of getting a number 1 on the first roll?
d. What does the sample space consist of?

4.2 A basket consists of 5 red apples and 3 green apples. Two
apples are going to be taken out of the basket.
a. Give an example of a simple event.
b. What is the complement of the first apple being red?
c. What does the sample space consist of?

4.3 Consider the following contingency table:

Product      Number Sold
A            13
B            30
C            7

What is the probability of
a. A, B, and C being sold?
b. A and B being sold?
c. A or C being sold?
d. A or B or C being sold?

4.4 Consider the following contingency table, which shows how
many men and women do and do not exercise in a week:

             EXERCISE
GENDER       Yes      No
Male         17       20
Female       25       28

Find the probability of the following events:
a. Is male and does exercise during the week.
b. Is female and does not exercise during the week.
c. Is male or does exercise during the week.
d. Is female or does not exercise during the week.

APPLYING THE CONCEPTS
4.5 For each of the following, state whether the event created is a
joint event, a simple event, a certain event, or an impossible event.
a. Getting a tail on a coin toss.
b. Selecting a boy from a group of 20 girls.
c. Drawing a card that is a five and is black.
d. Selecting a red marble from a glass jar that contains 15 red
marbles.

4.6 For each of the following, state whether the events are mutually
exclusive and whether they are collectively exhaustive.
a. Employees are grouped according to their wages: less than $5,000,
$5001 to $10,000, $10,001 to $15,000, or more than $15,000.
b. The residents of a town get to know about a restaurant through different
sources: newspapers, magazines, or recommendation by friends.
c. Your favorite magazine: People, Time, Elle, or other.
d. A tourist has to select one or more leisure activities from a list
of five activities.

4.7 Which of the following events are examples of a priori probability,
empirical probability, or subjective probability?
a. Based on past experience, you know your medical bill for this
month will be over $100.
b. You select a chocolate ice cream at an ice cream parlor where
84 out of 100 people choose chocolate.
c. Using today’s temperature to forecast tomorrow’s temperature.
d. A student enrolls in the business program during an open day at
a university where 70 out of 145 visiting students enroll in the
business school.

4.8 A hospital was testing the effectiveness of two new treatments,
A and B. The following table records the number of patients for each
treatment and the effectiveness of each.

             EFFECTIVE
TREATMENT    Yes      No
A            56       27
B            39       43

a. Give an example of a simple event.
b. Give an example of a joint event.
c. What is the complement of patients who received treatment B?
d. Why is “A patient received treatment A that has been effective”
a joint event?

4.9 Referring to the contingency table in Problem 4.8, if a patient
is selected at random, answer the following:
a. What is the probability that the patient received treatment A?
b. What is the probability that the patient received treatment A and
it has been effective?
c. What is the probability that the patient received treatment B or
it has been effective?
d. Explain the difference in the results in (b) and (c).

4.10 The Global Burden of Disease Collaborative Network is a global
research collaborative that quantifies the impact of hundreds of diseases,
injuries, and risk factors around the world. A study on the total
number of child deaths (in thousand) from diarrheal diseases by different
risk factors had been done by the collaborative. The study included
children of both gender and under 5 years in the United States. The
following table summarizes data selected from the study for 2016 and
2017 on two risk factors:

             RISK FACTORS
YEAR         Unsafe Water Source      No Access to Handwashing Facility      Total
2016         894                      415                                    1,309
2017         869                      404                                    1,273
Total        1,763                    819                                    2,582
Source: Data extracted from Our World in Data, [Link]

a. Give an example of a simple event.
b. Give an example of a joint event.
c. What is the complement of a child’s death from diarrheal diseases
due to unsafe water source?
d. Why is a child’s death from diarrheal diseases due to no access
to handwashing facility in the year 2017 a joint event?

4.11 Referring to the contingency table in Problem 4.10, if a
5-year-old child is randomly selected in United States,
a. What is the probability that the child passed away due to diarrheal
diseases caused by no access to any handwashing facility?
b. What is the probability that the child passed away in 2017 due
to diarrheal diseases?
c. What is the probability that the child passed away in 2017 due
to diarrheal diseases or due to no access to any handwashing
facility?
d. Explain the difference in the results in (b) and (c).

SELF TEST 4.12 Fixed broadband encompasses any high-speed
data transmission to a residence or a business using a
variety of technologies, including cable, DSL, fiber optics, and
wireless at a fixed location. Fixed broadband refers to high-speed
internet connections that are “always on” in fixed locations. The
following data was collected for the total number of fixed broadband
subscriptions (in million) at homes and workplaces between
Australia and Canada in 2017 and 2018.

             COUNTRY
YEAR         Australia      Canada      Total
2017         7.92           13.92       21.84
2018         8.40           14.30       22.70
Total        16.32          28.22       44.54
Source: Data extracted from ITU, [Link]

If a subscription is selected at random, what is the probability that
the subscription is
a. in Australia?
b. from 2018?
c. from Canada or from 2018?
d. Explain the difference in the results in (b) and (c).

4.13 Integrated Public Use Microdata Series (IPUMS) is an effort
to inventory, preserve, harmonize, and disseminate census microdata
from around the world. In the United States, data was collected
by public-use microdata (PUMD) through an interview survey in
2018. The survey considers all males and females and categorizes
them according to the amount spent on housing, rental: A, those
who spend less than $1,000; B, those who spend $1,000 to less than
$2,000; and C, those who spend $2,000 or more. The following
table summarizes the data.

             GENDER
RENTAL       Male      Female      Total
A            101       140         241
B            145       135         280
C            84        78          162
Total        330       353         683
Source: Data extracted from United State Department of Labor, [Link]
[Link]/cex/pumd_data.htm.

If a person is randomly selected in the United States,
a. what is the probability that his or her rental is in category B?
b. what is the probability that his rental is in category C?
c. what is the probability that the person is a male or has a rental
in category C?
d. Explain the difference in the results in (b) and (c).

4.14 Consumers are aware that companies share and sell their
personal data in exchange for free services, but is it important
to consumers to have a clear understanding of a company’s
privacy policy before signing up for its service online?
According to an Axios-SurveyMonkey poll, 911 of 1,001 older
adults (aged 65+) indicate that it is important to have a clear
understanding of a company’s privacy policy before signing
up for its service online and 195 of 260 younger adults (aged
18–24) indicate that it is important to have a clear understanding
of a company’s privacy policy before signing up for its
service online.
Source: Data extracted from “Privacy policies are read by an aging few,” [Link]/2NyPAWx.
Construct a contingency table to evaluate the probabilities. What is
the probability that a respondent chosen at random
a. indicates it is important to have a clear understanding of a
company’s privacy policy before signing up for its service
online?
b. is an older adult and indicates it is important to have a clear
understanding of a company’s privacy policy before signing up
for its service online?
c. is an older adult or indicates it is important to have a clear
understanding of a company’s privacy policy before signing up for
its service online?
d. is an older adult or a younger adult?

4.15 The human resource department of a company is analyzing
whether an employee has worked at the company for more or less
than a year and their education level for the previous year. The data
collected indicated 15.8% of employees have completed their secondary
education and worked at the company for less than a year.
The data also indicated that 32.1% of the employees had worked for
less than a year and in total 62.5% of the employees at the company
had completed tertiary education.
Construct a contingency table to evaluate the probabilities of the
secondary and tertiary education-related employment. What is the
probability that an employee selected at random
a. has a tertiary education level?
b. has a tertiary education level and has worked for less than a year?
c. has a tertiary education level or has worked for less than a year?
d. has a tertiary education level or has worked for less one year or
more?

4.2 Conditional Probability


Each Section 4.1 example involves finding the probability of an event when sampling from the
entire sample space. How does one determine the probability of an event if one knows certain
information about the events involved?

Calculating Conditional Probabilities


Conditional probability refers to the probability of event A, given information about the occurrence of another event, B.

CONDITIONAL PROBABILITY
The probability of A given B is equal to the probability of A and B divided by the probability of B.

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \tag{4.4a}$$

The probability of B given A is equal to the probability of A and B divided by the probability of A.

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} \tag{4.4b}$$

where
P(A and B) = joint probability of A and B
P(A) = marginal probability of A
P(B) = marginal probability of B

In the Fredco Warehouse Club scenario, suppose one had been told that a specific household
planned to purchase a large TV. What then would be the probability that the household actually
purchased the TV?
In this example, the objective is to find P(Actually purchased | Planned to purchase), given the information that a household planned to purchase a large TV. Therefore, the sample space does not consist of all 1,000 households in the survey. It consists of only those households that planned to purchase the large TV. Of 250 such households, 200 actually purchased the large TV. Therefore, based on Table 4.1 on page 185, the probability that a household actually purchased the large TV given that they planned to purchase is

$$P(\text{Actually purchased} \mid \text{Planned to purchase}) = \frac{\text{Planned to purchase and actually purchased}}{\text{Planned to purchase}} = \frac{200}{250} = 0.80$$

student TIP
The variable that is given goes in the denominator of Equation (4.4). Because planned to purchase was the given, planned to purchase goes in the denominator.

Defining event A as Planned to purchase and event B as Actually purchased, Equation (4.4b) also calculates this result:

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$$

$$P(\text{Actually purchased} \mid \text{Planned to purchase}) = \frac{200/1{,}000}{250/1{,}000} = \frac{200}{250} = 0.80$$
Example 4.6 further illustrates conditional probability.
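
The Table 4.1 calculation can also be sketched in code. The following Python fragment is a minimal illustration rather than part of the Fredco survey materials: the dictionary layout and the function names are assumptions made for this sketch, while the four household counts are the ones reported in Table 4.1. Exact fractions are used so that the result matches the hand calculation.

```python
from fractions import Fraction

# Household counts from Table 4.1, keyed by (planned?, purchased?).
counts = {
    ("planned", "purchased"): 200,
    ("planned", "not purchased"): 50,
    ("not planned", "purchased"): 100,
    ("not planned", "not purchased"): 650,
}
total = sum(counts.values())  # 1,000 households in the survey


def joint(plan, outcome):
    """Joint probability P(plan and outcome)."""
    return Fraction(counts[(plan, outcome)], total)


def marginal_plan(plan):
    """Marginal probability of a 'plan' category, summed over outcomes."""
    return sum(joint(p, o) for (p, o) in counts if p == plan)


def conditional_outcome_given_plan(outcome, plan):
    """Equation (4.4b): joint probability divided by the marginal of the given event."""
    return joint(plan, outcome) / marginal_plan(plan)


p = conditional_outcome_given_plan("purchased", "planned")
print(p, float(p))  # 4/5 0.8
```

Dividing by the marginal probability of the given event is the same as restricting the sample space to the 250 households that planned to purchase.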

EXAMPLE 4.6 Finding the Conditional Probability of Purchasing a Streaming Media Player

Table 4.2 on page 186 is a contingency table for whether a household purchased an HDR-capable TV and whether the household purchased a streaming media player. If a household purchased an HDR-capable TV, what is the probability that it also purchased a streaming media player?

SOLUTION Because you know that the household purchased an HDR-capable TV, the sample space is reduced to 80 households. Of these 80 households, 38 also purchased a streaming media player. Therefore, the probability that a household purchased a streaming media player, given that the household purchased an HDR-capable TV, is

$$P(\text{Purchased streaming media player} \mid \text{Purchased HDR-capable TV}) = \frac{\text{Number purchasing HDR-capable TV and streaming media player}}{\text{Number purchasing HDR-capable TV}} = \frac{38}{80} = 0.475$$

Using Equation (4.4b) on page 191 and the following definitions:

A = Purchased an HDR-capable TV
B = Purchased a streaming media player

then

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} = \frac{38/300}{80/300} = 0.475$$

Therefore, given that the household purchased an HDR-capable TV, there is a 47.5% chance that the household also purchased a streaming media player. You can compare this conditional probability to the marginal probability of purchasing a streaming media player, which is 108/300 = 0.36, or 36%. These results tell you that households that purchased HDR-capable TVs are more likely to purchase a streaming media player than are households that purchased large TVs that are not HDR-capable.

Decision Trees
In Table 4.1, households are classified according to whether they planned to purchase and
whether they actually purchased large TVs. A decision tree is an alternative to the contingency
table. Figure 4.1 represents the decision tree for this example.

FIGURE 4.1 Decision tree for planned to purchase and actually purchased

Entire set of households
    Planned to purchase: P(A) = 250/1,000
        Actually purchased: P(A and B) = 200/1,000
        Did not actually purchase: P(A and B′) = 50/1,000
    Did not plan to purchase: P(A′) = 750/1,000
        Actually purchased: P(A′ and B) = 100/1,000
        Did not actually purchase: P(A′ and B′) = 650/1,000

In Figure 4.1, beginning at the left with the entire set of households, there are two
“branches” for whether or not the household planned to purchase a large TV. Each of these
branches has two subbranches, corresponding to whether the household actually purchased
or did not actually purchase the large TV. The probabilities at the end of the initial branches
represent the marginal probabilities of A and A′. The probabilities at the end of each of the
four subbranches represent the joint probability for each combination of events A and B. You
compute the conditional probability by dividing the joint probability by the appropriate mar-
ginal probability.
For example, to compute the probability that the household actually purchased, given that
the household planned to purchase the large TV, you take P(Planned to purchase and actually
purchased) and divide by P(Planned to purchase). From Figure 4.1,

$$P(\text{Actually purchased} \mid \text{Planned to purchase}) = \frac{200/1{,}000}{250/1{,}000} = \frac{200}{250} = 0.80$$
Example 4.7 illustrates how to construct a decision tree.

EXAMPLE 4.7 Constructing the Decision Tree for the Households That Purchased Large TVs

Using the cross-classified data in Table 4.2 on page 186, construct the decision tree. Use the decision tree to find the probability that a household purchased a streaming media player, given that the household purchased an HDR-capable television.

SOLUTION The decision tree for purchased a streaming media player and an HDR-capable TV is displayed in Figure 4.2.

FIGURE 4.2 Decision tree for purchased an HDR-capable TV and a streaming media player

Entire set of households
    Purchased HDR-capable TV: P(A) = 80/300
        Purchased streaming media player: P(A and B) = 38/300
        Did not purchase streaming media player: P(A and B′) = 42/300
    Did not purchase HDR-capable TV: P(A′) = 220/300
        Purchased streaming media player: P(A′ and B) = 70/300
        Did not purchase streaming media player: P(A′ and B′) = 150/300

Using Equation (4.4b) on page 191 and the following definitions:

A = Purchased an HDR-capable TV
B = Purchased a streaming media player

then

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} = \frac{38/300}{80/300} = 0.475$$
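
A decision tree of this kind can be represented as a small nested data structure. The sketch below is one possible representation, not a prescribed method: the names `build_tree` and `conditional` and the dictionary keys are assumptions made for illustration, and the leaf counts are those shown in Figure 4.2. Each initial branch stores its marginal probability, each subbranch stores a joint probability, and the conditional probability is the joint divided by the branch marginal.

```python
from fractions import Fraction

# Leaf counts from Figure 4.2: (initial branch, subbranch) -> number of households.
leaf_counts = {
    ("HDR-capable", "streaming player"): 38,
    ("HDR-capable", "no streaming player"): 42,
    ("not HDR-capable", "streaming player"): 70,
    ("not HDR-capable", "no streaming player"): 150,
}
total = sum(leaf_counts.values())  # 300 households


def build_tree(leaf_counts, total):
    """Return {branch: {"marginal": P(branch), "joint": {sub: P(branch and sub)}}}."""
    tree = {}
    for (branch, sub), n in leaf_counts.items():
        node = tree.setdefault(branch, {"marginal": Fraction(0), "joint": {}})
        node["joint"][sub] = Fraction(n, total)
        node["marginal"] += Fraction(n, total)  # marginal = sum of the branch's joints
    return tree


def conditional(tree, sub, branch):
    """P(sub | branch): joint probability divided by the branch's marginal probability."""
    node = tree[branch]
    return node["joint"][sub] / node["marginal"]


tree = build_tree(leaf_counts, total)
p = conditional(tree, "streaming player", "HDR-capable")
print(tree["HDR-capable"]["marginal"], p, float(p))  # 4/15 19/40 0.475
```

Storing the marginal on the branch and the joints on the leaves mirrors how the probabilities are placed in the figure.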

Independence
In the example concerning the purchase of large TVs, the conditional probability is 200/250 = 0.80 that the selected household actually purchased the large TV, given that the household planned to purchase. The simple probability of selecting a household that actually purchased is 300/1,000 = 0.30. This result shows that the prior knowledge that the household planned to purchase affected the probability that the household actually purchased the TV. In other words, the outcome of one event is dependent on the outcome of a second event.
When the outcome of one event does not affect the probability of occurrence of another
event, the events are said to be independent. Independence can be determined by using
Equation (4.5).

INDEPENDENCE
Two events, A and B, are independent if and only if

$$P(A \mid B) = P(A) \tag{4.5}$$

where
P(A | B) = conditional probability of A given B
P(A) = marginal probability of A

Example 4.8 on the next page demonstrates the use of Equation (4.5).

EXAMPLE 4.8 Determining Independence

In the follow-up survey of the 300 households that actually purchased large TVs, the households were asked if they were satisfied with their purchases. Table 4.3 cross-classifies the responses to the satisfaction question with the responses to whether the TV was HDR-capable.

TABLE 4.3 Satisfaction with purchase of large TVs

                     SATISFIED WITH PURCHASE?
HDR FEATURE          Yes      No      Total
HDR-capable           64      16         80
Not HDR-capable      176      44        220
Total                240      60        300

Determine whether being satisfied with the purchase and the HDR feature of the TV purchased are independent.

SOLUTION For these data:

$$P(\text{Satisfied} \mid \text{HDR-capable}) = \frac{64/300}{80/300} = \frac{64}{80} = 0.80$$

which is equal to

$$P(\text{Satisfied}) = \frac{240}{300} = 0.80$$

Thus, being satisfied with the purchase and the HDR feature of the TV purchased are independent. Knowledge of one event does not affect the probability of the other event.
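
The check in Example 4.8 can be sketched in Python as follows. This is an illustrative fragment under stated assumptions: the nested-dictionary layout and the helper names are invented for this sketch, and the counts are those of Table 4.3. Exact fractions are used so that the comparison in Equation (4.5) is not affected by floating-point rounding.

```python
from fractions import Fraction

# Table 4.3 counts: rows are the HDR feature, columns are "satisfied with purchase?".
table = {
    "HDR-capable": {"Yes": 64, "No": 16},
    "Not HDR-capable": {"Yes": 176, "No": 44},
}
total = sum(sum(row.values()) for row in table.values())  # 300 households


def p_col(col):
    """Marginal probability of a satisfaction response."""
    return Fraction(sum(row[col] for row in table.values()), total)


def p_row(row):
    """Marginal probability of an HDR category."""
    return Fraction(sum(table[row].values()), total)


def p_col_given_row(col, row):
    """Conditional probability: joint probability divided by the row marginal."""
    return Fraction(table[row][col], total) / p_row(row)


def is_independent(col, row):
    """Equation (4.5): the conditional probability equals the marginal probability."""
    return p_col_given_row(col, row) == p_col(col)


print(p_col_given_row("Yes", "HDR-capable"), p_col("Yes"))  # 4/5 4/5
print(is_independent("Yes", "HDR-capable"))                 # True
```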

Multiplication Rules
The general multiplication rule is derived using Equation (4.4a) on page 191:

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

and solving for the joint probability P(A and B).

GENERAL MULTIPLICATION RULE
The probability of A and B is equal to the probability of A given B times the probability of B.

$$P(A \text{ and } B) = P(A \mid B)\,P(B) \tag{4.6}$$

Example 4.9 demonstrates the use of the general multiplication rule.

EXAMPLE 4.9 Using the General Multiplication Rule

Consider the 80 households that purchased HDR-capable TVs. In Table 4.3, you see that 64 households are satisfied with their purchase, and 16 households are dissatisfied. Suppose two households are randomly selected from the 80 households. Find the probability that both households are satisfied with their purchase.

SOLUTION Here you can use the multiplication rule in the following way. If

A = second household selected is satisfied
B = first household selected is satisfied

then, using Equation (4.6),

$$P(A \text{ and } B) = P(A \mid B)\,P(B)$$

The probability that the first household is satisfied with the purchase is 64/80. However, the probability that the second household is also satisfied with the purchase depends on the result of the first selection. If the first household is not returned to the sample after the satisfaction level is determined (i.e., sampling without replacement), the number of households remaining is 79. If the first household is satisfied, the probability that the second is also satisfied is 63/79 because 63 satisfied households remain in the sample. Therefore,

$$P(A \text{ and } B) = \left(\frac{63}{79}\right)\left(\frac{64}{80}\right) = 0.6380$$

There is a 63.80% chance that both of the households sampled will be satisfied with their purchase.
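
The same reasoning extends to any number of selections without replacement by applying the general multiplication rule repeatedly. The Python sketch below is illustrative only: the function name and parameters are assumptions made for this example, and it assumes the number of draws does not exceed the number of households.

```python
from fractions import Fraction


def prob_all_satisfied(satisfied, total, draws):
    """P(all `draws` households, sampled without replacement, are satisfied).

    Assumes draws <= total.
    """
    p = Fraction(1)
    for k in range(draws):
        # Conditional probability that the next household is satisfied,
        # given that the k households already selected were all satisfied.
        p *= Fraction(satisfied - k, total - k)
    return p


p = prob_all_satisfied(satisfied=64, total=80, draws=2)
print(p, round(float(p), 4))  # 252/395 0.638
```

Each factor in the loop is a conditional probability, so the product for two draws is (64/80)(63/79), the value found in Example 4.9.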

The multiplication rule for independent events is derived by substituting P(A) for P(A | B) in Equation (4.6).

MULTIPLICATION RULE FOR INDEPENDENT EVENTS
If A and B are independent, the probability of A and B is equal to the probability of A times the probability of B.

$$P(A \text{ and } B) = P(A)\,P(B) \tag{4.7}$$

If this rule holds for two events, A and B, then A and B are independent. Therefore, there are two ways to determine independence:
1. Events A and B are independent if, and only if, P(A | B) = P(A).
2. Events A and B are independent if, and only if, P(A and B) = P(A)P(B).
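
That the two tests always agree can be seen from a short derivation based on Equation (4.4a). The condition P(B) > 0 is an added assumption here, needed so that the conditional probability is defined:

```latex
\begin{align*}
P(A \mid B) = P(A)
  &\iff \frac{P(A \text{ and } B)}{P(B)} = P(A)
  && \text{by Equation (4.4a)} \\
  &\iff P(A \text{ and } B) = P(A)\,P(B)
  && \text{multiplying both sides by } P(B) > 0
\end{align*}
```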

Marginal Probability Using the General Multiplication Rule
Section 4.1 defines the marginal probability using Equation (4.2). One can state the equation for marginal probability by using the general multiplication rule. If

$$P(A) = P(A \text{ and } B_1) + P(A \text{ and } B_2) + \cdots + P(A \text{ and } B_k)$$

then, using the general multiplication rule, Equation (4.8) defines the marginal probability.

MARGINAL PROBABILITY USING THE GENERAL MULTIPLICATION RULE

$$P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + \cdots + P(A \mid B_k)P(B_k) \tag{4.8}$$

where B_1, B_2, …, B_k are k mutually exclusive and collectively exhaustive events.

To illustrate Equation (4.8), refer to Table 4.1 on page 185 and let:
P(A) = probability of planned to purchase
P(B_1) = probability of actually purchased
P(B_2) = probability of did not actually purchase

Then, using Equation (4.8), the probability of planned to purchase is

$$P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2)$$

$$= \left(\frac{200}{300}\right)\left(\frac{300}{1{,}000}\right) + \left(\frac{50}{700}\right)\left(\frac{700}{1{,}000}\right)$$

$$= \frac{200}{1{,}000} + \frac{50}{1{,}000} = \frac{250}{1{,}000} = 0.25$$
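
Equation (4.8) translates directly into a weighted sum. The Python sketch below is a minimal illustration: the dictionary keys and the function name `marginal` are assumptions for this example, and the probabilities are the Table 4.1 values used above.

```python
from fractions import Fraction

# Probabilities from Table 4.1; B1 and B2 are mutually exclusive and collectively exhaustive.
p_b = {
    "actually purchased": Fraction(300, 1000),
    "did not actually purchase": Fraction(700, 1000),
}
p_a_given_b = {
    "actually purchased": Fraction(200, 300),
    "did not actually purchase": Fraction(50, 700),
}


def marginal(p_a_given_b, p_b):
    """Equation (4.8): sum of P(A | Bi) * P(Bi) over all events Bi."""
    if sum(p_b.values()) != 1:
        raise ValueError("the B events must be collectively exhaustive")
    return sum(p_a_given_b[b] * p_b[b] for b in p_b)


p_planned = marginal(p_a_given_b, p_b)
print(p_planned, float(p_planned))  # 1/4 0.25
```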

PROBLEMS FOR SECTION 4.2

LEARNING THE BASICS
4.16 Consider the following contingency table, which indicates the number of two products, A and B, produced by two different production lines.

          1      2
A        60     55
B        40     25

What is the probability of
a. A | 1?
b. 2 | B?
c. A and 1, using the general multiplication rule?
d. B, using the general multiplication rule?

4.17 Consider the following contingency table:

          B       B′
A        0.25    0.35
A′       0.3     0.1

What is the probability of
a. B | A′?
b. A | B′?
c. A and B being independent using the conditional probability found in (a)?
d. A and B being independent using the multiplication rule?

4.18 If P(A) = 0.3, P(B) = 0.25, and P(A and B) = 0.15, find P(A | B).

4.19 If A and B are independent events, P(A) = 0.5, and P(A | B) = 0.4, find P(B).

4.20 If P(A) = 0.2, P(B) = 0.75, and P(A and B) = 0.15, are A and B independent?

APPLYING THE CONCEPTS
4.21 Besides saving, many college students also invest in bonds, stocks, and mutual funds to earn extra pocket money. In 2018, a survey was conducted on two types of stocks, C and D, to gauge how many stocks were purchased by students based on their gender. The following table summarizes the responses from students who participated in the survey:

               STOCK
GENDER       C      D     Total
Male        22     55        77
Female      24     39        63
Total       46     94       140

Source: Data extracted from United States Department of Labor, [Link]

a. What is the probability that a male student bought stock D?
b. Given that a student buys stock D, what is the probability that the student is male?
c. Explain the difference in the results in (a) and (b).
d. Are the type of stock and gender independent?

4.22 How much will residents across different states spend on different types of activities? In 2017-2018, residents from two census divisions in the United States, New England and Middle Atlantic, were enrolled in a consumer expenditure survey for two types of activities, personal care and reading. The survey was based on 880 residents from New England and 893 residents from Middle Atlantic. The following table summarizes the results:

                          CENSUS DIVISIONS
TYPES OF EXPENDITURE    New England    Middle Atlantic    Total
Personal care                   727                771    1,498
Reading                         153                122      275
Total                           880                893    1,773

Source: Data extracted from United States Department of Labor, [Link]

a. Suppose you know that the resident is selected from New England. What is the probability that they spent on reading?

b. Suppose you know that the resident is selected from Middle Atlantic. What is the probability that they spent on reading?
c. Are the two events, census division and types of expenditure, independent? Explain.

4.23 Are teams within a business usually in consensus when approving a project that will affect the entire company? The following table shows the positive agreement and negative responses received for a project proposed by two teams in a company.

               RESPONSES
TEAM      Positive    Negative    Total
A               56           8       64
B                2          14       16
Total           58          22       80

a. If Team A’s project is selected, what is the probability that it will receive negative responses?
b. If Team B’s project is selected, what is the probability that it will receive positive responses?
c. Find the probability that a project with positive responses is proposed by Team A.

SELF TEST 4.24 Is there full support for increased use of educational technologies in higher ed? As part of Inside Higher Ed’s 2018 Survey of Faculty Attitudes on Technology, academic professionals, namely, full-time faculty members and administrators who oversee their institutions’ online learning or instructional technology efforts (digital learning leaders), were asked that question. The following table summarizes their responses:

                     ACADEMIC PROFESSIONAL
FULL SUPPORT    Faculty Member    Digital Learning Leader    Total
Yes                        511                        175      686
No                       1,086                         31    1,117
Total                    1,597                        206    1,803

Source: Data extracted from “The 2018 Inside Higher Ed Survey of Faculty Attitudes on Technology,” [Link]/301jUit.

a. Given that an academic professional is a faculty member, what is the probability that the academic professional fully supports increased use of educational technologies in higher ed?
b. Given that an academic professional is a faculty member, what is the probability that the academic professional does not fully support increased use of educational technologies in higher ed?
c. Given that an academic professional is a digital learning leader, what is the probability that the academic professional fully supports increased use of educational technologies in higher ed?
d. Given that an academic professional is a digital learning leader, what is the probability that the academic professional does not fully support increased use of educational technologies in higher ed?

4.25 According to the Malaysia Informative Data Center (MysICD), in 2016, the numbers of registered employees in two Malaysian states, Selangor and Kuala Lumpur, were 1,281 and 1,509, respectively. In 2017, the numbers increased to 1,913 and 1,959, respectively.
Source: Data extracted from MysICD, [Link]
a. In 2016, what is the probability that the registered employees are from Selangor?
b. In 2017, what is the probability that the registered employees are from Kuala Lumpur?
c. Are the years of registration and the states independent? Explain.

4.26 A scientist is analyzing a drug to cure the common cold for two duration-based categories. She has two groups of patients who have the common cold: Group A will take the drug and Group B will not. She found that within 1 to 3 days, 86 patients from Group A are cured and 19 patients from Group B are cured. The number of patients who are cured within 4 to 7 days is 16 from Group A and 79 from Group B.
a. If a patient is cured within 3 days, what is the probability that they are from Group A?
b. What is the probability that a patient cured within 4 to 7 days is from Group B?
c. What is the probability of a patient taking the drug?
d. Are duration to be cured and drug being taken independent events? Explain.

4.27 In 44 of the 68 years from 1950 through 2018 (in 2011 there was virtually no change), the S&P 500 finished higher after the first five days of trading. In 36 out of 44 years, the S&P 500 finished higher for the year. Is a good first week a good omen for the upcoming year? The following table gives the first-week and annual performance over this 68-year period:

               S&P 500’S ANNUAL PERFORMANCE
FIRST WEEK     Higher    Lower
Higher             36        8
Lower              12       12

a. If a year is selected at random, what is the probability that the S&P 500 finished higher for the year?
b. Given that the S&P 500 finished higher after the first five days of trading, what is the probability that it finished higher for the year?
c. Are the two events “first-week performance” and “annual performance” independent? Explain.
d. Look up the performance after the first five days of 2019 and the 2019 annual performance of the S&P 500 at [Link].com. Comment on the results.

4.28 A standard deck of cards is being used to play a game. There are four suits (hearts, diamonds, clubs, and spades), each having 13 faces (ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, jack, queen, and king), making a total of 52 cards. This complete deck is thoroughly mixed, and you will receive the first 2 cards from the deck, without replacement (the first card is not returned to the deck after it is selected).

a. What is the probability that both cards are queens?
b. What is the probability that the first card is a 10 and the second card is a 5 or 6?
c. If you were sampling with replacement (the first card is returned to the deck after it is selected), what would be the answer in (a)?
d. In the game of blackjack, the face cards (jack, queen, king) count as 10 points, and the ace counts as either 1 or 11 points. All other cards are counted at their face value. Blackjack is achieved if two cards total 21 points. What is the probability of getting blackjack in this problem?

4.29 A study on an individual’s education level and the development of noxious habits was conducted using a sample of 250 participants. The study found that of the people who had a university degree, 56 people had developed noxious habits and 116 had not. The total number of participants with a secondary level of education is 78.
a. What is the probability that the participant has a university degree?
b. What is the probability that the participant has a university degree and has noxious habits?
c. What is the probability that a participant having a university degree has noxious habits?
d. Are the education level and the development of noxious habits independent?

4.3 Ethical Issues and Probability


Ethical issues can arise when any statements related to probability are presented to the public,
particularly when these statements are part of an advertising campaign for a product or ser-
vice. Unfortunately, many people are not comfortable with numerical concepts (Paulos) and
tend to misinterpret the meaning of the probability. In some instances, the misinterpretation
is not intentional, but in other cases, advertisements may unethically try to mislead potential
customers.
One example of a potentially unethical application of probability relates to advertisements
for state lotteries. When purchasing a lottery ticket, the customer selects a set of numbers (such
as 6) from a larger list of numbers (such as 54). Although virtually all participants know that
they are unlikely to win the lottery, they also have very little idea of how unlikely it is for them
to select all six winning numbers from the list of 54 numbers. They have even less of an idea of
the probability of not selecting any winning numbers.
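
To give a sense of scale, the two probabilities mentioned above can be computed directly. The Python sketch below is illustrative and rests on assumptions not stated in the text: the lottery draws six winning numbers from 54 without replacement, the order of the numbers does not matter, and there is no bonus number. The function name is hypothetical.

```python
from math import comb


def lottery_match_probability(matches, picks=6, pool=54):
    """P(exactly `matches` of the player's numbers are among the winning numbers).

    Assumes `picks` winning numbers are drawn without replacement from `pool`
    numbers and that order does not matter.
    """
    ways = comb(picks, matches) * comb(pool - picks, picks - matches)
    return ways / comb(pool, picks)


p_jackpot = lottery_match_probability(6)
p_none = lottery_match_probability(0)

print(comb(54, 6))         # 25827165 equally likely selections
print(f"{p_jackpot:.2e}")  # 3.87e-08
print(round(p_none, 4))    # 0.4751
```

Under these assumptions, a single ticket has about 1 chance in 25.8 million of matching all six numbers and a little less than a 48% chance of matching none of them.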
Given this background, one might consider a state lottery commercial that stated,
“We won’t stop until we have made everyone a millionaire” to be deceptive and possibly unethical.
Is it possible that the lottery can make everyone a millionaire? Is it ethical to suggest that the
purpose of the lottery is to make everyone a millionaire?
Another example of a potentially unethical application of probability relates to an invest-
ment newsletter promising a 90% probability of a 20% annual return on investment. To make
the claim in the newsletter an ethical one, the investment service needs to (a) explain the basis
on which this probability estimate rests, (b) provide the probability statement in another format,
such as 9 chances in 10, and (c) explain what happens to the investment in the 10% of the cases
in which a 20% return is not achieved (e.g., is the entire investment lost?).
Probability-related claims should be stated ethically. Readers can consider how one would
write an ethical description about the probability of winning a certain prize in a state lottery. Or
how one can ethically explain a “90% probability of a 20% annual return” without creating a
misleading inference.

4.4 Bayes’ Theorem


Developed by Thomas Bayes in the eighteenth century, Bayes’ theorem builds on conditional
probability concepts that Section 4.2 discusses. Bayes’ theorem revises previously calculated
probabilities using additional information and forms the basis for Bayesian analysis (Anderson-Cook, Bellhouse, Hooper).
In recent years, Bayesian analysis has gained new prominence for its application to ana-
lyzing big data using predictive analytics (see Chapter 17). However, Bayesian analysis does
not require big data and can be used in a variety of problems to better determine the revised
probability of certain events. The Bayesian Analysis online topic contains examples that apply
Bayes’ theorem to a marketing problem and a diagnostic problem and presents a set of additional
study problems. The Consider This for this section explores an application of Bayes’ theorem
that many use every day, and references Equation (4.9) that the online topic discusses.

CONSIDER THIS
Divine Providence and Spam
Would you ever guess that the essays Divine Benevolence: Or, An Attempt to Prove That the Principal End of the Divine Providence and Government Is the Happiness of His Creatures and An Essay Towards Solving a Problem in the Doctrine of Chances were written by the same person? Probably not, and in doing so, you illustrate a modern-day application of Bayesian statistics: spam, or junk mail filters.

In not guessing correctly, you probably looked at the words in the titles of the essays and concluded that they were talking about two different things. An implicit rule you used was that word frequencies vary by subject matter. A statistics essay would very likely contain the word statistics as well as words such as chance, problem, and solving. An eighteenth-century essay about theology and religion would be more likely to contain the uppercase forms of Divine and Providence.

Likewise, there are words you would guess to be very unlikely to appear in either book, such as technical terms from finance, and words that are most likely to appear in both, common words such as a, and, and the. That words would be either likely or unlikely suggests an application of probability theory. Of course, likely and unlikely are fuzzy concepts, and we might occasionally misclassify an essay if we kept things too simple, such as relying solely on the occurrence of the words Divine and Providence.

For example, a profile of the late Harris Milstead, better known as Divine, the star of Hairspray and other films, visiting Providence (Rhode Island), would most certainly not be an essay about theology. But if we widened the number of words we examined and found such words as movie or the name John Waters (Divine’s director in many films), we probably would quickly realize the essay had something to do with twentieth-century cinema and little to do with theology and religion.

We can use a similar process to try to classify a new email message in your in-box as either spam or a legitimate message (called “ham,” in this context). We would first need to add to your email program a “spam filter” that has the ability to track word frequencies associated with spam and ham messages as you identify them on a day-to-day basis. This would allow the filter to constantly update the prior probabilities necessary to use Bayes’ theorem. With these probabilities, the filter can ask, “What is the probability that an email is spam, given the presence of a certain word?”

Applying the terms of Equation (4.9), such a Bayesian spam filter would multiply the probability of finding the word in a spam email, P(A | B), by the probability that the email is spam, P(B), and then divide by the probability of finding the word in an email, the denominator in Equation (4.9). Bayesian spam filters also use shortcuts by focusing on a small set of words that have a high probability of being found in a spam message as well as on a small set of other words that have a low probability of being found in a spam message.

As spammers (people who send junk email) learned of such new filters, they tried to outfox them. Having learned that Bayesian filters might be assigning a high P(A | B) value to words commonly found in spam, such as Viagra, spammers thought they could fool the filter by misspelling the word as Vi@gr@ or V1agra. What they overlooked was that the misspelled variants were even more likely to be found in a spam message than the original word. Thus, the misspelled variants made the job of spotting spam easier for the Bayesian filters.

Other spammers tried to fool the filters by adding “good” words, words that would have a low probability of being found in a spam message, or “rare” words, words not frequently encountered in any message. But these spammers overlooked the fact that the conditional probabilities are constantly updated and that words once considered “good” would be soon discarded from the good list by the filter as their P(A | B) value increased. Likewise, as “rare” words grew more common in spam and yet stayed rare in ham, such words acted like the misspelled variants that others had tried earlier.

Even then, and perhaps after reading about Bayesian statistics, spammers thought that they could “break” Bayesian filters by inserting random words in their messages. Those random words would affect the filter by causing it to see many words whose P(A | B) value would be low. The Bayesian filter would begin to label many spam messages as ham and end up being of no practical use. Spammers again overlooked that conditional probabilities are constantly updated.

Other spammers decided to eliminate all or most of the words in their messages and replace them with graphics so that Bayesian filters would have very few words with which to form conditional probabilities. But this approach failed, too, as Bayesian filters were rewritten to consider things other than words in a message. After all, Bayes’ theorem concerns events, and “graphics present with no text” is as valid an event as “some word, X, present in a message.” Other future tricks will ultimately fail for the same reason. (By the way, spam filters use non-Bayesian techniques as well, which makes spammers’ lives even more difficult.)

Bayesian spam filters are an example of the unexpected way that applications of statistics can show up in your daily life. You will discover more examples as you read the rest of this book. By the way, the author of the two essays mentioned earlier was Thomas Bayes, who is a lot more famous for the second essay than the first essay, a failed attempt to use mathematics and logic to prove the existence of God.
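
The single-word calculation that the filter performs can be sketched in a few lines of Python. This is a simplified illustration, not a description of any actual spam filter: the function name is hypothetical, every probability below is an invented figure, and the denominator is computed with the marginal probability formula of Equation (4.8), treating spam and ham as the only two mutually exclusive and collectively exhaustive classes.

```python
def posterior_spam(p_word_given_spam, p_word_given_ham, p_spam):
    """P(spam | word), using Bayes' theorem with two classes, spam and ham."""
    p_ham = 1.0 - p_spam
    # Denominator: probability of finding the word in any email (spam or ham).
    p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / p_word


# Hypothetical figures, chosen only to illustrate the arithmetic.
original = posterior_spam(p_word_given_spam=0.20, p_word_given_ham=0.01, p_spam=0.40)
# A misspelled variant that is rarer overall but almost never appears in ham.
misspelled = posterior_spam(p_word_given_spam=0.05, p_word_given_ham=0.0001, p_spam=0.40)

print(round(original, 3))     # 0.93
print(round(misspelled, 3))   # 0.997
```

With these invented numbers, the misspelled variant yields a higher probability of spam than the original word, which is the effect the passage describes.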

4.5 Counting Rules


In many cases, a large number of outcomes is possible and determining the exact number of out-
comes can be difficult. In these situations, rules have been developed for counting the exact number
of possible outcomes. The Section 4.5 online topic discusses these rules and illustrates their use.

USING STATISTICS
Probable Outcomes at Fredco Warehouse Club, Revisited

As a Fredco Warehouse Club electronics merchandise manager, you analyzed an intent-to-purchase-a-large-TV study of the heads of 1,000 households as well as a follow-up study done 12 months later. In that later survey, respondents who answered that they had purchased a large TV were asked additional questions concerning whether the large TV was HDR-capable and whether the respondents had purchased a streaming media player in the past 12 months.

By analyzing the results of these surveys, you were able to uncover many pieces of valuable information that will help you plan a marketing strategy to enhance sales and better target those households likely to purchase multiple or more expensive products. Whereas only 30% of the households actually purchased a large TV, if a household indicated that it planned to purchase a large TV in the next 12 months, there was an 80% chance that the household actually made the purchase. Thus the marketing strategy should target those households that have indicated an intent to purchase.

You determined that for households that purchased an HDR-capable TV, there was a 47.5% chance that the household also purchased a streaming media player. You then compared this conditional probability to the marginal probability of purchasing a streaming media player, which was 36%. Thus, households that purchased HDR-capable TVs are more likely to purchase a streaming media player than are households that purchased large TVs that were not HDR-capable.

SUMMARY
This chapter develops the basic concepts of probability that serve as a foundation for other concepts that later chapters discuss. Probability is a numeric value from 0 to 1 that represents the chance, likelihood, or possibility that a particular event will occur. In addition to simple probability, the chapter discusses conditional probabilities and independent events. The chapter uses contingency tables and decision trees to summarize and present probability information. The chapter also introduces Bayes’ theorem.


▼▼
KEY EQUATIONS
Probability of Occurrence
$$\text{Probability of occurrence} = \frac{X}{T} \tag{4.1}$$

Marginal Probability
$$P(A) = P(A \text{ and } B_1) + P(A \text{ and } B_2) + \cdots + P(A \text{ and } B_k) \tag{4.2}$$

General Addition Rule
$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B) \tag{4.3}$$

Conditional Probability
$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \tag{4.4a}$$
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} \tag{4.4b}$$

Independence
$$P(A \mid B) = P(A) \tag{4.5}$$

General Multiplication Rule
$$P(A \text{ and } B) = P(A \mid B)\,P(B) \tag{4.6}$$

Multiplication Rule for Independent Events
$$P(A \text{ and } B) = P(A)\,P(B) \tag{4.7}$$

Marginal Probability Using the General Multiplication Rule
$$P(A) = P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \cdots + P(A \mid B_k)\,P(B_k) \tag{4.8}$$
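
As a rough illustration of how these equations fit together, the following Python sketch evaluates Equations (4.1) through (4.8) for a small two-by-two table. The counts are hypothetical and chosen only to keep the arithmetic simple; the helper names (`joint`, `marginal_a`, `marginal_b`) are likewise assumptions made for this example and do not come from the chapter.

```python
import math

# Hypothetical counts for two events, A and B (illustrative only).
counts = {
    ("A", "B"): 30,
    ("A", "not B"): 20,
    ("not A", "B"): 10,
    ("not A", "not B"): 40,
}
total = sum(counts.values())  # T in Equation (4.1)


def joint(a: str, b: str) -> float:
    """P(a and b): one cell count divided by the total, Equation (4.1)."""
    return counts[(a, b)] / total


def marginal_a(a: str) -> float:
    """P(a) as the sum of joint probabilities across B and not B, Equation (4.2)."""
    return joint(a, "B") + joint(a, "not B")


def marginal_b(b: str) -> float:
    """P(b) as the sum of joint probabilities across A and not A, Equation (4.2)."""
    return joint("A", b) + joint("not A", b)


p_a = marginal_a("A")
p_b = marginal_b("B")
p_a_and_b = joint("A", "B")

p_a_or_b = p_a + p_b - p_a_and_b              # general addition rule (4.3)
p_a_given_b = p_a_and_b / p_b                 # conditional probability (4.4a)
p_b_given_a = p_a_and_b / p_a                 # conditional probability (4.4b)
independent = math.isclose(p_a_given_b, p_a)  # independence check (4.5)

# Marginal probability rebuilt from conditional probabilities (4.8).
p_not_b = marginal_b("not B")
p_a_given_not_b = joint("A", "not B") / p_not_b
p_a_rebuilt = p_a_given_b * p_b + p_a_given_not_b * p_not_b

print(f"P(A) = {p_a:.2f}, P(B) = {p_b:.2f}, P(A and B) = {p_a_and_b:.2f}")
print(f"P(A or B) = {p_a_or_b:.2f}")
print(f"P(A | B) = {p_a_given_b:.2f}, P(B | A) = {p_b_given_a:.2f}")
print(f"A and B independent? {independent}")
print(f"P(A) from Equation (4.8) = {p_a_rebuilt:.2f}")
```

With these made-up counts, P(A | B) differs from P(A), so the two events are not independent, and the value rebuilt with Equation (4.8) agrees with the marginal probability from Equation (4.2).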

▼▼
KEY TERMS
a priori probability 184
Bayes’ theorem 199
certain event 184
collectively exhaustive 184
complement 184
conditional probability 191
decision tree 193
empirical probability 184
events 183
general addition rule 188
general multiplication rule 195
impossible event 184
independence 194
joint event 183
joint probability 187
marginal probability 187
multiplication rule for independent events 196
mutually exclusive 184
outcomes 183
probability 183
sample space 183
simple event 183
simple probability 185
subjective probability 185

▼▼
CHECKING YOUR UNDERSTANDING
4.30 What are the differences between a priori probability, empirical probability, and subjective probability?
4.31 What is the difference between a simple event and a joint event?
4.32 How can you use the general addition rule to find the probability of occurrence of event A or B?
4.33 What is the difference between mutually exclusive events and collectively exhaustive events?
4.34 How does conditional probability relate to the concept of independence?
4.35 How does the multiplication rule differ for events that are and are not independent?

▼▼
CHAPTER REVIEW PROBLEMS
4.36 A survey by Accenture indicated that 64% of millennials as compared to 28% of baby boomers prefer “hybrid” investment advice, a combination of traditional advisory services and low-cost digital tools, over either a dedicated human advisor or conventional robo-advisory services (computer-generated advice and services without human advisors) alone.
Source: Data extracted from Business Wire, “Majority of Wealthy Investors Prefer a Mix of Human and Robo-Advice, According to Accenture Research,” /[Link]/2qZY9Ou.
Suppose that the survey was based on 500 respondents from each of the two generation groups.
a. Construct a contingency table.
b. Give an example of a simple event and a joint event.
c. What is the probability that a randomly selected respondent prefers hybrid investment advice?
d. What is the probability that a randomly selected respondent prefers hybrid investment advice and is a baby boomer?
e. Are the events “generation group” and “prefers hybrid investment advice” independent? Explain.

4.37 G2 Crowd provides commentary and insight about employee engagement trends and challenges within organizations in its G2 Crowd Employee Engagement Report. The report represents the results of an online survey conducted in 2019 with employees located across the United States. G2 Crowd was interested in examining differences between HR and non-HR employees. One area of focus was on employees’ response to important metrics to consider when evaluating the effectiveness of employee engagement programs. The findings are summarized in the following tables.
Source: Data extracted from “Employee Engagement,” G2 Crowd, [Link]/2WEQkgk

PRESENTEEISM IS AN IMPORTANT METRIC
EMPLOYEE     Yes     No    Total
HR            53     79      132
Non-HR        43    225      268
Total         96    304      400

ABSENTEEISM IS AN IMPORTANT METRIC
EMPLOYEE     Yes     No    Total
HR            54     78      132
Non-HR        72    196      268
Total        126    274      400

What is the probability that a randomly chosen employee
a. is an HR employee?
b. is an HR employee or indicates that absenteeism is an important metric to consider when evaluating the effectiveness of employee engagement programs?
c. does not indicate that presenteeism is an important metric to consider when evaluating the effectiveness of employee engagement programs and is a non-HR employee?
d. does not indicate that presenteeism is an important metric to consider when evaluating the effectiveness of employee engagement programs or is a non-HR employee?
e. Suppose the randomly chosen employee does indicate that presenteeism is an important metric to consider when evaluating the effectiveness of employee engagement programs. What is the probability that the employee is a non-HR employee?
f. Are “presenteeism is an important metric” and “employee” independent?
g. Is “absenteeism is an important metric” independent of “employee”?

4.38 To better understand the website builder market, Clutch surveyed individuals who had created a website using a do-it-yourself (DIY) website builder. Respondents, categorized by the type of website they built (business or personal), were asked to indicate the primary purpose for building their website. The following table summarizes the findings:

                                 TYPE OF WEBSITE
PRIMARY PURPOSE              Business   Personal   Total
Online Business Presence           52          4      56
Online Sales                       32         13      45
Creative Display                   28         54      82
Informational Resources             9         24      33
Blog                                8         52      60
Total                             129        147     276

Source: Data extracted from “How Businesses Use DIY Web Builders: Clutch 2017 Survey,” [Link]/2qQjXiq.

If a website builder is selected at random, what is the probability that he or she
a. indicated creative display as the primary purpose for building his/her website?
b. indicated creative display or informational resources as the primary purpose for building his/her website?
c. is a business website builder or indicated online sales as the primary purpose for building his/her website?
d. is a business website builder and indicated online sales as the primary purpose for building his/her website?
e. Given that the website builder selected is a personal website builder, what is the probability that he/she indicated online business presence as the primary purpose for building his/her website?

4.39 The CMO Survey collects and disseminates the opinions of top marketers in order to predict the future of markets, track marketing excellence, and improve the value of marketing in firms and in society. Part of the survey is devoted to the topic of marketing analytics and understanding what factors prevent companies from using more marketing analytics. The following findings are based on responses from 272 senior marketers within B2B firms and 114 senior marketers within B2C firms.
Source: Data extracted from “Results by Firm & Industry Characteristics,” The CMO Survey, February 2017, p. 148. [Link]/2qY3Qvk.

LACK OF PROCESS/TOOLS TO MEASURE SUCCESS
FIRM     Yes     No    Total
B2B       90    182      272
B2C       35     79      114
Total    125    261      386

LACK OF PEOPLE WHO CAN LINK TO PRACTICE
FIRM     Yes     No    Total
B2B       75    197      272
B2C       36     78      114
Total    111    275      386

a. What is the probability that a randomly selected senior marketer indicates that lack of process/tools to measure success through analytics is a factor that prevents his/her company from using more marketing analytics?
b. Given that a randomly selected senior marketer is within a B2B firm, what is the probability that the senior marketer indicates that lack of process/tools to measure success through analytics is a factor that prevents his/her company from using more marketing analytics?
c. Given that a randomly selected senior marketer is within a B2C firm, what is the probability that the senior marketer indicates that lack of process/tools to measure success through analytics is a factor that prevents his/her company from using more marketing analytics?
d. What is the probability that a randomly selected senior marketer indicates that lack of people who can link to marketing practice is a factor that prevents his/her company from using more marketing analytics?
e. Given that a randomly selected senior marketer is within a B2B firm, what is the probability that the senior marketer indicates that lack of people who can link to marketing practice is a factor that prevents his/her company from using more marketing analytics?
f. Given that a randomly selected senior marketer is within a B2C firm, what is the probability that the senior marketer indicates that lack of people who can link to marketing practice is a factor that prevents his/her company from using more marketing analytics?
g. Comment on the results in (a) through (f).

CHAPTER 4 CASES

Digital Case
Apply your knowledge about contingency tables and the proper application of simple and joint probabilities in this continuing Digital Case from Chapter 3.

Open [Link], the EndRun Financial Services “Guide to Investing,” and read the information about the Guaranteed Investment Package (GIP). Read the claims and examine the supporting data. Then answer the following questions:

How accurate is the claim of the probability of success for EndRun’s GIP? In what ways is the claim misleading? How would you calculate and state the probability of having an annual rate of return not less than 15%?
1. Using the table found under the “Show Me the Winning Probabilities” subhead, calculate the proper probabilities for the group of investors. What mistake was made in reporting the 7% probability claim?
2. Are there any probability calculations that would be appropriate for rating an investment service? Why or why not?

CardioGood Fitness
1. For each CardioGood Fitness treadmill product line (see CardioGood Fitness), construct two-way contingency tables of gender, education in years, relationship status, and self-rated fitness. (There will be a total of six tables for each treadmill product.)
2. For each table you construct, compute all conditional and marginal probabilities.
3. Write a report detailing your findings to be presented to the management of CardioGood Fitness.

The Choice Is Yours Follow-Up
1. Follow up the “Using Statistics: ‘The Choice Is Yours,’ Revisited” on page 109 by constructing contingency tables of market cap and type, market cap and risk, market cap and rating, type and risk, type and rating, and risk and rating for the sample of 479 retirement funds stored in Retirement Funds.
2. For each table you construct, compute all conditional and marginal probabilities.
3. Write a report summarizing your conclusions.

Clear Mountain State Student Survey
The Student News Service at Clear Mountain State University (CMSU) has decided to gather data about the undergraduate students who attend CMSU. CMSU creates and distributes a survey of 14 questions (see [Link]) and receives responses from 111 undergraduates (stored in Student Survey).

For these data, construct contingency tables of gender and major, gender and graduate school intention, gender and employment status, gender and computer preference, class and graduate school intention, class and employment status, major and graduate school intention, major and employment status, and major and computer preference.
1. For each of these contingency tables, compute all the conditional and marginal probabilities.
2. Write a report summarizing your conclusions.
CHAPTER 4 EXCEL GUIDE

EG4.1 BASIC PROBABILITY CONCEPTS

Simple Probability, Joint Probability, and the General Addition Rule
Key Technique Use Excel arithmetic formulas.
Example Compute simple and joint probabilities for purchase behavior data in Table 4.1 on page 185.

PHStat Use Simple & Joint Probabilities.
For the example, select PHStat ➔ Probability & Prob. Distributions ➔ Simple & Joint Probabilities. In the new template, similar to the worksheet shown below, fill in the Sample Space area with the data.

Workbook Use the COMPUTE worksheet of the Probabilities workbook as a template.
The worksheet (shown below) already contains the Table 4.1 purchase behavior data. For other problems, change the sample space table entries in the cell ranges C3:D4 and A5:D6.
As you change the event names in cells B5, B6, C5, and C6, the column A row labels for simple and joint probabilities and the addition rule change as well. These column A labels are formulas that use the concatenation operator (&) to form row labels from the event names you enter.
For example, the cell A10 formula ="P(" & B5 & ")" combines the two characters P( with the Yes B5 cell value and the character ) to form the label P(Yes). To examine all of the COMPUTE worksheet formulas shown below, open to the COMPUTE_FORMULAS worksheet.

EG4.2 BAYES’ THEOREM

Key Technique Use Excel arithmetic formulas.
Example Apply Bayes’ theorem to the television marketing example that the Bayesian Analysis online topic discusses.

Workbook Use the COMPUTE worksheet of the Bayes workbook as a template.
The worksheet (shown below) already contains the probabilities for the online section example. For other problems, change those probabilities in the cell range B5:C6.
Open to the COMPUTE_FORMULAS worksheet to examine the arithmetic formulas that compute the probabilities, which are also shown as an inset to the worksheet.
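
For readers working outside Excel, the kind of arithmetic a Bayes’ theorem worksheet performs can be sketched in Python. This is an illustration under stated assumptions rather than a copy of the workbook: the function name `bayes_posterior` and the input values are hypothetical placeholders, not the probabilities of the online television marketing example. The sketch applies Bayes’ theorem to a set of mutually exclusive and collectively exhaustive events B_i, using the general multiplication rule (4.6) for each numerator and Equation (4.8) for the denominator.

```python
import math


def bayes_posterior(priors, likelihoods):
    """Return the revised probabilities P(B_i | A) for each event B_i.

    priors[i] is P(B_i); the events B_i are assumed to be mutually
    exclusive and collectively exhaustive, so the priors must sum to 1.
    likelihoods[i] is P(A | B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("prior probabilities must sum to 1")

    # Joint probabilities P(A and B_i) = P(A | B_i) P(B_i), Equation (4.6).
    joints = [p * l for p, l in zip(priors, likelihoods)]

    # Marginal probability P(A), Equation (4.8).
    p_a = sum(joints)
    if p_a == 0:
        raise ValueError("P(A) is zero, so P(B_i | A) is undefined")

    # Bayes' theorem: P(B_i | A) = P(A and B_i) / P(A).
    return [j / p_a for j in joints]


# Hypothetical placeholder inputs (not the textbook's example values).
example_priors = [0.5, 0.5]       # P(B1), P(B2)
example_likelihoods = [0.9, 0.3]  # P(A | B1), P(A | B2)

posteriors = bayes_posterior(example_priors, example_likelihoods)
for i, value in enumerate(posteriors, start=1):
    print(f"P(B{i} | A) = {value:.2f}")
```

To try another problem, replace the two placeholder lists with that problem's prior and conditional probabilities, much as you would change the probabilities in the worksheet's input cells.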
