Statistics For Management - 1
The probability of an event refers to the likelihood that the event will occur.
If P(A) is close to zero, there is only a small chance that event A will
occur.
If P(A) equals 0.5, there is a 50-50 chance that event A will occur.
If P(A) is close to one, there is a strong chance that event A will occur.
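If an experiment has n equally likely outcomes, and r of them count as successes,
the probability that the experiment results in a successful outcome (S) is:
P(S) = (number of successful outcomes) / (total number of equally likely outcomes) = r/n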
Consider the following experiment. An urn has 10 marbles. Two marbles are
red, three are green, and five are blue. If an experimenter randomly selects 1
marble from the urn, what is the probability that it will be green?
In this experiment, there are 10 equally likely outcomes, three of which are
green marbles. Therefore, the probability of choosing a green marble is 3/10 or
0.30.
One can also think about the probability of an event in terms of its long-
run relative frequency. The relative frequency of an event is the number of
times an event occurs, divided by the total number of trials.
For example, a merchant notices one day that 5 out of 50 visitors to her store
make a purchase. The next day, 20 out of 50 visitors make a purchase. The two
relative frequencies (5/50 or 0.10 and 20/50 or 0.40) differ. However, summing
results over many visitors, she might find that the probability that a visitor
makes a purchase gets closer and closer to 0.20.
The scatterplot above shows the relative frequency of purchase as the number of
trials (in this case, the number of visitors) increases. Over many trials, the
relative frequency converges toward a stable value (0.20), which can be
interpreted as the probability that a visitor to the store will make a purchase.
The idea that the relative frequency of an event will converge on the probability
of the event, as the number of trials increases, is called the law of large
numbers.
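The convergence described above can be illustrated with a small simulation. This is a minimal sketch, assuming each visitor buys independently with probability 0.20; the function name `running_relative_frequency` and the fixed seed are illustrative choices, not part of the original example.

```python
import random


def running_relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials visitors who each buy with probability p.

    Returns the relative frequency of purchases after each visitor.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    purchases = 0
    frequencies = []
    for trial in range(1, n_trials + 1):
        if rng.random() < p:
            purchases += 1
        frequencies.append(purchases / trial)
    return frequencies


freqs = running_relative_frequency(p=0.20, n_trials=100_000)

# Early values can swing widely; later values settle near 0.20.
for n in (50, 500, 5_000, 100_000):
    print(n, round(freqs[n - 1], 4))
```

With a different seed the early values change, but the relative frequency after many trials should still settle near 0.20.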
Problem
A coin is tossed three times. What is the probability that it lands on heads
exactly once?
(A) 0.125
(B) 0.250
(C) 0.333
(D) 0.375
(E) 0.500
Solution
The correct answer is (D). If you toss a coin three times, there are a total of
eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT, THT, TTH,
and TTT. Of the eight possible outcomes, three have exactly one head. They
are: HTT, THT, and TTH. Therefore, the probability that three flips of a coin
will produce exactly one head is 3/8 or 0.375.
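The counting argument in this solution can be checked by listing the outcomes in code. The sketch below assumes a fair coin, so that all eight sequences are equally likely.

```python
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely sequences of three tosses, e.g. ('H', 'T', 'T').
outcomes = list(product("HT", repeat=3))

# Keep the sequences that contain exactly one head.
one_head = [o for o in outcomes if o.count("H") == 1]

probability = Fraction(len(one_head), len(outcomes))
print(probability, float(probability))  # 3/8 0.375
```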
RULES OF PROBABILITY
Rule of Subtraction
P(A) = 1 - P(A'), where A' denotes the event that A does not occur.
Suppose, for example, the probability that Bill will graduate from college is
0.80. What is the probability that Bill will not graduate from college? Based on
the rule of subtraction, the probability that Bill will not graduate is 1.00 - 0.80
or 0.20.
Rule of Multiplication
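The rule of multiplication applies to the situation when we want to know the
probability of the intersection of two events; that is, we want to know the
probability that two events (Event A and Event B) both occur.
P(A ∩ B) = P(A) * P(B|A)
where P(B|A) is the probability that Event B occurs, given that Event A has occurred.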
Example
An urn contains 6 red marbles and 4 black marbles. Two marbles are
drawn without replacement from the urn. What is the probability that both of the
marbles are black?
Solution: Let A = the event that the first marble is black; and let B = the event
that the second marble is black. We know the following:
In the beginning, there are 10 marbles in the urn, 4 of which are black.
Therefore, P(A) = 4/10.
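After the first selection, there are 9 marbles in the urn, 3 of which are
black. Therefore, P(B|A) = 3/9.
Therefore, based on the rule of multiplication:
P(A ∩ B) = P(A) * P(B|A) = (4/10) * (3/9) = 12/90 = 2/15, or about 0.133.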
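The multiplication-rule calculation for this urn can be sketched in Python and cross-checked by brute force. The list `urn` and the variable names are illustrative; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations

urn = ["red"] * 6 + ["black"] * 4

# Rule of multiplication: P(A and B) = P(A) * P(B|A).
p_a = Fraction(4, 10)         # first marble is black
p_b_given_a = Fraction(3, 9)  # second is black, given the first was black
p_both = p_a * p_b_given_a

# Cross-check: list every ordered pair of distinct marbles (10 * 9 = 90 pairs).
draws = list(permutations(range(len(urn)), 2))
both_black = sum(1 for i, j in draws if urn[i] == "black" and urn[j] == "black")
p_enumerated = Fraction(both_black, len(draws))

print(p_both, p_enumerated)  # 2/15 2/15
```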
Rule of Addition
The rule of addition applies to the following situation. We have two events, and
we want to know the probability that either event occurs.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Example
A student goes to the library. The probability that she checks out (a) a work of
fiction is 0.40, (b) a work of non-fiction is 0.30, and (c) both fiction and non-
fiction is 0.20. What is the probability that the student checks out a work of
fiction, non-fiction, or both?
Solution: Let F = the event that the student checks out fiction; and let N = the
event that the student checks out non-fiction. Then, based on the rule of
addition:
P(F ∪ N) = P(F) + P(N) - P(F ∩ N) = 0.40 + 0.30 - 0.20 = 0.50
Problem 1
An urn contains 6 red marbles and 4 black marbles. Two marbles are
drawn with replacement from the urn. What is the probability that both of the
marbles are black?
(A) 0.16
(B) 0.32
(C) 0.36
(D) 0.40
(E) 0.60
Solution
The correct answer is A. Let A = the event that the first marble is black; and let
B = the event that the second marble is black. We know the following:
In the beginning, there are 10 marbles in the urn, 4 of which are black.
Therefore, P(A) = 4/10.
After the first selection, we replace the selected marble; so there are still
10 marbles in the urn, 4 of which are black. Therefore, P(B|A) = 4/10.
Therefore, based on the rule of multiplication:
P(A ∩ B) = P(A) * P(B|A) = (4/10) * (4/10) = 16/100 = 0.16
Problem 2
A card is drawn randomly from a deck of ordinary playing cards. You win $10
if the card is a spade or an ace. What is the probability that you will win the
game?
(A) 1/13
(B) 13/52
(C) 4/13
(D) 17/52
(E) None of the above.
Solution
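The correct answer is C. Let S = the event that the card is a spade; and let A =
the event that the card is an ace. We know the following:
There are 52 cards in the deck.
There are 13 spades, so P(S) = 13/52.
There are 4 aces, so P(A) = 4/52.
There is 1 ace that is also a spade, so P(S ∩ A) = 1/52.
Therefore, based on the rule of addition:
P(S ∪ A) = P(S) + P(A) - P(S ∩ A) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13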
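The answer to this problem can be verified by building the deck and counting. This is a sketch that assumes a standard 52-card deck; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def probability(event):
    """Share of cards in the deck for which event(rank, suit) is true."""
    return Fraction(sum(1 for r, s in deck if event(r, s)), len(deck))


p_spade = probability(lambda r, s: s == "spades")
p_ace = probability(lambda r, s: r == "A")
p_both = probability(lambda r, s: r == "A" and s == "spades")

# Rule of addition, compared with a direct count of winning cards.
p_rule = p_spade + p_ace - p_both
p_count = probability(lambda r, s: r == "A" or s == "spades")

print(p_rule, p_count)  # 4/13 4/13
```

Subtracting `p_both` matters because the ace of spades would otherwise be counted twice.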
Bayes' theorem is also called Bayes' Rule or Bayes' Law and is the foundation
of the field of Bayesian statistics.
Bayes' theorem thus gives the probability of an event based on new information
that is, or may be, related to that event. The formula can also be used to see how
the probability of an event occurring is affected by hypothetical new
information, supposing the new information will turn out to be true. For
instance, say a single card is drawn from a complete deck of 52 cards. The
probability that the card is a king is four divided by 52, which equals 1/13 or
approximately 7.69%. Remember that there are four kings in the deck. Now,
suppose it is revealed that the selected card is a face card. The probability the
selected card is a king, given it is a face card, is four divided by 12, or
approximately 33.3%, as there are 12 face cards in a deck.
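The card example can be reproduced in code. The sketch below uses the standard statement of Bayes' theorem, P(K|F) = P(F|K) * P(K) / P(F), which is not written out in this section, and compares it with a direct count. The deck representation is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

kings = [card for card in deck if card[0] == "K"]
face_cards = [card for card in deck if card[0] in ("J", "Q", "K")]

p_king = Fraction(len(kings), len(deck))       # prior: 4/52
p_face = Fraction(len(face_cards), len(deck))  # new information: 12/52
p_face_given_king = Fraction(1)                # every king is a face card

# Bayes' theorem: P(king | face) = P(face | king) * P(king) / P(face)
p_king_given_face = p_face_given_king * p_king / p_face

# Direct count for comparison: kings among the face cards.
p_direct = Fraction(len(kings), len(face_cards))

print(p_king, p_king_given_face, p_direct)  # 1/13 1/3 1/3
```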
WHAT IS A RANDOM VARIABLE?
Some references state that continuous variables can take on an infinite number
of values, but discrete variables cannot. This is incorrect.
When comparing discrete and continuous variables, it is more correct to say that
continuous variables can always take on an infinite number of values; whereas
some discrete variables can take on an infinite number of values, but others
cannot.
Problem 1
(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III
Solution
The table below shows the probabilities associated with each possible value of
X. The probability of getting 0 heads is 0.25; 1 head, 0.50; and 2 heads, 0.25.
Thus, the table is an example of a probability distribution for a discrete random
variable.
Number of heads, x    Probability, P(x)
0                     0.25
1                     0.50
2                     0.25
The charts below show two continuous probability distributions. The first chart
shows a probability density function described by the equation y = 1 over the
range of 0 to 1 and y = 0 elsewhere. The second chart shows a probability
density function described by the equation y = 1 - 0.5x over the range of 0 to 2
and y = 0 elsewhere. The area under the curve is equal to 1 for both charts.
[Charts: the density y = 1 over 0 to 1, and the density y = 1 - 0.5x over 0 to 2]
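The claim that the area under each density equals 1 can be checked numerically. This is a minimal sketch using the trapezoidal rule; the helper `area_under` and the step count are illustrative choices.

```python
def area_under(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


# y = 1 on the range 0 to 1
uniform_area = area_under(lambda x: 1.0, 0.0, 1.0)

# y = 1 - 0.5x on the range 0 to 2
triangular_area = area_under(lambda x: 1.0 - 0.5 * x, 0.0, 2.0)

print(round(uniform_area, 6), round(triangular_area, 6))  # 1.0 1.0
```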
Problem 1
(A) 0.10
(B) 0.15
(C) 0.25
(D) 0.50
(E) 0.90
Solution
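The mean of the discrete random variable X is also called the expected value of
X. Notationally, the expected value of X is denoted by E(X). Use the following
formula to compute the mean of a discrete random variable.
E(X) = Σ [ xi * P(xi) ]
where xi is the value of the random variable for outcome i, and P(xi) is the
probability of that outcome.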
Example 1
In a recent little league softball game, each player went to bat 4 times. The
number of hits made by each player is described by the following probability
distribution.
Number of hits, x    Probability, P(x)
0                    0.10
1                    0.20
2                    0.30
3                    0.25
4                    0.15
(A) 1.00
(B) 1.75
(C) 2.00
(D) 2.25
(E) None of the above.
Solution
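The question text for this example appears to be missing. Assuming it asks for the mean (expected value) of the number of hits, the calculation can be sketched as follows, multiplying each value by its probability and summing. The dictionary `hits` simply restates the table above.

```python
# Probability distribution from the table: number of hits -> probability.
hits = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}

# A valid probability distribution must sum to 1.
total_probability = sum(hits.values())

# Expected value: sum of x * P(x) over all possible values of x.
expected_hits = sum(x * p for x, p in hits.items())

print(round(total_probability, 2), round(expected_hits, 2))  # 1.0 2.15
```

Under that assumption, the result of 2.15 does not match choices (A) through (D), which would point to (E).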
The median of a discrete random variable is the "middle" value. It is the value
of X for which P(X ≤ x) is greater than or equal to 0.5 and P(X ≥ x) is greater
than or equal to 0.5.
The equation for computing the variance of a discrete random variable is shown
below.
σ² = Σ { [ xi - E(x) ]² * P(xi) }
where xi is the value of the random variable for outcome i, P(xi) is the
probability that the random variable will be outcome i, and E(x) is the expected
value of the discrete random variable x.
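As an illustration of the variance definition, the sketch below applies it to the number-of-hits distribution from Example 1. Reusing that table here is a choice made for illustration, and the function names are hypothetical.

```python
from math import sqrt


def expected_value(dist):
    """E(x): sum of x * P(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Sum of squared deviations from E(x), weighted by P(x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


hits = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}

var_hits = variance(hits)
sd_hits = sqrt(var_hits)  # the standard deviation is the square root

print(round(var_hits, 4))  # 1.4275
print(round(sd_hits, 3))
```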
Example 2
Number of adults, x 1 2 3 4
(A) 0.50
(B) 0.62
(C) 0.79
(D) 0.89
(E) 2.10
Solution
The correct answer is D. The solution has three parts. First, find the expected
value; then, find the variance; then, find the standard deviation. Computations
are shown below, beginning with the expected value.
And finally, the standard deviation is equal to the square root of the variance; so
the standard deviation is sqrt(0.79) or 0.889.
The above conditions are equivalent. If either one is met, the other condition is
also met; and X and Y are independent. If either condition is not met, X and Y
are dependent.
Each trial can result in just two possible outcomes. We call one of these
outcomes a success and the other, a failure.
The trials are independent; that is, the outcome on one trial does not
affect the outcome on other trials.
Consider the following statistical experiment. You flip a coin 2 times and count
the number of times the coin lands on heads. This is a binomial experiment
because:
Each trial can result in just two possible outcomes - heads or tails.
The trials are independent; that is, getting heads on one trial does not
affect whether we get heads on other trials.
Notation
Binomial Distribution
Suppose we flip a coin two times and count the number of heads (successes).
The binomial random variable is the number of heads, which can take on values
of 0, 1, or 2. The binomial distribution is presented below.
For n trials, each with probability of success P, the mean of the distribution (μx) is
n * P.
The variance (σ²x) is
n * P * ( 1 - P ).
The standard deviation (σx) is
sqrt[ n * P * ( 1 - P ) ].
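The two-flip distribution and the summary formulas can be checked together. The sketch below uses the standard binomial formula, C(n, x) * P^x * (1 - P)^(n - x), which is not written out in this section; `binomial_pmf` is an illustrative name.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 2, 0.5  # two flips of a fair coin

distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
print(distribution)  # {0: 0.25, 1: 0.5, 2: 0.25}

# Summary formulas from the text.
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

# Cross-check the formulas against the distribution itself.
mean_from_pmf = sum(x * q for x, q in distribution.items())
var_from_pmf = sum((x - mean_from_pmf) ** 2 * q for x, q in distribution.items())

print(mean, variance)  # 1.0 0.5
```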
Binomial Formula and Binomial Probability
Therefore, the expected value (mean) and the variance of the Poisson
distribution are both equal to λ.
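The normal distribution is defined by the following equation:
f(x) = [ 1 / ( σ * sqrt(2π) ) ] * e^( -(x - μ)² / (2σ²) )
Where,
x is the variable
μ is the mean
σ is the standard deviation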
The graph of the normal distribution depends on two factors - the mean and the
standard deviation. The mean of the distribution determines the location of the
center of the graph, and the standard deviation determines the height and width
of the graph. All normal distributions look like a symmetric, bell-shaped curve,
as shown below.
Here is a histogram of SAT Critical Reading scores. The scores create a symmetrical
curve that can be approximated by a normal curve, as shown. Notice that we see the
characteristic bell shape of this near-normal distribution.
A normal curve is described by a mean (μ) and
a standard deviation (σ).
The curve is symmetric about the mean, which means that the right and left sides of the
curve are identical mirror images of each other.
Because the right and left sides are mirror images of each other, 50% of the values are
less than the mean and 50% of the values are greater than the mean.
The height of a normal distribution is a maximum at the mean, and the height decreases
as one goes from the mean toward the right tail, or as one goes from the mean to the left
tail.
When you are given a normal distribution, with a given mean and standard deviation,
you can determine important locations on the bell curve by adding standard deviations to
the mean and by subtracting standard deviations from the mean.
For example, if you are given information about IQ scores, which are normally
distributed, and are told that the mean IQ score is μ = 100 and the standard
deviation is σ = 15, then you can calculate that an IQ score that is 1 standard
deviation above the mean is:
μ + σ = 100 + 15 = 115
In other words, a person who has an IQ score of 115 has an IQ score that is 1 standard
deviation above the mean.
A person who has an IQ score of 70 has an IQ score that is 2 standard deviations below
the mean.
We're about to see that values become less and less likely the farther they lie
from the mean.
That is, it would be much less likely to find an IQ score that was 3 standard deviations
above the mean than to find one that was 2 standard deviations above the mean (or two
standard deviations below the mean, for that matter).
This is such an important concept that we have a rule of thumb referred to as the
Empirical Rule for normal distributions. In all normal distributions, the Empirical Rule tells
us that:
1. About 68% of all data values will fall within +/- 1 standard deviation of the mean.
2. About 95% of all data values will fall within +/- 2 standard deviations of the mean.
3. About 99.7% of all data values will fall within +/- 3 standard deviations of the mean.
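The three percentages in the Empirical Rule can be checked against the standard normal distribution. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `share_within` is an illustrative helper.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The rule's 68%, 95% and 99.7% are rounded versions of these values.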
Here is a sketch of a representative normal curve, with the Empirical Rule displayed.
Let's assume for a moment that this normal curve was the distribution of the IQ scores of
1,000 high school students. Let's again assume that the mean IQ score of these
students is μ = 100 and that the standard deviation is σ = 15.
Let's consider what this all means. First, the average (arithmetic mean) IQ score of all
the students is 100.
That is, if you averaged all of the students' IQ scores, you'd see their average IQ score
was 100.
Second, if σ is 15, then about 68% of the students had an IQ score in the interval from 85
to 115 since 100 - 15 = 85 and 100 + 15 = 115.
In other words, about 680 of the IQ scores of the 1000 students are between 85 and
115.
Think about that. Almost 70% of the students have an IQ score that is within 1 standard
deviation of the mean.
Thus, only about 32% of the IQ scores lie more than 1 standard deviation
from the mean.
Let's consider how many students' IQ scores fall within 2 standard deviations of the
mean.
The scores that are within two standard deviations of the mean range from 70 to 130 since
100 - 2(15) = 70 and 100 + 2(15) = 130.
From the Empirical Rule, we know that about 95% of all students' IQ scores will fall
within this range.
Thus, about 950 of the 1,000 students' IQ scores fall in this range.
Conversely, we'd expect only about 50 students to have an IQ score that is either less than 70 or greater
than 130.
For example, finding a student with an IQ score of 60 would be highly unlikely. Similarly,
finding a student with an IQ score of 140 would be highly unlikely.
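The student counts in this example can be estimated from the normal model directly. This is a sketch assuming IQ scores follow a normal distribution with mean 100 and standard deviation 15, as stated above; `expected_count` is an illustrative helper.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
students = 1_000


def expected_count(low, high):
    """Expected number of students with an IQ score between low and high."""
    return students * (iq.cdf(high) - iq.cdf(low))


within_1 = expected_count(85, 115)  # within 1 standard deviation
within_2 = expected_count(70, 130)  # within 2 standard deviations
outside_2 = students - within_2     # below 70 or above 130

# Scores such as 60 or 140 sit far out in the tails.
p_below_60 = iq.cdf(60)
p_above_140 = 1 - iq.cdf(140)

print(round(within_1), round(within_2), round(outside_2))
print(p_below_60, p_above_140)
```

The exact model gives counts slightly different from the rounded 680, 950 and 50 used in the text, because the Empirical Rule percentages are approximations.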
Keep in mind that when we have skewed data, there are limitations on how we can
analyze the data. For example, we can't use the Empirical Rule for data that come from
a skewed distribution. A normal distribution is required to use the Empirical Rule.
A high standard deviation means that values are generally far from the mean, while a
low standard deviation indicates that values are clustered close to the mean.
When you collect data from a sample, the sample standard deviation is used to
make estimates or inferences about the population standard deviation.
Steps for calculating the standard deviation
There are six main steps for finding the standard deviation by hand. We’ll use a
small data set of 6 scores to walk through the steps.
Data set
46 69 32 60 52 41
Step 1: Find the mean
To find the mean, add up all the scores, then divide them by the number of
scores.
Mean (x̅)
x̅ = (46 + 69 + 32 + 60 + 52 + 41) ÷ 6 = 50
Step 2: Find each score's deviation from the mean
Subtract the mean from each score to get the deviations from the mean.
Deviations from the mean
-4, 19, -18, 10, 2, -9
Step 3: Square each deviation from the mean
Multiply each deviation from the mean by itself. This will result in positive
numbers.
Squared deviations
16, 361, 324, 100, 4, 81
Step 4: Find the sum of squares
Add up all of the squared deviations. This is called the sum of squares.
Sum of squares
16 + 361 + 324 + 100 + 4 + 81 = 886
Step 5: Find the variance
Divide the sum of the squares by n – 1 (for a sample) or N (for a population).
This is the variance.
Variance
886 ÷ (6 – 1) = 886 ÷ 5 = 177.2
Step 6: Find the square root of the variance
To find the standard deviation, we take the square root of the variance.
Standard deviation
√177.2 = 13.31
From learning that SD = 13.31, we can say that each score deviates from the
mean by 13.31 points on average.
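The hand calculation can be mirrored step by step in code. The sketch below takes the six scores from the mean calculation above and compares the result with `statistics.stdev`, the standard library's sample standard deviation.

```python
import statistics
from math import sqrt

scores = [46, 69, 32, 60, 52, 41]

# Find the mean.
mean = sum(scores) / len(scores)

# Find each score's deviation from the mean, then square it.
deviations = [x - mean for x in scores]
squared = [d**2 for d in deviations]

# Add up the squared deviations (the sum of squares).
sum_of_squares = sum(squared)

# Divide by n - 1 because the scores are treated as a sample.
variance = sum_of_squares / (len(scores) - 1)

# Take the square root of the variance.
std_dev = sqrt(variance)

print(mean, sum_of_squares, round(variance, 1), round(std_dev, 2))
# 50.0 886.0 177.2 13.31

# The library function should agree with the manual steps.
library_std_dev = statistics.stdev(scores)
```

For a full population rather than a sample, the divisor would be N instead of n - 1, which corresponds to `statistics.pstdev`.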
Z TABLE
Negative Z score table
Use the negative Z score table below to find values on the left of the mean as can be
seen in the graph alongside. Corresponding values which are less than the mean are
marked with a negative score in the z-table and represent the area under the bell
curve to the left of z.
Positive Z score table
Use the positive Z score table below to find values on the right of the mean as can be
seen in the graph alongside. Corresponding values which are greater than the mean are
marked with a positive score in the z-table and represent the area under the bell curve
to the left of z.
To use the Z-Tables, however, you will need to know the Z-Score. It is the Z-Score
that gets mapped across the Z-Table, and it is usually either provided or has
to be derived using the Z Score formula. But before we take a look at the formula, let us
understand what the Z Score is.
What is a Z Score?
A Z Score, also called the Standard Score, is a measurement of how many standard
deviations below or above the population mean a raw score is. In simple terms, the
Z Score gives you an idea of a value’s relationship to the mean and how far from the
mean a data point is.
A Z Score is measured in terms of standard deviations from the mean. This means that if the Z
Score = 1, the value is one standard deviation above the mean, whereas if the Z Score = 0,
the value is identical to the mean.
A Z Score can be either positive or negative depending on whether the score lies above the
mean (in which case it is positive) or below the mean (in which case it is negative).
The Z Score formula is Z = ( x – µ ) / σ, where x is the observed value, µ is the mean,
and σ is the standard deviation.
When a Z Score is not provided, we use this formula to calculate it from the other data
available: the observed value, the mean, and the standard deviation. Similarly, if we have
the Z Score and are missing any one of the other three values, we can substitute the known
values into the formula and solve for the missing one.
Understanding how to use the Z Score Formula with an example
Let us understand how to calculate the Z-score, the Z-Score Formula and use the Z-
table with a simple real life example.
Q: The exam scores of 300 college students are tallied at the end of the semester. Eric
scored 800 marks (X) in total out of 1000. The average score for the batch was 700 (µ) and
the standard deviation was 180 (σ). Let’s find out how well Eric scored compared to
his batch mates.
Using the above data we need to first standardize his score and use the respective z-
table before we determine how well he performed compared to his batch mates.
Z score = ( x – µ ) / σ
Z score = ( 800 – 700 ) / 180 = 100 / 180
Z score ≈ 0.56
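The standardization step, and the reverse step of solving for a missing raw score, can be sketched as two small functions. The names `z_score` and `raw_score` are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Invert the formula: recover the observed value from a known Z score."""
    return mean + z * sd


# Eric's result: x = 800, mean = 700, standard deviation = 180.
eric_z = z_score(800, 700, 180)
print(round(eric_z, 2))  # 0.56

# Going back from the Z score to the raw score recovers 800.
print(round(raw_score(eric_z, 700, 180), 6))  # 800.0
```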
Once we have the Z Score which was derived through the Z Score formula, we can
now go to the next part which is understanding how to read the Z Table and map the
value of the Z Score we’ve got, using it.
Once you have the Z Score, the next step is choosing between the two tables, that is,
between the negative Z Table and the positive Z Table, depending on
whether your Z score value is positive or negative.
What a positive or negative Z Score establishes is whether your value lies on the left
of the mean or the right of the mean. A negative Z Score means the value lies to the left
of the mean, so you use the negative Z Table. A positive Z Score means the value lies to
the right of the mean, so you use the positive Z Table. In both tables, the entry is the
area under the curve to the left of z.
Now that we have Eric’s Z score, which we know is a positive 0.56, and we know
which table to pick for it, we will use the positive Z-table
(Table 1.2) to see how well Eric performed compared to his batch mates.
Now that we’ve picked the appropriate table, the next step is to map our Z score
value in that table. Let us work through it using Eric’s Z score of 0.56.
Move down the leftmost column (the Y-axis) to find the value of the first two
digits of your Z Score (0.5 based on Eric’s Z score).
Once you have that, move along the topmost row (the X-axis) to find the value of
the digit at the second decimal position (.06 based on Eric’s Z score).
Once you have mapped these two values, find the intersection of the row of the first
two digits and the column of the second decimal value in the table. The intersection of
the two is the answer we’re looking for.
In our example, the intersection gives a value of 0.71226 (~ 0.7123).
To get this as a percentage, we multiply that number by 100. Therefore 0.7123 x 100
= 71.23%. Hence we find that Eric did better than 71.23% of students.
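A Z table entry is the cumulative area to the left of z under the standard normal curve, so the lookup can be reproduced in code. This sketch uses `statistics.NormalDist` from the Python standard library; `area_left_of` is an illustrative helper, not part of the table-based method.

```python
from statistics import NormalDist


def area_left_of(z):
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)


# Same input as the table lookup: Z rounded to two decimals.
table_value = area_left_of(0.56)
print(round(table_value, 5))  # 0.71226
print(round(table_value * 100, 2))  # 71.23

# Using the unrounded Z score (100 / 180) gives a slightly smaller area,
# because the table forces Z to two decimal places.
exact_value = area_left_of(100 / 180)
print(round(exact_value, 4))
```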
Let us take one more example, but this time for a negative z score and the negative
z table. Suppose the z score is -1.35.
Based on what we discussed before, since the z score is negative, we will use the
negative z table (Table 1.1).
First, move down the leftmost column (the Y-axis) to find the value
of the first two digits, that is, -1.3.
Once we have that, we move along the topmost row (the X-axis) to find the column for the
second decimal (0.05 in this case).
The intersection of the row of the first two digits and the column of the second decimal
value in the Z table is the answer we’re looking for, which in our example
is 0.08851 or 8.85%.
(Note that this method of mapping the Z Score value is the same for both positive and
negative Z Scores. That is because, for a standard normal distribution,
the two halves of the curve on either side of the mean are mirror images. So the choice of
table depends only on whether the Z Score is positive or negative, that is, whether the
value lies to the left of the mean or to the right of the mean.)
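The negative-table lookup and the symmetry noted above can be checked numerically. This is a sketch assuming the second example uses a Z score of -1.35, as implied by the row (-1.3) and column (0.05) in the walkthrough.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

left_of_negative = standard_normal.cdf(-1.35)
left_of_positive = standard_normal.cdf(1.35)

print(round(left_of_negative, 5))  # 0.08851
print(round(left_of_positive, 5))  # 0.91149

# Symmetry: the area to the left of -z equals the area to the right of +z,
# so the two table entries always add up to 1.
symmetry_gap = abs(left_of_negative + left_of_positive - 1.0)
```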
There are two Z tables to make things less complicated. They could be combined into
one single larger Z-table, but that can be a bit overwhelming for a lot of beginners, and
it also increases the chance of human error during calculations. Using two Z tables
makes life easier: based on whether your Z score is positive or negative, you can use
the respective Z score table.
If you want to know the area to the left of a negative Z value, you will use the
first table (1.1) shown above, which is the left-hand/negative Z-table. If you want to
know the area to the left of a positive Z value, you will use the second table (1.2)
above, which is the right-hand/positive Z-table.