Statistics For Management - 1

The document discusses probability and key concepts related to probability: (1) it defines probability as the likelihood that an event will occur, expressed as a number between 0 and 1; (2) it explains how to interpret various probability values and why the probabilities of all possible outcomes sum to 1; and (3) it provides examples of computing probability when outcomes are equally likely and discusses the law of large numbers.

What is Probability?

The probability of an event refers to the likelihood that the event will occur.

How to Interpret Probability

Mathematically, the probability that an event will occur is expressed as a


number between 0 and 1. Notationally, the probability of event A is represented
by P(A).

 If P(A) equals zero, event A will almost definitely not occur.

 If P(A) is close to zero, there is only a small chance that event A will
occur.

 If P(A) equals 0.5, there is a 50-50 chance that event A will occur.

 If P(A) is close to one, there is a strong chance that event A will occur.

 If P(A) equals one, event A will almost definitely occur.

In a statistical experiment, the sum of probabilities for all possible outcomes is


equal to one. This means, for example, that if an experiment can have three
possible outcomes (A, B, and C), then P(A) + P(B) + P(C) = 1.

How to Compute Probability: Equally Likely Outcomes

Sometimes, a statistical experiment can have n possible outcomes, each of


which is equally likely. Suppose a subset of r outcomes are classified as
"successful" outcomes.

The probability that the experiment results in a successful outcome (S) is:

P(S) = ( Number of successful outcomes ) / ( Total number of equally likely outcomes ) = r / n

Consider the following experiment. An urn has 10 marbles. Two marbles are
red, three are green, and five are blue. If an experimenter randomly selects 1
marble from the urn, what is the probability that it will be green?

In this experiment, there are 10 equally likely outcomes, three of which are
green marbles. Therefore, the probability of choosing a green marble is 3/10 or
0.30.

How to Compute Probability: Law of Large Numbers

One can also think about the probability of an event in terms of its long-
run relative frequency. The relative frequency of an event is the number of
times an event occurs, divided by the total number of trials.

P(A) = ( Frequency of Event A ) / ( Number of Trials )

For example, a merchant notices one day that 5 out of 50 visitors to her store
make a purchase. The next day, 20 out of 50 visitors make a purchase. The two
relative frequencies (5/50 or 0.10 and 20/50 or 0.40) differ. However, summing
results over many visitors, she might find that the probability that a visitor
makes a purchase gets closer and closer to 0.20.
The scatterplot above shows the relative frequency of purchase as the number of
trials (in this case, the number of visitors) increases. Over many trials, the
relative frequency converges toward a stable value (0.20), which can be
interpreted as the probability that a visitor to the store will make a purchase.

The idea that the relative frequency of an event will converge on the probability
of the event, as the number of trials increases, is called the law of large
numbers.
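
The convergence described above is easy to see in a short simulation. The following Python sketch assumes, for illustration only, that the merchant's true long-run purchase probability is 0.20 (the value suggested in the text) and prints the relative frequency after increasingly many simulated visitors:

import random

random.seed(42)            # fixed seed for a reproducible illustration
true_p = 0.20              # assumed long-run purchase probability

for n in (10, 100, 1_000, 10_000, 100_000):
    purchases = sum(random.random() < true_p for _ in range(n))
    print(f"visitors={n:>7}  relative frequency={purchases / n:.4f}")
# The printed relative frequencies wander at first, then settle near 0.20.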

Test Your Understanding

Problem

A coin is tossed three times. What is the probability that it lands on


heads exactly one time?

(A) 0.125
(B) 0.250
(C) 0.333
(D) 0.375
(E) 0.500

Solution

The correct answer is (D). If you toss a coin three times, there are a total of
eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT, THT, TTH,
and TTT. Of the eight possible outcomes, three have exactly one head. They
are: HTT, THT, and TTH. Therefore, the probability that three flips of a coin
will produce exactly one head is 3/8 or 0.375.
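
Because the outcomes are few enough to list, the answer can also be checked by enumeration. A minimal Python sketch:

from itertools import product

outcomes = list(product("HT", repeat=3))          # HHH, HHT, ..., TTT
one_head = [o for o in outcomes if o.count("H") == 1]

print(len(outcomes))                   # 8
print(one_head)                        # [('H','T','T'), ('T','H','T'), ('T','T','H')]
print(len(one_head) / len(outcomes))   # 0.375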

RULES OF PROBABILITY

Often, we want to compute the probability of an event from the known


probabilities of other events. This lesson covers some important rules that
simplify those computations.

Definitions and Notation

Before discussing the rules of probability, we state the following definitions:


 Two events are mutually exclusive or disjoint if they cannot occur at the
same time.
 The probability that Event A occurs, given that Event B has occurred, is
called a conditional probability. The conditional probability of Event A,
given Event B, is denoted by the symbol P(A|B).
 The complement of an event is the event not occurring. The probability
that Event A will not occur is denoted by P(A').
 The probability that Events A and B both occur is the probability of
the intersection of A and B. The probability of the intersection of Events
A and B is denoted by P(A ∩ B). If Events A and B are mutually
exclusive, P(A ∩ B) = 0.
 The probability that Events A or B occur is the probability of
the union of A and B. The probability of the union of Events A and B is
denoted by P(A ∪ B) .
 If the occurrence of Event A changes the probability of Event B, then
Events A and B are dependent. On the other hand, if the occurrence of
Event A does not change the probability of Event B, then Events A and B
are independent.

Rule of Subtraction

In a previous lesson, we learned two important properties of probability:

 The probability of an event ranges from 0 to 1.


 The sum of probabilities of all possible events equals 1.

The rule of subtraction follows directly from these properties.

Rule of Subtraction. The probability that event A will occur is equal to 1


minus the probability that event A will not occur.

P(A) = 1 - P(A')

Suppose, for example, the probability that Bill will graduate from college is
0.80. What is the probability that Bill will not graduate from college? Based on
the rule of subtraction, the probability that Bill will not graduate is 1.00 - 0.80
or 0.20.
Rule of Multiplication

The rule of multiplication applies to the situation when we want to know the
probability of the intersection of two events; that is, we want to know the
probability that two events (Event A and Event B) both occur.

Rule of Multiplication The probability that Events A and B both occur is equal


to the probability that Event A occurs times the probability that Event B occurs,
given that A has occurred.

P(A ∩ B) = P(A) P(B|A)

Example
An urn contains 6 red marbles and 4 black marbles. Two marbles are
drawn without replacement from the urn. What is the probability that both of the
marbles are black?

Solution: Let A = the event that the first marble is black; and let B = the event
that the second marble is black. We know the following:

 In the beginning, there are 10 marbles in the urn, 4 of which are black.
Therefore, P(A) = 4/10.
 After the first selection, there are 9 marbles in the urn, 3 of which are
black. Therefore, P(B|A) = 3/9.

Therefore, based on the rule of multiplication:

P(A ∩ B) = P(A) P(B|A)


P(A ∩ B) = (4/10) * (3/9) = 12/90 = 2/15 = 0.133
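
A small Python sketch can confirm this calculation, both exactly with fractions and approximately with a simulation of repeated draws (the simulation is illustrative only):

import random
from fractions import Fraction

p_exact = Fraction(4, 10) * Fraction(3, 9)        # P(A) * P(B|A)
print(p_exact, float(p_exact))                    # 2/15, about 0.1333

random.seed(0)
urn = ["red"] * 6 + ["black"] * 4
trials = 100_000
both_black = sum(random.sample(urn, 2) == ["black", "black"] for _ in range(trials))
print(both_black / trials)                        # close to 0.1333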

Rule of Addition

The rule of addition applies to the following situation. We have two events, and
we want to know the probability that either event occurs.

Rule of Addition The probability that Event A or Event B occurs is equal to the


probability that Event A occurs plus the probability that Event B occurs minus
the probability that both Events A and B occur.

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)


Note: Invoking the fact that P(A ∩ B) = P( A )P( B | A ), the Addition Rule can
also be expressed as:

P(A ∪ B) = P(A) + P(B) - P(A)P( B | A )

Example
A student goes to the library. The probability that she checks out (a) a work of
fiction is 0.40, (b) a work of non-fiction is 0.30, and (c) both fiction and non-
fiction is 0.20. What is the probability that the student checks out a work of
fiction, non-fiction, or both?

Solution: Let F = the event that the student checks out fiction; and let N = the
event that the student checks out non-fiction. Then, based on the rule of
addition:

P(F ∪ N) = P(F) + P(N) - P(F ∩ N)


P(F ∪ N) = 0.40 + 0.30 - 0.20 = 0.50

Test Your Understanding

Problem 1

An urn contains 6 red marbles and 4 black marbles. Two marbles are
drawn with replacement from the urn. What is the probability that both of the
marbles are black?

(A) 0.16
(B) 0.32
(C) 0.36
(D) 0.40
(E) 0.60

Solution

The correct answer is A. Let A = the event that the first marble is black; and let
B = the event that the second marble is black. We know the following:

 In the beginning, there are 10 marbles in the urn, 4 of which are black.
Therefore, P(A) = 4/10.
 After the first selection, we replace the selected marble; so there are still
10 marbles in the urn, 4 of which are black. Therefore, P(B|A) = 4/10.
Therefore, based on the rule of multiplication:

P(A ∩ B) = P(A) P(B|A)


P(A ∩ B) = (4/10)*(4/10) = 16/100 = 0.16

Problem 2

A card is drawn randomly from a deck of ordinary playing cards. You win $10
if the card is a spade or an ace. What is the probability that you will win the
game?

(A) 1/13
(B) 13/52
(C) 4/13
(D) 17/52
(E) None of the above.

Solution

The correct answer is C. Let S = the event that the card is a spade; and let A =
the event that the card is an ace. We know the following:

 There are 52 cards in the deck.


 There are 13 spades, so P(S) = 13/52.
 There are 4 aces, so P(A) = 4/52.
 There is 1 ace that is also a spade, so P(S ∩ A) = 1/52.

Therefore, based on the rule of addition:

P(S ∪ A) = P(S) + P(A) - P(S ∩ A)


P(S ∪ A) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13
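
The same answer can be recovered by enumerating the deck. A short Python sketch (the rank and suit labels are illustrative):

from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))                       # 52 cards

wins = [c for c in deck if c[1] == "spades" or c[0] == "A"]
print(len(deck), len(wins), Fraction(len(wins), len(deck)))   # 52 16 4/13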

Bayes' Theorem Definition

What Is Bayes' Theorem?


Bayes' theorem, named after 18th-century British mathematician Thomas
Bayes, is a mathematical formula for determining conditional probability.
Conditional probability is the likelihood of an outcome occurring, based on a
previous outcome occurring. Bayes' theorem provides a way to revise existing
predictions or theories (update probabilities) given new or additional evidence.
In finance, Bayes' theorem can be used to rate the risk of lending money to
potential borrowers.

Bayes' theorem is also called Bayes' Rule or Bayes' Law and is the foundation
of the field of Bayesian statistics.

Bayes' theorem describes the probability of occurrence of an event in relation to
some condition, and so it is closely tied to conditional probability. Bayes'
theorem is also known as the formula for the probability of "causes". For
example, suppose we want the probability that a blue ball drawn came from the
second of three different bags of balls, where each bag contains balls of three
different colours, viz. red, blue, and black. A probability of occurrence that is
calculated by conditioning on other events in this way is a conditional
probability.

Understanding Bayes' Theorem


Applications of the theorem are widespread and not limited to the financial
realm. As an example, Bayes' theorem can be used to determine the accuracy of
medical test results by taking into consideration how likely any given person is
to have a disease and the general accuracy of the test. Bayes' theorem relies on
incorporating prior probability distributions in order to generate posterior
probabilities. Prior probability, in Bayesian statistical inference, is the
probability of an event before new data is collected. This is the best rational
assessment of the probability of an outcome based on the current knowledge
before an experiment is performed. Posterior probability is the revised
probability of an event occurring after taking into consideration new
information. Posterior probability is calculated by updating the prior
probability by using Bayes' theorem. In statistical terms, the posterior
probability is the probability of event A occurring given that event B has
occurred.

Bayes' theorem thus gives the probability of an event based on new information
that is, or may be related, to that event. The formula can also be used to see how
the probability of an event occurring is affected by hypothetical new
information, supposing the new information will turn out to be true. For
instance, say a single card is drawn from a complete deck of 52 cards. The
probability that the card is a king is four divided by 52, which equals 1/13 or
approximately 7.69%. Remember that there are four kings in the deck. Now,
suppose it is revealed that the selected card is a face card. The probability the
selected card is a king, given it is a face card, is four divided by 12, or
approximately 33.3%, as there are 12 face cards in a deck.
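
In formula form, Bayes' theorem states P(A|B) = P(B|A) P(A) / P(B). The following Python sketch applies it to the king/face-card example above:

# Update the probability that the card is a king after learning it is a face card.
p_king = 4 / 52          # prior: card is a king
p_face = 12 / 52         # evidence: card is a face card (J, Q, K)
p_face_given_king = 1.0  # every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(round(p_king, 4), round(p_king_given_face, 4))   # 0.0769 and 0.3333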
WHAT IS A RANDOM VARIABLE?

When the value of a variable is determined by a chance event, that variable is


called a random variable.

Discrete vs. Continuous Random Variables

Random variables can be discrete or continuous.

 Discrete. Within a range of numbers, discrete variables can take on only


certain values. Suppose, for example, that we flip a coin and count the
number of heads. The number of heads will be a value between zero and
plus infinity. Within that range, though, the number of heads can be only
certain values. For example, the number of heads can only be a whole
number, not a fraction. Therefore, the number of heads is a discrete
variable. And because the number of heads results from a random process
- flipping a coin - it is a discrete random variable.

 Continuous. Continuous variables, in contrast, can take on any value


within a range of values. For example, suppose we randomly select an
individual from a population. Then, we measure the age of that person. In
theory, his/her age can take on any value between zero and plus infinity,
so age is a continuous variable. In this example, the age of the person
selected is determined by a chance event; so, in this example, age is a
continuous random variable.

Discrete Variables: Finite vs. Infinite

Some references state that continuous variables can take on an infinite number
of values, but discrete variables cannot. This is incorrect.

 In some cases, discrete variables can take on only a finite number of


values. For example, the number of aces dealt in a poker hand can take on
only five values: 0, 1, 2, 3, or 4.

 In other cases, however, discrete variables can take on an infinite number


of values. For example, the number of coin flips that result in heads could
be infinitely large.

When comparing discrete and continuous variables, it is more correct to say that
continuous variables can always take on an infinite number of values; whereas
some discrete variables can take on an infinite number of values, but others
cannot.

Test Your Understanding

Problem 1

Which of the following is a discrete random variable?

I. The average height of a randomly selected group of boys.


II. The annual number of sweepstakes winners from New York City.
III. The number of presidential elections in the 20th century.

(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III

Solution

The correct answer is B.


The annual number of sweepstakes winners results from a random process, but
it can only be a whole number - not a fraction; so it is a discrete random
variable. The average height of a randomly-selected group of boys could take
on any value between the height of the smallest and tallest boys, so it is not a
discrete variable. And the number of presidential elections in the 20th century
does not result from a random process; so it is not a random variable.

What is a Probability Distribution?

A probability distribution is a table or an equation that links each possible


value that a random variable can assume with its probability of occurrence.

Discrete Probability Distributions

The probability distribution of a discrete random variable can always be


represented by a table. For example, suppose you flip a coin two times. This
simple exercise can have four possible outcomes: HH, HT, TH, and TT. Now,
let the variable X represent the number of heads that result from the coin flips.
The variable X can take on the values 0, 1, or 2; and X is a discrete random
variable.

The table below shows the probabilities associated with each possible value of
X. The probability of getting 0 heads is 0.25; 1 head, 0.50; and 2 heads, 0.25.
Thus, the table is an example of a probability distribution for a discrete random
variable.

Number of heads, x    Probability, P(x)
0                     0.25
1                     0.50
2                     0.25

Note: Given a probability distribution, you can find cumulative probabilities.


For example, the probability of getting 1 or fewer heads [ P(X ≤ 1) ] is P(X = 0)
+ P(X = 1), which is equal to 0.25 + 0.50 or 0.75.
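
The table and the cumulative probability can be reproduced by enumerating the four equally likely outcomes. A minimal Python sketch:

from itertools import product
from collections import Counter

outcomes = list(product("HT", repeat=2))                 # HH, HT, TH, TT
counts = Counter(o.count("H") for o in outcomes)
dist = {x: counts[x] / len(outcomes) for x in sorted(counts)}

print(dist)                                              # {0: 0.25, 1: 0.5, 2: 0.25}
print(dist[0] + dist[1])                                 # P(X <= 1) = 0.75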
Continuous Probability Distributions

The probability distribution of a continuous random variable is represented by


an equation, called the probability density function (pdf). All probability
density functions satisfy the following conditions:

 The random variable Y is a function of X; that is, y = f(x).

 The value of y is greater than or equal to zero for all values of x.


 The total area under the curve of the function is equal to one.

The charts below show two continuous probability distributions. The first chart
shows a probability density function described by the equation y = 1 over the
range of 0 to 1 and y = 0 elsewhere. The second chart shows a probability
density function described by the equation y = 1 - 0.5x over the range of 0 to 2
and y = 0 elsewhere. The area under the curve is equal to 1 for both charts.

[Charts: the two density curves, y = 1 on (0, 1) and y = 1 - 0.5x on (0, 2)]

The probability that a continuous random variable falls in the interval


between a and b is equal to the area under the pdf curve between a and b. For
example, in the first chart above, the shaded area shows the probability that the
random variable X will fall between 0.6 and 1.0. That probability is 0.40. And
in the second chart, the shaded area shows the probability of falling between 1.0
and 2.0. That probability is 0.25.
Note: With a continuous distribution, there are an infinite number of values
between any two data points. As a result, the probability that a continuous
random variable will assume a particular value is always zero. For example, in
both of the above charts, the probability that variable X will equal exactly 0.4 is
zero.
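
The two shaded-area values quoted above can be checked numerically. The following Python sketch integrates each density over the stated interval with a simple midpoint rule (no external libraries assumed):

def integrate(f, a, b, steps=100_000):
    """Simple midpoint-rule integration of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

uniform_pdf = lambda x: 1.0                  # y = 1 on [0, 1]
triangle_pdf = lambda x: 1 - 0.5 * x         # y = 1 - 0.5x on [0, 2]

print(round(integrate(uniform_pdf, 0.6, 1.0), 4))   # 0.4
print(round(integrate(triangle_pdf, 1.0, 2.0), 4))  # 0.25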

Test Your Understanding

Problem 1

The number of adults living in homes on a randomly selected city block is


described by the following probability distribution.

Number of adults, x    Probability, P(x)
1                      0.25
2                      0.50
3                      0.15
4 or more              ???

What is the probability that 4 or more adults reside at a randomly selected


home?

(A) 0.10
(B) 0.15
(C) 0.25
(D) 0.50
(E) 0.90

Solution

The correct answer is A. The sum of all the probabilities is equal to 1.


Therefore, the probability that four or more adults reside in a home is equal to 1
- (0.25 + 0.50 + 0.15) or 0.10

Mean and Variance of Random Variables

Just like variables from a data set, random variables are described by measures


of central tendency (like the mean) and measures of variability (like variance).
This lesson shows how to compute these measures for discrete random
variables.

Mean of a Discrete Random Variable

The mean of the discrete random variable X is also called the expected value of
X. Notationally, the expected value of X is denoted by E(X). Use the following
formula to compute the mean of a discrete random variable.

E(X) = μx = Σ [ xi * P(xi) ]

where xi is the value of the random variable for outcome i, μx is the mean of
random variable X, and P(xi) is the probability that the random variable will be
outcome i.

Example 1

In a recent little league softball game, each player went to bat 4 times. The
number of hits made by each player is described by the following probability
distribution.

Number of hits, x    Probability, P(x)
0                    0.10
1                    0.20
2                    0.30
3                    0.25
4                    0.15

What is the mean of the probability distribution?

(A) 1.00
(B) 1.75
(C) 2.00
(D) 2.25
(E) None of the above.

Solution

The correct answer is E. The mean of the probability distribution is 2.15, as


defined by the following equation.

E(X) = Σ [ xi * P(xi) ]


E(X) = 0*0.10 + 1*0.20 + 2*0.30 + 3*0.25 + 4*0.15 = 2.15

Median of a Discrete Random Variable

The median of a discrete random variable is the "middle" value. It is the value
of X for which P(X ≤ x) is greater than or equal to 0.5 and P(X ≥ x) is greater
than or equal to 0.5.

Consider the problem presented above in Example 1. In Example 1, the median
is 2, because P(X ≤ 2) is equal to 0.60 and P(X ≥ 2) is equal to 0.70. The
computations are shown below.

P(X ≤ 2) = P(x=0) + P(x=1) + P(x=2)

P(X ≤ 2) = 0.10 + 0.20 + 0.30 = 0.60

P(X ≥ 2) = P(x=2) + P(x=3) + P(x=4)

P(X ≥ 2) = 0.30 + 0.25 + 0.15 = 0.70
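
Both the expected value (2.15) and the median (2) for Example 1 can be reproduced with a few lines of Python. A minimal sketch:

dist = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}

mean = sum(x * p for x, p in dist.items())
print(round(mean, 2))                                    # 2.15

# Median: smallest x with P(X <= x) >= 0.5 and P(X >= x) >= 0.5.
cum_le = {x: sum(p for v, p in dist.items() if v <= x) for x in dist}
cum_ge = {x: sum(p for v, p in dist.items() if v >= x) for x in dist}
median = min(x for x in dist if cum_le[x] >= 0.5 and cum_ge[x] >= 0.5)
print(median, round(cum_le[median], 2), round(cum_ge[median], 2))   # 2 0.6 0.7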


Variability of a Discrete Random Variable

The equation for computing the variance of a discrete random variable is shown
below.

σ² = Σ { [ xi - E(X) ]² * P(xi) }

where xi is the value of the random variable for outcome i, P(xi) is the
probability that the random variable will be outcome i, and E(X) is the expected
value of the discrete random variable X.

Example 2

The number of adults living in homes on a randomly selected city block is


described by the following probability distribution.

Number of adults, x 1 2 3 4

Probability, P(x) 0.25 0.50 0.15 0.10

What is the standard deviation of the probability distribution?

(A) 0.50
(B) 0.62
(C) 0.79
(D) 0.89
(E) 2.10

Solution

The correct answer is D. The solution has three parts. First, find the expected
value; then, find the variance; then, find the standard deviation. Computations
are shown below, beginning with the expected value.

E(X) = Σ [ xi * P(xi) ]


E(X) = 1*0.25 + 2*0.50 + 3*0.15 + 4*0.10 = 2.10

Now that we know the expected value, we find the variance.


σ² = Σ { [ xi - E(X) ]² * P(xi) }
σ² = (1 - 2.1)² * 0.25 + (2 - 2.1)² * 0.50 + (3 - 2.1)² * 0.15 + (4 - 2.1)² * 0.10
σ² = (1.21 * 0.25) + (0.01 * 0.50) + (0.81 * 0.15) + (3.61 * 0.10)
σ² = 0.3025 + 0.0050 + 0.1215 + 0.3610 = 0.79

And finally, the standard deviation is equal to the square root of the variance; so
the standard deviation is sqrt(0.79) or 0.889.
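
The three steps of the solution translate directly into Python. A short sketch for the same distribution:

import math

dist = {1: 0.25, 2: 0.50, 3: 0.15, 4: 0.10}

mean = sum(x * p for x, p in dist.items())                       # 2.10
variance = sum((x - mean) ** 2 * p for x, p in dist.items())     # 0.79
std_dev = math.sqrt(variance)                                    # about 0.889

print(round(mean, 2), round(variance, 2), round(std_dev, 3))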

INDEPENDENT RANDOM VARIABLES

When a study involves pairs of random variables, it is often useful to know


whether or not the random variables are independent. This lesson explains how
to assess the independence of random variables.

Independence of Random Variables

If two random variables, X and Y, are independent, they satisfy the following


conditions.

 P(x|y) = P(x), for all values of X and Y.


 P(x ∩ y) = P(x) * P(y), for all values of X and Y.

The above conditions are equivalent. If either one is met, the other condition is
also met; and X and Y are independent. If either condition is not met, X and Y
are dependent.

Note: If X and Y are independent, then the correlation between X and Y is


equal to zero.

BINOMIAL PROBABILITY DISTRIBUTION

To understand binomial distributions and binomial probability, it helps to


understand binomial experiments and some associated notation; so we cover
those topics first.
Binomial Experiment

A binomial experiment is a statistical experiment that has the following


properties:

 The experiment consists of n repeated trials.

 Each trial can result in just two possible outcomes. We call one of these
outcomes a success and the other, a failure.

 The probability of success, denoted by P, is the same on every trial.

 The trials are independent; that is, the outcome on one trial does not
affect the outcome on other trials.

Consider the following statistical experiment. You flip a coin 2 times and count
the number of times the coin lands on heads. This is a binomial experiment
because:

 The experiment consists of repeated trials. We flip a coin 2 times.

 Each trial can result in just two possible outcomes - heads or tails.

 The probability of success is constant - 0.5 on every trial.

 The trials are independent; that is, getting heads on one trial does not
affect whether we get heads on other trials.

Notation

The following notation is helpful, when we talk about binomial probability.

 x: The number of successes that result from the binomial experiment.

 n: The number of trials in the binomial experiment.

 P: The probability of success on an individual trial.

 Q: The probability of failure on an individual trial. (This is equal to 1 - P.)

 n!: The factorial of n (also known as n factorial).


 b(x; n, P): Binomial probability - the probability that an n-trial binomial
experiment results in exactly x successes, when the probability of success
on an individual trial is P.

 n Cr: The number of combinations of n things, taken r at a time.

Binomial Distribution

A binomial random variable is the number of successes x in n repeated trials


of a binomial experiment. The probability distribution of a binomial random
variable is called a binomial distribution.

Suppose we flip a coin two times and count the number of heads (successes).
The binomial random variable is the number of heads, which can take on values
of 0, 1, or 2. The binomial distribution is presented below.

Number of heads    Probability
0                  0.25
1                  0.50
2                  0.25

The binomial distribution has the following properties:

 The mean of the distribution (μx) is equal to n * P.

 The variance (σ²x) is n * P * ( 1 - P ).

 The standard deviation (σx) is sqrt[ n * P * ( 1 - P ) ].
Binomial Formula and Binomial Probability

The binomial probability refers to the probability that a binomial experiment


results in exactly x successes. For example, in the above table, we see that the
binomial probability of getting exactly one head in two coin flips is 0.50.

Given x, n, and P, we can compute the binomial probability based on the
binomial formula:

b(x; n, P) = nCx * P^x * ( 1 - P )^(n - x)
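
A minimal Python sketch of this formula, checked against the two-coin-flip table shown earlier (math.comb computes nCx):

from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

for x in range(3):
    print(x, binomial_probability(x, n=2, p=0.5))   # 0.25, 0.5, 0.25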

Poisson Distribution Definition


A Poisson distribution is a probability distribution that results from a Poisson
experiment. A Poisson experiment is a statistical experiment in which each
outcome is classified into one of two categories, such as success or failure. The
Poisson distribution is a limiting case of the binomial distribution. A Poisson
random variable "x" counts the number of successes in the experiment. This
distribution arises when events occur at some average rate over an interval,
rather than as the outcomes of a definite number of trials. The Poisson
distribution is used as a limit of the binomial distribution under the following
conditions:

 The number of trials "n" tends to infinity
 The probability of success "p" tends to zero
 np = λ remains finite

Poisson Distribution Formula


The formula for the Poisson distribution function is given by:

f(x) = ( e^(-λ) * λ^x ) / x!

Where,
e is the base of the natural logarithm (approximately 2.71828)
x is a Poisson random variable
λ is the average rate of occurrence
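
A small Python sketch of this formula, using an illustrative average rate of λ = 2 (a value chosen only for demonstration):

from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the average rate is lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2                                   # hypothetical average rate
for x in range(5):
    print(x, round(poisson_pmf(x, lam), 4))
print(sum(poisson_pmf(x, lam) for x in range(50)))   # approximately 1.0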
Poisson Distribution Table
As with the binomial distribution, there is a table that, under certain
conditions, makes calculating probabilities a little easier when using the
Poisson distribution. The table shows the values of P(X ≥ x), where X has a
Poisson distribution with parameter λ. Look up the value for the given x and λ
to obtain the required probability.

Poisson Distribution Mean and Variance


Assume that we conduct a Poisson experiment in which the average number of
successes within a given range is λ. In a Poisson distribution, the mean of the
distribution is represented by λ, and e is a constant approximately equal to
2.71828. Then the Poisson probability is:

P(x; λ) = ( e^(-λ) * λ^x ) / x!

In a Poisson distribution, the mean is represented as E(X) = λ.

For a Poisson distribution, the mean and the variance are equal:
E(X) = V(X) = λ
where V(X) is the variance.

Poisson Distribution Expected Value


A random variable is said to have a Poisson distribution with parameter λ,
where λ is the expected value of the distribution. Using the probability
generating function e^(λ(t-1)), the expected value is obtained as:

E(X) = μ = d/dt [ e^(λ(t-1)) ] evaluated at t = 1
E(X) = λ

Therefore, the expected value (mean) and the variance of the Poisson
distribution are both equal to λ.

What is Uniform Distribution


A uniform distribution is a continuous probability distribution and is related
to events which are equally likely to occur. It is defined by two parameters,
x and y, where x = minimum value and y = maximum value. It is generally
denoted by U(x, y).
OR
Equivalently, if a continuous random variable X has the probability density
function f(x) = 1 / (y - x) for x ≤ X ≤ y (and 0 elsewhere), where x and y are
constants, then X follows a uniform distribution, written X ∼ U(x, y).
(Note: Check whether the data is inclusive or exclusive before working out
problems with uniform distribution.)
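
A brief Python sketch of the density f(x) = 1 / (y - x), using illustrative bounds x = 2 and y = 6 (hypothetical values):

def uniform_pdf(value, lower, upper):
    """Density of U(lower, upper); zero outside the interval."""
    if lower <= value <= upper:
        return 1 / (upper - lower)
    return 0.0

lower, upper = 2, 6                       # hypothetical minimum and maximum
print(uniform_pdf(3, lower, upper))       # 0.25 everywhere inside [2, 6]
# P(a <= X <= b) is the interval length times the constant density:
print((5 - 3) * uniform_pdf(3, lower, upper))   # P(3 <= X <= 5) = 0.5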

Normal Distribution Definition


The normal distribution is defined by the probability density function of a
continuous random variable in a system. Let us say f(x) is the probability
density function and X is the random variable. Integrating f(x) over the
interval (x, x + dx) gives the probability that X takes a value between x and
x + dx. The density satisfies:

f(x) ≥ 0 for all x ∈ (-∞, +∞)
∫ from -∞ to +∞ of f(x) dx = 1

Normal Distribution Formula


The probability density function of the normal (Gaussian) distribution is given by
the formula below; a short numerical check follows the parameter list.

f(x) = [ 1 / ( σ * sqrt(2π) ) ] * e^( -(x - μ)² / (2σ²) )

Where,

 x is the variable
 μ is the mean
 σ is the standard deviation
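
A minimal Python sketch of this density, evaluated at one point and compared with the standard library's NormalDist for reassurance (the IQ-style parameters μ = 100, σ = 15 are illustrative):

from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

mu, sigma = 100, 15                        # illustrative parameters
print(normal_pdf(115, mu, sigma))
print(NormalDist(mu, sigma).pdf(115))      # same value from the standard library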

The Normal Curve

The graph of the normal distribution depends on two factors - the mean and the
standard deviation. The mean of the distribution determines the location of the
center of the graph, and the standard deviation determines the height and width
of the graph. All normal distributions look like a symmetric, bell-shaped curve,
as shown below.

[Figure: two normal curves, one with a smaller and one with a bigger standard deviation]

When the standard deviation is small, the curve is tall and narrow; and when the
standard deviation is big, the curve is short and wide.

The Normal Distribution - Empirical Rule

Here is a histogram of SAT Critical Reading scores. The scores create a symmetrical
curve that can be approximated by a normal curve, as shown. Notice that we see the
characteristic bell shape of this near-normal distribution.

Every normal distribution has a mean (μ) and a standard deviation (σ).

Given any normal distribution, it will be true that mean = median = mode.

The curve is symmetric about the mean, which means that the right and left sides of the
curve are identical mirror images of each other.

Because the right and left sides are mirror images of each other, 50% of the values are
less than the mean and 50% of the values are greater than the mean.
The height of a normal distribution is a maximum at the mean, and the height decreases
as one goes from the mean toward the right tail, or as one goes from the mean to the left
tail.

The total area under the curve is 1, or 100%.

When you are given a normal distribution, with a given mean and standard deviation,
you can determine important locations on the bell curve by adding standard deviations to
the mean and by subtracting standard deviations from the mean.

For example, if you are given information about IQ scores, which are normally
distributed, and are told that the mean IQ score is μ = 100 and that the standard
deviation σ is 15, then you can calculate that an IQ score that is 1 standard
deviation above the mean is 100 + 15 = 115.

Similarly, an IQ score that is 1 standard deviation below the mean is 100 - 15 = 85.

An IQ score that is 2 standard deviations above the mean is 100 + 2(15) = 130.

An IQ score that is 2 standard deviations below the mean is 100 - 2(15) = 70.

In other words, a person who has an IQ score of 115 has an IQ score that is 1 standard
deviation above the mean.

A person who has an IQ score of 70 has an IQ score that is 2 standard deviations below
the mean.

We're about to see that it becomes less and less likely to find values the farther
they are from the mean.

That is, it would be much less likely to find an IQ score that was 3 standard deviations
above the mean than to find one that was 2 standard deviations above the mean (or two
standard deviations below the mean, for that matter).

This is such an important concept that we have a rule of thumb referred to as the
Empirical Rule for normal distributions. In all normal distributions, the Empirical Rule tells
us that:

1. About 68% of all data values will fall within +/- 1 standard deviation of the mean.
2. About 95% of all data values will fall within +/- 2 standard deviations of the mean.
3. About 99.7% of all data values will fall within +/- 3 standard deviations of the mean.
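
These three percentages can be recovered from the normal cumulative distribution function. A quick Python sketch using only the standard library:

from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)
for k in (1, 2, 3):
    share = std_normal.cdf(k) - std_normal.cdf(-k)   # area within +/- k std devs
    print(f"within +/-{k} standard deviation(s): {share:.4f}")
# within +/-1: 0.6827, within +/-2: 0.9545, within +/-3: 0.9973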

Here is a sketch of a representative normal curve, with the Empirical Rule displayed.
Let's assume for a moment that this normal curve was the distribution of the IQ scores of
1,000 high school students. Let's again assume that the mean IQ score of these
students is μ = 100 and that the standard deviation σ is 15.

Let's consider what this all means. First, the average (arithmetic mean) IQ score of all
the students is 100.

That is, if you averaged all of the students' IQ scores, you'd see their average IQ score
was 100.

Second, if σ is 15, then about 68% of the students had an IQ score in the interval from 85
to 115, since 100 - 15 = 85 and 100 + 15 = 115.

In other words, about 680 of the IQ scores of the 1000 students are between 85 and
115.

Think about that. Almost 70% of the students have an IQ score that is within 1 standard
deviation of the mean.

Thus, only about 30% of the IQ scores are outside of being within 1 standard deviation
of the mean.

Let's consider how many students' IQ scores fall within 2 standard deviations of the
mean.
The scores that are within two standard deviations of the mean range from 70 to 130, since
100 - 2(15) = 70 and 100 + 2(15) = 130.

From the Empirical Rule, we know that about 95% of all students' IQ scores will fall
within this range.

Thus, about 950 of the 1,000 students' IQ scores fall in this range.

This means that out of 1,000 students,

we'd expect only 50 students to have an IQ score that is either less than 70 or greater
than 130.

For example, finding a student with an IQ score of 60 would be highly unlikely. Similarly,
finding a student with an IQ score of 140 would be highly unlikely.

Keep in mind that when we have skewed data, there are limitations on how we can
analyze the data. For example, we can't use the Empirical Rule for data that come from
a skewed distribution. A normal distribution is required to use the Empirical Rule.

Understanding and Calculating Standard Deviation
The standard deviation is the average amount of variability in your dataset. It tells
you, on average, how far each value lies from the mean.

A high standard deviation means that values are generally far from the mean, while a
low standard deviation indicates that values are clustered close to the mean.

What does standard deviation tell you?

Standard deviation is a useful measure of spread for normal distributions.

In normal distributions, data is symmetrically distributed with no skew. Most


values cluster around a central region, with values tapering off as they go
further away from the center. The standard deviation tells you how spread out
from the center of the distribution your data is on average.

Many scientific variables follow normal distributions, including height,


standardized test scores, or job satisfaction ratings. When you have the standard
deviations of different samples, you can compare their distributions
using statistical tests to make inferences about the larger populations they came
from.

Standard deviation formulas for populations and samples


Different formulas are used for calculating standard deviations depending on
whether you have data from a whole population or a sample.

Population standard deviation


When you have collected data from every member of the population that you’re
interested in, you can get an exact value for population standard deviation.

The population standard deviation formula looks like this:

σ = sqrt[ Σ (xi - μ)² / N ]

where σ is the population standard deviation, xi is each value in the population,
μ is the population mean, and N is the number of values in the population.

Sample standard deviation

When you collect data from a sample, the sample standard deviation is used to
make estimates or inferences about the population standard deviation. The sample
standard deviation formula is:

s = sqrt[ Σ (xi - x̅)² / (n - 1) ]

where s is the sample standard deviation, xi is each value in the sample, x̅ is the
sample mean, and n is the sample size.
Steps for calculating the standard deviation

The standard deviation is usually calculated automatically by whichever


software you use for your statistical analysis. But you can also calculate it by
hand to better understand how the formula works.

There are six main steps for finding the standard deviation by hand. We’ll use a
small data set of 6 scores to walk through the steps.

Data set
46  69  32  60  52  41

Step 1: Find the mean

To find the mean, add up all the scores, then divide them by the number of
scores.

Mean (x̅)
x̅ = (46 + 69 + 32 + 60 + 52 + 41) ÷ 6 = 50

Step 2: Find each score’s deviation from the mean

Subtract the mean from each score to get the deviations from the mean.

Since x̅ = 50, here we take away 50 from each score.


Score    Deviation from the mean
46       46 - 50 = -4
69       69 - 50 = 19
32       32 - 50 = -18
60       60 - 50 = 10
52       52 - 50 = 2
41       41 - 50 = -9

Step 3: Square each deviation from the mean

Multiply each deviation from the mean by itself. This will result in positive
numbers.

Squared deviations from the mean


(-4)² = (-4) × (-4) = 16
19² = 19 × 19 = 361
(-18)² = (-18) × (-18) = 324
10² = 10 × 10 = 100
2² = 2 × 2 = 4
(-9)² = (-9) × (-9) = 81

Step 4: Find the sum of squares

Add up all of the squared deviations. This is called the sum of squares.

Sum of squares
16 + 361 + 324 + 100 + 4 + 81 = 886

Step 5: Find the variance

Divide the sum of the squares by n – 1 (for a sample) or N (for a population) –
this is the variance.

Since we’re working with a sample size of 6, we will use  n – 1, where n = 6.

Variance
 886 ÷ (6 – 1) = 886 ÷ 5 = 177.2
Step 6: Find the square root of the variance

To find the standard deviation, we take the square root of the variance.

Standard deviation
√177.2 = 13.31

From learning that SD = 13.31, we can say that each score deviates from the
mean by 13.31 points on average.
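
The six steps above map directly onto a few lines of Python, and the result can be checked against statistics.stdev, which computes the sample standard deviation:

import math
import statistics

scores = [46, 69, 32, 60, 52, 41]

mean = sum(scores) / len(scores)                               # Step 1: 50
deviations = [x - mean for x in scores]                        # Step 2
squared = [d ** 2 for d in deviations]                         # Step 3
sum_of_squares = sum(squared)                                  # Step 4: 886
variance = sum_of_squares / (len(scores) - 1)                  # Step 5: 177.2
std_dev = math.sqrt(variance)                                  # Step 6: ~13.31

print(round(std_dev, 2), round(statistics.stdev(scores), 2))   # both 13.31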

Z TABLE
Negative Z score table

Use the negative Z score table below to find values on the left of the mean, as can be
seen in the graph alongside. Corresponding values which are less than the mean are
marked with a negative score in the z-table and represent the area under the bell
curve to the left of z.
Positive Z score table

Use the positive Z score table below to find values on the right of the mean, as can be
seen in the graph alongside. Corresponding values which are greater than the mean are
marked with a positive score in the z-table and represent the area under the bell curve
to the left of z.

To use the Z-Tables however, you will need to know a little something called the Z-Score. It
is the Z-Score that gets mapped across the Z-Table and is usually either pre-provided or has
to be derived using the Z Score formula. But before we take a look at the formula, let us
understand what the Z Score is.

What is a Z Score?

A Z Score, also called the Standard Score, is a measurement of how many standard
deviations below or above the population mean a raw score is. In simple terms, the
Z Score gives you an idea of a value's relationship to the mean and how far from the
mean a data point is.

A Z Score is measured in terms of standard deviations from the mean, which means that if the
Z Score = 1, the value is one standard deviation from the mean, whereas if the Z Score = 0,
the value is identical to the mean.

A Z Score can be either positive or negative depending on whether the score lies above the
mean (in which case it is positive) or below the mean (in which case it is negative).

The Z Score helps us compare results to the normal population or mean.

The Z Score Formula

The Z Score Formula or the Standard Score Formula is given as:

Z = ( x - µ ) / σ

where x is the observed value, µ is the mean, and σ is the standard deviation.

When we do not have a pre-provided Z Score supplied to us, we will use the above formula
to calculate the Z Score using the other data available, like the observed value, the mean of
the sample, and the standard deviation. Similarly, if we have the standard score provided and
are missing any one of the other three values, we can substitute them into the above formula
to get the missing value.
Understanding how to use the Z Score Formula with an example

Let us understand how to calculate the Z-score, the Z-Score Formula and use the Z-
table with a simple real life example.

Q: 300 college students' exam scores are tallied at the end of the semester. Eric scored
800 marks (X) in total out of 1000. The average score for the batch was 700 (µ) and
the standard deviation was 180 (σ). Let's find out how well Eric scored compared to
his batch mates.

Using the above data we need to first standardize his score and use the respective z-
table before we determine how well he performed compared to his batch mates.

To find out the Z score we use the formula

Z Score = (Observed Value – Mean of the Sample)/standard deviation

Z score = ( x – µ ) / σ

Z score = (800 - 700) / 180

Z score = 100 / 180 ≈ 0.56

Once we have the Z Score which was derived through the Z Score formula, we can
now go to the next part which is understanding how to read the Z Table and map the
value of the Z Score we’ve got, using it.

How to Read The Z Table


To map a Z score across a Z Table, it goes without saying that the first thing you need
is the Z Score itself. In the above example, we derive that Eric’s Z-score is 0.56.

Once you have the Z Score, the next step is choosing between the two tables. That is
choosing between using the negative Z Table and the positive Z Table depending on
whether your Z score value is positive or negative.

What we are basically establishing with a positive or negative Z Score is whether your
values lie on the left of the mean or right of the mean. To find the area on the left of
the mean, you will have a negative Z Score and use a negative Z Table. Similarly, to
find the area on the right of the mean, you will have a positive Z Score and use a
positive Z Table.

Now that we have Eric’s Z score which we know is a positive 0.56 and we know
which corresponding table to pick for it, we will make use of the positive Z-table
(Table 1.2) to predict how good or bad Eric performed compared to his batch mates.
Now that we’ve picked the appropriate table to look up to, in the next step of the
process we will learn how to map our Z score value in the respective table. Let us
understand using the example we’ve chosen with Eric’s Z score of 0.56

Traverse down the Y-axis in the leftmost column to find the value of the first two
digits of your Z Score (0.5, based on Eric's Z score).

Once you have that, go along the X-axis on the topmost row to find the value of
the digit at the second decimal position (.06, based on Eric's Z score).

Once you have mapped these two values, find the intersection of the row of the first
two digits and the column of the second decimal value in the table. The intersection of
the two is the answer we're looking for.

In our example, we get the intersection at a value of 0.71226 (~ 0.7123).

To get this as a percentage, we multiply that number by 100. Therefore 0.7123 x 100
= 71.23%. Hence we find that Eric did better than 71.23% of the students.
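
The same lookup can be reproduced without a printed table by using the standard normal CDF. A short Python sketch of Eric's example:

from statistics import NormalDist

x, mu, sigma = 800, 700, 180
z = (x - mu) / sigma                       # about 0.5556, rounded to 0.56 for the table
area_left = NormalDist().cdf(round(z, 2))  # area under the curve to the left of z = 0.56

print(round(z, 2))                         # 0.56
print(round(area_left, 4))                 # 0.7123, i.e. better than about 71% of scores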

Let us take one more example but this time for a negative z score and a negative
z table.

Let us consider our Z score = -1.35

Based on what we had discussed before, since the z score is negative, we will use the
negative z table (Table 1.1)
First, traverse down the Y-axis in the leftmost column to find the value
of the first two digits, that is -1.3.

Once we have that, we will traverse along the X-axis in the topmost row to map the
second decimal (0.05 in this case) and find the corresponding column for it.

The intersection of the row of the first two digits and the column of the second decimal
value in the Z table is the answer we're looking for, which in the case of our example
is 0.08851 or 8.85%.
(Note that this method of mapping the Z Score value is the same for both the positive and
the negative Z Scores. That is because, for a standard normal distribution table,
both halves of the curve on either side of the mean are identical. So it only depends
on whether the Z Score value is positive or negative, or whether we are looking up the
area on the left of the mean or on the right of the mean, when it comes to choosing the
respective table.)

Why are there two Z tables?

There are two Z tables to make things less complicated. They could be combined into
one single larger Z-table, but that can be a bit overwhelming for a lot of beginners and
it also increases the chance of human error during calculations. Using two Z tables
makes life easier: based on whether you want to know the area from the
mean for a positive value or a negative value, you can use the respective Z score table.

If you want to know the area between the mean and a negative value, you will use the
first table (1.1) shown above, which is the left-hand/negative Z-table. If you want to
know the area between the mean and a positive value, you will use the second table (1.2)
above, which is the right-hand/positive Z-table.
