Types of Events
Complementary Event : The event E^C is called the complementary
event of the event E. It consists of all outcomes in S that are not
in E. For example, in a die throw, if E = {even numbers} =
{2, 4, 6}, then E^C = {odd numbers} = {1, 3, 5}.
Equally Likely Events :
Two events E and F are equally likely iff p(E) = p(F).
For example, in a die throw, E = {1, 2, 3} and F = {4, 5, 6}
are equally likely, since p(E) = p(F) = 1/2.
Mutually Exclusive Events :
Two events E and F are mutually exclusive, if E ∩ F = ϕ
i.e., p(E ∩ F) = 0. In other words, if E occurs, F cannot occur
and if F occurs, then E cannot occur (i.e., both cannot occur
together).
Collectively Exhaustive Events :
Two events E and F are collectively exhaustive if E ∪ F = S,
i.e., together E and F include all possible outcomes, so
p(E ∪ F) = p(S) = 1.
Independent Events :
Two events E and F are independent iff
p(E ∩ F) = p(E) * p(F)
Whenever E and F are independent, we also have p(E | F) = p(E) and
p(F | E) = p(F), i.e., the conditional probability becomes the
same as the marginal probability: the probability of E is not affected
by whether F has happened or not, and vice versa. When E
is independent of F, F is also independent of E.
DeMorgan’s Law :
1. (E1 ∪ E2 ∪ ... ∪ En)^C = E1^C ∩ E2^C ∩ ... ∩ En^C
2. (E1 ∩ E2 ∩ ... ∩ En)^C = E1^C ∪ E2^C ∪ ... ∪ En^C
Example : (E1 ∪ E2)^C = E1^C ∩ E2^C
(E1 ∩ E2)^C = E1^C ∪ E2^C
Note that E1^C ∩ E2^C is the event "neither E1 nor E2", while
E1 ∪ E2 is the event "either E1 or E2 (or both)".
DeMorgan’s law is often used to find the probability of neither
E1 nor E2,
i.e., p(E1^C ∩ E2^C) = p[(E1 ∪ E2)^C] = 1 - p(E1 ∪ E2)
Approaches to Probability
There are 2 approaches to quantifying the probability of an event E.
1. Classical Approach :
P(E) = n(E) / n(S) = |E| / |S|
i.e., the probability of an event is the ratio of the number of ways
the event can happen to the number of outcomes in the sample space.
The classical approach assumes that all outcomes are equally
likely.
Example 1 :
If, out of all possible jumbles of the word “BIRD”, a random word is
picked, what is the probability that this word starts with “B”?
Solution :
p(E) = n(E) / n(S)
In this problem : n(S) = all possible jumbles of BIRD = 4! = 24
n(E) = those jumbles starting with “B” = 3! = 6
So, p(E) = n(E) / n(S) = 3! / 4! = 1/4
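The counting argument in Example 1 can be checked by brute force. The following is a minimal sketch, assuming every jumble is equally likely; the variable names are illustrative and not part of the original notes.

```python
from fractions import Fraction
from itertools import permutations

# Sample space S: every arrangement ("jumble") of the letters of BIRD.
jumbles = ["".join(p) for p in permutations("BIRD")]

# Event E: the arrangement starts with "B".
favourable = [word for word in jumbles if word.startswith("B")]

# Classical approach: p(E) = n(E) / n(S).
p_e = Fraction(len(favourable), len(jumbles))
print(len(favourable), len(jumbles), p_e)  # 6 24 1/4
```

Because the four letters are distinct, enumeration and the factorial formula agree; with repeated letters the jumbles would need de-duplicating first.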
Example 2 :
From the following table, find the probability of obtaining an
“A” grade in this exam.
Grade                 A    B    C    D
Number of Students   10   20   30   40
Solution :
N = total number of students = 100
By the frequency approach,
p(A grade) = n(A grade) / N = 10 / 100 = 0.1
Axioms of Probability
Consider an experiment whose sample space is S. For each
event E of the sample space S we assume that a number P(E) is
defined and satisfies the following three axioms.
Axiom - 1 : 0 ≤ P(E) ≤ 1
Axiom - 2 : P(S) = 1
Axiom - 3 : For any sequence of mutually exclusive events
E1, E2, ...... (that is, events for which Ei ∩ Ej = ϕ when i ≠ j),
P(∪_{i=1}^{∞} Ei) = ∑_{i=1}^{∞} P(Ei)
Example : P(E1 ∪ E2) = P(E1) + P(E2), where E1 and E2 are
mutually exclusive.
Rules of Probability
There are six rules of probability, using which the probability
of any compound event involving arbitrary events A and B can
be computed.
Rule 1 :
p(A ∪ B) = p(A) + p(B) - p(A ∩ B)
This rule is also called the inclusion-exclusion principle of
probability.
If A and B are mutually exclusive, then p(A ∩ B) = 0 and the
formula reduces to p(A ∪ B) = p(A) + p(B).
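As an illustration of Rule 1, the sketch below checks the formula on a single fair die. The events A (even outcome) and B (outcome greater than 3) are assumptions chosen for the example, not taken from the notes.

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space of one fair die
A = {2, 4, 6}          # illustrative event: even outcome
B = {4, 5, 6}          # illustrative event: outcome greater than 3

def p(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

lhs = p(A | B)                   # p(A ∪ B), using set union
rhs = p(A) + p(B) - p(A & B)     # p(A) + p(B) - p(A ∩ B)
print(lhs, rhs)  # 2/3 2/3
```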
Rule 2 :
p(A ∩ B) = p(A) * p(B/A) = p(B) * p(A/B)
where p(A/B) represents the conditional probability of A given B
and p(B/A) represents the conditional probability of B given A.
(a) p(A) and p(B) are called the marginal probabilities of A and
B respectively. This rule is also called the multiplication rule
of probability.
(b) p(A ∩ B) is called the joint probability of A and B.
(c) If A and B are independent events, this formula reduces
to p(A ∩ B) = p(A) * p(B)
since when A and B are independent
p(A/B) = p(A) and p(B/A) = p(B)
i.e., the conditional probabilities become same as the
marginal (unconditional) probabilities.
(d) If A and B are independent, then so are A and B^C; A^C
and B; and A^C and B^C.
(e) Condition for three events to be independent :
Events A, B and C are independent iff
p(ABC) = p(A) p(B) p(C)
and p(AB) = p(A) p(B)
and p(AC) = p(A) p(C)
and p(BC) = p(B) p(C)
The last three conditions state that A, B, C are pairwise independent.
Note : If A, B, C are independent, then A will be independent
of any event formed from B and C.
For instance, A is independent of B ∪ C.
Rule 3 : Complementary Probability.
p(A) = 1 - p(A^C)
p(A^C) is called the complementary probability of A; it
represents the probability that the event A does not happen.
p(A^C) is also written as p(A′).
Notice that p(A) + p(A′) = 1,
i.e., A and A′ are mutually exclusive as well as collectively
exhaustive.
Also notice that, by DeMorgan’s law, A^C ∩ B^C = (A ∪ B)^C, so
p(A^C ∩ B^C) = p[(A ∪ B)^C] = 1 - p(A ∪ B)
i.e., p(neither A nor B) = 1 - p(either A or B)
Rule 4 : Conditional Probability Rule
Starting from the multiplication rule
p(A ∩ B) = p(B) * p(A/B)
and dividing both sides by p(B) (assuming p(B) > 0), we get the
conditional probability formula
p(A/B) = p(A ∩ B) / p(B)
By interchanging A and B in this formula we get
p(B/A) = p(A ∩ B) / p(A)
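The conditional probability formula can be tried on a small sample space. This is a minimal sketch on one fair die; the events A and B are illustrative assumptions, and the final line re-checks the multiplication rule (Rule 2).

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space of one fair die
A = {2, 4, 6}          # illustrative event: even outcome
B = {4, 5, 6}          # illustrative event: outcome greater than 3

def p(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

p_a_given_b = p(A & B) / p(B)   # p(A/B) = p(A ∩ B) / p(B)
p_b_given_a = p(A & B) / p(A)   # p(B/A) = p(A ∩ B) / p(A)
print(p_a_given_b, p_b_given_a)  # 2/3 2/3

# Multiplication rule: p(A ∩ B) = p(B) * p(A/B) = p(A) * p(B/A)
multiplication_rule_holds = p(A & B) == p(B) * p_a_given_b == p(A) * p_b_given_a
```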
Probability Distributions
Random Variables
It is frequently the case when an experiment is performed
that we are mainly interested in some function of the outcome
as opposed to the actual outcome itself.
For instance, in tossing dice we are often interested in the
sum of two dice and are not really concerned about the separate
value of each die. That is, we may be interested in knowing
that the sum is 7 and not be concerned over whether the actual
outcome was (1, 6) or (2, 5) or (3, 4) or (4, 3) or (5, 2) or (6, 1).
Also, in coin flipping we may be interested in the total num-
ber of heads that occur and not care at all about the actual head
tail sequence that results. These quantities of interest, or more
formally, these real valued functions defined on the sample space,
are known as random variables.
Because the value of a random variable is determined by the
outcome of the experiment, we may assign probabilities to the
possible values of the random variable.
Types of Random Variable : A random variable may be discrete
or continuous.
Discrete Random Variable : A variable that can take one value
from a discrete set of values.
Example : Let x denote the sum of 2 dice. Now x is a discrete random
variable, as it can take one value from the set {2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12}, since the sum of 2 dice can only be one of
these values.
Continuous Random Variable : A variable that can take one
value from a continuous range of values.
Example : Let x denote the volume of Pepsi in a 500 ml cup. Now
x may take any value in the range from 0 to 500.
Probability Density Function (PDF)
Let x be a continuous random variable. Then its PDF f(x) is
defined such that
1. f(x) ≥ 0
2. ∫_{-∞}^{∞} f(x) dx = 1
3. P(a < x < b) = ∫_{a}^{b} f(x) dx
Probability Mass Function (PMF)
Let x be a discrete random variable. Then its PMF p(x) is defined
such that
1. p(x) = P[X = x]
2. p(x) ≥ 0
3. ∑p(x) = 1
Distributions
Based on the type of random variable, distributions are also divided
into discrete distributions (based on a discrete random variable) and
continuous distributions (based on a continuous random variable).
Examples of discrete distributions are the binomial, Poisson and
hypergeometric distributions.
Examples of continuous distributions are the uniform, normal and
exponential distributions.
Properties of Discrete Distribution
∑P(x) = 1
E(x) = ∑x P(x)
V(x) = E(x²) - (E(x))² = ∑x² P(x) - [∑x P(x)]²
E(x) denotes expected value or average value of the random varia-
ble x, while V(x) denotes the variance of the random variable x.
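The three properties above can be illustrated with the sum of two fair dice. The sketch below assumes all 36 ordered outcomes are equally likely and uses exact fractions; the names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# PMF of x = sum of two fair dice, built by counting the 36 outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {x: Fraction(c, 36) for x, c in counts.items()}

total = sum(pmf.values())                          # ∑P(x)
mean = sum(x * p for x, p in pmf.items())          # E(x) = ∑x P(x)
mean_sq = sum(x * x * p for x, p in pmf.items())   # E(x²) = ∑x² P(x)
variance = mean_sq - mean ** 2                     # V(x) = E(x²) - (E(x))²
print(total, mean, variance)  # 1 7 35/6
```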
Properties of Continuous Distribution
∫_{-∞}^{∞} f(x) dx = 1
F(x) = ∫_{-∞}^{x} f(t) dt (cumulative distribution function)
E(x) = ∫_{-∞}^{∞} x f(x) dx
V(x) = E(x²) - [E(x)]² = ∫_{-∞}^{∞} x² f(x) dx - [∫_{-∞}^{∞} x f(x) dx]²
p(a < x < b) = p(a ≤ x < b) = p(a < x ≤ b) = p(a ≤ x ≤ b) = ∫_{a}^{b} f(x) dx
Types of Distributions
Discrete Distributions :
1. General Discrete Distribution
2. Binomial Distribution
3. Hypergeometric Distribution
4. Geometric Distribution
5. Poisson Distribution
General Discrete Distribution
Let X be a discrete random variable.
A table of the possible values of x versus the corresponding probability
values p(x) is called its probability distribution table.
Expectation E(x)
The mean value of the probability distribution of a variate is
commonly known as its expectation.
µx = E(X) = ∑_{i=1}^{n} x_i f(x_i) (Discrete case)
µx = E(X) = ∫_{-∞}^{∞} x f(x) dx (Continuous case)
Variance Var(X)
Var X = E[(X - µx)²]
Var X = ∑(x_i - µx)² f(x_i) (Discrete case)
Var X = ∫_{-∞}^{∞} (x - µx)² f(x) dx (Continuous case)
It can be proved that Var X = E(X²) - [E(X)]²
Properties of Expectation and Variance.
If x1 and x2 are two random variables and a and b are constants,
E(ax1 + b) = a E(x1) + b
V(ax1 + b) = a² V(x1)
E(ax1 + bx2) = a E(x1) + b E(x2)
V(ax1 + bx2) = a² V(x1) + b² V(x2) + 2ab cov(x1, x2)
where cov(x1, x2) represents the covariance between x1 and x2.
If x1 and x2 are independent, then cov(x1, x2) = 0 and the above
formula reduces to
V(ax1 + bx2) = a² V(x1) + b² V(x2)
For example, from the above formulas we can say
E(x1 + x2) = E(x1) + E(x2)
E(x1 - x2) = E(x1) - E(x2)
and, for independent x1 and x2,
V(x1 + x2) = V(x1 - x2) = V(x1) + V(x2)
Formula for calculating the covariance between X and Y :
Cov(X, Y) = E(XY) - E(X) E(Y)
∴ If X, Y are independent, E(XY) = E(X) E(Y)
and hence Cov(X, Y) = 0
Binomial Distribution
The probability of obtaining x successes from n independent trials is
given by the binomial distribution formula
P(X = x) = nCx p^x (1 - p)^(n - x)
where p is the probability of success in any trial and (1 - p) = q
is the probability of failure.
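The binomial formula translates directly into code. The following is a minimal sketch; the function name `binomial_pmf` and the values n = 10, p = 0.5 are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5   # illustrative: 10 tosses of a fair coin
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Probability of exactly 2 heads in 10 tosses: 45 / 1024.
print(binomial_pmf(2, n, p))

# The probabilities sum to 1 and the mean equals n * p.
total = sum(probs)
mean = sum(x * q for x, q in enumerate(probs))
```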
Geometric Distribution
Consider repeated trials of a Bernoulli experiment ε with
probability P of success and q = 1 - P of failure.
Let x denote the number of times ε must be repeated until finally
obtaining a success. The distribution of the random variable x is
given as follows.
k      1    2    3     4     5
P(k)   P    qP   q²P   q³P   q⁴P
The experiment ε will be repeated k times only in the case that
there is a sequence of k - 1 failures followed by a success.
P(k) = P(x = k) = q^(k-1) P
The geometric distribution is characterized by a single para-
meter P.
Points to Remember :
Let x be a geometric random variable with distribution GEO(P).
Then
1. E(x) = 1/P
2. Var(x) = q/P²
3. Cumulative distribution F(k) = 1 - q^k
4. P(x > r) = q^r
Geometric distribution possesses “no-memory” or “lack of
memory” property which can be stated as
P(x > a + r | x > a) = P(x > r)
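The no-memory property can be checked numerically. The sketch below assumes a success probability of 0.25 and arbitrary values a = 3, r = 4; the helper names are hypothetical. It also approximates E(x) by a truncated sum.

```python
def geometric_pmf(k, p):
    """P(x = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p

def geometric_tail(r, p):
    """P(x > r): the first r trials all fail, i.e. q^r."""
    return (1 - p) ** r

p, a, r = 0.25, 3, 4   # illustrative values

# P(x > a + r | x > a) = P(x > a + r) / P(x > a), which should equal P(x > r).
conditional = geometric_tail(a + r, p) / geometric_tail(a, p)
print(conditional, geometric_tail(r, p))

# Truncated sum for the mean; the neglected tail is negligible here.
approx_mean = sum(k * geometric_pmf(k, p) for k in range(1, 2000))
print(approx_mean, 1 / p)
```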
Poisson Distribution
A random variable X, taking on one of the values 0, 1, 2,
.........., is said to be a Poisson random variable with parameter
λ if, for some λ > 0,
P(X = x) = e^(-λ) λ^x / x!
For Poisson Distribution :
Mean = E(x) = λ
Variance = V(x) = λ
Therefore, the expected value and variance of a Poisson random
variable are both equal to its parameter λ.
Here λ is the average number of occurrences of the event in an
observation period ∆t. So, λ = α∆t, where α is the number of
occurrences of the event per unit time.
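The claim that the mean and variance both equal λ can be checked numerically. This is a rough sketch with an assumed λ = 3 and the infinite sums truncated at x = 59, where the remaining terms are negligible.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 3.0   # illustrative parameter
probs = [poisson_pmf(x, lam) for x in range(60)]

total = sum(probs)
mean = sum(x * q for x, q in enumerate(probs))
variance = sum(x * x * q for x, q in enumerate(probs)) - mean**2
print(total, mean, variance)   # each close to 1, lam, lam respectively
```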
PROBABILITY
∑ P(x) = 1 → Discrete
∫ P(x) dx = 1 → Continuous
Distribution
(i) Binomial Distribution (ii) Poisson Distribution
Binomial Distribution : P(r) = nCr p^r q^(n - r)
n → number of trials
r → number of successes (occurrences of the event)
p + q = 1, q = 1 - p
* The position of the successes within the n trials is not specified.
* It is used to find the probability of a specific number of successes
(or failures), e.g. exactly 2 successes.
Mean or expectation, µ = np
Variance = npq
Standard deviation = √(npq)
Poisson Distribution
p(x) = e^(-λ) λ^x / x!, x → 0, 1, 2, ......... (random variable)
λ → mean = np
Variance = np
SD = √(np)
Continuous
i) ∫_{-∞}^{∞} p(x) dx = 1
E(x) = ∫ x p(x) dx
Var(x) = ∫ x² p(x) dx - [∫ x p(x) dx]²
Properties
* E(constant) = constant
* E(ax + by) = a E(x) + b E(y)
* E(ax - by) = a E(x) - b E(y)
* E(xy) = E(x) . E(y) if x & y are independent
* Variance (constant) = 0
* v(ax ± by) = a² v(x) + b² v(y) if x & y are independent
* Covariance (x, y) = E(x . y) - E(x) . E(y)
* If x & y are independent, then covariance (x, y) = 0
Uniform Distribution
On the interval [a, b] : p(x) = 1/(b - a)
∫_{a}^{b} 1/(b - a) dx = 1
Mean = (a + b)/2
Variance = (b - a)²/12
Normal Distribution or Gaussian
p(x) = [1/√(2πσ²)] e^(-(x - µ)²/(2σ²))
µ → mean
σ → SD
µ - σ → µ + σ → 68.27% or 0.6827
µ - 2σ → µ + 2σ → 95.5% or 0.955
µ - 3σ → µ + 3σ → 99.7% or 0.997
(The percentages are the areas under the normal curve within 1, 2
and 3 standard deviations of the mean.)
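The areas within 1, 2 and 3 standard deviations of the mean can be computed from the error function, since P(µ - kσ < x < µ + kσ) = erf(k/√2) for any normal distribution. The helper name below is an assumption made for illustration.

```python
from math import erf, sqrt

def within_k_sigma(k):
    """P(mu - k*sigma < x < mu + k*sigma) for a normal random variable."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sigma(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```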
Exponential Distribution
p(x) = λe^(-λx), x ≥ 0
     = 0, x < 0
Mean = 1/λ
Var = 1/λ²
Standard Normal Distribution
µ = 0, σ = 1
p(x) = [1/√(2π)] e^(-x²/2)
Arithmetic Mean for Raw Data
The formula for calculating the arithmetic mean for
raw data is x̄ = ∑x / n
x̄ : Arithmetic mean
x : Refers to the value of an observation
n : Number of observations
Example :
The numbers of visits made by ten mothers to a clinic
were : 8 6 5 5 7 4 5 9 7 4
Calculate the average number of visits.
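The notes leave this example unsolved; a direct application of x̄ = ∑x / n to the ten values is sketched below.

```python
visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]

# Arithmetic mean for raw data: sum of observations over their count.
mean_visits = sum(visits) / len(visits)
print(sum(visits), len(visits), mean_visits)  # 60 10 6.0
```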
The Arithmetic Mean for Grouped Data (Frequency Distribution)
The formula for the arithmetic mean calculated from a
frequency distribution has to be amended to include the
frequency. It becomes
x̄ = ∑(fx) / ∑f
Example :
To show how we can calculate the arithmetic mean of a
grouped frequency distribution, consider the weights of
75 pigs. The classes and frequencies are given in the
following table :
Weight (kg)        Midpoint of class (x)   Number of pigs (f)     fx
10 & under 20               15                     1               15
20 & under 30               25                     7              175
30 & under 40               35                     8              280
40 & under 50               45                    11              495
50 & under 60               55                    19             1045
60 & under 70               65                    10              650
70 & under 80               75                     7              525
80 & under 90               85                     5              425
90 & under 100              95                     4              380
100 & under 110            105                     3              315
Total                                             75             4305
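The grouped mean for the pig weights follows from the column totals, x̄ = ∑(fx) / ∑f. The sketch below recomputes both totals from the midpoints and frequencies in the table.

```python
midpoints = [15, 25, 35, 45, 55, 65, 75, 85, 95, 105]   # class midpoints x
frequencies = [1, 7, 8, 11, 19, 10, 7, 5, 4, 3]          # number of pigs f

sum_f = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
mean_weight = sum_fx / sum_f   # grouped mean in kg
print(sum_f, sum_fx, mean_weight)  # 75 4305 57.4
```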
Median for Raw Data :
In general, if we have n values of x, they can be arranged
in ascending order as : x1 ≤ x2 ≤ ...... ≤ xn
Suppose n is odd, then Median = the ((n + 1)/2)-th value
However, if n is even, we have two middle points and
Median = [ (n/2)-th value + (n/2 + 1)-th value ] / 2
Example :
The heights (in cm) of six students in class are 160, 157,
156, 161, 159, 162. What is median height ?
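For the six heights, n is even, so the median is the average of the 3rd and 4th values after sorting. A short sketch, cross-checked against the standard library:

```python
from statistics import median

heights = [160, 157, 156, 161, 159, 162]
ordered = sorted(heights)
n = len(ordered)

# n is even: average the (n/2)-th and (n/2 + 1)-th values (1-based positions).
manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
print(ordered, manual_median, median(heights))
# [156, 157, 159, 160, 161, 162] 159.5 159.5
```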
Median for Grouped Data
1. Identify the median class, which contains the middle
observation, i.e., the ((N + 1)/2)-th observation. This can be done by
finding the first class in which the cumulative frequency is equal
to or more than (N + 1)/2. Here, N = Σf = total number of
observations.
2. Calculate the median as follows :
Median = L + [ ((N + 1)/2 - (F + 1)) / fm ] × h
Where, L = Lower limit of the median class
N = Total number of observations
F = Cumulative frequency of the class preceding
the median class
fm = Frequency of the median class
h = Class length (width of the median class)
Example :
Consider the following table giving the marks obtained by
students in an exam.
Mark Range   No. of Students (f)   Cumulative Frequency
0 - 20               2                      2
20 - 40              3                      5
40 - 60             10                     15
60 - 80             15                     30
80 - 100            20                     50
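The two steps for the grouped median can be sketched as follows. The code applies the formula as printed in these notes, Median = L + [((N + 1)/2 - (F + 1)) / fm] × h; other textbooks use N/2 - F in the numerator and give a slightly different value. The function name and tuple layout are assumptions for illustration.

```python
# Each class is (lower limit, upper limit, frequency), taken from the table.
classes = [(0, 20, 2), (20, 40, 3), (40, 60, 10), (60, 80, 15), (80, 100, 20)]

def grouped_median(classes):
    """Grouped median using the formula printed in the notes."""
    n_total = sum(f for _, _, f in classes)
    target = (n_total + 1) / 2          # position of the middle observation
    cumulative = 0                      # F: cumulative frequency before the class
    for lower, upper, f in classes:
        if cumulative + f >= target:    # first class reaching the target
            return lower + (target - (cumulative + 1)) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency table")

print(round(grouped_median(classes), 2))  # 72.67
```

Here N = 50, the median class is 60 - 80 with F = 15, fm = 15 and h = 20.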
Mode
Mode is defined as the value of the variable which occurs
most frequently.
Mode for Raw Data
In raw data, the most frequently occurring observation is
the mode, that is, the data value with the highest frequency. If
there is more than one value with the highest frequency, then
each of them is a mode. Thus we have unimodal (single
mode), bimodal (two modes) and trimodal (three modes)
data sets.
Example :
Find the mode of the data set : 50, 50, 70, 50, 50, 70, 60.
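A small sketch for this example: count each value and keep every value that attains the highest count, so that bimodal or trimodal data would return more than one mode.

```python
from collections import Counter

data = [50, 50, 70, 50, 50, 70, 60]
counts = Counter(data)               # frequency of each value
highest = max(counts.values())       # highest frequency

# Every value with the highest frequency is a mode.
modes = [value for value, c in counts.items() if c == highest]
print(modes, highest)  # [50] 4
```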
Mode for Grouped Data
Mode is that value of x for which the frequency is maxi-
mum. If the values of x are grouped into the classes (such
that they are uniformly distributed within any class) and
we have a frequency distribution then :
1. Identify the class which has the largest frequency
(modal class)
2. Calculate the mode as
Mode = L + [ (f0 - f1) / (2f0 - f1 - f2) ] × h
Where,
L = Lower limit of the modal class
f0 = Largest frequency (frequency of the modal class)
f1 = Frequency of the class preceding the modal class
f2 = Frequency of the class next to the modal class
h = Width of the modal class
Example :
Data relating to the height of 352 school students are given
in the following frequency distribution.
Calculate the modal height.
Height (in feet) Number of students
3.0 - 3.5 12
3.5 - 4.0 37
4.0 - 4.5 79
4.5 - 5.0 152
5.0 - 5.5 65
5.5 - 6.0 7
Total 352
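The modal height can be computed with the grouped-mode formula above. In this sketch the function name is hypothetical, and a missing neighbouring class is treated as having frequency 0, which is an assumption not stated in the notes (it does not arise for this table).

```python
# Each class is (lower limit, upper limit, frequency), heights in feet.
classes = [(3.0, 3.5, 12), (3.5, 4.0, 37), (4.0, 4.5, 79),
           (4.5, 5.0, 152), (5.0, 5.5, 65), (5.5, 6.0, 7)]

def grouped_mode(classes):
    """Mode = L + (f0 - f1) / (2*f0 - f1 - f2) * h for the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                       # modal class index
    lower, upper, f0 = classes[i]
    f1 = freqs[i - 1] if i > 0 else 0                 # preceding class (assumed 0 if absent)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # next class (assumed 0 if absent)
    return lower + (f0 - f1) / (2 * f0 - f1 - f2) * (upper - lower)

print(round(grouped_mode(classes), 3))  # 4.728
```

The modal class is 4.5 - 5.0, with f0 = 152, f1 = 79, f2 = 65 and h = 0.5.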
Properties Relating Mean, Median and Mode
1. Empirical mode = 3 median - 2 mean
When an approximate value of the mode is required, the above
empirical formula may be used.
2. There are three types of frequency distributions :
positively skewed, symmetric and negatively skewed
distributions.
[Figure: frequency curves for (a) positively skewed, (b) symmetric and (c) negatively skewed distributions]
(a) In positively skewed distribution.
Mode ≤ Median ≤ Mean
(b) In symmetric distribution
Mean = Median = Mode
(c) In negatively skewed distribution
Mean ≤ Median ≤ Mode
Standard Deviation
Standard deviation is a measure of dispersion or variation
amongst data.
Instead of taking the absolute deviations from the arithmetic
mean, we may square each deviation and obtain the arithmetic
mean of the squared deviations. This gives us the variance
of the values.
The positive square root of the variance is called the
“Standard Deviation” of the given values.
Standard Deviation for Raw Data
Suppose x1, x2, ......, xn are n values of x. Their arithmetic
mean is
x̄ = (1/n) Σxi
and x1 - x̄, x2 - x̄, ....., xn - x̄ are the deviations
of the values of x from x̄. Then
σ² = (1/n) Σ(xi - x̄)²
is the variance of x. It can be shown that
σ² = Σ(xi - x̄)² / n = (1/n) Σxi² - x̄² = [nΣxi² - (Σxi)²] / n²
It is conventional to represent the variance by the symbol
σ². In fact, σ is small sigma and Σ is capital sigma.
The square root of the variance is the standard deviation.
σ = +√[(1/n) Σ(xi - x̄)²] = √[(1/n) Σxi² - x̄²] = √[(nΣxi² - (Σxi)²) / n²]
Example :
Consider three students in a class whose marks in an exam
were 50, 60 and 70. What is the standard deviation of this
data set ?
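A worked sketch for the three marks, using the population formulas from this section (division by n, not n - 1). It evaluates both the definition and the shortcut form, and compares the result with `statistics.pstdev` from the standard library.

```python
from math import sqrt
from statistics import pstdev

marks = [50, 60, 70]
n = len(marks)
mean = sum(marks) / n

# Definition: mean of the squared deviations from the mean.
variance = sum((x - mean) ** 2 for x in marks) / n

# Shortcut form: [n * Σx² - (Σx)²] / n².
shortcut = (n * sum(x * x for x in marks) - sum(marks) ** 2) / n**2

sigma = sqrt(variance)
print(round(variance, 3), round(shortcut, 3), round(sigma, 3))  # 66.667 66.667 8.165
```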
Example :
The frequency distribution for heights of 150 young ladies
in a beauty contest is given below for which we have to
calculate standard deviation.