Random Variables
Random Variables: Example
• A life insurance agent has two elderly clients, each with a policy value of Rs 1 crore upon death
• Y and O are independent
• X = Random variable representing the amount of money paid out (in crores) based on who dies
• Possible values for X: 0, 1, 2 crore

Event                   Probability
Y (Younger one dies)    P(Y) = 0.05
O (Older one dies)      P(O) = 0.10

Random Variable X       Probability        Calculation                      Result
0 (Nobody dies)         P(YcOc)            0.95 x 0.90                      0.855 or 85.5%
1 (One of them dies)    P(YOc) + P(YcO)    (0.05 x 0.90) + (0.95 x 0.10)    0.140 or 14%
2 (Both die)            P(YO)              0.05 x 0.10                      0.005 or 0.5%
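The calculation above can be checked with a few lines of Python. This is a minimal sketch that uses only the two probabilities given in the example; the names p_y, p_o and payout_dist are illustrative and not part of the original material.

```python
# Probabilities from the example; the two deaths are independent events.
p_y = 0.05  # P(Y): the younger client dies
p_o = 0.10  # P(O): the older client dies

# Distribution of X = payout in crores (payout_dist is an illustrative name).
payout_dist = {
    0: (1 - p_y) * (1 - p_o),              # nobody dies
    1: p_y * (1 - p_o) + (1 - p_y) * p_o,  # exactly one of them dies
    2: p_y * p_o,                          # both die
}

for x, p in payout_dist.items():
    print(f"P(X = {x}) = {p:.3f}")

# The probabilities of all possible values of X must add up to 1.
total = sum(payout_dist.values())
print(f"Total = {total:.3f}")
```

Because independence lets us multiply the individual probabilities, each row needs only P(Y), P(O) and their complements.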
Random Variable Types
• Discrete random variable: Can take on only a countable number of possible values
  • Credit ratings: AAA, AA, BBB, B, etc.
  • Number of orders received on a shopping website (0, 1, 2, 3, …)
  • Customer churn: 0 = no churn, 1 = churn
• Continuous random variable: Can take on any value in a certain interval
  • Market share of a company (infinitely many possible values between 0% and 100%, e.g. 45.123%)
  • Time taken to place an online order (e.g. 2.5 minutes or 2.55 minutes, etc.)
  • Waiting time at an ATM (e.g. 30.2 seconds or 45.76 seconds, etc.)
Probability Distribution
• Distribution: Describes all the possible outcomes of a random variable and how likely each one is
• Discrete distribution: Sum of all individual probabilities = 1
• Continuous distribution: Total area under the probability curve (density) = 1
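Both properties can be checked numerically. The sketch below is only an illustration: the fair die and the standard normal density are assumed examples, and the area is approximated with a simple midpoint sum rather than computed exactly.

```python
from statistics import NormalDist

# Discrete case (assumed example): a fair six-sided die.
die_pmf = {face: 1 / 6 for face in range(1, 7)}
discrete_total = sum(die_pmf.values())

# Continuous case (assumed example): the standard normal density.
density = NormalDist(mu=0.0, sigma=1.0).pdf

# Approximate the area under the curve on [-8, 8] with a midpoint sum.
lower, upper, steps = -8.0, 8.0, 16_000
width = (upper - lower) / steps
continuous_area = sum(
    density(lower + (i + 0.5) * width) * width for i in range(steps)
)

print(f"Sum of discrete probabilities: {discrete_total:.6f}")
print(f"Area under the density curve:  {continuous_area:.6f}")
```

Note that a density value at a single point is not a probability; only areas under the curve are.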
Probability Distribution Types and Estimations
• Discrete distributions: Uniform, Binomial, Poisson
  • Described by the Probability Mass Function (PMF)
• Continuous distributions: Uniform, Normal
  • Described by the Probability Density Function (PDF)
• Cumulative Distribution Function (CDF): Applies to both discrete and continuous distributions
Discrete Distributions:
(1) Uniform Distribution
• Note: A uniform distribution can also be a continuous probability distribution, depending on
what we are measuring. If the observations are not discrete, the uniform distribution is continuous.
• Uniform Discrete Distribution: Throw of a die (exactly 1, 2, …, 6)
• Uniform Continuous Distribution: Time taken by a flight (can be measured down to nanoseconds and
beyond)
Uniform Distribution
• Rolling a fair die: 6 discrete, equally probable outcomes
• We can roll a 1 or 2, but not 1.5
• The probabilities of each outcome are evenly distributed across the
sample space
• Code: C:\code\Data Analytics\uniform_dist.py
Uniform Distribution
• Rolling a die has 6 equally probable outcomes, and their probabilities add up to 1
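The die example can be sketched in Python as follows. This is not the content of uniform_dist.py, which is not shown here; it is an independent illustration, and the seed and number of simulated rolls are arbitrary assumptions.

```python
import random
from collections import Counter
from fractions import Fraction

faces = range(1, 7)

# Theoretical PMF of a fair die: every face has probability exactly 1/6.
uniform_pmf = {face: Fraction(1, 6) for face in faces}
pmf_total = sum(uniform_pmf.values())  # exactly 1

# Simulate rolls to see the observed frequencies approach 1/6.
random.seed(42)    # arbitrary seed, only for reproducibility
n_rolls = 60_000   # arbitrary sample size
counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
observed = {face: counts[face] / n_rolls for face in faces}

for face in faces:
    print(f"face {face}: theoretical {float(uniform_pmf[face]):.4f}, "
          f"observed {observed[face]:.4f}")
print(f"Total probability: {pmf_total}")
```

Using Fraction keeps the theoretical total exactly 1, avoiding floating-point rounding in the check.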
Discrete Distributions:
(2) Binomial Distribution
Binomial Distribution
•
Binomial Distribution Formula
• Problem: Calculate the probability that 3 out of 5 students like Python, when in general 66% of
students like Python
• Formula: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
  • $P(X = k)$ = Probability
  • $k$ = Number of students liking Python, e.g. 3
  • $p$ = Probability that someone likes Python, e.g. 66%
  • $n$ = Number of students we asked, e.g. 5
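The problem can be worked out with the standard binomial formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k). The sketch below uses only the Python standard library; the function name binomial_pmf is an illustrative choice.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 3 out of 5 students like Python, when 66% of students like Python in general.
prob = binomial_pmf(k=3, n=5, p=0.66)
print(f"P(X = 3) = {prob:.4f}")  # approximately 0.3323
```

Here comb(5, 3) = 10 counts the ways to choose which 3 of the 5 students like Python.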
Binomial Distribution Example
•
Result: Binomial Probability Mass Function (PMF)
Exercise (Solve + Code)
•
Understanding PMF and CDF
• Probability Mass Function (PMF): Probability at a specific point
• Cumulative Distribution Function (CDF): Cumulative probability up to and including a
specific point
• Example: Consider tips dataset
• Suppose we define success as tip >= 15% of the total bill amount
• PMF = Probability for each possible number of successful tips, e.g.
• P(X=0) -> Probability that none of the tips are at least 15%
• P(X=1) -> Probability that exactly one tip is at least 15%
• P(X=2) -> Probability that exactly two tips are at least 15%
• … up to the total number of tips
• CDF = Cumulative probability up to that point, e.g. P(X<=2) = P(X=0) + P(X=1) + P(X=2)
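The relationship between the PMF and the CDF can be sketched as below. The numbers are assumptions for illustration only: n_tips = 10 tips and a success probability of 0.4 are hypothetical and are not computed from the tips dataset.

```python
from itertools import accumulate
from math import comb

n_tips = 10      # hypothetical number of tips
p_success = 0.4  # hypothetical probability that a tip is at least 15%

# PMF: probability of exactly k successful tips, for k = 0..n_tips.
pmf = [
    comb(n_tips, k) * p_success**k * (1 - p_success) ** (n_tips - k)
    for k in range(n_tips + 1)
]

# CDF: running total of the PMF, i.e. P(X <= k).
cdf = list(accumulate(pmf))

for k, (point_prob, cum_prob) in enumerate(zip(pmf, cdf)):
    print(f"k = {k:2d}   P(X = k) = {point_prob:.4f}   P(X <= k) = {cum_prob:.4f}")
```

The last CDF value is 1 because it covers every possible number of successful tips.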
Plotting PMF and CDF
• Now calculate the probability that we will get a success (tip >= 15%) in 60 or in 65
out of 100 customer visits
• Plot the PMF and CDF and highlight these points (60 and 65) on the plot
• Exercise: Calculate the overall survival rate in the Titanic dataset and plot the
PMF and CDF
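A possible way to compute the values to plot is sketched below. The success probability p_success = 0.6 is a hypothetical placeholder; in practice it would be estimated from the dataset. Plotting is left out to keep the sketch dependency-free: the pmf and cdf lists can be passed to any plotting library, with the entries in highlights marked on the chart.

```python
from itertools import accumulate
from math import comb

n_visits = 100   # number of customer visits, from the exercise
p_success = 0.6  # hypothetical success rate; estimate it from the data

# PMF and CDF of the number of successes in n_visits visits.
pmf = [
    comb(n_visits, k) * p_success**k * (1 - p_success) ** (n_visits - k)
    for k in range(n_visits + 1)
]
cdf = list(accumulate(pmf))

# Points to highlight on the plot: (k, P(X = k), P(X <= k)).
highlights = [(k, pmf[k], cdf[k]) for k in (60, 65)]

for k, point_prob, cum_prob in highlights:
    print(f"k = {k}: P(X = k) = {point_prob:.4f}, P(X <= k) = {cum_prob:.4f}")
```

The same approach applies to the Titanic exercise once the overall survival rate replaces p_success and the number of passengers considered replaces n_visits.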
Discrete Distributions:
(3) Poisson Distribution
Poisson Distribution
•
Poisson Distribution
•
Geometric Distribution
•