
Module 5 Discrete PR

Module 5 covers discrete probability distributions, including Bernoulli, Binomial, and Poisson distributions, along with their properties and applications. It introduces simulation techniques, particularly Monte Carlo simulation, and provides assessments to reinforce learning. The module aims to equip learners with the ability to solve probabilities, apply probability mass functions, and model distributions using simulated data.

Module 5 - Discrete Probability Distributions

Instructional Hours: 4

Module Overview
In this module, you will learn about three discrete distributions, namely the
Bernoulli, Binomial, and Poisson distributions, together with their properties
and respective probability mass functions. The module then introduces
simulation and Monte Carlo simulation and applies them to simulate the three
discrete distributions. Assessments and a quiz are provided to reinforce the
learning process.

Learning Outcomes


1. Compute probabilities using the Bernoulli, Binomial, and Poisson distributions
2. Apply the probability mass functions of different discrete distributions

3. Explain the difference between simulation and Monte Carlo simulation


4. Model Binomial and Poisson distributions from simulated data

Learning Activities
1. Read the lecture notes
2. Read the assigned reference materials
3. Watch lecture videos

4. Complete the module assessment/Quiz

Probability distributions
A probability distribution describes how the probabilities are distributed over
the values of a random variable. It can be either discrete or continuous.

Discrete probability distributions


Discrete probability distributions are used for discrete random variables, which
take on a countable number of values.
• The probability mass function (PMF): It gives the probability that a
discrete random variable is exactly equal to some value.

The common discrete probability distributions are:
• Bernoulli distribution
• Binomial distribution
• Poisson distribution

• Negative Binomial distribution

Continuous probability distributions


Continuous probability distributions are used for continuous random variables,
which take on an uncountably infinite number of values.
• The probability density function (PDF): It provides the relative likelihood
of a random variable to take on a particular value. The probability for a
continuous random variable to take on a specific value is 0, but the area
under the PDF over an interval gives the probability of the variable falling
within that interval.
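The idea that an area under the PDF gives a probability can be checked numerically. The sketch below uses the standard normal distribution purely as an illustrative choice (it is not singled out in these notes) and compares numerical integration of the density with the difference of two cumulative probabilities.

```r
# Standard normal is used here only as an illustrative continuous distribution.
# Area under the PDF between -1 and 1, by numerical integration.
area <- integrate(dnorm, lower = -1, upper = 1)$value

# The same probability from the cumulative distribution function.
cdf_diff <- pnorm(1) - pnorm(-1)

area      # about 0.6827
cdf_diff  # about 0.6827
```

Both values agree, while the probability of any single point, such as exactly 0, is zero.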
The common continuous probability distributions are:
• Uniform distribution

• Normal distribution
• Exponential distribution
• Student t-distribution
• Beta distribution

• Gamma distribution
• Chi-square distribution
• Log-normal distribution

• F-distribution

Bernoulli distribution
The Bernoulli distribution is a discrete probability distribution for a random
variable which takes value 1 with probability p and value 0 with probability
1 − p. It is the simplest form of a discrete probability distribution and is a
special case of the Binomial distribution with a single trial.

Probability mass function (PMF)
The probability mass function (PMF) of a Bernoulli random variable X is given
by:
\[ P(X = x) = p^{x} (1 - p)^{1 - x} \]
for x ∈ {0, 1} and 0 ≤ p ≤ 1.

• P (X = 1) = p
• P (X = 0) = 1 − p

Practical applications
• Quality control: Determining if a product passes or fails a quality test.
• Survey responses: Modeling binary responses (e.g., yes/no, true/false).

• Medical trials: Assessing whether a patient responds to a treatment
(success/failure).

Expected value and variance


The expected value E[X] and variance Var(X) of a Bernoulli random variable
are given by:
E[X] = p
Var(X) = p(1 − p)
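These two results follow directly from the PMF by summing over the two possible values of X. A short derivation, using only the definitions of expectation and variance:

```latex
\begin{align*}
E[X] &= 0 \cdot (1 - p) + 1 \cdot p = p, \\
E[X^2] &= 0^2 \cdot (1 - p) + 1^2 \cdot p = p, \\
\operatorname{Var}(X) &= E[X^2] - (E[X])^2 = p - p^2 = p(1 - p).
\end{align*}
```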
Example 1. Suppose we have a biased coin that lands on heads with probability
0.7 and tails with probability 0.3. Define X to be a random variable representing
the outcome of a single flip, where X = 1 if the outcome is heads and X = 0 if
the outcome is tails.

• P (X = 1) = 0.7

• P (X = 0) = 0.3

1. Expected value
E[X] = p = 0.7

2. Variance
Var(X) = p(1 − p) = 0.7 × 0.3 = 0.21

Example 2. A factory produces items, and each item can be either defective
(with probability p = 0.05) or non-defective (with probability 1 − p = 0.95).
Define Y to be a random variable representing the condition of an item, where
Y = 1 if the item is defective and Y = 0 if it is non-defective.

• P (Y = 1) = 0.05

• P (Y = 0) = 0.95

1. Expected value
E[Y ] = p = 0.05

2. Variance
Var(Y ) = p(1 − p) = 0.05 × 0.95 = 0.0475
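Examples 1 and 2 can be reproduced in R by summing over the two outcomes. This is a minimal sketch; `bernoulli_summary` is a hypothetical helper name introduced here for illustration, and it relies on the fact that a Bernoulli variable is a Binomial with a single trial.

```r
# Hypothetical helper: mean and variance of a Bernoulli(p) variable,
# computed by summing over the two outcomes 0 and 1.
bernoulli_summary <- function(p) {
  x <- c(0, 1)
  pmf <- dbinom(x, size = 1, prob = p)  # Bernoulli = Binomial with one trial
  m <- sum(x * pmf)
  v <- sum((x - m)^2 * pmf)
  c(mean = m, variance = v)
}

coin <- bernoulli_summary(0.7)   # Example 1: mean 0.7, variance 0.21
item <- bernoulli_summary(0.05)  # Example 2: mean 0.05, variance 0.0475
coin
item
```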

Binomial distribution
The binomial distribution is a discrete probability distribution of the number
of successes in a sequence of n independent experiments, each asking a yes-no
question, each with the same success probability p, and each with its own
Boolean-valued outcome: a success or a failure.

Probability mass function (PMF)


The probability mass function of a binomial distribution is given by:
 
\[ P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} \]

where:
• X is the random variable counting the number of successes.
• n is the number of trials.
• k is a particular number of successes (k = 0, 1, 2, . . . , n).

• p is the probability of success on an individual trial.


• $\binom{n}{k} = \frac{n!}{k!(n - k)!}$ is the binomial coefficient.

Properties of Binomial Distribution


• Mean: $\mu = np$
• Variance: $\sigma^{2} = np(1 - p)$
• Standard Deviation: $\sigma = \sqrt{np(1 - p)}$
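These properties can be confirmed numerically by summing over the PMF. In the sketch below the values n = 10 and p = 0.3 are illustrative assumptions, not values fixed by the notes.

```r
n <- 10   # illustrative number of trials
p <- 0.3  # illustrative success probability
k <- 0:n

pmf <- dbinom(k, size = n, prob = p)

total  <- sum(pmf)                   # probabilities sum to 1
mean_k <- sum(k * pmf)               # should equal n * p = 3
var_k  <- sum((k - mean_k)^2 * pmf)  # should equal n * p * (1 - p) = 2.1
sd_k   <- sqrt(var_k)

c(total = total, mean = mean_k, variance = var_k, sd = sd_k)
```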

Practical applications
• Marketing: Determining the number of customers who make a purchase
out of a fixed number of contacts.

• Manufacturing: Modeling the number of defective items in a batch of
products.

• Elections: Predicting the number of voters who will vote for a particular
candidate out of a sample of voters.

Example 3. A fair coin is flipped 10 times. What is the probability of getting
exactly 6 heads?
Here, n = 10, k = 6, and p = 0.5.

 
\begin{align*}
P(X = 6) &= \binom{10}{6} (0.5)^{6} (1 - 0.5)^{10 - 6} \\
&= \frac{10!}{6!\,4!} (0.5)^{6} (0.5)^{4} \\
&= 210 \cdot (0.5)^{10} \\
&= 210 \cdot \frac{1}{1024} \\
&= \frac{210}{1024} \approx 0.205
\end{align*}

So, the probability of getting exactly 6 heads is approximately 0.205.


Example 4. In a class, 70% of students pass the exam. If 15 students are
randomly selected, what is the probability that exactly 10 students pass?
Here, n = 15, k = 10, and p = 0.7.
 
\begin{align*}
P(X = 10) &= \binom{15}{10} (0.7)^{10} (1 - 0.7)^{15 - 10} \\
&= \frac{15!}{10!\,5!} (0.7)^{10} (0.3)^{5} \\
&= 3003 \cdot (0.7)^{10} \cdot (0.3)^{5} \\
&\approx 3003 \cdot 0.0282475 \cdot 0.00243 \\
&\approx 0.206
\end{align*}

So, the probability that exactly 10 students pass is approximately 0.206.
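The hand calculations in Examples 3 and 4 can be cross-checked in R. The sketch below evaluates the PMF formula term by term and compares it with the built-in `dbinom()`; the variable names are chosen here for illustration only.

```r
# Example 3: exactly 6 heads in 10 flips of a fair coin.
coef_3 <- choose(10, 6)                               # binomial coefficient, 210
manual_3 <- coef_3 * 0.5^6 * (1 - 0.5)^(10 - 6)
builtin_3 <- dbinom(6, size = 10, prob = 0.5)

# Example 4: exactly 10 of 15 students pass when p = 0.7.
coef_4 <- choose(15, 10)                              # binomial coefficient, 3003
manual_4 <- coef_4 * 0.7^10 * (1 - 0.7)^(15 - 10)
builtin_4 <- dbinom(10, size = 15, prob = 0.7)

round(c(manual_3, builtin_3, manual_4, builtin_4), 4)
```

Rounded to four decimals, the formula and `dbinom()` give 0.2051 for Example 3 and 0.2061 for Example 4.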

Poisson distribution
The Poisson distribution is a discrete probability distribution that expresses
the probability of a given number of events occurring in a fixed interval of
time or space. These events must occur with a known constant mean rate and
independently of the time since the last event.

Probability mass function (PMF)
The probability mass function of a Poisson random variable X is given by:
\[ P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!} \]
where:
• k is the number of occurrences of an event ( k = 0, 1, 2, . . . ),
• λ is the average number of occurrences in the given interval,
• e is the base of the natural logarithm (approximately equal to 2.71828).

Properties of Poisson distribution


1. Mean and variance: Both the mean and variance of a Poisson distributed
random variable X are equal to λ.
2. Additivity: If X1 and X2 are independent Poisson random variables with
parameters λ1 and λ2 , respectively, then X1 +X2 is also a Poisson random
variable with parameter λ1 + λ2 .
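The additivity property can be verified exactly, without simulation, by convolving the two PMFs. In this sketch the rates 2 and 3 are illustrative assumptions and `pmf_sum` is a hypothetical helper name.

```r
lambda1 <- 2  # illustrative rate for X1
lambda2 <- 3  # illustrative rate for X2

# P(X1 + X2 = k) by convolution: sum over all splits j + (k - j) = k.
pmf_sum <- function(k) {
  j <- 0:k
  sum(dpois(j, lambda1) * dpois(k - j, lambda2))
}

k_values <- 0:10
convolved <- sapply(k_values, pmf_sum)
direct <- dpois(k_values, lambda1 + lambda2)

max(abs(convolved - direct))  # effectively zero, up to rounding error
```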

Practical applications
• Call centers: Modeling the number of calls received in an hour.
• Traffic engineering: Predicting the number of cars passing through a
checkpoint in a given period.
• Biology: Modeling the number of mutations in a strand of DNA per unit
length.
Example 5. Suppose a call center receives an average of 5 calls per hour. What
is the probability that exactly 3 calls will be received in an hour?
Here, λ = 5 and k = 3.
\[ P(X = 3) = \frac{5^{3} e^{-5}}{3!} = \frac{125 e^{-5}}{6} \approx 0.1404 \]
So, the probability that exactly 3 calls will be received in an hour is approximately
0.1404.
Example 6. A bookstore sells an average of 2 rare books per day. What is the
probability that no rare books will be sold in a day?
Here, λ = 2 and k = 0.
\[ P(X = 0) = \frac{2^{0} e^{-2}}{0!} = e^{-2} \approx 0.1353 \]
So, the probability that no rare books will be sold in a day is approximately
0.1353.
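Examples 5 and 6 can be reproduced in R, either by typing the PMF formula directly or by calling `dpois()`. A minimal sketch, with variable names chosen here for illustration:

```r
# Example 5: lambda = 5 calls per hour, k = 3 calls.
manual_5 <- 5^3 * exp(-5) / factorial(3)
builtin_5 <- dpois(3, lambda = 5)

# Example 6: lambda = 2 books per day, k = 0 books.
manual_6 <- 2^0 * exp(-2) / factorial(0)
builtin_6 <- dpois(0, lambda = 2)

round(c(manual_5, builtin_5, manual_6, builtin_6), 4)
```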

Quiz

Provide this quiz after the Poisson distribution notes

1. In a batch of 1000 computer chips, the probability that a chip is defective
is 0.02. What is the probability that exactly 20 chips are defective in the
batch? (3 Marks)
Solution: Let X be the number of defective chips in the batch. X follows
a binomial distribution with parameters n = 1000 and p = 0.02. The
probability mass function is given by:
\[ P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} \]
Substituting the values:
\[ P(X = 20) = \binom{1000}{20} (0.02)^{20} (0.98)^{980} \approx 0.0897 \]

2. A stock has a 60% chance of increasing in price each day. Over the next 5
days, what is the probability that the stock will increase in price exactly
3 days? (4 Marks)
Solution: Let Y be the number of days the stock price increases. Y
follows a binomial distribution with parameters n = 5 and p = 0.6. The
probability mass function is given by:
 
\[ P(Y = k) = \binom{n}{k} p^{k} (1 - p)^{n - k} \]
Substituting the values:
\[ P(Y = 3) = \binom{5}{3} (0.6)^{3} (0.4)^{2} = 0.3456 \]

3. A hospital emergency room receives an average of 5 emergency calls per
hour. What is the probability that the emergency room will receive exactly
7 calls in the next hour? (3 Marks)
Solution: Let Z be the number of emergency calls received in an hour.
Z follows a Poisson distribution with parameter λ = 5. The probability
mass function is given by:
\[ P(Z = k) = \frac{\lambda^{k} e^{-\lambda}}{k!} \]
Substituting the values:
\[ P(Z = 7) = \frac{5^{7} e^{-5}}{7!} \approx 0.1044 \]

4. A website receives an average of 2 hits per minute. What is the probability
that it will receive 5 hits in a particular minute? (3 Marks)
Solution: Let W be the number of hits received in a minute. W follows a
Poisson distribution with parameter λ = 2. The probability mass function
is given by:
\[ P(W = k) = \frac{\lambda^{k} e^{-\lambda}}{k!} \]
Substituting the values:
\[ P(W = 5) = \frac{2^{5} e^{-2}}{5!} \approx 0.0361 \]
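The four quiz answers can be checked with R's built-in PMF functions. This is a sketch for self-checking; the variable names `q1` to `q4` are introduced here for illustration.

```r
q1 <- dbinom(20, size = 1000, prob = 0.02)  # Question 1: defective chips
q2 <- dbinom(3, size = 5, prob = 0.6)       # Question 2: stock price increases
q3 <- dpois(7, lambda = 5)                  # Question 3: emergency calls
q4 <- dpois(5, lambda = 2)                  # Question 4: website hits

round(c(q1 = q1, q2 = q2, q3 = q3, q4 = q4), 4)
```

For Question 1, working with `dbinom()` avoids evaluating the very large binomial coefficient and the very small powers separately.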

Monte Carlo simulation


Monte Carlo simulation is a computational technique that uses random sampling
to obtain numerical results. The core idea is to use randomness to solve problems
that might be deterministic in principle. It’s widely used in fields such as finance,
engineering, supply chain management, and more.

Applications
• Finance: Risk assessment and portfolio optimization.
• Engineering: Reliability analysis and optimization of systems.
• Supply chain management: Inventory optimization and demand forecasting.

• Physics: Particle transport simulations and quantum mechanics.


• Economics: Policy impact analysis and market behavior simulations.

What is simulation
Simulation is the process of creating a model of a real-world system and
conducting experiments on this model to understand the behavior of the system
or to evaluate various strategies for its operation. This model is often
mathematical or computational and can include physical, chemical, biological,
economic, social, and engineering systems.

Importance of simulation
• Risk-free testing: Simulation allows for the testing of hypotheses and
strategies in a risk-free environment. This is crucial in fields like finance
and engineering where real-world testing can be costly or dangerous.

• Understanding complex systems: It provides a way to understand and
analyze complex systems that are difficult to study through analytical
methods alone.
• Decision making: Simulation helps in making informed decisions by
evaluating the potential outcomes of different scenarios. This is especially
useful in business and public policy.
• Optimization: It aids in the optimization of processes and systems by
allowing for the testing of different variables and conditions to find the
most efficient solution.

• Training and education: Simulation is widely used for training and
educational purposes, providing hands-on experience in a controlled setting.

Uses of simulation
• Engineering and manufacturing: To design and test new products, optimize
production processes, and improve quality control.
• Healthcare: For planning and managing healthcare systems, studying the
spread of diseases, and training medical professionals.
• Finance: To model financial markets, assess risk, and develop trading
strategies.
• Military and defense: For strategic planning, training, and the
development of new technologies.
• Climate science: To model climate change and predict future environmental
conditions.
• Operations research: To optimize logistics, supply chain management, and
resource allocation.
• Social sciences: For studying human behavior, social dynamics, and
economic policies.

Simulation is an invaluable tool that spans multiple disciplines, providing a
robust framework for analysis, experimentation, and decision-making.

Simulation versus Monte Carlo simulation


Simulation is a broad term that refers to the process of creating and analyzing
a model or replica of a real-world system to understand its behavior or to test
different scenarios. It encompasses various techniques and methodologies to
model complex systems.

Types of simulation
1. Deterministic simulation: Uses fixed inputs and processes to model
systems where the outcome is predictable and consistent (e.g., fluid
dynamics).
2. Stochastic simulation: Incorporates randomness and variability into the
model, often using probability distributions to represent uncertainty (e.g.,
queuing systems).
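The contrast between the two types can be sketched with a toy growth model. Everything in this example (the starting balance, the 5% rate, and the normal distribution for random rates) is an assumption made for illustration and is not taken from the notes.

```r
# Illustrative model: a balance growing over 10 periods.
periods <- 10
start <- 100

# Deterministic: a fixed 5% growth rate gives the same answer on every run.
deterministic_final <- start * (1 + 0.05)^periods

# Stochastic: growth rates are drawn from a normal distribution
# (mean 5%, sd 2%; both values are assumptions made for this sketch).
set.seed(123)
rates <- rnorm(periods, mean = 0.05, sd = 0.02)
stochastic_final <- start * prod(1 + rates)

deterministic_final
stochastic_final
```

Rerunning the stochastic part with a different seed changes its result, whereas the deterministic result never changes.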

Monte Carlo Simulation


Monte Carlo Simulation is a specific type of stochastic simulation that uses
random sampling to obtain numerical results. It is named after the Monte
Carlo Casino due to its reliance on randomness and probability.

Key features of Monte Carlo simulation


1. Random sampling: Generates random inputs to model a system and
performs multiple trials to estimate the behavior or outcome.
2. Statistical analysis: Uses the results from numerous random trials to
compute estimates, probabilities, and confidence intervals.
3. Reproducibility: Setting a seed ensures that simulations can be reproduced
with the same random number sequences.
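The three features above can be seen together in a classic exercise: estimating π from random points in the unit square. This is a minimal sketch; the number of points and the normal-approximation confidence interval are choices made here for illustration.

```r
set.seed(123)       # reproducibility
n_points <- 100000  # number of random trials (an arbitrary choice)

# Random sampling: points uniformly distributed in the unit square.
x <- runif(n_points)
y <- runif(n_points)

# A point lands inside the quarter circle when x^2 + y^2 <= 1.
inside <- (x^2 + y^2) <= 1

# Statistical analysis: the hit proportion estimates pi / 4.
p_hat <- mean(inside)
pi_hat <- 4 * p_hat

# Approximate 95% confidence interval from the binomial standard error.
se <- 4 * sqrt(p_hat * (1 - p_hat) / n_points)
ci <- c(lower = pi_hat - 1.96 * se, upper = pi_hat + 1.96 * se)

pi_hat
ci
```

Each point is a Bernoulli trial (inside or outside), so the number of hits is Binomial, which is why the binomial standard error applies.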

Difference from general simulation


• Scope: Monte Carlo Simulation is a subset of simulation techniques
focused specifically on random sampling and probabilistic analysis, whereas
general simulation encompasses a broader range of methods, including
deterministic and other stochastic approaches.
• Methodology: Monte Carlo relies heavily on random number generation
and statistical methods to model uncertainty and variability, while other
simulation types might use deterministic rules or simpler probabilistic
models.

Simulating data in R
Steps in simulating data in R
1. Determine the type of data you need and the purpose of the simulation.
For example, are you simulating normal data to test statistical methods,
or are you modeling a complex system with multiple variables?

2. Set the seed for reproducibility: Use set.seed() to ensure that the results
are reproducible. This is important for consistency and debugging.

set.seed(123)

3. Decide which probability distribution best fits your simulation needs (e.g.,
normal, binomial, Poisson). Specify the parameters for the distribution,
such as mean and standard deviation for normal distribution, or size and
probability for binomial distribution.
4. Use R functions to generate random data according to the chosen
distribution and parameters.

binomial_data <- rbinom(1000, size = 10, prob = 0.5)


poisson_data <- rpois(1000, lambda = 5)

5. Conduct preliminary analysis to check the data’s summary statistics and
visualize it using plots. Use functions like summary() and hist() for
initial exploration.

summary(binomial_data)
hist(binomial_data, main = "Histogram of Binomial Data",
xlab = "Value", ylab = "Frequency")

6. Based on initial results, adjust parameters or the simulation model as
needed. Repeat the simulation to refine your results or to test different
scenarios.
7. Save your simulated data and results for further analysis or reporting. Use
functions like write.csv() to export data to files.

write.csv(binomial_data, "binomial_data.csv")
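One way to judge whether the simulated data behave as intended is to compare them with the theoretical distribution. The sketch below regenerates the two samples with the same parameters as in the steps above and compares observed relative frequencies with `dbinom()`, and the Poisson sample mean and variance with lambda. The object names `observed`, `expected`, and `comparison` are introduced here for illustration.

```r
set.seed(123)
binomial_data <- rbinom(1000, size = 10, prob = 0.5)
poisson_data <- rpois(1000, lambda = 5)

# Binomial: observed relative frequencies against dbinom().
values <- 0:10
observed <- as.numeric(table(factor(binomial_data, levels = values))) / 1000
expected <- dbinom(values, size = 10, prob = 0.5)
comparison <- data.frame(value = values,
                         observed = observed,
                         expected = round(expected, 3))
comparison

# Poisson: sample mean and variance should both be near lambda = 5.
c(mean = mean(poisson_data), variance = var(poisson_data))
```

Large gaps between the observed and expected columns would suggest a wrong parameter or too few simulated values.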

Practical session

Create a code run for the students to practice the above R codes.

Bernoulli distribution application


Example 7. A company wants to simulate the outcome of a quality control test
where each product has a 90% chance of passing the test (success) and a 10%
chance of failing (failure).
# Set probability of success
p <- 0.9

# Simulate 1000 Bernoulli trials (a Binomial with size = 1)
set.seed(123)
bernoulli_trials <- rbinom(1000, size = 1, prob = p)

# Calculate the proportion of successes
proportion_success <- mean(bernoulli_trials)
proportion_success
This R code simulates 1000 Bernoulli trials and calculates the proportion of
successes, which should be close to 0.9.
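To see what a single Bernoulli draw involves, the same experiment can be written from first principles with uniform random numbers: a trial passes when a uniform draw on (0, 1) falls below p. This is an alternative sketch for illustration, not a description of how `rbinom()` is implemented internally.

```r
p <- 0.9

set.seed(123)
u <- runif(1000)             # uniform draws on (0, 1)
passes <- as.integer(u < p)  # 1 = pass, 0 = fail

mean(passes)                 # proportion of passes, expected to be near 0.9
```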

Practical session

Create a code run for the students to practice the above R codes.

Binomial distribution application


Example 8. A marketing campaign is run where the probability of a customer
making a purchase is 0.3. If 10 customers are contacted, what is the probability
that exactly 4 will make a purchase?
# Set parameters
n <- 10
p <- 0.3
k <- 4

# Calculate the probability of exactly 4 successes


prob_4_successes <- dbinom(k, size = n, prob = p)
prob_4_successes
This R code calculates the probability of exactly 4 customers making a purchase
out of 10, which is approximately 0.2001.
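The same probability can also be estimated by Monte Carlo simulation, which ties the exact calculation back to the simulation ideas in this module. In this sketch the number of simulated campaigns is an arbitrary choice.

```r
set.seed(123)
n_sims <- 100000  # number of simulated campaigns (arbitrary)

# Each simulated campaign contacts 10 customers with purchase probability 0.3.
purchases <- rbinom(n_sims, size = 10, prob = 0.3)

# The share of campaigns with exactly 4 purchases estimates P(X = 4).
estimate <- mean(purchases == 4)
exact <- dbinom(4, size = 10, prob = 0.3)

c(estimate = estimate, exact = exact)
```

The estimate approaches the exact value as the number of simulated campaigns grows.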

Practical session

Create a code run for the students to practice the above R codes.

Poisson distribution application


Example 9. A call center receives an average of 5 calls per hour. What is the
probability that exactly 3 calls will be received in an hour?
# Set parameters
lambda <- 5
k <- 3

# Calculate the probability of exactly 3 calls


prob_3_calls <- dpois(k, lambda = lambda)
prob_3_calls

This R code calculates the probability of receiving exactly 3 calls in an hour,
which is approximately 0.1404.

Practical session

Create a code run for the students to practice the above R codes.

Quiz

This quiz is to be embedded as part of the notes after the Monte Carlo
simulation

• Match each scenario with the appropriate distribution or method (4 Marks):

The description:
A. A manufacturing process has a 2% defect rate. You inspect 100 items and
count the number of defective items.

B. You flip a fair coin and record whether it lands heads (1) or tails (0).
C. The number of emails received per hour at a help desk.
D. Estimating the value of π using random sampling.
The distribution or method:

1. Binomial Distribution (Answer A)


2. Bernoulli Distribution (Answer B)
3. Poisson Distribution (Answer C)

4. Monte Carlo Simulation (Answer D)

Reading Materials
1. Aliakbar Montazer Haghighi and Indika Wickramasinghe (2020). Probability,
Statistics and Stochastic Processes for Engineers and Scientists. First
Edition, CRC Press (pages 93-101). Read the selected pages.

Summary
1. The Bernoulli distribution represents a random variable with two possible
outcomes: 1 (success) with probability p and 0 (failure) with probability
1 − p. Its probability mass function (pmf) is $P(X = x) = p^{x} (1 - p)^{1 - x}$
for x ∈ {0, 1}. It is fundamental in modeling binary outcomes in statistics.
2. The Binomial distribution models the number of successes in n independent
Bernoulli trials with probability p of success. Its pmf is
$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$ for k = 0, 1, 2, . . . , n.
It is crucial for experiments with fixed numbers of independent trials.
3. The Poisson distribution describes the probability of a given number of
events occurring in a fixed interval of time or space with a known average
rate λ. Its pmf is $P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
for k = 0, 1, 2, . . .. It is essential for modeling rare events in a
continuous domain.

4. Simulation is a computational technique used to model the behavior of
a system by generating random variables and analyzing the outcomes.
Monte Carlo simulation, a specific type of simulation, involves using
random sampling and statistical methods to estimate mathematical functions
and mimic the operation of complex systems.

5. To perform simulations for the binomial distribution in R, use the rbinom()
function, which generates random numbers based on specified parameters.
For the Poisson distribution, use the rpois() function, which similarly
generates random numbers given a lambda parameter.
