
Module 9 - Simulation, sampling and resampling

Instructional Hours: 4

Module Description
In this module, you will learn about sampling and the different sampling methods, simulation with emphasis on Monte Carlo simulation, and resampling using the bootstrapping and jackknife methods. Integration based on Monte Carlo integration is presented. Practical examples using R software are provided. Further, the module provides videos, examples and a quiz for self-assessment in the learning process.
This module introduces sampling methods, simulation using Monte Carlo simulation and resampling using the bootstrapping and jackknife methods.

Module Learning Outcomes


• Define the terms sampling, simulation and resampling
• Show how to simulate data with and without replacement
• Solve an integral problem using the Monte Carlo integration
• Apply the bootstrapping and jackknife methods in resampling

Learning Activities
1. Read the lecture notes (PDF)
2. Read the assigned reference materials
3. Watch lecture videos
4. Complete the module assessment/Quiz

Introduction
0.0.1 Simulation
Simulation is a statistical technique that involves generating random data based
on specified parameters or models to understand the behavior of a system or
process. It allows us to explore the potential outcomes and variability of complex
systems when analytical solutions are difficult or impossible to obtain.

Key points
• Random sampling: Simulation often involves random sampling from probability distributions.
• Modeling uncertainty: It helps in modeling the uncertainty and variability inherent in real-world systems.
• Applications: Commonly used in risk assessment, financial modeling, engineering, and scientific research.
• Monte Carlo simulation: A popular simulation method that relies on repeated random sampling to estimate properties of a system.

0.0.2 Resampling
Resampling is a statistical technique that involves repeatedly drawing samples
from observed data. It is used to assess the variability of a statistic and make
inferences about the population from which the sample was drawn.

Key points
• Bootstrap method: A resampling technique that involves repeatedly sampling with replacement from the observed data to estimate the sampling distribution of a statistic.
• Jackknife method: A resampling technique that systematically leaves out one observation at a time from the sample set and calculates the statistic for each subset.
• Permutation tests: Used to test hypotheses by comparing the observed data to data generated by randomly permuting the labels of the observations.
• Cross-validation: Used in machine learning to assess the performance of a model by dividing the data into training and validation sets multiple times.
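The permutation-test idea above can be sketched in R. The two groups and their values below are invented purely for illustration:

```r
# Permutation test sketch (toy data invented for illustration):
# compare two group means by randomly permuting the group labels
set.seed(123)
group_a <- c(5.1, 4.8, 5.5, 5.0, 5.3)
group_b <- c(4.2, 4.5, 4.1, 4.4, 4.6)
observed_diff <- mean(group_a) - mean(group_b)

combined <- c(group_a, group_b)
n_perm <- 10000
perm_diffs <- replicate(n_perm, {
  shuffled <- sample(combined)                 # shuffle the labels
  mean(shuffled[1:5]) - mean(shuffled[6:10])
})

# Two-sided p-value: proportion of permuted differences at least
# as extreme as the observed difference
p_value <- mean(abs(perm_diffs) >= abs(observed_diff))
p_value
```

A small p-value suggests the observed difference is unlikely to arise from label shuffling alone.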

0.1 Simulation, sampling and resampling


Simulation
Simulation involves generating random data based on specified parameters or
models to study the behavior of a system or process.

Sampling
Sampling is the process of selecting a subset of individuals or observations from
a larger population to infer characteristics about the population.

Resampling
Resampling involves repeatedly drawing samples from observed data, often to
assess the variability of a statistic.

Bootstrapping
Bootstrapping is a specific resampling technique where samples are drawn with
replacement from the observed data to estimate the sampling distribution of a
statistic.

0.1.1 Similarities
• Simulation and sampling: Both involve drawing samples from a distribution.
• Resampling and bootstrapping: Bootstrapping is a type of resampling.
• Simulation and resampling: Both methods generate multiple datasets to analyze variability and uncertainty.

0.1.2 Differences
Source of data
• Simulation: Data is generated based on specified models or distributions.
• Sampling: Data is drawn from an actual population.
• Resampling: Data is repeatedly drawn from the observed dataset.
• Bootstrapping: A specific form of resampling with replacement from the observed dataset.

Purpose
• Simulation: To model and study complex systems.
• Sampling: To make inferences about a population.
• Resampling: To assess the variability of a statistic.
• Bootstrapping: To estimate the sampling distribution of a statistic.

Method
• Simulation: Involves generating random data.
• Sampling: Involves selecting a subset of the population.
• Resampling: Involves creating multiple samples from observed data.
• Bootstrapping: Involves drawing samples with replacement from the observed data.

0.2 Sampling methods


Sampling is the process of selecting a subset of data points from a larger dataset
(population) to make statistical inferences about the population.

0.2.1 Types of Sampling Methods


Simple Random Sampling
Each member of the population has an equal chance of being selected.
Example 1. A data scientist wants to survey 100 employees out of 1,000 in a
company to understand their job satisfaction.
• Assign a unique number to each employee (1 to 1000).
• Use a random number generator to select 100 unique numbers.
• Survey the employees corresponding to these numbers.
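This procedure can be sketched in R as follows (the employee IDs 1 to 1000 are assumed for illustration):

```r
# Simple random sampling: select 100 of 1,000 employee IDs
# (IDs 1:1000 are assumed for illustration)
set.seed(123)
selected_ids <- sample(1:1000, size = 100, replace = FALSE)
length(selected_ids)   # 100 unique employees to survey
```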

Stratified Sampling
The population is divided into strata (subgroups) based on specific characteristics, and random samples are taken from each stratum.
Example 2. A researcher wants to ensure representation from different age groups in a health survey. The population is divided into three age groups: 18-30, 31-50, and 51+.

• Divide the population into three strata based on age groups.
• Randomly sample individuals from each age group proportional to their population size.
Example 3. In a factory with 200 workers, there are 50 in the assembly line,
100 in packaging, and 50 in quality control. A sample of 60 workers is needed.

1. Divide the workers into three strata: assembly line, packaging, and quality
control.

2. Calculate the sample size for each stratum proportional to their population size:
• Assembly line: (50/200) × 60 = 15
• Packaging: (100/200) × 60 = 30
• Quality control: (50/200) × 60 = 15
3. Randomly sample 15 workers from the assembly line, 30 from packaging,
and 15 from quality control.
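The allocation above can be sketched in R. The worker IDs below are invented labels, used only to make the example self-contained:

```r
# Stratified sampling: draw 15 / 30 / 15 workers from each stratum
# (worker IDs are invented labels for illustration)
set.seed(123)
assembly  <- paste0("A", 1:50)
packaging <- paste0("P", 1:100)
quality   <- paste0("Q", 1:50)

stratified_sample <- c(sample(assembly,  15),
                       sample(packaging, 30),
                       sample(quality,   15))
length(stratified_sample)   # 60 workers in total
```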

Cluster Sampling
The population is divided into clusters, some of which are randomly selected.
All members of selected clusters are sampled.

Example 4. A school district has 20 schools, and a researcher wants to survey all teachers in 5 randomly selected schools.
1. Number the schools from 1 to 20.
2. Use a random number generator to select 5 unique numbers.

3. Survey all teachers in the schools corresponding to these numbers.
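A minimal R sketch of the cluster selection step:

```r
# Cluster sampling: randomly select 5 of the 20 schools,
# then survey every teacher in the selected schools
set.seed(123)
selected_schools <- sample(1:20, size = 5)
selected_schools
```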

Systematic Sampling
Every k-th member of the population is selected after a random starting point.
Example 5. From a list of 500 customers, select a sample of 50 using systematic sampling.
1. Calculate the sampling interval: k = 500/50 = 10.
2. Randomly select a starting point between 1 and 10, say 4.
3. Select every 10th customer from the starting point: 4, 14, 24, 34, ..., 494.
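The steps above can be sketched in R (here the starting point is drawn at random rather than fixed at 4):

```r
# Systematic sampling: every 10th customer after a random start
set.seed(123)
k <- 500 / 50                       # sampling interval
start <- sample(1:k, 1)             # random starting point in 1..10
selected <- seq(from = start, by = k, length.out = 50)
head(selected)
```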

0.3 Simulation
Simulation is a computational technique used to model the behavior of a system
by generating random samples and observing the outcomes. It is often used to
estimate the probability of complex events that are difficult to analyze using
analytical methods.

0.3.1 Monte Carlo Simulation
Monte Carlo Simulation is a specific type of simulation that relies on repeated
random sampling to compute results. It is named after the Monte Carlo Casino
in Monaco, due to the element of chance.

Example 6. Simulate rolling a fair six-sided die 10,000 times and estimate the
probability of rolling a 4.

1. Generate 10,000 random numbers between 1 and 6.
2. Count the number of times the number 4 appears.
3. Estimate the probability by dividing the count of 4s by the total number of rolls.

set.seed(123)
rolls <- sample(1:6, 10000, replace = TRUE)
prob_4 <- sum(rolls == 4) / 10000
prob_4

# Expected output
0.1675

Practical Session
Create an interactive R studio in LMS for the student to practice the
above codes.

Example 7. Estimate the value of π using Monte Carlo simulation.

1. Generate random points within a unit square.
2. Count the number of points that fall within a quarter circle of radius 1.
3. Use the ratio of points inside the circle to the total number of points to estimate π.

set.seed(123)
n <- 10000
x <- runif(n, min = 0, max = 1)
y <- runif(n, min = 0, max = 1)
inside_circle <- (x^2 + y^2) <= 1
pi_estimate <- 4 * sum(inside_circle) / n
pi_estimate

# Expected output
3.1424

Practical Session
Create an interactive R studio in LMS for the student to practice the
above codes.

Example 8. Simulate tossing a fair coin 1,000 times and estimate the probability of getting heads.

1. Generate 1,000 random numbers (0 or 1) where 0 represents tails and 1 represents heads.
2. Count the number of heads.
3. Estimate the probability by dividing the count of heads by the total number of tosses.

set.seed(123)
tosses <- sample(c(0, 1), 1000, replace = TRUE)
prob_heads <- sum(tosses) / 1000
prob_heads

# Expected output
0.507

Practical Session
Create an interactive R studio in LMS for the student to practice the
above codes.

Example 9. Estimate the integral of f(x) = x² from 0 to 1 using Monte Carlo simulation.

1. Generate random points uniformly distributed in [0,1].
2. Evaluate the function at these points.
3. Take the average of the function values to estimate the integral.

set.seed(123)
n <- 10000
x <- runif(n, min = 0, max = 1)
f_x <- x^2
integral_estimate <- mean(f_x)
integral_estimate

# Expected output
0.3337

Practical Session
Create an interactive R studio in LMS for the student to practice the
above codes.

0.3.2 Watch the video


Watch the video and answer the questions that follow

Video: Visit the URL below to view a video:

https://www.youtube.com/embed/v=9crPxlA795Y

Video by Stat Legend - Monte Carlo Integration using R


1. Using nrep = 2,000, find the integral of x² dx from 0 to 1 using Monte Carlo integration (4 Marks) (Answer is 0.3303753)

2. Using nrep = 4,000, find the integral of eˣ dx from 1 using Monte Carlo integration (4 Marks) (Answer is 20.9764)

0.4 Bootstrapping
Bootstrapping is a statistical technique used to estimate the sampling distribution of a statistic by repeatedly resampling with replacement from the original sample data. This method allows for assessing the variability and confidence intervals of estimates, particularly when the theoretical distribution of the statistic is unknown or the sample size is small.

0.4.1 Steps involved in bootstrapping


1. Original sample: Start with an original sample of size n from the population.
2. Resampling: Randomly draw n observations with replacement from the original sample to create a bootstrap sample.
3. Statistic calculation: Calculate the desired statistic (e.g., mean, median) for the bootstrap sample.
4. Repeat: Repeat steps 2 and 3 a large number of times (e.g., 1,000 or 10,000) to create a distribution of the statistic.

5. Analysis: Analyze the distribution of the bootstrap statistics to estimate the standard error, confidence intervals, and other properties of the original statistic.

Example 10. Estimate the mean and its 95% confidence interval for a sample using bootstrapping.

1. Generate a bootstrap sample by resampling with replacement from the original sample.
2. Compute the mean for each bootstrap sample.
3. Repeat this process 10,000 times to create a distribution of the mean.
4. Calculate the 95% confidence interval from the bootstrap distribution.

set.seed(123)
# Original sample
original_sample <- c(10, 12, 15, 18, 20, 25, 30, 35, 40, 45)

# Function to generate bootstrap samples and calculate the mean
bootstrap_mean <- function(data, n_bootstrap) {
  bootstrap_means <- numeric(n_bootstrap)
  n <- length(data)
  for (i in 1:n_bootstrap) {
    bootstrap_sample <- sample(data, size = n, replace = TRUE)
    bootstrap_means[i] <- mean(bootstrap_sample)
  }
  return(bootstrap_means)
}

# Generate 10,000 bootstrap samples
n_bootstrap <- 10000
bootstrap_means <- bootstrap_mean(original_sample, n_bootstrap)

# Calculate the 95% confidence interval
ci_lower <- quantile(bootstrap_means, 0.025)
ci_upper <- quantile(bootstrap_means, 0.975)

list(mean = mean(bootstrap_means), ci_lower = ci_lower, ci_upper = ci_upper)

# Expected output
$mean
[1] 25.02295

$ci_lower
[1] 17.5

$ci_upper
[1] 32.5

Example 11. Estimate the median and its 95% confidence interval for a sample using bootstrapping.
Solution:
1. Generate a bootstrap sample by resampling with replacement from the original sample.
2. Compute the median for each bootstrap sample.
3. Repeat this process 10,000 times to create a distribution of the median.
4. Calculate the 95% confidence interval from the bootstrap distribution.

set.seed(123)
# Function to generate bootstrap samples and calculate the median
bootstrap_median <- function(data, n_bootstrap) {
  bootstrap_medians <- numeric(n_bootstrap)
  n <- length(data)
  for (i in 1:n_bootstrap) {
    bootstrap_sample <- sample(data, size = n, replace = TRUE)
    bootstrap_medians[i] <- median(bootstrap_sample)
  }
  return(bootstrap_medians)
}

# Generate 10,000 bootstrap samples
bootstrap_medians <- bootstrap_median(original_sample, n_bootstrap)

# Calculate the 95% confidence interval
ci_lower <- quantile(bootstrap_medians, 0.025)
ci_upper <- quantile(bootstrap_medians, 0.975)

list(median = median(bootstrap_medians), ci_lower = ci_lower, ci_upper = ci_upper)

# Expected output
$median
[1] 25

$ci_lower
[1] 15

$ci_upper
[1]

Practical Session
Create an interactive R studio in LMS for the student to practice the
above codes.

0.4.2 Quiz
Given the following dataset representing the weights (in kg) of five individuals:

data = {55, 62, 70, 58, 64}

Using the bootstrapping method, calculate the bootstrap estimate of the mean weight based on 1000 resamples. (6 Marks) (Answer is 61.4). Hint: Execute this question by setting the seed to 123.

0.5 Jackknife
Jackknife resampling is a technique used to estimate the bias and variance of a
statistical estimator. It systematically leaves out one observation at a time from
the sample set and calculates the estimate over n different samples (where n is
the sample size). It is particularly useful for small datasets and can be applied
to various statistical measures.

0.5.1 Steps involved in Jackknife resampling


1. Compute the original estimator: Calculate the estimator (e.g., mean, variance) using the full dataset.
2. Leave-one-out samples: Create n subsets of the data, each leaving out one observation.
3. Compute leave-one-out estimates: For each subset, compute the estimator.
4. Jackknife estimate of bias: Calculate the bias of the estimator using the leave-one-out estimates.
5. Jackknife estimate of variance: Calculate the variance of the estimator using the leave-one-out estimates.
6. Combine results: Use the bias and variance estimates to adjust the original estimator if needed.

Example 12. Estimate the mean of a sample and its standard error using the jackknife method.

1. Compute the mean of the full dataset.
2. Generate leave-one-out samples.
3. Compute the mean for each leave-one-out sample.
4. Calculate the jackknife estimate of the mean and its standard error.

# Sample data
data <- c(2, 4, 6, 8, 10)

# Step 1: Compute the original estimator (mean)
original_mean <- mean(data)

# Step 2: Generate leave-one-out samples and compute their means
n <- length(data)
leave_one_out_means <- numeric(n)

for (i in 1:n) {
  leave_one_out_sample <- data[-i]
  leave_one_out_means[i] <- mean(leave_one_out_sample)
}

# Step 3: Compute the jackknife estimate of the mean
jackknife_mean <- mean(leave_one_out_means)

# Step 4: Compute the jackknife estimate of the standard error
jackknife_se <- sqrt((n - 1) * mean((leave_one_out_means - jackknife_mean)^2))

# Results
original_mean
jackknife_mean
jackknife_se

# Expected output
original_mean: 6
jackknife_mean: 6
jackknife_se: 1.414214

0.5.2 Quiz
Given the dataset
data = {5, 7, 8, 6, 9}
Calculate the Jackknife mean, bias and variance (6 Marks) (Answer: mean =7.2,
Bias=0.2 and variance=0.24 )

0.5.3 Discussion forum


Provide a discussion forum for the students to share their experiences in this module.

0.6 Reading Materials
1. Miltiadis C. Mavrakakis and Jeremy Penzer (2021). Probability and Statistical Inference: From Basic Principles to Advanced Models. Chapman and Hall/CRC (Pages 379-400). Read the selected pages.

0.7 Summary
1. Sampling is the process of selecting a subset of individuals from a population to estimate characteristics of the whole population.

Types of sampling

• Simple random sampling: Every individual has an equal chance of being selected.
• Stratified sampling: Population is divided into strata, and samples are taken from each stratum.
• Cluster sampling: Population is divided into clusters, and whole clusters are sampled.
• Systematic sampling: Every nth individual is selected from a list of the population.
• Convenience sampling: Samples are taken from a group that is conveniently accessible.

2. Bootstrapping is a resampling method used to estimate the distribution of a statistic by sampling with replacement from the original data.

Key methods

• Generating bootstrap samples: Repeatedly draw samples with replacement from the original data set.
• Calculating statistics: For each bootstrap sample, calculate the desired statistic (e.g., mean, variance).
• Estimating distribution: Use the distribution of the bootstrap statistics to estimate the sampling distribution.
• Confidence intervals: Derive confidence intervals from the bootstrap distribution.

3. Jackknife is a resampling method used to estimate the bias and variance of a statistic by systematically leaving out one observation at a time from the sample set.

Key methods

• Leave-one-out: Create jackknife samples by systematically omitting each observation from the sample.
• Calculate statistic: For each jackknife sample, calculate the statistic of interest.
• Estimate bias and variance: Use the jackknife samples to estimate the bias and variance of the statistic.
• Jackknife-after-bootstrap: Combine jackknife and bootstrap for more robust variance estimates.

4. Monte Carlo Simulation is a computational technique that uses repeated random sampling to simulate and understand the behavior of a system or process.

Key methods

• Random sampling: Generate random samples from the probability distributions of the input variables.
• Model simulation: Use the random samples to simulate the model or system.
• Outcome analysis: Analyze the results of the simulations to estimate probabilities, averages, variances, etc.
• Applications: Used in various fields like finance, engineering, and science to solve problems involving uncertainty and complex systems.

