Simulation
Presidency University
December, 2024
Introduction
Concept of Simulation
1. The random numbers generated by computers are not random in the absolute sense; they are only pseudo-random numbers.
Why do we simulate?
Response of classical statisticians
- Restrict to tractable special cases: The idea was to focus only on very simple setups for which the sampling distribution becomes exactly tractable. This can be done in two ways:
  - use only statistics which have a simple mathematical form (like linear functions of the data): that is why the sample mean became so popular as a statistic.
  - assume that the probability distributions featured in the stochastic model take one of a few forms for which exact calculation is possible, analytically or via tabulated special functions: that is why the normal distribution became so popular.
- Combining, we have the most popular combination for doing statistical inference: the sample mean and the normal distribution.
- Appeal to asymptotics: Generally the assumption of normality does not hold in practice, but we can show that under reasonable conditions, even for different parent distributions, the distribution of the sample mean is approximately normal for large samples.
- Up through about the 1960s, statistics was split between developing general ideas about how to draw and evaluate inferences with stochastic models, and working out the properties of inferential procedures in tractable special cases (especially the linear-and-Gaussian case), or under asymptotic approximations.
Simulating a Bernoulli Random Variable
bernoulli=function(prob)
{
  u=runif(1)                  # one U(0,1) draw
  if (u<prob) x=1 else x=0    # success with probability prob
  return(x)
}
bernoulli(0.7) #--call the function
[1] 1
Using it further
x=1
y=bernoulli(0.8)
while(y!=1)     # repeat until the first success
{
  y=bernoulli(0.8)
  x=x+1         # count the number of trials
}
x
[1] 1
More Complicated Ones
Fact
Suppose we want to generate X having p.m.f. P(X = xj) = pj, j = 0, 1, ..., with Σj pj = 1. We generate U ∼ U(0, 1) and set
X = x0  if U < p0,
X = x1  if p0 ≤ U < p0 + p1,
...
X = xj  if Σ_{i=0}^{j−1} pi ≤ U < Σ_{i=0}^{j} pi,
...
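The Fact translates directly into code. A minimal sketch (not from the original slides) for a finite support, where xs holds the support points and ps their probabilities:
discrete_sample=function(xs,ps)
{
  U=runif(1)
  cumprob=cumsum(ps)          # p0, p0+p1, p0+p1+p2, ...
  j=min(which(U<cumprob))     # first index where the cumulative sum exceeds U
  return(xs[j])
}
discrete_sample(c(0,1,2),c(0.2,0.5,0.3))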
Algorithm
poi_mass=function(x,lambda)
{
  return(exp(-lambda)*(lambda^x)/factorial(x))   # P(X = x) for Poisson(lambda)
}
poi_sample=function(lambda)
{
  U=runif(1); i=0; cumprob=poi_mass(0,lambda)
  while(U>cumprob)       # walk up the c.d.f. until it exceeds U
  {
    i=i+1
    cumprob=cumprob+poi_mass(i,lambda)
  }
  return(i)
}
poi_sample(5)
[1] 4
Drawing random sample
- Recall that drawing a random sample is the same as drawing a discrete uniform {1, 2, ..., n} random variable. Hence we should know how to simulate a discrete uniform sample if we have U ∼ U(0, 1).
- Note that if U ∼ U(0, 1), then X = [nU] + 1 has a discrete uniform distribution over {1, 2, ..., n} (a quick sketch follows this list).
- In R, we shall make use of the sample() command for both with-replacement and without-replacement sampling.
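Before turning to sample(), a quick check of the [nU] + 1 recipe (this sketch is not from the original slides):
disc_unif=function(n)
{
  U=runif(1)
  return(floor(n*U)+1)        # [nU] + 1 is uniform on {1, ..., n}
}
table(replicate(10000,disc_unif(6)))   # roughly equal counts over 1..6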
set.seed(100)
sample(c("A","B","C","D","E"),size=3)            # without replacement (default)
sample(c("A","B","C","D","E"),size=3,replace=T)  # with replacement
sample(100,size=5)
[1] 78 23 86 70 4
Unequal Probability Sampling
set.seed(100)
sample(c("A","B","C"),size=2,prob=c(0.1,0.4,0.5))  # P(A)=0.1, P(B)=0.4, P(C)=0.5
What about Continuous Distributions?
Fact
(Probability Integral Transformation) If X has an absolutely continuous distribution, then the c.d.f. of X evaluated at X, F(X), has the U(0, 1) distribution.
- Suppose we want to generate X from the Exp(λ) distribution. Then F(x) = 1 − e^{−λx}, so setting F(X) = U gives X = −log(1 − U)/λ; since 1 − U ∼ U(0, 1) as well, X = −log(U)/λ also works (see the sketch below).
- However, this method does not work if F does not have a closed form, e.g. the normal, gamma and beta distributions.
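A minimal sketch of the inverse transform for Exp(λ), following the bullet above (the code is not from the original slides):
exp_sample=function(n,lambda)
{
  # F(x) = 1 - exp(-lambda*x), so F^{-1}(u) = -log(1-u)/lambda;
  # since 1 - U ~ U(0,1) too, -log(U)/lambda works as well
  U=runif(n)
  return(-log(U)/lambda)
}
X=exp_sample(10000,2)
hist(X,prob=T)
curve(dexp(x,rate=2),add=T,col="red")   # compare with the true density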
Generating Normal Variate I
normal=function(n)
{
  # Box-Muller transform: two independent uniforms give two independent N(0,1)s
  U1<-runif(n)
  U2<-runif(n)
  C<-sqrt(-2*log(U1))      # radius
  Z1<-C*cos(2*pi*U2)
  Z2<-C*sin(2*pi*U2)
  return(Z1)
}
n<-100000
Z=normal(n)
hist(Z,prob=T)
curve(dnorm(x),-3,3,add=T)
Generating Normal Variate II
[Figure: histogram of Z with the N(0, 1) density curve overlaid; the simulated values match the standard normal density closely.]
Distribution Arising from Normal Distribution
binorm=function(n,rho)
{
  x<- numeric(n); y<- numeric(n)
  for (i in 1:n)
  {
    z1 <- normal(1)
    z2 <- normal(1)
    x[i] <- z1
    y[i] <- rho*z1+sqrt(1-rho^2)*z2   # Corr(x, y) = rho by construction
  }
  return(cbind(x,y))
}
[Figure: scatter plot of a simulated bivariate normal sample (x, y).]
Generating Multivariate Normal
Fact
If X ∼ Np(µ, Σ) then there exists a non-singular T such that TΣT′ = Ip, i.e. T(X − µ) ∼ Np(0, Ip).
- Such a T is found from the Cholesky decomposition of Σ.
multinorm=function(mean,Sigma)
{
  p=length(mean)
  T=chol(Sigma)        # upper triangular U with t(U) %*% U = Sigma
  y=rnorm(p)           # y ~ N_p(0, I_p)
  x=mean+(t(T)%*%y)    # then Cov(x) = t(U) %*% U = Sigma
  return(x)
}
mean=c(10,9,11)
S=matrix(c(4,3,2,3,10,6,2,6,16),nrow=3)
multinorm(mean,S)
[,1]
[1,] 10.606520
[2,] 9.360652
[3,] 10.948511
Working with inbuilt R functions
- R has inbuilt functions for the normal density, c.d.f., quantiles and random generation.
- We can also compute the normal quantiles zα.
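The commands behind the printed outputs were lost in extraction. Below is a minimal sketch of the built-in d/p/q/r family for the normal distribution; the exact calls on the original slide are unknown.
dnorm(0)              # density of N(0,1) at 0: 0.3989423
pnorm(0)              # P(Z <= 0) = 0.5
qnorm(0.975)          # normal quantile: 1.959964
rnorm(3)              # three N(0,1) draws
pnorm(2,mean=1,sd=2)  # arbitrary mean and sd are supported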
Plotting the normal density
x=seq(-3,3,by=0.01); y=dnorm(x)
plot(x,y,type="l",main="Density of N(0,1)",
ylab=expression(phi(x)))
[Figure: the standard normal density φ(x) plotted on (−3, 3), titled "Density of N(0,1)".]
Other Standard Distributions in R
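The body of this slide did not survive extraction. As a placeholder, here is a sketch of the corresponding generators for a few standard distributions (the particular calls are assumptions, not recovered slide content):
rbinom(5,size=10,prob=0.3)   # Binomial(10, 0.3)
rpois(5,lambda=4)            # Poisson(4)
rexp(5,rate=2)               # Exponential with rate 2
rgamma(5,shape=3,rate=1)     # Gamma(3, 1)
rbeta(5,shape1=2,shape2=5)   # Beta(2, 5)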
Acceptance-Rejection Algorithm
- Suppose we want to sample from a density f using a density g that we can already sample from, with f(x) ≤ a g(x) for all x. Draw Y ∼ g and U ∼ U(0, 1), and accept Y when U ≤ f(Y)/(a g(Y)); let B denote the acceptance event.
- Now
P(B) = P( U ≤ f(Y)/(a g(Y)) ) = ∫_{−∞}^{∞} P( U ≤ f(y)/(a g(y)) ) g(y) dy
     = ∫_{−∞}^{∞} (f(y)/(a g(y))) g(y) dy = ∫_{−∞}^{∞} f(y)/a dy = 1/a.
- Further,
P((Y ≤ x) ∩ B) = ∫_{−∞}^{∞} P((Y ≤ x) ∩ B | Y = y) g(y) dy
             = ∫_{−∞}^{∞} P( (Y ≤ x) ∩ {U ≤ f(Y)/(a g(Y))} | Y = y ) g(y) dy
             = ∫_{−∞}^{x} P( U ≤ f(y)/(a g(y)) ) g(y) dy
             = ∫_{−∞}^{x} (f(y)/(a g(y))) g(y) dy = FX(x)/a.
- Hence P(Y ≤ x | B) = P((Y ≤ x) ∩ B)/P(B) = FX(x): an accepted Y has exactly the target distribution.
Example
To generate X with density f(x) = 60x³(1 − x)², 0 ≤ x ≤ 1, take g to be the U(0, 1) density. The algorithm becomes:
- generate Y ∼ U(0, 1)
- generate U ∼ U(0, 1)
- while U > 20Y³(1 − Y)²:
  - generate Y
  - generate U
- set X = Y
Efficiency of the Acceptance-Rejection Algorithm
- As before, let B be the event that Y has been accepted in a loop, i.e. U ≤ f(Y)/(a g(Y)).
- Since P(B) = 1/a, the number of loops until acceptance is Geometric(1/a), so the algorithm needs a iterations on average per accepted value; a smaller a means a more efficient sampler.
Example (Contd.)
- Our constant a had to satisfy 60x³(1 − x)² ≤ a for all x ∈ [0, 1].
- Thus we take a = 3, which conveniently gives f(y)/(a g(y)) = 20y³(1 − y)², matching the algorithm above.
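A minimal sketch (not from the original slides) of this acceptance-rejection sampler in R:
# acceptance-rejection for f(x) = 60x^3(1-x)^2 on [0,1] with g = U(0,1), a = 3,
# so that f(y)/(a g(y)) = 20 y^3 (1-y)^2
beta_ar=function()
{
  repeat
  {
    Y=runif(1)
    U=runif(1)
    if (U <= 20*Y^3*(1-Y)^2) return(Y)   # accept Y
  }
}
X=replicate(10000, beta_ar())
hist(X,prob=T)
curve(dbeta(x,4,3),add=T,col="red")      # f is the Beta(4,3) density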
Mixture Distributions
- Suppose the target c.d.f. is a mixture, FX(x) = Σj πj Fj(x), with weights πj ≥ 0 summing to 1.
- For example, consider X ∼ Hyperexponential(λ1, α1, λ2, α2, ..., λn, αn), so that
f(x) = Σ_{j=1}^{n} αj λj e^{−λj x},
where λi, αi ≥ 0 and Σ_{i=1}^{n} αi = 1. Here αi = 0 for i > n.
- We can therefore generate X by composition: draw a component index I with P(I = j) = πj, then draw YI from FI and set X = YI. We have
P(X ≤ x) = Σ_{j=1}^{∞} P(X ≤ x | I = j) P(I = j)
        = Σ_{j=1}^{∞} P(Yj ≤ x) P(I = j)
        = Σ_{j=1}^{∞} Fj(x) πj = FX(x).
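The composition step is easy to code. Below is a minimal sketch (not from the original slides) for the hyperexponential case, with assumed parameter vectors alphas and lambdas:
hyperexp_sample=function(alphas,lambdas)
{
  # composition: first pick component I with P(I = j) = alpha_j,
  # then draw from the corresponding Exp(lambda_j)
  j=sample(length(alphas),1,prob=alphas)
  return(rexp(1,rate=lambdas[j]))
}
X=replicate(10000, hyperexp_sample(c(0.3,0.7), c(1,5)))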
Example (Contd.)
n <- 5000
x1 <- rnorm(n, 0, 1)
x2 <- rnorm(n, 3, 1)
u <- runif(n)
k <- as.integer(u > 0.5)  #vector of 0's and 1's
x <- k * x1 + (1-k) * x2
#plot the density of the mixture
#with the densities of the components
plot(density(x), xlim=c(-7,7), ylim=c(0,.5),
lwd=3, xlab="x", main="")
curve(dnorm(x,0,1),from=-7,to=7,n=5000,add=T)
curve(dnorm(x,3,1),from=-7,to=7,n=5000,add=T)
Mixture of Normal II
[Figure: estimated density of the equal mixture of N(0,1) and N(3,1) (thick curve) with the two component densities overlaid; the mixture is bimodal.]
Finding Probabilities
Fact
The probability of any event A can be interpreted as the long-term relative frequency of the event A, i.e.
P(A) = lim_{n→∞} (number of repetitions resulting in A) / (total number of repetitions n).
- Hence for computing the probability of any event A by simulation, we simulate a large number n of cases and count the number of times the event A has occurred. If this number is m, then the probability of the event A can be approximated by m/n.
Drawing a Card
pic=NULL
for ( i in 1:100000)
{
  x=sample(52,size=1)            # draw one card from the deck
  if (x%%13 %in% c(11,12,0))     # J, Q, K occupy these residues mod 13
    pic=c(pic,1)
}
sum(pic)/100000                  # estimate of P(picture card) = 12/52
[1] 0.23035
Divisibility Test
count=0
for (i in 1:10000)
{
  num=sample(1000,1)             # random integer from 1 to 1000
  if(num%%3==0 || num%%5==0 || num%%6==0) count=count+1
}
count/10000                      # estimate of P(divisible by 3, 5 or 6)
[1] 0.4616
Urn-Ball Problem
- Suppose an urn contains 7 white and 5 black balls. 3 balls are chosen at random without replacement. Find the probability that
  - all the 3 balls are white.
  - 2 are white and 1 is black.
count1=0; count2=0;
balls=c(rep("W",7),rep("B",5))         # 7 white and 5 black balls
for ( i in 1:10000)
{
  chosen=sample(balls,3)               # draw 3 balls without replacement
  if (all(chosen=="W")) count1=count1+1
  if (sum(chosen=="W")==2) count2=count2+1
}
count1/10000; count2/10000;
[1] 0.1596
[1] 0.4799
Birthday Problem
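The simulation code for this slide was lost in extraction; below is one plausible reconstruction. The group size k = 25 is an assumption, chosen because the printed estimate matches P(shared birthday) ≈ 0.569 for 25 people.
k=25                                   # assumed group size (not on the surviving slide)
match=0
for (i in 1:10000)
{
  bdays=sample(365,k,replace=T)        # birthdays, ignoring leap years
  if (any(duplicated(bdays))) match=match+1
}
match/10000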
[1] 0.5684
Card Shuffling
cut=function(deck)
{
  x=rbinom(1,52,0.5)                # choose a random cut point near the middle
  temp=c(deck[(x+1):52],deck[1:x])  # put the bottom part on top
  return(temp)
}
cut(1:52)
[1] 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
[26] 52 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[51] 25 26
Riffle Shuffle
- A much more reliable way is the riffle shuffle (also known as the Faro shuffle or dovetail shuffle).
- First split the deck into two parts just as in the cut method. Take the top half in your left hand and the other half in your right. Release the cards randomly from both hands.
- Mathematically, if at any stage there are a cards in your left hand and b cards in your right, then the next card comes from the left hand with probability a/(a+b) and from the right with probability b/(a+b).
Simulating Riffle Shuffle
riffle=function(deck)
{
  n=length(deck)
  x=rbinom(1,52,0.5)            # split point, as in the cut
  left=deck[1:x]; right=deck[(x+1):52];
  k=0; a=length(left); b=length(right); tab=NULL;
  for(drop in 1:52)
  {
    ind=rbinom(1,1,a/(a+b))     # next card from left with prob a/(a+b)
    if (ind==1)
    {
      tab[k+1] = left[a]        # drop the bottom card of the left half
      left = left[1:(a-1)]
      k = k + 1; a = a - 1;
    }
    else
    {
      tab[k+1] = right[b]       # drop the bottom card of the right half
      right = right[1:(b-1)]
      k = k + 1; b = b - 1;
    }
  }
  return(tab)
}
Simulating Riffle Shuffle
riffle(1:52)
[1] 29 28 52 51 50 27 26 49 25 48 24 47 23 22 21 46 45 20 44 43 42 41 40 39 19
[26] 38 37 18 36 17 16 15 14 13 35 12 11 10 9 8 7 6 34 33 5 4 3 32 2 31
[51] 1 30
Comparing the techniques
- Consider two consecutive cards (say the 23rd and 24th cards) in a pack of cards. If in the original unshuffled pack the 23rd card is followed by the 24th card, then what is the probability that the 24th card follows the 23rd one even in the shuffled deck?
- If the deck is well shuffled, then this probability should be 1/52.
- We now shuffle our virtual deck 8 times using the above two methods and compute the above probability.
Simulating the probability
count1=0; count2=0;
for(simul in 1:1000)
{
  deck1= 1:52; deck2=1:52;
  for( i in 1:8)                # shuffle each deck 8 times
  {
    deck1=cut(deck1)
    deck2=riffle(deck2)
  }
  for(j in 1:51)                # does card 24 still follow card 23?
  {
    if(deck1[j]==23 & deck1[j+1]==24)
      count1= count1 + 1
    if(deck2[j]==23 & deck2[j+1]==24)
      count2= count2 + 1
  }
}
print(count1/1000); print(count2/1000);
[1] 0.996
[1] 0.017
- After 8 cuts the pair is still adjacent almost always (a cut only rotates the deck), while after 8 riffles the probability is close to the well-shuffled value 1/52 ≈ 0.019.
Central Limit Theorem
Theorem
(iid case) Let X1, X2, ..., Xn be iid random variables with mean µ and variance σ² < ∞, and let Sn = X1 + X2 + ... + Xn. Then
(Sn − E(Sn)) / √Var(Sn) → N(0, 1)
in law as n → ∞.
Central Limit Theorem I
For U1, U2, ..., Un iid U(0, 1),
Zn = (U1 + U2 + ... + Un − n/2) / √(n/12) → N(0, 1)
in distribution as n → ∞.
n<-100
k<-10000
U<-runif(n*k)
M<-matrix(U,n,k)          # k samples of size n, one per column
X<-apply(M,2,'sum')       # column sums S_n
Z<-(X-n/2)/sqrt(n/12)     # standardized sums
par(mfrow=c(1,2))
hist(Z)
qqnorm(Z)
qqline(Z,col="red")
Central Limit Theorem II
[Figure: histogram of Z (left) and normal Q-Q plot (right); the standardized uniform sums track the N(0, 1) reference line closely.]
Cauchy Distribution I
n<-100
k<-10000
U<-rcauchy(n*k)           # Cauchy has no finite mean or variance
M<-matrix(U,n,k)
X<-apply(M,2,'sum')
Z<-(X-n/2)/sqrt(n/12)     # the same (now meaningless) standardization
par(mfrow=c(1,2))
hist(Z)
qqnorm(Z)
qqline(Z,col="red")
Cauchy Distribution II
[Figure: histogram of Z (left) and normal Q-Q plot (right) for Cauchy sums; the extreme range and heavy tails show that the CLT does not apply here.]
Laws of large numbers
- The weak law of large numbers says that for any ε > 0 the sequence of probabilities
P(|Sn/n − µ| < ε) → 1 as n → ∞.
- Consider i.i.d. coin flips, that is, Bernoulli trials with p = µ = 1/2.
- We find the probability P(|Sn/n − µ| < ε) in R and illustrate the limiting behavior, with ε = 0.01.
Plotting the Probability I
wlln<-function(n,eps,p)
{
  # P(np - n*eps < S_n <= np + n*eps) for S_n ~ Bin(n, p)
  pbinom(n*p+n*eps,n,p)-pbinom(n*p-n*eps,n,p)
}
prob=NULL
for (n in 1:10000)
  prob[n]=wlln(n,0.01,0.5)
plot(prob,type="l",xlab="n",ylab="probability")
Plotting the Probability II
[Figure: P(|Sn/n − 1/2| < 0.01) plotted against n; the probability climbs towards 1 as n grows.]
Strong Law of large numbers
- Consider i.i.d. coin flips, that is, Bernoulli trials with p = µ = 1/2.
slln<-function(n,p)
{
  x=rbinom(1,size=n,prob=p)   # S_n, the number of heads in n flips
  return(x)
}
value=NULL
for( n in 1:10000)
  value[n]=slln(n,0.5)/n      # sample mean S_n/n
plot(value,type="l",xlab="n",ylab="sample mean")
abline(h=0.5)
Illustrating Strong Law II
[Figure: sample mean plotted against n; the path converges to the horizontal line at 0.5.]
Using Simulation to construct Tests
- Suppose we have a sample of size 101 from a Beta(5, b) distribution and want to test H0 : b = 5.
- To get an idea about the nature of the test we plot the density function for different values of b.
Plot Of Beta Densities I
par(mfrow=c(1,3))
x=seq(0,1,by=0.01)
y1=dbeta(x,shape1=5,shape2=2)
y2=dbeta(x,shape1=5,shape2=5)
y3=dbeta(x,shape1=5,shape2=10)
plot(x,y1,type="l",xlab="b<5")
abline(v=median(rbeta(101,shape1=5,shape2=2)),col="red")  # sample median, n=101
plot(x,y2,type="l",xlab="b=5")
abline(v=median(rbeta(101,shape1=5,shape2=5)),col="red")
plot(x,y3,type="l",xlab="b>5")
abline(v=median(rbeta(101,shape1=5,shape2=10)),col="red")
Plot Of Beta Densities II
[Figure: the three Beta densities for b < 5, b = 5 and b > 5, each with a simulated sample median marked in red; the median moves left as b increases.]
What type of test shall we perform?
- From the figures we see that the sample median can be used as a test statistic: the median decreases as b increases, so a one-sided cut-off on the sample median separates the hypotheses.
Plot P_H0(me > c) for different c I
set.seed(100); prob=NULL; j=1;
C=seq(0.2,0.9,by=0.001)
for ( c in C)
{
  prob[j]=0
  for(i in 1:100)
  {
    x=rbeta(101,shape1=5,shape2=5)   # a sample of size 101 under H0 : b = 5
    me=median(x)
    if(me>c) prob[j]=prob[j]+1
  }
  prob[j]=prob[j]/100                # estimate of P_H0(me > c)
  j=j+1
}
plot(C,prob,type="l"); abline(h=0.9,col="red");
Plot P_H0(me > c) for different c II
[Figure: estimated P_H0(me > c) against c, decreasing from 1 to 0, with a horizontal red line at 0.9.]
Now let's find the c
We continue to search for the c for which P_H0(me > c) is closest to 0.9, as sketched below.
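The search code did not survive extraction; a minimal sketch using the C and prob vectors computed above:
best=which.min(abs(prob-0.9))   # grid point closest to the target 0.9
C[best]                         # the chosen cut-off c
prob[best]                      # its estimated probability under H0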
Application of Monte Carlo
- To approximate θ = E(f(X)) we use the Monte Carlo estimator
Zn^MC = (1/n) Σ_{i=1}^{n} f(Xi),
where X1, X2, ... are an iid sequence of random variables with the same distribution as X. By the strong law of large numbers, Zn^MC → E(f(X)) almost surely.
Another Application: Monte Carlo Integration
- Consider the integral ∫_a^b f(x) dx.
- Let X ∼ U(a, b), with density g(x) = 1/(b − a) on [a, b]. Then
∫_a^b f(x) dx = (b − a) ∫_a^b f(x) g(x) dx = (b − a) E(f(X)) ≈ ((b − a)/N) Σ_{j=1}^{N} f(Xj)
for large N, where X1, ..., XN are iid U(a, b).
Integration
- Evaluate the integral ∫_0^{2π} e^{cos(x)} dx.
N=1000
x=runif(N,min=0,max=(2*pi))   # X_j ~ U(0, 2*pi)
value=sum(exp(cos(x)))
value=(2*pi)*value/N          # (b - a) times the mean of f(X_j)
value
[1] 8.188166
Family Planning
count=NULL
for ( i in 1:1000)
{
  child=sample(0:1,1)              # first child: 0 = girl, 1 = boy
  while(length(unique(child))<2)   # continue until both sexes appear
    child=c(child,sample(0:1,1))
  count[i]=length(child)           # number of children
}
mean(count)                        # true expected value is 3
[1] 3.008
Example
- Suppose we want to estimate Φ(t) = P(X ≤ t), where X ∼ N(0, 1), for a fixed t.
- We can do that in several ways:
# setup (x recovered from the table below; m = 10000 is an assumed replication count)
m <- 10000
x <- seq(0.1, 2.5, length = 10)
method2 <- method3 <- numeric(length(x))
# Method 1: hit-or-miss, the proportion of a normal sample below each t
z <- rnorm(m)
dim(x) <- length(x)
method1 <- apply(x, MARGIN = 1,
  FUN = function(x, z) {mean(z < x)}, z = z)
# Method 2: average exp(-u^2/2) for u ~ U(0, x[i])
for (i in 1:length(x)) {
  u=runif(m,min=0,max=x[i])
  g= exp(-(u^2)/2)
  method2[i]=mean(g)*x[i] / sqrt(2 * pi) + 0.5
}
# Method 3: the substitution u -> u*x[i], with u ~ U(0, 1)
u <- runif(m)
for (i in 1:length(x)) {
  g <- x[i] * exp(-(u * x[i])^2 / 2)
  method3[i] <- mean(g) / sqrt(2 * pi) + 0.5
}
# Method 4: the built-in c.d.f.
method4=pnorm(x)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x 0.100 0.367 0.633 0.900 1.167 1.433 1.700 1.967 2.233 2.500
method1 0.542 0.643 0.738 0.818 0.880 0.925 0.955 0.976 0.987 0.992
method2 0.540 0.643 0.737 0.816 0.879 0.922 0.958 0.975 0.984 0.996
method3 0.540 0.643 0.737 0.816 0.878 0.924 0.956 0.975 0.987 0.993
method4 0.540 0.643 0.737 0.816 0.878 0.924 0.955 0.975 0.987 0.994
Bias and Variance
bias(Zn^MC) = 0
and
MSE(Zn^MC) = Var(Zn^MC) = (1/n) Var(f(X)).
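These formulas are easy to check by simulation. A minimal sketch, where f(x) = x², X ∼ U(0, 1) and the replication counts are arbitrary choices (not from the slides):
f=function(x) x^2
n=100
Z=replicate(10000, mean(f(runif(n))))   # 10000 realizations of Z_n^MC
mean(Z)                                 # close to E f(X) = 1/3, so bias ~ 0
var(Z)                                  # close to Var(f(X))/n = (4/45)/100
var(f(runif(1e6)))/n                    # direct estimate of Var(f(X))/n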
Mean or Median
n.sim=1000; n.obs=50;
m=NULL; me=NULL;
for (i in 1:n.sim)
{
  x=rnorm(n.obs)                          # a N(0,1) sample of size 50
  m[i]=mean(x); me[i]=median(x);
}
se.m= sqrt(mean(m^2)-((mean(m))^2))       # simulated s.e. of the mean
se.me= sqrt(mean(me^2)-((mean(me))^2))    # simulated s.e. of the median
mean(m); mean(me);
[1] -0.004079595
[1] -0.009236697
se.m; se.me;
[1] 0.1412464
[1] 0.1722919
- For normal data the mean has the smaller standard error.
For t-distribution
n.sim=1000; n.obs=50;
m=NULL; me=NULL;
for (i in 1:n.sim)
{
  x=rt(n.obs,df=2)                        # heavy-tailed t with 2 d.f.
  m[i]=mean(x); me[i]=median(x);
}
se.m= sqrt(mean(m^2)-((mean(m))^2))
se.me= sqrt(mean(me^2)-((mean(me))^2))
mean(m); mean(me);
[1] 0.01256094
[1] 0.006310878
se.m; se.me;
[1] 0.5231233
[1] 0.202407
- For the heavy-tailed t2 distribution the median is far more stable than the mean.
Level α test
n=100
m=500
x=NULL
p=0
for(i in 1:m)
{
  x=rnorm(n)                               # data generated under H0 : mu = 0
  if(mean(x)>qnorm(0.95)/sqrt(n)) p=p+1    # one-sided z-test at level 0.05
}
p/m                                        # empirical size, close to 0.05
[1] 0.048
Power of Test
options(warn=-1)
require(MASS,quiet=T)   # for mvrnorm
n=50; m=1000; p=seq(0,0.6,0.01); prob=NULL;
for( i in 1:length(p))
{
  s=0
  for(j in 1:m)
  {
    u=mvrnorm(n,c(50,65),
      matrix(c(4,p[i]*2*3,p[i]*2*3,9),nrow=2))  # bivariate normal, Corr = p[i]
    r=cor(u)[1,2]; t=r*sqrt((n-2)/(1-r^2));     # test statistic for H0 : rho = 0
    if(t>qt(0.95,n-2)) s=s+1                    # reject at level 0.05
  }
  prob[i]=s/m                                   # empirical power at rho = p[i]
}
plot(p,prob,type="l",xlab=expression(rho),main="Power Curve")
Power Curve in R II
[Figure: estimated power of the test against ρ, rising from about 0.05 at ρ = 0 towards 1 as ρ increases.]