FIM 548, Lecture #12
Quasi Monte Carlo
Chapter 5 in Glasserman and Chapter 15 of Owen
Lecture and Codes by Andrew Papanicolaou
Slides originally by Y. Kitapbayev
Feb 26, 2023
Integration Using Monte Carlo
►
So far we have focused on computing
\theta = \int_0^1 \psi(x)\, dx .
►
Monte Carlo integration can be especially useful
for estimating high-dimensional integrals.
►
Suppose we want to estimate
∫1∫1
0 0
θ ψ(x, y )dxdy
►
Monte Carlo is easy because we just need to sample
uniformly on [0, 1) × [0, 1),
\hat\theta = \frac{1}{N}\sum_{i=1}^{N} \psi(X_i, Y_i) .
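A minimal sketch of this two-dimensional estimator, with an illustrative integrand ψ(x, y) = cos(x + y) (the lecture does not specify one):
% Plain Monte Carlo estimate of a 2-d integral over [0,1) x [0,1)
N = 1e5;
X = rand(N,1);  Y = rand(N,1);       % uniform samples on [0,1) x [0,1)
theta_hat = mean(cos(X + Y));        % (1/N) * sum of psi(X_i, Y_i)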
Quasi Monte Carlo
►
We already saw that standard MC methods have a convergence rate of O(1/\sqrt{N}).
►
To accelerate convergence, one can apply so-called quasi Monte Carlo (QMC) methods, also known as low-discrepancy methods.
►
These methods seek to increase accuracy by generating points
that are too evenly distributed to be random.
►
Due to their deterministic nature, standard statistical error analysis does not apply to QMC methods.
►
In this lecture we focus on d-dimensional integrals,
\theta = \mathbb{E}\,\psi(U) ,
where U ∼ uniform[0, 1)^d. Using inverse CDFs, the approach extends to a great many other d-dimensional distributions.
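A minimal sketch of the inverse-CDF step, with exponential(1) and standard normal marginals as illustrative targets (norminv requires the Statistics Toolbox):
% Map uniform[0,1)^2 points to other distributions componentwise
U = rand(1e4, 2);            % could equally be QMC points in [0,1)^2
E = -log(1 - U(:,1));        % exponential(1) via its inverse CDF
Z = norminv(U(:,2));         % standard normal via its inverse CDF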
Integration Lattices
►
We are already familiar with basic QMC.
►
The congruential method for pseudo-random numbers:
X_{i+1} = \mathrm{mod}(a X_i,\, m) , \qquad U_{i+1} = X_{i+1}/m ,
with m = 2^{31} and a = 16807. This generates a scalar sequence of pseudo-randoms on (0, 1).
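A minimal sketch of this recursion (the seed 12345 is an arbitrary illustrative choice):
% Congruential generator with m = 2^31, a = 16807
m = 2^31;  a = 16807;
n = 1e4;
X = zeros(n,1);  X(1) = 12345;       % arbitrary seed
for i = 1:n-1
    X(i+1) = mod(a*X(i), m);         % exact in double precision (a*X < 2^53)
end
U = X/m;                             % pseudo-random uniforms on (0,1)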
►
A similar idea, but with points that are perhaps more uniformly separated, is the rank-1 lattice,
X_i = \mathrm{mod}(i\, V_1,\, 1) ,
where V_1 ∈ R^d. For example, the Fibonacci lattice in d = 2 takes V_1 = \left(1,\ \tfrac{1+\sqrt{5}}{2}\right).
Lattices
Figure: Example point sets in [0, 1)^2: congruential, grid, rank-1 lattice, Hammersley, Latin hypercube, and stratified random.
A Simple QMC Example
►
Once we have our QMC points in [0, 1)d , we only need to
plug them into our random number generator in place of
our pseudo-random uniforms.
►
For example, we can use an integration lattice in d = 2 to
estimate a function of a jointly-normal pair.
►
For example,
\mathbb{E}\cos(\|X\|) \approx \frac{1}{N}\sum_{i=1}^{N} \cos\!\left(\sqrt{X_i^2 + Y_i^2}\right) ,
where (X_i, Y_i) are given by Box-Muller with points from an integration lattice.
Example: \theta = \frac{1}{2\pi}\int_{\mathbb{R}^2} \cos(\|x\|)\, e^{-\|x\|^2/2}\, dx
%% Simple QMC Example
n = 2^8;
phi = (1 + sqrt(5))/2;                          % golden ratio for the Fibonacci lattice
v1 = [1/(n+1); phi];                            % rank-1 generating vector
U_qmc = mod(v1*(1:n), 1);                       % 2 x n lattice points in [0,1)^2
X = sqrt(-2*log(U_qmc(1,:))).*cos(2*pi*U_qmc(2,:));   % Box-Muller
Y = sqrt(-2*log(U_qmc(1,:))).*sin(2*pi*U_qmc(2,:));
theta_qmc = mean(cos(sqrt(X.^2 + Y.^2)));       % QMC estimate
theta_mc  = mean(cos(sqrt(sum(randn(2,n).^2, 1))));   % plain MC estimate for comparison
Discrepancy
►
Let us assume that d is the dimension of a problem, and
we want to estimate the integral of ψ over [0, 1)d .
►
Now we aim to fill the cube uniformly using
some deterministic rule.
►
There are a couple of definitions of discrepancy; we use the following.
►
We define the discrepancy of a point set \{x_1, \dots, x_n\} as
D(x_1, \dots, x_n) = \sup_{a \in [0,1)^d} \left| \frac{\#\{x_i \in [0,a)\}}{n} - \big|[0,a)\big| \right| ,
where [0, a) = [0, a_1) \times \cdots \times [0, a_d), \#\{x_i \in [0,a)\} denotes the number of x_i contained in [0, a), and \big|[0,a)\big| denotes the measure or volume of the interval.
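A rough numerical illustration of this definition in d = 2: approximate the supremum over boxes [0, a) by scanning a grid of a-values (an illustrative approximation only, not an exact star-discrepancy computation).
% Approximate discrepancy of a 2-d point set by scanning boxes [0,a1) x [0,a2)
x = rand(2, 256);                               % point set (columns are points)
n = size(x, 2);
agrid = linspace(0.01, 1, 100);
D = 0;
for a1 = agrid
    for a2 = agrid
        frac = mean(x(1,:) < a1 & x(2,:) < a2); % #{x_i in [0,a)} / n
        vol  = a1*a2;                           % |[0,a)|
        D = max(D, abs(frac - vol));
    end
end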
Low Discrepancy
►
For U_i ∼ uniform[0, 1)^d it is known that
\limsup_{n\to\infty} \frac{\sqrt{2n}\, D(U_1, U_2, \dots, U_n)}{\sqrt{\log\log n}} = 1 .
►
It is known that deterministic low-discrepancy sets can have D(u_1, u_2, \dots, u_n) that is O\!\left(\frac{\log(n)^d}{n}\right).
►
Integration error when using a low-discrepancy set is quantified with the Koksma-Hlawka inequality,
|\hat\theta - \theta| \le V_{HK}(\psi)\, D(u_1, u_2, \dots, u_n) ,
where V_{HK}(\psi) is the Hardy-Krause total variation.
►
Examples of sequences with low discrepancy include the Halton sequence, Sobol sequence, Faure sequence, and Niederreiter sequence.
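In MATLAB (Statistics and Machine Learning Toolbox), Halton and Sobol points can be generated with haltonset and sobolset; a minimal sketch:
% First 256 Halton and Sobol points in [0,1)^2
ph = haltonset(2);  Xh = net(ph, 256);
ps = sobolset(2);   Xs = net(ps, 256);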
Digital Nets With Low Discrepancy
Figure: Sobol, scrambled Sobol, Halton, and scrambled Halton point sets in [0, 1)^2. Digital nets don’t have the periodicity of lattices. Integration lattices are good for integrating smooth functions with smooth periodic features.
Randomized QMC (RQMC)
►
Randomized digital nets offer a way to take advantage of
low-discrepancy sets and avoid some of the pitfalls in
basic QMC.
►
The Matousek-Owen scramble is commonly used with
the Sobol set.
►
Reverse-Radix scrambling is used for the Halton set.
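A minimal sketch of both scrambles using MATLAB's scramble function ('MatousekAffineOwen' for Sobol, 'RR2' reverse-radix for Halton):
% Scrambled Sobol and Halton point sets in [0,1)^2
ps = scramble(sobolset(2),  'MatousekAffineOwen');
ph = scramble(haltonset(2), 'RR2');
Xs = net(ps, 256);    % scrambled Sobol points
Xh = net(ph, 256);    % scrambled Halton points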
Example: Scrambled Sobol Set to Generate
Brownian Motion
%% QMC Sim Brownian Motion
T = 3/12;                                   % 3 months
dt = 1/365;                                 % daily steps
N = round(T/dt);                            % number of increments
Rho = -.7;                                  % correlation between the two BMs
dW = zeros(N, 2);
Sbl = sobolset(2, 'Skip', 1e3, 'Leap', 1e2);
p = scramble(Sbl, 'MatousekAffineOwen');    % Matousek-Owen scramble
p = net(p, N)';                             % 2 x N scrambled Sobol points
U1 = p(1,:);                                % first coordinate of each point
U2 = p(2,:);                                % second coordinate of each point
dW(:,1) = sqrt(-2*log(U1)).*cos(2*pi*U2)*sqrt(dt);   % Box-Muller increments
dW(:,2) = sqrt(-2*log(U1)).*sin(2*pi*U2)*sqrt(dt);
dW(:,2) = Rho*dW(:,1) + sqrt(1-Rho^2)*dW(:,2);       % correlate the increments
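With the increments above, the correlated Brownian paths follow by cumulative summation:
W = cumsum(dW, 1);    % N x 2: path values at times dt, 2*dt, ..., T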
Example: Heston Model Call
Figure: Heston call implied volatility (T = 3 months) from MC, QMC, RQMC, and quadrature. Using QMC and RQMC to estimate the Heston call price, and then comparing implied vols.
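The figure shows only results. Below is a minimal sketch of how an RQMC Heston price of this kind could be produced, reusing the scrambled-Sobol/Box-Muller construction from the previous slide; the market and Heston parameters (S0, K, r, kappa, theta_v, xi, v0, M) are hypothetical placeholders, not the values behind the figure.
% Illustrative RQMC pricing of a Heston call (parameters are hypothetical)
T = 3/12;  dt = 1/365;  N = round(T/dt);  Rho = -0.7;   % as on the previous slide
S0 = 100;  K = 100;  r = 0.02;                          % hypothetical market inputs
kappa = 3;  theta_v = 0.04;  xi = 0.5;  v0 = 0.04;      % hypothetical Heston parameters
M = 2^12;                                               % number of paths
Sbl = scramble(sobolset(2*N, 'Skip', 1e3), 'MatousekAffineOwen');
P = net(Sbl, M);                                        % M x 2N scrambled Sobol points
U1 = P(:, 1:N);  U2 = P(:, N+1:2*N);
Z1 = sqrt(-2*log(U1)).*cos(2*pi*U2);                    % Box-Muller normals, M x N
Z2 = sqrt(-2*log(U1)).*sin(2*pi*U2);
Z2 = Rho*Z1 + sqrt(1 - Rho^2)*Z2;                       % correlate the two drivers
S = S0*ones(M,1);  v = v0*ones(M,1);
for k = 1:N                                             % simple Euler scheme, variance floored at 0
    S = S.*exp((r - v/2)*dt + sqrt(v*dt).*Z1(:,k));
    v = max(v + kappa*(theta_v - v)*dt + xi*sqrt(v*dt).*Z2(:,k), 0);
end
call_rqmc = exp(-r*T)*mean(max(S - K, 0));              % discounted payoff average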
Copulas
►
Copulas are the CDFs of marginally uniform[0, 1) random
variables.
►
For X ∈ R^d, let X^l denote the l-th element.
►
Let F_l be the (marginal) CDF of X^l.
►
We have X^l = F_l^{-1}(U_l), where U_l is a marginally uniform[0, 1) random variable and is the l-th element of U ∈ [0, 1)^d.
►
The CDF of U is the copula C.
►
Gaussian copula:
C(u_1, \dots, u_d) = \sqrt{\frac{(2\pi)^{-d}}{|\rho|}} \int_{-\infty}^{\Phi^{-1}(u_1)} \cdots \int_{-\infty}^{\Phi^{-1}(u_d)} e^{-\frac{1}{2} x^{*} \rho^{-1} x}\, dx ,
where ρ ∈ R^{d×d} is a correlation matrix and |ρ| = det(ρ).
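A minimal sketch of sampling from a Gaussian copula (the 3 × 3 correlation matrix is an illustrative choice; normcdf requires the Statistics Toolbox):
% Sample n points from a Gaussian copula with correlation matrix rho
rho = [1 .5 .2; .5 1 .3; .2 .3 1];
n = 1e4;
Z = randn(n, 3)*chol(rho);     % rows are correlated standard normals
U = normcdf(Z);                % rows are samples from the Gaussian copula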
►
Archimedean copula:
C(u_1, \dots, u_d) = \phi^{-1}\big(\phi(u_1) + \cdots + \phi(u_d)\big) ,
where φ is the copula generator.
Example: Large Portfolio of Co-Dependent
Claims
►
Let us consider an insurance example.
►
Individual losses have a standard normal distribution (i.e., the losses are marginally standard normal).
►
U^l = Φ(X^l) are jointly dependent through a Clayton copula, with generator
\phi(t) = \frac{1}{\nu}\left(t^{-\nu} - 1\right) ,
where ν > 0 is the parameter.
►
A claim is made if X^l ≥ 3.
►
We let the portfolio have l = 1, 2, . . . , 100 (i.e., 100 policies).
►
We want to gain an understanding of the magnitude of
the total claims given that at least one claim is made.
►
For example, think about fire insurance for homeowners in
California.
Marshall-Olkin Algorithm
►
We can sample from the copula using the Marshall-Olkin algorithm:
1. V ∼ F, where F is the distribution with Laplace-Stieltjes transform LS_F = φ^{-1}
2. Ũ ∼ uniform[0, 1)^d
3. Return U, where U_l = φ^{-1}(− log(Ũ_l)/V) for each l.
►
Here LS_F(v) = \int_0^\infty e^{-vt}\, dF(t) is the Laplace-Stieltjes transform of the distribution F. For the Clayton copula the required F has density proportional to v^{1/ν − 1} e^{−v}, i.e., V is gamma distributed with shape 1/ν and scale 1.
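A minimal sketch of the Marshall-Olkin steps for the Clayton copula, assuming the Statistics Toolbox (gamrnd, norminv); the value nu = 0.5 and sample size are illustrative only. It uses the unnormalized generator t^{-ν} − 1, which gives the same copula as the scaled generator above.
% Marshall-Olkin sampling from a Clayton copula, then standard normal margins
nu = 0.5;                              % illustrative Clayton parameter
d = 100;  n = 1e4;                     % 100 policies, n scenarios
V = gamrnd(1/nu, 1, n, 1);             % step 1: V ~ Gamma(shape 1/nu, scale 1)
Utld = rand(n, d);                     % step 2: independent uniforms
U = (1 - log(Utld)./V).^(-1/nu);       % step 3: U_l = phi^{-1}(-log(Utld_l)/V)
X = norminv(U);                        % marginally standard normal losses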
Example: Large Portfolio of Co-Dependent
Insurance
►
We can run this 100-dimensional example with Monte Carlo,
QMC and RQMC, and we’ll see generally comparable results.
►
We can also run the example with a normal approximation to the data, which requires some shrinkage of the covariance matrix's eigenvalues.
►
The reduced error from QMC/RQMC does not pop out at you
in this example, or in most other examples.
►
To see error reduction we will need to run many trials, save
the MC/QMC/RQMC estimators, and then compare
statistics of the aggregate distributions to see which method
has the least variation.
►
We estimate θ = the probability of at least 1 claim, M = the expected number of claims given at least 1 claim, and the expectation of the peaks-over-threshold (POT) distribution given at least 1 claim.
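A rough post-processing sketch of these three quantities, assuming X is an n × d matrix of losses (e.g., from the Clayton sketch above) and interpreting POT as the total size of losses exceeding the threshold 3; the course's exact POT definition may differ.
% Estimate theta, M, and POT from simulated losses X (n scenarios x d policies)
claims = sum(X >= 3, 2);                      % number of claims per scenario
theta  = mean(claims >= 1);                   % P(at least one claim)
M      = mean(claims(claims >= 1));           % E[# claims | >= 1 claim]
tot    = sum(X.*(X >= 3), 2);                 % total size of exceeding losses
pot    = mean(tot(claims >= 1));              % E[total exceedances | >= 1 claim]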
Example: Large Portfolio of Co-Dependent
Insurance
Figure: Histograms of the θ, M, and POT estimators over 500 trials of MC (left) and RQMC (right) for the insurance example using the Clayton copula.
Example: Large Portfolio of Co-Dependent
Insurance
Based on the 500 trials from the previous slide, we have the following statistics:
E[θ_mc] = 0.0409,   sd(θ_mc) = 0.0032
E[θ_qmc] = 0.0409,  sd(θ_qmc) = 0.0017
E[M_mc] = 3.3259,   sd(M_mc) = 0.2896
E[M_qmc] = 3.3202,  sd(M_qmc) = 0.1458
E[pot_mc] = 10.9189,  sd(pot_mc) = 0.9520
E[pot_qmc] = 10.9020, sd(pot_qmc) = 0.4778
As we can see from these statistics, the standard deviation of the RQMC estimators is about half that of the MC estimators.
Summary
►
QMC is good for some problems in multiple dimensions.
►
Implementation is more complicated than standard MC.
►
Issues such as selection of a lattice/net, parameters for it,
and the absence of statistical analysis of error (i.e., no CLT)
make it less straightforward.
►
In 1 and 2 dimensions there are quadrature rules that are better (e.g., Gauss-Lobatto quadrature).
►
The Koksma-Hlawka bound and the possible O(log(n)^d / n) error are appealing, but these results are only theoretical; in practice it requires some know-how to see reduced error.