Computing Volatility Surfaces Using GANs with Minimal Arbitrage Violations

A Preprint

Andrew S. Na∗
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON

Meixin Zhang
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON

Justin W.L. Wan
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON

arXiv:2304.13128v3 [q-fin.CP] 24 Dec 2023

December 27, 2023

Abstract
In this paper, we propose a generative adversarial network (GAN) for efficiently computing volatility
surfaces. Our framework trains a regularized GAN to compute volatility surfaces from time to maturity,
moneyness and at-the-money implied volatility. We show an equivalent formulation between the GAN and
the inverse problem. We add calendar and butterfly spread arbitrage penalty terms to the generator loss
function to minimize the arbitrage violations of the generated volatility surface. We show that a GAN
can speed up the computation of volatility surfaces while minimizing arbitrage violations. Our
experiments show that, with regularization and the discriminator, a shallow generator network gives
accurate results. Compared with other methods, our method can outperform artificial neural network
(ANN) frameworks in terms of errors and computation time. We also show that our framework can generate
market-consistent implied and local volatility surfaces from minimal sample data by using a pre-trained
discriminator and retraining the generator on market data.

Keywords Machine Learning; Implied Volatility; Local Volatility; Calibration; Stochastic Volatility; Heston Model

1 Introduction

The Black-Scholes equation implies that the volatility parameter σ is constant under geometric Brownian motion
(GBM) with no market frictions [Cont and da Fonseca 2002; Lee 2005]. However, it is well documented that the
observed market volatility is not constant [Lee 2005]. Black-Scholes implied volatility is one way to address
this, and market option prices are often quoted in terms of it. The Black-Scholes implied volatility
σimplied (K, T ) is the volatility for which the Black-Scholes price matches the market price. It is routinely
used by risk analysts and traders for pricing derivatives and hedging risks [Cont and da Fonseca 2002]. For
ease of exposition, we refer to the Black-Scholes implied volatility as the implied volatility.
∗ CONTACT Andrew S. Na. Email: [email protected]

Another method to account for the non-constant volatility is to use stochastic volatility models to model asset
prices [Lee 2005]. Stochastic volatility models assume the variance follows a stochastic process which is
coupled with the asset price process. They are attractive because they can successfully model key features of
volatility observed empirically, such as curvature in large maturities. However, stochastic volatility models
are difficult to calibrate due to the large number of parameters. In our paper, we use the Heston model to
simulate asset prices. It is well known that the Heston model cannot replicate market prices for short expiries
when parameters are time-consistent [Gatheral and Lynch 2001b]. This motivated the development of local
volatility models. We let σlocal (S, t) be the local volatility. Local volatility is computed as the solution
to the inverse problem from the Dupire equation. The solution for the local volatility is not unique, which
makes it difficult to verify [Gatheral and Lynch 2001a]. The absence of market data points also makes it
difficult to compute volatility surfaces that are consistent with market prices, and great care must be taken
during estimation [Coleman et al. 2000; Boyle and Thangaraj 2000]. In addition, the lack of data at large and
small strikes makes it more difficult to compute a volatility surface for all maturities and strikes. To avoid
this we use synthetic market data computed from given Heston parameters. We assume the market prices Vmkt
generated from the Heston model are the true market prices. The Heston model was chosen because it has a
closed form solution for European options [Heston 1993].
The implied volatility from the Heston model can be computed using classical methods such as Brent’s method or
Newton’s method at each maturity T and strike K. At the calibrated T and K the implied volatility does not
exhibit arbitrage. However, when we extrapolate or interpolate to another T and K, the no-arbitrage conditions
may be violated. Another challenge posed by stochastic volatility (SV) models such as the Heston model is that
calibrating their parameters is difficult, as the required data may not be available in practice. Currently,
practitioners use parametric models such as stochastic volatility inspired (SVI) to compute local volatility.
The SVI framework has extensions that ensure arbitrage is not violated in the local volatility surface and
that it is consistent with the Heston model in the limits [Gatheral and Jacquier 2011, 2014].
Given market calibrated Heston parameters, the implied volatility computed from the Heston model is arbitrage
free at the calibrated T and K. However, no-arbitrage can be violated when we extrapolate or interpolate
volatility points to other maturities and strikes. The computation of arbitrage free volatility surfaces is
important because an arbitrage opportunity in the option market allows trading strategies that are implemented
at zero cost and provide only upside potential. Such opportunities imply there are price mismatches in the
market which may be exploited and cause a loss to the option writer or holder.
Recently, neural networks and machine learning models have been developed to compute the Black-Scholes implied
volatility in real and synthetic markets [Liu et al. 2019a; Hernandez 2016; Poggio et al. 2017; Spiegeleer et
al. 2018; Dimitroff et al. 2018; Horvath et al. 2021; Liu et al. 2019b; Hirsa et al. 2019]. Methods using
convolutional neural networks (CNNs) have been explored in [Hernandez 2016] and [Dimitroff et al. 2018]. They
showed that CNNs performed well; however, the networks needed to be redesigned for specific models. Deep
artificial neural networks (ANNs) such as the rough volatility method [Horvath et al. 2021], the implied
volatility ANN (IV-ANN) method [Liu et al. 2019b], the calibrating neural network (CaNN) method [Liu et al.
2019a], and the method in [Hirsa et al. 2019] are used to learn the solution of the option pricing function and
construct the implied volatility surface by applying an inverse method or another neural network. The methods
mentioned above do not account for possible arbitrage violations, as generic neural networks give no guarantee
that the no-arbitrage conditions are satisfied. We also note that the inversion network of CaNN is model
dependent, as it uses the parameters of the model it is trying to calibrate as inputs to learn the option price.
Neural network and machine learning models have been used in the computation of local volatility [Itkin 2019;
Chataigner et al. 2021]. These methods typically account for arbitrage violations, as arbitrage violation is a
well known issue of local volatility [Gatheral and Lynch 2001a]. The deep local volatility (DLV) method
introduces a regularization term to the loss function to calibrate the implied volatility over a no-arbitrage
option price surface. The method of [Itkin 2019] proposes an ANN with no-arbitrage constraints on learned
parameters, which guarantees that parameters calibrated on in-sample data adhere to no-arbitrage. A variational
autoencoder (VAE) has been used in extrapolation and interpolation of local volatility surfaces [Bergeron et
al. 2022]. The VAE model was used to complete a local volatility surface where some of the local volatility
points are known. The use of regularization terms to limit arbitrage violations is explored with VAEs using a
similar approach to the DNN [Bergeron et al. 2022]. A GAN model has been used in the calibration of local
stochastic volatility (LSV) model parameters [Cuchiero et al. 2020]. In contrast, in this paper we look at
computing volatility surfaces. The GAN calibration for LSV [Cuchiero et al. 2020] also does not explore the use
of regularization to enforce no-arbitrage conditions. We show later that including the no-arbitrage conditions
as penalty terms has a noticeable effect on the volatility surface geometry. A GAN for generating implied
volatility with a no-arbitrage penalty has been explored [Sidogi et al. 2022]. However, the authors use the
standard GAN loss function, which results in irregular surfaces [Cont and Vuletic 2022]. We show in our
numerical experiment that this results in volatility surfaces that are inconsistent with the market. VolGAN, a
method to generate dynamic implied volatility and covariation from financial time series, has been proposed
[Cont and Vuletic 2022]. The authors also use the standard GAN loss with arbitrage penalties but use the log
volatility to generate the surface, which results in better performance. The inclusion of the mean-squared
error (MSE) term in the loss function has been explored in the Fin-GAN framework [Vuletic et al. 2023].
Computing a volatility surface from option prices often requires us to compute the option price. Though deep
neural networks can successfully learn nonlinear functions, training deep networks requires a lot of data and
can take a long time. In this paper, we propose using a generative adversarial network (GAN) to generate
volatility surfaces from time to maturity, moneyness and at-the-money implied volatility. The generator network
is assisted in training by a discriminator that evaluates whether the generated implied volatility matches the
target distribution or not. We derive a GAN framework from the inverse problem that is consistent with the
target volatility using an MSE loss function [Vuletic et al. 2023]. Our framework also trains our network to
satisfy the no-arbitrage constraints by introducing penalties as regularization terms [Roper 2010; Itkin 2019].
Although training a generator and a discriminator involves two networks, our proposed GAN model allows the use
of shallow networks, which results in much lower computational cost.
The contributions of this paper are as follows:

• We show that the use of a generator-discriminator pair allows us to train shallow networks to achieve greater
efficiency without losing accuracy.
• We propose a GAN based framework to compute volatility surfaces from synthetic market option prices
generated by the Heston model. Our framework allows us to use the trained generator network to generate the
implied volatility and the local volatility with minimal arbitrage violations on out-of-training inputs, which
is important for pricing and hedging options.
• We also show that our method can be used to generate market consistent volatility surfaces with minimal
tuning of the generator on limited sample data by using a pre-trained discriminator.

2 Heston Model, Volatility Surface and Static Arbitrage

In the following section, we discuss the stochastic asset price model used to price the synthetic data used in this paper.
Then we discuss implied volatility, local volatility and the computation of volatility surfaces. Finally, we discuss static
arbitrage and present penalty terms that impose no-arbitrage as regularization on loss functions.

2.1 Heston Model

In this paper we model market asset prices using the Heston model. The standard Black-Scholes model assumes the
asset price follows a geometric Brownian motion (GBM), which leads to the well known Black-Scholes equation for
the no-arbitrage option price. However, it assumes the volatility is constant across maturities and strikes,
which may not hold in practice.
We use the Heston model to price the European call option, which serves as synthetic data for our model
training. We denote the call option price as V . The Heston model has the nice property that its characteristic
function can be derived analytically, which can be used in fast, efficient solvers. The asset price that follows
the Heston model is given by the pair of SDEs [Heston 1993]
\[
\begin{aligned}
dS_t &= r S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S \\
dv_t &= \kappa(\bar{v} - v_t)\,dt + \gamma\sqrt{v_t}\,dW_t^v \\
dW_t^S\,dW_t^v &= \rho\,dt,
\end{aligned}
\]
where St is the stock price at time t, r is the risk-free interest rate, κ is the mean reversion rate, ρ is the correlation
between the stock price process and the variance process, WtS is the Wiener process driving the stock price dynamics,
Wtv is the Wiener process driving the variance process, γ is the volatility of the variance, v̄ is the long-run average of
the variance, vt is the variance at time t and v0 is the initial variance.
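To make the dynamics concrete, the following is a minimal sketch of one simulated path with the mean-reverting drift κ(v̄ − v_t), using an Euler scheme with full truncation of the variance. The scheme, the function name `simulate_heston_path` and the parameter values in the usage note are assumptions made for illustration; the paper prices options with the COS method rather than by simulation.

```python
import math
import random


def simulate_heston_path(s0, v0, r, kappa, v_bar, gamma, rho, T, n_steps, seed=0):
    """Simulate one (S_T, v_T) sample with a full-truncation Euler scheme.

    Hypothetical helper for illustration; not the paper's pricing method.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    s, v = s0, v0
    for _ in range(n_steps):
        # Correlated Gaussian increments so that corr(dW^S, dW^v) = rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)  # full truncation keeps sqrt() well defined
        # Log-Euler step for S keeps the simulated price strictly positive.
        s *= math.exp((r - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        # Mean-reverting variance step with vol-of-variance gamma.
        v += kappa * (v_bar - v_pos) * dt + gamma * math.sqrt(v_pos * dt) * z2
    return s, v
```

For example, `simulate_heston_path(100.0, 0.04, 0.02, 1.5, 0.04, 0.3, -0.7, 1.0, 252)` returns one terminal price and variance after 252 steps; averaging discounted payoffs over many seeds would give a Monte Carlo price.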
We used the cosine (COS) method of [Fang and Oosterlee 2009] to compute the European option price using the
characteristic function of the Heston model. More precisely, let $i = \sqrt{-1}$, $\psi \in \mathbb{R}$ and let
$\tau = T - t$ for any time $t \in [0, T]$. The variables $\{\kappa, \rho, \gamma, v_0, \bar{v}\}$ are the
Heston parameters. The explicit form of the characteristic function of the Heston model is given by [Dunn et
al. 2014]
\[
f(i\psi) = e^{A(\tau) + B(\tau) v_t + i\psi S_t}
\]
where
\[
\begin{aligned}
A(\tau) &= r i\psi\tau + \frac{\kappa\bar{v}}{\gamma^2}\left[-(\rho\gamma i\psi - \kappa - M)\tau - 2\ln\left(\frac{1 - N e^{M\tau}}{1 - N}\right)\right] \\
B(\tau) &= \frac{(e^{M\tau} - 1)(\rho\gamma i\psi - \kappa - M)}{\gamma^2(1 - N e^{M\tau})} \\
M &= \sqrt{(\rho\gamma i\psi - \kappa)^2 + \gamma^2(i\psi + \psi^2)} \\
N &= \frac{\rho\gamma i\psi - \kappa - M}{\rho\gamma i\psi - \kappa + M}.
\end{aligned}
\]
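The characteristic function translates almost line by line into code. The sketch below is one way to evaluate it with Python's `cmath`; it assumes the standard form in which the denominator of B(τ) is γ²(1 − N e^{Mτ}) and M = sqrt((ργiψ − κ)² + γ²(iψ + ψ²)). The function name `heston_cf` and its argument order are hypothetical, and ψ = 0 is excluded because N is undefined there.

```python
import cmath


def heston_cf(psi, tau, s_t, v_t, r, kappa, v_bar, gamma, rho):
    """Evaluate f(i*psi) = exp(A(tau) + B(tau)*v_t + i*psi*s_t) for psi != 0.

    s_t is the state that multiplies i*psi in the exponent, passed through as given.
    """
    i_psi = 1j * psi
    c = rho * gamma * i_psi - kappa                      # recurring term (rho*gamma*i*psi - kappa)
    m = cmath.sqrt(c * c + gamma ** 2 * (i_psi + psi ** 2))
    n = (c - m) / (c + m)
    e_mt = cmath.exp(m * tau)
    a = r * i_psi * tau + kappa * v_bar / gamma ** 2 * (
        -(c - m) * tau - 2.0 * cmath.log((1.0 - n * e_mt) / (1.0 - n))
    )
    b = (e_mt - 1.0) * (c - m) / (gamma ** 2 * (1.0 - n * e_mt))
    return cmath.exp(a + b * v_t + i_psi * s_t)
```

Two quick sanity checks follow from the formulas: at τ = 0 both A and B vanish, so the value reduces to exp(iψ s_t), and for real ψ the value at −ψ is the complex conjugate of the value at ψ. This direct form uses principal branches of the complex square root and logarithm and is not hardened for long maturities.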

2.2 Volatility Surface

In this paper, we want to generate volatility surfaces through a trained generator network. This means we want
our proposed model to generate the implied volatility and local volatility from our synthetic market prices. In the
following, we give an overview of the implied and local volatility and how they are computed.

2.2.1 Implied Volatility

Given the initial stock price $s_0$, the strike price $K$, the time to maturity $T$, the risk-free interest rate
$r$, and the Heston parameters we can compute the arbitrage-free option price. Let $V_{mkt}(K, T)$ be the call
option price at $T$ and $K$. In this paper we compute $V_{mkt}(K, T)$ using the Heston model. The implied
volatility, $\sigma_{implied}(K, T)$, is found by solving the inverse problem
\[
V(s_0, K, T, r, \sigma_{implied}(K, T)) = V_{mkt}(K, T), \tag{1}
\]
where the Black-Scholes call price, $V$, is found by solving the Black-Scholes equation [Liu et al. 2019b;
Gatheral and Lynch 2001a]. Note that the inverse problem (1) can be reformulated as
\[
V(s_0, K, T, r, \sigma_{implied}(K, T)) - V_{mkt}(K, T) = 0. \tag{2}
\]
Computing implied volatility is then reduced to the root finding problem given by (2). Two common root finding
methods are Newton’s method and Brent’s method [Brent 1973]. In this paper, we will use Brent’s method for
solving (2).
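The root-finding problem (2) can be sketched in a few lines. The example below uses plain bisection as a dependency-free stand-in for Brent's method (the paper uses Brent's method); the function names, bracket and tolerance are assumptions for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s0, k, t, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def implied_vol(v_mkt, s0, k, t, r, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve bs_call(..., sigma) - v_mkt = 0 for sigma by bisection.

    Bisection is a stand-in here; the paper uses Brent's method.
    """
    f_lo = bs_call(s0, k, t, r, lo) - v_mkt
    f_hi = bs_call(s0, k, t, r, hi) - v_mkt
    if f_lo * f_hi > 0:
        raise ValueError("market price is outside the bracketed volatility range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bs_call(s0, k, t, r, mid) - v_mkt
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)
```

Because the Black-Scholes call price is increasing in σ, the bracket contains at most one root, so a round trip such as `implied_vol(bs_call(100, 110, 0.5, 0.02, 0.25), 100, 110, 0.5, 0.02)` should recover a volatility close to 0.25.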

2.2.2 Local Volatility

Given the stock price $S$, under local volatility the call option price follows the dynamics given by
\[
\frac{\partial V}{\partial t} + \frac{1}{2}\sigma_{local}(S, t)^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0
\]
with terminal condition $V = \max(S - K, 0)$ at $T$. To change the variables of the local volatility to
$\sigma_{local}(K, T)$ we can form the equivalent forward equation called the Dupire equation [Dupire 1994]:
\[
\frac{\partial V}{\partial T} + \frac{1}{2}\sigma_{local}(K, T)^2 K^2 \frac{\partial^2 V}{\partial K^2} + rK\frac{\partial V}{\partial K} = 0 \tag{3}
\]
with initial condition $V(K, 0) = \max(S_0 - K, 0)$ [Lee 2005; Gatheral and Lynch 2001a,b]. Let
$\partial_T = \frac{\partial}{\partial T}$, $\partial_K = \frac{\partial}{\partial K}$, and
$\partial_{KK} = \frac{\partial^2}{\partial K^2}$. Given the option price surface $V(K, T)$, in terms of $K$ and
$T$, we can define the local volatility as
\[
\sigma_{local}(K, T) = \left(\frac{\partial_T V + rK\,\partial_K V}{0.5\,K^2\,\partial_{KK} V}\right)^{1/2}.
\]
For a grid of $V_{mkt}$, $K$, and $T$ we can approximate $\sigma_{local}$ from market option prices using the
finite-difference method (FDM). For discretized maturity $T_j$ with step size $\Delta T$ and strike $K_i$ with
step size $\Delta K$, the approximations to the partial derivatives are as follows:
\[
\begin{aligned}
\partial_T V_{i,j} &\approx \frac{V_{i,j} - V_{i,j-1}}{\Delta T} \\
\partial_K V_{i,j} &\approx \frac{V_{i,j} - V_{i-1,j}}{\Delta K} \\
\partial_{KK} V_{i,j} &\approx \frac{V_{i+1,j} - 2V_{i,j} + V_{i-1,j}}{\Delta K^2},
\end{aligned}
\]
then the discretized approximation of $\sigma_{local}(K, T)$ is given by
\[
\sigma_{local}(K_i, T_j) \approx \sigma_{FDM}(K_i, T_j) = \left(\frac{\partial_T V_{i,j} + r K_i\,\partial_K V_{i,j}}{0.5\,K_i^2\,\partial_{KK} V_{i,j}}\right)^{1/2}.
\]
Though the FDM method can be convenient, it has an issue with the regularity and availability of market data. To
overcome the issue of limited market price data practitioners look at the parametric approximation given by the surface
stochastic volatility inspired (SSVI) method [Gatheral and Jacquier 2011, 2014].
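The finite-difference approximation of the local volatility can be sketched as follows. The example applies the backward differences in T and K and the central second difference in K to a price surface supplied as a callable, and evaluates the ratio (∂T V + rK ∂K V) / (0.5 K² ∂KK V). The function names and step sizes are assumptions for illustration; the check uses Black-Scholes prices with constant volatility, for which the local volatility equals that constant.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s0, k, t, r, sigma):
    """Black-Scholes price of a European call (used only to build a test surface)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def local_vol_fdm(price, k, t, r, dk, dt):
    """Finite-difference local volatility at strike k and maturity t.

    `price(k, t)` is any callable returning the call price surface.
    """
    v = price(k, t)
    d_t = (v - price(k, t - dt)) / dt                                # backward difference in T
    d_k = (v - price(k - dk, t)) / dk                                # backward difference in K
    d_kk = (price(k + dk, t) - 2.0 * v + price(k - dk, t)) / dk ** 2  # central second difference
    return math.sqrt((d_t + r * k * d_k) / (0.5 * k ** 2 * d_kk))
```

With a flat 20% Black-Scholes surface, `local_vol_fdm(lambda k, t: bs_call(100.0, k, t, 0.02, 0.2), 100.0, 1.0, 0.02, 0.1, 1e-3)` should return a value close to 0.2. On sparse or noisy market grids the second difference in the denominator can become small or negative, which is the regularity issue noted above.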
Definition 2.1. Let $\theta_T = \sigma_{implied}(s_0, T)^2\,T$ and let $\phi$ be a smooth function from
$\mathbb{R}_+ \to \mathbb{R}_+$ such that the limit of $\theta_T\,\phi(\theta_T)$ exists as $T \to 0$. Then the
SSVI is the surface defined by [Gatheral and Jacquier 2014]
\[
w(k, \theta_T) = \frac{\theta_T}{2}\left\{1 + \rho\,\phi(\theta_T)\,k + \sqrt{(\phi(\theta_T)\,k + \rho)^2 + (1 - \rho^2)}\right\}. \tag{4}
\]
Note the parameter $\rho$ is the same as the Heston parameter $\rho$. To solve for the local volatility surface
we let $\lambda > 0$ and use the Heston-like parametric function given by [Gatheral and Jacquier 2014]
\[
\phi(\theta) = \frac{1}{\lambda\theta}\left(1 - \frac{1 - e^{-\lambda\theta}}{\lambda\theta}\right). \tag{5}
\]
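The SSVI surface and the Heston-like function are simple to evaluate. The sketch below assumes the square-root term has the form (φ(θ)k + ρ)² + (1 − ρ²), as in the SSVI parameterization of Gatheral and Jacquier; the function names and the sample parameter values in the usage note are hypothetical.

```python
import math


def phi_heston_like(theta, lam):
    """Heston-like curvature function phi(theta) with parameter lam > 0."""
    x = lam * theta
    return (1.0 / x) * (1.0 - (1.0 - math.exp(-x)) / x)


def ssvi_total_variance(k, theta, rho, lam):
    """SSVI total implied variance w(k, theta) for log-moneyness k."""
    p = phi_heston_like(theta, lam)
    return 0.5 * theta * (1.0 + rho * p * k + math.sqrt((p * k + rho) ** 2 + 1.0 - rho ** 2))
```

A useful consistency check is the at-the-money value: at k = 0 the square root equals 1, so `ssvi_total_variance(0.0, theta, rho, lam)` returns θ, the ATM total variance. For instance, with θ = 0.04, ρ = −0.7 and λ = 1 the ATM value is 0.04 up to rounding.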

2.3 Static Arbitrage

The absence of static arbitrage in option prices ensures the fair valuation of the option for a fixed T . In this paper we
refer to static arbitrage as arbitrage. We can say an option price surface or a volatility surface is free of arbitrage if and
only if [Gatheral and Jacquier 2014]

1. it is free of calendar spread arbitrage (monotonicity);

2. it is free of butterfly spread arbitrage.

The absence of calendar spread arbitrage implies that the monotonicity constraints of the option
price/volatility surface are satisfied over T . The absence of butterfly arbitrage ensures that there exists a
non-negative probability measure [Gatheral and Jacquier 2014]. This ensures that the option price/volatility is
a martingale and arbitrage free.
Generally, arbitrage violations are found by checking whether the option price/volatility surface violates one
or both of these conditions [Carr and D.B. 2005], which are often imposed as constraints. In this paper, we
distinguish two types of constraints. Hard constraints are constraints that must be satisfied for a solution to
be feasible. Soft constraints, in contrast, can be violated but incur a penalty. This penalty term is then
added as a regularization term to the objective function, as shown in [Itkin 2019] and [Ackerer et al. 2020].
Note that this method does not guarantee that all arbitrage opportunities will be eliminated.
There are two methods to impose soft constraints for no-arbitrage. The first method is to impose it on the option
price surface directly [Itkin 2019]. The second method is to impose the soft constraints through the implied volatility
surface [Ackerer et al. 2020]. We review both methods here, as the no-arbitrage on implied volatility surface builds on
the work on price surfaces.

2.3.1 No-Arbitrage Conditions imposed on Option Price Surface

Let $V(K, T) : [0, \infty) \times [0, \infty) \to \mathbb{R}$ be the call price surface. For $V(K, T)$ to be
arbitrage free it must have the following properties [Carr and D.B. 2005]:
\[
\partial_T V > 0, \quad \partial_K V < 0, \quad \partial_{KK} V > 0. \tag{6}
\]
The first constraint of (6) ensures that calendar arbitrage (monotonicity) is not violated. The second and the
third ensure that the butterfly arbitrage is not violated.
The constraints (6) are applied to the option price surface to ensure that it is arbitrage-free [Carr and D.B.
2005]. Neural networks are very difficult to train with hard constraints and tend to result in extremely large
prediction errors [Ackerer et al. 2020]. [Itkin 2019] treats the constraints as soft constraints and derives
penalty functions to limit arbitrage violations. For given constants $\delta_1 > 0$, $\delta_2 > 0$,
$\delta_3 > 0$ and the approximate option price $\hat{V}$, the regularization terms are given by [Itkin 2019]:
\[
L_1 = \delta_1 \max\{-T^2\,\partial_T \hat{V},\, 0\}, \quad
L_2 = \delta_2 \max\{-K^2\,\partial_{KK} \hat{V},\, 0\}, \quad
L_3 = \delta_3 \max\{K\,\partial_K \hat{V},\, 0\}. \tag{7}
\]
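The regularization terms in (7) are hinge-type penalties: each is zero when the corresponding condition in (6) holds and grows linearly with the size of the violation. A minimal sketch at a single grid point, assuming the derivatives of the approximate price surface are already available (for example from finite differences or automatic differentiation); the function name and the default weights are hypothetical.

```python
def price_surface_penalties(d_t, d_k, d_kk, k, t, delta=(1.0, 1.0, 1.0)):
    """Soft no-arbitrage penalties (L1, L2, L3) of equation (7) at one grid point.

    d_t, d_k and d_kk are the T-, K- and KK-derivatives of the approximate price.
    """
    d1, d2, d3 = delta
    l1 = d1 * max(-t ** 2 * d_t, 0.0)    # calendar: penalize dV/dT < 0
    l2 = d2 * max(-k ** 2 * d_kk, 0.0)   # butterfly: penalize non-convexity in K
    l3 = d3 * max(k * d_k, 0.0)          # butterfly: penalize dV/dK > 0
    return l1, l2, l3
```

For derivatives that satisfy (6), such as `price_surface_penalties(4.9, -0.49, 0.0195, 100.0, 1.0)`, all three terms are zero; a point with a negative maturity derivative, positive strike derivative or negative convexity contributes a positive penalty.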


Another approach to ensure no-arbitrage via regularization is by treating the Dupire condition [Dupire 1994] as
a soft constraint [Chataigner et al. 2020]. Let $k = \log(K/s_0)$, $\partial_k = \frac{\partial}{\partial k}$
and $\partial_{kk} = \frac{\partial^2}{\partial k^2}$; then the Dupire penalty is given by:
\[
L_{dup} = \frac{\partial_T V}{k^2\,\partial_{kk} V}.
\]
It is clear that the Dupire penalty requires the conditions ∂T V ≥ 0 and ∂kk V > 0. This may not hold true if relying
on auto-differentiation. However, this may be remedied by using a similar treatment on the put option surface as
(7). We remark that for the DLV method the local volatility is computed as σlocal (k, T ) = (Ldup )1/2 . Note that the
soft constraints can not guarantee that the calibrated implied volatility will be arbitrage free as it is not guaranteed to
completely eliminate arbitrage in the option price surface.

2.3.2 No-Arbitrage Conditions imposed on Volatility Surface

Let the scaled volatility surface of $k$ be denoted by $I(k, T) : \mathbb{R} \times [0, \infty) \to [0, \infty]$.
We can build on the no-arbitrage conditions on $V(k, T)$ and extend them to
$I(k, T) = \sigma(k, T)\sqrt{T}$. The sufficient conditions for $I(k, T)$ to be arbitrage free are [Roper 2010]
• (Smoothness): $I(k, T)$ is twice differentiable for every $T > 0$;
• (Positivity): for every $k \in \mathbb{R}$ and $T > 0$, $I(k, T) > 0$;
• (Durrleman’s Condition): for every $k \in \mathbb{R}$ and $T > 0$,
\[
0 \le \left(1 - \frac{k\,\partial_k I}{I}\right)^2 - \frac{1}{4}\,I^2\,(\partial_k I)^2 + I\,\partial_{kk} I, \tag{8}
\]
where $I = I(k, T)$;
• (Monotonicity in $T$): for every $k \in \mathbb{R}$, $I(k, \cdot)$ is non-decreasing;
• (Large moneyness behaviour): for every $T > 0$, $\limsup_{k \to \infty} \frac{I(k, T)}{\sqrt{2k}} \in [0, 1)$; and
• (Value at maturity): for every $k \in \mathbb{R}$, $I(k, 0) = 0$.
Note that the butterfly spread no-arbitrage conditions are satisfied from (8) and calendar spread no-arbitrage conditions
are satisfied by the monotonicity condition.
Extending this approach we can show the constraints to eliminate calendar and butterfly spread arbitrage can be
expressed as soft constraints when calibrating for the volatility surface $\sigma(k, T)$ [Ackerer et al. 2020].
We let the total variance be defined as
\[
\omega(k, T) = \sigma^2(k, T)\,T.
\]
Let $\ell_{cal}$ be the risk in the total variance from the calendar spread arbitrage and $\ell_{but}$ be the
risk in the total variance from the butterfly spread arbitrage, which are given by
\[
\begin{aligned}
\ell_{cal}(k, T) &= \partial_T\,\omega(k, T) \\
\ell_{but}(k, T) &= \left(1 - \frac{k\,\partial_k\omega(k, T)}{2\,\omega(k, T)}\right)^2 - \frac{(\partial_k\omega(k, T))^2}{4}\left(\frac{1}{\omega(k, T)} + \frac{1}{4}\right) + \frac{\partial_{kk}\omega(k, T)}{2}.
\end{aligned}
\]
The proof of why $\ell_{but} \ge 0$ ensures the butterfly spread arbitrage is not violated is shown in
[Gatheral and Jacquier 2014].
Then we can define the penalty to satisfy the no-arbitrage conditions as [Ackerer et al. 2020]
• (Monotonicity in $T$):
\[
L_c = \frac{1}{M}\sum_i \max(0, -\ell_{cal}(k_i, T_i)), \quad i = 1, \ldots, M
\]
• (Durrleman’s condition):
\[
L_{bf} = \frac{1}{M}\sum_i \max(0, -\ell_{but}(k_i, T_i)), \quad i = 1, \ldots, M
\]
• (Large moneyness limit):
\[
L_\infty = \frac{1}{M}\sum_i \left|\partial_{kk}\,\omega(k_i, T_i)\right|, \quad i = 1, \ldots, M
\]

In our method we add these penalty terms to the objective function of the generator, which we show in Section 3.
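The three penalties can be sketched for any total variance function ω(k, T). The example below approximates the derivatives with central finite differences, which is an assumption made to keep the sketch dependency-free; a neural-network implementation would more likely use automatic differentiation. The function name `arbitrage_penalties` and the step size are hypothetical.

```python
def arbitrage_penalties(omega, points, h=1e-4):
    """Return (L_c, L_bf, L_inf) for total variance omega(k, T) on sample points.

    `points` is a list of (k, T) pairs; derivatives use central differences of step h.
    """
    l_c = l_bf = l_inf = 0.0
    for k, t in points:
        w = omega(k, t)
        w_t = (omega(k, t + h) - omega(k, t - h)) / (2.0 * h)
        w_k = (omega(k + h, t) - omega(k - h, t)) / (2.0 * h)
        w_kk = (omega(k + h, t) - 2.0 * w + omega(k - h, t)) / h ** 2
        ell_cal = w_t
        ell_but = ((1.0 - k * w_k / (2.0 * w)) ** 2
                   - 0.25 * w_k ** 2 * (1.0 / w + 0.25)
                   + 0.5 * w_kk)
        l_c += max(0.0, -ell_cal)    # calendar spread violations
        l_bf += max(0.0, -ell_but)   # butterfly spread violations
        l_inf += abs(w_kk)           # large-moneyness curvature term
    m = len(points)
    return l_c / m, l_bf / m, l_inf / m
```

For a flat volatility surface, ω(k, T) = σ²T, the k-derivatives vanish, ℓ_but equals 1 and ℓ_cal equals σ² > 0, so all three penalties are zero. A total variance that decreases in T produces a positive calendar penalty.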


3 GAN framework for Computing Volatility Surfaces

In this section we present our proposed model used to compute volatility surfaces. Our proposed framework uses
a generative adversarial network (GAN) model to compute no-arbitrage volatility surfaces. The GAN model is
composed of two neural networks, the generator network and the discriminator network. The generator learns to
generate volatility surfaces that have minimal arbitrage violations. The discriminator network is trained on
volatility surfaces over different times to maturity and interest rates. The discriminator learns to classify
the given data as true if the data is from the distribution of the volatility surface and false if it is not.
Concurrently, the generator learns to generate data that is consistent with the distribution of the labels by
minimizing detection by the discriminator network. This competition between the generator and discriminator
allows the GAN to generate out-of-training samples that closely mimic the target in distribution [Goodfellow et
al. 2016]. Our GAN framework allows us to use smaller, more efficient networks for the generator. Our proposed
model can be used to compute both the local volatility and the implied volatility. We remark that our method is
not dependent on the Heston model. To allow our model to train on a limited number of observations, we engineer
two additional features. We use an adjusted log-moneyness to enrich the feature space and the at-the-money
(ATM) implied volatility as a form of target encoding.

Figure 1 Schematic of discriminator network during training. The discriminator is trained with the Black-Scholes
implied volatility (BS-IV) as the true labels and the generator output with noise as the fake labels. The generator
weights are fixed while the discriminator is being trained.

Figure 2 Schematic of generator network during training. The ground truth is given by the Black-Scholes implied
volatility. The discriminator weights are fixed during the training of the generator network.


3.1 GAN Formulation of the Inverse Problem

We formally formulate the inverse problem that will be solved by our proposed GAN framework to compute the
complete implied and local volatility surfaces. Given a grid of strikes and maturities and the set of
volatilities $Q$, the objective function of the inverse problem requires us to solve:
\[
\arg\min_{\sigma \in Q}\left\{\|V(S_0, K, T, r, \sigma(K, T)) - V_{mkt}(K, T)\|_2^2\right\} = 0. \tag{9}
\]

3.1.1 Implied volatility

We formulate the inverse problem for the implied volatility directly, as it is given by definition. Given
$\sigma_{mkt}(K_i, T_j)$ from $V_{mkt}(K_i, T_j)$, we can reformulate the inverse problem as:
\[
\min_{\sigma \in Q}\left\{\|\sigma(K, T) - \sigma_{mkt}(K, T)\|_2^2\right\} = 0. \tag{10}
\]

To solve the implied volatility problem, it suffices to solve (10). However, this may lead to values of
$\sigma$ that are not arbitrage free. Let $Q_{implied} \subset Q$ be the space of $\sigma(K, T)$ that are
arbitrage free. By selecting $\sigma(K, T)$ such that it is most likely sampled from $Q_{implied}$ we get the
following minimax problem:
\[
\max_{Q_{implied} \subset Q}\; \min_{\sigma \in Q_{implied}}\left\{\|\sigma(K, T) - \sigma_{mkt}(K, T)\|_2^2\right\} = 0. \tag{11}
\]

We denote the resulting $\sigma^*(K, T)$ as $\sigma_{implied}(K, T)$.

3.1.2 Local volatility

How to formulate the local volatility as an inverse problem in terms of volatility surfaces is less clear. We
present a Lemma that connects the local volatility surface to the inverse problem.
Lemma 3.1. Let $V_{mkt}$ be the market price of the option and $V$ its approximation. We assume that
$\frac{\partial V_{mkt}}{\partial T}$, $\frac{\partial V_{mkt}}{\partial K}$,
$\frac{\partial^2 V_{mkt}}{\partial K^2}$, $\frac{\partial V}{\partial T}$, $\frac{\partial V}{\partial K}$,
$\frac{\partial^2 V}{\partial K^2}$ exist and that $V$, $V_{mkt}$ satisfy (3). Assume further that there exists
an optimal $\sigma^*(K, T)$ that satisfies (9). Let $H(K, T, V, V_{mkt}) = \delta(V(K, T), V_{mkt}(K, T))$ and
$\Gamma(K, T, \sigma, \sigma_{mkt}) = \delta(\sigma(K, T), \sigma_{mkt}(K, T))$, where
$\delta(x, x_0) = \|x - x_0\|_2$ is the $L_2$ distance between $x$ and $x_0$. Then we can form a controlled KFP
equation given by measure $\delta$:
\[
\frac{\partial H}{\partial T} + 2rK\frac{\partial H}{\partial K} + \frac{(V - V_{mkt})}{2}\min_{\sigma}\{\Gamma(K, T, \sigma)^2\}\,K^2\,\frac{\partial^2 H}{\partial K^2} = 0, \tag{12}
\]
with initial condition $H(K, 0, V, V_{mkt}) = \delta(V(K, 0), V_{mkt}(K, 0))$.

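Proof. For $\delta = \|\cdot\|_2$, the distance between the market value and its approximation is given by
$H = \delta(V, V_{mkt})$. We can construct the KFP equation. From the chain rule,
\[
\begin{aligned}
\frac{\partial H}{\partial T} &= \frac{\partial H}{\partial V}\frac{\partial V}{\partial T} + \frac{\partial H}{\partial V_{mkt}}\frac{\partial V_{mkt}}{\partial T} \\
&= \frac{\partial H}{\partial V}\left(-rK\frac{\partial V}{\partial K} - \frac{\sigma^2 K^2}{2}\frac{\partial^2 V}{\partial K^2}\right) + \frac{\partial H}{\partial V_{mkt}}\left(-rK\frac{\partial V_{mkt}}{\partial K} - \frac{\sigma_{mkt}^2 K^2}{2}\frac{\partial^2 V_{mkt}}{\partial K^2}\right) \\
&= -2rK\frac{\partial H}{\partial K} - \frac{\sigma^2 K^2}{2}\frac{\partial H}{\partial V}\frac{\partial^2 V}{\partial K^2} - \frac{\sigma_{mkt}^2 K^2}{2}\frac{\partial H}{\partial V_{mkt}}\frac{\partial^2 V_{mkt}}{\partial K^2},
\end{aligned}
\]
which gives us the equation
\[
\frac{\partial H}{\partial T} + 2rK\frac{\partial H}{\partial K} + \frac{\sigma^2 K^2}{2}\frac{\partial H}{\partial V}\frac{\partial^2 V}{\partial K^2} + \frac{\sigma_{mkt}^2 K^2}{2}\frac{\partial H}{\partial V_{mkt}}\frac{\partial^2 V_{mkt}}{\partial K^2} = 0. \tag{13}
\]
Next, we multiply and divide the last two terms of (13) by $H$. This gives us:
\[
\frac{\partial H}{\partial T} + 2rK\frac{\partial H}{\partial K} + \frac{\sigma^2 K^2}{2}(H)\frac{\partial H}{\partial V}(1/H)\frac{\partial^2 V}{\partial K^2} + \frac{\sigma_{mkt}^2 K^2}{2}(H)\frac{\partial H}{\partial V_{mkt}}(1/H)\frac{\partial^2 V_{mkt}}{\partial K^2} = 0. \tag{14}
\]
From the derivative of $H$ we can see that
\[
\frac{\partial H}{\partial V} = \frac{V - V_{mkt}}{H} = -\frac{\partial H}{\partial V_{mkt}}, \qquad
\frac{\partial^2 H}{\partial V^2} = 1/H, \qquad \frac{\partial^2 H}{\partial V_{mkt}^2} = -1/H.
\]
Applying the first derivative and the definition of the second derivative to (14) we get
\[
\frac{\partial H}{\partial T} + 2rK\frac{\partial H}{\partial K} + \frac{(V - V_{mkt})K^2}{2}\left[\sigma^2 - \sigma_{mkt}^2\right]\frac{\partial^2 H}{\partial K^2} = 0,
\]
and we can rewrite $\sigma^2 - \sigma_{mkt}^2$ as
$\left(\sqrt{\sigma^2 - \sigma_{mkt}^2}\right)^2 = \Gamma(\sigma, \sigma_{mkt})^2$. The initial condition can be
derived as follows:
\[
\begin{aligned}
\delta(V, V_{mkt}) = \delta\Bigg(&\int_T \left[\frac{\partial V}{\partial T} + rK\frac{\partial V}{\partial K} + \frac{1}{2}\sigma(K, T)^2 K^2 \frac{\partial^2 V}{\partial K^2}\right] dT + V(K, 0, \sigma), \\
&\int_T \left[\frac{\partial V_{mkt}}{\partial T} + rK\frac{\partial V_{mkt}}{\partial K} + \frac{1}{2}\sigma_{mkt}(K, T)^2 K^2 \frac{\partial^2 V_{mkt}}{\partial K^2}\right] dT + V_{mkt}(K, 0, \sigma_{mkt})\Bigg) \\
= \delta(&V(K, 0, \sigma), V_{mkt}(K, 0, \sigma_{mkt})) = H(K, 0, V, V_{mkt}).
\end{aligned}
\]
We know from the inverse problem that for each $T$ and $K$ we want
\[
H(V, V_{mkt}) \to 0 \quad \text{as} \quad \Gamma(\sigma, \sigma_{mkt}) \to 0,
\]
which tells us we need $\min_{\sigma}(\Gamma(\sigma, \sigma_{mkt})^2)$.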

Note that a connection exists between the forward Kolmogorov equation and the Hamilton-Jacobi-Bellman (HJB)
equation [Annunziato et al. 2014]. More specifically, when the cost functional of the adjoint equation equals
$d = \|\cdot\|_2$ and $\sigma^*(K, T)$ is a strong solution, the HJB equation is equal to the adjoint equation
[Annunziato et al. 2014].
Let $Q_{local} \subset Q$ be the space of volatilities $\sigma(K, T)$ that are arbitrage free and computed
uniquely from Dupire’s equation. The optimal solution is found by taking the first order condition with respect
to $\sigma$:
\[
\min_{\sigma \in Q}\left\{\|\sigma(K, T) - \sigma_{mkt}(K, T)\|_2^2\right\} = 0.
\]

By restricting the possible $\sigma$ to the space of arbitrage free volatilities, the inverse problem can be
solved by solving the following problem:
\[
\max_{Q_{local} \subset Q}\; \min_{\sigma \in Q_{local}}\left\{\|\sigma(K, T) - \sigma_{mkt}(K, T)\|_2^2\right\} = 0. \tag{15}
\]

We denote the resulting $\sigma^*(K, T)$ as $\sigma_{local}(K, T)$. In our formulation, the max is solved using
the GAN, which learns to accept solutions in $Q_{implied}$ and $Q_{local}$. The min is solved using the MSE
loss and no-arbitrage penalty terms, given that $\sigma(K, T)$ belongs to the restricted set of volatilities
$Q_{implied}$ and $Q_{local}$. We present the loss function explicitly in the next section. Note that in
practice $\sigma_{mkt}$ is computed by finite differences.

3.2 Loss Function and Proposed Model

In this section we present our GAN model more formally. Our proposed model can be used for computing the implied
and local volatility. For the implied volatility, the input of our proposed model is composed of the risk free
rate $r$, the moneyness $k = K/s_0$, the maturity time $T$, the at-the-money (ATM) volatility
$\sigma_{ATM} = \sigma_{implied}(s_0, T)$ and the adjusted log-moneyness $k_{log} = \log(k) - rT$. The adjusted
log-moneyness is chosen to enrich the feature space. By adding the log of the moneyness we account for skewness
in the moneyness data, which is highly likely in volatility data as there may be different volatility levels
for a given $T$. We also choose to include $\sigma_{ATM}$ as a form of target encoding. This is because in the
limit $T \to t$ the implied volatility converges to the spot volatility almost surely [Durrleman 2008]. Note
this is a consistency condition for the model to be consistent with spot volatility [Durrleman 2008; Carmona
2007].
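The feature construction described above can be sketched for a single strike and maturity as follows. The function name `build_features` and the ordering of the returned list are assumptions for illustration.

```python
import math


def build_features(strike, s0, maturity, r, sigma_atm):
    """Assemble the five model inputs for one (K, T) point.

    Returns [k, sigma_ATM, T, r, k_log]; the ordering is an illustrative choice.
    """
    k = strike / s0                      # moneyness k = K / s0
    k_log = math.log(k) - r * maturity   # adjusted log-moneyness k_log = log(k) - r*T
    return [k, sigma_atm, maturity, r, k_log]
```

For an at-the-money strike (K = s0) the moneyness is 1 and the adjusted log-moneyness reduces to −rT, so `build_features(100.0, 100.0, 1.0, 0.02, 0.2)` yields `[1.0, 0.2, 1.0, 0.02, -0.02]`.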
First, we construct the loss function to approximate the implied volatility. For each $T \in \mathbb{R}^M$ we
draw $r$ from the range $[0.0, 0.05]$. We let $k \in \mathbb{R}^M$ be the moneyness,
$\sigma_{ATM} \in \mathbb{R}_+^M$ be the ATM implied volatility and $k_{log} \in \mathbb{R}^M$ be the adjusted
log-moneyness. We let $[X_1, \ldots, X_N]^\top$ be the input to the GAN, where
$X_i = \{k, \sigma_{ATM}, T, r\mathbf{1}, k_{log}\}$ and $\mathbf{1}$ is a vector of ones of dimension $M$. Let
$d = 5MN$ be the size of $X$. We let $b = MN$ be the size of the label data $y \in \mathbb{R}^b$ used in
training, i.e. $\sigma_{implied}$ from Black-Scholes.
Let G : Rd 7→ Rb be the generator function and D : Rd+b 7→ [0, 1]b be the discriminator function and let z ∼ N (0, 1)
such that Z = [z1 , ..., zd ]. In a standard GAN model the learning objective is to ensure the distribution of G(Z) is
similar to the distribution of y. In our approach we do not use standard noise Z, but instead a shifted and scaled noise
Z̄ = [z̄1 , ..., z̄d ], where z̄ ∼ N (µX , σX ). The mean and standard deviation of X is estimated for each feature as the

sample mean and standard deviation for the input X. This is done to ensure the noise is within the domain of the input
X. The standard loss function for the GAN is given by the minimax problem [Goodfellow et al. 2016]

$$\min_G \max_D L(G, D) = \min_G \max_D \; \mathbb{E}[\log(D(\{X, y\}))] + \mathbb{E}[1 - \log(D(\{\bar{Z}, G(\bar{Z})\}))]$$
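The shifted and scaled noise $\bar{Z}$ described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the helper name `shifted_scaled_noise`, the use of the sample standard deviation, and the toy input rows are all hypothetical choices, not details taken from the paper.

```python
import random
import statistics


def shifted_scaled_noise(X, seed=0):
    """Draw one noise row per sample, feature by feature, from N(mean_j, std_j).

    X is a list of samples, each a list of feature values. Using the sample
    standard deviation (statistics.stdev) is an assumption of this sketch.
    """
    rng = random.Random(seed)
    columns = list(zip(*X))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in X]


# Hypothetical inputs with columns [k, sigma_ATM, T, r, k_log].
X = [
    [0.9, 0.20, 0.5, 0.02, -0.115],
    [1.0, 0.22, 1.0, 0.02, -0.020],
    [1.1, 0.25, 1.5, 0.02, 0.065],
]
Z_bar = shifted_scaled_noise(X)
print(Z_bar)
```

Because each noise column is centred on the corresponding feature mean, the generator is fed noise that lives in roughly the same region as the real inputs.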

Solving the standard GAN loss function is difficult, so in practice we reformulate the standard loss function as two
minimization problems given by
$$\min_D L(D) = \min_D \left\{ -\mathbb{E}[\log(D(\{X, y\}))] \right\}$$
$$\min_G L(G) = \min_G \left\{ \mathbb{E}[1 - \log(D(\{\bar{Z}, G(\bar{Z})\}))] \right\}.$$

These two loss functions are solved iteratively: G is fixed when minimizing L(D) and D is fixed when
minimizing L(G).
We connect the minimax problems (11) and (15) to the standard GAN. The generator G outputs possible values
of σ from a given input. The discriminator D classifies σ as a volatility that is in $Q_{target}$ or not, where
target ∈ {implied, local}; this restricts the possible values of σ such that they are consistent with $Q_{target}$. The MSE
loss is used to solve the inverse problem and map σ to $\sigma_{target}$. Note that we use lower case for scalar values, upper
case for vector values and underlined upper case for matrices.
We formulate our generator loss function similarly to [Horvath et al. 2021] and [Ackerer et al. 2020] and use the mean
squared error (MSE) to learn the shape of the volatility surface. For b different samples the MSE is given by
$$MSE(y, G(X)) = \frac{1}{b} \sum_{i=1}^{b} (y_i - G(X_i))^2 \tag{16}$$
To generate arbitrage-free implied volatility, we incorporate the no-arbitrage conditions Lc , Lbf and L∞ in the loss
function of the generator. To ensure that the G(X) has a similar distribution to y we minimize the negative log-
likelihood loss function given by
$$L_{DG} = -\frac{1}{b} \sum_{i=1}^{b} \log(D(\{\bar{Z}_i, G(\bar{Z}_i)\})).$$
Then the loss function for the generator is given by
$$L_G = MSE(y, G(X)) + \lambda_1 L_c + \lambda_2 L_{bf} + \lambda_3 L_\infty + \lambda_4 L_{DG}. \tag{17}$$
The parameters $\lambda_1 > 0$, $\lambda_2 > 0$, and $\lambda_3 > 0$ are used to determine the amount of calendar arbitrage, butterfly arbitrage
and the limit behaviour that are penalized. $\lambda_4 \in [0, 1]$ is the amount of similarity we want with the distribution of y.
Note that the additional terms act as regularizers to the standard MSE loss function.
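To make the structure of the generator loss concrete, the following sketch assembles the weighted sum from its pieces. It is only an illustration: the penalty values $L_c$, $L_{bf}$ and $L_\infty$ are passed in as precomputed numbers because their definitions are given elsewhere in the paper, and the default weights in `lambdas` and the small `eps` guard are hypothetical.

```python
import math


def mse(y, g):
    """Mean squared error between labels y and generator outputs g."""
    return sum((yi - gi) ** 2 for yi, gi in zip(y, g)) / len(y)


def l_dg(d_fake, eps=1e-12):
    """Negative mean log-probability the discriminator assigns to generated samples."""
    return -sum(math.log(d + eps) for d in d_fake) / len(d_fake)


def generator_loss(y, g_x, d_fake, l_c, l_bf, l_inf,
                   lambdas=(1.0, 1.0, 1.0, 0.5)):
    """Weighted sum MSE + lam1*L_c + lam2*L_bf + lam3*L_inf + lam4*L_DG.

    l_c, l_bf and l_inf are assumed to be precomputed penalty values;
    the default weights are placeholders, not values from the paper.
    """
    lam1, lam2, lam3, lam4 = lambdas
    return (mse(y, g_x) + lam1 * l_c + lam2 * l_bf
            + lam3 * l_inf + lam4 * l_dg(d_fake))


# Hypothetical batch of two volatilities and discriminator outputs.
loss = generator_loss(y=[0.2, 0.3], g_x=[0.2, 0.5], d_fake=[0.6, 0.7],
                      l_c=0.0, l_bf=0.01, l_inf=0.0)
print(loss)
```

Setting all four weights to zero recovers a plain MSE regression, which is one way to see the extra terms as regularizers.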
For the discriminator network, we use the binary cross-entropy (BCE) loss function [Goodfellow et al. 2016] given by
$$L_D = -\frac{1}{b} \sum_{i=1}^{b} \left( \log(D(\{X_i, y_i\})) + \log(1 - D(\{X_i, G(X_i)\})) \right). \tag{18}$$
The discriminator loss is chosen to maximize the likelihood that the discriminator classifies the target values as true
given some noisy inputs. Then we want to minimize $L_D$ and $L_G$ iteratively such that $D^* = \arg\min_D L_D$ and
$G^* = \arg\min_G L_G$.
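The binary cross-entropy loss of the discriminator can be sketched directly from its definition. The function below takes the discriminator outputs on real and generated pairs as plain lists; the `eps` guard against `log(0)` is an assumption added for numerical safety and is not part of the paper's formula.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """BCE loss: -(1/b) * sum(log D(real_i) + log(1 - D(fake_i)))."""
    b = len(d_real)
    total = sum(math.log(r + eps) + math.log(1.0 - f + eps)
                for r, f in zip(d_real, d_fake))
    return -total / b


# An undecided discriminator versus a confident one (hypothetical outputs).
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])
print(undecided, confident)
```

An undecided discriminator that outputs 0.5 everywhere gives a loss of 2 log 2, and the loss falls as the discriminator separates real from generated volatilities more confidently.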
For the local volatility the loss function is the same, except that the input of our proposed model is given by
$X_i = \{k, \sigma_{ATM}, \sigma_{implied}, T, r\mathbf{1}, k_{log}\}$ and the label is $y = \sigma_{local}$.
In our framework we model G and D using feedforward neural networks. We define the generator network as G(·; Ω)
with a set of parameters Ω and the discriminator network as D(·; Θ) with a set of parameters Θ. The generator network
we use is an ANN with l = 1, 2 layers. Unlike other works on volatility computation such as [Liu et al. 2019b] and
[Chataigner et al. 2020], we did not use a deep neural network. We found that increasing the depth past l = 2 did not affect
the accuracy of the generator network. We remark that depth past l = 1 in the discriminator network actually led to a
degradation of performance, as it was over-fitting on synthetic data. The inputs of the network are normalized by the mean
and standard deviation. Let $W_l$ be the weights of the l-th layer of the neural network. Then D({X, y}; Θ) with parameters
$\Theta = [W_0, W_1]$ is represented by
$$Z_1 = \mathrm{softplus}(\mathrm{batchnorm}(\{X, y\} W_0); \beta)$$
$$D(\{X, y\}; \Theta) = \mathrm{sigmoid}(Z_1 W_1).$$

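We represent GAN-1 with an l = 1 layer generator $G_1(X; \Omega_1)$, with parameters $\Omega_1 = [W_0, W_1]$, by
$$Z_1 = \mathrm{softplus}(\mathrm{batchnorm}(X W_0); \beta)$$
$$G_1(X; \Omega_1) = \mathrm{softplus}(Z_1 W_1; \beta),$$
and represent GAN-2 with an l = 2 layer generator $G_2(X; \Omega_2)$, with parameters $\Omega_2 = [W_0, W_1, W_2]$, as
$$Z_1 = \mathrm{softplus}(\mathrm{batchnorm}(X W_0); \beta)$$
$$Z_2 = \mathrm{softplus}(\mathrm{batchnorm}(Z_1 W_1); \beta)$$
$$G_2(X; \Omega_2) = \mathrm{softplus}(Z_2 W_2; \beta).$$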
We use a scaled version of the softplus activation function and the sigmoid activation function, defined as [Dugas et al.
2001; Goodfellow et al. 2016]
$$\mathrm{softplus}(X; \beta) = \frac{1}{\beta} \log(1 + e^{\beta X}),$$
$$\mathrm{sigmoid}(X) = \frac{1}{1 + e^{-X}}.$$
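As a small illustration of the two activation functions, the sketch below evaluates them on scalars in plain Python. The rewriting of the softplus as max(βx, 0) + log(1 + e^{-|βx|}) is a standard overflow-safe identity and is an implementation choice of this sketch, not something stated in the paper.

```python
import math


def softplus(x, beta=1.0):
    """Scaled softplus (1/beta) * log(1 + exp(beta * x)), written to avoid overflow."""
    t = beta * x
    return (max(t, 0.0) + math.log1p(math.exp(-abs(t)))) / beta


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), evaluated stably for either sign of x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


print(softplus(0.0), softplus(2.0, beta=5.0), sigmoid(0.0))
```

The softplus output is always positive and smooth, which fits a generator whose output is a volatility; larger β makes it approach max(x, 0) more closely.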
The softplus activation function is used for the generator network output because we want a smooth enough function
to learn no-arbitrage soft constraints [Dugas et al. 2001]. The sigmoid activation function for the discriminator output
is standard for classification problems [Goodfellow et al. 2016]. We also employ batch normalization in each layer
with learnable parameters γ and η. The input X is batch normalized as follows [Goodfellow et al. 2016]:
$$\mathrm{batchnorm}(X) = \gamma \, \frac{X - E[X]}{\mathrm{std}[X]} + \eta.$$
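The batch normalization formula above can be sketched for a single feature column as follows. The use of the population standard deviation over the batch and the small `eps` added under the square root are assumptions of this sketch (they are common in practice but not specified in the text).

```python
import math


def batchnorm(column, gamma=1.0, eta=0.0, eps=1e-5):
    """Normalize one feature over the batch, then scale by gamma and shift by eta."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n  # population variance (assumed)
    std = math.sqrt(var + eps)                      # eps avoids division by zero
    return [gamma * (x - mean) / std + eta for x in column]


# Hypothetical batch of four maturities.
normalized = batchnorm([0.5, 1.0, 1.5, 2.0])
shifted = batchnorm([0.5, 1.0, 1.5, 2.0], gamma=2.0, eta=0.5)
print(normalized, shifted)
```

With γ = 1 and η = 0 the output has zero mean and approximately unit variance; the learnable γ and η let the network undo the normalization where that helps.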
We train the discriminator network first using $G(\bar{Z}; \Omega)$ with Ω fixed as shown in figure 1. Then we train the generator
network G(X; Ω) using D({X, G(X; Ω)}; Θ) with Θ fixed as shown in figure 2. Reparameterizing (17) and (18) by
the neural network parameters {Ω, Θ} gives us our final loss function for the discriminator as
$$L_D(\Theta) = -\frac{1}{b} \sum_{i=1}^{b} \left( \log(D(\{X_i, y_i\}; \Theta)) + \log(1 - D(\{X_i, G(X_i)\}; \Theta)) \right), \tag{19}$$
and for the generator as
$$L_G(\Omega) = MSE(y, G(X; \Omega)) + \lambda_1 L_c + \lambda_2 L_{bf} + \lambda_3 L_\infty + \lambda_4 L_{DG}(\Omega), \tag{20}$$
where
$$L_{DG}(\Omega) = -\frac{1}{b} \sum_{i=1}^{b} \log(D(\{\bar{Z}_i, G(\bar{Z}_i; \Omega)\})).$$
The optimal set of discriminator network parameters is denoted by $\Theta^*$ and is found by
$$\Theta^* = \arg\min_{\Theta} \{L_D(\Theta)\}.$$

The optimal set of generator network parameters is denoted by $\Omega^*$, which is found by minimizing (20), the
reparameterized form of (17):
$$\Omega^* = \arg\min_{\Omega} \{L_G(\Omega)\}.$$

3.3 Training and Summary of GAN

Training the GAN requires two steps. The first step is to train the discriminator. However, loss (19) is not
straightforward to compute directly. Instead, the discriminator is trained using algorithm 1. The total loss function
LD (Θ) = LD1 (Θ) + LD2 (Θ) is minimized using the Adam [Kingma and Ba 2017] optimizer.
In the second step, given X and y we train the generator by minimizing (20) using Adam. This is done in a standard
neural network approach. Training is performed for 50 epochs. Once training is complete, we evaluate the GAN with
a forward pass using the testing data.
Our training pipeline is summarized as follows:
1. The Heston parameters are generated using a uniform distribution over a range of values as shown in table
2, and the option price and target volatility surface Y are computed as shown in figure 3. Inputs X and Z are
constructed for M samples.
2. Then X, Z and y are given as inputs to D(·; Θ).
3. D(·; Θ) is trained by evaluating (19) and G(X; Ω) is trained by evaluating (20).
4. The trained network G(X; Ω) approximates the volatility surface.
5. We carry forward the weights of the generator from the previous epoch to initialize the weights of the next
epoch to reinforce the soft constraints.

Algorithm 1 Discriminator training algorithm

Require: X, y, G(Ω∗ ), optimizer, epoch
n ← 0
while n < epoch do
    for batch do
        D := D(Θ)
        G := G(Ω∗ )
        LD1 (Θ) = −E[log(D({X, y}))]
        LD2 (Θ) = −E[log(1 − D({X, G(X)}))]
        LD (Θ) = LD1 (Θ) + LD2 (Θ)
        propagate errors backwards through the network
        optimizer step
    end for
    n ← n + 1
end while
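The discriminator update of Algorithm 1 can be illustrated with a deliberately small toy in plain Python. Everything here is a simplifying assumption made for the sketch: the discriminator is a hypothetical linear-logistic model rather than the paper's network, the generator is a fixed toy function, the gradients are written by hand, and plain gradient steps stand in for the Adam optimizer.

```python
import math
import random


def sigmoid(s):
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def d_prob(theta, x, v):
    # Hypothetical linear discriminator on one (input, volatility) pair.
    return sigmoid(theta[0] * x + theta[1] * v + theta[2])


def d_loss(theta, data, generator, eps=1e-12):
    # L_D = L_D1 + L_D2, averaged over the data.
    total = 0.0
    for x, y in data:
        total -= math.log(d_prob(theta, x, y) + eps)
        total -= math.log(1.0 - d_prob(theta, x, generator(x)) + eps)
    return total / len(data)


def train_discriminator(data, generator, epochs=50, batch_size=16, lr=0.1, seed=0):
    rng = random.Random(seed)
    data = list(data)
    theta = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = [0.0, 0.0, 0.0]
            for x, y in batch:
                g = generator(x)  # the generator is held fixed
                p_real = d_prob(theta, x, y)
                p_fake = d_prob(theta, x, g)
                real_feats, fake_feats = (x, y, 1.0), (x, g, 1.0)
                for j in range(3):
                    # d/dtheta_j of -log(p_real) - log(1 - p_fake)
                    grad[j] += (-(1.0 - p_real) * real_feats[j]
                                + p_fake * fake_feats[j]) / len(batch)
            # Plain gradient step (stand-in for the Adam optimizer step).
            theta = [t - lr * gj for t, gj in zip(theta, grad)]
    return theta


rng = random.Random(1)
xs = [rng.uniform(0.5, 2.5) for _ in range(64)]
data = [(x, 0.2 + 0.1 * x) for x in xs]       # toy "true" volatilities
fixed_generator = lambda x: 0.6 + 0.1 * x      # toy, deliberately biased generator
initial_loss = d_loss([0.0, 0.0, 0.0], data, fixed_generator)
theta = train_discriminator(data, fixed_generator)
final_loss = d_loss(theta, data, fixed_generator)
print(initial_loss, final_loss)
```

The generator step would mirror this loop with the roles swapped: the discriminator parameters are frozen and the generator parameters are updated against (20).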

The architecture and training parameters are detailed in table 1.

Table 1 Model Parameters used in GAN for GAN-1 and GAN-2.

Parameters Options
Neurons(each layer) 100
Activation function softplus (β = 1), sigmoid
Dropout rate 0.0
Batch-normalization No
Optimizer Adam [Kingma and Ba 2017]
Batch size 128

3.4 Generating Data

Our proposed model can be used to compute implied and local volatility surfaces. These are two separate problems and thus require
different input features for training and testing. Our option prices are generated using the Heston model
with different combinations of parameters. We use synthetic data over real data as it is difficult to obtain good quality
real data for all T and K. In this section we provide more details on the training and testing data used in this paper.
We generate three separate sets of data: one set used for training and testing of our model, a second set for
generating the out-of-training volatility surface and a third set for testing price errors.

3.4.1 Generating Volatility Data

We generate synthetic market data for training our GAN framework. The characteristic function of the Heston model
was used with the COS method to generate the European call price of an underlying asset following the geometric
Brownian motion. Then the Black-Scholes implied volatility was computed using Brent’s method. The
finite-difference method was used to approximate the local volatility from European call prices.
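The price-to-implied-volatility step can be sketched as a one-dimensional root search on the Black-Scholes call price. To stay dependency-free this sketch swaps in plain bisection in place of Brent's method (both are bracketing root finders; Brent's method typically converges faster), and the bracket `[lo, hi]`, tolerance and example inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s0, strike, maturity, rate, sigma):
    """Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return s0 * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol(price, s0, strike, maturity, rate, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert bs_call for sigma by bisection (stand-in for Brent's method).

    Relies on the call price being increasing in sigma on [lo, hi].
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, strike, maturity, rate, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip on a hypothetical option: price it at 30% vol, then recover the vol.
price = bs_call(100.0, 110.0, 1.0, 0.02, 0.3)
recovered = implied_vol(price, 100.0, 110.0, 1.0, 0.02)
print(price, recovered)
```

In the paper's pipeline the input price would come from the Heston model via the COS method rather than from the Black-Scholes formula itself.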
For training and validating, we use the parameters shown in table 2 in our simulations. Each parameter was sampled
uniformly at random from its range.
For training and validating, we generated 10 different combinations of the parameters above. For each set of parameters,
we generated the option price and implied volatility for 75 different maturity dates and 50 different strikes.
For testing the out-of-training volatility surface we use 1 set of parameters over 11 different maturities and 157 different
strikes, with parameters shown in table 3. For the price error we use 50 different sets of parameters, 8 different maturities
and 11 different strikes, with parameters shown in table 2.

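[Figure 3 flowchart: the inputs r, ρ, κ, γ, v̄, v0, T, K/s0 feed the Heston model, which produces the option price V via the COS method. From V, Brent’s method gives the Black-Scholes implied volatility σimplied, and Dupire’s volatility computed with the finite difference method gives the local volatility σlocal.]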

Figure 3 Pipeline used to generate data for training and testing our model.

Table 2 Domain of parameters used in the simulation of training and testing data, all parameters were drawn randomly
from the parameter space.
Parameter Range
r, risk-free interest rate (0.0, 0.05)
κ, reversion speed (0.0, 3.0)
ρ, correlation (−0.9, 0.0)
γ, volatility of variance (0.01, 0.5)
v̄, long-run mean variance (0.01, 0.5)
v0 , initial variance (0.05, 0.5)
K/s0 , moneyness (0.5, 2.5)
T , time to maturity (0.5, 2.0)
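Drawing Heston parameter sets from the ranges in table 2 can be sketched as below. The dictionary keys, the helper name `sample_heston_params` and the seed are hypothetical; the ranges are copied from table 2 and the count of 10 parameter sets follows the training setup described above.

```python
import random

# Parameter ranges copied from table 2 (Heston parameters only).
PARAM_RANGES = {
    "r": (0.0, 0.05),       # risk-free interest rate
    "kappa": (0.0, 3.0),    # reversion speed
    "rho": (-0.9, 0.0),     # correlation
    "gamma": (0.01, 0.5),   # volatility of variance
    "v_bar": (0.01, 0.5),   # long-run mean variance
    "v0": (0.05, 0.5),      # initial variance
}


def sample_heston_params(rng):
    """Draw one parameter set, each entry uniformly from its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


rng = random.Random(42)
param_sets = [sample_heston_params(rng) for _ in range(10)]
print(param_sets[0])
```

Each parameter set would then be combined with a grid of maturities and strikes to produce prices and volatilities.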

3.4.2 Volatility Data From Call options on the S&P 500

We also test our GAN framework on market index options on the S&P 500 index. The options dataset has
the following features: strike, moneyness, bid price, mid price, ask price, last price, volume, implied volatility
and time traded. We gathered data for T = 0.25, 0.5, 0.75, 1. The interest rate r was set to the 1-year treasury
bond rate of 5.463%. We collected a total of 821 data points with moneyness ∈ [0.24, 1.17] and time-to-maturity
∈ {0.25, 0.5, 0.75, 1.0}.

4 Numerical Results

In this section, we present our experimental results. Our experiment is divided into two main categories. First we
show that our proposed method can compute the Heston implied volatility surface. To measure the accuracy of our
method we use the mean absolute percent error (MAPE) as a measure of performance. The MAPE is given by

$$MAPE = \frac{1}{b} \sum_{i=1}^{b} \frac{|\sigma_{BS,i} - \sigma_{implied,i}|}{\sigma_{BS,i}}.$$

Table 3 Domain of parameters used in out-of-training volatility surface generation, all parameters were drawn
uniformly from the domain.
Parameter Range
r, risk-free interest rate 0.02
κ, reversion speed 2.7
ρ, correlation −0.4
γ, volatility of variance 0.2
v̄, long-run mean variance 0.4
v0 , initial variance 0.4
K/s0 , moneyness (0.3, 2.8)
T , time to maturity (0.3, 2.0)

We also evaluate another measure of performance, the mean absolute error (MAE), which is given by
$$MAE = \frac{1}{b} \sum_{i=1}^{b} |\sigma_{BS,i} - \sigma_{implied,i}|.$$
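The two implied volatility error measures can be sketched as follows. The function names and the sample values are illustrative; `mape` returns a fraction, so it would be multiplied by 100 to match the percentages reported in the tables.

```python
def mae(sigma_bs, sigma_pred):
    """Mean absolute error between reference and predicted implied volatilities."""
    return sum(abs(a - p) for a, p in zip(sigma_bs, sigma_pred)) / len(sigma_bs)


def mape(sigma_bs, sigma_pred):
    """Mean absolute percent error as a fraction of the reference volatility."""
    return sum(abs(a - p) / a for a, p in zip(sigma_bs, sigma_pred)) / len(sigma_bs)


# Hypothetical reference and predicted volatilities.
sigma_bs = [0.20, 0.40]
sigma_pred = [0.21, 0.38]
print(mae(sigma_bs, sigma_pred), 100.0 * mape(sigma_bs, sigma_pred))
```

MAE weights all grid points equally in absolute terms, while MAPE penalizes the same absolute miss more heavily where the reference volatility is small.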
Next we show that our proposed method can compute the local volatility surface. Note that the local volatility surface
is not unique, thus it is difficult to measure the quality directly. Instead we use the repricing error to measure
the quality of our method [Horvath et al. 2021; Chataigner et al. 2020, 2021]. This is done by computing the implied
volatility from the local volatility since
$$\hat{\sigma} = \sqrt{\sigma_{local} / T}.$$
To measure the quality of our method we compute the average relative pricing error (ARPE) of the European option
given by
$$ARPE = \frac{1}{b} \sum_{i=1}^{b} \frac{|V_{mkt,i} - V_{local,i}|}{|V_{mkt,i}|},$$
where $V_{local}$ is the Black-Scholes European call option price computed using the local volatility. We use two other
metrics to measure the performance of our proposed method for local volatility. We use the maximum relative price
error (MRPE) given by
$$MRPE = \max_{i=1,...,b} \frac{|V_{mkt,i} - V_{local,i}|}{|V_{mkt,i}|},$$
and the standard deviation of relative price error. Note that all three measures are presented as error heatmaps [Horvath
et al. 2021]. In all of our experiments, we use generated data as described in Section 3. In experiment 1, we compare
the performance of GAN-1 vs. GAN-2 and show qualitative results which compare the output of our method to the
implied volatility smile generated using Brent’s method. We also compare our model with and without soft constraints.
In experiment 2, we compare our proposed methods with the IV-ANN method of [Liu et al. 2019b]. We chose not to
compare with the deep calibration approach [Horvath et al. 2021], as the network architecture used in that model is
captured by the IV-ANN method. We compute the implied volatility and present a qualitative and quantitative analysis.
We also look at the repricing error of our method and the IV-ANN method to replicate $V_{mkt}$. In experiment 3, we
compare our proposed method with a deep neural network implementation similar to the DNN [Chataigner et al.
2020]. We compute the local volatility using both methods and compare the qualitative and quantitative results of
both methods. In experiment 4 we compare our method to a VAE implementation. In this experiment we compare the
local volatility of both methods given a set computation time and evaluate the performance of both methods. Finally,
in experiment 5 we use our pre-trained discriminator and fine-tune a 2-layer generator to generate market consistent
volatility surfaces. This highlights our model’s capability to generalize to other datasets with minimal tuning and a
small number of samples (fewer than 1000). All our numerical experiments were run using Google Colab with 13 GB
of RAM and a dual-core CPU of 2.2 GHz.
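The three repricing measures defined earlier in this section (ARPE, MRPE and the standard deviation of the relative price error) can be sketched as follows. The function names and sample prices are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator is used.

```python
import statistics


def relative_price_errors(v_mkt, v_local):
    """Pointwise relative error |V_mkt - V_local| / |V_mkt|."""
    return [abs(m - l) / abs(m) for m, l in zip(v_mkt, v_local)]


def repricing_metrics(v_mkt, v_local):
    rel = relative_price_errors(v_mkt, v_local)
    return {
        "ARPE": statistics.fmean(rel),   # average relative pricing error
        "MRPE": max(rel),                # maximum relative pricing error
        "std": statistics.pstdev(rel),   # population std (assumed estimator)
    }


# Hypothetical market prices and prices recomputed from the local volatility.
metrics = repricing_metrics([10.0, 20.0], [10.1, 19.6])
print(metrics)
```

For the heatmaps, these statistics would be computed separately at each maturity and strike across the different parameter sets.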

4.1 Experiment 1: Performance Comparison between GAN models

In this experiment, we compare the performance of the two GAN models proposed in our paper. We use MAE and
MAPE to measure the performance of both models for implied volatility. For the local volatility we use the ARPE,
MRPE and the standard deviation of repricing error heatmaps. To begin our comparison, we compare the training
performance of GAN-1 and GAN-2. The training procedures for GAN-1 and GAN-2 are identical and are described
in section 3.

4.1.1 Implied Volatility Smile

To show the effectiveness of our approach, we present the MAE and MAPE of the generator, which is a measure of how
well our generated implied volatility matches the training and validation true implied volatility. We see that the training
and testing loss for both of our models behave similarly for the generator. We summarize the training performance of
GAN-1 and GAN-2 in table 4, which shows the MAE and MAPE of both models and their respective training times.
We notice that GAN-1 trains faster than GAN-2 by 10 seconds but GAN-2 tends to yield better accuracy than GAN-1
in both MAE and MAPE. However, only looking at error does not give us a full picture of the performance of our
proposed method. Thus we also compare more qualitative results.

Table 4 We compare the performance of GAN-1 and GAN-2 using training time, MAE and MAPE.
Model    Training Time    MAE          MAPE
GAN-1    198.689732s      4.2680e−5    0.07898%
GAN-2    208.098838s      2.1376e−5    0.03981%

We compare the qualitative performance of our proposed models GAN-1 (dashed red) and GAN-2 (dashed green)
for different maturities as shown in figure 4. We show a cross section of the implied volatility surface for different
maturities to highlight the tail behaviour of our generated surface. Our target value, shown by the solid blue line, was
constructed from the implied volatility. The red dashed lines are the implied volatilities output from GAN-1 and the
green dashed lines are the implied volatilities output from GAN-2. We can see that the implied volatility approximated
by GAN-2 follows the implied volatility curve but does not match it completely. The implied volatility approximated
by GAN-1 fails to capture the curvature of the implied volatility curve at longer maturities. Though the GAN is a
powerful tool, the generator network still requires some hidden layers to perform well. However, the training time
required for GAN-2 is only about 10 seconds more than for GAN-1, showing that it is still efficient.

Figure 4 The implied volatility computed by GAN-1 with soft constraints (red) and GAN-2 with soft constraints
(green) compared to the Black-Scholes implied volatility (blue).

We compare the generated implied volatility from our models with soft constraints (dashed green) vs our models
without any constraints (dashed red) as shown in figures 5 and 6. As with figure 4, we look at the cross section of the
implied volatility surface at different maturities. We observe the role of the regularizer as maturity increases: the
no-arbitrage penalty terms flatten the curve and allow for better fitting with the implied volatility curve.
In the case with no constraints we can qualitatively observe some concavity in the implied volatility curves, which
suggests that arbitrage has been violated.
We show the number of arbitrage violations during training and testing for both GAN models, with and without soft
constraints, in tables 5 and 6. We found that adding the soft constraints greatly assisted the
generator in avoiding butterfly and vertical spread arbitrage violations. From testing, we see that GAN-1 and GAN-2
without soft constraints perform considerably worse, with 4.90% and 11.15% arbitrage violations respectively
(245/4995 and 557/4995 in table 5).
From table 5 we see a reduction in butterfly and vertical spread arbitrage violations when soft constraints are used.
Note that the soft constraint does not guarantee that arbitrage violations do not occur.

Table 5 Number of butterfly arbitrage violations detected in implied volatility surface.

            GAN-1              GAN-1            GAN-2              GAN-2
            Soft Constraint    No Constraint    Soft Constraint    No Constraint
Training    0/28306            1049/28306       0/28306            2143/28306
Testing     0/4995             245/4995         0/4995             557/4995
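The violation rates quoted in the text follow directly from the counts in table 5. The short sketch below recomputes them; only the counts are taken from the table, while the dictionary layout is an arbitrary choice for illustration.

```python
# Butterfly violation counts without soft constraints, copied from table 5.
counts = {
    ("GAN-1", "training"): (1049, 28306),
    ("GAN-1", "testing"): (245, 4995),
    ("GAN-2", "training"): (2143, 28306),
    ("GAN-2", "testing"): (557, 4995),
}

# Convert each count into a percentage of the evaluated grid points.
rates = {key: 100.0 * violations / total
         for key, (violations, total) in counts.items()}

for (model, split), rate in rates.items():
    print(f"{model} {split}: {rate:.2f}%")
```

The testing rates reproduce the 4.90% and 11.15% figures reported for GAN-1 and GAN-2 without soft constraints.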

Figure 5 The implied volatility approximated by GAN-1 with no constraints (orange) and GAN-1 with soft constraints
(green) compared to the Black-Scholes implied volatility (blue).

Figure 6 The implied volatility approximated by GAN-2 with no constraints (orange) and GAN-2 with soft constraints
(green) compared to the Black-Scholes implied volatility (blue).

Table 6 Number of calendar arbitrage violations detected in implied volatility surface.

            GAN-1              GAN-1            GAN-2              GAN-2
            Soft Constraint    No Constraint    Soft Constraint    No Constraint
Training    0/28306            0/28306          0/28306            0/28306
Testing     0/4995             0/4995           0/4995             0/4995

4.1.2 Local Volatility Smile

Our proposed methods can learn the behaviour of the full local volatility with minimal arbitrage violations. We
show the resulting local volatility surface computed by GAN-1 with soft constraints (left), GAN-2 with soft
constraints (middle) and the FDM local volatility (right) in figure 7. This is done to show qualitatively how the
generated local volatility surfaces compare to the approximate local volatility given by FDM. The surface generated by
GAN-2 tends to fit the FDM local volatility more closely than GAN-1.

Figure 7 The local volatility surface computed by GAN-1 with soft constraints (left), GAN-2 with soft constraints
(middle) and FDM (right).

To compare our models quantitatively, our generated local volatility was used to reprice the Heston option price and to
produce error heatmaps of the ARPE, MRPE and standard deviation of relative error, as shown in figure 8. These heatmaps were
generated by taking the relative error at each maturity and strike. At each maturity and strike we computed the
ARPE, MRPE and standard deviation for different sets of model parameters. From figure 8 we can see that GAN-1 has
a maximum ARPE of 0.1% ± 0.175% and a MRPE of 1.2%. GAN-2 has a maximum ARPE of 0.175% ± 0.175% and
a MRPE of 1%. We also note that the training time of GAN-1 was 230.9626 seconds and that of GAN-2 was 263.9366 seconds.
This opens up the possibility of using either model depending on the time constraints of the problem. The ARPE of
the FDM local volatility surface is 0.8% ± 0.175% and the MRPE is 14%.
In a similar fashion to the implied volatility, we also look at the arbitrage violations of our proposed methods for the
local volatility. However, in contrast to the results for the implied volatility surface, the no-arbitrage soft constraints
did not significantly aid our proposed method in avoiding arbitrage violations, as seen in tables 7 and 8. We see that
with or without soft constraints the generated local volatility does not violate butterfly arbitrage and effectively does
not violate calendar arbitrage in testing. However, this does not mean the soft constraints can be removed from
training the generator.
We look at the cross section of the local volatility surface at each maturity to highlight the effects of soft constraints
on the generated local volatility surfaces, as shown in figure 9. The soft constraints influence the geometry of the generated
local volatility surface. The effect on GAN-2 is the most prominent. Qualitatively, we see that there is a level of
concavity when we generate the local volatility without soft constraints in the log moneyness region [−0.3, 0.3] at
higher maturities.

Table 7 Number of butterfly arbitrage violations detected in local volatility surface.

            GAN-1              GAN-1            GAN-2              GAN-2
            Soft Constraint    No Constraint    Soft Constraint    No Constraint
Training    0/28306            0/28306          0/28306            0/28306
Testing     0/4995             0/4995           0/4995             0/4995

Figure 8 Pricing error heatmap computed from local volatility. GAN-1 with soft constraints (top), GAN-2 with soft
constraints (middle) and FDM (bottom). The warmer colours (yellow) represent higher error values and colder colours
(blue) represent lower error values.

Table 8 Number of calendar arbitrage violations detected in local volatility surface.

            GAN-1              GAN-1            GAN-2              GAN-2
            Soft Constraint    No Constraint    Soft Constraint    No Constraint
Training    0/28306            28/28306         0/28306            0/28306
Testing     0/4995             1/4995           0/4995             0/4995

4.1.3 Comparison to GAN without MSE Loss

We construct GAN-1 and GAN-2 without the MSE loss function to compare our formulation with the formulation
presented in [Sidogi et al. 2022]. In our experiment we found that without the MSE loss term the GAN is unable to fully
capture the implied volatility surface. More specifically, it undershoots the volatility value, resulting in a surface that is
inconsistent with the market. This is likely due to the synthetic data being difficult to segment. In figure 10 we show
the GAN-1 and GAN-2 networks without $L_{MSE}$.
In figure 10, we see that the implied volatility generated by the GAN without MSE cannot fit the ground truth implied
volatility at any point and fails to estimate an implied volatility surface that is market consistent. This highlights the
importance of including the MSE term in the loss function.

4.2 Experiment 2: Comparison of Implied Volatility with IV-ANN

In this experiment, we compare the performance of our proposed methods to the IV-ANN method [Liu et al. 2019b]
with 4 hidden layers and the ReLU activation function. We design the GAN network such that the MAE of the two
approaches is similar. Doing this, we find that our proposed GAN-2 is roughly two times faster than the IV-ANN
approach. The performance is gauged by the mean absolute error of the predicted values; the MAE for the IV-ANN
was reported as 9.73e−5 in [Liu et al. 2019b]. We summarize the training time, mean absolute error and mean
absolute percent error for each method in table 9. As seen in table 9, using the same training and testing sets,
GAN-2 outperforms IV-ANN in runtime when both are trained to a similar mean absolute error.

Figure 9 The local volatility computed by GAN-2 with no constraints (red) and GAN-2 with soft constraints (green).

Figure 10 The implied volatility computed by GAN-1 without MSE (red) and GAN-2 without MSE (green) compared
to the Black-Scholes implied volatility (blue).

Table 9 Timing and error comparison between GAN-2 and IV-ANN

Model     Training Time    MAE          MAPE
IV-ANN    446.204831s      2.3235e−5    0.044106%
GAN-2     209.194824s      2.1376e−5    0.039810%

Next we evaluate the quality of our proposed method vs the IV-ANN method. We show the implied volatility learned
by the IV-ANN method in figure 11. Figure 11 looks at the cross section of the implied volatility surface generated by
the IV-ANN method for different maturities. From figure 11 we see that the implied volatility learned by IV-ANN has
difficulty fitting the Black-Scholes implied volatility qualitatively.

Figure 11 Implied volatility cross section generated by the IV-ANN for different maturities.

We compare the repricing performance of our proposed method vs the IV-ANN method. The repricing error heatmaps
of the IV-ANN method and GAN-2 are shown in figure 12. We see that our proposed GAN-2 network outperforms the
IV-ANN method. The IV-ANN method has a maximum ARPE of 0.080% ± 0.12% and a MRPE of 0.70%. Our
proposed method has a maximum ARPE of 0.08% ± 0.1% and a MRPE of 0.50%.

Figure 12 Repricing error heatmaps for the GAN-2 implied volatility (top) and the IV-ANN implied volatility (bottom). The warmer
colors (yellow) represent higher error values and colder colors (blue) represent lower error values.

4.3 Experiment 3: Comparison of Local Volatility Computation with Deep Neural Network and SSVI

In this experiment we compare the local volatility computed by our method with a deep neural network (DNN). We
use the pricing error given by the SSVI method [Gatheral and Jacquier 2014] as a benchmark. We implemented the
deep neural network method using 4 hidden layers with 400 nodes with ReLU activation functions. The SSVI was
computed using the analytical equation (4) with the Heston-like parametric function in equation (5). Note that the
local volatility surface is not unique [Lee 2005], thus many different local volatility surfaces may satisfy given market
prices.
In this experiment we trained our proposed method and the DNN using the data constructed according to section 3.4.
Then the local volatility was computed using our proposed method, the deep neural network, and SSVI on the generated
out-of-training data. The training time, maximum ARPE and MRPE for our proposed method, the DNN and the
benchmark SSVI are summarized in table 10.

Table 10 We compare the performance of GAN-1, GAN-2, and DNN to the SSVI.
Model Training Time Max ARPE MRPE
SSVI – 0.8% ± 0.2% 1.0%
GAN-1 198.689732s 0.1% ± 0.175% 1.2%
GAN-2 208.098838s 0.175% ± 0.175% 1.0%
DNN 246.32142s 0.8% ± 0.2% 3.5%

The results of table 10 show that our method is more efficient and accurate than the DNN. We also compared the
arbitrage violations that occurred in our proposed method to the DNN and SSVI method. The arbitrage violations for
the DNN and SSVI are shown in table 11. We can see that during training and testing none of the methods produces any
arbitrage opportunities.

Table 11 Number of arbitrage violations detected in local volatility surface computed by the DNN and SSVI method.
            GAN-2              DNN                SSVI
            Soft Constraint    Soft Constraint    Hard Constraint
Training    0/28306            0/28306            0/28306
Testing     0/4995             0/4995             0/4995

Finally, we compare the pricing errors produced by each method, shown in figures 8 and 13. We can clearly see that
our proposed method GAN-2 has a maximum ARPE of 0.175% ± 0.175%. This is better than the SSVI and DNN,
with maximum ARPE of 0.8% ± 0.2%. Our proposed method GAN-2 also performs comparably to the SSVI in terms
of MRPE (1% for both) and better than the DNN, with a MRPE of 3.5%.

Figure 13 Pricing error computed from local volatility using DNN (top) and SSVI (bottom). The warmer colours
(yellow) represent higher error values and colder colours (blue) represent lower error values.

4.4 Experiment 4: Comparison of Local Volatility Computation with VAE

In this experiment we compare the local volatility computed by our proposed method with a VAE. We
implemented the VAE using 2 hidden layers with 200 nodes. The number of hidden layers and nodes were chosen such
that the training time for the VAE and our GAN methods were the same. The encoder was constructed with a ReLU
activation function for the hidden layers and a sigmoid activation function for the output. The decoder was constructed
with the ReLU activation function for all layers. The VAE was trained over the same training set as our proposed method
as outlined in experiment 3. To compare our methods we use the same set up as in experiment 3. Given similar training
times of 229.9448 seconds, the pricing errors for the trained VAE are shown in figure 15. The maximum ARPE for
the VAE method is 2% ± 0.8% and the MRPE is 50%; this is due to the noise generated in VAE models. Given the same
training time, the VAE performs worse than our proposed GAN-1, as we saw in experiment 1.

Figure 14 Local volatility surface generated by trained VAE using a shallow network (left) and a deep network (right).

Given the same training time, the local volatility surface produced by our proposed method is smooth, as shown in
figure 7. This is in contrast to the noisy surface generated by the VAE, shown on the left-hand side of figure 14. We
note that the VAE method can produce smooth local volatility surfaces, as shown on the right-hand side of figure 14;
however, it requires a deeper network for the encoder or larger batch sizes. Adding the extra layers to smooth the local
volatility surface resulted in a training time of 1036.027 seconds. Noise is present in the VAE because the encoder
network is trained to generate the parameters of a normal distribution that fits the data, and data are then sampled
using these learned parameters [Kingma and Welling 2014]. This adds noise to the output if an insufficient number of
samples is generated.
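
The effect of the sampling step can be illustrated with a minimal sketch. It assumes a hypothetical encoder output
(`mu`, `sigma`) for a single surface point and uses the identity map in place of a decoder; these values and names are
illustrative only and are not taken from our experiments. The sketch shows that the spread of an averaged output
shrinks as more samples are drawn.

```python
import random
import statistics


def sample_latent(mu, sigma, rng):
    """Reparameterized draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    return mu + sigma * rng.gauss(0.0, 1.0)


def averaged_output(mu, sigma, n_samples, rng):
    """Average n_samples draws; a stand-in for averaging decoded outputs."""
    draws = [sample_latent(mu, sigma, rng) for _ in range(n_samples)]
    return sum(draws) / n_samples


def spread_of_average(mu, sigma, n_samples, n_repeats=2000, seed=0):
    """Empirical standard deviation of the averaged output over repeated runs."""
    rng = random.Random(seed)
    runs = [averaged_output(mu, sigma, n_samples, rng) for _ in range(n_repeats)]
    return statistics.pstdev(runs)


# Hypothetical encoder output for one point of the surface (assumption).
mu, sigma = 0.20, 0.05
noise_1 = spread_of_average(mu, sigma, n_samples=1)
noise_100 = spread_of_average(mu, sigma, n_samples=100)
print(f"spread with 1 sample:    {noise_1:.4f}")
print(f"spread with 100 samples: {noise_100:.4f}")
```

Under these assumptions the spread falls roughly with the square root of the number of samples, which is consistent
with the noisy surface observed when too few samples are generated.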

Figure 15 Pricing errors computed for different maturities and strikes using the VAE method. The warmer colours
(yellow) represent higher error values and colder colours (blue) represent lower error values.

4.5 Experiment 5: Generating Market Consistent Volatility Surfaces

In this experiment we generate market volatility surfaces using limited market data. This experiment shows that our
method can be adapted to market data with minimal additional training and limited sample data. This is achieved
using a 2-layer discriminator pre-trained on synthetic data. We fine-tune the 2-layer generator, which is retrained
on a limited subset of the market data over 200 epochs with a batch size of 32. The GAN-2 model is tested on a
separate test set that was not used for training the generator.

4.5.1 Implied Volatility Smile

We use GAN-2 to generate a market consistent implied volatility surface. This shows the capability of our network
to generate market consistent volatility surfaces just by retraining the generator. We present the market consistent
implied volatility surface in figure 16.

Figure 16 Implied volatility generated by the generator (left) and the market implied volatility (right).

Figure 16 shows the implied volatility surfaces from our generator and the market implied volatility. We see that they
are nearly identical qualitatively. Quantitatively, we measured the MAE score for GAN-2 as 0.007 and the MAPE score
as 7.28%.
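
As a minimal sketch of how such scores can be computed, assuming the standard definitions of the mean absolute error
and the mean absolute percentage error, consider the following. The volatility values are hypothetical and chosen only
to make the arithmetic easy to follow; they are not our market data.

```python
def mae(predicted, observed):
    """Mean absolute error between two equally sized sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def mape(predicted, observed):
    """Mean absolute percentage error in percent; observed values must be non-zero."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(p - o) / abs(o) for p, o in zip(predicted, observed)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical implied volatilities at four grid points (assumption).
generated_iv = [0.21, 0.19, 0.18, 0.20]
market_iv = [0.20, 0.20, 0.18, 0.25]

print(f"MAE  = {mae(generated_iv, market_iv):.4f}")
print(f"MAPE = {mape(generated_iv, market_iv):.2f}%")
```

Because the MAPE divides by the observed value, points with low market volatility contribute more to it than to the MAE.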

4.5.2 Local Volatility Smile

We use GAN-2 to generate a market consistent local volatility surface. The market local volatility was approximated
using the finite difference method. We present the market consistent local volatility surface in figure 17.
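
A finite difference approximation of this kind can be sketched as follows. The sketch assumes Dupire's formula for call
prices with zero dividend yield and central differences in strike and maturity; it is not necessarily the
discretization given in section 2.2.2. The function names, grid spacings and the flat Black-Scholes test surface are
illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, maturity, rate, sigma):
    """Black-Scholes call price with no dividends."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def dupire_local_vol(price, strike, maturity, rate, dK, dT):
    """Local volatility from central differences of a call price function price(K, T)."""
    dC_dT = (price(strike, maturity + dT) - price(strike, maturity - dT)) / (2.0 * dT)
    dC_dK = (price(strike + dK, maturity) - price(strike - dK, maturity)) / (2.0 * dK)
    d2C_dK2 = (
        price(strike + dK, maturity)
        - 2.0 * price(strike, maturity)
        + price(strike - dK, maturity)
    ) / dK ** 2
    numerator = dC_dT + rate * strike * dC_dK
    denominator = 0.5 * strike ** 2 * d2C_dK2
    if numerator <= 0.0 or denominator <= 0.0:
        raise ValueError("non-positive local variance on this grid")
    return math.sqrt(numerator / denominator)


# Sanity check on a flat surface: the local volatility should equal the flat volatility.
SPOT, RATE, FLAT_VOL = 100.0, 0.02, 0.2


def flat_price(strike, maturity):
    return bs_call(SPOT, strike, maturity, RATE, FLAT_VOL)


sigma_local = dupire_local_vol(flat_price, strike=105.0, maturity=1.0, rate=RATE, dK=0.5, dT=0.01)
print(f"recovered local volatility: {sigma_local:.4f}")
```

The guard on the numerator and denominator matters for market data, where sparse or noisy quotes can make the finite
difference estimates of the derivatives negative.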

Figure 17 Local volatility generated by the generator (left) and the market local volatility (right).

Figure 17 shows the local volatility surface from our generator alongside the market surface. Quantitatively, we
look at the repricing errors from our local volatility, given in figure 18.

Figure 18 Pricing errors computed for different maturities and strikes using GAN-2 for market local volatilities. The
warmer colours (yellow) represent higher error values and colder colours (blue) represent lower error values.

In figure 18 we see smaller errors at shorter times to maturity than at longer times to maturity for strikes that are
far OTM. This is due to the lack of data points in this region.
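
A repricing error grid of this kind can be summarized as in the sketch below. It assumes that the pointwise error is
the relative pricing error |model price − market price| / market price, and that ARPE and MRPE denote its average and
maximum; the prices are hypothetical, with rows indexing maturities and columns indexing strikes.

```python
def relative_pricing_errors(model_prices, market_prices):
    """Pointwise |model - market| / market on a maturity-by-strike grid."""
    return [
        [abs(m - p) / p for m, p in zip(model_row, market_row)]
        for model_row, market_row in zip(model_prices, market_prices)
    ]


def average_and_max(errors):
    """Average and maximum of all entries of the error grid."""
    flat = [e for row in errors for e in row]
    return sum(flat) / len(flat), max(flat)


# Hypothetical call prices (assumption): rows are maturities, columns are strikes.
market = [[10.0, 5.0, 1.0], [12.0, 8.0, 4.0]]
model = [[10.1, 5.0, 1.1], [12.0, 8.4, 4.0]]

errors = relative_pricing_errors(model, market)
avg_err, max_err = average_and_max(errors)
print(f"average relative error: {avg_err:.4f}")
print(f"maximum relative error: {max_err:.4f}")
```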

5 Conclusion
In this paper, we present a framework to generate volatility surfaces efficiently using a GAN. By using a generator
and discriminator together we are able to create a model framework that is efficient and accurate with shallow, narrow
networks. We used no-arbitrage penalty terms with the MSE loss function to penalize arbitrage opportunities generated
by the generators. The log-likelihood estimation of the discriminator adds to the generator by allowing the posterior
generator to successfully output a valid volatility surface that is consistent with the option price. The discriminator
was trained as a classifier to classify the volatility as true or false.
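
The idea of combining soft no-arbitrage penalties with an MSE term can be sketched as follows. The sketch assumes the
conditions are checked on a grid of call prices (prices non-decreasing in maturity, convex in strike) and omits the
discriminator term; the function names, weights and price grids are hypothetical and do not reproduce our exact loss.

```python
def calendar_penalty(prices, dT):
    """Sum of max(0, -dC/dT): penalizes prices that decrease with maturity."""
    total = 0.0
    for i in range(len(prices) - 1):
        for j in range(len(prices[i])):
            dC_dT = (prices[i + 1][j] - prices[i][j]) / dT
            total += max(0.0, -dC_dT)
    return total


def butterfly_penalty(prices, dK):
    """Sum of max(0, -d2C/dK2): penalizes prices that are not convex in strike."""
    total = 0.0
    for row in prices:
        for j in range(1, len(row) - 1):
            d2C_dK2 = (row[j + 1] - 2.0 * row[j] + row[j - 1]) / dK ** 2
            total += max(0.0, -d2C_dK2)
    return total


def penalized_mse(mse, prices, dT, dK, lam_cal=1.0, lam_but=1.0):
    """MSE plus weighted soft penalties; lam_cal and lam_but are assumed weights."""
    return mse + lam_cal * calendar_penalty(prices, dT) + lam_but * butterfly_penalty(prices, dK)


# Hypothetical price grids (assumption): rows are maturities, columns are increasing strikes.
clean = [[10.0, 6.0, 3.0], [11.0, 7.0, 4.0]]
bad = [[10.0, 7.0, 3.0], [11.0, 6.5, 4.0]]

print(penalized_mse(0.01, clean, dT=0.5, dK=1.0))  # no violations, penalty terms vanish
print(penalized_mse(0.01, bad, dT=0.5, dK=1.0))    # one calendar and one butterfly violation
```

Because the penalties are zero on arbitrage-free grids, they only influence training when the generated surface
violates a condition, which is what makes them soft constraints rather than hard ones.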
Our numerical results show that GAN-2 outperforms GAN-1. However, a deeper discriminator network does not
perform better than a shallow one. The majority of arbitrage violations detected by our discriminator
are driven by butterfly arbitrage. Regularization using no-arbitrage soft constraints helps mitigate this for implied
volatility computation. We show that our method trains faster and predicts more accurately than the IV-ANN method.
In particular, the GAN-2 model only required 209.195 seconds and produced a MAPE of 3.981e−5, compared to our
implementation of the IV-ANN model, which took 446.205 seconds with a MAPE of 4.4106e−5.
We further showed the capability of our network to compute the local volatility and compared our model with a deep
neural network and SSVI. As shown in figure ?? our method produces a maximum ARPE of 0.1% ± 0.117%, which
is better than both the deep neural network and SSVI methods, with maximum ARPE of 0.8% ± 0.2%. Our proposed
method also has an MRPE comparable to that of SSVI at 1%. We have shown that our model can be generalized to market
data with limited data samples using a pre-trained discriminator and fine-tuning the generator. We show that we can
generate market consistent volatility surfaces in figures 16 and 17.
With this paper we have shown the potential of generative methods in solving the inverse problem. In this work we
assume that only one price exists for an option. It would be interesting to explore how generative models behave in a
market setting with more than one offered price and how they learn the volatility surfaces. It would also be interesting
to see an extension to diffusion models to improve generated surface quality.

Disclosure Statement
No potential conflict of interest is reported by the author(s).

Funding
This work was supported by the Natural Sciences and Engineering Research Council of Canada.

ORCID
Andrew Na: https://orcid.org/0000-0002-6162-8171
Justin W.L. Wan: https://orcid.org/0000-0001-8367-6337

References
D. Ackerer, N. Tagasovska, and T. Vatter. Deep smoothing of the implied volatility surface. 34th Conference on
Neural Information Processing Systems, 2020.
M. Annunziato, A. Borzi, F. Nobile, and R. Tempone. On the connection between the Hamilton-
Jacobi-Bellman and the Fokker-Planck control frameworks. Applied Mathematics, 5(16):2476–2484, 2014.
doi:10.4236/am.2014.516239.
Maxime Bergeron, Nicholas Fung, John Hull, Zissis Poulos, and Andreas Veneris. Variational autoencoders: A hands-
off approach to volatility. The Journal of Financial Data Science, 4(2):125–138, 2022. doi:10.3905/jfds.2022.1.093.
URL https://doi.org/10.3905/jfds.2022.1.093.
Phelim P. Boyle and Draviam Thangaraj. Volatility estimation from observed option prices. Decisions in Economics
and Finance, 23(1):31–52, 2000. doi:10.1007/s102030050004. URL https://doi.org/10.1007/s102030050004.
Richard P. Brent. Algorithms for minimization without derivatives. Prentice Hall, Upper Saddle River, NJ, 1973.
Rene A. Carmona. HJM: A Unified Approach to Dynamic Models for Fixed Income, Credit and Equity Markets, pages
1–50. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007. ISBN 978-3-540-73327-0.
doi:10.1007/978-3-540-73327-0_1. URL https://doi.org/10.1007/978-3-540-73327-0_1.
P. Carr and D.B. Madan. A note on sufficient conditions for no arbitrage. Finance Research Letters, pages 125–130,
2005.
M. Chataigner, S. Crépey, and M. Dixon. Deep local volatility. Risks, 2020.
Marc Chataigner, Areski Cousin, Stéphane Crépey, Matthew Dixon, and Djibril Gueye. Short communication: Beyond
surrogate modeling: Learning the local volatility via shape constraints. SIAM Journal on Financial Mathematics,
12(3):SC58–SC69, 2021. doi:10.1137/20M1381538. URL https://doi.org/10.1137/20M1381538.
Thomas F. Coleman, Yuying Li, and Arun Verma. Reconstructing the unknown local volatility function. Journal of
Computational Finance, 2(3):77–102, 2000. doi:10.21314/JCF.1999.027.
R. Cont and J. da Fonseca. Dynamics of implied volatility surfaces. Quantitative Finance, pages 45–60, 2002.
Rama Cont and Milena Vuletic. Simulation of arbitrage free implied volatility surfaces. 2022.
doi:10.2139/ssrn.4299363.
Christa Cuchiero, Wahid Khosrawi, and Josef Teichmann. A generative adversarial network approach to calibration
of local stochastic volatility models. Risks, 8(4), 2020. ISSN 2227-9091. doi:10.3390/risks8040101.
URL https://www.mdpi.com/2227-9091/8/4/101.
G. Dimitroff, D. Roder, and C.P. Fries. Volatility model calibration with convolutional neural networks. SSRN, 2018.
Charles Dugas, Y. Bengio, François Bélisle, and Claude Nadeau. Incorporating second-order functional knowledge for
better option pricing. Cirano Working Papers, 02 2001.
Robin Dunn, Paloma Hauser, Tom Seibold, and Hugh Gong. Estimating option prices with Heston's stochastic
volatility model. 2014.
B. Dupire. Pricing with a smile. Risk, pages 18–20, 1994.
Valdo Durrleman. Convergence of at-the-money implied volatilities to the spot volatility. Journal of Applied Proba-
bility, 45(2):542–550, 2008. ISSN 00219002. URL http://www.jstor.org/stable/27595963.
F. Fang and C. Oosterlee. A novel pricing method for European options based on Fourier-cosine series expansions.
SIAM Journal on Scientific Computing, pages 826–848, 2009.
Jim Gatheral and Antoine Jacquier. Convergence of Heston to SVI. Quantitative Finance, 11(8):1129–1132, 2011.
doi:10.1080/14697688.2010.550931. URL https://doi.org/10.1080/14697688.2010.550931.
Jim Gatheral and Antoine Jacquier. Arbitrage-free SVI volatility surfaces. Quantitative Finance, 14(1):59–71, 2014.
doi:10.1080/14697688.2013.819986. URL https://doi.org/10.1080/14697688.2013.819986.
Jim Gatheral. Lecture 1: Stochastic volatility and local volatility. Merrill Lynch, 2001a.
Jim Gatheral. Lecture 2: Fitting the volatility skew. Merrill Lynch, 2001b.
I. Goodfellow, Y. Bengio, and A. Courville. Deep learning, adaptive computation and machine learning. MIT Press:
Cambridge, MA, 2016.
A. Hernandez. Model calibration with neural networks. SSRN, 2016.
S.L. Heston. A closed-form solution for options with stochastic volatility with applications to bond and currency
options. Review of Financial Studies, pages 327–343, 1993.
A. Hirsa, T. Karatas, and A. Oskoui. Supervised deep neural networks (DNNs) for pricing/calibration of vanilla/exotic
options under various different processes, 2019.
B. Horvath, A. Muguruza, and M. Tomas. Deep learning volatility: a deep neural network perspec-
tive on pricing and calibration in (rough) volatility models. Quantitative Finance, 21(1):11–27, 2021.
doi:10.1080/14697688.2020.1817974. URL https://doi.org/10.1080/14697688.2020.1817974.
A. Itkin. Deep learning calibration of option pricing models: some pitfalls and solutions. arXiv:1906.03507v1, 2019.
Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. 2nd International Conference on Learning
Representations, ICLR, Banff, AB, Canada, 2014.
D.P. Kingma and J. Ba. Adam: A method for stochastic optimization, 2017.
Roger W. Lee. Implied Volatility: Statics, Dynamics, and Probabilistic Interpretation, pages 241–268. Springer US,
Boston, MA, 2005. ISBN 978-0-387-23394-9. doi:10.1007/0-387-23394-6_11.
URL https://doi.org/10.1007/0-387-23394-6_11.
S. Liu, A. Borovykh, L.A. Grzelak, and C. Oosterlee. A neural network-based framework for financial model calibra-
tion. Journal of Mathematics in Industry, 2019a.
S. Liu, C. Oosterlee, and S.M. Bohte. Pricing options and computing implied volatilities using neural networks. Risks,
2019b.
T. Poggio, H. Mhaskar, L. Rosasco, B. Miranda, and Q. Liao. Why and when can deep-but not shallow-networks avoid
the curse of dimensionality: A review. International Journal of Automation and Computing, pages 503–519, 2017.
M. Roper. Arbitrage free implied volatility surfaces, 2010.
Thendo Sidogi, Wilson Tsakane Mongwe, Rendani Mbuvha, and Tshilidzi Marwala. Creating synthetic volatility
surfaces using generative adversarial networks with static arbitrage loss conditions. In 2022 IEEE Symposium
Series on Computational Intelligence (SSCI), pages 1423–1429, 2022. doi:10.1109/SSCI51031.2022.10022219.
J.D. Spiegeleer, D. Madan, S. Reyner, and W. Schoutens. Machine learning for quantitative finance: fast derivative
pricing, hedging and fitting. Quantitative Finance, 18(10):1635–1643, 2018.
Milena Vuletic, Mihai Cucuringu, and Felix Prenzel. Fin-GAN: Forecasting and classifying financial time series via
generative adversarial networks. 2023. doi:10.2139/ssrn.4328302.
