A Crash Intro to Stochastic Differential Equations and Poisson Processes

School of Economics
Yonsei University
Kyongchae Jung
In order to define stochastic differential equations, I first illustrate the Brownian motion process, a fundamental process with continuous paths. Given its importance in default modeling, I also introduce the Poisson process, to some extent the purely-jump analogue of Brownian motion. Brownian motions and Poisson processes are among the most important random processes in probability.

These notes are far from being complete or fully rigorous, in that I privilege the intuitive aspects, but I give references for the reader who wishes to deepen her knowledge of such matters.
I note that understanding, and subsequently implementing, most of the essential and important issues in interest rate modeling does not require excessively exotic tools of stochastic calculus. The basic paradigms, risk-neutral valuation and change of numeraire, essentially involve Ito's formula and the Girsanov theorem, and I therefore introduce these results quickly and intuitively. The fact that I do not insist upon more fundamental issues is also justified by the observation that most of the questions one has to address in practice can very often be solved with the basic tools above.
1. Stochastic Differential Equations

Here I present a quick and informal introduction to SDEs. I consider the scalar case to simplify exposition.
Consider, as a classical example, the deterministic model of population growth

$$dX_t = \mu X_t \, dt, \qquad X_0 = x_0,$$

where $\mu$ is a real constant. Now suppose that, due to some complications, it is no longer realistic to assume the initial condition to be a deterministic constant. Then we may decide to let the initial condition be a random variable $Z$, and to model the population growth by the differential equation

$$dX_t = \mu X_t \, dt, \qquad X_0 = Z.$$

As a further step, suppose that not even $\mu$ is known for certain, but that our knowledge of $\mu$ is also perturbed by some randomness, which we model as the "increment" of a stochastic process $(W_t,\ t \ge 0)$, so that

$$dX_t = \mu X_t \, dt + \sigma X_t \, dW_t, \qquad X_0 = Z.$$

More generally, a (scalar) stochastic differential equation (SDE) is written as

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t, \qquad X_0 = Z. \qquad (2)$$
The function $b$, corresponding to the deterministic part of the SDE, is called the drift. The function $\sigma$ (or sometimes its square $\sigma^2$) is called the diffusion coefficient. Note that randomness enters the differential equation from two sources: the "noise term" $\sigma(t, X_t)\, dW_t$ and the initial condition $Z$.
Usually, the solution $X$ of the SDE is also called a diffusion process, because some particular SDEs can be used to model physical diffusion. In general the paths $t \mapsto X_t$ of a diffusion process are continuous.
2. Brownian Motion
The process $W$ whose "increments" are candidates for representing the noise process in (2) is the Brownian motion. This process has important properties: it starts at zero, $W_0 = 0$, it has continuous paths, and it has stationary and independent Gaussian increments. More precisely, for any $t > s \ge 0$:

$$W_t - W_s \ \text{is independent of} \ (W_u,\ u \le s), \qquad (3)$$

$$W_t - W_s \ \text{has the same law as} \ W_{t-s} - W_0, \ \text{namely} \ \mathcal{N}(0,\ t - s). \qquad (4)$$
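As a quick illustration (my own sketch, not part of the original notes), the increment properties above translate directly into a simulation recipe: on a discrete time grid one draws independent $\mathcal{N}(0, \Delta t)$ increments and cumulates them. The horizon and grid size below are arbitrary choices.

```python
import numpy as np

# Minimal sketch: simulate a Brownian path on [0, T] by cumulating
# independent N(0, dt) increments, i.e. properties (3)-(4) on a grid.
rng = np.random.default_rng(0)
T, n = 1.0, 1000
dt = T / n
increments = rng.normal(0.0, np.sqrt(dt), size=n)   # W_{t+dt} - W_t ~ N(0, dt)
W = np.concatenate(([0.0], np.cumsum(increments)))  # W_0 = 0
t = np.linspace(0.0, T, n + 1)                      # matching time grid
```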
3. Stochastic Integrals
Since the derivative $dW_t/dt$ does not exist (Brownian paths are nowhere differentiable), what meaning can we give to equation (2)? The answer is that (2) is shorthand for the integral equation

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s, \qquad (6)$$

so that, from now on, all differential equations involving terms like $dW_t$ are meant as integral equations, in the same way as (2) will be an abbreviation for (6).

However, we are not done yet, since we have to deal with the new problem of defining an integral like $\int_0^T \varphi_t\, dW_t$.
A priori it is not possible to define it as a Stieltjes integral on the paths, since they have unbounded variation. Nonetheless, under some "reasonable" assumptions that we do not mention, it is still possible to define such integrals à la Stieltjes. The price to be paid is that the resulting integral will depend on the chosen points of the sub-partitions (whose mesh tends to zero) used in the limit that defines the integral.
More specifically, consider the following definition. Take an interval $[0, T]$ and consider the following dyadic partition of $[0, T]$, depending on an integer $n$:

$$t_i^n := \min\!\left(\frac{i}{2^n},\ T\right), \qquad i = 0, 1, 2, \ldots$$

Notice that from a certain $i$ on all terms collapse to $T$, i.e., $t_i^n = T$ for all $i \ge 2^n T$. For each $n$ we have such a partition, and when $n$ increases the partition contains more elements, giving a better discrete approximation of the continuous interval $[0, T]$. Then define the integral as

$$\int_0^T \varphi_t\, dW_t := \lim_{n \to \infty} \sum_i \varphi_{s_i^n}\left(W_{t_{i+1}^n} - W_{t_i^n}\right),$$

where $s_i^n$ is any point in the interval $[t_i^n, t_{i+1}^n]$. Now, by choosing $s_i^n = t_i^n$ (initial point of the subinterval) we have the definition of the Ito integral, whereas by taking $s_i^n = (t_i^n + t_{i+1}^n)/2$ (middle point) we obtain a different result, the Stratonovich integral.
The Ito integral has interesting probabilistic properties (for example, it is a martingale,
an important type of stochastic process that will be briefly defined below), but leads to a
calculus where the standard chain rule is not preserved since there is a non-zero contribution
of the second order terms. On the contrary, although probabilistically less interesting, the
Stratonovich integral does preserve the ordinary chain rule, and is preferable from the
viewpoint of properties of the paths.
To better understand the difference between these two definitions, we can resort to the following classical example of a stochastic integral computed both with the Ito calculus and with the Stratonovich calculus:

$$\text{Ito:} \qquad \int_0^T W_t\, dW_t = \frac{W_T^2}{2} - \frac{T}{2},$$

$$\text{Stratonovich:} \qquad \int_0^T W_t \circ dW_t = \frac{W_T^2}{2},$$

where the symbol "$\circ$" is the one commonly introduced to distinguish the Stratonovich version from the Ito one. In the Ito version, the "$-T/2$" term originates from second-order effects, which are not negligible as they are in ordinary calculus. Note that the first integral is a martingale (so that, for example, it has constant expected value equal to zero, which is an important probabilistic property), but does not satisfy the formal rules of calculus, as the second one instead does (although the second one is not a martingale).
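A quick numerical check (my own illustration, not from the notes): approximating $\int_0^T W_t\, dW_t$ with left-point sums converges to $W_T^2/2 - T/2$, while midpoint-style sums converge to $W_T^2/2$.

```python
import numpy as np

# Sketch: left-point (Ito) vs midpoint (Stratonovich) sums for the integral of W dW.
rng = np.random.default_rng(1)
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

ito = np.sum(W[:-1] * dW)                    # integrand evaluated at the left endpoint
strat = np.sum(0.5 * (W[:-1] + W[1:]) * dW)  # integrand evaluated at the midpoint value

print(ito, W[-1]**2 / 2 - T / 2)   # these two should be close
print(strat, W[-1]**2 / 2)         # and these two as well
```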
4. Martingales

In our discussion above, we have mentioned the concept of martingale. To give a quick idea, consider a process $X$ that is adapted to the information flow (filtration) $(\mathcal{F}_t)_{t \ge 0}$ and integrable. A martingale is a process satisfying these two (measurability and integrability) conditions and such that the following property holds for each $t \le T$:

$$E[X_T \mid \mathcal{F}_t] = X_t.$$

This definition states that, if we consider $t$ as the present time, the expected value at a future time $T$ given the current information is equal to the current value. This is, among other
things, a picture of a "fair game", where it is not possible to gain or lose on average. It turns
out that the martingale property is also suited to model the absence of arbitrage in
mathematical finance. To avoid arbitrage, one requires that certain fundamental processes of
the economy be martingales, so that there are no "safe" ways to make money from nothing out
of them.
A submartingale is a process satisfying the same measurability and integrability conditions but for which, for each $t \le T$,

$$E[X_T \mid \mathcal{F}_t] \ge X_t.$$

This means that the expected value of the process grows in time, and that averages of future values of the process given the current information always exceed (or at least are equal to) the current value. Analogously, a supermartingale satisfies, for each $t \le T$,

$$E[X_T \mid \mathcal{F}_t] \le X_t,$$

so that the expected value of the process decreases in time, and averages of future values of the process given the current information are always smaller than (or at most equal to) the current value.
5. Quadratic Variation
The quadratic variation of a process $X$ with continuous paths can be defined, using the dyadic partitions introduced above, as

$$\langle X \rangle_t := \lim_{n \to \infty} \sum_i \left(X_{t_{i+1}^n \wedge t} - X_{t_i^n \wedge t}\right)^2.$$

It is easy to check that a process $X$ whose paths $t \mapsto X_t$ are differentiable for almost all $t$ satisfies $\langle X \rangle_t = 0$. In case $W$ is a Brownian motion, it can be proved, instead, that

$$\langle W \rangle_t = t, \qquad \text{or, in differential notation,} \qquad dW_t\, dW_t = dt.$$

Again, this comes from the fact that the Brownian motion moves so quickly that second-order effects are not negligible. Instead, a process whose trajectories are differentiable cannot move so quickly, and therefore its second-order effects do not contribute.

In case the process $X$ is equal to the deterministic process $t \mapsto t$, so that $X_t = t$, we immediately retrieve the classical result from (deterministic) calculus:

$$\langle X \rangle_t = 0, \qquad \text{i.e.} \qquad dt\, dt = 0.$$
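As a sanity check (my own illustration), the realized sum of squared increments of a simulated Brownian path over $[0, T]$ approaches $T$ as the grid is refined, while for the smooth path $X_t = t$ it goes to zero.

```python
import numpy as np

# Sketch: realized quadratic variation of Brownian motion vs. a differentiable path.
rng = np.random.default_rng(2)
T = 1.0
for n in (100, 10_000, 1_000_000):
    dt = T / n
    W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
    X = np.linspace(0.0, T, n + 1)        # the deterministic path X_t = t
    print(n, np.sum(np.diff(W) ** 2),     # approaches T = 1
             np.sum(np.diff(X) ** 2))     # approaches 0 (equals T * dt here)
```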
6. Quadratic Covariation
One can also define the quadratic covariation of two processes $X$ and $Y$ with continuous paths as

$$\langle X, Y \rangle_t := \lim_{n \to \infty} \sum_i \left(X_{t_{i+1}^n \wedge t} - X_{t_i^n \wedge t}\right)\left(Y_{t_{i+1}^n \wedge t} - Y_{t_i^n \wedge t}\right).$$

In particular, for a Brownian motion $W$ and the deterministic process $t \mapsto t$ one obtains, in differential notation,

$$dW_t\, dt = 0, \qquad dt\, dW_t = 0.$$
Let us go back to our general SDE, and let us take time-homogeneous coefficients for simplicity:

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t, \qquad X_0 = Z. \qquad (9)$$

Under which conditions does it admit a unique solution in the Ito sense? Standard theory tells us that it is enough to have both the $b$ and $\sigma$ coefficients satisfying Lipschitz continuity and linear growth (the latter does not follow automatically in the time-inhomogeneous case or with local Lipschitz continuity only). These sufficient conditions are valid for deterministic differential equations as well, and can be weakened, especially in dimension one. Typical examples showing how, without linear growth or Lipschitz continuity, existence and uniqueness of solutions can fail are the following deterministic equations:

$$\frac{dX_t}{dt} = X_t^2, \quad X_0 = 1 \ \Longrightarrow\ X_t = \frac{1}{1 - t}, \quad t \in [0, 1),$$

which explodes in finite time, so that no global solution exists (linear growth fails), and

$$\frac{dX_t}{dt} = 3 X_t^{2/3}, \quad X_0 = 0 \ \Longrightarrow\ X_t = (t - a)^3\, 1_{\{t \ge a\}} \ \text{for any} \ a \ge 0,$$

so that infinitely many solutions exist (Lipschitz continuity fails at the origin).

The proof of the fact that existence and uniqueness of a solution to an SDE are guaranteed by Lipschitz continuity and linear growth of the coefficients is similar in spirit to the proof for deterministic equations.
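When the coefficients are well behaved, the solution of (9) can be approximated on a time grid by the Euler-Maruyama scheme, $X_{t+\Delta t} \approx X_t + b(X_t)\Delta t + \sigma(X_t)(W_{t+\Delta t} - W_t)$. Below is a minimal sketch (my own illustration, with arbitrarily chosen coefficients), not a scheme prescribed by the notes.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, n, rng):
    """Simulate one path of dX = b(X) dt + sigma(X) dW on [0, T] with n steps."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        dw = rng.normal(0.0, np.sqrt(dt))
        x[i + 1] = x[i] + b(x[i]) * dt + sigma(x[i]) * dw
    return x

# Illustrative coefficients: mean-reverting drift, constant diffusion.
rng = np.random.default_rng(3)
path = euler_maruyama(b=lambda x: 0.5 * (0.04 - x),
                      sigma=lambda x: 0.01,
                      x0=0.03, T=1.0, n=252, rng=rng)
```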
The coefficients of (9) can be characterized in terms of the local behavior of the solution $X$ through the following two limits:

$$\lim_{h \to 0} E\!\left[\frac{X_{t+h} - X_t}{h}\ \Big|\ \mathcal{F}_t\right] = b(X_t), \qquad \lim_{h \to 0} E\!\left[\frac{(X_{t+h} - X_t)^2}{h}\ \Big|\ \mathcal{F}_t\right] = \sigma^2(X_t).$$

The second limit is non-zero because of the "infinite velocity" of Brownian motion, while the first limit is the analogue of the deterministic case.
9. Ito's Formula
Now we are ready to introduce the famous Ito formula, which gives the chain rule for
differentials in a stochastic context.
If $X$ were a deterministic differentiable function of time, then, given a smooth transformation $\varphi(t, x)$, one could write the evolution of $\varphi(t, X_t)$ via the ordinary chain rule:

$$d\varphi(t, X_t) = \frac{\partial \varphi}{\partial t}(t, X_t)\, dt + \frac{\partial \varphi}{\partial x}(t, X_t)\, dX_t. \qquad (10)$$

We already observed, when computing $\int W\, dW$ above, that whenever a Brownian motion is involved, such a fundamental rule of calculus needs to be modified. The general formulation of the chain rule for stochastic differential equations is the following. Let $\varphi(t, x)$ be a smooth function and let $X$ be the unique solution of the stochastic differential equation (9). Then Ito's formula reads

$$d\varphi(t, X_t) = \frac{\partial \varphi}{\partial t}(t, X_t)\, dt + \frac{\partial \varphi}{\partial x}(t, X_t)\, dX_t + \frac{1}{2}\frac{\partial^2 \varphi}{\partial x^2}(t, X_t)\, d\langle X \rangle_t. \qquad (11)$$

Comparing equation (11) with its deterministic counterpart (10), we notice that the extra term $\frac{1}{2}\frac{\partial^2 \varphi}{\partial x^2}\, d\langle X \rangle_t$ appears in the stochastic context, and this is the term due to the Ito integral. The term $d\langle X \rangle_t$ can be developed algebraically by taking into account the rules on quadratic variation and covariation seen above:

$$d\langle X \rangle_t = \big(b(X_t)\, dt + \sigma(X_t)\, dW_t\big)\big(b(X_t)\, dt + \sigma(X_t)\, dW_t\big) = \sigma^2(X_t)\, dt,$$

since $dW_t\, dW_t = dt$, $dW_t\, dt = 0$ and $dt\, dt = 0$. We thus obtain

$$d\varphi(t, X_t) = \left(\frac{\partial \varphi}{\partial t} + b(X_t)\frac{\partial \varphi}{\partial x} + \frac{1}{2}\sigma^2(X_t)\frac{\partial^2 \varphi}{\partial x^2}\right) dt + \sigma(X_t)\frac{\partial \varphi}{\partial x}\, dW_t,$$

where the partial derivatives are evaluated at $(t, X_t)$.

Also the classical Leibniz rule for the differential of a product is modified, analogously to the chain rule. The related formula can be derived as a corollary of Ito's formula in two dimensions, and is reported below. For two deterministic differentiable functions $X$ and $Y$ we have

$$d(X_t Y_t) = X_t\, dY_t + Y_t\, dX_t.$$

For two diffusion processes (and more generally semimartingales) $X$ and $Y$ we have instead

$$d(X_t Y_t) = X_t\, dY_t + Y_t\, dX_t + d\langle X, Y \rangle_t.$$
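As a numerical illustration of Ito's formula (my own sketch), take $\varphi(x) = x^2$ applied to a Brownian motion: the formula gives $d(W_t^2) = 2 W_t\, dW_t + dt$, i.e. $W_T^2 = 2\int_0^T W_t\, dW_t + T$.

```python
import numpy as np

# Sketch: check Ito's formula for phi(x) = x^2 on Brownian motion,
# i.e. W_T^2 = 2 * int_0^T W dW + T (the "+T" is the second-order term).
rng = np.random.default_rng(4)
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

lhs = W[-1] ** 2
rhs = 2.0 * np.sum(W[:-1] * dW) + T   # Ito integral term plus quadratic-variation term
print(lhs, rhs)                       # close for fine grids
```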
An SDE is said to be linear if both its drift and diffusion coefficients are first-order polynomials (or affine functions) in the state variable. We here consider the particular case where the diffusion coefficient is deterministic:

$$dX_t = \big(\alpha(t) + \beta(t) X_t\big)\, dt + v(t)\, dW_t, \qquad X_0 = x_0, \qquad (12)$$

where $\alpha$, $\beta$, $v$ are deterministic functions of time that are regular enough to ensure existence and uniqueness of a solution.

It can be shown that a stochastic integral of a deterministic function of time is the same both in the Stratonovich and in the Ito sense. As a consequence, by writing (12) in integral form we see that the same equation holds in the Stratonovich sense:

$$dX_t = \big(\alpha(t) + \beta(t) X_t\big)\, dt + v(t) \circ dW_t, \qquad X_0 = x_0,$$

so that the equation can be solved path by path with the rules of ordinary calculus. We obtain

$$X_t = e^{\int_0^t \beta(u)\, du}\, x_0 + \int_0^t e^{\int_s^t \beta(u)\, du}\, \alpha(s)\, ds + \int_0^t e^{\int_s^t \beta(u)\, du}\, v(s)\, dW_s.$$

A remarkable fact is that the distribution of the solution is normal at each time $t$. Intuitively, this holds since the last stochastic integral is a limit of sums of independent normal random variables. Indeed, we have

$$X_t \sim \mathcal{N}\!\left(e^{\int_0^t \beta(u)\, du}\, x_0 + \int_0^t e^{\int_s^t \beta(u)\, du}\, \alpha(s)\, ds,\ \ \int_0^t e^{2\int_s^t \beta(u)\, du}\, v(s)^2\, ds\right).$$

The major examples of models based on an SDE of this type are those of Vasicek and of Hull and White.
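As an illustration of the Gaussian solution (my own sketch, not from the notes), the Vasicek short-rate model $dr_t = k(\theta - r_t)\, dt + \sigma\, dW_t$ is of this linear type, and $r_t$ can be sampled exactly from its normal transition distribution. Parameter values below are arbitrary.

```python
import numpy as np

# Sketch: exact simulation of the Vasicek model dr = k*(theta - r) dt + sigma dW,
# an instance of the Gaussian linear SDE above.
rng = np.random.default_rng(5)
k, theta, sigma = 0.5, 0.04, 0.01
r0, T, n = 0.03, 5.0, 60
dt = T / n

r = np.empty(n + 1)
r[0] = r0
for i in range(n):
    mean = theta + (r[i] - theta) * np.exp(-k * dt)             # conditional mean
    var = sigma**2 * (1.0 - np.exp(-2.0 * k * dt)) / (2.0 * k)  # conditional variance
    r[i + 1] = rng.normal(mean, np.sqrt(var))
```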
12. Lognormal Linear SDEs
Another interesting example of linear SDE is the one where the diffusion coefficient is a first-order homogeneous polynomial in the underlying variable. This SDE can be obtained as the exponential of a linear equation with deterministic diffusion coefficient. Indeed, let us take $Y_t = \exp(X_t)$, where $X$ evolves according to (12), and write $dY_t$ by Ito's formula. As a consequence, the process $Y$ has a lognormal marginal density. A major example of a model based on such an SDE is the Black and Karasinski model.
A particularly important example is the geometric Brownian motion,

$$dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, \qquad S_0 = s_0 > 0,$$

where $\mu$ and $\sigma$ are positive constants. To check that $S$ is indeed a lognormal process, one can compute $d \ln S_t$ via Ito's formula and obtain

$$d \ln S_t = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t.$$

From the seminal work of Black and Scholes on, processes of this type are frequently used in option pricing theory to model general asset-price dynamics. Notice that this process is a submartingale, in that clearly

$$E[S_T \mid \mathcal{F}_t] = S_t\, e^{\mu (T - t)} \ge S_t.$$
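As a quick Monte Carlo check of the submartingale property (my own illustration), one can simulate $S_T$ exactly from its lognormal law and compare the sample mean with $s_0 e^{\mu T} \ge s_0$.

```python
import numpy as np

# Sketch: exact simulation of geometric Brownian motion at time T via its lognormal law,
# S_T = S_0 * exp((mu - sigma^2/2) T + sigma W_T), and check E[S_T] = S_0 * exp(mu*T).
rng = np.random.default_rng(6)
mu, sigma, s0, T = 0.05, 0.2, 100.0, 1.0
n_paths = 200_000

W_T = rng.normal(0.0, np.sqrt(T), n_paths)
S_T = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)
print(S_T.mean(), s0 * np.exp(mu * T))   # Monte Carlo mean vs. s0 * e^{mu T} >= s0
```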
There is a classical link between SDEs and partial differential equations, expressed by the Feynman-Kac theorem. Consider the PDE

$$\frac{\partial V}{\partial t}(t, x) + b(x)\frac{\partial V}{\partial x}(t, x) + \frac{1}{2}\sigma^2(x)\frac{\partial^2 V}{\partial x^2}(t, x) = 0, \qquad V(T, x) = \varphi(x).$$

Under suitable regularity conditions on the coefficients, its solution admits the probabilistic representation

$$V(t, x) = E\!\left[\varphi\!\left(X_T^{t,x}\right)\right],$$

where the diffusion process $X^{t,x}$ has dynamics, starting from $x$ at time $t$, given by

$$dX_s^{t,x} = b\!\left(X_s^{t,x}\right) ds + \sigma\!\left(X_s^{t,x}\right) dW_s, \qquad X_t^{t,x} = x.$$

Notice that the terminal condition determines the function of the diffusion process whose expectation is relevant, whereas the PDE coefficients determine the dynamics of the diffusion process.
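A hedged illustration of this representation (my own sketch, with arbitrary coefficients and payoff): estimate $V(t, x) = E[\varphi(X_T^{t,x})]$ by simulating the diffusion with an Euler scheme and averaging the terminal payoff.

```python
import numpy as np

# Sketch: Monte Carlo version of V(t,x) = E[phi(X_T)] with X simulated by the Euler scheme.
# Drift, diffusion and payoff below are illustrative choices only.
def feynman_kac_estimate(x, t, T, b, sigma, phi, n_steps=100, n_paths=50_000, seed=7):
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    X = np.full(n_paths, x, dtype=float)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_paths)
        X += b(X) * dt + sigma(X) * dW
    return phi(X).mean()

v = feynman_kac_estimate(x=1.0, t=0.0, T=1.0,
                         b=lambda x: 0.2 * (1.0 - x),
                         sigma=lambda x: 0.3 * np.ones_like(x),
                         phi=lambda x: np.maximum(x - 1.0, 0.0))
```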
The Girsanov theorem shows how an SDE changes when the underlying probability measure is changed. It is based on the fact that the SDE drift depends on the particular probability measure $P$ in our probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$, and that, if we change the probability measure in a "regular" way, the drift of the equation changes while the diffusion coefficient remains the same. The Girsanov theorem can thus be useful when we want to modify the drift coefficient of an SDE. Indeed, suppose that we are given two measures $P$ and $Q$ on the space $(\Omega, \mathcal{F})$. Two such measures are said to be equivalent, written $P \sim Q$, if they share the same sets of null probability (or of probability one, which is equivalent). Therefore, two measures are equivalent when they agree on which events hold almost surely. Accordingly, a proposition holds almost surely under $P$ if and only if it holds almost surely under $Q$. Similar definitions apply also for the restrictions of the measures to $\mathcal{F}_t$, thus expressing equivalence of the two measures up to time $t$.
When two measures are equivalent, it is possible to express the first in terms of the second through the Radon-Nikodym derivative. Indeed, there exists a martingale $\rho$ on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, P)$ such that

$$\rho_t = \left.\frac{dQ}{dP}\right|_{\mathcal{F}_t}, \qquad t \in [0, T],$$

meaning that, for every $\mathcal{F}_t$-measurable random variable $X$,

$$E^Q[X] = E^P[\rho_t\, X],$$

where $E^P$ and $E^Q$ denote expected values with respect to the probability measures $P$ and $Q$, respectively. More generally, when dealing with conditional expectations, we can prove that, for $s \le t$ and $X$ measurable with respect to $\mathcal{F}_t$,

$$E^Q[X \mid \mathcal{F}_s] = \frac{E^P[\rho_t\, X \mid \mathcal{F}_s]}{\rho_s}.$$
Theorem [The Girsanov theorem]. Consider again the stochastic differential equation, with Lipschitz coefficients,

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t, \qquad X_0 = Z,$$

under the measure $P$. Let a new drift $b^*$ be given and assume $\theta(x) := \big(b^*(x) - b(x)\big)/\sigma(x)$ to be bounded. Define the measure $Q$ by

$$\left.\frac{dQ}{dP}\right|_{\mathcal{F}_T} = \exp\!\left(\int_0^T \theta(X_t)\, dW_t - \frac{1}{2}\int_0^T \theta(X_t)^2\, dt\right). \qquad (18)$$

Then $Q$ is equivalent to $P$, and the process $W^*$ defined by $dW_t^* = dW_t - \theta(X_t)\, dt$, $W_0^* = 0$, is a Brownian motion under $Q$, so that under $Q$ the equation reads

$$dX_t = b^*(X_t)\, dt + \sigma(X_t)\, dW_t^*, \qquad X_0 = Z.$$

As already noticed, this theorem is fundamental when we wish to change the drift of an SDE. It is now clear that we can do this by defining a new probability measure $Q$, via a suitable Radon-Nikodym derivative such as (18), in terms of the difference "desired drift minus given drift." We finally stress that above we assumed boundedness for simplicity, but less stringent assumptions are possible for the theorem to hold.
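A small numerical illustration of the measure change (my own sketch): for a Brownian motion and a constant drift shift $\theta$, reweighting paths by the density in (18) reproduces expectations under the drifted measure.

```python
import numpy as np

# Sketch: Girsanov reweighting for a constant drift change theta applied to Brownian motion.
# Under P, W is a standard BM; the weights rho = exp(theta*W_T - 0.5*theta^2*T) turn
# P-expectations into Q-expectations, and under Q the process W has drift theta.
rng = np.random.default_rng(8)
theta, T, n_paths = 0.7, 1.0, 500_000

W_T = rng.normal(0.0, np.sqrt(T), n_paths)       # terminal values under P
rho = np.exp(theta * W_T - 0.5 * theta**2 * T)   # Radon-Nikodym density on F_T

print(np.mean(rho * W_T), theta * T)  # E^Q[W_T] = theta * T
print(np.mean(rho), 1.0)              # the density has expectation one under P
```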
Given their importance in default modeling, and given the growing interest of the financial community in jump-diffusion models, we cannot close these lecture notes without mentioning Poisson processes, which are the purely-jump analogues of the Brownian motion with which we started. A (time-homogeneous) Poisson process $(N_t,\ t \ge 0)$ is a unit-jump increasing, right-continuous process with $N_0 = 0$ and with stationary and independent increments.

One notices immediately that the Poisson process shares the independence and stationarity of increments expressed by (3) and (4) with the Brownian motion process. Actually, if we substituted "unit-jump increasing, right continuous" by "continuous" we would obtain a characterization of Brownian motion with time-linear drift. Indeed, an important theorem of stochastic calculus due to Levy states that a continuous process with independent and stationary increments and null initial condition is necessarily of the form

$$\mu t + \sigma W_t,$$

with $\mu$ and $\sigma$ deterministic constants and $W$ a Brownian motion. This shows that Poisson processes and Brownian motions are analogous families of processes. In particular, they are both particular cases of the larger family of Levy processes, i.e. particular cases of processes with stationary independent increments and with right-continuous paths admitting left limits.
The first results on Poisson processes are given by the following facts.

First properties of Poisson processes. Let $N$ be a time-homogeneous Poisson process. Then:

1) There exists a positive real number $\lambda$ (the intensity) such that $P(N_t = 0) = e^{-\lambda t}$ for all $t \ge 0$.

2) $\displaystyle\lim_{h \to 0} \frac{P(N_h \ge 2)}{h} = 0$.

3) $\displaystyle\lim_{h \to 0} \frac{P(N_h = 1)}{h} = \lambda$.
The first point states that the probability of having no jumps up to a given time is an exponential function of minus that (possibly re-scaled) time. The second point tells us that the probability of having more than one jump in an arbitrarily small time interval goes to zero faster than the length of the interval itself; so, roughly speaking, in small intervals we can have at most one jump. The third point tells us that the probability of having exactly one jump in a small time, re-scaled by the time itself, is the constant $\lambda$ we find in the exponent of the exponential function of the first point.
Also, in classical Poisson process theory, starting from the above first results, one proves the following.

Further properties of Poisson processes. Let $N$ be a time-homogeneous Poisson process with intensity $\lambda$. Then, for all $s \le t$,

$$P(N_t - N_s = k) = e^{-\lambda (t - s)}\, \frac{(\lambda (t - s))^k}{k!}, \qquad k = 0, 1, 2, \ldots$$

This second set of properties tells us that the number of jumps of a Poisson process follows the Poisson law (hence the name of the process). Conversely, $N$ is a time-homogeneous Poisson process with intensity $\lambda$ if and only if its increments over disjoint intervals are independent and distributed as above for all $s \le t$, and $N$ is unit-jump increasing, right continuous, with $N_0 = 0$.
A fundamental result (also for financial applications) concerns the distribution of the intervals of time between two jumps of the process.

Exponential distribution for the time between two jumps. Let $N$ be a time-homogeneous Poisson process with intensity $\lambda$. Let $\tau_1, \tau_2, \ldots, \tau_k, \ldots$ be the first, second, etc. jump times of $N$. Then the random variables $\tau_1,\ \tau_2 - \tau_1,\ \tau_3 - \tau_2, \ldots$, i.e. the times between any jump and the subsequent one, are i.i.d. $\sim$ exponential($\lambda$) (or, equivalently, the random variables $\lambda \tau_1,\ \lambda(\tau_2 - \tau_1),\ \lambda(\tau_3 - \tau_2), \ldots$ are i.i.d. $\sim$ exponential(1)).
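A quick simulation check (my own sketch): building the process from i.i.d. exponential($\lambda$) inter-arrival times and counting the jumps up to time $t$ reproduces the Poisson($\lambda t$) law stated above.

```python
import numpy as np

# Sketch: construct a time-homogeneous Poisson process from exponential(lambda)
# inter-arrival times and check that N_t has the Poisson(lambda*t) mean and variance.
rng = np.random.default_rng(9)
lam, t, n_paths = 2.0, 3.0, 100_000

gaps = rng.exponential(scale=1.0 / lam, size=(n_paths, 50))  # 50 gaps is ample for lam*t = 6
jump_times = np.cumsum(gaps, axis=1)
N_t = np.sum(jump_times <= t, axis=1)                        # number of jumps in [0, t]

print(N_t.mean(), lam * t)   # Poisson mean
print(N_t.var(), lam * t)    # Poisson variance
```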
In the simplest intensity models for credit derivatives the default time is modeled as the first jump time $\tau_1$ of a Poisson process, so that the time of the first jump becomes particularly important. An immediate important consequence of the last property is that the probability of having the first jump in a small time interval $[t, t + dt)$, given that this first jump did not occur before $t$, is $\lambda\, dt$, so that $\lambda$ also bears the interpretation of probability (per unit of time) of having a new jump around $t$ given that we have not had it before $t$. In formulas,

$$P\big(\tau_1 \in [t, t + dt)\ \big|\ \tau_1 > t\big) = \lambda\, dt.$$
The Poisson process can be generalized by letting the intensity vary in time. Let $\lambda(t)$ be a deterministic, positive and integrable function of time and let $\Lambda(t) = \int_0^t \lambda(s)\, ds$ denote the cumulated intensity (or hazard function). A time-inhomogeneous Poisson process with intensity $\lambda$ can be defined as $M_t := N_{\Lambda(t)}$, where $N$ is a standard (unit-intensity) Poisson process. The process $M$ is still increasing by jumps of size 1, its increments are still independent, but they are no longer identically distributed (stationary), due to the "time distortion" introduced by the possibly nonlinear $\Lambda$.

From $M_t = N_{\Lambda(t)}$ we have obviously that $M$ jumps the first time at $\tau$ if and only if $N$ jumps the first time at $\Lambda(\tau)$. But since we know that $N$ is a standard Poisson process, for which the first jump time is exponentially distributed, we have

$$\Lambda(\tau) \sim \text{exponential}(1).$$
As a consequence,

$$P(\tau > t) = P\big(\Lambda(\tau) > \Lambda(t)\big) = e^{-\Lambda(t)} = e^{-\int_0^t \lambda(s)\, ds} \approx 1 - \int_0^t \lambda(s)\, ds$$

(where the final approximation is good for small exponents). Following this, we have that the "probability of first jumping between $t$ and $t + h$ given that one has not jumped before $t$" is

$$P\big(\tau \in (t, t + h]\ \big|\ \tau > t\big) = 1 - e^{-\int_t^{t+h} \lambda(s)\, ds} \approx \int_t^{t+h} \lambda(s)\, ds = h \cdot \frac{1}{h}\int_t^{t+h} \lambda(s)\, ds$$

(where, again, the final approximation is good for small exponents). The last factor is a sort of time-averaged intensity between $t$ and $t + h$.
It is easy to show, along the same lines, that the probability that the first jump occurs in the (arbitrarily small) next "$dt$" instants, given that we had no jump so far, is $\lambda(t)\, dt$.
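A simulation sketch of this construction (my own illustration, with an arbitrary increasing intensity): the first jump time can be sampled by drawing $\xi \sim$ exponential(1) and solving $\Lambda(\tau) = \xi$, here by bisection.

```python
import numpy as np

# Sketch: sample the first jump time of a time-inhomogeneous Poisson process with
# intensity lambda(t) = 0.01 + 0.02*t by inverting its cumulated intensity Lambda.
rng = np.random.default_rng(10)

def Lambda(t):
    # integral of lambda(s) = 0.01 + 0.02*s over [0, t]
    return 0.01 * t + 0.01 * t**2

def first_jump_time(rng, t_max=100.0, tol=1e-8):
    xi = rng.exponential(1.0)          # Lambda(tau) ~ exponential(1)
    lo, hi = 0.0, t_max
    while hi - lo > tol:               # bisection on the increasing function Lambda
        mid = 0.5 * (lo + hi)
        if Lambda(mid) < xi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

taus = np.array([first_jump_time(rng) for _ in range(20_000)])
print(np.mean(Lambda(taus)), 1.0)      # Lambda(tau) should have mean 1
```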
Notice that a fundamental fact from probability tells us that the exponential random variable $\xi := \Lambda(\tau)$ is independent of all the Brownian motions defined in the same probability space where the Poisson process is defined, and also of the intensity itself when this is assumed to be stochastic, as we are going to assume now.
The intensity, besides being time varying, can also be stochastic: in that case it is assumed to be at least an $\mathcal{F}_t$-adapted and right-continuous (and thus progressive) process, denoted by $\lambda_t$, and the cumulated intensity, or hazard process, is the random variable

$$\Lambda(t) = \int_0^t \lambda_s\, ds.$$

We assume $\lambda$ to be strictly positive, so that $\Lambda$ is invertible.

We recall again that "$\mathcal{F}_t$-adapted" means essentially that, given the information $\mathcal{F}_t$, we know $\lambda$ from $0$ to $t$.

We have that, for Cox processes, the first jump time can be represented as

$$\tau = \Lambda^{-1}(\xi), \qquad \xi \sim \text{exponential}(1).$$

Notice once again that here not only $\xi$ is random (and still independent of anything else, including $\lambda$), but the intensity $\lambda$ itself is stochastic. With Cox processes we have

$$P\big(\tau \in [t, t + dt)\ \big|\ \tau \ge t,\ \mathcal{F}_t\big) = \lambda_t\, dt.$$

This reads, if $t$ = "now": "The probability that the process first jumps in (a small) time "$dt$" given that it has not jumped so far and given the information $\mathcal{F}_t$ is $\lambda_t\, dt$."
Moreover, for $t < T$,

$$P\big(\tau > T\ \big|\ \tau > t,\ \mathcal{F}_t\big) = E\!\left[\left. e^{-\int_t^T \lambda_s\, ds}\ \right|\ \mathcal{F}_t\right],$$

which, in a financial context, where $\tau$ is typically a default time, is completely analogous to the bond-price formula in a short-rate model, with the intensity $\lambda$ replacing the interest rate $r$.

Cox processes thus allow one to carry the interest-rate technology and paradigms into default modeling. But again, $\xi$ is independent of all default-free market quantities (of the Brownian motions, of $\lambda$, of the interest rates, ...) and represents an external source of randomness that makes reduced-form models incomplete.
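A simulation sketch of a Cox-driven default time (my own illustration; the positive intensity process below, an exponential of a mean-reverting path, is an arbitrary choice): default occurs when the cumulated intensity $\Lambda_t$ first reaches an independent exponential(1) draw $\xi$, i.e. $\tau = \Lambda^{-1}(\xi)$.

```python
import numpy as np

# Sketch: default time tau driven by a Cox process with a stochastic, strictly
# positive intensity lambda_t = exp(x_t); tau is the first time Lambda_t >= xi.
rng = np.random.default_rng(11)
T, n_steps, n_paths = 10.0, 1000, 2000
dt = T / n_steps

default_times = np.full(n_paths, np.inf)
for p in range(n_paths):
    xi = rng.exponential(1.0)                       # the independent trigger
    x = np.log(0.02)                                # log-intensity, starts at ln(0.02)
    cum = 0.0                                       # cumulated intensity Lambda_t
    for i in range(1, n_steps + 1):
        x += 0.5 * (np.log(0.02) - x) * dt + 0.3 * np.sqrt(dt) * rng.normal()
        cum += np.exp(x) * dt                       # lambda_t = exp(x_t) > 0
        if cum >= xi:
            default_times[p] = i * dt
            break

print(np.mean(default_times <= 5.0))                # estimate of P(tau <= 5)
```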
A further generalization is the compound Poisson process, obtained by attaching i.i.d. jump sizes to the jumps of a Poisson process $N$:

$$Z_t = \sum_{k=1}^{N_t} Y_k = Y_1 + Y_2 + \cdots + Y_{N_t},$$

where the $Y_k$ are i.i.d. random variables, independent of $N$.

The interest in compound Poisson processes is given by their possible use as "jumpy shock" processes in jump-diffusion models, as opposed to the continuous shock process given by Brownian motion. In general, a candidate jump-diffusion process is written as

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dW_t + dZ_t,$$

where the jump shock $dZ_t$ equals $Y_{N_t}$ if the Poisson process $N$ jumps at $t$, and $0$ otherwise.
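A minimal simulation sketch of such a jump-diffusion (my own illustration; the constant coefficients, the intensity and the Gaussian jump-size law are arbitrary choices):

```python
import numpy as np

# Sketch: one path of dX = b dt + s dW + dZ, with Z a compound Poisson process
# whose jump sizes are Gaussian. All parameters are illustrative.
rng = np.random.default_rng(12)
b, s = 0.05, 0.2                           # drift and diffusion coefficients
lam, jump_mu, jump_sd = 1.0, -0.1, 0.15    # jump intensity and jump-size law
T, n = 1.0, 1000
dt = T / n

X = np.empty(n + 1)
X[0] = 0.0
for i in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))
    n_jumps = rng.poisson(lam * dt)                    # 0 or (rarely) more jumps in dt
    dZ = rng.normal(jump_mu, jump_sd, n_jumps).sum()   # finite-size shock, or 0
    X[i + 1] = X[i] + b * dt + s * dW + dZ
```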
We see that the shock $dZ_t$ in the dynamics above is always either finite (rather than infinitesimal, i.e. of order "$dt$" or "$\sqrt{dt}$") or null.
Finally, we notice that the above-mentioned Levy processes can be characterized as limits of sums of independent compound Poisson processes and Brownian motions (with drift). The basic mathematical framework for reaching Levy processes thus includes the above compound Poisson process and the Brownian motion. Obviously, in their basic formulation Levy processes incorporate the Brownian motion and the Poisson process as particular cases, but not the jump-diffusion and Cox processes in general. The financial community is now considering processes with Levy shocks, or Levy processes under stochastic time changes. These processes encompass a large family of earlier models based on jump-diffusions and stochastic volatility.