substance X formed by a chemical reaction of the form X ⇌ A. A quite well defined deterministic motion is evident, and this is reproducible, unlike the fluctuations around this motion, which are not.
1.2 Some Historical Examples
1.2.1 Brownian Motion
The observation that, when suspended in water, small pollen grains are found to be in a very animated and irregular state of motion was first systematically investigated by Robert Brown in 1827 [1.1], and the observed phenomenon took the name Brownian motion because of his fundamental pioneering work. Brown was a botanist, indeed a very famous botanist, and he was examining pollen grains in order to elucidate the mechanism by which the grains moved towards the ova when fertilising flowers. At first he thought this motion was a manifestation of the life he was seeking, but when he found that this motion was present in apparently dead pollen, some over a century old, some even extracted from fossils, and then even in any suspension of fine particles (glass, minerals, and even a fragment of the sphinx) he ruled out any specifically organic origin of this motion. The motion is illustrated in Fig. 1.2.
The riddle of Brownian motion was not quickly solved, and a satisfactory explanation did not come until 1905, when Einstein published an explanation under the rather modest title "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen" (concerning the motion, as required by the molecular-kinetic theory of heat, of particles suspended in liquids at rest) [1.2]. The same explanation was independently developed by Smoluchowski [1.3], who was responsible for much of the later systematic development and for much of the experimental verification of Brownian motion theory.
There were two major points in Einstein's solution to the problem of Brownian motion:
Fig. 1.2. Motion of a point undergoing Brownian motion
i) The motion is caused by the exceedingly frequent impacts on the pollen grain of the incessantly moving molecules of the liquid in which it is suspended.
ii) The motion of these molecules is so complicated that its effect on the pollen grain can only be described probabilistically, in terms of exceedingly frequent, statistically independent impacts.
The existence of fluctuations like these calls out for a statistical explanation of this kind of phenomenon. Statistics had already been used by Maxwell and Boltzmann in their famous gas theories, but only as a description of possible states and the likelihood of their achievement, and not as an intrinsic part of the time evolution of the system. Rayleigh [1.4] was in fact the first to consider a statistical description in this context, but for one reason or another, very little arose out of his work. For practical purposes, Einstein's explanation of the nature of Brownian motion must be regarded as the beginning of stochastic modelling of natural phenomena.
Einstein's reasoning is very clear and elegant. I contains all the basic concepts
‘which will makeup the subject matter ofthis book. Rather than paraphrase a lassie
piece of work shal simply give an extended excerpt fom Einsteins paper (author's
translation:
"It must clearly be assumed that each individual particle executes a motion which is independent of the motions of all other particles; it will also be considered that the movements of one and the same particle in different time intervals are independent processes, as long as these time intervals are not chosen too small.

"We introduce a time interval τ into consideration, which is very small compared to the observable time intervals, but nevertheless so large that in two successive time intervals τ, the motions executed by the particle can be thought of as events which are independent of each other.
"Now let there be a total of n particles suspended in a liquid. In a time interval τ the X-coordinates of the individual particles will increase by an amount Δ, where for each particle Δ has a different (positive or negative) value. There will be a certain frequency law for Δ; the number dn of the particles which experience a shift which is between Δ and Δ + dΔ will be expressible by an equation of the form

dn = n \phi(\Delta)\, d\Delta ,   (1.2.1)

where

\int_{-\infty}^{\infty} \phi(\Delta)\, d\Delta = 1 ,   (1.2.2)

and φ is only different from zero for very small values of Δ, and satisfies the condition

\phi(\Delta) = \phi(-\Delta) .   (1.2.3)
"We now investigate how the diffusion coefficient depends on φ. We shall once more restrict ourselves to the case where the number ν of particles per unit volume depends only on x and t.
"Let ν = f(x, t) be the number of particles per unit volume. We compute the distribution of particles at the time t + τ from the distribution at time t. From the definition of the function φ(Δ), it is easy to find the number of particles which at time t + τ are found between two planes perpendicular to the x-axis and passing through the points x and x + dx. One obtains

f(x, t+\tau)\, dx = dx \int_{-\infty}^{\infty} f(x+\Delta, t)\, \phi(\Delta)\, d\Delta .   (1.2.4)
"But since τ is very small, we can set

f(x, t+\tau) = f(x, t) + \tau \frac{\partial f}{\partial t} .   (1.2.5)
"Furthermore, we develop f(x + Δ, t) in powers of Δ:

f(x+\Delta, t) = f(x, t) + \Delta \frac{\partial f(x,t)}{\partial x} + \frac{\Delta^2}{2!} \frac{\partial^2 f(x,t)}{\partial x^2} + \cdots .   (1.2.6)
"We can use this series under the integral, because only small values of Δ contribute to this equation. We obtain

f + \frac{\partial f}{\partial t}\tau = f \int_{-\infty}^{\infty} \phi(\Delta)\, d\Delta + \frac{\partial f}{\partial x} \int_{-\infty}^{\infty} \Delta\, \phi(\Delta)\, d\Delta + \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{\infty} \frac{\Delta^2}{2}\, \phi(\Delta)\, d\Delta + \cdots .   (1.2.7)
"Because φ(Δ) = φ(−Δ), the second, fourth, etc. terms on the right-hand side vanish, while out of the 1st, 3rd, 5th, etc. terms, each one is very small compared with the previous. We obtain from this equation, by taking into consideration

\int_{-\infty}^{\infty} \phi(\Delta)\, d\Delta = 1 ,   (1.2.8)

and setting

\frac{1}{\tau} \int_{-\infty}^{\infty} \frac{\Delta^2}{2}\, \phi(\Delta)\, d\Delta = D ,   (1.2.9)

and keeping only the 1st and 3rd terms of the right-hand side,

\frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} .   (1.2.10)
"This is already known as the differential equation of diffusion, and it can be seen that D is the diffusion coefficient.
"The problem, which corresponds to the problem of diffusion from a single point (neglecting the interaction between the diffusing particles), is now completely determined mathematically; its solution is

f(x, t) = \frac{n}{\sqrt{4\pi D}} \frac{e^{-x^2/4Dt}}{\sqrt{t}} .   (1.2.11)
"We now calculate, with the help of this equation, the displacement λ_x in the direction of the X-axis that a particle experiences on the average, or, more exactly, the square root of the arithmetic mean of the square of the displacement in the direction of the X-axis; it is

\lambda_x = \sqrt{\langle x^2 \rangle} = \sqrt{2Dt} .   (1.2.12)
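Einstein's conclusion is easy to check numerically. The following sketch (illustrative only; the values of D, τ and the Gaussian choice of φ are assumptions, not from Einstein's paper) builds particle paths out of independent symmetric pushes Δ whose variance is fixed by (1.2.9), and compares the sampled root-mean-square displacement with (1.2.12):

    import numpy as np

    rng = np.random.default_rng(0)

    D = 0.5              # diffusion coefficient (arbitrary illustrative value)
    tau = 1e-3           # Einstein's small time interval
    n_steps = 1000       # total time t = n_steps * tau
    n_particles = 100_000

    # Independent symmetric pushes Delta; (1.2.9) fixes their variance to 2*D*tau.
    delta = rng.normal(0.0, np.sqrt(2 * D * tau), size=(n_steps, n_particles))
    x = delta.sum(axis=0)          # particle positions after time t

    t = n_steps * tau
    print("sampled rms displacement:", np.sqrt((x**2).mean()))
    print("Einstein's sqrt(2 D t)  :", np.sqrt(2 * D * t))   # (1.2.12)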
Einstein's derivation is really based on a discrete time assumption, that impacts happen only at times 0, τ, 2τ, 3τ, ..., and his resulting equation (1.2.10) for the distribution function f(x, t) and its solution (1.2.11) are to be regarded as approximations, in which τ is considered so small that t may be considered as being continuous. Nevertheless, his description contains very many of the major concepts which have been developed more and more generally and rigorously since then, and which will be central to this book. For example:
i) The Chapman-Kolmogorov equation occurs as Einstein's equation (1.2.4). It states that the probability of the particle being at point x at time t + τ is given by the sum of the probabilities of all possible "pushes" Δ from positions x + Δ, multiplied by the probability of being at x + Δ at time t. This assumption is based on the independence of the push Δ of any previous history of the motion; it is only necessary to know the initial position of the particle at time t, not at any previous time. This is the Markov postulate, and the Chapman-Kolmogorov equation, of which (1.2.4) is a special form, is the central dynamical equation of all Markov processes. These will be studied in detail in Chap. 3.
ii) The Fokker-Planck equation: Eq. (1.2.10) is the diffusion equation, a special case of the Fokker-Planck equation (also known as Kolmogorov's equation), which describes a large class of very interesting stochastic processes in which the system has a continuous sample path. In this case, that means that the pollen grain's position, if thought of as obeying a probabilistic law given by solving the diffusion equation (1.2.10), in which time t is continuous (not discrete, as assumed by Einstein), can be written x(t), where x(t) is a continuous function of time, but a random function. This leads us to consider the possibility of describing the dynamics of the system in some direct probabilistic way, so that we would have a random or stochastic differential equation for the path. This procedure was initiated by Langevin with the famous equation that to this day bears his name. We will discuss this in Sect. 1.2.2, and in detail in Chap. 4.
iii) The Kramers-Moyal and similar expansions are essentially the same as that used by Einstein to go from (1.2.4) (the Chapman-Kolmogorov equation) to the diffusion equation (1.2.10). The use of this type of approximation, which effectively replaces a process whose sample paths need not be continuous with one whose paths are continuous, is very common and convenient. Its use and validity will be discussed in Chap. 11.
1.2.2 Langevin’s Equation
Some time after Einstein's original derivation, Langevin [1.5] presented a new method which was quite different from Einstein's and, according to him, "infinitely more simple." His reasoning was as follows.
From statistical mechanics, it was known that the mean kinetic energy of the Brownian particle should, in equilibrium, reach the value

\langle \tfrac{1}{2} m v^2 \rangle = \tfrac{1}{2} kT   (1.2.13)

(T: absolute temperature; k: Boltzmann's constant). (Both Einstein and Smoluchowski had used this fact.) Acting on the particle, of mass m, there should be two forces:
i) A viscous drag: assuming this is given by the same formula as in macroscopic hydrodynamics, this is −6πηa dx/dt, where η is the viscosity and a the diameter of the particle, assumed spherical.
ii) Another fluctuating force X which represents the incessant impacts of the molecules of the liquid on the Brownian particle. All that is known about it is that fact, and that it should be positive and negative with equal probability. Thus, the equation of motion for the position of the particle is given by Newton's law as

m \frac{d^2 x}{dt^2} = -6\pi\eta a \frac{dx}{dt} + X ,   (1.2.14)
and multiplying by x, this can be written

\frac{m}{2} \frac{d^2 (x^2)}{dt^2} - m v^2 = -3\pi\eta a \frac{d(x^2)}{dt} + Xx ,   (1.2.15)

where v = dx/dt. We now average over a large number of different particles and use (1.2.13) to obtain an equation for ⟨x²⟩:
\frac{m}{2} \frac{d^2 \langle x^2 \rangle}{dt^2} + 3\pi\eta a \frac{d \langle x^2 \rangle}{dt} = kT ,   (1.2.16)

where the term ⟨Xx⟩ has been set equal to zero because (to quote Langevin) "of the irregularity of the quantity X". One then finds the general solution

\frac{d \langle x^2 \rangle}{dt} = \frac{kT}{3\pi\eta a} + C \exp\!\left( -\frac{6\pi\eta a}{m} t \right) ,   (1.2.17)

where C is an arbitrary constant. Langevin estimated that the decaying exponential approaches zero with a time constant of the order of 10⁻⁸ s, which for any practical observation at that time was essentially immediate. Thus, for practical purposes, we can neglect this term and integrate once more to get

\langle x^2 \rangle - \langle x_0^2 \rangle = \frac{kT}{3\pi\eta a}\, t .   (1.2.18)

This corresponds to (1.2.12) as deduced by Einstein, provided we identify

D = \frac{kT}{6\pi\eta a} ,   (1.2.19)
a result which Einstein derived in the same paper but by independent means.
LLangevin's equation was the first example ofthe stochastic differential equation—
differential equation with a random term X and hence whose solution is, in some
‘sense, a random funetion, Each solution of Langevin’s equation represents a dtfer-
cent random trajectory and, using only rather simple properties of X (his fluctuating
force), measurable results can be derived.
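As a concrete illustration, here is a minimal sketch, not from Langevin's paper: the irregular force X is modelled by Gaussian kicks whose strength is fixed by the equipartition requirement (1.2.13), and all parameter values are arbitrary. Direct step-by-step integration of (1.2.14) then reproduces the linear growth (1.2.18) of ⟨x²⟩:

    import numpy as np

    rng = np.random.default_rng(1)

    m, gamma, kT = 1.0, 5.0, 1.0   # mass, drag coefficient 6*pi*eta*a, kT (assumed)
    dt, n_steps, n_paths = 1e-3, 20_000, 2_000

    x = np.zeros(n_paths)
    v = np.zeros(n_paths)
    # Kick strength chosen so that equipartition (1.2.13), <m v^2>/2 -> kT/2, holds.
    kick = np.sqrt(2 * gamma * kT * dt) / m

    for _ in range(n_steps):
        v += -(gamma / m) * v * dt + kick * rng.standard_normal(n_paths)
        x += v * dt

    t = n_steps * dt
    print("<x^2>/t :", (x**2).mean() / t)
    print("2kT/(6 pi eta a):", 2 * kT / gamma)   # the slope predicted by (1.2.18)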
One question arises: Einstein explicitly required that (on a sufficiently large time scale) the change Δ be completely independent of the preceding value of Δ. Langevin did not mention such a concept explicitly, but it is there, implicitly, when one sets ⟨Xx⟩ equal to zero. The concepts that X is extremely irregular and (which is not mentioned by Langevin, but is implicit) that X and x are independent of each other, that is, that the irregularities in x as a function of time do not somehow conspire to be always in the same direction as those of X, so that it would not be valid to set the average of the product equal to zero, are really equivalent to Einstein's independence assumption. The method of Langevin equations is clearly very much more direct, at least at first glance, and gives a very natural way of generalising a dynamical equation to a probabilistic equation. An adequate mathematical grounding for the approach of Langevin, however, was not available until more than 40 years later, when Itô formulated his concepts of stochastic differential equations. And in this formulation, a precise statement of the independence of X and x led to the calculus of stochastic differential equations.
... means X > 0, so the natural range (0, ∞) of prices is recovered.
There is also a certain human logic in the description. Prices move as a result of judgments by buyers and sellers, to whom the natural measure of a price change is not the absolute size of the change, but the fractional change. The improvement over Bachelier's results is so significant, and the resulting description in terms of the logarithm of the price and the fractional price change so simple, that this is the preferred model to this day. Samuelson termed the process geometric Brownian motion, or alternatively economic Brownian motion.
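A minimal simulation sketch of geometric Brownian motion (the drift and volatility values are arbitrary illustrative choices, not from the text) shows how modelling the fractional change automatically keeps prices positive:

    import numpy as np

    rng = np.random.default_rng(2)

    mu, sigma = 0.05, 0.2        # drift and volatility of the log-price (assumed)
    dt, n_steps = 1 / 252, 252   # one year of daily steps
    s = np.full(10, 100.0)       # ten sample paths, each starting at price 100

    for _ in range(n_steps):
        # The fractional change is modelled, so log(s) performs Brownian motion
        # and s itself can never become negative.
        s *= np.exp((mu - 0.5 * sigma**2) * dt
                    + sigma * np.sqrt(dt) * rng.standard_normal(s.size))

    print(s)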
1.32 Financial Derivatives
In order to smooth the running of business, it is often helpful to fix in advance the price of a commodity which will be needed in the future; for example, the price of wheat which has not yet been grown and harvested is moderately uncertain. A baker could choose to pay a fixed sum now for the future delivery of wheat. Rather than deal with an individual grower, the baker can buy the ungrown wheat from a dealer in wheat futures, who charges a premium and arranges appropriate contracts with growers. However, the contract to deliver wheat at a certain price on a future date can itself become a tradable item. Having purchased such a contract, the baker can sell it to another baker, or indeed, to anyone else, who may buy it with the view to selling it at a future date, without ever having had anything to do with any wheat at all.
Such a contract is known as a derivative security. The wheat future exists only because there is a market for real wheat, but nevertheless it can develop an existence of its own. Another kind of derivative is an option, in which one buys the right to purchase something at a future date at a definite price. If the market price on the date at which the option is exercised is larger than the option price, one exercises the option. If the market price turns out to be below the option price, one discards the option and pays the market price. Purchasing the option limits exposure to price rises, transferring the risk to the seller of the option, who charges appropriately, and specialises in balancing risks. Options to purchase other securities, such as shares and stocks, are very common, and indeed there are options markets which trade under standardised conditions.
1.3.3 The Black-Scholes Formula
Although a description of market processes in terms of stochastic processes was well known by the 1970s, it was not clear how it could be used as a tool for making investment decisions. The breakthrough came with the realisation that a portfolio containing an appropriate mix of cash, stocks and options could be devised in which the short-term fluctuations in the various values cancel, and that this gave a relatively simple formula for valuing options, the Black-Scholes formula, which would be of very significant value in making investment decisions. This formula has truly revolutionised the practice of finance; to quote Samuelson [1.10]:
"A great economist of an earlier generation said that, useful though economic theory is for understanding the world, no one would go to an economic theorist for advice on how to run a brewery or produce a mousetrap. Today that sage would have to change his tune: economic principles really do apply, and woe to the accountant or marketer who runs counter to economic law. Paradoxically, one of our most elegant and complex sectors of economic analysis, the modern theory of finance, is confirmed daily by millions of statistical observations. When today's associate professor of security analysis is asked, 'Young man, if you're so smart why ain't you rich?', he replies by laughing all the way to the bank or his appointment as a highly paid consultant to Wall Street."
The derivation was given first in the paper of Black and Scholes [1.11], and a different derivation was given by Merton [1.12]. The formula depends critically on the description of the returns on securities as a Brownian motion process, which is of limited accuracy. Nevertheless, the formula is sufficiently realistic to make investing in stocks and options a logical and rational process, justifying Samuelson's perhaps over-dramatised view of modern financial theory.
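The text does not quote the formula itself. For reference, here is a sketch of its standard, widely published form for a European call option (the function and parameter names are my own):

    from math import erf, exp, log, sqrt

    def norm_cdf(z):
        """Cumulative distribution function of the standard normal."""
        return 0.5 * (1.0 + erf(z / sqrt(2.0)))

    def black_scholes_call(s, k, t, r, sigma):
        """Standard Black-Scholes value of a European call option.

        s: stock price now, k: strike price, t: time to expiry (years),
        r: risk-free interest rate, sigma: volatility of the log-price.
        """
        d1 = (log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
        d2 = d1 - sigma * sqrt(t)
        return s * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)

    # An at-the-money one-year option, with illustrative parameter values:
    print(black_scholes_call(s=100.0, k=100.0, t=1.0, r=0.05, sigma=0.2))

Note that the only parameter not directly observable in the market is the volatility sigma, which is why the geometric-Brownian-motion description of returns enters so critically.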
1.3.4 Heavy-Tailed Distributions
There is, however, no doubt that the geometric Brownian motion model of financial markets is not exact, and even misses out very important features. One need only study the empirical values of the returns in stock market records (as well as other kinds of markets) and check what kinds of distributions are in practice observed. The results are not really in agreement with a Gaussian distribution of returns: the observed distribution of returns is usually approximately Gaussian for small values of r, but the probability of large values of r is always observed to be significantly larger than the Gaussian prediction. The observed distributions are said to have heavy tails.
The field of Continuous Time Finance [1.10] is an impressive theoretical edifice built on this flawed foundation of Brownian motion, but so far it appears to be the most practical method of modelling financial markets. With modern electronic banking and transfer of funds, it is possible to trade over very short time intervals, during which, perhaps, in spite of the overall increase of trading activity which results, a Brownian description is valid.

It is certainly sufficiently valid for its practitioners to be highly valued, as Samuelson notes. However, every so often one of these practitioners makes a spectacular loss, threatening financial institutions. While there is public alarm about billion-dollar losses, those who acknowledge the significance of heavy tails are unsurprised.
1.4 Birth-Death Processes
A wide variety of phenomena can be modelled by a particular class of process called a birth-death process. The name obviously stems from the modelling of human or animal populations in which individuals are born, or die. One of the most entertaining models is that of the prey-predator system consisting of two kinds of animal, one of which preys on the other, which is itself supplied with an inexhaustible food supply. Thus, letting X symbolise the prey, Y the predator, and A the food of the prey, the process under consideration might be

X + A \to 2X ,   (1.4.1a)
X + Y \to 2Y ,   (1.4.1b)
Y \to B ,   (1.4.1c)
which have the following naive, but charming, interpretation. The first equation symbolises the prey eating one unit of food and reproducing immediately. The second equation symbolises a predator consuming a prey (which thereby dies; this is the only death mechanism considered for the prey) and immediately reproducing. The final equation symbolises the death of the predator by natural causes. It is easy to guess model differential equations for x and y, the numbers of X and Y. One might assume that the first reaction gives a rate of production of X proportional to the product of x and the amount of food; the second, a production of Y (and an equal rate of consumption of X) proportional to xy; and the last equation a death rate of Y, in which the rate of death of Y is simply proportional to y. Thus we might write

\frac{dx}{dt} = k_1 a x - k_2 x y ,   (1.4.2a)

\frac{dy}{dt} = k_2 x y - k_3 y .   (1.4.2b)
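A minimal numerical sketch of (1.4.2a, 1.4.2b), with arbitrary illustrative rate constants and initial populations, reproduces the oscillations discussed next:

    import numpy as np

    k1a, k2, k3 = 1.0, 0.02, 0.5      # rate constants (arbitrary illustrative values)

    def rhs(state):
        x, y = state
        return np.array([k1a * x - k2 * x * y,    # (1.4.2a)
                         k2 * x * y - k3 * y])    # (1.4.2b)

    state = np.array([40.0, 20.0])    # initial numbers of prey and predators
    dt, history = 0.01, []
    for _ in range(5000):
        # classical fourth-order Runge-Kutta step
        a = rhs(state)
        b = rhs(state + 0.5 * dt * a)
        c = rhs(state + 0.5 * dt * b)
        d = rhs(state + dt * c)
        state = state + (dt / 6.0) * (a + 2 * b + 2 * c + d)
        history.append(state)

    x_t, y_t = np.array(history).T    # oscillating solutions, cf. Fig. 1.3a
    print(x_t.min(), x_t.max(), y_t.min(), y_t.max())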
The solutions of these equations, which were developed independently by Lotka [1.13] and Volterra [1.14], have very interesting oscillating solutions, as presented in Fig. 1.3a. These oscillations are qualitatively easily explicable. In the absence of significant numbers of predators, the prey population grows rapidly until the presence of so much prey for the predators to eat stimulates their rapid reproduction, at
the same time reducing the number of prey which get eaten. Because a large number of prey have been eaten, there are no longer enough to maintain the population of predators, which then die out, returning us to our initial situation. The cycles repeat indefinitely and are indeed, at least qualitatively, a feature of many real prey-predator systems. An example is given in Fig. 1.3b.

Fig. 1.3a-c. Time development in prey-predator systems (horizontal axis: time in arbitrary units). (a) Plot of solutions of the deterministic equations (1.4.2a, 1.4.2b) (x: solid line; y: dashed line). (b) Data for a real prey-predator system, in which the prey is a mite (Eotetranychus sexmaculatus) which feeds on oranges, and the predator is another mite (Typhlodromus occidentalis); data from [1.15, 1.16]. (c) Simulation of the stochastic model (1.4.3a-1.4.3d).

Of course, real systems do not follow the solutions of differential equations exactly; they fluctuate about such curves. One must include these fluctuations, and
the simplest way to do this is by means of a birth-death master equation. We assume a probability distribution, P(x, y, t), for the numbers of individuals at a given time, and ask for a probabilistic law corresponding to (1.4.2a, 1.4.2b). This is done by assuming that in an infinitesimal time Δt, the following transition probability laws hold:

\mathrm{Prob}(x \to x+1,\ y \to y) = k_1 a x\, \Delta t ,   (1.4.3a)
\mathrm{Prob}(x \to x-1,\ y \to y+1) = k_2 x y\, \Delta t ,   (1.4.3b)
\mathrm{Prob}(x \to x,\ y \to y-1) = k_3 y\, \Delta t ,   (1.4.3c)
\mathrm{Prob}(x \to x,\ y \to y) = 1 - (k_1 a x + k_2 x y + k_3 y)\, \Delta t .   (1.4.3d)
Thus we simply, for example, replace the simple rate laws by probability laws. We then employ what amounts to the same equation as Einstein and others used, i.e., the Chapman-Kolmogorov equation; namely, we write the probability at t + Δt as a sum of terms, each of which represents the probability of a previous state multiplied by the probability of a transition to the state (x, y). Thus we find, by letting Δt → 0,

\frac{\partial P(x,y,t)}{\partial t} = k_1 a (x-1)\, P(x-1, y, t) + k_2 (x+1)(y-1)\, P(x+1, y-1, t) + k_3 (y+1)\, P(x, y+1, t) - (k_1 a x + k_2 x y + k_3 y)\, P(x, y, t) .   (1.4.4)
In writing the assumed probability laws (1.4.3a-1.4.3d), we are assuming that the probability of each of the events occurring can be determined simply from the knowledge of x and y. This is again the Markov postulate, which we mentioned in Sect. 1.2.1. In the case of Brownian motion, very convincing arguments can be made in favour of this Markov assumption. Here it is by no means clear. The concept of heredity, i.e., that the behaviour of progeny is related to that of parents, clearly contradicts this assumption. How to include heredity is another matter; by no means does a unique prescription exist.

The assumption of the Markov postulate in this context is valid to the extent that different individuals of the same species are similar; it is invalid to the extent that, nevertheless, perceptible inheritable differences do exist.
This type of model has a wide application: in fact, to any system to which a population of individuals may be attributed, for example systems of molecules of various chemical compounds, of electrons, of photons and similar physical particles, as well as biological systems. The particular choice of transition probabilities is made on various grounds determined by the degree to which details of the births and deaths involved are known. The simple multiplicative laws, as illustrated in (1.4.3a-1.4.3d), are the most elementary choice, ignoring, as they do, almost all details of the processes involved. In some of the physical processes we can derive the transition probabilities in much greater detail and with greater precision.
Equation (1.4.4) has no simple solution, but one major property differentiates equations like it from an equation of Langevin's type, in which the fluctuation term is simply added to the differential equation. Solutions of (1.4.4) determine both the gross deterministic motion and the fluctuations; the fluctuations are typically of the same order of magnitude as the square roots of the numbers of individuals involved. It is not difficult to simulate a sample time development of the process, as in Fig. 1.3c. The figure does show the correct general features, but the model is so obviously simplified that exact agreement can never be expected. Thus, in contrast to the situation in Brownian motion, we are not dealing here so much with a theory of a phenomenon as with a class of mathematical models, which are simple enough to have a very wide range of approximate validity. We will see in Chap. 11 that a theory can be developed which can deal with a wide range of models in this category, and that there is indeed a close connection between this kind of theory and that of stochastic differential equations.
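Such a sample time development can be generated directly from the transition probabilities (1.4.3a-1.4.3d). The sketch below uses the standard Gillespie stochastic simulation algorithm, which is not described in the text; the constants and initial populations are the arbitrary values used earlier:

    import numpy as np

    rng = np.random.default_rng(3)

    k1a, k2, k3 = 1.0, 0.02, 0.5      # same illustrative constants as before
    x, y, t = 40, 20, 0.0
    times, prey, predators = [t], [x], [y]

    while t < 50.0:
        # Propensities of the three elementary events (1.4.3a-c).
        rates = np.array([k1a * x, k2 * x * y, k3 * y])
        total = rates.sum()
        if total == 0.0:
            break                                # nothing can happen any more
        t += rng.exponential(1.0 / total)        # waiting time to the next event
        event = rng.choice(3, p=rates / total)   # which event occurs
        if event == 0:
            x += 1                               # prey eats and reproduces
        elif event == 1:
            x, y = x - 1, y + 1                  # predator eats prey, reproduces
        else:
            y -= 1                               # predator dies
        times.append(t); prey.append(x); predators.append(y)

    print(len(times), prey[-1], predators[-1])

Plotting prey and predators against times gives noisy oscillations about the deterministic cycles, with fluctuations of the order of the square roots of the population sizes, as stated above.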
1.5 Noise in Electronic Systems
The early days of radio, with low transmission powers and primitive receivers, made it evident to every ear that there were a great number of highly irregular electrical signals which occurred either in the atmosphere, the receiver or the radio transmitter, and which were given the collective name of "noise", since this is certainly what they sounded like on a radio. Two principal sources of noise are shot noise and Johnson noise.
1.5.1 Shot Noise
In a vacuum tube (and in solid-state devices) we get a nonsteady electrical current, since it is generated by individual electrons, which are accelerated across a distance and deposit their charge one at a time on the anode. The electric current arising from such a process can be written

I(t) = \sum_k F(t - t_k) ,   (1.5.1)

where F(t − t_k) represents the contribution to the current of an electron which arrives at time t_k. Each electron is therefore assumed to give rise to the same shaped pulse, but with an appropriate delay, as in Fig. 1.4.
A statistical aspect arises immediately we consider what kind of choice must be made for the t_k. The simplest choice is that each electron arrives independently of the previous one; that is, the times t_k are randomly distributed with a certain average number per unit time in the range (−∞, ∞), or whatever time interval is under consideration. The analysis of such noise was developed during the 1920s and 1930s and was summarised and largely completed by Rice [1.17]. It was first considered as early as 1918 by Schottky [1.18].
We shall find that there is a close connection between shot noise and processes described by birth-death master equations. For, if we consider n, the number of electrons which have arrived up to a time t, to be a statistical quantity described by a probability P(n, t), then the assumption that the electrons arrive independently is clearly the Markov assumption. Then, assuming the probability that an electron will
Fig. 1.4. Illustration of shot noise: identical electric pulses arrive at random times
arrive in the time interval between t and t + Δt is completely independent of t and n, its only dependence can be on Δt. By choosing an appropriate constant λ, we may write

\mathrm{Prob}(n \to n+1, \text{ in time } \Delta t) = \lambda\, \Delta t ,   (1.5.2)

so that

P(n, t+\Delta t) = P(n, t)(1 - \lambda \Delta t) + P(n-1, t)\, \lambda \Delta t ,   (1.5.3)
and taking the limit Δt → 0,

\frac{\partial P(n, t)}{\partial t} = \lambda\, [P(n-1, t) - P(n, t)] ,   (1.5.4)

which is a pure birth process. By writing

G(s, t) = \sum_n s^n P(n, t)   (1.5.5)

(here, G(s, t) is known as the generating function for P(n, t), and this particular technique of solving (1.5.4) is very widely used), we find

\frac{\partial G}{\partial t} = \lambda (s - 1)\, G ,   (1.5.6)

so that

G(s, t) = \exp[\lambda (s - 1) t]\, G(s, 0) .   (1.5.7)
By requiring at time t = 0 that no electrons had arrived, it is clear that P(0, 0) = 1 and P(n, 0) = 0 for all n ≥ 1, so that G(s, 0) = 1. Expanding the solution (1.5.7) in powers of s, we find

P(n, t) = e^{-\lambda t} (\lambda t)^n / n! ,   (1.5.8)

which is known as a Poisson distribution (Sect. 2.8.3). Let us introduce the variable N(t), which is to be considered as the number of electrons which have arrived up to time t, and is a random quantity. Then

P(n, t) = \mathrm{Prob}(N(t) = n) ,   (1.5.9)

and N(t) can be called a Poisson process variable.
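The Poisson law (1.5.8) is easy to verify by direct simulation of independent arrivals; the sketch below assumes exponential waiting times between arrivals, which is equivalent to (1.5.2), with an arbitrary rate λ:

    from math import exp, factorial
    import numpy as np

    rng = np.random.default_rng(4)

    lam, t = 2.0, 3.0        # arrival rate lambda and observation time (assumed)
    n_runs = 100_000

    # Independent arrivals have exponential waiting times; count arrivals in [0, t].
    waits = rng.exponential(1.0 / lam, size=(n_runs, 64))
    counts = (waits.cumsum(axis=1) <= t).sum(axis=1)

    for n in range(5):
        theory = exp(-lam * t) * (lam * t)**n / factorial(n)   # (1.5.8)
        print(n, round((counts == n).mean(), 4), round(theory, 4))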
Then clearly, the quantity μ(t), formally defined by

\mu(t) = \frac{dN(t)}{dt} ,   (1.5.10)

is zero, except when N(t) increases by 1; at that stage it is a Dirac delta function, i.e.,

\mu(t) = \sum_k \delta(t - t_k) ,   (1.5.11)

where the t_k are the times of arrival of the individual electrons. We may write

I(t) = \int dt'\, F(t - t')\, \mu(t') .   (1.5.12)
A very reasonable restriction on F(t − t′) is that it vanishes if t < t′, and that for t → ∞ it also vanishes. This simply means that no current arises from an electron before it arrives, and that the effect of its arrival eventually dies out. We assume then, for simplicity, the very commonly encountered form

F(t) = q e^{-\alpha t} \quad (t > 0) ; \qquad F(t) = 0 \quad (t < 0) ,   (1.5.13)

so that (1.5.12) can be rewritten as

I(t) = q \int_{-\infty}^{t} e^{-\alpha (t - t')}\, dN(t') .   (1.5.14)
We can derive a simple differential equation. We differentiate I(t) to obtain

\frac{dI(t)}{dt} = -\alpha q \int_{-\infty}^{t} e^{-\alpha (t - t')}\, dN(t') + q \frac{dN(t)}{dt} ,   (1.5.15)

so that

\frac{dI(t)}{dt} = -\alpha I(t) + q \frac{dN(t)}{dt} .   (1.5.16)
This is a kind of stochastic differential equation, similar to Langevin's equation, in which, however, the fluctuating force is given by qμ(t), where μ(t) is the derivative of the Poisson process, as given by (1.5.11). However, the mean of μ(t) is nonzero; in fact, from (1.5.10),

\langle \mu(t) \rangle\, dt = \langle dN(t) \rangle = \lambda\, dt ,   (1.5.17)

\langle [dN(t) - \lambda\, dt]^2 \rangle = \lambda\, dt ,   (1.5.18)

from the properties of the Poisson distribution, for which the variance equals the mean. Defining, then, the fluctuation as the difference between dN(t) and its mean value, we write

d\eta(t) = dN(t) - \lambda\, dt ,   (1.5.19)

so that the stochastic differential equation (1.5.16) takes the form

dI(t) = [\lambda q - \alpha I(t)]\, dt + q\, d\eta(t) .   (1.5.20)
Now how does one solve such an equation? In this case, we have an academic problem anyway, since the solution is known, but one would like to have a technique. Suppose we try to follow the method used by Langevin: what will we get as an answer? The short reply to this question is: nonsense. For example, using ordinary calculus and assuming ⟨dη(t)⟩ = 0, we can derive

\frac{d\langle I \rangle}{dt} = \lambda q - \alpha \langle I \rangle ,   (1.5.21)

\frac{d\langle I^2 \rangle}{dt} = 2\lambda q \langle I \rangle - 2\alpha \langle I^2 \rangle .   (1.5.22)

Solving in the limit t → ∞, where the mean values would reasonably be expected to be constant, one finds

\langle I(\infty) \rangle = \lambda q / \alpha ,   (1.5.23)

\langle I^2(\infty) \rangle = (\lambda q / \alpha)^2 .   (1.5.24)

The first answer is reasonable: it merely gives the average current through the system. But the second implies that the mean square current is the same as the square of the mean, i.e., the current at t → ∞ does not fluctuate! This is rather unreasonable, and the solution to the problem will show that stochastic differential equations are rather more subtle than we have so far presented.
Firstly, the notation in terms of differentials used in (1.5.17-1.5.20) has been chosen deliberately. In deriving (1.5.22), one uses ordinary calculus, i.e., one writes

(I + dI)^2 - I^2 = 2I\, dI + (dI)^2 ,   (1.5.25)

and then one drops the (dI)² as being of second order in dI. But now look at (1.5.18): this is equivalent to

\langle d\eta(t)^2 \rangle = \lambda\, dt ,   (1.5.26)

so that a quantity of second order in dη is actually of first order in dt. The reason is not difficult to find. Clearly,

d\eta(t) = dN(t) - \lambda\, dt ,   (1.5.27)

but the curve of N(t) is a step function, discontinuous, and certainly not differentiable, at the times of arrival of the individual electrons. In the ordinary sense, none of these calculus manipulations is permissible. But we can make sense out of them as follows. Let us simply calculate ⟨d(I²)⟩ using (1.5.20, 1.5.25, 1.5.26):
fs follows. Let us simply calculate (a2) using (1.5.20, 5.25, 15.26)
Cath?) = 2UILag — atlas + qdn{n)) + lag alee +qdntoy?). (1.5.28)
‘We now assume again that ((n)) = O and expand, after taking averages using
the fact that (dn()2) = Adi to Ist onder in dt, We obtain
$dir) = [agin — at?) + qa] dt, 15.29)
and this gives
; a
(Poo) = (eo)? = £ (1530)181. A strca!Intodetion
Thus, there are fluctuations from this point of view, as t → ∞. The extra term in (1.5.29) as compared to (1.5.22) arises directly out of the statistical considerations implicit in N(t) being a discontinuous random function.

Thus we have discovered a somewhat deeper way of looking at Langevin's kind of equation, the treatment of which, from this point of view, now seems extremely naive. In Langevin's method the fluctuating force X is not specified, but it will become clear in this book that problems such as we have just considered are very widespread in this subject. The moral is that random functions cannot normally be differentiated according to the usual laws of calculus; special rules have to be developed, and a precise specification of what one means by differentiation becomes important. We will specify these problems and their solutions in Chap. 4, which will concern itself with situations in which the fluctuations are Gaussian.
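The point of the calculation can be confirmed numerically. A minimal sketch (with arbitrary parameter values) integrates (1.5.16) with genuine Poisson increments dN, and recovers both the mean (1.5.23) and the nonzero variance (1.5.30) that naive ordinary calculus misses:

    import numpy as np

    rng = np.random.default_rng(5)

    lam, q, alpha = 50.0, 1.0, 2.0   # arrival rate, pulse size, decay rate (assumed)
    dt, n_steps, n_paths = 1e-3, 20_000, 500

    I = np.zeros(n_paths)
    for _ in range(n_steps):
        dN = rng.poisson(lam * dt, size=n_paths)   # <dN> = var[dN] = lam*dt, cf. (1.5.18)
        I += -alpha * I * dt + q * dN              # Euler step of (1.5.16)

    print("mean    :", I.mean(), "  expected", lam * q / alpha)            # (1.5.23)
    print("variance:", I.var(),  "  expected", lam * q**2 / (2 * alpha))   # (1.5.30)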
1.5.2 Autocorrelation Functions and Spectra
The measurements which one can carry out on fluctuating systems such as electric circuits are, in practice, not of unlimited variety. So far, we have considered the distribution functions which tell us, at any time, what the probability distribution of the values of a stochastic quantity is. If we are considering a measurable quantity x(t) which fluctuates with time, in practice we can sometimes determine the distribution of the values of x, though more usually, what is available at one time are the mean ⟨x(t)⟩ and the variance var[x(t)].

The mean and the variance do not tell a great deal about the underlying dynamics of what is happening. What would be of interest is some quantity which is a measure of the influence of a value of x at time t on the value at time t + τ. Such a quantity is the autocorrelation function, which was apparently first introduced by Taylor [1.19]:

G(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, x(t)\, x(t+\tau) .   (1.5.31)
This is the time average of a two-time product over an arbitrarily large time T, which is then allowed to become infinite. Using modern computerised data collection technology, it is straightforward to construct an autocorrelation function from any stream of data, either in real time or from recorded data.
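For instance, a direct estimate of G(τ) from sampled data might look like the following sketch (the AR(1) sequence is only a synthetic stand-in for recorded data):

    import numpy as np

    def autocorrelation(x, max_lag):
        """Time-average estimate of G(tau), as in (1.5.31), for a sampled signal."""
        n = len(x)
        return np.array([np.dot(x[:n - k], x[k:]) / (n - k) for k in range(max_lag)])

    # Synthetic "recorded data": an exponentially correlated AR(1) sequence.
    rng = np.random.default_rng(6)
    x = np.zeros(50_000)
    for i in range(1, len(x)):
        x[i] = 0.95 * x[i - 1] + rng.standard_normal()

    G = autocorrelation(x, max_lag=50)
    print(G[:5] / G[0])       # decays roughly like 0.95**k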
A closely connected approach is to compute the spectrum of the quantity x(t). This is defined in two stages. First, define

y(\omega) = \int_0^T dt\, e^{i\omega t}\, x(t) ;   (1.5.32)

then the spectrum is defined by

S(\omega) = \lim_{T \to \infty} \frac{1}{2\pi T}\, |y(\omega)|^2 .   (1.5.33)
The autocorrelation function and the spectrum are closely connected. By a little manipulation one finds
S(\omega) = \lim_{T \to \infty} \frac{1}{\pi} \int_0^T d\tau \cos(\omega\tau) \left[ \frac{1}{T} \int_0^{T-\tau} dt\, x(t)\, x(t+\tau) \right] ,   (1.5.34)

and taking the limit T → ∞ (under suitable assumptions to ensure the validity of certain interchanges of order), one finds

S(\omega) = \frac{1}{\pi} \int_0^{\infty} d\tau \cos(\omega\tau)\, G(\tau) .   (1.5.35)

This is a fundamental result which relates the Fourier transform of the autocorrelation function to the spectrum. Noting that the definition

G(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, x(t)\, x(t+\tau) = G(-\tau)   (1.5.36)

makes G an even function of τ, one may also write

S(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\tau\, e^{-i\omega\tau}\, G(\tau) ,   (1.5.37)

with corresponding inverse

G(\tau) = \int_{-\infty}^{\infty} d\omega\, e^{i\omega\tau}\, S(\omega) .   (1.5.38)
This result is known as the Wiener-Khinchin theorem [1.20, 1.21] and has widespread application. It means that one may either directly measure the autocorrelation function of a signal, or measure the spectrum, and convert back and forth, which by means of the fast Fourier transform and a computer is relatively straightforward.
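The discrete analogue of this back-and-forth conversion can be checked in a few lines. In the sketch below (synthetic data again standing in for a recorded signal), the periodogram of (1.5.32, 1.5.33) coincides with the transform of the estimated autocorrelation function, as (1.5.37) requires:

    import numpy as np

    rng = np.random.default_rng(7)

    n = 2**12                      # record length
    x = np.zeros(n)                # synthetic signal standing in for real data
    for i in range(1, n):
        x[i] = 0.9 * x[i - 1] + rng.standard_normal()

    # Spectrum straight from the signal: a discrete version of (1.5.32, 1.5.33).
    periodogram = np.abs(np.fft.rfft(x))**2 / n

    # The same spectrum from the autocorrelation function (Wiener-Khinchin).
    c = np.correlate(x, x, mode="full")[n - 1:]   # c_m = sum_j x_j x_{j+m}
    G = c / n                                     # estimate of G(tau)
    from_G = 2 * np.fft.rfft(G).real - G[0]       # discrete analogue of (1.5.37)

    print(np.allclose(periodogram, from_G))       # True: the two routes agree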
1.5.3 Fourier Analysis of Fluctuating Functions: Stationary Systems
The autocorrelation function has been defined so far as a time average of a signal, but we may also consider the ensemble average, in which we repeat the same measurement many times and compute averages, denoted by ⟨ ⟩. It will be shown that for very many systems the time average is equal to the ensemble average; such systems are termed ergodic (see Sect. 3.7.1).
If we have such a fluctuating quantity x(t), then we can consider the average of the product of two time-values of x:

\langle x(t)\, x(t+\tau) \rangle = G(\tau) .   (1.5.39)

The fact that the result is independent of the absolute time t is a consequence of our ergodic assumption.
Now it is very natural to write a Fourier transform for the stochastic quantity x(t):

x(t) = \int d\omega\, c(\omega)\, e^{i\omega t} ,   (1.5.40)

and consequently,

c(\omega) = \frac{1}{2\pi} \int dt\, x(t)\, e^{-i\omega t} .   (1.5.41)

Note that x(t) real implies

c(\omega) = c^*(-\omega) .   (1.5.42)
If the system is ergodic, we must have a constant ⟨x(t)⟩, since the time average is clearly constant. The process is then stationary, by which we mean that all time-dependent averages are functions only of time differences, i.e., averages of functions x(t₁), x(t₂), ..., x(tₙ) are equal to those of x(t₁+Δ), x(t₂+Δ), ..., x(tₙ+Δ).
For convenience, in what follows we assume ⟨x(t)⟩ = 0. Hence,

\langle c(\omega) \rangle = 0 ,   (1.5.43)

and

\langle c(\omega)\, c^*(\omega') \rangle = \frac{1}{4\pi^2} \int\!\!\int dt\, dt'\, e^{-i(\omega t - \omega' t')}\, \langle x(t)\, x(t') \rangle = \frac{1}{2\pi} \delta(\omega - \omega') \int d\tau\, e^{-i\omega\tau}\, G(\tau) = \delta(\omega - \omega')\, S(\omega) .   (1.5.44)
Here we find not only a relationship between the mean square ⟨|c(ω)|²⟩ and the spectrum, but also the result that stationarity alone implies that c(ω) and c*(ω′) are uncorrelated, since the term δ(ω − ω′) arises because ⟨x(t)x(t′)⟩ is a function only of t − t′.
1.5.4 Johnson Noise and Nyquist's Theorem
Two brief and elegant papers appeared in 1928, in which Johnson [1.22] demonstrated experimentally that an electric resistor automatically generates fluctuations of electric voltage, and Nyquist [1.23] gave its theoretical derivation, in complete accordance with Johnson's experiment. The principle involved was already known by Schottky [1.18] and is the same as that used by Einstein and Langevin. This principle is that of thermal equilibrium. If a resistor R produces electric fluctuations, these will produce a current, which will generate heat. The heat produced in the resistor must exactly balance the energy taken out of the fluctuations. The detailed working out of this principle is not the subject of this section, but we will find that such results are common throughout the physics and chemistry of stochastic processes, where the principles of statistical mechanics, whose basis is not essentially stochastic, are brought in to complement those of stochastic processes; such results are known as fluctuation-dissipation theorems.
Johnson's experimental result was the following. We have an electric resistor of resistance R at absolute temperature T. Suppose by means of a suitable filter we measure E(ω)dω, the voltage across the resistor with angular frequency in the range (ω, ω + dω). Then, if k is Boltzmann's constant,

\langle E(\omega)^2 \rangle = 2RkT\, d\omega / \pi .   (1.5.45)
This result is known nowadays as Nyquist's theorem. Johnson remarked: "The effect is one of the causes of what is called 'tube noise' in vacuum tube amplifiers. Indeed, it is often by far the larger part of the 'noise' of a good amplifier."
Johnson noise is easily described by the formalism of the previous subsection. The mean noise voltage across a resistor is zero, and the system is arranged so that it is in a steady state, and is expected to be well represented by a stationary process. Johnson's quantity is, in practice, a limit of the kind (1.5.33) and may be summarised by saying that the voltage spectrum S(ω) is given by

S(\omega) = RkT / \pi ,   (1.5.46)

that is, the spectrum is flat, i.e., a constant function of ω. In the case of light, different frequencies correspond to different colours. If we perceive light to be white, it is found that in practice all colours are present in equal proportions: the optical spectrum of white light is thus flat, at least within the visible range. By analogy, the term white noise is applied to a noise voltage (or any other fluctuating quantity) whose spectrum is flat.
White noise cannot actually exist. The simplest demonstration is to note that the mean power dissipated in the resistor in the frequency range (ω₁, ω₂) is given by

\int_{\omega_1}^{\omega_2} d\omega\, S(\omega) = RkT (\omega_2 - \omega_1) / \pi ,   (1.5.47)

so that the total power dissipated over all frequencies is infinite! Nyquist realised this, and noted that, in practice, there would be quantum corrections which would, at room temperature, make the spectrum flat only up to about 7 × 10¹³ Hz, which is not detectable in practice in a radio situation. The actual power dissipated in the resistor would be somewhat less than infinite: about 10⁻⁸ W, in fact! And in practice there are other limiting factors, such as the inductance of the system, which would limit the spectrum to even lower frequencies.
From the definition of the spectrum in terms of the autocorrelation function given in Sect. 1.5.2, we have

\langle E(t+\tau)\, E(t) \rangle = G(\tau) = \int_{-\infty}^{\infty} d\omega\, e^{i\omega\tau}\, S(\omega)   (1.5.48, 1.5.49)
= 2RkT\, \delta(\tau) ,   (1.5.50)

which implies that no matter how small the time difference τ, E(t+τ) and E(t) are not correlated. This is, of course, a direct result of the flatness of the spectrum. A typical model of S(ω) that is almost flat is

S(\omega) = \frac{RkT}{\pi (\omega^2 \tau_c^2 + 1)} .   (1.5.51)

This is flat provided ω ≪ τ_c⁻¹. The Fourier transform can be explicitly evaluated in this case to give

\langle E(t+\tau)\, E(t) \rangle = \frac{RkT}{\tau_c} \exp(-|\tau| / \tau_c) ,   (1.5.52)
Fig. 1.5. Correlation functions G(τ) (solid) and corresponding spectra (dashed) for (a) short correlation time, corresponding to an almost flat spectrum; (b) long correlation time, giving a quite rapidly decreasing spectrum
so that the autocorrelation function vanishes only for τ ≫ τ_c, which is called the correlation time of the fluctuating voltage. Thus, the delta function correlation function appears as an idealisation, only valid on a sufficiently long time scale.

This is very reminiscent of Einstein's assumption regarding Brownian motion and of the behaviour of Langevin's fluctuating force. The idealised white noise will play a highly important role in this book, but, in just the same way as the fluctuation term that arises in a stochastic differential equation is not the same as an ordinary differential, we will find that differential equations which include white noise as a driving term have to be handled with great care. Such equations arise very naturally in any fluctuating system, and it is possible to arrange, by means of Stratonovich's rules, for ordinary calculus rules to apply, but at the cost of imprecise mathematical definition and some difficulties in stochastic manipulation. It turns out to be far better to abandon ordinary calculus and use the Itô calculus, which is not very different (it is, in fact, very similar to the calculus presented above for shot noise) and to preserve tractable statistical properties. All these matters will be discussed thoroughly in Chap. 4.
White noise, as we have noted above, does not exist as a physically realisable process, and the rather singular behaviour it exhibits does not arise in any realisable context. It is, however, fundamental in a mathematical, and indeed in a physical, sense, in that it is an idealisation of very many processes that do occur. The slightly strange rules which we will develop for the calculus of white noise are not really very difficult, and are very much easier to handle than any method which always deals with a real noise. Furthermore, situations in which white noise is not a good approximation can very often be indirectly expressed quite simply in terms of white noise. In this sense, white noise is the starting point from which a wide range of stochastic descriptions can be derived, and it is therefore fundamental to the subject of this book.
2. Probability Concepts
In the preceding chapter, we introduced probability notions without any definitions. In order to formulate essential concepts more precisely, it is necessary to have some more precise expression of these concepts. The intention of this chapter is to provide some background, and to present a number of essential results. It is not a thorough outline of mathematical probability, for which the reader is referred to standard mathematical texts such as those by Feller [2.1] and Papoulis [2.2].
2.1 Events, and Sets of Events
It is convenient to use a notation which is as general as possible in order to describe those occurrences to which we might wish to assign probabilities. For example, we may wish to talk about a situation in which there are 6.4 × 10¹⁴ molecules in a certain region of space; or a situation in which a Brownian particle is at a certain point x in space; or possibly there are 10 mice and 3 owls in a certain region of a forest.
These occurrences are all examples of practical realisations of events. More abstractly, an event is simply a member of a certain space, which in the cases most practically occurring can be characterised by a vector of integers

n = (n_1, n_2, n_3, \dots)   (2.1.1)

or a vector of real numbers

x = (x_1, x_2, x_3, \dots) .   (2.1.2)

The dimension of the vector is arbitrary.
It is convenient to use the language of set theory, introduce the concept of a set of events, and use the notation

\omega \in A   (2.1.3)

to indicate that the event ω is one of the events contained in A. For example, one may consider the set A(25) of events in the ecological population in which there are no more than 25 animals present; clearly the event ω that there are 3 mice, a tiger, and no other animals present satisfies

\omega \in A(25) .   (2.1.4)
More significantly, suppose we define the set of events A(r, ΔV) that a molecule is within a volume element ΔV centred on a point r. In this case, the practical significance of working in terms of sets of events becomes clear, because we should normally be able to determine whether or not a molecule is within a neighbourhood ΔV of r, whereas to determine whether the particle is exactly at r is impossible. Thus, if we define the event ω(y) that the molecule is at point y, it makes sense to ask whether

\omega(y) \in A(r, \Delta V)   (2.1.5)

and to assign a certain probability to the set A(r, ΔV), which is to be interpreted as the probability of the occurrence of (2.1.5).
2.2 Probabilities
Most people have an intuitive conception of a probability, based on their own experience. However, a precise formulation of intuitive concepts is fraught with difficulties, and it has been found most convenient to axiomatise probability theory as an essentially abstract science, in which a probability measure P(A) is assigned to every set A in the space of events, including

the set of all events: Ω,   (2.2.1)
the set of no events: ∅,   (2.2.2)

in order to define probability. We need our sets of events to form a closed system (known by mathematicians as a σ-algebra) under the set-theoretic operations of union and intersection.
2.2.1 Probability Axioms
We introduce the probability of A, P(A), as a function of A satisfying the following probability axioms:

i) P(A) ≥ 0 for all A,   (2.2.3)

ii) P(Ω) = 1,   (2.2.4)

iii) if A_i (i = 1, 2, 3, ...) is a countable (but possibly infinite) collection of nonoverlapping sets, i.e., such that

A_i \cap A_j = \varnothing \quad \text{for all } i \neq j ,   (2.2.5)

then

P\!\left( \bigcup_i A_i \right) = \sum_i P(A_i) .   (2.2.6)
These are all the axioms needed. Consequentially, however, we have:

iv) if Ā is the complement of A, i.e., the set of all events not contained in A, then

P(\bar{A}) = 1 - P(A) ;   (2.2.7)

v) P(\varnothing) = 0 .   (2.2.8)
2.2.2 The Meaning of P(A)
There is no way of making probability theory correspond to reality without requiring a certain degree of intuition. The probability P(A), as axiomatised above, is the intuitive probability that an "arbitrary" event ω, i.e., an event ω "chosen at random", will satisfy ω ∈ A. Or, more explicitly, if we choose an event "at random" from Ω N times, the relative frequency that the particular event chosen will satisfy ω ∈ A approaches P(A) as the number of times, N, we choose the event approaches infinity. The N choices can be visualised as being made one after the other ("independent" tosses of one die) or at the same time (N dice are thrown at the same time, "independently"). All definitions of this kind must be intuitive, as we can see by the way undefined terms ("arbitrary", "at random", "independent") keep turning up. By eliminating what we now think of as intuitive ideas and axiomatising probability, Kolmogorov [2.3] cleared the road for a rigorous development of mathematical probability. But the circular definition problems posed by wanting an intuitive understanding remain. The simplest way of looking at axiomatic probability is as a formal method of manipulating probabilities using the axioms. In order to apply the theory, the probability space must be defined and the probability measure P assigned. These are a priori probabilities, which are simply assumed. Examples of such a priori probabilities abound in applied disciplines. For example, in equilibrium statistical mechanics one assigns equal probabilities to equal volumes of phase space. Einstein's reasoning in Brownian motion assigned a probability φ(Δ)dΔ to the probability of a "push" Δ from a position x at time t.
The task of applying probability is:

i) to assume some set of a priori probabilities which seem reasonable, and to deduce results from this and from the structure of the probability space;

ii) to measure experimental results with some apparatus which is constructed to measure quantities in accordance with these a priori probabilities.

The structure of the probability space is very important, especially when the space of events is compounded by the additional concept of time. This extension makes the effective probability space infinite-dimensional, since we can construct events such as "the particle was at points x_n at times t_n, for n = 0, 1, 2, ..., ∞".
2.2.3 The Meaning of the Axioms
Any intuitive concept of probability gives rise to nonnegative probabilities, and the probability that an arbitrary event is contained in the set of all events must be 1, no matter what our definition of the word arbitrary. Hence, axioms i) and ii) are understandable. The heart of the matter lies in axiom iii). Suppose we are dealing with only two sets A and B, with A ∩ B = ∅. This means there are no events contained in both A and B. Therefore, the probability that ω ∈ A ∪ B is the probability that either ω ∈ A or ω ∈ B. Intuitive considerations tell us this probability is the sum of the individual probabilities, i.e.,

P(A \cup B) = P((\omega \in A) \text{ or } (\omega \in B)) = P(A) + P(B) .   (2.2.9)

Notice this is not a proof, merely an explanation.

The extension to any finite number of nonoverlapping sets is obvious, but the extension to a countable number of nonoverlapping sets requires some comment. This extension must be made restrictive because of the existence of sets labelled by a continuous index, for example x, the position in space. The probability of a molecule being in the set whose only element is x is zero; but the probability of being in a region R of finite volume is nonzero. The region R is a union of sets of the form {x}, but not a countable union. Thus axiom iii) is not applicable, and the probability of being in R is not equal to the sum of the probabilities of being in {x}.
2.2.4 Random Variables
The concept of a random variable is a notational convenience which is central to this book. Suppose we have an abstract probability space whose events can be written x. Then we can introduce the random variable F(x), which is a function of x and takes on certain values for each x. In particular, the identity function of x, written X(x), is of interest; it is given by

X(x) = x .   (2.2.10)

We shall normally use capitals in this book to denote random variables and small letters to denote their values, whenever it is necessary to make a distinction.

Very often, we have some quite different underlying probability space Ω with values ω, and talk about X(ω), which is some function of ω, and then omit explicit mention of ω. This can be for either of two reasons:

i) we specify the events by the values of x anyway, i.e., we identify x and ω;

ii) the underlying events ω are too complicated to describe, or sometimes, even to know.
For example, in the case of the position of a molecule in a liquid, we really should interpret each ω as being capable of specifying all the positions, momenta, and orientations of each molecule in that volume of liquid; but this is simply too difficult to write down, and often unnecessary.

One great advantage of introducing the concept of a random variable is the simplicity with which one may handle functions of random variables, e.g., X², sin(a·X), etc., and compute means and distributions of these. Further, by defining stochastic differential equations, one can also quite simply talk about the time development of random variables, in a way which is quite analogous to the classical description by means of differential equations of non-probabilistic systems.
2.3 Joint and Conditional Probabilities: Independence
2.3.1 Joint Probabilities
We explained in Sect. 2.2.3 how the occurrence of mutually exclusive events is related to the concept of nonintersecting sets. We now consider the concept P(A ∩ B), where A ∩ B is nonempty. An event ω which satisfies ω ∈ A will only satisfy ω ∈ A ∩ B if ω ∈ B as well. Thus,

P(A \cap B) = P((\omega \in A) \text{ and } (\omega \in B)) ,   (2.3.1)

and P(A ∩ B) is called the joint probability that the event ω is contained in both classes, or, alternatively, that both the events ω ∈ A and ω ∈ B occur. Joint probabilities occur naturally in the context of this book in two ways:
i) When the event is specified by a vector, e.g., m mice and n tigers. The probability of this event is the joint probability of {m mice (and any number of tigers)} and {n tigers (and any number of mice)}. All vector specifications are implicitly joint probabilities in this sense.

ii) When more than one time is considered: what is the probability that (at time t₁ there are m₁ tigers and n₁ mice) and (at time t₂ there are m₂ tigers and n₂ mice)? To consider such a probability, we have effectively created, out of the events at time t₁ and the events at time t₂, joint events involving one event at each time. In essence, there is no difference between these two cases except for the fundamental dynamical role of time.
2.3.2 Conditional Probabilities
We may specify conditions on the events we are interested in and consider only these, e.g., the probability of 21 buffaloes given that we know there are 100 lions. What does this mean? Clearly, we will be interested only in those events contained in the set B = {all events where exactly 100 lions occur}. This means that we define conditional probabilities, which are defined only on the collection of all sets contained in B. We define the conditional probability as

P(A \mid B) = P(A \cap B) / P(B) ,   (2.3.2)

and this satisfies our intuitive conception that the conditional probability that ω ∈ A (given that we know ω ∈ B) is given by dividing the probability of joint occurrence by the probability that ω ∈ B.
We can define in both directions, i.e., we have

P(A \cap B) = P(A \mid B)\, P(B) = P(B \mid A)\, P(A) .   (2.3.3)

There is no particular conceptual difference between, say, the probability of {(21 buffaloes) given (100 lions)} and the reversed concept. However, when two times are involved, we do see a difference. For example, the probability that a particle is at position x₁ at time t₁, given that it was at x₂ at the previous time t₂, is a very natural thing to consider; indeed, it will turn out to be a central concept in this book. The converse looks to the past rather than the future: given that a particle is at x₁ at time t₁, what is the probability that at the previous time t₂ it was at position x₂? The first concept, the forward probability, looks at where the particle will go; the second, the backward probability, at where it came from.

The forward probability has already occurred in this book: for example, the φ(Δ)dΔ of Einstein (Sect. 1.2.1) is the probability that a particle at x at time t will be in the range [x + Δ, x + Δ + dΔ] at time t + τ, and similarly in the other examples. Our intuition tells us, as it told Einstein (as can be seen by reading the extract from his paper), that this kind of conditional probability is directly related to the time development of a probabilistic system.
2.3.3 Relationship Between Joint Probabilities of Different Orders
Suppose we have a collection of sets B_i such that

B_i \cap B_j = \varnothing \quad (i \neq j) ,   (2.3.4)

\bigcup_i B_i = \Omega ,   (2.3.5)

so that

A = \bigcup_i (A \cap B_i) .   (2.3.6)

Using now the probability axiom iii), we see that the A ∩ B_i satisfy the conditions on the A_i used there, so that

\sum_i P(A \cap B_i) = P\!\left( \bigcup_i (A \cap B_i) \right)   (2.3.7)
= P(A) ,   (2.3.8)

and thus

\sum_i P(A \mid B_i)\, P(B_i) = P(A) .   (2.3.9)

Thus, summing over all mutually exclusive possibilities of B in the joint probability eliminates that variable. Hence, in general,

\sum_i P(A \cap B_i \cap C \dots) = P(A \cap C \dots) .   (2.3.10)

The result (2.3.9) has very significant consequences in the development of the theory of stochastic processes, which depends heavily on joint probabilities.
2.3.4 Independence
We need a probabilistic way of specifying what we mean by independent events. Two sets of events A and B should represent independent sets of events if the specification that a particular event is contained in B has no influence on the probability of that event belonging to A. Thus, the conditional probability P(A|B) should be independent of B, and hence

P(A \cap B) = P(A)\, P(B) .   (2.3.11)

In the case of several events, we need a somewhat stronger specification. The events (ω ∈ A_i) (i = 1, 2, ..., n) will be considered to be independent if, for any subset (i₁, i₂, ..., iₖ) of the set (1, 2, ..., n),

P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k}) = P(A_{i_1})\, P(A_{i_2}) \cdots P(A_{i_k}) .   (2.3.12)
It is important to require factorisation for all possible combinations, as in (2.3.12). For example, for three sets A_i, it is quite conceivable that

P(A_i \cap A_j) = P(A_i)\, P(A_j)   (2.3.13)

for all different i and j, but also that

A_1 \cap A_2 = A_2 \cap A_3 = A_3 \cap A_1 . \quad \text{(See Fig. 2.1)}   (2.3.14)

This requires

P(A_1 \cap A_2 \cap A_3) = P(A_1 \cap (A_2 \cap A_3)) = P(A_2 \cap A_3) = P(A_2)\, P(A_3) \neq P(A_1)\, P(A_2)\, P(A_3) .   (2.3.15)

We can see that the occurrence of ω ∈ A₂ and ω ∈ A₃ necessarily implies the occurrence of ω ∈ A₁. In this sense the events are obviously not independent.
Random variables X₁, X₂, X₃, ... will be said to be independent random variables if, for all sets of the form A_i = {x : a_i ≤ x ≤ b_i}, the events X₁ ∈ A₁, X₂ ∈ A₂, ... are independent events.

... so we see that higher moments tell us only about the properties of unlikely large values of X. In practice, we find that the most important quantities are related to the first and second moments. In particular, for a single variable X, the variance is defined by

\mathrm{var}[X] = \sigma[X]^2 = \langle (X - \langle X \rangle)^2 \rangle ,   (2.5.3)

and, as is well known, the variance var[X], or its square root the standard deviation σ[X], is a measure of the degree to which the values of X deviate from the mean value ⟨X⟩.
In the case of several variables, we define the covariance matrix as
⟨X_i, X_j⟩ ≡ ⟨(X_i − ⟨X_i⟩)(X_j − ⟨X_j⟩)⟩ = ⟨X_i X_j⟩ − ⟨X_i⟩⟨X_j⟩.  (2.5.4)
Obviously,
⟨X_i, X_i⟩ = var[X_i].  (2.5.5)
If the variables are independent in pairs, the covariance matrix is diagonal.
2.5.2 The Law of Large Numbers
As an application of the previous concepts, let us investigate the following model of measurement. We assume that we measure the same quantity N times, obtaining sample values of the random variable X(n) (n = 1, 2, …, N). Since these are all measurements of the same quantity at successive times, we assume that, for every n, X(n) has the same probability distribution, but we do not assume the X(n) to be independent. However, provided the covariance matrix ⟨X(n), X(m)⟩ vanishes sufficiently rapidly as |n − m| → ∞, then, defining
X̄_N = (1/N) Σ_{n=1}^{N} X(n),  (2.5.6)
we shall show that
lim_{N→∞} X̄_N = ⟨X⟩.  (2.5.7)
It is clear that
⟨X̄_N⟩ = ⟨X⟩.  (2.5.8)
We now calculate the variance of X̄_N and show that, as N → ∞, it vanishes under certain conditions:
var[X̄_N] = ⟨X̄_N²⟩ − ⟨X̄_N⟩² = (1/N²) Σ_{n,m=1}^{N} ⟨X(n), X(m)⟩.  (2.5.9)
Provided ⟨X(n), X(m)⟩ falls off sufficiently rapidly as |n − m| → ∞, we find
lim_{N→∞} var[X̄_N] = 0,  (2.5.10)
so that lim_{N→∞} X̄_N is a deterministic variable equal to ⟨X⟩.
Two models of ⟨X(n), X(m)⟩ can be chosen:
a) ⟨X(n), X(m)⟩ ~ K a^{|n−m|}  (a < 1),  (2.5.11)
for which one finds
var[X̄_N] = (K/N²)[N(1 + a)/(1 − a) − 2a(1 − a^N)/(1 − a)²] → 0;  (2.5.12)
b) ⟨X(n), X(m)⟩ ~ K/|n − m|  (n ≠ m),  (2.5.13)
and one finds approximately
var[X̄_N] ~ (2K/N) log N → 0.  (2.5.14)
In both these cases, var[X̄_N] → 0, but the rate of convergence is very different. Interpreting n, m as the times at which the measurements are carried out, one sees that even very slowly decaying correlations are permissible. The law of large numbers comes in many forms, which are nicely summarised by Papoulis [2.2]. The central limit theorem is an even more precise result, in which the limiting distribution function of X̄_N − ⟨X⟩ is determined (see Sect. 2.8.2).
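The effect of slowly decaying correlations can be checked numerically. The sketch below (parameters are illustrative choices, not from the text) generates an AR(1) sequence, whose covariance decays geometrically as in case a), and watches the sample mean approach the true mean zero.

    import numpy as np

    rng = np.random.default_rng(0)
    a = 0.9                                  # geometric decay parameter, as in (2.5.11)
    for N in [10**2, 10**3, 10**4, 10**5]:
        x = np.empty(N)
        x[0] = rng.normal()
        for n in range(1, N):                # X(n) = a X(n-1) + noise, stationary variance 1
            x[n] = a * x[n - 1] + np.sqrt(1 - a**2) * rng.normal()
        print(N, x.mean())                   # shrinks roughly like 1/sqrt(N)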
2.6 Characteristic Function
One would like a condition where the variables are independent, not just in pairs. To this end (and others) we define the characteristic function.
If s is the vector (s1, s2, …, sn) and X = (X1, X2, …, Xn) is a vector of random variables, then the characteristic function (or moment generating function) is defined by
φ(s) = ⟨exp(is·X)⟩ = ∫ dx p(x) exp(is·x).  (2.6.1)
The characteristic function has the following properties ([2.1], Chap. XV):
i) φ(0) = 1;
ii) |φ(s)| ≤ 1;
iii) φ(s) is a uniformly continuous function of its arguments for all finite real s [2.4];
iv) if the moments ⟨∏_i X_i^{m_i}⟩ exist, then
⟨∏_i X_i^{m_i}⟩ = [∏_i (−i ∂/∂s_i)^{m_i}] φ(s) |_{s=0};  (2.6.2)
v) a sequence of probability densities converges to a limiting probability density if and only if the corresponding characteristic functions converge to the characteristic function of the limiting probability density;
vi) Fourier inversion formula:
p(x) = (2π)^{−n} ∫ ds φ(s) exp(−ix·s).  (2.6.3)
Because of this inversion formula, φ(s) determines p(x) with probability 1; hence, the characteristic function does truly characterise the probability density.
vii) Independent random variables: from the definition of independent random variables in Sect. 2.3.4, it follows that the variables X1, X2, … are independent if and only if
p(x1, x2, …, xn) = p1(x1)p2(x2) … pn(xn),  (2.6.4)
in which case
φ(s1, s2, …, sn) = φ1(s1)φ2(s2) … φn(sn).  (2.6.5)
viii) Sum of independent random variables: if X1, X2, … are independent random variables, and if
Y = Σ_i X_i,  (2.6.6)
and the characteristic function of Y is
φ_Y(s) = ⟨exp(isY)⟩,  (2.6.7)
then
φ_Y(s) = ∏_i φ_i(s).  (2.6.8)
The characteristic function plays an important role in this book, which arises from the convergence property (v): it allows us to perform limiting processes on the characteristic function rather than on the probability distribution itself, and this often makes proofs easier. Further, the fact that the characteristic function is truly characteristic, i.e., the inversion formula (vi), shows that different characteristic functions arise from different distributions. As well as this, the straightforward derivation of the moments by (2.6.2) makes any determination of the characteristic function directly relevant to measurable quantities.
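Property (vii) is easy to test by simulation. The following sketch (sample sizes and the particular distributions are arbitrary choices) compares the empirical joint characteristic function of two independent variables with the product of the marginal ones.

    import numpy as np

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=200_000)            # X1 Gaussian
    x2 = rng.exponential(size=200_000)       # X2 exponential, independent of X1
    s1, s2 = 0.7, -1.3                       # an arbitrary test point (s1, s2)
    phi_joint = np.mean(np.exp(1j * (s1 * x1 + s2 * x2)))
    phi_prod = np.mean(np.exp(1j * s1 * x1)) * np.mean(np.exp(1j * s2 * x2))
    print(phi_joint, phi_prod)               # agree to sampling accuracy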
2.7 Cumulant Generating Function: Correlation Functions and Cumulants
A further important property of the characteristic function arises by considering its logarithm
Φ(s) = log φ(s),  (2.7.1)
which is called the cumulant generating function. Let us assume that all moments exist, so that φ(s), and hence Φ(s), is expandable in a power series, which can be written as
Φ(s) = Σ′ ⟨⟨X1^{m1} X2^{m2} … Xn^{mn}⟩⟩ (is1)^{m1}(is2)^{m2} … (isn)^{mn} / (m1! m2! … mn!),  (2.7.2)
where the sum Σ′ is over all combinations of the m_i which are not all zero, and
the quantities ⟨⟨X1^{m1} X2^{m2} … Xn^{mn}⟩⟩ are called the cumulants of the variables X. The notation chosen should not be taken to mean that the cumulants are functions of the particular product of powers of the X_i; it rather indicates the moment of highest order which occurs in their expression in terms of moments. Stratonovich [2.5] also uses the term correlation functions, a term which we shall reserve for cumulants which involve more than one X_i. For, if the X_i are all independent, the factorisation property (2.6.5) implies that Φ(s) (the cumulant generating function) is a sum of terms, each of which is a function of only one s_i, and hence the coefficients of mixed terms, i.e., the correlation functions (in our terminology), are all zero; and the converse is also true. Thus, the magnitude of the correlation functions is a measure of the degree of correlation.
The cumulants and correlation functions can be evaluated in terms of moments by expanding the characteristic function as a power series:
φ(s) = 1 + Σ′ ⟨X1^{m1} X2^{m2} … Xn^{mn}⟩ (is1)^{m1}(is2)^{m2} … (isn)^{mn} / (m1! m2! … mn!).  (2.7.3)
Expanding the logarithm in a power series and comparing it with (2.7.2) for Φ(s), the relationship between the cumulants and the moments can be deduced. No simple formula can be given, but the first few cumulants can be exhibited; we find
⟨⟨X_i⟩⟩ = ⟨X_i⟩,  (2.7.4)
⟨⟨X_i X_j⟩⟩ = ⟨X_i X_j⟩ − ⟨X_i⟩⟨X_j⟩,  (2.7.5)
⟨⟨X_i X_j X_k⟩⟩ = ⟨X_i X_j X_k⟩ − ⟨X_i X_j⟩⟨X_k⟩ − ⟨X_i X_k⟩⟨X_j⟩ − ⟨X_j X_k⟩⟨X_i⟩ + 2⟨X_i⟩⟨X_j⟩⟨X_k⟩.  (2.7.6)
Here, all formulae are also valid for any number of equal i, j, k, l. An explicit general formula can be given as follows. Suppose we wish to calculate the cumulant ⟨⟨X1X2X3 … Xn⟩⟩. The procedure is the following:
i) Write a sequence of n dots;
ii) divide it into p + 1 subsets by inserting angle brackets,
⟨· ·⟩⟨· · ·⟩⟨·⟩ … ⟨·⟩;  (2.7.7)
iii) distribute the symbols X1 … Xn in place of the dots in such a way that all different expressions of this kind occur, e.g.,
⟨X1⟩⟨X2X3⟩ = ⟨X1⟩⟨X3X2⟩ ≠ ⟨X2⟩⟨X1X3⟩;  (2.7.8)
iv) take the sum of all such terms for a given p, and call this C_p(X1, X2, …, Xn); then
⟨⟨X1X2 … Xn⟩⟩ = Σ_{p=0}^{n−1} (−1)^p p! C_p(X1, X2, …, Xn).  (2.7.9)
A derivation of this formula was given by Meeron [2.6]; the particular procedure given here is due to van Kampen [2.7].
v) For cumulants in which there are one or more repeated elements, for example ⟨⟨X1²X3X4⟩⟩, simply evaluate ⟨⟨X1X2X3X4⟩⟩ and set X2 = X1 in the resulting expression.
2.7.1 Example: Cumulant of Order 4: ⟨⟨X1X2X3X4⟩⟩
a) p = 0: the only term is
⟨X1X2X3X4⟩ = C0(X1, X2, X3, X4).
b) p = 1: partition ⟨·⟩⟨· · ·⟩:
terms ⟨X1⟩⟨X2X3X4⟩ + ⟨X2⟩⟨X1X3X4⟩ + ⟨X3⟩⟨X1X2X4⟩ + ⟨X4⟩⟨X1X2X3⟩ ≡ D1;
partition ⟨· ·⟩⟨· ·⟩:
terms ⟨X1X2⟩⟨X3X4⟩ + ⟨X1X3⟩⟨X2X4⟩ + ⟨X1X4⟩⟨X2X3⟩ ≡ D2.
Hence,
D1 + D2 = C1(X1, X2, X3, X4).  (2.7.10)
c) p = 2: partition ⟨·⟩⟨·⟩⟨· ·⟩:
terms ⟨X1⟩⟨X2⟩⟨X3X4⟩ + ⟨X1⟩⟨X3⟩⟨X2X4⟩ + ⟨X1⟩⟨X4⟩⟨X2X3⟩ + ⟨X2⟩⟨X3⟩⟨X1X4⟩ + ⟨X2⟩⟨X4⟩⟨X1X3⟩ + ⟨X3⟩⟨X4⟩⟨X1X2⟩ = C2(X1, X2, X3, X4).
d) p = 3: partition ⟨·⟩⟨·⟩⟨·⟩⟨·⟩:
term ⟨X1⟩⟨X2⟩⟨X3⟩⟨X4⟩ = C3(X1, X2, X3, X4).
Hence,
⟨⟨X1X2X3X4⟩⟩ = C0 − C1 + 2C2 − 6C3.  (2.7.11)
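The partition rule translates directly into code. The sketch below (names of our own choosing, parameters illustrative) enumerates all set partitions of four indices, forms the sums C_p, and applies (2.7.11); setting all four arguments equal, as in rule (v), it recovers the fourth cumulant of a unit-rate exponential variable, which is known to equal 3! = 6.

    import numpy as np

    def partitions(items):                   # all set partitions of a small list
        items = list(items)
        if not items:
            yield []
            return
        first, rest = items[0], items[1:]
        for part in partitions(rest):
            for i in range(len(part)):
                yield part[:i] + [[first] + part[i]] + part[i + 1:]
            yield [[first]] + part

    def fourth_cumulant(samples):            # samples: array of shape (4, N)
        m = lambda block: np.mean(np.prod(samples[list(block)], axis=0))
        C = {p: 0.0 for p in range(4)}
        for part in partitions(range(4)):    # group by number of subsets p + 1
            C[len(part) - 1] += np.prod([m(block) for block in part])
        return C[0] - C[1] + 2 * C[2] - 6 * C[3]   # formula (2.7.11)

    rng = np.random.default_rng(2)
    x = rng.exponential(size=500_000)
    samples = np.tile(x, (4, 1))             # X1 = X2 = X3 = X4 = X, as in rule (v)
    print(fourth_cumulant(samples))          # ~6, up to sampling error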
2.7.2 Significance of Cumulants
From (2.7.4, 2.7.5) we see that the first two cumulants are the means ⟨X_i⟩ and covariances ⟨X_i, X_j⟩. Higher-order cumulants contain information of decreasing significance, unlike higher-order moments. We cannot set all moments higher than a certain order equal to zero, since ⟨X^{2n}⟩ ≥ ⟨X^n⟩², and thus all moments contain information about lower moments.
For cumulants, however, we can consistently set
⟨⟨X⟩⟩ = a,
⟨⟨X²⟩⟩ = σ²,
⟨⟨X^m⟩⟩ = 0  (m > 2),
and we can easily deduce, by using the inversion formula for the characteristic function, that
p(x) = (2πσ²)^{−1/2} exp[−(x − a)²/(2σ²)],  (2.7.12)
that is, a Gaussian probability distribution. It does not, however, seem possible to go further than this intuitive justification. Indeed, the theorem of Marcinkiewicz [2.8, 2.9] shows that the cumulant generating function cannot be a polynomial of degree greater than 2; that is, either all but the first two cumulants vanish, or there are an infinite number of nonvanishing cumulants. The greatest significance of cumulants lies in the definition of the correlation functions of different variables in terms of them; this leads further to important approximation methods.
2.8 Gaussian and Poissonian Probability Distributions
2.8.1 The Gaussian Distribution
By far the most important probability distribution is the Gaussian, or normal, distribution. Here we collect together the most important facts about it.
If X is a vector of n Gaussian random variables, the corresponding multivariate probability density function can be written
p(x) = [(2π)^n det σ]^{−1/2} exp[−½(x − x̄)ᵀ σ^{−1} (x − x̄)],  (2.8.1)
so that
⟨X⟩ = ∫ dx x p(x) = x̄,  (2.8.2)
⟨(X − x̄)(X − x̄)ᵀ⟩ = σ,  (2.8.3)
and the characteristic function is
φ(s) = ⟨exp(is·X)⟩ = exp(ix̄·s − ½ sᵀ σ s).  (2.8.4)
This particularly simple characteristic function implies that all cumulants of order higher than 2 vanish, and hence means that all moments of order higher than 2 are expressible in terms of those of orders 1 and 2. The relationship (2.8.3) means that σ is the covariance matrix (as defined in Sect. 2.5.1), i.e., the matrix whose elements are the second-order correlation functions. Of course, σ is symmetric.
The precise relationship between the higher moments and the covariance matrix can be written down straightforwardly by using the relationship between the moments and the characteristic function (Sect. 2.6). The formula is only simple if x̄ = 0, in which case the odd moments vanish and the even moments satisfy
⟨X_i X_j X_k X_l …⟩ = [(2N)!/(N! 2^N)] {σ_{ij} σ_{kl} …}_sym,  (2.8.5)
where the subscript "sym" means the symmetrised form of the product of σ's, and 2N is the order of the moment. For example,
⟨X_i X_j X_k X_l⟩ = [4!/(2! 2²)] {σ_{ij}σ_{kl}}_sym = σ_{ij}σ_{kl} + σ_{ik}σ_{jl} + σ_{il}σ_{jk},  (2.8.6)
⟨X1⁴⟩ = 3σ11².  (2.8.7)
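A quick sampling check of (2.8.6) is given below; the 3×3 covariance matrix is an arbitrary illustrative choice.

    import numpy as np

    rng = np.random.default_rng(3)
    sigma = np.array([[2.0, 0.5, 0.3],
                      [0.5, 1.0, 0.2],
                      [0.3, 0.2, 1.5]])
    x = rng.multivariate_normal(np.zeros(3), sigma, size=1_000_000)
    i, j, k, l = 0, 1, 2, 0                  # indices may repeat, as noted above
    lhs = np.mean(x[:, i] * x[:, j] * x[:, k] * x[:, l])
    rhs = (sigma[i, j] * sigma[k, l] + sigma[i, k] * sigma[j, l]
           + sigma[i, l] * sigma[j, k])      # the three pairings of (2.8.6)
    print(lhs, rhs)                          # agree to sampling accuracy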
2.8.2 Central Limit Theorem
The Gaussian distribution is important for a variety of reasons. Many variables are, in practice, empirically well approximated by Gaussians, and the reason for this arises from the central limit theorem, which, roughly speaking, asserts that a random variable composed of the sum of many parts, each independent but arbitrarily distributed, is Gaussian. More precisely, let X1, X2, X3, …, Xn be independent random variables such that
⟨X_i⟩ = 0,  var[X_i] = b_i²,  (2.8.8)
and let the distribution function of X_i be p_i(x_i). Define
S_n = Σ_{i=1}^{n} X_i  (2.8.9)
and
σ_n² ≡ var[S_n] = Σ_{i=1}^{n} b_i².  (2.8.10)
We require further the fulfilment of the Lindeberg condition:
lim_{n→∞} (1/σ_n²) Σ_{i=1}^{n} ∫_{|x|>tσ_n} dx x² p_i(x) = 0  (2.8.11)
for any fixed t > 0. Then, under these conditions, the distribution of the normalised sums S_n/σ_n tends to the Gaussian with zero mean and unit variance.
The proof of the theorem can be found in [2.1]. It is worthwhile commenting on the hypotheses, however. We first note that the summands X_i are required to be independent. This condition is not absolutely necessary; for example, choose
X_i = Σ_{j=i}^{i+r} Y_j,  (2.8.12)
where the Y_j are independent. Since the sum of the X_i can be rewritten as a sum of Y_j (with certain finite coefficients), the theorem is still true.
Roughly speaking, as long as the correlation between X_i and X_j goes to zero sufficiently rapidly as |i − j| → ∞, a central limit theorem will be expected. The Lindeberg condition (2.8.11) is not an obviously understandable condition, but it is the weakest condition which expresses the requirement that the probability for |X_i| to be large is very small. For example, if all the b_i are finite and greater than some constant C > 0, it is clear that σ_n² diverges as n → ∞. The sum of integrals in (2.8.11) is the sum of the contributions to the variances from all |X_i| > tσ_n, and it is clear that as n → ∞, each contribution goes to zero. The Lindeberg condition requires the sum of all the contributions not to diverge as fast as σ_n². In practice, it is a rather weak requirement, satisfied if |X_i| < C for all X_i, or if the p_i(x) go to zero sufficiently rapidly as x → ±∞.
An exception is
p_i(x) = a_i / [π(x² + a_i²)],  (2.8.13)
the Cauchy, or Lorentzian, distribution. The variance of this distribution is infinite and, in fact, the sum of all the X_i has a distribution of the same form as (2.8.13), with a_i replaced by Σ_i a_i. Obviously, the Lindeberg condition is not satisfied.
A related condition, also called the Lindeberg condition, will arise in Sect. 3.3.1, where we discuss the replacement of a discrete process by one with continuous steps.
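Both the theorem and the Cauchy exception are easy to see numerically; the parameters below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 1000
    # bounded summands: normalised sums behave like a unit Gaussian
    sums = rng.uniform(-1, 1, size=(100_000, n)).sum(axis=1)
    s = sums / np.sqrt(n / 3.0)              # var of uniform(-1, 1) is 1/3
    print(s.std(), np.mean(np.abs(s) < 1.96))    # ~1.0 and ~0.95

    # Cauchy summands: the mean of n samples is again standard Cauchy
    means = rng.standard_cauchy(size=(100_000, n)).mean(axis=1)
    print(np.median(np.abs(means)))          # ~1.0, same width as one sample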
2.8.3 The Poisson Distribution
A distribution which plays a central role in the study of random variables which take on positive integer values is the Poisson distribution. If X is the relevant variable, the Poisson distribution is defined by
P(X = x) = e^{−α} α^x / x!,  (2.8.14)
and clearly the factorial moments, defined by
⟨X^r⟩_f ≡ ⟨X(X − 1)(X − 2) … (X − r + 1)⟩,  (2.8.15)
are given by
⟨X^r⟩_f = α^r.  (2.8.16)
For variables whose range consists of the nonnegative integers, we can very naturally define the generating function
G(s) = ⟨s^X⟩ = Σ_{x=0}^{∞} s^x P(x),  (2.8.17)
which is related to the characteristic function by
G(s) = φ(−i log s).  (2.8.18)
The generating function has the useful property that
⟨X^r⟩_f = [(d/ds)^r G(s)]_{s=1}.  (2.8.19)
For the Poisson distribution we have
G(s) = Σ_{x=0}^{∞} e^{−α} (αs)^x / x! = exp[α(s − 1)].  (2.8.20)
We may also define the factorial cumulant generating function g(s) by
g(s) = log G(s),  (2.8.21)
and the factorial cumulants ⟨⟨X^r⟩⟩_f by
g(s) = Σ_{r=1}^{∞} ⟨⟨X^r⟩⟩_f (s − 1)^r / r!.  (2.8.22)
We see that the Poisson distribution has all but the first factorial cumulant zero.
The Poisson distribution arises naturally in very many contexts; for example, we have already met it in Sect. 1.5.1 as the solution of a simple master equation. It plays a similar central role in the study of random variables which take on integer values to that occupied by the Gaussian distribution in the study of variables with a continuous range. However, the only simple multivariate generalisation of the Poisson is a product of Poissons, i.e., of the form
P(x1, x2, …, xn) = ∏_i e^{−α_i} α_i^{x_i} / x_i!.  (2.8.23)
There is no natural concept of a correlated multipoissonian distribution, similar to that of a correlated multivariate Gaussian distribution.
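A sampling check of (2.8.16), with an arbitrary value of α:

    import numpy as np

    rng = np.random.default_rng(5)
    alpha = 3.0
    x = rng.poisson(alpha, size=1_000_000)
    for r in range(1, 5):
        falling = np.ones_like(x, dtype=float)
        for q in range(r):                   # X(X-1)...(X-r+1), as in (2.8.15)
            falling *= (x - q)
        print(r, falling.mean(), alpha**r)   # factorial moment ~ alpha**r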
2.9 Limits of Sequences of Random Variables
Much of computational work consists of determining approximations to random variables, in which the concept of the limit of a sequence of random variables naturally arises. However, there is no unique way of defining such a limit.
For, suppose we have a probability space Ω and a sequence of random variables X_n defined on Ω. Then, by the limit of the sequence as n → ∞,
X = lim_{n→∞} X_n,  (2.9.1)
we mean a random variable X which, in some sense, is approached by the sequence of random variables X_n. The various possibilities arise when one considers that the probability space Ω has elements ω which have a probability density p(ω). Then we can choose the following definitions.
2.9.1 Almost Certain Limit
X_n converges almost certainly to X if, for all ω except a set of probability zero,
X_n(ω) → X(ω).  (2.9.2)
Thus each realisation of X_n converges to X, and we write
ac-lim_{n→∞} X_n = X.  (2.9.3)
2.9.2 Mean Square Limit (Limit in the Mean)
Another possibility is to regard the X_n(ω) as functions of ω and look for the mean square deviation of X_n(ω) from X(ω). Thus, we say that X_n converges to X in the mean square if
lim_{n→∞} ∫ dω p(ω)[X_n(ω) − X(ω)]² ≡ lim_{n→∞} ⟨(X_n − X)²⟩ = 0.  (2.9.4)
This is the kind of limit which is well known in Hilbert space theory. We write
ms-lim_{n→∞} X_n = X.  (2.9.5)
2.9.3 Stochastic Limit, or Limit in Probability
We can consider the possibility that X_n(ω) approaches X because the probability of deviation from X approaches zero: precisely, this means that if, for any ε > 0,
lim_{n→∞} P(|X_n − X| > ε) = 0,  (2.9.6)
then the stochastic limit of X_n is X. In this case, we write
st-lim_{n→∞} X_n = X.  (2.9.7)
2.9.4 Limit in Distribution
An even weaker form of convergence occurs if, for any continuous bounded function f(x),
lim_{n→∞} ⟨f(X_n)⟩ = ⟨f(X)⟩.  (2.9.8)
In this case the convergence of the limit is said to be in distribution. In particular, using exp(ixs) for f(x), we find that the characteristic functions approach each other, and hence the probability density of X_n approaches that of X.
2.9.5 Relationship Between Limits
The following relations can be shown:
1) almost certain convergence ⇒ stochastic convergence;
2) convergence in mean square ⇒ stochastic convergence;
3) stochastic convergence ⇒ convergence in distribution.
All of these limits have uses in applications.

3. Markov Processes
3.1 Stochastic Processes
All of the examples given in Chap. 1 can be mathematically described as stochastic processes, by which we mean, in a loose sense, systems which evolve probabilistically in time; or, more precisely, systems in which a certain time-dependent random variable X(t) exists. We can measure values x1, x2, x3, … of X(t) at times t1, t2, t3, … and we assume that a set of joint probability densities exists,
p(x1, t1; x2, t2; x3, t3; …),  (3.1.1)
which describes the system completely.
In terms of these joint probability density functions, one can also define conditional probability densities:
p(x1, t1; x2, t2; … | y1, τ1; y2, τ2; …)
= p(x1, t1; x2, t2; …; y1, τ1; y2, τ2; …)/p(y1, τ1; y2, τ2; …).  (3.1.2)
These definitions are valid independently of the ordering of the times, although it is usual to consider only times which increase from right to left, i.e.,
t1 ≥ t2 ≥ … ≥ τ1 ≥ τ2 ≥ … .  (3.1.3)
The concept of an evolution equation leads us to consider the conditional probabilities as predictions of the future values of X(t) (i.e., x1, x2, … at times t1, t2, …), given the knowledge of the past (values y1, y2, … at times τ1, τ2, …).
3.1.1 Kinds of Stochastic Process
The concept of a general stochastic process is very loose. To define the process we need to know at least all possible joint probabilities of the kind in (3.1.1). If such knowledge does define the process, it is known as a separable stochastic process. All the processes considered in this book will be assumed to be separable.
a) Complete Independence: This is the simplest kind of stochastic process; it satisfies the property
p(x1, t1; x2, t2; x3, t3; …) = ∏_i p(x_i, t_i),  (3.1.4)
which means that the value of X at time t is completely independent of its values in the past (or future).
b) Bernoulli Trials: An even more special case occurs when the p(x_i, t_i) are independent of t_i, so that the same probability law governs the process at all times. We then have the Bernoulli trials, in which a probabilistic process is repeated at successive times.
c) Martingales: The conditional mean value of X(t), given that X(t0) = x0, is defined as
⟨X(t) | [x0, t0]⟩ ≡ ∫ dx x p(x, t | x0, t0).  (3.1.5)
In a martingale this has the simple property
⟨X(t) | [x0, t0]⟩ = x0.  (3.1.6)
The martingale property is a rather strong property, and is associated with many similar and related processes, such as local martingales, sub-martingales, super-martingales, etc., which have come to be extensively studied and used in the past 25 years. Protter [3.1] has written the definitive book on their use in stochastic processes.
d) Markov Processes: The next simplest idea is that of the Markov process, in which knowledge of only the present determines the future; most of this book is built around this concept.
3.2 Markov Process
The Markov assumption is formulated in terms of the conditional probabilities. We require that, if the times satisfy the ordering (3.1.3), the conditional probability is determined entirely by the knowledge of the most recent condition, i.e.,
p(x1, t1; x2, t2; … | y1, τ1; y2, τ2; …) = p(x1, t1; x2, t2; … | y1, τ1).  (3.2.1)
This is simply a more precise statement of the assumptions made by Einstein, Smoluchowski and others. It is, even by itself, extremely powerful. For it means that we can define everything in terms of the simple conditional probabilities p(x1, t1 | y1, τ1). For example, by definition of the conditional probability density, p(x1, t1; x2, t2 | y1, τ1) = p(x1, t1 | x2, t2; y1, τ1) p(x2, t2 | y1, τ1), and using the Markov assumption (3.2.1), we find
p(x1, t1; x2, t2 | y1, τ1) = p(x1, t1 | x2, t2) p(x2, t2 | y1, τ1),  (3.2.2)
and it is not difficult to see that an arbitrary joint probability can be expressed simply as
p(x1, t1; x2, t2; x3, t3; …; xn, tn)
= p(x1, t1 | x2, t2) p(x2, t2 | x3, t3) … p(x_{n−1}, t_{n−1} | xn, tn) p(xn, tn),  (3.2.3)
provided
t1 ≥ t2 ≥ t3 ≥ … ≥ t_{n−1} ≥ tn.  (3.2.4)
3.2.1 Consistency—the Chapman-Kolmogorov Equation
From Sect. 2.3.3 we require that summing over all mutually exclusive events of one kind in a joint probability eliminates that variable, i.e.,
Σ_B P(A ∩ B ∩ C …) = P(A ∩ C …),  (3.2.5)
and when this is applied to stochastic processes, we get two deceptively similar equations:
p(x1, t1) = ∫ dx2 p(x1, t1; x2, t2) = ∫ dx2 p(x1, t1 | x2, t2) p(x2, t2).  (3.2.6)
This equation is an identity, valid for all stochastic processes, and is the first in a hierarchy of equations, the second of which is
p(x1, t1 | x3, t3) = ∫ dx2 p(x1, t1; x2, t2 | x3, t3)
= ∫ dx2 p(x1, t1 | x2, t2; x3, t3) p(x2, t2 | x3, t3).  (3.2.7)
This equation is also always valid. We now introduce the Markov assumption. If t1 ≥ t2 ≥ t3, we can drop the t3 dependence in the doubly conditioned probability and write
p(x1, t1 | x3, t3) = ∫ dx2 p(x1, t1 | x2, t2) p(x2, t2 | x3, t3),  (3.2.8)
which is the Chapman-Kolmogorov equation.
What is the essential difference between (3.2.8) and (3.2.6)? The obvious answer is that (3.2.6) is for unconditioned probabilities, whereas (3.2.8) is for conditional probabilities. Equation (3.2.8) is a rather complex nonlinear functional equation relating all conditional probabilities p(x_i, t_i | x_j, t_j) to each other, whereas (3.2.6) simply constructs the one-time probabilities in the future t1 of t2, given the conditional probability p(x1, t1 | x2, t2).
The Chapman-Kolmogorov equation has many solutions. These are best understood by deriving the differential form, which is done in Sect. 3.4.1 under certain rather mild conditions.
3.2.2 Discrete State Spaces
In the case where we have a discrete variable, we will use the symbol N = (N1, N2, N3, …), where the N_i are random variables which take on integral values. Clearly, we now replace
∫ dx → Σ,  (3.2.9)
and we can write the Chapman-Kolmogorov equation for such a process as
P(n1, t1 | n3, t3) = Σ_{n2} P(n1, t1 | n2, t2) P(n2, t2 | n3, t3).  (3.2.10)
This is now a matrix multiplication, with possibly infinite matrices.
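For a finite state space, (3.2.10) can be verified directly. The sketch below is an arbitrary three-state example: the transition matrices T(t), with T(t)_{nm} = P(n, t | m, 0), are generated by a constant rate matrix A via T(t) = exp(At), and the matrix form of the Chapman-Kolmogorov equation is just the semigroup property.

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-1.0, 0.5, 0.2],
                  [ 0.7, -1.0, 0.8],
                  [ 0.3, 0.5, -1.0]])       # columns sum to zero: probability conserved
    T = lambda t: expm(A * t)               # transition matrix over a time interval t
    t1, t2, t3 = 2.0, 1.2, 0.0
    # Chapman-Kolmogorov: T(t1 - t3) = T(t1 - t2) T(t2 - t3)
    print(np.max(np.abs(T(t1 - t3) - T(t1 - t2) @ T(t2 - t3))))   # ~1e-15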
3.2.3 More General Measures
A more general formulation would assume a measure dμ(x) instead of dx, where a variety of choices can be made. For example, if μ(x) is a step function with steps at integral values of x, we recover the discrete state space form. Most mathematical works attempt to be as general as possible. For applications, such generality can lead to lack of clarity, so, where possible, we will favour a more specific notation.
3.3 Continuity in Stochastic Processes
Whether or not the random variable X(t) has a continuous range of possible values is a completely different question from whether the sample path of X(t) is a continuous function of t. For example, in a gas composed of molecules with velocities V(t), it is clear that all possible values of V(t) are in principle realisable, so that the range of V(t) is continuous. However, a model of collisions in a gas of hard spheres as occurring instantaneously is often considered, and in such a model the velocity before the collision, v1, will change instantaneously at the time of impact to another value v2, so the sample path of V(t) is not continuous. Nevertheless, in such a model, the position of a gas molecule X(t) would be expected to change continuously.
A major question now arises: do Markov processes with continuous sample paths actually exist in reality? Notice the combination of Markov and continuous. It is almost certainly the case that, in a classical picture (i.e., not quantum mechanical), all variables with a continuous range have continuous sample paths. Even the hard sphere gas mentioned above is an idealisation; more realistically, one should allow some potential to act which would continuously deflect the molecules during a collision. But it would also be the case that, if we observe on such a fine time scale, the process will probably not be Markovian. The immediate history of the whole system will almost certainly be required to predict even the probabilistic future. This is certainly borne out in all attempts to derive Markovian probabilistic equations from mechanics. Equations which are derived are rarely truly Markovian; rather, there is a certain characteristic memory time during which the previous history is important [3.2, 3.3].
This means that in the real world there is really no such thing as a Markov process; rather, there may be systems whose memory time is so small that, on the time scale on which we carry out observations, it is fair to regard them as being well approximated by a Markov process. But in this case, the question of whether the sample paths are actually continuous is not relevant. The sample paths of the approximating Markov process certainly need not be continuous. Even if collisions of molecules are not accurately modelled by hard spheres, during the time taken for a collision a finite change of velocity takes place, and this will appear in the approximating Markov process as a discrete step. On this time scale, even the position may change discontinuously, thus giving the picture of Brownian motion as modelled by Einstein.
In chemical reactions, for example, the time taken for an individual reaction to proceed to completion (roughly of the same order of magnitude as the collision time for molecules) provides yet another minimum time, since during this time, states exist which cannot be described in terms of individual molecules. Here, therefore, the very description of the state in terms of individual molecules requires a certain minimum time scale to be considered.

Fig. 3.1. Illustration of sample paths of the Cauchy process X(t) (dashed line) and Brownian motion W(t) (solid line)
However, Markov processes with continuous sample paths do exist mathematically and are useful in describing reality. The model of the gas mentioned above provides a useful example. The position of the molecule is indeed probably best modelled as changing discontinuously by discrete jumps. Compared to the distances travelled, however, these jumps are infinitesimal, and a continuous curve provides a good approximation to the sample path. On the other hand, the velocities can change by amounts which are of the same order of magnitude as the typical values attained in practice. The average velocity of a molecule in a gas is about 1000 m/s, and during a collision it can easily reverse its sign. The velocities simply cannot reach (with any significant probability) values for which the changes of velocity can be regarded as very small. Hence, there is no sense in a continuous path description of velocities in a gas.
3.3.1 Mathematical Definition of a Continuous Markov Process
For a Markov process, it can be shown [3.4] that, with probability one, the sample paths are continuous functions of t if, for any ε > 0, we have
lim_{Δt→0} (1/Δt) ∫_{|x−z|>ε} dx p(x, t + Δt | z, t) = 0,  (3.3.1)
uniformly in z, t and Δt.
This means that the probability for the final position x to be finitely different from z goes to zero faster than Δt, as Δt goes to zero. Equation (3.3.1) is sometimes called the Lindeberg condition.
Examples
i) Einstein's solution for his f(x, t) (Sect. 1.2.1) is really the conditional probability p(x, t | 0, 0). Following his method we would find
p(x, t + Δt | z, t) = (4πDΔt)^{−1/2} exp[−(x − z)²/(4DΔt)],  (3.3.2)
and it is easy to check that (3.3.1) is satisfied in this case. Thus, Brownian motion in Einstein's formulation has continuous sample paths.
ii) Cauchy Process: Suppose
p(x, t + Δt | z, t) = (Δt/π) / [(x − z)² + Δt²].  (3.3.3)
Then this does not satisfy (3.3.1), so the sample paths are discontinuous.
However, in both cases we have, as required for consistency,
lim_{Δt→0} p(x, t + Δt | z, t) = δ(x − z),  (3.3.4)
and it is easy to show that in both cases the Chapman-Kolmogorov equation is satisfied.
The difference between the two processes just described is illustrated in Fig. 3.1, in which simulations of both processes are given. The difference between the two is striking. Notice, however, that even the Brownian motion curve is extremely irregular, even though continuous; in fact, it is nowhere differentiable. The Cauchy process curve, however, is only piecewise continuous.
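Sample paths of the kind shown in Fig. 3.1 can be generated directly from the short-time transition probabilities (3.3.2) and (3.3.3); the step size and parameters in the sketch below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(6)
    dt, n, D = 1e-3, 5000, 0.5
    # Brownian path: Gaussian increments of variance 2*D*dt, per (3.3.2)
    brownian = np.cumsum(rng.normal(0.0, np.sqrt(2 * D * dt), size=n))
    # Cauchy path: Cauchy increments of scale dt, per (3.3.3); occasional huge jumps
    cauchy = np.cumsum(dt * rng.standard_cauchy(size=n))
    print(brownian[-1], cauchy[-1])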
3.4 Differential Chapman-Kolmogorov Equation
Under appropriate assumptions, the Chapman-Kolmogorov equation can be reduced to a differential equation. The assumptions made are closely connected with the continuity properties of the process under consideration. Because of the form of the continuity condition (3.3.1), one is led to consider a method of dividing the differentiability conditions into parts, one corresponding to continuous motion of a representative point and the other to discontinuous motion.
We require the following conditions for all ε > 0:
i) lim_{Δt→0} p(x, t + Δt | z, t)/Δt = W(x | z, t),  (3.4.1)
uniformly in x, z, and t for |x − z| ≥ ε;
ii) lim_{Δt→0} (1/Δt) ∫_{|x−z|<ε} dx (x_i − z_i) p(x, t + Δt | z, t) = A_i(z, t) + O(ε);  (3.4.2)
iii) lim_{Δt→0} (1/Δt) ∫_{|x−z|<ε} dx (x_i − z_i)(x_j − z_j) p(x, t + Δt | z, t) = B_{ij}(z, t) + O(ε);  (3.4.3)
the last two being uniform in z, ε, and t.
Notice that all higher-order coefficients of the form in (3.4.2, 3.4.3) must vanish. For example, consider the third-order quantity defined by
C_{ijk}(z, t) + O(ε) = lim_{Δt→0} (1/Δt) ∫_{|x−z|<ε} dx (x_i − z_i)(x_j − z_j)(x_k − z_k) p(x, t + Δt | z, t).  (3.4.4)
Since C_{ijk} is symmetric in i, j, k, consider
Σ_{i,j,k} α_i α_j α_k C_{ijk}(z, t) ≡ C(α, z, t),  (3.4.5)
so that
C_{ijk}(z, t) = (1/3!) ∂³C(α, z, t)/∂α_i ∂α_j ∂α_k.  (3.4.6)
Then,
|C(α, z, t) + O(ε)| ≤ lim_{Δt→0} (1/Δt) ∫_{|x−z|<ε} |α·(x − z)| [α·(x − z)]² p(x, t + Δt | z, t) dx
≤ |α| ε lim_{Δt→0} (1/Δt) ∫_{|x−z|<ε} [α·(x − z)]² p(x, t + Δt | z, t) dx
= |α| ε [α_i α_j B_{ij}(z, t) + O(ε)]
= O(ε),  (3.4.7)
so that C is zero. Similarly, we can show that all corresponding higher-order quantities also vanish.
According to the condition for continuity (3.3.1), the process can only have continuous paths if W(x | z, t) vanishes for all x ≠ z. Thus, this function must in some way describe discontinuous motion, while the quantities A_i and B_{ij} must be connected with continuous motion.
3.4.1 Derivation of the Differential Chapman-Kolmogorov Equation
We consider the time evolution of the expectation of a function f(z) which is twice continuously differentiable. Thus,
∂_t ∫ dx f(x) p(x, t | y, t′)
= lim_{Δt→0} (1/Δt) ∫ dx f(x) [p(x, t + Δt | y, t′) − p(x, t | y, t′)]  (3.4.8)
= lim_{Δt→0} (1/Δt) { ∫∫ dx dz f(x) p(x, t + Δt | z, t) p(z, t | y, t′) − ∫ dz f(z) p(z, t | y, t′) },  (3.4.9)
where we have used the Chapman-Kolmogorov equation in the positive term of (3.4.8) to produce the corresponding term in (3.4.9).
We now divide the integral over x into two regions, |x − z| ≥ ε and |x − z| < ε.
3.5.4 General Processes
In general, none of the quantities A(z, t), B(z, t) and W(x | z, t) need vanish, and in this case we obtain a process whose sample paths are as illustrated in Fig. 3.2a, i.e., a piecewise continuous path made up of pieces which correspond to a diffusion process with a nonzero drift, onto which is superimposed a fluctuating part.
It is also possible that A(z, t) is nonzero, but B(z, t) is zero, and here the sample paths are, as in Fig. 3.2b, composed of pieces of smooth curve [solutions of (3.5.19)] with discontinuities superimposed. This is very like the picture one would expect in a dilute gas, where the particles move freely between collisions, which cause an instantaneous change in momentum, though not in position.
3.6 Equations for Time Development in Initial Time—Backward Equations
We can derive, much more simply than in Sect. 3.4, some equations which give the time development with respect to the initial variables y, t′ of p(x, t | y, t′).
We consider
lim_{Δt′→0} (1/Δt′) [p(x, t | y, t′ + Δt′) − p(x, t | y, t′)]  (3.6.1)
= lim_{Δt′→0} (1/Δt′) { ∫ dy′ p(y′, t′ + Δt′ | y, t′) p(x, t | y, t′ + Δt′)
− ∫ dy′ p(y′, t′ + Δt′ | y, t′) p(x, t | y′, t′ + Δt′) }.  (3.6.2)
The second line follows by use of the Chapman-Kolmogorov equation in the second term, and by noting that the first term gives 1 × p(x, t | y, t′ + Δt′).
The assumptions that are now necessary are the existence of all relevant derivatives, and that p(x, t | y, t′) is continuous and bounded in x, y, and t′ for t′ in some interval below t. We may then write
(3.6.2) = lim_{Δt′→0} (1/Δt′) ∫ dy′ p(y′, t′ + Δt′ | y, t′) [p(x, t | y, t′) − p(x, t | y′, t′)].  (3.6.3)
We now proceed using techniques similar to those used in Sect. 3.4.1, and finally derive
∂p(x, t | y, t′)/∂t′ = −Σ_i A_i(y, t′) ∂p(x, t | y, t′)/∂y_i − ½ Σ_{i,j} B_{ij}(y, t′) ∂²p(x, t | y, t′)/∂y_i∂y_j
+ ∫ dz W(z | y, t′) [p(x, t | y, t′) − p(x, t | z, t′)].  (3.6.4)
This will be called the backward differential Chapman-Kolmogorov equation; in a natural sense, it is better defined than the corresponding forward equation (3.4.22). The appropriate final condition for it is
p(x, t | y, t) = δ(x − y),  (3.6.5)
expressing the obvious fact that if the particle is at y at time t, the probability density for finding it at x at the same time is a delta function.
The forward and backward equations are equivalent to each other. For, solutions of the forward equation subject to the initial condition (3.6.5), and any appropriate boundary conditions, yield solutions of the Chapman-Kolmogorov equation, and these in turn satisfy the backward equation. The main difference lies in which variables are held fixed: in the forward equation we hold y, t′ fixed, so that solutions exist for t ≥ t′ and p(x, t | y, t′) is considered as a function of x and t; in the backward equation, solutions exist for t′ ≤ t, and the same quantity is considered as a function of the initial variables y and t′.
Since they are equivalent, the forward and backward equations are both useful. The forward equation gives more directly the values of measurable quantities as a function of the observed time t and tends to be used more commonly in applications. The backward equation finds most of its applications in the study of first passage time or exit problems, in which we require the probability that a particle leaves a region in a given time.
3.7 Stationary and Homogeneous Markov Processes
In Sect. 1.5.3 we met the concept of a stationary process, which represents the stochastic motion of a system which has settled down to a steady state, and whose stochastic properties are independent of when they are measured. Stationarity can be defined in various degrees, but we shall reserve the term "stationary process" for a strict definition, namely: a stochastic process X(t) is stationary if X(t) and the process X(t + ε) have the same statistics for any ε. This is equivalent to saying that all joint probability densities satisfy time translation invariance, i.e.,
p(x1, t1; x2, t2; …; xn, tn) = p(x1, t1 + ε; x2, t2 + ε; …; xn, tn + ε),  (3.7.1)
and hence such probabilities are only functions of the time differences t_i − t_j. In particular, the one-time probability is independent of time and can simply be written
p_s(x1),  (3.7.2)
the two-time joint probability as
p_s(x1, t1 − t2; x2, 0),  (3.7.3)
and, finally, the conditional probability can also be written as
p_s(x1, t1 − t2 | x2, 0).  (3.7.4)
For a Markov process, since all joint probabilities can be written as products of the two-time conditional probability and the one-time probability, a necessary and sufficient condition for stationarity is the ability to write the one- and two-time probabilities in the forms given in (3.7.2-3.7.4).
3.7.1 Ergodic Properties
If we have a stationary process, it is reasonable to expect that average measurements could be constructed by taking values of the variable x at successive times and averaging various functions of these. This is effectively a belief that the law of large numbers (as explained in Sect. 2.5.2) applies to the variables defined by successive measurements in a stochastic process.
a) Ergodic Property of the Mean: Let us define the variable X̄(T) by
X̄(T) = (1/2T) ∫_{−T}^{T} dt x(t),  (3.7.5)
where x(t) is a stationary process, and consider the limit T → ∞. This represents a possible model of measurement of the mean by averaging over all times. Clearly,
⟨X̄(T)⟩ = ⟨x⟩_s.  (3.7.6)
We now calculate the variance of X̄(T). Thus,
⟨X̄(T)²⟩ = (1/4T²) ∫_{−T}^{T} ∫_{−T}^{T} dt1 dt2 ⟨x(t1)x(t2)⟩,  (3.7.7)
and if the process is stationary,
⟨x(t1)x(t2)⟩ = R(t1 − t2) + ⟨x⟩_s²,  (3.7.8)
where R is the two-time correlation function. Hence,
⟨X̄(T)²⟩ − ⟨x⟩_s² = (1/2T) ∫_{−2T}^{2T} dτ (1 − |τ|/2T) R(τ),  (3.7.9)
where the last form follows by changing variables to
τ = t1 − t2  (3.7.10)
and integrating.
The left-hand side is now the variance of X̄(T), and we will show that, under certain conditions, this vanishes as T → ∞. Most straightforwardly, all we require is that
lim_{T→∞} (1/2T) ∫_{−2T}^{2T} dτ (1 − |τ|/2T) R(τ) = 0,  (3.7.11)
which is a little obscure. However, it is clear that a sufficient condition for this limit to be zero is
∫_{−∞}^{∞} dτ |R(τ)| < ∞.  (3.7.12)
b) Ergodic Property of the Spectrum: We similarly find that the spectrum, given by the Fourier transform of the correlation function,
S(ω) = (1/2π) ∫_{−∞}^{∞} e^{−iωτ} R(τ) dτ,  (3.7.19)
as in Sect. 1.5.2, is also given by the time-averaging procedure
S(ω) = lim_{T→∞} (1/4πT) ⟨| ∫_{−T}^{T} dt e^{−iωt} x(t) |²⟩.  (3.7.20)
c) Ergodic Property of the Distribution Function: Finally, the practical method of measuring the distribution function is to consider an interval (x1, x2) and to measure x(t) repeatedly to determine whether it is in this range or not. This gives a measure of ∫_{x1}^{x2} dx p_s(x). Essentially, we are then measuring the time average value of the function χ(x) defined by
χ(x) = 1  (x1 < x < x2),
χ(x) = 0  otherwise.

3.7.2 Homogeneous Processes
Suppose the conditional probability of a Markov process is stationary, but an initial condition is imposed at a definite time t0; that is, we set
p(x, t) ≡ p(x, t | x0, t0),  (3.7.24)
p(x, t | x′, t′) = p_s(x, t | x′, t′),  (3.7.25)
and all other joint probabilities are obtained from these in the usual manner for a Markov process. Clearly, if (3.7.23) is satisfied, we find that, as t → ∞ or as t0 → −∞,
p(x, t) → p_s(x),  (3.7.26)
and all other probabilities become stationary, because the conditional probability is stationary. Such a process is known as a homogeneous process.
The physical interpretation is rather obvious. We have a stochastic system whose variable x is, by some external agency, fixed to have a value x0 at time t0. It then evolves back to a stationary system with the passage of time. This is how many stationary systems are created in practice.
From the point of view of the differential Chapman-Kolmogorov equation, we will find that the stationary distribution function p_s(x) is a solution of the stationary differential Chapman-Kolmogorov equation, which takes the form
0 = −Σ_i ∂/∂x_i [A_i(x) p_s(x)] + ½ Σ_{i,j} ∂²/∂x_i∂x_j [B_{ij}(x) p_s(x)]
+ ∫ dz [W(x | z) p_s(z) − W(z | x) p_s(x)],  (3.7.27)
where we have used the fact that the process is homogeneous to note that A, B and W, as defined in (3.4.1-3.4.3), are independent of t. This is an alternative definition of a homogeneous process.
3.7.3 Approach to a Stationary Process
A converse problem also exists. Suppose A, B and W are independent of time and p_s(x) satisfies (3.7.27). Under what conditions does a solution of the differential Chapman-Kolmogorov equation approach the stationary solution p_s(x)?
There does not appear to be a complete answer to this problem. However, we can give a reasonably good picture as follows. We define a Lyapunov functional K of any two solutions p1 and p2 of the differential Chapman-Kolmogorov equation by
K = ∫ dx p1(x, t) log[p1(x, t)/p2(x, t)],  (3.7.28)
and assume for the moment that neither p1 nor p2 is zero anywhere. We will now show that K is always positive and that dK/dt is always negative.
Firstly, noting that both p1(x, t) and p2(x, t) are normalised to one, we write
K[p1, p2] = ∫ dx {p1(x, t) log[p1(x, t)/p2(x, t)] + p2(x, t) − p1(x, t)},  (3.7.29)
and use the inequality, valid for all z > 0,
−log z + z − 1 ≥ 0,  (3.7.30)
with z = p2/p1, to show that K ≥ 0.
Let us now show that dK/dt ≤ 0. We can write (using an abbreviated notation)
dK/dt = ∫ dx { ṗ1 [log(p1/p2) + 1] − ṗ2 (p1/p2) },  (3.7.31)
and we now calculate, one by one, the contributions to dK/dt from the drift, diffusion, and jump terms in the differential Chapman-Kolmogorov equation.
The drift contribution is
(dK/dt)_drift = ∫ dx { −[log(p1/p2) + 1] Σ_i ∂(A_i p1)/∂x_i + (p1/p2) Σ_i ∂(A_i p2)/∂x_i },  (3.7.32)
which can be rearranged to give a pure surface term,
(dK/dt)_drift = −∫ dx Σ_i ∂/∂x_i [A_i p1 log(p1/p2)].  (3.7.33)
Similarly, we may calculate the diffusion contribution
(dK/dt)_diff = ∫ dx { −[log(p1/p2) + 1] ½ Σ_{i,j} ∂²(B_{ij}p1)/∂x_i∂x_j + (p1/p2) ½ Σ_{i,j} ∂²(B_{ij}p2)/∂x_i∂x_j },  (3.7.34)
and after some rearranging we may write
(dK/dt)_diff = −½ ∫ dx p1 Σ_{i,j} B_{ij} [∂ log(p1/p2)/∂x_i][∂ log(p1/p2)/∂x_j]
+ (surface terms of the form ½ ∫ dx Σ_{i,j} ∂²[B_{ij} p1 log(p1/p2)]/∂x_i∂x_j ),  (3.7.35)
in which the first term is nonpositive, since B is positive semidefinite.
Finally, we may calculate the jump contribution similarly:
(dK/dt)_jump = ∫∫ dx dx′ { [W(x | x′)p1(x′, t) − W(x′ | x)p1(x, t)] [log(p1(x, t)/p2(x, t)) + 1]
− [W(x | x′)p2(x′, t) − W(x′ | x)p2(x, t)] p1(x, t)/p2(x, t) },  (3.7.36)
and after some rearrangement,
(dK/dt)_jump = ∫∫ dx dx′ W(x | x′) p2(x′, t) [φ′ log(φ/φ′) − φ + φ′] ≤ 0,  (3.7.37)
where
φ = p1(x, t)/p2(x, t),  (3.7.38)
and φ′ is similarly defined in terms of x′. (The inequality follows from log z ≤ z − 1 with z = φ/φ′.)
We now consider the simplest case. Suppose a stationary solution p_s(x) exists which is nonzero everywhere, except at infinity, where it and its first derivative vanish. Then we may choose p2(x, t) = p_s(x). The contributions to dK/dt from (3.7.33) and the second term in (3.7.35) can then be integrated to give surface terms which vanish at infinity, so that we find
dK/dt ≤ 0.  (3.7.39)
3.8 Examples of Markov Processes
3.8.1 The Wiener Process
For the Wiener process, the autocorrelation function may be computed by writing
⟨W(t)W(s) | [W0, t0]⟩ = ⟨{W(t) − W(s)}W(s)⟩ + ⟨W(s)²⟩.  (3.8.24)
Using the independence of increments, the first average is zero (for t > s), and the second is given by (3.8.10), so that we have, in general,
⟨W(t)W(s) | [W0, t0]⟩ = min(t − t0, s − t0) + W0²,  (3.8.25)
which is correct for both t > s and t < s; here W0 is the initial value at time t0.
3.8.2 The Random Walk in One Dimension
A man moves along a line, taking, at random, steps to the left or to the right with equal probability. The steps are of length l, so that his position can take on only the values nl, where n is integral. We want to know the probability that he reaches a given point a distance nl from the origin after a given elapsed time.
The problem can be defined in two ways. The first, which is more traditional, is to allow the walker to take steps at times Nτ (N integral), at which times he must step either left or right, with equal probability. The second is to allow the walker to take steps left or right with a probability per unit time d, which means that the walker waits at each point for a variable time. The second method is describable by a master equation.
a) Continuous Time Random Walk: To do a master equation treatment of the problem, we consider that the transition probability per unit time is given by
W(n + 1 | n, t) = W(n − 1 | n, t) = d;  (3.8.26)
otherwise, W(m | n, t) = 0, so that, according to Sect. 3.5.1, the master equation for the man to be at position nl at time t, given that he started at n′l at time t′, is
∂_t P(n, t | n′, t′) = d[P(n + 1, t | n′, t′) + P(n − 1, t | n′, t′) − 2P(n, t | n′, t′)].  (3.8.27)
b) Discrete Time Random Walk: The more classical form of the random walk does not have the man make his jump to the left or right according to a master equation; rather, he jumps left or right with equal probability at the times Nτ, so that time is a discrete variable. In this case, we can write
P(n, (N + 1)τ | n′, N′τ) = ½[P(n + 1, Nτ | n′, N′τ) + P(n − 1, Nτ | n′, N′τ)].  (3.8.28)
If τ is small, we can view (3.8.27) and (3.8.28) as approximations to each other by writing
P(n, Nτ + Δt | n′, N′τ) ≈ P(n, Nτ | n′, N′τ) + Δt ∂_t P(n, t | n′, N′τ),  (3.8.29)
with t = Nτ, t′ = N′τ and d = (2τ)^{−1}, so that the transition probability per unit time in the master equation model corresponds to half the inverse waiting time τ of the discrete time model.
c) Solutions Using the Characteristic Function: Both systems can be easily solved by introducing the characteristic function
G(s, t) = ⟨e^{ins}⟩ = Σ_n P(n, t | n′, t′) e^{ins},  (3.8.30)
in which case the master equation gives
∂_t G(s, t) = d(e^{is} + e^{−is} − 2) G(s, t),  (3.8.31)
and the discrete time equation becomes
G(s, (N + 1)τ) = ½(e^{is} + e^{−is}) G(s, Nτ).  (3.8.32)
Assuming the man starts at the origin n′ = 0 at time t′ = 0, we find
G(s, 0) = 1  (3.8.33)
in both cases, so that the solution to (3.8.31) is
G1(s, t) = exp[(e^{is} + e^{−is} − 2)dt],  (3.8.34)
and that to (3.8.32) is
G2(s, Nτ) = [½(e^{is} + e^{−is})]^N.  (3.8.35)
The appropriate probability distributions can be obtained by expanding G1(s, t) and G2(s, Nτ) in powers of exp(is); we find
P1(n, t | 0, 0) = e^{−2dt} I_n(2dt),  (3.8.36)
where I_n is the modified Bessel function, and
P2(n, Nτ | 0, 0) = (½)^N N! / {[(N + n)/2]! [(N − n)/2]!}  (N + n even).  (3.8.37)
The discrete time distribution is also known as the Bernoulli distribution: it gives the probability of a total of (N + n)/2 heads in tossing an unbiased coin N times.
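The two solutions can be compared numerically; the sketch below uses illustrative parameters, with d = 1/(2τ) as above. Note that when N is even the discrete-time walk can only reach even n, so its probability at an allowed site is about twice the continuous-time value; the factor of 2 below accounts for this parity effect.

    import numpy as np
    from scipy.special import iv             # modified Bessel function I_n
    from scipy.stats import binom

    tau, N = 0.05, 200
    d, t = 1.0 / (2 * tau), 200 * 0.05       # matched parameters, t = N*tau
    for n in [0, 2, 10]:                     # n of the same parity as N
        p_cont = np.exp(-2 * d * t) * iv(n, 2 * d * t)   # (3.8.36)
        p_disc = binom.pmf((N + n) // 2, N, 0.5)         # (3.8.37)
        print(n, p_cont, p_disc / 2)         # close for small tau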
d) Continuous Space Limit: For both kinds of random walk, a limit of continuous space, that is, very many steps of very small size, gives the Wiener process. If we set the distance travelled as
x = nl,  (3.8.38)
the characteristic function of the distribution of x is
φ1(s, t) = ⟨e^{ixs}⟩ = G1(ls, t) = exp[(e^{ils} + e^{−ils} − 2)dt].  (3.8.39)
Then the limit of infinitesimally small steps l → 0 gives
φ1(s, t) → exp(−s²Dt),  (3.8.40)
where
D = lim_{l→0} (dl²).  (3.8.41)
This is the characteristic function of a Gaussian (Sect. 2.8.1) of the form
p(x, t | 0, 0) = (4πDt)^{−1/2} exp(−x²/4Dt),  (3.8.42)
and this is of course the distribution for the Wiener process (Sect. 3.8.1), or Brownian motion, as mentioned in Sect. 1.2. Thus, the Wiener process can be regarded as the limit of a continuous time random walk in the limit of infinitesimally small step size.
The limit
l, τ → 0, with D = ½ l²/τ held fixed,  (3.8.43)
of the discrete time random walk gives the same result. From this form, we see clearly the interpretation of D as the mean square distance travelled per unit time.
We can also see more directly that expanding the right-hand side of (3.8.27) as a function of x = nl up to second order in l gives
∂_t p(x, t | 0, 0) = dl² ∂_x² p(x, t | 0, 0).  (3.8.44)
The three processes are thus intimately connected with each other at two levels: under the limits considered, the stochastic equations approach each other, and under those same limits, the solutions to these equations approach each other. These limits are exactly those used by Einstein. Comparison with Sect. 1.2 shows
that he modelled Brownian motion by a discrete time and space random walk and, nevertheless, derived the Wiener process model by expanding the equations for the time development of the distribution function.
The limit results of this section are a slightly more rigorous version of Einstein's method. There are generalisations of these results to less specialised situations, and it is a fair statement that almost any jump process has some kind of limit which is a diffusion process. However, the precise limits are not always so simple, and there are limits in which certain jump processes become deterministic and are governed by Liouville's equation (Sect. 3.5.3) rather than the full Fokker-Planck equation. These results are presented in Sect. 11.2.
3.8.3 Poisson Process
We have already noted the Poisson process in Sect. 1.5.1. The process in which electrons arrive at an anode, or customers arrive at a shop, with a probability per unit time λ of arriving, is governed by the master equation for which
W(n + 1 | n, t) = λ;  (3.8.45)
otherwise,
W(m | n, t) = 0.  (3.8.46)
This master equation becomes
∂_t P(n, t | n′, t′) = λ[P(n − 1, t | n′, t′) − P(n, t | n′, t′)],  (3.8.47)
and by comparison with (3.8.27) it also represents a "one-sided" random walk, in which the walker steps to the right only, with probability per unit time equal to λ.
The characteristic function equation is similar to (3.8.31):
∂_t G(s, t) = λ[exp(is) − 1] G(s, t),  (3.8.48)
with the solution
G(s, t) = exp{λt[exp(is) − 1]}  (3.8.49)
for the initial condition that there are initially no customers (or electrons) at time t = 0, yielding
P(n, t | 0, 0) = e^{−λt}(λt)^n / n!.  (3.8.50)
a) Continuous Space Limit: In contrast to the random walk, if we set x = nl, the only limit that exists is l → 0, with
v = λl  (3.8.52)
held fixed, and the limiting characteristic function is
lim_{l→0} exp[λt(e^{ils} − 1)] = exp(ivts),  (3.8.53)
with the solution
p(x, t | 0, 0) = δ(x − vt).  (3.8.54)
We also see that in this limit the master equation (3.8.47) would become Liouville's equation, whose solution would be the deterministic motion we have just derived.
b) Asymptotic Approximation: We can do a slightly more refined analysis. We expand the characteristic function up to second order in s in the exponent and find
φ(s, t) = exp[t(ivs − s²D/2)],  (3.8.55)
where, as in the previous section,
D = λl².  (3.8.56)
This is the characteristic function of a Gaussian with variance Dt and mean vt, so that we now have
p(x, t | 0, 0) ≈ (2πDt)^{−1/2} exp[−(x − vt)²/(2Dt)].  (3.8.57)
It is also clear that this solution is the solution of
∂_t p(x, t | 0, 0) = −v ∂_x p(x, t | 0, 0) + ½D ∂_x² p(x, t | 0, 0),  (3.8.58)
which is obtained by expanding the master equation (3.8.47) to second order in l, by writing
λp(x − l, t | 0, 0) ≈ λp(x, t | 0, 0) − λl ∂_x p(x, t | 0, 0) + ½λl² ∂_x² p(x, t | 0, 0).  (3.8.59)
However, this is an approximation, or an expansion, and not a limit. The limit l → 0 gives Liouville's equation with the purely deterministic solution (3.8.54); effectively, the limit l → 0 with v well defined corresponds to D = 0. The kind of approximation just mentioned is a special case of van Kampen's system size expansion, which we treat fully in Sect. 11.2.3.
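A small numerical check of (3.8.50) and the Gaussian approximation (3.8.57), with arbitrary parameters:

    import numpy as np

    rng = np.random.default_rng(7)
    lam, t, trials = 4.0, 25.0, 200_000
    counts = rng.poisson(lam * t, size=trials)   # exact law (3.8.50)
    print(counts.mean(), counts.var())           # both ~ lam*t = 100
    # for large lam*t the normalised counts are close to a unit Gaussian,
    # as the approximation (3.8.57) (with l = 1, v = D = lam) suggests:
    z = (counts - lam * t) / np.sqrt(lam * t)
    print(np.mean(np.abs(z) < 1.96))             # ~0.95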
3.8.4 The Ornstein-Uhlenbeck Process
All the examples so far have had no stationary distribution; that is, as t → ∞, the distribution at any finite point approaches zero, and we see that, with probability one, the point moves to infinity.
If we add a linear drift term to the Wiener process, we have a Fokker-Planck equation of the form
∂_t p = ∂_x(kxp) + ½D ∂_x² p,  (3.8.60)
where by p we mean p(x, t | x0, 0). This is the Ornstein-Uhlenbeck process [3.6].
a) Characteristic Function Solution: The equation for the characteristic function
φ(s, t) = ∫ dx e^{isx} p(x, t | x0, 0)  (3.8.61)
is
∂_t φ + ks ∂_s φ = −½Ds²φ.  (3.8.62)
The method of characteristics can be used to solve this equation: namely, the subsidiary equations are
dt/1 = ds/(ks) = dφ/(−½Ds²φ),  (3.8.63)
and if
u(s, t, φ) = a,  (3.8.64)
v(s, t, φ) = b  (3.8.65)
are two independent integrals of these, the general solution can be put in the form v = g(u), with g an arbitrary function. The particular integrals are readily found by integrating the equation involving dt and ds, and that involving ds and dφ; they are
u(s, t, φ) = s exp(−kt),  (3.8.66)
v(s, t, φ) = φ exp(Ds²/4k).  (3.8.67)
Thus, the general solution is
φ(s, t) = exp(−Ds²/4k) g[s exp(−kt)].  (3.8.68)
The initial condition
p(x, 0 | x0, 0) = δ(x − x0)  (3.8.69)
requires
φ(s, 0) = exp(ix0s),  (3.8.70)
so that
g(s) = exp(Ds²/4k + ix0s),  (3.8.71)
and hence
φ(s, t) = exp{ix0 s e^{−kt} − (Ds²/4k)(1 − e^{−2kt})}.  (3.8.72)
From the form (2.8.4) of the Gaussian characteristic function, we see that this corresponds to a Gaussian with
⟨X(t)⟩ = x0 exp(−kt),  (3.8.73)
var[X(t)] = (D/2k)[1 − exp(−2kt)].  (3.8.74)
b) Stationary Solution: Clearly, as t → ∞, the mean and variance approach the limits 0 and D/2k, respectively, which gives a limiting stationary solution. This solution can also be obtained directly by requiring ∂_t p = 0, so that p satisfies the stationary Fokker-Planck equation
(d/dx)[kx p_s(x) + ½D dp_s(x)/dx] = 0,  (3.8.75)
and integrating once, we find
kx p_s(x) + ½D dp_s(x)/dx = 0.  (3.8.76)
(The requirement that p_s, together with its derivative, vanish at −∞ is necessary for normalisation.) Hence, we have
dp_s/p_s = −(2kx/D) dx,  (3.8.77)
so that
p_s(x) = (k/πD)^{1/2} exp(−kx²/D).  (3.8.78)
This is a Gaussian with mean 0 and variance D/2k, as expected from the time-dependent solution.
It is clear that a stationary solution can always be obtained for a one-variable system by this integration method, if such a stationary solution exists. If a stationary solution does not exist, this method gives an unnormalisable solution.
c) Time Correlation Functions: The time correlation function, analogous to that mentioned in connection with the Wiener process, can be calculated and is a measurable piece of data in most stochastic systems. However, we have no easy way of computing it other than by definition:
⟨X(t)X(s) | [x0, t0]⟩ = ∫∫ dx1 dx2 x1 x2 p(x1, t; x2, s | x0, t0)  (3.8.79)
= ∫∫ dx1 dx2 x1 x2 p(x1, t | x2, s) p(x2, s | x0, t0),  (3.8.80)
on the assumption that
t ≥ s ≥ t0.  (3.8.81)
The correlation function with a definite initial condition is not normally of as much interest as the stationary correlation function, which is obtained by allowing the system to approach the stationary distribution. It is achieved by putting the initial condition in the remote past, as pointed out in Sect. 3.7.2. Letting t0 → −∞, we find
lim_{t0→−∞} p(x2, s | x0, t0) = p_s(x2) = (k/πD)^{1/2} exp(−kx2²/D),  (3.8.82)
and by straightforward substitution and integration, and noting that the stationary mean is zero, we get
⟨X(t), X(s)⟩_s = ⟨X(t)X(s)⟩_s = (D/2k) exp(−k|t − s|).  (3.8.83)
This result demonstrates the general property of stationary processes: the correlation functions depend only on time differences. It is also a general result [3.7] that the process we have described in this section is the only stationary Gaussian Markov process in one real variable.
The results of this subsection are very easily obtained by the stochastic differential equation methods which will be developed in Chap. 4.
The Ornstein-Uhlenbeck process is a simple, explicitly representable process which has a stationary solution. In its stationary state, it is often used to model a realistic noise signal, in which X(t) and X(s) are only significantly correlated if
|t − s| ≲ 1/k ≡ τ_c.  (3.8.84)
More precisely, τ_c, known as the correlation time, can be defined for arbitrary processes X(s) by
τ_c = ∫_0^∞ dτ ⟨X(τ), X(0)⟩_s / var[X]_s,  (3.8.85)
a definition which is independent of the precise functional form of the correlation function.
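The results (3.8.73), (3.8.74) and (3.8.83) can be checked by simulation. The sketch below (parameters arbitrary) does not use the stochastic differential equation methods of Chap. 4; it exploits the fact, just derived, that the one-step conditional distribution is exactly Gaussian, with mean x e^{−k dt} and variance (D/2k)(1 − e^{−2k dt}).

    import numpy as np

    rng = np.random.default_rng(8)
    k, D, dt = 1.0, 2.0, 0.01
    n_steps, n_paths, lag = 2000, 20_000, 100
    x = np.full(n_paths, 3.0)                # all paths start at x0 = 3
    decay = np.exp(-k * dt)                  # exact conditional mean factor (3.8.73)
    step_sd = np.sqrt((D / (2 * k)) * (1 - np.exp(-2 * k * dt)))   # (3.8.74)
    x_lagged = None
    for step in range(n_steps):
        x = x * decay + step_sd * rng.normal(size=n_paths)
        if step == n_steps - 1 - lag:
            x_lagged = x.copy()              # snapshot a time lag*dt = 1.0 earlier
    print(x.mean(), x.var())                 # ~0 and ~D/(2k) = 1
    c = np.mean(x * x_lagged) - x.mean() * x_lagged.mean()
    print(c, (D / (2 * k)) * np.exp(-k * lag * dt))   # (3.8.83): ~ exp(-1)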
3.8.5 Random Telegraph Process
We consider a signal X(t) which can have either of two values, a and b, and switches from one to the other with certain probabilities per unit time. Thus, we have the master equation
∂_t P(a, t | x, t0) = −λP(a, t | x, t0) + μP(b, t | x, t0),
∂_t P(b, t | x, t0) = λP(a, t | x, t0) − μP(b, t | x, t0).  (3.8.86)
a) Time-Dependent Solutions: These can simply be found by noting that
P(a, t | x, t0) + P(b, t | x, t0) = 1,  (3.8.87)
and using the initial condition
P(x′, t0 | x, t0) = δ_{x′,x}.  (3.8.88)
A simple equation can then be derived for λP(a, t | x, t0) − μP(b, t | x, t0), whose solution is
λP(a, t | x, t0) − μP(b, t | x, t0) = exp[−(λ + μ)(t − t0)] (λδ_{a,x} − μδ_{b,x}).  (3.8.89)
The solution for the probabilities then takes the form
P(a, t | x, t0) = μ/(λ + μ) + exp[−(λ + μ)(t − t0)] (λδ_{a,x} − μδ_{b,x})/(λ + μ),
P(b, t | x, t0) = λ/(λ + μ) − exp[−(λ + μ)(t − t0)] (λδ_{a,x} − μδ_{b,x})/(λ + μ).  (3.8.90)
The mean of X(t) is straightforwardly computed:
⟨X(t) | [x, t0]⟩ = Σ_{x′} x′ P(x′, t | x, t0)
= (aμ + bλ)/(λ + μ) + (a − b) exp[−(λ + μ)(t − t0)] (λδ_{a,x} − μδ_{b,x})/(λ + μ).  (3.8.91)
The variance can also be computed, but it is a very messy expression.
b) Stationary Solutions: This process has a stationary solution, obtained by letting t0 → −∞:
P_s(a) = μ/(λ + μ),  P_s(b) = λ/(λ + μ),  (3.8.92)
which is obvious from the master equation.
The stationary mean and variance are
⟨X⟩_s = (aμ + bλ)/(λ + μ),  (3.8.93)
var[X]_s = (a − b)² λμ/(λ + μ)².  (3.8.94)
c) Stationary Correlation Functions: To compute the stationary time correlation function, let t > s and write
⟨X(t)X(s)⟩_s = Σ_{x,x′} x x′ P(x, t | x′, s) P_s(x′).  (3.8.95)
Now use (3.8.90-3.8.94) to obtain
⟨X(t)X(s)⟩_s = ⟨X⟩_s² + exp[−(λ + μ)(t − s)] (a − b)²λμ/(λ + μ)².  (3.8.96, 3.8.97)
Hence,
⟨X(t), X(s)⟩_s = ⟨X(t)X(s)⟩_s − ⟨X⟩_s² = (a − b)²λμ/(λ + μ)² exp[−(λ + μ)|t − s|].  (3.8.98)
Notice that this time correlation function is of exactly the same form as that of the Ornstein-Uhlenbeck process. Higher-order correlation functions are not the same, of course, but because of this simple correlation function and the simplicity of the two-state process, the random telegraph signal also finds wide application in model building.
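A direct simulation of the telegraph signal uses exponential waiting times: the rate of leaving state a is λ and of leaving b is μ (values below arbitrary). The long-run fraction of time spent in a can be checked against P_s(a) of (3.8.92).

    import numpy as np

    rng = np.random.default_rng(9)
    lam, mu, a, b = 2.0, 0.5, 1.0, -1.0
    t, t_end, state, time_in_a = 0.0, 10_000.0, a, 0.0
    while t < t_end:
        rate = lam if state == a else mu     # rate of leaving the current state
        wait = rng.exponential(1.0 / rate)
        if state == a:
            time_in_a += min(wait, t_end - t)
        t += wait
        state = b if state == a else a       # switch to the other state
    print(time_in_a / t_end, mu / (lam + mu))    # ~0.2 = P_s(a)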
4. The Ito Calculus and Stochastic Differential Equations

4.1 Motivation
In Sect. 1.2.2 we met for the first time the equation which is the prototype of what is now known as a Langevin equation, which can be described heuristically as an ordinary differential equation in which a rapidly and irregularly fluctuating random function of time [the term X(t) in Langevin's original equation] occurs. The simplicity of Langevin's derivation of Einstein's results is in itself sufficient motivation to attempt to put the concept of such an equation on a reasonably precise footing.
The simple-minded Langevin equation that turns up most often can be written in the form
dx/dt = a(x, t) + b(x, t)ξ(t),  (4.1.1)
where x is the variable of interest, a(x, t) and b(x, t) are certain known functions, and ξ(t) is the rapidly fluctuating random term. An idealised mathematical formulation of the concept of a "rapidly varying, highly irregular function" is that, for t ≠ t′, ξ(t) and ξ(t′) are statistically independent. We also require ⟨ξ(t)⟩ = 0, since any nonzero mean can be absorbed into the definition of a(x, t), and thus require that
⟨ξ(t)ξ(t′)⟩ = δ(t − t′),  (4.1.2)
which satisfies the requirement of no correlation at different times and, furthermore, has the rather pathological consequence that ξ(t) has infinite variance. From a realistic point of view, we know that no quantity can have such an infinite variance, but the concept of white noise, as an idealisation of a realistic fluctuating signal, does have some meaning, and has already been mentioned in Sect. 1.5.2 in connection with Johnson noise in electrical circuits. We have already met two sources which might be considered realistic versions of almost uncorrelated noise, namely, the Ornstein-Uhlenbeck process and the random telegraph signal. For both of these, the second-order correlation function can, up to a constant factor, be put in the form
⟨X(t), X(s)⟩ = (1/2τ) exp(−|t − s|/τ).  (4.1.3)
Now the essential difference between these two is that the sample paths of the random telegraph signal are discontinuous, while those of the Ornstein-Uhlenbeck process are not. If (4.1.1) is to be regarded as a real differential equation, in which ξ(t) is not white noise with a delta function correlation, but rather a noise with a finite correlation time, then the choice of a continuous function for ξ(t) seems essential to make this equation realistic: we do not expect dx/dt to change discontinuously. The limit as τ → 0 of the correlation function (4.1.3) is clearly the Dirac delta function, i.e.,
∫_{−∞}^{∞} ds (1/2τ) exp(−|t − s|/τ) = 1,  (4.1.4)
and, for t ≠ s,
lim_{τ→0} (1/2τ) exp(−|t − s|/τ) = 0.  (4.1.5)
This means that a possible model of the ξ(t) could be obtained by taking some kind of limit, as τ → 0, of the Ornstein-Uhlenbeck process. This would correspond, in the notation of Sect. 3.8.4, to the limit k → ∞ with D = k², so that the correlation function (D/2k) exp(−k|t − s|) approaches a delta function. This limit simply does not exist. Any such limit must clearly be taken after calculating measurable quantities. Such a procedure is possible, but too cumbersome to use as a calculational tool.
An alternative approach is called for. Since we write the differential equation (4.1.1), we must expect it to be integrable, and hence must expect that the integral
u(t) = ∫_0^t dt′ ξ(t′)  (4.1.6)
exists. Suppose we now demand the ordinary property of an integral, that u(t) is a continuous function of t. This implies that u(t) is a Markov process, since we can write, for t′ > t,
u(t′) = lim_{ε→0} [∫_0^{t−ε} ξ(s) ds + ∫_{t−ε}^{t′} ξ(s) ds]
= lim_{ε→0} [u(t − ε) + ∫_{t−ε}^{t′} ξ(s) ds],  (4.1.7, 4.1.8)
and, for any ε > 0, the ξ(s) in the second integral are independent of those in the first. Hence, by continuity, u(t) and u(t′) − u(t) are statistically independent and, further, u(t′) − u(t) is independent of u(t″) for all t″ < t. This means that u(t′) is fully determined (probabilistically) from the knowledge of the value of u(t), and not by any past values. Hence, u(t) is a Markov process.
Since the sample functions of u(t) are continuous, we must be able to describe u(t) by a Fokker-Planck equation. We can compute the drift and diffusion coefficients for this process by using the formulae of Sect. 3.5.2. We can write
⟨u(t + Δt) − u0 | [u0, t]⟩ = ∫_t^{t+Δt} ds ⟨ξ(s)⟩ = 0,  (4.1.9)
and
⟨[u(t + Δt) − u0]² | [u0, t]⟩ = ∫_t^{t+Δt} ∫_t^{t+Δt} ds ds′ ⟨ξ(s)ξ(s′)⟩  (4.1.10)
= ∫_t^{t+Δt} ∫_t^{t+Δt} ds ds′ δ(s − s′) = Δt.  (4.1.11)
This means that the drift and diffusion coefficients are
A(u0, t) = lim_{Δt→0} ⟨u(t + Δt) − u0 | [u0, t]⟩/Δt = 0,  (4.1.12)
B(u0, t) = lim_{Δt→0} ⟨[u(t + Δt) − u0]² | [u0, t]⟩/Δt = 1.  (4.1.13)
The corresponding Fokker-Planck equation is that of the Wiener process, and we can make the identification
u(t) = W(t).  (4.1.14)
Thus, we have the paradox that the integral of ξ(t) is W(t), which is itself not differentiable, as shown in Sect. 3.8.1. This means that, mathematically speaking, the Langevin equation (4.1.1) does not exist. However, the corresponding integral equation
x(t) − x(0) = ∫_0^t a[x(s), s] ds + ∫_0^t b[x(s), s] ξ(s) ds  (4.1.15)
can be interpreted consistently.
We make the replacement, which follows directly from the interpretation of the integral of ξ(t) as the Wiener process W(t), that
dW(t) ≡ W(t + dt) − W(t) = ξ(t) dt,  (4.1.16)
and thus write the second integral as
∫_0^t b[x(s), s] dW(s),  (4.1.17)
which is a kind of stochastic Stieltjes integral with respect to a sample function W(t). Such an integral can be defined, and we will carry this out in the next section.
Before doing so, it should be noted that the requirement that w(t) be continu
cus, while very natural, can be relaxed to yield a way of defini
fs stochastle differential equations. This has already been hinted atin the
ff shot noise in Sect 5.1, However, it does not seem to be nearly so useful and will
‘ot be treated in this book. The interested reader is referred to [4.1]
‘As a final point, we should note that one normally assumes that £(1) is Gau
sian, and satisfies the conditions (4.1.2) as well. The above did not require tl
{Gaussian nature follows in fet from the assumed continuity of u(t). Which ofthese
assumptions is made i, in a siict sense, a matte of tase. However, the continuity of
1) seems a much more natural assumption to make than the Gaussian nature of &(),
‘which involves in principle the determination of moments of arbitrarily high order
4.2 Stochastic Integration
4.2.1 Definition ofthe Stochastic Integral
Suppose Gt) is an arbitrary function of time and Wr) isthe Wiener process. We
define the stochastic integral Ge" \4W() asa kindof Riemann Sti neg80 4, The Mo Caleuls and Stochastic Dilfer Equations
HALAL
ig... Paitonng of the time interval used inthe defntion of stockasic negation
Namely, we divide the interval tt] into subintervals by means of pattoning
points (asin Fig. 4.1)
WS Knee Se a2
and define intermediate points such that
hi ens 422)
‘The stochastic integral {5 G(e)/¥ (is defined as iit ofthe partial sums
S42 S Gerona) Woot 429)
10s heuristically quit easy to se that. in general, the ietegral defined asthe limit
‘Of S, depends on the particular choice of intermediate point . For example, if we
take the choice of Gis) = Wir).
Sa) =( i Wert) - Wee.) (424)
= Eminent) ~ mints 425)
Eev-no. 426)
1 forexample, we choose fr all
Heat t(l-ainr 0.
“The proof is quite straightforward. For N= 0, let us define84 4, Theo Calculus and Stochastic Differential Equations
in ([56-s(aW2—a1))) 4228)
Jim (3G cow? — an? +3 26,6 a ~a1p (aw? a}
423)
‘The horizontal braces indicate factors which are statistically independent of each
‘ther because ofthe properties of the Wiener process, and because the Gare values
‘of a nonanticipating function which are independent ofall AW; for j> i.
Using this independence, we can factorise the means, and also using
(AW) = an,
fi) (AW? - An)?
we find
(from Gaussian nature of AW),
42 Stochastic megrtion 85
and although G(1) is nonanticipating, this is not suficent to guarantee the inde-
pendence of AW, and Gj.» as thus defined.
iv) By similar methods one can prove that
im 3G, AW
(4.2.38)
fawyat awe
a}
and similarly for higher powers. The simplest way of characterising these reslls is
to say that dW() isan infinitesimal order of $ and that in calculating diferental,
infinitesimals of order higher than I are discarded.
4.2.7 Properties of the Ito Stochastic Integral
4) Existence: One can show that the Ito stochastic integral G(r}dW() exists
‘whenever the function Git’) is continuous and nonanticipating on the closed interval
1=24m[Zato.0")] 4220) ta)
Cece eeepc ory b) Imegration of Polynomials: We ean formally us theres of See. 4.26
mi( 2.6.14? 26.184) =0, aan aqweor =p + ance - wo = $(")weor awe 6239)
and since ee
mim EG. = fare). 42m) = niveyr amine A Daveor ar, (4240)
we have sothat
flamorau = farce 239) fwerawey= oiwot wages 5fmerar, aan
Comments
1) The proof {2 GintdWeP™™ = 0 for N > is similar and uses the explicit
expressions for the higher moments of « Gaussian given in Sect.2.8.1
ii) dWV( only occurs in integrals so that when we restrict ourselves to nonantic
pating functions, we can simply write
aw? i, (4.2.34)
aWi*™ 0, (N>0) (4235)
ili) The results are only valid for the lo integra, since we have used the fact that
AW, is independent of Gs. In the Stratonovich integral
AW, = WHE) Ml. 4236)
Gir = G(4ur+ 4-0). 4.237)
<6) Two Kinds of Integral: We note that foreach G() there are wo kinds of integrals,
namely,
found’ and jouw), 4242)
both of which occur in the previous equation. There is, in general, no connection
between these two kinds of integral
4) General Differentiation Rules: In forming differentials, as in (b) above, one
‘must keep all terms upto second order in dWV(7. This means that, for example,
dlexplW(s)}} = exp[W(n) + dW(1)] ~ explWi0)], (4.2.43)
= exptwWen)lawen + 4awer'), (4248)
= exptWonifaiven + $a] 4.245)86 4, Thelto Calculus and Stochastic Diferental Equations
Foran arbitrary function
12 ae « Lanne LOL
io. 12 Lane» Lawes 12 Liaw?
anwo.n SEL an? + Lawn 3 2Laweoy
*
ZL awe 4.246)
sd we use
amor a, 247)
draw) +0, [Set.426, comment (iv) (4.248)
(ay? 0, (4.249)
and all higher powers vanish, to atv at
af VPS),
apenas = (2 LBL) a
i41We0 ( ¢ “i a+ Lawn 14250)
©) Mean Value Formula: For nonantcpaing G0,
(jomane)=0 as)
Proof: Since G(t) is nonanticipatin
(26.1001) = E66) MAM) = 0. (4.252)
in the definition of the stochastic integr
‘We know from Sect.2.9.5 that operations of ma-lim andl ¢ ) may be interchanged
Hence, taking the limit of (4.2.52), we have the result.
‘This result is notte for Stratonovieh’s integral, since the value of G1 is chosen
inthe middle ofthe interval, and may be correlated with AW;.
eee
funetions,
(fowraweeyjmeyawer) 4253)
Proof: Notice that
(EGE H)10W,)
= (SGeaaacamy?) + EGeaths +6108. pawyaw) 4254)
42 Stochastic Imepraion 87
In the second term, AW; is independent of all other tems since j < i. and G and
‘Hare nonanticipating. Hence, we may factorise out the term (AW) = 0 so that this
term vanishes. Using
(AW?) = As,, (4255)
nd interchanging mean and limit operations, the result follows
4) Relation to Delta-Correlated White Noise: Formally, this is equivalent to the
idea that Langevin terms €) are deta corelated and uncorrelated with F(?) and Git),
For, rewriting
awn) > end, (42.56)
itis clear that if F(t) and G() ate nonanticipating, £1 is independent of them, and
wwe get
jar Jaeicuyms ee’ =f faras Gwyn es).
= farGwne’y, (4257)
which implies
{E10 619) = 501-9) (4258)
[An important point of definition arises here, however. In integrals involving dla
funtion it frequently occurs inthe study of sochasti differential equations that
the argorent ofthe delta function i equal to either the upper or the lower limit of
the mcg thats, we ind integrals like
1 =farpnee-n), (4259)
b= Farsnse~). (4.260)
‘Various conventions can be made concerning the value of such integrals. We will
show that inthe present context, we must always make the iaterpretation
ha ft), 4261)
h=0, (4.2.62)
corresponding to counting all the weight of a delta function atthe lower limit of an
integral, and none ofthe weight atthe upper limit, To demonstrate this, note that
(Jowawery| FAs yawis))) = 0. (42.6%)
This follows, since the function defined by the integral inside the square bracket
{s, by Sect.4.2.5 comment (v), a nonanticipating function and hence the complete88 4, The to Caleuus and Stochasti Dilferemial Equations
integrand, {obtained by multiplying by G(r’) which is also nonantcipating} is itself
‘nonanticipating. Hence the average vanishes by the result of Sect. 4.2.7e.
[Now using the formulation in terms of the Langevin source £(), we can rewrite
(4.2.63) as
fae fds Gurynes nse — (42.68)
‘which corresponds to not counting the weight ofthe delta function atthe upper limit.
Consequently, the full weight must be counted atthe lower limit.
This propery is a direct consequence of the definition of the Ito integral as in
(4.2.10) in which the increment points “towards the future". Tha i, we can interpret
awe) = Wier dd) ~ WO. (42.65)
Inthe case of the Stratonovich integral, we get quite a different formula, which is by
‘no means as simple to prove asin the Ito case, but which amounts 0 choosing
he hf),
4see)
“his means that in both cases the delta function occurring atthe Hinit ofan inte
al has half its weight counted. This formula, although intuitively more satistying
than the Ito form, is more complieated to use especially in the perturbation theory
of stochastic differential equations, where the Ito method makes very many term
vanish,
{Stratonovich) (42.66)
4.3 Stochastic Differential Equations (SDE)
We canetnded in Sect. 4.1, that the most satisfactory interpretation ofthe Langevin
equation
Ho act) 4b x
B= ax. + a.0€0). aa)
Js a stochastic integrat equation
= HO) = fala £1 fee Bete). 432)
Unfortunately, the kind of stochastic integral tobe used isnot given by the reasoning
of Sect. 4.1. The lo integral is mathematically and technically the most satisfactory,
but it is not always the most natural choice physically, The Straonovich integral
is the natural choice for an interpretation which assumes £(r) isa real noise (not a
white noise) with finite corelation time, which is then allowed to ecome infnites-
mally small after calculating measurable quantities. Furthermore, a Stratonovich
interpretation enables us to use ordinary calculus, which is not possible for an Tio
interpretation,
443 Stochastic Difernial Equations (SDE) 89
nee pee arora
Fig. 42. lusaton ofthe Cauchy-Euler procedure for consiruting an approximate solution
ofthe stochasti iferenil equation dx) = ax). + BL. tldW(0)
From a mathematical point of view, the choice is made clear by the near impossi-
bility of carrying out proofs using the Strtonovich integral. We will therefore define
the Ito SDE, develop its equivalence with the Stratonovich SDE; and use ether form
depending on circumstances, The relationship between white noise stochastic differ
tential equations and the eal noise systems is explained in Sect. 8.
43.1 Ito Stochastic Differential Equation: Definition
‘A stochastic quantity x() obeys an Ko SDE written as
a(t) = alx(, fdr + HEx(,1dWO), (43.3)
if forall rand fo,
att) + fala) fae + fole?)s"awe!) aaa)
Before considering what conditions must be salisfied by the coethcients in (4.3.4),
itis wise to consider what one means by a solution of such an equation and what
‘uniqueness of solution would mean in this context. For this purpose, we can consider
a discretised version ofthe SDE obtained by taking a mesh of points (as illustrated
in Fig. 4.2) such that
ch encore Sar Stat 435)
and writing the equation as
wr = 314 aC NAN + Ba AW 436)
Here,
eats
ANS teats 437)
AW, = Wer) = WOH)904, The to Calculus and Stochasie Difeental Equations
18) Cauchy-Euler Construction ofthe Solution of an Ito SDE: We see from(4.3.6)
that an approximate procedure for solving the equation isto calculate x; from the
knowledge of x; by adding a deterministic term
ast. (438)
and a stochastic term
Dai. AAW; (439)
‘The stochastic term contains an element AW;, which isthe increment of the Wiener
process, but i statistically independent of 3 if
{i xs itself independent of al W(r) ~ W(t) fort > ta (hus, the initial conditions
if considered random, must be nonanticipating), and
ii) a(x.0) is a nonanticipating function of ¢ for any fixed x.
Constructing an approximate soltion iteratively by use of (4.2.6, we ee that xis
always independent of AW, for j > i
‘The solution is then formally constructed by leiting the mesh size goto zero, To
say that the solution is unigue means that fora given sample function ¥() of the
random Wiener process W(1), the particular solution ofthe equation which arises is
unique. To say thatthe solution exists means that with probability one, a soltion
exists for any choice of sample fonction W() ofthe Wiener process WC.
“This method of constructing a solutions called the Cauchy- Euler method, and can
bbe used to generate simulations. However, there are significantly better algorithms,
asis explained in Chap. 10
>) Existence and Uniqueness of Solutions of an Ito SDE: Existence and unique-
sess wll not be proved here, The interested reader will find proofs in {4.3}. The
conditions which are required for existence and usiquenes in a tne interval {7
ip Lipschitz condition: aK exists such that
lax.) = ay.) + 16C4. 1) = bys < Kix al, (43.10)
forall xand y, and all # inthe range 0.7.
ii) Growth condition: a K exists such that forall inthe range (4,7),
late, + Ibo. < 120+ [Pp au
Under these conditions there will be a unique nonanticipating ‘olution x(t) in the
range {to.T
Almost every stochastic differential equation encountered in practice satisfies the
Lipschitz condition since itis essentially a smoothness condition. However, the
{growth condition is often violated. This does not mean that no solution exist; rather,
‘it means the solution may “explode” to infinity, that i, the value of x can become
infinite ina finite time; in practice a finite random time. This phenomenon occurs in
‘ordinary differential equations, for example,
42 Stochastic Differential Equations (SDE) 91
[=har (43.12)
ete
asthe general solution with an inital condition x = x at = 0,
a= ars tidy? ain
I ais postive, this becomes infinite when xo = (a1)! but if a is negative, the
solution never explodes, Failing wo satisy the Lipschitz condition does not guarantee
the solution will explode, More precise stability results are required for one to be
certain ofthat [4.3].
4.32 Dependence on Initial Conditions and Parameters
Inexactly the same way as in the case of deterministic differemial equations, ifthe
functions which occur in a stochastic differential equation depend continuously on
parameters, then the solution normally depends continuously on that parameter. Sim-
italy, the solution depends continuously on the initial conditions. Let us formulate
this more precisely. Consider a one-variable equation
dx = aQd.x,Ddt + WA, )dHO, aay
‘with initial condition
to) = (A), anus)
where 2 isa parameter. Let the solution of (4.3.14) be x(1.1) Suppose
(to). 4316)
i) slime)
ii) ForeveryN>0
sim { sop, [lcd ald.s.01+ Bea, x.0)~Bda.x08]} = 0.
ve nt —
iil) Thee exists aK independent of such that
laa, x07 + 19 < AAC + be), 3.18)
Then,
slim {sup 0a. ~ to, ni} = 0 (63.19)
ti sup ba.) ~ x.) 4319
Fora proof see [4.1]
‘Comments
5) Recalling the definition of stochastic Timi, the interpretation ofthe Himit (4.3.19)
js that as > do, the probability that the maximum deviation over any finite
inerval[f, T] between x(2,1) and xd, 1) is nonzero, goes to zero.92 4, The to Calculus and Stochastic Diferental Equations
ii) Dependence on the initial condition is achieved by eting a andb be independent
of a
ii) The result will be very useful in justifying perturbation expansions
iv) Condition (ji) is written in the most satural form for the case thatthe functions
atx,t) and b(x,t) are not themselves stochastic. It often atises that a(x.) and
Dx.) are themselves stochastic (nonanticipating) functions. n this case, condi
tio (i) must be replaced by a probabilistic statement, Ii, in fact, sufficient to
replace fim by slim
43.3 Markov Property of the Solution of an Ito SDE
‘We now show that (1), the solution to the stochastic differential equation (4.3.4).
is a Markov Process. Heuristcally, the result is obvious, since with a given initial
condition x), the future time development is uniquely (stochastically) determined,
tha is, a(7) for ¢> fg is determined only by
') The particular sample path of W() fort >to;
ii) The valve of s(t).
Since x() is a nonantcipating function ofr, W(t) for # > ty is independent of xe) for
1 fg, Thus, the time development of x(t) for t > ta is independent of x(1) for t< tp
provided .%p) is known. Hence, 1(f) is a Markov process. For a precise proof see
(43.
43.4 Change of Variables: Ito's Formula
Consider an arbitrary function of x1): f[2(2)). What stochastic differential equation
does it obey? We use the resulls of Sect. 4.2.6 to expand d [x0] to second order in
awn:
Aflac) = flat + dat) ~ fl), (43.20)
SUatoldat + LPL da? + (4321)
=F Latnifalatn, ride + Lat), naw} + FECL Faw?
43.22)
where all other terms have been discarded since they are of higher order, Now use
dWin? = dt obvain
flat) = (atx, AF Lx] + FLU AF FLX] de + BEC ALF LC CO,
3.3.23)
This formula is known as Io's formula and shows that changing variables is not
given by ordinary calculus unless fls(7)] is merely linear in x().
{42 Stochastic Differential Equations (SDF) 98
Many Variables: In practice, Io's formula becomes very complicate and the eas
iest methodist simply use the multivariate form ofthe rle that dW() is an in
fntesimal of order 4. By similar methods to those used in Sect. 4.2.6, we can show
that forann dimensional Wiener process W().
AWA aWAD = 6d", (43.240)
aw? = 0, (> 0, (43.240)
awynd = 0, (43.240)
an eo) (43.248)
svhich imply that dW san infitesimal of order. Note however that 4.3.24)
inaconsequence othe independence of dW?) and dW (). To develop o's feral
for functions of an n dimensional vector xt) satisfying the stochastic differential
equation
dx = Acx pdt + Bex WO, 43.25)
‘we simply follow this procedure. The result is
pee) = [EAs fx +} $1B(x 08% x00]
+E Bylx NOs dW An), (4.3.26)|
4.35 Connection Between Fokker-Planck Equation and Stochastic Differential
Equation
a) Forward Fokker-Planck Equation: We now consider the time development of
an arbitrary f(a(). Using to's formula
daslsto _ (afi _ a,
eon. ( iz TAF).
= (ost. nh + $8L0.0°22f) 3.2m
However hs condos probability density xt) an
Sto = fdr fats to.
= fase, + ps. 0PARp pts tt) 43.2%
This is now of the same form as (3.4.16) Sect. 3.4.1. Under the same conditions as,
there, we integrate by parts and discard surface terms to obtain
Sasfisyayp = fdxflo| — delacx. op) + 4OLx pl}. (4.329)
and hence, since f(x) is arbitrary,94 4 Thelto Calewas and Stochastic Diferental Equations
4px 11 Xoo) = Balas, pC%tx0.0)] + $02B(%.1% pCa #19. fo). 4.3.30)
‘We have thus a complete equivalence toa diffusion process defined by a drift coef
cient a(x.) and a diffusion coeficient (x,
‘The results are precisely analogous to those of Sect. 3.5.2, in which it was shown
that the diffusion process could be locally approximated by an equation resembling.
an lo stochastic differentia} equation,
») Backward Fokker-Planck Equation—the Feynm:
function g(x,2) obeys the backward Fokker-Planck equation
Ag = ~atx,0.9 ~ 4614. 020. (43.30
with the final condition
Ma.T) = GU) (43.32)
Ifa() obeys the stochastic differential equation (4.3. then using Ho's rule (adapted
appropriately to account for explicit time dependence), the Function glx), t} obeys
the stochastie diferental equation
dtd. = [Bg + alae dyglat e+ 4bLAC. AF Boleto.t dr
+ BLx.t1d,glx(st1 dW, 433)
and using (4.3.31) this becomes
dglxtt), | = BLx(t), 14,914, 1 dW0). (4.3.6)
[Now integrate fom to T, and take the mean
[Link] abana = (Folarneyaatarnflamn) =0. 4338)
Let the initial condition of the stochastic differential equation fora) and 1” = 1 be
xO) = (4336,
‘where « is a noa-stochastic value, so that
dale) ) = 96.0) (aaa)
At the other end ofthe interval, use the final condition (4.3.32) 10 write
dalX7).T) = GUT at) = 2), (4.3.38)
‘where the notation on the ight hand side indicates the mean conditioned om the initial
condition (4.3.36),
Putting these two together the Feynman-Kac formula results:
(GL TMI() = 2) = 91.0 (43.39)
where g(x) isthe solution of the backward Fokker-Planck equation (4.3.31) with
initial condition (4.3.32).
‘This formula is essentially equivalent tothe fact that p(x.t1s9,t) obeys the back-
ward Fokker-Planck equation inthe arguments xo. as shown in Sect 3.6, since
GGIs¢T Isto) = x0) = fdxGCIPCR,T L010). (4.3.40)
443 ‘Stochastic Dtferemial Equations (SDE) 98
4.36 Multivariable Systems
‘In gencral, many variable systems of stochastic differential equations can be defines
for n variables by
dx = Ate dt + BE. NAWO), (asaly
where W(t) is an n variable Wiener process. as defined in Sect.38.1. The many
variable version of the reasoning used in Sect. 4.3.5 shows thatthe Fokker-Planck
equation forthe conditional probability density p(x. tl 30.0) =p is
ap = ~[Link]+ $ A9l(@x. 9B". hip) aan
Notice thatthe same Fokker-Planck equation arses from all matrices 8 such that
BBB? is the same, This means that we can obtain the same Fokker-Planck equation by
replacing B by BS where S is orthogonal, ic., SS" = 1. Notice that S may depend
onx(0.
“This can be seen directly from the stochastic differential equation. Suppose St) is
an orthogonal matrix with an arbitrary nonanticipating dependence on t. Then define
avin = Sinawe (4343)
Now the vector €V(1 isa linear combination of Gaussian variables dW() with cocf-
ficients S(t) which are independent of dW(7), since S(t) is nonanticipating. For any
fixed valve of Sit, the dV(°) are thus Gaussian and their correlation matrix is
LaVO) AVA) = ESAS pl (AWAD AWD)
= ESS Ode = by dt, (4am)
since S(0 is orthogonal. Hence, all the moments are independent of S() and are the
same as those of dW'(), $0 d¥() is itself Gaussian with the same corelation matrix.
as dW(0). Finally averages at different times factorise, for example, if > fin
Fllawansyinr"tawceysuer. (43.45)
we can factorise ot the averages of d'Vi() to various powers since dW, is inde-
pendent ofall other terms. Evaluating these we will find thatthe orthogonal nature
of St) gives after averaging over d1¥(t), simply
Zalawyor cate sule a> (43.46)
‘which similarly gives [dWOI"{aW(7)1". Hence, the dV(0) are also increments of
4a Wiener process, The orthogonal transformation simply mixes up different sample
paths of the process, without changing its stochastic nature.
Hence, instead of (4.3.41) we can write
(x, dr + Bex. NSTNSIDAWE, (aan
Ales dt + Bla, NSO. (4.348)
and since V(0) is itself simply a Wiener process, this equation is equivalent t096 4. The to Calculus and Stochasti Diferenil Equations
dx = Atx rar + Box, NSTDAWi, (4.3.49)
which has exacly the same Fokker-Planck equation (4.3.42).
‘We will return to some examples in which this identity is relevant in See. 4.5.5,
4.4 The Stratonovich Stochastic Integral
‘The Stratonovich stochastic integral is an alternative tothe Io definition, in whieh
Nto's formols. developed in Sect.4.3.4, is replaced by the ordinsry chain rule for
‘change of variables. This apparent advantage does not come without cost, since in
Siratonovich’s definition the independence of a non-anticipating integrand Git) and
the increment d¥7() in a stochastic integral no longer holds. This means tht inere
iment and the integrand are correlated, and therefore to give a full definition of the
Stratonavich integral requires some way of specifying wha this cerrelation is
This correlation is implicitly specified inthe situation of most interest, he ease in
“which the integrand is a function whose stochastic nature arses fram its dependence
‘on a variable (0 which obeys a stochastic differential equation. Since the aim is 10
recover the chain rule for change of variables in a stochastic differential equation,
sonable reteotion
44.1 Definition of the Stratonovich Stochastic Integral
Stratomovich [4.21 d
vurvand ry
ined a stochastic integral ofan integrand which isa function of
(5) fate 1aer im 8 GL (st + Mia))sfen}E Win) — WED)
an
“The Suratonovich imegal i clearly related to a mid-point choice of; inthe deti-
sition of stochastic ion as given in Sect, bul eleary is nor necessarily
‘equivalent to that definition. Rather, instead of evaluating x atthe midpoint $i)
the average of the values athe two time points s taken. Furhemrore its only the
dependence ons) thai averaged inthis way. and no the explicit dependence on 1
However, if G(s differentiable inthe integral canbe shown t be independent
ofthe patiular choice of value fort the range [ft
44.2 Stratonovich Stochastic Differential Equation
Iis possible to write a stochastic diferential equation (SDE) using Stratonovich's
integral
Wi = says faded,
1468) Fawerstat?r) 442)
44 The Stamonavich Stochastic Ingral
48) Change of Variables for the Stratonovich SDE: The definition ofthe Straton=
variables, This means, thal forthe Stratonovich integral, Ho's form
replaced by the simple calculus rule
(44)
in} = Fat fat. rll + ato, 114We0
‘This can be proves! quite simply (rom the definition (4.4.1). The essence ofthe proot
ean he explained by using the simple SDE.
(S)duin = Bian] dwen. aan
In discreised form, this can be written
tie) = 8 BLL ey + 89] Mien = WD. (445)
‘To ind the Stratonovich SDE for fLA(} we need only use the Taylor series expan-
sion a a faction about a migpoat i he form
cape 3 Emde!
Ln Seen
In expanding fC) we oly need to keep lerms up 10 second onde. so we drop all
tut the fst wo tems and write
jeeray 446)
FUsined = fa) + Fea + x1 — 5+ aan
= FL EGon + xd] BLS Csien + x0) Whos ~ Wh (448)
“This means thatthe Stratonovich SDE for fLa(A] is
(Spdflan} = fet] Banda, 449)
‘which is the ordinary caleulus rule, The extension to the general ease (4.4.3) is,
straighitorward,
'b) Equivalent Ho SDE: We shall show thatthe Stratonovieh SDE isin fac equiva
Tent to an appropriate Ho SDE. Let us assume that x7) isa solution of the lio SDE
x(t) = af. fdr + Lx.) dW). (4.4.10)
and deduce the and fora corresponding Stratonovich equation ofthe form (44.2)
In both cases, the solution (7) isthe same Function,
We first compute the connection between the Ito integral fdWW4r yiatr).} and
the Stratonovich integral (S) dW" Blxtr). 71
(5) fawer tate). = Eal§ ata + ar0))tr]AW0.0. ain
nA.) we wie
ses) + Atti). ain
and use the Ito SDE (4.4.10) to write
xt68 4, TelloCaeus an Stott DieretlEgptons
Ante) = aso. at tes + Bliss) ay
“Then, applying Ho's formula, we can write
[30 + athd)ea] = Batten + Sana.
1) + fle. tad $M]
+} bl) BE AVAWE). (44.14)
(For simplicy, we write) et, insted of fiat)] wherever posible). Pating
all thse back inthe original equation (4.4.10) and dropping as usual di? d-dWor,
and setting dW(0? = dr, we find
(5) f= Satta WEI) = Wee M4 HY BO VB BANG = tea)
Hence we derive
(5) Feber 1dWir) = feted). }dWO+ $f 6L). 118, BLd yar
(44.15)
‘This formula gives a connection between the Io and Stratonovich integrals of func-
Lions flat") 1]. in which x(7’)is the solution ofthe Ito SDE (4.4.2. Itdoes not give a
‘general connection between the Io ané Stratonovich integrals of arbitrary functions.
1 wo mow make the choice
(x,t) = atx.) XA bLN, A)
Bn = x0 4416)
we ee tha
theo SDE
dx=adi+ bdo). 517
is the same asthe Strafonovich SDE
(S)de=(a~ 4o0,8)dr + b4W0), 41
and conversely.
the Strtonovich SDE
(Syax=aarspawe, (44.19
is the same asthe Ko SDE
r+ 400,404 Bawey 4420
445 Some Examples and Solutions 99
‘¢) Many Variables: Ifa many variable To equation is
x dt + [Link], aan
ralonovich equation can be shown similarly 1 be given by
de
then the corresponding S
replacing
A = A $5 BOB,
B-8 (44.22)
1d) Fokker-Planck Equation: Corresponding to the Stratonovich SDE,
(S)de = Ax nd + Bir. nd WO) (4.4.23)
‘we can, by use of (4.4.22) ané the known correspondence (Sect. 4.3.6) between the
Ito stochastic differential equation and Fokker-Planck equation, show thatthe equiv
alent Fokker-Planck equation is
Op =~ Zalaiph + $F GABA yp. (4.4.24)
‘whieh is often knowin asthe “Stratonovich form” of the Fokker-Planck equation. Ia
contrast to the two forms of the stochastic differential equation, the two forms of
Fokker-Planck equation have a differen appearance but are (of course) interpreted
‘with the same rules—those of ordinary calculus. We will find later that the Stra
e Fokker-Planck equation does arise very naturally in certain
4.5 Some Examples and Solutions
4.5.1 Coefficients without x Dependence
The simple equation
dx = athdr+ KOdWe, as.)
‘with a(t) and b(0) noaranddom functions of ime, is solved simply by integrating
Mt) = so fattyaf + fe) de) 432)
Fete, x» can be either nonrandom inital condition or may be random, but must be
independent of (1) ~ Wht) for > fy: otherwise, x) is not nonanticipating
‘As constructed, s() is Gaussian, provided is either aonrandom or itself Gaus-1004, The ko Callus and Stochastic Ditfeweil Equations
Soe awe), (453)
is simply a linear combination of infinitesimal Gaussian variables. Further,
(xi) = (aa) + fat ydr 454)
(since the mean ofthe Ho integral vanishes) and
ain ~ (LA) ~ GND # CD. 269) as
= varlaol + ( Foie yaWer) {6(s.4M59)). 456)
= arb) + Fou’ yP ar 450
where we have wed the result 4.2.53) with, however
We), fet
. we), es.
“The process is thus completely determined.
4.5.2 Multiplicative Linear White Noise Process—Geometrie Browni
Motion
‘The equation
dx = cra) (45.10)
is known as a multiplicative white noise process because it is Hinear in x but the
‘noise term” Wr) mauliplies x. Is lsu cossmwnly know as geomerie Brownian
‘We can solve this exacly by using Ito's formula, Let us define a new variable by
(5.1)
dW(t)~ $e dt 45.12)
“This equation can now be directly inepated, so we obtain
wht) = wlan) + ef Wit) — Wro)] — $c° = to), (4.5.13)
and hence.
A) = xinhexp etW¥e0 ~ Wea} = $e2~ m0} ss)
49) Mean value: We can calculate the mean by using the formula for any Gaussian
variable with 2ero mean
4.5 Some Examples and Solutions 108
exo (423) 4515)
so at
(at) = anenp|J2U= = $20 1) = (x06) est
“This resulis so obvious rom deft, since
dein) = axn)= xed = 0 asin
1b) Autocorrelation Funetion: We can also calculate the autocorreation function
atria = Cnt) (exp [el # Wis) = 2Weml— Jee s—2)])
Cara?) exp [LEU + WC) = 2WeayP) — 45 2s
(ate) exp | $I = Aig + 2mings)= (r+ 6~ 20}
= (alto? expe? min(~ io. = 0) 145.18)
«© Stratonovich Interpretation: ‘The solution of this equation interpreted 0s a Stra-
tonovich equation can also be obtained, but ordinary calculus would then be vali.
‘Thus insted of (4.5.12) we would obain
Say = cay (45.19)
and hence,
st) = ntpoxpet WO) ~ Wee) (4520)
Inthis case
xo) = Gxonexn [$2 —0)} (3520
and
dsc) = Cato) exp {Jet +s 2+ 2 mint ~ fp. Tol} (45.22)
‘One sees that there isa clea difference between these two answers
Frequency
4.33 Complex Oscillator with No
“This isa simplification of a model due 10 Kubo [4.4] and isa slight generalisation of
the previous example for complex variables. We consider
# ci(ur Wran)e, 14523)
a
‘which formally represents a simple model of an oscillator with a mean frequency a
perturbed by a noise term £0).
Physically, this is best modelled by writing 8 Strtonovich equation
(de =ifods Vrawa))e,
which is equivalent tothe Ho equation (from Sect. 4.4)102 4, The fo Calelos and Stochastic Differential Equations
Fig. 43. Mostration ofthe decay of the mean arp
tude o complex oxcillatorof as «result of dephasing,
[Gar yas i Vp amen (4.525)
‘Taking the mean value, we see immediatly that
a
2 eo tiw- Ko (45.26)
with the damped oscillatory solution
(30) = expla ~ yketO}) 527)
‘We shall show fully in Seet.8.3, why the Stratonovich model is more appropriate
‘The most obvious way to see this isto note that £(0) would, in practice, be someshat
smoother than a white noise and ordinary calculus would apply, ass the case in the
Stratonovick interpretation
[Now in this cas, the correlation function obtained from solving the eriginal Stra
{onovich equation is
(OP Yexpltias = ye + 5) ~ 2y mint, (45:28)
Fo eiwith ere
+ netn) =0. (45.29)
However. the correlation function of physical interest isthe complex correlation
Mexpliads ~ 9) +i V21WE) ~ WoT)
(COP yexptivg® ~ 8) ~ylr + s— 2mints 9),
= (210) expiants — 9)— yr ~ sl (45.30)
‘Thus. the complex correlation function has a damping term which eises purely from
the noise. I may be thought of as a noise induced dephasing effect, whereby the
phases of an ensemble of initial states with identical phases diffue away from the
value wf arising from the deterministic motion, as illustrated in Fig. 4.3. The mean
of the ensemble consequently decays, aldhough the amplitude ir} of any mernber
‘of the ensemble is unchanged. For lage time differences, z(2) and 2s) become in-
dependent
45 Some Examples and Solutions 19%
[A realistic oscillator cannot be described by this model of a complex oscillatcr.
‘as discussed by van Kampen [4.5]. However the qualitative behaviour is very si
Jar, and this model may be regarded as a prototype model of oscillators with noisy
frequency,
‘Taking the FokkerPlanck equation given for the Omstein-Uenbeck process in
Seot.3.84, we can immediately write down the SDE using the result of Sec. 4.35
de = ~kedt + VDAWin), (45.31)
and solve this directly, Puting
vase’, (4332)
then
dy = (dxjd(e") + (dx)e" + xd(e")
= Lavar+ Daweotel drs [-ardrs WDawen] ky ot
45a)
‘We note that he fst product vanishes involving only a and dV Gin fh,
an be seen hat tis wil always happen if we simply mulily shy # deterministic
funetion of time). We get
dy= Ve away, us)
so that integrating and resubsttuing for y, we get
xi
we + VBS eH aWer) 143.35)
vty
IF the intial condition is deterministic or Gaussian distributed, then o() is eh
Gaussian, with mean and variance
(x) (oye, (4.5.36)
varlatn) = ([Ex0)~ (ope + VBj eS? awe)’) 45.37)
“Taking the inital condition to be nonanticipating, that is, independent of dW for
15-0, we can write using the result of Sect. 4.41
vari stn
aixle™ + fe a
= (varl(0)} = D/2K) e+ D/2K. (4538)
“These equations are the same as those obtained directly by solving the Fokker-Planck.
equation in Sect. 3.8.4. with the added generalisation of a nonantcipating random
initial condition. Added to the fact that the solution is a Gaussian variable. we also
have the correct conditional probability.1044, Theo Callus and Stochastic Differential Equations
The time correlation function can also be ealeulated directly nd is,
at.) = vary}! 4+ DE Fee™—Paweny fee-M awe),
= var Oe DP
tetra!
ar)
fratwo-2 ase)
Notice that if & > 0, as ¢.5 -» 00 with finite { ~ s, the correlation Funetion becomes.
Stationary and of the form deduced in Sect. 3.8.4
In actif we set the initial time at -s rather than 0, the solution 4.5.35) becomes
t= NB f otra), san
in which the corcelation function and the mean obviously asstene thei stationary
values. Since the process is Gaussian, this makes it stationary,
48.5 Conversion from Cartesian to Polar Coordinates,
‘A. model often used to describe an optical fcld is given by & pair of Ornstein-
Unlenbeck processes describing the real and imaginary components ofthe electri
field i.
MEW) = ~yE Md + ed (0, (Sata)
dE) = ~yEAdt + edWy), (4s.41b)
is of interest to convert to polar coordinates. We set
Exit) = alnreoseXty, (45.42)
auysin gt), (45.43)
and for simplicity, also define
w= (45.44)
so that
iO) + iC) = loglEs() + ECO) (assy
We then use the ho ealeuls to deve
UE) iE3) (dE + EP
Ei vies HE +B?
HEL +E) 4, sid) +idWoin] dW + iaWeen?
Brie Evie) EE
diy +i) =
45.46)
and noting dW(0) dW) = 0, and dW; (1? = JWa(1)? = dt, it can be seen that the
fast cerm vans
445 Some Examples and Solutions 108
iat) + i0)] = ~ydt + exxpl—ue ~ i@cNldW LC) + dW) (asany
‘We now take the real part, set at?) = expt?) and using the Ito ealeulus find
dao =( at + 5) de oa yeosein +a}
(4548
“Te imaginary pr yes
ta = E(-amnsina + ayes) as)
We now define
dW.A1) = dWy{1)cos dt) + dWoi0) sin de}, } (45.50)
AW) = -dY 0 sing) + WACO Cos
‘We note that his is an orthogonal transformation of the kind mentioned in Sect. 4.3.6,
so that we may take dW,(?) and dWa(?) as increments of independent Wiener pro-
wesses Wi) and Wel).
Hence. the stochastic differential equations for phase and amplitude ane
ein = oat (455i)
dat) = (~ you + sla + ed Wait (4551b)
ta) = (to 3a en
‘Comment. Using the rules given in Sect. 4.4 (i) itis possible fo convert both the
Cartesian equation (4.5.41a, 4.5.41b) and the polar equations (4.5.51a, 4.5.51) to
the Stratonovich form, and to find thet both are exactly the same as the lo form,
Nevertheless, a direct conversion using ordinary calculus is not possible. Doing so
wwe would get the same result until (4.5.47) where the term [e? 2a()] df would not be
found. This must he compensated by an extra term which arises frm the fact that th
‘Stratonovich increments d¥(0) are correlated with fr) and thus, dW,(0) and dW.)
‘cannot simply be defined by (45.49). We se the advantage of the Io method which
retains the statistical independence of dW(0) and variables evaluated at time
the equations in Polar form are not soluble, as the corresponding
Cartesian equations are, There isan advantage, however, in dealing with polar equa-
tions in the laser, whose equations are similar, but have an aed term proportional
toate} drin @SS1b).
45.6 Multivariate Ornstein-Ublenbeck Process
we define the process by the stochastic diferental equation
ax)
(A and B are constant matrices) for which the solution is easily obtained (as in
Sect. 4.5.4:
Ax(dt + BAW), (45521064, The te Calculus and Stochastic Differential Equations
219) = expl-AnxiO) + Jexpl-Au~ IBEW) (45.53)
The mean is
(210) = expt-ANK0). 43s)
The correlation function follows similarly
Ate). xT) = (lett) ~ Cee Mlats) ~ Gets)
expt-AratO). 20) e4ph-As)
+f exk-Att— 1B expl-AT st 455s)
‘The integral can be explicitly evaluated in certain special cases, and for particular
low-dimensional problems, its possible to simply multiply everything out term by
term. In the remtainder we set (x10), (0) = 0), corresponding to a deterministic
initial condition, and evaluate a few special eases
a) The Case AAT = ATA: In this case (for seal A) we can find a unitary matrix S
such that
sst
SAS* = SATS? = diag dy... 2s) (45.56)
or simplicity. assume ¢ > s, Then
tat.a(s) = S1GU.95 assy
where
BN
[Gt shy fexpl-Ait — s) expla a9). (4.558)
b) Stationary Variance: Ii has only eigenvalues with positive real par, a station-
ary solution exists of the form
ain f expr ryBawery (45.59)
We have of course
teln) <0. (45.60)
(a). 163)) = "TF expl-Ale— C188" expl-ATs~ Pha? sen
Let us define the stationary covariance mattis. by
[Link] 45.02)
‘This can be evaluated hy means of an algebraic equation thus:
45 Some Examples and Solutions INT
Aer brat
f Aexpl-A@— BB" expl-AT= 11d
4+ J expt-Atr— 19188 exph-aT AT at
td Aaa t 45060
= f Lrexpl-a— #188" expla — hdr 56%
CCarying out the integral, we find that the Tower limit vanishes by the assumed pos:
tivity ofthe eigenvalues oF A and hence only the upper limit remains. gv
Al = BBE (45068
as an algebraic equation forthe stationary covariance matrix.
© Stationary Variance for Two Dimensions: We note that ifA isa 2% 2 mates. it
ie equation
satisfies the charac
AP = (TeAA + (Det A) = 0, (45.65)
‘and from (45.60) and the fact that (4.5.65) implies exp(—A1) is a polynomial of
degree | in A. we must be able 1 write
/BBT + piABET + BBTAT) + yABBT AT 45.066)
Using (45.65), we find (4.5.68) i satisfied if
+ (TRAY (Det Ay = 0, (467)
2ptDetA) +1 =0, (45.68)
Beta o. (45.69)
rom which we have
= (DANA + [A = CTEANBBEA — CTA] er
(TE ANDet Ay
wary State: From the solution of 4.5.60)
Matrix in the Stati
4) Time Correlation
wwe see that if >;
(.00,4148)) =expl-Ate~ 9) f expl-Ats PBB expl-As— PA.
empl-Auasile, 1s 45.7ta
and similarly.
scenpl-AUs-n]. res casiby
“This depends only on 5 ~ 1, as expected of a stationary solution. Defining then
Gut 9) = (20,2119). asm©) Spectrum Matrix
rather simple, We define similarly to Seo. 1.5.2:
em
44 The lo Caeulus and Stochastic Ditferetial Equations
wwe see (remembering o = 0) that
Galt 9) = 1s =F
(45.73)
in Stationary State: ‘The spectrum marry jwens out to be
1
stay= Ef eG arn, (57)
ite °
= sgl {ewl-os arrears f expla +A ea 4575)
! fy +t ~ ia!
ENA tive eta iw) 4576)
Hence
(A+ iw)S KA" ~ io) 4637
and! using (4.5.64), we get
t
Ste) = 5214 + iw BBN AT — wy! (45.78)
£ Regression Theorem: The result (4.5.71a) is also known as @egression theorem
in that it states that te time development G7) is for T > O gover
‘of time development of the mean, as in (4.5.54). Iisa consequence of the Marko
ian liner nature ofthe problem, The tie derivative ofthe stanary corelation
by the same law
d a :
GlGlMde = FeadraTOar,
= ((rAn (ride + BaWes)h a2 (0) (45.19)
Since r > 0, the increment We) is uncoreated with 43 (0) this means that
a
FIG Un) = AG) (4.5.80)
Thus, computation of G.(r) requires the knowledge of Gy(0) = o and the time de-
velopment equation ofthe mean. This results similar to those of Sect. 3.7.4
45.7 The General Single Variable Linear Equation
8) Homogeneous Case: We consider firstly the homogeneous case
dx = (bind +a) dWenx, assy
and using the usual lo rules, write
445 Some Examples and Solutions 109
y= tog x. (15.82)
so that
ty 8 Ec nate gain Sarak, 458%)
sing and inverting (4582). wea
st) = oven} f r)— Satta! + foramen}. asap
= W090 (45.88)
which serves to define 0)
‘We note that using 4.5.15)
Ate = eesoor (exp fed drat +» feutvawer)y
= (ext xp [nf va + n= foo ar (45.86)
’) Inhomogeneous Case: Now consider
lit) vx + LAO + gtOHT AO. (4587)
and write
(45388)
st = atte.
‘with 0) as defined in (4.5.85) and a solution ofthe homogeneous &
‘Then we write
ation (4.581),
dz = dxigint! + xdlginy'] + dedloy] (4.5.89)
Noting that d{gin]-! = -dednto]? + [doy HOT? and using Ho rules, we find
d= = (late) — finginyldt + find wonnecy! (4.5.90)
‘which lire integrable, Hence, the solution is
11) = po) foe Mla) — Fear Lae + FPA) son
©) Moments and Autocorrelation: It is beter to derive equations for the moments
from (45.87) rather than caleulate moments and autocorrelation ditecty from the
solution (4.5.91).
or we have
ast) = nat" Mdatd) + nln Dato" EdstOP
= airy date) + Jno = Hcy PLAC + ataCOP de (45.92)110 4, Theo Caleulus and Stochastic Difeental ay
Renee
Levy = (xeyInbe9 + $rtn~ gto").
+ (ay [natn) + mtn DyfloMnT
+ da") ft = Df (45.93)
‘These equi fom a hearhy in which he ih equation inlet saa ons a
ihe pons te, andcanbelepat sean
488 Multivariable Linear Equations
2) Homogeneous Case: The cation
dni) = [Bindrs Soeoawso ec, 4594
where B().G,() are matrices. The equation i no, in general, soluble in closed form,
‘unles all the matrices (1) Ge) commute at all times with each other i.
GANG Wr) = GArIGAD,
BING.) = Gut'yBU (45.98)
BiNBE) = BBL)
In this ease, the solution is completely
have
alogous to the one variable case and we
xu
Bx0). (4596)
with
1 = emp fa $ ZG. asm
}) Inhomogeneous Case: We can reduce the inhomogeneous case to the homoge:
‘neous case in exactly the same way as in one dimension, Thus, we consider
dx) = (AC) + Bln], + SIFU +Ganslawen, (4.598)
and write
m= oy 'xe), (45.99)
where wis a matrix solution ofthe homogencous equ
to evaluate dfs"), For any matrix M we have MM!
onder, MdM™'] + dM"! + dMalM~) = 0,
Hence, d[M-"] = =[M + dMf)""dMt Mand again to second order
ar
jon (4.5.64). We frst have
|: $0, expanding to second
MUM Ms Md aM, (45.100)
45 Some Examples and Solutions
an thus, since i) satisfies the homogeneous equation,
ayer) = wor Ben + LEAN] de EGAnaHan} 45.100
ana again aking diferenials
ait = varia 3. Ganka + EReoamo) (5.102)
Hence.
x)
o{o0 + four IA) = LGM EMA + srcwmeny
: (45.103)
when the solution forthe
“This solution isnot very useful for practical purposes, even whe
homogeneous equation is known, because ofthe difficulty in evaluating means and
‘correlation functions.
4.5.9 Time-Dependent Ornstein-Ulenbeck Process
, hich is sob
‘This is a pacar ease ofthe provous gener iar eation whi
Iisa gentsaton othe mutvarte Ornstin-Unenbeck process (Set. 4.5.6 1
includ time-dependent parameters, namely
x(a) = Anetra + BNW) (45.106)
“This is clearly ofthe same Form as (4.5.98) withthe replacements
Ao =) 0:
BO, + AW. me
SRnawin + KNAW,
Gan, SC
‘The corresponding homogeneous equation is simply the deterministic equation
dx( = ~Aunxtna (45.106)
which is soluble provided A(DA() = A()A(A) and has the solution
x) = WO), 45.107)
with
(45.108)
wo = exp| = facet
Thus, applying (45.108)
at) =exp|-fArdf]ntoy+ Flexo[ = facadslfacryawiry. (65.109)—
124 The to Coteus and Stohaate Dilfer Equations
This very similar tothe solution of the time-independent Omstin-Uenbeck “1 i
cess, a derived in See, 4.5.6, equation (45.53. . ee ae
From this we have
Grip mexp[- faurrae|ixioy. (45.110)
ns) e0| ferrari, x09] fake at
+ faresp| - facnsfoerw ave|-fatons, cas.i1y
The time-dependent Ornstein-Uhlenbeck process will arise very naturally in connec
‘ion with the development of asymptotic methods in low-noise systems,
In the next two chapters, the theory of continuous Markov processes is developed
from the point of view of the corresponding Fokker-Planck equation, which gives
the lime evolution ofthe probability density function forthe system. This chapter is
devoted mainly to single variable systems, since there are a large number of exact
results for single variable systems, which makes the separate treatment of such syS-
‘ems appropriate, The next chapter deals with the more general multivariable aspests
‘of many ofthe same issues treated one-dimensionally inthis chapter.
‘The construction of appropriate boundary conditions is of fundamental impor
tance, and is carried out in Sect.5.1 in a form applicable to both one-variable and
‘many-variable systems, A corresponding treatment for the boundary conditions on
the backward Fokker Planck equation is given in Sect. 5.1.2. The remaining of the
chapter is devoted to a range of exact results, on stationary distribution functions,
properties of eigenfunctions, and exit problems, most of which can be explicitly
solved in the one variable case,
‘We have already met the Fokker-Planck equation in several contexts, starting from
Einstein's original derivation and uso ofthe diffusion equation (Sect. 1.2). again asa
particular case of the differential Chapman-Kolmogoroy equation (Sect, 35.2), and
Finally, in connection with stochastic differential equations (Sect. 4.3.5), There are
many techniques associated with the use of Fokker Planck equations which lead 10
results more directly than by direct use ofthe corresponding stochastic differential
‘equation; the reverse is also true. To obtain a full picture of the nature of difusion
processes, one must study both points of view.
‘The origin of the name “Fokker-Planck Equation” is from the work of Faker
(1914) [5., 5.2] and Planck (1917) [5.2] where the Former investigated Brownian
‘motion in a radiation field and the latter attempted to build a complete theory of
ed on it, Mathematically oriented works tend (0 use the term “Kol-
‘mogorov’s Equation” because of Kolmogorov's work in developing its rigorous basis
[5.3]. Yet others use the tem “Smoluchowski Equation” because of Smoluchowshi
‘original use ofthis equation, Without in any way assessing the merits of this termi
nology I shall use the term “Fokker-Planck equation” as that most commonly used
by the audience to whom this book is addressed1145, The Fokkee Planck Equation
5.1 Probability Current and Boundary Conditions
‘The PPE is second-order parabolic partial differential equation, and for solutions
‘we need an initial condition such as (5.2.5) and boundary conditions atthe end ofthe
Jnwerval inside which xis constrained. These take on a variety of forms.
Ics simpler to derive the boundary conditions in general, than to restrict consi
ion tothe one variable situation, We consider the forward equstion
a a
atest) = ~¥, EAdesple. + $Y 5 Ble.
inte) = ~¥ Bade peat) + $B Poe Byte) 1)
We note that this ean also be writen
aptzt)| y @ 12
EY oy Fen =0 12)
where we define the probability current
a
Adz. D2.) ~ FE Bile... 1a
dies
Equation (5.5) has the form of a focal conservation equation, and can be written in
an integral form as follows. Consider some region R with a boundary S and define
oe i
eS
PAE. asm sen. ne
Where mis the outward pointing normal to S. Thus (5.1.5) indicates thatthe total loss
‘of probability is given by the surface imegral of J over the boundary of R.
ss
a
Fig 8.1 Repions used to demonstrate tha! the probability current ithe Row of probability
‘We can show as well thatthe current J does have the somewhat stronger property,
that a surface integral over any surface S gives the net flow of protability across that
surlace, For consider two adjacent regions Ry and Ro, separated bya surface Sja. Let
'S; and S2 be the surfaces which, together with Siz, enclose respectively Ry, and Re
(see Fig. 5.1)
51. Probbility Curent and Boundary Conditions 115
“Then the net flow of probability can be computed by noting that we are dealing
here with a process with continuous sample paths, so that na sufficient! short time
[Ar the probability of erossing Six from Ry t Ry isthe joint probability of being in
Ry atime rand Ry, atime t+ Ai,
= fide f dy poss + Ayo) (5.16)
“The net flow of probability from Ry to Ry is obtained by subtracting from this the
probability of erossing inthe ceverse direction, and dividing by At i.
Bayh fe fairer aor) in
ja fysy.)=0 18)
since this isthe probability of being in Ry and R; simultaneously. Thus, we can write
sara (5.19)
AD) = fd f dy Oe post. — de Py,
and using the Fokker-Planck equation in the form (5.5)
D8 [EWE Ma tReds (6.1.10)
where Ji. Re) formed from
port: Rad) Ln)
dy pt
inthe same way as J(z,¢) is formed from plz, in (5.1.3) and Jy: Rais defined
similery, We now convert the integrals to surface integrals. The integral over Sz
vanishes, since it will ivolve p(x, 1:0), with x notin Rj or on its boundary (except
fora set of measure zero.) Similarly the integral over Sy vanishes, but those over Siz
donot, since here the integration is simply over pat of the boundaries of Ry and Re
‘Thus we find, the et flow from Rp t0 Ry is
f dS mE RIs0) + Det Ray (1.12)
s
and we finally conclude, since x belongs the union of Ry and thatthe net flow of
probability per unit ime from Rs to Ry
tig © pdx f dota. + arn. ~ pt t=.)
| wimp a 0 omy
f dsm Hen),N65. The Fokker:
ek Equation
5.1.1 Classification of Boundary Conditions
‘We can now consider the various kinds of boundary condition separately,
4) Reflecting Barrier: We can consider the situation where the particle cannot leave
‘a region R, hence there is zero net flow of probability across, the boundary of.
Thus we require
nS) =0, forzeS,
= normal 1S. (Lay
where Jz.) is given by (5.5.4)
Since the particle eannot cross, it must be reflected there, and hence the name
‘reflecting barrier for this condition.
b) Absorbing Barrier: Here, one assumes thal the moment the particle teaches, it
is removed trom the system, thus the barrier absorbs. Consequertly the probability
‘of being on the boundary is 210, i.
We=0, forzes. S15)
©) Boundary Conditions at a Discontinuity: It is possible for both the A; and Bi,
‘coefficients to be discontinuous ata surface S, but for there o be free motion across
'S. Consequently, the probability andthe normal component of the current must oth
be continuous across S,
n- Stas, = m- Feels (5.1.16)
Pals, = ples. 6.117)
‘where SS. a8 subseripis. mean the limits ofthe quantities from the left and tight
hhand sides of the surface.
‘The definition (5.1.3) of the curren
necessarily continuous a.
indicates tha the derivatives of p(z) are not
5.1.2 Boundary Condit
ms for the Backward Fokker-Planck Equation
‘We suppose that p(x. 1x1’) obeys the forward Fokker-Planck equation fora set of
2:1 and x’, and that the process is confined to a region R with toundary 8. Then,
ifs isa time between t and 7,
a oye i
= Frat) = 5 dy pari. np slx' 0), (5.1.18)
where we have used the Chapman-Kolmogorov equation, We take the derivative 0/9
inside the integral, use the forward Fokker-Planck equation forthe second factor and
the backward equation for the first factor. For brevity, lel us write
DY.) = [Link])
PS) = platy.)
Then,
6.119)
552 Fokker Planck Equation in One Dimension 117
5 5, Oe 1.20)
“fool 458 Be
and afer some mnpation
2 Apps ty |p2 coup) vt se (tay
ove 2 {anos 33 [pho peStl}
(5.1.22)
eafolane peg} ap pene ma,
ee orale
oper creer tr
‘on substituting p= On that equation, we get
2p 45.123)
= 3.450 22
Cachet 7
However if he boundary i absorbing. clearly
patlys)=0, for ye boundary 124)
since this merely states that the probability of X re-entering R from the boundary is,
220.
b) Reflecting Boundaries: Here the condition on the forward equation makes the
first integral vanish in (S.1-22). The final factor vanishes fr arbitrary p only if
8 (a(x. 9) = 0 (5.1.28)
Embargo
In one dimension this reduces to
5.1.26)
a
Gyre =
unless vanishes.
©) Other Boundaries: We shall not consider these this section. For further details
see [54]
5.2 Fokker-Planck Equation in One Dimension
none dimension, the Fokker-Planck equation (FPE) takes the simple form
ED Brace geno) Slow 0/000) 62
Th Seuis.24, 35 the Fokker-Plonck equation was shown to be valid fo the con
tional probability, that i, the choice118 5. The Fokker Planck Equation
Fa. plat toto), 622)
for any initial o,o, and withthe initia condition
PA fo fo) = OU ~ x4) (52.3)
However. using the definition for the one time probability
PLD) = fo pox 29,40) = fd pst to.t0)pl2aato)s 524)
‘we see that i is also vaid for pt.) with the initial condition
PLD, = Plato), (525)
hich is generally less singular than (5.2.3).
From the result of Sect. 4.3.5, we know thatthe stochastic process described by a
‘conditional probability satisfying the FPE js equivalent tothe to stochastic differen.
tial equation (SDE)
dat) = Alstn.t]dr + BGR AaWe), 626)
and that the two descriptions are to be regarded as complementary to each other. We
will see that perurbation theories based on the FPE are very diferent from those
based on the SDE and both have their uses.
52:1 Boundary Conditions in One Dimension
‘The genera formulation of boundary conkitions as given in Sect. 5.1.1 can be aug.
‘mented by some more specific results fr the one-dimensional ease,
4) Periodic Boundary Condition: We assume that the process takes place on an
interval [a bin which the two end points are identified with each ober. this occur,
for example, if the diffusion is on a circle). Then we impose boundary conditions
derived from those for a discontinuity, ie.
1: dig poe =
lim tx. (527)
We isp dea) = fim J.) 628)
Most frequently. periodic boundary conditions are imposed when the functions
‘A(s.1) and B4x.0) are periodic onthe same interval so that we have
Alb.1) = Ata.t),
Bb.) = Blan),
‘and this means that I and Il simply reduce to an equality of px.) and its derivatives
atthe points and b,
») Prescribed Boundaries: If the diffusion coeficient vanishes at a boundary, we
have a situation in which the kind of boundary may be automatically prescribed
Suppose the motion occurs only for x > a. If a Lipschitz condition is obeyed by
AU 1) and VEC T atx = a Sect. 4.3.16) and Btx. 1) is differentiable at x= a then
0, Bia,0) = 0. 52.10)
(5.29
52 Fokker Planck Equation in One Dimension 119
“The SDE then has solutions, and se may write
dalt) = Ato + VEE AWO)
In this rather special case, the situation is determined by the sign of AUx.1. Three
cases then occur, as follows,
(S21)
i) Exit boundary. In this case, we suppose
Ma.) <9. a
will certainly proceed out of region
so that if the particle reaches the point
to.x 0 ca
© 1 sign of A(a.t) is such a8 10
In this case i the particle reaches the point a, the sir
return it to x > a; thus a particle placed to the right of « can never leave the
region. However, a particle introduced at x = a will certainly enter the region
Hence the name, “entrance boundary”.
i) Natural boundary. Finally consider
(52.14)
Ata.)
veritcan be demon-
“The pail once itreaches = , wll ain hte, How
{rae ha cannot ever each hs point Ths sa ound fom which we em
neither abo nor at which we can odie any pies.
©) Feller’s Classification of Boundaries: Feller [5.4] showed that in general the
boundaries can be assigned to one of the foe types: regular. entrance. exit and natu
ral. His general eritria forthe classification ofthese boundaries are as follows.
Define
fix) = exp|-2 fdsatsy a1). 62.15)
2IBL ft}. 62.16)
So faisdds. (52.17)
an f Foods. 52.18)
Heres € (2,0) and is fixed. Denote by
52.19)
Foun).
the space of all functions integrable on the interval (x, 82)120.5. The FokkerPtanck Equation
Then the boundary at a can be classified as
Ee Regular: if f(x) ¢ 2([Link]) and glx) e Llaso)
Me Exit if gay € Lea.) and f(x) € ta. 10)
Mk: Entrance: it g(x) © 2a.) and Mat) € Zea, x9)
IV: Natural: all other eases
It can be seen trom the results of Sect.5.3 that for an exit boundary there is m0
ormalisable stationary solution of the Fokker-Planck equation, and that the mean
time to reach the boundary, (5.5.24, is finite, Similarly, ifthe boundary is exit, a
stationary solution can exist, but the mean time to reach the boundary is infinite, In
the case of a regular boundary, the mean time to reach the boundary is finite, but
& stationary solution with a reflecting boundary at a does exist. The case of natural
boundaries is harder to analyse. The reader is referred to [5.4] fer a more complete
description.
©) Boundaries at Infinity: All of the above kinds of boundary can occur at infin
ity, provided we can simultaneously guarantee the normalisation of the probability
which. it p(2) is reasonably well behaved, requires
° 5.2.20)
If Bxphx) is reasonably well behaved (i.e. does not oscillate infinitely rapidly as
x9 9),
lim 8,p0s.4) = 0.
7 (52.21)
50 that @ nonzero current at infinity will usually require either 4¢x,1) or Blx,4) to
‘become infinite there. Treatment of such cases is usually best carried out by changing
([Link] variable which i finite at x = oo,
Where there are boundaries at x = 40° and nonzero curtents at infinity are permit
ted, we have two possibilities Whieh do not allow fon loss of proba
DJs, =0. (5.2.22)
ii) Je409,) = S09.) (5.2.28)
These are the limits of reflecting and periodic boundary conditions, respectively,
5.3 Stationary Solutions for Homogeneous Fokker-Planck
‘Equations
We recall (Sect. 3.7.2) tha in a homogeneous process, the drift and diffusion coefi-
cients are time independent. In such a case, the equation satishied by the stationary
distribution is:
a 1@
enn 3S tape
a)
5.3 Stationery Solutions for Homogeneous Fokker-Planck Equations 121
which can also be written simply in terms of the curent (as defined in Sect. 5.1)
we 832)
ae
‘which clearly has the solution
JU) = constant (ay
‘Suppose the process takes place on an interval (a,b), Then we must have
Sta) = He) = 0) 8 J. 634)
and i one ofthe boundary conditions is reflecting, this means that both are reflecting,
and J :
the boundaries are not reflecting (5.3.4) requires them to be periodic. We then
use the boundary conditions given by (5.2.7) and (5.2.8).
2) Zero Current—Potential Solution:
Seting J = 0, we write (5.34) as
td
Ava) = 5 H1Bla)p.t0)] = 0, (535)
{for which the solution is
exp|2f dx’ Aar)/B1x’) (536)
pals)
ae
where isa normalisation constant such that
an)
Jasna
7 rical reasons, but
sath estan known a pret slo, ovine
aay te te sony ton tn! ty = gle megan el
“Toc atom rien ine 622
Wea Boundary Conn: He We have za sont J ws ete
hi
Atnpala) = FBP = I 638)
However, J is not arbitrary, but is determined by normalisation and the periodic
boundary condition
7
recone
‘Then we can easily integrate (5.3.8) to get
PAs) _ padBla) 9 F a in
vay) wee)——
1225. The Fokker Prank Equation
By imposing the boundary condition (5.3.9) we find that
22) _ By
i) wa |”
I ae (83.13)
Loa)
ota
jf BO) pa Boy
Laceyea *! aero
10) = nly] ED oan)
coho
©) Infinite Range and Singular Boundaries: In ether of these cases, one or the
sither ofthe above possibilities may turn out to be forbidden because of divergences,
ele. A full enumeration ofthe possibilities is, in general, very complicated. We shal
demonstrate these hy means ofthe examples given in the next section.
5.3.1 Examples of Stationary Solutions
4) Diffusion in a Gravitational Field: A strongly damped Brownian particle mov
ing in 2 Constant gravitational field is olten described by the stochastic diferential
equation (8.2.15)
dx = -gdt+ VDdwin, (5.3.15)
for which he Fokker Planck equation i
® _ Bon, \poe
2. Diop tn@e (5316
(On the interval (a.) with reflecting boundary conditions, the stationary solution is
given by (5.3), ie.
Paix) = AF expl-29x/D 1, AI?
‘where we have absorbed constant factors into the definition of 1
Clearly this solution is normatisable on (a,b) only if aa is finite, though b may
be infinite The result is no more profound than to say that particles diffusing in a
‘beaker of fuid will lall down, and if the beaker is infinitely deep, they will never
stop falling! Diffusion upwards against gravity is possible for any distance bat with
exponentially small probability.
‘Now assume periodic boundary conditions on (a,b), Substitution into (5.3.14)
yields
Pad
pata), (53.18)
‘constant distribution,
‘The interpretation is that the particles pas frely from a to band back.
1) Ornstein Ublenbeck Process: We use the notation of Sect.3.8.4 where the
Fokker Planck equation was
453 Stationary Solution for Homogeneous FokkerPlanck Equations 123
ras)
Fig, $2, Nontomatsale “stoma p06 the ee
¥ thos Ae 2k
9 _ Beary s to® 6319)
Pt Pa
whose stationary slain on he intra ab) with reflecting ares x
pals) = expl-kx?/D) (5.3.20)
Provided & > 0, this is normalisable on (~0, 00).
IK <0, one can only make sense of ton a finite interval. In this case suppose
<0. 9320,
0 that from (5.3.11).
Kigaa
sore tee
and if we consider the periodic boundary condition on this interval, by
Ha) = wa) a
wwe find that
a3) wef Fue al. 532
patsy = tS ~ puarero |p
ion as in the ease of reflecting barirs
so thatthe symmetey yields the same solu the
Letting a + 0, we see that we still ave the same solution. The result is also true
ifa—+ ov independently of b + ce, provided k > 0.
©) A Chemical Reaction Models Although chemical reactions are normally best
modelled by a birth-death master equation formalism, as in Chap. 11, approximate
treatments are often given by means of a Fokker-Planck equation. The reaction
X+Ae oN (53.25)
= 0 (where xis the number of
is of interest sine it possesses an exit boundary at x = 0 ( rat
molecules of X}. Clearly if there is no X, a collision between X and A cannot occur
s0 no more X is produced
“The Fokker-Planck equation is derived in Sect. 1.6.1 and is
apts) = ~8, (ax—P) piso] + $02 |(ax+ A) pes.9} oe1245. The Fokker Planck Equation
We introduce reflesting houndaries at x
solution is
and x = B. In this case, the stationary
Paap eMart
63.27)
whieh is not normalisable if ¢ = 0. The pole at x = 0 is a resul ofthe absorption
‘there. Infact. comparing with (5.2.18), we see that
BO.) = (art hg =0.
AO.1) = (ax ¥)a0 = 0, (5.3.28)
8,810.1) = (04 28)e0g > 0.
80 we indeed have an exit boundary. The stationary solution has relevance only if,
> 0 since itis otherwise not normalisable. The physical meaning of a reflecting
‘barrier is quite simple: whenever a molecule of X disappears, we simply add another
‘one immediately. A plot of psx) is given in Fig. 5.2. The time for all x to disappear
's in practice extraordinarily long, and the stationary solution (5.3.27) i, in practice,
8 good representation ofthe distribution except near x = 0.
5.4 Eigenfunction Methods for Homogeneous Processes
‘We shall now show how. in the case of homogeneous processes, slutions can most
naturally be expressed in terms of eigenfunctions. We consider refecting and absorb
ing boundacies,
5.4.1 Bigenfunctions for Reflecting Boundaries
We considera Fokker-Planck equation fora process on a interval (a,b) with refle
ing boundaries. We suppose the Fokker-Planck equation to have a sationary solution
Pao) and the from
O,p13.8) = ~ILACPL D+ YRLBDPL. A 64
We define a function gts.) by
PED = BadgCH.D 542)
and, by direct substitution, find that 42 satisfies the Backward equation
Aglet) = Aas.) + $BEHAAC.1 (543)
‘We now wish to consider solations ofthe Form
Pian = Pune’, (544)
at) = Qaaye*, (645)
which obey the eigenfunction equations
ANACIPAAN] + $LBYP AL] = ~AP AX) (8.46)
AGH, Qi(2) + $BOIBO eC) = =2" yx). 647)
5.4 Eigenfunction Methods for Homogeneous Processes 128
rom (514.2) and (5.4.3) i follows that
(548)
i) Relationship berween Py and Qu
Paka) = pals}Qutx)
“This simple result does not generalise completely to many dimensional situs
tions, which are treated in Sect. 65.
4i) Orthogonality of eigenfumctions: We can straighttorwardly show by partial inte
ration that
ura farryoQ.09
[oren{-aunrions [Link] ceo} ~ $BL9PsC99. Geta
649)
Using the reflecting boundary condition on the cocficient of Q(x), we see that,
this ceficien vanishes, Further, sing the definition of g(t) a tems ofthe
Stationary solution (5.4.2), its simple to show that
$298, O08) = ALP) + $LBEOP (0. (54.10)
50 that term vanishes also, Hence, the Qu(x) and P(x) form & bi-orthogonal
system
falxPusvQr4a)= bie. 4.1)
‘Threats reo alte ohana systems,
J dx pssyOua)Qn(0) = Sax (54.12)
i (53)
[Link] PAP eC) = bas
onary
i should be noted that setting 4 = 2° = O gives the normalisation ofthe
solution py(x) since
44)
Pa = pad.
ou = 1 415)
assuming comple:
it) Expansion neigefntons Using this orthogonality and r
ess) ne can write any ston interme of egenncions Foc if
pint = EA, Pane, (5.4.16)
then
Faroe. (6417)1265. The Fokker Planck Equation
‘v) Conditional probability: For example, the conditional probability p(x] s0.0)
is piven by the inital condition
P2,0).%9,0) = dt x0), (54.18)
so that
Ay = Fede Q,2)6~ x9) = Orban). (54.19)
and hence,
Pact 10.0) = E PriQataaye" (54.20)
Vi Autocorrelation function: We can write the autocorrelation fonction quite ele-
any as
((OM(O)) = Fade Fro x20 plxt149, 09.00), 42
Z[Jacmcnfe® 42
where we have used the definition of Q4(x) by (5.4.5)
SA2
igenfunctions for Absorbing Boundaries
These are treated similarly.
We define P and Q; a8 above, except that p(x is lll he stationary solution of
the Fokker-Planck equation with reflecting boundary conditions, With his definition,
We find that we must have
Plo) = Qx(a) = Px(b) = Q41b) = 0, (5.4.23)
{and the orthogonality proof stil follows through. Eigenfunctions are then computed
using this condition andthe eigenfunction equations (5.4.6) and (5,7) and all other
results look the same. However the same of 4 does not Include 1 = 0, and hence
Lt 49,0) 0 as 1 > 60
54.3 Examples
2) Wiener Process with Absorbing Boundaries The Fokker Planck equation
ap = ip. (5424)
is treated on the itera (0,1). The absorbing boundary conitoe requires
100.9 =
and the appropri
0. (5.4.25)
igenfunctions are sin(nzx) so we expand in a Fourier sine series
pian = ¥ bansininas, 6.426)
‘hich automaticaly saisies (5.4.25) The initial condition is chosen so that
54 Ejgentunetion Methods for Homogeneous Processes 127
(5427)
for which the Fourier coeticients are
2,0) = 2 Favotx ~ xp)sin(unx) = 2 sin(nmay) (54.28)
Substituting the Fourier expansion (5.4.26) into (5.4.24) gives
ald = a,b, (5429)
i =~).
with
Ae = (5430)
and the solution
lO) = beO) expla) aan)
So we have the Solution [which by the initial condition (5.4.27) is for the conditional
probability px.t1x.0))
Us t1a9,0) = 2 5, expl—aut)sin(mm) sin(as). (5.4.32)
'b) Wiener Process with Reflecting Boundaries: Here the boundary condition re
duces to fon the interval (0, 1)]
A,pA0.1) = BPC, 8) =O, (5433)
and the eigenfunctions are now cos(nur), so we make a Fourier cosine expansion
140 +E ayltheosns), te
‘withthe same initial condition
our a0) (54.35)
(0) = 2d costnnsi(x— ag) = 20s. (5436)
Inthe same way as before, we find
440) = a0) EXPK-2ats (saan)
with
ener (538)
sothat
plx.t\,0) = 1 +2 F costs) costo expl—2yt) (54.29)
[As #0, the process becomes stationary, with stationary distribation128 5S. The Fokker-Planck Equation
(5.40)
an compute the stationary autocorrelation function by
P.O), = ff dds plet a. 0100. (saat)
and carrying out the integrals explicitly,
las
(nO = 5+ FE exmetananQn + 4 (5.442)
We see that asr + c, all the exponentials vanish and
l
(ONO. > 5 = CO aaa
and as t+ 0,
Ae
(tna, > 3+ 5 Zane ty (544d)
when one takes sesount of the entity (rom the theory of te Riemann zeta
a ae
Eats % 5.485)
©) Ornstein-Unlenbeck Process: As in Sect. 3.8.4 the Fokker Panck equation is
4 plot) = AlkxpOO + $D8%px.0, (534.46)
‘The eigenfunction equation for Q is
ene
Qn tO + FOr 0, (5.4.47)
tnd his becomes he diferent equation for Hermite peynomis
ion or Hert pobmomils Ha) (5.560
making the replacement y = x VE7D 7 ee
4} Q1 ~ 24d, 01 + 211K), = 0. 5.4.48)
Weean write
Q) = tute '?H, (x VAT) « (5.4.49)
where
Aan. 6450)
lutions are normalised as in (5.4.11-5.4.13),
jonary solution is, as previously found,
(ky) exp—kx?/D) 6.451)
tnd a general solution can be written as
Psst = Z VRE ARDH exr-k (DIK, (x YTB), (5452)
154 igentunction Methods for Homogeneous Processes 129
with
Ag= J deploy ( #). Since the system is time
homogeneous, we ean write
plast13,0)= plO1x-0. 654)
and the backward Fokker Planck equation ean be written
‘Aypla ts 0) = ALA PLA t}%0) + § BLD 12,0), (535)
and hence, G0) obeys the equation
B,GLx,1) = ALIGL.) + $BUIRGCO.1) 656)
Initial condition: Clearly that
plu’.O1x,0) = r=, (659)
and hence,
G0) eae 5.5.8)
0. elsewhere.
iit) Boundary conditions: Wx = a oF b, the panicle is absorbed immediately, so
ProbiT > #) =O when x= aorx = bie,
Gla,t) = Gib, =0. (5.5.9)
in) Moments of the exit time: Since Gl.) is the probability that Tf. the mean
‘of any function of Tis
gery =F sasexn iss10
Thus heme eine oe ras ine)
Tix) = (T) (5.5.11)
beens
ria) = =feaceena sin
= Focund (5.13)
stringy pte Sma ie
Tax) = (T"), (5.5.14)
wn
ra0)= foun, 519
¥) Differential equation for the mean exit ime: We can derive a simple ordinary dif
ferenial equation for T(x) by using (5.5.13) and integrating (5.5.6) over (0,0)
Noting that1525. The Fokker Planck Equation
Ja, [Link] = G1x,00)~ 614.0
1 516)
wederive
AcoaATEN + $BUNETUA) (65.17)
with he howndary condition
To) = 10 6518)
Simiay we se that
ay) = ACT) + EBL) 6519)
i Solin of he Equation: Bquton (517 an be volved
on (S517 can be solved diel by ner
‘on The slo ater some mariplaion cane writen interns
419-09 {f artes} 5.20)
eg _(f ta few
Bo awit du5t me
a 521)
fe
Jey
55.2 One Absorbing Barrier
We omit motion sil inthe interval.) tat
eral (0) bsp ihe air ber
flecting. The boundary conditions then become a a
Gla.) = 0.
Gib.t) i ame
(55.226)
sich flow fom the candiiony om he backward Fokker Patch
svar Fokker Manc eatin dived
Seo 5.12, Weselve (3.17) wih the comespondng boundary candi amd in
Falage sors
Similarly, one finds
reflecting,
a absorbing, ae
a © Po 7
wo \y \ no ; I oo ; |
a 2 Zs
Fig. 5.3. (a) Double well potential U(2;¢b Stationary dstiation p10; () Mean fs pas
sage lime from at0 x, T(a +9)
5.53 Application—Escape Over a Potential Barrier
‘We suppose that a point moves according 10 the Fokker-Planck equation
A, peat) = Bef". + DPC A (55.25)
“The potentiat has maxima and minima, as shown in Fig. 5.3. We suppose that motion
json an infinite range, which means the stationary solution is
pala) =H expl-UO/D1, (5.526)
which is bimodal (as shown in Fig. 5.3) so that there is a relatively high probability,
Of being onthe left or the ight of b, But not near b. What is the mean escape time
from the let hand well? By this we mean, what is the mean first passage time from
‘ato. where xis in the vicinity of B? We use (5.5.23) with the substitutions
es ssa
sot
ro 3s fares f ex -Ute ND sam
If the central maximum of U(x) is large and D is small, then exp [U()/D) is
sharply peaked at x = b, while expl-U@)/D] is very small near z = b. Therefore,
‘Pha expl-U(2)/ Diz isa very slowly varying function of y near y = b. This means
that the value ofthe integral f, expl—U(2)/ dz will be approximately constant for
those values of y which yield a value of exp{U(y)/D] which is significantly eliferent
from zero. Hence, inthe inner integral, we can set y = B and remove the resulting
constant factor from inside the integral with respect to y. Thus, we ean approximate
(5.5.28) by
reo {5 f ayerot-uieDI}jdvenstu/} (5529)
[Notice that by the definition of py(x) in (5.5.26), we can say that
nal (5.5.20)
F dyexp(-Uey0)IMS. The FotkerPlnck Equation
Which means that isthe probability thatthe particle is o the left of & when the
system is stationary.
A plot of Ta ~> 49) against pis shown in Fig, 5.3 and shows thatthe mean frst,
Passage time to xy is quite small for xp in the let well and quite large for tp in the
‘ight well. This means that the particle, in going over the bartier to the right wel
{takes most ofthe time in actually surmounting the barrier. Its quite meaningful 10
alk of the escape time as that time for the particle, initially ata, to reach a point
rear c since this time is quite insensitive tothe exact location ofthe inital and final
Points. We can evaluate this by further assuming that near b we ean write
voy vos 4(254) esa
antmara
tes vers (2) 3529
“The constant factor in (5.5.29) is evaluated as
{dcext-vooroi f deorp[-4O =a"), say
=a VixDexp|-U(a)/D), (5.5.34)
And the inner factor becomes, on assuming ty is well to the right ofthe central point
b.
Fevevanor= j aver] 4 SP], 6538)
VIxDexplU(b)/D1 (5.5.36)
Ping tah oF een 55.29)
Tea 1) 2abrexlUd)~ Ul0/D) osm
‘This is the classical Arrhenius formula of chemical reaction theory. In a chemical
reaction, we can model the reaction by introducing a coordinate such that x = a is
species A and x = ¢ is species C, The reaction is modelled by the above diffusion
Process and the two distinet chemical species are separated by the potential barrier
1b inthe chemical reaction, statistical mechanics gives the value
D=it (55.38)
here &is Boltzmann's constant and T is the absolute temperature. We see thatthe
‘ms important dependence on temperature comes from the exponential factor which
is often writen
explAE AT), (55.39)
and predicts a very characteristic dependence on temperature. Intuitively, the answer
Js obvious. The exponential factor represents the probability that the energy will
‘exceed tha ofthe barrier when the system sin thermal equilibrium, Those molecules
that reach this energy then react, witha certain finite probability
‘We will come back to problems like this in great detail in Chap 14
5.5 Fis Passage Tines for Homogencous Processes 135
5.54 Probability of Exit Through a Particular End of the Interval
‘What isthe probability thatthe particle, intially atx in (a,b) exits through a. and
‘hat i the mean exit ime?” iL
“The total probability thatthe particle exits through a after time 1 is given by
time integral of the probability current ata. We thus define this probability by
ans.) =~ fal Jat (50), (53.40)
Fat [-Alaypta,r’ | 2.0) + $341 BCa)pa.t [2,00]} 540)
the nopative sign being chosen since we nee the euren pointing othe Ill, Simi
lary we define
‘gulsat) = J at (ACbyplb4 F250) ~ LOBOYPLE.1 2,001} (5.5.42)
“These two quantities give the probabilities thatthe particle exits through a or b after
time ¢ respectively. The probability that (given that it exits through a) it exits after
time ris
Prob, > 1) = aals.19/996.0) 543)
We now find an equation for gu(x.0). using the fact that p(a,1\x,0) satisfies 3
backward Fokker-Planck equation. Thus,
Fdt ae Ha,t \x.0),
Aena,gtsi+ aCe)
at),
= Bal.0) 65.49)
“The mean exit time, given that exit is through ais
gost) 98.22) (5.845)
Ta.s) =~ r5Pu(Ty > Na
Simply integrating (5.5.44) with respect to 1, we get
Atala s9T(a.2)] + $BUVOLR ITAA) = Ales (55.46)
where we define
_m(2) = (probability oF exit through a) = 94(%,0). (35.47)
“The boundary conditions on (5.5.46) are quite straightforward since they follow from
those for the backward Fokker Planck equation, namely,
(S548)
ea)T (aya) = na(b)P(ab)
is zero) and
In the first of these clearly Ta,a) is zero (the time 10 reach a from a is zero)
in the second, x,(b) is zero (the probability of exiting through a, starting from b, is
2210)(365. The Fokker Planck Equation
By leting ¢ ~» 0 in (5.5.44), we see that J(a,Olx.0) must varish ifa + x, since,
a.0x.0) = 6x ~a), Hence the Fight-hand side tends o zero and we get
ACI RL) + $ BLY (5.549)
the boundary condition this time being
mala 1, mb) =0. (5.5.50)
‘The solution of is boundary condition and the condition
(551)
5.49) subject to
(8) 4 m2) = 1
Rasy = {i ay 0) l J dy wo. (5552)
mtn) = [fave | vv. (55.5%)
with WO) as defined in 55.20.
“These formulae ind application inthe problem of relaxation of a distribution ini-
sally concentrated at an unstable stationary point (Set 141.2).
Example—Dilfusive Traversal Time of a One-Dimensional medium: A particle
tse ina one-aimension according the difesion equation ip = {dp What
ste mean time forthe pate to cifse frm bo a under the condition tha it does.
sot eae the interval (a,b) before reaching a? [S6)
In the case that the particle starsat within (2,6), we find rom (5.552) and
(5546)
bes
a) a 6554
SDARRAITAbO) = =) (55.55)
Using the boundary conditions (5.5.48) the second equation is easy integrated 10
ive
Ex bia = aXe +a-2b)
As)Pela) = 4
(oT) oD) (55:56)
and hence
Tx) = FeO so) (3550)
3D
In the limiting case x ~» b the probability of exit through a as given by (5.5.54), is
zero. Nevertheless, inthe limit that xis approaches b, the mean time to make the exit
ven that the ext is at ais quite well defined and is
-aF
Tb) = (55.58)
5.5 First Passage Times for Homogeneous Processes 137
“This is also clearly the time to exit ata without ever leaving the interval (a,b) before
Somtntnte ineral: Notice thai we fix x ant be, we find
1 (5559)
Ta) 0 (ssa
“The eesull is thatthe particle is certain fo escape a a, but the average time to escape
is infinite This arises esas the particle an spend ret dea fine exper
the infinite half of the interval, giving rise to an escape time distribution which is
normalisabl, but decays so slowly that all moments diverge.6. The Fokker-Planck Equation in Several
Dimensions
ln many variable situations, Fokker-Planck equations take on an essentially more
complex range of behaviour than is possible inthe case of one variable, Boundaries
are no longer simple end points ofa line but rather curves or surfaces, and the nature
‘of the boundary can change from place to place. Stationary soluions even with re-
Aecting boundaries can correspond to nonzero probability currentsand eigenfunction
‘methods are no longer so simple.
[Newertheles, the analogies between one and many dimensions are useful, and this
chapter wil follow the same general outline as that on one-variable situations.
‘The subject matter of this chapter covers @ number of exact Fokker Planck equa
sion results for many variable systems. These results are not as explicit as forthe one
variable ease. An extra feature which is included is the concept of detailed balance
in multivariable systems, whichis slmost trivial in one variable systems, but leads to
very interesting conclusions in multivariable systems,
“The chapter concludes with a treatment of exact results in ext problems for mul-
tivariable Fokker-Planck equations.
6.1 Change of Variables
Suppose we have a Fokker-Planck equation in variable.
A,ple,1) =~ LAAAAx plat) + $F ASByx ple, (61.0)
‘and we want to know the corresponding equation for the variables
= fe) (6.12)
where fare certain differentiable independent functions. Let us denote by p(y.) the
probability density for the new variable, which is given by
dus. x2
Bysya-
‘The simplest way o effect the change of variables isto use Ito's formula on a corre-
sponding stochastic differential equation
iy.) = pla.t
(6.13)
ax
where
(ard + benyawin, (6.14)
6.1 Change of Variables 139
bixy"btx) = BOD, (6.15)
‘and then recompute the corresponding Fokker-Planck equation for piy.t) trom the
resulting stochastic differential equation as derived in Sect. 4.3.5.
"The result is rather complicated. In specific situations direct implementation
(6.1.3) may be preferable, There is no way of avoiding a rather messy calculation
Lunes full use of symmettes and simplifications is made
Example—Cartesian to Polar Coordinates: As an example, one can consider
the transformation to polar coordinates of the Rayleigh process, previously done by
the stochastic differential equation method in Sect. 4.5.5. Thus, the FokkerPlanck
equation is
fp, &
wo ce). (6.1.6)
EF * 362
‘and we want to find the Fokker-Planck equation for a and @ defined by
a
apts = Gbps Gen be
By = acosd, Ep =asind 17
“The Jacobian is
cos sind |_ Be
sing acosd ie
We use the polar form of the Laplacian to write
18 apa
2 tZ (oe), 61.9)
rales) i)
and inverting (6.1.7)
a= JET+E}, p= tan (Es/Ey) (61:10)
We nove
oe Et cos
Cet at ant
ane un
fe. -§ Fs = sing,
OB: fet
and
(Ob iaan tanya:
te B+ @
a6 _F:___ sing
a, H+e ae
(6.1.12)
Hence,M06, The Fokker Planck Equation in Several Dimensions
a a
aE ape.
ra (ES 5 228),
2 eae ae,
0 1 8a
p+ oe = Etat, 6.13)
Let us use the symbol pa) For the density Function in terms of 2 and &, The Jaco-
ban formula (6.1.8) us that
AE. Bo
7) = |AEE ct,£) =a EE ons
Pating opt (6.1.6 (6.1.9) (6.1.13) and (6.1.1, we Be
2 Mya) oe ( 188 2
Elo) $ (GZ 22). (15)
“which (of course) isthe Fokker Planck equation, corresponding tthe two stochas-
ic differential equations in Sect.4.5.5, which were derived by changing variables
according to to's Formula,
6.2 Stationary Solutions of Many Variable Fokker-Planck
Equations
62.1 Boundary Conditions
We have already touched on boundary conditions in general ia Sect. 5.1 where they
were considered in terms of probability current. The full range 0! boundary condi
tions for an arbitrary moltidimensional Fokker-Planck equation is very much broader
than for the one dimensional case, and probably has never heen conpletely specified.
In this book we shall therefore consider mostly reflecting barrier boundary condi
tions at a surface S, namely.
mJ =0 forse, (62
where mis the normal othe surface and
Hla = Adastpta) = BE, aye Deke. 622)
and absorbing barrier boundary conitions
Pa.) =0 fore 23)
1 practice, some par ofthe surfce may be reflecting and anotieabsorbing. Ata
surface S on which 4; or Bj are discontinuous, we enforce
JO), = 2 Ha,
may Epa | ms is2
62 Siationary Solutions of Many Variable FokkerPlanck Equations 141
“The tangential current component is permitted to be discontinuous.
“The boundary conditions on the backward equation have already been derived in
‘Sect 5.1.2 For completeness, they are
Absorbing Boundary: pox. ty.) = 0 yes. 625)
a 5
Reflecting Boundary: Em Bywrgrrie t=O yes. (626)
6.22 Potential Conditions
{A large class of interesting systems is described by Fokker-Planek equations which
permit a stationary distribution for which the probability current vanishes forall in
BR. Assuming this tobe the ease, by rearranging the definition of J (6.2.2), we obtain
‘a completely equivalent equation
[Link])
3x,
$B Btn Las
she matrix B(x) has an inverse forall x, we can write (6.2.7)
a : a ,
Lvoeinoon = Zaiten|emey-2 20,40). (628)
ZAA,Bl 629)
“This equation cannot be satisied for arbitrary By(x) and A(x) since the leltchand
side is explicitly a gradient. Hence, Z; must also be a gradient, and a necessary and
{condition for that isthe vanishing ofthe curl i.e.
Hs (6.2.10)
If this condition is satsted, the stationary solution can be obtained by simple inte-
ration of (6.2.8)
oy-vo jae.) amy
tities Z; from derivatives of log [p,(2)], which, therefore, is offen thought of as a
potential ~@x) so that more precisely.
pala) = exp (6212)
and
lx) =~ fade ZA, Bea) (62.13)1426 The Fokker Planck Equation in Several Dimensions
Example—Raylefgh Process in Polar Coordinates: From (6.1.15) we find
ya + e/a"
ae ("e% -
215)
from which
2
Bah (6.2.16)
so that
cap ta = (~2mle? + Ha’
ermia (De t19) oe
an ceay
yy
a eg 218
‘The satonary soleton shen
aad) = exp| Fab doze] 62.19
ye
1 o9( 2 sg. oxy
pid
ve 2) 220
6.3 Detailed Balance
6.3.1 Definition of Detailed Balance
‘The fact that the stationary solution of certain Fokker-Planck equations corresponds
‘o2 vanishing probability current isa particular version of the physical phenomenon
of detailed balance. A Markov process satisfies detailed balance if. roughly speak-
ing, in the stationary situation each possible transition balances with the reversed
lrnnsition. The concept of detailed balance comes from physies, so let us explain
‘more precisely with # physical example, We consider a gas of particles with pos
tions r and velocities v. Then a transition corresponds to a particle at some time £
with position velocity (r.6) having acquired by a later time t+ + pestion and veloc-
ity of). Te probabil deny ofthis easton ihe on poe deny
(OH T: FB)
‘We may symbolically we
this transition as
(0) 4.0.02), a4)
{63 Detaled Balinve 143
ned and unprimed
‘The reversed transition is not given simply by interchanging pr
quantities Rather, itis
Fo) rer) (6.32)
i corresponds to the time reversed transition and requires the velocities to be re
versed because the motion from z” to ri in the opposite direction from tha from r
tor.
“The probability density forthe reversed transition thus th joint probability den
sity
DAR B48. (633)
The principle of detailed balance requires the equality ofthese tw joint probabilities
when the system isin a stationary state, Ths, we may write
(634)
pe 7:758,0)
palrs-v,t5 0-0
“The principle can be derived under certain conditions from the laws of physies—see
[6.1] and Seet.6.4.2
6.3.2 Detailed Balance for a Markov Process
When the probabilities correspond 10 those of a Markov process we ean rewrite
(63) as
pire] -,0, 002.8) rie. 0 pur). 635)
or,
where the conditional probabilities now apply to the corresponding homogeneous
Markov process (iF the process was not Markoy, the conditional probabilities would
be forthe stationary system only).
Inits general form, detailed balance is formulated in terms of arbitrary variables x;
whieh, under time reversal, (ransform (othe reversed variables according tothe rule
mee. 636)
ee al 37)
depending on whether the variable is odd or even under time reversal Inthe above,
riseven, vis odd,
‘Then by detailed balance we require
ex 4 7880) (63.8)
port as0
By ex, we mean (e) 81.4282.)
"Notice that setting + = 0 in (6.3.8) we obtain
ox ~ xpi!) = dlex — 8x’ )[Link]) 639
The two delta functions are equal since only sign changes are involved, Hence,
palex). (63.10)
pals)(errr
1446. ‘The Fokker Planck Equation in Severat Dimensions
isa consequence of the formulation of detated balance by (6.3.8), Rewriting now in
‘terms of conditional probabilities, we have
| nsrie.0on0e)= nee. ren pin ain
6.33 Consequences of Detailed Balance fr Stationary Mean, Autocorrelation
Function and Spectrum nl
An important consequence of (63.10) that
(x), = eCx)s (6.3.12)
hence all odd variables have zero stationary mean, and for the auiocorretation func:
tion
Gr) = (x(a, (63.13)
we have
GO) = eM, (63.4)
hence,
Gir) = eG" ine, (63.15)
and setting + = O and noting that the covariance matrix o satisfies = o",
o (6.3.16)
For the spectrum matrix
net
Sw)= 3 FeMom, 3.17)
we find from (6.3.15) that
Sew) = 68 whe (6.3.18)
led Balance must be Generalised
Possible that there exist several stationary solutions to a Markov process, and in
this situation, a weaker form of detailed balance may hold, namely, instead of (6.38),
we have ;
plier x0) = plex ners ext. 63.19)
where the superscripts 1 and 2 refer to two different stationary solutions. Such a
situation can exist if one of the variables is odd under time reversal, but does not
change with time; for example, ina centrifuge the total angular mementurs has this
property. constant magnetic field acts the same way.
‘Mostly, one writes the detailed balance conditions in such situations as,
63 Dewiled Blane 1S
poet eri att) = plex 4s en). 63.20)
where 1 is a vector of such constant quantities, which change to 6A under time re-
versal. According to one point of view, such a situation does not represent detailed
balance: since ina given stationary situation, the transitions do not balance in detail
Wis perhaps better to cal the property (6.3.20) fime reversal invariance.
Ta the remainder of our considerations, we shall mean by detailed balance the
situation (6.3.10), since no strong consequences arise from the form (6.3.20)
6.35 Implementation of Detailed Balance in the Differential
Chapman-Kolmogorov Equation
‘The formulation of detsiled balance for the Fokker-Planck equation was done by
vat Kampen [6], and Graham and Haken {6.2}. We will formulate the conditions
in a slightly more direct and more general way. We want necessary ang suficient
conditions on the drift and diflusion coefficients and the jump probabilities for a
homogeneous Markov process to have stationary solutions which satisfy detailed
balance. We shall show that necessary and sufficient conditions are given by
@ Werle ype) = Weer’ ienp,
a
i eAKeRPA) = “AKIO +E FIBULA,
Git) ee jByfox) = Bylxy
“The specialisation to a Fokker Planck equation is simply done by sting the jump
probabilities Wex 2’) equal to zero.
2) Necessary Conditions: 11s simpler to formulate conditions for the cifeentil
CChapman- Kolmogorov equation than to restit ourselves to the Fokker-Planck equa-
tion. According to Sect. 3, which defines the quanities Wox|2).Ai¢x) and By)
{allo course feng tine iependent. since we are considering homoseneous po
cess), we have the trivial result that detailed balance require. from (63.11)
Wer x’)pse’) = Wier’ lexdpan). (3.21)
Consider now the dif coeficient. For simplicity write
63.22)
x46
‘Then from (63.11) we have
spl Se + ob ae, Opa) =f aB Sra dole +40) +8),
6323)
‘we use K instead of ein the range of integration to avoid confusion with «divide
by Ar and take the limit Ar > O, and the lef-hand side yields
eA) pala) + O1K)- (63.28)1466. The Fokker Planck Equation in Several Dimensions
‘On the right-hand side we write
Wa [Link] x +8,0)p,0x +8)
per 6.0014, 0)p.(4)
a
+E) 2 [pla 8, dr) x. O)pia)] + 06), (6.325)
so that the right-hand side is
Ln, Fe {2 4,012,090)
+560 grote a + 01K),
“Adapted 5 (Bite + O18), (6326)
free etre es ed ered In Se 34 des rms involving higher
powers of than 6 are of order K. tating K 0, we find
sAlexipAla) = ~Adadp da) + EI tBu (6327)
‘The condition on B(x) is obtained sina, but in this case no term like the second
‘on the right of (6.3.27) arises, since the principal term is O48"). We find
598 B (6x) = Bix) (6.3.28)
A thitd condition i, of course, that p(x) bea stationary solution of the differential
(Chapman-Kolmogorov equation. This is not a trivial condition, and is, in general
independent ofthe others.
by) Sufficient Conditions: We now show that (6.3.21) are sufficient. Assume that
these conditions are satisfied, that py(x) is a stationary solution of the differential
CChapman-Kolmogorov equation, and that p(x, rx’, 0) is solutior of the differential
CChaprman-Kolmogorov equation. We now considera quantity
Phest1x'.0) = plex’ }ox,O)p.(x)ps2") (3.29)
Clearly
x.) 2,0) = dtx =x) = pox. 0] 2.0) 63:30)
‘We substitute p into the differential Chapman-Kolmogorov equation and show that
because plx’,t]x,0) obeys the backward differential Chapman-Kolmogorov equa:
dion in the variable x, the quantity pis a Solution of the forward differential
Chapman Kolmogorov equation.
We do this explicitly. The notation is abbreviated for clarity, so ‘hat we write
PD for pletle.0),
pe for pals),
for pe’),
pes) for ple'thx.0)
amy
63 Daaled Blane 7
We proceed erm by tc.
i) Dat Term
-E fama -2 Zeamenniy. 6330)
= 3 [Flames ngs 2 mex) 0%
inno
4 a [Bypsdplex)
Lege mum 45 oo mare
We 339
Be #
+22 (Bp) 2 ples) + Birig amen
aera ge wee) * Boas PA
it) Jump Term:
J del Wx 2)D(z.t 12,0) ~ Wea x) px tha’,
= fade Wx] z)psz)plex’,tez,0) ~ Wiel x)[Link]) plex’. t1ex,O)IPs
(63:30)
‘We now use the fact that p(x) is solution ofthe stationary differential Chapman-
Kolmogorov equation to write
2 ®
-s|2earoe bE scion
Jaz Weelsype)
ose Werle. a3)
and sing the detailed balance condom (6.210) for W
= - fide Wezlex)pdz) (6.3.36)
50) Combining all terms : Now subsite
age (633)
and all up ll thre contains taking cae of (6.338), (63.36%
{-pencennen| Zp] + Zeng enn ge |
+ JdsiW eu |2)p.(c)pty' 162.0)
+4 Zee,B: steno | aoa
~ Woey| ps2) ty. 00} /p.) 63.28)
We ow satiate died talance conttons (62.20)
=(saunatin.o Bsns mK
+ [dete] yip12.0)~ Welwrpty tiv. 0o}psayiedy’) (6229)1486. The Fokkes
anck Equation in Sever! Dimensions
‘The term in the lage curly brackets is now recognisable asthe backward differ
ential Chapman-Koimogorov operator (Sect. 3.6, (3.6.4). Nete that the process
is homogeneous, so that
ra’ t1v.0)= ta0i9.-0) (340)
We se that
(63.9)= iy’ Opsyvinay | = Bokate’.0) (6341)
Siow Osrtea= 2p.
hich means that tx, 1x’, 0). defined in (6.3.29), satisfies the Forward differen:
tial Chapman-Kolmogorov equation. Since the intial condition of plx,t|x’.0)
and lx.t|x',0) at 1 = 0 are the same (6.3.30) and the solutions are unique, we
hhave shown that provided that detailed balance conditions (6.3.21) are satisfied
‘detailed balance i satisfied, Hence, suficiency is shown,
©) Comments:
') Even variables only: the conditions are considerably simpler if all ¢, are +1. In
this case, the conditions reduce 10
Woxe')p) = Wex'lx) pax), (6.3.42)
Aiea) = 4, Fyltitennte (6.3.43)
ys) Be (6344)
the last of which is trivial. The condition (6.3.43) is exactly the same as the
potential condition (6.2.7) which expresses the vanishing of J. the probability
‘current inthe stationary state
‘The conditions (6.3.42) and(6.3.43) taken together imply that pa(x) satisfies
{he stationary differential Chapman-Kolmogorov equation, which isnot the ease
for the general conditions (6.3.21),
i) Fotker Planck equations: The cone of reversible and ireversile dit pats
ss intodced by va Kampen 1), and Graham and Haken [6.21. The ie,
versie dit is
D4) = HA) +eAKon. 6345)
and the reversible dit
Na) = Asa) ever). (6346)
Using again the potenil defined by
Pax) = oxpl-#l0. (6347
‘we sce that in the case of « Fokker Planck equation, we ean write the conditions
{or detailed balance as
nck Equations 149
64 Examples of Det
Bex) = By). 348)
Day 15 Layton : (6349)
Yay
4 240 (9350)
p[ Zo -no%@2]=0,
where the last equation is simply the stationary Fokker-Planck equation for psx)
alter substituting (6.3.21 i). AS was the ease forthe potential conditions, i
bre seen that (6.3.49) gives an equation for 69/2x; which can only be satisfied
provided certain conditions on D(x) and By(x) are satisfied. 1 B(x) has an
inverse, these take the form
a, ash
Bs,” OR”
where
i a (6352)
4 io] 2De ~ BBY
and we have
p= expat) = exp (Fae ) oe
‘Thus, asim the case of a vanishing probability current, p,(x) can be determined
explicitly as an integral.
iii} Connection between backward and forward operators of differential Chapman
Kolmogorov equations is provided by the detailed balance, The proof of suf
cient conditions amouots to showing that if fx, is a solution of the forward
tiferential Chapman-Kolmogoroy equation, then
Fiat) = flex pe. 16.354)
isa solution of the backward differential Chapman-Kolmogorov equation. This
telationship will be used in Sect. 6.5 for the construction of eigenfunctions.
6.4 Examples of Detailed Balance in Fokker-Planck Equations
6.4.1. Kramers? Equation for Brownian Motion in a Potential
“This problem was given its definitive formulation by Kramers 6.3}. when consider
ing a mode! of molecular dissociation. :
'We take the motion of a particle in a fluctuating environment, The motion isin one
dimension and the state ofthe particle is described by its postion x and velocity 6
‘This gives the differential equations1506. The Fok
nck Equation in Several Dimensions
ds
é (64)
le = v9) pos VERTED, 42)
which resenilly Langevin equations (1.2.14) n which, for brevity, we write
ana = 8, (43)
and V(x) isa potential whose gradient V'(x) gives rise o a force on the particle. By
‘making the assumption thatthe physical uctuating force g(t) is io be interpreted as
eindr= awe), (644)
«s explained in Sect. 41, we obtain stochastic diferenta! equations
ds=odt, (645)
indo = -1V'G) + polars V3BRTaWin, (646)
for which the comesponding Fokker Planck equation is
T Py
(Ve +A4p) + AEE 647)
The equation canbe slightly simplified by introducing new scaled variables
v=sVaiT. (648)
= 0ymiT, (649)
uw) = Yeayer, (6.4.10)
y=BIm, (64.11)
so thatthe Fokker Planck equation takes the form
&® a
* - Gun Svs vo) +e (oo 2). (64.12)
which we shall eall Kramers” equation
Here, y (the position) isan even variable and u(the velocity) n odd variable, as
explained in Seet.6.3, The drift and diffusion can be written
Aya) = |-vsi-m (oan)
ewn=[5 3] 64.19)
‘The detailed balance transformation is
lcl-[4] wy
We can check the conditions one by one,
‘The condition (6.3.21 ii) is trivially satisfied. The condition (6.421) is somewhat
degenerate, since B is not invertible. It can be writen
OO
464 Examples of Desiled Balance in Fokker Planck Equations (St
o
‘or, more fully
“1 - ¢
(6.7
oe P= era vou] ny See ne
“The fist line isan identity and the second states
pay.) = Be, (64.18)
ala) = exp(—$1") fly) (64.19)
“This means that if ps(y.u) is written in the form (6.4.19), then the detailed bal-
ance conditions are satisfied, One must now check whether (6.4.19) indeed gives,
stationary solution of Kramers’ equation (6.4.12) by substitution. The final bracket
vanishes, eaving
of
=u ~ vrs, 64.20)
By TOMS «
which means
W expl-UWI. 420,
se] (64.2)
In terms of the original (x,2) variables,
Plxt)= er [2 me (6.4.23)
2KT |*
“which is the familiar Boltzmann distribution of statistical mechanics, Notice that,
the denominators AT arise from the assumed coefficient /2BKT of the flectwating
force in (6.4.2). Thus, we take the macroscopic equations and add a fluctuating force,
whose magnitude is fixed by the requirement that the solution be the Boltzmann
distribution corresponding to the temperature T.
But we have also achieved exactly the right distribution function. This means that
the assumption that Brovsnian motion is deseribed by a Markov process of the form
(64.1), (6.4.2) must have considerable validity.152.6 ‘The Fokker Planck Equation in Several Dimensions
6.4.2 Deterministic Motion
Here we have B(x) and Werle’) equal to zero, so the detailed balance conditions
are simply
GiAKeR) = ~Asx) (6.4.24)
Since we are now dealing with a Liouville equation (Sect,3.5.3), the motion of a
Point whose ecordinates are « is described by the ordinary differemial equation
a
Ga = Aleta (6.4.25)
Suppose a solution of (6.4.25) which passes through the poin al
aia. 6426)
Which therefore satisfies
(6420)
ion 6.4.24) implies thatthe reversed solution
eq(-1.00), (6.428)
is also a solution of (64.25), and since
eq([Link]) = sey =y, (6.4.29)
initial conditions are the same, these solutions must be identical,
eq\-t.0y) = qt.9) 6.4.30)
Now the joint probability in the stationary state can be wei
pasta’) = dy paeestix! 139.0)
= Lely t= qe” — quae). (aay
an
pilex’ 1 sex.) = {dy dex — gl-ryilex’ ~ q(-1 wlpau) 64.32)
Change the vias from y to ey and note that i) = pe nd ey = dy, so
(64.32) = Jdyolx~ eq(-[Link])olx’ ~ eq(-1 [Link]), (64.33)
and using (64.30),
= Sey Sx — qt etx! ~ gt pala) (64.34)
paestx'0) (6.4.38)
Using the stationarity property, that py depends only onthe
tha nds only on the time difference, we see
that detailed balance is satistied. This direct proof is, of course, unnecessary since
the original general proof is valid for this determi
{64 Examples of Detailed Balance in Fokker Planck Equations 188
6.43 Detailed Balance in Markovian Physical Systems
tn physical systems, which are where detailed balance is important, we often have an
unbelievably larze numberof variables, ofthe order of 10" at least. These variables
(Gay, momentum and velocity ofthe particles ina gas) are those which occur in the
disribation function which obeys a Liouville equation for they follow determinist
{equations of mation, like Newton's laws of motion.
can be shown directly that, for appropriate forms of interaction, Newton's laws
‘obey the principle of microscopic reversibility which means that they can be pot in
the form (6.4.25), where A(x) obeys the reversibility condition (64.24). The macro-
scopically observable quantities in uch a system are functions of these variables (for
example, pressure, temperature, density of particles) and, by appropriate changes of
variable. can be represented by the frst few components of the vector x.
“Thus, we assume x can be weitten
x=(as) Cen
where the veetor a represents the macroscopically observable quantities and # is all
the others. Then in practice, we are interested in
Bay, tsag.f343.t5--)
So fbaodity parr tian tik tsio) (64.37)
From the mieroscopic reversibility, it follows from our reasoning above that p, and
thus also, both obey the detailed balance conditions but, of course, ? does not obey
‘a Liouville equation. [fi turns out or can be proven that B abeys, to some degree
“approximation, a Markov equation of morion, then we must preserve the detailed bal-
lance property. which takes the same form for pas for p. In this sense, the condition
(6.38) for detailed balance may be said tobe derived from microscopic reversibility
‘of the equations of motion.
64.4 Ornstein-Uhlenbeck Process
Most systems in which detailed balance is of interest can be approximated by an
Omnstein-Uhlenbeck process, i. this means we assume
Ada) = BAG (6.4.38)
By (6.4.39)
Bye)
“The detailed balance conditions are not trivial, even for the case of such a linear
system, They take the form
a
Beery + Aude = EB Yo mt, (64.40)
and
on wash,S46 The Fokker Planck Equation in Several Dimensions
Equation (6.4.40) has the qualitative implication that pa (x) a Gaussian since deriva-
tive of log ps(x) logis linear in x. Furthermore, since the leli-hand side contains no
‘constant term, this Gaussian must have zero mean, hence, we can write
¥exp(-baton's) (6.442)
One ean ow substitute (64.42) in the stationary Fokker-Planck equation and re-
arrange to obtain
pa)
: (6.4.43)
~ BAG HE Bye! + B(Beutas +4507! Beat)
Where we have used the symmetry of the matrix. The quadratic term vanishes if
the symmetric part of its coefficient is zero, This condition may be written in matrix
form as
oat ato =
Bo, (64.48)
Ao tea" =~B. (6.445)
The constant term also vanishes i(6.4.44) is satisfied, Equation (6.4.45) i, of course,
exactly that derived by stochastic differential equation technigues in Sect. 4.5.6
(4.5.64) withthe substitutions
ed (6.446)
Bes Bs, 447)
‘We can now write the detailed balance conditions in their most elegant form. We
define the matrix by
= diagley,e0.65....) (6.4.48)
and clearly
ea (6.449)
‘Then the conditions (6.4.40) and (6.4.41) become in matrix notation
bAB+A = Bo (6.4.50)
Be = B (6451)
‘The potential condition (6.3.51) is simply equivalent to the symmetry of
‘As noted in Sect.6.3 (6.3.16), detailed balance requires
Lo = oe. (64.52)
Bearing this in mind, we take (6.4.45)
Aotod™ =—B, (6.453)
and from (6.4.50)
her + Aa =
(458)
464 Examples ofDesiled Balance in Fokker Planck Equations 185
which yield
eas (6455)
and with (6.4.52)
atAo) = Ao". Nee
‘These are equivalent to the celebrated Onsager relations; Onsager, (64; Casimir,
[6.5]. The derivation closely follows van Kampen’s [6.1] work
GAS The Onsager Relations
“The physical form of the Onsager relations arises when we introduce phenomeno-
logical forces defined as the gradient of the potential ¢ = logl p(x)
Fix) = Vox) = ox, 457)
(in physics, 4/47 isthe entropy of the system). Because of the Tinear form of the
‘A (6.4.38), the exact equations of mation for (x) are
as)
dr
MD) ~ stay = Ae FH) (6.4.58)
“Thus, ifthe flares d(x) are related linearly to the forces FU(x)) by a mateix L
defined by
L=Ae. (6459)
then (6.4.56) says
as (6.460)
and e, of the same sign, ny
6, and e} of different slg.
Notice also that the relations
eoe=o. (6462)
imply that fy and cry vanish if e; and ¢; have opposite signs.
Tn the special ease that all the have the same sign, we find tht
bent 64.63)
and noting that, since ois symmetric and positive definite it has a real square root
(?, we find that
ebe
deo tac? (6468)
is symmetric, that Ais similar. asymmetric matrix. Hence, al the eigenvalues
of Aa real1566 TheFokkecPlanck Equation in Several Dimensions
L e
SO — t
é avn ES Foe dn
Fig. 6.1. Bec
ict used inthe dorivation of Nyguist's formal
6.6 Significance of the Onsager Relations—Fluctuation-Dissipation Theorem
The Onsager relations are for a set of macroscopically observable quantities. and
thus provide an easily observed consequence of detailed balance, which is itself @
consequence of the reversibility of microscopic equations of motion, as outlined
in Sect. 6.4.2 above. However, to check the validity in a given siuation requires a
knowledge of the covariance matt c.
Fortunately. in such situations statistical mechanics gives us the form of the
stationary distribution, provided this is thermodynamic equilibria in which de-
tailed balance is always satisied. The principle is similar to that used by Langevin
(Sect. 1.22),
Example—Derivation of Nyquists’s Formula: As an example of the use of the
‘Onsager relations, we ean give a derivation of Nyquist’s formula
‘We assume an electric circuit asin Fig. 6.1 in which there is assumed to be a fluc-
‘wating voltage and a fluctuating charge. both of which arise from te system having
‘nonzero temperature, and which we will show have their origin in the resistors R
andr. The elecrical equations aise from conservation of an elecrie charge q and
Kirchoft’s voltage law. The charge equation takes the form
ag
ra
in which we equate the rate of gain of charge on the capacitor to the current i less
leakage term yq = g/rC plus a possible fluctuating term A/(2), which arises from
Jeakage through the capacitor, and whose magnitude will be shortly calculated,
Kirchofl’s voltage law is obtained by adding up all the voltages around the circuit,
including 2 possible Huctuatng voltage AV(
di
a
We now assume that A/(1) and AV(A) ate white noise. We can write in the most
general ease,
AK = bugil) + bre), (6.4.67)
ave = ine. (sass)
wtauy, —y
ire. (6.465)
rbE- save) 6s)
in which €(0) and (0) are uncorrelated Langevin sources, i,
Fokker Planck Equations 157
(64 Examples of Detailed Bal
andr = am. waa
eude = a0. (e470)
ze (and Ws) ar independent Wiener processes
Ths hve
4
|: : (am
ie
The tal ery inthe 5)tem i
BP + te 47)
Batt at
From statistical mechanics, we know that p,(q.i) is piven by the Bolremann Distri-
ution at temperature T. ie.
lg.) = AV expl-E/kT)
eae a7)
=o sada):
where ks the Boltzmann constant, so that the covariance matrix is,
arc 0 on
elo eri!
‘The Onsager symmetry can now be checked:
ear (6.4.78)
=" VeryL RUTH?
For this system g the total charge is even under time inversion and i the current, is
‘odd Thus,
' Tee (6.4.78)
pesaevotent['S 2) ‘
o
lane an eo
h suggests that
and we see that Bp = By; = 0, a8 required by physical inition, which suzgests
the two sources of fluctuations arise from different causes and should be independent.= i
158 6. The Fokker-Planck Equation in Several Dimensions
‘These results are fluctuation dissipation results, The magnitudes ofthe fluctuations
‘hy ace determined by the dissipative terms r and R. In fact, the result (6.4.80) is
precisely Nyquist’s theorem, which we discussed in Sect. 15.4. The noise voltage
and current inthe circuit ae given by
AV) = VETR EM), aly = VT en, (6.482)
so that
{AVr+ DAV(N) = ATR,
(Are AND) = 2T/r6(2), (6.4.83)
(avirw nanny = 0.
‘The first ofthese is Nyquist” theorem in the form quoted in (1.5.48-1.5.50), the
second isthe corresponding result fr the noise regarded as arising from a fluctuation
current Irom the resistor r, and these fuetuations are independent
‘The terms r and B are called dissipative because they give rise to energy dissipa-
‘ion: in fact, dleterministically (ie. setting noise equal to 0),
caer
Ea ee, (6.484)
‘which explicitly exhibits the dissipation,
6.5 Bigenfunction Methods in M:
ny Variables
ere we shall proceed similarly to Sect.5.4, weating only homogeneous processes.
We assume the existence of a complete set of eigenfunctions P(x) of the forward
Fokker-Planck and a set Q,(x) of the backward Fokker Planck equstion, Thus,
= ESUACNPARD + § FAB yCeyPie) = APC), (65.1)
TAMA +L = = vs) 652)
Whether Que) and Py(x) satisfy absorbing or reflecting boundary conditions, one
ean show, in a manner very similar fo that used in Sect. 5.1.2, that
AA-AD fds Pye) =O. 653)
0 that the P(x) and Q(x} from a bi-orihogonal set, which we normalise as
Idx Py Qu(2) = by. 54)
if the spectrum of eigenvalues 1 is discrete. Ifthe spectrum is continuous, the Kro-
recker 8, isto be replaced by (4~ 2’), except where we have reflecting boundary
‘conditions, and thus 4 = 0 corresponds to the stationary state, The normalisation of
ala) then gives
J dx Pox )Qolx) = fidx Pols) = 1 (655)
‘0 that there is also a discrete point with zero eigenvalue inthe specteum then,
(65 Eigenfunetion Methods in Many Variables 159
65.1 Relationship between Forward and Backward Eigenfunctions
‘The functional relationship (5.8) between the P, and Q), which is always tue in
‘one dimension, only pertains in many dimensions if det
‘To show this, note that if detailed balance is valid, we have already seen in
Sect. 6.3 Selii) that plex, —2/px(x) is a solution of the backward Fokker-Planck
‘equation so tha, from the uniqueness of solutions, we ean say
Ou(x) = mPaen pn. 656)
Here, my =