
A quick introduction to the delta function and distributions

Contents

1 Introduction
2 Definitions and basic theory about distributions
3 Example: Solving a simple differential equation
4 Questions

1 Introduction
Physics and engineering textbooks often give a nonrigorous presentation of the following important technique
for solving differential equations of the form
Lu = g (1)
where L is a linear differential operator and g is a continuous function. The technique can be summarized
as follows:
1. First we find a function f which satisfies (or supposedly satisfies)

Lf = δ (2)

where δ is the “delta function”. The delta function is often described as having the following properties:
δ(x) = { ∞ if x = 0, 0 if x ≠ 0 }   and   ∫_ℝ δ(x) dx = 1.

Of course, no such function actually exists!


2. Then the function u = f ∗ g, the convolution of f and g, is a solution to (1).
Let’s illustrate the use of this method by solving the very simple differential equation u′ = g. We first seek a function f which satisfies

f′ = δ. (3)
Proceeding nonrigorously, we invoke the fundamental theorem of calculus and assert that the function
f(x) = ∫_{−∞}^{x} δ(y) dy = { 1 if x > 0, 0 if x < 0 }

satisfies equation (3). This function f is called the Heaviside step function. The next step is to note that if u = f ∗ g then

u(x) = ∫_ℝ f(y) g(x − y) dy = ∫_0^∞ g(x − y) dy = ∫_{−∞}^{x} g(z) dz.

(In the final step, we made the substitution z = x − y.) Despite our nonrigorous claim that f′ = δ, the function u that we have obtained is indeed a genuine solution to u′ = g. Whatever objections might be raised against this method, it actually works.
The above method might seem at first to be complete nonsense, given the lack of a coherent statement
of what δ is. But, the method is in fact quite intuitive if we interpret δ to be merely an “approximate delta
function”, by which I mean a smooth function that has the following properties:
• δ has a spike near the origin.
• δ is zero elsewhere.
• ∫_ℝ δ(x) dx = 1.

Figure 1: An approximate delta function.

(See figure 1.) The intuitive idea is that, since L is shift invariant, we can shift f to obtain a solution to (1)
in the special case where the function on the right is a shifted version of δ (a shifted spike). Then, because
L is linear, we can solve (1) whenever the function on the right is a sum of shifted spikes. The punch line is
that we can think of any continuous function g as a sum of shifted spikes:
Z
g(x) ≈ g(y)δ(x − y) dy. (4)
R

So, invoking the linearity and shift invariance of L, we recognize that if


u(x) = ∫_ℝ g(y) f(x − y) dy

then Lu ≈ g.
Despite its intuitive appeal, the above argument remains nonrigorous. If δ is merely an approximate
delta function, then (4) is only an approximation. Moreover, physicists and engineers do not hesitate to
do other nonrigorous things with the delta function, such as taking the Fourier transform of both sides of
equation (2) in order to solve for f . The theory of distributions gives us a way to precisely define the delta
function and to make such arguments rigorous.
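Before turning to the rigorous theory, the "approximate delta function" idea is easy to test numerically. The sketch below is my own illustration (the names `approx_delta` and `integrate`, and the choice of a narrow Gaussian as the bump, are not from the text): it checks that the bump integrates to roughly 1, and that smoothing a continuous g against it approximately reproduces g, as in equation (4).

```python
import math

def approx_delta(x, eps=0.01):
    # A narrow Gaussian: spikes near 0, nearly zero elsewhere, total integral 1.
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=20000):
    # Simple midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# The approximate delta integrates to (roughly) 1.
total = integrate(approx_delta, -1.0, 1.0)

# Equation (4): g(x0) ≈ ∫ g(y) δ(x0 − y) dy for continuous g.
g = math.cos
x0 = 0.7
smoothed = integrate(lambda y: g(y) * approx_delta(x0 - y), -1.0, 2.0)
print(total, smoothed, g(x0))
```

The smaller `eps` is, the closer the smoothed value sits to g(x0), which is exactly the sense in which (4) is "only an approximation".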

2 Definitions and basic theory about distributions


Let V = C_c^∞(ℝ), the vector space of all functions φ : ℝ → ℝ which are infinitely differentiable and have compact support. The elements of V are called “test functions”.
Definition 1. A distribution is a linear function F : V → ℝ which is continuous in the following sense: if {φ_k} is a sequence of functions in V, φ ∈ V, and for each nonnegative integer n we have

φ_k^(n) → φ^(n) uniformly as k → ∞,

then F(φ_k) → F(φ) as k → ∞. Here φ^(n) is the nth-order derivative of φ. I’ll denote the set of all distributions as V*.

The strange-sounding definition of continuity that we are using is designed, perhaps by trial and error,
so that the “derivative” of F that we will define below is guaranteed to also be a distribution. For more
discussion of how this definition can be motivated, see here.
If F is a distribution and φ is a test function, we’ll use the notation ⟨F, φ⟩ as an alternative way of writing F(φ). So by definition

⟨F, φ⟩ = F(φ).
Example 2.1. If f : R → R is locally integrable, then the function F : V → R defined by
F(φ) = ∫_ℝ f(x) φ(x) dx for all φ ∈ V

is a distribution.
Example 2.2. The function δ : V → R defined by

δ(φ) = φ(0) for all φ ∈ V

is a distribution. It is called the “delta function” or, more properly, the delta distribution.
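Examples 2.1 and 2.2 can be mimicked in code: a distribution is just a callable that eats a test function and returns a number. This is an illustrative sketch of my own (the helper names, the quadrature, and the particular bump function are my choices, not from the text), not a faithful model of V*.

```python
import math

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def distribution_from_function(f, a=-1.0, b=1.0):
    # Example 2.1: F(φ) = ∫ f(x) φ(x) dx (support of φ assumed inside [a, b]).
    return lambda phi: integrate(lambda x: f(x) * phi(x), a, b)

def delta(phi):
    # Example 2.2: the delta distribution just evaluates φ at 0.
    return phi(0.0)

def bump(x):
    # A classic smooth test function with compact support in (−1, 1).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

F = distribution_from_function(lambda x: x * x)
print(F(bump), delta(bump))
```

Note that `delta` involves no integral at all; it is defined directly as evaluation at 0, which is the whole point of the rigorous definition.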
The integration by parts formula

∫_ℝ f′(x) φ(x) dx = −∫_ℝ f(x) φ′(x) dx for all φ ∈ V

(valid when f is continuously differentiable; the boundary terms vanish because φ has compact support) suggests a way to define the derivative of any distribution F.


Definition 2. Let F be a distribution. The function DF : V → R defined by

DF(φ) = −⟨F, φ′⟩ for all φ ∈ V

is called the “derivative” of F .


Our strange-sounding definition of continuity for a distribution was designed in order to make the following
theorem true:
Theorem 1. If F is a distribution, then its derivative DF is also a distribution.
Proof. (Left to reader, for now.)
The operator D : V* → V* which takes a distribution F as input and returns DF as output is called the “derivative operator” on the space V* of distributions. We can compute higher derivatives of a distribution F by applying the operator D repeatedly. For example, D²F = D(DF). Notice that if φ ∈ V then

D²F(φ) = −⟨DF, φ′⟩ = ⟨F, φ″⟩.

More generally, we have

D^n F(φ) = (−1)^n ⟨F, φ^(n)⟩

for any positive integer n. Here φ^(n) denotes the nth derivative of φ.
Example 2.3. Let h : R → R be the Heaviside step function defined by
h(x) = { 1 if x > 0, 0 if x < 0 }.

(I haven’t specified the value of h when x = 0 because it doesn’t matter here.) Let H be the distribution
defined by

H(φ) = ∫_ℝ h(x) φ(x) dx = ∫_0^∞ φ(x) dx.

Then for any φ ∈ V we have

DH(φ) = −⟨H, φ′⟩
      = −∫_0^∞ φ′(x) dx
      = φ(0)    (since φ vanishes outside a compact set)
      = δ(φ).

Thus, DH = δ. This gives a precise meaning to the claim that “the derivative of the Heaviside function is
the delta function”, which is a statement often heard in nonrigorous treatments of this subject.
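Example 2.3 can be checked numerically: DH(φ) = −∫₀^∞ φ′(x) dx should come out equal to φ(0). The sketch below is my own (the finite-difference derivative, the quadrature, and the bump test function are my choices), assuming a test function supported in (−1, 1) so the integral can stop at x = 1.

```python
import math

def bump(x):
    # Smooth test function supported on (−1, 1).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dbump(x, h=1e-5):
    # Central-difference approximation to φ′.
    return (bump(x + h) - bump(x - h)) / (2 * h)

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# DH(φ) = −⟨H, φ′⟩ = −∫₀^∞ φ′(x) dx; φ vanishes past x = 1.
DH_phi = -integrate(dbump, 0.0, 1.0)
print(DH_phi, bump(0.0))   # both should be ≈ e^{−1}
```

This is just the fundamental theorem of calculus in disguise: −∫₀^∞ φ′ = φ(0) − φ(∞) = φ(0), which is δ(φ).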
If f : R → R is locally integrable, and φ is a test function, then the convolution of f and φ is the function
f ∗ φ defined by

(f ∗ φ)(x) = ∫_ℝ f(y) φ(x − y) dy.
The above formula suggests a way to define the convolution of φ with a distribution.
Definition 3. Suppose that F is a distribution and φ is a test function. The convolution of F and φ is the
function F ∗ φ : R → R defined by

(F ∗ φ)(x) = ⟨F, φ(x − ·)⟩ for all x ∈ ℝ.

Here φ(x − ·) is a shorthand notation for the function y ↦ φ(x − y).


Example 2.4. Let’s compute the convolution of the delta distribution δ with a test function φ. If x ∈ R
then

(δ ∗ φ)(x) = ⟨δ, φ(x − ·)⟩
           = φ(x − 0)
           = φ(x).

So we see that
δ ∗ φ = φ.
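In code, Example 2.4 is almost tautological: Definition 3 says (F ∗ φ)(x) is F applied to y ↦ φ(x − y), and for F = δ that evaluates to φ(x). A toy sketch of my own (here `math.cos` stands in for φ; it is not a genuine test function since it lacks compact support, but the evaluation still illustrates the formula):

```python
import math

def delta(phi):
    # The delta distribution: evaluate the test function at 0.
    return phi(0.0)

def convolve(F, phi):
    # Definition 3: (F ∗ φ)(x) = ⟨F, φ(x − ·)⟩.
    return lambda x: F(lambda y: phi(x - y))

phi = math.cos
delta_conv_phi = convolve(delta, phi)
print(delta_conv_phi(0.3), phi(0.3))  # identical: δ ∗ φ = φ
```

Notice that `convolve` returns an ordinary function of x, matching the remark below that convolving a distribution with a test function yields a function, not a distribution.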
It’s interesting to note that when we convolve a distribution with a test function, the result is a function rather than a distribution. That explains how finding a distribution which satisfies equation (2) can lead to finding an actual function that satisfies equation (1). To be specific, we will see that if g is a test function and a distribution F satisfies (2), then the function F ∗ g satisfies (1). In preparation for that, we first state and prove the following key theorem.
Theorem 2. Suppose that F is a distribution and φ is a test function. Then F ∗ φ is differentiable and

(F ∗ φ)′ = F ∗ φ′ = (DF) ∗ φ.

Proof. Let x ∈ ℝ. If t ∈ ℝ and t ≠ 0 then

(F ∗ φ)(x + t) = ⟨F, φ(x + t − ·)⟩ and (F ∗ φ)(x) = ⟨F, φ(x − ·)⟩,

so

[(F ∗ φ)(x + t) − (F ∗ φ)(x)] / t = ⟨F, [φ(x + t − ·) − φ(x − ·)] / t⟩ = ⟨F, ψ_t⟩,

where ψ_t is the function defined by ψ_t(y) = [φ(x + t − y) − φ(x − y)] / t. It can be shown that ψ_t converges uniformly to φ′(x − ·) as t → 0, and likewise for each positive integer n the nth derivative of ψ_t converges uniformly to the nth derivative of φ′(x − ·) as t → 0. By the continuity property of distributions, we see that

lim_{t→0} [(F ∗ φ)(x + t) − (F ∗ φ)(x)] / t = ⟨F, φ′(x − ·)⟩ = (F ∗ φ′)(x).

Thus, F ∗ φ is differentiable and (F ∗ φ)′ = F ∗ φ′.
Next, notice that

((DF) ∗ φ)(x) = ⟨DF, φ(x − ·)⟩
              = −⟨F, [φ(x − ·)]′⟩
              = ⟨F, φ′(x − ·)⟩ (by the chain rule)
              = (F ∗ φ′)(x).

This shows that (DF) ∗ φ = F ∗ φ′.
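Theorem 2 can be sanity-checked numerically with F = H from Example 2.3: a finite-difference derivative of H ∗ φ should match ((DH) ∗ φ)(x) = (δ ∗ φ)(x) = φ(x). The sketch is my own (quadrature, step sizes, and the bump test function are my choices), assuming φ is supported in (−1, 1) so the convolution integral can be truncated at y = x + 1.

```python
import math

def bump(x):
    # Smooth test function supported on (−1, 1).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def H_conv(phi, x):
    # (H ∗ φ)(x) = ⟨H, φ(x − ·)⟩ = ∫₀^∞ φ(x − y) dy; φ's support caps the range.
    return integrate(lambda y: phi(x - y), 0.0, x + 1.0)

x0, h = 0.25, 1e-4
# Left side of Theorem 2: numerical derivative of H ∗ φ at x0.
lhs = (H_conv(bump, x0 + h) - H_conv(bump, x0 - h)) / (2 * h)
# Right side: ((DH) ∗ φ)(x0) = (δ ∗ φ)(x0) = φ(x0).
rhs = bump(x0)
print(lhs, rhs)
```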
Repeatedly applying theorem 2 yields the following corollary.
Corollary 1. If F is a distribution and φ is a test function, then F ∗ φ is infinitely differentiable, and for any positive integer n we have

(F ∗ φ)^(n) = F ∗ φ^(n) = (D^n F) ∗ φ. (5)

Proof. By theorem 2, we know that F ∗ φ is differentiable and that (F ∗ φ)′ = F ∗ φ′ = (DF) ∗ φ. Invoking theorem 2 again, we see that (DF) ∗ φ is differentiable and its derivative is (D²F) ∗ φ. Because φ′ is a test function, theorem 2 also tells us that the derivative of F ∗ φ′ is F ∗ φ″. Thus, (F ∗ φ)′ is differentiable and

(F ∗ φ)″ = F ∗ φ″ = (D²F) ∗ φ.

Continuing like this, we see that F ∗ φ is in fact infinitely differentiable, and that equation (5) holds for any positive integer n.
Corollary 2. Let L : V* → V* be a differential operator, which means that

L = Σ_{n=1}^{N} c_n D^n

for some real numbers c_1, . . . , c_N. Suppose that F is a distribution which satisfies L(F) = δ. If g is a test function, then the function u = F ∗ g satisfies

Σ_{n=1}^{N} c_n u^(n) = g.

(Here u^(n) denotes the nth derivative of u.)


Proof. Using corollary 1, we see that

Σ_{n=1}^{N} c_n (F ∗ g)^(n) = Σ_{n=1}^{N} c_n (D^n F) ∗ g
                            = (Σ_{n=1}^{N} c_n D^n F) ∗ g
                            = (L(F)) ∗ g
                            = δ ∗ g
                            = g.
3 Example: Solving a simple differential equation
In this section we’ll illustrate the use of distributions in solving differential equations by solving the very
simple differential equation
u′ = g
where g is a test function. First note that the Heaviside distribution H defined in example 2.3 satisfies

DH = δ.

It follows from corollary 2 that the function u = H ∗ g satisfies u′ = g. All that remains is to find a more explicit formula for H ∗ g. If x ∈ ℝ, then

(H ∗ g)(x) = ⟨H, g(x − ·)⟩ = ∫_0^∞ g(x − y) dy.

Making the change of variable z = x − y, we see that

(H ∗ g)(x) = −∫_x^{−∞} g(z) dz = ∫_{−∞}^{x} g(z) dz.

Of course, we could have found this antiderivative directly using the fundamental theorem of calculus. But,
the above calculation illustrates a general technique — we first find a fundamental solution (in this case H),
and then we convolve the fundamental solution with g to obtain a solution to Lu = g.
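The whole section can be replayed numerically: convolve g with the Heaviside function and compare against the antiderivative ∫_{−∞}^x g(z) dz. This sketch is my own (the choice of a smooth bump as g and the quadrature details are my assumptions), and it also checks u′ = g with a centered difference.

```python
import math

def bump(x):
    # A smooth, compactly supported right-hand side g on (−1, 1).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(f, a, b, n=20000):
    # Midpoint rule on [a, b].
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def u(x):
    # u = H ∗ g:  (H ∗ g)(x) = ∫₀^∞ g(x − y) dy, truncated where g vanishes.
    return integrate(lambda y: bump(x - y), 0.0, x + 1.0)

def antiderivative(x):
    # Direct formula from the text: ∫_{−∞}^{x} g(z) dz (g vanishes below −1).
    return integrate(bump, -1.0, x)

x0 = 0.4
print(u(x0), antiderivative(x0))  # the two formulas for H ∗ g agree

# And u′ = g: check with a centered difference.
h = 1e-4
print((u(x0 + h) - u(x0 - h)) / (2 * h), bump(x0))
```

The same two-step recipe (compute F with L(F) = δ, then form F ∗ g) carries over to other operators L once a fundamental solution is known.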

4 Questions
• When defining the convolution of a distribution F and a function g, can I loosen the restrictions on g?
It seems overly restrictive to assume that g is infinitely differentiable and has compact support.

• How can I extend this approach to solve Lu = g, where g is a less nice function? For example, can I
drop the assumption that g has compact support?
