
1617 Theories of Matter Space and Time

The document outlines the course PHYS3007, focusing on theories of matter, space, and time, covering topics such as the principles of least action, special relativity, electromagnetism, and quantum mechanics. It includes detailed sections on various physical laws, mathematical formulations, and applications in optics and dynamics. Learning outcomes emphasize understanding key equations and principles, such as the Euler-Lagrange equations and conservation laws.

Uploaded by

ripek83921

PHYS3007

Theories of Matter, Space and Time

Steve King
Room: 5025
Phone: 22056
e-mail: king@[Link]
Contents

1 Principles Of Least Action
    1.1 Optics
        1.1.1 Snell’s Law
        1.1.2 Complicated Problems
    1.2 Calculus of Variation
    1.3 More Optics
        1.3.1 Light in Vacuum
        1.3.2 Light in the Atmosphere
    1.4 First Integrals
    1.5 Newtonian Dynamics
        1.5.1 Multiple Coordinates
        1.5.2 Example 1: Projectile Motion
        1.5.3 Example 2: Double Pendulum
    1.6 Conservation Laws
        1.6.1 Ignorable Coordinates
        1.6.2 Energy Conservation
        1.6.3 Example - Central Forces
        1.6.4 Hamiltonian and Energy

2 Special Relativity
    2.1 The Postulates
    2.2 Lorentz Transformations
        2.2.1 Time Dilation
        2.2.2 Lorentz Contraction
    2.3 An Analogy to Rotations
    2.4 Four Vectors
        2.4.1 Index Convention
    2.5 The Laws of Dynamics
        2.5.1 Four-velocity
        2.5.2 Four Acceleration
        2.5.3 Four Momentum
        2.5.4 Hypothesis for Dynamical Law
    2.6 Physics with Four-Momentum
        2.6.1 The Doppler Effect
        2.6.2 The Compton Effect
        2.6.3 Fixed Target Experiments
        2.6.4 The GZK Bound
    2.7 Tensors
    2.8 Relativistic Action
    2.9 Appendix 1 - Lorentz Transformations and Rotations II

3 Electromagnetism
    3.1 Integral Form of Maxwell’s Equations
        3.1.1 Gauss’ Law
        3.1.2 No Magnetic Charges
        3.1.3 Faraday’s Law
        3.1.4 Ampere’s Law
    3.2 Differential Form of Maxwell’s Equations
        3.2.1 Maxwell’s Equations in Differential Form
        3.2.2 Conservation of Charge
        3.2.3 The Displacement Current
    3.3 Potentials
        3.3.1 Electric Potential
        3.3.2 Vector Potential
        3.3.3 A New Electric Potential
        3.3.4 Gauge Transformations
        3.3.5 Maxwell’s Equations in Lorentz Gauge
    3.4 Relativistic Formulation Of Electromagnetism
        3.4.1 Four-vector Current
        3.4.2 Conservation of Charge
        3.4.3 The Four Vector ∂µ
        3.4.4 Four Vector Potential
        3.4.5 A Moving Point Charge
        3.4.6 The Electromagnetic Field Strength Tensor
        3.4.7 Lorentz Transformations of Electric and Magnetic Fields
        3.4.8 The Relativistic Force Law
    3.5 The Lagrangian For a Charged Particle
    3.6 Appendix 1 - Gauss’ and Stokes’ Theorems
        3.6.1 Gauss’ Theorem
        3.6.2 Stokes’ Theorem
    3.7 Appendix 2 - Vector Identities

4 Quantum Mechanics
    4.1 Introduction
    4.2 Quantum Mechanics Review
        4.2.1 Time Independent Schrödinger Equation
        4.2.2 Interpretation
        4.2.3 Momentum Space Wave Functions
        4.2.4 Square Well Example
        4.2.5 Completeness
        4.2.6 Orthogonality
    4.3 Klein-Gordon equation
        4.3.1 The Schrödinger equation
        4.3.2 The Relativistic Schrödinger Equation
        4.3.3 Interpretation of negative energy states

Chapter 1

Principles Of Least Action

Contents

• Fermat’s Principle of least time

• The Euler-Lagrange equations

• Light in vacuum and in media

• Lagrangian dynamics and Hamilton’s principle

• Symmetries and conservation laws

Learning Outcomes

• Know the Euler Lagrange equations and how to apply them

• Know Fermat’s Principle of least time

• Be able to formulate and determine the Euler Lagrange eqns for the
path travelled by light in media

• Know Hamilton’s Principle

• Know the Lagrangian for a particle in a potential and be able to determine the Euler Lagrange equations in simple problems

• Be able to explain the connection between ignorable coordinates and conserved quantities

Reference Books

• PHYS2006 Classical Mechanics notes.

• Analytical Mechanics, GR Fowles and GI Cassiday.

• Classical Mechanics, TL Chow.

• Perfect Form, DS Lemons.

We are going to explore an alternative formulation of classical mechanics
which at first sight appears very different from Newton’s Laws. It is a formal-
ism that grew out of optics and will allow us to study an area of mathematics
called “calculus of variation”. Of course it must turn out to be the same as
Newton’s laws. This alternative formalism makes some problems easier to
solve but more importantly it will give us new insights into conservation laws.

1.1 Optics
Our starting point will be to think about the path that light travels by. In
these enlightened times we might start from Maxwell’s equations and derive
a wave equation with light waves as solutions to determine how the light
propagates. Before this machinery was available, though, Fermat proposed
Fermat’s Principle of Least Time: Light propagates between two points
so as to minimize its travel time.
Thus, for example, in a uniform medium where the speed of light c is a
constant the minimum time of travel

$$ t = \frac{d}{c} \tag{1.1} $$

is given by the path of shortest distance d, i.e. a straight line. This is still a
perfectly good (if limited) description of light.
We can obtain more interesting results by thinking about media where
the speed of light changes.

1.1.1 Snell’s Law


Consider two neighbouring regions of space in which light travels at different
speeds v1 , v2 - for example a glass air interface. We will be interested in
the light that travels from the point (x1 , y1 ) in the first medium to the point
(x2 , y2 ) in the second. In any one medium light travels in a straight line but in
this case we have some choice in where the light crosses between the media.
Let’s consider the arbitrary crossing point (x, y = 0). The time of travel is

$$ T[x] = \frac{d_1}{v_1} + \frac{d_2}{v_2} = \frac{\sqrt{(x - x_1)^2 + y_1^2}}{v_1} + \frac{\sqrt{(x - x_2)^2 + y_2^2}}{v_2} \tag{1.2} $$

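[Figure: a light ray travels a distance d1 from (x1, y1) in material 1 (speed v1) to the crossing point (x, 0), then a distance d2 on to (x2, y2) in material 2 (speed v2). θ1 and θ2 label the angles of the ray in each material.]

We now want to find the path (i.e. the value of x through which it passes)
which minimizes the time taken. Thus

$$ \frac{dT}{dx} = \frac{x - x_1}{v_1 \sqrt{(x - x_1)^2 + y_1^2}} - \frac{x_2 - x}{v_2 \sqrt{(x_2 - x)^2 + y_2^2}} = 0 \tag{1.3} $$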

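This equation though is just

$$ \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2} \tag{1.4} $$

which is Snell’s law, with θ1 and θ2 measured from the normal to the interface.
In terms of the index of refraction, which is defined relative to the vacuum
as

$$ n_1 = \frac{c}{v_1} \tag{1.5} $$

this reads

$$ n_1 \sin\theta_1 = n_2 \sin\theta_2 \tag{1.6} $$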
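
As a numerical sanity check of the argument above, the following Python sketch minimizes the travel time (1.2) directly and compares sin θ1 / v1 with sin θ2 / v2 at the minimum. The function names, the golden-section search and the sample end points and speeds are illustrative assumptions, not part of the notes.

```python
import math

def travel_time(x, p1, p2, v1, v2):
    """Travel time (1.2) for a ray crossing the interface y = 0 at (x, 0)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x - x1, y1) / v1 + math.hypot(x - x2, y2) / v2

def minimising_crossing(p1, p2, v1, v2, iterations=200):
    """Golden-section search for the x that minimizes the travel time.

    T[x] is a sum of convex functions, so it has a single minimum,
    and that minimum lies between x1 and x2.
    """
    lo, hi = min(p1[0], p2[0]), max(p1[0], p2[0])
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if travel_time(a, p1, p2, v1, v2) < travel_time(b, p1, p2, v1, v2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

def snell_ratios(p1, p2, v1, v2):
    """Return (sin(theta1)/v1, sin(theta2)/v2) at the minimizing crossing point."""
    x = minimising_crossing(p1, p2, v1, v2)
    sin1 = (x - p1[0]) / math.hypot(x - p1[0], p1[1])
    sin2 = (p2[0] - x) / math.hypot(p2[0] - x, p2[1])
    return sin1 / v1, sin2 / v2
```

Calling `snell_ratios((0.0, 1.0), (1.0, -1.0), 1.0, 0.75)` should return two numbers that agree to within the accuracy of the search, which is the content of (1.4).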

1.1.2 Complicated Problems


We can imagine more complicated problems than that above where the index
of refraction is an arbitrary function of position. For example consider light
moving in a plane where the speed of the light is v(x, y).
Different paths are described by different functions y(x). The time to
travel along an arbitrary little piece of path is

$$ \Delta T = \frac{\text{distance}}{\text{velocity}} = \frac{\sqrt{dx^2 + dy^2}}{v(x, y)} \tag{1.7} $$

Summing such contributions up along a path gives the total time of travel

[Figure: a path y(x) in the x, y plane running between x = xa and x = xb.]

$$ T[y(x)] = \int_{x_a}^{x_b} \frac{1}{v(x, y)} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx \tag{1.8} $$
Now we want to find the path y(x) that gives the minimum time. This
is the sort of problem that Calculus of Variation is designed to address.

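Example: to remind yourself about partial differentiation. If a function is
defined by

$$ T = a(t)\, b(t)^3\, \dot{b}(t)\, t^{10} $$

where the dot indicates a derivative with respect to t, give expressions for

$$ \frac{\partial T}{\partial t}, \quad \frac{\partial T}{\partial b}, \quad \frac{\partial T}{\partial \dot{b}}, \quad \frac{dT}{dt} $$

The answer is

$$ \frac{\partial T}{\partial t} = 10 a b^3 \dot{b} t^9 , $$

$$ \frac{\partial T}{\partial b} = 3 a b^2 \dot{b} t^{10} , $$

$$ \frac{\partial T}{\partial \dot{b}} = a b^3 t^{10} , $$

$$ \frac{dT}{dt} = \dot{a} b^3 \dot{b} t^{10} + 3 a b^2 \dot{b}^2 t^{10} + a b^3 \ddot{b} t^{10} + 10 a b^3 \dot{b} t^9 . $$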
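
The total derivative in the last line can be checked numerically. The sketch below picks a(t) = sin t and b(t) = t² + 1 (arbitrary choices made only for this illustration) and compares the formula for dT/dt with a central finite difference.

```python
import math

# Illustrative choices for the functions a(t) and b(t); any smooth functions would do.
def a(t): return math.sin(t)
def a_dot(t): return math.cos(t)
def b(t): return t * t + 1.0
def b_dot(t): return 2.0 * t
def b_ddot(t): return 2.0

def T(t):
    """T = a b^3 b_dot t^10 evaluated along the chosen a(t), b(t)."""
    return a(t) * b(t) ** 3 * b_dot(t) * t ** 10

def total_derivative(t):
    """dT/dt from the four-term formula in the worked answer."""
    return (a_dot(t) * b(t) ** 3 * b_dot(t) * t ** 10
            + 3.0 * a(t) * b(t) ** 2 * b_dot(t) ** 2 * t ** 10
            + a(t) * b(t) ** 3 * b_ddot(t) * t ** 10
            + 10.0 * a(t) * b(t) ** 3 * b_dot(t) * t ** 9)

def numerical_derivative(t, h=1e-5):
    """Central finite difference approximation to dT/dt."""
    return (T(t + h) - T(t - h)) / (2.0 * h)
```

The two functions `total_derivative` and `numerical_derivative` should agree closely at any t, whereas the partial derivative ∂T/∂t alone (the last term only) would not.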

1.2 Calculus of Variation
The mathematical method will be more widely applicable than the optics
problems so far discussed so we will use a more general notation in this
section.
Consider a set of curves q(s) between two points (q1 , s1 ) and (q2 , s2 ) in the
s, q plane (we will only consider curves where the trajectory is single valued
at each value of s).

[Figure: a curve q(s) in the s, q plane joining the point (s1, q1) to the point (s2, q2).]

Imagine we are interested in the one curve that minimizes the quantity

$$ S[q(s)] = \int_{s_1}^{s_2} L(q, \dot{q}, s)\, ds \tag{1.9} $$

L is just a number at each point on a given curve determined by the values
of q and s at that point and the gradient q̇ = dq/ds. The integral sums these
numbers along the line.
If the curve that minimizes S is q̄(s) we can write the other curves as
deviations from it

q(s) = q̄(s) + δq(s) (1.10)


subject to the boundary conditions

δq(s1 ) = δq(s2 ) = 0 (1.11)


The value of S for these curves varies from the value for q̄(s) by

δS = S[q̄ + δq] − S[q̄] (1.12)

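Since q̄(s) is the minimum, though, δS = 0 to lowest order in δq.
Let’s calculate S[q̄ + δq] to order δq

$$ \begin{aligned} S[\bar{q} + \delta q] &= \int_{s_1}^{s_2} L(\bar{q} + \delta q, \dot{\bar{q}} + \delta\dot{q}, s)\, ds \\ &\simeq \int_{s_1}^{s_2} \left[ L(\bar{q}, \dot{\bar{q}}, s) + \delta\dot{q} \frac{\partial L}{\partial \dot{q}} + \delta q \frac{\partial L}{\partial q} + \dots \right] ds \\ &\simeq S[\bar{q}] + \int_{s_1}^{s_2} \left[ \delta\dot{q} \frac{\partial L}{\partial \dot{q}} + \delta q \frac{\partial L}{\partial q} \right] ds + O(\delta q^2) \end{aligned} \tag{1.13} $$

Integrating the δq̇ term by parts (u = ∂L/∂q̇, dv/ds = δq̇ etc)

$$ \int_{s_1}^{s_2} \delta\dot{q} \frac{\partial L}{\partial \dot{q}}\, ds = \left[ \delta q \frac{\partial L}{\partial \dot{q}} \right]_{s_1}^{s_2} - \int_{s_1}^{s_2} \delta q \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}} \right) ds \tag{1.14} $$

The first term vanishes since δq vanishes at the ends of the path.

Thus

$$ S[\bar{q} + \delta q] - S[\bar{q}] = -\int_{s_1}^{s_2} \delta q \left[ \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}} \right) - \frac{\partial L}{\partial q} \right] ds + \dots \tag{1.15} $$

This is only zero (at order δq) for every allowed δq if

$$ \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}} \right) - \frac{\partial L}{\partial q} = 0 \tag{1.16} $$

This is the Euler Lagrange equation.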

Example: If a system with generalized coordinate q has the Lagrangian

$$ L = \frac{1}{2} \dot{q}^2 - q^3 $$

what is the Euler Lagrange equation describing the system?
Answer: the corresponding Euler-Lagrange equation is

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}} \right) - \frac{\partial L}{\partial q} = 0 . $$

Given that ∂L/∂q̇ = q̇ and ∂L/∂q = −3q², the equation reads q̈ + 3q² = 0.
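
One way to test an equation of motion like this is to integrate it numerically and watch a quantity that it should hold fixed. Here d/dt(½q̇² + q³) = q̇(q̈ + 3q²), which vanishes on solutions. The sketch below uses a standard fourth-order Runge-Kutta step; the initial condition, step size and function names are assumptions made for illustration, and the run is kept short because q³ is unbounded below.

```python
def acceleration(q):
    """Euler Lagrange equation for L = q_dot**2 / 2 - q**3, i.e. q_ddot = -3 q**2."""
    return -3.0 * q * q

def rk4_step(q, v, dt):
    """Advance (q, q_dot) by one fourth-order Runge-Kutta step."""
    k1q, k1v = v, acceleration(q)
    k2q, k2v = v + 0.5 * dt * k1v, acceleration(q + 0.5 * dt * k1q)
    k3q, k3v = v + 0.5 * dt * k2v, acceleration(q + 0.5 * dt * k2q)
    k4q, k4v = v + dt * k3v, acceleration(q + dt * k3q)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return q_new, v_new

def first_integral(q, v):
    """The combination q_dot**2 / 2 + q**3, constant along solutions."""
    return 0.5 * v * v + q ** 3

def drift(q0=1.0, v0=0.0, dt=1e-3, steps=500):
    """Largest change in the first integral seen during the integration."""
    q, v = q0, v0
    start = first_integral(q, v)
    worst = 0.0
    for _ in range(steps):
        q, v = rk4_step(q, v, dt)
        worst = max(worst, abs(first_integral(q, v) - start))
    return worst
```

`drift()` should come out many orders of magnitude smaller than the value of the first integral itself, consistent with q̈ + 3q² = 0.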

1.3 More Optics
Let’s return to the problem of the path light travels in a plane where the
speed of light depends on the position in the plane. We had found that the
time taken to traverse a path is
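$$ T[y] = \int_{x_a}^{x_b} L(y, \dot{y}, x)\, dx \tag{1.17} $$

where

$$ L(y, \dot{y}, x) = \frac{1}{v(x, y)} \sqrt{1 + \dot{y}^2} \tag{1.18} $$

and the dot now indicates a derivative with respect to x. This is of the form that leads to the Euler Lagrange equation

$$ \frac{d}{dx}\left( \frac{\partial L}{\partial \dot{y}} \right) - \frac{\partial L}{\partial y} = 0 \tag{1.19} $$

Let’s look at a couple of examples.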

1.3.1 Light in Vacuum


In vacuum the speed of light is a constant so v(x, y) = c.
The Euler Lagrange equation is

$$ \frac{d}{dx}\left( \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}} \right) = 0 \tag{1.20} $$

Integrating this gives

$$ \frac{\dot{y}}{\sqrt{1 + \dot{y}^2}} = \text{constant} \tag{1.21} $$

The only solution of this is

$$ \dot{y} = \text{constant} = m \tag{1.22} $$

or integrating

$$ y = mx + c \tag{1.23} $$

i.e. a straight line (here c is the intercept, not the speed of light). This is our first example of the solution of the Euler
Lagrange equation giving the path that minimizes T. m and c are determined
by the initial and final position of the light.

1.3.2 Light in the Atmosphere
In the atmosphere the air temperature and density change with height, resulting in the speed of light depending on height - v(h). Equivalently we can
write the refractive index n(h) with

$$ v(h) = \frac{c}{n(h)} \tag{1.24} $$

Our result for the length of time light takes to travel some path h(x) can
be written as an optical path length

$$ cT[h] = \int_{x_1}^{x_2} L\, dx, \qquad L = n(h) \sqrt{1 + \dot{h}^2} \tag{1.25} $$

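We can use the fact that L is independent of x to simplify the Euler
Lagrange equation as follows. Note that

$$ \frac{dL}{dx} = \frac{\partial L}{\partial x} + \dot{h} \frac{\partial L}{\partial h} + \ddot{h} \frac{\partial L}{\partial \dot{h}} \tag{1.26} $$

The first term on the right is zero. Now replace ∂L/∂h using the Euler Lagrange
equation

$$ \frac{\partial L}{\partial h} = \frac{d}{dx}\left( \frac{\partial L}{\partial \dot{h}} \right) \tag{1.27} $$

and we find

$$ \frac{dL}{dx} = \dot{h} \frac{d}{dx}\left( \frac{\partial L}{\partial \dot{h}} \right) + \ddot{h} \frac{\partial L}{\partial \dot{h}} \tag{1.28} $$

which is just

$$ \frac{d}{dx}\left( L - \dot{h} \frac{\partial L}{\partial \dot{h}} \right) = 0 \tag{1.29} $$

which gives us

$$ L - \dot{h} \frac{\partial L}{\partial \dot{h}} = \text{constant} = D \tag{1.30} $$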
Note that this is only a first order equation rather than the second order
Euler Lagrange equation so is simpler to solve.
In our problem, using the explicit form for L above we have

$$ n\sqrt{1 + \dot{h}^2} - \frac{\dot{h}^2 n}{\sqrt{1 + \dot{h}^2}} = D \tag{1.31} $$

which simplifies to

$$ \frac{n}{\sqrt{1 + \dot{h}^2}} = D \tag{1.32} $$

Note that the physical meaning of D is the value of the index of refraction
at the point where the light ray becomes horizontal so that ḣ = 0.
Squaring and rearranging we find

$$ \frac{dh}{dx} = \sqrt{\frac{n^2}{D^2} - 1} \tag{1.33} $$

Thus

$$ x - x_0 = \int_{h_0}^{h} \frac{dh}{\sqrt{\frac{n^2}{D^2} - 1}} \tag{1.34} $$

Explicit Example: Consider a ray of light that begins moving horizontally
(ḣ = 0) at h = 0 in an atmosphere where

$$ n(h) = n_0 - \lambda h \tag{1.35} $$

where λ is some constant. We must solve the integral

$$ x = \int \frac{dh}{\sqrt{\frac{(n_0 - \lambda h)^2}{D^2} - 1}} \tag{1.36} $$

This can be done by changing variables to

$$ n_0 - \lambda h = D \cosh\phi \tag{1.37} $$

The integral becomes

$$ x = -\frac{D}{\lambda} \int d\phi = -\frac{D}{\lambda} \phi + c \tag{1.38} $$

Returning to the original coordinates and requiring the boundary conditions
ḣ(x = 0) = 0 and h(x = 0) = 0 (which fix D = n0 and c = 0) gives the result

$$ h = \frac{n_0}{\lambda} \left( 1 - \cosh\frac{\lambda x}{n_0} \right) \tag{1.39} $$
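
It is easy to confirm that this curve satisfies the first integral n/√(1 + ḣ²) = D with D = n0. The Python sketch below evaluates both sides for a few values of x. The function names and the sample values of n0 and λ are illustrative assumptions.

```python
import math

def ray_height(x, n0, lam):
    """The ray h(x) = (n0 / lam) * (1 - cosh(lam * x / n0)) from (1.39)."""
    return (n0 / lam) * (1.0 - math.cosh(lam * x / n0))

def ray_slope(x, n0, lam):
    """dh/dx obtained by differentiating ray_height by hand."""
    return -math.sinh(lam * x / n0)

def first_integral(x, n0, lam):
    """n(h) / sqrt(1 + h_dot**2) along the ray, with n(h) = n0 - lam * h."""
    n = n0 - lam * ray_height(x, n0, lam)
    return n / math.sqrt(1.0 + ray_slope(x, n0, lam) ** 2)
```

For any x, `first_integral(x, n0, lam)` should return n0 up to rounding. The sign of `ray_height` also shows which way the ray bends: below the starting height for positive λ and above it for negative λ.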

• When λ is positive n(h) decreases with altitude - this is what normally
happens in the atmosphere. Plotting the solution we find the form

[Figure: the ray h(x) through (0,0) curves downwards; the observer extrapolates the arriving ray back along a straight line, the apparent light path.]

Thus if we look up at the Empire State building it will appear taller than
it actually is.

• If the air near the ground is hotter than the air above it (as over a hot road) then λ is negative, so n(h) increases with
altitude. Plotting the solution we find the form

[Figure: the ray h(x) through (0,0) curves upwards.]

We see “the sky on the ground” - a mirage.

Example:

(a) Consider a fibre optic cable lying in the z direction. The cable is made
of glass with index of refraction n(r), where r is the radial distance from
the centre of the cable. Working in cylindrical coordinates (r, θ, z) show that
Fermat’s Principle implies light travels on the path minimizing the quantity

$$ \int_{z_1}^{z_2} f(r(z), \theta(z), r'(z), \theta'(z))\, dz = \int_{z_1}^{z_2} n(r) \sqrt{r'^2 + r^2 \theta'^2 + 1}\, dz $$

where a prime indicates differentiation with respect to z. z1 and z2 are the
z-coordinates of the end points of the path.

(b) If a light ray initially has θ′ = 0 show, from the appropriate Euler Lagrange equation, that the θ independence of f implies the path followed by
the light is described by a constant value of θ.

(c) Use the z independence of f to deduce that the first order differential
equation for rays travelling paths with constant θ is

$$ f - \frac{\partial f}{\partial r'} r' = \text{constant}. $$

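Answer: (a) The distance between two neighbouring points is given, in Cartesian coordinates, by dL = √((Δx)² + (Δy)² + (Δz)²) which, in cylindrical polar coordinates, reads
dL = √((dr)² + (r dθ)² + (dz)²). The time to travel that distance is given by
time = dL/v = n(r) dL/c, where n(r) is the refractive index. For a given path
between two end points

$$ cT = \int_{z_1}^{z_2} n(r) \sqrt{\left( \frac{dr}{dz} \right)^2 + r^2 \left( \frac{d\theta}{dz} \right)^2 + 1}\, dz . $$

(b) f is independent of θ, therefore the corresponding Euler-Lagrange equation reads

$$ \frac{d}{dz}\left( \frac{\partial f}{\partial \theta'} \right) = 0 \quad\rightarrow\quad \frac{\partial f}{\partial \theta'} = \text{constant} $$

Replacing f by the integrand, we have

$$ \frac{n(r)\, r^2 \theta'}{\sqrt{r'^2 + (r\theta')^2 + 1}} = \text{constant} $$

If θ′ = 0 initially, then constant = 0. This implies that θ′ = 0 for all z
(given that throughout the path n(r) and r are non zero), and the trajectories
have θ = constant.
(c) f is z-independent, then

$$ \frac{df}{dz} = \frac{\partial f}{\partial r} r' + \frac{\partial f}{\partial r'} r'' + \frac{\partial f}{\partial \theta} \theta' + \frac{\partial f}{\partial \theta'} \theta'' . $$

The third term on the right hand side is zero, and the Euler-Lagrange equations for r and θ are

$$ \frac{d}{dz}\left( \frac{\partial f}{\partial r'} \right) - \frac{\partial f}{\partial r} = 0 , \qquad \frac{d}{dz}\left( \frac{\partial f}{\partial \theta'} \right) = 0 . $$

We can now rewrite the first term on the right hand side of the expansion of df/dz using its Euler-Lagrange equation and we get

$$ \frac{df}{dz} - \frac{d}{dz}\left( \frac{\partial f}{\partial r'} \right) r' - \frac{\partial f}{\partial r'} r'' - \frac{\partial f}{\partial \theta'} \theta'' = 0 , $$

which, since ∂f/∂θ′ is constant, we can write in a compact way

$$ \frac{d}{dz}\left( f - \frac{\partial f}{\partial r'} r' - \frac{\partial f}{\partial \theta'} \theta' \right) = 0 \quad\rightarrow\quad f - \frac{\partial f}{\partial r'} r' - \frac{\partial f}{\partial \theta'} \theta' = \text{constant} . $$

For paths with constant θ this reduces to

$$ f - \frac{\partial f}{\partial r'} r' = \text{constant} $$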

1.4 First Integrals


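We have seen two situations in which the Euler Lagrange equation simplifies from a second order equation to a first order equation. These will be
important when we come on to Newtonian dynamics. Let’s review them:

1) If L(q, q̇, s) is independent of the coordinate q

$$ \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}} \right) = 0 \tag{1.40} $$

So

$$ \frac{\partial L}{\partial \dot{q}} = \text{constant} \tag{1.41} $$

2) If L(q, q̇, s) does not depend explicitly on the parameter s

$$ \frac{dL}{ds} = \dot{q} \frac{\partial L}{\partial q} + \ddot{q} \frac{\partial L}{\partial \dot{q}} \tag{1.42} $$

using the Euler Lagrange equation gives

$$ \frac{dL}{ds} = \dot{q} \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}} \right) + \ddot{q} \frac{\partial L}{\partial \dot{q}} \tag{1.43} $$

which is just

$$ \frac{d}{ds}\left( L - \dot{q} \frac{\partial L}{\partial \dot{q}} \right) = 0 \tag{1.44} $$

so that

$$ L - \dot{q} \frac{\partial L}{\partial \dot{q}} = \text{constant} \tag{1.45} $$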

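Example: If a system with generalized coordinates x and y has the Action

$$ S = \int \left( \frac{1}{2} \dot{x}^2 + \frac{1}{2} \dot{y}^2 + \cos\dot{y} - x \right) dt $$

what quantities are conserved?

Answer: The Lagrangian, L = ½ẋ² + ½ẏ² + cos(ẏ) − x, does not depend on
either y or t. Therefore the momentum along the y direction is conserved,

$$ p_y = \frac{\partial L}{\partial \dot{y}} = \dot{y} - \sin(\dot{y}) , $$

as well as the Hamiltonian,

$$ \begin{aligned} H = \sum_i \frac{\partial L}{\partial \dot{q}_i} \dot{q}_i - L &= \dot{x}^2 + \dot{y}^2 - \dot{y}\sin(\dot{y}) - \frac{1}{2} \dot{x}^2 - \frac{1}{2} \dot{y}^2 - \cos(\dot{y}) + x \\ &= \frac{1}{2} \dot{x}^2 + \frac{1}{2} \dot{y}^2 - \dot{y}\sin(\dot{y}) - \cos(\dot{y}) + x . \end{aligned} $$
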
1.5 Newtonian Dynamics
We have seen that the motion of light can be described by a “principle of
least time”. Is there an equivalent rule that would describe the motion of a
particle in Newtonian dynamics? There is and it is enshrined as
Hamilton’s Principle: A particle travels by the path between two points
that minimizes the Action.
We need to know what the “action” is. Let’s write it first for one dimen-
sional motion. The action is

$$ S[\text{path}] = \int_{t_a}^{t_b} L(x, \dot{x}, t)\, dt \tag{1.46} $$

where the dot indicates differentiation with respect to the time, t. L is
known as the Lagrangian and is given by

$$ L = \text{kinetic energy} - \text{potential energy} = T - V \tag{1.47} $$

From our calculus of variation result we know that the path that minimizes the action satisfies the Euler Lagrange equation

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0 \tag{1.48} $$

We can now check to see if any of this makes sense (!). For a non-relativistic particle in a one dimensional potential we have

$$ L = T - V = \frac{1}{2} m \dot{x}^2 - V(x) \tag{1.49} $$

The Euler Lagrange equation is therefore

$$ \frac{d}{dt}(m\dot{x}) + \frac{\partial V}{\partial x} = 0 \tag{1.50} $$

which is Newton’s second law since

$$ F = -\frac{\partial V}{\partial x} \tag{1.51} $$

Note that the momentum of the particle is given by

$$ p = m\dot{x} = \frac{\partial L}{\partial \dot{x}} \tag{1.52} $$

1.5.1 Multiple Coordinates
We will want to solve problems in more than one dimension. The formalism is
easily adapted. The definition of the Action above in terms of the Lagrangian
(L = T − V ) remains the same. Suppose now we have several coordinates

$$ q_i, \qquad i = 1 \dots n \tag{1.53} $$

(so for example we might call x = q1, y = q2, z = q3 etc). In the derivation
of the Euler Lagrange equation above (section 1.2) we would have to take into
account deviations in the path in all of these coordinates. We would find
that the change in the action of a path close to the minimizing path would
have the form

$$ \Delta S = -\int_{s_1}^{s_2} \sum_i \delta q_i \left[ \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} \right] ds \tag{1.54} $$

At the minimum the coefficients of each δqi must vanish independently so
we get a set of Euler Lagrange equations - one associated with each coordinate

$$ \frac{d}{ds}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = 0 \tag{1.55} $$

Generalized Coordinates
The reason that we have written the coordinates so generally as qi rather
than for example using x, y, z is that in some problems these are not the appropriate coordinates because of a constraint. A simple example to illustrate
this is a ball on a wire hoop.

[Figure: a ball on a circular wire hoop in the x, y plane, its position given by the angle θ.]

The hoop stops the ball moving in the radial direction so the ball cannot
be at any arbitrary (x, y). The sensible coordinate to use is the angle θ.
Such a reduced set of coordinates is called a set of generalized coordinates.

Generalized Momentum
A generalization of the idea of momentum can be defined in the spirit of
(1.52). The generalized momentum associated with a generalized coordinate
is given by

$$ p_i = \frac{\partial L}{\partial \dot{q}_i} \tag{1.56} $$


1.5.2 Example 1: Projectile Motion


Consider the familiar problem of a projectile in a uniform gravitational field.

[Figure: the trajectory of a projectile above the x axis.]

We can obtain the normal Newtonian equations of motion from the Euler
Lagrange equations. We need expressions for the kinetic and potential energy
of the system so we can build the Lagrangian. The kinetic energy is just

$$ T = \frac{1}{2} m \dot{x}^2 + \frac{1}{2} m \dot{y}^2 \tag{1.57} $$

and the potential energy

$$ V = mgy \tag{1.58} $$

So the Lagrangian is just

$$ L = T - V = \frac{1}{2} m \dot{x}^2 + \frac{1}{2} m \dot{y}^2 - mgy \tag{1.59} $$

Now we find the two Euler Lagrange equations. The first associated with
the x coordinate is

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0 \tag{1.60} $$

which gives

$$ m\ddot{x} = 0 \tag{1.61} $$

The second equation associated with the y coordinate is

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{y}} \right) - \frac{\partial L}{\partial y} = 0 \tag{1.62} $$

which gives

$$ \ddot{y} = -g \tag{1.63} $$

The two equations (1.61) and (1.63) are the standard Newtonian equations of motion.
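
Hamilton’s Principle itself can be tested numerically for this problem. The sketch below considers only the vertical motion (the x motion is uniform and decouples), discretizes the action on a grid of time steps, and compares the action of the true parabola with that of a nearby path sharing the same end points. The trial variation ε sin(πt/T), the grid size and the function names are all assumptions made for illustration.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (kinetic - potential) * dt over each time slice."""
    total = 0.0
    for y0, y1 in zip(path, path[1:]):
        velocity = (y1 - y0) / dt
        height = 0.5 * (y0 + y1)
        total += (0.5 * m * velocity ** 2 - m * g * height) * dt
    return total

def sample_paths(epsilon, flight_time=1.0, steps=1000, g=9.81):
    """Return (dt, true_path, varied_path) for vertical motion from y = 0 back to y = 0."""
    dt = flight_time / steps
    v0 = 0.5 * g * flight_time  # launch speed that returns to y = 0 at flight_time
    times = [k * dt for k in range(steps + 1)]
    true_path = [v0 * t - 0.5 * g * t * t for t in times]
    # The variation vanishes at both ends, as the boundary conditions (1.11) require.
    varied = [y + epsilon * math.sin(math.pi * t / flight_time)
              for y, t in zip(true_path, times)]
    return dt, true_path, varied

def action_excess(epsilon):
    """Action of the varied path minus the action of the true path."""
    dt, true_path, varied = sample_paths(epsilon)
    return action(varied, dt) - action(true_path, dt)
```

`action_excess(epsilon)` should be positive for either sign of ε and should grow like ε², with no term linear in ε, which is what δS = 0 at the true path means.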

Hopefully you’re starting to see the power of this technique now - the
kinetic and potential energies of a system are fairly easy to work out and
then we just do some maths. There’s not all that resolving forces business!
The next problem is an example that would be very hard by the standard
methodology.

1.5.3 Example 2: Double Pendulum


Consider a double pendulum of the form

[Figure: a double pendulum. Mass m1 hangs on a rod of length l1 at angle θ1 and moves with velocity v1; mass m2 hangs from m1 on a rod of length l2 at angle θ2 and moves with velocity v1 + v2.]

It would be pretty hard work to determine all the forces in play here.
However, the Lagrangian technique means we only have to calculate the
energies of the two masses to get to the equations of motion.
The first mass has a velocity v⃗1 with magnitude l1 θ̇1 (v = ωr). The second
mass has both this motion plus a second contribution from the swing of the
second pendulum v⃗2 with magnitude l2 θ̇2. The total velocity of the second
mass is therefore

$$ \vec{v}_{tot} = \vec{v}_1 + \vec{v}_2 \tag{1.64} $$

so

$$ \begin{aligned} v_{tot}^2 &= (\vec{v}_1 + \vec{v}_2) \cdot (\vec{v}_1 + \vec{v}_2) \\ &= (l_1 \dot\theta_1)^2 + (l_2 \dot\theta_2)^2 + 2 l_1 \dot\theta_1 l_2 \dot\theta_2 \cos(\theta_2 - \theta_1) \end{aligned} \tag{1.65} $$

where θ2 − θ1 is the angle between v⃗1 and v⃗2.

Thus the total kinetic energy of the system is

$$ T = \frac{1}{2} m_1 l_1^2 \dot\theta_1^2 + \frac{1}{2} m_2 \left[ (l_1 \dot\theta_1)^2 + (l_2 \dot\theta_2)^2 + 2 l_1 \dot\theta_1 l_2 \dot\theta_2 \cos(\theta_2 - \theta_1) \right] \tag{1.66} $$

The potential energy is determined by the heights of the masses

$$ V = -m_1 g l_1 \cos\theta_1 - m_2 g (l_1 \cos\theta_1 + l_2 \cos\theta_2) \tag{1.67} $$

and the Lagrangian is

$$ L = T - V \tag{1.68} $$
There are two Euler Lagrange equations - one associated with θ1

$$ \frac{d}{dt}\left[ m_1 l_1^2 \dot\theta_1 + m_2 l_1^2 \dot\theta_1 + m_2 l_1 l_2 \dot\theta_2 \cos(\theta_2 - \theta_1) \right] - m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2 - \theta_1) + (m_1 + m_2) g l_1 \sin\theta_1 = 0 \tag{1.69} $$

and one with θ2

$$ \frac{d}{dt}\left[ m_2 l_2^2 \dot\theta_2 + m_2 l_1 l_2 \dot\theta_1 \cos(\theta_2 - \theta_1) \right] + m_2 l_1 l_2 \dot\theta_1 \dot\theta_2 \sin(\theta_2 - \theta_1) + m_2 g l_2 \sin\theta_2 = 0 \tag{1.70} $$

These are pretty messy (but that was the point!). Things simplify a bit
if we assume that both θ1 and θ2 are small and expand to linear order. We
then get

$$ \begin{aligned} (m_1 + m_2) l_1^2 \ddot\theta_1 + m_2 l_1 l_2 \ddot\theta_2 &= -(m_1 + m_2) g l_1 \theta_1 \\ m_2 l_2^2 \ddot\theta_2 + m_2 l_1 l_2 \ddot\theta_1 &= -m_2 g l_2 \theta_2 \end{aligned} \tag{1.71} $$

The minus signs on the right hand side mean that gravity acts as a restoring force. These coupled equations in fact have normal mode solutions of the form

$$ \begin{aligned} \ddot\theta_1 &= -\omega^2 \theta_1 \\ \ddot\theta_2 &= -\omega^2 \theta_2 \end{aligned} \tag{1.72} $$

i.e. the two pendulums oscillate with the same frequency.
To find ω you can try substituting the form of the solution in (1.72)
into (1.71). You’ll find two simultaneous equations for θ1 and θ2 with two
solutions for ω². You’ll find in one case θ1 /θ2 is positive and in the other it is
negative. So in one case the pendulums swing together and in the other case
in opposite directions.
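
The substitution can be carried out in a few lines of Python. The sketch below assumes the small-angle equations with restoring gravity terms, (m1 + m2) l1² θ̈1 + m2 l1 l2 θ̈2 = −(m1 + m2) g l1 θ1 and m2 l2² θ̈2 + m2 l1 l2 θ̈1 = −m2 g l2 θ2. Putting θ̈ = −ω²θ into both and demanding a non-trivial solution gives the quadratic m1 l1 l2 ω⁴ − (m1 + m2) g (l1 + l2) ω² + (m1 + m2) g² = 0. The function name and default value of g are illustrative.

```python
import math

def normal_modes(m1, m2, l1, l2, g=9.81):
    """Return [(omega_squared, theta1 / theta2), ...] for the two small-angle modes."""
    # Quadratic a w^2 + b w + c = 0 in w = omega^2 from the vanishing determinant.
    a = m1 * l1 * l2
    b = -(m1 + m2) * g * (l1 + l2)
    c = (m1 + m2) * g * g
    root = math.sqrt(b * b - 4.0 * a * c)
    modes = []
    for w in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        # Second linearized equation: l1 w theta1 + (l2 w - g) theta2 = 0.
        ratio = -(l2 * w - g) / (l1 * w)
        modes.append((w, ratio))
    return modes
```

For equal masses and equal lengths l this gives ω² = (2 ∓ √2) g/l, with θ1/θ2 positive for the slower mode and negative for the faster one.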

Example:
(a) Show that for a non-relativistic, free particle of mass m travelling with
constant velocity v the action S describing its motion reduces to

$$ S = \frac{mvd}{2} $$

where d is the distance travelled. This was a form for the action proposed
by Maupertuis who believed it reflected the simplicity and economy of the
Creator-God....

(b) Consider such a particle rolling on a table in the x, y plane with speed v1.
Along the y-axis there is a height discontinuity in the table which the particle
can move over at the cost of potential energy which reduces its velocity to v2.
If the particle starts at (x1 , y1 ) to the left of the y axis and ends to the right
at (x2 , y2 ) show that the action for it passing across the y-axis at arbitrary
y (assuming it travels in a straight line except when it crosses the y axis) is
given by

$$ S = m v_1 \sqrt{x_1^2 + (y - y_1)^2} + m v_2 \sqrt{x_2^2 + (y - y_2)^2} $$

By minimizing the action deduce the relation

$$ v_1 \sin\theta_1 = v_2 \sin\theta_2 $$

where the angles are the angles between the particle’s direction of motion
and the x axis before and after it crosses the y axis. Contrast this result
with Snell’s Law for light.

Answer:
(a) The action is given by S = ∫ L dt, integrated from t1 to t2, where L is the Lagrangian. For
a free particle travelling at constant speed v, L = ½mv² = ½m d²/(t2 − t1)², where
d is the distance travelled between t1 and t2. Then

$$ S = \int_{t_1}^{t_2} \frac{m d^2}{2 (t_2 - t_1)^2}\, dt = \frac{m}{2} \frac{d^2}{t_2 - t_1} = \frac{1}{2} m v d $$

(b) Similarly to what we did for light in the lectures, the distance travelled
before the discontinuity at x = 0, with velocity v1, is d1 = √(x1² + (y − y1)²).
Similarly, to the right of the discontinuity the particle will travel at v2 a
distance d2 = √(x2² + (y − y2)²).
The total action is, then,

$$ S = m v_1 d_1 + m v_2 d_2 = m v_1 \sqrt{x_1^2 + (y - y_1)^2} + m v_2 \sqrt{x_2^2 + (y - y_2)^2} . $$

Taking the derivative with respect to y we get

$$ 0 = \frac{dS}{dy} = m v_1 \frac{y - y_1}{\sqrt{x_1^2 + (y - y_1)^2}} + m v_2 \frac{y - y_2}{\sqrt{x_2^2 + (y - y_2)^2}} . $$

But, doing some trigonometry, we see that (y − y1)/√(x1² + (y − y1)²) =
sin(θ1), (y − y2)/√(x2² + (y − y2)²) = − sin(θ2), and the minimisation condition reduces to

$$ 0 = \frac{dS}{dy} \quad\rightarrow\quad v_1 \sin(\theta_1) = v_2 \sin(\theta_2) . $$

1.6 Conservation Laws


Finally let’s look at one of the most surprising pieces of insight to come out
of the Lagrangian formalism - that is a deeper understanding of conservation
laws. Mathematically this will just repeat the earlier section on
First Integrals but in the dynamics arena we will see a new interpretation.

1.6.1 Ignorable Coordinates

If the Lagrangian does not depend on some coordinate qi it is called an ignorable coordinate. Then ∂L/∂qi = 0 and its associated generalized momentum
is conserved as we can see from the Euler Lagrange equation

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \frac{\partial L}{\partial q_i} = \frac{dp_i}{dt} = 0 \tag{1.73} $$

so

$$ p_i = \text{constant} \tag{1.74} $$
This is clearly a mathematical fact but there is a deeper interpretation.


If L only depends on q̇i not qi itself then we can shift

qi → qi + const (1.75)
and leave the Lagrangian, L, (and hence the physics) invariant. This is a
symmetry - translation invariance in the qi direction.
Thus we learn that the true relation is

symmetry → conserved momentum

This is a new insight we have not seen before in Newtonian mechanics.

1.6.2 Energy Conservation


The second example of a first integral above was the case where L does not
depend explicitly on t. This implies that a quantity known as the Hamiltonian is conserved

$$ H = \sum_i \dot{q}_i \frac{\partial L}{\partial \dot{q}_i} - L \tag{1.76} $$

To prove that it is conserved we explicitly calculate

$$ \frac{dH}{dt} = \sum_i \dot{q}_i \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) + \sum_i \ddot{q}_i \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial t} - \sum_i \dot{q}_i \frac{\partial L}{\partial q_i} - \sum_i \ddot{q}_i \frac{\partial L}{\partial \dot{q}_i} = 0 \tag{1.77} $$

using the Euler Lagrange equations and ∂L/∂t = 0.
In simple systems the Hamiltonian is just the total energy of the system
as we can see for example in one dimension where

$$ L = \frac{1}{2} m \dot{x}^2 - V(x) \tag{1.78} $$

so using the definition above

$$ H = \frac{1}{2} m \dot{x}^2 + V(x) \tag{1.79} $$

In conclusion here we have learnt that time translation invariance implies
energy conservation.

1.6.3 Example - Central Forces
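Consider a particle moving subject to a central force, i.e. in a potential V (r).

[Figure: a particle at polar coordinates (r, θ) acted on by a central force F.]

The kinetic energy of the particle is

$$ T = \frac{1}{2} m (\dot{x}^2 + \dot{y}^2) = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m r^2 \dot\theta^2 \tag{1.80} $$

thus

$$ L = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m r^2 \dot\theta^2 - V(r) \tag{1.81} $$

There is an Euler Lagrange equation associated with the r coordinate

$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{r}} \right) - \frac{\partial L}{\partial r} = 0 \tag{1.82} $$

giving

$$ m\ddot{r} = m r \dot\theta^2 - \frac{\partial V}{\partial r} \tag{1.83} $$

Plus a second equation for θ, which since L is independent of θ, is just

$$ \frac{d}{dt}(m r^2 \dot\theta) = 0 \tag{1.84} $$

which tells us that angular momentum is conserved.
The Hamiltonian is also conserved and is given here by

$$ H = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m r^2 \dot\theta^2 + V \tag{1.85} $$

which is the total energy.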
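
Both conservation laws can be seen in a numerical solution of (1.83) and (1.84). The sketch below assumes an attractive potential V(r) = −k/r (the notes leave V general), writes (1.84) as θ̈ = −2ṙθ̇/r, and integrates with a fourth-order Runge-Kutta step. The potential, initial conditions, step size and function names are illustrative assumptions.

```python
def derivatives(state, m, k):
    """Time derivatives of (r, theta, r_dot, theta_dot) for V(r) = -k / r."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2 - k / (m * r ** 2)   # (1.83) with dV/dr = k / r^2
    theta_ddot = -2.0 * r_dot * theta_dot / r        # (1.84) expanded
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt, m, k):
    """One fourth-order Runge-Kutta step for the four-component state."""
    def shifted(slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))
    k1 = derivatives(state, m, k)
    k2 = derivatives(shifted(k1, 0.5 * dt), m, k)
    k3 = derivatives(shifted(k2, 0.5 * dt), m, k)
    k4 = derivatives(shifted(k3, dt), m, k)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state, m):
    r, _, _, theta_dot = state
    return m * r ** 2 * theta_dot

def hamiltonian(state, m, k):
    """H from (1.85) with V = -k / r."""
    r, _, r_dot, theta_dot = state
    return 0.5 * m * r_dot ** 2 + 0.5 * m * r ** 2 * theta_dot ** 2 - k / r

def conservation_drift(m=1.0, k=1.0, dt=1e-3, steps=5000):
    """Largest changes in angular momentum and H along a bound orbit."""
    state = (1.0, 0.0, 0.0, 1.2)
    j0, h0 = angular_momentum(state, m), hamiltonian(state, m, k)
    worst_j = worst_h = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt, m, k)
        worst_j = max(worst_j, abs(angular_momentum(state, m) - j0))
        worst_h = max(worst_h, abs(hamiltonian(state, m, k) - h0))
    return worst_j, worst_h
```

Both numbers returned by `conservation_drift()` should be tiny compared with the angular momentum and energy themselves, even though r and θ̇ each vary substantially around the orbit.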

1.6.4 Hamiltonian and Energy
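Finally it is worth stressing that the Hamiltonian is not always the energy of
the system. As an example consider a bead of mass m on a hoop of radius a that is being rotated
at constant angular velocity ω about its vertical diameter, as shown, by a torque.

[Figure: a bead of mass m at angle θ on a hoop of radius a, which rotates with angular velocity ω.]

The angle θ describes the position of the bead on the hoop (the orientation of the hoop itself is fixed by the time t) so θ is a good generalized coordinate. The kinetic energy is given by

$$ T = \frac{1}{2} m (a^2 \dot\theta^2 + a^2 \sin^2\theta\, \omega^2) \tag{1.86} $$

and the potential energy by

$$ V = -mga \cos\theta \tag{1.87} $$

Thus

$$ L = \frac{1}{2} m (a^2 \dot\theta^2 + a^2 \sin^2\theta\, \omega^2) + mga \cos\theta \tag{1.88} $$

Since L does not depend explicitly on t the Hamiltonian is conserved. In particular

$$ H = \frac{1}{2} m a^2 \dot\theta^2 - \frac{1}{2} m a^2 \sin^2\theta\, \omega^2 - mga \cos\theta \tag{1.89} $$

Although H is conserved the total energy of the system is not, since to keep
the hoop rotating the torque must be applied, doing work on the system.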
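
The difference between H and the energy T + V shows up clearly in a numerical solution. The Euler Lagrange equation that follows from (1.88) is θ̈ = ω² sin θ cos θ − (g/a) sin θ. The sketch below integrates it with a fourth-order Runge-Kutta step and records how much H and T + V each vary. The parameter values, initial angle and function names are illustrative assumptions.

```python
import math

def theta_ddot(theta, omega, g, a):
    """Euler Lagrange equation from (1.88), solved for the angular acceleration."""
    return omega ** 2 * math.sin(theta) * math.cos(theta) - (g / a) * math.sin(theta)

def rk4_step(theta, theta_dot, dt, omega, g, a):
    """One fourth-order Runge-Kutta step for (theta, theta_dot)."""
    k1x, k1v = theta_dot, theta_ddot(theta, omega, g, a)
    k2x, k2v = theta_dot + 0.5 * dt * k1v, theta_ddot(theta + 0.5 * dt * k1x, omega, g, a)
    k3x, k3v = theta_dot + 0.5 * dt * k2v, theta_ddot(theta + 0.5 * dt * k2x, omega, g, a)
    k4x, k4v = theta_dot + dt * k3v, theta_ddot(theta + dt * k3x, omega, g, a)
    return (theta + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            theta_dot + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0)

def hamiltonian(theta, theta_dot, m, a, omega, g):
    """H from (1.89)."""
    return (0.5 * m * a * a * theta_dot ** 2
            - 0.5 * m * a * a * math.sin(theta) ** 2 * omega ** 2
            - m * g * a * math.cos(theta))

def energy(theta, theta_dot, m, a, omega, g):
    """Total energy T + V from (1.86) and (1.87)."""
    return (0.5 * m * a * a * (theta_dot ** 2 + math.sin(theta) ** 2 * omega ** 2)
            - m * g * a * math.cos(theta))

def spreads(m=1.0, a=1.0, omega=2.0, g=9.81, dt=1e-3, steps=3000):
    """Return (max - min of H, max - min of T + V) along the motion."""
    theta, theta_dot = 1.0, 0.0
    h_values, e_values = [], []
    for _ in range(steps):
        h_values.append(hamiltonian(theta, theta_dot, m, a, omega, g))
        e_values.append(energy(theta, theta_dot, m, a, omega, g))
        theta, theta_dot = rk4_step(theta, theta_dot, dt, omega, g, a)
    return max(h_values) - min(h_values), max(e_values) - min(e_values)
```

`spreads()` should return a first entry that is zero to numerical accuracy and a second that is of order m a² ω², the work exchanged with whatever drives the hoop.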

Chapter 2

Special Relativity

Contents

• Postulates

• Lorentz transformations as generalized rotations

• 4-vectors and index conventions

• Proper time and definitions of u^µ, a^µ, p^µ

• E = m(v)c²

• Eqns of relativistic dynamics

• 4-momentum conservation

• Examples - Compton effect, Doppler effect, particle decay, GZK bound

• The relativistic action

Learning Outcomes

• Know postulates

• Perform Lorentz transformations on x^µ, p^µ, g^{µν} etc

• Perform calculations using index notation

• Know defns of u^µ, p^µ etc

• Perform calculations of dynamics problems where p^µ is conserved

Reference Books

• A smart student would re-read their first year SR notes!

• Special Relativity, AP French.

• Investigate the QC6 section of the library.

In this section of the course we will learn how to write the laws of dynamics in a form consistent with Special Relativity. First let’s review the basic
postulates and their observational consequences.

2.1 The Postulates


The two fundamental postulates of special relativity are
• The speed of light is the same when measured in any inertial frame.
This was the crucial result from the Michelson-Morley experiment.
• The laws of physics are the same for an observer in any inertial frame.
This is the statement that there is no observer (for example stationary
relative to some “ether”) for whom the laws are especially simple.
Remember that an observer moving at constant speed is said to be in an
inertial frame that can be thought of as a combination of
• A rigid, stationary (relative to the observer) lattice grid by which po-
sition coordinates are specified
• A set of synchronized clocks at each lattice point so time can be recorded.

With these observational tools the observer can specify any event by the
set of coordinates

(x, y, z), and t (2.1)

Note that moving from one inertial frame to another is often described
as performing a boost.

2.2 Lorentz Transformations
To see the bizarre implications of the first postulate, consider a light wave
front emitted from a stationary source at the origin.

Let us call this inertial frame (stationary relative to the light source) frame
S. The light wave moves away from the source at speed c as a spherical shell
described by
    x^2 + y^2 + z^2 = (ct)^2    (2.2)
Now consider an inertial frame S' moving with speed v in the x direction.
For convenience let's set the origin of both sets of coordinates at time t = 0
at the same place.

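[Figure: frames S (axes x, y, z) and S' (axes x', y', z'), with S' moving at
speed v along x; the two origins are separated by vt]
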
The origins of the two sets of coordinates separate by a distance vt in time
t.

The first postulate says that the observer in S' sees light travel at speed
c too. Thus in this frame too the light forms a spherical shell centred on the
origin in S' described now by
    x'^2 + y'^2 + z'^2 = (ct')^2    (2.3)
This is very surprising - you would have guessed that the observer moving
relative to the light source would not be in the centre of the spherical light
shell.
The only way to reconcile the two viewpoints is if the two observers
disagree on the values of times and positions. The two equations for the
position of the shell (2.2), (2.3) in the two frames are reconciled by the
Lorentz transformations

    t' = \gamma\left(t - \frac{v}{c^2} x\right)
    x' = \gamma (x - vt)
    y' = y
    z' = z    (2.4)
where
    \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}    (2.5)

Exercise: Explicitly check that substituting the Lorentz transformations
into (2.3) one obtains (2.2).

Exercise: How would the Lorentz transformations differ if the boost was in
the z direction rather than the x direction?
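
The first exercise can also be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `gamma`, `boost_x` and `interval` are our own, and the example event is an arbitrary choice on the light shell in units where c = 1.

```python
import math

C = 299_792_458.0  # speed of light in m/s (default; pass c=1 for natural units)

def gamma(v, c=C):
    """Lorentz factor (2.5)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost_x(t, x, y, z, v, c=C):
    """Apply the Lorentz transformation (2.4) for a boost of speed v along x."""
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t), y, z

def interval(t, x, y, z, c=C):
    """(ct)^2 - x^2 - y^2 - z^2, which vanishes on the light shell (2.2)."""
    return (c * t) ** 2 - x**2 - y**2 - z**2

# An event on the light shell in S (c = 1): 3^2 + 4^2 + 0^2 = 5^2
event = (5.0, 3.0, 4.0, 0.0)
boosted = boost_x(*event, v=0.6, c=1.0)
print(interval(*event, c=1.0), interval(*boosted, c=1.0))
```

Both printed values should vanish up to rounding, which is the statement that (2.3) and (2.2) describe the same shell.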

An immediate check we should make on these transformations is that
they make sense in the slow moving world we live in. When v ≪ c, γ ≃ 1
and
    t' \simeq t, \quad x' \simeq x - vt    (2.6)
which is indeed what we would expect.
The Lorentz transformations imply that observers moving relative to each
other will not agree on the simultaneity of events. For example if a stationary
observer sees an event happen at t = 0 a distance of 10 m away

(t = 0, x = 10 m) (2.7)
Then an observer moving in the x direction at speed v will record the event
as occurring at a time
    t' = -\gamma \frac{v}{c^2} (10 m)    (2.8)
i.e. earlier than when the two observers passed each other (t = t' = 0). The
implications of this are that the observers do not agree on measurements of
periods and lengths.

2.2.1 Time Dilation


Imagine an observer at the origin in the frame S marks a second by letting
off two flashes of light separated by 1 second on his watch. The flashes of
light are two events with coordinates

(x = 0, t = 0) (x = 0, t = 1 s) (2.9)
The moving observer in the S' frame sees the events as
    (x' = 0, t' = 0) \quad (x' = -\gamma v t, t' = \gamma s)    (2.10)
The S' observer has recorded a time (in seconds)
    \gamma = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} \geq 1    (2.11)
longer than one second. The S' observer therefore declares that the S
observer's watch (which is moving relative to S') is running slow.

A moving clock runs slow

2.2.2 Lorentz Contraction
Consider a ruler of length L at rest in the frame S. An observer in S might
make measurements of the position of the two ends to deduce its length.
Those measurements can be represented by the events

(t = 0, x = 0), (t = 0, x = L) (2.12)
A moving observer in the frame S' watches this process and is somewhat
bemused. He sees the measurement events as
    (t' = 0, x' = 0) \quad \left(t' = -\gamma \frac{v}{c^2} L, x' = \gamma L\right)    (2.13)
The measurements were taken according to S' at different times. Remember
that S' sees the ruler moving, so if you measure the end points at different
times you'll not correctly measure the length.
S' wants S to make the second measurement at t' = 0. In S the position
of the ruler doesn't change but when should S make the measurement so that
S' says t' = 0?
    t' = \gamma\left(t - \frac{v}{c^2} L\right) = 0    (2.14)
thus
    t = \frac{v}{c^2} L    (2.15)
(The S observer doesn't see what is special about this time of course!)
Now where is the second end in the S' coordinates when this new S
measurement is made?
    x' = \gamma(x - vt) = \gamma\left(L - \frac{v^2}{c^2} L\right) = \frac{L}{\gamma}    (2.16)
Thus S' says the two correct simultaneous measurements of the end points
are
    (t' = 0, x' = 0) \quad \left(t' = 0, x' = \frac{L}{\gamma}\right)    (2.17)
S' therefore sees the moving ruler to be shorter by a factor of γ relative to
S.
Exercise: Repeat the computation of the length of the ruler in the frame S'
but assuming that the ends of the ruler are at the points x = 1 m, x = 2 m
in the S frame. Show that the contracted length is again L/γ.
Moving objects contract in the direction of motion.
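
The two event-based arguments above can be replayed numerically. The sketch below assumes units with c = 1 and uses helper names of our own (`boost`, `dilated_period`, `contracted_length`); it is one way to organise the calculation, not the only one. The ruler ends are placed at x = 1 and x = 2 as in the exercise.

```python
import math

def boost(t, x, v):
    """Boost an event (t, x) by speed v along x, in units where c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

def dilated_period(v, period=1.0):
    """Time between two flashes emitted at x = 0 in S, as recorded in S'."""
    t0, _ = boost(0.0, 0.0, v)
    t1, _ = boost(period, 0.0, v)
    return t1 - t0

def contracted_length(v, x_left, x_right):
    """Length in S' of a ruler at rest in S between x_left and x_right."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    # Measure the left end at t = 0 in S; S' records this at time t_left_p.
    t_left_p, x_left_p = boost(0.0, x_left, v)
    # Pick the S time of the right-end measurement so that S' records the
    # same t'. Inverting t' = g (t - v x) gives t = t'/g + v x.
    t_right = t_left_p / g + v * x_right
    t_right_p, x_right_p = boost(t_right, x_right, v)
    assert abs(t_right_p - t_left_p) < 1e-12  # simultaneous in S'
    return x_right_p - x_left_p

print(dilated_period(0.6))               # compare with gamma = 1.25
print(contracted_length(0.6, 1.0, 2.0))  # compare with L / gamma = 0.8
```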

2.3 An Analogy to Rotations
It’s helpful to think of the Lorentz transformations as a generalization of the
idea of rotations in the following sense.
Consider first rotations in two dimensions. We can set up two observers
who are using coordinates rotated by an angle θ relative to each other

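[Figure: two sets of axes rotated by an angle θ relative to each other, with
the projections x cos θ and y sin θ marked]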

The coordinates transform between the two coordinate systems as

    x' = x \cos\theta + y \sin\theta    (2.18)
    y' = y \cos\theta - x \sin\theta

The different coordinate choices are in a sense a distraction from the
physics involved (of say a moving particle) which is really the same for the
observer using either coordinates. The elegant way to express this is to
use vectors. The vector (eg from the origin to a particle) is the same for
both observers although its components may be different for the different
observers. We write

    \vec{x} \ \text{or} \ \mathbf{x} = (x, y)    (2.19)
The coordinate transformation can then be written as a matrix multiplication
on the vector
    \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}    (2.20)

Remember that there is something invariant about the position of a particle
under rotations - its distance from the origin, i.e.
    L^2 = x^2 + y^2 = x'^2 + y'^2
We can extract this from the vector by the dot product of the vector with
itself

    L^2 = \vec{x} \cdot \vec{x}    (2.21)

Now consider Lorentz transformations in the x and t directions where the
coordinates are mixed up by a boost. The Lorentz transformations, although
not exactly like the mixing of spatial coordinates under rotations, do have
a similar form. Let's try to draw a diagram with the coordinate axes of two
different inertial frame observers both shown.
We begin with one stationary observer’s coordinates in the x − (ct) plane.
We use ct rather than just t because it has the same dimensions as x.

[Figure: the ct and x axes of the stationary observer, with the light path
drawn between them]

Note that light travels on the line at 45° to the axes since it reaches a
distance x = ct in time t.
We can use the Lorentz transformations to plot the position of the equivalent
axes in a frame moving relative to this frame. The coordinate axes are
given when ct' = 0 and x' = 0 so
    ct' = \gamma ct - \gamma \frac{v}{c} x
    ct' = 0 \rightarrow ct = \frac{v}{c} x    (2.22)
    x' = \gamma x - \gamma \frac{v}{c} ct
    x' = 0 \rightarrow ct = \frac{c}{v} x    (2.23)
Now we can plot these axes in the original frame's coordinates

[Figure: the ct' and x' axes of S' drawn on the ct-x diagram of S, together
with the light path]

The marked lines are the S' coordinate axes - they agree with the original
coordinates as to the point (0,0). The plot also shows the grid x' = 0, 1, 2..
ct' = 0, 1, 2.. etc. Note that in the new coordinate system the path light takes
is given by the same line - it goes through the points (0,0) (1,1) (2,2) etc.
This is an equivalent plot to the one we drew for rotations. We can place
an event on the plot and then read off its coordinates in either the original
frame using the square grid or in the boosted frame using the skewed grid.
The grid can be used to see time dilation and length contraction.
The circles are events positioned at x' = 0 every second in S'. Reading
the time of the event on the original axes though shows that S sees more
than 1 s having passed between events - a clock in a moving inertial frame
measures time more slowly - time dilation.
The solid line represents a rod in S'. In S if we measure distance at the
same time for each end we get a smaller length - lengths appear contracted
in a moving inertial frame.

[Figure: the same diagram with the events (circles) and the rod (solid line)
marked]

Although x and t change between S and S' this picture, like that for the
rotations, suggests there should be a frame invariant way to discuss events.
This will lead us to introduce vectors in this plane which have space and time
like components. In the rotation case the vector had an invariant length that
was the same for all observers. For Lorentz transformations we have shown
that the quantity

    (ct)^2 - |\vec{x}|^2 = \text{constant}    (2.24)

is left invariant. This will be the “length” of our new “4-vectors”.

2.4 Four Vectors


Our previous discussion leads us to consider a four component vector with ct
as the time like component, and x y and z position components describing
an event or object. We will write this four-vector as

    x^\mu = (ct, x, y, z) = (x^0, x^1, x^2, x^3)    (2.25)

In this notation the index µ on x^µ takes the values 0, 1, 2, 3 corresponding to
the components as shown.
We have identified two properties of the four-vector already. Firstly under
Lorentz boosts in the positive x direction by speed v it transforms as

    x^\mu \rightarrow x'^\mu = \begin{pmatrix} \gamma & -\frac{v}{c}\gamma & 0 & 0 \\ -\frac{v}{c}\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}    (2.26)
Secondly we know that it has a Lorentz invariant length

    (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = (ct)^2 - |\vec{x}|^2    (2.27)

2.4.1 Index Convention


At this point we are going to adopt a rather compact notation for multiplying
four-vectors. It will take a little getting used to but is not intrinsically deep!
There are two rules that will apply to the “µ” index on a four vector

• A given label for an index may occur at most twice in any term in an
expression.

• A repeated index is said to be “contracted”. Typically people write
a repeated index once up and once down. Such a repeated index is
“summed over”.

The best way to explain this is with an example. We can write the Lorentz
transformation of xµ in the following form
    x^\mu \rightarrow x'^\mu = \Lambda^\mu{}_\nu x^\nu    (2.28)
The new object Λ^µ_ν has two indices each of which can take the values 0, 1, 2, 3
and so there are 4×4 = 16 components. These 16 components are just the 16
components of the Lorentz transformation matrix we've written above (for
example let µ count the row and ν the column).
In the expression the ν index occurs twice and this implies we must let
ν take all possible values and add up the answers we get in each case. Thus
consider the case where we set µ = 0 then
    x'^0 = \Lambda^0{}_\nu x^\nu
         = \Lambda^0{}_0 x^0 + \Lambda^0{}_1 x^1 + \Lambda^0{}_2 x^2 + \Lambda^0{}_3 x^3    (2.29)
         = \gamma x^0 - \gamma \frac{v}{c} x^1
This has reproduced the Lorentz transformation for x^0 = ct.

Exercise: Convince yourself that equations (2.26) and (2.28) both reproduce
the four equations (2.4).
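
The summation rule is easy to mirror in code, where the repeated index becomes an explicit loop. This is only an illustrative sketch in units where c = 1 by default; `boost_matrix` and `transform` are names we have made up for it.

```python
import math

def boost_matrix(v, c=1.0):
    """The 4x4 matrix of (2.26): rows are labelled by mu, columns by nu."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    b = g * v / c
    return [[g, -b, 0.0, 0.0],
            [-b, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(Lam, x):
    """x'^mu = Lam^mu_nu x^nu: the repeated index nu is summed over 0..3."""
    return [sum(Lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

x = [5.0, 3.0, 4.0, 0.0]           # (ct, x, y, z)
print(transform(boost_matrix(0.6), x))
```

For µ = 0 the inner sum is exactly the four terms written out in (2.29).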

We can also write the Lorentz invariant length in this way. Formally we
do this as follows. We define a two index object called the metric with the
16 components
    g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}    (2.30)
Now we can write

    x_\mu = g_{\mu\nu} x^\nu    (2.31)
This four vector with a lowered index has components

    x_\mu = (ct, -x, -y, -z)    (2.32)

So finally we can define the length of the four vector as

    x^\mu x_\mu = x^0 x_0 + x^1 x_1 + x^2 x_2 + x^3 x_3
                = (ct)^2 - x^2 - y^2 - z^2    (2.33)
In practice you may just want to remember to insert the minus
signs as they appear in the above expression when you contract
the indices on four vectors! BEWARE though that there are not
these minus signs in the Lorentz transformation expression (2.28)!
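
To make the warning concrete, here is a small sketch (our own helper names, c = 1) in which the metric signs appear only in the contraction and not in the boost. The sample four-vector is arbitrary.

```python
import math

METRIC = [[1, 0, 0, 0],
          [0, -1, 0, 0],
          [0, 0, -1, 0],
          [0, 0, 0, -1]]  # g_{mu nu} of (2.30)

def lower(x):
    """x_mu = g_{mu nu} x^nu, as in (2.31)."""
    return [sum(METRIC[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

def dot(a, b):
    """Contraction a^mu b_mu; the minus signs of (2.33) come from lower()."""
    b_low = lower(b)
    return sum(a[mu] * b_low[mu] for mu in range(4))

def boost_x(x, v):
    """Boost along x with c = 1; no metric signs enter here, cf. (2.28)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [g * (x[0] - v * x[1]), g * (x[1] - v * x[0]), x[2], x[3]]

x = [2.0, 1.0, 0.5, -0.3]
print(dot(x, x), dot(boost_x(x, 0.8), boost_x(x, 0.8)))
```

The two printed numbers should agree up to rounding, which is the invariance of the length (2.33).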

2.5 The Laws of Dynamics


We have seen the consequences of relativity for observations of lengths and
periods. Now we will turn to thinking about how to formulate the laws of
dynamics. Simple Newtonian formulae such as f = ma do not work because
they contain time dependence and different observers don’t agree on lengths
of time.
Our guiding principle should be the second postulate which says that
physical laws should be the same for an observer in any inertial frame. We
will cast the laws in a way where this is manifestly true. Four-vectors will be
the tool that allows this since they are a frame invariant way of describing
the properties of a particle. Our laws will only

• contain Lorentz invariant quantities such as x^µ x_µ

• or take the form X^µ = Y^µ

This latter form is explicitly Lorentz invariant because the two sides of
the equation transform in the same way under Lorentz transformations.
So far we only have a four-vector describing position. We will now con-
struct four-vectors describing the kinematic properties of a particle.

2.5.1 Four-velocity
It is not sensible to use
    v = \frac{dx^\mu}{dt}    (2.34)
as our definition of velocity because both x^µ and t transform under Lorentz
boosts. The resulting transformation is very messy.
Ideally we would like a measure of time that is Lorentz invariant so that
v would transform only through the transformation of x^µ. It would then be
a four-vector itself. Such a Lorentz invariant measure of time is
Proper Time: the time elapsed on a clock in the rest frame of a moving
object. Essentially we imagine that everything has a watch and we time an
event for the object by the time on its watch not the observer's. Observers
in any reference frame will then get the same answer.
Note that in a particle's rest frame its x position is a constant so
    \tau = \frac{1}{c}\sqrt{\Delta x^\mu \Delta x_\mu}    (2.35)
Finally we can make a sensible choice for our variable four-velocity

    u^\mu = \frac{dx^\mu}{d\tau}    (2.36)
Let's stress again that this four-vector transforms just like x^µ under boosts
i.e.
    u'^\mu = \Lambda^\mu{}_\nu u^\nu    (2.37)
It is useful to know how four-velocity relates to the more standard velocity
measured by an observer using his own watch (we can call this coordinate
velocity)

    u^\mu = \frac{dx^\mu}{d\tau} = \frac{dx^\mu}{dt}\frac{dt}{d\tau}    (2.38)
We can work out dt/dτ from the Lorentz transformations. τ is the time
in the rest frame, where the particle sits at the origin, so in a moving frame
    t = \gamma\tau \rightarrow \frac{dt}{d\tau} = \gamma    (2.39)

Thus the components of four-velocity are

    u^\mu = \gamma(c, v_x, v_y, v_z)    (2.40)
From this expression we can finally work out the invariant “length” of this
four vector from the product

    u^\mu u_\mu = \gamma^2 (c^2 - |\vec{v}|^2) = c^2    (2.41)
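
As a quick numerical illustration of (2.40) and (2.41), the sketch below builds the four-velocity for an arbitrary coordinate velocity and contracts it with itself. The function names and the sample velocity are our own choices.

```python
import math

def four_velocity(vx, vy, vz, c=1.0):
    """Components u^mu = gamma (c, vx, vy, vz) of (2.40)."""
    speed_sq = vx**2 + vy**2 + vz**2
    g = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return [g * c, g * vx, g * vy, g * vz]

def invariant_square(u):
    """u^mu u_mu with the (+, -, -, -) signs of (2.33)."""
    return u[0]**2 - u[1]**2 - u[2]**2 - u[3]**2

print(invariant_square(four_velocity(0.3, -0.4, 0.5)))         # c = 1
print(invariant_square(four_velocity(1.0, 2.0, 0.5, c=3.0)))   # c = 3
```

Whatever subluminal velocity is used, the result should be c^2 up to rounding.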

2.5.2 Four Acceleration


The definition of acceleration is now straightforward
    a^\mu = \frac{du^\mu}{d\tau}    (2.42)

Again it's worth stressing that this object is a four-vector which transforms
in the same way as x^µ.

2.5.3 Four Momentum


The natural generalization of momentum is given by
    p^\mu = m u^\mu = m \frac{dx^\mu}{d\tau}    (2.43)

Here we have introduced the mass of the particle - it is a constant, intrinsic
property of the particle.
p^µ is again a four-vector that transforms as

    p'^\mu = \Lambda^\mu{}_\nu p^\nu    (2.44)

Interestingly though we have been led to introduce a time-like version
of momentum. What does this correspond to? To find out we should take
the classical limit of the theory (v ≪ c) and see what it corresponds to
in Newtonian dynamics. Remember that the time like component of
four-velocity was u^0 = γc so

    p^0 = mc\gamma
        = mc(1 - v^2/c^2)^{-1/2}    (2.45)
        \simeq mc\left(1 + \frac{1}{2}\frac{v^2}{c^2} + \ldots\right)
The first term is a constant. The second term though is recognizable since
\frac{1}{2} m v^2 is kinetic energy in the low v limit. This suggests we should
interpret p^0 as the relativistic version of energy (divided by c). Then we have
a surprising interpretation of the first, constant, term - a particle at rest
has energy

    E_{rest} = mc^2    (2.46)


We can write the components of p^µ in a number of ways now
    p^\mu = \left(\frac{E}{c}, \vec{p}\right) = m u^\mu = m\gamma(c, \vec{v})    (2.47)
The relativistic expression for energy is therefore

    E = \gamma mc^2    (2.48)
and the relativistic version of kinetic energy (the energy when moving minus
the energy at rest)
    T = (\gamma - 1)mc^2    (2.49)
The invariant length of the four-vector follows from u^µ u_µ = c^2 so

    p^\mu p_\mu = \frac{E^2}{c^2} - |\vec{p}|^2 = m^2 c^2    (2.50)

Example: Calculate by explicitly performing a boost the relativistic energy
and momentum of a proton moving at speed v = 0.5c. The rest mass of a
proton is approximately 1 GeV/c^2.
The four momentum for the proton at rest is given by p^µ = m u^µ =
m(c, 0, 0, 0). After a boost along the x-direction we obtain p'^µ = Λ^µ_ν p^ν,
with Λ^µ_ν as given above. The result is p'^µ = (mcγ, −mvγ, 0, 0). Here
γ = 2/√3, then p'^µ = mγc(1, −0.5, 0, 0) = 2 GeV (1, −0.5, 0, 0)/(√3 c), i.e.
E ≃ 1.15 GeV and |p| ≃ 0.58 GeV/c.
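
The boost in this example can be reproduced in a couple of lines. The sketch below assumes c = 1 (energies in GeV, momenta in GeV/c) and uses a hypothetical helper `boost_momentum`; the proton mass is the approximate 1 GeV/c^2 quoted in the example.

```python
import math

def boost_momentum(E, px, v):
    """Boost (E, px) by speed v along x, with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (E - v * px), g * (px - v * E)

m_proton = 1.0                        # GeV/c^2, approximate value from the text
E, px = boost_momentum(m_proton, 0.0, 0.5)
print(E, px)                          # energy in GeV, momentum in GeV/c
print(E * E - px * px)                # invariant p.p, should equal m^2
```

The energy should come out as 2/√3 GeV and the invariant should still be m^2, in line with (2.50).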

2.5.4 Hypothesis for Dynamical Law
Armed with these four-vector variables we can now have a guess as to the
form of the relativistic version of Newton’s second law. The obvious equation
to try is
    f^\mu = \frac{dp^\mu}{d\tau}    (2.51)

This is manifestly Lorentz invariant and has the correct non-relativistic limit
if f^µ is a relativistic extension of force. As yet though we haven't mentioned
forces and we won't until we discuss electro-magnetism! In fact this guess is
the correct law.
The law tells us something interesting even when f^µ = 0
    \frac{dp^\mu}{d\tau} = 0 \rightarrow p^\mu = \text{constant}    (2.52)

In other words if no external force acts on a system four-momentum is
conserved. This is the relativistic analogue of conservation of energy (p^0) and
conservation of the usual three component momentum (p^1, p^2, p^3).

2.6 Physics with Four-Momentum


To gain experience with four-vectors we will now look at four physics prob-
lems where using four-momentum makes the solutions much easier than with-
out.
To make our life easier we will use a trick that is common. Instead of
using the usual units system we will work in a new system where

c=1 (2.53)
In other words we redefine the unit of length so that it is the distance light
travels in 1 second! This would not be sensible for everyday life but in
problems where everything is travelling at the speed of light a meter is an
absurdly small distance. In practice we will be able to drop all the factors of
c from computations. It’s pretty easy to put them back into the final answer
using dimensional analysis as we will see.

2.6.1 The Doppler Effect


What frequency will an observer see a light wave at if he is moving relative
to it?


Consider first a static observer in the frame of the light source. The
photons of light carry four momentum
    p^\mu = (E, \vec{p}) = \left(hf, -\frac{h}{\lambda}\hat{x}\right) = (hf, -hf, 0, 0)    (2.54)
Note that the photon is moving in the negative x-direction towards the ob-
server. We have used the quantum mechanical relations between the energy
and frequency of the photon and between its momentum and wavelength.
We have also used f λ = c = 1.
We can now ask what would happen to the frequency of the light if the
observer was moving in the positive x-direction at speed v. We just perform
a boost on the four-vector

    p'^\mu = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} hf \\ -hf \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma(1 + v)hf \\ -\gamma(1 + v)hf \\ 0 \\ 0 \end{pmatrix}    (2.55)
Now if we just concentrate on the time-like component we have

    p'^0 = E' = hf' = (1 + v)\sqrt{\frac{1}{1 - v^2}}\, hf = \sqrt{\frac{(1 + v)^2}{(1 + v)(1 - v)}}\, hf    (2.56)
or
    f' = \sqrt{\frac{1 + v}{1 - v}}\, f    (2.57)

Finally we can reintroduce the factors of c since the factors of (1 + v) are
not dimensionally correct. We should have
    f' = \sqrt{\frac{1 + v/c}{1 - v/c}}\, f    (2.58)
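
A short sketch comparing the closed form (2.58) with the time-like component of the explicit boost (2.55) is given below. The function names are ours and the boost is done with c = 1; the speed 0.6 is an arbitrary test value.

```python
import math

def doppler_factor(v, c=1.0):
    """f'/f from (2.58) for an observer moving towards the source at speed v."""
    beta = v / c
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def boosted_photon_energy(hf, v):
    """Time-like component of (2.55): boost (hf, -hf) by v along +x, c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (hf - v * (-hf))

print(doppler_factor(0.6), boosted_photon_energy(1.0, 0.6))
```

Both numbers should agree, since (2.57) is just a rewriting of γ(1 + v).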

2.6.2 The Compton Effect


The Compton Effect relates the angle of scattering of a photon off a static
electron to its final wavelength. The classic experiment is schematically

[Figure: monochromatic x-ray source of wavelength λ, static free electron in a
metal target, scattered photon of wavelength λ'(θ) at angle θ]

You've calculated this relationship in previous courses. Using four-momentum
will get us to the answer much quicker.
Set up the four momentum of the particles to be:

initial photon: p^µ_{γi} = (h/λ, (h/λ) x̂)
initial electron: p^µ_{ei} = (m_e, 0)
final photon: p^µ_{γf} = (h/λ', h/λ')
final electron: p^µ_{ef}

Note the photon's final momentum is at an angle θ to the x-direction (the
second entry for the final photon is the magnitude of its 3-momentum).


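Since no external force acts four momentum is conserved in the collision
so

    p^\mu_{\gamma i} + p^\mu_{ei} = p^\mu_{\gamma f} + p^\mu_{ef}    (2.59)

It turns out to be helpful to rearrange this equation so that p^µ_{ef} is isolated
- we know least about p^µ_{ef} so will want to eliminate it

    p^\mu_{\gamma i} + p^\mu_{ei} - p^\mu_{\gamma f} = p^\mu_{ef}    (2.60)

Now we consider the Lorentz invariant product

    p^\mu_{ef} p_{ef\mu} = m_e^2
      = (p^\mu_{\gamma i} + p^\mu_{ei} - p^\mu_{\gamma f})(p_{\gamma i\mu} + p_{ei\mu} - p_{\gamma f\mu})
      = p^\mu_{\gamma i} p_{\gamma i\mu} + p^\mu_{ei} p_{ei\mu} + p^\mu_{\gamma f} p_{\gamma f\mu} + 2(p^\mu_{\gamma i} p_{ei\mu} - p^\mu_{\gamma i} p_{\gamma f\mu} - p^\mu_{\gamma f} p_{ei\mu})
      = 0 + m_e^2 + 0 + 2\left(\frac{h}{\lambda_i} m_e - 0\right) - 2\frac{h}{\lambda_i}\frac{h}{\lambda_f}(1 - \cos\theta) - 2\left(\frac{h}{\lambda_f} m_e - 0\right)    (2.61)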
We have used two crucial facts here. Firstly when the four momentum of a
particle is contracted with itself we simply obtain the invariant m^2. Secondly
we have used the contraction law p_1^µ p_{2µ} = (p_1^0 p_2^0 − \vec{p}_1 \cdot \vec{p}_2).
Rearranging we find
    \frac{h}{\lambda_i} m_e - \frac{h}{\lambda_f} m_e = \frac{h}{\lambda_i}\frac{h}{\lambda_f}(1 - \cos\theta)    (2.62)
Multiplying through by λ_i λ_f/(h m_e) gives
    \lambda_f - \lambda_i = \frac{h}{m_e}(1 - \cos\theta)    (2.63)
which is the answer we want. Again we can insert c on dimensional grounds
    \lambda_f - \lambda_i = \frac{h}{m_e c}(1 - \cos\theta)    (2.64)
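
As a numerical cross-check, the sketch below evaluates (2.64) in SI units and, separately, verifies in units h = c = m_e = 1 that the recoiling electron built from (2.60) has invariant mass m_e when the final wavelength obeys (2.63). The constants are the standard SI values; the function names and the sample wavelength and angle are our own assumptions.

```python
import math

H_PLANCK = 6.62607015e-34      # J s
M_ELECTRON = 9.1093837015e-31  # kg
C_LIGHT = 299_792_458.0        # m/s

def compton_shift(theta):
    """Wavelength shift lambda_f - lambda_i in metres from (2.64)."""
    return H_PLANCK / (M_ELECTRON * C_LIGHT) * (1.0 - math.cos(theta))

def electron_invariant(lam_i, theta, m_e=1.0, h=1.0):
    """p_ef . p_ef from (2.60), with c = 1 and lambda_f taken from (2.63)."""
    lam_f = lam_i + (h / m_e) * (1.0 - math.cos(theta))
    E = h / lam_i + m_e - h / lam_f                  # energy conservation
    px = h / lam_i - (h / lam_f) * math.cos(theta)   # x-momentum conservation
    py = -(h / lam_f) * math.sin(theta)              # y-momentum conservation
    return E**2 - px**2 - py**2

print(compton_shift(math.pi / 2))     # shift at 90 degrees, in metres
print(electron_invariant(2.0, 1.0))   # should equal m_e^2 = 1
```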

2.6.3 Fixed Target Experiments


A simple way in which to create fundamental particles is by colliding a high
energy proton or electron into a fixed target of, for example, lead.
It’s not immediately obvious how much energy is available to make rest
mass energy of the new particle because momentum conservation requires the
final state to be moving and have kinetic energy. A sensible thing to do is to
move to the Centre of Mass frame where the particle and target (a particle
in the wall) approach each other with equal and opposite momentum. In this
frame the particle produced will be at rest and all the energy of the initial
state will become rest mass energy of the product.

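[Figure: a proton striking a lead (Pb) target. Lab frame: particle a with
p_a^µ = (E_a, p_a) approaches particle b at rest with p_b^µ = (m_b, 0). CoM
frame: a and b approach each other with equal and opposite momenta.]
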
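We can work out the Lorentz boost needed to move from the original
“lab” frame to the centre of mass frame. We boost the four-momenta in the
lab frame by an amount v
    p_a'^\mu = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} E_a \\ p_a \end{pmatrix} = \begin{pmatrix} \gamma(E_a - v p_a) \\ \gamma(p_a - v E_a) \end{pmatrix}    (2.65)
    p_b'^\mu = \begin{pmatrix} \gamma & -\gamma v \\ -\gamma v & \gamma \end{pmatrix} \begin{pmatrix} m_b \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma m_b \\ -\gamma v m_b \end{pmatrix}    (2.66)
In the Centre of Mass frame the momenta must be equal and opposite so
    p'_{ax} = -p'_{bx}
    \gamma v m_b = \gamma(p_a - v E_a)    (2.67)
and the required boost is by

    \frac{v}{c} = \frac{p_a}{m_b c + E_a/c}    (2.68)

If after this boost the particles are ultra-relativistic so that E_a ≃ |p_a| =
|p_b| ≃ E_b then the total available energy is

    E_{CoM} = 2\gamma m_b c^2 = \sqrt{\frac{4 m_b^2 c^4}{1 - \left(\frac{p_a}{m_b c + E_a/c}\right)^2}} = \sqrt{\frac{4 m_b^2 c^4 (m_b c + E_a/c)^2}{(m_b c + E_a/c)^2 - p_a^2}}    (2.69)

If we now expand in the limit with E_a ≫ m_b c^2, m_a c^2 we find (remember
that in this high energy limit E_a/c = p_a)
    E_{CoM} = \sqrt{\frac{4 m_b^2 E_a^2 c^2}{2 m_b E_a}}

    E_{CoM} = \sqrt{2 m_b E_a c^2}    (2.70)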

We could have obtained this result more quickly by calculating the
invariant rest mass of the whole system in the original coordinates

    p^\mu_{TOT} p_{TOT\mu} = m_{TOT}^2 c^2
      = (p_a^\mu + p_b^\mu)(p_{a\mu} + p_{b\mu})    (2.71)
      = p_a^\mu p_{a\mu} + p_b^\mu p_{b\mu} + 2 p_a^\mu p_{b\mu}

which in the limit where E_a is large compared to the rest masses gives

    m_{TOT}^2 c^2 = 2 \frac{E_a}{c} \frac{m_b c^2}{c} = 2 E_a m_b    (2.72)
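
To see how good the high-energy limit is, the sketch below compares the exact invariant mass from (2.71) with the approximation (2.72), in units where c = 1. The 450 GeV proton beam on a proton target (mass taken as 0.938 GeV) is a hypothetical example of ours, not a case discussed in the notes.

```python
import math

def com_energy_exact(E_a, m_a, m_b):
    """Invariant mass of beam plus target from (2.71), with c = 1."""
    # p_a.p_a = m_a^2, p_b.p_b = m_b^2, p_a.p_b = E_a m_b (target at rest)
    return math.sqrt(m_a**2 + m_b**2 + 2.0 * E_a * m_b)

def com_energy_approx(E_a, m_b):
    """High-energy limit (2.72): E_CoM^2 is approximately 2 E_a m_b."""
    return math.sqrt(2.0 * E_a * m_b)

# Hypothetical example: 450 GeV proton on a proton at rest (energies in GeV)
exact = com_energy_exact(450.0, 0.938, 0.938)
approx = com_energy_approx(450.0, 0.938)
print(exact, approx)
```

Only about 29 GeV of the 450 GeV beam energy is available for making new particles in this assumed setup, and the approximation agrees with the exact value to roughly one part in a thousand.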

2.6.4 The GZK Bound


Active galaxies accelerate protons to very high energies but there is a maxi-
mum energy we should expect to see (first calculated by Greisen, Kuzmin and
Zatsepin). The reason for the maximum is that the Universe is full of photons
left over from the Big Bang which higher energy protons can interact with.
These photons are responsible for the ambient background temperature of
the Universe T ∼ 3 K (E_γ = k_B T = 8 × 10^{-4} eV). The protons interact as
follows

    p\gamma \rightarrow \Delta\,(M_\Delta \sim 1.2\ \mathrm{GeV}/c^2) \rightarrow \pi + n    (2.73)


The ∆ is a short lived particle and the final decay is by far its most dominant
decay process. If there is sufficient energy in the collision to create a ∆ then
the proton is converted to other particles very efficiently. We can calculate
the minimum energy the proton must have.
Let’s assign the proton and photon initial four-momenta

    p_p^\mu = (E_p, k, 0, 0), \quad p_\gamma^\mu = (h\nu, -h\nu, 0, 0)    (2.74)

Note that we've set up the process so the photon and proton will collide head
on. This maximizes the energy available for new particle creation and will
therefore give us the minimum proton energy for the process.
Four momentum will be conserved in the interaction so

    p_\Delta^\mu = p_p^\mu + p_\gamma^\mu    (2.75)

Squaring gives

    p_\Delta^\mu p_{\Delta\mu} = m_\Delta^2 = (p_p^\mu + p_\gamma^\mu)(p_{p\mu} + p_{\gamma\mu})
      = m_p^2 + m_\gamma^2 + 2 E_p h\nu - 2k(-h\nu)    (2.76)

For a relativistic proton E_p ≃ k and so (with m_γ = 0)

    E_p = \frac{m_\Delta^2 - m_p^2}{4h\nu} \simeq 2 \times 10^{20}\ \mathrm{eV}    (2.77)
Protons with energy of this or above will undergo this interaction.
Factoring in the density of photons it turns out that the mean free path for
such protons is about 3 Mpc (our galaxy group is about 20 Mpc across). We
shouldn't expect to see any protons of this energy from active galaxies.
Surprisingly though experimenters have reported about 20 observed cos-
mic ray protons with higher energy than this bound. If these events are real
we must be doing something wrong! Could Special Relativity break down at
such high energies? Could there be a source of high energy protons within
3Mpc? At the moment this issue is an open question.
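
The threshold (2.77) is a one-line calculation once everything is in the same units. The sketch below works in eV with c = 1; the Δ mass and photon energy are the values quoted above, while the proton mass of 0.938 GeV is an assumed input of ours.

```python
def gzk_threshold_eV(m_delta_GeV=1.2, m_proton_GeV=0.938, photon_eV=8e-4):
    """Threshold proton energy (2.77) in eV, with c = 1."""
    GeV = 1e9  # eV per GeV
    m_delta = m_delta_GeV * GeV
    m_proton = m_proton_GeV * GeV
    return (m_delta**2 - m_proton**2) / (4.0 * photon_eV)

print(gzk_threshold_eV())   # threshold energy in eV
```

With these inputs the threshold should come out a little below 2 × 10^20 eV, i.e. of order 10^11 GeV.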

Example: In the original (Homestake) solar neutrino detection experiment
neutrinos from the sun interact with ^{37}Cl atoms to form ^{37}Ar and an electron.
Assuming the Cl atoms are at rest what boost is required to move to the
centre of mass frame? Determine the minimum energy the neutrino must
have for this reaction to proceed.

We assign momenta p_ν^µ = (E_ν, \vec{p}_ν) to the neutrino and p_{Cl}^µ = (m_{Cl}, \vec{0}) to
the atom. We want to boost these momenta in order to bring them to the
centre of mass frame, i.e. so that \vec{p}'_{Cl} = −\vec{p}'_ν. The effect of the boost on the
momenta is given by
    p'_\nu = \gamma p_\nu - \gamma v E_\nu
    p'_{Cl} = -\gamma v m_{Cl}.
For both to be equal in magnitude but opposite in sign, v m_{Cl} = p_ν − v E_ν,
i.e.
    v = \frac{p_\nu}{m_{Cl} + E_\nu},
or, reinstating factors of c (and using p_ν = E_ν/c for a massless neutrino),
    \frac{v}{c} = \frac{E_\nu}{m_{Cl} c^2 + E_\nu}.
The threshold energy corresponds to producing Ar and the electron at
rest in the centre of mass frame. Then, conservation of four-momentum tells
us that

    p^\mu_{TOT} p_{TOT\mu} = m^2_{TOT},

where p^µ_{TOT} = (E_ν + m_{Cl}, \vec{p}_ν), and m^2_{TOT} = (m_e + m_{Ar})^2. Then

    p^\mu_{TOT} p_{TOT\mu} = E_\nu^2 + m_{Cl}^2 + 2 m_{Cl} E_\nu - p_\nu^2 = m_{Cl}^2 + 2 m_{Cl} E_\nu = (m_e + m_{Ar})^2.

Then
    E_\nu = \frac{(m_e + m_{Ar})^2 - m_{Cl}^2}{2 m_{Cl}}.

2.7 Tensors
We are now familiar with four-vectors. They are though just one part of a
family of objects called tensors which can have more than one index. We will
need these later when we study electromagnetism. To introduce them think
about angular momentum:
Non-relativistically angular momentum is given by

    \vec{l} = \vec{r} \times \vec{p}    (2.78)
with components
    l_1 = y p_z - z p_y
    l_2 = z p_x - x p_z    (2.79)
    l_3 = x p_y - y p_x

Relativistically these components are naturally part of the tensor

    L^{\mu\nu} = x^\mu p^\nu - p^\mu x^\nu    (2.80)
For example
    L^{12} = x p_y - y p_x = l_3    (2.81)
Tensors have a number of properties which in this case we can deduce
from its “composite” nature. Thus

• Under Lorentz transformations: L'^{µν} = Λ^µ_α Λ^ν_β L^{αβ}

• Lorentz invariant: L^{µν} L_{µν} = (L^{00})^2 − (L^{01})^2 + (L^{11})^2 + ... = constant

Finally we note that the metric we introduced earlier is itself a tensor.

2.8 Relativistic Action
The action that reproduces the relativistic equation of motion for a free
particle
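    \frac{dp^\mu}{d\tau} = 0    (2.82)

has an interesting form. It is given by
    S = m \int \sqrt{\frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau}}\, d\tau    (2.83)
We can see that this works since the Euler Lagrange equations take the form
    \frac{d}{d\tau}\left(\frac{\partial L}{\partial (dx^\mu/d\tau)}\right) - \frac{\partial L}{\partial x^\mu} = 0    (2.84)

or explicitly
    \frac{d}{d\tau}\left[ m \left(\frac{dx^\nu}{d\tau}\frac{dx_\nu}{d\tau}\right)^{-1/2} \frac{dx^\mu}{d\tau} \right] = 0    (2.85)
Since
    \frac{dx^\mu}{d\tau}\frac{dx_\mu}{d\tau} = u^\mu u_\mu = c^2    (2.86)
we are left with

    \frac{d}{d\tau}\left(m \frac{dx^\mu}{d\tau}\right) = \frac{dp^\mu}{d\tau} = 0    (2.87)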
the correct equation of motion.
If we stare at (2.83) though we realize that it has an interesting form.
The proper time is being used to parameterize the path of the particle but if
we move dτ into the square root we see it cancels and what we are actually
doing is calculating the length of the path. This is very elegant in that the
length of the path is the only physical characteristic of the motion - it’s nice
that the action is so simple.

2.9 Appendix 1 - Lorentz Transformations and Rotations II

Naively you might think Lorentz transformations form a closed set of oper-
ations (that is doing two boosts is equivalent to doing one other boost) but
in fact things are more complicated as the following procedure shows...

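We will do the following Lorentz transformations on a four vector x^µ = (t, \vec{x}):

1) Boost by δv in the x-direction
2) Boost by δu in the y-direction
3) Boost by −δv in the x-direction
4) Boost by −δu in the y-direction

You might think we'd be back where we started but let's see...

1) Boost by δv in the x-direction:

    x'^\mu = \Lambda^\mu{}_\nu x^\nu = \begin{pmatrix} \gamma(\delta v) & -\gamma(\delta v)\frac{\delta v}{c} & 0 & 0 \\ -\gamma(\delta v)\frac{\delta v}{c} & \gamma(\delta v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}    (2.88)

Now since δv is small

    \gamma(\delta v) = \left(1 - \frac{\delta v^2}{c^2}\right)^{-1/2} \simeq 1 + \frac{1}{2}\frac{\delta v^2}{c^2} + \ldots    (2.89)
    \gamma(\delta v)\frac{\delta v}{c} \simeq \frac{\delta v}{c} + \ldots    (2.90)
We've kept all the terms up to order δv^2/c^2.
    x'^\mu = \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & -\frac{\delta v}{c} & 0 & 0 \\ -\frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}    (2.91)
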
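2) Boost by δu in the y-direction:

    x''^\mu = \Lambda^\mu{}_\nu x'^\nu = \begin{pmatrix} 1 + \frac{\delta u^2}{2c^2} & 0 & -\frac{\delta u}{c} & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{\delta u}{c} & 0 & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix}    (2.92)

3) Boost by −δv in the x-direction:

    x'''^\mu = \Lambda^\mu{}_\nu x''^\nu = \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & \frac{\delta v}{c} & 0 & 0 \\ \frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t'' \\ x''^1 \\ x''^2 \\ x''^3 \end{pmatrix}    (2.93)

4) Boost by −δu in the y-direction:

    x''''^\mu = \Lambda^\mu{}_\nu x'''^\nu = \begin{pmatrix} 1 + \frac{\delta u^2}{2c^2} & 0 & \frac{\delta u}{c} & 0 \\ 0 & 1 & 0 & 0 \\ \frac{\delta u}{c} & 0 & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} t''' \\ x'''^1 \\ x'''^2 \\ x'''^3 \end{pmatrix}    (2.94)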
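Note that 1) and 3) are each other's inverse:

    \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & \frac{\delta v}{c} & 0 & 0 \\ \frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & -\frac{\delta v}{c} & 0 & 0 \\ -\frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (2.95)

    = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + O\left(\frac{\delta v^3}{c^3}\right)    (2.96)

The loop of four transformations is more complicated though because
we're doing 2) in between 1) and 3). So...
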
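    x''''^\mu = \Lambda^\mu{}_\alpha(-\delta u)\, \Lambda^\alpha{}_\beta(-\delta v)\, \Lambda^\beta{}_\gamma(\delta u)\, \Lambda^\gamma{}_\nu(\delta v)\, x^\nu    (2.97)

    \Lambda^\mu{}_\alpha(-\delta u)\, \Lambda^\alpha{}_\beta(-\delta v)\, \Lambda^\beta{}_\gamma(\delta u)\, \Lambda^\gamma{}_\nu(\delta v) =    (2.98)

    \begin{pmatrix} 1 + \frac{\delta u^2}{2c^2} & 0 & \frac{\delta u}{c} & 0 \\ 0 & 1 & 0 & 0 \\ \frac{\delta u}{c} & 0 & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & \frac{\delta v}{c} & 0 & 0 \\ \frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
    \times \begin{pmatrix} 1 + \frac{\delta u^2}{2c^2} & 0 & -\frac{\delta u}{c} & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{\delta u}{c} & 0 & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 + \frac{\delta v^2}{2c^2} & -\frac{\delta v}{c} & 0 & 0 \\ -\frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (2.99)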

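    = \begin{pmatrix} 1 + \frac{\delta u^2 + \delta v^2}{2c^2} & \frac{\delta v}{c} & \frac{\delta u}{c} & 0 \\ \frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ \frac{\delta u}{c} & \frac{\delta u \delta v}{c^2} & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 + \frac{\delta u^2 + \delta v^2}{2c^2} & -\frac{\delta v}{c} & -\frac{\delta u}{c} & 0 \\ -\frac{\delta v}{c} & 1 + \frac{\delta v^2}{2c^2} & 0 & 0 \\ -\frac{\delta u}{c} & \frac{\delta u \delta v}{c^2} & 1 + \frac{\delta u^2}{2c^2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (2.100)

    = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -\frac{\delta u \delta v}{c^2} & 0 \\ 0 & \frac{\delta u \delta v}{c^2} & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + O\left(\frac{\delta v^3}{c^3}\right)    (2.101)

    = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}    (2.102)

where cos θ ≃ 1 + .... and sin θ ≃ −δuδv/c^2 + ... (the δuδv/c^2 entries in
(2.100) come from multiplying out the pairs of boosts in (2.99) to second
order).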

The result is a rotation about the z axis!! It is therefore necessary to consider
the combination of both rotations and Lorentz transformations as a single
set of transformations.
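
The loop of four boosts can also be multiplied out numerically without any small-velocity expansion. The sketch below uses exact boost matrices in units where c = 1; the helper names `boost`, `matmul` and `boost_loop` and the sample velocities are our own choices.

```python
import math

def boost(v, axis):
    """Exact 4x4 boost (c = 1) along axis 1 (x) or 2 (y), acting on (t, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    M = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    M[0][0] = g
    M[axis][axis] = g
    M[0][axis] = M[axis][0] = -g * v
    return M

def matmul(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def boost_loop(dv, du):
    """Product of the four boosts in the order of (2.97)."""
    result = boost(dv, 1)                       # step 1 acts first (rightmost)
    for step in (boost(du, 2), boost(-dv, 1), boost(-du, 2)):
        result = matmul(step, result)           # later steps multiply on the left
    return result

R = boost_loop(1e-3, 2e-3)
for row in R:
    print(["%+.3e" % entry for entry in row])
```

For small δv and δu the time row and column should reduce to those of the identity, while the x-y block is antisymmetric with off-diagonal entries of magnitude δuδv/c^2, which is the small rotation about the z axis.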

Chapter 3

Electromagnetism

Contents

• Maxwell’s equations review

• Conservation of charge

• Potential and Laplace’s equation

• Vector Potential

• Gauge transformations

• Wave equations in free space

• 4-vector current

• 4-vector potential

• F µν

• Relativistic formulation of Maxwell’s equations

• Relativistic force law and action.

Learning Outcomes

• Know Gauss’ and Stokes’ Theorems and Maxwell’s equations.

• Be able to derive and solve wave equation.

• Be able to solve Laplace’s equation for simple geometries.

• Know the relations between \vec{E}, \vec{B} and φ and \vec{A}.

• Be able to perform Lorentz transformations on Aµ , F µν etc.

Reference Books

• A smart student would re-read their second year EM notes!

• Foundations of Electromagnetic Theory - Reitz, Milford and Christy

• Investigate section QC670 in the library (eg Cook, Nafeh and Brussel,
etc)

In this section of the course we will study electromagnetism. You have
already seen Maxwell’s equations in integral and differential form - we will
review these shortly. Our main task here though will be to understand how
these equations already encode relativity. To do this we will need to rewrite
them in terms of potentials to find a manifestly Lorentz invariant form.

3.1 Integral Form of Maxwell’s Equations


We begin by reviewing the physics of Maxwell’s equations in integral form.

3.1.1 Gauss’ Law


$$\oint_S \vec{E}\cdot d\vec{A} = \frac{q}{\epsilon_0} \qquad (3.1)$$

• $\vec{E}$ is the (vector) force a unit charge experiences at position $\vec{x}$ ($\vec{F} = q\vec{E}$).

• The integral means a sum of $\vec{E}\cdot d\vec{A}$ for the infinitesimal surface elements that make up a whole, closed surface S. Remember that a little area element is described by a vector normal to its surface, $d\vec{A} = |d\vec{A}|\,\hat{n}$.

• q is the charge contained inside the surface.

e.g. The electric field around a point charge is given by Gauss’ law using a spherical shell around the charge:

$$4\pi r^2\,|\vec{E}| = \frac{q}{\epsilon_0} \quad\Rightarrow\quad |\vec{E}| = \frac{q}{4\pi\epsilon_0 r^2}$$
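Gauss’ law holds for any closed surface, not just the convenient sphere. As a rough illustration, the sketch below integrates the Coulomb field of a point charge over the faces of a cube centred on the charge and compares the flux with $q/\epsilon_0$. The function name `flux_through_cube`, the midpoint-rule grid and the numerical values are assumptions made for this example.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def flux_through_cube(q, half_side, n=200):
    """Midpoint-rule estimate of the flux of E through a cube centred on charge q."""
    h = half_side
    step = 2.0 * h / n
    face_flux = 0.0
    for i in range(n):
        x = -h + (i + 0.5) * step
        for j in range(n):
            y = -h + (j + 0.5) * step
            r2 = x * x + y * y + h * h
            # Normal (z) component of the Coulomb field on the face z = +h.
            e_normal = q / (4.0 * math.pi * EPS0) * h / r2 ** 1.5
            face_flux += e_normal * step * step
    return 6.0 * face_flux  # all six faces contribute equally by symmetry

q = 1.6e-19  # illustrative charge in coulombs
print(flux_through_cube(q, 0.5) * EPS0 / q)  # ratio of flux to q/eps0, close to 1
```

The ratio does not depend on the size of the cube, as expected from (3.1).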

3.1.2 No Magnetic Charges
The equivalent of Gauss’ law for magnetic fields is just

$$\oint_S \vec{B}\cdot d\vec{A} = 0 \qquad (3.2)$$
since there are no magnetic charges.

3.1.3 Faraday’s Law

Moving a loop of wire in a magnetic field induces a current in the wire.


The relevant measure of magnetic field is given by the flux through the loop
$$\Phi = \int_S \vec{B}\cdot d\vec{A} \qquad (3.3)$$
where the area we sum over is that enclosed by the loop. The induced voltage is given by
$$\text{e.m.f.} = -\frac{\partial \Phi}{\partial t} \qquad (3.4)$$
The minus sign reflects Lenz’s Law, which says the system resists change.
Finally the voltage difference around the loop, s, is given by “V = Ed”, but since different bits of the wire point in different directions we must calculate it for each infinitesimal bit of wire and sum the answers:
$$\text{e.m.f.} = \oint_s \vec{E}\cdot d\vec{l} = -\int_S \frac{\partial \vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.5)$$

3.1.4 Ampere’s Law


$$\oint_s \vec{B}\cdot d\vec{l} = \mu_0 \int_S \vec{J}\cdot d\vec{A} + \mu_0\epsilon_0 \int_S \frac{\partial \vec{E}}{\partial t}\cdot d\vec{A} \qquad (3.6)$$

Reading just the first two terms in this equation we see the familiar physics that if a current ($\vec{J}\cdot\vec{A}$) is flowing through some loop then there is a circulating magnetic field.

The final term was added for consistency by Maxwell (we will revisit this shortly) and mirrors the term in Faraday’s law.
The integral form of Maxwell’s equations is a complete description of electromagnetism. In what follows we shall simply recast the equations in several different ways in order to display their physics content better.
3.2 Differential Form of Maxwell’s Equations
The first rewriting of Maxwell’s equations we shall do is to put the equations
into a differential equation form. The benefit of this form will be that the
equations are true locally at a point. In the integral form one has to pick
“loops” and “areas” to define the integrals and they are therefore telling you
about global properties of a problem. We will need two bits of mathematics
you proved last year (see Appendix 1 for a proof):

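Gauss’ Theorem:
$$\oint_S \vec{F}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{F}\; dV \qquad (3.7)$$

Stokes’ Theorem:
$$\oint_s \vec{F}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \qquad (3.8)$$

We can use these to find the differential form of Maxwell’s equations, as the following two examples show.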

Differential Form of Gauss’ Law

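We can now convert the integral form of Gauss’ Law
$$\oint_S \vec{E}\cdot d\vec{A} = \frac{q}{\epsilon_0} \qquad (3.9)$$
to differential form using Gauss’ Theorem
$$\oint_S \vec{E}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{E}\; dV \qquad (3.10)$$
If we also write the charge in terms of a charge density
$$\frac{q}{\epsilon_0} = \int_V \frac{\rho}{\epsilon_0}\; dV \qquad (3.11)$$
then, since the two volume integrals must agree for any choice of volume, comparing these two equations we find
$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\epsilon_0} \qquad (3.12)$$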
Differential Form of Faraday’s Law

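The integral form of Faraday’s law is
$$\oint_s \vec{E}\cdot d\vec{l} = -\int_S \frac{\partial \vec{B}}{\partial t}\cdot d\vec{A} \qquad (3.13)$$
Using Stokes’ Theorem we see that
$$\oint_s \vec{E}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{E})\cdot d\vec{A} \qquad (3.14)$$
and hence
$$\vec{\nabla}\times\vec{E} = -\frac{\partial \vec{B}}{\partial t} \qquad (3.15)$$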

3.2.1 Maxwell’s Equations in Differential Form


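Using Gauss’ theorem and Stokes’ theorem we have now re-written the Maxwell equations as

$$\begin{aligned}
\vec{\nabla}\cdot\vec{E} &= \frac{\rho}{\epsilon_0} \\
\vec{\nabla}\cdot\vec{B} &= 0 \\
\vec{\nabla}\times\vec{E} &= -\frac{\partial \vec{B}}{\partial t} \\
\vec{\nabla}\times\vec{B} &= \mu_0\vec{J} + \mu_0\epsilon_0\frac{\partial \vec{E}}{\partial t}
\end{aligned} \qquad (3.16)$$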

3.2.2 Conservation of Charge


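Another equation it is useful to put into differential form is that describing charge conservation. Since charge is conserved, the current flowing out through the surface of some volume must equal the rate of decrease of the charge within the volume:
$$\oint_S \vec{J}\cdot d\vec{A} = -\int_V \frac{\partial \rho}{\partial t}\; dV \qquad (3.17)$$

[Figure: a volume containing charge $q = \int \rho\, dV$, with a current I flowing out through its surface.]

Applying Gauss’ Divergence theorem to the left hand side we have
$$\vec{\nabla}\cdot\vec{J} = -\frac{\partial \rho}{\partial t} \qquad (3.18)$$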

3.2.3 The Displacement Current


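Prior to Maxwell’s involvement the fourth “Maxwell” equation was just
$$\vec{\nabla}\times\vec{B} = \mu_0\vec{J} \qquad (3.19)$$
However, we can see quite simply in this formalism that this cannot be correct. This is because it is true that for any vector field $\vec{F}$
$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) \equiv 0 \qquad (3.20)$$
(the proof is given in Appendix 2 at the end of this chapter). Let’s see if this makes sense for our equation above by taking the divergence:
$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) \equiv 0 = \mu_0\vec{\nabla}\cdot\vec{J} \qquad (3.21)$$
But this isn’t correct in general, since we just saw that $\vec{\nabla}\cdot\vec{J} = -\partial\rho/\partial t$, which is non-zero whenever the charge density is changing. Maxwell’s extra term corrects things, as we can see:
$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{B}) \equiv 0 = \mu_0\vec{\nabla}\cdot\vec{J} + \mu_0\epsilon_0\frac{\partial}{\partial t}\vec{\nabla}\cdot\vec{E} \qquad (3.22)$$
Using the first Maxwell equation ($\vec{\nabla}\cdot\vec{E} = \rho/\epsilon_0$) we recover the correct conservation of charge formula.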
3.3 Potentials
Potentials are a mathematical trick for making Maxwell’s equations easier to solve. The one you are already familiar with is:

3.3.1 Electric Potential


In electrostatic problems Maxwell’s equations reduce to
$$\vec{\nabla}\cdot\vec{E} = \frac{\rho}{\epsilon_0}, \qquad \vec{\nabla}\times\vec{E} = \vec{0} \qquad (3.23)$$
If we write
$$\vec{E} = -\vec{\nabla}\phi \qquad (3.24)$$
then, because of the identity (see Appendix 2)
$$\vec{\nabla}\times\vec{\nabla}\phi \equiv 0 \qquad (3.25)$$
the second of our two Maxwell equations is automatically satisfied. We are left with only Poisson’s equation
$$-\nabla^2\phi = \frac{\rho}{\epsilon_0} \qquad (3.26)$$
This simplifies things! Further, the potential
$$\phi(\vec{x}) = -\int^{\vec{x}} \vec{E}\cdot d\vec{l} \qquad (3.27)$$
can be interpreted as the “potential energy” for moving a unit charge to the point $\vec{x}$. This energy is independent of the path the charge takes to arrive at that point.
Note that $\phi$ is only defined up to an arbitrary constant (the energy of a charge at infinity) since
$$\vec{E} = -\vec{\nabla}(\phi + C) = -\vec{\nabla}\phi \qquad (3.28)$$

Example 1: Infinite Parallel Plate Capacitor


Consider the capacitor with a potential difference of V across it.

[Figure: parallel plates at x = 0 ($\phi = 0$) and x = d ($\phi = V$), with the field $\vec{E}$ between them.]

Between the plates there is no charge so we should solve Laplace’s equation
$$\nabla^2\phi = 0 \qquad (3.29)$$
In this problem, by symmetry, the only variation in $\phi$ will be in the x direction so
$$\nabla^2\phi = \frac{d^2\phi}{dx^2} = 0 \qquad (3.30)$$
Integrating twice we obtain
$$\phi = Ax + C \qquad (3.31)$$
with A, C constants. They can be fixed by imposing the boundary conditions $\phi(x = 0) = 0$, $\phi(x = d) = V$. We obtain
$$\phi = \frac{V}{d}\,x \qquad (3.32)$$
Finally we can obtain the electric field from the potential
$$\vec{E} = -\vec{\nabla}\phi = \left(-\frac{V}{d},\, 0,\, 0\right) \qquad (3.33)$$
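Example 2: Co-axial Cable
A co-axial cable is a similar problem but with different symmetry properties.

[Figure: cross-section of a co-axial cable, with the inner conductor of radius a at $\phi = V$ and the outer conductor of radius b at $\phi = 0$.]

Here the potential will only vary radially. In Appendix 2 $\nabla^2$ is calculated in cylindrical polar coordinates $(r, \theta, z)$. Only allowing r variation in $\phi$ we find
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{d\phi}{dr}\right) = 0 \qquad (3.34)$$
Integrating twice we find
$$\phi = A\ln r + C \qquad (3.35)$$
Again we fix the integration constants from the boundary conditions shown in the figure, $\phi(a) = V$ and $\phi(b) = 0$, so
$$\phi(r) = -\frac{V}{\ln(b/a)}\,(\ln r - \ln b) \qquad (3.36)$$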
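The co-axial solution can be checked numerically. The sketch below evaluates (3.36), confirms the boundary values, and uses a simple finite-difference estimate of the radial operator in (3.34) to confirm that it vanishes between the conductors. The function names, the radii and the voltage are illustrative assumptions (arbitrary units).

```python
import math

def coax_potential(r, a, b, V):
    """Potential (3.36): phi = V on the inner conductor r = a, phi = 0 on r = b."""
    return -V * (math.log(r) - math.log(b)) / math.log(b / a)

def radial_laplacian(f, r, h=1e-4):
    """Finite-difference estimate of (1/r) d/dr (r df/dr)."""
    outer = (r + h / 2) * (f(r + h) - f(r)) / h
    inner = (r - h / 2) * (f(r) - f(r - h)) / h
    return (outer - inner) / (h * r)

a, b, V = 1.0, 5.0, 10.0  # illustrative radii and voltage in arbitrary units

def phi(r):
    return coax_potential(r, a, b, V)

print(phi(a), phi(b))              # boundary values V and 0
print(radial_laplacian(phi, 3.0))  # close to zero between the conductors
```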

3.3.2 Vector Potential


Having introduced an electric potential we might try to introduce a magnetic potential in the same way. This does not work though, because even in static magnetic problems there must be a current to generate the magnetic field. Thus
$$\vec{\nabla}\times\vec{B} = \mu_0\vec{J} \qquad (3.37)$$
and we cannot use a scalar potential field since $\vec{\nabla}\times\vec{\nabla}\phi \equiv 0$.
On the other hand for magnetic fields
$$\vec{\nabla}\cdot\vec{B} = 0 \qquad (3.38)$$
and so we can make use of an alternative identity
$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) \equiv 0 \qquad (3.39)$$
Thus we can automatically solve the second of these two Maxwell equations provided we write the magnetic field in terms of a new vector field, the “vector potential” $\vec{A}$:
$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.40)$$

Just as there was some freedom in the choice of the electric potential there is an arbitrariness about $\vec{A}$. $\vec{B}$ is left invariant if we transform
$$\vec{A} \to \vec{A} + \vec{\nabla}\psi(x) \qquad (3.41)$$
where $\psi(x)$ is an arbitrary scalar field. The invariance of $\vec{B}$ follows from the identity $\vec{\nabla}\times(\vec{\nabla}\psi) \equiv 0$.

3.3.3 A New Electric Potential


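The electric potential we wrote before only worked when there were no magnetic fields. Can we find simultaneous potentials for both $\vec{E}$ and $\vec{B}$ that work in all circumstances? Those potentials are
$$\vec{E} = -\frac{\partial \vec{A}}{\partial t} - \vec{\nabla}\phi, \qquad \vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.42)$$
These automatically satisfy two of the Maxwell equations:
$$\vec{\nabla}\cdot\vec{B} = \vec{\nabla}\cdot(\vec{\nabla}\times\vec{A}) = 0 \qquad (3.43)$$
and also
$$\begin{aligned}
\vec{\nabla}\times\vec{E} &= \vec{\nabla}\times\left(-\frac{\partial \vec{A}}{\partial t} - \vec{\nabla}\phi\right) \\
&= -\frac{\partial(\vec{\nabla}\times\vec{A})}{\partial t} - \vec{\nabla}\times(\vec{\nabla}\phi) \\
&= -\frac{\partial \vec{B}}{\partial t} - 0
\end{aligned} \qquad (3.44)$$
This should simplify things greatly since now there are only the remaining two Maxwell equations to solve. Let’s write them out in terms of the potentials:
$$\vec{\nabla}\cdot\vec{E} = -\nabla^2\phi - \frac{\partial(\vec{\nabla}\cdot\vec{A})}{\partial t} = \frac{\rho}{\epsilon_0} \qquad (3.45)$$
For the $\vec{\nabla}\times\vec{B}$ equation we will again use the identity for this product in Appendix 2. Thus
$$\vec{\nabla}(\vec{\nabla}\cdot\vec{A}) - \nabla^2\vec{A} = \mu_0\vec{J} + \mu_0\epsilon_0\frac{\partial}{\partial t}\left(-\frac{\partial \vec{A}}{\partial t} - \vec{\nabla}\phi\right) \qquad (3.46)$$
or rearranging
$$-\nabla^2\vec{A} + \mu_0\epsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} - \vec{\nabla}\left(\vec{\nabla}\cdot\vec{A} + \mu_0\epsilon_0\frac{\partial\phi}{\partial t}\right) \qquad (3.47)$$
Unfortunately these two equations we are left with are quite messy! To clean them up we can make use of our ability to redefine the potentials whilst keeping the $\vec{E}$, $\vec{B}$ fields the same.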

3.3.4 Gauge Transformations


The transformations for these potentials that leave $\vec{E}$, $\vec{B}$ invariant are the following gauge transformations
$$\vec{A} \to \vec{A} + \vec{\nabla}\psi, \qquad \phi \to \phi - \frac{\partial\psi}{\partial t} \qquad (3.48)$$

Exercise: Show explicitly that the $\vec{E}$, $\vec{B}$ fields are left invariant by these transformations.
We can make a choice of gauge that transforms $\vec{\nabla}\cdot\vec{A}$ as follows
$$\vec{\nabla}\cdot\vec{A} \to \vec{\nabla}\cdot(\vec{A} + \vec{\nabla}\psi) = \vec{\nabla}\cdot\vec{A} + \nabla^2\psi \qquad (3.49)$$
Note that $\vec{\nabla}\cdot\vec{A}$ is a number at each point in space. $\nabla^2\psi$ is also a number at each point, but here we get to choose it by choosing $\psi$. The upshot is that we can choose to transform $\vec{\nabla}\cdot\vec{A}$ to anything we want!
$\vec{\nabla}\cdot\vec{A} = 0$ is one sensible choice (known as Coulomb gauge).

3.3.5 Maxwell’s Equations in Lorentz Gauge


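Let’s choose to make a gauge transformation such that
$$\vec{\nabla}\cdot\vec{A} = -\frac{1}{c^2}\frac{\partial\phi}{\partial t} = -\mu_0\epsilon_0\frac{\partial\phi}{\partial t} \qquad (3.50)$$
In this gauge Maxwell’s equations simplify to
$$-\nabla^2\phi + \mu_0\epsilon_0\frac{\partial^2\phi}{\partial t^2} = \frac{\rho}{\epsilon_0} \qquad (3.51)$$
$$-\nabla^2\vec{A} + \mu_0\epsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = \mu_0\vec{J} \qquad (3.52)$$

This form of our remaining Maxwell’s equations is much prettier! Observe the following two points: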

Wave Equations in Free Space

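In free space $\vec{J} = 0$ and $\rho = 0$ and these equations become wave equations
$$-\nabla^2\phi + \mu_0\epsilon_0\frac{\partial^2\phi}{\partial t^2} = 0, \qquad -\nabla^2\vec{A} + \mu_0\epsilon_0\frac{\partial^2\vec{A}}{\partial t^2} = 0 \qquad (3.53)$$
which have wave solutions of the form
$$\vec{A}(\vec{r}, t) = \vec{A}_0\, e^{i(\omega t - \vec{k}\cdot\vec{r})} \qquad (3.54)$$
Substituting this solution into the wave equation we find the condition
$$\frac{\omega^2}{k^2} = c^2 = \frac{1}{\mu_0\epsilon_0} \qquad (3.55)$$
In other words these waves move at a speed $c = 1/\sqrt{\mu_0\epsilon_0}$, which is the speed of light. This is how Maxwell concluded that light is an electromagnetic wave.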
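The numerical value implied by (3.55) is easy to evaluate. The sketch below uses the CODATA 2018 values of $\mu_0$ and $\epsilon_0$; the function name `wave_speed` is just an illustrative choice.

```python
import math

MU0 = 1.25663706212e-6   # vacuum permeability in N/A^2 (CODATA 2018 value)
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018 value)

def wave_speed(mu, eps):
    """Phase speed omega/k implied by the dispersion relation (3.55)."""
    return 1.0 / math.sqrt(mu * eps)

c = wave_speed(MU0, EPS0)
print(f"c = {c:.6e} m/s")
```

The result agrees with the defined speed of light, 299 792 458 m/s, to within the precision of the input constants.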

Relativistic Form

Equations (3.51) and (3.52) also have a very suggestive form for Relativity - they are symmetric in time and space. There’s also a symmetry between the components of $\vec{A}$ and $\phi$ - should we promote them to the components of a four-vector? Similarly, should the charge density and current become a four-vector?

3.4 Relativistic Formulation Of Electromagnetism
Our goal now is to cast Maxwell’s equations in a manifestly Lorentz invariant
form which is compatible with the second postulate of Special Relativity. The
equations in Lorentz gauge suggested a four-vector form:

3.4.1 Four-vector Current


Consider a uniform distribution of charge in a volume V at rest in some frame.

[Figure: a box of volume V containing charge $q = \rho V$.]

If the charge density is $\rho_0$ then the total charge is $\rho_0 V$.

Now consider boosting to a frame moving with speed v relative to the charge. The volume changes because of Lorentz contraction:
$$V' = \frac{V}{\gamma} \qquad (3.56)$$
The total number of charges in the box must be the same for each observer, though, so the charge density must also change to keep the total charge fixed. Thus
$$\rho' = \gamma\rho_0 \qquad (3.57)$$
There will also now be a current density since the charges are moving in the new inertial frame. These transformations are all consistent with $\rho$ and $\vec{J}$ being the components of a four-vector.
Thus we define
$$J^{\mu} = (\rho c,\, \vec{J}) \qquad (3.58)$$
Classically the current density is just given in terms of the velocity of the particles as $\rho\vec{v}$. The natural relativistic definition is therefore
$$J^{\mu} = \rho_0 u^{\mu} = \rho_0\frac{dx^{\mu}}{d\tau} \qquad (3.59)$$
The Lorentz invariant “length” of the four-vector then follows from $u^{\mu}u_{\mu} = c^2$:
$$J^{\mu}J_{\mu} = \rho_0^2 c^2 \qquad (3.60)$$
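The statements (3.57) and (3.60) can be illustrated numerically. The sketch below boosts the rest-frame current $J^{\mu} = (\rho_0 c, \vec{0})$ along x and checks that the charge density grows by $\gamma$ while $J^{\mu}J_{\mu}$ is unchanged. The helper names `boost_x` and `minkowski_square`, and the chosen density and speed, are assumptions made for this example.

```python
import math

C = 299792458.0  # speed of light in m/s

def boost_x(four_vector, v):
    """Boost a contravariant four-vector to a frame moving with speed v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    beta = v / C
    a0, a1, a2, a3 = four_vector
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)

def minkowski_square(a):
    """Invariant a^mu a_mu with signature (+, -, -, -)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2

rho0 = 2.0e-6                        # rest-frame charge density in C/m^3 (illustrative)
v = 0.6 * C                          # boost speed, giving gamma = 1.25
j_rest = (rho0 * C, 0.0, 0.0, 0.0)   # J^mu = (rho c, J) for charge at rest
j_moving = boost_x(j_rest, v)

rho_moving = j_moving[0] / C
print(rho_moving / rho0)             # ratio of densities, equal to gamma
print(minkowski_square(j_moving) / minkowski_square(j_rest))  # invariant ratio
```

In the boosted frame the charges move in the negative x direction, so the spatial component of the current comes out negative.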

Exercise: Write equations for how each of the four components of J µ trans-
form under a Lorentz boost by v in the x-direction.

3.4.2 Conservation of Charge


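The conservation of charge equation
$$\vec{\nabla}\cdot\vec{J} + \frac{\partial\rho}{\partial t} = 0 \qquad (3.61)$$
can now be written in a Lorentz invariant form
$$\partial_{\mu}J^{\mu} = 0 \qquad (3.62)$$
where
$$\partial^{\mu} = \left(\frac{\partial}{\partial x^0},\, -\vec{\nabla}\right) = \left(\frac{1}{c}\frac{\partial}{\partial t},\, -\vec{\nabla}\right) \qquad (3.63)$$
Note the minus sign in the definition of the relativistic derivative four-vector $\partial^{\mu}$. It looks a bit odd but is needed to get the signs correct here: lowering the index gives $\partial_{\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t},\, +\vec{\nabla}\right)$, so that $\partial_{\mu}J^{\mu}$ reproduces (3.61). In fact it is the only prescription compatible with the usual definition of $x^{\mu}$: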

3.4.3 The Four Vector ∂ µ


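You might worry that defining
$$\partial^{\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t},\, -\vec{\nabla}\right) \qquad (3.64)$$
with a minus sign contradicts the fact that
$$x^{\mu} = (ct,\, \vec{x}) \qquad (3.65)$$
For example, under a Lorentz boost to a frame moving with speed v in the positive x direction
$$x'^{\mu} = \Lambda^{\mu}{}_{\nu}\,x^{\nu} \qquad (3.66)$$
i.e.
$$ct' = \gamma\,ct - \gamma\frac{v}{c}\,x, \qquad x' = \gamma x - \gamma\frac{v}{c}\,ct \qquad (3.67)$$
or, inverting the relations,
$$ct = \gamma\,ct' + \gamma\frac{v}{c}\,x', \qquad x = \gamma x' + \gamma\frac{v}{c}\,ct' \qquad (3.68)$$
Similarly the definition in (3.64) would imply
$$\partial'^{\mu} = \Lambda^{\mu}{}_{\nu}\,\partial^{\nu} \qquad (3.69)$$
i.e.
$$\frac{1}{c}\frac{\partial}{\partial t'} = \gamma\frac{1}{c}\frac{\partial}{\partial t} + \gamma\frac{v}{c}\frac{\partial}{\partial x}, \qquad -\frac{\partial}{\partial x'} = -\gamma\frac{\partial}{\partial x} - \gamma\frac{v}{c}\frac{1}{c}\frac{\partial}{\partial t} \qquad (3.70)$$
Note the signs in the transformations.
To show this is consistent let’s work it out from first principles using the chain rule:
$$\frac{\partial}{\partial t'} = \frac{\partial x}{\partial t'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial t'}\frac{\partial}{\partial t} \qquad (3.71)$$
$$\frac{\partial}{\partial x'} = \frac{\partial x}{\partial x'}\frac{\partial}{\partial x} + \frac{\partial t}{\partial x'}\frac{\partial}{\partial t} \qquad (3.72)$$
From the transformations in (3.68) above
$$\frac{\partial x}{\partial t'} = v\gamma, \qquad \frac{\partial t}{\partial t'} = \gamma, \qquad \frac{\partial t}{\partial x'} = \frac{v}{c^2}\gamma, \qquad \frac{\partial x}{\partial x'} = \gamma \qquad (3.73)$$

Substituting these in (3.71) and (3.72) we find (3.69) - this shows that there is not an inconsistency (and in fact that the minus sign in (3.64) is required).

We can also define a four-vector version of $\nabla^2$ by
$$\Box = \partial^{\mu}\partial_{\mu} = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.74)$$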

3.4.4 Four Vector Potential


The final element we need to write Maxwell’s equations in a Lorentz invariant form is a four-vector including the potentials. The appropriate four-vector is
$$A^{\mu} = \left(\frac{\phi}{c},\, \vec{A}\right) \qquad (3.75)$$
The Maxwell equations are then
$$\Box A^{\mu} = \frac{J^{\mu}}{\epsilon_0 c^2} \qquad (3.76)$$

The $\mu = 0$ equation is the $\phi$ equation (3.51) and the $\mu = 1, 2, 3$ equations give the components of the equation (3.52) for $\vec{A}$ (recall that $\mu_0 = 1/\epsilon_0 c^2$).
The Maxwell equations in Lorentz gauge also required the gauge condition (3.50), which becomes
$$\partial_{\mu}A^{\mu} = 0 \qquad (3.77)$$

Remember, being able to write these equations in four-vector notation is a huge step in itself. We now know that electromagnetism is relativistically invariant.
3.4.5 A Moving Point Charge
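One of the advantages of the relativistic formulation is that we understand how electric and magnetic fields behave under boosts. As an example let’s look at the fields around a moving electric charge.
For an electric charge at rest we know that
$$A^{\mu} = \left(\frac{\phi}{c},\, \vec{A}\right) = \left(\frac{q}{4\pi\epsilon_0 r c},\, \vec{0}\right) \qquad (3.78)$$
We can make the charge move by boosting to an inertial frame moving at speed v in the positive x direction:
$$A'^{\mu} = \Lambda^{\mu}{}_{\nu}\,A^{\nu} \qquad (3.79)$$
so for example
$$A'^{0} = \gamma\left(A^0 - \frac{v}{c}A^x\right) \qquad (3.80)$$
which means
$$\phi' = \frac{\gamma q}{4\pi\epsilon_0 r} \qquad (3.81)$$
where r is still the distance measured in the rest frame of the charge. We must remember also that $r^2 = x^2 + y^2 + z^2$ and x transforms too, so
$$\phi' = \frac{\gamma q}{4\pi\epsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.82)$$
Turning to the spatial components we find
$$A'^{x} = -\gamma\frac{v}{c}A^0 = -\frac{\gamma v}{c^2}\,\frac{q}{4\pi\epsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{1/2}} \qquad (3.83)$$
and $A'^{y} = A'^{z} = 0$.

The electric field is then given by
$$\vec{E}' = -\vec{\nabla}'\phi' - \frac{\partial \vec{A}'}{\partial t'} \qquad (3.84)$$
which works through to
$$\begin{aligned}
E'^{x} &= \frac{q\gamma\,(x' + vt')}{4\pi\epsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\
E'^{y} &= \frac{q\gamma\,y'}{4\pi\epsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}} \\
E'^{z} &= \frac{q\gamma\,z'}{4\pi\epsilon_0\left(\gamma^2(x' + vt')^2 + y'^2 + z'^2\right)^{3/2}}
\end{aligned} \qquad (3.85)$$
These results are particularly interesting when $v \simeq c$. Look first on the x-axis:
$$E'^{x} \simeq \frac{q}{4\pi\epsilon_0\,\gamma^2 (x' + vt')^2} \qquad (3.86)$$
Since $\gamma$ is large this component of the field is reduced relative to that of the stationary charge. On the other hand, if we look at the field perpendicular to the motion (i.e. at $x' = -vt'$), $E'^{y}$ and $E'^{z}$ are both enlarged by a factor of $\gamma$.
Thus the field of a relativistic moving charge is essentially confined to a disc perpendicular to its direction of motion.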
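To get a feel for the size of the effect, the sketch below evaluates (3.85) for a charge moving at 0.99c and compares the field directly ahead of the charge and directly beside it with the Coulomb field of a stationary charge at the same distance. The function names, the charge, the distance and the speed are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
C = 299792458.0          # speed of light in m/s

def moving_charge_field(q, v, x, y, z, t):
    """Electric field (3.85) of a moving charge at (x, y, z, t) in the primed frame."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    dx = x + v * t
    denom = 4.0 * math.pi * EPS0 * (gamma**2 * dx**2 + y**2 + z**2) ** 1.5
    return (q * gamma * dx / denom, q * gamma * y / denom, q * gamma * z / denom)

def coulomb(q, r):
    """Field magnitude of a stationary point charge at distance r."""
    return q / (4.0 * math.pi * EPS0 * r**2)

q, d = 1.6e-19, 1e-9     # illustrative charge (C) and distance (m)
v = 0.99 * C
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)

ahead = moving_charge_field(q, v, d, 0.0, 0.0, 0.0)[0] / coulomb(q, d)
beside = moving_charge_field(q, v, 0.0, d, 0.0, 0.0)[1] / coulomb(q, d)
print(ahead, 1.0 / gamma**2)  # longitudinal field suppressed by 1/gamma^2
print(beside, gamma)          # transverse field enhanced by gamma
```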

3.4.6 The Electromagnetic Field Strength Tensor


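It is also possible to write Maxwell’s equations in a relativistic form involving $\vec{E}$ and $\vec{B}$ fields rather than the potentials. Remember that
$$\vec{E} = -\vec{\nabla}\phi - \frac{\partial \vec{A}}{\partial t} \qquad (3.87)$$
so a component is given in terms of $A^{\mu}$, $\partial^{\nu}$ by
$$\frac{E^i}{c} = \partial^i A^0 - \partial^0 A^i \qquad (3.88)$$
Similarly
$$\vec{B} = \vec{\nabla}\times\vec{A} \qquad (3.89)$$
so, up to signs, we have the form
$$B^i = \partial^j A^k - \partial^k A^j \qquad (3.90)$$
Thus we conclude that the $\vec{E}$ and $\vec{B}$ fields are described by the EM field strength tensor
$$F^{\mu\nu} = \partial^{\mu}A^{\nu} - \partial^{\nu}A^{\mu} \qquad (3.91)$$

Explicitly the components are
$$F^{\mu\nu} = \begin{pmatrix} 0 & -\frac{E^1}{c} & -\frac{E^2}{c} & -\frac{E^3}{c} \\ \frac{E^1}{c} & 0 & -B^3 & B^2 \\ \frac{E^2}{c} & B^3 & 0 & -B^1 \\ \frac{E^3}{c} & -B^2 & B^1 & 0 \end{pmatrix} \qquad (3.92)$$
where $\mu$ counts the row and $\nu$ the column.
Maxwell’s equations in terms of $F^{\mu\nu}$ are a little involved and are given by
$$\partial_{\mu}F^{\mu\nu} = \mu_0 J^{\nu} \qquad (3.93)$$
$$\partial^{\lambda}F^{\mu\nu} + \partial^{\mu}F^{\nu\lambda} + \partial^{\nu}F^{\lambda\mu} = 0 \qquad (3.94)$$

For example the first equation contains ($\nu = 0$) $\vec{\nabla}\cdot\vec{E} = \rho/\epsilon_0$ and ($\nu = 1, 2, 3$) $\vec{\nabla}\times\vec{B} = \mu_0\vec{J} + \mu_0\epsilon_0\,\partial\vec{E}/\partial t$.
The second equation is actually 64 equations so contains many repeats of the remaining two Maxwell equations. For example if we set $\lambda = 1$, $\mu = 3$, $\nu = 2$ we obtain $\vec{\nabla}\cdot\vec{B} = 0$ and so forth.

Exercise: Explicitly extract the differential form of Maxwell’s equations from (3.93), (3.94).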

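3.4.7 Lorentz Transformations of Electric and Magnetic Fields

We can calculate the Lorentz Transformation properties of the $\vec{E}$ and $\vec{B}$ fields using the fact that $F^{\mu\nu}$ transforms as
$$F'^{\mu\nu} = \Lambda^{\mu}{}_{\alpha}\Lambda^{\nu}{}_{\beta}F^{\alpha\beta} \qquad (3.95)$$
For example for a boost by speed v in the positive z direction
$$\begin{aligned}
\frac{E'^1}{c} &= F'^{10} \\
&= \Lambda^{1}{}_{\alpha}\Lambda^{0}{}_{\beta}F^{\alpha\beta} \\
&= \Lambda^{0}{}_{\beta}\left(\Lambda^{1}{}_{0}F^{0\beta} + \Lambda^{1}{}_{1}F^{1\beta} + \Lambda^{1}{}_{2}F^{2\beta} + \Lambda^{1}{}_{3}F^{3\beta}\right) \\
&= \Lambda^{0}{}_{\beta}F^{1\beta} \\
&= \Lambda^{0}{}_{0}F^{10} + \Lambda^{0}{}_{1}F^{11} + \Lambda^{0}{}_{2}F^{12} + \Lambda^{0}{}_{3}F^{13} \\
&= \gamma\left(\frac{E^1}{c} - \frac{v}{c}B^2\right)
\end{aligned} \qquad (3.96)$$
The full set of transformations is given by
$$\begin{aligned}
\frac{E'^1}{c} &= \gamma\left(\frac{E^1}{c} - \frac{v}{c}B^2\right) \\
\frac{E'^2}{c} &= \gamma\left(\frac{E^2}{c} + \frac{v}{c}B^1\right) \\
\frac{E'^3}{c} &= \frac{E^3}{c} \\
B'^1 &= \gamma\left(B^1 + \frac{v}{c}\frac{E^2}{c}\right) \\
B'^2 &= \gamma\left(B^2 - \frac{v}{c}\frac{E^1}{c}\right) \\
B'^3 &= B^3
\end{aligned} \qquad (3.97)$$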
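The transformation rule (3.95) can be applied mechanically and compared with the closed-form results (3.97). The sketch below builds $F^{\mu\nu}$ from (3.92), boosts it along z and reads the new field components back out of the tensor. It assumes units with c = 1; the function names and the field values are illustrative only.

```python
import math

def field_tensor(E, B, c=1.0):
    """Build F^{mu nu} of (3.92) from the 3-vectors E and B."""
    e1, e2, e3 = (component / c for component in E)
    b1, b2, b3 = B
    return [[0.0, -e1, -e2, -e3],
            [e1, 0.0, -b3, b2],
            [e2, b3, 0.0, -b1],
            [e3, -b2, b1, 0.0]]

def boost_z(v, c=1.0):
    """Lorentz boost matrix for speed v along the positive z direction."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    lam = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    lam[0][0] = lam[3][3] = gamma
    lam[0][3] = lam[3][0] = -gamma * v / c
    return lam

def transform(F, lam):
    """F'^{mu nu} = Lambda^mu_alpha Lambda^nu_beta F^{alpha beta}, as in (3.95)."""
    return [[sum(lam[m][a] * lam[n][b] * F[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

E, B, v = (1.0, 2.0, 3.0), (0.5, -1.5, 2.5), 0.6  # units with c = 1 (illustrative)
Fp = transform(field_tensor(E, B), boost_z(v))

E_new = (Fp[1][0], Fp[2][0], Fp[3][0])  # E'^i / c = F'^{i0}
B_new = (Fp[3][2], Fp[1][3], Fp[2][1])  # B'^1 = F'^{32}, B'^2 = F'^{13}, B'^3 = F'^{21}
print(E_new)
print(B_new)
```

With v = 0.6 (so $\gamma = 1.25$) the components along the boost direction are unchanged, while the transverse components mix as in (3.97).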

Exercise: Evaluate $F^{\mu\nu}F_{\mu\nu}$ in terms of $\vec{E}$ and $\vec{B}$ fields.

3.4.8 The Relativistic Force Law


When we were studying relativity we promised to return to the idea of relativistic force when we had studied electromagnetism.
Classically the electromagnetic force is given by
$$\vec{F} = q(\vec{E} + \vec{v}\times\vec{B}) \qquad (3.98)$$
Thus for example the x component is given by (superscripts on v label components)
$$F^1 = q(E^1 + v^2B^3 - v^3B^2) = q(cF^{10} - v^2F^{12} - v^3F^{13}) \qquad (3.99)$$

To make this more symmetric we can add $-v^1F^{11}$ since this is just zero!
Now since $(c, v^1, v^2, v^3)$ is just the non-relativistic limit of $u^{\mu}$ we are led to
$$f^{\mu} = q\,u_{\nu}F^{\mu\nu} \qquad (3.100)$$

Now we can ask what the non-relativistic limit of the time-like component of force is:
$$\begin{aligned}
f^0 &= q\left(u^0F^{00} - u^1F^{01} - u^2F^{02} - u^3F^{03}\right) \\
&= q\gamma\,\frac{1}{c}\,\vec{v}\cdot\vec{E} \\
&= q\gamma\,\frac{\vec{v}}{c}\cdot\left(\vec{E} + \vec{v}\times\vec{B}\right)
\end{aligned} \qquad (3.101)$$
where we have used that $\vec{v}\cdot(\vec{v}\times\vec{B}) \equiv 0$.

Taking $v \ll c$ we obtain $f^0 \simeq q\vec{v}\cdot\vec{E}/c$, where $q\vec{v}\cdot\vec{E}$ is just the work done per second. This indeed should be the rate of change of energy, so (with $p^0 = E/c$) it makes sense to equate $f^0$ to $dp^0/d\tau$ in the relativistic generalization of Newton’s law.

3.5 The Lagrangian For a Charged Particle


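The equation of motion for a charged, moving particle is given by
$$\frac{d\vec{p}}{dt} = q(\vec{E} + \vec{v}\times\vec{B}) \qquad (3.102)$$
The action that reproduces this equation is
$$S = \int L\,dt, \qquad L = \frac{1}{2}m|\dot{\vec{x}}|^2 + q(\dot{\vec{x}}\cdot\vec{A}) - q\phi \qquad (3.103)$$
The Euler-Lagrange equation is
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\vec{x}}}\right) - \frac{\partial L}{\partial \vec{x}} = 0 \qquad (3.104)$$
or
$$\frac{d}{dt}\left(m\dot{\vec{x}} + q\vec{A}\right) - \vec{\nabla}\left(q\,\dot{\vec{x}}\cdot\vec{A} - q\phi\right) = 0 \qquad (3.105)$$
To see this is the equation we want we must first be careful about the time dependence of $\vec{A}$. Of course it can explicitly depend on time, but even if it is constant in time the particle, as it moves, will see a time variation of the field. This is accounted for using the chain rule
$$\frac{d}{dt} = \frac{\partial}{\partial t} + \frac{dx}{dt}\frac{\partial}{\partial x} + \frac{dy}{dt}\frac{\partial}{\partial y} + \frac{dz}{dt}\frac{\partial}{\partial z} = \frac{\partial}{\partial t} + \dot{\vec{x}}\cdot\vec{\nabla} \qquad (3.106)$$
So, writing $\vec{p} = m\dot{\vec{x}}$, our equation of motion is
$$\frac{d\vec{p}}{dt} + q\frac{\partial \vec{A}}{\partial t} + q(\dot{\vec{x}}\cdot\vec{\nabla})\vec{A} - q\vec{\nabla}(\dot{\vec{x}}\cdot\vec{A}) + q\vec{\nabla}\phi = 0 \qquad (3.107)$$
Next we use the identity
$$\dot{\vec{x}}\times(\vec{\nabla}\times\vec{A}) = \vec{\nabla}(\dot{\vec{x}}\cdot\vec{A}) - (\dot{\vec{x}}\cdot\vec{\nabla})\vec{A} \qquad (3.108)$$
We have
$$\frac{d\vec{p}}{dt} = q\left(-\frac{\partial \vec{A}}{\partial t} - \vec{\nabla}\phi\right) + q\,\dot{\vec{x}}\times(\vec{\nabla}\times\vec{A}) \qquad (3.109)$$
Finally we remember the form for the electric and magnetic field in terms of the potentials (3.42) and see that this is precisely the equation of motion (3.102) we wanted!
Note also the expressions for the generalized momenta
$$\vec{p}_{gen} = \frac{\partial L}{\partial \dot{\vec{x}}} = m\dot{\vec{x}} + q\vec{A} \qquad (3.110)$$
and for the Hamiltonian
$$H = \vec{p}_{gen}\cdot\dot{\vec{x}} - L = \frac{1}{2}m|\dot{\vec{x}}|^2 + q\phi \qquad (3.111)$$
These expressions combine into the generalized four-vector momentum
$$p^{\mu}_{gen} = mu^{\mu} + qA^{\mu} \qquad (3.112)$$

Replacing momenta in a problem by this generalized four-momentum is called “minimal substitution”.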
3.6 Appendix 1 - Gauss’ and Stokes’ Theorems
Here are derivations of these two crucial theorems:

3.6.1 Gauss’ Theorem


We want to convert the surface integral
$$\oint_S \vec{F}\cdot d\vec{A} \qquad (3.113)$$
to a form that is locally true. We do this by calculating the integral for an infinitesimal cubic volume.

[Figure: an infinitesimal cube with sides dx, dy, dz, drawn relative to axes X, Y, Z with origin O.]

We choose the surface in the integral as the surface of this cube.

As an example let’s take
$$\vec{F} = F\hat{z} \qquad (3.114)$$
(i.e. the field $\vec{F}$ points in the z direction).
Calculating the surface integral for this $\vec{F}$:

Flux at bottom surface = $-F(0)\,\delta x\,\delta y$
Flux at top surface = $\left(F(0) + \frac{\partial F}{\partial z}\delta z\right)\delta x\,\delta y$

Here we have Taylor expanded F to keep only the leading change in its behaviour as we move in the z direction. Note that the top and bottom areas contribute opposite signs because the area vectors point in opposite directions. The other surfaces contribute nothing for this choice of $\vec{F}$. The total integral is therefore
$$\oint_S \vec{F}\cdot d\vec{A} = \frac{\partial F}{\partial z}\,\delta x\,\delta y\,\delta z \qquad (3.115)$$

This result generalizes, when $\vec{F}$ has x and y components too, to:
$$\lim_{\delta V \to 0}\frac{\oint_S \vec{F}\cdot d\vec{A}}{\delta V} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \qquad (3.116)$$
where $\delta V$ is the volume of the cube.
Alternatively we may write this as:
$$\oint_S \vec{F}\cdot d\vec{A} = \vec{\nabla}\cdot\vec{F}\;\delta V \qquad (3.117)$$
where
$$\vec{\nabla} = \hat{x}\frac{\partial}{\partial x} + \hat{y}\frac{\partial}{\partial y} + \hat{z}\frac{\partial}{\partial z} \qquad (3.118)$$

Gauss’ Theorem For Extended Volumes

It is easy to obtain the equivalent expression for an arbitrary volume - we just build it up out of infinitesimal cubes: e.g. if we put two together:

[Figure: two adjacent cubes sharing a face, with the area vectors $d\vec{A}$ on the shared face pointing in opposite directions.]

It turns out that
$$\oint_{\text{two cubes}} \vec{F}\cdot d\vec{A} = \oint_{\text{cube one}} \vec{F}\cdot d\vec{A} + \oint_{\text{cube two}} \vec{F}\cdot d\vec{A} \qquad (3.119)$$
since the side shared by the two cubes has an area vector with opposite sign in the case of the two integrals - the side cancels! We can therefore build any shape in this way and the surface integral is just the sum over the surface integrals of the component cubes, so we arrive at Gauss’ Theorem
$$\oint_S \vec{F}\cdot d\vec{A} = \int_V \vec{\nabla}\cdot\vec{F}\; dV \qquad (3.120)$$

3.6.2 Stokes’ Theorem


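Next we want to convert the line integral
$$\oint_s \vec{F}\cdot d\vec{l} \qquad (3.121)$$
to a form that is locally true. We do this by calculating the integral for an infinitesimal rectangular loop.

[Figure: an infinitesimal rectangular loop with sides dx and dz, with corners labelled P and Q.]

We’ve chosen the loop to lie in the x-z plane.

If at the bottom corner of the rectangle (P)
$$\vec{F} = F_x\hat{x} + F_y\hat{y} + F_z\hat{z} \qquad (3.122)$$
then the line integral gets contributions from the top and bottom of the form “$F_x\,dx$” and from the sides of the form “$F_z\,dz$”. We must take into account the change in these components across the box though. Clockwise round the box we get contributions:
$$\oint_s \vec{F}\cdot d\vec{l} = F_z\,\delta z + \left(F_x + \frac{\partial F_x}{\partial z}\delta z\right)\delta x - \left(F_z + \frac{\partial F_z}{\partial x}\delta x\right)\delta z - F_x\,\delta x \qquad (3.123)$$
$$\oint_s \vec{F}\cdot d\vec{l} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\delta x\,\delta z \qquad (3.124)$$
or
$$\oint_s \vec{F}\cdot d\vec{l} = c_y\,dA \qquad (3.125)$$

Note that the area element is in the $\hat{y}$ direction.

In general $c_y$ is the y-component of a vector called the curl of $\vec{F}$. Its other components are
$$c_x = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) \qquad (3.126)$$
$$c_z = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \qquad (3.127)$$

We can write the curl as
$$\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \qquad (3.128)$$

The calculation above then generalizes, for an area placed at random relative to the axes, to
$$\oint_s \vec{F}\cdot d\vec{l} = (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \qquad (3.129)$$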
Stokes’ Theorem For Extended Areas

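We can again make larger areas by placing infinitesimal squares next to each other - the common sides cancel from the sum.

[Figure: two adjacent squares sharing a side, with the line elements $d\vec{l}$ on the shared side pointing in opposite directions.]

Thus
$$\oint_{\text{two sq}} \vec{F}\cdot d\vec{l} = \oint_{\text{sq one}} \vec{F}\cdot d\vec{l} + \oint_{\text{sq two}} \vec{F}\cdot d\vec{l} \qquad (3.130)$$

Using our above result we arrive at Stokes’ theorem
$$\oint_s \vec{F}\cdot d\vec{l} = \int_S (\vec{\nabla}\times\vec{F})\cdot d\vec{A} \qquad (3.131)$$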
3.7 Appendix 2 - Vector Identities
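Identity: $\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = 0$

Proof:
$$\vec{\nabla}\times\vec{F} = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix}
= \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{x} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{y} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{z}$$
$$\vec{\nabla}\cdot(\vec{\nabla}\times\vec{F}) = \frac{\partial^2 F_z}{\partial x\partial y} - \frac{\partial^2 F_y}{\partial x\partial z} + \frac{\partial^2 F_x}{\partial y\partial z} - \frac{\partial^2 F_z}{\partial y\partial x} + \frac{\partial^2 F_y}{\partial z\partial x} - \frac{\partial^2 F_x}{\partial z\partial y} = 0$$

Identity: $\vec{\nabla}\times(\vec{\nabla}\phi) = 0$

Proof:
$$\vec{\nabla}\times(\vec{\nabla}\phi) = \begin{vmatrix} \hat{x} & \hat{y} & \hat{z} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial\phi}{\partial x} & \frac{\partial\phi}{\partial y} & \frac{\partial\phi}{\partial z} \end{vmatrix}
= \left(\frac{\partial^2\phi}{\partial y\partial z} - \frac{\partial^2\phi}{\partial z\partial y}\right)\hat{x} + \left(\frac{\partial^2\phi}{\partial z\partial x} - \frac{\partial^2\phi}{\partial x\partial z}\right)\hat{y} + \left(\frac{\partial^2\phi}{\partial x\partial y} - \frac{\partial^2\phi}{\partial y\partial x}\right)\hat{z} = 0$$

Identity: $\vec{\nabla}\times(\vec{\nabla}\times\vec{F}) = \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}$

Proof: Starting from the components of $\vec{\nabla}\times\vec{F}$ given in the first proof above, and taking the curl again,
$$\begin{aligned}
\vec{\nabla}\times(\vec{\nabla}\times\vec{F}) &= \left(\frac{\partial^2 F_y}{\partial y\partial x} - \frac{\partial^2 F_x}{\partial y^2} - \frac{\partial^2 F_x}{\partial z^2} + \frac{\partial^2 F_z}{\partial z\partial x}\right)\hat{x} \\
&\quad - \left(\frac{\partial^2 F_y}{\partial x^2} - \frac{\partial^2 F_x}{\partial x\partial y} - \frac{\partial^2 F_z}{\partial z\partial y} + \frac{\partial^2 F_y}{\partial z^2}\right)\hat{y} \\
&\quad + \left(\frac{\partial^2 F_x}{\partial x\partial z} - \frac{\partial^2 F_z}{\partial x^2} - \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_y}{\partial y\partial z}\right)\hat{z}
\end{aligned}$$
Adding and subtracting $\partial^2 F_x/\partial x^2$ in the x component (and similarly for y and z) gives
$$\begin{aligned}
&= \left[\frac{\partial}{\partial x}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_x\right]\hat{x} \\
&\quad + \left[\frac{\partial}{\partial y}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_y\right]\hat{y} \\
&\quad + \left[\frac{\partial}{\partial z}\left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)F_z\right]\hat{z} \\
&= \vec{\nabla}(\vec{\nabla}\cdot\vec{F}) - \nabla^2\vec{F}
\end{aligned}$$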

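Identity: In cylindrical polar coordinates $(r, \theta, z)$
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}$$

Proof:
$$\begin{aligned}
\nabla^2 &= \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \\
&= \frac{\partial}{\partial x}\left(\frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta}\right) + \frac{\partial}{\partial y}\left(\frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial}{\partial\theta}\right) + \frac{\partial^2}{\partial z^2} \\
&= \left(\frac{\partial r}{\partial x}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial x^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial x}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial x^2}\frac{\partial}{\partial\theta} + 2\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}\frac{\partial^2}{\partial r\partial\theta} \\
&\quad + \left(\frac{\partial r}{\partial y}\right)^2\frac{\partial^2}{\partial r^2} + \frac{\partial^2 r}{\partial y^2}\frac{\partial}{\partial r} + \left(\frac{\partial\theta}{\partial y}\right)^2\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2\theta}{\partial y^2}\frac{\partial}{\partial\theta} + 2\frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}\frac{\partial^2}{\partial r\partial\theta} + \frac{\partial^2}{\partial z^2}
\end{aligned}$$

Now we use the relations between x, y and r, $\theta$:
$$r = (x^2 + y^2)^{1/2}, \qquad x = r\sin\theta, \qquad \tan\theta = x/y, \qquad y = r\cos\theta$$
which give
$$\begin{aligned}
\frac{\partial r}{\partial x} &= \frac{x}{(x^2+y^2)^{1/2}} = \sin\theta \\
\frac{\partial^2 r}{\partial x^2} &= \frac{1}{(x^2+y^2)^{1/2}} - \frac{x^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1 - \sin^2\theta) = \frac{\cos^2\theta}{r} \\
\frac{\partial r}{\partial y} &= \frac{y}{(x^2+y^2)^{1/2}} = \cos\theta \\
\frac{\partial^2 r}{\partial y^2} &= \frac{1}{(x^2+y^2)^{1/2}} - \frac{y^2}{(x^2+y^2)^{3/2}} = \frac{1}{r}(1 - \cos^2\theta) = \frac{\sin^2\theta}{r} \\
\frac{\partial\theta}{\partial x} &= \frac{\cos^2\theta}{y} = \frac{\cos\theta}{r} \\
\frac{\partial^2\theta}{\partial x^2} &= -\frac{2\cos\theta\sin\theta}{y}\frac{\partial\theta}{\partial x} = -\frac{2\cos\theta\sin\theta}{r^2} \\
\frac{\partial\theta}{\partial y} &= -\cos^2\theta\,\frac{x}{y^2} = -\frac{\sin\theta}{r} \\
\frac{\partial^2\theta}{\partial y^2} &= 2\cos\theta\sin\theta\,\frac{x}{y^2}\frac{\partial\theta}{\partial y} + 2\cos^2\theta\,\frac{x}{y^3} = -\frac{2}{r^2}\frac{\sin\theta}{\cos\theta}(\sin^2\theta - 1) = \frac{2\sin\theta\cos\theta}{r^2}
\end{aligned}$$

Substituting in above, the coefficients of $\partial/\partial\theta$ and of $\partial^2/\partial r\partial\theta$ cancel between the x and y terms, and we find directly
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}
= \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}$$

A similar procedure may be used in spherical polar coordinates $(r, \theta, \phi)$ where
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$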
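The cylindrical form of $\nabla^2$ can be spot-checked numerically. The sketch below takes the test function $f = r^3\cos 2\theta$, for which the polar formula gives $(6 + 3 - 4)\,r\cos 2\theta = 5r\cos 2\theta$ by hand, and compares this with a finite-difference Cartesian Laplacian. It uses the convention of this appendix, $x = r\sin\theta$ and $y = r\cos\theta$; the test function, the function names and the evaluation point are assumptions made for this example.

```python
import math

def f(x, y):
    """Test function r^3 cos(2 theta) with x = r sin(theta), y = r cos(theta)."""
    r = math.hypot(x, y)
    return r * (y * y - x * x)  # r^3 cos(2 theta) = r (y^2 - x^2)

def cartesian_laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference estimate of d2f/dx2 + d2f/dy2."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / h**2

def polar_laplacian_of_f(x, y):
    """Polar formula applied to f by hand: 5 r cos(2 theta) = 5 (y^2 - x^2) / r."""
    r = math.hypot(x, y)
    return 5.0 * (y * y - x * x) / r

print(cartesian_laplacian(f, 1.0, 2.0), polar_laplacian_of_f(1.0, 2.0))
```

The two numbers agree up to the finite-difference error, which would not be the case if the angular term were missing from the polar formula.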

Chapter 4

Quantum Mechanics

Contents

• Fourier Analysis in QM:


• Momentum space wave functions
• Completeness and orthogonality
• Initial condition problems for quantum wells
• Klein-Gordon equation
• Relativistic Schrödinger equation
• Negative energy solutions
• Perturbation Theory in QM:
• Time independent perturbation theory
• Time dependent perturbation theory
• Fermi’s Golden Rule

Learning Outcomes

• Know the properties of coordinate and momentum space wave functions and be able to calculate one from the other in simple problems.

• Know the meaning of completeness and orthogonality and their implications for initial condition problems.

• Be able to compute the time evolution of an initial wave function in square well problems.

• Be able to compute the Klein-Gordon equation starting from the relativistic relation for energy and momentum.

• Understand the meaning of negative energy solutions to the Klein-Gordon equation.

Reference Books

• PHYS2003 Quantum Physics notes.

• Introduction to Quantum Mechanics, Griffiths

• Quarks and Leptons, Halzen and Martin

• Quantum Physics, Eisberg and Resnick

4.1 Introduction
In this part of the course we will study a number of techniques in Quantum
Mechanics (QM). To begin with (chapter 4.2) we will review parts of your
second year course but bearing in mind what you have learnt about Fourier
analysis. Since QM is a theory of waves it is not surprising that Fourier
techniques are very important. In chapter 4.3 we will move on to investigate
the relativistic Schrödinger equation, more commonly known as the Klein-
Gordon equation. Here we will encounter negative energy solutions and we
will see the standard interpretation due to Stückelberg and Feynman. Finally,
in chapter 4.4 we will work through the major results in QM perturbation
theory, a technique for solving problems that are close to previously solved
problems. It is most useful in scattering experiments where the deviation of
a particle as the result of an interaction is small relative to its free motion.

4.2 Quantum Mechanics Review


In QM the behaviour of a particle is controlled by a wave equation. A free particle is associated with a wave
$$\psi = e^{i(kx - \omega t)} \qquad (4.1)$$
where the wave number k and angular frequency $\omega$ are related to the momentum and energy of the particle
$$p = \frac{h}{\lambda} \quad\to\quad k = \frac{p}{\hbar} \qquad (4.2)$$
$$E = h\nu \quad\to\quad \omega = \frac{E}{\hbar} \qquad (4.3)$$
The properties of the particle can therefore be obtained from the wave by acting on it with operators
$$E\psi = i\hbar\frac{\partial}{\partial t}\psi \qquad (4.4)$$
$$p\psi = -i\hbar\frac{\partial}{\partial x}\psi \qquad (4.5)$$
The free wave is an eigenfunction of these operators with the values of E and p being the eigenvalues.

For a classical particle we require that energy is conserved so
$$E = \frac{p^2}{2m} + V \qquad (4.6)$$
which, using the operators, we can rewrite as a wave equation
$$i\hbar\frac{\partial}{\partial t}\psi = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi + V\psi \qquad (4.7)$$
This is the Schrödinger equation which is central to QM.

4.2.1 Time Independent Schrödinger Equation


In problems where V is independent of time there are always solutions to the Schrödinger equation of the form
$$\psi(x, t) = u(x)\,e^{-iEt/\hbar} \qquad (4.8)$$
where u(x) satisfies (simply substitute this solution into the full Schrödinger equation) the time independent Schrödinger equation
$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}u(x) + V(x)u(x) = E\,u(x) \qquad (4.9)$$

4.2.2 Interpretation
The squared modulus of the wave function, $\psi^*(x, t)\psi(x, t)$ (which in the time independent case is just $u^*(x)u(x)$), is associated with the probability of finding a particle at x. Remembering that x is continuous, the precise statement is
$$u^*(x)u(x)\,dx = \text{prob. of finding particle between } x \text{ and } x + dx \qquad (4.10)$$

Graphically, the probability of finding the particle in the dx spatial slice is just the area under the curve $u^*u$ in that slice.
Since the particle must be somewhere with probability one we must have
$$\int_{-\infty}^{\infty} u^*(x)u(x)\,dx = 1 \qquad (4.11)$$
Formally we find observable properties of the particles using the operators
$$\langle x\rangle = \int_{-\infty}^{\infty} u^*(x)\,x\,u(x)\,dx \qquad (4.12)$$
$$\langle p\rangle = \int_{-\infty}^{\infty} u^*(x)\left(-i\hbar\frac{\partial}{\partial x}\right)u(x)\,dx \qquad (4.13)$$

[Figure: the curve $u^*u$ plotted against x, with a slice of width dx marked.]

4.2.3 Momentum Space Wave Functions


In the above discussion we have described the particle by its wave function at a particular point in space and then shown how to calculate its momentum with an operator. Alternatively we could write a wave function that directly describes the probability of the particle having momentum in some dp interval; calculating the position then becomes more complicated.
In fact it is possible to set up this momentum space wave function such that
$$\phi^*(p)\,\phi(p)\,dp = \text{prob. of particle having momentum } p \text{ to } p + dp \qquad (4.14)$$
$$\int_{-\infty}^{\infty} \phi^*(p)\,\phi(p)\,dp = 1 \qquad (4.15)$$
with the properties of the particle being given by the operator relations
$$\int_{-\infty}^{\infty} \phi^*(p)\,p\,\phi(p)\,dp = \langle p\rangle \qquad (4.16)$$
$$\int_{-\infty}^{\infty} \phi^*(p)\left(-i\hbar\frac{\partial}{\partial p}\right)\phi(p)\,dp = \langle x\rangle \qquad (4.17)$$

The relationship between $\psi(x)$ and $\phi(p)$ is given by a Fourier Transform
$$\phi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi(x)\,e^{ipx/\hbar}\,dx \qquad (4.18)$$
or inversely
$$\psi(x) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \phi(p)\,e^{-ipx/\hbar}\,dp \qquad (4.19)$$

We can demonstrate that the Fourier Transform indeed has the correct
properties by checking the consistency of the three operator equations above.
Firstly consider

i R 00

−ipx0
hR ipx
R ∗ 1
R 0 ∗ 0 00 00
φ (p) φ(p) dp = 2π~
dp dx e ~ ψ (x ) dx e ~ ψ(x )

00
00 00 ip(x −x0 )
dx0 1
ψ ∗ (x0 )ψ(x )
R R R
= dx 2π~
dpe ~

(4.20)

We recognise the dp integral as the Fourier expansion of a delta function

Z
1
δ(x − x0 ) = eiω(x−x0 ) dω (4.21)

So with ω = p/~ and dω = dp/~

00 00 00
φ∗ (p) φ(p)dp = dx0 dx δ(x − x0 )ψ ∗ (x0 )ψ(x )
R R R

0
dx0 ψ ∗ (x0 )ψ(x )
R
= (4.22)

= 1

The equations are consistent.

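Secondly we can check the relation for the expectation value of the
particle's position

$$
\begin{aligned}
\int \phi^*(p) \left( -i\hbar \frac{\partial}{\partial p} \right) \phi(p)\, dp
&= \frac{1}{2\pi\hbar} \int dp \left[ \int dx'\, e^{-ipx'/\hbar}\, \psi^*(x') \right] \left[ \int dx'' \left( -i\hbar\, \frac{i x''}{\hbar} \right) e^{ipx''/\hbar}\, \psi(x'') \right] \\
&= \int dx' \int dx''\, \psi^*(x')\, x''\, \psi(x'')\, \frac{1}{2\pi\hbar} \int dp\, e^{ip(x''-x')/\hbar} \\
&= \int dx' \int dx''\, \delta(x'' - x')\, \psi^*(x')\, x''\, \psi(x'') \\
&= \int dx'\, \psi^*(x')\, x'\, \psi(x') \\
&= \langle x \rangle
\end{aligned}
\tag{4.23}
$$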
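
The normalization and position checks can also be tried numerically. The sketch below is only an illustration under stated assumptions: it uses units with ħ = 1, a Gaussian test wave packet centred on an arbitrarily chosen point X0, the sign convention of eq. (4.18), and simple Riemann sums and central differences in place of the exact integrals and derivatives. The names psi, phi and momentum_space_checks are invented for this example.

```python
import cmath
import math

HBAR = 1.0   # assumption: units with hbar = 1
X0 = 1.5     # assumed centre of the test wave packet
SIGMA = 1.0  # assumed width of the test wave packet


def psi(x):
    """Normalised Gaussian wave packet centred on X0 (illustrative choice)."""
    norm = (math.pi * SIGMA**2) ** -0.25
    return norm * math.exp(-((x - X0) ** 2) / (2.0 * SIGMA**2))


def phi(p, xs, psi_vals, dx):
    """Riemann-sum approximation to eq. (4.18), with the e^{+ipx/hbar} sign."""
    total = sum(f * cmath.exp(1j * p * x / HBAR) for x, f in zip(xs, psi_vals))
    return total * dx / math.sqrt(2.0 * math.pi * HBAR)


def momentum_space_checks(step=0.05, half_width=8.0):
    """Return (integral of |phi|^2 dp, <x> computed in momentum space)."""
    n = int(round(2.0 * half_width / step))
    xs = [X0 - half_width + i * step for i in range(n + 1)]
    ps = [-half_width + i * step for i in range(n + 1)]
    psi_vals = [psi(x) for x in xs]
    phis = [phi(p, xs, psi_vals, step) for p in ps]

    # Eq. (4.15): total probability in momentum space.
    norm = sum(abs(f) ** 2 for f in phis) * step

    # Eq. (4.17): <x> from -i hbar d/dp, with a central difference for d/dp.
    mean_x = 0.0
    for i in range(1, n):
        dphi_dp = (phis[i + 1] - phis[i - 1]) / (2.0 * step)
        mean_x += (phis[i].conjugate() * (-1j * HBAR) * dphi_dp).real * step
    return norm, mean_x
```

Calling momentum_space_checks() should return a normalization close to one and a mean position close to X0, up to the discretisation error of the finite-difference derivative.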

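Finally we check the expectation value for momentum

$$
\begin{aligned}
\int \phi^*(p)\, p\, \phi(p)\, dp
&= \frac{1}{2\pi\hbar} \int dp \left[ \int dx'\, e^{-ipx'/\hbar}\, \psi^*(x') \right] \left( -i\hbar \frac{\partial}{\partial x''} \right) \left[ \int dx''\, e^{ipx''/\hbar}\, \psi(x'') \right] \\
&= \int dx' \int dx''\, \delta(x'' - x')\, \psi^*(x') \left( -i\hbar \frac{\partial}{\partial x''} \right) \psi(x'') \\
&= \int dx'\, \psi^*(x') \left( -i\hbar \frac{\partial}{\partial x'} \right) \psi(x') \\
&= \langle p \rangle
\end{aligned}
\tag{4.24}
$$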

Everything is nicely consistent.

4.2.4 Square Well Example
A simple, interesting example of a QM system is the square potential well.
We assume that the particle cannot penetrate the infinite barriers.

[Figure: infinite square well, with V = ∞ for x < 0 and for x > a, and V = 0 for 0 < x < a.]

Since the potential is time independent the solution takes the form

$$\psi(x, t) = u(x)\, e^{-iEt/\hbar} \tag{4.25}$$


and we must solve the time independent Schrödinger equation

$$-\frac{\hbar^2}{2m} \frac{d^2}{dx^2} u(x) + V(x)\, u(x) = E\, u(x) \tag{4.26}$$
Of course in the region of interest the potential is just V = 0.

The solutions to this equation take the form

u(x) = A sin kx + B cos kx (4.27)


The integration constants are fixed by the boundary conditions that ψ vanishes
at x = 0 and x = a. The first condition forces B = 0 and the second requires
k = nπ/a, so

$$u_n(x) = A \sin\frac{n\pi x}{a} \tag{4.28}$$

with n = 1, 2, 3, ...
Substituting this solution into the Schrödinger equation we find

$$E_n = \frac{\hbar^2}{2m} \left( \frac{n\pi}{a} \right)^2 \tag{4.29}$$

Finally, to find the constant A we require that ψ(x, t) is correctly
normalized:

$$1 = \int_{-\infty}^{\infty} \psi^* \psi\, dx = \int_0^a A^2 \sin^2\frac{n\pi x}{a}\, dx = \frac{A^2 a}{2} \tag{4.30}$$

so that A = √(2/a).
The full solution is therefore

$$\psi_n(x, t) = \sqrt{\frac{2}{a}}\, \sin\frac{n\pi x}{a}\, e^{-iE_n t/\hbar} \tag{4.31}$$
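
A short numerical sanity check of the square well results may be helpful here. This is only a sketch under assumed values (ħ = 1, unit mass, well width a = 2), with function names invented for the illustration. It estimates the normalization integral of eq. (4.30) with a midpoint rule and compares −(ħ²/2m) u''/u, evaluated by a finite difference, with the energy of eq. (4.29).

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: unit particle mass
A = 2.0     # assumed well width a


def u(n, x):
    """Stationary state of eq. (4.28) with A = sqrt(2/a)."""
    return math.sqrt(2.0 / A) * math.sin(n * math.pi * x / A)


def energy(n):
    """Energy level of eq. (4.29)."""
    return HBAR**2 / (2.0 * MASS) * (n * math.pi / A) ** 2


def norm(n, steps=2000):
    """Midpoint-rule estimate of the integral of u_n^2 over the well."""
    dx = A / steps
    return sum(u(n, (i + 0.5) * dx) ** 2 for i in range(steps)) * dx


def local_energy(n, x, h=1e-3):
    """Estimate -(hbar^2/2m) u''/u at x using a central second difference."""
    second = (u(n, x + h) - 2.0 * u(n, x) + u(n, x - h)) / h**2
    return -HBAR**2 / (2.0 * MASS) * second / u(n, x)
```

For any n, norm(n) should come out close to one, and local_energy(n, x) should agree with energy(n) at any point x inside the well where u_n is non-zero.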

4.2.5 Completeness
The square well problem shares features with a typical Fourier analysis
problem. The solutions in each case are of the form sin(nπx/a). Thus we can
make a Fourier expansion of any initial condition for the wave form and then
determine the evolution.
For example if we take an initial wave function form at t = 0

[Figure: the initial wave function at t = 0, drawn between x = 0 and x = a.]

then we can write

$$u(x, t = 0) = \sum_{n=1}^{\infty} c_n\, u_n(x) \tag{4.32}$$

where the c_n are the Fourier coefficients. Taking into account the
normalization of u_n(x) they are given by

$$c_n = \frac{8k}{n^2 \pi^2} \sqrt{\frac{a}{2}}\, \sin\frac{n\pi}{2} . \tag{4.33}$$
We now know the time evolution since we know that each individual term
evolves as

$$u_n(x, 0) \to e^{-iE_n t/\hbar}\, u_n(x, 0) . \tag{4.34}$$


Resumming the Fourier series at time t gives the evolution of the initial
condition (to a precision determined by how many terms you resum).
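
The expansion and resummation can be sketched numerically. The initial profile is not recoverable from the figure, so the code below assumes a triangular profile that rises linearly from zero at x = 0 to a peak of height k at x = a/2 and falls back to zero at x = a; this is the shape for which a direct projection reproduces the coefficients of eq. (4.33). The values of ħ, m, a and k, and all function names, are assumptions made for the illustration.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1
MASS = 1.0  # assumption: unit particle mass
A = 2.0     # assumed well width a
K = 1.0     # assumed peak height k of the triangular profile


def u(n, x):
    """Normalised square well eigenfunction, eq. (4.28) with A = sqrt(2/a)."""
    return math.sqrt(2.0 / A) * math.sin(n * math.pi * x / A)


def energy(n):
    """Energy level of eq. (4.29)."""
    return HBAR**2 / (2.0 * MASS) * (n * math.pi / A) ** 2


def triangle(x):
    """Assumed initial profile: linear rise to K at x = a/2, then linear fall."""
    return 2.0 * K * x / A if x <= A / 2.0 else 2.0 * K * (A - x) / A


def c_formula(n):
    """Coefficient quoted in eq. (4.33)."""
    return (8.0 * K / (n * math.pi) ** 2) * math.sqrt(A / 2.0) * math.sin(n * math.pi / 2.0)


def c_numeric(n, steps=4000):
    """Midpoint-rule projection of the assumed profile onto u_n."""
    dx = A / steps
    return sum(triangle((i + 0.5) * dx) * u(n, (i + 0.5) * dx) for i in range(steps)) * dx


def evolve(x, t, n_max=99):
    """Resum the series at time t; each term carries the phase of eq. (4.34)."""
    return sum(
        c_formula(n) * cmath.exp(-1j * energy(n) * t / HBAR) * u(n, x)
        for n in range(1, n_max + 1)
    )
```

At t = 0 the resummed series should reproduce the assumed profile, for instance evolve(A / 2, 0.0) should be close to K, with the agreement improving as n_max is increased.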

This is an example of a general rule in QM called completeness: any
wave function may be expanded as a series of the eigenfunction solutions of
the Schrödinger equation relevant to that problem. In other words in any
problem we may write
$$\phi(x) = \sum_n c_n\, u_n(x) , \tag{4.35}$$

for any function φ(x), where

$$H u_n = E_n u_n . \tag{4.36}$$
We won’t prove this here but if it weren’t true we’d be in all sorts of
trouble! Imagine we had found all the solutions of the Schrödinger equation
and then wrote down an initial condition that couldn’t be rewritten in terms
of those solutions... we’d have no idea how to evolve that initial condition -
which is silly! Completeness therefore has to be true for the theory to make
sense.

4.2.6 Orthogonality
It is also important in these initial condition problems that there is a unique
way of writing
$$u(x, 0) = \sum_n c_n\, u_n(x) \tag{4.37}$$

If it were not unique then a given initial condition would have more than
one expansion, and these would evolve differently. Again the evolution would
be undetermined and the theory would not make sense.
Each u_n(x) therefore contains unique information. Orthogonality is a
mathematical statement of this fact

$$\int_{-\infty}^{\infty} u_n^*(x)\, u_m(x)\, dx = \delta_{nm} \tag{4.38}$$

where δ_nm = 1 if m = n and δ_nm = 0 if m ≠ n.


You can think of this expression as similar to a dot product between the
coordinate axes vectors (î, ĵ, k̂) - the axes contain the separate information
about the three directions in the space and the dot product is zero between
any two orthogonal directions.

Proof: The u_n are eigenfunctions of the Hamiltonian H satisfying
H u_n = E_n u_n, so consider

$$\int u_i^*\, H\, u_j\, dx \tag{4.39}$$

Because H is Hermitian, with real eigenvalues, we can act with H either to
the left or to the right, in which case we find

$$E_j \int u_i^*\, u_j\, dx = E_i \int u_i^*\, u_j\, dx \tag{4.40}$$

For i ≠ j with E_i ≠ E_j this can only be true if the wave functions are
orthogonal and both sides are zero. When i = j the integral over the wave
function squared is just the usual probability of finding the particle
anywhere in space and is set equal to one.
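
As a concrete illustration, the orthogonality relation of eq. (4.38) can be checked numerically for the square well eigenfunctions of section 4.2.4. This is a minimal sketch, assuming a well of width a = 1 and using a midpoint rule for the integral; the function names are invented for the example.

```python
import math

A = 1.0  # assumed well width a


def u(n, x):
    """Normalised square well eigenfunction sqrt(2/a) sin(n pi x / a)."""
    return math.sqrt(2.0 / A) * math.sin(n * math.pi * x / A)


def overlap(n, m, steps=2000):
    """Midpoint-rule estimate of the integral of u_n(x) u_m(x) over the well."""
    dx = A / steps
    return sum(u(n, (i + 0.5) * dx) * u(m, (i + 0.5) * dx) for i in range(steps)) * dx
```

Here overlap(n, m) should be close to one when n = m and close to zero otherwise. The eigenfunctions are real in this example, so no complex conjugation is needed.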

4.3 Klein-Gordon equation
4.3.1 The Schrödinger equation
Going back to the Schrödinger equation, it is useful to remember that a way
of deriving it for a free particle of mass m is to start with the classical relation
between energy and momentum,
$$E = \frac{p^2}{2m} , \tag{4.41}$$

and write E and p in terms of differential operators,

$$E \to i\hbar \frac{\partial}{\partial t} , \qquad \mathbf{p} \to -i\hbar \nabla . \tag{4.42}$$

The resulting equation is understood to act on a wave function ψ(x, t), and
so it reads

$$i\hbar \frac{\partial \psi}{\partial t} + \frac{\hbar^2}{2m} \nabla^2 \psi = 0 . \tag{4.43}$$

As already discussed last year |ψ|² is the probability density, with |ψ|² d³x
being the probability of finding the particle in the volume d³x. From now on
we will use the notation ρ ≡ |ψ|².
The main application within this Quantum Mechanics section of the concepts
developed is to the study of scattering processes, where the particles
concerned are in motion. It therefore makes sense to define the flux density
of a beam of particles, j. This flux obeys a continuity equation: the
rate of decrease of the number of particles in a given volume must equal the
flux of particles out of that volume. We can write that as

$$-\frac{\partial}{\partial t} \int_V \rho\, dV = \int_S \mathbf{j} \cdot \hat{\mathbf{n}}\, dS , \tag{4.44}$$

where n̂ is a unit vector along the outward normal to the element dS of the
surface S enclosing volume V . Using Gauss’s theorem
$$\int_S \mathbf{j} \cdot \hat{\mathbf{n}}\, dS = \int_V \nabla \cdot \mathbf{j}\, dV , \tag{4.45}$$

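and putting these two last equations together we arrive at

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 . \tag{4.46}$$

To determine j we use the Schrödinger equation, eq. (4.43), given that

$$\frac{\partial \rho}{\partial t} = \psi^* \frac{\partial \psi}{\partial t} + \frac{\partial \psi^*}{\partial t} \psi , \tag{4.47}$$

and we can replace the partial derivatives by the corresponding ∇² terms.
Then

$$\frac{\partial \rho}{\partial t} - \frac{i\hbar}{2m} \left( \psi^* \nabla^2 \psi - \psi \nabla^2 \psi^* \right) = 0 . \tag{4.48}$$

This last equation allows us to identify the probability flux density as

$$\mathbf{j} = -\frac{i\hbar}{2m} \left( \psi^* \nabla \psi - \psi \nabla \psi^* \right) . \tag{4.49}$$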
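
The continuity equation can be tested numerically on a state whose density actually changes in time. The sketch below is an illustration under stated assumptions: one spatial dimension, units ħ = m = 1, and a superposition of two free plane waves with arbitrarily chosen wave numbers K1 and K2, each obeying ω = ħk²/2m. The function names are invented for the example, and the time and space derivatives are approximated by central differences.

```python
import cmath

HBAR = 1.0         # assumption: units with hbar = 1
MASS = 1.0         # assumption: unit particle mass
K1, K2 = 1.0, 2.5  # assumed wave numbers of the two plane waves


def omega(k):
    """Free-particle dispersion relation from eq. (4.43)."""
    return HBAR * k**2 / (2.0 * MASS)


def psi(x, t):
    """Superposition of two free plane waves (not normalised)."""
    return sum(cmath.exp(1j * (k * x - omega(k) * t)) for k in (K1, K2))


def dpsi_dx(x, t):
    """Exact spatial derivative of psi."""
    return sum(1j * k * cmath.exp(1j * (k * x - omega(k) * t)) for k in (K1, K2))


def rho(x, t):
    """Probability density |psi|^2."""
    return abs(psi(x, t)) ** 2


def flux(x, t):
    """Eq. (4.49) in one dimension, which equals (hbar/m) Im(psi* dpsi/dx)."""
    return HBAR / MASS * (psi(x, t).conjugate() * dpsi_dx(x, t)).imag


def continuity_residual(x, t, h=1e-4):
    """Central-difference estimate of d(rho)/dt + d(j)/dx, eq. (4.46)."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2.0 * h)
    dj_dx = (flux(x + h, t) - flux(x - h, t)) / (2.0 * h)
    return drho_dt + dj_dx
```

The residual returned by continuity_residual should vanish at any point (x, t) up to finite-difference error, even though the density and the flux each vary in space and time.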
For example in the case of the free particle of energy E and momentum p,
we know that a solution to the Schrödinger equation is given by

$$\psi = N e^{i(\mathbf{p} \cdot \mathbf{x} - Et)/\hbar} , \tag{4.50}$$

which means that

$$\rho = |N|^2 , \tag{4.51}$$

and

$$\mathbf{j} = \frac{\mathbf{p}}{m} |N|^2 . \tag{4.52}$$
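
The intermediate step can be written out explicitly. Assuming the plane wave carries the phase (p·x − Et)/ħ, its gradient is proportional to the wave function itself, and substituting into the flux formula of eq. (4.49) gives

$$\nabla \psi = \frac{i\mathbf{p}}{\hbar}\, \psi , \qquad \nabla \psi^* = -\frac{i\mathbf{p}}{\hbar}\, \psi^* ,$$

$$\mathbf{j} = -\frac{i\hbar}{2m} \left( \frac{i\mathbf{p}}{\hbar} + \frac{i\mathbf{p}}{\hbar} \right) |\psi|^2 = \frac{\mathbf{p}}{m}\, |N|^2 = \rho\, \mathbf{v} ,$$

where v = p/m is the classical velocity, so the flux is simply the density multiplied by the velocity of the beam.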

4.3.2 The Relativistic Schrödinger Equation


It should be clear to us that the Schrödinger equation, eq. (4.43), does
not respect Lorentz covariance, i.e. space and time coordinates behave in
completely different ways. The way to obtain a relativistic version of the
Schrödinger equation is to take as starting point the relativistic relation
between energy and momentum

$$E^2 = \mathbf{p}^2 + m^2 . \tag{4.53}$$

We can apply to this equation the same substitutions of eq. (4.42), and act
on a wave function φ to obtain

$$-\frac{\partial^2 \phi}{\partial t^2} + \nabla^2 \phi = m^2 \phi . \tag{4.54}$$
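This is the Klein-Gordon equation, which can also be called the relativistic
Schrödinger equation. If we multiply it by −iφ*, multiply its conjugate
equation by −iφ, and subtract, the result is

$$\frac{\partial}{\partial t} \left[ i \left( \phi^* \frac{\partial \phi}{\partial t} - \phi \frac{\partial \phi^*}{\partial t} \right) \right] + \nabla \cdot \left[ -i \left( \phi^* \nabla \phi - \phi \nabla \phi^* \right) \right] = 0 \tag{4.55}$$

By comparison with the non-relativistic continuity equation, eq. (4.46), we
can identify the probability density as

$$\rho = i \left( \phi^* \frac{\partial \phi}{\partial t} - \phi \frac{\partial \phi^*}{\partial t} \right) , \tag{4.56}$$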

while the flux density is given by

$$\mathbf{j} = -i \left( \phi^* \nabla \phi - \phi \nabla \phi^* \right) . \tag{4.57}$$

In the particular case of a free particle of energy E and momentum p the
solution of the Klein-Gordon equation is

$$\phi = N e^{i\mathbf{p} \cdot \mathbf{x} - iEt} , \tag{4.58}$$

in units of ħ = 1, c = 1, which we use from now on. Using this form for φ in
eqs. (4.56) and (4.57) we obtain

$$\rho = i(-2iE)|N|^2 = 2E|N|^2 , \tag{4.59}$$

$$\mathbf{j} = -i(2i\mathbf{p})|N|^2 = 2\mathbf{p}|N|^2 . \tag{4.60}$$
Note that the expression for the probability density contains a factor of E,
the energy of the particle.
It is convenient to express all this in four vector notation. In particular
we will make use of the D'Alembertian operator,

$$\Box \equiv \partial_\mu \partial^\mu , \tag{4.61}$$

which is the contraction of the four vector derivative ∂^μ = (∂/∂t, −∇) with
itself. Then the Klein-Gordon equation reads

$$(\Box + m^2)\phi = 0 , \tag{4.62}$$

and the probability density and flux can be grouped in a four vector,

$$j^\mu = (\rho, \mathbf{j}) = i \left( \phi^* \partial^\mu \phi - \phi\, \partial^\mu \phi^* \right) , \tag{4.63}$$

with the continuity equation reading

$$\partial_\mu j^\mu = 0 . \tag{4.64}$$

The free particle solution is expressed as

$$\phi = N e^{-ip \cdot x} \tag{4.65}$$

where p·x = p_μ x^μ = Et − **p**·**x**. Plugging this expression into eq. (4.63) we obtain

$$j^\mu = 2p^\mu |N|^2 . \tag{4.66}$$
The next thing to do is to study the possible values of the energy for this free
particle that obeys the Klein-Gordon equation. To do that we substitute the
solution (4.65) into (4.62), and the result is the eigenvalues of the energy

$$E = \pm \sqrt{\mathbf{p}^2 + m^2} , \tag{4.67}$$

that is, negative energy solutions are possible! Moreover these E < 0
solutions are associated with a negative probability density, as is clear from
eq. (4.59). In other words

$$E < 0 \text{ solutions with } \rho < 0 . \tag{4.68}$$
This is a problem that we have to address if we want to continue our study
of relativistic quantum mechanics.
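
The sign problem can be made concrete with a small numerical sketch. It assumes one spatial dimension, natural units ħ = c = 1, and arbitrarily chosen values for the mass, momentum and normalisation; the function names are invented for the illustration. The density of eq. (4.56) and the flux of eq. (4.57) are evaluated by central differences for both roots of eq. (4.67).

```python
import cmath
import math

MASS = 1.0       # natural units, hbar = c = 1 (assumed value of m)
MOMENTUM = 0.75  # assumed one-dimensional momentum p
NORM = 1.0       # assumed normalisation constant N


def phi(x, t, energy):
    """Free-particle solution of eq. (4.58) in one dimension."""
    return NORM * cmath.exp(1j * (MOMENTUM * x - energy * t))


def density(x, t, energy, h=1e-5):
    """rho of eq. (4.56), with a central difference for d/dt."""
    f = phi(x, t, energy)
    dphi_dt = (phi(x, t + h, energy) - phi(x, t - h, energy)) / (2.0 * h)
    return (1j * (f.conjugate() * dphi_dt - f * dphi_dt.conjugate())).real


def flux(x, t, energy, h=1e-5):
    """j of eq. (4.57) in one dimension, with a central difference for d/dx."""
    f = phi(x, t, energy)
    dphi_dx = (phi(x + h, t, energy) - phi(x - h, t, energy)) / (2.0 * h)
    return (-1j * (f.conjugate() * dphi_dx - f * dphi_dx.conjugate())).real


# The two roots of eq. (4.67).
E_PLUS = math.sqrt(MOMENTUM**2 + MASS**2)
E_MINUS = -E_PLUS
```

Evaluating density with E_PLUS should give 2E|N|² > 0, while E_MINUS gives the same magnitude with a negative sign. The flux is 2p|N|² in both cases, since it does not depend on the sign of E.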

4.3.3 Interpretation of negative energy states


It seems from looking at the Klein-Gordon equation that the problem of
negative energy states arises, from a mathematical point of view, from the
fact that this equation is quadratic in both space and time derivatives, which
itself follows from the relativistic energy-momentum relation, eq. (4.53).
One therefore has to take the square root of the resulting expression for the
square of the energy, and that implies two possible signs in the solution.
In 1927 Dirac derived a relativistic wave equation which was linear in
space and time (i.e. proportional to ∂^μ rather than □). In that way the
probability density was well defined and positive, but he could not avoid
having negative energy solutions. Moreover his equation only described spin
1/2 particles. This is known today as the Dirac equation, which describes
spin 1/2 fermions and antifermions.
In 1934 Pauli and Weisskopf revisited the Klein-Gordon equation and
introduced a slight modification. They added a factor of −e (the electron
charge) to the expression of the current, eq. (4.63), which now reads

$$j^\mu = -ie \left( \phi^* \partial^\mu \phi - \phi\, \partial^\mu \phi^* \right) . \tag{4.69}$$

Now j⁰ = ρ can be interpreted as a charge density, not a probability density,
and this explains the fact that it can take negative values (for negatively
charged particles).
The prescription for dealing with negative energy states is due to Stückelberg
(1941) and Feynman (1948). Their idea, commonly accepted today, is that
a negative energy solution describes a particle propagating backward in
time or, equivalently, a positive energy antiparticle propagating forward in
time. This can be easily seen with an example: consider an electron of energy
E, momentum p and charge −e. Its corresponding four vector current will be

$$j^\mu(e^-) = -2e|N|^2 (E, \mathbf{p}) . \tag{4.70}$$

If we now consider a positron, the antiparticle of the electron, with the same
energy and momentum, its corresponding four current will read

$$j^\mu(e^+) = 2e|N|^2 (E, \mathbf{p}) , \tag{4.71}$$

given that its charge is +e. We can shift signs twice within this equation to
obtain

$$j^\mu(e^+) = -2e|N|^2 (-E, -\mathbf{p}) , \tag{4.72}$$

which can be read as the current for an electron of energy −E and momentum
−p. Therefore the emission of a positron of energy E is equivalent to the
absorption of an electron of energy −E. This is easily seen if we focus on the
energy dependence of the free particle solution of the Klein-Gordon equation,
i.e.

$$e^{-iEt} = e^{-i(-E)(-t)} . \tag{4.73}$$

Here we see explicitly that the antiparticle going forward in time is equivalent
to the particle of negative energy going backward in time.
