Relativistic Quantum

Mechanics
Written by two of the most prominent leaders in particle physics, Relativistic Quantum Mechanics: An Introduction to Relativistic Quantum Fields provides a classroom-tested introduction to the formal and conceptual foundations of quantum field theory. Designed for advanced undergraduate- and graduate-level physics students, the text only requires previous courses in classical mechanics, relativity and quantum mechanics.
The introductory chapters of the book summarise the theory of special relativity and its application to the classical description of the motion of a free particle and a field. The authors then explain the quantum formulation of field theory through the simple example of a scalar field described by the Klein–Gordon equation as well as its extension to the case of spin ½ particles described by the Dirac equation. They also present the elements necessary for constructing the foundational theories of the standard model of electroweak interactions, namely quantum electrodynamics and the Fermi theory of neutron beta decay. Many applications to quantum electrodynamics and weak interaction processes are thoroughly analysed. The book also explores the timely topic of neutrino oscillations.
Logically progressing from the fundamentals to recent discoveries, this textbook provides students with the essential foundation to study more advanced theoretical physics and elementary particle physics. It will help them understand the theory of electroweak interactions and gauge theories.
View the second and third books in this collection: Electroweak Interactions and An Introduction to Gauge Theories.
Key Features of the New Edition:
Besides a general revision of text and formulae, three new chapters have been added.
• Chapter 17 introduces and discusses double beta decay processes with and without
neutrino emission, the latter being the only process able to determine the Dirac or
Majorana nature of the neutrino (discussed in Chapter 13). A discussion of the limits
to the Majorana neutrino mass obtained recently in several underground laboratories
is included.
• Chapter 18 illustrates the calculation of the mass spectrum of “quarkonia” (mesons
composed of a pair of heavy charm or beauty quarks), in analogy with the positronium
spectrum discussed in Chapter 12. This calculation has revealed the existence
of “unexpected” states and has led to the new field of “exotic hadrons”, presently under
active theoretical and experimental scrutiny.
• Chapter 19 illustrates the Born-Oppenheimer approximation, extensively used in the
computation of simple molecules, and its application to the physics of exotic hadrons
containing a pair of heavy quarks, with application to the recently observed doubly
charmed baryons.
Relativistic Quantum
Mechanics
An Introduction to Relativistic
Quantum Fields
Second Edition

Luciano Maiani and Omar Benhar

This eBook was published Open Access with funding support from the Sponsoring Consortium for Open Access Publishing in Particle Physics (SCOAP3), licensed under the terms of the Creative Commons Attribution-Non-Commercial (CC-BY-NC) 4.0 license [Link]

Any third party material in this book is not included in the OA Creative Commons license, unless indicated
otherwise in a credit line to the material. Please direct any permissions enquiries to the original rightsholder.

Second edition published 2024

by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431

and by CRC Press

4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2025 Luciano Maiani and Omar Benhar

First edition published by CRC Press 2015

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

The Open Access version of this book, available at [Link], has been made available under a Creative Commons Attribution-NonCommercial (CC-BY-NC) 4.0 International license.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access [Link] or contact
the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works
that are not available on CCC please contact mpkbookspermissions@[Link]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for
identification and explanation without intent to infringe.

ISBN: 978-1-032-56594-1 (hbk)

ISBN: 978-1-032-55940-7 (pbk)
ISBN: 978-1-003-43626-3 (ebk)

DOI: 10.1201/9781003436263

Typeset in CMR10
by KnowledgeWorks Global Ltd.

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
Contents

Preface
The authors

Chapter 1 The Symmetries of Space-Time
1.1 THE PRINCIPLE OF RELATIVITY
1.2 PHYSICAL LORENTZ TRANSFORMATIONS
1.3 CAUSAL STRUCTURE OF SPACE-TIME
1.4 CONTRAVARIANT AND COVARIANT VECTORS
1.5 PROBLEMS FOR CHAPTER 1

Chapter 2 The Classical Free Particle
2.1 SPACE–TIME MOTION
2.2 PARTICLE OF ZERO MASS
2.3 ACTION PRINCIPLE FOR THE FREE PARTICLE
2.4 THE MASS–ENERGY RELATION
2.5 PROBLEMS FOR CHAPTER 2

Chapter 3 The Lagrangian Theory of Fields
3.1 THE ACTION PRINCIPLE
3.2 HAMILTONIAN AND CANONICAL FORMALISM
3.3 TRANSFORMATION OF FIELDS
3.4 CONTINUOUS SYMMETRIES
3.5 NOETHER’S THEOREM
3.6 ENERGY–MOMENTUM TENSOR
3.7 PROBLEMS FOR CHAPTER 3

Chapter 4 Klein–Gordon Field Quantisation
4.1 THE REAL SCALAR FIELD
4.2 GREEN’S FUNCTIONS OF THE SCALAR FIELD
4.3 QUANTISATION OF THE SCALAR FIELD
4.4 PROBLEMS FOR CHAPTER 4

Chapter 5 Electromagnetic-Field Quantisation
5.1 MAXWELL’S EQUATIONS IN COVARIANT FORM
5.2 GREEN’S FUNCTIONS OF THE ELECTROMAGNETIC FIELD
5.3 THE MAXWELL–LORENTZ EQUATIONS
5.4 HAMILTON FORMALISM AND MINIMAL SUBSTITUTION
5.5 QUANTISATION OF THE ELECTROMAGNETIC FIELD IN VACUUM
5.6 THE SPIN OF THE PHOTON
5.7 PROBLEMS FOR CHAPTER 5

Chapter 6 The Dirac Equation
6.1 FORM AND PROPERTIES OF THE DIRAC EQUATION
6.1.1 Spin
6.1.2 Relativistic Invariance
6.1.3 Boost
6.1.4 Solutions of the Dirac Equation for a Free Particle
6.1.5 The Magnetic Moment of the Electron
6.2 THE RELATIVISTIC HYDROGEN ATOM
6.2.1 Factorisation of the Dirac Equation in Polar Coordinates
6.2.2 Separation of Variables
6.2.3 Eigenvalues of the Hamiltonian
6.3 TRACES OF THE γ MATRICES
6.4 PROBLEMS FOR CHAPTER 6

Chapter 7 Quantisation of the Dirac Field
7.1 PARTICLES AND ANTIPARTICLES
7.2 SECOND QUANTISATION: HOW IT WORKS
7.3 CANONICAL QUANTISATION OF THE DIRAC FIELD
7.4 THE REPRESENTATION OF THE LORENTZ GROUP
7.5 MICROCAUSALITY
7.6 THE RELATION BETWEEN SPIN AND STATISTICS
7.7 PROBLEMS FOR CHAPTER 7

Chapter 8 Free Field Propagators
8.1 THE TIME-ORDERED PRODUCT
8.2 PROPAGATORS OF THE SCALAR FIELD
8.3 PROPAGATORS OF THE DIRAC FIELD
8.4 THE PHOTON PROPAGATOR
8.5 PROBLEMS FOR CHAPTER 8

Chapter 9 Interactions
9.1 QUANTUM ELECTRODYNAMICS
9.2 THE FERMI INTERACTION FOR β DECAY
9.3 STRONG INTERACTIONS
9.4 HADRONS, LEPTONS AND FIELDS OF FORCE
9.5 PROBLEM FOR CHAPTER 9

Chapter 10 Time Evolution of Quantum Systems
10.1 THE SCHRÖDINGER REPRESENTATION
10.2 THE HEISENBERG REPRESENTATION
10.3 THE INTERACTION REPRESENTATION
10.3.1 Theory of Time-dependent Perturbations
10.3.2 Time-ordered Products
10.4 SYMMETRIES AND CONSTANTS OF THE MOTION

Chapter 11 Relativistic Perturbation Theory
11.1 THE DYSON FORMULA
11.2 CONSERVATION LAWS
11.3 COLLISION CROSS SECTION AND LIFETIME
11.4 PROBLEMS FOR CHAPTER 11

Chapter 12 The Discrete Symmetries: P, C, T
12.1 PARITY
12.2 CHARGE CONJUGATION
12.3 TIME REVERSAL
12.4 TRANSFORMATION OF THE STATES
12.5 SOME APPLICATIONS
12.5.1 Furry’s Theorem
12.5.2 Symmetries of Positronium
12.6 THE CPT THEOREM
12.6.1 Equality of Particle and Antiparticle Masses
12.7 PROBLEMS FOR CHAPTER 12

Chapter 13 Weyl and Majorana Neutrinos
13.1 THE WEYL NEUTRINO
13.2 THE MAJORANA NEUTRINO
13.3 RELATIONSHIPS AMONG WEYL, MAJORANA AND DIRAC NEUTRINOS
13.4 PROBLEM FOR CHAPTER 13

Chapter 14 Applications: QED
14.1 SCATTERING IN A CLASSICAL COULOMB FIELD
14.2 ELECTROMAGNETIC FORM FACTORS
14.3 THE ROSENBLUTH FORMULA
14.4 COMPTON SCATTERING
14.5 INVERSE COMPTON SCATTERING
14.6 THE PROCESSES γγ → e+ e− AND e+ e− → γγ
14.7 e+ e− → μ+ μ− ANNIHILATION
14.8 PROBLEMS FOR CHAPTER 14

Chapter 15 Applications: Weak Interactions
15.1 NEUTRON DECAY
15.2 MUON DECAY
15.3 UNIVERSALITY, CURRENT × CURRENT THEORY
15.4 TOWARDS A FUNDAMENTAL THEORY
15.5 PROBLEMS FOR CHAPTER 15

Chapter 16 Neutrino Oscillations
16.1 OSCILLATIONS IN VACUUM
16.2 NATURAL AND ARTIFICIAL NEUTRINOS
16.3 INTERACTION WITH MATTER: THE MSW EFFECT
16.4 ANALYSIS OF THE EXPERIMENTS
16.5 OPEN PROBLEMS
16.6 PROBLEM FOR CHAPTER 16

Chapter 17 Neutrinoless Double-Beta Decay
17.1 DOUBLE BETA DECAY
17.1.1 Two-neutrino Double Beta Decay
17.1.2 Neutrinoless Double Beta Decay
17.2 EXPERIMENTAL STUDIES OF DOUBLE BETA DECAY

Chapter 18 A Leap Forward: Charmonium
18.1 A PRIMER: BARYONS, MESONS, QUARKS AND QCD
18.1.1 Conserved Quantum Numbers
18.1.2 Quarks and QCD
18.1.3 Infrared Confinement and Asymptotic Freedom
18.2 CHARMONIA
18.2.1 The Cornell Potential and Its Relativistic Corrections
18.2.2 Strategy and Numerical Results
18.3 CHARMONIA AND EXOTICS
18.4 PROBLEMS FOR CHAPTER 18

Chapter 19 The Born-Oppenheimer Approximation for the Doubly
19.1 BORN-OPPENHEIMER APPROXIMATION IN BRIEF
19.2 COLOUR GYMNASTIC FOR QUARK-QUARK POTENTIALS
19.3 THE DOUBLY CHARMED BARYON
19.3.1 The BO Approximation for $\Xi_{cc}^{++}$
19.3.2 Numerical Results
19.3.3 About the BO Approximation Error in QCD
19.4 PROBLEMS FOR CHAPTER 19

Appendix A Basic Elements of Quantum Mechanics
A.1 THE PRINCIPLE OF SUPERPOSITION
A.2 LINEAR OPERATORS
A.3 OBSERVABLES AND HERMITIAN OPERATORS
A.4 THE NON-RELATIVISTIC SPIN 0 PARTICLE
A.4.1 Translations and Rotations
A.4.2 Spin

Appendix B The Non-Relativistic Hydrogen Atom
B.1 FACTORISATION OF THE LAPLACIAN
B.2 SEPARATION OF VARIABLES
B.3 EIGENVALUES OF THE HAMILTONIAN
B.4 EIGENFUNCTIONS

Bibliography

Index
Preface

This book is based on the relativistic quantum mechanics lecture course given since 2006 to first year master’s degree (Laurea Magistrale) students in the physics department of the University of Rome, “La Sapienza”.
The course, which is mandatory for all students irrespective of their chosen field of study, is intended to provide an introduction to the formal and conceptual foundations of quantum field theory.
The requirement to be relevant to students with different interests, combined with the fact that the course takes place in the first session of their formative year, has had an important role in the choice of the subjects covered, as well as how they are taught.
The only prerequisite for the understanding of the contents of this volume is to have taken courses in classical mechanics, relativity and quantum mechanics, as given in the first three years of the bachelor’s degree (Laurea) course.
The introductory chapters are devoted to a summary of the theory of special relativity and its application to the classical description of the motion of a free particle and a field. Within the Lagrangian formulation of classical field theory, particular emphasis is placed on the relationships between symmetry and conservation laws, whose fundamental role is demonstrated in many examples in the next volume of this series.
The quantum formulation of field theory is introduced through the simple example of a scalar field described by the Klein–Gordon equation, showing that generalisation of the Schrödinger equation to the case of a relativistic particle renders a probabilistic interpretation of its solution impossible. The alternative interpretation as a quantum field emerges in a natural way from the necessity to describe processes characterised by particle creation and destruction, made possible by the equivalence between mass and energy established by the theory of relativity.
The elements necessary for the construction of the theories which are the foundation of the standard model of electroweak interactions, namely quantum electrodynamics and the Fermi theory of neutron beta decay, are discussed in the central chapters. Problems connected with the quantisation of the electromagnetic field are analysed and spinor fields described by the Dirac equation are introduced. The structure of the interaction term between charged particles and the electromagnetic field is discussed, showing how the form which is obtained from classical theory through the so-called minimal substitution is, in reality, prescribed by the symmetry of the Lagrangian.
An important part of this volume is dedicated to the derivation of relativistic perturbation theory and its application to the calculation of observable quantities like the collision cross-sections and decay lifetimes. The steps necessary to arrive at the final results are derived in a detailed way, demonstrating the mathematical tools that by the end of the course the students should have learned to use.
The relativistic quantum mechanics course is one of a series of theoretical physics courses offered by the physics department of “La Sapienza”. It should provide students who plan to study theoretical physics and elementary particle physics with the essential foundations to be able to follow subsequent modules, having as an objective the theory of electroweak interactions and gauge theories. It is in this spirit that a chapter dedicated to neutrino oscillations has been included, as one of the most topical and stimulating subjects in elementary particle physics, and one which is not generally treated in introductory field theory texts.
Finally, we would like to acknowledge how much we owe to the collaboration between the authors and Nicola Cabibbo over more than twenty years, as well as our gratitude for comments we have received from our students.
The Authors

Luciano Maiani, born in 1941, is emeritus professor of theoretical physics at the University of Rome, “La Sapienza”, and author of more than two hundred scientific publications on the theoretical physics of elementary particles. He, together with S. Glashow and J. Iliopoulos, made the prediction of a new family of particles, those with “charm”, which form an essential part of the unified theory of the weak and electromagnetic forces. He has been president of the Italian Institute for Nuclear Physics (INFN), Director-General of CERN in Geneva and president of the Italian National Council for Research (CNR). He promoted the development of the Virgo Observatory for gravitational wave detection, the neutrino beam from CERN to Gran Sasso and at CERN directed the crucial phases of the construction of the Large Hadron Collider. He has taught and worked in numerous foreign institutes. He was head of the theoretical physics department at the University of Rome, “La Sapienza”, from 1976 to 1984 and held the chair of theoretical physics from 1984 to 2011. He is a member of the Italian Lincean Academy and a Fellow of the American Physical Society.

Omar Benhar, born in 1953, is an INFN research director and teaches gauge theories at the University of Rome, “La Sapienza”. He has worked extensively in the USA as a visiting professor, at the University of Illinois and the Old Dominion University, as well as an associate scientist at the Thomas Jefferson National Accelerator Facility. Since 2013, he has served as an adjunct professor at the Centre for Neutrino Physics of Virginia Polytechnic Institute and State University. He is the author of more than a hundred scientific papers on the theory of many-particle systems, the structure of compact stars and electroweak interactions of nuclei.
CHAPTER 1

THE SYMMETRIES OF
SPACE-TIME

1.1 THE PRINCIPLE OF RELATIVITY

The starting point of the theory of special relativity is the possibility to identify inertial frames of reference (IF), defined as those systems in which Newton’s first law of motion (below) is valid:

(1) A body not subject to a force in an IF is in a state of rest or performs uniform straight-line motion.

On a closer look, we are in the presence of a circular argument: the absence of forces can be ascertained only by observing uniform straight-line motion, which, however, requires the a priori definition of a reference frame. Physicists resolve the problem in a pragmatic manner, starting from those systems which are clearly not inertial (the system made by my car on a rough road is certainly not inertial; the motion of the objects in the car is strongly influenced by apparent forces) and identifying systems in nature which are gradually better approximations to an ideal IF:

• our house (over intervals which are short compared to the Earth’s period
of rotation),

• the Earth (for durations short compared to the solar year),

• the Sun (if we ignore the orbital motion around the centre of the galaxy),

• the galaxy...

Once an IF has been identified, it is possible to construct an infinite number of them, in fact, $\infty^6$, which differ in the position of the origin (three coordinates) and by a constant relative velocity (three components). The principle of special relativity formulated by Galileo states that:

(2) The laws of physics are invariant under a change of IF.

DOI: 10.1201/9781003436263-1
This chapter has been made available under a CC BY NC license.

In a given IF, physical phenomena can be analysed in terms of events: occurrences identified with a point x and a certain time t. An event is therefore characterised by a 4-vector which provides the coordinates of the event in a given IF:

$$ \text{coordinates} = (ct, \mathbf{x}) = x^\mu \quad (\mu = 0, \dots, 3). $$

In the time coordinate, we have inserted a factor c (the velocity of light in free space) so that $x^0 = ct$ has the same physical dimensions, of a length, as x. In this way, time can be measured in terms of length (the distance travelled by light in the given interval). This definition of time forms part of the so-called natural units, which we introduce later. Astronomers follow the opposite convention, measuring distances in terms of the time needed to traverse them (for example, light years).
To give meaning to the principle of special relativity, we must establish the rules governing the transformation of the coordinates of a given event in one IF, O, to another, O′.
Let us assume, for simplicity, that the origins of O and of O′ coincide at time t = 0. The requirement that uniform straight line motion in O should look similar in O′ constrains the transformation to be linear and homogeneous:

$$ (x')^\mu = \Lambda^\mu{}_\nu\, x^\nu, \tag{1.1} $$

where repeated indices (from 0 to 3) indicate a summation and Λ is independent of $x^\mu$.
To principles (1) and (2), Einstein added:

(3) The speed of light in free space c is a universal constant, independent of the reference frame.

Let us consider two events which differ by Δx and Δt, which we suppose to be infinitesimal. We define the squared invariant length of the interval (Δt, Δx) by the quantity Δs (often referred to as the line element):

$$ \Delta s = (c\Delta t)^2 - (\Delta\mathbf{x})^2 = (\Delta x^0)^2 - (\Delta\mathbf{x})^2. $$

If |Δx| = c|Δt|, the two events are connected through the propagation of a light ray which leaves from the first event and arrives in exact coincidence with the second. Clearly this coincidence must occur in all reference frames and, given the invariance of the speed of light, this implies that the difference should also correspond to a zero invariant length in the transformed coordinates. As a formula:

$$ \Delta s = (c\Delta t)^2 - (\Delta\mathbf{x})^2 = 0 \;\rightarrow\; \Delta s' = (c\Delta t')^2 - (\Delta\mathbf{x}')^2 = 0. $$

This condition can hold only if the transformation rule is such that:

$$ \Delta s' = \lambda\,\Delta s, $$

where λ must be independent of the coordinates and can only depend on the magnitude of the velocity. Now we take into account that the relationship between O and O′ is completely symmetric; seen from O′, O moves at the same speed at which O sees O′ travel. Therefore, by exchanging the roles of the two systems, it must also be true that:

$$ \Delta s = \lambda\,\Delta s', \tag{1.2} $$

so that λ² = 1, that is, λ = ±1. The case −1 is excluded by the fact that the coordinate transformations are continuously connected to the identity transformation, which clearly has λ = +1; therefore, we conclude that the transformations (1.1) must conserve the invariant length of the interval between two events (hence the name given to Δs). Given the linearity of the transformations, the condition can immediately be extended to finite intervals.
To formalise the condition (1.2), we introduce the metric tensor $g_{\mu\nu}$, which allows us to rewrite Δs as:

$$ \Delta s = g_{\mu\nu}\,\Delta x^\mu \Delta x^\nu \tag{1.3} $$
$$ g_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1). \tag{1.4} $$

Condition (1.2) is rewritten as follows:

$$ s' = g_{\mu\nu}\,x'^\mu x'^\nu = g_{\mu\nu}\,\Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma\, x^\rho x^\sigma = (\Lambda^\mu{}_\rho\, g_{\mu\nu}\, \Lambda^\nu{}_\sigma)\, x^\rho x^\sigma = s \tag{1.5} $$

which then implies:

$$ \Lambda^\mu{}_\rho\, g_{\mu\nu}\, \Lambda^\nu{}_\sigma = g_{\rho\sigma}. \tag{1.6} $$

Expressed as a matrix product of Λ and g, the equation can be rewritten:

$$ \Lambda^T g \Lambda = g. \tag{1.7} $$

This equation defines a group of matrices which, in mathematical terms, represent the group of Lorentz transformations. The coordinate transformations are only one example of these transformations, as we will soon see. First, let us see how these considerations apply in the concrete example of special Lorentz transformations.
The system O′ moves in the direction of the positive x axis of O with speed v. The explicit form of (1.1) is:

$$ \Delta x' = \alpha\,\Delta x - \delta\, c\Delta t $$
$$ c\Delta t' = -\epsilon\,\Delta x + \zeta\, c\Delta t \tag{1.8} $$
$$ \Delta y' = \Delta y; \quad \Delta z' = \Delta z, $$

with α, δ, ε, ζ to be determined. Moreover,

$$ s' = (\zeta^2 - \delta^2)(c\Delta t)^2 - (\alpha^2 - \epsilon^2)(\Delta x)^2 - 2(\epsilon\zeta - \alpha\delta)(\Delta x)(c\Delta t) - (\Delta y')^2 - (\Delta z')^2. $$

Setting λ = 1 in (1.2), we must require s′ = s so that:

$$ \zeta^2 - \delta^2 = \alpha^2 - \epsilon^2 = 1, \qquad \epsilon\zeta - \alpha\delta = 0. \tag{1.9} $$

Equation (1.9) can be solved by substituting:

$$ \alpha = \zeta = \cosh(\theta), \qquad \delta = \epsilon = \sinh(\theta), \tag{1.10} $$

where θ is a real parameter (the rapidity) connected to the relative velocity between O and O′. If we set Δx′ = 0 in the first equation of (1.8), it should describe the motion of an object at rest in O′ as seen from O, so Δx = v · Δt. Comparing, we obtain:

$$ \tanh(\theta) = \frac{v}{c}, \tag{1.11} $$

and therefore, from (1.8), we obtain the well known special Lorentz transformations:

$$ \Delta x' = \gamma(\Delta x - \beta c\Delta t) $$
$$ c\Delta t' = \gamma(-\beta\Delta x + c\Delta t) \tag{1.12} $$
$$ \beta = \frac{v}{c}, \qquad \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}. $$
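
As a numerical sanity check of (1.7) and (1.10)–(1.12), the following sketch builds the boost along x as a 4×4 matrix acting on (ct, x, y, z) and tests the condition Λᵀ g Λ = g. The helper names (`boost_x`, `is_lorentz`) and the sample value β = 0.6 are illustrative assumptions and do not come from the text.

```python
import math

# Metric tensor g = diag(+1, -1, -1, -1), as in (1.4).
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def boost_x(beta):
    """Special Lorentz transformation (1.12) along x, with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, -gamma * beta, 0, 0],
            [-gamma * beta, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def is_lorentz(lam, tol=1e-12):
    """True if lam satisfies Lambda^T g Lambda = g, condition (1.7)."""
    lhs = matmul(matmul(transpose(lam), G), lam)
    return all(abs(lhs[i][j] - G[i][j]) < tol
               for i in range(4) for j in range(4))


beta = 0.6                      # hypothetical sample velocity, v = 0.6 c
lam = boost_x(beta)
theta = math.atanh(beta)        # rapidity, from tanh(theta) = v/c in (1.11)

print("satisfies (1.7):", is_lorentz(lam))
print("cosh(theta) equals gamma:", math.isclose(math.cosh(theta), lam[0][0]))
print("sinh(theta) equals gamma*beta:", math.isclose(math.sinh(theta), -lam[0][1]))
```

The last two checks compare the hyperbolic parametrisation (1.10) with the β, γ form (1.12) for the same velocity.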

Comments.
• From (1.11) we see that the speed of a physical system cannot exceed c.
• Newton hypothesised a universal time, which flows equally in every frame of reference. If, instead of using principle (3), we set t′ = t, the special Lorentz transformations reduce to Galilean transformations:

$$ x' = x - v \cdot t, \qquad t' = t, $$

which are obtained from (1.12) in the limit c → ∞ (non-relativistic limit).
• Rotations in a Euclidean space leave invariant the squared Pythagorean length: $s_E = x^2 + y^2 + \dots$. Correspondingly, they satisfy the orthogonality relation $R \cdot R^T = \mathbb{1}$, analogous to (1.7) except for the replacement of the matrix g with the identity matrix $\mathbb{1}$.
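
The non-relativistic limit mentioned in the comments can be illustrated numerically. The sketch below, written under the assumption of arbitrary sample values for x, t and v (they are not taken from the text), applies (1.12) for increasing values of c and prints how far the result is from the Galilean one.

```python
import math


def lorentz(x, t, v, c):
    """Special Lorentz transformation (1.12), returned as (x', t')."""
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)


def galilean(x, t, v):
    """Galilean transformation: x' = x - v t, t' = t."""
    return x - v * t, t


# Hypothetical sample event and velocity, in arbitrary consistent units.
x, t, v = 1.0, 2.0, 3.0

for c in (10.0, 1.0e3, 1.0e6):
    xl, tl = lorentz(x, t, v, c)
    xg, tg = galilean(x, t, v)
    # Both differences shrink as c grows, i.e. as v/c -> 0.
    print(f"c = {c:g}: |dx'| = {abs(xl - xg):.3e}, |dt'| = {abs(tl - tg):.3e}")
```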

1.2 PHYSICAL LORENTZ TRANSFORMATIONS

From (1.7), taking the determinant of both sides, we obtain the condition:

$$ \det(\Lambda) \cdot \det(g) \cdot \det(\Lambda^T) = \det(\Lambda)^2 \det(g) = \det(g). \tag{1.13} $$

Therefore it must be the case that det(Λ) = ±1; the Lorentz group comprises at least two disconnected components. Only those elements with determinant equal to +1 (the proper Lorentz transformations) can be continuously connected to the identity transformation. The transformations with determinant −1 are known as improper. An important example of an improper transformation is the parity operation, denoted by P:

$$ P: \quad \mathbf{x}' = -\mathbf{x}; \quad t' = t. \tag{1.14} $$

The proper transformations are, for their part, composed of two disconnected components, which are distinguished according to whether or not they invert the direction of time. Events with coordinates (t, 0) in O describe, as a function of t, the history of a stationary clock at the origin of O. If we apply the relation (1.1), we find:

$$ t' = \Lambda^0{}_0 \cdot t, $$

therefore, the sign of $\Lambda^0{}_0$ determines whether the clocks of O run in the same direction as in O′. The transformations characterised by:

$$ \Lambda^0{}_0 > 0 \tag{1.15} $$

are known as orthochronous.

Clearly, if one orthochronous transformation takes a system O to a frame O′ and a second takes O′ to another frame of reference O″, the overall transformation O → O′ → O″ is also orthochronous. Hence the transformations characterised by (1.15) form a subgroup of the Lorentz group. An important example of an improper non-orthochronous transformation is time inversion, denoted by T:

$$ T: \quad \mathbf{x}' = \mathbf{x}; \quad t' = -t. \tag{1.16} $$

Combining both P and T transformations, total inversion, I, is obtained, which is a non-orthochronous but also proper transformation:

$$ I: \quad x'^\mu = -x^\mu. \tag{1.17} $$

The changes between physically realisable IFs, which are continuously connected to the identity transformation, are orthochronous. The principle of relativity must apply strictly only to these transformations.

The laws of physics are invariant under proper and orthochronous Lorentz transformations. The corresponding group is usually denoted by $L_+^\uparrow$.
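
The classification of this section (sign of det Λ, sign of Λ⁰₀) can be sketched in a few lines. The function names below (`classify`, `det`, `diag`) are illustrative assumptions; the matrices P, T and I are those of (1.14), (1.16) and (1.17), written on the coordinates (ct, x, y, z).

```python
def diag(*entries):
    """Diagonal matrix as a nested list."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def classify(lam):
    """Return (proper/improper, orthochronous/non-orthochronous) for lam."""
    kind = "proper" if det(lam) > 0 else "improper"
    time = "orthochronous" if lam[0][0] > 0 else "non-orthochronous"
    return kind, time


P = diag(1, -1, -1, -1)     # parity, (1.14)
T = diag(-1, 1, 1, 1)       # time inversion, (1.16)
I = matmul(P, T)            # total inversion, (1.17)

for name, lam in (("identity", diag(1, 1, 1, 1)), ("P", P), ("T", T), ("I", I)):
    print(name, classify(lam))
```

Only matrices classified as proper and orthochronous belong to the group $L_+^\uparrow$ discussed above.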

Comment. In classical physics this principle is implicitly extended to P and T transformations. However, some nuclear and subnuclear processes are not invariant under P (β decays) or under T (decays of neutral K mesons); Nature does not respect them. Total inversion, I, is worth further discussion. In quantum mechanics, invariance of the laws of physics under the action of inversion, I, combined with the operation exchanging particle with antiparticle, C, can be demonstrated (CPT theorem).

1.3 CAUSAL STRUCTURE OF SPACE-TIME

The invariance of the length of the interval between two events, s, under the Lorentz transformation, gives rise to a factorisation of space-time into regions with a different causal connection to a given event. For convenience, let us set an event A at the origin of the spatial coordinates at time t = 0 in a given IF. The space-time of the events (t, x) factorises into four regions in a way which is independent of the chosen reference frame.

• Events with s = 0. They are found on the surfaces of two cones (light
cones), which represent the trajectories of light rays emerging from A
(future cone) or converging on A (past cone). They are known as lightlike.
• Events with s > 0, t > 0. They fill the interior of the future light cone;
they are events which can be influenced by event A, since they can be
reached by physically realisable signals originating from A which travel
at speeds less than c.
• Events with s > 0, t < 0. They fill the interior of the past light cone;
they are events which can influence event A, by means of physically
realisable signals which travel at speeds less than c. These events and
those of the preceding paragraph are known as timelike (see Problem 1).
• Events with s < 0. These are outside the light cones; they are events
which cannot have any causal relationship with A, because signals with
speeds exceeding c would be required. The events in this region form the
absolute present of A. They are known as spacelike events (see Problem
2).
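
The four regions listed above can be sketched as a small classifier relative to an event A at the origin. This is a minimal illustration, assuming units in which the first coordinate is ct; the function names and the sample events are hypothetical. The loop also applies a boost with β = 0.6 to show that the region assigned to each event does not depend on the frame.

```python
import math


def interval(ct, x, y=0.0, z=0.0):
    """Invariant squared length s = (ct)^2 - x^2 - y^2 - z^2."""
    return ct ** 2 - x ** 2 - y ** 2 - z ** 2


def causal_region(ct, x, y=0.0, z=0.0, tol=1e-12):
    """Classify an event relative to an event A placed at the origin."""
    s = interval(ct, x, y, z)
    if abs(s) <= tol:
        return "lightlike"
    if s > 0:
        return "timelike (future)" if ct > 0 else "timelike (past)"
    return "spacelike"


def boost_x(ct, x, beta):
    """Special Lorentz transformation (1.12) applied to (ct, x)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)


# Hypothetical sample events (ct, x), one for each region.
events = [(2.0, 1.0), (-2.0, 1.0), (1.0, 1.0), (1.0, 2.0)]

for ct, x in events:
    ct_b, x_b = boost_x(ct, x, beta=0.6)
    print((ct, x), causal_region(ct, x), "->", causal_region(ct_b, x_b))
```

For the spacelike sample event the sign of the time coordinate changes under this boost, while its classification does not; this is consistent with the absence of a causal relationship with A.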

In quantum mechanics the impossibility of simultaneously observing two physical quantities is connected to the causal influence which the measurement of one of them can exert on the measurement of the other (for example the measurement of p and x). Let us consider two observable quantities whose measuring apparatuses are localised in finite regions of space-time (local observables). When the relevant region of one of the two observables is located entirely in the present of the other, the principle of causality requires that the two observables should be simultaneously measurable and the corresponding operators should commute with each other. This principle is known as microcausality.

1.4 CONTRAVARIANT AND COVARIANT VECTORS

Let xμ and y μ be two 4-vectors which transform according to (1.1). In
the literature, these objects are known as contravariant vectors, indicated by
upper indices. The scalar product formed with the metric tensor is invariant:

(x · y) = gμν xμ y ν = xμ yμ = y μ xμ
(x · y ) = (x · y) . (1.18)

In equation (1.18) we introduced vectors with lower indices, commonly

known as covariant 4-vectors. Clearly, the transformation rule for these vec-
tors must be such as to provide an invariant when a covariant vector is multi-
plied by a contravariant vector. In fact, this rule can immediately be obtained
starting from (1.18):

yμ = gμν y ν = gμν Λνρ y ρ .

Multiplying (1.6) by (Λ−1 )ρτ we ﬁnd:

gτ ν Λνσ = (Λ−1 )ρτ gρσ ; (1.19)

or yμ = gμν Λνρ y ρ = (Λ−1 )ρμ gρσ y σ = (Λ−1 )ρμ yρ .

Therefore covariant vectors transform using the matrix Λ−1 .

The four-vector ∂/∂xμ transforms according to
∂ ∂ ∂xν ∂ ∂
→ μ = = (Λ−1 )νμ , (1.20)
∂x μ ∂x ∂x μ ∂xν ∂xν
that is like a covariant four-vector. Therefore, we can write

∂ 1 ∂ ∂ 1 ∂
≡ , ∇ = ∂μ , ≡ , −∇ = ∂ μ , (1.21)
∂xμ c ∂t ∂xμ c ∂t
from which it follows that, given a four-vector u
1 ∂u0
∂ μ uμ = + ∇ · u = ∂ μ uμ , (1.22)
c ∂t
and the d’Alambertian operator is defined as
1 ∂2
2 = ∂μ ∂ μ = − ∇2 . (1.23)
c2 ∂t2
In formal terms, covariant vectors are the elements of the dual space of the
vector space of contravariant vectors. In general, the dual $\tilde V$ of a vector space
$V$ is defined as the space of linear functionals of the vectors $y^\mu$, that is, of
functions $f$ defined for every $y$ and $z$ such that:

$$f = f(y), \qquad f(\alpha y + \beta z) = \alpha f(y) + \beta f(z)$$

with $\alpha$ and $\beta$ numbers (in our case, real numbers). It is easy to show that
every element $f$ of $\tilde V$ can be written as:

$$f(y) = \sum_\mu f_\mu y^\mu = f_\mu y^\mu = (f \cdot y), \qquad (1.24)$$

hence $\tilde V$ has the same dimension as $V$ and there is a one-to-one mapping
between the elements of $V$ and those of $\tilde V$ (theorem due to Riesz).
In our case, this mapping is realised with the metric matrix $g_{\mu\nu}$, which gives
the element of $\tilde V$, $y_\mu = g_{\mu\nu} y^\nu$, which corresponds to $y^\mu$. With this relation,
the elements of $\tilde V$ are invariant functions under Lorentz transformations:

$$f'(y') = f(y), \quad \text{if } y' = \Lambda y .$$

Using the relation (A.5) it is immediately seen that the preceding equation
requires two transformation rules:

$$y'^\mu = \Lambda^\mu{}_\nu\, y^\nu , \qquad f'_\mu = (\Lambda^{-1})^\rho{}_\mu\, f_\rho$$

such that:

$$f'(y') = f'_\mu y'^\mu = f_\rho\, (\Lambda^{-1})^\rho{}_\mu\, \Lambda^\mu{}_\nu\, y^\nu = f_\rho\, (\Lambda^{-1} \cdot \Lambda)^\rho{}_\nu\, y^\nu = f_\rho\, \delta^\rho_\nu\, y^\nu = f_\rho y^\rho = f(y)$$

where $\delta^\rho_\nu$ is the Kronecker delta.
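The two transformation rules can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the metric signature (+, −, −, −), a special Lorentz transformation along x with β = 0.6, and arbitrary sample components; the helper names (`boost_x`, `lower`, `apply_inverse_rule`) are hypothetical and not from the text.

```python
import math

# Diagonal of the metric g_{mu nu}; the signature (+,-,-,-) is an assumption
METRIC = [1.0, -1.0, -1.0, -1.0]

def boost_x(beta):
    """Special Lorentz transformation along x, velocity beta in units of c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def lower(y_up):
    """y_mu = g_{mu nu} y^nu for a diagonal metric."""
    return [METRIC[m] * y_up[m] for m in range(4)]

def apply(lam, y_up):
    """y'^mu = Lambda^mu_nu y^nu."""
    return [sum(lam[m][n] * y_up[n] for n in range(4)) for m in range(4)]

def apply_inverse_rule(lam_inv, f_low):
    """f'_mu = (Lambda^{-1})^rho_mu f_rho."""
    return [sum(lam_inv[r][m] * f_low[r] for r in range(4)) for m in range(4)]

beta = 0.6
lam = boost_x(beta)
lam_inv = boost_x(-beta)  # the inverse boost has the opposite velocity

y = [2.0, 0.3, -1.1, 0.7]          # sample contravariant components y^mu
f = lower([1.5, -0.4, 0.9, 0.2])   # sample covariant components f_mu

y_prime = apply(lam, y)
f_prime = apply_inverse_rule(lam_inv, f)

# f(y) = f_mu y^mu evaluated in both frames
contraction = sum(f[m] * y[m] for m in range(4))
contraction_prime = sum(f_prime[m] * y_prime[m] for m in range(4))

# Lowering the index after the transformation agrees with the Lambda^{-1} rule
y_low_direct = lower(y_prime)
y_low_rule = apply_inverse_rule(lam_inv, lower(y))
```

Comparing `contraction` with `contraction_prime`, and `y_low_direct` with `y_low_rule`, shows agreement to rounding error, which is the content of the chain of equalities above.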

1.5 PROBLEMS FOR CHAPTER 1

Sect. 1.1
1. From the deﬁnition of velocity, (1.11) prove that the speed of a physical
system cannot exceed c.
2. Newton hypothesised a universal time, which flows equally in every
frame of reference. Work out the transformations obtained by
setting $t' = t$, thereby obtaining the Galilean transformations:

$$\mathbf{x}' = \mathbf{x} - \mathbf{v}\, t , \qquad t' = t .$$

3. Prove that the Galilean transformations of Problem 2 are obtained from
the Lorentz transformations (1.12) in the limit $c \to \infty$ (i.e. the
non-relativistic limit).
4. Show that the rotation matrices in a Euclidean space satisfy the
orthogonality relation

$$R \cdot R^T = 1$$

as a consequence of them leaving invariant the squared Pythagorean
length: $s_E = x^2 + y^2 + \dots$. Note that the above relation is analogous to
(1.7) except for the replacement of the unit matrix $1$ with $g$.
5. Show that, if $\Lambda_1$ and $\Lambda_2$ are Lorentz matrices satisfying the Lorentz
condition (1.7), $\Lambda^T g \Lambda = g$, their matrix product $\Lambda_1 \cdot \Lambda_2$ satisfies the
same condition.

6. Show that, by carrying out two special Lorentz transformations with
rapidities $\theta_1$ and $\theta_2$, a Lorentz transformation with rapidity $\theta_1 + \theta_2$ is
obtained. From this, derive the rule for the combination of velocities:

$$v_1 \oplus v_2 = \frac{v_1 + v_2}{1 + \dfrac{v_1 \cdot v_2}{c^2}} .$$

Sect. 1.2
1. Show that if an event has s > 0 it is possible to find an inertial frame
in which xμ = (t, 0). (Consequently, these events are called timelike).
2. Show that if an event has s < 0 it is possible to find an inertial frame
in which xμ = (0, x). (Consequently, these events are called spacelike).
3. Show that if an event has s = 0 it is possible to find an inertial frame
in which xμ = (|x|, x) and x aligned along the z axis. (These events are
called lightlike).
CHAPTER 2

THE CLASSICAL FREE PARTICLE

2.1 SPACE–TIME MOTION

The motion in space-time of a classical particle (that is to say, non-quantum
mechanical) is described by four functions of time: $x^\mu = x^\mu(t)$. We
consider the interval which separates two positions very close together in time
and its invariant length:

$$\Delta x^\mu = (c\,\Delta t,\ \mathbf{v}\,\Delta t) \qquad (2.1)$$

$$\Delta s = (\Delta x \cdot \Delta x) = \left(1 - \frac{v^2}{c^2}\right)(c\,\Delta t)^2 .$$

Obviously, $\Delta x^\mu$ is a timelike interval, because the particle travels at a
speed less than that of light, hence an IF exists in which $\Delta x^\mu$ has only a
temporal component. In this IF the particle is stationary (in the time interval
under consideration) and $\sqrt{\Delta s}/c$ gives the time interval measured by a clock
at rest with respect to it at that moment. The new variable is called the proper
time of the particle, denoted by $\tau$. By definition, an interval of proper time is
a relativistic invariant and the relation between $\tau$ and the time $t$ in the chosen
IF is given by the relation:

$$d\tau = \sqrt{1 - \beta^2}\; dt = \frac{dt}{\gamma(t)} . \qquad (2.2)$$

If we use the proper time to characterise the motion of the particle, we
can define a velocity four-vector, $u^\mu$:

$$u^\mu = \frac{d x^\mu(\tau)}{d\tau} = \gamma\, \frac{d x^\mu(t)}{dt} , \qquad (2.3)$$

DOI: 10.1201/9781003436263-2
This chapter has been made available under a CC BY NC license.

in which $u^\mu$ is obviously a contravariant 4-vector, since it transforms like $x^\mu$. We
note the value of its components and of its invariant magnitude:

$$u^\mu = \gamma\,(c,\ \mathbf{v}); \qquad (2.4)$$

$$g_{\mu\nu}\, u^\mu u^\nu = u^\mu u_\mu = c^2 . \qquad (2.5)$$
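As a quick numerical illustration of (2.2), (2.4) and (2.5), the following sketch builds the four-velocity for an arbitrarily chosen three-velocity and evaluates its invariant square. The sample velocity and the function names are assumptions made only for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma_factor(v):
    """Lorentz factor for a three-velocity v given as [vx, vy, vz] in m/s."""
    speed_sq = sum(vi * vi for vi in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / (C * C))

def four_velocity(v):
    """u^mu = gamma (c, v), as in (2.4)."""
    g = gamma_factor(v)
    return [g * C] + [g * vi for vi in v]

def minkowski_square(a):
    """a^mu a_mu with signature (+,-,-,-)."""
    return a[0] ** 2 - sum(ai ** 2 for ai in a[1:])

v = [0.5 * C, 0.2 * C, -0.1 * C]  # sample velocity, |v| < c
u = four_velocity(v)

# (2.5): u^mu u_mu = c^2, so this ratio should be 1 for any velocity
ratio = minkowski_square(u) / (C * C)

# (2.2): proper time elapsed during one second of coordinate time
proper_time_per_second = 1.0 / gamma_factor(v)
```

The ratio equals 1 up to rounding whatever velocity is chosen, while `proper_time_per_second` is always below 1, reflecting the slower rate of the moving clock.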

In non-relativistic mechanics, the momentum is defined as $\mathbf{p} = m\mathbf{v}$, where
$m$ is the inertial mass. We define the 4-vector momentum analogously as:

$$p^\mu = m u^\mu \qquad (2.6)$$

where $m$ is a constant which characterises the particle and which obviously
coincides with the inertial mass at low velocity. For this reason, $m$ is called
the rest mass of the particle. The rest mass is a relativistic invariant, as is
seen from the relation:

$$p^2 = p^\mu p_\mu = (mc)^2 . \qquad (2.7)$$
The time component of $p^\mu$ is related to the energy of the particle. We
calculate it explicitly starting from the relation:

$$p^2 = (mc)^2 = (p^0)^2 - (\mathbf{p})^2 \qquad (2.8)$$

from which:

$$p^0 = \sqrt{(mc)^2 + (\mathbf{p})^2} \simeq \frac{1}{c}\left(mc^2 + \frac{\mathbf{p}^2}{2m}\right) \qquad (2.9)$$

where the final step is valid for low velocity. The temporal component of $p^\mu$,
in this limit, is equal to the kinetic energy divided by $c$ plus a constant related
to the rest mass. In classical mechanics, the energy is always defined to within
a constant, so we can identify $p^0$ with the energy $E$ of the particle divided by $c$:

$$p^\mu = \left(\frac{E}{c},\ \mathbf{p}\right). \qquad (2.10)$$
Thus we arrive at the important conclusion that in a relativistic formula-
tion momentum and energy are parts of a single physical entity, the energy-
momentum 4-vector (or 4-momentum) pμ .
The rules determining the components of pμ after transformation of the
coordinate system follow immediately from the nature of the 4-vector pμ . In
the specific case of a special Lorentz transformation along the x-axis, we have
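(the quantities with accents refer to the IF $O'$, those without an accent to $O$):

$$p'^1 = \gamma\left(p^1 - \beta\,\frac{E}{c}\right) \qquad (2.11)$$

$$\frac{E'}{c} = \gamma\left(-\beta\, p^1 + \frac{E}{c}\right)$$

$$p'^2 = p^2 ; \qquad p'^3 = p^3 .$$

If the particle is at rest in $O$, we find, in particular:

$$E' = \gamma m c^2 ; \qquad |\mathbf{p}'| = \gamma\beta m c . \qquad (2.12)$$

The velocity of the particle, in any IF, is given by:

$$\boldsymbol{\beta} = \frac{c\,\mathbf{p}}{E} . \qquad (2.13)$$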
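The relations (2.11) to (2.13) can be illustrated with a short numerical sketch. It assumes units with c = 1, uses the electron rest energy 0.511 MeV as a sample mass and an arbitrary β = 0.8; the function name `boost_along_x` is hypothetical.

```python
import math

def boost_along_x(energy, p1, beta):
    """Apply (2.11) with c = 1: returns (E', p'^1) in the frame moving with velocity beta."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (energy - beta * p1), g * (p1 - beta * energy)

MASS = 0.511   # rest energy in MeV (electron), used here only as a sample value
BETA = 0.8     # velocity of O' relative to O, in units of c

# Particle at rest in O: E = m, p = 0
energy_prime, p_prime = boost_along_x(MASS, 0.0, BETA)

gamma = 1.0 / math.sqrt(1.0 - BETA * BETA)

# (2.7): the invariant mass is the same in both frames
invariant_mass = math.sqrt(energy_prime ** 2 - p_prime ** 2)

# (2.13): speed of the particle as seen from O'
speed = abs(p_prime) / energy_prime
```

With these inputs `energy_prime` equals γm and `abs(p_prime)` equals γβm, as in (2.12). The sign of `p_prime` is negative because the particle moves backwards as seen from O'.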

Comment. In deriving the value of $p^0$ in (2.9) starting from (2.8), we have
chosen the positive root. This seems an innocuous step, but we can examine
it more closely. The negative root is separated from the positive one by a gap
of at least $2mc^2$. In classical mechanics the energy varies continuously; thus,
if we assume that the particle has $E > mc^2$ at the start, it can never finish
in a state with $E < -mc^2$. In quantum mechanics, however, the energy can
change discontinuously and we cannot exclude transitions from states with
$E > mc^2$ to others with $E < -mc^2$. The resolution of this difficulty takes us
directly to the concept of antiparticles.

2.2 PARTICLE OF ZERO MASS

The condition (2.7) allows us to consider the limit $m \to 0$. In this case, in
any IF:

$$|\mathbf{p}| = p^0 ; \qquad \beta = 1 , \qquad (2.14)$$

the particle travels at the speed of light in any IF, for any state of motion.
Obviously photons, the quanta of light, have this property; the photon is a
particle with zero rest mass.
Despite the fact that (2.7) varies smoothly as $m \to 0$, particles with mass
$m > 0$ are intrinsically different from those of zero mass, however small the
value of $m$; the limit $m \to 0$ is intrinsically discontinuous.
The simplest way to obtain this result is by verifying that the symmetry
group for the momentum (the little group introduced by Wigner) is diﬀerent
in the two cases.

• If $p^\mu$ is the momentum of a particle of non-zero mass we can find an
IF in which $p^\mu$ has only a temporal component (its rest system). In the
rest system, $p^\mu = (mc, \mathbf{0})$ and the group of transformations which leave
the momentum invariant is the entire group of three-dimensional spatial
rotations, O(3), the group of orthogonal matrices in three dimensions.
• If $p^\mu$ is the momentum of a zero-mass particle, we can write it in the
form: $p^\mu = (|\mathbf{p}|, \mathbf{p})$. The invariance group is now that of rotations in the
plane orthogonal to the direction of $\mathbf{p}$, the orthogonal matrices in two
dimensions, O(2), a commutative group much smaller than O(3).

One sees therefore that when letting m → 0 the little group changes discontin-
uously from O(3) to O(2). In quantum mechanics, the little group determines
the value of the spin of the particle. For a particle with mass, the spin co-
incides with the angular momentum at rest and the states therefore form a
representation of the group of 3-dimensional rotations; a particle of spin S
possesses 2S + 1 spin states.
Conversely, a particle of zero mass can never be at rest, and its intrinsic
angular momentum is deﬁned by rotations around p, the elements of O(2),
which allow one-dimensional representations corresponding to $S_z = \tfrac{1}{2} n$ for
a given integer $n$. If we include parity, $P$, as a good symmetry, two states
should exist with $S_z = \pm\tfrac{1}{2} n$. For example, the photon has two spin states,
corresponding to Sz = ±1 and not three, as we would have expected for a
particle of spin 1.

2.3 ACTION PRINCIPLE FOR THE FREE PARTICLE

The results of the previous section are of such importance as to merit an
independent derivation. At the same time, this permits the introduction of
the formulation of relativistic dynamics based on the action principle, which
has a fundamental importance in quantum mechanics.
We consider a trajectory in space-time, $x^\mu(t)$, which starts and ends in
two fixed events:

$$x^\mu(t_1) = x_1^\mu = (c t_1,\ \mathbf{x}_1) \qquad (2.15)$$

$$x^\mu(t_2) = x_2^\mu = (c t_2,\ \mathbf{x}_2)$$

Given $x^\mu(t)$, we can define the action, $S(x_1^\mu, x_2^\mu)$:

$$S = \int_{t_1}^{t_2} L(\mathbf{x}(t), \mathbf{v}(t))\, dt \qquad (2.16)$$

The trajectory actually travelled is determined by the action principle:

• The particle trajectory corresponds to the minimum value of the action.

From the Lagrangian function, the principle of least action derives the
Lagrange equations, which in classical mechanics completely replace Newton’s
equations, $\mathbf{F} = m\mathbf{a}$:

$$\frac{d}{dt}\frac{\partial L}{\partial \mathbf{v}} = \frac{\partial L}{\partial \mathbf{x}} \qquad (2.17)$$
The requirement that the laws of motion should be invariant following a
change of IF becomes the simple statement:

• The action must be invariant under Lorentz transformations.


Applied in our case, this means:

$$L(\mathbf{x}(t), \mathbf{v}(t))\, dt = \text{invariant}.$$

For a free particle, the only invariant that is not trivially constant is the
proper time interval, $d\tau$, and therefore it must be true that:

$$L(\mathbf{x}(t), \mathbf{v}(t))\, dt = -\alpha\, d\tau = -\alpha \sqrt{1 - \frac{v^2(t)}{c^2}}\; dt \qquad (2.18)$$

with $\alpha$ a constant. The value of $\alpha$ is determined by the non-relativistic limit,
in which $L$ should tend to the kinetic energy of the particle, $\tfrac{1}{2} m v^2$, to within
an irrelevant additive constant. For small values of $v(t)$ we find:

$$L \to -\alpha + \frac{\alpha v^2}{2c^2} ; \qquad (2.19)$$

$$\to \alpha = mc^2$$

and therefore the Lagrangian of a free particle is:

$$L = -mc^2 \sqrt{1 - \frac{v(t)^2}{c^2}} . \qquad (2.20)$$
c2

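The kinematic momentum is given by the conjugate momentum with respect
to $\mathbf{x}$, while the energy is given by the Hamiltonian. Therefore we find:

$$\mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = m\gamma\mathbf{v} \qquad (2.21)$$

$$H = \mathbf{p} \cdot \mathbf{v} - L = \gamma m c^2 .$$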

The results in (2.21) confirm the definition of the 4-momentum given in (2.6).
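The step from the Lagrangian (2.20) to the momentum and energy in (2.21) can be verified numerically. The sketch below is a one-dimensional illustration under assumptions: units with m = c = 1, a central finite difference in place of the analytic derivative, and hypothetical function names.

```python
import math

def lagrangian(v, m=1.0, c=1.0):
    """Free-particle Lagrangian (2.20) for motion along one axis."""
    return -m * c * c * math.sqrt(1.0 - v * v / (c * c))

def conjugate_momentum(v, m=1.0, c=1.0, h=1e-6):
    """p = dL/dv, approximated by a central finite difference."""
    return (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2.0 * h)

def hamiltonian(v, m=1.0, c=1.0):
    """H = p v - L."""
    return conjugate_momentum(v, m, c) * v - lagrangian(v, m, c)

v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)

p_numeric = conjugate_momentum(v)   # compare with m * gamma * v
h_numeric = hamiltonian(v)          # compare with gamma * m * c^2

# Non-relativistic limit (2.19): L + m c^2 approaches m v^2 / 2 for small v
v_small = 1e-3
kinetic_from_l = lagrangian(v_small) + 1.0
kinetic_newton = 0.5 * v_small ** 2
```

For v = 0.6 the numerical values reproduce mγv = 0.75 and γmc² = 1.25 to the accuracy of the finite difference, and the small-velocity comparison shows the Newtonian kinetic energy emerging from (2.20).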
The Lagrange equations which follow from (2.20) and (2.17) are:

$$\frac{d}{dt}\, p^i = 0 ; \quad (i = 1, 2, 3) \qquad (2.22)$$

Furthermore, from condition (2.7) we find:

$$p^0\, dp^0 - \mathbf{p} \cdot d\mathbf{p} = 0 \qquad (2.23)$$

and therefore:

$$\frac{d}{dt}\, p^0 = 0 . \qquad (2.24)$$

Finally, we have obtained the covariant equations of motion:

$$\frac{d}{dt}\, p^\mu = 0 = \frac{d}{d\tau}\, p^\mu \qquad (2.25)$$

which express the conservation of momentum and energy.

2.4 THE MASS–ENERGY RELATION

To illustrate the meaning of the rest energy, we consider a system composed
of many particles (in short, a gas). The total 4-momentum is given, naturally,
by:

$$P^\mu = \sum_i p_i^\mu ; \qquad (2.26)$$

$$\text{i.e.} \quad \mathbf{P} = \sum_i \mathbf{p}_i ; \qquad c P^0 = E = \sum_i E_i .$$

For a system of this type an IF exists in which $\mathbf{P} = 0$ (the centre of mass
system). This can be seen from (2.11): we orient $\mathbf{P}$ along the x-axis and require
that $P'^1 = 0$. We find:

$$\beta = \frac{c P}{E} \qquad (2.27)$$

which always has a modulus less than unity, given that:

$$c\,|\mathbf{P}| < c \sum_i |\mathbf{p}_i| < \sum_i E_i = E . \qquad (2.28)$$

The total energy, $E_0$, in this frame of reference defines the rest mass of the
system

$$E_0 = M_0 c^2 \qquad (2.29)$$

while the energy in another IF, $E_0'$, is given by an analogous formula to (2.12):

$$E_0' = \gamma M_0 c^2 \qquad (2.30)$$

We now look more closely at $M_0$. In the limit in which the gas particles
are non-relativistic, we have:

$$M_0 = \frac{1}{c^2} \sum_i E_i \simeq \sum_i m_i + \frac{T}{c^2} \qquad (2.31)$$

where $T$ is the kinetic energy contained in the gas. If we increase or reduce
this energy, by heating or cooling the gas, the rest mass varies accordingly, as:

$$\Delta M_0 = \frac{\Delta E}{c^2} . \qquad (2.32)$$

If we introduce an interaction between the gas particles, this adds to the
second term of (2.31) a quantity $V$ which can be positive or negative.
The conclusion is that the rest mass of a composite system differs from the
sum of the rest masses of its constituents and the difference can be released
(or must be provided) in the form of an amount of energy again given by
relation (2.32):

$$\Delta E = Q = \left(M_0 - \sum_i m_i\right) c^2 . \qquad (2.33)$$

Equations (2.32) and (2.33) provide Einstein’s mass–energy relationship,
with innumerable applications in physics theory and practice.
From now on, the masses of atomic and subatomic particles will always
be given in terms of energy, according to (2.32), using the units 1 MeV =
1000 keV = $10^6$ eV.
As a reminder:

• electron: $M_e$ = 0.511 MeV;
• proton: $M_p$ = 938.27 MeV;
• neutron: $M_N$ = 939.57 MeV;
• deuteron: $M_D$ = 1875.61 MeV.

An important example. The energy emitted by the Sun is produced by the
fusion of four protons into one helium nucleus. The fusion takes place via a
series of reactions, the so-called p–p sequence (this is the principal sequence;
there are several secondary series studied originally by Bethe in the 1930s;
recall that proton = $^1$H, deuteron = $^2$H):

$$p + p \to {}^2\mathrm{H} + e^+ + \nu \qquad (2.34)$$

$${}^2\mathrm{H} + p \to {}^3\mathrm{He} + \gamma \qquad (2.35)$$

$${}^3\mathrm{He} + {}^3\mathrm{He} \to {}^4\mathrm{He} + p + p . \qquad (2.36)$$

The overall reaction can be written as:

$$4p + 2e^- \to {}^4\mathrm{He} + 2e^+ + 2e^- + 2\nu . \qquad (2.37)$$

The positrons annihilate with the electrons of the medium releasing energy.
Therefore the total thermal energy released by each 4-proton reaction is:

$$E_{\text{thermal}} = Q - 2\langle E_\nu \rangle \qquad (2.38)$$

$$Q = 4M_p + 2M_e - M_{\mathrm{He}} = 26.7\ \text{MeV}.$$

$\langle E_\nu \rangle$ is the average energy carried away by neutrinos, which leave the
Sun undisturbed. The neutrino energy is continuously distributed between
zero (assuming negligible neutrino mass) and the Q value in the formation of
deuterium:

$$Q({}^1\mathrm{H} + {}^1\mathrm{H} \to {}^2\mathrm{H}) = 2M_p - M_D - M_e = 0.42\ \text{MeV}. \qquad (2.39)$$
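The two Q values quoted above follow from simple arithmetic on the rest masses. In the sketch below the proton, electron and deuteron masses are those listed earlier in this section; the mass of the helium-4 nucleus is not given in the text, so the value used (3727.38 MeV) is an outside assumption introduced only for this check.

```python
# Rest masses in MeV, as listed in the text
M_P = 938.27    # proton
M_E = 0.511     # electron
M_D = 1875.61   # deuteron

# Assumption: mass of the helium-4 nucleus in MeV (not quoted in the text)
M_HE4 = 3727.38

# Q value for deuterium formation, as in (2.39)
q_deuterium = 2 * M_P - M_D - M_E

# Q value of the overall reaction, as in (2.38)
q_total = 4 * M_P + 2 * M_E - M_HE4

print(f"Q(deuterium formation) = {q_deuterium:.2f} MeV")
print(f"Q(4p -> 4He)           = {q_total:.1f} MeV")
```

Both results round to the values quoted in the text, 0.42 MeV and 26.7 MeV.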

Comment. According to relativistic quantum mechanics, reactions exist in

which particles can be created or destroyed; for example, proton-antiproton
annihilation. In these cases the variation of energy in equation (2.32) should
include the entire rest mass of the particles involved.

2.5 PROBLEMS FOR CHAPTER 2

Sect. 2.1
1. The μ-particle, or muon, has a rest mass $m_\mu \sim 106$ MeV and a lifetime
at rest $\tau_\mu \sim 2.197\ \mu$s. For a muon produced with energy E = 1 GeV,
calculate: (i) the values of β and γ (ii) the average length (in km) covered
before decay. Conversely, what is the minimum energy needed for a muon
produced at the top of the atmosphere (20 km above sea level) to arrive
at sea level before decay?

Sect. 2.4
1. Knowing that: (i) the Sun contains around $N = 10^{56}$ protons in the core,
i.e. available for fusion; (ii) the solar constant (flux of solar energy onto
the Earth) is $K_0 = 3.3 \cdot 10^{-2}$ cal cm$^{-2}$ s$^{-1}$; (iii) the Earth–Sun distance
is 8 light-minutes; (iv) the energy carried by neutrinos is negligible,
estimate the lifetime of the Sun from (2.37) and (2.38).
CHAPTER 3

THE LAGRANGIAN THEORY OF FIELDS

In classical mechanics two types of physical systems are distinguished:

material points (particles) which have spatial coordinates x(t) as dynamic
variables, or ﬁelds (waves), which are dynamic systems described by one or
more continuous functions of the coordinates and of time.

φ = φ(x, t) = φ(x). (3.1)

The most important example of this second kind of system is the electro-
magnetic field described at every point of space by two vectors corresponding
to the values of the electric field E(x, t) and the magnetic field B(x, t).

3.1 THE ACTION PRINCIPLE

In analogy with the mechanics of systems with a finite number of degrees
of freedom, it is natural to derive the field equations from an action principle.
The action is defined as the time integral of the Lagrangian, between two fixed
instants, $t_1 < t_2$:

$$S = \int_{t_1}^{t_2} L\, dt . \qquad (3.2)$$

The Lagrangian of a system of particles is a sum over the many different
degrees of freedom. In the case of a field, the degrees of freedom are localised
at every point of space, therefore:

$$L = \int d^3x\, \mathcal{L}(\phi, \phi_\mu, x), \qquad (3.3)$$

where we have denoted the derivatives of the field with respect to the coordinates
as $\phi_\mu$:

$$\phi_\mu(x) = \frac{\partial \phi}{\partial x^\mu} .$$
DOI: 10.1201/9781003436263-3
This chapter has been made available under a CC BY NC license.

The function $\mathcal{L}$ is given the name Lagrangian density or, more simply for
brevity, the Lagrangian, and depends on the fields (the dynamic variables) and
their derivatives. The time derivatives are the generalisation of velocity, while
the dependence of the Lagrangian on spatial derivatives allows the degrees of
freedom at nearby points in space to be coupled.
We have considered a possible explicit dependence of the Lagrangian on
the space-time coordinates, to allow for the effect of possible agents external to
the system of fields. For an isolated system, this dependence cannot be present
and the Lagrangian depends on the coordinates only through the fields and
their derivatives.
In terms of $\mathcal{L}$:

$$S = \int_{V_4} d^4x\, \mathcal{L}(\phi, \phi_\mu, x) \qquad (3.4)$$

where $V_4$ is the region of space-time limited by the hypersurfaces $\Gamma_1$: $t = t_1$
and $\Gamma_2$: $t = t_2$.
We assume the values of the fields are fixed on $\Gamma_{1,2}$. The principle of
least action states that:
• the evolution of the field between these values is given by the function
$\phi(x) = \bar\phi(\mathbf{x}, t)$ which minimises $S$, with boundary conditions fixed on
$\Gamma_{1,2}$.
We note that the Lagrangian density is not unique. Because the principle
of least action stipulates that the fields have defined values on the boundaries
of V4 , we can add to the Lagrangian the divergence of any 4-vector without
changing the minimum of the action or, hence, the equations of motion.
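To derive the differential equations which determine the evolution of the
field, we set:

$$\phi(x) = \bar\phi(x) + \delta\phi(x); \qquad \delta\phi(\mathbf{x}, t_1) = \delta\phi(\mathbf{x}, t_2) = 0. \qquad (3.5)$$

The condition of least action becomes the equation:

$$\delta S = 0 = \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,\delta(\partial_\mu \phi) \right] \qquad (3.6)$$

$$= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,\partial_\mu \delta\phi \right]$$

$$= \int d^4x \left[ \frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \right) \right] \delta\phi + \int d^4x\, \partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,\delta\phi \right)$$

because $\delta(\partial_\mu \phi) = \partial_\mu \delta\phi$.
Given that the field variations vanish at the edges of the region of integration,
the final term in (3.6) is zero. Moreover, (3.6) must apply for arbitrary
variations $\delta\phi$, so the function in square brackets must vanish identically in $x$.
Thus we find the Euler–Lagrange equations:

$$\partial_\mu \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \right) = \frac{\partial \mathcal{L}}{\partial \phi} \qquad (3.7)$$

a system of partial differential equations. Naturally, if we have several fields
$\phi_i$, $i = 1, \dots, N$, we have an equation for each component.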
In Newtonian mechanics, the motion of a particle is determined if we define
its position and velocity at a fixed instant in time. The extension of this
principle gives rise to the requirement that L should be at most quadratic
in the derivatives of the fields and we will follow this principle. In that case,
∂L/∂∂μ φ is linear in ∂μ φ, the Euler–Lagrange equation is of second order
in the derivatives and the solution is determined once the field and its time
derivative are assigned values on the hypersurface t = t1 .
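To make the Euler–Lagrange equation concrete, the sketch below takes a Lagrangian density that is quadratic in the derivatives and checks numerically that a plane wave solves the resulting second-order equation. The specific density, $\mathcal{L} = \tfrac{1}{2}(\partial_t\phi)^2 - \tfrac{1}{2}(\partial_x\phi)^2 - \tfrac{1}{2}m^2\phi^2$ in one space dimension with c = 1, is an assumption chosen for illustration (a free scalar field), as are the numerical values and function names.

```python
import math

MASS = 0.7   # sample mass parameter (assumption)
K = 1.3      # sample wave number (assumption)

def field(t, x, omega):
    """Plane wave phi(t, x) = cos(k x - omega t)."""
    return math.cos(K * x - omega * t)

def second_derivative(f, s, h=1e-3):
    """Central finite-difference estimate of f''(s)."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

def euler_lagrange_residual(t, x, omega):
    """Left side minus right side of (3.7) for the assumed density:
    d_t^2 phi - d_x^2 phi + m^2 phi."""
    d2t = second_derivative(lambda s: field(s, x, omega), t)
    d2x = second_derivative(lambda s: field(t, s, omega), x)
    return d2t - d2x + MASS ** 2 * field(t, x, omega)

# The wave solves the equation only if omega^2 = k^2 + m^2
omega_on_shell = math.sqrt(K * K + MASS * MASS)

residual_on_shell = euler_lagrange_residual(0.4, 1.1, omega_on_shell)
residual_off_shell = euler_lagrange_residual(0.4, 1.1, 2.0 * omega_on_shell)
```

The residual is zero (to finite-difference accuracy) when the frequency satisfies ω² = k² + m² and clearly non-zero otherwise, so fixing the field and its time derivative at one instant selects a definite solution, as stated above.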

Relativistic Invariance. The relativistic invariance of the theory is embodied
in the simple requirement that the action should be relativistically invariant.
The space-time volume element is itself invariant, given that, for a transformation
$\Lambda$:

$$d^4x' = \det(\Lambda)\, d^4x = d^4x . \qquad (3.8)$$

Therefore

• the relativistic invariance of the action requires that the Lagrangian den-
sity should itself be invariant.

3.2 HAMILTONIAN AND CANONICAL FORMALISM

The route to the canonical formalism begins with the definition of the
momentum conjugate to each dynamic variable. In the case of a field theory,
we define the conjugate momentum density

$$\pi(\mathbf{x}, t) = \frac{\partial \mathcal{L}}{\partial(\partial_t \phi)} \qquad (3.9)$$

and stipulate that equation (3.9) should be used to express $\partial_t \phi$ as a function
of $\phi$, $\nabla\phi$, $\pi$. Subsequently one can define the Hamiltonian density:

$$\mathcal{H}(\pi, \phi) = \pi\, \partial_t \phi - \mathcal{L}; \qquad H = \int d^3x\, \mathcal{H}. \qquad (3.10)$$

It should be noted that the Hamiltonian density is an auxiliary quantity;

only the Hamiltonian is physically relevant while the Hamiltonian density is
defined to within the 3-divergence of a vector, which integrates to zero when
the fields vanish at infinity.
The equations of motion are obtained simply by differentiating equation
(3.10) and using the Euler–Lagrange equations. We put:

φ(x) = φ̄(x) + δφ(x); π(x) = π̄(x) + δπ(x); t = t̄ + δt. (3.11)

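We then find:

$$\delta H = \int d^3x \left[ \partial_t\phi\,\delta\pi + \pi\,\delta(\partial_t\phi) - \frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi - \frac{\partial\mathcal{L}}{\partial\nabla\phi}\cdot\delta(\nabla\phi) - \frac{\partial\mathcal{L}}{\partial(\partial_t\phi)}\,\delta(\partial_t\phi) \right] - \int d^3x\, \frac{\partial\mathcal{L}}{\partial t}\,\delta t$$

$$= \int d^3x \left[ \partial_t\phi\,\delta\pi - \partial_t\pi\,\delta\phi - \nabla\cdot\left( \frac{\partial\mathcal{L}}{\partial\nabla\phi}\,\delta\phi \right) \right] - \int d^3x\, \frac{\partial\mathcal{L}}{\partial t}\,\delta t , \qquad (3.12)$$

where we have used the Euler–Lagrange equations in the form:

$$\frac{\partial\mathcal{L}}{\partial\phi} - \nabla\cdot\frac{\partial\mathcal{L}}{\partial\nabla\phi} = \partial_t\, \frac{\partial\mathcal{L}}{\partial(\partial_t\phi)} = \partial_t\pi \qquad (3.13)$$

We can discard the 3-divergence from the differential of the Hamiltonian,
so we obtain:

$$\delta H = \int d^3x\, (\partial_t\phi\,\delta\pi - \partial_t\pi\,\delta\phi) - \int d^3x\, \frac{\partial\mathcal{L}}{\partial t}\,\delta t . \qquad (3.14)$$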
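On the other hand, from the fact that the Hamiltonian density depends
on $\phi$, $\nabla\phi$ and $\pi$, we obtain (continuing to discard the 3-divergences):

$$\delta H = \int d^3x \left[ \frac{\partial\mathcal{H}}{\partial\pi}\,\delta\pi + \left( \frac{\partial\mathcal{H}}{\partial\phi} - \nabla\cdot\frac{\partial\mathcal{H}}{\partial\nabla\phi} \right)\delta\phi \right] + \int d^3x\, \frac{\partial\mathcal{H}}{\partial t}\,\delta t . \qquad (3.15)$$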

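Comparing the coefficients of the differentials in (3.14) and (3.15), we find
Hamilton’s equations:

$$\partial_t\phi = \frac{\partial\mathcal{H}}{\partial\pi} ; \qquad (3.16)$$

$$\partial_t\pi = -\left( \frac{\partial\mathcal{H}}{\partial\phi} - \nabla\cdot\frac{\partial\mathcal{H}}{\partial\nabla\phi} \right);$$

$$\frac{\partial\mathcal{H}}{\partial t} = -\frac{\partial\mathcal{L}}{\partial t} .$$

In the second equation the derivative with respect to $\nabla\phi$ appears, owing
to the fact that in $\mathcal{L}$, and therefore in $\mathcal{H}$, we treated the dependence on $\phi$ and
$\nabla\phi$ separately. In fact all the terms in brackets correspond to the quantity
$(\partial H/\partial q)$ in the case of one degree of freedom. From the third equation we
find:

$$\frac{dH}{dt} = \int d^3x\, \frac{\partial\mathcal{H}}{\partial t} = -\int d^3x\, \frac{\partial\mathcal{L}}{\partial t} \qquad (3.17)$$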
For an isolated system, as we have seen, the Lagrangian density can de-
pend on the space-time location only through the ﬁelds. Consequently, for an
isolated system, we recover the law of conservation of energy: the Hamiltonian
of an isolated system is a constant of the motion. The same thing holds under
the weaker condition that the system should be independent of time.

Functionals and Functional Derivatives. From a mathematical point of
view, the Hamiltonian in (3.10) is a functional of the Hamiltonian density:
a rule which associates a number to every given function, $\mathcal{H}$. The functional
is therefore a function defined on the space of functions, rather than on the
space of real or complex numbers. Analogously to what is done with functions,
we can introduce the concept of the derivative of a functional in the following
way:
Let $H[f]$ be a functional of the function $f(\mathbf{x})$¹. We define the derivative
of the functional starting from the variation:

$$\delta H = H[f(\mathbf{x}) + \epsilon\, \delta^{(3)}(\mathbf{x} - \mathbf{y})] - H[f(\mathbf{x})] \qquad (3.18)$$

setting:

$$\frac{\delta H}{\delta f(\mathbf{y})} = \lim_{\epsilon \to 0} \frac{H[f(\mathbf{x}) + \epsilon\, \delta^{(3)}(\mathbf{x} - \mathbf{y})] - H[f(\mathbf{x})]}{\epsilon} . \qquad (3.19)$$
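In the case of the Hamiltonian, we have

$$H = \int d^3x\, \mathcal{H}(\phi, \pi), \qquad (3.20)$$

from which, for example:

$$\frac{\delta H}{\delta \pi} = \frac{\partial \mathcal{H}}{\partial \pi} . \qquad (3.21)$$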
A normal function can also be considered as a particular example of a
functional, according to the formula:

$$f(y) = \int dx\, \delta(x - y)\, f(x). \qquad (3.22)$$

In this case, the symbol $y$ which appears in the first term is simply a
dummy variable, and $f(x)$ is the argument of the functional. For the functional
derivative, it is easily shown that:

$$\frac{\delta f(y)}{\delta f(x)} = \delta(x - y). \qquad (3.23)$$
If we introduce the concept of the functional derivative, the second of
Hamilton’s equations takes an aspect more similar to that of the case of a
finite number of degrees of freedom. Applying the definition (3.19), Hamilton’s
equations (3.16) can be put in the form²:

$$\partial_t \phi(x) = \frac{\delta H}{\delta \pi(x)}$$

$$\partial_t \pi(x) = -\frac{\delta H}{\delta \phi(x)} . \qquad (3.24)$$

¹ As in the example given in this section, we consider the functions of a three-dimensional
variable. The generalisation to more dimensions is obvious.

² We note, on the basis of (3.19), that the physical dimensions of the functional derivative
which appear in the preceding equations are equal to dim[$H$] − dim[$\phi$] (or $\pi$) − [length³], or,
equivalently, to dim[$\mathcal{H}$] − dim[$\phi$] (or $\pi$).

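Poisson Brackets. The observables in the canonical formalism are, in general,
functionals of $\pi(x)$ and $\phi(x)$. Given two observables, $A$ and $B$, we can
introduce the Poisson bracket, in analogy to the finite dimensional case, defined as:

$$\{A, B\} = \int d^3x \left[ \frac{\delta A}{\delta\phi(x)}\frac{\delta B}{\delta\pi(x)} - \frac{\delta B}{\delta\phi(x)}\frac{\delta A}{\delta\pi(x)} \right] = -\{B, A\} . \qquad (3.25)$$

It is easily shown that:

$$\frac{\delta\phi(x)}{\delta\phi(y)} = \frac{\delta\pi(x)}{\delta\pi(y)} = \delta(x - y)$$

$$\frac{\delta\phi(x)}{\delta\pi(y)} = \frac{\delta\pi(x)}{\delta\phi(y)} = 0. \qquad (3.26)$$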
Using Poisson brackets we can give yet another form to Hamilton’s equa-
tions:
∂t φ = {φ, H}
∂t π = {π, H} . (3.27)
The equations (3.26) and (3.27) are the starting point for the canonical
quantisation of a field theory.
Poisson brackets are antisymmetric under the exchange of the two terms,
equation (3.25), like the commutator of two matrices:
[A, B] = AB − BA (3.28)
and, similarly to commutators, Poisson brackets satisfy a Jacobi identity:
{A, {B, C}} + {B, {C, A}} + {C, {A, B}} = 0. (3.29)
The Jacobi identity and equations (3.26) and (3.27) are the starting point
of the canonical quantisation of a field theory (see the problems for Sec-
tion 3.2), to be discussed in Section 4.3.

Comment. The quantum canonical field variables, $\phi(x)$ and $\pi(x)$, are operators
(i.e. infinite dimensional matrices) depending upon the space-time coordinates
$x = (\mathbf{x}, x^0)$. In the limit $\hbar \to 0$, these operators have to go into the
classical field variables, $\phi(x)$ and $\pi(x)$, commuting functions of $x$. The limit
(at equal time):

$$\lim_{\hbar \to 0} \frac{[\phi(\mathbf{x}, 0), \pi(\mathbf{y}, 0)]}{i\hbar} \qquad (3.30)$$

is classically a function which obeys the algebraic properties of the commutator.
In view of the antisymmetry of the Poisson bracket and of the Jacobi
identity (3.29), one is naturally led to assume that:

$$\lim_{\hbar \to 0} \frac{[\phi(\mathbf{x}, 0), \pi(\mathbf{y}, 0)]}{i\hbar} = \{\phi(\mathbf{x}, 0), \pi(\mathbf{y}, 0)\} = \delta^{(3)}(\mathbf{x} - \mathbf{y}). \qquad (3.31)$$

Thus, the Heisenberg equal-time relations for the canonical variables $q$ and
$p$, translated into field theory:

$$[\phi(\mathbf{x}, 0), \pi(\mathbf{y}, 0)] = i\hbar\, \delta^{(3)}(\mathbf{x} - \mathbf{y}) \qquad (3.32)$$

guarantee that quantum field theory goes into classical field theory for vanishing
Planck’s constant.

3.3 TRANSFORMATION OF FIELDS

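To construct relativistically invariant Lagrangians, we must begin from the
rules of field transformations, the relations which connect the fields observed
in IF $O$, $\phi(x)$, to the values observed in $O'$, $\phi'(x')$, which correspond to the
same event in space-time, described by the coordinates $x^\mu$ and $x'^\mu$ in $O$ and
in $O'$:

$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu . \qquad (3.33)$$

The simplest case is that of a scalar field in which the values are the same:

$$\phi'(x') = \phi(x). \qquad (3.34)$$

The derivatives of $\phi$ transform like covariant vectors, Section 1.4:

$$\frac{\partial\phi'(x')}{\partial x'^\mu} = \frac{\partial\phi(x)}{\partial x'^\mu} = \frac{\partial x^\lambda}{\partial x'^\mu}\frac{\partial\phi(x)}{\partial x^\lambda} = (\Lambda^{-1})^\lambda{}_\mu\, \frac{\partial\phi}{\partial x^\lambda} . \qquad (3.35)$$

It is immediately verified that, instead, the derivatives with respect to $x_\mu$
transform like contravariant vectors:

$$\frac{\partial\phi'(x')}{\partial x'_\mu} = \Lambda^\mu{}_\lambda\, \frac{\partial\phi}{\partial x_\lambda} . \qquad (3.36)$$

Consequently, we denote the derivatives as:

$$\frac{\partial\phi}{\partial x^\mu} = \phi_\mu = \partial_\mu\phi; \qquad \frac{\partial\phi}{\partial x_\mu} = \phi^\mu = \partial^\mu\phi. \qquad (3.37)$$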
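With further derivatives, multi-index tensors can be constructed, with the
corresponding transformation properties:

$$\phi^{\nu_1,\nu_2,\dots}_{\mu_1,\mu_2,\dots}(x) = \frac{\partial}{\partial x^{\mu_1}}\frac{\partial}{\partial x^{\mu_2}} \cdots \frac{\partial}{\partial x_{\nu_1}}\frac{\partial}{\partial x_{\nu_2}} \cdots \phi(x); \qquad (3.38)$$

$$(\phi')^{\nu_1,\nu_2,\dots}_{\mu_1,\mu_2,\dots}(x') = (\Lambda^{-1})^{\lambda_1}{}_{\mu_1} (\Lambda^{-1})^{\lambda_2}{}_{\mu_2} \cdots \Lambda^{\nu_1}{}_{\rho_1} \Lambda^{\nu_2}{}_{\rho_2} \cdots \phi^{\rho_1,\rho_2,\dots}_{\lambda_1,\lambda_2,\dots}(x).$$

Extending the case of the scalar field, we can define tensor fields, functions
of $x^\mu$ provided with a certain number of upper (contravariant) indices, $n_s$, and
lower (covariant) indices, $n_g$, $F^{\nu_1,\nu_2,\dots}_{\mu_1,\mu_2,\dots}(x)$, whose transformation rules are the same
as (3.38):

$$(F')^{\nu_1,\nu_2,\dots}_{\mu_1,\mu_2,\dots}(x') = (\Lambda^{-1})^{\lambda_1}{}_{\mu_1} (\Lambda^{-1})^{\lambda_2}{}_{\mu_2} \cdots \Lambda^{\nu_1}{}_{\rho_1} \Lambda^{\nu_2}{}_{\rho_2} \cdots F^{\rho_1,\rho_2,\dots}_{\lambda_1,\lambda_2,\dots}(x). \qquad (3.39)$$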

Important examples are the antisymmetric Maxwell tensor, $F^{\mu\nu}(x) = -F^{\nu\mu}(x)$,
which describes the electromagnetic field, and the vector field,
$A^\mu(x)$, which describes the vector potential.
The rank of a tensor (the number of covariant and contravariant indices)
can be reduced by contracting the indices with invariant tensors (i.e. tensors
such that $T' = T$). For the Lorentz transformations there are three types of
invariant operation:

• contraction of a covariant and a contravariant index with the Kronecker
delta: $\delta^\mu_\nu$,
• contraction of two contravariant indices with the tensor $g_{\mu\nu}$ (or two
covariant indices with $g^{\mu\nu}$),
• contraction with the completely antisymmetric Levi–Civita tensor $\epsilon_{\mu\nu\rho\sigma}$
defined as:

$$\epsilon_{0123} = +1 \qquad (3.40)$$

$$\epsilon_{\mu_1\mu_2\mu_3\mu_4} = 0 \quad \text{(two equal indices)}$$

$$\epsilon_{\mu_1\mu_2\mu_3\mu_4} = \pm 1: \quad \mu_1, \mu_2, \mu_3, \mu_4 = \text{even/odd permutations of } 0123.$$

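To show that the Levi–Civita tensor is invariant, we consider:

$$X_{0123} = \epsilon_{\mu_1\mu_2\mu_3\mu_4}\, \Lambda^{\mu_1}{}_0\, \Lambda^{\mu_2}{}_1\, \Lambda^{\mu_3}{}_2\, \Lambda^{\mu_4}{}_3 . \qquad (3.41)$$

$X_{0123}$ is the sum of products of the matrix elements of $\Lambda$ from rows and
columns all different from each other, multiplied by the sign of the permutation
which transforms (0, 1, 2, 3) into $(\mu_1, \mu_2, \mu_3, \mu_4)$. Therefore $X_{0123} = \det\Lambda = 1$.
Furthermore, $X_{\mu\nu\rho\sigma} = 0$ if two indices are equal, and $X_{\mu\nu\rho\sigma} = \pm 1$ according
to whether the permutation is even or odd. It follows that:

$$\epsilon_{\mu_1\mu_2\mu_3\mu_4}\, \Lambda^{\mu_1}{}_{\rho_1}\, \Lambda^{\mu_2}{}_{\rho_2}\, \Lambda^{\mu_3}{}_{\rho_3}\, \Lambda^{\mu_4}{}_{\rho_4} = \epsilon_{\rho_1\rho_2\rho_3\rho_4} \qquad (3.42)$$

and thus the required invariance.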
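The invariance argument can be checked by brute force. The sketch below contracts the permutation symbol with four copies of a proper Lorentz matrix and compares the result with the symbol itself. The particular matrix (a boost along x followed by a rotation about z) and all helper names are assumptions for illustration; the check relies only on the determinant being +1.

```python
import math
from itertools import permutations, product

def levi_civita():
    """Map each permutation of (0, 1, 2, 3) to its sign; other index sets are 0."""
    eps = {}
    for perm in permutations(range(4)):
        inversions = sum(1 for a in range(4) for b in range(a + 1, 4)
                         if perm[a] > perm[b])
        eps[perm] = -1 if inversions % 2 else 1
    return eps

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

EPS = levi_civita()
lam = matmul(boost_x(0.6), rotation_z(0.4))  # proper transformation, det = +1

def transformed_epsilon(lam, idx):
    """Sum over mu of eps_{mu1..mu4} Lambda^{mu1}_{rho1} ... Lambda^{mu4}_{rho4}."""
    return sum(sign
               * lam[p[0]][idx[0]] * lam[p[1]][idx[1]]
               * lam[p[2]][idx[2]] * lam[p[3]][idx[3]]
               for p, sign in EPS.items())

# Largest deviation from invariance over all 256 index combinations
max_deviation = max(abs(transformed_epsilon(lam, idx) - EPS.get(idx, 0))
                    for idx in product(range(4), repeat=4))
```

`max_deviation` is at the level of rounding error. For a matrix with determinant −1 (for example a spatial reflection) the same contraction would return the symbol with the opposite sign.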

The tensors of a given rank $n_s$, $n_g$ describe a linear manifold. A tensor is
said to be reducible if this manifold contains subspaces invariant under the
transformations (3.39) which are non-trivial (that is, different from 0
and from the manifold itself). Otherwise, the tensor is irreducible.
Possible invariant subspaces can be obtained by projecting the general tensor
using the invariant operations described earlier. For an irreducible tensor,
on the other hand, the invariant operations give zero or project onto the whole
starting space.
By completely contracting the indices of products of tensor fields and their
derivatives, one obtains tensors of rank zero (with no free indices) which are
invariant (they transform like the scalar field in (3.34)). These invariant combinations
are the building blocks with which to construct the Lagrangian density,
which describes the field dynamics.

Example 1. An important case is that of tensors with two contravariant
antisymmetric indices. Obviously, these tensors have 4 · 3/2 = 6 independent
components, which can be organised into 3-vectors in the following way:

$$F^{\mu\nu} = -F^{\nu\mu} = \begin{pmatrix} 0 & B^3 & -B^2 & E^1 \\ -B^3 & 0 & B^1 & E^2 \\ B^2 & -B^1 & 0 & E^3 \\ -E^1 & -E^2 & -E^3 & 0 \end{pmatrix}. \qquad (3.43)$$
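The application of the Levi–Civita tensor takes us to the definition of the
dual tensor, $\bar F^{\mu\nu}$:

$$\bar F^{\mu\nu} = \frac{1}{2}\, g^{\mu\mu_1} g^{\nu\nu_1}\, \epsilon_{\mu_1\nu_1\lambda\rho}\, F^{\lambda\rho} = \frac{1}{2}\, \epsilon^{\mu\nu}{}_{\lambda\rho}\, F^{\lambda\rho} . \qquad (3.44)$$

It is not difficult to see that $\bar F$ is obtained from $F$ with the substitutions
$\mathbf{E} \to -\mathbf{B}$; $\mathbf{B} \to \mathbf{E}$:

$$\bar F^{\mu\nu} = \begin{pmatrix} 0 & E^3 & -E^2 & -B^1 \\ -E^3 & 0 & E^1 & -B^2 \\ E^2 & -E^1 & 0 & -B^3 \\ B^1 & B^2 & B^3 & 0 \end{pmatrix}. \qquad (3.45)$$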
The application of the Levi–Civita tensor transforms the space of these
tensors into itself:

$$\frac{1}{2}\, \epsilon^{\mu\nu}{}_{\lambda\rho}\, F^{\lambda\rho} = \bar F^{\mu\nu} ; \qquad \frac{1}{2}\, \epsilon^{\mu\nu}{}_{\lambda\rho}\, \bar F^{\lambda\rho} = -F^{\mu\nu} \qquad (3.46)$$

(where we have used the relation $\epsilon^{0123} = -1$). Starting from these equations,
we can define two irreducible components which correspond to the eigenvalues
$\pm i$ of the duality transformation (3.46):

$$(X^\pm)^{\mu\nu} = F^{\mu\nu} \pm i \bar F^{\mu\nu} \qquad (3.47)$$

$$\frac{1}{2}\, \epsilon^{\mu\nu}{}_{\lambda\rho}\, (X^\pm)^{\lambda\rho} = \mp i\, (X^\pm)^{\mu\nu} .$$

Because we have used only invariant operations, the factorisation (3.47)
is invariant under Lorentz transformations and the space of antisymmetric
tensors with two indices factorises into two invariant subspaces, each of three
dimensions.
These two subspaces form a unique complex if we add the parity operation
(cf. Chapter 1). Under parity, the vectors $\mathbf{E}$ and $\mathbf{B}$ behave, respectively, like
a polar vector and an axial vector:

$$P: \quad \mathbf{E}(\mathbf{x}, t) \to -\mathbf{E}(-\mathbf{x}, t); \qquad \mathbf{B}(\mathbf{x}, t) \to +\mathbf{B}(-\mathbf{x}, t) \qquad (3.48)$$

and therefore:

$$P: \quad X^\pm(\mathbf{x}, t) = f(\mathbf{E} \mp i\mathbf{B}) \to f(-\mathbf{E} \mp i\mathbf{B}) = -f(\mathbf{E} \pm i\mathbf{B}) = -X^\mp(-\mathbf{x}, t). \qquad (3.49)$$
The same conclusion is reached by considering the two quadratic invariants
which can be constructed starting from $F$ and $\bar F$:

$$L_1 = \frac{1}{2} F^{\mu\nu} F_{\mu\nu} = (\mathbf{E})^2 - (\mathbf{B})^2 \qquad (3.50)$$

$$L_2 = \frac{1}{2} \bar F^{\mu\nu} F_{\mu\nu} = (\mathbf{E} \cdot \mathbf{B}).$$

The form of (3.50) suggests considering the complex vectors:

$$(\mathbf{Z})^\pm = \mathbf{E} \mp i\mathbf{B}; \qquad (3.51)$$

$$(\mathbf{Z})^\pm \cdot (\mathbf{Z})^\pm = L_1 \mp 2i L_2 .$$

The squares of the two 3-vectors are separately conserved by Lorentz transformations,
which therefore transform their components among each other.
The parity transformation obviously exchanges $Z^+$ with $-Z^-$.
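The conservation of the squares of the complex 3-vectors can be illustrated numerically. The sketch below uses the standard transformation law of the electric and magnetic fields under a boost, in units where c = 1 and E and B have the same dimensions. That law is not written out in the text and is brought in here as an outside assumption, as are the sample field values and function names.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def boost_fields(e, b, beta):
    """Fields seen from a frame moving with velocity beta (c = 1).
    E' = g (E + beta x B) - g^2/(g+1) beta (beta . E), and similarly for B'."""
    g = 1.0 / math.sqrt(1.0 - dot(beta, beta))
    k = g * g / (g + 1.0)
    bxb, bxe = cross(beta, b), cross(beta, e)
    be, bb = dot(beta, e), dot(beta, b)
    e_new = [g * (e[i] + bxb[i]) - k * beta[i] * be for i in range(3)]
    b_new = [g * (b[i] - bxe[i]) - k * beta[i] * bb for i in range(3)]
    return e_new, b_new

def z_squared(e, b, sign):
    """Z . Z for Z = E - i*sign*B; sign=+1 gives Z+, sign=-1 gives Z-."""
    z = [complex(e[i], -sign * b[i]) for i in range(3)]
    return sum(c * c for c in z)

E = [1.0, 2.0, -0.5]      # sample field values (assumption)
B = [0.3, -1.0, 0.8]
BETA = [0.3, -0.2, 0.5]   # sample boost velocity, |beta| < 1

E_prime, B_prime = boost_fields(E, B, BETA)

zz_plus, zz_plus_prime = z_squared(E, B, +1), z_squared(E_prime, B_prime, +1)
zz_minus, zz_minus_prime = z_squared(E, B, -1), z_squared(E_prime, B_prime, -1)
```

Both complex numbers are unchanged by the boost. Their real part is $\mathbf{E}^2 - \mathbf{B}^2$ and their imaginary parts are $\mp 2\,\mathbf{E}\cdot\mathbf{B}$, so the two quadratic combinations are separately invariant.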

Example 2. Tensors with two symmetric indices, both contravariant (or covariant):
$T^{\mu\nu}$ (or $T_{\mu\nu}$). In this case, the projection with $g_{\mu\nu}$ gives an invariant
and the space factorises into the space of tensors symmetric and with
zero trace, $g_{\mu\nu} T^{\mu\nu} = 0$, of nine dimensions, and into a unidimensional space
of tensors of the form $g^{\mu\nu} T$.

Comment. The classification of irreducible tensors is discussed in [11]. From
an algebraic point of view, it is found that the Lorentz group is equivalent to
the product of two groups of rotations: $L^{\uparrow}_{+} = SU(2) \otimes SU(2)$. Therefore the
irreducible tensors are characterised by two angular momenta: $j_1$ and $j_2$. The
tensors which belong to the irreducible representation $(j_1, j_2)$ have dimension
$d = (2j_1 + 1)(2j_2 + 1)$.
The (contravariant or covariant) 4-vectors correspond to the representation
(1/2, 1/2) of $SU(2) \otimes SU(2)$. The multi-index tensors described earlier are
obtained from tensor products of this representation. For example, the tensors
with two contravariant indices $T^{\mu\nu}$ are generated by the product:

$$ (1/2, 1/2) \otimes (1/2, 1/2) = (1+0,\, 1+0) = [(1, 1) \oplus (0, 0)] \oplus [(1, 0) \oplus (0, 1)]. \tag{3.52} $$

We have placed the brackets to separate the symmetric tensors (those
without trace plus the trace) from the antisymmetric ones. We note the simplicity
with which the rule for the combination of angular momenta reproduces the
factorisation into irreducible components, in particular their dimensionality.
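The dimension count behind (3.52) can be checked mechanically. The following is a minimal sketch, not taken from the text: the helper names `dim` and `couple` are hypothetical, and the only inputs are the rule $d = (2j_1+1)(2j_2+1)$ and the usual addition of two angular momenta.

```python
from fractions import Fraction


def dim(j1, j2):
    """Dimension (2*j1 + 1)(2*j2 + 1) of the representation (j1, j2)."""
    return int((2 * j1 + 1) * (2 * j2 + 1))


def couple(j, k):
    """Angular momenta obtained by combining j and k: |j-k|, ..., j+k."""
    low, high = abs(j - k), j + k
    return [low + n for n in range(int(high - low) + 1)]


half = Fraction(1, 2)

# (1/2, 1/2) x (1/2, 1/2): the two labels are combined independently.
product = [(a, b) for a in couple(half, half) for b in couple(half, half)]
dims = {(int(a), int(b)): dim(a, b) for a, b in product}
total = sum(dims.values())

print(dims)    # one entry per irreducible component
print(total)   # total number of components of a two-index tensor
```

The four components have dimensions 9, 1, 3 and 3, which add up to the 16 components of a general tensor with two indices: the traceless symmetric part, the trace, and the two halves of the antisymmetric part.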

3.4 CONTINUOUS SYMMETRIES

In the previous section, we made reference to two observers who study
the same system of events (for example a certain physical field configuration)
viewed from two different inertial frames.
The principle of relativity requires that the two IFs should be completely
equivalent, so that each observer is independently able to describe the field
dynamics with an action and a Lagrangian density which have the same func-
tional dependence on the fields. Relativistic invariance then requires:

$$ S(\phi') = \int d^4x'\, L[\phi'(x'), \phi'_\mu(x'), x'] = S(\phi) = \int d^4x\, L[\phi(x), \phi_\mu(x), x] \tag{3.53} $$

if $\phi(x), x$ and $\phi'(x'), x'$ are the components of the field and the coordinates
associated with an event in the two IFs, equations (3.33) and (3.39).
We can consider equation (3.53) from another viewpoint, as the invariance
(or symmetry) of the action under the transformation which substitutes $\phi(x), x$
with $\phi'(x'), x'$ in the same frame of reference:

$$ x \to x'; \qquad \phi(x) \to \phi'(x'). \tag{3.54} $$

Viewed in this way, the relativity principle simply expresses the symme-
try of the action under the symmetry group of (proper and orthochronous)
Lorentz transformations and we can study the consequences of this symmetry
at the same time as other possible symmetries of the action which can take
the general form (3.54), with diﬀerent realisations of the transformation rules.

Regarding coordinates, the only addition we need to consider is that of
translations of the origin of space-time. These transformations, together with the proper
and orthochronous Lorentz transformations, form the Poincaré group which
is the group of natural space-time symmetries in special relativity.
Also included within (3.54) are transformations which change the fields
but not the coordinates, $x' = x$, which are referred to as internal symmetries,
for example phase transformations of the complex fields which we will consider
extensively in what follows.
In this section, we limit the discussion to transformations which belong
to continuous groups. In this case, infinitesimal transformations close to the
identity transformation can be defined. By means of products of infinitesi-
mal transformations we can arrive at all the transformations of the group (at
least as far as the component connected to the identity is concerned). There-
fore we can explore the consequences of symmetry under a continuous group
restricting ourselves to infinitesimal transformations.
Thus we consider the transformation defined by infinitesimal variations:

$$ x'^{\mu} = x^{\mu} + \delta x^{\mu}; \qquad \phi'(x') = \phi(x) + \delta_T\phi. \tag{3.55} $$

For simplicity, we have omitted in (3.55) the indices connected to the ﬁeld,
φ, which are implied. We denote δT φ as the total variation of the ﬁeld. We
can decompose δT φ in the following way:

$$ \delta_T\phi = \phi'(x') - \phi(x) = [\phi'(x') - \phi(x')] + [\phi(x') - \phi(x)] = \delta\phi(x) + \partial_\mu\phi(x)\,\delta x^{\mu}. \tag{3.56} $$

The total variation is the sum of the functional variation, δφ, and of a
translation by δxμ .
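The Poincaré group transformations and, a fortiori, the transformations
associated with internal symmetries leave the volume element $d^4x$ invariant. This
requires that:

$$ d^4x' = \left\|\frac{\partial x'}{\partial x}\right\| d^4x = \det\!\left(\delta^{\lambda}_{\mu} + \frac{\partial\,\delta x^{\lambda}}{\partial x^{\mu}}\right) d^4x = d^4x. \tag{3.57} $$

If we use the identity:

$$ \det(1 + \epsilon) = 1 + \mathrm{Trace}(\epsilon), \tag{3.58} $$

valid up to terms of higher order in the infinitesimal matrix $\epsilon$, the invariance
condition on the volume of integration takes the form:

$$ 1 = 1 + \partial_\mu\delta x^{\mu}; \quad \text{or} \quad \partial_\mu\delta x^{\mu} = 0. \tag{3.59} $$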
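The identity (3.58) is easy to test numerically. The sketch below is only an illustration: the matrix `eps` is an arbitrary small 4×4 matrix chosen for the example, and `det` is a plain Laplace expansion written here to keep the code free of external libraries.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def identity_plus(eps):
    """Return the matrix 1 + eps."""
    n = len(eps)
    return [[(1.0 if i == j else 0.0) + eps[i][j] for j in range(n)]
            for i in range(n)]


scale = 1e-4                                   # sets the size of the "infinitesimal"
eps = [[scale * (i + 2 * j + 1) for j in range(4)] for i in range(4)]

trace = sum(eps[i][i] for i in range(4))
exact = det(identity_plus(eps))
first_order = 1.0 + trace
error = abs(exact - first_order)               # second order in eps

print(exact, first_order, error)
```

The discrepancy is quadratic in the entries of `eps`, which is why only the trace, that is $\partial_\mu\delta x^{\mu}$, survives in (3.59).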

Under these conditions, the invariance of the action, equation (3.53) simply
requires the invariance of the Lagrangian density:

$$ \delta_T L = L(\phi', \partial\phi', x') - L(\phi, \partial\phi, x) = 0. \tag{3.60} $$

In the case of Lorentz transformations, the requirement is satisfied if we
construct the Lagrangian as a polynomial of the fields and their derivatives,
contracting the indices in an invariant way, as discussed in the preceding
section.

3.5 NOETHER’S THEOREM

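We expand (3.60) using (3.56). We find:
$$ 0 = \delta L + \frac{\partial L}{\partial x^{\mu}}\delta x^{\mu} = \frac{\partial L}{\partial\phi}\delta\phi + \frac{\partial L}{\partial(\partial_\mu\phi)}\delta(\partial_\mu\phi) + \frac{\partial L}{\partial x^{\mu}}\delta x^{\mu} $$
$$ = \left[\frac{\partial L}{\partial\phi} - \partial_\mu\!\left(\frac{\partial L}{\partial(\partial_\mu\phi)}\right)\right]\delta\phi + \partial_\mu\!\left[\left(\frac{\partial L}{\partial(\partial_\mu\phi)}\right)\delta\phi\right] + \frac{\partial L}{\partial x^{\mu}}\delta x^{\mu} \tag{3.61} $$
where:
$$ \frac{\partial L}{\partial x^{\mu}} = \frac{\partial}{\partial x^{\mu}}\, L[\phi(x), \phi_\mu(x), x]. \tag{3.62} $$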

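Using the equations of motion, the first bracket in (3.61) vanishes; using (3.59),
the last term can be written as $\partial_\mu(L\,\delta x^{\mu})$. We thus obtain the conservation
equation:
$$ \partial_\mu\!\left[\left(\frac{\partial L}{\partial(\partial_\mu\phi)}\right)\delta\phi + L\,\delta x^{\mu}\right] = 0. \tag{3.63} $$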
We can express the infinitesimal variations of the fields and coordinates
as linear combinations of the infinitesimal parameters which characterise the
transformation:

$$ \delta x^{\mu} = \sum_A \epsilon_A\,(\Delta^A)^{\mu}(x); \qquad \delta\phi = \sum_A \epsilon_A\,(\Sigma^A)\,\phi \tag{3.64} $$

where ΔA and ΣA are matrices which represent the generators of the infinites-
imal transformations of the coordinates and the fields. Because (3.63) must
be satisfied for any arbitrary values of the infinitesimal parameters, we obtain
the conservation equations:
$$ \partial_\mu (J^A)^{\mu} = \partial_\mu\!\left[\left(\frac{\partial L}{\partial(\partial_\mu\phi)}\right)\Sigma^A\phi + (\Delta^A)^{\mu} L\right] = 0 \tag{3.65} $$

for the currents associated with each infinitesimal generator. The result (3.65)
is Noether’s theorem:

• Each infinitesimal generator of a continuous symmetry is associated with
a conserved current.

The conserved current is determined by the Lagrangian density, according
to the canonical formula:
$$ (J^A)^{\mu} = \left(\frac{\partial L}{\partial(\partial_\mu\phi)}\right)\Sigma^A\phi + (\Delta^A)^{\mu} L. \tag{3.66} $$
The conserved current determines an additive constant of the motion,
represented by the integral of its time component over all space. We write:
$$ (J^A)^{\mu} = \left((J^A)^0,\ \frac{1}{c}\,\mathbf{J}^A\right) \tag{3.67} $$
and integrate the conservation equation over a fixed volume ($x^0 = ct$):

$$ \int_V d^3x\left[\frac{\partial}{\partial t}(J^A)^0 + \nabla\cdot\mathbf{J}^A\right] = \frac{d}{dt}Q^A + \oint_\Sigma d\sigma\ \mathbf{n}\cdot\mathbf{J}^A = 0 \tag{3.68} $$

where $Q^A = \int_V d^3x\,(J^A)^0$. Equation (3.68) expresses the fact that a variation
of the charge contained inside the volume V is balanced by a corresponding
current density flux through the surface Σ of V. If we extend the integration
to the whole of space, the surface integral vanishes, since the fields vanish
at infinity, and we obtain the law of conservation of the total charge:
$$ \frac{d}{dt}Q^A = 0. \tag{3.69} $$

If the symmetry transformations involve space-time, the index A includes
one or more vector indices and (3.66) behaves like a tensor of higher rank.
For internal symmetries, in which the transformations of the symmetry
group do not involve space-time, the total charge is a Lorentz invariant. To
confirm this property, we consider two hypersurfaces: $\Gamma_0$, corresponding to
$t = 0$ in our frame of reference, and $\Gamma_1$, corresponding to $t' = $ constant in a
Lorentz-transformed system. We integrate the conservation equation over
the four-dimensional volume bounded by these two hypersurfaces. We obtain:

$$ 0 = \int \partial_\mu J^{\mu}\, d^4x = -\int_{\Gamma_0} n_\mu J^{\mu}\, d\Gamma_0 + \int_{\Gamma_1} n'_\mu J^{\mu}\, d\Gamma_1 + I \tag{3.70} $$

where $n_\mu$ and $n'_\mu$ are the normals to the two hypersurfaces, which point in the
time direction of the two reference frames, and I represents the contribution
of the lateral surfaces of our 4-volume. If we let these surfaces tend to infinity,
$I \to 0$, and we then have:

$$ \int_{\Gamma_0} n_\mu J^{\mu}\, d\Gamma_0 = \int_{\Gamma_1} n'_\mu J^{\mu}\, d\Gamma_1; \quad \text{i.e.} \quad \int d^3x\, J^0(\mathbf{x}, t) = \int d^3x'\, J'^0(\mathbf{x}', t'). \tag{3.71} $$

Noether’s theorem establishes the existence of a number of conserved
currents, but it does not uniquely determine their form. We can add another
4-vector to the current in (3.66) provided that it is conserved, as a result of
the equations of motion or for algebraic reasons. An example of the second
case is given by the 4-divergence of an antisymmetric tensor:

$$ s^{\mu} = \partial_\lambda T^{\lambda\mu}. \tag{3.72} $$

If $T^{\lambda\mu} = -T^{\mu\lambda}$ the current $s^{\mu}$ is trivially conserved, $\partial_\mu\partial_\lambda T^{\lambda\mu} = 0$, because
of the antisymmetry of T. In this case, the addition of $s^{\mu}$ modifies the current
but not the conserved charge, since:

$$ \int d^3x\, s^0 = \int d^3x\, \partial_i T^{i0} = 0 \tag{3.73} $$

if the fields vanish at infinity.

3.6 ENERGY–MOMENTUM TENSOR

We consider explicitly the Poincaré group transformations, consisting of
translations in space-time and the special Lorentz transformations.
1. Translations in space-time. These are the coordinate transformations:

$$ x'^{\mu} = x^{\mu} + a^{\mu} \tag{3.74} $$

with $a^{\mu}$ = constant. For the fields, we set:

$$ \phi'(x') = \phi(x) \tag{3.75} $$

(possible tensor indices of φ are not affected by the transformation).

Invariance under translations requires:

$$ L(\phi', (\partial\phi)', x') = L(\phi, (\partial\phi), x) \tag{3.76} $$

or, using (3.74) and (3.75):

$$ L(\phi, (\partial\phi), x + a) = L(\phi, (\partial\phi), x). \tag{3.77} $$

Invariance under translations thus requires that L does not depend explic-
itly on x.
From (3.75), we derive the form of the functional variation of the fields:

$$ \delta_T\phi = 0 = \delta\phi + (\partial_\mu\phi)\,a^{\mu}; \qquad \delta\phi = -(\partial_\mu\phi)\,a^{\mu}. \tag{3.78} $$
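A one-dimensional numerical check may help to separate the total variation from the functional variation. This is a sketch under assumed inputs: the profile `phi` (a sine) and the shift `a` are arbitrary choices made for illustration only.

```python
import math


def phi(x):
    return math.sin(x)            # any smooth test profile


def dphi(x):
    return math.cos(x)            # its derivative


a = 1e-3                          # small translation x' = x + a
x = 0.7                           # point at which the variations are evaluated


def phi_prime(y):
    # scalar field under translation: phi'(x') = phi(x), i.e. phi'(y) = phi(y - a)
    return phi(y - a)


total_variation = phi_prime(x + a) - phi(x)       # delta_T phi, vanishes
functional_variation = phi_prime(x) - phi(x)      # delta phi, at fixed argument
predicted = -a * dphi(x)                          # -(d phi / dx) a, as in (3.78)

print(total_variation, functional_variation, predicted)
```

The total variation vanishes (up to rounding), while the functional variation agrees with $-(\partial_\mu\phi)a^{\mu}$ up to terms of order $a^2$.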

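The conserved current derived from (3.66) and (3.78) is a tensor of rank
two which is called the canonical energy-momentum tensor:
$$ T^{\mu,\nu} = \frac{\partial L}{\partial(\partial_\mu\phi)}\,\partial^{\nu}\phi - g^{\mu\nu}L; \qquad \partial_\mu T^{\mu,\nu} = 0. \tag{3.79} $$

In fact, the 00 component is just the Hamiltonian density, defined in
Section 3.2:
$$ T^{0,0} = \frac{\partial L}{\partial(\partial_t\phi)}\,\partial_t\phi - L = \pi\,\partial_t\phi - L. \tag{3.80} $$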
Correspondingly, the spatial integral of $T^{0,0}$ is the Hamiltonian of the
system, the first integral associated with the time independence of the system
(that is, invariance with respect to time translations: $ct \to ct + a^0$):

$$ H = \int d^3x\, T^{0,0} = E; \qquad \frac{d}{dt}E = 0. \tag{3.81} $$
The energy is the time component of a 4-vector, whose spatial components
are the momentum. We can thus identify:

$$ \int d^3x\, T^{0,\mu} = P^{\mu} \tag{3.82} $$

with the overall 4-momentum of the field.
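As a worked illustration of (3.79), consider the free real scalar field, anticipating the Lagrangian (4.1) of the next chapter with V = 0. The steps below are a sketch under that assumption; the symbol $\Box = \partial_\mu\partial^{\mu}$ is used for the d'Alembert operator.

```latex
\begin{align*}
L &= \tfrac12\,\partial_\mu\phi\,\partial^\mu\phi - \tfrac12 m^2\phi^2,
\qquad \frac{\partial L}{\partial(\partial_\mu\phi)} = \partial^\mu\phi, \\
T^{\mu,\nu} &= \partial^\mu\phi\,\partial^\nu\phi - g^{\mu\nu}L, \\
\partial_\mu T^{\mu,\nu} &= (\Box\phi)\,\partial^\nu\phi
  + \partial^\mu\phi\,\partial_\mu\partial^\nu\phi
  - \bigl(\partial_\mu\phi\,\partial^\nu\partial^\mu\phi - m^2\phi\,\partial^\nu\phi\bigr)
  = (\Box\phi + m^2\phi)\,\partial^\nu\phi = 0, \\
T^{0,0} &= \tfrac12\bigl[(\partial_0\phi)^2 + (\nabla\phi)^2 + m^2\phi^2\bigr].
\end{align*}
```

The last equality in the third line uses the Klein–Gordon equation. In this example the canonical tensor is already symmetric, and $T^{0,0}$ is a sum of squares, so the energy density is non-negative.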


Noether’s theorem guarantees that the components of $P^{\mu}$ are conserved
for systems independent of position in space-time. Thus we find the important
result according to which conservation of energy and momentum are
consequences of invariance under space-time translations. For an isolated system,
invariance under translations is associated with the homogeneity of
space-time; the conservation of 4-momentum for these systems provides a concrete
proof of this important physical fact.
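2. Lorentz transformations. These are associated with coordinate
transformations:
$$ x'^{\mu} = \Lambda^{\mu}{}_{\nu}\, x^{\nu}. \tag{3.83} $$
For infinitesimal transformations, we can set:

$$ \Lambda^{\mu}{}_{\nu} = \delta^{\mu}_{\nu} + \epsilon^{\mu}{}_{\nu}. \tag{3.84} $$

The infinitesimal parameters $\epsilon^{\mu}{}_{\nu}$ are not independent, because the matrices
Λ must be such that they leave the metric tensor invariant:

$$ \Lambda^T g \Lambda = g; \quad \text{i.e.} \quad \epsilon^{\lambda}{}_{\mu}\, g_{\lambda\nu} + g_{\mu\lambda}\,\epsilon^{\lambda}{}_{\nu} = \epsilon_{\mu\nu} + \epsilon_{\nu\mu} = 0, \tag{3.85} $$

where we have kept the terms to first order in $\epsilon$ and we have defined a new
infinitesimal tensor, $\epsilon_{\mu\nu}$, with two covariant indices. Condition (3.85) states
that this tensor should be antisymmetric in the two indices. Equation (3.83)
is rewritten as:

$$ x'^{\mu} = x^{\mu} + g^{\mu\alpha} x^{\beta} \epsilon_{\alpha\beta}; \qquad \delta x^{\mu} = \frac{1}{2}\,\epsilon_{\alpha\beta}\,(g^{\mu\alpha}x^{\beta} - g^{\mu\beta}x^{\alpha}). \tag{3.86} $$
Infinitesimal Lorentz transformations therefore depend on six parameters:
three for spatial rotations ($\epsilon_{ij} = -\epsilon_{ji}$; $i \neq j = 1, 2, 3$) and three for special
Lorentz transformations ($\epsilon_{0i} = -\epsilon_{i0}$; $i = 1, 2, 3$).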
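Condition (3.85) can be verified numerically for an arbitrary antisymmetric parameter set. The sketch below assumes the metric signature (+, −, −, −); the six numbers in `params` are arbitrary values chosen for the example, and the helper functions are written out to avoid external libraries.

```python
# Metric tensor, assumed signature (+, -, -, -); numerically g^{-1} = g.
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


s = 1e-4                                        # size of the infinitesimal parameters
# Six independent parameters eps_{alpha beta} = -eps_{beta alpha} (illustrative values).
params = {(0, 1): 1.0, (0, 2): -2.0, (0, 3): 0.5,
          (1, 2): 3.0, (1, 3): -1.5, (2, 3): 2.5}
eps_low = [[0.0] * 4 for _ in range(4)]
for (a, b), value in params.items():
    eps_low[a][b] = s * value
    eps_low[b][a] = -s * value

# Raise the first index: eps^mu_nu = g^{mu alpha} eps_{alpha nu}.
eps_mixed = matmul(g, eps_low)
lam = [[(1.0 if m == n else 0.0) + eps_mixed[m][n] for n in range(4)]
       for m in range(4)]

# Lambda^T g Lambda should equal g up to terms of second order in eps.
residual = matmul(matmul(transpose(lam), g), lam)
max_dev = max(abs(residual[m][n] - g[m][n]) for m in range(4) for n in range(4))
print(max_dev)
```

The deviation from g is of order $s^2$: the first-order terms cancel precisely because `eps_low` is antisymmetric, which is the content of (3.85).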
For general tensor fields, the transformations are those given in (3.39). We
will use here a more compact notation which can be applied also to the more
general case of spinor fields. We define an index which runs over all the
independent components of the field, which we denote as M, N, etc. The
infinitesimal transformations associated with the parameters $\epsilon_{\mu\nu}$ ($\mu, \nu = 0, 1, 2, 3$) are
transformations of the fields $\phi_M$ given by the linear combination of $\epsilon_{\mu\nu}$ with
six matrices $\Sigma^{\mu\nu}_{MN}$, antisymmetric in μν, associated with the generators of the
Lorentz transformations of the same fields:
$$ \phi'_M(x') = \left(\delta_{MN} + \frac{1}{2}\,\epsilon_{\mu\nu}\,\Sigma^{\mu\nu}_{MN}\right)\phi_N(x). \tag{3.87} $$
(The factor $\frac{1}{2}$ is conventional and the sum over repeated indices is understood
in all cases).

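The functional variation of the fields is obtained from $\delta_T\phi$:

$$ \delta_T\phi_M = \frac{1}{2}\,\epsilon_{\mu\nu}\,\Sigma^{\mu\nu}_{MN}\,\phi_N(x) = \delta\phi_M + \partial_\mu\phi_M\,\delta x^{\mu}; \quad \text{i.e.} $$
$$ \delta\phi_M = \frac{1}{2}\left[\Sigma^{\alpha\beta}_{MN}\,\phi_N(x) - \partial_\mu\phi_M\,(g^{\mu\alpha}x^{\beta} - g^{\mu\beta}x^{\alpha})\right]\epsilon_{\alpha\beta}. \tag{3.88} $$
With these results, we can write the conserved infinitesimal current as:
$$ \delta M^{\mu} = \frac{\partial L}{\partial(\partial_\mu\phi_M)}\,\delta\phi_M + \delta x^{\mu} L = $$
$$ = \frac{1}{2}\,\epsilon_{\alpha\beta}\left[-\frac{\partial L}{\partial(\partial_\mu\phi_M)}\,\partial_\lambda\phi_M\,(g^{\lambda\alpha}x^{\beta} - g^{\lambda\beta}x^{\alpha}) + (g^{\mu\alpha}x^{\beta} - g^{\mu\beta}x^{\alpha})\,L + \frac{\partial L}{\partial(\partial_\mu\phi_M)}\,\Sigma^{\alpha\beta}_{MN}\,\phi_N\right] = $$
$$ = \frac{1}{2}\,\epsilon_{\alpha\beta}\left[(x^{\alpha}T^{\mu,\beta} - x^{\beta}T^{\mu,\alpha}) + \frac{\partial L}{\partial(\partial_\mu\phi_M)}\,\Sigma^{\alpha\beta}_{MN}\,\phi_N\right]. \tag{3.89} $$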
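This result takes us to the definition of the canonical angular momentum
tensor:

$$ M^{\mu[\alpha\beta]} = (x^{\alpha}T^{\mu,\beta} - x^{\beta}T^{\mu,\alpha}) + \Sigma^{\mu[\alpha\beta]}; \tag{3.90} $$
$$ \Sigma^{\mu[\alpha\beta]} = \frac{\partial L}{\partial(\partial_\mu\phi_M)}\,\Sigma^{\alpha\beta}_{MN}\,\phi_N; \qquad \partial_\mu M^{\mu[\alpha\beta]} = 0. $$

The components associated with spatial rotations ($\alpha\beta = ij$) give as
conserved charge the total angular momentum of the field, for example:

$$ J^1 = \int d^3x\, M^{0[23]} = \int d^3x\,[(x^2 T^{0,3} - x^3 T^{0,2}) + \Sigma^{0[23]}]. \tag{3.91} $$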

Comment. This might suggest identifying the first term in (3.91) as due
to the orbital angular momentum and the second to the intrinsic angular
momentum of the field (which in quantum mechanics would correspond to
the spin of the field quanta). In fact, in the case of the scalar field, the second
term is absent. However, the two terms individually are ambiguous, as we
will see in the following section. For example, we can redefine the energy-
momentum tensor so as to eliminate completely the second term. The only
quantities uniquely defined are the constants of the motion, J k .

The Symmetric Energy-Momentum Tensor. In the theory of general
relativity, the Einstein equations, which connect the energy-momentum tensor
to the geometry of space-time, require that the energy-momentum tensor
should be symmetric in the two indices. In general, the canonical
energy-momentum tensor defined by (3.79) is not symmetric. However, making use
of the ambiguity inherent in its definition, it is possible to construct a new
energy-momentum tensor, $\theta^{\mu\nu}$, which is symmetric and conserved at the same
time. The general construction is due to Belinfante and Rosenfeld [1].
The starting point for the construction of θμν is the conservation equation
for M μ[αβ] . Using (3.90), we ﬁnd:

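$$ 0 = \partial_\mu M^{\mu[\alpha\beta]} = T^{\alpha,\beta} - T^{\beta,\alpha} + \partial_\mu\Sigma^{\mu[\alpha\beta]}. \tag{3.92} $$

The antisymmetric part of $T^{\mu,\nu}$ can therefore be eliminated in favour of
the 4-divergence of Σ and we can define the symmetric part of $T^{\mu,\nu}$ according
to:
$$ S^{\mu\nu} = T^{\mu,\nu} + \frac{1}{2}\,\partial_\lambda\Sigma^{\lambda[\mu\nu]} = S^{\nu\mu}. \tag{3.93} $$
This tensor is not yet the solution to the problem, since $S^{\mu\nu}$ is not
conserved:
$$ \partial_\mu S^{\mu\nu} = \frac{1}{2}\,\partial_\lambda\partial_\mu\Sigma^{\lambda[\mu\nu]}. \tag{3.94} $$
However, this result coincides with the 4-divergence of the symmetric
tensor:
$$ R^{(\mu\nu)} = \frac{1}{2}\left[\partial_\lambda\Sigma^{\mu[\lambda\nu]} + \partial_\lambda\Sigma^{\nu[\lambda\mu]}\right]; \qquad \partial_\mu R^{(\mu\nu)} = \frac{1}{2}\,\partial_\lambda\partial_\mu\Sigma^{\mu[\lambda\nu]}. \tag{3.95} $$
Therefore we define:
$$ \theta^{\mu\nu} = T^{\mu,\nu} + \frac{1}{2}\,\partial_\lambda\left[\Sigma^{\lambda[\mu\nu]} - \Sigma^{\mu[\lambda\nu]} - \Sigma^{\nu[\lambda\mu]}\right]. \tag{3.96} $$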
The new tensor is symmetric and conserved. We can calculate the
corresponding charge:

$$ \int d^3x\, \theta^{0\nu} = \int d^3x\left[T^{0,\nu} + \frac{1}{2}\,\partial_\lambda\left(\Sigma^{\lambda[0\nu]} - \Sigma^{0[\lambda\nu]} - \Sigma^{\nu[\lambda 0]}\right)\right]. \tag{3.97} $$
In the terms with derivatives, we should keep only those with λ = 0; the
terms with spatial derivatives correspond to surface terms which vanish at
infinity. We obtain:

$$ \int d^3x\, \theta^{0\nu} = \int d^3x\left[T^{0,\nu} + \frac{1}{2}\,\partial_0\left(\Sigma^{0[0\nu]} - \Sigma^{0[0\nu]} - \Sigma^{\nu[00]}\right)\right] = \int d^3x\, T^{0,\nu}. \tag{3.98} $$
Therefore the tensor $\theta^{\mu\nu}$ is a perfectly legitimate substitute for the
canonical energy-momentum tensor.
An important consequence of what we have just seen is that we can
construct a new momentum tensor based on $\theta^{\mu\nu}$:
$$ M^{\mu[\alpha\beta]} = x^{\alpha}\theta^{\mu\beta} - x^{\beta}\theta^{\mu\alpha}. \tag{3.99} $$

This new tensor is conserved, because θ is symmetric and conserved, and
represents a legitimate substitute for the canonical tensor. We note that,
apparently, the angular momentum is now only “orbital”, proof that the
separation between orbital and spin angular momenta has no physical significance
in a relativistic theory.
With the new momentum tensor, we can analyse the constants of the
motion associated with the special Lorentz transformations. We find:

$$ K^i = \int d^3x\, M^{0[0i]} = \int d^3x\,(ct\,\theta^{0i} - x^i\theta^{00}) = \text{constant}. \tag{3.100} $$

That $K^i$ is a constant simply expresses the fact that the barycentre of the
energy, for an isolated system, moves with uniform rectilinear motion:
$$ \langle x^i \rangle = \frac{\int d^3x\, x^i\,\theta^{00}}{\int d^3x\, \theta^{00}}; \qquad \langle x^i \rangle = ct\cdot\frac{\int d^3x\, \theta^{0i}}{\int d^3x\, \theta^{00}} + \text{constant}. \tag{3.101} $$

3.7 PROBLEMS FOR CHAPTER 3

Sect. 3.1
1. Given the complex field φ and the Lagrangian density
$$ L = \partial_\mu\phi\,\partial^{\mu}\phi^{*} - m^2(\phi\phi^{*}) - \lambda(\phi\phi^{*})^2 $$
determine the equations of motion for φ and φ*.
2. In the same example:
– derive the canonical energy-momentum tensor $T^{\mu\nu}$, and verify that
$T^{00}$ coincides with the Hamiltonian density, H;
– obtain the Noether current associated with the invariance of the
Lagrangian under global phase transformations: $\phi \to e^{i\alpha}\phi$.
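3. Consider two real scalar fields $\phi_1$ and $\phi_2$ and the Lagrangian density

$$ L(\Phi, \partial_\mu\Phi) = \frac{1}{2}(\partial_\mu\Phi)^T(\partial^{\mu}\Phi) - \frac{1}{2}m^2\,\Phi^{\dagger}\Phi - \frac{1}{4}\lambda\,(\Phi^{\dagger}\Phi)^2, \qquad \Phi = \begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix} $$
and consider the transformations $\Phi \to \Phi' = U\Phi$, induced by the matrix

$$ U = \begin{pmatrix}\cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha\end{pmatrix}. $$
Write down the infinitesimal form of the transformation matrix; show
that L is invariant under infinitesimal transformations; derive the
corresponding Noether current.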

4. Write the complex scalar ﬁeld in terms of its real and imaginary compo-
nents: φ = φ1 + iφ2 . Prove that the theory embodied by the Lagrangian
of Problem 1 is identical to the one of Problem 3.
5. Given the Lagrangian density
$$ L = \frac{1}{2}\,\partial_\mu\phi\,\partial^{\mu}\phi - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{3!}\phi^3 $$
– derive the equations of motion for the field φ;
– derive the expression of the canonical energy-momentum tensor
$T^{\mu\nu}$ applying Noether’s theorem in the case of translation
invariance;
– obtain the Hamiltonian density H, and verify that $H = T^{00}$.

Based on the analysis of the energy density, discuss the consistency of
the theory.
6. Scale invariance. Consider the action associated with the Lagrangian
density describing a massless real scalar field

$$ S = \frac{1}{2}\int d^4x\; g^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\phi $$
and the scale transformations

$$ x^{\mu} \to x'^{\mu} = e^{\alpha}x^{\mu}; \qquad \phi(x) \to \phi'(x') = e^{\gamma\alpha}\phi(x), $$

acting on both x and φ.
Determine the value of γ that leaves the action unchanged and derive
the corresponding Noether current.

Sect. 3.2
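1. Using eq. (3.19), prove that:
$$ \frac{\delta\phi(x)}{\delta\phi(y)} = \frac{\delta\pi(x)}{\delta\pi(y)} = \delta(x - y); \qquad \frac{\delta\phi(x)}{\delta\pi(y)} = \frac{\delta\pi(x)}{\delta\phi(y)} = 0 $$
and that:

$$ \{\phi(x), \pi(y)\} = \delta(x - y). $$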

2. Demonstrate the Jacobi identity for generic Poisson brackets:

{A, {B, C}} + {B, {C, A}} + {C, {A, B}} = 0 .


3. Demonstrate the same identity for the commutator of two matrices
[A, B], i.e.

$$ [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. $$

Sect. 3.3
1. Demonstrate that the charges corresponding to the momentum tensor
in (3.99) agree with those of the canonical tensor (3.90).
CHAPTER 4

KLEIN–GORDON FIELD
QUANTISATION

4.1 THE REAL SCALAR FIELD

The scalar ﬁeld provides the simplest example of what has been discussed
so far. In general, we can choose:
$$ L = \frac{1}{2}(\partial^{\mu}\phi)(\partial_\mu\phi) - \frac{1}{2}m^2\phi^2 - V(\phi) \tag{4.1} $$
where m is a constant with dimensions of [length]$^{-1}$, and V(φ) is a function
of second order in φ.
From (4.1) we find the Euler–Lagrange equations:

$$ \Box\phi + m^2\phi + \frac{\partial V}{\partial\phi} = 0. \tag{4.2} $$
In the case V = 0, we find the Klein–Gordon equation:

$$ (\Box + m^2)\phi = 0 \tag{4.3} $$
which constitutes the simplest relativistic field equation.
To construct the Hamiltonian, we calculate the conjugate momentum
density (recall that $x^0 = ct$):
$$ \pi = \frac{\partial L}{\partial(\partial_t\phi)} = \frac{1}{c^2}\,\partial_t\phi \tag{4.4} $$
so that:
$$ H = \pi\,\partial_t\phi - L = \frac{1}{2}\left[c^2\pi^2 + (\nabla\phi)^2 + m^2\phi^2\right] + V(\phi). \tag{4.5} $$

DOI: 10.1201/9781003436263-4
This chapter has been made available under a CC BY NC license.

The stability of the field requires that the Hamiltonian should be bounded
from below, for variations of φ in all possible configurations. Restricting our-
selves to spatially constant configurations, we see from (4.5) that V (φ) must
be a function bounded from below.
Hamilton’s equations are:

$$ \partial_t\phi = c^2\pi; \qquad \partial_t\pi = -\left(m^2\phi + \frac{\partial V}{\partial\phi} - \nabla\cdot\nabla\phi\right). \tag{4.6} $$
Substituting the ﬁrst into the second we ﬁnd, of course, the relativistically
invariant equation (4.2).

Comment. Equation (4.1) represents the most general form of a Lagrangian
which is invariant and quadratic in the derivatives. A term linear in φ could
be added. Even in the presence of a term of this type, the potential V(φ) in
the Hamiltonian must have an absolute minimum. Therefore we can eliminate
the linear term with a change of variables: $\phi' = \phi - \phi_0$, $\partial^{\mu}\phi' = \partial^{\mu}\phi$, where
$\phi_0$ is the position of the minimum. The expansion of L in $\phi'$ no longer contains the linear
term.

Now we determine the general solution of the Klein–Gordon (K–G)
equation, which is again:

$$ (\Box + m^2)\phi = 0 $$

which we can solve with periodic boundary conditions at the edges of a large
cubic spatial volume of side L:

$$ \phi(x, y, z, t) = \phi(x + L, y, z, t), \ \text{etc.} $$

The elementary solutions have the form of plane waves:

$$ \phi = N e^{-ik_\mu x^{\mu}}; \qquad k^{\mu} = (k^0, \mathbf{k}) $$

where $\mathbf{k}$ is the wave vector. The periodicity condition implies:

$$ k^1 L = 2\pi n^1, \ \text{etc.} \quad\to\quad \mathbf{k} = \frac{2\pi}{L}(n^1, n^2, n^3) \tag{4.7} $$
where the $n^i$ are arbitrary integers. The Klein–Gordon equation, in turn, requires:

$$ k_\mu k^{\mu} - m^2 = 0 $$

from which we can deduce the two solutions:

$$ k^0 = \pm\frac{\omega}{c}; \qquad \frac{\omega}{c} = \sqrt{(m)^2 + (\mathbf{k})^2}. \tag{4.8} $$
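The quantisation condition (4.7) and the dispersion relation (4.8) translate directly into a few lines of code. This is a minimal sketch: the function names and the numerical values of `side`, `m`, `c` and `n` are illustrative assumptions, not values from the text.

```python
import math


def wave_vector(n, side):
    """k = (2*pi/L) * n for an integer triple n, as in (4.7)."""
    return tuple(2.0 * math.pi * ni / side for ni in n)


def omega(k, m, c=1.0):
    """Positive root of (4.8): omega / c = sqrt(m^2 + |k|^2)."""
    return c * math.sqrt(m * m + sum(ki * ki for ki in k))


side, m, c = 10.0, 0.5, 1.0          # illustrative box side, mass (1/length), c
n = (1, -2, 3)                       # one allowed integer vector
k = wave_vector(n, side)
w = omega(k, m, c)

# Mass-shell condition k_mu k^mu - m^2 = 0 with k^0 = omega / c.
k0 = w / c
mass_shell = k0 * k0 - sum(ki * ki for ki in k) - m * m

# Periodicity: k^1 * L / (2*pi) must be the integer n^1.
phase_shift = k[0] * side / (2.0 * math.pi)

print(k, w, mass_shell, phase_shift)
```

Both signs of $k^0$ satisfy the same mass-shell condition; the code keeps the positive root, which is the convention adopted for ω in the rest of the section.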

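The general solution of the K–G equation is a superposition of the plane
waves just found. For every $\mathbf{k}$ we have two plane waves, one with positive
frequency $e^{-i\omega t}$ and one with negative frequency $e^{+i\omega t}$. We write:

$$ \phi(x) = \sum_{\mathbf{n}} N(\omega)\left[a(\mathbf{k})\, e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})} + c(\mathbf{k})\, e^{+i(\omega t + \mathbf{k}\cdot\mathbf{x})}\right] \tag{4.9} $$

where N is a normalisation factor which we will shortly define, and the sum
is over all vectors $\mathbf{n}$ with integer components. In the second term we can sum
over $-\mathbf{n}$ and define $c(-\mathbf{k}) = b^{*}(\mathbf{k})$:

$$ \phi(x) = \sum_{\mathbf{n}} N(\omega)\left[a(\mathbf{k})\, e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})} + b^{*}(\mathbf{k})\, e^{+i(\omega t - \mathbf{k}\cdot\mathbf{x})}\right] = \sum_{\mathbf{n}} N(\omega)\left[a(\mathbf{k})\, e^{-ik_\mu x^{\mu}} + b^{*}(\mathbf{k})\, e^{+ik_\mu x^{\mu}}\right] \tag{4.10} $$

where, from now on, we will put $k^{\mu} = (\frac{\omega}{c}, \mathbf{k})$.

The general solution (4.10) produces a field which is, in general, complex.
For a real field, we must have $b(\mathbf{k}) = a(\mathbf{k})$ and we find:
$$ \phi(x) = \sum_{\mathbf{n}} N(\omega)\left[a(\mathbf{k})\, e^{-ik_\mu x^{\mu}} + a^{*}(\mathbf{k})\, e^{+ik_\mu x^{\mu}}\right]. \tag{4.11} $$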

The φ given by (4.11) is real and depends on two real functions of k, the real
and imaginary parts of a(k). This corresponds to the fact that, to completely
determine φ, it is necessary to provide two types of initial conditions: the
values of φ(x, 0) and ∂t φ(x, 0).
To conclude this section, we can state explicitly the relation between a(k)
and the initial conditions. We consider the system of functions fk (x), solutions
of the K–G equation for positive frequencies:
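$$ f_k(x) = N e^{-ik_\mu x^{\mu}}. \tag{4.12} $$

Using the orthonormality condition for exponential functions, we can
calculate the two projections (recalling that $\omega(\mathbf{k}) = \omega(-\mathbf{k})$):

$$ X = \int d^3x\,\left[(\partial_t f_k(\mathbf{x}, t))^{*}\,\phi(\mathbf{x}, t)\right]_{t=0} = i\omega(\mathbf{k})\,N^2 V\,[a(\mathbf{k}) + a^{*}(-\mathbf{k})]; $$

$$ Y = \int d^3x\,\left[(f_k(\mathbf{x}, t))^{*}\,\partial_t\phi(\mathbf{x}, t)\right]_{t=0} = -i\omega(\mathbf{k})\,N^2 V\,[a(\mathbf{k}) - a^{*}(-\mathbf{k})]; $$

from which we can derive $a(\mathbf{k})$. If we choose $N = (2\omega(\mathbf{k})V)^{-1/2}$, it follows
that:

$$ a(\mathbf{k}) = i(Y - X) = i\int d^3x\,\left[f_k^{*}\cdot(\partial_t\phi) - (\partial_t f_k^{*})\cdot\phi\right]_{t=0} $$
$$ \phi(x) = \sum_{\mathbf{n}} \frac{1}{\sqrt{2\omega V}}\left[a(\mathbf{k})\, e^{-ik_\mu x^{\mu}} + a^{*}(\mathbf{k})\, e^{+ik_\mu x^{\mu}}\right]. \tag{4.13} $$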
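The projection formula for $a(\mathbf{k})$ can be tried out on a small example. The sketch below is a one-dimensional analogue, assumed for simplicity: the volume V is replaced by the side of a periodic interval, units with c = 1 are used, and the amplitudes in `coeffs` are arbitrary test values. The spatial integral is approximated by a uniform Riemann sum, which is exact for the periodic exponentials involved.

```python
import cmath
import math

side, m = 8.0, 1.0                   # illustrative interval length and mass (c = 1)
points = 64                          # grid points for the spatial integral
dx = side / points
coeffs = {1: 0.3 + 0.4j, -2: -0.7 + 0.1j, 3: 0.2 - 0.5j}   # chosen a(k_n)


def k_of(n):
    return 2.0 * math.pi * n / side


def omega(n):
    return math.sqrt(m * m + k_of(n) ** 2)


def norm(n):
    return (2.0 * omega(n) * side) ** -0.5      # N = (2 omega V)^(-1/2), V -> side


def phi0(x):
    """Real field of (4.13) at t = 0."""
    return sum(norm(n) * (a * cmath.exp(1j * k_of(n) * x)
                          + a.conjugate() * cmath.exp(-1j * k_of(n) * x))
               for n, a in coeffs.items())


def dt_phi0(x):
    """Time derivative of (4.13) at t = 0."""
    return sum(norm(n) * omega(n)
               * (-1j * a * cmath.exp(1j * k_of(n) * x)
                  + 1j * a.conjugate() * cmath.exp(-1j * k_of(n) * x))
               for n, a in coeffs.items())


def extract(n):
    """a(k) = i * integral of [f_k^* dt(phi) - dt(f_k^*) phi] at t = 0."""
    total = 0j
    for j in range(points):
        x = j * dx
        f_conj = norm(n) * cmath.exp(-1j * k_of(n) * x)   # f_k^* at t = 0
        dt_f_conj = 1j * omega(n) * f_conj                # time derivative of f_k^*
        total += (f_conj * dt_phi0(x) - dt_f_conj * phi0(x)) * dx
    return 1j * total


recovered = {n: extract(n) for n in coeffs}
print(recovered)
```

The recovered amplitudes coincide with the ones used to build the field, which illustrates that the pair of initial data $\phi(\mathbf{x}, 0)$ and $\partial_t\phi(\mathbf{x}, 0)$ fixes $a(\mathbf{k})$ completely.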

The complex Klein–Gordon field. The extension to the case of a complex
field is straightforward. The Lagrangian is written:

$$ L = (\partial^{\mu}\phi)(\partial_\mu\phi^{*}) - m^2\phi\phi^{*} - V(\phi\phi^{*}) \tag{4.14} $$

which leads once again to the Klein–Gordon equation (4.3) for φ and φ*.
The Lagrangian (4.14) exhibits a symmetry under changes of phase, this time
an internal symmetry:

$$ \phi(x) \to e^{i\alpha}\phi(x); \qquad \phi(x)^{*} \to e^{-i\alpha}\phi(x)^{*}. \tag{4.15} $$

In terms of the real and imaginary parts of φ:

$$ \phi = \frac{\phi_1 + i\phi_2}{\sqrt{2}} $$
the Lagrangian (4.14) reduces to the sum of two identical Lagrangians for
the real fields $\phi_1$ and $\phi_2$. The symmetry (4.15), in this new representation,
corresponds to an orthogonal rotation of the fields $\phi_{1,2}$ among themselves:

$$ \phi'_i = O_{ij}\,\phi_j; \qquad O^T O = 1. \tag{4.16} $$
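The correspondence between the phase change (4.15) and the rotation (4.16) can be made explicit with numbers. In this sketch the angle `alpha` and the field values `phi1`, `phi2` are arbitrary illustrative choices, and the explicit form of the matrix `O` is the standard two-dimensional rotation.

```python
import cmath
import math

alpha = 0.37                               # arbitrary phase angle
phi1, phi2 = 1.2, -0.8                     # arbitrary real components at one point
phi = (phi1 + 1j * phi2) / math.sqrt(2.0)

# Phase transformation (4.15) applied to the complex field.
rotated = cmath.exp(1j * alpha) * phi
new1 = math.sqrt(2.0) * rotated.real
new2 = math.sqrt(2.0) * rotated.imag

# The same result obtained from an orthogonal matrix acting on (phi1, phi2).
O = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha), math.cos(alpha)]]
via_matrix = [O[0][0] * phi1 + O[0][1] * phi2,
              O[1][0] * phi1 + O[1][1] * phi2]

# Orthogonality check: O^T O = 1.
gram = [[sum(O[k][i] * O[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]

print(new1, new2, via_matrix, gram)
```

The rotation leaves $\phi_1^2 + \phi_2^2 = 2\phi\phi^{*}$ unchanged, which is why a Lagrangian depending on the fields only through $\phi\phi^{*}$ is invariant.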

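The complex field φ can once again be expanded in solutions of the
Klein–Gordon equation according to (4.10) but now $a(\mathbf{k})$ and $b(\mathbf{k})$ are
independent:
$$ \phi(x) = \sum_{\mathbf{n}} \frac{1}{\sqrt{2\omega V}}\left[a(\mathbf{k})\, e^{-ik_\mu x^{\mu}} + b^{*}(\mathbf{k})\, e^{+ik_\mu x^{\mu}}\right]. \tag{4.17} $$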

The Continuum Limit. The inclusion of the system in a cube of side L is a
mathematical artifice which serves to arrive at a discrete spectrum of solutions
to the K–G equation. In general, at the end, it is necessary to pass to the limit
L → ∞. The sum over integer vectors, in this limit, tends to an integral over
the oscillator density, of which we give the explicit form. From (4.7) it can be
seen that the interval $\Delta n^1$ corresponds to $\frac{L}{2\pi}\Delta k^1$, etc. Therefore:

$$ \sum_{\mathbf{n}} \ldots \;\to\; V\sum \frac{\Delta k^1\,\Delta k^2\,\Delta k^3}{(2\pi)^3}\,\ldots \;=\; V\int\frac{d^3k}{(2\pi)^3}\,\ldots $$
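The replacement of the sum over modes by $V\int d^3k/(2\pi)^3$ can be illustrated by counting modes below a cutoff. The sketch below is only an illustration: the cutoff `k_max` and the box side are arbitrary choices, and the count is done by brute force over the integer vectors.

```python
import math


def count_modes(side, k_max):
    """Number of allowed k = (2*pi/L) n with |k| <= k_max (brute force)."""
    radius = k_max * side / (2.0 * math.pi)      # cutoff radius in n-space
    reach = int(math.floor(radius))
    count = 0
    for n1 in range(-reach, reach + 1):
        for n2 in range(-reach, reach + 1):
            for n3 in range(-reach, reach + 1):
                if n1 * n1 + n2 * n2 + n3 * n3 <= radius * radius:
                    count += 1
    return count


def continuum_estimate(side, k_max):
    """V * integral of d^3k / (2*pi)^3 over the ball |k| <= k_max."""
    volume = side ** 3
    return volume * (4.0 / 3.0) * math.pi * k_max ** 3 / (2.0 * math.pi) ** 3


side, k_max = 20.0, 3.0                          # illustrative values
discrete = count_modes(side, k_max)
estimate = continuum_estimate(side, k_max)
relative_gap = abs(discrete - estimate) / estimate

print(discrete, estimate, relative_gap)
```

The two numbers differ only by a surface effect, whose relative size decreases as the box grows at fixed cutoff, in line with the limit L → ∞ taken in the text.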

4.2 GREEN’S FUNCTIONS OF THE SCALAR FIELD

We now consider the solutions of the equation of motion of a real ﬁeld in
the presence of a known source, J(x):

$$ (-\Box - \mu^2)\,\phi(x) = J(x). \tag{4.18} $$

The associated homogeneous equation is the Klein–Gordon equation, (4.3).


The solutions of (4.18) are obtained starting from the Green’s function for the
problem, the solution to the equation in the presence of a pointlike source
described by a Dirac delta function localised at the origin of space-time:

$$ (-\Box - \mu^2)\,G(x) = \delta^{(4)}(x). \tag{4.19} $$

Given the Green’s function, the solution to (4.18) is simply:

$$ \phi(x) = \int d^4x'\, G(x - x')\,J(x') \tag{4.20} $$

as is easily verified.
There are, of course, an infinite number of Green’s functions, each
determined by the particular boundary conditions which are assigned to (4.19). The
solutions differ among themselves by a solution to the homogeneous equation,
so that the general solution to (4.18) is written:

$$ \phi(x) = \int d^4x'\, G(x - x')\,J(x') + \phi_0(x) \tag{4.21} $$

where G is a given Green’s function and $\phi_0$ is the general solution to the
homogeneous K–G equation (4.3), which we described earlier, (4.13).
To solve (4.19), the method of Fourier transforms is used. Given f(x), we
define the Fourier transform (immediately taking the limit V → ∞) as:

$$ \tilde{f}(k) = \int d^4x\, f(x)\,e^{i(k\cdot x)}; \qquad f(x) = \frac{1}{(2\pi)^4}\int d^4k\, \tilde{f}(k)\,e^{-i(k\cdot x)}. \tag{4.22} $$

Equations (4.19) and (4.20) therefore become:

$$ \tilde{G}(k) = \frac{1}{k^2 - \mu^2}; \qquad \tilde{\phi}(k) = \tilde{G}(k)\cdot\tilde{J}(k). \tag{4.23} $$

A particular solution to (4.18) is found formally from (4.23):

$$ \phi(x) = \int d^4k\, e^{-i(k\cdot x)}\,\frac{1}{k^2 - \mu^2}\cdot\tilde{J}(k). \tag{4.24} $$
To give a precise signiﬁcance to (4.24) we must take account of the fact
that the denominator in the integral is singular at the points which correspond
to the propagation of free waves, (4.8). To do this, it is convenient to work in
the complex plane of the variable k 0 . The singularity of G̃(k) is found on the
real axis, for k 0 = ±ω, and each particular solution is found by assigning a
path in the complex plane to carry out the integral in k 0 .
To be definite we consider the integral:

$$ F(x) = \int_C d^4k\, \frac{-1}{k^2 - \mu^2}\,\tilde{g}(k)\,e^{-i(kx)} \tag{4.25} $$

with $\tilde{g}(k)$ a given function, analytic in $k^0$, and C a path assigned in
the complex plane of $k^0$. We must distinguish between integration
along closed and open paths.

1. Closed paths. These integrals give solutions to the homogeneous equation.
Indeed, applying the Klein–Gordon operator, a factor $k^2 - \mu^2$ is obtained in
the numerator of the integrand, which eliminates the pole; at this point we
can shrink the integration path to a point, thus obtaining:

$$ (\Box + \mu^2)\,F(x) = 0. $$

Using the residue theorem, it is easily seen that the integral is equal to zero
if the path does not include either of the two singularities, or a combination
of the two homogeneous solutions, represented by the residues of the integral
around each singularity.
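We denote by $C^{+}$ a path which encircles clockwise (only once!) the point
$k^0 = +\omega(\mathbf{k})$ and we define:

$$ i\Delta^{(+)}(x) = \frac{1}{(2\pi)^4}\int_{C^{+}} d^4k\; e^{-i(k\cdot x)}\,\frac{i}{k^2 - \mu^2} = \frac{1}{(2\pi)^4}\int d^3k \int_{C^{+}} dk^0\, \frac{i}{(k^0 - \omega)(k^0 + \omega)}\, e^{-i(k\cdot x)}; $$
$$ i\Delta^{(+)}(x) = \frac{1}{(2\pi)^3}\int\frac{d^3k}{2\omega}\, e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})} = \frac{1}{(2\pi)^3}\int\frac{d^3k}{2\omega}\, e^{-i(k\cdot x)}. \tag{4.26} $$

Analogously, we denote by $C^{-}$ the path which turns in the clockwise
direction around the singularity at $k^0 = -\omega(\mathbf{k})$ and we define:

$$ i\Delta^{(-)}(x) = \frac{1}{(2\pi)^4}\int_{C^{-}} d^4k\; e^{-i(k\cdot x)}\,\frac{i}{k^2 - \mu^2}; $$
$$ i\Delta^{(-)}(x) = \frac{-1}{(2\pi)^3}\int\frac{d^3k}{2\omega}\, e^{-i(-\omega t - \mathbf{k}\cdot\mathbf{x})} = \frac{-1}{(2\pi)^3}\int\frac{d^3k}{2\omega}\, e^{+i(\omega t - \mathbf{k}\cdot\mathbf{x})} = \frac{-1}{(2\pi)^3}\int\frac{d^3k}{2\omega}\, e^{+i(k\cdot x)}. \tag{4.27} $$

In (4.26) and (4.27), $k^{\mu} = [+\omega(\mathbf{k}), \mathbf{k}]$. Obviously,

$$ i\Delta^{(-)}(x) = -i\Delta^{(+)}(-x) = -[i\Delta^{(+)}(x)]^{*}. \tag{4.28} $$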

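In conclusion, the integral (4.25) on a closed path, $C_0$, gives a solution to
the homogeneous equation, from a linear combination of the residues at the
two poles:

$$ \phi_0(x) = \frac{1}{(2\pi)^3}\int\frac{d^3k}{2\omega(\mathbf{k})}\,\tilde{g}(\mathbf{k})\,e^{-i(k\cdot x)} + \frac{-1}{(2\pi)^3}\int\frac{d^3k}{2\omega(\mathbf{k})}\,\tilde{g}(-\mathbf{k})\,e^{+i(k\cdot x)} = $$
$$ = \Delta^{(+)}(x - x')\,g(x') + \Delta^{(-)}(x - x')\,h(x'). \tag{4.29} $$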

where we have put $\tilde{g}(\mathbf{k}) = \tilde{g}(\omega(\mathbf{k}), \mathbf{k})$, and g and h are two independent
functions. Equation (4.29) gives a representation of the general solution of the
homogeneous equation which reproduces what was given earlier, in (4.13) in
the complex case (in which $\tilde{g}(\mathbf{k})$ and $\tilde{g}(-\mathbf{k})$ are independent) and in the
continuum limit, in which:

$$ \tilde{g}(\mathbf{k}) = \sqrt{2\omega(\mathbf{k})V}\;a(\mathbf{k}). \tag{4.30} $$

(Note that in the continuum limit, V → ∞, the $a(\mathbf{k})$ must tend to zero like
$1/\sqrt{V}$ if we wish for field configurations in which the total energy of the field,
rather than the energy density, remains finite).

2. Open paths. In general, these paths give a solution to the inhomogeneous
equation. Two different paths give the same result if we can continuously
deform one into the other without encountering any singular point of $\tilde{G}$;
otherwise they differ by combinations of the integrals around the singularities, i.e.
by solutions of the homogeneous equation.
Among the particular solutions of the inhomogeneous equation worthy of
note are those which correspond to retarded or advanced Green’s functions or
to the Feynman function.

• The retarded Green’s function, $G_{ret}$, corresponds to the condition that
G(x) = 0 for t < 0, that is, the result should be different from
zero only after switching on the source at the coordinate origin (causality
condition). In this case, the integration path must be completely
above the singularities. We put $k^0 = \mathrm{Re}\,k^0 + i\eta$ and explicitly write the
exponential in (4.25):
$$ e^{-i(k\cdot x)} = e^{(-i\,\mathrm{Re}\,k^0\, t + \eta t + \ldots)}. $$

For t < 0 the integration path must be closed in the upper part of the
complex plane (η > 0). To obtain a null result, the path closed in this
way should not contain the singularities, which must therefore be below
the path of integration. Conversely, when t > 0 and the integration path
is closed in the lower half-plane, the path, turning clockwise, includes
the two singularities. The result is therefore:

$$ i\Delta_{ret}(x) = \frac{1}{(2\pi)^4}\int_{\mathrm{Im}\,k^0 > 0} d^4k\, \frac{i}{k^2 - \mu^2}\, e^{-i(k\cdot x)} = \theta(t)\,[i\Delta^{(+)}(x) + i\Delta^{(-)}(x)]. \tag{4.31} $$
• The symmetric condition, that G should vanish for t > 0, takes us to
the advanced Green’s function:

$$ i\Delta_{adv}(x) = \frac{1}{(2\pi)^4}\int_{\mathrm{Im}\,k^0 < 0} d^4k\, \frac{i}{k^2 - \mu^2}\, e^{-i(k\cdot x)} = -\theta(-t)\,[i\Delta^{(+)}(x) + i\Delta^{(-)}(x)]. \tag{4.32} $$

• The Feynman propagator is obtained from the condition that it should
coincide with $i\Delta^{(+)}(x)$ for t > 0 and with $-i\Delta^{(-)}(x)$ for t < 0. This
condition determines an integration path, $C_F$, that runs along the real
axis of $k^0$ starting from negative values, passing below the singularity at $k^0 < 0$ and
above the one at $k^0 > 0$. In this way, for t > 0, when closed in the
lower half-plane, the path turns clockwise around the point $k^0 = \omega$ and
results in $i\Delta^{(+)}(x)$, while for t < 0 the path is closed in the upper
half-plane, turning anticlockwise around the singularity at $k^0 < 0$, and
results in $-i\Delta^{(-)}(x)$. The same
result is obtained, obviously, integrating along the real axis, after having
moved the poles in the complex plane by an infinitesimal amount, $\epsilon > 0$,
in the following way:

$$ k^0 = -\omega(\mathbf{k}) \to k^0 = -\omega + i\epsilon; \qquad k^0 = +\omega(\mathbf{k}) \to k^0 = +\omega - i\epsilon. $$

As a formula, we arrive at the following definition:

$$ iD_F(x) = \int_{C_F}\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - \mu^2}\,e^{-i(k\cdot x)} = \int\frac{d^4k}{(2\pi)^4}\,\frac{i}{(k^0 - \omega + i\epsilon)(k^0 + \omega - i\epsilon)}\,e^{-i(k\cdot x)} $$
$$ = \int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - \mu^2 + i\epsilon}\,e^{-i(k\cdot x)} = \theta(t)\,i\Delta^{(+)}(x) - \theta(-t)\,i\Delta^{(-)}(x). \tag{4.33} $$

The propagators $\Delta^{(\pm)}(x)$ can be expressed in terms of known
functions [2].
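The displacement of the poles produced by the $+i\epsilon$ in the denominator of (4.33) can be checked directly. The sketch below assumes the form $k^2 - \mu^2 + i\epsilon = (k^0)^2 - \omega^2 + i\epsilon$ with $\omega^2 = \mathbf{k}^2 + \mu^2$; the function name `feynman_poles` and the numerical values are illustrative.

```python
import cmath


def feynman_poles(w, eps):
    """Roots in k0 of k0^2 - w^2 + i*eps = 0, with w^2 = |k|^2 + mu^2."""
    root = cmath.sqrt(w * w - 1j * eps)     # principal root, positive real part
    return root, -root


w, eps = 2.0, 1e-6                          # illustrative frequency and regulator
plus, minus = feynman_poles(w, eps)

print(plus)     # close to +w, slightly below the real axis
print(minus)    # close to -w, slightly above the real axis
```

To first order the poles sit at $\pm(\omega - i\epsilon/2\omega)$, so the two forms of the denominator in (4.33) agree once the infinitesimal quantity is rescaled by the positive factor $2\omega$.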

4.3 QUANTISATION OF THE SCALAR FIELD

We recall the classical Lagrangian for the complex scalar ﬁeld:

L = ∂μ φ † ∂ μ φ − m 2 φ † φ (4.34)

which reproduces the Klein–Gordon (K–G) equation:

(∂μ ∂ μ + m2 )φ = 0 (4.35)

The Lagrangian (4.34) is invariant under space-time translations and under
Lorentz transformations, under which φ transforms like a scalar:

$$\phi'(x') = \phi(x). \tag{4.36}$$

Moreover, the Lagrangian is invariant under the phase transformations:

$$\phi'(x) = e^{i\alpha}\phi(x); \qquad \phi'^{\dagger}(x) = e^{-i\alpha}\phi^{\dagger}(x) \tag{4.37}$$

with α constant.
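Noether's theorem then provides:

• the energy-momentum tensor (symmetric, because there is no spin part):

$$\theta^{\mu\nu} = \partial^\mu\phi^\dagger\,\partial^\nu\phi + \partial^\nu\phi^\dagger\,\partial^\mu\phi - g^{\mu\nu}L \tag{4.38}$$

• the angular momentum tensor:

$$M^{\mu,\alpha\beta} = x^\alpha\theta^{\mu\beta} - x^\beta\theta^{\mu\alpha} \tag{4.39}$$

• the conserved current corresponding to (4.37) (the factor 1/ħ is inserted
for appropriate normalisation of the charge):

$$J^\mu(x) = \frac{i}{\hbar}\,[\phi^\dagger(\partial^\mu\phi) - (\partial^\mu\phi^\dagger)\phi]. \tag{4.40}$$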

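From (4.34) the conjugate momenta and the Hamiltonian are quickly
found:
$$\pi = \frac{\partial L}{\partial(\partial_t\phi)} = \partial_t\phi^\dagger; \qquad \pi^\dagger = \partial_t\phi$$

$$\mathcal{H} = \pi^\dagger\pi + \nabla\phi^\dagger\cdot\nabla\phi + m^2\phi^\dagger\phi; \qquad H = \int d^3x\,\mathcal{H}$$

$$Q = \int d^3x\,J^0(\mathbf{x},t) = \frac{i}{\hbar}\int d^3x\,[\phi^\dagger(\partial^0\phi) - (\partial^0\phi^\dagger)\phi] \tag{4.41}$$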

Canonical quantisation replaces the classical Poisson brackets of the
canonical variables (cf. Chapter 3) with equal-time commutators according
to the rule:
$$\{A, B\} \to \frac{[A, B]}{i\hbar}. \tag{4.42}$$
We obtain the equal-time commutators from equation (3.26):

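$$\begin{aligned}
&[\phi(\mathbf{x},t), \phi(\mathbf{y},t)] = [\phi(\mathbf{x},t), \phi^\dagger(\mathbf{y},t)] = 0 \\
&[\pi(\mathbf{x},t), \pi(\mathbf{y},t)] = [\pi(\mathbf{x},t), \pi^\dagger(\mathbf{y},t)] = 0 \\
&[\phi(\mathbf{x},t), \pi^\dagger(\mathbf{y},t)] = 0 \\
&[\phi(\mathbf{x},t), \pi(\mathbf{y},t)] = [\phi^\dagger(\mathbf{x},t), \pi^\dagger(\mathbf{y},t)] = i\hbar\,\delta^{(3)}(\mathbf{x}-\mathbf{y}).
\end{aligned} \tag{4.43}$$

We can also express the non-zero commutators as:

$$[\phi(\mathbf{x},t), \partial_t\phi^\dagger(\mathbf{y},t)] = i\hbar\,\delta^{(3)}(\mathbf{x}-\mathbf{y}), \tag{4.44}$$

together with the Hermitian conjugate equation.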


Equations (4.44) determine the operator structure of the theory, in particular
the commutators of the dynamic variables with the Hamiltonian and
therefore the equations of motion. Hamilton's equations are¹:
$$i\hbar\,\frac{\partial}{\partial t}\phi = [\phi, H] = i\hbar\,\pi^\dagger;$$
$$i\hbar\,\frac{\partial}{\partial t}\pi = [\pi, H] = i\hbar\,(\nabla\cdot\nabla - m^2)\phi^\dagger, \tag{4.45}$$
from which (4.35) follows. We note the charge-field commutation rule:

[φ, Q] = +φ (4.46)

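The solutions to the K–G equation are of the form found in section 4.1.
To be exact, the K–G equation, being linear in φ, has the same solutions
regardless of whether φ is a complex number or an operator. What changes is
the nature of the amplitudes of the normal oscillation modes, a(k) and b(k),
which now become linear operators with commutation rules determined by (A.25).
(We normalise a(k) and b(k) so as to eliminate ħ from their commutation rules.):

$$\phi(x) = \sum_{\mathbf{k}}\sqrt{\frac{\hbar}{2\omega V}}\,[a(\mathbf{k})\,e^{-i(kx)} + b^\dagger(\mathbf{k})\,e^{i(kx)}]. \tag{4.47}$$

Inverting equation (4.47) we find:

$$a(\mathbf{k}) = \frac{i}{\sqrt{\hbar}}\int d^3x\,[f_k^*\cdot(\partial_t\phi) - (\partial_t f_k^*)\cdot\phi]_{t=0},$$

$$b(\mathbf{k}) = \frac{i}{\sqrt{\hbar}}\int d^3x\,[f_k^*\cdot(\partial_t\phi^\dagger) - (\partial_t f_k^*)\cdot\phi^\dagger]_{t=0},$$

$$f_k = \frac{1}{\sqrt{2\omega(\mathbf{k})V}}\,e^{-ikx}. \tag{4.48}$$

From (4.44) we therefore find:

$$[a(\mathbf{k}), a^\dagger(\mathbf{k}')] = [b(\mathbf{k}), b^\dagger(\mathbf{k}')] = \delta_{\mathbf{k},\mathbf{k}'} \tag{4.49}$$

with all other commutators equal to zero.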

• The canonical commutation rules give the operators a, a† and b, b†
the role of creation and destruction operators of quantum harmonic
oscillators, with two types of oscillator for every mode of vibration of
the classical field.
The Hilbert space on which the field operators act is composed of tensor
products of the different oscillator states. More precisely, the space of states
includes:
1 These are obtained immediately by applying (4.42) to the Poisson brackets (3.27).

• the vacuum state in which all the oscillators are at the lowest level.
Mathematically, $|0\rangle$ is determined by the condition of being annihilated
by the application of any destruction operator:
$$a_s(p)|0\rangle = b_r(q)|0\rangle = 0, \quad\text{for any } s, r, p, q; \tag{4.50}$$

• states with a certain number of excitations of different oscillators,
obtained by applying the creation operators a† and b† to the vacuum state:
$$|n_1, n_2, \dots; m_1, m_2, \dots\rangle = \frac{1}{\sqrt{n_1!\,n_2!\dots m_1!\,m_2!\dots}}\,[a^\dagger_{s_1}(p_1)]^{n_1}[a^\dagger_{s_2}(p_2)]^{n_2}\dots[b^\dagger_{r_1}(q_1)]^{m_1}[b^\dagger_{r_2}(q_2)]^{m_2}\dots|0\rangle. \tag{4.51}$$

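The physical nature of these operators is clarified by the consideration of
the conserved quantities: the energy and momentum of the field, H and P,
and the conserved charge, Q. Substituting the expansion (4.47) into (4.38)
and (4.40) and using the orthogonality of the plane waves, we find (without
changing the order in which the operators appear in the various expressions):

$$\begin{aligned}
H &= \int d^3x\,\theta^{00} = \sum_{\mathbf{k}}\hbar\omega(\mathbf{k})\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b(\mathbf{k})b^\dagger(\mathbf{k})] \\
P^i &= \int d^3x\,\theta^{0i} = \sum_{\mathbf{k}}\hbar k^i\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b(\mathbf{k})b^\dagger(\mathbf{k})] \\
Q &= \int d^3x\,J^0 = \sum_{\mathbf{k}}[a^\dagger(\mathbf{k})a(\mathbf{k}) - b(\mathbf{k})b^\dagger(\mathbf{k})].
\end{aligned} \tag{4.52}$$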

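We can now reorder the operators so as always to have the destruction
operators on the right, finding:
$$\begin{aligned}
H &= \sum_{\mathbf{k}}\hbar\omega(\mathbf{k})\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b^\dagger(\mathbf{k})b(\mathbf{k})] + \text{constant} \\
P^i &= \sum_{\mathbf{k}}\hbar k^i\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b^\dagger(\mathbf{k})b(\mathbf{k})] \\
Q &= \sum_{\mathbf{k}}[a^\dagger(\mathbf{k})a(\mathbf{k}) - b^\dagger(\mathbf{k})b(\mathbf{k})].
\end{aligned} \tag{4.53}$$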
The ordered operators always give zero on the vacuum state and give the
occupation number of the corresponding oscillator, when applied to the states
of (4.51).
The (infinite) constant in the expression for the energy represents the
energy of the vacuum state, which is unobservable (while we remain within the
limits of special relativity). Measuring the energy starting from the energy of
the vacuum, the first equation of (4.53) shows that all the states have positive
energy.
The values of energy and momentum corresponding to the states $a^\dagger(\mathbf{k})|0\rangle$
are those of a relativistic particle with 4-momentum $p^\mu = \hbar\,(\omega(\mathbf{k}), \mathbf{k})$ and
$p_\mu p^\mu = (\hbar m)^2$. The set of states which we have found is that of an ideal
quantum gas formed of identical particles of two types (the particles created
by the operators a† and b†), with equal mass and charge ±1 respectively.
From the canonical quantisation rules (4.44) we see that the particles obey
Bose–Einstein statistics: the occupation numbers can take any value
$n_1, n_2, \dots, m_1, m_2, \dots = 0, 1, 2, \dots$
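As an illustration of (4.53), the eigenvalues of H, P and Q on an occupation-number state can be read off directly from the occupation numbers. The sketch below assumes units with ħ = 1, energies measured from the vacuum, and a hypothetical representation of a state as a dictionary mapping each momentum k to the pair (n_a, n_b); none of these names come from the text.

```python
import math

def omega(k, m):
    """Single-particle energy omega(k) = sqrt(|k|^2 + m^2), in units with hbar = 1."""
    return math.sqrt(sum(c * c for c in k) + m * m)

def observables(occupation, m):
    """Eigenvalues of the normal-ordered H, P, Q of (4.53) on a Fock state.

    occupation maps a momentum tuple k to (n_a, n_b), the numbers of
    particles created by a-dagger and b-dagger in that mode."""
    energy = sum((na + nb) * omega(k, m) for k, (na, nb) in occupation.items())
    momentum = tuple(
        sum((na + nb) * k[i] for k, (na, nb) in occupation.items())
        for i in range(3)
    )
    charge = sum(na - nb for na, nb in occupation.values())
    return energy, momentum, charge
```

Both particle types contribute positively to the energy and momentum, while they contribute with opposite signs to the charge.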

Normal Products. The quantum version of the classical Lagrangian and of the other
observables of the field (energy, momentum, etc.) suffers from an intrinsic
ambiguity, because we must convert products of classical quantities (which
commute) into products of linear operators (in general, non-commuting). We
can remove this ambiguity by defining, as we have just done, the quantum
operator products in a way which ensures that their vacuum expectation value
is equal to zero. When we impose this condition, we speak of normal products or
normal ordering.
To formalise this condition, we observe that the field operators are sums
of two components, characterised by the sign of the exponential in t in the
plane waves:

φ(+) (x) ∼ e−i(px) (conventionally, positive frequency)

φ(−) (x) ∼ e+i(px) (conventionally, negative frequency). (4.54)

The normal product of two operators is defined as what is obtained by putting
the positive frequency operators to the right in the expression and discarding
any commutator terms that arise. The normal product is commonly denoted
with the symbol N or, more simply, by enclosing the product between two colons
(: ··· :). For example:

$$\begin{aligned}
N(\phi(x)\phi^\dagger(y)) &= {:}\,\phi(x)\phi^\dagger(y)\,{:} \\
&= {:}\,(\phi^{(+)}(x) + \phi^{(-)}(x))((\phi^\dagger)^{(+)}(y) + (\phi^\dagger)^{(-)}(y))\,{:} \\
&= \phi^{(+)}(x)(\phi^\dagger)^{(+)}(y) + (\phi^\dagger)^{(-)}(y)\phi^{(+)}(x) \\
&\quad + \phi^{(-)}(x)(\phi^\dagger)^{(+)}(y) + \phi^{(-)}(x)(\phi^\dagger)^{(-)}(y).
\end{aligned} \tag{4.55}$$

In what follows, it will be understood that the Lagrangian, Hamiltonian

and other observables should be constructed of normal products of the ﬁeld.
Correspondingly, the expression for energy is given by (4.53) without the
inﬁnite constant.
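The difference between reordering with the commutation rules and taking the normal product can be made concrete for a single oscillator mode. The sketch below is illustrative only: it encodes a product of ladder operators as a string in which 'A' stands for a creation operator and 'a' for a destruction operator (a hypothetical encoding, with [a, a†] = 1), and the empty string stands for the identity.

```python
from collections import Counter

def wick_reorder(word):
    """Rewrite a product of single-mode ladder operators with all creation
    operators on the left, keeping the commutator terms from [a, a-dagger] = 1.
    Returns a dict {ordered word: integer coefficient}."""
    pending = Counter({word: 1})
    ordered = Counter()
    while pending:
        w, coeff = pending.popitem()
        i = w.find("aA")              # first destruction operator left of a creation one
        if i < 0:
            ordered[w] += coeff       # already in normal order
        else:
            pending[w[:i] + "Aa" + w[i + 2:]] += coeff   # swapped pair
            pending[w[:i] + w[i + 2:]] += coeff          # commutator term
    return dict(ordered)

def normal_product(word):
    """Normal product: same reordering, but all commutator terms are dropped."""
    return "A" * word.count("A") + "a" * word.count("a")
```

For the word "aA" the first function returns the ordered product plus the identity, which is the origin of the constant in (4.53), while the second returns only the ordered product.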

4.4 PROBLEMS FOR CHAPTER 4

Sect. 4.1
1. Using the Klein–Gordon equation, show that the quantity

$$\int d^3x\,[f_k\cdot(\partial_t\phi) - (\partial_t f_k)\cdot\phi]$$

where $f_k$ is given by eq. (4.12), is independent of time.


Sect. 4.3
1. Given the real scalar field φ(x),

– evaluate the commutator [φ(x), φ(y)] for $x^0 \neq y^0$;

– show that each of the two terms contributing to the commutator is
separately Lorentz invariant;
– show that the canonical commutation rule is recovered in the $x^0 \to y^0$ limit.

2. Find the momentum operator associated with the real scalar field φ,
defined as
$$P^i = \int d^3x\,T^{0i}$$

$T^{\mu\nu}$ being the energy-momentum tensor, and show that

$$[P^i, \phi] = i\,\frac{\partial\phi}{\partial x^i}.$$

3. Evaluate the vacuum expectation value

$$\langle 0|\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)|0\rangle.$$

4. Consider the transformation

$$x^\mu \to x'^\mu = e^{-\alpha}x^\mu, \qquad \phi(x) \to \phi'(x') = e^{\alpha}\phi(x),$$

which leaves the Lagrangian density of a free massless scalar field invariant.

– obtain the conserved charge

$$D = \int d^3x\,j^0(x),$$

where $j^\mu$ is the corresponding Noether current;

– evaluate the commutators

[D, φ(x)] , [D, π(x)]

where π(x) is the ﬁeld conjugate to φ(x).

5. Evaluate the commutator $[P^\mu, \phi(x)]$, where $P^\mu = \int d^3x\,T^{0\mu}$ and $T^{\mu\nu}$ is the
energy-momentum tensor.
CHAPTER 5

ELECTROMAGNETIC-
FIELD QUANTISATION

5.1 MAXWELL’S EQUATIONS IN COVARIANT FORM

The equations which describe the behaviour of electric and magnetic ﬁelds
in the presence of a given charge density, ρ(x, t), and current density, J (x, t),
are written in the following way:
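$$\nabla\cdot\mathbf{E} = \rho \tag{5.1}$$
$$\nabla\cdot\mathbf{B} = 0 \tag{5.2}$$
$$\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = \mathbf{J} \tag{5.3}$$
$$\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0. \tag{5.4}$$
These equations can immediately be written in a form that is covariant under
Lorentz transformations [3].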
We introduce the antisymmetric tensor F μν , connected to the electric and
magnetic ﬁelds by the equations:
F μν = −F νμ
F 0i = E i ; F 12 = B 3 , and cyclic permutations, (5.5)
and the 4-vector for the charge-current density:
$$j^\mu = \left(\rho, \frac{1}{c}\,\mathbf{j}\right). \tag{5.6}$$
The inhomogeneous Maxwell’s equations may then be written as:
∂ν F μν = j μ (5.7)
while the homogeneous equations can be expressed as:
∂μ Fνλ + ∂ν Fλμ + ∂λ Fμν = 0; (μ < λ < ν). (5.8)

52 DOI: 10.1201/9781003436263-5
This chapter has been made available under a CC BY NC license.

The homogeneous equations take a more symmetric form if we express the
three antisymmetric indices in terms of the Levi–Civita tensor and introduce
the dual tensor:
$$\tilde{F}^{\mu\nu} = \epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma} \tag{5.9}$$
The homogeneous equations (5.8) become:

$$\epsilon^{\mu\nu\rho\sigma}\partial_\nu F_{\rho\sigma} = \partial_\nu\tilde{F}^{\mu\nu} = 0. \tag{5.10}$$

Equation (5.10) expresses the fact that the dual tensor does not contain
sources, unlike F μν . Because the conversion from F to F̃ requires the inter-
change of electric and magnetic ﬁelds, equation (5.10) implies the absence of
magnetic monopoles, the magnetic analogue of electric charge.

Vector Potential. Equations (5.8) are constraints on the components of
$F_{\mu\nu}$. Consequently, not all six components of E and B are independent
variables.
There are four ways of choosing three different indices out of four possible
values, and this is the number of independent homogeneous equations.
In total, therefore, the electromagnetic field contains only 6 − 4 = 2 dynamic
variables. A first step in isolating the independent components consists in
observing that equations (5.8) are identically satisfied by the expression:

F μν = ∂ ν Aμ − ∂ μ Aν (5.11)

as is verified immediately from (5.10). Equation (5.11) defines a new field,
known as the vector potential. Explicitly, with $\Phi = A^0$ = scalar potential:
$$F^{0i} = \partial^i A^0 - \partial^0 A^i = -\partial_i\Phi - \frac{1}{c}\frac{\partial}{\partial t}A^i; \tag{5.12}$$
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{5.13}$$

If we use the components of the vector potential as dynamic variables,

Maxwell’s equations in a vacuum can be derived from an action principle,
starting from the Maxwell Lagrangian:
$$L_{e.m.} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = \frac{1}{2}(E^2 - B^2). \tag{5.14}$$
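The equality in (5.14) can be verified numerically by building $F^{\mu\nu}$ from given E and B with the assignments of (5.5). This is a minimal sketch under the assumption of a diagonal metric with signature (+, −, −, −); the function names are illustrative.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}, signature (+,-,-,-)

def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{0i} = E^i, F^{12} = B^3 and cyclic
    permutations, antisymmetric in its indices, as in (5.5)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][3], F[3][1] = B[2], B[0], B[1]
    F[2][1], F[3][2], F[1][3] = -B[2], -B[0], -B[1]
    return F

def lagrangian_density(F):
    """-(1/4) F_{mu nu} F^{mu nu}; with a diagonal metric, lowering both
    indices multiplies each component by g_{mu mu} * g_{nu nu}."""
    return -0.25 * sum(
        METRIC[m] * METRIC[n] * F[m][n] ** 2 for m in range(4) for n in range(4)
    )
```

The electric components enter with one time index and one space index and therefore change sign when the indices are lowered, which is what produces the relative sign between E² and B².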

Gauge Invariance. The components of $A^\mu$ are four variables, so they still
carry some redundancy. If we carry out a gauge transformation

$$A^\mu \to A'^\mu = A^\mu + \partial^\mu f \tag{5.15}$$

the new vector potential gives rise to the same observables E and B, for any
function f.
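The invariance of $F^{\mu\nu}$ under (5.15) can be checked numerically with finite differences. In the sketch below the potential A and the gauge function f are arbitrary smooth functions chosen only for illustration, derivatives are central differences with a hypothetical step h, and the metric is taken with signature (+, −, −, −).

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g^{mu nu}

def d_upper(func, x, mu, h=1e-3):
    """Central-difference estimate of d^mu func = g^{mu mu} * d func / d x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return METRIC[mu] * (func(xp) - func(xm)) / (2 * h)

def field_tensor(A, x):
    """F^{mu nu} = d^nu A^mu - d^mu A^nu, as in (5.11); A is a list of four functions."""
    return [[d_upper(A[m], x, n) - d_upper(A[n], x, m) for n in range(4)]
            for m in range(4)]

# Illustrative potential A^mu(x) and gauge function f(x) (not from the text).
A = [lambda x: x[1] * x[2],
     lambda x: math.sin(x[0]) + x[3],
     lambda x: x[0] * x[1],
     lambda x: math.cos(x[2])]

def f(x):
    return math.sin(x[0]) * x[1] * x[2] + x[3] ** 2

# Gauge-transformed potential A'^mu = A^mu + d^mu f.
A_prime = [lambda x, m=m: A[m](x) + d_upper(f, x, m) for m in range(4)]

def max_gauge_deviation(x):
    """Largest component of F'^{mu nu} - F^{mu nu} at the point x."""
    F, Fp = field_tensor(A, x), field_tensor(A_prime, x)
    return max(abs(F[m][n] - Fp[m][n]) for m in range(4) for n in range(4))
```

The deviation vanishes up to rounding because the extra terms are mixed second derivatives of f, which cancel in the antisymmetric combination.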

It is possible and natural to use the arbitrariness in the definition of $A^\mu$ to
impose a covariant condition on $A^\mu$. A condition often used is the Lorenz
gauge condition¹:
$$\partial_\mu A^\mu = 0. \tag{5.16}$$
It is easy to see that this condition can always be imposed. If we start
from a given $A^\mu$ which does not satisfy (5.16), we can obtain an equivalent
$A'^\mu$ which satisfies it by solving the equation:

$$0 = \partial_\mu A'^\mu = \partial_\mu A^\mu(x) + \Box f(x) \tag{5.17}$$

which has f(x) as unknown. We can explicitly give a particular solution of
this equation in terms of the inverse of the operator $\Box$. However, the solution
is not unique, since the corresponding homogeneous equation:

$$\Box f = 0 \tag{5.18}$$

permits non-trivial solutions (as we have seen already in the case of the
Klein–Gordon equation). This is in accord with a count of the degrees of
freedom. Taking account of (5.16) we ﬁnd three degrees of freedom for Aμ ,
but the preceding count says that there should be two.
Unfortunately, the ﬁnal condition cannot generally be given in covariant
form. This is the origin of numerous problems, which will be confronted and
resolved in what follows.

5.2 GREEN’S FUNCTIONS OF THE ELECTROMAGNETIC

FIELD
In terms of the vector potential, Maxwell’s inhomogeneous equations, (5.7),
are written:

$$\partial_\nu F^{\mu\nu} = \Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = J^\mu$$

which, if $A^\mu$ satisfies the Lorenz condition (5.16), reduces to the wave equation:

$$\partial_\nu\partial^\nu A^\mu = J^\mu \tag{5.19}$$

with the supplementary condition that the current should be conserved:

$$\partial_\mu J^\mu = 0 \;\to\; k_\mu\tilde{J}^\mu(k) = 0. \tag{5.20}$$

To characterise the solutions of (5.19), we can use the results of the
preceding section, in the limit of zero mass, $\mu^2 \to 0$. As we have seen, the Green's
functions of the Klein–Gordon equation contain singularities in the Fourier
transform, localised at the points $k^0 = \pm\sqrt{\mu^2 + |\mathbf{k}|^2}$. To discuss the
electromagnetic field, it is desirable to begin with a fictitious mass, λ, which is small
1 Proposed in 1867 by the Danish mathematical physicist, Ludvig Valentin Lorenz, not
to be confused with H. A. Lorentz of the homonymous transformations.

but non-zero, to prevent singularities from coalescing at the same point when
k → 0. The limit λ → 0 can be taken at the end of the calculations.
From (5.19), we see that the Green's function for the vector potential
satisfies the equation:
$$-\Box G^{\mu\nu}(x) = -g^{\mu\nu}\,\delta^{(4)}(x) \tag{5.21}$$
or, after the Fourier transform:
$$k^2\,\tilde{G}^{\mu\nu}(k) = -g^{\mu\nu}. \tag{5.22}$$
The solution of (5.19) is therefore written (in the Lorenz gauge, with Feynman
boundary conditions):

$$A^\mu(x) = \lim_{\lambda\to 0}\int\frac{d^4k}{(2\pi)^4}\,\frac{-g^{\mu\nu}}{k^2 - \lambda^2 + i\epsilon}\,\tilde{J}_\nu(k)\,e^{-i(k\cdot x)} + A_0^\mu(x) \tag{5.23}$$
where $A_0^\mu$ satisfies the free wave equation.
As noted in the previous paragraph, the Lorenz condition does not com-
pletely determine the gauge and we can impose a further condition. To con-
tinue, we must identify an appropriate basis on which to project the four
components of Aμ .
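We fix the 4-vector $k^\mu = (\omega(\mathbf{k}), \mathbf{k})$ and, correspondingly, a space-time
coordinate system identified by the following four vectors:
$$\begin{aligned}
\epsilon_{1,2}^\mu &= (0, \boldsymbol{\epsilon}_{1,2}); \qquad \mathbf{k}\cdot\boldsymbol{\epsilon}_{1,2} = 0; \\
\epsilon_3^\mu &= \left(0, \frac{\mathbf{k}}{|\mathbf{k}|}\right); \\
\epsilon_0^\mu &= \eta^\mu = (1, \mathbf{0}).
\end{aligned} \tag{5.24}$$
The normalisation conditions are:
$$\epsilon_\alpha^\mu\,\epsilon_\beta^\nu\,g_{\mu\nu} = g_{\alpha\beta} \tag{5.25}$$
and the completeness conditions are:
$$\sum_{i=1,2}\epsilon_i^m\epsilon_i^n = \left(\delta^{mn} - \frac{k^m k^n}{|\mathbf{k}|^2}\right); \qquad \sum_\alpha g_{\alpha\alpha}\,\epsilon_\alpha^\mu\epsilon_\alpha^\nu = g^{\mu\nu}. \tag{5.26}$$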
These 4-vectors form a basis in which any other vector can be expanded.
Using the second completeness equation from (5.26), we rewrite the Feynman
Green’s function in (5.23) in the following way (with the limit of zero mass
understood):
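$$\frac{-g^{\mu\nu}}{k^2 + i\epsilon} = \frac{-\left(\sum_\alpha g_{\alpha\alpha}\,\epsilon_\alpha^\mu\epsilon_\alpha^\nu\right)}{k^2 + i\epsilon} = \frac{\sum_{i=1,2}\epsilon_i^\mu\epsilon_i^\nu}{k^2 + i\epsilon} + \frac{\epsilon_3^\mu\epsilon_3^\nu - \eta^\mu\eta^\nu}{k^2 + i\epsilon}. \tag{5.27}$$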
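The completeness relations (5.26), on which the decomposition (5.27) rests, can be checked for any explicit choice of the basis (5.24). The sketch below constructs one such basis for a given spatial momentum; the way the two transverse vectors are picked (via a cross product with a coordinate axis) is an arbitrary illustrative choice, as are the function names.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal metric, signature (+,-,-,-)

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def polarisation_basis(k):
    """Return [eps_0, eps_1, eps_2, eps_3] as 4-vectors for a non-zero spatial k."""
    norm = math.sqrt(sum(c * c for c in k))
    n = [c / norm for c in k]
    # Helper axis: the coordinate axis least aligned with k, so the cross product is non-zero.
    j = min(range(3), key=lambda i: abs(n[i]))
    helper = [1.0 if i == j else 0.0 for i in range(3)]
    e1 = cross(n, helper)
    e1_norm = math.sqrt(sum(c * c for c in e1))
    e1 = [c / e1_norm for c in e1]
    e2 = cross(n, e1)  # unit vector orthogonal to both n and e1
    return [[1.0, 0.0, 0.0, 0.0], [0.0] + e1, [0.0] + e2, [0.0] + n]

def completeness_sum(basis):
    """sum_alpha g_{alpha alpha} eps_alpha^mu eps_alpha^nu, expected to equal g^{mu nu}."""
    return [[sum(METRIC[a] * basis[a][m] * basis[a][n] for a in range(4))
             for n in range(4)] for m in range(4)]

def transverse_sum(basis):
    """sum over i = 1, 2 of eps_i^m eps_i^n, expected to equal delta^{mn} - k^m k^n / |k|^2."""
    return [[sum(basis[i][m + 1] * basis[i][n + 1] for i in (1, 2))
             for n in range(3)] for m in range(3)]
```

Any rotation of the two transverse vectors around k gives the same sums, which is why only the direction of k appears on the right-hand side of (5.26).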

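Keeping in mind condition (5.20), we eliminate $\epsilon_3$ in favour of $k^\mu$ and $\eta^\mu$:

$$\epsilon_3^\mu = \frac{k^\mu}{|\mathbf{k}|} - \frac{\omega}{|\mathbf{k}|}\,\eta^\mu \tag{5.28}$$

and consequently:

$$\epsilon_3^\mu\epsilon_3^\nu - \eta^\mu\eta^\nu = \frac{k^\mu k^\nu}{|\mathbf{k}|^2} - \frac{\omega}{|\mathbf{k}|^2}\,(k^\mu\eta^\nu + \eta^\mu k^\nu) - \left(1 - \frac{\omega^2}{|\mathbf{k}|^2}\right)\eta^\mu\eta^\nu. \tag{5.29}$$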

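Finally, we can rewrite the integrand in (5.23):

$$\begin{aligned}
\frac{-g^{\mu\nu}}{k^2 + i\epsilon}\,\tilde{J}_\nu(k)\,e^{-i(k\cdot x)} &= \frac{\sum_{i=1,2}\epsilon_i^\mu\epsilon_i^\nu}{k^2 + i\epsilon}\,\tilde{J}_\nu(k)\,e^{-i(k\cdot x)} \\
&\quad + \frac{1}{k^2 + i\epsilon}\left[\frac{k^\mu k^\nu}{|\mathbf{k}|^2} - \frac{\omega}{|\mathbf{k}|^2}\,(k^\mu\eta^\nu + \eta^\mu k^\nu)\right]\tilde{J}_\nu(k)\,e^{-i(k\cdot x)} \\
&\quad + \frac{1}{|\mathbf{k}|^2}\,\eta^\mu\eta^\nu\,\tilde{J}_\nu(k)\,e^{-i(k\cdot x)}.
\end{aligned} \tag{5.30}$$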

We now analyse the diﬀerent terms:

• The term in the first line corresponds to the waves generated by the
current, which are transverse with respect to the direction of propagation
and represent the two degrees of freedom present in the field.

• In the second line, after integration, the terms proportional to k μ con-

tribute with terms of the type ∂ μ f , which can be eliminated with a fur-
ther (last) gauge transformation; the terms proportional to kν J˜ν vanish
because of the conservation of the current; in total we can ignore the
second line.
• In the third line, the Feynman propagator has been replaced by the
Fourier transform of the Coulomb ﬁeld; this term represents the electro-
static potential generated by the charge density in J μ .

Explicitly:

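$$A^i(x) = \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 + i\epsilon}\left(\delta^{ij} - \frac{k^i k^j}{|\mathbf{k}|^2}\right)\tilde{J}^j(k)\,e^{-i(k\cdot x)};$$

$$A^0(x) = \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{|\mathbf{k}|^2}\,\tilde{\rho}(k)\,e^{-i(k\cdot x)}. \tag{5.31}$$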

We note that:

∇ · A(x) = 0;
−∇ · ∇A0 (x) = ρ(x)

In general, we can ﬁx the gauge so that the electric ﬁeld is divided into
transverse and longitudinal parts:
E = EL + ET ;
∇ · ET = 0; ∇ · EL = ρ. (5.32)
From the preceding results we obtain explicitly:

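$$\mathbf{E}_L(\mathbf{x},t) = -\nabla A^0 = -\nabla\,\frac{1}{4\pi}\int d^3y\,\frac{1}{|\mathbf{x}-\mathbf{y}|}\,\rho(\mathbf{y},t);$$

$$\mathbf{E}_T = -\frac{\partial}{\partial t}\mathbf{A} = -\frac{\partial}{\partial t}\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 + i\epsilon}\,e^{-ikx}\left[\tilde{\mathbf{J}} - \frac{\mathbf{k}\,(\mathbf{k}\cdot\tilde{\mathbf{J}})}{|\mathbf{k}|^2}\right]. \tag{5.33}$$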

5.3 THE MAXWELL–LORENTZ EQUATIONS

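We consider the case in which the electromagnetic field is coupled to a
pointlike particle with charge q. For the electron, q = −e, where e is the
elementary electric charge:
$$e = +1.60217653(14)\times 10^{-19}\ \mathrm{C}. \tag{5.34}$$
For a pointlike particle:
$$j^\mu = \left(\rho, \frac{1}{c}\,\mathbf{j}\right); \qquad \rho = q\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]; \qquad \mathbf{j} = q\,\mathbf{v}\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]. \tag{5.35}$$
We therefore have:
$$\frac{1}{c}\frac{\partial}{\partial t}\rho = -\frac{q}{c}\,\frac{d\mathbf{x}(t)}{dt}\cdot\nabla\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]; \qquad \frac{1}{c}\,\nabla\cdot\mathbf{j} = \frac{q}{c}\,\mathbf{v}(t)\cdot\nabla\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]. \tag{5.36}$$
As a consequence, $j^\mu$ satisfies the continuity equation:
$$\frac{1}{c}\frac{\partial}{\partial t}\rho + \frac{1}{c}\,\nabla\cdot\mathbf{j} = 0. \tag{5.37}$$
The extension to the case of multiple charges is straightforward.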
The overall action of the field+charge system is obtained from the
equations of section 5.1, specifying the current according to (5.35) and adding the
action due to the charge. Combining them, we find the Lagrangian density:

$$\mathcal{L}(x) = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]\left(-mc^2\sqrt{1 - \frac{v^2}{c^2}}\right) - j_\mu A^\mu(x). \tag{5.38}$$
From this result, the field equations are obtained as before, in the form:
$$\partial_\nu F^{\mu\nu} = j^\mu = q\,\frac{1}{\gamma}\,u^\mu\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)] \tag{5.39}$$
where $u^\mu$ is the 4-velocity and, as usual, $\gamma = 1/\sqrt{1 - v^2/c^2}$.
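To derive the equations of motion of the charge, we calculate the conjugate
momentum (where $L = \int d^3x\,\mathcal{L}$):

$$\frac{\partial L}{\partial\mathbf{v}} = m\gamma\mathbf{v} + q\mathbf{A}[\mathbf{x}(t), t] \tag{5.40}$$
from which:
$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{d}{dt}(m\gamma\mathbf{v}) + q\,\frac{\partial\mathbf{A}}{\partial t} + q\,\frac{1}{c}\,(\mathbf{v}\cdot\nabla)\mathbf{A} \tag{5.41}$$
and the generalised force:
$$\frac{\partial L}{\partial\mathbf{x}} = -q\,[\nabla A^0 - \Sigma_i\,v^i\,\nabla A^i]. \tag{5.42}$$
The equations of motion are therefore:
$$\begin{aligned}
\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} &= \frac{\partial L}{\partial\mathbf{x}}; \\
\frac{d}{dt}(m\gamma\mathbf{v}) = \frac{d}{dt}(\mathbf{p}) &= -q\,\frac{\partial\mathbf{A}}{\partial t} - q\,\nabla A^0 + q\,\frac{1}{c}\,[-(\mathbf{v}\cdot\nabla)\mathbf{A} + \Sigma_i\,v^i(\nabla A^i)] \\
&= q\mathbf{E} + q\,\frac{1}{c}\,\mathbf{v}\times(\nabla\times\mathbf{A}) = q\mathbf{E} + q\,\frac{1}{c}\,\mathbf{v}\times\mathbf{B}.
\end{aligned} \tag{5.43}$$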
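These are the equations for the spatial components of the 4-momentum, p,
of the charge. The equation for the time component is obtained by taking the
scalar product of the previous equation with p and using the relations:

$$p^0\,dp^0 = \mathbf{p}\cdot d\mathbf{p}; \qquad \mathbf{v}/c = \boldsymbol{\beta} = \mathbf{p}/p^0.$$

We therefore find:
$$\frac{d\mathcal{E}}{dt} = c\,\frac{dp^0}{dt} = q\,\mathbf{v}\cdot\mathbf{E}. \tag{5.44}$$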
dt dt
Equation (5.44) expresses the conservation of energy: the energy acquired
by the particle in a unit of time is equal to the power provided by the electric
ﬁeld (the Lorentz force, the second term in the second line of (5.43), does not
perform work).
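The statement that only the electric field performs work can be illustrated by integrating (5.43) numerically and comparing the energy gained with $q\int\mathbf{v}\cdot\mathbf{E}\,dt$. The sketch below assumes uniform, static fields (so that the work equals $q\,\mathbf{E}\cdot\Delta\mathbf{x}$), units with c = 1 by default, and a standard fourth-order Runge–Kutta step; the function names and parameter values are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def energy(p, m, c=1.0):
    """Relativistic energy sqrt((m c^2)^2 + c^2 |p|^2), with p = m * gamma * v."""
    return math.sqrt((m * c * c) ** 2 + c * c * sum(pi * pi for pi in p))

def derivs(state, q, m, E, B, c=1.0):
    """Time derivative of (x, p): dx/dt = v, dp/dt = q E + (q/c) v x B, as in (5.43)."""
    p = state[3:]
    en = energy(p, m, c)
    v = tuple(c * c * pi / en for pi in p)
    vxb = cross(v, B)
    force = tuple(q * (E[i] + vxb[i] / c) for i in range(3))
    return v + force

def rk4_step(state, dt, q, m, E, B):
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = derivs(state, q, m, E, B)
    k2 = derivs(shifted(k1, dt / 2), q, m, E, B)
    k3 = derivs(shifted(k2, dt / 2), q, m, E, B)
    k4 = derivs(shifted(k3, dt), q, m, E, B)
    return tuple(s + dt * (a + 2 * b + 2 * c3 + d) / 6
                 for s, a, b, c3, d in zip(state, k1, k2, k3, k4))

def energy_gain_and_work(state, q, m, E, B, dt=1e-3, steps=2000):
    """Integrate the motion and return (energy gained, q E . displacement)."""
    start = state
    for _ in range(steps):
        state = rk4_step(state, dt, q, m, E, B)
    gain = energy(state[3:], m) - energy(start[3:], m)
    work = q * sum(E[i] * (state[i] - start[i]) for i in range(3))
    return gain, work
```

With E set to zero the energy stays constant however strong B is, while with a non-zero E the gain matches the work done by the electric force alone.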
The preceding equations can be put in covariant form, noting that:
$$\mathbf{v}\cdot\mathbf{E} = \frac{1}{\gamma}\,F^{0\mu}u_\mu$$
$$\frac{dp^1}{dt} = qF^{01} + \frac{q}{c}\,(v^2F^{12} - v^3F^{31}) = \frac{q}{c\gamma}\,F^{1\mu}u_\mu.$$

We therefore obtain:
$$\frac{dp^\mu}{dt} = \frac{q}{c}\,\frac{1}{\gamma}\,F^{\mu\nu}u_\nu; \quad\text{or}$$
$$\frac{dp^\mu}{d\tau} = \frac{q}{c}\,F^{\mu\nu}u_\nu \tag{5.45}$$
in terms of the proper time. In the case of multiple particles, we obtain an
equation of the type (5.45) for each particle, while in equations (5.39) we
should include the contribution of each particle to the current.

With the boundary conditions that there should be no external fields and
that the fields should vanish at infinity, the Maxwell–Lorentz equations (5.39, 5.45)
describe the time evolution of a system of charged particles, each under the
action of the field generated by itself and by the other particles.
Assuming that the other forces are negligible, this system of equations
describes the behaviour of matter in terms of its elementary constituents. The
Maxwell–Lorentz equations are the first example of a Theory of Everything, a
theory which would describe all the phenomena found in Nature.
The systematic description of the properties of matter based on equations
(5.45) was dealt with by Lorentz [4] at the start of the twentieth century, and
represented a fundamental step forward in the understanding of the structure
of matter.
For the behaviour of matter on the laboratory scale, the hypothesis that
electromagnetic forces should dominate is completely adequate. On astronom-
ical scales, it is necessary to take into account gravitational forces, which can
be inserted into the picture by extending the principle of special relativity to
Einstein’s general relativity. The Maxwell–Lorentz–Einstein equations give an
accurate description of phenomena on macroscopic scales, not yet superseded.
On a microscopic scale, at atomic and subatomic dimensions ($10^{-8}$ cm),
the Maxwell–Lorentz picture must be replaced by quantum electrodynamics
(QED). It is a very noteworthy fact that the Maxwell–Lorentz equations,
once converted to the framework of quantum field theory, maintain their form
essentially unaltered and are capable of describing the properties of condensed
matter and of atoms extraordinarily accurately.
At nuclear and subnuclear levels (below $10^{-13}$ cm) other forces enter into
play: the nuclear forces, described for the first time in a covariant and quantum
manner by Yukawa in the mid-1930s, and the weak interactions,
identified by Fermi in the early 1930s as responsible for the decay
of the neutron and β radioactivity.

Conservation of Energy and Momentum. Starting from the Maxwell–Lorentz
Lagrangian, we can construct the conserved quantities corresponding
to the energy and total momentum of a charged field.
It is natural to begin with the case of the electromagnetic field in the
absence of charges (cf. Section 5.1):

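$$L_{e.m.} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu}. \tag{5.46}$$
The canonical energy-momentum tensor is given by:
$$T^{\mu,\nu}_{e.m.} = \frac{\partial L_{e.m.}}{\partial(\partial_\mu A_\beta)}\,\partial^\nu A_\beta - g^{\mu\nu}L_{e.m.} = F^{\mu\beta}\,\partial^\nu A_\beta + \frac{g^{\mu\nu}}{4}\,F_{\alpha\beta}F^{\alpha\beta}. \tag{5.47}$$
As well as $T^{\mu,\nu}_{e.m.}$ we consider the symmetric tensor, $\Theta^{\mu\nu}_{e.m.}$, obtained with the
Belinfante–Rosenfeld procedure. $\Theta^{\mu\nu}_{e.m.}$ is obtained most simply by adding
to (5.47) the conserved tensor:

$$S^{\mu,\nu} = -\partial_\beta(F^{\mu\beta}A^\nu) = -F^{\mu\beta}(\partial_\beta A^\nu)$$

since, in the absence of charges, $\partial_\beta F^{\mu\beta} = 0$. We note that:

$$\partial_\mu S^{\mu,\nu} = 0; \qquad \int d^3x\,S^{0\nu} = 0.$$

In conclusion:
$$\Theta^{\mu\nu}_{e.m.} = -g_{\rho\sigma}F^{\rho\mu}F^{\sigma\nu} + \frac{g^{\mu\nu}}{4}\,F^{\alpha\beta}F_{\alpha\beta}. \tag{5.48}$$
The explicit forms of the energy and momentum densities obtained from
(5.48) are:
$$\mathcal{E}_{e.m.} = \Theta^{00}_{e.m.} = \frac{1}{2}\,(\mathbf{E}\cdot\mathbf{E} + \mathbf{B}\cdot\mathbf{B}); \qquad \mathcal{P}^i_{e.m.} = \Theta^{0i}_{e.m.} = (\mathbf{E}\times\mathbf{B})^i. \tag{5.49}$$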
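The densities (5.49) can be recovered numerically from (5.48) for arbitrary values of E and B. The following sketch assumes the component assignments of (5.5) and a diagonal metric with signature (+, −, −, −); the function names are illustrative.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal metric, signature (+,-,-,-)

def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{0i} = E^i, F^{12} = B^3 and cyclic permutations."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][3], F[3][1] = B[2], B[0], B[1]
    F[2][1], F[3][2], F[1][3] = -B[2], -B[0], -B[1]
    return F

def stress_tensor(F):
    """Theta^{mu nu} = -g_{rho sigma} F^{rho mu} F^{sigma nu} + (g^{mu nu}/4) F^{ab} F_{ab}."""
    ff = sum(METRIC[a] * METRIC[b] * F[a][b] ** 2 for a in range(4) for b in range(4))
    theta = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            contraction = sum(METRIC[r] * F[r][m] * F[r][n] for r in range(4))
            theta[m][n] = -contraction + (METRIC[m] if m == n else 0.0) * ff / 4
    return theta
```

The component with two time indices reproduces the energy density, and the mixed time-space components reproduce the Poynting vector; the tensor is also symmetric, unlike the canonical one.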

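We now consider the complete Lagrangian density, (5.38), which we rewrite
for convenience:

$$\begin{aligned}
\mathcal{L} &= \mathcal{L}_{e.m.} + \mathcal{L}_q + \mathcal{L}_{int}; \\
\mathcal{L}_q &= -mc^2\,\frac{1}{\gamma}\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]; \\
\mathcal{L}_{int} &= -j_\mu A^\mu = -q\,\frac{1}{\gamma}\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)]\,u_\mu A^\mu.
\end{aligned}$$
The overall energy-momentum tensor is obtained by differentiating with
respect to the degrees of freedom of the field and of the particle. We restrict
ourselves to consideration of the spatial integral of the time components
(with $L_q + L_{int} = \int d^3x\,(\mathcal{L}_q + \mathcal{L}_{int})$):

$$\begin{aligned}
E^{tot} &= \int d^3x\,T^{0,0}_{tot} \\
&= \int d^3x\left[\frac{\partial\mathcal{L}}{\partial(\partial_0 A_\beta)}\,\partial^0 A_\beta - g^{00}\mathcal{L}_{e.m.}\right] + \frac{\partial(L_q + L_{int})}{\partial\mathbf{v}}\cdot\mathbf{v} - (L_q + L_{int}) \\
&= \int d^3x\,T^{0,0}_{e.m.} + mc^2\gamma + qA^0[\mathbf{x}(t), t]
\end{aligned} \tag{5.50}$$

$$\begin{aligned}
p^i_{tot} &= \int d^3x\,T^{0i}_{e.m.} + \frac{\partial(L_q + L_{int})}{\partial v^i} \\
&= \int d^3x\,T^{0i}_{e.m.} + m\gamma v^i + qA^i[\mathbf{x}(t), t].
\end{aligned} \tag{5.51}$$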
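The tensor $T^{\mu,\nu}_{e.m.}$ is no longer conserved. Using the equation of motion for
$F^{\mu\nu}$, (5.39), we find:

$$\begin{aligned}
\frac{dp^\nu_{e.m.}}{dt} &= \int d^3x\,\partial_0 T^{0,\nu}_{e.m.} = \int d^3x\,\partial_\mu T^{\mu,\nu}_{e.m.} \\
&= -\int d^3x\,j_\beta\,\partial^\nu A^\beta + \int d^3x\,\frac{1}{2}\,F_{\mu\sigma}\,(\partial^\mu F^{\sigma\nu} + \partial^\sigma F^{\nu\mu} + \partial^\nu F^{\mu\sigma}).
\end{aligned}$$
In the parentheses of the second term $\sigma \neq \mu$. If ν also differs from both σ and μ,
the bracketed term is zero by the homogeneous Maxwell equations, (5.8). If
instead ν = σ or ν = μ, it vanishes because of the antisymmetry of $F^{\mu\nu}$. In any
case, therefore:

$$\frac{dE^{e.m.}}{dt} = -\int d^3x\,j_\beta\,\partial^0 A^\beta = -q\,\partial_0 A^0 + q\,\mathbf{v}\cdot(\partial_0\mathbf{A}); \qquad \frac{dp^i_{e.m.}}{dt} = q\,\mathbf{v}\cdot(\partial_i\mathbf{A})$$
from which:
$$\begin{aligned}
\frac{dE^{tot}}{dt} &= \frac{dE^{e.m.}}{dt} + \frac{dE^{q}}{dt} \\
&= -q\,\partial_0 A^0 + q\,\mathbf{v}\cdot(\partial_0\mathbf{A}) + \frac{d(mc^2\gamma)}{dt} + q\,\partial_0 A^0 + q\,\mathbf{v}\cdot\nabla A^0 \\
&= -q\,\mathbf{v}\cdot\mathbf{E} + \frac{d(mc^2\gamma)}{dt} = 0
\end{aligned} \tag{5.52}$$
as a consequence of the equation of motion of the particle, (5.44). Similarly,
from the equation of motion, (5.43), we find:
$$\begin{aligned}
\frac{dp^i_{tot}}{dt} &= \frac{dp^i_{e.m.}}{dt} + \frac{dp^i_{q}}{dt} \\
&= q\,\mathbf{v}\cdot(\partial_i\mathbf{A}) + \frac{d(m\gamma v^i)}{dt} + q\,\partial_0 A^i + q\,\mathbf{v}\cdot\nabla A^i \\
&= \frac{d(m\gamma v^i)}{dt} - qE^i - q\,(\mathbf{v}\times\mathbf{B})^i = 0.
\end{aligned} \tag{5.53}$$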

The evolution of the system in time consists of the continuous exchange of

energy and momentum between field and particle, maintaining constant the
values of the energy and the total momentum.
However, the definition of the energy and momentum associated with the
particle or with the field is not unique at a given instant, since the quantities
are not individually constants of the motion. A particularly simple description
is obtained by eliminating $T^{\mu,\nu}_{e.m.}$ in favour of $\Theta^{\mu\nu}_{e.m.}$. Comparing (5.47) and
(5.48) leads to:

$$T^{\mu,\nu} = \Theta^{\mu\nu} + F^{\mu\beta}\,\partial_\beta A^\nu = \Theta^{\mu\nu} + \partial_\beta(F^{\mu\beta}A^\nu) - j^\mu A^\nu.$$

The total derivative can be omitted and we can combine the last term
with the expression for the energy-momentum of the particle. In this way we
obtain:

$$E^{tot} = \frac{1}{2}\int d^3x\,(E^2 + B^2) + \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}; \qquad p^i_{tot} = \int d^3x\,(\mathbf{E}\times\mathbf{B})^i + \frac{mv^i}{\sqrt{1 - \frac{v^2}{c^2}}}. \tag{5.54}$$

In the expressions of (5.54) apparently, only the energy of the free particle
appears. In the case of more than one particle, it can be asked where the
energy from the electrostatic interaction between the particles has gone. The
answer is that this energy is absorbed into the first term, as can be seen in the
following way. We separate the electric field into longitudinal and transverse
components, according to (5.32). We find:

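$$\begin{aligned}
\frac{1}{2}\int d^3x\,E^2 &= \frac{1}{2}\int d^3x\,(E_L^2 + E_T^2) \\
&= \frac{1}{2}\int d^3x\,E_T^2 + \frac{1}{2}\int d^3x\,(\nabla\Phi)\cdot(\nabla\Phi) \\
&= \frac{1}{2}\int d^3x\,E_T^2 - \frac{1}{2}\int d^3x\,\Phi\,(\nabla\cdot\nabla\Phi) \\
&= \frac{1}{2}\int d^3x\,E_T^2 + \frac{1}{2}\int d^3x\,\Phi\rho
\end{aligned} \tag{5.55}$$
where Φ is the electrostatic potential. The second term in the final expression
is the electrostatic energy which, for a system of pointlike particles, leads to
the expression:
$$V_{Coul} = \frac{1}{2}\,\Sigma_{ij}\,\frac{1}{4\pi}\,\frac{q_i q_j}{|\mathbf{x}_i(t) - \mathbf{x}_j(t)|}. \tag{5.56}$$
We find:

$$E_{tot} = \frac{1}{2}\int d^3x\,(E_T^2 + B^2) + \frac{1}{2}\,\Sigma_{ij}\,\frac{1}{4\pi}\,\frac{q_i q_j}{|\mathbf{x}_i - \mathbf{x}_j|} + \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}. \tag{5.57}$$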

The first term contains only the degrees of freedom of the radiation field,
while the second and third contain the degrees of freedom of the particles.
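As a small worked example of (5.56), the Coulomb energy of a set of point charges can be evaluated directly. The sketch below uses the Heaviside–Lorentz units of the text (so that ∇·E = ρ and the potential of a charge is q/4πr) and assumes that the double sum runs over i ≠ j only, since the i = j terms are the divergent self-energies of point charges; the function name is illustrative.

```python
import math

def coulomb_energy(charges, positions):
    """(1/2) * sum over i != j of q_i q_j / (4 pi |x_i - x_j|), as in (5.56).

    Self-energy terms (i == j) are excluded by assumption."""
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            if i != j:
                r = math.dist(positions[i], positions[j])
                total += charges[i] * charges[j] / (4 * math.pi * r)
    return 0.5 * total
```

The factor 1/2 compensates for each pair being counted twice in the double sum.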
Starting from (5.54) the unintegrated energy-momentum tensor can finally
be derived, in the form:

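$$\Theta^{\mu\nu} = \Theta^{\mu\nu}_{e.m.} + \theta^{\mu\nu}; \qquad \theta^{\mu\nu} = mc^2\,\frac{1}{\gamma}\,u^\mu u^\nu\,\delta^{(3)}[\mathbf{x}-\mathbf{x}(t)].$$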

Comment. For a pointlike particle we have introduced the Lagrangian
density, current and energy-momentum in the form:
$$\begin{aligned}
\mathcal{L} &= -mc^2\,\frac{1}{\gamma}\,\delta[\mathbf{x}-\mathbf{x}(t)]; \\
j^\mu &= q\,\frac{1}{\gamma}\,u^\mu\,\delta[\mathbf{x}-\mathbf{x}(t)]; \\
\theta^{\mu\nu} &= mc^2\,\frac{1}{\gamma}\,u^\mu u^\nu\,\delta[\mathbf{x}-\mathbf{x}(t)].
\end{aligned}$$

It is interesting to observe that the factor 1/γ is essential to compensate
the non-covariance of the δ-function and produce the appropriate covariant
quantities, as seen from the following argument. We multiply $j^\mu$ by an
explicitly covariant 4-vector field $A_\mu$ and by a finite but small 4-volume around (x(t), t):

$$\Delta(jA) = \Delta^3x\,\Delta t\,(j_\mu A^\mu) = q\,\frac{\Delta t}{\gamma}\,u_\mu A^\mu(\mathbf{x}(t), t) = q\,\Delta\tau\,[u_\mu A^\mu(\mathbf{x}(\tau), \tau)] \tag{5.58}$$
where τ is the proper time of the particle. We have obtained an invariant
result for any 4-vector $A^\mu$, therefore $j^\mu$ must transform like a 4-vector. Similar
reasoning holds for the other densities.

5.4 HAMILTON FORMALISM AND MINIMAL SUBSTITUTION

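The conjugate momentum of the particle was already calculated in (5.40):

$$\mathbf{p} = m\gamma\mathbf{v} + q\mathbf{A}[\mathbf{x}(t), t]. \tag{5.59}$$

From here we should derive v as a function of p. We find:
$$\mathbf{v} = \frac{c\,\boldsymbol{\pi}}{\sqrt{|\boldsymbol{\pi}|^2 + (mc)^2}}; \qquad \boldsymbol{\pi} = \mathbf{p} - q\mathbf{A}.$$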

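The charged particle Hamiltonian is obtained from:

$$\begin{aligned}
H^p &= \mathbf{p}\cdot\mathbf{v} - (L_p + L_{int}) \\
&= \mathbf{p}\cdot\mathbf{v} + qA^0 - q\,\mathbf{v}\cdot\mathbf{A} + mc^2\sqrt{1 - \frac{v^2}{c^2}} \\
&= qA^0 + \mathbf{v}\cdot\boldsymbol{\pi} + mc^2\sqrt{1 - \frac{v^2}{c^2}}.
\end{aligned}$$
Eliminating v we find, finally:

$$H^p = qA^0 + \sqrt{(mc^2)^2 + c^2(\mathbf{p} - q\mathbf{A})^2}. \tag{5.60}$$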
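A quick consistency check of (5.60) is that Hamilton's equation $\mathbf{v} = \partial H^p/\partial\mathbf{p}$ reproduces the velocity found above as a function of π = p − qA. The sketch below treats the potentials as constants at the position of the particle and compares a central-difference derivative of the Hamiltonian with the closed form; the function names and numerical values are illustrative.

```python
import math

def kinetic_momentum(p, A, q):
    """pi = p - q A."""
    return [pi - q * ai for pi, ai in zip(p, A)]

def hamiltonian(p, A, A0, q, m, c=1.0):
    """H = q A0 + sqrt((m c^2)^2 + c^2 |p - q A|^2), as in (5.60)."""
    pi = kinetic_momentum(p, A, q)
    return q * A0 + math.sqrt((m * c * c) ** 2 + c * c * sum(x * x for x in pi))

def velocity(p, A, q, m, c=1.0):
    """v = c pi / sqrt(|pi|^2 + (m c)^2)."""
    pi = kinetic_momentum(p, A, q)
    denom = math.sqrt(sum(x * x for x in pi) + (m * c) ** 2)
    return [c * x / denom for x in pi]

def velocity_from_hamiltonian(p, A, A0, q, m, c=1.0, h=1e-6):
    """Central-difference estimate of dH/dp_i for i = 1, 2, 3."""
    grad = []
    for i in range(3):
        up, down = list(p), list(p)
        up[i] += h
        down[i] -= h
        grad.append((hamiltonian(up, A, A0, q, m, c)
                     - hamiltonian(down, A, A0, q, m, c)) / (2 * h))
    return grad
```

Setting A and A0 to zero in the same functions gives back the free-particle expressions, which is the content of the minimal substitution.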

If we compare with the free particle Hamiltonian:

$$H^p = \sqrt{(mc^2)^2 + (c\mathbf{p})^2} \tag{5.61}$$

we see that the interaction of a charged particle with a given electromagnetic
field is introduced with the minimal substitution:

H p → H p − qA0 ;
p → p − qA (5.62)

or, in covariant terms:

pμ → pμ − qAμ . (5.63)

We conclude with the Hamiltonian of the electromagnetic field. We start
from the definition of the conjugate momentum of $A_\mu$:
$$\Pi^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)} = F^{0\mu}. \tag{5.64}$$

We note that $\Pi^0 = F^{00} = 0$, consistent with the fact that $A^0$ is not a true
dynamical variable of the electromagnetic field. Moreover, as we saw in equations (5.31)
and after, we can choose a gauge in which the vector potential is transverse,
$\partial_i A^i = 0$. In this gauge:
$$\Pi^i = -\partial_0 A^i = E_T^i \tag{5.65}$$
The Hamiltonian of the field is given by:

$$\begin{aligned}
H^{e.m.} &= \int d^3x\,\{\boldsymbol{\pi}\cdot(\partial_0\mathbf{A}) - \mathcal{L}_{e.m.}\} = \int d^3x\,\{E_T^2 - \frac{1}{2}(E^2 - B^2)\} \\
&= \frac{1}{2}\int d^3x\,(E_T^2 + B^2) - \frac{1}{2}\int d^3x\,E_L^2.
\end{aligned}$$

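We must add this result to the Hamiltonian of the particle, (5.60).
Considering the case of several charged particles, we find (with $A_i^\mu = A^\mu[\mathbf{x}_i(t), t]$):

$$\begin{aligned}
H^{tot} &= \frac{1}{2}\int d^3x\,(E_T^2 + B^2) - \frac{1}{2}\int d^3x\,E_L^2 + \Sigma_i\,q_iA_i^0 + \Sigma_i\sqrt{(m_ic^2)^2 + c^2(\mathbf{p}_i - q_i\mathbf{A}_i)^2} \\
&= \frac{1}{2}\int d^3x\,(E_T^2 + B^2) + \frac{1}{2}\,\Sigma_{ij}\,\frac{q_iq_j}{4\pi|\mathbf{x}_i - \mathbf{x}_j|} + \Sigma_i\sqrt{(m_ic^2)^2 + c^2(\mathbf{p}_i - q_i\mathbf{A}_i)^2}.
\end{aligned} \tag{5.66}$$

In the non-relativistic limit we find the well known result:

$$H^{tot} = \frac{1}{2}\int d^3x\,(E_T^2 + B^2) + \frac{1}{2}\,\Sigma_{ij}\,\frac{q_iq_j}{4\pi|\mathbf{x}_i - \mathbf{x}_j|} + \Sigma_i\,\frac{(\mathbf{p}_i - q_i\mathbf{A}_i)^2}{2m_i}. \tag{5.67}$$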

The Classical Zeeman Eﬀect. The minimal substitution leads to the

Lorentz force and numerous examples of classical electromagnetic phenom-
ena therefore confirm its validity.
A characteristic example is the Zeeman effect, the splitting of spectral
lines emitted or absorbed by atoms in a magnetic field, considered in classical
theory by Lorentz [4]. We consider an electron in an atomic system described
by the Hamiltonian:
$$H = \frac{p^2}{2m} + \dots + V(\mathbf{x}, \dots) \tag{5.68}$$
where x, p are the canonical variables of the electron and the dots (. . .)
indicate further terms in the kinetic energy or the potential from other degrees
of freedom of the system. We suppose in addition that V has spherical
symmetry, $V = V(r^2, \dots)$ with $r^2 = x^2 + y^2 + z^2$. Hamilton's equations are:

$$\frac{d\mathbf{x}}{dt} = \frac{\partial H}{\partial\mathbf{p}} = \frac{\mathbf{p}}{m}; \qquad \frac{d\mathbf{p}}{dt} = -\frac{\partial H}{\partial\mathbf{x}} = -\frac{\partial V}{\partial\mathbf{x}} = -\frac{\partial V}{\partial r^2}\,2\mathbf{x}. \tag{5.69}$$
Now we introduce a constant magnetic field directed along the z-axis,
generated by the vector potential:
$$\mathbf{A} = \frac{B}{2}\,(-y, x, 0); \qquad \nabla\times\mathbf{A} = \mathbf{B}.$$
According to the minimal substitution, the new Hamiltonian is:

$$H = \frac{(\mathbf{p} + e\mathbf{A})^2}{2m} + \dots + V(r^2, \dots). \tag{5.70}$$

The spherical symmetry is reduced to an axial symmetry around the
direction of B, so we must treat the variables $x, p_x$ and $y, p_y$ separately from
$z, p_z$. We replace the former with the complex variables:

$$\zeta = x + iy; \qquad p = p_x + ip_y \tag{5.71}$$

and in addition:
$$A = A_x + iA_y = i\,\frac{B}{2}\,\zeta. \tag{5.72}$$
Hamilton's equations for the coordinates transverse to B are:
$$\frac{d\zeta}{dt} = \frac{\partial H}{\partial p_x} + i\,\frac{\partial H}{\partial p_y} = \frac{p}{m} + i\omega_L\zeta$$
where we have introduced the Larmor frequency:
$$\omega_L = \frac{eB}{2m}. \tag{5.73}$$
We define a new variable χ(t), setting:

$$\zeta(t) = \chi(t)\,e^{i\omega_Lt}. \tag{5.74}$$

The previous equation gives:

$$\frac{d\chi}{dt}\,e^{i\omega_Lt} = \frac{p}{m} \tag{5.75}$$
which suggests redefining the conjugate momentum also, by setting:

$$p(t) = \pi(t)\,e^{i\omega_Lt}. \tag{5.76}$$

Hamilton's equation for p is then written:

$$\begin{aligned}
\frac{dp}{dt} = \left(\frac{d\pi}{dt} + i\omega_L\pi\right)e^{i\omega_Lt} &= -\left(\frac{\partial H}{\partial x} + i\,\frac{\partial H}{\partial y}\right) \\
&= -\frac{\partial V}{\partial r^2}\,2\zeta + i\omega_Lp = \left(-\frac{\partial V}{\partial r^2}\,2\chi + i\omega_L\pi\right)e^{i\omega_Lt}
\end{aligned} \tag{5.77}$$
where we have neglected terms quadratic in the magnetic field, which are
negligible in normal experimental situations.
The conclusion obtained from equations (5.75) and (5.77) is that the
variables χ and π obey the same equations of motion as the unperturbed atom.
From (5.74) we see that a precession around the magnetic field direction at the
Larmor frequency is superimposed onto the unperturbed motion. The motion
along z is not affected by the field (the z-component of the Lorentz force is
zero).
If we assume that the unperturbed motion is harmonic with frequency $\omega_0$,
we have:
$$\chi(t) = a\,e^{+i\omega_0t} + b\,e^{-i\omega_0t}, \qquad z = c\,e^{i\omega_0t}$$
for the solutions of (5.69), while:

$$\zeta(t) = a\,e^{+i(\omega_0+\omega_L)t} + b\,e^{-i(\omega_0-\omega_L)t}, \qquad z = c\,e^{i\omega_0t} \tag{5.78}$$

for the motion in the magnetic field.

A spectral line of frequency $\omega_0$ absorbed or emitted by the atom in normal conditions separates into three components with frequencies $\omega_0 \pm \omega_L$ and $\omega_0$. In addition, light which travels in the direction of the magnetic field, which cannot contribute to the motion along $z$, contains only the first two components.
The classical predictions are observed in a certain number of cases (normal Zeeman effect). In other cases, the line structure in the magnetic field is more complicated. The anomalous Zeeman effect can only be explained by taking the magnetic moment of the electron spin into account, to which we will return in Section 6.1.5.
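
The classical result can be illustrated with a short numerical sketch. It evaluates the Larmor frequency (5.73) and the three line components, and checks that the motion (5.78) is the unperturbed motion multiplied by the phase of (5.74). The SI constants, the field of 1 tesla, the optical frequency and the function names are assumptions chosen for illustration only; they do not come from the text.

```python
import cmath
import math

E_CHARGE = 1.602176634e-19   # electron charge magnitude in coulombs (SI, assumed units)
E_MASS = 9.1093837015e-31    # electron mass in kilograms (SI, assumed units)


def larmor_frequency(e, b_field, m):
    """Larmor angular frequency omega_L = e B / (2 m), equation (5.73)."""
    return e * b_field / (2.0 * m)


def zeeman_triplet(omega0, omega_l):
    """Frequencies of the three components of a line of frequency omega0."""
    return (omega0 - omega_l, omega0, omega0 + omega_l)


def chi(t, a, b, omega0):
    """Unperturbed transverse motion, harmonic with frequency omega0."""
    return a * cmath.exp(1j * omega0 * t) + b * cmath.exp(-1j * omega0 * t)


def zeta(t, a, b, omega0, omega_l):
    """Transverse motion in the magnetic field, equation (5.78)."""
    return (a * cmath.exp(1j * (omega0 + omega_l) * t)
            + b * cmath.exp(-1j * (omega0 - omega_l) * t))


omega_l = larmor_frequency(E_CHARGE, 1.0, E_MASS)   # hypothetical field of 1 tesla
omega0 = 2.0 * math.pi * 5.0e14                     # hypothetical optical line
triplet = zeeman_triplet(omega0, omega_l)

# Equation (5.74), zeta(t) = chi(t) exp(i omega_L t), checked in arbitrary units.
t, a, b = 0.37, 1.0 + 0.5j, 0.2 - 0.1j
lhs = zeta(t, a, b, 3.0, 0.25)
rhs = chi(t, a, b, 3.0) * cmath.exp(1j * 0.25 * t)
print(omega_l, triplet, abs(lhs - rhs))
```

For a field of this size the splitting is many orders of magnitude smaller than the optical frequency itself, which is why neglecting terms quadratic in the field is a good approximation.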

5.5 QUANTISATION OF THE ELECTROMAGNETIC FIELD IN VACUUM
Despite the fact that the observation of electromagnetic phenomena was
at the origin of the development of classical ﬁeld theory, quantisation of the
electromagnetic ﬁeld in the canonical formalism presents non-trivial problems.
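The starting point is the Lagrangian density (5.14)
\[
\mathcal{L}_{e.m.} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} = -\frac{1}{2}\left(\partial^\nu A^\mu - \partial^\mu A^\nu\right)\partial_\nu A_\mu = \frac{1}{2}\left(|\mathbf{E}|^2 - |\mathbf{B}|^2\right) \tag{5.79}
\]
from which Maxwell's equations in the absence of charges and currents are derived, through the action principle. The canonical variables conjugate to the components of the vector potential $A_\mu$, which are obtained from:
\[
\pi^\mu = \frac{\partial \mathcal{L}}{\partial \dot{A}_\mu} = F^{0\mu} \tag{5.80}
\]

are ($i = 1, 2, 3$):

\[
\pi^0 = F^{00} = 0, \qquad \pi^i = F^{0i} = \partial^i A^0 - \partial^0 A^i = E^i. \tag{5.81}
\]

From the corresponding expression for the Hamiltonian density:

\[
\begin{aligned}
\mathcal{H} &= \sum_{i=1}^{3}\pi^i \dot{A}_i - \mathcal{L} = \sum_{i=1}^{3}E^i \dot{A}_i - \frac{1}{2}\left(|\mathbf{E}|^2 - |\mathbf{B}|^2\right)\\
&= \frac{1}{2}\left(|\mathbf{E}|^2 + |\mathbf{B}|^2\right) + \mathbf{E}\cdot\nabla\phi,
\end{aligned} \tag{5.82}
\]
with $\phi = A^0$, it follows that (comparing with (5.54)):

\[
H = \frac{1}{2}\int d^3x \left(|\mathbf{E}|^2 + |\mathbf{B}|^2\right), \tag{5.83}
\]
as is easily seen by integrating by parts and using $\nabla\cdot\mathbf{E} = 0$.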

The canonical commutation rules for $A^\mu$ and $\pi^i$ ($i, k = 1, 2, 3$) are:

\[
[A^\mu(\mathbf{x}, t), A^\nu(\mathbf{x}', t)] = 0 \tag{5.84}
\]
\[
[\pi^i(\mathbf{x}, t), \pi^k(\mathbf{x}', t)] = 0 \tag{5.85}
\]
\[
[\pi^k(\mathbf{x}, t), A^0(\mathbf{x}', t)] = 0 \tag{5.86}
\]

showing that, because of the vanishing of the conjugate momentum $\pi^0$, the component $A^0$ of the vector potential, unlike the components $A^i$, commutes with all $A^\mu$ and $\pi^i$, and can therefore be described by a number, instead of an operator. This then limits application of the quantum formalism to the components $A^i$ only, and requires $A^0$ to be treated as a classical field.
This procedure, originally followed by Dirac [5], while being of great utility
in many applications, has the disadvantage of not being manifestly covariant,
since the components of the four-vector Aμ are treated in an asymmetric
manner. The covariant quantisation of the electromagnetic ﬁeld presents non-
trivial technical problems and will be discussed elsewhere.

Energy and Momentum of the Electromagnetic Field. In the gauge defined by the condition:
\[
\nabla\cdot\mathbf{A} = 0, \tag{5.87}
\]
the vector potential satisfies:

\[
\Box\mathbf{A}(x) = \mathbf{j}(x) - \nabla\frac{\partial\phi(x)}{\partial t}, \tag{5.88}
\]
as follows from (5.3), while (5.1) implies that the scalar potential $A^0 = \phi$, the solution of Poisson's equation:

\[
\nabla^2\phi(x) = -\rho(x), \tag{5.89}
\]

is the Coulomb potential generated by the charge distribution $\rho(x)$:

\[
\phi(x) = \int d^3x'\,\frac{\rho(\mathbf{x}', t)}{|\mathbf{x} - \mathbf{x}'|}. \tag{5.90}
\]

For this reason, the gauge (5.87) is called the Coulomb gauge.
In principle, the term containing $\phi$ on the right-hand side of (5.88) can be obtained from (5.90). However, separating the longitudinal and transverse components of the current, by writing:

\[
\mathbf{j}(x) = \mathbf{j}_T(x) + \mathbf{j}_L(x) \tag{5.91}
\]

with
\[
\nabla\times\mathbf{j}_L = 0, \qquad \nabla\cdot\mathbf{j}_T = 0 \tag{5.92}
\]

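and using the identity $\nabla\times(\nabla\times\mathbf{j}) = \nabla(\nabla\cdot\mathbf{j}) - \nabla^2\mathbf{j}$, it is easily shown that

\[
\nabla\frac{\partial\phi(x)}{\partial t} = \mathbf{j}_L(x), \tag{5.93}
\]
which means that the source term in the equation which determines the vector potential $\mathbf{A}$ is reduced to the transverse component of the current, $\mathbf{j}_T(x)$.
In the absence of charges and currents, $\phi(x) \equiv 0$ and (5.88) becomes:

\[
\Box\mathbf{A} = 0. \tag{5.94}
\]

A particular solution to (5.94) which satisfies periodic boundary conditions on a cubic box, as discussed in Section 4.1, can be written in the form:
\[
\mathbf{u}_{k}(\mathbf{x}, t) = \frac{1}{\sqrt{V}}\,\boldsymbol{\epsilon}_{k}\,e^{-i(\omega_k t - \mathbf{k}\cdot\mathbf{x})}, \tag{5.95}
\]
where $V$ is the normalisation volume and

\[
\omega_k = |\mathbf{k}|. \tag{5.96}
\]

The gauge condition (5.87) implies that the polarisation vector has the property:
\[
\boldsymbol{\epsilon}_{k}\cdot\mathbf{k} = 0, \tag{5.97}
\]
that is, for every wave vector $\mathbf{k}$, $\boldsymbol{\epsilon}_{k}$ lies in the plane perpendicular to $\mathbf{k}$. Therefore we can define two real unit vectors $\boldsymbol{\epsilon}_{k1}$ and $\boldsymbol{\epsilon}_{k2}$, such that ($r, r' = 1, 2$)
\[
\boldsymbol{\epsilon}_{kr}\cdot\boldsymbol{\epsilon}_{kr'} = \delta_{rr'} \tag{5.98}
\]
and
\[
\boldsymbol{\epsilon}_{kr}\cdot\mathbf{k} = 0. \tag{5.99}
\]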
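The general solution of (5.94) which satisfies the gauge condition (5.87) is obtained from a linear combination of $\mathbf{u}_{kr}(\mathbf{x}, t)$ (we recall that $\mathbf{A}(\mathbf{x}, t)$ is a real function)
\[
\begin{aligned}
\mathbf{A}(\mathbf{x}, t) &= \sum_{k}\sum_{r=1}^{2}\left[c_{kr}\mathbf{u}_{kr}(\mathbf{x}, t) + c^*_{kr}\mathbf{u}^*_{kr}(\mathbf{x}, t)\right]\\
&= \sum_{k}\sum_{r=1}^{2}\left[c_{kr}(t)\mathbf{u}_{kr}(\mathbf{x}) + c^*_{kr}(t)\mathbf{u}^*_{kr}(\mathbf{x})\right],
\end{aligned} \tag{5.100}
\]

where we have defined

\[
c_{kr}(t) = c_{kr}e^{-i\omega_k t} \tag{5.101}
\]
and the functions
\[
\mathbf{u}_{kr}(\mathbf{x}) = \frac{1}{\sqrt{V}}\,\boldsymbol{\epsilon}_{kr}\,e^{i\mathbf{k}\cdot\mathbf{x}} \tag{5.102}
\]
satisfy the orthonormality conditions:

\[
\int d^3x\,\mathbf{u}^*_{kr}(\mathbf{x})\cdot\mathbf{u}_{k'r'}(\mathbf{x}) = \frac{\boldsymbol{\epsilon}_{kr}\cdot\boldsymbol{\epsilon}_{k'r'}}{V}\int d^3x\,e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{x}} = \delta_{rr'}\delta_{kk'}. \tag{5.103}
\]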
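Substituting into the equation for the energy of the electromagnetic field (comparing with (5.54)):
\[
H = \frac{1}{2}\int d^3x\left(|\mathbf{B}|^2 + |\mathbf{E}|^2\right) = \frac{1}{2}\int d^3x\left(|\nabla\times\mathbf{A}|^2 + \left|\frac{\partial\mathbf{A}}{\partial t}\right|^2\right) \tag{5.104}
\]

the Fourier expansion (5.100) gives

\[
\begin{aligned}
H = \frac{1}{2}\int d^3x\sum_{kr}\sum_{k'r'}\Big\{&\left[c_{kr}(t)\,\nabla\times\mathbf{u}_{kr}(\mathbf{x}) + \text{c.c.}\right]\cdot\left[c^*_{k'r'}(t)\,\nabla\times\mathbf{u}^*_{k'r'}(\mathbf{x}) + \text{c.c.}\right]\\
&+ \left[\frac{\partial c_{kr}}{\partial t}\mathbf{u}_{kr}(\mathbf{x}) + \text{c.c.}\right]\cdot\left[\frac{\partial c^*_{k'r'}}{\partial t}\mathbf{u}^*_{k'r'}(\mathbf{x}) + \text{c.c.}\right]\Big\}.
\end{aligned} \tag{5.105}
\]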
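The calculation of the contribution of the magnetic field requires integrations of the type:

\[
\begin{aligned}
\int d^3x\,(\nabla\times\mathbf{u}_{kr}(\mathbf{x}))\cdot(\nabla\times\mathbf{u}^*_{k'r'}(\mathbf{x}))
&= \int d^3x\,\nabla\cdot\left[\mathbf{u}_{kr}(\mathbf{x})\times(\nabla\times\mathbf{u}^*_{k'r'}(\mathbf{x}))\right]
+ \int d^3x\,\mathbf{u}_{kr}(\mathbf{x})\cdot\left[\nabla\times(\nabla\times\mathbf{u}^*_{k'r'}(\mathbf{x}))\right]\\
&= -\int d^3x\,\mathbf{u}_{kr}(\mathbf{x})\cdot\nabla^2\mathbf{u}^*_{k'r'}(\mathbf{x})\\
&= \frac{|\mathbf{k}'|^2}{V}\,\boldsymbol{\epsilon}_{kr}\cdot\boldsymbol{\epsilon}_{k'r'}\int d^3x\,e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{x}} = \omega_k^2\,\delta_{rr'}\delta_{kk'},
\end{aligned}
\]
which are easily carried out using the periodic boundary conditions at the edges of the volume of integration and the identities
\[
(\nabla\times\mathbf{u})\cdot(\nabla\times\mathbf{v}) = \nabla\cdot[\mathbf{u}\times(\nabla\times\mathbf{v})] + \mathbf{u}\cdot[\nabla\times(\nabla\times\mathbf{v})]
\]
and
\[
\nabla\times(\nabla\times\mathbf{u}) = \nabla(\nabla\cdot\mathbf{u}) - \nabla^2\mathbf{u}.
\]
The corresponding integrals for the contribution of the electric field are instead of the type:

\[
\int d^3x\,\frac{\partial c_{kr}}{\partial t}\mathbf{u}_{kr}(\mathbf{x})\cdot\frac{\partial c^*_{k'r'}}{\partial t}\mathbf{u}^*_{k'r'}(\mathbf{x}) = \omega_k\omega_{k'}\,\delta_{rr'}\delta_{kk'}\,c_{kr}(t)c^*_{k'r'}(t).
\]
The final result is:

\[
H = \sum_{kr}\omega_k^2\left[c_{kr}(t)c^*_{kr}(t) + c^*_{kr}(t)c_{kr}(t)\right], \tag{5.106}
\]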

where the functions $c_{kr}(t)$ satisfy the differential equation:

\[
\ddot{c}_{kr}(t) = -\omega_k^2 c_{kr}(t), \tag{5.107}
\]
which is the equation of motion of a classical harmonic oscillator of angular frequency $\omega_k$ and unit mass. We note that, if the functions $c_{kr}(t)$ are complex numbers, $[c_{kr}(t)c^*_{kr}(t) + c^*_{kr}(t)c_{kr}(t)] = 2c_{kr}(t)c^*_{kr}(t)$. The reason we have written $H$ in the form (5.106) will be made clear shortly.
The energy of a system of classical, unit mass oscillators, vibrating in the directions defined by the wave vectors $\mathbf{k}$ with angular frequency $\omega_k$ is:
\[
H_{osc} = \frac{1}{2}\sum_{kr}\left(p_{kr}^2 + \omega_k^2 x_{kr}^2\right), \tag{5.108}
\]

where $x_{kr}$ and $p_{kr}$ are the canonical classical variables, which satisfy the equations of motion:
\[
\ddot{x}_{kr}(t) = -\omega_k^2 x_{kr}(t), \qquad \ddot{p}_{kr}(t) = -\omega_k^2 p_{kr}(t). \tag{5.109}
\]
Comparing (5.106) with (5.108) we see immediately that $H$ coincides with $H_{osc}$ if
\[
c_{kr} = \frac{1}{2\omega_k}(\omega_k x_{kr} + ip_{kr}), \qquad c^*_{kr} = \frac{1}{2\omega_k}(\omega_k x_{kr} - ip_{kr}), \tag{5.110}
\]
or if
\[
x_{kr} = c_{kr} + c^*_{kr}, \qquad p_{kr} = -i\omega_k(c_{kr} - c^*_{kr}). \tag{5.111}
\]
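
The identification between (5.106) and (5.108) can be checked for a single mode with a few lines of arithmetic. The sketch below is only a numerical illustration: the function names and the sample values of $x$, $p$ and $\omega$ are assumptions, not part of the text.

```python
def amplitude(x, p, omega):
    """Complex amplitude c = (omega x + i p) / (2 omega), equation (5.110)."""
    return (omega * x + 1j * p) / (2.0 * omega)


def field_energy(c, omega):
    """Single-mode term of (5.106): omega^2 (c c* + c* c)."""
    return (omega ** 2 * (c * c.conjugate() + c.conjugate() * c)).real


def oscillator_energy(x, p, omega):
    """Single-mode term of (5.108): (p^2 + omega^2 x^2) / 2."""
    return 0.5 * (p ** 2 + omega ** 2 * x ** 2)


x, p, omega = 0.7, -1.3, 2.5          # arbitrary sample values
c = amplitude(x, p, omega)

# Inverse relations (5.111)
x_back = (c + c.conjugate()).real
p_back = (-1j * omega * (c - c.conjugate())).real

print(field_energy(c, omega), oscillator_energy(x, p, omega), x_back, p_back)
```

The two energies agree for any real $x$ and $p$, and (5.111) recovers the original canonical variables from the amplitude.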
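This result suggests interpreting $H$ as the energy of a set of classical oscillators. The path to the quantum case is straightforward. The energy of a system of quantum oscillators has the form:
\[
H = \sum_{kr}\frac{\omega_k}{2}\left(a^\dagger_{kr}a_{kr} + a_{kr}a^\dagger_{kr}\right), \tag{5.112}
\]

which agrees with the energy of the electromagnetic field (5.106) if the coefficients of the expansion of $\mathbf{A}$ in a Fourier series are interpreted as quantum operators defined by the equations:
\[
c_{kr} = \frac{1}{\sqrt{2\omega_k}}\,a_{kr}, \qquad c^*_{kr} = \frac{1}{\sqrt{2\omega_k}}\,a^\dagger_{kr}. \tag{5.113}
\]
The vector potential $\mathbf{A}$ can therefore be expressed in terms of the creation and destruction operators $a^\dagger_{kr}$ and $a_{kr}$
\[
\mathbf{A}(x) = \sum_{kr}\frac{1}{\sqrt{2V\omega_k}}\,\boldsymbol{\epsilon}_{kr}\left[a_{kr}(t)e^{i\mathbf{k}\cdot\mathbf{x}} + a^\dagger_{kr}(t)e^{-i\mathbf{k}\cdot\mathbf{x}}\right] \tag{5.114}
\]
\[
= \sum_{kr}\frac{1}{\sqrt{2V\omega_k}}\,\boldsymbol{\epsilon}_{kr}\left[a_{kr}e^{-i(\omega_k t - \mathbf{k}\cdot\mathbf{x})} + a^\dagger_{kr}e^{i(\omega_k t - \mathbf{k}\cdot\mathbf{x})}\right]
= \mathbf{A}^+(x) + \mathbf{A}^-(x), \tag{5.115}
\]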

and is itself an operator in the Hilbert space whose state vectors are:

\[
|n_{k_1r_1}, n_{k_2r_2}, \ldots, n_{k_nr_n}, \ldots\rangle, \tag{5.116}
\]

where $n_{kr}$ is the quantum number of the mode of oscillation characterised by the wave vector $\mathbf{k}$ and by the polarisation vector $\boldsymbol{\epsilon}_{kr}$. The state (5.116) can be obtained starting from the vacuum state, which corresponds to $n_{kr} \equiv 0$, using
\[
|n_{k_1r_1}, n_{k_2r_2}, \ldots\rangle = \prod_{k_ir_i}\frac{\left(a^\dagger_{k_ir_i}\right)^{n_{k_ir_i}}}{\sqrt{n_{k_ir_i}!}}\,|0\rangle. \tag{5.117}
\]

Notice that in (5.115) we have separated the contribution of the terms containing creation operators ($\mathbf{A}^-(x)$) from those of annihilation ($\mathbf{A}^+(x)$). Evidently:
\[
\mathbf{A}^+(x)|0\rangle = 0. \tag{5.118}
\]
Choosing $\langle 0|H|0\rangle$ as the zero of the energy scale, the Hamiltonian of the electromagnetic field becomes:

\[
H = \sum_{kr}\omega_k N_{kr}, \tag{5.119}
\]

with $N_{kr} = a^\dagger_{kr}a_{kr}$, and satisfying the eigenvalue equations:

\[
H|n_{k_1r_1}, n_{k_2r_2}, \ldots\rangle = \Big(\sum_{k_ir_i}n_{k_ir_i}\omega_{k_i}\Big)|n_{k_1r_1}, n_{k_2r_2}, \ldots\rangle. \tag{5.120}
\]

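Because $\mathbf{A}(x)$ is linear in the creation and destruction operators, from the commutation rules
\[
\left[a_{kr}, N_{k'r'}\right] = \left[a_{kr}, a^\dagger_{k'r'}a_{k'r'}\right] = \delta_{rr'}\delta_{kk'}\,a_{kr} \tag{5.121}
\]
and
\[
\left[a^\dagger_{kr}, N_{k'r'}\right] = \left[a^\dagger_{kr}, a^\dagger_{k'r'}a_{k'r'}\right] = -\delta_{rr'}\delta_{kk'}\,a^\dagger_{kr}, \tag{5.122}
\]
it follows that the operator $N_{kr}$ does not commute with $\mathbf{A}(x)$ nor, therefore, with the electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$. This result implies that the values of $\mathbf{E}$ and $\mathbf{B}$ and $n_{kr}$ cannot be simultaneously measured with arbitrary precision. Moreover, from the linearity of $\mathbf{A}(x)$ in $a_{kr}$ and $a^\dagger_{kr}$, it also follows that the expectation values $\langle\mathbf{E}\rangle$ and $\langle\mathbf{B}\rangle$ in the states (5.116) are all zero.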
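
The commutators of the ladder operators with the number operator, and the vanishing of the expectation value of an operator linear in $a$ and $a^\dagger$, can be checked with finite matrices. This is a sketch under stated assumptions: a single mode, a Fock space truncated to `DIM` number states, and the standard matrix elements $a|n\rangle = \sqrt{n}\,|n-1\rangle$. With these, $[a, N] = a$ and $[a^\dagger, N] = -a^\dagger$ hold exactly even after truncation.

```python
import math

DIM = 6  # truncation of the single-mode Fock space (an assumption of this sketch)


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


# Annihilation operator in the number basis: <n-1| a |n> = sqrt(n).
a_op = [[math.sqrt(col) if col == row + 1 else 0.0 for col in range(DIM)]
        for row in range(DIM)]
a_dag = [[a_op[col][row] for col in range(DIM)] for row in range(DIM)]
number = matmul(a_dag, a_op)                       # N = a† a = diag(0, 1, 2, ...)

minus_a_dag = [[-x for x in row] for row in a_dag]
check_a = close(commutator(a_op, number), a_op)               # [a, N] = a
check_a_dag = close(commutator(a_dag, number), minus_a_dag)   # [a†, N] = -a†

# An operator linear in a and a† has zero expectation value in every |n>.
linear = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a_op, a_dag)]
expectation_values = [linear[n][n] for n in range(DIM)]

print(check_a, check_a_dag, expectation_values)
```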
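Classically the momentum of an electromagnetic wave is given by the Poynting vector (compare with equation (5.54))

\[
\mathbf{p} = \int d^3x\,\mathbf{E}\times\mathbf{B} = -\int d^3x\,\frac{\partial\mathbf{A}}{\partial t}\times(\nabla\times\mathbf{A}). \tag{5.123}
\]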

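Substituting (5.114) in (5.123) we find

\[
\mathbf{p} = \frac{1}{2}\sum_{kr}\mathbf{k}\left(a^\dagger_{kr}a_{kr} + a_{kr}a^\dagger_{kr}\right)
= \sum_{kr}\mathbf{k}\left(a^\dagger_{kr}a_{kr} + \frac{1}{2}\right)
= \sum_{kr}\mathbf{k}\,N_{kr}. \tag{5.124}
\]

To obtain (5.124) it is sufficient to rewrite (5.123) in the form

\[
\mathbf{p} = \frac{1}{2}\int d^3x\,(\mathbf{E}\times\mathbf{B} - \mathbf{B}\times\mathbf{E}). \tag{5.125}
\]
In this way it is immediately seen that the terms containing two creation or two destruction operators do not contribute. For the calculation of the other terms we use
\[
\begin{aligned}
&\int d^3x\sum_{kr}\sum_{k'r'}\frac{1}{2V}\frac{1}{\sqrt{\omega_k\omega_{k'}}}\,\boldsymbol{\epsilon}_{kr}\times(\mathbf{k}'\times\boldsymbol{\epsilon}_{k'r'})\,\omega_k
\left[a_{kr}a^\dagger_{k'r'}e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{x}} + a^\dagger_{kr}a_{k'r'}e^{-i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{x}}\right]\\
&\quad= \frac{1}{2}\sum_{kr}\sum_{k'r'}\frac{\omega_k}{\sqrt{\omega_k\omega_{k'}}}\,\boldsymbol{\epsilon}_{kr}\times(\mathbf{k}'\times\boldsymbol{\epsilon}_{k'r'})\left(a_{kr}a^\dagger_{k'r'} + a^\dagger_{kr}a_{k'r'}\right)\delta_{kk'}\\
&\quad= \sum_{kr}\mathbf{k}\left(a^\dagger_{kr}a_{kr} + \frac{1}{2}\right)
\end{aligned} \tag{5.126}
\]
where, clearly, $\sum_{k}\mathbf{k} = 0$.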
Equation (5.124) shows that the quantum of the electromagnetic field with energy $\omega_k = |\mathbf{k}|$ has momentum $\mathbf{k}$, and can therefore be identified with a particle of zero mass (as follows from $\omega_k^2 - |\mathbf{k}|^2 = 0$), the photon. The corpuscular nature of electromagnetic radiation was confirmed in 1922 by the observation of the Compton effect, which demonstrates the conservation of momentum and energy in an elastic collision between photons and atomic electrons.

5.6 THE SPIN OF THE PHOTON

The polarisation state of the photon is determined by the projection of its angular momentum along the quantisation axis, which we can take to be $x_3$.
We apply the definition of the canonical tensor for the angular momentum from Chapter 3 (equations (3.90) and (3.91)), to the case of the electromagnetic field. Because this is a vector field (which transforms like a 4-vector under Lorentz transformations) the indices $M$, $N$ which appear in the definition of $\Sigma^{\alpha\beta}_{MN}$ (equation (3.87)) are Lorentz indices in this case. In this way, we find:
\[
\left(\Sigma^{\alpha\beta}\right)^{\mu}{}_{\nu} = g^{\alpha\mu}g^{\beta}{}_{\nu} - g^{\beta\mu}g^{\alpha}{}_{\nu}, \tag{5.127}
\]

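that, substituted in the equation analogous to (3.91) for $J^3$, gives the result

\[
J^3 = \int d^3x\left[\sum_{i=1}^{3}\dot{A}^i\left(x^1\frac{\partial}{\partial x^2} - x^2\frac{\partial}{\partial x^1}\right)A^i - \left(\dot{A}^1A^2 - \dot{A}^2A^1\right)\right]. \tag{5.128}
\]

Now substituting (5.114) into (5.128) we see that $J^3$ satisfies the commutation rules
\[
\left[J^3, a^\dagger_{kr}\right] = i\left(\epsilon^1_{kr}\,a^\dagger_{k2} - \epsilon^2_{kr}\,a^\dagger_{k1}\right), \tag{5.129}
\]

where we have chosen the axis $x_3$ along the direction of the wave vector $\mathbf{k}$. Note that only the second term in the integrand of (5.128), which we can interpret as spin angular momentum, gives a non-zero contribution to the commutator (5.129).
We now define two new operators
\[
a^\dagger_{kR} = \frac{1}{\sqrt{2}}\left(a^\dagger_{k1} + ia^\dagger_{k2}\right), \qquad a^\dagger_{kL} = \frac{1}{\sqrt{2}}\left(a^\dagger_{k1} - ia^\dagger_{k2}\right), \tag{5.130}
\]
which create circularly polarised photons; that is, photons whose state of polarisation is described by the vectors
\[
\boldsymbol{\epsilon}_{kR} = \frac{1}{\sqrt{2}}\left(\boldsymbol{\epsilon}_{k1} + i\boldsymbol{\epsilon}_{k2}\right), \qquad \boldsymbol{\epsilon}_{kL} = \frac{1}{\sqrt{2}}\left(\boldsymbol{\epsilon}_{k1} - i\boldsymbol{\epsilon}_{k2}\right). \tag{5.131}
\]
Setting $\boldsymbol{\epsilon}_{k1} \equiv (1, 0, 0)$ and $\boldsymbol{\epsilon}_{k2} \equiv (0, 1, 0)$ and rewriting (5.129) in terms of the new operators we obtain the commutation rules
\[
\left[J^3, a^\dagger_{kR}\right] = a^\dagger_{kR}, \qquad \left[J^3, a^\dagger_{kL}\right] = -a^\dagger_{kL}, \tag{5.132}
\]

from which it follows that the third component of the angular momentum of the state with a photon is given by
\[
J^3a^\dagger_{kR}|0\rangle = \left[J^3, a^\dagger_{kR}\right]|0\rangle = a^\dagger_{kR}|0\rangle, \tag{5.133}
\]
\[
J^3a^\dagger_{kL}|0\rangle = \left[J^3, a^\dagger_{kL}\right]|0\rangle = -a^\dagger_{kL}|0\rangle. \tag{5.134}
\]

This result shows that the photon has spin $|J| = 1$, as required by the vector nature of the electromagnetic field, and that the two projections $J_3 = \pm 1$ correspond to the circularly polarised states. The absence of the state with $J_3 = 0$ is a consequence of the transversality condition (5.87), which reduces the number of degrees of freedom by one unit.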
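
The assignment $J_3 = \pm 1$ to the circular polarisations can be illustrated at the level of the polarisation vectors (5.131). The sketch below assumes the standard spin-1 generator of rotations about the $x_3$ axis in the Cartesian basis, $(S_3)_{jk} = -i\epsilon_{3jk}$, which is not written out in the text; the helper names are likewise hypothetical.

```python
import math

# Spin-1 generator of rotations about x3 in the Cartesian basis,
# (S3)_{jk} = -i * epsilon_{3jk}  (standard form, assumed here).
S3 = [[0, -1j, 0],
      [1j, 0, 0],
      [0, 0, 0]]


def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def is_eigenvector(matrix, vector, eigenvalue, tol=1e-12):
    """True if matrix @ vector equals eigenvalue * vector within tol."""
    image = apply(matrix, vector)
    return all(abs(x - eigenvalue * v) < tol for x, v in zip(image, vector))


R2 = 1.0 / math.sqrt(2.0)
EPS_1 = [1, 0, 0]                                        # linear polarisation along x1
EPS_2 = [0, 1, 0]                                        # linear polarisation along x2
EPS_R = [R2 * (u + 1j * v) for u, v in zip(EPS_1, EPS_2)]   # (5.131)
EPS_L = [R2 * (u - 1j * v) for u, v in zip(EPS_1, EPS_2)]   # (5.131)
K_HAT = [0, 0, 1]                                        # direction of the wave vector

print(is_eigenvector(S3, EPS_R, 1),
      is_eigenvector(S3, EPS_L, -1),
      is_eigenvector(S3, K_HAT, 0))
```

The third eigenvector, with eigenvalue 0, points along $\mathbf{k}$ and is therefore excluded by the transversality condition (5.87).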

5.7 PROBLEMS FOR CHAPTER 5

Sect. 5.5
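1. Consider a radiation field in the Coulomb gauge: $\nabla\cdot\mathbf{A} = 0$. Using the plane wave expansions of the fields $A_i$ ($i = 1, 2, 3$) and of the corresponding conjugate variables $\pi_i$, show that

\[
[A_i(t, \mathbf{x}), \pi_j(t, \mathbf{x}')] = i\,\delta^{\perp}_{ij}(\mathbf{x} - \mathbf{x}')
\]

where the transverse $\delta$-function is defined as

\[
\delta^{\perp}_{ij}(\mathbf{x} - \mathbf{x}') = \sum_{k}\frac{1}{V}\left(\delta_{ij} - \frac{k_ik_j}{k^2}\right)e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}.
\]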

2. In the Coulomb gauge, Section 5.5, the interactions of a collection of atomic electrons with the electromagnetic field are described by the Hamiltonian [see eq. (5.67)]

\[
H_{int} = \sum_{i=1}^{Z}\frac{e}{m}\,(\mathbf{p}\cdot\mathbf{A})_i
\]

where $e$ and $m$ are the electron charge and mass, while $\mathbf{A}$ is the radiation field.
Using the dipole approximation, which amounts to replacing $e^{i\mathbf{k}\cdot\mathbf{x}} \to 1$ in the plane-wave expansion of the field $\mathbf{A}$, evaluate the transition matrix element associated with photon emission and absorption

\[
M_{i\to f} = \langle f|H_{int}|i\rangle
\]

where
\[
|i\rangle = |A, n_{kr}\rangle, \qquad |f\rangle = |B, n_{kr} \pm 1\rangle.
\]
In the above equations, $n_{kr}$ is the number of photons in the state of momentum $\mathbf{k}$ and polarisation specified by the index $r$, whereas $A$ and $B$ denote the quantum numbers of the atomic initial and final states, respectively.
3. Obtain the commutation rules satisfied by the components of the electric and magnetic fields ($i = 1, 2, 3$)

\[
[E^i(x), E^j(y)], \qquad [B^i(x), B^j(y)], \qquad [E^i(x), B^j(y)]
\]

for $x_0 = y_0$.
4. The basis of eigenstates of the electromagnetic Hamiltonian, specified by the number of photons with a given wave number, $\mathbf{k}$, and polarisation, $r$, provides a useful representation when the number of photons is small. The states produced by a classical light source, however, are best described using coherent states.
Consider the coherent state associated with a single oscillation mode of the electromagnetic field
\[
|\alpha\rangle = e^{-|\alpha|^2/2}\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,|n\rangle, \qquad |n\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}}\,|0\rangle,
\]
where $\alpha$ is an arbitrary complex number, $a^\dagger$ is the creation operator and $|n\rangle$ is the $n$-photon state.

– Show that $|\alpha\rangle$ is an eigenstate of the annihilation operator, satisfying the eigenvalue equation

\[
a|\alpha\rangle = \alpha|\alpha\rangle.
\]

– Using the above result and the relations

\[
a = \frac{1}{\sqrt{2\omega}}(\omega x + ip), \qquad a^\dagger = \frac{1}{\sqrt{2\omega}}(\omega x - ip)
\]
show that $|\alpha\rangle$ is a minimum uncertainty state, i.e. that
\[
\Delta x\,\Delta p = \frac{\hbar}{2}.
\]
Hint: use $(\Delta x)^2 = \langle x^2\rangle - \langle x\rangle^2$ and $(\Delta p)^2 = \langle p^2\rangle - \langle p\rangle^2$.
5. The Lagrangian density
\[
\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2A_\mu A^\mu - j_\mu A^\mu
\]
with
\[
F^{\mu\nu} = \partial^\nu A^\mu - \partial^\mu A^\nu,
\]
describes a massive vector field interacting with the current $j_\mu$.
– Derive the equations of motion of the field $A^\mu$.
– Using the obtained result, determine which condition must be fulfilled to satisfy the relation

\[
\partial^\mu A_\mu = 0.
\]

– Calculate the variation of the action, defined as

\[
\delta S = \int d^4x\,\mathcal{L}(A'^\mu, \partial_\nu A'^\mu) - \int d^4x\,\mathcal{L}(A^\mu, \partial_\nu A^\mu)
\]

associated with the gauge transformation

\[
A_\mu \to A'_\mu = A_\mu - \partial_\mu\Lambda.
\]

6. Starting from the transformation rules

\[
A'^\mu(x') = \Lambda^\mu{}_\nu A^\nu(x)
\]

calculate the angular momentum matrices $(\Sigma^{\alpha\beta})^{\mu}{}_{\nu}$, introduced in Section 3.6, and show that the symmetric energy-momentum tensor, $\theta^{\mu\nu}_{e.m.}$ of Eq. (5.48), can be obtained from the canonical energy-momentum tensor $T^{\mu,\nu}$ using the method of Belinfante and Rosenfeld.
CHAPTER 6

THE DIRAC EQUATION

From this chapter onward we will use natural units, in which $\hbar = c = 1$.

According to non-relativistic quantum mechanics, the evolution of the wave function of a free particle of mass $m$ and momentum $\mathbf{p}$ is described by the Schrödinger equation:
\[
i\frac{\partial\psi}{\partial t} = H\psi, \tag{6.1}
\]
with the Hamiltonian operator $H = -\nabla^2/2m$ obtained from the expression for energy
\[
E = \frac{\mathbf{p}^2}{2m}, \tag{6.2}
\]
using the substitution
\[
E \to i\frac{\partial}{\partial t}, \qquad \mathbf{p} \to -i\nabla. \tag{6.3}
\]
Schrödinger himself [6] first suggested a generalisation of equation (6.1) based on the use of the relativistic energy equation:

\[
E^2 = \mathbf{p}^2 + m^2. \tag{6.4}
\]

The outcome of this process is the Klein–Gordon equation:

\[
(\Box + m^2)\psi = 0. \tag{6.5}
\]

Multiplying by $\psi^*$ and subtracting the product of $\psi$ and the complex conjugate of (6.5) from the result, the continuity equation is obtained

\[
\frac{\partial\rho}{\partial t} = \nabla\cdot\mathbf{j}, \tag{6.6}
\]
with
\[
\rho = \psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t} \tag{6.7}
\]
78 DOI: 10.1201/9781003436263-6
This chapter has been made available under a CC BY NC license.

and
\[
\mathbf{j} = \psi^*\nabla\psi - \psi\nabla\psi^*. \tag{6.8}
\]
However, the $\rho$ which appears in equation (6.6) cannot be identified with the probability density, in analogy with the similar quantity obtained from the Schrödinger equation, because it does not have the required property of being always positive-definite. To be convinced of this it is sufficient to substitute $i\partial/\partial t \to E$ in (6.7). The result, up to a constant factor,

\[
\rho \propto E|\psi|^2,
\]

shows that $\rho$ can be either positive or negative, following from the fact that the Klein–Gordon equation has solutions for the energy with both signs

\[
E = \pm\sqrt{\mathbf{p}^2 + m^2}.
\]

Note that in the non-relativistic limit $E \approx m > 0$, and the familiar result $\rho \propto |\psi|^2$ is recovered.
In addition, the presence of the second-time derivative conﬂicts with the
postulate of quantum mechanics stating that the wave function contains all
the information on the state of a physical system, which must therefore be
completely determined by its initial value.
Because of these problems the Klein–Gordon equation was initially aban-
doned, until Pauli and Weisskopf [7] suggested that the solution should be
interpreted as a quantum ﬁeld, instead of the wave function of a particle.

6.1 FORM AND PROPERTIES OF THE DIRAC EQUATION

If, at a given instant, the wave function should contain all the informa-
tion on the state, the wave equation must be of ﬁrst order with respect to
time. Because the relativistic treatment requires that the time and the spa-
tial coordinates should be treated symmetrically, this implies that the spatial
derivatives must also be of ﬁrst order. Moreover, the solutions should be com-
patible with the Klein–Gordon equation, which is obtained directly from the
relativistic expression for energy (6.4).
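To satisfy all these conditions, Dirac proposed [9] to write the wave equation in the form
\[
i\frac{\partial\psi}{\partial t} = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi, \tag{6.9}
\]
where $\psi$ is a vector with $N$ components
\[
\psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \vdots\\ \psi_N\end{pmatrix} \tag{6.10}
\]
and $\alpha_i$ ($i = 1, 2, 3$) and $\beta$ are $N \times N$ matrices, with $N$ to be determined.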

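Note that wave functions of the type (6.10) are encountered in non-relativistic quantum mechanics. For example, the wave function of a particle of spin $\frac{1}{2}$ is a two-component vector.
Because the Hamiltonian is a hermitian operator, it must be true that $\boldsymbol{\alpha} = \boldsymbol{\alpha}^\dagger$ and $\beta = \beta^\dagger$. Moreover, from the requirement that (6.9) should be compatible with the Klein–Gordon equation, or

\[
(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m)(\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m) = E^2 = \mathbf{p}^2 + m^2,
\]

it follows that:

\[
\begin{aligned}
\alpha_i\alpha_jp_ip_j + m(\alpha_i\beta + \beta\alpha_i)p_i + \beta^2m^2
&= \frac{1}{2}\left(\{\alpha_i, \alpha_j\} + [\alpha_i, \alpha_j]\right)p_ip_j + m\{\alpha_i, \beta\}p_i + \beta^2m^2\\
&= p_ip_j\delta_{ij} + m^2.
\end{aligned}
\]

Note that $p_ip_j$ is a symmetric tensor, whose contraction with the antisymmetric tensor $[\alpha_i, \alpha_j]$ is zero. Therefore we find:

\[
\{\alpha_i, \alpha_j\} = 2\delta_{ij} \tag{6.11}
\]
\[
\{\alpha_i, \beta\} = 0 \tag{6.12}
\]
\[
\alpha_i^2 = \beta^2 = 1. \tag{6.13}
\]
It is helpful to introduce a new set of matrices $\gamma^\mu$ ($\mu = 0, 1, 2, 3$) defined as ($i = 1, 2, 3$)
\[
\gamma^0 = \beta, \qquad \gamma^i = \beta\alpha_i = \gamma^0\alpha_i, \tag{6.14}
\]
which satisfy the anticommutation rules

\[
\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}, \tag{6.15}
\]

and have the properties

\[
(\gamma^0)^2 = 1, \qquad (\gamma^i)^2 = -1, \tag{6.16}
\]
\[
\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0. \tag{6.17}
\]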

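Using the $\gamma^\mu$ matrices the Dirac equation can be rewritten in the form given by Feynman. Multiplying (6.9) on the left by $\beta = \gamma^0$ gives

\[
i\beta\frac{\partial\psi}{\partial t} = i\gamma^0\frac{\partial\psi}{\partial t} = (\beta\alpha_ip_i + \beta^2m)\psi = \left(-i\gamma^i\frac{\partial}{\partial x^i} + m\right)\psi,
\]
which is
\[
(i\gamma^\mu\partial_\mu - m)\psi = 0. \tag{6.18}
\]
Finally, introducing the notation $\not\partial = \gamma^\mu\partial_\mu$, equation (6.18) can be put in the form
\[
(i\not\partial - m)\psi = 0. \tag{6.19}
\]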

Properties of the Dirac Matrices. The anticommutation rules (6.11)–(6.13) determine the general properties that the matrices $\alpha_i$ and $\beta$ should satisfy. Equation (6.13) implies that the eigenvalues of the matrices $\alpha_i$ and $\beta$ are all equal to $\pm 1$, while from (6.11) it follows, for $j \neq i$, that

\[
\alpha_i\alpha_j\alpha_j = \alpha_i = -\alpha_j\alpha_i\alpha_j
\]
\[
Tr(\alpha_i) = -Tr(\alpha_j\alpha_i\alpha_j) = -Tr(\alpha_j\alpha_j\alpha_i) = -Tr(\alpha_i),
\]

thus
\[
Tr(\alpha_i) = 0. \tag{6.20}
\]
Using the same procedure it can also be shown that from (6.12) it follows that

\[
Tr(\beta) = 0. \tag{6.21}
\]

An $N \times N$ matrix whose eigenvalues are all equal to $\pm 1$ can have a zero trace only if $N$ is an even number. Therefore the possible dimensions of the Dirac matrices are $N = 2, 4, \ldots$.
We can immediately exclude the case $N = 2$. The Pauli matrices

\[
\sigma_1 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \tag{6.22}
\]
used in non-relativistic quantum mechanics to describe particles of spin $\frac{1}{2}$ satisfy the anticommutation rules

\[
\{\sigma_i, \sigma_j\} = 2\delta_{ij}, \tag{6.23}
\]

similar to (6.11). However, it is impossible to find a fourth independent matrix which anticommutes with the $\sigma_i$. In fact the Pauli matrices, together with the unit matrix $\mathbf{1}$, form a basis for $2 \times 2$ matrices, from which any $2 \times 2$ matrix, $M$, can be constructed according to

\[
M = M_0\mathbf{1} + M_i\sigma_i
\]

with
\[
\mathbf{1} = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}.
\]
A matrix $M$ that is independent of $\sigma_i$ must have $M_0 \neq 0$, but in this case it obviously cannot anticommute with the $\sigma_i$. The smallest dimension which the matrices $\alpha_i$ and $\beta$ can have is $N = 4$. It can easily be verified that the $4 \times 4$ matrices (written in terms of $2 \times 2$ blocks)

\[
\alpha_i = \begin{pmatrix}0 & \sigma_i\\ \sigma_i & 0\end{pmatrix}, \qquad
\beta = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \tag{6.24}
\]

satisfy (6.11)–(6.13).
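
The verification mentioned above can be carried out mechanically. The following sketch builds the matrices (6.24) from the Pauli matrices (6.22) and checks the conditions (6.11) to (6.13) together with the trace conditions (6.20) and (6.21). It uses plain nested lists; all helper names are hypothetical and chosen only for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-by-entry sum of two matrices."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [r1 + r2 for r1, r2 in zip(a, b)] + [r1 + r2 for r1, r2 in zip(c, d)]


def diagonal(n, value):
    """n x n matrix with `value` on the diagonal and zeros elsewhere."""
    return [[value if i == j else 0 for j in range(n)] for i in range(n)]


def anticommutator(a, b):
    """Matrix anticommutator {a, b} = ab + ba."""
    return add(matmul(a, b), matmul(b, a))


Z2 = diagonal(2, 0)
I2 = diagonal(2, 1)
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]   # (6.22)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]                          # (6.24)
BETA = block(I2, Z2, Z2, diagonal(2, -1))                             # (6.24)


def satisfies_dirac_algebra(alpha, beta):
    """Check (6.11), (6.12) and (6.13) for the given matrices."""
    n = len(beta)
    for i in range(3):
        for j in range(3):
            expected = diagonal(n, 2 if i == j else 0)
            if not close(anticommutator(alpha[i], alpha[j]), expected):
                return False
        if not close(anticommutator(alpha[i], beta), diagonal(n, 0)):
            return False
    return close(matmul(beta, beta), diagonal(n, 1))


traces = [sum(m[i][i] for i in range(4)) for m in ALPHA + [BETA]]
print(satisfies_dirac_algebra(ALPHA, BETA), traces)
```

Calling `satisfies_dirac_algebra(SIGMA, I2)` returns `False`, which is one concrete instance of the statement that the $2 \times 2$ case does not work: the unit matrix commutes, rather than anticommutes, with the $\sigma_i$.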

The representation of the $\gamma^\mu$ matrices obtained from (6.24), known as the Pauli representation, is not unique. Given a non-singular matrix $S$, the new matrices
\[
\tilde{\gamma}^\mu = S^{-1}\gamma^\mu S \tag{6.25}
\]
satisfy the same anticommutation rules as $\gamma^\mu$:

\[
\{\tilde{\gamma}^\mu, \tilde{\gamma}^\nu\} = S^{-1}\gamma^\mu SS^{-1}\gamma^\nu S + S^{-1}\gamma^\nu SS^{-1}\gamma^\mu S = S^{-1}\{\gamma^\mu, \gamma^\nu\}S = 2g^{\mu\nu}.
\]

It can be shown that, given two sets of $\gamma^\mu$ and $\tilde{\gamma}^\mu$ matrices which satisfy the anticommutation rules (6.15), they are always connected by a transformation of the type (6.25), with a particular non-singular matrix $S$.

6.1.1 Spin
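The Hamiltonian operator for the Dirac equation

\[
H = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m \tag{6.26}
\]

does not commute with the orbital angular momentum

\[
\mathbf{L} = \mathbf{x}\times\mathbf{p}. \tag{6.27}
\]

Using the commutation rules $[p_i, x_j] = -i\delta_{ij}$ gives, for example:

\[
\begin{aligned}
[H, L_3] &= [\alpha_ip_i,\, x_1p_2 - x_2p_1]\\
&= \alpha_1p_2[p_1, x_1] - \alpha_2p_1[p_2, x_2]\\
&= -i(\alpha_1p_2 - \alpha_2p_1) \neq 0.
\end{aligned}
\]

The constants of the motion associated with the invariance under rotation of (6.9) are the components of the total angular momentum, defined as
\[
\mathbf{J} = \mathbf{L} + \frac{1}{2}\boldsymbol{\Sigma}; \qquad
\boldsymbol{\Sigma} = \begin{pmatrix}\boldsymbol{\sigma} & 0\\ 0 & \boldsymbol{\sigma}\end{pmatrix}, \tag{6.28}
\]

which do commute with the Hamiltonian (6.26). To see this, we define the antisymmetric tensor
\[
\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu], \tag{6.29}
\]
whose components $\sigma^{ij}$ ($i, j = 1, 2, 3$) are
\[
\sigma^{ij} = -\frac{i}{2}\begin{pmatrix}[\sigma^i, \sigma^j] & 0\\ 0 & [\sigma^i, \sigma^j]\end{pmatrix}
= \epsilon^{ijk}\begin{pmatrix}\sigma^k & 0\\ 0 & \sigma^k\end{pmatrix}
= \epsilon^{ijk}\Sigma^k.
\]
Thus we obtain
\[
\Sigma_3 = \frac{i}{2}(\gamma_1\gamma_2 - \gamma_2\gamma_1) = -\frac{i}{2}(\alpha_1\alpha_2 - \alpha_2\alpha_1).
\]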

In this form we can easily calculate the commutator of $H$ with $\Sigma_3$:

\[
\left[H, \frac{1}{2}\Sigma_3\right] = -\frac{i}{4}[\alpha_ip_i,\, \alpha_1\alpha_2 - \alpha_2\alpha_1] = -i(\alpha_2p_1 - \alpha_1p_2) = -[H, L_3],
\]
implying
\[
[H, J_3] = \left[H, L_3 + \frac{1}{2}\Sigma_3\right] = 0.
\]
The eigenvalues of $\Sigma_3$ are $\pm 1$. Equation (6.28) shows that the Dirac equation describes particles with spin $\frac{1}{2}$.
When the momentum of the particle is non-zero, while the projection of the spin along an arbitrary axis is not conserved, as we have just seen, the projection of the spin along the direction of motion commutes with the Hamiltonian. This quantity is given the name helicity, and is described by the operator
\[
\sigma_p = \frac{\boldsymbol{\Sigma}\cdot\mathbf{p}}{|\mathbf{p}|}. \tag{6.30}
\]
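
Both statements, that $\Sigma_3$ alone is not conserved and that the helicity is, can be checked numerically for a fixed momentum. The sketch below treats $\mathbf{p}$ as a set of numbers, which is legitimate for a momentum eigenstate; the momentum components, the mass and the helper names are arbitrary assumptions for illustration.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return combine([1, -1], [matmul(a, b), matmul(b, a)])


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [r1 + r2 for r1, r2 in zip(a, b)] + [r1 + r2 for r1, r2 in zip(c, d)]


Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]              # (6.24)
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])              # (6.24)
SPIN = [block(s, Z2, Z2, s) for s in SIGMA]               # Sigma of (6.28)
ZERO = [[0] * 4 for _ in range(4)]

P = (0.3, -1.2, 0.5)   # hypothetical momentum components
M = 1.0                # hypothetical mass
H = combine(list(P) + [M], ALPHA + [BETA])                # (6.26)
P_NORM = math.sqrt(sum(c * c for c in P))
HELICITY = combine([c / P_NORM for c in P], SPIN)         # (6.30)

helicity_conserved = close(commutator(H, HELICITY), ZERO)
spin3_commutator = commutator(H, SPIN[2])
# [H, Sigma_3] = 2i (alpha_1 p_2 - alpha_2 p_1), twice the commutator with Sigma_3 / 2
expected = combine([2j * P[1], -2j * P[0]], [ALPHA[0], ALPHA[1]])

print(helicity_conserved, close(spin3_commutator, expected))
```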

Comment. The appearance of spin explains why the wave function which satisfies the Dirac equation must be a multidimensional vector. However for spin $\frac{1}{2}$ a two-component wave function was expected, while the minimum dimension of the Dirac matrices is $N = 4$. The doubling of the components is due to the necessity of the presence of an antiparticle, as we shall see later.

6.1.2 Relativistic Invariance

We would like to show that, if $\psi(x)$ satisfies the Dirac equation in a given frame of reference $O$, the wave function determined by an observer in another system $O'$ satisfies the Dirac equation in $O'$.
This is similar to what happens in the case of the electromagnetic field tensor: the components of $\mathbf{E}$ and $\mathbf{B}$ change from one frame to the other, but the form of Maxwell's equations remains invariant.
We consider the homogeneous Lorentz transformation from $O$ to $O'$

\[
x'^\mu = \Lambda^\mu{}_\nu x^\nu.
\]

Correspondingly, the components of $\psi$ should transform linearly, to respect the superposition principle, with a matrix which depends on the transformation $\Lambda$

\[
\psi'(x') = S(\Lambda)\psi(x). \tag{6.31}
\]

The dependence of $S$ on $\Lambda$ must be such as to respect the rule for combining Lorentz transformations, at least for those transformations close to the identity:
\[
S(\Lambda_1\Lambda_2) = S(\Lambda_1)S(\Lambda_2). \tag{6.32}
\]

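Note that we do not know a priori the form of $S(\Lambda)$. The relativistic invariance of the Dirac equation requires that it should be possible to determine $S(\Lambda)$ so that:
• the transformations are in accord with the combination rule (6.32),
• they lead to a $\psi'$ which satisfies the Dirac equation in $O'$ if $\psi$ satisfies it in $O$.
We now consider the Dirac equation in $O$:

\[
\begin{aligned}
0 &= \left(i\gamma^\lambda\frac{\partial}{\partial x^\lambda} - m\right)\psi(x) = \left(i\gamma^\lambda\frac{\partial}{\partial x^\lambda} - m\right)S^{-1}(\Lambda)\psi'(x')\\
&= \left(i\gamma^\lambda\frac{\partial x'^\mu}{\partial x^\lambda}\frac{\partial}{\partial x'^\mu} - m\right)S^{-1}(\Lambda)\psi'(x')\\
&= \left(i\gamma^\lambda\Lambda^\mu{}_\lambda\frac{\partial}{\partial x'^\mu} - m\right)S^{-1}(\Lambda)\psi'(x').
\end{aligned} \tag{6.33}
\]
Multiplying by the matrix $S(\Lambda)$, we obtain:

\[
\left(i\tilde{\gamma}^\mu\frac{\partial}{\partial x'^\mu} - m\right)\psi'(x') = 0; \qquad
\tilde{\gamma}^\mu = \Lambda^\mu{}_\nu S(\Lambda)\gamma^\nu S^{-1}(\Lambda). \tag{6.34}
\]

Equation (6.34) agrees with the Dirac equation in system $O'$ if the matrices $\tilde{\gamma}^\mu$ are identical to $\gamma^\mu$, or if $S(\Lambda)$ satisfies the relation:

\[
S^{-1}(\Lambda)\gamma^\mu S(\Lambda) = \Lambda^\mu{}_\nu\gamma^\nu. \tag{6.35}
\]

To solve (6.35), we restrict ourselves to infinitesimal transformations, which we have seen in (3.85) to be of the form (cf. Section 3.6):

\[
\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \epsilon^\mu{}_\nu,
\]

with

\[
\epsilon^\mu{}_\nu = g^{\mu\alpha}\epsilon_{\alpha\nu}; \qquad \epsilon_{\alpha\beta} = -\epsilon_{\beta\alpha}. \tag{6.36}
\]

Now we write
\[
S = 1 + T, \qquad S^{-1} = 1 - T
\]
with $T$ infinitesimal:
\[
T = \frac{1}{2}\epsilon_{\alpha\beta}T^{\alpha\beta}, \tag{6.37}
\]
and $T^{\alpha\beta}$ antisymmetric.
Substituting in equation (6.35) we obtain (to first order in $\epsilon$)

\[
\Lambda^\mu{}_\nu\gamma^\nu = \gamma^\mu + \epsilon^\mu{}_\nu\gamma^\nu = S^{-1}\gamma^\mu S = \gamma^\mu + (\gamma^\mu T - T\gamma^\mu),
\]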

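which gives (using (6.37) and (6.36))

\[
g^{\alpha\mu}\gamma^\beta - g^{\beta\mu}\gamma^\alpha = \left[\gamma^\mu, T^{\alpha\beta}\right]. \tag{6.38}
\]
Equation (6.38) has as solution the antisymmetric tensor (compare with (6.29))
\[
T^{\alpha\beta} = \frac{1}{4}\left[\gamma^\alpha, \gamma^\beta\right] = -\frac{i}{2}\sigma^{\alpha\beta}.
\]
The transformation $S$ that we seek is therefore:
\[
S = 1 - \frac{i}{4}\epsilon_{\alpha\beta}\sigma^{\alpha\beta}, \tag{6.39}
\]
and we note the property
\[
\gamma^0S^\dagger\gamma^0 = S^{-1}, \tag{6.40}
\]
which is shown using
\[
\sigma^{\mu\nu\dagger} = \gamma^0\sigma^{\mu\nu}\gamma^0, \tag{6.41}
\]
and which implies
\[
\gamma^0S^\dagger\gamma^0 = 1 + \frac{i}{4}\epsilon_{\mu\nu}\gamma^0\sigma^{\mu\nu\dagger}\gamma^0 = 1 + \frac{i}{4}\epsilon_{\mu\nu}\sigma^{\mu\nu} = S^{-1}. \tag{6.42}
\]
Note that (6.40) is valid for any proper Lorentz transformation which can be obtained as a product of infinitesimal transformations. To demonstrate this we consider the case in which
\[
S = S_1S_2, \qquad S^{-1} = S_2^{-1}S_1^{-1},
\]
with $S_1$ and $S_2$ infinitesimal transformations. It follows from (6.40) that

\[
\gamma^0S^\dagger\gamma^0 = \gamma^0S_2^\dagger S_1^\dagger\gamma^0 = \gamma^0S_2^\dagger\gamma^0\gamma^0S_1^\dagger\gamma^0 = S_2^{-1}S_1^{-1} = S^{-1}.
\]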
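
The solution for $T^{\alpha\beta}$ can be tested directly on the explicit matrices. The sketch below checks, for all index values, the commutator identity in the form $[\gamma^\mu, T^{\alpha\beta}] = g^{\alpha\mu}\gamma^\beta - g^{\beta\mu}\gamma^\alpha$ with $T^{\alpha\beta} = \frac{1}{4}[\gamma^\alpha, \gamma^\beta]$, and the relation $\gamma^0 T^{\alpha\beta\dagger}\gamma^0 = -T^{\alpha\beta}$, which is (6.40) to first order in $\epsilon$. The representation (6.24), the metric signature $(+,-,-,-)$ and the helper names are assumptions of this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lin(a, b, s=1):
    """Return a + s * b entry by entry."""
    return [[x + s * y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the number c."""
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [r1 + r2 for r1, r2 in zip(a, b)] + [r1 + r2 for r1, r2 in zip(c, d)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return lin(matmul(a, b), matmul(b, a), -1)


Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
BETA = block(I2, Z2, Z2, scale(-1, I2))
GAMMA = [BETA] + [matmul(BETA, block(Z2, s, s, Z2)) for s in SIGMA]   # (6.14)
METRIC = [1, -1, -1, -1]                                              # diagonal of g


def g(a, b):
    """Metric tensor component g^{ab}."""
    return METRIC[a] if a == b else 0


T = [[scale(0.25, commutator(GAMMA[a], GAMMA[b])) for b in range(4)]
     for a in range(4)]


def commutator_identity_holds():
    """[gamma^mu, T^{ab}] = g^{a mu} gamma^b - g^{b mu} gamma^a for all indices."""
    for mu in range(4):
        for a in range(4):
            for b in range(4):
                lhs = commutator(GAMMA[mu], T[a][b])
                rhs = lin(scale(g(a, mu), GAMMA[b]), scale(g(b, mu), GAMMA[a]), -1)
                if not close(lhs, rhs):
                    return False
    return True


def first_order_pseudounitarity_holds():
    """gamma^0 T^dagger gamma^0 = -T, i.e. (6.40) to first order in epsilon."""
    g0 = GAMMA[0]
    return all(close(matmul(matmul(g0, dagger(T[a][b])), g0), scale(-1, T[a][b]))
               for a in range(4) for b in range(4))


print(commutator_identity_holds(), first_order_pseudounitarity_holds())
```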

The Adjoint Spinor. In general $\psi$ is a complex wave function. Alongside $\psi$ we can introduce the complex conjugate spinor $\psi^*$. If we consider $\psi$ as a column vector, see (6.10), we can introduce the spinor $\psi^\dagger$, a row vector which has the elements of $\psi^*$ as components:
\[
\psi^\dagger = (\psi^*)^T. \tag{6.43}
\]
We note that $\psi^\dagger\psi$ is not invariant under Lorentz transformations, since the matrix $S(\Lambda)$ is not unitary. An invariant can be constructed by considering the adjoint spinor
\[
\bar{\psi} = \psi^\dagger\gamma^0. \tag{6.44}
\]
Using equation (6.42) it can be seen that
\[
\bar{\psi}'(x') = \psi^\dagger(x)S^\dagger\gamma^0 = \bar{\psi}(x)S^{-1} \tag{6.45}
\]
so that:
\[
(\bar{\psi}\psi)'(x') = \bar{\psi}(x)S^{-1}S\psi(x) = (\bar{\psi}\psi)(x). \tag{6.46}
\]

Covariant Bilinear Forms. Multiplying together two or more $\gamma$ matrices generates a matrix algebra. Because the symmetric product of two $\gamma$ matrices is $\pm I$, we can limit our considerations to the products which are antisymmetric in their Lorentz indices. Thus fifteen $4 \times 4$ matrices which are linearly independent are found (the four $\gamma$ matrices, six products of two $\gamma$, four of three and one of four) which together with the identity matrix make up the Dirac algebra
\[
\Gamma^S = I, \qquad \Gamma^V_\mu = \gamma_\mu, \qquad \Gamma^T_{\mu\nu} = \sigma_{\mu\nu}, \qquad
\Gamma^P = \gamma_5 = \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3, \qquad \Gamma^A_\mu = \gamma_\mu\gamma^5.
\]

We note that $\gamma_5$ is hermitian while relations similar to (6.17) hold for the other matrices of the algebra

\[
\Gamma = \gamma^0\Gamma^\dagger\gamma^0. \tag{6.47}
\]
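
The hermiticity properties of the sixteen matrices can be checked explicitly. The sketch below builds them in the representation (6.24), which is an assumption of the illustration, as are the helper names, and tests (6.47) for the identity, $\gamma^\mu$, $\sigma^{\mu\nu}$ and $\gamma^\mu\gamma^5$. For $\gamma^5$ itself, which is hermitian and anticommutes with $\gamma^0$, the same operation gives $-\gamma^5$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lin(a, b, s=1):
    """Return a + s * b entry by entry."""
    return [[x + s * y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a by the number c."""
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [r1 + r2 for r1, r2 in zip(a, b)] + [r1 + r2 for r1, r2 in zip(c, d)]


Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
BETA = block(I2, Z2, Z2, scale(-1, I2))
GAMMA = [BETA] + [matmul(BETA, block(Z2, s, s, Z2)) for s in SIGMA]   # (6.14)

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
G5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))
# sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu] for mu < nu, equation (6.29)
SIGMA_MUNU = [scale(0.5j, lin(matmul(GAMMA[m], GAMMA[n]), matmul(GAMMA[n], GAMMA[m]), -1))
              for m in range(4) for n in range(m + 1, 4)]
AXIAL = [matmul(gm, G5) for gm in GAMMA]


def bar_conjugate(m):
    """gamma^0 m^dagger gamma^0, the right-hand side of (6.47)."""
    return matmul(matmul(GAMMA[0], dagger(m)), GAMMA[0])


others = [I4] + GAMMA + SIGMA_MUNU + AXIAL            # 1 + 4 + 6 + 4 = 15 matrices
relation_holds = all(close(bar_conjugate(m), m) for m in others)
g5_hermitian = close(dagger(G5), G5)
g5_changes_sign = close(bar_conjugate(G5), scale(-1, G5))

print(relation_holds, g5_hermitian, g5_changes_sign)
```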

Using the results of the previous paragraph one easily obtains transformation rules for the bilinear forms of the type

\[
\psi^\dagger\gamma^0\Gamma\psi = \bar{\psi}\Gamma\psi, \tag{6.48}
\]

in terms of the adjoint spinor $\bar{\psi}$, which transforms according to (6.45). The covariant bilinear forms (6.48) play an important role, because they have definite transformation properties under Lorentz transformations. They are the basic ingredients with which to construct observable quantities and invariant Lagrangian densities.
We consider, for example, the continuity equation (6.6) which is obtained from the Dirac equation. It is easily shown that

\[
\rho = \psi^\dagger\psi = \bar{\psi}\gamma^0\psi, \qquad \mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi = \bar{\psi}\boldsymbol{\gamma}\psi.
\]

The bilinear form $\bar{\psi}\gamma^\mu\psi$ is a 4-current associated with the particle described by the Dirac equation and $j^0 = \rho$ is the probability density, whose volume integral is a conserved quantity.
We can now proceed to determine the transformation rules for the bilinear forms.
• $\bar{\psi}\Gamma^S\psi = \bar{\psi}\psi$ transforms like a scalar, because

\[
\bar{\psi}'\psi' = \bar{\psi}S^{-1}S\psi = \bar{\psi}\psi.
\]

• $\bar{\psi}\Gamma^\mu_V\psi = \bar{\psi}\gamma^\mu\psi$ transforms like the components of a covariant four-vector, because
\[
\bar{\psi}'\gamma^\mu\psi' = \bar{\psi}S^{-1}\gamma^\mu S\psi = \Lambda^\mu{}_\nu\bar{\psi}\gamma^\nu\psi.
\]

• $\bar{\psi}\Gamma^{\mu\nu}_T\psi = \bar{\psi}\sigma^{\mu\nu}\psi = \bar{\psi}\left(i[\gamma^\mu, \gamma^\nu]/2\right)\psi$ transforms like the elements of an antisymmetric tensor, because

\[
\bar{\psi}'\gamma^\mu\gamma^\nu\psi' = \bar{\psi}S^{-1}\gamma^\mu SS^{-1}\gamma^\nu S\psi = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\bar{\psi}\gamma^\alpha\gamma^\beta\psi.
\]

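• $\bar{\psi}\Gamma^P\psi = \bar{\psi}\gamma^5\psi$ transforms according to

\[
\bar{\psi}'\gamma^5\psi' = \det(\Lambda)\,\bar{\psi}\gamma^5\psi,
\]
that is, it transforms like a scalar in the case of proper Lorentz transformations ($\det(\Lambda) = 1$) but changes sign in the case of parity transformations $(x^0, x^i) \to (x^0, -x^i)$ whose determinant equals $-1$. Therefore $\bar{\psi}\gamma^5\psi$ is a pseudoscalar density.
The transformation rule is obtained using the definition
\[
\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \frac{i}{4!}\epsilon_{\mu\nu\alpha\beta}\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta,
\]
where $\epsilon_{\mu\nu\alpha\beta}$ is the unit antisymmetric tensor with four indices. Thus, one finds
\[
\begin{aligned}
S^{-1}\gamma^5S &= \frac{i}{4!}\epsilon_{\mu\nu\alpha\beta}\Lambda^\mu{}_\delta\Lambda^\nu{}_\lambda\Lambda^\alpha{}_\sigma\Lambda^\beta{}_\rho\,\gamma^\delta\gamma^\lambda\gamma^\sigma\gamma^\rho\\
&= \frac{i}{4!}\det(\Lambda)\,\epsilon_{\delta\lambda\sigma\rho}\gamma^\delta\gamma^\lambda\gamma^\sigma\gamma^\rho = \det(\Lambda)\gamma^5.
\end{aligned}
\]
• $\bar{\psi}\Gamma^\mu_A\psi = \bar{\psi}\gamma^5\gamma^\mu\psi$ transforms according to

\[
\bar{\psi}'\gamma^5\gamma^\mu\psi' = \det(\Lambda)\,\Lambda^\mu{}_\nu\bar{\psi}\gamma^5\gamma^\nu\psi,
\]
that is, like the components of a four-vector in the case of proper Lorentz transformations and in the opposite way under parity transformations.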

Comment 1. A correspondence $\Lambda \to S(\Lambda)$ between the elements of a group and a set of matrices which are subject to the same composition rule as the group defines a representation of the group. Equations (6.31) and (6.32) define a representation of the Lorentz group, to be added to the irreducible tensors already discussed in Section 3.3. According to the classification considered there, the four-dimensional representation of the Dirac spinors corresponds to $(0, 1/2) \oplus (1/2, 0)$. This is a reducible representation, where a non-trivial matrix ($\gamma_5$) exists which commutes with all the generators of the group. The tensor product of an odd number of spinor representations generates a new series of representations which, from the point of view of rotations, contains representations of half-integer spin.

Comment 2. According to (6.40) the matrices $S(\Lambda)$ are pseudounitary, but not unitary. In fact it can be shown that representations of the Lorentz group by unitary operators are necessarily infinite-dimensional. This is due to the fact that $L^\uparrow_+$ is a non-compact group: the spaces of parameters which describe the transformations make up a non-compact set, unlike the rotations. For these, the matrix elements are bounded functions of the rotation angles, within the interval $(0, 2\pi)$. Conversely, the parameter $\gamma = 1/\sqrt{1 - \beta^2}$, which appears in $\Lambda^\mu{}_\nu$, is unbounded.

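Comment 3. According to quantum mechanics, the squared modulus of the scalar product of two states is an observable probability, so it must take the same value in every frame of reference:

$$|\langle \Lambda B|\Lambda A\rangle|^2 = |\langle B|A\rangle|^2 . \tag{6.49}$$

Again, according to quantum mechanics, the transformed states are obtained by applying a linear operator U(Λ), which represents the Lorentz transformation Λ. Therefore it must be true that:

$$|\langle B|U(\Lambda)^\dagger U(\Lambda)|A\rangle|^2 = |\langle B|A\rangle|^2 . \tag{6.50}$$

Wigner showed that (10.39) has two possible solutions: U(Λ) is either unitary or antiunitary.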

Comment 4. Given the non-unitary nature of the S(Λ) matrices, it is in-

teresting to ask in what way the scalar products between states which are
solutions of the Dirac equation can provide a unitary representation of the
Lorentz group, as required by the considerations of the previous comment. In
Dirac’s theory, for two states |A > and |B >, we have:

$$\langle A|B\rangle = \int d^3x\;\psi_A^*(\mathbf{x},t)\,\psi_B(\mathbf{x},t) .$$

The density inside the integral is not invariant. Instead, the form of S(Λ)
implies that it should be the time component of a 4-current, conserved by the
Dirac equation:
$$\psi_A^*(\mathbf{x},t)\,\psi_B(\mathbf{x},t) = \bar\psi_A(\mathbf{x},t)\,\gamma^0\,\psi_B(\mathbf{x},t) = J^0_{A,B} ; \qquad \partial_\mu J^\mu_{A,B} = 0 .$$

But as we saw in Section 3.5, the space integral of the time component
of a conserved 4-current is a relativistic invariant. Therefore, for any Lorentz
transformation:
$$\langle \Lambda A|\Lambda B\rangle = \langle A|B\rangle$$

which is exactly the invariance condition for the scalar product. The non-invariance of $\psi^*\psi$ is in the proportion required to balance exactly the non-invariance of the size of $d^3x$, so that the S(Λ) matrices impose on the physical states a unitary representation of the Lorentz group. This result will be derived explicitly in Section 7.4.

6.1.3 Boost
This expression denotes a specific Lorentz transformation, which corresponds to the passage from the given frame of reference, O, to a frame O′ moving along the positive x-axis at a velocity β with respect to O.
The transformation rule between the coordinates of O and O′ involves only $x^0$ and $x^1$, and is written:

$$x'^0 = \Lambda^0{}_0\,x^0 + \Lambda^0{}_1\,x^1 = \gamma(t - \beta x)$$

$$x'^1 = \Lambda^1{}_0\,x^0 + \Lambda^1{}_1\,x^1 = \gamma(-\beta t + x)$$

where:

γ = (1 − β 2 )−1/2 = cosh θ (6.52)

and θ is the rapidity, with:

β = tanh θ. (6.53)

The origin of O′, x′ = 0, moves according to x = βt, as it should. Setting β infinitesimal:

$$\beta = \delta\theta$$

and defining the infinitesimal parameters of the transformation according to (6.36), we obtain

$$\epsilon_{1,0} = -\epsilon_{0,1} = \delta\theta .$$

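The matrix S(Λ), which determines the transformation between ψ(x) and ψ′(x′), is therefore, according to (6.39):

$$S(\Lambda) = 1 - \frac{i}{4}\,\epsilon_{\alpha\beta}\,\sigma^{\alpha\beta} = 1 - \frac{i\,\delta\theta}{2}\,\sigma^{10} . \tag{6.54}$$

Using the definition:
$$\sigma^{10} = \frac{i}{2}[\gamma^1,\gamma^0] = i\gamma^1\gamma^0 = -i\alpha_1$$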

we find:
$$S(\Lambda) = 1 - \frac{\delta\theta}{2}\,\alpha_1 . \tag{6.55}$$
Transformations with finite rapidity, $\theta = \tanh^{-1}\beta$, are obtained by combining infinitesimal transformations, which corresponds to multiplying together the relevant matrices (6.55). We put:

$$\delta\theta = \frac{\theta}{N} ; \qquad (N\ \text{large})$$

and find:

$$S(\Lambda) = \left(1 - \frac{\theta}{2N}\,\alpha_1\right)^N \to e^{-\frac{\theta}{2}\alpha_1} . \tag{6.56}$$

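The exponential with $\alpha_1$ is easily expressed in elementary terms, because $(\alpha_1)^2 = 1$. Expanding in series, we obtain:

$$e^{-\frac{\theta}{2}\alpha_1} = \sum_{n=0}^{\infty}\frac{1}{n!}\left(-\frac{\theta}{2}\,\alpha_1\right)^n = \sum_{k=0}^{\infty}\frac{1}{(2k)!}\left(-\frac{\theta}{2}\right)^{2k} + \alpha_1\sum_{k=0}^{\infty}\frac{1}{(2k+1)!}\left(-\frac{\theta}{2}\right)^{2k+1} = \cosh\frac{\theta}{2} - \alpha_1\sinh\frac{\theta}{2} \tag{6.57}$$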
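and finally, using the well-known hyperbolic relationships

$$\cosh^2\frac{\theta}{2} = \frac{\cosh\theta + 1}{2} ; \qquad \sinh^2\frac{\theta}{2} = \frac{\cosh\theta - 1}{2} , \tag{6.58}$$

we obtain:

$$S(\Lambda) = \cosh\frac{\theta}{2}\begin{pmatrix} 1 & -\tanh\frac{\theta}{2}\,\sigma_1 \\ -\tanh\frac{\theta}{2}\,\sigma_1 & 1 \end{pmatrix} = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & -\frac{\beta\gamma}{\gamma+1}\,\sigma_1 \\ -\frac{\beta\gamma}{\gamma+1}\,\sigma_1 & 1 \end{pmatrix} . \tag{6.59}$$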

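Equation (6.59) can be generalised immediately to the case in which the velocity is directed along a general vector, n:

$$S(\Lambda) = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & -\frac{\beta\gamma}{\gamma+1}\,(\mathbf{n}\cdot\boldsymbol{\sigma}) \\ -\frac{\beta\gamma}{\gamma+1}\,(\mathbf{n}\cdot\boldsymbol{\sigma}) & 1 \end{pmatrix} . \tag{6.60}$$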
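The chain (6.56)–(6.59) can be checked numerically. The sketch below is only an illustration, assuming the Dirac representation in which α1 has the Pauli matrix σ1 in its off-diagonal blocks; the function names and the sample value β = 0.6 are hypothetical choices, not taken from the text. It sums the power series of the exponential and compares it with the closed form written in terms of β and γ.

```python
import math

# alpha_1 in the Dirac representation (assumed): [[0, sigma_1], [sigma_1, 0]]
ALPHA1 = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def boost_from_series(theta, terms=40):
    """Sum the power series of exp(-(theta/2) * alpha_1), as in (6.57)."""
    x = [[-0.5 * theta * v for v in row] for row in ALPHA1]
    total = identity(4)
    term = identity(4)
    for n in range(1, terms):
        # term now holds x^n / n!
        term = [[v / n for v in row] for row in matmul(term, x)]
        total = [[s + t for s, t in zip(srow, trow)]
                 for srow, trow in zip(total, term)]
    return total

def boost_closed_form(beta):
    """Closed form (6.59): sqrt((gamma+1)/2) * (1 - beta*gamma/(gamma+1) * alpha_1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    prefactor = math.sqrt((gamma + 1.0) / 2.0)
    mixing = beta * gamma / (gamma + 1.0)
    one = identity(4)
    return [[prefactor * (one[i][j] - mixing * ALPHA1[i][j]) for j in range(4)]
            for i in range(4)]

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

beta = 0.6
theta = math.atanh(beta)  # rapidity, beta = tanh(theta)
print(max_abs_difference(boost_from_series(theta), boost_closed_form(beta)))
```

The printed difference should be at the level of floating-point rounding, which is what the identity $(\alpha_1)^2 = 1$ guarantees.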

6.1.4 Solutions of the Dirac Equation for a Free Particle

The Dirac equation, in the form (6.9) or (6.18), has solutions in the form of relativistic plane waves. We write

$$\psi(x) = u(p)\,e^{-i(px)} ; \qquad p^\mu = (E, \mathbf{p}) . \tag{6.61}$$

The equation for u(p) takes the form:

$$(\slashed{p} - m)\,u(p) = (p_\mu\gamma^\mu - m)\,u(p) = 0 \tag{6.62}$$

where u is a four-component spinor.

To begin, we consider solutions of the Dirac equation for a particle at rest. In this case (6.62) reduces to

$$(\gamma^0 E - m)\,u(p) = 0 , \tag{6.63}$$

or, explicitly:

$$\begin{pmatrix} E-m & 0 & 0 & 0 \\ 0 & E-m & 0 & 0 \\ 0 & 0 & -E-m & 0 \\ 0 & 0 & 0 & -E-m \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = 0 . \tag{6.64}$$

Equation (6.64) has eigenvalues $E^{(1)} = E^{(2)} = m$ and $E^{(3)} = E^{(4)} = -m$, whose corresponding eigenvectors are:

$$u^{(1)} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \quad u^{(2)} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} , \quad u^{(3)} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} , \quad u^{(4)} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} . \tag{6.65}$$

Like the Klein–Gordon equation, the Dirac equation also has negative energy, as well as positive energy, solutions. The two positive energy solutions correspond to the two possible states of a particle with spin 1/2. We will later discuss the meaning of the solutions with negative energy.
In the general case of $\mathbf{p} \neq 0$ we write the 4-spinor $u^{(r)}(p)$ in terms of two two-component spinors $u_A$ and $u_B$

$$u^{(r)}(p) = \begin{pmatrix} u_A \\ u_B \end{pmatrix} . \tag{6.66}$$

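From (6.62) we obtain the matrix equation

$$\begin{pmatrix} E-m & -\boldsymbol{\sigma}\cdot\mathbf{p} \\ \boldsymbol{\sigma}\cdot\mathbf{p} & -E-m \end{pmatrix}\begin{pmatrix} u_A \\ u_B \end{pmatrix} = 0 , \tag{6.67}$$

which has non-trivial solutions for

$$E^2 - m^2 = (\boldsymbol{\sigma}\cdot\mathbf{p})^2 = \mathbf{p}^2 . \tag{6.68}$$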

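Rewriting (6.67) in the form

$$u_A = \frac{(\boldsymbol{\sigma}\cdot\mathbf{p})}{E-m}\,u_B \tag{6.69}$$
$$u_B = \frac{(\boldsymbol{\sigma}\cdot\mathbf{p})}{E+m}\,u_A , \tag{6.70}$$

it is immediately seen that the positive energy solutions can be obtained from (6.70), by choosing $u_A$ equal to one of the Pauli two-component spinors

$$\chi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad \chi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} . \tag{6.71}$$

The negative energy solutions are obtained in a similar way starting from (6.69). We can therefore write the four solutions to (6.62) in the form

$$u_r^{(+)}(p) = \begin{pmatrix} \chi_r \\ \frac{(\boldsymbol{\sigma}\cdot\mathbf{p})}{E_p+m}\,\chi_r \end{pmatrix} , \qquad u_r^{(-)}(p) = \begin{pmatrix} -\frac{(\boldsymbol{\sigma}\cdot\mathbf{p})}{E_p+m}\,\chi_r \\ \chi_r \end{pmatrix} , \tag{6.72}$$

with r = 1, 2 and $E_p = +\sqrt{\mathbf{p}^2 + m^2} = |E|$. Obviously, for $\mathbf{p} \to 0$, (6.72) reduces to (6.65).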
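As a numerical cross-check of (6.67)–(6.72), the following sketch builds the four spinors for a sample momentum and verifies that they are eigenvectors of H = α · p + βm with eigenvalues ±E_p. It assumes the Dirac representation, in which H has the 2×2 block form [[m, σ·p], [σ·p, −m]] implied by (6.67); the function names and the sample values of p and m are illustrative only.

```python
import math

def sigma_dot(p):
    """2x2 matrix sigma . p for a 3-vector p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def free_spinors(p, m):
    """Return E_p and the lists [u_1^(+), u_2^(+)], [u_1^(-), u_2^(-)] of (6.72)."""
    e_p = math.sqrt(sum(c * c for c in p) + m * m)
    plus, minus = [], []
    for chi in ([1.0, 0.0], [0.0, 1.0]):
        small = [c / (e_p + m) for c in apply(sigma_dot(p), chi)]
        plus.append(chi + small)                    # (chi, sigma.p/(E+m) chi)
        minus.append([-c for c in small] + chi)     # (-sigma.p/(E+m) chi, chi)
    return e_p, plus, minus

def dirac_hamiltonian(p, m, u):
    """Apply H = alpha . p + beta m, written in 2x2 blocks, to a 4-spinor u."""
    upper, lower = u[:2], u[2:]
    top = [m * a + s for a, s in zip(upper, apply(sigma_dot(p), lower))]
    bottom = [s - m * b for s, b in zip(apply(sigma_dot(p), upper), lower)]
    return top + bottom

def residual(p, m, u, energy):
    """Largest component of H u - E u; zero for an exact solution."""
    return max(abs(h - energy * c)
               for h, c in zip(dirac_hamiltonian(p, m, u), u))

p, m = (0.3, -0.4, 1.2), 1.0
e_p, u_plus, u_minus = free_spinors(p, m)
for u in u_plus:
    print("positive energy residual:", residual(p, m, u, +e_p))
for u in u_minus:
    print("negative energy residual:", residual(p, m, u, -e_p))
```

All four residuals should vanish up to rounding, for any choice of p and m > 0.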
To describe the negative energy states, it is helpful to introduce two new 4-spinors $v_r(p)$ (r = 1, 2), defined by

$$v_1(p) = u_2^{(-)}(-p) , \qquad v_2(p) = -u_1^{(-)}(-p) , \tag{6.73}$$

or:
$$v_r(p) = \epsilon_{rs}\,u_s^{(-)}(-p) , \qquad (r, s = 1, 2) \tag{6.74}$$
where $\epsilon_{rs}$ is the completely antisymmetric two-index tensor¹ and the sum over repeated indices is understood. Furthermore we put

$$u_r(p) = u_r^{(+)}(p) . \tag{6.75}$$

It immediately follows from the definitions that $u_r$ and $v_r$ satisfy the equations:
$$(\slashed{p} - m)\,u_r = 0 , \qquad (\slashed{p} + m)\,v_r = 0 . \tag{6.76}$$
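The normalisation of the 4-spinors is defined by writing the solution of (6.62) in the form

$$\psi(x) = N\left(\frac{m}{V E_p}\right)^{1/2} \times \begin{cases} u_r(p)\,e^{-ipx} & \text{(positive energy)} \\ v_r(p)\,e^{ipx} & \text{(negative energy)} \end{cases} \tag{6.77}$$

and requiring that
$$\int d^3x\;\psi^\dagger(x)\,\psi(x) = 1 , \tag{6.78}$$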

1 $\epsilon_{12} = -\epsilon_{21} = 1$, $\epsilon_{11} = \epsilon_{22} = 0$.


which gives
$$u_r^\dagger(p)\,u_r(p) = v_r^\dagger(p)\,v_r(p) = \frac{E_p}{m} . \tag{6.79}$$
In this way the result $N = [(E_p + m)/2m]^{1/2}$ is obtained.
With this choice of normalisation, the orthonormality relations for the 4-spinors are:
$$u_r^\dagger(p)\,u_s(p) = v_r^\dagger(p)\,v_s(p) = \frac{E_p}{m}\,\delta_{rs}$$
$$u_r^\dagger(p)\,v_s(-p) = u_r^{(+)\dagger}(p)\,u_s^{(-)}(p) = 0 \tag{6.80}$$

and, with simple formal manipulation², we obtain from (6.76) the relations

$$m\,\bar u(p)\gamma^\mu u(p) = p^\mu\,\bar u(p)u(p)$$
$$\bar u(p)\,v(p) = 0 . \tag{6.81}$$

The orthonormality conditions for u and v with respect to multiplication by the adjoint spinors follow from the first of these

$$\bar u_r(p)\,u_s(p) = -\bar v_r(p)\,v_s(p) = \delta_{rs} ,$$
$$\bar u_r(p)\,v_s(p) = \bar v_r(p)\,u_s(p) = 0 . \tag{6.82}$$

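The completeness of the set of solutions of the Dirac equation is expressed by the relation

$$\sum_r \left[(u_r)_\alpha(p)\,(\bar u_r)_\beta(p) - (v_r)_\alpha(p)\,(\bar v_r)_\beta(p)\right] = \delta_{\alpha\beta} , \tag{6.83}$$

where $(u_r)_\alpha(p)$ is the α component of the 4-spinor. Equation (6.83) is easily obtained from the 4-spinor definition, which implies

$$\sum_r (u_r)_\alpha(p)\,(\bar u_r)_\beta(p) = \left(\Lambda_p^+\right)_{\alpha\beta} = \left(\frac{\slashed{p} + m}{2m}\right)_{\alpha\beta} \tag{6.84}$$

$$-\sum_r (v_r)_\alpha(p)\,(\bar v_r)_\beta(p) = \left(\Lambda_p^-\right)_{\alpha\beta} = -\left(\frac{\slashed{p} - m}{2m}\right)_{\alpha\beta} . \tag{6.85}$$

The operators $\Lambda_p^+$ and $\Lambda_p^-$ are projectors of, respectively, the positive and negative energy states, i.e.

$$\Lambda_p^+ u_r = u_r , \qquad \Lambda_p^- v_r = v_r , \tag{6.86}$$

$$\Lambda_p^+ v_r = \Lambda_p^- u_r = 0 \tag{6.87}$$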
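The projector properties behind (6.83)–(6.87) follow from $\slashed{p}^2 = m^2$ on the mass shell, and can be verified numerically. The sketch below is an illustration under the assumption of the Dirac representation, γ⁰ = diag(1, 1, −1, −1) and γⁱ with σⁱ and −σⁱ in the off-diagonal blocks; helper names and sample values are hypothetical. It checks that Λ⁺ and Λ⁻ are idempotent, mutually orthogonal and sum to the identity.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

GAMMA0 = blocks(I2, Z2, Z2, scale(-1, I2))
GAMMAS = [blocks(Z2, s, scale(-1, s), Z2) for s in SIGMA]
ID4 = blocks(I2, Z2, Z2, I2)

def slash(energy, p):
    """p-slash = gamma^0 E - gamma . p for p^mu = (E, p)."""
    result = scale(energy, GAMMA0)
    for gamma_i, p_i in zip(GAMMAS, p):
        result = add(result, scale(-p_i, gamma_i))
    return result

def projectors(p, m):
    """Lambda^+ = (p-slash + m)/2m and Lambda^- = -(p-slash - m)/2m, on shell."""
    e_p = math.sqrt(sum(c * c for c in p) + m * m)
    p_slash = slash(e_p, p)
    lam_plus = scale(1 / (2 * m), add(p_slash, scale(m, ID4)))
    lam_minus = scale(-1 / (2 * m), add(p_slash, scale(-m, ID4)))
    return lam_plus, lam_minus

def max_abs(a):
    return max(abs(v) for row in a for v in row)

lam_plus, lam_minus = projectors((0.3, -0.4, 1.2), 1.0)
print(max_abs(add(matmul(lam_plus, lam_plus), scale(-1, lam_plus))))   # idempotent
print(max_abs(matmul(lam_plus, lam_minus)))                            # orthogonal
print(max_abs(add(add(lam_plus, lam_minus), scale(-1, ID4))))          # complete
```

Each printed number should be zero up to rounding errors.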
2 The first line of (6.81) is obtained by multiplying (6.76) on the left by $\bar u(p)\gamma^\mu$ and summing with the adjoint of the same equation multiplied from the right by $\gamma^\mu u(p)$; the second is the outcome of a similar rearrangement of the equation for v(p) in (6.76).

Starting from the complete basis of solutions to the Dirac equation, we can express any solution according to the expansion

$$\psi(x) = \sum_{\mathbf{p},r}\sqrt{\frac{m}{E(p)V}}\left[a_r(p)\,u_r(p)\,e^{-i(px)} + (b_r(p))^*\,v_r(p)\,e^{i(px)}\right] . \tag{6.88}$$

For later use, we give here the expressions for normalisation of the wave
function and the expectation values of the energy and momentum in terms of
the amplitudes of the normal modes of oscillation which appear in (6.88).
To carry out the calculations, we treat the components of the spinors u
and v as commuting numbers, but avoid exchanging the order in which the
coeﬃcients a and b appear. With the help of the orthogonality conditions
(6.80) we ﬁnd:

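$$N = \int d^3x\;\psi^\dagger(\mathbf{x},t)\,\psi(\mathbf{x},t) = \sum_{\mathbf{p},r}\left[a_r(p)(a_r(p))^* + (b_r(p))^*\,b_r(p)\right] ; \tag{6.89}$$

$$\langle E\rangle = \int d^3x\;\psi^\dagger(\mathbf{x},t)\left[-i\boldsymbol{\alpha}\cdot\nabla + \beta m\right]\psi(\mathbf{x},t) = \sum_{\mathbf{p},r} E(p)\left[a_r(p)(a_r(p))^* - (b_r(p))^*\,b_r(p)\right] ; \tag{6.90}$$

$$\langle \mathbf{P}\rangle = \int d^3x\;\psi^\dagger(\mathbf{x},t)\,(-i\nabla)\,\psi(\mathbf{x},t) = \sum_{\mathbf{p},r} \mathbf{p}\left[a_r(p)(a_r(p))^* - (b_r(p))^*\,b_r(p)\right] . \tag{6.91}$$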

From (6.90) we see that the normalisation factor m/EV has been chosen
to ensure that the energy of the oscillation mode p, r is equal to E(p) for unit
oscillation amplitude.

Solutions with p ≠ 0 and Lorentz Boost. The solutions for p ≠ 0 can be obtained with a Lorentz transformation starting from those for a particle at rest, using the representation for the Lorentz boost defined in Section 6.1.3. We denote the spinor in frame O′, introduced in Section 6.1.3, with:

$$\psi'(x') = e^{-i(p'x')}\,u'(p') \tag{6.92}$$
We then have

$$\psi(x) = e^{-i(px)}\,u(p) = S^{-1}(\Lambda)\,\psi'(x') = e^{-i(p'x')}\,S^{-1}(\Lambda)\,u'(p') \tag{6.93}$$

or, given that (p′x′) = (px):
$$u(p) = S^{-1}(\Lambda)\,u'(p') ; \qquad S^{-1}(\Lambda) = \sqrt{\frac{\gamma+1}{2}}\begin{pmatrix} 1 & \frac{\beta\gamma\sigma_1}{\gamma+1} \\ \frac{\beta\gamma\sigma_1}{\gamma+1} & 1 \end{pmatrix} . \tag{6.94}$$

If the particle is at rest in O′, the energy and momentum in O are given by

$$p = \pm m\beta\gamma ; \qquad p^0 = \pm E = \pm m\gamma \tag{6.95}$$

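for the positive and negative energy solutions. Therefore, for the first we find (compare with the definitions (6.65)–(6.72)):

$$u_r^{(+)}(p) = u(p) = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} 1 & \frac{p\sigma_1}{E+m} \\ \frac{p\sigma_1}{E+m} & 1 \end{pmatrix}\begin{pmatrix} \chi_r \\ 0 \end{pmatrix} = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} \chi_r \\ \frac{p\sigma_1}{E+m}\,\chi_r \end{pmatrix} \quad (r = 1, 2) , \tag{6.96}$$

and for the negative energy solutions:

$$u_r^{(-)}(p) = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} 1 & \frac{-p\sigma_1}{E+m} \\ \frac{-p\sigma_1}{E+m} & 1 \end{pmatrix}\begin{pmatrix} 0 \\ \chi_r \end{pmatrix} = \sqrt{\frac{E+m}{2m}}\begin{pmatrix} \frac{-p\sigma_1}{E+m}\,\chi_r \\ \chi_r \end{pmatrix}$$

$$v_r(p) = \epsilon_{rs}\,u_s^{(-)}(-p) = \epsilon_{rs}\sqrt{\frac{E+m}{2m}}\begin{pmatrix} \frac{p\sigma_1}{E+m}\,\chi_s \\ \chi_s \end{pmatrix} \quad (r, s = 1, 2) . \tag{6.97}$$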

The solutions corresponding to positive and negative energies do not mix

under a Lorentz transformation.

Note. As equation (6.96) shows, for small velocities the lower components of
u(p) are of the order of β = v/c = p/m with respect to the upper components.
In this limit, upper and lower components are referred to as the big and small
components of u(p), respectively. The roles are exchanged for v(p), as shown
by equation (6.97).

6.1.5 The Magnetic Moment of the Electron

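We consider an electron in a known electromagnetic field $A^\mu \equiv (\phi, \mathbf{A})$. The Dirac equation in the presence of the field is obtained with the minimal substitution discussed in Section 5.4:

$$i\partial^\mu \to i\partial^\mu + eA^\mu , \tag{6.98}$$

where e is the electronic charge. We obtain:

$$(i\slashed{\partial} + e\slashed{A} - m)\,\psi = 0 , \tag{6.99}$$

or
$$\left[\beta\left(i\frac{\partial}{\partial t} + e\phi\right) - \beta\boldsymbol{\alpha}\cdot(\mathbf{p} + e\mathbf{A}) - m\right]\psi = 0 . \tag{6.100}$$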

If we restrict ourselves to solutions with positive energy, equation (6.98)

provides an extraordinarily accurate description of the behaviour of the elec-
tron in a given electromagnetic field, both for the electric field generated by
an atomic nucleus and for a classical external field. We want now to study
equation (6.100) for the latter case, in the non-relativistic limit, for

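$$E = \sqrt{\mathbf{p}^2 + m^2} \approx m\left(1 + \frac{\mathbf{p}^2}{2m^2}\right) = m + \frac{\mathbf{p}^2}{2m} , \tag{6.101}$$

with $m \gg \mathbf{p}^2/2m$. In these conditions it is helpful to isolate the rapidly varying phase factor which corresponds to the rest energy, and rewrite the solution to (6.100) in the form
$$\psi = \psi'\,e^{-imt} , \tag{6.102}$$
where ψ′ oscillates much more slowly than $e^{-imt}$.
Substituting (6.102) in (6.100) and multiplying on the left by $\beta e^{imt}$ gives

$$\left(i\frac{\partial}{\partial t} + m\right)\psi' = \left[\boldsymbol{\alpha}\cdot(\mathbf{p} + e\mathbf{A}) + \beta m - e\phi\right]\psi' . \tag{6.103}$$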

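ψ′ is a four-component spinor which we can write in the form

$$\psi' = \begin{pmatrix} \varphi \\ \eta \end{pmatrix} , \tag{6.104}$$

with ϕ and η two-component spinors. From

$$\boldsymbol{\alpha}\,\psi' = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}\begin{pmatrix} \varphi \\ \eta \end{pmatrix} = \begin{pmatrix} \boldsymbol{\sigma}\eta \\ \boldsymbol{\sigma}\varphi \end{pmatrix} \tag{6.105}$$

$$\beta\,\psi' = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}\begin{pmatrix} \varphi \\ \eta \end{pmatrix} = \begin{pmatrix} \varphi \\ -\eta \end{pmatrix} , \tag{6.106}$$
a set of coupled equations for ϕ and η is obtained

$$\left(i\frac{\partial}{\partial t} + e\phi\right)\varphi = \boldsymbol{\sigma}\cdot(\mathbf{p} + e\mathbf{A})\,\eta \tag{6.107}$$

$$\left(i\frac{\partial}{\partial t} + e\phi + 2m\right)\eta = \boldsymbol{\sigma}\cdot(\mathbf{p} + e\mathbf{A})\,\varphi . \tag{6.108}$$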
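Now we again use the non-relativistic approximation, for which

$$\left(i\frac{\partial}{\partial t} + e\phi + 2m\right)\eta \approx 2m\,\eta \tag{6.109}$$

and thus
$$\eta \approx \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} + e\mathbf{A})}{2m}\,\varphi . \tag{6.110}$$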

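Equation (6.110) shows that η is the small component of ψ′, being of order p/m with respect to ϕ.
Substituting (6.110) into (6.107) gives the equation for ϕ

$$\left(i\frac{\partial}{\partial t} + e\phi\right)\varphi = \frac{[\boldsymbol{\sigma}\cdot(\mathbf{p} + e\mathbf{A})]^2}{2m}\,\varphi , \tag{6.111}$$
which can be rewritten in a more familiar form using the relationship

$$\begin{aligned}
[\boldsymbol{\sigma}\cdot(\mathbf{p} + e\mathbf{A})]^2 &= (\mathbf{p} + e\mathbf{A})^2 + i\boldsymbol{\sigma}\cdot[(\mathbf{p} + e\mathbf{A})\times(\mathbf{p} + e\mathbf{A})] \\
&= (\mathbf{p} + e\mathbf{A})^2 + ie\,\boldsymbol{\sigma}\cdot(\mathbf{p}\times\mathbf{A} + \mathbf{A}\times\mathbf{p}) \\
&= (\mathbf{p} + e\mathbf{A})^2 + e\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla) \\
&= (\mathbf{p} + e\mathbf{A})^2 + e\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{A}) \\
&= (\mathbf{p} + e\mathbf{A})^2 + e\,\boldsymbol{\sigma}\cdot\mathbf{B} ,
\end{aligned}$$

where B = (∇ × A) is the magnetic field. In the third line ∇ acts on everything to its right; applied to ϕ, ∇ × (Aϕ) = (∇ × A)ϕ − A × ∇ϕ, so the terms containing ∇ϕ cancel and only the curl of A survives. This then gives

$$\left(i\frac{\partial}{\partial t} + e\phi\right)\varphi = \left[\frac{1}{2m}(\mathbf{p} + e\mathbf{A})^2 + \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right]\varphi , \tag{6.112}$$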
which is the Schrödinger equation for a particle with electric charge −e and
spin s = σ/2, interacting with an electromagnetic field described by the po-
tentials φ and A.
The last term on the right hand side of equation (6.112) is the interaction energy between the magnetic field B and a magnetic dipole moment
$$\boldsymbol{\mu} = \frac{-e}{2m}\,\boldsymbol{\sigma} = g\,\frac{-e}{2m}\,\mathbf{s} . \tag{6.113}$$
The coefficient g is known as the gyromagnetic ratio and expresses the
relationship between the magnetic moment, given in units of Bohr magnetons,
and the corresponding angular momentum.
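The complete interaction with the magnetic field is obtained by inserting into (6.112) the vector potential corresponding to a constant field along the z-axis:
$$\mathbf{A} = \frac{B}{2}(-y, x, 0) ; \qquad \nabla\cdot\mathbf{A} = 0 ; \qquad \nabla\times\mathbf{A} = \mathbf{B} .$$
Neglecting terms of order $B^2$, we obtain:
$$\begin{aligned}
H &= \frac{\mathbf{p}^2}{2m} + \frac{e}{2m}(\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p}) + \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} \\
&= \frac{\mathbf{p}^2}{2m} + \frac{eB}{2m}(x p_y - y p_x) + \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} \\
&= \frac{\mathbf{p}^2}{2m} + \frac{e}{2m}(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{B} . \tag{6.114}
\end{aligned}$$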
The term which contains the orbital angular momentum provides the quan-
tum explanation of the normal Zeeman effect. The emission or absorption of

a photon obeys the selection rule ΔLz = ±1, 0. For atoms in which Lz is a
good quantum number, in the presence of a magnetic field the spectral line
splits into three components separated by ±eB/2m, 0, which coincides exactly
with the Larmor frequency (5.73). However, in complex atoms, Lz and Sz are
not individually diagonal and the difference between the levels involved in
the transition is also a function of the gyromagnetic spin ratio [9]. Equation
(6.113) correctly describes the behaviour of the electron; one finds g = 2 as
originally hypothesised by Goudsmit and Uhlenbeck to explain the anomalous
Zeeman effect.
The result (6.114) constitutes an extraordinary success for Dirac’s the-
ory, in which the spin-magnetic field term, which in non-relativistic quantum
mechanics must be added ad hoc to the Schrödinger equation, emerges in
a natural way from the quantum minimal substitution applied to the Dirac
equation of the free particle.

6.2 THE RELATIVISTIC HYDROGEN ATOM

In this section we discuss the spectrum of the hydrogen atom starting
from the Dirac equation (following the original treatment by Dirac [9]; see
also Schiﬀ [10]). The calculation in the non-relativistic limit, based on the
Schrödinger equation, is given in Appendix B. As in that case, the starting
point is the factorisation of the Hamiltonian in polar coordinates.

6.2.1 Factorisation of the Dirac Equation in Polar Coordinates

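The α matrices, in the Dirac theory, describe the velocity of the electron:
$$\frac{d\mathbf{x}}{dt} = -i\,[\mathbf{x}, H] = -i\,[\mathbf{x}, \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m] = \boldsymbol{\alpha} . \tag{6.115}$$

We can therefore define the radial velocity as:

$$\alpha_r = \frac{1}{r}\,\mathbf{x}\cdot\boldsymbol{\alpha} \tag{6.116}$$
and the radial momentum, $p_r$ (see Appendix B), as:
$$p_r = \frac{1}{r}(\mathbf{x}\cdot\mathbf{p} - i) = -i\left(\frac{\partial}{\partial r} + \frac{1}{r}\right) . \tag{6.117}$$
The factorisation needed is obtained starting from the product:

$$\alpha_r(\boldsymbol{\alpha}\cdot\mathbf{p}) = \frac{1}{r}\,x_i(\alpha_i\alpha_j)\,p_j = \frac{1}{r}\,x_i\left(\delta_{ij} + \frac{1}{2}[\alpha_i, \alpha_j]\right)p_j . \tag{6.118}$$
The commutator of two α matrices uses the matrices $\Sigma_i = 2s_i$, introduced in equation (6.28), which describe the spin of the electron, for example:

$$[\alpha_1, \alpha_2] = 2\alpha_1\alpha_2 = 2i\begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix} = 2i\,\Sigma_3 . \tag{6.119}$$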

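In general:
$$\frac{1}{2}[\alpha_i, \alpha_j] = i\,\epsilon_{ijk}\,\Sigma_k \tag{6.120}$$
from which, using the definition of $p_r$, equation (6.117), and the orbital angular momentum, L = x × p, we can rewrite equation (6.118) as:
$$\alpha_r(\boldsymbol{\alpha}\cdot\mathbf{p}) = p_r + \frac{i}{r}(\mathbf{L}\cdot\boldsymbol{\Sigma} + 1) = p_r + \frac{i}{r}\,\beta k \tag{6.121}$$
where we have defined the hermitian operator k:

$$k = \beta(\mathbf{L}\cdot\boldsymbol{\Sigma} + 1) . \tag{6.122}$$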

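The Dirac Hamiltonian can be rewritten in the new variables as:

$$\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m = \alpha_r p_r + i\alpha_r\beta\,r^{-1}k + \beta m . \tag{6.123}$$

The physical meaning of k becomes clear by calculating the square:

$$\begin{aligned}
k^2 = (\mathbf{L}\cdot\boldsymbol{\Sigma} + 1)^2 &= L_i\left(\delta_{ij} + \frac{1}{2}[\Sigma_i, \Sigma_j]\right)L_j + 2\,\mathbf{L}\cdot\boldsymbol{\Sigma} + 1 \\
&= \mathbf{L}^2 + i\,\epsilon_{ijk}\,\Sigma_k L_i L_j + 2\,\mathbf{L}\cdot\boldsymbol{\Sigma} + 1 = \mathbf{L}^2 + \mathbf{L}\cdot\boldsymbol{\Sigma} + \frac{3}{4} + \frac{1}{4} \\
&= \left(\mathbf{L} + \frac{1}{2}\boldsymbol{\Sigma}\right)^2 + \frac{1}{4} = \mathbf{J}^2 + \frac{1}{4} . \tag{6.124}
\end{aligned}$$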
The square of k is a function of the square of the total angular momentum
and its eigenvalues are equal to j(j + 1) + 1/4 = (j + 1/2)2 . Obviously pr ,
αr and r commute with J, therefore with k and also β. It follows that k is a
constant of the motion and that H and k are simultaneously diagonalisable.
We note the anticommutation relation:

{αr , β} = 0, αr2 = β 2 = 1 . (6.125)

The overall Hamiltonian for the electron in the Coulomb field of the proton is written:
$$H = \alpha_r p_r + i\alpha_r\beta\,r^{-1}k + \beta m - \frac{\alpha}{r} \tag{6.126}$$
with α ≈ 1/137 the fine structure constant.

6.2.2 Separation of Variables

To achieve the separation of variables, we must construct the Dirac spinors
which correspond to the eigenvalues of the total angular momentum, j, and the
component along the z-axis, jz . These are similar to the spherical harmonics
introduced in Appendix B, equation (B.12) for the particle of spin zero (for a
complete treatment, see [11]).

Spinors with Definite Angular Momentum. For a spin 1/2 particle, a given value of the total angular momentum can be obtained starting from two values of the orbital angular momentum:

$$l = j \pm 1/2 , \qquad l' = j \mp 1/2 . \tag{6.127}$$

Even if Lz is not conserved, see equation (6.28), the value of the total
orbital angular momentum is a constant of the motion, since it is related to
the behaviour of the wave function under parity. For inversion of the axes,
the state with total orbital angular momentum l has a factor $(-1)^l$, which assumes opposite values for the orbital momenta in (6.127).
We take the value of l which corresponds to the plus sign in (6.127). The two-dimensional spinor with angular momentum j, $j_z$, is easily constructed by combining the spherical harmonics $Y_l^m$ and $Y_l^{m+1}$ with the up and down spinors:

$$\chi^{(+)}(j = l - 1/2,\ j_z = m + 1/2;\ \theta, \phi) = a\,Y_l^m(\theta,\phi)\,\chi(\uparrow) + b\,Y_l^{m+1}(\theta,\phi)\,\chi(\downarrow) = \begin{pmatrix} a\,Y_l^m(\theta,\phi) \\ b\,Y_l^{m+1}(\theta,\phi) \end{pmatrix} \tag{6.128}$$

where a and b are coefficients³ which characterise the combination corresponding to the total angular momentum j = l − 1/2. In detail, see [11], one finds:

$$a = \sqrt{\frac{l-m}{2l+1}} , \qquad b = -\sqrt{\frac{l+m+1}{2l+1}} . \tag{6.129}$$
Similarly, we define the spinor corresponding to the minus sign in (6.127):

$$\chi^{(-)}(j = l + 1/2,\ j_z = m + 1/2;\ \theta, \phi) = c\,Y_l^m(\theta,\phi)\,\chi(\uparrow) + d\,Y_l^{m+1}(\theta,\phi)\,\chi(\downarrow) = \begin{pmatrix} c\,Y_l^m(\theta,\phi) \\ d\,Y_l^{m+1}(\theta,\phi) \end{pmatrix} \tag{6.130}$$

which gives:

$$c = \sqrt{\frac{l+m+1}{2l+1}} , \qquad d = \sqrt{\frac{l-m}{2l+1}} . \tag{6.131}$$
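A quick consistency check on the coefficients of (6.129) and (6.131): taken with the square roots of the standard Clebsch–Gordan tables (an assumption about the intended form of these equations), the pairs (a, b) and (c, d) are each normalised and are orthogonal to each other, as they must be for two states with the same l and j_z but different j. The function name below is illustrative only.

```python
import math

def clebsch_gordan_pairs(l, m):
    """Coefficients (a, b) for j = l - 1/2 and (c, d) for j = l + 1/2,
    both with j_z = m + 1/2; requires -l - 1 <= m <= l."""
    a = math.sqrt((l - m) / (2 * l + 1))
    b = -math.sqrt((l + m + 1) / (2 * l + 1))
    c = math.sqrt((l + m + 1) / (2 * l + 1))
    d = math.sqrt((l - m) / (2 * l + 1))
    return (a, b), (c, d)

for l in range(1, 4):
    for m in range(-l, l):
        (a, b), (c, d) = clebsch_gordan_pairs(l, m)
        # norms of the two spinors and their overlap
        print(l, m, a * a + b * b, c * c + d * d, a * c + b * d)
```

For every allowed (l, m) the two norms should equal 1 and the overlap should vanish, up to rounding.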
3 Known as Clebsch–Gordan coeﬃcients; see [13].

To construct the Dirac spinors which correspond to specific values of j and $j_z$, we anticipate from Section 12.1 that, in the theory of Dirac, parity is represented by the matrix $\gamma_0$ = diag(1, 1, −1, −1). Therefore, the two-dimensional spinors which represent the first two and the second two rows must have opposite parity. From this, two possibilities follow:

$$u^{(+)}_{j,j_z} = \begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ \frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} , \qquad u^{(-)}_{j,j_z} = \begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(-)} \\ \frac{1}{r}G(r)\,\chi^{(+)} \end{pmatrix} \tag{6.132}$$

where $r^{-1}F$ and $r^{-1}G$ are radial wave functions. As well as j and $j_z$, we can characterise the spinor u with the value l of the orbital momentum of the upper component, which in the hydrogen atom is dominant, knowing that the lower component will be related to the orbital momentum l′ given by (6.127), and write simply:

$$u_{j,l,j_z} = \begin{pmatrix} \frac{1}{r}F(r)\,\chi(l) \\ \frac{1}{r}G(r)\,\chi(l') \end{pmatrix} . \tag{6.133}$$

In spectroscopic notation, see Appendix B, the two states which correspond to j = 1/2 are written:

$S_{1/2}$, corresponding to $u_{j,l=0,j_z}$:
$$\chi(l = 0) = \chi^{(-)} , \qquad \chi(l' = 1) = \chi^{(+)}$$
$P_{1/2}$, corresponding to $u_{j,l=1,j_z}$:
$$\chi(l = 1) = \chi^{(+)} , \qquad \chi(l' = 0) = \chi^{(-)} . \tag{6.134}$$

Radial Equations. The two spinors $\chi^{(\pm)}$ are connected by the operator $\sigma_r = r^{-1}\,\boldsymbol{\sigma}\cdot\mathbf{x}$. The relation follows from the fact that the operator $\sigma_r$ is rotationally invariant, so cannot change the total angular momentum, and has negative parity, therefore must exchange l and l′:

$$\sigma_r(\theta,\phi)\,\chi^{(-)}(\theta,\phi) = \eta\,\chi^{(+)}(\theta,\phi) \tag{6.135}$$

with η a complex number whose value is determined by setting θ = 0 in (6.135) and applying $\sigma_r(0,\phi) = \sigma_3$ to the spinor with m = 0. We find⁴ η = 1.
With this equation we can determine the action of the operator $\alpha_r$, equation (6.126), on the spinors (6.132) by choosing for $\alpha_r$ the representation⁵:

$$\alpha_r = \begin{pmatrix} 0 & -i\sigma_r \\ i\sigma_r & 0 \end{pmatrix} \tag{6.136}$$

4 Using the relation $Y_l^0(\cos\theta = 1, \phi) = \sqrt{\frac{2l+1}{4\pi}}$.
5 The reader can show that the new representation of the α matrices is connected to the Dirac representation, equation (6.24), by a unitary transformation.


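With this choice:

$$i\alpha_r\begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ \frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} = \begin{pmatrix} \frac{1}{r}G(r)\,\chi^{(+)} \\ -\frac{1}{r}F(r)\,\chi^{(-)} \end{pmatrix} \tag{6.137}$$

while:
$$\beta\begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ \frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} = \begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ -\frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} \tag{6.138}$$

and k is simply a multiple of the identity:

$$k\begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ \frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} = k(j)\begin{pmatrix} \frac{1}{r}F(r)\,\chi^{(+)} \\ \frac{1}{r}G(r)\,\chi^{(-)} \end{pmatrix} \tag{6.139}$$

where k(j), the eigenvalue of k, has the values:

$$k(j) = \pm(j + 1/2) = \pm 1, \pm 2, \cdots \tag{6.140}$$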

Consequently we can reduce the Dirac equation to a two-dimensional equation on the vector $(\frac{1}{r}F(r), \frac{1}{r}G(r))$ by introducing the two-dimensional representation:

$$\beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} ; \qquad \alpha_r = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{6.141}$$

which clearly satisfies the anticommutation relations (6.125). In this representation the Hamiltonian (6.126) gives rise to two radial equations:
$$G'(r) + \frac{k(j)}{r}\,G(r) + \left(\frac{\alpha}{r} - m + E\right)F(r) = 0$$
$$-F'(r) + \frac{k(j)}{r}\,F(r) + \left(\frac{\alpha}{r} + m + E\right)G(r) = 0 . \tag{6.142}$$

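Why Does β Appear in the Definition of k? The spin-angular momentum coupling operator which appears in k, equation (6.122), can be applied to the two-dimensional spinors introduced in (6.128) and (6.130). We find:
$$\begin{aligned}
(\boldsymbol{\sigma}\cdot\mathbf{L} + 1)\,\chi(l) &= \left[j(j+1) - l(l+1) + \frac{1}{4}\right]\chi(l) \\
&= -(j + 1/2)\,\chi(l) , \quad \text{for } l = j + 1/2 \\
&= (j + 1/2)\,\chi(l) , \quad \text{for } l = j - 1/2
\end{aligned} \tag{6.143}$$

in agreement with (6.140), while:

$$\begin{aligned}
(\boldsymbol{\sigma}\cdot\mathbf{L} + 1)\,\chi(l') &= \left[j(j+1) - l'(l'+1) + \frac{1}{4}\right]\chi(l') \\
&= (j + 1/2)\,\chi(l') , \quad \text{for } l' = j - 1/2 \\
&= -(j + 1/2)\,\chi(l') , \quad \text{for } l' = j + 1/2 .
\end{aligned} \tag{6.144}$$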

Therefore, the spinors (6.132) are not eigenstates of the spin-orbit operator, Σ · L, but they are eigenstates of k, owing to the multiplication by β, which makes the eigenvalue of χ(l′) equal to that of χ(l). We also note the relations used in the literature:
$$k(j) = -l , \ \text{for } l = j + 1/2 ; \qquad k(j) = l + 1 , \ \text{for } l = j - 1/2 . \tag{6.145}$$
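The bookkeeping of (6.143)–(6.145) is easy to tabulate with exact fractions. The sketch below is only illustrative, with hypothetical function names; it evaluates j(j+1) − l(l+1) + 1/4 on the upper component, where β = +1, and on the lower component, where β = −1 and l is replaced by l′ = 2j − l, and confirms that the two give the same value k(j).

```python
from fractions import Fraction

def spin_orbit_plus_one(j, l):
    """Eigenvalue of sigma.L + 1 on chi(l): j(j+1) - l(l+1) + 1/4, as in (6.143)."""
    return j * (j + 1) - l * (l + 1) + Fraction(1, 4)

def k_eigenvalue(j, l_upper):
    """k(j) for a Dirac spinor whose upper component has orbital momentum l_upper."""
    l_lower = 2 * j - l_upper                        # l' is the other value of j +- 1/2
    from_upper = spin_orbit_plus_one(j, l_upper)     # beta = +1 on the upper rows
    from_lower = -spin_orbit_plus_one(j, l_lower)    # beta = -1 on the lower rows
    assert from_upper == from_lower                  # k acts as a multiple of the identity
    return from_upper

for two_j in (1, 3, 5):
    j = Fraction(two_j, 2)
    for l in (j - Fraction(1, 2), j + Fraction(1, 2)):
        print(f"j = {j}, l = {l}: k = {k_eigenvalue(j, l)}")
```

The output reproduces the pattern k = l + 1 for l = j − 1/2 and k = −l for l = j + 1/2.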

Boundary Conditions. For r → +∞, the radial equations reduce to:

$$G'(r) - (m - E)\,F(r) = 0$$
$$-F'(r) + (m + E)\,G(r) = 0 . \tag{6.146}$$
We define:
$$\epsilon_{1,2} = (m \pm E) > 0 ; \qquad \epsilon = \sqrt{\epsilon_1\epsilon_2} . \tag{6.147}$$
Taking the derivative of the second equation and substituting G′ from the first, we find:
$$F''(r) = \epsilon^2\,F(r) \tag{6.148}$$
from which the general solution follows:
$$F = A\,e^{-\epsilon r} + B\,e^{+\epsilon r} \quad (r \to +\infty) . \tag{6.149}$$
The same holds for G. In conclusion, we must require:
• F(r), G(r) → 0 for r → 0⁺;
• F(r), G(r) ∼ $e^{-\epsilon r}$ for r → +∞.

6.2.3 Eigenvalues of the Hamiltonian

With obvious variations, the procedure follows the one used for the non-relativistic case of Appendix B. We write
$$F(r) = e^{-\epsilon r}\,(a_0 r^s + a_1 r^{s+1} + \cdots + a_\nu r^{s+\nu} + \cdots)$$
$$G(r) = e^{-\epsilon r}\,(b_0 r^s + b_1 r^{s+1} + \cdots + b_\nu r^{s+\nu} + \cdots) . \tag{6.150}$$

ν = 0. The series should not contain terms in $r^{s-1}$. Substituting in (6.142), we find:
$$a_{-1} = [-s + k(j)]\,a_0 + \alpha\,b_0$$
$$b_{-1} = \alpha\,a_0 + [s + k(j)]\,b_0 . \tag{6.151}$$
To ensure $a_{-1} = b_{-1} = 0$ with $a_0$ and $b_0$ non-zero, the determinant of (6.151) must vanish:

$$s^2 = k(j)^2 - \alpha^2 \quad \text{or:} \quad s = +\sqrt{(j + 1/2)^2 - \alpha^2}$$
$$\frac{b_0}{a_0} = \frac{-\alpha}{s + k(j)} . \tag{6.152}$$

We now consider the general term of the series, of order ν − 1. Substituting in (6.142) we find (this time it is necessary to differentiate the exponentials as well):

$$[s + \nu - k(j)]\,a_\nu - \alpha\,b_\nu = \epsilon\,a_{\nu-1} + \epsilon_1\,b_{\nu-1}$$

$$\alpha\,a_\nu + [s + \nu + k(j)]\,b_\nu = \epsilon_2\,a_{\nu-1} + \epsilon\,b_{\nu-1} . \tag{6.153}$$

The set on the right-hand side has determinant zero, therefore we can eliminate $a_{\nu-1}$ and $b_{\nu-1}$ and obtain the relation:

$$[\epsilon\,(s + \nu - k(j)) - \epsilon_1\alpha]\,a_\nu = [\epsilon_1(s + \nu + k(j)) + \epsilon\alpha]\,b_\nu . \tag{6.154}$$

General ν. For general values of ν, the recurrence relations (B.27) give rise to the series (6.150) starting from $a_0$ and $b_0$. For ν → ∞, we obtain:

$$\nu\,a_\nu = \epsilon\,a_{\nu-1} + \epsilon_1\,b_{\nu-1} = (2\epsilon)\,a_{\nu-1}$$

$$\nu\,b_\nu = \epsilon_2\,a_{\nu-1} + \epsilon\,b_{\nu-1} = (2\epsilon)\,b_{\nu-1} \tag{6.155}$$

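making use of (6.154). Summing the series, we obtain (the same holds for G):

$$F(r) \sim e^{-\epsilon r}\,e^{+2\epsilon r} = e^{+\epsilon r} \tag{6.156}$$

which is not acceptable; the series must terminate at a certain value ν = ν̄. Setting ν = ν̄ + 1 in (B.27) and making the first term vanish, we find:
$$b_{\bar\nu} = -\frac{\epsilon}{\epsilon_1}\,a_{\bar\nu} \tag{6.157}$$
and, substituting in (6.154):

$$E\alpha = \epsilon\,(s + \bar\nu) \tag{6.158}$$

which shows that E > 0. Squaring:

$$E^2 = \frac{m^2}{1 + \frac{\alpha^2}{(s + \bar\nu)^2}} \tag{6.159}$$

and, finally:
$$E = \frac{m}{\sqrt{1 + \frac{\alpha^2}{\left(\bar\nu + \sqrt{(j + 1/2)^2 - \alpha^2}\right)^2}}} . \tag{6.160}$$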

Special consideration is required for the states with ν̄ = 0, where there could be a conflict between equation (6.152):

$$\frac{b_0}{a_0} = \frac{-\alpha}{s + k(j)} \tag{6.161}$$

Table 6.1 Levels of the hydrogen atom in the Dirac equation for n = ν̄ + j + 1/2 ≤ 4. The energies increase from the bottom towards the top and from left to right. Levels with the same values of j and ν̄ are degenerate. In the last column, the number of states in the row, equal to 2n², is given (see Eq. (B.35)).

n = ν̄ + j + 1/2 | ν̄ = 3      | ν̄ = 2      | ν̄ = 1      | ν̄ = 0 | number of states
4               | S1/2, P1/2 | P3/2, D3/2 | D5/2, F5/2 | F7/2  | 2 · 16
3               | −−         | S1/2, P1/2 | P3/2, D3/2 | D5/2  | 2 · 9
2               | −−         | −−         | S1/2, P1/2 | P3/2  | 2 · 4
1               | −−         | −−         | −−         | S1/2  | 2 · 1

and equation (6.154), which for ν̄ = 0 reads:

$$\epsilon\,a_0 + \epsilon_1\,b_0 = 0 . \tag{6.162}$$

The latter equation clearly requires that $a_0$ and $b_0$ should have opposite signs, which can be obtained from the preceding equation only if:

$$k(j) = +(j + 1/2) \tag{6.163}$$

which in turn, given equation (6.144), implies the selection rule:

$$\bar\nu = 0 : \quad \text{only } l = j - 1/2 . \tag{6.164}$$

To the lowest order in α, equation (6.160) reduces to the non-relativistic result, with the principal quantum number n = j + 1/2 + ν̄.
The resulting arrangement of the states can be represented by a series
ordered according to the quantum numbers n, l, j. Table 6.1 shows the levels
with n ≤ 4, in spectroscopic notation (Appendix B):
The noteworthy feature of the result (6.160) is that the eigenvalues depend
only on ν̄ and j. Dirac’s theory largely resolves the degeneracy in l of the
non-relativistic result, giving a satisfactory explanation of the so-called fine
structure of the levels, but a residual degeneracy remains between those pairs
of states which have equal values of ν̄ and j and which differ by one unit in
orbital angular momentum.
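To get a feeling for the numbers in (6.160), the sketch below evaluates the binding energies E − m for the n = 1 and n = 2 levels and compares them with the non-relativistic value −mα²/2n². The numerical values of α and of the electron mass are assumptions introduced here for illustration (the text only quotes α ≈ 1/137), the proton is treated as infinitely heavy as in (6.126), and the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999   # fine structure constant (assumed value)
M_E = 510998.95          # electron mass in eV (assumed value)

def dirac_energy(nu_bar, j, alpha=ALPHA, m=M_E):
    """Total energy from (6.160) for radial index nu_bar and total angular momentum j."""
    s = math.sqrt((j + 0.5) ** 2 - alpha ** 2)
    return m / math.sqrt(1.0 + alpha ** 2 / (nu_bar + s) ** 2)

def binding_energy(nu_bar, j, alpha=ALPHA, m=M_E):
    """Energy measured from the rest mass, E - m (negative for bound states)."""
    return dirac_energy(nu_bar, j, alpha, m) - m

def bohr_energy(n, alpha=ALPHA, m=M_E):
    """Non-relativistic limit: -m alpha^2 / (2 n^2)."""
    return -m * alpha ** 2 / (2 * n ** 2)

levels = [
    ("1S1/2", 0, 0.5),
    ("2S1/2, 2P1/2", 1, 0.5),   # degenerate: same nu_bar and j
    ("2P3/2", 0, 1.5),
]
for name, nu_bar, j in levels:
    n = int(nu_bar + j + 0.5)
    print(f"{name}: Dirac {binding_energy(nu_bar, j):.6f} eV, "
          f"non-relativistic {bohr_energy(n):.6f} eV")

# Fine-structure splitting between the j = 3/2 and j = 1/2 levels with n = 2
print("2P3/2 - 2P1/2:", dirac_energy(0, 1.5) - dirac_energy(1, 0.5), "eV")
```

Because (6.160) depends only on ν̄ and j, the 2S1/2 and 2P1/2 levels come out identical, while the 2P3/2 level lies slightly higher, by an amount of order mα⁴.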
The most celebrated example is that of the 2S1/2 and 2P1/2 states. In 1947,
Lamb and Retherford observed a small energy difference between these states,
the so-called Lamb shift. This difference was interpreted by Bethe as caused by
a higher-order electrodynamic correction, owing to the interaction between the
electron and the electrodynamic fluctuations of the vacuum. The calculation

of Bethe and Schwinger’s calculation of the diﬀerence of the magnetic moment

of the electron from the value in Dirac’s theory (g = 2), marked the beginning
of modern quantum electrodynamics, QED.

6.3 TRACES OF THE γ MATRICES

The traces of γ matrix products enter into practically all field theory calculations. Here we give the rules for calculating these traces, and explicit results in some simpler cases. Today there are programs capable of numerically or symbolically computing traces of products of a very large number of matrices, but it is nevertheless useful to acquire some familiarity with the properties of the traces and the calculation methods. The definitions of the γ matrices and of $\gamma_5$ are given in Section 6.1. The point of departure is the elementary properties already discussed:

$$\mathrm{Tr}(\gamma^\mu) = 0 ; \tag{6.165}$$
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu) = 4g^{\mu\nu} . \tag{6.166}$$

γ5. The matrix $\gamma_5$ is defined as:

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 . \tag{6.167}$$

It follows from this that⁶:

• the trace of $\gamma_5$ with a number ≤ 3 of γ matrices is zero,

• $$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_5) = +4i\,\epsilon^{\mu\nu\rho\sigma} . \tag{6.168}$$

Odd Number of γ Matrices. The rule (6.165) implies that the γ matri-
ces have an even number of dimensions and an equal number of +1 and −1
eigenvalues. Its generalisation is that

• the trace of an odd number of γ matrices = 0.

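Even Number of γ Matrices. We first consider the case of four γ matrices:

$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) .$$

We can advance the first matrix by one place using the anticommutation rule:
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = -\mathrm{Tr}(\gamma^\nu\gamma^\mu\gamma^\rho\gamma^\sigma) + 2g^{\mu\nu}\,\mathrm{Tr}(\gamma^\rho\gamma^\sigma) .$$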
6 We recall that the completely antisymmetric Levi–Civita tensor is defined by $\epsilon_{0123} = +1 = -\epsilon^{0123}$, equation (3.40).

The second term contains two fewer γ matrices and is elementary. If we continue to advance using the anticommutation rule, we obtain:
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = +8g^{\mu\nu}g^{\rho\sigma} - 8g^{\mu\rho}g^{\nu\sigma} + 8g^{\mu\sigma}g^{\nu\rho} - \mathrm{Tr}(\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^\mu) .$$
The final trace is equal to the initial one, because of the cyclic property; therefore, taking it to the other side of the equation, we find:
$$\mathrm{Tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) = +4\left(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}\right) . \tag{6.169}$$
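The trace rules (6.166) and (6.169), together with the vanishing of the trace of an odd number of γ matrices, can be checked by brute force over all index combinations. The sketch below is an illustration only; it assumes the Dirac representation and the metric g = diag(1, −1, −1, −1), and the helper names are hypothetical.

```python
import itertools

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def negate(a):
    return [[-v for v in row] for row in a]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [blocks(I2, Z2, Z2, negate(I2))] + [blocks(Z2, s, negate(s), Z2) for s in SIGMA]

def metric(mu, nu):
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1

def trace_of_product(indices):
    product = GAMMA[indices[0]]
    for index in indices[1:]:
        product = matmul(product, GAMMA[index])
    return trace(product)

def four_trace_formula(mu, nu, rho, sigma):
    """Right-hand side of (6.169)."""
    return 4 * (metric(mu, nu) * metric(rho, sigma)
                - metric(mu, rho) * metric(nu, sigma)
                + metric(mu, sigma) * metric(nu, rho))

def largest_deviations():
    """Largest violation of (6.166), of (6.169), and of the odd-trace rule (three matrices)."""
    two = max(abs(trace_of_product(idx) - 4 * metric(*idx))
              for idx in itertools.product(range(4), repeat=2))
    four = max(abs(trace_of_product(idx) - four_trace_formula(*idx))
               for idx in itertools.product(range(4), repeat=4))
    odd = max(abs(trace_of_product(idx))
              for idx in itertools.product(range(4), repeat=3))
    return two, four, odd

print(largest_deviations())
```

Since traces do not depend on the representation, the same check would pass in any other representation related to this one by a similarity transformation.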
The rule can be generalised, since, with an even number of γ matrices, if we move the first matrix forward until it reaches the last place we always end up with the starting trace multiplied by −1. Therefore, we can reduce the trace of 2n γ matrices to a combination with alternating signs of traces of 2n − 2 γ matrices. Iterating, we reduce to n = 2.
Of course, the number of terms in the expansion of the traces increases very rapidly, as (2n − 1)!!, making the use of computer programs necessary.
As an alternative to the previous method, we can reduce the number of matrices in a trace by using the relation which reduces the product of three matrices to a combination of $\gamma^\alpha$ and $\gamma^\alpha\gamma_5$, which follows from the completeness of the basis of matrices introduced in equation (6.47):
$$\gamma^\mu\gamma^\nu\gamma^\lambda = g^{\mu\nu}\gamma^\lambda + g^{\nu\lambda}\gamma^\mu - g^{\mu\lambda}\gamma^\nu - i\,\epsilon^{\mu\nu\lambda\rho}\,\gamma_\rho\gamma_5 . \tag{6.170}$$

6.4 PROBLEMS FOR CHAPTER 6

Sect. 6.1
1. For a massless particle, the Dirac equation (6.9) involves only three Dirac matrices:
$$i\frac{\partial}{\partial t}\psi = \boldsymbol{\gamma}\cdot\mathbf{p}\,\psi .$$
Show that in this case, to recover the energy-momentum relation $E^2 = |\mathbf{p}|^2$, we need only three anticommuting matrices, which we may take as the Pauli matrices $\sigma_i$, i = 1, 2, 3. The above equation is known as the Weyl equation and it describes the two-component neutrino, a particle with only two states: the antineutrino, with helicity h = +1, a candidate for the particle emitted with the electron in the neutron's β-decay (Sect. 15.1), and its antiparticle, the neutrino, with helicity h = −1, a candidate for the particle emitted in the solar fusion reactions (Sect. 2.4).
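2. Prove that the four matrices $\tilde\gamma^\mu$:

$$\tilde\gamma^0 = \alpha_2 = \begin{pmatrix} 0 & \sigma_2 \\ \sigma_2 & 0 \end{pmatrix} ; \qquad \tilde\gamma^1 = -i\Sigma_3 = -i\begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix}$$

$$\tilde\gamma^2 = \gamma^2 ; \qquad \tilde\gamma^3 = i\Sigma_1 = i\begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_1 \end{pmatrix}$$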

satisfy the Dirac anticommutation relations (6.15) and therefore provide an acceptable representation of the Dirac matrices known as the Majorana representation, to be discussed in Section 13.2.
3. Determine the unitary transformation that brings the matrices $\tilde\gamma^\mu$ of Problem 2 into the standard Pauli representation (6.24).
4. The matrices $\tilde\gamma^\mu$ of Problem 2 are all imaginary so that the Dirac equation in this representation has only real coefficients. What can you deduce from this fact? Draw your conclusions and then consult Sect. 13.2.
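5. In the Weyl, or chiral, representation, the γ-matrices are written in the form
$$\gamma^0 = \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix} , \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} .$$
– Verify that the above matrices fulfil the anticommutation rule $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$.
– Derive the explicit expressions of the operators $P_\pm = \frac{1}{2}(1 \pm \gamma_5)$, and verify that they are projection operators.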
6. Given the Dirac spinor ψ(x), and the transformation
$$\psi'(x') = S(\Lambda)\psi(x), \qquad x' = \Lambda x$$
determine the form of the matrix S(Λ) corresponding to rotation by an
angle φ around the z-axis. Consider the cases of:
– infinitesimal transformation;
– finite transformation, discussing the result obtained for φ = 2π.
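7. Show that the operators
$$\Pi_\pm = \frac{1}{2}\left(1 \pm \frac{\boldsymbol{\Sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)$$
where
$$\Sigma_i = \begin{pmatrix} \sigma_i & 0 \\ 0 & \sigma_i \end{pmatrix}$$
– are projection operators;
– satisfy the relations:
$$\Pi_+ u_r(\mathbf{p}) = \delta_{r1} u_r(\mathbf{p}), \quad \Pi_+ v_r(\mathbf{p}) = \delta_{r2} v_r(\mathbf{p})$$
$$\Pi_- u_r(\mathbf{p}) = \delta_{r2} u_r(\mathbf{p}), \quad \Pi_- v_r(\mathbf{p}) = \delta_{r1} v_r(\mathbf{p})$$
where $u_r(\mathbf{p})$ and $v_r(\mathbf{p})$ (r = 1, 2) are four-component spinors,
solutions of the Dirac equation describing a free particle with
momentum p along the z-axis.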

8. Under a finite, i.e. not infinitesimal, Lorentz transformation Λ, a
four-spinor transforms according to
$$\psi \to S(\Lambda)\psi ,$$
with
$$S[\Lambda(\omega)] = e^{-\frac{i}{4}\omega_{\mu\nu}\sigma^{\mu\nu}}$$
and
$$\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu] .$$
The non-vanishing components of the antisymmetric tensor $\omega_{\mu\nu}$ correspond
to a boost along the direction of the i-axis ($\omega_{0i}$, with i = 1, 2, 3)
and to rotations around the k-axis, perpendicular to the (i, j)-plane
($\omega_{ij}$).
– compute the matrix S corresponding to a boost along the direction
of the 1-axis;
– verify that
$$S(\Lambda)\gamma^\mu S(\Lambda)^{-1} = (\Lambda^{-1})^\mu{}_\nu\,\gamma^\nu .$$

Sect. 6.3

1. Prove that the trace of an odd number of gamma matrices vanishes.
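
As a numerical sanity check (not a proof) of the statement in Problem 2 of Sect. 6.1, the following sketch builds the four matrices explicitly and tests the anticommutation relations and the claim that all entries are imaginary. It assumes that $\gamma^2$ is taken in the standard representation, $\gamma^2 = \begin{pmatrix} 0 & \sigma_2 \\ -\sigma_2 & 0 \end{pmatrix}$, and the metric signature (+, −, −, −); the helper names are ours.

```python
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(m11, m12, m21, m22):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [m11[i] + m12[i] for i in range(2)]
    bottom = [m21[i] + m22[i] for i in range(2)]
    return top + bottom

def scale(c, m):
    return [[c * x for x in row] for row in m]

zero = [[0, 0], [0, 0]]
s1 = [[0, 1], [1, 0]]
s2 = [[0, -1j], [1j, 0]]
s3 = [[1, 0], [0, -1]]

gamma = [
    block(zero, s2, s2, zero),              # gamma~^0 = alpha_2
    scale(-1j, block(s3, zero, zero, s3)),  # gamma~^1 = -i Sigma_3
    block(zero, s2, scale(-1, s2), zero),   # gamma~^2 = gamma^2 (standard representation, assumed)
    scale(1j, block(s1, zero, zero, s1)),   # gamma~^3 = i Sigma_1
]
metric = [1, -1, -1, -1]                    # diagonal of g^{mu nu}

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def satisfies_clifford(g, tol=1e-12):
    """Check {g^mu, g^nu} = 2 g^{mu nu} times the 4x4 identity."""
    for mu, nu in product(range(4), repeat=2):
        expected = 2 * metric[mu] if mu == nu else 0
        ac = anticommutator(g[mu], g[nu])
        for i, j in product(range(4), repeat=2):
            target = expected if i == j else 0
            if abs(ac[i][j] - target) > tol:
                return False
    return True

def all_imaginary(g, tol=1e-12):
    return all(abs(complex(x).real) < tol for m in g for row in m for x in row)

print(satisfies_clifford(gamma), all_imaginary(gamma))
```

A check of this kind only confirms the algebra for these particular matrices; the analytic proof requested in the problem is still needed.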

CHAPTER 7

QUANTISATION OF
THE DIRAC FIELD

7.1 PARTICLES AND ANTIPARTICLES

As we have seen, the Dirac equation is the only equation for the wave
function in accord with the requirements of quantum mechanics and relativity,
and leads unambiguously to particles with spin ½ and an extraordinarily
accurate description of the properties of the electron, either free or bound in
atoms. However, the Dirac equation has solutions with negative energy, which
are difficult to interpret in quantum mechanical terms.
From general principles, it follows that a quantum state should evolve in
time according to:
$$|t\rangle = e^{-iHt}|0\rangle \tag{7.1}$$
from which, if ψ is an eigenfunction of energy, it follows that:
$$\psi(\mathbf{x}, t) = e^{-iEt}\psi(\mathbf{x}, 0) . \tag{7.2}$$
Moreover, to have a system which is stable under small perturbations, E
must be bounded from below when other quantum numbers vary, something
which is evidently not true for the solutions of the Dirac equation with
$E = -\omega(\mathbf{p}) = -\sqrt{\mathbf{p}^2 + m^2} \le -m$.
The solution to the stability problem proposed by Dirac is simple and
radical: all the negative energy states are occupied and the vacuum is in
reality a sea of electrons which fill all the levels with E ≤ −m. The exclusion
principle prevents an electron with positive energy from falling into one of
these occupied states.
Exciting an electron from a state of negative energy to a state of positive
energy creates a hole in the Dirac sea, which behaves in every way like a
particle of mass m, equal to the mass of the electron, and with spin ½ and
positive electric charge.
A particle consistent with this description, the positron, was discovered by
Anderson in 1932, by observing the products of cosmic ray interactions in a
cloud chamber, Fig. 7.1.

DOI: 10.1201/9781003436263-7
This chapter has been made available under a CC BY NC license.

Figure 7.1 A cloud chamber photograph by Anderson (C. D. Anderson, Phys. Rev.
44 (1933), 406 [8]) with one of the first images of a positron. The positron travels
from bottom to top, as shown by the fact that the track has a lower curvature in
the upper part of the trajectory, owing to the loss of energy as the positron passes
through the layer of lead, whose side view is visible in the middle of the chamber.
From this information, it can be deduced that the particle has a positive charge,
while the mass is consistent with the mass of an electron.

However, the solution proposed by Dirac is not satisfactory from several
points of view. Among them is the fact that the solution would not work for spin
zero particles, which are bosons. One can answer, with Dirac, that for these
particles the x-coordinate is not observable and therefore a spin zero particle
does not have a wave function; for such particles, there is still a representation
of the momentum which is sufficient for practical matters [9].
It can be argued that, in reality, on the basis of the uncertainty principle,
the position of a relativistic particle is not an observable, regardless of the
value of its spin.
Imagine the measurement of the position of an electron with a microscope.
Obviously, we must use light of a sufficiently short wavelength, λ, to obtain
adequate precision. Consequently, the electron experiences a random recoil of
the order of the photon momentum, k = 1/λ, such that the uncertainty in the
position and the momentum satisfy the uncertainty relation:
$$\Delta x \sim \lambda = \frac{1}{\Delta p} . \tag{7.3}$$
When the photon energy exceeds the value ω = k ∼ 2m a new phenomenon
occurs: the creation of an electron-positron pair. In the language of the Dirac
sea, an electron in a negative energy state absorbs the photon and passes to a
positive energy state leaving behind a hole. Now we have two electrons in the
state and it is no longer possible to speak of the electron position.
In order for the locations of the two electrons to be confused, they must be
separated by a distance of the order of the photon wavelength. Therefore the
effect of non-locality begins when we arrive at a precision in the coordinate
measurement of the electron of the order of:
$$\Delta x \sim \lambda = \frac{1}{m} = 4.0 \cdot 10^{-11}\ \mathrm{cm}. \tag{7.4}$$
The length defined by (7.4) is known as the Compton wavelength, $\lambda_C$. For
the proton, $\lambda_C \sim 0.2$ fermi $= 0.2 \cdot 10^{-13}$ cm.
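
The orders of magnitude quoted above can be reproduced with a few lines of arithmetic. The sketch below restores ordinary units, $\lambda_C = \hbar/(mc) = \hbar c/(mc^2)$, and uses standard values for $\hbar c$ and for the electron and proton masses; these numbers are not given in the text and are inserted here as assumptions.

```python
# Reduced Compton wavelength: lambda_C = hbar / (m c) = (hbar c) / (m c^2).
HBAR_C_MEV_FM = 197.327    # hbar*c in MeV*fm (standard value, assumed)
FM_TO_CM = 1.0e-13         # 1 fermi = 1e-13 cm

def compton_wavelength_cm(mass_mev):
    """Return hbar/(m c) in cm for a rest energy m c^2 given in MeV."""
    return HBAR_C_MEV_FM / mass_mev * FM_TO_CM

electron = compton_wavelength_cm(0.511)      # electron rest energy in MeV (assumed)
proton = compton_wavelength_cm(938.272)      # proton rest energy in MeV (assumed)

print(f"electron: {electron:.2e} cm")
print(f"proton:   {proton:.2e} cm")
```

With these inputs the electron value comes out close to $4 \cdot 10^{-11}$ cm and the proton value close to 0.2 fermi, in line with the figures quoted in the text.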
The same result is found if we try to construct ever smaller wave packets
from the solutions to the Dirac equation. Using the positive energy solutions,
which obviously do not form a complete set, we cannot obtain dimensions for
the packet below λC .
The conclusion is that the representation of x, and therefore the wave
function, simply does not exist for relativistic particles. Even for the electron,
however, there is still a representation of the momentum which is sufficient for
practical purposes. The way in which these purposes are satisfied is provided
by second quantisation.
In second quantisation, nowadays better known as relativistic quantum field
theory, the object which satisfies the Dirac equation is a quantised field,
mathematically described by a linear operator which is a function of the
location in space-time, ψ(x, t).
The field is a dynamic quantum variable in the Heisenberg representation.
As an operator, the field acts on the space of physical states which, also in
the Heisenberg representation, are constant in time.
The difficulty connected with solutions with negative energy (now, more
properly, negative frequency) is solved because the time dependence of an
operator in the Heisenberg representation is an exponential whose sign is not
fixed in advance. The sign can be positive or negative, according to the energy
difference of the states between which the operator induces transitions.
In the second quantisation, the quantities referred to in (6.89), (6.90) and
(6.91) are also linear operators, rather than average values represented by
complex numbers. As we will see in the following section, the first of these
quantities must be identified with the charge associated with the particles,
which must be opposite for the particles destroyed by $\psi^{(+)}$ to those created
by $\psi^{(-)}$, as happens for the electron and positron.
It is reasonable to identify (6.90) and (6.91) with the Hamiltonian and
the total momentum of the ﬁeld. The operator structure of the creation and
destruction operators and the further physical characteristics of the associated
particles are obtained from the requirement that the Hamiltonian has a lower
bounded spectrum (stability). The stability condition leads unambiguously to
Fermi–Dirac statistics for the particles created or destroyed by the components
of the Dirac ﬁeld.

7.2 SECOND QUANTISATION: HOW IT WORKS

We separately consider in (6.88) the solutions with positive and negative
frequencies, which we write generally as:
$$\psi^{(+)}(x) = X e^{-i(px)}; \quad \text{(positive frequency)} \tag{7.5}$$
$$\psi^{(-)}(x) = Y e^{+i(px)}; \quad \text{(negative frequency)} \tag{7.6}$$
$$p^\mu = (E(\mathbf{p}), \mathbf{p}); \quad E > m; \quad p^2 = m^2 . \tag{7.7}$$

The field at x can be obtained from the field at the origin by a space-time
translation according to the equation (see Appendix A):
$$\psi(x) = e^{iP_\mu x^\mu}\,\psi(0)\,e^{-iP_\mu x^\mu} \tag{7.8}$$
where $P^\mu$ is the energy-momentum operator. We now take the matrix element
of this equation between the 4-momentum eigenstates:
$$\langle E', \mathbf{P}'|\psi(x)|E, \mathbf{P}\rangle = e^{-i(P - P')_\mu x^\mu}\,\langle E', \mathbf{P}'|\psi(0)|E, \mathbf{P}\rangle . \tag{7.9}$$
If we compare (7.9) with (7.5) and (7.6) we see that we must have:
$$P' = P - p; \quad \text{(positive frequency)} \tag{7.10}$$
$$P' = P + p; \quad \text{(negative frequency)} \tag{7.11}$$
or:

• the field components with positive or negative frequencies respectively
destroy or create a particle with mass m.
On the basis of these considerations, we rewrite the expansion (6.88)
substituting the amplitudes of the normal modes of oscillation by, respectively,
destruction and creation operators for electrons and positrons, $a, a^\dagger$ and $b, b^\dagger$:
$$\psi(x) = \sum_{\mathbf{p},r}\sqrt{\frac{m}{E(\mathbf{p})V}}\,\left[a_r(\mathbf{p})u_r(\mathbf{p})e^{-i(px)} + b_r(\mathbf{p})^\dagger v_r(\mathbf{p})e^{i(px)}\right] . \tag{7.12}$$

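The algebraic structure of these operators is determined by considering
the expressions in (6.90) and (6.91) which now we interpret as the energy
and momentum of the field. Carrying out the multiplication of the operators
without changing the order in which they appear in (6.90) and (6.91) we
obtain:
$$H = \int d^3x\,\psi^\dagger(\mathbf{x}, t)[-i\boldsymbol{\alpha}\cdot\nabla + \beta m]\psi(\mathbf{x}, t) = \sum_{\mathbf{p},r} E(\mathbf{p})\,[a_r(\mathbf{p})^\dagger a_r(\mathbf{p}) - b_r(\mathbf{p})b_r(\mathbf{p})^\dagger] \tag{7.13}$$
$$\mathbf{P} = \int d^3x\,\psi^\dagger(\mathbf{x}, t)(-i\nabla)\psi(\mathbf{x}, t) = \sum_{\mathbf{p},r} \mathbf{p}\,[a_r(\mathbf{p})^\dagger a_r(\mathbf{p}) - b_r(\mathbf{p})b_r(\mathbf{p})^\dagger] . \tag{7.14}$$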

We consider the second term in the Hamiltonian. The operator $bb^\dagger$ is
positive semi-definite.¹ This prevents the assignment of commutation relations to
the creation and destruction operators equal to those of the quantum harmonic
oscillator. In this case, for fixed p, we have:
$$-bb^\dagger = -b^\dagger b - 1 = -N(\mathbf{p}) - 1 \tag{7.15}$$
where N(p) is the occupation number of the mode p. The right-hand side
can take arbitrarily negative values and the resulting Hamiltonian will be
unbounded from below.
To obtain a consistent theory, the spectrum of the operator $b^\dagger b$ must be
bounded, as happens for the Fermi oscillator² which satisfies anticommutation
rules. In this case we obtain:
$$-bb^\dagger = +b^\dagger b - 1 = +N(\mathbf{p}) - 1 \tag{7.16}$$
which has eigenvalues 0, −1 (−1 for the vacuum). Using equation (7.16) for
all the values of p and r, we obtain:
$$H = \sum_{\mathbf{p},r} E_p\,\left[N_r(\mathbf{p}) + \bar N_r(\mathbf{p})\right] , \tag{7.17}$$
where $N_r(\mathbf{p}) = a_r^\dagger(\mathbf{p})a_r(\mathbf{p})$ and $\bar N_r(\mathbf{p}) = b_r^\dagger(\mathbf{p})b_r(\mathbf{p})$ are the number operators
for the particles and antiparticles. In going from (7.13) to (7.17) we have
omitted a constant term which corresponds to the zero point energy of the
oscillators. Physically, this is equivalent to choosing the value $\langle 0|H|0\rangle$ as the
zero of the energy scale.
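
The properties of a single Fermi oscillator used above can be checked with explicit 2×2 matrices. The following is a minimal sketch for one mode only; the choice of basis (index 0 for the empty state, index 1 for the occupied state) and the variable names are ours.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

# Basis: index 0 = empty state |0>, index 1 = occupied state |1>.
b = [[0, 1], [0, 0]]         # destruction operator: b|1> = |0>, b|0> = 0
b_dag = [[0, 0], [1, 0]]     # creation operator:    b_dag|0> = |1>

identity = [[1, 0], [0, 1]]
number = matmul(b_dag, b)                            # N = b_dag b
anticomm = add(matmul(b, b_dag), matmul(b_dag, b))   # {b, b_dag}
b_squared = matmul(b, b)                             # b^2

# Both sides of -b b_dag = N - 1, equation (7.16).
minus_b_bdag = [[-x for x in row] for row in matmul(b, b_dag)]
n_minus_one = [[number[i][j] - identity[i][j] for j in range(2)]
               for i in range(2)]

print("N =", number)
print("{b, b_dag} =", anticomm)
print("-b b_dag =", minus_b_bdag)
```

In this basis N is diagonal with entries 0 and 1, so $-bb^\dagger$ is diagonal with entries −1 (empty state) and 0 (occupied state), which is the bounded spectrum required for stability.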
¹ Actually, $\langle A|bb^\dagger|A\rangle = \sum_n \langle A|b|n\rangle\langle n|b^\dagger|A\rangle = \sum_n |\langle A|b|n\rangle|^2 \ge 0$, whatever
the state $|A\rangle$.
² The Fermi oscillator is defined by the anticommutation rules $\{b, b\} = \{b^\dagger, b^\dagger\} = 0$;
$\{b, b^\dagger\} = 1$. From the first it follows that $b^2 = 0$ and therefore $N^2 = b^\dagger b b^\dagger b = b^\dagger b = N$,
or $N(N - 1) = 0$. The eigenvalues of N are therefore 0, 1.

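The preceding equations are generalised by the following anticommutation
rules:
$$\{a_r(\mathbf{p}), a_{r'}^\dagger(\mathbf{p}')\} = \{b_r(\mathbf{p}), b_{r'}^\dagger(\mathbf{p}')\} = \delta_{rr'}\delta_{\mathbf{p}\mathbf{p}'} \tag{7.18}$$
$$\{a_r(\mathbf{p}), a_{r'}(\mathbf{p}')\} = \{b_r(\mathbf{p}), b_{r'}(\mathbf{p}')\} = \{a_r(\mathbf{p}), b_{r'}(\mathbf{p}')\} = \{a_r^\dagger(\mathbf{p}), b_{r'}^\dagger(\mathbf{p}')\} = 0. \tag{7.19}$$
Now we consider the normalisation factor in (6.89), which we recognise to
be the integral of the time component of the 4-current $J^\mu = \bar\psi\gamma^\mu\psi$. Proceeding
in a similar way to before and omitting a constant (infinite!) which represents
the charge of the vacuum, we obtain:
$$\int d^3x\,J^0(x) = \sum_{\mathbf{p},r}\left[N_r(\mathbf{p}) - \bar N_r(\mathbf{p})\right] . \tag{7.20}$$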

We can therefore see that the particles created by b† , while they have
mechanical properties, mass and spin, equal to those created by a† , have
opposite charge.
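
For the reader who wishes to see where the relative sign in (7.20) comes from, the intermediate step can be sketched as follows. We assume the mode expansion (7.12), the same spinor normalisation used to obtain (7.13), and the anticommutation rule $\{b, b^\dagger\} = 1$ for each mode; labels are suppressed in the second line.

```latex
\int d^3x\, J^0(x) = \int d^3x\, \psi^\dagger(x)\psi(x)
  = \sum_{\mathbf{p},r} \left[ a_r(\mathbf{p})^\dagger a_r(\mathbf{p}) + b_r(\mathbf{p})\, b_r(\mathbf{p})^\dagger \right]
  = \sum_{\mathbf{p},r} \left[ a_r^\dagger a_r - b_r^\dagger b_r + 1 \right]
  = \sum_{\mathbf{p},r} \left[ N_r(\mathbf{p}) - \bar N_r(\mathbf{p}) \right] + \text{constant}
```

Unlike the Hamiltonian (7.13), the charge contains $+bb^\dagger$, so the same reordering that makes the energy positive makes the antiparticle contribution to the charge negative.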

7.3 CANONICAL QUANTISATION OF THE DIRAC FIELD

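We will follow, as far as possible, a similar path to the one of the preceding
section in constructing a quantised field which is subject to the Dirac equation.
The Lagrangian density in the absence of interactions is written (with
$\bar\psi = \psi^\dagger\gamma^0$):
$$\mathcal{L} = \bar\psi(i\partial_\mu\gamma^\mu - m)\psi . \tag{7.21}$$
From here, we find:
$$\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} = i\bar\psi\gamma^\mu; \quad \frac{\partial\mathcal{L}}{\partial\psi} = -m\bar\psi$$
$$\frac{\partial\mathcal{L}}{\partial(\partial_\mu\bar\psi)} = 0; \quad \frac{\partial\mathcal{L}}{\partial\bar\psi} = (i\partial_\mu\gamma^\mu - m)\psi . \tag{7.22}$$
The Euler–Lagrange equations are therefore:
$$(i\partial_\mu\gamma^\mu - m)\psi = 0; \quad i\partial_\mu(\bar\psi\gamma^\mu) + m\bar\psi = 0 . \tag{7.23}$$
Using the relation $\gamma^0(\gamma^\mu)^\dagger\gamma^0 = \gamma^\mu$, it can be seen that the second equation
is the adjoint of the first. In other words, if we consider ψ and $\bar\psi$ as independent
variables, the two equations tell us that $\bar\psi\gamma^0$ is the hermitian conjugate of ψ.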
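The Lagrangian (7.21) is invariant under translations and Lorentz
transformations. The latter, for infinitesimal transformations, can be written:
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu = x^\mu + \frac{1}{2}\epsilon_{\alpha\beta}\,(g^{\alpha\mu}\delta^\beta_\nu - g^{\beta\mu}\delta^\alpha_\nu)\,x^\nu$$
$$\psi'(x') = S(\Lambda)\psi(x) = \psi(x) - \frac{i}{4}\epsilon_{\alpha\beta}\,\sigma^{\alpha\beta}\psi(x) \tag{7.24}$$
with:
$$\sigma^{\alpha\beta} = \frac{i}{2}[\gamma^\alpha, \gamma^\beta] \tag{7.25}$$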
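Moreover, the Lagrangian (7.21) possesses a global symmetry, associated
with the transformation of the fields by a constant phase
$$\psi(x) \to e^{i\alpha}\psi(x); \quad \bar\psi(x) \to e^{-i\alpha}\bar\psi(x) \tag{7.26}$$
From (7.22), (7.24) and (7.26) we immediately find:
• the canonical energy-momentum tensor:
$$T^{\mu,\nu} = i\bar\psi\gamma^\mu\partial^\nu\psi - g^{\mu\nu}\mathcal{L} , \tag{7.27}$$
• the energy and the momentum of the field:
$$E = \int d^3x\,\psi^\dagger(-i\boldsymbol{\alpha}\cdot\nabla + m\beta)\psi$$
$$\mathbf{P} = \int d^3x\,\psi^\dagger(-i\nabla)\psi , \tag{7.28}$$
• the symmetric energy-momentum tensor:
$$\theta^{\mu\nu} = \frac{i}{4}\left[\bar\psi\gamma^\mu\partial^\nu\psi - (\partial^\nu\bar\psi)\gamma^\mu\psi + (\mu \leftrightarrow \nu)\right] , \tag{7.29}$$
• the conserved current connected to the transformation (7.26):
$$j^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x); \quad \partial_\mu j^\mu(x) = 0. \tag{7.30}$$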

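Comment. To construct the symmetric energy-momentum tensor according
to the Belinfante–Rosenfeld procedure, we begin from the spin part of the
angular momentum tensor. On the basis of (7.24) we can write
$$\Sigma^{\mu,\alpha\beta} = \frac{1}{2}\bar\psi(\gamma^\mu\sigma^{\alpha\beta})\psi = \frac{i}{4}\bar\psi(\gamma^\mu[\gamma^\alpha, \gamma^\beta])\psi . \tag{7.31}$$
Defining:
$$S_1^{\alpha\beta} = \partial_\mu\Sigma^{\mu,\alpha\beta}; \quad S_2^{\alpha\beta} = \partial_\mu\Sigma^{\alpha,\mu\beta} = S_3^{\beta\alpha} \tag{7.32}$$
we have:
$$\theta^{\alpha\beta} = T^{\alpha,\beta} + \frac{1}{2}(S_1 - S_2 - S_3)^{\alpha\beta} \tag{7.33}$$
(the tensor $\frac{1}{2}S_1$ cancels the antisymmetric part of $T^{\alpha,\beta}$ while the subtraction
of the symmetric tensor $S_2 + S_3$ balances the 4-divergence of $S_1$). Explicitly,
we have
$$\frac{1}{2}(S_1 - S_2 - S_3) = \frac{i}{8}\left[(\partial_\mu\bar\psi)X^\mu\psi + \bar\psi X^\mu(\partial_\mu\psi)\right];$$
$$X^{\mu,\alpha\beta} = \frac{1}{4}\left(\gamma^\mu[\gamma^\alpha, \gamma^\beta] - \gamma^\alpha[\gamma^\mu, \gamma^\beta] - \gamma^\beta[\gamma^\mu, \gamma^\alpha]\right) . \tag{7.34}$$
Using the anticommutation rule for the γ matrices, we can make the
operators $\partial_\mu\gamma^\mu$ operate on the fields, where they give ±im. It is easy to see that
the terms proportional to m cancel and we are left with the results of the
anticommutations, from which:
$$\frac{1}{2}(S_1 - S_2 - S_3)^{\alpha\beta} = \frac{i}{4}\bar\psi(\partial^\alpha\gamma^\beta - 3\partial^\beta\gamma^\alpha)\psi - \frac{i}{4}\left[(\partial^\alpha\bar\psi)\gamma^\beta\psi + (\partial^\beta\bar\psi)\gamma^\alpha\psi\right] . \tag{7.35}$$
Summing this result with (7.27) leads to (7.29).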

Hamiltonian Formalism. The conjugate momentum of the field ψ is found
from (7.22):
$$\Pi(x) = i\psi^\dagger(x) \tag{7.36}$$
while the conjugate momentum of $\bar\psi$ vanishes; in the Hamiltonian scheme
there is only one pair of conjugate variables, which are ψ and $i\psi^\dagger$. The
Hamiltonian density agrees with the first result from (7.28), which has already been
expressed in terms of conjugate variables.
Canonical quantisation would prescribe that the conjugate variables satisfy
commutation rules similar to those of non-relativistic quantum mechanics
(3.26). However, the requirement of an energy bounded from below demands
anticommutation rules for the electron and positron creation and destruction
operators, as we saw in the preceding section. This corresponds to translating
(3.26) into equal-time anticommutation rules:
$$\{\psi_\alpha(\mathbf{x}, t), \Pi_\beta(\mathbf{y}, t)\} = i\delta_{\alpha\beta}\,\delta^{(3)}(\mathbf{x} - \mathbf{y})$$
or
$$\{\psi_\alpha(\mathbf{x}, t), \bar\psi_\beta(\mathbf{y}, t)\} = \gamma^0_{\alpha\beta}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}) . \tag{7.37}$$

Equations (7.37) determine the commutators of the dynamic variables with
the Hamiltonian and thus the equations of motion. We use the identity:
$$[a, b \cdot c] = \{a, b\}c - b\{a, c\} \tag{7.38}$$
to find:
$$i\frac{\partial\psi}{\partial t} = [\psi(x), H] = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi . \tag{7.39}$$
The Heisenberg equations of motion reproduce the Dirac equation, as they
should. We can also calculate the commutator of the field with the conserved
charge, with the result:
$$[\psi(x), Q] = +\psi(x) . \tag{7.40}$$
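
The identity (7.38) is purely algebraic and holds for any three operators. A quick way to convince oneself is to test it on arbitrary matrices; the sketch below uses three 2×2 integer matrices chosen by us for illustration, so the comparison is exact.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

# Arbitrary test matrices (illustrative values only).
A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]
C = [[2, 7], [-3, 1]]

lhs = commutator(A, matmul(B, C))                        # [a, b c]
rhs = sub(matmul(anticommutator(A, B), C),               # {a, b} c
          matmul(B, anticommutator(A, C)))               # - b {a, c}

print(lhs == rhs)
```

The identity is what allows a commutator with a bilinear such as H or Q to be evaluated using only the canonical anticommutators (7.37).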

The Space of States. Proceeding in a similar way to what was done for
the Klein–Gordon equation, we can derive from (6.88) the $a_s(\mathbf{p})$ and $b_s(\mathbf{p})$
operators, taking advantage of the orthonormality conditions for the u and v
spinors. We find:
$$a_s(\mathbf{p}) = \int d^3x\,\sqrt{\frac{m}{EV}}\,e^{i(px)}\,(\bar u_s(\mathbf{p})\gamma^0\psi(x));$$
$$b_s(\mathbf{p}) = \int d^3x\,\sqrt{\frac{m}{EV}}\,e^{i(px)}\,(\bar\psi(x)\gamma^0 v_s(\mathbf{p})) . \tag{7.41}$$
Starting from the canonical anticommutators, the anticommutators of
these operators are calculated, which as expected agree with those already
found earlier:
$$\{a_s(\mathbf{p}), a_{s'}^\dagger(\mathbf{p}')\} = \{b_s(\mathbf{p}), b_{s'}^\dagger(\mathbf{p}')\} = \delta_{s,s'}\,\delta_{\mathbf{p},\mathbf{p}'} \tag{7.42}$$
and all the other anticommutators equal to zero.

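Using the anticommutation rules, we can express the conserved quantities
in a way formally identical to (4.52):
$$H = \int d^3x\,\theta^{00} = \sum_{\mathbf{k}}\omega(\mathbf{k})\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b^\dagger(\mathbf{k})b(\mathbf{k})] + \text{constant}$$
$$P^i = \int d^3x\,\theta^{0i} = \sum_{\mathbf{k}} k^i\,[a^\dagger(\mathbf{k})a(\mathbf{k}) + b^\dagger(\mathbf{k})b(\mathbf{k})]$$
$$Q = \int d^3x\,J^0 = \sum_{\mathbf{k}}[a^\dagger(\mathbf{k})a(\mathbf{k}) - b^\dagger(\mathbf{k})b(\mathbf{k})] . \tag{7.43}$$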

We see that the Hilbert space of the theory is that of a set of Fermi
oscillators. Explicitly, it consists of:

• the state $|0\rangle$, the vacuum, which is annihilated by
the destruction operators:
$$|0\rangle \text{ such that: } a_s(\mathbf{p})|0\rangle = b_r(\mathbf{q})|0\rangle = 0, \text{ for each } s, r, \mathbf{p}, \mathbf{q}; \tag{7.44}$$
• the states with given occupation numbers, which are obtained by
applying the creation operators $a^\dagger$ and $b^\dagger$ to the vacuum:
$$|n_1, n_2, \ldots; m_1, m_2, \ldots\rangle = [a_{s_1}^\dagger(\mathbf{p}_1)]^{n_1}[a_{s_2}^\dagger(\mathbf{p}_2)]^{n_2}\ldots[b_{r_1}^\dagger(\mathbf{q}_1)]^{m_1}[b_{r_2}^\dagger(\mathbf{q}_2)]^{m_2}\ldots|0\rangle . \tag{7.45}$$

As can be seen from the expression for the momentum, the a and b
operators destroy relativistic particles of mass m and spin ½. The quantum states
are those of a perfect gas made up of fermions of two different types, with
equal mechanical properties.
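
To make the structure of this space concrete, the following sketch enumerates the occupation-number states (7.45) for a toy system with two momentum modes and evaluates the eigenvalues of H and Q from (7.43), with the additive constant dropped. The mode energies are hypothetical numbers in arbitrary units and the spin index is ignored; since $(a^\dagger)^2 = (b^\dagger)^2 = 0$, each occupation number is restricted to 0 or 1.

```python
from itertools import product

omegas = [1.0, 2.0]   # hypothetical mode energies omega(k), arbitrary units

def states(num_modes):
    """Yield (n, m): particle and antiparticle occupation numbers, each 0 or 1."""
    for occ in product((0, 1), repeat=2 * num_modes):
        yield occ[:num_modes], occ[num_modes:]

def energy(n, m):
    # H = sum_k omega(k) (n_k + m_k): both kinds of quanta carry positive energy.
    return sum(w * (ni + mi) for w, ni, mi in zip(omegas, n, m))

def charge(n, m):
    # Q = sum_k (n_k - m_k): the two kinds of quanta carry opposite charge.
    return sum(n) - sum(m)

table = [(n, m, energy(n, m), charge(n, m)) for n, m in states(len(omegas))]

for n, m, e, q in table:
    print(f"n={n} m={m}  E={e:.1f}  Q={q:+d}")
```

For M modes there are $2^{2M}$ states in this toy model; the vacuum has E = 0 and Q = 0, and every other state has strictly positive energy.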

Further information on the nature of these particles is provided by the
conserved charge. We consider a matrix element of the commutator (7.40).
We obtain³:
$$\langle q'|[\psi, Q]|q\rangle = (q - q')\langle q'|\psi|q\rangle = +\langle q'|\psi|q\rangle, \quad \text{or: } q' = q - 1. \tag{7.46}$$
In both cases, the action of the field reduces the conserved charge by one
unit, therefore:

• the particle destroyed by $\psi^{(+)}$ has opposite charge to that created by
$\psi^{(-)}$.

In conclusion, the duplication of the sign of $p^0 = \pm\omega(\mathbf{p})$, which is inevitable
in a relativistic theory, is reflected in the characterisation of the positive and
negative frequency components as operators for the destruction and creation
of particles. In the presence of a conserved charge with non-zero eigenvalues,
the particles created by $\psi^{(-)}$ are a sort of mirror image of those destroyed by
$\psi^{(+)}$, in the sense that they have equal mechanical properties (mass and spin)
and opposite charge.
The combination of special relativity with quantum mechanics requires the
existence of antimatter.

Comment. The antiparticle of the proton, the negatively charged antiproton,
was discovered in 1955 with the Bevatron particle accelerator in Berkeley,
by Segrè, Chamberlain, Wiegand and Ypsilantis. Experimentally, the antiproton
mass agrees with the mass of the proton to within a very small error, of
order one part in 100 million [15]. The neutron also has an antiparticle, the
antineutron. This is due to the existence of a conserved charge independent of
electric charge. This new conserved charge, the baryon number, is necessary
to take account of the extreme stability of matter.⁴ The baryon number is
associated with the current:
$$B^\mu = \bar\psi_P\gamma^\mu\psi_P + \bar\psi_N\gamma^\mu\psi_N \tag{7.47}$$
where $\psi_{P,N}$ denote the Dirac fields associated with the proton and neutron.
The value +1 is conventionally assigned to the baryon number of the proton
and neutron (and thus −1 to the antiproton and antineutron), while the lighter
particles, electrons and neutrinos, have zero baryon numbers. Many other
particles are now known which decay into protons or neutrons and therefore
also have non-zero baryon number.
The simultaneous conservation of electric charge and baryon number gives
rise to the stability of the hydrogen atom (which has Q = 0 but B = 1), which
could otherwise convert into purely electromagnetic radiation.
³ The same result is obtained for the Klein–Gordon field starting from (4.46).
⁴ Electric charge conservation alone would allow the decay of the proton into a positron
and a neutral meson: $P \to e^+\pi^0$. The present non-observation of this decay leads to the
limit on the proton’s lifetime: $\tau_P > 1.6 \cdot 10^{34}$ years, see [16].

7.4 THE REPRESENTATION OF THE LORENTZ GROUP

We will now show what was anticipated in Section 6.1.2, that using the
S(Λ) matrices we can construct a set of unitary operators that represent
the Lorentz transformations in the space of occupation numbers (following
the method of Wigner). It is sufficient to limit the discussion to states with
only one particle. We therefore consider the states:
$$|p, r\rangle = a_r(\mathbf{p})^\dagger|0\rangle . \tag{7.48}$$

We must define the action of the $L_+^\uparrow$ transformations on these states, by
means of unitary operators U(Λ) which satisfy the composition rules of the
group. The Lorentz transformations act on 4-vectors on the mass hyperboloid
defined by $p^\mu p_\mu = p^2 = m^2$.
We choose a specific 4-vector, $p_0$. The $L_+^\uparrow$ transformations convert $p_0$ into
4-vectors in the region of the mass hyperboloid with $p^0 > +m$. Conversely,
every $p^\mu$ is characterised by the Lorentz transformation which converts $p_0$ into
p. For particles with mass m ≠ 0, we choose
$$p_0^\mu = (m, \mathbf{0}) . \tag{7.49}$$
For each p we can define a transformation we call a boost, which takes $p_0$
into p. We write concisely:
$$L(p)\,p_0 = p \tag{7.50}$$
and define the action of U(L(p)) on the states $|p_0, r\rangle$ in the following way:
$$|p, r\rangle = U(L(p))\,|p_0, r\rangle . \tag{7.51}$$
(Note that U is defined so as not to change the spin indices.)

Now we consider a general transformation:
$$p'^\mu = \Lambda^\mu{}_\nu\,p^\nu . \tag{7.52}$$
The crucial observation is that:
$$p' = L(\Lambda p)\,p_0 = \Lambda p = \Lambda L(p)\,p_0 . \tag{7.53}$$
Therefore, the transformation $L^{-1}(\Lambda p)\Lambda L(p)$ changes $p_0$ into itself and
must belong to the subgroup of Lorentz transformations which leaves $p_0$
invariant. Wigner called this subgroup the little group of the representation. In
our case, given the form of $p_0$ in (7.49), these transformations are rotations
in three-dimensional space, which we denote as R. The explicit form of the
transformations follows from the unitary transformations U(Λ). We set
$$U(\Lambda) = U(L(\Lambda p))\,U(R)\,U(L^{-1}(p)); \tag{7.54}$$

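from which:
$$\begin{aligned} U(\Lambda)|p, r\rangle &= U(L(\Lambda p))\,U(R)\,U(L^{-1}(p))\,|p, r\rangle \\ &= U(L(\Lambda p))\,U(R)\,|p_0, r\rangle \\ &= U(L(\Lambda p))\sum_s S(R)_{r,s}\,|p_0, s\rangle \\ &= \sum_s S(R)_{r,s}\,|\Lambda p, s\rangle ; \end{aligned} \tag{7.55}$$
$$S(R)_{r,s} = \bar u_s(0)\,S(R)\,u_r(0) = \chi_s^\dagger\,S(R)\,\chi_r \tag{7.56}$$
where S(R) are the matrices defined in (6.39).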

The mapping Λ → R preserves the composition rule of $L_+^\uparrow$. To be exact,
if $\Lambda = \Lambda_1\Lambda_2$:
$$U(\Lambda_1)U(\Lambda_2) = \{U(L(\Lambda_1 q))\,U(R_1)\,U(L^{-1}(q))\}\{U(L(\Lambda_2 p))\,U(R_2)\,U(L^{-1}(p))\}$$
We set $q = \Lambda_2 p$. Because $U(L^{-1}(q))\,U(L(q)) = 1$, we obtain:
$$U(\Lambda_1)U(\Lambda_2) = U(L(\Lambda_1\Lambda_2 p))\,U(R_1)\,U(R_2)\,U(L^{-1}(p)),$$
therefore,
$$\text{if: } \Lambda_1 \to R_1; \ \Lambda_2 \to R_2; \quad \text{then: } \Lambda_1\Lambda_2 \to R_1R_2$$
and equation (7.55) defines a representation of the $L_+^\uparrow$ group.

Corresponding to every unitary representation of the little group,
equation (7.55) therefore produces a unitary representation of the Lorentz group
on the eigenstates of the momentum of the particle. For Dirac particles, the
non-unitary nature of S(Λ) already noted in Section 6.1.2 does not change the
result. Effectively, according to the method of Wigner, the representation of
the states uses only the component of S which corresponds to the group of
spatial rotations, for which S is unitary:
$$S(R) = S = 1 - \frac{i}{4}\omega_{ij}\sigma^{ij} \quad (i, j = 1, 2, 3)$$
$$(\sigma^{ij})^\dagger = \sigma^{ij} \ \to\ S(R)^\dagger S(R) = 1 .$$
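
The contrast between rotations and boosts can be seen numerically. The sketch below assumes the standard representation, in which $\Sigma_3 = \mathrm{diag}(1, -1, 1, -1)$ and $\alpha_1 = \begin{pmatrix} 0 & \sigma_1 \\ \sigma_1 & 0 \end{pmatrix}$, and uses the closed forms $S = e^{-i\theta\Sigma_3/2}$ for a finite rotation around the z-axis and $S = \cosh(\omega/2) + \alpha_1\sinh(\omega/2)$ for a boost along the 1-axis (the sign of ω depends on convention and does not affect the conclusion).

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dagger(a):
    """Hermitian conjugate of a 4x4 complex matrix."""
    return [[a[j][i].conjugate() for j in range(4)] for i in range(4)]

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(4) for j in range(4))

def rotation_z(theta):
    """S(R) = exp(-i theta/2 Sigma_3); diagonal since Sigma_3 = diag(1, -1, 1, -1)."""
    phases = [cmath.exp(-0.5j * theta * s) for s in (1, -1, 1, -1)]
    return [[phases[i] if i == j else 0j for j in range(4)] for i in range(4)]

def boost_x(omega):
    """S = cosh(omega/2) + alpha_1 sinh(omega/2), using alpha_1^2 = 1."""
    c, s = math.cosh(omega / 2), math.sinh(omega / 2)
    alpha1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
    return [[complex(c * (i == j) + s * alpha1[i][j]) for j in range(4)]
            for i in range(4)]

rot = rotation_z(0.7)
boost = boost_x(0.7)

print("rotation unitary:", is_identity(matmul(dagger(rot), rot)))
print("boost unitary:   ", is_identity(matmul(dagger(boost), boost)))
```

The boost matrix is hermitian rather than unitary, so $S^\dagger S = S^2 \neq 1$; only the rotation part of S enters the Wigner construction.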

In conclusion, relativistic particles with non-zero mass are characterised
by two quantum numbers:
• the value of the mass, m ≠ 0,
• the (integer or half-integer) value of the spin, which characterises the
representation of the little group of $p_0$.
For particles of zero mass, like the photon, $p_0$ can be chosen as:
$$p_0 = (\omega, \omega\,\mathbf{n}_3) \tag{7.57}$$
where $\mathbf{n}_3$ is the unit vector along the z-axis. The little group is the
one-dimensional group of rotations around the z-axis.
We leave to the reader the task of constructing the corresponding
representations, to show the result already given in Section 2.2.

Normal Products. The prescription for obtaining multilinear expressions
of the fermion fields with zero vacuum expectation value must be modified
appropriately to take account of the anticommutation rules.
Once the fields are separated into positive and negative frequency parts,
the correct prescription is to write the operators with negative frequency to the
left of those with positive frequency, with positive or negative sign according
to whether the number of exchanges necessary to arrive at this configuration
from the starting point is even or odd. For example:
$$\begin{aligned} N(\psi(x)\bar\psi(y)) &= {:}\psi(x)\bar\psi(y){:} = {:}(\psi^{(+)}(x) + \psi^{(-)}(x))(\bar\psi^{(+)}(y) + \bar\psi^{(-)}(y)){:} \\ &= \psi^{(+)}(x)\bar\psi^{(+)}(y) - \bar\psi^{(-)}(y)\psi^{(+)}(x) + \psi^{(-)}(x)\bar\psi^{(+)}(y) + \psi^{(-)}(x)\bar\psi^{(-)}(y). \end{aligned} \tag{7.58}$$
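
The sign rule used in (7.58) is easy to mechanise. The following sketch treats each field component as a (label, frequency) pair and moves the negative-frequency (creation) parts to the left of the positive-frequency (destruction) parts, counting one factor −1 per exchange of neighbours. The function name and the string labels are ours, introduced only for illustration.

```python
def normal_order(ops):
    """Return (sign, ordered) for a product of fermion field components.

    Each operator is a (label, frequency) pair, with frequency '+' for a
    destruction part and '-' for a creation part. Creation parts are moved
    to the left; every exchange of two neighbours contributes a factor -1.
    """
    ops = list(ops)
    sign = 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ops) - 1):
            # Only a '+' standing directly to the left of a '-' is exchanged,
            # so operators of the same kind keep their relative order.
            if ops[i][1] == "+" and ops[i + 1][1] == "-":
                ops[i], ops[i + 1] = ops[i + 1], ops[i]
                sign = -sign
                swapped = True
    return sign, ops

# The four terms of psi(x) psibar(y), as in (7.58).
for fx in "+-":
    for fy in "+-":
        sign, ordered = normal_order([("psi(x)", fx), ("psibar(y)", fy)])
        labels = " ".join(f"{name}^({freq})" for name, freq in ordered)
        print("+" if sign > 0 else "-", labels)
```

Only the term with $\psi^{(+)}(x)$ to the left of $\bar\psi^{(-)}(y)$ needs an exchange, which is why it is the only one to appear with a minus sign in (7.58).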

7.5 MICROCAUSALITY
As we saw in Section 1.3, given an event x, space-time divides into distinct
regions regarding the causal connection of different events y with event x. The
region located outside the two light cones originating from x represents the
absolute present of x. These events are characterised by the fact that the
interval y − x is spacelike, or $(y - x)^2 < 0$. For brevity, we write y ∼ x.
Measurements carried out on two observables localised at x and y
respectively cannot influence one another when x ∼ y, because this would imply
propagation of signals faster than the limit set by the speed of light in a vacuum.
From quantum mechanical principles it follows that, under these conditions,
the corresponding operators commute among themselves:
$$[O_1(x), O_2(y)] = 0, \quad \text{if } x \sim y . \tag{7.59}$$

The relation (7.59) is known as the microcausality condition. The
hypothesis that it should hold for any value of the x–y separation is a very stringent
condition, which could be violated on microscopic length scales. However, the
experimental consequences of microcausality have been confirmed down to the
smallest distances so far tested, of the order of $10^{-15}$ cm.
Using relativistic invariance, we can extend the canonical quantisation
rules (4.44) and (7.37) from the region $\mathbf{y} \neq \mathbf{x}$, $y^0 = x^0$ to the whole of
the region of the present of x. For a general field χ, we find
$$[\chi_a(x), \chi_b^\dagger(y)]_\pm = [\chi_a(x), \chi_b(y)]_\pm = 0 \quad \text{for } x \sim y \tag{7.60}$$
where the ± sign denotes the anticommutator or commutator of the fields,
according to whether χ is a Dirac or Klein–Gordon field (the indices a, b
denote possible components of the field, either spinor components or related
to internal symmetries).
In the case of boson fields, (7.60) tells us that the components of the field
commute for spatial separations and are therefore potential observables, such
as the components of electric and magnetic fields.⁵
In contrast, fermion fields do not commute among themselves for spacelike
separations. The only possible interpretation of this result is that fermion
fields cannot be observables.
The non-observability of the fermion field is confirmed by its
transformation properties under rotation. We restrict equation (7.24) to the case of
rotations around the z-axis. In this case, the only non-zero components among
the infinitesimal parameters are $\epsilon_{12} = -\epsilon_{21} = \epsilon$ and we find:
$$\psi'(x') = \left(1 - i\frac{\epsilon}{2}\sigma^{12}\right)\psi(x); \quad \sigma^{12} = \Sigma^3 = \begin{pmatrix} \sigma_3 & 0 \\ 0 & \sigma_3 \end{pmatrix} . \tag{7.61}$$
For finite rotations, the previous equation becomes an exponential, so:
$$\psi'(x') = e^{-i\frac{\theta}{2}\Sigma_3}\,\psi(x). \tag{7.62}$$
Given that $\Sigma_3$ has eigenvalues ±1, the equation simply expresses the fact
that the field is associated with a spin ½ particle. However, for θ = 2π, when
x′ = x, we find:
$$\psi' = -\psi. \tag{7.63}$$
This result is obviously absurd for an observable quantity, which should
return to itself after a 2π rotation.
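
The sign change in (7.63) follows directly from the half-angle in (7.62). A short numerical illustration, assuming the standard representation in which $\Sigma_3$ is diagonal with entries +1, −1, +1, −1, is the following sketch.

```python
import cmath
import math

def rotation_phases(theta):
    """Diagonal entries of exp(-i theta/2 Sigma_3) for Sigma_3 = diag(1, -1, 1, -1)."""
    return [cmath.exp(-0.5j * theta * s) for s in (1, -1, 1, -1)]

after_2pi = rotation_phases(2 * math.pi)   # one full turn
after_4pi = rotation_phases(4 * math.pi)   # two full turns

print([complex(round(z.real, 12), round(z.imag, 12)) for z in after_2pi])
print([complex(round(z.real, 12), round(z.imag, 12)) for z in after_4pi])
```

Every component picks up the factor −1 after one full turn and returns to itself only after two, whereas any product of an even number of fields is already unchanged after a single turn.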
However, if ψ cannot be an observable, the physical quantities constructed
from ψ, such as the energy density or charge density, must be. On the basis of
equation (7.63), we can conclude that homogeneous functions of even powers
of the field are good candidates to be observables.
The same conclusion is reached by starting from the canonical
anticommutation rules. We consider as an example the commutator of two charge
densities:
$$[j^0(x), j^0(y)] \quad \text{for } x \sim y. \tag{7.64}$$
Repeatedly applying the identity (7.38), and similarly for the commutators,
we find:
$$\begin{aligned} {}[\psi^\dagger(x)\psi(x), j^0(y)] &= \psi^\dagger(x)[\psi(x), j^0(y)] + [\psi^\dagger(x), j^0(y)]\psi(x) \\ &= \psi^\dagger(x)\left(-\psi^\dagger(y)\{\psi(x), \psi(y)\} + \{\psi(x), \psi^\dagger(y)\}\psi(y)\right) + \ldots = 0 \end{aligned} \tag{7.65}$$
because all the anticommutators vanish for x ∼ y.
⁵ A complex scalar field is not observable, not being hermitian, but its real and imaginary
parts are in principle observables.

7.6 THE RELATION BETWEEN SPIN AND STATISTICS

We have seen in a comprehensive way that the quanta of the Dirac field,
particles of spin ½, must obey Fermi–Dirac (F–D) statistics. It is easy to
be convinced that the quanta of the Klein–Gordon field and of the
electromagnetic field, spin 0 and 1 respectively, should obey Bose–Einstein (B–E)
statistics. We consider the classical expression for the Hamiltonian of a real
Klein–Gordon field in terms of the amplitudes of the normal modes of
oscillation, cf. equation (4.52):
$$H = \sum_{\mathbf{k}}\omega(\mathbf{k})\,\frac{1}{2}\left[a^*(\mathbf{k})a(\mathbf{k}) + a(\mathbf{k})a^*(\mathbf{k})\right]. \tag{7.66}$$
If we wish to quantise according to the rules of the Dirac field:
$$a \to a; \quad a^* \to a^\dagger; \quad \{a, a^\dagger\} = 1 \tag{7.67}$$
we find the absurd result that the Hamiltonian is a multiple of the identity
operator so that every observable has to be a constant of the motion. The
same holds for the electromagnetic field and the photons, cf. equation (5.112).
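
The step leading to this conclusion is short enough to write out. Substituting (7.67) into (7.66), the symmetric combination of operators in the Hamiltonian is exactly the anticommutator, so that, mode by mode:

```latex
H = \sum_{\mathbf{k}} \omega(\mathbf{k})\, \frac{1}{2}
      \left[ a^\dagger(\mathbf{k}) a(\mathbf{k}) + a(\mathbf{k}) a^\dagger(\mathbf{k}) \right]
  = \sum_{\mathbf{k}} \omega(\mathbf{k})\, \frac{1}{2}
      \left\{ a(\mathbf{k}), a^\dagger(\mathbf{k}) \right\}
  = \frac{1}{2} \sum_{\mathbf{k}} \omega(\mathbf{k})
```

The result is a pure number: it commutes with every operator, and the Heisenberg equations would then give no time evolution at all.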
For particles of spin j ≤ 1 we can therefore state the spin-statistics
theorem:
• identical particles with integer spin obey Bose–Einstein statistics, while
those with half-integer spin conform to Fermi–Dirac statistics.
We can generalise the theorem to assemblies of arbitrary spin composed
from spin ½ particles. In fact, the particles which we know with spin greater
than 1 are all of this type:
• atomic nuclei, with mass number A and atomic number Z, composed of
Z protons and A − Z neutrons,
• atoms, composed of Z electrons and a nucleus with mass number A; in
all Z + A spin ½ particles,
• subnuclear particles, classified as baryons, made of three quarks which
are fundamental spin ½ particles, and mesons, composed of a
quark-antiquark pair.
In general, since spin ½ is the fundamental representation of the rotation
group, we can construct any value of angular momentum j by combining an
appropriate number N of spin ½ constituents, with
• N even: j = ½N = integer;
• N odd: j = ½N = half-integer.

The effective field which creates and destroys these assemblies (for
example the iron nucleus) is obtained starting from $\psi^N$, if ψ is the field of the
constituent.
After rotations by 360°, each field ψ gains a factor −1, or:
$$\Psi = \psi^N \to (-1)^N\Psi = e^{i2\pi j}\Psi \quad \text{(rotation through } 2\pi\text{)} \tag{7.68}$$
while under the exchange of constituents of two identical assemblies, the state
has a phase factor:
$$|N\rangle \to (-1)^N|N\rangle = |N\rangle\,e^{i2\pi j} . \tag{7.69}$$
Therefore the assemblies take a sign appropriate for either B–E or F–D
statistics, according to whether N is even or odd, or whether j is integer or
half-integer, in agreement with the theorem.
In a relativistic theory, a state with a definite j value can in general be
a superposition of one state with N constituents plus an indefinite number
of fermion–antifermion pairs. This corresponds to adding to Ψ components of
the type $\psi^N(\psi\psi^\dagger)^m$, with m variable. However, because $\psi\psi^\dagger$ corresponds to
integer angular momentum and the pairs are equivalent to 2m fermions, the
result does not change. The same holds if we introduce into the expression for Ψ
derivatives of ψ or $\psi^\dagger$. This corresponds to introducing orbital angular
momenta which, having integer eigenvalues, do not change the integer or
half-integer property of j. Finally, the same result arises if, as in nuclei or atoms, the
constituents are groups of different kinds of fermions; each fermion contributes
a factor −1 to (7.68) and the same factor to (7.69).

Comment 1. The first determination of the statistics of several nuclei was
due to Franco Rasetti, with an experiment carried out in 1928 (for a
description of Rasetti's experiment and the discussion which followed, see [17]).
The experiment concerned the nuclei $^{14}_{7}\mathrm{N}$ and $^{16}_{8}\mathrm{O}$ (the upper and lower
indices represent the values of A and Z) and both were shown to obey B–E
statistics. At that time, it was thought that a nucleus of mass number A and charge
Z was composed of A protons, the only heavy particle then known, and of
A − Z electrons, the nuclear electrons, confined within the nucleus, a total of
2A − Z fermions. This number is even or odd according to Z, whereas the nuclei
considered by Rasetti have Z both even and odd, yet obey the same
statistics. The paradox was resolved by the discovery of the neutron which,
as we explained, causes the number of fermions present in a nucleus to be A,
in agreement with the result of Rasetti and the spin-statistics theorem.
Rasetti's result was noted by Pauli, who cited it in the letter in which he
proposed the existence of the neutrino. Pauli thought the electron and the
neutrino emitted in β decays of nuclei existed permanently in the initial nucleus.
Assuming a spin ½ neutrino and one neutrino for each nuclear electron, the
number of fermions present became A + 2(A − Z) = 3A − 2Z, whose parity depends
only on that of A, in agreement with Rasetti. After the discovery
of the neutron, Fermi proposed that the eν̄p system (which has half-integer
angular momentum) was not permanently present in the nucleus but was
simply the product of the neutron decay, as an effect of the weak interaction.
The eν̄ pair is created from the vacuum by the creation operators included in
the interaction Hamiltonian, in a similar way to the photon produced by the
de-excitation of an atom, cf. Chapter 9.
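
The counting argument of Comment 1 can be made concrete with a short sketch. It tabulates, for the two nuclei studied by Rasetti, the number of spin ½ constituents in the three pictures discussed above and the statistics each picture would predict. The function and dictionary names are illustrative choices, not notation from the text.

```python
# Illustrative sketch: constituent counting for Rasetti's nuclei.
# All names below are hypothetical helpers introduced for this example.

OLD_MODEL = "protons + nuclear electrons"            # 2A - Z fermions
PAULI_MODEL = "Pauli: one neutrino per nuclear electron"  # 3A - 2Z fermions
NEUTRON_MODEL = "protons + neutrons"                 # A fermions

NUCLEI = {"N-14": (14, 7), "O-16": (16, 8)}  # name -> (A, Z)


def statistics(n_fermions):
    """Even number of spin-1/2 constituents -> B-E, odd -> F-D."""
    return "B-E" if n_fermions % 2 == 0 else "F-D"


def constituent_counts(A, Z):
    """Number of spin-1/2 constituents in each picture of the nucleus."""
    return {
        OLD_MODEL: A + (A - Z),
        PAULI_MODEL: A + 2 * (A - Z),
        NEUTRON_MODEL: A,
    }


def predictions():
    """Predicted statistics for each nucleus in each picture."""
    return {
        name: {model: statistics(n) for model, n in constituent_counts(A, Z).items()}
        for name, (A, Z) in NUCLEI.items()
    }


if __name__ == "__main__":
    for name, by_model in predictions().items():
        for model, stat in by_model.items():
            print(f"{name:5s} {model:42s} {stat}")
```

Only the proton plus nuclear-electron picture assigns F–D statistics to the nitrogen nucleus, which is the conflict with Rasetti's measurement described in the text.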

Comment 2. A similar paradox occurred at the origin of quark theory, in
which baryons are thought to be bound states of three quarks. The solution
to the problem introduced a new quark quantum number, colour, cf. Sects.
9.3, 18.1 and Ref. [13].

7.7 PROBLEMS FOR CHAPTER 7

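Sect. 7.3
1. Starting from the equal time anticommutators

$$\{\psi_\alpha(\mathbf{x},t),\,\bar\psi_\beta(\mathbf{y},t)\} = \gamma^0_{\alpha\beta}\,\delta^{(3)}(\mathbf{x}-\mathbf{y}) \tag{7.70}$$

prove that:

$$\{a_s(\mathbf{p}),\,a^\dagger_{s'}(\mathbf{p}')\} = \{b_s(\mathbf{p}),\,b^\dagger_{s'}(\mathbf{p}')\} = \delta_{s,s'}\,\delta_{\mathbf{p},\mathbf{p}'}$$

i.e. that the quanta obey Fermi–Dirac statistics.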

2. Consider the explicitly hermitian Lagrangian density:

$$\mathcal{L} = \frac{1}{2}\left(\mathcal{L}_D + \mathcal{L}_D^\dagger\right)$$

where $\mathcal{L}_D$ is the Dirac Lagrangian density

$$\mathcal{L}_D = \bar\psi\,(i\gamma_\mu\partial^\mu - m)\,\psi .$$

Use its explicit form to obtain:

– the equations of motion for the fields ψ and ψ̄;

– the expressions of the variables conjugate to the fields ψ and ψ̄;

– the energy, written in terms of the energy-momentum tensor θμν .

3. With ψ a spinor ﬁeld, consider the Lagrangian density
1
L = iψ̄γ μ (∂μ ψ) − i ∂μ ψ̄ γ μ ψ − M ψ̄ψ .
2
– Derive the equations of motion for ψ and ψ̄.
– Determine the spin of the quanta described by the ﬁeld ψ.
– Determine the mass of the quanta.
4. From the Heisenberg equation of motion and the Dirac Hamiltonian:

$$i\,\frac{\partial\psi}{\partial t} = [\psi(x),\,H], \qquad H = \bar\psi\,(i\boldsymbol{\alpha}\cdot\nabla + \beta m)\,\psi$$

and using the equal time anticommutators (7.70), derive the Dirac
equation for ψ (hint: use the identity [a, b · c] = {a, b}c − b{a, c}).
5. Consider the plane wave representation:

$$\psi(x) = \sum_{\mathbf{p},r}\sqrt{\frac{m}{E(\mathbf{p})V}}\,\left[a_r(\mathbf{p})\,u_r(\mathbf{p})\,e^{-i(px)} + b_r^\dagger(\mathbf{p})\,v_r(\mathbf{p})\,e^{i(px)}\right]$$

and the commutation rule

$$[\psi(x),\,Q] = q\,\psi(x) .$$

Compute the matrix elements of Q between the vacuum and the one-electron
state $|e,\mathbf{p},r\rangle$ and between $\langle\bar e,\mathbf{p},r|$ and the vacuum, where e and ē
denote the particles created by $a^\dagger$ and $b^\dagger$, with electric charges $q_e$ and $q_{\bar e}$,
respectively. Show that:

$$q_e = -q_{\bar e} = q .$$

Sect. 7.5
1. Consider the bilinear densities

$$J_A = \bar\psi(x)\,\Gamma_A\,\psi(x)$$

where $\Gamma_A$, A = 1, · · · 16, is one of the Dirac matrices introduced in
Sect. 6.1.2.
Prove the formula for the equal time commutator:

$$[J_A(\mathbf{x},0),\,J_B(\mathbf{y},0)] = \delta^{(3)}(\mathbf{x}-\mathbf{y})\;\psi^\dagger(\mathbf{y},0)\,\left[\gamma^0\Gamma_A,\,\gamma^0\Gamma_B\right]\psi(\mathbf{y},0) .$$

This shows, in particular, that the time component of the conserved
current $J_\mu$ has vanishing equal time commutators with all the current's
components, a fact at the basis of the so-called Ward identities, see the
problems in Chapter 14.
CHAPTER 8

FREE FIELD
PROPAGATORS

In this section we set ħ = c = 1.

8.1 THE TIME-ORDERED PRODUCT

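The time-ordered product of two scalar fields is defined as:

$$T\{\phi(x)\phi^\dagger(y)\} = \begin{cases} \phi(x)\,\phi^\dagger(y), & x^0 - y^0 > 0 \\ \phi^\dagger(y)\,\phi(x), & x^0 - y^0 < 0 . \end{cases} \tag{8.1}$$

The vacuum expectation value:

$$iD_F(x,y) = \langle 0|\,T\{\phi(x)\phi^\dagger(y)\}\,|0\rangle \tag{8.2}$$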

gives the quantum amplitude for the simplest observable process involving a
quantised field:
(i) creation of a particle by a source located at y and corresponding ab-
sorption at x, if x0 > y 0 , or
(ii) creation of an antiparticle by a source located at x and corresponding
absorption at y, if y 0 > x0 .
The function (8.2) is known as the propagator of the corresponding field
and its exact form depends on the mass and the spin of the quanta associated
with the field.
In view of the anticommutation relations satisfied by the Dirac field, the
time-ordered product of fermion fields must be antisymmetric:

$$T\{\psi(x)\bar\psi(y)\} = \begin{cases} \psi(x)\,\bar\psi(y), & x^0 - y^0 > 0 \\ -\bar\psi(y)\,\psi(x), & x^0 - y^0 < 0 , \end{cases} \tag{8.3}$$

and the fermion propagator is defined by:

$$i(S_F)_{\alpha\beta}(x,y) = \langle 0|\,T\{\psi_\alpha(x)\bar\psi_\beta(y)\}\,|0\rangle . \tag{8.4}$$

128 DOI: 10.1201/9781003436263-8
This chapter has been made available under a CC BY NC license.
In a field theory invariant under translations, the propagators are functions
only of the difference x − y.
This can be seen by inserting into (8.2) the products $U^\dagger U = 1$, where U
is the operator which translates by −y (the argument is repeated identically
in the fermion case (8.4)). The vacuum being invariant under the application of
U, we obtain:

$$\langle 0|\,T\{\phi(x)\phi^\dagger(y)\}\,|0\rangle = \langle 0|\,T\{\phi(x-y)\phi^\dagger(0)\}\,|0\rangle = iD_F(x-y,0) = iD_F(x-y). \tag{8.5}$$

When the particle associated with the field has a charge, as happens if
the field is complex, the charge flows in both cases from y towards x. The
propagator can be represented by a line directed from y to x.

8.2 PROPAGATORS OF THE SCALAR FIELD

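To calculate $iD_F(x)$ explicitly from (8.2), we proceed as follows. We write
the fields in terms of the positive and negative frequency components, and use
the fact that the former give zero when applied to the vacuum. We obtain:

$$\langle 0|\,T\{\phi(x)\phi^\dagger(0)\}\,|0\rangle = \begin{cases} \langle 0|\phi^{(+)}(x)\,(\phi^\dagger)^{(-)}(0)|0\rangle = \langle 0|[\phi^{(+)}(x),\,(\phi^\dagger)^{(-)}(0)]|0\rangle , & x^0 > 0 \\ \langle 0|(\phi^\dagger)^{(+)}(0)\,\phi^{(-)}(x)|0\rangle = -\langle 0|[\phi^{(-)}(x),\,(\phi^\dagger)^{(+)}(0)]|0\rangle , & x^0 < 0 . \end{cases}$$

We define the two functions:

$$i\Delta^{(+)}(x) = \langle 0|[\phi^{(+)}(x),\,(\phi^\dagger)^{(-)}(0)]|0\rangle ; \tag{8.6}$$
$$i\Delta^{(-)}(x) = \langle 0|[\phi^{(-)}(x),\,(\phi^\dagger)^{(+)}(0)]|0\rangle . \tag{8.7}$$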

We can quickly convince ourselves that these two functions are exactly the
solutions of the homogeneous Klein–Gordon equation which we introduced,
with the same name, in Section 4.2, equations (4.26) and (4.27). For example,
starting from the expansion (4.47) and taking account of the canonical
commutators, we find:

$$i\Delta^{(+)}(x) = \langle 0|[\phi^{(+)}(x),\,(\phi^\dagger)^{(-)}(0)]|0\rangle = \sum_{\mathbf{k}}\frac{1}{2\omega(\mathbf{k})V}\,e^{-i(kx)} = \frac{1}{(2\pi)^3}\int\frac{d^3k}{2\omega(\mathbf{k})}\,e^{-i(kx)}$$

which agrees with (4.26).
For the propagator, we therefore find:

$$iD_F(x) = \theta(x^0)\,i\Delta^{(+)}(x) - \theta(-x^0)\,i\Delta^{(-)}(x) \tag{8.8}$$

which agrees with the solution of the inhomogeneous equation with the
Feynman boundary conditions referred to in equation (4.33) with the same name:

Figure 8.1 Following the iε prescription of Feynman, the poles of the Green's
function move in the complex plane as shown in the figure. The path of integration to
obtain $iD_F$, equation (8.10), is now the real axis. The figure shows how the path
can be closed to obtain $iD_F = i\Delta^{(+)}$ for $x^0 > 0$. For $x^0 < 0$ the path should be
closed in the upper half-plane.

• The Feynman boundary conditions are the correct ones to create the
quantum propagator.
In summary, if the scalar field satisfies the equation:

$$(-\Box - \mu^2)\,\phi = 0 \tag{8.9}$$

the Feynman propagator is given, in terms of its Fourier transform, by

$$iD_F(x) = \langle 0|\,T\{\phi(x)\phi^\dagger(0)\}\,|0\rangle = \int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - \mu^2 + i\epsilon}\,e^{-i(kx)} . \tag{8.10}$$

As described in Section 4.2 the integration in the complex plane of $k^0$ is
carried out along the real axis and the iε prescription in (8.10) moves the
poles of the integrand as shown in Fig. 8.1.
Unlike the commutators of the fields, which satisfy the microcausality
condition, the propagator does not vanish for spatial separations. To be exact:

$$iD_F(\mathbf{x},0) = \lim_{t\to 0^+} i\Delta^{(+)}(\mathbf{x},t) = -\lim_{t\to 0^-} i\Delta^{(-)}(\mathbf{x},t) = \frac{1}{(2\pi)^3}\int d^3k\;\frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{2\sqrt{\mathbf{k}^2+\mu^2}} \neq 0 . \tag{8.11}$$

For large values of r = |x| the propagator decreases exponentially (for a
detailed study of the functions $\Delta^{(\pm)}(x)$ and $D_F(x)$, see Bogoliubov–Shirkov [2]):

$$iD_F(r,0) \simeq \frac{1}{8\pi}\sqrt{\frac{2}{\pi}}\;\frac{\mu^{1/2}}{r^{3/2}}\;e^{-\mu r} \tag{8.12}$$

consistent with the arguments of Section 7.1, by which a relativistic particle
can be localised at most to within its Compton wavelength, $\lambda_C = 1/\mu$ in
natural units. According to (8.11), we obtain a function localised only in the
limit μ → ∞, as is reasonable to expect:

$$\lim_{\mu\to\infty} iD_F(\mathbf{x},0) = \frac{1}{2\mu}\,\delta^{(3)}(\mathbf{x}) . \tag{8.13}$$

In the limit, we find again the non-relativistic situation in which a pointlike
source at the origin produces a particle in an eigenstate of the coordinate with
x = 0.
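
The exponential fall-off can be checked numerically. The sketch below assumes the standard closed form of the integral in (8.11), $iD_F(r,0) = \mu K_1(\mu r)/(4\pi^2 r)$ with $K_1$ the modified Bessel function, and compares it with the leading large-r behaviour that follows from $K_1(z) \simeq \sqrt{\pi/2z}\,e^{-z}$, namely the prefactor $\sqrt{2/\pi}/(8\pi)$. The function names and the simple trapezoidal quadrature are choices made for this illustration only.

```python
import math


def bessel_k1(z, steps=4000, t_max=12.0):
    """K_1(z) from the integral representation
    K_1(z) = int_0^inf exp(-z cosh t) cosh t dt, by the trapezoidal rule."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        c = math.cosh(i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-z * c) * c
    return total * h


def equal_time_propagator(r, mu):
    """Assumed closed form of the equal-time propagator: mu K_1(mu r) / (4 pi^2 r)."""
    return mu * bessel_k1(mu * r) / (4.0 * math.pi ** 2 * r)


def asymptotic_form(r, mu):
    """Leading large-r behaviour: sqrt(2/pi)/(8 pi) * mu^(1/2) r^(-3/2) exp(-mu r)."""
    prefactor = math.sqrt(2.0 / math.pi) / (8.0 * math.pi)
    return prefactor * math.sqrt(mu) * r ** -1.5 * math.exp(-mu * r)


def ratio(r, mu=1.0):
    """Exact over asymptotic; tends to 1 when r is many Compton wavelengths."""
    return equal_time_propagator(r, mu) / asymptotic_form(r, mu)


if __name__ == "__main__":
    for r in (5.0, 20.0, 50.0):
        print(f"mu*r = {r:5.1f}   exact/asymptotic = {ratio(r):.4f}")
```

The ratio depends only on the product μr and approaches 1 from above, so the asymptotic form is reliable once r is many Compton wavelengths.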
In the limit of infinite mass, the propagator is completely localised in
space-time. In this limit, the poles in the integration over $k^0$ in (8.10) move off
to infinity, and we obtain¹:

$$\lim_{\mu\to\infty} iD_F(x) = \frac{-i}{\mu^2}\,\delta^{(4)}(x) . \tag{8.14}$$

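Comment. It is useful to show by direct calculation that the time-ordered
product of scalar fields satisfies the Klein–Gordon equation with pointlike
sources. We consider the expression in (8.1) which we write explicitly as:

$$T\{\phi(x)\phi^\dagger(0)\} = \theta(t)\,\phi(x)\,\phi^\dagger(0) + \theta(-t)\,\phi^\dagger(0)\,\phi(x) \tag{8.15}$$

From the relations:

$$\frac{\partial}{\partial t}\theta(t) = -\frac{\partial}{\partial t}\theta(-t) = \delta(t) \tag{8.16}$$

and from the commutation rules (4.43), we find:

$$\partial_t T\{\phi(x)\phi^\dagger(0)\} = T\{\partial_t\phi(x)\,\phi^\dagger(0)\} + \delta(t)\,[\phi(\mathbf{x},0),\,\phi^\dagger(0)] = T\{\partial_t\phi(x)\,\phi^\dagger(0)\}$$
$$\partial_t^2 T\{\phi(x)\phi^\dagger(0)\} = T\{\partial_t^2\phi(x)\,\phi^\dagger(0)\} + \delta(t)\,[\partial_t\phi(\mathbf{x},0),\,\phi^\dagger(0)] = T\{\partial_t^2\phi(x)\,\phi^\dagger(0)\} - i\delta^{(4)}(x) \tag{8.17}$$

so that, given that φ satisfies (8.9), we obtain:

$$(-\Box - \mu^2)\,T\{\phi(x)\phi^\dagger(0)\} = i\delta^{(4)}(x) \tag{8.18}$$

consistent with (8.10).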

8.3 PROPAGATORS OF THE DIRAC FIELD

By analogy with the case of the scalar field, the Feynman propagator is
defined as:

$$iS_F(x) = \langle 0|\,T\{\psi(x)\bar\psi(0)\}\,|0\rangle . \tag{8.19}$$

¹ The discrepancy between equations (8.13) and (8.14) is due to the fact that they are
obtained with two different sequences of limits; (8.13) is obtained in the limit t → 0 followed
by μ → ∞, while in (8.14) μ → ∞ for a general x; the two limits do not commute.

We separate the fields into positive and negative frequency parts and
omit terms with operators which vanish when applied to the vacuum. We
obtain:

$$\langle 0|\,\psi(x)\bar\psi(0)\,|0\rangle = \langle 0|\,\psi^{(+)}(x)\,\bar\psi^{(-)}(0)\,|0\rangle = \langle 0|\{\psi^{(+)}(x),\,\bar\psi^{(-)}(0)\}|0\rangle = iS^{(+)}(x) \tag{8.20}$$

$$\langle 0|\,\bar\psi(0)\psi(x)\,|0\rangle = \langle 0|\,\bar\psi^{(+)}(0)\,\psi^{(-)}(x)\,|0\rangle = \langle 0|\{\psi^{(-)}(x),\,\bar\psi^{(+)}(0)\}|0\rangle = iS^{(-)}(x). \tag{8.21}$$

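Using the 4-spinor completeness relations, (6.84) and (6.85), we can rewrite
equations (8.20) and (8.21) in the form

$$iS^{(+)}(x) = \frac{1}{V}\sum_{\mathbf{p},s}\frac{m}{E_p}\,\frac{\slashed{p}+m}{2m}\,e^{-i(px)} = (i\slashed{\partial}+m)\,\frac{1}{(2\pi)^3}\int\frac{d^3p}{2E(\mathbf{p})}\,e^{-i(px)} = (i\slashed{\partial}+m)\,[i\Delta^{(+)}(x)]. \tag{8.22}$$

Similarly:

$$iS^{(-)}(x) = \frac{1}{V}\sum_{\mathbf{p},s}\frac{m}{E_p}\,\frac{\slashed{p}-m}{2m}\,e^{i(px)} = -(i\slashed{\partial}+m)\,\frac{1}{(2\pi)^3}\int\frac{d^3p}{2E(\mathbf{p})}\,e^{+i(px)} = (i\slashed{\partial}+m)\,[i\Delta^{(-)}(x)] \tag{8.23}$$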

Taking account of the minus sign in the time-ordered product, the
Feynman propagator (8.19) is then (with the integration path in the $p^0$ plane of
Fig. 8.1):

$$iS_F(x) = \theta(x^0)\,iS^{(+)}(x) - \theta(-x^0)\,iS^{(-)}(x) = (i\slashed{\partial}+m)\left[\frac{1}{(2\pi)^4}\int d^4p\;\frac{i}{p^2-m^2+i\epsilon}\,e^{-i(px)}\right] = \frac{1}{(2\pi)^4}\int d^4p\;\frac{i(\slashed{p}+m)}{p^2-m^2+i\epsilon}\,e^{-i(px)} . \tag{8.24}$$

Sometimes, using the relation:

$$(\slashed{p}+m)(\slashed{p}-m) = p^2 - m^2 \tag{8.25}$$

equation (8.24) is symbolically rewritten as:

$$iS_F(x) = \frac{1}{(2\pi)^4}\int d^4p\;\frac{i}{\slashed{p}-m+i\epsilon}\,e^{-i(px)} . \tag{8.26}$$
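
Relation (8.25), which underlies the symbolic form (8.26), is easy to verify numerically. The sketch below builds the γ matrices in the Dirac representation (an assumed choice; the identity holds in any representation) with plain nested lists and checks that $(\slashed{p}+m)(\slashed{p}-m)$ equals $(p^2-m^2)$ times the unit matrix for an arbitrary off-shell 4-momentum. All helper names are introduced here for illustration.

```python
# Gamma matrices in the Dirac representation: gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]].
I = 1j
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -I], [0, 0, I, 0], [0, I, 0, 0], [-I, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (p0, p1, p2, p3)."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]


def add_mass(mat, m):
    """Return mat + m * identity."""
    return [[mat[i][j] + (m if i == j else 0.0) for j in range(4)] for i in range(4)]


def identity_defect(p, m):
    """Largest entry of (p-slash + m)(p-slash - m) - (p^2 - m^2) * 1."""
    ps = slash(p)
    product = matmul(add_mass(ps, m), add_mass(ps, -m))
    target = sum(METRIC[mu] * p[mu] ** 2 for mu in range(4)) - m * m
    return max(abs(product[i][j] - (target if i == j else 0.0))
               for i in range(4) for j in range(4))


if __name__ == "__main__":
    print(identity_defect((2.0, 0.3, -1.1, 0.7), 0.5))
```

Because the product is a multiple of the unit matrix, dividing $i(\slashed{p}+m)$ by $p^2-m^2$ is the same as inverting $\slashed{p}-m$, which is what the shorthand in (8.26) expresses.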

8.4 THE PHOTON PROPAGATOR

In the preceding sections, we calculated the propagators of the scalar field
and the Dirac field starting from their definition in terms of quantum fields. In
principle, the calculation of the propagation function of the electromagnetic
field, necessary for the applications we will discuss in Chapter 14, requires
the quantisation of the electromagnetic field in covariant form, which we will
not tackle in this volume. However, we can arrive at the same result using a
shortcut.
As we saw, the quantum propagation function of the scalar field coincides
with the Green's function of the classical Klein–Gordon equation, with
Feynman boundary conditions. Extending this result by analogy to the case of
the electromagnetic field, we can use the results of the analysis of the Green's
function from Chapter 5.
From (5.23) it follows immediately that, in the Lorenz gauge, the Green's
function we are seeking has the form

$$iD_F^{\mu\nu}(q) = \frac{-i\,g^{\mu\nu}}{q^2 + i\epsilon} , \tag{8.27}$$

and therefore, in the space of coordinates:

$$iD_F^{\mu\nu}(x) = \int\frac{d^4q}{(2\pi)^4}\;\frac{-i\,g^{\mu\nu}}{q^2 + i\epsilon}\;e^{iqx} . \tag{8.28}$$

Equation (8.28) defines the photon propagator in the Feynman gauge.

The completeness condition:

$$g^{\mu\nu} = \sum_\lambda g^{\lambda\lambda}\,\epsilon^\mu_\lambda\,\epsilon^\nu_\lambda \tag{8.29}$$

(see Chapter 5), would seem to indicate that the function $D_F^{\mu\nu}(x)$ represents
the propagation of virtual photons in four different polarisation states,
described by the vectors $\epsilon_\lambda$. In addition to the two transverse polarisation states
which describe real photons, there is one longitudinal polarisation state,
described by a vector parallel to q, and one state of polarisation in time,
described by the timelike vector $\eta^\mu$; cf. equation (5.24).
However, by carrying out the decomposition (5.27), it can easily be shown
that only photons in transverse states of polarisation contribute to the pole
at $q^2 \to 0$, so only those states are observable at large distances from the
interaction vertex.
The role of the other terms in (8.29) becomes clear if we note that in the
calculation of the probability amplitudes of physical processes, the
electromagnetic field propagator always appears saturated by the electromagnetic
current $j_\mu$, in the form $j_\mu(-q)\,D_F^{\mu\nu}(q)\,j_\nu(q)$. From the current conservation
condition, $q^\mu j_\mu = 0$, it follows that the terms proportional to $q^\mu$ or $q^\nu$ in
$D_F^{\mu\nu}(x-y)$ give zero contribution to the physical amplitudes.
As we saw in Chapter 5, the contributions to (8.29) from the longitudinal
(λ = 3) and temporal (λ = 0) polarisation vectors can be combined, with the
result

$$\epsilon^\mu_3\,\epsilon^\nu_3 - \eta^\mu\eta^\nu = \frac{q^2}{|\mathbf{q}|^2}\,\eta^\mu\eta^\nu + \dots , \tag{8.30}$$

where the dots denote the terms proportional to $q^\mu$ or $q^\nu$, which do not
contribute to the amplitudes. The term which remains in $D_F^{\mu\nu}(q)$ is proportional
to $1/|\mathbf{q}|^2$ and therefore describes the instantaneous Coulomb interaction due
to the charge distribution $j^0$.
• The propagator (8.28) describes the propagation of two transverse
photons and the Coulomb interaction between charges separated by a
distance x.
The possibility that the fields $A^\mu(x)$ could be subject to a gauge
transformation implies that the propagator of the electromagnetic field is not uniquely
determined. If we limit ourselves to covariant gauges, the propagators
corresponding to different gauges differ by terms proportional to $q^\mu$ or $q^\nu$ and
therefore give the same result when they are combined with the currents to
obtain physical amplitudes. The expression for the propagator in a general
covariant gauge is obtained in the following way.
We recall that the propagator in the Feynman gauge, $D_F^{\mu\nu}(x)$, is the inverse
of the operator $g^{\mu\nu}\Box$. In momentum space:

$$g^{\mu\nu}\Box \to -g^{\mu\nu}q^2 = K^{\mu\nu}(q) \tag{8.31}$$

and $D_F^{\mu\nu}(q)$ is determined by the equation:

$$D^{\mu\lambda}(q)\,K_{\lambda\nu}(q) = \delta^\mu_\nu . \tag{8.32}$$

We can characterise the general gauge with the substitution:

$$g^{\mu\nu}\Box \to g^{\mu\nu}\Box + \left(\frac{1}{\xi} - 1\right)\partial^\mu\partial^\nu , \tag{8.33}$$

specified by a parameter ξ that can take arbitrary values. $D_F^{\mu\nu}(q)$ is obtained
by inverting the operator

$$K^{\mu\nu} = -g^{\mu\nu}q^2 - \left(\frac{1}{\xi} - 1\right)q^\mu q^\nu . \tag{8.34}$$

In general, the tensor $D_F^{\mu\nu}(q)$ will be of the form:

$$D_F^{\mu\nu}(q) = A(q^2)\,g^{\mu\nu} + B(q^2)\,q^\mu q^\nu , \tag{8.35}$$

and the result is easily obtained from (8.32):

$$D_F^{\mu\nu}(q) = \frac{1}{q^2}\left[-g^{\mu\nu} + (1-\xi)\,\frac{q^\mu q^\nu}{q^2}\right] . \tag{8.36}$$
The choice of ξ = 1 corresponds to the Feynman gauge, while the choice ξ = 0
is known as the Landau gauge.
In practical calculations, it is usual to keep ξ general. The disappearance
of any dependence on ξ from the physical amplitudes provides a very effective
check of the correctness of the calculation.
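
The inversion leading to (8.36) can be cross-checked numerically. The following sketch, with helper names chosen only for this example, builds $K^{\mu\nu}$ of (8.34) and $D_F^{\mu\nu}$ of (8.36) as 4×4 arrays for an off-shell momentum with $q^2 \neq 0$, lowers indices with the metric diag(1, −1, −1, −1), and verifies condition (8.32).

```python
# Numerical check that the xi-gauge propagator inverts the kinetic operator.
G = [[1.0, 0, 0, 0], [0, -1.0, 0, 0], [0, 0, -1.0, 0], [0, 0, 0, -1.0]]  # metric


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def q_squared(q):
    """Minkowski square of the contravariant 4-vector q."""
    return sum(G[m][m] * q[m] * q[m] for m in range(4))


def kinetic_operator(q, xi):
    """K^{mu nu} = -g^{mu nu} q^2 - (1/xi - 1) q^mu q^nu (requires xi != 0)."""
    q2 = q_squared(q)
    return [[-G[m][n] * q2 - (1.0 / xi - 1.0) * q[m] * q[n] for n in range(4)]
            for m in range(4)]


def propagator(q, xi):
    """D^{mu nu} = (1/q^2) [ -g^{mu nu} + (1 - xi) q^mu q^nu / q^2 ]."""
    q2 = q_squared(q)
    return [[(-G[m][n] + (1.0 - xi) * q[m] * q[n] / q2) / q2 for n in range(4)]
            for m in range(4)]


def inverse_defect(q, xi):
    """Largest entry of D^{mu lambda} K_{lambda nu} - delta^mu_nu."""
    # K_{lambda nu} = g_{lambda rho} K^{rho sigma} g_{sigma nu}
    k_lower = matmul(matmul(G, kinetic_operator(q, xi)), G)
    product = matmul(propagator(q, xi), k_lower)
    return max(abs(product[m][n] - (1.0 if m == n else 0.0))
               for m in range(4) for n in range(4))


if __name__ == "__main__":
    q = (1.3, 0.2, -0.5, 0.4)
    for xi in (1.0, 0.37, 3.0):
        print(xi, inverse_defect(q, xi))
```

The check is run at finite ξ because the operator (8.34) contains 1/ξ; the Landau gauge corresponds to taking ξ → 0 in the propagator after the inversion.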

8.5 PROBLEMS FOR CHAPTER 8

Sect. 8.3
1. Prove the relation:

$$(i\slashed{\partial} - m)\,T\{\psi(x)\bar\psi(0)\} = i\delta^{(4)}(x) .$$

2. Using the result of Problem 1 to Sect. 7.5, prove the relation:

$$\partial^\mu\,T\{J_\mu(x)\,J_\nu(y)\} = 0 .$$

Sect. 8.4
1. Starting from the Lagrangian:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 , \qquad F^{\mu\nu} = \partial^\nu A^\mu - \partial^\mu A^\nu$$

derive the photon propagator in the generic ξ-gauge, given in (8.36).

2. Verify that, in a generic ξ-gauge

$$\partial^\mu\,T\{A_\mu(x)A_\nu(0)\} = -\xi\,\partial_\nu D_F(x, m=0)$$

where $D_F(x, m=0)$ is the propagator of a massless scalar field.

CHAPTER 9

INTERACTIONS

The theory of the free field describes an unchangeable world in which the
energy and momentum of every particle of the system are conserved separately.
The variety of phenomena which we observe requires instead some form
of interaction between the fields. In this case, as we saw in the classical limit,
Section 5.3, particles can exchange energy and momentum giving rise to scat-
tering processes or to the emission and absorption of light; the Sun can shine,
the sky can be blue and our eyes can perceive the external world via photons
absorbed by the retina.
In a relativistic theory, not only photons, but also particles associated with
matter, like electrons, protons and others, can be created or annihilated. For
a system isolated from the rest of the world, the elementary processes must
respect the conservation laws imposed by the symmetry of the system: en-
ergy, momentum, angular momentum as well as possible internal conserved
charges; for example, electric charge. This requires that the interaction La-
grangian, which we add to the free field Lagrangian should be invariant un-
der the transformations of special relativity: the Poincaré group, comprised of
translations in space-time and proper Lorentz transformations, L↑+ .
Invariance under the Poincaré group still permits a wide variety of forms
for the interactions, for example between the field of the electron and the
electromagnetic field. In principle, we should identify the correct interaction by
an iterative comparison between prediction and experiment (trial and error).
One starts from a type of interaction which explains at least part of the
initial data and then the theory is extended to other processes, successively
comparing predictions with the experimental data. When new data are found
to be in contradiction with the theory, it is falsified, in Popper’s terminology,
and nothing remains but to modify the interaction to bring it into agreement
with the complete set of old and new data¹.

¹ Following Popper, we can say that theories are never confirmed, in that every set of data
inevitably has ranges of errors which make it compatible with very many, even an infinite
number of, theories for the interaction. However, a theory can be eliminated when the data
falsify it in favour of a theory in agreement with the overall data set. For example, classical
mechanics is replaced by quantum mechanics when we wish to include the phenomena
of atomic physics. Scientific progress comes about through the elimination of hypotheses,
rather than their verification.

DOI: 10.1201/9781003436263-9
This chapter has been made available under a CC BY NC license.

In this process of trial and error, heuristic a priori principles, such as the
requirement to lead to the classical theory in the limit of large systems or the
presence of certain further symmetries, provide a powerful aid to restrict the
form of the interaction, and therefore the choice of discriminating experiments,
while naturally keeping in mind that experimental facts must be given greater
weight than heuristic criteria.
A result of great significance from the physics of the last century has been
the assignment of observed processes in the subatomic and subnuclear world
to the action of three different categories of interaction: electromagnetic, weak
and strong, as already pointed out in Section 5.3. The form of these
interactions, in terms of fundamental fields, is greatly restricted by symmetry
principles which are the extension of the gauge symmetry encountered in the theory
of the electromagnetic field, Section 5.1, and of the phase transformations of
Sections 4.3 and 7.3.
In what follows, we will derive the form of the electromagnetic
interaction for fundamental spinor particles, such as the electron and the muon,
present the Fermi interaction, which is suitable for describing the β decay of the
neutron, and discuss qualitatively several aspects of nuclear interactions, above
all to point out the difficulty of a description in terms of fields associated with
the observed nuclear particles, nucleons and pions.
We defer to later volumes [13, 14] a deeper treatment of the interactions
in terms of fundamental constituents (quarks, leptons and gauge fields).

9.1 QUANTUM ELECTRODYNAMICS

The free theory which describes photons and electrons is obtained by
combining the Maxwell Lagrangian (5.14) with the Dirac Lagrangian (7.21):

$$\mathcal{L}_0 = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\slashed{\partial} - m)\,\psi . \tag{9.1}$$

In classical theory, the interaction between field and particle is described
by the minimal substitution $p^\mu \to p^\mu - qA^\mu$ (Section 5.4). On the other hand,
in quantum mechanics the 4-momentum is represented by the operators:

$$p^0 = +i\frac{\partial}{\partial t} ; \qquad p^i = -i\frac{\partial}{\partial x^i} = -i\partial_i = +i\partial^i$$

and the minimal substitution takes the form:

$$i\partial^\mu \to i\partial^\mu - qA^\mu . \tag{9.2}$$

To reobtain classical electrodynamics from the quantum theory in the limit
ħ → 0 (correspondence principle) we must introduce the substitution (9.2) into
(9.1). For the electron, which has charge q = −e, we obtain:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\slashed{\partial} + e\slashed{A} - m)\,\psi = \mathcal{L}_0 + eA_\mu\,\bar\psi\gamma^\mu\psi . \tag{9.3}$$

The interaction Lagrangian requires the inclusion of the current
$J_e^\mu = \bar\psi\gamma^\mu\psi$, the Noether current associated with the invariance of (9.3) under the
global phase transformations of the Dirac field, (7.26).

Gauge Invariance. The Lagrangian we have obtained is invariant under a
wider group of transformations, the transformations of the Dirac field by a
phase which is variable in space-time, counterbalanced by a suitable gauge
transformation of the field $A^\mu$.
We consider the simultaneous transformations (which we denote as gauge
transformations of the second kind, or simply gauge transformations):

$$\psi(x) \to e^{i\alpha(x)}\,\psi(x) ; \qquad \bar\psi(x) \to e^{-i\alpha(x)}\,\bar\psi(x) ; \qquad A^\mu \to A^\mu + \frac{1}{e}\,\partial^\mu\alpha(x) \tag{9.4}$$

with α(x) an arbitrary function of space-time. It can immediately be confirmed
that (9.4) leaves (9.3) invariant.
We can reverse the argument, as observed by Pauli [20], and show that
the Lagrangian (9.3) is the simplest solution to the problem of constructing
a Lagrangian for the electron which is invariant under the transformations
(9.4).
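To show this we note that the combination (the covariant derivative):

$$D^\mu\psi = (\partial^\mu - ieA^\mu)\,\psi \tag{9.5}$$

is transformed by a simple phase change, exactly like ψ, if ψ and $A^\mu$ are
subjected to (9.4):

$$(D^\mu\psi)' = \partial^\mu\left(e^{i\alpha}\psi\right) - ie\left(A^\mu + \frac{1}{e}\,\partial^\mu\alpha\right)e^{i\alpha}\psi = e^{i\alpha}\,D^\mu\psi \tag{9.6}$$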

from which it follows that:

• if we carry out the substitution ∂ μ → Dμ in a Lagrangian invariant
under global phase transformations, we obtain a new Lagrangian in-
variant under (9.4). The new Lagrangian contains an interaction term
prescribed exactly by the symmetry, which agrees with the interaction
in (9.3).
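
The covariance property (9.6) can be illustrated with a small numerical experiment. The sketch below is restricted to one dimension, uses arbitrary test functions for ψ, A and α together with an illustrative value of the coupling e, and evaluates derivatives by central differences; none of these choices comes from the text. It checks that Dψ picks up only the local phase, while the ordinary derivative does not.

```python
import cmath
import math

E = 0.3  # illustrative value of the coupling e (assumption of this sketch)


def psi(x):
    """Arbitrary smooth complex test field."""
    return cmath.exp(1j * 1.7 * x) * (1.0 + 0.5 * math.sin(x))


def potential(x):
    """Arbitrary smooth test potential A(x)."""
    return math.cos(2.0 * x)


def alpha(x):
    """Arbitrary space-dependent gauge function."""
    return 0.8 * math.sin(3.0 * x) + 0.1 * x * x


def deriv(f, x, h=1e-5):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def covariant_derivative(field, gauge_field, x):
    """One-dimensional analogue of (9.5): D psi = psi' - i e A psi."""
    return deriv(field, x) - 1j * E * gauge_field(x) * field(x)


def psi_transformed(x):
    return cmath.exp(1j * alpha(x)) * psi(x)


def potential_transformed(x):
    return potential(x) + deriv(alpha, x) / E


def covariance_defect(x):
    """|D'psi' - exp(i alpha) D psi|, zero up to discretisation error."""
    lhs = covariant_derivative(psi_transformed, potential_transformed, x)
    rhs = cmath.exp(1j * alpha(x)) * covariant_derivative(psi, potential, x)
    return abs(lhs - rhs)


def ordinary_derivative_defect(x):
    """|d(psi')/dx - exp(i alpha) d(psi)/dx|, equal to |alpha'(x) psi(x)|."""
    return abs(deriv(psi_transformed, x) - cmath.exp(1j * alpha(x)) * deriv(psi, x))


if __name__ == "__main__":
    for x in (-1.0, 0.4, 2.5):
        print(x, covariance_defect(x), ordinary_derivative_defect(x))
```

The second column stays at the level of the finite-difference error, whereas the third is of order one wherever α varies, which is why the substitution ∂ → D is needed to extend the global phase invariance to (9.4).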

In particular, the Dirac Lagrangian thus modified:

$$\mathcal{L}_D(\psi, D_\mu\psi) = \bar\psi\,(i\slashed{D} - m)\,\psi \tag{9.7}$$

is gauge invariant. The general solution to the problem is obtained by adding
to (9.7) terms which should be:

• gauge-invariant functions of $A^\mu$, namely functions only of $F^{\mu\nu}$; this is
the case of the Maxwell Lagrangian $\mathcal{L}_{e.m.}$ in (5.14), which when added
to (9.7) exactly reproduces (9.3);

• functions of $F^{\mu\nu}$ and of ψ and ψ̄, but not of their derivatives, which
should be invariant under the global transformations obtained from (9.4)
with α = constant.

The minimal substitution now assumes a clearer meaning. It provides
the simplest Lagrangian for the electron which is gauge invariant; the
possibility of adding terms of the second type from those listed above is not used.
A non-minimal gauge-invariant term which we can add to the interaction
is the so-called Pauli term:

$$\mathcal{L}_{\text{non min.}} = \frac{-e\kappa}{4m}\,\bar\psi\sigma_{\mu\nu}\psi\,F^{\mu\nu} \tag{9.8}$$

which is obviously gauge-invariant.
In the presence of the Pauli term, the equation of motion of the electron
becomes:

$$(i\slashed{\partial} + e\slashed{A} - m)\,\psi = \frac{e\kappa}{4m}\,\sigma_{\mu\nu}F^{\mu\nu}\,\psi . \tag{9.9}$$

The non-relativistic limit of this equation is easily found by repeating the
arguments of Section 6.1.5. The result is the equation of motion for the
two-dimensional spinor:

$$i\frac{\partial}{\partial t}\chi = H\chi , \qquad H = \frac{(\mathbf{p} + e\mathbf{A})^2}{2m} + \frac{e(1+\kappa)}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}$$

which replaces (6.114). The Pauli term would change the gyromagnetic ratio
of the electron to:

$$g_{tot} = 2(1+\kappa). \tag{9.10}$$

QED for the Charged Leptons. Equation (9.10) provides a strong
argument for limiting the electromagnetic interaction of the electron purely to the
minimal substitution, which already well describes the magnetic moment of
the electron. The result is confirmed by the calculation of higher order
corrections to the quantity g − 2, first carried out by Schwinger, which shows
that these corrections reproduce with high precision the small experimentally
observed deviations from the Dirac prediction of g = 2, cf. [14].
The same conclusion holds for the muon, μ, which has spin ½ and a mass
about 200 times that of the electron. The muon appears in all respects a heavy
version of the electron. The magnetic moment of the muon is known
experimentally with great precision and the deviations from the Dirac prediction
g = 2 are well described by higher-order corrections.
In 1976, another charged spin ½ lepton similar to the electron and muon,
the τ particle, was discovered.
The electron, μ, τ and the corresponding neutrinos are classified as
leptons, a family of particles not sensitive to nuclear forces. The most convincing
hypothesis in agreement with our experimental knowledge is to describe the
electromagnetic interactions of the three charged leptons with only the
minimal substitution, and therefore with the Lagrangian:

$$\mathcal{L}_{QED} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi_e(i\slashed{D} - m_e)\psi_e + \bar\psi_\mu(i\slashed{D} - m_\mu)\psi_\mu + \bar\psi_\tau(i\slashed{D} - m_\tau)\psi_\tau$$
$$= \mathcal{L}_{e.m.} + \mathcal{L}^0_e + \mathcal{L}^0_\mu + \mathcal{L}^0_\tau - eA_\lambda J^\lambda_{lept} ;$$
$$J^\lambda_{lept} = -\{\bar\psi_e\gamma^\lambda\psi_e + \bar\psi_\mu\gamma^\lambda\psi_\mu + \bar\psi_\tau\gamma^\lambda\psi_\tau\} \tag{9.11}$$

where we denote by $\mathcal{L}^0$ the free Lagrangian.

The fermion fields appear in the interaction Lagrangian with the same
coupling constant, the electric charge of the electron, a property known as the
universality of the electromagnetic interaction (see Table 9.1).
The theory described by (9.11) is known as spinor QED (QED = quantum
electrodynamics).

Table 9.1 Electromagnetic properties of the charged leptons, from the Particle Data
Group [?]. Numbers in parentheses denote the error on the last digit of each quantity.

      m (MeV)            g
e     0.510998902(21)    2.002319304374(4)
μ     105.658357(5)      2.0023318320(6)
τ     1776.99(28)        2.000(58)

Nuclear Particles. The minimal substitution does not correctly reproduce
the electromagnetic interactions of the proton and neutron. Both of these
particles should be described by a Dirac field, because they have spin ½, but
their gyromagnetic ratios are not in agreement with the minimal values of
g = 2 and g = 0 respectively. For the nucleon magnetic moment, we set
(N = p, n):

$$\boldsymbol{\mu}_N = g_N\,\mu_p\,\mathbf{S} ; \qquad \mu_p = \frac{e}{2M_p}$$

with $M_p$ the proton mass. Experimentally, we find [?]:

$$\frac{g_N}{2} = \begin{cases} 2.792847351 \pm 0.000000028 & \text{(proton)} \\ -1.9130427 \pm 0.0000005 & \text{(neutron)}. \end{cases} \tag{9.12}$$

We must therefore add a Pauli term for each of the two nucleons, with:

$$\kappa_p = +1.792847351 \pm 0.000000028 , \qquad \kappa_n = -1.9130427 \pm 0.0000005 . \tag{9.13}$$
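
The step from (9.12) to (9.13) is just relation (9.10) read backwards: κ is the difference between the measured g/2 and the value given by the minimal substitution (1 for the proton, 0 for the neutron). A minimal sketch, with names chosen for this example and the central values quoted above:

```python
# Anomalous moments from measured gyromagnetic ratios, following g = 2(1 + kappa)
# for a charged Dirac particle and g = 2 kappa for a neutral one.
MEASURED_HALF_G = {"proton": 2.792847351, "neutron": -1.9130427}
MINIMAL_HALF_G = {"proton": 1.0, "neutron": 0.0}  # minimal-substitution values g/2


def anomalous_moment(particle):
    """kappa = (g/2)_measured - (g/2)_minimal."""
    return MEASURED_HALF_G[particle] - MINIMAL_HALF_G[particle]


def total_half_g(particle, kappa):
    """Inverse relation: g/2 = (g/2)_minimal + kappa."""
    return MINIMAL_HALF_G[particle] + kappa


if __name__ == "__main__":
    for name in ("proton", "neutron"):
        print(f"kappa_{name[0]} = {anomalous_moment(name):+.9f}")
```

The same subtraction applied to the electron entry of Table 9.1 gives a κ of about 0.00116, three orders of magnitude smaller than the nucleon values, which is the quantitative content of the contrast drawn in the text between leptons and nuclear particles.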
The corresponding Lagrangian of the overall system (leptons and nucleons)
is written:

$$\mathcal{L} = \mathcal{L}_{QED} + \bar\psi_p(i\slashed{\partial} - M_p)\psi_p + \bar\psi_n(i\slashed{\partial} - M_n)\psi_n - eA_\mu\,\bar\psi_p\gamma^\mu\psi_p + \frac{e}{4M_p}\,\{\kappa_p\,\bar\psi_p\sigma_{\mu\nu}\psi_p + \kappa_n\,\bar\psi_n\sigma_{\mu\nu}\psi_n\}\,F^{\mu\nu} . \tag{9.14}$$

We can rewrite the Pauli terms as additive corrections to the overall
electromagnetic current of the nuclear particles. We write, in general:

$$\frac{e\kappa}{4m}\,\bar\psi\sigma_{\mu\nu}\psi\,F^{\mu\nu} = \frac{e\kappa}{2m}\,\bar\psi\sigma_{\mu\nu}\psi\,(\partial^\nu A^\mu) = -eA^\mu\left\{\frac{\kappa}{2m}\,\partial^\nu(\bar\psi\sigma_{\mu\nu}\psi)\right\} + \text{4-divergence}. \tag{9.15}$$

Omitting the 4-divergence, which does not contribute to the action,
we can rewrite the Lagrangian (9.14) as:

$$\mathcal{L} = \mathcal{L}_{QED} + \bar\psi_p(i\slashed{\partial} - M_p)\psi_p + \bar\psi_n(i\slashed{\partial} - M_n)\psi_n - eA_\mu J^\mu_{nucl} ;$$
$$J^\mu_{nucl} = \bar\psi_p\gamma^\mu\psi_p + \frac{\kappa_p}{2M_p}\,\partial_\nu(\bar\psi_p\sigma^{\mu\nu}\psi_p) + \frac{\kappa_n}{2M_p}\,\partial_\nu(\bar\psi_n\sigma^{\mu\nu}\psi_n). \tag{9.16}$$

The Pauli terms become additional terms of the Noether current of the
proton (that of the neutron is obviously zero) which describe the deviations
of the magnetic moment from the Dirac value. We note that the additional
terms are the divergence of an antisymmetric tensor, therefore the
ambiguity associated with the Noether current reoccurs (cf. Section 3.5); they are
identically conserved and do not contribute to the conserved charge:

$$\partial_\mu J^\mu_{nucl} = 0 ; \qquad \int d^3x\,J^0_{nucl} = \int d^3x\,\bar\psi_p\gamma^0\psi_p .$$

Finally, we can write the Lagrangian for electromagnetic interactions of
charged leptons and nuclear particles in the following way:

$$\mathcal{L}_{QED} = \mathcal{L}_{0,tot} - eA_\mu J^\mu_{tot} \tag{9.17}$$

where:

$$\mathcal{L}_{0,tot} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \sum_{i=e,\mu,\tau,p,n}\bar\psi_i\,(i\slashed{\partial} - m_i)\,\psi_i ; \qquad J^\mu_{tot} = J^\mu_{lept} + J^\mu_{nucl} \tag{9.18}$$

with $J^\mu_{lept}$ and $J^\mu_{nucl}$ given by (9.11) and (9.16).
Today we know many particles in addition to the proton and neutron which
are sensitive to strong interactions, collectively called hadrons, cf. Section 9.3.
If we want to describe each of them with a quantised field, we must add their
free Lagrangian to $\mathcal{L}_{0,tot}$, and include their contribution to the total
electromagnetic current. The Lagrangian (9.17) keeps its form with the extension of
$J^\mu_{nucl}$ to the overall hadron current.

9.2 THE FERMI INTERACTION FOR β DECAY

The first quantitative theory of nuclear β decay is due to Fermi [18]. Fol-
lowing the proposal of Pauli, Fermi assumed that the electron in β decay is
emitted together with an unobserved neutral particle, the neutrino, and that
the process is caused by an interaction independent of both electromagnetic
interactions and the forces which bind nuclei. The corresponding reaction is
written:

$$N(A,Z) \to N(A,Z+1) + e + \bar\nu \tag{9.19}$$

where A and Z are the mass number and the charge of the nucleus. The
coupling constant associated with this new interaction proved to be so small,
on the scale of nuclear phenomena, as to justify the name weak interaction given
to the interaction identified by Fermi.
Fermi assumed that the interaction Lagrangian must be the product of
two terms: an operator which induces the transition from the initial to the final
nucleus, and an operator which creates the pair of light particles. There is a
deep analogy here with electromagnetic transitions in atoms and nuclei:

$$A^* \to A + \gamma . \tag{9.20}$$

The interaction Lagrangian of (9.20) is also the product of an operator
which causes the transition from $A^*$ to A, and of an operator which creates
the photon. For example, as we saw in the previous section, equation (9.16),
for a nuclear transition we have:

$$\mathcal{L}_\gamma = -e\,J^\mu_{nucl}\,A_\mu . \tag{9.21}$$

In the case of the reaction (9.19), Fermi assumed that it was induced by
the basic process of neutron disintegration:

$$n \to p + e + \bar\nu \tag{9.22}$$

and wrote the simplest possible interaction Lagrangian as:

$$\mathcal{L}_F = -G_F\,\bar\psi_p\gamma^\mu\psi_n\;\bar\psi_e\gamma_\mu\psi_\nu . \tag{9.23}$$

We have assumed that all the particles can be represented by Dirac fields,
and $G_F$ is the Fermi constant. Conventionally the particle emitted in neutron
β⁻ decay is an antineutrino.

In modern notation, the Lagrangian for the decay of the neutron is instead
written as:
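$$\mathcal{L}_F = -\frac{G_F}{\sqrt{2}}\, \bar{\psi}_p \gamma^\mu \left(1 + \frac{g_A}{g_V}\gamma_5\right)\psi_n\; \bar{\psi}_e \gamma_\mu (1 - \gamma_5)\psi_\nu \tag{9.24}$$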

Equation (9.24) condenses a wealth of experimental and theoretical devel-

opments from studies of weak interactions, from the formulation of Fermi to
today:

• V–A theory: the hadronic and leptonic terms in (9.24) are combinations
only of the bilinears $V^\mu$ and $A^\mu$.

• Parity violation: the Lagrangian is a consistent superposition of polar

and axial vectors.

• The neutrino ﬁeld appears in the combination (1 − γ5 )ψν . This ensures

that, for zero mass, neutrinos and antineutrinos are emitted or absorbed
only in negative or positive helicity states, respectively. Neutrinos (an-
tineutrinos) with positive (negative) helicity are not coupled and may
not actually exist (for the theory of two-component neutrinos, see Sec-
tion 13.1).

• The nuclear particles do not appear in the V –A combination, but with

a normalising coeﬃcient, gA /gV , to be determined by experiment.

The Sign of GF . In his work on β decays [18], Fermi wrote the Hamiltonian
density with a sign consistent with the relation $\mathcal{H}_{int} = -\mathcal{L}_{int} = -\mathcal{L}_F$. The
applications we will discuss in Chapter 15 depend only on $|\langle f|\mathcal{L}_F|i\rangle|^2$. The sign
of $G_F$ is determined from the interference of the weak and electromagnetic
contributions in the forward–backward asymmetry in, for example, $e^+e^- \to \mu^+\mu^-$,
cf. [13], and in the MSW effect on the propagation of neutrinos through
the body of the Sun, Chapter 16. The sign in (9.23) and (9.24) with GF > 0
is consistent with what is obtained if we derive the Fermi interaction from the
second order of the theory of the intermediate vector boson, cf. Chapter 15.

9.3 STRONG INTERACTIONS

This is the name which describes the force which binds together protons
and neutrons (nucleons) in atomic nuclei, overcoming the electrostatic repulsion of
the protons. Nuclear forces have a range limited to dimensions of the order of
nuclear radii:
$$V(r) \propto \frac{e^{-r/R}}{r}, \qquad R \sim 1 \cdot 10^{-13}\ \text{cm} = 1\ \text{fermi}. \tag{9.25}$$

A ﬁrst theory of nuclear interactions, proposed by Yukawa, described them

as due to the exchange of a particle similar to the photon but endowed with
mass, the π meson. In this case, the eﬀective range is determined by the mass
of the intermediate particle:

$$R \simeq \frac{\hbar}{m_\pi c}. \tag{9.26}$$
Comparing equation (9.26) with (9.25), a value of $m_\pi \simeq 100$–$200$ MeV
is found$^2$, in agreement with the mass of the π meson ($m_\pi \simeq 140$ MeV)
discovered in cosmic ray interactions by Lattes, Muirhead, Occhialini and
Powell in 1947.
The pion–nucleon coupling is characterised by a dimensionless constant
similar to the ﬁne structure constant, but about 1000 times larger; nuclear
forces overcome the electrostatic repulsion of protons inside nuclei. Strong
nuclear forces are by a long way the most powerful forces in Nature, hence
their name, but from atomic distances upwards they are irrelevant owing to
the exponential decline with distance, equation (9.25).
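
As a rough numerical check of the estimate (9.26), the following sketch converts a range R into the mass of the exchanged particle, using the conversion constant ħc ≈ 197.327 MeV fm. The function name and the sample ranges are illustrative assumptions, not part of the text.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (standard conversion constant)


def yukawa_mass_mev(range_fm):
    """Mass (MeV/c^2) of the exchanged particle for a force of range R (fm).

    Uses the order-of-magnitude relation R ~ hbar / (m c),
    i.e. m c^2 ~ hbar c / R.
    """
    if range_fm <= 0:
        raise ValueError("range must be positive")
    return HBAR_C_MEV_FM / range_fm


if __name__ == "__main__":
    # Illustrative nuclear-scale ranges in fm (assumed values)
    for r in (1.0, 1.4, 2.0):
        print(f"R = {r:.1f} fm  ->  m c^2 ~ {yukawa_mass_mev(r):.0f} MeV")
```

Ranges between 1 and 2 fm correspond to masses of roughly 100 to 200 MeV, the window quoted above.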

9.4 HADRONS, LEPTONS AND FIELDS OF FORCE

Since the Second World War, numerous particles which, like nucleons and
π mesons, are affected by strong interactions have been identified in high-
energy collisions of cosmic rays and particle accelerators. All these particles
are known collectively as hadrons3 and are divided into two large families:
• mesons: particles which do not possess any absolutely conserved quan-
tum number except for electric charge; consequently mesons decay into
π mesons, nucleon-antinucleon pairs, electromagnetic radiation and, ul-
timately, electrons and neutrinos,
• baryons: particles which possess a non-zero baryon number (cf. Sec-
tion 7.3) and therefore decay into states with a nucleon plus mesons,
nucleon-antinucleon pairs, electromagnetic radiation and, ultimately,
electrons and neutrinos.
The proliferation of hadrons has made particularly pressing the problem
of identifying the fundamental degrees of freedom to describe them in the
framework of a quantum field theory. According to the present picture, the
fundamental degrees of freedom are associated with several types of spin ½
field, the quarks, confined inside hadrons according to the scheme in which
mesons are $q\bar{q}$ states and baryons are $qqq$ states.

2 The numerical result is obtained recalling that in natural units $1\ \text{fm}^{-1} = 197$ MeV.
3 The term was introduced in 1962 by L. B. Okun from the Greek ἁδρός = thick, strong, in
opposition to the particles which are sensitive to electromagnetic and weak interactions
only, that Okun called leptons, from λεπτός = small, weak.
The hadrons are accompanied by the family of leptons, particles which
do not feel the strong interaction, divided into charged leptons, whose elec-
trodynamic interactions were described in Section 9.1, and neutrinos, subject
only to the weak interaction. As already mentioned, three charged leptons are
known today: e, μ and τ and their associated neutrinos.
The fundamental forces acting on quarks and leptons are attributed to
the exchange of particles of spin 1, all similar to the photon, and of a spin
0 particle, the Higgs boson, coupled to the mass of the different particles. In
more detail (see [13] for an extended discussion):
• QED, Section 9.1, describes electromagnetic forces caused by the ex-
change of photons,

• the weak interactions are associated with the exchange of massive, electrically
charged vector bosons ($W^\pm$) responsible for the interactions
identified by Fermi, Section 9.2, and a neutral boson ($Z^0$) responsible
for neutral current processes,
• the primary strong interactions between quarks are due to the action
of fields similar to the photon, gluons, from the word ‘glue’ because of
their property of binding the quarks together to form hadrons,
• the field associated with the Higgs boson, the particle observed by the
ATLAS and CMS Collaborations at CERN in 2012 with a mass of approximately
125 GeV, has a fundamental role in breaking the symmetry
which connects the photon to the fields of the weak interaction, generating
the mechanism by which the $W^\pm$ and $Z^0$ particles, as well as quarks
and leptons, gain their masses.

Particle Names. Apart from the term hadron, recently coined from the
Greek (adrós = strong), the terms lepton, meson and baryon were born in a
historical period when the only known particles were the electron and the neu-
trino (leptons, from leptós = light), the Yukawa meson with intermediate mass
(from mesos = in between) and the nucleons (baryons, from barús = heavy).
Today we know of leptons and mesons which are much heavier than nucleons
and the names really reﬂect the interactions and the quantum numbers of
the particles. The name quark was given to the fundamental constituents of
hadrons by Gell–Mann, from a passage in Finnegans Wake by James Joyce
(... Three quarks for Muster Mark!), probably alluding to the fact that three
quarks are needed to describe the nucleon.

9.5 PROBLEM FOR CHAPTER 9

Sect. 9.1
1. Scalar electrodynamics. Consider the Lagrangian density describing a
charged scalar ﬁeld interacting with the electromagnetic ﬁeld
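$$\mathcal{L} = (D_\mu \phi)^\dagger (D^\mu \phi) - m^2 \phi^\dagger \phi - \frac{1}{4} F_{\mu\nu} F^{\mu\nu}$$
where
$$D_\mu \phi = (\partial_\mu - ieA_\mu)\phi, \qquad F_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu.$$

– Determine the equations of motion.
– Indicate the differences between the interactions of scalar and spinor QED.
– Obtain the Noether current $j^\mu$ associated with the invariance of the
Lagrangian under phase transformations of the fields $\phi$ and $\phi^\dagger$.
– Verify that $\partial_\mu j^\mu = 0$.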
CHAPTER 10

TIME EVOLUTION OF
QUANTUM SYSTEMS

In general, the expectation values of observable quantities depend on time.

In quantum mechanics, these values are given by the expression:

$$\langle X \rangle_t = \langle A(t)|X(t)|A(t)\rangle. \tag{10.1}$$

There is an intrinsic ambiguity in determining the time dependence on the

various elements (bra, ket and operator) which comprise the right-hand side
of equation (10.1) because we can transfer this dependence from one element
to another, while keeping unchanged the expectation value < X >t , which
is all we can measure on the system. The ambiguity gives rise to different
descriptions of the motion, connected by unitary time-dependent transformations
and therefore equivalent to each other. In the following sections, we describe the
Schrödinger and Heisenberg representations. Subsequently, we will introduce
a third description of the motion or interactions: the Dirac representation,
also called the interaction representation, which is particularly useful in the
case of weakly interacting systems.

10.1 THE SCHRÖDINGER REPRESENTATION

In this representation, the dynamic variables (position, momentum, etc)
are associated with ﬁxed operators. The time variation of the expectation value
(10.1) is due to the time variation of the ket which represents the physical state
at time t. Given the ket |A > at time t0 (the initial state), the principle of
superposition requires that |A(t) > is obtained from |A > via application of
a linear operator, U (t, t0 ), independent of |A >:

|A(t) >= U (t, t0 )|A > . (10.2)

Moreover, if |A > is normalised, so that its components cn , on the basis

of a given observable O, are the probability amplitudes of possible results

DOI: 10.1201/9781003436263-10 147

This chapter has been made available under a CC BY NC license.

of a measurement of O, it is natural to require that |A(t) > should also be

normalised, so that:

$$\sum_n |c_n|^2 = \sum_n |c_n(t)|^2 = 1. \tag{10.3}$$

Equation (10.3) corresponds to the conservation of the probability of all

possible results. Given this condition, the operator U (t, t0 ) must be unitary:

U (t, t0 )† U (t, t0 ) = 1. (10.4)

We can transform (10.2) into a diﬀerential equation:

$$\frac{d}{dt}|A(t)\rangle = \frac{dU(t,t_0)}{dt}\, U(t,t_0)^\dagger\, |A(t)\rangle = -iH|A(t)\rangle. \tag{10.5}$$
H is the generator of infinitesimal time translations. Because U is unitary,
H is hermitian:
$$H = i\,\frac{dU(t,t_0)}{dt}\, U(t,t_0)^\dagger = i\,\frac{d}{dt}\left[U(t,t_0)U(t,t_0)^\dagger\right] - i\,U(t,t_0)\frac{dU(t,t_0)^\dagger}{dt} = H^\dagger. \tag{10.6}$$
Equation (10.5) is the Schrödinger equation. It is a first-order differential
equation in $t$, in agreement with the hypothesis that, at time $t_0$, the ket
|A > gives a complete description of the state of the system and that therefore
the time evolution should be determined by a single initial condition.
In the classical limit, H becomes the Hamilton function of the system, and
for this reason is called the Hamiltonian operator, or simply Hamiltonian, of
the system.
In the case in which H is independent of time, we can integrate the equa-
tion (10.5) and write directly the solution of the Schrödinger equation which
reproduces the state |A > at the time t0 :

$$|A(t)\rangle = e^{-iH(t-t_0)}|A\rangle. \tag{10.7}$$

If we expand |A > in the basis of the eigenvectors of H:

$$|A(t)\rangle = \sum_n c_n(t)|h_n\rangle \tag{10.8}$$

we obtain from (10.5):

$$c_n(t) = e^{-iE_n(t-t_0)}\, c_n(t_0). \tag{10.9}$$

Energy is conserved; equation (10.5) shows that if |A > is an eigenstate of

H, so is |A(t) >, with the same eigenvalue. For a general state, this conserves
the expected value of H:

< A(t)|H|A(t) >=< A|H|A >= constant. (10.10)
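
The content of equations (10.3), (10.9) and (10.10) can be checked numerically with a minimal sketch. It assumes a hypothetical three-level system whose Hamiltonian is already diagonal; the eigenvalues, the initial coefficients and the function names are illustrative choices, not taken from the text.

```python
import cmath


def evolve(coeffs, energies, dt):
    """Apply c_n(t) = exp(-i E_n (t - t0)) c_n(t0), eq. (10.9), with dt = t - t0."""
    return [cmath.exp(-1j * e * dt) * c for c, e in zip(coeffs, energies)]


def norm_squared(coeffs):
    """Total probability, sum_n |c_n|^2, as in eq. (10.3)."""
    return sum(abs(c) ** 2 for c in coeffs)


def mean_energy(coeffs, energies):
    """Expectation value of H in the energy eigenbasis, as in eq. (10.10)."""
    return sum(e * abs(c) ** 2 for c, e in zip(coeffs, energies))


if __name__ == "__main__":
    energies = [0.5, 1.5, 2.5]   # hypothetical eigenvalues E_n (natural units)
    c0 = [0.6, 0.8j, 0.0]        # normalised: 0.36 + 0.64 = 1
    ct = evolve(c0, energies, dt=3.7)
    print("norm:", norm_squared(ct))
    print("<H>: ", mean_energy(ct, energies))
```

Each coefficient only acquires a phase, so both the total probability and the expectation value of H keep their initial values.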


An interesting aspect of these results is invariance with respect to time

translations, or the fact that |A(t) > depends only on t − t0 . If we carry out
an experiment preparing the system in the state |A > at 9:30 am today and
we make a measurement at 10 am, the result is the same as if we had carried
out the same operations between 5 pm and 5:30 pm yesterday. This follows
directly from the time independence of H.
Reasoning in the reverse direction, if we expect a priori that a given system
should be invariant for translations in time, its Hamiltonian will be indepen-
dent of t and therefore conserved in time; conservation of energy is a direct
consequence of invariance under time translation.
Everything we know makes us believe that systems suﬃciently isolated
from the rest of the universe are independent of time. An isolated system,
therefore, must respect the conservation of energy.

10.2 THE HEISENBERG REPRESENTATION

As an alternative to the Schrödinger representation, we can associate the
state to a fixed vector, and attribute the time dependence of the expectation
values (10.1) to the change of the operator which represents the observable.
Formally, the Heisenberg representation is obtained by applying to the
ket $|A(t)\rangle_S$ of the Schrödinger representation the unitary transformation
which returns it to the value which it had at a fixed time $t_0$:

|A >H = e+iH(t−t0 ) |A(t) >S . (10.11)

At time t0 , the two representations coincide.

The dependence of observable quantities on t is ﬁxed by the requirement
that their expectation values (10.1) should be the same in the two representa-
tions at all t, and that the ket |A >H should be constant. From the equation:

S < A(t)|XS |A(t) >S = H < A|XH (t)|A >H (10.12)

we obtain, using (10.11):

XH (t) = e+iH(t−t0 ) XS e−iH(t−t0 ) . (10.13)

Diﬀerentiating with respect to time, we obtain:

$$i\,\frac{dX_H(t)}{dt} = [X_H(t), H]. \tag{10.14}$$
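
The equivalence (10.12) of the two representations can be illustrated with a small numerical sketch. It assumes a hypothetical two-level system with a diagonal Hamiltonian, $t_0 = 0$, and the observable $\sigma_x$; all names and values are illustrative assumptions.

```python
import cmath

ENERGIES = (0.0, 1.0)            # hypothetical eigenvalues of a diagonal H
X_S = ((0.0, 1.0), (1.0, 0.0))   # Schrodinger-picture observable (sigma_x)


def state_s(c0, t):
    """Schrodinger ket at time t (t0 = 0): c_n(t) = exp(-i E_n t) c_n(0)."""
    return [cmath.exp(-1j * e * t) * c for c, e in zip(c0, ENERGIES)]


def x_h(t):
    """Heisenberg operator, eq. (10.13), in the eigenbasis of H:
    (X_H)_{jk} = exp(+i (E_j - E_k) t) (X_S)_{jk}."""
    return [[cmath.exp(1j * (ENERGIES[j] - ENERGIES[k]) * t) * X_S[j][k]
             for k in range(2)] for j in range(2)]


def expectation(op, ket):
    """<ket| op |ket> for a 2x2 operator and a 2-component ket."""
    return sum(ket[j].conjugate() * op[j][k] * ket[k]
               for j in range(2) for k in range(2))


if __name__ == "__main__":
    c0 = [0.6, 0.8]
    t = 2.0
    print("Schrodinger:", expectation(X_S, state_s(c0, t)))
    print("Heisenberg: ", expectation(x_h(t), c0))
```

In the first call the ket carries the time dependence and the operator is fixed; in the second the roles are exchanged, and the two expectation values agree.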
To visualise the Heisenberg representation, we return to the analogy with
the motion of a classical system.
We can describe the state of the system at time $t$ by giving the instantaneous
position of the system in phase space, $(p_t, q_t)$; this corresponds to
the Schrödinger representation. Instead, the Heisenberg representation corresponds
to describing the state of motion by giving the initial conditions,
$(p_{t_0}, q_{t_0})$, at an arbitrary but fixed time, $t_0$. The initial conditions completely
determine the trajectory in phase space and naturally, like the Heisenberg
state, they do not change with time.
There is an unusual aspect of the Heisenberg representation which is im-
plicit in what has been said but is worth underlining.
The Heisenberg state is independent of time t. However, the vector which
represents it depends implicitly on the value of time t0 , which was chosen to
fix the initial conditions (in other words, the time at which the Heisenberg
representation coincides with the Schrödinger representation). The choice of
$t_0$ is arbitrary, but we must fix $t_0$ in the same way for all states of motion, if
we wish to compare different states with one another.
The vectors which represent the state of motion for a given choice of $t_0$
differ from those relative to an alternative choice, $t_0'$, by a unitary transformation:

$$|A; t_0'\rangle_H = e^{+iH(t_0 - t_0')}\,|A; t_0\rangle_H. \tag{10.15}$$
Equation (10.15) leaves invariant the expectation values (10.1). However,
as t0 changes, the vectors which represent the same state of motion can assume
a considerably different appearance.

10.3 THE INTERACTION REPRESENTATION

In many physically interesting cases, the Hamiltonian is the sum of two
terms:
$$H = H_0 + V_0 \tag{10.16}$$
in which H0 is exactly diagonalisable and V0 can be considered a “small”
modification to H0 . In these cases, we can try to approximate a solution to H
starting from the solutions for H0 , with an expansion in powers of V0 limited to
a finite number of terms. This is the quantum version of perturbation theory,
widely used in classical mechanics.
The most relevant example is quantum electrodynamics (QED). H0 is the
Hamiltonian which describes free electrons and photons while V0 describes
the interaction of the electron with the electromagnetic field. The strength
of the interaction is determined by the electric charge of the electron, and is
expressed in terms of a dimensionless quantity (the fine structure constant):
$$\alpha = \frac{e^2}{4\pi} \simeq \frac{1}{137} \tag{10.17}$$
much less than unity.
To obtain the perturbative expansion in a systematic way, it is conve-
nient to describe the motion of the system in the interaction representation,
introduced by Dirac.
The state at time t in the interaction representation is obtained from the
state in the Schrödinger representation with the unitary transformation:
$$|A(t)\rangle_I = e^{+iH_0 t}\,|A(t)\rangle_S. \tag{10.18}$$

The expectation values of the observables must be the same in the two
representations:

I < A(t)|OI (t)|A(t) >I = S < A(t)|OS |A(t) >S (10.19)

and therefore:
OI (t) = e+iH0 t OS e−iH0 t . (10.20)
Diﬀerentiating with respect to t, we obtain the equations of motion of the
states and the observables:
$$i\,\frac{\partial}{\partial t}|A(t)\rangle_I = V_{0I}(t)\,|A(t)\rangle_I \tag{10.21}$$

$$i\,\frac{dO_I(t)}{dt} = [O_I(t), H_0] \tag{10.22}$$
where V0I (t) is the interaction Hamiltonian in the interaction representation:

V0I (t) = e+iH0 t V0 e−iH0 t . (10.23)

The observables change in time with the free Hamiltonian, while the time
variation of the states is due only to the interaction. It should be noted that
V0I depends explicitly on time since V0 and H0 generally do not commute.
Equation (10.21) deﬁnes a translation operator in time between t0 and t:

|t >I = UI (t, t0 )|t0 >I (10.24)

where we have denoted with |t >I the state at time t which becomes |t0 >I
at time t0 .
UI (t, t0 ) is a linear operator. Also, it is subject to the relation:

UI (t, t0 ) = UI (t, t1 )UI (t1 , t0 ) (t > t1 > t0 ). (10.25)

$U_I$ is unitary, as a consequence of the fact that the interaction Hamiltonian,
$V_{0I}$, is hermitian:

$$U_I(t,t_0)^\dagger U_I(t,t_0) = U_I(t,t_0)\,U_I(t,t_0)^\dagger = 1. \tag{10.26}$$

Clearly, we can also solve the equation of motion with a ﬁnal condition,
i.e. determining the state |t >I which reduces to a given state |t1 > at a time
t1 > t. The corresponding operator, ŪI (t, t1 ), is deﬁned by:

|t >I = ŪI (t, t1 )|t1 >I (t1 > t) (10.27)

and it is not diﬃcult to see that it is the hermitian conjugate of UI (t1 , t):

ŪI (t, t1 ) = UI (t1 , t)† . (10.28)


10.3.1 Theory of Time-dependent Perturbations

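We can express $U_I$ as a power series in $V_{0I}$. To do this, we integrate (10.21)
between $t_0$ and $t$, in this way obtaining the integral equation:
$$|t\rangle_I = |t_0\rangle_I - i\int_{t_0}^{t} dt'\, V_{0I}(t')\,|t'\rangle_I. \tag{10.29}$$
To lowest order we can replace $|t'\rangle_I$ with $|t_0\rangle_I$ on the right-hand side,
which gives the solution to first order in $V_{0I}$:
$$|t\rangle_I = |t_0\rangle_I - i\int_{t_0}^{t} dt'\, V_{0I}(t')\,|t_0\rangle_I + O(V_0^2). \tag{10.30}$$
The procedure can be repeated. For example, by substituting (10.30) back
into (10.29), we obtain the solution to second order in $V_{0I}$:
$$|t\rangle_I = |t_0\rangle_I - i\int_{t_0}^{t} dt'\, V_{0I}(t')\,|t_0\rangle_I + (-i)^2\int_{t_0}^{t} dt'\, V_{0I}(t')\int_{t_0}^{t'} dt''\, V_{0I}(t'')\,|t_0\rangle_I + O(V_0^3) \tag{10.31}$$
and so on.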
Continuing the process indeﬁnitely, we ﬁnd the series:
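$$|t\rangle_I = |t_0\rangle_I + \sum_{n=1}^{+\infty} (-i)^n \int_{t_0}^{t} dt_1\, V_{0I}(t_1)\int_{t_0}^{t_1} dt_2\, V_{0I}(t_2)\ldots\int_{t_0}^{t_{n-1}} dt_n\, V_{0I}(t_n)\,|t_0\rangle_I$$
$$= \left[1 + \sum_{n=1}^{+\infty} (-i)^n \int_{t_0}^{t} dt_1\, V_{0I}(t_1)\int_{t_0}^{t_1} dt_2\, V_{0I}(t_2)\ldots\int_{t_0}^{t_{n-1}} dt_n\, V_{0I}(t_n)\right]|t_0\rangle_I = U_I(t,t_0)\,|t_0\rangle_I \tag{10.32}$$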

which gives a formal solution to the equation of motion with the initial con-
dition |t >I = |t0 >I , as is easily conﬁrmed by substituting the result (10.32)
into equation (10.21).

10.3.2 Time-ordered Products

It is helpful to express the terms of the perturbative series (10.32) as
the result of an integration in which all the variables $t_1, \ldots, t_n$ range
independently over the fixed interval between $t_0$ and $t$. To do this, we introduce
the time-ordered product (T-product) of the operators $V_{0I}(t_1), V_{0I}(t_2), \ldots, V_{0I}(t_n)$ defined, for
fixed values of the variables $t_1, t_2, \ldots, t_n$, as the product of the operators in the
order which corresponds to decreasing values of time, reading from left to
right:

$$T(V_{0I}(t_1)\, V_{0I}(t_2) \ldots V_{0I}(t_n)) = V_{0I}(t_{a_1})\, V_{0I}(t_{a_2}) \ldots V_{0I}(t_{a_n}) \tag{10.33}$$

where $t_{a_1}, t_{a_2}, \ldots, t_{a_n}$ is the permutation of $t_1, t_2, \ldots, t_n$ for which:

$$t_{a_1} > t_{a_2} > \ldots > t_{a_n}. \tag{10.34}$$

In the case of two operators, explicitly:

$$T(V_{0I}(t_1)\, V_{0I}(t_2)) = \theta(t_1 - t_2)\, V_{0I}(t_1) V_{0I}(t_2) + \theta(t_2 - t_1)\, V_{0I}(t_2) V_{0I}(t_1) \tag{10.35}$$

with $\theta(x) = 0, 1$ according to whether $x < 0$ or $x > 0$.

We now consider the integral:
$$\int_{t_0}^{t} T(V_{0I}(t_1)\, V_{0I}(t_2) \ldots V_{0I}(t_n))\, dt_1\, dt_2 \ldots dt_n \tag{10.36}$$

with all the variables between t0 and t.

The region of integration is divided into $n!$ regions, corresponding to
the possible time orderings of the variables. The contribution of the particular
region in which the ordering given by (10.33) holds is given by:
$$\int_{t_0}^{t} dt_{a_1}\, V_{0I}(t_{a_1}) \int_{t_0}^{t_{a_1}} dt_{a_2}\, V_{0I}(t_{a_2}) \ldots \int_{t_0}^{t_{a_{n-1}}} dt_{a_n}\, V_{0I}(t_{a_n}) \tag{10.37}$$

and therefore it coincides exactly with the integral which appears in (10.32),
apart from a trivial change in the names of the variables. The time-ordering
prescription in (10.36) is such that the integrals extended to the $n!$
regions are all equal to (10.37), so that, finally, we can rewrite $U_I(t, t_0)$ in the
symmetric form:
$$U_I(t,t_0) = 1 + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int_{t_0}^{t} dt_1\, dt_2 \ldots dt_n\; T(V_{0I}(t_1) V_{0I}(t_2) \ldots V_{0I}(t_n)). \tag{10.38}$$
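
The equality between the nested integrals of (10.32) and the symmetric form (10.38) can be checked at second order with a minimal numerical sketch. It assumes a hypothetical two-level model with $H_0 = \mathrm{diag}(0, w)$ and $V_0 = g\,\sigma_x$, for which $V_{0I}(t)$ and the nested integral are known in closed form; the model, the parameter values and the function names are assumptions made for illustration, and the double integral is approximated with a simple midpoint rule.

```python
import cmath


def v_int(t, g, w):
    """V_0I(t) = exp(+i H0 t) V0 exp(-i H0 t) for H0 = diag(0, w), V0 = g * sigma_x."""
    return [[0j, g * cmath.exp(-1j * w * t)],
            [g * cmath.exp(1j * w * t), 0j]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def second_order_symmetric(t, g=0.3, w=1.0, n=200):
    """(1/2!) * integral over [0, t]^2 of T(V_0I(t1) V_0I(t2)), midpoint rule."""
    h = t / n
    ops = [v_int((i + 0.5) * h, g, w) for i in range(n)]
    total = [[0j, 0j], [0j, 0j]]
    for i in range(n):
        for j in range(n):
            # T-product: the operator at the later time stands on the left
            left, right = (ops[i], ops[j]) if i >= j else (ops[j], ops[i])
            prod = matmul(left, right)
            for r in range(2):
                for c in range(2):
                    total[r][c] += prod[r][c]
    return [[0.5 * h * h * x for x in row] for row in total]


def second_order_nested_exact(t, g=0.3, w=1.0):
    """Closed form of int_0^t dt1 int_0^t1 dt2 V_0I(t1) V_0I(t2) for this model."""
    a = (t - (1 - cmath.exp(-1j * w * t)) / (1j * w)) / (1j * w)
    return [[g * g * a, 0j], [0j, g * g * a.conjugate()]]


if __name__ == "__main__":
    print(second_order_symmetric(2.0)[0][0])
    print(second_order_nested_exact(2.0)[0][0])
```

The two results agree within the discretisation error. Multiplying either one by $(-i)^2$ gives the second-order term of $U_I(t, 0)$ for this model.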

10.4 SYMMETRIES AND CONSTANTS OF THE MOTION

Observables which commute with the Hamiltonian are constants of the
motion; their expectation values are independent of t (cf. equation (10.14)). As
in classical mechanics (cf. Noether’s theorem), in quantum mechanics there is a
direct relation between the symmetries of a physical system and the constants
of the motion.

Qualitatively, we speak of a symmetry every time we can carry out a transformation
on the apparatus which prepares the system (in different states $|A\rangle$, $|B\rangle$,
etc) in such a way as to leave invariant the relations between them. To be con-
crete, if we denote by |RA >, |RB >, etc. the states which are obtained from
|A >, |B >, etc. with the transformation R, this means that:

| < RB|RA > |2 = | < B|A > |2 (10.39)

for all the states |A > and |B >.

As Wigner showed, equation (10.39) gives rise to two alternatives, which
are:

< RB|RA >=< B|A > (10.40)

or:

$$\langle RB|RA\rangle = \langle B|A\rangle^*. \tag{10.41}$$

If the condition (10.40) holds, the transformation is represented by a linear

and unitary operator, U (R):

$$|RA\rangle = U(R)|A\rangle \tag{10.42}$$

$$U(R)\,U(R)^\dagger = 1.$$

Condition (10.41) instead requires that U (R) should be anti-linear and

anti-unitary [21]:

U (R)(α|A > +β|B >) = α∗ U (R)|A > +β ∗ U (R)|B >

U (R)U (R)† = 1. (10.43)

The second alternative applies to the case of time reversal. All transfor-
mations which leave the direction of time unchanged should be represented
by unitary operators, and this is the case we will consider in this section. In
particular, operators that represent those groups of transformations which are
continuously connected to the identity transformation (which can always be
represented by the operator 1) should be unitary, as for example are spatial
translations and rotations, discussed in Section A.4.1, or translations in time,
Section 10.1.
Given an observable X, we deﬁne the transformed observable to be one
which has the same expectation value on the state |RA >, as X has on |A >,
so:

< RA|XR |RA >=< A|X|A > . (10.44)

From (10.42) we therefore ﬁnd:

XR = U (R)XU (R)† . (10.45)


The most important case is that of the Hamiltonian. If H is left invariant

by the transformation:

HR = U (R)HU (R)† = H (10.46)

then:

[U (R), H] = 0 (10.47)

and all the operators U (R) are constants of the motion.

We compare two experiments which start from two states which are trans-
forms of each other, |A > and |RA >, at time t = 0. In the two cases, at time
t, we have:

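$$|A(t)\rangle = e^{-iHt}|A\rangle$$
$$|RA(t)\rangle = e^{-iHt}\,U(R)|A\rangle = U(R)\,e^{-iHt}|A\rangle = U(R)|A(t)\rangle. \tag{10.48}$$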

In this case, we speak of an exact, or conserved, symmetry and the previous

formulas show that the relation by which one state is transformed into the
other is conserved in time.
Given a group of continuous transformations, we consider those inﬁnitesi-
mally close to the identity. For these, we can set:

U (R) = U (λ) ≡ 1 − iλT (10.49)

where λ denotes the transformation parameter. The inﬁnitesimal generator,

T , must be hermitian given that U (R) is unitary:

T † = T. (10.50)

On the basis of (10.48), the generator of an exact symmetry commutes
with H:
$$[T, H] = 0.$$
• a continuous group of exact symmetries implies the existence of con-
served observables, as many as the number of independent parameters
which characterise the group.
The conservation of momentum and angular momentum are the most con-
spicuous examples of these results, which are otherwise completely general.

The observables of quantum ﬁeld theories, for example, the energy den-
sity, are local quantities, operators which we can think of as determined by
measurements in a small region of space-time.
We denote one of these operators as Π(x, t). The vector x is a numerical
variable (and not the position operator) since it is connected to the position
of the macroscopic apparatus which measures Π(x, t).

Π(x, t) is obtained from a space-time transformation, starting from

Π(0, 0) = Π(0). For the time development, given that Π(x, t) is an opera-
tor in the Heisenberg representation, we obtain from (10.12):

Π(x, t) = e+iHt Π(x, 0)e−iHt . (10.51)

Using equations (10.45) and (A.39), we ﬁnd for the spatial variables:

Π(x, t) = e+iHt e−iP ·x Π(0, 0)e+iP ·x e−iHt . (10.52)

In the exponents, the following relativistically invariant combination ap-

pears:

$$Ht - \mathbf{P}\cdot\mathbf{x} = P^\mu x^\nu g_{\mu\nu} = P_\mu x^\mu. \tag{10.53}$$

Consequently the matrix element between energy and momentum eigen-

states of a local operator has a characteristic space-time dependence:

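$$\langle \mathbf{P}', E'|\,\Pi(\mathbf{x}, t)\,|\mathbf{P}, E\rangle = \langle \mathbf{P}', E'|\,e^{+i(Ht - \mathbf{P}\cdot\mathbf{x})}\,\Pi(0,0)\,e^{-i(Ht - \mathbf{P}\cdot\mathbf{x})}\,|\mathbf{P}, E\rangle$$
$$= e^{-i[(E - E')t - (\mathbf{P} - \mathbf{P}')\cdot\mathbf{x}]}\,\langle \mathbf{P}', E'|\,\Pi(0,0)\,|\mathbf{P}, E\rangle. \tag{10.54}$$
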
CHAPTER 11

RELATIVISTIC
PERTURBATION
THEORY

In this section, we derive the perturbative expansion of the scattering

matrix (S-matrix) for a relativistic ﬁeld theory.
The starting point is the expansion of the transfer matrix, UI (t2 , t1 ), the
operator which transforms the state at time t1 into that at time t2 , in the
interaction representation (IR) introduced in the previous chapter.
The fundamental hypothesis to derive the scattering matrix is to assume
that after a long time, the interaction Hamiltonian in the IR tends to zero:

$$\lim_{t\to\pm\infty} H_I(t) = 0. \tag{11.1}$$

Intuitively, the condition arises from the fact that when the time tends
towards infinity, particles separate indefinitely and their mutual interactions
tend to zero.
However, condition (11.1) is actually not trivial. For example, it is not
satisfied in the presence of long range forces, such as the electrostatic interac-
tion between electric charges. This case requires particular care to define the
probability of observable processes (cf. for example [11]).
Also in the case of short-range forces, to satisfy condition (11.1) we must
carefully define the interaction Hamiltonian, so as to subtract the interaction
of each particle with the field generated by the particle itself. The procedure of
subtraction of the effects of self-interaction will be discussed in detail in the
third volume of this series, titled Introduction to Gauge Theories [14].
Assuming the validity of (11.1), the state vector tends to a constant vector
in the limit t → +∞ and the same thing happens, with a different vector, if
we let t → −∞:
$$\lim_{t\to\pm\infty} |t\rangle = |\pm\infty\rangle = \text{constant}. \tag{11.2}$$

DOI: 10.1201/9781003436263-11 157

This chapter has been made available under a CC BY NC license.

For finite times, the state vector is given by the application of the linear
operator $U_I(t, t_0)$ to the initial state $|t_0\rangle$. From equations (11.2) it follows
equally that $|+\infty\rangle$ depends linearly on $|-\infty\rangle$. We can therefore define a
linear operator independent of time, the scattering matrix, or S-matrix, which
transforms one to the other:

| + ∞ >= S| − ∞ > . (11.3)

Comparing with the deﬁnition of UI (t2 , t1 ), we see that:

$$S = \lim_{t_2\to+\infty,\ t_1\to-\infty} U_I(t_2, t_1) \tag{11.4}$$

or, using the perturbative expansion discussed in Chapter 10:

$$S = \sum_n \frac{(-i)^n}{n!} \int_{-\infty}^{+\infty} dt_1 \int_{-\infty}^{+\infty} dt_2 \ldots \int_{-\infty}^{+\infty} dt_n\; T(H_I(t_1) H_I(t_2) \ldots H_I(t_n)). \tag{11.5}$$
In a scattering process, the state at time −∞ represents the initial state,
|i >, while the state at time +∞ is a linear superposition of the states which
correspond to all possible results of the interaction. If we denote one of these
states with |f >, the probability amplitude to observe |f > at the end of the
process and the corresponding probability are given by:

$$\text{Amplitude}(i \to f) \equiv A(i \to f) = \langle f|S|i\rangle;$$

$$\text{Probability}(i \to f) \equiv P(i \to f) = |A(i \to f)|^2.$$

Unitarity. The state $|f\rangle$ is one of the possible results of the scattering process.
Summing over all the states $|f_n\rangle$ of a complete basis, the total probability
must be unity:

$$\sum_n P(i \to f_n) = \sum_n \langle i|S^\dagger|f_n\rangle\langle f_n|S|i\rangle = \langle i|S^\dagger S|i\rangle = 1.$$

Being the same for every $|i\rangle$, this result implies that the S-matrix must be
unitary:
$$SS^\dagger = S^\dagger S = 1. \tag{11.6}$$

Relativistic invariance. The principle of special relativity takes a partic-

ularly simple form for the S-matrix. For simplicity, we set our system in a
finite volume, V . In this case, the states belong to a discrete spectrum and
are normalised as | < n|m > |2 = δn,m . The probability of observing a certain
result from given initial conditions and after an interval of infinite time must
be the same in any inertial frame of reference. Therefore, if we denote with
|U i >, |U f > the initial and final states transformed to another IF, we must
have:
| < f |S|i > |2 = | < U f |S|U i > |2 = | < f |U † SU |i > |2 .

For transformations continuously connected to the identity transformation,

therefore:
< f |U † SU |i >=< f |S|i >
or, given that the initial and ﬁnal states are arbitrary:
U † SU = S. (11.7)
The S-matrix is invariant under Lorentz transformations.

11.1 THE DYSON FORMULA

In a ﬁeld theory, the interaction Hamiltonian is in general the spatial
integral of a Hamiltonian density:

$$H_I(t) = \int d^3x\; \mathcal{H}_I(x).$$

Furthermore, we restrict ourselves to the case in which the Lagrangian

density is the sum of a free Lagrangian and an interaction Lagrangian which
does not contain the field derivatives, as happens in spinor QED. In this case,
we can obtain the Hamiltonian density of interactions as follows:
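$$\mathcal{H}_{tot} = \frac{\partial \mathcal{L}}{\partial(\partial_0\phi)}\,\partial_0\phi - \mathcal{L} = \frac{\partial \mathcal{L}_0}{\partial(\partial_0\phi)}\,\partial_0\phi - (\mathcal{L}_0 + \mathcal{L}_I) = \mathcal{H}_0 + \mathcal{H}_I.$$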
By comparison, we obtain:
HI = −LI . (11.8)
Consequently, we obtain the formula for the S-matrix due to Dyson:
$$S = \sum_n \frac{(i)^n}{n!} \int\ldots\int d^4x_1\, d^4x_2 \ldots d^4x_n\; T\left[\mathcal{L}_I(x_1)\mathcal{L}_I(x_2)\ldots\mathcal{L}_I(x_n)\right]. \tag{11.9}$$
The Dyson formula allows us to directly verify the relativistic invariance
of the S-matrix.
We note first that the volume of integration and the interaction Lagrangian
density are Lorentz invariants. We must examine only the invariance of the
time-ordered product. For simplicity, we consider the product of two operators:
$$T\left[\mathcal{L}_I(x_1)\mathcal{L}_I(x_2)\right] = \theta(x_1^0 - x_2^0)\,\mathcal{L}_I(x_1)\mathcal{L}_I(x_2) + \theta(x_2^0 - x_1^0)\,\mathcal{L}_I(x_2)\mathcal{L}_I(x_1). \tag{11.10}$$
If the separation between two events is time-like, the order is maintained
in every IF. Conversely, for space-like separations, the temporal order of two
events can be different in a different IF. However, the microcausality condition,
Section 7.5, comes to our aid. Since under these conditions the two interaction
Lagrangians must commute, the order in which the two operators appear is
immaterial and the T-product is in fact a relativistic invariant.

Comment. The fields which appear in (11.9) are operators in the interac-
tion representation, therefore they satisfy the free equations of motion and are
in fact, the operators given by expansions of the type referred to in (4.47)
and (7.12), with the canonical quantisation rules. If we add the convention
of defining $\mathcal{L}_I$ as normal products of the fields, these rules completely determine
the products which appear in the Dyson formula as long as we maintain
$x_1 \neq x_2 \neq \ldots \neq x_n$. When two or more points coincide, the products of $\mathcal{L}_I$
exhibit a singular behaviour (for this aspect, cf. in particular [2]). Overcoming
this difficulty requires a redefinition of $\mathcal{L}_I$ and of the parameters which
appear in it. The possibility of renormalising the theory with interactions will be
studied in [14]. We can disclose in advance that the renormalisation procedure
can be completed only for a restricted class of interactions: renormalisable
interactions. QED generated by the minimal substitution is in this category.
Conversely, the electrodynamic interaction with the addition of Pauli terms,
for example, the Lagrangian (9.16), is not renormalisable and must be con-
sidered a phenomenological theory, valid only in a restricted range of energy.

11.2 CONSERVATION LAWS

Consider the element of the S-matrix between states of defined 4-momentum:
\[
\langle P_{fin}, \beta|\, S\, |P_{in}, \alpha\rangle
\tag{11.11}
\]
(α and β are further quantum numbers necessary to specify the states). We
consider the nth term of the S-matrix and apply a translation of −x1 . If the
interaction Lagrangian is invariant under translations, i.e. depends only on
the coordinates through the fields¹, then:
\[
L_I(x_1) = e^{iP_\mu x_1^\mu}\, L_I(0)\, e^{-iP_\mu x_1^\mu}
\tag{11.12}
\]
\[
\begin{aligned}
\langle P_{fin}, \beta|\, T[L_I(x_1)L_I(x_2)\ldots L_I(x_n)]\, |P_{in}, \alpha\rangle
&= \langle P_{fin}, \beta|\, U^\dagger(-x_1)U(-x_1)\, T[L_I(x_1)L_I(x_2)\ldots L_I(x_n)]\, U^\dagger(-x_1)U(-x_1)\, |P_{in}, \alpha\rangle \\
&= e^{-i[(P_{in}-P_{fin})x_1]}\, \langle P_{fin}, \beta|\, T[L_I(0)L_I(x_2-x_1)\ldots L_I(x_n-x_1)]\, |P_{in}, \alpha\rangle .
\end{aligned}
\tag{11.13}
\]

1 For example, this excludes the presence of a classical field, a complex numerical function
of the coordinates, which would remain unaltered under the unitary transformations in
(11.12).

In carrying out the integral in (11.9) we can translate the integration
variables (x2 → x2 − x1 = ζ2 , x3 → x3 − x1 = ζ3 , . . .) and obtain:

\[
\langle P_{fin}, \beta|\, S^{(n)}\, |P_{in}, \alpha\rangle
= \frac{(i)^n}{n!} \int d^4x_1\, e^{-i[(P_{in}-P_{fin})x_1]} \cdot
\langle P_{fin}, \beta| \int \ldots \int d^4\zeta_2 \ldots d^4\zeta_n\, T[L_I(0)L_I(\zeta_2)\ldots L_I(\zeta_n)]\, |P_{in}, \alpha\rangle .
\tag{11.14}
\]

Therefore, in conclusion:

\[
\langle P_{fin}, \beta|\, S\, |P_{in}, \alpha\rangle = (2\pi)^4\, \delta^{(4)}(P_{fin} - P_{in}) \cdot F(P_{fin}, \beta, P_{in}, \alpha)
\tag{11.15}
\]

where F is a well-behaved function of its variables.

Equation (11.15) expresses the conservation of 4-momentum in a transla-
tionally invariant theory, according to (11.12).
We can treat all the conservation laws which derive from the invariance of
the interaction Lagrangian (and thus the S-matrix) in a similar manner. This
is the case, for example, for the law of conservation of angular momentum,
when LI is invariant for rotations. If:

\[
e^{+i\alpha Q}\, S\, e^{-i\alpha Q} = S; \qquad \text{or} \qquad [Q, S] = 0,
\]

we also have, for the eigenstates of Q:

\[
0 = \langle q'|\, [Q, S]\, |q\rangle = (q' - q)\, \langle q'|\, S\, |q\rangle
\]

or the S-matrix element must vanish, unless the conservation law

\[
q' = q
\tag{11.16}
\]

is satisfied.
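
The argument leading to (11.16) can be checked numerically in a small toy model. The sketch below is only an illustration under stated assumptions: the six-state basis, the charge assignments `q` and the random Hermitian matrix `H` are all hypothetical, and `S = exp(-iH)` stands in for a unitary S-matrix that commutes with Q.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical charge operator Q, diagonal in the chosen basis, with eigenvalues q.
q = np.array([-1.0, -1.0, 0.0, 0.0, 0.0, 1.0])
Q = np.diag(q)

# Random Hermitian "interaction" H that commutes with Q: keep only the
# matrix elements connecting states of equal charge.
A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
H = (A + A.conj().T) / 2
same_charge = np.isclose(q[:, None], q[None, :])
H = np.where(same_charge, H, 0.0)

# Unitary toy S-matrix S = exp(-iH), built from the eigen-decomposition of H.
w, U = np.linalg.eigh(H)
S = U @ np.diag(np.exp(-1j * w)) @ U.conj().T

# [Q, S] = 0, and therefore <q'|S|q> = 0 whenever q' != q.
commutator_norm = np.abs(Q @ S - S @ Q).max()
largest_forbidden_element = np.abs(S[~same_charge]).max()
print(commutator_norm, largest_forbidden_element)
```

Both printed numbers should be zero up to rounding errors: every matrix element of S between states of different charge vanishes.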

11.3 COLLISION CROSS SECTION AND LIFETIME

Scattering processes start from an initial state, |i⟩, in which a particle
beam is directed onto a fixed target or, in so-called colliders, against another
beam of particles. At large distances from the interaction region, a system of
detectors produces a signal when a possible final state, |f⟩, is identified.
Normally a single quantum state is not selected. In general, the detectors
are not able to discriminate between the states of a certain set (for example,
particles with momentum around an average value p̄ in a given volume Δp).
In this case, we must sum the probabilities over the final states which are not
separated by the experimental apparatus. Similarly, the initial state can be an
incoherent mixture of a certain set of states; for example, an unpolarised beam
contains equal proportions of different spin states of the incident particle. In
this case, we must average the probability over the initial states.

For simplicity, we consider a fixed target experiment. We enclose the
system in a large box with volume V and surface area S transverse to the beam
and direct a single incident particle at a time onto a target, also considered
to consist of a single particle. In a steady state (replacing the target and pro-
jectile after each collision) we have a constant flux of events in the detectors,
characterised by the probability per unit of time, W . It is easy to convince
oneself that W must be equal to the flux of particles incident on the tar-
get multiplied by the interaction probability of a projectile with the target
particle, P :
\[
W = \Phi \cdot P; \qquad \Phi = \rho \cdot S \cdot v; \qquad \rho = \frac{1}{V}
\tag{11.17}
\]
where ρ is the density of incident particles and v their velocity. The proba-
bility P should be inversely proportional to the surface area S, if the beam is
uniformly distributed transversely:
\[
P = \frac{\Delta\sigma}{S}.
\tag{11.18}
\]
The quantity Δσ represents the interaction probability and has the dimensions
of an area. Δσ is called the collision cross-section; everything happens as if
the target offered a transverse area Δσ to a beam of incident particles randomly
distributed over the area S.
Clearly, the cross-section depends on the initial conditions (for example the
energy and type of incident particle we choose) and the specific final conditions
fixed by our detectors. Δσ is all that can be measured for a given scattering
process.
Summarising, we find:
\[
\Delta\sigma = \frac{W}{\rho \cdot v}.
\tag{11.19}
\]
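
As a simple illustration of (11.17)–(11.19), the following sketch computes the event rate from a given cross-section and then recovers the cross-section from the rate. The function names and the numerical values are hypothetical and serve only to show that the transverse area S drops out of W.

```python
def event_rate(delta_sigma, V, S, v):
    """Rate W = Phi * P, with Phi = rho * S * v, rho = 1/V and P = delta_sigma / S."""
    rho = 1.0 / V                 # one projectile in the box of volume V
    flux = rho * S * v            # Phi in (11.17)
    probability = delta_sigma / S # P in (11.18)
    return flux * probability


def cross_section(W, V, v):
    """Invert the relation: delta_sigma = W / (rho * v), equation (11.19)."""
    rho = 1.0 / V
    return W / (rho * v)


# Hypothetical numbers in arbitrary (consistent) units.
W_small_S = event_rate(2.5e-3, V=10.0, S=4.0, v=0.8)
W_large_S = event_rate(2.5e-3, V=10.0, S=7.0, v=0.8)
recovered = cross_section(W_small_S, V=10.0, v=0.8)
print(W_small_S, W_large_S, recovered)
```

The rate does not depend on the choice of S, as expected for a beam uniformly distributed over the transverse area.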
If we denote with T the time over which the passage from the initial to
final state takes place (as in the case of V , at the end we will take the limit
T → ∞), we have:
\[
W = \frac{|\langle f|\, S\, |i\rangle|^2}{T}
\tag{11.20}
\]
and therefore:
\[
\Delta\sigma = \frac{V\, |\langle f|\, S\, |i\rangle|^2}{T \cdot v}.
\tag{11.21}
\]
Consider the process:
a + b → 1 + 2 + . . . + nf (11.22)
with initial and final states:
\[
|i\rangle = a_a^\dagger(p_a)\, a_b^\dagger(p_b)\, |0\rangle, \qquad
|f\rangle = a_1^\dagger(p_1)\, a_2^\dagger(p_2) \ldots a_{n_f}^\dagger(p_{n_f})\, |0\rangle .
\tag{11.23}
\]

In a situation invariant under space-time translations, the S-matrix element
has the form (11.15) and in calculating the probability W we must square the
Dirac δ-function, an operation normally not allowed. In this case, for V and
T finite, the δ-function can be substituted by:

\[
\int_\Gamma d^4x\, e^{-i[(P_f - P_i)x]}
\tag{11.24}
\]

where Γ is the finite region of space-time of 4-volume V · T . It is easy to
show² that, in the limit of large V and T :

\[
\left| \int_\Gamma d^4x\, e^{-i[(P_f - P_i)x]} \right|^2 \to V \cdot T \cdot (2\pi)^4\, \delta^{(4)}(P_f - P_i).
\tag{11.25}
\]

From this, we ﬁnd:

\[
\Delta\sigma = V^2\, \frac{1}{v}\, (2\pi)^4\, \delta^{(4)}(P_f - P_i)\, |F(P_f, \ldots, P_i \ldots)|^2
\tag{11.26}
\]
where F is the well-behaved function defined in (11.15) and the dots indicate
the other variables needed to fully describe the process.
To any finite order of perturbation theory, the S-matrix, (11.9), is a
polynomial of the fields. Among the various terms, only those which contain fields
capable of annihilating the particles of the initial state and creating those in
the final state give a non-zero matrix element. Using the appropriate
commutation or anticommutation rules³, we can arrange these fields to operate
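2 For simplicity, we consider the one-dimensional case
\[
\int_{-T/2}^{+T/2} dt\, e^{i\omega t} = \frac{2}{\omega} \sin\left(\frac{\omega T}{2}\right).
\]
If we integrate with a trial function, f (ω), and take the limit T → ∞, we find:
\[
\int_{-\infty}^{+\infty} d\omega\, \frac{2}{\omega} \sin\left(\frac{\omega T}{2}\right) f(\omega)
= 2 \int_{-\infty}^{+\infty} dx\, \frac{\sin x}{x}\, f\left(\frac{2x}{T}\right)
\to 2 f(0) \int_{-\infty}^{+\infty} dx\, \frac{\sin x}{x} = 2\pi f(0)
\]
or, as expected:
\[
\frac{2}{\omega} \sin\left(\frac{\omega T}{2}\right) \to 2\pi\, \delta(\omega).
\]
Now, we consider the square and integrate it with a trial function:
\[
\int_{-\infty}^{+\infty} d\omega\, \frac{4}{\omega^2} \sin^2\left(\frac{\omega T}{2}\right) f(\omega)
= 2T \int_{-\infty}^{+\infty} dx\, \frac{\sin^2 x}{x^2}\, f\left(\frac{2x}{T}\right)
\to 2T f(0) \int_{-\infty}^{+\infty} dx\, \frac{\sin^2 x}{x^2} = T \cdot 2\pi f(0)
\]
or:
\[
\left| \int_{-T/2}^{+T/2} dt\, e^{i\omega t} \right|^2 \to T \cdot 2\pi\, \delta(\omega).
\]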
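
The limit derived in this footnote can also be checked numerically. The sketch below is a rough illustration and not part of the derivation: the Gaussian trial function, the integration range and the grid size are arbitrary choices. It integrates the squared finite-time transform against the trial function and compares the result with 2πT f(0).

```python
import numpy as np


def window_ft_squared(omega, T):
    """|int_{-T/2}^{T/2} dt exp(i omega t)|^2 = (4/omega^2) sin^2(omega T/2)."""
    # np.sinc(x) = sin(pi x)/(pi x), so T*sinc(omega T/(2 pi)) = (2/omega) sin(omega T/2).
    return (T * np.sinc(omega * T / (2 * np.pi))) ** 2


def smeared_ratio(T, f, omega_max=8.0, n=400_001):
    """Integral of window_ft_squared * f over omega, divided by 2 pi T f(0)."""
    omega = np.linspace(-omega_max, omega_max, n)
    d_omega = omega[1] - omega[0]
    integral = np.sum(window_ft_squared(omega, T) * f(omega)) * d_omega
    return integral / (2 * np.pi * T * f(0.0))


def trial(omega):
    """Smooth trial function (an arbitrary choice for this check)."""
    return np.exp(-omega ** 2)


ratio_T50 = smeared_ratio(50.0, trial)
ratio_T500 = smeared_ratio(500.0, trial)
print(ratio_T50, ratio_T500)
```

The ratio approaches 1 as T grows, with a correction of order 1/T for this trial function, in agreement with the replacement of the squared transform by T · 2πδ(ω).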
3 As discussed in Section 11.1, the field operators which appear in the Dyson formula are
free fields, whose commutators or anticommutators are calculable complex numbers.

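on the initial and final states respectively:

\[
\begin{aligned}
\langle 1, 2, \ldots|\, S\, |a, b\rangle
&= \int d^4x\, d^4x' \ldots \langle 1, 2, \ldots|\, \chi_1^\dagger(x_1)\, \chi_2^\dagger(x_2) \ldots [\ldots]\, \chi_a(x_a)\, \chi_b(x_b)\, |a, b\rangle \\
&= \int d^4x\, d^4x' \ldots \langle 1, 2, \ldots|\, (\chi_1^\dagger)^{(-)}(x_1)\, (\chi_2^\dagger)^{(-)}(x_2) \ldots [\ldots]\, \chi_a^{(+)}(x_a)\, \chi_b^{(+)}(x_b)\, |a, b\rangle .
\end{aligned}
\tag{11.27}
\]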

The points x1 , x2 , . . . are a permutation of x, x′, . . ., not necessarily
different from each other. In the square brackets, there is a polynomial of other fields,
provided that the product of the fields shown explicitly with those implied
forms a Lorentz-invariant combination.
In the expansion of the fields χ in plane waves, only terms with the
destruction or creation operators appropriate for the initial and final states are
counted, therefore the matrix element contains the corresponding normalisation
coefficients as factors:
\[
N_i =
\begin{cases}
\sqrt{\dfrac{m}{E_i V}} & \text{fermions} \\[2ex]
\sqrt{\dfrac{1}{2\omega_i V}} & \text{bosons}
\end{cases}
\tag{11.28}
\]

each multiplied by a Lorentz-invariant exponential, $e^{i(px)}$, and by a solution of
the wave equation (for example a Dirac spinor) which has the transformation
properties of the original field. This last factor, multiplied by the fields in
square brackets in (11.27), forms a Lorentz-invariant combination.
Introducing the sum over the final states corresponding to the detector
resolution, we find, finally:
\[
\Delta\sigma = V^2\, \frac{(N_a N_b)^2}{v} \sum_f \left[ (2\pi)^4\, \delta^{(4)}(P_f - P_i) \prod_{i=1,2,\ldots} N_i^2\, |M_{i\to f}|^2 \right].
\tag{11.29}
\]

$M_{i\to f}$ is called the Feynman amplitude of the process and is, from its
derivation, relativistically invariant. The factor V² which arises from (11.26)
is cancelled by the factors 1/V in the normalisation of the initial states. The
factors 1/V in the normalisation of the final states are balanced by
corresponding factors in the phase space of each final particle:

\[
\sum_f \prod_{i=1,2,\ldots} N_i^2\, (\ldots) = \frac{1}{(2\pi)^{3n_f}} \int \frac{d^3p_1}{2E_1}\, \frac{d^3p_2}{2E_2} \ldots \prod_{i=1,2,\ldots} n_i^2\, (\ldots)
\tag{11.30}
\]

where we have set $2E_i N_i^2 = n_i^2 = 2m_i$ or 1 for fermions or bosons, respectively.
It should be noted that the momentum integration volume associated with
each particle is invariant under $L_+^\uparrow$ transformations⁴, as seen from the equality:

\[
\frac{d^3p}{2E(p)} = d^4p\, \delta(p^2 - m^2)\, \theta(p^0)
\tag{11.31}
\]
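
The invariance of the measure d³p/2E can be illustrated numerically for a boost along the z axis: the Jacobian of the map p → p′ must equal E′/E, so that d³p′/E′ = d³p/E. The sketch below is a minimal check under assumed values; the mass, the boost velocity and the momentum are hypothetical, and the Jacobian is estimated by central finite differences.

```python
import numpy as np


def energy(p, m):
    """On-shell energy E(p) = sqrt(m^2 + |p|^2)."""
    return np.sqrt(m ** 2 + np.dot(p, p))


def boost_z(p, m, beta):
    """Three-momentum of an on-shell particle after a boost with velocity beta along z."""
    gamma = 1.0 / np.sqrt(1.0 - beta ** 2)
    return np.array([p[0], p[1], gamma * (p[2] + beta * energy(p, m))])


def jacobian(p, m, beta, h=1e-6):
    """Numerical Jacobian d p'_i / d p_j by central differences."""
    J = np.empty((3, 3))
    for j in range(3):
        dp = np.zeros(3)
        dp[j] = h
        J[:, j] = (boost_z(p + dp, m, beta) - boost_z(p - dp, m, beta)) / (2 * h)
    return J


# Hypothetical values (natural units).
m, beta = 0.14, 0.6
p = np.array([0.3, -0.2, 0.5])

det_J = np.linalg.det(jacobian(p, m, beta))
energy_ratio = energy(boost_z(p, m, beta), m) / energy(p, m)
print(det_J, energy_ratio)
```

The two printed numbers agree, which is the statement that d³p′ = (E′/E) d³p.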
Taking into account all these observations, we arrive at the expression
($E_b = m_b$ for a fixed target):

\[
d\sigma = \frac{(n_a n_b)^2}{v \cdot (4E_a m_b)}\, (2\pi)^4\, \delta^{(4)}(P_f - P_i) \prod_i \frac{d^3p_i}{(2\pi)^3\, 2E_i}\, n_i^2\, |M_{i\to f}|^2 .
\tag{11.32}
\]
We can see that the flux factor is also Lorentz invariant, by rewriting:

\[
v \cdot (E_a m_b) = |\mathbf{p}_a|\, m_b = \sqrt{(E_a m_b)^2 - m_a^2 m_b^2} = \sqrt{(p_a \cdot p_b)^2 - p_a^2\, p_b^2},
\tag{11.33}
\]

therefore, by direct inspection of (11.32), we reach an important conclusion:

• the collision cross-section is a relativistic invariant.
Equation (11.33) permits us to calculate the cross-section in reference
systems in which both the initial particles are in motion, as is the case in modern
colliders. In this case, a particularly useful reference frame is the
centre-of-mass system ($\mathbf{p}_a + \mathbf{p}_b = 0$) which, for collisions between symmetric beams,
coincides with the laboratory frame of reference. Assuming also, for simplicity,
equal masses for particles a and b, we find:

\[
\sqrt{(p_a \cdot p_b)^2 - p_a^2\, p_b^2} = 2|\mathbf{p}|E = 2vE^2 = |\mathbf{v}_a - \mathbf{v}_b|\, E^2 .
\tag{11.34}
\]

The flux is determined by the relative velocity between the two particles
(which can also be larger than c). The differential cross-section in the general
case of particles with velocity $\mathbf{v}_a$ and $\mathbf{v}_b$ is given by:

\[
d\sigma = \frac{(n_a n_b)^2}{|\mathbf{v}_a - \mathbf{v}_b| \cdot (4E_a E_b)}\, (2\pi)^4\, \delta^{(4)}(P_f - P_i) \prod_i \frac{d^3p_i}{(2\pi)^3\, 2E_i}\, n_i^2\, |M_{i\to f}|^2 .
\tag{11.35}
\]
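
The invariant form of the flux factor in (11.33) and (11.34) can be checked with a few lines of code. The sketch below is only an illustration: the helper names and the numerical values of masses and momenta are hypothetical. It evaluates the invariant in the fixed-target frame, in a boosted frame, and in the centre-of-mass frame for equal masses.

```python
import numpy as np


def four_momentum(m, p3):
    """On-shell four-momentum (E, px, py, pz) for mass m and three-momentum p3."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(m * m + p3 @ p3)], p3))


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1:] @ b[1:]


def flux_invariant(pa, pb):
    """sqrt((pa.pb)^2 - pa^2 pb^2), the Lorentz-invariant flux factor."""
    return np.sqrt(mdot(pa, pb) ** 2 - mdot(pa, pa) * mdot(pb, pb))


def boost_z(p, beta):
    """Boost a four-vector with velocity beta along the z axis."""
    g = 1.0 / np.sqrt(1.0 - beta ** 2)
    return np.array([g * (p[0] + beta * p[3]), p[1], p[2], g * (p[3] + beta * p[0])])


# Fixed target (hypothetical values): projectile with |p_a| = 2.0 on a target at rest.
ma, mb, pa_abs = 0.938, 0.938, 2.0
pa = four_momentum(ma, [0.0, 0.0, pa_abs])
pb = four_momentum(mb, [0.0, 0.0, 0.0])
flux_lab = flux_invariant(pa, pb)                              # equals |p_a| m_b
flux_boosted = flux_invariant(boost_z(pa, -0.7), boost_z(pb, -0.7))

# Centre-of-mass frame, equal masses: the invariant equals 2|p|E = |v_a - v_b| E^2.
p_cm = 1.5
qa = four_momentum(ma, [0.0, 0.0, +p_cm])
qb = four_momentum(ma, [0.0, 0.0, -p_cm])
flux_cm = flux_invariant(qa, qb)
relative_velocity = 2.0 * p_cm / qa[0]
print(flux_lab, flux_boosted, flux_cm, relative_velocity * qa[0] ** 2)
```

In the last line the relative velocity of the two beams exceeds the velocity of either particle, as remarked in the text.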

Lifetime. The interaction may cause decays of particles which would be
stable in the free theory. The decay probability per unit time into a state |f⟩ is
given by:
\[
\Gamma(f) = \frac{|\langle f|\, S\, |a\rangle|^2}{T}.
\tag{11.36}
\]
Summing over all final states, the total decay probability is found:

\[
\Gamma = \sum_f \Gamma(f).
\tag{11.37}
\]

4 Which leave the sign of p⁰ invariant, p^μ being timelike or lightlike.

For a given final state f , the ratio:

\[
B(f) = \frac{\Gamma(f)}{\Gamma}
\tag{11.38}
\]
represents the probability to find the state f among the decay products and
is called the branching ratio or branching fraction. Clearly:

\[
\sum_f B(f) = 1.
\tag{11.39}
\]

For a system of N particles, the change in the number present at time t is
given by:
\[
\frac{dN(t)}{dt} = -\Gamma \cdot N(t)
\]
from which:
\[
N(t) = N(0)\, e^{-\Gamma t}.
\]
The lifetime of the unstable particle is therefore:
\[
\tau = \frac{1}{\Gamma}.
\tag{11.40}
\]
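
Equations (11.37)–(11.40) translate directly into a few lines of code. The following sketch uses hypothetical partial widths in arbitrary units (they do not refer to any real particle) and natural units in which τ = 1/Γ.

```python
import math


def total_width(partial_widths):
    """Gamma = sum over final states of Gamma(f), equation (11.37)."""
    return sum(partial_widths.values())


def branching_ratios(partial_widths):
    """B(f) = Gamma(f) / Gamma, equation (11.38)."""
    gamma = total_width(partial_widths)
    return {state: width / gamma for state, width in partial_widths.items()}


def surviving(n0, gamma, t):
    """N(t) = N(0) exp(-Gamma t)."""
    return n0 * math.exp(-gamma * t)


# Hypothetical partial widths (arbitrary units).
widths = {"f1": 0.6, "f2": 0.3, "f3": 0.1}

gamma = total_width(widths)
tau = 1.0 / gamma                      # lifetime, equation (11.40)
ratios = branching_ratios(widths)
fraction_left_after_tau = surviving(1.0, gamma, tau)
print(tau, ratios, fraction_left_after_tau)
```

After one lifetime the surviving fraction is 1/e, and the branching ratios sum to one as in (11.39).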
Γ is known as the total width of the particle⁵. We consider the case of an
interaction invariant under translations in space-time. Proceeding as before, we
find, in the rest system of the unstable particle:

\[
d\Gamma(a \to 1 + 2 + \cdots + n_f) =
\frac{1}{(2\pi)^{3n_f}}\, \frac{n_a^2}{2m_a} \prod_i \frac{d^3p_i}{2E_i}\, n_i^2\, (2\pi)^4\, \delta^{(4)}(P_f - P_a)\, |M(a \to 1 + 2 + \cdots + n_f)|^2 .
\tag{11.41}
\]

11.4 PROBLEMS FOR CHAPTER 11

Sect. 11.3
1. Prove the relation:
\[
\frac{d^3p}{2E(p)} = d^4p\, \delta(p^2 - m^2)\, \theta(p^0).
\]

5 The name originates from the fact that an unstable state necessarily has an uncertainty
in the energy given by the Heisenberg principle ΔEΔt ≈ ħ. If we take Δt = τ we find ΔE =
ħ/τ = Γ; therefore Γ represents the width of the distribution of energies of the unstable
state.

2. The Fabri–Dalitz plot. Consider a three-particle decay and assume the
initial particle has spin zero (e.g. K⁰ → π⁺π⁻π⁰) or is not polarised,
so we can assume isotropy in its rest system. In this frame, the momenta
of the three final particles add to zero, which means that they lie in a
plane, which we take to be the x − y plane, with the first particle (e.g.
the π⁺) along the x axis. The momentum of particle 3 is fixed to be
p3 = −p1 − p2 so the only variables left are |p1 |, |p2 | and the angle
between the two vectors, θ12 . The angle can be eliminated in favour of
the energy of particle 3, from the relation

\[
|\mathbf{p}_3|^2 = |\mathbf{p}_1|^2 + |\mathbf{p}_2|^2 + 2\cos\theta_{12}\, |\mathbf{p}_1| \cdot |\mathbf{p}_2|
\]

and we are left with three variables, the energies, restricted by the energy
conservation relation:

\[
E_1 + E_2 + E_3 = M = \text{const.}
\]

Each set of allowed energies can be represented as a point inside a fixed
equilateral triangle, the energies being the distances of the point from the
three sides of the triangle (this property is known as Viviani’s theorem).
The beauty and usefulness of this construction is that one can prove
that the density of points arising from phase space alone, that is, for a constant
squared decay matrix element, is uniform. Hence, if we measure many
decays and report a point for each decay in the Fabri–Dalitz plot, any
density variation of the points reflects a true property of the matrix
element, e.g. the presence of a resonance or possible spin correlations.
Thus, the Fabri–Dalitz plot puts into evidence the dynamics of the decay.
The differential rate in the rest system of the decay is

\[
d\Gamma = \text{Const} \times \delta^{(4)}\Big(P - \sum_{i=1,2,3} p_i\Big) \left( \prod_{i=1,2,3} \frac{d^3p_i}{2E_i} \right) |M|^2 .
\]

Prove that, after elimination of irrelevant variables, one can write

\[
d\Gamma = \text{Const} \times \delta(M - E_1 - E_2 - E_3) \left( \prod_{i=1,2,3} dE_i \right) |M|^2 .
\]

3. Prove that the boundary of the allowed region of the Fabri–Dalitz plot
corresponds to collinear conﬁgurations, namely θ12 = 0, π.
4. Determine the boundary of the Fabri–Dalitz plot for a decay into three
equal-mass particles (e.g. K⁰ → π⁺π⁻π⁰).
CHAPTER 12

THE DISCRETE
SYMMETRIES: P, C, T

In this chapter, we discuss discrete symmetries in quantum field theories,
i.e. the transformations:
• inversion of the spatial coordinate axes at a given time:

x → −x, t → t (12.1)

or parity, which we denote with P,

• substitution of every particle by its antiparticle, and vice versa, or charge
conjugation, C,
• inversion of time, or time reversal

t → −t, x → x (12.2)

which we denote as T .
The ﬁrst two transformations are represented in the Hilbert space of states
by unitary operators while time inversion must act as an anti-unitary operator.
In this chapter, we will refer principally to the QED Lagrangian which, as we
will see, is invariant under all these transformations, which are therefore exact
symmetries of QED. At the end of the chapter, we will consider the Fermi
Lagrangian, the prototype description of the weak interactions, in which the
symmetries are individually violated, beginning with parity, and the only exact
symmetry is the product of all three, the T CP transformation.

12.1 PARITY
The action of parity on the field operators follows from (12.1): the trans-
formed components of the field in x must be a superposition of the field
168 DOI: 10.1201/9781003436263-12
This chapter has been made available under a CC BY NC license.
components in −x. The application of the parity operation twice returns us
to the starting point, therefore we impose the condition:
\[
\mathcal{P}\mathcal{P} = 1.
\tag{12.3}
\]

For the electromagnetic field, we know that in the classical limit, the spatial
components of the vector potential are polar vectors, while the scalar potential
is just that, a scalar. Therefore (the indices repeated three times are not
summed):
PAμ (x, t)P = g μμ Aμ (−x, t). (12.4)
From here we find:
P∂ λ Aμ (x, t)P = g μμ g λλ ∂ λ Aμ (−x, t) (12.5)
from which follows the rule for the transformation of Maxwell’s tensor.
PF μν (x, t)P = g μμ g νν F μν (−x, t). (12.6)
In the case of the spinor, the general rule is written:
Pψ(x, t)α P = [P ψ(−x, t)]α = Pαβ ψ(−x, t)β (12.7)
where P is a matrix in the space of Dirac spinors to be defined so that ob-
servables constructed with the fields transform correctly under parity.
To respect the classical limit, the current must transform like the vector
potential:
P ψ̄(x, t)γ μ ψ(x, t)P = g μμ ψ̄(−x, t)γ μ ψ(−x, t). (12.8)
Substitution of (12.7) in the above equation yields
P ψ̄(x, t)γ μ ψ(x, t)P = g μμ ψ̄(−x, t)P † γ 0 γ μ P ψ(−x, t). (12.9)
• For μ = 0, we find:
P † P = 1. (12.10)

• For μ = i (i=1, 2, 3), we ﬁnd:

P † γ 0 γ i P = P † αi P = −αi . (12.11)

Finally, P must anticommute with all the αi and this implies:

P = cγ 0 . (12.12)
If we require that P² = 1, it follows that |c|² = 1. Without loss of generality,
we can fix c = 1 and thus:
\[
P = \gamma^0; \qquad P = P^\dagger = P^{-1}.
\tag{12.13}
\]
As we saw in Section 7.5, observables are constructed from bilinear combi-
nations of the Dirac fields, which in their turn are organised into Dirac bilin-
ear covariants. Equations (12.7) and (12.13) define the parity transformation
properties of each bilinear.
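Bilinear Covariants. We leave to the reader the task of generalising (12.9),
showing that the bilinears S, P, V, A, T transform under parity as scalars,
pseudoscalars, polar vectors, axial vectors and tensors.

\[
\begin{aligned}
S &: \ \mathcal{P}\, \bar\psi(\mathbf{x}, t)\psi(\mathbf{x}, t)\, \mathcal{P} = \bar\psi(-\mathbf{x}, t)\psi(-\mathbf{x}, t) \\
P &: \ \mathcal{P}\, \bar\psi(\mathbf{x}, t)\, i\gamma_5\, \psi(\mathbf{x}, t)\, \mathcal{P} = -\bar\psi(-\mathbf{x}, t)\, i\gamma_5\, \psi(-\mathbf{x}, t) \\
V &: \ \mathcal{P}\, \bar\psi(\mathbf{x}, t)\gamma^\mu\psi(\mathbf{x}, t)\, \mathcal{P} = g^{\mu\mu}\, \bar\psi(-\mathbf{x}, t)\gamma^\mu\psi(-\mathbf{x}, t) \\
A &: \ \mathcal{P}\, \bar\psi(\mathbf{x}, t)\gamma^\mu\gamma_5\psi(\mathbf{x}, t)\, \mathcal{P} = -g^{\mu\mu}\, \bar\psi(-\mathbf{x}, t)\gamma^\mu\gamma_5\psi(-\mathbf{x}, t) \\
T &: \ \mathcal{P}\, \bar\psi(\mathbf{x}, t)\sigma^{\mu\nu}\psi(\mathbf{x}, t)\, \mathcal{P} = g^{\mu\mu} g^{\nu\nu}\, \bar\psi(-\mathbf{x}, t)\sigma^{\mu\nu}\psi(-\mathbf{x}, t).
\end{aligned}
\tag{12.14}
\]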
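
The signs in (12.14) can be verified with explicit matrices. The sketch below is only a numerical check: it assumes the Pauli (Dirac) representation of the γ matrices, the conventions γ5 = iγ⁰γ¹γ²γ³ and σ^{μν} = (i/2)[γ^μ, γ^ν], and P = γ⁰; the helper names are hypothetical. Since ψ → Pψ implies ψ̄ → ψ̄ γ⁰P†γ⁰, each bilinear ψ̄Γψ is mapped into ψ̄ (γ⁰P†γ⁰ Γ P) ψ.

```python
import numpy as np

# Pauli (Dirac) representation of the gamma matrices (an assumed convention).
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = [1, -1, -1, -1]  # diagonal elements g^{mu mu} of the metric
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]


def sigma_mn(m, n):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    return 0.5j * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])


P = gamma[0]


def parity_image(G):
    """Matrix sandwiched between psi-bar and psi after psi -> P psi."""
    return gamma[0] @ P.conj().T @ gamma[0] @ G @ P


idx = range(4)
checks = {
    "S": np.allclose(parity_image(np.eye(4)), np.eye(4)),
    "P": np.allclose(parity_image(1j * gamma5), -1j * gamma5),
    "V": all(np.allclose(parity_image(gamma[m]), g[m] * gamma[m]) for m in idx),
    "A": all(np.allclose(parity_image(gamma[m] @ gamma5), -g[m] * gamma[m] @ gamma5)
             for m in idx),
    "T": all(np.allclose(parity_image(sigma_mn(m, n)), g[m] * g[n] * sigma_mn(m, n))
             for m in idx for n in idx),
}
print(checks)
```

Every entry of `checks` is `True` if the five bilinears transform as listed in (12.14).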

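A corollary of the relations we have demonstrated is that the QED action
is parity invariant. Indeed, in
\[
L_{QED} = -\frac{1}{4} F_{\mu\nu}(x) F^{\mu\nu}(x) + \bar\psi(x)(i\partial\!\!\!/ - m)\psi(x) + e A_\mu(x)\, \bar\psi\gamma^\mu\psi(x)
\tag{12.15}
\]
only scalar quantities or scalar products of vectors appear. Therefore:

\[
\mathcal{P} S \mathcal{P} = \mathcal{P} \int d^4x\, L_{QED}(\mathbf{x}, t)\, \mathcal{P}
= \int d^4x\, L_{QED}(-\mathbf{x}, t) = \int d^4x\, L_{QED}(\mathbf{x}, t) = S.
\tag{12.16}
\]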

The invariance remains valid if we add to the action non-minimal terms
which describe a possible anomalous magnetic moment of the form (cf.
Section 9.1):
\[
L_{\text{non min.}} = \frac{\kappa}{2m}\, \bar\psi \sigma_{\mu\nu} \psi\, F^{\mu\nu}.
\tag{12.17}
\]
2m

12.2 CHARGE CONJUGATION

The field ψ destroys an electron and creates a positron, while ψ † creates
an electron and destroys a positron. As we have observed several times, the
statement that the positron is the antiparticle of the electron is purely a matter
of convention and the roles of the electron and positron in QED are completely
symmetric. In other words, we may equally choose to formulate QED in terms
of a new field which destroys the positron and creates the electron. If we call
the new field ψc , from what has just been said the components of ψc must be
linear combinations of the components of ψ † .
In equation form (we do not include the coordinates which are the same
on the left and right-hand sides):

\[
[\psi_c]_\alpha = \mathcal{C} \psi_\alpha \mathcal{C} = [C\psi^\dagger]_\alpha = C_{\alpha\beta}\, \psi_\beta^\dagger
\tag{12.18}
\]

where, as before, C is a matrix to be determined so that the transformation of
observables by $\mathcal{C}$ corresponds to the replacement of each electron by a positron
and vice versa. Taking the hermitian conjugate of (12.18), we also find:
\[
[\psi_c]_\alpha^\dagger = \mathcal{C} \psi_\alpha^\dagger \mathcal{C} = [\psi C^\dagger]_\alpha .
\tag{12.19}
\]

As in the case of parity, repeating $\mathcal{C}$ twice gives the identity transformation.
Therefore we can require that C² = 1, and thus:

ψ = (ψc )† C † ; ψ † = Cψc (12.20)

In order that the Dirac Lagrangian should be invariant under C, we must

have, in the ﬁrst place:

ψ̄ψ = ψ̄c ψc . (12.21)

We use the Pauli representation, in which γ 0 is real and symmetric, for

the γ matrices. The ﬁeld products in the formulae should all be understood
as normal products, within which the fermion ﬁelds anticommute. Therefore
we have:

ψ̄ψ = − ψ(γ 0 )T ψ † = − (ψc )† C † γ 0 Cψc (12.22)

from which:

C † γ 0 C = −γ 0 . (12.23)

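Now we require that terms with derivatives in the Dirac action, $S_{kin}$,
should maintain the same form when expressed in the conjugate fields:

\[
\begin{aligned}
S_{kin} = \int d^4x\, i\bar\psi\, \partial\!\!\!/\, \psi
&= -\int d^4x\, i(\partial_\mu \psi^\dagger)\gamma^0\gamma^\mu\psi \\
&= +\int d^4x\, i\psi\, (\gamma^0\gamma^\mu)^T (\partial_\mu\psi^\dagger) \\
&= +\int d^4x\, i\bar\psi_c\, \gamma^0 C^\dagger (\gamma^\mu)^T \gamma^0 C\, (\partial_\mu \psi_c).
\end{aligned}
\tag{12.24}
\]

Therefore we must have:

\[
\gamma^0 C^\dagger (\gamma^\mu)^T \gamma^0 C = \gamma^\mu .
\tag{12.25}
\]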

If we set μ = 0, the previous equation becomes:

C † C = 1 → C † = C = C −1 (12.26)

and from equation (12.23) we see that C must anticommute with γ 0 . Further-
more, if we use the general relation:

γ 0 (γ μ )† γ 0 = γ μ (12.27)

equation (12.25) becomes:

Cγ μ C = −(γ μ )∗ . (12.28)

In the Pauli representation, γ 0,1,3 are real while γ 2 is imaginary. Equation

(12.28) says that C anticommutes with γ 0,1,3 and commutes with γ 2 , therefore
C must be proportional to γ 2 . If we require that its square should be equal to
one, we ﬁnd, to within an unimportant sign:

C = iγ 2 . (12.29)

C is therefore a real, symmetric matrix. With this deﬁnition of C, we can

determine the behaviour of the Dirac bilinears under C. In the general case,
we write:

O(Γ) = ψ̄Γψ = −ψ(Γ)T γ 0 ψ † . (12.30)

Using (12.18) and (12.19), we have:

CΓC = − ψ̄c γ 0 C(Γ)T γ 0 Cψc = ψ̄c CΓ∗ Cψc (12.31)

where we have assumed that the bilinears are deﬁned to satisfy the hermiticity
relation:

γ 0 Γγ 0 = Γ† (12.32)

(This is the reason for the factor i which appears in the pseudoscalar density.)
Finally, we ﬁnd that, under charge conjugation:

O(Γ) → O(CΓ∗ C). (12.33)

From (12.33), it can easily be shown that each of the Dirac bilinears takes
a characteristic sign under C, which we denote as ηC , where

ηC = +1, for S, P, A
ηC = −1, for V, T. (12.34)

The QED Lagrangian, (12.15), including its non-minimal extensions, is

invariant under C.
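
The properties of C = iγ² and the signs in (12.34) can be checked with explicit matrices. The sketch below is a numerical illustration only: it assumes the Pauli (Dirac) representation, γ5 = iγ⁰γ¹γ²γ³ and σ^{μν} = (i/2)[γ^μ, γ^ν], and the helper `eta` is a hypothetical name. It tests (12.28) and evaluates the rule Γ → CΓ*C of (12.33) on each bilinear.

```python
import numpy as np

# Pauli (Dirac) representation of the gamma matrices (an assumed convention).
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

C = 1j * gamma[2]

# C is real and symmetric, squares to one, and satisfies (12.28).
is_real_symmetric = np.allclose(C.imag, 0) and np.allclose(C, C.T)
squares_to_one = np.allclose(C @ C, np.eye(4))
satisfies_12_28 = all(np.allclose(C @ G @ C, -G.conj()) for G in gamma)


def eta(matrices):
    """Sign eta_C such that C G* C = eta_C G for every matrix G in the list."""
    for sign in (+1, -1):
        if all(np.allclose(C @ G.conj() @ C, sign * G) for G in matrices):
            return sign
    return None


tensors = [0.5j * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])
           for m in range(4) for n in range(m + 1, 4)]
eta_C = {
    "S": eta([np.eye(4, dtype=complex)]),
    "P": eta([1j * gamma5]),
    "V": eta(gamma),
    "A": eta([G @ gamma5 for G in gamma]),
    "T": eta(tensors),
}
print(is_real_symmetric, squares_to_one, satisfies_12_28, eta_C)
```

The resulting signs reproduce (12.34): +1 for S, P, A and −1 for V, T.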

12.3 TIME REVERSAL

In classical mechanics, we can change the signs of the velocities of all the
particles in the system at time t = 0. At subsequent times, the new system
retraces all the configurations through which the original system passed. If we
denote a point in the phase space of a classical Hamiltonian system by A = (q, p),
this transformation, known as time reversal, is

\[
A = (q, p) \to A_T = (q, -p).
\tag{12.35}
\]

Figure 12.1 Time reversal of the motion of a classical harmonic oscillator.

If we compare the behaviour of the system which starts from A at the time
t = 0 with one which starts from AT we can express the previous statements
in the following way:

A(t) (state at time t with A(0) = A)

AT (t) (state at time t with AT (0) = AT )
AT (t) = T [A(−t)] (time reversal in classical mechanics) (12.36)

Equation (12.36) is shown graphically in Fig. 12.1, for the case of the
motion of a harmonic oscillator.
What happens in quantum mechanics? The time reversal transformation
must be carried out on the quantum states by an operator, which we denote
as T , such that (working in the Schrödinger representation):

\[
\begin{aligned}
&|A; t = 0\rangle \to |A_T; t = 0\rangle = \mathcal{T}\, |A; t = 0\rangle \\
&|A_T; t\rangle = \mathcal{T}\, |A; -t\rangle \qquad \text{(time reversal in quantum mechanics).}
\end{aligned}
\tag{12.37}
\]

However, the $\mathcal{T}$ operator must have very special properties, as we can see
from the following argument. Suppose that the state A is an energy eigenstate
with eigenvalue E. We expect that $A_T$ should also have the same energy (as
happens in classical mechanics). In that case:

\[
|E; t\rangle = e^{-iEt}\, |E; t = 0\rangle; \qquad |E_T; t\rangle = e^{-iEt}\, |E_T; t = 0\rangle .
\tag{12.38}
\]

However, it must be true that

\[
|E_T; t\rangle = e^{-iEt}\, |E_T; t = 0\rangle
= \mathcal{T}\, [e^{-i(-t)E}\, |E; t = 0\rangle] = \mathcal{T}\, (e^{+itE}\, |E; t = 0\rangle)
\tag{12.39}
\]

which is impossible to realise if $\mathcal{T}$ is a linear operator, because in that case
the phases of the exponentials on the two sides will have opposite signs.
The solution proposed by Wigner is to use an antilinear operator:

\[
\mathcal{T}\, (\alpha |A\rangle + \beta |B\rangle) = \alpha^*\, \mathcal{T} |A\rangle + \beta^*\, \mathcal{T} |B\rangle .
\tag{12.40}
\]

In that case we have:

\[
\mathcal{T}\, (e^{+iEt}\, |E; t = 0\rangle) = e^{-itE}\, \mathcal{T}\, |E; t = 0\rangle = e^{-itE}\, |E_T; t = 0\rangle
\tag{12.41}
\]

as required.

Symmetries and Unitary or Antiunitary Operators. The condition set
by Wigner to have a symmetry is that the operator which represents it should
leave the relations between quantum states unchanged. In their turn, these
relationships are represented by squared moduli of scalar products, which specify
the probability of results of quantum measurements. If:

\[
|A_T\rangle = \mathcal{T} |A\rangle; \qquad |B_T\rangle = \mathcal{T} |B\rangle; \ \ldots
\tag{12.42}
\]

we must have:

\[
|\langle A_T | B_T \rangle|^2 = |\langle A | B \rangle|^2
\tag{12.43}
\]

which gives two possibilities for the operator which carries out the
transformation $|A\rangle \to |A_T\rangle$:

\[
\langle A_T | B_T \rangle = \langle A | B \rangle \ : \ \text{unitary operator}
\tag{12.44}
\]

\[
\langle A_T | B_T \rangle = \langle A | B \rangle^* \ : \ \text{antiunitary operator.}
\tag{12.45}
\]

For transformations dependent on one or more parameters, which are con-

nected to the identity transformation (proper Lorentz, translations), the ﬁrst
condition should hold, by continuity. From the argument we gave earlier, the
second is the one which should apply to all symmetries which involve time
reversal.
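
A concrete antiunitary operator is easy to build in a finite-dimensional space: complex conjugation of the components followed by a unitary matrix. The sketch below is a toy illustration on a two-state system; the particular matrix `U`, the random states and the numerical values of E and t are arbitrary assumptions. It checks the antilinearity (12.40), the phase relation (12.41) and the antiunitary condition (12.45).

```python
import numpy as np

rng = np.random.default_rng(1)

# A unitary matrix (here i * sigma_2); any unitary matrix would serve for this check.
U = np.array([[0, 1], [-1, 0]], dtype=complex)


def T(state):
    """Toy antiunitary map: complex conjugation followed by the unitary U."""
    return U @ state.conj()


def random_state(n=2):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)


A, B = random_state(), random_state()
alpha, beta = 0.3 - 0.8j, 1.1 + 0.4j

# Antilinearity, equation (12.40).
antilinear = np.allclose(T(alpha * A + beta * B),
                         np.conj(alpha) * T(A) + np.conj(beta) * T(B))

# Phase of an energy eigenstate, equation (12.41).
E, t = 1.7, 0.9
phase_ok = np.allclose(T(np.exp(1j * E * t) * A), np.exp(-1j * E * t) * T(A))

# Scalar products: np.vdot conjugates its first argument, so vdot(A, B) = <A|B>.
overlap = np.vdot(A, B)
overlap_T = np.vdot(T(A), T(B))
print(antilinear, phase_ok, overlap, overlap_T)
```

The transformed scalar product is the complex conjugate of the original one, so its squared modulus, and with it every transition probability, is unchanged.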

Action of T on Fields. We require that:

T † = T −1 = T . (12.46)

The action of the $\mathcal{T}$ operator on the vector potential is fixed by the
classical limit. If $\mathcal{T}$ leaves the positions of charges unchanged and reverses their
velocities, it must be true that:

\[
\mathcal{T} A^0(\mathbf{x}, t)\, \mathcal{T} = +A^0(\mathbf{x}, -t); \qquad \mathcal{T} A^i(\mathbf{x}, t)\, \mathcal{T} = -A^i(\mathbf{x}, -t)
\]
or
\[
\mathcal{T} A^\mu(\mathbf{x}, t)\, \mathcal{T} = g^{\mu\mu} A^\mu(\mathbf{x}, -t).
\tag{12.47}
\]
The T operator must transform ψα (x, t) into a linear combination of the
ψβ (x, −t) ﬁelds:
T ψα (x, t)T = Tαβ ψβ (x, −t) = [T ψ(x, −t)]α (12.48)
from which, then:
T ψ(x, t)† T = [ψ(x, −t)T ]† . (12.49)
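The form of the T matrix is defined by requiring that the current
transforms like $A^\mu$ in (12.47). Taking into account the antilinearity of $\mathcal{T}$, we have:

\[
\begin{aligned}
\mathcal{T}\, \bar\psi(\mathbf{x}, t)\gamma^\mu\psi(\mathbf{x}, t)\, \mathcal{T}
&= \mathcal{T} \psi(\mathbf{x}, t)^\dagger \mathcal{T}\ \mathcal{T} \gamma^0\gamma^\mu \psi(\mathbf{x}, t)\, \mathcal{T} \\
&= \mathcal{T} \psi(\mathbf{x}, t)^\dagger \mathcal{T}\, (\gamma^0\gamma^\mu)^*\, \mathcal{T} \psi(\mathbf{x}, t)\, \mathcal{T}
= \psi(\mathbf{x}, -t)^\dagger\, T^\dagger (\gamma^0\gamma^\mu)^* T\, \psi(\mathbf{x}, -t)
\end{aligned}
\tag{12.50}
\]
from which we find:
\[
\begin{aligned}
\mu = 0 &: \ T^\dagger T = 1 \\
\mu = i &: \ T^\dagger (\alpha^i)^* T = -\alpha^i
\end{aligned}
\tag{12.51}
\]
where we have introduced the Dirac matrices, $\alpha^i = \gamma^0\gamma^i$. If we require, as
usual, that T² = 1, the previous conditions give:
\[
\begin{aligned}
&T^\dagger = T^{-1} = T \\
\mu = i &: \ T (\alpha^i)^* T = -\alpha^i .
\end{aligned}
\tag{12.52}
\]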
In the Pauli representation, α1,3 are real and α2 imaginary, therefore T
must anticommute with α1,3 and commute with α2 . Thus, from (12.52) we
find:
T = iγ 1 γ 3 = σ2 (12.53)
The transformation with matrix σ2 leaves unchanged γ 0 , which is real, and
changes the signs of γ i , therefore:
T (γ μ )∗ T = g μμ γ μ (12.54)
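From this, the transformation rules of the Dirac bilinears, which are formed
from products of the gamma matrices, can immediately be found. We find:
\[
\begin{aligned}
O(\Gamma)(\mathbf{x}, t) &= \bar\psi(\mathbf{x}, t)\, \Gamma\, \psi(\mathbf{x}, t) \\
\mathcal{T}\, O(\Gamma)(\mathbf{x}, t)\, \mathcal{T} &= O(T\Gamma^* T)(\mathbf{x}, -t) = \eta_T\, O(\Gamma)(\mathbf{x}, -t)
\end{aligned}
\tag{12.55}
\]
where, with the pseudoscalar density defined with $\Gamma = i\gamma_5$ as in (12.14):

\[
\begin{aligned}
\eta_T &= +1 \quad (S) \\
\eta_T &= -1 \quad (P) \\
\eta_T &= g^{\mu\mu} \quad (V, A) \\
\eta_T &= -g^{\mu\mu} g^{\nu\nu} \quad (T).
\end{aligned}
\tag{12.56}
\]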
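
The rule Γ → TΓ*T of (12.55) can be evaluated with explicit matrices. The sketch below is a numerical illustration only: it assumes the Pauli (Dirac) representation, γ5 = iγ⁰γ¹γ²γ³, σ^{μν} = (i/2)[γ^μ, γ^ν], and takes T = iγ¹γ³ (the overall sign of T does not affect the result). With the hermitian pseudoscalar density defined with Γ = iγ5, the rule gives the sign −1 for P, which together with the signs under parity and charge conjugation makes the product of the three signs equal to +1 for the pseudoscalar, as for the scalar.

```python
import numpy as np

# Pauli (Dirac) representation of the gamma matrices (an assumed convention).
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = [1, -1, -1, -1]  # diagonal elements g^{mu mu} of the metric
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

T = 1j * gamma[1] @ gamma[3]


def t_image(G):
    """Matrix T G* T which replaces G in the bilinear under time reversal."""
    return T @ G.conj() @ T


def sigma_mn(m, n):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    return 0.5j * (gamma[m] @ gamma[n] - gamma[n] @ gamma[m])


idx = range(4)
hermitian_and_involutive = np.allclose(T, T.conj().T) and np.allclose(T @ T, np.eye(4))
satisfies_12_54 = all(np.allclose(t_image(gamma[m]), g[m] * gamma[m]) for m in idx)

checks = {
    "S (+1)": np.allclose(t_image(np.eye(4)), np.eye(4)),
    "P (-1)": np.allclose(t_image(1j * gamma5), -1j * gamma5),
    "V (g)": satisfies_12_54,
    "A (g)": all(np.allclose(t_image(gamma[m] @ gamma5), g[m] * gamma[m] @ gamma5)
                 for m in idx),
    "T (-g g)": all(np.allclose(t_image(sigma_mn(m, n)), -g[m] * g[n] * sigma_mn(m, n))
                    for m in idx for n in idx),
}
print(hermitian_and_involutive, checks)
```

Every entry of `checks` is `True` for the signs indicated in the dictionary keys.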

12.4 TRANSFORMATION OF THE STATES

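We consider the transformation properties of the electron and positron
states. We restrict ourselves to the case of free-particle fields which describe
asymptotic “in” and “out” states. For the reader’s convenience, we repeat the
positive and negative energy solutions of the Dirac equation from Section 6.1.4.
The spin is quantised in a fixed direction which we choose to be along the
z-axis.
\[
\begin{aligned}
u_s(p) &= \sqrt{\frac{E(p) + m}{2m}}
\begin{pmatrix} \chi_s \\[1ex] \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \chi_s \end{pmatrix};
\qquad \sigma_3 \chi_s = s \chi_s \quad (s = \pm 1) \\[2ex]
v_s(p) &= \sqrt{\frac{E(p) + m}{2m}}
\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \xi_s \\[1ex] \xi_s \end{pmatrix};
\qquad \sigma_3 \xi_s = s \xi_s \quad (s = \mp 1).
\end{aligned}
\tag{12.57}
\]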

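From the properties of the Pauli matrices we find:

\[
\begin{aligned}
&\sigma_2 (\sigma^i)^* \sigma_2 = -\sigma^i \quad (i = 1, 2, 3) \\
&\sigma_3 (\sigma_2 \xi_s^*) = -\sigma_2 (\sigma_3 \xi_s)^* = \mp (\sigma_2 \xi_s^*)
\end{aligned}
\]
or
\[
(\sigma_2 \xi_s^*) = \xi_{-s} = \chi_s
\]
and also
\[
(\sigma_2 \chi_s^*) = \chi_{-s} = \xi_s .
\tag{12.58}
\]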

Parity. From (12.7) and (12.13) we ﬁnd:

Pψ(x, t)P = γ 0 ψ(−x, t). (12.59)

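• The left-hand side, expanded from the solutions to the Dirac equation,
is:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ \mathcal{P} a_s(p) \mathcal{P}\, e^{-ipx}\, u_s(p) + \mathcal{P} b_s(p)^\dagger \mathcal{P}\, e^{+ipx}\, v_s(p) \right].
\tag{12.60}
\]

• The right-hand side is:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ a_s(p)\, e^{-ipx_P}\, (\gamma^0 u_s)(p) + b_s(p)^\dagger\, e^{+ipx_P}\, (\gamma^0 v_s)(p) \right]
\tag{12.61}
\]

where

\[
x_P = (-\mathbf{x}, t); \qquad p x_P = Et + \mathbf{x}\cdot\mathbf{p} = p_P\, x.
\tag{12.62}
\]

Changing the integration variable, $\mathbf{p} \to -\mathbf{p}$, the right-hand side
becomes:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ a_s(p_P)\, e^{-ipx}\, (\gamma^0 u_s)(p_P) + b_s(p_P)^\dagger\, e^{+ipx}\, (\gamma^0 v_s)(p_P) \right].
\tag{12.63}
\]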
From (12.57) we obtain:

γ 0 us (pP ) = +us (p); γ 0 vs (pP ) = −vs (p). (12.64)

Finally, comparing the terms in the expansion on each side of (12.59), we

ﬁnd:

Pas (p)P = +as (pP ); Pbs (p)P = −bs (pP ). (12.65)

The absolute sign in these relations could be changed with a diﬀerent

deﬁnition of the sign of P . The important fact is the relative sign of the
electron and its antiparticle:
The electron and positron have opposite parity.

Charge Conjugation. From (12.18) we ﬁnd:

Cψ(x)C = iγ 2 [ψ(x)† ]. (12.66)

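We proceed as before:

• Right-hand side:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}} \times
\left[ b_s(p)\, e^{-ipx}\, (i\gamma^2)_{\alpha\beta}\, [(v_s)_\beta(p)]^* + a_s(p)^\dagger\, e^{+ipx}\, (i\gamma^2)_{\alpha\beta}\, [(u_s)_\beta(p)]^* \right].
\]

• Using (12.57) and (12.58) we find:

\[
\begin{aligned}
i\gamma_2\, v_s(p)^* &= \sqrt{\frac{E(p) + m}{2m}}
\begin{pmatrix} 0 & i\sigma_2 \\ -i\sigma_2 & 0 \end{pmatrix}
\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \xi_s \\[1ex] \xi_s \end{pmatrix}^{\!*} \\[1ex]
&= \sqrt{\frac{E(p) + m}{2m}}
\begin{pmatrix} i\sigma_2\, \xi_s^* \\[1ex] -i\sigma_2\, \dfrac{(\boldsymbol{\sigma}\cdot\mathbf{p})^*}{E(p)+m}\, \xi_s^* \end{pmatrix} \\[1ex]
&= i \sqrt{\frac{E(p) + m}{2m}}
\begin{pmatrix} \chi_s \\[1ex] \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \chi_s \end{pmatrix} = i\, u_s(p)
\end{aligned}
\tag{12.67}
\]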

and, similarly:

iγ2 us (p)∗ = −ivs (p). (12.68)

Comparing the two sides, we obtain:

Cas (p)C = ibs (p); Cbs (p)C = −ias (p). (12.69)

Time Reversal. We start from (12.48):

T ψ(x, t)T = [σ2 ψ(x, −t)]. (12.70)

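• The left-hand side, recalling the antilinearity of $\mathcal{T}$, is:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ \mathcal{T} a_s(p) \mathcal{T}\, e^{+ipx}\, u_s(p)^* + \mathcal{T} b_s(p)^\dagger \mathcal{T}\, e^{-ipx}\, v_s(p)^* \right].
\tag{12.71}
\]

• The right-hand side is:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ a_s(p)\, e^{-ipx_T}\, [\sigma_2 u_s(p)]^\dagger + b_s(p)^\dagger\, e^{+ipx_T}\, [\sigma_2 v_s(p)]^\dagger \right]
\tag{12.72}
\]

where:

\[
x_T = (\mathbf{x}, -t); \qquad p x_T = -Et - \mathbf{p}\cdot\mathbf{x} = p_T\, x.
\tag{12.73}
\]

If we change the integration variable, (12.72) becomes:

\[
\int \frac{d^3p}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(p)}}\,
\left[ a_s(p_T)\, e^{+ipx}\, [\sigma_2 u_s(p_T)]^\dagger + b_s(p_T)^\dagger\, e^{-ipx}\, [\sigma_2 v_s(p_T)]^\dagger \right].
\tag{12.74}
\]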

Figure 12.2 The Action of parity, P, and charge conjugation, C, on the electron and
positron states with diﬀerent helicity. The T transformation does not change the
helicity since it changes both momentum and spin.

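• We use (12.57) again, to obtain:

\[
\begin{aligned}
\{[\sigma_2 u_s(p_T)]^\dagger\} = \{u_s(p_T)^\dagger \sigma_2\}
&= \sqrt{\frac{E(p) + m}{2m}}
\left( \chi_s^\dagger,\ \chi_s^\dagger\, \frac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m} \right)
\begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \\[1ex]
&= \sqrt{\frac{E(p) + m}{2m}}
\left( \chi_s^\dagger \sigma^2,\ \chi_s^\dagger\, \frac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \sigma_2 \right).
\end{aligned}
\tag{12.75}
\]

• Using (12.58) we find:

\[
\begin{aligned}
(\chi_s^\dagger \sigma^2)_\alpha &= (\sigma^2 \chi_s^*)_\alpha = (\chi_{-s})_\alpha ; \\[1ex]
\left( \chi_s^\dagger\, \frac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \sigma^2 \right)_{\!\alpha}
&= \left( \sigma^2\, \frac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \chi_s^* \right)_{\!\alpha}
= \left( \frac{+\boldsymbol{\sigma}\cdot\mathbf{p}}{E(p)+m}\, \chi_{-s} \right)_{\!\alpha}
\end{aligned}
\tag{12.76}
\]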

or:

{[σ2 us (pT )]† }α = [u−s (p)]∗α ; (12.77)

and similarly:

{[σ2 vs (pT )]† }α = [v−s (p)]∗α . (12.78)

We can now compare the two sides, and ﬁnd:

T as (p)T = a−s (−p); T bs (p)T = b−s (−p). (12.79)

Fig. 12.2 shows the action of the three symmetries on the electron and
positron states.

12.5 SOME APPLICATIONS

12.5.1 Furry’s Theorem
This theorem, a consequence of the invariance of QED under C, concerns
Green’s functions which involve only Aμ and states that:
the Green’s functions with an odd number of external photon lines are
identically zero.
The theorem can also be expressed in terms of S-matrix elements:
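the reactions nγ → n′γ vanish if n + n′ = an odd number.

Proof. We use the invariance of the vacuum under $\mathcal{C}$ to obtain:

\[
\begin{aligned}
G^{\mu_1 \ldots \mu_N}(x_1, \ldots x_N) &= \langle 0|\, T[A^{\mu_1}(x_1) \ldots A^{\mu_N}(x_N)]\, |0\rangle \\
&= \langle 0|\, \mathcal{C}\, T[\mathcal{C} A^{\mu_1}(x_1) \mathcal{C}\mathcal{C} \ldots \mathcal{C} A^{\mu_N}(x_N) \mathcal{C}]\, \mathcal{C}\, |0\rangle \\
&= (-1)^N\, \langle 0|\, T[A^{\mu_1}(x_1) \ldots A^{\mu_N}(x_N)]\, |0\rangle .
\end{aligned}
\tag{12.80}
\]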

Clearly, if N is odd the amplitude is equal to its negative and therefore it

must vanish.

12.5.2 Symmetries of Positronium

Positronium is a system formed by an electron and positron bound to-
gether by the Coulomb force. Its energy levels are well described, in the non-
relativistic approximation, using four quantum numbers:

• the radial quantum number: n = 1, 2, . . .,

• the orbital angular momentum: L = 0, 1, 2, . . .,

• the total spin: S = 0, 1,

• the total angular momentum, J, which also takes integer values.

Positronium is similar in every way to the hydrogen atom, except that the
reduced mass, μ, is about half the reduced mass of hydrogen:
\[
\mu = \frac{m_1 m_2}{m_1 + m_2}; \qquad
\mu_{e^+e^-} = \frac{1}{2}\, m_e \simeq \frac{1}{2}\, \mu_H .
\tag{12.81}
\]
Because the reduced mass determines the bound state energy levels, the
positronium spectrum is scaled down in energy by a factor of two compared to
that of hydrogen.
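
The scaling can be made quantitative with the non-relativistic Bohr formula E_n = −μα²/(2n²) (in units with c = 1), which is an assumption of this sketch and not a formula derived in this section. The numerical constants below are rounded values of the fine-structure constant and of the electron and proton rest energies; the function names are hypothetical.

```python
ALPHA = 1.0 / 137.035999      # fine-structure constant (rounded)
M_E = 0.51099895e6            # electron rest energy in eV (rounded)
M_P = 938.27208816e6          # proton rest energy in eV (rounded)


def reduced_mass(m1, m2):
    """mu = m1 m2 / (m1 + m2), as in (12.81)."""
    return m1 * m2 / (m1 + m2)


def bohr_level(mu, n):
    """Non-relativistic Coulomb level E_n = -mu alpha^2 / (2 n^2), in eV."""
    return -0.5 * mu * ALPHA ** 2 / n ** 2


mu_positronium = reduced_mass(M_E, M_E)
mu_hydrogen = reduced_mass(M_E, M_P)

ground_positronium = bohr_level(mu_positronium, 1)
ground_hydrogen = bohr_level(mu_hydrogen, 1)
ratio = ground_positronium / ground_hydrogen
print(ground_hydrogen, ground_positronium, ratio)
```

In this approximation the ground state of positronium lies at about −6.8 eV, half the hydrogen value of about −13.6 eV; the ratio differs from 1/2 only by the small correction due to the finite proton mass.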
Positronium is formed each time a positron comes to rest in matter. At
the end of its travel, the positron captures an electron from the surrounding
atoms and forms an excited state of positronium. In contrast to hydrogen,
The Discrete Symmetries: P, C, T 181

positronium is not stable. Once the electron and positron have reached the
ground state with L = 0, the annihilation probability is signiﬁcant and the
state decays into two or more photons (cf. Landau and Lifshitz [11] for the
cross-section calculation).
For the ground state, there are two energy levels with n = 1, L = 0 and
S = 0 or 1, known respectively as parapositronium and orthopositronium. The
energy diﬀerence between them is very small because it results from magnetic
interactions between the spins.
On the basis of the results from the previous section, we can determine the
parity and charge conjugation properties of the positronium levels. Because
these operations commute with the Hamiltonian, they provide good quantum
numbers and determine the selection rules of the decays.

Positronium States. The states of positronium are linear superpositions
of the states obtained by applying to the vacuum one creation operator for
the electron and another for the positron. The coefficients of the
superposition are given by the product of the spherical harmonics corresponding
to angular momentum L, the Clebsch–Gordan coefficients coupling the electron
and positron spins to obtain the total spin S and the radial wave functions
corresponding to the quantum number n:

$$|n, L, S, J\rangle = \int dp\, R^{(n,L,S,J)}(p) \int d\Omega_p\, Y^L_m(\hat{p}) \sum_{s,s'} C(S, s_3|1/2, s; 1/2, s')\, a_s(\mathbf{p})^\dagger b_{s'}(-\mathbf{p})^\dagger |0\rangle \tag{12.82}$$

where p = |p| and p̂ is the unit vector of p.

Parity of the Levels. We apply the P operator to the state (12.82) and use
(12.65) and the invariance of the vacuum:

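$$\begin{aligned}
\mathcal{P}|n, L, S, J\rangle &= \int dp\, R^{(n,L,S,J)}(p) \int d\Omega_p\, Y^L_m(\hat{p}) \sum_{s,s'} C(S, s_3|1/2, s; 1/2, s')\, \mathcal{P}a_s(\mathbf{p})^\dagger\mathcal{P}\,\mathcal{P}b_{s'}(-\mathbf{p})^\dagger\mathcal{P}\,\mathcal{P}|0\rangle \\
&= \int dp\, R^{(n,L,S,J)}(p) \int d\Omega_p\, Y^L_m(\hat{p}) \sum_{s,s'} C(S, s_3|1/2, s; 1/2, s')\,(-1)\, a_s(-\mathbf{p})^\dagger b_{s'}(+\mathbf{p})^\dagger |0\rangle
\end{aligned} \tag{12.83}$$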

where the minus sign originates in the opposite parities of electron and
positron. Now we can let p → −p in the integral and use the parity of the
spherical harmonics to obtain:

$$\mathcal{P}|n, L, S, J\rangle = (-1)^{L+1}\,|n, L, S, J\rangle . \tag{12.84}$$

Charge Conjugation of the Levels. We apply the C operator and use

To return to the starting expression we must (i) exchange a with b, (ii) let
$\mathbf{p} \to -\mathbf{p}$, (iii) exchange $s$ with $s'$.
In the latter operation, we recall that the Clebsch–Gordan coefficients for
two spin-1/2 combinations are symmetric under the exchange of $s$ with $s'$ in
the case S = 1, while they are antisymmetric for S = 0.
The three operations introduce a factor (i) $-1$; (ii) $(-1)^L$; (iii) $(-1)^{S+1}$
respectively and therefore an overall factor $\eta_C$:
$$\mathcal{C}|n, L, S, J\rangle = \eta_C |n, L, S, J\rangle; \qquad \eta_C = (-1)(-1)^L(-1)^{S+1} = (-1)^{L+S} . \tag{12.86}$$

Selection Rules. For the positronium ground states we obtain:

$$\text{parapositronium}: J^{PC} = 0^{-+}; \qquad \text{orthopositronium}: J^{PC} = 1^{--} \tag{12.87}$$
As we saw from Furry's theorem, a state with N photons has C = ±1
according to whether N is even or odd. Therefore the selection rules are:
$$\text{parapositronium} \to 2\gamma,\ \not\to 3\gamma; \qquad \text{orthopositronium} \not\to 2\gamma,\ \to 3\gamma \tag{12.88}$$
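The quantum numbers and selection rules above follow mechanically from (12.84), (12.86) and the C-parity $(-1)^N$ of an N-photon state. A minimal sketch, with hypothetical function names, that tabulates them for any (L, S):

```python
def positronium_parity(L):
    """P eigenvalue from (12.84): (-1)**(L + 1)."""
    return (-1) ** (L + 1)


def positronium_c_parity(L, S):
    """C eigenvalue from (12.86): (-1)**(L + S)."""
    return (-1) ** (L + S)


def photon_c_parity(n_photons):
    """C eigenvalue of an N-photon state: (-1)**N."""
    return (-1) ** n_photons


def allowed_photon_numbers(L, S, max_photons=4):
    """Photon multiplicities compatible with C conservation.

    The count starts at 2 because a single-photon final state is excluded
    by energy-momentum conservation, independently of C.
    """
    c = positronium_c_parity(L, S)
    return [n for n in range(2, max_photons + 1) if photon_c_parity(n) == c]


def sign(value):
    return "+" if value > 0 else "-"


for name, (L, S, J) in {"para": (0, 0, 0), "ortho": (0, 1, 1)}.items():
    jpc = f"{J}{sign(positronium_parity(L))}{sign(positronium_c_parity(L, S))}"
    print(name, jpc, allowed_photon_numbers(L, S))
```

The cut-off `max_photons` is arbitrary; higher even (odd) multiplicities are allowed for para (ortho) states but are suppressed by further powers of α.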
The annihilation amplitude into two photons is of order $e^2$, and for three
photons of order $e^3$, with the corresponding probabilities of order $\alpha^2$
and $\alpha^3$ respectively, where $\alpha \simeq 1/137$. We therefore expect two
positronium components to form in matter, one with a short half-life,
parapositronium, and one with a considerably longer half-life, orthopositronium.
The observed values agree well with this rule. They are¹:
$$\Gamma(\text{para} \to 2\gamma)_{\text{expt}} = 7990.9(1.7)\ \mu\text{s}^{-1} \tag{12.89}$$
$$\Gamma(\text{ortho} \to 3\gamma)_{\text{expt}} = 7.0404(10)(8)\ \mu\text{s}^{-1} \tag{12.90}$$
a factor of about a thousand between the two decay probabilities, in agreement
with the selection rule (12.87) and the QED predictions.

¹ Numbers in parentheses represent the error on the last digits of the quantity; when two
numbers are reported, the first is the statistical and the second the systematic error.
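The "factor of about a thousand" can be checked directly from (12.89) and (12.90). The short sketch below uses only the two quoted central values; it converts rates to mean lifetimes (1/Γ, which differs from the half-life by a factor ln 2), and the names are hypothetical.

```python
# Central values quoted in (12.89) and (12.90), in inverse microseconds.
GAMMA_PARA_PER_US = 7990.9
GAMMA_ORTHO_PER_US = 7.0404


def lifetime_ns(rate_per_us):
    """Mean lifetime in nanoseconds for a rate given in inverse microseconds."""
    return 1000.0 / rate_per_us


ratio = GAMMA_PARA_PER_US / GAMMA_ORTHO_PER_US
print("rate ratio para/ortho:", round(ratio, 1))
print("para lifetime (ns):  ", round(lifetime_ns(GAMMA_PARA_PER_US), 4))
print("ortho lifetime (ns): ", round(lifetime_ns(GAMMA_ORTHO_PER_US), 2))
```

The ratio comes out a little above 1100, so parapositronium lives for roughly an eighth of a nanosecond and orthopositronium for about 140 ns.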

12.6 THE CPT THEOREM

We summarise the transformations of the Dirac covariants under P, C and
T , generalising the considerations of sections 12.1, 12.2 and 12.3 to the case
of bilinears constructed from two different fermion fields, ψa and ψb . In the
case of the scalar density, for example, we find:

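$$\begin{aligned}
\mathcal{P}S_{ab}(\mathbf{x}, t)\mathcal{P} &= \mathcal{P}\bar\psi_a(\mathbf{x}, t)\psi_b(\mathbf{x}, t)\mathcal{P} = \eta_P(S)\,S_{ab}(-\mathbf{x}, t); \\
\mathcal{C}S_{ab}(\mathbf{x}, t)\mathcal{C} &= \eta_C(S)\,S_{ba}(\mathbf{x}, t); \\
\mathcal{T}S_{ab}(\mathbf{x}, t)\mathcal{T} &= \eta_T(S)\,S_{ab}(\mathbf{x}, -t)
\end{aligned} \tag{12.91}$$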

where η are the ± signs characteristic of the transformation and the speciﬁc
covariant. In Table 12.1 the values of η for the three transformations and for
the combined θ = CP T operation are listed.
The CPT operation is clearly represented by an antilinear operator and
acts according to the simple rule:

$$\theta\,[g_S S_{ab}(x)]\,\theta^\dagger = (-1)^N (g_S)^* (S_{ab}(-x))^\dagger \tag{12.92}$$

where $g_S$ is a complex coefficient and N is the number of Lorentz indices of
the covariant.
The same rule applies to the vector potential (N = 1), and to the Maxwell
tensor (N = 2) and extends unchanged to operators of scalar or vector ﬁelds.
In the case of complex ﬁelds, the antilinear nature of θ implies, for example,
that

$$\theta\,(\phi_1 + i\phi_2)(x)\,\theta^\dagger = (\phi_1 - i\phi_2)(-x) \tag{12.93}$$

Therefore, for these operators also, θ implies hermitian conjugation.

So as not to forget... It is useful to derive the results in Table 12.1 directly
from the transformations of the fields under CPT. Collecting the formulae
(12.59), (12.66) and (12.70) and the corresponding hermitian conjugates, we
find:

$$(\mathcal{CPT})\,\psi(x)\,(\mathcal{TPC}) = \theta\psi(x)\theta^\dagger = \sigma_2\gamma^0(i\gamma^2)\psi^\dagger(-x) = i\gamma_5\psi^\dagger(-x) \tag{12.94}$$

and, for the hermitian conjugate field:

$$\theta\psi^\dagger(x)\theta^\dagger = -i\gamma_5\psi(-x). \tag{12.95}$$

Table 12.1 Summary of the properties of the Dirac covariants and of the
electromagnetic field under P, C, T transformations, and θ = CPT.

          S     P     V          A           T                 A_μ        F_μν
η_P      +1    −1    g^{μμ}     −g^{μμ}     g^{μμ}g^{νν}      g^{μμ}     g^{μμ}g^{νν}
η_C      +1    +1    −1         +1          −1                −1         −1
η_T      +1    −1    g^{μμ}     +g^{μμ}     −g^{μμ}g^{νν}     g^{μμ}     −g^{μμ}g^{νν}
η_CPT    +1    +1    −1         −1          +1                −1         +1

For the general bilinear, we therefore ﬁnd:

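$$\begin{aligned}
\theta\,\bar\psi_a\Gamma\psi_b\,\theta^\dagger &= \theta\psi_a^\dagger(x)\theta^\dagger\,(\gamma^0\Gamma)^*\,\theta\psi_b(x)\theta^\dagger = [\gamma_5\psi_a(-x)]\,(\gamma^0\Gamma)^*\,\gamma_5\psi_b^\dagger(-x) \\
&= -\psi_b^\dagger(-x)\,\gamma_5(\gamma^0\Gamma)^\dagger\gamma_5\,\psi_a(-x) = +\bar\psi_b\,\gamma^0\gamma_5\Gamma^\dagger\gamma_5\gamma^0\,\psi_a(-x)
\end{aligned} \tag{12.96}$$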

where we have repeatedly made use of the fact that γ5 is hermitian and anti-
commutes with γ 0 .
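Combination with $\gamma_5$ produces a minus sign for every factor $\gamma^\mu$ in Γ,
therefore:
$$\gamma^0\gamma_5\Gamma^\dagger\gamma_5\gamma^0 = \gamma^0(\gamma_5\Gamma\gamma_5)^\dagger\gamma^0 = (-1)^N\gamma^0\Gamma^\dagger\gamma^0 = (-1)^N\Gamma \tag{12.97}$$

where N is the number of vector indices of Γ. Finally:

$$\theta\,[\bar\psi_a\Gamma\psi_b(x)]\,\theta^\dagger = (-1)^N\,\bar\psi_b\Gamma\psi_a(-x) = (-1)^N\,[\bar\psi_a\Gamma\psi_b(-x)]^\dagger \tag{12.98}$$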

as given in Table 12.1.
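The sign rule (12.97) can be verified numerically for each of the five Dirac covariants. The following is a sketch in plain Python, assuming the Pauli (Dirac) representation with $\gamma_5$ off-diagonal as in (13.4) and the usual choices Γ = 1, $i\gamma_5$, $\gamma^\mu$, $\gamma^\mu\gamma_5$, $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$; all helper names are hypothetical.

```python
# Check of (12.97): gamma^0 gamma_5 Gamma^dagger gamma_5 gamma^0 = (-1)^N Gamma.
# Assumption: Pauli (Dirac) representation of the gamma matrices.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

G0 = kron([[1, 0], [0, -1]], I2)                          # gamma^0
G = [G0] + [kron([[0, 1], [-1, 0]], s) for s in SIGMA]    # gamma^0 .. gamma^3
G5 = kron([[0, 1], [1, 0]], I2)                           # gamma_5
ID4 = kron(I2, I2)

def sigma_mu_nu(mu, nu):
    return scale(0.5j, sub(matmul(G[mu], G[nu]), matmul(G[nu], G[mu])))

def satisfies_12_97(gamma_mat, n_indices):
    lhs = matmul(G0, matmul(G5, matmul(dagger(gamma_mat), matmul(G5, G0))))
    return close(lhs, scale((-1) ** n_indices, gamma_mat))

covariants = {
    "S": ([ID4], 0),
    "P": ([scale(1j, G5)], 0),
    "V": (G, 1),
    "A": ([matmul(g, G5) for g in G], 1),
    "T": ([sigma_mu_nu(m, n) for m in range(4) for n in range(m + 1, 4)], 2),
}

results = {name: all(satisfies_12_97(m, n) for m in mats)
           for name, (mats, n) in covariants.items()}
print(results)
```

The check relies only on $\gamma_5$ being hermitian and anticommuting with every $\gamma^\mu$, so it should carry over to any other unitary-equivalent representation.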

The CPT Theorem. The (−1)N factor applies to the total inversion of the
coordinates:

$$TI: \; x^\mu \to -x^\mu . \tag{12.99}$$

In a four-dimensional Euclidean space, inversion of the axes has a
determinant equal to one and is continuously connected to the identity
transformation. In Euclidean space, inversion is a proper transformation and
therefore a necessary symmetry. This is not true in Minkowski space, in which
proper transformations must also have $\Lambda^0{}_0 > 0$, a condition which is
not satisfied by TI.
However, as we will see in the discussion of Feynman integrals of [14],
quantum theory in Minkowski space is the analytic continuation of the theory
deﬁned with a complex time, and there is no obstacle to continue to Minkowski
space from purely imaginary time, i.e. from the theory in the four-dimensional
Euclidean space. This is the origin of the CPT Theorem which, under very
general conditions, states that the operation θ, total inversion of the axes sup-
plemented with the operation of hermitian conjugation, is an exact symmetry
of any relativistic quantum field theory.
The CPT theorem is due to Lüders and Pauli [19, 20]. We will give a
demonstration very close to that of Bjorken and Drell [21]. Subsequently,
we will illustrate the most noteworthy consequences of the CPT theorem.
We consider a theory described by a Lagrangian density L(x). The
conditions that characterise a relativistic quantum theory, and subject to
which the Lagrangian is to be constructed, are as follows.
The Lagrangian must be:
• hermitian,
• a local function of the fields and their derivatives, up to finite order,
calculated at the same point,
• a boson operator; fermion fields can appear in even numbers and they
can therefore be organised into Dirac bilinears, ψ̄a Γψb , where ψa and
ψb are fields associated with different types of fermion (e.g. electron and
neutrino),
• invariant under proper Lorentz transformations,
• a normal product of fields.
Under these conditions, we show that it must necessarily be true that:
$$\theta\mathcal{L}(x)\theta^\dagger = \mathcal{L}(-x) \tag{12.100}$$
from which it follows that the action is invariant under CPT:

$$\theta S\theta^\dagger = \int d^4x\; \theta\mathcal{L}(x)\theta^\dagger = \int d^4x\; \mathcal{L}(-x) = \int d^4(-x)\; \mathcal{L}(-x) = S \tag{12.101}$$

and therefore CPT is an exact symmetry.

Proof. The general form of a Lagrangian density which satisﬁes the condi-
tions above can be written in the following way:

$$\mathcal{L}(x) = \sum_i c_i O_i(x); \qquad O_i(x) = {:}\ldots A_\mu(x) \ldots (\bar\psi_a\Gamma\psi_b)(x) \ldots \partial^\nu \ldots{:} \tag{12.102}$$
where $c_i$ are complex coefficients. We now apply the CPT operation. On the
basis of Table 12.1 and the antilinearity of θ we obtain:
$$\theta\mathcal{L}(x)\theta^\dagger = \sum_i (c_i)^* (-1)^{N_{tot}}\, {:}\ldots A_\mu(-x) \ldots (\bar\psi_a\Gamma\psi_b)^\dagger(-x) \ldots \partial^\nu \ldots{:} \tag{12.103}$$
where Ntot is the total number of Lorentz indices which appear in the equation
(12.102).
Inside the normal product, we can commute the boson operators and put
them in the opposite order to which they appear in (12.102), which means we
can rewrite (12.103) as:
$$\theta\mathcal{L}(x)\theta^\dagger = \sum_i (-1)^{N_{tot}}\, [c_i O_i(-x)]^\dagger . \tag{12.104}$$

If the Lagrangian is to be Lorentz invariant, the Lorentz indices must
be summed on invariant tensors. In a four-dimensional space-time, there are
only two invariant operations: contraction with $g_{\mu\nu}$ and contraction with
$\epsilon^{\mu\nu\rho\sigma}$ (cf. Section 3.3). Both of these operations reduce
the free indices by an even number. Therefore, in order that (12.102) should be
invariant (i.e. does not contain free indices), the number of vector indices
that appear must be even, or $(-1)^{N_{tot}} = +1$. We conclude therefore that:
$$\theta\mathcal{L}(x)\theta^\dagger = \sum_i [c_i O_i(-x)]^\dagger = [\mathcal{L}(-x)]^\dagger = \mathcal{L}(-x) \tag{12.105}$$

where the ﬁnal step follows from the fact that L is Hermitian.

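Fermi Interaction. It is interesting to apply the considerations just
described to the Fermi interaction for neutron β decay, which we wrote as
(Section 9.2):

$$\begin{aligned}
\mathcal{L}_F = &-\frac{G}{\sqrt{2}} \left[\bar\psi_P\left(\gamma^\mu + \frac{g_A}{g_V}\gamma^\mu\gamma_5\right)\psi_N\right] \left[\bar\psi_e(\gamma_\mu - \gamma_\mu\gamma_5)\psi_{\nu_e}\right] \\
&-\frac{G}{\sqrt{2}} \left[\bar\psi_N\left(\gamma^\mu + \left(\frac{g_A}{g_V}\right)^{*}\gamma^\mu\gamma_5\right)\psi_P\right] \left[\bar\psi_{\nu_e}(\gamma_\mu - \gamma_\mu\gamma_5)\psi_e\right].
\end{aligned} \tag{12.106}$$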

We have written explicitly the hermitian conjugate of the ﬁrst term, nec-
essary to make the Lagrangian real.
The interaction (12.106) clearly is NOT invariant under parity, since it is
produced by superpositions of polar and axial vectors, which change relative
sign under parity. For example:

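$$\mathcal{P}\left[\bar\psi_P\left(\gamma^\mu + \frac{g_A}{g_V}\gamma^\mu\gamma_5\right)\psi_N\right](\mathbf{x}, t)\,\mathcal{P} = g^{\mu\mu}\left[\bar\psi_P\left(\gamma^\mu - \frac{g_A}{g_V}\gamma^\mu\gamma_5\right)\psi_N\right](-\mathbf{x}, t). \tag{12.107}$$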
Instead, the CP transformation acts in the same way on vector and axial
currents:

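$$\begin{aligned}
\mathcal{CP}\left[\bar\psi_P\left(\gamma^\mu + \frac{g_A}{g_V}\gamma^\mu\gamma_5\right)\psi_N\right]\mathcal{PC} &= (-g^{\mu\mu})\,\bar\psi_N\left(\gamma^\mu + \frac{g_A}{g_V}\gamma^\mu\gamma_5\right)\psi_P \\
\mathcal{CP}\left[\bar\psi_e(\gamma_\mu - \gamma_\mu\gamma_5)\psi_{\nu_e}\right]\mathcal{PC} &= (-g^{\mu\mu})\,\bar\psi_{\nu_e}(\gamma_\mu - \gamma_\mu\gamma_5)\psi_e .
\end{aligned} \tag{12.108}$$

Invariance under CP can be obtained if the first term of (12.106)
transforms into the second, which is its hermitian conjugate. This, in turn,
requires that:

$$g_A/g_V = \text{real} \quad (\text{invariance under } \mathcal{CP}). \tag{12.109}$$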

However, if we apply θ to the first term of (12.106), $g_A/g_V$ also changes
to its complex conjugate and the interaction remains invariant for any value,
real or complex, of the constant which appears in (12.106).

12.6.1 Equality of Particle and Antiparticle Masses

This is the most direct consequence of the CP T theorem. From the in-
variance of the Lagrangian, equation (12.100), that of the Hamiltonian easily
follows:

θHθ† = H. (12.110)

Moreover, from Table 12.1 it follows that θ changes signs of 3-vectors, like
momentum, and changes the sign of each conserved charge that is present in
our theory (e.g. electric charge):

θP θ† = −P ; θQθ† = −Q (12.111)

Conversely, angular momentum remains unchanged2

θJ θ† = J (12.112)

where P , J , Q are the momentum, angular momentum and charge operators.

The Case M ≠ 0. We consider the case of a particle with non-zero mass.
We can choose to be in the rest frame of the particle, in which P = 0 and the
angular momentum coincides with the spin. In general, other than the mass
and the spin, the state is characterised by the value of the conserved charge,
which we denote with q. Therefore we write the ket which represents the state
as:

$$|A\rangle = |M, \mathbf{P} = 0, s_z; q\rangle . \tag{12.113}$$

Taking account of (12.111), the state $\theta|A\rangle$ must have the same mass, the
same value of $\mathbf{s}^2 = s(s + 1)$ and of the spin component, but opposite
electric charge:

$$\theta|A\rangle = |M, \mathbf{P} = 0, s_z; -q\rangle . \tag{12.114}$$

2 Classically, J = x × v does not change sign under total inversion since x and v both
change sign.
We must distinguish two cases, according to whether q ≠ 0 or q = 0.

If q ≠ 0, as for example it is for the electron, the CPT-conjugate state,
(12.114), clearly does not coincide with the original state. Another particle
of equal mass and spin but the opposite charge must exist: the positron.
Since $\theta^2 = 1$, the relationship between particle and antiparticle is
perfectly symmetric; the electron is the antiparticle of the positron.
If instead q = 0, the conjugate state (12.114) is identical to the original.
Naturally, invariance under rotation of the Hamiltonian in the rest system
requires the presence of all 2s + 1 states associated with spin s. In this case,
we have a spin s particle, absolutely neutral in the sense that it cannot be
distinguished from the first particle. This, for example, is the case of the
(spin 0) $\pi^0$ meson or of the (spin 1) $\rho^0$ and $\omega^0$ mesons.

The Case M = 0. In this case, we choose to be in a system in which the
particle has its momentum along the z-axis and has helicity λ. We write the
state as:

$$|A\rangle = |P_3, \lambda; q\rangle . \tag{12.115}$$

Lorentz invariance, which reduces to invariance under rotation around
the z-axis, does not require that there should be states other than those in
(12.115).
The CPT-conjugate state must have the opposite momentum and the
same spin components, or opposite helicity:

$$\theta|A\rangle = |-P_3, -\lambda; -q\rangle . \tag{12.116}$$

Here again, we distinguish two cases:

If q ≠ 0, the conjugate state represents an antiparticle of the original
particle, with opposite charge, and opposite helicity. This is the case for par-
ticles of the two-component theory (the Weyl neutrino, see Chapter 13). We
have a neutrino state with negative helicity and another state of an antineu-
trino with positive helicity, distinguished by the value of a conserved charge,
the lepton number. Other states are not necessary to have a relativistic and
CP T -invariant theory.
If q = 0, we have an absolutely neutral particle which, on the basis of
(12.116), must be present with two helicity states equal to ±λ. This is the
case of the photon. On the basis of pure relativistic invariance, it could have
only a single state, for example with helicity equal to one. CP T -invariance
requires both observed states of the photon, with helicity ±1.

Experimental Verification. Beyond the equality of their masses, it can
easily be seen that particle and antiparticle must have opposite magnetic
moments and, if unstable, equal half-lives. These relationships have been
experimentally tested, with extreme precision in a few fortunate cases. We
recall the most important (see [?]).

• the mass of the antiproton coincides with that of the proton within a
relative precision of $10^{-8}$,
• the masses and half-lives of $K^0$ and anti-$K^0$ mesons agree to within a
relative precision of a few times $10^{-18}$,
• the electron and positron have equal masses within a relative precision
of $10^{-8}$, and their magnetic moments, apart from opposite signs, agree
within a relative precision of $10^{-12}$,
• the magnetic moments of muon and antimuon agree within a relative
precision of $2 \times 10^{-8}$.
The observation of any violation of the CP T theorem would imply the
necessity for improvement of the relativistic quantum ﬁeld theory paradigm
and would have enormous conceptual importance.

12.7 PROBLEMS FOR CHAPTER 12

Sect. 12.1
1. The parity transformation P inverts the signs of the space components
leaving the time component unchanged:
P : x, t → −x, t
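with $\mathcal{P}\mathcal{P} = \mathbb{1}$. Requiring the Dirac action to be invariant under parity,
namely
$$\mathcal{L}(x) = \bar\psi(x)(i\not\partial - m)\psi(x), \qquad \mathcal{P}\mathcal{L}(\mathbf{x}, t)\mathcal{P} = \mathcal{L}(-\mathbf{x}, t),$$
determine the form of the 4×4 matrix P transforming the field ψ(x)
according to
$$\mathcal{P}\psi_\alpha(\mathbf{x}, t)\mathcal{P} = P_{\alpha\beta}\,\psi_\beta(-\mathbf{x}, t)$$
where α and β are spinor indices.
Hint: Consider the term $\bar\psi(x)\, i\not\partial\, \psi(x) = i\bar\psi(x)\gamma_\mu\partial^\mu\psi(x)$, treating
separately the cases μ = 0 and μ = 1, 2, 3.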
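2. Using the result of the previous problem,

– perform the parity transformation of the Dirac spinors

$$u_s(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} \chi_s \\ \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\chi_s \end{pmatrix}, \qquad v_s(p) = \sqrt{\frac{E+m}{2m}} \begin{pmatrix} \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\xi_s \\ \xi_s \end{pmatrix}$$

$\chi_s$ and $\xi_s$ being Pauli spinors satisfying the relations
$$\sigma_3\chi_s = s\chi_s, \ (s = \pm 1); \qquad \sigma_3\xi_s = s\xi_s, \ (s = \mp 1);$$

– verify that $u_s(0)$ and $v_s(0)$ have opposite parity.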

3. Determine, from the above arguments, the parity of a fermion-antifermion
pair in S-wave, e.g. an electron-positron bound state (positronium) with
orbital angular momentum L = 0.

4. Show that the Weyl equation for the two-component neutrino,
eq. (6.171), can be derived from the Lagrangian density

$$\mathcal{L}_{\text{Weyl}}(x) = i\psi^\dagger\left(\frac{\partial}{\partial x^0} - \nabla\cdot\boldsymbol{\sigma}\right)\psi .$$

5. Show that the Weyl Lagrangian is NOT invariant under parity. On this
basis, Pauli rejected Weyl's theory (parity violation had not been
discovered yet).
Hint: the matrix P should obey $P^2 = 1$ and $P\boldsymbol{\sigma}P = -\boldsymbol{\sigma}$. Does such a
matrix exist?

Sect. 12.2
1. Deﬁne the action of parity and charge conjugation on the two-component
Weyl ﬁeld according to

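$$\mathcal{P}\psi(\mathbf{x}, t)\mathcal{P} = \sigma_2\psi(-\mathbf{x}, t); \qquad \mathcal{C}\psi(\mathbf{x}, t)\mathcal{C} = \psi(\mathbf{x}, t)^\dagger . \tag{12.117}$$

Prove that the Weyl Lagrangian, $\mathcal{L}_{\text{Weyl}}$ of problem 4 above, is invariant
under PC.
Hints:

– require only invariance of the action, so that derivatives in the
Lagrangian density can be freely integrated by parts;
– use the identity: $\sigma_2\boldsymbol{\sigma}^T\sigma_2 = -\boldsymbol{\sigma}$.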

More about Weyl neutrinos in Sect. 13.1.

CHAPTER 13

WEYL AND MAJORANA NEUTRINOS

We have seen that Dirac spinors transform according to a reducible
representation of the Lorentz transformations. On the basis of transformations
of $L_+^\uparrow$ only it should therefore be possible to find smaller, truly
irreducible representations to describe a spin-1/2 particle. This reduction
leads to two types of theory, of the Weyl neutrino and the Majorana neutrino.
As we will see, the two theories agree for zero mass fermions but diﬀer for
particles of non-zero mass giving rise to the Dirac and Majorana theories as
physically distinct alternatives.
It should be said immediately that neither of these theories can apply to
the electron, or to the proton or neutron. For these particles, the presence of a
conserved vector current associated with an electric charge or baryon number
makes the Dirac structure essential. The question is open in the case of the
neutrino.
The recent discovery of very small neutrino masses has reopened the ques-
tion of which of the Dirac or Majorana theories is the better candidate to
describe the properties of these particles.

13.1 THE WEYL NEUTRINO

We consider the Dirac equation in the limit of zero mass:

$$i\not\partial\,\psi = 0. \tag{13.1}$$

Equation (13.1) allows an invariant operator represented by the $\gamma_5$ matrix.
If ψ is a solution of (13.1), then $\gamma_5\psi$ also satisfies the equation. Therefore

DOI: 10.1201/9781003436263-13 191

This chapter has been made available under a CC BY NC license.
we can separate the solutions into two invariant sub-spaces by means of the
projection operators:
$$a^{(\pm)} = \frac{1 \pm \gamma_5}{2} . \tag{13.2}$$
We define:
$$\psi_L = a^{(-)}\psi; \qquad \psi_R = a^{(+)}\psi . \tag{13.3}$$
If, for example, ψR = 0 at time t0 , it will remain so at later times. The
ﬁeld therefore divides into two irreducible and independent components.
In the Pauli representation of the γ matrices:

$$\gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{13.4}$$

the eigenvectors of $\gamma_5$ with eigenvalue h = ±1 have the form:

$$\psi = \begin{pmatrix} \chi \\ h\chi \end{pmatrix} . \tag{13.5}$$

The Dirac equation with m = 0 gives rise to two possible equations for the
two-dimensional spinors:
$$i\frac{\partial}{\partial t}\chi = \pm(-i\nabla\cdot\boldsymbol{\sigma})\chi . \tag{13.6}$$

Equation (13.6) is known as the Weyl equation and describes a particle
of zero mass (because it is compatible with the equation $\Box\chi = 0$) and spin
1/2.¹ The Weyl equation is used to describe a massless neutrino, which we will
return to later.
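The compatibility with the massless wave equation rests on the identity $(\boldsymbol{\sigma}\cdot\mathbf{p})^2 = |\mathbf{p}|^2$: applying $i\partial/\partial t$ twice to a plane wave of energy E and momentum p then gives $E^2 = |\mathbf{p}|^2$. A small numerical sketch of that identity (the sample momentum is arbitrary and the helper names are hypothetical):

```python
# Pauli matrices as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def sigma_dot(p):
    """The 2x2 matrix sigma . p for a 3-vector p."""
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


p = (0.3, -1.2, 2.0)                      # arbitrary sample momentum
p_squared = sum(c * c for c in p)
square = matmul(sigma_dot(p), sigma_dot(p))

# (sigma . p)^2 should equal |p|^2 times the identity, so a plane wave
# obeying E chi = +-(sigma . p) chi has E^2 = |p|^2, i.e. zero mass.
is_massless = all(
    abs(square[i][j] - (p_squared if i == j else 0)) < 1e-12
    for i in range(2) for j in range(2)
)
print(is_massless)
```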
We examine the solutions to the massless Dirac equation, (13.1), more
closely. The spinors with positive energy and momentum p along the z-axis
take the form:
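$$u_+(p) = \frac{1}{\sqrt{2}}\begin{pmatrix} \chi^+ \\ \chi^+ \end{pmatrix}; \qquad u_-(p) = \frac{1}{\sqrt{2}}\begin{pmatrix} \chi^- \\ -\chi^- \end{pmatrix}; \qquad \chi^+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad \chi^- = \begin{pmatrix} 0 \\ 1 \end{pmatrix} . \tag{13.7}$$

We have used the appropriate normalisation for massless spinors in (13.7),
i.e.:
$$u_s(p)^\dagger u_r(p) = \delta_{rs} . \tag{13.8}$$
It should be noted that (cf. Section 6.1.4) $\bar u(p)u(p) = m/E \to 0$ for m = 0.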
1 If we repeat the Dirac construction (Section 6.1) starting directly with m = 0, we must

introduce only three anticommuting matrices and therefore the solution α = σ is acceptable,
the minimum dimensionality of the spinor is 2 and instead of the Dirac equation we obtain
the Weyl equation.
The spinors with negative energy and momentum −p along the z-axis are
instead:
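$$u_+^{(E<0)}(-p) = v_-(p) = \frac{1}{\sqrt{2}}\begin{pmatrix} \chi^+ \\ \chi^+ \end{pmatrix}; \qquad u_-^{(E<0)}(-p) = v_+(p) = \frac{1}{\sqrt{2}}\begin{pmatrix} -\chi^- \\ \chi^- \end{pmatrix} . \tag{13.9}$$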

Both in (13.7) and (13.9) the ± superscripts denote the σ3 eigenvalue.

The spinors u+ (p) and v− (p) are eigenvectors of γ5 with eigenvalues +1,
while u− (p) and v+ (p) belong to the −1 eigenvalue.
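The chirality assignments just stated can be checked by acting with $\gamma_5$ of (13.4), which simply swaps the upper and lower two-component halves of a spinor. A minimal sketch using the spinors of (13.7) and (13.9); the helper names are hypothetical.

```python
import math

NORM = 1 / math.sqrt(2)
CHI_UP = [1, 0]      # sigma_3 eigenvalue +1
CHI_DOWN = [0, 1]    # sigma_3 eigenvalue -1


def negate(chi):
    return [-c for c in chi]


def spinor(upper, lower):
    """Four-component spinor (upper, lower) / sqrt(2)."""
    return [NORM * c for c in upper + lower]


U_PLUS = spinor(CHI_UP, CHI_UP)                # (13.7)
U_MINUS = spinor(CHI_DOWN, negate(CHI_DOWN))   # (13.7)
V_MINUS = spinor(CHI_UP, CHI_UP)               # (13.9)
V_PLUS = spinor(negate(CHI_DOWN), CHI_DOWN)    # (13.9)


def apply_gamma5(psi):
    """gamma_5 of (13.4) exchanges the upper and lower halves."""
    return psi[2:] + psi[:2]


def chirality(psi):
    """Return +1 or -1 if psi is a gamma_5 eigenvector, else None."""
    image = apply_gamma5(psi)
    for h in (1, -1):
        if all(abs(a - h * b) < 1e-12 for a, b in zip(image, psi)):
            return h
    return None


def norm_squared(psi):
    return sum(abs(c) ** 2 for c in psi)


for name, psi in [("u+", U_PLUS), ("u-", U_MINUS), ("v-", V_MINUS), ("v+", V_PLUS)]:
    print(name, chirality(psi), round(norm_squared(psi), 12))
```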
The exchange of signs between u and v in (13.9) arises from the fact that,
in the hole theory, the $u_\pm^{(E<0)}(-p)$ spinors are associated with the
destruction of a neutrino in a negative energy state with momentum −p and
spin ±1/2 along the z-axis. This corresponds to the creation of a hole (an
antineutrino) with momentum p and spin ∓1/2, described by the spinor v(p).
The same ideas can be translated into the language of quantised fields as
follows. We introduce left-handed, L, and right-handed, R, fields, defined in
(13.3):
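$$\begin{aligned}
\psi_L &= \sum_p \frac{1}{\sqrt{V}}\left[a_-(p)u_-(p)e^{-i(px)} + (b_+(p))^\dagger v_+(p)e^{+i(px)}\right] \\
\psi_R &= \sum_p \frac{1}{\sqrt{V}}\left[a_+(p)u_+(p)e^{-i(px)} + (b_-(p))^\dagger v_-(p)e^{+i(px)}\right] .
\end{aligned} \tag{13.10}$$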

From (13.10) it follows that:

• the field $\psi_L$ destroys a neutrino with helicity −1/2 and creates an
antineutrino with helicity +1/2,
• the field $\psi_R$ destroys a neutrino with helicity +1/2 and creates an
antineutrino with helicity −1/2.
The same conclusion is reached by considering the matrix elements of the
invariant density ψ̄(x)ψ(x). We write:

ψ(x) = ψ(x)R + ψ(x)L . (13.11)

If we deﬁne:

$$\bar\psi_L = (a^{(-)}\psi_L)^\dagger\gamma^0 = (\psi_L)^\dagger a^{(-)}\gamma^0 = \bar\psi_L a^{(+)} , \tag{13.12}$$

we see that ψ̄L should be multiplied by ψR and ψ̄R by ψL . Thus, we ﬁnd:

ψ̄(x)ψ(x) = ψ̄L ψR + ψ̄R ψL . (13.13)

The scalar density has non-zero matrix elements between the vacuum state
and the state with a neutrino-antineutrino pair. To calculate these matrix
elements, we must examine the spinors with momentum −p:

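$$u_\pm(-p) = \frac{1}{\sqrt{2}}\begin{pmatrix} \chi^\mp \\ \pm\chi^\mp \end{pmatrix}; \qquad v_\pm(-p) = \frac{1}{\sqrt{2}}\begin{pmatrix} \mp\chi^\pm \\ \chi^\pm \end{pmatrix}; \tag{13.14}$$

and helicity ±1 (the two-dimensional spinors always correspond to the
eigenstates of $\sigma_3$).
Keeping only the terms which give matrix elements different from zero, we
have:

$$\begin{aligned}
\langle\nu(p)\bar\nu(-p)|\bar\psi_R\psi_L|0\rangle &= \bar u_+(p)v_+(-p)\,e^{-i(px)+i(p'x)} ; \\
\langle\nu(p)\bar\nu(-p)|\bar\psi_L\psi_R|0\rangle &= \bar u_-(p)v_-(-p)\,e^{-i(px)+i(p'x)} .
\end{aligned} \tag{13.15}$$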

The neutrino and antineutrino are created with the same helicity, in agree-
ment with the fact that the state must have zero angular momentum, and must
also conform with the rules described above. Fig. 13.1 summarises the states
created from the vacuum by the scalar density for each chirality, and those
created by the vector density. We leave to the reader the task of showing that
the only states which have matrix elements of the vector current diﬀerent from
zero are those illustrated in the ﬁgure.

Figure 13.1 States created by the scalar and vector densities acting on the vacuum.
13.2 THE MAJORANA NEUTRINO

In the Pauli representation of the γ matrices, the Dirac equation (7.23)
is a complex equation; if ψ is real at a time t0 , in general it will develop an
imaginary part at later times. However, a symmetry of the Lagrangian (7.21)
exists, charge conjugation, which essentially exchanges ψ with ψ ∗ . Using this
symmetry we can reduce the Dirac ﬁeld to two independent components, as
pointed out by Majorana [22] in 1937.
The most direct way of studying the question is to start from the fact that
representations exist (called Majorana representations, MR in the following),
in which all the Dirac matrices are purely imaginary.
In general, it is not necessary to know the explicit form of the gamma
matrices in the Majorana representation. We give an example here to reassure
the reader that they actually do exist. Starting from the Pauli representation,
and using (6.25) with S = (1 + γ 2 )/2, we obtain:

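$$\tilde\gamma^0 = \alpha_2; \qquad \tilde\gamma^1 = -i\Sigma_3; \qquad \tilde\gamma^2 = \gamma^2; \qquad \tilde\gamma^3 = i\Sigma_1; \qquad \tilde\gamma_5 = \gamma^2\gamma^5 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & -\sigma_2 \end{pmatrix} . \tag{13.16}$$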

We note that γ5 is also imaginary (and therefore antisymmetric). All other

realisations of MR are obtained by applying (6.25) again with S real.
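As a reassurance of the same kind, one can check numerically that the four matrices in (13.16) are purely imaginary and obey $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$. The sketch below builds them from Kronecker products, assuming the Pauli-representation forms $\alpha_2 = \rho_1 \otimes \sigma_2$, $\Sigma_i = 1 \otimes \sigma_i$ and $\gamma^2 = i\rho_2 \otimes \sigma_2$; the helper names are hypothetical. The product $i\gamma^0\gamma^1\gamma^2\gamma^3$ is tested only for being imaginary and antisymmetric, since its overall sign depends on the convention adopted for $\gamma_5$.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# Pauli-representation building blocks (assumed forms, see lead-in).
ALPHA2 = kron([[0, 1], [1, 0]], S2)
GAMMA2 = kron([[0, 1], [-1, 0]], S2)
SIGMA3_4 = kron(I2, S3)
SIGMA1_4 = kron(I2, S1)

# The Majorana-representation matrices of (13.16).
MAJORANA = [ALPHA2, scale(-1j, SIGMA3_4), GAMMA2, scale(1j, SIGMA1_4)]
METRIC = [1, -1, -1, -1]

def is_purely_imaginary(m):
    return all(complex(x).real == 0 for row in m for x in row)

def is_antisymmetric(m):
    return all(m[i][j] == -m[j][i] for i in range(4) for j in range(4))

def clifford_ok(gammas):
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(gammas[mu], gammas[nu]), matmul(gammas[nu], gammas[mu]))
            for i in range(4):
                for j in range(4):
                    expected = 2 * METRIC[mu] if (mu == nu and i == j) else 0
                    if abs(anti[i][j] - expected) > 1e-12:
                        return False
    return True

product = matmul(MAJORANA[0], matmul(MAJORANA[1], matmul(MAJORANA[2], MAJORANA[3])))
GAMMA5_MR = scale(1j, product)

print(all(is_purely_imaginary(g) for g in MAJORANA), clifford_ok(MAJORANA))
print(is_purely_imaginary(GAMMA5_MR), is_antisymmetric(GAMMA5_MR))
```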
In the Majorana representation, the Dirac equation (7.23) is an equation
with real coefficients and therefore allows purely real solutions. For a real field
we expect half the number of degrees of freedom compared to a complex field,
therefore the field found in this way has only 2 degrees of freedom, exactly as
required for a spin-1/2 particle.
It is interesting to show in detail how the degrees of freedom in a Majorana
field are organised. From (6.76) we see that, in the Majorana representation,

v(p) = u(p)∗

therefore the expansion of ψ with the condition that it should be real, takes the
form (we also use here the normalisation (13.8) to facilitate the limit m → 0):
$$\psi(x) = \sum_{p,r} \frac{1}{\sqrt{V}}\left[a_r(p)u_r(p)e^{-i(px)} + a_r(p)^\dagger u_r(p)^* e^{+i(px)}\right] . \tag{13.17}$$

The particles created by $\psi^{(-)}$ are identical to those annihilated by
$\psi^{(+)}$; a Majorana fermion is the same as its antiparticle and is
intrinsically uncharged, like the photon.
Starting from the Lagrangian (7.21) we can quantise the Majorana ﬁeld
without diﬃculty. The momentum conjugate to ψ is simply iψ. The anticom-
mutation rules are written:

$$\{\psi(\mathbf{x}, t)_\alpha, i\psi(\mathbf{y}, t)_\beta\} = i\delta_{\alpha\beta}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}) . \tag{13.18}$$

From here the anticommutation relations for the creation and destruction
operators a and a† can be recovered.
The neutrality of the Majorana fermions can also be seen from the form
of the conserved current. We ﬁnd:

$$j^\mu = \bar\psi\gamma^\mu\psi = \psi^T\gamma^0\gamma^\mu\psi = (\psi^T\psi, \psi^T\boldsymbol{\alpha}\psi) = \text{c-number} \tag{13.19}$$

($\psi^T$ denotes a row vector with the same components as ψ). That the current
should be a c-number follows from the fact that, in all the components
of j, the spinors $\psi_\alpha\psi_\beta$ are multiplied by matrices symmetric in α and β, and
therefore, in view of the anticommutation relations (13.18), produce a multiple
of the identity operator. If we define the current so as to vanish on the vacuum
state, we will have Q = 0 in all the states.
A non-trivial result is obtained for the axial current:

$$A^\mu = \bar\psi\gamma^\mu\gamma_5\psi = \psi^T\gamma^0\gamma^\mu\gamma_5\psi \tag{13.20}$$

because, in this case, the matrices:

$$a^\mu = (\gamma_5, \boldsymbol{\alpha}\gamma_5) \tag{13.21}$$

are antisymmetric. However, the current (13.20) is conserved only in the limit
of zero mass. Using the Dirac equation (7.23) we find:

$$\partial_\mu A^\mu = 2m\,\bar\psi i\gamma_5\psi . \tag{13.22}$$

The vanishing of the vector current excludes the possibility that the
electron or proton, whose electromagnetic current is certainly different from
zero, can be described by a Majorana field. The neutron too, in view of the
conservation of baryon number, must be described by a (complex) Dirac field.
In the Majorana representation, the matrices which represent the Lorentz
transformations S(Λ), equation (6.39), take a special form. In fact, with imag-
inary gamma matrices, the σ μν matrices are also imaginary and the S(Λ) are
real matrices.
If we examine a complex Dirac field, the spinor:

$$\psi^T\gamma^0 \tag{13.23}$$

transforms like the adjoint spinor, i.e. with $S^{-1}$:

$$\psi'^{T}\gamma^0(x') = \psi^T S^T(\Lambda)\gamma^0 = \psi^T S^\dagger(\Lambda)\gamma^0 = \psi^T\gamma^0 S^{-1}(\Lambda) . \tag{13.24}$$

It then follows that we can construct with ψ a different mass term from
what appears in the Dirac Lagrangian:

$$\mathcal{L}_M = \mu\,\psi^T\gamma^0\psi + \mu^*\,\psi^\dagger\gamma^0\psi^\dagger \tag{13.25}$$

where μ is an arbitrary complex parameter. The mass term (13.25) is known as
the Majorana mass. The Lagrangian:

$$\bar\psi(i\not\partial)\psi + \mu\,\psi^T\gamma^0\psi + \mu^*\,\psi^\dagger\gamma^0\psi^\dagger \tag{13.26}$$

still describes spin-1/2 fermions with non-zero mass, as we will see, but it
is no longer invariant under global phase transformations of the field ψ. The
corresponding vector current $\bar\psi\gamma^\mu\psi$ is not conserved.

13.3 RELATIONSHIPS AMONG WEYL, MAJORANA AND DIRAC NEUTRINOS
In this section, we consider the different possibilities to describe the field
of a neutrino. In all the formulae which follow we use the Majorana represen-
tation.
Starting from a Dirac field, ν(x), we can separate the left- and right-handed
components using the projection operators a(±) :

ν(x) = ν(x)R + ν(x)L . (13.27)

For massless ﬁelds the separation is Lorentz-invariant and the Dirac ﬁeld
divides into irreducible components, each of which describes a Weyl fermion.
The matching of degrees of freedom is accounted for by:

$$\text{Dirac}(4) = 2 \times \text{Weyl}(2) . \tag{13.28}$$

Again in terms of a massless theory, each Weyl fermion is equivalent to a
Majorana fermion. To see this we define:

$$\nu_1(x) = \nu_L(x) + [\nu_L(x)]^\dagger; \qquad \nu_2(x) = \nu_R(x) + [\nu_R(x)]^\dagger \quad (\text{MR}) \tag{13.29}$$

Recalling the expansion (13.10) and the conjugation relation (13.17) we
find, for example for $\nu_1$:
$$\nu_1(x) = \sum_p \frac{1}{\sqrt{V}}\left\{[a_-(p)u_-(p) + b_+(p)u_+(p)]e^{-i(px)} + [a_-(p)^\dagger v_-(p) + b_+(p)^\dagger v_+(p)]e^{+i(px)}\right\} . \tag{13.30}$$

Equation (13.30) shows that the two spin components of the Majorana
fermion are made up of the right-handed antineutrino and the left-handed
neutrino, the components of the Weyl ﬁeld νL (x). Similarly the two spin com-
ponents of ν2 are composed of the right-handed neutrino and the left-handed
antineutrino of the ﬁeld νR ; see Fig. 13.2.
Figure 13.2 Schematic representation of the relationships among Weyl, Majorana

and Dirac neutrino states and masses.

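The equations in (13.29) can be inverted, recalling that $\gamma_5^T = -\gamma_5$, from
which:

$$\frac{1 - \gamma_5}{2}(\nu_L)^\dagger = \left[\left(\frac{1 - \gamma_5}{2}\right)^{*}\nu_L\right]^\dagger = \left[\frac{1 + \gamma_5}{2}\nu_L\right]^\dagger = 0 . \tag{13.31}$$

We therefore obtain:

$$\nu_L = \frac{1 - \gamma_5}{2}\nu_1; \qquad (\nu_L)^\dagger = \frac{1 + \gamma_5}{2}\nu_1 . \tag{13.32}$$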

Equations (13.29) and (13.32) show the equivalence of the Weyl and Ma-
jorana theories for massless fermions.
The Dirac Lagrangian can be written in terms of Weyl or Majorana ﬁelds,
according to the equivalence:

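$$\begin{aligned}
\mathcal{L}_D &= \bar\nu\, i\not\partial\, \nu = \bar\nu_R\, i\not\partial\, \nu_R + \bar\nu_L\, i\not\partial\, \nu_L = \mathcal{L}_W(\nu_R) + \mathcal{L}_W(\nu_L) \\
\mathcal{L}_W(\nu_{R(L)}) &= \frac{1}{2}\nu_{2(1)}^T\gamma^0\not\partial\,\nu_{2(1)} = \mathcal{L}_M(\nu_{2(1)})
\end{aligned} \tag{13.33}$$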
The possible terms to be added to (13.33) to give mass to the fermions
can be classiﬁed as follows:
The Dirac mass term has the form:

$$\mathcal{L}_{mD} = m_D\bar\nu\nu = m_D\bar\nu_L\nu_R + \text{h.c.} \tag{13.34}$$

where h.c. denotes the hermitian conjugate operator, in this case² $m_D\bar\nu_R\nu_L$.

² We leave it to the reader to show that if we start from a complex $m_D$ we can reduce
it to the real case with a redefinition of the relative phase between $\nu_L$ and $\nu_R$ and the
consequent redefinition of $\nu_1$ and $\nu_2$.

Again using the antisymmetry of $\gamma_5$, it is easy to see that the Dirac term
takes the form, in terms of $\nu_{1,2}$:

$$\mathcal{L}_{mD} = m_D\,\nu_1^T\gamma^0\nu_2 = m_D\,\nu_2^T\gamma^0\nu_1 . \tag{13.35}$$

However, given the Majorana fields $\nu_{1,2}$, we can also consider two new
mass terms:
$$\mathcal{L}_{mM} = \frac{1}{2}M_1\,\nu_1^T\gamma^0\nu_1 + \frac{1}{2}M_2\,\nu_2^T\gamma^0\nu_2 . \tag{13.36}$$
Individually, the two terms correspond to a Majorana mass for the
left-handed and right-handed neutrino, for example:
$$\frac{1}{2}M_1\,\nu_1^T\gamma^0\nu_1 = \frac{1}{2}M_1(\nu_L^T\gamma^0\nu_L + \text{h.c.}) . \tag{13.37}$$
Overall, the neutrino masses are described by a symmetric matrix:
$$\mathcal{L}_m = \frac{1}{2}\zeta^T\gamma^0 M\zeta; \qquad \zeta = \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix}; \qquad M = \begin{pmatrix} M_1 & m_D \\ m_D & M_2 \end{pmatrix} . \tag{13.38}$$

It is instructive to examine the symmetries of the massless Lagrangian,

(13.33), and the Dirac and Majorana mass terms, (13.35) and (13.36).
Equation (13.33) is invariant under two abelian and commuting transfor-
mation groups3 :

$$\nu_1 \to \nu_1' = e^{i\alpha\gamma_5}\nu_1; \qquad \nu_2 \to \nu_2' = e^{i\beta\gamma_5}\nu_2 . \tag{13.39}$$

However, the Majorana mass terms are not invariant, while the Dirac term
is invariant for the subgroup of transformations with α = −β:
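$$\mathcal{L}_{m_M} \to \frac{1}{2}M_1\,\nu_1^T\gamma^0 e^{2i\alpha\gamma_5}\nu_1 + \frac{1}{2}M_2\,\nu_2^T\gamma^0 e^{2i\beta\gamma_5}\nu_2;\qquad \mathcal{L}_{m_D} \to m_D\,\nu_1^T\gamma^0 e^{i(\alpha+\beta)\gamma_5}\nu_2. \tag{13.40}$$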

In the specific case $M_1 = M_2 = 0$ a symmetry therefore remains, whatever the value of $m_D$, together with a conserved current. In this case, as we see from (13.34), the neutrino is described by the Dirac field νL + νR with conserved current $\bar\nu\gamma^\mu\nu$, corresponding to the familiar phase transformations.
To see this more formally, we observe that in the case $M_1 = M_2 = 0$ the mass matrix is proportional to the $\sigma_1$ matrix and is therefore diagonalised by the combination $\zeta^{(+)} = \nu_1 + \nu_2$, eigenvalue $+m_D$, and $\zeta^{(-)} = \nu_1 - \nu_2$, eigenvalue $-m_D$. Thus, in the light of (13.29), we can set:

$$\begin{aligned}\nu_{\text{Dirac}} &= \frac{1}{2}\left[\zeta^{(+)} - \gamma_5\,\zeta^{(-)}\right]\\ &= \frac{1}{2}\left[\nu_L + (\nu_L)^\dagger + \nu_R + (\nu_R)^\dagger\right] - \frac{1}{2}\gamma_5\left[\nu_L + (\nu_L)^\dagger - \nu_R - (\nu_R)^\dagger\right]\\ &= \nu_L + \nu_R\end{aligned}\tag{13.41}$$

³ Since γ5 is purely imaginary, the transformations (13.39) are orthogonal, as they must be to maintain the Majorana nature of the fields ν1,2.

The conserved current in this case is known as the lepton number and
distinguishes the neutrino from the antineutrino.
In the general case, in which at least one of M1 or M2 is non-zero, the
eigenvectors of the mass matrix are two Majorana ﬁelds, and there is neither
a conserved current nor a diﬀerence between neutrino and antineutrino.
The relationships between the Weyl, Dirac and Majorana neutrino states and masses are illustrated in Fig. 13.2. The general particle-antiparticle symmetry requires that the Dirac masses which connect νL with νR or ν̄L with ν̄R take the same value $m_D$.

Comment: the electron case. The requirement that a conserved current

should exist, to be identiﬁed as the electric current, implies that there should
only be the Dirac mass term and that therefore the electron should be
described by a pair of Majorana ﬁelds as in (13.41), therefore by a four-
dimensional Dirac spinor. In turn, this implies the existence of antiparticles
(positrons) distinct from particles. The same argument, using the conservation
of baryon number, holds for the proton and neutron.

Comment: the see-saw mechanism. In the modern theory of the neutrino,

it is supposed both that $M_1 = 0$ and $M_2 \gg m_D$. In this case, the mass matrix in (13.38) has as approximate eigenvalues and eigenvectors:

$$\zeta^{(-)} \simeq \nu_1;\qquad m \simeq \frac{m_D^2}{M_2}. \tag{13.42}$$
For suﬃciently large M2 with ﬁxed mD , the light neutrino has a tiny mass
and is a good approximation to a Majorana–Weyl neutrino. As we will see,
this situation describes the neutrino of β-decay very well.
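The two limits of the mass matrix (13.38) discussed above can be checked numerically. The following is only an illustrative sketch: the function name and the numerical values (arbitrary units) are assumptions made for the example, not part of the text. It uses the closed-form eigenvalues of a real symmetric 2×2 matrix and obtains the small eigenvalue from the determinant to avoid loss of precision.

```python
import math

def mass_eigenvalues(m1, m2, m_d):
    """Eigenvalues of [[m1, m_d], [m_d, m2]], returned in ascending order."""
    mean = 0.5 * (m1 + m2)
    radius = math.hypot(0.5 * (m1 - m2), m_d)
    # Take the eigenvalue of larger magnitude first, then use
    # det = m1*m2 - m_d**2 = small * large to get the other one accurately.
    large = mean + radius if mean >= 0 else mean - radius
    if large == 0.0:
        return 0.0, 0.0
    small = (m1 * m2 - m_d**2) / large
    return tuple(sorted((small, large)))

# Dirac limit, M1 = M2 = 0: eigenvalues -mD and +mD.
dirac_low, dirac_high = mass_eigenvalues(0.0, 0.0, 1.0)

# See-saw regime, M1 = 0 and M2 >> mD (illustrative numbers).
m_d, big_m = 1.0, 1.0e4
light, heavy = mass_eigenvalues(0.0, big_m, m_d)
seesaw_estimate = m_d**2 / big_m

print("Dirac limit:", dirac_low, dirac_high)
print("see-saw    :", abs(light), "vs mD^2/M2 =", seesaw_estimate)
```

The light eigenvalue comes out negative; only its magnitude is compared with $m_D^2/M_2$ here.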

13.4 PROBLEM FOR CHAPTER 13

Sect. 13.1
1. Extend the calculation illustrated in Fig. 13.1 to the axial vector density,
ψ̄γ μ γ5 ψ.
CHAPTER 14

APPLICATIONS: QED

The calculation of the elements of the S-matrix to a given order of per-

turbation theory, equation (11.9), is carried out using the general method of
Feynman diagrams, which will be illustrated in [14].
For processes up to the second order, which we will consider in this and the
following chapter, the matrix elements can be calculated by simple inspection
and we will not need the general formalism, which is essential for the calculation of
higher order corrections and renormalisation. The focus here is rather to show
how the ideas of second quantisation can be compared with experiments for the
simplest processes, which historically had an essential role in the development
of the theory of elementary particles.

14.1 SCATTERING IN A CLASSICAL COULOMB FIELD

We consider the scattering of an electron in a static, i.e. time-independent,
external field. The relevant 4-potential can be written

$$A^\mu(x) = A^\mu(\mathbf{x}) = \int\frac{d^3q}{(2\pi)^3}\,A^\mu(\mathbf{q})\,e^{i\mathbf{q}\cdot\mathbf{x}} \tag{14.1}$$
and the perturbative expansion of the S-matrix has the form
$$S = \sum_{n=0}^{\infty}\frac{(ie)^n}{n!}\int d^4x_1\ldots d^4x_n\; T\left\{[:\bar\psi(x_1)\not{A}(x_1)\psi(x_1):]\ldots[:\bar\psi(x_n)\not{A}(x_n)\psi(x_n):]\right\}, \tag{14.2}$$
where ψ is the field which describes the electron and T is the time-ordering
operator.
To first order in the interaction Lagrangian, equation (14.2) reduces to

$$S = 1 + ie\int d^4x\, :\bar\psi(x)\not{A}(x)\psi(x): . \tag{14.3}$$

The physical process described by (14.3) is represented schematically by the Feynman diagram of Fig. 14.1. The interaction with the external field causes the transition of the electron from the initial state $|i\rangle = |p\,r\rangle$ to the final state $|f\rangle = |p'\,r'\rangle$, where $p \equiv (E, \mathbf{p})$ and $p' \equiv (E', \mathbf{p}')$ are the 4-momenta and $r$ and $r'$ the spin projections.

Figure 14.1 Diagrammatic representation of electron scattering in a static field.

DOI: 10.1201/9781003436263-14

This chapter has been made available under a CC BY NC license.
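The corresponding S-matrix element has the form

$$\begin{aligned} S_{if} = \langle f|S|i\rangle &= ie\left(\frac{m}{VE}\right)^{1/2}\left(\frac{m}{VE'}\right)^{1/2}\int d^4x\; e^{i(p'-p)x}\,\bar u_{r'}(p')\not{A}(x)u_r(p)\\ &= ie\left(\frac{m}{VE}\right)^{1/2}\left(\frac{m}{VE'}\right)^{1/2}\int d^3x\; e^{i(\mathbf{q}+\mathbf{p}-\mathbf{p}')\cdot\mathbf{x}}\int dt\; e^{i(E'-E)t}\int\frac{d^3q}{(2\pi)^3}\,\bar u_{r'}(p')\not{A}(\mathbf{q})u_r(p)\\ &= (2\pi)\,\delta(E-E')\left(\frac{m}{VE}\right)^{1/2}\left(\frac{m}{VE'}\right)^{1/2}\mathcal{M}_{if},\end{aligned}\tag{14.4}$$

where V is the normalisation volume and

$$\mathcal{M}_{if} = ie\,\bar u_{r'}(p')\not{A}(\mathbf{p}'-\mathbf{p})u_r(p). \tag{14.5}$$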
We note that in (14.4) only the δ-function associated with the conservation
of energy appears. Momentum is not conserved, because the static source,
which generates the field, breaks translational invariance. The relation which expresses the conservation of energy, which implies

$$|\mathbf{p}| = |\mathbf{p}'|, \tag{14.6}$$

clearly shows that the momentum absorbed by the source is neglected.
The differential cross section for the process is obtained from the formulae given in Chapter 11:

$$d\sigma = \frac{V}{v}\,W_{if}\,\frac{V\,d^3p'}{(2\pi)^3}, \tag{14.7}$$

where $W_{if}$ is the transition probability per unit time

$$W_{if} = \frac{|S_{if}|^2}{T} = \left(\frac{m}{VE}\right)^2(2\pi)\,\delta(E-E')\,|\mathcal{M}_{if}|^2, \tag{14.8}$$

and T is the duration of the interaction. Using equations (14.7) and (14.8), and the relation:

$$|\mathbf{p}'|^2\,d|\mathbf{p}'| = |\mathbf{p}'|\,E'\,dE', \tag{14.9}$$

we obtain

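$$d\sigma = \frac{VE}{|\mathbf{p}|}\left(\frac{m}{VE}\right)^2(2\pi)\,\delta(E-E')\,|\mathcal{M}_{if}|^2\,\frac{V}{(2\pi)^3}\,|\mathbf{p}'|\,E'\,dE'\,d\Omega_{p'}. \tag{14.10}$$

To arrive at the expression for the differential cross-section, the probability that the electron is scattered into a solid angle element $d\Omega_{p'}$, we must integrate over the energy using the δ-function, with the result

$$\frac{d\sigma}{d\Omega_{p'}} = \left(\frac{m}{2\pi}\right)^2|\mathcal{M}_{if}|^2 = \left(\frac{me}{2\pi}\right)^2\left|\bar u_{r'}(p')\not{A}(\mathbf{q})u_r(p)\right|^2, \tag{14.11}$$

where $\mathbf{q} = \mathbf{p}' - \mathbf{p}$.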
We now consider the case in which the external ﬁeld is the Coulomb ﬁeld
generated by an atomic nucleus, which we suppose to be pointlike. Therefore
we have:
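$$A^\mu(x) \equiv \left(\frac{Z}{4\pi}\frac{e}{|\mathbf{x}|},\,0,\,0,\,0\right), \tag{14.12}$$

with the Fourier transform (FT) given by (cf. Problem 1):

$$A^\mu(\mathbf{q}) \equiv \left(Z\frac{e}{|\mathbf{q}|^2},\,0,\,0,\,0\right). \tag{14.13}$$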

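Substituting (14.13) into (14.11), summing over the projections of the electron spin in the final state and averaging over those of the initial state of the electron we obtain

$$\begin{aligned}\frac{d\sigma}{d\Omega_{p'}} &= \left(\frac{me}{2\pi}\right)^2 Z^2\,\frac{e^2}{|\mathbf{q}|^4}\;\frac{1}{2}\sum_{r,r'}\left|\bar u_{r'}(p')\gamma^0u_r(p)\right|^2\\ &= \frac{(2m\alpha Z)^2}{|\mathbf{q}|^4}\,\frac{1}{(2m)^2}\,\frac{1}{2}\,\mathrm{Tr}\left[(\not{p}'+m)\gamma^0(\not{p}+m)\gamma^0\right],\end{aligned}\tag{14.14}$$

where $\alpha = e^2/4\pi$ and the sums over $r$ and $r'$ are carried out using the completeness relations satisfied by the Dirac spinors. The calculation of the trace in (14.14) is easily done using the result

$$\begin{aligned}\mathrm{Tr}\left[(\not{p}'+m)\gamma^0(\not{p}+m)\gamma^0\right] &= \mathrm{Tr}(\not{p}'\gamma^0\not{p}\gamma^0) + m^2\,\mathrm{Tr}(\gamma^0\gamma^0)\\ &= 4p'_\mu p_\nu\left(2g^{\mu 0}g^{\nu 0} - g^{\mu\nu}g^{00}\right) + 4m^2\\ &= 4\left[2EE' - (pp') + m^2\right] = 4\left[EE' + (\mathbf{p}\cdot\mathbf{p}') + m^2\right].\end{aligned}\tag{14.15}$$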
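The trace identity (14.15) can be confirmed numerically with explicit γ matrices. The sketch below is an illustration only: it assumes the Dirac representation of the γ matrices and the metric signature (+, −, −, −), and the helper names and kinematic values are chosen for the example.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Dirac representation (an assumption of this sketch): gamma^0, gamma^1..3.
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]

def slash(p):
    """p-slash = E*gamma^0 - px*gamma^1 - py*gamma^2 - pz*gamma^3."""
    signs = (1, -1, -1, -1)
    result = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        result = combine(1, result, signs[mu] * p[mu], GAMMA[mu])
    return result

def coulomb_trace(p_in, p_out, mass):
    """Tr[(p'-slash + m) gamma^0 (p-slash + m) gamma^0]."""
    left = combine(1, slash(p_out), mass, IDENTITY)
    right = combine(1, slash(p_in), mass, IDENTITY)
    return trace(matmul(matmul(matmul(left, GAMMA[0]), right), GAMMA[0]))

# Elastic kinematics with illustrative values: |p| = |p'|, angle theta.
mass, momentum, theta = 1.0, 0.8, 0.6
energy = math.hypot(mass, momentum)
p_in = (energy, 0.0, 0.0, momentum)
p_out = (energy, momentum * math.sin(theta), 0.0, momentum * math.cos(theta))

numeric = coulomb_trace(p_in, p_out, mass)
closed_form = 4.0 * (energy * energy + momentum**2 * math.cos(theta) + mass**2)
print(numeric, closed_form)
```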
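In this way we obtain

$$\begin{aligned}\frac{d\sigma}{d\Omega_{p'}} &= 2\,\frac{(\alpha Z)^2}{|\mathbf{q}|^4}\left[E^2 + (\mathbf{p}\cdot\mathbf{p}') + m^2\right]\\ &= 2\,\frac{(\alpha Z)^2}{|\mathbf{q}|^4}\,2E^2\left(1 - \frac{|\mathbf{p}|^2}{E^2}\sin^2\frac{\theta}{2}\right)\\ &= 2\,\frac{(\alpha Z)^2}{|\mathbf{q}|^4}\,2E^2\left(1 - v^2\sin^2\frac{\theta}{2}\right),\end{aligned}\tag{14.16}$$

where $v$ and $\theta$ are, respectively, the velocity and scattering angle of the electron, i.e. the angle between the vectors $\mathbf{p}$ and $\mathbf{p}'$. Using the relation

$$|\mathbf{q}|^2 = |\mathbf{p}'-\mathbf{p}|^2 = 2|\mathbf{p}|^2(1-\cos\theta) = 4E^2v^2\sin^2\frac{\theta}{2}, \tag{14.17}$$

we can rewrite equation (14.16) in the form originally obtained by Mott

$$\frac{d\sigma}{d\Omega_{p'}} = \frac{(\alpha Z)^2}{4E^2v^4\sin^4(\theta/2)}\left(1 - v^2\sin^2\frac{\theta}{2}\right). \tag{14.18}$$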

The Mott cross-section describes the elastic scattering of electrons by nu-

clei to the lowest order in the ﬁne structure constant α. Obviously, this ap-
proximation is not applicable in the case of very heavy nuclei, for which the
coupling constant Zα becomes too large.
In the non-relativistic limit, i.e. for $v \ll 1$ and $E \sim m$, from (14.18) we recover the famous Rutherford cross-section

$$\frac{d\sigma}{d\Omega_{p'}} = \frac{(\alpha Z)^2}{4m^2v^4\sin^4(\theta/2)}, \tag{14.19}$$

whose experimental measurement led to the development of the planetary model of the atom.
In the ultrarelativistic limit, $v \to 1$, the Mott cross section becomes:

$$\left(\frac{d\sigma}{d\Omega_{p'}}\right)_{v=1} = \frac{(\alpha Z)^2\cos^2(\theta/2)}{4E^2v^4\sin^4(\theta/2)}. \tag{14.20}$$

The ultrarelativistic Mott cross section vanishes for back-scattering of the electron, $\theta = \pi$.
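The two limits of the Mott formula (14.18) are easy to check numerically. The following sketch works in natural units; the function names and sample values are assumptions made for illustration. The Rutherford expression used here is the $v \ll 1$, $E \simeq m$ limit of (14.18), i.e. $(\alpha Z)^2 / [4m^2v^4\sin^4(\theta/2)]$.

```python
import math

def mott_cross_section(z, alpha, energy, velocity, theta):
    """Equation (14.18): scattering of an electron by a pointlike charge Z."""
    s2 = math.sin(theta / 2.0) ** 2
    prefactor = (alpha * z) ** 2 / (4.0 * energy**2 * velocity**4 * s2**2)
    return prefactor * (1.0 - velocity**2 * s2)

def rutherford_cross_section(z, alpha, mass, velocity, theta):
    """Non-relativistic limit of (14.18): v << 1 and E ~ m."""
    s2 = math.sin(theta / 2.0) ** 2
    return (alpha * z) ** 2 / (4.0 * mass**2 * velocity**4 * s2**2)

alpha, z, mass = 1.0 / 137.0, 1, 1.0   # illustrative inputs
theta = 1.0

# Slow electron: the ratio Mott / Rutherford approaches 1.
v_slow = 1.0e-3
e_slow = mass / math.sqrt(1.0 - v_slow**2)
ratio_slow = (mott_cross_section(z, alpha, e_slow, v_slow, theta)
              / rutherford_cross_section(z, alpha, mass, v_slow, theta))

# Ultrarelativistic electron (v = 1): back-scattering is suppressed.
backward = mott_cross_section(z, alpha, 10.0, 1.0, math.pi)

print(ratio_slow, backward)
```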

Note. We have succeeded in deriving the Rutherford cross-section from a

calculation to ﬁrst order in α. This is surprising because the same expression
is obtained from classical mechanics as an exact result. The explanation of
this apparent contradiction is due to Dalitz, who showed that, in the non-
relativistic limit, the inclusion of higher-order corrections only results in the
appearance of a phase factor in the amplitude, which leaves the cross-section
unchanged.

Coulomb Divergence. The integral of the Mott cross section, (14.20), over
solid angle diverges for θ = 0, preventing deﬁnition of the total cross-section.
To be exact, for small values of $\theta$

$$\frac{d\sigma}{d\Omega_{p'}}\,d\cos\theta \simeq \frac{4(\alpha Z)^2}{E^2v^4\,\theta^4}\,\theta\,d\theta. \tag{14.21}$$

The origin of this divergence can be traced to the trend towards singularity,
for |q| → 0, of the FT of the Coulomb field, equation (14.13), or to the too
slow fall off of the Coulomb field itself, as |x| → ∞, equation (14.12). Due to
this slow decrease, even particles which pass very far from the charge at the
origin feel its effect; the asymptotic states, in the interaction representation,
do not truly tend to a constant and this prevents us from rigorously defining
the S-matrix.
Fortunately, isolated electric charges do not exist in Nature. The positive
charge of each nucleus, in ordinary matter, is screened by the negative charges
of the atomic electrons. Also in an ionised plasma, due to the long range of
the electrostatic forces, the positive charges attract negative charges around
them so that, on average, the plasma is electrically neutral. The result is that
the integrand in (14.21) is suppressed for small values of θ.
We recall that the FT at $|\mathbf{q}|$ is sensitive to the values of the function for $|\mathbf{x}| \sim |\mathbf{q}|^{-1}$. Therefore, if we consider scattering on an atom, the presence of the external electrons is felt at values of $|\mathbf{q}|$ such that:

$$|\mathbf{q}|\,R \le 1, \tag{14.22}$$

where $R$ is the atomic radius, or for:

$$\theta \le \frac{1}{REv}. \tag{14.23}$$
In this θ region the differential cross section is suppressed by the fact that
the electron sees a gradually decreasing total charge and the total cross-section
is finite.

Non-relativistic Form Factor. We can improve the approximation of the

pointlike charge of the nucleus by introducing a form factor.
We suppose that the charge density inside the nucleus is described by a function $Ze\rho(\mathbf{x})$. The charge density must decrease rapidly with increasing $|\mathbf{x}|$, vanish outside the nucleus, i.e. for $|\mathbf{x}| > R_N$, and satisfy the normalisation condition:

$$\int\rho(\mathbf{x})\,d^3x = 1 \tag{14.24}$$

so that the total charge is still given by $Ze$. The electrostatic potential of the nucleus is now given by:

$$A^0(\mathbf{x}) = \frac{Ze}{4\pi}\int d^3y\,\frac{\rho(\mathbf{y})}{|\mathbf{x}-\mathbf{y}|}, \tag{14.25}$$

and its FT is given by (cf. Problem 1):

$$A^0(\mathbf{q}) = \frac{Ze}{|\mathbf{q}|^2}\,F(\mathbf{q});\qquad F(\mathbf{q}) = \int d^3x\; e^{i\mathbf{q}\cdot\mathbf{x}}\rho(\mathbf{x}). \tag{14.26}$$

Correspondingly the Mott cross section becomes:

$$\frac{d\sigma}{d\Omega_{p'}} = \frac{(\alpha Z)^2}{4E^2v^4\sin^4(\theta/2)}\,|F(\mathbf{q})|^2\left(1 - v^2\sin^2\frac{\theta}{2}\right). \tag{14.27}$$
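As an illustration of the definition (14.26), one can take a uniformly charged sphere of radius $R_N$. This density is an assumption chosen for the example, not a model used in the text. Its form factor has the closed form $F = 3(\sin x - x\cos x)/x^3$ with $x = |\mathbf{q}|R_N$, which the sketch below compares with a direct numerical evaluation of the radial integral.

```python
import math

def uniform_sphere_form_factor(q, radius):
    """Closed-form F(q) for a uniform density inside |x| < radius."""
    x = q * radius
    if abs(x) < 1.0e-4:
        return 1.0 - x * x / 10.0          # leading terms of the small-x expansion
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def numerical_form_factor(q, radius, steps=2000):
    """Midpoint-rule evaluation of (14.26) for the same density.

    For a spherically symmetric rho the angular integral gives
    F(q) = 4*pi * integral of rho(r) * r^2 * sin(q r)/(q r) dr.
    """
    rho = 3.0 / (4.0 * math.pi * radius**3)   # normalised as in (14.24)
    dr = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += rho * r * r * math.sin(q * r) / (q * r) * dr
    return 4.0 * math.pi * total

radius = 1.0
for q in (0.5, 2.0, 6.0):
    print(q, uniform_sphere_form_factor(q, radius), numerical_form_factor(q, radius))
```

At $q = 0$ the form factor reduces to 1, as required by the normalisation (14.24), so the pointlike result is recovered at small momentum transfer.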

14.2 ELECTROMAGNETIC FORM FACTORS

We consider the matrix element $\langle p'|J_p^\mu|p\rangle$ of the proton current, equation (9.16), between states with momenta $p$ and $p'$.
Substituting the expansion in plane waves of the fields, we find ($M$ denotes the proton mass):

$$\langle p'|J_p^\mu|p\rangle = \frac{M}{V\sqrt{EE'}}\,\bar u(p')\left[\gamma^\mu + \frac{\kappa}{2M}\,i\sigma^{\mu\nu}q_\nu\right]u(p) = \frac{M}{V\sqrt{EE'}}\,J^\mu(p',p), \tag{14.28}$$

where we have introduced the momentum transfer:

$$q^\mu = p'^\mu - p^\mu. \tag{14.29}$$

We can ask what would be the most general form of $J^\mu(p',p)$ compatible with a conserved polar 4-vector:

$$q_\mu J^\mu(p',p) = 0. \tag{14.30}$$

Clearly the current must have the form:

$$J^\mu(p',p) = \bar u(p')\,\Gamma^\mu(p',p)\,u(p), \tag{14.31}$$

with $\Gamma^\mu$ a linear combination of the available 4-vectors, which we can choose as:

$$\gamma_\mu,\quad \sigma_{\mu\nu}q^\nu,\quad q_\mu,\quad p_\mu + p'_\mu,\quad \sigma_{\mu\nu}(p^\nu + p'^\nu). \tag{14.32}$$
However, taken between the two spinors $\bar u(p')$ and $u(p)$, these vectors are not independent of each other, as a result of the equation of motion which $u(p)$ satisfies, (6.76), and the corresponding equation for $\bar u(p')$. Multiplying the first equation of (6.76) by $\bar u(p')\gamma_\mu$, we obtain:

$$\bar u(p')\left[p_\mu - i\sigma_{\mu\nu}p^\nu\right]u(p) = M\,\bar u(p')\gamma_\mu u(p).$$

Proceeding symmetrically with the equation for $\bar u(p')$, we also obtain:

$$\bar u(p')\left[p'_\mu + i\sigma_{\mu\nu}p'^\nu\right]u(p) = M\,\bar u(p')\gamma_\mu u(p).$$

Summing and subtracting the two equations, we find the relations¹:

$$\frac{p_\mu + p'_\mu}{2M}\,\bar u(p')u(p) + \frac{i}{2M}\,\bar u(p')\sigma_{\mu\nu}q^\nu u(p) = \bar u(p')\gamma_\mu u(p) \tag{14.33}$$

$$\frac{q_\mu}{2M}\,\bar u(p')u(p) + \frac{i}{2M}\,\bar u(p')\sigma_{\mu\nu}(p^\nu + p'^\nu)u(p) = 0, \tag{14.34}$$

which show that we can limit ourselves to the first three vectors:

$$\Gamma_\mu(p',p) = A\gamma_\mu + B\,\frac{i}{2M}\,\sigma_{\mu\nu}q^\nu + C q_\mu. \tag{14.35}$$
Moreover, from the conservation equation (14.30) we obtain:

$$q_\mu J^\mu(p',p) = C\,q^2\,\bar u(p')u(p) = 0, \tag{14.36}$$

or $C = 0$. We can therefore write:

$$J^\mu(p',p) = \bar u(p')\left[A\gamma^\mu + B\,\frac{i}{2M}\,\sigma^{\mu\nu}q_\nu\right]u(p). \tag{14.37}$$

The requirement that (14.37) transforms like a 4-vector implies, in general, that the coefficients $A$ and $B$ are invariant functions of the 4-momenta. Because $p^2 = p'^2 = M^2$, the only non-trivial invariant combination is $q^2$ and we obtain, finally, the form of the current:

$$J^\mu(p',p) = \bar u(p')\left[F_1(q^2)\gamma^\mu + F_2(q^2)\,\frac{i}{2M}\,\sigma^{\mu\nu}q_\nu\right]u(p). \tag{14.38}$$
It is easily seen that condition (9.17) implies

F1 (0) = 1 , (14.39)

while from (9.10) we ﬁnd:

F2 (0) = κ . (14.40)
Similar formulae naturally hold for the neutron, with F1,n (0) = 0.
The functions F1 (q 2 ) and F2 (q 2 ) are known as Dirac and Pauli form fac-
tors, respectively, and can be measured, as we will see, by elastic scattering
of electrons on hydrogen (for the proton) and on deuterium (obtaining the
neutron form factor by subtraction).
The form (14.38) is obviously not unique because we can choose the basic 4-vectors in different ways. For example, we can write the current as:

$$J^\mu(p',p) = \bar u(p')\left[F_a(q^2)\,\frac{p^\mu + p'^\mu}{2M} + F_b(q^2)\,\frac{i}{2M}\,\sigma^{\mu\nu}q_\nu\right]u(p). \tag{14.41}$$

¹ The first of these relations is well known as the Gordon decomposition.

Using equation (14.33), this choice corresponds to:

$$F_1(q^2) = F_a(q^2);\qquad F_2(q^2) = F_b(q^2) - F_a(q^2) \tag{14.42}$$

or, for $q^2 = 0$:

$$F_a(0) = 1;\qquad F_b(0) = 1 + F_2(0). \tag{14.43}$$

$F_b(0)$ immediately gives the total magnetic moment (in Bohr magneton units).
The choice (14.41) is best suited for the calculation of radiative corrections
to the anomalous magnetic moment of the electron; see [14].
A particularly convenient choice to describe the elastic electron–nucleon
cross section is given by the Sachs form factors

$$G_E = F_1 - \tau F_2,\qquad G_M = F_1 + F_2,\qquad \tau = -\frac{q^2}{4M^2}, \tag{14.44}$$
which are known, respectively, as the electric and magnetic form factors. The
normalisation conditions follow from equation (14.44):

GE (0) = 1, GM (0) = 1 + κ . (14.45)

14.3 THE ROSENBLUTH FORMULA

In the scattering of electrons on protons, when the electron energy becomes
of the order of GeV, we must include a fully quantum description of the pro-
ton. To do this, we extend the interaction Lagrangian of the electromagnetic
ﬁeld according to the approach illustrated in Chapter 9, by introducing the
electromagnetic current of the proton:
$$\mathcal{L}_{e.m.} = e\,A_\mu(x)\,J^\mu_{tot}(x);\qquad J^\mu_{tot}(x) = J^\mu_e(x) + J^\mu_p(x). \tag{14.46}$$

For the electron current, we take, as before:

Jeμ (x) = − : ψ̄e (x)γ μ ψe (x) : . (14.47)

The matrix element for the proton current between proton states is parameterised in terms of the form factors introduced in the previous section, equation (14.38):

$$\langle p'|J_p^\mu(0)|p\rangle = \sqrt{\frac{M^2}{V^2E_pE_{p'}}}\;\bar u(p')\,\Gamma^\mu(p',p)\,u(p),\qquad \Gamma^\mu(p',p) = F_1(q^2)\gamma^\mu + F_2(q^2)\,\frac{i}{2M}\,\sigma^{\mu\nu}q_\nu. \tag{14.48}$$

To lowest order of perturbation theory, electron-proton scattering is described by terms of second order:

$$\begin{aligned} S^{(2)} &= \frac{(ie)^2}{2}\int d^4x\,d^4y\; T\left[A_\mu(x)J^\mu_{tot}(x)\,A_\nu(y)J^\nu_{tot}(y)\right]\\ &= (ie)^2\int d^4x\,d^4y\; T\left[A_\mu(x)J^\mu_e(x)\,A_\nu(y)J^\nu_p(y)\right] + \ldots\end{aligned}\tag{14.49}$$

We have written only the terms of the T -product which have the right
annihilation and creation operators to destroy the initial particles and create
the ﬁnal particles, summarising the others, which are not essential, in the
ellipsis dots.

Matrix Elements. As explained earlier, the operators which appear in equa-

tion (14.49) satisfy the free particle equations of motion. Therefore fields which
correspond to different particles commute or anticommute among themselves,
according to which type of statistics they obey. Furthermore, the external
states are tensor products of the states of the different particles, for example:

$$|e, p, 0_\gamma\rangle = a^\dagger(e)\,a^\dagger(p)\,|0\rangle = |e\rangle|p\rangle|0_\gamma\rangle = a^\dagger(e)|0_e\rangle\; a^\dagger(p)|0_p\rangle\; |0_\gamma\rangle,$$

where we have denoted the different momenta using the names of the relevant particles, with, for example, the state with zero electrons being $|0_e\rangle$, and, naturally:

$$|0\rangle = |0_e, 0_p, 0_\gamma\rangle = |0_e\rangle|0_p\rangle|0_\gamma\rangle.$$

The current of each particle operates only on the corresponding states. Finally, we can take the currents outside the T-product and write:

$$\langle f|S|i\rangle = (ie)^2\int d^4x\,d^4y\;\langle e'|J_e^\mu(x)|e\rangle\;\langle 0_\gamma|T\left[A_\mu(x)A_\nu(y)\right]|0_\gamma\rangle\;\langle p'|J_p^\nu(y)|p\rangle. \tag{14.50}$$

Fig. 14.2 (a) gives a space-time picture of the scattering process. The initial
electron propagates up to a point x where it is destroyed by the current,
which creates the final electron at the same point. Also, at x a photon which
propagates from point x to point y, or from y to x, is emitted or absorbed,
while at point y the proton current changes the initial proton into the final
one. The amplitude of the propagation of the photon between x and y is given
by the T -product on the vacuum of the electromagnetic fields and the product
of the three factors in the integral of (14.50) represents the overall amplitude
of the history which corresponds to values of x and y.
The observation of the final particles does not determine the points x and
y where the interaction took place, therefore, according to the principles of
quantum mechanics, we must sum over the amplitudes of each history to have
the total amplitude for the process, integrating over the values of x and y.
Figure 14.2 Graphical representations of the electron-proton scattering amplitudes in (a) space-time and (b) momentum space.

Proceeding as in the previous section, we find (the minus sign corresponds to the negative electron charge; it is here as a reminder only, as it disappears when computing probabilities):

$$\langle e'|J_e^\mu(x)|e\rangle = \sqrt{\frac{m_e^2}{V^2E_eE_{e'}}}\;e^{i(e'-e)x}\,(-1)\,\bar u(e')\gamma^\mu u(e), \tag{14.51}$$

and similarly:

$$\langle p'|J_p^\nu(y)|p\rangle = \sqrt{\frac{M^2}{V^2E_pE_{p'}}}\;e^{i(p'-p)y}\,\bar u(p')\,\Gamma^\nu(p',p)\,u(p). \tag{14.52}$$

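Using the translational invariance of the vacuum in the propagation function of the electromagnetic field, we finally find:

$$\begin{aligned}\langle f|S|i\rangle = {}&(2\pi)^4\delta^{(4)}(p'+e'-p-e)\,\sqrt{\frac{m_e^2}{V^2E_eE_{e'}}}\,\sqrt{\frac{M^2}{V^2E_pE_{p'}}}\\ &\times e^2\,\bar u(e')\gamma^\mu u(e)\; i(D_F)_{\mu\nu}(p-p')\;\bar u(p')\,\Gamma^\nu(p',p)\,u(p),\end{aligned}\tag{14.53}$$

where we have set, according to equation (8.27):

$$i(D_F)^{\mu\nu}(q) = \int d^4x\; e^{-iqx}\,\langle 0|T\left[A^\mu(x)A^\nu(0)\right]|0\rangle = \frac{-i\,g^{\mu\nu}}{q^2 + i\epsilon}. \tag{14.54}$$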

The result (14.53) can be represented graphically as in Fig. 14.2(b). The initial and final particles emerge from the vertices, each one represented by a Dirac spinor ($u$ for an incoming fermion, $\bar u$ for an outgoing fermion) and by the corresponding normalisation factor. The interaction is represented by a factor $-ie\gamma^\mu$ or $ie\Gamma^\nu$ for electron and proton. The momentum $q = p - p' = e' - e$ is associated with the photon propagator so that 4-momentum is conserved at each vertex.

Diﬀerential Cross Section. Following the formulae of the previous chapter,

the diﬀerential cross section is written:

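$$d\sigma = \frac{1}{4}\,(2\pi)^4\delta^{(4)}(p'+e'-p-e)\,\frac{m_e^2M^2}{E_eE_{e'}E_pE_{p'}}\,\frac{d^3p'\,d^3e'}{(2\pi)^6}\,\frac{e^4}{(q^2)^2}\,H_{\mu\nu}L^{\mu\nu}. \tag{14.55}$$

The factor $\frac{1}{4}$ comes from the averaging over the initial spins and we have set:

$$H_{\mu\nu} = \sum_{r,s}\bar u_r(p')\Gamma_\mu u_s(p)\,\bar u_s(p)\Gamma_\nu u_r(p') = \mathrm{Tr}\left[\frac{\not{p}+M}{2M}\,\Gamma_\nu\,\frac{\not{p}'+M}{2M}\,\Gamma_\mu\right] = \frac{h_{\mu\nu}}{4M^2}, \tag{14.56}$$

$$L^{\mu\nu} = \sum_{p,q}\bar u_p(e')\gamma^\mu u_q(e)\,\bar u_q(e)\gamma^\nu u_p(e') = \mathrm{Tr}\left[\frac{\not{e}}{2m_e}\,\gamma^\nu\,\frac{\not{e}'}{2m_e}\,\gamma^\mu\right] = \frac{l^{\mu\nu}}{4m_e^2}. \tag{14.57}$$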
Here and in what follows we neglect the electron mass in the numerators.
To obtain the differential cross section in the variables describing the final electron, which are those which are normally observed, we must integrate over the momentum of the final proton using the three-dimensional δ-function for momentum conservation, which fixes:

$$\mathbf{p}' = \mathbf{e} - \mathbf{e}'. \tag{14.58}$$

In addition, conservation of energy fixes the energy of the final electron. In the proton rest frame, the argument of the δ-function for the energy in (14.55) is:

$$f(E_{e'}) = E_{e'} + E_{p'} - E_e - M = E_{e'} + \sqrt{M^2 + E_{e'}^2 - 2E_eE_{e'}\cos\theta + E_e^2} - E_e - M, \tag{14.59}$$

with the derivative with respect to $E_{e'}$ equal to:

$$\frac{\partial f}{\partial E_{e'}} = \frac{M E_e}{E_{e'}E_{p'}}. \tag{14.60}$$

Using the relation:

$$\delta(f(x)) = \frac{1}{|(\partial f/\partial x)_{x_0}|}\,\delta(x - x_0),\qquad f(x_0) = 0, \tag{14.61}$$

and introducing the fine structure constant:

$$\alpha = \frac{e^2}{4\pi}, \tag{14.62}$$

we find:

$$d\sigma = \frac{\alpha^2}{16\,q^4}\left(\frac{E_{e'}}{E_e}\right)^2\frac{X}{M^2}\,d\Omega, \tag{14.63}$$
where
X = lμν hμν , (14.64)
and dΩ is the solid angle of the ﬁnal electron.

Kinematic Variables. In the proton rest frame, the process is described by the energy of the initial electron and the scattering angle, $\theta$, of the final electron. As an alternative we can use the squared momentum transfer,

$$q^2 = (p - p')^2 = (e' - e)^2, \tag{14.65}$$

or the energy, $E_{e'}$, of the final electron. From its definition, neglecting the electron mass, we find:

$$-q^2 = 2E_eE_{e'}(1 - \cos\theta) = 4E_eE_{e'}\sin^2\frac{\theta}{2}. \tag{14.66}$$

In general $W^2$ is used to denote the squared mass of the system recoiling against the final electron. In the elastic scattering process which we are considering we must have:

$$W^2 = M^2 = (p + e - e')^2 = M^2 + q^2 + 2M(E_e - E_{e'}),$$

or:

$$E_{e'} = E_e + \frac{q^2}{2M},$$

and therefore, using equation (14.66):

$$E_{e'} = \frac{E_e}{1 + \dfrac{2E_e}{M}\sin^2\dfrac{\theta}{2}}. \tag{14.67}$$
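The elastic kinematics can be checked with a few lines of code. The sketch below assumes a massless electron and the proton initially at rest, and uses $E_{e'} = E_e/[1 + (2E_e/M)\sin^2(\theta/2)]$ together with $q^2 = -4E_eE_{e'}\sin^2(\theta/2)$, which is negative for scattering. The function names and the numerical inputs (energies in GeV) are illustrative assumptions.

```python
import math

def scattered_electron_energy(e_in, proton_mass, theta):
    """Final electron energy for elastic scattering on a proton at rest."""
    return e_in / (1.0 + (2.0 * e_in / proton_mass) * math.sin(theta / 2.0) ** 2)

def momentum_transfer_squared(e_in, e_out, theta):
    """q^2 = (e' - e)^2 for massless electrons; negative (spacelike)."""
    return -4.0 * e_in * e_out * math.sin(theta / 2.0) ** 2

e_in, proton_mass, theta = 1.0, 0.938, math.radians(60.0)  # illustrative values
e_out = scattered_electron_energy(e_in, proton_mass, theta)
q2 = momentum_transfer_squared(e_in, e_out, theta)

# Elastic condition: W^2 = M^2 + q^2 + 2M(E_e - E_e') must equal M^2.
w2 = proton_mass**2 + q2 + 2.0 * proton_mass * (e_in - e_out)
print(e_out, q2, w2 - proton_mass**2)
```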

Traces. We must calculate:

$$h^{\mu\nu} = \mathrm{Tr}\left[(\not{p}+M)\,\Gamma^\nu\,(\not{p}'+M)\,\Gamma^\mu\right]. \tag{14.68}$$

If we use equation (14.48), in the trace there are up to six γ matrices. As an alternative we can again use the Gordon decomposition, (14.33), and express the proton current as a combination of $\gamma^\mu$ and of:

$$Q^\mu = (p + p')^\mu. \tag{14.69}$$

We find:

$$\Gamma^\mu(p',p) = A(q^2)\,\gamma^\mu + B(q^2)\,\frac{Q^\mu}{2M},\qquad A = F_1 + F_2,\quad B = -F_2. \tag{14.70}$$
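With this arrangement, the trace (14.68) is easily calculated, with the result

$$h^{\mu\nu} = 2\left\{Q^\mu Q^\nu\left[(A+B)^2 - \frac{q^2}{4M^2}B^2\right] - \Pi^{\mu\nu}(q)\,A^2\right\},\qquad \Pi^{\mu\nu}(q) = q^\mu q^\nu - q^2g^{\mu\nu}. \tag{14.71}$$

The trace of the electron is obtained from (14.71) with the obvious substitutions:

$$Q^\mu \to E^\mu = (e + e')^\mu,\quad M \to m_e \simeq 0,\quad A \to 1,\quad B \to 0, \tag{14.72}$$

yielding

$$l^{\mu\nu} = 2\left[E^\mu E^\nu - \Pi^{\mu\nu}(q)\right].$$

Finally, we find:

$$X = 4\left\{\left[(Q_\mu E^\mu)^2 + q^2Q^2\right]\left[(A+B)^2 - \frac{q^2}{4M^2}B^2\right] + 2(q^2)^2A^2\right\}.$$

From the preceding equations, we can explicitly calculate²:

$$\begin{aligned}(Q_\mu E^\mu)^2 + q^2Q^2 &= 4M^2(E_e + E_{e'})^2 + q^2(4M^2 - q^2)\\ &= 16M^2E_e\left(E_e + \frac{q^2}{2M}\right) + 4M^2q^2 = 16M^2E_eE_{e'}\cos^2\frac{\theta}{2}.\end{aligned}$$

Finally setting:

$$2(q^2)^2 = 8(-q^2)\,E_eE_{e'}\sin^2\frac{\theta}{2},$$

we find:

$$\frac{X}{M^2} = 64\,E_eE_{e'}\cos^2\frac{\theta}{2}\left\{F_1^2 - \frac{q^2}{4M^2}F_2^2 + \frac{-q^2}{2M^2}(F_1+F_2)^2\tan^2\frac{\theta}{2}\right\}. \tag{14.73}$$

Substituting into (14.63), we obtain the Rosenbluth formula:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2\cos^2\frac{\theta}{2}}{4E_e^2\sin^4\frac{\theta}{2}}\,\frac{E_{e'}}{E_e}\left\{F_1^2 - \frac{q^2}{4M^2}F_2^2 + \frac{-q^2}{2M^2}(F_1+F_2)^2\tan^2\frac{\theta}{2}\right\}. \tag{14.74}$$

² Using the relation $(p'e) = (pe')$.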
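Expressing $F_{1,2}$ in terms of the Sachs form factors, equation (14.44), we finally obtain:

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_0\times\frac{E_{e'}}{E_e}\left[\frac{G_E^2 + \tau G_M^2}{1+\tau} + 2\tau\,G_M^2\tan^2\frac{\theta}{2}\right], \tag{14.75}$$

where, as before, $\tau = -\frac{q^2}{4M^2}$ and:

$$\left(\frac{d\sigma}{d\Omega}\right)_0 = \frac{\alpha^2\cos^2\frac{\theta}{2}}{4E_e^2\sin^4\frac{\theta}{2}}, \tag{14.76}$$

is the Mott cross section.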

The Sachs form factors are convenient since equation (14.75) does not
contain interference terms. The combinations G2E +τ G2M and G2M are obtained
separately from the angular dependence of the cross section at ﬁxed q 2 . The
relative sign of GE,M is obtained from the cross section on polarised protons.
The neutron form factor is obtained by subtraction from scattering on
deuterium.
The measurement of the proton form factor by Hofstadter and collabora-
tors at the end of the 1950s provided an important test of the non-pointlike,
and presumably non-elementary, nature of the proton and, more generally, of
other hadronic particles.
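The equivalence of the two forms of the Rosenbluth formula, (14.74) and (14.75), can be verified numerically. The sketch below is illustrative: the function names are assumptions, the electron is taken as massless, and the form factor values are arbitrary numbers chosen for the test, not measured quantities.

```python
import math

def elastic_kinematics(e_in, proton_mass, theta):
    """Return (E_e', tau) with tau = -q^2 / (4 M^2), proton at rest."""
    s2 = math.sin(theta / 2.0) ** 2
    e_out = e_in / (1.0 + (2.0 * e_in / proton_mass) * s2)
    q2 = -4.0 * e_in * e_out * s2
    return e_out, -q2 / (4.0 * proton_mass**2)

def mott_point(alpha, e_in, theta):
    """Equation (14.76)."""
    half = theta / 2.0
    return alpha**2 * math.cos(half) ** 2 / (4.0 * e_in**2 * math.sin(half) ** 4)

def rosenbluth_dirac_pauli(alpha, e_in, proton_mass, theta, f1, f2):
    """Equation (14.74), written with tau = -q^2 / (4 M^2)."""
    e_out, tau = elastic_kinematics(e_in, proton_mass, theta)
    t2 = math.tan(theta / 2.0) ** 2
    bracket = f1**2 + tau * f2**2 + 2.0 * tau * (f1 + f2) ** 2 * t2
    return mott_point(alpha, e_in, theta) * (e_out / e_in) * bracket

def rosenbluth_sachs(alpha, e_in, proton_mass, theta, g_e, g_m):
    """Equation (14.75)."""
    e_out, tau = elastic_kinematics(e_in, proton_mass, theta)
    t2 = math.tan(theta / 2.0) ** 2
    bracket = (g_e**2 + tau * g_m**2) / (1.0 + tau) + 2.0 * tau * g_m**2 * t2
    return mott_point(alpha, e_in, theta) * (e_out / e_in) * bracket

alpha, e_in, proton_mass, theta = 1.0 / 137.0, 2.0, 0.938, math.radians(35.0)
f1, f2 = 0.6, 0.9                      # arbitrary illustrative values
_, tau = elastic_kinematics(e_in, proton_mass, theta)
g_e, g_m = f1 - tau * f2, f1 + f2      # Sachs combinations, equation (14.44)

sigma_f = rosenbluth_dirac_pauli(alpha, e_in, proton_mass, theta, f1, f2)
sigma_g = rosenbluth_sachs(alpha, e_in, proton_mass, theta, g_e, g_m)
print(sigma_f, sigma_g)
```

The agreement reflects the identity $(G_E^2 + \tau G_M^2)/(1+\tau) = F_1^2 + \tau F_2^2$, in which the cross term $F_1F_2$ cancels.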

14.4 COMPTON SCATTERING

We consider the process

$$e + \gamma \to e' + \gamma' \tag{14.77}$$

which to the lowest order of perturbation theory is described by Feynman diagrams (1) and (2) in Fig. 14.3.

Figure 14.3 Feynman diagrams describing the amplitude of e + γ → e′ + γ′ to order α.

To write down the corresponding amplitudes it is useful to introduce the Mandelstam variables³

$$s = (p + k)^2 = (p' + k')^2 = E_{cm}^2, \tag{14.78}$$

$$t = (p' - p)^2 = (k - k')^2, \tag{14.79}$$

$$u = (p - k')^2 = (k - p')^2, \tag{14.80}$$

where $E_{cm}$ is the total energy of the system in the centre of mass frame, defined by the relation $\mathbf{p} + \mathbf{k} = 0$, and $p$ and $k$ are the respective momenta of the electron and photon present in the initial state.
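The cross section for the process is proportional to the squared modulus of the sum of the amplitudes associated with diagrams (1) and (2), averaged over the spin states of the particles in the initial state and summed over the spin states of the particles in the final state:

$$\sigma \propto |\mathcal{M}_1 + \mathcal{M}_2|^2 = |\mathcal{M}_1|^2 + |\mathcal{M}_2|^2 + \mathcal{M}_1\mathcal{M}_2^* + \mathcal{M}_1^*\mathcal{M}_2. \tag{14.81}$$

The amplitudes of diagrams (1) and (2) are easily computed to be:

$$\mathcal{M}_1 = \bar u'\,\epsilon'^{*}_{\mu}\,(ie\gamma^\mu)\; i\,\frac{\not{p} + \not{k} + m}{(p+k)^2 - m^2}\;(ie\gamma^\nu)\,\epsilon_\nu\,u, \tag{14.82}$$

and

$$\mathcal{M}_2 = \bar u'\,\epsilon_\mu\,(ie\gamma^\mu)\; i\,\frac{\not{p} - \not{k}' + m}{(p-k')^2 - m^2}\;(ie\gamma^\nu)\,\epsilon'^{*}_{\nu}\,u, \tag{14.83}$$

where $u = u_s(p)$ and $u' = u_{s'}(p')$ are the four-spinors associated with the electrons $e$ and $e'$, and $\epsilon^\mu = \epsilon^\mu_r(k)$ and $\epsilon'^\mu = \epsilon^\mu_{r'}(k')$ are the polarisation vectors of the photons $\gamma$ and $\gamma'$.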
Before going into the detailed calculations, we comment on some properties
of amplitudes with external photons implied by invariance under gauge trans-
formations, which are useful for the calculation of the sum over polarization
states of the photons.

Implications of Gauge Invariance. The amplitude of any process which involves a photon with polarisation $r$ in the initial or final state can be put in the form

$$\mathcal{M} = \epsilon_{r,\mu}\,\mathcal{M}^\mu. \tag{14.84}$$

The polarisation vector $\epsilon_r$ depends on the gauge, as is easily seen by considering the transformation

$$A_\mu(x) \to A'_\mu(x) = A_\mu(x) + \partial_\mu\Lambda(x), \tag{14.85}$$

where

$$A_\mu(x) = \epsilon_{r,\mu}\,e^{ikx}. \tag{14.86}$$

Choosing $\Lambda(x) = Ce^{ikx}$ we find

$$A'_\mu(x) = (\epsilon_{r,\mu} + iCk_\mu)\,e^{ikx} = \epsilon'_{r,\mu}\,e^{ikx}, \tag{14.87}$$

i.e. the gauge transformation changes the polarisation vector $\epsilon_r$ into $\epsilon'_r$.

³ For completeness, we also give the definition of the variable t, which will not be used in the calculations discussed in this section.

The condition that the amplitude (14.84) remains invariant under the transformation of the polarisation vector is

$$\epsilon_{r,\mu}\,\mathcal{M}^\mu = \epsilon'_{r,\mu}\,\mathcal{M}^\mu = (\epsilon_{r,\mu} + iCk_\mu)\,\mathcal{M}^\mu, \tag{14.88}$$

that is:

$$k_\mu\mathcal{M}^\mu = 0. \tag{14.89}$$

This result allows us to carry out the summation

$$\sum_r|\mathcal{M}|^2 = (\mathcal{M}^\mu)^*\mathcal{M}^\nu\sum_{r=1}^{2}\epsilon^*_{r,\mu}\,\epsilon_{r,\nu}, \tag{14.90}$$

where (see the discussion of Chapter 5)

$$\sum_{r=1}^{2}\epsilon^*_{r,\mu}\,\epsilon_{r,\nu} = -g_{\mu\nu} - \frac{k_\mu k_\nu - (kn)(k_\mu n_\nu + k_\nu n_\mu)}{(kn)^2}, \tag{14.91}$$

with $n^\mu \equiv (1, 0, 0, 0)$. From (14.89) it follows that

$$\sum_{r=1}^{2}|\mathcal{M}|^2 = (\mathcal{M}^\mu)^*\mathcal{M}^\nu\sum_{r=1}^{2}\epsilon^*_{r,\mu}\,\epsilon_{r,\nu} = -(\mathcal{M}^\mu)^*\mathcal{M}^\nu g_{\mu\nu} = -(\mathcal{M}^\mu)^*\mathcal{M}_\mu. \tag{14.92}$$

The result is that:

• when multiplied by gauge invariant amplitudes, the sum over polarizations of the photon can be replaced by $-g_{\mu\nu}$.

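Calculation of $|\mathcal{M}_1|^2$ and $|\mathcal{M}_2|^2$. To sum over the spins of the electrons, we use the relation (6.84), Section 6.1.4:

$$\sum_s u_s(p)\,\bar u_s(p) = \frac{\not{p} + m}{2m}, \tag{14.93}$$

where $m$ is the mass of the electron. In this way we obtain:

$$\begin{aligned}|\mathcal{M}_1|^2 &= \frac{e^4}{(s-m^2)^2}\,\frac{1}{4}\sum_{s,s'}\bar u'\gamma^\mu(\not{p}+\not{k}+m)\gamma^\nu u\;\bar u\gamma_\nu(\not{p}+\not{k}+m)\gamma_\mu u'\\ &= \frac{e^4}{(s-m^2)^2}\,\frac{1}{16m^2}\,\mathrm{Tr}\left[(\not{p}'+m)\gamma^\mu(\not{p}+\not{k}+m)\gamma^\nu(\not{p}+m)\gamma_\nu(\not{p}+\not{k}+m)\gamma_\mu\right].\end{aligned}\tag{14.94}$$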

To calculate the trace on the right-hand side of equation (14.94), we apply the rules summarised in Section 6.3.
First, using the invariance property of the trace under cyclic permutations of the arguments, we can move the $\gamma_\mu$ matrix to the left of the factor $(\not{p}' + m)$. Then exploiting the relation, valid for any four-vector $a$,

$$\gamma_\mu\not{a}\gamma^\mu = a_\rho\gamma_\mu\gamma^\rho\gamma^\mu = a_\rho\gamma_\mu(2g^{\rho\mu} - \gamma^\mu\gamma^\rho) = -2\not{a}, \tag{14.95}$$

we can put the trace to be calculated in the form

$$\mathrm{Tr}\left[(4m - 2\not{p}')(\not{p}+\not{k}+m)(4m - 2\not{p})(\not{p}+\not{k}+m)\right] = A + B + C + D, \tag{14.96}$$

with

$$A = 16m^2\,\mathrm{Tr}\left[(\not{p}+\not{k}+m)(\not{p}+\not{k}+m)\right] \tag{14.97}$$

$$B = 4\,\mathrm{Tr}\left[\not{p}'(\not{p}+\not{k}+m)\not{p}(\not{p}+\not{k}+m)\right] \tag{14.98}$$

$$C = -8m\,\mathrm{Tr}\left[\not{p}'(\not{p}+\not{k}+m)(\not{p}+\not{k}+m)\right] \tag{14.99}$$

$$D = -8m\,\mathrm{Tr}\left[(\not{p}+\not{k}+m)\not{p}(\not{p}+\not{k}+m)\right]. \tag{14.100}$$

Defining $\tilde s = p + k = p' + k'$, from which it follows that $\tilde s^2 = s$, and recalling that the trace of the product of an odd number of γ-matrices vanishes, we can immediately calculate the term $A$, with the result

$$A = 16m^2\,\mathrm{Tr}\,(\not{\tilde s}\not{\tilde s} + m^2) = 64m^2(s + m^2), \tag{14.101}$$

which is easily obtained from the identity, valid for any two 4-vectors $a$ and $b$:

$$\mathrm{Tr}\,\not{a}\not{b} = a_\rho b_\sigma\,\mathrm{Tr}\,(\gamma^\rho\gamma^\sigma) = 4a_\rho b_\sigma g^{\rho\sigma} = 4(ab). \tag{14.102}$$

The contribution of the terms $C$ and $D$ is obtained in a similar manner. For example, for $C$ we find the expression

$$C = -8m\,\mathrm{Tr}\left[\not{p}'(\not{\tilde s}+m)(\not{\tilde s}+m)\right] = -16m^2\,\mathrm{Tr}\,\not{p}'\not{\tilde s} = -64m^2(p'\tilde s), \tag{14.103}$$

which substituting

$$(p'\tilde s) = p'(p' + k') = m^2 + (p'k'), \tag{14.104}$$

with $(p'k') = (s - m^2)/2$ becomes

$$C = -32m^2(s + m^2). \tag{14.105}$$

The calculation of term $D$ is similar and the result is the same

$$D = -64m^2(p\tilde s) = -32m^2(s + m^2). \tag{14.106}$$

Summing (14.101), (14.105) and (14.106) we find

$$A + C + D = 0, \tag{14.107}$$

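from which it follows that the only term which contributes to the trace (14.96) is $B$, which we can rewrite as

$$B = 4\,\mathrm{Tr}\left(\not{p}'\not{\tilde s}\not{p}\not{\tilde s} + m^2\not{p}'\not{p}\right). \tag{14.108}$$

The calculation is carried out by using the relation

$$\begin{aligned}\mathrm{Tr}\,(\not{a}\not{b}\not{c}\not{d}) &= a_\lambda b_\mu c_\nu d_\rho\,\mathrm{Tr}\,(\gamma^\lambda\gamma^\mu\gamma^\nu\gamma^\rho)\\ &= 4a_\lambda b_\mu c_\nu d_\rho\left(g^{\lambda\mu}g^{\nu\rho} + g^{\lambda\rho}g^{\mu\nu} - g^{\lambda\nu}g^{\mu\rho}\right)\\ &= 4\left[(ab)(cd) + (ad)(bc) - (ac)(bd)\right].\end{aligned}\tag{14.109}$$

The result is

$$\begin{aligned}B &= 16\left[2(p'\tilde s)(p\tilde s) - (pp')s + m^2(pp')\right]\\ &= 32\left\{(pk)(pk') + m^2\left[(pp') + (p'k)\right]\right\},\end{aligned}\tag{14.110}$$

which can be written in terms of the variables $s$ and $u$ by noting that

$$(pk') = \frac{m^2 - u}{2},\qquad \left[(pp') + (p'k)\right] = \frac{s + m^2}{2}. \tag{14.111}$$

Finally we find

$$B = 8\left[4m^4 + (s - m^2)(m^2 - u) + 2m^2(s - m^2)\right]. \tag{14.112}$$

From equations (14.94) and (14.96) we find, in conclusion:

$$|\mathcal{M}_1|^2 = \frac{1}{4m^2}\,\frac{2e^4}{(s-m^2)^2}\left[4m^4 - (s-m^2)(u-m^2) + 2m^2(s-m^2)\right]. \tag{14.113}$$

The expression for $|\mathcal{M}_2|^2$, see equation (14.83), is obtained from (14.113) with the substitution $s \leftrightarrow u$:

$$|\mathcal{M}_2|^2 = \frac{1}{4m^2}\,\frac{2e^4}{(u-m^2)^2}\left[4m^4 - (u-m^2)(s-m^2) + 2m^2(u-m^2)\right]. \tag{14.114}$$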

Interference. The calculation of the interference terms is carried out in a similar way. We obtain

$$\mathcal{M}_1\mathcal{M}_2^* + \mathcal{M}_1^*\mathcal{M}_2 = \frac{1}{4m^2}\,\frac{4e^4}{(s-m^2)(u-m^2)}\left[4m^4 + m^2(s-m^2) + m^2(u-m^2)\right]. \tag{14.115}$$

Summing Up. Putting together the results (14.113), (14.114) and (14.115) we find, finally:

$$|\mathcal{M}_1 + \mathcal{M}_2|^2 = \frac{2e^4}{m^2}\left\{\left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right)^2 + \left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right) - \frac{1}{4}\left(\frac{u-m^2}{s-m^2} + \frac{s-m^2}{u-m^2}\right)\right\}. \tag{14.116}$$
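As a cross-check of the algebra, the sum of the three contributions (14.113), (14.114) and (14.115) can be compared numerically with the compact form (14.116). The sketch below sets $e = 1$ and uses arbitrary values of $s$ and $u$. The function names are illustrative assumptions, and since the relation is an algebraic identity the chosen numbers need not correspond to physical kinematics.

```python
import math

def squared_direct(s, u, m):
    """|M1|^2 of (14.113) with e = 1."""
    x, y = s - m**2, u - m**2
    return (1.0 / (4.0 * m**2)) * (2.0 / x**2) * (4.0 * m**4 - x * y + 2.0 * m**2 * x)

def squared_crossed(s, u, m):
    """|M2|^2 of (14.114): the direct term with s and u exchanged."""
    return squared_direct(u, s, m)

def interference(s, u, m):
    """M1 M2* + M1* M2 of (14.115) with e = 1."""
    x, y = s - m**2, u - m**2
    return (1.0 / (4.0 * m**2)) * (4.0 / (x * y)) * (4.0 * m**4 + m**2 * x + m**2 * y)

def squared_total(s, u, m):
    """Compact form (14.116) with e = 1."""
    a = m**2 / (s - m**2) + m**2 / (u - m**2)
    ratio = (u - m**2) / (s - m**2)
    return (2.0 / m**2) * (a**2 + a - 0.25 * (ratio + 1.0 / ratio))

m, s, u = 1.0, 2.4, 0.2   # arbitrary illustrative values
pieces = squared_direct(s, u, m) + squared_crossed(s, u, m) + interference(s, u, m)
compact = squared_total(s, u, m)
print(pieces, compact)
```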

Invariance under Gauge Transformations. As we saw in the previous sec-

tion, the gauge transformation (14.85) does not leave the polarisation vectors
of the electromagnetic ﬁeld unchanged. We can write the total scattering am-
plitude, M1 + M2 , in the form4

M1 + M2 = μ ν (Mμν μν
1 + M2 ) , (14.117)

and we see from equation (14.88) that it must satisfy the condition

kμ kν (Mμν μν
1 + M2 ) = 0 . (14.118)

We consider the ﬁrst term on the left-hand side of (14.118), which we can
rewrite using the deﬁnitions (14.82) and (14.83), with k 2 = 0 and p2 = m2 ,

ū k/ (p/ + k/ + m)k/u

kμ kν Mμν
1 = . (14.119)
2(pk)

The numerator of this expression can be put in a very simple form by

noting that

(p/ + k/ + m)k/u = (p/ + m)k/u = [2(pk) − k/(p/ − m)]u = 2(pk)u , (14.120)

from which it follows that

kμ kν Mμν
1 = ū k
/u. (14.121)

The second term, thanks to the relation p − k = p − k, can be put in the

form
ū k/(p/ − k/ + m)k/ u
kμ kν Mμν
2 =− . (14.122)
2(p k)
Since
\[ -\bar{u}'\,\not{k}(\not{p}' - \not{k} + m) = -\bar{u}'\,\not{k}(\not{p}' + m) = -\bar{u}'\left[2(p'k) - (\not{p}' - m)\not{k}\right] = -2(p'k)\,\bar{u}' , \tag{14.123} \]
we obtain
\[ k'_\mu k_\nu M_2^{\mu\nu} = -\bar{u}'\,\not{k}'\,u = -k'_\mu k_\nu M_1^{\mu\nu} , \tag{14.124} \]
confirming that the scattering amplitude for $e + \gamma \to e + \gamma$ is invariant under
gauge transformations. We note that only the sum $M_1 + M_2$ is invariant, while
the amplitudes corresponding to the two processes illustrated in the Feynman
diagrams of the figure, considered separately, are not.
4 Without loss of generality, we may choose real polarisation vectors.

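Klein–Nishina Cross Section. Now we will use the result (14.116) to obtain
the cross section in the laboratory frame, defined by the relations

\[ p \equiv (m, \mathbf{0}), \quad p' \equiv (E', \mathbf{p}') , \]
\[ k \equiv (\omega, \mathbf{k}), \quad k' \equiv (\omega', \mathbf{k}') , \]
with $|\mathbf{k}| = \omega$ and $|\mathbf{k}'| = \omega'$, from which it follows that

\[ s = (p + k)^2 = m^2 + 2m\omega, \quad u = (p - k')^2 = m^2 - 2m\omega' . \tag{14.125} \]

Furthermore, $(p - p')^2 = (k' - k)^2$ implies that

\[ 1 - \cos\theta = m\left(\frac{1}{\omega'} - \frac{1}{\omega}\right) , \tag{14.126} \]

where $\theta$ is the angle between the vectors $\mathbf{k}$ and $\mathbf{k}'$.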

Substituting equations (14.125) and (14.126) into (14.116), we obtain

\[ |M_1 + M_2|^2 = \frac{1}{4m^2}\,2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right] . \tag{14.127} \]
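The cross section is defined as
\[ d\sigma = \frac{1}{F}\,\frac{W}{T}\,\frac{V\,d^3p'}{(2\pi)^3}\,\frac{V\,d^3k'}{(2\pi)^3} , \tag{14.128} \]
where the flux of incident photons, $F$, is
\[ F = \frac{1}{V}\,\frac{|\mathbf{k}|}{\omega} = \frac{1}{V} , \tag{14.129} \]
while the transition probability per unit time is
\[ \frac{W}{T} = \frac{1}{T}\,VT\,(2\pi)^4\,\delta^{(4)}(p + k - p' - k') \times \frac{1}{2V\omega}\,\frac{1}{2V\omega'}\,\frac{m}{Vm}\,\frac{m}{VE'}\,|M_1 + M_2|^2 . \tag{14.130} \]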
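After substituting equations (14.127), (14.129) and (14.130) into (14.128),
we may carry out the integration over $\mathbf{p}'$ exploiting the δ-function associated
with conservation of momentum, to obtain

\[ d\sigma = \frac{1}{64\pi^2}\,\frac{1}{\omega\omega' m E'}\,2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right]\delta(m + \omega - E' - \omega')\,d^3k' , \tag{14.131} \]
or, recalling that $d^3k' = d\Omega\,\omega'^2\,d\omega'$, where $\Omega$ is the solid angle which identifies
the direction of the vector $\mathbf{k}'$,

\[ \frac{d\sigma}{d\Omega} = \frac{e^4}{32\pi^2}\,\frac{\omega'}{\omega}\,\frac{1}{mE'}\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right]\delta(m + \omega - E' - \omega')\,d\omega' . \tag{14.132} \]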

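The integration over $\omega'$ with the δ-function is carried out using the rule

\[ \delta[F(\omega')] = \left|\frac{dF}{d\omega'}\right|^{-1}\delta(\omega' - \omega_0) , \tag{14.133} \]

with $F(\omega_0) = 0$. Recalling that $E' = \sqrt{m^2 + |\mathbf{k} - \mathbf{k}'|^2}$, we find

\[ \left|\frac{d}{d\omega'}(m + \omega - E' - \omega')\right| = \frac{1}{E'}(\omega' - \omega\cos\theta) + 1 = \frac{(p'k')}{E'\omega'} = \frac{(pk)}{E'\omega'} = \frac{m\omega}{E'\omega'} , \tag{14.134} \]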
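and therefore

\[ \frac{d\sigma}{d\Omega} = \frac{e^4}{32\pi^2}\,\frac{\omega'}{\omega}\,\frac{1}{mE'}\left(\frac{m\omega}{E'\omega'}\right)^{-1}\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right] , \tag{14.135} \]
or
\[ \frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right] . \tag{14.136} \]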
The cross section (14.136) was obtained by Klein and Nishina in 1929 and
it provides an accurate description of the Compton effect, observed
experimentally for the first time in 1923.
We now consider equation (14.136) in the non-relativistic limit: $\omega/m \ll 1$.
Solving equation (14.126) for $\omega'$ we obtain the relation
\[ \omega' = \frac{\omega}{1 + \frac{\omega}{m}(1 - \cos\theta)} , \tag{14.137} \]

which shows that, in the limit which interests us here, $\omega'/\omega \to 1$ and equation
(14.136) becomes
\[ \frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2m^2}(1 + \cos^2\theta) . \tag{14.138} \]
To obtain the total cross section we carry out the angular integral

\[ \sigma = 2\pi\int d\cos\theta\,\frac{\alpha^2}{2m^2}(1 + \cos^2\theta) = \frac{8\pi}{3}\,\frac{\alpha^2}{m^2} . \tag{14.139} \]

Equation (14.139) is the Thomson cross section, which describes the inter-
action of the classical electromagnetic ﬁeld with an electron.
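The approach to the Thomson limit can also be seen numerically. The following is a minimal sketch, not part of the original derivation: it integrates (14.136), with $\omega'$ taken from (14.137), over the scattering angle using Simpson's rule, in units with $m = 1$. The function names and the numerical value used for $\alpha$ are assumptions made for the illustration.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (assumed numerical value)

def omega_out(omega, cos_theta, m=1.0):
    """Scattered photon energy, eq. (14.137)."""
    return omega / (1.0 + omega / m * (1.0 - cos_theta))

def klein_nishina(omega, cos_theta, m=1.0, alpha=ALPHA):
    """Differential cross section d(sigma)/d(Omega), eq. (14.136)."""
    r = omega_out(omega, cos_theta, m) / omega
    sin2 = 1.0 - cos_theta**2
    return alpha**2 / (2.0 * m**2) * r**2 * (r + 1.0 / r - sin2)

def total_cross_section(omega, m=1.0, alpha=ALPHA, n=2000):
    """Integrate over cos(theta) in [-1, 1] with Simpson's rule (n even)."""
    h = 2.0 / n
    acc = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        acc += weight * klein_nishina(omega, -1.0 + i * h, m, alpha)
    return 2.0 * math.pi * acc * h / 3.0

thomson = 8.0 * math.pi / 3.0 * ALPHA**2  # eq. (14.139) with m = 1

for omega in (1e-6, 0.1, 1.0):
    print(omega, total_cross_section(omega) / thomson)
```

The printed ratio tends to 1 for $\omega/m \to 0$ and decreases as the photon energy becomes comparable to the electron mass.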

14.5 INVERSE COMPTON SCATTERING

In the previous section we considered the photon-electron scattering cross
section in the laboratory frame, in which the electron is at rest. The expression
we obtained for the squared modulus of the transition amplitude is relativistically
invariant, however, and can also be used to describe collisions involving
electrons in flight, in which case the kinematical variables are defined as

\[ p \equiv (E, \mathbf{p}), \quad p' \equiv (E', \mathbf{p}') , \tag{14.140} \]

\[ k \equiv (\omega, \mathbf{k}), \quad k' \equiv (\omega', \mathbf{k}') . \tag{14.141} \]

We will now consider relativistic electrons, with $E \gg m$, and see how
in these conditions it is possible to obtain a final state photon with energy
$\omega' \gg \omega$. From conservation of total four-momentum, which implies

\[ (pk) = (p'k') = (p + k - k')k' = (pk') + (kk') , \tag{14.142} \]

we obtain the relation

\[ E\omega - \mathbf{p}\cdot\mathbf{k} = E\omega' - \mathbf{p}\cdot\mathbf{k}' + \omega\omega'(1 - \cos\theta) , \tag{14.143} \]

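that we can rewrite as

\[ \omega(1 - \beta\cos\phi) = \omega'(1 - \beta\cos\phi') + \frac{\omega\omega'}{E}(1 - \cos\theta) , \tag{14.144} \]
where $\phi$ and $\phi'$ are the angles contained between the direction of $\mathbf{p}$ and,
respectively, those of $\mathbf{k}$ and $\mathbf{k}'$, and $\beta = |\mathbf{p}|/E$ is the electron velocity (in
units with $c = 1$).
Solving equation (14.144) for $\omega'$, we obtain
\[ \omega' = \omega\,\frac{1 - \beta\cos\phi}{1 - \beta\cos\phi' + \frac{\omega}{E}(1 - \cos\theta)} , \tag{14.145} \]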

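which obviously reduces to equation (14.137) in the limit $\beta \to 0$. For relativistic
electrons $\beta \approx 1$ and we can use the expansion, valid for $1/\gamma^2 \ll 1$,

\[ \beta = \frac{|\mathbf{p}|}{E} = \sqrt{\frac{E^2 - m^2}{E^2}} = \sqrt{1 - \frac{1}{\gamma^2}} \approx 1 - \frac{1}{2\gamma^2} , \tag{14.146} \]

where $\gamma = E/m = (1 - \beta^2)^{-1/2}$ is the Lorentz factor.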

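In this regime, the energy of the final state photon has a maximum for
$\phi = \pi$ and $\phi' = 0$, implying $\theta = \phi - \phi' = \pi$:

\[ \omega'_{\max} = \omega\,\frac{1 + \beta}{1 - \beta + \frac{2\omega}{E}} \approx \omega\,\frac{2}{\frac{m^2}{2E^2} + \frac{2\omega}{E}} = E\,\frac{z}{1 + z} , \tag{14.147} \]

with
\[ z = \frac{4\omega E}{m^2} . \tag{14.148} \]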
Now we consider the case in which $\mathbf{p}$ and $\mathbf{k}$ are still oppositely directed,
i.e. $\phi = \pi$, but the momenta of the particles in the final state are such that
$\phi' \approx 0$ and $\theta \approx \pi$. In these conditions equation (14.145) can be rewritten in
the form
\[ \omega' \approx E\,\frac{z}{1 + x + z} , \tag{14.149} \]
with
\[ x = \frac{E^2}{m^2}\,\phi'^2 . \tag{14.150} \]

Figure 14.4 Energy distribution of photons produced by Compton scattering of
relativistic electrons.

In Compton scattering of electrons produced by a particle accelerator on
photons with energies of order 1 eV, obtained with a laser, it is possible to
produce photons with maximum energies of order GeV. The corresponding
differential cross section shown in Fig. 14.4 exhibits a peak at $\omega' \approx \omega'_{\max}$
whose width decreases with the increase of $E$ and $\omega$.

This technique is used for the production of beams of photons for nuclear
physics experiments. The ﬁrst beam of this type, of energy ∼80 MeV, was
obtained at the end of the 1970s at the Frascati Laboratory, using electrons
from the Adone storage ring with energy E = 1.5 GeV and photons of energy
ω = 2.45 eV. A photon beam obtained in this way possesses the important
property of having a high degree of polarisation. This is a consequence of
the fact that for relativistic electrons helicity is a good quantum number.
Therefore, in the scattering process there is very little transfer of angular
momentum, and the polarisation of the scattered photon is very close to that
of the incoming laser light.

Figure 14.5 Feynman diagrams describing the amplitude of γ + γ → e+ + e− to
order α.
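As a rough numerical illustration of (14.145) and (14.147), the sketch below evaluates the maximum photon energy for the Adone parameters quoted above (E = 1.5 GeV, ω = 2.45 eV). The function names and the value used for the electron mass are assumptions of the sketch; all energies are in eV.

```python
import math

M_E = 0.51099895e6  # electron mass in eV (assumed numerical value)

def scattered_energy(omega, e_beam, cos_phi, cos_phi_out, cos_theta, m=M_E):
    """Energy of the scattered photon, eq. (14.145)."""
    beta = math.sqrt(1.0 - (m / e_beam) ** 2)
    numerator = 1.0 - beta * cos_phi
    denominator = 1.0 - beta * cos_phi_out + omega / e_beam * (1.0 - cos_theta)
    return omega * numerator / denominator

def max_energy_approx(omega, e_beam, m=M_E):
    """Relativistic approximation E z / (1 + z), eqs. (14.147)-(14.148)."""
    z = 4.0 * omega * e_beam / m**2
    return e_beam * z / (1.0 + z)

# Head-on collision (phi = pi) with back-scattered photon (phi' = 0, theta = pi).
exact = scattered_energy(2.45, 1.5e9, cos_phi=-1.0, cos_phi_out=1.0, cos_theta=-1.0)
approx = max_energy_approx(2.45, 1.5e9)
print(exact / 1e6, approx / 1e6)  # both in MeV
```

With these inputs both expressions give a maximum photon energy close to the ∼80 MeV quoted in the text.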

14.6 THE PROCESSES γγ → e+ e− AND e+ e− → γγ

The discussion of photon-electron scattering in Section 14.4 can be easily
generalised to describe the process

\[ \gamma_1 + \gamma_2 \to e^- + e^+ , \tag{14.151} \]

in which the collision between the two photons results in the creation of an
electron-positron pair. Comparing the Feynman diagrams which describe this
process to order $\alpha^2$, represented in Fig. 14.5, to those for Compton scattering,
Fig. 14.3, one sees that the only difference consists in the substitution

\[ k \to k_1 , \quad k' \to -k_2 , \quad p \to -p_{e^-} , \quad p' \to p_{e^+} . \tag{14.152} \]

From this it follows that the expression for $|M_1 + M_2|^2$ written in terms
of Mandelstam variables $s$, $t$ and $u$, equation (14.116), can also be used to
obtain the transition probability for the process (14.151) by setting

\[ s = (k_1 - p_{e^-})^2 = (p_{e^+} - k_2)^2 , \tag{14.153} \]

\[ t = (p_{e^+} + p_{e^-})^2 = (k_1 + k_2)^2 , \tag{14.154} \]

\[ u = (p_{e^-} - k_2)^2 = (k_1 - p_{e^+})^2 , \tag{14.155} \]
where $k_1 \equiv (\omega_1, \mathbf{k}_1)$, $k_2 \equiv (\omega_2, \mathbf{k}_2)$, $p_{e^+} \equiv (E_{e^+}, \mathbf{k}_{e^+})$ and $p_{e^-} \equiv (E_{e^-}, \mathbf{k}_{e^-})$
represent, respectively, the 4-momenta of the particles in the initial and final
states.

We note that, compared to the case of Compton scattering, the roles of
the variables $s$ and $t$ are exchanged. One says therefore that the two processes
(14.77) and (14.151) correspond to crossed channels of the same reaction.
The calculation of the differential cross section, which can easily be carried
out starting from equation (14.116), gives the result
\[ \frac{d\sigma}{ds} = 8\pi\,\frac{\alpha^2}{t^2}\left\{\left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right)^2 + \left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right) - \frac{1}{4}\left(\frac{u-m^2}{s-m^2} + \frac{s-m^2}{u-m^2}\right)\right\} . \tag{14.156} \]

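In the centre of mass frame, in which $\mathbf{p}_{e^+} = -\mathbf{p}_{e^-}$, $\mathbf{k}_1 = -\mathbf{k}_2$, $\omega_1 = \omega_2 = \omega$
and $E_{e^+} = E_{e^-} = \omega$, the total cross section, which is obtained by integration
of (14.156), has the form

\[ \sigma = \frac{\pi\alpha^2}{2m^2}\,(1 - \beta^2)\left[(3 - \beta^4)\ln\frac{1+\beta}{1-\beta} - 2\beta(2 - \beta^2)\right] , \tag{14.157} \]

where
\[ \beta = \left(1 - \frac{m^2}{\omega^2}\right)^{1/2} , \tag{14.158} \]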
and, obviously, the energy of the photons must satisfy the condition $\omega > m$.
Equation (14.157) can easily be generalised to the case of any reference
frame in which the photons collide head-on, moving in opposite directions
along the same line with energies $\omega_1$ and $\omega_2$, by making the substitution
\[ \beta \to \left(1 - \frac{m^2}{\omega_1\omega_2}\right)^{1/2} , \tag{14.159} \]

which implies the threshold condition

\[ \omega_1 \geq \frac{m^2}{\omega_2} . \tag{14.160} \]

The cross section for the process (14.151) applies to the collisions of high
energy photons in cosmic rays with the thermal background radiation (cosmic
microwave background, CMB) which permeates the universe. The mean free
path of a photon with energy $\omega_1$ can be estimated from the relation
\[ \lambda(\omega_1) = \frac{1}{\sigma\rho_\gamma} , \tag{14.161} \]

where $\rho_\gamma$ is the density of thermal photons, of energy $\omega_2$, and $\sigma$ is the total
cross section defined by equations (14.157) and (14.159).

Comment. Setting $\omega_2 \approx 6 \times 10^{-4}$ eV, the value which is obtained from the
spectral distribution of a black body at a temperature of T = 2.7 K, and using
the result of recent measurements which give $\rho_\gamma = 410\ \mathrm{cm^{-3}}$, a threshold
energy of $\omega_1 \approx 4 \times 10^{14}$ eV = 400 TeV is obtained. At energies just above
the threshold the cross section has a value $\sigma \approx 10^{-26}\ \mathrm{cm^2}$, and the mean free
path of λ ≈ 31.5 kpc turns out to be comparable with the dimensions of our
galaxy.

We conclude this section by noting that, as well as photon-electron scattering
(14.77) and $e^+e^-$ pair production (14.151), there is another process
described by the transition probability (14.116), which is annihilation:

\[ e^+ + e^- \to \gamma_1 + \gamma_2 . \tag{14.162} \]

The cross section calculation is carried out using the Mandelstam variables

\[ s = (p_{e^-} - k_1)^2 = (k_2 - p_{e^+})^2 , \tag{14.163} \]

\[ t = (p_{e^+} + p_{e^-})^2 = (k_1 + k_2)^2 , \tag{14.164} \]
\[ u = (k_2 - p_{e^-})^2 = (p_{e^+} - k_1)^2 , \tag{14.165} \]
which are obtained immediately from (14.155) by changing the signs of all the
four-momenta.
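In the centre of mass frame the resulting cross section has the form

\[ \sigma = \frac{\pi\alpha^2}{m^2(\lambda + 1)}\left[\frac{\lambda^2 + 4\lambda + 1}{\lambda^2 - 1}\,\ln\left(\lambda + \sqrt{\lambda^2 - 1}\right) - \frac{\lambda + 3}{\sqrt{\lambda^2 - 1}}\right] , \tag{14.166} \]
with $\lambda = E_{cm}/m$. In the non-relativistic limit equation (14.166) becomes

\[ \sigma \approx \frac{1}{v_{rel}}\,\frac{\pi\alpha^2}{m^2} , \tag{14.167} \]
where $v_{rel} = 2\lambda\sqrt{\lambda^2 - 1}$ is the relative velocity of the electron–positron
system.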
From the cross section of the process (14.162) the mean lifetime of
para-positronium (i.e. the bound state of an electron and positron with spin zero)
can be obtained, for which the decay into two photons dominates⁵ (cf. Chapter 12).
The decay rate, the inverse of the mean lifetime, is obtained from the product
of the annihilation cross section and the flux, equal to $v_{rel}|\psi(0)|^2$, where $\psi(r)$
is the normalised wave function of the positronium ground state. Taking into account
that only one of the four spin states of the $e^+e^-$ system is available for
annihilation into two photons, we obtain
\[ \Gamma = 4|\psi(0)|^2\,v_{rel}\,\sigma \approx \frac{4\pi\alpha^2}{m^2}\,|\psi(0)|^2 = \frac{1}{2}\,m\alpha^5 , \tag{14.168} \]
which corresponds to a lifetime
\[ \tau = \Gamma^{-1} \approx 1.2 \times 10^{-10}\ \mathrm{s} \tag{14.169} \]
in agreement with the experimental result quoted in Chapter 12.

5 As shown in Chapter 12, the spin one state, orthopositronium, has negative charge
conjugation and cannot annihilate into two photons, resulting in a much longer lifetime.

Figure 14.6 Graphical representation of the amplitude for electron-positron
annihilation into a μ+ μ− pair. The arrows indicate the flux of negative electrical charge,
therefore they are in the opposite direction to the momenta of e+ and μ+ ; $q^\mu$ is the
momentum associated with the line which represents the propagation of the photon;
momentum is conserved at the vertices.
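The number in (14.169) follows from (14.168) once natural units are converted to seconds. The sketch below assumes the hydrogen-like ground-state value $|\psi(0)|^2 = 1/(\pi a^3)$ with $a = 2/(m\alpha)$, the positronium Bohr radius for reduced mass $m/2$; this value is not written out in the text above. The function names and numerical constants are likewise assumptions of the sketch.

```python
import math

ALPHA = 1 / 137.035999         # fine-structure constant (assumed value)
M_E_MEV = 0.51099895           # electron mass in MeV (assumed value)
HBAR_MEV_S = 6.582119569e-22   # hbar in MeV s

def psi0_squared(m=M_E_MEV, alpha=ALPHA):
    """|psi(0)|^2 = 1/(pi a^3) with a = 2/(m alpha), in MeV^3 (assumption)."""
    a = 2.0 / (m * alpha)
    return 1.0 / (math.pi * a**3)

def gamma_from_wavefunction(m=M_E_MEV, alpha=ALPHA):
    """Middle expression of eq. (14.168), in MeV."""
    return 4.0 * math.pi * alpha**2 / m**2 * psi0_squared(m, alpha)

def gamma_closed_form(m=M_E_MEV, alpha=ALPHA):
    """Right-hand side of eq. (14.168): m alpha^5 / 2, in MeV."""
    return 0.5 * m * alpha**5

tau_seconds = HBAR_MEV_S / gamma_closed_form()
print(tau_seconds)
```

The two expressions for Γ agree identically, and the lifetime comes out at about 1.2 × 10⁻¹⁰ s.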

14.7 e+ e− →μ+ μ− ANNIHILATION

We consider the process of annihilation of an electron–positron pair and
creation of a muon–antimuon pair
\[ e^+ e^- \to \mu^+ \mu^- , \tag{14.170} \]
represented schematically by the Feynman diagram of Fig. 14.6.
To obtain the cross section to second order in the interaction Lagrangian
we must calculate the matrix element
\[ \langle f|S^{(2)}|i\rangle , \tag{14.171} \]
where the initial and final states are⁶ $|i\rangle = |\mathbf{p}r, \mathbf{p}'r'\rangle = a^\dagger_{\mathbf{p}r}\,b^\dagger_{\mathbf{p}'r'}|0\rangle$ and $|f\rangle = |\mathbf{k}s, \mathbf{k}'s'\rangle = a^\dagger_{\mathbf{k}s}\,b^\dagger_{\mathbf{k}'s'}|0\rangle$ and the S-matrix has the form

\[ S^{(2)} = -\frac{e^2}{2}\,2\int d^4x\,d^4y\;T\{:\bar\psi_e(x)\not{A}(x)\psi_e(x):\;:\bar\psi_\mu(y)\not{A}(y)\psi_\mu(y):\} , \tag{14.172} \]
6 For simplicity of notation, where no confusion can arise, we use the same symbols for
the creation and destruction operators of the electrons and muons.

with the factor 2 in front of the integrals to allow for the fact that two identical
contributions are obtained with the exchange $x \leftrightarrow y$. Obviously the field
operators $\psi_e$ and $\psi_\mu$, which describe the electron and muon (and relevant
antiparticles) commute, both with each other and the electromagnetic field.
We can therefore rewrite the time-ordered product in the form

\[ T\{j_e^\lambda(x)A_\lambda(x)\,j_\mu^\nu(y)A_\nu(y)\} = j_e^\lambda(x)\,j_\mu^\nu(y)\,T\{A_\lambda(x)A_\nu(y)\} , \tag{14.173} \]

where $j_e^\lambda = :\bar\psi_e\gamma^\lambda\psi_e:$ and $j_\mu^\nu = :\bar\psi_\mu\gamma^\nu\psi_\mu:$ are the electromagnetic currents
of the electron and the muon. Because neither the initial nor the final state
contains photons, the contribution of the electromagnetic field to the transition
matrix element is given by

\[ \langle 0|T\{A_\lambda(x)A_\nu(y)\}|0\rangle = iD_{F\,\lambda\nu}(x - y) , \tag{14.174} \]

where $D_{F\,\lambda\nu}$ is the photon propagator which, as we showed in Section 8.4, is
given by
\[ iD_{F\,\lambda\nu}(x - y) = \int\frac{d^4q}{(2\pi)^4}\,e^{iq(x-y)}\,\frac{-ig_{\lambda\nu}}{q^2 + i\epsilon} . \tag{14.175} \]

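The other matrix elements to be calculated are

\[ \langle 0|j_e^\lambda(x)|\mathbf{p}r, \mathbf{p}'r'\rangle , \quad \langle\mathbf{k}s, \mathbf{k}'s'|j_\mu^\nu(y)|0\rangle . \tag{14.176} \]

We consider the electron current (spin indices are omitted to simplify the
notation)

\[ j_e^\lambda(x) = \sum_{\mathbf{q},\mathbf{q}'} N_q N_{q'} :\left[a^\dagger_{\mathbf{q}}\,\bar{u}_e(q)\,e^{iqx} + b_{\mathbf{q}}\,\bar{v}_e(q)\,e^{-iqx}\right]\gamma^\lambda \times \left[a_{\mathbf{q}'}\,u_e(q')\,e^{-iq'x} + b^\dagger_{\mathbf{q}'}\,v_e(q')\,e^{iq'x}\right]: , \tag{14.177} \]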

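with $N_q = \sqrt{m/(VE_q)}$. The only term which gives a non-zero contribution to
the matrix element (14.176) is the one containing the destruction operators
$a_{\mathbf{p}}$ and $b_{\mathbf{p}'}$, i.e.

\[ N_p N_{p'}\,b_{\mathbf{p}'}\,a_{\mathbf{p}}\;\bar{v}_e(p')\gamma^\lambda u_e(p)\,e^{-i(p+p')x} . \tag{14.178} \]
Similarly, the only term which contributes to the muon current matrix
element is
\[ N_k N_{k'}\,a^\dagger_{\mathbf{k}}\,b^\dagger_{\mathbf{k}'}\;\bar{u}_\mu(k)\gamma^\nu v_\mu(k')\,e^{+i(k+k')y} . \tag{14.179} \]
We therefore obtain ($u_e = u_e(p)$, $\bar{v}_e = \bar{v}_e(p')$, . . .)

\[ \langle f|S^{(2)}|i\rangle = -e^2\,N_p N_{p'} N_k N_{k'}\,(\bar{v}_e\gamma^\lambda u_e)(\bar{u}_\mu\gamma^\nu v_\mu) \times \int d^4x\,d^4y\;e^{-i[(p+p')x - (k+k')y]}\int\frac{d^4q}{(2\pi)^4}\,\frac{e^{iq(x-y)}}{q^2}\,(-ig_{\lambda\nu}) . \tag{14.180} \]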

The $+i\epsilon$ term in the denominator of the photon propagator is irrelevant,
because $q^2 > 0$. Carrying out the integrations and substituting the expressions
for the spinor normalisation constants, we obtain the result
\[ \langle f|S^{(2)}|i\rangle = (2\pi)^4\,\delta^{(4)}(p + p' - k - k')\,\frac{m}{V\sqrt{E_p E_{p'}}}\,\frac{M}{V\sqrt{E_k E_{k'}}}\,M_{if} , \tag{14.181} \]
where $m$ and $M$ are the masses of the electron and the muon, respectively.
The invariant amplitude $M_{if}$ is defined by
\[ M_{if} = ie^2\,(\bar{v}_e\gamma^\lambda u_e)\,\frac{1}{(p + p')^2}\,(\bar{u}_\mu\gamma_\lambda v_\mu) , \tag{14.182} \]
from which it follows that
\[ |M_{if}|^2 = \frac{e^4}{(p + p')^4}\,(\bar{v}_e\gamma^\lambda u_e\,\bar{u}_e\gamma^\nu v_e)(\bar{u}_\mu\gamma_\lambda v_\mu\,\bar{v}_\mu\gamma_\nu u_\mu) . \tag{14.183} \]
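The average over spins of the particles in the initial state, and the sum over
spins of particles in the final state, is carried out by using the completeness
relations for the Dirac spinors:
\[ \sum_{s} u_e\bar{u}_e = \frac{\not{p} + m}{2m} , \qquad -\sum_{s'} v_e\bar{v}_e = \frac{-\not{p}' + m}{2m} , \tag{14.184} \]
and
\[ \sum_{k} u_\mu\bar{u}_\mu = \frac{\not{k} + M}{2M} , \qquad -\sum_{k'} v_\mu\bar{v}_\mu = \frac{-\not{k}' + M}{2M} . \tag{14.185} \]
From the term corresponding to the electron current, we obtain
\[ \sum_{s,s'} \bar{v}_e\gamma^\lambda u_e\,\bar{u}_e\gamma^\nu v_e = \frac{1}{4m^2}\,\mathrm{Tr}\left[(\not{p}' - m)\gamma^\lambda(\not{p} + m)\gamma^\nu\right] , \tag{14.186} \]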

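which, substituted together with the analogous expression for the muon current
in (14.183), gives the result
\[ \overline{|M_{if}|^2} = \frac{1}{4}\sum_{s,s'}\sum_{k,k'}|M_{if}|^2 = \frac{e^4}{(p + p')^4}\,\frac{1}{4} \times \frac{1}{4m^2}\,\mathrm{Tr}\left[(\not{p}' - m)\gamma^\lambda(\not{p} + m)\gamma^\nu\right]\frac{1}{4M^2}\,\mathrm{Tr}\left[(\not{k} + M)\gamma_\lambda(\not{k}' - M)\gamma_\nu\right] . \tag{14.187} \]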
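Using the rules for the calculation of products of γ-matrix traces, we find

\[ \mathrm{Tr}\left[(\not{p}' - m)\gamma^\lambda(\not{p} + m)\gamma^\nu\right] = \mathrm{Tr}(\not{p}'\gamma^\lambda\not{p}\gamma^\nu) - m^2\,\mathrm{Tr}(\gamma^\lambda\gamma^\nu) = p'_\rho p_\sigma\,\mathrm{Tr}(\gamma^\rho\gamma^\lambda\gamma^\sigma\gamma^\nu) - 4m^2 g^{\lambda\nu} \]
\[ = 4\left\{p'^\lambda p^\nu + p'^\nu p^\lambda - g^{\lambda\nu}\left[(pp') + m^2\right]\right\} , \tag{14.188} \]
and, similarly
\[ \mathrm{Tr}\left[(\not{k} + M)\gamma_\lambda(\not{k}' - M)\gamma_\nu\right] = 4\left\{k_\lambda k'_\nu + k_\nu k'_\lambda - g_{\lambda\nu}\left[(kk') + M^2\right]\right\} . \tag{14.189} \]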
From now on we set to zero the mass of the electron; m ∼ 0.5 MeV,
which is negligible compared to the mass of the muon, M ∼ 105 MeV and
to the 4-momenta of the particles which participate in the process. With this
approximation, we obtain

\[ \mathrm{Tr}\left[(\not{p}' - m)\gamma^\lambda(\not{p} + m)\gamma^\nu\right]\mathrm{Tr}\left[(\not{k} + M)\gamma_\lambda(\not{k}' - M)\gamma_\nu\right] = 32\left[(pk')(p'k) + (pk)(p'k') + (pp')M^2\right] , \tag{14.190} \]

from which it follows that

\[ \overline{|M_{if}|^2} = \frac{e^4}{(p + p')^4}\,\frac{1}{4m^2}\,\frac{1}{4M^2}\,8\left[(pk')(p'k) + (pk)(p'k') + (pp')M^2\right] . \tag{14.191} \]
The explicit calculation of $\overline{|M_{if}|^2}$ is easily carried out in the centre of
mass frame, defined by the condition $\mathbf{p} + \mathbf{p}' = 0$. Choosing the z-axis in the
direction of $\mathbf{p}$ and again neglecting the electron mass, we can write

\[ p \equiv (E, 0, 0, E), \quad p' \equiv (E, 0, 0, -E), \tag{14.192} \]

\[ k \equiv (E, 0, |\mathbf{k}|\sin\theta, |\mathbf{k}|\cos\theta), \quad k' \equiv (E, 0, -|\mathbf{k}|\sin\theta, -|\mathbf{k}|\cos\theta), \tag{14.193} \]
where $\theta$ is the angle between the momenta of the electron and muon and
$|\mathbf{k}| = \sqrt{E^2 - M^2}$. From the definitions it follows that

\[ (pk') = (p'k) = E^2\left(1 + \frac{|\mathbf{k}|}{E}\cos\theta\right) , \tag{14.194} \]

\[ (pk) = (p'k') = E^2\left(1 - \frac{|\mathbf{k}|}{E}\cos\theta\right) , \tag{14.195} \]
\[ (p + p')^2 = 2(pp') = 4E^2 . \tag{14.196} \]
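Substituting into (14.191) we obtain the expression

\[ \overline{|M_{if}|^2} = \frac{1}{4m^2}\,\frac{1}{4M^2}\,e^4\left[1 + \frac{M^2}{E^2} + \left(1 - \frac{M^2}{E^2}\right)\cos^2\theta\right] , \tag{14.197} \]

which implies (compare with (14.181))

\[ |\langle f|S^{(2)}|i\rangle|^2 = \frac{TV}{V^4}\,(2\pi)^4\,\delta^{(4)}(p + p' - k - k') \times \frac{e^4}{16E^4}\left[1 + \frac{M^2}{E^2} + \left(1 - \frac{M^2}{E^2}\right)\cos^2\theta\right] . \tag{14.198} \]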
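The step from (14.191) to (14.197) can be verified numerically by building the four-vectors (14.192)-(14.193) explicitly. The sketch below is illustrative only (the function names are hypothetical): it compares the invariant combination $8[(pk')(p'k) + (pk)(p'k') + (pp')M^2]/(p+p')^4$ with the angular bracket of (14.197), leaving out the common factor $e^4/(16m^2M^2)$.

```python
import math

def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def bracket_from_invariants(e, big_m, theta):
    """8[(pk')(p'k) + (pk)(p'k') + (pp')M^2] / (p+p')^4 in the c.m. frame."""
    kmod = math.sqrt(e**2 - big_m**2)
    p = (e, 0.0, 0.0, e)        # massless electron, eq. (14.192)
    pp = (e, 0.0, 0.0, -e)      # massless positron
    k = (e, 0.0, kmod * math.sin(theta), kmod * math.cos(theta))
    kp = (e, 0.0, -kmod * math.sin(theta), -kmod * math.cos(theta))
    total = tuple(x + y for x, y in zip(p, pp))
    s = mdot(total, total)      # (p + p')^2
    num = 8.0 * (mdot(p, kp) * mdot(pp, k) + mdot(p, k) * mdot(pp, kp)
                 + mdot(p, pp) * big_m**2)
    return num / s**2

def bracket_closed_form(e, big_m, theta):
    """Angular bracket of eq. (14.197)."""
    r = big_m**2 / e**2
    return 1.0 + r + (1.0 - r) * math.cos(theta) ** 2

print(bracket_from_invariants(1.0, 0.3, 0.7), bracket_closed_form(1.0, 0.3, 0.7))
```

The two numbers coincide up to rounding for any energy above threshold and any angle.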
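To obtain the cross section we divide the transition probability per unit time
by the flux of the incident particles, which in the centre of mass frame is
\[ \frac{|\mathbf{v} - \mathbf{v}'|}{V} = 2\,\frac{|\mathbf{v}|}{V} = \frac{2}{V}\,\frac{|\mathbf{p}|}{E} = \frac{2}{V} . \tag{14.199} \]
The result is
\[ d\sigma = \frac{V}{2}\,\frac{V}{V^4}\,(2\pi)^4\,\delta^{(4)}(p + p' - k - k')\,\frac{e^4}{16E^4}\,\Big[\ldots\Big]\,\frac{V}{(2\pi)^3}\,d^3k\;\frac{V}{(2\pi)^3}\,d^3k' . \tag{14.200} \]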
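The integration over $\mathbf{k}$ can be carried out thanks to the function
$\delta^{(3)}(\mathbf{p} + \mathbf{p}' - \mathbf{k} - \mathbf{k}')$, while to integrate over $\mathbf{k}'$ we use the relation
\[ d^3k' = d\Omega_{k'}\,|\mathbf{k}'|^2\,d|\mathbf{k}'| = d\Omega_{k'}\,|\mathbf{k}'|\,E'\,dE' , \tag{14.201} \]
which, substituted into (14.200), gives the result
\[ d\sigma = \frac{1}{2}\,\frac{1}{(2\pi)^2}\,\delta(2E - 2E')\,\frac{e^4}{16E^4}\,\Big[\ldots\Big]\,d\Omega_{k'}\,|\mathbf{k}'|\,E'\,dE' , \tag{14.202} \]
or (with $\alpha = e^2/4\pi$)
\[ \frac{d\sigma}{d\Omega_{k'}} = \frac{\alpha^2}{16E^2}\,\frac{|\mathbf{k}|}{E}\,\Big[\ldots\Big] , \tag{14.203} \]
or
\[ \frac{d\sigma}{d\Omega_{k'}} = \frac{\alpha^2}{4E_{cm}^2}\left(1 - \frac{M^2}{E^2}\right)^{1/2}\left[1 + \frac{M^2}{E^2} + \left(1 - \frac{M^2}{E^2}\right)\cos^2\theta\right] , \tag{14.204} \]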
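where the total energy in the centre of mass frame is $E_{cm} = 2E$. Finally,
carrying out the angular integration we obtain the total cross section
\[ \sigma = 2\pi\int_{-1}^{1} d\cos\theta\,\frac{d\sigma}{d\Omega_{k'}} = 2\pi\,\frac{\alpha^2}{4E_{cm}^2}\left(1 - \frac{M^2}{E^2}\right)^{1/2}\frac{8}{3}\left(1 + \frac{1}{2}\,\frac{M^2}{E^2}\right) . \tag{14.205} \]
Obviously, equation (14.205) makes sense only if
\[ 1 - \frac{M^2}{E^2} > 0 , \tag{14.206} \]
or if the threshold condition for $\mu^+\mu^-$ pair production, $E^2 > M^2$ or $E_{cm}^2 > 4M^2$, is satisfied. In the limit $E \to \infty$
\[ \sigma \to \frac{4}{3}\,\pi\,\frac{\alpha^2}{E_{cm}^2} . \tag{14.207} \]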
The trend of equation (14.207) with energy was easily predictable, because
the cross section has dimensions of an area, i.e. in natural units with $\hbar = c = 1$
the inverse of energy squared. In the limit $E \to \infty$ the only energy
scale available is $E$, all the masses being negligible. To second order in the
interaction Lagrangian, we therefore find
\[ \sigma \propto \frac{\alpha^2}{E_{cm}^2} , \tag{14.208} \]
in agreement with equation (14.207).
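A short numerical sketch can confirm the angular integration leading from (14.204) to (14.205) and show the size of the cross section in conventional units. The function names, the muon mass, the value of α and the conversion factor $(\hbar c)^2 \approx 0.3894$ mb GeV² are assumptions of the sketch; energies are in GeV.

```python
import math

ALPHA = 1 / 137.035999          # fine-structure constant (assumed value)
M_MU = 0.1056583755             # muon mass in GeV (assumed value)
GEV2_TO_NB = 0.3893794e6        # (hbar c)^2 in nb GeV^2

def dsigma_domega(e_cm, cos_theta, big_m=M_MU, alpha=ALPHA):
    """Differential cross section, eq. (14.204), in GeV^-2."""
    r = big_m**2 / (e_cm / 2.0) ** 2
    if r >= 1.0:
        return 0.0  # below the pair-production threshold
    return (alpha**2 / (4.0 * e_cm**2) * math.sqrt(1.0 - r)
            * (1.0 + r + (1.0 - r) * cos_theta**2))

def sigma_closed_form(e_cm, big_m=M_MU, alpha=ALPHA):
    """Total cross section, eq. (14.205), in GeV^-2."""
    r = big_m**2 / (e_cm / 2.0) ** 2
    if r >= 1.0:
        return 0.0
    return (2.0 * math.pi * alpha**2 / (4.0 * e_cm**2) * math.sqrt(1.0 - r)
            * (8.0 / 3.0) * (1.0 + 0.5 * r))

def sigma_numeric(e_cm, n=200):
    """Simpson integration of (14.204) over cos(theta) in [-1, 1]."""
    h = 2.0 / n
    acc = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        acc += weight * dsigma_domega(e_cm, -1.0 + i * h)
    return 2.0 * math.pi * acc * h / 3.0

sigma_10gev_nb = sigma_closed_form(10.0) * GEV2_TO_NB
print(sigma_10gev_nb)
```

At $E_{cm}$ = 10 GeV the mass corrections are negligible and the result is already very close to the asymptotic form (14.207).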

14.8 PROBLEMS FOR CHAPTER 14

Sect. 14.1
1. Write the expression for the Fourier transform of the electrostatic
potential due to the charge distribution: $Ze\rho(x)$.

Sect. 14.4
1. Starting from the Dyson formula and using translation invariance, show
that the Compton scattering amplitude to order α can be concisely
written as:

\[ A = (2\pi)^4\,\delta^{(4)}(p' + q' - p - q)\,\epsilon^\mu(q)\,\epsilon^\nu(q')\,M_{\mu\nu} , \]

\[ M_{\mu\nu} = \langle p', r'|\int d^4x\,e^{-iqx}\,T\{J_\mu(x)J_\nu(0)\}|p, r\rangle . \]

2. Recalling the result of problem 2. to Sect. 8.3 show that:

\[ q^\mu\int d^4x\,e^{iqx}\,T\{J_\mu(x)J_\nu(0)\} = 0 . \]

This identity is used in the text as a check that gauge invariance is
respected in the calculation of $M_{\mu\nu}$ from the Feynman diagrams. It
belongs to the large class of Ward identities embodying the consequences
of electric charge conservation.

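3. Using equations (14.116), (14.140) and (14.141), show that the differential
cross section for Compton scattering on a moving electron can be
cast in the form
\[ \frac{d\sigma}{d\Omega} = \frac{\alpha^2}{(s - m^2)^2}\,2\omega'^2\left\{4\left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right)^2 + 4\left(\frac{m^2}{s-m^2} + \frac{m^2}{u-m^2}\right) - \left(\frac{u-m^2}{s-m^2} + \frac{s-m^2}{u-m^2}\right)\right\} , \tag{14.209} \]

where Ω is the differential solid angle specifying the direction of the
scattered photon.
4. Using equations (14.209) and (14.145) with $\phi = \pi$, show that, in the
limit $\omega \ll E$, the differential cross sections of Fig. 14.4 can be written
in the form
\[ \frac{d\sigma}{d\omega'} = \frac{\pi\alpha^2}{2\omega E^2}\left\{\frac{m^4}{4\omega^2E^2}\left(\frac{\omega'}{E - \omega'}\right)^2 - \frac{m^2}{\omega E}\,\frac{\omega'}{E - \omega'} + \frac{E - \omega'}{E} + \frac{E}{E - \omega'}\right\} . \]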
CHAPTER 15

APPLICATIONS: WEAK INTERACTIONS

15.1 NEUTRON DECAY

\[ n \to p + e^- + \bar\nu . \tag{15.1} \]
The interaction Lagrangian which describes the process (15.1) and (9.24)
is recalled here for convenience:
\[ L_F = -\frac{G_F}{\sqrt{2}}\left[\bar\psi_p\gamma^\mu\left(1 + \frac{g_A}{g_V}\gamma_5\right)\psi_n\right]\left[\bar\psi_e\gamma_\mu(1 - \gamma_5)\psi_\nu\right] = -\frac{G_F}{\sqrt{2}}\,H^\mu L_\mu . \tag{15.2} \]
All the particles are represented by Dirac fields and $G_F$ is the Fermi constant.
The Lagrangian is the product of two operators: the nuclear current $H^\mu$,
which induces the transition between heavy particles, n → p, and the lepton
current $L_\mu$, which creates the lepton pair from the vacuum. Assuming the
validity of (15.2) we calculate the mean lifetime of the neutron and the electron
asymmetry with respect to the spin of the neutron. Comparing with
experimental values we can determine the two constants which appear in (15.2).

To first order of perturbation theory:

\[ \langle p, e, \bar\nu|S|n\rangle = \frac{-iG_F}{\sqrt{2}}\,\langle p, e, \bar\nu|\int d^4x\,H^\mu(x)L_\mu(x)|n\rangle = (2\pi)^4\,\delta^{(4)}(P_f - P_i)\,\frac{-iG_F}{\sqrt{2}}\,\langle p|H^\mu(0)|n\rangle\,\langle e, \bar\nu|L_\mu(0)|0\rangle . \tag{15.3} \]

The nuclear matrix element can be calculated in the limit in which the
proton is non-relativistic, given the small n-p mass difference compared to
the mass of the proton. In this limit, only the currents which correspond to
diagonal Dirac matrices survive, i.e.
\[ \gamma^0 , \quad \gamma^i\gamma_5 . \]

DOI: 10.1201/9781003436263-15 233

This chapter has been made available under a CC BY NC license.

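The corresponding matrix elements are known respectively as Fermi and
Gamow–Teller transitions. We find

\[ \langle p|\bar\psi_p\gamma^0\psi_n|n\rangle = \langle p|\bar\psi_p^{(-)}\gamma^0\psi_n^{(+)}|n\rangle = \chi_p^\dagger\chi_n = h^0 \quad \text{(Fermi)} , \]

\[ \langle p|\bar\psi_p\gamma^i\gamma_5\psi_n|n\rangle = \langle p|\bar\psi_p^{(-)}\gamma^i\gamma_5\psi_n^{(+)}|n\rangle = \chi_p^\dagger\sigma^i\chi_n = h^i \quad \text{(Gamow–Teller)} , \]

where $\chi_p$ and $\chi_n$ are two-dimensional spinors.

For the leptons, no approximation can be used:

\[ \langle e, \bar\nu|\bar\psi_e\gamma^\mu(1 - \gamma_5)\psi_\nu|0\rangle = \langle e, \bar\nu|\bar\psi_e^{(-)}\gamma^\mu(1 - \gamma_5)\psi_\nu^{(-)}|0\rangle = \sqrt{\frac{m_e m_\nu}{E_e E_\nu}}\;\bar{u}_e(p_e)\gamma^\mu(1 - \gamma_5)v_\nu(p_\nu) = \sqrt{\frac{m_e m_\nu}{E_e E_\nu}}\;l^\mu . \]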

The squared modulus of the Feynman matrix element is:

\[ |M|^2 = \frac{G_F^2}{2}\,[h^\mu(h^\nu)^*]\,[l_\mu l_\nu^*] . \tag{15.4} \]

The Nuclear Part. In the sum over the proton spin in the nuclear part of
$|M|^2$ the projector of the two spin states is used:

\[ \sum_{\text{spin } p}(\chi_p)_a(\chi_p^\dagger)_b = \delta_{ab} \quad (a, b = 1, 2) . \tag{15.5} \]

Therefore, assuming the neutron is in a given spin state:

\[ \sum_{\text{spin } p} h^\mu(h^\nu)^* = \mathrm{Tr}\left[a^\mu(\chi_n\chi_n^\dagger)a^\nu\right] , \]

with
\[ a^\mu = \left(1, \frac{g_A}{g_V}\cdot\boldsymbol{\sigma}\right) . \tag{15.6} \]

Neutron Polarisation. The state of a neutron with polarisation P is
described using a density matrix in place of the projector of spin states in (15.5).
If A and B are, respectively, the probability to find the neutron with spin up
or down along the 3-axis, the density matrix is given by:

\[ \rho_{ab} = (\chi_n^\uparrow)_a\,A\,(\chi_n^{\uparrow\dagger})_b + (\chi_n^\downarrow)_a\,B\,(\chi_n^{\downarrow\dagger})_b = \frac{1 + \sigma_3}{2}\,A + \frac{1 - \sigma_3}{2}\,B = \frac{1 + P\sigma_3}{2} , \tag{15.7} \]

where we have used $A + B = 1$ and the polarisation along the 3-axis is:

\[ P = \langle\sigma_3\rangle = \mathrm{Tr}(\rho\sigma_3) = A - B . \]

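For a non-polarised neutron, P = 0 and the insertion of ρ into equation
(15.6) simply gives the average over the initial spin states. In any case, we
find:

\[ \sum_{\text{spin } p} h^\mu(h^\nu)^* = \mathrm{Tr}\left[a^\mu\rho\,a^\nu\right] = H^{\mu\nu} , \qquad (H^{\mu\nu})^* = H^{\nu\mu} . \]
Explicitly:
\[ H^{00} = 1 , \qquad H^{i0} = H^{0i} = \frac{g_A}{g_V}\,P\,\delta^{i3} , \qquad H^{ij} = \left(\frac{g_A}{g_V}\right)^2\left[\delta^{ij} - iP\,\epsilon^{ij3}\right] . \]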
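The explicit components of $H^{\mu\nu}$ can be reproduced by multiplying 2×2 matrices directly. The sketch below is an illustration with hypothetical helper names; it builds $a^\mu = (1, \lambda\boldsymbol{\sigma})$ with $\lambda = g_A/g_V$ and $\rho = (1 + P\sigma_3)/2$, and evaluates $\mathrm{Tr}[a^\mu\rho\,a^\nu]$ using plain Python lists of complex numbers.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def scale(c, a):
    return [[c * x for x in row] for row in a]

IDENTITY = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

def hadron_tensor(lam, pol):
    """H^{mu nu} = Tr[a^mu rho a^nu], a^mu = (1, lam*sigma), rho = (1 + pol*sigma_3)/2."""
    rho = [[(1 + pol) / 2, 0], [0, (1 - pol) / 2]]
    a = [IDENTITY] + [scale(lam, s) for s in SIGMA]
    return [[trace(matmul(matmul(a[mu], rho), a[nu])) for nu in range(4)]
            for mu in range(4)]

H = hadron_tensor(lam=-1.27, pol=0.6)
print(H[0][0], H[0][3], H[1][2], H[2][1])
```

The values of `lam` and `pol` are arbitrary test inputs; the off-diagonal spatial entries come out purely imaginary and antisymmetric, as required by $(H^{\mu\nu})^* = H^{\nu\mu}$.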

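Lepton Part. In the lepton part, we sum over all the spins, which are usually
not observed, using the formulae (6.84) and (6.85) for the projectors of the
solutions for positive (electron) and negative (antineutrino) energy. We obtain

\[ \sum_{\text{spin } e,\nu}(l_\mu l_\nu^*) = \sum_{\text{spin } e,\nu}\left[\bar{u}_e(p_e)\gamma_\mu(1 - \gamma_5)v_\nu(p_\nu)\right]\left[v_\nu(p_\nu)^\dagger(1 - \gamma_5)\gamma_\nu^\dagger\gamma^0 u_e(p_e)\right] \]
\[ = \sum_{\text{spin } e,\nu}\left[\bar{u}_e(p_e)\gamma_\mu(1 - \gamma_5)v_\nu(p_\nu)\right]\left[\bar{v}_\nu(p_\nu)\gamma_\nu(1 - \gamma_5)u_e(p_e)\right] , \]

because from (6.17) it follows that:

\[ \gamma^0(1 - \gamma_5)\gamma_\nu^\dagger\gamma^0 = \gamma_\nu(1 - \gamma_5) , \]
and therefore:
\[ \sum_{\text{spin } e,\nu}(l_\mu l_\nu^*) = \frac{1}{4m_e m_\nu}\,\mathrm{Tr}\left[(\not{p}_e + m_e)\gamma_\mu(1 - \gamma_5)(\not{p}_\nu - m_\nu)\gamma_\nu(1 - \gamma_5)\right] \]
\[ = \frac{1}{4m_e m_\nu}\,2\,\mathrm{Tr}\left[\not{p}_e\gamma_\mu\not{p}_\nu\gamma_\nu(1 - \gamma_5)\right] \]
\[ = 8\,\frac{1}{4m_e m_\nu}\left[(p_e)_\mu(p_\nu)_\nu + (p_e)_\nu(p_\nu)_\mu - g_{\mu\nu}(p_e\cdot p_\nu) + i\epsilon_{\alpha\beta\mu\nu}(p_e)^\alpha(p_\nu)^\beta\right] = 8\,\frac{1}{4m_e m_\nu}\,L_{\mu\nu} . \tag{15.8} \]
We have used the results of Section 6.3 and the relations:
\[ (1 - \gamma_5)(1 + \gamma_5) = 0 ; \qquad (1 - \gamma_5)(1 - \gamma_5) = 2(1 - \gamma_5) , \]
which imply, among other things, that the terms proportional to the masses
in the numerator give zero contribution. In components:

\[ L_{00} = E_eE_\nu + (\mathbf{p}_e\cdot\mathbf{p}_\nu) ; \qquad L_{0i} = -\left[E_e(\mathbf{p}_\nu)^i + (\mathbf{p}_e)^iE_\nu\right] + i\epsilon^{ijk}(\mathbf{p}_e)^j(\mathbf{p}_\nu)^k , \]

\[ L_{ij} = (\mathbf{p}_e)^i(\mathbf{p}_\nu)^j + (\mathbf{p}_e)^j(\mathbf{p}_\nu)^i + \delta^{ij}(p_e p_\nu) + i\epsilon^{ijk}\left[E_e(\mathbf{p}_\nu)^k - (\mathbf{p}_e)^kE_\nu\right] . \]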

Phase Space. Comparing with (11.41), we obtain:

\[ d\Gamma = \frac{G_F^2}{(2\pi)^5}\,\delta^{(4)}(P_f - P_i)\,\frac{d^3p_e\,d^3p_\nu}{4E_eE_\nu}\,d^3p_p\cdot\left[4H^{\mu\nu}L_{\mu\nu}\right] . \]
We can integrate the proton momentum using the three-dimensional δ-function.
In addition, conservation of energy, in the non-relativistic limit for
the proton, can be written:
\[ m_n - m_p - \left[\frac{(\mathbf{p}_e + \mathbf{p}_\nu)^2}{2m_p} + E_e + E_\nu\right] = 0 . \]
The kinetic energy of the proton is negligible, from which we find:
\[ E_e + E_\nu = \Delta m = m_n - m_p . \tag{15.9} \]
The result (15.9) explains the continuous spectrum of β rays; the energy
released in the transition is fixed, as in atomic or nuclear transitions with the
emission of a photon, but this energy is divided in a random way between the
electron and the neutrino, which is not observed.
Equation (15.9) has the practical consequence that we can integrate
freely over the direction of the neutrino momentum, with its energy fixed
by $E_\nu = \Delta m - E_e$. Overall, we can substitute
\[ \delta^{(4)}(P_f - P_i)\,\frac{d^3p_e\,d^3p_\nu}{4E_eE_\nu}\,d^3p_p\,(\ldots) \;\to\; \frac{\pi}{2}\,p_\nu\,p_e\,dE_e\,d\cos\theta_{en}\,d\Omega_\nu\,(\ldots) . \]
From (15.4) and (15.8) it can be seen that $|M|^2$ depends linearly on $(p_\nu)^\mu$.
We can therefore omit from $L_{\mu\nu}$ the terms proportional to $\mathbf{p}_\nu$ which integrate
to zero over the solid angle.

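Lifetime and Spin Asymmetries. Putting everything together, we find:

\[ d\Gamma = \frac{G_F^2}{4\pi^3}\,(1 + 3\lambda^2)\left[1 + A_{en}(P\,v_e\cos\theta_{en})\right]p_\nu E_\nu\,p_e E_e\,dE_e\,d\cos\theta_{en} , \tag{15.10} \]
\[ A_{en} = -2\,\frac{\lambda(1 + \lambda)}{(1 + 3\lambda^2)} , \tag{15.11} \]
where $\theta_{en}$ is the angle between the momentum of the electron and the neutron
spin direction. We have set
\[ \lambda = \frac{g_A}{g_V} , \]
and in addition:
\[ p_e = \sqrt{E_e^2 - m_e^2} , \quad v_e = \frac{p_e}{E_e} , \quad E_\nu = \Delta m - E_e , \quad p_\nu = \sqrt{E_\nu^2 - m_\nu^2} . \]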
Table 15.1 Observables in the decay of the neutron.

Δm (MeV)   Lifetime (s)   A_en                A_νn             g_A/g_V
1.293      885.7 ± 0.8    −0.1173 ± 0.0013    0.983 ± 0.004    −1.2695 ± 0.0029

We can also express dΓ as a function of the electron energy and the angle
between the neutrino and the neutron spin direction. The neutrino momentum
is reconstructed by measuring the proton momentum as well as that of the
electron, using $\mathbf{p}_\nu = -\mathbf{p}_e - \mathbf{p}_p$. It is easily seen that the terms in $P\lambda$ are
symmetric for exchange of $e \leftrightarrow \nu$ while terms in $P\lambda^2$ change sign. Therefore,
with $v_\nu = 1$, we obtain:

\[ d\Gamma = \frac{G_F^2}{4\pi^3}\,(1 + 3\lambda^2)\left[1 + A_{\nu n}(P\cos\theta_{\nu n})\right]p_\nu E_\nu\,p_e E_e\,dE_e\,d\cos\theta_{\nu n} , \tag{15.12} \]
\[ A_{\nu n} = -2\,\frac{\lambda(1 - \lambda)}{(1 + 3\lambda^2)} . \tag{15.13} \]

From measurements of $A_{en}$ and $A_{\nu n}$ we can determine λ and therefore
obtain $G_F$ by comparison of the experimental value of the lifetime with the
expression derived from (15.10) or (15.12):
\[ \Gamma = \frac{1}{\tau} = \frac{G_F^2\,\Delta m^5}{60\pi^3}\left[1 + 3\left(\frac{g_A}{g_V}\right)^2\right]\cdot I\!\left(\frac{m_e}{\Delta m}\right) , \qquad I(x) = 30\int_x^1 dt\;t\,(1 - t)^2\sqrt{t^2 - x^2} \simeq 0.473 . \tag{15.14} \]

The integral I is normalised to give 1 in the limit $m_e = 0$ where we have again
neglected the mass of the neutrino. The numerical value is obtained using the
mass values tabulated in Tables 15.1 and 15.2.
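The integral I and the resulting Fermi constant can be evaluated with a few lines of code. The following sketch is not the authors' procedure: it uses the change of variable $s = \sqrt{t^2 - x^2}$, under which $I(x) = 30\int_0^{\sqrt{1-x^2}} (1 - t)^2 s^2\,ds$ with $t = \sqrt{x^2 + s^2}$, and Simpson's rule. The function names and the value of ħ are assumptions; Δm, the lifetime and $g_A/g_V$ are the entries of Table 15.1.

```python
import math

HBAR_GEV_S = 6.582119569e-25  # hbar in GeV s (assumed value)

def phase_space_integral(x, n=2000):
    """I(x) of eq. (15.14), after the substitution s = sqrt(t^2 - x^2)."""
    s_max = math.sqrt(1.0 - x**2)
    h = s_max / n
    acc = 0.0
    for i in range(n + 1):
        s = i * h
        t = math.sqrt(x**2 + s**2)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        acc += weight * (1.0 - t) ** 2 * s**2
    return 30.0 * acc * h / 3.0

def fermi_constant(tau_s, delta_m_gev, lam, m_e_gev):
    """Solve eq. (15.14) for G_F, in GeV^-2."""
    gamma = HBAR_GEV_S / tau_s                     # decay rate in GeV
    integral = phase_space_integral(m_e_gev / delta_m_gev)
    denom = delta_m_gev**5 * (1.0 + 3.0 * lam**2) * integral
    return math.sqrt(60.0 * math.pi**3 * gamma / denom)

i_value = phase_space_integral(0.510998902 / 1.293)
g_f = fermi_constant(tau_s=885.7, delta_m_gev=1.293e-3, lam=-1.2695,
                     m_e_gev=0.510998902e-3)
print(i_value, g_f)
```

With the tabulated inputs the integral comes out close to 0.473 and the Fermi constant close to 1.18 × 10⁻⁵ GeV⁻².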

Table 15.2 Properties of charged leptons are determined by the weak interactions,
from the Particle Data Group [12]. The numbers in brackets denote the error on
the last digit of each quantity. In the last two columns B(l), denotes the fraction of
lepton decays, with the emission of a l− ν̄l pair, with l = e, μ.

     m (MeV)            lifetime               B(e)(%)    B(μ)(%)
e    0.510998902(21)    > 4.6·10^26 y          0          0
μ    105.658357(5)      2.19703(4)·10^−6 s     100        0
τ    1776.99(28)        2.906(11)·10^−13 s     17.84(6)   17.37(6)

Figure 15.1 Eliminating λ from equations (15.11) and (15.13) one ﬁnds the second
degree consistency relation in the asymmetries, equation (15.17), which is repre-
sented by the ellipse shown in the ﬁgure, in the plane (Aen , Aνn ). The experimental
point is indicated by the dot and, indeed, it lies on the ellipse, showing that the
consistency condition is quite well obeyed.

Numerical Analysis. Considering λ and λ² as independent quantities, from
(15.11) and (15.13) and the values in Table 15.1, we find the value of $g_A/g_V$:

\[ \Delta = A_{en} - A_{\nu n} \simeq -1.10 , \qquad \Sigma = A_{en} + A_{\nu n} \simeq 0.866 , \]
\[ \lambda^2 = \frac{-\Delta}{4 + 3\Delta} \simeq 1.57 , \]
\[ \lambda = \frac{-\Sigma}{4 + 3\Delta} \simeq -1.24 = \frac{g_A}{g_V} , \tag{15.15} \]
with a consistency check that requires:
\[ \left(\frac{-\Sigma}{4 + 3\Delta}\right)^2 = \frac{-\Delta}{4 + 3\Delta} , \]
or:

\[ 1.53 \simeq 1.57 . \tag{15.16} \]

Another way of expressing the consistency is to say that the experimental
point in the $(A_{en}, A_{\nu n})$ plane should be on the ellipse given by the equation:
\[ \Sigma^2 + 3\left(\Delta + \frac{2}{3}\right)^2 = \frac{4}{3} . \tag{15.17} \]
Fig. 15.1 shows that this is true to an excellent approximation.
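The arithmetic of this analysis is easily reproduced. The sketch below uses hypothetical function names: it computes the asymmetries from λ via (15.11) and (15.13), inverts them for λ and λ², and evaluates the consistency relation in the equivalent form $\Sigma^2 + 3\Delta^2 + 4\Delta = 0$, which follows from eliminating λ between the two expressions for Σ and Δ.

```python
def asymmetries(lam):
    """A_en and A_nu_n from eqs. (15.11) and (15.13)."""
    norm = 1.0 + 3.0 * lam**2
    return -2.0 * lam * (1.0 + lam) / norm, -2.0 * lam * (1.0 - lam) / norm

def lambda_from_asymmetries(a_en, a_nun):
    """Return (lambda, lambda^2) treated as independent quantities, eq. (15.15)."""
    delta = a_en - a_nun
    sigma = a_en + a_nun
    return -sigma / (4.0 + 3.0 * delta), -delta / (4.0 + 3.0 * delta)

def consistency_residual(a_en, a_nun):
    """Sigma^2 + 3 Delta^2 + 4 Delta, zero for an exactly consistent pair."""
    delta = a_en - a_nun
    sigma = a_en + a_nun
    return sigma**2 + 3.0 * delta**2 + 4.0 * delta

# Central values of Table 15.1.
lam, lam2 = lambda_from_asymmetries(-0.1173, 0.983)
print(lam, lam2, lam**2)
print(consistency_residual(-0.1173, 0.983))
```

For the measured asymmetries the residual is small but not zero, which is the same statement as the comparison of 1.53 with 1.57 above.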
Finally, from this simple analysis we find a value for $g_A/g_V$ in (15.15) very
close to that adopted by the Particle Data Group [12] listed in Table 15.1.
Substituting the latter value into (15.14), we find the Fermi constant to be:

\[ G_F = 1.18\cdot 10^{-5}\ \mathrm{GeV^{-2}} . \tag{15.18} \]

Comment. The existence of a correlation between the neutron spin and the
electron direction of flight shows that the weak interaction violates spatial
reflection symmetry. Under this operation, the electron momentum (polar
vector) changes sign while the neutron spin (axial vector) does not change. If
the final state is an eigenstate of parity, as happens if P commutes with the
Hamiltonian, we must have $\langle\mathbf{v}_e\cdot\boldsymbol{\sigma}_n\rangle = 0$, and therefore an average value of
$\cos\theta_{en}$ equal to zero.
The existence of parity violation in weak interactions, and therefore in β
decays, was hypothesised by Lee and Yang [23] in 1956 to resolve the so-called
θ − τ puzzle, the decay of K mesons into both two and three π mesons. The
first experimental observation of the spin asymmetry was in β decays of nuclei
by Wu and collaborators in 1957 [24].

Limits on the Mass of the Neutrino. The electron energy distribution
is potentially sensitive to the mass of the neutrino in the highest range of
energies, the endpoint of the spectrum. For this reason, in equation (15.10) we
have kept the neutrino mass different from zero. After integration over angle,
we can write the spectrum as:
\[ \frac{d\Gamma_e}{dE_e} = f(E_e) = C\cdot p_\nu E_\nu\,E_e\,p_e \simeq C'\,p_\nu E_\nu , \]
with $C$ and $C'$ constants. In the region in which $E_e \simeq \Delta m$, we can keep only
the neutrino terms, which vary rapidly, and approximate the others with their
value at the end point $E_e \simeq p_e \simeq \Delta m$.

It is useful to consider the square root of the spectrum:

\[ g(E_e) = \sqrt{f(E_e)} \propto \sqrt{(\Delta m - E_e)\sqrt{(E_e - \Delta m)^2 - m_\nu^2}} . \tag{15.19} \]

For a neutrino mass exactly equal to zero, $g(E_e)$ vanishes linearly at the
endpoint, while if $m_\nu \neq 0$ the curve vanishes with an infinite derivative. The
effect allows an estimate of the neutrino mass, or at least an upper limit,
with greater sensitivity the smaller the endpoint value, which in the case of
the neutron is $\Delta m \simeq 1.29$ MeV. A particularly favourable nuclide is tritium,
which decays according to the scheme:
\[ {}^3\mathrm{H} \to {}^3\mathrm{He} + e^- + \bar\nu \]

with Δm equal to 18.6 keV.

Figure 15.2 Calculated shape of $g(E_e)$ in the decay of tritium, near to the endpoint
(axis ticks: 18.55 to 18.59 on the horizontal axis, 0.01 to 0.05 on the vertical axis).
The upper curve corresponds to a zero neutrino mass, the lower curve to $m_\nu = 10$ eV.

Curves which represent the electron energy spectrum corresponding to
$m_\nu = 0$ and $m_\nu = 10$ eV are shown in Fig. 15.2. At present no effect has been
observed at the endpoint of tritium which positively indicates the existence of
a neutrino mass in β decay; only an upper limit has been obtained [?]:

\[ m_{\nu_e} \leq 3\ \mathrm{eV} . \]
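The behaviour near the endpoint can be illustrated with a minimal sketch. It assumes the normalisation constant is set to 1 and that energies are measured in keV with Δm = 18.6 keV; the square-root spectrum is taken as $g = \sqrt{E_\nu\,p_\nu}$ with $E_\nu = \Delta m - E_e$ and $p_\nu = \sqrt{E_\nu^2 - m_\nu^2}$. The function name and these conventions are assumptions of the illustration.

```python
import math

DELTA_M_KEV = 18.6  # tritium endpoint energy quoted in the text

def g_endpoint(e_e_kev, m_nu_kev=0.0, delta_m=DELTA_M_KEV):
    """Square root of the spectrum near the endpoint (constant set to 1)."""
    e_nu = delta_m - e_e_kev
    if e_nu <= m_nu_kev:
        return 0.0  # kinematically forbidden: E_nu must exceed m_nu
    p_nu = math.sqrt(e_nu**2 - m_nu_kev**2)
    return math.sqrt(e_nu * p_nu)

for e_e in (18.55, 18.57, 18.585, 18.595):
    print(e_e, g_endpoint(e_e), g_endpoint(e_e, m_nu_kev=0.010))
```

For a massless neutrino g is a straight line reaching zero at Δm, while for $m_\nu$ = 10 eV the curve lies below it and ends at $\Delta m - m_\nu$.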

15.2 MUON DECAY

The muon decays via a process analogous to β decay of the neutron:

\[ \mu^-(p) \to \nu_\mu(p') + e^-(q) + \bar\nu_e(q') . \tag{15.20} \]

We have introduced two different types of neutrino, associated with two
charged leptons, in accord with experimental and theoretical evidence
accumulated since the 1960s.
The decay of the muon can be described with a Lagrangian of the Fermi
type. In view of the strong similarity between electron and muon displayed by
the electromagnetic interaction, we assume a Lagrangian in which the V –A
structure of the νe –e pair is extended to the νμ –μ pair:

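$$ \mathcal{L}_{\mu\text{-dec}} = -\frac{G^{(\mu)}}{\sqrt{2}} \left[ \bar\psi_{\nu_\mu} \gamma^\lambda (1 - \gamma_5) \psi_\mu \right] \left[ \bar\psi_e \gamma_\lambda (1 - \gamma_5) \psi_{\nu_e} \right] + \text{hermitian conjugate} . \tag{15.21} $$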

$G^{(\mu)}$ is a new constant analogous to the Fermi constant introduced for the
neutron. In the muon rest frame, the S-matrix element for the decay is:

$$ \langle \nu_\mu, e, \bar\nu_e | S | \mu \rangle = (2\pi)^4 \delta^{(4)}(p - p' - q - q') \sqrt{\frac{m_e m_{\nu_\mu} m_{\nu_e}}{E(p')E(q)E(q')V^4}}\; M(i \to f) , $$

where we have introduced the invariant Feynman amplitude, $M(i \to f)$. The
decay probability is calculated starting from:

$$ \sum_{\text{spin fin}} \langle |M(i \to f)|^2 \rangle = \frac{(G^{(\mu)})^2\, [8 M^{\mu\nu} L_{\mu\nu}]}{8 m_e m_{\nu_\mu} m_{\nu_e}} , $$

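where $\langle \ldots \rangle$ denotes the average over the muon spin. The tensor $L_{\mu\nu}$ is the
same as introduced in (15.8), while, for a state of definite spin:

$$ M^{\mu\nu} = \mathrm{Tr}\left[ (\chi\chi^\dagger) \gamma^\nu \not{p}' \gamma^\mu (1 - \gamma_5) \right] . $$

In general, we must replace the spinor product with a density matrix,
similarly to what was done for the neutron:

$$ \langle (\chi)_\alpha (\chi^\dagger)_\beta \rangle \to \begin{pmatrix} \rho(P) & 0 \\ 0 & 0 \end{pmatrix} , \qquad \rho(P) = \frac{1 + P\sigma_3}{2} , \tag{15.22} $$

where $P$ is the polarisation of the muon along the 3-axis. To calculate $M^{\mu\nu}$
we must use the identity (6.170):

$$ M^{\mu\nu} = p'^\mu S^\nu + p'^\nu S^\mu - g^{\mu\nu} p'_\alpha S^\alpha - i\epsilon^{\nu\alpha\mu\rho} p'_\alpha S_\rho , $$

$$ S^\mu = \mathrm{Tr}\left[ \rho \gamma^\mu (1 - \gamma_5) \right] . $$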
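Multiplying by $L_{\mu\nu}$ we find:

$$ M^{\mu\nu} L_{\mu\nu} = 4\,(q p')\, S^\alpha q'_\alpha , \tag{15.23} $$

where the momenta are attributed as in (15.20). From this:

$$ d\Gamma = \frac{(G^{(\mu)})^2}{(2\pi)^5} \cdot 32 \cdot \frac{d^3q}{2E_e}\; q_\rho S_\sigma \left[ \frac{d^3p'\, d^3q'}{4E_{\nu_e}E_{\nu_\mu}}\; \delta^{(4)}(p - q - p' - q')\; p'^\rho q'^\sigma \right] . $$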
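If we do not observe the neutrinos, we must integrate the quantity in
brackets. The integration measure being invariant, the result is a
tensor in the indices ρ, σ constructed with $g^{\rho\sigma}$ and the vector $Q = p - q$, the
only variable which remains after the integration. We therefore set:

$$ \langle p'^\rho q'^\sigma \rangle = \int \frac{d^3p'\, d^3q'}{4E_{\nu_e}E_{\nu_\mu}}\; \delta^{(4)}(p - q - p' - q')\; p'^\rho q'^\sigma = A(Q^2)\, g^{\rho\sigma} + B(Q^2)\, Q^\rho Q^\sigma . $$

To determine $A$ and $B$, we note the relations ($p'^2 = q'^2 = 0$):

$$ g_{\rho\sigma} \langle p'^\rho q'^\sigma \rangle = \frac{1}{2} \langle (p' + q')^2 \rangle = \frac{Q^2}{2} \cdot I , $$

$$ Q_\rho Q_\sigma \langle p'^\rho q'^\sigma \rangle = \frac{1}{4} (Q^2)^2 \cdot I , $$

$$ I = \langle 1 \rangle = \int \frac{d^3p'\, d^3q'}{4E_{\nu_e}E_{\nu_\mu}}\; \delta^{(4)}(p - q - p' - q') . $$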
In this way we find two equations for $A$ and $B$, which give:

$$ \langle p'^\rho q'^\sigma \rangle = \frac{1}{6}\, I \left[ \frac{1}{2} g^{\rho\sigma} Q^2 + Q^\rho Q^\sigma \right] . $$
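As a worked step, the two equations for $A$ and $B$ can be written out explicitly. Contracting $A\, g^{\rho\sigma} + B\, Q^\rho Q^\sigma$ first with $g_{\rho\sigma}$ (using $g_{\rho\sigma} g^{\rho\sigma} = 4$) and then with $Q_\rho Q_\sigma$, and equating to the two relations above, gives

$$ 4A + B\,Q^2 = \frac{Q^2}{2}\, I , \qquad A\,Q^2 + B\,(Q^2)^2 = \frac{(Q^2)^2}{4}\, I . $$

Dividing the second equation by $Q^2$ and subtracting it from the first yields $3A = Q^2 I/4$, so that

$$ A = \frac{Q^2}{12}\, I , \qquad B = \frac{I}{6} , $$

which is the result quoted in the text.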

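An easy calculation, in addition, provides$^1$:

$$ I = \frac{\pi}{2} , $$

and therefore, finally, neglecting as usual terms of order $m_e^2$, we find

$$ \frac{d\Gamma}{dE_e\, d\cos\theta_e} = \frac{(G^{(\mu)})^2}{24\pi^3}\, m_\mu^2 E_e^2 \left\{ \left( 3 - \frac{4E_e}{m_\mu} \right) + \left( 1 - \frac{4E_e}{m_\mu} \right) \left[ \frac{\mathrm{Tr}\left[ \rho\, \not{q}\, (1 - \gamma_5) \right]}{E_e} - 1 \right] \right\} . $$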

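The trace is easily calculated, since:

$$ \not{q}\,(1 - \gamma_5) = \begin{pmatrix} E_e + \mathbf{p}_e \cdot \boldsymbol{\sigma} & \ldots \\ \ldots & \ldots \end{pmatrix} , $$

from which:

$$ \mathrm{Tr}\left[ \rho\, \not{q}\, (1 - \gamma_5) \right] = E_e (1 + v_e P \cos\theta_e) . $$

We normalise the electron energy to the maximum value it can have in the
decay, $E_{max} = m_\mu/2$, putting:

$$ x = \frac{E_e}{E_{max}} = \frac{2E_e}{m_\mu} , $$

and we find, finally:

$$ \frac{d\Gamma}{dx\, d\cos\theta_e} = \frac{(G^{(\mu)})^2 m_\mu^5}{192\pi^3}\; x^2 \left[ 3 - 2x + (1 - 2x)\, \mathbf{v}_e \cdot \mathbf{P} \right] . \tag{15.24} $$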

Integrating over the remaining variables, we find from (15.24) the total
decay rate:

$$ \Gamma = \frac{1}{\tau} = \frac{(G^{(\mu)})^2 m_\mu^5}{192\pi^3} . \tag{15.25} $$

Both the electron spectrum and the spin asymmetry are in perfect agreement
with the experimental data. Comparing equation (15.25) with the observed
value of the lifetime, Table 15.2, we find in addition:

$$ G^{(\mu)} = 1.16 \cdot 10^{-5}\ \mathrm{GeV}^{-2} . \tag{15.26} $$
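The step from the differential spectrum to the total rate, and from the rate to the Fermi constant, can be reproduced numerically. The sketch below assumes the spectrum shape $x^2[3 - 2x + (1 - 2x)\, v_e P \cos\theta_e]$, where the factor $x^2$ comes from the $E_e^2\, dE_e$ phase space, and takes the muon mass and lifetime as assumed inputs (standard reference values, not quoted in this section). The function names are illustrative.

```python
import math

HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV*s
M_MU_GEV = 0.1056583755       # muon mass in GeV (assumed input)
TAU_MU_S = 2.1969811e-6       # muon lifetime in s (assumed input)


def spectrum_shape(x, cos_theta, polarisation=0.0, v_e=1.0):
    """Dimensionless shape x^2 [3 - 2x + (1 - 2x) v_e P cos(theta)]."""
    return x**2 * (3.0 - 2.0 * x + (1.0 - 2.0 * x) * v_e * polarisation * cos_theta)


def integrated_shape(polarisation=0.0, n=200):
    """Midpoint-rule integral over 0 <= x <= 1 and -1 <= cos(theta) <= 1."""
    hx, hc = 1.0 / n, 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            c = -1.0 + (j + 0.5) * hc
            total += spectrum_shape(x, c, polarisation)
    return total * hx * hc


def fermi_constant(tau_s=TAU_MU_S, mass_gev=M_MU_GEV):
    """Invert Gamma = G^2 m^5 / (192 pi^3), with Gamma = hbar / tau; G in GeV^-2."""
    gamma_gev = HBAR_GEV_S / tau_s
    return math.sqrt(192.0 * math.pi**3 * gamma_gev / mass_gev**5)


if __name__ == "__main__":
    print("integral of the shape:", integrated_shape())
    print("G(mu) in GeV^-2:", fermi_constant())
```

The integral of the shape is 1 for any polarisation, since the angular term is odd in $\cos\theta_e$, so the prefactor of (15.24) is itself the total rate (15.25).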

$^1$ In the rest system of $Q^\mu$ the three-dimensional δ-function eliminates the integration
over $p'$, therefore $I = \int d^3q' / (4E^2)\, \delta(2E - Q^0) = (4\pi)/(2 \cdot 4) = \pi/2$.

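Kinematic Limits. Taking account of conservation of momentum, the
conservation of energy in the decay is written:

$$ m_\mu = E_e + E_{\nu_\mu} + |\mathbf{p}_e + \mathbf{p}_{\nu_\mu}| = E_e + E_{\nu_\mu} + \sqrt{p_e^2 + p_{\nu_\mu}^2 + 2 p_e p_{\nu_\mu} \cos\theta_{e\nu_\mu}} . $$

The kinematic limits of the decay in the $E_e$–$E_{\nu_\mu}$ plane are given by the
condition $\cos\theta_{e\nu_\mu} = \pm 1$. Simplifying to the case of particles of zero mass, we
find:

$$ m_\mu = E_e + E_{\nu_\mu} + |E_e \pm E_{\nu_\mu}| . $$

With the positive sign, we have $E_e + E_{\nu_\mu} = m_\mu/2$, therefore:

$$ E_e = \frac{m_\mu}{2}\, x \quad (0 \leq x \leq 1) , \qquad E_{\nu_\mu} = \frac{m_\mu}{2} (1 - x) , \qquad E_{\bar\nu_e} = \frac{m_\mu}{2} . \tag{15.27} $$

With the negative sign, we have two solutions: $E_e = m_\mu/2$, therefore:

$$ E_{\nu_\mu} = \frac{m_\mu}{2}\, y \quad (0 \leq y \leq 1) , \qquad E_{\bar\nu_e} = \frac{m_\mu}{2} (1 - y) , \tag{15.28} $$

or $E_{\nu_\mu} = m_\mu/2$, therefore:

$$ E_e = \frac{m_\mu}{2}\, x \quad (0 \leq x \leq 1) , \qquad E_{\bar\nu_e} = \frac{m_\mu}{2} (1 - x) . \tag{15.29} $$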
Finally, the extreme configurations are three collinear configurations, in
which one particle takes the maximum energy, mμ /2, and the other two divide
the remaining half between them (see Fig. 15.3).

Forbidden Configurations in the Decay. The results in equations (15.23)
and (15.24) show that, if we set $m_e = 0$, the decay probability vanishes for
certain configurations of the final state particles. This arises from the V–A
structure of the Lagrangian, for which $e^-$ and $\nu_\mu$ are produced with negative
helicity and the antineutrino, $\bar\nu_e$, has positive helicity.
According to (15.23), the decay rate is proportional to the $e\nu_\mu$ invariant
mass:

$$ (p_e p_{\nu_\mu}) = \frac{1}{2} (p_e + p_{\nu_\mu})^2 = \frac{1}{2} M^2_{e\nu_\mu} , $$

which vanishes in the collinear configuration, Fig. 15.3. In this configuration,
the spins of $e^-$, $\nu_\mu$ and $\bar\nu_e$ are all parallel and similarly oriented, therefore
this state has projection $-\frac{3}{2}$ along the electron direction of flight. Because the
initial state only has projection $\pm\frac{1}{2}$, the probability of the configuration must
vanish because of conservation of angular momentum.
Figure 15.3 The V–A interaction implies that, in the limit of zero mass, particles
have negative helicity and antiparticles positive helicity. Therefore in the
configuration in which $M_{e\nu_\mu} = 0$ the amplitude has to vanish because of conservation of
angular momentum.

According to equation (15.24), the probability vanishes for $x = 1$, $P = 1$,
$\cos\theta_e = 1$, i.e. when the electron travels in the direction of the muon spin with
maximum energy, Fig. 15.1. In this situation, $\nu_\mu$ and $\bar\nu_e$ are also collinear, in
opposite directions, with opposite helicities and therefore with zero component
of the total spin. The final state has $S_3 = (s_e)_3 = -\frac{1}{2}$ while the initial state
has $S_3 = (s_\mu)_3 = +\frac{1}{2}$.

15.3 UNIVERSALITY, CURRENT × CURRENT THEORY

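The Fermi constants (15.18) and (15.26) are surprisingly similar. This fact
suggests that in the β decay Lagrangian an overall weak current appears,
analogous to what happens in electromagnetic interactions:

$$ \mathcal{L}_W = -\frac{G}{\sqrt{2}} \left[ J_N^\lambda + J_\mu^\lambda \right] (J_e^\dagger)_\lambda , $$

$$ J_N^\lambda + J_\mu^\lambda = g_V\, \bar\psi_p \gamma^\lambda \left( 1 - \frac{g_A}{g_V} \gamma_5 \right) \psi_n + \bar\psi_{\nu_\mu} \gamma^\lambda (1 - \gamma_5) \psi_\mu , $$

$$ (J_e^\dagger)_\lambda = \bar\psi_e \gamma_\lambda (1 - \gamma_5) \psi_{\nu_e} , $$

where we have set $G^{(\mu)} = G$, and $g_V \simeq 1$ represents a possible scaling factor
between the Fermi constant for neutron β decay and that of the muon.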
If we wish to maintain the universality between electron and muon we
must make another step and include the contributions of the two particles in
the same current. In this way, we arrive at the current × current expression:

$$ \mathcal{L}_W = -\frac{G}{\sqrt{2}}\, J_W^\lambda (J_W^\dagger)_\lambda , $$

$$ J_W^\lambda = J_N^\lambda + J_e^\lambda + J_\mu^\lambda . \tag{15.30} $$

The analogy with electromagnetic interactions is now much deeper (see
for example the expression for the total electromagnetic current which appears
in equation (9.17)). In addition, we expect that, by adding appropriate terms
to the nuclear current, we can describe the weak interactions of all hadronic
particles.
The current × current Lagrangian describes new processes, compared to
those considered up to now, in particular:

• a weak interaction between nucleons, which introduces a parity-violating
component into the nuclear forces. In view of the value of $G$ we expect
a small effect. The eigenstates of the complete nuclear Hamiltonian can
be a superposition of states with opposite parity. The consequence is a
polarisation asymmetry in the γ decay of these states. This type of effect
is actually observed with the correct order of magnitude (asymmetry
$\sim 10^{-5}$),

• interactions of muon neutrinos with nuclear matter identical to those
of electron neutrinos, except for a kinematic effect connected to the
difference between the masses of electron and muon, which, however, is
negligible at high energy.

The reaction products of each type of neutrino include the corresponding
lepton and exclude the other:

$$ \nu_e + \text{nucleus} \to e + \ldots , \qquad \nrightarrow \mu + \ldots \tag{15.31} $$

$$ \nu_\mu + \text{nucleus} \to \mu + \ldots , \qquad \nrightarrow e + \ldots \tag{15.32} $$

The selection rules follow from the invariance of the Lagrangian (15.30) under
global phase transformations carried out separately on the $\nu_e$, $e$ fields and on
the $\nu_\mu$, $\mu$ fields. The symmetry implies the conservation of two types of lepton
charge, electron number and muon number:

$$ N_e = N(e^-) + N(\nu_e) - N(e^+) - N(\bar\nu_e) , \tag{15.33} $$

$$ N_\mu = N(\mu^-) + N(\nu_\mu) - N(\mu^+) - N(\bar\nu_\mu) . \tag{15.34} $$

The observation of the selection rule (15.32) at the beginning of the 1960s
allowed the existence of the muon neutrino, $\nu_\mu \neq \nu_e$, to be established.
The current × current theory finds spectacular confirmation in the decay of
the τ lepton. If we add to the weak current (15.30) the term which corresponds
to the $\tau \to \nu_\tau$ transition, with the same V–A structure as the others, we
conclude that this particle must have three types of decay$^2$:

$$ \tau^- \to \begin{cases} \nu_\tau + e^- + \bar\nu_e \\ \nu_\tau + \mu^- + \bar\nu_\mu \\ \nu_\tau + \text{hadrons} \end{cases} $$
The theoretical prediction of the probability of the semi-hadronic decay
mode requires elements of the Standard Model which will be developed in
[13]. For the first two decay modes, instead, we can use the formula
(15.25) with the substitution $m_\mu \to m_\tau$, in the limit in which we neglect the
mass of the muon or electron. We then obtain the prediction:

$$ \Gamma(\tau^- \to \nu_\tau + l^- + \bar\nu_l) = \frac{G^2 m_\tau^5}{192\pi^3} = \left( \frac{m_\tau}{m_\mu} \right)^5 \Gamma_\mu \qquad (l = e, \mu) , \tag{15.35} $$

or:

$$ B(e) = B(\mu) ; $$

$$ \tau(\tau) = \frac{1}{\Gamma_\tau} = \frac{B(l)}{\Gamma(\tau^- \to \nu_\tau + l^- + \bar\nu_l)} = B(l) \left( \frac{m_\mu}{m_\tau} \right)^5 \tau(\mu) = 2.86 \cdot 10^{-13}\ \text{s} , $$

in excellent agreement with the values for the leptonic decay branching ratios,
$B(e)$, $B(\mu)$ and the lifetime given in Table 15.2.
Alternatively, from the experimental values of the τ lifetime and the
leptonic branching ratios we can derive a new value of the Fermi constant. We
find:

$$ G^{(\tau)} = 1.15 \cdot 10^{-5}\ \mathrm{GeV}^{-2} , \tag{15.36} $$

in excellent agreement with the value determined from the muon lifetime,
(15.26).
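The $m^5$ scaling between the muon and τ lifetimes is easy to evaluate. The following is a minimal sketch; the masses, the muon and τ lifetimes and the leptonic branching ratio are assumed inputs (approximate reference values, since Table 15.2 is not reproduced here), and the function names are illustrative. With these inputs the predicted lifetime comes out close to the value quoted in the text, the small difference reflecting the choice of inputs.

```python
import math

M_MU_GEV = 0.1056583755   # muon mass in GeV (assumed input)
M_TAU_GEV = 1.77686       # tau mass in GeV (assumed input)
TAU_MU_S = 2.1969811e-6   # muon lifetime in s (assumed input)
B_LEPTONIC = 0.178        # tau -> nu_tau l nu_l branching ratio (assumed, approximate)
TAU_TAU_S = 2.903e-13     # measured tau lifetime in s (assumed input)


def predicted_tau_lifetime(branching=B_LEPTONIC):
    """tau(tau) = B(l) (m_mu / m_tau)^5 tau(mu), neglecting final-state lepton masses."""
    return branching * (M_MU_GEV / M_TAU_GEV) ** 5 * TAU_MU_S


def fermi_constant_ratio(branching=B_LEPTONIC, tau_tau_s=TAU_TAU_S):
    """G(tau) / G(mu), obtained by taking the square root of the ratio of rates."""
    return math.sqrt(predicted_tau_lifetime(branching) / tau_tau_s)


if __name__ == "__main__":
    print("predicted tau lifetime (s):", predicted_tau_lifetime())
    print("G(tau)/G(mu):", fermi_constant_ratio())
```

A ratio $G^{(\tau)}/G^{(\mu)}$ close to 1 is the numerical content of the agreement between (15.36) and (15.26).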

Universality: Latest Developments. The situation that we have shown,
concerning the decays of the neutron and muon, is that anticipated in the
classic work of Feynman and Gell–Mann [25] in which they proposed the
V–A form of the Fermi interaction and universality of the vector current.
For a precision measurement of the Fermi constant, however, it is necessary
to take into account electromagnetic corrections to the lifetime. A precise
calculation of these corrections shows that in reality $G_F$ is smaller than
$G^{(\mu)}$ by about 3%. The most precise measurement of $G_F$ is obtained from β
transitions between nuclei of isotopic spin 1 and spin-parity $J^P = 0^+$, the
so-called superallowed Fermi transitions. The most recent results give [26]:

$$ \frac{G_F}{G^{(\mu)}} = 0.9739 \pm 0.0005 . \tag{15.37} $$

$^2$ For the decays of the $\tau^+$ we must exchange particles with antiparticles and vice versa.

Still more critical is the situation in β decays of strange particles, for which
the corresponding Fermi constant turns out to be about $\frac{1}{5}$ of the Fermi
constant $G_F$.
The reconciliation of these facts with the universality of the weak interac-
tions is due to Cabibbo [27].
The study of the structure of the weak hadronic current is of crucial im-
portance in understanding the nature and properties of the constituents of
hadrons. The extension of the Feynman–Gell–Mann–Cabibbo theory to par-
ticles with charm is due to Glashow, Iliopoulos and Maiani [28], and the
subsequent extension to particles formed of b and t quarks to Kobayashi and
Maskawa [29].

15.4 TOWARDS A FUNDAMENTAL THEORY

Since Fermi's original paper it has been suspected that the four-fermion
interaction represented by (9.23) and the two subsequent modifications are only
a low energy approximation to a more fundamental interaction, in which the
force is transmitted by an intermediate particle, as happens for electromagnetic
interactions with the photon. This particle must be a boson and has been
given the name intermediate boson. If this is the case, an even stronger link
between the weak and electromagnetic interactions can be hypothesised, with
a symmetry which connects the electromagnetic current to the weak current
and, at the same time, the mediator boson to the photon.
The success of the V–A theory implies that the putative intermediate boson
should be described by a vector field, in which case the name intermediate
vector boson is used. The first unified theories of the weak and electromagnetic
interactions were due to Schwinger [30] and, subsequent to the V–A theory,
Glashow [31].
To see how this idea works, suppose we add to the Lagrangian of
ordinary matter (leptons, nucleons, etc.) the intermediate vector boson
Lagrangian:

$$ \mathcal{L}_{IVB} = \mathcal{L}^{(0)}_{IVB} + \mathcal{L}_{int} , $$

$$ \mathcal{L}^{(0)}_{IVB} = -\frac{1}{2} W^{\mu\nu} W^\dagger_{\mu\nu} - M^2 W_\mu (W^\mu)^\dagger , \qquad W^{\mu\nu} = \partial^\nu W^\mu - \partial^\mu W^\nu , $$

$$ \mathcal{L}_{int} = -g \left[ W^\dagger_\mu J_W^\mu + W^\mu J^\dagger_{W\mu} \right] , \tag{15.38} $$

where we have described the intermediate boson with a complex vector field;
$\mathcal{L}^{(0)}_{IVB}$ is the free Lagrangian and $g$ is a new coupling constant that we assume
to be small.
Treating $\mathcal{L}_{int}$ as a perturbation, we have:

$$ S = 1 + i \int d^4x\, \mathcal{L}_{int} + \frac{(i)^2}{2} \int d^4x\, d^4y\; T\left[ \mathcal{L}_{int}(x) \mathcal{L}_{int}(y) \right] + \ldots . \tag{15.39} $$

The first order term contributes to processes in which an intermediate
boson is emitted or absorbed. If the mass of the W is large enough, these processes
are forbidden by conservation of energy, for example in the neutron decay.
The second-order term can be expanded in the following way:

$$ \frac{(i)^2}{2} \int d^4x\, d^4y\; T\left( \mathcal{L}_{int}(x) \mathcal{L}_{int}(y) \right) \tag{15.40} $$

$$ = \frac{(ig)^2}{2} \int d^4x\, d^4y\; T\left[ W^\dagger_\mu(x) J_W^\mu(x)\, W_\nu(y) J_W^{\dagger\nu}(y) + (x \to y,\ y \to x) \right] + \ldots $$

$$ = (ig)^2 \int d^4x\, d^4y\; J_W^{\dagger\nu}(y)\; T\left[ W_\nu(y) W^\dagger_\mu(x) \right] J_W^\mu(x) + \ldots , $$

where the ellipsis dots represent terms with the emission or absorption of two
intermediate bosons (still more forbidden) and we have used the fact that
in the Dyson formula we use free fields and therefore the currents and fields
commute.
In equation (15.40) the intermediate boson propagator appears which,
because of its large mass, can be approximated by a δ-function, cf. equation
(8.14) and Problem 1 below. In this limit we obtain the product of the
currents:

$$ S = 1 + i \int d^4x \left( -\frac{G}{\sqrt{2}} \right) J^\dagger_{W\mu}(x) J_W^\mu(x) + \ldots \tag{15.41} $$

$$ \phantom{S} = 1 + i \int d^4x\, \mathcal{L}_W(x) + \cdots , $$

where again the ellipsis dots represent terms irrelevant at low energy and we
have put:

$$ \frac{G}{\sqrt{2}} = \frac{g^2}{M_W^2} > 0 . \tag{15.42} $$
In the limit of large mass of the intermediate boson and with a definite
sign$^3$ for $G$, the terms of order $g^2$ in the S-matrix agree with the first order
term in the Fermi constant of the current × current Lagrangian (15.30)!
As well as giving the physical dimension of $G$ (recall that $g$ is dimensionless,
like the electric charge, in natural units), equation (15.42) gives a valuable clue
for constructing a unified theory; the true coupling constant $g$ can be of the
same order of magnitude as the electric charge if $M_W$ is sufficiently large. We
require

$$ g^2 \simeq e^2 = 4\pi\alpha \simeq \frac{4\pi}{137} \simeq 0.091 \tag{15.43} $$

and we use the value of the Fermi constant (15.26). We find:

$$ M_W^2 = \frac{\sqrt{2}\, g^2}{G} \simeq (100\ \text{GeV})^2 . \tag{15.44} $$

$^3$ This is irrelevant for the applications just illustrated but essential to give the correct
sign for the interaction energy of neutrinos with matter, cf. Chapter 16.

This is the order of magnitude of the mass of the intermediate boson in a
theory in which the fundamental weak interactions, described by equation
(15.38), are unified with electromagnetic interactions.
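The order-of-magnitude estimate of $M_W$ follows from simple arithmetic. A minimal sketch, assuming $g^2 = 4\pi\alpha$ exactly and the value of $G$ from (15.26); the function name is illustrative.

```python
import math

ALPHA = 1.0 / 137.036   # fine-structure constant
G_FERMI = 1.16e-5       # Fermi constant from (15.26), in GeV^-2


def w_mass_estimate(g_squared, g_fermi=G_FERMI):
    """M_W in GeV from G / sqrt(2) = g^2 / M_W^2, i.e. M_W^2 = sqrt(2) g^2 / G."""
    return math.sqrt(math.sqrt(2.0) * g_squared / g_fermi)


if __name__ == "__main__":
    g_squared = 4.0 * math.pi * ALPHA
    print("g^2 =", g_squared)
    print("M_W estimate (GeV) =", w_mass_estimate(g_squared))
```

The result is slightly above 100 GeV, consistent with the rounded value in (15.44). Since $M_W \propto g$, a coupling somewhat different from $e$ rescales the estimate linearly.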
As will be shown in the next volume [13], to have a unified theory in
agreement with β decay it is necessary to introduce further interactions mediated
by an electrically neutral vector boson, $Z^0$, with a new coupling constant and
a mass of the same order of magnitude as $g$ and $M_W$. In this theory, the
overall interaction Lagrangian is written:

$$ \mathcal{L}_{int} = -e A_\mu J^\mu_{e.m.} - g \left( W^\dagger_\mu J_W^\mu + W^\mu J^\dagger_{W\mu} \right) - g_1 J_Z^\mu Z_\mu , \tag{15.45} $$

with $g$, $g_1$ of order $e$.

15.5 PROBLEMS FOR CHAPTER 15

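Sect. 15.4

1. Show that the Lagrangian $\mathcal{L}^{(0)}_{IVB}$

$$ \mathcal{L}^{(0)}_{IVB} = -\frac{1}{2} W^{\mu\nu} W^\dagger_{\mu\nu} - M^2 W_\mu (W^\mu)^\dagger , \qquad W^{\mu\nu} = \partial^\nu W^\mu - \partial^\mu W^\nu $$

leads to the equations of motion:

$$ \left( -\partial_\rho \partial^\rho + M^2 \right) W^\mu + \partial^\mu (\partial_\rho W^\rho) = 0 . $$

2. Show that the Feynman propagator of the intermediate boson W is

$$ D_F^{\mu\nu}(x) = \langle 0 | T\left[ W^\mu(x) W^{\dagger\nu}(0) \right] | 0 \rangle = \frac{1}{(2\pi)^4} \int d^4k\; \frac{i}{k^2 - M_W^2 + i\epsilon} \left[ -g^{\mu\nu} + \frac{k^\mu k^\nu}{M_W^2} \right] e^{-ikx} . $$

3. Show that in the limit $M_W \to \infty$

$$ D_F^{\mu\nu}(x) \to +\frac{i g^{\mu\nu}}{M_W^2}\, \delta^{(4)}(x) . $$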
CHAPTER 16

NEUTRINO OSCILLATIONS

Since the 1970s the systematic observation of neutrinos of both natural and
artificial origin has been made possible by the development of experimental
apparatus of large dimensions (tons or thousands of tons) which signal, with
different methods, the occurrence of a neutrino interaction.
Fig. 16.1 shows the general principle of these measurements. A source pro-
duces neutrinos or antineutrinos from β decay and the detector, at a distance
L from the source, signals the interaction of the neutrino by means of the ob-
servation of a charged lepton produced in the inverse β process; see Table 16.1
for the sources and decays utilised.
In the historic experiment of Cowan and Reines [32] in 1956, the source
was the Savannah River nuclear reactor, which produced a calculable flux of
antineutrinos from neutron decays. The detector was a tank containing water.
The antineutrinos produced positrons via the reaction:

$$ \bar\nu + p \to e^+ + n , \tag{16.1} $$

on protons of the water, followed by the annihilation of the positron with
an electron of the medium, which gives rise to two γ rays, each of energy 0.5
MeV. To reduce the uncertainty level, the water contained a certain amount of
cadmium chloride, cadmium being capable of absorbing the neutron with the
emission, after a delay of around 5 μs, of another γ ray with a characteristic
energy, following the reaction:

$$ n + {}^{108}\mathrm{Cd} \to {}^{109m}\mathrm{Cd} \to {}^{109}\mathrm{Cd} + \gamma , \tag{16.2} $$

where $^{109m}$Cd denotes a metastable excited state of $^{109}$Cd. A scintillator
material dissolved in the water transformed the γ rays into light flashes detected
with a system of photomultipliers.

250 DOI: 10.1201/9781003436263-16

This chapter has been made available under a CC BY NC license.
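Table 16.1 Sources and detection methods for naturally occurring and artificially produced neutrinos (for more detailed information
see Ref. [16]). In atmospheric neutrinos, the complete cascade of decays, $\pi \to \mu\nu_\mu$, $\mu \to \nu_\mu e \nu_e$, gives a 2 : 1 ratio between the fluxes of
$\nu_\mu$ and $\nu_e$.

| Source | Production | $E_\nu$ (MeV) | $L$ (km) | Reaction in detector | Method |
|---|---|---|---|---|---|
| Sun (Be-B) | $\nu_e$ | 1 − 10 | $1.4 \cdot 10^8$ | $\nu_e\, {}^{37}\mathrm{Cl} \to e\, {}^{37}\mathrm{Ar}$ | radioch. |
| Sun (p-p) | $\nu_e$ | 0.2 − 0.7 | $1.4 \cdot 10^8$ | $\nu_e\, {}^{71}\mathrm{Ga} \to e\, {}^{71}\mathrm{Ge}$ | radioch. |
| Sun (B) | $\nu_e$ | 5.5 − 10 | $1.4 \cdot 10^8$ | $\nu_e\, p \to e\, n$ | Cherenkov |
| Sun (B) | $\nu_e$ | 6 − 10 | $1.4 \cdot 10^8$ | $\nu\, d \to \nu\, p\, n$ | Cherenkov |
| Supernova 1987 | $e\, p \to n\, \nu_e$ | 1 | $1.7 \cdot 10^{18}$ | $\nu_e\, \text{Nucleus} \to e + \cdots$ | Cherenkov |
| Atmosphere (zenith) | $\pi \to \mu\nu_\mu$, $\mu \to \nu_\mu e \nu_e$ | $10^3$ | ∼ 20 | $\nu_{\mu/e}\, \text{Nucleus} \to \mu/e + \cdots$ | Cherenkov |
| Atmosphere (nadir) | $\pi \to \mu\nu_\mu$, $\mu \to \nu_\mu e \nu_e$ | $10^3$ | ∼ 13000 | $\nu_{\mu/e}\, \text{Nucleus} \to \mu/e + \cdots$ | Cherenkov |
| Nuclear reactor | $n \to \bar\nu_e\, e^-\, p$ | 1 | ∼ 1 | $\bar\nu_e\, p \to e^+\, n$ | scint. |
| Accelerator (short base) | $\pi/K \to \mu\nu_\mu$ | $10^{3-5}$ | 0.1 − 1 | $\nu_\mu(\bar\nu_\mu)\, \text{Nucleus} \to l^\mp + \cdots$ | imag. |
| Accelerator (long base) | $\pi/K \to \mu\nu_\mu$ | $10^{3-4}$ | 300 − 900 | $\nu_\mu(\bar\nu_\mu)\, \text{Nucleus} \to l^\mp + \cdots$ | imag. |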

Figure 16.1 Production and detection of neutrinos via direct and inverse β processes.

In 1962, Lederman, Schwartz and Steinberger and collaborators [33] at
Brookhaven observed that neutrinos associated with muons in the decay of
π and K mesons do not give rise to reactions in which an electron appears,
but are invariably associated with a muon. This experiment shows that the
electron and muon are associated with two different types of neutrino,
subsequently denoted respectively as $\nu_e$ and $\nu_\mu$. The third known charged lepton,
the τ, is associated with a third neutrino, $\nu_\tau$, as shown by a recent experiment
carried out at Fermilab by the DONUT collaboration [34].

16.1 OSCILLATIONS IN VACUUM

We can interpret the experiment of the two neutrinos by assuming the
existence of two lepton numbers which are individually conserved: an electron
number, which characterises νe and e, and a muon number, associated with νμ
and μ. In modern terminology, these quantum numbers are known as lepton
flavours.
If neutrinos are massless, different flavour neutrino states are degenerate
and we can always find a basis in which the Hamiltonian and the lepton
flavours are simultaneously diagonal. This is equivalent to aligning the neu-
trino states with the basis of the charged lepton states.
If neutrinos acquire a mass, it is no longer automatic that the mass matrices
of the charged leptons and neutrinos are simultaneously diagonalisable and the
phenomenon of neutrino mixing can occur, with violation of lepton flavour [35,
36].

Mixing of Two Flavours. We denote with $\nu_e$ and $\nu_\mu$ the neutrino states
with the same lepton flavour as $e$ and $\mu$. In general the two states differ from
the eigenstates of the mass matrix, which we denote as $\nu_1$ and $\nu_2$, by a unitary
transformation represented by a 2 × 2 matrix:

$$ |\nu_e\rangle = \cos\theta\, |\nu_1\rangle + \sin\theta\, |\nu_2\rangle , $$

$$ |\nu_\mu\rangle = -\sin\theta\, |\nu_1\rangle + \cos\theta\, |\nu_2\rangle . \tag{16.3} $$

The Hamiltonian of the system of neutrinos with momentum $p$ is:

$$ H = \sqrt{p^2 + m^2} \simeq |p| + \frac{m^2}{2|p|} , \tag{16.4} $$

where $m$ is the neutrino mass matrix and we have used the ultra-relativistic
approximation.
Referring to Fig. 16.1, the amplitude to observe in the detector a neutrino
of flavour $j$, originating from the source as a neutrino of flavour $i$ ($i, j = e, \mu$),
is calculated from the evolution matrix, $e^{-iHt}$. We approximate:

$$ t = L , \qquad |p| = E_\nu , $$

$$ A(i \to j) = \langle j | e^{-iHL} | i \rangle . \tag{16.5} $$

If we denote the mass eigenstates with $|\nu_a\rangle$ ($a = 1, 2$), we also have:

$$ A(i \to j) = \sum_{a,b} \langle j|a\rangle \langle a| e^{-iHL} |b\rangle \langle b|i\rangle = e^{-iE_\nu L} \sum_a \langle j|a\rangle\, e^{-i\frac{m_a^2}{2E_\nu}L}\, \langle a|i\rangle , \tag{16.6} $$

from which we obtain the probability of appearance of a new flavour, for
example $\nu_e \to \nu_\mu$:

$$ P(\nu_e \to \nu_\mu; E, L) = |A(\nu_e \to \nu_\mu)|^2 = \cos^2\theta \sin^2\theta \left| 1 - e^{-i\frac{\Delta m^2 L}{2E_\nu}} \right|^2 = \sin^2(2\theta)\, \sin^2\!\left( \frac{\Delta m^2 L}{4E_\nu} \right) , \tag{16.7} $$

where $\Delta m^2 = m^2_{\nu_2} - m^2_{\nu_1}$. As expected, at short distances, depending on the
values of $\Delta m^2$ and $E_\nu$, the identity of the detected neutrino agrees with that
of the neutrino produced. At longer distances neutrino oscillations between
the initial flavour and the other flavour develop.
We recall that the limit on the mass of $\nu_e$ from the β spectrum is of order
eV. Assuming that the neutrino masses are all of this order, we can see more
clearly the spatial scale of the oscillations by expressing the argument of the
sine in more convenient units$^1$:

$$ \phi = \frac{\Delta m^2 L}{4E_\nu} = \frac{\Delta m^2 L}{4\hbar c E_\nu} = 1.27\, \frac{\Delta m^2(\text{eV}^2)\, L(\text{km})}{E_\nu(\text{GeV})} = 1.27\, \frac{\Delta m^2(\text{eV}^2)\, L(\text{m})}{E_\nu(\text{MeV})} . \tag{16.8} $$

Clearly the interesting effects occur when φ approaches π/2.
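The numerical factor 1.27 and the two-flavour probabilities can be reproduced in a few lines. A minimal sketch, assuming $\hbar c = 197.327$ MeV·fm; the function names and the sample parameters are illustrative and not taken from the text.

```python
import math

HBAR_C_EV_M = 197.3269804e-9  # hbar*c in eV*m (i.e. 197.327 MeV*fm)


def phase_coefficient():
    """Coefficient k in phi = k * dm2[eV^2] * L[km] / E[GeV], from phi = dm2 L / (4 hbar c E)."""
    metres_per_km = 1.0e3
    ev_per_gev = 1.0e9
    return metres_per_km / (4.0 * HBAR_C_EV_M * ev_per_gev)


def oscillation_phase(dm2_ev2, l_km, e_gev):
    """Oscillation phase phi of (16.8), in radians."""
    return phase_coefficient() * dm2_ev2 * l_km / e_gev


def appearance_probability(theta, dm2_ev2, l_km, e_gev):
    """Two-flavour appearance probability sin^2(2 theta) sin^2(phi), as in (16.7)."""
    return math.sin(2.0 * theta) ** 2 * math.sin(oscillation_phase(dm2_ev2, l_km, e_gev)) ** 2


def survival_probability(theta, dm2_ev2, l_km, e_gev):
    """Non-oscillation probability, one minus the appearance probability."""
    return 1.0 - appearance_probability(theta, dm2_ev2, l_km, e_gev)


if __name__ == "__main__":
    print("coefficient:", phase_coefficient())
    # Illustrative parameters: theta = 0.6 rad, dm2 = 1e-3 eV^2, E = 1 GeV.
    for l_km in (1.0, 100.0, 1000.0):
        print(l_km, appearance_probability(0.6, 1.0e-3, l_km, 1.0))
```

The appearance probability reaches its maximum, $\sin^2(2\theta)$, at the distance where $\phi = \pi/2$, which is the sense in which the interesting effects occur there.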
The νe → νμ oscillation is not observable with neutrinos from reactors or
the Sun, which do not have sufficient energy to produce a μ in the detector.
The νe in this case oscillates into a non-observable, sterile, neutrino but we
can observe a neutrino deficit, described by the non-oscillation probability:
P (νe → νe ; E, L) = 1 − P (νe → νμ ; E, L) . (16.9)
The interactions of neutrinos with matter include elastic scattering, in
which the neutrino remains intact instead of transforming into the corre-
sponding charged lepton. In this case (neutral current processes, cf. [13]) all
the neutrinos into which νe oscillate are active and there should be no ob-
served variation in the number of reactions with distance (allowing for the
geometrical reduction in the flux). The simultaneous observation of a deficit
in charged current processes, P (νe → νe ) < 1, with unchanged neutral current
processes constitutes a very convincing proof of the oscillation phenomenon
(cf. the SNO experiment results discussed later).

Three Flavours. The formulae in (16.3) and (16.7) are easily generalised to
the case of three lepton flavours. We write:

$$ |\nu_i\rangle = \sum_a U_{ia}\, |a\rangle \qquad (i = e, \mu, \tau;\ a = 1, 2, 3) . \tag{16.10} $$

The index $a$ characterises the mass eigenstates, and $U$ is a unitary matrix
which is, in general, complex:

$$ U_{ia} = \langle a | i \rangle . \tag{16.11} $$

The non-diagonal oscillation amplitude is written:

$$ A(\nu_e \to \nu_\mu) = \sum_a e^{-i\frac{m_a^2 L}{2E_\nu}}\, U^*_{\mu a} U_{ea} , \tag{16.12} $$

and the probability:

$$ P(\nu_e \to \nu_\mu; E, L) = \sum_{a,b} e^{i\Delta_{ab} L}\, (U^*_{\mu a} U_{ea}) (U_{\mu b} U^*_{eb}) , \tag{16.13} $$

$$ \Delta_{ab} = \frac{m_b^2 - m_a^2}{2E_\nu} , \qquad \Delta_{12} + \Delta_{23} = \Delta_{13} . \tag{16.14} $$

In this case there are two independent mass differences and
therefore two different spatial scales over which the oscillations evolve.

$^1$ We recall that $\hbar c \sim 197$ MeV·fm and that 1 GeV = $10^9$ eV.
In general terms, the 3 × 3 unitary matrix, $U$, is parameterised by three
angles and a complex phase, which implies a violation of CP symmetry (of
which more later). The parameterisation commonly used in the literature is
obtained in the following way.
The most general form of a complex vector $\nu_e$ in terms of three complex
vectors $\nu_{1,2,3}$ is written:

$$ \nu_e = \cos\theta_{13} \left[ \cos\theta_{12}\, \nu_1 + \sin\theta_{12}\, \nu_2 \right] + e^{i\delta} \sin\theta_{13}\, \nu_3 . \tag{16.15} $$

Proof. In general, we expect a phase factor in front of each of the three terms on
the right-hand side. A redefinition of the phase of the left-hand side, $\nu_e \to e^{i\alpha} \nu_e$,
shifts the three phases by the same constant, which we choose so that
the coefficient of $\nu_1$ is real. Now we can redefine the phases of $\nu_2$ and $\nu_3$ only, in a
way so that the phase of $\nu_e$ does not change, therefore $\nu_2 \to e^{i\beta} \nu_2$ and $\nu_3 \to e^{-i\beta} \nu_3$.
By this we can also reduce the phase of $\nu_2$ to zero and we are left with one phase
for the coefficient of $\nu_3$, as in (16.15).

The definition of $\nu_e$ allows us to define two other orthogonal vectors (in
the complex field). We choose:

$$ \nu' = -\sin\theta_{12}\, \nu_1 + \cos\theta_{12}\, \nu_2 , \tag{16.16} $$

$$ \nu'' = e^{-i\delta} \sin\theta_{13} \left[ \cos\theta_{12}\, \nu_1 + \sin\theta_{12}\, \nu_2 \right] - \cos\theta_{13}\, \nu_3 . \tag{16.17} $$

The most general form of vectors $\nu_\mu$ and $\nu_\tau$ will be that of two orthogonal
combinations of $\nu'$ and $\nu''$, which we can choose in terms of a third angle
without having to introduce new phases (the reasoning is the same as that
used to have all real coefficients in equation (16.3)):

$$ \nu_\mu = \cos\theta_{23}\, \nu' + \sin\theta_{23}\, \nu'' , $$

$$ \nu_\tau = -\sin\theta_{23}\, \nu' + \cos\theta_{23}\, \nu'' . \tag{16.18} $$

Overall, writing $c_{ij} = \cos\theta_{ij}$ and $s_{ij} = \sin\theta_{ij}$, the expression for the unitary matrix $U$, for three flavours, is:

$$ U = \begin{pmatrix} c_{13} c_{12} & c_{13} s_{12} & e^{i\delta} s_{13} \\ -c_{23} s_{12} + s_{23} s_{13} c_{12} e^{-i\delta} & c_{23} c_{12} + s_{23} s_{13} s_{12} e^{-i\delta} & -s_{23} c_{13} \\ s_{23} s_{12} + c_{23} s_{13} c_{12} e^{-i\delta} & -s_{23} c_{12} + c_{23} s_{13} s_{12} e^{-i\delta} & -c_{23} c_{13} \end{pmatrix} . \tag{16.19} $$
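The matrix (16.19) and the three-flavour oscillation probability can be checked numerically. The sketch below builds $U$ entry by entry as printed, and evaluates $P(\nu_i \to \nu_j) = |\sum_a U^*_{ja} U_{ia}\, e^{-i m_a^2 L / 2E_\nu}|^2$, using the coefficient 1.27 of (16.8) for the units. The angles, phase and squared masses used in the example are illustrative assumptions, not values from the text.

```python
import cmath
import math


def mixing_matrix(theta12, theta13, theta23, delta):
    """3x3 mixing matrix with rows (e, mu, tau) and columns (1, 2, 3), as in (16.19)."""
    c12, s12 = math.cos(theta12), math.sin(theta12)
    c13, s13 = math.cos(theta13), math.sin(theta13)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    return [
        [complex(c13 * c12), complex(c13 * s12), ep * s13],
        [-c23 * s12 + s23 * s13 * c12 * em, c23 * c12 + s23 * s13 * s12 * em, complex(-s23 * c13)],
        [s23 * s12 + c23 * s13 * c12 * em, -s23 * c12 + c23 * s13 * s12 * em, complex(-c23 * c13)],
    ]


def probability(u, i, j, m2_ev2, l_km, e_gev):
    """P(nu_i -> nu_j) for squared masses m2_ev2 (eV^2), baseline in km, energy in GeV."""
    amplitude = 0j
    for a in range(3):
        # m_a^2 L / (2 E) is twice the phase of (16.8), hence the factor 2 * 1.27.
        phase = 2.0 * 1.27 * m2_ev2[a] * l_km / e_gev
        amplitude += u[j][a].conjugate() * u[i][a] * cmath.exp(-1j * phase)
    return abs(amplitude) ** 2


if __name__ == "__main__":
    # Illustrative (assumed) parameters.
    u = mixing_matrix(0.59, 0.15, 0.78, 1.0)
    m2 = (0.0, 7.5e-5, 2.5e-3)
    for l_km in (0.0, 300.0, 1000.0):
        print(l_km, [round(probability(u, 0, j, m2, l_km, 1.0), 4) for j in range(3)])
```

Unitarity of $U$ guarantees that the probabilities from a given initial flavour sum to one at every distance, which is a useful check on any implementation.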

16.2 NATURAL AND ARTIFICIAL NEUTRINOS

The relevant characteristics of the sources and methods of detection of
naturally occurring and artiﬁcially produced neutrinos are summarised in Ta-
ble 16.1. The Sun is a source of electron neutrinos, νe , produced in nuclear
fusion reactions by several reaction cycles and with a complex energy spectrum
which extends up to ∼10 MeV, Figs. 16.2 and 16.3.

Figure 16.2 The fusion reactions in the Sun (Bethe cycles) [37].

Figure 16.3 Spectrum of solar neutrinos in the Solar Standard Model [38]. Energy
in MeV. Reprinted from Ref. [12] with permission. © APS (1998).

Neutrinos originating from Be and from B were detected in 1967 by Davis
and collaborators with a detector deep underground, to shield it from cosmic
rays, located in the Homestake mine$^2$. The solar neutrinos were detected by
the reaction $^{37}$Cl → $^{37}$Ar by means of radiochemical measurements of the
radioactive isotope $^{37}$Ar. These observations showed a deficit compared to
estimates from solar models, which was interpreted as the effect of neutrino oscillations
by Pontecorvo and Gribov [39]. Neutrinos from the pp cycle were observed
in 1992 with the GALLEX$^3$ and SAGE$^4$ experiments through the reaction
$^{71}$Ga → $^{71}$Ge with radiochemical measurements of $^{71}$Ge. These observations
confirmed the deficit compared to the predicted flux, a very reliable prediction
since the flux of pp neutrinos is directly related to the energy flux produced
by the Sun (cf. the problem at the end of Section 2.4).
The inverse β reaction

$$ \nu_e + n \to e + p , \tag{16.20} $$

was observed with the SuperKamiokande detector$^5$ by the detection of the
electron by means of Cherenkov light. The correlation between the direction
of the electron and the instantaneous position of the Sun allowed discrimination
against background events. More recently, the SNO detector$^6$ detected
dissociation reactions of deuterium by solar neutrinos, followed by recombination
of the neutron with emission of a photon which, in its turn, by Compton
scattering, produces a fast electron, observed by the Cherenkov effect:

$$ \nu + d \to \nu + p + n \quad (\text{NC}) , \tag{16.21} $$

$$ n + p \to d + \gamma , \qquad \gamma + e \to \gamma + e . $$

In reaction (16.21) all the flavours into which the original $\nu_e$ can be
transformed are active. SNO also measures charged current reactions and
scattering on electrons:

$$ \nu + d \to e^- + p + p \quad (\text{CC}) , \tag{16.22} $$

$$ \nu + e^- \to \nu + e^- \quad (\text{ES}) . \tag{16.23} $$

The SNO results prove that there is no solar deficit for the neutrino elastic
scattering processes, confirming the oscillatory nature of the deficit observed
in the charged current reactions.
On 23 February 1987, at 7:35 am Universal Time, neutrino events were
observed in two large underground detectors: Kamiokande in Japan and IMB$^7$
in the USA. A few hours later optical signals from a supernova located in the
Large Magellanic Cloud, at a distance of 170,000 light years, were detected. The correlation
between the neutrino events and the light signals allowed a significant limit
to be set on the difference between the speed of neutrinos of ∼10 MeV energy
and the speed of light.

$^2$ Homestake Gold Mine, Lead, South Dakota, USA.
$^3$ GALLEX, INFN Gran Sasso National Laboratory, Italy.
$^4$ SAGE, Baksan Neutrino Observatory, Caucasus, Russia.
$^5$ Kamioka Observatory, Hida-city, Gifu, Japan.
$^6$ Sudbury Neutrino Observatory, Creighton Mine, Sudbury, Ontario, Canada.
$^7$ Irvine-Brookhaven-Michigan experiment for detection of proton decays, Fairport mine,
Lake Erie, USA.

Table 16.2 Observed deficit in solar neutrino experiments. For SNO, cf. Fig. 16.5.

| Experiment | Observed/expected | Years of operation |
|---|---|---|
| Homestake | $0.33 \pm 0.03 \pm 0.05$ | 1970−1995 |
| Kamiokande | $0.54 \pm 0.08\,^{+0.10}_{-0.07}$ | 1986−1995 |
| SAGE | $0.58 \pm 0.06 \pm 0.03$ | 1990−2006 |
| GALLEX | $0.60 \pm 0.06 \pm 0.04$ | 1991−1996 |
| SuperKamiokande | $0.465 \pm 0.005\,^{+0.016}_{-0.015}$ | 1996− |
Cosmic rays produce neutrinos of energy around 1 GeV starting from π
and K mesons produced in high layers of the atmosphere, via the decay chain

$$ \pi^+/K^+ \to \mu^+ \nu_\mu , \qquad \mu^+ \to \bar\nu_\mu\, e^+\, \nu_e . \tag{16.24} $$

The chain with charge conjugate particles is obtained starting from
$\pi^-/K^-$. Overall the expected fluxes of the two neutrino flavours are in the
ratio $\nu_\mu : \nu_e = 2 : 1$.
Very large underground detectors can observe atmospheric neutrino events
produced (i) in the region directly above the detector (zenith), with path
lengths of order 20 km, (ii) in the atmosphere on the opposite side of the
Earth (nadir) with path lengths of order 13,000 km, corresponding to the
diameter of the Earth. SuperKamiokande distinguishes $\nu_\mu \to \mu$ from $\nu_e \to e$
events by the distribution of Cherenkov light emitted by the charged leptons
and observed a ratio of around 2:1 between events classified as ‘muons’ and
‘electrons’ for the neutrinos from the zenith, while observing a ratio of about
1:1 for those from the nadir. The observation, now strengthened, is interpreted
as due to oscillations of $\nu_\mu$ into a neutrino associated with a different lepton which,
in the Standard Model with three flavours, can only be the $\nu_\tau$. The limited
energy spectrum of atmospheric neutrinos does not allow direct production of
τs but only permits the deficit of atmospheric neutrinos to be established.
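The zenith/nadir difference can be illustrated with the two-flavour formula. The following is a rough sketch under loudly stated assumptions: maximal $\nu_\mu$–$\nu_\tau$ mixing ($\theta = \pi/4$) and $\Delta m^2 = 2.5 \cdot 10^{-3}$ eV$^2$, neither of which is given in this section, together with a flat energy spectrum between 0.5 and 2 GeV chosen only for illustration.

```python
import math

DM2_EV2 = 2.5e-3        # assumed atmospheric mass-squared difference, in eV^2
THETA = math.pi / 4.0   # assumed maximal mixing


def survival(l_km, e_gev, theta=THETA, dm2_ev2=DM2_EV2):
    """Two-flavour nu_mu survival probability, 1 - sin^2(2 theta) sin^2(1.27 dm2 L / E)."""
    phi = 1.27 * dm2_ev2 * l_km / e_gev
    return 1.0 - math.sin(2.0 * theta) ** 2 * math.sin(phi) ** 2


def energy_averaged_survival(l_km, e_min=0.5, e_max=2.0, n=2000):
    """Midpoint average of the survival probability over a flat energy spectrum."""
    h = (e_max - e_min) / n
    return sum(survival(l_km, e_min + (k + 0.5) * h) for k in range(n)) / n


def muon_to_electron_ratio(l_km):
    """Expected mu:e event ratio, starting from a 2:1 flux and unaffected nu_e."""
    return 2.0 * energy_averaged_survival(l_km)


if __name__ == "__main__":
    print("zenith (20 km):   ", muon_to_electron_ratio(20.0))
    print("nadir (13000 km): ", muon_to_electron_ratio(13000.0))
```

Over 20 km the phase is too small for any appreciable conversion, while over the Earth's diameter the rapid oscillations average to a survival probability near one half, turning the 2:1 flux ratio into roughly 1:1.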
Concerning neutrinos from artificial sources, accurate measurements of
the non-oscillation probability of $\bar\nu_e$ over distances of the order of kilometres
have been carried out at nuclear reactors, so far with negative results. The
KamLAND$^8$ experiment adopted an innovative approach, detecting, with a
large volume of scintillator placed inside Kamiokande, the antineutrinos ($\bar\nu_e$)
emitted by a group of nuclear reactors located in Japan and South Korea. The
flux is dominated by a few of the reactors, situated at an average distance
of 180 km. KamLAND succeeded in detecting a reduction of about 40%
compared to the expected flux, and a distortion of the spectrum in agreement
with the oscillation hypothesis.

$^8$ Kamioka Liquid Scintillator Anti-Neutrino Detector. With 1000 tons of liquid
scintillator, it is the largest scintillation detector so far constructed.

Finally, starting from the Kamiokande observations, beams of neutrinos
produced by accelerators have been developed so that they can be detected
in underground experiments placed at long distances from the source
(long-baseline neutrino experiments).
At present there are three active beams of this type:

• K2K⁹: baseline ∼ 250 km, Eν = 1.3 GeV,

• NuMI¹⁰: baseline ∼ 700 km, Eν ≃ 3 GeV,

• CNGS¹¹: baseline ∼ 700 km, Eν ∼ 20 GeV.

The objective of these beams is to study the atmospheric neutrino anomaly
in controlled and reproducible conditions. CNGS also has the goal of observing
τ leptons produced by possible νμ–ντ oscillation. The observation of a τ
event was reported by the OPERA collaboration at Gran Sasso in 2010¹².
Both MINOS and K2K announced the observation of electron events
from the νμ → νe oscillation in 2011.
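As a rough orientation, the oscillation phase reached by each beam can be estimated from the baselines and energies listed above. The sketch below assumes the usual practical form of the phase, 1.27 Δm² L/E with Δm² in eV², L in km and E in GeV (the same 1.27 factor that appears in the just so estimate later in the chapter), and takes |Δm²23| from the Fogli et al. column of Table 16.4. The names `BEAMS`, `oscillation_phase` and `beam_phases` are illustrative only.

```python
import math

# Values quoted in the text; the 1.27 factor assumes dm2 in eV^2, L in km, E in GeV.
DM2_23 = 2.35e-3  # eV^2, |dm2_23| from the Fogli et al. column of Table 16.4
BEAMS = {          # name: (baseline in km, neutrino energy in GeV)
    "K2K": (250.0, 1.3),
    "NuMI": (700.0, 3.0),
    "CNGS": (700.0, 20.0),
}


def oscillation_phase(dm2_ev2, baseline_km, energy_gev):
    """Phase 1.27 * dm2 * L / E entering sin^2 in the two-flavour formula."""
    return 1.27 * dm2_ev2 * baseline_km / energy_gev


def beam_phases(dm2_ev2=DM2_23):
    """Return {beam: (phase, sin^2(phase))} for the beams listed above."""
    result = {}
    for name, (baseline_km, energy_gev) in BEAMS.items():
        phase = oscillation_phase(dm2_ev2, baseline_km, energy_gev)
        result[name] = (phase, math.sin(phase) ** 2)
    return result


if __name__ == "__main__":
    for name, (phase, sin2) in beam_phases().items():
        print(f"{name}: phase = {phase:.3f} rad, sin^2(phase) = {sin2:.3f}")
```

With these inputs the K2K and NuMI phases are a sizeable fraction of a radian, while the CNGS phase is much smaller because of the higher beam energy.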

16.3 INTERACTION WITH MATTER: THE MSW EFFECT

To correctly describe the propagation of neutrinos inside the Sun it is
necessary to take account of their interactions with solar matter. To this end,
to the expression for the energy of the neutrino in vacuum, equation (16.4), we
must add the eﬀect of the weak interaction with particles of the surrounding
medium, i.e. electrons, protons and neutrons. We have to deal with two types
of interaction:

• the νe–e interaction, produced by the exchange of the charged intermediate boson, cf. Section 15.4,

• the interaction due to the exchange of the neutral intermediate boson, Z0, between a neutrino of any flavour and matter particles.

The second interaction is independent of flavour [13] and therefore contributes
to the Hamiltonian a term proportional to the identity matrix that
has no effect on the mixing or on the oscillations. We can therefore limit
our considerations to the first interaction, which arises from the Hamiltonian
density:

$$\mathcal{H}_{\nu_e - e}(x) = -\mathcal{L}_{Fermi}(x) = +\frac{G}{\sqrt{2}}\,[\bar{e}\gamma_\mu(1-\gamma_5)\nu_e]\,[\bar{\nu}_e\gamma^\mu(1-\gamma_5)e] . \tag{16.25}$$

The sign is consistent with the sign found in Section 15.4, when we derived
the Fermi Lagrangian starting from the exchange of the charged intermediate
boson.

9 KEK to Kamiokande, Japan.
10 Neutrinos to MINOS: from Fermilab to the MINOS experiment, Soudan mine, Minnesota, USA.
11 CERN Neutrinos to Gran Sasso, Italy.
12 At the end of 2014, OPERA had identified a total of 4 ντ-initiated events.

The calculation is greatly simplified if we exchange the fields ē and ν̄e ,
by expressing the Hamiltonian as the product of diagonal bilinears, relative
to the field of the electron and νe . This transformation, known as the Fierz
transformation, has the property of leaving invariant the V–A interaction:

$$[\bar{e}\gamma_\mu(1-\gamma_5)\nu_e]\,[\bar{\nu}_e\gamma^\mu(1-\gamma_5)e] = +[\bar{\nu}_e\gamma^\mu(1-\gamma_5)\nu_e]\,[\bar{e}\gamma_\mu(1-\gamma_5)e] . \tag{16.26}$$

We will therefore use the Hamiltonian in the form:

$$\mathcal{H}_{\nu_e - e}(x) = +\frac{G}{\sqrt{2}}\,[\bar{e}\gamma^\mu(1-\gamma_5)e]\,[\bar{\nu}_e\gamma_\mu(1-\gamma_5)\nu_e] . \tag{16.27}$$

Proof. If we write equation (16.25) in terms of left-handed fields, Chapter 13, we
obtain the combination:

$$(\bar{e}_L\gamma_\mu\nu_L)(\bar{\nu}_L\gamma^\mu e_L) ,$$

which can be expressed in terms of bilinears:

$$(\bar{\nu}_L\Gamma^{(i)}\nu_L)(\bar{e}_L\Gamma^{(i)}e_L) ,$$

by the completeness of the Dirac matrices. On the other hand, from the presence of
the projectors 1 − γ5, the Γ(i) matrices must anticommute with γ5 (cf. Chapter 13),
i.e. Γ(i) = V, A. Taken between the left-handed fields, γμ and γμγ5 give the same
result, except for a sign. Therefore, if we take account of the sign change from the
exchange of e and νe due to Fermi statistics, the relation to be proven is:

$$[\gamma_\mu(1-\gamma_5)]_{\alpha\beta}\,[\gamma^\mu(1-\gamma_5)]_{\delta\epsilon} = -[\gamma_\mu(1-\gamma_5)]_{\delta\beta}\,[\gamma^\mu(1-\gamma_5)]_{\alpha\epsilon} .$$

We multiply both sides by $\gamma^\rho_{\beta\alpha}$ and sum over β and α. The left-hand side gives:

$$\mathrm{Tr}[\gamma^\rho\gamma_\mu(1-\gamma_5)]\,[\gamma^\mu(1-\gamma_5)]_{\delta\epsilon} = 4[\gamma^\rho(1-\gamma_5)]_{\delta\epsilon} ,$$

and the right-hand side:

$$-[\gamma_\mu(1-\gamma_5)\gamma^\rho\gamma^\mu(1-\gamma_5)]_{\delta\epsilon} = -2[\gamma_\mu\gamma^\rho\gamma^\mu(1-\gamma_5)]_{\delta\epsilon} = +4[\gamma^\rho(1-\gamma_5)]_{\delta\epsilon} . \tag{16.28}$$

Q. E. D.
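The index identity used in the proof can also be checked numerically, component by component. The sketch below is only an illustration: it assumes the Dirac representation of the γ matrices, the metric (+, −, −, −) and γ5 = iγ⁰γ¹γ²γ³, and it uses plain nested lists so that no external library is needed. The function names are hypothetical.

```python
def zeros():
    """4x4 matrix of complex zeros."""
    return [[0j] * 4 for _ in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def gamma_matrices():
    """[gamma^0, gamma^1, gamma^2, gamma^3] in the Dirac representation (assumed)."""
    sigma = [
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ]
    g0 = zeros()
    for i in range(4):
        g0[i][i] = 1 if i < 2 else -1
    gammas = [g0]
    for s in sigma:
        g = zeros()
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = s[i][j]      # upper-right block: +sigma
                g[i + 2][j] = -s[i][j]     # lower-left block: -sigma
        gammas.append(g)
    return gammas


def v_minus_a():
    """Return (lower, upper): gamma_mu (1 - gamma_5) and gamma^mu (1 - gamma_5)."""
    g = gamma_matrices()
    g5 = matmul(matmul(g[0], g[1]), matmul(g[2], g[3]))
    g5 = [[1j * x for x in row] for row in g5]
    one_minus_g5 = [[(1 if i == j else 0) - g5[i][j] for j in range(4)]
                    for i in range(4)]
    upper = [matmul(gm, one_minus_g5) for gm in g]
    metric = [1, -1, -1, -1]
    lower = [[[metric[mu] * x for x in row] for row in upper[mu]]
             for mu in range(4)]
    return lower, upper


def fierz_max_deviation():
    """Largest |LHS - RHS| of the index identity over all 256 index choices."""
    lower, upper = v_minus_a()
    worst = 0.0
    for a in range(4):
        for b in range(4):
            for c in range(4):
                for d in range(4):
                    lhs = sum(lower[m][a][b] * upper[m][c][d] for m in range(4))
                    rhs = -sum(lower[m][c][b] * upper[m][a][d] for m in range(4))
                    worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    print("max deviation:", fierz_max_deviation())
```

The check exchanges the first indices of the two factors and inserts the minus sign, exactly as in the relation to be proven; the additional sign from Fermi statistics is what turns it into the plus sign of equation (16.26).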

Matrix Element. The energy density of the neutrino is found by taking the
matrix element of equation (16.27) between the states with the initial and
final electron at rest, and the initial and final neutrino with equal momentum,
p:

$$E_W = +\frac{G}{\sqrt{2}}\,\langle e(p=0)|\bar{e}\gamma^\mu(1-\gamma_5)e|e(p=0)\rangle \times \langle\nu_e(p)|\bar{\nu}_e\gamma_\mu(1-\gamma_5)\nu_e|\nu_e(p)\rangle . \tag{16.29}$$

With the electron at rest, only μ = 0 counts. We can approximate the
mass of the neutrino to zero and consider it a Weyl field (in this limit, the
same result holds for a Majorana neutrino, Chapter 13). We use the expansion
(13.10) and we find for the energy density:

$$E_W = +\frac{G}{\sqrt{2}}\,\frac{1}{V^2}\,u_e(0)^\dagger u_e(0) \times 2 \times u_{\nu_e L}^\dagger(p)\,u_{\nu_e L}(p) , \tag{16.30}$$

with the factor 2 arising from the action of 1 − γ5 on the left-handed neutrino.
The normalisation chosen corresponds to a neutrino and an electron in a
volume V. The neutrino energy is obtained by multiplying by V and by the
number of electrons present, Ne. With ρe = Ne/V, the number of electrons
per unit volume, we obtain:

$$E_W = +\sqrt{2}\,G\rho_e = V , \tag{16.31}$$

where, from here on, V denotes the matter potential rather than the normalisation
volume, and the total energy of the neutrino can now be written:

$$E = |p| + \frac{m^2}{2p} + V = |p| + \Delta E . \tag{16.32}$$
The estimated mass density at the centre of the Sun is equal to:

$$\rho_{Sun}(r=0) \simeq 162\ \mathrm{g/cm^3} . \tag{16.33}$$

To a good approximation in the Sun there is one proton for every electron,
therefore the density of 1 g/cm³ corresponds to NA = 6.6 · 10²³ electrons/cm³,
from which:

$$(\rho_e)_{Sun}(r=0) \simeq \frac{\rho_{Sun}(r=0)}{162\ \mathrm{g/cm^3}}\,N_A \cdot 162\,(\hbar c)^3 = \frac{\rho_{Sun}(r=0)}{162\ \mathrm{g/cm^3}}\,8.18\cdot 10^{11}\ \mathrm{eV^3} , \tag{16.34}$$

from which

$$V = 1.35\cdot 10^{-5}\,\frac{\rho_{Sun}(r=0)}{162\ \mathrm{g/cm^3}}\ \frac{\mathrm{eV^2}}{\mathrm{MeV}} , \tag{16.35}$$

with ρSun in g/cm³.
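The chain of unit conversions leading to equations (16.34) and (16.35) can be retraced numerically. The following sketch assumes ħc = 1.97327 · 10⁻⁵ eV cm and G = 1.16638 · 10⁻⁵ GeV⁻² (standard values, not quoted in the text) and takes the number of electrons per gram from the text as a parameter, so that other values can be tried. Function and constant names are illustrative.

```python
import math

HBARC_EV_CM = 1.97327e-5      # hbar*c in eV*cm (assumed standard value)
G_FERMI_EV = 1.16638e-23      # Fermi constant in eV^-2, i.e. 1.16638e-5 GeV^-2 (assumed)
ELECTRONS_PER_GRAM = 6.6e23   # electrons per cm^3 at 1 g/cm^3, as quoted in the text


def electron_density_ev3(rho_g_cm3, electrons_per_gram=ELECTRONS_PER_GRAM):
    """Electron number density converted from cm^-3 to eV^3 (natural units)."""
    return rho_g_cm3 * electrons_per_gram * HBARC_EV_CM ** 3


def matter_potential(rho_g_cm3, electrons_per_gram=ELECTRONS_PER_GRAM):
    """V = sqrt(2) * G * rho_e, returned in eV^2/MeV (1 eV^2/MeV = 1e-6 eV)."""
    v_ev = math.sqrt(2.0) * G_FERMI_EV * electron_density_ev3(rho_g_cm3, electrons_per_gram)
    return v_ev / 1.0e-6


if __name__ == "__main__":
    print("rho_e at the solar centre [eV^3]:", electron_density_ev3(162.0))
    print("V at the solar centre [eV^2/MeV]:", matter_potential(162.0))
```

With the inputs of the text the result agrees with equations (16.34) and (16.35) to better than one per cent; using a different number of electrons per gram rescales V proportionally.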


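MSW Effect. In the (νe, νμ) basis, we can express the neutrino mass matrix
in terms of Δm²12 = m2² − m1² and the vacuum mixing angle:

$$\langle i|m^2|j\rangle = \sum_a \langle i|a\rangle\, m_a^2\, \langle a|j\rangle = [U (m^2)_{diag} U^\dagger]_{ij} = -\frac{\Delta m^2_{12}}{2}\begin{pmatrix} \cos(2\theta_{12}) & -\sin(2\theta_{12}) \\ -\sin(2\theta_{12}) & -\cos(2\theta_{12}) \end{pmatrix} + \cdots , \tag{16.36}$$

where we have omitted a term proportional to the unit matrix. Similarly, we
write the potential as:

$$V = +\sqrt{2}\,G\rho_e \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = +\frac{\sqrt{2}\,G}{2}\,\rho_e \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \cdots , \tag{16.37}$$

from which:

$$\Delta E = \frac{m^2}{2E_\nu} + V = \left[-\frac{\Delta m^2_{12}\cos(2\theta_{12})}{4E_\nu} + \frac{\sqrt{2}\,G}{2}\,\rho_e\right]\sigma_3 + \frac{\Delta m^2_{12}\sin(2\theta_{12})}{4E_\nu}\,\sigma_1 + \cdots , \tag{16.38}$$

where σ1,3 are the familiar Pauli matrices and we have omitted terms proportional
to the identity matrix.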
If Δm212 cos(2θ12 ) > 0, equation (16.38) shows that a critical value of the
density exists for which the coefficient of σ3 vanishes and, correspondingly, the
mixing angle becomes maximal, even if the vacuum mixing angle is very small.
In these conditions a νe → νμ resonant conversion, the Mikheyev–Smirnov–
Wolfenstein effect [40, 41], can occur. The critical density is:

$$\bar{\rho}_e = \frac{\Delta m^2_{12}\cos(2\theta_{12})}{2\sqrt{2}\,G E_\nu} . \tag{16.39}$$
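The statement that the mixing becomes maximal at the critical density can be illustrated with the two coefficients of equation (16.38). In the sketch below the effective mixing in matter is defined by sin²(2θm) = b²/(a² + b²), where a and b are the coefficients of σ3 and σ1; this definition is the standard one for a 2 × 2 traceless Hamiltonian but is an assumption here, since the text does not write it out. The potential V = √2 G ρe is passed directly in eV²/MeV, and all function names are illustrative.

```python
import math


def sigma_coefficients(dm2_ev2, theta, energy_mev, v_ev2_per_mev):
    """Coefficients (a, b) of sigma_3 and sigma_1 in equation (16.38), in eV^2/MeV."""
    a = -dm2_ev2 * math.cos(2.0 * theta) / (4.0 * energy_mev) + 0.5 * v_ev2_per_mev
    b = dm2_ev2 * math.sin(2.0 * theta) / (4.0 * energy_mev)
    return a, b


def sin2_2theta_matter(dm2_ev2, theta, energy_mev, v_ev2_per_mev):
    """Effective sin^2(2 theta_m) in matter, assumed to be b^2 / (a^2 + b^2)."""
    a, b = sigma_coefficients(dm2_ev2, theta, energy_mev, v_ev2_per_mev)
    return b * b / (a * a + b * b)


def critical_potential(dm2_ev2, theta, energy_mev):
    """Value of V = sqrt(2) G rho_e for which the sigma_3 coefficient vanishes."""
    return dm2_ev2 * math.cos(2.0 * theta) / (2.0 * energy_mev)


if __name__ == "__main__":
    dm2, theta, energy = 7.58e-5, 0.05, 7.0   # small vacuum angle chosen for illustration
    v_crit = critical_potential(dm2, theta, energy)
    for v in (0.0, 0.5 * v_crit, v_crit, 2.0 * v_crit):
        print(v, sin2_2theta_matter(dm2, theta, energy, v))
```

Even for a small vacuum angle the effective mixing reaches its maximum when V equals the critical value, which is equivalent to ρe = ρ̄e in equation (16.39).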
The reactions which generate the neutrinos take place in the central region
of the Sun (r = 0) where the density is maximum; in their journey towards the
surface, the neutrinos cross regions whose density decreases until it vanishes
at the surface of the Sun, r = RSun . The mixing matrix at the exit from the
Sun is obtained by solving the Schrödinger equation along the trajectory. We
can consider two extreme cases [16]:

• the potential term is negligible:

$$\frac{\Delta m^2_{12}\cos(2\theta_{12})}{4E_\nu} \gg \frac{\sqrt{2}\,G}{2}\,\rho_e(r=0) \quad\text{or}\quad \rho_e(r=0) \ll \bar{\rho}_e . \tag{16.40}$$

The neutrino deviates further from the critical condition, the matter
effect is negligible and the formulae for oscillations in vacuum apply.

• the potential term is dominant:

$$\frac{\Delta m^2_{12}\cos(2\theta_{12})}{4E_\nu} \ll \frac{\sqrt{2}\,G}{2}\,\rho_e(r=0) \quad\text{or}\quad \rho_e(r=0) \gg \bar{\rho}_e . \tag{16.41}$$

The neutrino at the origin is found in the eigenstate with maximum
eigenvalue of V = +(√2 G/2) ρe(r = 0). If, as happens in the Sun, the variation
in density along the trajectory is sufficiently slow (adiabatic conditions),
the neutrino is found at all times in an eigenstate of the mixing
matrix, corresponding to the density at that point. In addition, it can
be shown that the levels do not cross, so that, at the exit from the
Sun, the neutrino is found in the eigenstate of the mass matrix with
maximum eigenvalue, or:

$$\nu_e(r = R_{Sun}) \sim \nu_2 . \tag{16.42}$$

After this, the eigenstate propagates to the detector without oscillation,
from which:

$$P(\nu_e \to \nu_e) = |U_{e2}|^2 = \sin^2\theta_{12} \quad \text{(MSW adiabatic)} . \tag{16.43}$$

16.4 ANALYSIS OF THE EXPERIMENTS

The results of recent decades have led to a determination of the param-
eters which describe neutrino oscillations of the three known flavours. We
summarise in Table 16.4 the values obtained in two recent measurements.
These numbers result from a rather complex global fit, but it is possible to
provide a few simple considerations for guidance, starting from the fact that
θ13 is much smaller than the other two angles and that Δm212 << Δm223 .
In the limit θ13 = 0, equations (16.15) and (16.18) show that νe is mixed
only with the superposition of νμ and ντ which we called ν . Therefore the
oscillations of νe and ν̄e , including solar neutrinos and the effects of their
passage through the interior of the Sun, are, to an excellent approximation,
those of a simple system of two neutrinos, νe –ν .

KamLAND. Measurement of the reduction in the flux of ν̄e compared to
what is observed in experiments in close proximity to reactors, and the
distortion of the energy spectrum, allows a direct measurement of the oscillation
parameters, Table 16.3.

Table 16.3 Oscillation parameters for νe–ν determined by the KamLAND experiment [42].

Δm²12 (10⁻⁵ eV²)                      tan²(θ12)
7.58 ± 0.14 (stat) ± 0.15 (syst)      0.56 ± 0.10 (stat) ± 0.10 (syst)

Solar Neutrinos. The first proposal to explain the deficit observed at Homestake,
advanced by Pontecorvo and Gribov, was the “just so” solution, in
which the oscillation phase is of order π/2 just at the orbit of the Earth.
Referring to the values from GALLEX (deficit R ∼ 0.58, E ∼ 0.5 MeV) we
find:

$$\text{just so solution:} \tag{16.44}$$

$$\Delta m^2_{12} \sim \frac{\pi}{2}\,\frac{E}{1.27\,L_{Sun}} \sim 4.3\cdot 10^{-12}\ \mathrm{eV^2} , \qquad \sin^2\theta_{12} \sim 0.12 . \tag{16.45}$$

The just so solution appears in the lower part of Fig. 16.4, but is not
consistent with all the other information, in particular with the KamLAND
observations, which require an oscillation length of terrestrial scale. For Δm²12
much larger than the just so value, the oscillation in vacuum averages to a
value independent of distance:

$$P(\nu_e \to \nu_e) \to 1 - \frac{1}{2}\sin^2(2\theta_{12}) \quad \text{(vacuum oscillation)} . \tag{16.46}$$
Referring to the value of GALLEX we obtain:

sin2 (2θ12 ) = 2(1 − R) = 0.84 → sin2 θ12 = 0.30 . (16.47)

This solution corresponds to a vertical line with tan2 θ12 = 0.43, from the
just so solution upwards, roughly the dark region of Fig. 16.4.
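The inversion of equation (16.46) that leads to equation (16.47) is short enough to be written out as a small function. The sketch below assumes the solution with θ12 < π/4 is the one intended, which is what the quoted value sin²θ12 = 0.30 implies; the function name is illustrative.

```python
import math


def sin2_theta_from_averaged_survival(r):
    """Invert R = 1 - 0.5 * sin^2(2 theta) for sin^2(theta), taking theta < pi/4."""
    sin2_2theta = 2.0 * (1.0 - r)
    if not 0.0 <= sin2_2theta <= 1.0:
        raise ValueError("averaged vacuum oscillations require 0.5 <= R <= 1")
    # sin^2(2 theta) = 4 s (1 - s) with s = sin^2(theta); pick the smaller root.
    return 0.5 * (1.0 - math.sqrt(1.0 - sin2_2theta))


if __name__ == "__main__":
    s = sin2_theta_from_averaged_survival(0.58)   # GALLEX deficit quoted in the text
    print("sin^2(theta12) =", round(s, 3))
    print("tan^2(theta12) =", round(s / (1.0 - s), 3))
```

For R = 0.58 this reproduces sin²(2θ12) = 0.84, sin²θ12 = 0.30 and tan²θ12 ≈ 0.43, the values used in the text.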
For a more precise analysis, it is necessary to take account of the MSW
effect in the Sun, equation (16.38). Using the KamLAND parameters, Table 16.3,
and the values of the energy, Eν = 0.5 MeV and Eν = 7 MeV for gallium and
boron, respectively, we find:

$$\frac{\Delta m^2_{12}\cos(2\theta_{12})}{4E_\nu} = 2.4\cdot 10^{-5}\ \frac{\mathrm{eV^2}}{\mathrm{MeV}}\ \text{(Ga)} , \qquad \frac{\Delta m^2_{12}\cos(2\theta_{12})}{4E_\nu} = 0.17\cdot 10^{-5}\ \frac{\mathrm{eV^2}}{\mathrm{MeV}}\ \text{(B)} ,$$

$$\frac{\sqrt{2}\,G}{2}\,\rho_e(r=0) = 0.63\cdot 10^{-5}\ \frac{\mathrm{eV^2}}{\mathrm{MeV}} .$$

We are, therefore, in the conditions:

$$\bar{\rho}_e(\mathrm{Ga}) > \rho_{Sun}(r=0) > \bar{\rho}_e(\mathrm{B}) , \tag{16.48}$$

from which, to a sufficiently good approximation, we can conclude that:

• for the deficit from Ga, equation (16.47) applies, which, with R(Ga) =
0.58, gives sin2 θ12 (Ga) = 0.30;
• for the deficit from B, equation (16.43) applies, which, with R(B) = 0.3
gives sin2 θ12 (B) = 0.33.
The values of the deficit from Homestake and GALLEX/SAGE, while
different from each other, are explained by the same oscillation parameters,
consistent, moreover, with those observed by KamLAND and other experiments,
as shown in Table 16.4.

Figure 16.4 Allowed regions from the measurement of the probability of non-
oscillation; νe of solar origin and ν̄e produced on Earth [16]. Reprinted from Ref. [12]
with permission. © APS (1998).

Figure 16.5 SNO results [43]. The measurement of charged current (CC) and neutral
current (NC, ES) reactions allows separation of the contributions of νe and the
superposition νμ –ντ , and to compare them with the predictions of the Standard
Model and the predicted spectrum for neutrinos from boron. Reprinted from Ref. [12]
with permission. © APS (1998).

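Atmospheric Neutrinos. On the scale of energies and distances of atmospheric
neutrinos, cf. Table 16.1, the oscillation associated with Δm²12 is negligible.
If we set θ13 and Δm²12 equal to zero, the formula for the non-oscillation
amplitude of νμ, cf. equation (16.12), simplifies considerably:

$$\begin{aligned}
|A(\nu_\mu \to \nu_\mu; E, L)| &= \Big|\sum_a e^{-i\frac{m_a^2 L}{2E_\nu}}\,U^*_{\mu a}U_{\mu a}\Big| \\
&= \big|\,|U_{\mu 1}|^2 + |U_{\mu 2}|^2 + e^{i\Delta_{23}L}\,|U_{\mu 3}|^2\,\big| \\
&= \big|\,1 - |U_{\mu 3}|^2 + e^{i\Delta_{23}L}\,|U_{\mu 3}|^2\,\big| .
\end{aligned} \tag{16.49}$$

In addition:

$$|U_{\mu 3}|^2 = \sin^2\theta_{23}\cos^2\theta_{13} \sim \sin^2\theta_{23} , \tag{16.50}$$

from which:

$$\begin{aligned}
P(\nu_\mu \to \nu_\mu; E, L) &= \cos^4\theta_{23} + \sin^4\theta_{23} + 2\cos^2\theta_{23}\sin^2\theta_{23}\cos(\Delta_{23}L) \\
&= 1 - \sin^2(2\theta_{23})\,\sin^2\!\left(\frac{\Delta m^2_{23}L}{4E_\nu}\right) .
\end{aligned} \tag{16.51}$$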
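The two forms of equation (16.51) can be compared numerically. The sketch below assumes the practical conversion Δm²L/(4Eν) = 1.27 Δm²[eV²] L[km]/E[GeV], so that Δ23 L is twice that phase; the function names are illustrative and the parameter values in the usage lines are taken from Table 16.4 only as an example.

```python
import math


def phase(dm2_ev2, baseline_km, energy_gev):
    """dm2 * L / (4 E) in radians, with dm2 in eV^2, L in km and E in GeV."""
    return 1.27 * dm2_ev2 * baseline_km / energy_gev


def survival_probability(dm2_ev2, sin2_theta23, baseline_km, energy_gev):
    """Second line of equation (16.51)."""
    sin2_2theta = 4.0 * sin2_theta23 * (1.0 - sin2_theta23)
    return 1.0 - sin2_2theta * math.sin(phase(dm2_ev2, baseline_km, energy_gev)) ** 2


def survival_probability_expanded(dm2_ev2, sin2_theta23, baseline_km, energy_gev):
    """First line of equation (16.51), with Delta_23 L equal to twice the phase."""
    s2 = sin2_theta23
    c2 = 1.0 - s2
    delta_l = 2.0 * phase(dm2_ev2, baseline_km, energy_gev)
    return c2 ** 2 + s2 ** 2 + 2.0 * c2 * s2 * math.cos(delta_l)


if __name__ == "__main__":
    for baseline_km in (20.0, 500.0, 13000.0):
        print(baseline_km, survival_probability(2.35e-3, 0.42, baseline_km, 1.0))
```

At baselines of thousands of kilometres the phase is large, so the value at a single energy depends strongly on the inputs; only the average over the energy spectrum is directly comparable with a measured flux ratio.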
Table 16.4 Recent determinations of the oscillation parameters of neutrinos of three
flavours from a global fit to all available data.

Parameter               Fogli et al. [44]        Schwetz et al. [45]
Δm²12 (10⁻⁵ eV²)        7.58 +0.22/−0.26         7.59 +0.20/−0.18
|Δm²23| (10⁻³ eV²)      2.35 +0.12/−0.09         2.50 +0.09/−0.16
sin²θ12                 0.312 +0.017/−0.016      0.312 +0.017/−0.015
sin²θ23                 0.42 +0.08/−0.03         0.52 +0.06/−0.07
sin²θ13                 0.025 ± 0.007            0.013 +0.007/−0.005

Using the parameters from Tables 16.1 and 16.4 with Eν ∼ 1 GeV, we
find:

$$P(\nu_\mu \to \nu_\mu; 1\ \mathrm{GeV}, 13000\ \mathrm{km}) = 0.25 , \tag{16.52}$$

which is a good approximation to the result of SuperKamiokande,
R(nadir) = observed flux/calculated flux ∼ 0.57.

The Transformation νμ → νe. The probability of appearance of νe in a
long-baseline beam composed initially of νμ is easily calculated in the
approximation Δm²12 = 0. Exchanging e with μ in equation (16.12) gives:

$$\begin{aligned}
|A(\nu_\mu \to \nu_e)| &= \big|U^*_{e1}U_{\mu 1} + U^*_{e2}U_{\mu 2} + e^{-i\Delta_{13}L}\,U^*_{e3}U_{\mu 3}\big| \\
&= \big|U^*_{e3}U_{\mu 3}\,(1 - e^{-i\Delta_{13}L})\big| ,
\end{aligned} \tag{16.53}$$

therefore:

$$\begin{aligned}
P(\nu_\mu \to \nu_e) &= |U^*_{e3}U_{\mu 3}|^2\; 2[1 - \cos(\Delta_{31}L)] \\
&= \sin^2\theta_{23}\,\sin^2(2\theta_{13})\,\sin^2\!\left(\frac{\Delta m^2_{23}L}{4E_\nu}\right) .
\end{aligned} \tag{16.54}$$

The observation by MINOS and K2K of electrons produced by a νμ
beam, if confirmed, allows the determination of the third angle, θ13, cf. Table 16.4.
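The step between the two lines of equation (16.54) can be verified numerically. The sketch below assumes |Ue3|² = sin²θ13 (the standard parametrisation, not written out in this section) together with |Uμ3|² = sin²θ23 cos²θ13 from equation (16.50), and the practical phase 1.27 Δm²[eV²] L[km]/E[GeV]. Function names and the sample inputs are illustrative.

```python
import math


def appearance_probability(dm2_ev2, sin2_theta23, sin2_theta13, baseline_km, energy_gev):
    """Second line of equation (16.54)."""
    sin2_2theta13 = 4.0 * sin2_theta13 * (1.0 - sin2_theta13)
    phi = 1.27 * dm2_ev2 * baseline_km / energy_gev
    return sin2_theta23 * sin2_2theta13 * math.sin(phi) ** 2


def appearance_from_matrix_elements(dm2_ev2, sin2_theta23, sin2_theta13,
                                    baseline_km, energy_gev):
    """First line of equation (16.54), built from |U_e3|^2 and |U_mu3|^2."""
    ue3_sq = sin2_theta13                            # assumed standard parametrisation
    umu3_sq = sin2_theta23 * (1.0 - sin2_theta13)    # equation (16.50)
    delta_l = 2.0 * 1.27 * dm2_ev2 * baseline_km / energy_gev
    return ue3_sq * umu3_sq * 2.0 * (1.0 - math.cos(delta_l))


if __name__ == "__main__":
    # Sample inputs: Fogli et al. column of Table 16.4, 700 km baseline, 3 GeV.
    print(appearance_probability(2.35e-3, 0.42, 0.025, 700.0, 3.0))
```

The appearance probability vanishes identically for θ13 = 0, which is why an observed νe signal in a νμ beam gives access to the third angle.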

Matter–Antimatter Symmetry. If we move from the states, equation (16.10),
to fields, along the lines of Chapter 13, we can connect the
oscillations of neutrinos and antineutrinos in a precise way. The neutrino
field is written schematically as:

$$\psi \sim a\,e^{-ikx} + b^\dagger e^{+ikx} , \tag{16.55}$$

and neutrino and antineutrino states are obtained by applying the relevant
creation operators, a† and b†, to the vacuum.
Since the mixing of the fields is described by a unique matrix which we
denote as U∗, it follows that:

• the mixing of the neutrino states is described by the matrix U,

• the mixing of the antineutrinos is described by U∗.

If U is real, oscillations of neutrinos (from the Sun or from π+ decays)
are identical to those of antineutrinos (from reactors or from π− decays), as
we have implicitly assumed so far. The presence or absence of a phase in the
mixing matrix is connected to the violation of matter–antimatter symmetry¹³.
This connection appears explicitly by comparing the formula (16.12) with the
analogous one for antineutrinos:

$$\begin{aligned}
A(\nu_e \to \nu_\mu; E, L) &= \sum_a e^{-i\frac{m_a^2 L}{2E_\nu}}\,U^*_{\mu a}U_{ea} , \\
A(\bar{\nu}_e \to \bar{\nu}_\mu; E, L) &= \sum_a e^{-i\frac{m_a^2 L}{2E_\nu}}\,U_{\mu a}U^*_{ea} .
\end{aligned} \tag{16.56}$$

We see that:

• P (ν̄e → ν̄μ ; E, L) = P (νe → νμ ; E, L) if, and only if, U is real.

Even for complex U , equation (16.56) implies a connection between the

oscillations of neutrinos and antineutrinos, i.e.:

• P (ν̄e → ν̄μ ; E, L) = P (νe → νμ ; E, −L) = P (νμ → νe ; E, L) for any

unitary U .
If we recall that L ≡ t, we see that the previous equation implies an exact
symmetry under matter–antimatter exchange combined with time reversal. It is the
consequence of the CPT symmetry which every relativistic
quantum field theory obeys.
As we saw in equation (16.19), the theory with three flavours allows a
mixing matrix with a complex phase. We can see explicitly how this leads to
a violation of matter–antimatter symmetry, by calculating the difference:

$$\begin{aligned}
\Delta P &= \frac{1}{2}\,[P(\nu_\mu \to \nu_e; E, L) - P(\bar{\nu}_\mu \to \bar{\nu}_e; E, L)] \\
&= \frac{1}{2}\,[P(\nu_\mu \to \nu_e; E, L) - P(\nu_\mu \to \nu_e; E, -L)] .
\end{aligned} \tag{16.57}$$

13 This is described by the CP product of the charge conjugation C and parity P transformations. For a field theory discussion of the C and P symmetries and of time reversal, T, see Chapter 12.

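Using the definition of equation (16.14) and the antisymmetry of Δab, we
find:

$$\begin{aligned}
\Delta P &= -i\sum_{ab}\sin(\Delta_{ab}L)\,(U^*_{\mu a}U_{ea})(U_{\mu b}U^*_{eb}) \\
&= \sum_{ab}\sin(\Delta_{ab}L)\,\mathrm{Im}\,[(U^*_{\mu a}U_{ea})(U_{\mu b}U^*_{eb})] \\
&= 2\sum_{a<b}\sin(\Delta_{ab}L)\,\mathrm{Im}\,[(U^*_{\mu a}U_{ea})(U_{\mu b}U^*_{eb})] \\
&= 2\,\{\sin(\Delta_{12}L)\,\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 2}U^*_{e2})] \\
&\qquad + \sin(\Delta_{13}L)\,\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 3}U^*_{e3})] \\
&\qquad + \sin(\Delta_{23}L)\,\mathrm{Im}\,[(U^*_{\mu 2}U_{e2})(U_{\mu 3}U^*_{e3})]\} .
\end{aligned} \tag{16.58}$$

We use the orthogonality relation on the rows of U:

$$U^*_{\mu 1}U_{e1} + U^*_{\mu 2}U_{e2} + U^*_{\mu 3}U_{e3} = 0 , \tag{16.59}$$

to eliminate $(U^*_{\mu 2}U_{e2})$, for example:

$$\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 2}U^*_{e2})] = -\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 3}U^*_{e3})] , \tag{16.60}$$

and the second condition from equation (16.14) to eliminate Δ13. We find:

$$\begin{aligned}
\Delta P &= -2\,\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 3}U^*_{e3})] \times \{\sin(\Delta_{12}L) - \sin[(\Delta_{12}+\Delta_{23})L] + \sin(\Delta_{23}L)\} \\
&= -2\,\mathrm{Im}\,[(U^*_{\mu 1}U_{e1})(U_{\mu 3}U^*_{e3})] \times \{\sin(\Delta_{12}L)[1-\cos(\Delta_{23}L)] + \sin(\Delta_{23}L)[1-\cos(\Delta_{12}L)]\} \\
&= F \times G ,
\end{aligned} \tag{16.61}$$

and the two factors are, explicitly, cf. equation (16.19):

$$\begin{aligned}
F &= \cos\theta_{13}\,\sin(2\theta_{12})\,\sin(2\theta_{23})\,\sin(2\theta_{13})\,\sin\delta , \\
G &= 2\left[\sin(\Delta_{12}L)\,\sin^2\!\left(\frac{\Delta_{23}L}{2}\right) + \sin(\Delta_{23}L)\,\sin^2\!\left(\frac{\Delta_{12}L}{2}\right)\right] .
\end{aligned} \tag{16.62}$$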

From these formulae we learn that:

• the phase in the matrix U produces a violation of CP symmetry,

• the violation only occurs if all the mass diﬀerences and all the angles
are diﬀerent from zero; in the converse case, as we saw when setting θ13
or Δm212 to zero, the system reduces to a mixing between two states,
which is described by a real matrix.
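The two conclusions above can be checked numerically with an explicit three-flavour matrix. The sketch below is an illustration under stated assumptions: it uses the standard parametrisation of the mixing matrix (which may differ by phase conventions from equation (16.19)), amplitudes of the form of equation (16.56), and dimensionless phases m²a L/(2E) supplied directly by the caller. All function names are hypothetical.

```python
import cmath
import math


def mixing_matrix(theta12, theta23, theta13, delta):
    """Three-flavour mixing matrix, standard parametrisation (assumed convention)."""
    c12, s12 = math.cos(theta12), math.sin(theta12)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    c13, s13 = math.cos(theta13), math.sin(theta13)
    ph = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / ph],
        [-s12 * c23 - c12 * s23 * s13 * ph, c12 * c23 - s12 * s23 * s13 * ph, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ph, -c12 * s23 - s12 * c23 * s13 * ph, c23 * c13],
    ]


def probability(u, phases, alpha, beta, antineutrino=False):
    """P(nu_alpha -> nu_beta) with phases[a] = m_a^2 L / (2 E); flavours 0=e, 1=mu, 2=tau."""
    amplitude = 0j
    for a in range(3):
        u_alpha, u_beta = u[alpha][a], u[beta][a]
        if antineutrino:                       # antineutrinos mix with U* instead of U
            u_alpha, u_beta = u_alpha.conjugate(), u_beta.conjugate()
        amplitude += cmath.exp(-1j * phases[a]) * u_beta.conjugate() * u_alpha
    return abs(amplitude) ** 2


def cp_difference(u, phases, alpha=1, beta=0):
    """Delta P of equation (16.57) for nu_alpha -> nu_beta (default mu -> e)."""
    return 0.5 * (probability(u, phases, alpha, beta)
                  - probability(u, phases, alpha, beta, antineutrino=True))


if __name__ == "__main__":
    angles = [math.asin(math.sqrt(x)) for x in (0.312, 0.42, 0.025)]
    phases = [0.0, 0.3, 1.1]                   # arbitrary illustrative phases
    for delta in (0.0, math.pi / 2):
        print(delta, cp_difference(mixing_matrix(*angles, delta), phases))
```

In this sketch the difference vanishes when δ = 0, when θ13 = 0, or when two of the phases coincide, and is non-zero otherwise, in line with the two statements above.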

16.5 OPEN PROBLEMS

The observation of the phenomena just illustrated has shown beyond any
reasonable doubt that neutrinos have a mass and that their ﬂavour quan-
tum numbers are not conserved. Several problems remain open, which make
neutrinos still a frontier area of astrophysics, cosmology and the physics of
fundamental forces. Without any attempt at completeness, we illustrate a few
of these issues.

CP Violation in the Lepton Sector. The recent evidence for a non-zero

value of θ13 , Table 16.4, opens the way to the search for CP violation among
neutrinos, which would match the already observed violation in weak interac-
tions of hadrons, cf. [13]. To ﬁnd evidence of an asymmetry in the oscillations
of neutrinos and antineutrinos, even for large values of the phase δ which ap-
pears in the mixing matrix, (16.19), requires the construction of accelerators
providing extremely high intensities of muons: muon factories, muon colliders,
etc.

Majorana Neutrino. Everything which has been said so far holds for both
Dirac and Majorana neutrinos since, for masses which are anyway << E, the
V–A interaction inhibits transitions between particles and antiparticles, of
the type νe → e+, cf. Chapter 13. The observation of neutrinoless double-β
decay is the characteristic signal of a Majorana neutrino with mass, and is
actively being sought in several laboratories, for example Gran Sasso. The
present situation is illustrated in the next chapter.

Mass Hierarchy. As is seen from equation (16.51) and in Table 16.1, the
data do not define the sign of Δm²13; therefore, we have two possible orderings
of the neutrino masses:

$$\text{normal order:}\quad m_1^2 < m_2^2 < m_3^2 , \qquad \text{inverted order:}\quad m_3^2 < m_1^2 < m_2^2 . \tag{16.63}$$

In the case of quark masses, normal ordering holds, and in addition each
mass squared is much less than the squared mass which follows. If this scheme
were to be repeated for neutrinos, we would have:

$$0 \sim m_1^2 \ll \Delta m^2_{12} \sim m_2^2 \ll \Delta m^2_{23} \sim m_3^2 , \tag{16.64}$$

but they could also have squared masses roughly equal to each other and much
larger than the differences:

$$m_1^2 \sim m_2^2 \sim m_3^2 \gg \Delta m^2 . \tag{16.65}$$

The absolute values of the masses can be obtained with accurate measurements
of the electron spectrum in β decay of tritium, Chapter 15, or from the
observation of neutrinoless double-β decay.
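To make the two orderings concrete, the masses can be computed from the measured squared-mass differences once the lightest mass is fixed. The sketch below assumes Δm²12 = m2² − m1² and |Δm²23| = |m3² − m2²|, uses the Fogli et al. values of Table 16.4 as defaults, and treats the lightest mass as a free input (zero by default, which is purely an illustrative choice). The function name is hypothetical.

```python
import math


def neutrino_masses(dm2_12=7.58e-5, dm2_23_abs=2.35e-3, lightest=0.0, ordering="normal"):
    """Return (m1, m2, m3) in eV for the given ordering and lightest mass in eV."""
    if ordering == "normal":        # m1 < m2 < m3
        m1_sq = lightest ** 2
        m2_sq = m1_sq + dm2_12
        m3_sq = m2_sq + dm2_23_abs
    elif ordering == "inverted":    # m3 < m1 < m2
        m3_sq = lightest ** 2
        m2_sq = m3_sq + dm2_23_abs
        m1_sq = m2_sq - dm2_12
    else:
        raise ValueError("ordering must be 'normal' or 'inverted'")
    return tuple(math.sqrt(x) for x in (m1_sq, m2_sq, m3_sq))


if __name__ == "__main__":
    print("normal:  ", neutrino_masses(ordering="normal"))
    print("inverted:", neutrino_masses(ordering="inverted"))
    print("quasi-degenerate:", neutrino_masses(lightest=0.5))
```

With a vanishing lightest mass the normal ordering gives a hierarchy of the type of equation (16.64), while a lightest mass much larger than the square roots of the differences gives the quasi-degenerate pattern of equation (16.65).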
Sterile Neutrinos. An experiment carried out at Los Alamos laboratory [47]
with the LSND detector¹⁴ has observed positrons (around 60 events) and
electrons (around 18 events) possibly originating in the decay chain:

$$\begin{aligned}
\pi^+ &\to \mu^+ + \nu_\mu \quad (\pi^+ \text{ in flight}) , \\
\mu^+ &\to \bar{\nu}_\mu + e^+ + \nu_e \quad (\mu^+ \text{ at rest}) .
\end{aligned} \tag{16.66}$$

From the energy and distance scales involved, the interpretation of the
result as a νμ → νe or ν̄μ → ν̄e oscillation requires a value of Δm² very
different from the values observed with atmospheric neutrinos:

$$(\Delta m^2)_{LSND} \sim 1\ \mathrm{eV^2} . \tag{16.67}$$

Three different values of Δm² require a fourth neutrino. However, the
decay of the neutral vector boson is compatible only with three neutrino
types [13]. Therefore the fourth neutrino, if confirmed, must be a sterile
neutrino, not subject to the usual weak interactions.
Searches to confirm or disprove the LSND effect are at present underway
with beams of muon neutrinos (MiniBooNE¹⁵ experiment) and at several
nuclear reactors (disappearance of νe at short distances).

16.6 PROBLEM FOR CHAPTER 16

Sect. 16.1
1. Show that the phase arbitrariness of quantum states can be used to
make the two-neutrino mixing matrix real, as in Eq. (16.3).

14 Liquid Scintillator Neutrino Detector, Los Alamos Meson Facility, USA.

15 Fermilab, Chicago, USA.
CHAPTER 17

NEUTRINOLESS
DOUBLE-BETA DECAY

As pointed out in Section 16.5, clear-cut evidence of massive Majorana

neutrinos can only be obtained from the observation of neutrinoless double
beta decay, wherein a parent nucleus emits a pair of virtual W bosons which
exchange a Majorana neutrino and produce the two emitted electrons. This
chapter provides an outline of the derivation of the double beta decay rate, as
well as a summary of the status of the experiments aimed at establishing the
existence of the Majorana neutrino and determining its mass.

17.1 DOUBLE BETA DECAY

Double beta decay is a rare nuclear transition between two nuclei having
the same mass number A, in which the nuclear charge number Z changes by
two units. This process was first considered by Maria Goeppert-Mayer shortly
after the development of Fermi’s theory of neutron beta decay [48]¹, which
provided a remarkably accurate description of the reactions

$$N(A, Z) \to N'(A, Z+1) + e^- + \bar{\nu}_e , \tag{17.1}$$

$$N(A, Z) \to N'(A, Z-1) + e^+ + \nu_e , \tag{17.2}$$

where N and N′ denote the parent and daughter nucleus, respectively; see
Section 15.1.
The stability of a nucleus of mass number A against beta decays is driven
by the dependence of its mass, MA , on the charge number Z. In the vicinity
of the minimum, the available experimental data turns out to be accurately
1 In her paper, Goeppert-Mayer actually reports that the possible occurrence of double-beta nuclear disintegrations had been pointed out to her by E. Wigner.

272 DOI: 10.1201/9781003436263-17

This chapter has been made available under a CC BY NC license.
Figure 17.1 Z-dependence of the masses of A = 76 isobars, as described by
Eq. (17.3). The upper and lower parabolae correspond to odd-odd and even-even
nuclei, respectively. It is apparent that the $^{76}_{32}$Ge → $^{76}_{33}$As beta decay is forbidden,
while the $^{76}_{32}$Ge → $^{76}_{34}$Se double beta decay is energetically allowed.

described by the quadratic parametrisation

MA (Z) = Zmp + (A − Z)mn (17.3)
2 2
Z (A − 2Z)
+ α − βA−1/3 − γ −δ − (A, Z) .
A1/3 4A
with mp and mn being the proton and neutron mass, respectively. The last
contribution to the right-hand side of the above equation accounts for the
experimental evidence that nuclei with even number of both protons and neu-
trons tend to be more stable than those with odd Z and odd N = A − Z,
as well as those with odd A. This property originates from the pairing of like
nucleons, and can be described setting = ±11.2 A−1/2 MeV for even A,
even-even or odd-odd nuclei respectively, and = 0 for odd A.
Figure 17.1 illustrates the Z-dependence of the masses of isobars with
A = 76. The upper and lower parabolae correspond to odd-odd and even-even
nuclei, respectively. It is apparent that beta decays of even-even nuclei
leading to the appearance of an odd-odd isobar, for example the transition
$^{76}_{32}$Ge → $^{76}_{33}$As, are energetically forbidden. On the other hand, $^{76}_{32}$Ge can decay
to $^{76}_{34}$Se emitting two electrons.
Double beta decays can occur through two different reaction modes,
characterised by different final states. These are the two-neutrino double beta
(2β2ν) decays

$$N(A, Z) \to N'(A, Z+2) + e^-_1 + e^-_2 + \bar{\nu}_1 + \bar{\nu}_2 , \tag{17.4}$$

$$N(A, Z) \to N'(A, Z-2) + e^+_1 + e^+_2 + \nu_1 + \nu_2 , \tag{17.5}$$

analysed in the pioneering work of Goeppert-Mayer, and the neutrinoless
double beta (2β0ν) decays

$$N(A, Z) \to N'(A, Z+2) + e^-_1 + e^-_2 , \tag{17.6}$$

$$N(A, Z) \to N'(A, Z-2) + e^+_1 + e^+_2 , \tag{17.7}$$

ﬁrst discussed by Furry in 1939 [49]. The decays proceed mainly from the
ground state of a parent nucleus with spin-parity 0+ to the 0+ ground state
of the daughter nucleus, although in some instances the transition to excited
0+ or 2+ states is also energetically allowed.
The occurrence of 2β0ν decay processes—which is forbidden in the Stan-
dard Model of weak interactions outlined in Chapter 15—is only possible if
neutrinos behave as massive Majorana particles, the properties of which are
discussed in Chapter 13.
Neglecting the recoil energy of the daughter nucleus, the Q-value of double
beta decays reduces to

$$Q_{2\beta} = M_N - M_{N'} - 2m_e , \tag{17.8}$$

where M_N and M_N′ are the masses of the initial and final nuclei, respectively,
and me is the electron mass. Note that, because 2β2ν decays lead to the
appearance of a final state comprising four leptons, the sum of the kinetic
energies of the two charged leptons, Ee1 + Ee2, exhibits a continuous spectrum
extending from zero to Q. In the case of 2β0ν decays, on the other hand, there
are only two final-state leptons, and Ee1 + Ee2 = Q.

17.1.1 Two-neutrino Double Beta Decay

The 2β2ν decay is a second order process in the weak interaction described
by the Fermi Lagrangian introduced in Chapter 9, which can be conveniently
cast in the form

$$\mathcal{L}_F = -\frac{G}{\sqrt{2}}\,L_\rho H^\rho + \text{h.c.} , \tag{17.9}$$

with

$$L_\rho = \bar{\psi}_e\gamma_\rho(1-\gamma_5)\psi_\nu , \tag{17.10}$$

and

$$H^\rho = \bar{\psi}_N\gamma^\rho\left(1 - \frac{g_A}{g_V}\gamma_5\right)\tau^+\psi_N . \tag{17.11}$$

In the above equation, the nucleon is described by the doublet

$$\psi_N = \begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix} , \tag{17.12}$$

where ψp and ψn denote the proton and neutron fields, respectively, while
τ+ = (τ1 + iτ2)/2, with the τi being Pauli matrices, is the operator raising
the lower component of the doublet.
The S-matrix element describing the decay (17.4) can be written in the
form

$$\begin{aligned}
S_{2\nu} &= \int d^4x_1\,d^4x_2\,\langle e_1, e_2, \bar{\nu}_1, \bar{\nu}_2, N'|\,T\{\mathcal{L}_F(x_1)\mathcal{L}_F(x_2)\}\,|N\rangle \\
&= (2\pi)^4\,\delta^{(4)}(P_f - P_i)\,M_{2\nu} ,
\end{aligned} \tag{17.13}$$

where Pi and Pf are the four-momenta of the initial and final states,
respectively, and M2ν denotes the transition amplitude. The expression of S2ν
involves the lepton tensor

$$L_{\rho\sigma} = J_\rho(x_1)\,J_\sigma(x_2) ,$$

with²

$$J_\rho(x_i) = \langle e_i, \bar{\nu}_i|\,L_\rho(x_i)\,|0\rangle = \frac{1}{2}\,\frac{1}{\sqrt{E_{e_i}E_{\nu_i}}}\;\bar{e}_i\gamma_\rho(1-\gamma_5)\nu_i\;e^{i(p_{e_i}+p_{\nu_i})x_i} , \tag{17.14}$$

and the nuclear tensor

$$H^{\rho\sigma} = \langle\Psi_{N'}|\,H_A^\rho(x_1)\,H_A^\sigma(x_2)\,|\Psi_N\rangle , \tag{17.15}$$

with |Ψ_N⟩ and |Ψ_N′⟩ being the ground states of the parent and daughter
nuclei, respectively. The nuclear current is defined as

$$H_A^\rho = \sum_{i=1}^{A} H_i^\rho , \tag{17.16}$$

with $H_i^\rho$ being the current of Eq. (17.11) associated with the i-th nucleon.
Figure 17.2 provides a diagrammatic representation of the reaction (17.4),
in which beta decays are associated with the exchange of the positively-
charged W bosons, as described by the IVB Lagrangian LIV B of Eq. (15.38).
The Fermi Lagrangian LF can be obtained from LIV B in the MW →
∞ limit, safely applicable to the description of nuclear beta decays; see
Section 15.4.
The calculation of H^{ρσ} is carried out by rewriting the nuclear tensor in
the form

$$H^{\rho\sigma} = \sum_m \langle\Psi_{N'}|\,H_A^\rho(x_1)\,|\Psi_m\rangle\langle\Psi_m|\,H_A^\sigma(x_2)\,|\Psi_N\rangle , \tag{17.17}$$

2 Here we use the spinor normalisation of Eq. (13.8), which turns out to be more conve-

nient in the m → 0 limit.


Figure 17.2 Diagrammatic representation of the two-neutrino double beta decay

reaction (17.4).

with {Ψm} being a complete set of eigenstates of the unobserved intermediate-state
nucleus, having mass number A and charge number Z + 1. The time
integration in Eq. (17.13) leads to the familiar result of second order perturbation
theory

$$(2\pi)\,\delta(E_{N'} + E_{e_1} + E_{\nu_1} + E_{e_2} + E_{\nu_2} - E_N) \times \sum_m \frac{\langle\Psi_{N'}|\,H_A^\rho(x_1)\,|\Psi_m\rangle\langle\Psi_m|\,H_A^\sigma(x_2)\,|\Psi_N\rangle}{E_m + E_{e_2} + E_{\nu_2} - E_N}\;J_\rho(x_1)\,J_\sigma(x_2) + \ldots , \tag{17.18}$$

where the ellipsis refers to the presence of similar additional contributions, arising
from different time orderings and exchange terms. Numerical calculations are
often performed using the approximation

$$E_{e_1} + E_{\nu_1} \approx E_{e_2} + E_{\nu_2} \approx \frac{M_N - M_{N'}}{2} , \tag{17.19}$$

with M_N and M_N′ being the masses of initial and final nuclei, respectively,
which allows the lepton and nuclear parts of the amplitude to be decoupled.
The half-life of the parent nucleus, $T^{2\nu}_{1/2}$, trivially related to the decay
rate Γ through $[T^{2\nu}_{1/2}]^{-1} = \Gamma/\ln 2$, can be cast in the form

$$[T^{2\nu}_{1/2}]^{-1} = |M_{2\nu}|^2\,G_{2\nu} , \tag{17.20}$$

where

$$M_{2\nu} = M^{GT}_{2\nu} - \left(\frac{g_V}{g_A}\right)^2 M^{F}_{2\nu} . \tag{17.21}$$

In the above equation, $M^{F}_{2\nu}$ and $M^{GT}_{2\nu}$, referred to as Fermi and
Gamow-Teller matrix elements, describe the nuclear transition amplitudes involving
the vector and axial-vector weak currents, respectively. Using the approximation
of Eq. (17.19) and treating the nucleons as non-relativistic particles they
can be written in the form

$$M^{F}_{2\nu} = \sum_m \frac{\langle\Psi_{N'}|\sum_{j=1}^{A}\tau^+_j|\Psi_m\rangle\,\langle\Psi_m|\sum_{k=1}^{A}\tau^+_k|\Psi_N\rangle}{E_m - (M_N + M_{N'})/2} , \tag{17.22}$$

$$M^{GT}_{2\nu} = \sum_m \frac{\langle\Psi_{N'}|\sum_{j=1}^{A}\boldsymbol{\sigma}_j\tau^+_j|\Psi_m\rangle \cdot \langle\Psi_m|\sum_{k=1}^{A}\boldsymbol{\sigma}_k\tau^+_k|\Psi_N\rangle}{E_m - (M_N + M_{N'})/2} , \tag{17.23}$$

with σj being the matrix associated with the spin of the j-th nucleon.
The phase-space factor is obtained from

$$G_{2\nu} = \frac{1}{\ln 2}\,\frac{G^4 g_A^4}{64\pi^7 m_e^2}\int d\Omega_{2\nu}\,F_0(Z+2, E_{e_1})\,F_0(Z+2, E_{e_2}) , \tag{17.24}$$

with

$$d\Omega_{2\nu} = E_{\nu_1}^2 dE_{\nu_1}\,E_{\nu_2}^2 dE_{\nu_2}\,|p_{e_1}|E_{e_1}dE_{e_1}\,|p_{e_2}|E_{e_2}dE_{e_2}\,d\cos\theta \times \delta(E_{N'} + E_{e_1} + E_{\nu_1} + E_{e_2} + E_{\nu_2} - E_N) ,$$

where θ is the angle between the momenta of the emitted electrons. The Fermi
factor F0(Z, Ei) appearing in Eq. (17.24) takes into account the effect of the
Coulomb attraction between the emitted electron and the daughter nucleus,
having charge number Z + 2. The corresponding expression is

$$F_0(Z+2, E) = \frac{E}{|p|}\,\frac{2\pi\alpha(Z+2)}{1 - e^{-2\pi\alpha(Z+2)}} , \tag{17.25}$$

with α being the fine-structure constant. Note that nuclear matrix elements
defined according to the above equations turn out to be dimensionless.
The occurrence of 2β2ν decay has been unambiguously observed in several
nuclei, ranging from ⁴⁸Ca to ¹⁵⁰Nd, using different detection techniques. The
measured values of the half-life lie in the range $10^{18} \lesssim T^{2\nu}_{1/2} \lesssim 10^{23}$ years [50].
Precision measurements of 2β2ν reactions can provide information useful to
constrain the models employed to perform calculations of the nuclear matrix
elements, which are also needed for the determination of the 2β0ν decay rates
to be discussed in the next section.
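The Fermi factor of Eq. (17.25) is simple to evaluate. The sketch below assumes that E is the total electron energy in MeV, so that |p| = √(E² − me²), and uses α = 1/137.036 and me = 0.511 MeV (standard values not quoted in this section). The function name and the sample inputs are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999      # fine-structure constant (assumed standard value)
ELECTRON_MASS_MEV = 0.51099895  # electron mass in MeV (assumed standard value)


def fermi_factor(z_daughter, total_energy_mev):
    """F0 of Eq. (17.25) for daughter charge z_daughter and total electron energy E."""
    if total_energy_mev <= ELECTRON_MASS_MEV:
        raise ValueError("total energy must exceed the electron mass")
    momentum = math.sqrt(total_energy_mev ** 2 - ELECTRON_MASS_MEV ** 2)
    x = 2.0 * math.pi * ALPHA * z_daughter
    if x == 0.0:
        coulomb = 1.0                      # limit x / (1 - exp(-x)) -> 1 for x -> 0
    else:
        coulomb = x / (-math.expm1(-x))    # x / (1 - exp(-x)), stable for small x
    return (total_energy_mev / momentum) * coulomb


if __name__ == "__main__":
    # Sample case: daughter nucleus with Z + 2 = 34 (selenium), E = 1.5 MeV.
    print(fermi_factor(34, 1.5))
    print(fermi_factor(0, 1.5))            # no Coulomb field: reduces to E / |p|
```

For a daughter charge of 34 the Coulomb attraction enhances the factor by roughly a factor of two with respect to the value E/|p| obtained without Coulomb field.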

17.1.2 Neutrinoless Double Beta Decay

The observation of the hypothetical 2β0ν decay reaction would provide
evidence of matter creation, associated with a two-unit violation of lepton
number conservation. Such a departure from the predictions of the Standard
Model of weak interactions may be explained considering the process illus-
trated in Fig. 17.3. Since no antineutrino is emitted, the two antineutrino
lines attached to the weak interaction vertices in Fig. 17.2 should be joined
to represent a neutrino propagator. If neutrinos behaved as Dirac particles,
however, this interpretation would not be allowed, because:

Figure 17.3 Diagrammatic representation of the neutrinoless double beta decay re-
action (17.6).

1) The antineutrino emitted from the upper leptonic vertex could not be
absorbed by the lower vertex, which can only absorb a neutrino;
2) The helicity of the antineutrino emitted from the upper leptonic vertex
would be positive, while the lower leptonic vertex can only absorb a
neutrino with negative helicity.
As a consequence, the occurrence of 2β0ν decay entails two necessary condi-
tions:
1) The neutrino must be a Majorana fermion, in which case $\bar{\nu}_e = \nu_e$ and
the total lepton number is not conserved;
2) The neutrino mass $m_{\nu_e}$ must be nonzero, in which case a neutrino
of negative helicity can be emitted from the upper leptonic vertex with
amplitude $m_{\nu_e}/E_{\nu_e}$, and absorbed by the lower leptonic vertex with unit
amplitude.
For convenience, we recall here the expression of the field describing massive
Majorana fermions, discussed in Chapter 13,

$$\psi_\nu(x) = \sum_{\mathbf{p},r} \frac{1}{\sqrt{2VE_\nu}} \left[ a_r(\mathbf{p})\,u_r(\mathbf{p})\,e^{-ipx} + a_r^\dagger(\mathbf{p})\,u_r(\mathbf{p})\,e^{ipx} \right] , \tag{17.26}$$

showing that a Majorana particle is identical to its antiparticle and inherently
charge neutral.
The expression of the S-matrix element of 2β0ν decay, to be contrasted
with the corresponding expression for 2β2ν, Eq. (17.13), reads

$$S_{0\nu} = -\int d^4x_1\, d^4x_2\, \langle e_1, e_2, N' |\, T\{L_F(x_1) L_F(x_2)\}\, | N \rangle . \tag{17.27}$$

Comparison between the diagrams of Figs. 17.2 and 17.3 shows that the
nuclear parts of the 2β0ν and 2β2ν amplitudes are identical. The lepton part
of Eq. (17.27)

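$$\begin{aligned}
L_{\rho\sigma} &= \langle e_1 e_2 |\, T\{L_\rho(x_1) L_\sigma(x_2)\}\, |0\rangle \\
&= \langle e_1 e_2 |\, T\{\bar{\psi}_e(x_1)\gamma_\rho(1-\gamma_5)\psi_{\nu_e}(x_1)\,\bar{\psi}_e(x_2)\gamma_\sigma(1-\gamma_5)\psi_{\nu_e}(x_2)\}\, |0\rangle
\end{aligned} \tag{17.28}$$

can be rewritten substituting the expression of the $\nu_e$ field with a superposition
of mass eigenstates $\nu_i$ according to Eq. (16.10). The resulting expression is

$$L_{\rho\sigma} = \sum_{j,k=1}^{3} U_{ej} U_{ek}\, \langle e_1 e_2 |\, T\{\bar{\psi}_e(x_1)\gamma_\rho(1-\gamma_5)\psi_{\nu_j}(x_1)\,\bar{\psi}_e(x_2)\gamma_\sigma(1-\gamma_5)\psi_{\nu_k}(x_2)\}\, |0\rangle , \tag{17.29}$$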

with the $U_{ej}$ being elements of the neutrino mixing matrix. Let us now
assume that the neutrino is a Majorana particle, so that

$$\psi^T_{\nu_k} = -\bar{\psi}_{\nu_k} C , \tag{17.30}$$

where C is the charge-conjugation matrix. Substituting the above relation in
$L_{\rho\sigma}$ we obtain the result

$$L_{\rho\sigma} = \sum_{j,k=1}^{3} U_{ej} U_{ek}\, \langle e_1 e_2 |\, T\{\bar{\psi}_e(x_1)\gamma_\rho(1-\gamma_5)\psi_{\nu_j}(x_1)\,\psi^T_{\nu_k}(x_2)(1-\gamma_5)\gamma_\sigma\psi^C_e(x_2)\}\, |0\rangle , \tag{17.31}$$

showing that implementation of the Majorana condition, Eq. (17.30), leads
to the appearance of the propagator describing the internal neutrino line
appearing in Fig. 17.3

$$\sum_{k=1}^{3} U_{ek}^2\, \langle 0|\, T\{\psi_{\nu_k}(x_1)\psi^T_{\nu_k}(x_2)\}\, |0\rangle = -i \sum_{k=1}^{3} U_{ek}^2 \int \frac{d^4q}{(2\pi)^4}\, \frac{\not{q} + m_k}{q^2 - m_k^2}\, e^{-iq(x_1-x_2)} . \tag{17.32}$$

Using the result

$$(1-\gamma_5)(\not{q} + m_j)(1-\gamma_5) = -2m_j(1-\gamma_5) , \tag{17.33}$$

we find that the lepton part of the 2β0ν amplitude is proportional to the
effective Majorana mass, defined as

$$m_{ee} = \sum_{k=1}^{3} U_{ek}^2\, m_k . \tag{17.34}$$

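In addition, performing the $q_0$ integration in the right-hand side of Eq. (17.32),
we obtain the result

$$\int \frac{d^4q}{(2\pi)^4}\, \frac{e^{-iq(x_1-x_2)}}{q^2 - m_j^2} = \int \frac{d^3q}{(2\pi)^3}\, \frac{e^{-i[\omega_q(x_{10}-x_{20}) - \mathbf{q}\cdot(\mathbf{x}_1-\mathbf{x}_2)]}}{2\omega_q} , \tag{17.35}$$

with $\omega_q = \sqrt{\mathbf{q}^2 + m_j^2}$.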
As noted above, the nuclear part of the amplitude is the same as in the
case of 2β2ν decays. In neutrinoless decays, however, its time dependence is
combined with a diﬀerent time dependence of the leptonic part. Carrying out
the time integration, we ﬁnd the result

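$$(2\pi)\,\delta(E_{N'} + E_{e_1} + E_{e_2} - E_N) \times \sum_m \left[ \frac{\langle\Psi_{N'}|H_A^\rho(x)|\Psi_m\rangle \langle\Psi_m|H_A^\sigma(y)|\Psi_N\rangle}{E_m + \omega_q + E_{e_2} - E_N} + \dots \right] . \tag{17.36}$$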

The sum over intermediate nuclear states is performed using the closure
approximation, which amounts to replacing the energy $E = E_m + E_{e_2} - E_N$,
appearing in the denominator of the above equation, with an average value
$\bar{E} \sim 10$ MeV. The sum can then be performed exploiting completeness
of the set $\{|\Psi_m\rangle\}$, implying $\sum_m |\Psi_m\rangle\langle\Psi_m| = \mathbf{1}$. This procedure drastically
simplifies the calculation of the nuclear matrix elements, which reduce to the
form

$$M^{F}_{0\nu} = \langle\Psi_{N'}|\, \sum_{j,k=1}^{A} \tau_j^+ \tau_k^+ H(r_{jk})\, |\Psi_N\rangle , \tag{17.37}$$

$$M^{GT}_{0\nu} = \langle\Psi_{N'}|\, \sum_{j,k=1}^{A} (\boldsymbol{\sigma}_j\cdot\boldsymbol{\sigma}_k)\, \tau_j^+ \tau_k^+ H(r_{jk})\, |\Psi_N\rangle , \tag{17.38}$$

where rjk is the distance between the two nucleons undergoing beta decay. The
function H(rjk ), called neutrino potential, originates from the integration of
Eq. (17.35) over the space components of the virtual neutrino momentum,
which can be performed using the approximation ωq ≈ |q|. The resulting
expression is
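$$H(r) = \frac{2R_A}{\pi r} \int_0^{+\infty} dq\, \frac{\sin(qr)}{q + \bar{E}} , \tag{17.39}$$
with $R_A = 1.2\, A^{1/3}$ fm being the radius of the parent nucleus and $\bar{E}$ the average energy of the closure approximation.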
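The oscillatory integral in Eq. (17.39) can be evaluated without special functions. A minimal sketch, assuming the standard identity $\int_0^\infty \sin t/(t+x)\,dt = \int_0^{\pi/2} e^{-x\tan\theta}\,d\theta$ for $x \ge 0$ and using $x = \bar{E}r/\hbar c$; the function names, the value of $\hbar c$ and the default average energy of 10 MeV are assumptions made for illustration.

```python
import math

HBARC = 197.327  # MeV fm (assumed numerical value)


def aux_f(x, n=20000):
    """f(x) = int_0^inf sin(t)/(t+x) dt, computed as int_0^{pi/2} exp(-x tan(theta)) dtheta.

    Composite Simpson rule on theta; n must be even and x >= 0.
    """
    h = (math.pi / 2.0) / n
    total = 1.0                          # theta = 0 endpoint
    total += 1.0 if x == 0 else 0.0      # theta = pi/2 endpoint
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-x * math.tan(i * h))
    return total * h / 3.0


def neutrino_potential(r_fm, mass_number, e_bar_mev=10.0):
    """Eq. (17.39): H(r) = (2 R_A / (pi r)) * f(E_bar * r / (hbar c)), dimensionless."""
    r_a = 1.2 * mass_number ** (1.0 / 3.0)   # nuclear radius in fm
    x = e_bar_mev * r_fm / HBARC
    return 2.0 * r_a / (math.pi * r_fm) * aux_f(x)


if __name__ == "__main__":
    # Example: two nucleons 2 fm apart in a nucleus with A = 76
    print(neutrino_potential(2.0, 76))
```

For vanishing average energy the integral equals π/2 and the potential reduces to the Coulomb-like form R_A/r.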
Collecting the nuclear and leptonic parts together we can cast the expression
of the 2β0ν half-life in the form

$$\left(T^{0\nu}_{1/2}\right)^{-1} = |M_{0\nu}|^2\, G_{0\nu}\, \frac{|m_{ee}|^2}{m_e^2} , \tag{17.40}$$

where $m_{ee}$ is the effective Majorana mass, defined in Eq. (17.34), and $M_{0\nu}$
is the dimensionless nuclear transition matrix element, including the
contributions of Fermi and Gamow-Teller transitions

$$M_{0\nu} = M^{GT}_{0\nu} - \left(\frac{g_V}{g_A}\right)^2 M^{F}_{0\nu} . \tag{17.41}$$

Finally, the phase-space factor, G0ν , is given by

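$$G_{0\nu} = \frac{2}{\ln 2}\, \frac{G^4 g_A^4}{16\pi^5}\, m_e^2 \int d\Omega_{0\nu}\, F_0(Z+2, E_{e_1})\, F_0(Z+2, E_{e_2}) , \tag{17.42}$$
where

$$d\Omega_{0\nu} = |\mathbf{p}_{e_2}|\, E_{e_2}\, |\mathbf{p}_{e_1}|\, E_{e_1}\, dE_{e_1} , \tag{17.43}$$

with $E_{e_2} = E_N - E_{N'} - E_{e_1}$.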
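Equation (17.40) can be inverted to turn a lower bound on the half-life into an upper bound on the effective Majorana mass. The sketch below does this; the function names are hypothetical, and the phase-space factor and nuclear matrix element used in the example are placeholder numbers chosen only to show the orders of magnitude involved, not values taken from the text.

```python
import math

M_E_EV = 0.51099895e6  # electron mass in eV (assumed numerical value)


def inverse_half_life(m_ee_ev, g0nu_per_year, m0nu):
    """Eq. (17.40): 1/T = |M|^2 * G * (m_ee / m_e)^2, in 1/years."""
    return abs(m0nu) ** 2 * g0nu_per_year * (m_ee_ev / M_E_EV) ** 2


def m_ee_upper_limit(t_half_lower_years, g0nu_per_year, m0nu):
    """Upper limit on m_ee (eV) implied by a lower limit on the half-life."""
    return M_E_EV / (abs(m0nu) * math.sqrt(g0nu_per_year * t_half_lower_years))


if __name__ == "__main__":
    # Placeholder inputs: G = 1.5e-14 / year and |M| = 3 are illustrative only
    print(m_ee_upper_limit(2.3e26, 1.5e-14, 3.0), "eV")
```

Because the limit scales as 1/|M|, a range of computed matrix elements maps directly into a range of mass limits for the same measured half-life.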

17.2 EXPERIMENTAL STUDIES OF DOUBLE BETA DECAY

As pointed out above, the observation of 2β0ν decays would unambiguously
establish the existence of Majorana neutrinos and provide a measurement of
the effective Majorana mass mee , the definition of which involves both the
elements of the neutrino mixing matrix, Uei , and the mass eigenvalues mi ;
see Eq. (17.34). Complementary information on neutrino masses is inferred
from cosmological observations. The 2018 analysis of the data collected by the
Planck observatory provides the constraint $\sum_i m_i < 0.12$ eV [51].
As discussed in Section 15.1, a limit on the neutrino mass can also be
obtained from the observation of the endpoint of the electron energy
distribution in tritium beta decay experiments. The latest measurement of the
effective electron neutrino mass, defined as $m_\nu^2 = \sum_i |U_{ei}|^2 m_i^2$, reported by
the KATRIN Collaboration, yields the upper limit $m_\nu < 0.8$ eV [52].
The detection of 2β0ν decays is based on the measurement of the total
energy of the two emitted electrons, which is expected to exhibit a sharp peak
at Ee1 + Ee2 = Q2β , with Q2β defined by Eq. (17.8).
The data obtained from the oscillation analyses, discussed in Chapter 16,
provide differences between the squared neutrino mass eigenvalues, $\Delta_{ij}$, but
do not allow one to establish which of the three mass eigenstates is the heaviest.
Experimental studies of 2β0ν decay have the potential to resolve this issue.
To see this, consider that, by using the available experimental information
on mixing angles and mass splittings, $m_{ee}$ can be written as a function of
the lightest neutrino mass, $m_{\rm light}$, assuming either normal order (NO),
corresponding to $\Delta_{23} = m_3^2 - m_2^2 > 0$, or inverted order (IO), corresponding to
$\Delta_{23} = m_3^2 - m_2^2 < 0$. As a consequence, experimental searches for 2β0ν
decay turn out to be sensitive to differences between the predictions of the two
scenarios, and may shed new light on the neutrino mass spectrum.

Table 17.1 Current limits on the 2β0ν decay half-life and the effective
Majorana mass, reported by experiments performed using different nuclei.

Nucleus   Experiment          T^{0ν}_{1/2} [years]   m_ee [meV]
^{76}Ge   GERDA [53]          > 1.8 × 10^{26}        < 79 − 180
^{76}Ge   MAJORANA [54]       > 8.3 × 10^{25}        < 113 − 269
^{136}Xe  KamLAND-Zen [55]    > 2.3 × 10^{26}        < 36 − 156
^{136}Xe  EXO-200 [56]        > 3.5 × 10^{25}        < 93 − 286
^{130}Te  CUORE [57]          > 2.2 × 10^{25}        < 90 − 305
^{100}Mo  CUPID-Mo [58]       > 1.8 × 10^{24}        < 280 − 490
^{82}Se   CUPID-0 [59]        > 4.6 × 10^{24}        < 263 − 545
^{48}Ca   CANDLES-III [60]    > 5.6 × 10^{22}        < 1600 − 2900

The results of recent experimental determinations of both the lower bound
of $T^{0\nu}_{1/2}$ and the upper bound of $m_{ee}$ are listed in Table 17.1. The limits on
the half-life are mainly driven by exposure, defined as the product of
the amount of active isotope employed by the experiment and the duration
of data taking. According to the estimates of Engel and Menéndez [61],
assuming $m_{ee} \approx 50$ meV, the detection of a 2β0ν decay will be possible with an
experimental sensitivity of $(5 - 8) \times 10^{26}$ years, to be compared to the best limits
on $T^{0\nu}_{1/2}$ obtained so far. If, on the other hand, one assumes $m_{ee} \approx 10$ meV,
a sensitivity as high as $(1 - 2) \times 10^{28}$ years will be needed. The results of
Table 17.1 suggest that in the best case scenario the required sensitivity may
be achieved by some of the current experiments.
The largest source of uncertainty in the determination of the effective
Majorana mass from Eq. (17.41), resulting in the spread of values of $m_{ee}$ reported
in Table 17.1, is the calculation of the nuclear matrix element, generally
performed using the formalism of the nuclear shell model, which unavoidably
involves approximations. Furthermore, the effects of nucleon-nucleon
correlations, which are neglected altogether in the mean-field approximation
underlying the shell model, have been found to be large [62]. Overall, the uncertainty
arising from the discrepancy between the results of different calculations of
$M_{0\nu}$ turns out to be as large as 50−75%.
The values of $m_{ee}$ obtained from the measured half-lives can be compared
with the predictions based on the results of oscillation experiments. Following
the procedure outlined above one finds

$$m_{ee} = \sum_{i=1}^{3} U_{ei}^2\, m_i = \begin{cases}
m_{\rm light}\, c_{12}^2 c_{13}^2 + s_{12}^2 c_{13}^2\, e^{2i(\eta_2-\eta_1)} \sqrt{\Delta_{12}^2 + m_{\rm light}^2} \\
\quad + s_{13}^2\, e^{-2i(\delta+\eta_1)} \sqrt{\Delta_{23}^2 + \Delta_{12}^2 + m_{\rm light}^2} & \text{(NO)} \\[2ex]
m_{\rm light}\, s_{13}^2 + s_{12}^2 c_{13}^2\, e^{2i(\delta+\eta_2)} \sqrt{m_{\rm light}^2 - \Delta_{23}^2} \\
\quad + s_{13}^2\, e^{2i(\delta+\eta_1)} \sqrt{m_{\rm light}^2 - \Delta_{23}^2 + \Delta_{12}^2} & \text{(IO)}
\end{cases} \tag{17.44}$$

Here, cij = cos θij , sij = sin θij , with θij being a neutrino mixing angle, and
δ is the phase associated with the violation of the CP symmetry discussed in
Section 16.1. The additional phases η1 and η2, which also describe CP violation,
appear only in the case of Majorana neutrinos.
The eﬀective Majorana masses obtained from the above equations using
the results of the global analysis of oscillation data of Esteban et al. [63] are
shown in Fig. 17.4.
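The dependence of the effective Majorana mass on the lightest neutrino mass can be explored numerically. The sketch below starts from the definition (17.34), takes the modulus of the sum, and scans two relative phases to obtain the allowed band for each ordering. The oscillation parameters are rounded, illustrative numbers, and the phase convention (two effective phases `phi2`, `phi3` attached to the second and third terms) is an assumption of this example rather than the parametrization used in the text.

```python
import cmath
import math

# Rounded, illustrative oscillation parameters (assumptions of this sketch)
S12_SQ = 0.30      # sin^2(theta_12)
S13_SQ = 0.022     # sin^2(theta_13)
DM21 = 7.4e-5      # m2^2 - m1^2 in eV^2
DM31 = 2.5e-3      # |m3^2 - m1^2| in eV^2


def masses(m_light, ordering):
    """Mass eigenvalues (eV) for normal ("NO") or inverted ("IO") order."""
    if ordering == "NO":
        m1 = m_light
        m2 = math.sqrt(m1 ** 2 + DM21)
        m3 = math.sqrt(m1 ** 2 + DM31)
    elif ordering == "IO":
        m3 = m_light
        m1 = math.sqrt(m3 ** 2 + DM31)
        m2 = math.sqrt(m1 ** 2 + DM21)
    else:
        raise ValueError("ordering must be 'NO' or 'IO'")
    return m1, m2, m3


def m_ee(m_light, ordering, phi2=0.0, phi3=0.0):
    """|sum_k U_ek^2 m_k| with effective relative phases phi2, phi3."""
    m1, m2, m3 = masses(m_light, ordering)
    c13_sq = 1.0 - S13_SQ
    amplitude = ((1.0 - S12_SQ) * c13_sq * m1
                 + S12_SQ * c13_sq * m2 * cmath.exp(1j * phi2)
                 + S13_SQ * m3 * cmath.exp(1j * phi3))
    return abs(amplitude)


def m_ee_band(m_light, ordering, n=36):
    """Minimum and maximum of m_ee over a grid of the two phases."""
    values = [m_ee(m_light, ordering, 2 * math.pi * i / n, 2 * math.pi * j / n)
              for i in range(n) for j in range(n)]
    return min(values), max(values)


if __name__ == "__main__":
    for m_light in (1e-4, 1e-3, 1e-2, 1e-1):
        print(m_light, m_ee_band(m_light, "NO"), m_ee_band(m_light, "IO"))
```

With these inputs the two bands are well separated for small values of the lightest mass and merge when it becomes much larger than the mass splittings.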

[Plot: $m_{ee}$ (eV) on a logarithmic vertical axis from $10^{-4}$ to $10^{0}$, versus $m_{\rm light}$ (eV) on a logarithmic horizontal axis from 0.0001 to 1.]

Figure 17.4 The eﬀective Majorana neutrino mass mee as a function of the lightest
neutrino mass, mlight . The lower and upper bands correspond to normal and inverted
order; that is, mlight = m1 , and mlight = m3 , respectively. Reprinted from Ref. [12]
with permission. © APS (1998).
CHAPTER 18

A LEAP FORWARD:
CHARMONIUM

In Chapter 9 we have introduced the Electromagnetic, Weak and Strong
Interactions, the three forces that act at the elementary particle level.
The correct theory of the Strong Interactions, the very intense forces that
bind protons and neutrons in the atomic nuclei, was found in 1973. It took
indeed almost three decades to decipher the puzzle, following the discovery
in 1947 of the mediator of nuclear forces, the π-meson, and the exploration
of the spectrum of a wealth of particles subject to the Strong Interactions,
similar to protons, neutrons and π-mesons and called collectively hadrons; see
Section 9.4.
The theory of strong interactions, named Quantum Chromo Dynamics
(QCD), was prepared by important steps partly anticipated in Sections 9.3
and 9.4; QCD is illustrated in more detail in Section 18.1 below.
In Section 18.2 we introduce charm-anticharm mesons, called charmonia
in analogy with the positronium states discussed in Chapter 12.
The calculation of the spectrum of charmonia, unexpectedly, has turned
out to be an interesting exercise in Non Relativistic Quantum Mechanics that
deserves to appear in this Volume. All the more so because the discovery of
unexpected lines intermixed with the predicted spectrum of charmonia has led
to the discovery of a family of exotic hadrons, still waiting for a quantitative
description within the present theory of the Strong Interactions.

18.1 A PRIMER: BARYONS, MESONS, QUARKS AND QCD

Hadrons divide into two great families: baryons and mesons. Baryons are
particles with half-integer spin, with proton and neutron being the lightest
ones; mesons have integer spin, the lightest mesons being the π meson, in its
three charged states π + , π 0 , π − .

284 DOI: 10.1201/9781003436263-18

This chapter has been made available under a CC BY NC license.

18.1.1 Conserved Quantum Numbers

Besides electric charge, light hadrons are characterised by four quantum
numbers that are conserved by the Strong Interactions: Baryon number, B;
Isotopic Spin and its projection, I and I3; and Strangeness, S.
Baryon number. Baryons have B = 1 (antibaryons B = −1). Mesons
have all B = 0. Proton and neutron are the lightest baryons and B con-
servation implies that higher mass baryons (called hyperons) will eventually
decay into a ﬁnal state containing one proton or one neutron and other B = 0
particles (mesons, leptons and photons). Charged π ± decay into leptons, π 0
decays into pure radiation (two photons), higher mass mesons decay eventu-
ally into states containing π mesons, leptons, photons and, if heavy enough,
baryon-antibaryon pairs.
Isotopic Spin. The closeness of the proton and neutron masses prompted very
early the suggestion that the Strong Interaction hamiltonian is symmetric
under unitary transformations of p and n

$$\begin{pmatrix} p \\ n \end{pmatrix} \to U \begin{pmatrix} p \\ n \end{pmatrix} \quad \text{with: } U = \text{unitary matrix}, \ \det U = 1 . \tag{18.1}$$
These transformations are analogous to spin rotations, hence the name Iso-
topic Spin Symmetry. The symmetry can be extended to the other hadrons.
Proton and neutron (Nucleons) have isotopic spin I = 1/2 with I3 =
+1/2(p), −1/2(n), π mesons have I = 1, with I3 = +1(π + ), 0(π 0 ), −1(π − ).
Strangeness. The $\Lambda^0$ hyperon and the $K^+$ meson, the next heavier
particles after the proton and the π, have very long lifetimes, typical of the Weak rather
than the Strong interactions. This unexpected fact led to the introduction of a new
quantum number called Strangeness, conserved by Strong and Electromagnetic
Interactions and violated by Weak Interactions. Conventionally S = −1 for
$\Lambda^0$, S = +1 for $K^+$ and S = 0 for proton, neutron and π mesons.
Strangeness conservation implies that particles with S ≠ 0 are produced in
pairs in hadronic collisions initiated by S = 0 particles (associated production),
e.g.: $\pi + p \to \Lambda^0 + K^+ + (S = 0)$ hadrons, as is in fact observed. We have
identified hyperons with S = −2 ($\Xi^{0,-}$) and S = −3 ($\Omega^-$), and mesons with
S = ±1, 0¹.

18.1.2 Quarks and QCD

The idea that some hadrons may be elementary (proton and neutron) and
be the constituents of the other hadrons was considered in 1949 by E. Fermi
and C. N. Yang [64]. The scheme was extended in 1956 by S. Sakata [65],
to include strange particles. At the beginning of the 1960s, however, it was
recognised that the most natural principle to understand the hadrons was
Nuclear Democracy: all hadrons are to be treated on the same footing (G.
Chew and S. Frautschi).
1 Systematic classiﬁcation of hadron resonances and related conventions are given in [12].

Table 18.1 Baryon number B, Isospin I3, Electric charge Q, Strangeness S, of the u, d and s
quarks introduced by Gell-Mann and Zweig. We have also listed the heavy quarks (charm,
beauty, top), each with its conserved quantum numbers. Quark masses (in GeV) are derived
from the mass spectrum of the corresponding mesons (see Sect. 18.2 for charm quark masses).

Quark name    B     I3     Q      S    Charm  Beauty  Top   Mass [GeV]
u (up)        1/3   +1/2   2/3    0    0      0       0     0.31
d (down)      1/3   -1/2   -1/3   0    0      0       0     0.31
s (strange)   1/3   0      -1/3   -1   0      0       0     0.48
c (charm)     1/3   0      2/3    0    1      0       0     1.32
b (beauty)    1/3   0      -1/3   0    0      1       0     4.58
t (top)       1/3   0      2/3    0    0      0       1     173

Quarks. A decisive step was made in 1964 by M. Gell-Mann [66] and by
G. Zweig [67] independently: Nuclear Democracy holds, subnuclear particles
are all composite, the elementary constituents being three kinds of spin 1/2
particles, unobserved until then and called quarks by Gell-Mann. Mesons and
Baryons have the following quark compositions

mesons = q q̄ ,
baryons = qqq . (18.2)

It is customary to say that Gell-Mann's quarks come with three different
qualities, or flavours, named according to q = u, d, s (up, down and strange).
Quark quantum numbers are reported in Table 18.1. It is easy to check
the matching of quantum numbers in the simplest cases:
$\pi^+ = (u\bar{d})$, spin 0; $K^+ = (u\bar{s})$, spin 0;
$p = (uud)$, spin 1/2; $n = (udd)$, spin 1/2;
$\Lambda^0 = (uds)$, spin 1/2; $\Xi^0 = (uss)$, spin 1/2; $\Omega^- = (sss)$, spin 3/2.
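The matching of quantum numbers can be checked mechanically by adding up the entries of Table 18.1 for each quark composition. A minimal sketch, in which the dictionary layout, the function name and the convention of marking antiquarks with a trailing `~` are choices made for this example:

```python
from fractions import Fraction as Fr

# Light-quark quantum numbers, copied from Table 18.1
QUARKS = {
    "u": {"B": Fr(1, 3), "I3": Fr(1, 2), "Q": Fr(2, 3), "S": Fr(0)},
    "d": {"B": Fr(1, 3), "I3": Fr(-1, 2), "Q": Fr(-1, 3), "S": Fr(0)},
    "s": {"B": Fr(1, 3), "I3": Fr(0), "Q": Fr(-1, 3), "S": Fr(-1)},
}


def quantum_numbers(content):
    """Sum B, I3, Q, S over a list of quarks; "d~" denotes the antiquark of d."""
    total = {"B": Fr(0), "I3": Fr(0), "Q": Fr(0), "S": Fr(0)}
    for label in content:
        sign = -1 if label.endswith("~") else 1   # antiquarks flip all additive numbers
        flavour = label.rstrip("~")
        for key in total:
            total[key] += sign * QUARKS[flavour][key]
    return total


if __name__ == "__main__":
    hadrons = {
        "pi+": ["u", "d~"], "K+": ["u", "s~"],
        "p": ["u", "u", "d"], "n": ["u", "d", "d"],
        "Lambda0": ["u", "d", "s"], "Xi0": ["u", "s", "s"], "Omega-": ["s", "s", "s"],
    }
    for name, content in hadrons.items():
        print(name, {k: str(v) for k, v in quantum_numbers(content).items()})
```

Exact fractions are used so that the thirds carried by the quarks add up to integers without rounding.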

Unitary transformations of the 3-dimensional vector q = (u, d, s) correspond
to a group called $SU(3)_{\rm flavour}$, which extends the Isotopic Spin
symmetry and is the natural approximate symmetry of the Strong Interactions,
as proposed earlier by Gell-Mann and by Y. Ne'eman [68], with consequences
well supported by the observed properties of the hadrons known at the time.
Heavy Quarks. In 1970, Glashow, Iliopoulos and Maiani [28] proposed
a fourth quark, considerably heavier than the other three, to explain
the observed suppression of weak interaction processes involving
strangeness-changing neutral weak currents. The fourth quark removed the obstacles that
until then had precluded a unified Electro-Weak theory of quarks. Like the
strange quark, the new quark, and the particles composed of it, carry a quantum
number C, or Charm, similar to Strangeness and conserved in Strong
and Electromagnetic Interactions. New particles containing a c̄c pair (hidden
charm) or one unpaired c quark (charmed particles) have been observed from
1974 onwards, with the properties anticipated in [28].
In 1973, Kobayashi and Maskawa [29] proposed the existence of a
further pair of quarks, to explain the CP violation observed in K decays.
Hadrons with beauty and top quarks were first observed in 1976 and
1994, respectively.
Heavy quarks decay by Weak Interactions into a lighter quark plus particles
with total baryon number B = 0. Thus they all have the same baryon number
(1/3) as the lightest ones, which is determined by the fact that proton and
neutron are made of three quarks. Indeed charmed and beauty baryons, with
composition (qq′c) and (qq′b), have been observed, as well as doubly charmed
baryons (qcc) (q and q′ denote light quarks).
QCD. The first baryon with S = 0 heavier than the proton is the so-called
3-3 resonance $\Delta^{++}(1232)$² (in parentheses, the mass in MeV). The quark model
composition of $\Delta^{++}$ in its $s_3 = +3/2$ state is:

$$\Delta^{++}_{s_3=+3/2} = (u^\uparrow u^\uparrow u^\uparrow) . \tag{18.3}$$

Assuming all u quarks to be in the fundamental state, Eq. (18.3) would
be a fully symmetric configuration of three quarks, in conflict with the Fermi
statistics obeyed by spin 1/2 particles.
At the end of the 1960s, the view prevailed that the u quark, as well
as all other quark flavours, had to have a hidden quantum number, to restore
the required antisymmetry. First ideas were advanced in a seminal paper by
Han and Nambu [69]; the modern view was proposed by Bardeen, Fritzsch
and Gell-Mann in 1972 [70].
It is assumed that quark fields have an additional (three-valued) index
associated with unitary transformations of a new symmetry called $SU(3)_{\rm colour}$³:

$$q \to q^a : \quad q = u, d, s \ \text{ and } \ a = 1, 2, 3 . \tag{18.4}$$

The further hypothesis is made that the observed hadrons are invariant
under the new symmetry, i.e. they are colour singlets. In the case of (18.3),
this implies:

$$\Delta^{++} \to (u^{\uparrow a} u^{\uparrow b} u^{\uparrow c})\, \epsilon_{abc} , \tag{18.5}$$

with $\epsilon_{abc}$ the completely antisymmetric tensor in three dimensions
(sum over repeated indices is understood). The full antisymmetry of $\epsilon_{abc}$
restores the antisymmetry of the state of three quarks with equal flavour and
spin up.
To describe the strong interactions, Han-Nambu and Bardeen et al.
proposed quark interactions to be invariant under gauge transformations based
on $SU(3)_{\rm colour}$, analogous to the gauge transformations of QED described in
Section 9.1, see Eq. (9.4):

$$u^a(x) \to \left[ e^{-i \sum_{A=1,\dots,8} \frac{\lambda^A}{2}\cdot\alpha^A(x)} \right]^a_{\ b}\, u^b(x) , \tag{18.6}$$

where the $\lambda^A$ are eight 3 × 3 hermitian and traceless matrices (the generators
of $SU(3)_{\rm colour}$).

² Δ is the spin 3/2 resonance discovered at the Chicago Cyclotron by E. Fermi in 1951–1952 in $\pi^+ + p$ collisions.
³ The name colour is given in analogy to the flavour quantum numbers exhibited by quarks; colour symmetry is assumed for heavy quarks as well.

18.1.3 Infrared Conﬁnement and Asymptotic Freedom

The one-dimensional gauge invariance of QED is associated with the massless
photon. Similarly, the eight-dimensional gauge transformations (18.6) are
associated with eight massless vector particles called gluons, which are
supposed to be the mediators of the basic strong interactions which glue together
quarks inside the hadrons⁴. The resulting quark-gluon interaction has been
named Quantum Chromo Dynamics (QCD) in [70].
As indicated by Eq. (18.4), transformations of colour are independent from,
and therefore commute with, the Weak and Electromagnetic interactions.
Correspondingly, gluons are electrically neutral and hadron spectroscopy
remains the same as in the colourless theory, only with quark colours arranged so
as to produce mesons and baryons in strictly colourless states (i.e. $SU(3)_{\rm colour}$
singlets) as in (18.5).
Invariance under the gauge transformations (18.6) determines the quark-gluon
interaction lagrangian. Using the same argument that led to the
electron-photon interaction in Chapter 9, Eq. (9.7), one finds:

$$\begin{aligned}
& L_q(q, D_\mu q) = \bar{q}\,(i \not{D} - m)\, q ; \\
& D^\mu q^a(x) = \partial^\mu q^a(x) - i g_s \left[ \frac{\lambda^A}{2}\, g^{A\mu}(x) \right]^a_{\ b} q^b(x) ; \\
& q^a = (u, d, s, c)^a , \quad m = (m_u, m_d, m_s, m_c) ,
\end{aligned} \tag{18.7}$$

with $g_s$ the strong quark-gluon coupling, $g^{A\mu}(x)$, A = 1, . . . , 8, the gluon fields,
and m the quark masses. Unlike photons in QED, gluons interact among themselves,
with three-gluon and four-gluon interactions of order $g_s$ and $g_s^2$, respectively.
Quark Confinement. Gluon self-interaction, in the presence of a coupling
constant $g_s = O(1)$, is supposed to be at the origin of quark confinement,
the fact that quarks are permanently confined inside colour singlet hadrons.
Indeed, quarks have never been observed in isolation⁵.
To visualise this phenomenon, consider a (colour singlet) meson made of
a quark-antiquark pair bound by colour forces. For weak coupling, we have
colour lines of force that start from the quark, like in electrostatics, and must
all land on the antiquark, since the meson is colour neutral. It has been
supposed that, going to strong coupling, gluon self-attraction makes the
lines of force condense into a string that goes from the quark to the antiquark,
with an associated energy which increases (e.g. linearly) with the length. In
the case of baryons, the strings originating from the three quarks merge into
a colour invariant triple vertex with their colour indices saturated by the
antisymmetric tensor $\epsilon_{abc}$ as in Eq. (18.5).

Figure 18.1 Inside hadrons, colour fields condense in "strings" that go from the quark to
the antiquark (mesons) or to a three-string vertex (baryons). The energy of the string is
proportional to the length. If we apply a force to the quark, e.g. from a collision with an
external electron, the string is stretched until the string energy reaches the $q\bar{q}$ threshold and it
gives rise to two mesons (baryon plus meson), still with confined quarks.

⁴ A gauge theory based on a non-commutative symmetry like $SU(3)_{\rm colour}$ is called a Yang-Mills theory, after the names of the authors who first worked out these theories in 1954 [71].
⁵ The lightest quark must be stable due to its fractional electric charge. Isolated quarks produced in the collisions of cosmic rays or at particle accelerators were searched for, intensively and unsuccessfully, in the 1960s.
If we give energy and momentum to one quark, e.g. from a collision with
an external electron, the string is stretched until the string energy reaches the
$q\bar{q}$ threshold. At this point an additional quark pair is created, see Fig. 18.1,
and a state with two mesons (or a baryon plus meson) is produced, still with
confined quarks.
Asymptotic Freedom. In 1973 David Gross and Frank Wilczek [72]
and, independently, David Politzer [73], computed the asymptotic behaviour
of the $SU(3)_{\rm colour}$ coupling, Eq. (18.7), for large values of the momentum
transfer. Unlike what happens in all cases previously studied (QED, Yukawa
and $\lambda\phi^4$ interaction) they found that the colour coupling $g_s$, defined to be
very large in the confinement region q < 1 GeV, decreases logarithmically for
large momentum transfer:

$$\alpha_s(q) = \frac{g_s^2(q)}{4\pi} \approx \frac{C}{\ln q^2} \to 0 \quad (q^2 \to \infty) . \tag{18.8}$$
Asymptotically, quarks behave as free particles!
The unexpected result agreed with the scaling relations observed in the
experimental e − p deep inelastic cross-sections since the first data in 1968.
The scaling behaviour had been anticipated by J. Bjorken, on the basis of
quark current commutators, and was explained by R. Feynman in 1971 as
indicating that the proton, seen at very large momentum transfer, behaves
like a cloud of free, point-like particles that Feynman called partons.
Experimental investigations of deep inelastic scattering of electrons and
neutrinos off protons and neutrons, in the 1970s and 1980s, have shown
that the proton seen at large momentum transfer can indeed be described as an
incoherent mixture of free quarks and antiquarks with different flavours, with an
additional component of neutral partons, to be identified with gluons. Each
parton is characterised by structure functions that describe the probability to
find that particular constituent (i.e. u or d quarks or gluons) with a fraction
x of the proton's momentum.
Owing to the logarithmic approach to asymptotic freedom, Eq. (18.8),
at large but finite energies there are deviations from free-parton behaviour,
which have been accurately computed in QCD, see Ref. [74], and compared
to experimental data at increasingly large energies.
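The q dependence of the coupling, the so-called running coupling, has
been computed in QCD to leading logarithmic order (LLO) and next-to-leading
logarithmic order (NLLO) to be⁶:

$$\alpha_s(q^2) = \frac{1}{(-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}}} + \frac{\beta_2}{\beta_1}\, \frac{1}{\left[(-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}}\right]^2}\, \ln\left[ \frac{\beta_2}{\beta_1}\, \frac{1}{(-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}}} \right] , \qquad \Lambda_{QCD} \simeq 0.25 \ \text{GeV} , \tag{18.15}$$

with $\beta_1$, $\beta_2$ the LLO and NLLO coefficients of the beta-function, see [75]:

$$\beta_1(f) = -\frac{11 - \frac{2}{3} f}{4\pi} ; \qquad \beta_2(f) = \frac{-102 + \frac{38}{3} f}{(4\pi)^2} , \tag{18.16}$$

where f = 5 is the number of quarks with $m \ll M_Z$.
The value of $\Lambda_{QCD}$ is determined from the condition $\alpha_s(M_Z) = 0.1181$,
which is the result of the fit to the data in [76]. The behaviour implied by
(18.15) is compared with recent experimental data from LHC experiments
and other sources in Fig. 18.2, also taken from Ref. [76].
The extraordinary agreement between theoretical predictions and
observations is a solid confirmation of QCD.

Figure 18.2 Taken from Ref. [76], the figure shows the strong coupling $\alpha_s = g_s^2/4\pi$ as a
function of momentum transfer, compared to recent data from LHC and other sources.

⁶ The running coupling is determined by the renormalization group equation

$$\frac{\partial\alpha(t)}{\partial t} = \beta(\alpha) , \qquad t = \ln q^2 , \tag{18.9}$$

where $\beta(\alpha) = \beta_1\alpha^2 + \beta_2\alpha^3$; $\beta_1$ and $\beta_2$ are constants computed in [75] to leading logarithmic
order (LLO) and next-to-leading logarithmic order (NLLO), respectively, and reported in
(18.16). The solution of (18.9) is:

$$\int^{\alpha_s(q^2)} \frac{d\alpha}{\beta_1\alpha^2 + \beta_2\alpha^3} = D(\alpha_s(q^2)) = \int^{q^2} dt = \ln\frac{q^2}{\Lambda^2_{QCD}} , \tag{18.10}$$

where $-\ln\Lambda^2_{QCD}$ is an integration constant and $D(\alpha)$ is the primitive of the integrand in
(18.10), $D'(\alpha) = (\beta_1\alpha^2 + \beta_2\alpha^3)^{-1}$:

$$D(\alpha) = -\frac{1}{\alpha\beta_1} - \frac{\beta_2}{\beta_1^2}\, \ln\left(\frac{\alpha\beta_2}{\beta_1 + \alpha\beta_2}\right) . \tag{18.11}$$

To LLO (and $\beta_1 < 0$):

$$\frac{1}{\alpha_{s1}(q^2)} = (-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}} , \quad \text{i.e.} \quad \alpha_{s1}(q^2) = \frac{1}{(-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}}} . \tag{18.12}$$

To find $\alpha_s$ to NLLO we use (18.11), inserting $\alpha_{s1}$ in the NLLO correction, to find

$$\alpha_s(q^2) = \frac{1}{(-\beta_1)\ln\frac{q^2}{\Lambda^2_{QCD}}} \left[ 1 + \frac{\beta_2}{\beta_1}\, \alpha_{s1}(q^2)\, \ln\left( \frac{\beta_2}{\beta_1}\, \alpha_{s1}(q^2) \right) \right] , \tag{18.13}$$

which gives Eq. (18.15). To find the value of $\Lambda_{QCD}$ given $M_Z$, we use Eq. (18.11), yielding

$$\Lambda_{QCD} = M_Z\, \mathrm{Exp}\left\{ -\frac{1}{2(-\beta_1)} \left[ \frac{1}{\alpha_{sM_Z}} - \frac{\beta_2}{\beta_1}\, \ln\left( \frac{1 + \frac{\beta_2}{\beta_1}\alpha_{sM_Z}}{\frac{\beta_2}{\beta_1}\alpha_{sM_Z}} \right) \right] \right\} = 0.244 \ \text{GeV} , \tag{18.14}$$

for 5 quark flavours with masses $m \ll M_Z$.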
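The chain from the measured value of the coupling at the Z mass to the scale $\Lambda_{QCD}$ and back to the running coupling can be reproduced in a few lines. The sketch below implements the beta-function coefficients (18.16), the relation $D(\alpha_s(M_Z)) = \ln(M_Z^2/\Lambda^2_{QCD})$ following from (18.10) and (18.11), and the NLLO expression (18.15). The function names and the numerical value of $M_Z$ are assumptions of this example.

```python
import math

M_Z = 91.1876  # GeV; Z boson mass, an assumed input not quoted in the text


def beta1(f):
    """LLO coefficient of Eq. (18.16)."""
    return -(11.0 - 2.0 * f / 3.0) / (4.0 * math.pi)


def beta2(f):
    """NLLO coefficient of Eq. (18.16)."""
    return (-102.0 + 38.0 * f / 3.0) / (4.0 * math.pi) ** 2


def lambda_qcd(alpha_mz=0.1181, f=5, m_z=M_Z):
    """Lambda_QCD in GeV from D(alpha_s(M_Z)) = ln(M_Z^2 / Lambda^2), Eqs. (18.10)-(18.11)."""
    b1, b2 = beta1(f), beta2(f)
    x = (b2 / b1) * alpha_mz
    half_log = (1.0 / alpha_mz - (b2 / b1) * math.log((1.0 + x) / x)) / (2.0 * (-b1))
    return m_z * math.exp(-half_log)


def alpha_s(q_gev, lam_gev, f=5):
    """Running coupling of Eq. (18.15) at momentum transfer q (GeV)."""
    b1, b2 = beta1(f), beta2(f)
    a1 = 1.0 / ((-b1) * math.log(q_gev ** 2 / lam_gev ** 2))   # LLO term, Eq. (18.12)
    return a1 + (b2 / b1) * a1 ** 2 * math.log((b2 / b1) * a1)


if __name__ == "__main__":
    lam = lambda_qcd()
    print("Lambda_QCD =", lam, "GeV")
    for q in (10.0, M_Z, 1000.0):
        print(q, alpha_s(q, lam))
```

Since (18.15) is a truncated expansion, evaluating it at $M_Z$ returns a value close to, but not identical with, the input 0.1181.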

18.2 CHARMONIA
The critical parameter that regulates the passage from strong to weak
regime, ΛQCD = 0.25 GeV, Eq. (18.15), can be compared with the quark
masses given in Table 18.1. Quarks u and d fall in the strong coupling region,
s marginally out, but heavy quarks are deﬁnitely in the weak coupling region.
The result suggests the possibility to compute the mass spectrum of c̄c and b̄b
mesons with methods similar to those employed for atomic spectra.
After the discovery of asymptotic freedom, T. Appelquist and D. Politzer
observed that the Coulomb-like interactions associated with the exchange of
gluons would produce a series of c̄c bound states [77], entirely similar to the
positronium states introduced in Section 12.5.2. By analogy, Appelquist and
Politzer proposed to name each of these states charmonium and observed
that the smallness of $\alpha_s$ at the charm quark mass scale would make their
spectrum calculable in perturbation theory. As happens for positronium,
the hyperfine (spin-spin) interaction is expected to split the ground state into
"Paracharmonium" (J = 0) and "Orthocharmonium" (J = 1).

Due to charge conjugation, Para- and Orthocharmonium must decay into
states with two or three gluons, respectively [77], which would then evolve into
final states of light mesons. The smallness of $\alpha_s$ would make the decay width of
Orthocharmonium (proportional to $\alpha_s^3$) much smaller than the Paracharmonium
width (proportional to $\alpha_s^2$), estimated in [77] to be about 6 MeV (see
Sect. 12.5.2 to compare with positronium).
After the J/Ψ discovery in 1974, A. De Rujula and S. L. Glashow [78] identified
this very narrow particle (Γ ∼ 100 keV) with the Orthocharmonium
proposed by Appelquist and Politzer and opened the road to the investigation
of the spectrum of charmed particles and charmonia⁷. It was also clarified
that the threshold below which charmonia had only gluon (or photon) decays
had to be identified with the threshold for the production of charmed meson
pairs, (c̄q) + (cq̄), see Fig. 18.4 below.
In 1974, the Cornell potential [80–82] was introduced to supplement the
Coulomb-like potential envisaged in [77], which dominates at short distances,
with a term linear in the radius to simulate the confining forces that dominate
at larger distances. The latter are determined by a new, phenomenological
constant, the string tension k.

18.2.1 The Cornell Potential and Its Relativistic Corrections

The Cornell potential⁸:

$$V = -\frac{4}{3}\frac{\alpha_s}{r} + kr + 2M_c = V_V + V_S , \tag{18.17}$$

provides the basic charmonium wave functions via the spin-independent
Schrödinger equation

$$\left(-\frac{1}{2\mu_c}\nabla^2 + V\right)\psi = E\psi , \tag{18.18}$$

where $\mu_c = M_c/2$ is the reduced mass of the charm quark-antiquark pair.
Following the treatment of the non-relativistic hydrogen atom (Appendix
B), we separate angular and radial wave functions for each value of the orbital
angular momentum L, according to

$$\psi(n, L, m)(\mathbf{x}) = R(n, L)(r) \cdot Y^L_m(\theta, \phi) , \tag{18.19}$$

with n = 1, 2, . . . the principal quantum number, R the radial wave function
and $Y^L_m$ the spherical harmonics (we shall use the standard notation
S, P, D, . . . for L = 0, 1, 2, . . .). Further, we put:

$$R(r) = \frac{\chi(r)}{r} , \tag{18.20}$$

with χ(r), the reduced radial wave function, satisfying the boundary condition

$$\lim_{r\to 0^+} \chi(r) = 0 . \tag{18.21}$$

⁷ The association of J/Ψ with the opening of the cc̄ threshold had been made previously in [79].
⁸ −4/3 is a group theoretical factor reflecting the fact that the cc̄ pair, like all hadrons, is in a colour singlet state.
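The radial problem defined by Eqs. (18.17)–(18.21) lends itself to a simple numerical treatment. The sketch below integrates $\chi'' = [2\mu_c(V - E) + L(L+1)/r^2]\,\chi$ outwards with the Numerov method, starting from the small-r behaviour $\chi \approx r^{L+1}[1 - \mu_c\kappa r/(L+1)]$ with $\kappa = 4\alpha_s/3$, and finds the lowest eigenvalue by bisection on the sign of $\chi(r_{max})$. Natural units (GeV and GeV⁻¹) are used and the rest mass $2M_c$ is left out of E. The function name, the grid parameters and the values of $\alpha_s$ and k in the example are illustrative assumptions; only $M_c = 1.32$ GeV is taken from Table 18.1.

```python
import math


def lowest_level(alpha_s, k, m_c, L=0, r_max=20.0, h=0.01,
                 e_min=-1.0, e_max=3.0, n_scan=200):
    """Lowest eigenvalue E (GeV, without the 2*m_c rest mass) for orbital momentum L."""
    mu = m_c / 2.0                 # reduced mass
    kappa = 4.0 * alpha_s / 3.0    # strength of the Coulomb-like term
    n = int(round(r_max / h))
    c = h * h / 12.0

    def f(r, e):
        # chi'' = f(r) * chi, including the centrifugal barrier
        return 2.0 * mu * (-kappa / r + k * r - e) + L * (L + 1) / (r * r)

    def start(r):
        # small-r series solution, regular at the origin
        return r ** (L + 1) * (1.0 - mu * kappa * r / (L + 1))

    def chi_at_r_max(e):
        y_prev, y = start(h), start(2.0 * h)
        f_prev, f_cur = f(h, e), f(2.0 * h, e)
        for i in range(3, n + 1):
            f_next = f(i * h, e)
            y_next = (2.0 * y * (1.0 + 5.0 * c * f_cur)
                      - y_prev * (1.0 - c * f_prev)) / (1.0 - c * f_next)
            y_prev, y = y, y_next
            f_prev, f_cur = f_cur, f_next
        return y

    # scan upwards for the first sign change of chi(r_max), then bisect
    step = (e_max - e_min) / n_scan
    e_lo, y_lo = e_min, chi_at_r_max(e_min)
    for j in range(1, n_scan + 1):
        e_hi = e_min + j * step
        if y_lo * chi_at_r_max(e_hi) < 0.0:
            break
        e_lo, y_lo = e_hi, chi_at_r_max(e_hi)
    else:
        raise ValueError("no sign change found; widen [e_min, e_max]")
    for _ in range(50):
        e_mid = 0.5 * (e_lo + e_hi)
        y_mid = chi_at_r_max(e_mid)
        if y_lo * y_mid <= 0.0:
            e_hi = e_mid
        else:
            e_lo, y_lo = e_mid, y_mid
    return 0.5 * (e_lo + e_hi)


if __name__ == "__main__":
    m_c = 1.32                   # GeV, from Table 18.1
    alpha_s, k = 0.3, 0.18       # illustrative values (k in GeV^2)
    e_1s = lowest_level(alpha_s, k, m_c, L=0)
    print("1S mass estimate:", 2.0 * m_c + e_1s, "GeV")
```

The scan assumes that the first sign change brackets a single eigenvalue, so the scan step should be smaller than the level spacing. Two limiting cases with known solutions are useful checks: the purely linear potential, whose S-wave levels are given by the zeros of the Airy function, and the purely Coulomb-like potential, with hydrogen-like levels.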

Similarly to atomic physics, hyperfine, spin-orbit and tensor interactions
(discussed e.g. in [83]) arise as part of the expansion in powers of $v^2/c^2$ of
the relativistic quark-antiquark potential and are to be treated as first-order
perturbations to the quark-antiquark interaction (18.17) (see also [84, 85]).
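Hyperfine Interaction. The spin-spin interaction, taken from atomic
physics, is:

$$V_{SS\,tot} = \frac{2}{3M_c^2}\, \nabla^2(V_V)\, (\mathbf{S}_1\cdot\mathbf{S}_2) = \frac{16\pi\alpha_s}{9M_c^2}\, \delta^{(3)}(\mathbf{r}) \left[ S(S+1) - \frac{3}{2} \right] , \tag{18.22}$$

with $\mathbf{S}_1$, $\mathbf{S}_2$ the spins of the quark and the antiquark and S the total spin.
The hyperfine correction applies to L = 0 states only, for which:

$$|\psi(0)|^2 = \frac{1}{4\pi} \lim_{r\to 0^+} \left(\frac{\chi(r)}{r}\right)^2 \neq 0 . \tag{18.23}$$

Defining

$$\bar{V}_{SS} = \frac{16\pi\alpha_s}{9M_c^2}\, |\psi(0)|^2 , \tag{18.24}$$

one has:

$$\langle V_{SS\,tot} \rangle_{S\text{-wave}} = \bar{V}_{SS} \times \begin{cases} +1/2 , & (S = 1) \\ -3/2 , & (S = 0) \end{cases} . \tag{18.25}$$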

Spin Orbit Interaction.

We deﬁne
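$$V_{LS} = \frac{1}{2M_c^2}\left(\frac{3}{r}\frac{dV_V}{dr} - \frac{1}{r}\frac{dV_S}{dr}\right) = \frac{1}{2M_c^2}\left(\frac{4\alpha_s}{r^3} - \frac{k}{r}\right) ,$$
$$V_{LStot} = V_{LS}\,(\mathbf{L}\cdot\mathbf{S}) , \tag{18.26}$$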

with S being the total c̄c spin. To ﬁrst order in this potential, we compute the
integral
$$\bar V_{LS} = \int_0^{+\infty}\frac{1}{2M_c^2}\left(\frac{4\alpha_s}{r^3} - \frac{k}{r}\right) r^2\left(\frac{\chi(r)}{r}\right)^2 dr . \tag{18.27}$$

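The integral is convergent at the origin for L ≥ 1 since $(\chi(r)/r)^2 \sim r^{2L}$ for r → 0. We restrict to P-waves and find
$$\delta m_{LS} = \langle V_{LStot}\rangle_{P\text{-wave}} = \bar V_{LS}\,\langle\mathbf{L}\cdot\mathbf{S}\rangle = \bar V_{LS}\times\begin{cases} -2 & (J=0) \\ -1 & (J=1) \\ +1 & (J=2) \end{cases} . \tag{18.28}$$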
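The numerical factors in Eqs. (18.25) and (18.28) follow from elementary angular momentum algebra. The short Python sketch below reproduces them with exact fractions; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction


def hyperfine_factor(S):
    """Bracket of Eq. (18.22): S(S+1) - 3/2 for two spin-1/2 quarks."""
    S = Fraction(S)
    return S * (S + 1) - Fraction(3, 2)


def l_dot_s(L, S, J):
    """<L.S> = [J(J+1) - L(L+1) - S(S+1)] / 2 in a state of definite J."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    return (J * (J + 1) - L * (L + 1) - S * (S + 1)) / 2


if __name__ == "__main__":
    for S in (1, 0):
        print(f"S-wave, S={S}: hyperfine factor = {hyperfine_factor(S)}")
    for J in (0, 1, 2):
        print(f"P-wave, S=1, J={J}: <L.S> = {l_dot_s(1, 1, J)}")
```

For S = 1, 0 the first function returns the factors of Eq. (18.25), and for L = S = 1 the second returns those of Eq. (18.28).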

Tensor Interaction.
One deﬁnes the tensor interaction starting from (see [83, 84])

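$$V_T = \frac{1}{6M_c^2}\left(3\frac{d^2V_V}{dr^2} - \frac{1}{r}\frac{dV_S}{dr}\right) ,$$
$$V_{Ttot} = -3V_T\,[N_{ij}S_iS_j] = \frac{1}{2M_c^2}\left(\frac{8\alpha_s}{r^3} + \frac{k}{r}\right)[N_{ij}S_iS_j] , \tag{18.29}$$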

with:
$$N_{ij} = \left(\hat n_i\hat n_j - \frac{1}{3}\delta_{ij}\right) , \qquad \hat n_i = \frac{r_i}{r} . \tag{18.30}$$
The average value of $N_{ij}$ in a state with definite orbital angular momentum L is⁹:
$$\left\langle \hat n_i\hat n_j - \frac{1}{3}\delta_{ij}\right\rangle_L = a(L)\left[L_iL_j + L_jL_i - \frac{2}{3}\delta_{ij}L(L+1)\right] , \tag{18.31}$$
$$a(L) = -\frac{1}{4L(L+1)-3} , \tag{18.32}$$

so that:
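$$\langle V_{Ttot}\rangle_L = \left\langle\frac{1}{2M_c^2}\left(\frac{8\alpha_s}{r^3} + \frac{k}{r}\right)\right\rangle\times 2a(L)\left[(\mathbf{S}\cdot\mathbf{L})^2 - \frac{1}{3}S(S+1)L(L+1)\right] . \tag{18.33}$$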
The eﬀect, of course, vanishes for L = 0 or S = 0. For L = S = 1, we compute
the integral
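$$V_1 = \left\langle\frac{1}{2M_c^2}\left(\frac{8\alpha_s}{r^3} + \frac{k}{r}\right)\right\rangle_{P\text{-wave}} = \int_0^{\infty}\frac{1}{2M_c^2}\left(\frac{8\alpha_s}{r^3} + \frac{k}{r}\right) r^2\left(\frac{\chi(r)}{r}\right)^2 dr , \tag{18.34}$$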

and ﬁnd
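$$\delta m_T = \langle V_{Ttot}\rangle_{P\text{-wave}} = V_1\times\frac{2}{15}\begin{cases} -8 & (J=0) \\ +1 & (J=1,2) \end{cases} = \bar V_T\begin{cases} -8 & (J=0) \\ +1 & (J=1,2) \end{cases} ;$$
$$\bar V_T = \frac{1}{15}\frac{1}{M_c^2}\left\langle\frac{8\alpha_s}{r^3} + \frac{k}{r}\right\rangle_{P\text{-wave}} . \tag{18.35}$$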
9 This result is reported in the book of Landau and Lifshitz [86]. The r.h.s. of (18.31) is the only symmetric and traceless tensor with two indices which can be formed with $L_i$, the only vector remaining after integration over r. To obtain (18.32), one multiplies both sides of (18.31) by $L_i$ (on the left) and $L_j$ (on the right), summing over i and j. The result follows from the fact that n̂ is orthogonal to L = r × p and from angular momentum commutators.

Charmonium masses. We denote by E(n, L) the eigenvalues of (18.18).

Recalling that V includes the charm quark pair rest mass, charmonium masses
are given by:

$$M(n,L,S,J) = E(n,L) + \delta m_{SStot}(n,L,J) + \delta m_{LStot}(n,L,J) + \delta m_{Ttot}(n,L,J) \tag{18.36}$$

The usual spectroscopic notation for the states with quantum numbers
n, L, S, J is (see [84])
$$n\,{}^{2S+1}L_J : \quad 1\,{}^1S_0 , \; 1\,{}^3S_1 , \ldots . \tag{18.37}$$

It is also usual to denote the S-wave, spin singlet and triplet states with the
symbols ηc and ψ respectively, P -wave states with χcJ , (S = 1, J = 0, 1, 2)
and hc , (S = 0, J = 1).
Parity and charge conjugation properties of charmonia are the same as
those of positronium (Eqs. (12.84) and (12.86)); that is:

$$P = (-1)^{L+1} , \qquad C = (-1)^{L+S} . \tag{18.38}$$
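As an illustration of Eqs. (18.37) and (18.38), the following Python sketch builds the spectroscopic label and the $J^{PC}$ assignment from the quantum numbers n, L, S, J. The helper names are hypothetical.

```python
def spectroscopic_label(n, L, S, J):
    """Return the label n ^{2S+1}L_J of Eq. (18.37), e.g. '1^3S_1'."""
    letters = "SPDFG"  # L = 0, 1, 2, 3, 4
    return f"{n}^{2 * S + 1}{letters[L]}_{J}"


def jpc(L, S, J):
    """Return J^PC with P = (-1)^(L+1) and C = (-1)^(L+S), Eq. (18.38)."""
    parity = (-1) ** (L + 1)
    c_parity = (-1) ** (L + S)
    sign = lambda value: "+" if value > 0 else "-"
    return f"{J}{sign(parity)}{sign(c_parity)}"


if __name__ == "__main__":
    # (name, n, L, S, J) for a few states named in the text
    for name, n, L, S, J in [("eta_c", 1, 0, 0, 0), ("psi", 1, 0, 1, 1),
                             ("chi_c0", 1, 1, 1, 0), ("h_c", 1, 1, 0, 1)]:
        print(name, spectroscopic_label(n, L, S, J), jpc(L, S, J))
```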

18.2.2 Strategy and Numerical Results

The formulae given in the previous paragraph identify an ambitious model,
in which the spectrum of charmonia is determined by only three parameters:
Mc , the string tension k and αs (Mc ).
Due to quark confinement, however, and unlike atomic and nuclear physics,
we cannot determine the mass of the elementary constituents (i.e. the charm
quark) and their interactions in isolation, independent of the spectrum of
bound states.
In mathematical terms, confinement implies that we cannot determine a
priori the zero of the Cornell potential and we should add to the Cornell
potential an a priori undetermined constant V0 . We can include V0 in the
definition of the charm quark mass, and consider the mass a free constant
to be determined from charmonium spectrum. The same holds true for the
string tension. The third parameter, the running coupling at the charm mass
scale αs (Mc ), with only a logarithmic dependence upon Mc , can be obtained
from Eq. (18.15) via the well determined value of $\alpha_s(M_Z) = 0.1185$. All this amounts to taking the masses of two c̄c states as input and predicting the masses of the other charmonia.
Following Ref. [84], we take:

$$\alpha_s = 0.331 , \quad M_c = 1.317\ \text{GeV} , \quad k = 0.18\ \text{GeV}^2 , \tag{18.39}$$

which reproduce well the ortho- and para-charmonium masses.

We solve numerically the Schrödinger equation, Eq. (18.18), restricting to
the lowest levels: n = 1, 2; L = S, P , and use the wave functions to compute
hyperﬁne, spin-orbit and tensor interaction corrections.
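A rough idea of how such a numerical solution can be obtained is given by the Python sketch below. It is not the program used in the text: it assumes a simple shooting method (Numerov integration of the radial equation for χ(r) plus bisection on the number of nodes), with the parameters of Eq. (18.39). The grid step, the cut-off `r_max` and the energy bracket are arbitrary choices made for this illustration.

```python
ALPHA_S = 0.331   # strong coupling, Eq. (18.39)
M_C = 1.317       # charm quark mass in GeV, Eq. (18.39)
K = 0.18          # string tension in GeV^2, Eq. (18.39)
MU = M_C / 2.0    # reduced mass of the c-cbar pair


def count_nodes(eps, L, r_max=15.0, h=0.01):
    """Integrate chi'' = g(r) chi outward (Numerov) and count sign changes.

    eps = E - 2*M_c is the energy measured from the quark rest masses.
    """
    def g(r):
        v = -4.0 * ALPHA_S / (3.0 * r) + K * r
        return L * (L + 1) / (r * r) + 2.0 * MU * (v - eps)

    c = h * h / 12.0
    r_prev, r_cur = h, 2.0 * h
    # Start from the small-r behaviour chi ~ r^(L+1)
    chi_prev, chi_cur = r_prev ** (L + 1), r_cur ** (L + 1)
    g_prev, g_cur = g(r_prev), g(r_cur)
    nodes = 0
    for i in range(3, int(round(r_max / h)) + 1):
        g_next = g(i * h)
        chi_next = (2.0 * (1.0 + 5.0 * c * g_cur) * chi_cur
                    - (1.0 - c * g_prev) * chi_prev) / (1.0 - c * g_next)
        if chi_next * chi_cur < 0.0:
            nodes += 1
        chi_prev, chi_cur = chi_cur, chi_next
        g_prev, g_cur = g_cur, g_next
    return nodes


def eigenvalue(n, L, lo=-2.0, hi=3.0, iterations=40):
    """E(n, L) in GeV: the energy at which the node count reaches n."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_nodes(mid, L) >= n:
            hi = mid
        else:
            lo = mid
    return 2.0 * M_C + 0.5 * (lo + hi)


if __name__ == "__main__":
    for n, L, name in [(1, 0, "1S"), (1, 1, "1P"), (2, 0, "2S"), (2, 1, "2P")]:
        print(f"E({name}) = {eigenvalue(n, L):.3f} GeV")
```

The wave function below an eigenvalue diverges without changing sign, while just above it acquires one more node; the bisection exploits this change in the node count.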
Figure 18.3 Radial wave functions and eigenvalues of the states: n = 1, 2; L = S, P. Ordinate: V(r) and E in GeV; abscissa: r in GeV⁻¹. Note: 1 GeV⁻¹ ∼ 0.2 fm.

To solve the Schrödinger equation, we use the program [Link] [87].

Eigenfunctions and eigenvalues are displayed in Fig. 18.3. The numerical val-
ues of the constants to determine hyperﬁne, spin-orbit and tensor corrections
are reported in Table 18.2.
The resulting spectrum is displayed in Fig. 18.4, together with the masses obtained numerically, labeled "th".
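To see how the constants of Table 18.2 enter the spectrum, the following Python sketch combines them with the spin factors of Eqs. (18.25), (18.28) and (18.35). It is only a bookkeeping illustration under the formulae as given in the text; the dictionary and function names are hypothetical.

```python
# Constants from Table 18.2, in GeV
V_SS = {"1S": 0.055, "2S": 0.041}
V_LS_1P = 0.013
V_T_1P = 0.011

LS_FACTOR = {0: -2, 1: -1, 2: +1}   # Eq. (18.28)
T_FACTOR = {0: -8, 1: +1, 2: +1}    # Eq. (18.35)


def s_wave_splitting(level):
    """Triplet minus singlet mass: V_SS * (+1/2 - (-3/2)), Eq. (18.25)."""
    return V_SS[level] * (0.5 - (-1.5))


def p_wave_shift(J):
    """Spin-orbit plus tensor shift of the chi_cJ(1P) state, in GeV."""
    return V_LS_1P * LS_FACTOR[J] + V_T_1P * T_FACTOR[J]


if __name__ == "__main__":
    print(f"psi - eta_c (1S): {1000 * s_wave_splitting('1S'):.0f} MeV")
    for J in (0, 1, 2):
        print(f"chi_c{J}: shift = {1000 * p_wave_shift(J):+.0f} MeV")
```

With these inputs the 1S triplet-singlet splitting is 2 V̄SS = 110 MeV, to be compared with the difference between the ψ(3097) and ηc(2984) masses quoted in the text.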
Arrows in Fig. 18.4 indicate cascade decays between charmonium levels, with emission of γ rays or light hadrons. Radiative decays, typical of transitions with ΔL = 1 or ΔS = 1, have been an important guide to discover and identify charmonium levels.
The agreement of masses and transitions predicted from the simple, QCD
motivated, Cornell potential and the observed spectrum of charmonia is re-
markable. For the extension of the spectrum to higher masses, further devel-
opments and improvements, the reader may resort to Refs. [83, 84, 93].

18.3 CHARMONIA AND EXOTICS

In November 1974, J/Ψ = ψ(3097) was discovered in proton-Be collisions at Brookhaven, as a narrow peak in the e⁺e⁻ invariant mass distribution, and as a peak in the total cross section of e⁺e⁻ → hadrons, at SLAC. One week after the J/Ψ discovery, a second peak was observed at SLAC, corresponding to ψ′(3686) = ψ(2S).

Table 18.2 Values of the constants needed for hyperfine, spin-orbit and tensor corrections, in GeV, see Eqs. (18.24), (18.27), (18.35).

State   V̄SS     V̄LS     V̄T
1S      0.055   0       0
1P      0       0.013   0.011
2S      0.041   0       0

Figure 18.4 Black lines. Expected and observed ground and excited levels of charmonia (c̄c mesons) up to masses of 4 GeV. $J^{PC}$ quantum numbers are indicated at the bottom of the plot and particles are identified by the denominations introduced in the text. The observed masses are reported in parentheses, next to the particle name, and are taken from [12]. Charmonia $h_c(2P)$ and $\chi_{c1}(2P)$ have not been identified yet. The mass values obtained with the Cornell potential plus hyperfine, spin-orbit and tensor corrections are reported for each particle, with the "th" label. Gray lines. The particle named X(3872), $J^{PC} = 1^{++}$, was the first example of "exotic hadron" found among hidden charm particles, followed by the electrically charged Z(3900), Z(4020) and Z(4430), $J^{PC} = 1^{+-}$.

The first evidence¹⁰ of intermediate P-wave states was found in 1975 by the spectrometer DASP [89] at DESY (Hamburg), in the two-photon cascade: ψ′(3686) → γ + χc0(3415) → 2γ + ψ(3097). Later, this cascade and cascades to other intermediate χ states were seen at SPEAR [90, 91] and at DESY by the Desy-Hamburg and the PLUTO Collaborations [88].
The ηc(2984) was identified in 1977 in the radiative decay ψ(3097) → γ + ηc(2984) by the DASP Collaboration [92] at DESY.
Exotics. In a world made of gluons and charmed quarks/antiquarks, light hadrons can be created, to a great extent, only in isospin zero states, i.e. η (I = 0) and not π⁰ (I = 1) mesons, or ω (I = 0) but not ρ⁰ (I = 1) mesons. This is well illustrated by the small ratio [12]
$$\frac{\Gamma(\psi'(3686)\to\pi^0\,\psi(3097))}{\Gamma(\psi'(3686)\to\eta\,\psi(3097))} = (4.5\pm 0.1)\times 10^{-2} . \tag{18.40}$$
The result is contradicted by the much larger ρ⁰ vs ω production in X(3872) decays [94]
$$\frac{\Gamma(X(3872)\to\rho^0\,\psi(3097))}{\Gamma(X(3872)\to\omega\,\psi(3097))} = (2.9\pm 0.4)\times 10^{-1} , \tag{18.41}$$
which suggests that a pair of light quarks may be “intrinsically” present in the constitution of X(3872), together with the cc̄ pair.

10 See H. Schopper in Ref. [88] for details.

Figure 18.5 Schematic representations of different models of exotic tetraquark systems.
A first proposal, based on the fact that X(3872) is very close to the D⁰D̄*⁰ threshold, was that X(3872) is a composite meson-antimeson state, bound by the same nuclear force that binds proton and neutron into the deuteron: a bound state that, by analogy, has been called “deuson” [95, 96].
Alternatively, based on previous investigations on the constitution of light spin 0 mesons, the hypothesis has been made of X(3872) as a compact tetraquark: a [cq][c̄q̄] state (with q a light quark) bound by the fundamental QCD forces [97].
The observation of hadrons that (i) are electrically charged and (ii) decay into a final state containing a true charmonium has confirmed the possible coexistence of one light and one heavy quark-antiquark pair in the same hadron. This is realised in the three levels reported in Fig. 18.4: Z(3900)± → π± + ψ(3097), Z(4020)± → π± + hc(3525) and Z(4430)± → π± + ψ′(3686).
The experimental study of exotic hadrons is in full development. Recent achievements are the discovery of "pentaquarks" [98], with composition (cc̄qq′q″), and of "tetra-charm" states [99], observed to decay into pairs of ψ mesons, with composition (c̄cc̄c).

Theoretical studies are also evolving, to clarify the internal dynamics of exotic hadrons and its relation to QCD and to the well known dynamics of (qq̄) mesons and (qqq) baryons; see Fig. 18.5 for illustration and Ref. [100] for a theoretical introduction.

18.4 PROBLEMS FOR CHAPTER 18

Sect. 18.1
1. Show that the dimension of the algebra of the group with N colours, i.e. the number of N × N hermitian and traceless matrices, is $D = N^2 - 1$.
2. Starting from Eq. (18.6), explain why there are 8 gluons in QCD.

Sect. 18.2
1. Using the radial equation for χ(r), Eq. (B.16), prove that:
$$\frac{\chi(r)}{r} \sim r^L , \quad \text{for } r\to 0^+ ,$$
where L is the orbital angular momentum.

2. Using the computer program [Link] and the data given in

(18.39), reconstruct the eigenvalues and wave functions reported in
Fig. 18.3.

3. Compute the masses of ψ(3S) and ψ(4S) and compare to the computed
and observed values reported in [84].
CHAPTER 19

THE BORN-OPPENHEIMER APPROXIMATION FOR THE DOUBLY CHARMED BARYON

Molecules and crystals are made of two kinds of particles with very differ-
ent masses, light electrons and heavy nuclei, both moving in the fields gener-
ated by Coulomb forces. Different masses entail different scales for the space
variation of the wave functions and this is the basis of an approximation,
the Born-Oppenheimer approximation, widely used in the theory of complex
atomic systems. A recent illustration of the Born-Oppenheimer approximation
in QED is found in Weinberg’s book on Quantum Mechanics [101].
The similarity with QED has led several authors to apply the Born-Oppenheimer approximation to hadronic systems which contain both heavy and light quarks, and treat them as molecule-like systems bound by QCD forces [104–107]. We illustrate here an application of the Born-Oppenheimer approximation (BOA) to estimate the mass of the recently observed doubly charmed baryon, $\Xi_{cc}^{++} = (ccu)$.

DOI: 10.1201/9781003436263-19

This chapter has been made available under a CC BY NC license.

19.1 BORN-OPPENHEIMER APPROXIMATION IN BRIEF

To be deﬁnite, consider a system with two heavy and two light particles
with Hamiltonian (see [101]):
$$H = H_{heavy} + H_{light} = \sum_{heavy}\frac{1}{2M}P_i^2 + V(x_A,x_B) + \sum_{light}\frac{1}{2m}p_i^2 + V_l(x_A,x_B,x_1,x_2) . \tag{19.1}$$

First, consider the heavy particles as classical sources with ﬁxed coordi-
nates and quantum numbers, and ﬁnd the ground state of the light particles,
solving the eigenvalue equation:

Hlight f0 (xA , xB , x1 , x2 ) = Ef0 ,

E = E(xA , xB ) . (19.2)

Then search for solutions of the complete Schrödinger equation for wave func-
tions of the form:

Φ = Ψ(xA , xB ) f0 (xA , xB , x1 , x2 ) .

In the equation:

(Hheavy + Hlight )Ψ(xA , xB ) f0 (xA , xB , x1 , x2 ) = EΨf0 ,

we may replace Hlight by its eigenvalue and rewrite the equation as

$$\left(\sum_{heavy}\frac{1}{2M}P_i^2 + V(x_A,x_B) + E(x_A,x_B)\right)\Psi f_0 = E\,\Psi f_0 . \tag{19.3}$$

Applying Hheavy to Φ we encounter terms of the kind:

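$$-iP\,\Phi = \frac{\partial}{\partial x_A}\Phi = \left(\frac{1}{f_0}\frac{\partial f_0}{\partial x_A} + \frac{1}{\Psi}\frac{\partial\Psi}{\partial x_A}\right)\Psi f_0 . \tag{19.4}$$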
The Born-Oppenheimer approximation consists in neglecting systematically
the ﬁrst with respect to the second term. If we call a and b the lengths over
which f0 or Ψ show an appreciable variation with the source coordinates, the
BO approximation is valid for a >> b, namely for small values of the ratio
$$\Lambda = \frac{1/a}{1/b} \ll 1 . \tag{19.5}$$
In the hydrogen molecule, a and b are, respectively, the electron and proton
Bohr radius (1/a = αm, 1/b = αM ) and the error is of order m/M (see [101]
for the more complicated case of crystals). We will discuss the error in QCD
later.

The upshot is the Born-Oppenheimer (BO) equation:

$$\left(\sum_{heavy}\frac{P_i^2}{2M} + V(x_A,x_B) + E(x_A,x_B)\right)\Psi = E\,\Psi , \tag{19.6}$$

and in the following, we will denote

V (xA , xB ) + E(xA , xB ) = VBO (xA , xB ) . (19.7)

The remarkable fact about Eq. (19.6) is that the wave function f0 has dropped
out: what matters is only the energy of the light quarks in presence of the
heavy sources, which makes the BO approximation amenable to numerical
calculations (see e.g. [102] for a review of recent Lattice QCD calculations).
The BO approximation requires all quantum numbers of the heavy parti-
cles to be fixed when computing the light quarks energy, E. Besides position
and flavour, we have to fix spin and colour quantum numbers. In the two
cases of interest, the c quarks are colour triplets with spin 1/2, with colours combined to produce a colour 3̄ representation and total spin 1¹.

19.2 COLOUR GYMNASTIC FOR QUARK-QUARK POTENTIALS
The Coulomb potential part of the quark-quark Cornell potential intro-
duced in (18.17), can be written as
$$V_V(r) = \frac{\alpha_s}{r}\cdot\left\langle T_1^A T_2^A\right\rangle_R , \tag{19.8}$$
where $T^A_{1,2}$ are the matrices $\lambda^A/2$ representing the generators of $SU(3)_c$ acting on quarks 1 or 2 (sum over the repeated indices A understood) and R indicates the $SU(3)_{colour}$ irreducible multiplet to which the product of the quark and antiquark fields belongs.
The $T^A_{1,2}$ act on different spaces, and obviously commute. To be precise, one should write
$$T_1^A = \left(\frac{\lambda^A}{2}\right)_1\otimes 1_2 , \qquad T_2^A = 1_1\otimes\left(\frac{\lambda^A}{2}\right)_2 . \tag{19.9}$$
One can write the generator of the whole 1 + 2 system according to:

$$T^A_{1+2} = T_1^A + T_2^A = \left(\frac{\lambda^A}{2}\right)_1\otimes 1_2 + 1_1\otimes\left(\frac{\lambda^A}{2}\right)_2 , \tag{19.10}$$
1 For the $(c_1c_2q)$ baryon, colour 3̄ is required to make a colour singlet with the colour triplet q and spin 1 is required by Fermi statistics, since $c_1$ and $c_2$ are already antisymmetrised by colour as in Eq. (18.5). For the tetraquark $(c_1c_2\bar u\bar d)$, not considered here, we would choose the charm pair to be a colour anti-triplet, because it is the only attractive channel, see below, and spin 1 follows from Fermi statistics as before.

and (sum over A understood):

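$$T^A_{1+2}T^A_{1+2} = T^2 = \left(\frac{\lambda^A\lambda^A}{4}\right)_1\otimes 1_2 + 1_1\otimes\left(\frac{\lambda^A\lambda^A}{4}\right)_2 + 2\left(\frac{\lambda^A}{2}\right)_1\otimes\left(\frac{\lambda^A}{2}\right)_2 = T_1^2 + T_2^2 + 2\,T_1\cdot T_2 . \tag{19.11}$$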

For each representation, R1 , R2 or R12 to which q1 , q2 or their product q1 q2

belong, the operator T 2 = C2 (R) is called the quadratic Casimir operator and
is a constant multiple of the identity over the representation.
The last term in Eq. (19.11) is, up to the factor 2, the coefficient in front of the potential (19.8). Expressing it in terms of the Casimir operators, we write the potential (19.8) as²:
$$V_V = \lambda(R)\,\frac{\alpha_s}{r} ; \qquad 2\lambda(R) = C_2(R_{12}) - C_2(R_1) - C_2(R_2) . \tag{19.12}$$

We note the results:

C2 (1) = 0 , C2 (R) = C2 (R̄) ,

C2 (3) = 4/3 ; C2 (6) = 10/3 ; C2 (8) = 3 . (19.13)

The dependence of λ on Casimir operators shows an interesting pattern of forces vs R:

1. quark-antiquark:
R = (1): attractive (λ = −4/3); R = (8): repulsive (λ = +1/6) , (19.14)

2. quark-quark:
R = (3̄): attractive (λ = −2/3); R = (6): repulsive (λ = +1/3) . (19.15)

In conclusion: quark-antiquark pairs may bind in colour singlet mesons, while

diquarks may bind: (i) to another quark, to make a colour-singlet baryon, or
(ii) to an antidiquark, to make a colour-singlet tetraquark.
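The pattern of Eqs. (19.14) and (19.15) can be checked directly from Eqs. (19.12) and (19.13). The Python sketch below does so with exact fractions; the representation labels used as dictionary keys are a naming convention chosen here for illustration.

```python
from fractions import Fraction

# Quadratic Casimir values of Eq. (19.13), with C2(R) = C2(Rbar)
C2 = {
    "1": Fraction(0),
    "3": Fraction(4, 3),
    "3bar": Fraction(4, 3),
    "6": Fraction(10, 3),
    "8": Fraction(3),
}


def colour_factor(r1, r2, r12):
    """lambda(R) = [C2(R12) - C2(R1) - C2(R2)] / 2, Eq. (19.12)."""
    return (C2[r12] - C2[r1] - C2[r2]) / 2


if __name__ == "__main__":
    for r1, r2, r12 in [("3", "3bar", "1"), ("3", "3bar", "8"),
                        ("3", "3", "3bar"), ("3", "3", "6")]:
        lam = colour_factor(r1, r2, r12)
        kind = "attractive" if lam < 0 else "repulsive"
        print(f"{r1} x {r2} -> {r12}: lambda = {lam} ({kind})")
```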
Non perturbatively, colour lines of force are supposed to condense in strings going from quarks to antiquarks, see Section 18.1.3, or from quark to quark, when the colour force is attractive. In this case, it has been argued that the strength of the attractive string (the constant k in the Cornell potential) is proportional to |λ|, with λ given in (19.12).

2 The similarity with the familiar system of two particles with spins $s_1$ and $s_2$ and total spin $S = s_1 + s_2$ is evident. In this case, the quadratic Casimir operator of total spin is: $C_2(S) = S^2 = S(S+1)$ and $2(s_1\cdot s_2) = [S(S+1) - s_1(s_1+1) - s_2(s_2+1)]$.

Figure 19.1 Schematic representation of the doubly charmed baryon within the BO approximation.
Residual quark-quark or quark-antiquark interactions are local chromo-
magnetic, spin-spin, interactions of the form:
$$H_{ij}(s_1,s_2) = -\frac{4\pi\alpha_s}{3m_im_j}\,(T_1^AT_2^A)\,[2\,(\mathbf{s}_1\cdot\mathbf{s}_2)]\,\delta^{(3)}(\mathbf{x}_1-\mathbf{x}_2) . \tag{19.16}$$

The formula given in Eq. (18.22) follows since λ = −4/3 for quark-antiquark
in colour singlet, as indicated above.

19.3 THE DOUBLY CHARMED BARYON

In 2020 the LHCb Collaboration found convincing evidence for a doubly charmed, doubly charged baryon $\Xi_{cc}^{++}(3621)$ of mass $M(\Xi_{cc}) = 3620.6 \pm 1.5$ MeV in the decay channel³
$$\Xi_{cc}^{++}(3621) \to \Xi_c^+ + \pi^+ \tag{19.17}$$

Quark composition is (ccu) and, given that the cc pair must be in total spin
Scc = 1, as noted before, the baryon total angular momentum may be J =
1/2, 3/2. There is no convincing evidence of a lighter baryon of this kind4 and
we assume provisionally J = 1/2.

19.3.1 The BO Approximation for $\Xi_{cc}^{++}$

We assume the c quarks at fixed positions $x_A$ and $x_B$ and, in analogy with the BO approximation for the $H_2^+ = (ppe)$, see Ref. [103], consider the state with the light quark bound to $c(x_A)$ with a Cornell-like potential, see Eq. (19.15)
$$V_A = -\frac{2}{3}\frac{\alpha_s}{r} + \frac{1}{2}kr + V_0 ; \qquad r = |x - x_A| , \tag{19.18}$$
where $V_0$ is the constant introduced in the Cornell potential in Sect. 18.2.2, to fix the zero of the energy.

3 According to GIM [28] the charm quark decays weakly as $c \to s + u + \bar d$. In terms of quarks, the decay (19.17) reads $(ccu) \to (csu) + (u\bar d)$.
4 Previous evidence from the SELEX experiment of a $\Xi_{cc}^+(3520)$ has not been confirmed, see PDG 2023 [12].
Of course, one has to also consider the degenerate state, with q bound to
c(xB ). We assume the light quark ground level, in the presence of the two
heavy sources, to be the superposition of the two states, see Fig. 19.1:
$$f_0(x) = \frac{\psi(x)+\varphi(x)}{\sqrt{2(1+S)}} = \frac{1}{\sqrt{4\pi}}\,\frac{R(|x-x_A|) + R(|x-x_B|)}{\sqrt{2(1+S)}} , \tag{19.19}$$
where ψ and φ are the wave functions of the two states. We indicate explic-
itly in Fig. 19.1 that the radial wave functions are the same function of two
diﬀerent variables: the distances of q from the source in xA or in xB . In cor-
respondence, ψ and φ are not orthogonal and we denote by S the overlap
integral:

$$S = \int d^3x\;\psi(x)\varphi(x) \quad \text{(real wave functions assumed)} , \tag{19.20}$$

a function of the distance between the two sources R = |xA − xB |, which

appears in the normalization of the state (19.19).
The cq orbital. In the jargon of molecular physics, the cq wave function is an orbital. It is determined by the Schrödinger equation
$$\left(-\frac{1}{2\mu_q}\nabla^2 + V_A - V_0\right)\psi = E_0\,\psi , \tag{19.21}$$
where $\mu_q$ is the reduced light quark mass, $V_A$ the potential in (19.18) and we have suppressed the constant $V_0$. The energy is then $E_0 + V_0$ plus the quark rest masses.
First order correction to the light quark energy. We consider now
the eﬀect of the interaction of the light quark with the other heavy source:

$$H_{pert} = -\frac{2}{3}\,\alpha_s\,\frac{1}{|x - x_B|} . \tag{19.22}$$
Treating (19.22) as a first order perturbation, the energy of the light quark is:
$$E(R) = E_0 + V_0 + \Delta E(R) , \tag{19.23}$$
where
$$\Delta E(R) = \langle f_0|H_{pert}|f_0\rangle = -\frac{2\alpha_s}{3}\,\frac{1}{2(1+S)}\,2\,[I_1(R) + I_2(R)] . \tag{19.24}$$

The factor 2 in the numerator of Eq. (19.24) arises because there are two equal contributions, from the sources A and B, see Fig. 19.1.
The I1,2 are functions of R deﬁned in terms of the orbital wave functions
ψ and φ:

$$I_1(R) = \int d^3\xi\;|\psi(\xi)|^2\,\frac{1}{|\xi - x_B|} , \tag{19.25}$$
$$I_2(R) = \int d^3\xi\;\psi(\xi)\varphi(\xi)\,\frac{1}{|\xi - x_B|} , \tag{19.26}$$

where the vector ξ originates from A, taken in the origin, and |xB | = R. Ana-
lytic expressions for S, I1,2 are given in [103] for the hydrogen wave functions.
We shall evaluate them numerically.
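For a hydrogen-like orbital ψ ∝ exp(−Ar), the quantities S, I₁ and I₂ have the closed forms familiar from the H₂⁺ problem. The Python sketch below uses them to evaluate ΔE(R) of Eq. (19.24) and cross-checks the overlap S(R) against a crude midpoint-rule integration. The value of A is the fitted one quoted later in the text (Eq. (19.39)); the value of αs is an assumption made here (the charmonium value of Eq. (18.39)), since the text does not state which value is used for the light quark, and the grid sizes are arbitrary.

```python
import math

A = 0.277        # inverse radius of the cq orbital in GeV, Eq. (19.39)
ALPHA_S = 0.331  # ASSUMPTION: charmonium value of Eq. (18.39)


def overlap(R):
    """Overlap S(R) of two 1s-like orbitals a distance R apart."""
    x = A * R
    return math.exp(-x) * (1.0 + x + x * x / 3.0)


def i1(R):
    """Direct integral I1(R) of Eq. (19.25) for the 1s-like orbital."""
    x = A * R
    if x < 1e-6:
        return A  # limit R -> 0
    return (1.0 - (1.0 + x) * math.exp(-2.0 * x)) / R


def i2(R):
    """Exchange integral I2(R) of Eq. (19.26) for the 1s-like orbital."""
    x = A * R
    return A * (1.0 + x) * math.exp(-x)


def delta_e(R):
    """Delta E(R) of Eq. (19.24), in GeV."""
    return -(2.0 * ALPHA_S / 3.0) * (i1(R) + i2(R)) / (1.0 + overlap(R))


def overlap_numeric(R, r_max=60.0, n_r=600, n_u=100):
    """Midpoint-rule estimate of S(R), integrating over r and u = cos(theta)."""
    dr = r_max / n_r
    du = 2.0 / n_u
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        inner = 0.0
        for j in range(n_u):
            u = -1.0 + (j + 0.5) * du
            inner += math.exp(-A * math.sqrt(r * r - 2.0 * r * R * u + R * R))
        total += r * r * math.exp(-A * r) * inner * du
    return 2.0 * A ** 3 * total * dr


if __name__ == "__main__":
    for R in (0.0, 2.0, 5.0, 10.0):
        print(f"R = {R:4.1f} GeV^-1: S = {overlap(R):.4f}, "
              f"Delta E = {1000 * delta_e(R):.1f} MeV")
```

In this approximation ΔE(0) = −(2/3) αs A, and ΔE(R) approaches −(2/3) αs/R at large R.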
The complete expression of E, to be inserted in the BO potential, requires
the addition of quark masses, so that:

E(R) = ΔE(R) + C , (19.27)

with C = E0 + V0 + quark masses.

Boundary condition at R = 0. For R → 0 the two charm quarks become a single colour 3̄ source and the system reproduces, to all effects, the colour field configuration of a charmed meson, except for the larger mass, $2M_c$, of the source. This is in essence the doubly heavy quark-single heavy antiquark symmetry introduced by several authors in the study of doubly heavy baryons [108–110]. Charmed meson masses, with spin effects subtracted, are well reproduced by the sum $M_c^{mes} + M_q^{mes}$ with⁵
$$M_c^{mes} = 1.667 ; \qquad M_q^{mes} = 0.308 , \tag{19.28}$$
$$2M_c^{mes} + M_q^{mes} = 3.642 \quad \text{(masses in GeV)} ,$$

and the boundary condition reads:

$$\Delta E(0) + C = (M_c^{mes} + M_q^{mes}) + M_c^{mes} , \tag{19.29}$$
that is
$$C = -\Delta E(0) + 2M_c^{mes} + M_q^{mes} , \tag{19.30}$$
and we obtain:
$$E(R) = \Delta E(R) - \Delta E(0) + 2M_c^{mes} + M_q^{mes} . \tag{19.31}$$

Boundary condition at R = ∞. At distances much larger than the radius of the orbital $c(x_A)q$, where forces due to gluons converge to zero like 1/R, the charm quark in $x_B$ sees the other two particles as a 3̄ source with which it combines to form a colour singlet. Therefore, the cc interaction must contain at large distances a confining string potential, with the same strength as the Cornell potential introduced in (18.17).
To take this effect into account, we add to the BO potential a linearly rising term determined by the string tension k of charmonium, see Sect. 18.2.1
$$V_{conf}(R,R_0) = k\times(R-R_0)\times\theta(R-R_0) , \tag{19.32}$$
and leave, for the moment, the onset point $R_0$ as a free parameter. The Born-Oppenheimer potential then reads
$$V_{BO}(R) = -\frac{2}{3}\frac{\alpha_s}{R} + \Delta E(R) - \Delta E(0) + V_{conf}(R,R_0) + 2M_c^{mes} + M_q^{mes} . \tag{19.33}$$

5 Charmed baryon masses require different charm and light quark masses, see e.g. Ref. [100], the difference being interpreted as due to the different colour field configurations in the two cases. For comparison, $2M_c^{bar} + M_q^{bar} = 3.782$ GeV [100].

Hyperfine interactions. We consider first the interaction of the light quark spin, s, with the spins of the heavy sources, $s_A$ and $s_B$, described by Eq. (19.16). With $T_q^AT_c^A = -2/3$, Eq. (19.15), we write
$$\langle f_0|H_{hf}(s,s_A) + H_{hf}(s,s_B)|f_0\rangle = +\frac{8\pi\alpha_s}{9M_qM_c}\,\frac{1}{2(1+S)}\,[\psi(x_A)+\varphi(x_A)]^2\,[2(\mathbf{s}\cdot\mathbf{s}_A)] + (A\to B) .$$
We set $|x_A| = 0$, $|x_B| = R$ and note that $\varphi(x_A) = \psi(x_B) = \psi(R)$ and $\varphi(x_B) = \psi(x_A) = \psi(0)$; thus, we find
$$\langle f_0|H_{hf}(s,s_A) + H_{hf}(s,s_B)|f_0\rangle = +\frac{8\pi\alpha_s}{9M_qM_c}\,\frac{1}{2(1+S)}\,[\psi(0)+\psi(R)]^2\;2\,\mathbf{s}\cdot(\mathbf{s}_A+\mathbf{s}_B) . \tag{19.34}$$

For a $\Xi_{cc}$ with spin J = 1/2, we have $2\,\mathbf{s}\cdot(\mathbf{s}_A+\mathbf{s}_B) = -2$ and the spin potential, to be added to the BO potential (19.33), is
$$V_{BOspin} = -\frac{16\pi\alpha_s}{9M_qM_c}\,\frac{1}{2(1+S)}\,[\psi(0)+\psi(R)]^2 . \tag{19.35}$$
The complete BO potential becomes
$$V_{BOs}(R) = -\frac{2}{3}\frac{\alpha_s}{R} + \Delta E(R) - \Delta E(0) + V_{conf}(R,R_0) + V_{BOspin} + 2M_c^{mes} + M_q^{mes} . \tag{19.36}$$

Finally, Eq. (19.16) applied to the cc hyperfine interaction, taking into account that $2(\mathbf{s}_A\cdot\mathbf{s}_B) = +1/2$ for total cc spin $S_{cc} = 1$, gives a first order correction to the eigenvalue $E_{BO}$ of the Schrödinger equation with the spinless potential (19.33):
$$E_{hf}(s_A,s_B) = +\frac{4\pi\alpha_s}{9M_c^2}\,|\Psi(0)|^2 , \tag{19.37}$$
to be computed from the corresponding BO eigenfunction Ψ⁶.

Figure 19.2 Left: Single exponential (dashed curve) fitted to the radial wave function of the (qc) orbital (solid curve). Right: BO potential (no spin) and radial BO wave function.

19.3.2 Numerical Results

The radial wave function R(x) of the cq orbital can be obtained by solving numerically Eq. (19.21). To reduce the computing time of the subsequent calculations, it is useful to approximate R(x) with a single exponential Exp[−Ar], finding the value of A from a best fit to the numerical determination (see Ref. [107]). Fig. 19.2 (left) shows the error made by the hydrogen-like approximation (see Eq. (B.45))⁷:
$$R(x) = 2A^{3/2}\,e^{-Ax} ; \qquad A = 0.277\ \text{GeV} . \tag{19.39}$$

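6 The program Schrö[Link] gives the reduced radial wave function χ(r) and
$$|\Psi(0)|^2 = \frac{1}{4\pi}\lim_{r\to 0}\left|\frac{\chi(r)}{r}\right|^2 . \tag{19.38}$$

7 Correspondingly, the integrals to compute ΔE in the $V_{BO}$ potential become:
$$I_1(R) = 2A^3\int_0^{\infty}dr\int_0^{\pi}d\theta\,\sin\theta\; r^2\,\frac{e^{-2Ar}}{\sqrt{r^2 - 2rR\cos\theta + R^2}} ,$$
$$I_2(R) = 2A^3\int_0^{\infty}dr\int_0^{\pi}d\theta\,\sin\theta\; r^2\,\frac{e^{-Ar}\,e^{-A\sqrt{r^2 - 2rR\cos\theta + R^2}}}{r} ,$$
$$S(R) = 2A^3\int_0^{\infty}dr\int_0^{\pi}d\theta\,\sin\theta\; r^2\,e^{-Ar}\,e^{-A\sqrt{r^2 - 2rR\cos\theta + R^2}} ,$$
and the hyperfine cq interaction potential is
$$\langle f_0|H_{hf}(s,s_A) + H_{hf}(s,s_B)|f_0\rangle = +\frac{8\pi\alpha_s}{9M_qM_c}\,\frac{2A^3(1+e^{-AR})^2}{1+S(R)}\;2\,\mathbf{s}\cdot(\mathbf{s}_A+\mathbf{s}_B) .$$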

Table 19.1 Eigenvalues of the Born-Oppenheimer equation without and with the hyperfine potential and the resulting M(Ξcc) mass, for R0 = 7 ± 2 GeV⁻¹ (see text). Energy and mass in MeV.

R0 (GeV⁻¹)   EBO, Eq. (19.33)   EBOs, Eq. (19.36)   Ehf(cc)   2Mc^mes + Mq^mes   M(Ξcc)
5            48                 34                  1.58      3642               3678
7            22                 9                   1.1       3642               3652
9            11                 1.6                 0.9       3642               3645

We report in Fig. 19.2 (right) the no-spin BO potential, Eq. (19.33), and
the corresponding radial wave function, Ψ(R).
Table 19.1 summarises the eigenvalues of the Born-Oppenheimer equation
without, Eq. (19.33), and with, Eq. (19.36), the qc hyperfine potential for
R0 = 7 ± 2 GeV−1 . The hyperfine cc interaction is computed with Eq. (19.37).
To obtain M (Ξcc ) the quark mass combination 2Mcmes +Mqmes has been added.
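The last column of Table 19.1 can be cross-checked with a few lines of Python. The sketch assumes that the mass is obtained as the sum of the eigenvalue with the hyperfine potential, the cc hyperfine energy and the quark mass combination of Eq. (19.28); this reading reproduces the tabulated masses within rounding. Names are illustrative.

```python
QUARK_MASSES = 3642.0  # 2 M_c^mes + M_q^mes in MeV, Eq. (19.28)

# R0 in GeV^-1 -> (E_BOs, E_hf(cc)) in MeV, taken from Table 19.1
TABLE_19_1 = {5: (34.0, 1.58), 7: (9.0, 1.1), 9: (1.6, 0.9)}


def xi_cc_mass(r0):
    """M(Xi_cc) in MeV, assumed to be E_BOs + E_hf(cc) + quark masses."""
    e_bos, e_hf = TABLE_19_1[r0]
    return e_bos + e_hf + QUARK_MASSES


if __name__ == "__main__":
    central = xi_cc_mass(7)
    print(f"M(Xi_cc) = {central:.1f} MeV for R0 = 7 GeV^-1")
    print(f"upper variation (R0 = 5): +{xi_cc_mass(5) - central:.1f} MeV")
    print(f"lower variation (R0 = 9): -{central - xi_cc_mass(9):.1f} MeV")
```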
In conclusion, we find:
$$M(\Xi_{cc}^{++}, J = 1/2) = 3652^{+26}_{-7}\ \text{MeV} \quad \text{for } R_0 = 7\pm 2\ \text{GeV}^{-1} , \tag{19.40}$$
to be compared to $M(\Xi_{cc}^{++})_{exp} = 3621$ MeV. The agreement with the observed $\Xi_{cc}^{++}$ mass is reasonable (given the error of the BO approximation estimated in the next Section) at the expense, however, of having introduced the ad-hoc parameter $R_0$. The real measure of the effectiveness of the Born-Oppenheimer approximation has to wait for the comparison of this formula with the different levels of doubly charmed baryons predicted by the quark model, as was the case for the charmonium spectrum vs. the Cornell potential.

19.3.3 About the BO Approximation Error in QCD

Recall that we have characterised the BO approximation error by the ratio
$$\Lambda = \frac{1/a}{1/b} , \tag{19.41}$$
where a and b are the lengths over which f0 and Ψ show an appreciable
variation.
The length a is simply the radius of the cq orbital: 1/a = A ∼ 0.3 GeV,
i.e. a ∼ 0.7 fm.
The length b has to be formed from the dimensional quantities on which the Born-Oppenheimer equation (19.33) depends. In the case of doubly heavy baryons Eq. (19.33) depends on M, A and on the string tension k, which has dimensions of GeV². The simplest possibility is $1/b = (AkM)^{1/4}$; that is:
$$\Lambda = A^{3/4}\,(kM)^{-1/4} , \tag{19.42}$$
which is 0.53 for charm, using Eqs. (18.39) and (19.28). For convenience we
have included quark masses in VBO , but it is worth noticing that the error we
are estimating is the error on binding energies, which turns out to be of the
order of 50 MeV or smaller in absolute value. So, the BO approximation error
corresponding to (19.42) is expected to be of the order of 25−30 MeV.
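The estimate of Eq. (19.42) is easy to reproduce. The Python sketch below assumes A from Eq. (19.39), k from Eq. (18.39) and the charm mass of Eq. (19.28); with a different choice of the charm mass the ratio changes by a few percent, so the result should be read as an order-of-magnitude check rather than a precise number.

```python
A = 0.277   # inverse radius of the cq orbital in GeV, Eq. (19.39)
K = 0.18    # string tension in GeV^2, Eq. (18.39)
M = 1.667   # ASSUMPTION: charm mass M_c^mes in GeV, Eq. (19.28)


def bo_error_ratio(a_inv, k, m):
    """Lambda = A^(3/4) (k M)^(-1/4), Eq. (19.42)."""
    return a_inv ** 0.75 * (k * m) ** -0.25


if __name__ == "__main__":
    ratio = bo_error_ratio(A, K, M)
    binding_scale = 50.0  # MeV, typical size of the binding energies
    print(f"Lambda = {ratio:.2f}")
    print(f"expected BO error ~ {ratio * binding_scale:.0f} MeV")
```

The ratio comes out close to 0.5, in line with the value quoted in the text, and multiplying it by binding energies of about 50 MeV gives an error in the 25−30 MeV range.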

19.4 PROBLEMS FOR CHAPTER 19

Sect. 19.3
1. Consider the decay $\Xi_{cc}^{++} \to \Xi_c^+ + \pi^+$, with both baryons with $J^P = 1/2^+$ and the π meson with $J^P = 0^-$. Find the values of the orbital angular momentum of π⁺ allowed by total angular momentum conservation.
2. Same problem, assuming a $\Xi_{cc}^{++}$ with $J^P = 3/2^+$.

3. Which of the orbital angular momenta found in 1. and 2. correspond to parity conserving decays?
APPENDIX A

Basic Elements of Quantum Mechanics

This appendix summarises the basic principles of quantum mechanics. The

objective above all is to recall the most important ideas and to deﬁne the nota-
tion which will be used in what follows. For a deeper discussion of the physical
basis of the theory, the reader is invited to consult the work of Dirac [111].
A concise and modern discussion of the fundamentals and the philosophical
diﬃculties of quantum mechanics can be found in the book by Bell [112].

A.1 THE PRINCIPLE OF SUPERPOSITION

At a given instant of time, the states of a quantum system are represented
by the elements of an abstract space, H. These elements will be denoted using
Dirac notation:

|A >, |B >, |C > . . . , known as kets, corresponding to the physical states

A, B, C, etc. of the system.
The mathematical structure of H is ﬁxed by the principle of superposition,
according to which, if |A > and |B > represent two possible states of the
system, other states can always be expressed in terms of an arbitrary linear
combination of |A > and |B > with complex coeﬃcients α and β:

|C >= α|A > +β|B > (A.1)

H is a complex vector (or linear) space, in general with an inﬁnite number

of dimensions.
Characterisation of the physical states necessarily implies the experimental
determination of values of one or more observable quantities of the states
themselves. Let us suppose that the kets |A > and |B > which appear in
equation (A.1) correspond respectively to two distinct values which we call
a and b, of the same observable X (for example the energy). The physical

DOI: 10.1201/9781003436263-A

This chapter has been made available under a CC BY NC license.

interpretation of the state |C > as a superposition of |A > and |B > is as

follows:

• The superposition of |A > with itself (the case β = 0) gives rise to the same physical state; |A > and α|A > represent the same state for all values of α ≠ 0.
• In the case in which α and β are both ≠ 0, the result of a measurement of X on the state |C > can be a or b; only one of these two values can be the outcome of the measurement.
• It is not possible to predict which of the two values will be the result of
a given measurement. However, if the states |A > and |B > are correctly
normalised (in the sense deﬁned below), the frequencies with which the
outcomes a and b occur are in the ratio |α|2 /|β|2 .

To connect the superposition coeﬃcients to the probability of diﬀerent

measurement results, normalisation of the vectors which characterise the
states is necessary. This requires that the scalar product of |A > and |B > in
H can be deﬁned, which we denote as:

< B|A >=< A|B >∗ (A.2)

(the asterisk implies the operation of complex conjugation).

The scalar product < B|A > must be linear in A and hence antilinear
in B

< C|αA + βB >= α < C|A > +β < C|B > (A.3)

< αA + βB|C >= α∗ < A|C > +β ∗ < B|C > (A.4)

as well as positive deﬁnite:

< A|A > ≥ 0,

< A|A > = 0 if, and only if, |A > = 0.

With this final constraint, the space H is a Hilbert space, and the quantity
< A|A > is the square of the norm of the vector |A >.
Besides H, we can consider a new vector space, the dual space H∗ , defined
as the space of linear functionals (with complex values) defined on H. It is not
difficult to be convinced that the elements of H∗ have a one-to-one mapping
onto those of H. In effect, according to a well known theorem due to Riesz,
every linear functional f (|A >) can be written as:

f (|A >) =< f |A > (A.5)

with |f > a ﬁxed ket. Therefore we can make the functional f (|A >) in H∗
correspond to |f > in H. (A.5) explains the Dirac notation for the elements
of H∗ , according to which we denote with:

< f| (A.6)

the element of H∗ which corresponds to the ket |f >. The vectors < f | are
known by the name bra. As a result of (A.4), the bra < f | depends antilinearly
on the ket |f > and the scalar product (A.2) can be interpreted as the product
of the bra < B| with the ket |A > (bra* ket = bracket = product, from which
the terms bra and ket are derived).
Turning to the probabilistic interpretation of the results of the measure-
ment of X on the state |C > in equation (A.1), we assume that:

• If the vectors |A > and |B > have unity norm, the probability of ob-
taining a (or b) from the measurement of X on |C > is equal to |α|² (or
|β|²).
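The probabilistic rule above can be checked on a small numerical example. The sketch below is only illustrative: it assumes a two-dimensional space in which |A > and |B > are represented by orthonormal column vectors, and the names `inner`, `normalise`, `ket_A`, `ket_B` and `ket_C` are hypothetical choices, not notation from the text.

```python
import math

def inner(u, v):
    """Scalar product <u|v>: antilinear in u, linear in v, as in (A.3)-(A.4)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def normalise(v):
    """Rescale a vector to unit norm."""
    norm = math.sqrt(inner(v, v).real)
    return [x / norm for x in v]

# Assumed example: |A> and |B> are orthonormal kets of a two-dimensional space.
ket_A = [1 + 0j, 0j]
ket_B = [0j, 1 + 0j]

# |C> = alpha|A> + beta|B>, rescaled so that <C|C> = 1.
alpha, beta = 3 + 0j, 4j
ket_C = normalise([alpha * a + beta * b for a, b in zip(ket_A, ket_B)])

prob_a = abs(inner(ket_A, ket_C)) ** 2   # probability of the outcome a
prob_b = abs(inner(ket_B, ket_C)) ** 2   # probability of the outcome b
print(prob_a, prob_b, prob_a / prob_b)
```

The two probabilities add up to one, and their ratio equals |α|²/|β|² computed from the unnormalised coefficients.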

A.2 LINEAR OPERATORS

On the vector space H we can deﬁne linear operators, such that to every
vector of H (contained in an appropriate region of H itself) we can associate
another vector, which is related in a linear way to the ﬁrst.1

|B >= X|A > (A.7)

X(α|A > +β|B >) = αX|A > +βX|B > . (A.8)

Given X, we can consider the complex number:

< A|X|B >∗

for every < A| and |B >. This complex number depends antilinearly on |B >.
Therefore we can write:

< A|X|B >∗ =< B|V > .
Moreover, because |V > depends linearly on the ket |A > which corre-
sponds to bra < A|, we can also write:

< A|X|B >∗ =< B|V >=< B|X † |A > (A.9)

where X † is a new linear operator, associated unambiguously to X by (A.9),

and known as the adjoint (or hermitian conjugate) operator of X. Clearly the
relations:

(X † )† = X (A.10)

(αX)† = α∗ X † (A.11)
1 In what follows we will assume for simplicity that the operators are deﬁned in all H.

apply for every complex number α.
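For finite-dimensional matrices the adjoint is the conjugate transpose, and the defining relation (A.9) together with (A.10) and (A.11) can be verified directly. The following is a minimal sketch under that assumption; the matrix `X`, the vectors and the helper names are arbitrary illustrative choices.

```python
def dagger(m):
    """Hermitian conjugate of a matrix: transpose and complex-conjugate."""
    rows, cols = len(m), len(m[0])
    return [[m[r][c].conjugate() for r in range(rows)] for c in range(cols)]

def apply(m, v):
    """Matrix acting on a column vector, X|v>."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def inner(u, v):
    """Scalar product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

# Arbitrary test data (assumed, not taken from the text).
X = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
ket_A = [1 + 1j, 2 - 1j]
ket_B = [0.5j, 3 + 0j]
alpha = 2 - 3j

lhs = inner(ket_A, apply(X, ket_B)).conjugate()   # <A|X|B>*
rhs = inner(ket_B, apply(dagger(X), ket_A))       # <B|X^dagger|A>, equation (A.9)

scaled_dagger = dagger([[alpha * x for x in row] for row in X])            # (alpha X)^dagger
expected = [[alpha.conjugate() * x for x in row] for row in dagger(X)]     # alpha* X^dagger
print(abs(lhs - rhs))
```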

An operator H is hermitian, or self-adjoint, if H† = H. Hermitian opera-
tors have some properties with respect to their eigenvalues and eigenvectors
which are crucial for the development of quantum mechanics.
We recall that the eigenvalue of a linear operator X is a (real or complex)
number λ for which the equation:

X|v >= λ |v > (A.12)

allows solutions |v > ≠ 0. In this case we say that the eigenvector |v > belongs
to the eigenvalue λ and we write |v >= |λ, a, b, . . . > (a, b, . . . are parameters
which distinguish the eigenvectors which belong to the same eigenvalue).
The following properties apply:
• The eigenvalues of a hermitian operator are always real numbers.
• Eigenvectors |h > and |h′ >, which belong to two distinct eigenvalues,
h and h′, are orthogonal to each other:

< h′|h >= 0, if h′ ≠ h. (A.13)

• The eigenvectors of a hermitian operator form a complete basis in H.

Assuming, for simplicity, that H has a discrete spectrum of eigenvalues,
the final property implies that every vector can be expressed in the basis of
the normalised eigenvectors of H:

$$|A\rangle = \sum_n c_n\,|h_n\rangle \tag{A.14}$$

with:

$$H|h_n\rangle = h_n\,|h_n\rangle$$

$$\langle h_n|h_m\rangle = \delta_{n,m}$$

from which it follows that:

$$c_n = \langle h_n|A\rangle$$

$$\langle A|A\rangle = \sum_n |c_n|^2. \tag{A.15}$$
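The three properties of hermitian operators, and the expansion (A.14)-(A.15), can be illustrated with a 2 × 2 hermitian matrix, for which the eigenvalue problem has a closed form. This is a sketch under stated assumptions: the function `eigen_hermitian_2x2` and the numerical entries are hypothetical, and the closed form used requires a non-zero off-diagonal element.

```python
import math

def inner(u, v):
    """Scalar product <u|v>."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def eigen_hermitian_2x2(a, b, d):
    """Eigenpairs of the hermitian matrix [[a, b], [b*, d]], a and d real, b != 0."""
    mean, half_gap = (a + d) / 2, math.hypot((a - d) / 2, abs(b))
    pairs = []
    for lam in (mean - half_gap, mean + half_gap):   # both roots are real
        vec = [b, lam - a]                           # solves (a - lam) v0 + b v1 = 0
        norm = math.sqrt(inner(vec, vec).real)
        pairs.append((lam, [x / norm for x in vec]))
    return pairs

(h1, ket1), (h2, ket2) = eigen_hermitian_2x2(1.0, 1 - 2j, -1.0)

ket_A = [0.6 + 0j, 0.8j]                             # a unit-norm state
c1, c2 = inner(ket1, ket_A), inner(ket2, ket_A)      # c_n = <h_n|A>
overlap = inner(ket1, ket2)                          # vanishes for h1 != h2, (A.13)
norm_sq = abs(c1) ** 2 + abs(c2) ** 2                # equals <A|A>, (A.15)
rebuilt = [c1 * ket1[i] + c2 * ket2[i] for i in range(2)]   # (A.14)
print(h1, h2, abs(overlap), norm_sq)
```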

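A useful concept, in connection to the base set of eigenvectors of H, is that
of projection of one or more states. The projection operator of a given vector,
for example of |h1 >, is one which has the property that:

$$P^2 = P \tag{A.16}$$

$$P\,|h_1\rangle = |h_1\rangle \tag{A.17}$$

$$P\,|V\rangle = 0, \quad \text{if } \langle h_1|V\rangle = 0. \tag{A.18}$$

These properties are satisfied by the operator

$$P = |h_1\rangle\langle h_1| \tag{A.19}$$

since, for a normalised |h1 >:

$$P^2 = |h_1\rangle\langle h_1|h_1\rangle\langle h_1| = |h_1\rangle\langle h_1| = P$$

$$P\,|h_1\rangle = |h_1\rangle\langle h_1|h_1\rangle = |h_1\rangle$$

$$P\,|V\rangle = |h_1\rangle\langle h_1|V\rangle = 0, \quad \text{if } \langle h_1|V\rangle = 0.$$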

The projection operator in a multidimensional space, defined by a certain
number of vectors orthogonal to each other, is simply given by the sum of the
projections of single vectors:

$$P = \sum_n |h_n\rangle\langle h_n| \tag{A.20}$$

and the completeness condition for the base set of eigenvectors of H is ex-
pressed as:

$$\sum_n |h_n\rangle\langle h_n| = 1 \tag{A.21}$$

(1 indicates the identity operator). Equations (A.14) and (A.15) are obtained
formally from (A.21) in the following way:

$$|A\rangle = 1\,|A\rangle = \sum_n |h_n\rangle\langle h_n|A\rangle = \sum_n c_n\,|h_n\rangle.$$

Similarly,

$$\langle A|A\rangle = \langle A|1|A\rangle = \sum_n \langle A|h_n\rangle\langle h_n|A\rangle = \sum_n |c_n|^2.$$
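The projector properties and the completeness condition can be checked by writing |h_n >< h_n| as an outer product. The sketch below assumes a two-dimensional space with a hypothetical orthonormal basis `ket_h1`, `ket_h2`; the helper names are illustrative only.

```python
import math

def outer(u, v):
    """The operator |u><v| as a matrix."""
    return [[a * b.conjugate() for b in v] for a in u]

def matmul(m1, m2):
    """Product of two square matrices of equal size."""
    n = len(m2)
    return [[sum(row[k] * m2[k][c] for k in range(n)) for c in range(n)] for row in m1]

def matadd(m1, m2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def close(m1, m2, tol=1e-12):
    """Entry-by-entry comparison within a tolerance."""
    return all(abs(x - y) < tol for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

s = 1 / math.sqrt(2)
ket_h1 = [s, 1j * s]      # assumed orthonormal basis of a two-dimensional space
ket_h2 = [s, -1j * s]

P1, P2 = outer(ket_h1, ket_h1), outer(ket_h2, ket_h2)
is_idempotent = close(matmul(P1, P1), P1)                    # P^2 = P, (A.16)
kills_orthogonal = close(matmul(P1, P2), [[0, 0], [0, 0]])   # P1 annihilates |h2>, (A.18)
is_complete = close(matadd(P1, P2), [[1, 0], [0, 1]])        # completeness, (A.21)
print(is_idempotent, kills_orthogonal, is_complete)
```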

A.3 OBSERVABLES AND HERMITIAN OPERATORS
The relevance of the concepts just illustrated lies in the fact that in quan-
tum mechanics every observable quantity, O, is represented by a hermitian
operator, O.
The eigenvectors of O represent the physical states for which O assumes
a deﬁnite value, equal to the eigenvalue which corresponds to the eigenvector
in question. The spectrum of eigenvalues of O therefore deﬁnes the set of
possible results from a measurement of O. The considerations of Section A.2
allow description of the results of a measurement of O on a general state |A >
in the following way.

• The measurement of O on |A > gives one of the eigenvalues of O, for

example hn , as the result with a probability proportional to the squared
modulus of the corresponding coeﬃcient of the expansion, cn .

• The sum of the probabilities for all possible cases must be equal to one.
If |A > is normalised to unity, equation (A.15) shows that:

$$P(h_n \text{ on } |A\rangle) = |c_n|^2 = |\langle h_n|A\rangle|^2. \tag{A.22}$$

This result gives a physical meaning to the scalar product of two vectors.
Let us suppose that |A > and |B > correspond to states in which two
diﬀerent quantities assume deﬁnite values, xa for the observable X in
|A > and yb for the observable Y in |B >. If |A > and |B > are
normalised to unity, the probability that a measurement of Y in |A >
has the result yb is given by the squared modulus of the corresponding
scalar product:

$$P(y_b \text{ on } |A\rangle) = |\langle B|A\rangle|^2. \tag{A.23}$$

For this reason, the scalar product is also known as the probability
amplitude.

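• The average result of many measurements of O on |A > (see footnote 2) is
given by the formula:

$$\langle O\rangle_A = \sum_n h_n\,P(h_n \text{ on } |A\rangle) = \sum_n h_n\,|c_n|^2 = \sum_n h_n\,\langle A|h_n\rangle\langle h_n|A\rangle = \langle A|\,O\Big(\sum_n |h_n\rangle\langle h_n|\Big)|A\rangle = \langle A|O|A\rangle \tag{A.24}$$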

by virtue of equation (A.21). For this reason, the matrix element of O

between < A| and |A > (diagonal matrix element) is also called the
expectation value of O on A.
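Equation (A.24) states that the probability-weighted mean of the eigenvalues coincides with the diagonal matrix element. A minimal numerical check follows, assuming as an example the observable represented by the Pauli matrix σ_x and an arbitrary normalised state; all names are illustrative.

```python
import math

def inner(u, v):
    """Scalar product <u|v>."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

# Assumed observable: sigma_x, with eigenvalues +1 and -1.
O = [[0, 1], [1, 0]]
s = 1 / math.sqrt(2)
eigen = [(+1.0, [s, s]), (-1.0, [s, -s])]     # (eigenvalue, normalised eigenvector)

ket_A = [0.6 + 0j, 0.8 + 0j]                  # a unit-norm state

# Probabilities (A.22) and the two expressions for the mean value in (A.24).
probs = {h: abs(inner(ket, ket_A)) ** 2 for h, ket in eigen}
mean_from_probs = sum(h * p for h, p in probs.items())
mean_from_matrix = inner(ket_A, apply(O, ket_A)).real
print(probs, mean_from_probs, mean_from_matrix)
```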

A.4 THE NON-RELATIVISTIC SPIN 0 PARTICLE

The spinless non-relativistic particle provides the simplest concrete exam-
ple of the ideas explained above. The fundamental observables of this system
are the coordinate, x, (for simplicity we consider just a single spatial dimen-
sion and we set ħ = 1) and the conjugate momentum, p, with the commutation
rules:

[x, p] = i. (A.25)
2 This means that each of these measurements is carried out on a new replica of the
system, prepared in the state |A > using appropriate experimental apparatus.

We can introduce the eigenstates of x and p:

x|x >= x|x >

p|p >= p|p >
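with the normalisation and completeness conditions:

$$\langle x|x'\rangle = \delta(x - x') \tag{A.26}$$

$$\int dx\,|x\rangle\langle x| = 1 \tag{A.27}$$

$$\langle p|p'\rangle = \delta(p - p') \tag{A.28}$$

$$\int dp\,|p\rangle\langle p| = 1. \tag{A.29}$$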

The wave function of a given ket, |A >, is the component of |A > in the
base set of the eigenvectors |x >:

ψA (x) =< x|A > (A.30)

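with:

$$\langle B|A\rangle = \int dx\,\langle B|x\rangle\langle x|A\rangle = \int dx\,\psi_B(x)^*\,\psi_A(x) \tag{A.31}$$

from which:

$$1 = \langle A|A\rangle = \int dx\,\langle A|x\rangle\langle x|A\rangle = \int dx\,|\psi_A(x)|^2 \tag{A.32}$$

in agreement with the interpretation of |ψA (x)|² as the probability density to
find the particle between x and x + dx.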
From equation (A.30) and from the commutation relation (A.25), one
finds [111]:

$$(x\psi_A)(x) = x\,\psi_A(x)$$

$$(p\psi_A)(x) = -i\,\frac{d}{dx}\psi_A(x).$$

The wave function of the ket |p > is obtained directly from this equation:

$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi}}\,e^{ipx} \tag{A.33}$$

with the normalisation factor determined from (A.28).

A.4.1 Translations and Rotations

The results just illustrated permit us to characterise the operations of
spatial translation.
We deﬁne the translation operators, U (a), as those which, applied to a
ket |A >, transform it into the ket which corresponds to the translated state,
which is the state obtained by translating, through a length a, all the relevant
apparatus necessary to produce the state |A >. The homogeneity of space
requires that U (a) should be a unitary operator (see Section 10.4)

$$U(a)^\dagger = U(a)^{-1} = U(-a). \tag{A.34}$$

From the deﬁnition of U , ignoring irrelevant phase factors, it follows that:

U (a)|x >= |x + a > (A.35)

or:

< x|U (a)† =< x + a| (A.36)

from which one obtains:

$$\langle x|U(a)^\dagger|p\rangle = \langle x+a|p\rangle = (2\pi)^{-1/2}\,e^{ip(x+a)} = e^{ipa}\,\langle x|p\rangle = \langle x|e^{ipa}|p\rangle. \tag{A.37}$$

Because this relation must hold for every |p > and for every |x >, we find:

$$U(a) = e^{-ipa}. \tag{A.38}$$

For infinitesimal transformations:

$$U(a) \simeq 1 - ipa. \tag{A.39}$$

The momentum is the infinitesimal generator of spatial translation. For a gen-
eral state, |A >:

$$(U(a)\psi_A)(x) = \langle x|U(a)|A\rangle = \langle x-a|A\rangle = \psi_A(x-a). \tag{A.40}$$
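With p = −i d/dx, the operator e^{−ipa} becomes e^{−a d/dx}, whose power series is the Taylor expansion of ψ(x − a). The sketch below checks (A.40) exactly on a polynomial, for which the series terminates. A polynomial is not a normalisable wave function, so this is only an illustration of the operator identity; the function names and the chosen polynomial are assumptions.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients of d/dx of the polynomial sum_k coeffs[k] x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def translate(coeffs, a):
    """Apply exp(-a d/dx) term by term; a must be a Fraction (or an int)."""
    result = [Fraction(0)] * len(coeffs)
    term, order = list(coeffs), 0
    while term:                                   # stops once the derivative vanishes
        weight = Fraction((-a) ** order, factorial(order))
        for k, c in enumerate(term):
            result[k] += weight * c
        term, order = derivative(term), order + 1
    return result

# Assumed example: psi(x) = 1 - 2x + 3x^3, translated by a = 1/2.
psi = [Fraction(1), Fraction(-2), Fraction(0), Fraction(3)]
a = Fraction(1, 2)
shifted = translate(psi, a)
print(shifted)
```

Evaluating `shifted` at any point x gives the same value as evaluating `psi` at x − a.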

The transition to the three-dimensional case allows discussion of spatial

rotation symmetry. In analogy to the case of translation, we deﬁne the unitary
operators, U (R), according to the relation:

U (R)|x >= |Rx > (A.41)

where R is the (orthogonal 3 × 3) rotation matrix, and moreover:

$$U(R)^\dagger = U(R)^{-1} = U(R^{-1}). \tag{A.42}$$

For rotations through an angle θ around the z axis:

$$(R\mathbf{x})_x = \cos\theta\,x - \sin\theta\,y$$

$$(R\mathbf{x})_y = \sin\theta\,x + \cos\theta\,y$$

$$(R\mathbf{x})_z = z.$$

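Proceeding as in (A.40) we find, for a general ket:

$$(U(R)\psi_A)(\mathbf{x}) = \langle\mathbf{x}|U(R)|A\rangle = \langle R^{-1}\mathbf{x}|A\rangle = \psi_A(R^{-1}\mathbf{x}). \tag{A.43}$$

For infinitesimal rotations about the z axis, we therefore find:

$$\langle\mathbf{x}|U(R)|A\rangle = \psi_A(\mathbf{x}) - \theta\left(x\,\frac{d}{dy} - y\,\frac{d}{dx}\right)\psi_A(\mathbf{x}) = \langle\mathbf{x}|\,[1 - i\theta\,(\mathbf{x}\times\mathbf{p})_z]\,|A\rangle$$

from which

$$U(R) = 1 - i\theta L_z \tag{A.44}$$

where Lz is the angular momentum component along the z axis:

$$\mathbf{L} = \mathbf{x}\times\mathbf{p}. \tag{A.45}$$

More generally, for an arbitrary rotation,

$$U(R) = e^{-i\mathbf{n}\cdot\mathbf{L}}$$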

where n is a three-dimensional vector whose direction identifies the rotation
axis while |n| = θ is the rotation angle.
Therefore the generators of infinitesimal rotations are the angular momen-
tum components. Making use of the canonical relations, (A.25), it can be
verified that the operators L obey the commutation relations:

$$[L_i, L_j] = i\,\epsilon_{ijk}\,L_k. \tag{A.46}$$

It is important to be convinced that the commutation rules (A.46) are a
direct consequence of the structure of the rotation group and the requirement
that:

$$U(R_1)\,U(R_2) = U(R_1R_2) \tag{A.47}$$

for arbitrary infinitesimal rotations, R1 and R2.

To obtain this result, we make use of the fact that every orthogonal 3 × 3
matrix, R, can be written as:

$$R = e^{-i\mathbf{n}\cdot\mathbf{T}} \tag{A.48}$$

where n is the same vector which appears in the exponential form of U (R)
given above and the 3 × 3 matrices, T , are given by (see footnote 3):

$$(T_k)_{ij} = i\,\epsilon_{ikj}. \tag{A.49}$$

An explicit calculation shows that the matrices Tk satisfy commutation
relations equivalent to (A.46):

$$[T_i, T_j] = i\,\epsilon_{ijk}\,T_k \tag{A.50}$$
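The "explicit calculation" behind (A.50) is short enough to carry out by machine. The sketch below builds the three matrices from (A.49), using indices 0, 1, 2 for x, y, z, and also checks that 1 − iθT_3 reproduces the infinitesimal rotation about the z axis. Function names are illustrative assumptions.

```python
def epsilon(i, j, k):
    """Levi-Civita symbol for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# (T_k)_ij = i * epsilon_ikj, equation (A.49); T[k] is the k-th generator.
T = [[[1j * epsilon(i, k, j) for j in range(3)] for i in range(3)] for k in range(3)]

def commutation_defect(i, j):
    """Largest entry of [T_i, T_j] - i eps_ijk T_k; zero when (A.50) holds."""
    ab, ba = matmul(T[i], T[j]), matmul(T[j], T[i])
    return max(
        abs(ab[r][c] - ba[r][c]
            - sum(1j * epsilon(i, j, k) * T[k][r][c] for k in range(3)))
        for r in range(3) for c in range(3)
    )

def first_order_rotation_z(theta):
    """The matrix 1 - i*theta*T_3, an infinitesimal rotation about z."""
    return [[(1 if r == c else 0) - 1j * theta * T[2][r][c] for c in range(3)]
            for r in range(3)]

worst = max(commutation_defect(i, j) for i in range(3) for j in range(3))
R = first_order_rotation_z(1e-3)
print(worst, R[0][1], R[1][0])
```

To first order in θ the entries of `R` agree with cos θ ≈ 1 and sin θ ≈ θ in the rotation about the z axis.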

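For infinitesimal n1 and n2 (see footnote 4):

$$R_1R_2 = e^{-i\mathbf{n}_1\cdot\mathbf{T}}\,e^{-i\mathbf{n}_2\cdot\mathbf{T}} = e^{-i[(\mathbf{n}_1+\mathbf{n}_2)\cdot\mathbf{T} - (i/2)[\mathbf{n}_1\cdot\mathbf{T},\,\mathbf{n}_2\cdot\mathbf{T}] + \dots]} = e^{-i[(\mathbf{n}_1+\mathbf{n}_2+(1/2)\,\mathbf{n}_1\times\mathbf{n}_2)\cdot\mathbf{T} + \dots]} \tag{A.51}$$

where the dots represent higher order terms in n1 or n2 . On the other hand,
we have:

$$U(R_1)\,U(R_2) = e^{-i\mathbf{n}_1\cdot\mathbf{L}}\,e^{-i\mathbf{n}_2\cdot\mathbf{L}} = e^{-i[(\mathbf{n}_1+\mathbf{n}_2)\cdot\mathbf{L} - (i/2)[\mathbf{n}_1\cdot\mathbf{L},\,\mathbf{n}_2\cdot\mathbf{L}] + \dots]} = U\big(e^{-i[(\mathbf{n}_1+\mathbf{n}_2)\cdot\mathbf{T} - (i/2)[\mathbf{n}_1\cdot\mathbf{T},\,\mathbf{n}_2\cdot\mathbf{T}] + \dots]}\big) = U\big(e^{-i[(\mathbf{n}_1+\mathbf{n}_2+(1/2)\,\mathbf{n}_1\times\mathbf{n}_2)\cdot\mathbf{T} + \dots]}\big) \tag{A.52}$$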

where the last equalities follow from the composition rule (A.47). For com-
parison, we see that the operators L must obey the commutation relations
(A.46). The fact that the canonical commutators lead to these relations shows
that the operators U , equation (A.44), provide a representation of the rotation
group, but there could be (and there are, as we will see) other independent
representations.
The considerations we have just explained lead to a very general deﬁnition
of angular momentum.
For any physical system, we can (operationally) identify operators U (R)
which describe the action of a rotation on that system and for which the
composition rule (A.47) holds. For this system, we define the angular momentum
from the equation:

U (R) = 1 − in · J (A.53)

for infinitesimal transformations. From what was seen earlier, the components
of J automatically satisfy commutation relations analogous to (A.46):

$$[J_i, J_j] = i\,\epsilon_{ijk}\,J_k. \tag{A.54}$$

3 For example, it is easily shown that the matrix in equation (A.43) can be written as
$e^{-i\theta T_3}$ with $(T_3)_{12} = -(T_3)_{21} = i\,\epsilon_{132}$, and all other elements equal to zero.
4 This relation follows directly from expanding the exponentials in powers of n1 and n2.

Analogous arguments hold for the momentum components. For an arbi-
trary physical system, we can define the momentum by the relation:

U (a) = 1 − ia · P (A.55)

valid for inﬁnitesimal translations determined by the vector a.

A.4.2 Spin
The simplest example of the deﬁnition of angular momentum just given is
that of a particle with spin. In this case, the kets which represent a particle
localised in x are characterised by a further quantum number σ such that the
effect of a rotation, as well as turning x into Rx, is that of producing a linear
combination of the states corresponding to various values of σ′ (see footnote 5):

$$U(R)\,|\mathbf{x},\sigma\rangle = |R\mathbf{x},\sigma'\rangle\,S(R)_{\sigma'\sigma} \tag{A.56}$$

That U is unitary implies that S(R) should be a unitary matrix:

S(R)† S(R) = 1 (A.57)

and furthermore the relation (A.56) implies that the matrices S(R) must
themselves provide a representation of the rotation group (for inﬁnitesimal
rotations):

S(R1 )S(R2 ) = S(R1 R2 ). (A.58)

Therefore, S also must have the form:

S = 1 − in · S (A.59)

with S being three suitable matrices in the space σ which satisfy the angular
momentum commutation rules:

$$[S_i, S_j] = i\,\epsilon_{ijk}\,S_k. \tag{A.60}$$

The possible realisations of S correspond, as is well known, to integer or
half-integer values of angular momentum s. When s has been fixed, σ varies
between −s and +s in unit steps. For example, for s = 1/2, σ = −1/2, +1/2 and

$$S_i = \frac{1}{2}\,\sigma_i \tag{A.61}$$

where σi are the three Pauli matrices.
The wave function of a general state |A > is now a “spinor” with 2s + 1
components:

ψσ (x) =< σ, x|A > (A.62)

5 In what follows the summation over repeated indices is understood.

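and furthermore:

$$[U(R)\psi]_\sigma(\mathbf{x}) = \langle\sigma,\mathbf{x}|U(R)|A\rangle = S(R^{-1})^{*}_{\sigma'\sigma}\,\psi_{\sigma'}(R^{-1}\mathbf{x}) = S(R)_{\sigma\sigma'}\,\psi_{\sigma'}(R^{-1}\mathbf{x})$$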

from which, for inﬁnitesimal transformations, it is found that:

< σ, x|U (R)|A >=< σ, x|1 − in · (L + S)|A > . (A.63)

The operator associated with the generators of rotations, the total angular
momentum, is the sum of two mutually commuting terms: the orbital angular
momentum, L, and the spin angular momentum, S.

J = L + S. (A.64)

Before concluding this section, we note a characteristic of the representa-
tion of spin 1/2. Neglecting the spatial variables, the action of a rotation through
an angle θ around the z axis on a spinor with, for example, Sz = 1/2 is given
by:

$$U(R)\,|\sigma = 1/2\rangle = e^{-i\theta S_z}\,|\sigma = 1/2\rangle = e^{-i\theta/2}\,|\sigma = 1/2\rangle. \tag{A.65}$$

For θ = 2π, the ket is multiplied by −1. This is completely consistent with
the fact that the physical state should turn into itself after a complete rotation,
since the kets |A > and −|A > represent the same physical state. However,
this fact shows that the multiplication rule (A.58) cannot be satisfied for
finite rotations in the case of representations corresponding to spin 1/2 (more
generally, for half-integer spins). In effect, as this example shows, quantum
mechanics requires only that the representations of the rotation group should
obey the law of group multiplications at least to within a phase:

$$U(R_1)\,U(R_2) = \omega(R_1, R_2)\,U(R_1R_2). \tag{A.66}$$

The phase ω(R1 , R2 ) can, without loss of generality, be chosen to be equal
to +1 or −1. The operators U (R) give, in this case, a representation to within
a phase [113] of the group of rotations.
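The two statements about spin 1/2 made in this section, the commutation rules (A.60) for S_i = σ_i/2 and the sign change under a rotation by 2π in (A.65), can both be verified numerically. The sketch assumes the standard form of the Pauli matrices; helper names are illustrative.

```python
import cmath
import math

# Pauli matrices in the standard representation (an assumption of this sketch).
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]
S = [[[entry / 2 for entry in row] for row in m] for m in SIGMA]   # (A.61)

def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutation_defect(i, j):
    """Largest entry of [S_i, S_j] - i eps_ijk S_k; zero when (A.60) holds."""
    ab, ba = matmul(S[i], S[j]), matmul(S[j], S[i])
    return max(
        abs(ab[r][c] - ba[r][c]
            - sum(1j * epsilon(i, j, k) * S[k][r][c] for k in range(3)))
        for r in range(2) for c in range(2)
    )

def phase_on_spin_up(theta):
    """Factor multiplying |sigma = +1/2> under a rotation by theta about z, (A.65)."""
    return cmath.exp(-1j * theta * S[2][0][0])

worst = max(commutation_defect(i, j) for i in range(3) for j in range(3))
print(worst, phase_on_spin_up(2 * math.pi), phase_on_spin_up(4 * math.pi))
```

Up to rounding, the phase is −1 after a rotation by 2π and returns to +1 only after 4π.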
APPENDIX B

The Non-Relativistic
Hydrogen Atom

B.1 FACTORISATION OF THE LAPLACIAN

We consider a particle of spin zero, with the Hamiltonian:

$$H = H_0 + V = \frac{p^2}{2m} + V \tag{B.1}$$

and with the canonical commutation rules:

$$[x_i, p_j] = i\,\delta_{ij}. \tag{B.2}$$

If V has spherical symmetry, it is helpful to define the radial momentum,
pr , which allows factorisation of the free particle Hamiltonian in terms of
constants of the motion. Classically, pr is simply the momentum component
along the radial direction:

$$p_{r,cl} = \frac{1}{r}\,(\mathbf{x}\cdot\mathbf{p}). \tag{B.3}$$

However, if we substitute the quantum operators into (B.3) for the co-
ordinates and momenta, we do not obtain a hermitian operator because the
operators themselves do not commute. We obtain:

$$\left[\frac{1}{r}\,(\mathbf{x}\cdot\mathbf{p})\right]^\dagger = (\mathbf{p}\cdot\mathbf{x})\,\frac{1}{r} = \frac{1}{r}\,(\mathbf{x}\cdot\mathbf{p}) + \left[p_i,\; x_i\,\frac{1}{r}\right] = p_{r,cl} - i\,\frac{\partial}{\partial x_i}\left(\frac{x_i}{r}\right) = p_{r,cl} - \frac{2i}{r}. \tag{B.4}$$

Therefore, if we define:

$$p_r = \frac{1}{r}\,(\mathbf{x}\cdot\mathbf{p}) - \frac{i}{r} = -i\left(\frac{\partial}{\partial r} + \frac{1}{r}\right) \tag{B.5}$$

DOI: 10.1201/9781003436263-B

This chapter has been made available under a CC BY NC license.

we do obtain a hermitian operator, which coincides with $p_{r,cl}$ in the limit
ħ → 0. We note:

$$[r, p_r] = i. \tag{B.6}$$

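The factorisation we seek can be obtained starting from the expression for
the square of the orbital angular momentum (see footnote 1):

$$\mathbf{L}\cdot\mathbf{L} = (\epsilon_{ijk}\,x_i p_j)(\epsilon_{lsk}\,x_l p_s) = x_i p_j x_i p_j - x_i p_j x_j p_i = A - B. \tag{B.7}$$

Using the commutation rules (B.2) and the definition of pr , we find:

$$A = r^2p^2 - i(\mathbf{x}\cdot\mathbf{p}) = r^2p^2 - i\,r p_r + 1$$

$$B = (\mathbf{p}\cdot\mathbf{x})(\mathbf{x}\cdot\mathbf{p}) + i(\mathbf{x}\cdot\mathbf{p}) = (\mathbf{x}\cdot\mathbf{p})(\mathbf{x}\cdot\mathbf{p}) - 2i(\mathbf{x}\cdot\mathbf{p}) = r\Big(p_r + \frac{i}{r}\Big)\,r\Big(p_r + \frac{i}{r}\Big) - 2i\,r p_r + 2 = (r p_r + i)(r p_r + i) - 2i\,r p_r + 2 = r^2p_r^2 - i\,r p_r + 1 \tag{B.8}$$

from which:

$$\mathbf{L}\cdot\mathbf{L} = r^2\,(p^2 - p_r^2)$$

or:

$$p^2 = p_r^2 + \frac{\mathbf{L}\cdot\mathbf{L}}{r^2}. \tag{B.9}$$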
(Being invariant under rotation, r commutes with all components of L).
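In terms of the Laplacian operator, we recover the well known formula:

$$-\frac{\partial}{\partial x_i}\frac{\partial}{\partial x_i} = -\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)\left(\frac{\partial}{\partial r} + \frac{1}{r}\right) + \frac{\mathbf{L}\cdot\mathbf{L}}{r^2} = -\frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\,\frac{\partial}{\partial r} + \frac{\mathbf{L}\cdot\mathbf{L}}{r^2}. \tag{B.10}$$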
r ∂r ∂r r

B.2 SEPARATION OF VARIABLES

The Hamiltonian of the electron, in the centre of mass of the elec-
tron–proton system and ignoring the spin variables, is written:

$$H = \frac{p^2}{2m_r} - \frac{\alpha}{r} = \frac{1}{2m_r}\left(p_r^2 + \frac{\mathbf{L}\cdot\mathbf{L}}{r^2}\right) - \frac{\alpha}{r}. \tag{B.11}$$

$m_r = m_eM_p/(m_e + M_p)$ is the reduced mass of the electron and
$\alpha = e^2/\hbar c \simeq 1/137$ is the fine structure constant.
Given that r and pr , and therefore H, commute with L, we can choose a
basis in which H, L² = L · L and Lz are diagonal and are represented by their
eigenvalues, which we denote, respectively, as E, l(l + 1), m.
1 We recall that $\epsilon_{ijk}\,\epsilon_{lsk} = \delta_{il}\delta_{js} - \delta_{is}\delta_{jl}$.
In this coordinate system, the wave function of the electron factorises into
two terms which contain respectively the radial and angular dependence:

$$\psi(r, \theta, \phi) = R(r)\,Y_l^m(\theta, \phi) \tag{B.12}$$

with:

$$L^2\,Y_l^m(\theta, \phi) = l(l+1)\,Y_l^m(\theta, \phi); \qquad L_z\,Y_l^m(\theta, \phi) = m\,Y_l^m(\theta, \phi). \tag{B.13}$$

The radial wave function satisfies the equation:

$$\left[-\frac{1}{2m_r}\,\frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\,\frac{\partial}{\partial r} + \frac{l(l+1)}{2m_r r^2} - \frac{\alpha}{r}\right]R(r) = E\,R(r). \tag{B.14}$$

At this point, it is usual to define a new wave function, χ, according to
the relation:

$$R(r) = \frac{1}{r}\,\chi(r). \tag{B.15}$$

The equation for χ takes the form of a one-dimensional Schrödinger equa-
tion:

$$-\frac{1}{2m_r}\,\chi''(r) + \left[\frac{l(l+1)}{2m_r r^2} - \frac{\alpha}{r}\right]\chi(r) = E\,\chi(r). \tag{B.16}$$

Boundary Conditions. The condition that χ(r) does not extend into non-
physical negative values of the radial variable is imposed by assigning a con-
stant potential, V0 , to the region r < 0 and making V0 → +∞. In this situation
it is well known that:

χ(r) → 0 for r → 0+ (B.17)

and (B.17) provides the boundary condition at r = 0. The second condition

is that:

χ(r) → 0 for r → +∞ (B.18)

such that the wave function is normalisable, as is necessary for a bound state:

$$\int d^3x\,|\psi(\mathbf{x})|^2 = \int d\Omega\,|Y_l^m|^2 \int r^2\,dr\,|R(r)|^2 = \int_0^{+\infty} dr\,|\chi(r)|^2 < +\infty. \tag{B.19}$$

Transformation to Dimensionless Variables. We define the natural scales
of length and energy by means of the Bohr radius and the Rydberg:

$$R_B = \frac{\hbar c}{m_e\,\alpha} = 0.529 \times 10^{-8}\ \text{cm}; \qquad r = R_B\,\rho$$

$$Ry = \frac{1}{2}\,m_e\,\alpha^2 = 13.6\ \text{eV}; \qquad E = Ry\,\epsilon. \tag{B.20}$$

In terms of the new variables (now the primes denote derivatives with
respect to ρ), χ satisfies the equation:

$$-\chi''(\rho) + \left[\frac{l(l+1)}{\rho^2} - \frac{2}{\rho}\right]\chi(\rho) = \epsilon\,\chi(\rho). \tag{B.21}$$

B.3 EIGENVALUES OF THE HAMILTONIAN

For ρ → ∞ we can approximate the equation (B.21) with:

$$-\chi''(\rho) = \epsilon\,\chi(\rho)$$

which has the general solution:

$$\chi = A\,e^{-\sqrt{-\epsilon}\,\rho} + B\,e^{+\sqrt{-\epsilon}\,\rho}. \tag{B.22}$$

To take account of the normalisability condition, (B.19), we set:

$$\chi(\rho) = e^{-\sqrt{-\epsilon}\,\rho}\left(a_0\,\rho^{s} + a_1\,\rho^{s+1} + \dots + a_\nu\,\rho^{s+\nu} + \dots\right) \tag{B.23}$$

with s > 0 to satisfy the condition (B.17). Inserting this expression into (B.21),
the term in $a_0\rho^s$ generates a term proportional to $\rho^{s-2}$ which should not exist,
if we wish the series to start with a positive power. This term is:

$$a_{-2} = [-s(s-1) + l(l+1)]\,a_0 \tag{B.24}$$

and its absence requires that:

$$s(s-1) = l(l+1) \quad \text{or} \quad s = l + 1 \tag{B.25}$$

(we have discarded the root with s < 0). Now, the condition that the term in
$\rho^{s-1}$ should also be absent determines a1 starting from a0 , the cancellation of
the term in $\rho^{s}$ determines a2 in terms of a1 , etc. Proceeding in this way, one
arrives at the solution which satisfies (B.25).
For general values of ε, however, the solution found in this way tends
asymptotically to the form (B.22), which is not normalisable unless B = 0.
We now show that this can occur only if the series terminates for a finite value
of ν.

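We write the various terms of equation (B.21) in the form (B.23) and we
identify the coefficients of $\rho^{s+\nu-1}$. One finds:

$$\begin{aligned}
-\chi''(\rho) &\;\to\; -(s+\nu+1)(s+\nu)\,a_{\nu+1} + 2\sqrt{-\epsilon}\,(s+\nu)\,a_\nu + \epsilon\,a_{\nu-1} \\
\frac{l(l+1)}{\rho^2}\,\chi(\rho) &\;\to\; l(l+1)\,a_{\nu+1} \\
-\frac{2}{\rho}\,\chi(\rho) &\;\to\; -2\,a_\nu \\
-\epsilon\,\chi(\rho) &\;\to\; -\epsilon\,a_{\nu-1}
\end{aligned} \tag{B.26}$$

and in total:

$$0 = [l(l+1) - (s+\nu+1)(s+\nu)]\,a_{\nu+1} + 2\,[\sqrt{-\epsilon}\,(s+\nu) - 1]\,a_\nu. \tag{B.27}$$

For general values of ε, equation (B.27) allows $a_{\nu+1}$ to be obtained from
$a_\nu$. In the limit of large ν, we find:

$$a_{\nu+1} = \frac{2\sqrt{-\epsilon}}{\nu+1}\,a_\nu \tag{B.28}$$

or:

$$a_\nu = \frac{(2\sqrt{-\epsilon})^\nu}{\nu!} \tag{B.29}$$

which, summed, takes us to the positive exponential of $2\sqrt{-\epsilon}\,\rho$, or takes us
back to the singular solution:

$$\chi \sim e^{-\sqrt{-\epsilon}\,\rho}\,e^{+2\sqrt{-\epsilon}\,\rho} = e^{+\sqrt{-\epsilon}\,\rho}. \tag{B.30}$$

To remain with a function which is normalisable the series must terminate,
or there exists a value ν̄ for which $a_{\bar\nu+1} = 0$. This happens if:

$$\sqrt{-\epsilon}\,(s+\bar\nu) = 1 \tag{B.31}$$

or:

$$\epsilon = -\frac{1}{(l+\bar\nu+1)^2}. \tag{B.32}$$

The energy eigenvalues are therefore characterised by an integer number
(the principal quantum number):

$$E_n = -Ry\,\frac{1}{n^2}, \qquad n = 1, 2, \cdots \tag{B.33}$$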
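The termination of the series can be followed step by step by iterating the recursion (B.27) with ε = −1/n², so that √(−ε) = 1/n is rational and the arithmetic is exact. The sketch below is an illustration under the assumptions a_0 = 1 and s = l + 1; the function name `series_coefficients` is hypothetical.

```python
from fractions import Fraction

def series_coefficients(n, l, max_terms=50):
    """Coefficients a_0, a_1, ... of (B.23) from the recursion (B.27).

    Assumes epsilon = -1/n^2, s = l + 1 and the arbitrary choice a_0 = 1.
    """
    s = l + 1
    kappa = Fraction(1, n)                       # sqrt(-epsilon)
    coeffs = [Fraction(1)]
    for nu in range(max_terms):
        numerator = 2 * (kappa * (s + nu) - 1)
        denominator = (s + nu + 1) * (s + nu) - l * (l + 1)
        nxt = coeffs[-1] * numerator / denominator
        if nxt == 0:                             # the series terminates, (B.31)
            break
        coeffs.append(nxt)
    return coeffs

for n in range(1, 4):
    for l in range(n):
        print(n, l, series_coefficients(n, l))
```

For each allowed pair (n, l) the list has n − l entries, that is, the last non-vanishing coefficient has ν̄ = n − l − 1, in agreement with (B.32).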
For a given n, En takes the same value in states with angular momentum
l:

$$l = 0, \cdots, n - 1 \tag{B.34}$$

(since ν̄ ≥ 0) and a total number of states equal to:

$$N(E_n) = \sum_{l=0}^{n-1} (2l+1) = n^2. \tag{B.35}$$

In spectroscopic notation, the states with l = 0, 1, 2, 3, · · · are denoted by

the letters S, P , D, F , etc. The order of the states is therefore:

E1 = −Ry : 1S,
E2 = −Ry/4 : 2S, 2P,
E3 = −Ry/9 : 3S, 3P, 3D
···

To the orbital angular momentum multiplicity it is, naturally, necessary

to add contributions due to the spin of the electron and proton so that, in
total, N (En ) = 4n2 .
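The level scheme and the counting of states can be tabulated directly from (B.33)-(B.35). The following is a small sketch; the constant 13.6 eV is the rounded value quoted in (B.20), and the function names are illustrative assumptions.

```python
RYDBERG_EV = 13.6            # rounded value of Ry from (B.20)
SPECTROSCOPIC = "SPDF"       # letters for l = 0, 1, 2, 3

def energy_ev(n):
    """Energy of the level with principal quantum number n, equation (B.33)."""
    return -RYDBERG_EV / n ** 2

def orbital_states(n):
    """Number of orbital states at fixed n, equation (B.35)."""
    return sum(2 * l + 1 for l in range(n))

def level_labels(n):
    """Spectroscopic labels of the multiplets at fixed n (valid for n <= 4 here)."""
    return [f"{n}{SPECTROSCOPIC[l]}" for l in range(n)]

for n in (1, 2, 3):
    # The factor 4 accounts for the two spin states of the electron and of the proton.
    print(n, energy_ev(n), level_labels(n), orbital_states(n), 4 * orbital_states(n))
```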

B.4 EIGENFUNCTIONS
We start from the radial equation, Eq. (B.21), assuming the eigenvalue
ε = −1/n² found in Eq. (B.33). For r → +∞ the radial equation reads

$$\chi'' = \frac{1}{n^2}\,\chi \tag{B.36}$$

that is $\chi(\rho) = e^{\pm\rho/n}$. A normalizable wave function requires choosing the
minus sign and we set

$$\chi(\rho) = \rho^{l+1}\,P(\rho)\,e^{-\rho/n} \tag{B.37}$$

where we have used the behaviour of χ for r → 0+ indicated by Eq. (B.25),
with P (ρ) to be determined presently.
Setting temporarily

$$\bar P(\rho) = \rho^{l+1}\,P(\rho) \tag{B.38}$$

Eq. (B.21) becomes

$$\bar P'' - \frac{2}{n}\,\bar P' + \frac{1}{n^2}\,\bar P - \frac{l(l+1)}{\rho^2}\,\bar P + \frac{2}{\rho}\,\bar P = \frac{1}{n^2}\,\bar P \tag{B.39}$$

Simplifying the terms in 1/n² and using (B.38), one finds

$$\rho\,P'' + P'\left(2l + 2 - \frac{2\rho}{n}\right) + 2P\left(-\frac{l+1}{n} + 1\right) = 0 \tag{B.40}$$

To connect to the notation of Landau-Lifshitz [86], we set

$$P(\rho) = w\!\left(\frac{2\rho}{n}\right) \tag{B.41}$$

so that

$$P' = \frac{2}{n}\,w', \qquad P'' = \left(\frac{2}{n}\right)^2 w'', \quad \text{etc.} \tag{B.42}$$

and find the equation, with ξ = 2ρ/n:

$$\xi\,w''(\xi) + w'(\xi)\,(2l + 2 - \xi) + w(\xi)\,(n - l - 1) = 0 \tag{B.43}$$

As discussed in Ref. [86] (p. 119 and Mathematical Appendix d) this equation
admits solutions (see footnote 2) which are exponentially bound at infinity for
n and l non-negative integers and (see footnote 3)

$$n \geq l + 1 \tag{B.44}$$

In these cases, the solutions are polynomials related to the Laguerre polynomi-
als. Not surprisingly, we have re-obtained the condition previously found by
a direct calculation based on the power expansion of the reduced radial wave
function, Eqs. (B.31) and (B.32).
The lowest radial wave functions. In terms of the polynomials P (ρ)
defined in (B.37), one has

$$R_{nl} = \rho^{l}\,P(\rho)\,e^{-\rho/n} \tag{B.45}$$

We derived earlier the degree of the polynomial part of χ to be ν̄ + l + 1 = n,
Eq. (B.32), so that the degree of the polynomial part of $R_{nl}$ is n − 1. From
this, using Eq. (B.40) we may construct any $R_{nl}$. We list below the normalised
radial wave functions up to n = 3 (see also [86], p. 120).

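$$R_{10} = 2\,e^{-\rho};$$

$$R_{20} = \frac{1}{\sqrt{2}}\left(1 - \frac{\rho}{2}\right)e^{-\rho/2};$$

$$R_{21} = \frac{1}{2\sqrt{6}}\,\rho\,e^{-\rho/2};$$

$$R_{30} = \frac{2}{3\sqrt{3}}\left(1 - \frac{2\rho}{3} + \frac{2\rho^2}{27}\right)e^{-\rho/3};$$

$$R_{31} = \frac{8}{27\sqrt{6}}\,\rho\left(1 - \frac{\rho}{6}\right)e^{-\rho/3};$$

$$R_{32} = \frac{4}{81\sqrt{30}}\,\rho^2\,e^{-\rho/3}.$$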
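The normalisation of the listed functions, ∫ ρ² R_nl² dρ = 1, can be checked exactly, since every term reduces to the elementary integral of ρ^k e^{−2ρ/n} over (0, ∞), equal to k!(n/2)^{k+1}. The sketch below encodes each function by the square of its prefactor and the coefficients of its polynomial part; this encoding and the names used are assumptions made for the illustration.

```python
from fractions import Fraction
from math import factorial

def norm_integral(n, prefactor_sq, poly):
    """Exact integral of rho^2 R^2 over (0, inf) for R = N * poly(rho) * exp(-rho/n).

    prefactor_sq is N^2 and poly[k] is the coefficient of rho^k.
    """
    scale = Fraction(n, 2)
    total = Fraction(0)
    for j, pj in enumerate(poly):
        for k, pk in enumerate(poly):
            power = j + k + 2
            # integral of rho^power * exp(-2 rho / n) = power! * (n/2)^(power + 1)
            total += pj * pk * factorial(power) * scale ** (power + 1)
    return prefactor_sq * total

F = Fraction
RADIAL = {   # (n, l): (N^2, polynomial coefficients), transcribed from the list above
    (1, 0): (F(4), [F(1)]),
    (2, 0): (F(1, 2), [F(1), F(-1, 2)]),
    (2, 1): (F(1, 24), [F(0), F(1)]),
    (3, 0): (F(4, 27), [F(1), F(-2, 3), F(2, 27)]),
    (3, 1): (F(64, 4374), [F(0), F(1), F(-1, 6)]),
    (3, 2): (F(16, 196830), [F(0), F(0), F(1)]),
}

norms = {key: norm_integral(key[0], n2, poly) for key, (n2, poly) in RADIAL.items()}
print(norms)
```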

2 A class of functions known as confluent hypergeometric functions.

3 The inequality is specific to the Coulomb potential. For example, it is not satisfied by
the Cornell potential, Sect. 18.2.1, where we may have levels with l ≥ n.
Bibliography

[1] F. J. Belinfante, Physica 6, 887 (1939);
L. Rosenfeld, Memoires de l’Acad. Roy. Belgique 6, 30 (1940).
[2] N. N. Bogoliubov and D. V. Shirkov, Introduction to the Theory of Quan-
tized Fields, Interscience Publishers Inc., New York, 1959.
[3] F. Mandl and G. Shaw, Quantum Field Theory, John Wiley & Sons Ltd.,
Chichester, 1984.
[4] H. A. Lorentz, The Theory of Electrons and Its Applications to the Phe-
nomena of Light and Radiant Heat, Dover Publications Inc., New York,
1952.
[5] P.A.M. Dirac, Proc. Roy. Soc. 112, 661 (1926).
[6] E. Schrödinger, Ann. Physik 81, 109 (1926).
[7] W. Pauli and V. Weisskopf, Helv. Phys. Acta 7, 709 (1934).
[8] C. D. Anderson, Phys. Rev. 44 (1933), 406.
[9] P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford University
Press, Fourth Edition 1958, p. 254.
[10] L. Schiﬀ, Quantum Mechanics, McGraw-Hill, New York, 1968.
[11] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Course
of Theoretical Physics, Vol. 2). Butterworth–Heinemann (Oxford) 1980.
E. M. Lifshitz, V. B. Berestetski, and L. P. Pitaevskii, Quantum
Electrodynamics (Course of Theoretical Physics, Vol. 4). Butterworth–
Heinemann (Oxford) 1980.
[12] M. Tanabashi et al. (Particle Data Group Collaboration), Phys. Rev. D
98, 030001 (2018).
[13] L. Maiani, Electroweak Interactions, to be published.
[14] O. Benhar, N. Cabibbo, and L. Maiani, Gauge Theories, to be published.
[15] The experiments are due to G. Gabrielse and collaborators, see, e.g., G.
Gabrielse, Extremely Cold Antiprotons, Scientiﬁc American, December
1992 p. 78-89.


[16] K. Nakamura et al. [Particle Data Group Collaboration], J. Phys. G 37,
075021 (2010).
[17] E. Amaldi, Physics Reports 111, 1 (1984).
[18] E. Fermi, La Ricerca Scientiﬁca 4, 491 (1933).
[19] G. Luders, Kongelige Danske Videnskabernes Selskab Matematisk-Fysiske
Meddelelser, 28 1 (1954).
[20] W. Pauli in N. Bohr and the Development of Physics, Pergamon Press,
Oxford, 1955.
[21] J. D. Bjorken and S. D. Drell, Relativistic Quantum Fields, McGraw-Hill,
New York, 1965.
[22] E. Majorana, Il Nuovo Cimento 14, 171 (1937).
[23] T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).
[24] C.S. Wu, E. Ambler, R.W. Hayward, D.D. Hoppes, and R.P. Hudson,
Phys. Rev. 105, 1413 (1957). Parity violation was subsequently observed
in muon decays by R.L. Garwin, L.M. Lederman, and M. Weinrich, Phys.
Rev. 105, 1415 (1957). For a short history of the discovery of parity viola-
tion, see K. Myneni, [Link]
[25] R. P. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958); see also
S. S. Gershtein, and J.B. Zeldovich, Sov. Phys. JETP 2, 576 (1957).
[26] J.C. Hardy and I.S. Towner, Phys. Rev. C 71, 055501 (2005); e-print arXiv
nucl-th/0412056.
[27] N. Cabibbo, Phys. Rev. Lett. 10, 531 (1963).
[28] S. L. Glashow, J. Iliopoulos, and L. Maiani, Phys. Rev. D 2, 1285 (1970).
[29] M. Kobayashi and T. Maskawa, Progr. Theor. Phys. 49, 652 (1973).
[30] J. Schwinger, Ann. Phys. 2, 407 (1957).
[31] S. L. Glashow, Nucl. Phys. 22, 579 (1961).
[32] F. Reines and C. L. Cowan, Nature 178, 446 (1956).
[33] G. Danby, J. M. Gaillard, K. A. Goulianos, L. M. Lederman, N. B. Mistry,
M. Schwartz, and J. Steinberger, Phys. Rev. Lett. 9, 36 (1962).
[34] K. Kodama et al. [DONuT Collaboration], Phys. Rev. D 78, 052002
(2008); e-print arXiv:0711.0728 [hep-ex].
[35] B. Pontecorvo, Sov. Phys. JETP 26, 984 (1968) [Zh. Eksp. Teor. Fiz. 53,
1717 (1967)].

[36] Z. Maki, M. Nakagawa, and S. Sakata, Prog. Theor. Phys. 28, 870 (1962).
[37] [Link] chain
reaction
[38] J. N. Bahcall, A. M. Serenelli, and S. Basu, ApJ 621, L85 (2005); e-print
arXiv astro-ph/0412440.
[39] V. N. Gribov and B. Pontecorvo, Phys. Lett. B 28, 493 (1969).
[40] L. Wolfenstein, Phys. Rev. D 17, 2369 (1978).
[41] S. P. Mikheyev and A. Y. Smirnov, Prog. Part. Nucl. Phys. 23, 41 (1989).
[42] S. Abe et al. [KamLAND Collaboration], Phys. Rev. Lett. 100, 221803
(2008); e-print arXiv:0801.4589 [hep-ex].
[43] O. Q. R. Ahmad et al. [SNO Collaboration], Phys. Rev. Lett. 89, 011301
(2002); e-print arXiv nucl-ex/0204008.
[44] G. L. Fogli, E. Lisi, A. Marrone, A. Palazzo, and A. M. Rotunno, Phys.
Rev. D 84, 053007 (2011); e-print arXiv:1106.6028 [hep-ph].
[45] T. Schwetz, M. Tortola, and J. W. F. Valle, New J. Phys. 13, 109401
(2011); e-print arXiv:1108.1376 [hep-ph].
[46] W. C. Louis [LSND Collaboration], Prog. Part. Nucl. Phys. 40, 151
(1998).
[47] W. C. Louis [LSND Collaboration], Prog. Part. Nucl. Phys. 40 (1998)
151.
[48] M. Goeppert-Mayer, Phys. Rev. 48, 512 (1935).
[49] W.H. Furry, Phys. Rev. 56, 1184 (1939).
[50] R. Saakyan, Ann. Rev. Nucl. Part. Sci. 63, 503 (2013).
[51] N. Aghanim et al. (Planck Collaboration), Astronomy & Astrophysics
641, A6 (2020).
[52] M. Aker et al. (The KATRIN Collaboration), Nature Physics 18, 160
(2022).
[53] M. Agostini et al. (GERDA Collaboration), Phys. Rev. Lett. 125, 252502
(2020).
[54] I.J. Arnquist et al. (Majorana Collaboration), Phys. Rev. Lett. 130,
062501 (2023).
[55] S. Abe et al. (KamLAND-Zen Collaboration), Phys. Rev. Lett. 130,
051801 (2023).

[56] G. Anton et al. (EXO-200 Collaboration), Phys. Rev. Lett. 123, 161802
(2019).

[57] D.Q. Adams et al. (The CUORE Collaboration), Nature 604, 53 (2022).

[58] C. Augier et al., Eur. Phys. J. C 82, 1033 (2022).

[59] O. Azzolini et al. Phys. Rev. Lett. 129, 111801 (2022).

[60] S. Ajimura et al. (CANDLES Collaboration), Phys. Rev. D 103, 092008
(2021).

[61] J. Engel and J. Menéndez, Rep. Prog. Phys. 80, 046301 (2017).
[62] O. Benhar, R. Biondi, and E. Speranza, Phys. Rev. C 90, 065504 (2014).

[63] I. Esteban et al., JHEP 01, 106 (2019).

[64] E. Fermi, C. N. Yang, Phys. Rev. 76 (1949) 1739.

[65] S. Sakata, Progress of Theoretical Physics. 16 (1956) 686.

[66] M. Gell-Mann, Phys. Lett. 8 (1964) 214-215.

[67] G. Zweig, An SU(3) Model For Strong Interaction Symmetry And Its
Breaking. 2, CERN-TH-412.

[68] M. Gell-Mann, California Institute of Technology Synchrotron Labora-
tory Report No. CTSL-20, 1961 (unpublished), Phys. Rev. 125 (1962)
1067; Y. Ne’eman, Nuclear Phys. 26 (1961) 222.
[69] M.Y. Han and Y. Nambu, Phys. Rev. 139B (1965) 1006.

[70] W. A. Bardeen, H. Fritzsch and M. Gell-Mann, [arXiv:hep-ph/0211388
[hep-ph]].

[71] C. N. Yang and R. L. Mills, Phys. Rev. 96 (1954), 191-195.

[72] D. J. Gross and F. Wilczek, Phys. Rev. D 8 (1973) 3633; Phys. Rev. D
9 (1974) 980.

[73] H. D. Politzer, Phys. Rev. Lett. 30 (1973) 1346.

[74] G. Altarelli, G. Parisi, Nucl. Phys. B126 (1977) 298; Y. L. Dokshitzer,
Sov. Phys. J.E.T.P. 46 (1977) 691.

[75] W. E. Caswell, Phys. Rev. Lett. 33 (1974), 244

[76] S. Bethke, G. Dissertori and G.P. Salam, Quantum Chromodynamics in
C. Patrignani et al. [Particle Data Group], Chin. Phys. C 40 (2016)
100001.

[77] T. Appelquist and H. D. Politzer, Phys. Rev. Lett. 34 (1975), 43
doi:10.1103/PhysRevLett.34.43

[78] A. De Rujula and S. L. Glashow, Phys. Rev. Lett. 34 (1975), 46-49
doi:10.1103/PhysRevLett.34.46

[79] C. A. Dominguez and M. Greco, Lett. Nuovo Cim. 12 (1975), 439
doi:10.1007/BF02815956

[80] E. Eichten, K. Gottfried, T. Kinoshita, J. B. Kogut, K. D. Lane and
T. M. Yan, Phys. Rev. Lett. 34 (1975), 369-372 [erratum: Phys. Rev.
Lett. 36 (1976), 1276] doi:10.1103/PhysRevLett.34.369

[81] E. Eichten, K. Gottfried, T. Kinoshita, K. D. Lane and T. M. Yan,
Phys. Rev. D 17 (1978), 3090 [erratum: Phys. Rev. D 21 (1980), 313]
doi:10.1103/PhysRevD.17.3090

[82] E. Eichten, K. Gottfried, T. Kinoshita, K. D. Lane and T. M. Yan, Phys.
Rev. D 21 (1980), 203 doi:10.1103/PhysRevD.21.203

[83] N. Brambilla et al. [Quarkonium Working Group], [arXiv:hep-ph/0412158
[hep-ph]].

[84] N. R. Soni, B. R. Joshi, R. P. Shah, H. R. Chauhan and J. N. Pandya,
Eur. Phys. J. C 78 (2018) 592, arXiv:1707.07144 [hep-ph].

[85] M. B. Voloshin, Prog. Part. Nucl. Phys. 61 (2008), 455, [arXiv:0711.4556
[hep-ph]].

[86] L. D. Landau and E. M. Lifshitz, Quantum Mechanics (Nonrelativistic
Theory), 3rd edition. (Pergamon Press, Oxford, 1977), p. 96.
[87] P. Falkensteiner, H. Grosse, Franz F. Schoeberl, P. Hertel, Computer
Physics Communication 34 (1985) 287; W. Lucha and Franz F. Schoe-
berl, 1999. Solving The Schroedinger Equation For Bound States With
Mathematica 3.0, International Journal of Modern Physics C (IJMPC),
World Scientiﬁc Publishing Co. Pte. Ltd., vol. 10(04), pages 607-
619. The corresponding program [Link] can be obtained from
[Link]@[Link].

[88] H. Schopper, Subnucl. Ser. 15, 203-355 (1979) DESY-77-79.

[89] W. Braunschweig et al. [DASP], Phys. Lett. B 57, 407-412 (1975)

doi:10.1016/0370-2693(75)90482-7

[90] J. S. Whitaker, W. M. Tanenbaum, G. S. Abrams, M. S. Alam,

A. Boyarski, M. Breidenbach, W. Chinowsky, R. DeVoe, G. J. Feld-
man and C. E. Friedberg, et al. Phys. Rev. Lett. 37, 1596 (1976)
doi:10.1103/PhysRevLett.37.1596

[91] C. J. Biddick, T. H. Burnett, G. E. Masek, E. S. Miller,
J. G. Smith, J. P. Stronski, M. K. Sullivan, W. Vernon, D. H. Badtke
and B. A. Barnett, et al. Phys. Rev. Lett. 38, 1324 (1977)
doi:10.1103/PhysRevLett.38.1324

[92] W. Braunschweig et al. [DASP], Phys. Lett. B 67, 243-248 (1977)
doi:10.1016/0370-2693(77)90114-9

[93] N. Brambilla, doi:10.1007/978-981-15-8818-1_26-1
[arXiv:2204.11295 [hep-ph]].

[94] [LHCb], [arXiv:2204.12597 [hep-ex]].

[95] E. Braaten and M. Kusunoki, Phys. Rev. D 69 (2004), 074005
doi:10.1103/PhysRevD.69.074005 [arXiv:hep-ph/0311147 [hep-ph]].

[96] N. A. Tornqvist, Z. Phys. C 61 (1994), 525-537 doi:10.1007/BF01413192
[arXiv:hep-ph/9310247 [hep-ph]].

[97] L. Maiani, F. Piccinini, A. D. Polosa and V. Riquer, Phys. Rev. D 71
(2005) 014028.

[98] R. Aaij et al. [LHCb], Phys. Rev. Lett. 122, no.22, 222001 (2019)
doi:10.1103/PhysRevLett.122.222001 [arXiv:1904.03947 [hep-ex]].

[99] R. Aaij et al. [LHCb], Sci. Bull. 65 (2020) no.23, 1983-1993
doi:10.1016/[Link].2020.08.032 [arXiv:2006.16957 [hep-ex]].

[100] A. Ali, L. Maiani and A. D. Polosa, Cambridge University Press,
2019, ISBN 978-1-316-76146-5, 978-1-107-17158-9, 978-1-316-77419-9
doi:10.1017/9781316761465

[101] Lectures on Quantum Mechanics, Cambridge University Press (2015)

[102] P. Bicudo, [arXiv:2212.07793 [hep-lat]].

[103] L. Pauling, Chem. Rev., 5, 173-213 (1928), DOI: 10.1021/cr60018a003,
see also L. Pauling and E. B. Wilson Jr., Introduction to Quantum
Mechanics with Applications to Chemistry. Dover Books on Physics (1985).

[104] E. Braaten, C. Langmack and D. H. Smith, Phys. Rev. D 90 (2014)
014044.

[105] N. Brambilla, G. Krein, J. Tarrús Castellà and A. Vairo, Phys. Rev. D 97
(2018) no.1, 016016 doi:10.1103/PhysRevD.97.016016 [arXiv:1707.09647
[hep-ph]].

[106] P. Bicudo, M. Cardoso, A. Peters, M. Pflaumer and M. Wagner, Phys.
Rev. D 96 (2017) 054510.

[107] L. Maiani, A. D. Polosa and V. Riquer, Phys. Rev. D 100 (2019) no.7,
074002 doi:10.1103/PhysRevD.100.074002 [arXiv:1908.03244 [hep-ph]].

[108] M. J. Savage and M. B. Wise, Phys. Lett. B 248 (1990), 177-180
doi:10.1016/0370-2693(90)90035-5

[109] N. Brambilla, A. Vairo and T. Rosch, Phys. Rev. D 72 (2005), 034021
doi:10.1103/PhysRevD.72.034021 [arXiv:hep-ph/0506065 [hep-ph]].

[110] S. Fleming and T. Mehen, Phys. Rev. D 73 (2006), 034502
doi:10.1103/PhysRevD.73.034502 [arXiv:hep-ph/0509313 [hep-ph]].

[111] P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford
University Press, 1930.

[112] J. S. Bell, Speakable and Unspeakable in Quantum Mechanics. Cambridge
University Press, 2004.

[113] For a full review, see H. Weyl, The Theory of Groups and Quantum
Mechanics. Dover Publications Inc., New York, 1950.

Index

J/Ψ discovery, 296
β decay, 186, 250
  CP invariance, 187
  inverse reaction, 257
  strange particle, 247
γγ annihilation, 224, 225
π meson, 144, 188, 239
τ lepton, 246, 259
  lifetime, 246

action, 13, 18, 37
  invariance under Lorentz transformation, 28
  invariance under Lorentz transformations, 13, 20, 28
  principle of least a., 13, 18, 19, 53, 67, 141
adjoint spinor, 85, 196
Adone storage ring, 223
angular momentum, 27, 82, 99, 100, 320, 324
  under CPT, 187
angular momentum tensor
  based on θμν, 35
  canonical, 34
anticommutation rules, 99, 115, 117, 118, 122, 128, 164, 196
  at space-like separation, 123
  canonical, 117, 195
  equal time, 117, 126
  Fermi oscillator, 114
  gamma matrices, 80, 82
  Pauli matrices, 81
antilinear operator, 174
antiparticle, 12, 83, 110, 119, 188
  neutrino, 142
  neutron, 119
  proton, 119
antiunitary operator, 174
ATLAS, 145

baryon, 124, 144
baryon number, 119, 285
Bevatron, 119
boost, 89, 94, 120
Born-Oppenheimer approximation, 300
  basic assumption, 301
  hydrogen molecule, 301
Born-Oppenheimer equation, 302
bra, ket notation, 311
  scalar product, 312, 313, 315, 316
branching ratio, 166

CERN, 145
charge
  under CPT, 187
charge conjugation-see discrete symmetries, 168
charged current, 254, 259, 265
charmonium, 291
  Cornell potential, 292
  spectrum, 296
Clebsch–Gordan coefficient, 100, 181, 182
CMS, 145
colour, 126
  Casimir operators, 303
  multiplets, 302
  symmetry generators, 302
commutation rules, 72, 82, 323
  angular momentum, 74, 320
  canonical, 48, 68
  equal time, 24, 47
  of x and p, 316, 323
  orbital angular momentum, 319
  spin, 74, 321
Compton
  cross section, 221
  effect, 73, 221
  inverse scattering, 221
  scattering, 214, 257
  wavelength, 131
conjugate momentum, 20, 47, 58, 63, 66, 117
conservation
  angular momentum, 34, 155, 243
  baryon number, 120, 196
  electric charge, 120
  energy, 21, 58, 149, 202, 236
  energy and momentum, 14, 33, 59, 73, 161, 222
  lepton number, 245
  momentum, 155, 220
  Noether charge, 30
  of probability: see unitarity, 148
conservation laws, 136, 160
conserved current, 30, 47, 116, 196, 200
continuity equation, 57, 78, 86
Cornell potential, 302
  Coulomb part, 302
correspondence principle, 138
Coulomb
  divergence, 205
  gauge, 68
  interaction, 134
coupling constant, 140, 144, 204, 247, 248
  fine structure, 99, 150, 212, 324
covariant derivative, 138
CPT theorem, 6, 183, 184, 187, 268
  experimental status, 188
creation and destruction operators, 119
cross section, 161, 162
  differential, 165, 202, 203, 211
  Klein–Nishina, 221
  Mott, 204, 206
  relativistic invariance, 165
  Rutherford, 204
  Thomson, 221
crossed channel, 225
current
  × current, 245
  electromagnetic, 133, 141, 142, 196, 208, 228, 245, 247
  in neutron decay, 233
  lepton, 233
  weak, 244, 247

decay lifetime, 165
degrees of freedom, 18, 22, 54, 56, 60, 63, 74, 144, 195, 197
density matrix, 234
Dirac
  algebra, 86
  bilinear covariants, 86, 143, 169, 185, 260
    transformation under C, P, T, 172, 175, 183, 184
  equation, 79, 98, 110
    free particle, 91
    general solution, 94
    in electromagnetic field, 95
    negative energy solutions, 91, 92, 95, 110
  hole, 110, 193
  mass, 198, 199
  matrices, 80–82, 98, 169, 175, 260
    Majorana representation, 195
    Pauli representation, 175, 192
    trace relations, 106
Dirac Lagrangian, 115, 137, 198
  global symmetry, 116
  in electromagnetic field, 139
  under P, C, T transformations, 171
discrete symmetries
  C, 168, 170, 172, 177, 182, 195, 226
  CP, 187, 268
    violation, 255, 267, 269, 270
  P, 5, 168, 169
    of E and B, 26
    particle-antiparticle, 177
    violation, 143, 239, 245
  T, 154, 168, 173, 175, 179
double-beta decay, 272
  neutrinoless, 272, 274, 277, 281
    experimental limits, 282
    half-life, 280
  with two neutrinos, 273, 277
doubly charmed baryon, 304
  numerical results, 308
    error estimate, 308
Dyson formula, 159, 248

electromagnetic interaction, 137
electron
  Hamiltonian in Coulomb field, 99
  scattering in Coulomb field, 202, 203
electron positron annihilation
  in γγ, 226
  in μ+ μ−, 227, 231
electrostatic potential, 62
energy-momentum tensor, 60
  canonical, 31, 116
  symmetric, 34, 35
Euler–Lagrange equations, 19, 20, 115
Euler–Lagrange equations, 39

Fermi constant, 142, 244, 247, 248
  experimental value, 239
  muon, 240, 242
  tau lepton, 246
Fermi interaction, 142, 186
Fermilab, 271
Feynman
  boundary condition, 46, 55, 130
  gauge, 133, 135
  matrix element, 164, 234, 241
  propagator, 46, 130, 131, 249
field
  CPT transformation, 183
  Klein–Gordon, 42
  scalar, 24, 39
  spinor, 33
  tensor, 24
Fierz transformation, 260
form factor
  Dirac, 207
  electric and magnetic, 208
  electromagnetic, 206
  nucleus, 205
  Pauli, 207
  proton measurement, 214
  Sachs, 208, 214
functional derivative, 22
Furry’s theorem, 180, 182

GALLEX experiment, 257, 264
gauge
  invariance, 53, 138, 215
  transformation, 53, 138
GIM mechanism, 286
gluon, 145
Gordon decomposition, 207, 212
Gran Sasso laboratory, 257, 259
Green’s function, 42, 133, 180
  electromagnetic field, 54
  Feynman, 55
  retarded and advanced, 45
  vector potential, 55
group
  little, 12, 121
  Lorentz-see L. transformations, 3
  non-compact, 87
  O(2), 12
  O(3), 12, 319, 322
  Poincaré, 28, 136
  representation, 87
gyromagnetic ratio
  τ lepton, 246
  electron, 97, 98, 106, 139, 140
  muon, 140
  nucleon, 140

hadron, 142, 144
  exotic, 297
Hamilton’s equations, 21, 40, 48, 65
Hamiltonian, 20, 39, 47, 64, 68, 72, 78, 82, 114, 117, 148, 155, 157, 324
  ν − e, 260
  Dirac, 99
  eigenvalues, 103
  interaction, 159
  neutrino, 253
Hamiltonian density, 20
Heisenberg equal-time relations, 24
Heisenberg equations, 117
helicity, 83, 188, 244
  ν, ν̄, 143
Higgs boson, 145
  discovery, 145
Hilbert space, 312
Homestake mine, 257, 265
hydrogen atom
  energy levels, 104, 327
  non-relativistic, 323
  relativistic, 98
  stability, 120

IMB experiment, 258
inertial frame, 1
infinitesimal generators
  spatial rotations, 319
  spatial translations, 318
interaction
  Lagrangian, 136
invariance
  Lorentz group, 136
  of Lagrangian, 136
  Poincaré group, 136
  spatial rotation, 34
  time translation, 149
isotopic spin, 285

Jacobi identity, 23

K2K experiment, 259, 267
Kamiokande experiment, 258, 259
KamLAND experiment, 259, 263
Klein–Gordon equation, 39, 46, 78, 129, 131, 133
  solution, 40

Lagrange equations of motion, 13
Lagrangian, 19, 60, 63, 67, 185
  current × current, 248
  Fermi, 142, 233, 244
  free classical particle, 14
  interaction, 138, 142, 249
  intermediate vector boson, 247, 249
  Klein–Gordon, 42, 46
  Maxwell, 53, 57, 60, 139
  muon decay, 240
  neutron decay, 143
  relativistic invariance, 20, 29
Lamb shift, 106
Landau gauge, 135
Larmor frequency, 66, 98
lepton, 140, 145, 245
  flavour, 240, 252
  number, 200, 252
Levi–Civita tensor, 25, 26, 53, 106
little group, 12, 121
local observable, 6, 122, 155, 156, 185
Lorentz force, 58
Lorentz transformations, 3, 11, 83, 88, 94, 120, 196
  boost, 89
  orthochronous, 5
  proper, 5
  S-matrix, 159
Lorenz gauge, 54, 55, 133
LSND experiment, 271

magnetic moment, 141, 170, 188, 208
  τ lepton, 246
  electron, 67, 95, 139, 189
  muon, 140, 189
  nucleon, 140
Majorana mass, 196, 197, 199
Majorana neutrino mass, 270
  experimental limits, 283
Mandelstam variables, 215, 224, 226
mass-energy relation, 15
Maxwell Lagrangian, 137
Maxwell tensor, 25, 52, 139
  dual, 26
  under CPT, 183
  under parity, 169
Maxwell’s equations, 52, 54
Maxwell–Lorentz equations, 57
meson
  π meson, 144, 284, 285
microcausality, 6, 122
Mikheyev–Smirnov–Wolfenstein effect
  see MSW effect, 262
MiniBooNE experiment, 271
minimal substitution, 64, 95, 137, 138, 140, 160
MINOS experiment, 259, 267
momentum
  under CPT, 187
momentum transfer, 206
MSW effect, 143, 262
muon
  decay, 240
  Fermi constant, 240
  helicity in decay, 244
  neutrino, 245
  spin asymmetry, 242

neutral current, 145, 254, 259, 265
neutrino
  e, μ, τ, 252
  cosmic ray, 258
  CP asymmetry, 267
  flavour, 252
  helicity, 143, 194
  limit to mass, 240
  long baseline beams, 259
  Majorana, 195, 261, 270
  mass hierarchy, 270
  mass matrix, 199, 262
  mixing, 252, 253, 255, 262, 263, 267
  oscillations, 250, 252, 253, 264
  solar, 255
  solar deficit, 257, 258, 264
  sterile, 254, 271
  Weyl, 188, 191, 261
neutrinoless double β decay, 270
neutron
  β decay, 142, 186
  lifetime, 237
  parity violation, 239
  spin asymmetry, 238
Noether current, 138
  proton, 141
Noether’s theorem, 29, 47
normal product, 50, 122

occupation number, 49
OPERA experiment, 259
orthocharmonium, 291
orthopositronium, 181
  lifetime, 226

paracharmonium, 291
parapositronium, 181
  lifetime, 226
parity-see discrete symmetries, 168
Pauli
  matrices, 81, 176, 262, 321
  spinor, 189
  term, 139, 141, 160, 170
perturbation theory, 152, 157, 209, 233, 248
photon
  spin, 13, 73, 74
pion decays, 285
Poisson bracket, 23, 48
polarisation
  asymmetry, 245
  circular, 74
  neutron, 234
  photon, 73, 215
positron, 111, 188
positronium, 180
  charge conjugation, 182
  lifetime, 183
  parity, 182
Poynting vector, 72
propagator, 128
  Dirac field, 131
  fermion, 128
  intermediate boson, 248, 249
  photon, 133, 210, 228
  scalar field, 129
proper time, 10

QCD, 287
  asymptotic freedom, 289
  infrared confinement, 288
  Lagrangian, 288
  renormalization group equation, 290
  running coupling constant, 290
  scale ΛQCD, 290
  string, 289
  strong coupling, 289, 291
  weak coupling, 288, 291
QED, 59, 106, 137, 145, 150, 168, 170
  C invariance, 172
  non minimal, 160, 170
  P invariance, 172
  spinor, 139, 140, 159
quantum
  expectation values, 147, 316
  states-under time inversion, 173
quantum electrodynamics-see QED, 137
quark, 124, 145, 247, 285
  beauty, 287
  charm, 247, 286
  colour, 287
  confinement, 288
  flavour, 286
  heavy quark, 286
  mass, 286
  quantum numbers, 286

relativistic invariance, 83
renormalisation, 160
representation
  Dirac, 147, 150
  Heisenberg, 147, 149
  interaction, 147, 150, 157
  Lorentz group, 27
    unitary, 87, 88, 120, 121
  O(3), 320, 322
  Schrödinger, 147
Rosenbluth formula, 208, 213

S-matrix, 158, 180, 201, 227, 240, 248
  relativistic invariance, 159
SAGE experiment, 257, 265
scalar potential, 53
Schrödinger equation, 78, 97, 148, 262, 325
second quantisation, 112, 113
see-saw mechanism, 200
SNO experiment, 257
solar fusion reactions, 16
space-time translation, 28
spherical harmonics, 100, 325
spin
  asymmetry, 238, 242
spin-statistics theorem, 124
spinor, 94, 100, 234, 321
  equation of motion, 139
  massless, 192
  orthonormality, 93
  parity, 101
  Pauli, 92
  radial, 101
  representation, 87
  under parity, 169
statistics
  Bose–Einstein, 49, 124
  Fermi–Dirac, 113, 124
strangeness, 285
strong interaction, 137, 143
SuperKamiokande experiment, 257, 258, 267
superposition principle, 311
symmetry
  antiunitary operator, 174
  continuous, 28
  global, 36, 116, 138, 245
  Hamiltonian, 153
  local see gauge, 53
  positronium, 180

tensor, 25, 26
  rank, 25
time reversal-see discrete symmetries, 168
time-ordered product, 128, 153, 159, 209, 228, 248
  scalar field, 131
time-ordering operator, 201
total inversion
  see CPT, 184
transition
  Fermi, 234
  Gamow–Teller, 234

unitarity, 158
  evolution operator, 148
universality, 140, 247
  lepton, 244
unstable particle
  decay width, 166

V–A interaction, 143, 240, 243, 246, 247, 260
vacuum, 110
  energy, 49
  expectation value, 128
  fluctuations, 106
  state, 49, 72, 118
vector boson, 145, 247
  mass, 249
vector mesons, 188
vector potential, 25, 53, 67, 97
  under CPT, 183
  under time inversion, 175
vector space, 311

W boson, 145, 259
weak interaction, 137, 142, 245
Weyl equation, 192

Yukawa meson, 144

Z boson, 145, 249, 259
Zeeman effect, 65, 98
zero point energy, 114


Luciano Maiani and Omar Benhar

Second edition published 2024

and by CRC Press

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2025 Luciano Maiani and Omar Benhar

First edition published by CRC Press 2015

ISBN: 978-1-032-56594-1 (hbk)

Chapter 1 The Symmetries of Space-Time 1

1.1 THE PRINCIPLE OF RELATIVITY 1

Chapter 2 The Classical Free Particle 10

2.1 SPACE–TIME MOTION 10

Chapter 3 The Lagrangian Theory of Fields 18

3.1 THE ACTION PRINCIPLE 18

Chapter 4 Klein–Gordon Field Quantisation 39

4.1 THE REAL SCALAR FIELD 39

Chapter 5 Electromagnetic-Field Quantisation 52

5.1 MAXWELL’S EQUATIONS IN COVARIANT FORM 52

Chapter 6 The Dirac Equation 78

6.1 FORM AND PROPERTIES OF THE DIRAC EQUATION 79

Chapter 7 Quantisation of the Dirac Field 110

7.1 PARTICLES AND ANTIPARTICLES 110

Chapter 8 Free Field Propagators 128

8.1 THE TIME-ORDERED PRODUCT 128

Chapter 9 Interactions 136

9.1 QUANTUM ELECTRODYNAMICS 137

Chapter 10 Time Evolution of Quantum Systems 147

10.1 THE SCHRÖDINGER REPRESENTATION 147

Chapter 11 Relativistic Perturbation Theory 157

11.1 THE DYSON FORMULA 159

Chapter 12 The Discrete Symmetries: P, C, T 168

12.1 PARITY 168

Chapter 13 Weyl and Majorana Neutrinos 191

13.1 THE WEYL NEUTRINO 191

Chapter 14 Applications: QED 201

14.1 SCATTERING IN A CLASSICAL COULOMB FIELD 201

14.8 PROBLEMS FOR CHAPTER 14 232

Chapter 15 Applications: Weak Interactions 233

15.1 NEUTRON DECAY 233

Chapter 16 Neutrino Oscillations 250

16.1 OSCILLATIONS IN VACUUM 252

Chapter 17 Neutrinoless Double-Beta Decay 272

17.1 DOUBLE BETA DECAY 272

Chapter 18 A Leap Forward: Charmonium 284

18.1 A PRIMER: BARYONS, MESONS, QUARKS AND QCD 284

Chapter 19 The Born-Oppenheimer Approximation for the

19.1 BORN-OPPENHEIMER APPROXIMATION IN BRIEF 301

Appendix A Basic Elements of Quantum Mechanics 311

A.1 THE PRINCIPLE OF SUPERPOSITION 311

Appendix B The Non-Relativistic Hydrogen Atom 323

B.1 FACTORISATION OF THE LAPLACIAN 323
