Fundamentals of the Standard Model

The document outlines the Standard Model of particle physics, detailing its fundamental particles and the forces that govern their interactions, specifically the strong, weak, and electromagnetic forces. It emphasizes the mathematical framework underlying the model, including symmetries and group theory, while also highlighting the significance of the Higgs boson in providing mass to particles. The document serves as a resource for students and researchers, offering recommended readings and a structured course outline on the subject.

The Standard Model

University of Cambridge Part III Mathematical Tripos

David Tong
Department of Applied Mathematics and Theoretical Physics,
Centre for Mathematical Sciences,
Wilberforce Road,
Cambridge, CB3 OBA, UK

Recommended Books and Resources

For a very elementary introduction to the Standard Model, you could take a look at
the lectures on Particle Physics that I wrote for the CERN summer school. They cover
the subject in a great deal of detail, but without any real mathematical sophistication.
If you’re completely new to the wonderful world of subatomic particles, this is a good
place to get grounded.

Many undergraduate degrees have courses on particle physics that use quantum
mechanics and some elementary group theory, without fully embracing quantum field
theory. There are a number of good textbooks catering to these courses. Two that I
particularly like are:

• Halzen and Martin, “Quarks and Leptons”

• David Griffiths, “Introduction to Elementary Particles”

More advanced and really excellent books are:

• Cliff Burgess and Guy Moore, “The Standard Model”

• Mark Thomson, “Modern Particle Physics”

• Matt Schwartz, “Quantum Field Theory and the Standard Model”

All three have different perspectives. Cliff and Guy’s book in particular is closely
aligned to the general theme of these lectures. Mark Thomson’s book includes many
more details about the specifics of particle interactions, while Matt’s book is a great
all-round QFT book that, as the title suggests, has an increasing focus on the Standard
Model as it proceeds.

Finally, if you’re serious about particle physics you should acquaint yourself with the
all-important Particle Data Group. They have various apps that you can download
and, for the more old-fashioned among you, books. Their booklet, available in the
download section of the webpage, is particularly useful. They’ll even mail you one for
free if you ask nicely.

In addition, there are many online lecture notes. You can find links to these on the
course webpage.
Contents
0 Introduction

1 Symmetries
1.1 Spacetime Symmetries
1.1.1 The Lorentz Group
1.1.2 The Poincaré Group and its Representations
1.1.3 The Coleman-Mandula Theorem
1.2 Spinors
1.2.1 Dirac vs Weyl Spinors
1.2.2 Actions for Spinors
1.3 Gauge Invariance
1.3.1 Maxwell Theory
1.3.2 A Refresher on Lie Algebras
1.3.3 Yang-Mills Theory
1.4 C, P, and T
1.4.1 Parity
1.4.2 Charge Conjugation
1.4.3 Time Reversal
1.4.4 CPT

2 Broken Symmetries
2.1 Discrete Symmetries
2.1.1 Quantum Tunnelling
2.1.2 Discrete Symmetry Breaking in Quantum Field Theory
2.2 Continuous Symmetries
2.2.1 The O(N) Sigma Model
2.2.2 Goldstone’s Theorem in Classical Field Theory
2.2.3 Goldstone’s Theorem in Quantum Field Theory
2.2.4 The Coleman-Mermin-Wagner Theorem
2.3 The Higgs Mechanism
2.3.1 The Abelian Higgs Model
2.3.2 Superconductivity
2.3.3 Non-Abelian Higgs Mechanism

3 The Strong Force
3.1 Strong Coupling
3.1.1 Asymptotic Freedom
3.1.2 Anti-Screening and Paramagnetism
3.1.3 The Mass Gap
3.1.4 A Short Distance Coulomb Force
3.1.5 A Long Distance Confining Force
3.2 Chiral Symmetry Breaking
3.2.1 The Quark Condensate
3.2.2 The Chiral Lagrangian
3.2.3 Phases of Massless QCD
3.3 Hadrons
3.3.1 Mesons
3.3.2 Lifetimes
3.3.3 Baryons
3.3.4 Heavy Quarks
3.4 The Theta Term
3.4.1 Topological Sectors
3.4.2 Instantons

4 Anomalies
4.1 Gauge Anomalies
4.1.1 Non-Abelian Gauge Anomalies
4.1.2 Mixed Anomalies
4.1.3 The Witten Anomaly
4.2 Chiral (or ABJ) Anomalies
4.2.1 The Theta Term Revisited
4.2.2 Noether’s Theorem for Anomalous Symmetries
4.2.3 Neutral Pion Decay
4.2.4 Surviving Discrete Symmetries
4.3 ’t Hooft Anomalies
4.3.1 Confinement Implies Chiral Symmetry Breaking

5 Electroweak Interactions
5.1 The Structure of the Standard Model
5.1.1 Anomaly Cancellation
5.1.2 Yukawa Interactions
5.1.3 Three Generations
5.1.4 The Lagrangian
5.1.5 Global Symmetries
5.1.6 What is the Gauge Group of the Standard Model?
5.2 Electroweak Symmetry Breaking
5.2.1 Electromagnetism
5.2.2 Running of the Weak Coupling
5.2.3 A First Look at Fermion Masses
5.3 Weak Decays
5.3.1 Electroweak Currents
5.3.2 Feynman Diagrams
5.3.3 A First Look at Weak Processes
5.3.4 4-Fermi Theory

6 Flavour
6.1 Diagonalising the Yukawa Interactions
6.1.1 Counting Yukawa Parameters
6.1.2 The Mass Eigenbasis
6.1.3 A Brief Look at Leptons
6.2 The CKM Matrix
6.2.1 Two Generations and the Cabibbo Angle
6.2.2 Three Generations and the CKM Matrix
6.2.3 The Wolfenstein Parameterisation
6.2.4 The Unitarity Triangle
6.3 Flavour Changing Neutral Currents
6.4 CP Violation
6.4.1 How to Think of the Breaking of Time Reversal
6.4.2 The Jarlskog Invariant
6.4.3 The Strong CP Problem Revisited
6.4.4 Neutral Kaons
6.4.5 Wherefore CP Violation?

7 Neutrinos
7.1 Neutrino Masses
7.1.1 Dirac vs Majorana Masses
7.1.2 The Dimension 5 Operator
7.1.3 Neutrinoless Double Beta Decay
7.1.4 The PMNS Matrix
7.1.5 CP Violation in the Lepton Sector
7.2 Neutrino Oscillations
7.2.1 Oscillations with Two Generations
7.2.2 Oscillations in Matter
7.2.3 Neutrino Detection Experiments

Acknowledgements

I’m grateful to Hugh Osborn and Fernando Quevedo who previously lectured a version
of this course in Cambridge, and to Wati Taylor for sharing the notes of his MIT course
with me. Many thanks to Ben Allanach for explaining various subtle (and less subtle)
issues to me. I’d also like to thank Mike Ball and Cory Fletcher for their courageous
typo spotting abilities and Elie Hamou for running the examples classes.

This course assumes a familiarity with quantum field theory. You will also need to
be comfortable with some group theory.

0 Introduction
The “Standard Model” is the comically inadequate name that physicists give to the
greatest scientific theory of all time.

This theory is the poster child for success in reductionist science. It describes the
universe on the most fundamental level and correctly predicts the results of every
experiment that we have ever done, sometimes with unprecedented levels of accuracy.

There are parts of the theory that are stunningly beautiful, with different facets
sliding together like a perfect jigsaw, locked in place with a mathematical rigidity that
means large parts of the world we inhabit could not be any other way. But there
are other aspects of the theory that appear much less elegant, with a couple of dozen
parameters that cannot be predicted from first principles but only by measuring them
in experiment. These parameters don’t appear to be completely random; there are
patterns within them that surely hint at some structure that lies beyond the Standard
Model, a structure that we have yet to uncover.

Boiled down to its essence, the Standard Model describes a bunch of particles,
interacting with three forces. These forces are the strong nuclear force, the weak nuclear
force, and electromagnetism. The force of gravity is not part of the Standard Model
but it’s straightforward to include it by coupling to a dynamical, curved spacetime.
(Claims that the Standard Model is incompatible with general relativity are wildly
overblown. The two theories work perfectly well together at all energy scales that we
can currently probe by experiment. The difficulties only arise when energies approach
the Planck scale.)

Each force in the Standard Model is associated to a Lie group. The upshot is that
the Standard Model is built around the group

$$ G = U(1) \times SU(2) \times SU(3) . $$

Why nature chose the numbers 1, 2 and 3 as the building blocks for her most important
theory is not known, but you can’t help but smile at the decision. Here $SU(3)$ is
associated to the strong force and $SU(2)$ to the weak force. The $U(1)$ factor is
not associated to electromagnetism but, instead, to an electromagnetic-like force known
as hypercharge. It too plays a role in the weak force. The theory of electromagnetism
that we know and love can be found hiding within the $SU(2) \times U(1)$ factor.

electron   1       down quark      9       up quark      4          electron neutrino   $\sim 10^{-6}$
muon       207     strange quark   186     charm quark   2495       muon neutrino       $\sim 10^{-6}$
tau        3483    bottom quark    8180    top quark     340,000    tau neutrino        $\sim 10^{-6}$

Table 1. The fermions of the Standard Model

Despite the group theoretic similarities of each force, the resulting physics is wildly
different. That’s because quantum field theory is cool. It does wonderful and
unexpected things. Part of the purpose of this course is to learn about these things and
why the dynamics of the strong, weak and electromagnetic forces all play very different
roles in our world.

These three forces interact with matter which, in the Standard Model, comes in the
form of 15 Weyl fermions which, collectively, go by the name of the electron, the up
quark, the down quark, and the neutrino. Why we give just four names to 15 fermions
is part of the story that we will unravel, but at heart it is to do with representation
theory of the group G.

At this point, one of the deepest facts about nature rears its head. The subtleties
of quantum field theory mean that this quartet of particles – the electron, neutrino,
and up and down quarks – have to come together as a collective. You don’t have a
choice. The theory with just, say, an electron and an up quark and no companions
makes no sense. On grounds of mathematical consistency alone, we’re obliged to have
this quartet of particles with their particular properties. This is where some of the
most beautiful aspects of the Standard Model can be found.

But then nature has a surprise, one which we’ve known about for almost a century
and yet we are seemingly no closer to understanding. Nature took that collection of
four particles and, for mysterious reasons, chose to replicate it twice over. This means
that the matter in our world is not made of 15 fermions with four different names, but
instead of 45 fermions with twelve different names. The names of these twelve particles
are shown in Table 1 together with their masses, relative to the electron mass which is

$$ m_e \approx 0.51 \ \text{MeV} . $$

Figure 1. Again, the masses of the fermions of the Standard Model. Note that the ordering
of particles in each generation is switched.

Each of the three rows in Table 1 is referred to as a different generation. The particles
in each generation experience identical forces. So, for example, the electron, muon and
tau all have electric charge $-1$, the down, strange and bottom quarks all have electric
charge $-1/3$ and the up, charm and top quarks all have electric charge $+2/3$. All three
neutrinos are neutral.

Similarly, the six quarks all experience the strong force in the same way, while the
electron, muon, tau and neutrinos (which, collectively are referred to as leptons) are all
untouched by the strong force.

The masses of the particles are replicated in Figure 1. They span at least 11 orders
of magnitude, maybe more. (The masses of the neutrinos are not well constrained, as
shown in the figure.) Why these particular masses? Why this ordering of masses? We
have no idea. That’s one of the outstanding questions that we hope might be answered
by a deeper theory.
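
To get a feel for the numbers in Table 1, the ratios can be converted to MeV using the electron mass quoted above. The following is only a rough sketch in Python: the names `MASS_RATIOS` and `mass_in_mev` are illustrative choices rather than anything from the lectures, and the neutrinos are left out of the dictionary because the table only gives an order of magnitude for them.

```python
import math

M_ELECTRON_MEV = 0.51  # electron mass quoted in the text

# Mass ratios relative to the electron, copied from Table 1 (charged fermions only).
MASS_RATIOS = {
    "electron": 1, "muon": 207, "tau": 3483,
    "down": 9, "strange": 186, "bottom": 8180,
    "up": 4, "charm": 2495, "top": 340_000,
}

def mass_in_mev(name):
    """Convert a Table 1 ratio into an approximate mass in MeV."""
    return MASS_RATIOS[name] * M_ELECTRON_MEV

# Orders of magnitude between the top quark and a neutrino,
# taking the neutrino entry of Table 1 (about 1e-6 electron masses) at face value.
span = math.log10(MASS_RATIOS["top"] / 1e-6)

for name in ("muon", "bottom", "top"):
    print(f"{name}: {mass_in_mev(name):.4g} MeV")
print(f"span: {span:.1f} orders of magnitude")
```

The computed span sits between 11 and 12, in line with the "at least 11 orders of magnitude" quoted above.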

There is one final piece of the Standard Model that sits, lording over everything.
This is the Higgs boson. It is, in many ways, the thing that ties everything together.
In particular, all the masses listed above can be traced to the interactions of various
fermions with the Higgs field.

The Higgs is simultaneously both the simplest and the most complicated field in the
Standard Model. It is the simplest because it is the only fundamental (as far as we can
tell!) scalar field that we have so far observed, meaning that it is the only field to carry
zero spin. It is the most complicated because, in contrast to fermions and gauge fields,
scalar fields don’t come with many consistency requirements which means that there
are a plethora of interaction terms that we can write down and the only way we have
to constrain their values is to go out and measure them. It’s here that we find the two
dozen or so parameters that we can’t yet explain. And it’s here that things get messy
and interesting.

This, then, is the Standard Model, part beauty, part beast. A glorious and
astonishingly successful theoretical edifice that, so far, has stood firm against everything that
experimenters have thrown at it. Yet few believe that it can really be the last word
in physics. The Standard Model, like the periodic table before it, surely holds clues
for what lies beyond. Our duty as physicists is to understand the Standard Model as
best we can, to learn its secrets and, if possible, to let it guide us to a still deeper
understanding of the world. The purpose of this course is to take you, at least part
way, on this journey.

1 Symmetries
A large chunk of the structure of the Standard Model follows from understanding the
various symmetries at play. Among these symmetries are
• Poincaré symmetries of spacetime, which restrict us to scalars, fermions, and
gauge fields. These are the basic building blocks of the Standard Model.

• Gauge symmetries, better referred to as “gauge redundancies”. These dictate the
interactions of the spin 1 fields. Indeed, we’ve already seen that the Standard
Model is usually advertised by specifying the gauge group

$$ G = U(1) \times SU(2) \times SU(3) . \tag{1.1} $$

• Global symmetries. These act on the fermions and include baryon number and
lepton number, as well as various approximate flavour symmetries.

• Discrete symmetries. Prominent among these are parity, time-reversal, and charge
conjugation. These three symmetries are critically important in the structure of
the Standard Model because, as we shall see, none of them are actually good
symmetries of our universe! But this is one case where not having symmetries puts
even stronger constraints on the theory than having symmetries. This is because
of something called “anomaly cancellation” that will be described in Section 4.
Of these, the various global symmetries arise because of the specific matter content of
the Standard Model and so we will postpone a discussion of them until we have more
details in place. (We’ll first get there in Section 3 when we describe features of the
strong force.) However, the other three symmetries – Poincaré, gauge, and discrete –
are ingredients that arise in pretty much all relativistic field theories. For this reason,
it makes sense to explore them in some detail in preparation for what’s to come.

1.1 Spacetime Symmetries


On the length scales appropriate for particle physics, spacetime is effectively flat.
This means that the arena for our story is Minkowski space $\mathbb{R}^{1,3}$, equipped with the
Minkowski metric

$$ \eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1) . \tag{1.2} $$

We label a point in Minkowski space as $x^\mu = (x^0, x^1, x^2, x^3)$. The symmetries of
Minkowski space include Lorentz transformations of the form $x^\mu \to \Lambda^\mu{}_\nu x^\nu$ where

$$ \Lambda^T \eta \Lambda = \eta . \tag{1.3} $$

Embedded among these are a couple of discrete transformations: parity with
$\Lambda = \mathrm{diag}(1, -1, -1, -1)$ and time reversal with $\Lambda = \mathrm{diag}(-1, 1, 1, 1)$. These are important
enough that we will discuss them separately in Section 1.4. The transformations that
are continuously connected to the identity have $\det \Lambda = 1$ and $\Lambda^0{}_0 > 0$ and form the
Lorentz group $SO(1,3)$. (The restriction to $\Lambda^0{}_0 > 0$ is sometimes written as $SO^+(1,3)$.)
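
The defining condition (1.3) is easy to test numerically. Below is a minimal sketch in plain Python, assuming a boost along $x^1$ parameterised by a rapidity; the helper names `is_lorentz` and `boost_x` are hypothetical, introduced only for this illustration.

```python
import math

# Minkowski metric (1.2) as a 4x4 matrix.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def is_lorentz(lam, tol=1e-12):
    """Check the condition Lambda^T eta Lambda = eta of (1.3)."""
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

def boost_x(rapidity):
    """Boost along x^1; the rapidity parameterisation is an assumption of this sketch."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print(is_lorentz(boost_x(0.7)), is_lorentz(PARITY), is_lorentz(TIME_REVERSAL))
```

Parity and time reversal pass the test (1.3) but both have determinant $-1$, which is why they are not continuously connected to the identity.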

Our main goal in this section is to understand some things about the representations
of the Lorentz group and its extension to the Poincaré group which also includes
spacetime translations. Among these representations, spinors are the most fiddly and
subtle and we will describe some of their properties in Section 1.2.

1.1.1 The Lorentz Group


Strictly speaking, the group $SO(1,3)$ doesn’t have any spinor representations. However,
there is a closely related group called $Spin(1,3)$ that does admit spinors. This is the
double cover, in the sense that

$$ SO(1,3) \cong Spin(1,3)/\mathbb{Z}_2 \tag{1.4} $$

where that $\mathbb{Z}_2$ is related to the famous minus sign that spinors pick up under a $2\pi$
rotation, a minus sign that vectors like $x^\mu$ are oblivious to. The fact that there are
spinors in our world is the statement that the true symmetry group is $Spin(1,3)$ rather
than $SO(1,3)$.

The groups $Spin(1,3)$ and $SO(1,3)$ share the same Lie algebra $so(1,3)$. A Lorentz
transformation acting on a 4-vector can be written as

$$ \Lambda = \exp\left( \frac{i}{2}\, \omega_{\mu\nu} M^{\mu\nu} \right) \tag{1.5} $$

where $\omega_{\mu\nu}$ are six numbers that specify what Lorentz transformation we’re doing, while
$M^{\mu\nu} = -M^{\nu\mu}$ are a choice of six suitable $4 \times 4$ matrices that generate the different
Lorentz transformations. The matrix indices are suppressed in the above expressions;
in their full glory we would write $(M^{\mu\nu})^\rho{}_\sigma$. So, for example

$$ (M^{01})^\rho{}_\sigma = i \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad (M^{12})^\rho{}_\sigma = i \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{1.6} $$

(Note that the generators differ by a factor of $i$ from those defined in the Quantum
Field Theory lectures. This is compensated by an extra factor of $i$ in the exponent
(1.5).) The matrices $M^{\mu\nu}$ generate the algebra $so(1,3)$,

$$ [M^{\mu\nu}, M^{\rho\sigma}] = i \left( \eta^{\nu\rho} M^{\mu\sigma} - \eta^{\nu\sigma} M^{\mu\rho} + \eta^{\mu\sigma} M^{\nu\rho} - \eta^{\mu\rho} M^{\nu\sigma} \right) . \tag{1.7} $$
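
The algebra (1.7) can be checked by brute force on the $4 \times 4$ matrices. The sketch below assumes the closed form $(M^{\mu\nu})^\rho{}_\sigma = i(\eta^{\mu\rho}\delta^\nu_\sigma - \eta^{\nu\rho}\delta^\mu_\sigma)$, which is not written out in the text but reproduces the two examples in (1.6). The function names are hypothetical.

```python
ETA = [1, -1, -1, -1]  # diagonal entries of the metric (1.2)

def eta(a, b):
    """Component eta^{ab} of the diagonal Minkowski metric."""
    return ETA[a] if a == b else 0

def gen(mu, nu):
    """Matrix (M^{mu nu})^rho_sigma, rows labelled by rho, columns by sigma.
    Assumed closed form; it reproduces the two examples in (1.6)."""
    return [[1j * (eta(mu, r) * (nu == s) - eta(nu, r) * (mu == s))
             for s in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def rhs(mu, nu, rho, sigma):
    """Right-hand side of (1.7) as a 4x4 matrix."""
    terms = [(eta(nu, rho), gen(mu, sigma)), (-eta(nu, sigma), gen(mu, rho)),
             (eta(mu, sigma), gen(nu, rho)), (-eta(mu, rho), gen(nu, sigma))]
    return [[1j * sum(c * m[i][j] for c, m in terms) for j in range(4)]
            for i in range(4)]

def algebra_holds():
    """True if (1.7) holds for every choice of the four indices."""
    idx = range(4)
    return all(commutator(gen(mu, nu), gen(rho, sigma)) == rhs(mu, nu, rho, sigma)
               for mu in idx for nu in idx for rho in idx for sigma in idx)

print(algebra_holds())
```

All entries are exact multiples of $1$ and $i$, so the comparison can use exact equality rather than a tolerance.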

The six different Lorentz transformations naturally decompose into three rotations $J_i$
and three boosts $K_i$, defined by

$$ J_i = \frac{1}{2} \epsilon_{ijk} M_{jk} \quad \text{and} \quad K_i = M_{0i} \tag{1.8} $$

where the $j, k = 1, 2, 3$ indices are summed over, and $\epsilon_{123} = +1$. The rotation matrices
are Hermitian, with $J_i^\dagger = J_i$ while the boost matrices are anti-Hermitian with
$K_i^\dagger = -K_i$. This ensures that the rotations in (1.5) give rise to a compact group while the
boosts are non-compact. From the Lorentz algebra, we find that these generators obey

$$ [J_i, J_j] = i\epsilon_{ijk} J_k , \quad [J_i, K_j] = i\epsilon_{ijk} K_k , \quad [K_i, K_j] = -i\epsilon_{ijk} J_k . \tag{1.9} $$
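
As a sanity check on the signs in (1.9), here is a small sketch that builds $J_i$ and $K_i$ from the matrices and tests the commutators and the (anti-)Hermiticity. It assumes the same closed form for $M^{\mu\nu}$ as in (1.6) and lowers indices with the metric (1.2), so that $M_{jk} = M^{jk}$ and $M_{0i} = -M^{0i}$. The helper names are hypothetical.

```python
ETA = [1, -1, -1, -1]  # diagonal of the metric (1.2)

def gen(mu, nu):
    """(M^{mu nu})^rho_sigma, assumed closed form consistent with (1.6)."""
    e = lambda a, b: ETA[a] if a == b else 0
    return [[1j * (e(mu, r) * (nu == s) - e(nu, r) * (mu == s))
             for s in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(4)] for i in range(4)]

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return ((i - j) * (j - k) * (k - i)) // 2

# J_i = (1/2) eps_ijk M_jk; two spatial indices lower with two minus signs.
J = [gen(2, 3), gen(3, 1), gen(1, 2)]
# K_i = M_0i = eta_00 eta_ii M^{0i} = -M^{0i}.
K = [scale(-1, gen(0, i)) for i in (1, 2, 3)]

def closes(X, Y, Z, sign):
    """Check [X_i, Y_j] = sign * i * eps_ijk Z_k for all i, j."""
    for i in range(3):
        for j in range(3):
            expected = [[sign * 1j * sum(eps(i, j, k) * Z[k][r][s] for k in range(3))
                         for s in range(4)] for r in range(4)]
            if commutator(X[i], Y[j]) != expected:
                return False
    return True

print(closes(J, J, J, +1), closes(J, K, K, +1), closes(K, K, J, -1))
```

The relative minus sign in the last commutator of (1.9) is what distinguishes the Lorentz algebra from that of four-dimensional rotations.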

The rotations form an $su(2)$ sub-algebra. That, of course, is to be expected and is
related to the fact that $SO(3) \cong SU(2)/\mathbb{Z}_2$.

We can, however, find two mutually commuting $su(2)$ algebras sitting inside $so(1,3)$.
For this we take the linear combinations

$$ A_i = \frac{1}{2} (J_i + iK_i) \quad \text{and} \quad B_i = \frac{1}{2} (J_i - iK_i) . \tag{1.10} $$

Both of these are Hermitian, with $A_i^\dagger = A_i$ and $B_i^\dagger = B_i$. They obey

$$ [A_i, A_j] = i\epsilon_{ijk} A_k , \quad [B_i, B_j] = i\epsilon_{ijk} B_k , \quad [A_i, B_j] = 0 . \tag{1.11} $$
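
The same brute-force approach confirms (1.11). This sketch again assumes the closed form for $M^{\mu\nu}$ suggested by (1.6), with $J_i$ and $K_i$ built as in (1.8); `combine` and `su2_closes` are hypothetical helper names.

```python
ETA = [1, -1, -1, -1]  # diagonal of the metric (1.2)

def gen(mu, nu):
    """(M^{mu nu})^rho_sigma, assumed closed form consistent with (1.6)."""
    e = lambda a, b: ETA[a] if a == b else 0
    return [[1j * (e(mu, r) * (nu == s) - e(nu, r) * (mu == s))
             for s in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def combine(a, x, b, y):
    """Linear combination a*x + b*y of two 4x4 matrices."""
    return [[a * x[r][s] + b * y[r][s] for s in range(4)] for r in range(4)]

def dagger(m):
    return [[m[j][i].conjugate() for j in range(4)] for i in range(4)]

def eps(i, j, k):
    return ((i - j) * (j - k) * (k - i)) // 2

J = [gen(2, 3), gen(3, 1), gen(1, 2)]                  # rotations, (1.8)
K = [combine(-1, gen(0, i), 0, gen(0, i)) for i in (1, 2, 3)]  # K_i = M_0i = -M^{0i}

A = [combine(0.5, J[i], 0.5j, K[i]) for i in range(3)]   # (J + iK)/2
B = [combine(0.5, J[i], -0.5j, K[i]) for i in range(3)]  # (J - iK)/2

def su2_closes(X):
    """Check [X_i, X_j] = i eps_ijk X_k."""
    for i in range(3):
        for j in range(3):
            expected = [[1j * sum(eps(i, j, k) * X[k][r][s] for k in range(3))
                         for s in range(4)] for r in range(4)]
            if commutator(X[i], X[j]) != expected:
                return False
    return True

ZERO = [[0] * 4 for _ in range(4)]
commute = all(commutator(a, b) == ZERO for a in A for b in B)
print(su2_closes(A), su2_closes(B), commute)
```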

But we know all about representations of $SU(2)$: they are labelled by an integer or
half-integer $j \in \tfrac{1}{2}\mathbb{Z}$ which, in the context of rotations, we call “spin”. The dimension
of the representation is then $2j + 1$. The fact that we can find two $su(2)$ sub-algebras
of the Lorentz algebra tells us that all representations must carry two such labels

$$ (j_1, j_2) \quad \text{with} \quad j_1, j_2 \in \tfrac{1}{2}\mathbb{Z} . \tag{1.12} $$

Moreover, we know that this representation must have dimension $(2j_1 + 1)(2j_2 + 1)$.
We’ll flesh out the meaning of these representations more below. But for now, we can
identify the simplest such representations just by counting: we have

$(0, 0)$ : scalar
$(\tfrac{1}{2}, 0)$ : left-handed Weyl spinor
$(0, \tfrac{1}{2})$ : right-handed Weyl spinor
$(\tfrac{1}{2}, \tfrac{1}{2})$ : vector                    (1.13)
$(1, 0)$ : self-dual 2-form
$(0, 1)$ : anti-self-dual 2-form
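
The counting behind (1.13) is just the dimension formula $(2j_1 + 1)(2j_2 + 1)$. A tiny sketch, with the dictionary `REPS` and the function `dimension` as illustrative names only:

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def dimension(j1, j2):
    """Dimension (2 j1 + 1)(2 j2 + 1) of the representation (j1, j2)."""
    return int((2 * j1 + 1) * (2 * j2 + 1))

# The representations listed in (1.13).
REPS = {
    "scalar": (0, 0),
    "left-handed Weyl spinor": (HALF, 0),
    "right-handed Weyl spinor": (0, HALF),
    "vector": (HALF, HALF),
    "self-dual 2-form": (1, 0),
    "anti-self-dual 2-form": (0, 1),
}

for name, (j1, j2) in REPS.items():
    print(f"({j1}, {j2})  dim {dimension(j1, j2)}  {name}")
```

The vector comes out four-dimensional, as it must, and the two 2-forms each have three components, which add up to the six components of an antisymmetric tensor.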

What we call the physical spin of a particle is the quantum number under rotations $\vec{J}$:
this is $j = j_1 + j_2$. The spin-statistics theorem ensures that particles with $j \in \mathbb{Z}$ are
bosons, while those with $j \in \mathbb{Z} + \tfrac{1}{2}$ are fermions.

There’s something a little odd about our discovery of two su(2) sub-algebras. After
all, it certainly isn’t true that the Lorentz group is isomorphic to two copies of SU (2).
This is because SU (2) is a compact group: keep doing a rotation and you will eventually
get back to where you started. Indeed, two copies of the group $SU(2)$ give the rotation
group of Euclidean space $\mathbb{R}^4$:

$$ Spin(4) \cong SU(2) \times SU(2) \quad \text{with} \quad SO(4) \cong Spin(4)/\mathbb{Z}_2 . \tag{1.14} $$

In contrast, the Lorentz group is non-compact: keep boosting and you get further and
further from where you started. How does this manifest itself in the two su(2) algebras
that we’ve found in (1.11)?

The answer is a little subtle and is to be found in the reality properties of the
generators $A_i$ and $B_i$. Recall that all integer, $j \in \mathbb{Z}$, representations of $SU(2)$ are real,
while all half-integer spin, $j \in \mathbb{Z} + \tfrac{1}{2}$, are pseudoreal (which means that, while not
actually real, the representation is isomorphic to its complex conjugate). However, the
$A_i$ and $B_i$ in (1.11) do not have these properties. You can see in (1.6) that both $J_i$ and
$K_i$ are pure imaginary. This, in turn, means that the generators $A_i$ and $B_i$ are complex
conjugates of each other

$$ (A_i)^\star = -B_i . \tag{1.15} $$
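
The step from "pure imaginary" to the conjugation property can be spelled out in one line. Using $J_i^\star = -J_i$ and $K_i^\star = -K_i$ in the definition (1.10), one finds that complex conjugation maps $A_i$ to $B_i$ up to an overall sign:

```latex
(A_i)^\star = \tfrac{1}{2}\left( J_i^\star - i K_i^\star \right)
            = \tfrac{1}{2}\left( -J_i + i K_i \right)
            = -\tfrac{1}{2}\left( J_i - i K_i \right)
            = -B_i .
```

The overall sign is harmless: if the $B_i$ obey the $su(2)$ algebra then so do the matrices $-B_i^\star$, which is the usual statement for the complex conjugate representation.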

This is where the difference lies that distinguishes $SO(4)$ from $SO(1,3)$. The Lie algebra
$so(1,3)$ does not itself contain two mutually commuting copies of the real Lie algebra $su(2)$;
these appear only after a suitable complexification. This means that certain complex linear
combinations of the Lie algebra $su(2) \times su(2)$ are isomorphic to $so(1,3)$. To highlight
this, the relationship between the two is sometimes written as

$$ so(1,3) \cong su(2) \times su(2)^\star . \tag{1.16} $$

For our purposes, it means that the complex conjugate of a representation $(j_1, j_2)$
exchanges the two quantum numbers

$$ (j_1, j_2)^\star = (j_2, j_1) . \tag{1.17} $$

Both the scalar representation $(0, 0)$ and the vector representation $(\tfrac{1}{2}, \tfrac{1}{2})$ are real, while
the left- and right-handed Weyl spinors $(\tfrac{1}{2}, 0)$ and $(0, \tfrac{1}{2})$ are exchanged under complex
conjugation. This last statement, which is important, will be elaborated upon in
Sections 1.2 and 1.4. In the context of quantum field theory, if a field appears in a theory
then so too does its complex conjugate. This means that if you have a left-handed
spinor, you also have a right-handed complex conjugated spinor.

1.1.2 The Poincaré Group and its Representations


The continuous symmetries of Minkowski space consist of Lorentz transformations
together with spacetime translations. Combined, these form the Poincaré group.
Spacetime translations are generated, as usual, by the momentum 4-vector $P^\mu$. Their
commutation relations with themselves and with the Lorentz generators $M^{\mu\nu}$ are given
by

$$ [P^\mu, P^\nu] = 0 \quad \text{and} \quad [M^{\mu\nu}, P^\sigma] = i \left( P^\mu \eta^{\nu\sigma} - P^\nu \eta^{\mu\sigma} \right) . \tag{1.18} $$

The latter of these is equivalent to the statement that $P^\mu$ transforms as a 4-vector
under Lorentz transformations. These commutation relations should be considered in
conjunction with the Lorentz algebra (1.7),

$$ [M^{\mu\nu}, M^{\rho\sigma}] = i \left( \eta^{\nu\rho} M^{\mu\sigma} - \eta^{\nu\sigma} M^{\mu\rho} + \eta^{\mu\sigma} M^{\nu\rho} - \eta^{\mu\rho} M^{\nu\sigma} \right) . \tag{1.19} $$

Together, (1.18) and (1.19) form the algebra of the Poincaré group.

Given an algebra, our next task is to explore its representations. There are different
ways that we could approach this. Ultimately, we will be interested in the way that
the Poincaré group acts on fields that make up the Standard Model. But first, to build
some intuition, we will understand how the Poincaré group acts on single particle states
in the Hilbert space.

To set the scene, let’s first recall how we construct irreducible representations of the
rotation group. We work with the algebra $so(3) \cong su(2)$ rather than the group. This
is, of course, defined by the familiar commutation relations

$$ [J_i, J_j] = i\epsilon_{ijk} J_k . \tag{1.20} $$

To construct representations, the first thing we do is look to the Casimirs. These are
operators that commute with all generators of the group. For $su(2)$, there is just a
single Casimir,

$$ C = \sum_{i=1}^{3} J_i^2 . \tag{1.21} $$

Irreducible representations are labelled by the eigenvalue of the Casimir. For $su(2)$,
the eigenvalue of $J^2$ is $j(j + 1)$ with the spin $j$ taking values in $j = 0, \tfrac{1}{2}, 1, \ldots$. Each
representation has dimension $2j + 1$, with the states within a multiplet identified by
their eigenvalue under, say, $J_3$ whose eigenvalue lies in the range $|j_3| \leq j$. The result is
the familiar one from quantum mechanics: states are labelled by two quantum numbers
$|j, j_3\rangle$.
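
A quick numerical illustration of the Casimir (1.21): for spin $\tfrac{1}{2}$ and spin $1$ the sum of squares of the generators should be $j(j+1)$ times the identity. The sketch below assumes the standard choices $J_i = \sigma_i/2$ (Pauli matrices) and $(J_k)_{ab} = -i\epsilon_{abk}$; neither is written out in the text, and the names `SPIN_HALF`, `SPIN_ONE` and `casimir` are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def casimir(generators):
    """Sum of squares of the generators, as in (1.21)."""
    n = len(generators[0])
    total = [[0] * n for _ in range(n)]
    for g in generators:
        sq = matmul(g, g)
        total = [[total[i][j] + sq[i][j] for j in range(n)] for i in range(n)]
    return total

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return ((i - j) * (j - k) * (k - i)) // 2

# Spin 1/2: J_i = sigma_i / 2 with the Pauli matrices (assumed standard basis).
SPIN_HALF = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# Spin 1: (J_k)_{ab} = -i eps_{abk} (assumed standard vector representation).
SPIN_ONE = [[[-1j * eps(a, b, k) for b in range(3)] for a in range(3)]
            for k in range(3)]

for label, gens, j in (("1/2", SPIN_HALF, 0.5), ("1", SPIN_ONE, 1.0)):
    c = casimir(gens)
    print(f"spin {label}: C[0][0] = {c[0][0].real}, j(j+1) = {j * (j + 1)}")
```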

Now let’s turn to the Poincaré group. The irreducible representations are what we
call “particles”. Again, they are characterised by the Casimirs. I won’t tell you how
to construct Casimirs, but will instead just present you with the result. First, we
introduce the Pauli-Lubański vector,

$$ W^\mu = \frac{1}{2} \epsilon^{\mu\nu\rho\sigma} P_\nu M_{\rho\sigma} . \tag{1.22} $$

This can be thought of as a relativistic version of angular momentum. You can
easily check this commutes with momentum $[W_\mu, P_\nu] = 0$. The remaining non-trivial
commutation relations are somewhat more laborious to show:

$$ [W_\mu, M_{\nu\rho}] = i (\eta_{\mu\nu} W_\rho - \eta_{\mu\rho} W_\nu) \quad \text{and} \quad [W_\mu, W_\nu] = -i \epsilon_{\mu\nu\rho\sigma} W^\rho P^\sigma . \tag{1.23} $$

The last of these commutation relations is quadratic on the right-hand side and so we’re
not looking at a Lie algebra here, but something more complicated. (This is reminiscent
of the Runge-Lenz vector which is a conserved quantity for the Kepler problem; there
too, the Poisson bracket structure returns something quadratic on the right-hand side.)

The two Casimirs of the Poincaré group are formed from the momentum $P_\mu$ and the
Pauli-Lubański vector $W_\mu$,

$$ C_1 = P_\mu P^\mu \quad \text{and} \quad C_2 = W_\mu W^\mu . \tag{1.24} $$

This is our starting point: representations of the Poincaré group are labelled by the
eigenvalues of C1 and C2 , together with the eigenvalues of any other operators that
we can find to make a maximally commuting set, analogous to J3 for the angular
momentum.

The most important of these “other operators” is the momentum $P^\mu$ itself. All states
will be labelled by the eigenvalue $p^\mu$ which is simply the 4-momentum of the particle.
The first Casimir is then just the square of the rest mass of the particle, $C_1 = p_\mu p^\mu = m^2$. By
acting with rotations and boosts $M_{\mu\nu}$, we can change the momentum to take any value
subject to the constraint $p_\mu p^\mu = m^2$. In the rotation analogy, the different values of $p^\mu$
are like the different values of $j_3$ in the multiplet. However, in contrast to rotations,
representations of the Poincaré group will necessarily be infinite dimensional, labelled
(among other things) by the continuous variable $p^\mu$. This difference can be traced to
the fact that the Poincaré group is non-compact while the rotation group is compact.
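
To see the mass-shell constraint at work, one can boost the rest-frame momentum and check that $p_\mu p^\mu$ stays fixed while the components grow without bound. This is a minimal sketch, assuming a boost along $x^3$ labelled by a rapidity; `boost_z` and `minkowski_square` are hypothetical names.

```python
import math

def minkowski_square(p):
    """p_mu p^mu with the metric (+1, -1, -1, -1)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def boost_z(p, rapidity):
    """Boost a 4-momentum along x^3 (rapidity parameterisation assumed)."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [c * p[0] + s * p[3], p[1], p[2], s * p[0] + c * p[3]]

m = 2.0                      # arbitrary illustrative mass
rest = [m, 0.0, 0.0, 0.0]    # rest-frame momentum

for rapidity in (0.0, 0.5, 1.0, 2.0):
    p = boost_z(rest, rapidity)
    print(f"rapidity {rapidity}: E = {p[0]:.4f}, p.p = {minkowski_square(p):.6f}")
```

The energy increases with every boost but the invariant stays at $m^2$, which is the sense in which the different $p^\mu$ fill out a single, infinite-dimensional multiplet.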

What happens next depends on whether we’re dealing with massive or massless
particles. We describe each in turn, followed by a somewhat mysterious massless
representation that no one really knows what to make of.

Massive Representations
First, consider the situation when $C_1 = m^2 \neq 0$. It’s fruitful to pick a representative
value of the momentum $p^\mu$ and the simplest choice is to boost to the rest frame of the
particle so that $p^\mu = (m, 0, 0, 0)$. In this frame, the Pauli-Lubański vector is

$$ W^0 = 0 \quad \text{and} \quad W^i = m J^i \tag{1.25} $$

with $J^i$ the generators of rotations. Note that the rotation generators $J^i$ are precisely
those elements of the Lorentz algebra that don’t change the value of our chosen
momentum $p^\mu = (m, 0, 0, 0)$. That means that these generators $J^i$ must act on whatever
other degrees of freedom are carried by the particles. We want to ask: what are the
allowed extra degrees of freedom?

But this is a question that we already answered above because our problem has
reduced to finding a representation of the Lie algebra $su(2)$, generated by $J^i$. The
second quadratic Casimir of the Poincaré group is $C_2 = -m^2 J^2$ and so is specified by
the eigenvalue of $J^2$ which, as we reviewed above, is $j(j + 1)$ for some $j \in \tfrac{1}{2}\mathbb{Z}$. The full
multiplet is then filled out by the different values of $j_3$ with $|j_3| \leq j$.

We’ve seen that, if we fix the momentum to the specific value $p^\mu = (m, 0, 0, 0)$,
then we’re left with finding representations of the rotation group. But, importantly, it
doesn’t matter which value of the momentum we started with: had we picked a different
$p^\mu$ (still with $p_\mu p^\mu = m^2$), then we’d have got the same result. This suggests that we
can lift the $SU(2)$ representation that we found for our given $p^\mu$ to a representation of
the full Poincaré group. And, indeed, this is the case.

There is a theorem underlying this result which we won’t prove. Instead, I’ll just
give you some names of things. Once we fix the momentum $p^\mu$, the elements of the
Lorentz group that don’t change $p^\mu$ form a group known as the little group. For massive
particles, the little group is $SU(2)$. One can then show that representations of the little
group uplift to representations of the full Poincaré group. This is what’s known as an
induced representation.

The upshot is something familiar: massive particles are characterised by their mass
$m$ and spin $j$. Given these Casimirs, states in this representation of the Poincaré group
are labelled by $|p^\mu, j_3\rangle$.

Massless Representations
The story is slightly different for massless particles, for which the first Casimir vanishes:
$C_1 = m^2 = 0$. We again choose a representative momentum. This time we can’t boost
to the rest frame, but we can choose the momentum to take the form $p^\mu = (E, 0, 0, E)$
where $E$ is the energy of the particle. A short calculation shows that, in this frame,
the Pauli-Lubański vector now takes the form

$$ W_\mu = E \begin{pmatrix} -M_{12} \\ M_{23} - M_{02} \\ M_{31} + M_{01} \\ M_{12} \end{pmatrix} = E \begin{pmatrix} -J_3 \\ J_1 - K_2 \\ J_2 + K_1 \\ J_3 \end{pmatrix} . \tag{1.26} $$

Here we’ve replaced the $M_{\mu\nu}$ with the appropriate rotation generator $J_i$ or boost
generator $K_i$ defined in (1.8). Once again, each of the components of $W_\mu$ leaves our initial
momentum $p^\mu = (E, 0, 0, E)$ unchanged, a fact that you can check by looking at the
explicit form of the generators (1.6). In other words, these components of $W_\mu$ are once
again our little group. (This has happened twice now and it is no coincidence: the
structure of the Pauli-Lubański vector was designed so that this holds.)

What group do the components of $W^\mu$ actually generate? We can look at their
commutation relations which, using (1.9), are

$$ [W_1, W_2] = 0 , \quad [W_3, W_2] = -iEW_1 , \quad [W_3, W_1] = iEW_2 . \tag{1.27} $$

This is the Euclidean group in $\mathbb{R}^2$, sometimes written as $ISO(2)$, with $W_1$ and $W_2$ the
generators of translations and $W_3$ the generator of rotations. Again, the little group
doesn’t act on our chosen $p^\mu = (E, 0, 0, E)$, but it may act on any other degrees of
freedom that our state carries. Said differently, those other degrees of freedom must
fall into a representation of the 2d Euclidean group.
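
Both claims about the massless little group can be tested on the explicit $4 \times 4$ matrices: that its generators annihilate $p^\mu = (E, 0, 0, E)$, and that they close into the algebra of $ISO(2)$. The sketch below rests on several assumptions that should be read as such: the closed form for $M^{\mu\nu}$ suggested by (1.6), the identifications $J_i = \tfrac{1}{2}\epsilon_{ijk}M_{jk}$ and $K_i = M_{0i}$ of (1.8) with indices lowered by the metric, and the sign choices $W_1 = E(J_1 - K_2)$, $W_2 = E(J_2 + K_1)$, $W_3 = E J_3$. All helper names are hypothetical.

```python
ETA = [1, -1, -1, -1]  # diagonal of the metric (1.2)

def gen(mu, nu):
    """(M^{mu nu})^rho_sigma, assumed closed form consistent with (1.6)."""
    e = lambda a, b: ETA[a] if a == b else 0
    return [[1j * (e(mu, r) * (nu == s) - e(nu, r) * (mu == s))
             for s in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def lin(*terms):
    """Linear combination of 4x4 matrices from (coefficient, matrix) pairs."""
    return [[sum(c * m[r][s] for c, m in terms) for s in range(4)]
            for r in range(4)]

def act(m, v):
    """Action of a generator on a 4-vector with an upper index."""
    return [sum(m[r][s] * v[s] for s in range(4)) for r in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[r][s] - b[r][s]) < tol for r in range(4) for s in range(4))

J = [gen(2, 3), gen(3, 1), gen(1, 2)]          # rotations
K = [lin((-1, gen(0, i))) for i in (1, 2, 3)]  # boosts, K_i = M_0i = -M^{0i}

E = 2.0                  # arbitrary illustrative energy
P = [E, 0, 0, E]         # representative null momentum

W1 = lin((E, J[0]), (-E, K[1]))   # E (J_1 - K_2)
W2 = lin((E, J[1]), (E, K[0]))    # E (J_2 + K_1)
W3 = lin((E, J[2]))               # E J_3

ZERO = [[0] * 4 for _ in range(4)]
annihilate = all(abs(c) < 1e-12 for W in (W1, W2, W3) for c in act(W, P))
translations_commute = close(commutator(W1, W2), ZERO)
rotates_w2 = close(commutator(W3, W2), lin((-1j * E, W1)))
rotates_w1 = close(commutator(W3, W1), lin((1j * E, W2)))
print(annihilate, translations_commute, rotates_w2, rotates_w1)
```

With these conventions $W_1$ and $W_2$ behave like the two translations of the plane and $W_3$ like the rotation that turns one into the other, which is the structure of the 2d Euclidean group.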

Here a subtlety rears its head. For reasons that we will explain below, things turn out
to be simplest if we consider representations of the little group on which the translation
generators $W_1$ and $W_2$ act trivially. If we ignore these translations, the remaining little
group is just the $U(1)$ of rotations generated by $J_3$. Representations of this $U(1)$ are
labelled by a single eigenvalue $h$ such that the states transform as

$$ e^{i\theta J_3} |h\rangle = e^{ih\theta} |h\rangle . \tag{1.28} $$

– 12 –
The eigenvalue h is called the helicity and is the analog of spin for massless particles.
At times, we’ll be lazy and just refer to both as “spin”. For a general null p, the helicity
tells us the eigenvalue of the state under a rotation along the direction of motion,

ei✓ p̂·J |pµ ; hi = eih✓ |pµ ; hi . (1.29)

Because the U (1) generated by J3 was a subgroup U (1) ⇢ SU (2), we know that this
helicity is quantised to take values
1
h2 Z. (1.30)
2
This is the statement that, under a rotation of ✓ = 2⇡, the states are either left the
same (for h 2 Z) or pick up a minus sign (for h 2 Z + 12 ).
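
The quantisation condition can be illustrated numerically. The following is a minimal sketch in plain Python; the function name `rotation_phase` is introduced here for illustration and is not part of the lectures. It evaluates the phase $e^{ih\theta}$ of (1.28) at $\theta = 2\pi$ for a few integer and half-integer helicities.

```python
import cmath

def rotation_phase(h, theta):
    """Phase e^{i h theta} acquired by a helicity-h state under a rotation by theta."""
    return cmath.exp(1j * h * theta)

# Integer helicities return to themselves after a 2*pi rotation;
# half-integer helicities pick up a minus sign.
for h in (0, 0.5, 1, 1.5, 2):
    phase = rotation_phase(h, 2 * cmath.pi)
    print(f"h = {h}: phase = {phase.real:+.3f}")
```

Any other value of $h$ would give a phase that is neither $+1$ nor $-1$, which is what the embedding $U(1) \subset SU(2)$ forbids.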

There’s something missing in the story above. For massive representations, we’ve seen that the states are labelled by $m$ and $j$ and fill out a multiplet $|p^\mu, j_3\rangle$ with $|j_3| \leq j$. This multiplet has dimension $2j + 1$. (Ok, the multiplet is really infinite dimensional because of the $p^\mu$, but for a fixed $p^\mu$ the multiplet has dimension $2j + 1$.)

However, for massless particles there is just a single state $|p^\mu; h\rangle$. This is because the helicity describes the representation of the Abelian group $U(1)$ generated by $J_3$ rather than the non-Abelian group $SU(2)$ and irreducible representations of Abelian groups are one-dimensional.

The problem with this is that it doesn’t fit with what we know about massless particles. For example, the photon has helicity $h = 1$ and has two polarisation states, as does a graviton with $h = 2$. A massless spinor with $h = \tfrac{1}{2}$ also has two degrees of freedom. Why aren’t we seeing this doubling in our representation theory analysis?

What we’re missing is the additional requirement that the spectrum of states is invariant under CPT. These are discrete symmetries that we will look at more closely in Section 1.4. For massive particles, this doesn’t buy us anything new: the set of states $|p^\mu, j\rangle$ is already invariant under CPT. However, for massless particles CPT flips $h \mapsto -h$ and tells us that massless states must come in pairs

$$ |p^\mu; h\rangle \quad \text{and} \quad |p^\mu; -h\rangle . \tag{1.31} $$

This is the origin of the two polarisation states of the photon or graviton, or the two helicities of a massless Weyl spinor. Note that a massless scalar has helicity $h = 0$ and so is CPT self-conjugate. This means that there’s no requirement from CPT to add an additional degree of freedom in this case.

Weird Continuous Spin Representations
We brushed over something above. When looking at massless representations, we found that the little group coincides with the 2d Euclidean group (1.27). But then, without justification, we restricted ourselves to representations on which the translation generators $W_1$ and $W_2$ act trivially. Here we give the justification.

Let’s look at representations of the 2d Euclidean group (1.27) for which translations $W_1$ and $W_2$ act non-trivially. Because $[W_1, W_2] = 0$, we can simultaneously diagonalise these generators so that they act on states $|w_1, w_2\rangle$ such that

$$ W_i |w_1, w_2\rangle = w_i |w_1, w_2\rangle \quad \text{for } i = 1, 2 . \tag{1.32} $$

The second Casimir is then

$$ C_2 = W^\mu W_\mu = -(w_1^2 + w_2^2) . \tag{1.33} $$

For the massless representations above, we assumed that $w_1 = w_2 = 0$. Now we want to understand what happens when they are non-zero. Since $C_2$ is fixed, we write $w_1 = \rho\cos\alpha$ and $w_2 = \rho\sin\alpha$ with $C_2 = -\rho^2$ and we should think of the collection of states $|w_1, w_2\rangle$ as parameterised by the angle $\alpha \in [0, 2\pi)$ with the action

$$ W_1 |\alpha\rangle = \rho\cos\alpha\, |\alpha\rangle \quad \text{and} \quad W_2 |\alpha\rangle = \rho\sin\alpha\, |\alpha\rangle . \tag{1.34} $$

It remains to determine the action of $W_3 = EJ_3$ on these states. This is given by

$$ e^{i\theta J_3} |\alpha\rangle = e^{ih\theta} |\alpha + \theta\rangle \quad \Longrightarrow \quad J_3 |\alpha\rangle = h|\alpha\rangle - i\frac{d}{d\alpha}|\alpha\rangle . \tag{1.35} $$

You can check that the actions (1.35) and (1.34) do indeed furnish a representation of the 2d Euclidean algebra (1.27). But, from the perspective of particle physics, it’s a very weird representation. This is because particle states $|p^\mu, \alpha; h\rangle$ are labelled by their momentum $p^\mu$ and an additional angle $\alpha \in [0, 2\pi)$. This means that for every choice of momentum $p^\mu$, there’s still an infinite dimensional Hilbert space, labelled by the continuous parameter $\alpha$ rather than a discrete, bounded parameter like $j_3$. Said differently, it’s as if we have an uncountably infinite number of species of particle. These are known as continuous spin representations.

We’ve certainly never observed particles corresponding to these states and they would
have very strange properties (such as infinite heat capacity). Nonetheless, one can’t
help but wonder if nature may make use of them somewhere.

1.1.3 The Coleman-Mandula Theorem
It’s not unusual for quantum field theories to exhibit further continuous symmetries. Say, a global $U(1)$ symmetry that rotates the phase of a complex field, or perhaps a non-Abelian $SU(N)$ symmetry under which a multiplet of fields transforms. The generators of these symmetries – which we’ll denote collectively as $T$ – correspond to some conserved charge and are always Lorentz scalars, which means that they necessarily commute with the Poincaré generators,

$$ [P^\mu, T] = [M^{\mu\nu}, T] = 0 . \tag{1.36} $$

One could ask: is it possible for something less trivial to happen, with the new generators transforming in some fashion under the Poincaré group? For example, this would happen if the additional generators $T$ themselves carried some spacetime index. If this were possible, the Poincaré group would be subsumed into a larger group. And that sounds interesting.

A theorem due to Coleman and Mandula greatly restricts this possibility. Roughly speaking, the theorem states that, in any spacetime dimension greater than $d = 1 + 1$, the symmetry group of any interacting quantum field theory must factorise as

$$ \text{Poincaré} \times \text{Internal} . \tag{1.37} $$

We won’t prove the Coleman-Mandula theorem here. The gist of the proof is to look at
2-to-2 scattering (meaning two incoming particles scatter into two outgoing particles).
Poincaré invariance already greatly restricts what can happen, with only the scatter-
ing angle left undetermined. Any internal symmetries that factorise, as in (1.37), put
restrictions on the kinds of interactions that are allowed, for example enforcing con-
servation of electric charge. But if the generators T were to carry a spacetime index
then they would put further constraints on the scattering angle itself and that would
be overly restrictive, at best allowing scattering to occur only at discrete angles. But
if one assumes that the scattering amplitudes are analytic functions of the angle then
the amplitude must vanish for all angles and the theory is free.

Like all no-go theorems in physics, the Coleman-Mandula theorem comes with a
number of underlying assumptions. Some of these are eminently reasonable, such as
locality and causality. But it may be possible to relax other assumptions to find inter-
esting loopholes to the Coleman-Mandula theorem. Two such loopholes have proven
to be extremely important.
• Conformal Invariance: The Coleman-Mandula theorem assumes that the theory has a mass gap, meaning that all particles are massive. Indeed, the theorem is a statement about symmetries of the S-matrix which is really only well defined for massive particles where we don’t have to worry about IR divergences. For theories of massless particles something interesting can, and often does, happen.

The first interesting thing is that interacting massless theories typically exhibit scale invariance. This means that physics is unchanged under the symmetry $x^\mu \to \lambda x^\mu$. The associated symmetry generator is called $D$ for “dilatation”. This can only be a symmetry of a theory that has no dimensionful parameters, which is the main reason it can occur only for massless theories.

The second interesting thing is more surprising. For reasons that are not entirely understood, theories that exhibit scale invariance also exhibit a further symmetry known as special conformal transformations of the form

$$ x^\mu \to \frac{x^\mu - a^\mu x^2}{1 - 2a\cdot x + a^2 x^2} . \tag{1.38} $$

This transformation depends on a vector parameter $a^\mu$ and the associated generator is a 4-vector $K^\mu$. The resulting conformal algebra extends the Poincaré algebra (1.18) and (1.19) with the non-trivial commutators

$$ \begin{aligned} [D, K^\mu] &= iK^\mu , \qquad [D, P^\mu] = -iP^\mu \\ [K^\mu, P^\nu] &= 2i\,(D\eta^{\mu\nu} - M^{\mu\nu}) \\ [M^{\mu\nu}, K^\sigma] &= i\,(K^\nu \eta^{\mu\sigma} - K^\mu \eta^{\nu\sigma}) . \end{aligned} \tag{1.39} $$

Interacting conformal field theories crop up in many places in physics. In their Euclidean incarnation, they describe critical points, or second order phase transitions, that were the focus of our lectures on Statistical Field Theory. In $d = 1 + 1$ dimensions the conformal group has rather more structure and a detailed introduction can be found in the lectures on String Theory.

• Supersymmetry: The second loophole to the Coleman-Mandula theorem is supersymmetry. This is a symmetry that relates bosons to fermions. The generator that enacts this magical transformation is denoted as $Q_\alpha$ and carries a spacetime spinor index $\alpha = 1, 2$. (We will learn more about spinors in Section 1.2.) This is exactly the kind of thing that the Coleman-Mandula theorem is supposed to rule out. However, supersymmetry evades the theorem because the generators $Q_\alpha$ do not form a Lie algebra: instead they form what is known as a super-Lie algebra, with the commutation relations of the Poincaré group (1.18) and (1.19) augmented by the anti-commutation relation

$$ \{Q_\alpha, \bar{Q}_{\dot\alpha}\} = 2\sigma^\mu_{\alpha\dot\alpha} P_\mu . \tag{1.40} $$

Here $\sigma^\mu_{\alpha\dot\alpha}$ are a collection of $2 \times 2$ matrices defined in (1.44). (We’ll see a lot more about what the $\alpha$ and $\dot\alpha$ spinor indices mean shortly.) You can learn (a lot!) more about this algebra and its consequences for various field theories in the lectures on Supersymmetry.
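
The special conformal transformation (1.38) in the first bullet can be explored numerically. The sketch below is in plain Python and every function name in it is introduced here for illustration. It relies on a standard fact that is not stated in these notes: the map (1.38) equals an inversion $x^\mu \to x^\mu/x^2$, followed by a translation by $-a^\mu$, followed by a second inversion. The code checks this at a sample point and also checks that two successive transformations combine by adding their parameters.

```python
def dot(x, y):
    """Minkowski inner product with signature (+, -, -, -)."""
    return x[0] * y[0] - sum(xi * yi for xi, yi in zip(x[1:], y[1:]))

def special_conformal(x, a):
    """The map (1.38): x -> (x - a x^2) / (1 - 2 a.x + a^2 x^2)."""
    x2 = dot(x, x)
    denom = 1 - 2 * dot(a, x) + dot(a, a) * x2
    return [(xi - ai * x2) / denom for xi, ai in zip(x, a)]

def inversion(x):
    """The inversion x -> x / x^2 (requires x^2 != 0)."""
    x2 = dot(x, x)
    return [xi / x2 for xi in x]

def via_inversions(x, a):
    """Inversion, then translation by -a, then inversion again."""
    y = inversion(x)
    y = [yi - ai for yi, ai in zip(y, a)]
    return inversion(y)

def max_difference(u, v):
    return max(abs(ui - vi) for ui, vi in zip(u, v))

# Sample values, chosen arbitrarily with x^2 != 0.
x = [1.0, 0.2, 0.3, 0.4]
a = [0.05, -0.02, 0.01, 0.03]
b = [-0.01, 0.04, 0.02, -0.03]

# (1.38) agrees with inversion-translation-inversion.
print(max_difference(special_conformal(x, a), via_inversions(x, a)))

# Applying the map with parameter a and then b equals the map with a + b.
twice = special_conformal(special_conformal(x, a), b)
summed = special_conformal(x, [ai + bi for ai, bi in zip(a, b)])
print(max_difference(twice, summed))
```

Both printed differences should vanish up to floating point rounding. The additive composition is the reason a single 4-vector generator $K^\mu$ is enough to describe these transformations.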

Neither conformal symmetry nor supersymmetry plays a role in the Standard Model. However, both arise in different ways when it comes to ideas for what lies beyond the Standard Model.

1.2 Spinors
Scalars are basic. They have no internal structure and, as such, come with very little
baggage. There’s a lot of fun that we can have with them, largely by writing down
potentials that do interesting things, and we’ll see examples of this when we discuss
spontaneous symmetry breaking in Section 2. But there’s little that is subtle about
scalars: what you see is what you get.

In contrast, any field with higher spin is awash with subtleties. For massless spin 1 particles, like photons, these subtleties are all about gauge invariance and we will discuss them in Section 1.3. Here our interest is in spin $\tfrac{1}{2}$ particles, known as spinors. These are the fields that describe all matter particles in the Standard Model, meaning the quarks and leptons. They are subtle largely because anything that comes back to itself with a minus sign after a $2\pi$ rotation is always going to be a little strange.

1.2.1 Dirac vs Weyl Spinors


We start by reviewing some features of spinors that we met in the lectures on Quantum Field Theory. However, our focus is going to be a little different. In particular, to prepare us for the Standard Model, we will need to look more closely at the properties of Weyl spinors.

In the lectures on Quantum Field Theory, we learned about the 4-component Dirac spinor $\psi$. This comes hand in hand with a collection of gamma matrices that obey the Clifford algebra

$$ \{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu} . \tag{1.41} $$

The Clifford algebra admits a unique irreducible representation, up to conjugation. But that “up to conjugation” caveat hides all manner of headaches as it provides ample opportunity for physicists to use annoying conventions. Here we use the chiral basis of gamma matrices,

$$ \gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} \quad \text{and} \quad \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \tag{1.42} $$

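where we’ve introduced two collections of $2 \times 2$ matrices,

$$ \sigma^\mu = (1, \sigma^i) \quad \text{and} \quad \bar\sigma^\mu = (1, -\sigma^i) \tag{1.43} $$

where $\sigma^i$ with $i = 1, 2, 3$ are the familiar Pauli matrices,

$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{1.44} $$

The bar on $\bar\sigma^\mu$ in (1.43) doesn’t denote complex conjugation: these are simply a different collection of $2 \times 2$ matrices from $\sigma^\mu$.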

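In the Quantum Field Theory lectures, we showed that the generators of Lorentz transformations for a Dirac spinor are

$$ S^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu] = \begin{pmatrix} \sigma^{\mu\nu} & 0 \\ 0 & \bar\sigma^{\mu\nu} \end{pmatrix} . \tag{1.45} $$

(As with our earlier definition of $M^{\mu\nu}$, this differs by a factor of $i$ from the conventions in the Quantum Field Theory lectures.) Here we’ve defined

$$ \begin{aligned} \sigma^{\mu\nu} &= \frac{i}{4}\left(\sigma^\mu\bar\sigma^\nu - \sigma^\nu\bar\sigma^\mu\right) \\ \bar\sigma^{\mu\nu} &= \frac{i}{4}\left(\bar\sigma^\mu\sigma^\nu - \bar\sigma^\nu\sigma^\mu\right) . \end{aligned} \tag{1.46} $$

Because both of these expressions are anti-symmetrised in $\mu$ and $\nu$, each is a collection of six $2 \times 2$ matrices.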

The generators $S^{\mu\nu}$ defined in (1.45) are block diagonal. This is telling us that they are not an irreducible representation of the Lorentz algebra. Instead, it’s formed of two distinct representations, one generated by $\sigma^{\mu\nu}$ and the other generated by $\bar\sigma^{\mu\nu}$. Indeed, you can check that each of these obeys the Lorentz algebra (1.5)

$$ [\sigma^{\mu\nu}, \sigma^{\rho\sigma}] = i\left(\eta^{\nu\rho}\sigma^{\mu\sigma} - \eta^{\nu\sigma}\sigma^{\mu\rho} + \eta^{\mu\sigma}\sigma^{\nu\rho} - \eta^{\mu\rho}\sigma^{\nu\sigma}\right) \tag{1.47} $$

with a similar expression for $\bar\sigma^{\mu\nu}$. Correspondingly, the 4-component Dirac spinor $\psi$ also decomposes into two 2-component spinors

$$ \psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} . \tag{1.48} $$

These are referred to as left-handed and right-handed spinors respectively. In the language of our earlier table of representations (1.13), $\psi_L$ sits in the $(\tfrac{1}{2}, 0)$ representation while $\psi_R$ sits in the $(0, \tfrac{1}{2})$ representation. A Dirac spinor is a combination of both representations $(\tfrac{1}{2}, 0) \oplus (0, \tfrac{1}{2})$.

Under a Lorentz transformation, a left-handed Weyl spinor transforms as

$$ \psi_L \to S\psi_L \quad \text{with} \quad S = \exp\left(\frac{i}{2}\omega_{\mu\nu}\sigma^{\mu\nu}\right) . \tag{1.49} $$

Here $\omega_{\mu\nu}$ are the same set of six numbers that specify the Lorentz transformation (1.5). There is a similar expression for $\psi_R$, with $\sigma^{\mu\nu}$ replaced by $\bar\sigma^{\mu\nu}$.

You can check that ${\rm tr}\,\sigma^{\mu\nu} = 0$ and so, using $\det(e^A) = e^{{\rm tr}\,A}$, we have $\det S = 1$. In fact, $S \in SL(2, \mathbb{C})$, and what we’ve done in constructing the Weyl spinor representation of the Lorentz group is highlight the group isomorphism $Spin(1, 3) \cong SL(2, \mathbb{C})$.
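
Both of the “you can check” statements above lend themselves to a brute-force check. The following is a minimal sketch in plain Python; the helper names (`mul`, `lin`, `sigma_mn`, `algebra_residual`) are introduced here for illustration. It assumes the metric $\eta = {\rm diag}(+1, -1, -1, -1)$, builds the six matrices $\sigma^{\mu\nu}$ from the definition (1.46), and tests the Lorentz algebra (1.47) and tracelessness over all index choices.

```python
I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(*terms):
    """Linear combination sum_k c_k M_k of 2x2 matrices, given as (c, M) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

SIGMA = [I2] + PAULI                               # sigma^mu = (1, sigma^i)
SIGMA_BAR = [I2] + [lin((-1, s)) for s in PAULI]   # sigma-bar^mu = (1, -sigma^i)

def eta(mu, nu):
    """Minkowski metric diag(+1, -1, -1, -1) (assumed signature)."""
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1

def sigma_mn(mu, nu):
    """sigma^{mu nu} = (i/4)(sigma^mu sigma-bar^nu - sigma^nu sigma-bar^mu)."""
    return lin((0.25j, mul(SIGMA[mu], SIGMA_BAR[nu])),
               (-0.25j, mul(SIGMA[nu], SIGMA_BAR[mu])))

def algebra_residual(mu, nu, rho, sig):
    """Largest entry of LHS - RHS of the Lorentz algebra for one choice of indices."""
    a, b = sigma_mn(mu, nu), sigma_mn(rho, sig)
    lhs = lin((1, mul(a, b)), (-1, mul(b, a)))
    rhs = lin((1j * eta(nu, rho), sigma_mn(mu, sig)),
              (-1j * eta(nu, sig), sigma_mn(mu, rho)),
              (1j * eta(mu, sig), sigma_mn(nu, rho)),
              (-1j * eta(mu, rho), sigma_mn(nu, sig)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

idx = range(4)
worst = max(algebra_residual(m, n, r, s) for m in idx for n in idx for r in idx for s in idx)
max_trace = max(abs(sigma_mn(m, n)[0][0] + sigma_mn(m, n)[1][1]) for m in idx for n in idx)
print("largest violation of the algebra:", worst)
print("largest |trace|:", max_trace)
```

Swapping the roles of `SIGMA` and `SIGMA_BAR` inside `sigma_mn` gives the corresponding check for $\bar\sigma^{\mu\nu}$.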

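(Left-Handed)$^\star$ = Right-Handed
The two representations – one for a left-handed Weyl spinor, the other for a right-handed Weyl spinor – are related by complex conjugation.

It’s not immediately obvious because, as we’ve seen, the generators are $\sigma^{\mu\nu}$ and $\bar\sigma^{\mu\nu}$ and it’s not true that these generators are complex conjugates: $(\sigma^{\mu\nu})^\star \neq \bar\sigma^{\mu\nu}$. To see the relation, we need an additional conjugation by the anti-symmetric tensor

$$ \epsilon = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{1.50} $$

You can then check that

$$ \epsilon^T (\sigma^{\mu\nu})^\star \epsilon = -\bar\sigma^{\mu\nu} . \tag{1.51} $$

(The minus sign is as it should be: because $S^\star = \exp\left(-\frac{i}{2}\omega_{\mu\nu}(\sigma^{\mu\nu})^\star\right)$, the complex conjugate representation is generated by $-(\sigma^{\mu\nu})^\star$.)

Operationally, the complex conjugation flips the sign of $\sigma^2$, with $(\sigma^2)^\star = -\sigma^2$, leaving the other Pauli matrices alone: $(\sigma^i)^\star = \sigma^i$ for $i = 1, 3$. But the conjugation by $\epsilon = i\sigma^2$ then flips the sign of $\sigma^i$ with $i = 1, 3$, leaving $\sigma^2$ alone.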
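
The relation between the two sets of generators can be confirmed entry by entry. The sketch below is plain Python with helper names introduced here for illustration. It builds $\sigma^{\mu\nu}$ and $\bar\sigma^{\mu\nu}$ from their definitions in (1.46) and compares $\epsilon^T(\sigma^{\mu\nu})^\star\epsilon$ with $-\bar\sigma^{\mu\nu}$ for every pair of indices.

```python
I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
EPS = [[0, 1], [-1, 0]]      # the anti-symmetric tensor (1.50)
EPS_T = [[0, -1], [1, 0]]    # its transpose

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(*terms):
    """Linear combination sum_k c_k M_k of 2x2 matrices, given as (c, M) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]

def conj(m):
    """Entry-wise complex conjugate (no transpose)."""
    return [[complex(v).conjugate() for v in row] for row in m]

SIGMA = [I2] + PAULI                               # sigma^mu = (1, sigma^i)
SIGMA_BAR = [I2] + [lin((-1, s)) for s in PAULI]   # sigma-bar^mu = (1, -sigma^i)

def sigma_mn(mu, nu):
    """sigma^{mu nu} = (i/4)(sigma^mu sigma-bar^nu - sigma^nu sigma-bar^mu)."""
    return lin((0.25j, mul(SIGMA[mu], SIGMA_BAR[nu])),
               (-0.25j, mul(SIGMA[nu], SIGMA_BAR[mu])))

def sigma_bar_mn(mu, nu):
    """sigma-bar^{mu nu} = (i/4)(sigma-bar^mu sigma^nu - sigma-bar^nu sigma^mu)."""
    return lin((0.25j, mul(SIGMA_BAR[mu], SIGMA[nu])),
               (-0.25j, mul(SIGMA_BAR[nu], SIGMA[mu])))

def conjugation_residual(mu, nu):
    """Largest entry of eps^T (sigma^{mu nu})^* eps + sigma-bar^{mu nu}."""
    lhs = mul(mul(EPS_T, conj(sigma_mn(mu, nu))), EPS)
    rhs = sigma_bar_mn(mu, nu)
    return max(abs(lhs[i][j] + rhs[i][j]) for i in range(2) for j in range(2))

worst = max(conjugation_residual(m, n) for m in range(4) for n in range(4))
print("largest violation:", worst)
```

The check covers both rotations, where $\sigma^{ij} = \bar\sigma^{ij}$, and boosts, where $\sigma^{0i} = -\bar\sigma^{0i}$.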

This simple algebraic relation has an important physical implication. If you have a left-handed particle described by a Weyl spinor $\psi_L$, then its anti-particle is described by the conjugate spinor $\psi_L^\dagger$ (which we also write as $\bar\psi_L$) and is right-handed.

Building Scalars from Spinors


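If we’re given two left-handed spinors, $\psi_L$ and $\chi_L$, then we can build a scalar. We’ll adorn our spinors with indices, so we have $(\psi_L)_\alpha$ and $(\chi_L)_\alpha$ with $\alpha = 1, 2$. We also add indices to our anti-symmetric matrix

$$ \epsilon^{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \tag{1.52} $$

We then define the scalar quantity

$$ \psi_L\chi_L := \epsilon^{\alpha\beta}(\psi_L)_\beta(\chi_L)_\alpha = (\psi_L)_2(\chi_L)_1 - (\psi_L)_1(\chi_L)_2 . \tag{1.53} $$

To see that this does indeed transform as a scalar, we look at

$$ \psi_L\chi_L \to S_\alpha{}^\gamma S_\beta{}^\delta \epsilon^{\alpha\beta}(\psi_L)_\delta(\chi_L)_\gamma = (\det S)\,\epsilon^{\gamma\delta}(\psi_L)_\delta(\chi_L)_\gamma = \psi_L\chi_L \tag{1.54} $$

where, in the first equality, we’ve used the fact that $S_\alpha{}^\gamma S_\beta{}^\delta \epsilon^{\alpha\beta} = \det S\, \epsilon^{\gamma\delta}$, which you can confirm simply by checking all the cases $\gamma, \delta = 1, 2$. In the second equality we’ve used the fact that $\det S = 1$.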

This is an important lesson: you can form a scalar from two left-handed spinors. In terms of the representation theory of the previous section, what we’re seeing here is the tensor product $(\tfrac{1}{2}, 0) \otimes (\tfrac{1}{2}, 0) = (0, 0) \oplus (1, 0)$, where the scalar (1.53) picks out the singlet $(0, 0)$.
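
The determinant identity behind (1.54) is easy to test on explicit matrices. The following is a minimal sketch in plain Python; the function names and the sample matrices are introduced here for illustration. The spinor components are treated as ordinary commuting numbers, which is enough for this check because the invariance argument never reorders them.

```python
EPS = [[0, 1], [-1, 0]]   # epsilon^{alpha beta}

def det(s):
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]

def transformed_eps(s):
    """S_alpha^gamma S_beta^delta eps^{alpha beta}, as a matrix indexed by (gamma, delta)."""
    return [[sum(s[a][g] * s[b][d] * EPS[a][b] for a in range(2) for b in range(2))
             for d in range(2)] for g in range(2)]

def spinor_product(psi, chi):
    """psi chi := eps^{alpha beta} psi_beta chi_alpha = psi_2 chi_1 - psi_1 chi_2."""
    return sum(EPS[a][b] * psi[b] * chi[a] for a in range(2) for b in range(2))

def act(s, psi):
    """(S psi)_alpha = S_alpha^gamma psi_gamma."""
    return [sum(s[a][g] * psi[g] for g in range(2)) for a in range(2)]

# A generic complex matrix: the identity holds for any S, with det S as the factor.
M = [[1 + 2j, 3], [0.5, -1j]]
print(transformed_eps(M), det(M))

# A sample element of SL(2, C), with det S = 1, leaves the product invariant.
S = [[2, 1j], [0, 0.5]]
psi = [0.3 + 0.1j, -1.2]
chi = [2.0, 0.7 - 0.4j]
before = spinor_product(psi, chi)
after = spinor_product(act(S, psi), act(S, chi))
print(before, after)
```

For a matrix with $\det S \neq 1$ the product is rescaled by $\det S$, which is why the scalar is invariant under $SL(2, \mathbb{C})$ rather than under all of $GL(2, \mathbb{C})$.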

The anti-symmetric tensor $\epsilon^{\alpha\beta}$ is an invariant tensor for the group $SL(2, \mathbb{C})$. In that sense, it plays a role that is similar to the delta function $\delta^{ab}$ for the group $SO(N)$, or the Minkowski metric $\eta^{\mu\nu}$ for the group $SO(1, 3)$. In particular, it allows us to form a scalar product between two spinors as in (1.53). The fact that this product is anti-symmetric, rather than symmetric, fits nicely with the fact that, in quantum field theory, spinors are anti-commuting variables whose components are Grassmann-valued. This means that we have,

$$ \psi_L\chi_L = (\psi_L)_2(\chi_L)_1 - (\psi_L)_1(\chi_L)_2 = -(\chi_L)_1(\psi_L)_2 + (\chi_L)_2(\psi_L)_1 = \chi_L\psi_L . \tag{1.55} $$

In particular, this means that we can form a scalar from just a single left-handed Weyl spinor

$$ \psi_L\psi_L = (\psi_L)_2(\psi_L)_1 - (\psi_L)_1(\psi_L)_2 = 2(\psi_L)_2(\psi_L)_1 . \tag{1.56} $$

Again, there are similar expressions for right-handed spinors.

There’s quite a bit more to say about the two different representations of the Lorentz algebra and their properties. You can read about this (and the corresponding dotted and undotted indices) in the first section of the lectures on Supersymmetry. But the simple summary above will suffice for our purposes.

1.2.2 Actions for Spinors


Our next goal is to understand how to construct Lagrangians for spinors. Again, our starting point will be the Dirac spinor that we met in Quantum Field Theory. There we saw that the Lorentz invariant action is

$$ S_{\rm Dirac} = \int d^4x \left( i\bar\psi\gamma^\mu\partial_\mu\psi - M\bar\psi\psi \right) . \tag{1.57} $$

For a Dirac spinor, the bar notation means $\bar\psi = \psi^\dagger\gamma^0$. Decomposed in terms of Weyl fermions (1.48),

$$ S_{\rm Dirac} = \int d^4x \left( i\bar\psi_L\bar\sigma^\mu\partial_\mu\psi_L + i\bar\psi_R\sigma^\mu\partial_\mu\psi_R - M(\bar\psi_R\psi_L + \bar\psi_L\psi_R) \right) . \tag{1.58} $$

First an important, but trivial, notational point: the bar for a Weyl spinor means something different from a bar for a Dirac spinor. It is simply a more elegant way of writing $\bar\psi_L = \psi_L^\dagger$.

Second, note that the mass term couples the left- and right-handed Weyl spinors. Combining our observations above, we know that the complex conjugate $\bar\psi_R$ is a left-handed spinor, and so in writing $\bar\psi_R\psi_L$ we’ve combined two left-handed spinors into a scalar. Similarly, $\bar\psi_L\psi_R$ combines two right-handed spinors into a scalar.

It’s worth pausing to look at the symmetries of the action (1.58). Crucially, these symmetries are different for massless and massive fermions. In the absence of the mass term, so $M = 0$, the action has a $U(1)^2$ symmetry, under which the two fermions rotate separately, $\psi_L \to e^{i\alpha}\psi_L$ and $\psi_R \to e^{i\beta}\psi_R$. When we turn on the mass term, only the diagonal combination, with $\alpha = \beta$, survives. This is a general story, and one that will be particularly important for understanding the Standard Model: massless fermions always have more symmetries than massive fermions.

The mass in (1.58) can take values $M \in \mathbb{R}$. (There’s no positivity requirement.) Upon quantisation, with $M \neq 0$, we get a particle of spin $\tfrac{1}{2}$ and charge $+1$ under the surviving $U(1)$, together with a distinct anti-particle of spin $\tfrac{1}{2}$ and charge $-1$, both with mass $|M|$.

The mass term in (1.58) which combines two different spinors, $\psi_L$ and $\psi_R$, is known as a Dirac mass. It’s not the only thing we can write down. Suppose that we have just a left-handed spinor $\psi_L$. Then it’s perfectly possible to write down an action with a mass term,

$$ S_{\rm Weyl} = \int d^4x \left( i\bar\psi_L\bar\sigma^\mu\partial_\mu\psi_L + \frac{m}{2}\psi_L\psi_L + \frac{m^\star}{2}\bar\psi_L\bar\psi_L \right) . \tag{1.59} $$

This is known as a Majorana mass. Here we can take $m \in \mathbb{C}$.

Again, the massive theory has less symmetry than the massless theory, with the $U(1)$ that rotates the phase of $\psi_L$ broken when $m \neq 0$. This means that there’s no $U(1)$ quantum number to distinguish particles from anti-particles and, upon quantisation, the theory describes a single spin $\tfrac{1}{2}$ particle with mass $|m|$ that is now its own anti-particle.

Because the Majorana mass term explicitly breaks the $U(1)$ symmetry, it is not allowed if the $U(1)$ is gauged. Relatedly, it’s not possible to write down such a term for any fermion $\psi_L$ that transforms in a complex representation of a gauge group. It is, however, possible to write down such terms for fermions in real representations.

1.3 Gauge Invariance


In the Standard Model, forces are associated to massless spin 1 particles, known col-
lectively as gauge bosons. As we now explain, much of the dynamics of these forces is
fixed by gauge invariance.

1.3.1 Maxwell Theory


The key ideas of gauge invariance are familiar from electromagnetism. There, the fundamental field is the 4-vector $A_\mu(x)$, known as the gauge potential. Crucially, not all components of $A_\mu(x)$ are physical: instead, we should identify any two gauge potentials that are related by a gauge transformation of the form

$$ A_\mu \to A_\mu + \partial_\mu\alpha \tag{1.60} $$

for any function $\alpha(x)$. The transformation (1.60) is sometimes called a gauge symmetry. It’s not a good name. A “symmetry” describes a situation in which two physically distinct configurations share the same physics. But that’s not what’s going on in (1.60). Instead, the two configurations related by a gauge transformation describe the same physical configuration. A fairly decent analogy is to think of two gauge potentials that are related by (1.60) in the same way as you would view two different coordinate systems. A much better name would be gauge redundancy.

As we proceed, we’ll see that a great deal of the structure of the Standard Model
is determined by the requirements of gauge invariance. Yet, in many ways, this is a
strange idea on which to rest our most important theories of physics. Gauge invariance
is, at heart, merely an ambiguity in how we choose to present the laws of physics. Why
should it play such an important role?

One reason is that the ambiguity allows us to demonstrate various properties that
we care about but which, naively, might appear incompatible. These properties include
Lorentz invariance and locality and, in the quantum theory, unitarity. We already got
a glimpse of this in the lectures on Quantum Field Theory when we quantised Maxwell
theory. One choice of gauge makes unitarity manifest while another makes Lorentz
invariance manifest. The gauge ambiguity allows us to flit from one choice to another,
allowing us to both have our cake and eat it.

Relatedly, we know that the photon has two polarisation states. But try writing down a field which describes the photon that has only two indices and which transforms nicely under the $SO(1, 3)$ Lorentz group; it’s not possible. So instead we introduce the field $A_\mu$ which makes Lorentz invariance manifest and then use the gauge symmetry to kill two of the four resulting states.

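The physical information in $A_\mu$ can be found in the field strength

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \tag{1.61} $$

The field strength is invariant under the gauge transformation (1.60). The field strength houses the electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$. If we write $A^\mu = (\phi, \mathbf{A})$, then we have

$$ \mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} \quad \text{and} \quad \mathbf{B} = \nabla\times\mathbf{A} . \tag{1.62} $$

The dynamics of the gauge field is described by the action

$$ S_{\rm Maxwell} = \int d^4x \left( -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} \right) . \tag{1.63} $$

The resulting equations of motion are

$$ \partial_\mu F^{\mu\nu} = 0 . \tag{1.64} $$

This coincides with two of the Maxwell equations: Gauss’ law $\nabla\cdot\mathbf{E} = 0$ and Ampère’s law $\nabla\times\mathbf{B} = \partial\mathbf{E}/\partial t$. The other two follow immediately from constructing $F_{\mu\nu}$ in terms of the gauge potential. To see this, we first introduce the dual field strength

$$ {}^\star F^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma} . \tag{1.65} $$

This is similar to $F_{\mu\nu}$, but with $\mathbf{E}$ and $\mathbf{B}$ swapped (one of them with a minus sign). Then, by the anti-symmetry of $\epsilon^{\mu\nu\rho\sigma}$, together with the definition (1.61), we have the Bianchi identity

$$ \partial_\mu {}^\star F^{\mu\nu} = 0 . \tag{1.66} $$

Expanding this out gives the remaining two Maxwell equations: the one that says magnetic monopoles don’t exist, $\nabla\cdot\mathbf{B} = 0$, and the law of induction, $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$.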

The necessity to keep gauge invariance means that it’s not possible to augment the action (1.63) with a mass term of the form $m^2 A_\mu A^\mu$. This would break gauge invariance and cause trouble down the line. Naively, this would appear to guarantee that the photon must always be massless. In fact, there is a way to give the photon a mass, known as the Higgs mechanism. This will be discussed in Section 2.3.

Coupling to Matter
Underlying electromagnetism is a U (1) gauge group. That’s not so obvious in the
description above, where the “symmetry” (really redundancy) manifests itself only as
a shift of the gauge field (1.60) depending on a function ↵(x). However, the U (1)ness
of electromagnetism becomes more apparent when we couple to charged fields.

Fields that are charged under electromagnetism are necessarily complex. Consider, for example, a complex scalar field $\phi(x)$ of charge $e$. When the gauge field transforms as (1.60), the scalar field has a corresponding transformation

$$ \phi \to e^{ie\alpha}\phi . \tag{1.67} $$

Here we see the group emerging more clearly, with $e^{ie\alpha(x)} \in U(1)$. Because the transformation parameter $\alpha(x)$ is a function, we really have a $U(1)$ symmetry/redundancy for each point $x$ in space. This is what it means to have a $U(1)$ “gauge group”: it is a much larger group than the global symmetries that appear elsewhere.

We can construct theories that are invariant under the transformation (1.67) by replacing partial derivatives with the covariant derivative

$$ D_\mu\phi = \partial_\mu\phi - ieA_\mu\phi . \tag{1.68} $$

This has the nice property that $D_\mu\phi$ transforms covariantly under a gauge transformation, a fact that requires a couple of quick lines of calculation:

$$ \begin{aligned} D_\mu\phi \to\ & (\partial_\mu - ieA_\mu - ie\partial_\mu\alpha)\, e^{ie\alpha}\phi \\ =\ & e^{ie\alpha}(\partial_\mu - ieA_\mu)\phi \\ =\ & e^{ie\alpha}D_\mu\phi . \end{aligned} \tag{1.69} $$

The key to this calculation is that the derivative $\partial_\mu$ hitting $e^{ie\alpha}$ exactly cancels the shift of the gauge field (1.60). Taking the complex conjugate of (1.68), we have

$$ D_\mu\phi^\dagger = (\partial_\mu + ieA_\mu)\phi^\dagger . \tag{1.70} $$

From this, we see that the meaning of the covariant derivative $D_\mu$ depends on the object it’s hitting: it’s $-ieA_\mu$ for the scalar in (1.68), but $+ieA_\mu$ for the conjugate scalar in (1.70). You can check that, under a gauge transformation, $D_\mu\phi^\dagger \to e^{-ie\alpha}D_\mu\phi^\dagger$. This ensures that we can form a gauge invariant action

$$ S_{\rm scalar} = \int d^4x \left( D_\mu\phi^\dagger D^\mu\phi - V(|\phi|) \right) \tag{1.71} $$

where we take the potential to depend only on $|\phi|^2 = \phi^\dagger\phi$. In particular, this means that we disallow terms in the potential of the form $\phi^2 + \phi^{\dagger 2}$ which are real but are not gauge invariant.
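
The covariance of $D_\mu\phi$ can also be seen numerically in a toy setting. The sketch below is plain Python and works in one dimension; the sample field `phi`, gauge potential `gauge_field`, gauge parameter `alpha` and the value of `E_CHARGE` are arbitrary choices made here for illustration, and derivatives of the field are taken by central finite differences. It compares the covariant derivative of the transformed field with $e^{ie\alpha}$ times the original covariant derivative, and shows that the ordinary derivative fails the same test.

```python
import cmath
import math

E_CHARGE = 0.7   # sample value of the charge e

def phi(x):
    """Sample complex scalar field."""
    return cmath.exp(2j * x) * (1 + 0.3 * x)

def gauge_field(x):
    """Sample gauge potential A(x)."""
    return math.cos(x)

def alpha(x):
    """Sample gauge parameter alpha(x)."""
    return math.sin(3 * x)

def alpha_prime(x):
    """Exact derivative of alpha(x)."""
    return 3 * math.cos(3 * x)

def derivative(f, x, h=1e-5):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def covariant_derivative(f, a, x):
    """D f = f' - i e A f, the one-dimensional analogue of the covariant derivative."""
    return derivative(f, x) - 1j * E_CHARGE * a(x) * f(x)

def phi_transformed(x):
    """phi -> exp(i e alpha) phi."""
    return cmath.exp(1j * E_CHARGE * alpha(x)) * phi(x)

def gauge_field_transformed(x):
    """A -> A + d(alpha)/dx."""
    return gauge_field(x) + alpha_prime(x)

x0 = 0.4
phase = cmath.exp(1j * E_CHARGE * alpha(x0))

covariant_mismatch = abs(
    covariant_derivative(phi_transformed, gauge_field_transformed, x0)
    - phase * covariant_derivative(phi, gauge_field, x0)
)
ordinary_mismatch = abs(derivative(phi_transformed, x0) - phase * derivative(phi, x0))

print("covariant derivative mismatch:", covariant_mismatch)
print("ordinary derivative mismatch: ", ordinary_mismatch)
```

The first mismatch is zero up to discretisation error, while the second is of order $e\,|\partial\alpha|\,|\phi|$: this is the term that the shift of the gauge field is there to cancel.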

If we have multiple scalar fields, then they can carry different charges. When the gauge group is $U(1)$, these charges should be integer multiples of each other, meaning that each field transforms as

$$ \phi \to e^{ieq\alpha}\phi \quad \text{with} \quad q \in \mathbb{Z} . \tag{1.72} $$

It is possible to write down theories in which the charges $q$ are not integer valued. (For example, one could imagine one scalar field with $q = 1$ and another with $q = \sqrt{2}$.) Strictly, the gauge group should be viewed as $\mathbb{R}$ in this case, rather than $U(1)$. The differences between a $U(1)$ gauge group and an $\mathbb{R}$ gauge group are rather subtle, and manifest themselves only in the presence of magnetic monopoles, or in spacetimes of non-trivial topology. We won’t get into these issues here.

Everything that we’ve said above for scalars also holds for fermions, both Weyl and Dirac. In either case, we replace the partial derivatives in the relevant action (either (1.59) or (1.58)) with covariant derivatives and off we go.

1.3.2 A Refresher on Lie Algebras


There is an important extension of Maxwell theory in which the gauge group U (1) is
replaced by a compact Lie group G. Here we give a lightning review of the relevant
aspects of Lie groups and Lie algebras.

A Lie group is a group that is also a differentiable manifold¹. This means, among other things, that a group element is labelled by some continuous parameters. We’ve already met examples of Lie groups in both the rotation group and the Poincaré group.

Lie groups have the property that, for elements continuously connected to the identity, we can write each $U \in G$ as

$$ U = e^{i\theta^A T^A} . \tag{1.73} $$

Here the $\theta^A$ are just numbers that tell us which group element we’re working with, while the $T^A$ are generators of the group. If you like, the $T^A$ tell us the infinitesimal action of the group, with $U \approx 1 + i\theta^A T^A + \mathcal{O}(\theta^2)$ when $\theta$ is small. A general group element (1.73) can then be constructed by exponentiating the infinitesimal action.

It turns out that, with the exception of some global information, the structure of the Lie group is captured in the behaviour of those infinitesimal generators $T^A$. They form the associated Lie algebra $\mathfrak{g}$, given by

$$ [T^A, T^B] = if^{ABC}T^C . \tag{1.74} $$

Here $A, B, C = 1, \ldots, {\rm dim}\,G$ and $f^{ABC}$ are the fully anti-symmetric structure constants which distill the information about the group $G$. The factor of $i$ on the right-hand side is taken to ensure that the generators are Hermitian: $(T^A)^\dagger = T^A$.

(Mathematicians usually prefer the convention where there is no i on the right-hand


side and the generators are anti-Hermitian, largely because there are examples like
SO(N ) where everything in the game is real and a factor of i makes things needlessly
complex. In contrast, physicists tend to include the factor of i on the right-hand side
because they’re usually working in the realm of quantum mechanics where things will
ultimately become complex anyway.)

The $T^A$ in (1.74) are abstract objects but we will shortly want to identify them with matrices. This means, among other things, that we want the commutator in (1.74) to have the same properties as matrix commutation, among them the Jacobi identity

$$ [T^A, [T^B, T^C]] + [T^B, [T^C, T^A]] + [T^C, [T^A, T^B]] = 0 . \tag{1.75} $$

This puts constraints on the structure constants $f^{ABC}$ which must, in turn, obey

$$ f^{ADE}f^{BCD} + f^{BDE}f^{CAD} + f^{CDE}f^{ABD} = 0 . \tag{1.76} $$
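
The constraint on the structure constants can be checked by brute force for a concrete algebra. The sketch below is plain Python, with function names introduced here for illustration. It uses the Levi-Civita symbol $\epsilon^{ABC}$, the structure constants of $SU(2)$, as the example and evaluates the left-hand side of the constraint for every choice of the free indices.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def jacobi_residual(f, dim):
    """Largest |f^{ADE} f^{BCD} + f^{BDE} f^{CAD} + f^{CDE} f^{ABD}| over all A, B, C, E."""
    worst = 0
    for A in range(dim):
        for B in range(dim):
            for C in range(dim):
                for E in range(dim):
                    total = sum(
                        f(A, D, E) * f(B, C, D)
                        + f(B, D, E) * f(C, A, D)
                        + f(C, D, E) * f(A, B, D)
                        for D in range(dim)
                    )
                    worst = max(worst, abs(total))
    return worst

print(jacobi_residual(eps, 3))
```

Any other candidate set of structure constants can be passed in as `f`; a non-zero result means that the candidate does not define a Lie algebra.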

¹ For many physicists, Lie groups are the only groups they know. A mathematician friend of mine told me that a physicist’s definition of a finite group is a Lie group without manifold structure.
G        SU(N)      SO(N)        Sp(N)       E6    E7    E8    F4    G2
dim G    N² − 1     ½N(N − 1)    N(2N + 1)   78    133   248   52    14
dim F    N          N            2N          27    56    248   26    7

Table 2. The classification of compact, semi-simple Lie algebras G, together with their dimension and the dimension of the fundamental representation F.

We will be interested in simple, compact Lie groups. Here “simple” means that we don’t
have any trivial U (1) factors floating around that commute with everything else. We
can always include such factors if we wish (and we will wish for the Standard Model)
but we’ll be best served if we ignore them at this stage. Meanwhile, “compact” means
that if you continue to rotate in the group then you ultimately come back to where you
started from (or close to where you started from). For example, the group of rotations
is compact, while the Lorentz group is non-compact because if you keep boosting in a
given direction then you just move faster and faster.

There is a classification of simple compact Lie algebras. The possible options for the group $G$, together with the dimension of the group, are shown in Table 2². All of these groups are referred to as non-Abelian meaning that things don’t commute with each other. In contrast, $U(1)$ is an Abelian group.

As we mentioned above, the $T^A$ in (1.74) are initially viewed as just abstract objects. But it’s interesting to ask when they can take a more concrete form in the guise of matrices. These are the representations of the algebra. For each algebra, there is an infinite list of numbers which are the dimensions of the matrices that can be used to represent it. The smallest such (non-trivial) matrix is called the fundamental representation and we will denote it as $F$. The dimension of $F$ for each Lie group $G$ is also shown in Table 2.

In what follows, we will (with a slight abuse of notation) use $T^A$ to refer to the generators of the fundamental representation. When we have occasion to use other representations $R$, we will refer to the generators as $T^A(R)$. (In later sections, we’ll also refer to these as $T^A_R$.) In fact, for the Standard Model we will only need two different representations: the fundamental and the adjoint. The adjoint is a representation that has dimension ${\rm dim(adj)} = {\rm dim}\,G$ with the generators given by

$$ T^A({\rm adj})^{BC} = -if^{ABC} . \tag{1.77} $$

² We’re using the convention $Sp(1) = SU(2)$. Other authors sometimes write $Sp(2N)$, or even $USp(2N)$, to refer to what we’ve called $Sp(N)$, preferring the argument to refer to the dimension of the fundamental representation $F$ rather than the rank of the Lie algebra $\mathfrak{g}$.

Don’t be lulled into thinking that you don’t need to consider other representations:
they will appear in other situations, including when we discuss flavour symmetry in
QCD in Section 3.

The Lie algebra comes with what, in fancy language, is called a Killing form. But,
by the time we’re thinking about matrices, this Killing form is just the trace. The
generators of any simple Lie algebra obey $\mathrm{Tr}\, T^A = 0$. (This is what it means for the
Lie algebra to be “simple”.) We take the generators in the fundamental representation
F to satisfy

$$ \mathrm{Tr}\, T^A T^B = \frac{1}{2} \delta^{AB} . \tag{1.78} $$

This can be viewed as tantamount to fixing the normalisation of the structure constants
$f^{ABC}$. Having fixed the normalisation in the fundamental representation, other
representations $T^A(R)$ will have different normalisations.

Before we proceed, an example. The simplest non-Abelian Lie group is SU(2), which
has dim(SU(2)) = 3 and structure constants given by $f^{ABC} = \epsilon^{ABC}$. In this case, the
fundamental representation is (up to an overall normalisation) the 2 × 2 Pauli matrices $\sigma^A$,

$$ T^A = \frac{1}{2} \sigma^A . \tag{1.79} $$

These indeed obey $[T^A, T^B] = i\epsilon^{ABC} T^C$, together with the normalisation condition
(1.78).
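
The SU(2) statements above are easy to check by brute force. The following is a minimal sketch in plain Python (no external libraries); the helper names `matmul`, `commutator`, `eps`, `obeys_algebra` and `trace_norm` are our own and purely illustrative. It builds the fundamental generators (1.79) and the adjoint generators (1.77) with $f^{ABC} = \epsilon^{ABC}$, then tests the algebra and the trace normalisation.

```python
# Brute-force check of the SU(2) algebra in two representations.
# All helper names here are illustrative, not taken from the lectures.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def eps(a, b, c):
    """Levi-Civita symbol with indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Fundamental generators T^A = sigma^A / 2, as in (1.79).
T_fund = [[[x / 2 for x in row] for row in s] for s in pauli]

# Adjoint generators (T^A)_{BC} = -i eps_{ABC}, i.e. (1.77) with f = eps.
T_adj = [[[-1j * eps(A, B, C) for C in range(3)] for B in range(3)] for A in range(3)]

def obeys_algebra(T):
    """True if [T^A, T^B] = i eps^{ABC} T^C for all A, B."""
    n = len(T[0])
    for A in range(3):
        for B in range(3):
            rhs = [[sum(1j * eps(A, B, C) * T[C][i][j] for C in range(3))
                    for j in range(n)] for i in range(n)]
            if not close(commutator(T[A], T[B]), rhs):
                return False
    return True

def trace_norm(T):
    """The 3x3 matrix of traces Tr(T^A T^B)."""
    return [[trace(matmul(T[A], T[B])) for B in range(3)] for A in range(3)]

print("fundamental obeys algebra:", obeys_algebra(T_fund))
print("adjoint obeys algebra:    ", obeys_algebra(T_adj))
print("Tr T^1 T^1 (fundamental): ", trace_norm(T_fund)[0][0])
print("Tr T^1 T^1 (adjoint):     ", trace_norm(T_adj)[0][0])
```

Both representations satisfy the same commutation relations, but the trace normalisation differs between them, which illustrates the remark that fixing (1.78) in the fundamental representation determines the normalisation elsewhere.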

The group SU (3) also plays a prominent role in the Standard Model. (In fact, as we
will see, it plays two prominent roles!) We will describe the structure constants and
the generators in Section 3.

1.3.3 Yang-Mills Theory


Now we can turn to some physics. Yang-Mills theory is a generalisation of Maxwell
theory in which the group U (1) is replaced by a simple, compact Lie group G. To
specify the Yang-Mills theory, we need only specify the choice of G together with
a coupling constant g > 0 that will dictate the strength of the interactions. (The
coupling constant g plays the same role as the charge e in Maxwell theory. As we will
later see, the phrase “coupling constant” is not particularly accurate because it will
turn out not to be constant!)

For each element of the algebra, we introduce a gauge field $A^A_\mu$ with A = 1, . . . , dim G.
These are then packaged into the Lie algebra-valued gauge potential

$$ A_\mu = A^A_\mu T^A . \tag{1.80} $$

A down-to-earth perspective is to think of the $T^A$ as matrices in the fundamental
representation. This means, for example, that for G = SU(N), the gauge potential $A_\mu$
is a 4-vector where each component is a traceless N × N matrix.

The fields $A^A_\mu$ are collectively referred to as gauge bosons. (They have other, more
specific, names in the Standard Model when we apply these ideas to the two nuclear
forces.) As in Maxwell theory, not all the information in $A_\mu$ is physical and any two
field configurations related by a gauge transformation should be viewed as equivalent.
This time, however, the gauge transformation is a little more intricate.

The action of the gauge symmetry is associated to a Lie group valued function over
spacetime,

$$ \Omega(x) \in G . \tag{1.81} $$

The set of all such transformations is known as the gauge group. As in Maxwell theory,
we will sometimes be sloppy and refer to the Lie group G as the gauge group, but
strictly speaking it is the much bigger group of maps from spacetime into G. The
action on the gauge field is

$$ A_\mu \to \Omega A_\mu \Omega^{-1} + \frac{i}{g}\, \Omega\, \partial_\mu \Omega^{-1} . \tag{1.82} $$

The first term is the expected transformation for an adjoint-valued field. The second,
inhomogeneous, term is an additional piece that is characteristic of gauge
transformations.

To make contact with gauge transformations in electromagnetism, suppose that we
have G = U(1) and write $\Omega(x) = e^{ie\alpha(x)}$. Then, using the fact that everything
commutes, we have

$$ \Omega A_\mu \Omega^{-1} + \frac{i}{e}\, \Omega\, \partial_\mu \Omega^{-1} = A_\mu + \partial_\mu \alpha \tag{1.83} $$

and the gauge transformation (1.82) reproduces the familiar gauge transformation of
Maxwell theory.

As in Maxwell theory, we can construct a field strength. Here too there is an extra
ingredient arising from the fact that $A_\mu$ is a matrix and the generalisation of (1.61) is

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu, A_\nu] . \tag{1.84} $$

In contrast to Maxwell theory, the field strength includes a non-linear term, proportional
to the coupling g. This will prove to be important: it is this non-linear term that
makes Yang-Mills theory significantly richer and more interesting than Maxwell theory.
Like $A_\mu$, the field strength is a Lie algebra-valued field and we could also expand it as
$F_{\mu\nu} = F^A_{\mu\nu} T^A$.

So far, I’ve not explained why (1.84) is the right field strength. The main reason is
that it transforms nicely under the gauge transformation (1.82),

$$ F_{\mu\nu} \to \Omega F_{\mu\nu} \Omega^{-1} . \tag{1.85} $$

To see this, you could just plug (1.82) into (1.84) but it’s mildly laborious; we will offer
a shortcut to this result presently.

The transformation (1.85) means that, in contrast to electromagnetism, the Yang-Mills
“electric field” $E_i = F_{0i}$ and “magnetic field” $B_i = -\frac{1}{2}\epsilon_{ijk} F_{jk}$ are not gauge
invariant. To construct something physical, you can multiply together some number of
$E_i$ and $B_j$ and then take the trace, which ensures that the $\Omega$ and $\Omega^{-1}$ in (1.85) cancel
and you get something gauge invariant. (You need something that is at least quadratic
in $F_{\mu\nu}$ because, for simple Lie groups, $\mathrm{Tr}\, F_{\mu\nu} = 0$.)

The gauge transformations above involve the Lie group valued object $\Omega(x)$. But one
of the key properties of Lie groups is that their structure is largely determined by the
elements that are infinitesimally close to the identity. This suggests that it’s fruitful to
look at gauge transformations that are everywhere close to the identity. These can be
written as

$$ \Omega(x) \approx 1 + ig\alpha^A(x) T^A + \ldots \tag{1.86} $$

where the $\alpha^A$ are taken to be everywhere small. From (1.82), the infinitesimal
transformation of the gauge field is $A_\mu \to A_\mu + \delta A_\mu$ with

$$ \delta A_\mu = \partial_\mu \alpha - ig[A_\mu, \alpha] \tag{1.87} $$

where $\alpha = \alpha^A T^A$ is the Lie algebra-valued infinitesimal transformation. It’s convenient
to write this as $\delta A_\mu = D_\mu \alpha$ where the covariant derivative is defined to be

$$ D_\mu \alpha = \partial_\mu \alpha - ig[A_\mu, \alpha] . \tag{1.88} $$

This is the covariant derivative acting on the Lie algebra-valued (i.e. adjoint) field $\alpha$.
We’ll soon see different covariant derivatives acting on other representations.

Now we can check how infinitesimal gauge transformations act on the field strength
(1.84). We have

$$ \begin{aligned}
\delta F_{\mu\nu} &= \partial_\mu \delta A_\nu - \partial_\nu \delta A_\mu - ig[A_\mu, \delta A_\nu] - ig[\delta A_\mu, A_\nu] \\
&= D_\mu \delta A_\nu - D_\nu \delta A_\mu \\
&= [D_\mu, D_\nu]\alpha .
\end{aligned} \tag{1.89} $$

We see that we’re left with the task of computing the commutator of two covariant
derivatives, acting on the adjoint field $\alpha$. This is a worthwhile and straightforward
calculation. We have

$$ [D_\mu, D_\nu]\alpha = -ig[F_{\mu\nu}, \alpha] . \tag{1.90} $$

This gives $\delta F_{\mu\nu} = ig[\alpha, F_{\mu\nu}]$ which is indeed the expected infinitesimal gauge
transformation arising from (1.85).
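
For readers who want to see the commutator of adjoint covariant derivatives worked out, here is one way to organise the calculation. It is a sketch that uses only the definition $D_\mu\alpha = \partial_\mu\alpha - ig[A_\mu,\alpha]$ and the Jacobi identity; the grouping of terms is our own choice.

```latex
\begin{align*}
D_\mu D_\nu \alpha
 &= \partial_\mu\big(\partial_\nu\alpha - ig[A_\nu,\alpha]\big)
    - ig\big[A_\mu,\ \partial_\nu\alpha - ig[A_\nu,\alpha]\big] \\
 &= \partial_\mu\partial_\nu\alpha - ig[\partial_\mu A_\nu,\alpha]
    - ig[A_\nu,\partial_\mu\alpha] - ig[A_\mu,\partial_\nu\alpha]
    - g^2\big[A_\mu,[A_\nu,\alpha]\big] .
\end{align*}
% The first, third and fourth terms are symmetric in (mu, nu) and cancel
% upon antisymmetrisation. The Jacobi identity gives
% [A_mu,[A_nu,alpha]] - [A_nu,[A_mu,alpha]] = [[A_mu,A_nu],alpha].
\begin{align*}
[D_\mu,D_\nu]\alpha
 &= -ig[\partial_\mu A_\nu - \partial_\nu A_\mu,\alpha]
    - g^2\Big(\big[A_\mu,[A_\nu,\alpha]\big] - \big[A_\nu,[A_\mu,\alpha]\big]\Big) \\
 &= -ig[\partial_\mu A_\nu - \partial_\nu A_\mu,\alpha]
    - g^2\big[[A_\mu,A_\nu],\alpha\big] \\
 &= -ig\big[\partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu,A_\nu],\ \alpha\big]
  = -ig[F_{\mu\nu},\alpha] .
\end{align*}
```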

The Yang-Mills Action


The dynamics of the Yang-Mills field is governed by the obvious generalisation of the
Maxwell action,

$$ S_{\rm YM} = -\frac{1}{2} \int d^4x \ \mathrm{Tr}\, F^{\mu\nu} F_{\mu\nu} . \tag{1.91} $$

Naively, the only difference lies in that overall trace, which ensures that the action
is invariant under gauge transformations (1.85). This also accounts for the overall
normalisation of the action, which comes with a factor of 1/2 rather than the 1/4 seen
in (1.63) because an additional factor of 1/2 comes from the trace in (1.78). This means
that the Yang-Mills and Maxwell actions come with the same normalisation.

However, the key difference between the two actions is buried in our notation: while
the Maxwell action is quadratic in $A_\mu$, the Yang-Mills action includes terms that are
cubic and quartic in $A_\mu$, both coming from the commutator in the definition of the
field strength (1.84).
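
To make the cubic and quartic terms explicit, one can expand the action in components. The following is a sketch, assuming the conventions $[T^A,T^B] = if^{ABC}T^C$ and $\mathrm{Tr}\,T^AT^B = \frac{1}{2}\delta^{AB}$ used above, with totally antisymmetric $f^{ABC}$; the component form of the field strength is derived here rather than quoted from the lectures.

```latex
\begin{align*}
F^A_{\mu\nu} &= \partial_\mu A^A_\nu - \partial_\nu A^A_\mu + g f^{ABC} A^B_\mu A^C_\nu , \\[4pt]
S_{\rm YM} &= -\frac{1}{2}\int d^4x\ \mathrm{Tr}\,F^{\mu\nu}F_{\mu\nu}
            = -\frac{1}{4}\int d^4x\ F^{A\,\mu\nu}F^A_{\mu\nu} \\
 &= \int d^4x\ \Big[
    -\frac{1}{4}\big(\partial_\mu A^A_\nu - \partial_\nu A^A_\mu\big)
                \big(\partial^\mu A^{A\nu} - \partial^\nu A^{A\mu}\big)
      && \text{(quadratic)} \\
 &\qquad\qquad
    - g f^{ABC}\,(\partial_\mu A^A_\nu)\,A^{B\mu}A^{C\nu}
      && \text{(cubic)} \\
 &\qquad\qquad
    - \frac{g^2}{4} f^{ABC}f^{ADE}\,A^B_\mu A^C_\nu A^{D\mu}A^{E\nu}
    \Big] .
      && \text{(quartic)}
\end{align*}
```

The quadratic piece is dim G copies of the Maxwell action, while the cubic and quartic pieces vanish when the structure constants do, as they must for U(1).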

The classical equations of motion are derived by extremising the action with respect
to each gauge field $A^A_\mu$. It is a simple exercise to check that they are given by

$$ D_\mu F^{\mu\nu} = 0 . \tag{1.92} $$

Here the covariant derivative is defined as in (1.88): $D_\mu F^{\mu\nu} = \partial_\mu F^{\mu\nu} - ig[A_\mu, F^{\mu\nu}]$.

These are the Yang-Mills equations. In contrast to the Maxwell equations, they are
non-linear. This means that the Yang-Mills fields interact with themselves.

There is also a Bianchi identity that follows from the definition (1.84) of $F_{\mu\nu}$ in terms
of the gauge field. This is best expressed by first introducing the dual field strength

$$ {}^\star F^{\mu\nu} = \frac{1}{2} \epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma} \tag{1.93} $$

and noting that this obeys the identity

$$ D_\mu {}^\star F^{\mu\nu} = 0 . \tag{1.94} $$

Both (1.92) and (1.94) are non-linear equations. However, the non-linearities come in
the form of commutators like $[A_\mu, A_\nu]$. This means that if we focus on field
configurations that sit purely within a subgroup U(1) ⊂ G, then the commutators vanish and
the equations reduce to those of Maxwell theory. So although the general solutions to
the Yang-Mills equations are surely complicated, we can always import any solution to
Maxwell theory and embed it in some U(1). In particular, Yang-Mills theory admits
solutions akin to electromagnetic waves that travel at the speed of light.

Although we can always embed solutions of Maxwell theory in the Yang-Mills field,
there’s nothing that tells us that these solutions are stable. For that, one has to work
harder and look at fluctuations of the other fields that do not live in your favourite
U (1). (For what it’s worth, a constant electric field is stable in Yang-Mills theory, while
a constant magnetic field is unstable.) We won’t discuss these stability issues further in
these lectures, largely because our interest lies in what happens in quantum Yang-Mills
rather than in the classical theory.

Just as for Maxwell theory, the need to keep gauge invariance means that we can’t
add a mass term like $A_\mu A^\mu$ or $\mathrm{Tr}\, A_\mu A^\mu$ to the action (1.91). This strongly suggests
that quantum Yang-Mills is, like Maxwell theory, a theory of massless particles. This
strong suggestion is, it turns out, completely wrong! When we quantise the Yang-Mills
action (1.91), we find a theory of interacting massive particles, rather than massless
particles. The reason for this can be traced to the interaction terms in Yang-Mills,
but is not fully understood. Indeed, proving it from first principles remains one of the
most important open problems in mathematical physics. We will discuss this further
in section 3.

Coupling to Matter
As with electromagnetism, we can couple the Yang-Mills field to matter. We do this
by requiring that the matter fields live in some representation R of the gauge group.
This means that the matter fields come in some vector of dimension dim R.

For each such representation, we have generators $T^A(R)$ which we can think of as
square matrices of dimension dim R. Dressed resplendent in all their indices, they take
the form

$$ T^A(R)^a{}_b \quad \text{with} \quad a, b = 1, \ldots, \dim R \quad \text{and} \quad A = 1, \ldots, \dim G . \tag{1.95} $$

Consider a scalar field $\phi$ in the representation R. Under a gauge transformation
$\Omega(x) = e^{ig\alpha^A(x)T^A}$, the scalar transforms as

$$ \phi^a \to (\Omega_R)^a{}_b\, \phi^b \quad \text{with} \quad (\Omega_R)^a{}_b = \Big( \exp\big( ig\alpha^A T^A(R) \big) \Big)^a{}_b . \tag{1.96} $$

Some representations R are real, and some are complex. For example, the fundamental
representation of SU(N) is complex, and so $\phi$ must be a complex N-dimensional
vector. Meanwhile, the adjoint representation of any group G is always real and,
correspondingly, $\phi$ can be real.

To write down an action for $\phi$ that is invariant under the gauge transformation (1.96),
we follow our Maxwellian noses and construct the covariant derivative,

$$ D_\mu \phi^a = \partial_\mu \phi^a - ig A^A_\mu\, T^A(R)^a{}_b\, \phi^b . \tag{1.97} $$

Under a gauge transformation, this covariant derivative transforms, as the name
suggests, covariantly, meaning

$$ D_\mu \phi^a \to (\Omega_R)^a{}_b\, D_\mu \phi^b . \tag{1.98} $$

We will later see that all matter fields in the Standard Model transform in the
fundamental representation. For SU(N), this means that we can think of $\phi^a$ as an
N-component complex vector, with a = 1, . . . , N, and write the covariant derivative in
terms of the N × N matrix-valued gauge field $A_\mu = A^A_\mu T^A$,

$$ D_\mu \phi^a = \partial_\mu \phi^a - ig (A_\mu)^a{}_b\, \phi^b . \tag{1.99} $$

This expression differs from our previous covariant derivative (1.88) because $\phi$ is in
the fundamental representation, while $\alpha$ in (1.88) was in the adjoint. This highlights
something we’ve stressed previously: the meaning of the covariant derivative depends
on the representation of the object on which it acts. Once again, covariant derivatives
do not commute. This time, for covariant derivatives acting on fundamental fields, we
find

$$ [D_\mu, D_\nu]\phi = -ig F_{\mu\nu} \phi . \tag{1.100} $$

This should be compared to the analogous result (1.90) for covariant derivatives acting
on adjoint-valued fields.
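
The commutator acting on a fundamental field takes only a couple of lines to verify. Here is a sketch, suppressing the fundamental index and using only $D_\mu\phi = \partial_\mu\phi - igA_\mu\phi$:

```latex
\begin{align*}
D_\mu D_\nu \phi
 &= (\partial_\mu - igA_\mu)(\partial_\nu\phi - igA_\nu\phi) \\
 &= \partial_\mu\partial_\nu\phi - ig(\partial_\mu A_\nu)\phi
    - igA_\nu\partial_\mu\phi - igA_\mu\partial_\nu\phi - g^2 A_\mu A_\nu\phi , \\[4pt]
[D_\mu,D_\nu]\phi
 &= -ig(\partial_\mu A_\nu - \partial_\nu A_\mu)\phi - g^2[A_\mu,A_\nu]\phi \\
 &= -ig\big(\partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu,A_\nu]\big)\phi
  = -igF_{\mu\nu}\phi .
\end{align*}
```

The terms symmetric in $\mu$ and $\nu$ drop out on antisymmetrisation, leaving the field strength acting by matrix multiplication rather than by a commutator as in the adjoint case.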

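As before, it’s useful to check some of the formulae for infinitesimal gauge
transformations. We have $\delta A_\mu = D_\mu \alpha$, as in (1.87) and, from (1.96), $\delta\phi = ig\alpha\phi$. Then,
suppressing the a = 1, . . . , N index, the covariant derivative (1.99) transforms as

$$ \begin{aligned}
\delta(D_\mu \phi) &= \partial_\mu \delta\phi - ig\, \delta A_\mu\, \phi - ig A_\mu\, \delta\phi \\
&= ig\, \partial_\mu(\alpha\phi) - ig (D_\mu \alpha)\phi + g^2 A_\mu \alpha \phi \\
&= ig\alpha (\partial_\mu \phi - ig A_\mu \phi) \\
&= ig\alpha D_\mu \phi .
\end{aligned} \tag{1.101} $$

This is, indeed, the infinitesimal version of the gauge transformation (1.98).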

With covariant derivatives that transform nicely, it’s straightforward to write down
an action for the matter fields. As in electromagnetism, we just need to replace the
partial derivatives in the action with covariant derivatives and we have something gauge
invariant. This holds for scalars, Weyl fermions, and Dirac fermions.

A Rescaling
Above we’ve written the action so that the coupling constant g multiplies the non-
linear terms. This means, in particular, that it makes an appearance in the field
strength (1.84). It also appears, perhaps rather strangely, as the inverse 1/g in the
gauge transformation (1.82).

There is a different way to normalise the gauge field that, for many purposes, turns
out to be more natural. We define the new gauge field

$$ \tilde{A}_\mu = g A_\mu \quad \text{and} \quad \tilde{F}_{\mu\nu} = \partial_\mu \tilde{A}_\nu - \partial_\nu \tilde{A}_\mu - i[\tilde{A}_\mu, \tilde{A}_\nu] . \tag{1.102} $$

We also define the rescaled gauge parameter $\tilde{\alpha} = g\alpha$, so that the group element is
$\Omega = e^{i\tilde{\alpha}}$. This then eliminates the gauge coupling from all kinematic quantities like the
field strength and covariant derivatives. The only place that the coupling shows up is
in an overall coefficient multiplying the entire action,

$$ S_{\rm YM} = -\frac{1}{2} \int d^4x \ \mathrm{Tr}\, F^{\mu\nu} F_{\mu\nu} = -\frac{1}{2g^2} \int d^4x \ \mathrm{Tr}\, \tilde{F}^{\mu\nu} \tilde{F}_{\mu\nu} . \tag{1.103} $$
In the first way of writing things, the coupling constant g sits in front of the non-linear
terms, making it clear that it governs the strength of interactions. But it also governs
the strength of interactions in the second way of writing things. To see this, note that
in the Euclidean path integral, we sum over all field configurations weighted by $e^{-S/\hbar}$.
With the rescaling above, $g^2$ sits in the same place in the action as $\hbar$, which suggests
that $g^2 \to 0$ will be a classical limit. Heuristically you should think that, for $g^2$ small,
we pay a large price for field configurations that do not minimize the action; in this
way, the path integral is dominated by the classical configurations. In contrast, when
$g^2 \to \infty$, the Yang-Mills action disappears completely. This is the strong coupling
regime, where all field configurations are unsuppressed and contribute equally to the
path integral.
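
The equality of the two ways of writing the action in (1.103) rests on a one-line rescaling of the field strength. As a sketch, assuming g is a constant so that it passes through the derivatives:

```latex
\begin{align*}
\tilde F_{\mu\nu}
 &= \partial_\mu(gA_\nu) - \partial_\nu(gA_\mu) - i[gA_\mu, gA_\nu] \\
 &= g\big(\partial_\mu A_\nu - \partial_\nu A_\mu - ig[A_\mu,A_\nu]\big)
  = g\,F_{\mu\nu} , \\[4pt]
\frac{1}{g^2}\,\mathrm{Tr}\,\tilde F^{\mu\nu}\tilde F_{\mu\nu}
 &= \mathrm{Tr}\,F^{\mu\nu}F_{\mu\nu} .
\end{align*}
```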

The Analogy with General Relativity


General Relativity is rightly lauded for the way it places geometry into the heart of
physics. But the other laws of physics, which combine to form the Standard Model, are
no less geometrical. Rather than arising from the geometry of spacetime, they instead
arise from a slightly more subtle object known as a fibre bundle.

We won’t describe the mathematics of fibre bundles in any detail in these lectures,
but will instead just point out some analogies between the gauge theories discussed
above and the differential geometry that underlies general relativity.

One of the key ideas in general relativity is diffeomorphism invariance. This is
the statement that physical quantities should not depend on the coordinates that we
choose to describe them. Such coordinate transformations are analogous to gauge
transformations in Yang-Mills theory.

One of the most important objects in general relativity is the Levi-Civita connection
$\Gamma^\mu_{\rho\nu}$. Famously, this is not a tensor. Under a coordinate transformation $x \to \tilde{x}$, with

$$ \Omega^\mu{}_\nu = \frac{\partial x^\mu}{\partial \tilde{x}^\nu} , \tag{1.104} $$

the Levi-Civita connection transforms as

$$ \Gamma^\mu_{\rho\nu} \to (\Omega^{-1})^\mu{}_\tau\, \Omega^\sigma{}_\rho\, \Omega^\lambda{}_\nu\, \Gamma^\tau_{\sigma\lambda} + (\Omega^{-1})^\mu{}_\tau\, \Omega^\sigma{}_\rho\, \partial_\sigma \Omega^\tau{}_\nu . \tag{1.105} $$

The first term is how a tensor would transform. The second term is independent of $\Gamma$
and is the characteristic transformation of a connection. But this looks very similar to
the transformation of the gauge field (1.82),

$$ A_\mu \to \Omega A_\mu \Omega^{-1} + \frac{i}{g}\, \Omega\, \partial_\mu \Omega^{-1} \tag{1.106} $$

where, again, there is a transformation that befits a tensor, supplemented with the
additional derivative term $\partial\Omega$. Indeed, this analogy can be made more precise, and
mathematicians refer to the gauge field $A_\mu$ as a connection. Both connections find
their natural home inside covariant derivatives. In gauge theory, this is the $D_\mu$ that
we’ve already met, while in general relativity it is the object that acts naturally on
vector fields Y, with $(\nabla_\nu Y)^\mu = \partial_\nu Y^\mu + \Gamma^\mu_{\nu\rho} Y^\rho$, and is then extended to act on other
tensor fields.

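Given a Levi-Civita connection, one can construct the Riemann curvature tensor
$R^\sigma{}_{\rho\mu\nu}$. Rearranging some of the indices this can be written as

$$ (R_{\mu\nu})^\sigma{}_\rho = \partial_\mu \Gamma^\sigma_{\nu\rho} - \partial_\nu \Gamma^\sigma_{\mu\rho} + \Gamma^\lambda_{\nu\rho} \Gamma^\sigma_{\mu\lambda} - \Gamma^\lambda_{\mu\rho} \Gamma^\sigma_{\nu\lambda} . \tag{1.107} $$

Again, we see an immediate similarity with the construction of the field strength in
Yang-Mills (1.84) which, including the a, b = 1, . . . , dim F indices, reads

$$ (F_{\mu\nu})^a{}_b = \partial_\mu (A_\nu)^a{}_b - \partial_\nu (A_\mu)^a{}_b - ig (A_\mu)^a{}_c (A_\nu)^c{}_b + ig (A_\nu)^a{}_c (A_\mu)^c{}_b . \tag{1.108} $$

Mathematicians refer to both the Riemann tensor and the field strength $F_{\mu\nu}$ as the
curvature.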

1.4 C, P, and T


Discrete symmetries play a crucial role in understanding the structure of the Standard
Model. There are three that are particularly important: parity, charge conjugation, and
time reversal. In this section, we describe each of these in turn. We end by explaining
why the combination of all three is necessarily a symmetry of any local, relativistic
quantum field theory.

1.4.1 Parity
Parity is an inversion of the spatial coordinates,

$$ P : (t, \mathbf{x}) \mapsto (t, -\mathbf{x}) . \tag{1.109} $$

This can be viewed as a Lorentz transformation, but not one that is continuously
connected to the identity. Roughly speaking, the action of parity mimics what a system
looks like reflected in the mirror. More precisely, a reflection is implemented by, say,
$R : (x, y, z) \mapsto (-x, y, z)$. The parity transformation (1.109), which is a reflection
followed by a rotation by 180°, has the advantage that it treats all spatial coordinates
on the same footing.

(As an aside: one disadvantage of the parity transformation $P : \mathbf{x} \mapsto -\mathbf{x}$ is that it
only works when the number of spatial dimensions is odd. For example, in d = 2 + 1
dimensions, the transformation $(x, y) \mapsto (-x, -y)$ is just a rotation by 180°. For this
reason, if you’re discussing quantum field theories in different dimensions, it’s better to
talk about reflections which flip the sign of just one spatial direction, rather than parity
which flips all of them. In these lectures, we’ve got no interest in dimension hopping:
our interest is strictly in the Standard Model and so we keep with the conventional
definition of parity (1.109).)
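
Both geometric claims above can be checked with a few small matrices. The following plain-Python sketch (the helper names `det`, `rotation_about_x` and `inversion` are our own) verifies that a reflection followed by a 180° rotation about the reflected axis gives $\mathbf{x} \mapsto -\mathbf{x}$ in three dimensions, and that the inversion has determinant +1 in two spatial dimensions, so it is a rotation there, but determinant −1 in three.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def rotation_about_x(theta):
    """Rotation by angle theta about the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def inversion(d):
    """The map x -> -x in d spatial dimensions."""
    return [[-1 if i == j else 0 for j in range(d)] for i in range(d)]

# R: (x, y, z) -> (-x, y, z), then rotate by 180 degrees about the x-axis.
reflection = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]
parity_3d = matmul(rotation_about_x(math.pi), reflection)

is_minus_identity = all(
    abs(parity_3d[i][j] - inversion(3)[i][j]) < 1e-12
    for i in range(3) for j in range(3)
)

print("reflection + rotation equals -1:", is_minus_identity)
print("det of inversion in d = 2:", det(inversion(2)))
print("det of inversion in d = 3:", det(inversion(3)))
```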

We would like to understand the circumstances under which a quantum field theory
is invariant under parity, and how the fields transform. When we come to discuss the
weak force in Section 5, we will find that the laws of our universe are not invariant
under parity. This is a shocking statement. It means that given a solution to the
equations of motion, the parity reflected evolution is not a solution!

First, let’s ask how electromagnetic fields transform under parity. For this, we can
look at the covariant derivative which, regardless of the object it acts on, takes the
schematic form

$$ D_\mu = \partial_\mu - iA_\mu . \tag{1.110} $$

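This ties the behaviour of the gauge field to that of the derivative. Under a parity
transformation $\partial_0$ is left unaffected, while the spatial derivatives $\partial_i$ change sign. This
tells us that parity must act as

$$ P : A_0(t, \mathbf{x}) \mapsto +A_0(t, -\mathbf{x}) \quad \text{and} \quad P : A_i(t, \mathbf{x}) \mapsto -A_i(t, -\mathbf{x}) . \tag{1.111} $$

Tracing this through to the definitions of the electric field $\mathbf{E} = -\nabla A_0 - \partial\mathbf{A}/\partial t$ and
magnetic field $\mathbf{B} = \nabla \times \mathbf{A}$, we have

$$ P : \mathbf{E}(t, \mathbf{x}) \mapsto -\mathbf{E}(t, -\mathbf{x}) \quad \text{and} \quad P : \mathbf{B}(t, \mathbf{x}) \mapsto +\mathbf{B}(t, -\mathbf{x}) . \tag{1.112} $$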

Vectors like E, which transform under parity in the same way as x are deemed worthy
to keep the name “vector”. Meanwhile, vectors like B which don’t pick up a minus sign
under parity are said to be pseudovectors. The most familiar examples of pseudovectors
are the magnetic field and angular momentum L = x ⇥ p. These are also the two kinds
of vectors that exhibit the most counterintuitive behaviour when we’re undergraduates.
This is not a coincidence.
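
The reason a cross product is a pseudovector is that the two sign flips of its factors cancel. Here is a small plain-Python illustration using angular momentum; the sample values of x and p are arbitrary and chosen only for demonstration.

```python
def cross(a, b):
    """Cross product of two 3-vectors stored as lists."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def parity(v):
    """Action of parity on an ordinary (polar) vector: v -> -v."""
    return [-c for c in v]

# Arbitrary sample position and momentum (illustrative values only).
x = [1.0, 2.0, -0.5]
p = [0.3, -1.0, 2.0]

L_before = cross(x, p)
L_after = cross(parity(x), parity(p))  # both factors flip sign

print("L before parity:", L_before)
print("L after parity: ", L_after)
```

The position and momentum each pick up a minus sign, but $\mathbf{L} = \mathbf{x} \times \mathbf{p}$ is unchanged, which is the defining behaviour of a pseudovector.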

In the quantum theory, the parity transformation is enacted by a unitary operator
on the Hilbert space that we also call P. The fields $A_\mu(x)$ are now also operators and
the transformation (1.111) becomes

$$ P A_0(t, \mathbf{x}) P^\dagger = A_0(t, -\mathbf{x}) \quad \text{and} \quad P A_i(t, \mathbf{x}) P^\dagger = -A_i(t, -\mathbf{x}) . \tag{1.113} $$

In what follows, we will flit between the description of parity and other discrete sym-
metries as a map, as in (1.111), and as an operator acting on a Hilbert space, as in
(1.113).

Next, we turn to spinors. It can be somewhat fiddly to figure out how spinors
transform under various discrete symmetries, but it’s a topic that will play a crucial
role as we proceed. The equation of motion for a left-handed massless Weyl spinor $\psi_L$
is

$$ i\bar{\sigma}^\mu \partial_\mu \psi_L = 0 \tag{1.114} $$

where $\bar{\sigma}^\mu = (1, -\sigma^i)$. Under a parity transformation, the spatial derivative changes sign
and the Weyl equation (1.114) is not invariant. This is important: if we have just a
single left-handed Weyl spinor $\psi_L$ then this theory is not invariant under parity.

We can rescue the situation if, in addition to our left-handed Weyl spinor $\psi_L$, we
also have a right-handed Weyl spinor $\psi_R$. This obeys the equation of motion

$$ i\sigma^\mu \partial_\mu \psi_R = 0 \tag{1.115} $$

where $\sigma^\mu = (1, \sigma^i)$. The different minus signs in $\sigma^\mu$ and $\bar{\sigma}^\mu$ mean that we can
compensate for a parity transformation if we also exchange left- and right-handed spinors, so
that

$$ P \psi_L(t, \mathbf{x}) P^\dagger = \psi_R(t, -\mathbf{x}) \quad \text{and} \quad P \psi_R(t, \mathbf{x}) P^\dagger = \psi_L(t, -\mathbf{x}) . \tag{1.116} $$

There are also options to put different minus signs (and even phases) on the right-hand
side as we describe below.

As we’ve seen in Section 1.2.1, the two spinors $\psi_L$ and $\psi_R$ naturally sit in a Dirac
spinor $\psi = (\psi_L, \psi_R)^T$. The action of parity on Weyl spinors (1.116) translates into the
action on the Dirac spinor

$$ P \psi(t, \mathbf{x}) P^\dagger = \gamma^0 \psi(t, -\mathbf{x}) \quad \text{with} \quad \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \tag{1.117} $$

In the lectures on Quantum Field Theory, we saw that a stationary fermion is associated
to a solution to the Dirac equation, where the spinor degrees of freedom take the form
$\psi = (\xi, \xi)^T$. Here $\xi$ is some 2-component spinor that tells us the orientation of the
spin of the particle. Meanwhile, the solution corresponding to an anti-fermion takes
the form $\psi = (\xi, -\xi)^T$. This means that the fermion has intrinsic parity +1 while the
anti-fermion has intrinsic parity −1.

Terms in the action are always constructed out of an even number of fermions. Given
the transformation (1.117), we can look at the fate of various fermion bilinears under
parity. You can check, for example, that

$$ P : \bar{\psi}\psi \mapsto \bar{\psi}\psi \quad \text{and} \quad P : \bar{\psi}\gamma^5\psi \mapsto -\bar{\psi}\gamma^5\psi \tag{1.118} $$

where we’ve suppressed the all-important spinor indices. We say that $\bar{\psi}\psi$ transforms as
a scalar while $\bar{\psi}\gamma^5\psi$ transforms as a pseudoscalar. Similarly, you can check that $\bar{\psi}\gamma^\mu\psi$
is a vector while $\bar{\psi}\gamma^5\gamma^\mu\psi$ is a pseudovector.
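
The parity properties of the bilinears reduce to matrix identities: under $\psi \mapsto \gamma^0\psi$ we have $\bar\psi \mapsto \bar\psi\gamma^0$, so a bilinear $\bar\psi M \psi$ becomes $\bar\psi\,\gamma^0 M \gamma^0\,\psi$. The sketch below checks the sign of $\gamma^0 M \gamma^0$ for each case in plain Python. It assumes the chiral basis $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0\end{pmatrix}$ and takes $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; the overall sign convention for $\gamma^5$ is an assumption here and does not affect the results. All helper names are our own.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

zero = [[0, 0], [0, 0]]
one = [[1, 0], [0, 1]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Chiral-basis gamma matrices (assumed convention, see lead-in).
gamma = [block(zero, one, one, zero)] + [block(zero, s, scale(-1, s), zero) for s in sigma]
gamma5 = scale(1j, matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3])))
identity = block(one, zero, zero, one)

def parity_sign(m):
    """Return +1 or -1 if gamma0 M gamma0 = +M or -M, else None."""
    image = matmul(matmul(gamma[0], m), gamma[0])
    if close(image, m):
        return 1
    if close(image, scale(-1, m)):
        return -1
    return None

signs = {
    "scalar": parity_sign(identity),
    "pseudoscalar": parity_sign(gamma5),
    "vector (time)": parity_sign(gamma[0]),
    "vector (space)": [parity_sign(g) for g in gamma[1:]],
    "pseudovector (time)": parity_sign(matmul(gamma5, gamma[0])),
    "pseudovector (space)": [parity_sign(matmul(gamma5, g)) for g in gamma[1:]],
}

for name, value in signs.items():
    print(name, value)
```

The vector bilinear has an even time component and odd spatial components, just like $(t, \mathbf{x})$ itself, while the pseudovector has the opposite pattern.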

You shouldn’t be too dogmatic about insisting that (1.116) and (1.117) are the
definitive action of parity. Suppose that you have a Dirac fermion with action

$$ S = \int d^4x \, \Big( i\bar{\psi}\gamma^\mu \partial_\mu \psi - M\bar{\psi}\psi \Big) . \tag{1.119} $$

Then this is invariant under parity with the transformation (1.117). Suppose, in
contrast, that you’re given the action

$$ S = \int d^4x \, \Big( i\bar{\psi}\gamma^\mu \partial_\mu \psi - M\bar{\psi}\gamma^5\psi \Big) . \tag{1.120} $$

This is not invariant under (1.117) because the mass term is parity odd. Nonetheless,
that doesn’t mean that the theory doesn’t have parity symmetry. We just need to look
more carefully. You can check that the action (1.120) is invariant under the redefined
parity transformation

$$ P' \psi(t, \mathbf{x}) P'^\dagger = \gamma^5 \gamma^0 \psi(t, -\mathbf{x}) . \tag{1.121} $$

In terms of Weyl fermions, this inserts an extra minus sign on the right-hand side of
one of the transformations in (1.116). Ultimately, given a theory the aim is to find
some parity transformation of the fields that leaves the action, and hence the equation
of motion, invariant.

So far, we haven’t discussed the action of parity on scalar fields. These are more
malleable. Given a scalar field $\phi$, the kinetic terms are invariant under either

$$ P \phi(t, \mathbf{x}) P^\dagger = \pm\phi(t, -\mathbf{x}) . \tag{1.122} $$

In other words, the kinetic terms don’t distinguish between scalar (the plus sign) and
pseudoscalar (the minus sign). Typically, this gets fixed when we look at the interaction
of the scalar field with fermions. For example, a Yukawa term of the form $\phi\bar{\psi}\psi$ means
that the scalar is parity even under the transformation (1.117) while a Yukawa term
of the form $\phi\bar{\psi}\gamma^5\psi$ means that $\phi$ is parity odd under (1.117).

There are various pay-offs from understanding the way that parity is implemented
in a theory. If a theory is invariant under parity then, as we’ve seen, we can assign
transformation laws to the various fields. But, after quantisation, these fields give rise
to particles. That means that different species of particles can be thought of as parity
even or parity odd. Moreover, this concept of parity is conserved in all interactions and,
like all conservation laws, this puts constraints on the kind of things that can happen.

Perhaps surprisingly, it turns out that things are even more constrained when parity
is not a symmetry of the theory! This is for a much more subtle reason known as an
anomaly. We will discuss this in Section 4.

1.4.2 Charge Conjugation


Charge conjugation is an operation that switches particles with their anti-particles. If
a theory is invariant under charge conjugation, then the laws of physics that govern
particles coincide with those that govern anti-particles.

This time we start with a complex scalar field $\phi$, coupled to electromagnetism. It will
prove simplest to look at actions, rather than equations of motion. Charge conjugation
exchanges particles and anti-particles, so we want it to act as

$$ C : \phi \mapsto \pm\phi^\dagger . \tag{1.123} $$

The ± ambiguity is like the ambiguity in the action of parity (1.122) and, as in that
case, will typically be fixed by the interactions with other fields. In contrast, there’s
no ambiguity about the action on the gauge field, which is fixed by looking at the
covariant derivatives, $D_\mu\phi = (\partial_\mu - ieA_\mu)\phi$ and $D_\mu\phi^\dagger = (\partial_\mu + ieA_\mu)\phi^\dagger$. This means that
any transformation (1.123) must be accompanied by

$$ C : A_\mu \mapsto -A_\mu . \tag{1.124} $$

As for parity, we can also think of charge conjugation as a quantum operator C, in which
case (1.123) and (1.124) are replaced by $C\phi C^\dagger = \pm\phi^\dagger$ and $CA_\mu C^\dagger = -A_\mu$ respectively.
For non-Abelian gauge fields, charge conjugation acts as $CA_\mu C^\dagger = -A_\mu^\dagger$.

Again, the story for spinors is a little more fiddly. We’ll start by looking at a Dirac
spinor, rather than a Weyl spinor. The Dirac equation is

$$ i\gamma^\mu(\partial_\mu - ieA_\mu)\psi - M\psi = 0 . \tag{1.125} $$

We will look for an action of charge conjugation that transforms the spinor to

$$ C : \psi \mapsto C\psi^\star . \tag{1.126} $$

Here C on the right-hand side is a 4 × 4 matrix that allows for the possibility that
the components of the spinor get mixed up under charge conjugation. Note that we’ve
written the transformed spinor as $\psi^\star$, rather than $\psi^\dagger$, to emphasise that it remains a
“column vector” rather than a “row vector”. (Of course, it’s not really a vector at all.
It’s a spinor!)

The question is: what choice of C ensures that the transformation (1.126), combined
with (1.124), is a symmetry? First, we take the complex conjugate of the equation of
motion (1.125):

$$ -i(\gamma^\mu)^\star(\partial_\mu + ieA_\mu)\psi^\star - M\psi^\star = 0 . \tag{1.127} $$

This is the equation that $\psi^\star$ obeys. Next, we compare this to what we get if we act
with charge conjugation on the original equation (1.125):

$$ \begin{aligned}
& i\gamma^\mu(\partial_\mu + ieA_\mu)C\psi^\star - MC\psi^\star = 0 \\
\Longrightarrow \quad & iC^\dagger\gamma^\mu C(\partial_\mu + ieA_\mu)\psi^\star - M\psi^\star = 0 .
\end{aligned} \tag{1.128} $$

We see that (1.128) coincides with (1.127) provided that the charge conjugation matrix
C obeys

$$ C^\dagger \gamma^\mu C = -(\gamma^\mu)^\star . \tag{1.129} $$

The charge conjugation matrix depends on your chosen basis of gamma matrices. For
the chiral basis of gamma matrices (1.42), all gamma matrices are real except for $\gamma^2$
which is pure imaginary. This means that we should take $C = \pm i\gamma^2$, and the action of
charge conjugation is

$$ C : \psi \mapsto \pm i\gamma^2\psi^\star \quad \text{with} \quad \gamma^2 = \begin{pmatrix} 0 & \sigma^2 \\ -\sigma^2 & 0 \end{pmatrix} . \tag{1.130} $$
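
The defining condition on the charge conjugation matrix can be confirmed numerically. The sketch below, in plain Python with our own helper names, assumes the chiral basis with $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix}$ and $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0\end{pmatrix}$, and checks that $C = i\gamma^2$ is unitary and obeys $C^\dagger\gamma^\mu C = -(\gamma^\mu)^\star$ for all four gamma matrices.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def conj(m):
    """Entrywise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in m]

def dagger(m):
    """Hermitian conjugate."""
    c = conj(m)
    return [[c[j][i] for j in range(len(m))] for i in range(len(m))]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

zero = [[0, 0], [0, 0]]
one = [[1, 0], [0, 1]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Chiral-basis gamma matrices (assumed convention, see lead-in).
gamma = [block(zero, one, one, zero)] + [block(zero, s, scale(-1, s), zero) for s in sigma]
identity = block(one, zero, zero, one)

# Candidate charge conjugation matrix C = i gamma^2.
C = scale(1j, gamma[2])

is_unitary = close(matmul(dagger(C), C), identity)
satisfies_condition = all(
    close(matmul(matmul(dagger(C), g), C), scale(-1, conj(g))) for g in gamma
)

print("C is unitary:", is_unitary)
print("C^dagger gamma^mu C = -(gamma^mu)^*:", satisfies_condition)
```

The choice $C = -i\gamma^2$ passes the same checks, since the condition is quadratic in C.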

For theories that are invariant under charge conjugation, we can assign an eigenvalue
C = ±1 to each particle, usually referred to as C-parity. As with actual parity, P,
this new quantum number restricts the possible interactions. For example, it turns out
that the neutral pion $\pi^0$ has C = +1 while, from (1.124), the photon necessarily has
C = −1. This means that the decay to two photons, $\pi^0 \to \gamma + \gamma$, is allowed (and
indeed, happens over 98% of the time). But the decay to three photons, $\pi^0 \to \gamma + \gamma + \gamma$,
is forbidden on symmetry grounds.

If we decompose the Dirac fermion into its two Weyl components, $\psi = (\psi_L, \psi_R)^T$,
then we can read off from (1.130) the action of charge conjugation on Weyl spinors,

$$ C : \psi_L \mapsto \pm i\sigma^2\psi_R^\star \quad \text{and} \quad C : \psi_R \mapsto \mp i\sigma^2\psi_L^\star . \tag{1.131} $$

We see that charge conjugation, like parity, involves an exchange of two Weyl spinors.

A theory with just a single Weyl fermion is invariant under neither parity nor charge
conjugation. However, there’s still hope if we combine the two symmetries. We can
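take the combined action from (1.116) and (1.131) to be

$$ CP : \psi_L(t, \mathbf{x}) \mapsto \mp i\sigma^2\psi_L^\star(t, -\mathbf{x}) \quad \text{and} \quad CP : \psi_R(t, \mathbf{x}) \mapsto \pm i\sigma^2\psi_R^\star(t, -\mathbf{x}) . \tag{1.132} $$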

A Weyl fermion coupled to a gauge field is invariant under CP. However, as we will see
later, it’s quite possible for this symmetry to be violated by other interaction terms
(for example, Yukawa interactions between fermions and scalars).

1.4.3 Time Reversal


Our final discrete symmetry is time reversal, which acts on spacetime coordinates as

$$ T : (t, \mathbf{x}) \mapsto (-t, \mathbf{x}) . \tag{1.133} $$

There’s a subtlety in implementing time reversal symmetry in quantum theories. This
manifests itself already in the simplest quantum mechanical systems like, say, a free
particle moving in $\mathbb{R}^3$. The Schrödinger equation for the wavefunction $\psi$ takes the form

$$ i\frac{\partial\psi}{\partial t} = -\nabla^2\psi . \tag{1.134} $$

Now compare this to the heat equation that describes how conserved quantities, such
as temperature T, diffuse in a system

$$ \frac{\partial T}{\partial t} = \nabla^2 T . \tag{1.135} $$

The heat equation most certainly isn’t time reversal invariant since the left-hand side
picks up a minus sign, while the right-hand side does not. That’s to be expected: after
all, diffusion is a process that increases entropy and there’s a clear arrow of time as
things spread out. In contrast, there’s no increase in entropy for a single quantum
particle and we do expect the physics to be invariant under time reversal. Yet the
Schrödinger equation is almost identical to the heat equation in structure. How can
one be time reversal invariant, and the other not?

Almost identical, but not quite. The key is that factor of i in the Schrödinger
equation that is not there in the heat equation. Suppose that $\psi(t)$ is a solution to
the Schrödinger equation. Then $\psi(-t)$ is not a solution but the factor of i means that
$\psi^\star(-t)$ is. That’s the clue that we need: time reversal in quantum mechanics acts as

$$ T : \psi(t) \mapsto \psi^\star(-t) . \tag{1.136} $$
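
This claim is simple to test on a plane wave. The sketch below, in plain Python, works in one spatial dimension with the free equation $i\,\partial_t\psi = -\partial_x^2\psi$ and the solution $\psi = e^{i(kx - k^2 t)}$. The wavenumber `K`, the sample point and the finite-difference step are arbitrary choices made for illustration, and `residual` is our own helper that evaluates $i\,\partial_t f + \partial_x^2 f$ numerically.

```python
import cmath

K = 1.3  # arbitrary wavenumber, chosen for illustration

def psi(t, x):
    """Plane-wave solution of i d/dt psi = - d^2/dx^2 psi."""
    return cmath.exp(1j * (K * x - K * K * t))

def psi_reversed(t, x):
    """Naive time reversal: psi(-t, x)."""
    return psi(-t, x)

def psi_reversed_conj(t, x):
    """Time reversal with complex conjugation: psi*(-t, x)."""
    return psi(-t, x).conjugate()

def residual(f, t, x, h=1e-4):
    """i df/dt + d^2 f/dx^2 by central differences; vanishes on solutions."""
    dt = (f(t + h, x) - f(t - h, x)) / (2 * h)
    dxx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    return 1j * dt + dxx

t0, x0 = 0.7, 0.2  # arbitrary sample point
print("psi(t):    ", abs(residual(psi, t0, x0)))
print("psi(-t):   ", abs(residual(psi_reversed, t0, x0)))
print("psi*(-t):  ", abs(residual(psi_reversed_conj, t0, x0)))
```

The first and third residuals vanish up to discretisation error, while the second is of order $2k^2$, confirming that the complex conjugation is essential.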

Viewed as an operator acting on the Hilbert space, this complex conjugation translates
into the requirement that T is an anti-unitary operator, rather than the more familiar
unitary operator. This means that, acting on states, we have

$$ T(\alpha|\psi_1\rangle + \beta|\psi_2\rangle) = \alpha^\star T|\psi_1\rangle + \beta^\star T|\psi_2\rangle . \tag{1.137} $$

In addition, the operator obeys

$$ \langle T\psi_1 | T\psi_2 \rangle = \langle \psi_1 | \psi_2 \rangle^\star . \tag{1.138} $$

See the lectures on Topics in Quantum Mechanics for more discussion of the action of
the time reversal in quantum mechanics.

This anti-linear behaviour changes some of the transformation properties of fields.


For example, you might naively think, following (1.111), that A0 would be odd under
time reversal and Ai even. But, in fact, it’s the opposite way around because there’s an
additional factor of i in the covariant derivative Dµ = @µ ieAµ which gets conjugated.
It means that the action of time reversal on the gauge field is

T : A0 (t, x) 7! +A0 ( t, x) and T : Ai (t, x) 7! Ai ( t, x) . (1.139)

Tracing this through to the electric field $\mathbf{E} = -\nabla A_0 - \partial\mathbf{A}/\partial t$ and magnetic field
$\mathbf{B} = \nabla \times \mathbf{A}$, we have

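$$T : \mathbf{E}(t, \mathbf{x}) \mapsto +\mathbf{E}(-t, \mathbf{x}) \quad\text{and}\quad T : \mathbf{B}(t, \mathbf{x}) \mapsto -\mathbf{B}(-t, \mathbf{x}) . \tag{1.140}$$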

This makes sense: it’s the same transformation that we get from the Lorentz force law
$m\ddot{\mathbf{x}} = q(\mathbf{E} + \dot{\mathbf{x}} \times \mathbf{B})$.
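For completeness, here is the short calculation behind (1.140), using only the transformation (1.139) and the definitions $\mathbf{E} = -\nabla A_0 - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$. Writing $s = -t$, so that $\partial/\partial t = -\partial/\partial s$,
$$\mathbf{E}(t,\mathbf{x}) \mapsto -\nabla A_0(s,\mathbf{x}) - \frac{\partial}{\partial t}\big(-\mathbf{A}(s,\mathbf{x})\big) = -\nabla A_0(s,\mathbf{x}) - \frac{\partial \mathbf{A}}{\partial s}(s,\mathbf{x}) = \mathbf{E}(s,\mathbf{x}) ,$$
$$\mathbf{B}(t,\mathbf{x}) \mapsto \nabla\times\big(-\mathbf{A}(s,\mathbf{x})\big) = -\mathbf{B}(s,\mathbf{x}) .$$
On the particle side, a reversed trajectory $\mathbf{x}(-t)$ has the same acceleration $\ddot{\mathbf{x}}$ but the opposite velocity $\dot{\mathbf{x}}$, so the Lorentz force law keeps its form precisely when $\mathbf{E}$ is even and $\mathbf{B}$ is odd.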

What about fermions? Once again, the action of time reversal can mix the different
components of a Dirac spinor. As we now show, it turns out that (for our chiral basis
of gamma matrices (1.42)) the correct transformation is
$$T : \psi(t, \mathbf{x}) \mapsto \Theta\,\psi(-t, \mathbf{x}) \quad\text{where}\quad \Theta = \gamma^1\gamma^3 . \tag{1.141}$$

As for other transformations, we could also include a minus sign on the right-hand
side. To see that (1.141) is indeed a symmetry, consider the action of time reversal
on the Dirac equation (1.125). Remembering that time reversal also acts by complex
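conjugation (so, for example, changes $\gamma^\mu$ to $(\gamma^\mu)^\star$), we have
$$\begin{aligned} -i\left(-(\gamma^0)^\star D_0 + (\gamma^i)^\star D_i\right)\Theta\psi - M\Theta\psi &= 0 \\ \Longrightarrow\quad i\,\Theta^{-1}\left((\gamma^0)^\star D_0 - (\gamma^i)^\star D_i\right)\Theta\psi - M\psi &= 0 . \end{aligned} \tag{1.142}$$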

This gives us back the original Dirac equation if the matrix $\Theta$ obeys
$$\Theta^{-1}(\gamma^0)^\star\Theta = \gamma^0 \quad\text{and}\quad \Theta^{-1}(\gamma^i)^\star\Theta = -\gamma^i . \tag{1.143}$$

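It’s simple to check that, for the chiral basis of gamma matrices (1.42), $\Theta = \gamma^1\gamma^3$
does the job. We can also translate this to the action on the component Weyl spinors
$\psi = (\psi_L, \psi_R)^T$,
$$T : \psi_L(t, \mathbf{x}) \mapsto i\sigma^2\psi_L(-t, \mathbf{x}) \quad\text{and}\quad T : \psi_R(t, \mathbf{x}) \mapsto i\sigma^2\psi_R(-t, \mathbf{x}) . \tag{1.144}$$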

We see that time reversal, like CP, does not mix the left- and right-handed Weyl spinors.
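The conditions (1.143) and the block form behind (1.144) are quick to verify numerically. The sketch below assumes that the chiral basis (1.42) is the common choice with $\gamma^0$ off-diagonal and $\gamma^i$ built from $\pm\sigma^i$ in the off-diagonal blocks; if (1.42) differs from this, the matrices in the code need to be adjusted accordingly. The variable names are chosen only for this illustration.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks.
sigma1 = np.array([[0, 1], [1, 0]], dtype=complex)
sigma2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)
zero2 = np.zeros((2, 2), dtype=complex)
one2 = np.eye(2, dtype=complex)

# Assumed chiral basis: gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma^i], [-sigma^i, 0]].
gamma0 = np.block([[zero2, one2], [one2, zero2]])

def gamma_spatial(sigma):
    """Spatial gamma matrix in the assumed chiral basis."""
    return np.block([[zero2, sigma], [-sigma, zero2]])

gamma_space = [gamma_spatial(s) for s in (sigma1, sigma2, sigma3)]

theta = gamma_space[0] @ gamma_space[2]   # Theta = gamma^1 gamma^3
theta_inv = np.linalg.inv(theta)

# Conditions (1.143).
time_ok = np.allclose(theta_inv @ gamma0.conj() @ theta, gamma0)
space_ok = all(np.allclose(theta_inv @ g.conj() @ theta, -g) for g in gamma_space)

# Block-diagonal form: Theta acts as i sigma^2 on each Weyl spinor, as in (1.144).
weyl_ok = np.allclose(theta, np.block([[1j * sigma2, zero2], [zero2, 1j * sigma2]]))

print(time_ok, space_ok, weyl_ok)
```

Because $\Theta$ is block diagonal in this basis, the left- and right-handed components never mix, which is the statement made in the text.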

What would it mean for a quantum field theory to break time-reversal invariance?
It sounds rather cool. In practice, however, a breaking of time reversal manifests itself
in rather mundane ways. One simple example is the presence of an electric dipole
moment for particles. Recall from the lectures on Electromagnetism that an electric
dipole moment arises from two, equal and opposite, closely separated charges and gives
rise to an electric field that drops off as $1/r^3$.

The dipole moment points in a particular direction. For an elementary particle,
this direction must align with the spin, otherwise the particle would pick a preferred
direction in space and so break Lorentz invariance. But the spin and dipole moment
transform differently under both parity and time reversal. To see this, recall that spin
$\mathbf{S}$ is a form of angular momentum $\mathbf{L} = m\mathbf{x} \times \dot{\mathbf{x}}$, which is even under parity and odd
under time reversal, while the dipole moment transforms in the same way as the electric
field $\mathbf{E}$. Hence, we have

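$$\begin{aligned} P : \mathbf{S} \mapsto \mathbf{S} \quad &\text{and}\quad T : \mathbf{S} \mapsto -\mathbf{S} \\ P : \mathbf{E} \mapsto -\mathbf{E} \quad &\text{and}\quad T : \mathbf{E} \mapsto \mathbf{E} . \end{aligned} \tag{1.145}$$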

This means that discovery of a dipole moment for a fundamental particle would imply
that the laws of physics break both parity and time reversal invariance. The search
for the electric dipole moment of the neutron remains one of the most direct ways to
test for time-reversal breaking in the strong nuclear force. So far, no such breaking has
been found. (We discuss this further in Section 3.4.) As we will see later, the weak
force does break both parity P and, to a lesser extent, time reversal T . This results in
a theoretical prediction for the electric dipole moment of the electron, albeit one that
is far below current experimental bounds.

1.4.4 CPT
There are theories that are invariant under our three discrete symmetries, C, P and
T , and other theories that break them. As we will see, the Standard Model is in the
latter class and all three symmetries are broken.

However, there is a theorem that says that all relativistic quantum field theories
must necessarily be invariant under the combined action of CPT. In other words, if
you look at anti-particles in the mirror, with their motion reversed, then you will have
a symmetry on your hands.

One somewhat workaday proof of the CPT theorem is to simply write down all
possible Lorentz invariant terms and check that they are indeed invariant under CPT.
As we’ve seen, the most subtle transformations are those of spinors. For example,
combining our previous results (1.117), (1.126) and (1.141), we find that a Dirac spinor
is transformed by the anti-unitary operation
$$CPT : \psi(x) \mapsto \gamma^5\psi^\star(-x) \quad\text{with}\quad \gamma^5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \tag{1.146}$$

You can check that all fermion bilinears are invariant under this transformation. For
example,
$$\bar{\psi}\psi = \psi^\dagger\gamma^0\psi \;\mapsto\; \psi^T\gamma^5\gamma^0\gamma^5\psi^\star = -\psi^T\gamma^0\psi^\star = \bar{\psi}\psi \tag{1.147}$$
where, in the final equality, we reordered the fermions and picked up a minus sign for
our troubles due to their Grassmann nature. The pseudoscalar $\bar{\psi}\gamma^5\psi$ is also invariant
by a similar argument, while both $\bar{\psi}\gamma^\mu\psi$ and $\bar{\psi}\gamma^\mu\gamma^5\psi$ transform as vectors, rather than
pseudovectors (meaning that they pick up minus signs) which ensures that any kinetic
term we write down is invariant. (For this, you will need to use the fact that $(\gamma^1)^T = -\gamma^1$
and $(\gamma^3)^T = -\gamma^3$ while $(\gamma^0)^T = \gamma^0$ and $(\gamma^2)^T = \gamma^2$.)
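The matrix identities used in this argument can be confirmed numerically. The sketch below again assumes the common chiral basis for (1.42), and it takes $\gamma^5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$; that overall sign convention is an assumption, but it drops out of the first check because $\gamma^5$ appears twice. The Grassmann sign in the last step of (1.147) is not captured by ordinary complex numbers, so only the matrix algebra is tested here.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
zero2 = np.zeros((2, 2), dtype=complex)
one2 = np.eye(2, dtype=complex)

# Assumed chiral basis: gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma^i], [-sigma^i, 0]].
gammas = [np.block([[zero2, one2], [one2, zero2]])]
gammas += [np.block([[zero2, s], [-s, zero2]]) for s in sigma]

# Assumed convention for gamma^5; only its square and anticommutation are used below.
gamma5 = -1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

# gamma^5 gamma^mu gamma^5 = -gamma^mu, the step used in the middle of (1.147).
sandwich_ok = all(np.allclose(gamma5 @ g @ gamma5, -g) for g in gammas)

# Transposition signs quoted in the text: +, -, +, - for gamma^0, ..., gamma^3.
transpose_signs = [+1, -1, +1, -1]
transpose_ok = all(np.allclose(g.T, sign * g) for g, sign in zip(gammas, transpose_signs))

print(sandwich_ok, transpose_ok)
```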

A slightly more elegant, but not entirely convincing, demonstration of CPT follows
from Wick rotating to Euclidean space. Here we sketch the basic idea. The full Lorentz
group in Minkowski space is really O(1, 3) and contains four disconnected components,
with the actions of parity and time reversal taking us from one component to the other.
In contrast, in Euclidean space the group becomes O(4) and this contains only two
disconnected components. If you follow the Lorentzian CPT under a Wick rotation,
it becomes simply a rotation in SO(4), i.e. a transformation that is connected to the
identity. (The need to include C here is roughly because particles are like anti-particles
travelling backwards in time.) This means that if your Euclidean theory is to have
SO(4) rotational invariance, then your Lorentzian theory must enjoy CPT.

The statement that CPT is a symmetry of all relativistic quantum field theories is
something that we can test. Here’s an example from neutrino physics. We will learn
later that neutrinos oscillate from one flavour to another as they travel through space.
So, for example, a muon neutrino $\nu_\mu$ will have some probability to convert into an
electron neutrino $\nu_e$, a process that we write as

$$\nu_\mu \to \nu_e . \tag{1.148}$$

We could also consider the CP conjugate process, namely

$$\bar{\nu}_\mu \to \bar{\nu}_e . \tag{1.149}$$

There is no reason for the amplitudes for these two processes to be equal if CP is
broken. However, there is also the time reversed process of (1.148)

$$\nu_e \to \nu_\mu . \tag{1.150}$$

This too may have a different amplitude to (1.148) if time reversal is broken. However,
CPT tells us that the amplitude for (1.149) and the amplitude for (1.150) are necessarily
equal. Indeed, all experimental tests so far have failed to find any violation of CPT.
