Standard Model Lecture Notes Overview
These notes cover the Standard Model and some of its extensions. The primary sources were:
• John March-Russell’s Standard Model and Beyond course. A course from the Oxford MMathPhys
program giving a lot of wisdom on the motivation for and methodology of model building.
• Burgess and Moore, The Standard Model: A Primer. Covers the dynamics of the SM at
tree level in detail, with clear discussions, and considers theories beyond the SM through the
lens of effective field theory. It begins with a clean introduction to quantum field theory, but
realistically one would need prior exposure to make sense of it. The book is especially useful
because it describes intuitive ideas commonly used in collider physics which aren’t well-covered
in dedicated quantum field theory books.
• Maggiore, A Modern Introduction to Quantum Field Theory. An accessible, slim book which
charts a direct course towards the building blocks of the Standard Model. Like Burgess and
Moore, renormalization is not covered in any technical detail, but explained carefully from a
modern perspective.
• Georgi, Weak Interactions and Modern Particle Theory. Covers many important SM topics
that are omitted in introductory textbooks, such as chiral perturbation theory. Written in the
inimitable Georgi style: irreverent and direct, always going straight to the physics.
• Donoghue, Golowich, and Holstein, Dynamics of the Standard Model. An authoritative mono-
graph and useful reference.
The sources cited in the notes on Quantum Field Theory were also used. The gauge theory
conventions here differ from those used there, but match those of Peskin and Schroeder. The most
recent version is here; please report any errors found to kzhou7@[Link].
Contents
1 Introduction
1.1 Particle and Interactions
1.2 Symmetries and Conservation Laws
1.3 Bound States
1.4 Quantum Chromodynamics
2 Symmetries
2.1 Chiral and Gauge Symmetries
2.2 Discrete Symmetries
2.3 Parity
2.4 Charge Conjugation
2.5 Time Reversal
4 Electroweak Theory
4.1 Gauge Theory
4.2 Coupling to Matter
4.3 Symmetries of the Standard Model
4.4 Electroweak Decays
4.5 CP Violation
5 Neutrinos
5.1 Historical Review
5.2 Neutrino Oscillations
5.3 Neutrino Masses
6 Quantum Chromodynamics
6.1 Hadron Production
6.2 Deep Inelastic Scattering
6.3 Chiral Symmetry
6.4 Chiral Perturbation Theory
6.5 The Strong CP Problem
6.6 Axion Phenomenology
1 Introduction
1.1 Particle and Interactions
First, we summarize some ancient history.
• In the early 1960s, the Eightfold Way was introduced, followed by the quark model. Quark
confinement was postulated to explain why free quarks had not been seen, while quark color
was added for consistency with the Pauli exclusion principle.
• By the late 1960s and early 1970s, deep inelastic scattering experiments at SLAC and CERN
found evidence for substructure in the proton, pointlike particles called partons, analogous
to how Rutherford had found substructure in the atom. However, there was still widespread
skepticism over the quark model, so partons were not identified with quarks.
• In 1974, groups led by Ting at Brookhaven and Richter at SLAC simultaneously found a new
particle, called the J/ψ, a meson which was both extremely heavy and relatively long-lived.
This set off a flurry of theoretical activity called the November revolution.
• The quark model explained the new particle as a bound state of a charm and anti-charm quark;
it has excited states in analogy with positronium, which are collectively called “charmonium”. It
also predicted many new mesons and baryons containing charm quarks, organized into multiplets
by group theory, which were quickly found.
• At this point, the elementary particles could be organized into families containing two quarks
and one lepton each, but measurements of CP violation motivated a third generation. In 1975,
the tau lepton was found. Then, a few years later, the upsilon meson was found and postulated
to be a bottom and anti-bottom quark. Throughout the 1980s, more B mesons were discovered,
and today LHCb and Belle II are devoted to studying them.
• The top quark would complete the third generation, and measurements of $B^0/\bar{B}^0$ oscillations
indicated it had a huge mass. It was thus too heavy to be produced until 1995, at Fermilab’s
Tevatron, leading to the table of masses below.
Note that quarks don’t exist as free particles, so the definition of their mass is somewhat
ambiguous. For instance, the up mass may be as high as 5 MeV under some definitions.
• Finally, the weak interaction (as understood by Fermi’s effective four-fermion interaction) was
suspected to be mediated by an intermediate vector boson. By 1960, Glashow had formulated
a unified electroweak theory, though there was no mechanism to break the symmetry. In 1964,
the Higgs mechanism was discovered, and in 1967, Weinberg and Salam showed how it could
break electroweak symmetry, predicting the W ± and Z bosons.
• In 1983, the intermediate vector bosons were discovered at CERN’s super proton synchrotron.
In the 1990s, LEP was constructed to perform precision tests of the electroweak theory.
• The basic QED and QCD vertices are simple: charged particles emit photons and colored
particles emit gluons. There are also ggg and gggg vertices.
• The fundamental neutral weak vertex is $Z f \bar{f}$ for any fermion $f$. For example, a Z could mediate
neutrino-electron scattering. The Z behaves a lot like the photon, except that it also couples
to neutrinos.
• The fundamental leptonic charged weak vertex is W ℓν, i.e. a W boson can decay into a lepton
and its corresponding antineutrino. This vertex mediates the decay of the muon.
• Finally, the quark charged weak vertex converts an up-type quark to a down-type quark,
e.g. W ud′ . However, the quarks are defined in the mass basis, while this interaction is diago-
nalized in a different basis. Then W emission can convert an up quark to a down quark, but
also to a strange or bottom quark.
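• The two bases are related by the CKM matrix; the magnitudes of the matrix elements are
\[
\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} =
\begin{pmatrix} 0.974 & 0.227 & 0.004 \\ 0.227 & 0.973 & 0.042 \\ 0.008 & 0.042 & 0.999 \end{pmatrix}
\begin{pmatrix} d \\ s \\ b \end{pmatrix}.
\]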
Then most weak decays stay in the same generation. Crossing between adjacent generations is
rare but possible, while crossing between the first and third generation is rarer still.
• For example, a charged weak interaction mediates beta decay. A baryon can decay into another
baryon while emitting a photon by emitting and reabsorbing a W , which emits a photon. (This
photon is necessary by energy-momentum conservation.)
• It is now known that neutrinos have mass. Neutrinos are defined in the flavor basis, so the
weak vertex is unchanged. Instead, the flavor eigenstates oscillate into each other due to their
mismatch with the mass eigenstates. The analogue of the CKM matrix is the PMNS matrix.
• In the SM, the Higgs interacts with all fermions by Yukawa couplings. It also interacts with the
weak mediators with vertices W W H, ZZH, W W HH, and ZZHH, and with itself as HHH
and HHHH. One-loop diagrams also provide effective vertices ggH, γγH, and γZH.
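As a rough numerical illustration of the CKM magnitudes quoted above, the following sketch squares each entry to estimate the relative weight with which W emission takes a given up-type quark to each down-type quark. The pairing of rows with u, c, t (and columns with d, s, b) follows the usual convention and is an assumption here, as are the helper names. Since the quoted values are rounded, the rows only sum to one approximately.

```python
# Magnitudes |V_ij| as quoted in the text (rounded). Rows are assumed to
# correspond to u, c, t and columns to d, s, b.
CKM_MAGNITUDES = [
    [0.974, 0.227, 0.004],
    [0.227, 0.973, 0.042],
    [0.008, 0.042, 0.999],
]
UP_TYPE = ["u", "c", "t"]
DOWN_TYPE = ["d", "s", "b"]


def transition_probabilities(magnitudes):
    """Return |V_ij|^2 for each entry, the relative weight of each W vertex."""
    return [[v * v for v in row] for row in magnitudes]


def row_sums(probabilities):
    """Sum each row; unitarity requires these to equal 1 up to rounding."""
    return [sum(row) for row in probabilities]


if __name__ == "__main__":
    probs = transition_probabilities(CKM_MAGNITUDES)
    for up, row in zip(UP_TYPE, probs):
        pretty = ", ".join(f"{down}: {p:.4f}" for down, p in zip(DOWN_TYPE, row))
        print(f"{up} -> {pretty}")
    print("row sums:", [round(s, 3) for s in row_sums(probs)])
```

The diagonal entries dominate each row, which is the statement that most weak decays stay within a generation.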
\[ e^+ + e^- \to Z^* \to Z + H. \]
This ruled out Higgs masses below 114 GeV, while perturbative unitarity arguments disfavored
masses above 200 GeV.
• At hadron colliders, the main production mechanism is actually a one-loop “gluon fusion”
process gg → H, which dominantly goes through a t loop, because the heavy t quark has
the largest coupling with the Higgs. (This is an example of “non-decoupling”. We usually
expect heavy particles to be irrelevant, but they remain relevant because they must couple
more strongly to the Higgs to get mass. Thus, Higgs measurements already rule out a heavy
4th generation that gets mass the same way as the other 3.)
• There are other important production mechanisms which occur at tree level, including:
These processes have all been observed and match Standard Model expectations to around the
20% level. Note that direct quark fusion, $q\bar{q} \to H$, is unlikely because all Yukawa couplings
besides the top are small.
• The reason gluon fusion is dominant, despite being loop-induced, is because (1) it doesn’t
involve any weak couplings, and (2) at proton-proton colliders, the constituents (“partons”) are
dominantly gluons and quarks, with a smaller contribution from antiquarks. (There are some
antiquarks, because of QCD effects, and the specific amounts of each particle are characterized
by parton distribution functions.)
• The most likely Higgs decays are to pairs of heavy particles, such as τ , b, W , Z, or t. All
of these are possible (though suppressed) since the W , Z, and t may be produced as virtual
particles which then decay, though the Higgs is light enough for $H \to t\bar{t}$ to be very rare.
• The discovery of the Higgs was made by the clean decay channels of H → γγ, which occurs
through a loop diagram, and H → ZZ → 4 leptons, though couplings to the other particles
were later measured as well.
• One way that experimentalists characterize their knowledge of the Higgs couplings is the “κ
framework”, which is an ad hoc scaling of each Higgs coupling by a factor κi . This violates
gauge invariance and doesn’t yield a consistent quantum field theory, so one can’t compute
higher order corrections, or get accurate kinematics. But measurements indicate κi ≈ 1 to
about 10% accuracy for the weak bosons and the heaviest fermions.
• At the same time, over the past few decades, more accurate calculations have reduced the theory
uncertainty, with the standard now “next to next to next to leading order” (N$^3$LO), which is
good to within a few percent. However, more accurate calculations will be needed for the “high
luminosity” (HL) phase of the LHC, which will further shrink experimental uncertainties.
• We can also try to measure the Higgs trilinear coupling, which would help confirm the nature
of its potential. Varying the trilinear coupling affects the rate of two Higgs production, though
the uncertainties at the HL-LHC will remain of order 100%.
• Proposed future “Higgs factory” e+ e− colliders, such as the ILC, could substantially improve
the precision. For instance, they can operate at around 230 GeV to produce Higgs from Z
Bremsstrahlung (slightly above mH + mZ , to remove phase space suppression), around 250 GeV
to produce Higgs pairs, and at 350 GeV to produce top pairs and measure the top mass. It will
also be possible to directly measure the Higgs width, since the kinematics is much cleaner.
• On all these fronts, we could do much better with a µ+ µ− collider, since muons couple much
more strongly to the Higgs, but the technology to build one doesn’t yet exist. It’s not hard to
get the muons (e.g. electron-positron colliders make the positrons by just directing a beam into
a wall), but it seems challenging to form the muons into a beam before they decay.
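To get a feel for the time budget in the muon collider remark above, here is a minimal sketch of relativistic time dilation for a muon beam. The muon mass and rest lifetime are standard values; the beam energy used in the example (62.5 GeV per muon, i.e. half the Higgs mass) is only a hypothetical choice, and the function names are made up for illustration.

```python
import math

MUON_MASS_GEV = 0.105658        # muon rest mass in GeV
MUON_LIFETIME_S = 2.197e-6      # muon lifetime at rest, in seconds
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def lorentz_gamma(energy_gev, mass_gev=MUON_MASS_GEV):
    """Lorentz factor gamma = E / m for total energy E (natural units)."""
    if energy_gev < mass_gev:
        raise ValueError("total energy cannot be below the rest mass")
    return energy_gev / mass_gev


def lab_lifetime(energy_gev):
    """Mean lifetime in the lab frame, gamma * tau."""
    return lorentz_gamma(energy_gev) * MUON_LIFETIME_S


def mean_decay_length(energy_gev):
    """Mean distance travelled before decay, beta * gamma * c * tau."""
    gamma = lorentz_gamma(energy_gev)
    beta_gamma = math.sqrt(gamma * gamma - 1.0)
    return beta_gamma * SPEED_OF_LIGHT * MUON_LIFETIME_S


if __name__ == "__main__":
    energy = 62.5  # hypothetical beam energy per muon, in GeV
    print(f"gamma           = {lorentz_gamma(energy):.1f}")
    print(f"lab lifetime    = {lab_lifetime(energy):.3e} s")
    print(f"mean decay path = {mean_decay_length(energy) / 1000:.1f} km")
```

Even with time dilation the lab-frame lifetime at this energy is only of order a millisecond, which is why the muons must be collected, cooled, and accelerated very quickly.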
Note. Before going on, it’s nice to step back and appreciate the massive engineering effort that
goes into particle colliders. The Large Hadron Collider (LHC) sits inside a 27 km circular tunnel,
buried 100 m underground due to a combination of radiation shielding and political reasons. Inside
the tunnel, beams of protons circulate in two separate tubes in opposite directions. There are
thousands of superconducting NbTi magnets distributed throughout, to keep the beam going in a
circle. This acceleration causes energy loss due to Bremsstrahlung, so the protons are pushed along
by thousands of superconducting radiofrequency (SRF) cavities, whose oscillating fields are timed
to accelerate the protons as they pass by. The whole system must be cooled with liquid helium
to cryogenic temperatures, to maintain superconductivity. An unplanned rise in temperature can
cause an explosive “magnet quench”, which knocked the LHC out of commission in 2008.
The proton beams are organized into thousands of “bunches” of about $10^{11}$ protons each, spaced
only 25 ns apart. They are focused to a transverse size of 16 microns (smaller for the upcoming
“High Luminosity” LHC) to increase the chance of an interesting event when the bunches collide at
the centers of the detectors. Maintaining this high beam quality is an entire field of study, involving
hundreds of physicists and a number of dedicated journals. The beam is carefully focused using
about a thousand specialized magnets, such as quadrupoles, sextupoles, octupoles, and decapoles.
And of course, the entire beamline needs to be a vacuum as empty as outer space, to avoid scattering
the beam protons.
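As a quick consistency check on the numbers in this paragraph, the sketch below converts the 25 ns bunch spacing into a distance and counts how many such slots fit around a 27 km ring. It assumes the protons travel at essentially the speed of light and uses the rounded circumference from the text; the function names are illustrative only.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s
RING_CIRCUMFERENCE_M = 27_000.0  # rounded value quoted in the text
BUNCH_SPACING_S = 25e-9          # 25 ns between bunches


def bunch_spacing_m(spacing_s=BUNCH_SPACING_S, speed=SPEED_OF_LIGHT):
    """Distance between consecutive bunches, assuming v is essentially c."""
    return speed * spacing_s


def max_bunch_slots(circumference_m=RING_CIRCUMFERENCE_M, spacing_s=BUNCH_SPACING_S):
    """Upper bound on the number of bunch slots that fit around the ring."""
    return int(circumference_m // bunch_spacing_m(spacing_s))


if __name__ == "__main__":
    print(f"bunch spacing: {bunch_spacing_m():.2f} m")
    print(f"slots around the ring: {max_bunch_slots()}")
```

The result is a few thousand slots, consistent with the "thousands of bunches" quoted above; in practice not every slot is filled, since gaps are needed for injection and for dumping the beam.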
The proton beam for the LHC needs to already be at a relatively high energy before entering, so
the energy is built up through a series of smaller accelerators. Protons are injected into the LHC
from the Super Proton Synchrotron (SPS), which discovered the W and Z bosons. The SPS in turn
receives them from the Proton Synchrotron, which gets them from the Proton Synchrotron Booster,
which gets them from Linac4, which gets them from ionizing hydrogen, which comes from a single
little bottle of hydrogen gas. The beam itself degrades as more collisions happen, so it needs to be
safely “dumped” and reformed every few hours, repeating this entire process.
The LHC supports many experiments. People often think of ATLAS and CMS, the general
purpose experiments which analyze high-energy proton-proton collisions. But there’s also ALICE
(measuring collisions of lead nuclei, to study quark gluon plasma), LHCb (with a specialized detector
to see hadrons containing b quarks, to study CP violation and flavor anomalies), LHCf and TOTEM
(downstream of the ATLAS and CMS collision points, to study forward produced particles for cosmic
ray physics), MoEDAL-MAPP (near LHCb, to search for produced magnetic monopoles), and soon
FASER (far downstream of the ATLAS collision point, to study new light particles). There are also
some proposed small experiments, such as MATHUSLA (an above-ground detector, to see long-lived
particles), MilliQan (millicharged particles), and CODEX-b (like MATHUSLA, but near LHCb).
The enormous detectors in these experiments use a variety of techniques to infer what happened
in the collision. Powerful magnets bend the trajectories of charged particles, measuring their
momentum. Calorimeters are solid components which absorb energy from the particles through the
ionization or excitation of particles, which measures their energy. There are also “trackers”, often
filled with sparse gas, which allow the particles to pass through with lower energy loss. Particles
going through the trackers leave a trail of ionized gas, which can be pushed towards a detector
with an electric field. Modern trackers have exquisite sensitivity, allowing the paths of particles
to be known to finer than millimeter precision. The rate at which the particles lose energy is also
measured and can be described by the Bethe–Bloch formula. All of these pieces of information are
used together to identify the particle and figure out what happened in the collision.
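The momentum measurement from magnetic bending mentioned above can be sketched with the standard relation between the radius of curvature and the momentum transverse to the field, p [GeV/c] ≈ 0.3 |q| B [T] R [m]. The field strength and momentum in the example are hypothetical values chosen only for illustration, not parameters of any particular detector.

```python
# Conversion factor: c expressed in units of 1e9 m/s, so that p comes out in
# GeV/c when B is in tesla and R is in metres.
GEV_PER_TESLA_METRE = 0.299792458


def momentum_from_radius(field_tesla, radius_m, charge=1.0):
    """Transverse momentum in GeV/c of a particle bending with radius R."""
    return GEV_PER_TESLA_METRE * abs(charge) * field_tesla * radius_m


def radius_from_momentum(field_tesla, momentum_gev, charge=1.0):
    """Radius of curvature in metres for a given transverse momentum."""
    return momentum_gev / (GEV_PER_TESLA_METRE * abs(charge) * field_tesla)


if __name__ == "__main__":
    field = 2.0      # hypothetical solenoid field, in tesla
    momentum = 10.0  # hypothetical transverse momentum, in GeV/c
    radius = radius_from_momentum(field, momentum)
    print(f"A {momentum} GeV/c track in a {field} T field bends with R = {radius:.1f} m")
```

Because the radius grows linearly with momentum, high-momentum tracks are nearly straight inside the detector, which is one reason precise position measurements in the tracker matter.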
1.2 Symmetries and Conservation Laws
• The SM has three conserved quantities: charge, baryon number (or equivalently quark number)
and lepton number. If we ignore neutrino oscillations, the individual lepton flavors are conserved.
If we ignore charged weak interactions, the individual quark numbers are conserved. If we ignore
all weak interactions, parity is conserved.
• Energy conservation forbids decays of particles into heavier particles. It places no restriction
on scattering, since the incoming energy can be arbitrary.
• A decay is more likely if the products are much lighter than the decaying particle, because there
is more available phase space volume. Conversely, neutron decay is so slow because its products
are only barely lighter than the neutron, leaving very little phase space.
• Generally, a strong decay takes about $10^{-23}$ seconds, an electromagnetic decay takes about $10^{-16}$ seconds,
and a weak decay takes at least $10^{-13}$ seconds, longer if generation mixing occurs.
• Finally, the OZI/Zweig rule states that any diagram which can be cut in half by only cutting
gluon lines is suppressed. This is because the reaction requires hard gluons, and QCD is weak
at high energies.
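The typical lifetimes quoted in the list above can be translated into decay widths using Γ = ħ/τ, which is often the more useful quantity since widths are what appear as resonance line shapes. The following is a minimal sketch; the lifetimes are the rough values from the text, and the function name is illustrative.

```python
HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV * s

# Rough lifetimes from the text, in seconds.
TYPICAL_LIFETIMES_S = {
    "strong": 1e-23,
    "electromagnetic": 1e-16,
    "weak": 1e-13,
}


def width_mev(lifetime_s):
    """Decay width in MeV corresponding to a mean lifetime in seconds."""
    if lifetime_s <= 0:
        raise ValueError("lifetime must be positive")
    return HBAR_MEV_S / lifetime_s


if __name__ == "__main__":
    for kind, tau in TYPICAL_LIFETIMES_S.items():
        print(f"{kind:>16}: tau = {tau:.0e} s, width = {width_mev(tau):.2e} MeV")
```

A strong decay therefore corresponds to a width of tens of MeV, which is directly visible as the width of a resonance peak, while weak decay widths are far too small to resolve and are inferred from lifetimes instead.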
• In 1956, Sachs and Wu found that nature did not respect parity symmetry. In the beta decay of
cobalt 60, it was found that the emitted electron came out opposite to the nuclear spin, which
may be aligned with a magnetic field. This violates parity since spin and magnetic fields are
axial vectors while the emission velocity is a true vector.
• Parity violation can also be seen from the helicity of the neutrino, which is Lorentz invariant
assuming the neutrino is massless. Defining right-handed helicity to mean spin pointing along
the direction of motion, we would expect neutrinos to be left-handed and right-handed with
equal frequency by parity invariance. Instead, experiments find that all neutrinos are left-
handed and all antineutrinos are right-handed, where the helicity of the neutrino is not directly
measured but inferred from the helicities of the other products of a decay.
• In the absence of the weak force, parity places constraints on allowed particle decays.
– Vector particles such as the photon have parity −1 and axial vectors have parity +1.
Conversely, scalars have parity +1 and pseudoscalars have parity −1.
– The individual quark numbers and lepton numbers are conserved, so we are free to assign
any parity to them. By convention, we assign parity +1 to leptons and quarks (and thereby
also to the proton and neutron). We will show below that this implies parity −1 for
antiquarks and antileptons. In particular, this means that a bound state consisting of a
fermion and its antiparticle has a factor of −1 in its intrinsic parity.
– Note that parity cannot be defined for neutrinos in the Standard Model, as it would map
a left-helicity neutrino to a right-helicity neutrino, which doesn’t exist.
– In a two-body decay with angular momentum ℓ, there is an extra factor of $(-1)^\ell$.
• For example, with ℓ = 0, we expect to have pseudoscalar and vector mesons, corresponding
to spin 0 and 1 respectively. These are indeed the lowest-energy meson octets; there are also
higher-energy positive parity meson octets corresponding to excited states with ℓ = 1.
• One early hint of parity violation was the ‘theta–tau’ puzzle in the 1950s. Two mesons, called
the θ and the τ , decayed as
θ+ → π+ + π0 , τ + → π+ + π0 + π0.
Every particle involved has spin 0, so there can be no orbital angular momentum. Then the θ
and τ must have parity 1 and −1, but they had nearly the same mass. The resolution is that
they are indeed the same particle, the K + , and the first decay violates parity.
• Next, we turn to charge conjugation, which replaces particles with antiparticles by flipping the
sign of all internal quantum numbers. Only particles that are their own antiparticles can be
eigenstates of C, severely restricting its use.
– The photon has C = −1, since it is sourced classically by a current which flips under C.
– Consider a spin 1/2 particle and its antiparticle with total orbital angular momentum ℓ
and total spin s. We get a factor of $(-1)^\ell$ from the orbital part, a factor of −1 from
identical particle exchange, and a factor of $(-1)^{s+1}$ from the antisymmetry/symmetry of
the singlet/triplet. Then $C = (-1)^{\ell+s}$.
– For example, the neutral pion $\pi^0$ has C = +1, so it can’t decay into an odd number of
photons.
• For the strong interactions, where isospin is conserved, we may define the G-parity
\[ G = C e^{i \pi I_2}. \]
This is more useful because charged mesons can have definite G-parity. For example, the
charged pion $\pi^+$ is mapped to $\pi^-$ by the isospin rotation and then back to $\pi^+$ by C, so it has definite G-parity.
• As we’ve seen, the leptonic weak decays violate parity. For example, in the decay
\[ \pi^+ \to \mu^+ + \nu_\mu \]
the antimuon is always left-handed, while it would be right-handed in the parity-flipped version.
Similarly, in the decay
\[ \pi^- \to \mu^- + \bar{\nu}_\mu \]
the muon is always right-handed. Thus C symmetry is also violated, since the charge conjugate
would have a right-handed antimuon, but CP symmetry is not.
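• One useful system for testing CP symmetry is the decay of the neutral kaons $K^0$ and $\bar{K}^0$. The
$K^0$ and $\bar{K}^0$ mix by a $W^\pm$ loop, so neutral kaons found in the lab are mixtures of the two. Both
$K^0$ and $\bar{K}^0$ have P = −1, and C maps each into the other with sign +1, so the states
\[ |K_1\rangle = |K^0\rangle - |\bar{K}^0\rangle, \quad |K_2\rangle = |K^0\rangle + |\bar{K}^0\rangle \]
are eigenstates of CP, with eigenvalues +1 and −1 respectively.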
• Assuming CP is conserved in the weak interactions, |K1 ⟩ and |K2 ⟩ decay in different ways.
Since a pion has CP = −1, the most common decays are
K1 → 2π, K2 → 3π.
The first decay is much faster because there is more phase space available. Therefore, a neutral
kaon should quickly turn into an eigenstate |K2 ⟩ of CP. Concretely, this means that a beam of
kaons should initially have many 2π decays, and later have only 3π decays.
• The neutral kaons provide another example where two bases mismatch, and the one we use is
dictated by convenience. If we are studying strong interactions, we want the $K^0$ and $\bar{K}^0$, but
if we are studying weak decays, we want the $K_1$ and $K_2$.
• In 1964, the Cronin–Fitch experiment established that the weak interaction does not conserve
CP. This was done by taking a beam of neutral kaons, waiting for a time much greater than
the lifetime of the K1 , and detecting residual 2π decays (about 0.2% of the total); this is only
possible if the K2 decay violates CP.
• In general, two particles can mix if they have approximately the same mass and the same
relevant conserved quantities. For example, the particles always must have the same baryon
number, but they don’t have to have the same isospin if the mixing is by a weak process. In
the case of neutrinos, lepton number is violated, but this is acceptable as the neutrino mass
terms explicitly break lepton number conservation.
• Similarly, the $B^0$ and $\bar{B}^0$ mesons can mix. Oscillations between the $B^0$ and $\bar{B}^0$ were observed
at Fermilab in 2006, and the decays of the B mesons have been observed to violate CP. Most
measurements of CP violation are with neutral B-mesons or neutral kaons; few
other candidates are long-lived enough.
• T symmetry does not forbid decays, since no particle is in an eigenstate of T. It imposes detailed
balance for reactions, but it is often difficult to set up a backwards reaction. For example, the
reverse of the weak decay Λ → p+ + π − is difficult to observe because the proton and pion
interact by the strong force. To remove contamination from other forces, we might turn to
neutrinos, but it is hard to control them.
• As such, the most sensitive searches for T violation come from measurements of the electric
dipole moment of elementary particles; such a dipole moment would violate both P and T. So
far, all experiments have found the dipole moment to be zero within error.
• However, we do expect T violation to occur, since CPT must be a symmetry and CP is violated.
T violation has been directly observed at BaBar at SLAC in 2012.
• One prediction of CPT symmetry is that every particle must have the same mass and lifetime
as its antiparticle. (This is also true of C symmetry, but we know that to be broken.) This
has been verified to great accuracy for many particles. Another result is that helicities must
be symmetric about zero: if there exists a helicity λ state, there must also be a corresponding
helicity −λ state.
• One should always keep in mind that just about all of these exact symmetries, such as Lorentz
symmetry, CPT symmetry, etc., are strongly spontaneously broken in our universe. For
example, the presence of the CMB breaks Lorentz symmetry. When we talk about verifying
these symmetries, we always imagine experiments that are insensitive to the symmetry-breaking
background.
• CP violation can be caused by complex phases in the CKM or PMNS matrices; the former is
what accounts for CP violation in kaon and B-meson decay. For n < 3 generations such phases
can always be removed by redefining the quark fields, so the observation of CP violation led to
the prediction of a third generation, where there is one residual phase.
• CP violation can also come from $F \tilde{F} = F \wedge F$ terms for the electromagnetic, weak, and strong
forces; specifically, such a term breaks both P and CP. However, this term is more subtle since
it is a total derivative, and hence a boundary term.
• CP violation is said to “distinguish matter from antimatter”. This comes from its presence in
the Sakharov conditions for baryogenesis, the origin of a net matter-antimatter asymmetry in
the universe. They are:
– CP violation. This is necessary since otherwise $i \to f$ will be balanced by $i^*_P \to f^*_P$. The
CP violation in the SM is quite small, and probably not enough to account for baryogenesis.
– Departure from thermal equilibrium. Otherwise, $i \to f$ will be balanced by $f \to i$, by
detailed balance. Equivalently, we can’t go from $\mu_B = 0$ to $\mu_B \neq 0$. We could exit
equilibrium, e.g. after a first-order phase transition. However, the electroweak and QCD
phase transitions appear to be smooth crossovers in the SM, which cannot do the job.
As discussed in detail below, it can be ambiguous to define C or CP, or in some extreme cases
impossible to define them at all. “C and CP violation” really stands for the absence of any
symmetry which would relate processes, causing the net baryon number produced to cancel.
• One alternative possibility is “leptogenesis”. In the SM, B and L are violated by nonperturbative
effects while B − L remains conserved; then leptons can be created, and turn into baryons.
Note that the lepton number of the universe might or might not be zero, since we can’t measure
the neutrinos well.
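Returning to the generation counting for the CKM matrix mentioned earlier in this list, the standard parameter count can be sketched as follows. An n × n unitary matrix has n² real parameters, of which n(n−1)/2 are rotation angles and the rest are phases; rephasing the 2n quark fields removes 2n − 1 of the phases, since an overall common phase has no effect. The function name is illustrative only.

```python
def ckm_parameter_count(n_generations):
    """Return (mixing_angles, physical_phases) for n quark generations."""
    n = n_generations
    if n < 1:
        raise ValueError("need at least one generation")
    total = n * n                 # real parameters of an n x n unitary matrix
    angles = n * (n - 1) // 2     # parameters of a real rotation matrix
    phases = total - angles       # remaining parameters are phases
    removable = 2 * n - 1         # relative rephasings of the 2n quark fields
    return angles, phases - removable


if __name__ == "__main__":
    for n in range(1, 5):
        angles, phases = ckm_parameter_count(n)
        print(f"{n} generation(s): {angles} angle(s), {phases} CP-violating phase(s)")
```

For two generations this leaves only the Cabibbo angle and no phase, while three generations give three angles and exactly one residual phase, as stated above.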
Note. How complex phases in the CKM matrix cause observable CP violation. Consider a process
and its CP-reverse. Under the standard electroweak theory, the matrix elements are
where the CKM phase $e^{i\phi}$ is not conjugated. However, the magnitudes of the amplitudes are the
same, so this has no observable effect. But if the process can occur in multiple ways,
then $|\mathcal{M}|$ and $|\widetilde{\mathcal{M}}|$ can differ because the terms can interfere differently. In the case of B-meson
decay, there is a tree level process, and the next most significant contribution is from a “penguin
diagram” involving a W loop. In fact, this is generic: one can show that the leading CP violation
must involve loop-level processes; interference between just tree-level processes isn’t enough. As a
result, CP violation is in some sense always small, no matter how large the phases are, since it’s
always loop suppressed.
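The interference argument in this note can be illustrated numerically. The sketch below uses a common parametrization, which is an assumption here and not taken from the notes: each contribution carries a "CP-odd" phase that flips sign in the CP-conjugate process (such as a CKM phase) and a "CP-even" phase that does not (in practice typically generated by loops). All numerical values and names are hypothetical.

```python
import cmath
import math


def total_amplitude(terms, cp_conjugate=False):
    """Sum interfering contributions to an amplitude.

    Each term is (magnitude, cp_even_phase, cp_odd_phase). Only the CP-odd
    phase changes sign for the CP-conjugate process.
    """
    sign = -1.0 if cp_conjugate else 1.0
    return sum(a * cmath.exp(1j * (even + sign * odd)) for a, even, odd in terms)


def rate_difference(terms):
    """|M|^2 - |M_conj|^2, which vanishes if CP is not observably violated."""
    m = total_amplitude(terms)
    m_conj = total_amplitude(terms, cp_conjugate=True)
    return abs(m) ** 2 - abs(m_conj) ** 2


if __name__ == "__main__":
    tree = (1.0, 0.0, 0.0)     # hypothetical tree-level contribution
    penguin = (0.2, 0.5, 1.1)  # hypothetical loop contribution
    print("single amplitude:", rate_difference([penguin]))
    print("two amplitudes  :", rate_difference([tree, penguin]))
    # Closed form for two terms: -4 a1 a2 sin(d_even) sin(d_odd)
    print("closed form     :", -4 * 1.0 * 0.2 * math.sin(-0.5) * math.sin(-1.1))
```

With a single contribution the phase drops out of the rate. With two contributions the difference is proportional to the product of the sines of the CP-even and CP-odd phase differences, so both must be nonzero, which is consistent with the remark that tree-level interference alone is not enough.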
1.3 Bound States
• First, note that the potential energy and kinetic energy should be of the same order by the
virial theorem. Thus a system is nonrelativistic if its binding energy is small compared with its
mass energy.
• For example, light quark bound states are always relativistic, but charmonium $c\bar{c}$ and bottomonium
$b\bar{b}$ are not. We don’t count toponium $t\bar{t}$ since its lifetime is too short to be observed.
• The archetypical example of such a system is the hydrogen atom, where the energy levels are
of order $\alpha^2 mc^2$, where m is the mass of the electron.
– Fine structure comes from the lowest-order relativistic correction and the spin-orbit coupling
of the electron. It can be calculated with the Dirac equation and contributes $\alpha^4 mc^2$.
– The Lamb shift comes from QED effects and contributes $\alpha^5 mc^2$.
– The hyperfine structure comes from the spin of the proton. The proton’s magnetic moment is
much smaller than that of the electron since it is much heavier. It has a spin-spin interaction
as well as a spin-orbit coupling with the electron and contributes $(m/m_p)\,\alpha^4 mc^2$.
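To make the hierarchy of hydrogen energy scales above concrete, the following sketch evaluates each parametric estimate in eV. These are order-of-magnitude scales only: numerical prefactors, such as the proton g-factor in the hyperfine splitting, are deliberately ignored, so the outputs should not be read as precise level splittings. The constants are standard values and the dictionary keys are just labels.

```python
ALPHA = 1.0 / 137.035999          # fine structure constant
ELECTRON_MC2_EV = 510_998.95      # electron rest energy in eV
PROTON_MC2_EV = 938_272_088.0     # proton rest energy in eV


def hydrogen_energy_scales():
    """Parametric size of each contribution to the hydrogen spectrum, in eV."""
    m = ELECTRON_MC2_EV
    return {
        "gross structure (alpha^2 m c^2)": ALPHA**2 * m,
        "fine structure (alpha^4 m c^2)": ALPHA**4 * m,
        "Lamb shift (alpha^5 m c^2)": ALPHA**5 * m,
        "hyperfine ((m/m_p) alpha^4 m c^2)": (m / PROTON_MC2_EV) * ALPHA**4 * m,
    }


if __name__ == "__main__":
    for label, value in hydrogen_energy_scales().items():
        print(f"{label:<36} ~ {value:.2e} eV")
```

The gross structure scale comes out at about 27 eV, and each later effect is smaller by powers of α or by the electron-to-proton mass ratio, which is why hyperfine structure sits well below fine structure in hydrogen.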
• Positronium behaves very similarly to hydrogen; using the reduced mass, its energy levels are
like those of hydrogen with an electron half as massive.
– One major difference is that the hyperfine structure is now of the same order as the fine
structure.
– Since both the particles move quickly, there is another $\alpha^4 mc^2$ correction due to the propagation
time for the electromagnetic field.
– The electron and positron can also temporarily annihilate into a virtual photon. Since the
probability for this process is proportional to $|\psi(0)|^2$, at lowest order it only occurs for
ℓ = 0. Since the photon has spin one, it only occurs in the triplet configuration s = 1.
– Finally, the electron and positron can annihilate. Positronium has C eigenvalue $(-1)^{\ell+s}$
while a state with n photons has $(-1)^n$, restricting the number of photons produced. By the
same logic as above, annihilation only occurs at lowest order for ℓ = 0, usually producing 2
photons for the spin singlet and 3 for the spin triplet. The decay of the triplet state (called
ortho-positronium) is hence slower, by roughly a factor of α.
• Finally, we turn to ‘quarkonium’, a system of a quark and its antiquark. In this case, the energy
levels are far enough apart that we regard excited states as entirely different particles.
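The C-parity selection rule for positronium annihilation described above is simple enough to encode directly. The sketch below compares the C eigenvalue of the bound state with that of an n-photon state. It starts the search at two photons, since annihilation of a state at rest into a single photon is forbidden by energy-momentum conservation; that starting point is an extra input beyond the C-parity argument, and the function names are illustrative.

```python
def c_parity(orbital_l, total_spin):
    """C eigenvalue (-1)^(l+s) of a fermion-antifermion bound state."""
    return (-1) ** (orbital_l + total_spin)


def photon_state_c_parity(n_photons):
    """C eigenvalue (-1)^n of a state of n photons."""
    return (-1) ** n_photons


def fewest_photons(orbital_l, total_spin):
    """Smallest number of photons (at least 2) allowed by C conservation."""
    target = c_parity(orbital_l, total_spin)
    n = 2
    while photon_state_c_parity(n) != target:
        n += 1
    return n


if __name__ == "__main__":
    print("singlet (l=0, s=0):", fewest_photons(0, 0), "photons")
    print("triplet (l=0, s=1):", fewest_photons(0, 1), "photons")
```

The extra photon required for the triplet costs one more power of α in the rate, matching the statement that ortho-positronium decays more slowly.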
Note. Before tackling the light mesons and baryons, we show the lowest energy meson nonets and
the lowest energy baryon octet and decuplet. The baryons are shown below.
The pseudoscalar and vector mesons are shown below, with the pseudoscalars at left.
Next, we consider the light quark mesons. In this case, we can’t say anything quantitative about
the bound state masses, so we focus on the wavefunctions.
• For simplicity, we consider the ground state n = 1 and ℓ = 0, so we only have to think about
the quark flavor and spin. Since isospin and spin commute, the mesons can be organized into
groups of definite flavor su(3) representation and spin, e.g. the meson octets and singlets we’ve
already seen. Excited states will give further meson multiplets.
• Now, we can write down the wavefunctions of these states. Since we’ve restricted to n = 1 and
ℓ = 0, the position space wavefunction is rather trivial; the color wavefunction must simply be
the color singlet, and the spin wavefunction is totally independent of the flavor wavefunction.
So the only nontrivial part we need is the flavor wavefunction.
• We focus on the states with $I_3 = 0$, which occupy the center of the pseudoscalar nonet. The
pions in this row form an isospin triplet, so the $\pi^0$ is the $I_3 = 0$ state of the triplet, $u\bar{u} - d\bar{d}$.
(Naively we would think there should be a plus sign here, because a minus sign indicates a spin
singlet state, but the minus sign is correct, as explained below.)
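The other two states at the center of the nonet are
$$\eta = u\bar{u} + d\bar{d} - 2s\bar{s}, \qquad \eta' = u\bar{u} + d\bar{d} + s\bar{s}.$$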
• The situation is slightly different for the vector mesons; here the equivalents of the η and η ′
mix, to form the physical states
$$\omega = u\bar{u} + d\bar{d}, \qquad \phi = s\bar{s}.$$
This occurs because su(3) is broken by quark masses, and the strange quark is quite heavy. The
reason this mixing doesn’t happen for η and η ′ is that the singlet η ′ has a large contribution to
its mass due to instanton effects.
• The ϕ is the strange quark analogue of the J/ψ. It decays slowly because it is only barely heavy
enough to decay into two kaons, which carry one s or $\bar{s}$ each, so that channel has little phase
space, while an $s\bar{s} \to g^* \to \ldots$ decay is OZI suppressed; in fact, this was how the OZI rule
was discovered.
• The mesons in the pseudoscalar nonet and vector nonet differ in mass, so part of the strong
force must be spin-dependent. A good empirical model for the meson masses is
$$M = m_1 + m_2 + A\,\frac{S_1 \cdot S_2}{m_1 m_2}$$
where we divide the spins by the masses to get magnetic moments, and the constant A and the
effective quark masses $m_i$ are fit numerically.
• Heuristically, the effective quark masses account for the bare quark masses and the QCD field
energy each quark carries around. The other term accounts for spin-spin coupling through color
magnetic moments, which are inversely proportional to the masses. We’ve already accounted
for the “color electric” force; it just binds the quarks together independent of their flavor and
is counted in the effective masses.
Note. Why are the vector mesons heavier? Consider an analogy with positronium. The energy is
lowest when the magnetic moments are aligned, but since the charges are opposite, this corresponds
to the spins anti-aligned, giving a total of spin 0. The same reasoning holds for mesons, though
it’s color charge rather than electric charge that’s opposite. For baryons, the situation is more
complicated because the color charges differ by more than just a sign, but the same idea holds.
We can get plenty of insight from our simple model above. For example, the splitting between K
and K ∗ is smaller than the splitting between π and ρ because it involves the strange quark, which
has a larger mass. Another example is the Σ-Λ splitting. The Σ0 and Λ have exactly the same
quark content uds, but the latter has isospin zero, so the u and d quarks are antisymmetric in flavor
and hence antisymmetric in spin. Since the u-d spin-spin interaction is the most important, the Λ
is slightly lighter.
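As a rough numerical illustration of the meson mass formula above, the sketch below evaluates it for the π, ρ, K, and K∗ and compares the two splittings. The effective quark masses and the constant A are illustrative values of the kind quoted in textbook fits; they are assumptions made here, not numbers taken from these notes.

```python
# Spin-spin mass formula M = m1 + m2 + A (S1.S2)/(m1 m2), in units with hbar = 1.
# All parameter values below are ASSUMED for illustration, not taken from the notes.
M_U = 308.0                    # assumed effective u/d quark mass, MeV
M_S = 483.0                    # assumed effective s quark mass, MeV
A = 159.0 * (2 * M_U) ** 2     # assumed spin-spin constant, MeV^3


def spin_dot(total_spin):
    """S1.S2 for two spin-1/2 particles with total spin 0 or 1."""
    # S1.S2 = (S^2 - S1^2 - S2^2) / 2, with S1^2 = S2^2 = 3/4
    return 0.5 * (total_spin * (total_spin + 1) - 1.5)


def meson_mass(m1, m2, total_spin):
    """Evaluate the empirical mass formula for a quark-antiquark pair."""
    return m1 + m2 + A * spin_dot(total_spin) / (m1 * m2)


masses = {
    "pi": meson_mass(M_U, M_U, 0),
    "rho": meson_mass(M_U, M_U, 1),
    "K": meson_mass(M_U, M_S, 0),
    "K*": meson_mass(M_U, M_S, 1),
}

for name, mass in masses.items():
    print(f"{name:>3}: {mass:7.1f} MeV")

# The heavier strange quark suppresses the spin-spin term,
# so the K*-K splitting is smaller than the rho-pi splitting.
print("rho - pi:", masses["rho"] - masses["pi"])
print("K* - K  :", masses["K*"] - masses["K"])
```

With these assumed inputs the vector mesons come out heavier than the pseudoscalars, and the K∗-K splitting is reduced by the factor $m_u/m_s$ relative to the ρ-π splitting.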
Note. Extra sign flips arise because there are two competing and incompatible sign conventions.
We would like to define the antiparticles by charge conjugation, e.g. $|\bar{u}\rangle = C|u\rangle$, i.e. so the matrix
elements of C are all positive. On the other hand, we want to work with eigenstates of $I_3$ and
$I^2$ under the Condon–Shortley phase convention, under which the $I_\pm$ have real positive entries.
By definition, charge conjugation flips the isospin,
$$C I_3 C^{-1} = -I_3, \qquad C I_\pm C^{-1} = \alpha I_\mp$$
where α = ±1, because the transformed $I_\pm$ operators must be Hermitian conjugates. We choose
α = −1, which implies
$$C I_1 C^{-1} = -I_1, \qquad C I_2 C^{-1} = I_2.$$
Now consider the isospin doublet $\{|u\rangle, |d\rangle\}$ and their images under C, $|\bar{u}\rangle$ and $|\bar{d}\rangle$. If we didn’t care
about sign conventions, we would have a new isospin doublet $\{|\bar{d}\rangle, |\bar{u}\rangle\}$, but
$$I_- |\bar{d}\rangle = I_- C |d\rangle = -C I_+ |d\rangle = -|\bar{u}\rangle.$$
Then for the Condon–Shortley phase convention to apply we must introduce a relative sign, though
a global sign is still arbitrary; we thus choose the isospin doublet $\{-|\bar{d}\rangle, |\bar{u}\rangle\}$. We can then use
standard tables of Clebsch–Gordan coefficients to add isospin.
Note. The role of antisymmetrization of the wavefunction. At the level of quantum field theory,
the wavefunction for any system of fermions must always be antisymmetrized, whether the fermions
are the same ‘type’ or not, because all fermion creation operators anticommute,
$$\{a_i^\dagger, a_j^\dagger\} = 0.$$
In wavefunction notation, the state of n fermions lives in the totally antisymmetric subspace of
$H^{\otimes n}$, where the single-particle space H includes fermions of all positions, spins, flavors, and colors.
The exchange operation swaps all of these properties, not just the positions.
However, if some particle has a property that none of the other particles share, it can be excluded
from the antisymmetrization without any effect. For example, if an electron is far away from all the
others, we can turn off the antisymmetrization because the only effect is to remove the exchange
force, which that electron doesn’t feel; the electron is ‘distinguishable by its position’. Similarly,
if only one electron in an atom has spin up, we can treat it naively because it won’t violate the
exclusion principle.
In the case of mesons, the constituents can always be treated as distinguishable because only
one of them will be an antiquark. But most baryons contain quarks with the same flavors, in which
case the antisymmetrization matters.
Finally, we turn to the light baryons, carefully accounting for the antisymmetrization. The wave-
function has four parts: position, color, flavor, and spin.
• The position wavefunction is more complicated; the orbital angular momentum must be described
by two parameters (e.g. the angular momentum L of the first two particles about their
center of mass, and the angular momentum L′ of their center of mass and the third particle
about the combined center of mass). We ignore these problems by restricting to n = 1 and
ℓ = ℓ′ = 0, so the position wavefunction is symmetric.
• The color wavefunction must always be the color singlet, i.e. the 1 in
3 × 3 × 3 = 1 + 8 + 8 + 10.
• Now the combined spin and flavor wavefunctions must be symmetric. In the case of spin,
$$2 \times 2 \times 2 = 2_{ma} + 2_{ma} + 4_s$$
where the 4 is spin 3/2 and contains totally symmetric wavefunctions, and the ‘ma’ stands
for ‘mixed antisymmetry’. The decomposition of the remainder into 2 + 2 is not unique; for
example, we have
$$\left|\tfrac12\,\tfrac12\right\rangle_{12} = |10 - 01\rangle|1\rangle, \qquad \left|\tfrac12\,{-\tfrac12}\right\rangle_{12} = |10 - 01\rangle|0\rangle$$
which is antisymmetric in slots 1 and 2, but there is also a 2 antisymmetric in slots 2 and 3, and a 2
antisymmetric in slots 1 and 3.
We can build an allowed baryon multiplet with $10_s \times 4_s$, giving the spin 3/2 baryon decuplet.
• Many of the remaining states are forbidden, since we can’t build symmetric combinations from
them. However, we can build a baryon octet out of the mixed antisymmetric representations,
This would be the baryon octet if quark color didn’t exist; it appears for excited states with
nonzero angular momentum.
• As with the mesons, we can compute the masses using the empirical formula
$$M = m_1 + m_2 + m_3 + A'\left(\frac{S_1 \cdot S_2}{m_1 m_2} + \frac{S_1 \cdot S_3}{m_1 m_3} + \frac{S_2 \cdot S_3}{m_2 m_3}\right).$$
For example, for the baryon decuplet, all pairs of spins are ‘parallel’, so
$$(S_1 + S_2)^2 = S_1^2 + S_2^2 + 2 S_1 \cdot S_2, \qquad S_1 \cdot S_2 = \frac{\hbar^2}{4}$$
which implies that
$$M_{\Sigma^*} = 2m_u + m_s + \frac{\hbar^2 A'}{4}\left(\frac{1}{m_u^2} + \frac{2}{m_u m_s}\right).$$
These predictions are also within 1% of the experimental results, though we need to fit the
quark masses differently.
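The spin decomposition $2 \times 2 \times 2$ used in the list above can be checked numerically. The following is a minimal sketch, assuming ħ = 1 and the labeling 1 = spin up, 0 = spin down; the helper names (`embed`, `swap12`) are introduced here purely for illustration.

```python
import numpy as np

# Single spin-1/2 operators (hbar = 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2, dtype=complex)


def embed(op, slot):
    """Operator acting as `op` on one of three spins (slot = 0, 1, 2)."""
    factors = [I2, I2, I2]
    factors[slot] = op
    return np.kron(np.kron(factors[0], factors[1]), factors[2])


# Total spin components and total spin squared on the 8-dimensional space
S = [sum(embed(s, k) for k in range(3)) for s in (sx, sy, sz)]
S2 = sum(Si @ Si for Si in S)

eigs = np.linalg.eigvalsh(S2)
n_quartet = int(np.sum(np.isclose(eigs, 15 / 4)))  # s = 3/2: s(s+1) = 15/4
n_doublet = int(np.sum(np.isclose(eigs, 3 / 4)))   # s = 1/2: s(s+1) = 3/4

# The state |10 - 01>|1>, with 1 = up and 0 = down
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
state = (np.kron(np.kron(up, down), up)
         - np.kron(np.kron(down, up), up)) / np.sqrt(2)

# Operator exchanging slots 1 and 2: |a b c> -> |b a c>
swap12 = np.zeros((8, 8))
for a in range(2):
    for b in range(2):
        for c in range(2):
            swap12[4 * b + 2 * a + c, 4 * a + 2 * b + c] = 1

print("states with s = 3/2:", n_quartet)
print("states with s = 1/2:", n_doublet)
print("state has s = 1/2:", np.allclose(S2 @ state, 0.75 * state))
print("antisymmetric in slots 1, 2:", np.allclose(swap12 @ state, -state))
```

The eight states split into four with s = 3/2 and four with s = 1/2, i.e. one quartet and two doublets, and the sample state is a spin-1/2 state with $S_z = +1/2$ that is antisymmetric in the first two slots.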
Note. We can also arrive at the above result with more powerful machinery. We combine flavor
and spin into an su(6) symmetry and use the fact
$$6 \times 6 \times 6 = 56_s + 70_{ms} + 70_{ma} + 20_a.$$
Then the $56_s$ is exactly the set with the right symmetry. Restricting to su(3) ⊕ su(2) gives
$$56_s \to (10, 4) + (8, 2)$$
which are exactly the baryon decuplet and octet. The other possibly useful representation is the
antisymmetric one, which breaks up as
$$20_a \to (8, 2) + (1, 4).$$
For the mesons, we have
$$6 \times \bar{6} = 35 + 1 \to (8, 1) + (8, 3) + (1, 3) + (1, 1)$$
which reproduces the two octets and singlets seen before. One might worry that combining a
spacetime and internal symmetry in this way is forbidden by the Coleman–Mandula theorem, but
there’s no problem because we’re working nonrelativistically. We can also handle magnetic moments;
since the magnetic moment operator is in the adjoint 35, and 35 × 56 only contains 56 once, all of
the moments can be expressed in terms of a single one, up to su(6) breaking.
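The dimension counting in this note is easy to verify. The sketch below uses the standard dimension formulas for rank-3 tensors of definite symmetry over an n-dimensional space; the function names are introduced here for illustration only.

```python
from math import comb


def dim_symmetric(n):
    """Dimension of the totally symmetric rank-3 tensors on an n-dim space."""
    return comb(n + 2, 3)


def dim_antisymmetric(n):
    """Dimension of the totally antisymmetric rank-3 tensors."""
    return comb(n, 3)


def dim_mixed(n):
    """Dimension of each of the two mixed-symmetry irreps."""
    return n * (n * n - 1) // 3


for n, label in [(2, "spin su(2)"), (3, "flavor su(3)"), (6, "su(6)")]:
    dims = (dim_symmetric(n), dim_mixed(n), dim_mixed(n), dim_antisymmetric(n))
    assert sum(dims) == n ** 3  # the pieces exhaust n x n x n
    print(f"{label}: {n}^3 = {dims[0]} + {dims[1]} + {dims[2]} + {dims[3]}")

# Branching under su(3) + su(2): each pair is (flavor dimension, spin dimension)
print("56 ->", 10 * 4 + 8 * 2)               # (10, 4) + (8, 2)
print("20 ->", 8 * 2 + 1 * 4)                # (8, 2) + (1, 4)
print("36 ->", 8 * 1 + 8 * 3 + 1 * 3 + 1 * 1)  # meson content of 35 + 1
```

This only checks dimensions, so it is a consistency test of the decompositions rather than a derivation of them.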
Example. We construct the spin wavefunctions using the usual su(2) procedure. We can handle
flavor with an ad hoc method. The $10_s$ is easy because the wavefunctions are totally symmetric and
the quark content is fixed by the strangeness and isospin; for example, the $\Delta^0$ is $|ddu\rangle + |dud\rangle + |udd\rangle$.
The $1_a$ is simply the totally antisymmetric combination.
Now consider the $8_{ma}$ antisymmetric in the first two particles. The outer six states are found by
taking the known quark content and simply antisymmetrizing the first two particles. One of the
center states is part of an isospin triplet and can be found by isospin raising $(|ds\rangle - |sd\rangle)|d\rangle$. The
other center state is found by orthogonality with this state and the $1_a$.
• Quarks can be pair produced by $e^+ e^- \to \gamma^* \to q\bar{q}$. As the high-energy quarks separate, they
emit gluons which emit quark-antiquark pairs. Eventually, each group of particles turns into a
“jet” of hadrons, whose direction is correlated with that of the original hard quark.
• Note that to make the jets colorless, a quark or antiquark needs to be transferred between them.
This doesn’t make much of a difference, since it will be much lower-energy than the original
hard quarks.
• The quarks can also emit a hard gluon, $\gamma^* \to q\bar{q}g$. In this case, we get a three-jet event; such
events were key in establishing that gluons existed.
• Neglecting the masses of all particles, the cross-section for pair production of a single quark
flavor and color is
$$\sigma = \frac{\pi Q^2 \alpha^2}{3 E^2}$$
where Q is the charge of the quark. Therefore,
$$R = \frac{\sigma(e^+ e^- \to \text{hadrons})}{\sigma(e^+ e^- \to \mu^+ \mu^-)} = 3 \sum_i Q_i^2$$
where the 3 is for the three colors of quarks, and the sum is over quarks with masses much less
than E. We expect R to look like a step function of E, jumping up for every flavor of quark.
• There are a few complications. Each step should be smoothed out by the masses. We have
neglected the interaction of the two final-state quarks, but this is very important near a resonance,
where the cross-section has a peak. Above about 50 GeV, R quickly increases because of the
$Z^0$ peak. But overall, the data fits reasonably well, and unambiguously establishes three quark
colors.
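The step structure of R follows directly from the quark charges. A minimal sketch, using exact fractions; the flavor thresholds are only indicated by which flavors are included, and no attempt is made to model the smoothing or resonances mentioned above.

```python
from fractions import Fraction

# Electric charges of the quarks in units of e
CHARGES = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}
N_COLORS = 3


def r_ratio(active_flavors):
    """Naive R = N_colors * sum of squared charges over the active flavors."""
    return N_COLORS * sum(CHARGES[q] ** 2 for q in active_flavors)


for flavors in ("uds", "udsc", "udscb"):
    print(f"{flavors:>5}: R = {r_ratio(flavors)}")
```

Dropping the color factor would reduce every plateau by a factor of 3, which is why the measured height of the steps counts the number of colors.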
• At the most naive level, suppose the proton is a Dirac point charge.
$$|\mathcal{M}|^2 = \frac{e^4}{q^4} L_e^{\mu\nu} L^p_{\mu\nu}, \qquad L_e^{\mu\nu} = 2\left(p_1^\mu p_3^\nu + p_1^\nu p_3^\mu + \eta^{\mu\nu}(m^2 - p_1 \cdot p_3)\right)$$
where the L factors come from the traces.
• In reality, the proton is much more complicated, and we can parametrize our ignorance with
form factors. Letting p be the initial proton momentum, we may write the proton factor as
$$K^{\mu\nu} = -K_1 \eta^{\mu\nu} + \frac{K_2}{M^2} p^\mu p^\nu + \frac{K_4}{M^2} q^\mu q^\nu + \frac{K_5}{M^2} (p^\mu q^\nu + p^\nu q^\mu).$$
We haven’t written the antisymmetric combination, which would have coefficient $K_3$, since $L^{\mu\nu}$
is symmetric.
• Next, we can check that $q_\mu L^{\mu\nu} = 0$, which means that we can choose $K^{\mu\nu}$ so that $q_\mu K^{\mu\nu} = 0$
without affecting the result. This allows us to eliminate $K_4$ and $K_5$, giving
$$K^{\mu\nu} = K_1(q^2) \left(-\eta^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right) + \frac{K_2(q^2)}{M^2} \left(p^\mu + \frac{q^\mu}{2}\right)\left(p^\nu + \frac{q^\nu}{2}\right)$$
where $K_1(q^2)$ and $K_2(q^2)$ have absorbed the effects of $K_4$ and $K_5$, and depend on q. For
example, for the original point charge model, $K_1 = -q^2$ and $K_2 = 4M^2$.
where E and E ′ are the initial and final electron energies, and we have assumed E ≫ m. As a
check, when E ≪ M , the point charge form factors work. Then our result reduces to the Mott
formula, which describes electron scattering off a heavy pointlike target.
• The form factors $K_1(q^2)$ and $K_2(q^2)$ are measured by experiment and indicate the proton is
not pointlike, as expected from QCD.
Next, we turn to the Feynman rules for QCD itself. The coupling is $g_s$, and we define $\alpha_s = g_s^2/4\pi$.
• Quarks are specified by both a spinor polarization and a color. We label the colors with
mid-Latin letters and call them red, blue, and green.
• There is a vertex joining a gluon and two quarks, so the gluon colors must live in $3 \times \bar{3} = 8 + 1$. The
elements of $3 \times \bar{3}$ have colors like $r\bar{r}$ (‘red anti-red’) and $b\bar{g}$ (‘blue anti-green’). The color singlet
$r\bar{r} + b\bar{b} + g\bar{g}$ is analogous to the meson singlet.
• One might wonder whether there is a ninth gluon. Theoretically, this is equivalent to the choice
of gauge group su(3) or u(3). Since the ninth gluon would be a color singlet, it would not be
confined, and would instead mediate a long-range force between color singlets; it would have an
independent coupling since u(3) is not semisimple. Such a force would appear as an anomalous
contribution to gravity, and there was a brief excitement over this in 1986.
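• The eight gluons can be put in correspondence with the eight Gell-Mann matrices $\lambda^\alpha$, where
$$\lambda^1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda^2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda^3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda^4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},$$
$$\lambda^5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda^6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \lambda^7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda^8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}$$
which are normalized to match the Pauli matrices, with $\mathrm{tr}(\lambda^\alpha \lambda^\beta) = 2\delta^{\alpha\beta}$. The colors can be
read off the columns and the anticolors off the rows, so that $\lambda^1$ essentially means ‘red anti-blue
plus blue anti-red’. The generators $T^\alpha = \lambda^\alpha/2$ satisfy
$$[T^\alpha, T^\beta] = i f^{\alpha\beta\gamma} T^\gamma.$$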
– Incoming quarks have a color and spin polarization $u^s(p)\,c$. Similarly, outgoing quarks have
$c^\dagger$, incoming antiquarks have $c^\dagger$, and outgoing antiquarks have c.
– Incoming gluons have a color and polarization $\epsilon_\mu(p)\,a^\alpha$, and outgoing gluons have $\epsilon^*_\mu(p)\,a^{\alpha *}$.
– The propagators are the same as usual, with delta functions in color space.
– The qqg vertex gives a factor of $-i g_s \lambda^\alpha \gamma^\mu/2$.
– The ggg vertex with colors α, β, and γ has a factor of $f^{\alpha\beta\gamma}$ along with other terms. The
gggg vertex is similar, with two structure constants.
• Many simple processes will have amplitudes that look just like the QED amplitudes, but with
an additional ‘color factor’. A useful rule for finding these factors is
$$\lambda^\alpha_{ij} \lambda^\alpha_{k\ell} = 2\delta_{i\ell}\delta_{jk} - \frac{2}{3}\delta_{ij}\delta_{k\ell}.$$
Example. Quark and antiquark scattering, $u + \bar{d} \to u + \bar{d}$. At lowest order, there is one diagram.
The amplitude is the same as in QED except for a color factor, so the potential is
$$V(r) = -f\,\frac{\alpha_s}{r}, \qquad f = \frac{1}{4} (c_3^\dagger \lambda^\alpha c_1)(c_2^\dagger \lambda^\alpha c_4).$$
First, suppose the quark and antiquark are part of a color octet. For concreteness, let the incoming
quark and antiquark be red and anti-blue, respectively. By color conservation, the outgoing quark
and antiquark must also be red and anti-blue, respectively. Then
$$f = \frac{1}{4} \lambda^\alpha_{11} \lambda^\alpha_{22} = -\frac{1}{6}.$$
The color singlet state is $(r\bar{r} + b\bar{b} + g\bar{g})/\sqrt{3}$, so there are nine terms in all; for example, the part
where the quarks come in $r\bar{r}$ and leave $b\bar{b}$ is $(1/4)(1/3)\lambda^\alpha_{21}\lambda^\alpha_{12}$. They can be compactly written as
$$f = \frac{1}{4}\,\frac{1}{3}\,\lambda^\alpha_{ij}\lambda^\alpha_{ji} = \frac{1}{12}\,\mathrm{tr}(\lambda^\alpha \lambda^\alpha) = \frac{4}{3}.$$
Then the force between a quark and antiquark is only attractive if they form a color singlet! This is
nice, but only suggestive; after all, we worked to lowest order, which required asymptotic freedom,
but confinement does not occur in this regime.
Note. In the case $u + \bar{u} \to u + \bar{u}$, we would also have the s-channel diagram. In the case where
the incoming quark and antiquark form a color singlet, this is automatically zero since a singlet
cannot couple to an octet.
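Example. Quark and quark scattering, u + d → u + d. The color factor is very similar,
$$f = \frac{1}{4} (c_3^\dagger \lambda^\alpha c_1)(c_4^\dagger \lambda^\alpha c_2)$$
where the labels on the $c_i$ are as above. Now, $3 \times 3 = 6 + \bar{3}$, so we must consider the sextet and
triplet configurations. They contain the symmetric and antisymmetric parts, respectively:
$$rr,\ bb,\ gg,\ \frac{rb + br}{\sqrt{2}},\ \frac{bg + gb}{\sqrt{2}},\ \frac{gr + rg}{\sqrt{2}}, \qquad \frac{rb - br}{\sqrt{2}},\ \frac{bg - gb}{\sqrt{2}},\ \frac{gr - rg}{\sqrt{2}}.$$
For the sextet, we take rr, which gives
$$f = \frac{1}{4} \lambda^\alpha_{11} \lambda^\alpha_{11} = \frac{1}{3}.$$
For the triplet, we take $(rb - br)/\sqrt{2}$, which gives four terms,
$$f = \frac{1}{4}\,\frac{1}{2}\left(\lambda^\alpha_{11}\lambda^\alpha_{22} - \lambda^\alpha_{21}\lambda^\alpha_{12} - \lambda^\alpha_{12}\lambda^\alpha_{21} + \lambda^\alpha_{22}\lambda^\alpha_{11}\right) = -\frac{2}{3}.$$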
Then the triplet is attractive and the sextet is not. There aren’t triplets observed in nature, but
note that the color singlet for three quarks is totally antisymmetric, so any two of the quarks form
a color triplet. Then every quark in a color singlet baryon attracts every other quark, as expected.
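The color factors in the two scattering examples above can be reproduced by brute force from the Gell-Mann matrices. The sketch below is an illustrative check rather than part of the original derivation; color indices 0, 1, 2 stand for r, b, g, and the array name `X` is introduced here.

```python
import numpy as np


def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0, 0, 1] = lam[0, 1, 0] = 1
    lam[1, 0, 1], lam[1, 1, 0] = -1j, 1j
    lam[2, 0, 0], lam[2, 1, 1] = 1, -1
    lam[3, 0, 2] = lam[3, 2, 0] = 1
    lam[4, 0, 2], lam[4, 2, 0] = -1j, 1j
    lam[5, 1, 2] = lam[5, 2, 1] = 1
    lam[6, 1, 2], lam[6, 2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam


lam = gell_mann()

# Normalization: tr(lambda^a lambda^b) = 2 delta^{ab}
gram = np.einsum("aij,bji->ab", lam, lam)

# X[i, j, k, l] = sum_a lambda^a_{ij} lambda^a_{kl}
X = np.einsum("aij,akl->ijkl", lam, lam)

# Right-hand side of the rule 2 d_il d_jk - (2/3) d_ij d_kl
d = np.eye(3)
rule = 2 * np.einsum("il,jk->ijkl", d, d) - (2 / 3) * np.einsum("ij,kl->ijkl", d, d)

# Quark-antiquark color factors
f_octet = (X[0, 0, 1, 1] / 4).real
f_singlet = (np.einsum("ijji->", X) / 12).real

# Quark-quark color factors
f_sextet = (X[0, 0, 0, 0] / 4).real
f_triplet = ((X[0, 0, 1, 1] - X[1, 0, 0, 1] - X[0, 1, 1, 0] + X[1, 1, 0, 0]) / 8).real

print("normalization ok:", np.allclose(gram, 2 * np.eye(8)))
print("contraction rule ok:", np.allclose(X, rule))
print("octet, singlet:", f_octet, f_singlet)
print("sextet, triplet:", f_sextet, f_triplet)
```

The four factors come out as −1/6, 4/3, 1/3, and −2/3, matching the values quoted in the examples.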
Example. Pair annihilation. Consider the decay of charmonium. There are two tree-level QED
diagrams, $c + \bar{c} \to \gamma + \gamma$, and three tree-level QCD diagrams, $c + \bar{c} \to g + g$. By angular momentum
addition, the amplitude is only nonzero if the charmonium is in the spin singlet state. One can
show that the two amplitudes differ only by the color factor
$$f = \frac{1}{8} a_3^\alpha a_4^\beta (c_2^\dagger \{\lambda^\alpha, \lambda^\beta\} c_1) = \frac{1}{8\sqrt{3}} a_3^\alpha a_4^\beta\, \mathrm{tr}\{\lambda^\alpha, \lambda^\beta\} = \frac{1}{2\sqrt{3}} a_3^\alpha a_4^\alpha$$
where we used the fact that charmonium is in the color singlet state. Now we need to construct the
singlet state for two gluons, i.e. the 1 in
$$8 \times 8 = 27 + 10 + \overline{10} + 8 + 8 + 1.$$
One can show that this state has the form $\sum_{i=1}^{8} |i\rangle|i\rangle/\sqrt{8}$ where $|i\rangle$ is the gluon state corresponding
to $\lambda^i$.
Note. Consider two objects in color representations A and B. Their interaction is proportional to
$$T_a^A T_a^B = \frac{1}{2}\left(T_a^2 - (T_a^A)^2 - (T_a^B)^2\right)$$
where $T_a = T_a^A + T_a^B$ is a generator for total color. Then the attraction is strongest when the total
state has the least color. The same reasoning goes for ordinary electromagnetic interactions or
spin-spin interactions; the net effect will usually be to minimize or maximize the ‘charge’ of the
composite state. This attraction is what leads to color confinement.
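As a small illustration of this note, the sketch below evaluates $T_a^A T_a^B$ from quadratic Casimirs. The Casimir values are the standard ones for su(3) with $T = \lambda/2$; they are supplied here as inputs and do not appear in the notes themselves.

```python
from fractions import Fraction

# Quadratic Casimirs of su(3) representations, normalized with T = lambda / 2.
# These standard values are inputs assumed here, not quoted in the notes.
CASIMIR = {
    "1": Fraction(0),
    "3": Fraction(4, 3),
    "3bar": Fraction(4, 3),
    "6": Fraction(10, 3),
    "8": Fraction(3),
}


def t_dot_t(total, rep_a, rep_b):
    """T^A . T^B = (C(total) - C(A) - C(B)) / 2 for a given total representation."""
    return (CASIMIR[total] - CASIMIR[rep_a] - CASIMIR[rep_b]) / 2


# Quark-antiquark: 3 x 3bar = 8 + 1
print("q qbar singlet:", t_dot_t("1", "3", "3bar"))
print("q qbar octet  :", t_dot_t("8", "3", "3bar"))

# Quark-quark: 3 x 3 = 6 + 3bar
print("q q triplet   :", t_dot_t("3bar", "3", "3"))
print("q q sextet    :", t_dot_t("6", "3", "3"))
```

In each pair the channel with the smaller total color has the more negative value, and the magnitudes 4/3, 1/6, 2/3, and 1/3 agree with the color factors found in the scattering examples.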
2 Symmetries
2.1 Chiral and Gauge Symmetries
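We begin by reviewing some conventions for Dirac spinors.
• The Dirac equation and its adjoint are
$$(i\slashed{\partial} - m)\psi = 0, \qquad \bar{\psi}(i\overleftarrow{\slashed{\partial}} + m) = 0$$
where the left arrow indicates the derivative acts to the left.
• The gamma matrices satisfy
$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$$
where there is an implicit identity matrix on the right-hand side. In the chiral representation,
$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}.$$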
• Dirac masses are not fundamental; in this course we will be more concerned with massless
fermions. Then the chirality projection operators become more important. We define
$$\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (\gamma^5)^2 = 1, \qquad \{\gamma^5, \gamma^\mu\} = 0$$
where the sign of $\gamma^5$ differs between references. Then if ψ solves the massless Dirac equation
$\slashed{\partial}\psi = 0$, then $\gamma^5\psi$ does as well, $\slashed{\partial}(\gamma^5\psi) = 0$. The chirality projection operators are
$$P_L = \frac{1 - \gamma^5}{2}, \qquad P_R = \frac{1 + \gamma^5}{2}, \qquad \psi_L = P_L\psi, \qquad \psi_R = P_R\psi$$
where it is straightforward to show the $P_L$ and $P_R$ project onto orthogonal subspaces,
$$(P_{L,R})^2 = P_{L,R}, \qquad P_L P_R = P_R P_L = 0, \qquad P_L + P_R = 1.$$
In the chiral representation, $\psi_L$/$\psi_R$ has only the upper/lower two components nonzero.
• It’s important to note that a lot of the facts above are conventional. For example, (ψL )∗ is
clearly right-chiral in terms of its Lorentz transformation properties, because the left-chiral and
right-chiral representations are conjugate, but it is annihilated by PR because its bottom two
components remain zero. When we consider the charge conjugation of fields, we will include a
“charge conjugation matrix” whose purpose is to rearrange the components of the naive complex
conjugate so that the familiar properties still hold.
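The gamma matrix conventions above are easy to check numerically. The sketch below builds the chiral representation and verifies the stated properties of $\gamma^5$ and the projectors; it assumes the mostly-minus metric η = diag(1, −1, −1, −1), consistent with the Peskin and Schroeder conventions these notes say they follow.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Chiral representation: gamma^0 off-diagonal, gamma^i built from Pauli matrices
gamma = [np.block([[Z2, I2], [I2, Z2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed mostly-minus metric
I4 = np.eye(4, dtype=complex)


def anticommutator(a, b):
    return a @ b + b @ a


# Clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu}
clifford_ok = all(
    np.allclose(anticommutator(gamma[m], gamma[n]), 2 * eta[m, n] * I4)
    for m in range(4) for n in range(4)
)

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
P_L = (I4 - gamma5) / 2
P_R = (I4 + gamma5) / 2

print("Clifford algebra holds:", clifford_ok)
print("gamma5 diagonal:", np.real(np.diag(gamma5)))
print("P_L keeps upper components:", np.real(np.diag(P_L)))
print("P_R keeps lower components:", np.real(np.diag(P_R)))
```

The diagonal of $\gamma^5$ comes out as (−1, −1, 1, 1), so $P_L$ keeps the upper two components and $P_R$ the lower two, as stated.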
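• Note that the adjoints of the left-chiral and right-chiral fields satisfy
$$\overline{\psi_L} = \bar{\psi} P_R, \qquad \overline{\psi_R} = \bar{\psi} P_L.$$
Thus if we stick to only ψ and $\bar{\psi}$, then $P_R$ projects right-chirality from both directions.
• A massless Dirac fermion has a $U(1)_L \times U(1)_R$ chiral symmetry. The Dirac Lagrangian is
$$\mathcal{L} = \bar{\psi}(i\slashed{\partial} - m)\psi = \overline{\psi_L}\, i\slashed{\partial}\, \psi_L + \overline{\psi_R}\, i\slashed{\partial}\, \psi_R - m\left(\overline{\psi_L}\psi_R + \overline{\psi_R}\psi_L\right)$$
where we get a chirality flip from anticommuting past $\slashed{\partial}$. Then when m = 0, we can rotate the
phases of $\psi_L$ and $\psi_R$ independently. Adding the mass term requires the phases to be rotated
the same way, breaking the symmetry to a $U(1)_V$ “vector” symmetry. Rotating the phases
oppositely gives a $U(1)_A$ “axial” transformation.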
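• The procedure for non-abelian gauge symmetry is similar. The matter field now transforms in
a unitary representation r of the gauge group G, with transformation
$$\psi_i(x) \to \exp(i t^a \alpha^a(x))_{ij}\, \psi_j(x) = U_{ij}\psi_j(x), \qquad \bar{\psi}_i(x) \to \bar{\psi}_j(x) \exp(-i t^a \alpha^a(x))_{ji} = \bar{\psi}_j(x)(U^\dagger)_{ji}$$
where the $t^a$ are the Hermitian generators in this representation, and satisfy
$$[t^a, t^b] = i f^{abc} t^c, \qquad \mathrm{tr}(t^a t^b) = T(r)\,\delta^{ab}$$
where T(r) is the Dynkin index, which is 1/2 for the fundamental representation. The covariant
derivative is defined so that
$$(D_\mu)_{ij} = \partial_\mu \delta_{ij} + i g (t^a A^a_\mu)_{ij}, \qquad (D_\mu\psi(x))_i \to (U(x) D_\mu \psi(x))_i$$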
• More generally, we define the covariant derivative of any object X similarly, but with $A_\mu$ in
the appropriate representation; then the covariant derivative DX transforms just like X. For
example, the infinitesimal transformation of $A_\mu$ itself is
$$A_\mu \to A_\mu - \frac{1}{g}\left(\partial_\mu \alpha + i g [A_\mu, \alpha]\right) = A_\mu - \frac{1}{g} D_\mu \alpha$$
where the $D_\mu$ acts as if α is in the adjoint representation. Note that $A_\mu$ doesn’t transform in
any definite representation, much like how the connection in GR is not a tensor.
That is, as in general relativity, the commutator of two covariant derivatives is a tensor, not a
differential operator. Then the field transforms in the adjoint representation, as
• More generally, we can think of the field as an infinitesimal Wilson loop. In a sense, the most
general gauge invariant observable is the trace of a Wilson loop.
• In some sources, for e.g. gauge group SU (n), the gauge field Aµ is thought of as an n × n matrix
rather than an abstract element of su(n), leading to equations like
Another example is a matter field which transforms in the adjoint representation, with
$$\mathcal{L} = \mathrm{tr}\, \bar{\psi}(i\slashed{D} - m)\psi, \qquad \psi \to U \psi U^{-1}$$
where U is the same gauge transformation we would have in the fundamental representation,
ψ is now a matrix, and $A_\mu$ again acts by commutator in the covariant derivative. Expressions
like these are less mathematically general but can be easy to compute with.
• The symmetry can be intact, e.g. the gauge symmetries U (1)EM and SU (3)C .
• The symmetry can be anomalous, holding in the classical theory but not the quantum theory,
e.g. the global axial symmetry U (1)A .
• The symmetry can be explicitly broken in the Lagrangian, e.g. isospin SU (2) or generally flavor
SU (6). This is useful as long as the symmetry is approximate.
• The symmetry can be spontaneously broken, i.e. the vacuum does not respect the symmetry
though the Lagrangian does, e.g. SU (2)L × U (1)Y is spontaneously broken to U (1)EM .
• Let W be an operator on a Hilbert space with inner product (·, ·). If W is unitary and linear,
Wigner’s theorem states that groups of operators that preserve norms (and hence observable
probabilities) must be unitary or anti-unitary.
• Let W (Λ, a) be the operator on the state space that corresponds to a Poincare transformation
consisting of a Lorentz transformation Λ followed by a translation a. Then
We also consider the improper Poincare transformations of parity and time reversal, defining
$$\Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu, \qquad a^\mu = \epsilon^\mu.$$
energy $H = P^0$, momentum $\mathbf{P} = (P^1, P^2, P^3)$
and
angular momentum $\mathbf{J} = (J^{23}, J^{31}, J^{12})$, Lorentz boosts $\mathbf{K} = (J^{01}, J^{02}, J^{03})$.
• Considering $\hat{P} W(\Lambda, a) \hat{P}^{-1}$ and $\hat{T} W(\Lambda, a) \hat{T}^{-1}$ for an infinitesimal translation,
If $\hat{P}$ were antilinear, then it would flip H in conjugation, implying a negative energy state for
every positive energy state. This is unacceptable, as it would forbid the existence of a ground
state, so $\hat{P}$ is linear and hence unitary. Similarly, $\hat{T}$ is anti-unitary.
• With the linearity and antilinearity established, we can now conjugate our other operators by
$\hat{P}$ and $\hat{T}$ to see how they transform. We find
$$\hat{P}\mathbf{P}\hat{P}^{-1} = -\mathbf{P}, \qquad \hat{P}\mathbf{J}\hat{P}^{-1} = \mathbf{J}, \qquad \hat{P}\mathbf{K}\hat{P}^{-1} = -\mathbf{K}$$
and
$$\hat{T}\mathbf{P}\hat{T}^{-1} = -\mathbf{P}, \qquad \hat{T}\mathbf{J}\hat{T}^{-1} = -\mathbf{J}, \qquad \hat{T}\mathbf{K}\hat{T}^{-1} = \mathbf{K}.$$
Moreover, upon applying the relations above, we find that parity acts on one-particle states by
changing their momenta and angular momenta as implied above, along with a phase factor $\eta_P$
which depends only on the particle species, called the intrinsic parity.
• Under the naive assumptions we have made above, $\hat{P}$ and $\hat{T}$ automatically commute with H.
That is, our initial assumptions are equivalent to assuming that $\hat{P}$ and $\hat{T}$ violation don’t occur!
To allow it, we need to think more carefully.
First, we review the basics of representation theory, as covered in the notes on Group Theory.
• The representations of SO(3) are indexed by a nonnegative integer s called the spin. The
representations of its double/universal cover SU(2) are indexed by a nonnegative half-integer,
and representations of SU(2) correspond to projective representations of SO(3).
• If we include Lorentz boosts, we arrive at the connected Lorentz group $SO(3, 1)_0$, whose
double/universal cover is SL(2, C). In general, such a double cover is called a spin group. The
finite-dimensional representations are indexed by two half-integers $(s_1, s_2)$, where $s = s_1 + s_2$ is
called the spin. When s is not an integer, the representation is projective.
• Note that restricting to rotations does not produce the spin $s_1 + s_2$ representation of SU(2).
Instead, every spin from $|s_1 - s_2|$ to $s_1 + s_2$ in integer steps is represented.
• Sometimes one hears that “for massless particles, chirality is the same thing as helicity”. This
is an oversimplification that can lead to confusion. Helicity is defined for particles, chirality is
defined for fields, and the two can behave rather differently.
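The statement about restricting a Lorentz irrep to rotations can be illustrated with a short helper. This is a minimal sketch; the function names are introduced here, and the dimension $(2s_1 + 1)(2s_2 + 1)$ of the $(s_1, s_2)$ irrep is the standard formula, used as a consistency check.

```python
from fractions import Fraction


def rotation_content(s1, s2):
    """Spins appearing when the (s1, s2) Lorentz irrep is restricted to rotations."""
    s1, s2 = Fraction(s1), Fraction(s2)
    spins = []
    j = abs(s1 - s2)
    while j <= s1 + s2:
        spins.append(j)
        j += 1
    return spins


def dimension(s1, s2):
    """Dimension of the (s1, s2) irrep."""
    return int((2 * Fraction(s1) + 1) * (2 * Fraction(s2) + 1))


half = Fraction(1, 2)
for s1, s2 in [(half, 0), (0, half), (half, half), (1, 0), (1, half)]:
    spins = rotation_content(s1, s2)
    # The rotation multiplets must account for every state in the irrep
    assert sum(2 * j + 1 for j in spins) == dimension(s1, s2)
    print(f"({s1}, {s2}): spins {[str(j) for j in spins]}, dimension {dimension(s1, s2)}")
```

For example, the vector irrep (1/2, 1/2) contains spin 0 and spin 1, matching the time and space components of a four-vector.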
Next, we confront the issue of discrete symmetries, and their possible violation.
• We introduce parity and time reversal by going to the group O(3, 1). Ignoring the issue
of projective representations, the assertion that the Hilbert space carries a representation
of $O(3, 1) \rtimes \mathbb{R}^{3,1}$ carries dynamical content, because it automatically implies $\hat{P}$ and $\hat{T}$ are
conserved. That is, postulating a representation of a set of physical operations exists is a
nontrivial statement about the dynamics, when one of the operations is time translation.
• For spinless particles, if we have a representation of $O(3, 1) \rtimes \mathbb{R}^{3,1}$, then $\hat{P}^2 = 1$ and particles
have parity ±1. Now consider a spinless theory that violates parity. In this case, we can still
talk about parity for asymptotic states, because they are free; we define parity just as in the
free theory. This is why we can speak about the change of parity in a scattering process.
• More generally, we must allow projective representations. For the Poincaré group, it suffices
to promote $SO(3, 1)_0$ to SL(2, C). There is a two-to-one map $\pi : SL(2, C) \to SO(3, 1)_0$, which
can be extended to include the parity operation.
• However, there are two ways to incorporate parity; if $\mathcal{P} \in O(3, 1)$ is parity, then $\pi^{-1}(\mathcal{P})$ contains
two elements. Letting $\pi(P) = \mathcal{P}$, we have
$$\pi(P^2) = \mathcal{P}^2 = 1$$
which implies that $P^2 = \pm 1$. This is a genuine physical ambiguity, and it isn’t presently known
which is the right option in reality.
• In the case M > 0, if $P^2 = 1$ then $\hat{P}^2 = 1$, and for each s we have two representations, of
intrinsic parities ±1. If $P^2 = -1$, we instead have $\hat{P}^2 = (-1)^F$ where F is the fermion number.
Specifically, for integer s the intrinsic parities are ±1 and for half-integer s the intrinsic parities
are ±i. This doesn’t contradict the fact that $P^2 = -1$ because for integer s, −1 is represented
as +1.
• However, Dirac fermions carry other conserved quantum numbers, and we may replace $\hat{P}$ with
$\hat{P} e^{i\alpha Q}$ for any conserved charge Q to find the same experimental consequences; in the SM the
conservation of electric charge, lepton number, and baryon number are sufficient to redefine
parity so that $\hat{P}^2 = 1$ in all cases. Stated another way, other conservation laws always rule out
possible experimental tests between the situations above.
• On the other hand, if a Majorana fermion were discovered, it would carry no conserved charges,
so it could distinguish between the possibilities. Specifically, if $\hat{P}^2 = (-1)^F$, then no process
which conserves parity can turn this particle into three copies of itself, since $(\pm i) \neq (\pm i)^3$.
• In the case M = 0, parity implies that irreps must contain helicities of ±λ in pairs; this is
also a consequence of CP or CPT. However, if we also demand that parity does not change
the values of internal quantum numbers, then there’s no reasonable way to define parity for
a theory with a single Weyl spinor. The helicities still come in pairs, but the pairing requires
flipping internal quantum numbers; we instead call this symmetry CPT.
• In the real world, parity is not conserved, but with the exception of chiral theories (e.g. with
a single Weyl spinor) where parity cannot even be reasonably defined, the free Hamiltonian
always commutes with parity. Thus parity can be defined in terms of the free theory, allowing
the parities of asymptotic particles to be defined.
• In the above discussion, we have neglected time reversal. When we account for both parity
and time reversal and allow for projective representations, we find eight possibilities in total,
though the so-called Pin groups are the mathematically nicest.
2.3 Parity
Now we investigate parity more precisely, beginning with the scalar field. As we motivated above,
we focus on defining parity on free fields.
where $a^\dagger(p)$/$c^\dagger(p)$ create particles/antiparticles with momentum p. We use the relativistic
normalization convention, so the created states have squared norm $2E_p$.
• Now, parity should preserve the number of particles and flip the momentum, so
• Inserting $\hat{P}^{-1}\hat{P} = 1$ above and assuming the vacuum is parity invariant, $\hat{P}|0\rangle = |0\rangle$, we find
where we reindexed the sum. This looks rather different from our previous expression; moreover,
$[\phi^P(x), \phi^\dagger(y)]$ does not necessarily vanish for spacelike x and y.
$$\phi^P(x) = \eta_P\, \phi(x_P).$$
• If ϕ is a real field, then c(p) = a(p), so $\eta^c = \eta^a$, which implies that $\eta_P$ is real, and hence it is
±1. The case +1 is a scalar, and the case −1 is a pseudoscalar.
• On the other hand, for a complex field $\eta_P$ can be an arbitrary phase, but there is a U(1) internal
symmetry which may yield a conserved charge Q. In this case, we can always replace $\hat{P}$ with
$\hat{P} e^{-i\alpha Q}$, where α may be chosen so that $\hat{P}^2 = 1$, so that $\eta_P = \pm 1$.
• Another way of saying this is that the complex scalar isn’t really a different case than a real
scalar. Everything that can be expressed in terms of complex scalars can be expressed in terms
of pairs of real scalars with appropriate U (1) symmetries. Choosing a description in terms of
complex scalars is purely a matter of convention and convenience, which pays off when the U (1)
symmetries at least approximately hold in the interacting theory.
where the $\epsilon^\lambda_\mu$ are polarization vectors. It can be shown, using the desired properties of parity
defined in the previous section, that they transform as
• The rest of the argument goes as before, so for a real vector field
We now review conventions for the Dirac field, which is more subtle.
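where $b^\dagger$ and $d^\dagger$ create particles and antiparticles of momentum p, and the spinors satisfy
$$(\slashed{p} - m)u(p) = 0, \qquad (\slashed{p} + m)v(p) = 0$$
for components s = ±1/2. In the chiral representation their components are
$$u^s(p) = \begin{pmatrix} \sqrt{p \cdot \sigma}\, \xi^s \\ \sqrt{p \cdot \bar{\sigma}}\, \xi^s \end{pmatrix}, \qquad v^s(p) = \begin{pmatrix} \sqrt{p \cdot \sigma}\, \zeta^s \\ -\sqrt{p \cdot \bar{\sigma}}\, \zeta^s \end{pmatrix}, \qquad \sigma^\mu = (1, \boldsymbol{\sigma}), \qquad \bar{\sigma}^\mu = (1, -\boldsymbol{\sigma})$$
and a useful basis of two-component spinors is $\xi^{1/2} = (1, 0)^T$ and $\xi^{-1/2} = (0, 1)^T$, which have
spin up and spin down along ẑ for both the positive and negative frequency solutions. We’ll
use a different basis for the negative frequency solutions for convenience, as explained below.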
• The spin angular momentum operator can be found by taking the conserved quantity due to
rotations and subtracting off the orbital contribution, giving
$$S^i = \frac{i}{4} \epsilon^{ijk} \gamma^j \gamma^k = \frac{1}{2} \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix}, \qquad \gamma^5 S^i = S^i \gamma^5 = \frac{1}{2} \gamma^0 \gamma^i.$$
• For a classical solution to the Dirac equation, define $h = \mathbf{S} \cdot \hat{\mathbf{p}}$. Inserting a factor of $P_L + P_R$,
$$h\, u^s_{L,R} = \mp \frac{1}{2} u^s_{L,R}, \qquad h\, v^s_{L,R} = \mp \frac{1}{2} v^s_{L,R}$$
where the L/R subscripts indicate left-chiral or right-chiral Weyl fields.
• The physical interpretation is a bit tricky. For positive frequency solutions, h is equal to the
helicity λ of the corresponding particle. For negative frequency solutions, the parameter p is
the opposite of the physical momentum, as they are proportional to $e^{ipx}$ rather than $e^{-ipx}$.
• Upon quantization negative frequency solutions become holes, which flips p, S, and all other
quantum numbers. The fact that p is already flipped once in the definition of $v^s(p)$ means that
the particle corresponding to $v^s(p)$ indeed has momentum p, with no sign. But since the spin
is flipped, we have h = −λ for negative frequency solutions. Since the charge is flipped, these
particles are called antiparticles.
• Thus, a left-chiral Weyl field annihilates a left-helicity (negative helicity) particle and creates
a right-helicity (positive helicity) antiparticle. Similarly, a right-chiral Weyl field annihilates a
right-helicity particle and creates a left-helicity antiparticle. We see that each of these Lorentz
irrep fields gives rise to two Poincare particle irreps.
• For example, a “left-chiral antiquark field” is one which annihilates a left-helicity antiquark.
It would be the charge conjugate of a left-chiral quark field, and the parity conjugate of a
right-chiral antiquark field, assuming these fields exist at all in the theory; if they do not,
parity and charge conjugation aren’t defined.
• For reference, for a massless particle moving in the +ẑ direction, we have
0 1 0 0
0 0 1 0
spin up: u(p) =
1 , v(p) = 0 , spin down: u(p) = 0 , v(p) = 0
0 0 0 1
as can be verified in the chiral basis using p · σ = pP · σ. Requiring the transformed field to
take the same form as the original field, we must have
\[ \psi^P(x) = \eta_P \gamma^0 \psi(x_P), \qquad \eta_P = \eta^b = -\eta^{d*}. \]
This is a special case of the fact that parity maps the (s1, s2) Lorentz irrep to (s2, s1). We can
then straightforwardly check that ψ^P satisfies the Dirac equation, that ψ̄ψ is a scalar and ψ̄γ^5ψ
is a pseudoscalar, and so on.
• We have freedom in choosing the phase ηP as described above, using global U (1) symmetries,
and in the SM this freedom is used to set the intrinsic parities of the proton, neutron, and
charged leptons to +1. Note that this point is unrelated to the transformations of Dirac bilinears,
where ηP cancels out.
• We also note that, regardless of phase adjustments, we have η^b η^d = −1, which means that
a two-particle state containing a fermion and its antiparticle has an extra factor of −1 in its
intrinsic parity, as we previously noted in our qualitative overview. This logic holds unchanged
for Majorana fermions, where the fermion and its antiparticle coincide. The same result holds
for the charge conjugate of a fermion and its antiparticle.
• Consider a set of classical fields ψ_i that transform under some representation R. Then the
complex conjugate fields ψ_i^* transform under the conjugate representation R^*, though they
generally won’t be in the “standard” basis. We return to the standard basis using a “charge
conjugation matrix” C, and call the operation ψ → ψ^{(c)} = Cψ^* charge conjugation.
• Since the (1/2, 0) and (0, 1/2) Lorentz representations are conjugate, this notion of charge
conjugation flips the chirality. This is the notion of charge conjugation we used when studying
group theory. It comes from the classical theory, and is useful mainly for constructing real, singlet
Lagrangians. It is not the same as the Ĉ we study below, which instead corresponds to the
intuitive idea of “exchanging matter and antimatter”.
• The field ψ^{(c)} simply does the reverse: it annihilates what ψ creates, and vice versa. In particular,
classical charge conjugation doesn’t modify the particle content at all; a Lagrangian written in
terms of only ψ is equivalent to one written in terms of only ψ^{(c)}.
• Note that if R is complex, the particles definitely cannot be identified with their antiparticles,
while if R is real they might or might not be.
• The situation is more complicated when we are talking about spacetime symmetries, since fields
have Lorentz symmetry and particles have Poincare symmetry; we’ve seen how chirality for
fields corresponds to helicity for particles above.
• A rough heuristic is that classical charge conjugation conjugates both internal and spacetime
representations, while, in a Ĉ-symmetric theory, Ĉ conjugates exactly the internal representations
(when acting on the free “in/out” states), and in a Ĉ-asymmetric theory, Ĉ might not even be
defined on those states at all. In terms of representations, the two notions of charge
conjugation differ essentially by a parity transformation, leading to confusion when people use
different versions of it.
Now we define Ĉ, starting with the scalar field.
• We begin by demanding that the particle and antiparticle operators should be exchanged,
\[ \hat C a(p) \hat C^{-1} = \eta_C c(p), \qquad \hat C c(p) \hat C^{-1} = \eta_C^* a(p) \]
where we used Lorentz invariance as before to constrain the phases. Then we have, for instance
\[ \hat C |p, \text{particle}\rangle = \hat C a^\dagger(p) |0\rangle = \eta_C^* c^\dagger(p) |0\rangle = \eta_C^* |p, \text{antiparticle}\rangle. \]
In terms of the field, this gives
\[ \hat C \phi(x) \hat C^{-1} = \eta_C \phi^\dagger(x), \qquad \hat C \phi^\dagger(x) \hat C^{-1} = \eta_C^* \phi(x). \]
For a real scalar field, this implies η_C = ±1, while for a complex scalar field we can perform a
rotation so that η_C = 1. In the former case, this means that particles are eigenstates of Ĉ, so
the symmetry can provide selection rules.
for Ĉ to be a symmetry of QED. That is, photons have intrinsic charge conjugation eigenvalue Ĉ = −1.
Physically, this is because the coupling to matter is of the form A^µ J_µ, where the current J_µ
certainly flips sign under charge conjugation.
• We define the positive frequency and negative frequency basis spinors to be related by
\[ \zeta^s = i \sigma^2 \xi^{s*}. \]
This gives an extra sign flip at the classical level, which ensures that the particles created by
b^{s†}(p) and d^{s†}(p) have the same physical spin, just as they have the same physical momentum.
\[ \gamma^{\mu T} = -C^{-1} \gamma^\mu C. \]
One can show that C is real, anti-symmetric, and unitary, γ^{5T} = C^{-1} γ^5 C, and the γ^µ C are
symmetric, using only the properties of the Clifford algebra. For the chiral representation,
\[ C = -i \gamma^0 \gamma^2 = \begin{pmatrix} i\sigma^2 & 0 \\ 0 & -i\sigma^2 \end{pmatrix}. \]
\[ \hat C b^s(p) \hat C^{-1} = \eta_C d^s(p), \qquad \hat C d^{s\dagger}(p) \hat C^{-1} = \eta_C b^{s\dagger}(p) \]
where the phases are equated as usual, and we used the fact that Ĉ doesn’t change spacetime
quantum numbers such as momentum and spin. Thus Ĉ preserves helicity.
These equations can also be used (sometimes unwittingly) to define Ĉ on classical fields, with
the caveat that this differs from classical charge conjugation by a parity transformation.
Note. In practice, the simple definition of Ĉ above might not work, while a slightly different
definition which lacks some of the usual properties of Ĉ (such as flipping all internal quantum
numbers) may be more useful. For example, in the Standard Model with a sterile neutrino, charge
conjugation must exchange the active and sterile neutrinos, if it is to keep the spacetime quantum
numbers the same. But the active and sterile neutrinos don’t have opposite internal quantum
numbers, e.g. the active neutrinos have hypercharge and the sterile neutrinos don’t. A strict
interpretation would lead to the conclusion that Ĉ can’t be defined in such a theory. However, it
is more common to loosen the criteria and allow Ĉ to be defined this way anyway. This is useful
because it leads to an approximate symmetry, which is only broken by weak interactions.
This illustrates an important point when discussing discrete symmetries. The point of symmetries
is precisely to be able to use them to understand the dynamics. It doesn’t make sense to worry
about whether some operator is “the true Ĉ” in some metaphysical sense. Nature doesn’t care: the
theory described above will still have a Ĉ-like symmetry constraining it, whether we call it that
or not. As another example, in some “left-right symmetric theories”, it is conventional to allow
parity to switch the internal representations of SU(2)_L and SU(2)_R, which is again useful precisely
because it leads to an approximate symmetry. (However, to give credit to the mathematicians, the
definition of ĈP̂T̂ is more “canonical”, because it is the conserved quantity guaranteed to us by
the CPT theorem. This operator always flips all internal quantum numbers and the helicity.)
• The charge conjugate spinor ψ^c satisfies the Dirac equation. To see this, take the transpose of
the Dirac equation for ψ̄ for
\[ (-i \gamma^{\mu T} \partial_\mu - m) \bar\psi^T(x) = 0. \]
Inserting factors of C^{-1} C and using γ^{µT} = −C^{-1} γ^µ C gives the result.
• A Majorana fermion has b^s(p) = d^s(p). That is, Majorana fermions are Dirac fermions that are
their own antiparticles, ψ^c = ψ. They arise from quantizing solutions to the Dirac equation obeying a
reality condition. Then a spin up Majorana fermion can be described by either the spinor ζ^{1/2}
or ξ^{1/2}, where the spinors are related by ζ^s = iσ^2 ξ^{s*}. Note that a Majorana field doesn’t have
a definite chirality, just like a Dirac field.
\[ j^\mu(x) = \frac{1}{2} \left( \bar\psi \gamma^\mu \psi - \psi^T \gamma^{\mu T} \bar\psi^T \right) = \frac{1}{2} (\gamma^\mu)_{ij} [\bar\psi_i(x), \psi_j(x)] \]
where the sign flip from the transpose is explained in the notes on Quantum Field Theory.
Applying charge conjugation, we have
\[ \hat C j^\mu \hat C^{-1} = \frac{1}{2} (\gamma^\mu)_{ij} [\hat C \bar\psi_i \hat C^{-1}, \hat C \psi_j \hat C^{-1}] = -\frac{1}{2} (\gamma^\mu)_{ij} [(\psi^T C^{-1})_i, (C \bar\psi^T)_j] = \frac{1}{2} (\gamma^\mu)_{\ell k} [\psi_k, \bar\psi_\ell] = -j^\mu. \]
On the other hand, we know that the electromagnetic field is coupled as A^µ j_µ, so for QED to
be charge conjugation invariant, we must define Ĉ A^µ Ĉ^{-1} = −A^µ.
• Similarly, one can show that the axial current is even under Ĉ. This implies that it is impossible
to couple a linear combination of the vector and axial currents to a single field without violating
Ĉ, and this is exactly what happens in the weak interactions.
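The claim above that the charge conjugate spinor satisfies the Dirac equation can be spelled out in a couple of lines. The sketch below assumes ψ^c = C ψ̄^T up to a phase, which is the form suggested by the factor (C ψ̄^T)_j in the current computation; the intermediate steps are filled in here and are not taken from the notes.

```latex
\begin{align*}
0 &= (-i \gamma^{\mu T} \partial_\mu - m)\, \bar\psi^T
   = (i\, C^{-1} \gamma^\mu C\, \partial_\mu - m\, C^{-1} C)\, \bar\psi^T \\
  &= C^{-1} (i \gamma^\mu \partial_\mu - m)\, C \bar\psi^T
  \quad \Longrightarrow \quad (i \gamma^\mu \partial_\mu - m)\, \psi^c = 0 .
\end{align*}
```

The only input is γ^{µT} = −C^{-1} γ^µ C, so the argument does not depend on the representation chosen for the gamma matrices.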
Majorana spinors can be a bit confusing, because people use the term in many distinct ways, so we
treat them carefully.
• To avoid confusion, we start with two-component Weyl fields, since Dirac and Majorana fields
are built out of them. Suppose we have a left-chiral Weyl spinor field ψ which transforms under
a representation R of an internal symmetry group. It annihilates a particle with negative helicity
in the representation R, and creates a particle with positive helicity in the representation R̄.
• In general, complex conjugating a quantum field just reverses which particles it creates and
annihilates. The conjugate field ψ† is a right-chiral Weyl spinor with internal symmetry
representation R̄. It annihilates a particle with positive helicity in the representation R̄, and creates
a particle with negative helicity in the representation R.
• Therefore, to describe a set of particles with |h| = 1/2, we can use only left-chiral Weyl fields,
or only right-chiral Weyl fields, or a mixture of both. The field content of a theory is somewhat
arbitrary. Note that the framework of fields can only describe particles which come in matter-
antimatter pairs: for every particle species transforming in a given internal representation, there
must be another particle species with opposite helicity and the same mass, transforming in the
conjugate internal representation. This is a consequence of CPT symmetry.
• With a single left-chiral Weyl field ψ, there are only two ways to produce quadratic Lorentz-
invariant terms in the Lagrangian. We know ψ transforms in (1/2, 0), and its conjugate
transforms in (0, 1/2). Since (1/2, 0) × (1/2, 0) = (1, 0) + (0, 0), contracting the field with itself
can yield a scalar. Since (1/2, 0) × (0, 1/2) = (1/2, 1/2), contracting the field with its conjugate
yields a four-vector, which can yield a scalar upon contraction with ∂ µ .
\[ \mathcal{L} \supset i \psi^\dagger \bar\sigma^\mu \partial_\mu \psi + m \psi\psi \]
where the second term is a two-component spinor contraction, defined in the notes on
Supersymmetry, and the σ̄^µ are just coefficients that isolate the appropriate scalar contraction.
• Now suppose ψ transforms in a representation R of an internal symmetry group. The first term
is automatically invariant, but the second term transforms as R × R, so it can only be invariant
if R is a real representation. The logic is precisely the same if the symmetry group is a gauge
group, except that ∂µ must be replaced with an appropriate covariant derivative Dµ .
• For a right-chiral Weyl field χ, the logic is the same, but the terms are written as
L ⊃ iχ† σ µ ∂µ χ + mχχ.
• Now, we can always stack a Weyl field and its conjugate into a four-component spinor field,
\[ \Psi = \begin{pmatrix} \psi \\ \psi^{(c)} \end{pmatrix}. \]
This is just a change of notation. There are still two possible terms in the Lagrangian,
\[ \mathcal{L} \supset \frac{1}{2} \bar\Psi (i \slashed{\partial} - m) \Psi, \]
which are just the same as the original ones, up to conventions for factors of 2, once one expands
out the products. As before, the mass term is only allowed if R is real.
• Here’s the tricky part: if there’s a gauge field, the kinetic term should become
\[ \mathcal{L} \supset \frac{1}{2} \bar\Psi \left( i (\slashed{\partial} + i e \gamma^5 \slashed{A}) - m \right) \Psi. \]
That is, we do not use the minimal coupling prescription. Minimal coupling is a procedure for
generating a scalar Lagrangian given fields which transform in known representations. But in
general, Ψ does not transform in a well-defined representation of the internal symmetry group,
because the top half transforms in R and the bottom half transforms in R̄. Again, we can
confirm the γ^5 has to be there by expanding everything in components. (A more common way
to do this would be to add a chiral projector PL . It leads to the same result when expanded in
components, but our way treats the two halves of Ψ symmetrically.)
• Calculations with Standard Model fermions can be done with either two-component or four-
component spinor fields. In both cases, explicit mass terms are forbidden, but masses are
permitted by the Higgs mechanism. The advantage of four-component notation is that one can
use familiar techniques for the traces of gamma matrices; the disadvantage is that γ 5 appears.
• Now we’re ready to answer the key question: what is a Dirac spinor? Often, particles
transforming in a representation R can be paired with other particles, of the same mass and same
helicity, transforming in the representation R̄. For instance, this can be done for all particles
if the theory is symmetric under charge conjugation. We can describe a pair of such particle
species using a pair of left-chiral and right-chiral Weyl spinor fields, ψ and χ, which transform
in the same representation R.
• This allows a new term in the Lagrangian: we can contract one with the conjugate of the other
to get a scalar, no matter what R is. This is called a Dirac mass term, and it is most easily
written in four-component notation. Stacking these fields into a four-component spinor,
\[ \Psi = \begin{pmatrix} \psi \\ \chi \end{pmatrix} \]
the Lagrangian is
\[ \mathcal{L} \supset \bar\Psi \left( i (\slashed{\partial} + i e \slashed{A}) - m \right) \Psi \]
where there’s no factor of 1/2, since the two halves of Ψ are distinct particles, and we simply
have a covariant derivative with no need for γ^5, since both halves of Ψ transform in R. At the
level of two-component spinors the Dirac mass term looks like ψ†χ + χ†ψ.
– Starting with a Dirac spinor transforming in a representation R, one can define a Majorana
spinor by additionally imposing a reality condition Ψ(c) = Ψ. In our language, this is
equivalent to setting ψ = χ, and demanding invariance of the kinetic term implies R must
be real. This is the source of the claim that Majorana spinors can’t be charged.
– Starting with a Weyl spinor transforming in a representation R, one can define a Majorana
spinor by stacking it on its conjugate. Demanding invariance of the explicit mass term
implies R must be real, but if there is no such term, R can be arbitrary. This is not
contradictory with the previous point, because in this case Ψ does not transform in a
well-defined representation of the internal symmetry group.
Note. Imposing a reality condition might seem a bit artificial; alternatively, it’s simple to produce
Majorana spinors starting from only Dirac spinors. For example, suppose there is a global U (1)
symmetry, a Dirac field with charge 1, and a scalar field H with charge −2. We can then write
down terms like ψψH, which turns into a Majorana mass for ψ when H gets a vev. This doesn’t
contradict the statement that massive Majorana spinors can’t be charged, because the U (1) is
spontaneously broken by H.
This simple mechanism won’t show up in typical textbooks, because they often only consider the
U (1) of electromagnetism, which we know holds to extreme precision. However, it’s a common tool
in model building for dark matter, where we might have a “dark” U (1) separate from the Standard
Model gauge groups. When the Majorana mass terms are much larger than the Dirac mass term,
we get two distinct Majorana spinors, while if they’re smaller, then we have the “pseudo-Dirac”
case where the Majoranas ψ and χ have only a small splitting. The same idea can be applied to
separate the components of a complex scalar, giving the “inelastic scalar” case. The latter two
are concrete examples of “inelastic dark matter”, where collisions can excite or de-excite the dark
matter by interconverting the two particle species, leading to distinctive experimental signatures.
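To see the two regimes numerically, here is a minimal sketch. It assumes, purely for illustration, that the Majorana masses and the Dirac mass are real and enter through a symmetric 2×2 mass matrix in the (ψ, χ) basis, with the physical masses given by the absolute values of its eigenvalues. The function name `majorana_masses` and all parameter values are hypothetical.

```python
import math


def majorana_masses(m_psi, m_chi, m_dirac):
    """Physical masses for the matrix [[m_psi, m_dirac], [m_dirac, m_chi]].

    Assumption: all parameters are real, so the physical masses are the
    absolute values of the two eigenvalues of this symmetric matrix.
    """
    mean = 0.5 * (m_psi + m_chi)
    radius = math.hypot(0.5 * (m_psi - m_chi), m_dirac)
    return sorted(abs(x) for x in (mean - radius, mean + radius))


# Pseudo-Dirac regime: Majorana masses much smaller than the Dirac mass.
light, heavy = majorana_masses(0.01, 0.01, 1.0)
print(light, heavy, heavy - light)  # nearly degenerate pair with a small splitting

# Opposite regime: Majorana masses much larger than the Dirac mass.
print(majorana_masses(10.0, 20.0, 0.1))  # close to the input Majorana masses
```

In the first case the splitting is set by the Majorana masses, while in the second the Dirac mass only slightly shifts two otherwise unrelated states.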
Note. Chiral gauge theories. Consider the fermions in a gauge theory. If, for every positive helicity
particle in a representation R of the gauge group G, there is a negative helicity particle in the same
representation R, the theory is not chiral; it doesn’t distinguish between the two helicities.
Suppose we write all spinor fields in a theory as left-chiral Weyl spinors. They are collectively
in a large representation S, and if S is not complex, the theory is not chiral. This remains true if
spontaneous symmetry breaking reduces G to H ⊆ G, because S will still remain real; it will split
into real representations plus pairs of conjugate representations of H.
This places a strong constraint on GUTs, because the SM is a chiral gauge theory. If S were
not complex in a GUT, then it would yield unwanted extra “mirror matter” transforming in the
conjugates of the SM particle representations. The mirror matter would have to be made very heavy
while keeping ordinary matter light, and it is unclear how to achieve this naturally.
In addition, time reversal flips the sign of the angular momentum. Note that p_T · x = −p · x_T.
where we used the antilinearity of T̂ in the first step, then reindexed the sum.
where b maps to b because both the momentum and spin are flipped, keeping the helicity the
same, and the extra phase factors are again constrained by Lorentz invariance.
and we define
\[ B = C^{-1} \gamma^5 = -\gamma^5 C = \gamma^1 \gamma^3 = \begin{pmatrix} i\sigma^2 & 0 \\ 0 & i\sigma^2 \end{pmatrix}. \]
Then ψ̄(x)ψ(x) → ψ̄(x_T)ψ(x_T), which makes sense since charge density is T-even classically.
Then we have
\[ \hat T \bar\psi(x) \gamma^\mu \psi(x) \hat T^{-1} = \bar\psi(x_T) B^{-1} \gamma^{\mu *} B \psi(x_T) \]
so that ψ̄γ^µψ has its spatial parts flipped. The axial current ψ̄γ^5γ^µψ transforms the same way,
essentially because T̂ is blind to chirality, and the currents only differ by chirality.
• In a theory with C, P, and T symmetry, the Lagrangian is C, P, and T-even. The first two
imply that the S-matrix satisfies
\[ \hat P S \hat P^{-1} = S, \qquad \hat C S \hat C^{-1} = S. \]
Then the amplitude for |i⟩ → |f⟩ is the same as the amplitude for P̂|i⟩ → P̂|f⟩ or Ĉ|i⟩ → Ĉ|f⟩.
• Time reversal is more complicated. Note that V(t) is real, and the time ordering puts later
times to the left. Under conjugation by T̂, the factors of −i are conjugated, and the time
ordering is now in reverse. This is equivalent to an overall complex conjugation, so
\[ \hat T S \hat T^{-1} = S^\dagger. \]
Now we have
\[ \langle i_T | S | f_T \rangle = \langle i | \hat T^\dagger \,\big|\, S \hat T | f \rangle = \langle i | \hat T^\dagger S \hat T | f \rangle^* = \langle f | S | i \rangle \]
where the bar indicates the direction the antilinear operators act; swapping the direction picks
up a complex conjugation. Then the amplitude for |i⟩ → |f⟩ equals that for T̂|f⟩ → T̂|i⟩.
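Note. A summary table for gamma matrices. The fourth column is representation-independent,
the first three are highly representation-dependent, and the last two are by definition. In the
first four columns, an entry ± means the transformed matrix equals ±γ; in the last two, (±)^T
and (±)^* mean ±γ^T and ±γ^* respectively.

        γ^*    γ^T    γ†     γ^{-1}    C^{-1}γC    B^{-1}γB
γ^0     +      +      +      +         (−)^T       (+)^*
γ^1     +      −      −      −         (−)^T       (−)^*
γ^2     −      +      −      −         (−)^T       (−)^*
γ^3     +      −      −      −         (−)^T       (−)^*
γ^5     +      +      +      +         (+)^T       (+)^*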
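The table can be checked numerically. The sketch below assumes the chiral representation γ^0 = [[0, 1], [1, 0]], γ^i = [[0, σ^i], [−σ^i, 0]], γ^5 = diag(−1, 1) in 2×2 blocks, together with C = −iγ^0γ^2 and B = γ^1γ^3 as defined above. The helper names are illustrative, and only the standard library is used.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def sign(a, b):
    """Return +1 if a == b, -1 if a == -b, and None otherwise."""
    if close(a, b):
        return 1
    if close(a, scale(-1, b)):
        return -1
    return None


ID4 = block(I2, Z2, Z2, I2)
GAMMA = {0: block(Z2, I2, I2, Z2)}
for i, s in enumerate(SIGMA, start=1):
    GAMMA[i] = block(Z2, s, scale(-1, s), Z2)
GAMMA[5] = block(scale(-1, I2), Z2, Z2, I2)

C = scale(-1j, matmul(GAMMA[0], GAMMA[2]))
B = matmul(GAMMA[1], GAMMA[3])
# In this representation C and B are real orthogonal, so the transpose is the inverse.
C_INV, B_INV = transpose(C), transpose(B)
assert close(matmul(C, C_INV), ID4) and close(matmul(B, B_INV), ID4)


def table_row(g):
    return (
        sign(conj(g), g),                                 # gamma^* = +/- gamma
        sign(transpose(g), g),                            # gamma^T = +/- gamma
        sign(conj(transpose(g)), g),                      # gamma^dagger = +/- gamma
        sign(matmul(g, g), ID4),                          # gamma^{-1} = +/- gamma
        sign(matmul(C_INV, matmul(g, C)), transpose(g)),  # C^{-1} gamma C = +/- gamma^T
        sign(matmul(B_INV, matmul(g, B)), conj(g)),       # B^{-1} gamma B = +/- gamma^*
    )


TABLE = {label: table_row(g) for label, g in GAMMA.items()}
for label, row in TABLE.items():
    print(label, row)
```

Each printed tuple lists the six signs of one table row, in column order.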
Note. A summary table for discrete symmetries, for a real scalar ϕ, a real vector V^µ, the special
case A^µ, and Dirac bilinears. For objects with vector indices, we define P = diag(1, −1, −1, −1)
and T = −P = diag(−1, 1, 1, 1).

         ϕ           V^µ      A^µ    ψ̄ψ    iψ̄γ^5ψ    ψ̄γ^µψ    ψ̄γ^µγ^5ψ    ψ̄σ^{µν}ψ     ∂_µ
Ĉ        η_c = ±1    η_c      −1     1      1          −1        1            −1            1
P̂        η_p = ±1    η_p P    P      1      −1         P         −P           P^µ P^ν       P
T̂        η_t = ±1    η_t T    −T     1      −1         −T        −T           −T^µ T^ν      T
ĈP̂T̂      1           −1       −1     1      1          −1        −1           1             −1

where the last line requires choosing η_c η_p η_t = 1, which can always be arranged. Note that CPT just
gives a factor of −1 for each Lorentz index, so any Lorentz invariant Lagrangian is automatically
CPT invariant. In addition, in non-abelian gauge theories, the potential, field strength, and current
transform under C and P with extra matrix transposes.
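As a quick consistency check, the last row of the table can be recomputed from the first three. The sketch below encodes each column as a triple of signs plus the number of Lorentz indices, and assumes the phase choice η_c = η_p = η_t = 1, one particular way of satisfying η_c η_p η_t = 1. The dictionary keys are informal labels.

```python
from itertools import product

P = (1, -1, -1, -1)  # diagonal entries of P
T = (-1, 1, 1, 1)    # diagonal entries of T = -P

# (C sign, P sign, T sign, number of Lorentz indices), read off the table
# with eta_c = eta_p = eta_t = 1 (an assumption made for this check).
TABLE = {
    "phi": (1, 1, 1, 0),
    "V^mu": (1, 1, 1, 1),
    "A^mu": (-1, 1, -1, 1),
    "psibar psi": (1, 1, 1, 0),
    "i psibar g5 psi": (1, -1, -1, 0),
    "psibar g^mu psi": (-1, 1, -1, 1),
    "psibar g^mu g5 psi": (1, -1, -1, 1),
    "psibar sigma^munu psi": (-1, 1, -1, 2),
    "d_mu": (1, 1, 1, 1),
}


def cpt_sign(c, p, t, n_indices):
    """Combined CPT sign; checks that it is the same for every index component."""
    signs = set()
    for idx in product(range(4), repeat=n_indices):
        s = c * p * t
        for mu in idx:
            s *= P[mu] * T[mu]
        signs.add(s)
    assert len(signs) == 1
    return signs.pop()


CPT = {name: cpt_sign(*entry) for name, entry in TABLE.items()}
for name, s in CPT.items():
    print(name, s)
```

Because P and T multiply to −1 on every index, the combined sign depends only on the number of Lorentz indices, which is the statement made in the note.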
41 3. Spontaneous Symmetry Breaking
Example. The linear sigma model. Consider N real scalar fields with Lagrangian
\[ \mathcal{L}(\phi, \partial_\mu \phi) = \frac{1}{2} (\partial_\mu \phi)^2 + \frac{1}{2} \mu^2 \phi^2 - \frac{\lambda}{4} \phi^4 \]
where ϕ has N components, and we’ve suppressed dot products. Then the Lagrangian has an
O(N ) symmetry. The dispersion relation about ϕ = 0 contains excitations with negative mass
squared, indicating a potential maximum rather than a minimum. The lowest-energy classical field
configuration is a constant field ϕ0 . The potential is minimized for
\[ \phi_0^2 = \frac{\mu^2}{\lambda}. \]
Since ϕ0 can only take a single value, choosing it breaks the O(N ) symmetry down to O(N − 1),
since we are still free to rotate in the directions orthogonal to ϕ0 . Suppose we pick
\[ \phi_0 = (0, 0, \ldots, 0, v), \qquad v = \frac{\mu}{\sqrt{\lambda}}. \]
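We can expand the Lagrangian about the minimum by defining
\[ \phi(x) = (\pi^k(x), v + \sigma(x)), \qquad k = 1, \ldots, N - 1. \]
Then we have
\[ \mathcal{L} = \frac{1}{2} (\partial_\mu \pi^k)^2 + \frac{1}{2} (\partial_\mu \sigma)^2 - \frac{1}{2} (2\mu^2) \sigma^2 + \text{cubic and quartic interactions}. \]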
That is, we find one massive field and N − 1 massless fields. In the case N = 2, this reduces to the
usual picture of a “Mexican hat potential”.
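The mass spectrum can be confirmed with a small numerical sketch. It takes the potential V = −(1/2)µ²ϕ² + (λ/4)ϕ⁴ read off from the Lagrangian above and computes the matrix of second derivatives at the chosen vacuum by finite differences. The values of µ, λ and N are arbitrary illustrative choices.

```python
import math


def potential(phi, mu, lam):
    """V(phi) = -(1/2) mu^2 phi.phi + (lambda/4) (phi.phi)^2."""
    r2 = sum(x * x for x in phi)
    return -0.5 * mu**2 * r2 + 0.25 * lam * r2**2


def hessian(f, x, h=1e-4):
    """Matrix of second derivatives of f at x, from central finite differences."""
    n = len(x)

    def shifted(i, di, j, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)

    return [[(shifted(i, h, j, h) - shifted(i, h, j, -h)
              - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4 * h * h)
             for j in range(n)] for i in range(n)]


MU, LAM, N = 1.0, 0.5, 3          # illustrative values only
v = MU / math.sqrt(LAM)
vacuum = [0.0] * (N - 1) + [v]    # phi_0 = (0, ..., 0, v)

H = hessian(lambda phi: potential(phi, MU, LAM), vacuum)
masses_squared = [H[i][i] for i in range(N)]
print(masses_squared)  # approximately N - 1 zeros followed by 2 * MU**2
```

The N − 1 flat directions are the π^k, and the single curved direction is σ with mass squared 2µ².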
Note. The crucial step that breaks the symmetry is selecting a specific vacuum state, not rewriting
the Lagrangian. The new Lagrangian still has an O(N ) symmetry, though it’s harder to see as it’s
nonlinearly realized; we couldn’t have broken any symmetry because we merely redefined variables.
Note. Consider N = 1, where the broken symmetry is a discrete symmetry, Z2 . In this case the
experimental signature is not a Goldstone boson, but a domain wall. Without symmetry breaking,
the Z2 symmetry means that the parity of the number of particles is conserved. With spontaneous
symmetry breaking, we see cubic interaction terms, but the symmetry is still there, in a sense,
because we can think of the fourth particle as coming from our vacuum, which acts as a source.
Example. We don’t need to begin with negative squared masses. In the case of a potential
V (ϕ) ∼ −|ϕ|4 + |ϕ|6 for a complex scalar ϕ, we start with two massless particles and end up with
one massive particle and one massless Goldstone boson after symmetry breaking.
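A short worked version of this example, assuming unit coefficients V = −r⁴ + r⁶ with r = |ϕ| (the notes only specify the form of the potential, so the numbers below are illustrative):

```latex
\begin{align*}
V(r) &= -r^4 + r^6, \qquad V''(0) = 0 \quad \text{(both real components massless at } \phi = 0\text{)}, \\
V'(r) &= -4 r^3 + 6 r^5 = 0 \quad \Longrightarrow \quad r_0^2 = \tfrac{2}{3}, \\
V''(r_0) &= -12 r_0^2 + 30 r_0^4 = -8 + \tfrac{40}{3} = \tfrac{16}{3} > 0 .
\end{align*}
```

The radial mode therefore becomes massive, while the potential is independent of the phase of ϕ, so the angular mode stays massless and is the Goldstone boson.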
L = kinetic − V (ϕ)
where N indexes over non-vacuum states. By the translational invariance of the vacuum, we have
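\[ \langle n | A(x) B(y) | n' \rangle = \sum_m \langle n | A(0) | m \rangle \langle m | B(0) | n' \rangle + \sum_N \int dp\, e^{-ip \cdot (x - y)} \langle n | A(0) | N_p \rangle \langle N_p | B(0) | n' \rangle. \]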
Then in the limit |x−y| → ∞, the integral term goes to zero. Moreover, at spacelike separation A(x)
and B(y) commute, so the matrices ⟨n|A(x)|m⟩ and ⟨m|B(y)|n′ ⟩ commute and can be simultaneously
diagonalized.
Now, in a theory with one vacuum |0⟩, we take as an axiom the cluster decomposition principle,
which states that in the limit of large separation,
⟨0|A(x)B(y)|0⟩ → ⟨0|A(x)|0⟩⟨0|B(y)|0⟩.
In this case, cluster decomposition only holds if we work in a basis of vacua where A and B are
diagonal. In our examples above, the quantum field ϕ itself is a local operator, picking out the |±⟩
states as valid vacua.
Note. Spontaneous symmetry breaking appears in the path integral through the choice of boundary
condition. In the case of quantum mechanics, these boundary conditions don’t matter for long times
because of quantum tunneling, but for quantum field theory they do.
Note. We can easily prove Goldstone’s theorem with the effective action. Considering a scalar
field theory for simplicity, we have an effective potential Veff (ϕ) whose minima give the allowed vevs.
Assuming the symmetry is linearly realized on the fields, Veff (ϕ) shares the same symmetries, as
shown in the notes on Quantum Field Theory. Then our classical argument goes through, showing
that broken symmetries give zero eigenvalues of the matrix of second derivatives of Veff (ϕ). On the
other hand, this matrix is related to the reciprocal of the momentum-space propagator by
\[ \frac{\partial^2 V_{\text{eff}}(\phi)}{\partial \phi_n \partial \phi_m} = \Delta^{-1}_{nm}(p = 0). \]
Then a zero eigenvalue of this matrix corresponds to a zero eigenvalue of the exact mass matrix,
and hence a massless particle.
We now show Goldstone’s theorem without using the effective action, in scalar field theory.
• Let ϕ be a vector of scalar fields and consider a continuous symmetry group G with generators
indexed by a, with corresponding conserved currents j µa (x) and conserved charges Qa . For the
infinitesimal symmetry ϕ → ϕ + ϵta ϕ, Noether’s theorem gives
\[ Q^a = \int dx\, J_0^a(x) = \int dx\, \pi_i(x) (t^a \phi(x))_i. \]
• The charge Q^a generates the symmetry transformation just as it does classically: using the
equal-time commutation relations, we have
\[ i [Q^a, \phi(x)] = t^a \phi(x). \]
Then intuitively, the current J0a (x) generates the symmetry “localized” near x. For example, if
the symmetry rotates ϕ1 into ϕ2 , then J0 (x) creates a “pion” ϕ1 ϕ2 localized at x.
• By definition, spontaneous symmetry breaking exists if the vacuum |0⟩ is charged, Q^a|0⟩ ≠ 0.
Since H and Q^a commute, Q^a|0⟩ is degenerate with |0⟩.
If E0 is the vacuum energy, then these states have energy E0 + E(p). But when p is zero, the
state is proportional to Q|0⟩ and hence has energy E0 , so E(0) = 0. Then the states must
satisfy a massless dispersion relation; they are the desired Goldstone bosons.
• Note that ϕ need not be a fundamental field. For example, in a theory with Dirac spinors, we
could take ϕ = ψψ.
These are analogous to the spectral densities we found for the exact propagator, except that
instead of ϕ → ϕ we are describing the amplitude for ϕ → B where B is the particle created
by the current. We will see that B can be interpreted as a Goldstone boson.
• Since ρ and ρ̃ are Lorentz vectors that only depend on k, they must be proportional to k µ .
They must also be zero for negative energy since the states all have positive energy. Then
• Now, the left-hand side must vanish at spacelike separation by causality. We know that for
spacelike x, D(x, σ) = D(−x, σ), so ρa (σ) = −ρ̃a (σ). Then we have
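\[ \langle 0 | [j^{a\mu}(x), \phi(0)] | 0 \rangle = -\partial^\mu \int d\sigma\, \rho^a(\sigma)\, i\Delta(x, \sigma) \]
where
\[ i\Delta(x, \sigma) = (2\pi)^3 \left( D(x, \sigma) - D(-x, \sigma) \right) = \int d^4k\, \delta(k^2 - \sigma)\, \mathrm{sign}(k^0)\, e^{-ikx}. \]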
• Next, apply ∂µ to both sides. By current conservation, the left-hand side vanishes, and we can
simplify the right-hand side with the Klein–Gordon equation, for
\[ 0 = \int d\sigma\, \sigma \rho^a(\sigma)\, i\Delta(x, \sigma). \]
For this to hold for all x, we require σρa (σ) = 0, since σ and ρ are positive.
• In the case where ρa (σ) = 0, we have ⟨0|[Qa , ϕ(0)]|0⟩ ∝ ⟨0|ta ϕ(0)|0⟩ = ta ϕ0 = 0, so the
symmetry is unbroken. Otherwise, we have ρa (σ) = N a δ(σ) and we can explicitly calculate
\[ t^a \phi_0 = i \langle 0 | [Q^a, \phi(0)] | 0 \rangle = N^a \int dx\, \partial^0 \Delta(x, 0) = -(2\pi)^3 N^a \neq 0 \]
so the symmetry is indeed broken. Returning to the definition of the spectral density, there
must be families of massless states |B(p)⟩ where
by dimensional analysis and Lorentz invariance. The states |B(p)⟩ are spinless since ϕ(0)|0⟩
is rotationally invariant and carry the same quantum numbers as j 0 . They are the desired
Goldstone bosons.
Note. There is another simple proof that Goldstone bosons remain massless, though it is only valid
perturbatively. If the exact propagator ∆(k^2) of a Goldstone boson is to retain a pole at k^2 = 0,
then the self-energy should satisfy Π(k 2 = 0) = 0. However, Goldstone bosons are derivatively
coupled, so all diagrams one can draw come with powers of the external momentum k. Since
Π(k 2 = 0) can be evaluated at k = 0, these diagrams vanish.
Note. There are several exceptions to Goldstone’s theorem. In d ≤ 2, the Mermin–Wagner theorem
ensures that spontaneous continuous symmetry breaking can never occur in the first place; concretely,
the effective potential will never develop an instability at ϕ = 0.
Gauge symmetry also complicates the picture. If a global symmetry is gauged then its current
cannot create Goldstone bosons, because it merely takes a state to the very same physical state. If
we try to work only with physical states, we either lose manifest Lorentz invariance, which invalidates
the formal proof above, or we have states with negative norm (e.g. in Lorenz gauge). This also
invalidates the proof above since ρ may be negative. Instead, we would see that the would-be
Goldstone bosons are “eaten” to produce massive gauge bosons.
Note. This situation is also called “spontaneous breaking of a gauge symmetry”, but this is a
misnomer. The local gauge symmetry remains a gauge symmetry; the choice of vacuum doesn’t
affect the fact that states related by a gauge transformation are physically the same. Actually
breaking gauge symmetry would be disastrous; it occurs in the case of a gauge anomaly and
destroys the Ward identities, making the theory inconsistent. Breaking the global symmetry does
not violate gauge symmetry since we require gauge transformations to vanish at infinity.
Incidentally, it is also possible to have local symmetries that are not gauge symmetries, e.g. in
lattice spin systems considered in the notes on Statistical Field Theory. However, Elitzur’s theorem
states that such a local symmetry can never be broken. There are two ways to argue this. We can
imagine introducing a symmetry breaking field h and taking the limits N → ∞ followed by h → 0,
in which case a global symmetry is broken because the energy cost N h goes to infinity, while a local
symmetry isn’t because the energy cost is O(h) which goes to zero.
Alternatively, suppose there is no external field; then we care about tunneling between ground
states related by the symmetry by local thermal fluctuations. Suppose the symmetry is discrete. In
the case of a global symmetry, there is an extensive energy cost since we must form a domain wall,
and hence tunneling cannot happen for d > 1. But in the case of a local symmetry, we can always relate two
such ground states by a series of local transformations which each cost no energy.
Thus, a local gauge symmetry can’t be broken in a consistent theory, while a local non-gauge
symmetry can’t be broken at all.
Example. The abelian Higgs model. Consider the theory of scalar QED with a potential,
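\[ \mathcal{L} = -\frac{1}{4} (F_{\mu\nu})^2 + |D_\mu \phi|^2 - V(\phi), \qquad V(\phi) = -\mu^2 \phi^* \phi + \frac{\lambda}{2} (\phi^* \phi)^2, \qquad \lambda > 0 \]
with D_µ = ∂_µ + ieA_µ. This Lagrangian has the U(1) gauge symmetry
\[ \phi(x) \to e^{i\alpha(x)} \phi(x), \qquad A_\mu(x) \to A_\mu(x) - \frac{1}{e} \partial_\mu \alpha(x). \]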
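The invariance can be checked directly from the transformation of the covariant derivative. The steps below use only D_µ = ∂_µ + ieA_µ and the transformation rules just stated; the intermediate lines are filled in here as a sketch.

```latex
\begin{align*}
D_\mu \phi \;\to\;& \left( \partial_\mu + i e A_\mu - i\, \partial_\mu \alpha \right) \left( e^{i\alpha} \phi \right) \\
 =\;& e^{i\alpha} \left( \partial_\mu \phi + i (\partial_\mu \alpha) \phi + i e A_\mu \phi - i (\partial_\mu \alpha) \phi \right)
 = e^{i\alpha} D_\mu \phi .
\end{align*}
```

Hence |D_µϕ|² and V(ϕ), which depends only on ϕ^*ϕ, are invariant, and F_µν is unchanged because the shift of A_µ is a pure gradient.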
Without loss of generality we can work with only real fields. Since the representation is finite-
dimensional, it is unitary, so the ta are imaginary and Hermitian, and the T a are real and
antisymmetric.
• Now let the field acquire a vev, ⟨ϕi ⟩ = ϕ0i . The last term yields a gauge boson mass term,
\[ \Delta\mathcal{L} = \frac{1}{2} m^2_{ab} A^a_\mu A^{b\mu}, \qquad m^2_{ab} = g^2 (T^a \phi_0)_i (T^b \phi_0)_i. \]
The mass matrix is positive semidefinite, so the gauge bosons receive nonnegative masses.
• Suppose a generator T a leaves the vacuum invariant, T a ϕ0 = 0. Then the corresponding gauge
boson is massless, as expected.
• The interaction between the Goldstone bosons and the gauge bosons is
\[ \Delta\mathcal{L} = -g A^a_\mu \partial^\mu \phi_i (T^a \phi_0)_i. \]
As expected, the only components of ϕ that mix are those parallel to T a ϕ0 for some transfor-
mation T a , which is precisely the set of Goldstone bosons. In other words, each massive gauge
boson eats the Goldstone corresponding to the broken symmetry it generates. Just as in the
abelian case, this provides the desired third polarization in the gauge boson propagator.
• The mass eigenstates in a gauge theory with gauge group G are multiplets of G. For example,
quarks are color triplets and gluons form a color octet. In the case of symmetry breaking,
one can show that mass eigenstates are instead multiplets of the unbroken gauge group H,
by checking that the mass matrices all commute with the generators of H in the appropriate
representation. For example, the particles in the SM have definite electric charge.
Example. Consider an SU (2) gauge theory where ϕ transforms in the spinor representation. Then
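\[ D_\mu \phi = (\partial_\mu + i g A^a_\mu \tau^a) \phi, \qquad \tau^a = \frac{\sigma^a}{2}. \]
If ϕ acquires a vev, then by the SU(2) symmetry we can let it be
\[ \langle \phi \rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix}. \]
The mass term for the gauge bosons has the form
\[ \Delta\mathcal{L} = \frac{1}{2} g^2 \begin{pmatrix} 0 & v \end{pmatrix} \tau^a \tau^b \begin{pmatrix} 0 \\ v \end{pmatrix} A^a_\mu A^{b\mu} = \frac{1}{4} g^2 \begin{pmatrix} 0 & v \end{pmatrix} \{\tau^a, \tau^b\} \begin{pmatrix} 0 \\ v \end{pmatrix} A^a_\mu A^{b\mu} = \frac{g^2 v^2}{8} A^a_\mu A^{a\mu} \]
where we used {τ^a, τ^b} = δ^{ab}/2. Therefore all three gauge bosons receive the mass m_A = gv/2.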
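The same result can be reproduced numerically. The sketch below builds m²_ab = g² ϕ0† {τ^a, τ^b} ϕ0, which is the coefficient matrix obtained by matching the mass term above to (1/2) m²_ab A^a_µ A^{bµ}. The function name and the numerical values of g and v are illustrative assumptions.

```python
import math

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def doublet_mass_matrix(g, v):
    """m^2_ab = g^2 phi0^dagger {tau^a, tau^b} phi0 for phi0 = (0, v) / sqrt(2)."""
    phi0 = [0.0, v / math.sqrt(2)]
    # tau^a phi0, with tau^a = sigma^a / 2
    tau_phi = [[sum(0.5 * s[i][k] * phi0[k] for k in range(2)) for i in range(2)]
               for s in SIGMA]

    def inner(u, w):
        return sum(complex(a).conjugate() * b for a, b in zip(u, w))

    # Since tau^a is Hermitian, phi0^dag {tau^a, tau^b} phi0
    # equals 2 Re[(tau^a phi0)^dag (tau^b phi0)].
    return [[2 * g * g * inner(u, w).real for w in tau_phi] for u in tau_phi]


G, V = 0.65, 246.0  # illustrative values only
M2 = doublet_mass_matrix(G, V)
masses = [math.sqrt(M2[a][a]) for a in range(3)]
print(masses)  # three equal masses, each g v / 2
```

The off-diagonal entries vanish, so no generator is left unbroken and all three gauge bosons are degenerate.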
Example. Consider the same example, but let ϕ transform in the vector representation. The
covariant derivative is, in components,
and we choose the vev to be ⟨ϕa ⟩ = vδa3 . Then the mass term is
\[ \Delta\mathcal{L} = \frac{g^2 v^2}{2} (\epsilon_{abc} A^b_\mu \delta_{c3})^2 = \frac{g^2 v^2}{2} \left( (A^1_\mu)^2 + (A^2_\mu)^2 \right) \]
so we only get two massive gauge bosons. This makes sense, since the symmetry of rotations about
the ϕ3 axis is preserved. Since the model contains both massive and massless gauge bosons, it was
once proposed as a candidate theory of the weak interaction, but it’s not quite right: we require
two massive bosons that only couple to left-handed fields (the W ± bosons), and one massive boson
that couples to both handednesses (the Z). This can’t be achieved by breaking SU (2) in any way.
Example. Consider an SU (3) gauge theory where ϕ transforms in the adjoint representation. Then
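$$(D_\mu \phi)_a = \partial_\mu \phi_a - g f_{abc} A^b_\mu \phi_c, \qquad \Delta\mathcal{L} = \frac{1}{2} (D_\mu \phi)_a (D^\mu \phi)_a \supset \frac{g^2}{2} (f_{abc} A^b_\mu \phi_c)^2.$$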
In this case, it’s more convenient to work without components. We let Φ = ϕa ta , so
where the normalization is fixed by using the Gell-Mann matrices. Now, we can always rotate Φ so
that it is diagonal, but there are still several distinct possibilities. If Φ0 = |ϕ| diag(1, 1, −2), then
the masses of the Aaµ are
so the symmetry is broken to SU (2) × U (1). If Φ0 = |ϕ| diag(1, −1, 0), then the masses are
and the symmetry is broken to U (1) × U (1). We see that matter fields in the adjoint can’t break
the symmetry corresponding to the Cartan subalgebra. To break these symmetries, we would have
to add another matter field transforming in a different representation.
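The two breaking patterns above can be cross-checked numerically. The following is a minimal sketch in Python with NumPy, assuming generators $t^a = \lambda^a/2$ built from the Gell-Mann matrices and a mass matrix proportional to $-\mathrm{tr}([t^a, \Phi_0][t^b, \Phi_0])$. The overall normalization and the helper names (`gell_mann`, `mass_matrix`, `count_massless`) are illustrative assumptions, not taken from these notes; only the number of zero eigenvalues, i.e. the number of unbroken generators, is meaningful here.

```python
import numpy as np

def gell_mann():
    """Return the eight Gell-Mann matrices as an (8, 3, 3) complex array."""
    l = np.zeros((8, 3, 3), dtype=complex)
    l[0][0, 1] = l[0][1, 0] = 1
    l[1][0, 1], l[1][1, 0] = -1j, 1j
    l[2][0, 0], l[2][1, 1] = 1, -1
    l[3][0, 2] = l[3][2, 0] = 1
    l[4][0, 2], l[4][2, 0] = -1j, 1j
    l[5][1, 2] = l[5][2, 1] = 1
    l[6][1, 2], l[6][2, 1] = -1j, 1j
    l[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return l

def mass_matrix(phi0, g=1.0):
    """M_ab = -g^2 tr([t^a, Phi0][t^b, Phi0]); the normalization is an assumption."""
    t = gell_mann() / 2
    comm = [ta @ phi0 - phi0 @ ta for ta in t]
    return np.array([[-g**2 * np.trace(ca @ cb).real for cb in comm] for ca in comm])

def count_massless(phi0, tol=1e-12):
    """Number of gauge bosons left massless, i.e. unbroken generators."""
    eig = np.linalg.eigvalsh(mass_matrix(phi0))
    return int(np.sum(np.abs(eig) < tol))

print(count_massless(np.diag([1.0, 1.0, -2.0])))  # unbroken SU(2) x U(1)
print(count_massless(np.diag([1.0, -1.0, 0.0])))  # unbroken U(1) x U(1)
```

For $\Phi_0 \propto \mathrm{diag}(1, 1, -2)$ the count is 4, the dimension of SU(2) × U(1); for $\Phi_0 \propto \mathrm{diag}(1, -1, 0)$ it is 2, the dimension of U(1) × U(1). In both cases the two Cartan generators remain unbroken.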
• So far we have considered symmetry breaking by scalar fields picking up vevs, but other
mechanisms could be possible; we will see such a mechanism for chiral symmetry in QCD.
Hence we would like an analysis that is independent of how the symmetry is broken.
• Consider a theory with a global symmetry G and let α parametrize the global symmetry. By
the usual Noether trick, if we promote α to α(x), then
δL = −(∂µ αa )J µa , ∂µ J µa = 0.
L′ = L + gAaµ J µa + O(A2 )
which has the effect of gauging the symmetry. By directly plugging in δL, we see that L′ is
indeed gauge invariant, up to unspecified O(A2 ) terms. However, we will only need the linear
term to compute matrix elements involving only one insertion of the gauge field.
J µa = ∂µ ϕi Tija ϕj .
The Goldstone bosons are the ϕi which are shifted by the global transformation, and indeed,
F ai = Tija ϕ0j .
where mab is the gauge boson mass matrix. To compute this, note that the pole at k 2 = 0 comes
from the diagram with an intermediate Goldstone boson. Using our two equations involving
J µa above, the Aaµ ϕj vertex factor is −gk µ F aj , giving
$$\Pi^{\mu\nu}_{ab}(k^2) \supset (g k^\mu F^a_j) \, \frac{i}{k^2} \, (-g k^\nu F^b_j)$$
so the gauge boson mass matrix is
m2ab = g 2 F aj F bj .
Note. If we wanted to analyze how electroweak symmetry breaking occurred earlier in our universe,
we would have to understand how thermal effects change the Higgs potential. This can be done
using standard techniques from thermal field theory, which are sketched in my dissertation.
One should also account for quantum corrections to the potential, which can be quite important.
For instance, in massless ϕ4 theory coupled to a U (1) gauge field, quantum effects cause spontaneous
symmetry breaking even though it doesn’t happen classically; the effective potential in this case
is called the Coleman-Weinberg potential. We managed to ignore this above by either implicitly
assuming weak coupling or working with the effective potential.
3.4 Quantization
In this section we consider the quantization of theories with spontaneous symmetry breaking of a
gauged global symmetry.
$$\phi \to e^{ie\alpha} \phi, \qquad A_\mu \to A_\mu - \partial_\mu \alpha.$$
The potential ensures $|\phi| = v/\sqrt{2}$. In unitary gauge, we can see that the gauge field gains a
mass $M_A^2 = e^2 v^2$. There is also a remaining massive scalar field corresponding to the radial
part of ϕ, with mass $m^2 = g v^2$.
• As shown in the notes on Quantum Field Theory, the propagator for a massive vector boson is
$$D_{\mu\nu}(k^2) = -\frac{i}{k^2 - M_A^2 + i\epsilon} \left( \eta_{\mu\nu} - \frac{k_\mu k_\nu}{M_A^2} \right).$$
However, this makes renormalizability unclear, because the propagator does not fall off at high
k. Since we are dealing with a gauge theory, we should also be more careful to account for
Faddeev–Popov ghosts.
• As such, any convenient gauge fixing must suppress this term. In path integral quantization,
we may choose the gauge fixing function
F (A, φ) = ∂µ Aµ − ξMA φ.
Furthermore, we integrate over gauge fixings with a Gaussian weight of width ξ. As shown in
the notes on Quantum Field Theory, the Lagrangian picks up the terms
$$\mathcal{L} \supset -\frac{1}{2\xi} F(A)^2 + \bar{c} \, \Delta_{\rm FP} \, c.$$
This is a generalization of Rξ gauge.
• Unlike previous examples, the gauge fixing function F now depends on matter fields as well as
the gauge field. To evaluate the Faddeev–Popov determinant, we go back to the definition,
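$$\Delta_{\rm FP} = \frac{\partial F}{\partial \alpha} = -\frac{\partial F}{\partial A_\mu} D_\mu + \frac{\partial F}{\partial \varphi} e(v + f) = -\partial^2 - \xi e M_A (v + f).$$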
Since this is an abelian gauge theory, the ghosts do not couple to the gauge field directly, but
have indirect effects by their coupling to f .
• Expanding the extra F 2 /2ξ terms, the mixing term is cancelled, leaving quadratic terms
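$$\begin{aligned}
\mathcal{L}_{\rm quad} = {} & \frac{1}{2} (\partial_\mu f)(\partial^\mu f) - \frac{1}{2} m^2 f^2 + \frac{1}{2} \partial_\mu \varphi \, \partial^\mu \varphi - \frac{\xi}{2} M_A^2 \varphi^2 \\
& - \frac{1}{2} A_\mu \left( -\eta^{\mu\nu} \partial^2 + \left( 1 - \frac{1}{\xi} \right) \partial^\mu \partial^\nu - M_A^2 \eta^{\mu\nu} \right) A_\nu - \bar{c} \, \partial^2 c - \xi M_A^2 \, \bar{c} c
\end{aligned}$$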
where the f mass term is from the potential for ϕ. The propagators are now
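$$D^{\mu\nu}(k^2) = -\frac{i}{k^2 - M_A^2 + i\epsilon} \left( \eta^{\mu\nu} - (1 - \xi) \frac{k^\mu k^\nu}{k^2 - \xi M_A^2} \right), \qquad D_\varphi(k^2) = \frac{i}{k^2 - \xi M_A^2 + i\epsilon}$$
and
$$D_c(k^2) = \frac{i}{k^2 - \xi M_A^2}, \qquad D_f(k^2) = \frac{i}{k^2 - m_f^2}.$$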
Renormalizability is now easier to show, as all propagators fall off as 1/k 2 , but not all the fields
are physical, as signaled by the ξ-dependent masses of φ and the ghost field. The f field is
physical, and in the Standard Model will correspond to the Higgs boson.
• The physical results should not depend on ξ, which can be a useful cross-check in computations,
just as it was in QED. The ξ-independence may be proven generally using the BRST symmetry
of the gauge-fixed Lagrangian.
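• One useful special case is ξ = 0, where the Goldstone boson φ is massless and the gauge field
is fully transverse,
$$D_{\mu\nu}(k^2) = -\frac{i}{k^2 - M_A^2 + i\epsilon} \left( \eta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2} \right), \qquad D_\varphi(k^2) = \frac{i}{k^2 + i\epsilon}.$$
Both propagators have poles at $k^2 = 0$, but they don’t correspond to physical particles.
This is the Landau gauge.
• Another useful special case is ξ = 1, where the gauge field propagator reduces to
$-i\eta_{\mu\nu}/(k^2 - M_A^2 + i\epsilon)$ and the Goldstone boson and ghost both have mass $M_A$.
This gauge, called the Feynman–’t Hooft gauge, is the most convenient for general higher-order
computations.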
• We recover unitary gauge in the limit ξ → ∞, where the unphysical fields decouple; the
unphysical poles in k 2 go to infinity. This is called unitary gauge because every pole found by
evaluating Feynman diagrams corresponds to the propagation of physical intermediate states,
consistent with the Cutkosky rules, so unitarity is manifest.
• In 1972, ’t Hooft and Veltman used Rξ gauge to prove the renormalizability of the Standard
Model at all orders in perturbation theory. For any finite ξ, it is easy to show that the divergences
can be cancelled by a finite number of counterterms, since the usual power counting arguments
will work. ’t Hooft and Veltman additionally showed that the counterterms preserved local
gauge invariance, and the ξ-independence of S-matrix elements.
Though we have now set up the trickiest Feynman rules, loop computations in the Standard Model
are quite complicated, and we will not perform any in these notes.
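As a sanity check on the $R_\xi$ gauge boson propagator quoted above, one can verify numerically that it inverts the quadratic operator for $A_\mu$. The sketch below assumes the mostly-minus metric, drops the $i\epsilon$, and uses the convention that the propagator is $i$ times the inverse of the momentum-space operator $K_{\mu\nu} = -(k^2 - M_A^2)\eta_{\mu\nu} + (1 - 1/\xi) k_\mu k_\nu$ read off from the quadratic terms. The function names and the sample momentum are hypothetical.

```python
import numpy as np

# Mostly-minus metric; the same matrix serves for upper and lower indices.
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def kinetic_operator(k_up, M, xi):
    """K_{mu nu} (lower indices) from the quadratic gauge field terms in R_xi gauge."""
    k_lo = ETA @ k_up
    k2 = k_up @ k_lo
    return -(k2 - M**2) * ETA + (1 - 1 / xi) * np.outer(k_lo, k_lo)

def propagator(k_up, M, xi):
    """D^{mu nu} (upper indices), with the i*epsilon dropped."""
    k_lo = ETA @ k_up
    k2 = k_up @ k_lo
    return -1j / (k2 - M**2) * (ETA - (1 - xi) * np.outer(k_up, k_up) / (k2 - xi * M**2))

k = np.array([1.3, 0.2, -0.7, 0.5])  # arbitrary off-shell momentum
for xi in (0.5, 1.0, 3.0):
    residual = kinetic_operator(k, 1.0, xi) @ propagator(k, 1.0, xi) - 1j * np.eye(4)
    print(xi, np.max(np.abs(residual)))  # should vanish up to rounding
```

The residual vanishes for every finite nonzero ξ, which is the statement that the ξ-dependence sits entirely in the unphysical longitudinal part.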
4 Electroweak Theory
4.1 Gauge Theory
We now describe the Weinberg–Salam theory of the electroweak interaction.
• We postulate a gauge group SU (2)L × U (1)Y , where the factors are called weak isospin and
hypercharge, and a complex scalar field ϕ, called the Higgs field. The Higgs transforms as a
weak isospin doublet with a U (1) hypercharge Y = 1/2,
$$\phi(x) \to e^{i\alpha^a(x)\tau^a} e^{i\beta(x)/2} \phi(x), \qquad \tau^a = \frac{\sigma^a}{2}.$$
• We suppose the Higgs acquires a vev, through the same potential as in the abelian Higgs model.
Using the SU (2)L × U (1)Y global symmetry, without loss of generality we can pick
$$\phi_0 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v \end{pmatrix}.$$
This breaks the symmetry to U (1)EM , generated by gauge transformations with α3 (x) = β(x).
$$A_\mu = \frac{1}{\sqrt{g^2 + g'^2}} (g' A^3_\mu + g B_\mu).$$
• The general covariant derivative may be rewritten in terms of the mass eigenstates as
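$$\begin{aligned}
D_\mu &= \partial_\mu + i g A^a_\mu T^a + i Y g' B_\mu \\
&= \partial_\mu + \frac{ig}{\sqrt{2}} (W^+_\mu T^+ + W^-_\mu T^-) + \frac{i}{\sqrt{g^2 + g'^2}} Z_\mu (g^2 T^3 - g'^2 Y) + i \frac{g g'}{\sqrt{g^2 + g'^2}} A_\mu (T^3 + Y)
\end{aligned}$$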
• Note that the Z also couples directly to anything with hypercharge, so it can couple to particles
that are $SU(2)_L$ singlets. This isn’t a phenomenological problem: an $SU(2)_L$ singlet has electric
charge $Q = T^3 + Y = Y$, so the photon couples to any such particle that the Z couples to. That is,
the Z boson doesn’t produce any new decays for these particles; at low energies its effect is totally
washed out by that of the photon.
• The theory is predictive: with just four parameters (g, g ′ , µ2 , and λ), all of the masses and
(self-)interactions of the electroweak bosons and the Higgs are fixed. For example, the Higgs
trilinear and quartic couplings are predicted but currently not measured to any precision; they
will be targeted by future colliders.
Note. In the Standard Model, the mechanism of spontaneous symmetry breaking is much less
constrained than the other parts of it. It is therefore interesting to consider which of the above
predictions follow solely from the spontaneous symmetry breaking pattern SU (2)L × U (1)Y →
U (1)EM , and which additionally rely on there being a single, SU (2)L doublet Higgs field.
We focus on the mass matrix of the four gauge bosons. First, note that U (1)EM transformations
don’t commute with two of the SU (2)L transformations; this implies that a pair of SU (2)L bosons
pick up opposite electric charges. Thus, the mass matrix must take the form
$$\begin{pmatrix} m_1^2 & & & \\ & m_2^2 & & \\ & & m_3^2 & m^2 \\ & & m^2 & m_0^2 \end{pmatrix}$$
where additional off-diagonal terms are forbidden by U (1)EM symmetry, which additionally forces
m1 = m2 ≡ mW . Because U (1)EM is unbroken, there must be a massless gauge boson Aµ ,
which implies $m^2 = \pm|m_0 m_3|$. Requiring that $A_\mu$ take the same form as found above implies
$\tan\theta_W = |m_0/m_3|$. Therefore, the only thing not determined is the mass of the Z-boson,
$$m_Z^2 = m_3^2 + m_0^2 = \frac{m_3^2}{\cos^2\theta_W}.$$
This matches the Standard Model prediction precisely when $m_3 = m_W$. We can thus search for
deviations from the Standard Model by measuring $\rho = m_W^2/(m_Z^2 \cos^2\theta_W)$, which at tree level is
$\rho_0 = 1$. Loop corrections contribute $\Delta\rho \approx 0.008$.
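For the minimal case of a single Higgs doublet, the structure of the mass matrix described in this note can be checked numerically. The sketch below builds the 4 × 4 mass-squared matrix in the basis $(A^1, A^2, A^3, B)$ from the vev, using $m^2_{ab} = 2\,\mathrm{Re}[(G_a\phi_0)^\dagger (G_b\phi_0)]$ with $G_a = g\tau^a$ and $G_4 = g'Y$. The numerical values of $g$, $g'$, and $v$ are rough illustrative inputs, and the function name is hypothetical.

```python
import numpy as np

def gauge_mass_matrix(g, gp, v):
    """Mass-squared matrix in the basis (A^1, A^2, A^3, B) for one doublet with Y = 1/2."""
    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]], dtype=complex),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    # Coupling times generator for each gauge boson.
    gens = [g * s / 2 for s in sigma] + [gp * 0.5 * np.eye(2, dtype=complex)]
    phi0 = np.array([0, v], dtype=complex) / np.sqrt(2)
    vecs = [G @ phi0 for G in gens]
    # (1/2) m^2_ab A^a A^b comes from |D phi0|^2.
    return np.array([[2 * np.vdot(x, y).real for y in vecs] for x in vecs])

g, gp, v = 0.65, 0.35, 246.0  # illustrative values; v in GeV
m2 = np.sort(np.linalg.eigvalsh(gauge_mass_matrix(g, gp, v)))
m_photon2, m_w2, m_z2 = m2[0], m2[1], m2[3]

cos2_theta_w = g**2 / (g**2 + gp**2)
rho = m_w2 / (m_z2 * cos2_theta_w)
print(np.sqrt(m_w2), np.sqrt(m_z2), rho)
```

The spectrum has one zero eigenvalue (the photon), a degenerate pair at $m_W = gv/2$, and a heavier state at $m_Z = m_W/\cos\theta_W$, so that $\rho = 1$ at tree level.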
• It suffices to find the SU (2)L and U (1)Y representations the fermions transform in; we can then
read the interaction terms off the covariant derivative. We are guided by the experimental fact
that the weak interactions only affect left-helicity particles and right-helicity antiparticles.
• We postulate the left-handed electron and electron neutrino fit in an isospin doublet,
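$$L(x) = \begin{pmatrix} \nu_e(x) \\ e_L(x) \end{pmatrix}, \qquad e_L(x) = \frac{1}{2} (1 - \gamma^5) e(x).$$
$$\mathcal{L} \supset \bar{L} i \slashed{D} L + \bar{R} i \slashed{D} R$$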
• Expanding out the covariant derivatives, we can show the interaction terms are
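$$\mathcal{L} \supset -\frac{g}{2\sqrt{2}} (J^\mu W^+_\mu + J^{\mu\dagger} W^-_\mu) - e J^\mu_{\rm EM} A_\mu - \frac{g}{2\cos\theta_W} J^\mu_n Z_\mu$$
where we have defined the leptonic charged weak, neutral weak, and electromagnetic currents
$$J^\mu = \bar{\nu}_e \gamma^\mu (1 - \gamma^5) e, \qquad J^\mu_n = \frac{1}{2} \left( \bar{\nu}_e \gamma^\mu (1 - \gamma^5) \nu_e - \bar{e} \gamma^\mu (1 - \gamma^5 - 4\sin^2\theta_W) e \right), \qquad J^\mu_{\rm EM} = -\bar{e} \gamma^\mu e.$$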
Note that right-handed electrons can couple to the Z, as mentioned earlier. Also note that
because sin2 θw is close to 1/4, the coupling of the charged leptons to the Z boson is almost
purely axial, i.e. proportional to γ 5 .
Note. One might wonder if the weak force can form bound states; for instance, the Z boson
mediates an attractive interaction between neutrinos. However, while all potential wells in 1D and
2D have bound states, sufficiently weak potential wells in 3D don’t, and indeed there are no weak
bound states in the SM. However, certain models of WIMP dark matter could have them.
• Next, we wish to write down a mass term for the electron, but the Dirac mass term
$m_e (\bar{e}_L e_R + \bar{e}_R e_L)$ is not gauge invariant. Instead, all mass terms in the SM come from
Yukawa couplings to the Higgs. We work in unitary gauge where
$$\phi(x) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix}$$
and choose Y = 1/2 for the Higgs doublet so that the Higgs boson h is electrically neutral.
Note that the components of ϕ with charge are exactly the ones ‘eaten’ by the charged Wµ± .
giving a mass me = λe v and a Yukawa coupling λe to the Higgs, proportional to me . For now,
we’ll take the neutrino to be massless.
Here the generation indices are kept explicit, the spinor indices are contracted between $\bar{L}$ and
R, and the weak isospin indices are contracted between $\bar{L}$ and ϕ. Note that the adjoint/dagger
acts on all spaces, so $\bar{L}$ is a row vector in weak isospin space with Y = 1/2. Similarly, ϕ† is a
row vector in weak isospin space with Y = −1/2.
• In general, the weak interactions and the Higgs interactions will pick out two different bases,
the flavor basis and the mass basis. In the SM, neutrinos have no mass, so this problem doesn’t
arise. It also doesn’t occur for the charged leptons, as we now show.
• Now λ is an arbitrary complex matrix, so it can’t be diagonalized in the usual way. But since
λλ† is Hermitian and positive, we have
λλ† = U Λ2 U †
where Λ2 is diagonal and positive, and U is unitary. Taking Λ to also be diagonal and positive,
we define S = λ† U Λ−1 , so S is unitary as well, and
λ† λ = SΛ2 S † , λ = U ΛS −1 .
Here, the transformations for the barred quantities follow because taking the Dirac adjoint
performs a complex conjugation. The covariant derivative terms aren’t affected, while the mass
matrix λ is diagonalized to Λ, as desired.
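The diagonalization above is a singular value decomposition, and can be illustrated numerically. The following sketch uses a random complex 3 × 3 matrix as a stand-in for λ (an assumption; no physical Yukawa values are used) and follows the construction in the text: diagonalize $\lambda\lambda^\dagger$ to get $U$ and $\Lambda$, then define $S = \lambda^\dagger U \Lambda^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an arbitrary complex Yukawa matrix.
lam = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# lam lam^dagger is Hermitian and positive: lam lam^dagger = U Lambda^2 U^dagger.
eigvals, U = np.linalg.eigh(lam @ lam.conj().T)
Lam = np.diag(np.sqrt(eigvals))

# S = lam^dagger U Lambda^{-1} is then unitary as well.
S = lam.conj().T @ U @ np.linalg.inv(Lam)

print(np.allclose(S.conj().T @ S, np.eye(3)))   # S is unitary
print(np.allclose(U @ Lam @ S.conj().T, lam))   # lam = U Lambda S^{-1}
print(np.allclose(U.conj().T @ lam @ S, Lam))   # biunitary rotation diagonalizes lam
```

The construction requires Λ to be invertible, i.e. no exactly massless fermion in the sector being diagonalized; in that degenerate case one would fall back on a general SVD routine.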
Note. More about weak isospin. Given two objects $A^\alpha$ and $B^\alpha$ in the fundamental representation of
$SU(2)_L$, their inner product $A^{\alpha *} B^\alpha$ is invariant. To unpack this, we note that $A^{\alpha *} \equiv A_\alpha$ transforms
in the antifundamental representation. The contraction $A_\alpha B^\alpha$ is then invariant; forming invariants
is just a matter of matching up the indices, as for the Lorentz group. In the examples above,
everything starts with an upper index, and taking the adjoint lowers the index. Note that the
fundamental representation of SU (2) is pseudoreal and hence similar to the antifundamental, i.e. we
may raise and lower indices with ϵαβ .
Next, we perform the same procedure for the quark fields.
as usual. These terms violate C and P, but obey CP and T symmetry; note that here we
are referring to the quantum $\hat{C}$, so the CP symmetry acts on fields like classical C symmetry,
conjugating them.
• The most general renormalizable gauge invariant quark-Higgs couplings are
$$\mathcal{L} \supset -\sqrt{2} \left( \lambda_d^{ij} \bar{Q}_L^i \phi \, d_R^j + \lambda_u^{ij} \bar{Q}_L^i \phi^c u_R^j + \text{h.c.} \right), \qquad \phi^c_\alpha \equiv \epsilon_{\alpha\beta} \phi^{\dagger\beta}.$$
In the second term, we need to use ϕ† to get hypercharge invariance, and a Levi–Civita to get
a weak isospin invariant contracting with $Q_\alpha$. Also, by hypercharge, there’s no term involving
$u_R$ and $d_R$. Since CP conjugates the fields, the coupling is CP invariant if and only if $\lambda_d^{ij}$ and
$\lambda_u^{ij}$ are real. Roughly speaking, complex physical parameters indicate CP violation.
• Next, we switch to the mass basis, as we did for the leptons. As before, we let
λu = Ku Λu Su† , λd = Kd Λd Sd†
and redefine the quark fields by
uL → Ku uL , dL → Kd dL , uR → Su uR , dR → Sd dR .
Then we have, for example, in unitary gauge
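$$\sqrt{2} \lambda_d^{ij} \bar{Q}_L^i \phi \, d_R^j \supset v \, \bar{d}_L^i \lambda_d^{ij} d_R^j \to v \, \bar{d}_L K_d^\dagger K_d \Lambda_d S_d^\dagger S_d d_R = v \, \bar{d}_L \Lambda_d d_R$$
• Now, this redefinition affects the gauge couplings. The terms $\bar{u}_R i \slashed{D} u_R + \bar{d}_R i \slashed{D} d_R$ are not affected,
but $\bar{Q}_L i \slashed{D} Q_L$ is because the covariant derivative mixes $u_L$ and $d_L$. In particular, the charged
weak current transforms as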
However, the neutral current remains diagonal, because it does not convert up-type quarks to
down-type quarks. Thus the SM, at tree level, has no flavor-changing neutral currents.
The off-diagonal elements quantify the mismatch between the mass basis and the flavor basis.
When we talk about an “up quark”, we conventionally mean the mass basis.
Note. We began this discussion in the flavor basis, where we noted that gauge boson couplings
are CP invariant, but introducing the mass terms broke CP symmetry. But usually we work in the
mass basis, where we say that the mass terms have CP symmetry, while the gauge boson couplings
break CP, as discussed below. So which term “really” breaks CP? Neither. The point is that there
is no single “objective” definition of CP, because discrete symmetries are only defined up to an
arbitrary linear transformation on the fields. This extra transformation can be chosen to leave the
gauge boson couplings invariant, or the mass terms invariant, but not both at once.
Next, we’ll introduce a useful way to count degrees of freedom, and apply it to the CKM matrix.
• Consider an atom in an external electric or magnetic field. Naively, the field has three degrees
of freedom, but we can always take it to be along the z-axis, giving only one degree of freedom.
The reason is that the atom and field still have SO(3) symmetry provided we rotate them
together, so we can align the field with the z-axis without loss of generality. To count the
number of degrees of freedom of the perturbation, we note that once the field is applied, only an
SO(2) symmetry remains, consisting of rotations about the field direction. We have lost 3 − 1 = 2
symmetry generators, which were the ones used to align the field with the z-axis, so the field is
described by only 3 − 2 = 1 parameter.
• In a more general situation, suppose that some couplings break a symmetry. We can formally
think of the couplings as spurions (i.e. effectively as external fields) which transform under that
symmetry, reducing the reasoning to the previous case. The number of physical parameters needed
to describe the couplings is the naive number, minus the number of broken symmetry generators.
• Now consider a general n × n complex matrix. Each entry is a complex number with a magnitude
and phase, so there are $n^2$ real parameters and $n^2$ phase parameters.
• An n × n orthogonal matrix has n(n − 1)/2 real parameters. However, an orthogonal matrix
can just be thought of as a unitary matrix with the phases removed, and a unitary matrix has
$n^2$ parameters, so a unitary matrix has n(n − 1)/2 real parameters and n(n + 1)/2 phases.
• This reasoning can also be understood by realizing that unitary matrices correspond with
ordered bases of Cn . The first basis vector is described by n − 1 real parameters, for the
magnitudes of the components, and n phases, for the phases of the components. Restricting
to the orthogonal subspace, the second basis vector is described by n − 2 real parameters and
n − 1 phases, and so on.
• Now we apply these results to the CKM matrix. Without the Yukawa couplings, the quark
sector has a $U(3)^3$ symmetry, by unitary transformations of $u_R$, $d_R$, and $Q_L$ individually; this
corresponds to 9 real parameters and 18 phases. Adding the Yukawa couplings breaks this to a
$U(1)_B$ symmetry, which is 1 phase.
• The Yukawa couplings take the form of two 3 × 3 complex matrices, with 18 real parameters
and 18 phases. Hence the quark sector has 9 real parameters (6 quark masses and 3 CKM
angles) and 1 phase.
• One might worry that anomalies upset this parameter counting, once we account for quantum
effects. Indeed, the $U(3)^3$ symmetry includes the axial $U(1)_A$ symmetry, which is anomalous.
The corresponding new term is the QCD θ-term, which has no effect classically.
• In the above derivation, we derived the CKM matrix rather differently, as we used a $U(3)^4$
quark field redefinition, which is not a symmetry of the Lagrangian even for $\lambda_u = \lambda_d = 0$. This
is precisely why the form of the rest of the Lagrangian changed, i.e. why we picked up the
CKM matrix in the first place. We will find the physical parameters of the CKM matrix below
explicitly, but this heuristic analysis using $U(3)^3$ symmetry tells us what to expect.
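The counting can be packaged into a few lines of Python and generalized to n generations. This is a sketch of the spurion counting described above; the function names are hypothetical, and the split into "real" parameters and phases follows the unitary-matrix counting in the text.

```python
def unitary_params(n):
    """(real parameters, phases) of an n x n unitary matrix."""
    return n * (n - 1) // 2, n * (n + 1) // 2

def quark_sector_params(n):
    """Physical (real, phase) parameters of the quark Yukawa sector for n generations."""
    yuk_real, yuk_phase = 2 * n * n, 2 * n * n                 # two complex n x n matrices
    sym_real, sym_phase = (3 * x for x in unitary_params(n))   # U(n)^3 flavor symmetry
    broken_real, broken_phase = sym_real, sym_phase - 1        # U(1)_B stays unbroken
    return yuk_real - broken_real, yuk_phase - broken_phase

def ckm_params(n):
    """(mixing angles, CP phases) left over after removing the 2n quark masses."""
    real, phase = quark_sector_params(n)
    return real - 2 * n, phase

for n in (2, 3, 4):
    print(n, quark_sector_params(n), ckm_params(n))
```

For n = 3 this gives 9 real parameters and 1 phase in the quark sector, i.e. 3 mixing angles and 1 phase in the CKM matrix, while for n = 2 it gives a single angle and no phase.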
Example. Spurions are ubiquitous. Consider a theory of a complex scalar field and a Weyl fermion,
$$\mathcal{L} = \frac{1}{2} (\partial_\mu \phi)^2 + i \psi^\dagger \slashed{\partial} \psi - m_\phi^2 |\phi|^2 - m_\psi \psi^2 + \mathcal{L}_{\rm int}.$$
In the limit $m_\phi \to 0$, we recover a shift symmetry for the scalar, so $m_\phi$ is a spurion for this symmetry.
Assuming the interaction obeys this symmetry, the mass of the scalar can’t become much larger than
mϕ , at least perturbatively. Similarly, mψ is a spurion for chiral symmetry. Scale invariance is
restored when both mass terms go to zero, and supersymmetry is restored when the mass terms
become equal. Supersymmetry effectively transfers the chiral symmetry of the spinor to the scalar.
Next, we investigate the degrees of freedom in the CKM matrix.
• In the case of two generations, unitarity implies that VCKM has four parameters, which can be
expressed as an angle and three phases,
$$V_{\rm CKM} = \begin{pmatrix} \cos\theta_c \, e^{i\alpha} & \sin\theta_c \, e^{i\beta} \\ -\sin\theta_c \, e^{i(\alpha+\gamma)} & \cos\theta_c \, e^{i(\beta+\gamma)} \end{pmatrix}.$$
However, in the absence of the CKM matrix, the Lagrangian would be invariant under a global
phase rotation of any quark field,
$$q_L^i \to e^{i\alpha^i} q_L^i, \qquad q^i \in \{u, d, s, c\}.$$
On the other hand, a rotation of all four quarks simultaneously doesn’t change the CKM matrix,
because it is the U (1)B symmetry. Since the CKM matrix breaks three U (1) symmetries, we
can use them to remove all phases in the CKM matrix; then there is no CP violation. The
remaining angle is called the Cabibbo angle.
• Kobayashi and Maskawa proposed a third generation of quarks, which would allow for CP
violation. In this case, there are 3 angles and 6 phases, but only 5 quark phases available. Thus
the CKM matrix can be parametrized in terms of three angles and one phase.
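• A convenient choice is the Wolfenstein parametrization,
$$V_{\rm CKM} = \begin{pmatrix} 1 - \lambda^2/2 & \lambda & A\lambda^3 (\rho - i\eta) \\ -\lambda & 1 - \lambda^2/2 & A\lambda^2 \\ A\lambda^3 (1 - \rho - i\eta) & -A\lambda^2 & 1 \end{pmatrix} + O(\lambda^4)$$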
where λ ≈ 0.22, A ≈ 0.81, ρ ≈ 0.12, and η ≈ 0.36. This is useful because it parametrizes
generation-mixing effects as powers of λ. For example, crossing from the first to the third
generation is penalized by a factor λ3 . The CP violating phase is parametrized by η. Note that
the top-left block is simply the 2 × 2 CKM matrix with Cabibbo angle λ.
• The unitarity of the CKM matrix is often tested by plotting unitarity triangles. We know that
the inner product of any two distinct columns, or any two distinct rows, must vanish, and each
inner product is the sum of three complex numbers, so there are six ‘unitarity triangles’ in the
complex plane that must close. In most cases, the triangle is very flat because some terms are
much bigger than others, so we usually plot
$$\sum_i V_{id} V_{ib}^* = 0$$
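The Wolfenstein form and the unitarity triangle can be checked numerically. The sketch below builds the matrix from the parameter values quoted above, using the standard form with $1 - \lambda^2/2$ in the first two diagonal entries, and evaluates both the deviation from unitarity, which should be of order $\lambda^4$, and how well the triangle $\sum_i V_{id} V_{ib}^*$ closes. Variable and function names are illustrative.

```python
import numpy as np

def wolfenstein(lam=0.22, A=0.81, rho=0.12, eta=0.36):
    """CKM matrix in the Wolfenstein parametrization, valid up to O(lam^4) corrections."""
    return np.array([
        [1 - lam**2 / 2, lam, A * lam**3 * (rho - 1j * eta)],
        [-lam, 1 - lam**2 / 2, A * lam**2],
        [A * lam**3 * (1 - rho - 1j * eta), -A * lam**2, 1],
    ], dtype=complex)

V = wolfenstein()

# Largest entry of V^dagger V - 1; expected to be of order lam^4.
unitarity_violation = np.max(np.abs(V.conj().T @ V - np.eye(3)))

# The three sides V_id V_ib^* (i = u, c, t) of the "db" unitarity triangle.
sides = V[:, 0] * V[:, 2].conj()
closure = abs(sides.sum())

print(unitarity_violation)
print(np.abs(sides), closure)
```

All three sides of this triangle are of order $\lambda^3$, which is why it is the one usually plotted, and the failure to close is of order $\lambda^5$, consistent with the truncation of the expansion.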
Note that the hypercharge is always the average electric charge of a weak isospin multiplet.
Note. It’s important to avoid thinking of the Higgs sector of the SM as obvious. Over the 50 years
between its proposal and discovery, many influential physicists expressed skepticism, as described in
the historical review The Theoretical Physics Ecosystem Behind the Discovery of the Higgs Boson.
At the time of its proposal, it was not clear that the Higgs mechanism (i.e. the pattern of
electroweak symmetry breaking which gives mass to the W and Z bosons) was even necessary.
Glashow had proposed a model in 1961 where these masses were simply put in by hand, as an
explicit symmetry breaking, and viewed it as no less legitimate than breaking flavor symmetry by
hand. The Higgs mechanism, proposed in 1964 and used by Weinberg and Salam to complete the
SM in 1967 and 1968, gained greater acceptance in 1971 when ’t Hooft showed it to be renormalizable,
in contrast to Glashow’s setup. (But in the 1970s we also learned that renormalizability was a less
important criterion than had been thought, due to the rise of Wilsonian ideas.)
Even when the Higgs mechanism became more accepted, the Higgs boson was not. The Higgs
field was simply the analogue of the Ginzburg–Landau order parameter field in superconductivity.
In that case, the field was meant to measure some aspect of the collective behavior of the electrons,
so the natural analogue would have been to view the Higgs field as representing a condensate of
other particles. (Examples of such theories included top quark condensates, where top quarks
play the role of electrons, and technicolor, which breaks electroweak symmetry by strong gauge
interactions, and has no discernible Higgs excitation at all.) Many physicists, especially condensed
matter physicists, thought that postulating an elementary Higgs was naive, the result of taking
an order parameter field too literally. A further issue, realized throughout the 1970s, is that an
elementary Higgs boson requires fine tuning. As a result, thousands of papers have been written
on alternatives to an elementary Higgs.
The current experimental results have mostly wiped out Higgsless theories, because we now know
there is a new scalar with a mass of about 125 GeV. Currently, it is known that this scalar has the
same quantum numbers as the Higgs, and its direct Yukawa couplings to bottom and top quarks
have been measured, assuming the Higgs vev is as expected. However, we have measured none of
the other Yukawa couplings, or any features of the Higgs potential. For example, it is possible that
there is “induced electroweak symmetry breaking”,
V ⊃ µ2 H † H̃ + m2 |H|2 + V (H̃)
where a second Higgs doublet H̃ acquires a vev, creating a linear term in the Higgs potential and
leading to the observed Higgs mass and vev. In this case, there could be no Higgs quartic term.
In addition, some models with composite Higgs bosons remain viable. Distinguishing between
these options would be a task for a post-LHC collider. Of course, if the Higgs continues to appear
fundamental, and nothing else shows up, the fine-tuning problems pointed out 50 years ago would
become even more severe.
• All quarks couple to gluons with the same strength, because they all transform in the funda-
mental of SU (3)C . In addition, all leptons couple to W bosons with equal strength, because
all the Li transform in the fundamental of SU (2)L . This result is known as lepton universality.
• Lepton universality doesn’t apply to quarks, because of the CKM matrix, but one can still get
nice results upon summing over quarks. For example, for W + decay at tree level, we have
$$\Gamma = \Gamma(W^+ \to e^+ \nu_e) \left( 3 + 3 \sum_{n=1}^{2} \sum_{m=1}^{3} |V_{nm}|^2 \right)$$
where the first factor of 3 comes from the three generations of leptons, and the next factor of
3 comes from the three quark colors. Since the CKM matrix is unitary, the sum is equal to 2,
giving the simple result that the $W^+$ decays to hadrons 2/3 of the time.
• All CP violation is due to the complex phase in the CKM matrix. Thus, any CP violating
process must involve all three generations, giving a suppression of $\lambda^6 \sim 10^{-4}$. Measurements of
CP violation are therefore sensitive probes of new physics.
• The only particles in the SM that connect fermions with different flavors are the W bosons,
through the off-diagonal elements of the CKM matrix.
• At tree level, charged current interactions are mediated by W bosons, while neutral current
interactions (if we define the term rather inclusively) are mediated by the Z boson, gluons,
photons, and the Higgs. However, a loop of W bosons could also contribute to the neutral
current interaction.
• The Standard Model turns out to have no tree-level FCNC, as we have already seen above.
This is obvious for gluons and photons, whose interactions are flavor diagonal. For the Higgs,
it occurs because the Yukawa couplings to the Higgs are proportional to the masses, but it
wouldn’t be true for a more complicated Higgs sector, such as a two Higgs doublet model.
• Tracing back, we might wonder why there was no CKM matrix for Z bosons, which would
have led to tree-level FCNC. This occurred because all the up-type quarks (and down-type
quarks) coupled to the Z boson identically. Thus, the matrix of couplings was proportional to
the identity, and the matrices from changing to mass basis cancelled out, $K_u^\dagger K_u = K_d^\dagger K_d = I$.
• Now consider a general matter sector. The mass terms can connect fields with the same
SU (3)C × U (1)EM irrep, so each type of irrep corresponds to a mass matrix. Meanwhile, fields
in the same SU (3)C × SU (2)L × U (1)Y irrep couple the same way to the Z. Therefore, the logic
above goes through as long as all fields with the same SU (3)C × U (1)EM irrep automatically
have the same SU (3)C × SU (2)L × U (1)Y irrep.
• This holds in the SM, but before the discovery of the charm quark, the strange quark was
proposed to be in an SU (2)L singlet, which would have led to tree-level FCNC with the down
quark. This would have produced a sizable rate for the neutral kaon decay $K^0 \to \mu^+ \mu^-$, but it
was measured to be very rare, with a branching fraction of about $10^{-9}$. That result led to the
prediction of the charm quark.
• The GIM mechanism further suppresses FCNC, through a cancellation at loop level. Consider
the process b → sγ by a W loop which emits a photon, as shown below.
Concretely, this might be part of a B meson decay process. The amplitude is proportional to
$$\sum_{i \in \{u, c, t\}} V_{ib} V_{is}^* \, f(m_i^2/m_W^2).$$
Now consider Taylor expanding the function f . At zeroth order, the result vanishes by unitarity
of the CKM matrix. Beyond zeroth order, we can’t have a bare factor of log(m2i /m2W ), since this
would blow up as mi → 0, which means the leading correction is at least suppressed by m2i /m2W
(possibly multiplied by a logarithm). This is small for everything but the very massive top quark,
which is the reason flavor physics can be used to “measure” the top quark mass. However, the
top quark amplitude is suppressed by several powers of the Wolfenstein parameter λ. Thus,
either the top or charm quark loops could be dominant, depending on the circumstances.
• To estimate the contribution from the charm quark loop, note that
$$\int \bar{d}p \, f(p^2) = \frac{1}{(4\pi)^2} \int dx \, x f(x)$$
so in general a loop gives a numerical factor of roughly $1/(4\pi)^2$. Then the amplitude scales as
$$\frac{1}{(4\pi)^2} \frac{m_c^2}{m_W^2} \frac{1}{m_W^2}$$
compared with a generic 1/Λ2 for new physics. Thus, in general, loop suppression of a process
in the SM allows us to probe new physics at scales up to Λ ∼ 10mW ∼ TeV, while the GIM
mechanism gives an additional factor of mW /mc , reaching an incredible ∼ 100 TeV.
• However, new physics can still exist below this scale. For instance, a new particle’s couplings
could have a trivial flavor structure, coupling identically to each up-type and down-type quark,
in which case FCNC is not modified at tree level. (One example is the minimal dark photon.)
• Alternatively, the new couplings could be proportional to (powers of) the existing Yukawa
couplings. This is the paradigm of “minimal flavor violation”, which makes the SM Yukawa
couplings the only source of flavor violation. It also suppresses new FCNC, and is commonly
used in SUSY model building. A final possibility is “flavor alignment”, where the new couplings
and the Yukawa couplings can be simultaneously diagonalized. So flavor constraints don’t
totally rule out new physics, but rather place strong constraints on how it can look.
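The scale estimates above amount to simple arithmetic, sketched below. The masses are rough inputs ($m_W \approx 80.4$ GeV and $m_c \approx 1.27$ GeV, both assumptions for illustration), and the estimate ignores all O(1) factors, couplings, and CKM elements.

```python
import math

m_W = 80.4   # GeV, rough input
m_c = 1.27   # GeV, rough input

# Generic loop factor 1/(4 pi)^2.
loop_factor = 1 / (4 * math.pi) ** 2

# Loop suppression alone: match 1/Lambda^2 to loop_factor / m_W^2.
scale_loop = m_W / math.sqrt(loop_factor)   # equals 4 pi m_W

# With GIM: match 1/Lambda^2 to loop_factor * (m_c^2 / m_W^2) / m_W^2.
scale_gim = scale_loop * m_W / m_c

print(f"loop only: {scale_loop / 1e3:.1f} TeV")
print(f"with GIM:  {scale_gim / 1e3:.0f} TeV")
```

This gives roughly 1 TeV for a generic loop-suppressed process and several tens of TeV once the GIM factor is included, the same order of magnitude as the figures quoted above.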
Note. The general rule of thumb for loops given above is useful in many contexts. For example,
corrections due to a gluon loop are of order $g_3^2/(4\pi)^2 \sim 10^{-2}$, corrections due to a photon loop
are of order $e^2/(4\pi)^2 \sim 10^{-3}$, and weak loops are intermediate. On the other hand, depending on
the process, loop corrections may also come with logarithmic factors, which could be as large as
$\log(m_W^2/\Lambda_{\rm QCD}^2) \sim 10$ for gluon loops and $\log(m_W^2/m_e^2) \sim 20$ for photon loops.
Processes involving charged particles can also proceed with an extra photon in the final state.
The rule of thumb is that the rate comes with a factor of $e^2 (2\pi)/(2\pi)^3 = \alpha/\pi \sim 2 \times 10^{-3}$, where
the numerator comes from the angular integration and the denominator comes from the momentum
integration measure. (The rest of the phase space integral is not substantially affected, as long as
the photon is soft.) As a concrete example,
$$\frac{\mathrm{Br}(\mu^- \to e^- \nu \bar{\nu} \gamma)}{\mathrm{Br}(\mu^- \to e^- \nu \bar{\nu})} = (1.4 \pm 0.4)\%$$
• An accidental symmetry is a symmetry that arises from the field content, renormalizability,
and other symmetries, but is not put in by hand. For example, in QED, the most general
renormalizable Lagrangian is
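$$\mathcal{L} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + i a F_{\mu\nu} \tilde{F}^{\mu\nu} + i \bar{\psi} \slashed{D} \psi + \bar{\psi} (m + i \gamma^5 m_5) \psi.$$
However, the second term is a total derivative and the final term may be removed by a chiral
rotation $\psi \to e^{i\alpha\gamma^5} \psi$. Then we have accidental C, P, and T symmetry.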
• The SM also contains all possible renormalizable terms, and has several accidental symmetries:
the baryon number U (1)B and the individual lepton numbers U (1)Le , U (1)Lµ , and U (1)Lτ . The
dimension 5 neutrino mass violates both individual and total lepton number, while dimension
6 operators can violate baryon number.
• Note that either U (1)B or U (1)L alone is sufficient to prevent the proton from decaying. Also, if
the proton decays, it must decay into an odd number of fermions by Lorentz invariance, which
requires the parity of the fermion number to be conserved. The only fermions lighter than the
proton are leptons, so lepton number must be violated in the decay.
• Proton decay has been tested stringently, placing a high bound on Λ. For a rough estimate, we
have $\tau > 10^{33}$ years while the decay rate should be $m_p^5/\Lambda^4$ by dimensional analysis, where the
numerator accounts for the phase space; then $\Lambda > 10^{15}$ GeV, a result similar to the bound from
neutrino masses. Thus new physics is either very far away, or respects baryon number.
• Violations of the individual U (1)Li have also been searched for, most stringently through the
unobserved decay µ → eγ, which will be probed further by the upcoming MEG II experiment.
There is also the upcoming Mu2e experiment, which will search for µ → e conversion in nuclei.
• It turns out that anomalies violate B and L conservation, as discussed further in the notes on
Quantum Field Theory, but Li − Lj remains exactly conserved, as does B − L if there is a
sterile neutrino. Neutrino masses break Li − Lj , while Majorana neutrino masses also break L.
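The dimensional-analysis bound on Λ from proton decay can be reproduced in a few lines. This is only a sketch: it takes the rate to be exactly $m_p^5/\Lambda^4$, dropping all dimensionless couplings and phase-space factors, so only the order of magnitude is meaningful. The function name and the unit-conversion constants are introduced here for illustration.

```python
HBAR_GEV_S = 6.582e-25     # hbar in GeV * s, converts a lifetime to a width
SECONDS_PER_YEAR = 3.156e7
M_PROTON = 0.938           # proton mass in GeV


def proton_decay_scale_bound(lifetime_years):
    """Lower bound on Lambda (GeV), assuming Gamma = m_p^5 / Lambda^4 exactly."""
    gamma_max = HBAR_GEV_S / (lifetime_years * SECONDS_PER_YEAR)  # width in GeV
    return (M_PROTON**5 / gamma_max) ** 0.25


bound = proton_decay_scale_bound(1e33)
print(f"Lambda > {bound:.2e} GeV")
```

With these inputs the estimate lands within about an order of magnitude of the quoted bound, which is all that can be expected from dimensional analysis. Because Λ enters as the fourth root, even a large change in the lifetime limit moves the bound only mildly.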
Note. In the Standard Model, the result $\rho_0 = 1$ follows from an approximate accidental symmetry.
Since the Higgs is a complex doublet, thereby containing 4 real fields, its most general possible
global symmetry is O(4). This symmetry is preserved by the Higgs potential, as it only depends
on the combination $\phi^\dagger \phi$. When the Higgs field develops a vev, it is broken to O(3), and since
$\mathfrak{o}(3) \cong \mathfrak{su}(2)$, this residual symmetry is called custodial SU (2).
Writing it in terms of real fields is clunky, but we can work with complex fields by defining
the Higgs matrix Φ = (ϕ, ϵϕ∗ ). The Higgs potential is a function of tr(Φ† Φ), which preserves the
SU (2)L × SU (2)R symmetry
Φ → LΦR†
where L, R ∈ SU (2). The SU (2)L factor acts just like electroweak SU (2)L , while U (1)Y ⊆ SU (2)R .
When Φ acquires a vev, which we can take to be proportional to the identity, the diagonal subgroup
corresponding to L = R is preserved, and this is the custodial SU (2).
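The coupling to gauge fields can be written as
$$\mathcal{L} \supset \mathrm{tr}\, (D_\mu \Phi)^\dagger D^\mu \Phi, \quad D_\mu \Phi = \partial_\mu \Phi + i g A^a_\mu \tau^a \Phi + \frac{i}{2} g' B_\mu \Phi \sigma_3$$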
where the $\sigma_3$ is on the right, because $\phi$ and $\epsilon \phi^*$ have opposite hypercharge. The coupling to the
SU (2)L gauge bosons is invariant under SU (2)L by construction, and under SU (2)R by the cyclic
property of the trace. Therefore, if we set $g' = 0$, the custodial SU (2) survives, and implies the
three massive gauge bosons must be degenerate. For $g' \neq 0$, the same logic implies that a 3 × 3
block of the mass matrix must be proportional to the identity, which implies $\rho_0 = 1$.
This logic would not have applied if, e.g. the Higgs field had been an SU (2)L triplet, so mea-
surements of ρ provide information about the mechanism of electroweak symmetry breaking. On
the other hand, it is not necessary to have one SU (2)L doublet. There can be multiple doublets,
or more radically, if electroweak symmetry had been broken solely by the QCD condensate, there
would be the custodial symmetry SU (2)V , which implies the same result.
The coupling to U (1)Y breaks custodial SU (2), which leads to the radiative correction
in the MS scheme. In addition, the Yukawa couplings generally break custodial symmetry. Within
each generation of quarks, we have custodial symmetry if the up-type and down-type quarks have
the same mass, i.e. if there is isospin symmetry. Therefore, the high mass of the top quark produces
a significant loop correction to ρ,
Measurements of ρ therefore allowed the huge top quark mass to be predicted before it was discovered.
More generally, new physics corrections to the W W , ZZ, γγ, and γZ two-point functions
(called “oblique” corrections, in contrast to direct modifications of the fermion-boson couplings) are
commonly parametrized by the so-called Peskin–Takeuchi parameters S, T , and U .
Note. Suppose that B − L was gauged, and that the corresponding gauge boson was massless.
The result is a long-range force which makes baryons repel, leptons repel, and baryons and leptons
attract each other. It turns out that the constraints on the gauge coupling g are extremely strong.
First, the energy levels of the deuteron would be shifted relative to hydrogen’s by order $g^2/e^2$,
placing a constraint $g^2 \lesssim 10^{-7}$ from spectroscopy. Next, for $g^2 \ll e^2$ there is a strong constraint
from stellar physics, as otherwise B − L gauge boson emission would dramatically accelerate stellar
evolution since such particles could escape more readily than photons. This rules out couplings
stronger than $g^2 \sim 10^{-20}$.
Even stronger constraints come from terrestrial physics. Since the Earth is charge neutral, its
B − L charge is roughly its neutron number. This leads to a repulsion which would have destroyed
the Earth unless the B − L force is weaker than gravitational, $g^2 \lesssim G_N m_n^2 \sim (m_n/M_{\rm pl})^2 \sim 10^{-36}$.
(We can try to avoid this constraint by supposing the Earth has trapped a compensating number
of neutrinos, but it doesn’t work; if the residual charge is strong enough to keep neutrinos trapped,
it is also strong enough to destroy the Earth.) But even if the Earth is stable, the presence of a
B − L force would cause different materials to feel different effective values of g. Precision tests
of the equivalence principle therefore bound $g^2 \lesssim 10^{-48}$. On the other hand, the constraints are
significantly weaker if the B − L gauge boson has a mass, and hence a finite range.
Note. In the limit of massless neutrinos, the SM has another long-range force: leptons can interact
with each other through a loop of neutrinos. In fact, Feynman briefly speculated that this could
explain the gravitational force! However, there is a major immediate obstacle. When a massless
gauge boson is exchanged with momentum k, the matrix element is $\mathcal{M} \sim 1/k^2$, which implies
$V(r) \sim 1/r$ in the Born approximation. For a neutrino loop, the matrix element is proportional to
$G_F^2$, which implies $\mathcal{M} \sim k^2 G_F^2$ by dimensional analysis; Fourier transforming this gives $V(r) \sim 1/r^5$,
which is dramatically different but still technically long-ranged. A more precise calculation gives
$$V(r) = \frac{G_F^2}{4\pi^3 r^5}$$
though at higher orders there is also nontrivial velocity dependence. Unfortunately, the force is so
weak that it has never been observed.
Note. All dimension 6 operators, which number over 2,000, are considered in “Standard Model EFT”
(SMEFT) analyses. These terms have various signatures, such as CP violation, baryon number
violation, and changing the overall rates and high momentum tails of various processes. Choosing
to express experimental results as constraints on SMEFT coefficients has some advantages: it is
quite general and unambiguous, and can easily be combined between different searches. But the
results are hard to interpret in terms of specific models of UV physics.
The SMEFT takes the Higgs doublet as a field, while the less popular HEFT uses the physical
Higgs, i.e. the SMEFT expands about the electroweak symmetry preserving vacuum, while the
HEFT expands about the physical vacuum. The SMEFT is more “straightforward” to work with,
but the HEFT is more general, e.g. it can accommodate other Higgs sectors.
• The first problem is that the SM does not account for neutrino masses and mixings, which we’ve
covered above. On astronomical and cosmological scales, the SM does not account for dark
matter, which is neutral, colorless, cold, non-baryonic, and massive. It also does not contain
enough CP violation to account for the matter/anti-matter asymmetry in our universe.
• The SM has three naturalness problems: the Higgs hierarchy problem, the cosmological constant
problem, and the strong CP problem. The first two are more urgent, in the sense that they
are also fine-tuning problems; on the other hand, the strong CP problem can’t be solved by
anthropics, making it arguably more robust.
• There are also a number of problems which might or might not have an explanation.
– Why is the amount of matter, radiation, and vacuum energy in the universe roughly equal
today? These quantities varied by many orders of magnitude in the universe’s history.
– Why are there three fermion families, and why do they display a hierarchical structure in
their masses and mixings? There are many candidate theories, but none are compelling
enough to earn widespread acceptance.
– Why are the three gauge couplings all relatively close in size?
– Why are there four spacetime dimensions, and one time dimension?
– Why is electric charge quantized? This is not explained by U (1)Y , because we must allow
for projective representations, and the universal cover of U (1) is R. It could be explained
by a grand unified theory where U (1)Y is embedded in a larger group.
$$S = 1 - \frac{g^2}{8} \int dx \, dx' \left[ J^{\mu\dagger}(x) D^W_{\mu\nu}(x - x') J^\nu(x') + \frac{1}{\cos^2\theta_W} J_n^{\mu\dagger}(x) D^Z_{\mu\nu}(x - x') J_n^\nu(x') \right] + O(g^4).$$
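$$\partial^2 Z_\mu - \partial_\mu \partial_\nu Z^\nu + m_Z^2 Z_\mu = -j_\mu$$
and taking the divergence of each side gives $m_Z^2 \partial_\mu Z^\mu = -\partial_\mu j^\mu$. Substituting this back into the
equation of motion gives
$$(\partial^2 + m_Z^2) Z_\mu = -\left( \eta_{\mu\nu} + \frac{\partial_\mu \partial_\nu}{m_Z^2} \right) j^\nu$$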
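and taking a Fourier transform yields the familiar massive vector boson propagator,
$$D^Z_{\mu\nu}(x - y) = \int \bar{d}p \, e^{-ip(x-y)} \tilde{D}^Z_{\mu\nu}(p), \quad \tilde{D}^Z_{\mu\nu}(p) = \frac{i}{p^2 - m_Z^2 + i\epsilon} \left( -\eta_{\mu\nu} + \frac{p_\mu p_\nu}{m_Z^2} \right).$$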
The Green’s function for the W boson is similar. At low energies, we can approximate
• Therefore we get the same S-matrix using the effective weak Lagrangian
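$$\mathcal{L}^W_{\rm eff}(x) = -\frac{G_F}{\sqrt{2}} \left[ J^{\mu\dagger}(x) J_\mu(x) + \rho J_n^{\mu\dagger}(x) J_{n\mu}(x) \right], \quad \frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2}.$$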
This is indeed an effective theory since the four-fermion operator has dimension 6. Higher-order
diagrams would give further contributions, but they are suppressed by more powers of large
masses; this is the reason we don’t have to include the top quark, as it only appears internally
in diagrams where there is already a W boson.
• Why do we include the effects of the W and Z boson, but not that of the Higgs boson?
Integrating out the Higgs yields interactions of the form $\bar{f} f \bar{f}' f'$, but they are further suppressed
by small Yukawa couplings $m_f m_{f'}/v^2$. In addition, these terms don’t break symmetries the
same way the W and Z-mediated interactions do, so when they do contribute to processes, they
tend to be swamped by the larger strong or electromagnetic interactions.
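The matching relation $G_F/\sqrt{2} = g^2/(8 m_W^2)$ can be turned into numbers. The sketch below additionally assumes the tree-level relation $m_W = g v/2$, which is not stated in this passage, so that $G_F = 1/(\sqrt{2} v^2)$. The numerical inputs and function names are illustrative.

```python
import math

G_FERMI = 1.1664e-5   # Fermi constant in GeV^-2 (illustrative input)
M_W = 80.4            # W mass in GeV (illustrative input)


def su2_coupling(g_fermi, m_w):
    """Solve G_F/sqrt(2) = g^2/(8 m_W^2) for the SU(2) coupling g."""
    return math.sqrt(8 * m_w**2 * g_fermi / math.sqrt(2))


def higgs_vev(g_fermi):
    """Higgs vev in GeV, assuming m_W = g v / 2 so that G_F = 1/(sqrt(2) v^2)."""
    return (math.sqrt(2) * g_fermi) ** -0.5


g = su2_coupling(G_FERMI, M_W)
v = higgs_vev(G_FERMI)
print(f"g = {g:.3f}, v = {v:.1f} GeV")
```

The vev depends only on $G_F$, which is why low-energy weak decays fix the electroweak scale without any knowledge of $g$ or $m_W$ separately.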
Example. The muon’s “Michel” decay, $\mu \to e \bar\nu_e \nu_\mu$. It occurs via the leptonic charged weak current,
$$J^\rho = \bar\nu_e \gamma^\rho (1 - \gamma^5) e + \bar\nu_\mu \gamma^\rho (1 - \gamma^5) \mu + \bar\nu_\tau \gamma^\rho (1 - \gamma^5) \tau.$$
Since the muon mass $m_\mu = 105$ MeV is much less than $m_W = 80$ GeV, we can use the effective
theory above, where
$$S - 1 = \int dx \, \mathcal{L}^W_{\rm eff}(x).$$
Here, the position integration enforces momentum conservation. We will compute the amplitudes
$\mathcal{M}$ which have $i \slashed{\delta}(\sum_i p_i)$ factored out of the matrix element. Factoring out the delta function is
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{G_F^2}{4} S_1^{\rho\sigma} S_{2\rho\sigma}$$
where since the neutrinos are massless, the spinor traces are
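We simplify the spinor traces using the usual identities, noting that $(1 - \gamma^5)^2 = 2(1 - \gamma^5)$, for
$$S_1^{\rho\sigma} = 8 (k^\rho q^\sigma + k^\sigma q^\rho - (k \cdot q) \eta^{\rho\sigma} - i \epsilon^{\rho\sigma\mu\nu} k_\mu q_\nu), \quad S_{2\rho\sigma} = 8 (p_\rho q'_\sigma + p_\sigma q'_\rho - (p \cdot q') \eta_{\rho\sigma} - i \epsilon_{\rho\sigma\mu\nu} q'^\mu p^\nu)$$
where the minus sign comes from the determinant of the metric, we find
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 64 G_F^2 (p \cdot q)(k \cdot q').$$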
Finally, we must perform the integral over final state momenta. We have
To perform this tricky three-body integral it’s best to separate out the massless neutrinos,
$$I_{\mu\nu}(Q) = \int \frac{d\mathbf{q} \, d\mathbf{q}'}{|\mathbf{q}| |\mathbf{q}'|} \delta(Q - q - q') \, q_\mu q'_\nu, \quad Q = p - k.$$
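Then by Lorentz invariance we must have $I_{\mu\nu}(Q) = a Q_\mu Q_\nu + b \eta_{\mu\nu} Q^2$. Contracting with $\eta^{\mu\nu}$ and
$Q^\mu Q^\nu$ and using the delta function to simplify, we find
$$a + 4b = \frac{I}{2}, \quad a + b = \frac{I}{4}, \quad I = \int \frac{d\mathbf{q} \, d\mathbf{q}'}{|\mathbf{q}| |\mathbf{q}'|} \delta(Q - q - q').$$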
The integral I is Lorentz invariant, so we work in the center-of-mass frame Q = (σ, 0),
$$I = \int \frac{d\mathbf{q}}{|\mathbf{q}|^2} \delta(\sigma - 2|\mathbf{q}|) = 4\pi \int_0^\infty d|\mathbf{q}| \, \delta(\sigma - 2|\mathbf{q}|) = 2\pi$$
$$\Gamma = \frac{G_F^2}{3 m_\mu (2\pi)^4} \int \frac{d\mathbf{k}}{k^0} \left[ 2 \, p \cdot (p - k) \, k \cdot (p - k) + (p \cdot k)(p - k)^2 \right].$$
We work in the frame of the muon and approximate the electron as massless with energy E,
where the upper bound is attained when the neutrinos exit in the same direction. The size of this
result is substantially smaller than one would get by counting 2π factors, mostly because the final
phase space integral happens to give a numeric factor of 1/16. However, a similar suppression
often occurs whenever there is a three-body decay to light particles. Also note that the energy
distribution for the electron is monotonic: it is most likely to emerge with the maximum possible
energy $m_\mu/2$, while the probability for lower energy is suppressed as $E^2$.
Note. Helicity suppression. Consider the case where the electron and muon neutrino exit in the
z-direction and the electron antineutrino exits in the −z-direction. Then
which vanishes in the limit me → 0. This is because in this limit, chirality coincides with helicity.
Since the electron and muon neutrino are left-handed and the electron antineutrino is right-handed,
the z components of the spin would sum to −3/2, so the decay is forbidden.
There are two ways to think about the effect of an electron mass. We can think of the electron
as a Dirac spinor, in which case a left-handed electron does not have definite helicity, so the process
is allowed. Alternatively, we can think of the electron as made of two massless Weyl spinors, where
chirality and helicity match, and treat the mass as an interaction term that flips the chirality.
Note. The above decay channel is the only one allowed for the muon, so it provides a precise way
to measure the Fermi constant,
$$\tau = \frac{1}{\Gamma} = 2.1970 \times 10^{-6} \text{ s}, \quad G_F = 1.164 \times 10^{-5} \text{ GeV}^{-2}.$$
One-loop corrections only affect $G_F$ at the per-mille level. A similar calculation can be performed
for the τ, which has the two leptonic decay channels $\tau \to e \bar\nu_e \nu_\tau$ and $\tau \to \mu \bar\nu_\mu \nu_\tau$, as well as decays
into hadrons. We can estimate the decay rate in each leptonic channel by simply replacing mµ by
mτ . Thus, one can measure GF from these decays, and the results match that found for muons due
to lepton universality.
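The relation between the muon lifetime and the Fermi constant can be checked numerically. The sketch below assumes the standard tree-level result with a massless electron, $\Gamma = G_F^2 m_\mu^5 / (192 \pi^3)$, which is the textbook form of the rate computed above. The function name and constants are introduced here for illustration.

```python
import math

HBAR_GEV_S = 6.582120e-25   # hbar in GeV * s
M_MUON = 0.1056584          # muon mass in GeV
M_TAU = 1.77686             # tau mass in GeV


def leptonic_width(g_fermi, mass):
    """Tree-level width (GeV) for l -> l' nu nu, neglecting the final lepton mass.

    Assumes Gamma = G_F^2 m^5 / (192 pi^3).
    """
    return g_fermi**2 * mass**5 / (192 * math.pi**3)


g_fermi = 1.164e-5                                  # GeV^-2, value quoted above
lifetime = HBAR_GEV_S / leptonic_width(g_fermi, M_MUON)
print(f"muon lifetime = {lifetime:.4e} s")

# Replacing m_mu by m_tau rescales each leptonic width by (m_tau/m_mu)^5.
ratio = leptonic_width(g_fermi, M_TAU) / leptonic_width(g_fermi, M_MUON)
```

The steep $m^5$ scaling is why the τ, despite decaying through the same interaction, lives many orders of magnitude less long than the muon.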
Note. In the 1950s, it was thought that there was only one neutrino, a conclusion supported by
lepton universality. However, this would imply the decay µ → eγ was possible through a loop of a
W boson and neutrino, with a branching ratio of order α. The nonobservation of this decay led to
the conclusion that there was a separate neutrino for each generation.
Example. Pion decay, $\pi^- \to e^- \bar\nu_e$, has the Feynman diagram shown below.
The d and u quarks do not propagate freely, but rather are bound together by nonperturbative
dynamics; thus we’ll have to parametrize our ignorance using form factors. The decay is again
solely through the charged weak current, where the hadronic weak current is
where we’ve defined vector and axial components with definite parity, and the overall “vector minus
axial” form is because the charged current only couples to left-handed quark fields. Then
$$\mathcal{M} = \langle e^-(k) \bar\nu_e(q) | \mathcal{L}^W_{\rm eff}(0) | \pi^-(p) \rangle = -\frac{G_F}{\sqrt{2}} \bar{u}_e(k) \gamma_\mu (1 - \gamma^5) v_{\nu_e}(q) \langle 0 | J^\mu_{\rm had} | \pi^-(p) \rangle.$$
The QCD vacuum is parity even and the pion is parity odd. Then $\langle 0 | V^\mu_{\rm had} | \pi^-(p) \rangle$ must be an axial
vector, but there are no axial vectors it could be equal to, so it must simply be zero. On the other
hand, $\langle 0 | A^\mu_{\rm had} | \pi^-(p) \rangle$ must be a vector, so it has to be proportional to $p^\mu$. We cannot compute the
matrix element perturbatively, so we absorb it into a single dimensionful parameter called the pion
decay constant $F_\pi$, so that
$$\langle 0 | \bar{u} \gamma^\mu \gamma^5 d | \pi^-(p) \rangle = i \sqrt{2} F_\pi p^\mu.$$
By momentum conservation we have p = k + q and the on-shell spinor identities
We expect helicity suppression, since the pion has spin zero and, in the pion’s rest frame, the two
particles come out back-to-back, giving a total of spin one in the massless limit. This is reflected in
the fact that M ∝ me .
Abbreviating the squared quantity as C, the decay rate in the pion rest frame is
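$$\Gamma = \frac{1}{m_\pi} \int \frac{\bar{d}\mathbf{k} \, \bar{d}\mathbf{q}}{4 k^0 q^0} \slashed{\delta}(p - k - q) \sum_{\rm spins} |\mathcal{M}|^2 = \frac{C}{4\pi^2 m_\pi} \int \frac{d\mathbf{k}}{E |\mathbf{k}|} \delta(m_\pi - E - |\mathbf{k}|) (E + |\mathbf{k}|) |\mathbf{k}|$$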
where we defined E = k 0 and integrated over q so that q = (|k|, −k). The angular integral is 4π,
and integrating the delta function yields
$$\Gamma = \frac{C}{4\pi} m_\pi \left( 1 - \frac{m_e^2}{m_\pi^2} \right)^2.$$
We still don’t know what Fπ is, but we can compute branching ratios, such as
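$$\frac{\Gamma(\pi \to e \bar\nu_e)}{\Gamma(\pi \to \mu \bar\nu_\mu)} = \frac{m_e^2}{m_\mu^2} \left( \frac{m_\pi^2 - m_e^2}{m_\pi^2 - m_\mu^2} \right)^2 = 1.28 \times 10^{-4}.$$
The experimental result is $1.230 \times 10^{-4}$, with the difference accounted for by loop diagrams.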
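The tree-level branching ratio is easy to evaluate directly, since the unknown $F_\pi$ cancels. The following sketch uses approximate particle masses as illustrative inputs; the function name is introduced here.

```python
M_ELECTRON = 0.51100   # MeV
M_MUON = 105.658       # MeV
M_PION = 139.570       # charged pion mass in MeV


def pion_leptonic_ratio(m_light, m_heavy, m_pion):
    """Tree-level Gamma(pi -> l nu) ratio for a light over a heavy lepton.

    The prefactor m_light^2 / m_heavy^2 is the helicity suppression; the
    squared bracket is the ratio of phase-space factors.
    """
    helicity = m_light**2 / m_heavy**2
    phase_space = ((m_pion**2 - m_light**2) / (m_pion**2 - m_heavy**2)) ** 2
    return helicity * phase_space


ratio = pion_leptonic_ratio(M_ELECTRON, M_MUON, M_PION)
print(f"Gamma(pi -> e nu) / Gamma(pi -> mu nu) = {ratio:.3e}")
```

Note that phase space alone favors the electron channel by a factor of about 5; the overall suppression comes entirely from the helicity factor $m_e^2/m_\mu^2$.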
Note. Above, we saw another example of helicity suppression, which is a rather common effect in
the ultrarelativistic limit. Yet another example occurs in the scattering of spin-polarized electrons
and positrons. If we neglect their masses, scattering via a photon is forbidden if the particles
have the same helicities (and hence opposite angular momenta), because the resulting product of
Poincare irreps has helicity zero, while photons have helicity ±1. This is an example of how the
chiral components decouple in massless QED.
For this argument to work, it’s essential that we think in terms of massless Poincare irreps with
helicity, rather than massive Poincare irreps with spin, since the combination of two antiparallel
spin 1/2 particles does have a spin 1 component (with Lz = 0). For this reason, when we account
for a nonzero mass, the scattering can happen, but it’s helicity suppressed.
An objection one could make for the muon and pion decays is: why can’t the decay products
come out with orbital angular momentum? Up to a basis change, orbital angular momentum just
corresponds to a particular pattern of superposition of directions of the outgoing particles, with a
state that looks like $\int d\Omega \, f(\hat{n}) |k\hat{n}, -k\hat{n}\rangle$. (Instead of an integral over angles, one could also express
this state as a sum over l and m involving spherical harmonics, giving a partial wave expansion.
This is more useful for low-energy scattering, where the s-wave typically dominates, and in this case
orbital angular momentum expresses itself as a non-s-wave component. But the point is that one
doesn’t need to do, and indeed can’t do, both expansions at once; both bases here are complete.)
Now consider a symmetry argument involving only rotations about the z-axis. Such rotations
don’t mix the state $|k\hat{z}, -k\hat{z}\rangle$ with others, so the argument can be used to show that $f(\hat{z}) = 0$ without
caring about what the other values of f (n̂) are. And then, since ẑ was arbitrary, this shows
that f (n̂) = 0 in general. This is the proper way to phrase the arguments we made above. (A
slick, but somewhat mysterious way of summarizing this is that “the orbital angular momentum is
perpendicular to the linear momentum, so it doesn’t affect helicity”.)
4.5 CP Violation
Finally, we investigate neutral kaon mixing, which demonstrates CP violation.
• Kaons are pseudoscalar mesons containing either a strange quark or a strange antiquark. The
neutral kaons $K^0$ and $\bar{K}^0$ have quark content $\bar{s} d$ and $\bar{d} s$ respectively.
• Under $\hat{C}$, the neutral kaons are mapped to each other. As discussed earlier, $\hat{C}$ and $\hat{P}$ have some
freedom in phase redefinition, and we may choose these phases so that
$$\hat{C} \hat{P} |K^0\rangle = -|\bar{K}^0\rangle, \quad \hat{C} \hat{P} |\bar{K}^0\rangle = -|K^0\rangle.$$
• We consider the decays of neutral kaons to two pions, either π + π − or π 0 π 0 . Since this is a
flavor-changing interaction, it is mediated by a weak current, as shown below.
The pions are all pseudoscalars, and their total angular momentum must be zero since the kaon
has spin zero, so parity simply exchanges the pions without any signs, and charge conjugation
simply flips the charges. Thus both possible final states $|\pi^+ \pi^-\rangle$ and $|\pi^0 \pi^0\rangle$ are CP even, and
only $|K^0_+\rangle$ can decay to two pions if CP is conserved. The $|K^0_-\rangle$ should have a longer lifetime,
being only able to decay to three pions or other final states.
• Experimentally, it is indeed observed that there are two neutral kaons, $K^0_S$ and $K^0_L$, with a
short and long lifetime respectively. We can create a pure sample of $K^0_L$ by waiting for a time
much longer than the lifetime of the $K^0_S$. However, we occasionally observe the $K^0_L$ decay into
two pions. Specifically, we have
If CP symmetry holds, the amplitude for $K^0$ to transition to $\bar{K}^0$ is the same as the amplitude
to go the other way, and the eigenstates $K^0_{S/L}$ coincide with the CP eigenstates.
• On the other hand, if we have a CP violating phase in the CKM matrix, the amplitude for
$K^0 \to \bar{K}^0$ is not the same as that for $\bar{K}^0 \to K^0$. Thus the ‘mass basis’ is not the same as the
‘CP basis’, so the $K^0_L$ can decay to two pions. Another way of phrasing this is that CP violating
effects produce oscillations between the CP states $|K^0_\pm\rangle$.
Since the kaons decay, the mass eigenstates have complex energies. Here, we’re making the
‘Wigner–Weisskopf’ assumption, i.e. we aren’t keeping track of the ‘environment’ state at all,
so we guarantee an exponential decay.
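$$\hat\Theta H \hat\Theta^{-1} = H^\dagger$$
• If we further had T invariance, which would imply CP invariance, then $\hat{T} H \hat{T}^{-1} = H^\dagger$, so
$$R_{12} = \langle K^0 | \hat{T}^{-1} \hat{T} H \hat{T}^{-1} \hat{T} | \bar{K}^0 \rangle = \langle \bar{K}^0 | H | K^0 \rangle = R_{21}$$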
• Finally, assuming for simplicity that $\epsilon_1 = \epsilon_2 = \epsilon$, which turns out to be correct, one can
straightforwardly calculate
$$\epsilon = \frac{\sqrt{R_{12}} - \sqrt{R_{21}}}{\sqrt{R_{12}} + \sqrt{R_{21}}}.$$
The $R_{ij}$ can be computed in perturbation theory, and then ϵ can be related to the branching
ratios for $K^0_{S/L}$ decay, giving a quantitative calculation of a CP violating effect.
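To see how this formula behaves, the following sketch evaluates ϵ for toy inputs. The values of $R_{12}$ and $R_{21}$ below are hypothetical and chosen only for illustration; they are not physical kaon parameters. The first case mirrors the T-invariant situation $R_{12} = R_{21}$, and the second gives the two elements a small opposite phase.

```python
import cmath


def cp_violation_parameter(r12, r21):
    """Return (sqrt(R12) - sqrt(R21)) / (sqrt(R12) + sqrt(R21)).

    Uses the principal branch of the complex square root, which is adequate
    for the small phases considered here.
    """
    s12, s21 = cmath.sqrt(r12), cmath.sqrt(r21)
    return (s12 - s21) / (s12 + s21)


# T-invariant toy case: equal off-diagonal elements give no CP violation.
eps_symmetric = cp_violation_parameter(1.0 + 0.0j, 1.0 + 0.0j)

# Toy case with a small relative phase: R12 = e^{2i delta}, R21 = e^{-2i delta}.
delta = 1e-3
eps_phase = cp_violation_parameter(cmath.exp(2j * delta), cmath.exp(-2j * delta))
print(eps_symmetric, eps_phase)
```

In the second toy case the result is $i \tan\delta$, so for small phases ϵ is simply proportional to the phase mismatch between the two transition amplitudes.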
5 Neutrinos
5.1 Historical Review
Next, we turn to neutrino masses, the leading correction to the SM. We begin with a history of
neutrino physics.
• 1914: Chadwick demonstrates the energy of the outgoing electron in β decay has a continuous
spectrum, which seems to contradict energy-momentum conservation. (This took almost two
decades from the discovery of β radiation, since such measurements were difficult.)
• 1920s: there was much confusion around this time. Nuclei were thought to be made of protons
and electrons, but this gave the wrong statistics and a much too large magnetic moment.
Ignoring these issues, the continuous spectrum could then be explained by assuming violation
of energy-momentum conservation, which was justified in a 1931 textbook by Gamow by saying
that we already knew such electrons had to behave strangely because of all the other problems.
• 1930: Pauli postulates an additional, nearly undetectable light neutral fermion contained in
the nucleus, called the neutron ν, that solves all the problems above. This is first presented in
absentia by his “dear radioactive ladies and gentlemen” letter.
• 1932: Chadwick discovers the neutron. This is too heavy to be Pauli’s postulated particle, so
Fermi renames it to the neutrino, because that means “little neutron” in Italian. (The -ino
ending was then hijacked for the rest of particle physics to mean a generic fermion, even if they
aren’t “little”.)
• 1934: Fermi introduces a four-fermion theory of weak interactions, allowing calculations. This
accounts for beta decay as the process n → p + e− + ν. This is actually quite a theoretical
advance, because it is the first example of fermion production not in particle-antiparticle pairs.
• 1935: the nucleus is understood as being composed of protons and neutrons, with the neutrinos
and electrons being newly created upon decay. Yukawa postulates a nuclear strong force
mediated by a “meson” (i.e. pion) to hold the nucleus together, with a Yukawa potential.
• 1937: the meson is “discovered” in cosmic rays, which has the right mass but seems to interact
far too weakly with nuclei. A long confusion ensues, until people eventually realize it is a new
particle, the muon, which is like a heavy electron. It is initially thought to be an excited state
of the electron, but the expected decay µ− → e− γ is not observed through many experiments.
• It turns out that cosmic rays are actually high-energy protons, which produce pions upon impact
with atoms in the atmosphere. These pions decay into the muons that we call cosmic rays
above; this is the most common decay because of helicity suppression.
• 1947: Marshak and Bethe propose the “two-meson hypothesis”, where π is produced in cosmic
rays but quickly decays to µ. This ridiculous ad-hoc idea is confirmed to be correct; pions are
observed in cosmic rays high in the atmosphere.
• 1958: shortly after the Wu experiment (1956), neutrinos are observed by Goldhaber et al. to
always have left-handed helicity, which would make sense if they were massless.
• 1962: at this point, the muon neutrino has been theorized, and electron/muon number con-
servation has been postulated to explain the absence of the decay µ− → e− γ. (Actually, this
decay can occur due to neutrino masses, but is exceptionally rare in the SM because of the GIM
mechanism.) This means that pion decay is actually π − → µ− ν µ . A beam of muon neutrinos
fired at nuclei is then expected to produce muons and not electrons, which is confirmed at
Brookhaven in this year.
• 1968: the Homestake experiment detects solar electron neutrinos by the reaction
νe + 37 Cl → e− + 37 Ar
and then filtering out the argon and measuring its decay. It finds 1/3 as much compared to
detailed astrophysical calculations based on the proton-proton chain. The discrepancy is called
the solar neutrino problem, and touched off decades of confusion and finger pointing. Many
people thought that solar modeling or the neutrino experiments or both were mistaken, and
both the theoretical and experimental values changed dramatically over time.
• Note that nearly all of the neutrinos produced in the Sun are expected to be electron neutrinos.
This is because the Sun is “low-energy” by the standards of particle physics. Neutrinos are
hence only produced by charged current interactions, and there is not enough energy to form
muons or taus. We also do not expect electron antineutrinos. The electron neutrinos produced
have MeV scale energies. By comparison, atmospheric neutrinos from cosmic rays go into the
GeV scale.
• The Homestake and related experiments are not sensitive to muon or tau neutrinos, because
absorption by a nucleus would have to produce a muon or tau, and there is not enough energy
to do so.
• 1957: Pontecorvo and Gribov formulate the theory of neutrino flavor oscillations, which violate
electron/muon/tau number. The oscillations require neutrino masses, since massless particles
“do not experience time” and hence can’t oscillate. Later, Mikheyev, Smirnov, and Wolfenstein
refine this into a solution for the solar neutrino problem, which we cover below.
• 1975: the tau is discovered at SLAC, leading to the prediction of the tau neutrino.
• 1970s to 1990s: followups on the Homestake experiment are done. SAGE and GALLEX/GNO
use gallium (lower threshold energy) while SNO, Kamioka, and SuperK use oxygen nuclei in
water (higher threshold energy, but cheaper), confirming the puzzling result. Some of these are
repurposed proton decay experiments motivated by GUTs. Throughout this time, many aren’t
convinced the solar neutrino problem is a real one, since the experiments are difficult and the
nuclear physics of the Sun is complicated.
• 1987: neutrinos from a supernova, SN1987A, are detected. The neutrinos arrive at about the
same time as light (actually earlier, since the light is delayed during core collapse), providing a
strong upper bound on the neutrino mass.
$$\pi^+ \to \mu^+ + \nu_\mu \to e^+ + \nu_e + \bar\nu_\mu + \nu_\mu.$$
For a detector on the ground, one expects an equal rate of muon neutrinos coming down and
up from the other side of the Earth, by a shell-theorem like argument, assuming the isotropy
of high-energy cosmic rays. But Super-Kamiokande found almost exactly half as much going
up, which is explained by their oscillation into tau neutrinos.
• 2000: the ντ is directly observed by the DONUT experiment at Fermilab with the same strategy
as for muon neutrinos, using a tau neutrino beam. This is a very difficult experiment. The
discovery paper itself had only 4 events, and to date only about 10 tau events have been
directly seen by all experiments combined! However, note that the ντ had earlier been observed
indirectly from the Z decay width at LEP.
• 2001: the SNO experiment becomes sensitive to all three flavors of solar neutrinos. The
experiment uses heavy water, containing deuterons (loosely bound pn bound states). Neutrinos
can scatter off the deuteron by a neutral current interaction (same for all three flavors), breaking
it apart, and one then measures the produced neutron. SNO finds a total flux in accordance
with expectation, decisively confirming that solar neutrinos oscillate.
• 2005: KamLAND uses reactor neutrinos to directly observe neutrino oscillations for electron
antineutrinos. Varying the distance can be achieved because Japan has over 50 existing nuclear
reactors at varying distances from the (stationary) detector. Sociologically, this is because
Japan is an island and hence has plentiful water for reactor cooling.
• 2010s: Double Chooz (France), Daya Bay (China), and RENO (South Korea) all find that the
parameter $\theta_{13}$ in the PMNS matrix is nonzero, using reactor neutrinos. NOνA (Fermilab) and
T2K (Japan) do the same with accelerator neutrinos. These experiments do not have the luxury
of KamLAND’s multiple sources; instead they generally use two detectors, a “near” one and a
“far” one, to see how much the neutrino flux decreases.
• Reactor neutrino experiments find an unexpectedly large number of neutrinos at around 5 MeV,
which has not been resolved. There is also an outstanding accelerator neutrino anomaly from
LSND, which was checked by MiniBooNE. MiniBooNE in turn found yet another anomaly,
which has been checked by MicroBooNE, but the interpretation of all three experiments remains
unclear. (Historically, neutrino physics has generated a very large number of anomalies.)
• For the moment, we assume the neutrinos have Majorana masses and ignore issues of gauge
invariance. We write the mass terms as
$$\mathcal{L} \supset -\frac{1}{2} (m_{ab} \bar\nu_a P_L \nu_b + \text{h.c.})$$
where the νa are the neutrino fields. We define the PMNS matrix V to map from the mass
basis to the flavor basis. (Note that when one refers to just “the neutrinos”, one means the
flavor basis. This is in contrast to quarks, where one means the mass basis.)
– The lepton sector has a $U(3)^2$ symmetry, with 6 real parameters and 12 phases.
– The lepton Yukawa coupling is a complex matrix with 9 real parameters and 9 phases.
– This leaves a theory with 3 real parameters (the charged lepton masses) and the three U (1)
lepton number symmetries.
– When we introduce Majorana masses, these symmetries are broken completely.
– The matrix m above is complex symmetric, and has 6 real parameters and 6 phases.
– Hence the masses can be described in terms of 6 real parameters and 3 phases.
Of these parameters, 3 of the real parameters are just the neutrino masses.
V = UK
where U has the same parametrization as the CKM matrix, and K can be chosen to be, e.g.,
$\mathrm{diag}(e^{i\alpha_1}, e^{i\alpha_2}, 1)$. The matrix K can’t be measured by neutrino oscillation experiments.
– This requires introducing a set of right-handed neutrino fields, which gives another U (3)
symmetry to use.
– The U (1)3 × U (3) symmetry is broken to U (1)L , allowing us to absorb 3 real parameters
and 8 phases.
– The Yukawa coupling between the left-handed and right-handed neutrino fields is again a
complex matrix with 9 real parameters and 9 phases.
– Hence the masses can be described in terms of 6 real parameters and 1 phase.
Again, 3 of the real parameters are neutrino masses, while the rest are in the PMNS matrix,
which in this case can be written in the same form as the CKM matrix.
• While this parameter counting is comprehensive and reliable, it can be simplified if we only
care about the PMNS matrix.
– This is naively a general unitary matrix with 3 real parameters and 6 phases.
– In the Majorana case, phases can only be removed by rephasing the charged lepton fields
(since this rephases the flavor basis), giving 3 remaining phases.
– In the Dirac case, both the charged lepton and neutrino fields can be rephased, but a
uniform phase shift does nothing to the PMNS matrix because of the U (1)L symmetry.
This leaves 6 − (6 − 1) = 1 phase.
• The matrix U is parametrized by three mixing angles θ12, θ23, and θ13, and a CP-violating
phase δ. Currently, all of the mixing angles have been found to be nonzero, though θ13 ≈ 8° is
smaller than the rest and took much longer to measure, while δ is only nonzero at 2σ.
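The simplified phase counting for the PMNS matrix can be checked with a few lines of code. This is a minimal sketch: the function names are illustrative, and it only encodes the rephasing argument given above, for n generations.

```python
def unitary_params(n):
    """Return (angles, phases) of a general n x n unitary matrix."""
    angles = n * (n - 1) // 2
    phases = n * (n + 1) // 2
    return angles, phases


def physical_phases(n, majorana):
    """Phases left in the mixing matrix after field rephasings."""
    _, phases = unitary_params(n)
    # Majorana: only the n charged lepton fields can be rephased.
    # Dirac: 2n fields can be rephased, but one overall phase leaves V unchanged.
    removable = n if majorana else 2 * n - 1
    return phases - removable


for n in (2, 3):
    print(n, unitary_params(n), physical_phases(n, True), physical_phases(n, False))
```

For n = 3 this reproduces the three mixing angles, with three phases in the Majorana case and one in the Dirac case.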
• Neutrinos are typically produced and absorbed in charged-current weak interactions, i.e. in
flavor eigenstates |νa ⟩. We consider the amplitude
The spin part is flavor-independent; for simplicity we take the initial and final states to have
spin up. But we also find flavor-dependent phases since the dispersion relations Ei (k) differ.
where the overall phases in K have canceled out; neutrino oscillations cannot measure them.
• In the limit of large L, we note that E typically has some range, so the rapidly oscillating
random phase averages the second factor to 1/2, giving
$$P_{\nu_a \to \nu_b}(E, L) \approx \frac{1}{2} \sin^2(2\theta).$$
Atmospheric neutrino experiments are in this regime for up-going neutrinos. In between, the
probability can oscillate.
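The averaging in the large L limit can be illustrated numerically. The sketch below assumes the standard two-flavor vacuum formula P = sin²(2θ) sin²(1.267 Δm² L/E), with Δm² in eV², L in km and E in GeV. The mixing angle, mass splitting, baseline and energy band are illustrative choices, not values taken from these notes.

```python
import math


def p_two_flavor(theta, dm2_ev2, length_km, energy_gev):
    """Two-flavor vacuum transition probability (standard formula, assumed here)."""
    phase = 1.267 * dm2_ev2 * length_km / energy_gev
    return math.sin(2 * theta) ** 2 * math.sin(phase) ** 2


def energy_averaged(theta, dm2_ev2, length_km, e_lo, e_hi, n=20000):
    """Midpoint-rule average of the probability over a flat energy band."""
    de = (e_hi - e_lo) / n
    total = sum(
        p_two_flavor(theta, dm2_ev2, length_km, e_lo + (k + 0.5) * de)
        for k in range(n)
    )
    return total / n


theta = 0.6  # illustrative mixing angle in radians
avg = energy_averaged(theta, 2.5e-3, 12700.0, 0.2, 1.0)
print(avg, 0.5 * math.sin(2 * theta) ** 2)
```

With a baseline comparable to the Earth's diameter, the phase sweeps through many cycles across the band, so the two printed numbers should agree to within a few percent.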
$$P(\nu_a \to \nu_b) \neq P(\nu_b \to \nu_a)$$
and
$$P(\nu_a \to \nu_b) \neq P(\bar{\nu}_a \to \bar{\nu}_b)$$
where antineutrinos have mixing matrix V∗. (Also note that V∗ is the matrix that appears in the
Lagrangian, because neutrino fields create antineutrinos.) The differences of these probabilities are
proportional to the Jarlskog invariant for the PMNS matrix. (cover in more detail) However,
CPT implies that antineutrinos have the same masses as the corresponding neutrinos, which gives
$$P(\nu_a \to \nu_b) = P(\bar{\nu}_b \to \bar{\nu}_a)$$
in general.
Note. To define the PMNS matrix, we must fix a convention for the mass eigenstates. We let
m_1² < m_2² and let m_3² be the one far from the other two. However, we don’t know if m_3² is larger
(“normal” hierarchy) or smaller (“inverted” hierarchy). Under this convention, the elements of the
PMNS matrix are
$$V = \begin{pmatrix} V_{e1} & V_{e2} & V_{e3} \\ V_{\mu 1} & V_{\mu 2} & V_{\mu 3} \\ V_{\tau 1} & V_{\tau 2} & V_{\tau 3} \end{pmatrix} \sim \begin{pmatrix} 0.8 & 0.4 & 0.1 \\ 0.4 & 0.5 & 0.7 \\ 0.4 & 0.6 & 0.7 \end{pmatrix}$$
where the numbers above are extremely approximate. Assuming the neutrino mass is Dirac, the
PMNS matrix has three physical angles and one physical phase, which we define as
$$\tan^2 \theta_{12} = \frac{|V_{e2}|^2}{|V_{e1}|^2}, \quad \tan^2 \theta_{23} = \frac{|V_{\mu 3}|^2}{|V_{\tau 3}|^2}, \quad V_{e3} = \sin\theta_{13} \, e^{-i\delta}.$$
Current measurements of δ are still consistent with zero within a few sigma. Our knowledge of θ12
comes from solar neutrinos, θ23 from atmospheric neutrinos, and θ13 from reactor neutrinos.
From this matrix, we see that in the large L limit, we lose roughly half of both initial electron
neutrinos and initial muon neutrinos, while “2” neutrinos are composed of each flavor equally. A
nice way to remember this is to use the “tribimaximal” form,
$$|V_{ai}|^2 = \begin{pmatrix} 2/3 & 1/3 & 0 \\ 1/6 & 1/3 & 1/2 \\ 1/6 & 1/3 & 1/2 \end{pmatrix}$$
which was used in many earlier models, but is now ruled out, e.g. since θ13 is nonzero. This is
important, as if it were zero, there would be no CP violation at all.
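The statement about the large L limit can be checked directly from the tribimaximal entries. The sketch below assumes the standard fully averaged formula P(a → b) = Σ_i |V_ai|² |V_bi|², which holds once all oscillating terms have averaged out. The variable names are illustrative.

```python
from fractions import Fraction as F

# Tribimaximal |V_ai|^2, rows (e, mu, tau), columns (1, 2, 3).
TBM = [
    [F(2, 3), F(1, 3), F(0)],
    [F(1, 6), F(1, 3), F(1, 2)],
    [F(1, 6), F(1, 3), F(1, 2)],
]


def averaged_probability(a, b):
    """Fully averaged transition probability between flavors a and b."""
    return sum(TBM[a][i] * TBM[b][i] for i in range(3))


p_ee = averaged_probability(0, 0)
p_mumu = averaged_probability(1, 1)
print(p_ee, p_mumu)
```

Both survival probabilities come out near one half, and the second column shows that the “2” mass eigenstate contains each flavor equally.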
Note. Why are we allowed to restrict to two neutrino flavors sometimes? First, one oscillation
frequency is several times smaller than the others, so for experiments with small lengths (e.g. reactor
neutrinos) we can ignore the slow frequency. Second, for other ones (e.g. solar neutrinos) we often
only measure electron neutrinos, and here |Ve3 |2 is quite small.
Note. Neutrino oscillations are a bit puzzling, because if the mass-basis neutrinos have different
dispersion relations, then it is impossible for a flavor-basis neutrino to have a definite four-momentum.
But the electroweak Feynman diagrams that produce flavor-basis neutrinos impose momentum
conservation at every vertex, so the final state in the reaction e− + X → X ′ + νe looks like
$$|\text{definite flavor } X', \nu_e\rangle = \sum_i |\text{definite momentum } X', \nu_i\rangle$$
where each of the states on the right has a different momentum for X ′ . Tracing out the X ′ , it
would appear that we cannot have interference between the neutrino states. But this is no problem,
for the same reason that a Stern–Gerlach apparatus doesn’t destroy superpositions: the momentum
of the X was not well-defined to begin with! In the case of solar neutrinos, even demanding that
the X lie in the Sun requires a large enough spread in momentum that the X ′ states of different
momentum almost completely overlap.
However, this raises the possibility that neutrino oscillations can decohere. For instance, this
occurs if the distance traveled is great enough that the different components of the neutrino wavepacket
stop overlapping. An exhaustive review of the subtleties of neutrino oscillations is given in Paradoxes
of neutrino oscillations.
Note. If we used the formulas above, we would expect that the Homestake experiment saw 1/2 as
many neutrinos as expected, rather than the actual 1/3. (Sometimes the 1/3 is naively explained
by saying that there are 3 neutrino flavors, but this is a drastic oversimplification.) The 1/3 results
from the MSW effect: while electron neutrinos are created at the center of the Sun, they will be
affected by the electrons in the Sun, so that the neutrinos exiting the Sun are not electron neutrinos.
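The mass eigenstates satisfy a Schrödinger equation in space,
$$i \frac{d}{dL} |\nu_i\rangle = \frac{m_i^2}{2E} |\nu_i\rangle.$$
In terms of flavor eigenstates, we have
$$i \frac{d}{dL} |\nu_\beta\rangle = V_{\beta i} \frac{m_i^2}{2E} V^\dagger_{i\alpha} |\nu_\alpha\rangle.$$
Tau neutrinos are not important here, so we restrict to two flavors,
$$i \frac{d}{dL} \begin{pmatrix} |\nu_e\rangle \\ |\nu_\mu\rangle \end{pmatrix} = \frac{\Delta m^2}{2E} \begin{pmatrix} \sin^2\theta & \cos\theta \sin\theta \\ \cos\theta \sin\theta & \cos^2\theta \end{pmatrix} \begin{pmatrix} |\nu_e\rangle \\ |\nu_\mu\rangle \end{pmatrix}.$$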
Now consider the interaction of electrons with electron-neutrinos, which is the leading (i.e. tree-level)
interaction in this context. The effective four-fermion interaction is
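$$\mathcal{L} \supset 2\sqrt{2} G_F (\bar{\nu}_{eL} \gamma_\mu e_L)(\bar{e}_L \gamma^\mu \nu_{eL}) = -2\sqrt{2} G_F (\bar{\nu}_{eL} \gamma_\mu \nu_{eL})(\bar{e}_L \gamma^\mu e_L)$$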
where we used a Fierz identity. In a matter background with electron number density Ne , in the
matter rest frame, we may set
$$\langle \bar{e}_L \gamma_\mu e_L \rangle = \frac{N_e}{2} \delta_{\mu 0}$$
where the 1/2 is from the two possible helicities. The effective Lagrangian for electron neutrinos is
$$\mathcal{L} \supset \bar{\nu}_{eL} \slashed{\partial} \nu_{eL} - iA (\bar{\nu}_{eL} \gamma_0 \nu_{eL}), \quad A = \sqrt{2} G_F N_e.$$
The matter term produces an effective potential. To see this, consider the equation of motion
$$0 = (\partial^2 - 2iA\partial_0 + A^2)|\nu_e\rangle = (E^2 - |\vec{p}|^2 \mp 2AE + A^2)|\nu_e\rangle$$
where the other sign arises for antineutrinos. This gives the dispersion relation
$$E = |\vec{p}| \pm A$$
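which shows the matter-induced potential. Hence the Schrödinger equation becomes
$$i \frac{d}{dL} \begin{pmatrix} |\nu_e\rangle \\ |\nu_\mu\rangle \end{pmatrix} = \left[ \frac{\Delta m^2}{2E} \begin{pmatrix} \sin^2\theta & \cos\theta \sin\theta \\ \cos\theta \sin\theta & \cos^2\theta \end{pmatrix} + \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix} \right] \begin{pmatrix} |\nu_e\rangle \\ |\nu_\mu\rangle \end{pmatrix}.$$
One way to parametrize this matrix is to subtract off a multiple of the identity, giving
$$\begin{pmatrix} A & (\Delta/2) \sin 2\theta \\ (\Delta/2) \sin 2\theta & \Delta \cos 2\theta \end{pmatrix}, \quad \Delta = \frac{\Delta m^2}{2E}$$
which can be written in the original form with a different ∆ and θ,
$$P(e \to \mu) = \sin^2(2\theta_M) \sin^2 \frac{\Delta_M L}{2}, \quad \Delta_M = \sqrt{(A - \Delta \cos 2\theta)^2 + \Delta^2 \sin^2 2\theta}, \quad \Delta_M \sin 2\theta_M = \Delta \sin 2\theta.$$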
Note that neutrinos and antineutrinos oscillate differently; this is compatible with CPT because
the matter background spontaneously breaks it. Also note that the MSW effect depends on the
sign of ∆, and hence can in principle distinguish between the normal and inverted mass hierarchy.
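As a cross-check on the matter oscillation formula, one can integrate the two-flavor Schrödinger equation numerically and compare with the closed form. The sketch below uses the matrix with the identity piece subtracted, in arbitrary units where ∆, A and 1/L share the same dimension. The function names, the RK4 integrator and the sample parameters are illustrative assumptions.

```python
import math


def matter_mixing(delta, theta, a):
    """Return (Delta_M, sin^2(2 theta_M)) for vacuum splitting delta and potential a."""
    delta_m = math.hypot(a - delta * math.cos(2 * theta), delta * math.sin(2 * theta))
    return delta_m, (delta * math.sin(2 * theta) / delta_m) ** 2


def p_formula(delta, theta, a, length):
    """Closed-form P(e -> mu) in constant-density matter."""
    delta_m, s22 = matter_mixing(delta, theta, a)
    return s22 * math.sin(delta_m * length / 2) ** 2


def p_numeric(delta, theta, a, length, steps=4000):
    """Integrate i d(psi)/dL = H psi with RK4, starting from a pure electron neutrino."""
    b = 0.5 * delta * math.sin(2 * theta)
    d = delta * math.cos(2 * theta)

    def rhs(psi):
        e, m = psi
        return (-1j * (a * e + b * m), -1j * (b * e + d * m))

    def shifted(psi, k, h):
        return (psi[0] + h * k[0], psi[1] + h * k[1])

    psi = (1 + 0j, 0j)
    h = length / steps
    for _ in range(steps):
        k1 = rhs(psi)
        k2 = rhs(shifted(psi, k1, h / 2))
        k3 = rhs(shifted(psi, k2, h / 2))
        k4 = rhs(shifted(psi, k3, h))
        psi = tuple(
            p + h * (c1 + 2 * c2 + 2 * c3 + c4) / 6
            for p, c1, c2, c3, c4 in zip(psi, k1, k2, k3, k4)
        )
    return abs(psi[1]) ** 2


print(p_formula(1.0, 0.6, 0.3, 7.0), p_numeric(1.0, 0.6, 0.3, 7.0))
```

Setting A = ∆ cos 2θ in `matter_mixing` gives sin²(2θ_M) = 1, the resonance condition, and flipping the sign of A changes the result, as expected for antineutrinos.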
In the case of the Sun, A is high in the core, so that a produced |νe ⟩ is approximately a mass
eigenstate. As the neutrino exits, A adiabatically transitions to zero, so the neutrino exits in a
mass eigenstate, namely the heavier one |ν2⟩ because of avoided level crossing, and afterward does
not oscillate. The fraction of electron-neutrinos we see is
$$P_{ee} = |\langle \nu_e | \nu_2 \rangle|^2 = \sin^2\theta \approx \frac{1}{3}$$
which is the correct result. So ironically, the first evidence for neutrino oscillations doesn’t even
involve neutrino oscillations. (Though strictly speaking, the Sun produces neutrinos with a wide
range of energies, and this argument only applies to the high-energy ones. For lower-energy neutrinos,
measured in experiments after Homestake, this effect is less important.) The MSW effect also causes
a “day-night” effect for solar neutrinos, which have to pass through the Earth at night.
Note. Various signs get flipped for antineutrinos, which raises the question: do these results
change if neutrinos are Majorana? The answer is actually no, because if so, then what we call
“neutrino” and “antineutrino” just stand for left-helicity neutrino and right-helicity neutrino in
the lab frame, since that determines how we can detect them. (Strictly speaking, the weak force
couples to definite chirality; the mismatch between chirality and helicity gives errors, but they are
suppressed by powers of mν /E, which is tiny for all neutrinos ever detected.)
• In general, any fermion that mixes with ordinary neutrinos but does not couple to anything
else in the SM is called a sterile neutrino. By definition, sterile neutrinos must have no charge
under any gauge group.
• Concretely, suppose we introduce N sterile Majorana neutrino fields. (As mentioned above, this
does not lose any generality. Only Lagrangian terms break symmetries, not how we package
the fields in them.) The most general Majorana mass terms include mixing terms between the
sterile and ordinary neutrinos, with mass matrix
$$\begin{pmatrix} m & \mu \\ \mu^T & M \end{pmatrix}$$
• First, consider the case µ ≪ m, M . Then there is negligible mixing between sterile and ordinary
neutrinos, and the sterile neutrinos don’t do anything at all, though there may be constraints
on them from cosmology.
• Next, consider the case of Dirac neutrinos, m = M = 0. In this case, we can write the fields as
N − 3 massless decoupled sterile neutrinos and 3 massive Dirac neutrinos. The result is exactly
analogous to the quark fields. Lepton number is conserved, and the phases αi are all zero.
• It seems that it would be easy to rule out Dirac neutrinos, because ordinary neutrinos would
quickly oscillate into sterile neutrinos, which have the opposite chirality, leading to an easily
measurable missing probability. Chirality oscillations indeed occur for other fermions, but
the mixing angle between the chirality and mass basis in the Dirac equation goes to zero as
the neutrino becomes ultrarelativistic. Since all neutrinos available are ultrarelativistic, the
oscillation amplitude is extremely small. This effect is known as helicity suppression.
• On the other hand, if we have light sterile neutrinos, m ∼ µ ∼ M , the previous argument
doesn’t apply. These models are indeed tightly constrained by “missing probability”.
• Finally, seesaw neutrinos are the case m ≪ µ ≪ M. The eigenvectors are almost purely sterile
and normal, with masses on the order of M and m + µ²/M. These models are experimentally
acceptable, since the sterile neutrinos are too heavy to be produced, and give the right neutrino
mass naturally, as we’ll see below.
• When it is asked whether neutrinos are Dirac or Majorana, a Dirac neutrino would simply
correspond to m = M = 0. If there are Majorana mass terms, U (1)L is violated and neutrinos
can annihilate themselves. This isn’t forbidden, since U (1)L is merely an accidental global
symmetry of the SM anyway, and is even anomalous.
• However, note that U (1)B−L is also an accidental global symmetry of the SM, which isn’t
anomalous. In extensions of the SM where U (1)B−L is gauged and not spontaneously broken,
we must have m = M = 0. Alternatively, we could rule out these terms by just postulating
that U (1)L or U (1)B−L are exact global symmetries.
• It would also be easy to tell the difference between Majorana or Dirac neutrinos if we could
detect nonrelativistic neutrinos, which appear in the cosmic neutrino background. This is
extremely challenging, since cross sections scale with the neutrino energy; we have never seen
any nonrelativistic neutrinos.
• The simplest possible sterile neutrinos are a set of three right-handed neutrinos
$$N^i = \nu_R^i = (\nu_{eR}, \nu_{\mu R}, \nu_{\tau R})$$
which are gauge singlets. Then we can include a Yukawa mass term,
$$\mathcal{L} \supset -\sqrt{2} \left( \lambda^\nu_{ij} \bar{L}^i \phi^c N^j + \text{h.c.} \right)$$
where we use ϕc to make the hypercharge work out. Integrating out the Higgs, this corresponds
to the case of Dirac neutrinos. Such a term is not SU (2)L × U (1)Y invariant, but this is
acceptable since the only residual symmetry below the Higgs scale is U (1)A .
• We can also write down a gauge invariant Majorana mass term for the sterile neutrinos. The
natural scales for the mass matrices are then
where Λ is the SM cutoff. We hence get a seesaw mechanism with neutrino masses Λ and
M_EW²/Λ, and the latter is the right mass if Λ is about the GUT scale, a compelling coincidence.
• To have Dirac neutrinos, we need to force M to be small somehow. For example, we could
use an exact B − L symmetry. However, µ must also be much smaller than MEW . This is
technically natural in the same way that the lightness of the up and down quarks relative to
MEW is, but it’s a bit unsatisfying because it adds more unexplained flavor structure.
• If we assume that whatever physics produces the neutrino masses is heavy, then we can simply
use effective field theory. Here, neutrinos receive mass by the dimension 5 “Weinberg operator”,
$$\mathcal{L} \supset -\frac{Y^{ij}}{\Lambda} (L^{iT} \phi^c) C (\phi^{cT} L^j) + \text{h.c.}$$
where the conjugates ensure gauge invariance. The mass is therefore about M_EW²/Λ, giving a
simple reason that the seesaw mechanism, and related mechanisms, work.
• From the effective field theory point of view, neutrino masses are the leading correction to the
SM, because they are the only dimension 5 operator we can write down. If we take Λ to be
near the GUT scale as inferred from the neutrino masses, then dimension 6 operators are very
hard to measure.
• There are many more ways to UV complete the Weinberg operator. Above, we have only
considered the “type I seesaw”, which introduces a right-handed fermionic singlet. But one
can also introduce a scalar weak triplet (type II seesaw) or a fermionic weak triplet (type III
seesaw), as any of these can play the role of the intermediate heavy particle.
• There are also “radiative” mass generation models where the neutrino mass is only generated at
loop level, such as the Zee model and the Ma model. The Ma model is called “scotogenic” (“from
darkness”) since the mass comes from neutrino interactions with a dark matter candidate.
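The seesaw scaling discussed above, with masses of order M and µ²/M, can be checked for a single generation. The sketch below diagonalizes the 2 × 2 matrix with entries m, µ, M. Taking m = 0, µ = 100 GeV as a stand-in for M_EW, and M = 10¹⁵ GeV as a stand-in for a GUT-scale Λ are illustrative assumptions, not values from these notes.

```python
import math


def seesaw_masses(m, mu, big_m):
    """Eigenvalues (light, heavy) of [[m, mu], [mu, big_m]] for one generation."""
    heavy = 0.5 * ((m + big_m) + math.hypot(big_m - m, 2 * mu))
    # Use det = light * heavy to avoid catastrophic cancellation when mu << big_m.
    light = (m * big_m - mu * mu) / heavy
    return light, heavy


GEV_TO_EV = 1e9
light, heavy = seesaw_masses(m=0.0, mu=100.0, big_m=1e15)
# The sign of the light eigenvalue can be absorbed by a field redefinition.
print(abs(light) * GEV_TO_EV, "eV;", heavy, "GeV")
```

With these inputs the light state lands at the 0.01 eV scale, which is the coincidence referred to above.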
Note. Why did we discuss neutrino masses in a set of notes on the Standard Model? After all,
doesn’t the Standard Model require neutrino masses to be zero? This is debatable, because if one
reads the term literally, as the standard model one uses to describe nature, then it has changed
significantly since the advent of “the” Standard Model, and now includes neutrino masses. For some
historical discussion, see The Once and Present Standard Model of Elementary Particle Physics.
Note. The clearest experimental signature for a Majorana mass term would be neutrinoless double
beta decay. Some nuclei cannot decay by beta decay, because the resulting product is heavier, but
can decay if two beta decays occur at once, a rare process. In neutrinoless double beta decay, the
two neutrinos produced annihilate, which is even rarer due to helicity suppression,
$$\Gamma \sim G_F^4 \frac{m_\nu^2}{E^2} E^9 \sim 10^{-31} \text{ years}^{-1}$$
assuming mν ∼ 0.01 eV, E ∼ 1 MeV. The process can be identified by an incredibly sharp peak in
the energy spectrum of the resulting electrons, which requires very sensitive energy measurements.
Relevant experiments are reviewed here and here. Some experiments, with background count rate
plotted against energy resolution, are shown below.
There is a tradeoff between large size, at the right of the plot, and good energy resolution, at the
bottom. For example, CUORE uses precise bolometers (i.e. calorimeters) in a dilution fridge; it
will upgrade to CUPID by adding a light detector to veto most of the background, in the form of
degraded alpha particles. On the other extreme, KamLAND-Zen uses about 800 kg of liquid ¹³⁶Xe
dissolved in liquid scintillator, and operates much like direct dark matter detection experiments,
though its threshold is at the MeV scale, while WIMP recoils are at the keV scale and lower.
Current experiments probe down to Γ ∼ 10⁻²⁸ years⁻¹, while future experiments have the
concrete goal of probing the inverted neutrino hierarchy.
Past experiments had the potential to probe heavier, quasi-degenerate neutrinos, but these are
in tension with cosmology, so one needs to add epicycles to fix this. Thus, as experiments get
more precise, we actually move towards testing the simplest models. Unfortunately, the possibility
remains that a cancellation occurs for the normal hierarchy, making the rate very small.
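The order-of-magnitude estimate for the neutrinoless double beta decay rate can be reproduced by restoring units. This is a rough sketch in natural units. The constants are standard approximate values, and the function name is illustrative.

```python
G_F = 1.166e-5           # Fermi constant in GeV^-2 (approximate)
HBAR_GEV_S = 6.582e-25   # hbar in GeV s (approximate)
SECONDS_PER_YEAR = 3.156e7


def rate_per_year(m_nu_ev, e_mev):
    """Gamma ~ G_F^4 (m_nu / E)^2 E^9, converted from GeV to inverse years."""
    m_nu = m_nu_ev * 1e-9   # eV -> GeV
    e = e_mev * 1e-3        # MeV -> GeV
    gamma_gev = G_F ** 4 * (m_nu / e) ** 2 * e ** 9
    return gamma_gev / HBAR_GEV_S * SECONDS_PER_YEAR


rate = rate_per_year(0.01, 1.0)
print(rate)
```

The steep E⁷ dependence means the estimate shifts by orders of magnitude for modest changes in the assumed energy release, so only the rough scale is meaningful.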
6 Quantum Chromodynamics
6.1 Hadron Production
Before beginning, we consider the running coupling.
$$\beta(g) = -\beta_0 \frac{g^3}{16\pi^2}, \quad \beta_0 = \frac{11}{3} C_A - \frac{4}{3} \sum_f T_f$$
where CA is the quadratic Casimir of the adjoint representation, and Tf is the Dynkin index of
the representation for quark flavor f. We see that fermions provide screening, while nontrivial
gluon-gluon interactions provide ‘antiscreening’, which favors asymptotic freedom.
• In the case of QCD, we have the group SU (3), so CA = 3, and all the quarks transform in the
fundamental representation where T_f = 1/2, so
$$\beta(g) = -\beta_0 \frac{g^3}{16\pi^2}, \quad \beta_0 = 11 - \frac{2}{3} N_f$$
where Nf is the number of flavors. Then the beta function is negative if Nf < 33/2. This also
holds for QED, where CA = 0 and the beta function is positive for any nonzero Nf .
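• For high energies, we have Nf = 6, so the beta function is negative. Defining αs = g²/4π,
$$\frac{d\alpha_s}{d\log\mu} = -\frac{\beta_0}{2\pi} \alpha_s^2.$$
Integrating, we have
$$\alpha_s(\mu) = \frac{2\pi}{\beta_0} \frac{1}{\log(\mu/\mu_0) + 2\pi/(\beta_0 \alpha_s(\mu_0))} = \frac{2\pi}{\beta_0 \log(\mu/\Lambda_{QCD})}.$$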
• We’re implicitly using a mass-independent scheme, so each quark continues to contribute even
when µ is much less than its mass. In practice, when we drop below the top quark mass we
‘manually’ stop its running, matching the coupling and then setting Nf = 5, and so on. Doing
this yields ΛQCD ≈ 200 − 500 MeV, though the answer depends on the subtraction scheme.
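The one-loop running above is simple enough to evaluate directly. The sketch below assumes a reference value αs(M_Z) ≈ 0.118 at M_Z ≈ 91.19 GeV, which is an input not taken from these notes, and keeps Nf fixed at 5 with no threshold matching. The function names are illustrative.

```python
import math


def beta0(nf, ca=3.0, tf=0.5):
    """One-loop coefficient beta_0 = (11/3) C_A - (4/3) sum_f T_f."""
    return 11.0 * ca / 3.0 - 4.0 * nf * tf / 3.0


def alpha_s(mu, mu0, alpha0, nf):
    """One-loop running coupling at scale mu, given alpha0 at scale mu0."""
    return 2 * math.pi / (beta0(nf) * math.log(mu / mu0) + 2 * math.pi / alpha0)


def lambda_qcd(mu0, alpha0, nf):
    """Scale at which the one-loop coupling diverges."""
    return mu0 * math.exp(-2 * math.pi / (beta0(nf) * alpha0))


M_Z, ALPHA_MZ = 91.19, 0.118  # assumed reference point
for mu in (10.0, 91.19, 1000.0):
    print(mu, alpha_s(mu, M_Z, ALPHA_MZ, nf=5))
print("one-loop Lambda (Nf = 5):", lambda_qcd(M_Z, ALPHA_MZ, nf=5), "GeV")
```

The value of Λ obtained this way need not fall in the range quoted above, since that range involves matching across quark thresholds and a specific subtraction scheme.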
• First, we consider the process e⁺e⁻ → q q̄, which we treat perturbatively by asymptotic freedom.
For simplicity we consider only the tree-level process where the intermediate particle is a virtual
photon, as shown below.
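$$\frac{1}{4} \sum_{\text{spins}} |\mathcal{M}|^2 = \frac{e^4 Q^2}{4 q^4} \, \text{tr}(\slashed{k}_1 \gamma^\mu \slashed{k}_2 \gamma^\nu) \, \text{tr}(\slashed{p}_1 \gamma_\mu \slashed{p}_2 \gamma_\nu) = \frac{8 e^4 Q^2}{q^4} \left[ (p_1 \cdot k_1)(p_2 \cdot k_2) + (p_2 \cdot k_1)(p_1 \cdot k_2) \right].$$
where we have
$$\frac{1}{4} \sum_{\text{spins}} |\mathcal{M}|^2 = e^4 Q^2 (1 + \cos^2\theta).$$
$$d\sigma = \frac{e^4 Q^2}{8\pi^2 q^2} \frac{d\mathbf{k}_1}{4|\mathbf{k}_1|^2} \, \delta\big(\sqrt{q^2} - 2|\mathbf{k}_1|\big) (1 + \cos^2\theta).$$
• Writing dk1 = |k1|² d|k1| dΩ and performing the delta function gives
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2 Q^2}{4 q^2} (1 + \cos^2\theta)$$
$$\sigma = \frac{4\pi\alpha^2}{3 q^2} Q^2.$$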
This matches the result from our more specific formulas for the cross section.
• Note that the cross section only depends on the identity of the final particles through Q. Then
to reduce experimental and theoretical uncertainties, we can test this result by comparing it to
the cross section for e+ e− → µ+ µ− .
• Let |X⟩ denote a generic hadronic final state and let |0⟩ denote the QCD vacuum. Then the
amplitude to produce |X⟩ is approximately
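$$\mathcal{M}_X = \frac{e^2}{q^2} \langle X | J_h^\mu | 0 \rangle \, \bar{v}_e(p_2) \gamma_\mu u_e(p_1), \quad J_h^\mu = \sum_f Q_f \bar{q}_f \gamma^\mu q_f$$
where J_h^µ is the hadronic electric current, and we are essentially assuming that the process goes
as e⁺e⁻ → γ* → q q̄ → hadrons with a clean separation between the steps. Summing over all
final states gives
$$\sigma = \frac{1}{8 p_1^0 p_2^0} \sum_X \frac{1}{4} \sum_{\text{spins}, \, p_X} \slashed{\delta}(q - p_X) |\mathcal{M}_X|^2$$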
where the sum over pX includes the appropriate Lorentz invariant phase space factors.
• To simplify this, we introduce the hadronic spectral density as we did for a scalar field,
$$\rho_h^{\mu\nu}(q) = (-\eta^{\mu\nu} q^2 + q^\mu q^\nu) \, \theta(q^0) \, \rho_h(q^2)$$
where the theta function exists because the |X⟩ states have positive energy.
$$\sigma = \frac{16\pi^3 \alpha^2}{q^2} \rho_h(q^2).$$
Switching from a hadron-level to quark-level description of the process is called quark-hadron
duality. Using this assumption, the computation is essentially identical to our earlier computation
for q q̄ final states.
where the final states are on-shell, k₁² = k₂² = m_f². Unlike our previous computation, we maintain
the masses of the quarks. We know the integral must take the form
$$I^{\mu\nu} = A q^\mu q^\nu + B \eta^{\mu\nu}$$
where A and B are found by contracting both sides with ηµν and qµ qν . We thus find
$$\rho_h(q^2) = \frac{N_c}{12\pi^2} \sum_f Q_f^2 \, \theta(q^2 - 4m_f^2) \left( 1 - \frac{4m_f^2}{q^2} \right)^{1/2} \frac{q^2 + 2m_f^2}{q^2}.$$
$$\sigma = N_c \frac{4\pi\alpha^2}{3q^2} \sum_f Q_f^2$$
where the sum is over all quarks light enough to be produced; we thus expect a series of plateaus
between jumps. Experimental results confirm that Nc = 3 and display the same plateaus, with
extra resonances throughout. There is a resonance between each plateau, corresponding to the
lightest meson containing the new quark that can be produced, e.g. the J/ψ for the charm
quark. The result is good for √s ∈ [2, 20] GeV. At high energies, we run into the broad Z pole,
while at low energies, αs is large.
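The plateau structure can be made concrete by dividing the hadronic cross section by the e⁺e⁻ → µ⁺µ⁻ one, which leaves R = Nc Σ_f Q_f² times the threshold factor from the spectral density. In the sketch below the quark masses are rough illustrative values (an assumption, and scheme-dependent), and resonances and higher-order corrections are ignored.

```python
import math

# (name, electric charge, approximate mass in GeV); masses are illustrative.
QUARKS = [
    ("u", 2 / 3, 0.0022), ("d", -1 / 3, 0.0047), ("s", -1 / 3, 0.095),
    ("c", 2 / 3, 1.27), ("b", -1 / 3, 4.18), ("t", 2 / 3, 173.0),
]


def r_ratio(sqrt_s, nc=3):
    """Leading-order R = sigma(hadrons) / sigma(mu+ mu-) at CM energy sqrt_s in GeV."""
    s = sqrt_s ** 2
    total = 0.0
    for _, charge, mass in QUARKS:
        if s > 4 * mass ** 2:
            x = mass ** 2 / s
            # Threshold factor (1 - 4 m^2 / q^2)^(1/2) (q^2 + 2 m^2) / q^2.
            total += charge ** 2 * math.sqrt(1 - 4 * x) * (1 + 2 * x)
    return nc * total


for energy in (2.0, 3.5, 8.0, 20.0):
    print(energy, r_ratio(energy))
```

Below the charm threshold the ratio sits near 2, and well above the bottom threshold it approaches 11/3. Both values scale with Nc.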
• At next-to-leading order, we must account for a gluon loop on the qqγ vertex. The loop is UV
finite after renormalization but IR divergent, giving a divergent negative contribution to the
cross section. This cancels with the IR divergences in the tree-level cross section for e+ e− → qqg.
Thus the total cross section for
is finite, and the result is that the total cross-section is multiplied by 1 + αs/π. Physically, the
q and q̄ are seen as jets, so we’ve computed the differential cross-section for jet production.
• Intuitively, the intermediate photon is very far off-shell, decaying in time 1/√(q²) by the
energy-time uncertainty principle. The emission of soft gluons takes place over a much longer timescale,
so it can’t retroactively change how the photon decayed. Thus the IR divergences simply
account for how the hard quarks are “dressed” after their production and cannot affect the total
rate, so they must cancel. Indeed, the corrections in the next-to-leading order cross section
come from kinematic regions where the virtual gluon is hard.
• The formal proof that IR divergences cancel is rather difficult. For QED, the result is the
Bloch–Nordsieck theorem, while for the general non-abelian case it is the KLN theorem.
• Similarly, consider the tree-level differential cross-section for e+ e− → qqg. By the above
considerations, when the gluon is soft, we see two jets, not three. If we restrict to regions where
the gluon is sufficiently hard, we can trust the result, giving a QCD prediction for the distribution
of three-jet events.
• A related question is what scale to choose for the running coupling αs. Intuitively, it should
be set at the scale of the momenta in the question, i.e. √s for the dijets. However, for more
complicated processes there will be multiple invariant momenta; for the three-jet event, one
might choose the transverse momentum of the gluon.
modifying the scale µ by a factor of 2 changes αs by O(αs2 ). Thus ambiguities in the scale can
be resolved by computing to the next order.
$$\Pi_h^{\mu\nu}(x, y) = i \langle 0 | T J^\mu(x) J^\nu(y) | 0 \rangle.$$
Since J^µ is the hadronic electric current, the QED Ward identity is q_µ Π_h^{µν} = 0, so
$$\Pi_h^{\mu\nu}(q) = (-\eta^{\mu\nu} q^2 + q^\mu q^\nu) \, \Pi_h(q^2).$$
Intuitively, since the hadronic current couples as J µ Aµ , the two-point function is essentially
the set of hadronic loop corrections to the amputated photon propagator.
Then Πh (q 2 ), as a function of complex q 2 , gets a branch cut starting at the masses of the lightest
hadrons.
• We can then use this information to compute ρh (q 2 ) and thereby make experimental predictions.
Note that we may invert the formula above for
$$\rho_h(q^2) = \lim_{\delta \to 0} \frac{\Pi_h(q^2 + i\delta) - \Pi_h(q^2 - i\delta)}{2\pi i}.$$
This can be computed by integrating the derivative of Πh (z) along any contour connecting the
two points. In particular, we can take a large circle of radius q 2 . We won’t go into much more
detail here, but the idea of performing QCD computations by taking advantage of analyticity
is related to S-matrix theory and dispersion relations, and leads to “sum rules”.
• Historically, the first hint that the strong interaction was asymptotically free came from
hadron-hadron scattering experiments. In these experiments, the hadrons were shattered into many
constituents, but most of them had low transverse momentum, indicating that the components
of the hadrons were loosely bound and could not absorb a large momentum.
• We need to talk about transverse momentum because, while the lab frame coincides with the
CM frame of the two protons, it generally doesn’t coincide with the CM frame of two proton
constituents colliding, which may have a longitudinal boost. Another way to quantify the
momentum transfer q is via its square q 2 , as if q 2 is large and spacelike, the components of q
must be large in any frame.
• In the 1960s, deep inelastic scattering experiments involving electrons and protons indicated
that the proton was made of a small number of pointlike constituents, called partons. The
physical picture is that the hard scattering involved a photon exchange between one electron
and one parton, while subsequent small-q 2 exchanges between the struck parton and the others
produced jets, as we’ve seen above.
• Just as in Newtonian mechanics, elastic scattering refers to a scattering event where kinetic
energy is conserved. In deep inelastic scattering, the proton instead absorbs energy, shattering
into many pieces.
where s, t, and u are the Mandelstam variables for the electron-quark collision.
$$\frac{d\sigma}{dt} \sim \frac{\alpha^2 Q_i^2}{s^2} \frac{s^2 + (s + t)^2}{t^2}.$$
Note that t = q². Since the momentum transfer is spacelike, we define Q² = −q² for convenience.
where s′ is a Mandelstam variable for the electron-proton collision. Since the electron-parton
scattering is elastic,
$$0 \approx (p + q)^2 = 2 p \cdot q + q^2 = 2 \xi P \cdot q - Q^2$$
so we may measure ξ from observations of the electron alone,
$$\xi = x \equiv \frac{Q^2}{2 P \cdot q}.$$
• We let fi (x) be a parton distribution function, denoting the probability that the constituent i
carries longitudinal momentum fraction x. Combining our results,
$$\frac{d^2\sigma}{dx \, dQ^2} \sim \sum_i f_i(x) \, Q_i^2 \, \frac{\alpha^2}{Q^4} \left[ 1 + \left( 1 - \frac{Q^2}{x s'} \right)^2 \right].$$
The only part of this cross section that depends on the strong interaction is fi (x), while
everything else is just from the QED amplitude and the phase space kinematics. Dividing the
cross section by these extra factors gives a cross section independent of Q2 , a prediction known
as Bjorken scaling, validated to 10% accuracy for Q ≳ 1 GeV.
• Physically, Bjorken scaling means that the proton appears the same to an electromagnetic
probe, no matter how hard the proton is struck. This is sensible, because for high Q, the
scattering process is much faster than the internal dynamics of the proton.
• On the other hand, Bjorken scaling should be corrected by emission of high-momentum partons;
this remains possible at arbitrarily high energies as the strong coupling only decays to zero
logarithmically. Thus the parton distribution functions depend logarithmically on Q2 and their
RG evolution equations are called the Altarelli–Parisi or DGLAP equations.
Next, we turn to a quantitative analysis of the cross section for deep inelastic scattering.
$$d\sigma = \frac{1}{4EM} \frac{\bar{d}p'}{2p'^0} \sum_{X, \, p_X} \slashed{\delta}(q + P - p_X) \, \frac{1}{2} \sum_{\text{spins}} |\mathcal{M}|^2$$
where the sum over pX includes the Lorentz invariant phase space factors for a variable number
of final state particles, we average over the initial spin of the electron, assuming the hadron H
is spinless, and M is the mass of H.
$$\frac{1}{2} \sum_{\text{spins}} |\mathcal{M}|^2 = \frac{e^4}{2q^4} L_{\mu\nu} \langle H(P) | J_h^\mu | X \rangle \langle X | J_h^\nu | H(P) \rangle.$$
• Next, we define
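$$W_H^{\mu\nu}(q, P) = \frac{1}{4\pi} \sum_X \slashed{\delta}(q + P - p_X) \langle H(P) | J_h^\mu | X \rangle \langle X | J_h^\nu | H(P) \rangle$$
• Since W_H^{µν} is contracted with L_{µν}, we can take it to be symmetric. By current conservation,
q^µ L_{µν} = 0, which means we can choose q_µ W_H^{µν} = 0. Then we have two form factors,
$$W_H^{\mu\nu} = \left( -\eta^{\mu\nu} + \frac{q^\mu q^\nu}{q^2} \right) W_1 + \left( P^\mu - \frac{P \cdot q}{q^2} q^\mu \right) \left( P^\nu - \frac{P \cdot q}{q^2} q^\nu \right) W_2$$
$$Q^2 \equiv -q^2 = 2 p \cdot p', \quad \nu = P \cdot q.$$
$$x = \frac{Q^2}{2\nu}, \quad y = \frac{P \cdot q}{P \cdot p} = \frac{\nu}{ME}$$
$$\frac{d\sigma}{dx \, dy} = \frac{4\pi\alpha^2}{Q^4} \, 2ME \left[ x y^2 F_1(x, Q^2) + (1 - y) F_2(x, Q^2) \right]$$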
where we have defined the dimensionless structure functions F1 = W1 and F2 = νW2 .
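The kinematic variables above are fixed by the electron measurement alone, which the following sketch illustrates in the hadron rest frame. It assumes a massless electron, a proton target with M ≈ 0.938 GeV, and sample beam values that are purely illustrative. Note that ν = P · q here follows the convention of these notes, so it has dimensions of mass squared.

```python
import math


def dis_kinematics(e_in, e_out, theta, mass=0.938):
    """Return (Q2, nu, x, y) from the electron energies (GeV) and scattering angle."""
    q2 = 2 * e_in * e_out * (1 - math.cos(theta))  # Q^2 = 2 p.p' for a massless electron
    nu = mass * (e_in - e_out)                     # nu = P.q in the hadron rest frame
    x = q2 / (2 * nu)
    y = nu / (mass * e_in)
    return q2, nu, x, y


q2, nu, x, y = dis_kinematics(20.0, 10.0, 0.2)
print(q2, nu, x, y)
```

In this frame y is just the fraction of the electron's energy that is transferred, and x = 1 corresponds to elastic scattering off the whole hadron.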
$$V^\pm = V^0 \pm V^3, \quad V_\perp = (V^1, V^2)$$
• Again working in the rest frame of the hadron, we choose the photon momentum to be along
ê3, so P⊥ = q⊥ = 0. Then
$$Q^2 = -q^+ q^-, \quad \nu = \frac{1}{2} (q^+ P^- + q^- P^+).$$
In the deep inelastic limit, we take q⁻ → ∞ with q⁺ ∼ P⁺, giving
$$x \sim -\frac{q^+}{P^+}, \quad \nu \sim \frac{q^- P^+}{2}.$$
$$W_H^{+-}(q, P) = -W_1 + \left( P - \frac{P \cdot q}{q^2} q \right)^2 W_2 = -W_1 + \left( M^2 + \frac{\nu^2}{Q^2} \right) W_2 \equiv F_L(x, Q^2).$$
We can further simplify by applying the parton model, writing the structure functions in terms of
parton distribution functions, but we won’t go into the details here.
We suppress color indices for the quarks throughout, contracting them implicitly.
$$G = U(1)_L \times U(1)_R \times SU(N)_L \times SU(N)_R$$
from unitary rotations of the left-chiral and right-chiral quarks. This group may be generated
by a combination of vector symmetries, which rotate them the same way, and axial symmetries,
which rotate them oppositely.
$$q \to \exp(-i\theta) q, \quad q \to \exp(-i\theta\gamma_5) q$$
$$V_\mu = \bar{q} \gamma_\mu q, \quad A_\mu = \bar{q} \gamma_\mu \gamma_5 q.$$
The U (1)V symmetry corresponds to baryon number, while the U (1)A symmetry is anomalous.
Since U (1) is abelian, these symmetries do not yield any degeneracies.
The total axial charge is parity-odd, since γ0 and γ5 anticommute, and thus it should yield
degeneracies between hadrons with opposite parities. But no such degeneracies are observed.
• Also note that a set of N uncharged quarks has a much larger symmetry group, since we can
separate the two chiral components of each Dirac field,
$$\mathcal{L} = i \bar{q}_L \slashed{\partial} q_L + i \bar{q}_R \slashed{\partial} q_R$$
which has an SU(2N) × U(1) symmetry. This is a perfectly legitimate symmetry, but it is
hidden when working with Dirac fields because in that case it would be partly antilinear. It is
not useful in QCD, because the extra symmetries are broken by the coupling to the gauge field.
• Now we consider quark masses, which break further symmetries. Mass terms have the form
$$\mathcal{L} \supset \bar{q}_L M q_R + \text{h.c.}$$
so that under a chiral transformation,
$$M \to L^\dagger M R.$$
Hence a chiral field redefinition can be used to make M real and diagonal.
• Mass terms always break axial symmetry. If all the masses are different, the symmetry is
broken down to U (1)N , the individual quark numbers. If the masses are the same, the full
vector symmetry is preserved. (The different quark electric charges also break symmetries, but
we ignore these because electromagnetism is weak.)
• In QCD, only the up, down, and strange quarks are reasonably light. If we consider only the
up and down, the small mass difference means SU (2)V is a very good symmetry; it is isospin.
Since their absolute masses are small compared to ΛQCD , SU (2)A should be almost as good,
but this is not observed.
To fix this problem, we introduce the quark condensate, starting with two quark flavors.
• We postulate the QCD vacuum has a “quark condensate” analogous with the condensate of
Cooper pairs in a superconductor,
$$\langle \Omega | \bar{q}_{Ri} q_{Lj} | \Omega \rangle = -v^3 \delta_{ij}, \quad v \approx 250 \text{ MeV}$$
where the minus sign ensures the vacuum energy is lowered. It isn’t known exactly how to show
the right-hand side is proportional to δij , but we know this must be the case, or else SU (2)V
would be badly broken.
• The vacuum is only invariant under U (1)V and SU (2)V , so we expect pseudo-Nambu–Goldstone
bosons (PNGBs) from the broken symmetries, U (1)A and “SU (2)A ”. This hypothesis was once
called “partially conserved axial current” (PCAC).
• The formation of the quark condensate occurs in a phase transition, which occurs approximately
concurrently with the confinement transition for QCD, since ΛQCD is the only relevant mass
scale in the theory. However, in supersymmetric gauge theories only confinement occurs.
• The PNGBs corresponding to SU (2)A are the pions, which are indeed far lighter than any other
hadron. They are generated by the axial current,
⟨Ω|Aaµ (x)|π b (q)⟩ = ifπ qµ δ ab e−iqx , fπ ≈ 92 MeV.
However, there is no PNGB observed for U (1)A , and this was called the U (1)A problem.
• Because of the axial anomaly, the U (1)A current obeys
$$\partial_\mu J_5^\mu = \frac{\alpha_s}{4\pi} G_a^{\mu\nu} \tilde{G}^a_{\mu\nu}, \qquad J_5^\mu = \frac{1}{2} \bar{Q} \gamma^\mu \gamma_5 Q.$$
However, this doesn’t solve the U (1)A problem alone because GG̃ is a total derivative, so one
can define a modified (approximately) conserved axial current, which again requires a PNGB.
The real resolution requires accounting for QCD instantons, as covered in the notes on Quantum
Field Theory.
• More generally, we may extend to three flavors, where the PNGBs corresponding to SU (3)A
are the three pions, the four kaons, and the η. The particle that should be the PNGB for U (1)A
is the η ′ , which is much heavier; this is the U (1)A problem again. Explicitly, the masses are
$$\pi^\pm, \pi^0 : 140\ \mathrm{MeV}, \qquad K^\pm, K^0, \bar{K}^0 : 500\ \mathrm{MeV}, \qquad \eta : 550\ \mathrm{MeV}, \qquad \eta' : 950\ \mathrm{MeV}.$$
These mesons are pseudoscalars because they are the Goldstone bosons related to the breaking
of axial symmetry, and axial symmetries pick up an extra sign under parity.
• Note that for two quark flavors, the SU (2)L symmetry is simply weak isospin, which uncoinci-
dentally is also called SU (2)L . Hence the quark condensate would break electroweak symmetry
if the electroweak phase transition hadn’t done it already.
Example. Alternative scenarios of symmetry breaking. First suppose the color group was SO(3)
and the quarks transformed in the 3. Since this representation is real, we may write the theory
in terms of 2N Weyl spinors χαi transforming in the 3. The symmetry group is SU (2N ), and the
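U (1) factor is anomalous as before. The condensate that preserves the most symmetry is
$$\langle \chi^\alpha_i \chi^\alpha_j \rangle \propto \delta_{ij}$$
(spinor indices suppressed), which preserves SO(2N ).
Next, suppose that the color group was SU (2) and the quarks transformed in the 2. By the same
reasoning, the symmetry group is SU (2N ). The general form of the condensate is
$$\langle \epsilon_{\alpha\beta} \, \chi^\alpha_i \chi^\beta_j \rangle \propto \eta_{ij}$$
where the ϵαβ factor is necessary to get a color singlet, which forces ηij to be antisymmetric. Then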
the most symmetric condensate is one where η 2 = −I, preserving the symmetry Sp(2N ).
Example. Consider N real scalar fields that transform in a representation R. If R is real, the flavor
symmetry is clearly SO(N ). If R were complex, there would have to be another N real scalar fields
transforming in R, because the overall representation must be real; they are collectively equivalent
to N complex scalar fields transforming in R with flavor symmetry SU (N ). If R is pseudoreal, it
turns out the flavor symmetry is not SO(N ), but Sp(2N ).
U (x) → LU (x)R† .
The pion fields π a (x) parametrize the vacuum manifold. For a vector transformation we have
• We see the unbroken symmetries act on the π a (x) linearly, while the broken symmetries act
infinitesimally by shifts, ensuring the Goldstone bosons are massless. The general procedure for
writing broken and unbroken symmetries in this form is called the CCWZ construction. At the
quantum level, we build our Hilbert space on the physical vacuum. The unbroken symmetries
act on this Hilbert space, while the broken symmetries relate these states to excitations about
other vacuums, and hence do not yield degeneracies.
• Next, we write down every term consistent with the symmetries. The Lagrangian must be
invariant under SU (3)L × SU (3)R , which means that a U must always be next to a U † . But
we always need derivatives, because U † U = 1. Hence we have
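$$\mathcal{L} = \frac{f_\pi^2}{4} \, \mathrm{tr}(\partial_\mu U^\dagger \partial^\mu U) + \ldots.$$
Since U is a nonlinear function of the pion fields, the quadratic term alone is enough to do
nontrivial computations. Expanding and neglecting O(1) factors and SU (3) indices,
$$\mathcal{L} = \frac{1}{2} \partial_\mu \pi \, \partial^\mu \pi + \frac{1}{f_\pi^2} \pi^2 \, \partial_\mu \pi \, \partial^\mu \pi + \ldots.$$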
In particular, the symmetry ensures that there are no relevant interaction terms at all!
• There are relations between the infinitely many coefficients in the Lagrangian in terms of π.
These relations were originally understood by current algebra, but in chiral perturbation theory,
we get them easily because we know that U transforms linearly.
• The second term above is the leading contribution to ππ → ππ scattering. The cross section
can be estimated as follows.
– If the pions are relativistic, we may ignore their masses, so the only mass scales are fπ and
the Mandelstam variables.
– The fπ can only enter through the matrix element, M ∝ 1/fπ2 , so σ ∝ 1/fπ4 .
– By dimensional analysis, σ has degree 1 in the Mandelstam variables. There is no way to
get a Mandelstam variable in the denominator, because we aren’t doing any kind of particle
exchange (compare the 1/t in e+ e− → e+ e− scattering).
– In general, for two particles in the final state, the phase space integrals will contribute a
factor like 1/8π, while for three particles we get 1/64π 3 .
The matrix element here is smaller by a factor of p2 /fπ2 . Hence the cross section, which is the
square of the sum of the matrix elements, is modified by O(p2 /fπ2 ). Hence chiral perturbation
theory is a perturbation series in p2 /fπ2 .
• For pion-pion scattering at tree level, only the terms quadratic and quartic in U can contribute,
since all others have too many powers of π. Hence tree level results are fully parametrized by
only a few parameters, and the real test is controlling the loop corrections. The logarithms in
cross sections due to loops are called chiral logs and are a signature of quantum effects.
• Loop corrections are suppressed by a factor of 1/(4π)^2, so the real expansion parameter is
p/4πfπ . Hence we were justified in treating the pions relativistically, because
4πfπ ≈ 1 GeV ≫ mπ .
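As a rough numerical illustration of the estimates above, the sketch below evaluates σ ∼ s/(8π fπ⁴) and the breakdown scale 4πfπ. The prefactor 1/8π is only the generic two-body phase space factor quoted above, so the result is an order-of-magnitude figure at best. The sample energy √s = 300 MeV and the function names are assumptions made for illustration.

```python
import math

F_PI = 0.092          # pion decay constant in GeV (value quoted above)
GEV2_TO_MB = 0.3894   # (hbar c)^2 in mb GeV^2, converts GeV^-2 to millibarn


def chiral_scale(f_pi=F_PI):
    """Scale 4 pi f_pi at which the chiral expansion breaks down, in GeV."""
    return 4 * math.pi * f_pi


def sigma_pipi_estimate(sqrt_s, f_pi=F_PI):
    """Order-of-magnitude estimate sigma ~ s / (8 pi f_pi^4), in GeV^-2."""
    s = sqrt_s ** 2
    return s / (8 * math.pi * f_pi ** 4)


sqrt_s = 0.3  # GeV, hypothetical sample energy
sigma = sigma_pipi_estimate(sqrt_s)
print(f"4 pi f_pi = {chiral_scale():.2f} GeV")
print(f"sigma ~ {sigma * GEV2_TO_MB:.0f} mb at sqrt(s) = {sqrt_s} GeV")
print(f"p / (4 pi f_pi) ~ {sqrt_s / chiral_scale():.2f}")
```

The linear growth of σ with s is the content of the dimensional analysis argument: doubling √s quadruples the estimate, until p approaches 4πfπ and the expansion stops converging.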
Note. Chiral perturbation theory is closely related to sigma models. In the original linear sigma
model, one takes a set of two complex scalar fields and performs spontaneous symmetry breaking,
yielding a massive field σ and massless Goldstone bosons associated with the pions. At low energies,
the σ decouples, yielding a “nonlinear sigma model”, which now is a generic term for any field
theory whose fields take values on a manifold. Chiral perturbation theory is simply a nonlinear
sigma model where the σ would have corresponded to a variation of v.
• We are justified in treating the strange quark mass perturbatively because ms ≪ 4πfπ . In
general, a mass will contribute to the Lagrangian as
L ⊃ −q L M qR + h.c.
where M is a complex matrix. If we simply replace qL qR with its vev, we get the leading term,
L = v 3 tr(M U + M † U † ).
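Expanding U to second order in the pion fields gives the mass terms
$$\mathcal{L} \supset -\frac{4 v^3}{f_\pi^2} \, \mathrm{tr}(M T^a T^b) \, \pi^a \pi^b = -\frac{2 v^3}{f_\pi^2} \, \mathrm{tr}(M \{T^a, T^b\}) \, \pi^a \pi^b.$$
In the case of two quark flavors, this may be simplified using {T^a, T^b} = δ^{ab}/2 for SU (2)
generators. We find that all three pions satisfy the Gell-Mann–Oakes–Renner equation
$$m_\pi^2 f_\pi^2 = 2 v^3 (m_u + m_d).$$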
• For three quark flavors, a longer version of the same computation relates the pion, kaon, and η
masses, giving the Gell-Mann–Okubo formula.
• To establish the parameter fπ is really the pion decay constant, we can couple the pion fields
to leptons using the four-Fermi interaction and calculate the decay rate.
• To compute nucleon-pion scattering, for two quark flavors, we have a Dirac nucleon field
$$N = \begin{pmatrix} p \\ n \end{pmatrix}$$
that transforms as
PL N → LPL N, PR N → RPR N.
The only possible mass term for the nucleons is
L ⊃ −mN N (U † PL + U PR )N
• Alternatively, we may couple the pion fields to quark fields, which transform similarly. This is
valid for energies above ΛQCD and below 4πfπ . In both cases we write down all terms consistent
with chiral symmetry.
• To account for the different electric charges of the quarks, pions, or nucleons, we simply promote
derivatives in the chiral Lagrangian to covariant derivatives.
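The meson mass relations above can be checked with rough numbers. The sketch below assumes the normalization m_π² f_π² = 2 v³ (m_u + m_d), which is what the leading mass term v³ tr(M U + M† U†) gives for a canonically normalized pion field, and inverts it for v. The input m_u + m_d ≈ 7 MeV is an assumed illustrative value, not taken from these notes. It also tests the Gell-Mann–Okubo relation in its standard form 4 m_K² = 3 m_η² + m_π², using the rounded meson masses quoted earlier.

```python
# All masses in MeV.
M_PI, M_K, M_ETA = 140.0, 500.0, 550.0   # rounded meson masses
F_PI = 92.0                              # pion decay constant
MU_PLUS_MD = 7.0                         # assumed m_u + m_d (illustrative)


def condensate_scale(m_pi=M_PI, f_pi=F_PI, mq_sum=MU_PLUS_MD):
    """Solve m_pi^2 f_pi^2 = 2 v^3 (m_u + m_d) for v."""
    return (m_pi ** 2 * f_pi ** 2 / (2.0 * mq_sum)) ** (1.0 / 3.0)


def gmo_mismatch(m_pi=M_PI, m_k=M_K, m_eta=M_ETA):
    """Fractional violation of 4 m_K^2 = 3 m_eta^2 + m_pi^2."""
    lhs = 4.0 * m_k ** 2
    rhs = 3.0 * m_eta ** 2 + m_pi ** 2
    return (lhs - rhs) / lhs


print(f"v ~ {condensate_scale():.0f} MeV")
print(f"Gell-Mann-Okubo mismatch ~ {100 * gmo_mismatch():.1f}%")
```

With these inputs v comes out at a couple of hundred MeV, i.e. of order ΛQCD, and the Gell-Mann–Okubo relation holds at the level of several percent, which is about what one expects from a leading-order result in the quark masses.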
Further refinements, including a solution to the U (1)A problem, require an understanding of anoma-
lies, which are covered in the notes on Quantum Field Theory.
• The θ parameter is modified by chiral rotations of the quark fields by the chiral anomaly.
Specifically, after electroweak symmetry breaking the quark mass matrices are neither Hermitian
nor diagonal; integrating out the Higgs we have
and the matrix may be diagonalized by SU (Nf )A × SU (Nf )V transformations on the quark
fields. To remove the overall phase, we require U (1)A transformations,
qR → eiα/2 qR , qL → e−iα/2 qL
$$\bar\theta = \theta - \arg \det M$$
• The value of θ may be measured from the neutron electric dipole moment, yielding
θ ≲ 10^{-10}.
The strong CP problem asks why this holds. It is especially puzzling because the two contribu-
tions to θ appear to be completely unrelated, and separately quite large.
• The chiral anomaly explains why all the θ-vacua are equivalent if any quark, such as the up
quark, is exactly massless. In that case we can perform arbitrary chiral rotations on that quark
field without physical effect, rotating θ to zero. This was once proposed as a solution to the
strong CP problem, though it is now disfavored by data and lattice computations.
• It’s also possible to explain why θ is small with clever model building, e.g. in Nelson–Barr
models. However, these models require a fair bit of extra structure. Below, we’ll focus on the
axion solution to the strong CP problem, which is very clean and simple.
• For the purposes of calculation, it’s most useful to rotate so that θ = 0, and work with the
complex masses in chiral perturbation theory. The presence of the masses yields a θ-dependent
QCD vacuum energy, which will provide the axion potential. This setup is also used to compute
the neutron electric dipole moment.
• By rotating θ into the quark mass matrix, we can take without loss of generality,
$$\theta = 0, \qquad M = \begin{pmatrix} m_u & & \\ & m_d & \\ & & m_s e^{-i\theta} \end{pmatrix}$$
where the mi are all real. As a result, to minimize the vacuum energy, the pion fields pick up
vevs. We can neglect all non-diagonal fields, because if they picked up a vev, they would break
U (1)A symmetry. The general intuition here is that a minimum is often a point of enhanced
symmetry (the exception being spontaneous symmetry breaking), which is essentially the reason
that the vacuum energy is minimized at the CP preserving point θ = 0.
• Since ms ≫ mu , md , we will have φu + φd ≈ θ. Deviations from this equality can lower the
potential energy by terms higher order in mu /ms , which we neglect. Differentiating, we have
mu sin φu = md sin φd .
By applying the law of sines and law of cosines to an appropriately chosen triangle,
$$\frac{\sin \varphi_u}{m_d} = \frac{\sin \varphi_d}{m_u} = \frac{\sin \theta}{\sqrt{m_u^2 + m_d^2 + 2 m_u m_d \cos \theta}}.$$
Plugging this in and using the Gell-Mann–Oakes–Renner equation, we find the vacuum energy
$$V(\theta) = -m_\pi^2 f_\pi^2 \sqrt{1 - \frac{4 m_u m_d}{(m_u + m_d)^2} \sin^2 \frac{\theta}{2}} \approx \mathrm{const} + \frac{1}{2} m_\pi^2 f_\pi^2 \frac{m_u m_d}{(m_u + m_d)^2} \theta^2.$$
• Note that a semiclassical one-instanton approximation would give V (θ) ∼ − cos θ, which is quite
different. In general little is rigorously known about the function V (θ), except that it is periodic
with a minimum at θ = 0. (This is established rigorously by the Vafa–Witten theorem.) We
may also compute the curvature at the minimum. There could conceivably be other local
minima, or even discontinuities.
• It is difficult to compute V (θ) in lattice QCD, because lattice QCD must be done in Euclidean
signature to make the path integral converge numerically, but the theta term has a single time
derivative and hence appears as an imaginary part in the Euclidean action.
• The neutron EDM can also be computed in chiral perturbation theory. In this case the neutron
EDM comes from pion loops.
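The closed form for V (θ) quoted above can be cross-checked numerically. The sketch below works in units where m_π² f_π² = 1, so that the two-flavor energy is −(m_u cos φu + m_d cos φd)/(m_u + m_d), imposes φu + φd = θ, and minimizes over φu by a brute-force scan. The mass ratio m_u/m_d ≈ 0.47 and all function names are assumptions made for illustration.

```python
import math


def v_closed_form(theta, mu, md):
    """V(theta) in units of m_pi^2 f_pi^2, from the closed form."""
    x = 4 * mu * md / (mu + md) ** 2
    return -math.sqrt(1 - x * math.sin(theta / 2) ** 2)


def v_numeric(theta, mu, md, steps=20001):
    """Scan phi_u with phi_d = theta - phi_u and return the minimum energy."""
    best = float("inf")
    for i in range(steps):
        phi_u = -math.pi + 2 * math.pi * i / (steps - 1)
        phi_d = theta - phi_u
        energy = -(mu * math.cos(phi_u) + md * math.cos(phi_d)) / (mu + md)
        best = min(best, energy)
    return best


def curvature_coefficient(mu, md):
    """Coefficient of theta^2 / 2 near the minimum, m_u m_d / (m_u + m_d)^2."""
    return mu * md / (mu + md) ** 2


mu, md = 0.47, 1.0  # only the ratio matters; assumed illustrative values
for theta in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(theta, v_closed_form(theta, mu, md), v_numeric(theta, mu, md))
```

The scan and the closed form agree to the resolution of the grid, and the small-θ curvature reproduces the coefficient m_u m_d/(m_u + m_d)² that sets the axion mass.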
Next, we describe how the axion solves the strong CP problem. There exists an enormous literature
on axions; here we are just giving the very basics. Further discussion is given in the dissertation on
Cosmological Relaxation.
• For motivation, we could try to solve the strong CP problem in the same way the “electroweak”
CP problem is solved. That is, suppose there were a new chiral global symmetry U (1)PQ , called
Peccei–Quinn symmetry, where the quarks transform as
where ei is the U (1)PQ charge of the ith quark flavor. If ∑_i ei is nonzero, there would be a
• However, this global symmetry cannot exist in the SM. Conventionally normalizing ∑_i ei = 1,
the quark Yukawa coupling Q̄L H DR cannot be invariant unless all quarks have ei = 1/6, and
the Higgs transforms as H → e^{iα/6} H. However, the other quark Yukawa coupling Q̄L H^c UR
requires the Higgs transforms as H → e^{−iα/6} H. (Note that in multi-axion theories one typically
does not normalize ∑_i ei = 1, which leads to extra coefficients in the results below.)
• This objection does not hold if U (1)PQ is spontaneously broken at a high scale fa . Surprisingly,
the strong CP problem is still solved if this happens, because the spontaneous symmetry
breaking yields a Goldstone field, the axion a(x), which transforms by a shift,
The U (1)PQ SU (3)C^2 anomaly manifests by breaking the shift symmetry of the axion,
$$\mathcal{L} \supset \frac{g^2}{32 \pi^2} \frac{a}{f_a} \, G^{\mu\nu a} \tilde{G}^a_{\mu\nu}$$
• By moving θobs into complex quark masses, we find a vacuum energy that depends on a(x),
and hence an axion mass term,
$$m_a \approx 0.5 \, \frac{m_\pi f_\pi}{f_a} \sim 10^{-3}\ \mathrm{eV} \, \frac{10^{10}\ \mathrm{GeV}}{f_a}$$
by our work above. More roughly one could find this by dimensional analysis, ma ∼ Λ2QCD /fa .
• Note that since the U (1)PQ symmetry is axial, the axion must be a pseudoscalar. Since GG̃ is
P odd and C even, aGG̃ is both P and C even and hence obeys CP.
• The strong CP problem differs from other “naturalness” problems, as there seems to be no
anthropic explanation. Also, our discussion above introduces a new high scale fa , which
potentially makes the hierarchy problem worse.
• The simplest way to implement PQ symmetry is to add a second Higgs doublet and let the
Yukawa interactions be
L ⊃ QL H1 DR + QL H2c UR
with U (1)PQ charges ei = 1/6 for each quark field, and the transformations
H1 → eiα/6 H1 , H2 → e−iα/6 H2 .
• Spontaneous breaking of U (1)PQ occurs along with electroweak symmetry breaking, where
$$H_1 = e^{i\beta(x)/6} \begin{pmatrix} 0 \\ v_1/\sqrt{2} \end{pmatrix}, \qquad H_2 = e^{-i\beta(x)/6} \begin{pmatrix} 0 \\ v_2/\sqrt{2} \end{pmatrix}.$$
• The Goldstone boson is the relative phase of the two fields, and
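$$\mathcal{L} \supset |\partial_\mu H_1|^2 + |\partial_\mu H_2|^2 \supset \frac{f_a^2}{2} \, \partial_\mu \beta \, \partial^\mu \beta, \qquad f_a = \frac{v}{6}.$$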
Hence the axion field in this model is a(x) = fa β(x), and our formula above gives ma ∼ 150 keV.
Since fa is low in this model, the axion interacts strongly with other particles, and hence this
scenario is experimentally ruled out.
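The axion mass formula above is easy to evaluate numerically. The sketch below takes the prefactor from the curvature of V (θ), namely m_a² f_a² = m_π² f_π² m_u m_d/(m_u + m_d)², which gives the factor of about 0.5. The quark mass ratio m_u/m_d ≈ 0.47 is an assumed illustrative input, and the function name is hypothetical.

```python
import math

M_PI = 0.135        # neutral pion mass in GeV
F_PI = 0.092        # pion decay constant in GeV
MU_OVER_MD = 0.47   # assumed light quark mass ratio (illustrative)


def axion_mass_eV(f_a_GeV, z=MU_OVER_MD):
    """m_a = sqrt(z) / (1 + z) * m_pi f_pi / f_a, returned in eV."""
    prefactor = math.sqrt(z) / (1 + z)   # about 0.5
    return prefactor * M_PI * F_PI / f_a_GeV * 1e9


# Invisible axion benchmark, and the two-Higgs-doublet model with f_a = v / 6.
print(f"f_a = 1e10 GeV : m_a ~ {axion_mass_eV(1e10):.1e} eV")
print(f"f_a = v/6      : m_a ~ {axion_mass_eV(246.0 / 6) / 1e3:.0f} keV")
```

For f_a = v/6 with v = 246 GeV this lands in the neighborhood of the 150 keV quoted above, while f_a = 10^10 GeV gives a mass somewhat below 10^-3 eV.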
In “invisible axion” models, the scale fa is much higher than the electroweak scale. The most
well-known models are the DFSZ and KSVZ axions, though many models are possible.
• In the DFSZ model, one takes the previous model and adds a complex scalar Φ which is a
singlet under the SM gauge group, with Φ → eiα/6 Φ, and adds the terms Φ† Φ and H1† H2 Φ2 .
The axion field is now a linear combination of the phases of the fields H1 , H2 , and Φ, but now
$$f_a = \frac{\sqrt{v_1^2 + v_2^2 + v_\Phi^2}}{6}.$$
• In the KSVZ model, the ordinary quarks have zero U (1)PQ charge, but we add an additional
heavy quark Dirac field Ψ which is a triplet under SU (3)C and an electroweak singlet, with unit
U (1)PQ charge. We also introduce a singlet field Φ as above with Φ → eiβ Φ. Note that adding
only Dirac fields avoids more chiral fermions, so there are no issues with gauge anomalies.
• Giving Ψ a U (1)PQ charge of 1/2, we now have a U (1)PQ invariant Yukawa interaction
L ⊃ λΦΨL ΨR + h.c.
and U (1)PQ is spontaneously broken by the vev |⟨Φ⟩| = vΦ/√2. We then have
$$\Phi = (f_a + \sigma(x)) \, e^{i a(x) / (\sqrt{2} f_a)}$$
and the vev provides the quarks with a mass on the order of λfa . The singlet σ, called the
“saxion”, also has a mass on the order of fa .
• Note that one may either normalize the charges so that a ∈ [0, 2πfa ], or normalize them so that
θobs ⊃ a/fa . We have chosen the latter option, though both are common.
• There are many variants. For example, in the original KSVZ model, Ψ has a hypercharge −1/3,
motivated by embedding the theory in a GUT.
since the axion is a Nambu–Goldstone boson, where the fi stand for all fermions. The important
point is that the interaction is suppressed by fa and depends only on ∂µ a.
• These interactions are modified by loop effects; equivalently one must RG flow down to a low
scale µ ≪ 1 GeV. In particular, the axion mixes with the π 0 , η, and η ′ , which are also SM
singlet neutral pseudoscalar mesons, generating a substantial axion-photon coupling, even if
none exists in the UV. This mixing can be computed explicitly in chiral perturbation theory.
where gaN N is an O(1) parameter. In the case of the KSVZ model, this interaction is induced
by gluon loops, which don’t suppress the coupling since QCD is strongly coupled in a nucleon.
The KSVZ axion is called a hadronic axion because interactions with leptons only occur through
photon loops, and are hence suppressed by a factor of αe2 /4π.
Note. Putting aside these models, how would we define an axion from the bottom up? First off,
axions are pseudoscalar Nambu–Goldstone bosons, which explains their light mass. But that isn’t
specific enough, because it also applies to many other things, like pions. Axions additionally have
an exact abelian discrete shift symmetry a → a + 2πfa , which forbids many potential couplings,
but allows terms like aF µν F̃µν and aGµν G̃µν with discrete coefficients. These are allowed because
the spacetime integrals of F µν F̃µν and Gµν G̃µν are quantized for topological reasons, as explained
in the notes on Geometry, so that the discrete shift won’t change eiS . Some people will say that
an axion must have a aGµν G̃µν coupling, while others would say this defines a QCD axion, while
axions without it are called “axion-like particles” (ALPs). This change of perspective is important
because many modern axion models don’t actually use PQ symmetry; for instance, axions may also
arise from gauge fields in extra dimensions.
• Following the WIMP paradigm, one might think about a thermal population of axions. It turns
out this would require heavy axions with ma ∼ 100 eV. This dark matter would be hot, and
it is also ruled out by the constraints below. Instead, axions are produced by the nonthermal
“misalignment mechanism”.
• The axion only exists when Peccei–Quinn (PQ) symmetry breaks, so there are two scenarios.
If PQ symmetry breaks for the last time before inflation, then we expect the axion field to
have a uniform value in our Hubble patch. If it does so afterward, because either HI ≳ fa or
Treheat ≳ fa , then the axion field will have different values on each Hubble patch.
• In either case, right after PQ symmetry breaking the axion field will have some random mis-
alignment angle θ0 because its potential is flat. As the temperature lowers, the axion potential
turns on, giving the required DM energy density.
• The situation is a bit subtle, because the potential gradually turns on, and also isn’t perfectly
harmonic. A numeric calculation gives
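$$\Omega_a h^2 \approx 0.35 \left( \frac{\theta_0}{0.001} \right)^2 \left( \frac{f_a}{3 \times 10^{17}\ \mathrm{GeV}} \right)^n, \qquad n = \begin{cases} 1.17 & f_a \lesssim 3 \times 10^{17}\ \mathrm{GeV} \\ 1.54 & f_a \gtrsim 3 \times 10^{17}\ \mathrm{GeV} \end{cases}$$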
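The two cases depend on whether the field starts oscillating, at H ∼ ma, before or after the QCD
phase transition, i.e. while the axion mass is still turning on or once it has reached its final
value, and the exact solution involves Bessel functions. Note that a more naive estimate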
would simply be V ∼ Λ4QCD θ02 , with no dependence on fa . The effects accounted for above mean
that the axion density actually decreases with the axion mass, which is generic for coherent
production mechanisms (i.e. ones where we can treat the DM as a field rather than particles).
• If PQ symmetry breaks after inflation, we must average over Hubble patches, giving
$$\langle \theta_0^2 \rangle = \frac{\pi^2}{3}$$
in the case where the potential is sinusoidal; in general there are anharmonic correction factors.
To achieve closure density, Ωa h^2 = 0.12, we have ma ∼ 10^{-5} eV. This is the classical axion
window, which has been investigated by ADMX and other experiments. It also allows high fa ,
all the way up to the Planck scale, which is favored by some string theory models.
• On the other hand, if PQ symmetry breaks before inflation, then θ0 can be arbitrary, so we
can have a smaller axion mass if θ0 is small. This is the anthropic axion window. DM radio,
CASPEr, and ABRACADABRA are more sensitive to the lower axion masses in this window.
• Note that there will be some energy in non-zero momentum modes of the axion field, but
they dilute away rapidly since they behave like radiation. The energy density from the zero
momentum mode, on the other hand, dilutes like matter. Hence the axion field can behave as
cold dark matter, despite being extremely light.
• Also note that in both cases, the axion potential is a little more complicated because it isn’t
simple harmonic. In general, it’s easy to get reasonable estimates, but almost all precision
statements about axions must be found numerically.
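To see how the numbers in the misalignment estimate fit together, the sketch below codes the fitted abundance formula quoted above and inverts it for the f_a that gives Ωa h² = 0.12 when θ0² is replaced by its average π²/3. The conversion to a mass uses the rough relation ma ≈ 0.5 mπ fπ/fa. Function names are hypothetical, and anharmonic corrections are ignored, so this is only a consistency check on the quoted orders of magnitude.

```python
import math

F_BREAK = 3e17   # GeV, where the exponent n changes
TARGET = 0.12    # closure density Omega_a h^2


def omega_a_h2(theta0, f_a):
    """Fitted misalignment abundance for initial angle theta0 and f_a in GeV."""
    n = 1.17 if f_a <= F_BREAK else 1.54
    return 0.35 * (theta0 / 1e-3) ** 2 * (f_a / F_BREAK) ** n


def f_a_for_target(theta0, target=TARGET):
    """Invert the n = 1.17 branch; only valid if the answer is below F_BREAK."""
    ratio = target / (0.35 * (theta0 / 1e-3) ** 2)
    f_a = F_BREAK * ratio ** (1 / 1.17)
    if f_a > F_BREAK:
        raise ValueError("solution lies on the high-f_a branch")
    return f_a


def axion_mass_eV(f_a):
    """Rough relation m_a ~ 0.5 m_pi f_pi / f_a, with f_a in GeV."""
    return 0.5 * 0.135 * 0.092 / f_a * 1e9


theta_rms = math.pi / math.sqrt(3)   # sqrt of <theta0^2> = pi^2 / 3
f_a = f_a_for_target(theta_rms)
print(f"f_a ~ {f_a:.1e} GeV, m_a ~ {axion_mass_eV(f_a):.1e} eV")
```

The result is f_a of a few times 10^11 GeV and a mass of order 10^-5 eV. Lowering θ0 at fixed Ωa h² pushes f_a up and the mass down, since Ωa h² scales as θ0² f_a^n.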
• Cosmic strings generically appear when a U (1) symmetry is broken. In the anthropic window,
the cosmic strings are diluted away. In the classical window, we expect one per Hubble volume.
• The axionic strings decay to axions, producing another relic population of axions. This is a
complex process that is hard to compute precisely and whose details are still under dispute.
However, under one calculation, it can provide closure density if ma = 26.2 ± 3.4 µeV.
• Practical computations involving axionic string decay usually treat the string with an effective
description such as the Nambu–Goto action, in which case string decay and axion production
must be added in as additional effects.
• In models where instantons preserve a ZN symmetry, domain walls are produced. This is
unacceptable in the classical axion window, as there are strong constraints on domain walls.
• The initial Hubble-scale perturbations can form gravitationally bound clumps of axions on small
scales called “miniclusters”, with many astrophysical consequences. Generally, axion searches
don’t take into account this non-homogeneous phase space distribution, but some features can
make detection easier. For example, rather than a general spread ∆v ∼ 10^{-3}, there can be low
dispersion streams. And miniclusters can be quite light, passing through the Earth regularly.
• One big simplification is that topological defects such as strings and domain walls are diluted
away. Also, inflation makes the field extremely smooth, preventing the formation of structures
like miniclusters. However, the axion density depends on an arbitrary parameter, the misalign-
ment angle, so the scenario is less predictive. One can take the axion mass to be much lower if
θ0 is also assumed lower.
• The main issue is the production of isocurvature fluctuations, which place an upper bound on
HI . In particular, an observably large value of rT , as was claimed by BICEP2, would rule out
much of the parameter space, though this measurement did not pan out.
• The anthropic tuning required for very light axions is not as ill-defined as the usual anthropics,
because there is a clear measure to use, i.e. uniform over θ.
• Very low axion masses are disfavored by black hole superradiance. That is, they would cause
rotating BHs to “spin down”, so observing such BHs constrains the axions. The region ruled
out is 6 × 10^{-13} eV ≲ ma ≲ 10^{-11} eV.
String theory also motivates axions with fa ∼ mstr ∼ 10^18 GeV. The so-called classical axion
window has ma ∼ µeV, with an order of magnitude uncertainty in gaγγ due to model dependence.
(Smaller values would be possible, but would require tuned cancellations between the UV
coupling and the RG flow.)
• One detection method would be by polarization effects. Turning on a background static magnetic
field, photons with polarization along B will be preferentially converted to axions, and moreover
will experience a different index of refraction than photons with polarization perpendicular to
B. These effects, known as vacuum magnetic birefringence and dichroism, also exist in QED,
and were probed by PVLAS.
• Axions could be thermally produced in stars and quickly escape. In “helioscope” experiments
such as CAST, we use a background magnetic field to convert them back to photons, which
are expected to have a thermal spectrum with the core temperature
of the Sun. The IAXO experiment will set a stronger bound; it will not be sensitive at the level
of standard QCD axion DM, but could detect other axion-like particles.
• Another approach, developed at DESY, is “light shining through walls”. A laser is shined at an
opaque wall in a background magnetic field. Light will go through if it turns into an axion, goes
through the wall, and turns back into a photon. However, the bounds from these experiments
are not strong because they require two axion-photon conversion events in the lab.
• The resulting line width is just O(mv^2/mc^2) ∼ 10^{-6} using the DM virial velocity, so we can get
an O(10^6) boost in sensitivity using a high-Q resonant cavity of size ma^{-1}, which is a fraction
of a meter. Note that this is very different from helioscopes, where the axions are relativistic.
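$$P \sim (g a_0 B_0)^2 \, V \omega_a \min(Q, 10^6) \lesssim 10^{-22}\ \mathrm{W} \left( \frac{B_0}{1\ \mathrm{T}} \right)^2 \left( \frac{V}{1\ \mathrm{m}^3} \right) \left( \frac{10^{11}\ \mathrm{GeV}}{f_a} \right)$$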
for a typical QCD axion, which corresponds to O(10^3) photons per second. This is the approximate
sensitivity of ADMX.
• Why is it possible to detect axion dark matter at all, for such high fa ? Roughly, it’s because
$$a \sim \frac{\sqrt{\rho_{\mathrm{DM}}}}{m_a} \sim \frac{\sqrt{\rho_{\mathrm{DM}}} \, f_a}{\Lambda_{\mathrm{QCD}}^2}$$
which means that holding ρDM constant, the effect of the axion a/fa is independent of fa . That
is, all QCD axions are coupled the “same” amount; the challenge to probing other ma and fa
is really about devising a system with the appropriate resonant frequency.
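The scales that enter a cavity search can be made concrete with a few unit conversions. The sketch below converts an axion mass to its resonant frequency and to the length ma^{-1}, and estimates the oscillation amplitude a/fa from ρDM = ma² a0²/2. The local dark matter density of 0.4 GeV/cm³ is an assumed typical value, not taken from these notes, and the function names are hypothetical.

```python
import math

HBAR_C_EV_M = 1.973269804e-7   # hbar c in eV m
H_EV_S = 4.135667696e-15       # Planck constant in eV s
GEV_INV_PER_CM = 5.0677e13     # 1 cm expressed in GeV^-1


def compton_length_m(m_a_eV):
    """Length scale 1 / m_a in meters, which sets the cavity size."""
    return HBAR_C_EV_M / m_a_eV


def frequency_Hz(m_a_eV):
    """Oscillation frequency of the axion field, f = m_a c^2 / h."""
    return m_a_eV / H_EV_S


def oscillation_angle(rho_GeV_per_cm3=0.4, m_a_f_a_GeV2=0.5 * 0.135 * 0.092):
    """a0 / f_a from rho = m_a^2 a0^2 / 2, using m_a f_a ~ 0.5 m_pi f_pi."""
    rho_GeV4 = rho_GeV_per_cm3 / GEV_INV_PER_CM ** 3
    return math.sqrt(2 * rho_GeV4) / m_a_f_a_GeV2


m_a = 1e-6  # eV, a hypothetical benchmark in the classical window
print(f"1/m_a ~ {compton_length_m(m_a):.2f} m, f ~ {frequency_Hz(m_a) / 1e6:.0f} MHz")
print(f"a0 / f_a ~ {oscillation_angle():.1e}")
```

Because the product ma fa is fixed for a QCD axion, the angle a0/fa does not depend on fa, which is the statement made above. Only the frequency, and hence the required cavity size, changes with the mass.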
becomes the QCD axion. The rest are “axion-like particles”, which don’t couple to QCD directly,
but can couple to the photon or fermions, whose mass is not related to their fa .
• Axions can generically decay to two photons by the same coupling. Hence one basic constraint is
that they must be stable on cosmological timescales if they are to be the DM, giving ma ≲ 20 eV.
• A more stringent constraint comes from the cooling of SN 1987A, which gives ma ≲ 10^{-2} eV,
which is the current best astrophysical constraint. Accelerated cooling of the Sun, which would
shorten its lifetime, gives a weaker constraint ma ≲ 1 eV. (However, note that faster cooling
makes the core heat up, because of higher gravitational contraction.)
• Why should stellar bounds be competitive with precision laboratory experiments? The point
is that the pace of stellar evolution is determined by the long time required for photons to
diffuse from the core of a star to the surface. Thus, producing weakly coupled axions, which
quickly exit the star, would accelerate stellar cooling. (This also means that stellar bounds stop
working at high axion couplings, in which case the axions get trapped, though such couplings
are already ruled out anyway.)
• Let’s do a very rough estimate to justify this. Such cooling occurs through the axion-photon
coupling via the Primakoff process γ + e− → a + e− , which scales as
$$\Gamma \sim \frac{g_{a\gamma\gamma}^2}{16 \pi^2} \frac{T_{\mathrm{core}}^2}{f_a^2}.$$
where we multiplied by the photon density T 4 and the typical axion energy T .
$$\frac{dQ_a/dt}{dQ_{\mathrm{rad}}/dt} \sim \frac{R \, T_{\mathrm{core}}^7}{f_a^2 \, T_{\mathrm{surf}}^4}$$
and for this to be less than one, we require fa ≳ 10^6 GeV which corresponds to ma ≲ 10 eV.
112 7. Effective Field Theory
• The “zeroth step” of working with an EFT is to identify the relevant degrees of freedom. Next,
we write down a general action for those degrees of freedom, including all terms allowed by the
symmetries. The action will involve undetermined constants, which we call couplings, reflecting
our ignorance of where the theory comes from.
• If there is a separation of scales in the problem, we will use it to write the action as a series in
their small ratio (“power counting”). A typical example is the ratio of the energy scales of the
processes considered, to a high cutoff energy. For example, the Euler–Heisenberg Lagrangian
applies to processes below me , while Fermi theory applies to processes below mW . Or, in cases
where there are nontrivial backgrounds, like a macroscopically occupied bosonic field, one might
expand in a vev divided by a mass scale.
• The power counting expansion will allow us to compute physical quantities to any desired
precision, by working up to the required order. At a fixed order, there will be a finite number
of unknown couplings, and the theory starts to be predictive when we measure more physical
quantities than unknowns.
• If the power counting expansion parameter stops being small, the predictive power of the EFT
breaks down, because infinitely many unknown couplings contribute. In these situations, the
EFT could be completed into a “full theory” with fewer relevant parameters. (And every “full”
theory is itself an EFT approximation to a deeper full theory!)
• However, EFTs can be useful even when we already know the full theory, because the EFT
can be more physically transparent and amenable to calculation. For instance, in the Standard
Model, hadronic decays of B mesons involve nonperturbative matrix elements. But in heavy
quark effective theory (HQET), these matrix elements just fix coupling coefficients, and there is
a perturbative expansion in Λ2QCD /m2b . We have already seen a similar situation for chiral per-
turbation theory (χPT), which can compute light meson scattering perturbatively. Furthermore,
passing to an EFT can yield new symmetries at leading order, such as spontaneously broken
chiral symmetry in χPT and a “spin-flavor” symmetry in HQET, simplifying calculations.
• In these cases, the full theory yields information about the couplings in the EFT, which we can
incorporate by “matching”. That is, we compute some physical quantity in both the EFT and
the full theory (which only involves external degrees of freedom that are present in the EFT),
and demand the answers match.
• Of course, if this is intractable, or the full theory is unknown, we can also match to experimental
data. Alternatively, if we are particularly lucky, the EFT may exhibit “universality”, where
the symmetries are so restrictive that, at some order in the power counting expansion, the
interesting physical observables in the EFT are independent of the full theory.
• The same full theory can have multiple EFTs, depending on the process and regime of interest.
For example, QCD has χPT at low energy, HQET for heavy mesons, and SCET for jet formation.
In fact, we might use more than one theory within the same process, thanks to factorization.
In a hadronic process at a particle collider, we might compute the hard process using QCD and
parton distribution functions, then switch to SCET for the jets.
• Another great feature of EFTs is that they tell us when they break down, even when we don’t
know the full theory, because we can infer the size of the power counting expansion by measuring
individual couplings in experiments.
Example. The hydrogen atom. In introductory quantum mechanics, we consider the Hamiltonian
$$H = \frac{p^2}{2 m_e} - \frac{e^2}{r}$$
which can be regarded as the leading term in an EFT. Corrections to atomic energy levels due to a
finite nuclear mass enter at order me /mp , by changing me to the reduced mass, and fine structure
enters at higher order in α. Nonperturbative Standard Model effects also appear, as hyperfine
structure enters at order me /mp and requires nonperturbative QCD to find the proton magnetic
moment. Continuing to higher accuracy, we run into the Lamb shift and even electroweak effects.
As another example, we can focus on the multipole expansion of the proton’s electromagnetic
field. Even this by itself yields an infinite series of couplings, suppressed by powers of rp /r, where
rp is the proton radius, which includes the electric charge, electric and magnetic dipole moments,
quadrupole moments, anapole moments, and charge radii. All of these parameters are determined
by the full theory (i.e. QCD, or the entire Standard Model), and some are simply zero due to
symmetries. The matching could be done, for example, by computing the scattering amplitude for
the proton in the presence of a classical current. Conversely, measuring these moments gives us
information about the proton’s radius, even if we don’t know about QCD.
Example. For a theory with scalars and fermions, we can enumerate operators by dimension.
• At D = 0 we have 1, which represents a cosmological constant.
• At D = 1 we have ϕ, which can usually be removed by shifting ϕ.
• At D = 2 we have $\phi^2$, which is a mass term. (We neglect $\partial^\mu \phi$, which breaks Lorentz invariance.)
• At D = 3, we have the fermion mass term $\bar\psi \psi$ and the self-interaction $\phi^3$.
• At D = 4, we have the self-interaction $\phi^4$, the Yukawa interaction $\bar\psi \psi \phi$, the scalar kinetic term
$\partial_\mu \phi \, \partial^\mu \phi$, and the fermion kinetic term $\bar\psi \gamma^\mu \partial_\mu \psi$. Note that we are neglecting total derivatives, so
$\phi \, \partial^2 \phi$ is equivalent to $\partial_\mu \phi \, \partial^\mu \phi$.
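The bookkeeping in this example can be sketched in a few lines of Python. This is only a rough count by field content, not a classification of independent Lorentz structures. It assumes d = 4 mass dimensions ([ϕ] = 1, [ψ̄ψ] = 3, [∂] = 1), and the function names and the two filtering rules are hypothetical choices made for illustration.

```python
from itertools import product

# Mass dimensions in d = 4: [phi] = 1, [psi-bar psi] = 3, [derivative] = 1.
DIM_PHI, DIM_BILINEAR, DIM_DERIV = 1, 3, 1


def operator_classes(max_dim=4):
    """Return {D: [(n_phi, n_bilinear, n_deriv), ...]} for all D <= max_dim."""
    classes = {D: [] for D in range(max_dim + 1)}
    for n_phi, n_bil, n_der in product(range(max_dim + 1), repeat=3):
        D = n_phi * DIM_PHI + n_bil * DIM_BILINEAR + n_der * DIM_DERIV
        if D > max_dim:
            continue
        n_fields = n_phi + 2 * n_bil
        # Derivatives need at least two fields; otherwise the term is a
        # total derivative (or there is nothing to differentiate).
        if n_der > 0 and n_fields < 2:
            continue
        # Assumed Lorentz filter: without a fermion bilinear to supply a
        # gamma^mu, derivative indices must be contracted in pairs.
        if n_bil == 0 and n_der % 2 == 1:
            continue
        classes[D].append((n_phi, n_bil, n_der))
    return classes


def label(n_phi, n_bil, n_der):
    """Human-readable name for a field-content class."""
    parts = []
    if n_der:
        parts.append(f"d^{n_der}")
    if n_phi:
        parts.append(f"phi^{n_phi}")
    if n_bil:
        parts.append(f"(psibar psi)^{n_bil}")
    return " ".join(parts) or "1"


if __name__ == "__main__":
    for D, ops in operator_classes().items():
        print(D, [label(*op) for op in ops])
```

With these rules the classes found at each dimension coincide with the list above. Raising `max_dim` gives a quick, if crude, picture of how fast the number of candidate operators grows.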
Many of the above remarks are generalities, which hold just as well in classical physics. Now we add
a bit more detail, in the context of the quantum field theories considered in these notes, assuming
a cutoff scale M and physics at m ≪ M .
• Even when the full theory is weakly coupled, calculations can break down because of the
phenomenon of “large logarithms”. Generically, the polynomially divergent parts of loop
amplitudes will give contributions scaling as $(M/m)^n$, which are absorbed by counterterms.
What is left over is the part of the loop integral scaling as $\int d^dk/k^d \sim \int d(\log k)$, which receives
contributions at all scales, and gives a factor of log(M/m). More generally, we can get a
logarithm of the power counting parameter.
• These logarithmic contributions are “nonlocal in energy” and are much subtler to deal with. In
our previous examples of effective theories, we mostly avoided them, either by working at tree
level, or by working in 0 dimensions, but we focus on them here.
• The perturbation series in the full theory thus includes factors of log(M/m), and if M/m is
sufficiently large, this can significantly affect its accuracy. The solution, as noted in the notes on
Quantum Field Theory, is to use a “running coupling” g(µ) which depends on a renormalization
scale µ. When we bring µ down to µ ∼ m, we have “resummed the logarithms”. Hence the
same theory at µ ∼ m and µ ∼ M can be thought of as a very simple example of an EFT and
a full theory.
• One might thus ask, in the context of weakly-coupled full theories, why an EFT is necessary
at all if we can just use running couplings. In more subtle multi-scale processes, such as those
in SCET, the large logarithms cannot be removed by running couplings alone. (Also, more
conceptually, the physical content might be clearer in terms of the EFT degrees of freedom.)
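• As a simple example, consider a divergent loop integral with a hard UV cutoff ΛUV ≫ m,
\[ I = i \int \frac{d^4\ell}{(2\pi)^4} \, \frac{1}{\ell^2 - m^2} = \frac{1}{8\pi^2} \int_0^{\Lambda_{\rm UV}} d\ell \, \frac{\ell^3}{\ell^2 + m^2} = \frac{1}{16\pi^2} \left( \Lambda_{\rm UV}^2 - m^2 \log \frac{m^2 + \Lambda_{\rm UV}^2}{m^2} \right). \]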
• Now, a very naive approach to get a power counting expansion would be to “Taylor expand
before integrating”, which would give
\[ I = \frac{1}{8\pi^2} \int_0^{\Lambda_{\rm UV}} d\ell \, \ell^3 \left( \frac{1}{\ell^2} - \frac{m^2}{\ell^4} + \frac{m^4}{\ell^6} + \ldots \right). \]
However, this is too crude of an approximation. We no longer have the logarithmic dependence
on m at all, and meanwhile a spurious IR divergence has been introduced! Regulating it with
an IR cutoff ΛIR ≪ m, we get
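\[ I = \frac{1}{16\pi^2} \left( \Lambda_{\rm UV}^2 - m^2 \log \frac{\Lambda_{\rm UV}^2}{\Lambda_{\rm IR}^2} + m^4 \left( \frac{1}{\Lambda_{\rm IR}^2} - \frac{1}{\Lambda_{\rm UV}^2} \right) + \ldots \right). \]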
• The point of this example is that simply Taylor expanding everything in sight is not a valid
approach, though one can get this idea to work if one uses the “method of regions”.
• We will use DR rather than a hard cutoff, as the latter tends to become messy for nontrivial
calculations, and also breaks a vast array of symmetries. (Other advantages of mass-dependent
versus mass-independent regulators have been discussed in the notes on Quantum Field Theory.)
In particular, both the EFT and the full theory will depend on the renormalization scale µ. We
will constrain the EFT by matching physical quantities at scale µ ∼ M , then RG flow down to
µ ∼ m to do perturbation theory in the EFT without large logarithms.
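The hard-cutoff example in this list can be checked numerically. The following is a minimal sketch using only the standard library. The function names, the midpoint-rule integrator, and the sample values m = 1, ΛUV = 100 are assumptions made for illustration. The naive expansion is truncated at the first three Taylor terms and integrated from ΛIR to ΛUV.

```python
import math


def exact(m, uv):
    """Closed form (1/16 pi^2) [uv^2 - m^2 log((m^2 + uv^2) / m^2)]."""
    return (uv**2 - m**2 * math.log((m**2 + uv**2) / m**2)) / (16 * math.pi**2)


def numeric(m, uv, steps=100_000):
    """Midpoint-rule value of (1/8 pi^2) * int_0^uv dl l^3 / (l^2 + m^2)."""
    h = uv / steps
    total = 0.0
    for k in range(steps):
        l = (k + 0.5) * h
        total += l**3 / (l**2 + m**2)
    return total * h / (8 * math.pi**2)


def naive_expanded(m, uv, ir):
    """Termwise integral of l^3 (1/l^2 - m^2/l^4 + m^4/l^6) from ir to uv."""
    bracket = (uv**2 - ir**2
               - m**2 * math.log(uv**2 / ir**2)
               + m**4 * (1 / ir**2 - 1 / uv**2))
    return bracket / (16 * math.pi**2)


if __name__ == "__main__":
    m, uv = 1.0, 100.0
    print("exact  :", exact(m, uv))
    print("numeric:", numeric(m, uv))
    for ir in (0.1, 0.01):
        print(f"naive, IR cutoff {ir}:", naive_expanded(m, uv, ir))
```

The numerical integral reproduces the closed form, while the truncated expansion changes substantially as the IR cutoff is lowered. That sensitivity is the spurious IR behavior described above.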
Note. In EFT, people often use the equations of motion to simplify the Lagrangian. For example,
the operators $\phi^3 \partial^2 \phi$ and $m^2 \phi^4$ are supposedly equivalent because $(\partial^2 - m^2)\phi = 0$. Thus, an
important part of setting up an EFT is using identities such as integration by parts, and the
equations of motion, to eliminate unwanted terms (often those with more derivatives). This is done
for dimension-6 operators in the SM here.
A dumb way to justify this procedure is to say “the equations of motion are true, so why not use
them?” Of course, this doesn’t make sense even classically. As noted in the notes on Undergraduate
Physics, plugging the equations of motion back into the Lagrangian isn’t valid even in point particle
mechanics, and applying it to a field theory like the one above clearly changes the solutions at the
classical level. And in a quantum theory, it is even less clear what it means for the equations of
motion to be “true” (e.g. the path integral integrates over off-shell configurations).
The reason this procedure works is that the two subtleties above cancel each other out. In the
QFT calculations we typically use EFTs for, we have little interest in the field itself. As discussed
in the notes on Group Theory, the field just serves as a tool for creating and annihilating particles,
and we are really interested in scattering amplitudes for the particles, which are the true gauge
invariant physical observables. As such, it’s clear that one can use the equations of motion at lowest
order. For instance, the operators $\phi^3 \partial^2 \phi$ and $m^2 \phi^4$ both mediate 2 → 2 scattering at tree level, but
all particles in the diagram are external. The former gives factors of $p^2$ for each external leg, and
the latter gives factors of $m^2$, but these are the same thing because external legs are on-shell.
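The tree-level equivalence can be made explicit with a short counting sketch. The coefficient $c$ and the shorthand $s(p)$, denoting the factor produced when $\partial^2$ acts on an external leg of momentum $p$, are notation assumed here for illustration. In the sign convention of the text, the on-shell condition reads $s(p) = m^2$. For the operator $c \, \phi^3 \partial^2 \phi$, any of the four external legs can be the differentiated field, and the remaining three can be assigned to the undifferentiated fields in $3!$ ways, so the four-point vertex is proportional to
\[ c \cdot 3! \sum_{i=1}^{4} s(p_i) \;\to\; c \cdot 3! \cdot 4 m^2 = 4! \, c \, m^2 \quad \text{(on shell)}, \]
which is exactly the vertex factor $4! \, c \, m^2$ produced by $c \, m^2 \phi^4$. Off shell the two vertices differ, which is why this counting by itself only covers diagrams in which every leg of the vertex is external.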
The above argument is only valid at lowest order, but the underlying reason works at all orders.
Correlation functions of fields are related to S-matrix elements by the LSZ reduction formula. We
will find the same scattering amplitudes as long as we use field operators that have nonzero overlap
between the vacuum and the desired one-particle states we are scattering. As such, we are free to
redefine the fields to some degree, and it turns out this freedom is equivalent to using the equations
of motion in the Lagrangian. The reason that this is not emphasized in standard field theory
textbooks is that the procedure only works order by order in the power counting expansion, i.e. one
can eliminate a term at the cost of introducing infinitely many higher-order ones. However, this is
perfectly acceptable in an EFT where we already had those terms to start with, and were planning
on neglecting them anyway.
In order to show this properly, we will follow the paper Reduced Effective Lagrangians (also see
this paper). For concreteness, we consider an EFT including a scalar field ϕ and work to first order
in the power counting parameter η,
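\[ \phi = \phi' + \eta \, T[\Phi] \]
where T is any local function of the fields Φ, carrying no powers of η. The generating functional is
\[ Z[J] = \int \mathcal{D}\Phi \, \exp\left( i \int dx \left( \mathcal{L}(\phi) + \sum_i J_i \Phi_i \right) \right), \]
and rewriting it in terms of ϕ′ produces terms involving the zeroth order Lagrangian $\mathcal{L}^{(0)}$,
where for brevity we are abusing notation by defining the “same spacetime” functional derivative,
\[ \frac{\delta \mathcal{L}^{(0)}}{\delta \phi} \equiv \frac{\partial \mathcal{L}^{(0)}}{\partial \phi} - \partial_\mu \frac{\partial \mathcal{L}^{(0)}}{\partial(\partial_\mu \phi)}. \]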
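As a sketch of the change of variables, assume the Lagrangian is organized as $\mathcal{L} = \mathcal{L}^{(0)} + \eta \mathcal{L}^{(1)} + O(\eta^2)$. This decomposition, and the symbol $J_\phi$ for the current coupled to ϕ, are notation introduced here for illustration. Substituting $\phi = \phi' + \eta T$, Taylor expanding, and integrating by parts would then give, to first order in η,
\[ Z[J] = \int \mathcal{D}\Phi' \, \det\left( \frac{\delta \phi}{\delta \phi'} \right) \exp\left( i \int dx \left( \mathcal{L}^{(0)}(\phi') + \eta \mathcal{L}^{(1)}(\phi') + \eta \, T \, \frac{\delta \mathcal{L}^{(0)}}{\delta \phi} + J_\phi (\phi' + \eta T) + \sum_{i \neq \phi} J_i \Phi_i \right) \right), \]
with the functional derivative evaluated at ϕ = ϕ′. The determinant, the term proportional to $\delta \mathcal{L}^{(0)}/\delta\phi$, and the modified source term $J_\phi \, \eta T$ are the three effects of the redefinition.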
At this order, we pick up a Jacobian from the change of variables, a shift to the O(η) part of the
action proportional to the $O(\eta^0)$ equations of motion, and a change in the coupling to currents. The
second effect is precisely what we want, because $\delta \mathcal{L}^{(0)}/\delta\phi = 0$ is simply the zeroth order equation
of motion. Thus, by a field redefinition, we can effectively use it to simplify the Lagrangian at first
order, at the cost of shifting the couplings in higher order terms which we are neglecting anyway.
It is clear that this generalizes, e.g. to subsequently eliminate $O(\eta^2)$ terms we would shift the field
again by an amount proportional to $\eta^2$. Also, it is necessary that the field redefinition preserves
the symmetries of the EFT, so no new terms are introduced.
The Jacobian factor is nontrivial, since we are going beyond linear field redefinitions. As in
Yang–Mills, it can be handled by introducing a ghost field, with
\[ \mathcal{L}_{\rm ghost} = -\bar{c} c + \eta \, \bar{c} \, \frac{\delta T}{\delta \phi} \, c. \]
However, the kinetic term for the ghosts must appear in the second term, and hence is O(η). Thus,
for a canonically normalized ghost field, the first term gives a mass 1/η, which is at the cutoff; hence
the ghosts can be integrated out just like the heavy fields. Note that for this part of the argument
to work, it was essential that the field redefinition be of the form ϕ = ϕ′ + O(η). Intuitively, this
condition means that the redefinition “preserves one-particle states”.
Finally, consider the effect of changing the couplings to currents. This is important because it
changes Green’s functions,
\[ G^{(n)} = \langle (\phi + \eta T)_1 \cdots (\phi + \eta T)_n \rangle \]
where the right-hand side is a time-ordered vev, and the subscripts i indicate $x_i$ arguments. To see
why this does not affect S-matrix elements, it suffices to consider a few examples. When T = ϕ, the
field redefinition just scales ϕ. This changes the field renormalization Zϕ in a compensating way,
and so drops out of the LSZ reduction formula. When $T = \phi^2$, the Green’s function essentially picks
up contributions that would have been part of $G^{(n+1)}$, and hence don’t have the right pole structure
to contribute to n-point scattering amplitudes. In the case of derivatives, $T = \partial^2 \phi$, we could simply
write $T = (\partial^2 - m^2)\phi + m^2 \phi$. The first term gives contributions that cancel the corresponding pole
in the Green’s function, while the second has already been taken care of.
Thus, the only physical effect is that of shifting the Lagrangian, so the (zeroth order) equations of
motion can be used freely to simplify it. Another useful fact is that, at lowest order only, integrating
out a heavy field is equivalent to simply solving its equations of motion and plugging the solution
back into the Lagrangian. Note that at higher orders, the equations of motion themselves are
changed already at lower order, so using them to simplify the Lagrangian is tricky; it is better to
just think in terms of field redefinitions.
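As a minimal worked illustration of the last fact, assume a toy Lagrangian, chosen here for illustration, in which a heavy scalar Φ of mass M couples to a light scalar ϕ through a single cubic term with coupling b,
\[ \mathcal{L} \supset \frac12 (\partial_\mu \Phi)^2 - \frac12 M^2 \Phi^2 - \frac12 b \, \phi^2 \Phi. \]
Dropping the derivative term, which is suppressed by $\partial^2/M^2$, the equation of motion for Φ becomes algebraic,
\[ -M^2 \Phi - \frac12 b \, \phi^2 = 0 \quad \Rightarrow \quad \Phi = -\frac{b \, \phi^2}{2M^2}. \]
Plugging this back in gives
\[ -\frac12 M^2 \Phi^2 - \frac12 b \, \phi^2 \Phi \;\to\; -\frac{b^2 \phi^4}{8M^2} + \frac{b^2 \phi^4}{4M^2} = \frac{b^2}{8M^2} \, \phi^4, \]
a local quartic interaction for the light field. Its vertex factor, $4! \, b^2/(8M^2) = 3b^2/M^2$, agrees with the sum of the s, t, and u channel Φ exchange diagrams at momenta far below M, each of which contributes $b^2/M^2$.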
• The full theory has fields ϕ of mass m, and Φ of mass M , with M ≫ m. It is valid up to scales
µ > M , and has Lagrangian
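\[ \mathcal{L}^{\rm Full}_{\rm kin} = \frac12 (\partial_\mu \phi)^2 - \frac12 m^2 \phi^2 + \frac12 (\partial_\mu \Phi)^2 - \frac12 M^2 \Phi^2 \]
and
\[ \mathcal{L}^{\rm Full}_{\rm int} = -\frac{1}{3!} a \phi^3 - \frac12 b \phi^2 \Phi - \frac14 \kappa \phi^2 \Phi^2 - \frac{1}{3!} \rho \phi^3 \Phi - \frac{1}{4!} \eta \phi^4. \]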
To keep things simple, we will only turn some of these couplings on at once.
• In this simple case, the power counting parameter is m/M , and we can do power counting in
the EFT by ordinary dimensional analysis. However, in a more general situation, this would
be more subtle. For example, a derivative ∂µ has mass dimension 1 and hence power counting
dimension 1 here, but in a nonrelativistic EFT where the power counting parameter is v/c, such
as NRQED, we have $\partial_0 \sim (v/c)^2$ and $\partial_i \sim (v/c)$. Similar phenomena would happen in an EFT
defined in the soft or collinear limit.
• In these more nontrivial cases, we simply recall that the scaling of the field itself is always fixed
by making the kinetic term marginal, and coordinates scale inversely to derivatives.
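As a sketch of how this rule works in practice, consider two standard textbook kinetic terms. They are assumed here for illustration and are not taken from the text above. For a relativistic scalar with $S = \int d^4x \, \frac12 (\partial_\mu \phi)^2$ and $\partial_\mu \sim p$, the coordinates scale as $x \sim 1/p$, so
\[ d^4x \sim p^{-4}, \qquad (\partial_\mu \phi)^2 \sim p^2 \phi^2 \quad \Rightarrow \quad \phi \sim p, \]
which reproduces ordinary dimensional analysis. For a nonrelativistic field with $S = \int dt \, d^3x \, \psi^\dagger \left( i \partial_t + \nabla^2/2m \right) \psi$, working in units where c = 1 and momenta are measured in units of m, we have $\partial_i \sim v$ and $\partial_t \sim v^2$, so
\[ dt \, d^3x \sim v^{-5}, \qquad \psi^\dagger \left( i \partial_t + \frac{\nabla^2}{2m} \right) \psi \sim v^2 \, \psi^\dagger \psi \quad \Rightarrow \quad \psi \sim v^{3/2}. \]
Both terms in the nonrelativistic kinetic operator scale the same way, which is what makes the assignment $\partial_t \sim v^2$ self-consistent.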