Mechanic
M. Bivert
May 6, 2024
Abstract
Below are solution proposals to the exercises of The Theoretical Minimum - Quantum Mechanics,
written by Leonard Susskind and Art Friedman. An effort has been made to recall from the book all
the referenced equations, and to be rather verbose regarding mathematical details, in line with the
general tone of the series.
Contents
1 Systems and Experiments
  1.1 Inner Products
2 Quantum States
  2.1 Along the x Axis
  2.2 Along the y Axis
5 Uncertainty and Time Dependence
  5.1 Mathematical Interlude: Complete Sets of Commuting Variables
    5.1.1 States That Depend On More Than One Measurable
    5.1.2 Wave Functions
    5.1.3 A Note About Terminology
  5.2 Measurement
  5.3 The Uncertainty Principle
  5.4 The Meaning of Uncertainty
  5.5 Cauchy-Schwarz Inequality
  5.6 The Triangle Inequality and the Cauchy-Schwarz Inequality
  5.7 The General Uncertainty Principle
7 More on Entanglement
  7.1 Mathematical Interlude: Tensor Products in Component Form
    7.1.1 Building Tensor Product Matrices from Basic Principles
    7.1.2 Building Tensor Product Matrices from Component Matrices
  7.2 Mathematical Interlude: Outer Products
  7.3 Density Matrices: A New Tool
  7.4 Entanglement and Density Matrices
  7.5 Entanglement for Two Spins
  7.6 A Concrete Example: Calculating Alice’s Density Matrix
  7.7 Tests for Entanglement
    7.7.1 The Correlation Test for Entanglement
    7.7.2 The Density Matrix Test for Entanglement
  7.8 The Process of Measurement
  7.9 Entanglement and Locality
  7.10 The Quantum Sim: An Introduction to Bell’s Theorem
  7.11 Entanglement Summary
9 Particle Dynamics
  9.1 A Simple Example
  9.2 Nonrelativistic Free Particles
  9.3 Time-Independent Schrödinger Equation
  9.4 Velocity and Momentum
  9.5 Quantization
  9.6 Forces
  9.7 Linear Motion and the Classical Limit
  9.8 Path integrals
Axiom 2.
\[
\langle B|A\rangle = \langle A|B\rangle^*
\]
Where $z^*$ is the complex conjugate of $z \in \mathbb{C}$.
Let us recall also that if
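We thus have:
\[
\begin{aligned}
\big(\langle A| + \langle B|\big)|C\rangle &= \Big[\langle C|\big(|A\rangle + |B\rangle\big)\Big]^* \\
&= \big[\langle C|A\rangle + \langle C|B\rangle\big]^* \\
&= \langle C|A\rangle^* + \langle C|B\rangle^* \\
&= \langle A|C\rangle + \langle B|C\rangle
\end{aligned}
\]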
b) Mainly from the second axiom:
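\[
\begin{aligned}
x + iy &= \langle A|A\rangle \\
&= \langle A|A\rangle^* \\
&= x - iy \\
\Rightarrow\ 2iy &= 0 \\
\Rightarrow\ y &= 0 \\
\Rightarrow\ \langle A|A\rangle &= x \in \mathbb{R}
\end{aligned}
\]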
Exercise 2. Show that the inner product defined by Eq. 1.2 satisfies all the axioms of inner products.
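Axiom 4.
\[
\langle B|A\rangle = \langle A|B\rangle^*
\]
Where $z^*$ is the complex conjugate of $z \in \mathbb{C}$.
And let us recall Eq. 1.2 of the book:
\[
\begin{aligned}
\langle B|A\rangle &= \begin{pmatrix} \beta_1^* & \beta_2^* & \beta_3^* & \beta_4^* & \beta_5^* \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{pmatrix} \\
&= \beta_1^*\alpha_1 + \beta_2^*\alpha_2 + \beta_3^*\alpha_3 + \beta_4^*\alpha_4 + \beta_5^*\alpha_5
\end{aligned}
\]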
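For the first axiom, considering $\langle C| = (\gamma_1^*\ \gamma_2^*\ \gamma_3^*\ \gamma_4^*\ \gamma_5^*)$:
\[
\begin{aligned}
\langle C|\big(|A\rangle + |B\rangle\big) &= \begin{pmatrix} \gamma_1^* & \gamma_2^* & \gamma_3^* & \gamma_4^* & \gamma_5^* \end{pmatrix}
\begin{pmatrix} \alpha_1 + \beta_1 \\ \alpha_2 + \beta_2 \\ \alpha_3 + \beta_3 \\ \alpha_4 + \beta_4 \\ \alpha_5 + \beta_5 \end{pmatrix} \\
&= \gamma_1^*(\alpha_1 + \beta_1) + \gamma_2^*(\alpha_2 + \beta_2) + \gamma_3^*(\alpha_3 + \beta_3) + \gamma_4^*(\alpha_4 + \beta_4) + \gamma_5^*(\alpha_5 + \beta_5) \\
&= \gamma_1^*\alpha_1 + \gamma_2^*\alpha_2 + \gamma_3^*\alpha_3 + \gamma_4^*\alpha_4 + \gamma_5^*\alpha_5 + \gamma_1^*\beta_1 + \gamma_2^*\beta_2 + \gamma_3^*\beta_3 + \gamma_4^*\beta_4 + \gamma_5^*\beta_5 \\
&= \begin{pmatrix} \gamma_1^* & \gamma_2^* & \gamma_3^* & \gamma_4^* & \gamma_5^* \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{pmatrix}
+ \begin{pmatrix} \gamma_1^* & \gamma_2^* & \gamma_3^* & \gamma_4^* & \gamma_5^* \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \end{pmatrix} \\
&= \langle C|A\rangle + \langle C|B\rangle
\end{aligned}
\]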
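Before checking the second axiom, let us observe that for $(a, b) = (x_a + iy_a, x_b + iy_b) \in \mathbb{C}^2$:
\[
\begin{aligned}
(ab)^* &= \big((x_a + iy_a) \times (x_b + iy_b)\big)^* \\
&= \big(x_a x_b - y_a y_b + i(x_b y_a + x_a y_b)\big)^* \\
&= x_a x_b - y_a y_b - i(x_b y_a + x_a y_b) \\
&= (x_a - iy_a) \times (x_b - iy_b) \\
&= a^* b^*
\end{aligned}
\]
Then, for the second axiom:
\[
\begin{aligned}
\langle B|A\rangle &= \big(\langle B|A\rangle^*\big)^* \\
&= \Big(\big(\beta_1^*\alpha_1 + \beta_2^*\alpha_2 + \beta_3^*\alpha_3 + \beta_4^*\alpha_4 + \beta_5^*\alpha_5\big)^*\Big)^* \\
&= \big(\beta_1\alpha_1^* + \beta_2\alpha_2^* + \beta_3\alpha_3^* + \beta_4\alpha_4^* + \beta_5\alpha_5^*\big)^* \\
&= \big(\alpha_1^*\beta_1 + \alpha_2^*\beta_2 + \alpha_3^*\beta_3 + \alpha_4^*\beta_4 + \alpha_5^*\beta_5\big)^* \\
&= \left(\begin{pmatrix} \alpha_1^* & \alpha_2^* & \alpha_3^* & \alpha_4^* & \alpha_5^* \end{pmatrix}
\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \end{pmatrix}\right)^* \\
&= \langle A|B\rangle^*
\end{aligned}
\]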
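The two axioms checked above, together with the reality of $\langle A|A\rangle$ from the first exercise, can also be spot-checked numerically. What follows is only a minimal sketch: the helpers `inner`, `random_ket` and `add` are hypothetical names introduced for illustration, and random vectors are a sanity check, not a proof.

```python
import random

def inner(b, a):
    """<B|A> = sum_k conj(beta_k) * alpha_k, as in Eq. 1.2 (five components)."""
    return sum(bk.conjugate() * ak for bk, ak in zip(b, a))

def random_ket(n=5):
    """Hypothetical helper: a ket with random complex components."""
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

def add(a, b):
    """Componentwise sum of two kets."""
    return [x + y for x, y in zip(a, b)]

random.seed(0)
A, B, C = random_ket(), random_ket(), random_ket()

# Linearity: <C|(|A> + |B>) = <C|A> + <C|B>
lin_gap = abs(inner(C, add(A, B)) - (inner(C, A) + inner(C, B)))
# Conjugate symmetry: <B|A> = <A|B>*
conj_gap = abs(inner(B, A) - inner(A, B).conjugate())
# Consequence: <A|A> is a real number
norm_imag = abs(inner(A, A).imag)

print(lin_gap, conj_gap, norm_imag)
```

All three printed quantities should vanish up to floating-point rounding.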
2 Quantum States
2.1 Along the x Axis
Exercise 3. Prove that the vector |r⟩ in Eq. 2.5 is orthogonal to vector |l⟩ in Eq. 2.6.
Orthogonality can be detected with the inner-product: |l⟩ and |r⟩ are orthogonal ⇔ ⟨r|l⟩ = ⟨l|r⟩ = 0.
Remark 2. The vanishing of either inner-product is sufficient, because of the axiom $\langle A|B\rangle = \langle B|A\rangle^*$.
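For instance:
\[
\begin{aligned}
\langle l|r\rangle &= \begin{pmatrix} \lambda_u^* & \lambda_d^* \end{pmatrix} \begin{pmatrix} \rho_u \\ \rho_d \end{pmatrix} \\
&= \begin{pmatrix} \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} \end{pmatrix} \\
&= \frac{1}{\sqrt2}\frac{1}{\sqrt2} - \frac{1}{\sqrt2}\frac{1}{\sqrt2} \\
&= 0
\end{aligned}
\]
Or, similarly:
\[
\begin{aligned}
\langle r|l\rangle &= \begin{pmatrix} \rho_u^* & \rho_d^* \end{pmatrix} \begin{pmatrix} \lambda_u \\ \lambda_d \end{pmatrix} \\
&= \begin{pmatrix} \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt2} \\ -\frac{1}{\sqrt2} \end{pmatrix} \\
&= \frac{1}{\sqrt2}\frac{1}{\sqrt2} - \frac{1}{\sqrt2}\frac{1}{\sqrt2} \\
&= 0
\end{aligned}
\]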
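Let us recall, in order, Eqs. 2.7, 2.8, 2.9, 2.10, which define |i⟩ and |o⟩, and both 2.5 and 2.6 which define |r⟩ and |l⟩:
\[
\langle i|o\rangle = 0
\]
\[
\begin{aligned}
\langle o|u\rangle\langle u|o\rangle &= \frac12 & \langle o|d\rangle\langle d|o\rangle &= \frac12 \\
\langle i|u\rangle\langle u|i\rangle &= \frac12 & \langle i|d\rangle\langle d|i\rangle &= \frac12
\end{aligned}
\]
\[
\begin{aligned}
\langle o|r\rangle\langle r|o\rangle &= \frac12 & \langle o|l\rangle\langle l|o\rangle &= \frac12 \\
\langle i|r\rangle\langle r|i\rangle &= \frac12 & \langle i|l\rangle\langle l|i\rangle &= \frac12
\end{aligned}
\]
\[
|i\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{i}{\sqrt2}|d\rangle \qquad |o\rangle = \frac{1}{\sqrt2}|u\rangle - \frac{i}{\sqrt2}|d\rangle
\]
\[
|r\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{1}{\sqrt2}|d\rangle \qquad |l\rangle = \frac{1}{\sqrt2}|u\rangle - \frac{1}{\sqrt2}|d\rangle
\]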
For clarity, let us recall that ⟨u|A⟩ is the component of |A⟩ along the orthonormal vector |u⟩. This is
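because in an orthonormal basis $(|i\rangle)_{i \in F}$ we have:
\[
|A\rangle = \sum_{i \in F} \alpha_i |i\rangle
\ \Rightarrow\ \langle j|A\rangle = \langle j| \sum_{i \in F} \alpha_i |i\rangle = \sum_{i \in F} \alpha_i \underbrace{\langle j|i\rangle}_{=\delta_{ij}} = \alpha_j
\]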
And to make better sense of those equations, let us recall that αu∗ αu = ⟨A|u⟩ ⟨u|A⟩ is the probability of
a state vector |A⟩ = αu |u⟩ + αd |d⟩ to be measured in the state |u⟩.
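For Eq. 2.7, we have
\[
\begin{aligned}
\langle i|o\rangle &= \begin{pmatrix} \iota_u^* & \iota_d^* \end{pmatrix} \begin{pmatrix} o_u \\ o_d \end{pmatrix} \\
&= \iota_u^* o_u + \iota_d^* o_d \\
&= \frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{-i}{\sqrt2}\frac{-i}{\sqrt2} = \frac12 - \frac12 = 0
\end{aligned}
\]
Likewise, for Eqs. 2.8:
\[
\begin{aligned}
\langle o|u\rangle\langle u|o\rangle &= \frac{1}{\sqrt2}\frac{1}{\sqrt2} = \frac12 & \langle o|d\rangle\langle d|o\rangle &= \frac{i}{\sqrt2}\frac{-i}{\sqrt2} = \frac12 \\
\langle i|u\rangle\langle u|i\rangle &= \frac{1}{\sqrt2}\frac{1}{\sqrt2} = \frac12 & \langle i|d\rangle\langle d|i\rangle &= \frac{-i}{\sqrt2}\frac{i}{\sqrt2} = \frac12
\end{aligned}
\]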
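For Eqs. 2.9, we need to rely on the column form of the inner-product:
\[
\begin{aligned}
\langle o|r\rangle\langle r|o\rangle &= \begin{pmatrix} o_u^* & o_d^* \end{pmatrix} \begin{pmatrix} \rho_u \\ \rho_d \end{pmatrix} \begin{pmatrix} \rho_u^* & \rho_d^* \end{pmatrix} \begin{pmatrix} o_u \\ o_d \end{pmatrix} \\
&= \Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{i}{\sqrt2}\frac{1}{\sqrt2}\Big)\Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{1}{\sqrt2}\frac{-i}{\sqrt2}\Big) \\
&= \Big(\frac12 + \frac{i}{2}\Big)\Big(\frac12 - \frac{i}{2}\Big) \\
&= \frac14 (1 + i)(1 - i) \\
&= \frac14 (1 - i + i + 1) = \frac12
\end{aligned}
\]
\[
\begin{aligned}
\langle o|l\rangle\langle l|o\rangle &= \begin{pmatrix} o_u^* & o_d^* \end{pmatrix} \begin{pmatrix} \lambda_u \\ \lambda_d \end{pmatrix} \begin{pmatrix} \lambda_u^* & \lambda_d^* \end{pmatrix} \begin{pmatrix} o_u \\ o_d \end{pmatrix} \\
&= \Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{i}{\sqrt2}\frac{-1}{\sqrt2}\Big)\Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{-1}{\sqrt2}\frac{-i}{\sqrt2}\Big) \\
&= \Big(\frac12 - \frac{i}{2}\Big)\Big(\frac12 + \frac{i}{2}\Big) \\
&= \frac14 (1 - i)(1 + i) \\
&= \frac14 (1 + i - i + 1) = \frac12
\end{aligned}
\]
\[
\begin{aligned}
\langle i|r\rangle\langle r|i\rangle &= \begin{pmatrix} \iota_u^* & \iota_d^* \end{pmatrix} \begin{pmatrix} \rho_u \\ \rho_d \end{pmatrix} \begin{pmatrix} \rho_u^* & \rho_d^* \end{pmatrix} \begin{pmatrix} \iota_u \\ \iota_d \end{pmatrix} \\
&= \Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{-i}{\sqrt2}\frac{1}{\sqrt2}\Big)\Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{1}{\sqrt2}\frac{i}{\sqrt2}\Big) \\
&= \Big(\frac12 - \frac{i}{2}\Big)\Big(\frac12 + \frac{i}{2}\Big) \\
&= \frac14 (1 - i)(1 + i) \\
&= \frac14 (1 + i - i + 1) = \frac12
\end{aligned}
\]
\[
\begin{aligned}
\langle i|l\rangle\langle l|i\rangle &= \begin{pmatrix} \iota_u^* & \iota_d^* \end{pmatrix} \begin{pmatrix} \lambda_u \\ \lambda_d \end{pmatrix} \begin{pmatrix} \lambda_u^* & \lambda_d^* \end{pmatrix} \begin{pmatrix} \iota_u \\ \iota_d \end{pmatrix} \\
&= \Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{-i}{\sqrt2}\frac{-1}{\sqrt2}\Big)\Big(\frac{1}{\sqrt2}\frac{1}{\sqrt2} + \frac{-1}{\sqrt2}\frac{i}{\sqrt2}\Big) \\
&= \Big(\frac12 + \frac{i}{2}\Big)\Big(\frac12 - \frac{i}{2}\Big) \\
&= \frac14 (1 + i)(1 - i) \\
&= \frac14 (1 - i + i + 1) = \frac12
\end{aligned}
\]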
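The hand computations for Eqs. 2.7, 2.8 and 2.9 can be cross-checked with a few lines of code. This is a minimal sketch that assumes the component values given by Eqs. 2.5, 2.6 and 2.10; the dictionary `kets` and the helpers `inner` and `prob` are hypothetical names chosen for the illustration.

```python
from math import sqrt, isclose

s = 1 / sqrt(2)
# Components along (|u>, |d>), taken from Eqs. 2.5, 2.6 and 2.10.
kets = {
    "u": (1, 0),
    "d": (0, 1),
    "r": (s, s),
    "l": (s, -s),
    "i": (s, 1j * s),
    "o": (s, -1j * s),
}

def inner(b, a):
    """<b|a>: conjugate the bra components, then sum the products."""
    return sum(complex(x).conjugate() * y for x, y in zip(kets[b], kets[a]))

def prob(b, a):
    """<a|b><b|a>, i.e. the squared modulus of <b|a>."""
    return (inner(a, b) * inner(b, a)).real

# Eq. 2.7, and the orthogonality of the two other pairs.
orthogonal = [abs(inner("i", "o")), abs(inner("l", "r")), abs(inner("u", "d"))]

# Eqs. 2.8 and 2.9: every probability between two different axes is 1/2.
axes = [("u", "d"), ("r", "l"), ("i", "o")]
cross = [
    prob(a, b)
    for k, pair in enumerate(axes)
    for other in axes[k + 1:]
    for a in pair
    for b in other
]

print(orthogonal)
print(all(isclose(p, 0.5) for p in cross), len(cross))
```

The list `cross` holds the twelve cross-axis probabilities, which include the eight products written out in Eqs. 2.8 and 2.9.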
Regarding the uniqueness of |i⟩, |o⟩, as for |r⟩, |l⟩, there definitely is a phase ambiguity, meaning, we can multiply either |i⟩ or |o⟩ by a phase factor, say $e^{i\theta}$, without disturbing any of the constraints: orthogonality and probabilities are preserved, and the resulting vectors are still unit vectors.
But as stated by the authors for |r⟩, |l⟩, measurable quantities are independent of any phase factors.
Thus, so far, there seems to be uniqueness, up to such a phase factor.
Remark 3. I think some sort of dimensional argument might be required to rigorously prove that indeed there’s no way to extract more than three pairs of mutually orthogonal vectors whose inner-products across pairs have squared modulus 1/2, in a C-vector space setting.
Exercise 5. For the moment, forget that Eqs. 2.10 give us working definitions for |i⟩ and |o⟩ in terms
of |u⟩ and |d⟩, and assume that the components α, β, γ and δ are unknown:
α∗ β + αβ ∗ = γ ∗ δ + γδ ∗ = 0
If α∗ β is pure imaginary, then α and β cannot both be real. The same reasoning applies to γ ∗ δ.
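Let’s start by recalling Eqs. 2.8, 2.9 and 2.10, which are respectively:
\[
\begin{aligned}
\langle o|u\rangle\langle u|o\rangle &= \frac12 & \langle o|d\rangle\langle d|o\rangle &= \frac12 \\
\langle i|u\rangle\langle u|i\rangle &= \frac12 & \langle i|d\rangle\langle d|i\rangle &= \frac12
\end{aligned}
\tag{1}
\]
\[
\begin{aligned}
\langle o|r\rangle\langle r|o\rangle &= \frac12 & \langle o|l\rangle\langle l|o\rangle &= \frac12 \\
\langle i|r\rangle\langle r|i\rangle &= \frac12 & \langle i|l\rangle\langle l|i\rangle &= \frac12
\end{aligned}
\tag{2}
\]
\[
|i\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{i}{\sqrt2}|d\rangle \qquad |o\rangle = \frac{1}{\sqrt2}|u\rangle - \frac{i}{\sqrt2}|d\rangle
\tag{3}
\]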
a) Let’s start by recalling that the inner-product in a Hilbert space is defined between a bra and a ket,
and that it should satisfy at least the following axioms:
z ∈ C, |zA⟩ = z|A⟩
Then we can multiply |o⟩ = α|u⟩ + β|d⟩ to the left by ⟨u| to compute ⟨u|o⟩, using the linearity of the
inner-product/scalar multiplication, and the fact that |u⟩ and |d⟩ are, by definition, unitary orthogonal
vectors (meaning, ⟨u|d⟩ = 0 and ⟨u|u⟩ = ⟨d|d⟩ = 1)
⟨u|o⟩ = α ⟨u|u⟩ + β ⟨u|d⟩ = α
Because of the complex conjugation rule, we have
\[
\langle o|u\rangle = \langle u|o\rangle^* = \alpha^*
\]
b) I don’t think we can conclude here without recalling the definition of |r⟩:
\[
|r\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{1}{\sqrt2}|d\rangle
\]
Let’s start with a piece from Eqs. 2.9, arbitrarily (we could use $\langle i|l\rangle\langle l|i\rangle = \frac12$, but I think we’d still need the previous definition of |r⟩):
\[
\langle i|r\rangle\langle r|i\rangle = \frac12
\]
But:
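\[
\langle r|i\rangle = \langle r|\{\alpha|u\rangle + \beta|d\rangle\} = \alpha\langle r|u\rangle + \beta\langle r|d\rangle
\]
And:
\[
\langle i|r\rangle = (\langle r|i\rangle)^* = (\alpha\langle r|u\rangle + \beta\langle r|d\rangle)^* = \alpha^*\langle u|r\rangle + \beta^*\langle d|r\rangle
\]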
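So
\[
\begin{aligned}
& \langle i|r\rangle\langle r|i\rangle = \frac12 \\
\Leftrightarrow\ & \big(\alpha^*\langle u|r\rangle + \beta^*\langle d|r\rangle\big)\big(\alpha\langle r|u\rangle + \beta\langle r|d\rangle\big) = \frac12 \\
\Leftrightarrow\ & \underbrace{\alpha^*\alpha}_{=1/2}\langle u|r\rangle\langle r|u\rangle + \alpha^*\beta\langle u|r\rangle\langle r|d\rangle + \beta^*\alpha\langle d|r\rangle\langle r|u\rangle + \underbrace{\beta^*\beta}_{=1/2}\langle d|r\rangle\langle r|d\rangle = \frac12 \\
\Leftrightarrow\ & \frac12\big(\langle u|r\rangle\langle r|u\rangle + \langle d|r\rangle\langle r|d\rangle\big) + \alpha^*\beta\langle u|r\rangle\langle r|d\rangle + \beta^*\alpha\langle d|r\rangle\langle r|u\rangle = \frac12
\end{aligned}
\]
Now if $|r\rangle = \rho_u|u\rangle + \rho_d|d\rangle$, then
\[
\langle u|r\rangle\langle r|u\rangle + \langle d|r\rangle\langle r|d\rangle = \rho_u\rho_u^* + \rho_d\rho_d^* = 1
\]
Indeed, $\rho_u\rho_u^*$ would be the probability of |r⟩ to be up, and $\rho_d\rho_d^*$ would be the probability of |r⟩ to be down; those are two orthogonal states in a two-state setting, so the sum of their probabilities must be 1.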
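Note that so far, we haven’t needed the expression of |r⟩, but I think we don’t have a choice but to use it to conclude:
\[
|r\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{1}{\sqrt2}|d\rangle
\]
So, as the coefficients are real numbers:
\[
\langle u|r\rangle = \frac{1}{\sqrt2} = \langle r|u\rangle; \qquad \langle d|r\rangle = \frac{1}{\sqrt2} = \langle r|d\rangle
\]
Replacing in the previous expression we have:
\[
\begin{aligned}
\Leftrightarrow\ & \frac12\alpha^*\beta + \frac12\beta^*\alpha = 0 \\
\Leftrightarrow\ & \alpha^*\beta + \beta^*\alpha = 0
\end{aligned}
\]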
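The process is very similar to prove $\gamma^*\delta + \gamma\delta^* = 0$; one has to start again from Eqs. 2.9, but this time, from another piece involving o, arbitrarily:
\[
\begin{aligned}
& \langle o|r\rangle\langle r|o\rangle = \frac12 \\
\Leftrightarrow\ & \langle r|o\rangle^*\langle r|o\rangle = \frac12 \\
\Leftrightarrow\ & \big(\langle r|\{\gamma|u\rangle + \delta|d\rangle\}\big)^*\,\langle r|\{\gamma|u\rangle + \delta|d\rangle\} = \frac12 \\
\Leftrightarrow\ & \big(\gamma^*\langle u|r\rangle + \delta^*\langle d|r\rangle\big)\big(\gamma\langle r|u\rangle + \delta\langle r|d\rangle\big) = \frac12 \\
\Leftrightarrow\ & \underbrace{\gamma^*\gamma}_{=1/2}\langle u|r\rangle\langle r|u\rangle + \gamma^*\delta\langle u|r\rangle\langle r|d\rangle + \delta^*\gamma\langle d|r\rangle\langle r|u\rangle + \underbrace{\delta^*\delta}_{=1/2}\langle d|r\rangle\langle r|d\rangle = \frac12 \\
\Leftrightarrow\ & \frac12\underbrace{\big(\langle u|r\rangle\langle r|u\rangle + \langle d|r\rangle\langle r|d\rangle\big)}_{=1} + \gamma^*\delta\langle u|r\rangle\langle r|d\rangle + \delta^*\gamma\langle d|r\rangle\langle r|u\rangle = \frac12 \\
\Leftrightarrow\ & \gamma^*\delta + \delta^*\gamma = 0
\end{aligned}
\]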
c) Let’s assume $\alpha\beta^*$ is a complex number of the form:
\[
\alpha\beta^* = a + ib, \quad (a, b) \in \mathbb{R}^2
\]
But then:
\[
(\alpha\beta^*)^* = a - ib = \alpha^*\beta
\]
That’s because, for two complex numbers $z = a + ib$ and $w = x + iy$, we have:
\[
(zw)^* = z^* w^*
\]
Indeed:
\[
zw = (a + ib)(x + iy) = (ax - by) + i(bx + ya)
\]
Hence:
\[
(zw)^* = (ax - by) - i(bx + ya)
\]
But:
\[
z^* w^* = (a - ib)(x - iy) = (ax - by) - i(bx + ya)
\]
Hence the result. Back to our α and β, we established in b) that:
\[
\alpha^*\beta + \alpha\beta^* = 0
\]
Which is equivalent, from our previous little proof, to:
\[
\alpha^*\beta + (\alpha^*\beta)^* = 0 \Leftrightarrow (a + ib) + (a - ib) = 0 \Leftrightarrow 2a = 0 \Leftrightarrow a = 0
\]
Which is the same as saying that the real part of $\alpha^*\beta$ is zero, or that it’s a pure imaginary number. The exact same argument applies for $\gamma^*\delta$.
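As a quick sanity check of parts b) and c), one can plug in the components that Eqs. 2.10 actually give for |i⟩ and |o⟩. This is a sketch under that assumption only: the exercise itself treats the components as unknowns, and the dictionary `components` is a hypothetical name.

```python
from math import sqrt

s = 1 / sqrt(2)
# (first, second) components along |u>, |d>, taken from Eqs. 2.10.
components = {"i": (s, 1j * s), "o": (s, -1j * s)}

results = {}
for name, (first, second) in components.items():
    cross = first.conjugate() * second           # first* second
    total = cross + first * second.conjugate()   # first* second + first second*
    results[name] = (cross, total)
    print(name, cross, total)
```

For both vectors, `total` vanishes and `cross` has no real part, in line with the pure imaginary conclusion.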
This exercise is about proving one part of what the authors call the Fundamental theorem, also often called in the literature the (real) Spectral theorem. So far, we’ve been working more or less explicitly in finite-dimensional spaces, but this result in particular has a notorious analogue in infinite-dimensional Hilbert spaces, called the Spectral theorem$^{1}$.
Now, I’m not going to prove the infinite-dimensional version here. There’s a good reason why quantum mechanics courses often start with spins: they don’t require the generalized result, which demands heavy mathematical machinery. You may want to refer to F. Schuller’s YouTube lectures on quantum mechanics$^{2}$ for a deeper mathematical development.
Finally, I’m going to use a mathematically inclined approach here (definitions/theorems/proofs), and as
we won’t need it, I won’t be using the bra-ket notation.
1 See [Link] and [Link]
2 [Link]; see also the lecture notes (.pdf) made by a student (Simon Rea): [Link]
To clarify, here’s the theorem we’re going to prove (I’ll slightly restate it with minor adjustments later
on):
Theorem 1. Let H : V → V be a Hermitian operator on a finite-dimensional vector space V , equipped
with an inner-product3 .
L(p) = λp
Remark 4. As this can be a source of confusion later on, note that the definition of eigenvector/eigenvalue does not depend on the diagonalizability of L.
Remark 5. Note also that while eigenvectors must be non-zero, no such restrictions are imposed on the
eigenvalues.
Definition 2. Two vectors p and q from a vector space V over a field F, where V is equipped with an
inner product ⟨., .⟩ are said to be orthogonal (with respect to the inner-product) whenever:
⟨p, q⟩ = 0F
The following lemma will be of great use later on. Don’t let yourself be discouraged by the length of the proof: it can be shortened to just a few lines, but I’m going to be very precise, hence very explicit, so as to make the otherwise simple underlying mathematical constructions as clear as I can.
Lemma 1. A linear operator L : V → V on an n-dimensional (n ∈ N) vector space V over the complex numbers has at least one eigenvalue.
Proof. Let’s take a v ∈ V. We assume V is not trivial, that is, V isn’t reduced to its zero vector $0_V$, and so we can always choose $v \neq 0_V$$^{4}$. Consider the family of vectors:
\[
\big(L^0(v), L^1(v), \ldots, L^n(v)\big)
\]
where:
\[
L^0 := \mathrm{id}_V; \qquad L^i := \underbrace{L \circ L \circ \ldots \circ L}_{i \in \mathbb{N}\text{ times}}
\]
It’s a set of n + 1 vectors, but the space is n dimensional, so the vectors are not all linearly independent.
This means there’s a tuple $(\alpha_0, \alpha_1, \ldots, \alpha_n) \in \mathbb{C}^{n+1}$ whose entries are not all zero, such that:
\[
\sum_{i=0}^{n} \alpha_i L^i(v) = 0_V \tag{4}
\]
Here’s the "subtle" part. You remember what a polynomial is, something like:
\[
x^2 - 2x + 1
\]
3 Remember, we need it to be able to talk about orthogonality.
4 Note that if V is trivial, because an eigenvalue is always associated to a non-zero vector, there are no eigenvalues/eigenvectors, and the result is trivial.
You know it’s customary to then consider this a function of a single variable x, which for instance, can be a real number:
\[
L : \begin{cases} \mathbb{R} \to \mathbb{R} \\ x \mapsto x^2 - 2x + 1 \end{cases}
\]
This allows you to graph the polynomial and so forth:
[Figure 1: graph of $L(x) = x^2 - 2x + 1$]
But that’s "kindergarten" polynomials so to speak. "Advanced" polynomials are not functions of a real variable. Rather, we say that L(x) or L is a polynomial of a single variable/indeterminate$^{5}$ x, where x stands for an abstract symbol.
The reason is that, when you say that x is a real number (or a complex number, or whatever), you tacitly assume that you can for instance add, subtract or multiply various occurrences of x, but when mathematicians study polynomials, they want to do so without requiring additional (mathematical) structure on x.
The set of polynomials of a single variable X with coefficients in a field F is denoted F[X]. For instance, C[f] is the set of all polynomials with complex coefficients of a single variable f, say, $P(f) = (3 + 2i)f^3 + 5f \in \mathbb{C}[f]$.
Now you’d tell me, wait a minute: if I have a $P(X) = X^2 - 2X + 1$, am I not then adding a polynomial $X^2 - 2X$ with an element from the field, 1?
Well, you’d be somewhat right: the notation is ambiguous, in part inherited from the habits of kindergarten polynomials, in part because the context often makes things clear, and perhaps most importantly, because a truly unambiguous notation is impractically verbose. Actually, $X^2 - 2X + 1$ is a shortcut notation for $X^2 - 2X^1 + 1X^0$. So no: all the + here are between polynomials.
What does it mean that the + are between polynomials? Well, most often when you encounter F[X], it’s actually a shortcut for $(F[X], +_{F[X]}, \cdot_{F[X]})$, which is a ring$^{6}$ of polynomials of a single indeterminate over a field$^{7}$ F. This means that $X^2 - 2X + 1$ is actually a shortcut for:
5 [Link]
6 [Link] Note that there is no notion of subtraction in a ring: the
Awful, right? This is why we often use ambiguous notations and reasonable syntactical shortcuts.
The main takeaway though is that mathematicians have defined a set of precise rules (addition, scalar multiplication, exponentiation of an indeterminate), and that by cleverly combining such rules and only such rules, they have obtained a bunch of interesting results, and we want to use one of them in particular.
Let’s get back to our equation (4); let me add some parentheses for clarity:
\[
\sum_{i=0}^{n} \alpha_i L^i(v) = 0_V
\]
\[
\underbrace{\left(\sum_{i=0}^{n} \alpha_i L^i\right)}_{=:P(L)}(v) = 0_V
\]
What’s P? It’s a function which takes a linear operator on V and returns . . . a polynomial? But then, we don’t know how to evaluate a polynomial on a vector v ∈ V, so there’s a problem somewhere.
The natural way, that is, the simplest consistent way, to do so, is to define them pointwise$^{9}$: for two functions f, g : X → Y, we define (f + g) : X → Y by:
\[
(f + g)(x) := f(x) + g(x)
\]
We equip the space of (linear) functions (on V) with additional laws. All in all, P is well defined$^{10}$, and we can indeed pull the v out.
How then can we go from such a weird ”meta” function P to a polynomial? Well, as we stated earlier,
polynomials are defined by a set of specific rules: addition, scalar multiplication, and exponentiation of
the indeterminate.
• Similarly for our scalar multiplication;
• And our rules of exponentiation on functions by repeated application also follow the rules of exponentiation for an indeterminate variable.
This means that if we squint a little, if we only look at the expression P(L) as having nothing but those properties, then it behaves exactly as a polynomial. Hence, for all intents and purposes, it "is" a polynomial, and we can manipulate it as such.
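To make the idea of evaluating a polynomial on an operator more tangible, here is a minimal sketch in which linear operators on a 2-dimensional space are plain Python functions, and P(L) is assembled using nothing but pointwise addition, pointwise scalar multiplication, and repeated composition. The helper names (`add`, `scale`, `power`, `poly`), the operator `L` and the polynomial $X^2 - 4X + 3$ are all assumptions chosen for the illustration.

```python
def add(f, g):
    """Pointwise sum: (f + g)(v) = f(v) + g(v)."""
    return lambda v: tuple(x + y for x, y in zip(f(v), g(v)))

def scale(c, f):
    """Pointwise scalar multiple: (c f)(v) = c * f(v)."""
    return lambda v: tuple(c * x for x in f(v))

def power(f, i):
    """f^i as an i-fold composition; f^0 is the identity."""
    def composed(v):
        for _ in range(i):
            v = f(v)
        return v
    return composed

def poly(coeffs, f):
    """P(f) for P(X) = coeffs[0] X^0 + coeffs[1] X^1 + ..."""
    result = scale(coeffs[0], power(f, 0))
    for i, c in enumerate(coeffs[1:], start=1):
        result = add(result, scale(c, power(f, i)))
    return result

# Hypothetical example operator: L(x, y) = (2x + y, x + 2y).
L = lambda v: (2 * v[0] + v[1], v[0] + 2 * v[1])

# P(X) = X^2 - 4X + 3, stored from the lowest degree upwards.
P_of_L = poly([3, -4, 1], L)
print(P_of_L((1, 0)), P_of_L((0, 1)))
```

With this particular choice, P(L) sends both basis vectors to the zero vector, which is the situation described by equation (4).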
So we can apply the fundamental theorem of algebra$^{11}$: we know that we can always factorize polynomials with complex coefficients as such:
\[
(\exists (c, \lambda_1, \ldots, \lambda_n) \in \mathbb{C}^{n+1},\ c \neq 0), \quad P(L) = c \prod_{i=1}^{n} (L - \lambda_i)
\]
But don’t we have a problem here? L is an abstract symbol, and we’re "subtracting" a scalar from it? Well, there are a few implicit elements:
\[
P(L) = c \prod_{i=1}^{n} \big(L^1 + (-\lambda_i) L^0\big)
\]
Let’s replace this new expression for P(L) in our previous equation, which we can do essentially by re-using our previous argument: the rules (addition, scalar multiplication, etc.) to manipulate polynomials are "locally" consistent with the rules to manipulate our (linear) functions:
\[
\left(c \prod_{i=1}^{n} (L^1 - \lambda_i L^0)\right)(v) = 0_V
\]
Note that $L^0$ becomes the identity function, and by using the previous point-wise operations, we can reduce it to:
\[
c \prod_{i=1}^{n} \big(L(v) - \lambda_i\,\mathrm{id}_V(v)\big) = c \prod_{i=1}^{n} \big(L(v) - \lambda_i v\big) = 0_V
\]
Now, $c \neq 0$ by the fundamental theorem of algebra. So we must have:
\[
\prod_{i=1}^{n} \big(L(v) - \lambda_i v\big) = 0_V
\]
\[
L(v) - \lambda_j v = 0_V \Leftrightarrow L(v) = \lambda_j v
\]
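The construction used in this proof can be followed step by step on a small case. The sketch below is only an illustration under stated assumptions: a hypothetical 2 × 2 operator `L`, a starting vector `v` such that v and L(v) are linearly independent, and the coefficient of $L^2$ fixed to 1. In this particular example, the eigenvector that comes out is w = (L − λ₂)(v), obtained by reading the factorized product as a composition of operators.

```python
import cmath

def apply(M, v):
    """Apply the 2x2 matrix M (nested tuples) to the vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

# Hypothetical example operator and starting vector.
L = ((2, 1), (1, 2))
v = (1, 0)
Lv = apply(L, v)
L2v = apply(L, Lv)

# Solve a0*v + a1*L(v) + L^2(v) = 0 by Cramer's rule (a2 is fixed to 1,
# which assumes v and L(v) are linearly independent).
det = v[0] * Lv[1] - v[1] * Lv[0]
a0 = (-L2v[0] * Lv[1] + L2v[1] * Lv[0]) / det
a1 = (-v[0] * L2v[1] + v[1] * L2v[0]) / det

# Roots of X^2 + a1 X + a0, so that P(X) = (X - lam1)(X - lam2).
disc = cmath.sqrt(a1 * a1 - 4 * a0)
lam1, lam2 = (-a1 + disc) / 2, (-a1 - disc) / 2

# (L - lam1)(L - lam2)(v) = 0 and w := (L - lam2)(v) is non-zero,
# hence L(w) = lam1 * w.
w = tuple(x - lam2 * y for x, y in zip(Lv, v))
Lw = apply(L, w)
print(lam1, lam2, w, Lw)
```

Here `Lw` equals `lam1` times `w` componentwise, so `lam1` is indeed an eigenvalue of this operator.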
OK; let me adjust the fundamental theorem a little bit, and let’s prove it.
Theorem 2. Let H : V → V be a Hermitian operator on a finite, n-dimensional vector space V , equipped
with an inner-product ⟨., .⟩.
Then, the eigenvectors of H form an orthogonal basis of V , and the associated eigenvalues are real.
Saying it otherwise, it means that a matrix representation MH of H is diagonalizable, and that two
eigenvectors associated with distinct eigenvalues are orthogonal.
Proof. I’m assuming it is clear to you that the eigenvectors associated to the eigenvalues of a diagonalizable matrix make a basis for the vector space. Again, refer to a linear algebra course for more.
Furthermore, you can refer to the book for a proof of orthogonality of the eigenvectors associated to distinct eigenvalues$^{12}$.
11 [Link]
12 I’m not doing it here, as I’ve avoided the bra-ket notation, and this would force me to talk about dual spaces, and so
on.
Note that I’ve included a mention to characterize the eigenvalues as real numbers: there’s already a proof
in the book, but it comes with almost no effort with the present proof, so I’ve included it anyway.
It remains to prove that the matrix representation $M_H$ of H is diagonalizable (and that the eigenvalues are real). Let’s prove this by induction on the dimension of the vector space. If you’re not familiar with proofs by induction, the idea is as follows:
• Prove that the result is true, say, for n = 1;
• Then, prove that if the result is true for n = k, then the result must be true for n = k + 1.
• If the two previous points hold, then you can combine them: if the first point holds then by applying the second point, the result must be true for n = 1 + 1 = 2. But then by applying the second point again, it must be true that the result holds for n = 2 + 1 = 3.
• And so on: the result is true ∀n ∈ N\{0}.
Induction Assume the result holds for any Hermitian operator H : W → W on a k-dimensional vector
space W over C.
Let V be a k + 1-dimensional vector space over C. By our previous lemma, H : V → V must have at
least one eigenvalue λ ∈ C associated to an eigenvector v ∈ V .
Apply the Gram-Schmidt procedure$^{14}$ to extract from it an (ordered) orthonormal basis $\{b_1, b_2, \ldots, b_{k+1}\}$ of V; note that by construction:
\[
b_1 = \frac{v}{\|v\|}
\]
That’s to say, $b_1$ is still an eigenvector for λ$^{15}$.
Now we’re trying to understand what’s the matrix representation DH of H, in this orthonormal basis.
If you’ve taken the blue pill, you know how to ”read” a matrix:
\[
D_H = \begin{pmatrix} H(b_1) & H(b_2) & \cdots & H(b_{k+1}) \end{pmatrix}
\]
If we can’t select such elements any more, this means we’ve got a basis. Ordering naturally follows from the iteration steps.
14 [Link]
15 $H(b_1) = H(v/\|v\|)$; by linearity of H, this is equal to $\frac{1}{\|v\|}H(v)$. But v is an eigenvector for an eigenvalue λ, so this is equal to $\frac{\lambda}{\|v\|}v = \lambda\frac{v}{\|v\|} = \lambda b_1$.
Where A is a 1 × k matrix (a row vector), and C a k × k matrix. But then H is Hermitian, which means its matrix representation obeys:
\[
D_H = (D_H^T)^* = D_H^\dagger
\]
This implies first that $\lambda = \lambda^*$, i.e. λ is real, and we’ll see shortly, can be considered an eigenvalue, as we can transform $D_H$ into a diagonal matrix with λ on the diagonal.
Second, $A^\dagger = (0\ 0\ \ldots\ 0) = A$, i.e.:
\[
D_H = \begin{pmatrix}
\lambda & 0 & \cdots & 0 \\
0 & & & \\
\vdots & & C & \\
0 & & &
\end{pmatrix}
\]
Third, C = C † . But then, C is a k × k Hermitian matrix, corresponding to a Hermitian operator in
a k-dimensional vector space. Using the induction assumption, it is diagonalizable, with real valued
eigenvalues. Hence DH is diagonalizable, and all its eigenvalues are real.
Remark 6. Observe that we are (were) trying to build a Hermitian operator with eigenvalues +1 and −1.
The fundamental theorem / real spectral theorem, assures us that Hermitian operators are diagonalizable,
hence there exists a basis in which the operator can be represented by a 2 × 2 matrix containing the
eigenvalues on its diagonal:
\[
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
Which is exactly the matrix we’ve found.
But now of course, you’d be wondering: wait a minute, right after this exercise, we’re trying to build σx ,
which also has those same eigenvalues +1 and −1, what’s the catch?
Well, remember the diagonalization process: M diagonalizable means that there’s a basis where it’s diagonal. That is, there’s a change of basis, which is an invertible linear function, which has a matrix representation P, such that the linear operation represented by M in a starting basis is now represented by a diagonal matrix D:
\[
M = P D P^{-1}
\]
Furthermore:
• The elements on the diagonal of D are the eigenvalues;
Hence,
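\[
\sigma_x P = P D \Leftrightarrow \frac{1}{\sqrt2}
\begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
= \frac{1}{\sqrt2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
\]
Solving for the components of $\sigma_x$:
\[
\Leftrightarrow
\begin{cases}
(\sigma_x)_{11} + (\sigma_x)_{12} = 1 \\
(\sigma_x)_{11} - (\sigma_x)_{12} = -1 \\
(\sigma_x)_{21} + (\sigma_x)_{22} = 1 \\
(\sigma_x)_{21} - (\sigma_x)_{22} = 1
\end{cases}
\]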
Which indeed yields the expected Pauli matrix, as described in the book, and computed by the authors
using a different approach:
\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
And obviously, the same can be done for σy : that’s to say that, reassuringly, we reach the same results
using pure linear algebra.
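The same reconstruction can be replayed numerically. The sketch below assumes that the columns of P are the components of |r⟩ and |l⟩ (Eqs. 2.5 and 2.6) and that D carries the eigenvalues +1 and −1 on its diagonal; `matmul` is a hypothetical helper for 2 × 2 matrices.

```python
from math import sqrt

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

s = 1 / sqrt(2)
P = [[s, s], [s, -s]]    # assumed columns: |r> and |l>
D = [[1, 0], [0, -1]]    # eigenvalues +1 and -1 on the diagonal

# Inverse of a 2x2 matrix via its determinant.
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
P_inv = [[P[1][1] / det, -P[0][1] / det],
         [-P[1][0] / det, P[0][0] / det]]

sigma_x = matmul(matmul(P, D), P_inv)
print(sigma_x)
```

Up to rounding, the printed matrix has zeros on the diagonal and ones off the diagonal.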
Let’s recall the context: we’re trying to build an operator that allows us to measure the spin of a particle.
We’ve started by building the components of such an operator, each representing our ability to measure
the spin along any of the 3D axes: σx , σy and σz . Each of them was built from the behavior of the
spin we ”measured”: we extracted from the observed behavior a set of constraints, which allowed us to
determine the components of the spin operator:
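\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \qquad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
Those are individually fine to measure the spin components along the 3 main axes, but we’d like to measure spin components along an arbitrary axis n̂. Such a measurement can be performed by an operator constructed as a linear combination of the previous three matrices:
\[
\sigma_n = \vec{\sigma} \cdot \hat{n} = \begin{pmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \end{pmatrix} \cdot \begin{pmatrix} n_x \\ n_y \\ n_z \end{pmatrix} = n_x\sigma_x + n_y\sigma_y + n_z\sigma_z
\]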
Remark 7. Remember from your linear algebra courses that matrices can be added and scaled: they
form a vector space.
The present exercise involves an arbitrary spin vector, that is, a linear combination of σx , σy and σz that
is of the form:
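\[
\begin{aligned}
\sigma_n &= \sin\theta\,\sigma_x + \cos\theta\,\sigma_z \\
&= \sin\theta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \\
&= \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}
\end{aligned}
\]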
We’re then asked to look for the eigenvalues/eigenvectors of that matrix, that is, we want to understand
what kind of spin (states) can be encoded by such a matrix, and which values they can take.
Let’s recall that to find the eigenvalues/eigenvectors, we need to diagonalize the matrix: assuming it can
be diagonalized, it means that there’s a basis where it can be expressed as a diagonal matrix; the change
of basis is encoded by a linear map, thus a matrix, and so we must be able to find an invertible matrix
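P and a diagonal matrix D such that:$^{16}$
\[
\begin{aligned}
\sigma_n = P D P^{-1} &\Leftrightarrow \sigma_n P = P D \\
&\Leftrightarrow \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}
= \begin{pmatrix} \lambda_1 a & \lambda_2 b \\ \lambda_1 c & \lambda_2 d \end{pmatrix}
\end{aligned}
\]
Where $\lambda_1$ and $\lambda_2$ would be the eigenvalues, associated to the two eigenvectors:
\[
|\lambda_1\rangle = \begin{pmatrix} a \\ c \end{pmatrix}; \qquad |\lambda_2\rangle = \begin{pmatrix} b \\ d \end{pmatrix}
\]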
Which is equivalent to saying, column by column, where 0 is the zero column vector, and $I_2$ the 2 × 2 identity matrix:
\[
(\sigma_n - I_2\lambda_i)|\lambda_i\rangle = 0
\]
This means that the matrix $\sigma_n - I_2\lambda_i$ cannot be invertible: otherwise, multiplying the previous equation on the left by its inverse would yield $|\lambda_i\rangle = 0$, but an eigenvector is non-zero by definition, hence a contradiction, hence it’s not invertible.
16 This is "basic" linear algebra; the authors assume that you’re already familiar with it to some degree (e.g. matrix product); don’t hesitate to refer to a more thorough course on the subject for more. I’ll quickly review here how diagonalization works.
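Non-invertibility of a matrix translates to its determinant being zero, which means the $\lambda_i$ solve the following equation for λ:
\[
\begin{aligned}
\det(\sigma_n - I_2\lambda) = 0 &\Leftrightarrow \begin{vmatrix} \cos\theta - \lambda & \sin\theta \\ \sin\theta & -\cos\theta - \lambda \end{vmatrix} = 0 \\
&\Leftrightarrow -(\cos\theta - \lambda)(\cos\theta + \lambda) - \sin^2\theta = 0 \\
&\Leftrightarrow -(\cos^2\theta - \lambda^2) - \sin^2\theta = 0 \\
&\Leftrightarrow \lambda^2 - \underbrace{(\sin^2\theta + \cos^2\theta)}_{=1} = 0 \\
&\Leftrightarrow \lambda^2 = 1 \\
&\Leftrightarrow \lambda = \begin{cases} 1 = \lambda_1 \\ -1 = \lambda_2 \end{cases}
\end{aligned}
\]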
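The fact that the eigenvalues do not depend on θ can be checked numerically by solving the characteristic polynomial of the 2 × 2 matrix for a few angles. This is a minimal sketch; the function name `eigenvalues` and the sample angles are arbitrary choices.

```python
from math import cos, sin, sqrt, pi, isclose

def eigenvalues(theta):
    """Roots of det(sigma_n - lambda I) = lambda^2 - tr * lambda + det."""
    a, b, c, d = cos(theta), sin(theta), sin(theta), -cos(theta)
    tr, det = a + d, a * d - b * c
    root = sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

for theta in (0.0, pi / 6, pi / 2, 2.0):
    print(theta, eigenvalues(theta))
```

For every angle the trace is 0 and the determinant is −1, so the two roots are +1 and −1 up to rounding.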
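Now that we have our eigenvalues, we can use them to determine the associated eigenvectors, as, remember, they are linked by:
\[
(\forall i \in \{1, 2\}), \quad \sigma_n|\lambda_i\rangle = \lambda_i|\lambda_i\rangle
\]
And so:
\[
\begin{aligned}
\sigma_n|\lambda_1\rangle = \lambda_1|\lambda_1\rangle &\Leftrightarrow \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \begin{pmatrix} a \\ c \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} \\
&\Leftrightarrow \begin{cases} a\cos\theta + c\sin\theta = a \\ a\sin\theta - c\cos\theta = c \end{cases} \\
&\Leftrightarrow \begin{cases} a(\cos\theta - 1) + c\sin\theta = 0 \\ a\sin\theta + c(-\cos\theta - 1) = 0 \end{cases}
\end{aligned}
\]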
Consider the first equation of this system: we’re left with two main cases, depending on whether cos θ = 1 or not. If it is (take θ = 0 for instance, but this would hold modulo 2π), then we must have sin θ = 0, and the first equation gives us nothing of value. The second then simplifies to c = 0, while a remains free: the eigenvectors are then the non-zero multiples of |u⟩.
Let’s now consider the case where cos θ ≠ 1. The system can be rewritten as:
\[
\begin{cases}
a = \dfrac{-c\sin\theta}{\cos\theta - 1} \\
a\sin\theta + c(-\cos\theta - 1) = 0
\end{cases}
\]
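Before moving on, the rewritten system can be tested numerically in the case cos θ ≠ 1: fix c = 1, deduce a from the first equation, normalize, and check that the resulting vector is left unchanged by σₙ. This is a sketch; `eigvec_plus`, `sigma_n_apply`, the choice c = 1 and the sample angle are assumptions made for the illustration.

```python
from math import cos, sin, hypot, isclose

def eigvec_plus(theta):
    """Normalized solution of the system for lambda_1 = 1 (requires cos(theta) != 1)."""
    c = 1.0
    a = -c * sin(theta) / (cos(theta) - 1)
    n = hypot(a, c)
    return a / n, c / n

def sigma_n_apply(theta, v):
    """Apply sigma_n = [[cos t, sin t], [sin t, -cos t]] to v = (a, c)."""
    a, c = v
    return (a * cos(theta) + c * sin(theta), a * sin(theta) - c * cos(theta))

theta = 1.2
v = eigvec_plus(theta)
print(v, sigma_n_apply(theta, v))
```

Both printed pairs agree up to rounding, so the second equation of the system is automatically satisfied by this solution of the first.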
Instead, let’s try to use and understand the authors’ hint, which is to look for eigenvectors of the form:
\[
\begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix}
\]
Why is this a reasonable choice? Let’s start by answering why we need a single parameter α: it corresponds to the single degree of freedom we have in this case. Let’s recall the two equivalent ways of counting the number of degrees of freedom that were given in subsection 2.5:
1. First, point the apparatus in any direction in the xz-plane (remember for comparison that in subsection 2.5, we were allowed to take a direction in the xyz-space). A single angle is sufficient to encode this single direction (2 were needed in the xyz space). Furthermore, note that the unit vector pointing along this angle has coordinates cos α and sin α in the xz-plane, respectively in the x and z directions.
Note that we’re really capturing directions: a point in R² contains too much information, as we want to identify all the points which share the same direction;
2. The second approach was to say that the general form of the spin state in xyz-space was given by
a (complex) linear combination αu |u⟩ + αd |d⟩. But, recall the definition of |l⟩ and |r⟩, the vectors
associated with the x-direction:
\[
|r\rangle = \frac{1}{\sqrt2}|u\rangle + \frac{1}{\sqrt2}|d\rangle; \qquad |l\rangle = \frac{1}{\sqrt2}|u\rangle - \frac{1}{\sqrt2}|d\rangle
\]
They didn’t involve complex numbers. We only started to need complex numbers (and have proven in exercise L02E03 that they were mandatory) once we had enough constraints to cover the three spatial directions (i.e., when dealing with |i⟩ and |o⟩, after having already established the two other pairs of orthogonal vectors).
That’s to say, we don’t need complex numbers when we only have two directions, so actually, the
general form of a spin in a plane is a real linear combination, which cuts down the number of
degrees of freedom to 2.
Normalization adds yet another constraint, which cuts us down to a single degree of freedom. But, shouldn’t the phase ambiguity bring us to . . . zero degrees of freedom? What are we missing?
Well, the idea of phase ambiguity was that we could multiply the vectors by $\exp(i\theta) = \cos\theta + i\sin\theta$, for θ ∈ R. But we saw that we actually don’t need complex numbers when we’re in a 2D-plane, which means sin θ = 0, and thus forces cos θ = ±1 (a mere overall sign), so the phase ambiguity doesn’t impact the number of degrees of freedom;
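3. Here’s a third argument that we’ll re-use in the next exercise$^{18}$. Consider as a first guess an eigenvector of the form
\[
\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}; \quad (z_1, z_2) \in \mathbb{C}^2
\]
We can put both complex numbers in exponential form:
\[
\begin{pmatrix} r_1\exp(i\phi_1) \\ r_2\exp(i\phi_2) \end{pmatrix} = \exp(i\phi_1) \begin{pmatrix} r_1 \\ r_2\exp(i(\phi_2 - \phi_1)) \end{pmatrix}; \quad (r_1, r_2, \phi_1, \phi_2) \in \mathbb{R}^4
\]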
We can then ignore the general phase factor $\exp(i\phi_1)$, e.g. choose $\phi_1 = 0$. Furthermore, we’ll want the (eigen)vector to be normalized (remember, the eigenvectors associated to the eigenvalues of a Hermitian operator make an orthonormal basis), i.e.:
\[
r_1^2 + r_2^2 = 1
\]
But we’re then losing a degree of freedom, meaning, $r_1$ and $r_2$ are not independent from each other: we can express them both in terms of a single parameter, as long as the previous equation is satisfied. We can choose, as it’ll make computation easier, $r_1 = \cos\alpha$, $r_2 = \sin\alpha$, with α ∈ R.
Which brings us to:
\[
\begin{pmatrix} \cos\alpha \\ \exp(i\phi_2)\sin\alpha \end{pmatrix}
\]
If $\phi_2$ varies, then our eigenvector isn’t restricted to a plane. But, because our eigenvector will be an eigenvector of a Hermitian matrix, we know by the real spectral theorem$^{19}$ that it must be a (an orthonormal basis) vector of the xz-plane. So we can choose $\phi_2 = 0$ to restrict it to a plane.
18 Source: [Link]
Note that the form of this vector is naturally normalized (cos2 α + sin2 α = 1). Recall that it must be
normalized because this column vector actually corresponds to:
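\[ \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix} = \cos\alpha \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sin\alpha \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \cos\alpha\,|u\rangle + \sin\alpha\,|d\rangle \]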
And the square of the magnitude of cos α encodes the probability for the measured value to correspond
to |u⟩ while the square of the magnitude of sin α encodes the probability of the system to be measured
in state |d⟩, and both states are orthogonal: the total probability must be 1.
Alright, let’s get to actually finding the eigenvectors associated to our eigenvalues. We can use the
same trick as in the previous exercise [Link]: because of the diagonalization process, we have the
following relation:
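\[ \sigma_n = P D P^{-1} \Leftrightarrow \sigma_n P = P D \underbrace{(P^{-1} P)}_{:= I_2} = P D = P \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
\[ \Leftrightarrow \underbrace{\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}}_{=\sigma_n} \underbrace{\begin{pmatrix} \cos\alpha & \cos\beta \\ \sin\alpha & \sin\beta \end{pmatrix}}_{=P} = \begin{pmatrix} \cos\alpha & \cos\beta \\ \sin\alpha & \sin\beta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\cos\beta \\ \sin\alpha & -\sin\beta \end{pmatrix} \]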
Where the columns of P are the eigenvectors associated to the eigenvalues 1 and −1. Both have the
same ”form”, as previously explained. We could have used the same approach as in the book (see the
previous exercise), but you’ll end up with the same kind of system in the end. Let’s perform the matrix
multiplication on the left and extract two equations from the four we can get by identifying the matrix
components:
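\[ \begin{pmatrix} \cos\theta\cos\alpha + \sin\theta\sin\alpha & \cos\theta\cos\beta + \sin\theta\sin\beta \\ \sin\theta\cos\alpha - \cos\theta\sin\alpha & \sin\theta\cos\beta - \cos\theta\sin\beta \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\cos\beta \\ \sin\alpha & -\sin\beta \end{pmatrix} \]
\[ \Leftrightarrow \begin{cases} \cos\theta\cos\alpha + \sin\theta\sin\alpha = \cos\alpha \\ \cos\theta\cos\beta + \sin\theta\sin\beta = -\cos\beta \end{cases} \]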
Remark 8. Strictly speaking, we don’t really know if this is equivalent so far, as we’re just extracting
two equations from potentially four distinct equations. For correctness’ sake, we could (I won’t, out of
laziness) verify that the solution we find for those two equations also solves the two remaining
equations.
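The following trigonometric identities20 :
\[ \cos\theta\cos\alpha = \frac{1}{2}\big(\cos(\theta-\alpha) + \cos(\theta+\alpha)\big); \quad \sin\theta\sin\alpha = \frac{1}{2}\big(\cos(\theta-\alpha) - \cos(\theta+\alpha)\big); \quad \cos(\alpha - \pi) = -\cos\alpha \]
allow us to rewrite the previous system as
\[ \begin{cases} \frac{1}{2}\big(\cos(\theta-\alpha) + \cos(\theta+\alpha) + \cos(\theta-\alpha) - \cos(\theta+\alpha)\big) = \cos\alpha \\ \frac{1}{2}\big(\cos(\theta-\beta) + \cos(\theta+\beta) + \cos(\theta-\beta) - \cos(\theta+\beta)\big) = \cos(\beta-\pi) \end{cases} \]
\[ \Leftrightarrow \begin{cases} \cos(\theta-\alpha) = \cos\alpha \\ \cos(\theta-\beta) = \cos(\beta-\pi) \end{cases} \]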
19 [Link]
20 Look around for the proofs if needed; formulas can be found on Wikipedia
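And with the following identities:
\[ \cos\left(\alpha + \frac{\pi}{2}\right) = -\sin\alpha; \quad \sin\left(\alpha + \frac{\pi}{2}\right) = \cos\alpha \]
We reach:
\[ \begin{cases} \theta - \alpha = \alpha \\ \theta - \beta = \beta - \pi \end{cases} \Rightarrow \begin{cases} \alpha = \frac{\theta}{2} \\ \beta = \frac{1}{2}(\theta + \pi) \end{cases} \]
\[ \Rightarrow \begin{cases} |+1\rangle = \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix} \\ |-1\rangle = \begin{pmatrix} \cos\beta \\ \sin\beta \end{pmatrix} = \begin{pmatrix} \cos(\theta/2 + \pi/2) \\ \sin(\theta/2 + \pi/2) \end{pmatrix} = \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix} \end{cases} \]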
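The result above can be checked numerically. The following is a minimal sketch in plain Python (the function names are mine, not the book's): it applies σn to both candidate eigenvectors and measures how far σn|v⟩ is from λ|v⟩ over both components, which also covers the two matrix equations that Remark 8 left unchecked.

```python
import math

def sigma_n_xz(theta):
    """Spin operator for a direction in the xz-plane, at angle theta from z."""
    return [[math.cos(theta), math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]

def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def eigenvectors_xz(theta):
    """Candidate eigenvectors for the eigenvalues +1 and -1."""
    plus = [math.cos(theta / 2), math.sin(theta / 2)]
    minus = [-math.sin(theta / 2), math.cos(theta / 2)]
    return plus, minus

def residual(theta):
    """Largest component of sigma_n|v> - lambda|v> over both eigenpairs."""
    m = sigma_n_xz(theta)
    plus, minus = eigenvectors_xz(theta)
    worst = 0.0
    for lam, v in ((+1, plus), (-1, minus)):
        w = apply(m, v)
        worst = max(worst, *(abs(w[k] - lam * v[k]) for k in range(2)))
    return worst
```

A residual at the level of floating-point rounding, for any θ, is what we expect if the two vectors are indeed eigenvectors.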
Exercise 9. Let nz = cos θ, nx = sin θ cos ϕ and ny = sin θ sin ϕ. Angles θ and ϕ are defined according
to the usual conventions for spherical coordinates (Fig. 3.2). Compute the eigenvalues and eigenvectors
for the matrix of Eq. 3.23.
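Let’s recall Eq. 3.23, which is the general form of the spin 3-vector operator:
\[ \sigma_n = \begin{pmatrix} n_z & (n_x - i n_y) \\ (n_x + i n_y) & -n_z \end{pmatrix} = \begin{pmatrix} \cos\theta & (\sin\theta\cos\phi - i \sin\theta\sin\phi) \\ (\sin\theta\cos\phi + i \sin\theta\sin\phi) & -\cos\theta \end{pmatrix} \]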
Observe (e.g. from the trigonometric circle) that:
cos θ = cos(−θ); sin θ = − sin(−θ)
Hence:
exp(−iθ) := cos(−θ) + i sin(−θ) = cos θ − i sin θ
And we can simplify our previous expression of σn to:
\[ \sigma_n = \begin{pmatrix} \cos\theta & \exp(-i\phi)\sin\theta \\ \exp(i\phi)\sin\theta & -\cos\theta \end{pmatrix} \]
Note that as we’re now in the general case, we indeed have two degrees of freedom, encoded by the two
angles θ and ϕ; the reason was explained in subsection 2.5.
We’re still confronted with a spin operator: we expect the eigenvalues to be +1 and −121 . But let’s check
this first: an eigenvector |λ⟩ associated to an eigenvalue λ must obey:
\[ \sigma_n|\lambda\rangle = \lambda|\lambda\rangle \Leftrightarrow \sigma_n|\lambda\rangle - \lambda|\lambda\rangle = 0 \Leftrightarrow (\sigma_n - I_2\lambda)|\lambda\rangle = 0 \]
But eigenvectors are non-zero, hence the matrix σn − I2 λ sends a non-zero vector to the zero vector,
and so it cannot be invertible22 . This translates to a condition on the determinant:
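\[ \det(\sigma_n - I_2\lambda) = 0 \Leftrightarrow \begin{vmatrix} \cos\theta - \lambda & \exp(-i\phi)\sin\theta \\ \exp(i\phi)\sin\theta & -\cos\theta - \lambda \end{vmatrix} = 0 \]
\[ \Leftrightarrow -(\cos\theta - \lambda)(\cos\theta + \lambda) - \underbrace{\exp(i\phi)\exp(-i\phi)}_{=1}\sin^2\theta = 0 \]
\[ \Leftrightarrow -(\cos^2\theta - \lambda^2) - \sin^2\theta = 0 \]
\[ \Leftrightarrow \lambda^2 - \underbrace{(\sin^2\theta + \cos^2\theta)}_{=1} = 0 \]
\[ \Leftrightarrow \lambda^2 = 1 \Leftrightarrow \lambda = \begin{cases} +1 \\ -1 \end{cases} \]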
21 Remember from the real spectral theorem, or as the authors call it, the fundamental theorem, that because we have a
Hermitian matrix, we know it’s diagonalizable, that its eigenvalues are real, and that the corresponding eigenvectors form
an orthogonal basis
22 Again for otherwise, as recalled in L03E03, multiply both sides of the equation by its inverse, get an identity on the
The remaining difficulty is then in finding the eigenvectors. We can use the following argument23 .
We can then ignore the general phase factor exp(iϕ1 ), e.g. set ϕ1 = 0. Furthermore, we want the vector
to be normalized (this is an eigenvector associated to the eigenvalue of a Hermitian operator: it must be
normalized per the real spectral theorem), i.e.
\[ r_1^2 + r_2^2 = 1 \]
But we’re then losing a degree of freedom, meaning, r1 and r2 are not independent from each other: we
can express them both in terms of a single parameter, as long as the previous equation is satisfied. We
can choose, as it’ll make computation easier, r1 = cos α, r2 = sin α, with α ∈ R. Finally, let’s rename
ϕ2 = ϕα 24 , which brings us to consider eigenvectors of the form:
\[ \begin{pmatrix} \cos\alpha \\ \exp(i\phi_\alpha)\sin\alpha \end{pmatrix} \]
As for the previous exercise, we can use two different parameters α and β, one for each eigenvector. Again,
because of the diagonalization process, we have the following relation
\[ \sigma_n = P D P^{-1} \Leftrightarrow \sigma_n P = P D \underbrace{(P^{-1} P)}_{:= I_2} = P D = P \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
But the columns of P must contain our eigenvectors, so this is equivalent to:
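\[ \underbrace{\begin{pmatrix} \cos\theta & \exp(-i\phi)\sin\theta \\ \exp(i\phi)\sin\theta & -\cos\theta \end{pmatrix}}_{=\sigma_n} \underbrace{\begin{pmatrix} \cos\alpha & \cos\beta \\ \exp(i\phi_\alpha)\sin\alpha & \exp(i\phi_\beta)\sin\beta \end{pmatrix}}_{=P} = \begin{pmatrix} \cos\alpha & \cos\beta \\ \exp(i\phi_\alpha)\sin\alpha & \exp(i\phi_\beta)\sin\beta \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
\[ = \begin{pmatrix} \cos\alpha & -\cos\beta \\ \exp(i\phi_\alpha)\sin\alpha & -\exp(i\phi_\beta)\sin\beta \end{pmatrix} \]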
Let’s perform the matrix multiplication on the left:
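\[ \begin{pmatrix} \cos\theta\cos\alpha + \exp(i(\phi_\alpha - \phi))\sin\theta\sin\alpha & \cos\theta\cos\beta + \exp(i(\phi_\beta - \phi))\sin\theta\sin\beta \\ \exp(i\phi)\sin\theta\cos\alpha - \exp(i\phi_\alpha)\cos\theta\sin\alpha & \exp(i\phi)\sin\theta\cos\beta - \exp(i\phi_\beta)\cos\theta\sin\beta \end{pmatrix} \]
\[ = \begin{pmatrix} \cos\alpha & -\cos\beta \\ \exp(i\phi_\alpha)\sin\alpha & -\exp(i\phi_\beta)\sin\beta \end{pmatrix} \]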
From which we can extract the following system of equations:
\[ \begin{cases} \cos\theta\cos\alpha + \exp(i(\phi_\alpha - \phi))\sin\theta\sin\alpha = \cos\alpha \\ \cos\theta\cos\beta + \exp(i(\phi_\beta - \phi))\sin\theta\sin\beta = -\cos\beta \end{cases} \]
Remark 9. As for the previous exercise, I leave it to you to check that the solution we’ll find for this
system also solves the two other omitted equations.
23 [Link]
24 Note that I’m not yet identifying ϕα with ϕ; this will come naturally later on
It’s tempting to set ϕ = ϕα = ϕβ , but can we do so? Well, we know the two eigenvectors will have to be
orthogonal: this adds an additional constraint, which decreases our degrees of freedom by one, meaning
there’s one superfluous variable in {α, β, ϕα , ϕβ }. We can choose to implement this constraint by setting
ϕα = ϕβ .
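From there, we can indeed set ϕα = ϕβ = ϕ, as this allows us to solve the equations for α and β more
easily:
\[ \Leftrightarrow \begin{cases} \cos\theta\cos\alpha + \sin\theta\sin\alpha = \cos\alpha \\ \cos\theta\cos\beta + \sin\theta\sin\beta = -\cos\beta \end{cases} \]
Which is exactly the same system we had for the previous exercise, which was solved by:
\[ \begin{cases} \alpha = \theta/2 \\ \beta = \frac{1}{2}(\theta + \pi) \end{cases} \]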
Alright, let’s make the same verifications the authors did in the book after the previous exercise. First,
we get the expected eigenvalues +1, −1, which are the only two eigenvalues we have for a spin operator.
Then the two eigenvectors must be orthogonal, indeed (I only do it one way; the other is trivially similar):
\[ \langle +1|-1\rangle = \begin{pmatrix} \cos(\theta/2) & \exp(-i\phi)\sin(\theta/2) \end{pmatrix} \begin{pmatrix} -\sin(\theta/2) \\ \exp(i\phi)\cos(\theta/2) \end{pmatrix} = -\cos(\theta/2)\sin(\theta/2) + \sin(\theta/2)\cos(\theta/2) = 0 \]
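As a sanity check on the general case, here is a small sketch in plain Python (helper names are my own) that builds σn from θ and ϕ, applies it to both eigenvectors found above, and also evaluates their inner product. It verifies all four matrix equations, including the two left aside in Remark 9.

```python
import cmath
import math

def sigma_n(theta, phi):
    """General spin operator for the direction given by spherical angles."""
    return [[math.cos(theta), cmath.exp(-1j * phi) * math.sin(theta)],
            [cmath.exp(1j * phi) * math.sin(theta), -math.cos(theta)]]

def eigenpairs(theta, phi):
    """(eigenvalue, eigenvector) pairs derived in the text."""
    plus = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]
    minus = [-math.sin(theta / 2), cmath.exp(1j * phi) * math.cos(theta / 2)]
    return [(+1, plus), (-1, minus)]

def inner(bra, ket):
    """Inner product <bra|ket>, conjugating the components of the bra."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def eigen_residual(theta, phi):
    """Largest component of sigma_n|v> - lambda|v> over both eigenpairs."""
    m = sigma_n(theta, phi)
    worst = 0.0
    for lam, v in eigenpairs(theta, phi):
        for row in range(2):
            image = m[row][0] * v[0] + m[row][1] * v[1]
            worst = max(worst, abs(image - lam * v[row]))
    return worst

def overlap(theta, phi):
    """Modulus of <+1|-1>, expected to vanish."""
    (_, plus), (_, minus) = eigenpairs(theta, phi)
    return abs(inner(plus, minus))
```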
The first one, and the simplest, is to observe that if we consider n̂ in a frame of reference where m̂ acts
as our z-axis, then we’re essentially in the case of our previous exercise: we’ve prepared a spin in the
25 Don’t hesitate to get back to the definition of |u⟩ and that of the inner-product if this isn’t clear enough.
”up” state (now corresponding to a state where σm = +1), we’ve moved our apparatus away from m̂ by
a certain angle θ26 , and we know from the previous exercise that the probability of measuring a +1
after aligning our apparatus with the n̂ axis is now
\[ P(+1) = \cos^2\frac{\theta}{2} \]
Which is exactly what we wanted to show (the answer is given in the book by the authors, after the
exercise).
I’ll only draft the second approach, as I expect it to be more time consuming27 . The idea is not to
rely on the previous observation, and to consider that we’ve prepared the spin so that σm = +1, which
means the state of the system is the eigenvector corresponding to this eigenvalue, which we know from
the previous exercise, with θm the angle between the z-axis and m̂, and ϕm the angle between the x-axis
and the projection of m̂ on the xy-plane:
\[ |+1_m\rangle = \begin{pmatrix} \cos(\theta_m/2) \\ \exp(i\phi_m)\sin(\theta_m/2) \end{pmatrix} \]
If we then align the apparatus in the n̂ direction, with corresponding angles θn and ϕn , which are relative
to the z-axis, not to m̂, we know, by the same result, that the eigenvector corresponding to
measuring a +1 in the n̂ direction is:
\[ |+1_n\rangle = \begin{pmatrix} \cos(\theta_n/2) \\ \exp(i\phi_n)\sin(\theta_n/2) \end{pmatrix} \]
Then, the probability to measure a +1 is given, again by using the fourth principle:
P (+1) = | ⟨+1m |+1n ⟩ |2
We would then need to develop the inner-product between the two state vectors, and find a way to
identify it with the half-angle between n̂ and m̂.
All the difficulty is then in expressing this half-angle in terms of our four angles (θm , ϕm , θn , ϕn ). I
suppose we get some insightful elements by cleverly:
• Expressing m̂ and n̂ both in rectangular coordinates;
• Observing that by the regular 3-vector dot product, n̂ · m̂ = ∥n̂∥∥m̂∥ cos θmn = cos θmn (where θmn
is the angle between m̂ and n̂);
• Observing that \( \cos\frac{\theta_{mn}}{2} = \frac{1}{\sqrt{2}}\sqrt{\hat{n}\cdot(\hat{n} + \hat{m})} \) (again from the regular 3-vector dot product, as n̂ + m̂
will be a (non-unit) vector bisecting θmn 28 )
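Before attempting the algebra, the second approach can at least be tested numerically. The sketch below (plain Python, with hypothetical helper names) computes |⟨+1m|+1n⟩|² from the two state vectors, and compares it with cos²(θmn/2) = (1 + n̂ · m̂)/2, where n̂ · m̂ is evaluated in rectangular coordinates as suggested by the first two bullet points.

```python
import cmath
import math

def unit_vector(theta, phi):
    """Rectangular coordinates of the unit vector with spherical angles."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def plus_state(theta, phi):
    """Eigenvector for +1 of the spin operator along (theta, phi)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def inner(bra, ket):
    """Inner product <bra|ket>, conjugating the components of the bra."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def prob_plus(theta_m, phi_m, theta_n, phi_n):
    """Probability |<+1_m|+1_n>|^2 computed from the state vectors."""
    overlap = inner(plus_state(theta_m, phi_m), plus_state(theta_n, phi_n))
    return abs(overlap) ** 2

def half_angle_prediction(theta_m, phi_m, theta_n, phi_n):
    """cos^2(theta_mn / 2), using cos(theta_mn) = m . n."""
    m = unit_vector(theta_m, phi_m)
    n = unit_vector(theta_n, phi_n)
    cos_mn = sum(a * b for a, b in zip(m, n))
    return (1 + cos_mn) / 2
```

If both functions agree for arbitrary angles, this is good evidence that the inner product can indeed be identified with the half-angle between n̂ and m̂; it is of course not a proof.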
conservation of overlaps. It expresses the fact that the logical relation between states is preserved with
time.
The inner-product has been defined as the product of a bra and a ket. So the inner-product of U |A⟩
and U |B⟩ is the product of e.g. the bra associated to U |A⟩ and U |B⟩. But in section 3.1.5 of the book,
we’ve established that:
|C⟩ = M |D⟩ ⇔ ⟨C| = ⟨D|M †
Hence the inner-product we’re looking for is:
\[ \langle A|\underbrace{U^\dagger U}_{I}|B\rangle = \langle A|B\rangle \]
Remark 10. The terminology is a bit confusing: we’re talking about the inner-product of two kets, while
we’ve defined the inner-product to be an operation between a bra and a ket. Overall, the bra-ket notation
makes things a little more complicated than just having to deal with an inner-product space.
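\[ \begin{aligned} &= i\left(M^\dagger L^\dagger - L^\dagger M^\dagger\right) && \text{(†’s definition)} \\ &= i(ML - LM) && (L = L^\dagger;\ M = M^\dagger) \\ &= i[M, L] \end{aligned} \]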
Exercise 13. Go back to the definition of Poisson brackets in Volume I and check that the identification
in Eq. 4.21 is dimensionally consistent. Show that without the factor ℏ, it would not be.
Let’s recall first Eq. 4.21, where [., .] is the commutator and {., .} the Poisson brackets:
[F, G] ⇐⇒ iℏ{F, G}
The Poisson brackets are defined in Volume I, Eq. (9) at the end of Lecture 9 (The Phase Space Fluid
and the Gibbs-Liouville Theorem), as:
\[ \{F, G\} := \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \]
Where the pi are the generalized momenta, and the qi are the generalized coordinates. Recall that a
momentum is typically that of a mass in motion (a mass times a velocity), while the coordinates are
simply distances to an origin:
\[ [p_i] = \mathrm{kg \cdot m \cdot s^{-1}}; \quad [q_i] = \mathrm{m} \]
For clarity, let’s rewrite one of those partial derivatives in terms of a limit:
\[ \frac{\partial F}{\partial q_i} = \lim_{\epsilon \to 0} \frac{F(q_i + \epsilon) - F(q_i)}{\epsilon} \]
First, ϵ must be of the same dimension as qi in this case, for otherwise qi + ϵ is ill-defined; more generally
it’ll have the same dimension as the differentiation variable.
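Second, observe that, again because otherwise we’d be adding carrots and potatoes:
\[ \left[ \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \right] = \left[ \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right], \text{ for any arbitrary } i \text{ that is} \]
But then,
\[ [i\hbar\{F, G\}] = \left[ \hbar\left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \right] = [\hbar]\left[\frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i}\right] - [\hbar]\left[\frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i}\right] \]
We know [ℏ] = kg·m²·s⁻¹ = [qi pi ], and if we make the limits explicit as we did before, it remains from
the previous expression:
\[ [i\hbar\{F, G\}] = [FG] \]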
On the other side:
[[F, G]] = [F G − GF ]
For F G − GF to be well defined, it must be that [F G] = [GF ]. And so we’re done:
[F, G] = F G − GF
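The bookkeeping above can be mechanized. Here is a minimal sketch in plain Python, where a dimension is represented (this representation is my own choice, not the book's) as a tuple of exponents over (kg, m, s). It checks that multiplying the dimension of the Poisson bracket by [ℏ] gives back [FG], the dimension of the commutator, and that the two differ when the factor ℏ is dropped.

```python
# A dimension is a tuple of exponents over the base units (kg, m, s).
HBAR = (1, 2, -1)       # kg.m^2.s^-1
COORD = (0, 1, 0)       # [q_i] = m
MOMENTUM = (1, 1, -1)   # [p_i] = kg.m.s^-1

def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

def poisson_dim(f, g):
    """[{F, G}] = [F][G] / ([q_i][p_i]), from the partial derivatives."""
    return div(mul(f, g), mul(COORD, MOMENTUM))

def commutator_dim(f, g):
    """[[F, G]] = [FG - GF] = [F][G]."""
    return mul(f, g)

def consistent_with_hbar(f, g):
    """True when [hbar {F, G}] equals the dimension of the commutator."""
    return mul(HBAR, poisson_dim(f, g)) == commutator_dim(f, g)

def consistent_without_hbar(f, g):
    """True when [{F, G}] alone equals the dimension of the commutator."""
    return poisson_dim(f, g) == commutator_dim(f, g)
```

For instance, with F an energy, of dimension (1, 2, -2), and G a coordinate, the first check holds while the second fails, as the exercise claims.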
And finally, let’s recall the Pauli matrices σx , σy and σz (from Eqs. 3.20, at the end of section 3.4)
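\[ \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \]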
4.12 Solving the Schrödinger Equation
Exercise 15. Take any unit 3-vector n and form the operator
\[ H = \frac{\hbar\omega}{2}\,\sigma \cdot n \]
Find the energy eigenvalues and eigenvectors by solving the time-independent Schrödinger equation.
Recall that Eq. 3.23 gives σ · n in component form.
Let’s recall Eq. 3.23, which is the general form of the spin 3-vector operator:
\[ \sigma_n = \sigma \cdot n = \begin{pmatrix} n_z & (n_x - i n_y) \\ (n_x + i n_y) & -n_z \end{pmatrix} \]
σn |Fj ⟩ = Fj |Fj ⟩
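But if we multiply both sides of this equation by ℏω/2, we get exactly the equation we want to solve:
\[ \underbrace{\frac{\hbar\omega}{2}\sigma_n}_{H}|F_j\rangle = \frac{\hbar\omega}{2}F_j|F_j\rangle \]
Multiplying the equation by a constant doesn’t change the eigenvectors: they still are the only solutions,
but the associated eigenvalues are now different:
\[ \lambda_1 = \frac{\hbar\omega}{2}; \quad |\lambda_1\rangle = \begin{pmatrix} \cos(\theta/2) \\ \exp(i\phi)\sin(\theta/2) \end{pmatrix} \]
\[ \lambda_2 = -\frac{\hbar\omega}{2}; \quad |\lambda_2\rangle = \begin{pmatrix} -\sin(\theta/2) \\ \exp(i\phi)\cos(\theta/2) \end{pmatrix} \]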
After time t, an experiment is done to measure σy . What are the possible outcomes and what are the
probabilities for those outcomes?
Congratulations! You have now solved a real quantum mechanics problem for an experiment that can
actually be carried out in the laboratory. Feel free to pat yourself on the back.
29 That’s quite a fancy name for describing the eigenvectors of an operator, by comparison with the ”iconic” Schrödinger
equation. . .
Remark 11. There’s a typo in the statement of this exercise: the final observable is said first to be σx
and then σy . The French version of the book uses σy for both, so that’s what I’ll do here.
3. Find the eigenvalues and eigenvectors of H by solving the time-independent Schrödinger equation,
H|Ej ⟩ = Ej |Ej ⟩
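I don’t recall us already diagonalizing σz before, so let’s do it, but I’ll be shorter than usual. The
eigenvalues are given by the non-invertibility condition of H − Iλ, as the solutions of
\[ \det(H - I\lambda) = \left(\frac{\omega\hbar}{2} - \lambda\right)\left(-\frac{\omega\hbar}{2} - \lambda\right) = 0 \]
Hence the two eigenvalues:
\[ E_1 = \frac{\omega\hbar}{2}; \quad E_2 = -\frac{\omega\hbar}{2} \]
From which we can derive the two eigenvectors:
\[ \underbrace{\frac{\omega\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}}_{H}|E_1\rangle = \frac{\omega\hbar}{2}|E_1\rangle \]
Similarly for |E2 ⟩, assume a general form of (c d)T , this yields the following system:
\[ \begin{cases} c = -c \\ -d = -d \end{cases} \Leftrightarrow c = 0 \]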
Remark 12. I’m not sure why we have an extra degree of freedom via the signs on the non-zero
component of the eigenvectors; I can’t think of an extra constraint.
4. Use the initial state-vector |Ψ(0)⟩, along with the eigenvectors |Ej ⟩ from step 3, to calculate the
initial coefficients αj (0):
αj (0) = ⟨Ej |Ψ(0)⟩
That’s an elementary computation:
α1 (0) = 1; α2 (0) = 0
5. Rewrite |Ψ(0)⟩ in terms of the eigenvectors |Ej ⟩ and the initial coefficients αj (0):
\[ |\Psi(0)\rangle = \sum_j \alpha_j(0)|E_j\rangle \]
6. In the above equation, replace each αj (0) with αj (t) to capture its time-dependence. As a result,
|Ψ(0)⟩ becomes |Ψ(t)⟩:
\[ |\Psi(t)\rangle = \sum_j \alpha_j(t)|E_j\rangle \]
Naturally:
|Ψ(t)⟩ = α1 (t)|E1 ⟩ + α2 (t)|E2 ⟩
7. Using Eq. 4.30 (see footnote 30), replace each αj (t) with αj (0) exp(−(i/ℏ) Ej t):
\[ |\Psi(t)\rangle = \sum_j \alpha_j(0)\exp\left(-\frac{i}{\hbar}E_j t\right)|E_j\rangle \]
\[ |\Psi(t)\rangle = \exp\left(-\frac{i}{\hbar}t\right)|u\rangle \]
OK, then the idea is that if we have an observable L, the probability to measure λ (where λ is then an
eigenvalue of L) is given by:
Pλ (t) = | ⟨λ|Ψ(t)⟩ |2
The authors are asking us to consider as an observable L = σy . Recall:
\[ \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \]
This is a matrix corresponding to the spin observable following the y-axis: we must expect its eigenvalues
to be ±1 and its eigenvectors to be |i⟩ and |o⟩, but let’s compute them all anyway for practice:
det(σy − Iλ) = λ2 + i2 = 0 ⇔ λ = ±1
For the eigenvectors, again we can assume a general form and solve the corresponding system of equations:
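\[ \underbrace{\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}}_{\sigma_y}\begin{pmatrix} a \\ b \end{pmatrix} = (+1)\begin{pmatrix} a \\ b \end{pmatrix} \Leftrightarrow \begin{cases} -ib = a \\ ia = b \end{cases} \]
Both equations are actually equivalent (multiply the first one by i to get the second). We furthermore
have an additional constraint as the eigenvectors are supposed to be of unit norm, which yields:
\[ |E_1\rangle = \begin{pmatrix} a \\ ia \end{pmatrix} \text{ and } a^2 + (ia)(-ia) = 1 \Leftrightarrow |E_1\rangle = \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \end{pmatrix} = |i\rangle \]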
30 This equation corresponds exactly to what this step describes
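Similarly:
\[ \underbrace{\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}}_{\sigma_y}\begin{pmatrix} c \\ d \end{pmatrix} = (-1)\begin{pmatrix} c \\ d \end{pmatrix} \Leftrightarrow \begin{cases} -id = -c \\ ic = -d \end{cases} \]
Again, the two equations are equivalent (multiply the first by −i to get the second one), but we have an
additional constraint, as the vector must be of unit norm. In the end, this yields:
\[ |E_2\rangle = \begin{pmatrix} c \\ -ic \end{pmatrix} \text{ and } c^2 + (ic)(-ic) = 1 \Leftrightarrow |E_2\rangle = \begin{pmatrix} 1/\sqrt{2} \\ -i/\sqrt{2} \end{pmatrix} = |o\rangle \]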
We may now apply our previous probability formula (Principle 4):
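\[ P_{+1}(t) = |\langle i|\Psi(t)\rangle|^2 = \left|\frac{1}{\sqrt{2}}\exp\left(-\frac{it}{\hbar}\right)\right|^2 = \frac{1}{2} \]
And either because the sum of probabilities must be 1, or by explicit computation:
\[ P_{-1}(t) = |\langle o|\Psi(t)\rangle|^2 = \left|\frac{1}{\sqrt{2}}\exp\left(-\frac{it}{\hbar}\right)\right|^2 = \frac{1}{2} \]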
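The whole recipe can be replayed numerically. This is a rough sketch in plain Python, under assumptions of mine: the phase is written with the energy E1 = ℏω/2 made explicit, and the default values ω = 1, ℏ = 1 are arbitrary placeholders. Since the time-dependence is a pure phase, its modulus is 1 and both probabilities should come out as 1/2 whatever t is.

```python
import cmath
import math

S = 1 / math.sqrt(2)
KET_IN = (S, 1j * S)     # |i>, eigenvector of sigma_y for +1
KET_OUT = (S, -1j * S)   # |o>, eigenvector of sigma_y for -1

def psi(t, omega=1.0, hbar=1.0):
    """State at time t, starting from |u>, an eigenvector of H."""
    energy_up = hbar * omega / 2  # assumed energy of |u>
    return (cmath.exp(-1j * energy_up * t / hbar), 0j)

def inner(bra, ket):
    """Inner product <bra|ket>, conjugating the components of the bra."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def probabilities(t, omega=1.0, hbar=1.0):
    """Probabilities of measuring sigma_y = +1 and sigma_y = -1 at time t."""
    state = psi(t, omega, hbar)
    return (abs(inner(KET_IN, state)) ** 2,
            abs(inner(KET_OUT, state)) ** 2)
```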
4.14 Collapse
5.2 Measurement
Exercise 17. Verify this claim.
The claim being that any 2 × 2 Hermitian matrix can be represented as a linear combination of:
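\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \quad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
The general form of a 2 × 2 Hermitian matrix is:
\[ (\forall (r, r', w) \in \mathbb{R}^2 \times \mathbb{C}), \quad \begin{pmatrix} r & w \\ w^* & r' \end{pmatrix} \]
Recall indeed that a Hermitian matrix L satisfies L = L† := (L∗ )T ; hence the diagonal
elements must be real, and the off-diagonal elements must be complex conjugates of each other.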
Compare then with the general form for a linear combination of the four matrices above:
\[ (\forall (a, b, c, d) \in \mathbb{R}^4), \quad a\sigma_x + b\sigma_y + c\sigma_z + dI = \begin{pmatrix} d + c & a - ib \\ a + ib & d - c \end{pmatrix} \]
Clearly we can identify w ∈ C with a − ib: this is a general form for a complex number, and this naturally
identifies w∗ with a + ib, as expected.
Regarding the remaining parameters, we have on one side two real parameters, and on the other side,
two non-equivalent equations involving two parameters, meaning, two degrees of freedom on both sides.
So there’s room to identify r with d + c and r′ with d − c. More precisely, given two arbitrary (r, r′ ) ∈ R2 ,
we can always find (c, d) ∈ R2 such that r = d + c and r′ = d − c:
\[ \begin{cases} r = d + c \\ r' = d - c \end{cases} \Leftrightarrow \begin{cases} d = r - c \\ c = d - r' \end{cases} \Leftrightarrow \begin{cases} d = r - (d - r') \\ c = (r - c) - r' \end{cases} \Leftrightarrow \begin{cases} d = \frac{r + r'}{2} \\ c = \frac{r - r'}{2} \end{cases} \]
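The claim can also be explored numerically. The sketch below (plain Python, my own helper names) does not follow the identification used in the text: it relies instead on a standard fact, namely that the coefficient of each of the four matrices M in the decomposition of a 2 × 2 Hermitian matrix H is Tr(M H)/2. It then rebuilds H from the four real coefficients so the two can be compared.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def trace_of_product(a, b):
    """Trace of the matrix product a.b for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def pauli_coefficients(h):
    """Real coefficients of h along sigma_x, sigma_y, sigma_z and I, in that order."""
    return tuple((trace_of_product(m, h) / 2).real for m in (SX, SY, SZ, I2))

def rebuild(coefficients):
    """Linear combination of sigma_x, sigma_y, sigma_z and I."""
    basis = (SX, SY, SZ, I2)
    return [[sum(k * m[i][j] for k, m in zip(coefficients, basis))
             for j in range(2)] for i in range(2)]

def max_difference(a, b):
    """Largest entry-wise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

For a Hermitian input, `max_difference(h, rebuild(pauli_coefficients(h)))` should be zero up to rounding.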
Remark 13. Note that (real) linear combinations of those 4 matrices are isomorphic to Q31 .
C̄ := C − ⟨C⟩ I
Where the identity I is sometimes implicit. The eigenvalues of C̄ are denoted c̄ and can be expressed in
terms of C’s eigenvalues, denoted c:
c̄ = c − ⟨C⟩
From there, we defined the square of the uncertainty (i.e. of the standard deviation) of C, assuming a
”well-behaved” probability distribution P , by:
\[ (\Delta C)^2 := \sum_c \bar{c}^2 P(c) \]
Let’s first quickly prove that c̄ = c − ⟨C⟩ are indeed the eigenvalues of C̄ = C − ⟨C⟩ I. Consider an
eigenvalue c of C, with associated eigenvector |c⟩. It follows that:
C|c⟩ = c|c⟩
⇔ C|c⟩ − ⟨C⟩ |c⟩ = c|c⟩ − ⟨C⟩ |c⟩
⇔ (C − ⟨C⟩ I)|c⟩ = (c − ⟨C⟩)|c⟩
⇔ C̄|c⟩ = (c − ⟨C⟩)|c⟩
Meaning, |c⟩ is still an eigenvector of C̄, but now associated to the eigenvalue c − ⟨C⟩. The |c⟩ still
make an orthonormal basis of the state space, so there are no other eigenvectors (there can’t be more
eigenvectors than the dimension of the surrounding state-space).
Similarly, we can prove that c2 are the eigenvalues associated to C 2 , for an observable C: again start
from an eigenvalue c of C, associated to an eigenvector |c⟩:
1) We’ll prove the fact for an arbitrary observable C: it’ll naturally hold for both A and B.
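\[ \begin{aligned} (\Delta C)^2 &:= \sum_c \bar{c}^2 P(c) \\ &= \sum_c (c - \langle C\rangle)^2 P(c) && \text{(definition of } \bar{c}\text{)} \\ &= \langle\Psi|\bar{C}^2|\Psi\rangle =: \langle\bar{C}^2\rangle && \text{(two previous properties)} \end{aligned} \]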
Remember, ⟨A⟩ and ⟨B⟩ are real numbers (their multiplication is then commutative).
3) This is now just about following the reasoning preceding the exercise in the book, as suggested by the
authors, by replacing A and B with Ā and B̄.
So let:
|X⟩ = Ā|Ψ⟩ = (A − ⟨A⟩ I)|Ψ⟩; |Y ⟩ = iB̄|Ψ⟩ = i(B − ⟨B⟩ I)|Ψ⟩
Recall the general form of Cauchy-Schwarz for a complex vector space32 :
Putting the two together yields the expected, general uncertainty principle:
6 Combining Systems: Entanglement
6.1 Mathematical Interlude: Tensor Products
6.1.1 Meet Alice and Bob
6.1.2 Representing the Combined System
How should we understand ⟨σA σB ⟩? We’re trying to find a way to express it as we just did for ⟨σC ⟩.
It’s defined as the average of the product of σA and σB , meaning, the sum of all possible products of a
and b, weighted by some probability distribution, but which one? Well, we don’t really know its form
specifically, but if for ⟨σC ⟩ it was a function of c, then we can guess it must now be a function of a and
b: this is the P (a, b) from the exercise statement:
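\[ \langle\sigma_A\sigma_B\rangle = \sum_a\left(\sum_b ab\,P(a, b)\right) \]
From there, it’s just a matter of developing the computation, using our assumption that P (a, b) factorizes:
\[ \begin{aligned} \langle\sigma_A\sigma_B\rangle &= \sum_a\left(\sum_b ab\,P(a, b)\right) \\ &= \sum_a\sum_b \big(ab\,P_A(a)P_B(b)\big) \\ &= \sum_a\sum_b \big((aP_A(a))(bP_B(b))\big) \\ &= \left(\sum_a aP_A(a)\right)\left(\sum_b bP_B(b)\right) \\ &= \langle\sigma_A\rangle\langle\sigma_B\rangle \end{aligned} \]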
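Here is a tiny numerical illustration of this factorization, as a sketch in plain Python. The distributions are represented (my choice) as dictionaries mapping each outcome to its probability, and the joint distribution is assumed to be the product PA (a)PB (b), as in the exercise.

```python
def expectation(dist):
    """Average of a single variable, given {outcome: probability}."""
    return sum(value * prob for value, prob in dist.items())

def expectation_of_product(dist_a, dist_b):
    """Average of a*b when the joint distribution factorizes."""
    return sum(a * b * pa * pb
               for a, pa in dist_a.items()
               for b, pb in dist_b.items())

def factorization_gap(dist_a, dist_b):
    """Difference <ab> - <a><b>, expected to vanish for a factorized P."""
    return (expectation_of_product(dist_a, dist_b)
            - expectation(dist_a) * expectation(dist_b))
```

For example, with `{+1: 0.3, -1: 0.7}` for Alice and `{+1: 0.9, -1: 0.1}` for Bob, the gap is zero up to rounding.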
Recall that we’re in the context of two distinct state-spaces, each of them referring to a full-blown spin.
Spin states for the first space (Alice’s) are denoted:
While spin states for the second space (Bob’s) are denoted:
Such states are, as usual, normalized: this is the condition referred to by Eqs. 6.4:
The two underlying state spaces (complex spaces, but really, Hilbert spaces) are glued by a tensor product:
this allows the creation of a new state space, called the product state space, whose states can refer to both
Alice’s and Bob’s states in a single expression.
Remark 15. I encourage you to have a look at how Mathematicians formalize the notion of a tensor
product of vector spaces: there is for instance a great introductory YouTube video35 by Michael Penn on
the topic.
The core idea is to start with what is called a formal product of vector spaces, which is a new space built
from the span of purely ”syntactical” combinations of elements of two (or more) vector spaces.
Equivalence classes are then used to constrain this span to be a vector space.
For instance, the three following elements would be distinct elements in the formal product of R2 and R3 :
\[ 2\begin{pmatrix} 1 \\ 2 \end{pmatrix} * \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix}; \quad \begin{pmatrix} 2 \\ 4 \end{pmatrix} * \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix}; \quad \begin{pmatrix} 1 \\ 2 \end{pmatrix} * \begin{pmatrix} 6 \\ 8 \\ 10 \end{pmatrix} \]
But they would be identified by equivalence classes so as to be the same element in the tensor product of
R2 and R3 . We can keep identifying elements likewise until the operations (sum, scalar multiplication) on the
formal product space respect the properties of the corresponding operations in a vector space.
Here’s Eq. 6.5, the general form for such a product state, living in the tensor product space created from
Alice’s and Bob’s state spaces (I’ve just named it Ψ so as to refer to it later on):
The claim we have to prove is that this vector is naturally normalized, from the normalization constraints
imposed on the individual state spaces.
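Let’s start by computing the norm of the product state (assuming an ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩}):
\[ |\Psi|^2 = \langle\Psi|\Psi\rangle = \begin{pmatrix} (\alpha_u\beta_u)^* & (\alpha_u\beta_d)^* & (\alpha_d\beta_u)^* & (\alpha_d\beta_d)^* \end{pmatrix} \begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \]
We can develop it further, using the fact that for (a, b) ∈ C2 , (ab)∗ = a∗ b∗ :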
But the norm is axiomatically positively defined (i.e. (∀Ψ ∈ H), |Ψ| ≥ 0 with equality iff Ψ = 0H ) so:
|Ψ| = 1
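The claim is easy to test numerically. The sketch below, in plain Python, builds two normalized single-spin states (parametrized by angles, a convenience of mine), forms the four components of the product state in the ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩}, and computes its squared norm.

```python
import cmath
import math

def spin_state(theta, phi):
    """A normalized single-spin state (component along |u>, along |d>)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def product_state(alice, bob):
    """Components of the product state, ordered as uu, ud, du, dd."""
    return [a * b for a in alice for b in bob]

def norm_squared(state):
    """<Psi|Psi>, i.e. the sum of the squared moduli of the components."""
    return sum(abs(component) ** 2 for component in state)
```

Whatever the angles, `norm_squared(product_state(...))` should be 1 up to rounding, since each factor is normalized.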
35 [Link]
6.6 Counting Parameters for the Product State
6.7 Entangled States
Exercise 21. Prove that the state |sing⟩ cannot be written as a product state.
Let’s recall the definition of the so-called singlet state |sing⟩:
\[ |sing\rangle = \frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big) \]
As for the previous exercise, we’re still in the context of combining two state spaces: Alice’s and Bob’s,
each representing the states of a spin, where the general form of Alice’s state vectors is:
While spin states for the second space (Bob’s) are denoted:
In this context, let’s clarify the difference between a product state and a general composite state, with
potential entanglement:
Product state obtained by developing a product between two states from Alice and Bob’s state spaces,
which yield something along the form:
Remember from the previous exercise that such a state vector is naturally normalized, as a
consequence of the normalization of the underlying vectors from Alice’s and Bob’s state spaces;
General state for a two-spin system obtained by linear combination of the vectors from the ordered
basis {|uu⟩, |ud⟩, |du⟩, |dd⟩}:
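Clearly, |sing⟩ is normalized: it’s at least a general state for a two-spin system. Assume it is a product
state. Then there exists (αu , αd , βu , βd ) ∈ C4 such that:
\[ \begin{cases} \alpha_u\beta_d = \frac{1}{\sqrt{2}} \\ \alpha_d\beta_u = -\frac{1}{\sqrt{2}} \\ \alpha_u\beta_u = 0 \\ \alpha_d\beta_d = 0 \end{cases} \]
The third equation forces αu = 0 or βu = 0, but the first equation requires αu ≠ 0, and the second
requires βu ≠ 0.
So the system isn’t solvable and our previous assumption can’t hold. Hence, there’s no such
(αu , αd , βu , βd ) ∈ C4 , and |sing⟩ is not a product state.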
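The same conclusion can be reached with a compact criterion, which is not the one used in the text: a state with components (ψuu , ψud , ψdu , ψdd ) is a product state exactly when ψuu ψdd − ψud ψdu = 0, since for a product state both terms equal αu αd βu βd . The following plain Python sketch (names are mine) applies it to |sing⟩ and to an arbitrary product state.

```python
import math

def product_defect(state):
    """psi_uu * psi_dd - psi_ud * psi_du, zero for a product state."""
    uu, ud, du, dd = state
    return uu * dd - ud * du

def is_product_state(state, tol=1e-12):
    """True when the defect vanishes, up to a rounding tolerance."""
    return abs(product_defect(state)) < tol

def product_state(alice, bob):
    """Components of the product state, ordered as uu, ud, du, dd."""
    return [a * b for a in alice for b in bob]

S = 1 / math.sqrt(2)
SINGLET = [0, S, -S, 0]  # (|ud> - |du>) / sqrt(2)
```

For the singlet, the defect is 1/2, so it cannot be written as a product state.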
As usual, let’s recall our Pauli matrices:
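\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
The base vectors |u} and |d} are the canonical basis vectors for R2 :
\[ |u\} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \quad |d\} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]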
We’re trying to understand how, for instance, an operator σx defined on Alice’s state space can be extended
to work on a state vector taken from a combined state space involving Alice’s.
The core idea is that the operator will only act on the ”component” of the vector that is related to
Alice’s state space, while leaving the components involving other state spaces untouched.
Eqs. 6.6 (first column below) simply encode how the spin operators act on the basis vectors, in Alice’s
state space; Eqs. 6.7 (second column below) are identical, but for Bob’s state space:
Now verifying that the matrix products indeed evaluate as such is child’s play (matrix × vector
products), there’s no use in being more explicit here.
For similar reasons, I’ll just write a completed 6.8 here, but won’t develop the computations: one just has
to follow the aforementioned rule: act with the operator on the correct component, extract the scalar
factor, if any, and update the corresponding vector component. This yields, in agreement with
the appendix:
σz |uu⟩ = |uu⟩; τz |uu⟩ = |uu⟩
σz |ud⟩ = |ud⟩; τz |ud⟩ = −|ud⟩
σz |du⟩ = −|du⟩; τz |du⟩ = |du⟩
σz |dd⟩ = −|dd⟩; τz |dd⟩ = −|dd⟩
When any of Alice’s or Bob’s spin operators acts on a product state, the result is still a product state.
Show that in a product state, the expectation value of any component of σ or τ is exactly the same as it
would be in the individual single-spin states.
As usual, let’s recall the context. We have two state spaces, one for Alice, and one for Bob, each sufficient
to describe a spin.
Spin states for Alice’s and Bob’s spaces are respectively denoted:
Now, we want to act on such a product state with an operator from either Alice’s state space (σ) or Bob’s
(τ ), which, as we saw earlier, can naturally be extended from the individual spaces to the product
space. Recall that the operators’ definitions in their own respective state spaces are identical:
\[ \tau_x = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \tau_y = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \tau_z = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
However, when acting on a product state (and more generally, on a vector from the product space),
each will respectively only act on the corresponding part of the tensor product gluing basis vectors, for
instance:
σx (γ|ab⟩) = γσx (|a} ⊗ |b⟩) = γ|(σx (a))b⟩
τx (γ|ab⟩) = γτx (|a} ⊗ |b⟩) = γ|a(τx (b))⟩
Because the computation will be exactly symmetric, we’re only going to do the work for Alice’s operators.
Remark 17. It would be interesting to see under which circumstances the result generalizes to arbitrary
observables (Hermitian operators). It seems we would need for such an operator σ to transform the basis
vectors |u⟩ and |d⟩ in such a way that the induced rotation and scaling to reach σ|u⟩ and σ|d⟩, would
somehow balance, so as to preserve the product state constraint. In particular, σ|u⟩ and σ|d⟩ should be
orthogonal.
Note that:
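\[ \sigma_x|u\} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |d\}; \quad \sigma_x|d\} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = |u\} \]
Then:
\[ \sigma_x|\Psi\rangle = \alpha_u\beta_u\underbrace{(\sigma_x|u\})}_{|d\}}\otimes|u\rangle + \alpha_u\beta_d\underbrace{(\sigma_x|u\})}_{|d\}}\otimes|d\rangle + \alpha_d\beta_u\underbrace{(\sigma_x|d\})}_{|u\}}\otimes|u\rangle + \alpha_d\beta_d\underbrace{(\sigma_x|d\})}_{|u\}}\otimes|d\rangle \]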
Where, for the last step, we’ve just introduced some renaming (it’ll be made explicit in a moment). Such
a state will be a product state if the following hold:
Which are but the normalization conditions underlying |Ψ⟩:
We’ll now do similar computations, but for σy and σz . Starting with σy , note that:
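\[ \sigma_y|u\} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ i \end{pmatrix} = i|d\}; \quad \sigma_y|d\} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -i \\ 0 \end{pmatrix} = -i|u\} \]
Then:
\[ \sigma_y|\Psi\rangle = \alpha_u\beta_u\underbrace{(\sigma_y|u\})}_{i|d\}}\otimes|u\rangle + \alpha_u\beta_d\underbrace{(\sigma_y|u\})}_{i|d\}}\otimes|d\rangle + \alpha_d\beta_u\underbrace{(\sigma_y|d\})}_{-i|u\}}\otimes|u\rangle + \alpha_d\beta_d\underbrace{(\sigma_y|d\})}_{-i|u\}}\otimes|d\rangle \]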
Where again, for the last step, we’ve performed some renaming (again, made explicit in a few lines). For
this to be a product state, the following must hold:
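\[ \sigma_z|\Psi\rangle = \alpha_u\beta_u\underbrace{(\sigma_z|u\})}_{|u\}}\otimes|u\rangle + \alpha_u\beta_d\underbrace{(\sigma_z|u\})}_{|u\}}\otimes|d\rangle + \alpha_d\beta_u\underbrace{(\sigma_z|d\})}_{-|d\}}\otimes|u\rangle + \alpha_d\beta_d\underbrace{(\sigma_z|d\})}_{-|d\}}\otimes|d\rangle \]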
The renaming is much simpler this time. Let’s recall one last time the product state condition:
It remains to establish the last part of the exercise, namely, that the expectation is unchanged. Recall
that for an observable A, given a state |Ψ⟩, the expected value is defined as:
⟨A⟩ := ⟨Ψ|A|Ψ⟩
Now, we’ve been computing A|Ψ⟩ in the previous section for all ”components” of Alice’s spin; so we just
have to take a product with ⟨Ψ| to get the expected value.
Now remember, we consider an ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩} to create column/row vectors, for
instance:
\[ |\Psi\rangle = \alpha_u\beta_u|uu\rangle + \alpha_u\beta_d|ud\rangle + \alpha_d\beta_u|du\rangle + \alpha_d\beta_d|dd\rangle = \begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \]
We previously established that:
Hence:
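\[ \begin{aligned} \langle\sigma_x\rangle &= \langle\Psi|(\sigma_x|\Psi\rangle) \\ &= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}\begin{pmatrix} \alpha_d\beta_u \\ \alpha_d\beta_d \\ \alpha_u\beta_u \\ \alpha_u\beta_d \end{pmatrix} \\ &= \alpha_u^*\beta_u^*\alpha_d\beta_u + \alpha_u^*\beta_d^*\alpha_d\beta_d + \alpha_d^*\beta_u^*\alpha_u\beta_u + \alpha_d^*\beta_d^*\alpha_u\beta_d \\ &= \beta_d^*\beta_d(\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) + \beta_u^*\beta_u(\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) \\ &= \underbrace{(\beta_d^*\beta_d + \beta_u^*\beta_u)}_{=1}(\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) \\ &= \alpha_u^*\alpha_d + \alpha_d^*\alpha_u \end{aligned} \]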
I don’t think we’ve already computed ⟨Ψ|σx |Ψ⟩ in terms of αs and βs before (earlier, in L03E04, we
computed it in terms of θ, an angle between two states), so let’s do it (I’ll use σxA to indicate that we’re
using σx restricted to Alice’s space; for clarity, I’ll be using the ordered basis {|u}, |d}}):
Let’s do the same thing for ⟨σy ⟩; recall that we’ve computed earlier.
Hence,
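\[ \begin{aligned} \langle\sigma_y\rangle &= \langle\Psi|(\sigma_y|\Psi\rangle) \\ &= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}\begin{pmatrix} -i\alpha_d\beta_u \\ -i\alpha_d\beta_d \\ i\alpha_u\beta_u \\ i\alpha_u\beta_d \end{pmatrix} \\ &= i\left(-\alpha_u^*\beta_u^*\alpha_d\beta_u - \alpha_u^*\beta_d^*\alpha_d\beta_d + \alpha_d^*\beta_u^*\alpha_u\beta_u + \alpha_d^*\beta_d^*\alpha_u\beta_d\right) \\ &= i\left(\beta_u^*\beta_u(\alpha_d^*\alpha_u - \alpha_u^*\alpha_d) + \beta_d^*\beta_d(\alpha_d^*\alpha_u - \alpha_u^*\alpha_d)\right) \\ &= i\underbrace{(\beta_u^*\beta_u + \beta_d^*\beta_d)}_{=1}(\alpha_d^*\alpha_u - \alpha_u^*\alpha_d) \\ &= i(\alpha_d^*\alpha_u - \alpha_u^*\alpha_d) \end{aligned} \]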
On the other hand:
Hence,
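\[ \begin{aligned} \langle\sigma_z\rangle &= \langle\Psi|(\sigma_z|\Psi\rangle) \\ &= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ -\alpha_d\beta_u \\ -\alpha_d\beta_d \end{pmatrix} \\ &= \alpha_u^*\beta_u^*\alpha_u\beta_u + \alpha_u^*\beta_d^*\alpha_u\beta_d - \alpha_d^*\beta_u^*\alpha_d\beta_u - \alpha_d^*\beta_d^*\alpha_d\beta_d \\ &= \beta_u^*\beta_u(\alpha_u^*\alpha_u - \alpha_d^*\alpha_d) + \beta_d^*\beta_d(\alpha_u^*\alpha_u - \alpha_d^*\alpha_d) \\ &= \underbrace{(\beta_u^*\beta_u + \beta_d^*\beta_d)}_{=1}(\alpha_u^*\alpha_u - \alpha_d^*\alpha_d) \\ &= \alpha_u^*\alpha_u - \alpha_d^*\alpha_d \end{aligned} \]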
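The three computations above can be cross-checked numerically. In the sketch below (plain Python, helper names are mine), Alice's operators are extended to the product space as σ ⊗ I through an explicit Kronecker product, which is my way of encoding "act on Alice's component only"; the expectation value in the product state is then compared with the one in Alice's state alone.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def product_state(alice, bob):
    """Components of the product state, ordered as uu, ud, du, dd."""
    return [a * b for a in alice for b in bob]

def expectation(op, state):
    """<state|op|state> for a square matrix op and a column vector state."""
    n = len(state)
    return sum(complex(state[i]).conjugate() * op[i][j] * state[j]
               for i in range(n) for j in range(n))

def spin_state(theta, phi):
    """A normalized single-spin state."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def compare(alice, bob):
    """Pairs (value in product state, value in Alice's state) for x, y, z."""
    psi = product_state(alice, bob)
    return [(expectation(kron(s, I2), psi), expectation(s, alice))
            for s in (SX, SY, SZ)]
```

Each pair returned by `compare` should contain two equal numbers, up to rounding, whatever Bob's state is.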
And on the other hand:
What does this say about the correlation between the two measurements?
Let’s recall the Pauli matrices involved:
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \tau_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \]
= 0
Remember for the last step that |du⟩ and |ud⟩ are orthonormal basis vectors.
On to the correlation between the two measurements: recall from section 6.2 that the statistical correla-
tion between an observable σA in Alice’s space and an observable σB in Bob’s space was defined as the
quantity:
⟨σA σB ⟩ − ⟨σA ⟩ ⟨σB ⟩
Hence in our case, the correlation between the two measurements is (the authors previously computed
⟨σx ⟩ = 0 and ⟨σy ⟩ = 0: the computation of ⟨τy ⟩ would be identical as for the latter)
Hence, we can conclude that the two measurements aren’t correlated at all.
Exercise 25. Next, Charlie prepares the spins in a different state, called |T1 ⟩, where
\[ |T_1\rangle = \frac{1}{\sqrt{2}}\big(|ud\rangle + |du\rangle\big) \]
In these examples, T stands for triplet. These triplet states are completely different from the states in
the coin and die examples. What are the expectation values of the operators σz τz , σx τx , and σy τy ?
Also recall, from L06E04, the rules for acting on composite state vectors36 :
= −1
For the last step, remember, as for the previous exercise, that |du⟩ and |ud⟩ are orthonormal basis vectors.
= +1
36 You have the same in the book’s appendix
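\[ \begin{aligned} \langle\sigma_y\tau_y\rangle &:= \langle T_1|\sigma_y\tau_y|T_1\rangle \\ &= \frac{1}{\sqrt{2}}\langle T_1|\sigma_y\tau_y\big(|ud\rangle + |du\rangle\big) \\ &= \frac{1}{\sqrt{2}}\langle T_1|\sigma_y\big(-i|uu\rangle + i|dd\rangle\big) \\ &= \frac{i}{\sqrt{2}}\langle T_1|\big(-i|du\rangle - i|ud\rangle\big) \\ &= \frac{1}{2}\big(\langle ud| + \langle du|\big)\big(|du\rangle + |ud\rangle\big) \\ &= \frac{1}{2}\Big(\underbrace{\langle ud|du\rangle}_{0} + \underbrace{\langle ud|ud\rangle}_{1} + \underbrace{\langle du|du\rangle}_{1} + \underbrace{\langle du|ud\rangle}_{0}\Big) \\ &= +1 \end{aligned} \]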
Exercise 26. Do the same for the other two entangled triplet states,
\[ |T_2\rangle = \frac{1}{\sqrt{2}}\big(|uu\rangle + |dd\rangle\big) \]
\[ |T_3\rangle = \frac{1}{\sqrt{2}}\big(|uu\rangle - |dd\rangle\big) \]
As for the previous exercise, this is just about crunching numbers. We won’t be using the Pauli matrices
explicitly here; instead, we’ll use the multiplication table from L06E04.
As the computations are fairly similar, and to save space, I’ll be computing the expectation values for
T2 and T3 in parallel, distinguishing them by a subscript number.
Let’s start with ⟨σz τz ⟩:
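\[
\begin{aligned}
\langle\sigma_z\tau_z\rangle_2 &:= \langle T_2|\sigma_z\tau_z|T_2\rangle \\
&= \frac{1}{\sqrt{2}}\langle T_2|\sigma_z\tau_z\left(|uu\rangle + |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_2|\sigma_z\left(|uu\rangle - |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_2|\left(|uu\rangle + |dd\rangle\right) \\
&= \frac{1}{2}\left(\langle uu| + \langle dd|\right)\left(|uu\rangle + |dd\rangle\right) \\
&= \frac{1}{2}\Big(\underbrace{\langle uu|uu\rangle}_{1} + \underbrace{\langle uu|dd\rangle}_{0} + \underbrace{\langle dd|uu\rangle}_{0} + \underbrace{\langle dd|dd\rangle}_{1}\Big) \\
&= +1
\end{aligned}
\]
\[
\begin{aligned}
\langle\sigma_z\tau_z\rangle_3 &:= \langle T_3|\sigma_z\tau_z|T_3\rangle \\
&= \frac{1}{\sqrt{2}}\langle T_3|\sigma_z\tau_z\left(|uu\rangle - |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_3|\sigma_z\left(|uu\rangle + |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_3|\left(|uu\rangle - |dd\rangle\right) \\
&= \frac{1}{2}\left(\langle uu| - \langle dd|\right)\left(|uu\rangle - |dd\rangle\right) \\
&= \frac{1}{2}\Big(\underbrace{\langle uu|uu\rangle}_{1} - \underbrace{\langle uu|dd\rangle}_{0} - \underbrace{\langle dd|uu\rangle}_{0} + \underbrace{\langle dd|dd\rangle}_{1}\Big) \\
&= +1
\end{aligned}
\]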
Moving on to ⟨σx τx ⟩:
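\[
\begin{aligned}
\langle\sigma_x\tau_x\rangle_2 &:= \langle T_2|\sigma_x\tau_x|T_2\rangle \\
&= \frac{1}{\sqrt{2}}\langle T_2|\sigma_x\tau_x\left(|uu\rangle + |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_2|\sigma_x\left(|ud\rangle + |du\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_2|\left(|dd\rangle + |uu\rangle\right) \\
&= \frac{1}{2}\left(\langle uu| + \langle dd|\right)\left(|dd\rangle + |uu\rangle\right) \\
&= \frac{1}{2}\Big(\underbrace{\langle uu|dd\rangle}_{0} + \underbrace{\langle uu|uu\rangle}_{1} + \underbrace{\langle dd|dd\rangle}_{1} + \underbrace{\langle dd|uu\rangle}_{0}\Big) \\
&= +1
\end{aligned}
\]
\[
\begin{aligned}
\langle\sigma_x\tau_x\rangle_3 &:= \langle T_3|\sigma_x\tau_x|T_3\rangle \\
&= \frac{1}{\sqrt{2}}\langle T_3|\sigma_x\tau_x\left(|uu\rangle - |dd\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_3|\sigma_x\left(|ud\rangle - |du\rangle\right) \\
&= \frac{1}{\sqrt{2}}\langle T_3|\left(|dd\rangle - |uu\rangle\right) \\
&= \frac{1}{2}\left(\langle uu| - \langle dd|\right)\left(|dd\rangle - |uu\rangle\right) \\
&= \frac{1}{2}\Big(\underbrace{\langle uu|dd\rangle}_{0} - \underbrace{\langle uu|uu\rangle}_{1} - \underbrace{\langle dd|dd\rangle}_{1} + \underbrace{\langle dd|uu\rangle}_{0}\Big) \\
&= -1
\end{aligned}
\]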
= −1 = +1
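As a cross-check of the three pairs of expectation values, here is a small sketch in plain Python. It assumes the ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩} and uses the fact that σ_a τ_a acts as the Kronecker product of the two Pauli matrices; the helper names are illustrative only.

```python
import math

PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def expect(op, psi):
    """<psi|op|psi> for a state given by its components."""
    op_psi = [sum(m * x for m, x in zip(row, psi)) for row in op]
    return sum(complex(x).conjugate() * y for x, y in zip(psi, op_psi))

s = 1 / math.sqrt(2)
STATES = {"T2": [s, 0, 0, s], "T3": [s, 0, 0, -s]}

# expectations[state][axis] is <sigma_axis tau_axis> in that state
expectations = {
    name: {axis: round(expect(kron(P, P), psi).real, 12) for axis, P in PAULI.items()}
    for name, psi in STATES.items()
}
for name, values in expectations.items():
    print(name, values)
```

The signs should agree with the manual computation: ⟨σz τz⟩ is +1 for both states, while ⟨σx τx⟩ and ⟨σy τy⟩ have opposite signs within each state.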
The authors’ argument was that ⟨τz σz ⟩ is −1 because |sing⟩ is built from two
spins, one of which is always up while the other is down, and we’re measuring both spins along the
axis on which they are either up or down.
However, in the case of e.g. ⟨τx σx ⟩, the answer is not as obvious, because we’re then measuring
the spins along the x-axis, and it’s not immediate from the expression of |sing⟩ what kind of balance
we have along the x-axis.
Let’s do a little experiment. Recall the definition of the ”basis vectors” for the x-axis, left and right:
\[
|r\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle + |d\rangle\right); \qquad |l\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle - |d\rangle\right)
\]
We want to express, say, T3 in terms of |l⟩ and |r⟩, to see if indeed, when expressed as such, T3 is created
from two spins such that when one is left, the other is right, which would be concordant with the idea
that ⟨σx τx ⟩3 = −1. Let’s start by rewriting |u⟩ and |d⟩ in terms of |r⟩ and |l⟩:
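\[
\begin{cases}
|r\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle + |d\rangle\right) \\
|l\rangle = \frac{1}{\sqrt{2}}\left(|u\rangle - |d\rangle\right)
\end{cases}
\Leftrightarrow
\begin{cases}
|u\rangle = \sqrt{2}|r\rangle - |d\rangle \\
|d\rangle = -\sqrt{2}|l\rangle + |u\rangle
\end{cases}
\Leftrightarrow
\begin{cases}
|u\rangle = \frac{\sqrt{2}}{2}\left(|r\rangle + |l\rangle\right) \\
|d\rangle = \frac{\sqrt{2}}{2}\left(|r\rangle - |l\rangle\right)
\end{cases}
\]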
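Carrying the substitution through for |T3⟩ can be sketched as follows, using only the relations just derived (here |rl⟩ is shorthand for Alice's spin being right and Bob's being left, a notation assumed for this illustration):

\[
\begin{aligned}
|uu\rangle &= \tfrac{1}{2}\left(|r\rangle + |l\rangle\right)\otimes\left(|r\rangle + |l\rangle\right) = \tfrac{1}{2}\left(|rr\rangle + |rl\rangle + |lr\rangle + |ll\rangle\right) \\
|dd\rangle &= \tfrac{1}{2}\left(|r\rangle - |l\rangle\right)\otimes\left(|r\rangle - |l\rangle\right) = \tfrac{1}{2}\left(|rr\rangle - |rl\rangle - |lr\rangle + |ll\rangle\right) \\
|T_3\rangle &= \tfrac{1}{\sqrt{2}}\left(|uu\rangle - |dd\rangle\right) = \tfrac{1}{\sqrt{2}}\left(|rl\rangle + |lr\rangle\right)
\end{aligned}
\]

So whenever one spin is found right, the other is found left, which is in line with ⟨σx τx ⟩3 = −1.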
Again for this exercise, we won’t need to explicitly use the Pauli matrices σi /τj . But actually, we won’t
even need the multiplication table either, as we’ve already done most of the work in earlier exercises.
Indeed, if we want to prove that |Ψ⟩ is an eigenvector for σ · τ , we expect to be able to carry some
computation following this pattern:
(σ · τ )|Ψ⟩ = (σx τx + σy τy + σz τz )|Ψ⟩
= (σx τx )|Ψ⟩ + (σy τy )|Ψ⟩ + (σz τz )|Ψ⟩
= ...
= λΨ |Ψ⟩
But we know from the book that:
Hence, as foretold by the authors after this exercise, the triplets share a degenerate eigenvalue (+1),
while the singlet is associated to a unique eigenvalue (−3), which justifies a posteriori their names.
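The four eigenvalue relations can be checked numerically. The sketch below, in plain Python, builds σ · τ as a 4 × 4 matrix in the ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩} (an assumption of this illustration) and verifies that each state is mapped to a multiple of itself; the helper names are hypothetical.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],        # x
    [[0, -1j], [1j, 0]],     # y
    [[1, 0], [0, -1]],       # z
]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# sigma . tau = sum over the three axes of (sigma_a tensor tau_a)
terms = [kron(P, P) for P in PAULI]
sigma_dot_tau = [[sum(t[i][j] for t in terms) for j in range(4)] for i in range(4)]

s = 1 / math.sqrt(2)
STATES = {
    "sing": [0, s, -s, 0],
    "T1": [0, s, s, 0],
    "T2": [s, 0, 0, s],
    "T3": [s, 0, 0, -s],
}

eigenvalues = {}
for name, psi in STATES.items():
    image = matvec(sigma_dot_tau, psi)
    # Rayleigh quotient <psi|M|psi>; the states have real components here
    lam = sum(p * y for p, y in zip(psi, image)).real
    residual = max(abs(y - lam * p) for p, y in zip(psi, image))
    assert residual < 1e-12, name    # psi really is an eigenvector
    eigenvalues[name] = round(lam, 12)
print(eigenvalues)
```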
Exercise 28. A system of two spins has the Hamiltonian
\[
H = \frac{\omega}{2}\,\sigma\cdot\tau
\]
What are the possible energies of the system, and what are the eigenvectors of the Hamiltonian?
Suppose the system starts in the state |uu⟩. What is the state at any later time? Answer the same
question for initial states |ud⟩, |du⟩, and |dd⟩.
The first part of the question essentially is about diagonalizing the Hamiltonian: the eigenvalues cor-
respond to the measurable values for the energy. More generally, the exercise is about repeating what
we’ve done earlier in chapter 4, in particular in exercise L04E06, meaning, applying what the authors
call the recipe for a Schrödinger Ket (section 4.13):
1. Derive, look up, guess, borrow, or steal the Hamiltonian operator H;
2. Prepare an initial state |Ψ(0)⟩;
3. Find the eigenvalues and eigenvectors of H by solving the time-independent Schrödinger equation,
\[ H|E_j\rangle = E_j|E_j\rangle \]
4. Use the initial state-vector |Ψ(0)⟩, along with the eigenvectors |Ej ⟩ from step 3, to calculate the
initial coefficients αj (0):
\[ \alpha_j(0) = \langle E_j|\Psi(0)\rangle \]
5. Rewrite |Ψ(0)⟩ in terms of the eigenvectors |Ej ⟩ and the initial coefficients αj (0):
\[ |\Psi(0)\rangle = \sum_j \alpha_j(0)|E_j\rangle \]
6. In the above equation, replace each αj (0) with αj (t) to capture its time-dependence. As a result,
|Ψ(0)⟩ becomes |Ψ(t)⟩:
\[ |\Psi(t)\rangle = \sum_j \alpha_j(t)|E_j\rangle \]
7. Using Eq. 4.30 (see footnote 37), replace each αj (t) with αj (0) exp(−(i/ℏ) Ej t):
\[ |\Psi(t)\rangle = \sum_j \alpha_j(0)\exp\left(-\frac{i}{\hbar}E_j t\right)|E_j\rangle \]
We’ll start by diagonalizing H, and then, by loosely applying the rest of the procedure with the various
proposed initial states. Recall from the previous exercise that we’ve found 4 eigenvectors for σ · τ :
(σ · τ )|sing⟩ = − 3|sing⟩
(σ · τ )|T1 ⟩ = + 1|T1 ⟩
(σ · τ )|T2 ⟩ = + 1|T2 ⟩
(σ · τ )|T3 ⟩ = + 1|T3 ⟩
Hence we can conclude that |sing⟩, |T1 ⟩, |T2 ⟩, and |T3 ⟩ form a complete set of eigenvectors of σ · τ :
no further independent ones exist, for we’ve reached the dimension of our vector space A⊗B. By scaling
our operator by ω/2, we recover our Hamiltonian H, which then has the same eigenvectors; only the
eigenvalues need to be scaled likewise:
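\[
H|\mathrm{sing}\rangle = \frac{-3\omega}{2}|\mathrm{sing}\rangle; \qquad H|T_1\rangle = \frac{+\omega}{2}|T_1\rangle
\]
\[
H|T_2\rangle = \frac{+\omega}{2}|T_2\rangle; \qquad H|T_3\rangle = \frac{+\omega}{2}|T_3\rangle
\]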
Hence, we can only measure two values for the energy:
\[
E_{\mathrm{sing}} = \frac{-3\omega}{2}; \qquad E_{T_1} = E_{T_2} = E_{T_3} = \frac{+\omega}{2}
\]
37 This equation corresponds exactly to what this step describes.
38 If unsure, compute the respective norms, which are derived from the inner product, ∥|Ψ⟩∥ := √⟨Ψ|Ψ⟩, and recall
that the inner product between two vectors is zero iff said vectors are orthogonal.
39 If this is unclear, you can refer to the beginning of this chapter (6), where we explore how the combined vector space
was built.
And our eigenvectors are:
|sing⟩, |T1 ⟩, |T2 ⟩, |T3 ⟩
At this point, we’ve reached the end of step 3 of the recipe for a Schrödinger Ket recalled earlier. We’re
now ready to follow through the other steps, by varying the initial state. Let’s start as suggested with
|Ψuu (0)⟩ = |uu⟩: we’re trying to rewrite this initial state vector in the basis of eigenvectors of our
observable (our Hamiltonian).
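\[
\begin{aligned}
\alpha_{T_2}(0) &:= \langle T_2|\Psi_{uu}(0)\rangle & \alpha_{T_3}(0) &:= \langle T_3|\Psi_{uu}(0)\rangle \\
&= \langle T_2|uu\rangle & &= \langle T_3|uu\rangle \\
&= \frac{1}{\sqrt{2}}\left(\langle uu| + \langle dd|\right)|uu\rangle & &= \frac{1}{\sqrt{2}}\left(\langle uu| - \langle dd|\right)|uu\rangle \\
&= \frac{1}{\sqrt{2}} & &= \frac{1}{\sqrt{2}}
\end{aligned}
\]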
Hence we can rewrite (step 5.) |Ψuu (0)⟩ = |uu⟩ in the eigenbasis:
\[
|\Psi_{uu}(0)\rangle = |uu\rangle = \sum_j \alpha_j(0)|E_j\rangle = \frac{1}{\sqrt{2}}\left(|T_2\rangle + |T_3\rangle\right)
\]
And from a previous equation (4.30) we can find the evolution over time of our state:
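\[
|\Psi_{uu}(t)\rangle = \sum_j \alpha_j(0)\exp\left(-\frac{i}{\hbar}E_j t\right)|E_j\rangle
\]
That is:
\[
|\Psi_{uu}(t)\rangle = \frac{1}{\sqrt{2}}\exp\left(-\frac{\omega i}{2\hbar}t\right)\left(|T_2\rangle + |T_3\rangle\right)
\]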
Let’s repeat the exact same process, but this time with an initial state |Ψud (0)⟩ = |ud⟩. I’ll just perform
the computation, you can refer to the previous steps if need be.
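\[
\begin{aligned}
\alpha_{T_2}(0) &:= \langle T_2|\Psi_{ud}(0)\rangle & \alpha_{T_3}(0) &:= \langle T_3|\Psi_{ud}(0)\rangle \\
&= \langle T_2|ud\rangle & &= \langle T_3|ud\rangle \\
&= \frac{1}{\sqrt{2}}\left(\langle uu| + \langle dd|\right)|ud\rangle & &= \frac{1}{\sqrt{2}}\left(\langle uu| - \langle dd|\right)|ud\rangle \\
&= 0 & &= 0
\end{aligned}
\]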
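But:
\[
|\Psi_{ud}(t)\rangle = \sum_j \alpha_j(0)\exp\left(-\frac{i}{\hbar}E_j t\right)|E_j\rangle
\]
So:
\[
|\Psi_{ud}(t)\rangle = \frac{1}{\sqrt{2}}\left(\exp\left(\frac{3\omega i}{2\hbar}t\right)|\mathrm{sing}\rangle + \exp\left(-\frac{\omega i}{2\hbar}t\right)|T_1\rangle\right)
\]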
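\[
\begin{aligned}
\alpha_{T_2}(0) &:= \langle T_2|\Psi_{du}(0)\rangle & \alpha_{T_3}(0) &:= \langle T_3|\Psi_{du}(0)\rangle \\
&= \langle T_2|du\rangle & &= \langle T_3|du\rangle \\
&= \frac{1}{\sqrt{2}}\left(\langle uu| + \langle dd|\right)|du\rangle & &= \frac{1}{\sqrt{2}}\left(\langle uu| - \langle dd|\right)|du\rangle \\
&= 0 & &= 0
\end{aligned}
\]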
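But:
\[
|\Psi_{du}(t)\rangle = \sum_j \alpha_j(0)\exp\left(-\frac{i}{\hbar}E_j t\right)|E_j\rangle
\]
So:
\[
|\Psi_{du}(t)\rangle = \frac{1}{\sqrt{2}}\left(\exp\left(-\frac{\omega i}{2\hbar}t\right)|T_1\rangle - \exp\left(\frac{3\omega i}{2\hbar}t\right)|\mathrm{sing}\rangle\right)
\]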
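So, using E_{T2} = E_{T3} = ω/2:
\[
|\Psi_{dd}(t)\rangle = \frac{1}{\sqrt{2}}\exp\left(-\frac{\omega i}{2\hbar}t\right)\left(|T_2\rangle - |T_3\rangle\right)
\]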
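The whole recipe can be condensed into a few lines of code. The following plain-Python sketch applies steps 4 to 7 to an arbitrary initial state written in the ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩}; the function name evolve and the default values ω = ℏ = 1 are assumptions made for the illustration.

```python
import cmath
import math

S = 1 / math.sqrt(2)
# (energy in units of omega, eigenvector components) for sing, T1, T2, T3
EIGEN = [
    (-1.5, [0, S, -S, 0]),
    (0.5, [0, S, S, 0]),
    (0.5, [S, 0, 0, S]),
    (0.5, [S, 0, 0, -S]),
]

def evolve(psi0, t, omega=1.0, hbar=1.0):
    """Return |Psi(t)> given |Psi(0)>, following steps 4-7 of the recipe."""
    psi_t = [0j] * 4
    for energy, vec in EIGEN:
        # step 4: alpha_j(0) = <E_j|Psi(0)>; the eigenvectors are real here,
        # so no complex conjugation is needed
        alpha0 = sum(v * p for v, p in zip(vec, psi0))
        # step 7: attach the phase exp(-i E_j t / hbar)
        phase = cmath.exp(-1j * energy * omega * t / hbar)
        for k in range(4):
            psi_t[k] += alpha0 * phase * vec[k]
    return psi_t

uu_t = evolve([1, 0, 0, 0], 0.7)
ud_t = evolve([0, 1, 0, 0], math.pi / 2)
print([round(abs(c) ** 2, 6) for c in uu_t])
print([round(abs(c) ** 2, 6) for c in ud_t])
```

With these conventions, |uu⟩ only picks up a global phase, whereas |ud⟩ has fully turned into |du⟩ (up to a phase) at t = πℏ/(2ω), since the singlet and |T1 ⟩ components accumulate a relative phase of 2ωt/ℏ.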
7 More on Entanglement
7.1 Mathematical Interlude: Tensor Products in Component Form
7.1.1 Building Tensor Product Matrices from Basic Principles
7.1.2 Building Tensor Product Matrices from Component Matrices
Exercise 29. Write the tensor product I ⊗ τx as a matrix, and apply that matrix to each of the |uu⟩,
|ud⟩, |du⟩, and |dd⟩ column vectors. Show that Alice’s half of the state-vector is unchanged in each case.
Recall that I is the 2 × 2 unit matrix.
Recall that τx is a Pauli matrix, while I really is the identity matrix:
\[
\tau_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]
We saw two different ways of building I ⊗ τx . Let’s start with the first one: consider the usual or-
dered basis of the underlying composite space: {|uu⟩, |ud⟩, |du⟩, |dd⟩}. Then, the elements of the matrix
representation of I ⊗ τx in this basis are given by:
(I ⊗ τx )ab,cd = ⟨ab|(I ⊗ τx )|cd⟩
We can then use the multiplication table from either the appendix or from L06E04, where, remember,
τx in this multiplication table was a shortcut notation for I ⊗ τx .
τx |uu⟩ = |ud⟩; τx |ud⟩ = |uu⟩
τx |du⟩ = |dd⟩; τx |dd⟩ = |du⟩
And we’re now ready to evaluate the operator’s matrix form:
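\[
\begin{aligned}
I\otimes\tau_x \;\text{``=''}\; &\begin{pmatrix}
\langle uu|(I\otimes\tau_x)|uu\rangle & \langle uu|(I\otimes\tau_x)|ud\rangle & \langle uu|(I\otimes\tau_x)|du\rangle & \langle uu|(I\otimes\tau_x)|dd\rangle \\
\langle ud|(I\otimes\tau_x)|uu\rangle & \langle ud|(I\otimes\tau_x)|ud\rangle & \langle ud|(I\otimes\tau_x)|du\rangle & \langle ud|(I\otimes\tau_x)|dd\rangle \\
\langle du|(I\otimes\tau_x)|uu\rangle & \langle du|(I\otimes\tau_x)|ud\rangle & \langle du|(I\otimes\tau_x)|du\rangle & \langle du|(I\otimes\tau_x)|dd\rangle \\
\langle dd|(I\otimes\tau_x)|uu\rangle & \langle dd|(I\otimes\tau_x)|ud\rangle & \langle dd|(I\otimes\tau_x)|du\rangle & \langle dd|(I\otimes\tau_x)|dd\rangle
\end{pmatrix} \\
\text{``=''}\; &\begin{pmatrix}
\langle uu|ud\rangle & \langle uu|uu\rangle & \langle uu|dd\rangle & \langle uu|du\rangle \\
\langle ud|ud\rangle & \langle ud|uu\rangle & \langle ud|dd\rangle & \langle ud|du\rangle \\
\langle du|ud\rangle & \langle du|uu\rangle & \langle du|dd\rangle & \langle du|du\rangle \\
\langle dd|ud\rangle & \langle dd|uu\rangle & \langle dd|dd\rangle & \langle dd|du\rangle
\end{pmatrix} \\
\text{``=''}\; &\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}
\end{aligned}
\]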
Let’s move on to the second way, which consists in using Eq. 7.6 of the book:
\[
A\otimes B = \begin{pmatrix} A_{11}B & A_{12}B \\ A_{21}B & A_{22}B \end{pmatrix}
\]
Which then yields:
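\[
I\otimes\tau_x \;\text{``=''}\; \begin{pmatrix} 1\times\tau_x & 0\times\tau_x \\ 0\times\tau_x & 1\times\tau_x \end{pmatrix}
\;\text{``=''}\; \begin{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \\
\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} & \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\end{pmatrix}
\;\text{``=''}\; \begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}
\]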
Which is exactly what we’ve found earlier, albeit less tediously.
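In our usual ordered basis {|uu⟩, |ud⟩, |du⟩, |dd⟩}, the column representations of the basis vectors are as
follows:
\[
|uu\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}; \quad
|ud\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}; \quad
|du\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}; \quad
|dd\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]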
Remark 18. Remember that the column notation is merely a syntactical shortcut over linear combinations
of the basis vectors:
\[
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} := a|uu\rangle + b|ud\rangle + c|du\rangle + d|dd\rangle
\]
Remark 19. Note that we could also have used, as the authors did in the book, Eq. 7.6 to derive them.
Then it’s just a matter of computing some elementary matrix × vector products. As a shortcut, one can
also recall from one’s linear algebra class that such products, when they involve basis vectors, are simply
a matter of extracting the columns of the matrix (which is fairly trivial to see):
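\[
(I\otimes\tau_x)|uu\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = |ud\rangle; \qquad
(I\otimes\tau_x)|ud\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = |uu\rangle;
\]
\[
(I\otimes\tau_x)|du\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = |dd\rangle; \qquad
(I\otimes\tau_x)|dd\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = |du\rangle
\]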
Remark 20. Naturally, this is consistent with the multiplication table we’ve recalled earlier; and Alice’s
part of the state is indeed kept unchanged, as expected.
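The construction via Eq. 7.6 and the four matrix × vector products are easy to reproduce in code. The sketch below uses plain Python lists; kron and matvec are illustrative helper names, and the basis ordering {|uu⟩, |ud⟩, |du⟩, |dd⟩} is the one used above.

```python
I2 = [[1, 0], [0, 1]]
TAU_X = [[0, 1], [1, 0]]

def kron(A, B):
    """Kronecker product (Eq. 7.6 generalised to nested lists)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

BASIS = {
    "uu": [1, 0, 0, 0],
    "ud": [0, 1, 0, 0],
    "du": [0, 0, 1, 0],
    "dd": [0, 0, 0, 1],
}

I_tau_x = kron(I2, TAU_X)
images = {name: matvec(I_tau_x, vec) for name, vec in BASIS.items()}

for name, image in images.items():
    # find which basis ket the image coincides with
    target = next(k for k, v in BASIS.items() if v == image)
    print(f"(I x tau_x)|{name}> = |{target}>")
```

In each printed line the first letter (Alice's half) is the same on both sides, while the second letter (Bob's half) is flipped.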
Exercise 30. Calculate the matrix elements of σz ⊗ τx by forming inner products as we did in Eq. 7.2.
This is essentially the same exercise as the previous one, but with a different composite operator. To
check for errors, I’ll still do the computation using the two approaches.
We’ll start with the approach suggested in the exercise’s statement: let’s first start by recalling the
portion of interest from the multiplication table computed in L06E04:
Then, Eq. 7.2 applied to σz ⊗ τx will give:
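\[
\begin{aligned}
\sigma_z\otimes\tau_x \;\text{``=''}\; &\begin{pmatrix}
\langle uu|(\sigma_z\otimes\tau_x)|uu\rangle & \langle uu|(\sigma_z\otimes\tau_x)|ud\rangle & \langle uu|(\sigma_z\otimes\tau_x)|du\rangle & \langle uu|(\sigma_z\otimes\tau_x)|dd\rangle \\
\langle ud|(\sigma_z\otimes\tau_x)|uu\rangle & \langle ud|(\sigma_z\otimes\tau_x)|ud\rangle & \langle ud|(\sigma_z\otimes\tau_x)|du\rangle & \langle ud|(\sigma_z\otimes\tau_x)|dd\rangle \\
\langle du|(\sigma_z\otimes\tau_x)|uu\rangle & \langle du|(\sigma_z\otimes\tau_x)|ud\rangle & \langle du|(\sigma_z\otimes\tau_x)|du\rangle & \langle du|(\sigma_z\otimes\tau_x)|dd\rangle \\
\langle dd|(\sigma_z\otimes\tau_x)|uu\rangle & \langle dd|(\sigma_z\otimes\tau_x)|ud\rangle & \langle dd|(\sigma_z\otimes\tau_x)|du\rangle & \langle dd|(\sigma_z\otimes\tau_x)|dd\rangle
\end{pmatrix} \\
\text{``=''}\; &\begin{pmatrix}
\langle uu|\sigma_z|ud\rangle & \langle uu|\sigma_z|uu\rangle & \langle uu|\sigma_z|dd\rangle & \langle uu|\sigma_z|du\rangle \\
\langle ud|\sigma_z|ud\rangle & \langle ud|\sigma_z|uu\rangle & \langle ud|\sigma_z|dd\rangle & \langle ud|\sigma_z|du\rangle \\
\langle du|\sigma_z|ud\rangle & \langle du|\sigma_z|uu\rangle & \langle du|\sigma_z|dd\rangle & \langle du|\sigma_z|du\rangle \\
\langle dd|\sigma_z|ud\rangle & \langle dd|\sigma_z|uu\rangle & \langle dd|\sigma_z|dd\rangle & \langle dd|\sigma_z|du\rangle
\end{pmatrix} \\
\text{``=''}\; &\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 \\
0 & 0 & -1 & 0
\end{pmatrix}
\end{aligned}
\]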
Let’s verify our computation using the second approach, relying on Eq. 7.6 of the book:
\[
A\otimes B = \begin{pmatrix} A_{11}B & A_{12}B \\ A_{21}B & A_{22}B \end{pmatrix}
\]
0 0 −1 0
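Both approaches can be compared mechanically. The sketch below, in plain Python, builds the matrix elements ⟨ab|(σz ⊗ τx)|cd⟩ from the action of the operator on basis kets (τx flips Bob's spin, σz contributes a sign given by Alice's spin), then compares the result with the Kronecker product of Eq. 7.6; the function and variable names are illustrative.

```python
LABELS = ["uu", "ud", "du", "dd"]

def apply_sz_tx(label):
    """Action of sigma_z tensor tau_x on a basis ket: returns (sign, new label)."""
    alice, bob = label
    sign = 1 if alice == "u" else -1       # sigma_z on Alice's half
    flipped = "d" if bob == "u" else "u"   # tau_x on Bob's half
    return sign, alice + flipped

# First approach: element (row, col) is <row|(sigma_z tensor tau_x)|col>
by_inner_products = []
for row in LABELS:
    line = []
    for col in LABELS:
        sign, out = apply_sz_tx(col)
        line.append(sign if out == row else 0)   # orthonormal basis kets
    by_inner_products.append(line)

# Second approach: Kronecker product
def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

SIGMA_Z = [[1, 0], [0, -1]]
TAU_X = [[0, 1], [1, 0]]
by_kron = kron(SIGMA_Z, TAU_X)

for line in by_inner_products:
    print(line)
print(by_inner_products == by_kron)
```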
b) Perform the matrix multiplications Aa and Bb on the right-hand side. Verify that each result is a
4 × 1 matrix.
e) Perform the matrix multiplication on the left-hand side, resulting in a 4 × 1 column vector. Each row
should be the sum of four separate terms.
f ) Finally, verify that the resulting column vectors on the left and right sides are identical.
Recall Eq. 7.10
(A ⊗ B)(a ⊗ b) = (Aa ⊗ Bb)
And Eq. 7.7 and 7.8:
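\[
A\otimes B = \begin{pmatrix}
A_{11}B_{11} & A_{11}B_{12} & A_{12}B_{11} & A_{12}B_{12} \\
A_{11}B_{21} & A_{11}B_{22} & A_{12}B_{21} & A_{12}B_{22} \\
A_{21}B_{11} & A_{21}B_{12} & A_{22}B_{11} & A_{22}B_{12} \\
A_{21}B_{21} & A_{21}B_{22} & A_{22}B_{21} & A_{22}B_{22}
\end{pmatrix}; \qquad
\begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}\otimes\begin{pmatrix} b_{11} \\ b_{21} \end{pmatrix} =
\begin{pmatrix} a_{11}b_{11} \\ a_{11}b_{21} \\ a_{21}b_{11} \\ a_{21}b_{21} \end{pmatrix}
\]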
Our goal is to prove Eq. 7.10 by following all the recommended steps. It’s a bit tedious, but otherwise
presents no major difficulties.
From Eqs. 7.7 and 7.8, we can see that all Kronecker products indeed expand to 4 × 1 matrices.
Equation 7.10 is then equivalent to:
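\[
\left(\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\otimes\begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}\right)
\left(\begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}\otimes\begin{pmatrix} b_{11} \\ b_{21} \end{pmatrix}\right)
= \begin{pmatrix} A_{11}a_{11} + A_{12}a_{21} \\ A_{21}a_{11} + A_{22}a_{21} \end{pmatrix}\otimes
\begin{pmatrix} B_{11}b_{11} + B_{12}b_{21} \\ B_{21}b_{11} + B_{22}b_{21} \end{pmatrix}
\]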
c), d), e), f) I’ll be mixing all those steps together, because this is fairly trivial. First, A ⊗ B and a ⊗ b
are respectively Eqs. 7.7 and 7.8. This gives us already:
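\[
\begin{pmatrix}
A_{11}B_{11} & A_{11}B_{12} & A_{12}B_{11} & A_{12}B_{12} \\
A_{11}B_{21} & A_{11}B_{22} & A_{12}B_{21} & A_{12}B_{22} \\
A_{21}B_{11} & A_{21}B_{12} & A_{22}B_{11} & A_{22}B_{12} \\
A_{21}B_{21} & A_{21}B_{22} & A_{22}B_{21} & A_{22}B_{22}
\end{pmatrix}
\begin{pmatrix} a_{11}b_{11} \\ a_{11}b_{21} \\ a_{21}b_{11} \\ a_{21}b_{21} \end{pmatrix}
= \begin{pmatrix} A_{11}a_{11} + A_{12}a_{21} \\ A_{21}a_{11} + A_{22}a_{21} \end{pmatrix}\otimes
\begin{pmatrix} B_{11}b_{11} + B_{12}b_{21} \\ B_{21}b_{11} + B_{22}b_{21} \end{pmatrix}
\]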
It remains to expand the last Kronecker product, for which we can use 7.8:
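\[
\begin{aligned}
\begin{pmatrix}
A_{11}B_{11} & A_{11}B_{12} & A_{12}B_{11} & A_{12}B_{12} \\
A_{11}B_{21} & A_{11}B_{22} & A_{12}B_{21} & A_{12}B_{22} \\
A_{21}B_{11} & A_{21}B_{12} & A_{22}B_{11} & A_{22}B_{12} \\
A_{21}B_{21} & A_{21}B_{22} & A_{22}B_{21} & A_{22}B_{22}
\end{pmatrix}
\begin{pmatrix} a_{11}b_{11} \\ a_{11}b_{21} \\ a_{21}b_{11} \\ a_{21}b_{21} \end{pmatrix}
&= \begin{pmatrix}
(A_{11}a_{11} + A_{12}a_{21})(B_{11}b_{11} + B_{12}b_{21}) \\
(A_{11}a_{11} + A_{12}a_{21})(B_{21}b_{11} + B_{22}b_{21}) \\
(A_{21}a_{11} + A_{22}a_{21})(B_{11}b_{11} + B_{12}b_{21}) \\
(A_{21}a_{11} + A_{22}a_{21})(B_{21}b_{11} + B_{22}b_{21})
\end{pmatrix} \\
&= \begin{pmatrix}
A_{11}B_{11}a_{11}b_{11} + A_{11}B_{12}a_{11}b_{21} + A_{12}B_{11}a_{21}b_{11} + A_{12}B_{12}a_{21}b_{21} \\
A_{11}B_{21}a_{11}b_{11} + A_{11}B_{22}a_{11}b_{21} + A_{12}B_{21}a_{21}b_{11} + A_{12}B_{22}a_{21}b_{21} \\
A_{21}B_{11}a_{11}b_{11} + A_{21}B_{12}a_{11}b_{21} + A_{22}B_{11}a_{21}b_{11} + A_{22}B_{12}a_{21}b_{21} \\
A_{21}B_{21}a_{11}b_{11} + A_{21}B_{22}a_{11}b_{21} + A_{22}B_{21}a_{21}b_{11} + A_{22}B_{22}a_{21}b_{21}
\end{pmatrix}
\end{aligned}
\]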
And it’s now trivial to verify that this holds, as expected.
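A quick numerical spot check of Eq. 7.10 is also possible. The plain-Python sketch below picks arbitrary integer entries for A, B, a and b (values chosen for this illustration only) and compares both sides; it is of course no substitute for the symbolic proof above.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]

def kron_vec(a, b):
    """Kronecker product of two column vectors given as flat lists (Eq. 7.8)."""
    return [x * y for x in a for y in b]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Arbitrary sample values, not taken from the book
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
a = [5, 6]
b = [7, -8]

lhs = matvec(kron(A, B), kron_vec(a, b))        # (A tensor B)(a tensor b)
rhs = kron_vec(matvec(A, a), matvec(B, b))      # (Aa) tensor (Bb)
print(lhs, rhs, lhs == rhs)
```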
7.2 Mathematical Interlude: Outer Products
7.3 Density Matrices: A New Tool
7.4 Entanglement and Density Matrices
7.5 Entanglement for Two Spins
Exercise 32. Calculate the density matrix for:
Answer:
ψ(u) = α; ψ ∗ (u) = α∗
ψ(d) = β; ψ ∗ (d) = β ∗
\[
\rho_{a'a} = \begin{pmatrix} \alpha^*\alpha & \alpha^*\beta \\ \beta^*\alpha & \beta^*\beta \end{pmatrix}
\]
Now try plugging in some numbers for α and β. Make sure they are normalized to 1. For example,
\[
\alpha = \frac{1}{\sqrt{2}}, \quad \beta = \frac{1}{\sqrt{2}}.
\]
2 2
Start by recalling the definition of the density matrix for a single spin in a known state:
Now we have no wave function ψ in the exercise statement (the answer set aside), but we can find it by
identification with the general form of |Ψ⟩:
\[
|\Psi\rangle = \sum_{a,b,c,\ldots}\psi(a,b,c,\ldots)\,|a,b,c,\ldots\rangle
\]
Hence, ψ(u) is the component of |Ψ⟩ following the |u⟩ axis, and ψ(d) the one on the |d⟩ axis:
Immediately:
ψ ∗ (u) = α∗ ; ψ ∗ (d) = β ∗ ;
Then it’s just about packaging all the ρaa′ in a matrix: the basis is ordered ({|u⟩, |d⟩}) hence:
Remark 21. We could also use the fact that the density operator is defined as a linear combination
of projectors corresponding to the potential states of the system, each scaled by a probability, such that
the sum of those probabilities is 1, e.g.:
\[
\rho = \sum_i P_i|\psi_i\rangle\langle\psi_i|; \quad \text{where: } \sum_i P_i = 1
\]
As we’re in the case of a single spin in a known state |Ψ⟩, this reduces to
ρ = 1|Ψ⟩⟨Ψ| = |Ψ⟩⟨Ψ|
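Assuming again the ordered basis {|u⟩, |d⟩}, we can write |Ψ⟩ in column form and ⟨Ψ| in row form, and
perform the outer-product:
\[
\rho = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}\begin{pmatrix} \alpha^* & \beta^* \end{pmatrix}
= \begin{pmatrix} \alpha\alpha^* & \alpha\beta^* \\ \beta\alpha^* & \beta\beta^* \end{pmatrix}
\]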
This allows us to double-check our previous result: it seems there’s a typo in the exercise statement.
Let’s compute a few density matrices for well-known states:
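\[
\begin{aligned}
|u\rangle &= 1|u\rangle + 0|d\rangle &&\Rightarrow \rho_{|u\rangle} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \\
|d\rangle &= 0|u\rangle + 1|d\rangle &&\Rightarrow \rho_{|d\rangle} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \\
|r\rangle &= \frac{1}{\sqrt{2}}|u\rangle + \frac{1}{\sqrt{2}}|d\rangle &&\Rightarrow \rho_{|r\rangle} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} \\
|l\rangle &= \frac{1}{\sqrt{2}}|u\rangle - \frac{1}{\sqrt{2}}|d\rangle &&\Rightarrow \rho_{|l\rangle} = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix} \\
|i\rangle &= \frac{1}{\sqrt{2}}|u\rangle + \frac{i}{\sqrt{2}}|d\rangle &&\Rightarrow \rho_{|i\rangle} = \begin{pmatrix} 1/2 & -i/2 \\ i/2 & 1/2 \end{pmatrix} \\
|o\rangle &= \frac{1}{\sqrt{2}}|u\rangle - \frac{i}{\sqrt{2}}|d\rangle &&\Rightarrow \rho_{|o\rangle} = \begin{pmatrix} 1/2 & i/2 \\ -i/2 & 1/2 \end{pmatrix}
\end{aligned}
\]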
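These matrices can be reproduced with a few lines of code. The sketch below, in plain Python, computes ρ = |Ψ⟩⟨Ψ| as an outer product in the ordered basis {|u⟩, |d⟩}; the function name density is an assumption made for this illustration.

```python
import math

def density(alpha, beta):
    """Outer product |Psi><Psi| for |Psi> = alpha|u> + beta|d>."""
    psi = [complex(alpha), complex(beta)]
    # entry (row, col) is psi_row * conj(psi_col)
    return [[x * y.conjugate() for y in psi] for x in psi]

s = 1 / math.sqrt(2)
STATES = {
    "u": (1, 0), "d": (0, 1),
    "r": (s, s), "l": (s, -s),
    "i": (s, 1j * s), "o": (s, -1j * s),
}

densities = {name: density(a, b) for name, (a, b) in STATES.items()}
for name, rho in densities.items():
    print(name, rho)
```

Each matrix should have unit trace and be equal to its own conjugate transpose, which is a cheap sanity check on any hand computation.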
The French version of this exercise40 is a bit more interesting: there are a few additional questions. We
can for instance check that ρ is Hermitian:
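\[
\rho^\dagger = (\rho^*)^T
= \begin{pmatrix} (\alpha^*\alpha)^* & (\beta^*\alpha)^* \\ (\alpha^*\beta)^* & (\beta^*\beta)^* \end{pmatrix}^T
= \begin{pmatrix} \alpha\alpha^* & \beta\alpha^* \\ \alpha\beta^* & \beta\beta^* \end{pmatrix}^T
= \begin{pmatrix} \alpha^*\alpha & \beta^*\alpha \\ \alpha^*\beta & \beta^*\beta \end{pmatrix} =: \rho
\]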
Tr(ρ) = α∗ α + β ∗ β = 1
Finally, we can check that ρ projects to |Ψ⟩. Consider a vector which has a component perpendicular to
|Ψ⟩, that is, in the direction of |Ψ⊥ ⟩, and a component in the direction of |Ψ⟩
By linearity:
ρ|Φ⟩ = γρ|Ψ⊥ ⟩ + δρ|Ψ⟩
Using the fact that ρ = |Ψ⟩⟨Ψ|, we see, by associativity on the products, and by the orthogonality
condition between |Ψ⟩ and |Ψ⊥ ⟩:
By injecting the two previous results in the one before, it follows that ρ indeed projects a vector on
the |Ψ⟩ direction:
ρ|Φ⟩ = δ|Ψ⟩
Exercise 33. a) Show that
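\[
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}^2 = \begin{pmatrix} a^2 & 0 \\ 0 & b^2 \end{pmatrix}
\]
b) Now, suppose
\[
\rho = \begin{pmatrix} 1/3 & 0 \\ 0 & 2/3 \end{pmatrix}
\]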
40 See [Link] for a relevant excerpt, which by the
Calculate
\[ \rho^2 \]
\[ \mathrm{Tr}(\rho) \]
\[ \mathrm{Tr}(\rho^2) \]
c) If ρ is a density matrix, does it represent a pure state or a mixed state?
a)
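\[
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}^2
= \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
= \begin{pmatrix} a^2 & 0 \\ 0 & b^2 \end{pmatrix}
\]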
Recall that there’s a result alluded to by the authors in a footnote page 195 (section 7.2) that the trace
of an operator is the sum of the diagonal elements of any matrix representation of this operator. Hence:
\[
\mathrm{Tr}(\rho) = \frac{1}{3} + \frac{2}{3} = 1; \qquad \mathrm{Tr}(\rho^2) = \frac{1}{9} + \frac{4}{9} = \frac{5}{9}
\]
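The arithmetic of part b) can be double-checked with exact fractions. This is a minimal plain-Python sketch; matmul and trace are illustrative helper names.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product on nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

rho = [[Fraction(1, 3), 0], [0, Fraction(2, 3)]]
rho2 = matmul(rho, rho)

print(rho2)
print(trace(rho), trace(rho2))   # 1 and 5/9
```

Since Tr(ρ²) comes out strictly smaller than 1 and ρ² differs from ρ, the numbers already hint at the answer to part c).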
c) We just saw in the book some properties of density matrices. In particular, for a pure state with
density matrix ρ, we must have:
ρ² = ρ and Tr(ρ²) = 1
While for a mixed state, we must have:
Exercise 34. Use Eq. 7.22 to show that if ρ is a density matrix, then
Tr(ρ) = 1.
Well, there will be one P (a) for each eigenvalue a, and thus by Eq. 7.22, there is a systematic
correspondence with the diagonal elements of the density matrix. But the trace of an operator is defined as the
sum of the diagonal elements of a matrix representation of this operator, and it so happens that this
value is invariant under a change of basis (meaning, the trace of an operator is the same for all matrix
representations of this operator).
Hence, because the eigenvalues a represent all the potential measurement values, we know that
\(\sum_a P(a) = 1\), which by our previous reasoning implies indeed that
\[
\mathrm{Tr}(\rho) := \sum_a \rho_{aa} = \sum_a P(a) = 1
\]
7.6 A Concrete Example: Calculating Alice’s Density Matrix
Exercise 35. Use Eq. 7.24 to calculate ρ2 . How does this result confirm that ρ represents an entangled
state? We’ll soon discover that there are other ways to check for entanglement.
Here’s Eq. 7.24:
\[
\rho = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}
\]
From there it’s trivial to see that:
\[
\rho^2 = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}^2 = \begin{pmatrix} 1/4 & 0 \\ 0 & 1/4 \end{pmatrix}
\]
The authors gave earlier, at the end of section 7.5, a criterion to determine whether a density matrix
corresponds to an entangled state or not: for a pure state with density matrix ρ, we must have:
ρ² = ρ and Tr(ρ²) = 1
Let’s start with |ψ1 ⟩. We know Alice’s matrix must be of the form:
\[
\rho_A = \begin{pmatrix} \rho_{uu} & \rho_{ud} \\ \rho_{du} & \rho_{dd} \end{pmatrix}
\]
And so must be Bob’s actually. Filling in with our previous formulas, we obtain:
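\[
\begin{aligned}
\rho_{1A} &= \begin{pmatrix}
\psi_1^*(u,u)\psi_1(u,u) + \psi_1^*(u,d)\psi_1(u,d) & \psi_1^*(d,u)\psi_1(u,u) + \psi_1^*(d,d)\psi_1(u,d) \\
\psi_1^*(u,u)\psi_1(d,u) + \psi_1^*(u,d)\psi_1(d,d) & \psi_1^*(d,u)\psi_1(d,u) + \psi_1^*(d,d)\psi_1(d,d)
\end{pmatrix} \\
&= \begin{pmatrix}
(1/2)(1/2) + (1/2)(1/2) & (1/2)(1/2) + (1/2)(1/2) \\
(1/2)(1/2) + (1/2)(1/2) & (1/2)(1/2) + (1/2)(1/2)
\end{pmatrix} \\
&= \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}
\end{aligned}
\]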
Where, remember, the wave function’s values correspond to the basis vector coefficients, which are all
1/2 here. By symmetry, we would obtain exactly the same matrix for Bob:
\[
\rho_{1B} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}
\]
• Clearly, ρ1A = ρ1B is Hermitian;
• Its trace is 1/2 + 1/2 = 1, as expected;
• Let’s compute its square:
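\[
\rho_{1A}^2 = \rho_{1B}^2 = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}\begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}
= \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} = \rho_{1A} = \rho_{1B}
\]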
And Tr(ρ21A ) = Tr(ρ21B ) = 1, from which we can conclude that ψ1 is a pure state.
• Without having to compute them explicitly, this implies that its eigenvalues must be 0 and 1.
Let’s compute the eigenvalues by partially diagonalizing the matrix anyway for practice: an eigenvector
|λ⟩ is tied to an eigenvalue λ by:
Because an eigenvector is by definition non-zero, this implies that ρ1A − λI must be non-invertible41 .
This implies that:
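\[
\begin{aligned}
\det(\rho_{1A} - \lambda I) = 0 &\Leftrightarrow 0 = \begin{vmatrix} 1/2 - \lambda & 1/2 \\ 1/2 & 1/2 - \lambda \end{vmatrix}
= \left(\frac{1}{2} - \lambda\right)^2 - \left(\frac{1}{2}\right)^2
= \left(\frac{1}{2} - \lambda - \frac{1}{2}\right)\left(\frac{1}{2} - \lambda + \frac{1}{2}\right) = \lambda(\lambda - 1) \\
&\Leftrightarrow \begin{cases} \lambda = 0 \\ \lambda = 1 \end{cases}
\end{aligned}
\]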
As expected.
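The same steps can be automated for any two-spin wave function. The plain-Python sketch below builds Alice's density matrix from ψ(a, b) using ρ_{a'a} = Σ_b ψ*(a, b) ψ(a', b), then gets the eigenvalues of the resulting 2 × 2 Hermitian matrix from its trace and determinant; the function names are hypothetical, and index 0 stands for up, 1 for down.

```python
import math

def alice_density(psi):
    """rho[a'][a] = sum_b conj(psi[a][b]) * psi[a'][b], with psi[a][b] = psi(a, b)."""
    return [[sum(complex(psi[a][b]).conjugate() * psi[ap][b] for b in range(2))
             for a in range(2)] for ap in range(2)]

def eigenvalues_2x2(rho):
    """Eigenvalues of a 2x2 Hermitian matrix: roots of l^2 - Tr*l + det = 0."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)    # guard against rounding below zero
    root = math.sqrt(disc)
    return sorted([(tr - root) / 2, (tr + root) / 2])

# psi_1: all four coefficients equal to 1/2
psi1 = [[0.5, 0.5], [0.5, 0.5]]
rho_1a = alice_density(psi1)
print(rho_1a)
print(eigenvalues_2x2(rho_1a))
```

For this state the eigenvalues come out as 0 and 1, matching the characteristic polynomial λ(λ − 1) = 0.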
Again, by a symmetry argument, we can already conclude that ρ2B = ρ2A (the idea is that you can swap
the labels corresponding to Bob and Alice in the description of the state ψ2 and by reordering the terms,
you see that the state is unchanged).
4. The matrix is diagonal: clearly, all its eigenvalues (there’s a single degenerate eigenvalue, 1/2) are
positive and ≤ 1.
41 For otherwise, multiply both sides of the equation by its inverse: LHS is equal to |λ⟩ while the RHS is still equal to 0
Moving on to the last one. Observe that this time, there is no symmetry between Alice’s and Bob’s
matrices, so we’ll have to compute them both.
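\[
\begin{aligned}
\rho_{3A} &= \begin{pmatrix}
\psi_3^*(u,u)\psi_3(u,u) + \psi_3^*(u,d)\psi_3(u,d) & \psi_3^*(d,u)\psi_3(u,u) + \psi_3^*(d,d)\psi_3(u,d) \\
\psi_3^*(u,u)\psi_3(d,u) + \psi_3^*(u,d)\psi_3(d,d) & \psi_3^*(d,u)\psi_3(d,u) + \psi_3^*(d,d)\psi_3(d,d)
\end{pmatrix} \\
&= \begin{pmatrix}
(3/5)(3/5) + (4/5)(4/5) & (0)(3/5) + (0)(4/5) \\
(3/5)(0) + (4/5)(0) & (0)(0) + (0)(0)
\end{pmatrix} \\
&= \begin{pmatrix} 9/25 + 16/25 & 0 \\ 0 & 0 \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\end{aligned}
\]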
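\[
\begin{aligned}
&\Leftrightarrow \lambda^2 - \lambda + \frac{9\times 16}{25^2} - \left(\frac{12}{25}\right)^2 = 0 \\
&\Leftrightarrow \lambda^2 - \lambda + \frac{3\times 3\times 4\times 4}{25^2} - \frac{3\times 4\times 3\times 4}{25^2} = 0 \\
&\Leftrightarrow \lambda(\lambda - 1) = 0 \\
&\Leftrightarrow \begin{cases} \lambda = 0 \\ \lambda = 1 \end{cases}
\end{aligned}
\]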
Remember that the authors proved43 that the expected value ⟨L⟩ of an observable L, for a system in a
state |Ψ⟩, is:
⟨L⟩ = ⟨Ψ|L|Ψ⟩
Remark 22. There’s an issue in this first derivation, reported by Jannis Koeckeritz; I’ve left it so you
can ”have fun” trying to find it on your own; the solution is in this footnote44 .
Here’s a first derivation, where we use the following formula45 defined for an observable L, and a system
described by a density matrix ρ:
⟨L⟩ = Tr(ρL)
Recall46 that for any operators A and B, even when AB ̸= BA, we still have:
Tr(AB) = Tr(BA)
We also know47 that, because we’re dealing with a product state, this can’t be a mixed state (it cannot
be expressed as a weighted sum of multiple states), i.e if we name |Ψ⟩ that (pure) product state:
ρ = |Ψ⟩⟨Ψ|
42 The authors are a bit irregular in their use of boldface for operators; I’ll try to do better, but things should be clear
Tr(AB) = Tr(BA).
45 p206, section 7.5 - Entanglement for two spins
46 p209, section 7.5 - Entanglement for two spins
47 p202, section 7.5 - Entanglement for two spins
48 p207, section 7.5 - Entanglement for two spins
It follows that:
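\[
\begin{aligned}
C(A, B) &:= \langle AB\rangle - \langle A\rangle\langle B\rangle \\
&= \mathrm{Tr}(\rho AB) - \langle A\rangle\langle B\rangle \\
&= \mathrm{Tr}(\rho^2 AB) - \langle A\rangle\langle B\rangle \\
&= \mathrm{Tr}(\rho(A\rho B)) - \langle A\rangle\langle B\rangle \\
&= \langle A\rho B\rangle - \langle A\rangle\langle B\rangle \\
&= \langle\Psi|A\rho B|\Psi\rangle - \langle A\rangle\langle B\rangle \\
&= \langle\Psi|A\rho B|\Psi\rangle - \langle\Psi|A\underbrace{|\Psi\rangle\langle\Psi|}_{\rho}B|\Psi\rangle \\
&= 0
\end{aligned}
\]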
We start by expressing the expectation value in terms of an inner-product again, assuming we start in
the state |Ψ⟩:
⟨AB⟩ = ⟨Ψ|AB|Ψ⟩
Then, recall that A and B are two observables, respectively from Alice’s and Bob’s state spaces, which
have been extended, as previously studied, so as to be able to act on a state vector |Ψ⟩, taken from the
composite system SAB .
We definitely need this to be able to express the correlation C(A, B) in terms of those inner-products,
for otherwise, the second terms in the equation below applying A or B to |Ψ⟩ wouldn’t make any sense:
Hence there’s an abuse of notation: with IX being the identity operator on the space SX :
A ” = ” A ⊗ IB ; B ” = ” IA ⊗ B
For clarity, I’ll note AA the observable A expressed in the system SA , and similarly for BB :
A = A A ⊗ IB ; B = IA ⊗ B B
Regarding |Ψ⟩, this is a product state, and we know49 that it can be expressed as a tensor product of a
state in SA and of a state in SB :
|Ψ⟩ = |ψ⟩ ⊗ |ϕ⟩
We can then rewrite:
⟨AB⟩ = ⟨Ψ|AB|Ψ⟩
= (⟨ψ| ⊗ ⟨ϕ|) AB (|ψ⟩ ⊗ |ϕ⟩)
= (⟨ψ| ⊗ ⟨ϕ|) A ((IA ⊗ BB ) (|ψ⟩ ⊗ |ϕ⟩))
Hence clearly, C(A, B) := ⟨AB⟩ − ⟨A⟩ ⟨B⟩ = 0 .
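A numerical spot check of this result is straightforward. The plain-Python sketch below picks one arbitrary product state |ϕ⟩ ⊗ |χ⟩ (the sample amplitudes are assumptions of this illustration), lifts every pair of Pauli operators to the composite space, and records the largest correlation magnitude found; it illustrates the statement for one state and is not a proof.

```python
import math
from itertools import product

PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
I2 = [[1, 0], [0, 1]]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def expect(op, psi):
    op_psi = [sum(m * x for m, x in zip(row, psi)) for row in op]
    return sum(complex(x).conjugate() * y for x, y in zip(psi, op_psi))

s = 1 / math.sqrt(2)
phi = [0.6, 0.8j]      # Alice's normalized state (sample values)
chi = [s, -s]          # Bob's normalized state (sample values)
Psi = [a * b for a in phi for b in chi]    # product state |phi> tensor |chi>

worst = 0.0
for PA, PB in product(PAULI.values(), repeat=2):
    A = kron(PA, I2)   # A = A_A tensor I_B
    B = kron(I2, PB)   # B = I_A tensor B_B
    corr = expect(matmul(A, B), Psi) - expect(A, Psi) * expect(B, Psi)
    worst = max(worst, abs(corr))
print(worst)
```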
For completeness, here’s one last solution, rephrased from Filip Van Lijsebetten’s approach (p52), which
relies on the probabilistic definition of the average value.
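Hence:
\[
\langle AB\rangle = \sum_{ab}\lambda_{ab}P(\lambda_{ab}); \qquad
\langle A\rangle = \sum_a \lambda_a P(\lambda_a); \qquad
\langle B\rangle = \sum_b \lambda_b P(\lambda_b)
\]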
Now the notation is a bit confusing52 , but recall that λab corresponds to the value we get for our
combined state (which occurs with a probability of P (λab )). And this precisely corresponds to the fact that
we have λa in the subspace SA and λb in the subspace SB : so we can read it like λab ≃ λa λb . Hence this
factors as:
\[
C(A, B) = \sum_{a,b}\lambda_{ab}\Big(P(\lambda_{ab}) - P(\lambda_a)P(\lambda_b)\Big)
\]
Remark 23. So far, we’ve essentially just restated with a different notation what we did in L06E01
Now by definition for a product state, there is independence between the two ”events”: the measurement
of either A or B doesn’t affect the other one. That is, P (λab ) = P (λa )P (λb )53 , hence the correlation
really is zero.
Let’s recall the state-vector from 7.30, and let’s call it |Ψ⟩.
Independence_(probability_theory)#For_events
As I’ve found this confusing, let me start by recalling a bit of vocabulary54 . A quantum state can be
either pure or mixed: either it’s a single state, or a convex combination 55 of pure states. This is true
for a ”regular” state space, as well as for a state space built via a tensor product of two (or finitely many, by
induction) other state spaces.
Now there’s a second qualification, that is only applicable for states which are taken from a state space
made by glueing two (again, or finitely many) other state spaces: entangled states, and disentangled
states.
Mixed and entangled are definitely not synonymous: you can have a non-mixed (i.e. pure) entangled
state for example.
Example 1. The state vector from 7.30 is a pure state: this is not a convex combination of states. But
this tells us absolutely nothing regarding whether it’s an entangled state. We know however that it makes
sense to talk about it being entangled or not, as we’re dealing with a combined system involving (i) an
apparatus and (ii) a spin to be measured by said apparatus.
We could test this purity by computing the density matrix ρ, and checking whether ρ² = ρ or, equivalently, Tr(ρ²) = 1.
Let’s clarify the vocabulary one step further: a completely untangled state is a product state:
that’s a state where measurements on one subsystem in no way affect the other subsystem(s).
The simplest approach is to remember that a state is a product state when it can be expressed via two
components (well, or more, but we’re in the case where there are two subsystems here: the apparatus,
and the spin to be measured with the apparatus), one for each subsystem. Recall that |a, α⟩ really is a
shortcut for |a⟩ ⊗ |α}. This means, the prepared state really is:
αu |u⟩ ⊗ |b} + αd |d⟩ ⊗ |b}
But the tensor product distributes56 , hence this simplifies as:
(αu |u⟩ + αd |d⟩) ⊗ |b}
As the combined state is normalized, we must have √(|αu|² + |αd|²) = 1, which implies that the sub-state
corresponding to the spin is also normalized. Trivially, the sub-state corresponding to the apparatus is
also normalized. Hence, we’ve expressed our combined state as a tensor product of two normalized states,
one for each subsystem: this is a product state .
A slightly more involved (calculus-wise) variant of this approach would be to rely on the general form
of the product state57 and to evaluate whether our state vector can be expressed in such a way. The
general form can be computed, again using the distributive nature of the tensor product:
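\[
\begin{aligned}
|\text{product state}\rangle &= \big(\alpha_u|u\rangle + \alpha_d|d\rangle\big)\otimes\big(\beta_b|b\} + \beta_+|{+1}\} + \beta_-|{-1}\}\big) \\
&= \alpha_u|u\rangle\otimes\big(\beta_b|b\} + \beta_+|{+1}\} + \beta_-|{-1}\}\big) + \alpha_d|d\rangle\otimes\big(\beta_b|b\} + \beta_+|{+1}\} + \beta_-|{-1}\}\big) \\
&= \alpha_u\beta_b|u, b\rangle + \alpha_u\beta_+|u, +1\rangle + \alpha_u\beta_-|u, -1\rangle + \alpha_d\beta_b|d, b\rangle + \alpha_d\beta_+|d, +1\rangle + \alpha_d\beta_-|d, -1\rangle
\end{aligned}
\]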
54 See for instance: [Link]
55 A fancy term you may find here and there: a linear combination of elements, where the scalars factors sums to 1; see
[Link]
56 As is common in most Physics-centered introduction to Quantum Mechanics, the tensor product introduction is a bit
hand-wavy. For a more rigorous development, see for instance this video by F. Schuller. Some subtleties such as the fact
that the equivalence classes respect addition and scalar multiplication have been left as homework; there’s a set of notes
which contains the ”missing” proofs.
57 p164, section 6.5 - Product states
By setting:
βb = 1; β+ = β− = 0
We recover the state vector from 7.30, retrospectively justifying the notation for αu and αd . Because
the subsystem states must be normalized, the resulting combined state is also normalized.
However, and perhaps this is more in line with the authors’ intent, we’ve just seen58 two tests to check
whether the state corresponding to a given wave-function for a composite system is entangled or not.
For the first criterion, we’d need to take any two arbitrary observables, one from each subsystem, say
observables A and B, and prove that their correlation C(A, B) is zero. But essentially, the proof will end up
relying on the tensor product distributivity, rely on the density matrix (see just after), or essentially mimic the
proofs of L07E09.
The second technique is slightly more original: the idea is that, for any product state, the density matrix
has exactly one non-zero eigenvalue, and that eigenvalue is exactly 1.
Which implies that ρ − I2 λ isn’t invertible59 , which translates to its determinant being equal to zero:
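\[
\begin{aligned}
\begin{vmatrix} \alpha_u^*\alpha_u - \lambda & \alpha_u^*\alpha_d \\ \alpha_d^*\alpha_u & \alpha_d^*\alpha_d - \lambda \end{vmatrix}
&= (\alpha_u^*\alpha_u - \lambda)(\alpha_d^*\alpha_d - \lambda) - \alpha_u^*\alpha_d\alpha_d^*\alpha_u \\
&= \alpha_u^*\alpha_u\alpha_d^*\alpha_d - \lambda\underbrace{(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d)}_{=\langle\Psi|\Psi\rangle=1} + \lambda^2 - \alpha_u^*\alpha_d\alpha_d^*\alpha_u \\
&= \lambda(\lambda - 1)
\end{aligned}
\]
Clearly, we have one non-zero eigenvalue, which is exactly one: the criterion indeed applies, and the state
must be non-entangled.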
Alice’s density matrix is defined by its components in Eq. 7.2060 :
\[
\rho_{a'a} = \sum_b \psi^*(a, b)\,\psi(a', b)
\]
Where ψ(a, b) is the wave-function of the composite system, that we can extract from |Ψ⟩:
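\[
|\Psi\rangle = \psi(u,u)|uu\rangle + \psi(u,d)|ud\rangle + \psi(d,u)|du\rangle + \psi(d,d)|dd\rangle
\Rightarrow
\begin{cases}
\psi(u,u) = \psi(d,d) = 0 \\
\psi(u,d) = \sqrt{0.6} \\
\psi(d,u) = -\sqrt{0.4}
\end{cases}
\]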
Hence:
Remark 24. I’m not sure what the authors expect regarding σz ; we’re asked to verify all numerical
values in the next exercise, which likely should cover pretty much every interpretation (we’ll even have to
compute Alice’s density matrix again, so as to check ρ2 /Tr(ρ2 )).
Some numerical results have been automatically computed by an R script, inlined at the end of this exercise.
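Independently of that R script, Eq. 7.20 can be evaluated for this wave function with a short plain-Python sketch (the language and the function name alice_density are choices made for this illustration, with index 0 standing for up and 1 for down). It also evaluates Tr(ρ²) and Tr(ρ σz), two quantities one may want to compare against when checking numerical values.

```python
import math

def alice_density(psi):
    """Eq. 7.20: rho[a'][a] = sum_b conj(psi[a][b]) * psi[a'][b]."""
    return [[sum(complex(psi[a][b]).conjugate() * psi[ap][b] for b in range(2))
             for a in range(2)] for ap in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# psi[a][b] = psi(a, b): only psi(u, d) and psi(d, u) are non-zero
psi = [[0.0, math.sqrt(0.6)],
       [-math.sqrt(0.4), 0.0]]
SIGMA_Z = [[1, 0], [0, -1]]

rho_a = alice_density(psi)
purity = trace(matmul(rho_a, rho_a)).real        # Tr(rho^2)
sigma_z_mean = trace(matmul(rho_a, SIGMA_Z)).real  # Tr(rho sigma_z)
print(rho_a)
print(purity, sigma_z_mean)
```

The matrix is diagonal with entries close to 0.6 and 0.4, so Tr(ρ²) is about 0.52, strictly below 1, as expected for Alice's half of an entangled state.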
We are, as usual, in the case of two spin systems, one for Alice, SA , one for Bob, SB . A composite
system SAB is then created from those two via a tensor product.
Implicitly, the matrices are expressed in the ordered basis of the corresponding sub-system, SA for σiA
and SB for τiB 61 . We will also need the matrix forms for the lifted operators from the sub-systems to the
composite system, where I A is the identity operator on SA and I B the identity operator on SB , again
implicitly relying on SAB ’s usual ordered basis.
Remark 26. We could have used the equivalent tables provided in annexes in the book, or the results of
previous exercises.
60 p205,section 7.5 Entanglement for Two Spins
61 Strictly speaking, the operators aren’t equal; their matrix representations in their respective bases are. The equals
signs are to be understood in this context.
Remark 27. Note that the final result appears twice below: this isn’t an error. The first occurrence is a
manual computation, while the second has been automatically performed in R.
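\[
\sigma_z := \sigma_z^A\otimes I^B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\otimes\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
\]
\[
\tau_z := I^A\otimes\tau_z^B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\otimes\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}
\]
\[
\sigma_x := \sigma_x^A\otimes I^B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\otimes\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
\]
\[
\tau_x := I^A\otimes\tau_x^B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\otimes\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\]
\[
\sigma_y := \sigma_y^A\otimes I^B = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\otimes\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}
\]
\[
\tau_y := I^A\otimes\tau_y^B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\otimes\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}
= \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}
= \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}
\]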
We’ll also need a few more ”combined” observables. I’ll skip the manual matrix multiplication here:
those have been automatically computed by R:
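\[
\sigma_z\tau_z = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}; \qquad
\tau_z\sigma_z = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\]
\[
\tau_x\sigma_x = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}; \qquad
\tau_y\sigma_y = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}
\]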
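For readers without R at hand, the combined observables can be recomputed with the plain-Python sketch below (an alternative to the author's script, not a transcription of it). It lifts each Pauli matrix to the composite space with a Kronecker product, in the usual ordered basis, and multiplies the lifted operators; all names are illustrative.

```python
PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
I2 = [[1, 0], [0, 1]]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

sigma = {axis: kron(P, I2) for axis, P in PAULI.items()}   # Alice: sigma_a tensor I
tau = {axis: kron(I2, P) for axis, P in PAULI.items()}     # Bob: I tensor tau_a

sz_tz = matmul(sigma["z"], tau["z"])
tz_sz = matmul(tau["z"], sigma["z"])
tx_sx = matmul(tau["x"], sigma["x"])
ty_sy = matmul(tau["y"], sigma["y"])

for name, M in [("sz tz", sz_tz), ("tz sz", tz_sz), ("tx sx", tx_sx), ("ty sy", ty_sy)]:
    print(name, M)
```

Operators acting on different subsystems commute, which is why σz τz and τz σz give the same matrix.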
Finally, let’s recall one more time the formula for the expectation value ⟨L⟩ of an observable L of a
system in a state |Ψ⟩:
⟨L⟩ = ⟨Ψ|L|Ψ⟩
Product state
We’re starting from the following state-vector for the composite system, and using various properties of
the tensor product of vector states we progressively reach:
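\[
\begin{aligned}
|\Psi\rangle &= \alpha_u\beta_u|uu\rangle + \alpha_u\beta_d|ud\rangle + \alpha_d\beta_u|du\rangle + \alpha_d\beta_d|dd\rangle \\
&= \alpha_u\beta_u|u\rangle\otimes|u\} + \alpha_u\beta_d|u\rangle\otimes|d\} + \alpha_d\beta_u|d\rangle\otimes|u\} + \alpha_d\beta_d|d\rangle\otimes|d\} \\
&= (\alpha_u|u\rangle)\otimes(\beta_u|u\}) + (\alpha_u|u\rangle)\otimes(\beta_d|d\}) + (\alpha_d|d\rangle)\otimes(\beta_u|u\}) + (\alpha_d|d\rangle)\otimes(\beta_d|d\}) \\
&= \alpha_u|u\rangle\otimes(\beta_d|d\} + \beta_u|u\}) + \alpha_d|d\rangle\otimes(\beta_u|u\} + \beta_d|d\}) \\
&= \underbrace{(\alpha_u|u\rangle + \alpha_d|d\rangle)}_{=:|\phi\rangle}\otimes\underbrace{(\beta_d|d\} + \beta_u|u\})}_{=:|\psi\}}
\end{aligned}
\]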
We've verified that this particular composite state is a product state: it can be expressed as the tensor
product of two states, |ϕ⟩ ∈ SA and |ψ} ∈ SB.
We’ve verified consistency of the normalization condition between the composite system, and the sub-
systems: our composite state vector is normalized iff the individual vectors from the sub-systems are
normalized.
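The normalization statement can be made explicit with a short computation. This is only a sketch, using the definitions of |ϕ⟩ and |ψ} given above, and writing {ψ|ψ} for the inner product in SB:

```latex
\begin{aligned}
\langle\Psi|\Psi\rangle
&= |\alpha_u\beta_u|^2 + |\alpha_u\beta_d|^2 + |\alpha_d\beta_u|^2 + |\alpha_d\beta_d|^2 \\
&= \left(|\alpha_u|^2 + |\alpha_d|^2\right)\left(|\beta_u|^2 + |\beta_d|^2\right)
 = \langle\phi|\phi\rangle\,\{\psi|\psi\}
\end{aligned}
```

So if both sub-system vectors are normalized, then so is |Ψ⟩. Conversely, ⟨Ψ|Ψ⟩ = 1 only fixes the product of the two norms; since rescaling α by c and β by 1/c leaves |Ψ⟩ unchanged, the two factors can always be chosen normalized.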
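Moving on to the density matrices, let's do the reasoning for Alice's state only, as the same argument
applies to Bob's. Let's first stay in SA. We know that the subsystem's state is a pure state |ϕ⟩: it's a
convex combination with a single term. The density matrix ρA is thus:
\[
\rho^A = 1|\phi\rangle\langle\phi|
= \begin{pmatrix} \alpha_u \\ \alpha_d \end{pmatrix} \begin{pmatrix} \alpha_u^* & \alpha_d^* \end{pmatrix}
= \begin{pmatrix} \alpha_u\alpha_u^* & \alpha_u\alpha_d^* \\ \alpha_d\alpha_u^* & \alpha_d\alpha_d^* \end{pmatrix}
\]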
It is immediate to check that ρA is Hermitian (ρA = (ρA)†, as (αu αd∗)∗ = αd αu∗). Furthermore, by
normalization of |ϕ⟩, Tr(ρA) = αu αu∗ + αd αd∗ = 1.
We also have:
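\[
\begin{aligned}
(\rho^A)^2 &= \begin{pmatrix} \alpha_u\alpha_u^* & \alpha_u\alpha_d^* \\ \alpha_d\alpha_u^* & \alpha_d\alpha_d^* \end{pmatrix}
\begin{pmatrix} \alpha_u\alpha_u^* & \alpha_u\alpha_d^* \\ \alpha_d\alpha_u^* & \alpha_d\alpha_d^* \end{pmatrix} \\
&= \begin{pmatrix}
(\alpha_u\alpha_u^*)(\alpha_u\alpha_u^*) + (\alpha_u\alpha_d^*)(\alpha_d\alpha_u^*) & (\alpha_u\alpha_u^*)(\alpha_u\alpha_d^*) + (\alpha_u\alpha_d^*)(\alpha_d\alpha_d^*) \\
(\alpha_d\alpha_u^*)(\alpha_u\alpha_u^*) + (\alpha_d\alpha_d^*)(\alpha_d\alpha_u^*) & (\alpha_d\alpha_u^*)(\alpha_u\alpha_d^*) + (\alpha_d\alpha_d^*)(\alpha_d\alpha_d^*)
\end{pmatrix} \\
&= \begin{pmatrix}
(\alpha_u\alpha_u^*)\underbrace{(\alpha_u\alpha_u^* + \alpha_d^*\alpha_d)}_{=1} & (\alpha_u\alpha_d^*)\underbrace{(\alpha_u\alpha_u^* + \alpha_d^*\alpha_d)}_{=1} \\
(\alpha_d\alpha_u^*)\underbrace{(\alpha_u\alpha_u^* + \alpha_d^*\alpha_d)}_{=1} & (\alpha_d\alpha_d^*)\underbrace{(\alpha_u\alpha_u^* + \alpha_d^*\alpha_d)}_{=1}
\end{pmatrix} \\
&= \rho^A
\end{aligned}
\]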
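And naturally, Tr((ρA)2) = Tr(ρA) = 1. Those last two conditions are indeed what we expect for a pure state.
Let's move on to diagonalizing ρA. As usual, eigenvectors |λ⟩ are tied to their corresponding eigenvalues
λ via:
\[
\begin{aligned}
\rho^A|\lambda\rangle = \lambda|\lambda\rangle
&\Leftrightarrow (\rho^A - I^A\lambda)|\lambda\rangle = 0 \Leftrightarrow |\rho^A - I^A\lambda| = 0 \\
&\Leftrightarrow \begin{vmatrix} \alpha_u\alpha_u^* - \lambda & \alpha_u\alpha_d^* \\ \alpha_d\alpha_u^* & \alpha_d\alpha_d^* - \lambda \end{vmatrix} = 0 \\
&\Leftrightarrow (\alpha_u\alpha_u^* - \lambda)(\alpha_d\alpha_d^* - \lambda) - \alpha_d\alpha_u^*\alpha_u\alpha_d^* = 0 \\
&\Leftrightarrow \lambda^2 - \underbrace{(\alpha_u\alpha_u^* + \alpha_d\alpha_d^*)}_{=1}\lambda = 0 \\
&\Leftrightarrow \lambda(\lambda - 1) = 0 \\
&\Leftrightarrow \begin{cases} \lambda = 0 \\ \lambda = 1 \end{cases}
\end{aligned}
\]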
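Let's verify that the eigenvector |λ1⟩ associated to λ = 1 is indeed the wave-function associated to Alice's
sub-system, i.e. with components αu and αd:
\[
\rho^A \begin{pmatrix} \alpha_u \\ \alpha_d \end{pmatrix}
= \begin{pmatrix} \alpha_u\alpha_u^*\alpha_u + \alpha_u\alpha_d^*\alpha_d \\ \alpha_d\alpha_u^*\alpha_u + \alpha_d\alpha_d^*\alpha_d \end{pmatrix}
= \begin{pmatrix} \alpha_u(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d) \\ \alpha_d(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d) \end{pmatrix}
= \begin{pmatrix} \alpha_u \\ \alpha_d \end{pmatrix}
\]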
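The same reasoning applies to Bob's subsystem:
\[
\rho^B = \begin{pmatrix} \beta_u\beta_u^* & \beta_u\beta_d^* \\ \beta_d\beta_u^* & \beta_d\beta_d^* \end{pmatrix}; \quad
(\rho^B)^2 = \rho^B; \quad \mathrm{Tr}((\rho^B)^2) = \mathrm{Tr}(\rho^B) = 1
\]
with the same eigenvalues and, up to the renaming of α into β, the same eigenvectors.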
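What about SAB? Well, |Ψ⟩ still is a pure state in SAB, meaning its density matrix again is a convex
combination involving a single term; expanding it as a matrix in SAB's usual ordered basis yields:
\[
\begin{aligned}
\rho = 1|\Psi\rangle\langle\Psi|
&= \begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix}
\begin{pmatrix} (\alpha_u\beta_u)^* & (\alpha_u\beta_d)^* & (\alpha_d\beta_u)^* & (\alpha_d\beta_d)^* \end{pmatrix} \\
&= \begin{pmatrix}
\alpha_u\beta_u\alpha_u^*\beta_u^* & \alpha_u\beta_u\alpha_u^*\beta_d^* & \alpha_u\beta_u\alpha_d^*\beta_u^* & \alpha_u\beta_u\alpha_d^*\beta_d^* \\
\alpha_u\beta_d\alpha_u^*\beta_u^* & \alpha_u\beta_d\alpha_u^*\beta_d^* & \alpha_u\beta_d\alpha_d^*\beta_u^* & \alpha_u\beta_d\alpha_d^*\beta_d^* \\
\alpha_d\beta_u\alpha_u^*\beta_u^* & \alpha_d\beta_u\alpha_u^*\beta_d^* & \alpha_d\beta_u\alpha_d^*\beta_u^* & \alpha_d\beta_u\alpha_d^*\beta_d^* \\
\alpha_d\beta_d\alpha_u^*\beta_u^* & \alpha_d\beta_d\alpha_u^*\beta_d^* & \alpha_d\beta_d\alpha_d^*\beta_u^* & \alpha_d\beta_d\alpha_d^*\beta_d^*
\end{pmatrix}
\end{aligned}
\]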
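Because this is again a pure state, this time in the SAB system, we expect the usual formulas to hold.
Let's check them for good measure:
\[
\begin{aligned}
\rho^2 &= \begin{pmatrix}
\alpha_u\beta_u\alpha_u^*\beta_u^* & \alpha_u\beta_u\alpha_u^*\beta_d^* & \alpha_u\beta_u\alpha_d^*\beta_u^* & \alpha_u\beta_u\alpha_d^*\beta_d^* \\
\alpha_u\beta_d\alpha_u^*\beta_u^* & \alpha_u\beta_d\alpha_u^*\beta_d^* & \alpha_u\beta_d\alpha_d^*\beta_u^* & \alpha_u\beta_d\alpha_d^*\beta_d^* \\
\alpha_d\beta_u\alpha_u^*\beta_u^* & \alpha_d\beta_u\alpha_u^*\beta_d^* & \alpha_d\beta_u\alpha_d^*\beta_u^* & \alpha_d\beta_u\alpha_d^*\beta_d^* \\
\alpha_d\beta_d\alpha_u^*\beta_u^* & \alpha_d\beta_d\alpha_u^*\beta_d^* & \alpha_d\beta_d\alpha_d^*\beta_u^* & \alpha_d\beta_d\alpha_d^*\beta_d^*
\end{pmatrix}^2 \\
&= \dots \\
&= \rho
\end{aligned}
\]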
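The same conclusion can be reached without expanding any matrix. The following is a sketch that relies only on the normalization ⟨Ψ|Ψ⟩ = 1, with Ψ_i denoting the four components of |Ψ⟩ in the usual ordered basis:

```latex
\rho^2 = |\Psi\rangle\langle\Psi|\Psi\rangle\langle\Psi|
       = \langle\Psi|\Psi\rangle\,|\Psi\rangle\langle\Psi| = \rho,
\qquad
\mathrm{Tr}(\rho) = \sum_i \Psi_i\Psi_i^* = \langle\Psi|\Psi\rangle = 1
```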
With some text reformatting and a few tweaks, we can convert the LaTeX code into a Wolfram Alpha matrix
(a, b, c and d stand for αu, αd, βu and βd respectively):
{{a c Conjugate(a) Conjugate(c),a c Conjugate(a) Conjugate(d),
a c Conjugate(b) Conjugate(c),
a c Conjugate(b) Conjugate(d)},
{a d Conjugate(a) Conjugate(c),
a d Conjugate(a) Conjugate(d),
a d Conjugate(b) Conjugate(c),
a d Conjugate(b) Conjugate(d)},
{b c Conjugate(a) Conjugate(c),
b c Conjugate(a) Conjugate(d),
b c Conjugate(b) Conjugate(c),
b c Conjugate(b) Conjugate(d)},
{b d Conjugate(a) Conjugate(c),
b d Conjugate(a) Conjugate(d),
b d Conjugate(b) Conjugate(c),
b d Conjugate(b) Conjugate(d)}}
Remark 28. The leading factors correspond to the norms of the sub-system states: it’s a 1 in disguise.
It follows that Tr(ρ2 ) = Tr(ρ) = 1.
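We've already touched upon it, but let's make things crystal clear regarding the wave-functions: we have
one wave-function for each sub-system:
\[
\begin{aligned}
\psi^A(|u\rangle) &= \alpha_u; & \psi^B(|u\}) &= \beta_u; \\
\psi^A(|d\rangle) &= \alpha_d; & \psi^B(|d\}) &= \beta_d.
\end{aligned}
\]
And a wave-function for the composite system, which indeed factorizes as a product of the sub-systems'
wave-functions:
\[
\psi : \begin{cases}
|uu\rangle \mapsto \alpha_u\beta_u = \psi^A(|u\rangle)\psi^B(|u\}) \\
|ud\rangle \mapsto \alpha_u\beta_d = \psi^A(|u\rangle)\psi^B(|d\}) \\
|du\rangle \mapsto \alpha_d\beta_u = \psi^A(|d\rangle)\psi^B(|u\}) \\
|dd\rangle \mapsto \alpha_d\beta_d = \psi^A(|d\rangle)\psi^B(|d\})
\end{cases}
\Leftrightarrow \psi(a, b) = \psi^A(a)\psi^B(b)
\]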
Finally, we can crunch some numbers using the usual expectation value formula, using the matrices for
the spin observables we’ve re-established earlier. This could have been automated, but I haven’t looked
much into how to perform symbolic computation with R.
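\[
\begin{aligned}
\langle\sigma_x\rangle = \langle\Psi|\sigma_x|\Psi\rangle
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} 0&0&1&0 \\ 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \\
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} \alpha_d\beta_u \\ \alpha_d\beta_d \\ \alpha_u\beta_u \\ \alpha_u\beta_d \end{pmatrix} \\
&= \alpha_u^*\beta_u^*\alpha_d\beta_u + \alpha_u^*\beta_d^*\alpha_d\beta_d + \alpha_d^*\beta_u^*\alpha_u\beta_u + \alpha_d^*\beta_d^*\alpha_u\beta_d \\
&= \beta_u^*\beta_u(\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) + \beta_d^*\beta_d(\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) \\
&= (\alpha_u^*\alpha_d + \alpha_d^*\alpha_u)\underbrace{(\beta_u^*\beta_u + \beta_d^*\beta_d)}_{=1} \\
&= \alpha_u^*\alpha_d + \alpha_d^*\alpha_u \\
\Rightarrow \langle\sigma_x\rangle^2 &= (\alpha_u^*\alpha_d + \alpha_d^*\alpha_u)^2 \\
&= (\alpha_u^*\alpha_d)^2 + 2\alpha_u^*\alpha_d\alpha_d^*\alpha_u + (\alpha_d^*\alpha_u)^2
\end{aligned}
\]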
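\[
\begin{aligned}
\langle\sigma_y\rangle = \langle\Psi|\sigma_y|\Psi\rangle
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} 0&0&-i&0 \\ 0&0&0&-i \\ i&0&0&0 \\ 0&i&0&0 \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \\
&= i\begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} -\alpha_d\beta_u \\ -\alpha_d\beta_d \\ \alpha_u\beta_u \\ \alpha_u\beta_d \end{pmatrix} \\
&= i(-\alpha_u^*\beta_u^*\alpha_d\beta_u - \alpha_u^*\beta_d^*\alpha_d\beta_d + \alpha_d^*\beta_u^*\alpha_u\beta_u + \alpha_d^*\beta_d^*\alpha_u\beta_d) \\
&= i(-\alpha_u^*\alpha_d + \alpha_d^*\alpha_u)\underbrace{(\beta_u^*\beta_u + \beta_d^*\beta_d)}_{=1} \\
&= i(-\alpha_u^*\alpha_d + \alpha_d^*\alpha_u) \\
\Rightarrow \langle\sigma_y\rangle^2 &= \left(i(-\alpha_u^*\alpha_d + \alpha_d^*\alpha_u)\right)^2 \\
&= -(\alpha_d^*\alpha_u)^2 + 2\alpha_d^*\alpha_u\alpha_u^*\alpha_d - (\alpha_u^*\alpha_d)^2
\end{aligned}
\]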
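\[
\begin{aligned}
\langle\sigma_z\rangle = \langle\Psi|\sigma_z|\Psi\rangle
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&-1&0 \\ 0&0&0&-1 \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \\
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ -\alpha_d\beta_u \\ -\alpha_d\beta_d \end{pmatrix} \\
&= \alpha_u^*\beta_u^*\alpha_u\beta_u + \alpha_u^*\beta_d^*\alpha_u\beta_d - \alpha_d^*\beta_u^*\alpha_d\beta_u - \alpha_d^*\beta_d^*\alpha_d\beta_d \\
&= \beta_u^*\beta_u(\alpha_u^*\alpha_u - \alpha_d^*\alpha_d) + \beta_d^*\beta_d(\alpha_u^*\alpha_u - \alpha_d^*\alpha_d) \\
&= (\alpha_u^*\alpha_u - \alpha_d^*\alpha_d)\underbrace{(\beta_u^*\beta_u + \beta_d^*\beta_d)}_{=1} \\
&= \alpha_u^*\alpha_u - \alpha_d^*\alpha_d \\
\Rightarrow \langle\sigma_z\rangle^2 &= (\alpha_u^*\alpha_u - \alpha_d^*\alpha_d)^2 \\
&= (\alpha_u^*\alpha_u)^2 - 2\alpha_u^*\alpha_u\alpha_d^*\alpha_d + (\alpha_d^*\alpha_d)^2
\end{aligned}
\]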
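We can now compute:
\[
\begin{aligned}
\langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2
&= (\alpha_u^*\alpha_d)^2 + 2\alpha_u^*\alpha_d\alpha_d^*\alpha_u + (\alpha_d^*\alpha_u)^2 \\
&\quad - (\alpha_d^*\alpha_u)^2 + 2\alpha_d^*\alpha_u\alpha_u^*\alpha_d - (\alpha_u^*\alpha_d)^2 \\
&\quad + (\alpha_u^*\alpha_u)^2 - 2\alpha_u^*\alpha_u\alpha_d^*\alpha_d + (\alpha_d^*\alpha_d)^2 \\
&= (\alpha_u^*\alpha_u)^2 + 2\alpha_u^*\alpha_u\alpha_d^*\alpha_d + (\alpha_d^*\alpha_d)^2 \\
&= \underbrace{(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d)^2}_{=1} \\
&= 1
\end{aligned}
\]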
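The identity can also be checked numerically. The following R sketch does so for one arbitrarily chosen product state; the values of `alpha` and `beta` are illustrative assumptions (any pair of normalized two-component vectors would do), and `avg` mirrors the helper of the script reproduced at the end of this exercise.

```r
id2 <- diag(2)
px  <- matrix(c(0, 1, 1, 0), 2, 2)
py  <- matrix(c(0, 1i, -1i, 0), 2, 2)
pz  <- matrix(c(1, 0, 0, -1), 2, 2)

# Expectation value <psi|L|psi>; it is real for a Hermitian L.
avg <- function(L, psi) Re((Conj(t(psi)) %*% L %*% psi)[1])

# Illustrative normalized single-spin states (arbitrary choice).
alpha <- c(0.6, 0.8i)        # Alice: (alpha_u, alpha_d)
beta  <- c(1, 1i) / sqrt(2)  # Bob:   (beta_u, beta_d)

# Product state in the (uu, ud, du, dd) basis.
psi <- as.vector(kronecker(alpha, beta))

sx <- avg(kronecker(px, id2), psi)
sy <- avg(kronecker(py, id2), psi)
sz <- avg(kronecker(pz, id2), psi)
total <- sx^2 + sy^2 + sz^2

print(c(sx = sx, sy = sy, sz = sz, total = total))
```

Up to floating-point error, `total` should equal 1 whatever normalized `alpha` and `beta` are used.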
Remark 29. At this stage, it's important to observe that, up to a renaming (β ← α), the expression for
⟨τx⟩, namely βu∗ βd + βd∗ βu, is the same as the one we had for ⟨σx⟩. And it makes sense given how
symmetrical the "physical" situation is. We expect to find the same thing for the two other components:
if this is the case, we can then directly conclude ⟨τx⟩² + ⟨τy⟩² + ⟨τz⟩² = 1, without additional computations.
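\[
\begin{aligned}
\langle\tau_y\rangle = \langle\Psi|\tau_y|\Psi\rangle
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} 0&-i&0&0 \\ i&0&0&0 \\ 0&0&0&-i \\ 0&0&i&0 \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \\
&= i\begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} -\alpha_u\beta_d \\ \alpha_u\beta_u \\ -\alpha_d\beta_d \\ \alpha_d\beta_u \end{pmatrix} \\
&= i(-\alpha_u^*\beta_u^*\alpha_u\beta_d + \alpha_u^*\beta_d^*\alpha_u\beta_u - \alpha_d^*\beta_u^*\alpha_d\beta_d + \alpha_d^*\beta_d^*\alpha_d\beta_u) \\
&= i(\alpha_u^*\alpha_u(-\beta_u^*\beta_d + \beta_d^*\beta_u) + \alpha_d^*\alpha_d(-\beta_u^*\beta_d + \beta_d^*\beta_u)) \\
&= i(-\beta_u^*\beta_d + \beta_d^*\beta_u)\underbrace{(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d)}_{=1} \\
&= i(-\beta_u^*\beta_d + \beta_d^*\beta_u) \\
&\underset{\beta\leftarrow\alpha}{=} \langle\sigma_y\rangle
\end{aligned}
\]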
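\[
\begin{aligned}
\langle\tau_z\rangle = \langle\Psi|\tau_z|\Psi\rangle
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&1&0 \\ 0&0&0&-1 \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ \alpha_u\beta_d \\ \alpha_d\beta_u \\ \alpha_d\beta_d \end{pmatrix} \\
&= \begin{pmatrix} \alpha_u^*\beta_u^* & \alpha_u^*\beta_d^* & \alpha_d^*\beta_u^* & \alpha_d^*\beta_d^* \end{pmatrix}
\begin{pmatrix} \alpha_u\beta_u \\ -\alpha_u\beta_d \\ \alpha_d\beta_u \\ -\alpha_d\beta_d \end{pmatrix} \\
&= \alpha_u^*\beta_u^*\alpha_u\beta_u - \alpha_u^*\beta_d^*\alpha_u\beta_d + \alpha_d^*\beta_u^*\alpha_d\beta_u - \alpha_d^*\beta_d^*\alpha_d\beta_d \\
&= \alpha_u^*\alpha_u(\beta_u^*\beta_u - \beta_d^*\beta_d) + \alpha_d^*\alpha_d(\beta_u^*\beta_u - \beta_d^*\beta_d) \\
&= (\beta_u^*\beta_u - \beta_d^*\beta_d)\underbrace{(\alpha_u^*\alpha_u + \alpha_d^*\alpha_d)}_{=1} \\
&= \beta_u^*\beta_u - \beta_d^*\beta_d \\
&\underset{\beta\leftarrow\alpha}{=} \langle\sigma_z\rangle
\end{aligned}
\]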
Hence by our previous remark, indeed:
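\[
\langle\tau_x\rangle^2 + \langle\tau_y\rangle^2 + \langle\tau_z\rangle^2
\underset{\beta\leftarrow\alpha}{=} \langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2 = 1
\]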
Singlet state
The singlet state is characteristic of a maximally entangled state:
\[
|\Psi\rangle = \frac{1}{\sqrt{2}}(|ud\rangle - |du\rangle)
\]
This means that we won’t be able to express this state as a tensor product of two states from SA and
SB , as we just did for a product state.
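Let's start with the wave function and normalization, for the composite space: the general form of a
state vector in this space is:
\[
|\Psi\rangle = \psi_{uu}|uu\rangle + \psi_{ud}|ud\rangle + \psi_{du}|du\rangle + \psi_{dd}|dd\rangle
\]
Such a vector is normalized when its norm is 1:
\[
\sqrt{\langle\Psi|\Psi\rangle} = \sqrt{\psi_{uu}^*\psi_{uu} + \psi_{ud}^*\psi_{ud} + \psi_{du}^*\psi_{du} + \psi_{dd}^*\psi_{dd}} = 1
\]
But because each individual term under the square root is non-negative, this is equivalent to:
\[
\psi_{uu}^*\psi_{uu} + \psi_{ud}^*\psi_{ud} + \psi_{du}^*\psi_{du} + \psi_{dd}^*\psi_{dd} = 1
\]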
What’s the wave function for each subsystem state? Well, think about it: if there’s a wave function
for each subsystem, then there’s a pure, normalized state for each subsystem, and then the composite
state can be expressed as a tensor product between those two. Meaning, this composite state would be a
product state. But it's claimed here that the composite state is entangled, meaning it's not a product
state, and so we shouldn’t be able to find such wave-functions for the isolated subsystems.
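We've already studied in L06E03 why this particular singlet state isn't a product state; let me remind you
how it went. The idea is to identify the general form of a composite state with that of a product state:
\[
\psi_{uu}|uu\rangle + \psi_{ud}|ud\rangle + \psi_{du}|du\rangle + \psi_{dd}|dd\rangle
= \alpha_u\beta_u|uu\rangle + \alpha_u\beta_d|ud\rangle + \alpha_d\beta_u|du\rangle + \alpha_d\beta_d|dd\rangle
\]
which yields the following system of equations for the singlet state:
\[
\begin{cases}
\psi_{ud} = \alpha_u\beta_d = \dfrac{1}{\sqrt{2}} \\
\psi_{du} = \alpha_d\beta_u = -\dfrac{1}{\sqrt{2}} \\
\psi_{uu} = \alpha_u\beta_u = 0 \\
\psi_{dd} = \alpha_d\beta_d = 0
\end{cases}
\]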
But consider for example the third equation, which implies that at least either αu = 0 or βu = 0. In the
former case, the first equation can’t hold, while in the latter, the second equation can’t hold. Hence the
system is inconsistent, and cannot be solved.
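For two spins, the inconsistency argument can be condensed into a single test: a non-zero state is a product state exactly when ψuu ψdd = ψud ψdu, that is, when the 2×2 matrix of its coefficients has a vanishing determinant. The R sketch below illustrates this; the function name `is_product`, the tolerance and the sample product state are assumptions made for the example.

```r
# Coefficients as a 2x2 matrix: rows index Alice's spin (u, d),
# columns index Bob's spin (u, d).
is_product <- function(psi, tol = 1e-12) {
  Mod(psi[1, 1] * psi[2, 2] - psi[1, 2] * psi[2, 1]) < tol
}

singlet <- matrix(c(0, 1, -1, 0), 2, 2, byrow = TRUE) / sqrt(2)

# An arbitrary product state: psi[a, b] = alpha[a] * beta[b].
product <- outer(c(0.6, 0.8i), c(1, 1i) / sqrt(2))

print(is_product(singlet))
print(is_product(product))
```

For the singlet, the determinant is 0 × 0 − (1/√2)(−1/√2) = 1/2, which is why no choice of αu, αd, βu, βd can work.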
So what does this mean regarding the states of the subsystems? Surely it conceptually makes sense to
still talk about the existence of such states? Yes, obviously it does, and that's precisely where the notion
of density matrix becomes most useful63: to express impure states.
So, let’s move on to density matrices then. Starting with the easiest: the composite system’s: it’s a pure
state so we have (again, the vector/matrix representation depends implicitly on the usual ordered basis):
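\[
\rho = |\Psi\rangle\langle\Psi|
= \begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix}
\begin{pmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{pmatrix}
= \begin{pmatrix} 0&0&0&0 \\ 0&1/2&-1/2&0 \\ 0&-1/2&1/2&0 \\ 0&0&0&0 \end{pmatrix}
\]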
Moving on to the density matrices of the subsystems, that is, to the most accurate state description we
can provide for each subsystem. In the book, we derived a formula64: we first introduced an arbitrary
observable LA, acting on SA, and upgraded it to the composite system: L = LA ⊗ IB.
Let me rework the proof, while being a bit more explicit. First, component-wise, we have:
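\[
L = L^A \otimes I^B
= \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
= \begin{pmatrix} L_{11}&0&L_{12}&0 \\ 0&L_{11}&0&L_{12} \\ L_{21}&0&L_{22}&0 \\ 0&L_{21}&0&L_{22} \end{pmatrix}
\Leftrightarrow L_{a'b',ab} = L^A_{a'a}\delta_{b'b}
\]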
63 For pure states, the density matrix ρ is isomorphic to the state |Ψ⟩: ρ = |Ψ⟩ ⟨Ψ|
64 p204+, section 7.5 Entanglement for two spins
And we’ve already demonstrated in the book, using similar arguments, that
That is, we’ve reduced the density matrix ρ on the composite system to a density matrix ρA on Alice’s
subsystem.
I’ll come back to this density matrix ρA in a moment, but before moving on any further, let me emphasize
a subtle point that I think could have been made clearer in the book. Suppose we’re in a mixed state in
Alice’s subsystem. This means that there’s some amount of chance we’re in this state, or some amount
we’re in this other state, and so on, something like:
But wait a minute, each of those |ψi ⟩ is an element of the state space (the Hilbert space), so they can
all be expressed as a linear combination of its basis vectors. In the context of a spin:
Assuming we renormalize that last state vector if need be, haven’t we just found a wave-function de-
scribing Alice’s state? But haven’t we just stated that we cannot find a wave-function for Alice’s state
because it’s a mixed state?
You could push this thinking one step further: can’t we do the same thing for Bob’s space, and join the
two resulting states with a tensor product?
Well, there’s one considerable issue with the previous reasoning, and I don’t think it’s clear from the
book. So let me emphasize it:
The states of a (quantum) system are all positive, trace-class, linear maps ρ : SA → SA for which Trρ = 1
The previous definition accounts for some refinements that will be introduced in the next chapter of
Susskind's book. For example, a trace-class map refers to a map that has a finite trace: this is always the
case in a finite-dimensional vector space, but divergence may occur in infinite-dimensional vector spaces,
which are required to express position observables, for example.
To simplify, such maps correspond to our density matrices, which, as we saw, can encode both pure
and mixed states. We could argue on terminology regarding what a state is: a "mixed state" may
only correspond to the information we have about a state, and not to an actual, physical state, which
would justify the less strict terminology, while introducing some confusion on the use of the term "state".
It just so happens that there's a 1:1 correspondence between pure states and the elements of the
Hilbert "state" space, as ρ = |ψ⟩⟨ψ|.
So we can’t just create a convex combination of ”pure states” (well, something that’s isomorphic to a
pure state in our modern terminology). That’s why when density matrices were introduced in Susskind’s
book, the convex combination was performed over projection operators built from pure states:
And not, as I've just shown you, directly over elements of the Hilbert space.
Now that we have a clear definition of what a state is, we can check that our matrix ρA really is a state:
we want to prove that it’s a positive, trace-class linear map such that Tr(ρA ) = 1.
It’s clearly trace-class, because we’re in finite dimension: we have a matrix, the trace is a finite sum, it
always converges.
Lastly, ρA is diagonal and its diagonal entries, which are then its eigenvalues, are positive real
numbers: this is enough to prove that ρA is positive.
We’re ready to move on to verify that (ρA )2 ̸= ρA . After a quick check, I don’t think we reach anything
conclusive by carrying the computation symbolically, so let’s do it numerically:
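\[
(\rho^A)^2 = \begin{pmatrix}
0 \times 0 + \frac{1}{\sqrt{2}} \times \frac{1}{\sqrt{2}} & 0 \times \left(-\frac{1}{\sqrt{2}}\right) + \frac{1}{\sqrt{2}} \times 0 \\
\left(-\frac{1}{\sqrt{2}}\right) \times 0 + 0 \times \frac{1}{\sqrt{2}} & \left(-\frac{1}{\sqrt{2}}\right) \times \left(-\frac{1}{\sqrt{2}}\right) + 0 \times 0
\end{pmatrix}^2
= \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}^2
= \frac{1}{4} I^A \neq \rho^A
\]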
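\[
\begin{aligned}
\langle\tau_z\sigma_z\rangle = \langle\Psi|\tau_z\sigma_z|\Psi\rangle
&= \begin{pmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&-1&0 \\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix} = -1 \\
\langle\tau_x\sigma_x\rangle = \langle\Psi|\tau_x\sigma_x|\Psi\rangle
&= \begin{pmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{pmatrix}
\begin{pmatrix} 0&0&0&1 \\ 0&0&1&0 \\ 0&1&0&0 \\ 1&0&0&0 \end{pmatrix}
\begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix} = -1 \\
\langle\tau_y\sigma_y\rangle = \langle\Psi|\tau_y\sigma_y|\Psi\rangle
&= \begin{pmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{pmatrix}
\begin{pmatrix} 0&0&0&-1 \\ 0&0&1&0 \\ 0&1&0&0 \\ -1&0&0&0 \end{pmatrix}
\begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix} = -1 \\
\langle\sigma_z\tau_z\rangle = \langle\Psi|\sigma_z\tau_z|\Psi\rangle
&= \begin{pmatrix} 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&-1&0 \\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 0 \\ 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{pmatrix} = -1
\end{aligned}
\]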
The last one served to verify the following correlation:
⟨σz τz ⟩ − ⟨σz ⟩ ⟨τz ⟩ = −1
”Near-singlet” state
Starting from the following composite state:
\[
|\Psi\rangle = \sqrt{0.6}\,|ud\rangle - \sqrt{0.4}\,|du\rangle
\]
We can easily identify the wave-function for the composite system:
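\[
\psi : \begin{cases}
|uu\rangle \mapsto \psi_{11} = 0 \\
|ud\rangle \mapsto \psi_{12} = \sqrt{0.6} \\
|du\rangle \mapsto \psi_{21} = -\sqrt{0.4} \\
|dd\rangle \mapsto \psi_{22} = 0
\end{cases}
\]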
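The density matrix for the composite system naturally follows from the definition of the state:
\[
\begin{aligned}
\rho = |\Psi\rangle\langle\Psi|
&= \begin{pmatrix} 0 \\ \sqrt{0.6} \\ -\sqrt{0.4} \\ 0 \end{pmatrix}
\begin{pmatrix} 0 & \sqrt{0.6} & -\sqrt{0.4} & 0 \end{pmatrix} \\
&= \begin{pmatrix} 0&0&0&0 \\ 0&0.6&-\sqrt{0.6 \times 0.4}&0 \\ 0&-\sqrt{0.6 \times 0.4}&0.4&0 \\ 0&0&0&0 \end{pmatrix}
= \begin{pmatrix} 0&0&0&0 \\ 0&0.6&-\sqrt{0.24}&0 \\ 0&-\sqrt{0.24}&0.4&0 \\ 0&0&0&0 \end{pmatrix}
\end{aligned}
\]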
Let’s square it:
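\[
\begin{aligned}
\rho^2 &= \begin{pmatrix}
0&0&0&0 \\
0 & 0.6 \times 0.6 + (-\sqrt{0.24})^2 & -0.6\sqrt{0.24} - 0.4\sqrt{0.24} & 0 \\
0 & -0.6\sqrt{0.24} - 0.4\sqrt{0.24} & 0.4 \times 0.4 + (-\sqrt{0.24})^2 & 0 \\
0&0&0&0
\end{pmatrix} \\
&= \begin{pmatrix}
0&0&0&0 \\
0 & 0.36 + 0.24 & -\sqrt{0.24} & 0 \\
0 & -\sqrt{0.24} & 0.16 + 0.24 & 0 \\
0&0&0&0
\end{pmatrix} = \rho
\end{aligned}
\]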
Again, this is an entangled state: by the same (abstract) reasoning as before, we can’t find a wave-
function for the subsystems, and we must look for a density matrix. I’ll skip the details this time:
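\[
\rho^A = \begin{pmatrix}
\psi_{11}\psi_{11}^* + \psi_{12}\psi_{12}^* & \psi_{11}\psi_{21}^* + \psi_{12}\psi_{22}^* \\
\psi_{21}\psi_{11}^* + \psi_{22}\psi_{12}^* & \psi_{21}\psi_{21}^* + \psi_{22}\psi_{22}^*
\end{pmatrix}
= \begin{pmatrix} 0.6 & 0 \\ 0 & 0.4 \end{pmatrix}
\]
Let's square it:
\[
(\rho^A)^2 = \begin{pmatrix} 0.36 & 0 \\ 0 & 0.16 \end{pmatrix} \neq \rho^A
\]
Clearly, Tr((ρA)2) = 0.36 + 0.16 = 0.52 < 1.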
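The reduction from the composite wave-function to a subsystem's density matrix can be written compactly by arranging the coefficients ψ(a, b) in a 2×2 matrix. The R sketch below does so for both the singlet and the near-singlet states; the helper names (`reduce_A`, `reduce_B`, `purity`) are chosen here for illustration and aren't part of the original script.

```r
# Reduced density matrices from the 2x2 coefficient matrix psi[a, b]
# (rows: Alice's spin, columns: Bob's spin).
reduce_A <- function(psi) psi %*% Conj(t(psi))  # sums over Bob's index
reduce_B <- function(psi) t(psi) %*% Conj(psi)  # sums over Alice's index
purity   <- function(rho) Re(sum(diag(rho %*% rho)))

singlet      <- matrix(c(0, 1, -1, 0), 2, 2, byrow = TRUE) / sqrt(2)
near_singlet <- matrix(c(0, sqrt(0.6), -sqrt(0.4), 0), 2, 2, byrow = TRUE)

rho_A <- reduce_A(near_singlet)
print(rho_A)
print(purity(rho_A))
print(purity(reduce_A(singlet)))
```

A purity Tr((ρA)2) strictly below 1 signals a mixed state; for a single spin, 1/2 is the smallest possible value, and it is reached by the singlet.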
Let’s crunch some numbers again66 ; again, this has been automatically computed by the R script:
⟨σz ⟩ = ⟨Ψ|σz |Ψ⟩ = 0.2; ⟨σx ⟩ = ⟨Ψ|σx |Ψ⟩ = 0; ⟨σy ⟩ = ⟨Ψ|σy |Ψ⟩ = 0
⟨τz ⟩ = ⟨Ψ|τz |Ψ⟩ = −0.2; ⟨τx ⟩ = ⟨Ψ|τx |Ψ⟩ = 0; ⟨τy ⟩ = ⟨Ψ|τy |Ψ⟩ = 0
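\[
\begin{aligned}
\langle\tau_z\sigma_z\rangle = \langle\Psi|\tau_z\sigma_z|\Psi\rangle
&= \begin{pmatrix} 0 & \sqrt{0.6} & -\sqrt{0.4} & 0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&-1&0 \\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 0 \\ \sqrt{0.6} \\ -\sqrt{0.4} \\ 0 \end{pmatrix} = -1 \\
\langle\tau_x\sigma_x\rangle = \langle\Psi|\tau_x\sigma_x|\Psi\rangle
&= \begin{pmatrix} 0 & \sqrt{0.6} & -\sqrt{0.4} & 0 \end{pmatrix}
\begin{pmatrix} 0&0&0&1 \\ 0&0&1&0 \\ 0&1&0&0 \\ 1&0&0&0 \end{pmatrix}
\begin{pmatrix} 0 \\ \sqrt{0.6} \\ -\sqrt{0.4} \\ 0 \end{pmatrix} = -2\sqrt{0.24} \simeq -0.9797959 \\
\langle\tau_y\sigma_y\rangle = \langle\Psi|\tau_y\sigma_y|\Psi\rangle
&= \begin{pmatrix} 0 & \sqrt{0.6} & -\sqrt{0.4} & 0 \end{pmatrix}
\begin{pmatrix} 0&0&0&-1 \\ 0&0&1&0 \\ 0&1&0&0 \\ -1&0&0&0 \end{pmatrix}
\begin{pmatrix} 0 \\ \sqrt{0.6} \\ -\sqrt{0.4} \\ 0 \end{pmatrix} = -2\sqrt{0.24} \simeq -0.9797959 \\
\langle\sigma_z\tau_z\rangle = \langle\Psi|\sigma_z\tau_z|\Psi\rangle
&= \begin{pmatrix} 0 & \sqrt{0.6} & -\sqrt{0.4} & 0 \end{pmatrix}
\begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&-1&0 \\ 0&0&0&1 \end{pmatrix}
\begin{pmatrix} 0 \\ \sqrt{0.6} \\ -\sqrt{0.4} \\ 0 \end{pmatrix} = -1
\end{aligned}
\]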
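These numbers are easy to reproduce independently. The R sketch below recomputes the near-singlet expectation values from scratch; the helper names (`sigma`, `tau`, `avg`) and the result variables are illustrative choices, not those of the original script.

```r
id2 <- diag(2)
px  <- matrix(c(0, 1, 1, 0), 2, 2)
py  <- matrix(c(0, 1i, -1i, 0), 2, 2)
pz  <- matrix(c(1, 0, 0, -1), 2, 2)

sigma <- function(p) kronecker(p, id2)  # acts on Alice's spin
tau   <- function(p) kronecker(id2, p)  # acts on Bob's spin
avg   <- function(L, psi) Re((Conj(t(psi)) %*% L %*% psi)[1])

# Near-singlet state in the (uu, ud, du, dd) basis.
psi <- c(0, sqrt(0.6), -sqrt(0.4), 0)

xx <- avg(tau(px) %*% sigma(px), psi)
yy <- avg(tau(py) %*% sigma(py), psi)
zz <- avg(sigma(pz) %*% tau(pz), psi)
sz <- avg(sigma(pz), psi)
tz <- avg(tau(pz), psi)

print(c(xx = xx, yy = yy, zz = zz, sz = sz, tz = tz))
print(zz - sz * tz)  # correlation between sigma_z and tau_z
```

Both `xx` and `yy` should come out as −2√0.24, and the correlation ⟨σz τz⟩ − ⟨σz⟩⟨τz⟩ as −1 + 0.04 = −0.96, slightly weaker than the −1 found for the singlet.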
For completeness, here's the aforementioned, self-contained R script. You may want to look at the
LaTeX source file. I've written a separate article (.html) showcasing various R features used in this script.
#!/bin/Rscript

# Quirky, but does the job.

# Computes expectation values for a composite
# system built from two quantum spins.
#
# Used with either 1 or 5 arguments:
#   <operator> [wave-function (uu, ud, du, dd)]
#
# Each argument is parsed and evaluated, so you can
# use "tau_x %*% sigma_x" as an operator for example.
#
# If only an operator is provided (1 arg), its 4x4 matrix form
# is displayed as LaTeX on stdout.
#
# Otherwise, evaluates the expectation value for the operator
# for a system in a state described by the wave-function.
66 There are a few more than asked
tmp <- "/tmp/R/lib"
.libPaths(tmp)
dir.create(tmp, recursive = TRUE, mode = "0755")

# 2x2 identity and Pauli matrices
id2 = matrix(c(1, 0, 0, 1), 2, 2)
pz  = matrix(c(1, 0, 0, -1), 2, 2)
px  = matrix(c(0, 1, 1, 0), 2, 2)
py  = matrix(c(0, 1i, -1i, 0), 2, 2)

# 4x4 operators acting on the composite system
sigma_x = kronecker(px, id2)
tau_x   = kronecker(id2, px)
sigma_y = kronecker(py, id2)
tau_y   = kronecker(id2, py)
sigma_z = kronecker(pz, id2)
tau_z   = kronecker(id2, pz)

# Expectation value computation
avg <- function(L, psi) {
  return((Conj(t(psi)) %*% L %*% psi)[1])
}

# LaTeX export
loadpkg("xtable")
      t <- ""
      if (Re(y) != 0) t <- as.character(Re(y))
      if (Im(y) == 1) {
        if (Re(y) == 0) t <- "i"
        else t <- paste(t, "+i")
      } else if (Im(y) == -1)
        t <- paste(t, "-i")
      else {
        if (Re(y) == 0) t <- paste(Im(y), "i")
        else if (Im(y) > 0) t <- paste(t, "+", Im(y), "i")
        else t <- paste(t, Im(y), "i")
      }
      return(t)
    })
    dim(z) <- dim(x)
    xtable::xtable(z, ...)
  } else
    xtable::xtable(x, ...)
}

if (length(args) == 1) {
  x <- xtable(args[[1]], align = rep("", ncol(args[[1]]) + 1))
  # 1.00 -> 1, in our peculiar case
  digits(x) <- xdigits(x)
  print(x,
    floating = FALSE, tabular.environment = "pmatrix",
    hline.after = NULL, include.rownames = FALSE, include.colnames = FALSE
  )
  q()
} else if (length(args) != 5) stop("Incomplete wave function")

# avoids some 0+0i; refinable
if (x == 0) { cat(0, "\n") } else { cat(x, "\n") }
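Generally speaking, an operator L : H → H is said to be linear if the following two axioms hold:
\[
(\forall (\psi, \alpha) \in H \times \mathbb{C}),\quad L(\alpha\psi) = \alpha L(\psi)
\]
\[
(\forall (\psi, \phi) \in H^2),\quad L(\psi + \phi) = L(\psi) + L(\phi)
\]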
Furthermore, while the authors often use ”ψ(x)” to denote a function, I’ll be using ψ instead, and reserve
ψ(x) to the result of the application of ψ to the variable x, as is usual in mathematics.
Finally, recall67 that addition and scalar-multiplication are defined pointwise68 on functions:
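\[
(\forall (\psi, \phi) \in H^2),\quad \psi + \phi := x \mapsto (\psi + \phi)(x) := \psi(x) + \phi(x)
\]
\[
(\forall (\psi, \alpha) \in H \times \mathbb{C}),\quad \alpha\psi := x \mapsto (\alpha\psi)(x) := \alpha\psi(x)
\]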
To ease notation, I’ll use the same symbols for e.g. the addition of complex numbers and the (pointwise)
addition of functions. Don’t hesitate to label them in your mind in case of doubt.
Starting with X and the first axiom; let x ∈ R, α ∈ C and ψ ∈ H:
(X(αψ))(x) = xαψ(x)
= α(xψ(x))
= α(Xψ)(x)
As this is true for any x, we can conclude:
X(αψ) = αX(ψ)
This is equivalent to saying that we can do it as long as ψ is differentiable. In a physics context, functions
are usually assumed to be differentiable everywhere. Hence the first axiom indeed holds for D:
D(αψ) = αD(ψ)
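There's an analogous reasoning for the second axiom: let x ∈ R and (ψ, ϕ) ∈ H2:
\[
\begin{aligned}
(D(\psi + \phi))(x) &= \frac{d}{dx}(\psi + \phi)(x) \\
&= \left(\frac{d}{dx}\psi + \frac{d}{dx}\phi\right)(x) \\
&= (D(\psi) + D(\phi))(x)
\end{aligned}
\]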
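Again we can rewrite the "questionable" line by expanding the differentiation as a limit while unwrapping
the pointwise addition of functions:
\[
\begin{aligned}
\left(\frac{d}{dx}(\psi + \phi)\right)(x) &= \lim_{\epsilon\to 0} \frac{(\psi + \phi)(x + \epsilon) - (\psi + \phi)(x)}{\epsilon} \\
&= \lim_{\epsilon\to 0} \frac{\psi(x + \epsilon) + \phi(x + \epsilon) - (\psi(x) + \phi(x))}{\epsilon} \\
&= \lim_{\epsilon\to 0} \frac{\psi(x + \epsilon) - \psi(x) + \phi(x + \epsilon) - \phi(x)}{\epsilon} \\
&= \lim_{\epsilon\to 0} \left(\frac{\psi(x + \epsilon) - \psi(x)}{\epsilon} + \frac{\phi(x + \epsilon) - \phi(x)}{\epsilon}\right)
\end{aligned}
\]
Again, we can split the limit of a sum into a sum of limits71, as long as both limits converge. Hence the
second axiom holds as long as ψ and ϕ are differentiable:
\[
D(\psi + \phi) = D(\psi) + D(\phi)
\]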
9 Particle Dynamics
9.1 A Simple Example
9.2 Nonrelativistic Free Particles
9.3 Time-Independent Schrödinger Equation
Exercise 42. Derive Eq. 9.7 by plugging Eq. 9.6 into Eq. 9.5.
Let’s recall in order, Eq. 9.7, Eq. 9.6 and Eq. 9.5:
\[
E = \frac{p^2}{2m}; \qquad \psi(x) = \exp(ipx/\hbar); \qquad -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x)}{\partial x^2} = E\psi(x)
\]
In that last equation, the LHS could be rewritten as H|Ψ⟩, where H is the "quantized" classical
Hamiltonian corresponding to a free particle, that is, a particle not affected by a potential energy: the
Hamiltonian is then built solely from the "quantized" kinetic energy.
71 There's a proof on the same website as before
Eq. 9.6 (the middle one) is a solution proposal to the ODE yielded by Eq. 9.5 (the last one). Let’s see
how it goes:
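\[
\begin{aligned}
& -\frac{\hbar^2}{2m}\frac{\partial^2\psi(x)}{\partial x^2} = E\psi(x) \\
\Leftrightarrow\ & -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\exp(ipx/\hbar) = E\psi(x) \\
\Leftrightarrow\ & -\frac{\hbar^2}{2m}\left(\frac{ip}{\hbar}\right)^2\underbrace{\exp(ipx/\hbar)}_{=:\psi(x)} = E\psi(x) \\
\Leftrightarrow\ & \frac{p^2}{2m}\psi(x) = E\psi(x)
\end{aligned}
\]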
And so indeed, at least as long as ψ(x) ̸= 0:
E = p2 /2m
Recall that the commutator measures how much two operators defined on a Hilbert space commute:
[A, B] := AB − BA
Let’s progressively expand both sides, keeping all expansions strictly equivalent:
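\[
\begin{aligned}
& [P^2, X] = P[P, X] + [P, X]P \\
\Leftrightarrow\ & P^2X - XP^2 = P(PX - XP) + (PX - XP)P \\
\Leftrightarrow\ & PPX - XPP = PPX - PXP + PXP - XPP \\
\Leftrightarrow\ & PPX - XPP = PPX - XPP \\
\Leftrightarrow\ & 0 = 0 \\
\Leftrightarrow\ & \text{true}
\end{aligned}
\]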
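Since the derivation only uses the associativity and distributivity of the operator product, the identity holds for any pair of operators, and not just for momentum and position. As a sanity check, here is an R sketch with two random 3×3 matrices standing in for P and X (an arbitrary choice; `comm` is a helper introduced for the example):

```r
set.seed(1)  # reproducible random matrices

# Random stand-ins for the two operators.
P <- matrix(rnorm(9), 3, 3)
X <- matrix(rnorm(9), 3, 3)

# Commutator [A, B] := AB - BA
comm <- function(A, B) A %*% B - B %*% A

lhs <- comm(P %*% P, X)
rhs <- P %*% comm(P, X) + comm(P, X) %*% P

# Largest entry-wise discrepancy; zero up to rounding errors.
print(max(abs(lhs - rhs)))
```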
9.5 Quantization
9.6 Forces
Exercise 44. Show that the right-hand side of Eq. 9.17 simplifies to the right-hand side of Eq. 9.16.
Hint: First expand the second term by taking the derivative of the product. Then look for cancellations.
Let’s start as suggested by expanding the second term of Eq. 9.17, ignoring the −iℏ factor for now: this
is a basic product rule application:
\[
\left(\frac{d}{dx}\right)(V(x)\psi(x)) = \frac{dV(x)}{dx}\psi(x) + V(x)\frac{d\psi(x)}{dx}
\]
The second term will cancel with the first term of the RHS of Eq. 9.17:
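\[
\begin{aligned}
[V(x), P]\psi(x) &= V(x)\left(-i\hbar\frac{d}{dx}\right)\psi(x) - \left(-i\hbar\frac{d}{dx}\right)V(x)\psi(x) \\
&= -i\hbar V(x)\frac{d\psi(x)}{dx} + i\hbar\left(\frac{dV(x)}{dx}\psi(x) + V(x)\frac{d\psi(x)}{dx}\right) \\
&= i\hbar\frac{dV(x)}{dx}\psi(x)
\end{aligned}
\]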
As long as ψ(x) ̸= 0, we can divide by ψ(x), and indeed establish Eq. 9.16:
\[
[V(x), P] = i\hbar\frac{dV(x)}{dx}
\]
Where I’ve systematically made the time-dependence explicit by replacing x with x(t). This is an
elementary differentiation exercise that I think has already been performed in the previous volume on
classical mechanics. Nevertheless:
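\[
\begin{aligned}
\dot{x}(t) &= -A\omega\sin(\omega t) + B\omega\cos(\omega t) \\
\ddot{x}(t) &= -A\omega^2\cos(\omega t) - B\omega^2\sin(\omega t) \\
&= -\omega^2\underbrace{(A\cos(\omega t) + B\sin(\omega t))}_{=:x(t)} \\
&= -\omega^2 x(t)
\end{aligned}
\]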