Normed and Hilbert Spaces Overview
Definition. Let X be a vector space over F. A norm on X is a function ‖ · ‖ : X → R such that for all x, y ∈ X
and α ∈ F,
(i) ‖x‖ ≥ 0, with equality if, and only if, x = 0,
(ii) ‖αx‖ = |α|‖x‖, and
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality).
The pair (X, ‖ · ‖) is then called a normed vector space or simply a normed space.
This is a norm that makes C([a, b]) a Banach space (cf. MA40043). [Last year’s notation was C^0([a, b]).]
Example. On the other hand, if C([a, b]) is equipped with the norm
\[ \|f\|_{L^1(a,b)} = \int_a^b |f(x)|\,dx, \]
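Lemma 1.1. Let X be a normed vector space and x, y ∈ X. Then
\[ \bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\|. \]
Proof. By the triangle inequality, ‖x‖ = ‖(x − y) + y‖ ≤ ‖x − y‖ + ‖y‖, so ‖x‖ − ‖y‖ ≤ ‖x − y‖. The inequality
\[ \|y\| - \|x\| \le \|x - y\| \]
is proved similarly, and both inequalities together imply the claim.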
Theorem 1.2. Let X be a normed vector space. Suppose that (x_n)_{n∈N} and (y_n)_{n∈N} are sequences in X with
x_n → x_0 and y_n → y_0 as n → ∞. Furthermore, suppose that (α_n)_{n∈N} is a sequence in F with α_n → α_0 as
n → ∞. Then
(i) lim_{n→∞} ‖x_n‖ = ‖x_0‖,
(ii) lim_{n→∞} (x_n + y_n) = x_0 + y_0, and
(iii) lim_{n→∞} (α_n x_n) = α_0 x_0.
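(iii) We have
\[ \|\alpha_n x_n - \alpha_0 x_0\| \le |\alpha_n - \alpha_0|\,\|x_n\| + |\alpha_0|\,\|x_n - x_0\|. \]
Furthermore, we have |α_n − α_0| → 0 and ‖x_n‖ → ‖x_0‖ by (i), and ‖x_n − x_0‖ → 0. Hence
‖α_n x_n − α_0 x_0‖ → 0 as n → ∞.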
Lemma 1.3. Let X be a normed vector space and Y ⊂ X a linear subspace. Then the closure Ȳ is a linear
subspace as well.
Proof. We have y ∈ Ȳ if, and only if, there exists a sequence (y_n)_{n∈N} in Y with y = lim_{n→∞} y_n. Now let
x, y ∈ Ȳ and α ∈ F. Choose sequences (x_n)_{n∈N} and (y_n)_{n∈N} in Y converging to x and y, respectively. Then
\[ x + y = \lim_{n\to\infty} (x_n + y_n) \in \overline{Y} \]
and
\[ \alpha x = \lim_{n\to\infty} (\alpha x_n) \in \overline{Y}, \]
by Theorem 1.2, so Ȳ is a linear subspace.
Remark. If Y ⊂ X is a linear subspace of a normed vector space with norm ‖ · ‖, then the restriction of ‖ · ‖
to Y is a norm on Y.
Theorem 1.4. Let X be a Banach space and Y ⊂ X a linear subspace. Then Y is a Banach space if, and only
if, it is closed.
Proof. From MA30041, a subspace of a metric space is complete if and only if it is closed.
Definition. Let X be a vector space over F. An inner product (or scalar product) on X is a function ⟨ · , · ⟩ : X ×
X → F such that for all x, y, z ∈ X and α ∈ F,
(i) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩,
(ii) ⟨αx, y⟩ = α⟨x, y⟩,
(iii) $\langle x, y\rangle = \overline{\langle y, x\rangle}$, and
(iv) ⟨x, x⟩ ≥ 0, with equality if, and only if, x = 0.
The pair (X, ⟨ · , · ⟩) is then called an inner product space.
Remarks. In the above, z̄ denotes the complex conjugate for a number z ∈ C. If z ∈ R, then we have of
course z̄ = z. Thus if F = R, then the complex conjugate can be ignored in (iii).
If F = R, then (i)-(iii) imply that ⟨ · , · ⟩ is bilinear. That is, we also have
⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩
and
⟨x, αy⟩ = α⟨x, y⟩.
If F = C, then the scalar product is still additive in the second variable but the homogeneity needs to be
replaced by
⟨x, αy⟩ = ᾱ⟨x, y⟩.
In the complex case the inner product is sometimes said to be sesquilinear (“1½-times linear”).
An inner product induces a norm on X, defined by
\[ \|x\| = \sqrt{\langle x, x\rangle} \]
for x ∈ X. This is a norm. In particular it satisfies the triangle inequality by the following result.
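Theorem 1.5 (Cauchy-Schwarz inequality and triangle inequality). Let X be a vector space with inner
product ⟨ · , · ⟩. Then for x, y ∈ X, the Cauchy-Schwarz inequality
\[ |\langle x, y\rangle| \le \|x\|\,\|y\| \]
and the triangle inequality
\[ \|x + y\| \le \|x\| + \|y\| \]
hold.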
Definition. An inner product space is called a Hilbert space if it is complete with respect to the induced metric.
(In particular, a Hilbert space is a Banach space.)
for x = (x_n)_{n∈N} and y = (y_n)_{n∈N}. This is a Hilbert space. (For F = R, this was shown in MA40043. For F = C,
the proof is similar.)
Orthogonality in Inner Product Spaces
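Definition. Let X be a vector space with inner product ⟨ · , · ⟩ and Y ⊂ X a linear subspace. Then
\[ Y^\perp = \{x \in X : \langle x, y\rangle = 0 \text{ for all } y \in Y\} \]
is called the orthogonal complement of Y.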
Theorem 1.7 (Double Complement). Let H be a Hilbert space and Y ⊂ H a closed linear subspace. Then
(Y^⊥)^⊥ = Y.
y = y_1 + y_2
Definition. For an index set A, a family (x_α)_{α∈A} in a Hilbert space H is called an orthonormal system if
(i) ‖x_α‖ = 1 for every α ∈ A, and
(ii) ⟨x_α, x_β⟩ = 0 for α, β ∈ A with α ≠ β.
An orthonormal system is called complete if for every y ∈ H\{0}, there exists an α ∈ A such that ⟨x_α, y⟩ ≠ 0.
If the index set is N, then we also speak of an orthonormal sequence.
Remark. The condition for a complete orthonormal system can also be expressed as
Theorem 1.8. Let (u_n)_{n∈N} be an orthonormal sequence in a Hilbert space H and (α_n)_{n∈N} a sequence in F.
Then $\sum_{n=1}^\infty \alpha_n u_n$ is convergent if, and only if, $\sum_{n=1}^\infty |\alpha_n|^2 < \infty$. If so, then
\[ \Bigl\| \sum_{n=1}^\infty \alpha_n u_n \Bigr\|^2 = \sum_{n=1}^\infty |\alpha_n|^2. \]
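Proof. Let
\[ s_n = \sum_{k=1}^n \alpha_k u_k \quad\text{and}\quad \sigma_n = \sum_{k=1}^n |\alpha_k|^2 \]
be the partial sums. Then for m < n,
\[ \|s_n - s_m\|^2 = \Bigl\| \sum_{k=m+1}^n \alpha_k u_k \Bigr\|^2 = \sum_{k,\ell=m+1}^n \alpha_k \bar\alpha_\ell \langle u_k, u_\ell\rangle = \sum_{k=m+1}^n |\alpha_k|^2 = \sigma_n - \sigma_m. \]
Thus (s_n)_{n∈N} is a Cauchy sequence in H if, and only if, the sequence (σ_n)_{n∈N} is Cauchy in R. The first
statement is now clear.
Furthermore, by similar computations,
\[ \Bigl\| \sum_{n=1}^\infty \alpha_n u_n \Bigr\|^2 = \lim_{n\to\infty} \|s_n\|^2 = \lim_{n\to\infty} \sigma_n = \sum_{n=1}^\infty |\alpha_n|^2. \]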
Theorem 1.9 (Bessel’s inequality). Let (u_n)_{n∈N} be an orthonormal sequence in a Hilbert space H. Then
for any x ∈ H,
\[ \sum_{n=1}^\infty |\langle x, u_n\rangle|^2 \le \|x\|^2. \]
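Numerical illustration. Bessel's inequality can be checked by hand in a finite-dimensional setting. The sketch below is only an illustration under assumptions made here: H = R^3 with the Euclidean inner product, a two-element orthonormal family, and the hypothetical helper names `inner` and `bessel_sum`.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n (real scalars, so no conjugation)."""
    return sum(a * b for a, b in zip(x, y))

def bessel_sum(x, orthonormal):
    """Sum of the squared coefficients <x, u_n>^2 over an orthonormal family."""
    return sum(inner(x, u) ** 2 for u in orthonormal)

s = 1 / math.sqrt(2)
u1 = (s, s, 0.0)     # unit vector
u2 = (s, -s, 0.0)    # unit vector orthogonal to u1
x = (3.0, -1.0, 2.0)

lhs = bessel_sum(x, [u1, u2])  # equals x_1^2 + x_2^2 up to rounding
rhs = inner(x, x)              # ||x||^2
```

The family {u1, u2} is not complete in R^3, so the inequality is strict here; the missing part of ‖x‖² is the component of x along the third coordinate axis.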
Theorem 1.10. Let (u_n)_{n∈N} be an orthonormal sequence in a Hilbert space H. Then the formula
\[ x = \sum_{n=1}^\infty \langle x, u_n\rangle u_n \tag{1} \]
holds for all $x \in \overline{\operatorname{span}}\{u_n : n \in N\}$ (the closure of the span) and for no other points in H. In particular, if the orthonormal sequence is
complete, then (1) holds for every x ∈ H.
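So
\[ y \in (\operatorname{span}\{u_n : n \in N\})^\perp = \bigl(\overline{\operatorname{span}}\{u_n : n \in N\}\bigr)^\perp. \]
(The last identity is verified in Problem Sheet 3 Q2(a).) Clearly we also have $y \in \overline{\operatorname{span}}\{u_n : n \in N\}$. Thus y
belongs both to the closed span of {u_n : n ∈ N} and to its orthogonal complement, hence y = 0, as required.
If we have a complete orthonormal sequence, then
\[ \bigl(\overline{\operatorname{span}}\{u_n : n \in N\}\bigr)^\perp = (\operatorname{span}\{u_n : n \in N\})^\perp = \{0\}. \]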
Remark. The orthonormal bases defined here are not bases in the algebraic sense. The latter are sometimes
called Hamel bases in order to make the distinction clear.
Theorem 1.11. An infinite-dimensional Hilbert space is separable if, and only if, it has an orthonormal basis.
Proof. Suppose that H is an infinite-dimensional Hilbert space with an orthonormal basis (u_n)_{n∈N}. Then
This is countable and we still have Ā = H. Hence H is separable. If F = C, then we replace Q by Q + iQ.
Now suppose that H is separable, i.e., there exists a dense sequence (x_n)_{n∈N} in H. Then similarly to
MA40043, we can use the Gram-Schmidt process to construct an orthonormal basis from (x_n)_{n∈N}.
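Numerical illustration. The Gram-Schmidt process mentioned in the proof can be sketched for finitely many vectors in R^n. This is a minimal sketch under assumptions made here (real scalars, Euclidean inner product, a numerical tolerance `tol` for discarding linearly dependent vectors); it is not the construction from MA40043, and the helper names are hypothetical.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise `vectors` in order, skipping any vector that is
    (numerically) a linear combination of the earlier ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = inner(w, u)                      # coefficient along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:                           # keep only new directions
            basis.append([wi / norm for wi in w])
    return basis

# The second vector is a multiple of the first and is therefore discarded.
xs = [(1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
us = gram_schmidt(xs)
```

Skipping dependent vectors mirrors what is needed when starting from a dense sequence, which will in general contain many linear dependencies.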
The Spaces ℓ^p
Fix a p with 1 ≤ p ≤ ∞ (i.e., either p ∈ R with p ≥ 1 or p = ∞). Let x = (x_n)_{n∈N} be a sequence in F. Then
we define
\[ \|x\|_{\ell^p} = \Bigl( \sum_{n=1}^\infty |x_n|^p \Bigr)^{1/p} \ \text{ if } 1 \le p < \infty, \qquad \|x\|_{\ell^\infty} = \sup_{n \in N} |x_n|. \]
Let ℓ^p be the set of all sequences x in F with ‖x‖_{ℓ^p} < ∞. We want to show that (ℓ^p, ‖ · ‖_{ℓ^p}) is a Banach space
(with the vector space structure given by componentwise addition and scalar multiplication).
Lemma 1.12 (Young’s inequality). Let p, q ∈ (1, ∞) be conjugate exponents. Then for all a, b ≥ 0,
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
Proof. The inequality is trivial for a = 0 or b = 0. Hence we assume that a, b > 0.
Fix b > 0 and consider the function
\[ f(a) = \frac{a^p}{p} - ab, \quad a > 0. \]
We have to show that f(a) ≥ −b^q/q for all a > 0.
We have f′(a) = a^{p−1} − b. So f′(a) < 0 for a < b^{1/(p−1)} and f′(a) > 0 for a > b^{1/(p−1)}. Thus there is a
global minimum at a = b^{1/(p−1)}. Hence for all a > 0,
\[ f(a) \ge f(b^{1/(p-1)}) = \frac{b^{p/(p-1)}}{p} - b^{1/(p-1)+1} = b^{p/(p-1)} \Bigl( \frac{1}{p} - 1 \Bigr) = -\frac{b^q}{q}, \]
as q = p/(p−1).
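Numerical illustration. The proof identifies a = b^{1/(p−1)} as the point where Young's inequality becomes an equality. The sketch below checks this for one choice of p and b; the function name `young_gap` and the sample values are assumptions made for this illustration.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q = p/(p-1) is the conjugate exponent.
    Young's inequality says this quantity is non-negative for a, b >= 0."""
    q = p / (p - 1)
    return a ** p / p + b ** q / q - a * b

p = 3.0
b = 2.0
a_star = b ** (1 / (p - 1))  # minimiser of f(a) = a^p/p - ab from the proof
gaps = [young_gap(a, b, p) for a in (0.1, 0.5, 1.0, a_star, 2.0, 5.0)]
```

The gap vanishes (up to rounding) only at a = a_star, in line with the global minimum found in the proof.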
Lemma 1.13 (Hölder’s inequality). Let p, q be conjugate exponents. If x ∈ ℓ^p and y ∈ ℓ^q, then
\[ \sum_{n=1}^\infty |x_n y_n| \le \|x\|_{\ell^p} \|y\|_{\ell^q}. \]
Therefore,
\[ \sum_{n=1}^\infty |x_n y_n| = \|x\|_{\ell^p} \|y\|_{\ell^q} \sum_{n=1}^\infty |x'_n y'_n| \le \|x\|_{\ell^p} \|y\|_{\ell^q}. \]
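Numerical illustration. Hölder's inequality can be tested on finitely supported sequences, which are represented below as Python lists with the trailing zeros omitted. The helper names `lp_norm` and `holder_sides` and the sample data are assumptions of this sketch, which covers only 1 < p < ∞.

```python
def lp_norm(x, p):
    """l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

def holder_sides(x, y, p):
    """Return (sum |x_n y_n|, ||x||_p * ||y||_q) with q = p/(p-1)."""
    q = p / (p - 1)
    lhs = sum(abs(s * t) for s, t in zip(x, y))
    rhs = lp_norm(x, p) * lp_norm(y, q)
    return lhs, rhs

x = [1.0, -2.0, 0.5, 3.0]
y = [0.25, 1.0, -4.0, 2.0]
lhs, rhs = holder_sides(x, y, p=3.0)
```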
Proof. The cases p = 1 and p = ∞ are straightforward, so we assume that p ∈ (1, ∞). Choose N ∈ N. Then
\[ \sum_{n=1}^N |x_n + y_n|^p \le \sum_{n=1}^N (|x_n| + |y_n|)\,|x_n + y_n|^{p-1} = \sum_{n=1}^N |x_n|\,|x_n + y_n|^{p-1} + \sum_{n=1}^N |y_n|\,|x_n + y_n|^{p-1}. \tag{2} \]
Let q = p/(p−1), so that p and q are conjugate exponents. Then
\[ \sum_{n=1}^N |x_n|\,|x_n + y_n|^{p-1} \le \Bigl( \sum_{n=1}^N |x_n|^p \Bigr)^{1/p} \Bigl( \sum_{n=1}^N |x_n + y_n|^{q(p-1)} \Bigr)^{1/q} \le \|x\|_{\ell^p} \Bigl( \sum_{n=1}^N |x_n + y_n|^p \Bigr)^{1 - 1/p} \tag{3} \]
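by Hölder’s inequality (applied to truncated sequences). Similarly,
\[ \sum_{n=1}^N |y_n|\,|x_n + y_n|^{p-1} \le \|y\|_{\ell^p} \Bigl( \sum_{n=1}^N |x_n + y_n|^p \Bigr)^{1 - 1/p}. \tag{4} \]
Combining (2), (3), and (4), we obtain
\[ \sum_{n=1}^N |x_n + y_n|^p \le (\|x\|_{\ell^p} + \|y\|_{\ell^p}) \Bigl( \sum_{n=1}^N |x_n + y_n|^p \Bigr)^{1 - 1/p}, \]
whence
\[ \Bigl( \sum_{n=1}^N |x_n + y_n|^p \Bigr)^{1/p} \le \|x\|_{\ell^p} + \|y\|_{\ell^p}. \]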
Remark. Hölder’s and Minkowski’s inequalities also hold for F^N with the p-norms | · |_p, from the above results
applied to elements of ℓ^p whose coordinates after the N-th vanish.
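Proof. Minkowski’s inequality shows that ℓ^p is closed under vector addition and the triangle inequality holds.
All the other properties required of a vector space and a norm are easy to verify. Thus ℓ^p is a normed vector
space. It remains to show that it is complete.
Let (x^{(n)})_{n∈N} be a Cauchy sequence in ℓ^p. Fix k ∈ N. Then we have
\[ |x_k^{(m)} - x_k^{(n)}| \le \|x^{(m)} - x^{(n)}\|_{\ell^p} \to 0 \quad \text{as } m, n \to \infty. \]
Hence (x_k^{(n)})_{n∈N} is a Cauchy sequence in F. We know that F is complete, so there exists a limit
\[ x_k = \lim_{n\to\infty} x_k^{(n)}. \]
Set x = (x_k)_{k∈N}. First suppose that p < ∞. Let ε > 0 and choose N ∈ N such that ‖x^{(m)} − x^{(n)}‖_{ℓ^p} ≤ ε for
m, n ≥ N. Then for every K ∈ N and m, n ≥ N,
\[ \Bigl( \sum_{k=1}^K |x_k^{(m)} - x_k^{(n)}|^p \Bigr)^{1/p} \le \|x^{(m)} - x^{(n)}\|_{\ell^p} \le \varepsilon. \]
Letting m → ∞, we obtain
\[ \Bigl( \sum_{k=1}^K |x_k - x_k^{(n)}|^p \Bigr)^{1/p} \le \varepsilon, \]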
as long as n ≥ N. Now let K → ∞. We deduce
\[ \Bigl( \sum_{k=1}^\infty |x_k - x_k^{(n)}|^p \Bigr)^{1/p} \le \varepsilon \]
for n ≥ N. That is, we have ‖x − x^{(n)}‖_{ℓ^p} ≤ ε. By Lemma 1.14, it follows that x ∈ ℓ^p. Furthermore, as ε was
chosen arbitrarily, we have shown that x^{(n)} → x in ℓ^p as n → ∞.
For p = ∞, the arguments are similar. Let ε > 0 and choose N ∈ N such that ‖x^{(m)} − x^{(n)}‖_{ℓ^∞} ≤ ε for
m, n ≥ N. Then |x_k^{(m)} − x_k^{(n)}| ≤ ε for every k ∈ N. Hence |x_k − x_k^{(n)}| ≤ ε for every k ∈ N, which means
‖x − x^{(n)}‖_{ℓ^∞} ≤ ε.
Theorem 1.16 (Riesz’s lemma). Suppose that X is a normed vector space and Y ⊂ X a closed linear
subspace. If Y ≠ X and θ ∈ (0, 1), then there exists a vector u ∈ X with ‖u‖ = 1 and
\[ \inf_{y \in Y} \|u - y\| \ge \theta. \]
since y_0 + ‖x_0 − y_0‖ y ∈ Y.
Theorem 1.17. Let X be a normed vector space and denote by
\[ B = \{x \in X : \|x\| \le 1\} \]
the closed unit ball in X. Then B is compact if, and only if, X is finite-dimensional.
Proof. It follows from Exercise 2.3 and the Heine-Borel theorem that B is compact in a finite-dimensional
space.
Suppose that X is infinite-dimensional. Choose a unit vector x_1 ∈ X. Consider the one-dimensional
subspace X_1 = span{x_1}. According to Exercise 2.4, a finite-dimensional subspace is always closed. So we
can apply Riesz’s Lemma 1.16 to X_1 and construct an x_2 ∈ X with ‖x_2‖ = 1 and ‖x_2 − x_1‖ ≥ 1/2. Define
X_2 = span{x_1, x_2}. Use Riesz’s Lemma 1.16 again to find a point x_3 ∈ X with ‖x_3‖ = 1 and
\[ \|x_3 - x_1\| \ge \tfrac{1}{2} \quad\text{and}\quad \|x_3 - x_2\| \ge \tfrac{1}{2}. \]
If we continue indefinitely, we obtain a sequence (x_n)_{n∈N} of unit vectors, such that ‖x_m − x_n‖ ≥ 1/2 for m ≠ n.
But this means that we cannot have a convergent subsequence, so B is not compact.
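Numerical illustration. In a concrete space such a sequence can sometimes be written down directly, without Riesz's lemma. As an example chosen for this sketch (it is not taken from the proof above), the standard unit vectors e_n in ℓ^2 satisfy ‖e_m − e_n‖ = √2 for m ≠ n. The code checks this on truncations to finitely many coordinates; the helpers `e` and `dist` are hypothetical names.

```python
import math

def e(n, dim):
    """n-th standard unit vector (1-based), truncated to `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]

def dist(x, y):
    """Euclidean (l^2) distance between two finitely supported sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

dim = 6
vectors = [e(n, dim) for n in range(1, dim + 1)]
# Distances between all pairs of distinct unit vectors.
distances = [dist(vectors[i], vectors[j])
             for i in range(dim) for j in range(i + 1, dim)]
```

Since all mutual distances equal √2, no subsequence of (e_n) can be Cauchy, which matches the conclusion of the proof in this special case.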
2 Bounded Linear Operators
The Space B(X, Y )
In this chapter we study linear operators (also called linear maps or linear transformations) between normed
spaces, especially operators with the following additional property.
Definition. Let X, Y be normed vector spaces and T : X → Y a linear operator. We say that T is bounded if
there exists a number c > 0 such that
\[ \|Tx\|_Y \le c\,\|x\|_X \]
for all x ∈ X.
Theorem 2.1. Let X, Y be normed vector spaces and T : X → Y linear. Then the following are equivalent.
(i) T is bounded.
(ii) T is uniformly continuous.
(iii) T is continuous at 0.
Proof. (i) ⇒ (ii). Suppose that there is a number c > 0 satisfying ‖Tx‖_Y ≤ c‖x‖_X for all x ∈ X. Let ε > 0
and let x, y ∈ X. Then
\[ \|Tx - Ty\|_Y = \|T(x - y)\|_Y \le c\,\|x - y\|_X < \varepsilon \]
provided that ‖x − y‖_X < ε/c, so T is uniformly continuous.
(ii) ⇒ (iii). This is trivial.
(iii) ⇒ (i). Choose δ > 0 such that
\[ \|x\|_X \le \delta \ \Rightarrow\ \|Tx\|_Y \le 1. \]
Then for any x ∈ X\{0},
\[ \|Tx\|_Y = \frac{\|x\|_X}{\delta} \Bigl\| T\Bigl( \frac{\delta x}{\|x\|_X} \Bigr) \Bigr\|_Y \le \frac{\|x\|_X}{\delta}. \]
Hence T is bounded.
Definition. Let X, Y be normed vector spaces. Then B(X, Y) is the space comprising all bounded linear
operators T : X → Y. For T ∈ B(X, Y), we define
\[ \|T\| = \sup_{x \in X,\ \|x\|_X \le 1} \|Tx\|_Y = \sup_{x \in B_X} \|Tx\|_Y, \]
where B_X = {x ∈ X | ‖x‖_X ≤ 1}. We call ‖T‖ the operator norm of T. We also write B(X) = B(X, X).
Sometimes we write ‖T‖_{B(X,Y)} or ‖T‖_{B(X)} for the operator norm in order to distinguish it from other
norms. Note that the linearity implies the identities
\[ \|T\| = \sup_{x \in X,\ \|x\|_X = 1} \|Tx\|_Y = \sup_{x \in X \setminus \{0\}} \frac{\|Tx\|_Y}{\|x\|_X}, \]
unless X = {0}.
Theorem 2.2. (i) If X, Y are normed vector spaces, then B(X, Y ), equipped with the operator norm, is a
normed vector space.
(ii) If X is a normed vector space and Y is a Banach space, then B(X, Y ) is a Banach space.
(iii) Let X, Y, Z be normed vector spaces and suppose that S ∈ B(X, Y) and T ∈ B(Y, Z). Then TS ∈ B(X, Z)
with
\[ \|TS\| \le \|T\|\,\|S\|. \]
Proof. (i) is routine.
(ii) Suppose that Y is complete. Let (T_n)_{n∈N} be a Cauchy sequence in B(X, Y). Then for any fixed x ∈ X, we
have
\[ \|T_m x - T_n x\|_Y \le \|T_m - T_n\|\,\|x\|_X \to 0 \quad \text{as } m, n \to \infty. \]
Hence (T_n x)_{n∈N} is a Cauchy sequence in Y and the limit
\[ Tx := \lim_{n\to\infty} T_n x \]
exists. This gives rise to a map T : X → Y. It is linear, because for x, y ∈ X and α ∈ F, we have
\[ T(\alpha x + y) = \lim_{n\to\infty} T_n(\alpha x + y) = \alpha \lim_{n\to\infty} T_n x + \lim_{n\to\infty} T_n y = \alpha Tx + Ty. \]
It is bounded, because
\[ \|Tx\|_Y = \lim_{n\to\infty} \|T_n x\|_Y \le \lim_{n\to\infty} \|T_n\|\,\|x\|_X. \]
(Note lim_{n→∞} ‖T_n‖ exists because (‖T_n‖)_{n∈N} is a Cauchy sequence in F.)
It remains to prove that T_n → T in B(X, Y) (which is stronger than pointwise convergence). Let ε > 0 and
choose N ∈ N such that ‖T_m − T_n‖ ≤ ε for m, n ≥ N. Then for every x ∈ X,
\[ \|T_m x - T_n x\|_Y \le \|T_m - T_n\|\,\|x\|_X \le \varepsilon \|x\|_X. \]
Letting m → ∞, we obtain
\[ \|Tx - T_n x\|_Y \le \varepsilon \|x\|_X. \]
Thus ‖T − T_n‖ ≤ ε, as required.
(iii) Let x ∈ X. Then
\[ \|TSx\|_Z \le \|T\|\,\|Sx\|_Y \le \|T\|\,\|S\|\,\|x\|_X. \]
This implies the desired inequality.
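Numerical illustration. For matrices acting on R^n with the maximum norm, the operator norm is the largest absolute row sum (a standard fact, used here as an assumption rather than proved in these notes). The sketch below compares this formula with the supremum over the sign vectors, which are the extreme points of the unit ball, and then checks the inequality in (iii). All function names and the matrices S, T are chosen for this illustration.

```python
from itertools import product

def matvec(a, x):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def matmul(t, s):
    """Matrix product t*s, i.e. the composition x -> t(s x)."""
    cols = list(zip(*s))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in t]

def max_norm(x):
    return max(abs(v) for v in x)

def op_norm_formula(a):
    """Largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def op_norm_brute_force(a):
    """Supremum of ||a x|| over the sign vectors x in {-1, 1}^n."""
    n = len(a[0])
    return max(max_norm(matvec(a, x)) for x in product((-1.0, 1.0), repeat=n))

S = [[1.0, -2.0], [0.5, 3.0]]
T = [[2.0, 1.0], [-1.0, 4.0]]
TS = matmul(T, S)
```

In this example the inequality ‖TS‖ ≤ ‖T‖‖S‖ is strict, which shows that equality cannot be expected in general.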
Definition. If X is a normed vector space over F, then X* = B(X, F) is called the dual space of X. Its
elements are called bounded linear functionals on X.
Example. Let a < b and c ∈ [a, b]. Define T : C([a, b]) → F by Tf = f(c). This is a linear operator and we
have
\[ |Tf| = |f(c)| \le \|f\|_{C([a,b])}. \]
So T ∈ (C([a, b]))* and ‖T‖ ≤ 1. Since T1 = 1, we have in fact ‖T‖ = 1.
Example. Let g ∈ C([a, b]) and define T : C([a, b]) → C([a, b]) by T f = gf . Then
So R, L ∈ B(ℓ^p). It also follows immediately that ‖R‖ = 1 and ‖L‖ ≤ 1. In fact we have ‖L‖ = 1, because L
maps (0, 1, 0, 0, . . .) (with norm 1) to (1, 0, 0, . . .) (still with norm 1).
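Numerical illustration. Assuming that R and L denote the right and left shift operators (the example above suggests this, since L maps (0, 1, 0, . . .) to (1, 0, 0, . . .)), their behaviour on finitely supported sequences can be sketched as follows. Sequences are represented as lists with trailing zeros omitted; the function names are hypothetical.

```python
def lp_norm(x, p):
    """l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

def right_shift(x):
    """(x_1, x_2, ...) -> (0, x_1, x_2, ...)."""
    return [0.0] + list(x)

def left_shift(x):
    """(x_1, x_2, x_3, ...) -> (x_2, x_3, ...)."""
    return list(x[1:])

x = [3.0, -4.0]
p = 2.0
norm_x = lp_norm(x, p)
norm_Rx = lp_norm(right_shift(x), p)   # the right shift preserves the norm
norm_Lx = lp_norm(left_shift(x), p)    # the left shift drops the first entry
```

The left shift loses |x_1|, so ‖Lx‖ can be strictly smaller than ‖x‖, while L(Rx) = x always holds.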
Example. Let p, q be conjugate exponents. Fix y = (y_1, y_2, y_3, . . .) ∈ ℓ^q. Define T : ℓ^p → F by
\[ Tx = \sum_{n=1}^\infty x_n y_n. \]
Theorem 2.3 (Baire Category Theorem). Let (X, d) be a nonempty complete metric space and let {F_n}_{n=1}^∞
be a sequence of closed sets that covers X. Then F_n has nonempty interior for some n ∈ N.
Proof. Exercise: convince yourself this is the same as the Baire category theorem from MA40043!
Theorem 2.4 (Banach-Steinhaus Uniform Boundedness Principle). Let X be a Banach space and Y
a normed vector space. Let 𝒯 ⊂ B(X, Y) be such that for all x ∈ X, the set {Tx : T ∈ 𝒯} is bounded in Y.
Then 𝒯 is bounded in B(X, Y).
In other words, if we have pointwise bounds, then we automatically get a bound in the norm of B(X, Y) as
well. The second property appears much stronger and implies in particular that we have uniform bounds on
bounded subsets of X. That is, if A ⊂ X with
\[ \sup_{x \in A} \|x\|_X < \infty, \]
then
\[ \sup_{T \in \mathcal{T}} \sup_{x \in A} \|Tx\|_Y \le \sup_{T \in \mathcal{T}} \|T\| \, \sup_{x \in A} \|x\|_X < \infty. \]
Proof. For n ∈ N, let X_n = {x ∈ X : ‖Tx‖_Y ≤ n for all T ∈ 𝒯}. Each X_n is closed, and the sets X_n cover X
by the pointwise boundedness.
By the Baire Category Theorem 2.3, there exists an n ∈ N such that X_n contains a ball
\[ B_0 = \{x \in X : \|x - x_0\| \le r\} \]
for some x_0 ∈ X and r > 0. That is, for x ∈ B_0 and T ∈ 𝒯, we have ‖Tx‖_Y ≤ n.
Fix T ∈ 𝒯 and suppose that x ∈ X with ‖x‖_X ≤ 1. Define y = rx + x_0. Then y ∈ B_0. So
\[ \|Tx\|_Y = \Bigl\| T\Bigl( \frac{y - x_0}{r} \Bigr) \Bigr\|_Y = \frac{1}{r} \|Ty - Tx_0\|_Y \le \frac{1}{r} (\|Ty\|_Y + \|Tx_0\|_Y) \le \frac{2n}{r}. \]
Hence ‖T‖ ≤ 2n/r. The right-hand side is independent of T, so 𝒯 is bounded in B(X, Y).
Corollary 2.5. Let X be a Banach space and Y a normed vector space. Suppose that (T_n)_{n∈N} is a sequence in
B(X, Y) such that (T_n x)_{n∈N} is convergent for every x ∈ X. Then there exists a unique operator T ∈ B(X, Y)
such that
\[ Tx = \lim_{n\to\infty} T_n x \]
for all x ∈ X.
Proof. It is clear (by the assumption and the uniqueness of limits) that the formula defines a unique map
T : X → Y. It is also clear that T is linear. We need to show that T is bounded.
Let 𝒯 = {T_n : n ∈ N}. For every x ∈ X, the sequence (T_n x)_{n∈N} is convergent by the hypothesis; in
particular it is bounded. Therefore, we can apply the Uniform Boundedness Theorem 2.4 to 𝒯. It follows that
there exists a number M ≥ 0 with the property that ‖T_n‖ ≤ M for every n ∈ N. Now for any x ∈ X,
\[ \|Tx\|_Y = \lim_{n\to\infty} \|T_n x\|_Y \le M \|x\|_X. \]
So ‖T‖ ≤ M and T ∈ B(X, Y).
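Example. Let (a_n)_{n∈N} be a sequence in R such that for every x ∈ ℓ^2, the sequence (a_1 x_1, a_2 x_2, a_3 x_3, . . .) also
belongs to ℓ^2. Then the linear operator
\[ T : \ell^2 \to \ell^2, \quad Tx = (a_1 x_1, a_2 x_2, a_3 x_3, \ldots) \]
is bounded.
To prove this statement, define T_n : ℓ^2 → ℓ^2 with
\[ T_n x = (a_1 x_1, \ldots, a_n x_n, 0, 0, \ldots). \]
Then T_n ∈ B(ℓ^2) with ‖T_n‖ ≤ max{|a_1|, . . . , |a_n|}. Moreover, for every x ∈ ℓ^2,
\[ Tx = \lim_{n\to\infty} T_n x. \]
Hence T is bounded by Corollary 2.5.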
Definition. Let X, Y be metric spaces and f : X → Y a map. We say that f is open if for every open set
U ⊂ X, the image f (U ) = {f (x) : x ∈ U } is open in Y .
Theorem 2.6 (Open Mapping Theorem). Let X, Y be Banach spaces and T ∈ B(X, Y ). If T is surjective,
then T is open.
For R > 0 we use the notation RB_X for the ball of centre 0 and radius R in X. Since T is surjective, we have
\[ Y = \bigcup_{n=1}^\infty T(nB_X). \]
By the Baire category theorem there exists an n ∈ N such that the closure $\overline{T(nB_X)}$ contains a ball, say with centre y_1 and
radius 2r, so we may choose any y_0 ∈ (y_1 + rB_Y) ∩ T(nB_X) and we then have
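\[ \|y - Tx_1\|_Y \le \tfrac{1}{2}. \]
Hence
\[ (y - Tx_1) \in \tfrac{1}{2} B_Y \subset \overline{T(\tfrac{1}{2} R B_X)}. \]
Therefore, there exists a point x_2 ∈ ½RB_X such that
\[ \|(y - Tx_1) - Tx_2\|_Y \le \tfrac{1}{4}, \]
so
\[ \|y - T(x_1 + x_2)\|_Y \le \tfrac{1}{4}. \]
Now we recursively define x_3, x_4, . . . such that x_k ∈ (½)^{k−1}RB_X and
\[ \|y - T(x_1 + \cdots + x_k)\|_Y \le (\tfrac{1}{2})^k. \]
Then
\[ \sum_{k=1}^\infty \|x_k\|_X \le R \sum_{k=1}^\infty (\tfrac{1}{2})^{k-1} = 2R. \]
Therefore,
\[ x = \sum_{k=1}^\infty x_k \]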
So y = Tx. That is, we have proved B_Y ⊂ T(2RB_X).
Finally, let U ⊂ X be an open set and let V = T(U). Fix y_0 ∈ V. There exists an x_0 ∈ U with Tx_0 = y_0.
As U is open, there exists a number ρ > 0 such that x_0 + ρB_X ⊂ U. Then
\[ V \supset T(x_0 + \rho B_X) = y_0 + \frac{\rho}{2R} T(2RB_X) \supset y_0 + \frac{\rho}{2R} B_Y. \]
Hence V is open.
Corollary 2.7. Let X, Y be Banach spaces and T ∈ B(X, Y). If T is bijective, then T^{−1} ∈ B(Y, X).
Proof. We have T^{−1} ∈ B(Y, X) if, and only if, the inverse T^{−1} is continuous by Theorem 2.1. A map is
continuous if, and only if, inverse images of open sets are open. For an inverse map T^{−1}, this is equivalent to
T being open. So the claim follows directly from the Open Mapping Theorem 2.6.
If X, Y are normed spaces, then we can equip the Cartesian product X × Y with the norm
If X and Y are Banach spaces, then this makes X × Y a Banach space as well.
We now have the following criterion for boundedness of an operator between Banach spaces.
Theorem 2.8 (Closed Graph Theorem). Let X, Y be Banach spaces and T : X → Y a linear operator.
Then T ∈ B(X, Y ) if, and only if, the graph G = {(x, T x) : x ∈ X} is closed in X × Y .
Proof. Suppose that T ∈ B(X, Y). Then G is closed since G = L^{−1}({0}), where L : X × Y → Y is the linear
operator L(x, y) = y − Tx, which is bounded because
3 Dual Spaces
Recall that for a normed vector space X over F, we define X* = B(X, F). This is always a Banach space, even
if X is incomplete. Recall that the norm ‖ · ‖_{X*} (or simply ‖ · ‖_*) on X* is the operator norm:
\[ \|f\|_{X^*} = \sup_{x \in B_X} |f(x)|, \quad f \in X^*. \]
Hence ‖f_y‖_{H*} = ‖y‖_H. It turns out that all elements of H* are of this form if H is a Hilbert space:
Theorem 3.1 (Riesz Representation Theorem). Let H be a Hilbert space and f ∈ H*. Then there exists
a unique y ∈ H such that f(x) = ⟨x, y⟩_H for all x ∈ H. Furthermore, ‖y‖_H = ‖f‖_{H*}.
Therefore, we have y_1 = y_2.
Finally, the identity ‖y‖_H = ‖f‖_{H*} has been verified at the beginning of this subsection.
The theorem implies that there is a bijection Φ : H → H* that maps a point y ∈ H to the functional
f_y =: Φ(y), where
\[ f_y(x) = \langle x, y\rangle_H, \quad x \in H. \]
If F = R, then it can be checked by a direct calculation that Φ is linear. If F = C, then it is not linear. Instead,
for x, y, z ∈ H and α ∈ F, we have
\[ \Phi(\alpha x + y)(z) = \langle z, \alpha x + y\rangle_H = \bar\alpha \langle z, x\rangle_H + \langle z, y\rangle_H = \bar\alpha \Phi(x)(z) + \Phi(y)(z). \]
That is, we have Φ(αx + y) = ᾱΦ(x) + Φ(y). Nevertheless, the map Φ induces a Hilbert space structure on H*:
Theorem 3.2. Let H be a Hilbert space and consider the above map Φ : H → H*. Then H*, with the inner
product
\[ \langle f, g\rangle_{H^*} = \langle \Phi^{-1}(g), \Phi^{-1}(f)\rangle_H, \]
is a Hilbert space. Furthermore, Φ is an isometry between H and H*.
Proof. The proof is routine and is not required. Note the orders in which the two inner products are taken.
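Numerical illustration. The conjugate-linearity of Φ in the complex case can be observed in H = C^2 with the standard inner product, taken linear in the first argument as in the definition used in these notes. The helpers `inner` and `phi` and the sample vectors are assumptions made for this sketch.

```python
def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def phi(y):
    """Phi(y) is the functional x -> <x, y>."""
    return lambda x: inner(x, y)

alpha = 2 - 3j
x = [1 + 1j, 2j]
y = [0.5 + 0j, -1 + 1j]
z = [3 - 1j, 1 + 2j]

lhs = phi([alpha * a + b for a, b in zip(x, y)])(z)   # Phi(alpha x + y)(z)
rhs = alpha.conjugate() * phi(x)(z) + phi(y)(z)       # conjugate-linear rule
linear_guess = alpha * phi(x)(z) + phi(y)(z)          # what linearity would give
```

Here `lhs` agrees with `rhs` but not with `linear_guess`, because α has a non-zero imaginary part and ⟨z, x⟩ ≠ 0.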
By Hölder’s inequality, the series is convergent, so f_y : ℓ^p → F is well-defined. Moreover, we have |f_y(x)| ≤
‖x‖_{ℓ^p} ‖y‖_{ℓ^q}. Thus f_y ∈ (ℓ^p)*. Therefore, we obtain a linear map Φ_p : ℓ^q → (ℓ^p)*, setting Φ_p y = f_y, i.e.,
\[ (\Phi_p y)(x) = \sum_{n=1}^\infty x_n y_n, \quad x \in \ell^p. \]
Theorem 3.3. The operator Φ_p is an isometry, i.e., ‖Φ_p y‖_{(ℓ^p)*} = ‖y‖_{ℓ^q} for every y ∈ ℓ^q. If p < ∞, then Φ_p
is bijective.
Proof. Let y ∈ ℓ^q and set f_y = Φ_p y. We have seen that ‖f_y‖_{(ℓ^p)*} ≤ ‖y‖_{ℓ^q} by Hölder’s inequality. To obtain the
reverse inequality, we distinguish three cases.
Case 1. If 1 < q < ∞, define
\[ x_n = \begin{cases} |y_n|^{q-2}\,\bar{y}_n & \text{if } y_n \ne 0, \\ 0 & \text{if } y_n = 0. \end{cases} \]
Then for x = (x_n)_{n∈N}, we have
\[ f_y(x) = \sum_{n=1}^\infty |y_n|^q = \|y\|_{\ell^q}^q. \]
as p = q/(q−1). Therefore,
\[ f_y(x) = \|y\|_{\ell^q}^{q-1}\,\|y\|_{\ell^q} = \|x\|_{\ell^p}\,\|y\|_{\ell^q}. \]
\[ f_y(x) = \|y\|_{\ell^1}, \]
Thus we have ‖f_y‖_{(ℓ^1)*} = ‖y‖_{ℓ^∞} in this case as well, and this concludes the first part of the proof.
Note also that an isometry is necessarily injective, so for the second part, it suffices to show that Φ_p is
surjective. Suppose that p < ∞. Let f ∈ (ℓ^p)*. Consider the sequences
e_1 = (1, 0, 0, . . .),
e_2 = (0, 1, 0, 0, . . .),
e_3 = (0, 0, 1, 0, 0, . . .),
⋮
Define y_n = f(e_n) for n ∈ N and consider the sequence y = (y_n)_{n∈N}. We claim that y ∈ ℓ^q. This is clear if
q = ∞, as |y_n| = |f(e_n)| ≤ ‖f‖_{(ℓ^1)*}. If q < ∞, fix N ∈ N and define
\[ x^{(N)} = \sum_{n=1}^N |y_n|^{q-2}\,\bar{y}_n e_n. \]
So
\[ \Bigl( \sum_{n=1}^N |y_n|^q \Bigr)^{1/q} \le \|f\|_{(\ell^p)^*}. \]
Letting N → ∞, we obtain ‖y‖_{ℓ^q} ≤ ‖f‖_{(ℓ^p)*}. In particular y ∈ ℓ^q.
Next we claim that f = Φ_p y. Let x ∈ ℓ^p. Then we have
\[ x = \sum_{n=1}^\infty x_n e_n. \]
Hence
\[ f(x) = \sum_{n=1}^\infty x_n f(e_n) = \sum_{n=1}^\infty x_n y_n = \Phi_p y(x), \]
as required.
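Numerical illustration. The sequence x_n = |y_n|^{q−2} ȳ_n from Case 1 is the one for which Hölder's inequality becomes an equality. The sketch below checks the two identities f_y(x) = ‖y‖^q_{ℓ^q} and f_y(x) = ‖x‖_{ℓ^p} ‖y‖_{ℓ^q} for a finitely supported complex sequence. The names `lp_norm` and `extremal` and the data are assumptions of this illustration.

```python
def lp_norm(x, p):
    """l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

def extremal(y, q):
    """x_n = |y_n|^(q-2) * conj(y_n), with x_n = 0 wherever y_n = 0."""
    return [abs(t) ** (q - 2) * t.conjugate() if t != 0 else 0j for t in y]

q = 3.0
p = q / (q - 1)                       # conjugate exponent
y = [1 + 2j, 0j, -0.5j, 3 + 0j]
x = extremal(y, q)

f_y_x = sum(a * b for a, b in zip(x, y))   # f_y(x) = sum of x_n * y_n
```

Since |f_y(x)| = ‖x‖_{ℓ^p} ‖y‖_{ℓ^q} for this particular x, the bound ‖f_y‖ ≤ ‖y‖_{ℓ^q} cannot be improved.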
∀a ∈ T : a_0 ≤ a ⇒ a = a_0.
We also need a lemma from set theory. The proof is outside the scope of this course. However, it is known
that the lemma is equivalent to the so-called axiom of choice (see P.R. Halmos, Naive Set Theory, for a proof),
so for our purposes, the statement can be taken as an axiom.
Lemma 3.4 (Zorn’s Lemma). Let S ≠ ∅ be a partially ordered set. If every totally ordered subset of S has
an upper bound, then S has a maximal element.
Before we state the main result of this section, we prove another statement that relies on similar arguments.
Theorem 3.5. Let V be a vector space over an arbitrary field k. Then V contains a maximal linearly
independent set S. Moreover, this set satisfies V = span S.
Remark. The expression ‘maximal’ refers to the partial order ⊂ on the power set of V; thus, if S′ is a linearly
independent subset of V and S ⊂ S′, then S = S′.
A set with these properties is a basis of V in the sense of linear algebra (i.e., a Hamel basis). The statement
can thus be summarised as follows: every vector space has a Hamel basis.
Proof. Let A comprise all linearly independent subsets of V. Then A is non-empty, as ∅ ∈ A. The inclusion
relation ⊂ gives a partial order of A. We want to show that A satisfies the hypotheses of Zorn’s lemma.
Suppose that 𝒰 ⊂ A is totally ordered. Define
\[ T = \bigcup_{U \in \mathcal{U}} U. \]
We claim that T is linearly independent. To see this, let v_1, . . . , v_n ∈ T be distinct vectors and let α_1, . . . , α_n ∈ k
such that
\[ \alpha_1 v_1 + \cdots + \alpha_n v_n = 0. \]
Then for every i = 1, . . . , n, there exists a U_i ∈ 𝒰 with v_i ∈ U_i. Since 𝒰 is totally ordered, there exists an
i_0 ∈ {1, . . . , n} such that U_i ⊂ U_{i_0} for i = 1, . . . , n. Thus v_1, . . . , v_n ∈ U_{i_0}. Since U_{i_0} is linearly independent,
this means that
\[ \alpha_1 = \cdots = \alpha_n = 0. \]
So T is linearly independent, and therefore T ∈ A. Obviously, T is an upper bound for 𝒰.
By Zorn’s lemma, there exists a maximal element S of A, as required. It remains to show that span S = V.
Let v_0 ∈ V\S. Then S ∪ {v_0} is not linearly independent. Hence there exist certain vectors v_1, . . . , v_n ∈ S
and scalars α_0, . . . , α_n ∈ k with (α_0, . . . , α_n) ≠ (0, . . . , 0) such that
\[ \alpha_0 v_0 + \alpha_1 v_1 + \cdots + \alpha_n v_n = 0. \]
Here α_0 ≠ 0, because S is linearly independent. Hence v_0 = −α_0^{−1}(α_1 v_1 + · · · + α_n v_n) ∈ span S.
Definition. Let V be a real vector space. We say that a functional p : V → R is sublinear if for all x, y ∈ V
and α ≥ 0,
(i) p(x + y) ≤ p(x) + p(y), and
(ii) p(αx) = αp(x).
Property (i) in the definition is called subadditivity and (ii) is called positive homogeneity.
Example. If p is linear, then it is also sublinear.
Example. If ‖ · ‖ is a norm on V and λ > 0, then p(x) = λ‖x‖ is sublinear.
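Numerical illustration. The second example can be checked on sample vectors. The sketch assumes V = R^2 with the Euclidean norm and λ = 2; these choices, and the helper names, are made here for illustration only.

```python
import math

def p(x, lam=2.0):
    """p(x) = lam * ||x||, with the Euclidean norm on R^n."""
    return lam * math.sqrt(sum(t * t for t in x))

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def scale(alpha, x):
    return [alpha * t for t in x]

x = [3.0, 4.0]
y = [-1.0, 2.0]

subadditive = p(add(x, y)) <= p(x) + p(y)                    # property (i)
homogeneous = math.isclose(p(scale(3.0, x)), 3.0 * p(x))     # property (ii), alpha = 3
symmetric = math.isclose(p(scale(-1.0, x)), p(x))            # p(-x) = p(x)
```

Positive homogeneity is only required for α ≥ 0: here p(−x) = p(x) rather than −p(x), so this p is sublinear without being linear.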
Theorem 3.6 (Hahn-Banach). Let V be a real vector space and p : V → R a sublinear functional. Suppose
that L ⊂ V is a linear subspace and f : L → R a linear functional such that f (x) ≤ p(x) for all x ∈ L. Then
there exists a linear extension F : V → R such that
(i) F (x) = f (x) for all x ∈ L, and
(ii) F (x) ≤ p(x) for all x ∈ V .
(U, φ) ≤ (U′, φ′)
if, and only if, U ⊂ U′ and φ′|_U = φ. We note that the pair (L, f) belongs to S. Let
Proof. If (U, φ), (U′, φ′) ∈ T with x ∈ U ∩ U′, then either (U, φ) ≤ (U′, φ′) or (U′, φ′) ≤ (U, φ) by the total
ordering. In both cases, we have φ(x) = φ′(x) by the definition of ≤. So the value of φ(x) does not depend on
the choice of (U, φ).
Proof. Let x_1, x_2 ∈ W and α ∈ R. There exist (U_1, φ_1), (U_2, φ_2) ∈ T with x_1 ∈ U_1 and x_2 ∈ U_2. By the total
ordering, we have U_1 ⊂ U_2 or U_2 ⊂ U_1. Say U_1 ⊂ U_2. Then x_1, x_2 ∈ U_2 and thus αx_1 + x_2 ∈ W. Moreover,
Finally, we have ψ ≤ p in W by construction and also (L, f ) ≤ (W, ψ). Hence (W, ψ) ∈ S0 . Again by
construction, (W, ψ) is an upper bound for T .
Step 3: construct the extension.
Zorn’s lemma now implies that there exists a maximal element (X, F ) of S0 .
Claim: X = V .
Proof. Assume for contradiction that X ≠ V. Choose v ∈ V\X. Then for all x, y ∈ X, we have
\[ F(x) + F(y) = F(x + y) \le p(x + y) \le p(x - v) + p(y + v). \]
Hence
\[ F(x) - p(x - v) \le p(y + v) - F(y). \]
Set
\[ \kappa = \sup_{z \in X} (F(z) - p(z - v)), \]
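Then G : X ⊕ span{v} → R is linear and G|_X = F. Moreover from (5), for α > 0 we have
\[ G(x + \alpha v) = F(x) + \alpha\kappa = \alpha \Bigl( F\Bigl( \frac{x}{\alpha} \Bigr) + \kappa \Bigr) \le \alpha\, p\Bigl( \frac{x}{\alpha} + v \Bigr) = p(x + \alpha v), \]
and for α < 0,
\[ G(x + \alpha v) = F(x) - |\alpha|\kappa = |\alpha| \Bigl( F\Bigl( \frac{x}{|\alpha|} \Bigr) - \kappa \Bigr) \le |\alpha|\, p\Bigl( \frac{x}{|\alpha|} - v \Bigr) = p(x + \alpha v). \]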
Therefore, we have (X ⊕ span{v}, G) ∈ S0 and (X, F ) ≤ (X ⊕ span{v}, G). But we do not have equality, which
contradicts the maximality of (X, F ). Hence X = V .
Now by construction, F has both the required properties (i) and (ii). This completes the proof of the
Hahn-Banach Theorem.
Corollary 3.7. Let X be a normed vector space and V ⊂ X a linear subspace. Suppose that f ∈ V*. Then
there exists an extension F ∈ X* with F(x) = f(x) for all x ∈ V and ‖F‖_{X*} = ‖f‖_{V*}.
Proof. First assume that F = R. Apply the Hahn-Banach Theorem 3.6 with
\[ p(x) = \|f\|_{V^*}\,\|x\|_X. \]
Thus ‖F‖_{X*} = ‖f‖_{V*}.
Now we consider the case F = C. Then we use the fact that a complex linear functional is uniquely
determined by its real part. Define g(x) = Re f (x). Regarding X as a real vector space, we can apply the
previous arguments to g. Thus we construct an R-linear functional G : X → R with G|V = g = Re f . Now
define
F (x) = G(x) − iG(ix).
This is a C-linear functional by Exercise Sheet 7. For x ∈ V , we have
Corollary 3.8. Let L be a closed linear subspace of a normed vector space X. Suppose that x_0 ∈ X\L. Then
there exists an F ∈ X* with F = 0 on L and F(x_0) = 1.
\[ f(\lambda x_0 + w) = \lambda, \quad \lambda \in F,\ w \in L. \]
This is well-defined and linear. Clearly f = 0 on L and f(x_0) = 1. If we can show that f ∈ V*, then the claim
follows from Corollary 3.7.
Assume for contradiction that f is unbounded. Then there exists a sequence of points λ_n x_0 + w_n ∈ V (i.e.,
λ_n ∈ F and w_n ∈ L), such that
\[ \|\lambda_n x_0 + w_n\| \le 1 \]
for every n ∈ N, but |λ_n| → ∞ as n → ∞. Then
\[ x_0 + \frac{w_n}{\lambda_n} \to 0 \quad \text{as } n \to \infty. \]
So
\[ x_0 = -\lim_{n\to\infty} \frac{w_n}{\lambda_n} \in L, \]
as L is closed. But this contradicts the assumptions.
Corollary 3.9. Let X be a normed vector space and x_0 ∈ X\{0}. Then there exists an f ∈ X* with ‖f‖_{X*} = 1
and f(x_0) = ‖x_0‖_X.
Proof. This follows from Corollary 3.7. A proof directly from the Hahn-Banach Theorem, when F = R, is set
on Problem Sheet 6.
Corollary 3.10. Let X be a normed vector space and x_1, x_2 ∈ X with x_1 ≠ x_2. Then there exists an f ∈ X*
with f(x_1) ≠ f(x_2).
Proof. If f ∈ X* with ‖f‖_{X*} ≤ 1, then it is clear that |f(x)| ≤ ‖x‖_X. On the other hand, there exists an
f ∈ X* with ‖f‖_{X*} ≤ 1 and |f(x)| = ‖x‖_X (by Corollary 3.9 if x ≠ 0 and trivially if x = 0). The result
follows.
Reflexivity
For conjugate exponents p, q ∈ (1, ∞), we have seen that we can identify (ℓ^p)* with ℓ^q and (ℓ^q)* with ℓ^p. So
(ℓ^p)** has the same structure as ℓ^p. This is not the case for other spaces in general; indeed, the spaces ℓ^1 and
ℓ^∞ provide counterexamples. But a normed vector space X can always be embedded in X**.
Proof. The verification of linearity of J_X x for each x is routine. Write F_x = J_X x. For f ∈ X*, we have
\[ |F_x(f)| = |f(x)| \le \|f\|_{X^*}\,\|x\|_X. \]
It follows that F_x ∈ X** and ‖F_x‖_{X**} ≤ ‖x‖_X. By Corollary 3.9, there exists a g ∈ X* with ‖g‖_{X*} = 1 and
g(x) = ‖x‖_X. So |F_x(g)| = ‖x‖_X, and therefore ‖F_x‖_{X**} = ‖x‖_X.
Finally the linearity of x ↦ J_X x is routine.
Definition. For a normed vector space X, the operator J_X : X → X**, defined in Theorem 3.12, is called the
canonical embedding of X in X**.
The notation x̂ := J_X(x) is frequently used.
If J_X(X) = X** then X is said to be reflexive.
A linear isometric bijection is called an isometric isomorphism.
Remarks.
• A linear isometry is necessarily injective, i.e., we have ker J_X = {0}.
• If X is reflexive then X is complete since X**, being the dual space of X*, is complete.
• Reflexivity is not quite the same as saying that X and X** are isometrically isomorphic. For reflexivity, the
isometric isomorphism must be given by the specific operator J_X.
Examples.
• Any finite-dimensional normed vector space is reflexive.
• It follows from the Riesz representation theorem 3.1 that any Hilbert space is reflexive. (However, a bit of
additional work is necessary to show that the resulting map H → H** coincides with J_H.)
• It follows from Theorem 3.3 that ℓ^p is reflexive for 1 < p < ∞ (see Exercise Sheet 9 for the details).
• On the other hand, neither ℓ^1 nor ℓ^∞ is reflexive (cf. Exercise Sheet 9).
F_0(f) = F(f|_Y), f ∈ X*.
Convexity
Definition. Let V be a (real or complex) vector space. A subset C ⊂ V is called convex if for all x, y ∈ C and
λ ∈ (0, 1), the point λx + (1 − λ)y belongs to C as well.
Examples.
• A linear subspace is convex.
• In a normed vector space X, consider the open unit ball B° = {x ∈ X : ‖x‖ < 1}. Let x, y ∈ B° and λ ∈ (0, 1).
Then
\[ \|\lambda x + (1 - \lambda)y\| \le \lambda\|x\| + (1 - \lambda)\|y\| < 1. \]
So B° is convex. Similarly, any open or closed ball in X is convex.
• Suppose that V is a vector space, f : V → F is a linear functional, and D ⊂ F is convex. Then
\[ C = f^{-1}(D) = \{x \in V : f(x) \in D\} \]
is convex.
Definition. Let C be a subset of a real normed vector space X with 0 ∈ C°. The Minkowski functional h of
C is defined by
\[ h(x) = \inf\{\alpha > 0 : x \in \alpha C\}, \quad x \in X. \]
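Numerical illustration. For a concrete convex set the Minkowski functional can be approximated by bisection on α. The sketch assumes C = [−2, 2] × [−1, 1] in R^2, for which h(x) = max(|x_1|/2, |x_2|) can be read off directly; the function names, the search bound `hi` and the iteration count are choices made for this illustration.

```python
def in_box(x):
    """Membership test for C = [-2, 2] x [-1, 1], convex with 0 in its interior."""
    return abs(x[0]) <= 2.0 and abs(x[1]) <= 1.0

def minkowski_functional(x, contains, hi=1e6, iterations=200):
    """Approximate h(x) = inf{alpha > 0 : x in alpha*C} by bisection.
    Assumes C is convex with 0 in C (so membership of x/alpha is monotone
    in alpha) and that x lies in hi*C."""
    lo = 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid > 0 and contains([t / mid for t in x]):
            hi = mid      # x is in mid*C, so h(x) <= mid
        else:
            lo = mid      # x is not in mid*C, so h(x) >= mid
    return hi

def h_exact(x):
    """Closed form of the Minkowski functional of the box C."""
    return max(abs(x[0]) / 2.0, abs(x[1]))

x = [1.0, 0.75]
y = [-3.0, 0.5]
h_x = minkowski_functional(x, in_box)
h_y = minkowski_functional(y, in_box)
h_sum = minkowski_functional([a + b for a, b in zip(x, y)], in_box)
```

For this symmetric box, h is in fact a norm on R^2 whose closed unit ball is C.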
Lemma 3.14. Let X be a real normed vector space and C ⊂ X a convex set with 0 ∈ C°. Then the Minkowski
functional h of C satisfies
(i) h(λx) = λh(x) for all x ∈ X and λ ≥ 0,
(ii) h(x + y) ≤ h(x) + h(y) for all x, y ∈ X,
(iii) h(x) < 1 if, and only if, x ∈ C°,
(iv) h(x) ≤ 1 if, and only if, x ∈ C̄ (the closure of C), and
(v) there exists a number A such that h(x) ≤ A‖x‖_X for all x ∈ X.
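(ii) Let x, y ∈ X. Fix ε > 0 and choose α, β > 0 such that x ∈ αC, y ∈ βC, and
\[ \alpha \le h(x) + \varepsilon, \quad \beta \le h(y) + \varepsilon. \]
Set x′ = x/α and y′ = y/β. Then x′ ∈ C and y′ ∈ C. Hence
\[ x + y = \alpha x' + \beta y' = (\alpha + \beta) \Bigl( \frac{\alpha}{\alpha + \beta} x' + \frac{\beta}{\alpha + \beta} y' \Bigr) \in (\alpha + \beta) C. \]
Therefore h(x + y) ≤ α + β ≤ h(x) + h(y) + 2ε, and (ii) follows because ε > 0 was arbitrary.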
Conversely, suppose that h(x) < 1. Then there exists an ε > 0 such that y := (1 + ε)x ∈ C. Since 0 ∈ C°,
there exists an open ball rB° ⊂ C, where
\[ B^\circ = \{x \in X : \|x\|_X < 1\}. \]
Since C is convex,
\[ x + \frac{\varepsilon r}{1 + \varepsilon} B^\circ = \frac{y}{1 + \varepsilon} + \frac{\varepsilon}{1 + \varepsilon}\, r B^\circ \subset C. \]
So x ∈ C°.
(v) Choose r > 0 such that rB° ⊂ C°. Then it follows from (iii) that h < 1 on rB°. Hence by (i)
\[ h(x) = \frac{2\|x\|_X}{r}\, h\Bigl( \frac{r x}{2\|x\|_X} \Bigr) \le \frac{2}{r} \|x\|_X \]
for all x ∈ X\{0}. Moreover, we know that h(0) = 0.
(iv) Firstly note that (ii) and (v) imply that h is continuous, since for z, w ∈ X,
\[ h(z) - h(w) = h((z - w) + w) - h(w) \le h(z - w) \le A\|z - w\|_X. \]
By definition, we have h ≤ 1 on C. Hence h ≤ 1 on C̄ by continuity.
Conversely, suppose that h(x) ≤ 1. If h(x) < 1, then x ∈ C° ⊂ C by (iii). If h(x) = 1, then there exist
α_n > 0 with α_n^{−1}x ∈ C for every n ∈ N and lim_{n→∞} α_n = 1. Hence x = lim_{n→∞} α_n^{−1}x ∈ C̄.
Theorem 3.15. Let X be a real normed vector space and C ⊂ X a convex set with 0 ∈ C ◦ . Then for any
x0 ∈ X\C, there exists an f ∈ X ∗ such that
\[ f(x_0) > \sup_{x \in C} f(x). \]
Corollary 3.16. Let X be a real normed vector space and C ⊂ X non-empty, closed, and convex. Let
x0 ∈ X\C. Then there exists an f ∈ X ∗ with
\[ f(x_0) > \sup_{x \in C} f(x). \]
Remarks. In contrast to Theorem 3.15, it is not required here that 0 ∈ C ◦ .
Proof. Fix y0 ∈ C. Then it suffices to prove the statement for C − y0 instead of C and for the point x0 − y0
instead of x0 . So we can assume without loss of generality that 0 ∈ C.
Let B ◦ be the open unit ball in X and
Cr = C + rB ◦ = {x + ry : x ∈ C, y ∈ B ◦ }
for r > 0. Then Cr is convex. For r sufficiently small, we have x0 ∉ Cr. Hence we can apply Theorem 3.15 and
the claim follows.
4 Weak and Weak* Convergence
Basic Properties
We have seen that closed, bounded sets need not be compact in a Banach space. In fact, according to Theorem
1.17, the closed unit ball is never compact in an infinite-dimensional space. However, compactness is an
important and useful concept in analysis. A way to recover some sort of compactness is to change our notion
of convergence.
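Definition. Let X be a normed vector space. A sequence (xn)n∈N in X converges weakly to x0 ∈ X if for all f ∈ X∗,
f(xn) → f(x0) as n → ∞.
A sequence (fn)n∈N in X∗ converges weak* to f0 ∈ X∗ if fn(x) → f0(x) as n → ∞ for all x ∈ X.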
Remarks.
• We use the notation xn ⇀ x0 for weak convergence and fn ⇀* f0 for weak* convergence.
• We sometimes use the expression strong convergence for convergence with respect to the norm on X to avoid
confusion with weak convergence.
• Note that weak* convergence is just pointwise convergence of functionals.
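The difference between weak and strong convergence can be seen in a small numerical sketch. It assumes the setting X = ℓ² and a functional of the form x ↦ ⟨x, y⟩ with y = (1/k)_k, and it truncates all sequences to K entries purely so that they fit in memory. The names K, e, inner and y are choices made for this illustration. The unit vectors e_n pair with y to give 1/n → 0, while their norms stay equal to 1, so (e_n) cannot converge to 0 in norm.

```python
import math

K = 1000  # truncation length (an assumption of this sketch, not part of the theory)

def e(n):
    """n-th standard unit vector, truncated to K entries (1-based index)."""
    return [1.0 if k == n else 0.0 for k in range(1, K + 1)]

def inner(x, y):
    """Real inner product of two finite vectors."""
    return sum(a * b for a, b in zip(x, y))

# y represents the functional x -> <x, y>; here y_k = 1/k
y = [1.0 / k for k in range(1, K + 1)]

indices = (1, 10, 100, 1000)
pairings = [inner(e(n), y) for n in indices]              # values 1/n
norms = [math.sqrt(inner(e(n), e(n))) for n in indices]   # all equal to 1
```

The list `pairings` decreases towards 0, whereas every entry of `norms` is 1.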
Theorem 4.1. If X is a normed vector space, then weak limits in X and weak* limits in X ∗ are unique.
Proof. Suppose that (xn)n∈N is a sequence in X that converges weakly to x0 ∈ X. Let y0 ∈ X with y0 ≠ x0.
Then by Corollary 3.10, there exists an f ∈ X∗ with f(x0) ≠ f(y0). Hence
Theorem 4.3. Suppose that C is a convex, closed subset of a normed vector space X. If (xn )n∈N is a sequence
in C with weak limit x0 , then x0 ∈ C.
which is impossible.
If F = C, then we first regard X as a real vector space. We choose bounded real linear f as above and
define g(x) = f (x) − if (ix), so that g ∈ X ∗ (cf. Corollary 3.7). Then we have the contradiction
Proof. (i) It is enough to consider the case when x0 ≠ 0. According to Corollary 3.9 we can choose f ∈ X∗
such that ‖f‖∗ = 1 and f(x0) = ‖x0‖. Then
\[ \|x_0\| = |f(x_0)| = \Bigl|\lim_{n\to\infty} f(x_n)\Bigr| = \lim_{n\to\infty} |f(x_n)| \le \liminf_{n\to\infty} \|f\|_* \|x_n\| = \liminf_{n\to\infty} \|x_n\|. \]
(ii) Let ε > 0 and, from the definition of ‖·‖∗, choose x ∈ X with ‖x‖ ≤ 1 and |f0(x)| ≥ ‖f0‖∗ − ε. Then
\[ \|f_0\|_* \le |f_0(x)| + \varepsilon = \lim_{n\to\infty} |f_n(x)| + \varepsilon \le \liminf_{n\to\infty} \|f_n\|_* \|x\| + \varepsilon \le \liminf_{n\to\infty} \|f_n\|_* + \varepsilon. \]
It follows that the closed unit ball B in X is closed under weak convergence. That is, the weak limit of a
sequence in B belongs to B, too. (This statement can also be verified using Theorem 4.3.) Similarly, the closed
unit ball in X ∗ is closed under weak* convergence. However, we will see in the next subsection that we have in
fact a much stronger statement.
Proof. Let (fn)n∈N be a dense sequence in X∗. For each n ∈ N, choose a point xn ∈ X with ‖xn‖_X ≤ 1 and
|fn(xn)| ≥ ½‖fn‖_X∗. Set
Y = span{xn : n ∈ N}.
We claim that the closure Ȳ of Y equals X.
Assume for contradiction that Ȳ ≠ X. Choose a unit vector x0 ∈ X \ Ȳ. Then by Corollary 3.8, there
exists an F ∈ X∗ with F = 0 on Y, but F(x0) = 1. Let ε > 0 and choose n ∈ N such that ‖fn − F‖_X∗ < ε.
Now ‖F‖∗ ≥ 1, so
‖fn‖∗ ≥ ‖F‖∗ − ‖F − fn‖∗ > 1 − ε.
Then
Theorem 4.6 (Banach). Let X be a separable normed vector space. Then every sequence in the closed unit
ball B∗ = {f ∈ X∗ : ‖f‖_X∗ ≤ 1} has a subsequence converging weak* to an element of B∗.
Proof. As X is separable, there exists a dense sequence (xk )k∈N in X. Now consider a sequence (fn )n∈N in B ∗ .
Since |fn(x1)| ≤ ‖fn‖_X∗ ‖x1‖_X ≤ ‖x1‖_X for every n ∈ N, the sequence (fn(x1))n∈N in F is bounded. Therefore,
there exists a convergent subsequence. That is, there exists an infinite set Λ1 ⊂ N such that the subsequence
(fn (x1 ))n∈Λ1 is convergent.
Next we consider the sequence (fn (x2 ))n∈Λ1 . It is bounded as well, hence there exists an infinite set Λ2 ⊂ Λ1
such that the subsequence (fn (x2 ))n∈Λ2 is convergent.
Similarly, we construct Λ3 , Λ4 , Λ5 , . . . such that each Λk is an infinite subset of Λk−1 for k = 2, 3, 4, . . . and
(fn(xk))n∈Λk is convergent. We now want to construct an increasing sequence (n_ℓ)_{ℓ∈N} in N such that we have
simultaneous convergence of (f_{n_ℓ}(x_k))_{ℓ∈N} for all k ∈ N. This works with a ‘diagonal sequence argument’ as
follows.
Let n_1 = min Λ1. Now define n_2, n_3, . . . recursively. Assuming that n_ℓ has been chosen, let
\[ n_{\ell+1} = \min\{n \in \Lambda_{\ell+1} : n > n_\ell\}. \]
Then for every k, we have n_k, n_{k+1}, n_{k+2}, . . . ∈ Λk. Therefore, (f_{n_ℓ}(x_k))_{ℓ∈N} is convergent.
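The diagonal selection can be sketched in code. The sketch assumes the recursive choice n_{ℓ+1} = min{n ∈ Λ_{ℓ+1} : n > n_ℓ}, which is one standard way to carry out the construction, and it uses the hypothetical nested sets Λ_k = {multiples of 2^k} in place of the sets produced by the proof. The helper names are illustrative only.

```python
from itertools import count

def in_lambda(k, n):
    """Hypothetical nested index sets: Lambda_k = positive multiples of 2**k."""
    return n % (2 ** k) == 0

def diagonal_indices(length):
    """Return n_1 < n_2 < ... < n_length with n_l in Lambda_l for every l."""
    n = next(m for m in count(1) if in_lambda(1, m))   # n_1 = min Lambda_1
    chosen = [n]
    for l in range(1, length):
        # n_{l+1} = min{ m in Lambda_{l+1} : m > n_l }
        n = next(m for m in count(n + 1) if in_lambda(l + 1, m))
        chosen.append(n)
    return chosen

indices = diagonal_indices(6)
```

Because the sets are nested, every index from position k onwards lies in Λ_k, which is exactly the property used to get convergence of the diagonal subsequence at each x_k.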
We claim that (f_{n_ℓ}(x))_{ℓ∈N} is convergent for all x ∈ X. In order to prove this, fix x ∈ X. Let ε > 0 and
choose k ∈ N such that ‖x − x_k‖_X ≤ ε. We know that (f_{n_ℓ}(x_k))_{ℓ∈N} is convergent and thus Cauchy. Therefore,
there exists an L ∈ N such that for ℓ, m ≥ L,
\[ |f_{n_\ell}(x_k) - f_{n_m}(x_k)| \le \varepsilon. \]
Thus
\[ \begin{aligned} |f_{n_\ell}(x) - f_{n_m}(x)| &\le |f_{n_\ell}(x) - f_{n_\ell}(x_k)| + |f_{n_\ell}(x_k) - f_{n_m}(x_k)| + |f_{n_m}(x_k) - f_{n_m}(x)| \\ &\le \|f_{n_\ell}\|_{X^*}\|x - x_k\|_X + \varepsilon + \|f_{n_m}\|_{X^*}\|x - x_k\|_X \le 3\varepsilon. \end{aligned} \]
It follows that (f_{n_ℓ}(x))_{ℓ∈N} is Cauchy, and since F is complete, it is convergent, as claimed.
Finally, we define
\[ f(x) = \lim_{\ell\to\infty} f_{n_\ell}(x), \quad x \in X. \]
Then f is linear and |fn(x)| ≤ ‖x‖_X for all n ∈ N and x ∈ X, so
\[ |f(x)| = \lim_{\ell\to\infty} |f_{n_\ell}(x)| \le \|x\|_X, \quad x \in X. \]
Hence f ∈ X∗. The weak* convergence f_{n_ℓ} ⇀* f follows from the construction.
Now suppose that we want a similar statement for weak convergence. If X happens to be reflexive, then
we can identify X with X ∗∗ and weak convergence in X with weak* convergence in X ∗∗ via the canonical
embedding JX : X → X ∗∗ . We can then apply Theorem 4.6 and thus obtain sequential compactness of the
closed unit ball in X, provided that X ∗ is separable. If X ∗ is not separable, then we need to use Lemma 4.5 in
addition.
Theorem 4.7. Let X be a reflexive Banach space. Then any sequence in the closed unit ball B of X has a
subsequence converging weakly to an element of B.
Proof. Firstly assume that X is separable. Consider the canonical embedding JX : X → X ∗∗ , which is isometric
and bijective since X is reflexive. Then X ∗∗ is separable, and by Lemma 4.5, the space X ∗ is separable as well.
Let B and B ∗∗ be the closed unit balls in X and X ∗∗ , respectively.
Suppose that (xn )n∈N is a sequence in B. Defining x̂n = JX xn , we obtain a sequence in B ∗∗ . By Theorem
4.6, there exists a weak* convergent subsequence (x̂_{n_k})_{k∈N}. Let x̂0 ∈ B∗∗ be the weak* limit and define
x0 = J_X⁻¹ x̂0. Then for any f ∈ X∗,
Y = span{xn : n ∈ N}.
This subspace is clearly separable. According to Theorem 3.13, it is reflexive. Therefore, the previous arguments
give a subsequence (x_{n_k})_{k∈N} that converges weakly in Y, say with weak limit x0 ∈ Y. We need to show that
x_{n_k} ⇀ x0 weakly in X as well.
Let f ∈ X ∗ . Define g = f |Y . Then g ∈ Y ∗ . Hence
Remark. If p, q are conjugate exponents and 1 < q ≤ ∞, then ℓ^q is isometrically isomorphic to (ℓ^p)∗, hence we
can, by slight misuse of terminology, talk about the weak* topology on ℓ^q. This is especially interesting in the
case q = ∞, when the dual space of ℓ^∞ is not isometrically isomorphic to ℓ^1. More exactly, if f^n ∈ (ℓ^p)∗ for all
n ≥ 0 and y^n = Φ_p⁻¹(f^n), then
\[ f^n \overset{*}{\rightharpoonup} f^0 \iff \forall x \in \ell^p:\ \Phi_p(y^n)(x) \to \Phi_p(y^0)(x) \iff \forall x \in \ell^p:\ \sum_{k=1}^{\infty} y^n_k x_k \to \sum_{k=1}^{\infty} y^0_k x_k. \]
5 Spectral Theory
For a finite-dimensional vector space X, in order to understand a linear operator T : X → X, it is essential to
know the eigenvalues and eigenvectors. For infinite-dimensional spaces, the corresponding concepts still exist,
but there are a few more subtle issues to consider, too.
T x − λx = y (6)
for given y ∈ X and λ ∈ F. This leads to the question: is the operator λidX − T invertible?
There are two potential problems when we want to solve equation (6): we could have ker(λidX − T) ≠ {0}
or im(λidX − T) ≠ X. In a finite-dimensional space, the two conditions are equivalent, but not in an
infinite-dimensional space.
Definition. Let X be a Banach space and T ∈ B(X). Suppose that λ ∈ F is such that ker(λidX − T) ≠ {0}.
Then λ is called an eigenvalue of T. Any vector u ∈ ker(λidX − T)\{0} is called an eigenvector of T.
The spectrum of T is the set
σ(T) = {λ ∈ F : λidX − T is not invertible}.
Remark. By Corollary 2.6 of the Open Mapping Theorem, an invertible bounded linear operator from a
Banach space X to itself has a bounded inverse.
Theorem 5.1. Let X be a Banach space and T ∈ B(X) with ‖T‖_B(X) < 1. Then idX − T is invertible with
\[ (\mathrm{id}_X - T)^{-1} = \sum_{n=0}^{\infty} T^n. \]
Proof. Firstly, since ‖T^n‖_B(X) ≤ ‖T‖^n_B(X), we have absolute convergence of the series. Set
\[ S_n = \sum_{k=0}^{n} T^k. \]
Then
\[ (\mathrm{id}_X - T)S_n = S_n - TS_n = \sum_{k=0}^{n} T^k - \sum_{k=1}^{n+1} T^k = \mathrm{id}_X - T^{n+1} \to \mathrm{id}_X \]
in B(X). Moreover, setting
\[ S = \sum_{n=0}^{\infty} T^n, \]
we have
\[ \|(\mathrm{id}_X - T)S_n - (\mathrm{id}_X - T)S\|_{B(X)} \le \|\mathrm{id}_X - T\|_{B(X)}\,\|S_n - S\|_{B(X)} \to 0 \]
as n → ∞. It follows that
\[ (\mathrm{id}_X - T)S = \lim_{n\to\infty} (\mathrm{id}_X - T)S_n = \mathrm{id}_X. \]
Similarly, we show that S(idX − T) = idX.
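The series in Theorem 5.1 can be checked numerically in a finite-dimensional model. The sketch below assumes X = R² and a hypothetical 2×2 matrix T with small entries, so that ‖T‖ < 1. It sums the first 61 terms of the series and multiplies the result by idX − T. The helper names matmul and neumann_sum are choices made for this example.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_sum(T, terms):
    """Partial sum S_terms = id + T + T^2 + ... + T^terms."""
    n = len(T)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    S = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(terms):
        power = matmul(power, T)   # next power of T
        S = [[S[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return S

T = [[0.2, 0.1],
     [0.0, 0.3]]                  # hypothetical operator with norm below 1
S = neumann_sum(T, 60)
I_minus_T = [[1.0 - T[0][0], -T[0][1]],
             [-T[1][0], 1.0 - T[1][1]]]
product = matmul(I_minus_T, S)    # should be close to the identity matrix
```

By the identity (idX − T)S_n = idX − T^{n+1} from the proof, the deviation of `product` from the identity is exactly T^61, which is negligible here.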
Corollary 5.2. If X is a Banach space, then the set of all invertible operators in B(X) is open.
Proof. Suppose that T ∈ B(X) is invertible. If S ∈ B(X), then T − S = T(idX − T⁻¹S), which is
invertible if ‖T⁻¹S‖ < 1, by Theorem 5.1. Since ‖T⁻¹S‖ ≤ ‖T⁻¹‖‖S‖, it follows that T − S is invertible if
‖S‖ < ‖T⁻¹‖⁻¹, so T is an interior point of the set of invertible elements of B(X).
Theorem 5.3. Let X be a Banach space and T ∈ B(X). Then |λ| ≤ ‖T‖_B(X) for every λ ∈ σ(T). Furthermore,
σ(T) is compact.
Proof. If |λ| > ‖T‖_B(X), then ‖λ⁻¹T‖_B(X) < 1, so (idX − λ⁻¹T)⁻¹ exists by Theorem 5.1. Thus (λidX − T)⁻¹ =
λ⁻¹(idX − λ⁻¹T)⁻¹ exists as well. That is, we have λ ∉ σ(T).
Define the map τ : F → B(X) with τ(λ) = λidX − T. Then τ is continuous. Let G ⊂ B(X) be the set of
all invertible operators. This is open by Corollary 5.2. Hence B(X)\G is closed, and σ(T) = τ⁻¹(B(X)\G) is
closed. It is bounded as well, and thus σ(T ) is compact by the Bolzano-Weierstrass theorem.
Theorem 5.4. Let X be a normed vector space, let Y be a Banach space, and let (Tn)n∈N be a sequence of compact
operators in B(X, Y) that converges in the operator norm to T ∈ B(X, Y). Then T is compact.
Proof. Let ε > 0 and choose N ∈ N such that ‖Tn − T‖ < ε for all n ≥ N. Since TN is compact, the closure of
TN(BX) is a compact set, so we can choose y1, . . . , ym ∈ Y such that y1 + εBY, . . . , ym + εBY cover TN(BX).
Then y1 + 2εBY, . . . , ym + 2εBY cover T(BX), and since their union is closed they also cover the closure of T(BX).
As ε > 0 was arbitrary, the closure of T(BX) is totally bounded. Being a closed subset of the Banach space Y, it is
compact, so T is a compact operator.
Remarks.
• Any bounded finite rank operator (i.e. one whose image is finite-dimensional) must be compact.
• Consequently, any operator that is the limit in B(X, Y ) of a sequence of bounded finite rank operators must
be compact if Y is complete.
Example. Let (λn)n∈N be a sequence in F that converges to 0 and define T : ℓ² → ℓ² by
T : (x1, x2, x3, . . .) ↦ (λ1x1, λ2x2, λ3x3, . . .).
Then T is compact, being the limit in the operator norm of the bounded finite rank operators
Tn : (x1, x2, x3, . . .) ↦ (λ1x1, . . . , λnxn, 0, . . .).
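To see the operator-norm convergence in this example concretely, recall that a diagonal operator on ℓ² has norm equal to the supremum of the moduli of its diagonal entries, so ‖T − Tn‖ = sup_{k>n} |λk|. The sketch below evaluates this tail supremum for the hypothetical choice λk = 1/k, truncated to N entries. Both the sequence and the truncation are assumptions of the illustration.

```python
N = 2000                                     # truncation length (illustrative)
lam = [1.0 / k for k in range(1, N + 1)]     # hypothetical sequence with limit 0

def tail_norm(n):
    """sup_{k > n} |lambda_k|: norm of T - T_n in the truncated diagonal model."""
    return max((abs(v) for v in lam[n:]), default=0.0)

cutoffs = (1, 10, 100, 1000)
errors = [tail_norm(n) for n in cutoffs]     # values 1/(n + 1)
```

The entries of `errors` decrease towards 0, which is the convergence Tn → T used to conclude that T is compact.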
Theorem 5.5. Let X, Y, Z be normed vector spaces and S ∈ B(X, Y ) and T ∈ B(Y, Z).
(i) If S is compact, then T ◦ S is compact.
(ii) If T is compact, then T ◦ S is compact.
Proof. (i) Suppose that S is compact. Let (xn )n∈N be a bounded sequence. Then there exists a subsequence
(xnk )k∈N such that Sxnk → y0 for some y0 ∈ Y . As T is continuous, we have (T ◦ S)(xnk ) → T y0 .
(ii) Suppose that T is compact. Let (xn )n∈N be a bounded sequence in X. Then (Sxn )n∈N is bounded in
Y , hence ((T ◦ S)(xn ))n∈N has a convergent subsequence.
Theorem 5.6. Let X be an infinite-dimensional Banach space and K ∈ B(X) a compact operator. Then
0 ∈ σ(K).
Proof. Assume for contradiction that 0 ∉ σ(K). Then K⁻¹ exists. By the Open Mapping Theorem (or more
precisely, Corollary 2.6), we have K⁻¹ ∈ B(X).
By Theorem 1.17, the closed unit ball in X is not compact. Hence there exists a sequence (yn)n∈N in X
such that ‖yn‖_X ≤ 1 for every n ∈ N, but there exists no convergent subsequence. Set xn = K⁻¹yn. This
gives rise to a bounded sequence, but (Kxn )n∈N does not have a convergent subsequence, in contradiction to
the compactness of K.
Theorem 5.7. Let X be a Banach space and K ∈ B(X) compact. Then for any λ ∈ F\{0}, the space
ker(λidX − K) is finite-dimensional.
Proof. Let Y = ker(λidX − K). Assume for contradiction that dim Y = ∞. Then, using Theorem 1.17 again,
we construct a bounded sequence (yn )n∈N in Y with no convergent subsequence. However, as K is compact,
there exists a subsequence, say (ynk )k∈N , such that
\[ z_0 := \lim_{k\to\infty} K y_{n_k} \]
exists. Then
\[ y_{n_k} = \frac{1}{\lambda}\,K y_{n_k} \to \frac{z_0}{\lambda}. \]
But this contradicts the construction of (yn )n∈N .
By the Riesz Representation Theorem, there exists a unique z ∈ H such that φ(x) = ⟨x, z⟩ for all x ∈ H. Set
A∗ y = z. Then (7) holds by construction.
We need to check that the resulting operator A∗ is linear. Let y1 , y2 ∈ H and α ∈ F. Then
Finally, we have
Definition. Let H be a Hilbert space and A ∈ B(H). The operator A∗ ∈ B(H) characterised by (7) is called
the adjoint of A. If A∗ = A, then we say that A is self-adjoint.
Thus A∗∗ = A.
Theorem 5.10. Let H be a Hilbert space and A ∈ B(H) a self-adjoint operator. Then
(i) all eigenvalues of A are real, and
(ii) eigenvectors belonging to distinct eigenvalues are orthogonal.
Lemma 5.11. Let H be a Hilbert space and A ∈ B(H) a self-adjoint operator. Then
\[ \|A\|_{B(H)} = \sup_{\|x\|_H \le 1} |\langle Ax, x\rangle|. \]
Proof. Set
\[ M = \sup_{x \in H,\ \|x\|_H \le 1} |\langle Ax, x\rangle|. \]
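The inequality M ≤ ‖A‖_B(H) follows.
In order to prove the reverse inequality, we may assume that A ≠ 0, as the statement is obvious otherwise.
Let x ∈ H with Ax ≠ 0. Fix β > 0 and set v = β⁻¹Ax. Then ⟨Ax, v⟩ = β⁻¹‖Ax‖²_H, which is real, and so by
self-adjointness ⟨Av, x⟩ = β⁻¹‖Ax‖²_H also. Thus we obtain
\[ \begin{aligned} 4\|Ax\|_H^2 &= \langle A(\beta x + v), \beta x + v\rangle - \langle A(\beta x - v), \beta x - v\rangle \\ &\le M\left(\|\beta x + v\|_H^2 + \|\beta x - v\|_H^2\right) \\ &= 2M\left(\beta^2\|x\|_H^2 + \|v\|_H^2\right) \\ &= 2M\left(\beta^2\|x\|_H^2 + \frac{1}{\beta^2}\|Ax\|_H^2\right), \end{aligned} \]
using the parallelogram identity. For
\[ \beta^2 = \frac{\|Ax\|_H}{\|x\|_H}, \]
this gives ‖Ax‖²_H ≤ M‖Ax‖_H‖x‖_H. Now the desired inequality follows.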
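A quick numerical sanity check of the two quantities compared in this proof, in a finite-dimensional model: the sketch assumes H = R² with the Euclidean inner product and a hypothetical symmetric matrix A, and it samples unit vectors on a grid of angles. Both suprema over the unit ball are attained on the unit sphere, so sampling the sphere is enough. The sampled maxima only approximate the suprema.

```python
import math

A = [[2.0, 1.0],
     [1.0, 2.0]]   # hypothetical symmetric (self-adjoint) matrix on R^2

def apply(mat, x):
    """Matrix-vector product."""
    return [sum(a * c for a, c in zip(row, x)) for row in mat]

def inner(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

samples = 3600
unit_vectors = [(math.cos(2 * math.pi * j / samples),
                 math.sin(2 * math.pi * j / samples)) for j in range(samples)]

# M approximates sup |<Ax, x>| and op_norm approximates sup ||Ax|| over unit vectors
M = max(abs(inner(apply(A, x), x)) for x in unit_vectors)
op_norm = max(math.sqrt(inner(apply(A, x), apply(A, x))) for x in unit_vectors)
```

For this matrix both values come out close to 3, the larger eigenvalue in modulus.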
Lemma 5.12. For a Hilbert space H, let T ∈ B(H). Then (im T ∗ )⊥ = ker T .
y ∈ (im T∗)⊥ ⇔ ∀x ∈ H : ⟨y, T∗x⟩ = 0
⇔ ∀x ∈ H : ⟨Ty, x⟩ = 0 ⇔ y ∈ ker T.
Lemma 5.13. Let A ∈ B(H) be a self-adjoint operator on a Hilbert space H. If M ⊂ H is a linear subspace
with A(M ) ⊂ M , then A(M ⊥ ) ⊂ M ⊥ .
since Ax ∈ M . Hence Ay ∈ M ⊥ .
Theorem 5.14. Let H be a Hilbert space and let K ∈ B(H) be a self-adjoint compact operator. Then for every
λ ∈ F\{0}, the subspace im(λidH − K) of H is closed.
Proof. Let (zn )n∈N be a convergent sequence in im(λidH − K) with limit z0 = limn→∞ zn . For n ∈ N, choose
xn ∈ H with
zn = λxn − Kxn .
Note that ker(λidH − K) is closed, so we have, by the orthogonal decomposition,
Consider the decomposition
Then
zn = λyn − Kyn , n ∈ N.
We claim that the sequence (yn )n∈N is bounded.
Assume for contradiction that it is unbounded. Passing to a subsequence if necessary, we can suppose that
yn ≠ 0 for every n and ‖yn‖_H → ∞ as n → ∞. Set
\[ w_n = \frac{y_n}{\|y_n\|_H}. \]
Then
\[ \lambda w_n - K w_n = \frac{z_n}{\|y_n\|_H} \to 0 \quad \text{as } n \to \infty. \tag{8} \]
As K is compact, we may assume that the sequence (Kwn)n∈N is convergent (passing to a further subsequence);
say Kwn → v0. Then (8) implies
\[ w_n \to \lambda^{-1} v_0. \]
Putting this into (8) we obtain
\[ v_0 - \lambda^{-1} K v_0 = \lim_{n\to\infty} (\lambda w_n - K w_n) = 0. \]
That is, we have v0 ∈ ker(λidH − K). Since wn ∈ (ker(λidH − K))⊥, which is closed, we also have v0 ∈
(ker(λidH − K))⊥. Therefore, v0 = 0. On the other hand,
So z0 ∈ im(λidH − K).
Corollary 5.15. Let K ∈ B(H) be a self-adjoint compact operator on a Hilbert space H. Then for every
λ ∈ F\{0},
im(λidH − K) = (ker(λ̄idH − K))⊥ .
as stated.
Theorem 5.16. Let K ∈ B(H) be a self-adjoint compact operator on a Hilbert space H. If λ ∈ σ(K), then
λ = 0 or λ is an eigenvalue of K.
Proof. Suppose that 0 ≠ λ ∈ F and that λ is not an eigenvalue of K. Then λ̄ is not an eigenvalue of K either,
for otherwise λ̄ would be real by Theorem 5.10 and then λ would be an eigenvalue.
By Corollary 5.15, we have
im(λidH − K) = (ker(λ̄idH − K))⊥ .
Since ker(λ̄idH − K) = {0} we now have im(λidH − K) = H, and since ker(λidH − K) = {0} it now follows
that λidH − K is invertible, thus λ ∉ σ(K).
Theorem 5.17. Let K ∈ B(H) be a self-adjoint compact operator on a Hilbert space H. Then for every ε > 0,
the set {λ ∈ σ(K) : |λ| ≥ ε} is finite.
Proof. Suppose for contradiction that there exists a sequence (λn)n∈N in σ(K) with |λn| ≥ ε for every n ∈ N and
λm ≠ λn for m ≠ n. Then every λn is an eigenvalue according to Theorem 5.16. So there exists a corresponding
eigenvector un ∈ H with ‖un‖_H = 1. By Theorem 5.10, we have ⟨um, un⟩ = 0 for m ≠ n. Thus
⟨Kum, Kun⟩ = λmλn⟨um, un⟩ = 0 for m ≠ n,
where we have used that the eigenvalues are real by Theorem 5.10.
Furthermore,
‖Kun‖_H = |λn|‖un‖_H ≥ ε.
Hence
‖Kum − Kun‖²_H = ‖Kum‖²_H + ‖Kun‖²_H ≥ 2ε².
But then the sequence (Kun )n∈N cannot have a convergent subsequence, which contradicts the compactness of
K.
Lemma 5.18. Let K ∈ B(H) be a self-adjoint compact operator on a Hilbert space H. Then at least one of
‖K‖_B(H) and −‖K‖_B(H) is an eigenvalue of K.
Hence there exists a sequence (un)n∈N of unit vectors (i.e., ‖un‖_H = 1 for every n ∈ N), such that ⟨Kun, un⟩ →
m, where m = ±‖K‖_B(H). Note that
Furthermore, we have
\[ \|v\|_H = |m| \lim_{j\to\infty} \|u_{n_j}\|_H = |m| \ne 0. \]
Remark. If π is the orthogonal projection map on a closed linear subspace M of a Hilbert space H then
Theorem 5.19 (Spectral theorem for self-adjoint compact operators on Hilbert spaces). Let H be
an infinite-dimensional Hilbert space and let K ∈ B(H) be a self-adjoint compact operator. Then:
(i) the spectrum σ(K) is a subset of R, it contains 0, and σ(K)\{0} comprises either finitely many eigenvalues
or a sequence of eigenvalues converging to 0;
(ii) the eigenspace corresponding to each non-zero eigenvalue is finite-dimensional;
(iii) let (λn )n∈Λ comprise all non-zero eigenvalues of K, repeated according to the dimensions of their eigenspaces,
where Λ = N or Λ = {1, . . . , N } for some N ∈ N. Then there exists an orthonormal system (un )n∈Λ such that
un is an eigenvector corresponding to λn for each n ∈ Λ, and
\[ Kx = \sum_{n\in\Lambda} \lambda_n \langle x, u_n\rangle u_n \]
for every x ∈ H.
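The expansion in part (iii) has a familiar finite-dimensional analogue, which the following sketch checks numerically. It assumes H = R² (so it only illustrates the formula, since the theorem itself concerns infinite-dimensional H) and uses a hypothetical symmetric matrix K whose eigenpairs are written down by hand: eigenvalue 3 with eigenvector (1, 1)/√2 and eigenvalue 1 with eigenvector (1, −1)/√2.

```python
import math

K = [[2.0, 1.0],
     [1.0, 2.0]]   # hypothetical symmetric matrix, used as a stand-in for K

s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]   # orthonormal eigenvectors of K

def apply(mat, x):
    """Matrix-vector product."""
    return [sum(a * c for a, c in zip(row, x)) for row in mat]

def inner(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

def spectral_apply(x):
    """Evaluate sum_n lambda_n <x, u_n> u_n."""
    out = [0.0, 0.0]
    for lam, u in eigenpairs:
        coeff = lam * inner(x, u)
        out = [o + coeff * ui for o, ui in zip(out, u)]
    return out

x = (0.7, -1.3)
direct = apply(K, x)            # K applied directly
expanded = spectral_apply(x)    # K applied through the eigen-expansion
```

Both `direct` and `expanded` should agree up to rounding, and the two eigenvectors are orthogonal, as Theorem 5.10 predicts for distinct eigenvalues.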
Proof. The statement of (i) is covered by Theorems 5.6, 5.10, 5.16, and 5.17. Part (ii) is covered by Theorem
5.7. It remains to prove part (iii).
For each eigenvalue λ, we choose an orthonormal basis of the eigenspace ker(λidH − K). By Theorem
5.10, the union of all of these gives rise to an orthonormal system. Furthermore, if we label the elements
appropriately, then we can represent it in the form (un)n∈Λ, where un is an eigenvector corresponding to the
eigenvalue λn for every n ∈ Λ.
Let L be the closure of span{un : n ∈ Λ}. It follows from Theorem 1.10 that for any x ∈ L,
\[ x = \sum_{n\in\Lambda} \langle x, u_n\rangle u_n. \]
Therefore,
\[ Kx = \sum_{n\in\Lambda} \lambda_n \langle x, u_n\rangle u_n. \]
⟨K⊥x, y⟩ = ⟨Kπ⊥x, y⟩
= ⟨Kπ⊥x, π⊥y⟩ = ⟨π⊥x, Kπ⊥y⟩ = ⟨x, K⊥y⟩.
That is, the operator K ⊥ is self-adjoint, and we can apply Lemma 5.18 to K ⊥ instead of K. Therefore, we
have an eigenvalue m of K⊥ with |m| = ‖K⊥‖_B(H). Let v0 be a corresponding eigenvector. Then
mv0 = K ⊥ v0 ∈ L⊥ .
Kv0 = K ⊥ v0 = mv0 ,
so m is also an eigenvalue of K and v0 is a corresponding eigenvector. But all eigenvectors for non-zero
eigenvalues are in L, hence m = 0, which means K ⊥ = 0.
Now for x ∈ H, we have the decomposition x = x1 + x2 with x1 ∈ L and x2 ∈ L⊥ , according to H = L ⊕ L⊥ .
Therefore,
Corollary 5.20. Let H be a separable Hilbert space and let K ∈ B(H) be a self-adjoint compact operator.
Then H has an orthonormal basis consisting of eigenvectors of K.
Proof. For dim H < ∞, this is a result of linear algebra. For dim H = ∞, define (un )n∈Λ and L as in the
previous proof. Choose an orthonormal basis (vm )m∈M of L⊥ , where M = N or a finite index set. For every
m ∈ M , we have Kvm = 0. Hence vm is an eigenvector. Now combine (un )n∈Λ with (vm )m∈M .