PMath 453: Functional Analysis Notes


An Introduction to Functional Analysis

Laurent W. Marcoux

Department of Pure Mathematics


University of Waterloo
Waterloo, Ontario Canada N2L 3G1

December 13, 2018


Preface to the Fourth Edition - December 13, 2018

Thanks to Adina Goldberg, Hayley Reid, Wanchun Shen, Erlang Surya and
Zhenyuan Zhang for catching those typos and mistakes that I thought I had hidden
really well, and for suggesting alternate proofs to some results presented in the third
edition!
I’ve begun to enlarge the “Appendix” sections for each Chapter, and to add
exercises at the end of each Chapter. The “Appendix” sections are meant to be
“cultural”. Students are not required to read these to understand the remainder of
the text, and material appearing in the Appendix sections will not appear on the
exam. The exercises are meant to be much easier than the assignment exercises,
and are there to help solidify some of the definitions and concepts appearing in the
notes.
The biggest change, mathematically speaking, in the notes is that I have now
provided a proof of the fact that the only compact, quasinilpotent, normal operator
acting on an infinite-dimensional, separable, complex Hilbert space is the zero op-
erator. (Yes, I am aware of the fact that this holds even if N is not compact.) The
usual proof of this result uses Beurling’s Spectral Radius formula, and it took me a
few hours to come up with a proof that avoided that result in the setting I needed
to consider. I have nothing against Beurling nor his Spectral Radius Formula in
general (in fact, you might say that I appreciate it even more now), except that I
didn’t have the time to cover it, and therefore I had to find a “work-around” that
did not require any functional calculus.

Preface to the Third Edition - October 24, 2014

This set of notes is now undergoing its third iteration. The mathematical content
outside of the appendices is mostly stabilized, and now begins the long and lonely
hunt for typos, poor grammar, and awkward sentence constructions.
Please feel free to contact me if you find any mistakes – mathematical or other-
wise – in these notes.


Preface to the Second Edition - December 1, 2010

This set of notes has now undergone its second incarnation. I have corrected as
many typos as I have found so far, and in future instalments I will continue to add
comments and to modify the appendices where appropriate. The course number for
the Functional Analysis course at Waterloo has now changed to PMath 753, in case
anyone is checking.
The comment in the preface to the “first edition” regarding caution and buzz
saws is still à propos. Nevertheless, I maintain that this set of notes is worth at least
twice the price¹ that I’m charging for them.
For the sake of reference: excluding the material in the appendices, and allowing
for the students to study the last section on topology themselves, one should be able
to cover the material in these notes in one term, which at Waterloo consists of 36
fifty-minute lectures.

My thanks to Xiao Jiang and Ian Hincks for catching a number of typos that I
missed in the second revision.

Preface to the First Edition - December 1, 2008

The following is a set of class notes for the PMath 453/653 course I taught at the
University of Waterloo in 2008. As mentioned on the front page, they are a work in
progress, and - this being the “first edition” - they are replete with typos. A student
should approach these notes with the same caution he or she would approach buzz
saws; they can be very useful, but you should be thinking the whole time you have
them in your hands. Enjoy.

I would like to thank Paul Skoufranis for having pointed out to me an embar-
rassing number of typos. I am glad to report that he still has both hands and all of
his fingers.

¹ If you were charged a single penny for the electronic version of these notes, you were robbed.
You can get them for free from my website.

The reviews are in!

From the moment I picked your book up until I laid it down I was
convulsed with laughter. Someday I intend reading it.
Groucho Marx

This is not a novel to be tossed aside lightly. It should be thrown with
great force.
Dorothy Parker

The covers of this book are too far apart.
Ambrose Bierce

I read part of it all the way through.
Samuel Goldwyn

Reading this book is like waiting for the first shoe to drop.
Ralph Novak

Thank you for sending me a copy of your book. I’ll waste no time
reading it.
Moses Hadas

Sometimes you just have to stop writing. Even before you begin.
Stanislaw J. Lec
Contents

1. Normed Linear Spaces
2. An introduction to operators
3. Hilbert space
4. Topological Vector Spaces
5. Seminorms and locally convex spaces
6. The Hahn-Banach theorem
7. Weak topologies and dual spaces
8. Extremal points
9. The chapter of named theorems
10. Operator Theory
11. Appendix – topological background
Bibliography
Index


1. Normed Linear Spaces

I don’t like country music, but I don’t mean to denigrate those who do.
And for the people who like country music, denigrate means ‘put down’.
Bob Newhart

1.1. It is expected that the student of this course will have already seen the
notions of a normed linear space and of a Banach space. We shall review the
definitions of these spaces, as well as some of their fundamental properties. In both
cases, the underlying structure is that of a vector space. For our purposes, these
vector spaces will be over the field K, where K = R or K = C.

1.2. Definition. Let X be a vector space over K. A seminorm on X is a map


ν:X→R
satisfying
(i) ν(x) ≥ 0 for all x ∈ X;
(ii) ν(λx) = |λ| ν(x) for all x ∈ X, λ ∈ K; and
(iii) ν(x + y) ≤ ν(x) + ν(y) for all x, y ∈ X.
If ν satisfies the extra condition:
(iv) ν(x) = 0 if and only if x = 0,
then we say that ν is a norm, and we usually denote ν(·) by ‖ · ‖. In this case, we
say that (X, ‖ · ‖) (or, with a mild abuse of nomenclature, X) is a normed linear
space.

1.3. A norm on X is a generalisation of the absolute value function on K. Of
course, equipped with the absolute value function on K, one immediately defines a
metric d : K × K → R by setting d(x, y) = |x − y|.
In exactly the same way, the norm ‖ · ‖ on a normed linear space X induces a
metric
d : X × X → R
(x, y) ↦ ‖x − y‖.
The norm topology on (X, ‖ · ‖) is the topology induced by this metric. For each
x ∈ X, a neighbourhood base for this topology is given by
B_x = {V_ε(x) : ε > 0},
where V_ε(x) = {y ∈ X : d(y, x) < ε} = {y ∈ X : ‖y − x‖ < ε}. We say that the
normed linear space (X, ‖ · ‖) (or informally X) is complete if the corresponding
metric space (X, d) is complete.
2 L.W. Marcoux Functional Analysis

1.4. Example. Define
$$c_{00}^{\mathbb{K}}(\mathbb{N}) = \{(x_n)_{n=1}^{\infty} : x_n \in \mathbb{K},\ n \ge 1,\ x_n = 0 \text{ for all but finitely many } n \ge 1\}.$$
For x = (x_n)_n ∈ c_00^K(N), set ‖x‖_∞ = sup_{n≥1} |x_n|. Then (c_00^K(N), ‖ · ‖_∞) is a
normed linear space. It is not, however, complete.

The space
$$c_{0}^{\mathbb{K}}(\mathbb{N}) = \{(x_n)_{n=1}^{\infty} : x_n \in \mathbb{K},\ n \ge 1,\ \lim_{n \to \infty} x_n = 0\},$$
equipped with the same norm ‖x‖_∞ = sup_{n≥1} |x_n|, does define a complete normed
linear space.
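
The failure of completeness for finitely supported sequences can be seen concretely. The following is a minimal sketch, assuming that an element of c_00 is represented by the finite list of its leading entries, all later entries being 0; the helper names `truncated_harmonic` and `sup_dist` are illustrative and do not come from the notes. The truncations x^(k) = (1, 1/2, ..., 1/k, 0, 0, ...) form a Cauchy sequence in the sup norm, while their coordinatewise limit (1/n)_n has no zero entries, so it lies in c_0 but not in c_00.

```python
from fractions import Fraction
from itertools import zip_longest


def truncated_harmonic(k):
    """Leading entries of x^(k) = (1, 1/2, ..., 1/k, 0, 0, ...), an element of c_00."""
    return [Fraction(1, n) for n in range(1, k + 1)]


def sup_dist(x, y):
    """Sup-norm distance between two finitely supported sequences."""
    # zip_longest pads the shorter list with zeros, matching the trailing zeros.
    return max(
        (abs(a - b) for a, b in zip_longest(x, y, fillvalue=Fraction(0))),
        default=Fraction(0),
    )


# For m < k the distance is exactly 1/(m + 1), which tends to 0 as m grows.
for m, k in [(1, 5), (10, 50), (100, 1000)]:
    print(m, k, sup_dist(truncated_harmonic(m), truncated_harmonic(k)))
```

Exact rational arithmetic is used here only so that the distances 1/(m + 1) appear without rounding.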

1.5. Remark. We pause to make a comment about the terminology which we
shall be using in these notes. A vector subspace of a vector space V over K is
a non-empty subset W for which x, y ∈ W and k ∈ K implies that kx + y ∈ W .
When the vector space V does not carry a topology, there is no confusion in this
terminology. When dealing with normed linear spaces (X, ‖ · ‖), and more generally
with the topological vector spaces (V, T ) we shall deal with later in the text, and of
which normed linear spaces are an example, one needs to distinguish between those
vector subspaces which are definitely closed sets in the underlying topology from
those which may or may not be closed. For this reason, we shall refer to vector
subspaces of a topological vector space (V, T ) which may or may not be closed as
linear manifolds in V, whereas subspaces will be used to denote closed linear
manifolds. As a pedagogical tool, we shall also refer to these as closed subspaces,
although strictly speaking, in our language, this is redundant.
Thus c_00^K(N) is a linear manifold in c_0^K(N) under the norm ‖ · ‖_∞, but it is not a
subspace of c_0^K(N), because it is not closed. In fact, it is dense in c_0^K(N).

1.6. Example. Consider
$$P_{\mathbb{K}}([0,1]) = \{p = p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n : n \ge 1,\ p_i \in \mathbb{K},\ 0 \le i \le n\}.$$
Then
$$\|p\|_\infty = \sup\{|p(z)| : z \in [0,1]\}$$
defines a norm on P_K([0, 1]). The Stone-Weierstraß Theorem states that P_K([0, 1])
is a dense linear manifold in the normed linear space C([0, 1], K) of continuous,
K-valued functions on [0, 1] with the supremum norm.
If we select x_0 ∈ [0, 1] arbitrarily, then it is straightforward to check that ν(f) :=
|f(x_0)| defines a seminorm on P_K([0, 1]) which is not a norm: the non-zero
polynomial p(z) = z − x_0 satisfies ν(p) = 0.

1.7. Example. Let n ≥ 1 be an integer. If 1 ≤ p < ∞ is a real number, then
$$\|(x_1, x_2, \dots, x_n)\|_p = \Big( \sum_{k=1}^{n} |x_k|^p \Big)^{1/p}$$
defines a norm on K^n, called the p-norm. We often write ℓ^p_n for (K^n, ‖ · ‖_p), when
the underlying field K is understood. We may also define
$$\|(x_1, x_2, \dots, x_n)\|_\infty = \max(|x_1|, |x_2|, \dots, |x_n|).$$
Observe that (K^n, ‖ · ‖_∞) is a normed linear space. We abbreviate this to ℓ^∞_n when
K is understood.
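
The two formulas of this example translate directly into code. The sketch below is only an illustration, assuming vectors are given as finite tuples of real or complex numbers; the function name `p_norm` is our own choice and not notation from the notes.

```python
import math


def p_norm(x, p):
    """p-norm of a finite tuple x of real or complex numbers, 1 <= p <= infinity."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    if math.isinf(p):
        # The infinity-norm is the largest modulus among the coordinates.
        return max((abs(t) for t in x), default=0.0)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


x = (3, -4)
print(p_norm(x, 1))         # 7.0
print(p_norm(x, 2))         # 5.0
print(p_norm(x, math.inf))  # 4
```

Passing `math.inf` for p selects the maximum formula, so a single function covers the whole range 1 ≤ p ≤ ∞.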

1.8. Example. For 1 ≤ p < ∞, we define
$$\ell^p_{\mathbb{K}}(\mathbb{N}) = \Big\{(x_n)_{n=1}^{\infty} : x_n \in \mathbb{K},\ n \ge 1, \text{ and } \sum_{n=1}^{\infty} |x_n|^p < \infty\Big\}.$$
For (x_n)_{n=1}^∞ ∈ ℓ^p_K(N), we set
$$\|(x_n)_n\|_p = \Big( \sum_{n=1}^{\infty} |x_n|^p \Big)^{1/p}.$$
Then ‖ · ‖_p defines a norm on ℓ^p_K(N), again called the p-norm.

As above, we may also define
$$\ell^\infty_{\mathbb{K}}(\mathbb{N}) = \{(x_n)_{n=1}^{\infty} : x_n \in \mathbb{K},\ n \ge 1,\ \sup_n |x_n| < \infty\}.$$
The ∞-norm on ℓ^∞_K(N) is given by
$$\|(x_n)_n\|_\infty = \sup_n |x_n|.$$

In most contexts, the underlying field K is understood, and we shall write only
ℓ^p(N), or even ℓ^p, 1 ≤ p ≤ ∞.

The last two examples have one especially nice property not shared by c_00^K(N)
and P_K([0, 1]), namely: they are complete.

1.9. Definition. A Banach space is a complete normed linear space.

1.10. Example. Let C([0, 1], K) = {f : [0, 1] → K : f is continuous}, equipped
with the uniform norm
‖f‖_∞ = max{|f(z)| : z ∈ [0, 1]}.
Then (C([0, 1], K), ‖ · ‖_∞) is a Banach space.

1.11. Example. Let D ⊆ C denote the open unit disc D = {z ∈ C : |z| < 1}, with
closure D̄ = {z ∈ C : |z| ≤ 1}, and let T denote the unit circle T = {z ∈ C : |z| = 1}.
Consider the disc algebra
A(D) := {f ∈ C(D̄) : f|_D is holomorphic}.
The function
‖f‖_∞ := sup_{z∈D̄} |f(z)|
is easily seen to define a norm on A(D).
It follows from elementary Complex Analysis that the space C[z] of polynomials
(with domain restricted to D̄) is dense in A(D), and that A(D) is complete with
respect to this norm. That is, A(D) is a Banach space.
Furthermore, from the Maximum Modulus Principle, we see that the map
Γ : A(D) → C(T)
f ↦ f|_T
is isometric, and so we can (and do) identify A(D) with the algebra
{f ∈ C(T) : f extends to a holomorphic function on D},
equipped with the norm ‖f‖_∞ = sup_{|z|=1} |f(z)|.

The perspicacious reader will have noticed that the product fg of two elements
f, g of A(D) is again an element of A(D), and that ‖fg‖_∞ ≤ ‖f‖_∞ ‖g‖_∞. This
means that A(D) forms a Banach algebra. These important examples of Banach
spaces form a topic of study in their own right, and we shall not have much to say
about them here.

1.12. Definition. Let H be an inner product space over K; that is, there exists
a map
⟨·, ·⟩ : H × H → K
which, for all x, x_1, x_2, y ∈ H and λ ∈ K, satisfies:
(i) ⟨x_1 + x_2, y⟩ = ⟨x_1, y⟩ + ⟨x_2, y⟩;
(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$;
(iii) ⟨λx, y⟩ = λ⟨x, y⟩;
(iv) ⟨x, x⟩ ≥ 0, with equality holding if and only if x = 0.
(Of course, when K = R, the complex conjugation in (ii) is superfluous.) Recall
that the canonical norm on H induced by the inner product is given by

‖x‖ = ⟨x, x⟩^{1/2}.

If H is complete with respect to the corresponding metric, then we say that H is a
Hilbert space. Thus every Hilbert space is a Banach space.

1.13. Example. Recall that ℓ^2_K(N) is a Hilbert space with inner product
$$\langle (x_n)_n, (y_n)_n \rangle = \sum_{n=1}^{\infty} x_n \overline{y_n}.$$
More generally, let (X, µ) be a measure space. Then H = L^2(X, µ) is a Hilbert
space with
$$\langle f, g \rangle = \int_X f \, \overline{g} \, d\mu.$$

1.14. It is easy to see that if X is a normed linear space, then the vector space
operations
σ : X × X → X, (x, y) ↦ x + y
and
µ : K × X → X, (λ, x) ↦ λx
of addition and scalar multiplication are continuous (from the respective product
topologies on X × X and on K × X to the norm topology on X). The proof is left
as an exercise for the reader. In particular, therefore, if 0 ≠ λ ∈ K, y ∈ X, then
σ_y : X → X defined by σ_y(x) = x + y and µ_λ : X → X defined by µ_λ(x) = λx are
homeomorphisms.
As a simple corollary to this fact, a set G ⊆ X is open (resp. closed) if and only
if G + y is open (resp. closed) for all y ∈ X, and λG is open (resp. closed) for all
0 ≠ λ ∈ K. We shall return to this in a later section.
1.15. New Banach spaces from old. We now exhibit a few constructions
which allow us to produce new Banach spaces from simpler building blocks.
Let (X_n, ‖ · ‖_n)_{n=1}^∞ denote a countable family of Banach spaces. Let X = ∏_n X_n.
(a) For each 1 ≤ p < ∞, define
$$\sum_{n=1}^{\infty} \oplus_p X_n = \Big\{(x_n)_n \in X : \|(x_n)_n\|_p = \Big( \sum_{n=1}^{\infty} \|x_n\|_n^p \Big)^{1/p} < \infty\Big\}.$$
Then ∑_{n=1}^∞ ⊕_p X_n is a Banach space, referred to as the ℓ^p-direct sum of
the (X_n)_n.
(b) With p = ∞,
$$\sum_{n=1}^{\infty} \oplus_\infty X_n = \Big\{(x_n)_n \in X : \|(x_n)_n\|_\infty = \sup_{n \ge 1} \|x_n\|_n < \infty\Big\}.$$
Again, ∑_{n=1}^∞ ⊕_∞ X_n is a Banach space - namely the ℓ^∞-direct sum of the
(X_n)_n.
(c) We also define
$$c_0(X) = \{(x_n)_n \in X : x_n \in X_n,\ n \ge 1, \text{ and } \lim_{n \to \infty} \|x_n\|_n = 0\}.$$
The norm on c_0(X) is ‖(x_n)_n‖_∞ = sup_{n≥1} ‖x_n‖_n, and equipped with this
norm, c_0(X) is easily seen to be a closed subspace of ∑_{n=1}^∞ ⊕_∞ X_n.

1.16. Definition. Let X be a vector space equipped with two norms ‖ · ‖ and
||| · |||. We say that these norms are equivalent if there exist constants κ_1, κ_2 > 0
so that
κ_1 ‖x‖ ≤ |||x||| ≤ κ_2 ‖x‖ for all x ∈ X.

We remark that when this is the case,
$$\frac{1}{\kappa_2} |||x||| \le \|x\| \le \frac{1}{\kappa_1} |||x|||,$$
resolving the apparent lack of symmetry in the definition of equivalence of norms.

1.17. Example. Fix n ≥ 1 an integer, and let X = C^n. For x = (x_1, x_2, ..., x_n) ∈ X,
$$\|x\|_1 = \sum_{k=1}^{n} |x_k| \le \sum_{k=1}^{n} \Big( \max_j |x_j| \Big) = \sum_{k=1}^{n} \|x\|_\infty = n \|x\|_\infty.$$
Moreover,
$$\|x\|_\infty = \max_j |x_j| \le \sum_{k=1}^{n} |x_k| = \|x\|_1,$$
so that
‖x‖_∞ ≤ ‖x‖_1 ≤ n‖x‖_∞.
This proves that ‖ · ‖_1 and ‖ · ‖_∞ are equivalent norms on X. As we shall later see,
all norms on a finite dimensional vector space are equivalent.
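
The two-sided estimate between the 1-norm and the ∞-norm can be spot-checked numerically. This is a sketch only, assuming random complex vectors in C^5 as test data; the helper names `norm_1` and `norm_inf` are illustrative. A random test proves nothing, but it also shows that both constants, 1 and n, are attained and so cannot be improved.

```python
import random


def norm_1(x):
    """Sum of the moduli of the coordinates."""
    return sum(abs(t) for t in x)


def norm_inf(x):
    """Largest modulus among the coordinates."""
    return max(abs(t) for t in x)


random.seed(0)
n = 5
for _ in range(1000):
    x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    # Small tolerance on the right to absorb floating-point rounding.
    assert norm_inf(x) <= norm_1(x) <= n * norm_inf(x) + 1e-12

# Both constants are attained:
e1 = [1] + [0] * (n - 1)   # norm_1(e1) equals norm_inf(e1)
ones = [1] * n             # norm_1(ones) equals n * norm_inf(ones)
print(norm_1(e1), norm_inf(e1), norm_1(ones), norm_inf(ones))
```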

1.18. Example. Let X = C([0, 1], C), and consider the norms
$$\|f\|_\infty = \sup\{|f(x)| : x \in [0,1]\}$$
and
$$\|f\|_1 = \int_0^1 |f(x)| \, dx$$
on X. If, for each n ≥ 1, we set f_n to be the function f_n(x) = x^n, then ‖f_n‖_∞ = 1,
while ‖f_n‖_1 = ∫_0^1 x^n dx = 1/(n+1). Since ‖f_n‖_1/‖f_n‖_∞ tends to 0 as n → ∞, no
constant κ > 0 can satisfy κ‖f‖_∞ ≤ ‖f‖_1 for all f ∈ X, and so ‖ · ‖_1 and ‖ · ‖_∞ are
inequivalent norms on X.
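
The computation in this example can be tabulated. The sketch below is an illustration under stated assumptions: `one_norm_fn` and `sup_norm_fn` simply encode the closed forms 1/(n+1) and 1 for f_n(x) = x^n, and `midpoint_integral` is a hypothetical helper that cross-checks the integral numerically with the midpoint rule.

```python
from fractions import Fraction


def sup_norm_fn(n):
    """Sup norm of f_n(x) = x**n on [0, 1]; the maximum is attained at x = 1."""
    return Fraction(1)


def one_norm_fn(n):
    """1-norm of f_n: the integral of x**n over [0, 1], which is 1/(n + 1)."""
    return Fraction(1, n + 1)


def midpoint_integral(n, steps=100_000):
    """Numerical cross-check of the integral of x**n via the midpoint rule."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** n for i in range(steps))


for n in (1, 10, 100, 1000):
    ratio = one_norm_fn(n) / sup_norm_fn(n)
    print(n, ratio, midpoint_integral(n))
```

The ratios 1/(n+1) shrink to 0, so any constant κ with κ‖f‖_∞ ≤ ‖f‖_1 for every f would have to be at most 1/(n+1) for all n, which forces κ = 0.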

1.19. Proposition. Two norms ‖ · ‖ and ||| · ||| on a vector space X are equivalent
if and only if they generate the same metric topologies.
Proof. Suppose first that ‖ · ‖ and ||| · ||| are equivalent, say κ_1 ‖x‖ ≤ |||x||| ≤ κ_2 ‖x‖
for all x ∈ X, where κ_1, κ_2 > 0 are constants. If x ∈ X and (x_n)_n is a sequence in
X, then it immediately follows that
lim_{n→∞} ‖x_n − x‖ = 0 if and only if lim_{n→∞} |||x_n − x||| = 0.

That is, the two notions of convergence coincide, and thus the topologies are equal.
Conversely, suppose that the metric topologies τ_{‖·‖} and τ_{|||·|||}, induced by ‖ · ‖
and ||| · ||| respectively, coincide. Then G = {x ∈ X : ‖x‖ < 1} is an open neighbourhood
of 0 in (X, ||| · |||), and so there exists δ > 0 so that H = {x ∈ X : |||x||| < δ} ⊆ G. That
is, |||x||| < δ implies ‖x‖ < 1. In particular, therefore, |||x||| ≤ δ/2 implies ‖x‖ ≤ 1.
Applying this to x = (δ/2) y/|||y||| for 0 ≠ y ∈ X, we see that ‖y‖ ≤ (2/δ)|||y||| for
all y ∈ X. By symmetry, there exists a constant
κ_2 > 0 so that |||y||| ≤ κ_2 ‖y‖ for all y ∈ X.
Thus ‖ · ‖ and ||| · ||| are equivalent norms.
□

1.20. Corollary. Equivalence of norms is an equivalence relation for norms on
a vector space X.

1.21. Definition. Let (X, ‖ · ‖) be a normed linear space. A series ∑_{n=1}^∞ x_n in
X is said to be absolutely summable if ∑_{n=1}^∞ ‖x_n‖ < ∞.

The following result provides a very practical tool when trying to decide whether
or not a given normed linear space is complete. We remark that the second half of
the proof uses the standard fact that if (yn )n is a Cauchy sequence in a metric space
(Y, d), and if (yn )n admits a convergent subsequence with limit y0 , then the original
sequence (yn )n converges to y0 as well.

1.22. Proposition. Let (X, ‖ · ‖) be a normed linear space. The following
statements are equivalent:
(a) X is complete, and hence X is a Banach space.
(b) Every absolutely summable series in X is summable.
Proof.
(a) implies (b): Suppose that X is complete, and that ∑ x_n is absolutely
summable. For each k ≥ 1, let y_k = ∑_{n=1}^k x_n. Given ε > 0, we can find
N > 0 so that m ≥ N implies ∑_{n=m}^∞ ‖x_n‖ < ε. If k ≥ m ≥ N, then
$$\|y_k - y_m\| = \Big\| \sum_{n=m+1}^{k} x_n \Big\| \le \sum_{n=m+1}^{k} \|x_n\| \le \sum_{n=m+1}^{\infty} \|x_n\| < \varepsilon,$$
so that (y_k)_k is Cauchy in X. Since X is complete, y = lim_{k→∞} y_k =
lim_{k→∞} ∑_{n=1}^k x_n = ∑_{n=1}^∞ x_n exists, i.e. ∑_{n=1}^∞ x_n is summable.
(b) implies (a): Next suppose that every absolutely summable series in X is
summable, and let (y_j)_j be a Cauchy sequence in X. For each n ≥ 1 there
exists N_n > 0 so that k, m ≥ N_n implies ‖y_k − y_m‖ < 1/2^{n+1}. Clearly,
we may assume without loss of generality that N_1 < N_2 < N_3 < · · · . Let
x_1 = y_{N_1} and for n ≥ 2, let x_n = y_{N_n} − y_{N_{n−1}}. Then ‖x_n‖ < 1/2^n for all
n ≥ 2, so that
$$\sum_{n=1}^{\infty} \|x_n\| \le \|x_1\| + \sum_{n=2}^{\infty} \frac{1}{2^n} \le \|x_1\| + \frac{1}{2} < \infty.$$
By hypothesis, y = ∑_{n=1}^∞ x_n = lim_{k→∞} ∑_{n=1}^k x_n exists. But
∑_{n=1}^k x_n = y_{N_k}, so that lim_{k→∞} y_{N_k} = y ∈ X. Recalling that (y_j)_j was
Cauchy, we conclude from the remark preceding the Proposition that (y_j)_j
also converges to y. Since every Cauchy sequence in X converges, X is
complete.
□

1.23. Theorem. Let (X, ‖ · ‖) be a normed linear space, and let M ⊆ X be a
linear manifold. Then
p(x + M) := inf{‖x + m‖ : m ∈ M}
defines a seminorm on the quotient space X/M.
This formula defines a norm on X/M if and only if M is closed.
Proof. First observe that the function p is well-defined; for if x + M = y + M, then
x − y ∈ M and so
p(y + M) = inf{‖y + m‖ : m ∈ M}
= inf{‖y + m + (x − y)‖ = ‖x + m‖ : m ∈ M}
= p(x + M).
Clearly p(x + M) ≥ 0 for all x + M ∈ X/M. If 0 ≠ k ∈ K, then m ∈ M if and
only if (1/k)m ∈ M and so
p(k(x + M)) = p(kx + M)
= inf{‖kx + m‖ : m ∈ M}
= inf{‖k(x + (1/k)m)‖ : m ∈ M}
= |k| inf{‖x + m′‖ : m′ ∈ M}
= |k| p(x + M).
If k = 0, then p(0 + M) = 0, since m = 0 ∈ M.
Finally,
p((x + M) + (y + M)) = p(x + y + M)
= inf{‖(x + y) + m‖ : m ∈ M}
= inf{‖(x + m_1) + (y + m_2)‖ : m_1, m_2 ∈ M}
≤ inf{‖x + m_1‖ + ‖y + m_2‖ : m_1, m_2 ∈ M}
= p(x + M) + p(y + M).
In the case where M is closed in X, suppose that p(x + M) = 0 for some x ∈ X.
Then
inf{‖x + m‖ : m ∈ M} = 0,
so there exist m_n ∈ M, n ≥ 1, so that −x = lim_{n→∞} m_n. Since M is closed, −x ∈ M
and so x + M = x + (−x) + M = 0 + M, proving that p is a norm.
The converse statement is left as an exercise.
□

1.24. Let X be a normed linear space and M be a linear manifold in X. We shall
denote the canonical quotient map from X to X/M by q (or q_M if the need to be
specific arises). When M is closed in X, we shall denote the norm from Theorem 1.23
once again by ‖ · ‖ (or ‖ · ‖_{X/M}), so that
‖q(x)‖ = ‖x + M‖ = inf{‖x + m‖ : m ∈ M}.
It is clear that ‖q(x)‖ ≤ ‖x‖ for all x ∈ X, and so q is continuous. Indeed, given
ε > 0, we can take δ = ε to get ‖x − y‖ < δ implies ‖q(x) − q(y)‖ ≤ ‖x − y‖ < ε.
We shall see below that q is also an open map - i.e. it takes open sets to open sets.

1.25. Theorem. Let X be a normed linear space and M be a closed subspace
of X.
(a) If X is complete, then so are M and X/M.
(b) If M and X/M are complete, then so is X.
Proof.
(a) Suppose that X is complete. We first show that M is complete.
Let (m_n)_{n=1}^∞ be a Cauchy sequence in M. Then it is Cauchy in X and
X is complete, so that x = lim_{n→∞} m_n ∈ X. Since M is closed in X, x ∈ M.
Thus M is complete.
Note that this argument shows that any closed subset of a complete
metric space is complete.
Next we show that X/M is also complete.
Let ∑_n q(x_n) be an absolutely summable series in X/M. For each
n ≥ 1, choose m_n ∈ M so that ‖x_n + m_n‖ ≤ ‖q(x_n)‖ + 1/2^n. Then
$$\sum_n \|x_n + m_n\| \le \sum_n \Big( \|q(x_n)\| + \frac{1}{2^n} \Big) < \infty,$$
so ∑_n (x_n + m_n) is summable in X since X is complete. Set
$$x_0 := \sum_n (x_n + m_n).$$
By the continuity of q,
$$q(x_0) = q\Big( \sum_n (x_n + m_n) \Big) = \sum_n q(x_n + m_n) = \sum_n q(x_n).$$
Thus every absolutely summable series in X/M is summable, and so by
Proposition 1.22, X/M is complete.

(b) Suppose next that M and X/M are both complete.
Let (x_n)_{n=1}^∞ be a Cauchy sequence in X. Then (q(x_n))_{n=1}^∞ is Cauchy in
X/M and thus q(y) = lim_{n→∞} q(x_n) exists, by the completeness of X/M.
For n ≥ 1, choose m_n ∈ M so that
$$\|y - (x_n + m_n)\| < \|q(y) - q(x_n)\| + \frac{1}{2^n}.$$
Since (x_n + m_n)_{n=1}^∞ converges to y in X, it follows that it is a Cauchy
sequence. Since both (x_n)_{n=1}^∞ and (x_n + m_n)_{n=1}^∞ are Cauchy, it follows that
(m_n)_{n=1}^∞ is also Cauchy, a fact that follows easily from the observation
that
‖m_j − m_i‖ ≤ ‖(x_j + m_j) − (x_i + m_i)‖ + ‖x_j − x_i‖.
But M is complete and so m := lim_{n→∞} m_n ∈ M. This yields
y − m = lim_{n→∞} (x_n + m_n) − m = lim_{n→∞} x_n,
so that (x_n)_{n=1}^∞ converges to y − m in X. That is, X is complete.
□

1.26. Proposition. Let X be a normed linear space and M be a closed subspace
of X. Let q : X → X/M denote the canonical quotient map.
(a) A subset W ⊆ X/M is open if and only if q^{−1}(W) is open in X.
(b) The map q is an open map - i.e., if G ⊆ X is open, then q(G) is open in
X/M.
Proof.
(a) If W ⊆ X/M is open, then q^{−1}(W) is open in X because q is continuous.
Suppose next that W ⊆ X/M and that q^{−1}(W) is open in X. Let
q(x) ∈ W. Then x ∈ q^{−1}(W), and so we can find δ > 0 so that V_δ(x) ⊆
q^{−1}(W). If ‖q(y) − q(x)‖ < δ, then ‖y − x + m‖ < δ for some m ∈ M, and
thus q(y) = q(y + m) ∈ q(V_δ(x)) ⊆ W. That is, V_δ(q(x)) ⊆ W, and W is
open.
(b) Let G ⊆ X be an open set. Observe that q^{−1}(q(G)) = G + M = ∪_{m∈M} (G + m)
is open, being the union of open sets. By (a), q(G) is open.
□

1.27. Let M be a finite-dimensional linear manifold in a normed linear space
X. Then M is closed in X. The proof of this is left as an assignment exercise.

1.28. Proposition. Let X be a normed linear space. If M and Z are closed
subspaces of X and dim Z < ∞, then M + Z is closed in X.
Proof. Let q : X → X/M denote the canonical quotient map. Since Z is a finite
dimensional vector space, so is q(Z). By the exercise preceding this Proposition,
q(Z) is closed in X/M. Since q is continuous, M + Z = q^{−1}(q(Z)) is closed in X.
□

Appendix to Section 1.

1.29. This course assumes that the reader has taken at least enough Real Analysis
to have seen that (ℓ^p_K(N), ‖ · ‖_p) is a normed linear space for each 1 ≤ p ≤ ∞.
Having said that, let us review Hölder’s Inequality as well as Minkowski’s Inequality
in this setting, since Hölder’s Inequality is also useful in studying dual spaces in the
next Section. The reader will recall that Minkowski’s Inequality is the statement
that the p-norm is subadditive; that is, that the p-norm satisfies condition (iii) of
Definition 1.2. We remark that both inequalities hold for more general L^p-spaces.
Our decision to concentrate on ℓ^p-spaces instead of their more general counterparts
is an attempt to accommodate the background of the students who took this course,
as opposed to a conscious effort to avoid L^p-spaces.

Before proving Hölder’s Inequality, we pause to prove the following Lemma.

1.30. Lemma. Let a and b be positive real numbers and suppose that
1 < p, q < ∞ satisfy 1/p + 1/q = 1. Then
$$a^{1/p} b^{1/q} \le \frac{a}{p} + \frac{b}{q}.$$
Proof. Let 0 < t < 1 and consider the function
f(x) = x^t − tx + t − 1,
defined on (0, ∞). Then
f′(x) = t x^{t−1} − t = t(x^{t−1} − 1).
Thus f(1) = 0 = f′(1). Since f′(x) > 0 for x ∈ (0, 1) and f′(x) < 0 for x ∈ (1, ∞),
it follows that
f(x) < f(1) = 0 for all x ≠ 1.
That is, x^t ≤ (1 − t) + tx for all x > 0, with equality holding if and only if x = 1.
Letting x = a/b, t = 1/p yields
$$a^{1/p} b^{1/q - 1} = a^{1/p} b^{-1/p} = \Big( \frac{a}{b} \Big)^{1/p} \le \Big( 1 - \frac{1}{p} \Big) + \frac{1}{p} \Big( \frac{a}{b} \Big) = \frac{1}{p} \Big( \frac{a}{b} \Big) + \frac{1}{q}.$$
Multiplying both sides of the inequality by b yields the desired inequality.
□
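
A quick numerical spot-check of the Lemma can be reassuring. The sketch below is an illustration only: `young_gap` is a hypothetical helper returning the difference between the right-hand and left-hand sides, and the sampling ranges for a, b and p are arbitrary choices. Agreement on random samples is of course no substitute for the proof.

```python
import random


def young_gap(a, b, p):
    """Return a/p + b/q - a**(1/p) * b**(1/q), where q satisfies 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return a / p + b / q - a ** (1.0 / p) * b ** (1.0 / q)


random.seed(1)
worst = min(
    young_gap(random.uniform(0.01, 10), random.uniform(0.01, 10), random.uniform(1.1, 5))
    for _ in range(10_000)
)
print(worst)                     # nonnegative up to rounding error
print(young_gap(2.0, 2.0, 3.0))  # approximately 0: equality holds when a = b
print(young_gap(1.0, 4.0, 2.0))  # 0.5 + 2 - 2 = 0.5
```

The equality case a = b corresponds to x = a/b = 1 in the proof.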

1.31. Theorem. Hölder’s Inequality.
Let 1 ≤ p, q ≤ ∞, and suppose that 1/p + 1/q = 1. Let x = (x_n)_n ∈ ℓ^p and
y = (y_n)_n ∈ ℓ^q. If z = (z_n)_n, where z_n = x_n y_n for all n ≥ 1, then z ∈ ℓ^1 and
‖z‖_1 ≤ ‖x‖_p ‖y‖_q.

Proof. The cases where p = 1 or p = ∞ are routine and are left to the reader.
First let us suppose that ‖x‖_p = ‖y‖_q = 1. Applying the previous Lemma to
our sequences x and y yields, for each n ≥ 1,
$$|x_n y_n| = (|x_n|^p)^{1/p} (|y_n|^q)^{1/q} \le \frac{1}{p} |x_n|^p + \frac{1}{q} |y_n|^q,$$
so that
$$\sum_n |z_n| = \sum_n |x_n y_n| \le \frac{1}{p} \sum_n |x_n|^p + \frac{1}{q} \sum_n |y_n|^q = \frac{1}{p} \|x\|_p^p + \frac{1}{q} \|y\|_q^q = 1.$$
In general, if x ∈ ℓ^p and y ∈ ℓ^q are both non-zero (the inequality being trivial
otherwise), let u = x/‖x‖_p, v = y/‖y‖_q so that
‖u‖_p = 1 = ‖v‖_q and so
$$\frac{1}{\|x\|_p \|y\|_q} \sum_n |x_n y_n| = \sum_n |u_n v_n| \le 1.$$
Thus
‖z‖_1 ≤ ‖x‖_p ‖y‖_q.
□
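
Hölder’s Inequality can be spot-checked on finitely supported sequences. The sketch below rests on assumptions of our own: vectors are finite lists of reals, 1 < p < ∞, and the names `p_norm` and `holder_sides` are illustrative rather than taken from the notes.

```python
import random


def p_norm(x, p):
    """p-norm of a finite list, for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_sides(x, y, p):
    """Return (||z||_1, ||x||_p * ||y||_q) for z_n = x_n * y_n and 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)


random.seed(2)
for _ in range(1000):
    p = random.uniform(1.1, 6)
    x = [random.uniform(-1, 1) for _ in range(20)]
    y = [random.uniform(-1, 1) for _ in range(20)]
    lhs, rhs = holder_sides(x, y, p)
    assert lhs <= rhs + 1e-12  # tolerance for floating-point rounding

# With p = q = 2 and y = x the two sides agree (up to rounding).
print(holder_sides([1, 2, 3], [1, 2, 3], 2))
```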

Hölder’s Inequality is the key to proving Minkowski’s Inequality.

1.32. Theorem. Minkowski’s Inequality.
Let 1 ≤ p ≤ ∞, and suppose that x = (x_n)_n and y = (y_n)_n are in ℓ^p. Then
x + y = (x_n + y_n)_n ∈ ℓ^p and
‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p.

Proof. Again, the cases where p = 1 and where p = ∞ are left to the reader.
Suppose therefore that 1 < p < ∞. Observe that if a, b > 0, then
$$\Big( \frac{a+b}{2} \Big)^p \le a^p + b^p,$$
so that (a + b)^p ≤ 2^p (a^p + b^p). It follows that
$$\sum_n |x_n + y_n|^p \le 2^p \sum_n (|x_n|^p + |y_n|^p) < \infty,$$
which proves that x + y ∈ ℓ^p.

By Hölder’s Inequality,
$$\sum_n |x_n + y_n|^{p-1} |x_n| \le \|x\|_p \, \|(|x_n + y_n|^{p-1})_n\|_q,$$
and similarly
$$\sum_n |x_n + y_n|^{p-1} |y_n| \le \|y\|_p \, \|(|x_n + y_n|^{p-1})_n\|_q.$$
Now
$$\begin{aligned}
\|(|x_n + y_n|^{p-1})_n\|_q &= \Big( \sum_n |x_n + y_n|^{(p-1)q} \Big)^{1/q} \\
&= \Big( \sum_n |x_n + y_n|^{pq - q} \Big)^{1/q} \\
&= \Big( \sum_n |x_n + y_n|^{p} \Big)^{1/q} \\
&= \|(x_n + y_n)_n\|_p^{p/q}.
\end{aligned}$$
Hence
$$\begin{aligned}
\|x + y\|_p^p &= \sum_n |x_n + y_n| \, |x_n + y_n|^{p-1} \\
&\le \sum_n (|x_n| + |y_n|) \, |x_n + y_n|^{p-1} \\
&\le (\|x\|_p + \|y\|_p) \, \|(|x_n + y_n|^{p-1})_n\|_q \\
&= (\|x\|_p + \|y\|_p) \, \|(x_n + y_n)_n\|_p^{p/q},
\end{aligned}$$
from which we get
$$\|x + y\|_p = \|x + y\|_p^{p - p/q} \le \|x\|_p + \|y\|_p.$$
□
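
As with Hölder’s Inequality, the subadditivity of the p-norm can be spot-checked on finite vectors. This is a sketch under our own assumptions (finite real lists, illustrative names `p_norm` and `minkowski_gap`). The last line steps outside the range 1 ≤ p ≤ ∞ treated in the Theorem, purely to illustrate that the same formula is not subadditive for p = 1/2.

```python
import random


def p_norm(x, p):
    """The expression (sum |x_k|^p)^(1/p) for a finite list x and p > 0."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def minkowski_gap(x, y, p):
    """||x||_p + ||y||_p - ||x + y||_p, which is nonnegative whenever p >= 1."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)


random.seed(3)
gaps = [
    minkowski_gap(
        [random.uniform(-1, 1) for _ in range(10)],
        [random.uniform(-1, 1) for _ in range(10)],
        random.uniform(1, 8),
    )
    for _ in range(1000)
]
print(min(gaps))  # nonnegative up to rounding error

# For p = 1/2 the triangle inequality fails: 1 + 1 - 4 = -2.
print(minkowski_gap([1, 0], [0, 1], 0.5))
```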

Let us now examine a couple of examples of useful Banach spaces whose defi-
nitions require a somewhat better background in Analysis than we are assuming in
the main body of the text.

1.33. Example. Let x = (x_n)_n be a sequence of complex (or real) numbers.
The total variation of x is defined by
$$V(x) := \sum_{n=1}^{\infty} |x_{n+1} - x_n|.$$
If V(x) < ∞, we say that x has bounded variation. The space
bv := {(x_n)_n : x_n ∈ K, n ≥ 1, V(x) < ∞}
is called the space of sequences of bounded variation. We may define a norm
on bv as follows: for x ∈ bv, we set
$$\|(x_n)_n\|_{bv} := |x_1| + V(x) = |x_1| + \sum_{n=1}^{\infty} |x_{n+1} - x_n|.$$
It can be shown that bv is complete under this norm, and hence that bv is a
Banach space.

If we let bv_0 = {(x_n)_n ∈ bv : lim_{n→∞} x_n = 0}, then
‖(x_n)_n‖_{bv_0} := V((x_n)_n)
defines a norm on bv_0, and again, bv_0 is a Banach space with respect to this norm.
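
For sequences that are eventually constant, the total variation is a finite sum and can be computed directly. The sketch below assumes such a sequence is given by a non-empty list of its leading entries and is constant from the last listed entry onwards; the names `total_variation` and `bv_norm` are illustrative.

```python
def total_variation(x):
    """V(x) for a sequence that is constant after its last listed entry."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def bv_norm(x):
    """|x_1| + V(x) for a non-empty list x of leading entries."""
    return abs(x[0]) + total_variation(x)


x = [1, -1, 2, 2, 2]       # read as (1, -1, 2, 2, 2, 2, ...), in bv but not in bv_0
print(total_variation(x))  # |-1 - 1| + |2 - (-1)| + 0 + 0 = 5
print(bv_norm(x))          # 1 + 5 = 6

z = [3, 1, 0]              # read as (3, 1, 0, 0, ...), an element of bv_0
print(total_variation(z))  # 2 + 1 = 3
```

In the second example V(z) ≥ |z_1|. This holds for every element of bv_0, since z_1 = −∑_n (z_{n+1} − z_n) when the sequence tends to 0, which is why V alone already separates points of bv_0.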

1.34. Example. The geometric theory of real Banach spaces is an active and
exciting area. For a period of time, the following question was open [Lin71]: does
every infinite-dimensional Banach space contain a subspace which is linearly
homeomorphic to one of the spaces ℓ^p, 1 ≤ p < ∞ or c_0? In 1974, B.S. Tsirel’son [Tsi74]
provided a counterexample to this conjecture. In this example, we shall discuss
the broad outline of the construction of the Tsirel’son space, omitting the proofs of
certain technical details.

We begin by considering the space c_0 of Example 1.4. For each n ≥ 1, let e_n ∈ c_0
denote the sequence (0, 0, ..., 0, 1, 0, 0, ...), with the unique “1” occurring in the n-th
coordinate. Given x = (x_n)_n ∈ c_0, we may write x = ∑_{n=1}^∞ x_n e_n. Let us also define
the map P_n : c_0 → c_0 via P_n((x_k)_k) := (0, 0, ..., 0, x_{n+1}, x_{n+2}, x_{n+3}, ...).
Given a finite set {v_1, v_2, ..., v_r} of vectors in c_0, we shall say that they are
block-disjoint or consecutively supported, written v_1 < v_2 < · · · < v_r, if there exist
α_1, α_2, ..., α_r, β_1, β_2, ..., β_r ∈ N with
α_1 ≤ β_1 < α_2 ≤ β_2 < · · · < α_r ≤ β_r
so that supp(v_j) ⊆ [α_j, β_j], 1 ≤ j ≤ r. Here, for x = (x_n)_n ∈ c_0,
supp(x) := {j ∈ N : x_j ≠ 0}.
We shall write (v_1, v_2, ..., v_r) for ∑_{j=1}^r v_j when v_1 < v_2 < · · · < v_r.
For a subset B ⊆ c_0, we consider the following set of conditions which B may or
may not possess:
(a) x ∈ B implies that ‖x‖_∞ ≤ 1; i.e. B is contained in the unit ball of c_0.
(b) {e_n}_{n=1}^∞ ⊆ B.
(c) If x = ∑_{n=1}^∞ x_n e_n ∈ B, y = (y_n)_n ∈ c_0 and |y_n| ≤ |x_n| for all n ≥ 1, then
y ∈ B. (This is a hereditary property.)
(d) If v_1 < v_2 < · · · < v_r lie in B, then (1/2) P_r((v_1, v_2, ..., v_r)) ∈ B.
(e) For every x ∈ B there exists n ∈ N for which 2P_n(x) ∈ B.
Our first goal is to construct a set K which has all five of these properties.
Let L_1 = {r e_j : −1 ≤ r ≤ 1, j ≥ 1} and for n ≥ 1, set
$$L_{n+1} = L_n \cup \Big\{ \tfrac{1}{2} P_r((v_1, v_2, \dots, v_r)) : r \ge 1,\ v_1 < v_2 < \cdots < v_r \in L_n \Big\}.$$
Let K denote the pointwise closure of ∪_{n≥1} L_n. It can be shown that K ⊆ c_0. We
let D = co(K) denote the closed convex hull of K (with the closure taking place in
c_0).
The Tsirel’son space T is then defined as span D. The norm on T is given by
the Minkowski functional which we shall encounter later when studying locally
convex spaces. It is given by ‖x‖_T = inf{r ∈ (0, ∞) : x ∈ rD}, where rD = {ry :
y ∈ D}. As we shall later see, the definition of this norm ensures that D is precisely
the unit ball of T.
Although we shall not prove it here, (T, ‖ · ‖_T) is a Banach space which does not
contain any copy of c_0 or ℓ^p, 1 ≤ p < ∞.

1.35. Example. Another Banach space of interest to those who study the
geometry of said spaces is James’ space.
For a sequence (x_n)_n of real numbers, consider the following condition, which
we shall call condition J: for all k ≥ 1,
$$\sup_{n_1 < n_2 < \cdots < n_k} \Big[ (x_{n_1} - x_{n_2})^2 + (x_{n_2} - x_{n_3})^2 + \cdots + (x_{n_{k-1}} - x_{n_k})^2 \Big] < \infty.$$
The James’ space is defined to be:
J = {(x_n)_n ∈ c_0 : (x_n)_n satisfies condition J}.
The norm on J is defined via:
$$\|(x_n)_n\|_J := \sup_{n_1 < n_2 < \cdots < n_k} \Big[ (x_{n_1} - x_{n_2})^2 + (x_{n_2} - x_{n_3})^2 + \cdots + (x_{n_{k-1}} - x_{n_k})^2 \Big]^{1/2}.$$
It can be shown that J is a Banach space when equipped with this norm.
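
For a finitely supported sequence the supremum defining the James norm is a maximum over finitely many index chains, and it can be computed by dynamic programming. The sketch below rests on two assumptions of ours: the supremum is read as ranging over all finite increasing chains n_1 < n_2 < · · · < n_k (all k ≥ 1), and the sequence is given by its leading entries followed by zeros. The name `james_norm` is illustrative.

```python
import math


def james_norm(x):
    """James norm of the finitely supported sequence (x_1, ..., x_m, 0, 0, ...)."""
    v = list(x) + [0]      # one trailing zero stands in for the whole zero tail
    best = [0.0] * len(v)  # best[j]: largest sum of squares over chains ending at index j
    for j in range(len(v)):
        for i in range(j):
            # Extend the best chain ending at i by the step from i to j.
            best[j] = max(best[j], best[i] + (v[i] - v[j]) ** 2)
    return math.sqrt(max(best))


print(james_norm([1]))        # chain (1, 0): 1.0
print(james_norm([1, 0, 1]))  # chain (1, 0, 1, 0): sqrt(3)
print(james_norm([1, 1, 1]))  # chain (1, 0): 1.0
```

A single appended zero suffices because any two indices in the zero tail contribute a difference of 0, so a chain gains nothing from visiting the tail more than once.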

1.36. Example. Let X be a locally compact topological space and let B denote
the σ–algebra of Borel subsets of X. Let µ be a positive measure on X, so that
µ : B → R ∪ {∞}
satisfies
(a) µ(∅) = 0;
(b) µ(B) ≥ 0 for all B ∈ B;
(c) if {B_n}_n is a sequence of disjoint, measurable subsets from B, then
$$\mu(\cup_n B_n) = \sum_n \mu(B_n).$$
The measure µ is said to be finite if µ(X) < ∞, and it is said to be regular if
(i) µ(K) < ∞ for all compact subsets K ∈ B;
(ii) µ(B) = sup{µ(K) : K ⊆ B, K compact} for all B ∈ B; and
(iii) µ(B) = inf{µ(G) : B ⊆ G, G open} for all B ∈ B.
A complex-valued, Borel measure on X is a function
ν:B→C
satisfying:
(a) ν(∅) = 0, and
(b) if {B_n}_n is a sequence of disjoint, measurable subsets from B, then
$$\nu(\cup_n B_n) = \sum_n \nu(B_n).$$
Let ν be a complex-valued Borel measure on X. For each B ∈ B, a measurable
partition of B is a finite collection {E1 , E2 , ..., Ek } of disjoint, measurable sets
whose union is B. We define the variation |ν| of ν to be the function defined as
follows: for B ∈ B,
$$|\nu|(B) := \sup\Big\{ \sum_{j=1}^{k} |\nu(E_j)| : \{E_j\}_{j=1}^{k} \text{ is a measurable partition of } B \Big\}.$$
It is routine to verify that |ν| is then a finite, positive Borel measure on X. We
say that ν is regular if |ν| is.
It is clear that every complex-linear combination of finite, positive, regular Borel
measures yields a complex-valued, regular Borel measure on X. A standard result
from measure theory known as the Hahn-Jordan Decomposition Theorem
states that the converse holds, namely: every complex-valued, regular Borel measure
can be written as a complex-linear combination of (four) finite, positive, regular
Borel measures.
Let $M_{\mathbb{C}}(X)$ denote the complex vector space of all complex-valued, regular Borel
measures on $X$. Then the map
\[
\|\cdot\| : M_{\mathbb{C}}(X) \to [0, \infty), \qquad \nu \mapsto |\nu|(X)
\]
defines a norm on $M_{\mathbb{C}}(X)$, and $M_{\mathbb{C}}(X)$ is complete with respect to this norm.

1.37. In Theorem 1.25, we showed that if X is a Banach space and M is a closed
subspace, then X/M is complete. Our proof there was based upon Proposition 1.22.
This result also admits a direct proof in terms of Cauchy sequences:

Theorem. Let X be a Banach space and suppose that M is a closed subspace of
X. Then X/M is complete.

Proof.
Let $(q(x_n))_{n=1}^{\infty}$ be a Cauchy sequence in $X/M$. For each $n \ge 1$, there exists
$k_n > 1$ so that $i, j \ge k_n$ implies $\|q(x_i) - q(x_j)\| < 2^{-n}$. Without loss of generality,
we may assume that $k_n > k_{n-1}$ for all $n \ge 2$.
Set $z_n := x_{k_n}$, $n \ge 1$, and let $m_1 = 0$. For $n > 1$, choose $m_n \in M$ so that
\[
\|(z_{n-1} + m_{n-1}) - (z_n + m_n)\| < 2^{-(n-1)}.
\]
That this is possible follows from the definition of the quotient norm along with the
inequality $\|q(x_i) - q(x_j)\| < 2^{-n}$ above. If we now define $y_n := z_n + m_n$, $n \ge 1$, then
$q(y_n) = q(z_n) = q(x_{k_n})$, and for $n_2 > n_1$,
\[
\|y_{n_1} - y_{n_2}\| \le \sum_{j=1}^{n_2 - n_1} \|y_{n_1 + j} - y_{n_1 + j - 1}\|
\le \sum_{j=1}^{n_2 - n_1} \Big(\frac{1}{2}\Big)^{n_1 + j - 1}
\le \sum_{j=0}^{\infty} \Big(\frac{1}{2}\Big)^{n_1 + j}
= \Big(\frac{1}{2}\Big)^{n_1 - 1},
\]
from which it follows that $(y_n)_{n=1}^{\infty}$ is Cauchy in $X$. Since $X$ is complete, $y := \lim_{n \to \infty} y_n \in X$, and since the quotient map $q$ is contractive (hence continuous), $q(y) = \lim_{n \to \infty} q(y_n) = \lim_{n \to \infty} q(x_{k_n})$. Since $(q(x_n))_{n=1}^{\infty}$ is Cauchy and has a subsequence converging to $q(y)$, we have $q(y) = \lim_{n \to \infty} q(x_n)$, which proves
that every Cauchy sequence in $X/M$ converges, i.e. that $X/M$ is complete.
□

1.38. Conversely, in Theorem 1.25 we used Cauchy sequences to prove that if
X is a normed linear space, M is a closed subspace of X and if both X/M and M
are complete, then X is complete as well. We now provide an alternate proof that
uses Proposition 1.22.

Theorem. Suppose that X is a normed linear space and that M is a closed subspace
of X. Suppose furthermore that X/M and M are both complete. Then X is also
complete.
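Proof. Suppose that $\sum_n x_n$ is an absolutely summable series in $X$, so that $\sum_n \|x_n\| < \infty$. It suffices to prove that $\sum_n x_n$ converges in $X$.
Note that the fact that $\sum_n \|x_n\|$ is finite implies that the sequence $\big(\sum_{n=1}^{N} x_n\big)_{N=1}^{\infty}$ of partial sums
is Cauchy. Indeed, let $\varepsilon > 0$ and choose an integer $n_0 > 0$ such that $\sum_{n=n_0}^{\infty} \|x_n\| < \varepsilon$. If
$t \ge s \ge n_0$, then
\[
\Big\| \sum_{n=1}^{t} x_n - \sum_{n=1}^{s} x_n \Big\| = \Big\| \sum_{n=s+1}^{t} x_n \Big\| \le \sum_{n=s+1}^{t} \|x_n\| \le \sum_{n=n_0}^{\infty} \|x_n\| < \varepsilon.
\]
This will be useful later on.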

Let $q : X \to X/M$ denote the canonical quotient map. Then $q$ is linear and
contractive, so
\[
\sum_n \|q(x_n)\| \le \sum_n \|x_n\| < \infty.
\]
Since $X/M$ is complete, this absolutely summable series is summable, and since $q$
is surjective, this implies that there exists $y \in X$ so that $q(y) = \sum_n q(x_n)$. In other
words,
\[
\lim_{N \to \infty} \Big\| q\Big(y - \sum_{n=1}^{N} x_n\Big) \Big\| = 0.
\]
For each $N \ge 1$, let $\delta_N := \big\| q\big(y - \sum_{n=1}^{N} x_n\big) \big\|$. Then $\lim_{N \to \infty} \delta_N = 0$, so we can
choose a subsequence $(N_k)_k$ so that $\delta_{N_k} < \frac{1}{2^{k+1}}$, $k \ge 1$.
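By definition of the quotient norm, and since $\delta_{N_1} < 1/4 < 1/2$, we can choose $m_1 \in M$ so that
\[
\Big\| y - \sum_{n=1}^{N_1} x_n - m_1 \Big\| < \frac{1}{2}.
\]
Next, since $q\big(y - \sum_{n=1}^{N_2} x_n - m_1\big) = q\big(y - \sum_{n=1}^{N_2} x_n\big)$ has norm $\delta_{N_2} < 1/8 < 1/4$, we can choose $m_2 \in M$ so that
\[
\Big\| \Big( y - \sum_{n=1}^{N_2} x_n - m_1 \Big) - m_2 \Big\| < \frac{1}{4}.
\]
More generally, for each $k \ge 1$, we can choose $m_k \in M$ so that
\[
\Big\| \Big( y - \sum_{n=1}^{N_k} x_n - \sum_{n=1}^{k-1} m_n \Big) - m_k \Big\| < \frac{1}{2^k},
\]
because the element in parentheses has quotient norm $\delta_{N_k} < \frac{1}{2^{k+1}} < \frac{1}{2^k}$.
In particular, $\lim_{k \to \infty} \big\| \big( y - \sum_{n=1}^{N_k} x_n \big) - \sum_{n=1}^{k} m_n \big\| = 0$.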
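Observe that for any $k \ge 2$, writing $m_k$ as the difference of two consecutive remainders minus the block $\sum_{n=N_{k-1}+1}^{N_k} x_n$, it follows from the triangle inequality that
\[
\|m_k\| \le \Big\| \Big( y - \sum_{n=1}^{N_k} x_n \Big) - \sum_{n=1}^{k} m_n \Big\| + \Big\| \Big( y - \sum_{n=1}^{N_{k-1}} x_n \Big) - \sum_{n=1}^{k-1} m_n \Big\| + \Big\| \sum_{n=N_{k-1}+1}^{N_k} x_n \Big\|
< \frac{1}{2^k} + \frac{1}{2^{k-1}} + \sum_{n=N_{k-1}+1}^{N_k} \|x_n\|.
\]
Thus
\[
\sum_{k=2}^{\infty} \|m_k\| \le \sum_{k=2}^{\infty} \frac{1}{2^k} + \sum_{k=2}^{\infty} \frac{1}{2^{k-1}} + \sum_{n=1}^{\infty} \|x_n\| < \infty,
\]
and hence $\sum_{k=1}^{\infty} \|m_k\| < \infty$ as well.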
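Since $M$ is assumed to be complete, it follows that $m_0 := \sum_{k=1}^{\infty} m_k \in M$ exists.
Also,
\[
-y + m_0 = \lim_{k \to \infty} \Big[ \Big( y - \sum_{n=1}^{N_k} x_n - \sum_{n=1}^{k} m_n \Big) + (-y + m_0) \Big]
= \lim_{k \to \infty} \Big[ -\sum_{n=1}^{N_k} x_n + \Big( m_0 - \sum_{n=1}^{k} m_n \Big) \Big]
= \lim_{k \to \infty} \Big( -\sum_{n=1}^{N_k} x_n \Big),
\]
so that $\lim_{k \to \infty} \sum_{n=1}^{N_k} x_n = y - m_0 \in X$ exists.
But then the Cauchy sequence $\big( \sum_{n=1}^{N} x_n \big)_{N=1}^{\infty}$ has a subsequence $\big( \sum_{n=1}^{N_k} x_n \big)_{k=1}^{\infty}$
which converges to $y - m_0$, and so the original sequence must also converge to the
same limit. That is,
\[
\sum_{n=1}^{\infty} x_n = \lim_{N \to \infty} \sum_{n=1}^{N} x_n = \lim_{k \to \infty} \sum_{n=1}^{N_k} x_n = y - m_0 \in X.
\]
Since every absolutely summable series in $X$ is summable, $X$ is complete.
□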

I have the body of an eighteen year old. I keep it in the fridge.


Spike Milligan

Exercises for Section 1.

Question 1.
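Let $\emptyset \ne X$ be a compact, Hausdorff space. Prove that for each $\emptyset \ne \Omega \subseteq X$, the
function
\[
\nu_\Omega : C(X, \mathbb{K}) \to \mathbb{R}, \qquad f \mapsto \sup_{x \in \Omega} |f(x)|
\]
defines a seminorm on $C(X, \mathbb{K})$, and that it is a norm if and only if $\Omega$ is dense in $X$.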

Question 2.
Prove that (C([0, 1], K), k · k∞ ) is a Banach space.

Question 3.
Prove that the disc algebra A(D) defined in Example 1.11 is a Banach space.

2. An introduction to operators

Some people are afraid of heights. Not me, I’m afraid of widths.
Steven Wright

2.1. The study of mathematics is the study of mathematical objects and the
relationships between them. These relationships are often measured by functions
from one object to another. Of course, when both objects belong to the same
category (be it the category of vector spaces, groups, rings, etc), it is to be expected
that the most important maps between these objects will be morphisms from that
category. In this Section we shall concern ourselves with bounded linear operators
between normed linear spaces. These bounded linear maps, as we shall soon discover,
coincide with those linear maps which are continuous in the norm topology. Since
normed linear spaces are vector spaces equipped with a norm topology, the bounded
linear operators are the natural morphisms between them.
It should be pointed out that Banach spaces can be quite complicated to analyze.
For this reason, many people working in this area often study the structure and
geometry of these spaces without necessarily emphasizing the study of the linear
maps between them. In the next Section we shall examine the notion of a Hilbert
space. These are amongst the best-behaved Banach spaces, and their structure
is relatively well understood. For this reason, fewer people study Hilbert spaces
alone; Hilbert space theory tends to focus on the theory of the bounded linear maps
between them, as well as algebras of such bounded linear operators.
We would also be remiss if we failed to point out that not everyone on the planet
restricts themselves to bounded (i.e. continuous) linear operators. Differentiation
has the grave misfortune of being an unbounded linear operator, but nevertheless
it is hard to avoid if one wishes to study the world around one - or around one’s
friends, acquaintances, enemies, and every other one. Indeed, in applied mathemat-
ics and physics, it is often the case that the unbounded linear operators are the more
interesting examples. Having said that, we shall leave it to the disciples of those
schools to wax poetic on these topics.

2.2. Definition. Let X and Y be normed linear spaces, and let $T : X \to Y$
be a linear map. We say that $T$ is a bounded operator if there exists a constant
$k \ge 0$ so that $\|Tx\| \le k \|x\|$ for all $x \in X$. When $T$ is bounded, we define
\[
\|T\| = \inf\{ k \ge 0 : \|Tx\| \le k \|x\| \text{ for all } x \in X \}.
\]
We shall refer to $\|T\|$ as the operator norm of $T$.
It is, of course, understood that the norm of T x is computed using the Y-norm,
while the norm of x is computed using the X-norm. As we shall see below, the
operator norm does define a bona fide norm on the vector space of bounded linear
maps from X to Y, thereby justifying our terminology.

Our interest in bounded operators stems from the fact that they are precisely
the continuous operators from X to Y.

2.3. Theorem. Let X and Y be normed linear spaces and $T : X \to Y$ be a
linear map. The following are equivalent:
(a) $T$ is continuous on $X$.
(b) $T$ is continuous at 0.
(c) $T$ is bounded.
(d) $\kappa_1 := \sup\{ \|Tx\| : x \in X, \|x\| \le 1 \} < \infty$.
(e) $\kappa_2 := \sup\{ \|Tx\| : x \in X, \|x\| = 1 \} < \infty$.
(f) $\kappa_3 := \sup\{ \|Tx\| / \|x\| : 0 \ne x \in X \} < \infty$.
Furthermore, if any of these holds, then $\kappa_1 = \kappa_2 = \kappa_3 = \|T\|$.
Proof.
(a) implies (b): This is trivial.
(b) implies (c): Suppose that $T$ is continuous at 0. Let $\varepsilon = 1$ and choose $\delta > 0$
so that $\|x - 0\| < \delta$ implies that $\|Tx - T0\| = \|Tx\| < \varepsilon = 1$. If $\|y\| \le \delta/2$,
then $\|Ty\| \le 1$, and so $0 \ne x \in X$ implies that
\[
\Big\| T\Big( \frac{\delta}{2\|x\|}\, x \Big) \Big\| \le 1,
\]
i.e. $\|Tx\| \le \frac{2}{\delta} \|x\|$. Since $\|T0\| = 0 \le \frac{2}{\delta} \|0\|$, we see that $T$ is bounded.
(c) implies (d): This is trivial.
(d) implies (e): This is trivial.
(e) implies (f): Again, this is trivial.
(f) implies (a): Observe that for any $x \in X$, $\|Tx\| \le \kappa_3 \|x\|$. (For $x \ne 0$, this
follows from the hypothesis, while the linearity of $T$ implies that $T0 = 0$, so
the inequality also holds for $x = 0$.) Thus if $\varepsilon > 0$ and $\|x - y\| < \varepsilon/(\kappa_3 + 1)$,
then
\[
\|Tx - Ty\| = \|T(x - y)\| \le \kappa_3 \|x - y\| < \varepsilon.
\]
The proof of the final statement is left as an exercise for the reader.
□

2.4. Computing the operator norm of a given operator $T$ is not always a simple
task. For example, suppose that $H = (\mathbb{C}^2, \|\cdot\|_2)$ is a two-dimensional Hilbert
space with standard orthonormal basis $\{e_1 = (1, 0), e_2 = (0, 1)\}$. Let $T : H \to H$
be the map whose matrix with respect to this basis is $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, so that $T(x, y) = (x + 2y, 3x + 4y)$. By definition,
\[
\|T\| = \sup\{ \|Tz\| : z \in \mathbb{C}^2, \|z\| \le 1 \}
= \sup\Big\{ \sqrt{|x + 2y|^2 + |3x + 4y|^2} : x, y \in \mathbb{C}, \sqrt{|x|^2 + |y|^2} \le 1 \Big\},
\]
which involves non-linear equations. For Hilbert spaces of low dimension (say,
less than dimension 5) alternate methods exist (but won't be developed just yet).

Instead, we turn our attention to special classes of operator which are simple enough
to allow us to obtain interesting results.
So as to satisfy the curious reader, we mention that the norm of $T$ is $\sqrt{15 + \sqrt{221}}$.
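
The value quoted above can be checked numerically. The following is a minimal sketch, not part of the notes' development: it assumes the standard fact (not yet proven here) that $\|T\|^2$ is the largest eigenvalue of $T^*T$, and the helper names `operator_norm_2x2` and `sampled_lower_bound` are hypothetical.

```python
import math
import random

def operator_norm_2x2(a, b, c, d):
    """Operator norm of the real matrix [[a, b], [c, d]] acting on (C^2, ||.||_2).

    Assumption: ||T||^2 equals the largest eigenvalue of T*T.
    """
    # Entries of the symmetric matrix T*T = [[p, r], [r, s]].
    p = a * a + c * c
    r = a * b + c * d
    s = b * b + d * d
    trace, det = p + s, p * s - r * r
    # Largest root of lambda^2 - trace*lambda + det = 0.
    largest = (trace + math.sqrt(trace * trace - 4.0 * det)) / 2.0
    return math.sqrt(largest)

def sampled_lower_bound(a, b, c, d, samples=20000, seed=0):
    """Crude lower bound for ||T||: largest ||Tz|| over random real unit vectors z."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(samples):
        t = rng.uniform(0.0, 2.0 * math.pi)
        x, y = math.cos(t), math.sin(t)
        best = max(best, math.hypot(a * x + b * y, c * x + d * y))
    return best

norm_T = operator_norm_2x2(1.0, 2.0, 3.0, 4.0)
closed_form = math.sqrt(15.0 + math.sqrt(221.0))
print(norm_T, closed_form, sampled_lower_bound(1.0, 2.0, 3.0, 4.0))
```

The sampled value can only approach the norm from below, which is a useful sanity check on the supremum in the definition.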

2.5. Example. Multiplication operators.

(a) Let $X = \big( C([0, 1], \mathbb{C}), \|\cdot\|_\infty \big)$, and suppose that $f \in X$. Define
\[
M_f : X \to X, \qquad g \mapsto fg.
\]
It is routine to check that $M_f$ is linear. If $\|g\|_\infty \le 1$, then
\[
\|M_f g\|_\infty = \|fg\|_\infty = \sup\{ |f(x) g(x)| : x \in [0, 1] \} \le \|f\|_\infty \|g\|_\infty.
\]
Thus $\|M_f\| \le \|f\|_\infty < \infty$, and $M_f$ is bounded.
Setting $g(x) = 1$, $x \in [0, 1]$, we have $g \in X$, $\|g\|_\infty = 1$ and $\|M_f g\|_\infty = \|f\|_\infty$, so that $\|M_f\| \ge \|f\|_\infty$ and therefore $\|M_f\| = \|f\|_\infty$.
For (hopefully) obvious reasons, $M_f$ is referred to as a multiplication
operator.
(b) We now consider a similar operator acting on a Hilbert space. Let $H = L^2(X, d\mu)$, where $d\mu$ is a positive, regular Borel measure. Suppose that
$f \in L^\infty(X, d\mu)$ and let
\[
M_f : H \to H, \qquad g \mapsto fg.
\]
Once again, it is easy to check that $M_f$ is linear, while for $g \in H$,
\[
\|M_f g\|_2 = \Big( \int_X |f(x) g(x)|^2 \, d\mu \Big)^{1/2}
\le \Big( \|f\|_\infty^2 \int_X |g(x)|^2 \, d\mu \Big)^{1/2}
= \|f\|_\infty \|g\|_2,
\]
so that $\|M_f\| \le \|f\|_\infty$, and hence $M_f$ is bounded. As for a lower bound on
the norm of $M_f$, for each $n \ge 1$, let $F_n = \{ x \in X : |f(x)| \ge \|f\|_\infty - 1/n \}$.
Then $F_n$ is measurable and $\mu(F_n) > 0$ by definition of $\|f\|_\infty$. Let $E_n \subseteq F_n$
be a measurable set for which $0 < \mu(E_n) < \infty$, $n \ge 1$. The existence of
such sets $E_n$, $n \ge 1$, follows from the regularity of the measure $\mu$. Let
$g_n = \chi_{E_n}$, the characteristic function of $E_n$. Then $g_n \in L^2(X, \mu)$ for all
$n \ge 1$ and
\[
\|M_f g_n\|_2 = \Big( \int_X |f(x) g_n(x)|^2 \, d\mu \Big)^{1/2}
= \Big( \int_{E_n} |f(x)|^2 \, d\mu \Big)^{1/2}
\ge \Big( \int_{E_n} (\|f\|_\infty - 1/n)^2 \, d\mu \Big)^{1/2}
= \big( \|f\|_\infty - 1/n \big) \Big( \int_X |g_n(x)|^2 \, d\mu \Big)^{1/2}
= \big( \|f\|_\infty - 1/n \big) \|g_n\|_2.
\]
From this we see that $\|M_f\| \ge \|f\|_\infty - 1/n$. Since $n \ge 1$ was arbitrary,
$\|M_f\| \ge \|f\|_\infty$, and so $\|M_f\| = \|f\|_\infty$.
Observe that the computation of the norm of the operator depended
very much upon the underlying norms of the spaces involved.
(c) As a special case of this phenomenon, let $X = \mathbb{N}$ and suppose that $d\mu$ is
counting measure. Then $H = \ell^2(\mathbb{N})$ and $f \in \ell^\infty(\mathbb{N})$. As we are wont to
do when dealing with sequences, we denote by $d_n$ the value $f(n)$ of $f$ at
$n \in \mathbb{N}$, so that $f \equiv (d_n)_{n=1}^{\infty} \in \ell^\infty(\mathbb{N})$. It follows that $M_f (x_n)_n = (d_n x_n)_n$
for all $(x_n)_n \in \ell^2(\mathbb{N})$. By considering the matrix $[M_f]$ of $M_f$ with respect
to the standard orthonormal basis $(e_n)_n$ for $H$, we see that
\[
[M_f] = \begin{bmatrix} d_1 & & & \\ & d_2 & & \\ & & d_3 & \\ & & & \ddots \end{bmatrix}.
\]
Thus, $M_f$, often denoted in this circumstance as $D = \mathrm{diag}\{d_n\}_n$, is referred
to as a diagonal operator. The above calculation shows that
\[
\|D\| = \|M_f\| = \|f\|_\infty = \sup\{ |f(n)| : n \ge 1 \} = \sup\{ |d_n| : n \ge 1 \}.
\]

2.6. Example. Weighted shifts. With $H = \ell^2(\mathbb{N})$ and $(w_n)_n \in \ell^\infty(\mathbb{N})$,
consider the map $W : H \to H$ defined by
\[
W (x_n)_n = (0, w_1 x_1, w_2 x_2, w_3 x_3, \ldots) \quad \text{for all } (x_n)_n \in \ell^2(\mathbb{N}).
\]
We leave it as an exercise for the reader to show that $W$ is a bounded linear operator
on $H$, and that
\[
\|W\| = \sup\{ |w_n| : n \ge 1 \}.
\]
Such an operator is referred to as a unilateral forward weighted shift.
If $(v_n)_n \in \ell^\infty(\mathbb{N})$ and we define the linear map $V : H \to H$ via
\[
V (x_n)_n = (v_1 x_2, v_2 x_3, v_3 x_4, \ldots) \quad \text{for all } (x_n)_n \in H,
\]
then once again $V$ is bounded, $\|V\| = \sup\{ |v_n| : n \ge 1 \}$, and $V$ is referred to as a
unilateral backward weighted shift.
Finally, consider $H = \ell^2(\mathbb{Z})$, and with $(u_n)_n \in \ell^\infty(\mathbb{Z})$, define the linear map
$U : H \to H$ via
\[
U (x_n)_n = (u_{n-1} x_{n-1})_n.
\]
Again, $U$ is bounded with $\|U\| = \sup\{ |u_n| : n \in \mathbb{Z} \}$, and $U$ is referred to as a
bilateral weighted shift. The reader should ask himself/herself why we do not
refer to “forward” and “backward” bilateral shift operators.
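
The norm formulas for the unilateral shifts can be explored on finitely supported sequences. The sketch below is only an illustration under stated assumptions: the weight sequence $w_n = 1 - 1/(n+1)$ and all function names are hypothetical, and a truncation to 50 terms stands in for $\ell^2(\mathbb{N})$.

```python
import math
import random

def forward_shift(w, x):
    """W(x_1, x_2, ...) = (0, w_1 x_1, w_2 x_2, ...), for a finitely supported x."""
    return [0.0] + [wn * xn for wn, xn in zip(w, x)]

def backward_shift(v, x):
    """V(x_1, x_2, ...) = (v_1 x_2, v_2 x_3, ...), for a finitely supported x."""
    return [vn * xn for vn, xn in zip(v, x[1:])]

def norm2(x):
    """The l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(t) ** 2 for t in x))

def basis_vector(n, length):
    """The standard basis vector e_n (1-indexed), truncated to the given length."""
    return [1.0 if j == n else 0.0 for j in range(1, length + 1)]

# Hypothetical bounded weights w_n = 1 - 1/(n+1), truncated to 50 terms.
weights = [1.0 - 1.0 / (n + 1) for n in range(1, 51)]
sup_w = max(abs(wn) for wn in weights)

# ||W e_n|| = |w_n|, so these values climb towards sup |w_n|.
attained = [norm2(forward_shift(weights, basis_vector(n, 50))) for n in range(1, 51)]

# On an arbitrary vector, ||W x|| does not exceed sup |w_n| * ||x||.
rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(50)]
ratio = norm2(forward_shift(weights, x)) / norm2(x)
print(sup_w, max(attained), ratio)
```

Testing on basis vectors gives the lower bound $\|W\| \ge |w_n|$ for each $n$, while the coordinatewise estimate gives the upper bound; together they are the two halves of the exercise.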

2.7. Example. Differentiation operators. Consider the linear manifold
$P(\mathbb{D}) = \{ p : p \text{ a polynomial} \} \subseteq (C(\mathbb{D}), \|\cdot\|_\infty)$. Define the map
\[
D : P(\mathbb{D}) \to P(\mathbb{D}), \qquad p \mapsto p',
\]
the derivative of $p$. Then if $p_n(z) = z^n$, $\|p_n\|_\infty = 1$ for each $n \ge 1$ and $D p_n = n p_{n-1}$, whence $\|D\| \ge n$ for all $n \ge 1$. In particular, $D$ is not bounded. That is,
differentiation is not continuous on the linear space of polynomials.
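
The growth of the ratios $\|D p_n\|_\infty / \|p_n\|_\infty$ can be observed numerically. This is a rough sketch under assumptions: the sup norm over the disc is approximated by sampling the unit circle (justified for polynomials by the maximum modulus principle), and the helper names are hypothetical.

```python
import cmath
import math

def sup_norm_on_circle(coeffs, samples=720):
    """Approximate the sup norm over the unit disc of sum_k coeffs[k] * z**k
    by sampling the unit circle."""
    best = 0.0
    for j in range(samples):
        z = cmath.exp(2j * math.pi * j / samples)
        value = sum(c * z ** k for k, c in enumerate(coeffs))
        best = max(best, abs(value))
    return best

def derivative(coeffs):
    """Coefficients of p' given the coefficients of p."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def monomial(n):
    """Coefficients of p_n(z) = z**n."""
    return [0.0] * n + [1.0]

# Ratios ||D p_n|| / ||p_n|| for n = 1, ..., 7; they grow like n.
ratios = [
    sup_norm_on_circle(derivative(monomial(n))) / sup_norm_on_circle(monomial(n))
    for n in range(1, 8)
]
print(ratios)
```

No single constant can dominate all of these ratios, which is exactly the failure of boundedness described above.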

2.8. Notation. The set of bounded linear operators from the normed linear
space X to the normed linear space Y is denoted by $B(X, Y)$. If X = Y, we abbreviate
this to $B(X)$. We now fulfil an earlier promise by proving that the map $T \mapsto \|T\|$
does indeed define a norm on $B(X, Y)$.

2.9. Proposition. Let X and Y be normed linear spaces. Then $B(X, Y)$ is a
vector space and the operator norm is a norm on $B(X, Y)$.
Proof. Since linear combinations of continuous functions between topological spaces
are continuous, $B(X, Y)$ is a vector space.
As to the second assertion: for $R, T \in B(X, Y)$ and $k \in \mathbb{K}$,
(i) $\|T\| = \sup\{ \|Tx\| : \|x\| \le 1 \} \ge 0$;
(ii) $\|T\| = 0$ if and only if $\|Tx\| / \|x\| = 0$ for all $x \ne 0$, which in turn happens
if and only if $Tx = 0$ for all $x \in X$; i.e. if and only if $T = 0$.
(iii)
\[
\|kT\| = \sup\{ \|kTx\| : \|x\| \le 1 \} = \sup\{ |k| \, \|Tx\| : \|x\| \le 1 \} = |k| \sup\{ \|Tx\| : \|x\| \le 1 \} = |k| \, \|T\|.
\]
(iv)
\[
\|R + T\| = \sup\{ \|(R + T)x\| : \|x\| \le 1 \}
\le \sup\{ \|Rx\| + \|Tx\| : \|x\| \le 1 \}
\le \sup\{ \|Rx\| + \|Ty\| : \|x\|, \|y\| \le 1 \}
= \|R\| + \|T\|.
\]
This completes the proof.
□

2.10. Theorem. Let X and Y be normed linear spaces and suppose that Y is
complete. Then $B(X, Y)$ is complete, and as such it is a Banach space.
Proof. Suppose that $\sum_{n=1}^{\infty} T_n$ is an absolutely summable series in $B(X, Y)$. Given
$x \in X$,
\[
\sum_{n=1}^{\infty} \|T_n x\| \le \sum_{n=1}^{\infty} \|T_n\| \, \|x\| = \|x\| \sum_{n=1}^{\infty} \|T_n\| < \infty,
\]
and thus, since Y is complete, $\sum_{n=1}^{\infty} T_n x \in Y$ exists. Moreover, the linearity of each
$T_n$ implies that the map $T : X \to Y$ defined via $Tx = \sum_{n=1}^{\infty} T_n x$ is linear, while
$\|x\| \le 1$ implies from above that $\|Tx\| \le \sum_{n=1}^{\infty} \|T_n\|$. Hence
\[
\|T\| \le \sum_{n=1}^{\infty} \|T_n\| < \infty,
\]
implying that $T$ is bounded.
Finally,
\[
\Big\| Tx - \sum_{n=1}^{N} T_n x \Big\| = \Big\| \sum_{n=N+1}^{\infty} T_n x \Big\| \le \|x\| \sum_{n=N+1}^{\infty} \|T_n\|,
\]
from which it easily follows that $T = \lim_{N \to \infty} \sum_{n=1}^{N} T_n$. That is, the series $\sum_{n=1}^{\infty} T_n$
is summable. By Proposition 1.22, $B(X, Y)$ is complete.
□
As a particular case of Theorem 2.10, consider the case where Y = K, the base
field.
2.11. Definition. Let X be a normed linear space. The dual of X is $B(X, \mathbb{K})$,
and it is denoted by $X^*$. The elements of $X^*$ are referred to as continuous linear
functionals or, when no confusion is possible, as functionals on X.
Since $\mathbb{K}$ is complete, Theorem 2.10 implies that $X^*$ is again a Banach space.
As such, we may consider the dual space of $X^*$, namely $X^{(2)} = X^{**} := (X^*)^*$,
known as the double dual of X, and more generally, the $n$th-iterated dual spaces
$X^{(n)} = (X^{(n-1)})^*$, $n \ge 3$. All of these are Banach spaces.
Before proceeding to some examples, let us first introduce some notation and
terminology which will prove useful.
2.12. Definition. A collection $\{e_n\}_{n=1}^{\infty}$ in a Banach space X is said to be
a Schauder basis if every $x \in X$ can be written in a unique way as a norm
convergent series
\[
x = \sum_{n=1}^{\infty} x_n e_n
\]
for some choice $x_n \in \mathbb{K}$, $n \ge 1$.

2.13. Example.
(a) For each $n \ge 1$, let $e_n$ denote the sequence $e_n = (0, 0, \ldots, 0, 1, 0, 0, \ldots) \in \mathbb{K}^{\mathbb{N}}$,
where the unique “1” occurs in the $n$th position. Then $\{e_n\}_n$ is a Schauder
basis for $c_0$ and for $\ell^p$, $1 \le p < \infty$. We shall refer to $\{e_n\}_n$ as the standard
Schauder basis for $c_0$ and for $\ell^p$.
Observe that it is not a Schauder basis for $\ell^\infty$. Indeed, $\ell^\infty$ does not
admit any Schauder basis (why not?).
(b) It is not as obvious what one should choose as the Schauder basis for
$C([0, 1], \mathbb{R})$. It was Schauder [Sch27] who first discovered a basis for this
space. The description of such a basis is non-trivial.

2.14. Example. Consider the Banach space
\[
c_0 = \{ (x_n)_{n=1}^{\infty} \in \mathbb{K}^{\mathbb{N}} : \lim_{n \to \infty} x_n = 0 \},
\]
equipped with the supremum norm $\|(x_n)_n\|_\infty = \sup\{ |x_n| : n \ge 1 \}$. We claim that
$c_0^*$ is isometrically isomorphic to $\ell^1 = \ell^1(\mathbb{N})$. To see this, consider the map
\[
\Theta : (\ell^1, \|\cdot\|_1) \to (c_0, \|\cdot\|_\infty)^*, \qquad z := (z_n)_n \mapsto \varphi_z,
\]
where $\varphi_z((x_n)_n) = \sum_{n=1}^{\infty} x_n z_n$ for all $(x_n)_n \in c_0$. That $\Theta$ is linear is readily seen.
That the sum converges absolutely is also easy to verify.
If $\|(x_n)_n\|_\infty \le 1$, then $|x_n| \le 1$ for all $n \ge 1$, so that
\[
|\varphi_z((x_n)_n)| = \Big| \sum_{n=1}^{\infty} x_n z_n \Big| \le \sum_{n=1}^{\infty} |x_n z_n| \le \sum_{n=1}^{\infty} |z_n| = \|z\|_1.
\]
Hence $\|\varphi_z\| \le \|z\|_1$. On the other hand, if we set $v[n] = (w_1, w_2, w_3, \ldots, w_n, 0, 0, 0, \ldots)$
for each $n \ge 1$ (where $w_j = \overline{z_j}/|z_j|$ if $z_j \ne 0$, while $w_j = 1$ if $z_j = 0$), then $v[n] \in c_0$,
$\|v[n]\|_\infty = 1$ for all $n \ge 1$, and
\[
\varphi_z(v[n]) = \sum_{j=1}^{n} |z_j|.
\]
From this it follows that $\|\varphi_z\| \ge \|z\|_1$. Combining these two estimates yields $\|\varphi_z\| = \|z\|_1$.
Thus $\Theta$ is an isometric injection of $\ell^1$ into $c_0^*$. There remains to prove that $\Theta$ is
surjective.
To that end, suppose that $\varphi \in c_0^*$. Let $\{e_n\}_n$ denote the standard Schauder
basis for $c_0$, and for each $n \ge 1$, let $w_n = \varphi(e_n)$. Observe that if $\beta_n := \overline{w_n}/|w_n|$ for
$w_n \ne 0$, and $\beta_n := 0$ if $w_n = 0$, then
\[
v[n] := \sum_{k=1}^{n} \beta_k e_k \in c_0
\]
and $\|v[n]\|_\infty \le 1$. Since, for all $n \ge 1$,
\[
\sum_{k=1}^{n} |w_k| = \sum_{k=1}^{n} \beta_k w_k = \sum_{k=1}^{n} \varphi(\beta_k e_k) = |\varphi(v[n])| \le \|\varphi\| \, \|v[n]\|_\infty \le \|\varphi\|,
\]
we see that $w := (w_1, w_2, w_3, \ldots) \in \ell^1$. A routine computation shows that $\varphi_w|_{c_{00}} = \varphi|_{c_{00}}$. Since $\varphi, \varphi_w$ are both continuous and since $c_{00}$ is dense in $c_0$, $\varphi = \varphi_w = \Theta(w)$.
Thus $\Theta$ is onto.
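
The norming vectors $v[n]$ used in the lower bound can be tried out on a concrete sequence. The sketch below is illustrative only: the element $z_j = (1+i)/2^j$ of $\ell^1$ (truncated to 30 terms) and the function names are hypothetical, and the complex conjugate is taken in $w_j$ so that the case $\mathbb{K} = \mathbb{C}$ works.

```python
def phi(z, x):
    """phi_z(x) = sum_n x_n z_n for a finitely supported x."""
    return sum(xn * zn for xn, zn in zip(x, z))

def norming_vector(z, n):
    """v[n] = (w_1, ..., w_n) with w_j = conj(z_j)/|z_j|, or 1 where z_j = 0."""
    return [zj.conjugate() / abs(zj) if zj != 0 else 1.0 for zj in z[:n]]

# Hypothetical element of l^1 (truncated to 30 terms): z_j = (1 + i) / 2^j.
z = [complex(1, 1) / 2 ** j for j in range(1, 31)]
one_norm = sum(abs(zj) for zj in z)

# |phi_z(v[n])| equals the n-th partial sum of |z_j| and increases to ||z||_1.
values = [abs(phi(z, norming_vector(z, n))) for n in (1, 5, 10, 30)]
# Each v[n] has supremum norm 1.
sup_norms = [max(abs(wj) for wj in norming_vector(z, n)) for n in (1, 5, 10, 30)]
print(values, one_norm, sup_norms)
```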

2.15. Example. Let $1 \le p < \infty$. Recall from your Real Analysis courses that
there exists an isometric linear bijection $\Theta : (\ell^q, \|\cdot\|_q) \to (\ell^p, \|\cdot\|_p)^*$, where (as
always) $q$ is the Lebesgue conjugate of $p$ satisfying $1/p + 1/q = 1$.
The map is defined via:
\[
\Theta : \ell^q \to (\ell^p)^*, \qquad z \mapsto \varphi_z,
\]
where for $z = (z_n)_n \in \ell^q$, we have $\varphi_z((x_n)_n) = \sum_n x_n z_n$.
We normally abbreviate this result by saying that the dual of $\ell^p$ is $\ell^q$, when
$1 \le p < \infty$. We refer the reader to the Appendix to Section 2 for a proof of this
result.

2.16. Example. The above example can be extended to more general measure
spaces. Let $1 \le p < \infty$, and suppose that $\mu$ is a $\sigma$-finite, positive, regular Borel
measure on $X$. Again, the map
\[
\Theta : L^q(X, \mu) \to L^p(X, \mu)^*, \qquad g \mapsto \varphi_g,
\]
where $\varphi_g(f) = \int_X f g \, d\mu$, defines a linear, isometric bijection between $L^q(X, \mu)$ and
$L^p(X, \mu)^*$.
If we drop the hypothesis that $\mu$ is $\sigma$-finite, the result still holds for $1 < p < \infty$.
For reasons we shall discuss in the next Section, when $p = 2$, we often consider
the related map
\[
\Omega : L^2(X, \mu) \to L^2(X, \mu)^*, \qquad g \mapsto \varphi_g,
\]
where $\varphi_g(f) = \int_X f \overline{g} \, d\mu$, which defines a conjugate-linear, isometric bijection between
$L^2(X, \mu)$ and $L^2(X, \mu)^*$.

2.17. Example. A function $f : [0, 1] \to \mathbb{K}$ is said to be of bounded variation
if there exists $\kappa > 0$ such that for every partition $\{0 = t_0 < t_1 < t_2 < \cdots < t_n = 1\}$
of $[0, 1]$,
\[
\sum_{i=1}^{n} |f(t_i) - f(t_{i-1})| \le \kappa.
\]
The infimum of all such $\kappa$'s for which the above inequality holds is denoted by
$\|f\|_v$, and is called the variation of $f$.
Recall from your earlier courses in Analysis that if $f$ is a function of bounded
variation, then for all $x \in (0, 1]$, $f(x^-) := \lim_{t \to x^-} f(t)$ exists, and for all $x \in [0, 1)$,
$f(x^+) := \lim_{t \to x^+} f(t)$ exists (though they might not be equal, of course). We set
$f(0^-) = f(0)$ and $f(1^+) = f(1)$. It is known that a function of bounded variation
admits at most a countable number of discontinuities in the interval $[0, 1]$, and for
$g \in C([0, 1], \mathbb{K})$, the Riemann-Stieltjes integral
\[
\int_0^1 g \, df
\]
exists.
Let
\[
BV[0, 1] = \{ f : [0, 1] \to \mathbb{K} : \|f\|_v < \infty, \ f \text{ is left-continuous on } (0, 1) \text{ and } f(0) = 0 \}.
\]
Then $(BV[0, 1], \|\cdot\|_v)$ is a Banach space with norm given by the variation.
Indeed, the dual of $\big( C([0, 1]), \|\cdot\|_\infty \big)$ is isometrically isomorphic to $BV[0, 1]$. For
$f \in BV[0, 1]$ and $g \in C([0, 1])$, we define a functional $\varphi_f \in C([0, 1], \mathbb{K})^*$ by
\[
\varphi_f(g) := \int_0^1 g \, df.
\]

2.18. Proposition. Let X be a normed linear space. Then there exists a contractive linear map $J : X \to X^{**}$.
Proof. Let $z \in X$ and define a map $\hat{z} : X^* \to \mathbb{K}$ via $\hat{z}(x^*) = x^*(z)$. It is routine to
check that $\hat{z}$ is linear, and if $\|x^*\| \le 1$, then $|\hat{z}(x^*)| = |x^*(z)| \le \|x^*\| \, \|z\| \le \|z\|$, so that
$\|\hat{z}\| \le \|z\|$; in particular, $\hat{z} \in X^{**}$.
It is also easy to verify that the map
\[
J : X \to X^{**}, \qquad z \mapsto \hat{z}
\]
is linear, and the first paragraph shows that $J$ is contractive.
□

2.19. The map $J$ is referred to as the canonical embedding of X into $X^{**}$. It
is not necessarily the only embedding of interest, however. Once we have proven the
Hahn-Banach Theorem, we shall be in a position to show that $J$ is in fact isometric.
We point out that if $J$ is an isometric bijection from X onto $X^{**}$, then X is said
to be reflexive. These are in some sense amongst the best behaved Banach spaces.
We shall return to the notion of reflexivity of Banach spaces in a later section.

Appendix to Section 2.

2.20. Although norms of operators can be difficult to compute, there are cases
where useful estimates can be obtained.
Consider the Volterra operator
\[
V : C([0, 1], \mathbb{K}) \to C([0, 1], \mathbb{K}), \qquad f \mapsto Vf,
\]
where $Vf(x) = \int_0^x f(t) \, dt$ for all $x \in [0, 1]$. (Since all functions are continuous, it
suffices to consider Riemann integration.)
Then
\[
\|Vf\| = \sup\{ |Vf(x)| : x \in [0, 1] \}
= \sup\Big\{ \Big| \int_0^x f(t) \, dt \Big| : x \in [0, 1] \Big\}
\le \sup\Big\{ \int_0^x \|f\|_\infty \, dt : x \in [0, 1] \Big\}
= \sup\{ (x - 0) \|f\|_\infty : x \in [0, 1] \}
= \|f\|_\infty.
\]
Thus $\|V\| \le 1$. If $\mathbf{1}(x) := 1$ for all $x \in [0, 1]$, then $\mathbf{1} \in C([0, 1], \mathbb{K})$, $\|\mathbf{1}\|_\infty = 1$,
and $V\mathbf{1} = j$, where $j(x) = x$, $x \in [0, 1]$. But then $\|V\mathbf{1}\|_\infty = \|j\|_\infty = 1$, showing
that $\|V\| \ge 1$, and hence $\|V\| = 1$.
Far more interesting (and useful) is the computation of $\|V^n\|$ for $n \ge 2$.
Let us first generalize the construction of the operator $V$. We may consider the
function $k : [0, 1] \times [0, 1] \to \mathbb{C}$ defined by
\[
k(x, y) = \begin{cases} 0 & \text{if } x < y, \\ 1 & \text{if } x \ge y. \end{cases}
\]
Then
\[
(Vf)(x) = \int_0^x f(y) \, dy = \int_0^1 k(x, y) f(y) \, dy.
\]
The function $k = k(x, y)$ is referred to as the kernel of the integral operator $V$.
This should not be confused with the notion of a null space of a linear map, also
referred to as its kernel.

Now
\[
(V^2 f)(x) = (V(Vf))(x)
= \int_0^1 k(x, t) \, (Vf)(t) \, dt
= \int_0^1 k(x, t) \int_0^1 k(t, y) f(y) \, dy \, dt
= \int_0^1 f(y) \int_0^1 k(x, t) k(t, y) \, dt \, dy
= \int_0^1 f(y) \, k_2(x, y) \, dy,
\]
where $k_2(x, y) = \int_0^1 k(x, t) k(t, y) \, dt$ is a new kernel for the integral operator $V^2$.
Note that
\[
|k_2(x, y)| = \Big| \int_0^1 k(x, t) k(t, y) \, dt \Big| = \Big| \int_y^x k(x, t) k(t, y) \, dt \Big| = (x - y) \quad \text{for } x > y,
\]
while for $x < y$, $k_2(x, y) = 0$.
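In general, since $x - y \le 1 - 0 = 1$, we get
\[
(V^n f)(x) = \int_0^1 f(y) \, k_n(x, y) \, dy, \quad \text{where} \quad
k_n(x, y) = \int_0^1 k(x, t) \, k_{n-1}(t, y) \, dt,
\]
and where, for $x \ge y$,
\[
|k_n(x, y)| \le \frac{1}{(n-1)!} (x - y)^{n-1} \le \frac{1}{(n-1)!},
\]
while $k_n(x, y) = 0$ for $x < y$.
It follows that
\[
\|V^n\| = \sup_{\|f\| = 1} \|V^n f\|_\infty
= \sup_{\|f\| = 1} \Big\| \int_0^1 f(y) \, k_n(x, y) \, dy \Big\|_\infty
\le \sup_{\|f\| = 1} \|f\|_\infty \, \|k_n(x, y)\|_\infty
\le \frac{1}{(n-1)!}.
\]
A simple consequence of these computations is that
\[
0 \le \lim_{n \to \infty} \|V^n\|^{1/n} \le \lim_{n \to \infty} \Big( \frac{1}{(n-1)!} \Big)^{1/n} = 0.
\]
We shall have more to say about this in the Appendix to Section 3.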
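
The decay of $\|V^n\|$ can be illustrated by applying $V$ repeatedly to the constant function $\mathbf{1}$, which gives $V^n \mathbf{1}(x) = x^n / n!$ and hence the lower bound $\|V^n\| \ge 1/n!$. The following sketch uses exact rational arithmetic on polynomial coefficients; the function names are hypothetical and the code is only an illustration of the estimates, not a computation of $\|V^n\|$ itself.

```python
from fractions import Fraction
import math

def volterra(coeffs):
    """Apply V to the polynomial sum_k coeffs[k] * x**k, i.e. integrate from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def sup_norm_nonneg(coeffs):
    """Sup norm on [0, 1] of a polynomial with non-negative coefficients (attained at x = 1)."""
    return sum(coeffs)

def norm_of_Vn_applied_to_one(n):
    """||V^n 1||_infinity, computed exactly."""
    p = [Fraction(1)]  # the constant function 1
    for _ in range(n):
        p = volterra(p)
    return sup_norm_nonneg(p)

# Lower bounds ||V^n 1|| and the upper bounds 1/(n-1)! from the text, n = 1, ..., 8.
lower = [norm_of_Vn_applied_to_one(n) for n in range(1, 9)]
upper = [Fraction(1, math.factorial(n - 1)) for n in range(1, 9)]
# n-th roots of the upper bounds, which tend to 0 (slowly) as n grows.
roots = [float(u) ** (1.0 / n) for n, u in enumerate(upper, start=1)]
print(lower, upper, roots)
```

Both the lower bounds $1/n!$ and the upper bounds $1/(n-1)!$ have $n$th roots tending to zero, consistent with the limit above.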

2.21. Example. Let $\{e_1, e_2, \ldots, e_n\}$ denote the standard basis for $\mathbb{K}^n$, so that
$e_j = (0, 0, \ldots, 0, 1, 0, \ldots, 0)$, where the 1 occurs in the $j$th position, $1 \le j \le n$. Suppose
that $1 \le p \le \infty$, and that $\mathbb{K}^n$ carries the $p$-norm from Example 1.7.
Let $[t_{ij}] \in M_n(\mathbb{K})$, and define the map
\[
T : \mathbb{K}^n \to \mathbb{K}^n, \qquad x \mapsto [t_{ij}] x.
\]
It is instructive, while not difficult, to prove that if every row and every column
of $[t_{ij}]$ has at most one non-zero entry, then
\[
\|T\| = \max_{1 \le i, j \le n} |t_{ij}|.
\]
We leave this as an exercise for the reader.

2.22. Example. Let $n \ge 1$ be an integer and consider $M_n = M_n(\mathbb{C})$. For
$T \in M_n$, $T^* T$ is a hermitian matrix with non-negative eigenvalues. Denote by
$s_1, s_2, \ldots, s_n$ the square roots of these eigenvalues (counted according to multiplicity)
and for $1 \le p < \infty$, set
\[
\|T\|_p = \Big( \sum_{k=1}^{n} s_k^p \Big)^{1/p}.
\]
For $p = \infty$, set
\[
\|T\|_\infty = \max\{ s_1, s_2, \ldots, s_n \}.
\]
The numbers $s_1, s_2, \ldots, s_n$ are known as the singular values of the matrix $T$.
It can be shown that $\|\cdot\|_p$ is indeed a norm on $M_n$ for all $1 \le p \le \infty$. Let us
denote the space $M_n$ equipped with the norm $\|\cdot\|_p$ by $C_p^n$. We shall refer to it as
the $n$-dimensional Schatten $p$-class of operators on $\mathbb{C}^n$. Then we shall leave it as an
exercise for the reader to prove that $(C_p^n)^* \simeq C_q^n$, where $q$ is the Lebesgue conjugate
of $p$, i.e., $\frac{1}{p} + \frac{1}{q} = 1$.
The above identification can be realized via the map:
\[
\Phi : C_q^n \to (C_p^n)^*, \qquad R \mapsto \varphi_R,
\]
where $\varphi_R : C_p^n \to \mathbb{C}$ is the map $\varphi_R(T) = \mathrm{tr}(RT)$, and where $\mathrm{tr}[x_{ij}] = \sum_{k=1}^{n} x_{kk}$
denotes the standard trace functional on $M_n$.
The above result has a generalisation to infinite-dimensional Hilbert spaces. We
refer the reader to [Dav88] for a more detailed treatment of this topic.

2.23. Example. We now return to the proof of the fact that the dual of $\ell^p$ is
$\ell^q$ when $1 < p < \infty$, as stated in Example 2.15.

Given $z = (z_n)_n \in \ell^q$, we define
\[
\beta_z : \ell^p \to \mathbb{K}, \qquad (x_n)_n \mapsto \sum_n x_n z_n.
\]
Note that by Hölder's Inequality, Theorem 1.31, for $x = (x_n)_n \in \ell^p$, we have
\[
|\beta_z(x)| = \Big| \sum_n x_n z_n \Big| \le \sum_n |x_n z_n| = \|xz\|_1 \le \|x\|_p \|z\|_q,
\]
so that indeed $\beta_z(x) \in \mathbb{K}$. Clearly $\beta_z$ is linear, and so the above argument also
shows that $\|\beta_z\| \le \|z\|_q$.
Furthermore, if we set $x_n = \alpha_n |z_n|^{q-1} = \alpha_n |z_n|^{q/p}$, where $\alpha_n \in \mathbb{K}$ is chosen so that
$|\alpha_n| = 1$ and $x_n z_n \ge 0$ for all $n \ge 1$, then
\[
\Big( \sum_n |x_n|^p \Big)^{1/p} = \Big( \sum_n \big( |z_n|^{q/p} \big)^p \Big)^{1/p} = \Big( \sum_n |z_n|^q \Big)^{1/p} = \|z\|_q^{q/p},
\]
so that $x \in \ell^p$, and
\[
|\beta_z(x)| = \Big| \sum_n x_n z_n \Big| = \sum_n |z_n|^q = \|z\|_q^{q/p} \, \|z\|_q = \|x\|_p \|z\|_q,
\]
so that $\|\beta_z\| \ge \|z\|_q$, and therefore $\|\beta_z\| = \|z\|_q$.
Consider the map $\Theta$ defined via:
\[
\Theta : \ell^q \to (\ell^p)^*, \qquad z \mapsto \beta_z,
\]
with $\beta_z$ defined as above. Then $\Theta$ is easily seen to be linear, and from above, it is
isometric (hence injective). There remains only to show that $\Theta$ is surjective.
To that end, let $\varphi \in (\ell^p)^*$. For each $n \ge 1$, let $z_n := \varphi(e_n)$, where $\{e_n\}_n$ is the
standard Schauder basis for $\ell^p$. Set $z[n] := \sum_{k=1}^{n} z_k e_k$, and $x[n] := \sum_{k=1}^{n} x_k e_k$ with $x_k := \alpha_k |z_k|^{q-1}$,
where (as before) $\alpha_k$ is chosen so that $|\alpha_k| = 1$ and $x_k z_k \ge 0$ for all $k \ge 1$. Then
$z[n] \in \ell^q$ and $x[n] \in \ell^p$ for all $n \ge 1$.

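Observe that if $y = (y_n)_n \in \ell^p$, then by the continuity of $\varphi$,
\[
\varphi(y) = \varphi\Big( \sum_n y_n e_n \Big) = \sum_n y_n \varphi(e_n) = \sum_n y_n z_n.
\]
Now
\[
|\varphi(x[n])| = \Big| \sum_{k=1}^{n} \alpha_k |z_k|^{q-1} z_k \Big| = \sum_{k=1}^{n} |z_k|^q = \|z[n]\|_q^{q-1} \, \|z[n]\|_q,
\]
where
\[
\|z[n]\|_q^{q-1} = \Big( \sum_{k=1}^{n} |z_k|^q \Big)^{\frac{q-1}{q}}
= \Big( \sum_{k=1}^{n} \big( |z_k|^{q/p} \big)^p \Big)^{1 - \frac{1}{q}}
= \Big( \sum_{k=1}^{n} |x_k|^p \Big)^{1/p}
= \|x[n]\|_p.
\]
Thus
\[
|\varphi(x[n])| = \|x[n]\|_p \, \|z[n]\|_q \le \|x[n]\|_p \, \|\varphi\| \quad \text{for all } n \ge 1.
\]
It follows that $\|z[n]\|_q \le \|\varphi\|$ for all $n \ge 1$, so that if $z := (z_n)_n$, then $z \in \ell^q$ with
$\|z\|_q \le \|\varphi\|$.
Finally, $\varphi(y) = \beta_z(y)$ for all $y \in \ell^p$, so that $\varphi = \beta_z = \Theta(z)$, proving that $\Theta$ is
surjective, as required.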
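
The extremal vector used above to show $\|\beta_z\| \ge \|z\|_q$ can be tested on a finite example. This is a minimal sketch under assumptions: the choice $p = 3$, the particular vector `z`, and the helper names are hypothetical, and $\alpha_n$ is taken to be $\overline{z_n}/|z_n|$ so that $x_n z_n \ge 0$.

```python
def p_norm(v, p):
    """The l^p norm of a finitely supported sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def extremal_vector(z, q):
    """x_n = alpha_n |z_n|^(q-1), alpha_n = conj(z_n)/|z_n| (and x_n = 0 where z_n = 0)."""
    return [
        (zn.conjugate() / abs(zn)) * abs(zn) ** (q - 1) if zn != 0 else 0.0
        for zn in z
    ]

p = 3.0
q = p / (p - 1.0)  # Lebesgue conjugate of p, so that 1/p + 1/q = 1

# Hypothetical finitely supported element of l^q.
z = [complex(1, -2), complex(0, 0.5), complex(-3, 1), 0j, complex(0.25, 0)]
x = extremal_vector(z, q)

# beta_z(x) = sum_n x_n z_n; Hoelder's inequality holds with equality for this x.
pairing = sum(xn * zn for xn, zn in zip(x, z))
print(abs(pairing), p_norm(x, p) * p_norm(z, q))
```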

2.24. Example. Let $X$ be a compact, Hausdorff space and consider the Banach
space $C(X, \mathbb{C})$, equipped with the norm $\|f\|_\infty := \sup_{x \in X} |f(x)|$.
The Riesz Representation Theorem states that the dual $C(X, \mathbb{C})^*$ of $C(X, \mathbb{C})$
can be identified with the set $M(X)$ of all finite, regular Borel measures on $X$. Given
a measure $\mu \in M(X)$, we associate to it the linear functional $\varphi_\mu$ defined by
\[
\varphi_\mu(f) = \int_X f(x) \, d\mu(x).
\]
It is worth noting that the functional $\varphi_\mu$ sends positive functions to non-negative
numbers precisely if $\mu$ is a positive measure. Such functionals play a significant role
in the theory of $C^*$-algebras, of which $(C(X, \mathbb{C}), \|\cdot\|_\infty)$ is an example.

I handed in a script last year and the studio didn’t change one word.
The word they didn’t change was on page 87.
Steve Martin

Exercises for Section 2.

Question 1.
Why does $(\ell^\infty, \|\cdot\|_\infty)$ not admit a Schauder basis?

Question 2.
Let $H = \ell^2(\mathbb{N})$, and let $(w_n)_n \in \ell^\infty(\mathbb{N})$. Denote by $W$ the unilateral forward
shift determined by
\[
W (x_n)_n = (0, w_1 x_1, w_2 x_2, w_3 x_3, \ldots), \quad x = (x_n)_n \in H.
\]
Prove that $\|W\| = \sup_n |w_n|$.

Question 3.
Let $H = \ell^2(\mathbb{N})$ and let $\{e_n\}_{n=1}^{\infty}$ denote the standard onb for $H$. Given $T \in B(H)$,
denote by $[T] = [t_{i,j}]$ the matrix for $T$ relative to $\{e_n\}_{n=1}^{\infty}$, where $t_{i,j} = \langle T e_j, e_i \rangle$
for all $i, j \ge 1$.
(a) Prove that if T ∈ B(H), then supi,j≥1 |ti,j | < ∞.
(b) Prove the converse to (a), or find a counterexample to show that it is false.
(c) Let T ∈ B(H). Extend the result of Question 2 and of Example 2.5 (c) by
proving that if each row and each column of [T ] contains only one non-zero
entry, then kT k = supi,j≥1 |ti,j |.
(d) Prove that in general, given T ∈ B(H), kT k ≥ supi,j≥1 |ti,j |.
(e) Let N ≥ 1 and let EN := {e1 , e2 , . . . , eN } be an onb for CN . Let QN
denote the operator whose matrix relative to EN is [QN ] = [1i,j ]1≤i,j≤N .
Find kQN k.

Question 4.
Let $[a_{i,j}]_{1 \le i, j}$ be a matrix with finite support; i.e. suppose that the set
\[
\{ (i, j) : a_{i,j} \ne 0 \}
\]
is finite. Let $\{e_n\}_n$ denote the standard onb for $\ell^2(\mathbb{N})$, and set $H_0 := \mathrm{span}\{e_n\}_n$.
Note that we are not using closed linear spans, and so $H_0$ is a dense linear submanifold of $H$, consisting of finitely-supported sequences. Define a linear map
$A_0 : H_0 \to H_0$ by setting $A_0 e_N = \sum_{i \ge 1} a_{i,N} e_i$ for all $N \ge 1$, and extending by
linearity to all of $H_0$. (Make sure that you know why and how you can do this!)
(a) Prove that A0 is continuous on H0 .
(b) Let M be any dense linear submanifold of H, and suppose that
T0 : M → M is a continuous linear map. Show that T0 extends to
a unique continuous linear map T ∈ B(H) with kT k = kT0 k. That is,
there exists (a unique) T ∈ B(H) with kT k = kT0 k and T |M = T0 .
(c) Conclude that A0 extends to a continuous linear map A ∈ B(H). What
can you say about the range of A?

Question 5.
Let $J : c_0 \to \ell^\infty$ denote the canonical embedding of $c_0$ into $\ell^\infty = (c_0)^{**}$, as
defined in Proposition 2.18. Prove that
\[
J((x_n)_n) = (x_n)_n
\]
for all $x = (x_n)_n \in c_0$.

Question 6.
Let $1 < p < \infty$. By Example 2.15, we see that $\ell^q \simeq (\ell^p)^*$, and thus that
$\ell^p \simeq (\ell^q)^* \simeq (\ell^p)^{**}$. By identifying $\ell^p$ with $(\ell^p)^{**}$ under this isomorphism, we may
consider the canonical embedding $J : \ell^p \to \ell^p$ of $\ell^p$ into its “second dual”, as defined
in Proposition 2.18. Prove that $J$ is the identity map.

Question 7.
Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$, and $(Z, \|\cdot\|_Z)$ be Banach spaces. Let $T \in B(X, Y)$ and
$R \in B(Y, Z)$. Prove that $RT \in B(X, Z)$ and that $\|RT\| \le \|R\| \, \|T\|$. Do we always
have equality? Can we ever have equality?

Question 8.
Let $(X, \|\cdot\|)$ be a Banach space. An operator $T \in B(X)$ is said to be invertible
if there exists $R \in B(X)$ such that $RT = I = TR$, where $I \in B(X)$ is the identity
map $Ix = x$ for all $x \in X$. We say that $R$ is the inverse of $T$, and we typically
write $T^{-1}$ instead of $R$ to denote the inverse of $T$.
Prove that if $T \in B(X)$ is invertible, then $T$ is bounded below; that is, there
exists $\delta > 0$ such that $\|Tx\| \ge \delta \|x\|$ for all $x \in X$.
Is the converse true? Prove it or find a counterexample to show that it is false.

3. Hilbert space

You should always go to other people’s funerals; otherwise, they won’t come to yours.
Yogi Berra

3.1. In this brief Chapter, we shall examine a class of very well-behaved Banach
spaces, namely the class of Hilbert spaces. Hilbert spaces are the generalizations
of our familiar (two- and) three-dimensional Euclidean space. There are two basic
approaches to studying Hilbert spaces. If one is interested in Banach space geometry
– and many people are – then one often tries to compare other Banach spaces to
Hilbert spaces. As an example of such a phenomenon, we mention the calculation
of the Banach-Mazur distance between Banach spaces, which we define in the
Appendix to this Section.
In the second approach, one decides that because Hilbert spaces are so well-
behaved, they are in some sense “understood”, and for this reason they are “less
interesting” to study than the set of bounded linear operators acting upon them.
One can then study the operators individually or in sets which have no algebraic
structure – this kind of analysis belongs to Single Operator Theory. Alternatively,
one can study various Operator Algebras, of which there are myriads of examples. The
literature dealing with operators and operator algebras is vast.

3.2. Recall that a Hilbert space $\mathcal{H}$ is a vector space equipped with an inner
product $\langle \cdot, \cdot \rangle$ so that the induced norm $\|x\| := \langle x, x \rangle^{1/2}$ gives rise to a complete
normed linear space, i.e. a Banach space. [When the corresponding normed linear
space is not complete, we refer only to inner product spaces.]
In any inner product space we have the Cauchy-Schwarz Inequality:
\[ |\langle x, y \rangle|^2 \le \|x\|^2 \, \|y\|^2, \]
for all $x, y \in \mathcal{H}$. We say that $x$ and $y$ are orthogonal if $\langle x, y \rangle = 0$, and we write
$x \perp y$.

3.3. Example.
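(a) If $(X, \mu)$ is a measure space, then $L^2(X, \mu)$ is a Hilbert space, with the
inner product given by
\[ \langle f, g \rangle = \int_X f(x) \, \overline{g(x)} \, d\mu(x). \]
(b) $\ell^2 = \{(x_n)_n : x_n \in \mathbb{K},\ n \ge 1, \text{ and } \sum_{n=1}^\infty |x_n|^2 < \infty\}$ is a Hilbert space, with
the inner product given by
\[ \langle (x_n)_n, (y_n)_n \rangle = \sum_n x_n \overline{y_n}. \]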

The reader with a background in measure theory will recognize that the second
example is merely a particular case of the first. While these are the canonical inner
products on these spaces, they are not the only ones.
For example, if $(r_n)_n$ is any sequence of strictly positive integers, one can define
a weighted $\ell^2$ space relative to this sequence by setting
\[ \ell^2_{(r_n)_n} := \Big\{ (x_n)_n \in \mathbb{K}^{\mathbb{N}} : \sum_n r_n |x_n|^2 < \infty \Big\} \]
with inner product
\[ \langle (x_n)_n, (y_n)_n \rangle = \sum_n r_n x_n \overline{y_n}. \]

3.4. Theorem. Let $\mathcal{H}$ be a Hilbert space and suppose that $x_1, x_2, ..., x_n \in \mathcal{H}$.
(a) [The Pythagorean Theorem] If the vectors are pairwise orthogonal, then
\[ \Big\| \sum_{j=1}^n x_j \Big\|^2 = \sum_{j=1}^n \|x_j\|^2. \]
(b) [The Parallelogram Law]
\[ \|x_1 + x_2\|^2 + \|x_1 - x_2\|^2 = 2 \left( \|x_1\|^2 + \|x_2\|^2 \right). \]


Proof. Both of these results follow immediately from the definition of the norm in
terms of the inner product.
□
The Parallelogram Law is a useful tool to show that many norms are not Hilbert
space norms.
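
To illustrate the last remark, here is a minimal numerical sketch (the helper names `parallelogram_defect`, `norm_1`, `norm_2` and `norm_inf` are our own, chosen for illustration). It evaluates the difference between the two sides of the Parallelogram Law on the standard basis vectors of $\mathbb{R}^2$: the difference vanishes for the Euclidean norm, but not for the $1$-norm or the $\infty$-norm, so neither of the latter is induced by an inner product.

```python
import math

def parallelogram_defect(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

def norm_2(v):
    # Euclidean norm, induced by the usual inner product
    return math.sqrt(sum(abs(t) ** 2 for t in v))

def norm_1(v):
    return sum(abs(t) for t in v)

def norm_inf(v):
    return max(abs(t) for t in v)

e1, e2 = [1.0, 0.0], [0.0, 1.0]
defect_2 = parallelogram_defect(norm_2, e1, e2)      # zero up to rounding
defect_1 = parallelogram_defect(norm_1, e1, e2)      # non-zero
defect_inf = parallelogram_defect(norm_inf, e1, e2)  # non-zero
print(defect_2, defect_1, defect_inf)
```

A single pair of vectors with non-zero defect suffices to rule out an inner product; a zero defect on one pair proves nothing by itself.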
3.5. Theorem. Let $\mathcal{H}$ be a Hilbert space, and $K \subseteq \mathcal{H}$ be a closed, non-empty
convex subset of $\mathcal{H}$. Given $x \in \mathcal{H}$, there exists a unique point $y \in K$ which is closest
to $x$; that is,
\[ \|x - y\| = \mathrm{dist}(x, K) = \min\{\|x - z\| : z \in K\}. \]

Proof. By translating $K$ by $-x$, it suffices to consider the case where $x = 0$.
Let $d := \mathrm{dist}(0, K)$, and choose $k_n \in K$ so that $\|0 - k_n\| < d + \frac{1}{n}$, $n \ge 1$. By the
Parallelogram Law,
\[ \Big\| \frac{k_n - k_m}{2} \Big\|^2 = \frac{1}{2} \|k_n\|^2 + \frac{1}{2} \|k_m\|^2 - \Big\| \frac{k_n + k_m}{2} \Big\|^2 \le \frac{1}{2} \Big( d + \frac{1}{n} \Big)^2 + \frac{1}{2} \Big( d + \frac{1}{m} \Big)^2 - d^2, \]
as $\frac{k_n + k_m}{2} \in K$ because $K$ is assumed to be convex.
We deduce from this that the sequence $\{k_n\}_{n=1}^\infty$ is Cauchy, and hence converges
to some $k \in K$, since $K$ is closed and $\mathcal{H}$ is complete. Clearly $\lim_{n \to \infty} k_n = k$ implies
that $d = \lim_{n \to \infty} \|k_n\| = \|k\|$.

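As for uniqueness, suppose that $z \in K$ and that $\|z\| = d$. Then
\[ 0 \le \Big\| \frac{k - z}{2} \Big\|^2 = \frac{1}{2} \|k\|^2 + \frac{1}{2} \|z\|^2 - \Big\| \frac{k + z}{2} \Big\|^2 \le \frac{1}{2} d^2 + \frac{1}{2} d^2 - d^2 = 0, \]
and so $k = z$.
□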
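
The theorem can be seen at work in a small finite-dimensional sketch. We take $\mathcal{H} = \mathbb{R}^3$ and, as an assumption made purely for illustration, the closed convex set $K = [0,1]^3$. For a box, the closest point is obtained by clamping each coordinate (a standard fact not proved in these notes); the code compares it against a brute-force search over a grid of points of $K$. The names `closest_point_in_box` and `best_on_grid` are our own.

```python
import itertools
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def closest_point_in_box(x, lo=0.0, hi=1.0):
    # Clamp each coordinate into [lo, hi]; this is the nearest point of the box.
    return [min(max(t, lo), hi) for t in x]

x = [1.5, -0.25, 0.4]
y = closest_point_in_box(x)

# Brute-force check: no grid point of the box is closer to x than y is.
grid = [i / 10 for i in range(11)]
best_on_grid = min(dist(x, list(z)) for z in itertools.product(grid, repeat=3))
print(y, dist(x, y), best_on_grid)
```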

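3.6. Theorem. Let $\mathcal{H}$ be a Hilbert space, and let $\mathcal{M} \subseteq \mathcal{H}$ be a closed subspace.
Let $x \in \mathcal{H}$, and $m \in \mathcal{M}$. The following are equivalent:
(a) $\|x - m\| = \mathrm{dist}(x, \mathcal{M})$;
(b) The vector $x - m$ is orthogonal to $\mathcal{M}$, i.e., $\langle x - m, y \rangle = 0$ for all $y \in \mathcal{M}$.
Proof.
(a) implies (b): Suppose that $\|x - m\| = \mathrm{dist}(x, \mathcal{M})$, and suppose to the
contrary that there exists $y \in \mathcal{M}$ so that $k := \langle x - m, y \rangle \ne 0$. There is no
loss of generality in assuming that $\|y\| = 1$. Consider $z = m + ky \in \mathcal{M}$.
Then
\[ \begin{aligned} \|x - z\|^2 &= \|x - m - ky\|^2 \\ &= \langle x - m - ky, x - m - ky \rangle \\ &= \|x - m\|^2 - k \langle y, x - m \rangle - \overline{k} \langle x - m, y \rangle + |k|^2 \|y\|^2 \\ &= \|x - m\|^2 - |k|^2 \\ &< \|x - m\|^2 = \mathrm{dist}(x, \mathcal{M})^2, \end{aligned} \]
a contradiction. Hence $x - m \in \mathcal{M}^\perp$.
(b) implies (a): Suppose that $x - m \in \mathcal{M}^\perp$. If $z \in \mathcal{M}$ is arbitrary, then
$y := z - m \in \mathcal{M}$, so by the Pythagorean Theorem,
\[ \|x - z\|^2 = \|(x - m) - y\|^2 = \|x - m\|^2 + \|y\|^2 \ge \|x - m\|^2, \]
and thus $\mathrm{dist}(x, \mathcal{M}) \ge \|x - m\|$. Since the other inequality is obvious, (a)
holds.
□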

3.7. Remarks.
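(a) Given any non-empty subset $S \subseteq \mathcal{H}$, let
\[ S^\perp := \{y \in \mathcal{H} : \langle x, y \rangle = 0 \text{ for all } x \in S\}. \]
It is routine to show that $S^\perp$ is a norm-closed subspace of $\mathcal{H}$. In particular,
\[ (S^\perp)^\perp \supseteq \overline{\mathrm{span}}\, S, \]
the norm closure of the linear span of $S$.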
(b) Recall from Linear Algebra that if V is a vector space and W is a (vector)
subspace of V, then there exists a (vector) subspace X ⊆ V such that

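(i) $\mathcal{W} \cap \mathcal{X} = \{0\}$, and
(ii) $\mathcal{V} = \mathcal{W} + \mathcal{X} := \{w + x : w \in \mathcal{W}, x \in \mathcal{X}\}$.
We say that $\mathcal{W}$ is algebraically complemented by $\mathcal{X}$. The existence of
such an $\mathcal{X}$ for each $\mathcal{W}$ says that every vector subspace of a vector space is
algebraically complemented. We shall write $\mathcal{V} = \mathcal{W} \dotplus \mathcal{X}$ to denote the fact
that $\mathcal{X}$ is an algebraic complement for $\mathcal{W}$ in $\mathcal{V}$.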
If X is a Banach space and Y is a closed subspace of X, we say that Y
is topologically complemented if there exists a closed subspace Z of X
such that Z is an algebraic complement to Y. The issue here is that both Y
and Z must be closed subspaces. It can be shown that the closed subspace
c0 of `∞ is not topologically complemented in `∞ . This result is known as
Phillips’ Theorem (see the paper of R. Whitley [Whi66] for a short but
elegant proof). We shall write X = Y ⊕ Z if Z is a topological complement
to Y in X.
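Now let $\mathcal{H}$ be a Hilbert space and let $\mathcal{M} \subseteq \mathcal{H}$ be a closed subspace
of $\mathcal{H}$. We claim that $\mathcal{H} = \mathcal{M} \oplus \mathcal{M}^\perp$. Indeed, if $z \in \mathcal{M} \cap \mathcal{M}^\perp$, then
$\|z\|^2 = \langle z, z \rangle = 0$, so $z = 0$. Also, if $x \in \mathcal{H}$, then we may let $m_1 \in \mathcal{M}$ be
the element satisfying
\[ \|x - m_1\| = \mathrm{dist}(x, \mathcal{M}). \]
The existence of $m_1$ is guaranteed by Theorem 3.5. By Theorem 3.6,
$m_2 := x - m_1$ lies in $\mathcal{M}^\perp$, and so $x \in \mathcal{M} + \mathcal{M}^\perp$. Since $\mathcal{M}$ and $\mathcal{M}^\perp$
are closed subspaces of a Banach space and they are algebraic complements
of one another, we are done.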
In this case, the situation is even stronger. The space M may admit
more than one topological complement in H - however, the space M⊥ above
is unique in that it is an orthogonal complement. That is, as well as
being a topological complement to M, every vector in M⊥ is orthogonal
to every vector in M.
(c) With $\mathcal{M}$ as in (b), we have $\mathcal{H} = \mathcal{M} \oplus \mathcal{M}^\perp$, so that if $x \in \mathcal{H}$, then we may
write $x = m_1 + m_2$ with $m_1 \in \mathcal{M}$, $m_2 \in \mathcal{M}^\perp$ in a unique way. Consider
the map:
\[ P : \mathcal{H} \to \mathcal{M} \oplus \mathcal{M}^\perp, \qquad x \mapsto m_1, \]
relative to the above decomposition of $x$. It is elementary to verify that $P$
is linear, and that $P$ is idempotent, i.e., $P = P^2$. We remark in passing
that $m_2 = (I - P)x$, and that $(I - P)^2 = (I - P)$ as well.
In fact, for $x \in \mathcal{H}$, $\|x\|^2 = \|m_1\|^2 + \|m_2\|^2$ by the Pythagorean Theorem,
and so $\|Px\| = \|m_1\| \le \|x\|$, from which it follows that $\|P\| \le 1$. If
$\mathcal{M} \ne \{0\}$, then choose $m \in \mathcal{M}$ with $\|m\| \ne 0$. Then $Pm = m$, and so
$\|P\| \ge 1$. Combining these estimates, $\mathcal{M} \ne \{0\}$ implies $\|P\| = 1$.
We refer to the map $P$ as the orthogonal projection of $\mathcal{H}$ onto $\mathcal{M}$.
The map $Q := (I - P)$ is the orthogonal projection onto $\mathcal{M}^\perp$, and we leave
it to the reader to verify that if $\mathcal{M} \ne \mathcal{H}$, then $\|Q\| = 1$.

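(d) Let $\varnothing \ne S \subseteq \mathcal{H}$. We saw in (a) that $S^{\perp\perp} \supseteq \overline{\mathrm{span}}\, S$. In fact, if we let
$\mathcal{M} = \overline{\mathrm{span}}\, S$, then $\mathcal{M}$ is a closed subspace of $\mathcal{H}$, and so by (b),
\[ \mathcal{H} = \mathcal{M} \oplus \mathcal{M}^\perp. \]
It is routine to check that $S^\perp = \mathcal{M}^\perp$. Suppose that there exists $0 \ne x \in S^{\perp\perp}$,
$x \notin \mathcal{M}$. Then $x \in \mathcal{H}$, and so we can write $x = m_1 + m_2$ with $m_1 \in \mathcal{M}$,
and $m_2 \in \mathcal{M}^\perp = S^\perp$ ($m_2 \ne 0$, otherwise $x \in \mathcal{M}$). But then $0 \ne m_2 \in S^\perp$
and so
\[ \langle m_2, x \rangle = \langle m_2, m_1 \rangle + \langle m_2, m_2 \rangle = 0 + \|m_2\|^2 \ne 0. \]
This contradicts the fact that $x \in S^{\perp\perp}$. It follows that $S^{\perp\perp} = \overline{\mathrm{span}}\, S$.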
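(e) Suppose that $\mathcal{M}$ admits an orthonormal basis $\{e_k\}_{k=1}^n$. Let $x \in \mathcal{H}$, and let
$P$ denote the orthogonal projection onto $\mathcal{M}$. By (b), $Px$ is the unique
element of $\mathcal{M}$ so that $x - Px$ lies in $\mathcal{M}^\perp$. Consider the vector
$w = \sum_{k=1}^n \langle x, e_k \rangle e_k$. Then
\[ \begin{aligned} \langle x - w, e_j \rangle &= \langle x, e_j \rangle - \Big\langle \sum_{k=1}^n \langle x, e_k \rangle e_k, e_j \Big\rangle \\ &= \langle x, e_j \rangle - \sum_{k=1}^n \langle x, e_k \rangle \langle e_k, e_j \rangle \\ &= \langle x, e_j \rangle - \langle x, e_j \rangle \|e_j\|^2 \\ &= 0. \end{aligned} \]
It follows that $x - w \in \mathcal{M}^\perp$, and thus $w = Px$. That is, $Px = \sum_{k=1}^n \langle x, e_k \rangle e_k$.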
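
The formula $Px = \sum_k \langle x, e_k \rangle e_k$ of Remark 3.7 (e) is easy to test numerically. The following sketch works in $\mathbb{R}^3$ with a two-dimensional subspace $\mathcal{M}$ whose orthonormal basis is chosen by us purely for illustration; the helper names `inner` and `project` are likewise our own. It checks that $x - Px \perp \mathcal{M}$, that $P$ is idempotent, and that $\|x\|^2 = \|Px\|^2 + \|x - Px\|^2$.

```python
import math

def inner(u, v):
    # Real inner product on R^n
    return sum(a * b for a, b in zip(u, v))

def project(x, onb):
    """Orthogonal projection of x onto span(onb), where onb is orthonormal."""
    p = [0.0] * len(x)
    for e in onb:
        c = inner(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

r = 1 / math.sqrt(2)
onb = [[r, r, 0.0], [0.0, 0.0, 1.0]]   # orthonormal basis for M (illustrative choice)
x = [3.0, 1.0, 2.0]

Px = project(x, onb)
residual = [a - b for a, b in zip(x, Px)]         # this is (I - P)x
orthogonality = [inner(residual, e) for e in onb] # should be (numerically) zero
PPx = project(Px, onb)                            # should equal Px
print(Px, residual, orthogonality)
```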

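3.8. Theorem. The Riesz Representation Theorem. Let $\{0\} \ne \mathcal{H}$ be a
Hilbert space over $\mathbb{K}$, and let $\varphi \in \mathcal{H}^*$. Then there exists a unique vector $y \in \mathcal{H}$ so
that
\[ \varphi(x) = \langle x, y \rangle \text{ for all } x \in \mathcal{H}. \]
Moreover, $\|\varphi\| = \|y\|$.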
Proof. Given a fixed $y \in \mathcal{H}$, let us denote by $\beta_y$ the map $\beta_y(x) = \langle x, y \rangle$. Our goal
is to show that $\mathcal{H}^* = \{\beta_y : y \in \mathcal{H}\}$. First note that if $y \in \mathcal{H}$, then $\beta_y(k x_1 + x_2) =
\langle k x_1 + x_2, y \rangle = k \langle x_1, y \rangle + \langle x_2, y \rangle = k \beta_y(x_1) + \beta_y(x_2)$, and so $\beta_y$ is linear. Furthermore,
for each $x \in \mathcal{H}$, $|\beta_y(x)| = |\langle x, y \rangle| \le \|x\| \, \|y\|$ by the Cauchy-Schwarz Inequality. Thus
$\|\beta_y\| \le \|y\|$, and hence $\beta_y$ is continuous - i.e. $\beta_y \in \mathcal{H}^*$.
It is not hard to verify that the map
\[ \Theta : \mathcal{H} \to \mathcal{H}^*, \qquad y \mapsto \beta_y \]
is conjugate-linear (if $\mathbb{K} = \mathbb{C}$), otherwise it is linear (if $\mathbb{K} = \mathbb{R}$). From the first
paragraph, it is also contractive. But $[\Theta(y)](y) = \beta_y(y) = \langle y, y \rangle = \|y\|^2$, so that
$\|\Theta(y)\| \ge \|y\|$ for all $y \in \mathcal{H}$, and $\Theta$ is isometric as well. It immediately follows that
$\Theta$ is injective, and there remains only to prove that $\Theta$ is surjective.
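Let $\varphi \in \mathcal{H}^*$. If $\varphi = 0$, then $\varphi = \Theta(0)$. Otherwise, let $\mathcal{M} = \ker \varphi$, so that
$\mathrm{codim}\, \mathcal{M} = 1 = \dim \mathcal{M}^\perp$, since $\mathcal{H}/\mathcal{M} \simeq \mathbb{K} \simeq \mathcal{M}^\perp$. Choose $e \in \mathcal{M}^\perp$ with $\|e\| = 1$.
Let $P$ denote the orthogonal projection of $\mathcal{H}$ onto $\mathcal{M}$, constructed as in Remark 3.7.
Then, as $I - P$ is the orthogonal projection onto $\mathcal{M}^\perp$, and as $\{e\}$ is an
orthonormal basis for $\mathcal{M}^\perp$, by Remark 3.7 (e), for all $x \in \mathcal{H}$, we have
\[ x = Px + (I - P)x = Px + \langle x, e \rangle e. \]
Since $Px \in \mathcal{M} = \ker \varphi$, we have $\varphi(Px) = 0$. Thus for all $x \in \mathcal{H}$,
\[ \varphi(x) = \varphi(Px) + \langle x, e \rangle \varphi(e) = \langle x, \overline{\varphi(e)}\, e \rangle = \beta_y(x), \]
where $y := \overline{\varphi(e)}\, e$. Hence $\varphi = \beta_y$, and $\Theta$ is onto.
□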
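
In $\mathbb{C}^n$ with its standard inner product, the Riesz Representation Theorem can be checked by hand. The sketch below is an illustration under our own assumptions: the functional `phi` is given by hypothetical coefficients `a`, and its representing vector is the coordinatewise complex conjugate of `a`. The code verifies that $\varphi(x) = \langle x, y \rangle$ at a sample point, and that the value $\|y\|$ is attained by $|\varphi|$ at the unit vector $y / \|y\|$, in agreement with $\|\varphi\| = \|y\|$.

```python
import math

def inner(x, y):
    # Linear in the first variable, conjugate-linear in the second
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

a = [2 + 1j, -1j, 3.0 + 0j]           # hypothetical coefficients defining phi

def phi(x):
    return sum(ai * xi for ai, xi in zip(a, x))

y = [ai.conjugate() for ai in a]      # the representing vector of phi
x = [1 - 2j, 0.5 + 0j, 1j]            # an arbitrary test vector
unit = [t / norm(y) for t in y]       # unit vector at which |phi| equals ||y||

print(phi(x), inner(x, y))
print(abs(phi(unit)), norm(y))
```

Note where the conjugate lands: because the inner product is conjugate-linear in its second variable, the representing vector carries $\overline{a_i}$ rather than $a_i$.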

3.9. Remark. The fact that the map $\Theta$ defined in the proof of the Riesz Representation
Theorem above is an isometric, conjugate-linear bijection of $\mathcal{H}$ onto $\mathcal{H}^*$
is worth remembering.

3.10. Definition. A subset $\{e_\lambda\}_{\lambda \in \Lambda}$ of a Hilbert space $\mathcal{H}$ is said to be
orthonormal if $\|e_\lambda\| = 1$ for all $\lambda$, and $\lambda \ne \alpha$ implies that $\langle e_\lambda, e_\alpha \rangle = 0$.
An orthonormal set in a Hilbert space is called an orthonormal basis for $\mathcal{H}$ if
it is maximal in the collection of all orthonormal sets of $\mathcal{H}$, partially ordered with
respect to inclusion.

If $\mathcal{E} = \{e_\lambda\}_\lambda$ is any orthonormal set in $\mathcal{H}$, then an easy application of Zorn’s
Lemma implies the existence of an orthonormal basis in $\mathcal{H}$ which contains $\mathcal{E}$. The
reader is warned that if $\mathcal{H}$ is infinite-dimensional, then an orthonormal basis for $\mathcal{H}$
is never a vector space (i.e. a Hamel) basis for $\mathcal{H}$.

3.11. Example.
(a) If $\mathcal{H} = \ell^2$ then the standard Schauder basis $\{e_n\}_{n=1}^\infty$ for $\ell^2$ as defined in
Example 2.13 is an orthonormal basis for $\mathcal{H}$.
(b) If $\mathcal{H} = L^2(\mathbb{T}, dm)$, where $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ and $dm$ represents
normalised Lebesgue measure, then $\{f_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for
$L^2(\mathbb{T}, dm)$, where $f_n(z) = z^n$ for all $z \in \mathbb{T}$ and for all $n \in \mathbb{Z}$.

We recall from Linear Algebra:



3.12. Theorem. The Gram-Schmidt Orthogonalisation Process
If $\mathcal{H}$ is a Hilbert space over $\mathbb{K}$ and $\{x_n\}_{n=1}^\infty$ is a linearly independent set in $\mathcal{H}$,
then we can find an orthonormal set $\{y_n\}_{n=1}^\infty$ in $\mathcal{H}$ so that $\mathrm{span}\{x_1, x_2, ..., x_k\} =
\mathrm{span}\{y_1, y_2, ..., y_k\}$ for all $k \ge 1$.
Proof. We leave it to the reader to verify that setting $y_1 = x_1 / \|x_1\|$, and recursively
defining
\[ y_k := \frac{x_k - \sum_{j=1}^{k-1} \langle x_k, y_j \rangle y_j}{\big\| x_k - \sum_{j=1}^{k-1} \langle x_k, y_j \rangle y_j \big\|}, \qquad k \ge 2, \]
will do.
□
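
The recursion in the proof translates directly into code. The following is a minimal sketch for finitely many vectors in $\mathbb{R}^n$ (the function name `gram_schmidt` and the sample vectors are our own choices); it follows the formula above literally and makes no attempt at the numerical refinements used in practice.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthonormalise a linearly independent list of real vectors."""
    ys = []
    for x in xs:
        # Subtract from x its components along the y_j found so far.
        r = list(x)
        for y in ys:
            c = inner(x, y)
            r = [ri - c * yi for ri, yi in zip(r, y)]
        n = math.sqrt(inner(r, r))
        if n == 0:
            raise ValueError("input vectors are linearly dependent")
        ys.append([ri / n for ri in r])
    return ys

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ys = gram_schmidt(xs)
gram = [[inner(u, v) for v in ys] for u in ys]   # should be the identity matrix
print(gram)
```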

3.13. Theorem. Bessel’s Inequality
If $\{e_n\}_{n=1}^\infty$ is an orthonormal set in a Hilbert space $\mathcal{H}$, then for each $x \in \mathcal{H}$,
\[ \sum_{n=1}^\infty |\langle x, e_n \rangle|^2 \le \|x\|^2. \]

Proof. For each $k \ge 1$, let $P_k$ denote the orthogonal projection of $\mathcal{H}$ onto
$\mathrm{span}\{e_n\}_{n=1}^k$. Given $x \in \mathcal{H}$, we have seen that $\|P_k\| \le 1$, and that
$P_k x = \sum_{n=1}^k \langle x, e_n \rangle e_n$. Hence $\|x\|^2 \ge \|P_k x\|^2 = \sum_{n=1}^k |\langle x, e_n \rangle|^2$ for all $k \ge 1$, from which
the result follows.
□

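3.14. Before considering a non-separable version of the above result, we pause
to define what we mean by a sum over an uncountable index set.
Given a Banach space $\mathfrak{X}$ and a set $\{x_\lambda\}_{\lambda \in \Lambda}$ of vectors in $\mathfrak{X}$, let $\mathcal{F}$ denote the
collection of all finite subsets of $\Lambda$, partially ordered by inclusion. For each $F \in \mathcal{F}$,
define $y_F = \sum_{\lambda \in F} x_\lambda$, so that $(y_F)_{F \in \mathcal{F}}$ is a net in $\mathfrak{X}$. If $y = \lim_{F \in \mathcal{F}} y_F$ exists, then
we write $y = \sum_{\lambda \in \Lambda} x_\lambda$, and we say that $\{x_\lambda\}_\lambda$ is unconditionally summable to
$y$.
In other words, $\sum_\lambda x_\lambda = y$ if for all $\varepsilon > 0$ there exists $F_0 \in \mathcal{F}$ so that $F \supseteq F_0$
implies that $\| \sum_{\lambda \in F} x_\lambda - y \| < \varepsilon$.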

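3.15. Corollary. Let $\mathcal{H}$ be a Hilbert space and $\mathcal{E} \subseteq \mathcal{H}$ be an orthonormal set.
(a) Given $x \in \mathcal{H}$, the set $\{e \in \mathcal{E} : \langle x, e \rangle \ne 0\}$ is countable.
(b) For all $x \in \mathcal{H}$, $\sum_{e \in \mathcal{E}} |\langle x, e \rangle|^2 \le \|x\|^2$.
Proof.
(a) Fix $x \in \mathcal{H}$. For each $k \ge 1$, define $F_k = \{e \in \mathcal{E} : |\langle x, e \rangle| \ge \frac{1}{k}\}$. Suppose
that there exists $k_0 \ge 1$ so that $F_{k_0}$ is infinite. Choose a countably infinite
subset $\{e_n\}_{n=1}^\infty$ of $F_{k_0}$. By Bessel’s Inequality,
\[ \frac{m}{k_0^2} = \sum_{n=1}^m \frac{1}{k_0^2} \le \sum_{n=1}^m |\langle x, e_n \rangle|^2 \le \|x\|^2 \]
for all $m \ge 1$. This is absurd. Thus $|F_k| < \infty$ for all $k \ge 1$. But then
\[ \{e \in \mathcal{E} : \langle x, e \rangle \ne 0\} = \cup_{k \ge 1} F_k \]
is countable.
(b) This is left as a (routine) exercise for the reader.
□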

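3.16. Lemma. Let $\mathcal{H}$ be a Hilbert space, $\mathcal{E} \subseteq \mathcal{H}$ be an orthonormal set, and
$x \in \mathcal{H}$. Then
\[ \sum_{e \in \mathcal{E}} \langle x, e \rangle e \]
converges in $\mathcal{H}$.
Proof. Since $\mathcal{H}$ is complete, it suffices to show that if $\mathcal{F}$ as in Paragraph 3.14
denotes the collection of finite subsets of $\mathcal{E}$, partially ordered by inclusion, and if for
each $F \in \mathcal{F}$, $y_F = \sum_{e \in F} \langle x, e \rangle e$, then $(y_F)_{F \in \mathcal{F}}$ is a Cauchy net.
Let $\varepsilon > 0$. By Corollary 3.15, we can find a countable subcollection $\{e_n\}_{n=1}^\infty \subseteq \mathcal{E}$
so that $e \in \mathcal{E} \setminus \{e_n\}_{n=1}^\infty$ implies that $\langle x, e \rangle = 0$. Moreover, by Bessel’s Inequality, we
can find $N > 0$ so that $\sum_{k=N+1}^\infty |\langle x, e_k \rangle|^2 < \varepsilon^2$. Let $F_0 = \{e_1, e_2, ..., e_N\}$.
If $F, G \in \mathcal{F}$ and $F_0 \le F$, $F_0 \le G$, then
\[ \begin{aligned} \|y_F - y_G\|^2 &= \Big\| \sum_{e \in F \setminus G} \langle x, e \rangle e - \sum_{e \in G \setminus F} \langle x, e \rangle e \Big\|^2 \\ &= \sum_{e \in (F \setminus G) \cup (G \setminus F)} |\langle x, e \rangle|^2 \\ &\le \sum_{k=N+1}^\infty |\langle x, e_k \rangle|^2 \\ &< \varepsilon^2. \end{aligned} \]
This shows that $(y_F)_{F \in \mathcal{F}}$ is a Cauchy net, and therefore it converges, as required.
(For a proof that Cauchy nets in a complete metric space converge, see Proposition 3.29
in the Appendix to this Section.)
□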

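3.17. Theorem. Let $\mathcal{E}$ be an orthonormal set in a Hilbert space $\mathcal{H}$. The following
are equivalent:
(a) The set $\mathcal{E}$ is an orthonormal basis for $\mathcal{H}$. (That is, $\mathcal{E}$ is a maximal
orthonormal set in $\mathcal{H}$.)
(b) The set $\mathrm{span}\, \mathcal{E}$ is norm-dense in $\mathcal{H}$.
(c) For all $x \in \mathcal{H}$, $x = \sum_{e \in \mathcal{E}} \langle x, e \rangle e$.
(d) For all $x \in \mathcal{H}$, $\|x\|^2 = \sum_{e \in \mathcal{E}} |\langle x, e \rangle|^2$. [Parseval’s Identity]
Proof. Let $\mathcal{E} = \{e_\lambda\}_{\lambda \in \Lambda}$ be an orthonormal set in $\mathcal{H}$.
(a) implies (b): Let $\mathcal{M} = \overline{\mathrm{span}}\, \mathcal{E}$. If $\mathcal{M} \ne \mathcal{H}$, then $\mathcal{M}^\perp \ne \{0\}$, so we can find
$z \in \mathcal{M}^\perp$, $\|z\| = 1$. But then $\mathcal{E} \cup \{z\}$ is an orthonormal set, contradicting
the maximality of $\mathcal{E}$.
(b) implies (c): Let $y = \sum_{\lambda \in \Lambda} \langle x, e_\lambda \rangle e_\lambda$, which exists by Lemma 3.16. Then
$\langle y - x, e_\lambda \rangle = 0$ for all $\lambda \in \Lambda$, so $y - x$ is orthogonal to $\mathcal{M} = \overline{\mathrm{span}}\, \mathcal{E} = \mathcal{H}$.
But then $y - x \perp y - x$, so that $y - x = 0$, i.e. $y = x$.
(c) implies (d): $\|x\|^2 = \langle \sum_{e \in \mathcal{E}} \langle x, e \rangle e, \sum_{f \in \mathcal{E}} \langle x, f \rangle f \rangle = \sum_{e \in \mathcal{E}} |\langle x, e \rangle|^2$. [Check!]
(d) implies (a): If $e \perp e_\lambda$ for all $\lambda \in \Lambda$, then
\[ \|e\|^2 = \sum_{\lambda \in \Lambda} |\langle e, e_\lambda \rangle|^2 = 0, \]
so that $\mathcal{E}$ is maximal.
□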
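
The difference between Bessel’s Inequality and Parseval’s Identity is already visible in $\mathbb{R}^3$. In the sketch below (the particular orthonormal vectors are our own illustrative choice), the full orthonormal basis reproduces $\|x\|^2$ exactly, while a proper orthonormal subset only gives a lower bound.

```python
import math

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

r = 1 / math.sqrt(2)
u1, u2, u3 = [r, r, 0.0], [r, -r, 0.0], [0.0, 0.0, 1.0]
x = [3.0, 1.0, 2.0]

norm_sq = inner(x, x)
# Parseval: sum over a maximal orthonormal set equals ||x||^2.
parseval_sum = sum(inner(x, e) ** 2 for e in (u1, u2, u3))
# Bessel: sum over a non-maximal orthonormal set is at most ||x||^2.
bessel_sum = sum(inner(x, e) ** 2 for e in (u1, u3))
print(norm_sq, parseval_sum, bessel_sum)
```

The gap `norm_sq - bessel_sum` is exactly the squared coefficient of `x` along the omitted vector `u2`.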

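3.18. Proposition. If $\mathcal{H}$ is a Hilbert space, then any two orthonormal bases
for $\mathcal{H}$ have the same cardinality.
Proof. We shall only deal with the infinite-dimensional situation, since the finite-dimensional
case was dealt with in linear algebra.
Let $\mathcal{E}$ and $\mathcal{F}$ be two orthonormal bases for $\mathcal{H}$. Given $e \in \mathcal{E}$, let $\mathcal{F}_e = \{f \in \mathcal{F} :
\langle e, f \rangle \ne 0\}$. Then $|\mathcal{F}_e| \le \aleph_0$. Moreover, given $f \in \mathcal{F}$, there exists $e \in \mathcal{E}$ so that
$\langle e, f \rangle \ne 0$, otherwise $f$ is orthogonal to $\overline{\mathrm{span}}\, \mathcal{E} = \mathcal{H}$, a contradiction.
Thus $\mathcal{F} = \cup_e \mathcal{F}_e$, and so $|\mathcal{F}| \le (\sup_{e \in \mathcal{E}} |\mathcal{F}_e|) \, |\mathcal{E}| \le \aleph_0 |\mathcal{E}| = |\mathcal{E}|$. By symmetry,
$|\mathcal{E}| \le |\mathcal{F}|$, and so $|\mathcal{E}| = |\mathcal{F}|$.
□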
The above result justifies the following definition:

3.19. Definition. The dimension of a Hilbert space $\mathcal{H}$ is the cardinality of
any orthonormal basis for $\mathcal{H}$, and it is denoted by $\dim \mathcal{H}$.
The appropriate notion of isomorphism in the category of Hilbert spaces involves
linear maps that preserve the inner product.
3.20. Definition. Two Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ are said to be isomorphic
if there exists a linear bijection $U : \mathcal{H}_1 \to \mathcal{H}_2$ so that
\[ \langle Ux, Uy \rangle = \langle x, y \rangle \]
for all $x, y \in \mathcal{H}_1$. We write $\mathcal{H}_1 \simeq \mathcal{H}_2$ to denote this isomorphism.
We also refer to the linear maps implementing the above isomorphism as unitary
operators. Note that
\[ \|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, x \rangle = \|x\|^2 \]
for all $x \in \mathcal{H}_1$, so that unitary operators are isometries. Moreover, the inverse map
$U^{-1} : \mathcal{H}_2 \to \mathcal{H}_1$ defined by $U^{-1}(Ux) := x$ is also linear, and
\[ \langle U^{-1}(Ux), U^{-1}(Uy) \rangle = \langle x, y \rangle = \langle Ux, Uy \rangle, \]
so that $U^{-1}$ is also a unitary operator. Furthermore, if $\mathcal{L} \subseteq \mathcal{H}_1$ is a closed subspace,
then $\mathcal{L}$ is complete, whence $U\mathcal{L}$ is also complete and hence closed in $\mathcal{H}_2$.

Unlike the situation in Banach spaces, where two non-isomorphic Banach spaces
can have Schauder bases of the same cardinality, the case of Hilbert spaces is as nice
as one can imagine.

3.21. Theorem. Two Hilbert spaces H1 and H2 over K are isomorphic if and
only if they have the same dimension.
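Proof. Suppose first that $\mathcal{H}_1$ and $\mathcal{H}_2$ are isomorphic, and let $U : \mathcal{H}_1 \to \mathcal{H}_2$ be
a unitary operator. Let $\{e_\lambda\}_{\lambda \in \Lambda}$ be an orthonormal basis for $\mathcal{H}_1$. We claim that
$\{U e_\lambda\}_{\lambda \in \Lambda}$ is an orthonormal basis for $\mathcal{H}_2$.
Indeed, $\langle U e_\alpha, U e_\beta \rangle = \langle e_\alpha, e_\beta \rangle = \delta_{\alpha,\beta}$ (the Kronecker delta function), $\|U e_\alpha\| =
\|e_\alpha\| = 1$ for all $\alpha \in \Lambda$, and
\[ \mathcal{H}_2 = U \mathcal{H}_1 = U(\overline{\mathrm{span}}\, \{e_\lambda\}_{\lambda \in \Lambda}) = \overline{\mathrm{span}}\, \{U e_\lambda\}_{\lambda \in \Lambda}, \]
by the continuity of $U$.
Hence $\dim \mathcal{H}_2 = |\{U e_\lambda\}_{\lambda \in \Lambda}| = |\Lambda| = \dim \mathcal{H}_1$.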

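Conversely, suppose that $\dim \mathcal{H}_2 = \dim \mathcal{H}_1$. Then we can find a set $\Lambda$ and
orthonormal bases $\{e_\lambda : \lambda \in \Lambda\}$ for $\mathcal{H}_1$, and $\{f_\lambda : \lambda \in \Lambda\}$ for $\mathcal{H}_2$. Consider the map
\[ U : \mathcal{H}_1 \to \ell^2(\Lambda), \qquad x \mapsto (\langle x, e_\lambda \rangle)_{\lambda \in \Lambda}, \]
where $\ell^2(\Lambda) := \{ f : \Lambda \to \mathbb{K} : \sum_{\lambda \in \Lambda} |f(\lambda)|^2 < \infty \}$. This is an inner product space
using the inner product $\langle f, g \rangle = \sum_{\lambda \in \Lambda} f(\lambda) \overline{g(\lambda)}$. The proof that $\ell^2(\Lambda)$ is complete
is essentially the same as in the case of $\ell^2(\mathbb{N})$.
It is routine to check that $U$ is linear. Moreover,
\[ \begin{aligned} \langle Ux, Uy \rangle &= \sum_\lambda \langle x, e_\lambda \rangle \overline{\langle y, e_\lambda \rangle} \\ &= \sum_\lambda \langle \langle x, e_\lambda \rangle e_\lambda, \langle y, e_\lambda \rangle e_\lambda \rangle \\ &= \Big\langle \sum_\lambda \langle x, e_\lambda \rangle e_\lambda, \sum_\gamma \langle y, e_\gamma \rangle e_\gamma \Big\rangle \\ &= \langle x, y \rangle \end{aligned} \]
for all $x, y \in \mathcal{H}_1$, and so $\|Ux\|^2 = \langle Ux, Ux \rangle = \langle x, x \rangle = \|x\|^2$ for all $x \in \mathcal{H}_1$. It follows
that $U$ is isometric and therefore injective.
If $(r_\lambda)_\lambda \in \ell^2(\Lambda)$ has finite support, then $x := \sum_{\lambda \in \Lambda} r_\lambda e_\lambda \in \mathcal{H}_1$ and $Ux = (r_\lambda)_\lambda$.
Thus $\mathrm{ran}\, U$ is dense. But from the comment above, $U \mathcal{H}_1$ is closed, and therefore
$U \mathcal{H}_1 = \ell^2(\Lambda)$.
We have shown that $U$ is a unitary operator implementing the isomorphism of
$\mathcal{H}_1$ and $\ell^2(\Lambda)$. By symmetry once again, there exists a unitary $V : \mathcal{H}_2 \to \ell^2(\Lambda)$.
But then $V^{-1} U : \mathcal{H}_1 \to \mathcal{H}_2$ is unitary, showing that $\mathcal{H}_1 \simeq \mathcal{H}_2$.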
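
The coefficient map $U : x \mapsto (\langle x, e_\lambda \rangle)_\lambda$ from the proof can be tried out in $\mathbb{C}^2$. The following sketch uses an orthonormal basis chosen by us for illustration (the name `coefficients` is also ours) and checks that $U$ preserves inner products, which is the defining property of a unitary operator.

```python
import math

def inner(x, y):
    # Linear in the first variable, conjugate-linear in the second
    return sum(a * b.conjugate() for a, b in zip(x, y))

r = 1 / math.sqrt(2)
e1 = [r + 0j, r * 1j]     # (1, i)/sqrt(2)
e2 = [r + 0j, -r * 1j]    # (1, -i)/sqrt(2)
basis = [e1, e2]

def coefficients(x):
    """The map U: send x to its sequence of coefficients relative to the basis."""
    return [inner(x, e) for e in basis]

x = [1 + 2j, -3j]
y = [0.5 - 1j, 2 + 0j]
lhs = inner(coefficients(x), coefficients(y))   # <Ux, Uy> in l^2 of a 2-point set
rhs = inner(x, y)                               # <x, y> in C^2
print(lhs, rhs)
```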

3.22. Corollary. The spaces $\ell^2(\mathbb{N})$, $\ell^2(\mathbb{Q})$, $\ell^2(\mathbb{Z})$ and $L^2([0,1], dx)$ (where $dx$
represents Lebesgue measure) are all isomorphic, as they are all infinite-dimensional,
separable Hilbert spaces.

Appendix to Section 3.

3.23. When dealing with Hilbert space operators and operator algebras, one
tends to focus upon complex Hilbert spaces. One reason for this is that the spectrum
provides a terribly useful tool for analyzing operators. For X a normed linear space,
let I (or IX if we wish to emphasize the underlying space) denote the identity map
Ix = x for all x ∈ X.

3.24. Definition. Let X and Y be Banach spaces, and let T ∈ B(X, Y). We
say that T is invertible if there exists a (continuous) linear map R ∈ B(Y, X) such
that RT = IX and T R = IY .
If T ∈ B(X), we define the spectrum of T to be:

σ(T ) = {λ ∈ K : (T − λI) is not invertible}.

3.25. When $\mathfrak{X}$ is a finite-dimensional space, the spectrum of $T$ coincides with
the set of eigenvalues of $T$. The reader will recall from their Linear Algebra courses that
eigenvalues of linear maps need not exist when the underlying field is not algebraically
closed. When $\mathbb{K} = \mathbb{C}$, it can be shown that the spectrum of an operator
$T \in \mathcal{B}(\mathfrak{X})$ is a non-empty, compact subset of $\mathbb{C}$, and that there is a so-called functional calculus
which allows one to naturally define $f(T)$ for any complex-valued function $f$ which is
analytic in an open neighbourhood of $\sigma(T)$. This, however, is beyond the scope of
the present notes.
If X is a Banach space, an operator Q ∈ B(X) is said to be quasinilpotent if
σ(Q) = {0}. The argument of Paragraph 2.20 says that the Volterra operator is
quasinilpotent.
A wonderful theorem of A. Beurling, known as Beurling’s spectral radius
formula relates the spectrum of an operator T to a limit of the kind obtained in
Paragraph 2.20.

Theorem. Beurling’s Spectral Radius Formula.
Let $\mathfrak{X}$ be a complex Banach space and $T \in \mathcal{B}(\mathfrak{X})$. Then
\[ \mathrm{spr}(T) := \max\{|k| : k \in \sigma(T)\} = \lim_n \|T^n\|^{\frac{1}{n}}. \]

The quantity $\mathrm{spr}(T)$ is known as the spectral radius of $T$. It is worth pointing
out that an implication of Beurling’s Spectral Radius Formula is that the limit on
the right-hand side of the equation exists! A priori, this is not obvious.
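
A small numerical experiment makes the formula plausible. The sketch below takes a $2 \times 2$ upper triangular matrix with spectrum $\{0.5\}$ (an example of our own choosing) and computes $\|T^n\|^{1/n}$ for increasing $n$. As a simplifying assumption we use the Frobenius norm instead of the operator norm; since all norms on the finite-dimensional space of $2 \times 2$ matrices are equivalent, the limit of the $n$-th roots is the same.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def frobenius(A):
    # Frobenius norm, used here in place of the operator norm (see the remark above)
    return math.sqrt(sum(abs(t) ** 2 for row in A for t in row))

T = [[0.5, 1.0], [0.0, 0.5]]      # upper triangular; its only eigenvalue is 0.5
P = [[1.0, 0.0], [0.0, 1.0]]      # running power of T, starting from the identity
roots = {}
for n in range(1, 201):
    P = matmul(P, T)              # P is now T^n
    roots[n] = frobenius(P) ** (1.0 / n)

print(roots[1], roots[10], roots[50], roots[200])
```

Although $\|T\| > 1$, the $n$-th roots decrease towards $0.5$; the convergence is slow because of the off-diagonal entry, which contributes a factor that grows linearly in $n$ before the root is taken.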

3.26. As mentioned in paragraph 3.1, Hilbert spaces arise naturally in the study
of Banach space geometry. In this context, much of the literature concerns real
Hilbert spaces.
For example, for each n ≥ 1, let Qn denote the set of n-dimensional (real)
Banach spaces. Given Banach spaces X and Y in Qn , we denote by GL(X, Y) the
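set of all invertible operators from $\mathfrak{X}$ to $\mathfrak{Y}$. We can define a metric $\delta$ on $Q_n$ via:
\[ \delta(\mathfrak{X}, \mathfrak{Y}) := \log \big( \inf\{ \|T\| \, \|T^{-1}\| : T \in GL(\mathfrak{X}, \mathfrak{Y}) \} \big). \]
It can be shown that $(Q_n, \delta)$ is a compact metric space, known as the Banach-Mazur
compactum. One also refers to the quantity
\[ d(\mathfrak{X}, \mathfrak{Y}) = \inf\{ \|T\| \, \|T^{-1}\| : T \in GL(\mathfrak{X}, \mathfrak{Y}) \} \]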
as the Banach-Mazur distance between X and Y, and it is an important problem
in Banach space geometry to calculate Banach-Mazur distances between the n-
dimensional subspaces of two infinite-dimensional Banach spaces, say V and W.
Typically, one is interested in knowing something about the asymptotic behaviour
of these distances as n tends to infinity.
We mention without proof two interesting facts concerning the Banach-Mazur
distance:
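(a) If $\mathfrak{X}$, $\mathfrak{Y}$ and $\mathfrak{Z}$ are $n$-dimensional Banach spaces, then
\[ d(\mathfrak{X}, \mathfrak{Z}) \le d(\mathfrak{X}, \mathfrak{Y}) \, d(\mathfrak{Y}, \mathfrak{Z}). \]
(b) A Theorem of Fritz John shows that (with $\ell^2_n := (\mathbb{R}^n, \|\cdot\|_2)$),
\[ d(\mathfrak{X}, \ell^2_n) \le \sqrt{n} \text{ for all } n \ge 1. \]
It clearly follows from these two properties that $d(\mathfrak{X}, \mathfrak{Y}) \le n$ for all $\mathfrak{X}, \mathfrak{Y} \in Q_n$.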

3.27. We mentioned earlier in this section that the Parallelogram Law is useful
in determining that a given norm is not induced by an inner product. In fact, it
can be shown that a norm on a Banach space $\mathfrak{X}$ is the norm induced by some inner
product if and only if the norm satisfies the Parallelogram Law.

3.28. In Lemma 3.16, we invoked the claim that a Cauchy net in our Hilbert
space H necessarily converges. Since we have defined completeness in terms of
sequences instead of nets, the following result may be helpful.

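3.29. Proposition. Let $(X, d)$ be a metric space. The following are equivalent:
(i) $(X, d)$ is complete as a metric space; i.e. every Cauchy sequence $(x_n)_{n=1}^\infty$
converges in $X$.
(ii) Every Cauchy net $(x_\lambda)_{\lambda \in \Lambda}$ converges in $X$.
Proof.
(i) implies (ii). Suppose that every Cauchy sequence in $X$ converges, and let
$(x_\lambda)_{\lambda \in \Lambda}$ be a Cauchy net in $X$; that is, given $\varepsilon > 0$, there exists $\lambda_0 \in \Lambda$ so
that $\alpha, \beta \ge \lambda_0$ implies $d(x_\alpha, x_\beta) < \varepsilon$.
Choose $\lambda_1 \in \Lambda$ so that $\alpha, \beta \ge \lambda_1$ implies that $d(x_\alpha, x_\beta) < 1$. Choose
$\lambda_2 \ge \lambda_1$ so that $\alpha, \beta \ge \lambda_2$ implies that $d(x_\alpha, x_\beta) < \frac{1}{2}$. (Note that by definition,
we can always find $\gamma_2 \in \Lambda$ so that $\alpha, \beta \ge \gamma_2$ implies that $d(x_\alpha, x_\beta) < \frac{1}{2}$;
by choosing $\lambda_2 \ge \lambda_1, \gamma_2$, it follows that $\alpha, \beta \ge \lambda_2$ implies $d(x_\alpha, x_\beta) < \frac{1}{2}$, as
required.)
Arguing as above, for $n \ge 2$, we can find $\lambda_{n+1} \ge \lambda_n$ so that $\alpha, \beta \ge \lambda_{n+1}$
implies that $d(x_\alpha, x_\beta) < \frac{1}{n+1}$.
Consider the sequence $(x_{\lambda_n})_{n=1}^\infty$. It is easily seen that this sequence is
Cauchy, and so by hypothesis, $x = \lim_{n \to \infty} x_{\lambda_n} \in X$ exists. Let $\varepsilon > 0$ and
choose $N > \frac{2}{\varepsilon}$ so that $n \ge N$ implies that $d(x_{\lambda_n}, x) < \frac{\varepsilon}{2}$.
If $\lambda \ge \lambda_N$, then
\[ d(x_\lambda, x) \le d(x_\lambda, x_{\lambda_N}) + d(x_{\lambda_N}, x) < \frac{1}{N} + \frac{\varepsilon}{2} < \varepsilon. \]
That is, $\lim_\lambda x_\lambda = x$.
(ii) implies (i). Since every sequence in $X$ is also a net, this is clear.
□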

My friends tell me I have an intimacy problem. But they don’t really know me.
Garry Shandling

Exercises for Section 3.

Question 1. This question builds upon Question 4 of Chapter 2.


Let H be an infinite-dimensional, separable Hilbert space and suppose that
T ∈ B(H) is a finite-rank operator; that is, dim ran T < ∞. Prove that there
exists a finite-dimensional subspace M ⊆ H such that T M ⊆ M and T M⊥ ⊆ M⊥ .
Such a subspace is said to be reducing for T .

Question 2.
Let $\mathcal{H}$ be a complex Hilbert space, and let $\mathcal{M} \subseteq \mathcal{H}$ be a closed subspace of $\mathcal{H}$.
Denote by $P$ the orthogonal projection of $\mathcal{H}$ onto $\mathcal{M}$. Given $T \in \mathcal{B}(\mathcal{H})$, we may
write the operator matrix for $T$ relative to the decomposition $\mathcal{H} = \mathcal{M} \oplus \mathcal{M}^\perp$ as
follows:
\[ T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix}, \]
where
• $T_{11} = P T P|_{\mathcal{M}}$,
• $T_{12} = P T (I - P)|_{\mathcal{M}^\perp}$,
• $T_{21} = (I - P) T P|_{\mathcal{M}}$ and
• $T_{22} = (I - P) T (I - P)|_{\mathcal{M}^\perp}$.
(a) Prove that the operator matrix of $P$ relative to this decomposition is
\[ \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}. \]
(b) Show that if $R \in \mathcal{B}(\mathcal{H})$ has the decomposition $R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix}$ relative to
the same decomposition of $\mathcal{H}$, then
\[ RT = \begin{bmatrix} R_{11} T_{11} + R_{12} T_{21} & R_{11} T_{12} + R_{12} T_{22} \\ R_{21} T_{11} + R_{22} T_{21} & R_{21} T_{12} + R_{22} T_{22} \end{bmatrix}. \]
In other words, this behaves just like multiplication of scalar matrices.
(c) Prove that $\|T\| \ge \|T_{ij}\|$, $1 \le i, j \le 2$.
(d) Prove that
\[ \|T\| \le \max(\|T_{11}\|, \|T_{22}\|) + \max(\|T_{12}\|, \|T_{21}\|). \]
(e) We say that $\mathcal{M}$ is invariant for $T$ if and only if $T\mathcal{M} \subseteq \mathcal{M}$. Prove that $\mathcal{M}$
is invariant for $T$ if and only if $T_{21} = 0$, which in turn happens if and only
if $(I - P) T P = 0$.
(f) Prove that $\mathcal{M}$ is reducing for $T$ if and only if $T_{12} = 0$ and $T_{21} = 0$, which
in turn happens if and only if $TP = PT$.
(g) Suppose that $\kappa \in \mathbb{C}$ and that $RT = TR$. Prove that $\ker(T - \kappa I)$ is
invariant for $R$. (We say that $\ker(T - \kappa I)$ is hyper-invariant for $T$. That
is, it is invariant for all operators that commute with $T$.)

Question 3.
Suppose that H is a Hilbert space and that T ∈ B(H). Let {eα }α∈Λ be an onb
for H. Prove that if R ∈ B(H) and Reα = T eα for all α ∈ Λ, then T = R.
In other words, the action of a bounded linear operator on H is entirely deter-
mined by its action on an onb.

Question 4.
Prove that if F ∈ B(H) has rank N ≥ 1, then F is a sum of N rank-one
operators.

4. Topological Vector Spaces

Someday I want to be rich. Some people get so rich they lose all
respect for humanity. That’s how rich I want to be.
Rita Rudner

4.1. Let $\mathcal{H}$ be an infinite-dimensional Hilbert space. The norm topology on
$\mathcal{B}(\mathcal{H})$ is but one example of an interesting topology we can place on this set. We are
also interested in studying certain weak topologies on $\mathcal{B}(\mathcal{H})$ generated by a family of
functions. The topologies that we shall obtain are not induced by a metric obtained
from a norm. In order to gain a better understanding of the nature of the topologies we
shall obtain, we now turn our attention to the notion of a topological vector space.

4.2. Definition. Let W be a vector space over the field K, and let T be a
topology on W. We say that the topology T is compatible with the vector space
structure on W if the maps
σ : W ×W → W
(x, y) 7→ x + y
and
µ: K×W →W
(k, x) 7→ kx
are continuous, where K × W and W × W carry their respective product topologies.
A topological vector space (abbreviated TVS) is a pair (W, T ) where W is
a vector space with a compatible Hausdorff topology. Informally, we refer to W as
the topological vector space.

4.3. Remark. Not all authors require T to be Hausdorff in the above definition.
However, for all spaces of interest to us, the topology will indeed be Hausdorff.
Furthermore, one can always pass from a non-Hausdorff topology to a Hausdorff
topology via a natural quotient map. (See Appendix T.)

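4.4. Example. Let $(\mathfrak{X}, \|\cdot\|)$ be any normed linear space, and let $\mathcal{T}$ denote the
norm topology. Suppose $(x, y) \in \mathfrak{X} \times \mathfrak{X}$ and $\varepsilon > 0$, and let $(x_\alpha, y_\alpha)_{\alpha \in \Lambda}$ be a net in $\mathfrak{X} \times \mathfrak{X}$
so that $\lim_\alpha (x_\alpha, y_\alpha) = (x, y)$. Then there exists $\alpha_0 \in \Lambda$ so that $\alpha \ge \alpha_0$ implies
$\|x_\alpha - x\| < \varepsilon/2$, $\|y_\alpha - y\| < \varepsilon/2$. But then $\alpha \ge \alpha_0$ implies $\|x_\alpha + y_\alpha - (x + y)\| \le
\|x_\alpha - x\| + \|y_\alpha - y\| < \varepsilon$. In other words, $\sigma(x_\alpha, y_\alpha)$ tends to $\sigma(x, y)$ and so $\sigma$ is
continuous.
Similarly, if $(k, x) \in \mathbb{K} \times \mathfrak{X}$, let $(k_\alpha, x_\alpha)_{\alpha \in \Lambda}$ be a net so that
$\lim_\alpha (k_\alpha, x_\alpha) = (k, x)$. Then we can find $\alpha_0 \in \Lambda$ so that $\alpha \ge \alpha_0$ implies $|k_\alpha - k| < 1$,
and so $|k_\alpha| < |k| + 1$. Next we can find $\alpha_1$ so that $\alpha \ge \alpha_1$ implies $|k_\alpha - k| < \varepsilon / (2(\|x\| + 1))$,
and $\alpha_2$ so that $\alpha \ge \alpha_2$ implies $\|x_\alpha - x\| < \varepsilon / (2(|k| + 1))$. Choosing $\alpha \ge \alpha_0$, $\alpha_1$ and
$\alpha_2$ we get
\[ \begin{aligned} \|\mu(k_\alpha, x_\alpha) - \mu(k, x)\| &= \|k_\alpha x_\alpha - k x\| \\ &\le \|k_\alpha x_\alpha - k_\alpha x\| + \|k_\alpha x - k x\| \\ &\le |k_\alpha| \, \|x_\alpha - x\| + |k_\alpha - k| \, \|x\| \\ &\le \varepsilon/2 + \varepsilon/2 = \varepsilon. \end{aligned} \]
This proves that $\mu$ is continuous. Hence $\mathfrak{X}$ is a TVS with the norm topology.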

4.5. Example. As a concrete example of the situation in Example 4.4, let
$n \ge 1$ be an integer and for $(x_1, x_2, ..., x_n) \in \mathbb{C}^n$, set $\|(x_1, x_2, ..., x_n)\|_\infty = \max\{|x_k| :
1 \le k \le n\}$. Then $\mathbb{C}^n$ is a TVS with the induced norm topology. Of course, in this
example, the norm topology coincides with the usual topology on $\mathbb{C}^n$ coming from
the Euclidean norm $\|(x_1, x_2, ..., x_n)\|_2 = \big( \sum_{k=1}^n |x_k|^2 \big)^{\frac{1}{2}}$, since $\|\cdot\|_\infty$ and $\|\cdot\|_2$ are
equivalent norms on $\mathbb{C}^n$.
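
The equivalence of the two norms amounts to the pair of inequalities $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\, \|x\|_\infty$, which are standard although not proved here. The following sketch checks them on a few sample vectors in $\mathbb{C}^3$ (the vectors and the small tolerance are our own choices).

```python
import math

def norm_inf(v):
    return max(abs(t) for t in v)

def norm_2(v):
    return math.sqrt(sum(abs(t) ** 2 for t in v))

samples = [[1 + 1j, -2, 0.5j], [0, 0, 3], [1, 1, 1]]
tol = 1e-12   # slack for floating-point rounding

checks = [
    norm_inf(v) <= norm_2(v) + tol
    and norm_2(v) <= math.sqrt(len(v)) * norm_inf(v) + tol
    for v in samples
]
print(checks)
```

The vectors $(0, 0, 3)$ and $(1, 1, 1)$ show that both constants are sharp: the first gives equality on the left, the second on the right.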

4.6. Remark. In the assignments we shall see how to construct a TVS which is
not a normed linear space. See also the discussion of Fréchet spaces in the Appendix
to Section 5.

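4.7. Remark. Let $(\mathcal{V}, \mathcal{T})$ be a TVS, and let $U \in \mathcal{U}_0$ be a nbhd of $0$ in $\mathcal{V}$. The
continuity of $\sigma : \mathcal{V} \times \mathcal{V} \to \mathcal{V}$ implies that $\sigma^{-1}(U)$ is a nbhd of $(0, 0)$ in $\mathcal{V} \times \mathcal{V}$. As such,
$\sigma^{-1}(U)$ contains a basic nbhd $N_1 \times N_2$ of $(0, 0)$, where $N_1, N_2 \in \mathcal{U}_0$ are open (see
the Appendix). But if $N = N_1 \cap N_2$, then $N \in \mathcal{U}_0$ and $N \times N \subseteq N_1 \times N_2 \subseteq \sigma^{-1}(U)$.
Thus for all $U \in \mathcal{U}_0$ there exists an open set $N \in \mathcal{U}_0$ so that $\sigma(N \times N) = N + N :=
\{m + n : m, n \in N\} \subseteq U$.
Similarly, we can find a nbhd $V_\varepsilon(0)$ of $0$ in $\mathbb{K}$ and $N \in \mathcal{U}_0^{\mathcal{V}}$ open so that $V_\varepsilon(0) \times N \subseteq \mu^{-1}(U)$,
or equivalently,
\[ \{kn : n \in N, 0 \le |k| < \varepsilon\} \subseteq U. \]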

It is also worth observing that if $U \in \mathcal{U}_0$, then $\mathcal{V} = \cup_{n \ge 1} nU$. Indeed, if $x \in \mathcal{V}$,
then consider the continuous function $f : \mathbb{R} \to \mathcal{V}$ defined by $f(t) = tx$, so that
$f(0) = 0$. The continuity of $f$ at $0$ implies that there exists $\delta > 0$ so that $|t| < \delta$
forces $f(t) = tx \in U$. Choosing $n > \frac{1}{\delta}$ then yields that $\frac{1}{n} x \in U$, or in other words
that $x \in nU$. Indeed, this gives us the slightly stronger and useful conclusion that
$x \in nU$ for all $n > \frac{1}{\delta}$. As such, if $(k_n)_n$ is a sequence in $\mathbb{N}$ with $\lim_{n \to \infty} k_n = \infty$,
then
\[ \mathcal{V} = \cup_{n \ge 1} k_n U. \]
This phenomenon is often referred to by saying that any nbhd of $0$ in a TVS is
absorbing.

4.8. Definition. A nbhd $N$ of $0$ in a TVS $\mathcal{W}$ is called balanced if $kN \subseteq N$
for all $k \in \mathbb{K}$ satisfying $|k| \le 1$.

4.9. Example. Let (X, k · k) be a normed linear space. For all δ > 0, Vδ (0) =
{x ∈ X : kxk < δ} is a balanced nbhd of 0, and if U ∈ U0X , then there exists δ > 0
such that Vδ (0) ⊆ U .

4.10. Proposition. Let $(\mathcal{W}, \mathcal{T})$ be a TVS. Every nbhd of $0$ contains a balanced,
open nbhd of $0$.
Proof. By Remark 4.7, given $U \in \mathcal{U}_0$, we can find $\varepsilon > 0$ and $N \in \mathcal{U}_0$ open such
that $k \in \mathbb{K}$, $0 < |k| < \varepsilon$ implies $kN \subseteq U$. Since multiplication by a non-zero scalar
is a homeomorphism, each $kN$ is open.
Let $M = \cup_{0 < |k| < \varepsilon} kN$. Then $M \subseteq U$ and $M \supseteq \frac{\varepsilon}{2} N$, so $M \in \mathcal{U}_0$. A routine
calculation shows that $M$ is balanced. It is also open, being the union of open sets.
□

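4.11. Suppose $(\mathcal{W}, \mathcal{T})$ is a TVS, $w_0 \in \mathcal{W}$ and $k_0 \in \mathbb{K}$. Define
\[ \tau_{w_0} : \mathcal{W} \to \mathcal{W} \text{ via } \tau_{w_0}(x) = w_0 + x. \]
By continuity of addition, we get that $\tau_{w_0}$ is continuous, and clearly $\tau_{w_0}$ is a bijection.
But $\tau_{w_0}^{-1} = \tau_{-w_0}$ is also a translation, and therefore it is continuous by the above
argument. That is, $\tau_{w_0}$ is a homeomorphism.
This simple observation underlies a particularly useful fact about TVS’s, namely:
\[ N \in \mathcal{U}_0^{\mathcal{W}} \text{ if and only if } w_0 + N \in \mathcal{U}_{w_0}^{\mathcal{W}}. \]
That is,
the nbhd system at any point in $\mathcal{W}$ is determined
by the nbhd system at $0$.
If $0 \ne k_0 \in \mathbb{K}$, then $\lambda_{k_0} : \mathcal{W} \to \mathcal{W}$ defined by $\lambda_{k_0}(x) = k_0 x$ is also a continuous
bijection (by continuity of scalar multiplication) and has continuous inverse $\lambda_{k_0^{-1}}$.
Thus
\[ N \in \mathcal{U}_0^{\mathcal{W}} \text{ if and only if } k_0 N \in \mathcal{U}_0^{\mathcal{W}} \text{ for all } 0 \ne k_0 \in \mathbb{K}. \]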

The following result shows that the assumption that a TVS topology be Haus-
dorff may be replaced with a weaker assumption - namely that points be closed (i.e.
that T be T1 ) - from which the Hausdorff condition follows.
4.12. Proposition. Let V be a vector space with a topology T for which
(i) addition is continuous;
(ii) scalar multiplication is continuous; and
(iii) points in V are closed in the T -topology.
Then T is a Hausdorff topology and (V, T ) is a TVS.
Proof. Let x ≠ y ∈ V. Then {y} is closed (i.e. V\{y} is open) and so we can
find an open nbhd U ∈ Ux of x so that y ∉ U . As above, by continuity of addition,
translation is a homeomorphism of V and so U = x + U0 for some open nbhd U0 of 0.
Also by continuity of addition, there exists an open nbhd V of 0 so that V + V ⊆ U0 .
By continuity of scalar multiplication, −V is again an open nbhd of 0, and hence
W = V ∩ (−V ) is an open nbhd of 0 as well, with W = −W and W + W ⊆ V + V ⊆ U0 .
Suppose that (x + W ) ∩ (y + W ) ≠ ∅. Then there exist w1 , w2 ∈ W so that
x + w1 = y + w2 , i.e. x + w1 − w2 = y. But w1 − w2 ∈ W + W , so that y ∈
x + (W + W ) ⊆ x + U0 = U , a contradiction. Hence (x + W ) ∈ Ux , (y + W ) ∈ Uy
are disjoint open nbhds of x and y respectively, and (V, T ) is Hausdorff.
□

4.13. Proposition. Let (W, T ) be a TVS and Y be a linear manifold in W.
Then
(a) Y is a TVS with the relative topology induced by T ; and
(b) the closure Ȳ of Y is a subspace of W.
Proof.
(a) This is clear, since the continuity of σ|Y and µ|Y is inherited from the
continuity of σ and µ.
(b) Suppose y, z ∈ Ȳ and k ∈ K. Choose a net (yα , zα ) ∈ Y × Y so that
limα (yα , zα ) = (y, z). By continuity of σ on W × W, yα + zα → y + z.
But yα + zα ∈ Y for all α, and so y + z ∈ Ȳ. Similarly, if we choose a net
(kα , yα ) in K × Y which converges to (k, y), then the continuity of µ implies
that kα yα → ky. Since kα yα ∈ Y for all α, ky ∈ Ȳ.
□

4.14. Exercise. Let (V, T ) be a TVS. Prove the following.
(a) If C ⊆ V is convex, then so is its closure C̄.
(b) If E ⊆ V is balanced, then so is its closure Ē.

4.15. Definition. Let (V, T ) be a TVS, and let (xλ )λ be a net in V. We say
that (xλ )λ is a Cauchy net if for all U ∈ U0 there exists λ0 ∈ Λ so that λ1 , λ2 ≥ λ0
implies that xλ1 − xλ2 ∈ U .
We say that a subset K ⊆ V is Cauchy complete if every Cauchy net in K
converges to some element of K.
We pause to verify that if (xλ )λ is a net in V which converges to some x ∈ V,
then (xλ )λ is a Cauchy net. Indeed, let U ∈ U0 and choose a balanced, open nbhd
N ∈ U0 so that N + N = N − N ⊆ U . Also, choose λ0 ∈ Λ so that λ ≥ λ0 implies
that xλ ∈ x + N . If λ1 , λ2 ≥ λ0 , then xλ1 − xλ2 = (xλ1 − x) − (xλ2 − x) ∈ N − N ⊆ U .
Thus (xλ )λ is a Cauchy net.

4.16. Example. If (X, ‖ · ‖) is a normed linear space, then X is Cauchy complete
if and only if X is complete. Indeed, the topology here being a metric topology, we
need only consider sequences instead of general nets.

4.17. Lemma. Let V be a TVS and K ⊆ V be Cauchy complete. Then K is closed in
V.
Proof. Suppose that z ∈ K̄, the closure of K. For each U ∈ Uz , there exists yU ∈ K so that yU ∈ U .
The family {U : U ∈ Uz } forms a directed set under reverse-inclusion, namely:
U1 ≤ U2 if U2 ⊆ U1 . Thus (yU )U ∈Uz is a net in K. By definition, this net converges
to the point z, i.e. limU yU = z. (Since V is Hausdorff, this is the unique limit
point of the net (yU )U .) From the comments following Definition 4.15, (yU )U ∈Uz is
a Cauchy net. Since K is complete, this net converges to an element of K, which by
uniqueness of limits must be z. Thus z ∈ K, and hence K is closed.
□

4.18. Quotient spaces. Let (V, T ) be a TVS and W be a closed subspace of


V. Then V/W exists as a quotient space of vector spaces. Let q : V → V/W denote
the canonical quotient map.
We can establish a topology on V/W using the T topology on V by defining a
subset G ⊆ V/W to be open if q −1 (G) is open in V.
• If {Gλ }λ ⊆ V/W and q −1 (Gλ ) is open in V for all λ, then q −1 (∪λ Gλ ) =
∪λ q −1 (Gλ ) is open, and thus ∪λ Gλ is open in V/W.
• If G1 , G2 ⊆ V/W and q −1 (Gi ) is open in V, i = 1, 2, then q −1 (G1 ∩ G2 ) =
q −1 (G1 ) ∩ q −1 (G2 ) is open in V, whence G1 ∩ G2 is open in V/W.
• Clearly ∅ = q −1 (∅) and V = q −1 (V/W) are open in V, so that ∅, V/W
are open in V/W and the latter is a topological space.
We refer to this topology on V/W as the quotient topology. The quotient
map is continuous with respect to the quotient topology, by design. In fact, the
quotient topology is the largest topology on V/W which makes q continuous.
We begin by proving that q is an open map. That is, if G ⊆ V is open, then
q(G) is open in V/W.
Indeed, for each w ∈ W, the set G + w is open in V, being a translate of the
open set G. Hence G + W = ∪w∈W G + w is open in V, being the union of open sets.
But
G + W = q −1 (q(G)),
so that q(G) is open in V/W by definition.
To see that addition is continuous in V/W, let x + W, y + W ∈ V/W and let E be a
nbhd of (x + y) + W in V/W. Then q −1 (E) is a nbhd of x + y in V. Choose open
nbhds Ux of x and Uy of y in V so that r ∈ Ux , s ∈ Uy implies that r + s ∈ q −1 (E).
Note that x + W ∈ q(Ux ), y + W ∈ q(Uy ) and that q(Ux ), q(Uy ) are open in V/W
by the argument above. If a + W ∈ q(Ux ), b + W ∈ q(Uy ), then a + W = g + W and
b + W = h + W for some g ∈ Ux , h ∈ Uy . Thus
(a + b) + W = (g + h) + W ∈ q(q −1 (E)) ⊆ E.
Hence addition is continuous.
That scalar multiplication is continuous follows from a similar argument which
is left to the reader.
Finally, to see that the resulting quotient topology is Hausdorff, it suffices (by
Proposition 4.12) to show that points in V/W are closed. Let x + W ∈ V/W. Then
q⁻¹(x + W) = {x + w : w ∈ W} is closed in V, being a translation of the closed
subspace W. Hence the complement C = V\{x + w : w ∈ W} of x + W is open in
V. But then q(C) is open, since q is an open map, and q(C) is the complement of
x + W.

Finite-dimensional topological vector spaces. Our present goal is to prove


that there is only one topology that one can impose upon a finite-dimensional vector
space V to make it into a TVS. We begin with the one dimensional case.
4.19. Lemma. Let (V, T ) be a one-dimensional TVS over K. Let {e} be a
basis for V. Then V is homeomorphic to K via the map
τ : K → V, k ↦ ke.

Proof. The map τ is clearly a bijection, and the continuity of scalar multiplication
in a TVS makes it continuous as well. We shall demonstrate that the inverse map
τ⁻¹(ke) = k is also continuous. To do this, it suffices to show that if limλ kλ e = 0,
then limλ kλ = 0. (Why?)
Let δ > 0. Then δe ≠ 0, and as V is Hausdorff, we can find a nbhd U of 0 so
that δe ∉ U . By Proposition 4.10, U contains a balanced nbhd V of 0. Obviously,
δe ∉ V . Since limλ kλ e = 0, there exists λ0 so that λ ≥ λ0 implies that kλ e ∈ V .
Suppose that there exists β ≥ λ0 with |kβ | ≥ δ. Then |δ/kβ | ≤ 1, and so
δe = (δ/kβ ) kβ e ∈ V,
as V is balanced. This contradiction shows that λ ≥ λ0 implies that |kλ | < δ. Since
δ > 0 was arbitrary, we have shown that limλ kλ = 0.
□

4.20. Proposition. Let n ≥ 1 be an integer, and let (V, T ) be an n-dimensional
TVS over K with basis {e1 , e2 , ..., en }. The map
τ : K^n → V, (k1 , k2 , ..., kn ) ↦ Σ_{j=1}^n kj ej
is a homeomorphism.
Proof. Lemma 4.19 shows that the result is true for n = 1. We shall argue by
induction on the dimension of V.
It is clear that τ is a linear bijection, and furthermore, since V is a TVS, the
continuity of addition and scalar multiplication in V implies that τ is continuous.
There remains to show that τ⁻¹ is continuous as well.
Suppose that the result is true for 1 ≤ n < m. We shall prove that it holds for
n = m as well. To that end, let F = {e_{t1} , e_{t2} , ..., e_{tr} } for some 1 ≤ r < m, and let
E = {e1 , e2 , ..., em }\F = {e_{p1} , e_{p2} , ..., e_{ps} }.
Now Y = span E is an s-dimensional space with s < m. By our induction
hypothesis, the map (k1 , k2 , ..., ks ) ↦ Σ_{j=1}^s kj e_{pj} is a homeomorphism. It follows
that Y is complete (check!) and therefore closed, by Lemma 4.17. By the arguments
of paragraph 4.18, V/Y is a TVS and the canonical map q_Y : V → V/Y is continuous.
Moreover, {q_Y (e_{t1} ), q_Y (e_{t2} ), ..., q_Y (e_{tr} )} is a basis for V/Y. Since r < m, our induction
hypothesis once again shows that the map
ρ_Y : V/Y → K^r, Σ_{j=1}^r kj q_Y (e_{tj} ) ↦ (k1 , k2 , ..., kr )
is continuous. Thus
γ := ρ_Y ◦ q_Y : V → K^r, Σ_{i=1}^m ki ei ↦ ρ_Y (Σ_{i=1}^m ki q_Y (ei )) = (k_{t1} , k_{t2} , ..., k_{tr} )
is also continuous, being the composition of continuous functions.
To complete the proof, we first apply the above argument with F = {em } to get
that
γ1 : V → K, Σ_{i=1}^m ki ei ↦ km
is continuous, and then to F = {e1 , e2 , ..., em−1 } to get that
γ2 : V → K^{m−1}, Σ_{i=1}^m ki ei ↦ (k1 , k2 , ..., km−1 )
is continuous. Since τ⁻¹ = (γ2 , γ1 ), it too is continuous, and we are done.
□

The previous result has a number of important corollaries:

4.21. Corollary. Let n ≥ 1 be an integer and V be an n-dimensional vector


space. Then there is a unique topology T which makes V a TVS. In particular,
therefore, all norms on a finite dimensional vector space are equivalent.
Proof. Since any topology on V which makes it a TVS is determined completely
by the product topology on K^n , it is unique. If ‖ · ‖_1 and ‖ · ‖_2 are two norms on V,
then they induce metric topologies which make V into a TVS. But these topologies
coincide, from the above argument. By Proposition 1.19, the norms are equivalent.
□
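To make the equivalence of norms concrete, here is a small Python sketch (an illustration under assumed choices, not part of the proof) that samples vectors in R^n and checks the standard inequalities ‖x‖∞ ≤ ‖x‖₂ ≤ ‖x‖₁ ≤ n‖x‖∞. The function names are hypothetical, and a small tolerance is included to absorb floating-point rounding.

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    return max(abs(t) for t in x)

def check_equivalence(n, trials=1000, seed=0, tol=1e-9):
    """Sample vectors in R^n and test ||x||_inf <= ||x||_2 <= ||x||_1 <= n ||x||_inf."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        a, b, c = norm_inf(x), norm2(x), norm1(x)
        if not (a <= b + tol and b <= c + tol and c <= n * a + tol):
            return False
    return True
```

A sampled check of this kind is evidence only, of course; the corollary itself is what guarantees that such constants exist for every pair of norms on a finite-dimensional space.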

4.22. Corollary. Let V be a TVS and W be a finite-dimensional linear manifold


of V. Then W is closed in V.
Proof. This argument is embedded in the proof of Proposition 4.20; W is complete
because of the nature of the homeomorphism between W and K^n , where n is the
dimension of W. Then Lemma 4.17 implies that W is closed.
□
Recall from your Real Analysis course that the closed unit ball of (K^n , ‖ · ‖_2 )
is compact. Our next result essentially says that K^n is the only topological vector
space with this property.

4.23. Definition. A topological space (X, T ) is said to be locally compact if
each point in X has a nbhd base consisting of compact sets.

Suppose that X is locally compact and Hausdorff, and that x0 ∈ X. Then for
all U ∈ U_{x0} , there exists K ∈ U_{x0} so that K is compact and K ⊆ U . Choose G ∈ T
so that x0 ∈ G ⊆ K. Since compact subsets of a Hausdorff space are closed, we have
Ḡ ⊆ K̄ = K ⊆ U , and so the closure Ḡ of G is compact. That is, if X is
Hausdorff and locally compact, then for any U ∈ U_{x0} , there exists G ∈ T so that Ḡ
is compact and x0 ∈ G ⊆ Ḡ ⊆ U .

4.24. Example. Let n ≥ 1 be an integer and consider (K^n , ‖ · ‖_2 ). Then for
each x ∈ K^n , the collection
{B̄_ε (x) := {y ∈ K^n : ‖y − x‖_2 ≤ ε} : ε > 0}
is a nbhd base at x consisting of compact sets, so (K^n , ‖ · ‖_2 ) is locally compact.

4.25. Theorem. A TVS (V, T ) over the scalar field 𝕂 is locally compact if and only
if V is finite-dimensional.
Proof. If dim V = n < ∞, then V is homeomorphic to (𝕂^n , ‖ · ‖_2 ) by Proposition 4.20.
Since (𝕂^n , ‖ · ‖_2 ) is locally compact from above, so is V.
Conversely, suppose that V is locally compact. Choose K ∈ U0 compact. Using
Remark 4.7, we can find a nbhd N ∈ U_0^V such that N + N ⊆ K. By replacing N
with its interior if necessary, we may assume without loss of generality that N is
open. Now K ⊆ ∪_{x∈K} (x + N ). Since the latter is an open cover of K, we can find
x1 , x2 , ..., xr ∈ K so that
K ⊆ ∪_{i=1}^r (xi + N ) = {x1 , x2 , ..., xr } + N.
Let M = span{x1 , x2 , ..., xr }. Consider the quotient map q : V → V/M. As we
have seen, q is both continuous and open. Furthermore,
q(K) ⊆ ∪_{i=1}^r q(xi + N ) = ∪_{i=1}^r q(N ) = q(N ) ⊆ q(K),
as q(xi ) = 0 for all 1 ≤ i ≤ r.
Since N + N ⊆ K, we see that
2 q(K) ⊆ q(K) + q(K) = q(N ) + q(N ) ⊆ q(K).
By a simple induction argument, we see that 2^m q(K) ⊆ q(K) for all m ≥ 1.
But K ∈ U0 implies that K is absorbing, and thus q(V) ⊆ ∪_{m≥1} 2^m q(K) = q(K).
Since K is compact and q is continuous, we infer that q(V) = q(K) is compact. Suppose
that q(V) ≠ {0}. Then q(V) contains a one-dimensional subspace 𝕂(y + M) for
some y ∈ V \ M. By Corollary 4.22, 𝕂(y + M) is closed in q(V). Since q(V) is
compact, so is 𝕂(y + M). By Lemma 4.19, 𝕂(y + M) is homeomorphic to 𝕂, which
forces 𝕂 to be compact as well, which is absurd.
Hence q(V) = {0}, or in other words, V = M, which is finite-dimensional. This
completes the proof.
□
An interesting and useful consequence of this result is the following.

4.26. Corollary. Let (X, ‖ · ‖) be a NLS. Then the closed unit ball X1 of X is
compact if and only if X is finite-dimensional.
Proof. First suppose that X1 is compact. If U ∈ U_0^X is any nbhd of 0, then there
exists δ > 0 so that ‖x‖ < 2δ implies that x ∈ U . But then Xδ ⊆ U , and Xδ = δX1
is a compact nbhd of 0, being a homeomorphic image of X1 . Since translation is a
homeomorphism, every point of X has a nbhd base of compact sets; that is, X is locally compact,
hence finite-dimensional, by Theorem 4.25.
Conversely, suppose that X is finite-dimensional. By Theorem 4.25, X is locally
compact, and so there exists a compact nbhd K of 0. As above, there exists
δ > 0 so that Xδ ⊆ K. Since Xδ is a closed subset of a compact set, it is compact.
Since X1 = (δ⁻¹)Xδ is a homeomorphic image of a compact set, it too is compact.
□
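By way of contrast with the finite-dimensional case, the sketch below (Python, with the sequence space ℓ² assumed as an example that does not appear in the corollary itself) computes distances between the standard unit vectors e_n. All pairwise distances equal √2, so (e_n)_n has no Cauchy subsequence, and hence no convergent subsequence, in the closed unit ball. Vectors are truncated to finitely many coordinates, which does not affect these particular distances; the helper names are hypothetical.

```python
import math

def basis_vector(n, length):
    """The n-th standard unit vector e_n, truncated to `length` coordinates."""
    return [1.0 if i == n else 0.0 for i in range(length)]

def dist2(x, y):
    """Euclidean (l^2) distance between two finite coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pairwise_distances(count):
    """Distances ||e_i - e_j||_2 for 0 <= i < j < count."""
    vecs = [basis_vector(n, count) for n in range(count)]
    return [dist2(vecs[i], vecs[j])
            for i in range(count) for j in range(i + 1, count)]
```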

4.27. Definition. Let (V, TV ) and (W, TW ) be topological vector spaces and
suppose that f : V → W is a (not necessarily linear) map. We say that f is
uniformly continuous if, given U ∈ U_0^W , there exists N ∈ U_0^V so that x − y ∈ N
implies that f (x) − f (y) ∈ U .

4.28. The definition of uniform continuity given here derives from the fact that
the collection B = {B(U ) = {(x, y) : x − y ∈ U } : U ∈ U_0^W } defines what is
known as a uniformity on the TVS (W, TW ) whose corresponding uniform topology
coincides with the initial topology TW . The interested reader is referred to the book
by Willard [Wil70] and to the books of Kadison and Ringrose [KR83] for a more
complete development along these lines. We shall focus only upon that part of the
theory which we require in this text.
4.29. Let us verify that in the case where (X, ‖ · ‖_X ) and (Y, ‖ · ‖_Y ) are normed
linear spaces, our new notion of uniform continuity coincides with our metric space
notion.
Observe first that if (Z, ‖ · ‖) is a general normed linear space and δ > 0, then
x − y ∈ V_δ^Z (0) if and only if ‖x − y‖ < δ.
Suppose f : X → Y is uniformly continuous in the sense of Definition 4.27.
Given ε > 0, V_ε^Y (0) = {y ∈ Y : ‖y‖_Y < ε} ∈ U_0^Y and so there exists N ∈ U_0^X such
that x − y ∈ N implies f (x) − f (y) ∈ V_ε^Y (0). But N ∈ U_0^X implies that there exists
δ > 0 so that V_δ^X (0) ⊆ N . Thus ‖x − y‖_X < δ implies that x − y ∈ N , and thus
f (x) − f (y) ∈ V_ε^Y (0), i.e. ‖f (x) − f (y)‖_Y < ε. This is the standard (metric) notion
of uniform continuity in a normed space.
Conversely, suppose that f : X → Y is uniformly continuous in the standard
metric sense. Let U ∈ U_0^Y . Then there exists ε > 0 so that V_ε^Y (0) ⊆ U . By
hypothesis, there exists δ > 0 so that ‖x − y‖_X < δ implies ‖f (x) − f (y)‖_Y < ε, and
hence x − y ∈ V_δ^X (0) implies that f (x) − f (y) ∈ V_ε^Y (0) ⊆ U . That is, f is uniformly
continuous in the sense of Definition 4.27.

It is useful to extend our notion of uniformly continuous functions between
topological vector spaces to functions defined only upon a subset (not necessarily
a subspace) of the domain space V.

4.30. Definition. If (V, TV ) and (W, TW ) are topological vector spaces and
C ⊆ V, then f : C → W is uniformly continuous if for all U ∈ U_0^W there exists
N ∈ U_0^V such that x, y ∈ C and x − y ∈ N implies f (x) − f (y) ∈ U .

4.31. Example. Now (R, | · |) is a normed linear space, and so the comments of
paragraph 4.29 apply. The function f : (0, 1) → R defined by f (x) = x² is therefore
uniformly continuous, whereas g : R → R defined by g(x) = x² is not.
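The contrast between these two functions can be seen numerically. The Python sketch below (an informal illustration; the function names and the grid are assumptions made for the example) uses the identity |(x + h)² − x²| = h|2x + h|: on (0, 1) this is less than 2h regardless of x, whereas on R it grows without bound as x increases with h held fixed.

```python
def gap(x, h):
    """|g(x + h) - g(x)| for g(t) = t^2."""
    return abs((x + h) ** 2 - x ** 2)

def max_gap_on_unit_interval(h, steps=1000):
    """Largest gap over grid points x for which x and x + h both lie in (0, 1)."""
    xs = [i / steps for i in range(1, steps)]
    return max(gap(x, h) for x in xs if x + h < 1)

# On (0, 1) the gap is controlled by h alone; on R it is not.
small_everywhere = max_gap_on_unit_interval(0.01) < 2 * 0.01
large_far_out = gap(1000.0, 0.01) > 1.0
```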
Another way in which uniform continuity in the TVS setting extends the notion
of uniform continuity in the metric setting is evinced by the following:
4.32. Proposition. Let (V, TV ) and (W, TW ) be topological vector spaces and
f : V → W be uniformly continuous. Then f is continuous on V.
Proof. Let x0 ∈ V. Let U ∈ U_{f(x0)}^W . Then by paragraph 4.11, U = f (x0 ) + U0
where U0 ∈ U_0^W . By hypothesis, there exists N0 ∈ U_0^V so that x − x0 ∈ N0 implies
that f (x) − f (x0 ) ∈ U0 . That is, x ∈ x0 + N0 implies f (x) ∈ f (x0 ) + U0 = U . Since
N := x0 + N0 ∈ U_{x0}^V , we see that f is continuous at x0 . But x0 ∈ V was arbitrary,
and so f is continuous on V.
□

4.33. Theorem. Let (V, TV ) and (W, TW ) be topological vector spaces over K.
Suppose that T : V → W is linear. The following are then equivalent:
(a) there exists x0 ∈ V so that T is continuous at x0 ; and
(b) T is uniformly continuous on V.
Proof. By Proposition 4.32, it suffices to prove that (a) implies (b). To that end,
suppose that T is continuous at x0 and let U0 ∈ U_0^W . Then U := T x0 + U0 ∈ U_{Tx0}^W .
By continuity of T at x0 , there exists N ∈ U_{x0}^V so that T (N ) ⊆ U . But N = x0 + N0
for some N0 ∈ U_0^V . Now if z ∈ N0 , then x0 + z ∈ N , and so T (x0 + z) = T x0 + T z ∈
T x0 + U0 . That is, T z ∈ U0 .
In particular, if x − y ∈ N0 , then T (x − y) = T x − T y ∈ U0 , and so T is uniformly
continuous on V.
□

4.34. Given vector spaces V and W over C, denote by V_R and W_R the same
spaces of vectors, viewed as vector spaces over R. Observe that if (V, TV ) and
(W, TW ) are topological vector spaces over C and T : V → W is conjugate-linear
(i.e. T is additive and T (kx) = k̄ T x for all x ∈ V and k ∈ C, where k̄ denotes the
complex conjugate of k), then T : V_R → W_R is linear (over R).
That is, T (kx + y) = kT x + T y for all x, y ∈ V_R and k ∈ R. Moreover, the topologies
TV and TW are real-vector space topologies for V_R and W_R respectively (as well as
C-vector space topologies for V and W).
From this it follows that T : V → W is continuous if and only if T : (V_R , TV ) →
(W_R , TW ) is continuous, and by the above Theorem, this happens precisely when T
is uniformly continuous on V_R (or equivalently on V, since the definition of uniform
continuity does not depend upon scalar multiplication, but only upon “subtraction”).
In other words, Theorem 4.33 holds for conjugate-linear maps as well.


4.35. Corollary. Let (V, TV ) and (W, TW ) be topological vector spaces over K
and suppose that dim V = n < ∞. If T : V → W is linear, then T is continuous.
Proof. By Theorem 4.33 and Proposition 4.32, it suffices to prove that T is continuous
at 0. Let {e1 , e2 , ..., en } be a basis for V, and suppose that (xλ )λ∈Λ is a net
in V which converges to 0. For each λ ∈ Λ we may express xλ as a unique linear
combination of the ej ’s, say
xλ = kλ,1 e1 + kλ,2 e2 + · · · + kλ,n en .
By Proposition 4.20, limλ xλ = 0 implies that for 1 ≤ j ≤ n, limλ kλ,j = 0.
Now T xλ = Σ_{j=1}^n kλ,j T ej , λ ∈ Λ. But addition and scalar multiplication in
(W, TW ) are continuous, and limλ kλ,j = 0, so
limλ T xλ = limλ Σ_{j=1}^n kλ,j T ej = Σ_{j=1}^n 0 T ej = 0 = T (limλ xλ ).
It follows that T is continuous at 0, as was required.
□
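In the special case where V = R^n and W = R^m carry their Euclidean norms (an assumption made only for this illustration), the computation in the proof yields an explicit estimate: ‖T x‖₂ ≤ (Σ_j ‖T e_j‖₂) · max_j |k_j| for x = Σ_j k_j e_j. The Python sketch below checks this bound on random samples; the matrix representation and all function names are hypothetical.

```python
import math
import random

def apply_map(matrix, x):
    """Apply the linear map given by `matrix` (a list of rows) to the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

def norm2(v):
    return math.sqrt(sum(t * t for t in v))

def continuity_constant(matrix):
    """C = sum_j ||T e_j||_2, where T e_j is the j-th column of the matrix."""
    n = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n)]
    return sum(norm2(col) for col in columns)

def check_bound(matrix, trials=500, seed=2, tol=1e-9):
    """Sample x and test ||T x||_2 <= C * max_j |x_j| (up to rounding)."""
    rng = random.Random(seed)
    n = len(matrix[0])
    c = continuity_constant(matrix)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        if norm2(apply_map(matrix, x)) > c * max(abs(t) for t in x) + tol:
            return False
    return True
```

The bound shows directly that coordinates tending to 0 force T x to tend to 0, which is the content of the limit computation above.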
If we restrict our attention to subsets of V we get:
4.36. Proposition. Let V, W be topological vector spaces and T : V → W be
linear. Suppose that 0 ∈ C ⊆ V is balanced and convex. If T |C is continuous at 0,
then T |C is uniformly continuous.
Proof. Our assumption is that T |C is continuous at 0, and thus for all U ∈ U_0^W ,
there exists N ∈ U_0^V so that x ∈ C ∩ N implies T x ∈ (1/2)U .
By Proposition 4.10, every nbhd N ∈ U_0^V contains a balanced nbhd N0 , and so
by replacing N by N0 if necessary, we may assume that N is balanced.
Now suppose that x, y ∈ C and that x − y ∈ N . Then C balanced implies
that −y ∈ C, and C convex implies that (1/2)x + (1/2)(−y) = (1/2)(x − y) ∈ C. Since N is
balanced, (1/2)(x − y) ∈ N and so T ((1/2)(x − y)) = (1/2)(T x − T y) ∈ (1/2)U . Hence x, y ∈ C,
x − y ∈ N implies T x − T y ∈ U . That is, T is uniformly continuous on C.
□
If A, B, and C are sets with A ⊆ B, and if f : A → C is a map, then we say that
the map g : B → C extends f (or that g is an extension of f ) if g|A = f .

4.37. Proposition. Suppose that V and W are topological vector spaces and
that W is Cauchy complete. If X ⊆ V is a linear manifold and T0 : X → W is
continuous and linear, then T0 extends to a continuous linear map T : X̄ → W,
where X̄ denotes the closure of X in V.
Proof. Let x ∈ X̄ and choose a net (xλ )λ∈Λ1 in X so that limλ xλ = x. Clearly, if T is to
be continuous, we shall need T x = limλ T xλ = limλ T0 xλ . The issue is whether or not
this limit exists and is independent of the choice of (xλ )λ .
Now (xλ )λ is a Cauchy net. Take U ∈ U_0^W . Since T0 is continuous, there exists
N ∈ U_0^X such that w ∈ N implies that T0 w ∈ U . Hence, there exists λ0 such
that λ1 , λ2 ≥ λ0 implies that xλ1 − xλ2 ∈ N and therefore that T0 xλ1 − T0 xλ2 =
T0 (xλ1 − xλ2 ) ∈ U . Thus (T0 xλ )λ is also a Cauchy net. Our assumption that W is
Cauchy complete implies that there exists z1 ∈ W (depending a priori upon (xλ )λ ) such
that z1 = limλ T0 xλ .
Suppose that (yβ )β∈Λ2 is a second net in X and that limβ yβ = x. Arguing as above, there exists
z2 ∈ W so that z2 = limβ T0 yβ . If we set
y(λ,β) := yβ , λ ∈ Λ1
x(λ,β) := xλ , β ∈ Λ2 ,
and Λ = Λ1 × Λ2 , equipped with the direction (λ1 , β1 ) ≤ (λ2 , β2 ) if λ1 ≤ λ2 and
β1 ≤ β2 , then
lim_{(λ,β)∈Λ} x(λ,β) = lim_{(λ,β)∈Λ} y(λ,β) = x.
Also, lim_{(λ,β)} T0 x(λ,β) = z1 and lim_{(λ,β)} T0 y(λ,β) = z2 . Thus lim_{(λ,β)} (x(λ,β) − y(λ,β) ) =
0 ∈ X and so by the continuity of T0 on X ,
0 = T0 0 = T0 ( lim_{(λ,β)} (x(λ,β) − y(λ,β) ) )
= lim_{(λ,β)} T0 (x(λ,β) − y(λ,β) )
= z1 − z2 .
That is, we can set T x = limλ T0 xλ and this is well-defined.
That T is linear on X̄ is left as an exercise.
Finally, to see that T is continuous on X̄, let U ∈ U_0^W and choose U1 ∈ U_0^W so
that U1 + U1 ⊆ U . Choose N ∈ U_0^X so that x ∈ N implies T x = T0 x ∈ U1 . Shrinking N
if necessary, we may assume that N = G ∩ X for some open G ∈ U_0^V .
Let M = G ∩ X̄, so that M is a nbhd of 0 in X̄. If z ∈ M , then, G being an open nbhd
of z, z = limλ xλ for some net (xλ )λ∈Λ in N .
Now T z = limλ T0 xλ , so that there exists λ0 ∈ Λ so that λ ≥ λ0 implies that
T z − T0 xλ ∈ U1 .
But T0 xλ ∈ U1 for all λ and so T z = (T z − T0 xλ ) + T0 xλ ∈ U1 + U1 ⊆ U . That
is, z ∈ M implies that T z ∈ U . Hence T is continuous at 0, and consequently T is
uniformly continuous on X̄.
□

4.38. Corollary. Suppose that X and Y are Banach spaces and that M ⊆ X is
a linear manifold. If T0 : M → Y is bounded, then T0 extends to a bounded linear
map T : M̄ → Y, where M̄ denotes the closure of M in X, and ‖T ‖ = ‖T0 ‖.
Proof. Invoking Proposition 4.37, there remains only to show that ‖T ‖ = ‖T0 ‖.
That ‖T ‖ ≥ ‖T0 ‖ is clear.
Conversely, given x ∈ M̄ with ‖x‖ = 1 and ε > 0, there exists y ∈ M so that
‖y‖ = 1 and ‖x − y‖ < ε. Then
‖T0 y − T x‖ = ‖T y − T x‖ ≤ ‖T ‖ ‖y − x‖ ≤ ‖T ‖ε.
(Recall that T is bounded since T is continuous!) Since ε > 0 was arbitrary,
sup{‖T0 y‖ : y ∈ M, ‖y‖ = 1} ≥ sup{‖T x‖ : x ∈ M̄, ‖x‖ = 1},
so ‖T0 ‖ ≥ ‖T ‖, completing the proof.
□

Appendix to Section 4.

4.39. We have only touched upon the basics of the theory of topological vector
spaces. Indeed, our interests lie much closer to the study of normed linear spaces
and Banach spaces. Our reason for developing the theory of TVS’s to this extent
is that there are many topologies that one associates to Banach spaces that are not
necessarily norm topologies, including weak and weak∗-topologies which we shall
study in Chapter 7 and beyond.
One can develop a theory of normed linear spaces, and deal with each of these
topologies on an ad hoc basis. However, we feel that the price we pay in introducing
this more general approach is compensated by having our versions of the Hahn-Banach
Theorem hold in any “locally convex space”, a special kind of TVS whose
topology, as we shall see in the next Chapter, is induced by a separating family of
seminorms.

Do you know what it means to come home at night to a woman who’ll


give you a little love, a little affection, a little tenderness? It means
you’re in the wrong house, that’s what it means.
Henny Youngman

Exercises for Section 4.

Question 1.
(a) Give an example of two homeomorphic metric spaces (X, dX ) and (Y, dY )
such that X is complete, but Y is not complete.
(b) Why is this not an issue in Corollary 4.22? What is it about the “nature”
of the homeomorphism between W and K^n in the proof of Proposition 4.20
that ensures that W is complete?

Question 2.
Let (V, T ) be a TVS. Prove the following.
(a) If C ⊆ V is convex, then so is its closure C̄.
(b) If E ⊆ V is balanced, then so is its closure Ē.

Question 3.
Let (V, T ) be a TVS, and let W ⊆ V be a closed subspace. Prove that scalar
multiplication is continuous in the quotient topology Tq on V/W.

5. Seminorms and locally convex spaces

The secret of life is honesty and fair dealing. If you can fake that,
you’ve got it made.
Groucho Marx

5.1. Our main interest in topological vector spaces is to develop the theory
of locally convex topological vector spaces, which appear naturally in defining cer-
tain weak topologies naturally associated with Banach spaces, including the Banach
space of all bounded operators on a Hilbert space. Locally convex spaces are also
the most general spaces for which (in our opinion) interesting versions of the Hahn-
Banach Theorem will be shown to apply. As we shall see in this section, there is an
intimate relation between locally convex topological vector spaces and separating
families of seminorms on the underlying vector spaces, a phenomenon to which we
now turn our attention.
5.2. Definition. Let V be a vector space over K. A seminorm on V is a map
p : V → R satisfying
(i) p(x) ≥ 0 for all x ∈ V;
(ii) p(λx) = |λ|p(x) for all x ∈ V, λ ∈ K;
(iii) p(x + y) ≤ p(x) + p(y) for all x, y ∈ V.
It follows from this definition that a norm on V is simply a seminorm which
satisfies the additional property that p(x) = 0 if and only if x = 0.

5.3. Remark. A few remarks are in order. If p is a seminorm on a vector space
V, then for all x, y ∈ V,
p(x + y) ≤ p(x) + p(y)
implies that
p(x + y) − p(x) ≤ p(y).
Equivalently, with z = x + y, p(z) − p(x) ≤ p(z − x). Interchanging the roles of x
and z, we also obtain p(x) − p(z) ≤ p(x − z) = p(z − x). Hence
|p(x) − p(z)| ≤ p(z − x).

5.4. Example. Let V = C([0, 1], C). For each x ∈ [0, 1], the map
px : V → R
defined by setting px (f ) = |f (x)| is a seminorm on V which is not a norm.

5.5. Example. Let n ≥ 1 and consider V = Mn = Mn (C). Fix 1 ≤ k, l ≤ n.


The map γkl : V → R defined by γkl ([xij ]) = |xkl | defines a seminorm on V which,
once again, is not a norm.
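The seminorm axioms of Definition 5.2 and the inequality |p(x) − p(z)| ≤ p(z − x) of Remark 5.3 can be spot-checked for the maps γ_kl. The Python sketch below does so on randomly sampled complex matrices stored as nested lists with 0-based indices; the helper names are hypothetical, and a sampled check is of course only a sanity test rather than a proof.

```python
import random

def gamma(k, l):
    """The seminorm gamma_kl(x) = |x[k][l]| on matrices stored as nested lists."""
    return lambda x: abs(x[k][l])

def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(c, x):
    return [[c * a for a in row] for row in x]

def random_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

def check_seminorm(p, n=2, trials=200, seed=1, tol=1e-12):
    """Sample-based check of subadditivity, homogeneity and |p(x) - p(y)| <= p(x - y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = random_matrix(n, rng), random_matrix(n, rng)
        c = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        if p(add(x, y)) > p(x) + p(y) + tol:
            return False
        if abs(p(scale(c, x)) - abs(c) * p(x)) > tol:
            return False
        if abs(p(x) - p(y)) > p(add(x, scale(-1, y))) + tol:
            return False
    return True

# gamma_kl vanishes on a non-zero matrix, so it is not a norm.
vanishes_on_nonzero = gamma(0, 1)([[1, 0], [0, 0]]) == 0
```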

5.6. Convexity. Recall that a subset E of a vector space V is said to be convex


if x, y ∈ E and 0 ≤ t ≤ 1 imply tx + (1 − t)y ∈ E. Geometrically, we are asking that
the line segment between any two points in E must lie in E.
It is a simple but useful fact that any linear manifold of V is necessarily convex.
Note also that if p1 , p2 , ..., pm is a family of seminorms on V, x0 ∈ V, ε > 0
and E = {x ∈ V : pj (x − x0 ) < ε, 1 ≤ j ≤ m}, then E is convex. Indeed, if x, y ∈ E
and 0 ≤ t ≤ 1, then for all 1 ≤ j ≤ m,
pj (tx + (1 − t)y − x0 ) = pj (t(x − x0 ) + (1 − t)(y − x0 ))
≤ tpj (x − x0 ) + (1 − t)pj (y − x0 )
< tε + (1 − t)ε = ε.
Thus tx + (1 − t)y ∈ E, and E is convex.
We leave it as an exercise for the reader to show that if V is a TVS and E ⊆ V
is convex, then so is its closure Ē.
Another elementary but useful observation is that if C ⊆ V is convex and
T : V → W is a linear map (where W is a second vector space), then T (C) is
convex as well. Finally, if E ⊆ V is convex, then for all r, s > 0, rE + sE = (r + s)E.
Indeed, (r/(r + s))e1 + (s/(r + s))e2 ∈ E for all e1 , e2 ∈ E, from which the desired result easily
follows.
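The identity rE + sE = (r + s)E genuinely depends on convexity. As a hedged illustration in Python, the sketch below verifies it for an interval in R (represented by its endpoints, an assumption of the example) and shows that it fails for the non-convex two-point set {0, 1}. All function names are hypothetical.

```python
def scale_interval(r, interval):
    """rE for an interval E with endpoints (a, b) and a scalar r > 0."""
    a, b = interval
    return (r * a, r * b)

def add_intervals(i1, i2):
    """Minkowski sum of two intervals, computed endpoint by endpoint."""
    return (i1[0] + i2[0], i1[1] + i2[1])

def scale_set(r, points):
    """rE for a finite set E of real numbers."""
    return {r * e for e in points}

def add_sets(a, b):
    """Minkowski sum A + B = {x + y : x in A, y in B} of finite sets."""
    return {x + y for x in a for y in b}

E = (-1, 3)                      # a convex set: an interval
lhs = add_intervals(scale_interval(2, E), scale_interval(5, E))
rhs = scale_interval(7, E)       # (2 + 5)E

D = {0, 1}                       # not convex
d_sum = add_sets(scale_set(1, D), scale_set(1, D))   # D + D
d_double = scale_set(2, D)                           # 2D
```

Here lhs and rhs agree, while D + D contains the point 1, which does not belong to 2D.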

5.7. The Minkowski functional. Let V be a TVS and suppose that E ∈ U_0^V
is convex. As we saw in the previous Chapter (see Remark 4.7), any nbhd of 0 in V
is absorbing, and thus for each x ∈ V there exists r0 > 0 so that x ∈ r0 E. This allows
us to define the map
pE : V → R, x ↦ inf{r ∈ (0, ∞) : x ∈ rE},
which we call the gauge functional or the Minkowski functional for E.
Note: the name is misleading, since the map is clearly not linear - its range is
contained in [0, ∞). By convexity of E, if x ∈ rE and 0 < r < s, then x = re for
some e ∈ E, so x = (1 − r/s)0 + (r/s)(se) ∈ co(sE) = sE. In particular, x ∈ sE for all
s > pE (x).

5.8. Definition. Let V be a vector space over K. A function p : V → R is


called a sublinear functional if it satisfies:
(i) p(x + y) ≤ p(x) + p(y) for all x, y ∈ V, and
(ii) p(rx) = rp(x) for all 0 < r ∈ R.

5.9. It is clear from the definition that every seminorm (and hence every norm)
on a vector space is a sublinear functional on that space. The converse is false in
general.
For example, the identity map κ : R → R is a sublinear functional on R. It is
not a seminorm since it is not even a non-negative valued function.

5.10. Proposition. Let W be a TVS and E ∈ U0 be convex. Then
(a) The Minkowski functional pE for E is a sublinear functional on W.
(b) If E is open, then
E = {x ∈ W : pE (x) < 1}.
(c) If E is balanced, then pE is a seminorm.
Proof. Throughout the proof, we write p for pE .
(a) Suppose that x, y ∈ W and that r, s ∈ (0, ∞) with r > p(x), s > p(y). Then
x ∈ rE, y ∈ sE and so x + y ∈ rE + sE = (r + s)E. That is,
p(x + y) ≤ r + s for all r > p(x), s > p(y).
Thus p(x + y) ≤ p(x) + p(y).
Also, if k > 0, then x ∈ rE if and only if kx ∈ krE, so that
p(kx) = inf{s > 0 : kx ∈ sE}
= inf{kr : r > 0, kx ∈ krE}
= k inf{r > 0 : x ∈ rE}
= kp(x).
Thus p is a sublinear functional, as claimed.
(b) Suppose that x ∈ E and that E is open. Since the map f : R → W
given by f (t) = tx is continuous, f⁻¹(E) is open in R and contains 1, and therefore
(1 − δ, 1 + δ) ⊆ f⁻¹(E) for some δ > 0. But then (1 + δ/2)x ∈ E, or
equivalently, x ∈ (2/(2 + δ))E, implying that p(x) ≤ 2/(2 + δ) < 1.
Conversely, suppose that p(x) < 1. Then x = re for some p(x) < r < 1
and e ∈ E. But then x = (1 − r)0 + re ∈ co E = E.
(c) Suppose now that E is balanced. First observe that if k ≠ 0, then (k/|k|)E = E.
Note that p is subadditive since it is a sublinear functional by (a). Also,
p(x) ≥ 0 for all x ∈ W by definition of p.
Finally, if k = 0, then p(kx) = p(0). But 0 ∈ rE for all r > 0, and so
p(0x) = p(0) = inf{r > 0 : 0 ∈ rE} = 0 = 0p(x).
If k ≠ 0, then
p(kx) = inf{r > 0 : kx ∈ rE}
= inf{s|k| : s > 0, kx ∈ s(|k|E)}
= inf{s|k| : s > 0, kx ∈ s(kE)}
= |k| inf{s > 0 : x ∈ sE}
= |k|p(x).
Thus p is a seminorm.
□
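For concrete convex sets the Minkowski functional can be approximated numerically. The Python sketch below is a minimal illustration under stated assumptions: E is described only by a membership test, x is assumed to lie in hi·E for the chosen upper bound hi, and bisection is justified by the observation in paragraph 5.7 that x ∈ sE for all s > pE (x). The two example sets and all names are hypothetical choices: the open ℓ¹ unit ball in R² (balanced, so pE is the ℓ¹ norm) and the interval (−1, 2) in R (convex but not balanced, so pE is sublinear without being a seminorm).

```python
def minkowski(contains, x, hi=1e6, iters=200):
    """Approximate p_E(x) = inf{r > 0 : x in rE} by bisection on r.

    `contains(v)` tests membership of v in E; x is assumed to lie in hi * E.
    """
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if contains([t / mid for t in x]):   # x in mid * E ?
            hi = mid
        else:
            lo = mid
    return hi

def l1_ball(v):
    """Open unit ball of the l^1 norm on R^2: balanced, open and convex."""
    return abs(v[0]) + abs(v[1]) < 1

def interval(v):
    """The open interval (-1, 2) in R: convex and open, but not balanced."""
    return -1 < v[0] < 2
```

With these choices one expects minkowski(l1_ball, [3.0, -4.0]) to be close to 7, while minkowski(interval, [1.0]) is close to 1/2 and minkowski(interval, [-1.0]) is close to 1, so that p(−x) ≠ p(x) in the unbalanced case.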

5.11. Proposition. Let W be a TVS and p be a seminorm on W. The following


are equivalent:
(a) p is continuous on W;
(b) there exists a set U ∈ U0W such that p is bounded above on U .
Proof.
(a) implies (b): Consider the set
E := {x ∈ W : p(x) < 1}.
Clearly 0 ∈ E, and since p is assumed to be continuous and E = p⁻¹((−∞, 1)),
E is also open. Thus p is bounded above (by 1) on the open set E ∈ U_0^W .
(b) implies (a): Suppose that p is bounded above, say by M > 0, on a
set U ∈ U_0^W . Let ε > 0. If x, y ∈ W and x − y ∈ (ε/M )U , say x − y = (ε/M )u
for some u ∈ U , then
|p(x) − p(y)| ≤ p(x − y) = p((ε/M )u) = (ε/M )p(u) ≤ ε.
Since (ε/M )U ∈ U_0^W , p is (uniformly) continuous on W.
□

5.12. Example. Recall from Example 5.4 that for each x ∈ [0, 1],
px : C([0, 1], C) → R, f ↦ |f (x)|
is a seminorm. Now B1 (0) := {f ∈ C([0, 1], C) : ‖f ‖∞ < 1} is open and f ∈ B1 (0)
implies that px (f ) = |f (x)| ≤ ‖f ‖∞ < 1.
Thus each such px is continuous on C([0, 1], C).
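The estimate behind this continuity is |px (f ) − px (g)| ≤ px (f − g) ≤ ‖f − g‖∞ . The Python sketch below checks it for two sample functions, with the sup norm replaced by a maximum over a finite grid containing x (an assumption which only gives a lower bound for the true sup norm, but which suffices here because x itself belongs to the grid). The sample functions f and g and the helper names are hypothetical.

```python
import cmath

def p(x):
    """The seminorm p_x(f) = |f(x)| on C([0, 1], C)."""
    return lambda f: abs(f(x))

def sup_norm_on_grid(f, grid):
    """max |f(t)| over the grid: a lower bound for the sup norm of f."""
    return max(abs(f(t)) for t in grid)

def f(t):
    return cmath.exp(2j * cmath.pi * t)

def g(t):
    return complex(t * t, 1 - t)

grid = [i / 100 for i in range(101)]
x = 0.25                                    # a grid point
lhs = abs(p(x)(f) - p(x)(g))
rhs = sup_norm_on_grid(lambda t: f(t) - g(t), grid)
```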

5.13. Definition. A topology T on a topological vector space W is said to be


locally convex if it admits a base consisting of convex sets. We shall write LCS
for locally convex topological vector spaces, and for the sake of brevity, we shall refer
to them as locally convex spaces.
Since the topology on W is determined by the nbhds at a single point, it suffices
to require that W admit a nbhd base at 0 consisting of convex sets; that is, given
any nbhd U ∈ U0 , there exists a convex nbhd N ∈ U0 so that N ⊆ U . In verifying
that a space is a LCS, we shall often only verify this condition.
5.14. Proposition. Let W be a TVS, and suppose that U ∈ U0 is convex. Then
U contains a balanced, open, convex nbhd of 0.
Proof. By Proposition 4.10, U contains a balanced, open nbhd H of 0. Set N =
co(H). Then U convex and H ⊆ U implies that N ⊆ U. Since H is balanced, a
routine calculation shows that N is also balanced. For any choice of $t_1, t_2, ..., t_m \in [0, 1]$
with $\sum_{k=1}^{m} t_k = 1$ and $t_m > 0$ (which can always be arranged by reindexing),
and for any $h_1, h_2, ..., h_m \in H$ we have
\[ \Big( \sum_{k=1}^{m-1} t_k h_k \Big) + t_m H \subseteq N. \]
Since H is open and $t_m > 0$, so is $\big( \sum_{k=1}^{m-1} t_k h_k \big) + t_m H$. Since N = co(H), it follows that N is
a union of open sets of this form, and hence N is also open.
Thus N is an open, balanced, convex nbhd of 0 contained in U, the existence of
which proves our claim.
□

As an immediate consequence we obtain:

5.15. Corollary. Let (V, T) be a LCS. Then V admits a nbhd base at 0
consisting of balanced, open, convex sets.

5.16. Example. Let (X, ‖·‖) be a normed linear space. For each ε > 0, the
argument of paragraph 5.6 shows that Bε(0) = {x ∈ X : ‖x‖ < ε} is convex. Since
{Bε(0) : ε > 0} is a nbhd base at 0 for the norm topology, (X, ‖·‖) is a LCS.
More concretely, (K^n, ‖·‖₂) is a LCS, as is any Hilbert space H. So is B(H).
We have already seen that the quotient of a TVS by one of its closed subspaces
is a TVS. Let us first obtain the same result for locally convex spaces.

5.17. Proposition. Let (V, T) be a LCS and W ⊆ V be a closed subspace.
Then V/W is a LCS in the quotient topology.
Proof. As mentioned above, that V/W is a TVS follows from paragraph 4.18.
There remains only to show that V/W admits a nbhd base at 0 consisting of convex
sets (see the remarks following Definition 5.13).
Let q : V → V/W denote the canonical quotient map, and let $U \in \mathcal{U}_0^{\mathcal{V}/\mathcal{W}}$. Then
$q^{-1}(U) \in \mathcal{U}_0^{\mathcal{V}}$, as q is continuous. Since V is a LCS, we can find a convex nbhd
$N \in \mathcal{U}_0^{\mathcal{V}}$ so that $0 \in N \subseteq q^{-1}(U)$. Let M = q(N). Since q is an open map, we have
$M \in \mathcal{U}_0^{\mathcal{V}/\mathcal{W}}$. Since q is linear, M is convex.
Finally, since $N \subseteq q^{-1}(U)$, M = q(N) ⊆ U, and we are done.
□

5.18. Definition. A family Γ of seminorms on a vector space W is said to be
separating if for all 0 ≠ x ∈ W there exists p ∈ Γ so that p(x) ≠ 0.

5.19. Example. Let W = C([0, 1], C) and consider Γ = {px : x ∈ Q ∩ [0, 1]},
where - as before - px (f ) = |f (x)| for all f ∈ W.
If 0 ≠ f ∈ W, then there exists y ∈ [0, 1] so that f(y) ≠ 0. By continuity of f,
there exists a nbhd N of y in [0, 1] such that f(t) ≠ 0 for all t ∈ N. Thus there exists a
rational number q ∈ N so that 0 ≠ f(q) and hence pq(f) ≠ 0. Thus Γ is a separating
family of seminorms.
5.20. Let Γ be a family of seminorms on a vector space W. For F ⊆ Γ finite,
x ∈ W and ε > 0, set
N (x, F, ε) = {y ∈ W : p(x − y) < ε, p ∈ F }.
Permitting ourselves a slight abuse of notation, we shall write N (x, p, ε) in the case
where F = {p}.
5.21. Theorem. If Γ is a separating family of seminorms on a vector space
W, then
B = {N (x, F, ε) : x ∈ W, ε > 0, F ⊆ Γ finite}
is a base for a locally convex topology T on W. Moreover, each p ∈ Γ is
T-continuous.
Proof.
Step One: We begin by showing that B is a base for a Hausdorff topology T on
W.
• Let x ∈ W and choose 0 ≠ p ∈ Γ. (Such a p exists since Γ is assumed to
be separating.) Then x ∈ N (x, p, 1). Thus
∪{B : B ∈ B} ⊇ ∪{N (x, p, 1) : x ∈ W} = W.
• Next suppose that B1 = N (x, F1 , ε1 ) and B2 = N (y, F2 , ε2 ) lie in B and
that z ∈ B1 ∩ B2 . We must find B3 ∈ B so that z ∈ B3 ⊆ B1 ∩ B2 .
To that end, let ε = min{ε1 − p(x − z), ε2 − q(y − z) : p ∈ F1 , q ∈ F2 },
so that ε > 0. If w ∈ N (z, F1 ∪ F2 , ε), then
p(w − x) ≤ p(w − z) + p(z − x) < ε + p(z − x) ≤ ε1
for all p ∈ F1 , and so w ∈ B1 . An analogous argument proves that w ∈ B2 .
That is, B3 := N (z, F1 ∪ F2 , ε) satisfies the required condition.
It now follows from our work in the homework assignments that B is a
base for a topology on W.
• If x, y ∈ W and x ≠ y, then our assumption that Γ is separating implies
the existence of an element p ∈ Γ so that δ := p(x − y) > 0. But then
$N(x, p, \frac{\delta}{2})$ and $N(y, p, \frac{\delta}{2})$ are disjoint nbhds of x and y respectively in the
T-topology, proving that T is Hausdorff.
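
The choice of ε in the second item of Step One can be checked numerically. The following is a minimal sketch on R² with the coordinate seminorms; the helper names `in_nbhd` and `refine`, and the sample points, are hypothetical and chosen only for illustration.

```python
def in_nbhd(y, x, F, eps):
    """True when y lies in N(x, F, eps), i.e. p(x - y) < eps for every p in F."""
    d = tuple(a - b for a, b in zip(x, y))
    return all(p(d) < eps for p in F)

# Two seminorms on R^2, neither of which is a norm.
p1 = lambda v: abs(v[0])
p2 = lambda v: abs(v[1])

def refine(z, x, F1, eps1, y, F2, eps2):
    """Radius eps with N(z, F1 u F2, eps) inside N(x, F1, eps1) n N(y, F2, eps2).

    Assumes z already lies in both basic sets, so that the minimum is positive.
    """
    dx = tuple(a - b for a, b in zip(x, z))
    dy = tuple(a - b for a, b in zip(y, z))
    return min([eps1 - p(dx) for p in F1] + [eps2 - q(dy) for q in F2])

x, y, z = (0.0, 0.0), (1.0, 0.5), (0.6, 0.3)
F1, F2 = [p1], [p1, p2]
eps = refine(z, x, F1, 1.0, y, F2, 0.5)   # min(0.4, 0.1, 0.3), up to rounding
```

Any w with p(w − z) < eps for all p ∈ F1 ∪ F2 should then pass both membership tests `in_nbhd(w, x, F1, 1.0)` and `in_nbhd(w, y, F2, 0.5)`.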
Step Two: That T is locally convex follows readily from the fact that B is a base
for T and each N (x, F, ε) is itself convex, as is easily verified.
Step Three: Next we verify that (W, T ) is a TVS; namely, that the topology T
is compatible with the vector space operations.
• Suppose that x0 , y0 ∈ W and let U be a nbhd of x0 + y0 in the T -topology.
Then there exists a basic nbhd B = N (x0 + y0 , F, ε) of x0 + y0 with B ⊆ U .
Let $B_1 = N(x_0, F, \frac{\varepsilon}{2})$ and $B_2 = N(y_0, F, \frac{\varepsilon}{2})$. If (x, y) ∈ B1 × B2, then
\[ p\big((x + y) - (x_0 + y_0)\big) \le p(x - x_0) + p(y - y_0) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]
for all p ∈ F, and thus σ(B1 × B2) ⊆ B ⊆ U. This shows that addition is
continuous relative to T.
• As for scalar multiplication, let λ0 ∈ K, x0 ∈ W and U be a nbhd of λ0 x0
in the T-topology. As before, choose a basic nbhd B = N(λ0 x0, F, ε) ⊆ U.
Let δ > 0. If D := {λ ∈ K : |λ − λ0| < δ} and B0 := N(x0, F, δ), then
(λ, x) ∈ D × B0 implies that
p(λx − λ0 x0) ≤ p(λx − λx0) + p(λx0 − λ0 x0)
≤ |λ| p(x − x0) + |λ − λ0| p(x0)
< (|λ0| + δ)δ + δ p(x0)
for all p ∈ F. Since F is finite, it is clear that δ can be chosen such that
p(λx − λ0 x0) < ε, p ∈ F, which proves that scalar multiplication is also
continuous relative to T.
Together, these two observations prove that (W, T ) is a TVS.
Step Four: Finally, let us show that each p is continuous relative to T .
If p = 0, then clearly p is continuous relative to T .
Otherwise, let B = N (0, p, 1). Then B ∈ T , and for x ∈ B,
p(x) = p(x − 0) < 1,
so that p is bounded on some open set in W. It now follows from Proposition 5.11
that p is (uniformly) continuous on W.
□

5.22. The above result says that a separating family of seminorms on a vector
space W gives rise to a locally convex topology on W. Our next goal is to show that
all locally convex spaces arise in this manner.

5.23. Theorem. Suppose that (V, TV ) is a LCS. Then there exists a separating
family Γ of seminorms on V which generate the topology TV .
Proof. By Corollary 5.15, (V, TV ) admits a nbhd base C0 at 0 consisting of balanced,
open, convex sets. By Proposition 5.10, for each E ∈ C0 , the Minkowski functional
pE is a seminorm and E = {x ∈ V : pE (x) < 1}.
Let Γ = {pE : E ∈ C0 }. We first show that Γ is separating. Indeed, suppose
that 0 ≠ x ∈ V. Since TV is Hausdorff by hypothesis, there exists G ∈ C0 so that
x ∉ G (this is actually a bit weaker than the statement that TV is Hausdorff, but
certainly implied by it). Since G ∈ C0, pG ∈ Γ. But x ∉ G implies that pG(x) ≥ 1,
and hence pG(x) ≠ 0. Thus Γ is separating. This is required before passing to the
next step.
By Theorem 5.21,
B = {N (x, F, ε) : x ∈ V, ε > 0, F ⊆ Γ finite}
is a base for a locally convex topology TΓ on V. Our goal, of course, is to prove that
TV = TΓ .
Let E ∈ C0 be a TV -open, balanced, convex nbhd of 0. Since E = N (0, pE , 1) ∈ B,
it follows that TΓ contains a nbhd base at 0 for the topology TV. Since both topologies
are TVS-topologies, they are determined by their nbhd bases at any point (e.g.,
at 0), and from this it follows that TΓ ⊇ TV.
On the other hand, each pE ∈ Γ is bounded above by 1 on E, and E is a TV -
open nbhd of 0. By Proposition 5.11, pE is continuous on (V, TV). It follows that
$N(0, p_E, \varepsilon) = p_E^{-1}((-\varepsilon, \varepsilon)) \in \mathcal{T}_{\mathcal{V}}$ for all ε > 0. Thus TV contains a nbhd subbase for
TΓ at 0, and arguing as before, we get that TΓ ⊆ TV.
Hence TV = TΓ, and the topology TV is determined by the family Γ of seminorms.
□

5.24. Example. Let (X, ‖·‖) be a normed linear space. The norm topology on
X is the metric topology induced by the metric d(x, y) = ‖x − y‖. That is, a nbhd
base at x0 ∈ X for the norm topology is
Bx0 = {Vε(x0) : ε > 0}
= {{y ∈ X : ‖y − x0‖ < ε} : ε > 0}
= {N(x0, ‖·‖, ε) : ε > 0}.
Thus we see that the norm topology on X is exactly the locally convex topology
generated by Γ = {‖·‖}. Observe that since ‖·‖ is a norm, 0 ≠ x ∈ X implies that
‖x‖ ≠ 0, and thus Γ is indeed separating, as required.

5.25. In Corollary 5.15, we saw that any LCS (V, T ) admits a nbhd base at 0
consisting of open, balanced, convex sets.
In fact, each N (0, {p1 , p2 , ..., pm }, ε) is balanced, open and convex for all choices
of m ≥ 1, p1 , p2 , ..., pm ∈ Γ and ε > 0, where Γ is a separating family of seminorms
which generate T . It is clear from Theorems 5.21 and 5.23 that the collection of
such sets is a nbhd base at 0 for T .

Having generated a topology on a vector space using a separating family of
seminorms, let us now examine what it means for a net to converge in this topology.

5.26. Proposition. Let V be a vector space and Γ be a separating family of
seminorms on V. Let T denote the locally convex topology on V generated by Γ.
A net (xλ)λ in V converges to a point x ∈ V if and only if
\[ \lim_{\lambda} p(x - x_\lambda) = 0 \quad \text{for all } p \in \Gamma. \]

Proof.
• Suppose first that (xλ)λ converges to x in the T-topology. Given p ∈ Γ
and ε > 0, the set N(x, p, ε) ∈ T and so there exists λ0 so that λ ≥ λ0
implies that xλ ∈ N(x, p, ε). That is, λ ≥ λ0 implies that p(x − xλ) < ε.
Thus limλ p(x − xλ) = 0.
Alternatively, one may argue as follows: suppose that (xλ)λ converges
to x in the T-topology. Given p ∈ Γ, we know that p is continuous in the
T-topology by Theorem 5.21. Since limλ (x − xλ) = 0,
\[ \lim_{\lambda} p(x - x_\lambda) = p\big(\lim_{\lambda} (x - x_\lambda)\big) = p(0) = 0. \]
• Conversely, suppose that limλ p(x − xλ) = 0 for all p ∈ Γ. Let U ∈ Ux
in the T-topology. Then there exist p1, p2, ..., pm ∈ Γ and ε > 0 so that
N(x, {p1, p2, ..., pm}, ε) ⊆ U. For each 1 ≤ j ≤ m, choose λj so that λ ≥ λj
implies that pj(xλ − x) < ε. Choose λ0 ≥ λ1, λ2, ..., λm. If λ ≥ λ0, then
pj(xλ − x) < ε for all 1 ≤ j ≤ m so that xλ ∈ N(x, {p1, p2, ..., pm}, ε) ⊆ U.
Hence limλ xλ = x in (V, T).
□
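
To see the criterion at work, consider the coordinate seminorms p_n(v) = |v_n| on a sequence space and the standard basis vectors x_k = e_k. The sketch below uses a finite truncation K^M as a stand-in (an assumption of the sketch, as are the names `e` and `coord_seminorm`): for each fixed n, p_n(e_k) = 0 as soon as k > n, while the supremum norm of each e_k stays equal to 1.

```python
def e(k, M):
    """k-th standard basis vector of K^M (0-based index)."""
    return [1.0 if i == k else 0.0 for i in range(M)]

def coord_seminorm(n):
    """The seminorm p_n(v) = |v_n|."""
    return lambda v: abs(v[n])

M = 20                                    # truncation dimension (assumed)
xs = [e(k, M) for k in range(M)]          # the sequence x_k = e_k
# For each n, the values p_n(x_k - 0) for k > n:
tails = {n: [coord_seminorm(n)(xk) for xk in xs[n + 1:]] for n in range(5)}
sup_norms = [max(abs(c) for c in xk) for xk in xs]
```

In the untruncated setting this suggests that (e_k)_k converges to 0 in the locally convex topology generated by the coordinate seminorms, although it does not converge to 0 in the supremum norm.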

5.27. Remarks. Let V be a vector space as above and let Γ be a separating
family of seminorms on V. Recall that if Tw is the weak topology on V induced by Γ,
then Tw is the weakest topology for which each of the functions p ∈ Γ is continuous.
By Theorem 5.21, the LCS topology T generated by
B = {N (x, F, ε) : x ∈ V, ε > 0, F ⊆ Γ finite}
has the property that each p ∈ Γ is continuous on (V, T ). It follows, therefore, that
Tw ⊆ T . In other words, if (xλ )λ is a net in (V, T ) which converges to x ∈ V, then
(xλ )λ converges to x in (V, Tw ); i.e. limλ p(xλ ) = p(x) for all p ∈ Γ.
That these two topologies do not, in general, coincide can be seen by examining
a simple example.
Let V = K and let Γ = {p}, where p(x) = |x| for each x ∈ K. The LCS topology
on K generated by Γ is a TVS topology, and thus must agree with the usual topology
on K, since K admits a unique TVS topology, by Lemma 4.19. The weak
topology Tw on K generated by Γ is the weakest topology for which p is continuous.
In particular, a net (xλ)λ converges to x ∈ K in (K, Tw) if and only if limλ |xλ| = |x|. For
example, the sequence (xn)n, where $x_n = (-1)^n$, n ≥ 1, converges to x = 1 in
(K, Tw). Since it clearly doesn’t converge in (K, T), the two topologies are necessarily
different, and again – by Theorem 5.21 – it follows that (K, Tw) is not a TVS.
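
The two notions of convergence in this example can be compared term by term. The following minimal sketch (variable names are illustrative only) tabulates | |x_n| − |1| |, which governs convergence in Tw, against p(x_n − 1) = |x_n − 1|, which governs convergence in T by Proposition 5.26.

```python
p = abs
xs = [(-1) ** n for n in range(1, 11)]      # x_n = (-1)^n, n = 1, ..., 10

weak_gap = [abs(p(x) - p(1)) for x in xs]   # | |x_n| - |1| |
lcs_gap = [p(x - 1) for x in xs]            # |x_n - 1|
```

Here `weak_gap` is identically 0, while `lcs_gap` alternates between 2 and 0, so the sequence cannot converge to 1 in T.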

There is, however, a situation where we can say a bit more than this. Let V
be a vector space and let (Xα, ‖·‖α)α∈A be a collection of Banach spaces (in fact,
normed linear spaces will do). For each such α, suppose that Tα : V → Xα is a
linear map. Suppose furthermore that the family {Tα}α is separating in the sense
that if 0 ≠ x ∈ V, then there exists α ∈ A so that 0 ≠ Tα x ∈ Xα. Then each of the
functions
pα : V → [0, ∞)
x ↦ ‖Tα x‖α
is easily seen to be a seminorm. It is routine to verify that the fact that {Tα}α
is separating implies that Γ = {pα}α is a separating family of seminorms. Let T
denote the LCS topology on V generated by Γ. By Proposition 5.26, a net (xλ)λ
converges to x ∈ (V, T) if and only if
\[ \lim_{\lambda} p_\alpha(x - x_\lambda) = \lim_{\lambda} \|T_\alpha(x - x_\lambda)\|_\alpha = 0 \quad \text{for all } \alpha \in A. \]
That is, limλ xλ = x if and only if
\[ \lim_{\lambda} T_\alpha x_\lambda = T_\alpha x \quad \text{for all } \alpha \in A. \]

Since this is nothing more than the statement that each Tα is continuous, we find
that in this case, the T topology on V coincides with the weak topology generated
by the family {Tα }α∈A . This is still not the same as the weak topology generated
by the family Γ, however.

5.28. Example. Let H = ℓ²(N) and recall that H is a Hilbert space when
equipped with the inner product $\langle (x_n), (y_n) \rangle = \sum_{n=1}^{\infty} x_n \overline{y_n}$.
Recall also that B(H) is a normed linear space with the operator norm ‖T‖ :=
sup{‖Tx‖ : x ∈ H, ‖x‖ ≤ 1}.
From above, we see that the norm topology on B(H) admits as a nbhd base at
T ∈ B(H) the collection
{N(T, ‖·‖, ε) : ε > 0} = {Vε(T) : ε > 0},
and that this is the locally convex topology generated by the separating family
Γ = {‖·‖} of (semi)norms.
Convergence of a net of operators (Tλ)λ to T ∈ B(H) in the norm topology (i.e.
limλ ‖Tλ − T‖ = 0) should be thought of as uniform convergence on the closed unit
ball of H.
This is certainly not the only interesting topology one can impose upon B(H).
Let us first consider the topology of “pointwise convergence”.

The strong operator topology (SOT). For each x ∈ H, consider
px : B(H) → R
T ↦ ‖Tx‖.
Then
(i) px(T) ≥ 0 for all T ∈ B(H);
(ii) px(λT) = ‖λTx‖ = |λ| ‖Tx‖ = |λ| px(T) for all λ ∈ K;
(iii) px(T1 + T2) = ‖T1x + T2x‖ ≤ ‖T1x‖ + ‖T2x‖ = px(T1) + px(T2),
so that px is a seminorm on B(H) for each x ∈ H.
In general, px is not a norm because we can always find T ∈ B(H) so that 0 ≠ T
but px(T) = 0. Indeed, let y ∈ H with 0 ≠ y and y ⊥ x. Define Ty : H → H via
Ty(z) = ⟨z, y⟩y. Then ‖Ty(z)‖ ≤ ‖z‖ ‖y‖² by the Cauchy-Schwarz Inequality and
in particular Ty(y) = ‖y‖² y ≠ 0, but Ty(x) = ⟨x, y⟩y = 0y = 0. Thus 0 ≠ Ty but
px(Ty) = 0.
On the other hand, if 0 ≠ T ∈ B(H), then there exists x ∈ H so that Tx ≠ 0.
Thus px(T) = ‖Tx‖ ≠ 0, proving that ΓSOT := {px : x ∈ H} separates the points
of B(H).
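
The rank-one operator Ty used above can be tried out in a small finite-dimensional model. This is a minimal sketch, assuming the real Hilbert space R³ in place of H; the helper names `inner`, `norm` and `T` are hypothetical.

```python
import math

def inner(u, v):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u))

def T(y):
    """The rank-one operator T_y(z) = <z, y> y."""
    return lambda z: [inner(z, y) * c for c in y]

x = [1.0, 0.0, 0.0]
y = [0.0, 3.0, 4.0]              # nonzero and orthogonal to x
Ty = T(y)
p_x = lambda S: norm(S(x))       # the seminorm p_x(S) = ||S x||
```

Here `p_x(Ty)` vanishes although `Ty(y)` equals ‖y‖² y = 25 y, so Ty is a nonzero operator on which px is zero.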
The locally convex topology on B(H) generated by ΓSOT is called the strong
operator topology and is denoted by SOT.
By Proposition 5.26 above, we see that a net (Tλ)λ in B(H) converges to T ∈
B(H) in the SOT if and only if
\[ \lim_{\lambda} p_x(T_\lambda - T) = \lim_{\lambda} \|T_\lambda x - T x\| = 0 \quad \text{for all } x \in H. \]
Thus the SOT is the topology of pointwise convergence. That is, it is the weakest
topology that makes all of the evaluation maps T ↦ Tx, x ∈ H continuous.
A nbhd base for the SOT at the point T ∈ B(H) is given by the collection
{N(T, {x1, x2, ..., xm}, ε) : m ≥ 1, xj ∈ H, 1 ≤ j ≤ m, ε > 0}
where, for m ≥ 1, F := {xj ∈ H : 1 ≤ j ≤ m} and ε > 0, we have
N(T, F, ε) = {R ∈ B(H) : ‖Rxj − Txj‖ < ε, 1 ≤ j ≤ m}.

The weak operator topology (WOT). Next, for each pair (x, y) ∈ H × H,
consider the map
qx,y : B(H) → R
T ↦ |⟨Tx, y⟩|.
Again, it is routine to verify that each qx,y is a seminorm but not a norm on B(H).
The locally convex topology on B(H) generated by ΓWOT := {qx,y : (x, y) ∈
H × H} is called the weak operator topology on B(H) and is denoted by WOT.
A net (Tλ)λ in B(H) converges to T ∈ B(H) in the WOT if and only if
\[ \lim_{\lambda} |\langle (T_\lambda - T)x, y \rangle| = \lim_{\lambda} |\langle T_\lambda x, y \rangle - \langle T x, y \rangle| = 0 \]
for all x, y ∈ H. In other words, the WOT is the weakest topology that makes all
of the functions T ↦ ⟨Tx, y⟩, x, y ∈ H continuous.
A nbhd base for the WOT at the point T ∈ B(H) is given by the collection
{N(T, {x1, x2, ..., xm, y1, y2, ..., ym}, ε) : m ≥ 1, xj, yj ∈ H, 1 ≤ j ≤ m, ε > 0},
where, for m ≥ 1, F := {(xj, yj) ∈ H × H : 1 ≤ j ≤ m} and ε > 0, we have
N(T, F, ε) = {R ∈ B(H) : |⟨Rxj − Txj, yj⟩| < ε, 1 ≤ j ≤ m}.

5.29. Proposition. Let (V, T) be a LCS, and let Γ be a separating family
of seminorms on V which generate the locally convex topology on V. Let p be a
seminorm on V. The following are equivalent:
(a) p is continuous on V;
(b) there exists a constant κ > 0 and p1, p2, ..., pm ∈ Γ so that
p(x) ≤ κ max(p1(x), p2(x), ..., pm(x)) for all x ∈ V.
Proof.
(a) implies (b): Suppose that p is continuous on V. Then $M := p^{-1}((-1, 1)) = p^{-1}([0, 1))$
is a T-open nbhd of 0, and as such, it must contain a basic nbhd
N := N(0, {p1, p2, ..., pm}, ε) for some p1, p2, ..., pm ∈ Γ and ε > 0. It
follows that if pj(x) < ε for 1 ≤ j ≤ m, then x ∈ N ⊆ M, and hence
p(x) < 1.
More generally, let y ∈ V and let ry = max(p1(y), p2(y), ..., pm(y)).
• If ry = 0, then for all k > 0, pj(ky) = 0 < ε, 1 ≤ j ≤ m, so that from
above, p(ky) = kp(y) < 1. Since k > 0 is arbitrary, this forces p(y) = 0, and so
p(y) = 0 ≤ 1 max(p1(y), p2(y), ..., pm(y)).
• If ry > 0, then $x = \frac{\varepsilon}{2 r_y}\, y$ satisfies pj(x) < ε, 1 ≤ j ≤ m, and so
$\frac{\varepsilon}{2 r_y}\, p(y) = p(x) < 1$. That is,
\[ p(y) < \frac{2 r_y}{\varepsilon} = \frac{2}{\varepsilon} \max(p_1(y), p_2(y), ..., p_m(y)). \]
We conclude that with $\kappa = \max(1, \frac{2}{\varepsilon})$,
p(y) ≤ κ max(p1(y), p2(y), ..., pm(y))
for all y ∈ V.
(b) implies (a): Suppose that (b) holds. Now N := N(0, {p1, p2, ..., pm}, 1) is
an open nbhd of 0 in the T-topology. If x ∈ N, then pj(x) < 1 for all
1 ≤ j ≤ m, and so p(x) ≤ κ. But then p is bounded above on the T-open
nbhd N of 0, and hence is continuous by Proposition 5.11.
□
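
Condition (b) is easy to test on examples. The sketch below assumes the evaluation seminorms p_t(f) = |f(t)| of Example 5.12 (restricted to real-valued functions for simplicity) and the hypothetical seminorm p(f) = |f(0) + 3f(1)|, for which the triangle inequality gives p ≤ 4 max(p_0, p_1). The sample functions are arbitrary.

```python
def p_at(t):
    """Evaluation seminorm p_t(f) = |f(t)| on functions [0, 1] -> R."""
    return lambda f: abs(f(t))

p0, p1 = p_at(0.0), p_at(1.0)
p = lambda f: abs(f(0.0) + 3.0 * f(1.0))   # dominated by 4 * max(p0, p1)
kappa = 4.0

samples = [lambda t: t, lambda t: 1.0 - 2.0 * t, lambda t: t * t - 0.5, lambda t: -7.0]
dominated = all(p(f) <= kappa * max(p0(f), p1(f)) for f in samples)
```

By the proposition, such a p is continuous for the locally convex topology generated by the family {p_t : t ∈ [0, 1]}. The constant function shows that κ = 4 cannot be lowered for this pair of seminorms.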

5.30. Proposition. Let (V, TV ) and (W, TW ) be locally convex spaces. Let ΓV
and ΓW denote separating families of seminorms which generate the corresponding
locally convex topologies on V and W respectively. Finally, let T : V → W be a
linear map.
The following are equivalent:
(a) T is continuous.
(b) For all q ∈ ΓW there exists κ > 0 and p1 , p2 , ..., pm ∈ ΓV so that
q(T x) ≤ κ max(p1 (x), p2 (x), ..., pm (x)) for all x ∈ V.
Proof.
(a) implies (b): Suppose that T is continuous and that q ∈ ΓW . Clearly q is
continuous as well. It is routine to verify that q◦T is a seminorm on V. Since
the composition of continuous functions is continuous, q ◦ T is a continuous
seminorm on V, and the result now follows from Proposition 5.29.
(b) implies (a): Conversely, suppose that for all q ∈ ΓW there exists κ > 0 and
p1 , p2 , ..., pm ∈ ΓV so that
q(T x) ≤ κ max(p1 (x), p2 (x), ..., pm (x)) for all x ∈ V.
As before, we observe that q ◦ T is a seminorm on V for all q ∈ ΓW.
Moreover, by Proposition 5.29, each such q ◦ T is continuous.
Let $U \in \mathcal{U}_0^{\mathcal{W}}$ and choose q1, q2, ..., qn ∈ ΓW and ε > 0 so that
N(0, {q1, q2, ..., qn}, ε) ⊆ U. Since each qj ◦ T is continuous on V, we have
that N(0, {q1 ◦ T, q2 ◦ T, ..., qn ◦ T}, ε) is a nbhd of 0 in V. Moreover,
x ∈ N(0, {q1 ◦ T, q2 ◦ T, ..., qn ◦ T}, ε)
implies that
Tx ∈ N(0, {q1, q2, ..., qn}, ε) ⊆ U.
It follows that T is continuous at 0.
By Theorem 4.33 and paragraph 4.34, T is continuous on V.
□
We shall require the following special case of the above result.
5.31. Corollary. Let (V, T ) be a LCS. A linear functional f on V is continuous
if and only if there exists a continuous seminorm p on V such that
|f (x)| ≤ p(x) for all x ∈ V.

Proof. Observe that if f is a continuous linear functional on V, then p(x) := |f(x)|,
x ∈ V defines a continuous seminorm on V; indeed, that p is continuous follows from
the fact that f is continuous on V and | · | is continuous on K. Obviously
|f(x)| ≤ p(x) for all x ∈ V.
Conversely – and more interestingly – suppose that there exists a continuous
seminorm p on V such that |f(x)| ≤ p(x) for all x ∈ V. As before, we may choose
a separating family Γ of seminorms on V which generates T, and without loss of
generality, we may assume that p ∈ Γ. (Otherwise we replace Γ by Γ ∪ {p}.) The
result now follows immediately from Proposition 5.30.
□
Appendix to Section 5.

5.32. In the assignment questions we exhibited an example of a TVS which is
not normable, i.e. it is not a normed linear space with respect to any norm. The
technique for constructing that example can be extended to produce a large variety
of such examples. The spaces we have in mind are called Fréchet spaces, and we
define them now.
5.33. Definition. A metric d on a vector space V is said to be translation
invariant if
d(x, y) = d(x + z, y + z)
for all x, y, z ∈ V.
We shall also say that a metric d on V is complete if (V, d) is a complete metric
space.
Finally, let us say that a countable family {ρn }n of pseudo-metrics on V is
complete if, whenever (xk )k is a sequence in V which is Cauchy relative to each ρn
(i.e. for all n ≥ 1 and ε > 0 there exists N = N (ε, n) > 0 so that j, k ≥ N implies
ρn (xj , xk ) < ε), there exists x ∈ V so that limk→∞ ρn (xk , x) = 0 for each n ≥ 1.

5.34. Example. Most, but certainly not all metrics we deal with are translation
invariant. For example, if d(x, y) = |x − y| for x, y ∈ R, then d is obviously translation
invariant.
On the other hand, the metric d on R defined via:
d(x, y) = |x³ − y³|
for all x, y ∈ R is not translation invariant, since d(0, 1) = 1 ≠ 7 = d(1, 2).
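
Both metrics can be tested on a finite sample. This is a minimal sketch; the helper name `is_translation_invariant_on` is hypothetical, and passing the test on a finite sample is of course only a necessary condition for translation invariance.

```python
d_abs = lambda x, y: abs(x - y)
d_cube = lambda x, y: abs(x ** 3 - y ** 3)

def is_translation_invariant_on(d, points, shifts, tol=1e-12):
    """Check d(x, y) == d(x + z, y + z) for all sampled x, y and shifts z."""
    return all(abs(d(x, y) - d(x + z, y + z)) <= tol
               for x in points for y in points for z in shifts)
```

With points 0, 1, 2 and shifts 1, −3, the first metric passes the test while the second fails it, the pair (0, 1) shifted by 1 being a witness.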

5.35. Definition. Let (V, T) be a LCS. If the topology T on V is induced by
a translation invariant, complete metric d, then we say that (V, T) is a Fréchet
space.

5.36. Constructing Fréchet spaces. We know from Theorem 5.21 that if
Γ is a separating family of seminorms on a vector space V, then Γ generates a
LCS topology T on V. Suppose now that the family Γ possesses the following two
additional properties, namely:
• the set Γ = {pn}n is countable, and
• the family {ρn}n of pseudo-metrics defined on V via ρn(x, y) = pn(x − y)
is a complete family.
Then the metric
\[ d(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, \frac{p_n(x - y)}{1 + p_n(x - y)} = \sum_{n=1}^{\infty} \frac{1}{2^n}\, \frac{\rho_n(x, y)}{1 + \rho_n(x, y)} \]
is easily seen to be translation-invariant. It is not too difficult to verify that a
sequence (xk)k in V converges to x ∈ V relative to the metric topology induced
by d if and only if limk→∞ pn(xk − x) = 0 for all n ≥ 1. That is, the d-metric
topology coincides with the LCS topology induced by Γ. Furthermore, observe that
(xk)k is Cauchy in the d-metric topology if and only if (xk)k is Cauchy relative to
each pseudo-metric ρn, n ≥ 1. By the second item above, it follows that (V, d) is
complete, and hence that (V, T) is a Fréchet space.
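
The metric d can be explored numerically through its partial sums. The following is a minimal sketch, assuming finitely many coordinate seminorms on K⁴ in place of a countable family; the names `frechet_metric` and `coord` are hypothetical.

```python
def frechet_metric(x, y, seminorms):
    """Partial sum of d(x, y) = sum_n 2^{-n} p_n(x - y) / (1 + p_n(x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for n, p in enumerate(seminorms, start=1):
        r = p(diff)
        total += r / (1.0 + r) / 2 ** n
    return total

def coord(n):
    """Coordinate seminorm v -> |v_n| (0-based index)."""
    return lambda v: abs(v[n])

seminorms = [coord(n) for n in range(4)]
```

Since d depends on x and y only through x − y, translation invariance is immediate. Each term is at most 2^{-n}, so d is bounded by 1 no matter how large the seminorms of x − y are.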

5.37. Example.
(a) Let V = C^∞(R) denote the vector space of all functions f : R → R which
are infinitely differentiable at each point x ∈ R. Let $\Gamma = \{p_{n,k}\}_{n,k \ge 0}$, where
for f ∈ V,
\[ p_{n,k}(f) := \sup\{ |f^{(n)}(x)| : x \in [-k, k] \}. \]
Let T denote the LCS topology on V generated by the separating family Γ
of seminorms. Then (V, T) is a Fréchet space.
A sequence $(f_j)_j$ in V converges to f ∈ V if and only if
\[ \lim_{j} \sup\{ |f_j^{(n)}(x) - f^{(n)}(x)| : x \in [-k, k] \} = 0 \]
for all n ≥ 0, k ≥ 0.
(b) If (X, ‖·‖) is a Banach space, then with Γ = {‖·‖}, X becomes a
Fréchet space.

5.38. Many authors define a Fréchet space as a LCS with a translation-invariant
metric which is complete as a uniform topological space. The definition of a
uniform space is rather long, and instead we refer the interested reader to the book
of Willard [Wil70] for a development of this concept.

5.39. The wot– and sot– topologies. In Example 5.28, we defined the weak-
operator topology (wot) and the strong-operator topology (sot) on B(H). As we
mentioned there, a net (Tλ)λ converges to T ∈ B(H) in the sot if and only if it converges
pointwise; that is, for each x ∈ H, limλ Tλ x = Tx.
Let us now turn our attention to wot–convergence.
Let {eα : α ∈ Ω} be an onb for an infinite-dimensional Hilbert space H. Corresponding
to any X ∈ B(H) is a matrix $[X] = [x_{\alpha,\beta}]_{\alpha,\beta \in \Omega}$ defined by
$x_{\alpha,\beta} := \langle X e_\beta, e_\alpha \rangle$ for all α, β ∈ Ω. (You will have definitely seen the finite-dimensional
version of this phenomenon, where we write $[T] = [t_{i,j}] \in M_n(\mathbb{C})$, with $t_{i,j} := \langle T e_j, e_i \rangle$
for some onb {e1, e2, . . . , en} for H = C^n.)
There is an important difference to note when passing from finite-dimensional
Hilbert spaces to infinite-dimensional Hilbert spaces, namely: in the infinite-dimensional
setting, not every matrix $[x_{\alpha,\beta}]$ is the matrix of a bounded linear operator X.
For example, if H is infinite-dimensional, there is no bounded linear map X ∈ B(H)
such that $x_{\alpha,\beta} = 1$ for all α, β ∈ Ω. Even in the case where H is infinite-dimensional
but separable, there are no known necessary and sufficient conditions to identify
when a matrix is the matrix of a bounded, linear operator. Of course, certain
necessary conditions are easy to state; below are two examples.
• Given X ∈ B(H) and any fixed β0 ∈ Ω, the vector $(\langle X e_{\beta_0}, e_\alpha \rangle)_{\alpha \in \Omega} \in \ell^2(\Omega)$, so in particular,
it is countably supported. This is because this vector is the set of Fourier
coefficients of $X e_{\beta_0}$ relative to our onb {eα : α ∈ Ω}.
• Given X ∈ B(H), for all α, β ∈ Ω,
\[ |x_{\alpha,\beta}| = |\langle X e_\beta, e_\alpha \rangle| \le \|X\|\, \|e_\beta\|\, \|e_\alpha\| \le \|X\|. \]
Thus the entries of [X] are uniformly bounded. As mentioned above, this
is far from sufficient.
A sufficient (but not necessary) condition is that
\[ \sum_{\alpha,\beta \in \Omega} |x_{\alpha,\beta}|^2 < \infty. \]
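
The sufficiency of the square-summability condition comes from the Cauchy-Schwarz inequality, which gives ‖Xv‖² ≤ (Σ|x_{α,β}|²) ‖v‖². The sketch below checks this bound on a finite truncation; the matrix with entries 1/(i + j)², the truncation size and the helper names are assumptions made for illustration.

```python
import math

def hs_norm(X):
    """Square root of the sum of |x_ij|^2 over all entries."""
    return math.sqrt(sum(abs(a) ** 2 for row in X for a in row))

def apply(X, v):
    """Matrix-vector product X v."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def norm(v):
    return math.sqrt(sum(abs(a) ** 2 for a in v))

n = 30                                   # truncation size (assumed)
X = [[1.0 / (i + j) ** 2 for j in range(1, n + 1)] for i in range(1, n + 1)]
c = hs_norm(X)                           # stays bounded as n grows
```

For every vector v one then expects `norm(apply(X, v)) <= c * norm(v)`, uniformly in the truncation size, which is what boundedness of the full matrix requires.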

One thing to be particularly aware of is that even if X ∈ B(H) has a matrix
$[X] = [x_{\alpha,\beta}]$ relative to an onb {eα : α ∈ Ω}, there is no guarantee that the matrix
$[\,|x_{\alpha,\beta}|\,]$ is the matrix of a bounded, linear operator. The following result was obtained
independently by V.S. Sunder [Sun78] and by A.R. Sourour [Sou78].

5.40. Theorem. Let H be a complex, separable, infinite-dimensional Hilbert
space. Let X ∈ B(H). The following conditions are equivalent.
(a) For every onb $\{e_\alpha\}_{\alpha \in \Omega}$ for H, the matrix $[\,|\langle X e_\beta, e_\alpha \rangle|\,]_{\alpha,\beta \in \Omega}$ is the matrix
of a bounded linear map in B(H);
(b) There exist λ ∈ C and K ∈ B(H) with the property that relative to some
orthonormal basis $\{e_n\}_{n=1}^{\infty}$ for H, $\sum_{n,m=1}^{\infty} |\langle K e_n, e_m \rangle|^2 < \infty$, and
X = λI + K.
We mention in passing that an operator K ∈ B(H) for which there exists an
onb $\{e_n\}_{n=1}^{\infty}$ relative to which
\[ \|K\|_2 := \Big( \sum_{n,m=1}^{\infty} |\langle K e_n, e_m \rangle|^2 \Big)^{1/2} < \infty \]
is said to be a Hilbert-Schmidt operator. It can be shown that the set C2(H) :=
{K ∈ B(H) : K is Hilbert-Schmidt} is a linear manifold contained in K(H), and
that ‖·‖₂ is a norm on C2(H). Furthermore, (C2(H), ‖·‖₂) is complete, and it
forms an ideal in B(H). We refer the reader to Davidson’s monograph [Dav88] for
more information on Hilbert-Schmidt operators and on other so-called Schatten
p-classes of operators.

5.41. Returning to the issue of wot-convergence: suppose that (Tλ)λ∈Λ is a
net in B(H) and that T ∈ B(H). Suppose furthermore that {eα}α∈Ω is an onb for
H.
If wot-limλ Tλ = T, then it is an immediate consequence of the definition of
wot-convergence that the net ([Tλ])λ∈Λ of matrices for (Tλ)λ∈Λ relative to {eα}α∈Ω
must converge entry-wise to the matrix [T] = [tα,β] of T relative to {eα}α∈Ω.
Moreover, the converse is true, provided that the entry-wise convergence of the
matrices works for any and all onb’s. It is not sufficient that this work for a
single, fixed onb, as we now show.

5.42. Example. Let H = ℓ².
For a, b ∈ H, we define the rank-one operator a ⊗ b* ∈ B(H) by
a ⊗ b*(x) = ⟨x, b⟩a, x ∈ H.
Note that if a ∈ H is a norm-one vector, then Pa := a ⊗ a* is the orthogonal
projection of H onto Ca.
Now suppose that $\{e_n\}_{n=1}^{\infty}$ is the standard onb for H = ℓ²; that is, $e_N = (\delta_{n,N})_{n=1}^{\infty}$,
where $\delta_{i,j}$ is the Kronecker delta: $\delta_{i,j} = 1$ if i = j and $\delta_{i,j} = 0$ if i ≠ j.
Consider $R_N = N^2 (e_N \otimes e_N^*)$, N ≥ 1, so that $R_N$ is N² times the orthogonal
projection of H onto $\mathbb{C} e_N$ for each N.
For any 1 ≤ i, j < ∞,
\[ \lim_{N \to \infty} \langle R_N e_i, e_j \rangle = 0, \]
because for any N > i, $R_N e_i = 0$. Nevertheless, we do not have that $(R_N)_{N=1}^{\infty}$
converges in the wot to 0.
Let $x = \sum_{n=1}^{\infty} \frac{1}{n} e_n$, which lies in H since $\sum_{n=1}^{\infty} (\frac{1}{n})^2 < \infty$. Then for all N ≥ 1,
\[ \langle R_N x, x \rangle = \Big\langle R_N \Big( \sum_n \frac{1}{n} e_n \Big), x \Big\rangle = \langle N e_N, x \rangle = \Big\langle N e_N, \sum_n \frac{1}{n} e_n \Big\rangle = \Big\langle N e_N, \frac{1}{N} e_N \Big\rangle = 1. \]
Incidentally, the matrix for $R_N$ relative to the onb $\{e_n\}_{n=1}^{\infty}$ is
\[ [R_N] = \begin{bmatrix}
0 & \cdots & 0 & 0 & 0 & \cdots \\
\vdots & \ddots & \vdots & \vdots & \vdots & \\
0 & \cdots & 0 & 0 & 0 & \cdots \\
0 & \cdots & 0 & N^2 & 0 & \cdots \\
0 & \cdots & 0 & 0 & 0 & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \]
That is, there is N² in the (N, N) entry, and 0’s elsewhere. By our work in Chapter 2,
$\|R_N\| = N^2$ for each N ≥ 1.
We leave it as an exercise for the reader to show that $(R_N)_{N=1}^{\infty}$ does not converge
to anything in the weak-operator topology.
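
The two computations of this example can be reproduced on a finite truncation. This is a minimal sketch, assuming real scalars and a truncation dimension M, so that only the first M coordinates of x are kept; the function names are hypothetical.

```python
def R(N, v):
    """Apply R_N = N^2 (e_N (x) e_N^*): keep coordinate N (1-based), scaled by N^2."""
    out = [0.0] * len(v)
    out[N - 1] = N ** 2 * v[N - 1]
    return out

def inner(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))

M = 50                                    # truncation dimension (assumed)
x = [1.0 / n for n in range(1, M + 1)]    # first M coordinates of x = sum e_n / n

def e(i):
    """i-th standard basis vector (1-based) of the truncated space."""
    return [1.0 if k == i - 1 else 0.0 for k in range(M)]
```

For each fixed pair (i, j) the entry `inner(R(N, e(i)), e(j))` vanishes once N exceeds i, whereas `inner(R(N, x), x)` stays equal to 1 (up to rounding) for every N ≤ M.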
5.43. Example. Let H denote an infinite-dimensional, separable, complex
Hilbert space. By B(H) we denote the set of bounded linear operators acting on
H. The following is an example of an unbounded net of operators in B(H) which
nonetheless converges to 0 in the strong operator topology, and therefore in the weak
operator topology.

Applying Example 11.19 of the online notes to the nbhd system $\mathcal{U}_0^{sot}$ of 0 in the
sot, we see that the latter forms a directed set using the relation
U1 ≤ U2 if U2 ⊆ U1,
and that if we choose an element $X_U \in U$, $U \in \mathcal{U}_0^{sot}$, then
\[ \lim_{U \in \mathcal{U}_0^{sot}} X_U = 0. \]
Recall also from your Assignments that any sot-open nbhd of 0 ∈ B(H) contains
an infinite-dimensional sot-closed subspace of B(H). Of course, a non-zero subspace
of B(H) contains operators of any possible norm.
Each $U \in \mathcal{U}_0^{sot}$ then contains a basic sot-open nbhd of 0 of the form
N(0, F, ε) = {T ∈ B(H) : ‖Tx − 0‖ < ε, x ∈ F},
where F = {x1, x2, . . . , xN} ⊆ H is a finite set, and ε > 0.
Choose an operator $R_U \in N(0, F, \varepsilon) \subseteq U$ with $\|R_U\| > \frac{1}{\varepsilon}$. (Alternatively, we
may ask that $\|R_U\| \ge |F|$, the cardinality of F.)
Since $R_U \in U$, $U \in \mathcal{U}_0^{sot}$, we see from above that
\[ \lim_{U \in \mathcal{U}_0^{sot}} R_U = 0. \]
Since ε > 0 can be made arbitrarily small (starting with any δ > 0, let 0 ≠ x ∈ H
and consider U = N(0, {x}, δ)), the net $(R_U)_U$ is not bounded.

We also remark that there does not exist an unbounded sequence $(T_n)_{n=1}^{\infty}$ in
B(H) which converges in the wot to 0.
Indeed, suppose otherwise; that is, suppose that $(T_n)_n$ is a sequence and that
for each x, y ∈ H,
\[ \lim_{n \to \infty} \langle T_n x, y \rangle = 0. \]
Then, for each x ∈ H, $(T_n x)_{n=1}^{\infty}$ is a sequence in H which converges weakly to
0. (This uses the Riesz Representation Theorem for Hilbert spaces to identify a
functional β ∈ H* with a functional $\beta_y(z) := \langle z, y \rangle$, z ∈ H.) By Corollary 7.17 of
the online notes, $\sup_{n \ge 1} \|T_n x\| < \infty$.
Of course, if we had a sequence $(T_n)_{n=1}^{\infty}$ converging in the sot to 0, then for
each x ∈ H, $\lim_n \|T_n x\| = 0$, implying again that $\sup_{n \ge 1} \|T_n x\| < \infty$.
Either way (i.e. wot or sot convergence to 0), we can apply the Uniform
Boundedness Principle: for each x ∈ H,
\[ \kappa_x := \sup_{n \ge 1} \|T_n x\| < \infty, \]
implying that $\sup_n \|T_n\| < \infty$.

I once spent a year in Philadelphia. I think it was on a Sunday.


W.C. Fields
Exercises for Section 5.

Question 1. This question is based upon Example 5.42 of the Appendix. Let
H = ℓ² and denote by $\{e_n\}_{n=1}^{\infty}$ the standard onb for H. For each N ≥ 1, set
$R_N = N^2 (e_N \otimes e_N^*)$, where for x, y ∈ H, x ⊗ y* denotes the rank-one operator
x ⊗ y*(z) = ⟨z, y⟩x, z ∈ H.
Prove that the sequence $(R_N)_{N=1}^{\infty}$ does not converge in the wot. Conclude that
it does not converge in the sot, nor in the norm topology on B(H).

Question 2.
Let H be an infinite-dimensional, separable, complex Hilbert space and let
$\{e_n\}_{n=1}^{\infty}$ be an onb for H. For each N ≥ 1, let $P_N$ denote the orthogonal
projection of H onto span {e1, e2, . . . , eN}.
Prove that the sequence $(P_N)_{N=1}^{\infty}$ converges strongly to the identity; that is, that
it converges to the identity operator in the sot.

Question 3.
Let H be an infinite-dimensional, separable, complex Hilbert space, and denote
by F(H) the set of finite-rank operators in B(H). Prove that F(H) is dense in B(H)
in the wot.
Is it dense in B(H) in the sot?

6. The Hahn-Banach theorem

When I wake up in the morning, I just can’t get started until I’ve had
that first, piping hot pot of coffee. Oh, I’ve tried other enemas...
Emo Philips

6.1. It is somewhat of a misnomer to refer to the Hahn-Banach Theorem. In


fact, there is a large number of variations on this theme. These variations fall into
two groups: the separation theorems, and the extension theorems. The crucial rela-
tion between these two classes of theorems is that they all refer to linear functionals.
Having said this, when one wishes to apply a version of the Hahn-Banach Theorem,
one tends to say only: “by the Hahn-Banach Theorem...”, usually leaving it to the
reader to determine which version of the Theorem is being applied.
The importance of these theorems in Functional Analysis can not be overstated.

6.2. Definition. Let W be a vector space over K. A linear functional on


W is a linear map f : W → K. The vector space of all linear functionals on W is
denoted by W # and is referred to as the algebraic dual of W.
If W is a TVS, the (vector) space of continuous linear functionals is denoted
by W ∗ , and is referred to as the (topological) dual of W. Obviously W ∗ ⊆ W # .

6.3. Example. Let $n \ge 1$ be an integer and consider $W = \mathbb{K}^n$ equipped with
the norm $\|(x_1, x_2, \ldots, x_n)\|_\infty = \max_{1 \le j \le n} |x_j|$. For any choice of $k_1, k_2, \ldots, k_n \in \mathbb{K}$, the
map
$$f : W \to \mathbb{K}, \qquad (x_1, x_2, \ldots, x_n) \mapsto \sum_{i=1}^n k_i x_i$$
is a continuous linear functional.
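One way to see the continuity claimed in this example is the estimate below. It is only a short sketch; the constant $C$ is notation introduced here for illustration and does not appear in the notes.

```latex
\[
  |f(x_1,\dots,x_n)|
  = \Bigl|\sum_{i=1}^n k_i x_i\Bigr|
  \le \sum_{i=1}^n |k_i|\,|x_i|
  \le \Bigl(\sum_{i=1}^n |k_i|\Bigr)\,\|(x_1,\dots,x_n)\|_\infty
  =: C\,\|(x_1,\dots,x_n)\|_\infty .
\]
```

Thus $f$ is bounded with $\|f\| \le C$. Choosing each $x_i$ of modulus one with $k_i x_i = |k_i|$ shows that in fact $\|f\| = C$.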

6.4. Remarks.
(a) Recall from basic linear algebra that every linear functional on Kn is of this
form for some choice of k1 , k2 , ..., kn ∈ K. As such, every linear functional
on Kn is continuous.
(b) Recall from Proposition 4.20 that if $V$ is an $n$-dimensional TVS with basis
$\{e_1, e_2, \ldots, e_n\}$, then $V$ is homeomorphic to $\mathbb{K}^n$ via the map
$\sum_{i=1}^n k_i e_i \mapsto (k_1, k_2, \ldots, k_n)$. Since the product topology on $\mathbb{K}^n$ is in turn
equivalent to the norm topology induced by the infinity norm, it follows
from (a) above that every linear functional on a finite-dimensional TVS is
continuous.

6.5. Example. Let us next consider $c_{00}(\mathbb{K}) = \{(x_n)_{n=1}^\infty : x_n \in \mathbb{K}$ for all $n \ge 1$
and $x_n = 0$ for all but finitely many $n$'s$\}$. Recall that this forms a normed linear
space when equipped with the norm
$$\|(x_n)_n\|_\infty = \sup_n |x_n|.$$
Define
$$f : c_{00}(\mathbb{K}) \to \mathbb{K}, \qquad (x_n)_n \mapsto \sum_{n=1}^\infty x_n.$$
Then $f$ is a non-continuous linear functional on $c_{00}(\mathbb{K})$. Indeed, if
$$y_n = \left(\tfrac{1}{n}, \tfrac{1}{n}, \ldots, \tfrac{1}{n}, 0, 0, 0, \ldots\right)$$
(where the $\tfrac{1}{n}$ term is repeated $n$ times), then $\|y_n\|_\infty = \tfrac{1}{n}$, and so $\lim_{n\to\infty} y_n = 0$,
while $f(y_n) = 1$ for all $n$, and hence $\lim_{n\to\infty} f(y_n) \ne 0 = f(0)$.
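The same vectors, rescaled, show directly that $f$ is unbounded on the unit ball. The following is a brief sketch; the name $z_n$ is introduced here for illustration only.

```latex
\[
  z_n := n\,y_n = (\underbrace{1, 1, \dots, 1}_{n \text{ terms}}, 0, 0, \dots),
  \qquad \|z_n\|_\infty = 1,
  \qquad f(z_n) = \sum_{j=1}^{n} 1 = n .
\]
\[
  \text{Hence } \sup\{\,|f(x)| : x \in c_{00}(\mathbb{K}),\ \|x\|_\infty \le 1\,\} = \infty .
\]
```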

For a number of the results we shall obtain below, we shall assume that the
underlying field is R. In order to translate the results to the case of complex vector
spaces, the following Lemma will be useful.

6.6. Lemma. Let V be a vector space over C.


(a) If f : V → R is an R-linear functional, then the map

fC (x) := f (x) − if (ix)

is a C-linear functional on V, and f = Re fC .


(b) If g : V → C is C-linear, f = Re g and fC is defined as in (a), then g = fC .
(c) If p is a C-seminorm on V and f , fC are as in (a) above, then |f (x)| ≤ p(x)
for all x ∈ V if and only if |fC (x)| ≤ p(x) for all x ∈ V.
(d) If V is a NLS and f, fC are as in (a), then kf k = kfC k.
Proof.
(a) This is routine and is left to the reader.
(b) Let x ∈ V and write g(x) = a + ib, where a = Re g(x) = f (x) and b =
Im g(x) are real. By C-linearity of g, g(ix) = ig(x) = −b + ia, and so

Im g(x) = b = −Re g(ix) = −f (ix).

That is, g(x) = f (x) + i(−f (ix)) = f (x) − if (ix) = fC (x).


(c) First suppose that |fC (x)| ≤ p(x) for all x ∈ V. Then |f (x)| = |Re fC (x)| ≤
|fC (x)| ≤ p(x) for all x ∈ V.

Next suppose that |f (x)| ≤ p(x) for all x ∈ V. Given x ∈ V, choose


θ ∈ C, |θ| = 1 so that |fC (x)| = θfC (x). Then
|fC (x)| = θfC (x)
= fC (θx)
= Re fC (θx) (as this quantity is non-negative)
= f (θx)
≤ p(θx)
= |θ|p(x) = p(x).
(d) We only deal with the case where kf k and kfC k are finite, and leave the
case where one of them is (and therefore both of them are) infinite to the
reader.
It is routine to verify that kf k ≤ kfC k, and this step is left to the
reader. Conversely, given x ∈ V with kxk = 1, we can find θx so that
|fC (x)| = θx fC (x) = fC (θx x) = Re fC (θx x) = f (θx x). Note that kθx xk = 1
because V is a C-vector space and kθx xk = |θx | kxk = kxk = 1. Thus
kfC k = sup{|fC (z)| : kzk = 1}
= sup{|f (θz z)| : kzk = 1}
≤ sup{|f (y)| : kyk = 1}
= kf k.
□
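For part (a) of the Lemma, which the proof leaves to the reader, the only point beyond $\mathbb{R}$-linearity is compatibility with multiplication by $i$. A short verification, using only the definition $f_{\mathbb{C}}(x) = f(x) - i f(ix)$ and the $\mathbb{R}$-linearity of $f$, might run as follows.

```latex
\begin{align*}
  f_{\mathbb{C}}(ix)
    &= f(ix) - i\,f(i \cdot ix) \\
    &= f(ix) - i\,f(-x) \\
    &= f(ix) + i\,f(x) \\
    &= i\bigl(f(x) - i\,f(ix)\bigr) \\
    &= i\,f_{\mathbb{C}}(x).
\end{align*}
```

Since $f$ is real-valued, the real part of $f(x) - i f(ix)$ is $f(x)$, which gives $f = \mathrm{Re}\, f_{\mathbb{C}}$.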

6.7. Proposition. Let $V$ be a vector space over $\mathbb{K}$ and let $f \in V^\#$.

(a) If $g \in V^\#$ and $g|_{\ker f} = 0$, then $g = kf$ for some $k \in \mathbb{K}$.
(b) If $g, f_1, f_2, \ldots, f_N \in V^\#$ and $g(x) = 0$ for all $x \in \bigcap_{j=1}^N \ker f_j$, then
$g \in \mathrm{span}\{f_1, f_2, \ldots, f_N\}$.
Proof.
(a) If $g = 0$, then set $k = 0$ and we are done.
Otherwise, choose $z \in V$ so that $g(z) \ne 0$. By hypothesis, $f(z) \ne 0$.
Let $k = g(z)/f(z)$. Now $\ker f$ has codimension 1 in $V$, and so if $x \in V$,
then $x = \alpha z + y$ for some $y \in \ker f$ and $\alpha \in \mathbb{K}$. Hence
$$g(x) = \alpha g(z) + g(y) = \alpha g(z) + 0 = \alpha k f(z) = k(\alpha f(z) + f(y)) = k f(x).$$
Since $x \in V$ was arbitrary, $g = kf$.
(b) We may assume that $\{f_1, f_2, \ldots, f_N\}$ are linearly independent. Let
$\mathcal{N} = \bigcap_{j=1}^N \ker f_j$. Then $\dim(V/\mathcal{N}) \le N$. (This is an elementary result from
Linear Algebra; there is a proof in the Appendix to this Chapter.) For
$1 \le j \le N$, define $\overline{f}_j : V/\mathcal{N} \to \mathbb{K}$ via $\overline{f}_j(x + \mathcal{N}) = f_j(x)$. Since $\mathcal{N} \subseteq \ker f_j$,
each $\overline{f}_j$ is well-defined, and $\overline{f}_j \in (V/\mathcal{N})^\#$.
We claim that $\{\overline{f}_1, \overline{f}_2, \ldots, \overline{f}_N\}$ is also linearly independent. Otherwise,
we can find $k_1, k_2, \ldots, k_N \in \mathbb{K}$ so that $\sum_{j=1}^N |k_j| \ne 0$, but $\sum_{j=1}^N k_j \overline{f}_j = 0$.
But then $\sum_{j=1}^N k_j f_j \ne 0$, so we can find $z \in V$ with
$0 \ne \sum_{j=1}^N k_j f_j(z) = \sum_{j=1}^N k_j \overline{f}_j(z + \mathcal{N})$, a contradiction.
Thus $\dim (V/\mathcal{N})^\# \ge N$. Combining this with the fact that $\dim(V/\mathcal{N}) \le N$
yields $\dim(V/\mathcal{N}) = \dim(V/\mathcal{N})^\# = N$, and that $\{\overline{f}_1, \overline{f}_2, \ldots, \overline{f}_N\}$ is a basis
for $(V/\mathcal{N})^\#$.
Now define $\overline{g} : V/\mathcal{N} \to \mathbb{K}$ via $\overline{g}(x + \mathcal{N}) = g(x)$. Again, since $\mathcal{N} \subseteq \ker g$,
$\overline{g}$ is well-defined. Since $\overline{g} \in (V/\mathcal{N})^\#$, we can write
$$\overline{g} = \sum_{j=1}^N k_j \overline{f}_j \quad \text{for some } k_1, k_2, \ldots, k_N \in \mathbb{K}.$$
For $x \in V$,
$$0 = \Bigl(\overline{g} - \sum_{j=1}^N k_j \overline{f}_j\Bigr)(x + \mathcal{N}) = g(x) - \sum_{j=1}^N k_j f_j(x),$$
so that $g = \sum_{j=1}^N k_j f_j$.
□
The first part of the above Proposition shows that if f and g are distinct linear
functionals on a vector space V, then they have the same kernel if and only if one
functional is a non-zero multiple of the other.
6.8. Definition. Let V be a vector space over K. A hyperplane M in V is a
linear manifold for which dim (V/M) = 1.

6.9. If $0 \ne \varphi \in V^\#$, then from elementary linear algebra theory we see that
$M := \ker \varphi$ is a hyperplane in $V$ and that $\varphi$ induces an (algebraic) isomorphism $\overline{\varphi}$
between $V/M$ and $\mathbb{K}$ via
$$\overline{\varphi}(x + M) := \varphi(x) \quad \text{for all } x + M \in V/M.$$
Conversely, if $M \subseteq V$ is a hyperplane, then $V/M$ is (algebraically) isomorphic to
$\mathbb{K}$. Let $\kappa : V/M \to \mathbb{K}$ denote such an isomorphism. If $q : V \to V/M$ is the canonical
quotient map, then $\kappa \circ q : V \to \mathbb{K}$ is a linear functional with $\ker(\kappa \circ q) = M$.
Thus we have established a correspondence between linear functionals and hy-
perplanes. Proposition 6.7 implies that, up to a factor of a non-zero scalar multiple,
this correspondence is bijective.

6.10. Proposition. If $(V, \mathcal{T})$ is a TVS and $M \subseteq V$ is a hyperplane, then either
$M$ is closed in $V$, or $M$ is dense.
Proof. Since the closure $\overline{M}$ is a vector space satisfying $M \subseteq \overline{M} \subseteq V$, and since $\dim(V/M) = 1$,
we either have $\overline{M} = M$, or $\overline{M} = V$.
□
It is worth noting that both possibilities can occur.

6.11. Example.
(a) Let $X = (C([0, 1], \mathbb{C}), \|\cdot\|_\infty)$, and let $\delta_{1/2} : X \to \mathbb{C}$ be the map $\delta_{1/2}(f) := f(\tfrac{1}{2})$,
$f \in X$. Then $\ker \delta_{1/2} = \{f : [0, 1] \to \mathbb{C} \mid f$ is continuous and $f(\tfrac{1}{2}) = 0\}$. This
is clearly closed.
(b) Let $V = c_0(\mathbb{C})$, and let $e_k = (\delta_{1k}, \delta_{2k}, \delta_{3k}, \ldots)$, where $\delta_{ij} = 0$ if $i \ne j$ and $\delta_{ij} = 1$ if $i = j$.
Let $z = (1, 1/2, 1/3, 1/4, \ldots)$. Then $\{z, e_1, e_2, e_3, \ldots\}$ is linearly independent
in $V$, and as such it can be extended to a Hamel basis (i.e. a vector space
basis) for $V$, say
$$B = \{z, e_1, e_2, e_3, \ldots\} \cup \{b_\lambda : \lambda \in \Lambda\}.$$
Given $v \in V$, say $v = \alpha z + \sum_{k=1}^\infty \beta_k e_k + \sum_{\lambda \in \Lambda} \gamma_\lambda b_\lambda$ for some $\alpha, \beta_k, \gamma_\lambda \in \mathbb{K}$
with only finitely many coefficients not equal to zero, define $g(v) = \alpha$.
It is clear that this defines a linear functional on $V$. Since $\ker g$ is a
subspace containing $e_k$, $k \ge 1$, and since span $\{e_k\}_{k=1}^\infty$ is dense in $V$, $\ker g$
is dense in $V$. Since $g \ne 0$, and since $\ker g$ is dense in $V$, we see that $\ker g$
is not closed.

6.12. Proposition. Let V be a TVS and ρ ∈ V # . Suppose that there exists an


open nbhd U ∈ U0 of 0 and a constant κ > 0 so that Re ρ(x) < κ for all x ∈ U .
Then ρ is uniformly continuous on V.
Proof. By Proposition 4.10, we can find a balanced, open nbhd N of 0 with N ⊆ U .
Observe that for x ∈ N ⊆ U , there exists θx ∈ K, |θx | = 1 so that
|ρ(x)| = ρ(θx x) = Re ρ(θx x).
But θx x ∈ N since N is balanced, and so |ρ(x)| < κ for x ∈ N . Consider the
function
p: V → R
x 7→ |ρ(x)|
which is easily seen to be a seminorm. Since p is bounded above by κ on the open
nbhd N of 0, we can invoke Proposition 5.11 to conclude that p is continuous on V.
By linearity, it follows that ρ is continuous at 0, and hence ρ is uniformly continuous
on V by Theorem 4.33.
□

6.13. Corollary. Let V be a TVS and ρ ∈ V # . The following are equivalent:


(a) ρ is continuous on V - i.e. ρ ∈ V ∗ ;
(b) ker ρ is closed.
Proof.
(a) implies (b): This is clear. If ρ is continuous, then ker ρ = ρ−1 ({0}) is closed
in V because {0} is closed in K.
(b) implies (a): Suppose next that $\ker \rho$ is closed. If $\rho = 0$, then $\rho$ is obviously
continuous. Suppose therefore that $\rho \ne 0$. Then $W := V/\ker \rho$ is a one-dimensional
TVS and
$$\overline{\rho} : W \to \mathbb{K}, \qquad x + \ker \rho \mapsto \rho(x)$$
is a linear functional on $W$. By Remark 6.4 (b), $\overline{\rho}$ is continuous. If
$q : V \to W$ is the canonical quotient map, then by Paragraph 4.18, $q$
is also continuous, and thus $\rho = \overline{\rho} \circ q$ is continuous as well.
□
Recall that X∗ = B(X, K) is a Banach space whenever X is a NLS. Let us recall
a couple of results from Measure Theory which provide us with interesting examples
of classes of linear functionals.
6.14. Theorem. Let $(X, \Omega, \mu)$ be a measure space and $1 < p < \infty$. If $\tfrac{1}{p} + \tfrac{1}{q} = 1$,
and if $g \in L^q(X, \Omega, \mu)$, then
$$\beta_g(f) := \int_X f g \, d\mu$$
defines a continuous linear functional on $L^p(X, \Omega, \mu)$, and the map $g \mapsto \beta_g$ is an
isometric linear bijection of $L^q(X, \Omega, \mu)$ onto $L^p(X, \Omega, \mu)^*$.

If (X, Ω, µ) is σ-finite, then the same conclusion holds in the case where p = 1
and q = ∞.

Recall that if X is a locally compact space, then MK (X) denotes the space of
K-valued regular Borel measures on X with the total variation norm.

6.15. Theorem. If $X$ is locally compact and $\mu \in M_{\mathbb{K}}(X)$, then
$$\beta_\mu : C_0(X, \mathbb{K}) \to \mathbb{K}, \qquad f \mapsto \int_X f \, d\mu$$
defines an element of $C_0(X, \mathbb{K})^*$, and the map $\mu \mapsto \beta_\mu$ is an isometric linear
isomorphism of $M_{\mathbb{K}}(X)$ onto $C_0(X, \mathbb{K})^*$.

The Extension Theorems

6.16. The Hahn-Banach Theorem is probably the most important result in


Functional Analysis. It has a great many applications, and its usefulness cannot
be overstated. There are two basic formulations of this result (each with a variety
of consequences); the first in terms of extensions of linear functionals from linear
submanifolds of a LCS to the entire LCS, and the second in terms of so-called
“separation theorems”, which we shall examine later.
6.17. Proposition. Let V be a vector space over R and p : V → R be a
sublinear functional. Suppose that M is a (proper) hyperplane and that f : M → R
is a linear functional for which f (x) ≤ p(x) for all x ∈ M. Then there exists a
linear functional g : V → R such that g|M = f , and g(x) ≤ p(x) for all x ∈ V.
Proof. Let z ∈ V \M, so that V = span {z, M}. Then v ∈ V implies that v = tz+m
for some t ∈ R, m ∈ M.
For each r ∈ R we may define hr : V → R by setting hr (z) = r, setting
hr (m) = f (m), m ∈ M, and then extending hr by linearity to all of V. Clearly
hr ∈ V # and hr extends f . The problem is that we do not know that hr (x) ≤ p(x)
for all x ∈ V – in fact, this is generally not true. The question of finding a g as
in the statement of the Proposition amounts to showing that for some s ∈ R, we
will have hs (x) ≤ p(x) for all x ∈ V. To find such an s, we first examine which
properties it must satisfy. We then demonstrate that these properties are also
sufficient. Finally, the existence of s is a byproduct of reconciling these necessary
and sufficient conditions.
If $h_s(x) \le p(x)$ for all $x \in V$, then for all $t \in \mathbb{R}$, $m \in M$ we must have
$$h_s(tz + m) = ts + f(m) \le p(tz + m).$$
• If $t > 0$, then setting $m_1 = t^{-1} m$ yields:
$$s \le -t^{-1} f(m) + t^{-1} p(tz + m) = -f(t^{-1} m) + p(z + t^{-1} m) \quad \text{for all } m \in M,$$
that is,
$$s \le -f(m_1) + p(z + m_1) \quad \text{for all } m_1 \in M. \tag{1}$$
• If $t < 0$, then setting $m_2 = -t^{-1} m$ yields:
$$s \ge -t^{-1} f(m) + t^{-1} p(tz + m) = f(-t^{-1} m) - p(-z - t^{-1} m) \quad \text{for all } m \in M,$$
that is,
$$s \ge f(m_2) - p(-z + m_2) \quad \text{for all } m_2 \in M. \tag{2}$$
The key issue is that we can “reverse engineer” this process. Suppose that $s \in \mathbb{R}$
satisfies both (1) and (2), namely
$$f(m_2) - p(-z + m_2) \le s \le -f(m_1) + p(z + m_1) \quad \text{for all } m_1, m_2 \in M. \tag{3}$$

• If t > 0, then
hs (tz + m) = ts + f (m)
≤ t(−f (m/t) + p(z + (m/t))) + f (m)
= p(tz + m) for all m ∈ M,
while
• if t < 0, then
hs (tz + m) = ts + f (m)
≤ t(f (−m/t) − p(−z − (m/t))) + f (m)
= −f (m) + (−t)p(−z − (m/t)) + f (m)
= p(tz + m) for all m ∈ M.
• If t = 0, then hs (tz + m) = hs (m) = f (m) ≤ p(m) = p(tz + m) for all
m ∈ M.
There remains to show, therefore, that we can find s ∈ R which satisfies (3) (or
equivalently, which satisfies both (1) and (2)). Now this can be done if
f (m2 ) + f (m1 ) ≤ p(−z + m2 ) + p(z + m1 ) for all m1 , m2 ∈ M.
But
f (m1 ) + f (m2 ) = f (m1 + m2 ) ≤ p(m1 + m2 )
≤ p(m2 − z) + p(z + m1 )
for all m1 , m2 ∈ M, and so we can choose
s0 := sup{f (m2 ) − p(−z + m2 ) : m2 ∈ M}.
Letting g = hs0 completes the proof.
□
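To make the one-step extension concrete, here is a small worked instance of condition (3). The specific choices of $V$, $p$, $M$, $f$ and $z$ below are illustrative assumptions, not taken from the notes.

```latex
% Illustrative data (assumed for this sketch):
%   V = R^2,  p(x,y) = |x| + |y|,  M = {(x,0) : x in R},  f(x,0) = x,  z = (0,1).
% On M we have f(x,0) = x <= |x| = p(x,0), so the hypothesis of the Proposition holds.
\begin{align*}
  \sup_{m_2 \in M} \bigl[f(m_2) - p(-z + m_2)\bigr]
    &= \sup_{x \in \mathbb{R}} \bigl[x - (|x| + 1)\bigr] = -1, \\
  \inf_{m_1 \in M} \bigl[-f(m_1) + p(z + m_1)\bigr]
    &= \inf_{x \in \mathbb{R}} \bigl[-x + (|x| + 1)\bigr] = 1 .
\end{align*}
% Condition (3) therefore reads -1 <= s <= 1, and for any such s the extension
\[
  g(x,y) = h_s(x,y) = x + s\,y
  \quad\text{satisfies}\quad
  g(x,y) \le |x| + |s|\,|y| \le |x| + |y| = p(x,y).
\]
```

In this instance the choice $s_0 = \sup\{f(m_2) - p(-z + m_2)\}$ made in the proof gives $s_0 = -1$, that is, $g(x, y) = x - y$. The extension is far from unique here, since every $s \in [-1, 1]$ works.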

6.18. Theorem. The Hahn-Banach Theorem 01


Let V be a vector space over R and let p be a sublinear functional on V. If M
is a linear manifold in V and f : M → R is a linear functional with f (m) ≤ p(m)
for all m ∈ M, then there exists a linear functional g : V → R with g|M = f , and
g(x) ≤ p(x) for all x ∈ V.
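Proof. Let $\mathcal{J} = \{(\mathcal{N}, h) : \mathcal{N}$ a linear manifold in $V$, $M \subseteq \mathcal{N}$, $h \in \mathcal{N}^\#$,
$h|_M = f$, and $h(n) \le p(n)$ for all $n \in \mathcal{N}\}$. For $(\mathcal{N}_1, h_1), (\mathcal{N}_2, h_2) \in \mathcal{J}$, define
$(\mathcal{N}_1, h_1) \preceq (\mathcal{N}_2, h_2)$ if $\mathcal{N}_1 \subseteq \mathcal{N}_2$ and $h_2|_{\mathcal{N}_1} = h_1$. Then $(\mathcal{J}, \preceq)$ is a partially ordered
set with respect to $\preceq$. Moreover, $\mathcal{J} \ne \emptyset$, since $(M, f) \in \mathcal{J}$.
Let $\mathcal{C} = \{(\mathcal{N}_\lambda, h_\lambda) : \lambda \in \Lambda\}$ be a chain in $\mathcal{J}$, and let $\mathcal{N} := \bigcup_{\lambda \in \Lambda} \mathcal{N}_\lambda$. Define
$h : \mathcal{N} \to \mathbb{R}$ by setting $h(n) = h_\lambda(n)$ if $n \in \mathcal{N}_\lambda$. Then $h$ is well-defined because $\mathcal{C}$
is a chain (check!), $h$ is linear and $h(n) \le p(n)$ for all $n \in \mathcal{N}$. Thus $(\mathcal{N}, h) \in \mathcal{J}$
and it is an upper bound for $\mathcal{C}$. By Zorn’s Lemma, $(\mathcal{J}, \preceq)$ has a maximal element
$(Y, g)$. Suppose that $Y \ne V$. Choosing $z \in V \setminus Y$ and letting $Y_0 = \mathrm{span}\{z, Y\}$,
Proposition 6.17 implies the existence of a functional $g_0 : Y_0 \to \mathbb{R}$ which extends $g$
and satisfies $g_0(y) \le p(y)$ for all $y \in Y_0$. This contradicts the maximality of $(Y, g)$.
Hence $Y = V$, and $g$ has the required properties.
□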
The complex version of this theorem can now be established.

6.19. Theorem. The Hahn-Banach Theorem 02


Let V be a vector space over K. Let M ⊆ V be a linear manifold and let p : V → R
be a seminorm on V. If f : M → K is a linear functional and |f (m)| ≤ p(m) for
all m ∈ M, then there exists a linear functional g : V → K so that g|M = f and
|g(x)| ≤ p(x) for all x ∈ V.
Proof. Suppose that K = R. Then f (m) ≤ |f (m)| ≤ p(m) for all m ∈ M, and
p is a sublinear functional (by virtue of the fact that it is a seminorm). By the
Hahn-Banach Theorem 01, there exists g : V → R linear so that g|M = f and
g(x) ≤ p(x) for all x ∈ V. Thus −g(x) = g(−x) ≤ p(−x) = p(x) for all x ∈ V, so
that |g(x)| ≤ p(x) for all x ∈ V.

Now suppose that $\mathbb{K} = \mathbb{C}$. Let $f_1 = \mathrm{Re}\, f$. Then by Lemma 6.6, $|f_1(m)| \le p(m)$
for all $m \in M$. By the argument of the first paragraph of this proof, there exists
an $\mathbb{R}$-linear functional $g_1 : V \to \mathbb{R}$ so that $g_1|_M = f_1$ and $|g_1(x)| \le p(x)$ for all
$x \in V$. Let $g = (g_1)_{\mathbb{C}}$ denote the complexification of $g_1$, as obtained in Lemma 6.6.
Then $g : V \to \mathbb{C}$ is $\mathbb{C}$-linear, $g|_M = f$ (by part (b) of that Lemma, since
$\mathrm{Re}(g|_M) = f_1 = \mathrm{Re}\, f$), and by part (c) of that Lemma,
$|g(x)| \le p(x)$ for all $x \in V$.
□

6.20. Corollary. Let (V, T ) be a LCS and W ⊆ V be a linear manifold. If


f ∈ W ∗ , then there exists g ∈ V ∗ so that g|W = f .
Proof. Since (V, T ) is a LCS, so is W. Let Γ be a separating family of seminorms
which generate the LCS topology on V (see Theorem 5.23). Then it is routine to
verify that ΓW := {p|W : p ∈ Γ} is a separating family of seminorms on W which
generates the relative LCS topology on W.
Suppose that f ∈ W ∗ . By Proposition 5.29, there exist κ > 0 and p1 , p2 , ..., pm ∈
Γ so that
|f (w)| ≤ κ max(p1 (w), p2 (w), ..., pm (w)) for all w ∈ W.
Let q(x) := κ max(p1 (x), p2 (x), ..., pm (x)) for all x ∈ V. Then, as is easily verified,
q is a seminorm on V, and q is continuous by Proposition 5.29. Moreover,
|f (w)| ≤ q(w) for all w ∈ W.
By the Hahn-Banach Theorem 02, we can find a linear functional g : V → K so that
g|W = f and
|g(x)| ≤ q(x) for all x ∈ V.
Another application of Corollary 5.31 shows that g is continuous, as was required.
□

The following is simply an application of Corollary 6.20 to the context of normed


linear spaces. It is often the version that comes to mind when the Hahn-Banach
Theorem is invoked.

6.21. Theorem. The Hahn-Banach Theorem 03


Let (X, k · k) be a NLS, M ⊆ X be a linear manifold, and f ∈ M∗ be a bounded
linear functional. Then there exists g ∈ X∗ such that g|M = f and kgk = kf k.
Proof. Consider the map
$$p : X \to \mathbb{R}, \qquad x \mapsto \|f\| \, \|x\|.$$
It is easy to check that p is a seminorm on X. (In fact, it is a norm unless f = 0.)
Since |f (m)| ≤ p(m) for all m ∈ M, it follows from the Hahn-Banach Theorem 02
that there exists g : X → K so that g|M = f and |g(x)| ≤ p(x) = kf k kxk for all
x ∈ X. This last inequality shows that kgk ≤ kf k. That kgk ≥ kf k is clear, and
hence kgk = kf k.
□

6.22. Corollary. Let $(V, \mathcal{T})$ be a LCS and $\{x_j\}_{j=1}^m$ be a linearly independent
set of vectors in $V$. If $\{k_j\}_{j=1}^m \subseteq \mathbb{K}$ are arbitrary, then there exists $g \in V^*$ so that
$g(x_j) = k_j$, $1 \le j \le m$.
Proof. Let $M = \mathrm{span}\{x_j\}_{j=1}^m$, so that $M$ is a finite-dimensional subspace of $V$.
Define $f : M \to \mathbb{K}$ via
$$f\Bigl(\sum_{j=1}^m a_j x_j\Bigr) = \sum_{j=1}^m a_j k_j.$$
Then $f$ is linear on $M$, and thus, by Corollary 4.35, it is continuous. By Corollary 6.20,
there exists $g \in V^*$ so that $g|_M = f$. Hence $g(x_j) = k_j$, $1 \le j \le m$.
□
A special case of the above Corollary which is worth pointing out is the following.

6.23. Corollary. Let $(V, \mathcal{T})$ be a LCS and $0 \ne y \in V$. Then there exists $g \in V^*$
so that $g(y) \ne 0$.
Proof. Simply let $x_1 = y$ and $k_1 = 1$ in the previous Corollary.
□
As an application of these results, let us show that finite dimensional subspaces
of locally convex spaces are topologically complemented.

6.24. Definition. A closed subspace W of a LCS (V, T ) is said to be topolog-


ically complemented if there exists a closed subspace Y of V so that V = Y ⊕ W.
That is, x ∈ V implies that x = y+w for some y ∈ Y and w ∈ W, while Y ∩W = {0}.

6.25. Remark. Every vector subspace W of a vector space V over K is alge-


braically complemented. If {wλ }λ∈Λ is a basis for W , then it can be extended to a
basis {wλ }λ∈Λ ∪ {yβ }β∈Γ for V . Letting Y = span{yβ }β∈Γ , we get V = Y ⊕ W .
The key issue in the above definition is that if W is a closed subspace in the
LCS V, then we are asking that the complement Y of W also be closed. This is not
always possible. For example, c0 is a closed subspace of (`∞ , k · k∞ ). Nevertheless,
it does not possess a topological complement. The proof is omitted.
When W is finite-dimensional, the situation is somewhat better.

6.26. Proposition. Let W be a finite-dimensional subspace of a LCS (V, T ).


Then W is topologically complemented in V.
Proof. First observe that $W$ is closed in $V$ by Corollary 4.22. Let $\{w_1, w_2, \ldots, w_n\}$
be a basis for $W$.
By Corollary 6.22, we can find continuous linear functionals $\rho_1, \rho_2, \ldots, \rho_n \in V^*$
so that $\rho_j(w_i) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta function. Let $Y = \bigcap_{j=1}^n \ker \rho_j$.
Since each $\rho_j$ is continuous, $\ker \rho_j$ is closed for all $j$, and hence $Y$ is also closed.
Suppose $v \in V$. Let $k_j = \rho_j(v)$, $1 \le j \le n$. Then $w = \sum_{i=1}^n k_i w_i \in W$. If
$y := v - w$, then $\rho_j(y) = \rho_j(v) - \rho_j(w) = k_j - k_j = 0$, $1 \le j \le n$. Hence $y \in Y$.
Finally, if $z \in Y \cap W$, then $z = \sum_{i=1}^n r_i w_i$ for some $r_i \in \mathbb{K}$, $1 \le i \le n$. But then
$z \in Y$, so for each $1 \le j \le n$, $0 = \rho_j(z) = r_j$. Hence $z = 0$ and $V = Y \oplus W$.
□
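The decomposition in this proof can be repackaged as a single map. The following sketch uses the functionals $\rho_j$ and basis vectors $w_j$ from the proof; the name $P$ is introduced here for illustration and is not used in the notes.

```latex
\[
  P : V \to V, \qquad P(v) := \sum_{j=1}^n \rho_j(v)\, w_j .
\]
% Since rho_j(w_i) = delta_{ij}, each basis vector is fixed by P:
\[
  P(w_i) = \sum_{j=1}^n \rho_j(w_i)\, w_j = w_i ,
  \qquad\text{so}\qquad P \circ P = P, \quad \operatorname{ran} P = W, \quad
  \ker P = \bigcap_{j=1}^n \ker \rho_j = Y .
\]
% The decomposition of v used in the proof is then
\[
  v = \underbrace{(v - P v)}_{\in\, Y} + \underbrace{P v}_{\in\, W} .
\]
```

As a finite sum of continuous maps, $P$ is a continuous idempotent, which is one common way of expressing that $W$ is topologically complemented.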

6.27. Corollary. Let $(V, \mathcal{T})$ be a LCS and $W \subseteq V$ be a closed subspace of $V$.
If $x \in V$, $x \notin W$, then there exists $g \in V^*$ so that $g|_W = 0$ but $g(x) \ne 0$.
Proof. By Proposition 5.17, $V/W$ is a LCS. Also, $x \notin W$ implies that $q(x) \ne 0$ in
$V/W$, where $q : V \to V/W$ is the canonical quotient map. We can therefore apply
Corollary 6.23 to produce a functional $f \in (V/W)^*$ so that $f(q(x)) \ne 0$. Since $q$ is
continuous, so is $g := f \circ q$, and so $g \in V^*$ satisfies $g(w) = 0$ for all $w \in W$, while
$g(x) \ne 0$.
□

6.28. Theorem. Let $(V, \mathcal{T})$ be a LCS and $W \subseteq V$ be a linear manifold. Then
$$\overline{W} = \bigcap \{\ker f : f \in V^* \text{ and } W \subseteq \ker f\}.$$
Proof. Clearly $f \in V^*$ implies that $\ker f$ is closed, so if $W \subseteq \ker f$, then $\overline{W} \subseteq \ker f$.
Thus
$$\overline{W} \subseteq \bigcap \{\ker f : f \in V^* \text{ and } W \subseteq \ker f\}.$$
Conversely, suppose that $x \in V$, $x \notin \overline{W}$. By Corollary 6.27, there exists $g \in V^*$
so that $g|_{\overline{W}} = 0$ but $g(x) \ne 0$. This proves the reverse inclusion, and combining the
two inclusions yields the desired result.
□

6.29. Corollary. Let (V, T ) be a LCS and W ⊆ V be a linear manifold. The


following are equivalent:
(a) W is dense in V.
(b) f ∈ V ∗ and f |W = 0 implies that f = 0.

Let us now describe some quantitative versions of the above results, in the setting
of normed linear spaces.

6.30. Corollary. Let $(X, \|\cdot\|)$ be a NLS and $x \in X$. Then
$$\|x\| = \max\{|x^*(x)| : x^* \in X^*, \|x^*\| \le 1\}.$$
Proof. For the rest of the proof, the vector $x \in X$ is fixed.
Let $\beta := \sup\{|x^*(x)| : x^* \in X^*, \|x^*\| \le 1\}$. Then for any $x^* \in X^*$ with $\|x^*\| \le 1$,
$|x^*(x)| \le \|x^*\| \, \|x\| \le \|x\|$, and so $\beta \le \|x\|$.
Define $Y = \mathbb{K}x$, so that $Y$ is a one-dimensional normed, linear subspace of $X$.
Define $f \in Y^\#$ via $f(kx) = k\|x\|$. Then $|f(kx)| = |k| \, \|x\| = \|kx\|$, and so $\|f\| = 1$.
By the Hahn-Banach Theorem 03 (Theorem 6.21), there exists $y^* \in X^*$ so that
$y^*|_Y = f$, and $\|y^*\| = \|f\| = 1$.
Thus
$$|y^*(x)| = y^*(x) = f(x) = \|x\|,$$
which proves that $\beta \ge \|x\|$, and hence that $\beta = \|x\|$. It also shows that the
supremum is attained at $y^*$.
□
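In a Hilbert space the norming functional of this Corollary can be written down without appealing to Hahn-Banach. The following is a small sketch in $\mathbb{R}^2$ with the Euclidean norm; the particular vector $x = (3, 4)$ is an illustrative choice, not an example from the notes.

```latex
% Illustrative data (assumed): X = (R^2, Euclidean norm), x = (3,4), so ||x|| = 5.
\[
  x^*(u_1, u_2) := \frac{3u_1 + 4u_2}{5} = \Bigl\langle (u_1,u_2),\ \tfrac{x}{\|x\|} \Bigr\rangle .
\]
% Cauchy-Schwarz gives |x^*(u)| <= ||u||, so ||x^*|| <= 1, while
\[
  x^*(x) = \frac{3\cdot 3 + 4 \cdot 4}{5} = \frac{25}{5} = 5 = \|x\| ,
\]
% so the maximum in the Corollary is attained at x^*, and ||x^*|| = 1.
```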
Recall from Proposition 2.18 that if $X$ is a normed linear space, then the canonical
embedding $J : X \to X^{**}$ which sends $x \in X$ to $\hat{x} \in X^{**}$, where $\hat{x}(x^*) = x^*(x)$ for
all $x^* \in X^*$, is a contractive linear mapping.
As a simple consequence of Corollary 6.30, we obtain:

6.31. Corollary. The canonical embedding J : X → X∗∗ is an isometry.

6.32. Corollary. Let $(X, \|\cdot\|)$ be a NLS and $Y \subseteq X$ be a closed subspace, with
$z \in X$ but $z \notin Y$. Let $d := d(z, Y) = \|z + Y\|$. Then there exists $x^* \in X^*$ so that
$\|x^*\| = 1$, $x^*|_Y = 0$, and $x^*(z) = d$.
Proof. Let $q : X \to X/Y$ denote the canonical quotient map. Since $X/Y$ is a
NLS and $\|q(z)\| = d$, Corollary 6.30 guarantees the existence of a linear functional
$\xi^* \in (X/Y)^*$ so that $\|\xi^*\| = 1$ and $\xi^*(q(z)) = \|q(z)\| = d$.
Let $x^* = \xi^* \circ q$. Obviously $x^*(z) = d$. Since $\|q\| \le 1$, $\|x^*\| \le \|\xi^*\| \, \|q\| \le 1$. Also,
for $y \in Y$, $x^*(y) = \xi^*(q(y)) = \xi^*(0) = 0$.
To see that $\|x^*\| \ge 1$, note that $\|\xi^*\| = 1$ and so we can find a sequence
$(q(x_n))_{n=1}^\infty$ in $X/Y$ with $\|q(x_n)\| < 1$ for all $n \ge 1$ and $\lim_{n\to\infty} |\xi^*(q(x_n))| = 1$.
Choose $y_n \in Y$ so that $\|x_n + y_n\| < 1$ for all $n$. Then
$$\lim_{n\to\infty} |x^*(x_n + y_n)| = \lim_{n\to\infty} |\xi^*(q(x_n + y_n))| = \lim_{n\to\infty} |\xi^*(q(x_n))| = 1.$$
Hence $\|x^*\| \ge 1$, whence $\|x^*\| = 1$.
□
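A two-dimensional instance may help fix the roles of $d$, $Y$ and $x^*$. The data below are illustrative assumptions chosen for this sketch, not an example from the notes.

```latex
% Illustrative data (assumed): X = (R^2, Euclidean norm),
%   Y = {(t,0) : t in R},  z = (1,2).
\[
  d = d(z, Y) = \inf_{t \in \mathbb{R}} \|(1 - t,\, 2)\| = 2
  \qquad (\text{attained at } t = 1).
\]
% The functional x^*(u_1,u_2) := u_2 then has all three required properties:
\[
  \|x^*\| = 1, \qquad x^*|_{Y} = 0, \qquad x^*(z) = 2 = d .
\]
```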

THE SEPARATION THEOREMS

6.33. Proposition. Let $(V, \mathcal{T})$ be a LCS over the field $\mathbb{K}$ and let $\emptyset \ne G \subseteq V$
be an open, convex subset of $V$ with $0 \notin G$. Then there exists a closed hyperplane
$M$ in $V$ such that $G \cap M = \emptyset$.
Proof. Let us first consider the case where $\mathbb{K} = \mathbb{R}$.
Fix $x_0 \in G$ and let $H = x_0 - G$. Then $H \in \mathcal{U}_0^V$ is open and convex. Let $p_H$ denote
the Minkowski functional on $H$. By Proposition 5.10, $H = \{x \in V : p_H(x) < 1\}$.
Observe that $0 \notin G$ implies that $x_0 \notin H$. Thus $p_H(x_0) \ge 1$. Let $W = \mathbb{R}x_0$, and
define $f : W \to \mathbb{R}$ via $f(kx_0) = k p_H(x_0)$. Clearly $f \in W^\#$. Moreover,
• if $k \ge 0$, then $f(kx_0) = k p_H(x_0) = p_H(kx_0)$, while
• if $k < 0$, then $f(kx_0) = k p_H(x_0) < 0 \le p_H(kx_0)$.
It follows from the Hahn-Banach Theorem 01 (Theorem 6.18) that there exists a
linear functional $g : V \to \mathbb{R}$ with $g|_W = f$ and $g(x) \le p_H(x)$ for all $x \in V$. Suppose
that $y \in H$. Then $\mathrm{Re}\, g(y) = g(y) \le p_H(y) < 1$.
By Proposition 6.12, $g$ is continuous on $V$. Thus $M := \ker g$ is a closed hyperplane
in $V$. [Note: obviously $g \ne 0$ since $f \ne 0$.]
Suppose $z \in G$. Then $x_0 - z \in H$, so $g(x_0) - g(z) = g(x_0 - z) \le p_H(x_0 - z) < 1$.
On the other hand, $g(x_0) = f(x_0) = p_H(x_0) \ge 1$, and so
$$g(z) > g(x_0) - 1 \ge 0, \quad \text{and } z \notin M.$$
Thus $G \cap M = \emptyset$.

Next, suppose that K = C.


Then V is also an R-linear space, and so as above we can find a continuous R-
linear functional gR : V → R so that G ∩ ker gR = ∅. Let gC be the complexification
of gR , gC (x) = gR (x) − igR (ix), x ∈ V. By Lemma 6.6, gC is a C-linear functional
and gR = Re gC .
Now gC (x) = 0 if and only if gR (x) = gR (ix) = 0. Let M = ker gC . Then
M = ker gR ∩ [i ker gR ] is a closed C-hyperplane in V and M ∩ G ⊆ ker gR ∩ G = ∅.
□

6.34. Definition. An affine hyperplane M in a TVS (V, T ) is a translate


of a hyperplane; that is, M is an affine hyperplane if there exists x ∈ M so that
M − x is a hyperplane.
More generally, L ⊆ V is an affine manifold (resp. affine subspace) of V if
there exists m ∈ L so that L − m is a manifold (resp. subspace) of V.
We remark that if there exists m ∈ L so that L − m is a manifold in V, then for
all m ∈ L we must have L − m is a manifold. The verification of this is left to the
reader.

6.35. Corollary. Let $(V, \mathcal{T})$ be a LCS and $\emptyset \ne G \subseteq V$ be open and convex. If
$L \subseteq V$ is an affine manifold of $V$ and $L \cap G = \emptyset$, then there exists a closed, affine
hyperplane $Y \subseteq V$ so that $L \subseteq Y$ and $Y \cap G = \emptyset$.
Proof. Since the closure $\overline{L}$ of $L$ is an affine subspace and $\overline{L} \cap G = \emptyset$ (as $L \subseteq V \setminus G$
and the latter set is closed), we may replace $L$ by $\overline{L}$ and assume without loss of
generality that $L$ is closed. Choose $m \in L$ and let $L_0 = L - m$, so that $L_0$ is a closed
subspace of $V$. Let $G_0 = G - m$. Since $L \cap G = \emptyset$, it follows that $L_0 \cap G_0 = \emptyset$. Let
$q : V \to V/L_0$ denote the canonical quotient map.
Since $G$ is open, so is $G_0$. Since $q$ is an open map (see paragraph 4.18), $q(G_0)$
is open. Furthermore, $G$ is convex and hence so are $G_0$ and $q(G_0)$. Again, since
$L_0 \cap G_0 = \emptyset$, $0 \notin q(G_0)$. By Proposition 6.33, there exists a closed hyperplane $N_0$
in $V/L_0$ so that $N_0 \cap q(G_0) = \emptyset$. Let $Y_0 = q^{-1}(N_0)$. It is routine to check that $Y_0$
is a linear manifold in $V$, and $Y_0$ is closed since $q$ is continuous. Moreover,
dim V/Y0 = dim(V/L0 )/(Y0 /L0 )
= dim(V/L0 )/N0
= 1,
and so Y0 is a closed hyperplane in V with L0 ⊆ Y0 . Translating back, let Y =
Y0 + m. Then Y is a closed affine hyperplane of V, L ⊆ Y and if z ∈ Y ∩ G, then
q(z − m) ∈ q(Y0 ) ∩ q(G0 ) = N0 ∩ q(G0 ) = ∅, a contradiction.
□

6.36. Definition. Let (V, T ) be a TVS over the field R. By an open half-
space (resp. closed half-space) we shall mean a subset S ⊆ V for which there
exist a non-zero continuous linear functional f : V → R and k ∈ R so that
S = {x ∈ V : f (x) > k}
(resp. S = {x ∈ V : f (x) ≥ k}.)
We say that two subsets A and B of V are separated if we can find closed half-
spaces SA and SB so that A ⊆ SA , B ⊆ SB and SA ∩SB is a closed affine hyperplane
of V. We say that A and B are strictly separated if we can find disjoint open
half-spaces SA and SB with A ⊆ SA and B ⊆ SB .
Note that if f ∈ V ∗ and k ∈ R, then S = {x ∈ V : f (x) < k} is also an open
half-space, since g = −f ∈ V ∗ and S = {x ∈ V : g(x) > −k}. A similar statement
holds for closed half-spaces.

6.37. Example.
(a) Consider $\mathbb{R}^2$ equipped with the Euclidean norm. Let $A = \{(x, y) \in \mathbb{R}^2 :
x < 0$ and $y \ge 1/x^2\}$, $B = \{(x, y) \in \mathbb{R}^2 : x > 0$ and $y \ge 1/x^2\}$. Then
$f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x$ defines a continuous linear functional on $\mathbb{R}^2$. Let
$S_A = \{(x, y) \in \mathbb{R}^2 : f(x, y) < 0\}$ and $S_B = \{(x, y) \in \mathbb{R}^2 : f(x, y) > 0\}$. Then
$S_A$, $S_B$ are disjoint open half-spaces with $A \subseteq S_A$ and $B \subseteq S_B$. Hence $A$
and $B$ are strictly separated.
(b) With $\mathbb{R}^2$, $f$, $A$, $S_A$ and $S_B$ as above, set $C = \{(0, y) : y \in \mathbb{R}\}$. Then
$A \subseteq \overline{S_A} = \{(x, y) \in \mathbb{R}^2 : f(x, y) \le 0\}$, $C \subseteq \overline{S_B} = \{(x, y) \in \mathbb{R}^2 : f(x, y) \ge 0\}$
and $\overline{S_A}$, $\overline{S_B}$ are closed half-spaces for which $\overline{S_A} \cap \overline{S_B} = \{(x, y) \in \mathbb{R}^2 : x = 0\}$
is a closed (affine) hyperplane.
Thus $A$ and $C$ are separated. One can check that $A$ and $C$ are not
strictly separated.

6.38. Theorem. The Hahn-Banach Theorem 04 - R


Let (V, T ) be a LCS over R and suppose that A and B are non-empty, disjoint,
open, convex subsets of V. Then A and B are strictly separated.
Proof. Let $G := A - B = \{a - b : a \in A, b \in B\}$. We claim that $\emptyset \ne G$ is
open and convex. That $\emptyset \ne G$ is obvious as both $A$ and $B$ are non-empty. Since
$G = \bigcup_{b \in B} (A - b)$, $G$ is the union of open sets (each $A - b$ is a translate of the open
set $A$), and thus $G$ is open.
Suppose that g1 = a1 − b1 and g2 = a2 − b2 lie in G. Let t ∈ [0, 1]. Then
tg1 + (1 − t)g2 = [ta1 + (1 − t)a2 ] − [tb1 + (1 − t)b2 ].
Since A and B are convex, it follows that so is G. Observe that A ∩ B = ∅ also
implies that 0 6∈ G.
It now follows from Proposition 6.33 that there exists a closed hyperplane M
in V such that M ∩ G = ∅. Let f denote a continuous linear functional on V such
that ker f = M. (That a linear functional with this kernel exists was demonstrated
in paragraph 6.9, and that it is continuous follows from Corollary 6.13.)
Now G is convex and f is linear, whence f (G) is again convex. But G∩ker f = ∅,
so 0 6∈ f (G) and hence either f (x) > 0 for all x ∈ G, or f (x) < 0 for all x ∈ G.
By replacing f by −f if necessary, we may assume that the first condition holds. If
a ∈ A, b ∈ B, then c = a − b ∈ G, so f (c) = f (a) − f (b) > 0, i.e. f (b) < f (a). We
deduce that there exists k ∈ R so that
sup{f (b) : b ∈ B} ≤ k ≤ inf{f (a) : a ∈ A}.
Now A is open and hence f (A) is open (check!). Similarly, f (B) is open. Hence
f (b) < k < f (a) for all a ∈ A, b ∈ B. It follows that SA = {x ∈ V : f (x) > k}
and SB = {x ∈ V : f (x) < k} are disjoint open half-spaces with A ⊆ SA , B ⊆ SB .
Hence A and B are strictly separated.
□
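As a concrete illustration of the theorem (with data chosen here for the purpose, not taken from the notes), consider two disjoint open discs in $\mathbb{R}^2$. The sketch follows the convention of the proof, in which $A$ lies in the half-space $\{f > k\}$.

```latex
% Illustrative data (assumed):
%   A = {(x,y) : (x+2)^2 + y^2 < 1},   B = {(x,y) : (x-2)^2 + y^2 < 1}.
\[
  G = A - B = \{(x,y) : (x+4)^2 + y^2 < 4\}, \qquad 0 \notin G .
\]
% With f(x,y) := -x and k := 0 we obtain
\[
  f(a) \in (1, 3) \ \text{ for all } a \in A,
  \qquad
  f(b) \in (-3, -1) \ \text{ for all } b \in B,
\]
% so that f(b) < k < f(a), and the disjoint open half-spaces are
\[
  S_A = \{(x,y) : f(x,y) > 0\} = \{x < 0\},
  \qquad
  S_B = \{(x,y) : f(x,y) < 0\} = \{x > 0\}.
\]
```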

6.39. Remark. If $A$ were open but not $B$, then $G = \bigcup_{b \in B} (A - b)$ would still be
a union of open sets and hence would still be open. The conclusion would be that
there exist a continuous linear functional $g : V \to \mathbb{R}$ and a constant $k \in \mathbb{R}$ such that
$A \subseteq \{x \in V : g(x) > k\}$ and $B \subseteq \{x \in V : g(x) \le k\}$.

6.40. Theorem. The Hahn-Banach Theorem 04 - C


Let (V, T ) be a LCS over C and suppose that A, B ⊆ V are non-empty, disjoint,
open, convex subsets of V. Then there exist a continuous C-linear functional f on
V and k ∈ R so that
Ref (a) > k > Ref (b) for all a ∈ A, b ∈ B.

Proof. Thinking of (V, T ) as a vector space over R, we may apply Theorem 6.38
(i.e. the HB04-R) above to obtain a continuous R-linear functional fR : V → R and
a constant k ∈ R so that
fR (a) > k > fR (b) for all a ∈ A, b ∈ B.
Let fC (x) = fR (x) − ifR (ix) be the complexification of fR . By Lemma 6.6, fC is
continuous and
RefC (a) > k > RefC (b) for all a ∈ A, b ∈ B.
Thus f = fC is the desired C-linear functional.
□

6.41. Theorem. The Hahn-Banach Theorem 05


Let $(V, \mathcal{T})$ be a LCS and suppose that $A, B \subseteq V$ are non-empty, disjoint, closed,
convex subsets of $V$. Suppose furthermore that $B$ is compact. Then there are real
numbers $\alpha, \beta$ and a continuous linear functional $f \in V^*$ so that
$$\mathrm{Re}\, f(a) \ge \alpha > \beta \ge \mathrm{Re}\, f(b)$$
for all $a \in A$, $b \in B$. In particular, $A$ and $B$ are strictly separated.
Proof. Observe that V\A is open and that b ∈ B implies that b ∈ V\A. It follows
from Corollary 5.15 that we can find a balanced, convex, open nbhd N_b of 0 so
that b + N_b ⊆ V\A. The collection {b + (1/3)N_b : b ∈ B} is an open cover of B, and
B ⊆ ∪_{b∈B} (b + (1/3)N_b) ⊆ ∪_{b∈B} (b + N_b) ⊆ V\A. Since B is compact, we can find a
finite subcover {b_j + (1/3)N_{b_j}}_{j=1}^n of B. Let N = ∩_{j=1}^n (1/3)N_{b_j}. Then N is non-empty,
balanced, convex and open.
Let A0 = A + N = {a + n : a ∈ A, n ∈ N} and B0 = B + N. Clearly A0 ≠ ∅ ≠ B0.
Then A0 = ∪a∈A a + N is open and similarly B0 is open. If a1 + n1 , a2 + n2 ∈ A0
and t ∈ [0, 1], then
t(a1 + n1 ) + (1 − t)(a2 + n2 ) = (ta1 + (1 − t)a2 ) + (tn1 + (1 − t)n2 ) ∈ A + N = A0 ,
since each of A and N is convex. Thus A0 , and similarly B0 , is convex.
Suppose z ∈ A0 ∩ B0. Then there exist a ∈ A, b ∈ B and n1, n2 ∈ N so
that a + n1 = b + n2. But b ∈ B implies that there exist 1 ≤ j ≤ n and some
m_j ∈ (1/3)N_{b_j} so that b = b_j + m_j. Thus a = b + (n2 − n1) = b_j + m_j + (n2 − n1).
Now n1 ∈ N and N balanced implies that −n1 ∈ N. Recalling that N ⊆ (1/3)N_{b_j},
we see that m_j, n2, −n1 ∈ (1/3)N_{b_j}, and thus m_j + (n2 − n1) ∈ N_{b_j}. This forces
a = b_j + m_j + (n2 − n1) ∈ b_j + N_{b_j}, contradicting the fact that b_j + N_{b_j} ⊆ V\A.
Hence A0 ∩ B0 = ∅.
By Theorem 6.40 (HB04-C), there exist f ∈ V∗ and α ∈ R so that
Re f (a) > α > Re f (b)
for all a ∈ A0 , b ∈ B0 .
But B is compact, and Re◦f is continuous on B, so that β = sup{Ref (b) : b ∈ B}
is attained at some point b0 ∈ B. Thus
Re f (a) > α > β = Re f (b0 ) ≥ Re f (b)
for all a ∈ A, b ∈ B.
□

6.42. Corollary. Let (V, T) be a LCS over R and ∅ ≠ A ⊆ V. Then the closed,
convex hull of A, co(A), is the intersection of the closed half spaces that contain A.
Proof. Let Ω = {S : A ⊆ S, S ⊆ V is a closed half-space}. Since each S ∈ Ω is
closed and convex, B = ∩S∈Ω S is again closed and convex. Clearly A ⊆ B, and so
the closed convex hull of A is also a subset of B.
If z ∉ co(A), then {z} and co(A) are disjoint, non-empty, closed and convex
subsets of V. Since {z} is compact, we can apply Theorem 6.41 (i.e. HB05) to
obtain a linear functional f ∈ V∗ and α, β ∈ R such that
f(z) ≥ α > β ≥ f(y)
for all y ∈ co(A). Thus if S0 = {x ∈ V : f(x) ≤ β}, then S0 is a closed half-space of
V, A ⊆ co(A) ⊆ S0, and z ∉ S0. Thus ∩_{S∈Ω} S ⊆ co(A), proving that
co(A) = ∩_{S∈Ω} S.
□
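
As a concrete illustration of this corollary (a sketch in V = R² with the Euclidean norm, which is our choice of example and is not taken from the notes), the closed unit disc is recovered as an intersection of closed half-planes, one for each unit vector u:

```latex
\[
\{x \in \mathbb{R}^2 : \|x\|_2 \le 1\}
  = \bigcap_{\|u\|_2 = 1} \{x \in \mathbb{R}^2 : \langle x, u\rangle \le 1\}.
\]
% "\subseteq": Cauchy-Schwarz gives <x,u> <= ||x||_2 ||u||_2 <= 1.
% "\supseteq": if x \neq 0 lies in every half-plane, take u = x/||x||_2,
%              so that ||x||_2 = <x,u> <= 1.
```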

Appendix to Section 6.

6.43. Corollary 6.30 admits an interesting interpretation. Given (X, ‖·‖) a NLS
and x∗ ∈ X∗, we do not in general expect x∗ to achieve its norm. For example, if
X = c0, equipped with the supremum norm, and if
x∗((x_n)_n) := Σ_n x_n / 2^n,
then x∗ has norm one, but there is no x = (x_n)_n ∈ c0 with ‖x‖ = 1 and |x∗(x)| =
‖x∗‖ = 1.
Nevertheless, Corollary 6.30 allows us to conclude that we may find x∗∗ ∈ X∗∗
with ‖x∗∗‖ = 1 for which |x∗∗(x∗)| = 1. If J : X∗ → X∗∗∗ is the canonical embedding
of X∗ into its second dual and \widehat{x∗} := J(x∗), then ‖\widehat{x∗}‖ = 1 by Corollary 6.31 and
|\widehat{x∗}(x∗∗)| = |x∗∗(x∗)| = 1 = ‖\widehat{x∗}‖.
One can think of this as saying that the domain of x∗ is not “large enough” to
allow x∗ to attain its norm, but that X∗∗ extends the domain of x∗ enough to allow
the extension \widehat{x∗} of x∗ to attain its norm.
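
The claims about the functional on c0 in this paragraph can be verified directly. The sketch below assumes that sequences are indexed by n ≥ 1 (so that the weights 2^{-n} sum to 1); the last line uses that any x ∈ c0 has some coordinate with |x_m| ≤ 1/2, because its coordinates tend to 0.

```latex
% Assumption: sequences in c_0 are indexed by n >= 1.
\begin{align*}
|x^*(x)| &\le \sum_{n=1}^\infty \frac{|x_n|}{2^n}
          \le \|x\|_\infty \sum_{n=1}^\infty 2^{-n} = \|x\|_\infty,
          &&\text{so } \|x^*\| \le 1; \\
x^*(\underbrace{1,\dots,1}_{N},0,0,\dots) &= 1 - 2^{-N} \longrightarrow 1,
          &&\text{so } \|x^*\| = 1; \\
\|x\|_\infty = 1,\ |x_m| \le \tfrac12
  &\implies |x^*(x)| \le \bigl(1 - 2^{-m}\bigr) + \tfrac12 \cdot 2^{-m} < 1.
\end{align*}
```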

6.44. The proof of Proposition 6.33 also gives us an indication of how one may
try to interpret the Hahn-Banach Theorem 01, namely Theorem 6.18, geometrically.
Let V be a vector space over R and let p be a sublinear functional on V. Suppose
that M is a linear manifold in V and f : M → R is a linear functional with
f (m) ≤ p(m) for all m ∈ M.
Let H = {x ∈ V : p(x) < 1}. It is routine to verify that H is convex. Since
0 ≠ f, there exists m0 ∈ M such that f(m0) > 1. This forces p(m0) ≥ f(m0) > 1,
and so m0 ∉ H. Let K = m0 − H = {m0 − h : h ∈ H}. Clearly K is also convex.
Since m0 ∉ H, 0 ∉ K. In fact, we claim that K ∩ ker f = ∅.
Suppose otherwise: let k ∈ K ∩ ker f . Then k ∈ M and k = m0 − h for some
h ∈ H, which forces h ∈ H ∩ M, since k, m0 ∈ M. But
0 = f (k) = f (m0 − h) = f (m0 ) − f (h) > 1 − p(h) > 0,
a contradiction. Thus K ∩ ker f = ∅. Theorem 6.18 then says that we can extend
f to a linear functional g : V → R with g|M = f , such that
g(x) ≤ p(x) for all x ∈ V.
A similar analysis to that above shows that K ∩ ker g = ∅. In other words,
one can translate the “unit ball” H of V (as measured by the sublinear functional
p – note, p doesn’t even have to be a non-negative-valued function, and hence the
interpretation of H as a “unit ball” here is very, very loose) so that it doesn’t
intersect the linear manifold ker f in such a way that the manifold may be extended
to a hyperplane (ker g) which also doesn’t intersect the translation.
This is only intended as a heuristic. Proposition 6.33 shows how to correctly
use HB01, i.e. Theorem 6.18, to obtain an interesting geometric result in locally
convex spaces.

Part (b) of Proposition 6.7 admits a second proof (by induction). We thank
W. Shen for pointing this out and providing the following proof.

6.45. Proposition. Let V be a vector space over K, N ≥ 1 be an integer, and
g, f1, f2, ..., fN ∈ V# be linear functionals. Suppose that ∩_{n=1}^N ker f_n ⊆ ker g. Then
g ∈ span {f1, f2, ..., fN}.
Proof. The case N = 1 is part (a) of Proposition 6.7. Now let M > 1 and suppose
that the result holds for N < M. We prove that it holds for N = M. To that end,
we suppose that g, f1, f2, ..., fM ∈ V#, and that ∩_{n=1}^M ker f_n ⊆ ker g.
Let W := ∩_{n=1}^{M−1} ker f_n. Note that
ker f_M|_W ⊆ ker g|_W.
Since the result holds for N = 1, we see that g|_W = κ f_M|_W for some κ ∈ K. But
then
∩_{n=1}^{M−1} ker f_n = W ⊆ ker (g − κ f_M),
and so by the induction hypothesis,
g − κ f_M ∈ span {f1, f2, ..., f_{M−1}},
from which the result follows.
□
The following result was required for Proposition 6.7. For the sake of complete-
ness, we include a proof.

6.46. Proposition. Let V be a vector space over K and 1 ≤ N ∈ N. Suppose
that f1, f2, ..., fN ∈ V# are linearly independent elements, and let W :=
∩_{n=1}^N ker f_n. Then
dim(V/W) ≤ N.

Proof. To see this, we shall argue by induction on N. For N = 1, this is clear, as
f1 ≠ 0 (because {f1} is assumed to be linearly independent), so
V / ker f1 ≃ ran f1 = K,
implying that dim (V / ker f1) = dim K = 1.
Another way of stating this is that if x1 ∈ V and x1 ∉ ker f1, then V = ker f1 ∔
K x1. (Here, V = V1 ∔ V2 for vector subspaces V1, V2 of V if V = V1 + V2 = {x + y :
x ∈ V1, y ∈ V2} and V1 ∩ V2 = {0}.)
Suppose that the result holds for 1 ≤ N ≤ M. Consider linearly independent
linear functionals f1, f2, ..., fM, f_{M+1} on V. Let W := ∩_{n=1}^M ker f_n. Then, by the
induction hypothesis,
dim (V/W) ≤ M.
Alternatively, there exist r ≤ M linearly independent vectors x1 , x2 , . . . , xr such
that
V = W + span {x1 , x2 , . . . , xr }.
(One way to see this is to choose a basis {x1 + W, x2 + W, . . . , xr + W} for V/W,
and to check that the representatives x1, x2, ..., xr of these cosets do the job.)
Let g := f_{M+1}|_W. Then g is a linear functional on W, and so dim (W/ ker g) ≤ 1.
That is, there exists y ∈ W such that
W = ker g + K y.
(Observe that this holds even if ker g = W.) If w ∈ ker g, then f_{M+1}(w) = g(w) = 0,
and f_n(w) = 0, 1 ≤ n ≤ M, because ker g ⊆ W = ∩_{n=1}^M ker f_n.
It follows that
ker g ⊆ ∩_{n=1}^{M+1} ker f_n,
and
V = W + span {x1, x2, ..., xr} = ker g + span {x1, x2, ..., xr, y}.
In particular, V/ ker g is spanned by {x1 + ker g, x2 + ker g, ..., xr + ker g, y + ker g}.
Hence dim (V / ker g) ≤ r + 1 ≤ M + 1, which, when combined with the fact that
ker g ⊆ ∩_{n=1}^{M+1} ker f_n, implies that
dim (V / ∩_{n=1}^{M+1} ker f_n) ≤ M + 1.
□

There is less in this than meets the eye.


Tallulah Bankhead

Exercises for Section 6.

Question 1.
Let V be a vector space over C, and let f : V → R be an R-linear functional.
Let fC denote the C-linear functional defined in Lemma 6.6, namely
fC (x) := f (x) − if (ix), x ∈ V.
Prove that ‖f‖ = ∞ if and only if ‖f_C‖ = ∞.

Question 2.
For those of you who have studied Lebesgue measure: let 1 ≤ p < ∞, and let q
denote the Lebesgue conjugate of p, and dm denote Lebesgue measure on the real
line. Let f ∈ L^p([0, 1], K). Prove that there exists g ∈ L^q([0, 1], K) such that
(i) ‖g‖_q = 1, and
(ii) ∫_0^1 f g dm = ‖f‖_p.
This is typically one of the first things that one proves about Lp -spaces in any course
on measure theory. Hopefully you can now see why.

Question 3.
Find a linear functional ϕ ∈ (ℓ^∞)∗ such that ϕ|_{c0} ≡ 0. That is, describe ϕ
explicitly.

Question 4.
Let 1 ≤ p ≤ ∞. Consider x = (1, 2) ∈ (C², ‖·‖_p). Find a linear functional
ϕ_p ∈ (C², ‖·‖_p)∗ such that ‖ϕ_p‖ = 1 and
ϕ_p(x) = ‖x‖_p.

7. Weak topologies and dual spaces

Last week I stated that this woman was the ugliest woman I had ever
seen. I have since been visited by her sister and now wish to withdraw
that statement.
Mark Twain

7.1. In Remarks 5.27, we observed that if V is a vector space over K, if
(X_α, ‖·‖_α)_{α∈A} is a family of normed linear spaces, and if for each α ∈ A we have a
linear map T_α : V → X_α, then each
p_α : V → R
x ↦ ‖T_α x‖
is a seminorm. Furthermore, if {T_α}_{α∈A} is separating – i.e. for each 0 ≠ x ∈ V,
there exists α0 ∈ A so that T_{α0} x ≠ 0 – then the family Γ = {p_α}_{α∈A} is a separating
family of seminorms. Finally, we saw there that the LCS topology on V generated
by Γ was nothing more (nor was it anything less) than the weak topology generated
by {T_α}_{α∈A}.
Let us now consider the following special instance of this phenomenon. Again,
we begin with a vector space V over K, and we assume that we are given a separating
family Ω ⊆ V#. Of course, for each ρ ∈ Ω, we have
ρ : V → K,
and (K, | · |) is a normed linear space. Since Ω was assumed to be separating for V,
the family Γ = {τ_ρ : ρ ∈ Ω} of functions defined by
τ_ρ(x) = |ρ(x)|, x ∈ V,
is a separating family of seminorms which generates a LCS topology on V. From
above, this topology coincides with the weak topology generated by Ω, and we shall
denote it by σ(V, Ω).
Thus a base for the σ(V, Ω) topology on V is given by
B = {N(x, F, ε) : x ∈ V, ε > 0, F ⊆ Ω finite},
where for each x, F and ε > 0 as above,
N(x, F, ε) = {y ∈ V : τ_ρ(x − y) = |ρ(x) − ρ(y)| < ε, ρ ∈ F}.
In particular, a net (xλ )λ∈Λ in (V, σ(V, Ω)) converges to x ∈ V if and only if
lim_λ τ_ρ(x_λ − x) = lim_λ |ρ(x_λ) − ρ(x)| = 0,
or equivalently,
lim_λ ρ(x_λ) = ρ(x)
for all ρ ∈ Ω.

7.2. Definition. Let V be a vector space over K, and suppose that L ⊆ V# is
both a linear manifold and a separating family of linear functionals. We say that
(V, L) is a dual pair.

7.3. Example. Suppose that (V, T) is a LCS and that L = V∗. By Corollary 6.23,
L separates points of V and hence (V, V∗) is a dual pair. The σ(V, V∗)
topology is sufficiently important to merit its own name, and we refer to it as the
weak topology on V. If (xλ )λ is a net in V which converges to some x in the weak
topology, we say that (xλ )λ converges weakly to x.
Suppose that (xλ )λ is a net in V which converges to x ∈ V in the initial topology
T . For any f ∈ V ∗ , the fact that f is continuous implies that
lim_λ f(x_λ) = f(x).

Thus (xλ )λ converges to x weakly. It follows that the weak topology on V induced
by V ∗ is weaker than the initial topology: in other words, σ(V, V ∗ ) ⊆ T .
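
To see that the inclusion σ(V, V∗) ⊆ T can be strict, here is a standard illustration (our choice of example, not taken from the notes at this point). It is a sketch in V = ℓ²(N) with its norm topology, and it assumes the identification of (ℓ²)∗ with ℓ² given by the Riesz representation theorem for Hilbert spaces: the standard basis vectors e_n converge weakly to 0 but not in norm.

```latex
% e_n = (0,...,0,1,0,...) with the 1 in position n; y = (y_k)_k \in \ell^2.
% Assumption: every \rho \in (\ell^2)^* has the form \rho_y = <., y>.
\begin{align*}
\rho_y(e_n) &= \langle e_n, y\rangle = \overline{y_n} \longrightarrow 0
   \quad\text{since } \sum_k |y_k|^2 < \infty, \\
\|e_n - 0\|_2 &= 1 \quad\text{for all } n \ge 1.
\end{align*}
```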

Let (V, L) be a dual pair. By Paragraph 7.1, each ρ ∈ L is continuous on V
relative to the σ(V, L) topology. Our present goal is to show that these are the only
σ(V, L)-continuous linear functionals on V.

7.4. Theorem. Let (V, L) be a dual pair. Then


L = (V, σ(V, L))∗ .

Proof. That L is contained in (V, σ(V, L))∗ was shown in Paragraph 7.1.
Suppose now that µ ∈ V # is σ(V, L)-continuous. Then the map pµ : V → R
satisfying pµ (x) = |µ(x)| defines a σ(V, L)-continuous seminorm on V . By Proposi-
tion 5.29, there exist ρ1 , ρ2 , ..., ρn ∈ L and 0 < κ ∈ R so that
pµ (x) = |µ(x)| ≤ κ max(|ρ1 (x)|, |ρ2 (x)|, ..., |ρn (x)|) for all x ∈ V.
It follows that ker µ ⊇ ∩_{j=1}^n ker ρ_j. By Proposition 6.7, µ ∈ span {ρ_j}_{j=1}^n ⊆ L.
□

7.5. Remark.
• We first remark that if Ω ⊆ V # is a separating family of linear functionals
but is not a linear manifold, then after setting L = span Ω, one can verify
that the σ(V, L)-topology on V agrees with the σ(V, Ω)-topology, and hence
that
L = (V, σ(V, Ω))∗ .
• It follows from Theorem 7.4 that the only weakly continuous linear func-
tionals on a locally convex space (V, T ) are the elements of (V, T )∗ .

7.6. Definition. Suppose that (V, T) is a LCS. Then V∗ ⊆ V# is a vector space
over K. For each x ∈ V, define
x̂ : V∗ → K
ρ ↦ x̂(ρ) := ρ(x).
Then V̂ := {x̂ : x ∈ V} is a linear manifold in (V∗)#. If 0 ≠ ρ ∈ V∗, then obviously
there exists x ∈ V such that ρ(x) ≠ 0. In other words, V̂ is a separating family of
linear functionals on V∗. Hence (V∗, V̂) is a dual pair.
By convention, the weak topology on V∗ induced by the family V̂ is usually de-
noted by σ(V∗, V) (as opposed to σ(V∗, V̂)), and is referred to as the weak∗-topology
on V∗.

7.7. Remark. It follows that a base for the weak∗ -topology on V ∗ is given by
B = {N (ϕ, F, ε) : ϕ ∈ V ∗ , ε > 0, F ⊆ V finite},
where
N(ϕ, F, ε) = {ρ ∈ V∗ : |x̂(ρ − ϕ)| = |ρ(x) − ϕ(x)| < ε, x ∈ F}.
Moreover, a net (ρ_λ)_λ in V∗ converges in the weak∗-topology to a functional ρ ∈ V∗
if and only if lim_λ ρ_λ(x) = ρ(x) for all x ∈ V. In other words, convergence in the
weak∗-topology on V∗ is convergence at every point of V.
By Theorem 7.4, a functional ϕ is weak∗-continuous on V∗ if and only if ϕ = x̂
for some x ∈ V.

7.8. Proposition. Let (V, TV ) and (W, TW ) be locally convex spaces, and sup-
pose that T : (V, TV ) → (W, TW ) is a continuous linear operator. Then T is contin-
uous as a linear map between V and W when they are equipped with their respective
weak topologies.
Proof. Suppose that (xλ )λ is a net in V which converges weakly to x ∈ V. We must
show that the net (T xλ )λ converges weakly to T x in W. Now, if ρ ∈ W ∗ , then ρ ◦ T
is continuous with respect to the TV topology on V, and hence ρ ◦ T ∈ V ∗ . But the
weakly continuous linear maps on V coincide with V ∗ , and therefore ρ ◦ T is weakly
continuous on V, i.e. limλ ρ ◦ T (xλ ) = ρ ◦ T (x), as was to be shown.
□
When V is a LCS and C ⊆ V is convex, we get a particularly nice result con-
cerning the weak topology.

7.9. Theorem. Let C be a convex set in a LCS (V, T ). Then the closure of C
in (V, T ) coincides with its weak closure in (V, σ(V, V ∗ )).
Proof. First observe that we can always view (V, T ) and (V, σ(V, V ∗ )) as locally
convex spaces over R. Since C is assumed to be convex already, Corollary 6.42
implies that the closure of C in (V, T ) (resp. in (V, σ(V, V ∗ ))) is the intersection of
the T -closed (resp. σ(V, V ∗ )-closed) half spaces which contain C.
But a closed half space in a LCS corresponds to (a constant and) a continuous
linear functional on that space. Since (V, T ) and (V, σ(V, V ∗ )) share the same dual
space, namely V ∗ , it follows that they also share the same closed half-spaces, and
hence the closure of C in these two topologies must coincide.
□

7.10. Let (X, ‖·‖) be a Banach space. Recall from Proposition 2.18 that the
canonical embedding
J : X → X∗∗
x ↦ x̂,
where x̂(x∗) := x∗(x) for all x∗ ∈ X∗, is a contractive map. In Corollary 6.31 we saw
that – as a consequence of the Hahn-Banach Theorem – J is in fact an isometry.
By Theorem 7.4 and Remark 7.5, J(X) corresponds exactly to the weak∗ -
continuous linear functionals on X∗ .

7.11. Proposition. Let X be a finite-dimensional Banach space. Then the
norm, weak and weak∗-topologies on X all coincide.
Proof. First we must decide what we mean by the weak∗-topology on X. Observe
that if dim X = n < ∞, then dim X∗ = n as well, and thus dim X∗∗ = n = dim X.
Since J : X → X∗∗ is a linear isometry, it must be a bijection in this case and therefore
we can identify X with X∗∗ = (X∗)∗. In this sense X ≃ J(X) comes equipped with a
weak∗ -topology induced by X∗ , namely the σ(J(X), X∗ )-topology. But since we are
identifying X with J(X) = X∗∗ , this is really just the σ(X, X∗ )-topology, namely the
weak topology on X.
Since the weak and the norm topologies on X are both TVS topologies, and
since finite-dimensional vector spaces admit a unique TVS topology, we see that all
three topologies cited above must coincide.
□
We now wish to examine some of the properties of the weak and weak∗ -topologies
in the context of normed linear spaces. We shall first require a result from Real
Analysis, which we shall then adapt to the setting of normed linear spaces.

7.12. Theorem. The Uniform Boundedness Principle


Let (X, d) be a complete metric space and let H ⊆ C(X, K) be a non-empty
family of continuous functions on X such that for each x ∈ X,
M_x := sup_{h∈H} |h(x)| < ∞.
Then there exist a non-empty open set G ⊆ X and a constant M > 0 so that
|h(x)| ≤ M for all h ∈ H, x ∈ G.

Proof. For each m ≥ 1, let E_{m,h} = {x ∈ X : |h(x)| ≤ m}, and let E_m = ∩_{h∈H} E_{m,h}.
Since each E_{m,h} is closed (as h ∈ H implies that h is continuous), so is E_m. Also,
for any x ∈ X, there exists an integer m > M_x, and so x ∈ E_m. Thus
X = ∪_{m=1}^∞ E_m.
Since X is complete, the Baire Category Theorem implies the existence of k ≥ 1
so that the interior int (E_k) of E_k is non-empty. Taking G = int (E_k) and M = k
yields the desired conclusion.
□

7.13. Corollary. The Uniform Boundedness Principle - Banach space version
Let (X, ‖·‖_X) and (Y, ‖·‖_Y) be Banach spaces and let A ⊆ B(X, Y) denote a
family of continuous linear operators from X to Y. Suppose that for each x ∈ X, we
have
M_x := sup{‖Tx‖ : T ∈ A} < ∞.
Then
sup{‖T‖ : T ∈ A} < ∞.

Proof. For each T ∈ A, let p_T : X → R be the continuous seminorm given by
p_T(x) = ‖Tx‖. Since X is complete, the metric space version of the Uniform Bound-
edness Principle (Theorem 7.12) implies that there exist an open set ∅ ≠ G ⊆ X
and a constant M > 0 so that

‖Tx‖ ≤ M for all T ∈ A, x ∈ G.

Now ∅ ≠ G open implies that there exist z ∈ G and δ > 0 so that V_δ(z) = {x ∈
X : ‖x − z‖ < δ} ⊆ G. Consider y ∈ V_δ(0) and T ∈ A.
Then

‖Ty‖ ≤ ‖T(y + z)‖ + ‖−Tz‖
≤ M + ‖Tz‖
≤ 2M,

as z and y + z ∈ G. It follows that if x ∈ X, ‖x‖ ≤ 1, then (δ/2)x ∈ V_δ(0), so

(δ/2) ‖Tx‖ = ‖T((δ/2)x)‖ ≤ M + ‖Tz‖ ≤ 2M,
and hence that
‖Tx‖ ≤ (2/δ)(2M), T ∈ A.
That is,
sup{‖T‖ : T ∈ A} ≤ 4M/δ.
□
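
Completeness of X cannot simply be dropped from this corollary. A standard illustration (our choice of example; it does not appear in the notes at this point) takes X = c00, the finitely supported sequences with the supremum norm, which is not complete, together with the functionals ϕ_n(x) = n x_n. The sketch below shows that the family is pointwise bounded but not uniformly bounded.

```latex
% X = c_{00} with \|\cdot\|_\infty (not complete), \varphi_n(x) := n x_n.
\begin{align*}
\sup_{n \ge 1} |\varphi_n(x)| &= \max_{n \le N} n|x_n| < \infty
   &&\text{if } x_n = 0 \text{ for } n > N, \\
\|\varphi_n\| &= \sup_{\|x\|_\infty \le 1} n|x_n| = n
   &&\text{(attained at } x = e_n\text{)}, \\
\sup_{n \ge 1} \|\varphi_n\| &= \infty.
\end{align*}
```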

7.14. Corollary. Let X be a Banach space and S ⊆ X. Then S is bounded if
and only if for all x∗ ∈ X∗,
sup{|x∗(s)| : s ∈ S} < ∞.

Proof. Suppose that S is bounded by M > 0. If x∗ ∈ X∗, then |x∗(s)| ≤ ‖x∗‖ ‖s‖ ≤
M ‖x∗‖ < ∞.
Conversely, if sup{|x∗(s)| : s ∈ S} < ∞, then sup{|ŝ(x∗)| : s ∈ S} < ∞ for all
x∗ ∈ X∗. By the Uniform Boundedness Principle,
sup{‖ŝ‖ : s ∈ S} = sup{‖s‖ : s ∈ S} < ∞.
□

7.15. Corollary. Let X be a Banach space and S ⊆ X∗. Then S is bounded if
and only if for all x ∈ X,
sup{|s∗(x)| : s∗ ∈ S} < ∞.

Proof. This is an immediate consequence of the Uniform Boundedness Principle,
Corollary 7.13.
□

In general, we do not expect topologies to be determined by sequential
convergence. If we do have convergence of a sequence, therefore, something strong is
implied.

7.16. Theorem. The Banach-Steinhaus Theorem


Let X and Y be Banach spaces and suppose that {T_n}_{n=1}^∞ ⊆ B(X, Y) is a sequence
which satisfies the property that for each x ∈ X, there exists y_x ∈ Y so that
lim_{n→∞} T_n x = y_x.
Then
(a) sup_n ‖T_n‖ < ∞,
(b) the map T : X → Y defined by T x = y_x is a bounded linear map, and
(c) ‖T‖ ≤ lim inf_n ‖T_n‖.
Proof. For each x ∈ X, we have that {T_n x}_{n=1}^∞ converges to some y_x, and therefore
it is bounded. That is, sup_{n≥1} ‖T_n x‖ < ∞ for each x ∈ X. By the Uniform
Boundedness Principle, M := sup_{n≥1} ‖T_n‖ < ∞, which establishes part (a).
Let T x := lim_n T_n x, for each x ∈ X. Linearity of T is readily checked. Also,
‖Tx‖ = lim_n ‖T_n x‖ ≤ lim inf_n ‖T_n‖ ‖x‖ = (lim inf_n ‖T_n‖) ‖x‖ ≤ M ‖x‖, x ∈ X.

Hence ‖T‖ ≤ lim inf_n ‖T_n‖ ≤ M < ∞, which completes parts (b) and (c).
□

7.17. Corollary. Let X be a Banach space.
(a) If (x_n)_{n=1}^∞ is a sequence which converges weakly to x ∈ X, then
(i) sup_n ‖x_n‖ < ∞; and
(ii) ‖x‖ ≤ lim inf_n ‖x_n‖.
(b) If (y∗_n)_{n=1}^∞ is a sequence which converges in the weak∗-topology to y∗ ∈ X∗,
then
(iii) sup_n ‖y∗_n‖ < ∞; and
(iv) ‖y∗‖ ≤ lim inf_n ‖y∗_n‖.
Proof.
(a) Consider the natural isometric embedding Γ : X → X∗∗ sending y to ŷ,
where ŷ(x∗) = x∗(y) for all x∗ ∈ X∗.
Letting Y = K, we may apply the Banach-Steinhaus Theorem above
to the sequence (x̂_n)_{n=1}^∞, noting that convergence of (x_n)_{n=1}^∞ in the weak
topology is simply the statement that lim_{n→∞} x̂_n(x∗) = x̂(x∗) exists for all
x∗ ∈ X∗. Since Γ is an isometry, the result readily follows.
(b) Again, this is an immediate application of the Banach-Steinhaus Theorem,
replacing X in that Theorem with X∗, and Y with K.
□

It is worth pointing out that by Theorem 7.9, if (x_n)_{n=1}^∞ converges weakly to x,
then x ∈ \overline{co}^{‖·‖}({x_n}_{n=1}^∞), since the latter is a convex set in X, closed in the norm
(and hence in the weak) topology.

Before considering our next example, let us first recall a result from Measure
Theory, alternately referred to as the Riesz Representation Theorem or the Riesz-
Markov Theorem.

7.18. Theorem. Let X be a locally compact, Hausdorff topological space, and
denote by M(X) the space of K-valued, finite, regular, Borel measures on X, equipped
with the total variation norm: ‖µ‖ = |µ|(X).
If µ ∈ M(X), then β_µ : C0(X, K) → K given by β_µ(f) = ∫_X f dµ is an element
of C0(X, K)∗, and the map Θ : M(X) → C0(X, K)∗, µ ↦ β_µ, is an isometric linear
isomorphism.
For example, if X = N with counting measure, then C0(X, K) = c0(N, K) and
M(X) = ℓ^1(N, K).
When X = [0, 1], we can in turn identify M([0, 1]) = C([0, 1], K)∗ with the space
BV[0, 1] of left-continuous functions of bounded variation on [0, 1].

7.19. Proposition. Let X be a compact, Hausdorff space. Then a sequence
{f_n}_{n=1}^∞ in C(X) converges weakly to f ∈ C(X) if and only if
(i) sup_n ‖f_n‖ < ∞; and
(ii) For each x ∈ X, (f_n(x))_{n=1}^∞ converges to f(x).

Proof. Suppose first that {f_n}_{n=1}^∞ converges weakly to f. By Corollary 7.17,
sup_n ‖f_n‖ < ∞. Let δ_x : C(X) → K be the evaluation functional, δ_x(f) = f(x) for
all x ∈ X. Then δ_x is linear and for f ∈ C(X), |δ_x(f)| = |f(x)| ≤ ‖f‖, so that
‖δ_x‖ ≤ 1 and δ_x ∈ C(X)∗. (In fact, δ_x corresponds to the point mass measure at x.)
Thus lim_{n→∞} δ_x(f_n) = δ_x(f), i.e. lim_{n→∞} f_n(x) = f(x).
Conversely, suppose that (i) and (ii) hold. If ρ ∈ C(X)∗, then by the Riesz
Representation Theorem above, there exists µ ∈ M(X) with ‖µ‖ = ‖ρ‖ so that
ρ(f) = ∫_X f dµ
for all f ∈ C(X). By the Lebesgue Dominated Convergence Theorem,
ρ(f) = ∫_X f dµ = lim_{n→∞} ∫_X f_n dµ = lim_{n→∞} ρ(f_n).

In other words, (f_n)_n converges weakly to f.
□
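
The proposition makes it easy to produce sequences in C(X) that converge weakly but not in norm. The following is a sketch with X = [0, 1] and a family of "tent" functions (our choice of example, not taken from the notes): conditions (i) and (ii) hold with f = 0, yet every f_n has norm one.

```latex
% f_n \in C([0,1]), n \ge 2: a "tent" of height 1 supported on [0, 2/n].
\[
f_n(t) =
\begin{cases}
  nt,     & 0 \le t \le 1/n, \\
  2 - nt, & 1/n \le t \le 2/n, \\
  0,      & 2/n \le t \le 1.
\end{cases}
\]
% (i)  \|f_n\|_\infty = f_n(1/n) = 1 for every n, so \sup_n \|f_n\| = 1.
% (ii) f_n(0) = 0, and for t > 0 we have f_n(t) = 0 as soon as n \ge 2/t.
% Hence f_n \to 0 weakly by the Proposition, although \|f_n - 0\|_\infty = 1.
```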

7.20. Theorem. Tychonoff’s Theorem

Suppose that (X_λ, T_λ)_{λ∈Λ} is a non-empty collection of compact topological spaces.
Then X = ∏_λ X_λ is compact in the product topology.
Proof. Recall from Real Analysis that it suffices to prove that if ℱ is a collection
of closed subsets of X with the Finite Intersection Property (FIP), then ∩{F : F ∈
ℱ} ≠ ∅. To that end, let ℱ be a collection of closed subsets of X with the FIP.

Let 𝔍 = {𝒥 ⊆ P(X) : ℱ ⊆ 𝒥 and 𝒥 has the FIP}, partially ordered by inclusion,
so that 𝒥_1 ≤ 𝒥_2 if 𝒥_1 ⊆ 𝒥_2. Since ℱ ∈ 𝔍, 𝔍 ≠ ∅. Suppose that 𝒞 = {𝒥_λ}_λ
is a chain in 𝔍. Clearly ℱ ⊆ 𝒦 := ∪_λ 𝒥_λ, and if H_1, H_2, ..., H_m ∈ 𝒦, then the fact
that 𝒞 is totally ordered implies that there exists λ0 so that H_1, H_2, ..., H_m ∈ 𝒥_{λ0}.
Since 𝒥_{λ0} has the FIP, ∩_{i=1}^m H_i ≠ ∅. Thus 𝒦 has the FIP, and so 𝒦 ∈ 𝔍 is an upper
bound for 𝒞. By Zorn’s Lemma, 𝔍 admits a maximal element, say ℳ.
We make two observations: first, if we set ℳ_0 = {∩_{k=1}^r M_k : M_k ∈ ℳ, 1 ≤ k ≤
r, r ≥ 1}, then the elements of ℳ_0 are finite intersections of elements of ℳ. It
follows that ℳ_0 has the FIP. Moreover, ℱ ⊆ ℳ_0. Since ℳ ≤ ℳ_0, the maximality
of ℳ implies that ℳ = ℳ_0. In other words, finite intersections of elements of ℳ
lie in ℳ.
Second, if R ⊆ X and R ∩ M ≠ ∅ for all M ∈ ℳ, then ℱ ⊆ ℳ ⊆ ℳ ∪ {R}
and ℳ ∪ {R} has the FIP. Again, the maximality of ℳ implies that R ∈ ℳ.

Our goal now is to prove that ∩{\overline{M} : M ∈ ℳ} ≠ ∅. Since each F ∈ ℱ is closed
and ℱ ⊆ ℳ, we have ∩{F : F ∈ ℱ} ⊇ ∩{\overline{M} : M ∈ ℳ}, and so this will suffice to
prove the Theorem.
For each λ, let π_λ : X → X_λ denote the canonical projection map. Then ∅ ≠
ℳ_λ := {π_λ(M) : M ∈ ℳ} is a family of subsets of X_λ with the FIP. Since X_λ is
compact, ∩{\overline{π_λ(M)}^{X_λ} : M ∈ ℳ} ≠ ∅. Choose x_λ ∈ ∩{\overline{π_λ(M)}^{X_λ} : M ∈ ℳ}, and
let x = (x_λ)_λ. We want to show that x ∈ ∩{\overline{M} : M ∈ ℳ}.
To do this, we must show that G ∈ U_x^X implies G ∩ M ≠ ∅ for all M ∈ ℳ.
Clearly it suffices to do this when G is a basic nbhd of x, say
G = ∩_{j=1}^n π_{λ_j}^{-1}(U_{λ_j}),
where U_{λ_j} ⊆ X_{λ_j} is open, 1 ≤ j ≤ n. Now for any λ0 ∈ Λ and x_{λ0} ∈ U_{λ0} ∈ T_{λ0},
x_{λ0} ∈ \overline{π_{λ0}(M)}^{X_{λ0}} for all M ∈ ℳ implies that U_{λ0} ∩ π_{λ0}(M) ≠ ∅ for all M ∈ ℳ.
But then π_{λ0}^{-1}(U_{λ0}) ∩ M ≠ ∅ whenever x_{λ0} ∈ U_{λ0} ⊆ X_{λ0} is open. By maximality
of ℳ and the second observation above, π_{λ0}^{-1}(U_{λ0}) ∈ ℳ. Since ℳ is closed under
finite intersections by the first observation,
G = ∩_{j=1}^n π_{λ_j}^{-1}(U_{λ_j}) ∈ ℳ whenever G is a basic nbhd of x.
Since ℳ has the FIP, G ∩ M ≠ ∅ for every M ∈ ℳ.
Thus x ∈ ∩{\overline{M} : M ∈ ℳ}, and we are done.
□

7.21. Theorem. The Banach-Alaoglu Theorem


Let X be a Banach space. Then the closed unit ball X∗_1 := {x∗ ∈ X∗ : ‖x∗‖ ≤ 1}
of X∗ is weak∗-compact.
Proof. For each x ∈ X, x∗ ∈ X∗_1, we have
|x̂(x∗)| = |x∗(x)| ≤ ‖x∗‖ ‖x‖ ≤ ‖x‖.
Thus x̂(X∗_1) ⊆ D_x := {z ∈ K : |z| ≤ ‖x‖}. Now each such D_x is compact, and so by
Tychonoff’s Theorem above,
D := ∏_{x∈X} D_x
is also compact in the product topology. To complete the proof, we shall show that
X∗_1 is homeomorphic to a closed, and therefore compact, subset of D.
Define
Φ : X∗_1 → D
f ↦ (x̂(f))_{x∈X} = (f(x))_{x∈X}.
Clearly Φ is injective. Now a net (f_λ)_{λ∈Λ} converges weak∗ to f if and only if
lim_λ f_λ(x) = lim_λ x̂(f_λ) = x̂(f) = f(x) for all x ∈ X,
that is, if and only if lim_λ Φ(f_λ) = Φ(f).
Thus X∗_1 is homeomorphic to Φ(X∗_1). There remains to show that Φ(X∗_1) is closed
in D.
Suppose that (f_λ)_λ is a net in X∗_1, and that (Φ(f_λ))_λ converges to d = (d_x)_{x∈X} ∈
D. Then
lim_λ f_λ(x) = d_x for all x ∈ X.
Define f(x) := d_x, x ∈ X. Then f is linear since each f_λ is, and
|f(x)| = |d_x| ≤ ‖x‖ for all x ∈ X,
so that f ∈ X∗_1. Clearly Φ(f) = lim_λ Φ(f_λ), so that ran Φ is closed, and we are
done.
□

7.22. Corollary. Every Banach space X is isometrically isomorphic to a
subspace of (C(L, K), ‖·‖_∞) for some compact, Hausdorff space L.
Proof. Let L := X∗_1. Then L is weak∗-compact, by the Banach-Alaoglu Theorem,
and is Hausdorff since X separates the points of X∗_1. Define
∆ : X → C(L, K)
x ↦ x̂|_L.
Then ∆ is easily seen to be linear, and ‖x̂|_L‖ ≤ ‖x̂‖ = ‖x‖.
By the Hahn-Banach Theorem [Corollary 6.30], there exists x∗ ∈ X∗_1 such that
|x∗(x)| = ‖x‖, and so ‖x̂|_L‖ ≥ |x̂(x∗)| = |x∗(x)| = ‖x‖; that is, ∆ is an isometry.
□

7.23. Corollary. Let X be a Banach space and suppose that A ⊆ X∗ is
weak∗-closed and bounded. Then A is weak∗-compact.

7.24. Theorem. Goldstine’s Theorem
Let X be a Banach space and J : X → X∗∗ denote the canonical embedding. Then
J(X_1) is weak∗-dense in X∗∗_1. Thus J(X) is weak∗-dense in X∗∗.
Proof. Clearly J(X_1) = X̂_1 is convex, since X_1 is. Observe that the closure of
J(X_1) in the weak∗-topology, namely \overline{J(X_1)}^{w∗}, is weak∗-closed and convex. Being
weak∗-closed in the weak∗-compact set X∗∗_1, it is also weak∗-compact. Suppose that
ϕ ∈ X∗∗_1 and ϕ ∉ \overline{J(X_1)}^{w∗}. Then, by the Hahn-Banach Theorem 6.41 (HB05), we
can find a weak∗-continuous linear functional \widehat{x∗} ∈ J(X∗) ⊆ X∗∗∗ so that
Re \widehat{x∗}(ϕ) = b > a := sup{Re \widehat{x∗}(ξ) : ξ ∈ \overline{J(X_1)}^{w∗}}
= sup{|\widehat{x∗}(ξ)| : ξ ∈ \overline{J(X_1)}^{w∗}}.
(The last equality follows from the fact that X_1 and hence J(X_1) and \overline{J(X_1)}^{w∗} are
balanced.)
But
sup{|\widehat{x∗}(ξ)| : ξ ∈ \overline{J(X_1)}^{w∗}} = sup{|ξ(x∗)| : ξ ∈ \overline{J(X_1)}^{w∗}}
≥ sup{|x̂(x∗)| : x ∈ X_1}
= sup{|x∗(x)| : x ∈ X_1}
= ‖x∗‖,
while
|Re \widehat{x∗}(ϕ)| ≤ |\widehat{x∗}(ϕ)| = |ϕ(x∗)| ≤ ‖ϕ‖ ‖x∗‖ ≤ ‖x∗‖.

This contradicts our choice of \widehat{x∗}, and thus \overline{J(X_1)}^{w∗} = X∗∗_1, as claimed.
Since X∗∗ = ∪_{n≥1} X∗∗_n, and since each J(X_n) is weak∗-dense in X∗∗_n by a routine
modification of the above proof, J(X) is weak∗-dense in X∗∗.
□

7.25. Example. By identifying c0(K)∗ with ℓ^1(K) and ℓ^1(K)∗ with ℓ^∞(K), we
see that the unit ball (c0(K))_1 of c0(K) is weak∗-dense in the closed unit ball of
ℓ^∞(K), and thus c0(K) is weak∗-dense in ℓ^∞(K).
Of course, c00(K) is norm dense in c0(K), and so c00(K) is also weak∗-dense in
ℓ^∞(K).
Culture: Although we shall not have time to prove this, the non-commutative
analogue of the above statement is that the set of finite rank operators F(H) on an
infinite-dimensional Hilbert space is weak∗ -dense in B(H).
Let us now establish a relation between compactness and reflexivity of a Banach
space.

7.26. Proposition. Let X be a Banach space. The following are equivalent.
(a) X is reflexive.
(b) X_1 is weakly compact.
Proof.
(a) implies (b): First suppose that X is reflexive. Then X̂ = X∗∗ and X̂_1 = X∗∗_1
is weak∗-compact by the Banach-Alaoglu Theorem 7.21. But then the weak∗-
topology on X̂_1 is just the weak topology on X_1, so X_1 is weakly compact.
(b) implies (a): Next suppose that X_1 is weakly compact. Then X̂_1 is weak∗-
compact, and since the weak∗-topology is Hausdorff, X̂_1 is weak∗-closed.
But by Goldstine’s Theorem, X̂_1 is weak∗-dense in X∗∗_1. Thus X̂_1 = Γ(X_1) =
\overline{Γ(X_1)}^{w∗} = X∗∗_1. This in turn implies that X̂ = X∗∗, or in other words, that
X is reflexive.
□
Although in general, weak topologies are not metrizable, sometimes their re-
strictions to bounded sets can be:
7.27. Theorem. Let X be a Banach space. Then X∗1 is weak∗ -metrizable if and
only if X is separable.
Proof. First assume that X is separable, and let {x_n}_{n=1}^∞ be a dense subset of X.
Define a metric d on X∗_1 via
d(x∗, y∗) = Σ_{n=1}^∞ |x∗(x_n) − y∗(x_n)| / (2^n ‖x_n‖).
Then a net (x∗_λ)_λ in X∗_1 converges in the metric topology to x∗ ∈ X∗_1 if and only if
(x∗_λ(x_n))_λ converges to x∗(x_n) for all n ≥ 1 (exercise). If x ∈ X and ε > 0, we can
choose n ≥ 1 so that ‖x_n − x‖ < ε/3. Choose λ0 ∈ Λ so that λ ≥ λ0 implies that
|x∗_λ(x_n) − x∗(x_n)| < ε/3. Then λ ≥ λ0 implies that
|x∗_λ(x) − x∗(x)| ≤ |x∗_λ(x) − x∗_λ(x_n)| + |x∗_λ(x_n) − x∗(x_n)| + |x∗(x_n) − x∗(x)|
≤ ‖x∗_λ‖ ‖x − x_n‖ + ε/3 + ‖x∗‖ ‖x_n − x‖
< ε/3 + ε/3 + ε/3 = ε.
Thus x∗_λ(x_n) converges to x∗(x_n) for all n ≥ 1 if and only if (x∗_λ)_λ converges in the
weak∗-topology to x∗. Hence the weak∗-topology on X∗_1 is metrizable.
Next, assume that X∗_1 is weak∗-metrizable. Then we can find a countable se-
quence {G∗_n}_{n=1}^∞ of weak∗-open nbhds of 0 ∈ X∗_1 so that ∩_{n=1}^∞ G∗_n = {0}. There is no
harm in assuming that each G∗_n is a basic weak∗-open nbhd, so for each n ≥ 1 there
exist ε_n > 0 and a finite set F_n ⊆ X so that

G∗_n = {x∗ ∈ X∗_1 : |x∗(x)| < ε_n, x ∈ F_n}.

Let F = ∪_{n=1}^∞ F_n. If x∗ ∈ X∗_1, x∗(F) = 0, then x∗ ∈ G∗_n for all n ≥ 1, and therefore
x∗ = 0. That is, if Y = \overline{span}^{‖·‖} F, then Y is separable and x∗ ∈ X∗_1, x∗|_Y = 0 implies
that x∗ = 0. By the Hahn-Banach Theorem [Corollary 6.29], Y = X.
□
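
Two routine points about the metric d used in the first half of this proof can be checked directly. The sketch below makes explicit an assumption the formula needs: each x_n is non-zero, which can always be arranged by discarding 0 from the dense set.

```latex
% Assumption: each x_n \neq 0, so that the quotients are defined.
\[
0 \le d(x^*, y^*)
  = \sum_{n=1}^\infty \frac{|(x^* - y^*)(x_n)|}{2^n \|x_n\|}
  \le \sum_{n=1}^\infty \frac{\|x^* - y^*\|\,\|x_n\|}{2^n \|x_n\|}
  = \|x^* - y^*\| \le 2
  \qquad (x^*, y^* \in X^*_1).
\]
% Definiteness: d(x^*, y^*) = 0 forces x^*(x_n) = y^*(x_n) for all n,
% and density of \{x_n\} plus continuity then gives x^* = y^*.
```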

7.28. Corollary. Let X be a separable Banach space. Then X∗_1 is separable in
the weak∗-topology.
Proof. By the Banach-Alaoglu Theorem, X∗_1 is weak∗-compact. By Theorem 7.27
above, X∗_1 is weak∗-metrizable.
Since a compact metric space is always separable – see Proposition 11.10 – we
see that (X∗_1, σ(X∗, X)) is separable.
□

In a similar vein, we have

7.29. Theorem. Let X be a Banach space. Then X_1 is weakly metrizable if
and only if X∗ is separable.
Proof. Assignment.
□

7.30. Definition. Let X be a Banach space and M ⊆ X, N ⊆ X∗. Then the annihilator of M is the set
\[ M^{\perp} = \{x^* \in X^* : x^*(m) = 0 \text{ for all } m \in M\}, \]
while the pre-annihilator of N is the set
\[ {}^{\perp}N = \{x \in X : n^*(x) = 0 \text{ for all } n^* \in N\}. \]
Observe that M⊥ and ⊥N are linear manifolds in their respective spaces. Moreover, both are norm-closed and hence Banach spaces in their own right.

7.31. Theorem. Let X be a Banach space, and let M ⊆ X be a closed subspace. Let q : X → X/M denote the canonical quotient map. Then
\[ \Theta : (X/M)^* \to M^{\perp}, \qquad \xi \mapsto \xi \circ q \]
is an isometric isomorphism of Banach spaces.
Proof. Clearly Θ is linear. Let us show that Θ is injective.
If Θ(ξ1 ) = ξ1 ◦ q = ξ2 ◦ q = Θ(ξ2 ), then
ξ1 (q(x)) = ξ2 (q(x)) for all x ∈ X,
and so ξ1 = ξ2 .
Next we show that Θ is surjective.
Let z∗ ∈ M⊥ and define ξ_{z∗} : X/M → K via ξ_{z∗}(q(x)) = z∗(x). Since M ⊆ ker z∗, the map is well-defined. Furthermore, if x ∈ X and ‖q(x)‖ < 1, then there exists m ∈ M so that ‖x + m‖ < 1, and
\[ |\xi_{z^*}(q(x))| = |z^*(x)| = |z^*(x + m)| \le \|z^*\|, \]
so that ‖ξ_{z∗}‖ ≤ ‖z∗‖ < ∞. Hence ξ_{z∗} ∈ (X/M)∗. Clearly Θ(ξ_{z∗}) = z∗.
Thus Θ is bijective, and ‖Θ(ξ)‖ = ‖ξ ∘ q‖ ≤ ‖ξ‖ ‖q‖ ≤ ‖ξ‖, so that ‖Θ‖ ≤ 1.
Conversely, let ε > 0 and choose q(x) ∈ X/M with ‖q(x)‖ < 1 so that |ξ(q(x))| ≥ ‖ξ‖ − ε. Choose m ∈ M so that ‖x + m‖ < 1. Then
\[ \|\xi \circ q\| \ge |\xi \circ q(x + m)| = |\xi(q(x))| \ge \|\xi\| - \varepsilon. \]
Since ε > 0 was arbitrary, ‖Θ(ξ)‖ = ‖ξ ∘ q‖ ≥ ‖ξ‖, implying that Θ is in fact isometric.
□

7.32. Theorem. Let X be a Banach space and M ⊆ X be a closed linear subspace. Then the map
\[ \Theta : X^*/M^{\perp} \to M^*, \qquad x^* + M^{\perp} \mapsto x^*|_{M} \]
is an isometric isomorphism.
Proof. Note that M⊥ closed implies that X∗ /M⊥ is a Banach space. We check
that Θ is well-defined.
If x∗ + M⊥ = y ∗ + M⊥ , then x∗ − y ∗ ∈ M⊥ , so that (x∗ − y ∗ )|M = 0. That is,
Θ(x∗ + M⊥ ) = Θ(y ∗ + M⊥ ). Working our way backwards through this argument
proves that Θ is injective. That Θ is linear is easily verified.
Next suppose that m∗ ∈ M∗. By the Hahn-Banach Theorem, we can find x∗ ∈ X∗ with ‖x∗‖ = ‖m∗‖ so that x∗|_M = m∗. Then Θ(x∗ + M⊥) = x∗|_M = m∗, so that Θ is onto. Thus Θ is bijective.
Suppose that ‖x∗ + M⊥‖ < 1. Then there exists n∗ ∈ M⊥ so that ‖x∗ + n∗‖ < 1. Thus
\[ \|\Theta(x^* + M^{\perp})\| = \|x^*|_{M}\| = \|(x^* + n^*)|_{M}\| \le \|x^* + n^*\| < 1. \]
It follows that ‖Θ‖ ≤ 1.
From above, given m∗ ∈ M∗, there exists x∗ ∈ X∗ with ‖x∗‖ = ‖m∗‖ so that Θ^{−1}(m∗) = x∗ + M⊥. Now
\[ \|\Theta^{-1}(m^*)\| = \|x^* + M^{\perp}\| \le \|x^*\| = \|m^*\|, \]
so that Θ^{−1} is also contractive. But then Θ is isometric, and we are done.
□

7.33. It is a worthwhile exercise to think about the relationship between the an-
nihilator M⊥ of a subspace M of a Hilbert space H and the orthogonal complement
of M in H, for which we used the same notation.
In particular, one should interpret what Theorem 7.32 says in the Hilbert space
setting, where M⊥ refers to the orthogonal complement of M.

If you had a face like mine, you’d punch me right on the nose, and
I’m just the fella to do it.
Stan Laurel

Appendix to Section 7.

The following result characterizes weak convergence of sequences in ℓ^p. We leave its proof as an exercise for the reader.

7.34. Proposition. Suppose that 1 < p < ∞. A sequence (x_n)_{n=1}^∞ in ℓ^p(N) (i.e. each x_n = (x_{n1}, x_{n2}, x_{n3}, ...) ∈ ℓ^p(N)) converges weakly to z = (z_1, z_2, z_3, ...) ∈ ℓ^p(N) if and only if
(i) sup_{n≥1} ‖x_n‖ < ∞, and
(ii) lim_{n→∞} x_{nk} = z_k for all k ∈ N.
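
As a sanity check on Proposition 7.34, the standard basis vectors e_n of ℓ^2 satisfy (i) and (ii) with z = 0, so they converge weakly to 0 although ‖e_n‖ = 1 for every n. The Python sketch below illustrates this numerically by pairing e_n with one fixed vector y. The choice y = (1/k)_k, the truncation length DIM and all function names are assumptions made here for illustration; they do not appear in the notes.

```python
# Illustration only: pair the standard basis vectors e_n of l^2 with the
# fixed vector y = (1, 1/2, 1/3, ...), truncated to DIM coordinates.
DIM = 1000  # truncation length (an assumption of this sketch)


def basis_vector(n, dim=DIM):
    """Return e_n (1-indexed) as a list of length dim."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]


def pairing(x, y):
    """The real l^2 pairing <x, y> = sum_k x_k * y_k."""
    return sum(a * b for a, b in zip(x, y))


def norm2(x):
    """The l^2 norm of a finite real vector."""
    return pairing(x, x) ** 0.5


y = [1.0 / k for k in range(1, DIM + 1)]

for n in (1, 10, 100, 1000):
    e_n = basis_vector(n)
    # <e_n, y> = 1/n tends to 0, while ||e_n|| stays equal to 1.
    print(n, pairing(e_n, y), norm2(e_n))
```

Testing against a single y does not prove weak convergence, of course; it only makes visible the gap between coordinatewise and norm convergence that the Proposition describes.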

Exercises for Section 7.

Question 1.
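Let X be a Banach space and C ⊆ X be convex. Prove the following.
(a) The norm closure of C coincides with its weak closure: \( \overline{C}^{\|\cdot\|} = \overline{C}^{\mathrm{weak}} \).
(b) C is norm-closed if and only if C is weakly closed.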

Question 2.
Let X be a Banach space. Prove that X1 is weakly metrizable if and only if X∗
is separable.

Question 3.
Give a direct proof (i.e. without appealing to Goldstine’s Theorem) that c0 is
weak∗ -dense in `∞ .

Question 4.
Let (X, τ ) be a topological space. A function f : X → R is said to be lower
semicontinuous if for all a ∈ R, f −1 (a, ∞) is open in X.
(a) Show that every continuous function from X into R is lower semicontinuous.
(b) Let (X, ‖·‖) be a Banach space. Prove that the function
f : (X, σ(X, X∗)) → R
x ↦ ‖x‖
is lower semicontinuous. That is, the norm on X is lower semicontinuous
for the weak topology on X.
(c) Let (X, ‖·‖) be a Banach space. Prove that the function
g : (X∗, σ(X∗, X)) → R
x∗ ↦ ‖x∗‖
is lower semicontinuous. That is, the norm on X∗ is lower semicontinuous
for the weak∗-topology on X∗.

8. Extremal points

Somewhere on this globe, every ten seconds, there is a woman giving
birth to a child. She must be found and stopped.
Sam Levenson

8.1. The main result of this section is the Krein-Milman Theorem, which asserts
that a non-empty, compact, convex subset of a LCS has extreme points; so many,
in fact, that we can generate the compact, convex set as the closed, convex hull of
these extreme points.
Extreme points of convex sets appear in many different contexts in Functional
Analysis. For example, it is an interesting exercise (so interesting that it may appear
as an Assignment question) to calculate the extreme points of the closed unit ball
B(Cn )1 of the locally convex space B(Cn ), where Cn is endowed with the Euclidean
norm k · k2 and B(Cn ) is given the operator norm.
Recall that a linear map T ∈ B(C^n) is said to be positive and we write T ≥ 0
if ⟨T x, x⟩ ≥ 0 for all x ∈ C^n. An equivalent formulation of this property says that
T is positive if there exists an orthonormal basis for C^n with respect to which the
matrix [T] of T is diagonal, and all eigenvalues of T are non-negative real numbers.
A linear functional ϕ ∈ B(C^n)∗ is said to be positive if ϕ(T) ≥ 0 whenever T ≥ 0.
For example, if {e_1, e_2, ..., e_n} is an orthonormal basis for C^n, then the so-called
normalized trace functional
\[ \tau(T) := \frac{1}{n} \sum_{k=1}^{n} \langle T e_k, e_k \rangle \]
for T ∈ B(C^n) can be shown to be a positive linear functional of norm one.
The state space S(B(Cn )) of B(Cn ), consisting of all positive, norm-one linear
functionals on B(Cn ) – called states – forms a non-empty, compact, convex subset
of B(Cn )∗ . The extreme points of the state space are called pure states. For
example, if x ∈ C^n and ‖x‖ = 1, then the map
\[ \varphi_x : B(\mathbb{C}^n) \to \mathbb{C}, \qquad T \mapsto \langle T x, x \rangle \]
defines a pure state. States on B(Cn ) (and more generally states on so-called
C ∗ -algebras) are of extreme importance in determining the representation theory
of these algebras. This, however, is beyond the scope of the present manuscript.
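
To make the relationship between the normalized trace and the vector states concrete, here is a small Python sketch on B(C^3), identified with 3 × 3 complex matrices. It checks, for one sample matrix, that τ is the average of the vector states attached to the standard basis. The sample matrix and the helper names are assumptions chosen for illustration only.

```python
# Illustration only: on B(C^n) = M_n, the normalized trace is the average of
# the vector states T -> <T e_k, e_k> attached to the standard basis.


def vector_state(T, x):
    """Return <T x, x> = sum_i (T x)_i * conj(x_i) for a square matrix T."""
    n = len(x)
    Tx = [sum(T[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(Tx[i] * x[i].conjugate() for i in range(n))


def normalized_trace(T):
    """Return tau(T) = (1/n) * sum_k T[k][k]."""
    n = len(T)
    return sum(T[k][k] for k in range(n)) / n


def standard_basis(n):
    """Return the standard basis of C^n as a list of complex vectors."""
    return [[1 + 0j if i == k else 0j for i in range(n)] for k in range(n)]


T = [[2 + 0j, 1j, 0j],
     [-1j, 3 + 0j, 1 + 0j],
     [0j, 1 + 0j, 4 + 0j]]  # an arbitrary sample matrix

average_of_states = sum(vector_state(T, e) for e in standard_basis(3)) / 3
print(normalized_trace(T), average_of_states)
```

Since the states attached to distinct basis vectors are distinct when n ≥ 2, this exhibits τ as a non-trivial convex combination of states, so τ is an example of a state that is not pure.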
8.2. Definition. Let V be a vector space and C ⊆ V be a convex set. A point
e ∈ C is called an extreme point of C if whenever there exist x, y ∈ C and t ∈ (0, 1)
for which
e = tx + (1 − t)y,
it follows that x = y = e. We denote by Ext(C) the (possibly empty) set of all
extreme points of C.

8.3. Example.
(a) Let V = C. Let D = {w ∈ C : |w| < 1} denote the open disk. It is easy
to see that D is convex. However D has no extreme points. If 0 ≠ w ∈ D, then
|w| < 1, so there exists δ > 0 so that (1 + δ)|w| < 1. Let x = (1 + δ)w,
y = (1 − δ)w. Then x, y ∈ D, x ≠ y, and w = (1/2)x + (1/2)y. If w = 0, take x = 1/2 and y = −1/2 instead.
(b) With V = C again, let D = {w ∈ C : |w| ≤ 1}. Then every z ∈ T :=
{z ∈ C : |z| = 1} is an extreme point of D. The proof of this is left as an
exercise.
(c) Let V = R2 , and let p1 , p2 , p3 be three non-collinear points in V . The
triangle T whose vertices are p1 , p2 , p3 has exactly {p1 , p2 , p3 } as its set of
extreme points.
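
The argument in part (a) is constructive, and the following Python sketch carries it out with complex floating-point numbers. The function name and the particular choice of δ are assumptions of this sketch; any δ > 0 with (1 + δ)|w| < 1 would do.

```python
# Illustration only: every point of the open unit disk is the midpoint of two
# distinct points of the disk, so it cannot be an extreme point.


def split_interior_point(w):
    """Given complex w with |w| < 1, return distinct x, y in the open disk
    with w = (x + y) / 2."""
    if abs(w) >= 1:
        raise ValueError("w must lie in the open unit disk")
    if w == 0:
        # The scaling trick degenerates at the origin; use +-1/2 instead.
        return complex(0.5), complex(-0.5)
    # Choose delta in (0, 1) with (1 + delta) * |w| <= (1 + |w|) / 2 < 1.
    delta = min(0.5, (1 - abs(w)) / (2 * abs(w)))
    return (1 + delta) * w, (1 - delta) * w


for w in (0j, 0.3 + 0.4j, -0.99j):
    x, y = split_interior_point(w)
    print(w, x, y, (x + y) / 2)
```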

The following generalizes the concept of an extreme point.

8.4. Definition. Let V be a vector space and let ∅ ≠ C ⊆ V be convex. A non-empty convex set F ⊆ C is called a face of C if whenever x, y ∈ C and t ∈ (0, 1) satisfy tx + (1 − t)y ∈ F, then x, y ∈ F.
We emphasize that the convexity of F is part of the definition of a face.

8.5. Remarks. Let V be a vector space and C ⊆ V be convex.


(a) If e is an extreme point of C, then F = {e} is a face of C. Conversely, if
F = {z} is a face of C, then z ∈ Ext(C).
(b) Let F be a face of C, and let D be a face of F . Then D is a face of C.
Indeed, let x, y ∈ C and t ∈ (0, 1), and suppose that tx + (1 − t)y ∈ D.
Then D ⊆ F implies that tx + (1 − t)y ∈ F . Since F is a face of C, we
must have x, y ∈ F . But then D is a face of F , and so it follows from
tx + (1 − t)y ∈ D that x, y ∈ D.
(c) From (b), it follows that if e is an extreme point of a face F of C, then e
is an extreme point of C.

8.6. Example.
(a) Let V = R2 and let p1 , p2 , p3 be three non-collinear points in V . Denote
by T the triangle whose vertices are p1 , p2 , p3 . Then T is a face of itself.
Also, each line segment pi pj is a face of T . Finally, each extreme point pj
is a face of T .
(b) Let V = R^3 and C be a cube in V, e.g.,
C = {(x, y, z) ∈ R^3 : 0 ≤ x, y, z ≤ 1}.
Then C has itself as a face. Also, the 6 (square) sides of the cube are faces.
The 12 edges of the cube are also faces, as are the 8 corners. The corners
are extreme points of the cube.

The definition of a face currently requires us to consider convex combinations of two elements of C. In fact, we may consider arbitrary finite convex combinations of elements of C.

8.7. Lemma. Let V be a vector space, ∅ ≠ C ⊆ V be convex and ∅ ≠ F ⊆ C be a face of C. Suppose that {x_j}_{j=1}^n ⊆ C and that x = Σ_{j=1}^n t_j x_j is a convex combination of the x_j’s. If x ∈ F and t_j ∈ (0, 1) for all 1 ≤ j ≤ n, then x_j ∈ F for all 1 ≤ j ≤ n.
Proof. We argue by induction on n. The assumption that t_j ∈ (0, 1) for all j requires that n ≥ 2, and the case n = 2 is nothing more than the definition of a face. Let k ≥ 3, and suppose that the result is true for n < k.
Suppose that x = Σ_{j=1}^k t_j x_j ∈ F, where t_j ∈ (0, 1) for all 1 ≤ j ≤ k and Σ_{j=1}^k t_j = 1. Then
\[ x = (1 - t_k) \left( \sum_{j=1}^{k-1} \frac{t_j}{1 - t_k} \, x_j \right) + t_k x_k. \]
Since C is convex, y := Σ_{j=1}^{k−1} (t_j/(1 − t_k)) x_j ∈ C. But then x = (1 − t_k)y + t_k x_k ∈ F, and F is a face, so that y and x_k must lie in F. Since y ∈ F and the coefficients t_j/(1 − t_k), 1 ≤ j ≤ k − 1, lie in (0, 1) and sum to 1, our induction hypothesis next implies that x_j ∈ F for all 1 ≤ j ≤ k − 1, which completes the proof.
□
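
The regrouping at the heart of the induction step can be checked mechanically. The sketch below does so in R^2 with exact rational arithmetic, for one sample choice of weights and points; the data and the helper names are hypothetical and serve only to illustrate the identity x = (1 − t_k)y + t_k x_k.

```python
from fractions import Fraction

# Illustration only: the regrouping used in the induction step, carried out
# in R^2 with exact rational arithmetic.


def convex_combination(weights, points):
    """Return sum_j weights[j] * points[j] for points in R^2."""
    return tuple(sum(t * p[i] for t, p in zip(weights, points))
                 for i in range(2))


def split_off_last(weights, points):
    """Rewrite x = sum_j t_j x_j as x = (1 - t_k) * y + t_k * x_k, where y is
    the convex combination of x_1, ..., x_{k-1} with weights t_j / (1 - t_k).
    Returns the pair (rescaled_weights, y)."""
    t_k = weights[-1]
    rescaled = [t / (1 - t_k) for t in weights[:-1]]
    return rescaled, convex_combination(rescaled, points[:-1])


t = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # sample weights in (0, 1)
pts = [(Fraction(0), Fraction(0)),
       (Fraction(3), Fraction(0)),
       (Fraction(0), Fraction(6))]                    # sample points in R^2

x = convex_combination(t, pts)
rescaled, y = split_off_last(t, pts)
recombined = tuple((1 - t[-1]) * y[i] + t[-1] * pts[-1][i] for i in range(2))
print(x, rescaled, y, recombined)
```

The rescaled weights again lie in (0, 1) and sum to 1, which is exactly what allows the induction hypothesis to be applied to y.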

8.8. Lemma. Let (V, T) be a LCS and ∅ ≠ K ⊆ V be a compact, convex set. Let ρ ∈ V∗, and set
r = sup{Re ρ(w) : w ∈ K}.
Then F = {x ∈ K : Re ρ(x) = r} is a non-empty, compact face of K.
Proof. Since Re ρ : K → R is continuous and K is compact, r = max{Re ρ(w) :
w ∈ K}, and so F is non-empty. Moreover, F = (Re ◦ ρ)−1 ({r}) and {r} ⊆ R is
closed, so F is closed in K, and hence F is compact.
Next, observe that if x, y ∈ F ⊆ K and t ∈ (0, 1), then tx + (1 − t)y ∈ K as K
is convex. But Re ρ(tx + (1 − t)y) = tRe ρ(x) + (1 − t)Re ρ(y) = tr + (1 − t)r = r,
so that tx + (1 − t)y ∈ F and F is convex.
Suppose that x, y ∈ K, t ∈ (0, 1), and tx + (1 − t)y ∈ F . As before,
r = Re ρ(tx + (1 − t)y)
= t Re ρ(x) + (1 − t) Re ρ(y).
But Re ρ(x) ≤ r, Re ρ(y) ≤ r, so the only way that equality can hold is if x, y ∈ F .
Hence F is a face of K.
□

The following result is a crucial step in the proof of the Krein-Milman Theorem.

8.9. Lemma. Let (V, T) be a LCS and ∅ ≠ K ⊆ V be a compact, convex set. Then Ext(K) ≠ ∅.
Proof. Let J = {F ⊆ K : ∅ ≠ F is a closed face of K}, and partially order J by reverse inclusion: i.e. F_1 ≤ F_2 if F_2 ⊆ F_1. Observe that K ∈ J and so J ≠ ∅.
Suppose that C = {F_λ}_{λ∈Λ} is a chain in J. We claim that F = ∩_{λ∈Λ} F_λ is an upper bound for C. Since {F_λ}_{λ∈Λ} has the Finite Intersection Property and K is compact, F ≠ ∅. Moreover, each F_λ is assumed to be closed and convex, and thus so is F. Suppose that x, y ∈ K, t ∈ (0, 1), and tx + (1 − t)y ∈ F. Then tx + (1 − t)y ∈ F_λ for each λ. But F_λ is a face of K, so x, y ∈ F_λ for all λ, whence x, y ∈ F and F is a face of K. Clearly it is an upper bound for C.
By Zorn’s Lemma, J contains a maximal element, say E. Since E ∈ J, it is non-empty, convex, and closed in K, hence compact. We claim that E is a singleton set, and therefore corresponds to an extreme point of K.
Suppose to the contrary that there exist x, y ∈ E with x ≠ y. By the Hahn-Banach Theorem 05 (Theorem 6.41), there exists a continuous linear functional ϕ ∈ V∗ so that
Re ϕ(x) > Re ϕ(y).
Since E is non-empty, convex and compact, we can apply Lemma 8.8. Let r = sup{Re ϕ(w) : w ∈ E}, and set H = {w ∈ E : Re ϕ(w) = r}. Then H is a non-empty, compact face of E, and hence of K.
But Re ϕ(y) < Re ϕ(x) ≤ r, so y does not belong to H, and so E < H, contradicting the maximality of E. Thus E = {e} is a singleton set, and e ∈ Ext(E) ⊆ Ext(K), proving that the latter is non-empty.
□

8.10. Theorem. The Krein-Milman Theorem
Let (V, T) be a LCS and ∅ ≠ K ⊆ V be a compact, convex set. Then
\[ K = \overline{\mathrm{co}}(\mathrm{Ext}(K)), \]
the closed, convex hull of the extreme points of K.
Proof. By Lemma 8.9, Ext(K) ≠ ∅. Thus \( \emptyset \ne \overline{\mathrm{co}}(\mathrm{Ext}(K)) \subseteq K \), as K is closed and convex.
Suppose that \( m \in K \setminus \overline{\mathrm{co}}(\mathrm{Ext}(K)) \). By the Hahn-Banach Theorem (Theorem 6.41), there exist τ ∈ V∗ and real numbers α > β so that
\[ \mathrm{Re}\, \tau(m) \ge \alpha > \beta \ge \mathrm{Re}\, \tau(b) \quad \text{for all } b \in \overline{\mathrm{co}}(\mathrm{Ext}(K)). \]
Let s := sup{Re τ(w) : w ∈ K}. Then s ≥ Re τ(m) ≥ α, and L := {z ∈ K : Re τ(z) = s} is a non-empty, compact face of K, by Lemma 8.8. But then ∅ ≠ L is a compact, convex set in V, and so by Lemma 8.9, Ext(L) ≠ ∅. Furthermore, Ext(L) ⊆ Ext(K), by virtue of the fact that L is a face of K (see Remark 8.5 (c)). Hence there exists \( e \in \mathrm{Ext}(L) \subseteq \overline{\mathrm{co}}(\mathrm{Ext}(K)) \) so that
\[ \mathrm{Re}\, \tau(e) = s \ge \alpha > \mathrm{Re}\, \tau(b) \quad \text{for all } b \in \overline{\mathrm{co}}(\mathrm{Ext}(K)), \]
an obvious contradiction (take b = e).
It follows that \( K \setminus \overline{\mathrm{co}}(\mathrm{Ext}(K)) = \emptyset \), and thus \( K = \overline{\mathrm{co}}(\mathrm{Ext}(K)) \).
□

8.11. Corollary. Let (V, T) be a LCS and ∅ ≠ K ⊆ V be a compact, convex set. If ρ ∈ V∗, then there exists e ∈ Ext(K) so that
Re ρ(w) ≤ Re ρ(e) for all w ∈ K.
Proof. Let r := sup{Re ρ(w) : w ∈ K}. By Lemma 8.8, F = {x ∈ K : Re ρ(x) = r} is a non-empty, compact face of K. By Lemma 8.9, Ext(F) ≠ ∅. Let e ∈ Ext(F). Then e ∈ Ext(K) by Remark 8.5 (c), and
Re ρ(w) ≤ r = Re ρ(e) for all w ∈ K.
□
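
In finite dimensions the Corollary says that a linear functional on a compact convex set attains its maximum at an extreme point. The Python sketch below illustrates this on the cube [0, 1]^3 from Example 8.6 (b): the best of the 8 corners is compared against the largest value found on randomly sampled points of the cube. The coefficients of the functional, the seed and the sample size are arbitrary choices made for this illustration.

```python
import itertools
import random

# Illustration only: for the cube K = [0, 1]^3 in R^3 and a real linear
# functional rho(w) = c . w, the maximum over K is attained at a corner.


def rho(c, w):
    """Evaluate the linear functional with coefficient vector c at w."""
    return sum(ci * wi for ci, wi in zip(c, w))


def best_corner(c):
    """Return a corner of [0, 1]^3 maximizing rho(c, .) among the 8 corners."""
    corners = itertools.product((0.0, 1.0), repeat=3)
    return max(corners, key=lambda e: rho(c, e))


random.seed(0)                      # fixed seed so the run is reproducible
c = (2.0, -1.0, 0.5)                # sample coefficients (an assumption)
e = best_corner(c)
samples = [tuple(random.random() for _ in range(3)) for _ in range(10_000)]
largest_sampled = max(rho(c, w) for w in samples)
print(e, rho(c, e), largest_sampled)
```

Random sampling only gives evidence, not a proof; the point is that no sampled value exceeds the value at the extreme point e.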
Equipped with the Krein-Milman Theorem 8.10 above, we are able to extend
Corollary 7.23.

8.12. Corollary. Let X be a Banach space and suppose that A ⊆ X∗ is weak∗-closed and bounded. Then A is weak∗-compact. If A is also convex, then A is the weak∗-closed convex hull of its extreme points:
\[ A = \overline{\mathrm{co}}^{\,w^*}(\mathrm{Ext}\, A). \]

Appendix to Section 8.

8.13. Convexity and convex hulls. It was anticipated that everyone would
have seen the concept of convexity and of convex hulls. Hopefully, therefore, the
following remarks will serve as a refresher for everyone.
Let V be a vector space over K. Recall that a subset E of V is said to be convex if x, y ∈ E and 0 ≤ t ≤ 1 implies that tx + (1 − t)y ∈ E. A simple finite induction argument shows that this definition is equivalent to requiring that if n ≥ 1, t_k ∈ [0, 1], x_k ∈ E for all 1 ≤ k ≤ n and Σ_{k=1}^n t_k = 1, then Σ_{k=1}^n t_k x_k ∈ E.
A sum of the form
\[ \sum_{k=1}^{n} t_k x_k, \]
where 0 ≤ t_k ≤ 1 and Σ_{k=1}^n t_k = 1, is called a convex combination of the x_k’s.
If {Eλ }λ is a collection of convex subsets of V, then F := ∩λ Eλ is once again
convex. For if x, y ∈ F and 0 ≤ t ≤ 1, then for each λ, we have that x, y ∈ Eλ and
Eλ convex implies that tx + (1 − t)y ∈ Eλ , whence tx + (1 − t)y ∈ ∩λ Eλ = F .

Now let H ⊆ V be a set. The convex hull co(H) of H is the set

co(H) = ∩{K ⊆ V : H ⊆ K and K is convex}.


It readily follows from the definition that co(H) is the smallest convex subset of
V which contains H. Note that V is itself convex and clearly H ⊆ V, so that the
intersection on the right takes place over a non-empty collection.

Let us define
\[ F = \left\{ \sum_{k=1}^{n} t_k h_k : n \ge 1,\ t_k \ge 0,\ h_k \in H,\ 1 \le k \le n,\ \sum_{k=1}^{n} t_k = 1 \right\}. \]
Observe that h = 0h + 1h ∈ F for all h ∈ H, so that H ⊆ F.
If x = Σ_{k=1}^n t_k h_k and y = Σ_{j=1}^m s_j h'_j are elements of F (with Σ_{k=1}^n t_k = 1 = Σ_{j=1}^m s_j), then for 0 ≤ r ≤ 1,
\[ \begin{aligned} rx + (1-r)y &= r \left( \sum_{k=1}^{n} t_k h_k \right) + (1-r) \left( \sum_{j=1}^{m} s_j h_j' \right) \\ &= \sum_{k=1}^{n} (r t_k) h_k + \sum_{j=1}^{m} ((1-r) s_j) h_j'. \end{aligned} \]
But
\[ \sum_{k=1}^{n} r t_k + \sum_{j=1}^{m} (1-r) s_j = r \left( \sum_{k=1}^{n} t_k \right) + (1-r) \left( \sum_{j=1}^{m} s_j \right) = r + (1-r) = 1, \]
and so rx + (1 − r)y ∈ F. Thus H ⊆ F and F is convex, so that co(H) ⊆ F.
Furthermore, if K is any convex set which contains H, then from the first paragraph, F ⊆ K. Thus F ⊆ co(H).
In other words, a second description of the convex hull of H is:
\[ \mathrm{co}(H) = \left\{ \sum_{k=1}^{n} t_k h_k : n \ge 1,\ t_k \ge 0,\ h_k \in H,\ 1 \le k \le n,\ \sum_{k=1}^{n} t_k = 1 \right\}. \]
That is, co(H) consists of all convex combinations of elements of H.
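
The bookkeeping in the convexity argument above amounts to the statement that merging two weight vectors, scaled by r and 1 − r, again gives non-negative weights summing to 1. A minimal Python sketch with exact fractions follows; the function name and the sample weights are hypothetical.

```python
from fractions import Fraction


def merge_weights(r, t, s):
    """Weights of r*x + (1 - r)*y when x is a convex combination with
    weights t and y is a convex combination with weights s."""
    return [r * tk for tk in t] + [(1 - r) * sj for sj in s]


t = [Fraction(1, 4), Fraction(3, 4)]                   # weights of x
s = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]   # weights of y
merged = merge_weights(Fraction(2, 5), t, s)
print(merged, sum(merged))
```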

I want to go back to Brazil, get married, have lots of kids, and just
be a couch tomato.
Ana Beatriz Barros

Exercises for Section 8.

Question 1.
Prove that there does not exist a Banach space X such that X∗ = (c_0, ‖·‖_∞).

Question 2.
Let N ≥ 1 be an integer. Find the extreme points of the closed unit ball of
(B(C^N), ‖·‖), where ‖·‖ denotes the operator norm.

Question 3.
Let H = ℓ^2 and let {e_n}_{n=1}^∞ denote the standard onb for H. Let S ∈ B(H) be
the unilateral forward shift defined by Se_n = e_{n+1}, n ≥ 1.
Prove or disprove that S is an extreme point of the closed unit ball of B(H).

Question 4.
Find the extreme points of the closed unit ball of each of the following spaces:
(a) (ℓ^1, ‖·‖_1);
(b) (ℓ^p, ‖·‖_p), 1 < p < ∞;
(c) (ℓ^∞, ‖·‖_∞).

9. The chapter of named theorems

I believe that sex is one of the most beautiful, natural, wholesome
things that money can buy.
Steve Martin

9.1. In general, if f : X → Y is a continuous map between topological spaces X and Y, one does not expect f to take open sets to open sets. Despite this, we have seen that if V is a TVS and W is a closed subspace of V, then the quotient map does just this.
The Open Mapping Theorem extends this result to surjections of Banach spaces.
Many of the theorems in this Chapter are a consequence - either direct or indirect
- of the Open Mapping Theorem. We begin with a Lemma which will prove crucial
in the proof of the Open Mapping Theorem.
9.2. Lemma. Let X and Y be Banach spaces and suppose that T ∈ B(X, Y). If \( Y_1 \subseteq \overline{T X_m} \) for some m ≥ 1, then Y_1 ⊆ T X_{2m}.
Proof. First observe that \( Y_1 \subseteq \overline{T X_m} \) implies that \( Y_r \subseteq \overline{T X_{rm}} \) for all r > 0.
Choose y ∈ Y_1. Then there exists x_1 ∈ X_m so that ‖y − T x_1‖ < 1/2. Since \( y - T x_1 \in Y_{1/2} \subseteq \overline{T X_{m/2}} \), there exists x_2 ∈ X_{m/2} so that ‖(y − T x_1) − T x_2‖ < 1/4. More generally, for each n ≥ 1, we can find x_n ∈ X_{m/2^{n−1}} so that
\[ \left\| y - \sum_{j=1}^{n} T x_j \right\| < \frac{1}{2^n}. \]
Since X is complete and Σ_{n=1}^∞ ‖x_n‖ ≤ Σ_{n=1}^∞ m/2^{n−1} = 2m, we have x = Σ_{n=1}^∞ x_n ∈ X_{2m}. By the continuity of T,
\[ T x = T \left( \sum_{n=1}^{\infty} x_n \right) = \lim_{N \to \infty} T \left( \sum_{n=1}^{N} x_n \right) = y. \]
□
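
The successive approximation scheme in this proof can be imitated in a deliberately simple toy model. In the Python sketch below, X = Y = R and T x = x/m, and, to mimic the fact that only approximate solutions are available at each stage, the n-th correction T x_n is restricted to multiples of 2^{−n}. The model, the function name and the sample values are all assumptions of this sketch; it only illustrates the bounds |x_n| ≤ m/2^{n−1} and |y − T(x_1 + ... + x_n)| < 2^{−n} that drive the argument.

```python
from fractions import Fraction
from math import floor

# Toy model only: X = Y = R and T x = x / m, so T maps [-m, m] onto [-1, 1].
# To mimic having only approximate solutions at each stage, the n-th
# correction T x_n is restricted to multiples of 2**(-n).


def successive_approximation(y, m, steps):
    """Return [x_1, ..., x_steps] with |x_n| <= m / 2**(n-1) and
    |y - T(x_1 + ... + x_n)| < 2**(-n) for every n, assuming |y| <= 1."""
    xs = []
    residual = Fraction(y)
    for n in range(1, steps + 1):
        scale = 2 ** n
        # Round the current residual to the nearest multiple of 2**(-n).
        target = Fraction(floor(residual * scale + Fraction(1, 2)), scale)
        xs.append(m * target)          # solve T x_n = target exactly
        residual -= target
    return xs


m = 3
y = Fraction(1, 3)
xs = successive_approximation(y, m, steps=20)
total_norm = sum(abs(x) for x in xs)   # bounded by sum_n m / 2**(n-1) = 2m
error = abs(y - sum(xs) / m)           # |y - T(x_1 + ... + x_20)|
print(float(total_norm), float(error))
```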

9.3. Theorem. The Open Mapping Theorem


Let X and Y be Banach spaces and suppose that T ∈ B(X, Y) is a surjection. Then T is an open map - i.e. if G ⊆ X is open, then T G ⊆ Y is open.
Proof. Since T is surjective, \( Y = T X = \bigcup_{n=1}^{\infty} T X_n \subseteq \bigcup_{n=1}^{\infty} \overline{T X_n} \). Now Y is a complete metric space, and so by the Baire Category Theorem, there exists m ≥ 1 so that \( \mathrm{int}(\overline{T X_m}) \ne \emptyset \).
Let \( y \in \mathrm{int}(\overline{T X_m}) \), and choose δ > 0 such that \( y + V_\delta^Y(0) \subseteq \mathrm{int}(\overline{T X_m}) \subseteq \overline{T X_m} \). Then
\[ V_\delta^Y(0) \subseteq -y + \overline{T X_m} \subseteq \overline{T X_m} + \overline{T X_m} = \overline{T X_{2m}}. \]
(This last step uses the linearity and continuity of T.)
Thus \( Y_{\delta/2} \subseteq V_\delta^Y(0) \subseteq \overline{T X_{2m}} \). By Lemma 9.2 above,
\[ Y_{\delta/2} \subseteq T X_{4m}, \]
or equivalently,
\[ T X_r \supseteq Y_{r\delta/(8m)} \]
for all r > 0.
Suppose that G ⊆ X is open and that y ∈ T G, say y = T x for some x ∈ G. Since G is open, we can find ε > 0 so that x + V_ε^X(0) ⊆ G. Thus
\[ \begin{aligned} T G &\supseteq T x + T(V_\varepsilon^X(0)) \\ &\supseteq y + T X_{\varepsilon/2} \\ &\supseteq y + Y_{\varepsilon\delta/(16m)} \\ &\supseteq y + V^Y_{\varepsilon\delta/(16m)}(0). \end{aligned} \]
Thus y ∈ T G implies that y ∈ int T G, and so T G is open.
□

9.4. Corollary. The Inverse Mapping Theorem
Let X and Y be Banach spaces and suppose that T ∈ B(X, Y) is a bijection. Then T^{−1} is continuous, and so T is a homeomorphism.
Proof. If G ⊆ X is open, then (T^{−1})^{−1}(G) = T G is open in Y by the Open Mapping Theorem above. Hence T^{−1} is continuous.
□

9.5. Corollary. The Closed Graph Theorem


Let X and Y be Banach spaces and suppose that T : X → Y is linear. If the
graph
G(T ) := {(x, T x) : x ∈ X}
is closed in X ⊕1 Y, then T is continuous.
Proof. The ℓ^1 norm on X ⊕ Y was chosen only so as to induce the product topology on X ⊕ Y. We could have used any equivalent norm (for example, the ℓ^2 or ℓ^∞ norms). Let π_1 : X ⊕_1 Y → X be the canonical projection π_1(x, y) = x, (x, y) ∈ X ⊕ Y. Then π_1 is clearly linear, and
\[ \|x\|_X = \|\pi_1(x, y)\|_X \le \|x\|_X + \|y\|_Y = \|(x, y)\|_1, \]
so that ‖π_1‖ ≤ 1. Moreover, G(T) is easily seen to be a linear manifold in X ⊕_1 Y, and by hypothesis, it is closed and hence a Banach space.
The map
\[ \pi_G : G(T) \to X, \qquad (x, T x) \mapsto x \]
is a linear bijection with ‖π_G‖ = ‖π_1|_{G(T)}‖ ≤ ‖π_1‖ ≤ 1.
By the Inverse Mapping Theorem 9.4 above, π_G^{−1} is also continuous, hence bounded.
Thus
\[ \|T x\|_Y \le \|x\|_X + \|T x\|_Y = \|(x, T x)\|_1 = \|\pi_G^{-1}(x)\|_1 \le \|\pi_G^{-1}\| \, \|x\|_X \]
for all x ∈ X, and therefore ‖T‖ ≤ ‖π_G^{−1}‖ < ∞. That is, T is continuous.
□
Let X and Y be Banach spaces. A linear map T : X → Y is continuous if and
only if for all sequences (xn )n in X converging to x ∈ X, we have limn T xn = T x. Of
course, given a linear map T : X → Y, and given a sequence (xn )n converging to x,
there is no reason a priori to assume that (T xn )n converges to anything at all in Y.
The following Corollary is interesting in that part (c) tells us that in checking to
see whether or not T is continuous, it suffices to assume that limn→∞ T xn exists, and
that we need only verify that the limit is the expected one, namely T x. Linearity
of T further reduces the problem to checking this condition for x = 0.

9.6. Corollary. Let X and Y be Banach spaces and T : X → Y be linear. The


following are equivalent:
(a) The graph G(T ) is closed.
(b) T is continuous.
(c) If limn→∞ xn = 0 and limn→∞ T xn = y, then y = 0.
Proof.
(a) implies (b): This is just the Closed Graph Theorem above.
(b) implies (c): This is clear.
(c) implies (a): Suppose that ((x_n, T x_n))_{n=1}^∞ is a sequence in G(T) which converges to some point (x, y) ∈ X ⊕_1 Y. Then, in particular, lim_{n→∞} x_n = x, and so lim_{n→∞} (x_n − x) = 0. Also, lim_{n→∞} T x_n = y, so lim_{n→∞} T(x_n − x) = y − T x exists. By our hypothesis, y − T x = 0, or equivalently y = T x. This in turn says that (x, y) = (x, T x) ∈ G(T), and so the latter is closed.
□
Recall that two closed subspaces Y and Z of a Banach space X are said to
topologically complement each other if X = Y ⊕ Z.

9.7. Lemma. Two closed subspaces Y and Z of a Banach space X topologically complement each other if and only if the map
\[ \iota : Y \oplus_1 Z \to X, \qquad (y, z) \mapsto y + z \]
is a homeomorphism of Banach spaces.
Proof. First note that the norms on Y and on Z are nothing more than the
restrictions to these spaces of the norm on X.
Suppose that Y and Z are topologically complementary subspaces of X. That ι is linear is clear. Moreover, since Y and Z are complementary subspaces, it is easy to see that ι is a bijection. Furthermore,
\[ \|\iota(y, z)\| = \|y + z\| \le \|y\| + \|z\| = \|(y, z)\|_1, \]
so that ι is a contraction. By the Inverse Mapping Theorem 9.4, ι^{−1} is continuous, and so ι is a homeomorphism.
Conversely, suppose that ι is a homeomorphism. Now ran ι = Y + Z = X, since ι is surjective, and if w ∈ Y ∩ Z, then (w, −w) ∈ ker ι = {(0, 0)}, so w = 0. Hence X is the algebraic direct sum of Y and Z. Since Y and Z are closed in X, they are also topologically complemented.
□
The next result extends our results from Section 3, where we showed that for
a closed subspace M of a Hilbert space H, there exists an orthogonal projection
P ∈ B(H) whose range is M (see Remarks 3.7).

9.8. Proposition. Let X be a Banach space and let Y and Z be topologically complementary subspaces of X. For each x ∈ X, denote by y_x and z_x the unique elements of Y and Z respectively such that x = y_x + z_x. Define E : X → Y via Ex = E(y_x + z_x) = y_x for all x ∈ X. Then
(a) E is a continuous linear map. Moreover, E = E^2, ran E = Y, and ker E = Z.
(b) Conversely, if E ∈ B(X) and E = E^2, then M = ran E and N = ker E are topologically complementary subspaces of X.
Proof.
(a) From Lemma 9.7 above, we know that there exists a linear homeomorphism ι : Y ⊕_1 Z → X. Consider the map
\[ \pi_Y : Y \oplus_1 Z \to Y, \qquad (y, z) \mapsto y. \]
It is clear that π_Y is linear and contractive, and so π_Y is continuous. As such, the map
\[ E := \pi_Y \circ \iota^{-1} : X \to Y, \qquad x \mapsto y_x \]
is clearly linear (being the composition of linear functions), and
\[ \|E\| = \|\pi_Y \circ \iota^{-1}\| \le \|\pi_Y\| \, \|\iota^{-1}\| < \infty, \]
so that E is bounded - i.e. E is continuous. That ran E = Y and ker E = Z are left as exercises.
(b) Since E is assumed to be continuous, N is closed. Now I − E is also continuous, and ran E = ker(I − E), so ran E is also closed. If z ∈ ran E ∩ ker E, then z = Ew for some w ∈ X, so z = Ew = E^2 w = Ez = 0. Furthermore, for any x ∈ X, x = Ex + (I − E)x ∈ ran E + ker E. Hence M and N are algebraically complemented closed subspaces of X; i.e. they are topologically complemented.
□

9.9. Remark. A linear map E ∈ B(X) is said to be idempotent if E = E^2. We point out that the term projection is often used in this context, although in the Hilbert space setting, the meaning of projection is slightly different.
The above Proposition says that a subspace Y of a Banach space X is complemented if and only if it is the range of a bounded idempotent in B(X).

Appendix to Section 9.

9.10. As we mentioned in Chapter 3, not every closed subspace of a Banach space X admits a topological complement. In particular, c_0 is not topologically complemented in ℓ^∞. Thus there does not exist a bounded idempotent E ∈ B(ℓ^∞) such that ran E = c_0.
It is also the case that topological complements need not be unique. For example, let H = ℓ^2 and let {e_n}_{n=1}^∞ denote the standard onb for H. Set \( M = \overline{\mathrm{span}}\{e_{2n}\}_{n=1}^{\infty} \), so that both M and M⊥ are infinite-dimensional subspaces of H. Let X ∈ B(M⊥, M) be an arbitrary (bounded) linear operator, and consider the bounded linear operator E_X whose operator matrix relative to the decomposition H = M ⊕ M⊥ is
\[ E_X := \begin{pmatrix} I & X \\ 0 & 0 \end{pmatrix}. \]
Then E_X is easily seen to be an idempotent, and ran E_X = M.
It follows from Proposition 9.8 that M is topologically complemented by N_X := ker E_X. Given x ∈ H, write x = y + z, where y ∈ M and z ∈ M⊥. Then x ∈ ker E_X if and only if
\[ E_X x = \begin{pmatrix} I & X \\ 0 & 0 \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} = \begin{pmatrix} y + Xz \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]
i.e. if and only if \( x = \begin{pmatrix} -Xz \\ z \end{pmatrix} \) for some z ∈ M⊥. That is, for each X ∈ B(M⊥, M), the closed subspace
\[ N_X := \left\{ \begin{pmatrix} -Xz \\ z \end{pmatrix} : z \in M^{\perp} \right\} \]
is a topological complement for M.
Of course, M admits a unique orthogonal complement, which corresponds to the case where X = 0.
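
The same computation can be tried out in the smallest possible setting. In the Python sketch below we take H = R^2, M = span{e_1} and M⊥ = span{e_2}, so that an operator X from M⊥ to M is just a scalar. This finite-dimensional stand-in, the sample values and the helper names are assumptions made for illustration; the example in the text is infinite-dimensional.

```python
from fractions import Fraction

# Illustration only: take H = R^2, M = span{e_1}, M-perp = span{e_2}, so an
# operator X from M-perp to M is just a scalar x.


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]


def idempotent(x):
    """The matrix of E_X = [[I, X], [0, 0]] when X is the scalar x."""
    return [[1, x], [0, 0]]


x = Fraction(5, 2)                 # sample value of X (an assumption)
E = idempotent(x)
z = Fraction(4)                    # a vector z in M-perp
kernel_vector = [-x * z, z]        # the vector (-Xz, z)
print(mat_mul(E, E) == E, mat_vec(E, kernel_vector), mat_vec(E, [7, 0]))
```

Different values of x give different kernels, all of them complements of the same range M, which is the non-uniqueness described above.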

Nobody in the game of football should be called a genius. A genius is
somebody like Norman Einstein.
Joe Theismann

Exercises for Section 9.

Question 1.
Let X be a Banach space and T ∈ B(X). Recall that a closed subspace Y of X
is said to be invariant for T if T Y ⊆ Y.
Suppose that Y and Z are invariant for T , and that Z is a topological complement
for Y. Prove that there exists an idempotent E ∈ B(X) such that ET = T E.

Question 2.
Let H be a complex Hilbert space and M ⊆ H be a closed subspace. Prove that E ∈ B(H) is an idempotent with ran E = M if and only if there exists X ∈ B(M⊥, M) such that relative to the decomposition H = M ⊕ M⊥, we have
\[ E = \begin{pmatrix} I & X \\ 0 & 0 \end{pmatrix}. \]

Question 3.
Let H be a Hilbert space and let E = E 2 ∈ B(H) be an idempotent operator.
Prove that there exists an invertible operator S ∈ B(H) such that P = S −1 ES is
an orthogonal projection.

Question 4.
Let (X, d) be a metric space and H ⊆ X. We say that H is nowhere dense (or meager, or thin) if \( G := X \setminus \overline{H} \) is dense in X. In other words, the interior of \( \overline{H} \) is empty.
We say that a subset H of a metric space (X, d) is of the first category in (X, d) if there exists a sequence (F_n)_{n=1}^∞ of closed, nowhere dense sets in X such that
\[ H \subseteq \bigcup_{n=1}^{\infty} F_n. \]
Otherwise, H is said to be of the second category.

Prove the following alternate version of the Banach-Steinhaus Theorem:

Theorem. Let (X, ‖·‖_X) and (Y, ‖·‖_Y) be Banach spaces and suppose that ∅ ≠ F ⊆ B(X, Y). Let H ⊆ X be a subset of the second category in X, and suppose that for each x ∈ H, there exists a constant κ_x > 0 such that
\[ \|T x\|_Y \le \kappa_x \quad \text{for all } T \in F. \]
Then F is bounded; that is,
\[ \sup_{T \in F} \|T\| < \infty. \]

10. Operator Theory

I got kicked out of ballet class because I pulled a groin muscle. It
wasn’t mine.
Rita Rudner

10.1. Much of the work that has been done in Banach space theory has focussed
on the study of the geometric structure of Banach spaces. For example, people have
been interested in how “close” two Banach spaces are to being isomorphic as Banach
spaces (say, in terms of the Banach-Mazur distance between them), and they have
been interested in finding subspaces of a Banach space which have or are close to
having a prescribed geometric structure. Other interesting questions in this area
involve the study of how well finite-dimensional subspaces of one Banach space can
be embedded in a second Banach space, and yet others involve the search for nice
“bases” for a Banach space (e.g., the search for Schauder bases for quotients of
Banach spaces), or the renorming of Banach spaces by equivalent norms. This list
is anything but inclusive.
In contrast, the study of Hilbert spaces focusses very much on the structure of the
bounded linear operators acting on the space. This is in part because Hilbert spaces
are so well-behaved. This means that one pretends to understand the underlying
space H, and instead focusses upon understanding the more complicated structure,
namely B(H). Because Hilbert spaces are so well-behaved, there is a chance of
getting interesting and deep results about subsets and subalgebras of B(H).
Of course, one can also study the space B(X) of bounded linear operators acting
on a Banach space X. Here one is often interested in how the structure of the
underlying space X determines the operators in B(X). In this Section we shall
examine the notion of compactness of operators on a Banach space. We begin,
however, with the notion of the Banach space adjoint of an operator.

Let X and Y be Banach spaces. Let T ∈ B(X, Y). Given y∗ ∈ Y∗, the map
x ↦ y∗(T x)
is a linear functional on X. Let us denote by T∗y∗ ∈ X^# the functional on X determined by this formula, namely:
T∗y∗(x) = y∗(T x) for all x ∈ X.
Observe that
\[ |T^* y^*(x)| = |y^*(T x)| \le \|y^*\| \, \|T\| \, \|x\|, \]
and so ‖T∗y∗‖ ≤ ‖T‖ ‖y∗‖ < ∞, implying that T∗y∗ ∈ X∗. Furthermore, the map
\[ T^* : Y^* \to X^*, \qquad y^* \mapsto T^* y^* \]
is easily seen to be linear (by definition of linear combinations of functionals on X). Moreover, the estimate
‖T∗y∗‖ ≤ ‖T‖ ‖y∗‖
for all y∗ ∈ Y∗ implies that ‖T∗‖ ≤ ‖T‖.
Conversely, let x ∈ X. By the Hahn-Banach Theorem, we can choose y∗ ∈ Y∗ such that ‖y∗‖ = 1 and y∗(T x) = ‖T x‖. Then
\[ \|T x\| = y^*(T x) = T^* y^*(x) \le \|T^* y^*\| \, \|x\| \le \|T^*\| \, \|x\|. \]
Thus ‖T‖ ≤ ‖T∗‖.
Combining this with the previous estimate, we have that
‖T∗‖ = ‖T‖.
10.2. Definition. Let X and Y be Banach spaces and let T ∈ B(X, Y). The
map T ∗ defined above is called the Banach space adjoint of T .

Many authors adopt the following notation for the action of a functional on a vector, namely: given x ∈ X and x∗ ∈ X∗, they write ⟨x, x∗⟩ to denote x∗(x). In this notation, the equation T∗y∗(x) = y∗(T x) for all x ∈ X, y∗ ∈ Y∗ becomes:
\[ \langle x, T^* y^* \rangle = \langle T x, y^* \rangle \]
for all x ∈ X, y∗ ∈ Y∗. The reason for using this notation will become apparent when we look at Hilbert space adjoints below.

10.3. Proposition. Let X, Y, Z be Banach spaces, S, T ∈ B(X, Y), and let R ∈ B(Y, Z). Then
(a) for all k_1, k_2 ∈ K, we have (k_1 S + k_2 T)∗ = k_1 S∗ + k_2 T∗;
(b) (R ∘ T)∗ = T∗ ∘ R∗.
Proof.
(a) This is left as an exercise for the reader.
(b) Let x ∈ X and z∗ ∈ Z∗. Then
\[ \begin{aligned} (R \circ T)^* z^*(x) &= z^*((R \circ T)(x)) \\ &= z^*(R(T x)) \\ &= R^* z^*(T x) \\ &= T^* R^* z^*(x). \end{aligned} \]
Since this is true for all x ∈ X and for all z∗ ∈ Z∗, we find that (R ∘ T)∗ = T∗ ∘ R∗.
□

10.4. Proposition. Let $X = \mathbb{C}^n$ and $A \in \mathcal{B}(X) \simeq M_n$. Then the matrix of the
Banach space adjoint $A^*$ of $A$ with respect to the dual basis coincides with $A^t$, the
transpose of $A$.
Proof. Recall that $X^* \simeq X$. We then let $\{e_i\}_{i=1}^n$ be a basis for $X$ and let $\{f_j\}_{j=1}^n$ be
the corresponding dual basis for $X^*$; that is, $f_j(e_i) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker
delta function. Let $x \in X$. Define $\lambda_j = f_j(x)$, $1 \le j \le n$.
Writing the matrix of $A \in \mathcal{B}(X)$ as $[a_{ij}]$, we have
\[ A e_j = [a_{ij}] \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{1j} \\ \vdots \\ a_{j-1\,j} \\ a_{jj} \\ a_{j+1\,j} \\ \vdots \\ a_{nj} \end{bmatrix} = \sum_{k=1}^n a_{kj} e_k. \]
Thus $a_{ij} = f_i(A e_j)$.
Now $A^* \in \mathcal{B}(X^*) \simeq M_n$, and so we can also write the matrix $[\alpha_{ij}]$ for $A^*$ with
respect to $\{f_j\}_{j=1}^n$. Paralleling the above computation for $[\alpha_{ij}]$, we obtain:
\[ A^* f_j = \sum_{k=1}^n \alpha_{kj} f_k, \]
and thus
\[ \alpha_{ij} = (A^* f_j)(e_i) = f_j(A e_i) = a_{ji}. \]
In particular, the matrix for $A^*$ with respect to $\{f_j\}_{j=1}^n$ is simply the transpose of
the matrix for $A$ with respect to $\{e_j\}_{j=1}^n$.
2
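The following is a small numerical sanity check of Proposition 10.4, not part of the original notes. It is a minimal sketch in which the matrix, the vector and the functional are arbitrary choices made for illustration; a functional is stored as its list of coefficients relative to the dual basis.

```python
# Sketch (assumed 2x2 example): the Banach space adjoint acts on functionals
# by (A* f)(x) = f(Ax). A functional f is stored as its coefficient list c
# relative to the dual basis, so that f(x) = sum(c[i] * x[i]).

def mat_vec(a, x):
    """Apply the matrix a (a list of rows) to the vector x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]


def transpose(a):
    """Plain transpose of a: rows become columns, with no conjugation."""
    return [list(col) for col in zip(*a)]


def pairing(c, x):
    """Evaluate the functional with coefficients c at x (bilinear, no conjugate)."""
    return sum(ci * xi for ci, xi in zip(c, x))


A = [[1 + 2j, 3], [0, 4 - 1j]]   # hypothetical matrix
x = [2 - 1j, 5j]                 # hypothetical vector
c = [1j, 7]                      # hypothetical functional

lhs = pairing(mat_vec(transpose(A), c), x)   # (A* f)(x), computed via A^t
rhs = pairing(c, mat_vec(A, x))              # f(Ax)
print(lhs == rhs)
```

Note that no complex conjugate appears anywhere: the pairing between a space and its dual is bilinear, which is why the transpose, rather than the conjugate transpose, shows up here.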
The fact that a Hilbert space is isometrically isomorphic (via a conjugate-linear
map) to its own dual allows us to define a separate notion of an adjoint for operators
acting on these spaces. The crucial difference between the Hilbert space adjoint and
the Banach space adjoint of an operator T acting on a Hilbert space H is that the
Hilbert space adjoint will operate on the same Hilbert space H, while the Banach
space adjoint will be an operator in B(H∗ ). While H∗ is a Hilbert space isomorphic
to H, it is not H itself.

10.5. Theorem. Let $H$ be a Hilbert space and let $T \in \mathcal{B}(H)$. Then there exists
a unique operator $T^* \in \mathcal{B}(H)$, called the Hilbert space adjoint of $T$, satisfying
\[ \langle Tx, y \rangle = \langle x, T^* y \rangle \]
for all $x, y \in H$.
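Proof. Fix $y \in H$. Then the map
\[ \varphi_y : H \to \mathbb{C}, \qquad x \mapsto \langle Tx, y \rangle \]
is a bounded linear functional (indeed, $|\varphi_y(x)| \le \|T\| \, \|y\| \, \|x\|$), and so by the Riesz
Representation Theorem there exists a vector $z_y \in H$ such that
\[ \varphi_y(x) = \langle Tx, y \rangle = \langle x, z_y \rangle \]
for all $x \in H$. Define a map $T^* : H \to H$ by $T^* y = z_y$. We leave it to the reader to
verify that $T^*$ is in fact linear, and we concentrate on showing that it is bounded.
To see that $T^*$ is bounded, consider the following. Let $y \in H$, $\|y\| = 1$. Then
$\langle Tx, y \rangle = \langle x, T^* y \rangle$ for all $x \in H$, so
\[ \|T^* y\|^2 = \langle T^* y, T^* y \rangle = \langle T T^* y, y \rangle \le \|T\| \, \|T^* y\| \, \|y\|. \]
Thus $\|T^* y\| \le \|T\|$, and so $\|T^*\| \le \|T\|$.
Now $T^*$ is unique, for if there exists $A \in \mathcal{B}(H)$ such that $\langle Tx, y \rangle = \langle x, T^* y \rangle =
\langle x, Ay \rangle$ for all $x, y \in H$, then $\langle x, (T^* - A) y \rangle = 0$ for all $x, y \in H$, and so $(T^* - A) y = 0$
for all $y \in H$, i.e. $T^* = A$.
□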

10.6. Corollary. Let $T \in \mathcal{B}(H)$, where $H$ is a Hilbert space. Then $(T^*)^* = T$.
It follows that $\|T\| = \|T^*\|$.
Proof. For all $x, y \in H$, we get
\[ \langle x, (T^*)^* y \rangle = \langle T^* x, y \rangle = \overline{\langle y, T^* x \rangle} = \overline{\langle Ty, x \rangle} = \langle x, Ty \rangle, \]
and so $(T^*)^* = T$. Applying Theorem 10.5, we get
\[ \|T\| = \|(T^*)^*\| \le \|T^*\| \le \|T\|, \]
and so $\|T\| = \|T^*\|$.
□

10.7. Proposition. Let $H$ be a complex, separable Hilbert space and $E = \{e_n\}_n$
be a (countably infinite or finite) orthonormal basis for $H$. Let $T \in \mathcal{B}(H)$. Then the
matrix of $T^*$ with respect to $E$ is the conjugate transpose of that of $T$ relative to $E$.
Proof. By definition, the matrix of $T$ relative to $E$ is given by $[t_{i,j}]$, where
\[ t_{i,j} = \langle T e_j, e_i \rangle. \]
If we denote by $[r_{i,j}]$ the matrix of $T^*$ relative to $E$, then $r_{i,j} = \langle T^* e_j, e_i \rangle$ for all $i, j$.
But $t_{i,j} = \langle T e_j, e_i \rangle = \langle e_j, T^* e_i \rangle = \overline{\langle T^* e_i, e_j \rangle} = \overline{r_{j,i}}$, completing the proof.
2
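As an illustration of Proposition 10.7 (added here, and not taken from the notes), the sketch below checks the defining relation of the Hilbert space adjoint on $\mathbb{C}^2$ with the standard orthonormal basis. The matrix and vectors are arbitrary choices, and the inner product is assumed to be linear in the first variable and conjugate-linear in the second.

```python
# Sketch (assumed 2x2 example): the Hilbert space adjoint is the conjugate
# transpose, and the plain transpose does not satisfy <Tx, y> = <x, T*y>.

def mat_vec(a, x):
    """Apply the matrix a (a list of rows) to the vector x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]


def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def conj_transpose(a):
    """Conjugate transpose of the matrix a."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def transpose(a):
    """Plain transpose of the matrix a, with no conjugation."""
    return [list(col) for col in zip(*a)]


T = [[1 + 2j, 3], [0, 4 - 1j]]   # hypothetical operator
x = [1, 1j]                      # hypothetical vectors
y = [1j, 2]

lhs = inner(mat_vec(T, x), y)                          # <Tx, y>
with_adjoint = inner(x, mat_vec(conj_transpose(T), y)) # <x, T*y>
with_transpose = inner(x, mat_vec(transpose(T), y))    # differs in general
print(lhs == with_adjoint, lhs == with_transpose)
```

Comparing with the Banach space adjoint of Proposition 10.4, the only difference is the complex conjugation, which comes from the conjugate-linear identification of $H$ with its dual.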

10.8. Remark. For a Hilbert space $H$ and $A, B \in \mathcal{B}(H)$, it is easy to see that
we have $(AB)^* = B^* A^*$. Indeed, given $x, y \in H$,
\[ \langle x, (AB)^* y \rangle = \langle ABx, y \rangle = \langle Bx, A^* y \rangle = \langle x, B^* A^* y \rangle, \]
from which the result follows. The adjoint operator
\[ {}^* : \mathcal{B}(H) \to \mathcal{B}(H) \]
is an example of an involution on a Banach algebra. Namely, for all $\alpha \in \mathbb{C}$ and
$A, B \in \mathcal{B}(H)$, we obtain
(i) $(\alpha A)^* = \overline{\alpha} A^*$;
(ii) $(A + B)^* = A^* + B^*$;
(iii) $(AB)^* = B^* A^*$; and
(iv) $(A^*)^* = A$.

A norm-closed subalgebra $\mathcal{A}$ of $\mathcal{B}(H)$ which is closed under the adjoint operation
(i.e. $A \in \mathcal{A}$ implies that $A^* \in \mathcal{A}$) is called a (concrete) $C^*$-algebra.
10.9. Theorem. Let $H$ be a Hilbert space and $T \in \mathcal{B}(H)$. Then $\|T^* T\| = \|T\|^2$.
Proof.
• On the one hand,
\[ \|T^* T\| \le \|T^*\| \, \|T\| \le \|T\| \, \|T\| = \|T\|^2. \]
• On the other hand, if $x \in H$ and $\|x\| \le 1$, then (using the Cauchy-Schwarz
Inequality), we find that
\[ \|T^* T\| \ge \|T^* T x\| \ge \|T^* T x\| \, \|x\| \ge |\langle T^* T x, x \rangle| = \langle Tx, Tx \rangle = \|Tx\|^2. \]
Since this holds for all $x \in H$, $\|x\| \le 1$, it follows that $\|T^* T\| \ge \|T\|^2$.
2
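A small worked example (added for illustration, with the matrix chosen by us) may help to see what the theorem does and does not say. On $\mathbb{C}^2$ take the nilpotent matrix $T$ below. The identity $\|T^* T\| = \|T\|^2$ holds, as it must, while the analogous statement with $T^2$ in place of $T^* T$ fails because $T$ is not normal.

```latex
% Worked example (assumed matrix): the C*-equation for a non-normal T on C^2.
\[
T = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
T^* = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \qquad
T^* T = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\]
% T sends e_2 to e_1 and e_1 to 0, so \|T\| = 1; T^*T is an orthogonal projection.
\[
\|T^* T\| = 1 = \|T\|^2,
\qquad \text{whereas} \qquad
\|T^2\| = \|0\| = 0 \neq \|T\|^2 .
\]
```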
The above equation is known as the $C^*$-equation. While it was not terribly
difficult to prove, this innocuous looking equation has amazing consequences on the
structure of $C^*$-subalgebras of $\mathcal{B}(H)$.
In fact, it is a consequence of the Gelfand-Naimark-Segal (GNS) construction
that if one starts with an involutive Banach algebra $\mathcal{A}$ in which each element
$a \in \mathcal{A}$ satisfies the $C^*$-equation, that is $\|a^* a\| = \|a\|^2$, then there exists an isometric
$*$-embedding of that algebra into the set of bounded linear operators on some
Hilbert space. This is beyond the scope of this course, but well within the scope of
the next!

10.10. Proposition. Let $H$ be a Hilbert space and $T \in \mathcal{B}(H)$. Then $(\mathrm{ran}\, T)^\perp =
\ker T^*$. In particular, therefore:
(i) $\overline{\mathrm{ran}\, T} = (\ker T^*)^\perp$;
(ii) $\mathrm{ran}\, T$ is not dense in $H$ if and only if $\ker T^* \neq \{0\}$.
Proof. Let $y \in H$. Then
\[ \begin{aligned} y \in \ker T^* &\iff \langle x, T^* y \rangle = 0 \text{ for all } x \in H \\ &\iff \langle Tx, y \rangle = 0 \text{ for all } x \in H \\ &\iff y \in (\mathrm{ran}\, T)^\perp. \end{aligned} \]
The second statement is now obvious.
□

10.11. Definition. Let $X$ and $Y$ be Banach spaces, and $T \in \mathcal{B}(X, Y)$. Then $T$
is said to be compact if $\overline{T(X_1)}$ is compact in $Y$, where $X_1$ denotes the closed unit
ball of $X$. The set of compact operators from
$X$ to $Y$ is denoted by $\mathcal{K}(X, Y)$, and if $Y = X$, we simply write $\mathcal{K}(X)$.

Recall that a subset $K$ of a metric space $L$ is said to be totally bounded if
for every $\varepsilon > 0$ there exists a finite cover $\{V_\varepsilon(y_i)\}_{i=1}^n$ of $K$ with $y_i \in K$, $1 \le i \le n$,
where $V_\varepsilon(y_i) = \{z \in L : \mathrm{dist}(z, y_i) < \varepsilon\}$.
We leave it as an exercise for the reader to show that if $E$ is a subset of $L$ and
$E$ is totally bounded, then so is its closure $\overline{E}$.

10.12. Proposition. Let $X$ and $Y$ be Banach spaces, and $T \in \mathcal{B}(X, Y)$. The
following are equivalent:
(a) $T$ is compact;
(b) $\overline{T(F)}$ is compact in $Y$ for all bounded subsets $F$ of $X$;
(c) If $(x_n)_n$ is a bounded sequence in $X$, then $(T x_n)_n$ has a convergent subsequence
in $Y$;
(d) $T(X_1)$ is totally bounded.
Proof. This is left as an Assignment exercise.
2

10.13. Theorem. Let X and Y be Banach spaces. Then K(X, Y) is a closed


subspace of B(X, Y).
Proof. This is left as an Assignment exercise.
2

10.14. Theorem. Let $W$, $X$, $Y$, and $Z$ be Banach spaces. Suppose $R \in \mathcal{B}(W, X)$,
$K \in \mathcal{K}(X, Y)$, and $T \in \mathcal{B}(Y, Z)$. Then $TK \in \mathcal{K}(X, Z)$ and $KR \in \mathcal{K}(W, Y)$.
Proof. Let $X_1$ denote the unit ball of $X$. Note that $\overline{K(X_1)}$ is compact and $T$ is
continuous, so that $T(\overline{K(X_1)})$ is compact and therefore closed. Hence
\[ \overline{T \circ K(X_1)} = \overline{T(K(X_1))} \subseteq \overline{T(\overline{K(X_1)})} = T(\overline{K(X_1)}). \]
Since $\overline{T \circ K(X_1)}$ is a closed subset of the compact set $T(\overline{K(X_1)})$, it is compact as
well. Thus $TK \in \mathcal{K}(X, Z)$.
Now if $W_1$ is the unit ball of $W$, then
\[ KR(W_1) = K(R(W_1)); \]
but $R(W_1)$ is bounded since $R$ is, and so by Proposition 10.12, $\overline{KR(W_1)}$ is compact.
Thus $KR \in \mathcal{K}(W, Y)$.
□

10.15. Corollary. If X is a Banach space, then K(X) is a closed, two-sided


ideal of B(X).

10.16. Proposition. Let $X$ and $Y$ be Banach spaces and assume that $K \in
\mathcal{K}(X, Y)$. Then $K(X)$ is closed in $Y$ if and only if $\dim K(X)$ is finite.
Proof. $K(X) = \mathrm{ran}\, K$ is easily seen to be a submanifold of $Y$. Since finite-dimensional
manifolds are always closed, we find that $\dim K(X) < \infty$ implies $K(X)$
is closed.
Now assume that $K(X)$ is closed. Then $K(X)$ is a Banach space and the map
\[ K_0 : X \to K(X), \qquad x \mapsto Kx \]
is a surjection. By the Open Mapping Theorem, Theorem 9.3, it is also an open
map. In particular, $K_0(\mathrm{int}\, X_1)$ is open in $K(X)$ and $0 \in K_0(\mathrm{int}\, X_1)$. Let $G$ be an
open ball in $K(X)$ centred at 0 and contained in $K_0(\mathrm{int}\, X_1)$. Then $\overline{K_0(X_1)} = \overline{K(X_1)}$
is compact, hence closed, and also contains $G$. Thus $\overline{G}$ is compact in $K(X)$ and so
$\dim K(X)$ is finite - see Corollary 4.26.
□

10.17. Definition. Let X and Y be Banach spaces. Then F ∈ B(X, Y) is said


to be finite rank if dim F (X) is finite. The set of finite rank operators from X to
Y is denoted by F(X, Y).

10.18. Proposition. Let $X$ and $Y$ be Banach spaces. Then $\mathcal{F}(X, Y) \subseteq \mathcal{K}(X, Y)$.

Proof. Suppose $F \in \mathcal{F}(X, Y)$. Then $\overline{F X_1}$ is closed and bounded in $\mathrm{ran}\, F$, but $\mathrm{ran}\, F$
is finite dimensional in $Y$, as $F$ is finite rank. Thus $\overline{F X_1}$ is compact in $\mathrm{ran}\, F$, and
thus compact in $Y$ as well, showing that $F$ is compact.
□

10.19. Proposition. Let X be a Banach space. Then K(X) = B(X) if and only
if X is finite dimensional.
Proof. If dim X < ∞, then B(X) = F(X) ⊆ K(X) ⊆ B(X), and equality follows.
If K(X) = B(X), then I ∈ K(X), so I(X1 ) = X1 is compact. In particular, X is
finite dimensional.
2

10.20. Theorem. Let $X$ and $Y$ be Banach spaces and suppose $K \in \mathcal{K}(X, Y)$.
Then $K^* \in \mathcal{K}(Y^*, X^*)$.
Proof. Let $\varepsilon > 0$. Then $K(X_1)$ is totally bounded, so we can find $x_1, x_2, \ldots, x_n \in X_1$
such that if $x \in X_1$, then $\|Kx - Kx_i\| < \varepsilon/3$ for some $1 \le i \le n$. Let
\[ R : Y^* \to (\mathbb{C}^n, \|\cdot\|_\infty), \qquad y^* \mapsto (y^*(K(x_1)), y^*(K(x_2)), \ldots, y^*(K(x_n))). \]
Then $R \in \mathcal{F}(Y^*, \mathbb{C}^n) \subseteq \mathcal{K}(Y^*, \mathbb{C}^n)$, and so $R(Y^*_1)$ is totally bounded. Thus we can
find $y_1^*, y_2^*, \ldots, y_m^* \in Y^*_1$ such that if $y^* \in Y^*_1$, then $\|Ry^* - Ry_j^*\| < \varepsilon/3$ for some
$1 \le j \le m$. Now
\[ \|Ry^* - Ry_j^*\| = \max_{1 \le i \le n} |y^*(K(x_i)) - y_j^*(K(x_i))| = \max_{1 \le i \le n} |K^*(y^*)(x_i) - K^*(y_j^*)(x_i)|. \]
Suppose $x \in X_1$. Then $\|Kx - Kx_i\| < \varepsilon/3$ for some $1 \le i \le n$, and
$|K^*(y^*)(x_i) - K^*(y_j^*)(x_i)| < \varepsilon/3$ for some $1 \le j \le m$, so
\[ \begin{aligned} |K^*(y^*)(x) - K^*(y_j^*)(x)| &\le |K^*(y^*)(x) - K^*(y^*)(x_i)| + |K^*(y^*)(x_i) - K^*(y_j^*)(x_i)| + |K^*(y_j^*)(x_i) - K^*(y_j^*)(x)| \\ &\le \|y^*\| \, \|Kx - Kx_i\| + \varepsilon/3 + \|y_j^*\| \, \|Kx - Kx_i\| \\ &< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \end{aligned} \]
Thus $\|K^* y^* - K^* y_j^*\| \le \varepsilon$ and so $K^*(Y^*_1)$ is totally bounded. We conclude that
$K^* \in \mathcal{K}(Y^*, X^*)$.
□

10.21. The set of compact operators acting on a Hilbert space is more tractable
in general than the set of compact operators acting on an arbitrary Banach space.
One of the reasons for this is the characterization given below.

10.22. Theorem. Let H be a Hilbert space and let K ∈ B(H). The following
are equivalent:
(i) K is compact;
(ii) K ∗ is compact;
(iii) There exists a sequence $\{F_n\}_{n=1}^\infty \subseteq \mathcal{F}(H)$ such that $K = \lim_{n\to\infty} F_n$.
Proof.
(i) ⇒ (iii) Let $B_1$ denote the unit ball of $H$, and let $\varepsilon > 0$. Since $\overline{K(B_1)}$ is
compact, it must be separable (i.e. it is totally bounded). Thus $M = \overline{\mathrm{ran}\, K}$
is a separable subspace of $H$, and thus possesses an orthonormal basis
$\{e_n\}_{n=1}^\infty$.
Let $P_n$ denote the orthogonal projection of $H$ onto $\mathrm{span}\, \{e_k\}_{k=1}^n$. Set
$F_n = P_n K$, noting that each $F_n$ is finite rank. We now show that $K =
\lim_{n\to\infty} F_n$.
Let $x \in H$ and consider $y = Kx \in M$, so that $\lim_{n\to\infty} \|P_n y - y\| = 0$.
Thus $\lim_{n\to\infty} \|F_n x - Kx\| = \lim_{n\to\infty} \|P_n y - y\| = 0$. Since $K$ is compact,
$K(B_1)$ is totally bounded, so we can choose $\{x_k\}_{k=1}^m \subseteq B_1$ such that
$K(B_1) \subseteq \cup_{k=1}^m B(Kx_k, \varepsilon/3)$, where given $z \in H$ and $\delta > 0$, $B(z, \delta) = \{w \in
H : \|w - z\| < \delta\}$.
If $\|x\| \le 1$, choose $i$ such that $\|Kx_i - Kx\| < \varepsilon/3$. Then for any $n > 0$,
\[ \begin{aligned} \|Kx - F_n x\| &\le \|Kx - Kx_i\| + \|Kx_i - F_n x_i\| + \|F_n x_i - F_n x\| \\ &< \varepsilon/3 + \|Kx_i - F_n x_i\| + \|P_n\| \, \|Kx_i - Kx\| \\ &< 2\varepsilon/3 + \|Kx_i - F_n x_i\|. \end{aligned} \]
Choose $N > 0$ such that $\|Kx_i - F_n x_i\| < \varepsilon/3$, $1 \le i \le m$, for all $n > N$.
Then $\|Kx - F_n x\| < 2\varepsilon/3 + \varepsilon/3 = \varepsilon$ for every $x$ with $\|x\| \le 1$. Thus $\|K - F_n\| \le \varepsilon$ for all $n > N$.
Since $\varepsilon > 0$ was arbitrary, $K = \lim_{n\to\infty} F_n$.
(iii) ⇒ (ii) Suppose $K = \lim_{n\to\infty} F_n$, where $F_n$ is finite rank for all $n \ge 1$.
Note that $F_n^*$ is also finite rank (why?), and that $\|K^* - F_n^*\| = \|K - F_n\|$
for all $n \ge 1$, which clearly implies that $K^* = \lim_{n\to\infty} F_n^*$, and hence that
$K^*$ is compact.
(ii) ⇒ (i) Since $K$ compact implies $K^*$ is compact from above, we deduce that
$K^*$ compact implies $(K^*)^* = K$ is compact, completing the proof.
□
We can restate the above Theorem more succinctly by saying that K(H) is the
norm closure of the set of finite rank operators on H. This is an extraordinarily
useful result.
10.23. Remark. Contained in the above proof is the following interesting
observation. If $K$ is a compact operator acting on a separable Hilbert space $H$, then
for any sequence $\{P_n\}_{n=1}^\infty$ of finite rank projections tending strongly (i.e. pointwise)
to the identity, $\|K - P_n K\|$ tends to zero. By considering adjoints, we find that
$\|K - K P_n\|$ also tends to zero.
Let $\varepsilon > 0$, and choose $N > 0$ such that $n \ge N$ implies $\|K - K P_n\| < \varepsilon/2$ and
$\|K - P_n K\| < \varepsilon/2$. Then for all $n \ge N$ we get
\[ \|K - P_n K P_n\| \le \|K - K P_n\| + \|K P_n - P_n K P_n\| \le \|K - K P_n\| + \|K - P_n K\| \, \|P_n\| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
It follows that if $H$ has an orthonormal basis indexed by the natural numbers,
say $\{e_n\}_{n=1}^\infty$, then the matrix for $K$ with respect to this basis comes within $\varepsilon$ of
the matrix for $P_N K P_N$. In other words, $K$ “virtually lives” on the “top left-hand
corner”.
Alternatively, if $H$ has an orthonormal basis indexed by the integers, say $\{f_n\}_{n \in \mathbb{Z}}$,
and we let $P_n$ denote the orthogonal projection onto $\mathrm{span}\, \{f_k\}_{k=-n}^n$, then the matrix
for $K$ with respect to this basis can be arbitrarily well estimated by a sufficiently
large but finite “central block”.

10.24. Example. Let $H$ be a separable Hilbert space with orthonormal basis
$\{e_n\}_{n=1}^\infty$. Let $\{d_n\}_{n=1}^\infty$ be a bounded sequence and consider the diagonal operator
$D \in \mathcal{B}(H)$ defined locally by $D e_n = d_n e_n$ and extended to all of $H$ by linearity and
continuity.
Then D ∈ K(H) if and only if limn→∞ dn = 0.
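To get a feel for this example, recall that the norm of a diagonal operator is the supremum of the moduli of its diagonal entries, so that cutting $D$ off after the first $N$ basis vectors leaves an error of $\sup_{n > N} |d_n|$. The sketch below (added here for illustration) computes this tail supremum on a finite window of indices for two hypothetical sequences; the window size and the sequences are assumptions, and a finite window can only suggest, not prove, the behaviour of the full tail.

```python
# Sketch: error of the finite-rank truncation of a diagonal operator, computed
# on the finite window cutoff < n <= size (a stand-in for the full tail n > cutoff).

def tail_sup(d, cutoff, size):
    """Largest |d(n)| for cutoff < n <= size."""
    return max(abs(d(n)) for n in range(cutoff + 1, size + 1))


def decaying(n):
    """Hypothetical diagonal entries with d_n -> 0."""
    return 1.0 / n


def non_decaying(n):
    """Hypothetical diagonal entries with d_n -> 1."""
    return 1.0 + 1.0 / n


cutoffs = (10, 100, 1000)
errors_decaying = [tail_sup(decaying, N, 10_000) for N in cutoffs]
errors_non_decaying = [tail_sup(non_decaying, N, 10_000) for N in cutoffs]
print(errors_decaying)
print(errors_non_decaying)
```

For the first sequence the truncation errors shrink to zero, consistent with $D$ being a norm limit of finite rank operators; for the second they stay above 1, consistent with $D$ failing to be compact.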

10.25. Example. Let $H = L^2([0, 1], dx)$, and consider the function $k(x, t) \in
L^2([0, 1] \times [0, 1], dm)$,
where $dm$ represents Lebesgue planar measure. Then we define
a Volterra operator
\[ V : L^2([0, 1], dx) \to L^2([0, 1], dx), \qquad (Vf)(x) = \int_0^1 f(t) \, k(x, t) \, dt. \]
(The classical Volterra operator has $k(x, t) = 1$ if $x \ge t$, and $k(x, t) = 0$ if $x < t$.)
Now for $f \in L^2([0, 1], dx)$ we have
\[ \begin{aligned} \|Vf\|^2 &= \int_0^1 |Vf(x)|^2 \, dx \\ &= \int_0^1 \left| \int_0^1 f(t) \, k(x, t) \, dt \right|^2 dx \\ &\le \int_0^1 \left( \int_0^1 |f(t) \, k(x, t)| \, dt \right)^2 dx \\ &\le \|f\|_2^2 \int_0^1 \int_0^1 |k(x, t)|^2 \, dt \, dx \quad \text{by the Cauchy-Schwarz Inequality} \\ &= \|f\|_2^2 \, \|k\|_2^2, \end{aligned} \]
so that $\|V\| \le \|k\|_2$.
Let $\mathcal{A}$ denote the algebra of continuous functions on $[0, 1] \times [0, 1]$ which can be
resolved as $g(x, t) = \sum_{i=1}^n u_i(x) \, w_i(t)$. Then $\mathcal{A}$ is an algebra which separates points,
contains the constant functions, and is closed under complex conjugation. By the
Stone-Weierstraß Theorem, given $\varepsilon > 0$ and $h \in C([0, 1] \times [0, 1])$, there exists $g \in \mathcal{A}$
such that $\|h - g\|_2 \le \|h - g\|_\infty < \varepsilon$. But since $C([0, 1] \times [0, 1])$ is dense (in the
$L^2$-topology) in $L^2([0, 1] \times [0, 1], dm)$, $\mathcal{A}$ must also be dense (in the $L^2$-topology) in
$L^2([0, 1] \times [0, 1], dm)$.
Let $\varepsilon > 0$. For $k$ as above, choose $g \in \mathcal{A}$ such that $\|k - g\|_2 < \varepsilon$. Define
\[ V_0 : L^2([0, 1], dx) \to L^2([0, 1], dx), \qquad V_0 f(x) = \int_0^1 f(t) \, g(x, t) \, dt. \]
From above, we find that $\|V - V_0\| \le \|k - g\|_2 < \varepsilon$.
To see that $V_0$ is finite rank, consider the following; first, $g(x, t) = \sum_{i=1}^n u_i(x) \, w_i(t)$.
If we set $M = \mathrm{span}_{1 \le i \le n} \{u_i\}$, then $M$ is a finite dimensional subspace of $L^2([0, 1], dx)$.
Moreover,
\[ V_0 f(x) = \int_0^1 f(t) \, g(x, t) \, dt = \sum_{i=1}^n \left( \int_0^1 f(t) \, w_i(t) \, dt \right) u_i(x), \]
so that $V_0 f \in M$.
Thus V can be approximated arbitrarily well by elements of the form V0 ∈
F(L2 ([0, 1], dx), and so V is compact.
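For the classical Volterra operator, the bound $\|V\| \le \|k\|_2$ obtained above can be evaluated explicitly. The computation below is added for illustration; it uses only the kernel $k(x, t) = 1$ for $x \ge t$ and $k(x, t) = 0$ for $x < t$ mentioned in the example.

```latex
% Worked example: the Hilbert-Schmidt bound for the classical Volterra kernel.
\[
\|k\|_2^2 = \int_0^1 \int_0^1 |k(x,t)|^2 \, dt \, dx
          = \int_0^1 \int_0^x 1 \, dt \, dx
          = \int_0^1 x \, dx = \tfrac{1}{2},
\qquad \text{so} \qquad
\|V\| \le \|k\|_2 = \tfrac{1}{\sqrt{2}} \approx 0.707 .
\]
```

It is a standard fact, not proved in these notes, that the exact value is $\|V\| = 2/\pi \approx 0.637$, so the estimate is an upper bound only and is not sharp.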

10.26. Definition. Let H be a Hilbert space, M be a subspace of H, and


suppose that T ∈ B(H). Recall that M is called invariant for T provided that
T M ⊆ M. We say that M is reducing for T if M is invariant both for T and
for T ∗ .

10.27. Exercises. Let $H$ be a complex Hilbert space and $M$ be a closed subspace
of $H$. Let $P$ be the orthogonal projection of $H$ onto $M$, and $T \in \mathcal{B}(H)$.
(a) Prove that relative to the decomposition $H = M \oplus M^\perp$, we have
\[ P = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}. \]
(b) Prove that $P = P^2 = P^*$.
(c) More generally, write $T = \begin{bmatrix} T_1 & T_2 \\ T_3 & T_4 \end{bmatrix}$ relative to the decomposition $H =
M \oplus M^\perp$. Prove that
\[ T^* = \begin{bmatrix} T_1^* & T_3^* \\ T_2^* & T_4^* \end{bmatrix}. \]
(d) Prove that $M$ is invariant for $T$ if and only if $(I - P) T P = 0$, and $M$ is
reducing for $T$ if and only if $TP = PT$. Conclude that $M$ is reducing for
$T$ if and only if both $M$ and $M^\perp$ are invariant for $T$.

10.28. Proposition. Let $H$ be a Hilbert space, $T \in \mathcal{B}(H)$, and $M$ be a reducing
subspace for $T$. Then
\[ T = T_1 \oplus T_4 = \begin{bmatrix} T_1 & 0 \\ 0 & T_4 \end{bmatrix} \]
with respect to the decomposition $H = M \oplus M^\perp$. Furthermore, $T$ is compact if and
only if both $T_1$ and $T_4$ are compact, and $T$ is normal if and only if $T_1$ and $T_4$ are.
Proof. Let us denote by $P$ the orthogonal projection of $H$ onto $M$. The matrix
form for $T$ follows directly from the matrix form for $P$ computed in Exercise 10.27
above, combined with the equation $PT = TP$.
• If $T_1$ and $T_4$ are compact, then they are limits of finite rank operators $F_n$
and $G_n$ respectively, from which we conclude that $T$ is a limit of the finite
rank operators $F_n \oplus G_n$. Thus $T$ is compact.
If $T$ is compact, then the compression of $T$ to any subspace is compact,
and so both $T_1$ and $T_4$ are compact.
• Observe that $T$ is normal if and only if
\[ 0 = T^* T - T T^* = \begin{bmatrix} T_1^* T_1 - T_1 T_1^* & 0 \\ 0 & T_4^* T_4 - T_4 T_4^* \end{bmatrix}, \]
which is equivalent to the simultaneous normality of $T_1$ and $T_4$.
□

10.29. Proposition. Let $H$ be a Hilbert space and $N \in \mathcal{B}(H)$ be normal.
(a) For all $x \in H$,
\[ \|Nx\| = \|N^* x\|. \]
(b) Let $p(x, y)$ be any polynomial in two non-commuting variables $x$ and $y$.
Given $\alpha \in \mathbb{C}$, $\ker (p(N, N^*) - \alpha I) = \ker (p(N, N^*) - \alpha I)^*$ is a reducing
subspace for $N$.
(c) If $\alpha \neq \beta \in \mathbb{C}$, then $\ker (N - \alpha I)$ is orthogonal to $\ker (N - \beta I)$.
Proof.
(a) Suppose that $x \in H$. Then
\[ \|N^* x\|^2 = \langle N^* x, N^* x \rangle = \langle N N^* x, x \rangle = \langle N^* N x, x \rangle = \langle Nx, Nx \rangle = \|Nx\|^2. \]
As such, $\|Nx\| = \|N^* x\|$, and it is clear that $\ker N = \ker N^*$.
(b) First note that the fact that $N$ is normal means that to calculate $p(N, N^*)$,
we could just as easily have taken a polynomial in two commuting variables.
Let $\alpha \in \mathbb{C}$. Since $N$ is normal, so is $p(N, N^*)$, and hence so is $p(N, N^*) - \alpha I$. From part (a) we may
conclude that $\ker (p(N, N^*) - \alpha I) = \ker (p(N, N^*) - \alpha I)^*$.
Consider $x \in \ker (p(N, N^*) - \alpha I)$. Then
\[ (p(N, N^*) - \alpha I) N x = N (p(N, N^*) - \alpha I) x = N 0 = 0, \]
showing that $Nx \in \ker (p(N, N^*) - \alpha I)$, i.e. that $\ker (p(N, N^*) - \alpha I)$ is
invariant for $N$. Similarly,
\[ (p(N, N^*) - \alpha I) N^* x = N^* (p(N, N^*) - \alpha I) x = N^* 0 = 0, \]
showing that $N^* x \in \ker (p(N, N^*) - \alpha I)$, i.e. that $\ker (p(N, N^*) - \alpha I)$ is
invariant for $N^*$.
By Exercise 10.27, $\ker (p(N, N^*) - \alpha I)$ is reducing for $N$.
(c) Suppose that $\alpha \neq \beta \in \mathbb{C}$. Let $x \in \ker (N - \alpha I)$ and $y \in \ker (N - \beta I)$. By part (b),
$y \in \ker (N - \beta I)^*$, that is, $N^* y = \overline{\beta} y$. Then
\[ \alpha \langle x, y \rangle = \langle Nx, y \rangle = \langle x, N^* y \rangle = \langle x, \overline{\beta} y \rangle = \beta \langle x, y \rangle. \]
Since $\alpha \neq \beta$, we must have $x \perp y$.
2
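Parts (a) and (c) are easy to test numerically in finite dimensions. The following sketch is an added illustration, not part of the notes: the normal matrix `N`, the non-normal matrix `T` and the test vectors are all chosen by us, with the inner product on $\mathbb{C}^2$ taken linear in the first variable.

```python
# Sketch (assumed 2x2 examples): ||Nx|| = ||N*x|| for a normal matrix N,
# its failure for a non-normal matrix, and orthogonality of eigenvectors
# belonging to distinct eigenvalues.

def mat_vec(a, x):
    """Apply the matrix a (a list of rows) to the vector x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in a]


def conj_transpose(a):
    """Conjugate transpose of the matrix a."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))


def norm_sq(x):
    """Squared Euclidean norm of x."""
    return inner(x, x).real


N = [[1, 1j], [1j, 1]]   # normal: N N* = N* N = 2I
T = [[0, 1], [0, 0]]     # not normal
x = [2 - 1j, 3j]
e1 = [1, 0]

normal_gap = norm_sq(mat_vec(N, x)) - norm_sq(mat_vec(conj_transpose(N), x))
non_normal_gap = norm_sq(mat_vec(T, e1)) - norm_sq(mat_vec(conj_transpose(T), e1))

u, v = [1, 1], [1, -1]   # eigenvectors of N for the eigenvalues 1+i and 1-i
print(normal_gap, non_normal_gap, inner(u, v))
```

The gap vanishes for the normal matrix and not for the nilpotent one, and the two eigenvectors of `N` are orthogonal, as part (c) predicts.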

10.30. Proposition. Let $H$ be a complex Hilbert space and $N \in \mathcal{B}(H)$ be a
compact, normal operator. Then $N = 0$ if and only if $\sigma(N) = \{0\}$.
Proof. That $N = 0$ implies that $\sigma(N) = \{0\}$ is trivial.
Conversely, suppose that $\sigma(N) = \{0\}$. First observe that if $\dim H < \infty$, then
the result follows easily from the fact that every normal matrix is diagonalizable
(this is the Spectral Theorem for Normal Matrices). This reduces our problem to
the case where $\dim H = \infty$.

Case One. $N$ is a finite-rank operator.

In this case, $\mathrm{ran}\, N$ is a finite-dimensional and therefore closed subspace of $H$.
Recall that
\[ \overline{\mathrm{ran}\, N} = \mathrm{ran}\, N = (\ker N^*)^\perp = (\ker N)^\perp = \overline{\mathrm{ran}\, N^*} = \mathrm{ran}\, N^*. \]
(The last equality follows from the fact that $\dim (\mathrm{ran}\, N^*) < \infty$.) Decomposing
$H = (\ker N)^\perp \oplus (\ker N)$, we may write
\[ N = \begin{bmatrix} N_1 & 0 \\ 0 & 0 \end{bmatrix}. \]
Since $\kappa := \dim (\ker N)^\perp = \dim (\mathrm{ran}\, N) < \infty$, $N_1 \in \mathcal{B}(\mathbb{C}^\kappa)$ corresponds to a normal
matrix, and $\sigma(N_1) \subseteq \sigma(N) = \{0\}$ implies that $\sigma(N_1) = \{0\}$ as well. As seen above,
this implies that $N_1 = 0$, so that $N = 0$ as well.

Case Two. $N$ is not a finite-rank operator.

We shall argue this case by contradiction. Suppose that $N \neq 0$. Replacing $N$ by $\|N\|^{-1} N$ if
necessary, we may assume without loss of generality that $\|N\| = 1$.
Recall that since $N$ is compact, $H_0 := \overline{\mathrm{ran}\, N}$ is a separable subspace of $H$. Let
$\{e_n\}_{n=1}^\infty$ be an onb for $H_0$, and for each $n \ge 1$, set $P_n$ to be the orthogonal projection
of $H$ onto $H_n := \mathrm{span}\, \{e_1, e_2, \ldots, e_n\}$. As seen in the notes, if we set $F_n = P_n N P_n$,
$n \ge 1$, then each $F_n$ has finite rank and
\[ N = \lim_n F_n. \]
It follows that
\[ N^* N = \lim_n F_n^* F_n. \]
But relative to $H = H_n \oplus H_n^\perp$, we may write $F_n = \begin{bmatrix} A_n & 0 \\ 0 & 0 \end{bmatrix}$ for some $A_n \in \mathcal{B}(\mathbb{C}^n)$,
whence
\[ F_n^* F_n = \begin{bmatrix} A_n^* A_n & 0 \\ 0 & 0 \end{bmatrix}. \]
Now $A_n^* A_n$ is a positive semidefinite matrix, and therefore it is diagonalizable. Moreover,
$\lim_n \|F_n\| = \|N\| = 1$. Thus
\[ \lim_n \|A_n^* A_n\| = \lim_n \|A_n\|^2 = \lim_n \|F_n\|^2 = \|N\|^2 = 1. \]
From this we see that with $\alpha_n = \|A_n\|^2$, $\lim_n \alpha_n = 1$ and $\alpha_n \in \sigma(A_n^* A_n) \subseteq
\sigma(F_n^* F_n)$ for all $n \ge 1$.
Next, recall that the set of invertible operators in $\mathcal{B}(H)$ is open, and note that
\[ N^* N - I = \lim_n (F_n^* F_n - \alpha_n I). \]
Together, these imply that $1 \in \sigma(N^* N)$.
But $N^* N$ is compact, and so $1 \in \sigma_p(N^* N)$
and $\{0\} \neq M := \ker (N^* N - I)$ is finite-dimensional.
We are almost there! A routine calculation shows that $M$ is invariant for both
$N$ and $N^*$. (This uses the fact that $N$ and $N^*$ are normal, and thus they commute
with $(N^* N - I)$.)
Hence we may decompose $H = M \oplus M^\perp$ and write
\[ N = \begin{bmatrix} N_{1,1} & 0 \\ 0 & N_{2,2} \end{bmatrix} \]
relative to this decomposition. A second routine calculation shows that $N_{1,1}$ is
normal and $\sigma(N_{1,1}) \subseteq \sigma(N) = \{0\}$, from which we conclude that $\sigma(N_{1,1}) = \{0\}$.
Setting $d := \dim M$, we have that $N_{1,1} \in \mathcal{B}(\mathbb{C}^d)$, so that $N_{1,1}$ corresponds to a
(normal) matrix. From above, we conclude that $N_{1,1} = 0$. But then for $0 \neq x \in M$,
we obtain
\[ 0 = N_{1,1} x = N x, \quad \text{and hence} \quad N^* N x = 0, \]
contradicting the fact that $N^* N x = x$ for all $x \in M$.
This contradiction proves that $N = 0$.
□

10.31. Let $(X, \|\cdot\|)$ be a complex Banach space and $T \in \mathcal{B}(X)$. Recall that the
spectrum of $T$ is the set
\[ \sigma(T) = \{\alpha \in \mathbb{C} : (T - \alpha I) \text{ is not invertible in } \mathcal{B}(X)\}. \]
We do not yet have the machinery to be able to prove that $\sigma(T)$ is a non-empty,
compact set contained in $\{z \in \mathbb{C} : |z| \le \|T\|\}$. (The fact that the underlying field
for $X$ is $\mathbb{C}$ is required to prove that $\sigma(T) \neq \emptyset$.)
We denote the set of eigenvalues of $T$ by $\sigma_p(T)$ and refer to this as the point
spectrum of $T$. In general, $\sigma_p(T)$ may be empty or non-empty. Clearly $\sigma_p(T) \subseteq
\sigma(T)$.

In Assignment 6, we saw that if T is a compact operator, then


σ(T ) = σp (T ) ∪ {0},
and that for all ε > 0,
σ(T ) ∩ {z ∈ C : |z| > ε}
is a finite set. We also saw in that Assignment that $\dim \ker (T - \alpha I) < \infty$ for all
$\alpha \neq 0$. In other words, $\sigma(T) \setminus \{0\}$ is a sequence of eigenvalues of finite multiplicity,
and this sequence converges to 0 (in the case where it is infinite).
We shall exploit this wonderful behaviour of compact operators below.

10.32. Theorem. Let $H$ be a complex Hilbert space and $N \in \mathcal{B}(H)$ be a compact,
normal operator. If $\sigma_p(N) = \{\alpha_n\}_{n \in \Omega}$, then $H = \oplus_{n \in \Omega} \ker (N - \alpha_n I)$.
Proof. The notation “$\Omega$” has been introduced only so that we may deal with the
cases where $\sigma_p(N)$ is finite and where it is infinite simultaneously.
Obviously $\ker (N - \alpha I) \subseteq H$ for all $\alpha \in \mathbb{C}$, and more specifically for all $\alpha \in \sigma_p(N)$,
and by Proposition 10.29, $\alpha \neq \beta \in \sigma_p(N)$ implies that $\ker (N - \alpha I) \perp \ker (N - \beta I)$.
Thus
\[ M := \oplus_{n \in \Omega} \ker (N - \alpha_n I) \subseteq H. \]
Suppose that $M \neq H$. Since each $\ker (N - \alpha I)$ is reducing for $N$, again by
Proposition 10.29, we find that $M$ is reducing for $N$. Decomposing $H = M \oplus M^\perp$,
we may write
\[ N = \begin{bmatrix} N_1 & 0 \\ 0 & N_4 \end{bmatrix}. \]
By Proposition 10.28, we see that $N_1$ and $N_4$ are compact and normal.
It follows that $\sigma(N_4) = \sigma_p(N_4) \cup \{0\}$. Note, however, that if $\alpha \in \sigma_p(N_4)$, then
$\alpha \in \sigma_p(N)$, and therefore $\ker (N - \alpha I) \subseteq M$, which is impossible since the corresponding
eigenvectors of $N_4$ are non-zero vectors in $M^\perp$. This shows that $\sigma_p(N_4) = \emptyset$, from
which we conclude that $\sigma(N_4) = \{0\}$.
By Proposition 10.30, $N_4 = 0$. The hypothesis that $M \neq H$ then implies that
$M^\perp \neq \{0\}$, which then implies that $0 \in \sigma_p(N_4)$, a contradiction.
Hence $M = H$, completing the proof.
□

10.33. Theorem. The spectral theorem for compact normal operators.
Let $H$ be a Hilbert space and $N \in \mathcal{B}(H)$ be a compact, normal operator. Suppose
$\{\alpha_n\}_{n \in \Omega}$ are the distinct eigenvalues of $N$ and that $P_n$ is the orthogonal projection
of $H$ onto $M_n := \ker (N - \alpha_n I)$ for each $n \in \Omega$. Then $P_n P_m = 0$ if $n \neq m \in \Omega$, and
\[ N = \sum_{n \in \Omega} \alpha_n P_n, \]
where the series converges in the norm topology in $\mathcal{B}(H)$.


Proof. We remark that since $\Omega$ is either a finite or denumerable set, the above series
is either just a finite sum (in which case there is no need to speak of convergence of
the series), or we may as well assume that $\Omega = \mathbb{N}$, in which case the series may be
written
\[ N = \sum_{n=1}^\infty \alpha_n P_n. \]
That $n \neq m \in \Omega$ implies that $P_n P_m = 0$ is the statement that $\ker (N - \alpha_n I) \perp
\ker (N - \alpha_m I)$ if $\alpha_n \neq \alpha_m$, which we proved in Proposition 10.29.

Recall from Proposition 10.29 that each $M_n$ is reducing for $N$ and that by
Theorem 10.32, $H = \oplus_{n \in \Omega} M_n$. Thus
\[ N = \oplus_{n \in \Omega} N_n, \]
where $N_n$ is the compression of $N$ to the reducing subspace $M_n$. But for $x \in M_n$,
$(N - \alpha_n P_n) x = (N - \alpha_n I) x = 0$. Thus
\[ N = \oplus_{n \in \Omega} \alpha_n P_n. \]
In the case where $\Omega$ is finite, this yields that $N = \sum_{n \in \Omega} \alpha_n P_n$, as there are no
convergence issues.
Suppose that $\Omega$ is infinite.
Case One. $0 \notin \sigma_p(N)$.
Then $\dim M_n < \infty$ for all $n \in \Omega$, and $H = \oplus_{n \in \Omega} M_n$ implies that $H$ is separable.
As we saw above, we may (and do) assume that $\Omega = \mathbb{N}$.
Choosing an orthonormal basis $B_n = \{e_1^{(n)}, e_2^{(n)}, \ldots, e_{k_n}^{(n)}\}$ for each $M_n$, $n \ge 1$, we
find that $B := \cup_{n=1}^\infty B_n$ is an onb for $H$ and that relative to this basis we have
\[ N = \mathrm{diag}(\alpha_1, \alpha_1, \ldots, \alpha_1, \alpha_2, \alpha_2, \ldots, \alpha_2, \alpha_3, \alpha_3, \ldots), \]
where $\alpha_n$ is repeated exactly $k_n = \dim M_n = \mathrm{rank}\, P_n$ times.
Let $Q_M := \sum_{n=1}^M P_n$, so that $Q_M$ is a finite-rank projection for each $M \ge 1$. Note
that $(Q_M)_{M=1}^\infty$ converges in the sot to $I \in \mathcal{B}(H)$. By Remark 10.23,
\[ N = \lim_{M \to \infty} Q_M N Q_M = \lim_{M \to \infty} \sum_{n=1}^M \alpha_n P_n = \sum_n \alpha_n P_n. \]
Case Two. $0 \in \sigma_p(N)$.
Again, we may assume that $\Omega = \mathbb{N}$ and that $\alpha_1 = 0$. Since $M_1 = \ker N$ is
reducing for $N$, we may write
\[ N = \begin{bmatrix} N_1 & 0 \\ 0 & N' \end{bmatrix} \]
relative to the decomposition $H = M_1 \oplus M_1^\perp$, where $N_1 = 0$ because $M_1 = \ker N$. Note that $N'$ is compact, normal and
$0 \notin \sigma_p(N')$. By Case One,
\[ N' = \sum_{n=2}^\infty \alpha_n P_n. \]
But then
\[ N = 0 P_1 + \sum_{n=2}^\infty \alpha_n P_n = \sum_{n=1}^\infty \alpha_n P_n. \]
2
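In finite dimensions every operator is compact, so the theorem reduces to the spectral theorem for normal matrices. The sketch below is an added illustration on $\mathbb{C}^2$: the matrix `N`, its eigenvalues and the two eigenprojections were worked out by hand for this example and are not taken from the notes.

```python
# Sketch (assumed 2x2 example): N = alpha1 * P1 + alpha2 * P2 with P1 P2 = 0,
# where P1 and P2 project onto span{(1, 1)} and span{(1, -1)} respectively.

def mat_mul(a, b):
    """Product of the matrices a and b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Entrywise sum of the matrices a and b."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * entry for entry in row] for row in a]


N = [[1, 1j], [1j, 1]]                 # normal matrix with eigenvalues 1+i and 1-i
alpha1, alpha2 = 1 + 1j, 1 - 1j
P1 = [[0.5, 0.5], [0.5, 0.5]]          # orthogonal projection onto ker(N - alpha1 I)
P2 = [[0.5, -0.5], [-0.5, 0.5]]        # orthogonal projection onto ker(N - alpha2 I)

reconstructed = mat_add(scale(alpha1, P1), scale(alpha2, P2))
product = mat_mul(P1, P2)
print(reconstructed == N, product)
```

The infinite-dimensional content of the theorem, namely the norm convergence of the series, is of course invisible in such a small example.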

10.34. Corollary. Let $H$ be a Hilbert space and $N \in \mathcal{B}(H)$ be a compact,
normal operator. Then there exists an orthonormal basis $\{e_\alpha\}_{\alpha \in \Lambda}$ for $H$ such that
each $e_\alpha$ is an eigenvector for $N$.
Proof. Let $\{\lambda_n\}_{n=1}^\infty$ be the set of eigenvalues of $N$. (We argue the case where $N$ has
infinitely many eigenvalues and leave the case where $\sigma(N)$ is finite to the reader.)
For each $n \ge 1$, choose an orthonormal basis $\{e_{(n,\beta)}\}_{\beta \in \Lambda_n}$ for $\ker (N - \lambda_n I)$. (Note
that if $\lambda_n \neq 0$, then the cardinality of $\Lambda_n$ is finite.) Then each $e_{(n,\beta)}$, $\beta \in \Lambda_n$, $n \ge 1$,
is an eigenvector for $N$ corresponding to $\lambda_n$, and the $e_{(n,\beta)}$'s are all orthogonal since all
of the $\ker (N - \lambda_n I)$'s are. Finally, $\overline{\mathrm{span}}\, \{e_{(n,\beta)}\}_{\beta \in \Lambda_n,\, n \ge 1} = \oplus_{n=1}^\infty \ker (N - \lambda_n I) = H$
by Theorem 10.32. Let $\{e_\alpha\}_{\alpha \in \Lambda} = \{e_{(n,\beta)}\}_{\beta \in \Lambda_n,\, n \ge 1}$.
□

Appendix to Section 10.

10.35. Proposition 10.30 is actually true in much greater generality. If $\mathcal{A}$ is any
unital (complex) Banach algebra and $a \in \mathcal{A}$, we define the spectral radius of $a$ to
be
\[ \mathrm{spr}(a) := \sup\{|\alpha| : \alpha \in \sigma(a)\}. \]
It can be shown that $N \in \mathcal{B}(H)$ normal implies that $\|N\| = \mathrm{spr}(N)$. The usual
proof of this fact requires the following result.

10.36. Theorem. Beurling’s Spectral Radius Formula. If $\mathcal{A}$ is a unital
Banach algebra and $a \in \mathcal{A}$, then
\[ \mathrm{spr}(a) = \lim_{n \to \infty} \|a^n\|^{1/n}. \]

The proof of this result is beyond (but not much beyond) the scope of this
course. (We simply didn’t have the time for it.) Nevertheless, two things are worth
pointing out.
Firstly, the above limit exists!! This is anything but obvious, and is interesting
in its own right.
Secondly, since $\|a^n\| \le \|a\|^n$ for all $n \ge 1$, we immediately see that
\[ \mathrm{spr}(a) \le \|a\|. \]
This shows that $\sigma(a)$ is always bounded. Another non-trivial fact is that $\sigma(a) \neq \emptyset$.
This relies on a Banach-space version of Liouville’s Theorem.
Using Beurling’s Spectral Radius Formula, we may easily obtain:

10.37. Proposition. Let $H$ be a complex Hilbert space and $N \in \mathcal{B}(H)$ be a
normal operator. Then $\mathrm{spr}(N) = \|N\|$.
In particular, $\sigma(N) = \{0\}$ if and only if $N = 0$.
Proof. Consider first:
\[ \begin{aligned} \|N^2\| &= \sup_{\|x\|=1} \|N^2 x\| \\ &= \sup_{\|x\|=1} \|N^* N x\| \\ &\ge \sup_{\|x\|=1} |\langle N^* N x, x \rangle| \\ &= \sup_{\|x\|=1} \langle Nx, Nx \rangle \\ &= \sup_{\|x\|=1} \|Nx\|^2 \\ &= \|N\|^2. \end{aligned} \]
(The second equality uses Proposition 10.29 (a), applied to the vector $Nx$.)
By induction, $\|N^{2^n}\| \ge \|N\|^{2^n}$ for all $n \ge 1$. The reverse inequality follows
immediately from the submultiplicativity of the norm in a Banach algebra. Thus
$\|N^{2^n}\| = \|N\|^{2^n}$ for all $n \ge 1$. By Beurling’s Spectral Radius Formula, Theorem 10.36,
\[ \mathrm{spr}(N) = \lim_{n \to \infty} \|N^{2^n}\|^{1/2^n} = \|N\|. \]
2
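The role of normality in this proposition can be seen numerically. The sketch below is an added illustration: it evaluates $\|a^n\|^{1/n}$ for two non-normal $2 \times 2$ matrices chosen by us, using the Frobenius norm as a convenient stand-in for the operator norm (in finite dimensions all norms are equivalent, so the limit in Beurling's formula is the same).

```python
# Sketch (assumed examples): ||a^n||^(1/n) for two non-normal matrices, using
# the Frobenius norm as a stand-in for the operator norm.
import math


def mat_mul(a, b):
    """Product of the matrices a and b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def frobenius(a):
    """Frobenius norm of the matrix a."""
    return math.sqrt(sum(abs(entry) ** 2 for row in a for entry in row))


def root_norms(a, powers):
    """Return the list of ||a^n||^(1/n) for n = 1, ..., powers."""
    values = []
    current = a
    for n in range(1, powers + 1):
        values.append(frobenius(current) ** (1.0 / n))
        current = mat_mul(current, a)
    return values


jordan = [[1, 1], [0, 1]]      # spectrum {1}, but norm strictly larger than 1
nilpotent = [[0, 1], [0, 0]]   # spectrum {0}, but norm equal to 1

jordan_values = root_norms(jordan, 200)
nilpotent_values = root_norms(nilpotent, 3)
print(jordan_values[0], jordan_values[-1], nilpotent_values)
```

In both cases the sequence drifts down from the norm towards the spectral radius, so the equality $\mathrm{spr}(N) = \|N\|$ genuinely needs the hypothesis that $N$ is normal; the nilpotent matrix is a non-zero operator with spectrum $\{0\}$.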
This is the standard proof of this result. The proof of Proposition 10.30 given
in the notes above is original, and clearly was only devised to circumvent the fact
that we do not have Beurling’s Spectral Radius Formula.

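10.38. Definition. Let $\mathcal{A}$ be a Banach algebra and $a \in \mathcal{A}$. If $\mathcal{A}$ is unital, then
the spectrum of $a$ relative to $\mathcal{A}$ is the set
\[ \sigma_{\mathcal{A}}(a) = \{\lambda \in \mathbb{C} : a - \lambda 1 \text{ is not invertible in } \mathcal{A}\}. \]
If $\mathcal{A}$ is not unital, then $\sigma_{\mathcal{A}}(a)$ is set to be $\sigma_{\mathcal{A}^+}(a) \cup \{0\}$. When the algebra $\mathcal{A}$ is
understood, we generally write $\sigma(a)$. The resolvent of $a$ is the set $\rho(a) = \mathbb{C} \setminus \sigma(a)$.

10.39. Corollary. Let $\mathcal{A}$ be a unital Banach algebra, and let $a \in \mathcal{A}$. Then $\rho(a)$
is open and $\sigma(a)$ is compact.
Proof. Clearly $\rho(a) = \{\lambda \in \mathbb{C} : (a - \lambda 1) \text{ is invertible}\}$ is open, since $\mathcal{A}^{-1}$ is.
Indeed, if $a - \lambda_0 1$ is invertible in $\mathcal{A}$, then $\lambda \in \rho(a)$ for all $\lambda \in \mathbb{C}$ such that $|\lambda - \lambda_0| <
\|(a - \lambda_0 1)^{-1}\|^{-1}$. Thus $\sigma(a)$ is closed.
If $|\lambda| > \|a\|$, then $\lambda 1 - a = \lambda (1 - \lambda^{-1} a)$ and $\|\lambda^{-1} a\| < 1$, and so $(1 - \lambda^{-1} a)$ is
invertible. This implies
\[ (\lambda 1 - a)^{-1} = \lambda^{-1} (1 - \lambda^{-1} a)^{-1}. \]
Thus $\sigma(a)$ is contained in the disk $D_{\|a\|}(\{0\})$ of radius $\|a\|$ centred at the origin.
Since it is both closed and bounded, $\sigma(a)$ is compact.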
2
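The invertibility of $1 - b$ when $\|b\| < 1$, used in the proof above with $b = \lambda^{-1} a$, comes from the geometric (Neumann) series $\sum_{n \ge 0} b^n$. The following sketch, added for illustration, checks the telescoping identity $(1 - b)(1 + b + \cdots + b^{m}) = 1 - b^{m+1}$ in exact rational arithmetic for a hypothetical $2 \times 2$ matrix `b` whose Frobenius norm, and hence operator norm, is less than 1.

```python
# Sketch (assumed 2x2 example): partial sums of the Neumann series for (1 - b)^(-1),
# in exact rational arithmetic.
from fractions import Fraction


def mat_mul(a, b):
    """Product of the matrices a and b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    """Entrywise sum of the matrices a and b."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_sub(a, b):
    """Entrywise difference of the matrices a and b."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    """The n x n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def neumann_partial_sum(b, terms):
    """Return (1 + b + ... + b^terms, b^(terms + 1))."""
    total = identity(len(b))
    power = identity(len(b))
    for _ in range(terms):
        power = mat_mul(power, b)
        total = mat_add(total, power)
    return total, mat_mul(power, b)


b = [[Fraction(1, 2), Fraction(1, 4)], [Fraction(0), Fraction(1, 3)]]
partial, remainder = neumann_partial_sum(b, 20)
one = identity(2)
check = mat_mul(mat_sub(one, b), partial)   # equals 1 - b^21 exactly
largest_remainder_entry = max(abs(entry) for row in remainder for entry in row)
print(float(largest_remainder_entry))
```

Since the remainder $b^{m+1}$ tends to zero in norm, the partial sums converge to an inverse of $1 - b$; this is the mechanism behind the statement that $\sigma(a)$ lies in the closed disk of radius $\|a\|$.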

10.40. Definition. Let X be a Banach space and U ⊆ C be an open set. Then


a function f : U → X is said to be weakly analytic if the map z 7→ x∗ (f (z)) is
analytic for all x∗ ∈ X∗ .

10.41. Theorem. [Liouville’s Theorem] Every bounded, weakly entire function
into a Banach space $X$ is constant.
Proof. Let $f : \mathbb{C} \to X$ be such a function. For each linear functional $x^* \in X^*$, $x^* \circ f$ is a bounded, entire function
into the complex plane. By the complex-valued version of Liouville’s Theorem, it
must therefore be constant. Now by the Hahn-Banach Theorem, $X^*$ separates the
points of $X$. So if there exist $z_1, z_2 \in \mathbb{C}$ such that $f(z_1) \neq f(z_2)$, then there must
exist $x^* \in X^*$ such that $x^*(f(z_1)) \neq x^*(f(z_2))$. This contradiction implies that $f$ is
constant.
□

10.42. Definition. Let $\mathcal{A}$ be a unital Banach algebra and let $a \in \mathcal{A}$. The map
\[ R(\cdot, a) : \rho(a) \to \mathcal{A}, \qquad \lambda \mapsto (\lambda 1 - a)^{-1} \]
is called the resolvent function of $a$.

10.43. Proposition. The Common Denominator Formula. Let $a \in \mathcal{A}$, a
unital Banach algebra. Then if $\mu, \lambda \in \rho(a)$, we have
\[ R(\lambda, a) - R(\mu, a) = (\mu - \lambda) \, R(\lambda, a) \, R(\mu, a). \]
Proof. The proof is transparent if we consider $t \in \mathbb{C}$ and consider the corresponding
complex-valued equation:
\[ \frac{1}{\lambda - t} - \frac{1}{\mu - t} = \frac{(\mu - t) - (\lambda - t)}{(\lambda - t)(\mu - t)} = \frac{\mu - \lambda}{(\lambda - t)(\mu - t)}. \]
In terms of Banach algebra, we have:
\[ R(\lambda, a) = R(\lambda, a) \, R(\mu, a) \, (\mu - a), \qquad R(\mu, a) = R(\mu, a) \, R(\lambda, a) \, (\lambda - a). \]
Noting that $R(\lambda, a)$ and $R(\mu, a)$ clearly commute, we obtain the desired equation
by simply subtracting the second equation from the first.
2
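The formula can be checked exactly for matrices. The sketch below is an added illustration in the Banach algebra $M_2$: the matrix `a` and the points `lam` and `mu` are hypothetical choices (with `lam` and `mu` outside the spectrum $\{1, 3\}$ of `a`), and rational arithmetic is used so that the two sides can be compared without rounding.

```python
# Sketch (assumed 2x2 example): R(lam, a) - R(mu, a) = (mu - lam) R(lam, a) R(mu, a),
# verified in exact rational arithmetic.
from fractions import Fraction


def mat_mul(a, b):
    """Product of the matrices a and b (lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_sub(a, b):
    """Entrywise difference of the matrices a and b."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * entry for entry in row] for row in a]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (p, q), (r, s) = m
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[s / det, -q / det], [-r / det, p / det]]


def resolvent(lam, a):
    """R(lam, a) = (lam * 1 - a)^(-1) for a 2x2 matrix a."""
    shifted = [[lam - a[0][0], -a[0][1]], [-a[1][0], lam - a[1][1]]]
    return inverse_2x2(shifted)


a = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]   # spectrum {1, 3}
lam, mu = Fraction(5), Fraction(-2)

lhs = mat_sub(resolvent(lam, a), resolvent(mu, a))
rhs = scale(mu - lam, mat_mul(resolvent(lam, a), resolvent(mu, a)))
print(lhs == rhs)
```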

We shall return to this formula when establishing the holomorphic functional


calculus in the next section.

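10.44. Proposition. If $a \in \mathcal{A}$, a unital Banach algebra, then $R(\cdot, a)$ is analytic
on $\rho(a)$.
Proof. Let $\lambda_0 \in \rho(a)$. Then
\[ \lim_{\lambda \to \lambda_0} \frac{R(\lambda, a) - R(\lambda_0, a)}{\lambda - \lambda_0} = \lim_{\lambda \to \lambda_0} \frac{(\lambda_0 - \lambda) \, R(\lambda, a) \, R(\lambda_0, a)}{\lambda - \lambda_0} = -R(\lambda_0, a)^2 \]
since inversion is continuous on $\rho(a)$. Thus the limit of the Newton quotient exists,
and so $R(\cdot, a)$ is analytic.
□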

10.45. Corollary. [Gelfand] If a ∈ A, a Banach algebra, then σ(a) is non-


empty.
Proof. We may assume that A is unital, for otherwise 0 ∈ σ(a) and we are done.
Similarly, if a = 0, then 0 ∈ σ(a). If ρ(a) = C, then clearly R(·, a) is entire. Now
for |λ| > ‖a‖, we have

\[ (\lambda - a)^{-1} = \bigl[\lambda(1 - \lambda^{-1}a)\bigr]^{-1} = \lambda^{-1} \sum_{n=0}^{\infty} (\lambda^{-1}a)^n = \sum_{n=0}^{\infty} \lambda^{-n-1} a^n \]

so that if |λ| ≥ 2‖a‖, then

\[ \|(\lambda - a)^{-1}\| \le \sum_{n=0}^{\infty} \frac{\|a\|^n}{(2\|a\|)^{n+1}} \le \frac{1}{\|a\|}. \]

That is, $\|R(\lambda, a)\| \le \|a\|^{-1}$ for all |λ| ≥ 2‖a‖.

Clearly there exists M < ∞ such that

\[ \max_{|\lambda| \le 2\|a\|} \|R(\lambda, a)\| \le M \]

since R(·, a) is a continuous function on this compact set. The conclusion is that
R(·, a) is a bounded, entire function. By Theorem 10.41, the resolvent function must
be constant. This obvious contradiction implies that σ(a) is non-empty.
□
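
The contradiction at the end of the proof can be made explicit. The following sketch assumes only the series expansion of (λ − a)^{-1} used above, valid for |λ| > ‖a‖:

```latex
\[
\|R(\lambda,a)\| \le \sum_{n=0}^{\infty} \frac{\|a\|^n}{|\lambda|^{n+1}}
  = \frac{1}{|\lambda| - \|a\|} \longrightarrow 0
  \quad \text{as } |\lambda| \to \infty .
\]
```

A constant function with this property must be identically 0, whereas R(λ, a) is invertible, and hence non-zero, for every λ ∈ ρ(a).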
Recall that a division algebra is an algebra in which each non-zero element is
invertible.
10.46. Theorem. [Gelfand-Mazur] If A is a Banach algebra and a division
algebra, then there is a unique isometric isomorphism of A onto C.
Proof. If b ∈ A, then σ(b) is non-empty by Corollary 10.45. Let β ∈ σ(b). Then
β1 − b is not invertible, and since A is a division algebra, we conclude that β1 = b;
that is to say, that σ(b) is a singleton.
Given a ∈ A, σ(a) is a singleton, say {λ_a}. The complex-valued map φ : a ↦ λ_a
is an algebra isomorphism. Moreover, ‖a‖ = ‖λ_a 1‖ = |λ_a| = ‖φ(a)‖, so the map is
isometric as well.
If φ′ : A → C were another such map, then φ′(a) ∈ σ(a), implying that φ′(a) =
φ(a).
□

10.47. Definition. Let a ∈ A, a Banach algebra. The spectral radius of a is

spr(a) = sup{|λ| : λ ∈ σ(a)}.

10.48. Lemma. The Spectral Mapping Theorem - polynomial version.
Let a ∈ A, a unital Banach algebra, and suppose p ∈ C[z] is a polynomial. Then
σ(p(a)) = p(σ(a)) := {p(λ) : λ ∈ σ(a)}.

Proof. Let α ∈ C. Then for some γ ∈ C,
p(z) − α = γ (z − β_1)(z − β_2) · · · (z − β_n)
and so
p(a) − α = γ (a − β_1)(a − β_2) · · · (a − β_n).
Thus (as all of the terms (a − β_i) commute),
α ∈ σ(p(a)) ⇐⇒ β_i ∈ σ(a) for some 1 ≤ i ≤ n
⇐⇒ p(z) − α = 0 for some z ∈ σ(a)
⇐⇒ α ∈ p(σ(a)).
□
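
As a sanity check on the lemma, here is a small illustration that is not taken from the text: we assume A = M_2(C), a = diag(1, 2), so that σ(a) = {1, 2}, and p(z) = z^2 − 3z + 2 = (z − 1)(z − 2).

```latex
\[
a = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \qquad
p(a) = (a - 1)(a - 2)
     = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
       \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}
     = 0,
\]
\[
\sigma(p(a)) = \{0\} = \{p(1),\, p(2)\} = p(\sigma(a)).
\]
```

Since p is not injective on σ(a) here, σ(p(a)) has fewer points than σ(a); the lemma asserts equality of sets, not of cardinalities.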

10.49. Theorem. [Beurling : The Spectral Radius Formula] If a ∈ A,
a Banach algebra, then

\[ \mathrm{spr}(a) = \lim_{n \to \infty} \|a^n\|^{1/n}. \]

Proof. First observe that if A is not unital, then we can always embed it isometrically
into a unital Banach algebra A+. Since both the left and right hand sides
of the above equation remain unchanged when a is considered as an element of A+,
we may (and do) assume that A is already unital.
Now $\sigma(a^n) = (\sigma(a))^n$, and so $\mathrm{spr}(a^n) = (\mathrm{spr}(a))^n$. Moreover, for all b ∈ A, the
proof of Corollary 10.39 shows that spr(b) ≤ ‖b‖. Thus

\[ \mathrm{spr}(a) = (\mathrm{spr}(a^n))^{1/n} \le \|a^n\|^{1/n} \quad \text{for all } n \ge 1. \]

This tells us that $\mathrm{spr}(a) \le \inf_{n \ge 1} \|a^n\|^{1/n}$.
On the other hand, R(·, a) is analytic on ρ(a) and hence is analytic on {λ ∈ C :
|λ| > spr(a)}. Furthermore, if |λ| > ‖a‖, then

\[ R(\lambda, a) = (\lambda - a)^{-1} = \lambda^{-1}(1 - \lambda^{-1}a)^{-1} = \sum_{n=0}^{\infty} \frac{a^n}{\lambda^{n+1}}. \]

Let φ ∈ A∗. Then φ ◦ R(·, a) is an analytic, complex-valued function,

\[ [\varphi \circ R(\cdot, a)](\lambda) = \sum_{n=0}^{\infty} \frac{\varphi(a^n)}{\lambda^{n+1}} \]

and this Laurent expansion is still valid for {λ ∈ C : |λ| > ‖a‖}, since the series
for R(·, a) is absolutely convergent on this set, and applying φ introduces at most
a factor of ‖φ‖ to the absolutely convergent sum. Since [φ ◦ R(·, a)] is analytic on
{λ ∈ C : |λ| > spr(a)}, the complex-valued series converges on this larger set.
From this it follows that the sequence $\{\varphi(a^n)/\lambda^{n+1}\}_{n=1}^{\infty}$ converges to 0 as n
tends to infinity for all φ ∈ A∗, and therefore is bounded for all φ ∈ A∗. It is now a
consequence of the Uniform Boundedness Principle that $\{a^n/\lambda^{n+1}\}_{n=1}^{\infty}$ is bounded
in norm, say by M_λ > 0, for each λ ∈ C satisfying |λ| > spr(a). That is:

\[ \|a^n\| \le M_\lambda\, |\lambda^{n+1}| \]

for all |λ| > spr(a). But then, for all |λ| > spr(a),

\[ \limsup_{n \ge 1} \|a^n\|^{1/n} \le \limsup_{n \ge 1} M_\lambda^{1/n}\, |\lambda^{(n+1)/n}| = |\lambda|. \]

Combining this estimate with the above yields $\mathrm{spr}(a) = \lim_{n \to \infty} \|a^n\|^{1/n}$.
□
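
The final "combining" step of the proof can be spelled out as follows. This is a sketch of how the two bounds fit together, followed by an illustration that is not from the text: a nilpotent matrix in M_2(C), which we assume is equipped with the operator norm.

```latex
% Letting |\lambda| decrease to spr(a) in the final estimate of the proof:
\[
\limsup_{n} \|a^n\|^{1/n} \le \mathrm{spr}(a)
  \le \inf_{n \ge 1} \|a^n\|^{1/n}
  \le \liminf_{n} \|a^n\|^{1/n},
\]
% so the limit exists and equals spr(a).

% Illustration (assumption: A = M_2(C) with the operator norm):
\[
a = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
\|a\| = 1, \qquad a^n = 0 \ \ (n \ge 2), \qquad
\lim_{n \to \infty} \|a^n\|^{1/n} = 0 = \mathrm{spr}(a).
\]
```

In particular, spr(a) can be strictly smaller than ‖a‖.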

They laughed when I said I was going to be a comedian. They’re not laughing now.
Bob Monkhouse

Exercises for Section 10.

Question 1. Let X and Y be Banach spaces, and let S, T ∈ B(X, Y). Prove that
for all k1 , k2 ∈ K,
(k1 S + k2 T )∗ = k1 S ∗ + k2 T ∗ .
Here, the ∗ -map refers to the Banach space adjoint.

Question 2.
Let H be a complex Hilbert space and M be a closed subspace of H. Let P be
the orthogonal projection of H onto M, and T ∈ B(H).
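(a) Prove that relative to the decomposition H = M ⊕ M⊥, we have
\[ P = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}. \]
(b) Prove that P = P^2 = P∗.
(c) More generally, write
\[ T = \begin{bmatrix} T_1 & T_2 \\ T_3 & T_4 \end{bmatrix} \]
relative to the decomposition H = M ⊕ M⊥. Prove that
\[ T^* = \begin{bmatrix} T_1^* & T_3^* \\ T_2^* & T_4^* \end{bmatrix}. \]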
(d) Prove that M is invariant for T if and only if (I − P)T P = 0, and M is
reducing for T if and only if T P = P T. Conclude that M is reducing for
T if and only if both M and M⊥ are invariant for T.

Question 3.
This problem is more challenging than the others, and should rightfully be an
assignment question.
Let H be a complex, separable Hilbert space and suppose that K ∈ K(H). Note
that K ∗ K is compact and selfadjoint, hence normal. By the Spectral Theorem for
Normal operators (see Corollary 10.34), there exists an onb $\{e_n\}_{n=1}^{\infty}$ for H such
that
K∗K = diag (d_1, d_2, d_3, . . .).
(a) Prove that dn ≥ 0 for all n ≥ 1.
We denote by |K| the diagonal operator (relative to this onb)
|K| = diag (s_1, s_2, s_3, . . .),
where $s_n = (d_n)^{1/2}$, n ≥ 1. Note that in the same way that for complex numbers we
have that $|z|^2 = z\bar{z}$, we now have that $|K|^2 = K^*K$.

The sequence (s_n)_n is called the sequence of singular numbers of K. For
1 ≤ p ≤ ∞, we define the Schatten p-class C_p to be the set of all compact operators K
such that $(s_n)_n \in \ell^p$.
It can be shown that C_p is a vector space, that $\|K\|_p := \|(s_n)_n\|_p$ defines a norm
on C_p, and that (C_p, ‖ · ‖_p) is complete, and thus a Banach space.

Let {f_n}_n denote a second onb for H. Recall that for x, y ∈ H, we defined the
rank-one operator x ⊗ y∗ ∈ B(H) via x ⊗ y∗(z) = ⟨z, y⟩x, z ∈ H.
(b) Prove that $\sum_n s_n f_n \otimes e_n^*$ converges in norm in B(H).
(c) Prove that if K ∈ K(H) has singular numbers (s_n)_n, then
s_n = dist(K, F_{n−1}),
where F_{n−1} = {F ∈ B(H) : rank F ≤ n − 1}. That is,
s_n = inf {‖K − F‖ : rank F ≤ n − 1}.

11. Appendix – topological background

A child of five could understand this. Fetch me a child of five.
Groucho Marx

11.1. At the heart of analysis is topology. A thorough study of topology is
beyond the scope of this course, and we refer the reader to the excellent book [Wil70]
General Topology, written by my former colleague Stephen Willard. The treatment
of topology in this section borrows heavily from his book.
We shall only give the briefest of overviews of this theory - assuming that the
student has some background in metric and norm topologies. We shall only cover
the notions of weak topologies and nets, which are vital to the study of Functional
Analysis.

11.2. Definition. A topology τ on a set X is a collection of subsets of X,
called open sets, which satisfy the following:
(i) X, ∅ ∈ τ - i.e. the entire space and the empty set are open;
(ii) If {G_α}_α ⊆ τ, then ∪_α G_α ∈ τ - i.e. arbitrary unions of open sets are open;
(iii) If n ≥ 1 and $\{G_k\}_{k=1}^{n} \subseteq \tau$, then $\cap_{k=1}^{n} G_k \in \tau$ - i.e. finite intersections of
open sets are open.
A set F is called closed if X\F is open. We call (X, τ ) (or more informally, we
call X) a topological space.
It is useful to observe that the intersection of a collection {τα }α of topologies on
X is once again a topology on X.
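
To see why the observation above is stated for intersections only, here is a small illustration, not taken from the text, on the three-point set X = {a, b, c}; the names τ_1 and τ_2 are ours.

```latex
\[
\tau_1 = \{\varnothing, \{a\}, X\}, \qquad
\tau_2 = \{\varnothing, \{b\}, X\},
\]
\[
\tau_1 \cap \tau_2 = \{\varnothing, X\}, \qquad
\tau_1 \cup \tau_2 = \{\varnothing, \{a\}, \{b\}, X\}.
\]
```

The intersection is again a topology, while the union fails axiom (ii), because {a} ∪ {b} = {a, b} does not belong to it.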

11.3. Example.
(i) Let X be any set. Then τ = {∅, X} is a topology on X, called the trivial
topology on X.
(ii) At the other extreme of the topological spectrum, if X is any non-empty
set, then τ = P(X), the power set of X, is a topology on X, called the
discrete topology on X.
(iii) Let X = {a, b}, and set τ = {∅, {a}, {a, b}}. Then τ is a topology on X.
(iv) Let (X, d) be a metric space. Let
τ = {G ⊆ X : for all g ∈ G there exists δ > 0 such that
b_δ(g) := {y ∈ X : d(g, y) < δ} ⊆ G}.
Then τ is a topology, called the metric topology on X induced by d.
This is the usual topology one thinks of when dealing with metric spaces,
but as we shall see, there can be many more.
(v) Let X be any non-empty set. Then
τ_cf = {∅} ∪ {Y ⊆ X : X\Y is finite}
is a topology on X, called the co-finite topology on X.

11.4. Definition. Let (X, τ) be a topological space, and x ∈ X. A set U
is called a neighbourhood (abbreviated nbhd) of x if there exists G ∈ τ so that
x ∈ G ⊆ U. The reader is cautioned that some authors require nbhds to be open - we
do not. The neighbourhood system at x is Ux := {U ⊆ X : U is a nbhd of x}.

The following result from [Wil70] illustrates the importance of nbhd systems.
11.5. Theorem. Let (X, τ ) be a topological space, and x ∈ X. Then:
(a) If U ∈ Ux , then x ∈ U .
(b) If U, V ∈ Ux , then U ∩ V ∈ Ux .
(c) If U ∈ Ux , there exists V ∈ Ux such that U ∈ Uy for each y ∈ V .
(d) If U ∈ Ux and U ⊆ V , then V ∈ Ux .
(e) G ⊆ X is open if and only if G contains a nbhd of each of its points.
Conversely, if in a set X a non-empty collection Ux of subsets of X is assigned
to each x ∈ X so as to satisfy conditions (a) through (d), and if we use (e) to define
the notion of an open set, the result is a topology on X in which the nbhd system at
x is precisely Ux .
Because of this, it is clear that if we know the nbhd system of each point in X,
then we know the topology of X.

There are a number of natural separation axioms that a topological space might
satisfy.

11.6. Definition. Let (X, τ) be a topological space.
(i) (X, τ) is said to be T0 if for every x, y ∈ X such that x ≠ y, either there
is a neighbourhood Ux of x with y ∉ Ux or there is a neighbourhood Uy of
y with x ∉ Uy.
(ii) (X, τ) is said to be T1 if for every x, y ∈ X such that x ≠ y, there are
neighbourhoods Ux of x and Uy of y with y ∉ Ux and x ∉ Uy.
(iii) (X, τ) is said to be T2 (or Hausdorff) if for every x, y ∈ X such that
x ≠ y, there are neighbourhoods Ux of x and Uy of y with Ux ∩ Uy = ∅.
We say that two subsets A and B of X can be separated by τ if there exist U, V ∈ τ
with A ⊆ U, B ⊆ V and U ∩ V = ∅.
(iv) (X, τ) is said to be regular if whenever F ⊆ X is closed and x ∉ F,
F and {x} can be separated.
(v) (X, τ ) is said to be normal if whenever F1 , F2 ⊂ X are closed and disjoint,
then F1 and F2 can be separated.
(vi) (X, τ ) is said to be T3 if it is T1 and regular.
(vii) (X, τ ) is said to be T4 if it is T1 and normal.
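
These axioms are genuinely different. As a small illustration (our own check, not taken from the text), consider the two-point space of Example 11.3 (iii) and compute its neighbourhood systems:

```latex
\[
X = \{a, b\}, \qquad \tau = \{\varnothing, \{a\}, \{a, b\}\},
\]
\[
\mathcal{U}_a = \{\{a\},\, X\}, \qquad \mathcal{U}_b = \{X\}.
\]
% T_0 holds: \{a\} is a neighbourhood of a with b \notin \{a\}.
% T_1 fails: the only neighbourhood of b is X, which contains a.
```

So (X, τ) is T0 but not T1, and in particular it is not Hausdorff.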

We are assuming that the next definition is a familiar one.

11.7. Definition. Let (X, τ) be a topological space. An open cover of X is a
collection $\mathcal{G} \subseteq \tau$ such that $X = \cup_{G \in \mathcal{G}} G$. A finite subcover of X relative to $\mathcal{G}$ is a
finite subset $\{G_1, G_2, ..., G_n\} \subseteq \mathcal{G}$ which is again an open cover of X.
A topological space (X, τ ) is said to be compact if every open cover of X admits
a finite subcover.

11.8. Theorem. Let (X, d) be a metric space. Then X, equipped with the
metric topology, is T4 .

11.9. Theorem. Let (X, τ ) be a compact, Hausdorff space. Then (X, τ ) is T4 .

Recall that a topological space (X, τ) is said to be separable if it admits a
countable dense subset.

The following result will be needed in Section 7.

11.10. Proposition. Let (X, d) be a compact metric space. Then (X, d) is
separable.
Proof. For each n ≥ 1, the collection $\mathcal{G}_n := \{b_{1/n}(x) : x \in X\}$ is an open cover of
X. Since X is compact, we can find a finite subcover $\{b_{1/n}(x_{(j,n)}) : 1 \le j \le k_n\}$ of
X. It is then clear that if x ∈ X, there exists 1 ≤ j ≤ k_n so that $d(x, x_{(j,n)}) < 1/n$.
As such, the collection
\[ D := \{x_{(j,n)} : 1 \le j \le k_n,\ 1 \le n\} \]
is a countable, dense set in X, proving that (X, d) is separable.
□

11.11. Definition. Let (X, τ) be a topological space. A neighbourhood base
Bx at a point x ∈ X is a collection Bx ⊆ Ux so that U ∈ Ux implies that there exists
B ∈ Bx so that B ⊆ U. We refer to the elements of Bx as basic nbhds of the point
x.

The importance of neighbourhood bases is that all open sets can be constructed
from them, as we shall soon see.

11.12. Example. Let (X, d) be a metric space equipped with the metric
topology τ. For each x ∈ X, fix a sequence $\{r_n(x)\}_{n=1}^{\infty}$ of positive real numbers
such that $\lim_{n \to \infty} r_n(x) = 0$ and consider $B_x = \{V_{r_n}(x) : n \ge 1\}$, where $V_{r_n}(x)$
denotes the open ball of radius $r_n(x)$ centred at x. Then Bx is a nbhd
base at x for each x ∈ X.

11.13. Definition. Let (X, τ ) be a topological space. A base for the topology is
a collection B ⊆ τ so that for every G ∈ τ there exists C ⊆ B so that G = ∪{B : B ∈
C}. That is, every open set is a union of elements of B. Note that if C is empty,
then ∪{B : B ∈ C} is also empty, so we do not need to include the empty set in our
base. A subbase for the topology is a collection S ⊆ τ such that the collection B of
all finite intersections of elements of S forms a base for τ .

As we shall see in the Assignments, any collection C of subsets of X serves as a
subbase for some topology on X, called the topology generated by C.

11.14. Example. Let (X, τ) be a topological space, and for each x ∈ X,
suppose that Bx is a neighbourhood base at x consisting of open sets. Then
$B := \cup_{x \in X} B_x$ is a base for the topology τ on X.

11.15. Example. Consider R with the usual topology τ. The collection B =
{(a, b) : a, b ∈ R, a < b} is a base for τ. (You might remember from Real Analysis
that every open set in R is a disjoint union of open intervals - although the fact the
union is disjoint in this setting is a luxury item which we have not built into the
definition of a base in general.)
The collection S = {(−∞, a) : a ∈ R} ∪ {(b, ∞) : b ∈ R} is a subbase for the
usual topology, but is not a base for τ .
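
To see how finite intersections of elements of S recover the base B of open intervals, here is the one-line computation (a sketch; a < b are arbitrary real numbers):

```latex
\[
(a, b) = (-\infty, b) \cap (a, \infty), \qquad a < b .
\]
```

On the other hand, a bounded interval (a, b) is not a union of elements of S, since every element of S is unbounded; this is why S itself is not a base for τ.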

11.16. Definition. Let (X, τ) be a topological space. A directed set is a set
Λ with a relation ≤ that satisfies:
(i) λ ≤ λ for all λ ∈ Λ;
(ii) if λ1 ≤ λ2 and λ2 ≤ λ3 , then λ1 ≤ λ3 ; and
(iii) if λ1 , λ2 ∈ Λ, then there exists λ3 so that λ1 ≤ λ3 and λ2 ≤ λ3 .
The relation ≤ is sometimes called a direction on Λ.
A net in X is a function P : Λ → X, where Λ is a directed set. The point P (λ)
is usually denoted by xλ , and we often write (xλ )λ∈Λ to denote the net.
A subnet of a net P : Λ → X is the composition P ◦ ϕ, where ϕ : M → Λ is an
increasing cofinal function from a directed set to Λ; that is,
(a) ϕ(µ1 ) ≤ ϕ(µ2 ) if µ1 ≤ µ2 (increasing), and
(b) for each λ ∈ Λ, there exists µ ∈ M so that λ ≤ ϕ(µ) (cofinal).
For µ ∈ M, we often write $x_{\lambda_\mu}$ for P ◦ ϕ(µ), and speak of the subnet $(x_{\lambda_\mu})_\mu$.

11.17. Definition. Let (X, τ ) be a topological space. The net (xλ )λ is said to
converge to x ∈ X if for every U ∈ Ux there exists λ0 ∈ Λ so that λ ≥ λ0 implies
xλ ∈ U .
We write limλ xλ = x, or limλ∈Λ xλ = x.
This mimics the definition of convergence of a sequence in a metric space.

11.18. Example.
(a) Since N is a directed set under the usual order ≤, every sequence is a net.
Any subsequence of a sequence is also a subnet. The converse to this is
false, however. A subnet of a sequence need not be a subsequence, since
its domain need not be N (or any countable set, for that matter).
(b) Let A be a non-empty set and Λ denote the power set of A,
partially ordered with respect to inclusion. Then Λ is a directed set, and
any function from Λ to R is a net in R.
(c) Let $\mathcal{P}$ denote the set of all finite partitions of [0, 1], partially ordered by
inclusion (i.e. refinement). Let f be a continuous function on [0, 1]; then
to $P = \{0 = t_0 < t_1 < \cdots < t_n = 1\} \in \mathcal{P}$, we associate the quantity
$L_P(f) = \sum_{i=1}^{n} f(t_{i-1})(t_i - t_{i-1})$. The map $P \mapsto L_P(f)$ is a net ($\mathcal{P}$ is a
directed set), and from Calculus, $\lim_{P \in \mathcal{P}} L_P(f) = \int_0^1 f(x)\,dx$.

11.19. Example. Let (X, τX) be a topological space and x ∈ X. Let Ux denote
the nbhd system at x. If, for U1, U2 ∈ Ux we define the relation U1 ≤ U2 if U2 ⊆ U1,
then (Ux, ≤) forms a directed set.
For each U ∈ Ux, choose x_U ∈ U. Then $(x_U)_{U \in U_x}$ forms a net in X. It is not
hard to see that $\lim_{U \in U_x} x_U = x$. Indeed, given V ∈ Ux, we have that x_U ∈ V for
all U ≥ V.
Observe that if (X, τX) is not Hausdorff, it is entirely possible that there exists
y ≠ x in X so that $y = \lim_{U \in U_x} x_U$ as well. (You should convince yourself of this
by producing an example.) The property that (X, τX) is Hausdorff is equivalent to
the condition that limits of nets in X are unique.
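
One possible example of non-unique limits, offered here as our own illustration rather than the one the text has in mind, is the two-point set with the trivial topology of Example 11.3 (i):

```latex
\[
X = \{a, b\}, \qquad \tau = \{\varnothing, X\}, \qquad
\mathcal{U}_a = \mathcal{U}_b = \{X\}.
\]
% Any net (x_\lambda)_\lambda in X lies in the only neighbourhood X of a and of b, so
\[
\lim_\lambda x_\lambda = a \quad \text{and} \quad \lim_\lambda x_\lambda = b .
\]
```

This space is not Hausdorff, since a and b have no disjoint neighbourhoods, which is consistent with the equivalence stated above.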

11.20. Definition.
Let (X, τX) and (Y, τY) be topological spaces. We say that a function f : X → Y
is continuous if $f^{-1}(G)$ is open in X for all G ∈ τY.

That this extends our usual notion of continuity for functions between metric
spaces is made clear by the following result:
11.21. Proposition. If (X, dX) and (Y, dY) are metric spaces with metric space
topologies τX and τY respectively, then the following are equivalent for a function
f : X → Y:
(a) f is continuous on X, i.e. $f^{-1}(G) \in \tau_X$ for all G ∈ τY.
(b) $\lim_n f(x_n) = f(x)$ whenever $(x_n)_{n=1}^{\infty}$ is a sequence in X converging to
x ∈ X.

As we shall see in the Assignments, sequences are not enough to describe conver-
gence, nor are they enough to characterize continuity of functions between general
topological spaces. On the other hand, nets are sufficient for this task, and serve
as the natural replacement for sequences. (The following result also admits a local
version, which we shall also see in the Assignments.)

11.22. Theorem. Let (X, τX) and (Y, τY) be topological spaces. Let f : X → Y
be a function. The following are equivalent:
(a) f is continuous on X.
(b) Whenever (xλ )λ∈Λ is a net in X which converges to x ∈ X, it follows that
(f (xλ ))λ∈Λ is a net in Y which converges to f (x).

The notion of a weak topology on a set X generated by a family of functions
{fγ} from X into topological spaces (Yγ, τγ) is of crucial importance in the study of
topological vector spaces and of Banach spaces. It is also vital to the understanding
of the product topology on a family of topological spaces, which we shall see shortly.
11.23. Definition. Let X ≠ ∅ be a set and $\{(Y_\gamma, \tau_\gamma)\}_{\gamma \in \Gamma}$ be a family of topological
spaces. Suppose that for each γ ∈ Γ there exists a function f_γ : X → Y_γ. Set
$F = \{f_\gamma\}_{\gamma \in \Gamma}$.
If $S = \{f_\gamma^{-1}(G_\gamma) : G_\gamma \in \tau_\gamma,\ \gamma \in \Gamma\}$, then S ⊆ P(X) and – as noted above – S
is a subbase for a topology on X, denoted by σ(X, F), and referred to as the weak
topology on X induced by F.

The main and most important result concerning weak topologies induced by a
family of functions is the following:
11.24. Proposition.
(a) If τ is a topology on X and if fγ : (X, τ ) → (Yγ , τγ ) is continuous for all
γ ∈ Γ, then σ(X, F) ⊆ τ . In other words, σ(X, F) is the weakest topology
on X under which each fγ is continuous.
(b) Let (Z, τZ ) be a topological space. Then g : (Z, τZ ) → (X, σ(X, F)) is
continuous if and only if fγ ◦ g : Z → Yγ is continuous for all γ ∈ Γ.
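
As a concrete illustration of a weak topology (our own example, not from the text), take X = R, let Γ be a single point, let Y = R with its usual topology, and let f(x) = x^2. Since preimages commute with unions and intersections, the subbase S is here already the whole topology:

```latex
\[
\sigma(\mathbb{R}, \{f\}) = \{\, f^{-1}(G) : G \subseteq \mathbb{R} \text{ open} \,\},
\qquad f(x) = x^2 .
\]
% Every open set is symmetric about 0, because f(x) = f(-x):
\[
x \in f^{-1}(G) \iff -x \in f^{-1}(G).
\]
```

In particular, 1 and −1 cannot be separated, so this weak topology is not Hausdorff. It is also strictly weaker than the usual topology on R, in line with part (a) of the Proposition: f is continuous for the usual topology, but (0, 1) is not symmetric and so is not open in σ(R, {f}).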
11.25. Definition. Let {(Xα , τα )}α∈Λ be a collection of topological spaces. The
Cartesian product of the sets Xα is
Πα∈Λ Xα = {x : Λ → ∪α Xα | x(α) ∈ Xα for each α ∈ Λ}.
As with sequences, we write (xα )α for x.

The map πβ : ΠXα → Xβ, πβ(x) = xβ is called the βth projection map.
The product topology on Πα Xα is the weak topology on Πα Xα induced by the
family {πβ }β∈Λ . As we shall see in the Assignments, this is the topology which has
as a base the collection B = {Πα∈Λ Uα }, where
(a) Uα ∈ τα for all α; and
(b) for all but finitely many α, Uα = Xα .
It should be clear from the definition that in (a), it suffices to ask that we take
Uα ∈ Bα , where Bα is a fixed base for τα , α ∈ Λ.

Observe that if Uα ∈ τα and Uα = Xα for all α except for α_1, α_2, ..., α_n, then

\[ \prod_\alpha U_\alpha = \pi_{\alpha_1}^{-1}(U_{\alpha_1}) \cap \cdots \cap \pi_{\alpha_n}^{-1}(U_{\alpha_n}). \]

From this it follows that $\{\pi_\alpha^{-1}(U_\alpha) : U_\alpha \in B_\alpha,\ \alpha \in \Lambda\}$ is a subbase for the product
topology, where Bα is a fixed base (or indeed even a subbase will do) for the topology
on Xα .

It is perhaps worth pointing out that it follows from the Axiom of Choice that
if for all α ∈ Λ we have Xα ≠ ∅, then Πα∈Λ Xα ≠ ∅.
We leave it to the reader to verify that the product topology on $\mathbb{R}^n = \prod_{k=1}^{n} \mathbb{R}$ is
just the usual topology on $\mathbb{R}^n$.
Bibliography

[Dav88] K.R. Davidson. Nest algebras. Triangular forms for operators on a Hilbert space, volume
191 of Pitman Research Notes in Mathematics. Longman Scientific and Technical, Harlow,
1988.
[KR83] R.V. Kadison and J.R. Ringrose. Fundamentals of the theory of operator algebras I: Ele-
mentary theory. Academic Press, New York, 1983.
[Lin71] J. Lindenstrauss. The geometric theory of the classical Banach spaces. Actes du Congrès
Intern. Math., 1970, Paris, 2:365–372, 1971.
[Sch27] J. Schauder. Zur Theorie stetiger Abbildungen in Funktionalräumen. Math. Z., 26:47–65,
1927.
[Sou78] A.R. Sourour. Operators with absolutely bounded matrices. Math. Z., 162:183–187, 1978.
[Sun78] V.S. Sunder. Absolutely bounded matrices. Indiana Univ. Math. J., 27:919–927, 1978.
[Tsi74] B.S. Tsirel’son. Not every Banach space contains an imbedding of ℓ_p or c_0. Functional
Anal. Appl., 8:138–141, 1974.
[Whi66] R. Whitley. Projecting m onto c0 . The Amer. Math. Monthly, 73:285–286, 1966.
[Wil70] Stephen Willard. General Topology. Addison-Wesley Publishing Co., Reading, Mass.-
London-Don Mills, Ont., 1970.
