Functional Analysis Lecture Notes

The document outlines the course 'Intro to Functional Analysis' taught by Dr. Liran Rotem, covering various topics such as the Stone-Weierstrass Theorem, Hilbert Spaces, and Bounded Linear Operators. It includes definitions, theorems, and examples relevant to functional analysis, emphasizing the structure and properties of function spaces. The course materials are organized into sections that progressively build on foundational concepts in functional analysis.


Intro to Functional Analysis - 104276

Lectures by Dr. Liran Rotem


Spring 21/22

Yam Y. Felsenstein ([Link]@[Link])


Last Updated: June 15, 2022

Contents
1 Introduction, Stone-Weierstrass Theorem
1.1 What is Functional Analysis?
1.2 Stone-Weierstrass Theorem

2 Stone-Weierstrass - Examples And The Complex Case, Hilbert Spaces
2.1 Stone-Weierstrass - Examples
2.2 The Complex Case
2.3 Hilbert Spaces

3 Example of Infinite-Dimensional Complete Space, Completion
3.1 The Space ℓ2
3.2 Completion

4 The Space L2
4.1 Construction of L2
4.2 On The Measure-Theoretic Construction
4.3 Properties of L2

5 Orthogonal Projections

6 Orthonormal Bases
6.1 Definition of Orthonormal Bases
6.2 Existence of Orthonormal Bases

7 Orthonormal Basis of L2

8 Pointwise Convergence of Fourier Series
8.1 Pointwise Convergence and Dirichlet's Theorem
8.2 Cesaro Averages

9 Bounded Linear Operators

10 Bounded Operators Contd.
10.1 Riesz' Representation Theorem
10.2 Adjoint Operators

11 Matrix Representation of Operators

12 Banach Spaces

13 Bounded Operators On Banach Spaces

14 Weak Convergence, Ergodic Theory
14.1 Weak Convergence
14.2 Ergodic Theory

15 The Mean Ergodic Theorem

16 Invertible Operators And The Inverse Mapping Theorem, Spectrum
16.1 Invertible Operators And The Inverse Mapping Theorem
16.2 Spectrum of An Operator

17 Compact Operators

18 Spectrum of Compact Operators

19 Compact Operators In Hilbert Spaces
19.1 Compact Operators In A Hilbert Space
19.2 Spectral Theorem For Compact Self-Adjoint Operators

20 Spectral Theorem For Compact Normal Operators, Functional Calculus
20.1 Spectral Theorem For Normal Operators
20.2 Functions of Operators

21 Root of Positive-Semidefinite Operator

22 The Fourier Transform

Conventions
If we write $C(X)$ without specifying further, it is assumed that $C(X) = C_{\mathbb{C}}(X)$.

1 Introduction, Stone-Weierstrass Theorem
1.1 What is Functional Analysis?
Let us consider two problems, and inspect the differences between them.

Problem 1.1. Given an $n \times n$ matrix $A$, find functions $u : \mathbb{R} \to \mathbb{R}^n$ such that $u'(t) = A(u(t))$.

This is a problem we know how to solve, and the solution depends on spectral properties of $A$. This is not functional analysis.

Problem 1.2. Find a function $v : \mathbb{R}^2 \to \mathbb{R}$, $v = v(t, x)$, such that

$$\frac{\partial v}{\partial t}(t, x) = \frac{\partial^2 v}{\partial x^2}(t, x) \tag{1}$$

Define the space V = C ∞ (R) (All infinitely differentiable functions R → R). Define the map
A : V → V, A(f ) = f ′′ (2)
By linearity of the derivative, this is a linear transformation. We claim that a (sufficiently nice) function v : R2 → R can
be identified with a function u : R → V by u(t)(x) = v(t, x). In the language of u, we wish to solve the equation
u′ (t) = A(u(t)) (3)
We have now reduced our problem down to one similar to our first problem. As before, the solution will depend on
spectral properties of A.

This is functional analysis, because $V$ is a space of functions, and in particular it is infinite-dimensional.
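The claim that the solution depends on spectral properties of $A(f) = f''$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names (`second_derivative`, `heat_mode`, `residual`) are hypothetical, derivatives are replaced by central finite differences, and the function $\sin(kx)$ is chosen because it satisfies $A(\sin(kx)) = -k^2 \sin(kx)$, so that $v(t, x) = e^{-k^2 t}\sin(kx)$ should solve equation (1).

```python
import math

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x), a stand-in for A(f) = f''."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def heat_mode(k):
    """Return v(t, x) = exp(-k^2 t) sin(k x), a candidate solution of v_t = v_xx."""
    return lambda t, x: math.exp(-k * k * t) * math.sin(k * x)

def residual(v, t, x, h=1e-4):
    """|v_t - v_xx| at (t, x), with both derivatives approximated by central differences."""
    v_t = (v(t + h, x) - v(t - h, x)) / (2.0 * h)
    v_xx = second_derivative(lambda s: v(t, s), x, h)
    return abs(v_t - v_xx)

k = 3
f = lambda x: math.sin(k * x)

# sin(kx) behaves like an eigenvector of A with eigenvalue -k^2
eig_error = abs(second_derivative(f, 0.7) - (-k * k) * f(0.7))
# the corresponding exponential-in-time mode satisfies the PDE up to discretization error
pde_error = residual(heat_mode(k), 0.2, 0.7)
print(eig_error, pde_error)
```

Both printed numbers should be small, limited only by the finite-difference step and floating-point rounding.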

1.2 Stone-Weierstrass Theorem


Consider the R-Vector Space CR ([a, b]), the space of all continuous functions f : [a, b] → R with the regular operations
of pointwise addition and multiplication.

Definition 1.3. A collection {vi }i∈I will be called a (Hamel) Basis for a vector space V if every vector v ∈ V can be
uniquely expressed as a finite linear combination of some of the vi ’s.

Theorem 1.4. Assuming the axiom of choice, every Vector Space has a basis.

In the case of $C_{\mathbb{R}}([a, b])$, the basis promised by the above theorem is uncountable, and cannot be defined constructively. On $C_{\mathbb{R}}([a, b])$, we can define the sup-norm
$$\|f\|_\infty = \max_{[a,b]} |f| \tag{4}$$

(Note that $f$ is continuous on a compact set, therefore the maximum is finite and attained at some point.) This is indeed a norm on $C_{\mathbb{R}}([a, b])$. Does there exist a 'nice' (in particular countable) collection of functions $\{f_i\}_{i \in I}$ such that $\overline{\operatorname{span}\{f_i\}} = C_{\mathbb{R}}([a, b])$ (where the bar denotes topological closure)? Note that convergence in the sup-norm is exactly uniform convergence. The following theorem, sometimes seen in a course in Probability or Infi 2, answers the question:

Theorem 1.5. (Weierstrass Approximation Theorem): For each function $f \in C_{\mathbb{R}}([a, b])$, there is a sequence of polynomials $(p_n)_{n=1}^{\infty}$ such that $p_n \to f$ uniformly on $[a, b]$.

This theorem implies that the span of $\{1, x, x^2, \ldots\}$ is dense in $C_{\mathbb{R}}([a, b])$. We shall prove a more general result, but to prove the general version we need a very special case of the above result:

Lemma 1.6. f (x) = |x| can be uniformly approximated by polynomials on any [a, b].

Proof. WLOG (by a linear change of variables) $[a, b] = [-1, 1]$. It can be proven with tools of Infi 2 that the Taylor series of $h(x) = \sqrt{1 - x}$ converges uniformly on $[-1, 1]$, therefore for all $\varepsilon > 0$, there is some polynomial $p$ such that
$$\max_{x \in [-1,1]} |p(x) - h(x)| < \varepsilon \tag{5}$$

In particular, since $1 - x^2 \in [0, 1] \subseteq [-1, 1]$ for every $x \in [-1, 1]$, we get

$$\max_{x \in [-1,1]} \left| p(1 - x^2) - h(1 - x^2) \right| < \varepsilon \tag{6}$$

But $h(1 - x^2) = \sqrt{x^2} = |x|$, and $p(1 - x^2)$ is still a polynomial, so this concludes the proof.
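The construction in this proof can be tried out numerically. The following is a minimal sketch, assuming we take $p$ to be a Taylor polynomial of $\sqrt{1 - t}$ around $t = 0$ (the function names are illustrative, not part of the notes). It evaluates $p(1 - x^2)$ on a grid in $[-1, 1]$ and reports the largest deviation from $|x|$.

```python
def sqrt_series_coeffs(degree):
    """Taylor coefficients a_0..a_degree of sqrt(1 - t) around t = 0."""
    coeffs = [1.0]
    for k in range(1, degree + 1):
        # a_k = a_{k-1} * (k - 3/2) / k, from the binomial series of (1 - t)^(1/2)
        coeffs.append(coeffs[-1] * (k - 1.5) / k)
    return coeffs

def abs_approx(x, coeffs):
    """Evaluate p(1 - x^2), where p is the Taylor polynomial with the given coefficients."""
    t = 1.0 - x * x
    value = 0.0
    for a in reversed(coeffs):  # Horner's rule
        value = value * t + a
    return value

def sup_error(degree, samples=2001):
    """Largest value of |p(1 - x^2) - |x|| over an evenly spaced grid in [-1, 1]."""
    coeffs = sqrt_series_coeffs(degree)
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(abs_approx(x, coeffs) - abs(x)) for x in grid)

for degree in (10, 50, 400):
    print(degree, sup_error(degree))
```

The error shrinks as the degree grows, but slowly: the worst point is $x = 0$, where $|x|$ has its corner.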

Definition 1.7. Let $X$ be a compact Hausdorff topological space. We define
$$C_{\mathbb{R}}(X) = \{f : X \to \mathbb{R} : f \text{ continuous}\} \tag{7}$$
and we define the sup-norm on $C_{\mathbb{R}}(X)$ by
$$\|f\|_\infty = \max_{x \in X} |f(x)| \tag{8}$$

Remark. Note that the maximum is attained as X is compact.

Definition 1.8. A vector subspace A ⊆ CR (X) is called a Sub-Algebra if A is closed under pointwise multiplication
of functions.

Theorem 1.9. (Stone-Weierstrass): Let $A \subseteq C_{\mathbb{R}}(X)$ be a subalgebra, and suppose the following hold:
1. $A$ contains the constant functions.
2. $A$ separates points: for every $x \neq y \in X$, there is $f \in A$ such that $f(x) \neq f(y)$.
Then $A$ is dense in $C_{\mathbb{R}}(X)$, i.e. $\overline{A} = C_{\mathbb{R}}(X)$.

Remark. Clearly the subalgebra of polynomials satisfies the hypothesis, hence Stone-Weierstrass implies the Weierstrass Approximation Theorem.

Proof. First, note that we can suppose that $A$ is closed, since otherwise we can simply apply the theorem to its closure $\overline{A}$. Indeed, $\overline{A}$ is a subalgebra, since $\overline{A}$ is closed under addition, scalar multiplication and multiplication (as the products and sums of uniformly convergent sequences are also uniformly convergent on compact sets), and it still contains the constants and separates points.

Next, we show that if $f \in A$, then $|f| \in A$: $f$ is continuous on a compact space, therefore it is bounded, i.e. there exists $L$ such that $|f| \leq L$ on $X$. Consider a sequence of polynomials $p_n$ converging uniformly to $|\cdot|$ on $[-L, L]$. We have
$$\max_{x \in X} \big| p_n(f(x)) - |f(x)| \big| \overset{(1)}{\leq} \max_{t \in [-L,L]} \big| p_n(t) - |t| \big| \xrightarrow{n \to \infty} 0 \tag{9}$$

where (1) holds as $f$ takes values in $[-L, L]$, so we are simply enlarging the set of values over which the maximum is taken. Note that $p_n \circ f \in A$ since $A$ is closed under multiplication, addition and scalar multiplication, but $p_n \circ f \to |f|$ uniformly, and $A$ is (topologically) closed by our assumption, hence $|f| \in A$.

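We now show that if $f, g \in A$, then $\min(f, g) =: f \wedge g$ and $\max(f, g) =: f \vee g$ are in $A$. This is simply because we can write

$$\max(f, g) = \frac{f + g}{2} + \frac{|f - g|}{2} \in A \tag{10}$$
and similarly
$$\min(f, g) = \frac{f + g}{2} - \frac{|f - g|}{2} \in A \tag{11}$$
as $A$ is a subalgebra and $|f - g| \in A$ by the previous step.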

Finally, we prove the theorem: Let $f \in C_{\mathbb{R}}(X)$. $A$ separates points, hence for every $x \neq y$, there is some $g_{x,y} \in A$ such that $g_{x,y}(x) \neq g_{x,y}(y)$. By a linear transformation, we can take $\alpha, \beta \in \mathbb{R}$ such that $h_{x,y} = \alpha g_{x,y} + \beta \in A$ (as $A$ contains constants) satisfies
$$h_{x,y}(x) = f(x), \quad h_{x,y}(y) = f(y) \tag{12}$$
Let $\varepsilon > 0$, and fix some $x \in X$. For every $y$, $h_{x,y}(y) = f(y)$ implies $h_{x,y}(y) < f(y) + \varepsilon$ (for $y = x$, we can simply take the constant function $h_{x,x} \equiv f(x)$). By continuity, there is some nbhd $U_y$ of $y$ such that $h_{x,y} < f + \varepsilon$ on $U_y$. The sets $U_y$ cover $X$, hence by compactness, there is a finite subcover $U_{y_1}, \ldots, U_{y_m}$. Define
$$h_x = \min(h_{x,y_1}, h_{x,y_2}, \ldots, h_{x,y_m}) \in A \tag{13}$$
Now $h_x < f + \varepsilon$ on all of $X$, and $h_x(x) = f(x)$. For each $x \in X$, $h_x(x) = f(x)$ implies $h_x > f - \varepsilon$ on some nbhd $V_x$ of $x$. As before, we have a cover of $X$, so there is some finite subcover $V_{x_1}, \ldots, V_{x_k}$. We now define
$$\beta = \max(h_{x_1}, \ldots, h_{x_k}) \in A \tag{14}$$
So $\beta > f - \varepsilon$ on all of $X$, but we also have that $\beta < f + \varepsilon$ on all of $X$, since $h_{x_i} < f + \varepsilon$ for all $i$, therefore we have
$$-\varepsilon < \beta - f < \varepsilon \implies \forall x \in X : |\beta(x) - f(x)| < \varepsilon \implies \|\beta - f\|_\infty < \varepsilon \tag{15}$$
As $\varepsilon$ was arbitrary, this implies that $f \in \overline{A} = A$, so $A = C_{\mathbb{R}}(X)$ as required.

2 Stone-Weierstrass - Examples And The Complex Case, Hilbert Spaces
2.1 Stone-Weierstrass - Examples
Examples.
1. By considering CR ([0, 1]), and A = {Polynomials} we recover the Weierstrass Approximation Theorem.
2. More generally, let X ⊆ Rd be compact, and consider CR (X), A = {Polynomials in d variables on X}, then
clearly A is a subalgebra, contains the constant functions and separates points, so A is dense in CR (X).
3. Consider the space $C_{\mathbb{R},\mathrm{per}}(\mathbb{R}) = \{f : \mathbb{R} \to \mathbb{R} \text{ continuous, periodic with period } 1\}$. This is still a vector space, and we can define
$$\|f\| = \max_{x \in \mathbb{R}} |f(x)| = \max_{x \in [0,1]} |f(x)| \tag{16}$$
where the last equality holds as $f$ is periodic with period 1. Of course we cannot use S-W directly on this space, as $\mathbb{R}$ is not compact, but we can circumvent this:

Define the space $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$, the unit circle, and define the map
$$\Phi : C_{\mathbb{R}}(\mathbb{T}) \to C_{\mathbb{R},\mathrm{per}}(\mathbb{R}), \quad \Phi(f)(x) = f(e^{2\pi i x}) \tag{17}$$
$\Phi$ is linear, multiplicative, bijective, and preserves the norm (i.e. $\|\Phi(f)\| = \|f\|_\infty$). Consider
$$A = \left\{ \sum_{n=0}^{N} a_n \cos(2\pi n x) + b_n \sin(2\pi n x) : N \in \mathbb{N},\ a_n, b_n \in \mathbb{R} \right\} \tag{18}$$
We claim that $A$ is an algebra. Clearly $A$ is a vector subspace, and it is closed under multiplication by applying trigonometric identities such as
$$\cos(\alpha)\cos(\beta) = \frac{1}{2}\big(\cos(\alpha + \beta) + \cos(\alpha - \beta)\big) \tag{19}$$
$A$ contains the constants, since we can take $a_0 \cos(0 \cdot x) = a_0$, and $A$ separates points on $\mathbb{T}$: if there exist two points $x, y \in \mathbb{R}$ such that $f(x) = f(y)$ for all $f \in A$, then in particular $\sin(2\pi x) = \sin(2\pi y)$ and $\cos(2\pi x) = \cos(2\pi y)$, so $e^{2\pi i x} = e^{2\pi i y}$, i.e. these points are the same on $\mathbb{T}$ (this is the contrapositive of point separation).

We now consider $\Phi^{-1}(A) \subseteq C_{\mathbb{R}}(\mathbb{T})$. This is a subalgebra (since $\Phi$ is linear and multiplicative, so is $\Phi^{-1}$), contains the constant functions (since $\Phi^{-1}$ clearly sends constant functions to constant functions), and separates points:

If $z, w \in \mathbb{T}$ and $g(z) = g(w)$ for all $g \in \Phi^{-1}(A)$, then $z = w$. Indeed, write $z = e^{2\pi i x}$ and $w = e^{2\pi i y}$ for some $x, y \in \mathbb{R}$. Then $g(z) = g(w)$ means that $\Phi(g)(x) = \Phi(g)(y)$ for all $g \in \Phi^{-1}(A)$, so by bijectivity of $\Phi$, $f(x) = f(y)$ for all $f \in A$, therefore $z = e^{2\pi i x} = e^{2\pi i y} = w$.

S-W then implies that $\Phi^{-1}(A)$ is dense in $C_{\mathbb{R}}(\mathbb{T})$, and as $\Phi$ preserves the norm (that is, it is an isometric isomorphism), $A = \Phi(\Phi^{-1}(A))$ is dense in $C_{\mathbb{R},\mathrm{per}}(\mathbb{R})$.

2.2 The Complex Case


In the course, we would generally like to work over C. The question arises: Is S-W theorem true for CC (X)? The
answer generally, is no, unless we add an extra condition:

Example. Consider $D = \{z \in \mathbb{C} : |z| < 1\}$, the open unit disc in the complex plane, and $\overline{D} = \{z \in \mathbb{C} : |z| \leq 1\}$, the closed unit disc. Consider the set of complex polynomials
$$A = \left\{ \sum_{n=0}^{N} a_n z^n : N \in \mathbb{N},\ a_n \in \mathbb{C} \right\} \subseteq C(\overline{D}) \tag{20}$$
$A$ is an algebra, contains the constants, and separates points, yet we claim that $\overline{A} \neq C(\overline{D})$. From complex analysis, we know that the uniform limit of complex differentiable (holomorphic) functions is also holomorphic, so every function in $\overline{A}$ is holomorphic on $D$. Yet, for example, the function $f(z) = \bar{z}$ in $C(\overline{D})$ is continuous but not complex differentiable, and therefore cannot be a uniform limit of polynomials.
This example tells us that we must add another condition:

Definition 2.1. We say that $A \subseteq C(X)$ is Self-Adjoint if for all $f \in A$, $\bar{f} \in A$ (where $\bar{f}(z) = \overline{f(z)}$).

Theorem 2.2. (Complex Stone-Weierstrass): If $A \subseteq C(X)$ ($X$ compact Hausdorff) is a subalgebra that separates points, contains the constant functions, and is self-adjoint, then $\overline{A} = C(X)$.

Proof. First, as in the real case, we can WLOG suppose that $A$ is closed by replacing $A$ with $\overline{A}$ (clearly if $f \in \overline{A}$ is the uniform limit of functions $f_n \in A$, then $\bar{f}$ is the uniform limit of the functions $\bar{f_n} \in A$, so $\overline{A}$ is self-adjoint, and all the other arguments follow as in the real case). Every complex-valued function is of the form $f = u + iv$ for $u, v \in C_{\mathbb{R}}(X)$. Let us consider
$$A_{\mathbb{R}} = \{f \in A : f(X) \subseteq \mathbb{R}\} = A \cap C_{\mathbb{R}}(X) \tag{21}$$
Clearly $A_{\mathbb{R}}$ is an algebra (over $\mathbb{R}$, that is, closed under scalar multiplication only by real numbers), and contains the (real) constant functions. Note that if $f \in A$, then $\operatorname{Re}(f) = \frac{f + \bar{f}}{2} \in A_{\mathbb{R}} \subseteq A$ (since $A$ is self-adjoint). Similarly, $\operatorname{Im}(f) = \frac{f - \bar{f}}{2i} \in A_{\mathbb{R}}$. If $x \neq y$, then there is $f \in A$ such that $f(x) \neq f(y)$, but in particular, this means that either $\operatorname{Re}(f(x)) \neq \operatorname{Re}(f(y))$ or $\operatorname{Im}(f(x)) \neq \operatorname{Im}(f(y))$, therefore we see that $A_{\mathbb{R}}$ separates points.

From the real case of S-W, $\overline{A_{\mathbb{R}}} = C_{\mathbb{R}}(X)$. In particular, for every $g = u + iv \in C(X)$, $u, v \in \overline{A_{\mathbb{R}}} \subseteq \overline{A} = A$, therefore as $A$ is an algebra, $g = u + iv \in A$, therefore $A = C(X)$.
Examples.
1. Consider the space $C([a, b])$ (note that these are continuous functions $f : [a, b] \subseteq \mathbb{R} \to \mathbb{C}$), and let $A$ be the set of polynomials with complex coefficients. This is clearly a subalgebra, contains the constants, separates points, and is self-adjoint (since if $p = \sum_{i=0}^{n} a_i x^i$, then, as $x$ is real, $\bar{p} = \sum_{i=0}^{n} \bar{a_i} x^i \in A$), therefore $A$ is dense in $C([a, b])$.
2. Consider $C_{\mathrm{per}}(\mathbb{R})$ (periodic functions $\mathbb{R} \to \mathbb{C}$ with period 1), and consider
$$A = \left\{ \sum_{n=-N}^{N} a_n e^{2\pi i n x} : N \in \mathbb{N},\ a_n \in \mathbb{C} \right\} \tag{22}$$
It can be checked that $A = \left\{ \sum_{n=0}^{N} \big(a_n \cos(2\pi n x) + b_n \sin(2\pi n x)\big) : N \in \mathbb{N},\ a_n, b_n \in \mathbb{C} \right\}$ (by applying Euler's identity, and the fact that $\sin(x) = \frac{e^{ix} - e^{-ix}}{2i}$, $\cos(x) = \frac{e^{ix} + e^{-ix}}{2}$). It is now even more self-evident that $A$ is a subalgebra, contains the constant functions, and, as in example 3 of the real case, separates points on $\mathbb{T}$, and is self-adjoint, therefore by applying the same isometry $\Phi$ as before, we can conclude that $\overline{A} = C_{\mathrm{per}}(\mathbb{R})$.

2.3 Hilbert Spaces


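Definition 2.3. Let $G$ be a vector space over $\mathbb{C}$. An Inner Product is a function $\langle \cdot, \cdot \rangle : G \times G \to \mathbb{C}$ which obeys the following:
1. (Linearity in the First Argument): $\langle x_1 + \lambda x_2, y \rangle = \langle x_1, y \rangle + \lambda \langle x_2, y \rangle$
2. (Hermitian): $\langle x, y \rangle = \overline{\langle y, x \rangle}$
3. (Positive-Definiteness): $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ iff $x = 0$.
An Inner Product Space is then the pair $(G, \langle \cdot, \cdot \rangle)$.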

Remark. From the first and second property, we get


⟨x, y1 + λy2 ⟩ = ⟨x, y1 ⟩ + λ̄ ⟨x, y2 ⟩ (23)

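Example. On $G = \mathbb{C}^n$, the standard dot product
$$\langle x, y \rangle = \left\langle \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \right\rangle = \sum_{i=1}^{n} x_i \bar{y_i} \tag{24}$$
is an inner product on $G$.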
Recall the following important inequality:

Proposition 2.4. (Cauchy-Schwarz Inequality): Let $G$ be an inner product space, then
$$|\langle x, y \rangle|^2 \leq \langle x, x \rangle \langle y, y \rangle \tag{25}$$
and equality holds iff $x, y$ are linearly dependent.
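As a quick sanity check of the inequality and of its equality case, here is a small sketch for the standard inner product on $\mathbb{C}^3$. The vectors and the helper names `inner` and `cauchy_schwarz_gap` are arbitrary choices made for illustration.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def cauchy_schwarz_gap(x, y):
    """<x,x><y,y> - |<x,y>|^2, which Cauchy-Schwarz says is nonnegative."""
    return (inner(x, x) * inner(y, y)).real - abs(inner(x, y)) ** 2

x = [1 + 2j, 0.5j, -3 + 0j]
y = [2 - 1j, 4 + 0j, 1 + 1j]
z = [(2 - 3j) * a for a in x]  # z is a complex multiple of x

print(cauchy_schwarz_gap(x, y))  # strictly positive: x, y are linearly independent
print(cauchy_schwarz_gap(x, z))  # zero up to rounding: x, z are linearly dependent
```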

Definition 2.5. A Norm on a vector space V over C is a function ||·|| : V → [0, ∞) which obeys the following:
1. (Positive-Definiteness): ||x|| = 0 iff x = 0.

2. (Absolute Homogeneity): ||λx|| = |λ|||x||.
3. (Triangle Inequality): ||x + y|| ≤ ||x|| + ||y||.

Proposition 2.6. Let $(G, \langle \cdot, \cdot \rangle)$ be an inner product space, then $\|x\| := \sqrt{\langle x, x \rangle}$ defines a norm on $G$.

Remark. From the proof of the triangle inequality in the above, we can deduce when $\|x + y\| = \|x\| + \|y\|$. This happens iff $x, y$ are linearly dependent and $\langle x, y \rangle \geq 0$, so either $y = 0$, or $x = \lambda y$ with $\lambda \geq 0$.

! Not every norm is induced by an inner product! For example, $C([0, 1])$ with the sup-norm $\|f\|_\infty$ as previously defined is not an inner product space. Consider the functions $f, g$ with the following graphs:

Figure 1: The graph of f in red, the graph of g in black

Then we have $\|f\|_\infty = \|g\|_\infty = 1$ and $\|f + g\|_\infty = 2$, so $\|f + g\|_\infty = \|f\|_\infty + \|g\|_\infty$, yet $f, g$ are clearly linearly independent, which could not happen if the norm were induced by an inner product.

Figure 2: The graph of f + g

Definition 2.7. Let $G$ be a normed space. We say that a sequence $(x_n)_{n \in \mathbb{N}}$ in $G$ converges to $x$ if
$$\lim_{n \to \infty} \|x_n - x\| = 0 \tag{26}$$
We say that a sequence is Cauchy if
$$\lim_{n, m \to \infty} \|x_n - x_m\| = 0 \tag{27}$$

Definition 2.8. A Hilbert Space $H$ is a complete inner product space, that is, every Cauchy sequence in $H$ converges in $H$.

Examples.
1. Cn with the standard dot product is a Hilbert Space.
2. Counterexample: Consider $G = C([0, 1])$ with the inner product
$$\langle f, g \rangle = \int_0^1 f(x) \bar{g}(x) \, dx \tag{28}$$
where we define the integral of a complex-valued function $f = u + iv$ as
$$\int_0^1 f(x) \, dx = \int_0^1 u(x) \, dx + i \int_0^1 v(x) \, dx \tag{29}$$

Hence the inner product is well-defined, linear in the first argument, Hermitian, and positive-definite, since
$$\langle f, f \rangle = \int_0^1 f \bar{f} = \int_0^1 |f|^2 \geq 0 \tag{30}$$
and $\langle f, f \rangle = 0$ if and only if (since $f$ is continuous) $|f|^2 = 0$ if and only if $f = 0$, so $(G, \langle \cdot, \cdot \rangle)$ is an inner product space. Nonetheless, this space is not complete. Consider the sequence of functions $f_n$ (see Figure 3).
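This sequence is indeed Cauchy, since for $m > n$ we have
$$\|f_n - f_m\|^2 = \int_0^1 |f_n - f_m|^2 \, dx = \int_{\frac{1}{2}}^{\frac{1}{2} + \frac{1}{n}} |f_m - f_n|^2 \, dx \leq \int_{\frac{1}{2}}^{\frac{1}{2} + \frac{1}{n}} 1 \, dx = \frac{1}{n} \to 0 \tag{31}$$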
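Why does $f_n$ not converge? Suppose that $f_n \to f$ for some $f \in G$. We claim that $f|_{[0, \frac{1}{2})} = 0$ and $f|_{(\frac{1}{2}, 1]} = 1$, but such an $f$ is not continuous, therefore $f_n$ is a Cauchy sequence with no limit in $G$. Indeed, let $s$ be the step function which is 0 on $[0, \frac{1}{2}]$ and 1 on $(\frac{1}{2}, 1]$. Then
$$\int_0^1 |f_n - s|^2 \, dx = \int_{\frac{1}{2}}^{\frac{1}{2} + \frac{1}{n}} |f_n - s|^2 \, dx \leq \int_{\frac{1}{2}}^{\frac{1}{2} + \frac{1}{n}} 1 \, dx = \frac{1}{n} \to 0 \tag{32}$$
and hence, by the triangle inequality, $\int_0^1 |f - s|^2 \, dx = 0$, so the continuous function $f$ must agree with $s$ on $[0, \frac{1}{2})$ and on $(\frac{1}{2}, 1]$.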

Figure 3: The graph of fn
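The estimates (31) and (32) can be checked numerically. Since the figure is not reproduced here, the sketch below assumes a concrete shape for $f_n$ that is consistent with the computations above: 0 on $[0, \frac{1}{2}]$, rising linearly to 1 on $[\frac{1}{2}, \frac{1}{2} + \frac{1}{n}]$, and 1 afterwards. That shape, and the names `ramp`, `step` and `l2_distance_squared`, are assumptions made for illustration.

```python
def ramp(n):
    """Hypothetical f_n: 0 on [0, 1/2], rising linearly to 1 on [1/2, 1/2 + 1/n], then 1."""
    return lambda x: min(1.0, max(0.0, n * (x - 0.5)))

def l2_distance_squared(f, g, samples=20000):
    """Midpoint-rule approximation of the integral of |f - g|^2 over [0, 1]."""
    h = 1.0 / samples
    return h * sum((f((i + 0.5) * h) - g((i + 0.5) * h)) ** 2 for i in range(samples))

# discontinuous candidate limit: 0 on [0, 1/2], 1 on (1/2, 1]
step = lambda x: 1.0 if x > 0.5 else 0.0

for n in (4, 16, 64):
    d_cauchy = l2_distance_squared(ramp(n), ramp(2 * n))  # ||f_n - f_2n||^2
    d_step = l2_distance_squared(ramp(n), step)           # distance to the step function
    print(n, d_cauchy, d_step, 1.0 / n)
```

Both distances stay below the bound $\frac{1}{n}$ and shrink as $n$ grows, while the only candidate limit is the discontinuous step function.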

3 Example of Infinite-Dimensional Complete Space, Completion
3.1 The Space ℓ2
Remark. In every inner product space G, the inner product, ⟨·, ·⟩ : G × G → C, the sum + : G × G → G, the scalar
multiplication C × G → G and the norm ||·|| : G → R are all continuous w.r.t the metric induced by the inner product.
Proof. We shall show the inner product is continuous for example:

We shall use sequential continuity. Let $f_n \to f$, $g_n \to g$; we shall show that $\langle f_n, g_n \rangle \to \langle f, g \rangle$. Let us evaluate
$$\begin{aligned} |\langle f, g \rangle - \langle f_n, g_n \rangle| &\overset{(1)}{\leq} |\langle f, g_n \rangle - \langle f_n, g_n \rangle| + |\langle f, g_n \rangle - \langle f, g \rangle| \overset{(2)}{=} |\langle f - f_n, g_n \rangle| + |\langle f, g_n - g \rangle| \\ &\overset{(3)}{\leq} \|f\| \cdot \|g - g_n\| + \|g_n\| \cdot \|f - f_n\| \overset{(4)}{\leq} \|f\| \cdot \|g - g_n\| + M \|f_n - f\| \to 0 \end{aligned} \tag{33}$$
where (1) is by the triangle inequality, (2) is by the additivity of the inner product, (3) is by the Cauchy-Schwarz Inequality, (4) holds as it can be easily shown that $\|g_n\|$ is convergent and therefore bounded by some $M$ (by the reverse triangle inequality $\big|\, \|g_n\| - \|g\| \,\big| \leq \|g_n - g\|$, which also shows continuity of the norm), and finally by definition of convergence $\|f_n - f\|, \|g_n - g\| \to 0$.
The other arguments are much the same, if not simpler.
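Example. Consider the space
$$\ell^2 = \left\{ (a_n)_{n \in \mathbb{N}} : a_n \in \mathbb{C},\ \sum_{n \in \mathbb{N}} |a_n|^2 < \infty \right\} \tag{34}$$
with pointwise addition and scalar multiplication, and the inner product
$$\langle x, y \rangle = \sum_{n=1}^{\infty} x_n \bar{y_n} \tag{35}$$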
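To get a feeling for the membership condition in (34), the following sketch compares the partial sums of $\sum |a_n|^2$ for two sequences chosen for illustration: $a_n = \frac{1}{n}$, which is square-summable, and $b_n = \frac{1}{\sqrt{n}}$, which is not. The helper name `partial_norm_squared` is hypothetical.

```python
import math

def partial_norm_squared(term, N):
    """Sum of |a_n|^2 for n = 1..N, where term(n) returns a_n."""
    return sum(abs(term(n)) ** 2 for n in range(1, N + 1))

harmonic = lambda n: 1.0 / n          # a_n = 1/n: square-summable, so it lies in l^2
slow = lambda n: 1.0 / math.sqrt(n)   # b_n = 1/sqrt(n): not square-summable

for N in (10, 1000, 100000):
    print(N, partial_norm_squared(harmonic, N), partial_norm_squared(slow, N))

print(math.pi ** 2 / 6)  # known value of the sum of 1/n^2
```

The first column of partial sums stays bounded and approaches $\frac{\pi^2}{6}$, while the second keeps growing (roughly like $\log N$).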

Proposition 3.1. ℓ2 is a (In particular infinite-dimensional) Hilbert Space

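Proof. Firstly, we shall show that $\ell^2$ is indeed a vector space: closure under scalar multiplication is clear, so we shall show closure under addition.

For any $N \in \mathbb{N}$, we have, by the triangle inequality for the norm in $\mathbb{C}^N$:

$$\left( \sum_{n=1}^{N} |x_n + y_n|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{n=1}^{N} |x_n|^2 \right)^{\frac{1}{2}} + \left( \sum_{n=1}^{N} |y_n|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{n=1}^{\infty} |x_n|^2 \right)^{\frac{1}{2}} + \left( \sum_{n=1}^{\infty} |y_n|^2 \right)^{\frac{1}{2}} < \infty \tag{36}$$
Therefore the partial sums $\sum_{n=1}^{N} |x_n + y_n|^2$ are increasing and bounded, so the series converges, hence $\ell^2$ is closed under addition.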

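Secondly, we wish to check that the inner product is well-defined. Once we know that the series defining $\langle x, y \rangle$ always converges, all the other properties will follow from properties of absolutely convergent series. Let $x, y \in \ell^2$. For $N \in \mathbb{N}$, consider
$$\sum_{n=1}^{N} x_n \bar{y_n} \tag{37}$$
From Cauchy-Schwarz in $\mathbb{C}^N$ (applied to the vectors of absolute values), we get
$$\sum_{n=1}^{N} |x_n \bar{y_n}| \leq \left( \sum_{n=1}^{N} |x_n|^2 \right)^{\frac{1}{2}} \left( \sum_{n=1}^{N} |y_n|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{n=1}^{\infty} |x_n|^2 \right)^{\frac{1}{2}} \left( \sum_{n=1}^{\infty} |y_n|^2 \right)^{\frac{1}{2}} < \infty \tag{38}$$
Therefore the partial sums of $\sum |x_n \bar{y_n}|$ are bounded, so the series in (37) converges absolutely as $N \to \infty$, hence $\langle x, y \rangle$ is well-defined.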
Finally, we show that $\ell^2$ is indeed complete: let $(x^{(n)})$ be a Cauchy sequence in $\ell^2$. We shall denote
$$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, \ldots) \tag{39}$$
For $k \in \mathbb{N}$, we have
$$\left| x_k^{(n)} - x_k^{(m)} \right| \leq \left\| x^{(n)} - x^{(m)} \right\| \xrightarrow{n, m \to \infty} 0 \tag{40}$$
Therefore $(x_k^{(n)})_n$ is Cauchy in $n$; let us denote its limit by $x_k$. Write $x = (x_1, x_2, \ldots)$. We shall show that $x$ is the limit of $x^{(n)}$.

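Let $\varepsilon > 0$. There is some $n_0$ such that for every $m, n > n_0$, $\|x^{(n)} - x^{(m)}\| < \varepsilon$, therefore for each $n, m > n_0$, and for every $N \in \mathbb{N}$:
$$\sum_{k=1}^{N} \left| x_k^{(n)} - x_k^{(m)} \right|^2 \leq \left\| x^{(n)} - x^{(m)} \right\|^2 < \varepsilon^2 \tag{41}$$
Taking $m \to \infty$ (keeping $n, N$ fixed), we get
$$\sum_{k=1}^{N} \left| x_k^{(n)} - x_k \right|^2 \leq \varepsilon^2 \tag{42}$$
The sequence of partial sums $\sum_{k=1}^{N} |x_k^{(n)} - x_k|^2$ is increasing and bounded, therefore it converges, so taking $N \to \infty$ we get
$$\sum_{k=1}^{\infty} \left| x_k^{(n)} - x_k \right|^2 \leq \varepsilon^2 \tag{43}$$
In particular, for every $n > n_0$, we have that $x - x^{(n)} \in \ell^2$, but this implies that $x = (x - x^{(n)}) + x^{(n)} \in \ell^2$, since $\ell^2$ is a vector space. Secondly, for all $n > n_0$, we have that

$$\left\| x^{(n)} - x \right\| \leq \varepsilon \tag{44}$$
But by definition, this means that $x^{(n)} \to x \in \ell^2$, therefore $\ell^2$ is complete.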

3.2 Completion
Theorem 3.2. Let $G$ be an inner product space. There exists a Hilbert Space $H$ and a linear embedding $i : G \hookrightarrow H$ which preserves the inner product such that $\overline{i(G)} = H$.

Remark. Usually, we shall think of $G$ as simply a dense subspace of $H$, and then $i$ will simply be the identity, as for our purposes, the embedding $i$ preserves the structure of $G$ entirely, so we shall identify $G$ with $i(G)$.
Proof. As a metric space, $(G, d_G)$ has some completion $(H, d_H, i)$ where $i$ is an isometric embedding such that $\overline{i(G)} = H$. We must define an inner product structure on $H$, and then show that this inner product induces the metric $d_H$, and that $i$ is linear and preserves the inner product w.r.t. this structure.

Sum: For $h, k \in H$, consider $h_n \to h$, $k_n \to k$ where $h_n, k_n \in G$ (here we identify $G$ and $i(G)$). We then define $h + k = \lim_{n \to \infty} (h_n + k_n)$. Firstly, this limit exists, since $h_n, k_n$ converge in $H$, and are therefore Cauchy sequences in $H$ and so also in $G$, hence by the triangle inequality $h_n + k_n$ is a Cauchy sequence in $G$, therefore also in $H$, and as $H$ is complete it has a limit in $H$ which we shall denote $h + k$.

Next, we show this sum does not depend on the choice of sequences $h_n, k_n$: Suppose we have $h_n, \hat{h}_n \to h$ and $k_n, \hat{k}_n \to k$. Define the sequences
$$(h'_n) = (h_1, \hat{h}_1, h_2, \hat{h}_2, \ldots), \quad (k'_n) = (k_1, \hat{k}_1, k_2, \hat{k}_2, \ldots) \tag{45}$$
We still have that $h'_n \to h$, $k'_n \to k$, but as we have just proven, this implies that the limit of $h'_n + k'_n$ exists in $H$, and so in particular the limit of any subsequence is the same, so we get
$$\lim_{n \to \infty} (h'_{2n} + k'_{2n}) = \lim_{n \to \infty} (\hat{h}_n + \hat{k}_n) = \lim_{n \to \infty} (h'_{2n-1} + k'_{2n-1}) = \lim_{n \to \infty} (h_n + k_n) \tag{46}$$
Therefore the limit is well-defined.

Similarly, we define scalar multiplication $\lambda h$ as $\lim_{n \to \infty} \lambda h_n$ for $h_n \to h$. The same proof will show that this limit exists and is well-defined.

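Inner Product: We define the inner product on $H$ in the obvious way

$$\langle h, k \rangle = \lim_{n \to \infty} \langle h_n, k_n \rangle \tag{47}$$
for $h_n \to h$, $k_n \to k$. First, note that $\langle h_n, k_n \rangle$ is Cauchy in $\mathbb{C}$:
$$|\langle h_n, k_n \rangle - \langle h_m, k_m \rangle| \leq |\langle h_n, k_n \rangle - \langle h_n, k_m \rangle| + |\langle h_n, k_m \rangle - \langle h_m, k_m \rangle| \overset{\text{C.S.}}{\leq} \|h_n\| \|k_n - k_m\| + \|k_m\| \|h_n - h_m\| \xrightarrow{m, n \to \infty} 0 \tag{48}$$
(Cauchy sequences are bounded, so $\|h_n\|$ and $\|k_m\|$ stay bounded.) Hence the sequence is Cauchy, so it must converge (as $\mathbb{C}$ is complete). By using the same trick as for the sum, it can be shown that the inner product is well-defined.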

It remains to be checked that H satisfies the axioms of a vector space and an inner product space. The
proof of all of these properties is identical (They hold in G, and we take limits), so we shall do one for example:

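Let $h, k, m \in H$, then
$$\langle h + k, m \rangle = \lim_{n \to \infty} \langle h_n + k_n, m_n \rangle = \lim_{n \to \infty} \big( \langle h_n, m_n \rangle + \langle k_n, m_n \rangle \big) = \langle h, m \rangle + \langle k, m \rangle \tag{49}$$
This uses the definition of the inner product, and the definition of $h + k \in H$ as the limit of $h_n + k_n$.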

Finally, we must show that the inner product induces $d_H$, and this will also show that $H$ is in particular complete. Indeed, let $h, k \in H$. As the metric is continuous, we have
$$d_H(h, k) = \lim_{n \to \infty} d_H(h_n, k_n) = \lim_{n \to \infty} d_G(h_n, k_n) = \lim_{n \to \infty} \sqrt{\langle h_n - k_n, h_n - k_n \rangle} = \sqrt{\langle h - k, h - k \rangle} \tag{50}$$
where the last equality is by definition of the inner product in $H$, and because the square root is continuous on $[0, \infty)$.

4 The Space L2
4.1 Construction of L2
Define the space
$$PC[a, b] = \{f : [a, b] \to \mathbb{C} : f \text{ is piecewise continuous}\} \tag{51}$$
where by piecewise continuous, we mean continuous except for perhaps a finite number of points, and at every point there are left and right-sided limits (so we only allow for jump discontinuities at worst), with the "inner product"
$$\langle f, g \rangle = \int_a^b f(x) \bar{g}(x) \, dx \tag{52}$$
Note that this satisfies all the properties of an inner product, except for positive-definiteness, since we might have $\langle f, f \rangle = 0$ but $f \neq 0$ ($f$ can be nonzero at finitely many points). We define the equivalence relation $f \sim g$ if $f = g$ except at finitely many points, and we 'fix' the definition of $PC[a, b]$ to be $PC[a, b]/\sim$, so that the inner product on the quotient is really an inner product.

In practice, we think of the space of P C[a, b] still as a space of functions (And not of equivalence classes), simply
identifying functions which differ in finitely many points.

Proposition 4.1. if f, g : [a, b] → C are piecewise continuous, continuous at x0 and f ∼ g, then f (x0 ) = g(x0 )

Proof. Take a sequence $x_n \to x_0$ of distinct elements, then $f(x_n) = g(x_n)$ except for finitely many $n$, hence $f(x_0) = \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} g(x_n) = g(x_0)$.

Proposition 4.2. C[a, b] is dense in P C[a, b].

Formally, the proof is very cumbersome. The idea is that for any piecewise-continuous function, we can take a continuous function which agrees with it except on arbitrarily small nbhds (say $(x - \frac{1}{n}, x + \frac{1}{n})$) of the points of discontinuity $x$, in which we approximate the jump by a linear function from $x - \frac{1}{n}$ to $x + \frac{1}{n}$, so that the integral of the difference goes to 0 as $n$ goes to $\infty$ (since the integral will be the sum of areas of triangles with arbitrarily small side length).

Sadly, $PC[a, b]$ is still not complete, but we can of course complete it into a Hilbert space:

Definition 4.3. L2 [a, b] is defined to be the completion of P C[a, b].

Remark. In fact, L2 [a, b] is also the completion of C[a, b], since we have C[a, b] ⊆ P C[a, b] ⊆ L2 , and C[a, b] being
dense in P C[a, b], and P C[a, b] being dense in L2 implies that C[a, b] is dense in L2 , and as L2 is complete, it must
therefore be the completion of C[a, b].

Remark. We think of the elements of L2 as functions.


! Given $f \in L^2[a, b] \setminus PC[a, b]$, there is no meaning to evaluating $f$ at a point.

Yet, the 'integral' $\int_a^b |f|^2 = \langle f, f \rangle$ is defined, since this is the squared norm of $f$. Similarly, the integral $\int_a^b f = \langle f, 1 \rangle$ is defined, and for every sub-interval $[c, d] \subseteq [a, b]$, $\int_c^d f = \langle f, \chi_{[c,d]} \rangle$ where $\chi_{[c,d]}$ is the indicator function of the interval $[c, d]$.

Note that the above is simply notation for the inner product on $L^2$, and is not actually an integral (for our definition of $L^2$).

4.2 On The Measure-Theoretic Construction


Given a function $f : [a, b] \to \mathbb{C}$, there is a generalisation of the Riemann integral (i.e., we can integrate a bigger class of functions) called the Lebesgue integral. Denote
$$\mathcal{L}^2 = \left\{ f : [a, b] \to \mathbb{C} : \text{the Lebesgue integral } \int_a^b |f|^2 \text{ exists and is finite} \right\} \tag{53}$$
and we define the "inner product"
$$\langle f, g \rangle = \int_a^b f \bar{g} \tag{54}$$
as before. There is again the problem of a function having norm 0 while not being the zero function (and such a function may even be nonzero at more than finitely many points)! We say that $f \sim g$ if $\|f - g\| = 0$.

Proposition 4.4. $\mathcal{L}^2/\sim \; = L^2[a, b]$, in the sense that this is a completion of $C[a, b]$.

We shall not prove this proposition. This is a standard claim in measure theory, but proving it requires properties of the Lebesgue integral which we have not covered.

4.3 Properties of L2
Proposition 4.5. If $f, g \in L^2[a, b]$, and $\int_c^d f = \int_c^d g$ for all $[c, d] \subseteq [a, b]$, then $f = g$ in $L^2$.

Proof. We have that $\langle f, \chi_{[c,d]} \rangle = \langle g, \chi_{[c,d]} \rangle$ for every $[c, d] \subseteq [a, b]$, therefore
$$\langle f - g, \chi_{[c,d]} \rangle = 0 \tag{55}$$
In particular, by linearity, for every staircase function (a linear combination of indicators of the form $\chi_{[c,d]}$) $h \in PC[a, b]$, we have
$$\langle f - g, h \rangle = 0 \tag{56}$$
We shall show that staircase functions are dense in $L^2[a, b]$. To show this, we show that every function in $C[a, b]$ can be approximated in norm by staircase functions (which will therefore be dense in $L^2[a, b]$, as $C[a, b]$ is dense in $L^2$):

Every $f \in C[a, b]$ is uniformly continuous, therefore for every $\varepsilon > 0$, there is $n$ such that $|x - y| \leq \frac{b-a}{n}$ implies that $|f(x) - f(y)| < \varepsilon$. We partition the interval $[a, b]$ into $n$ subintervals $[a_{i-1}, a_i]$, $1 \leq i \leq n$, where $a_i = a + \frac{i(b-a)}{n}$ (so $a_0 = a$ and $a_n = b$), and define the function
$$f_n = \sum_{i=1}^{n} f(a_i) \chi_{[a_{i-1}, a_i]} \tag{57}$$
By construction, $|f_n(x) - f(x)| < \varepsilon$ for each $x \in [a, b]$ (except possibly at the finitely many partition points, which does not affect the norm), since $x \in [a_{i-1}, a_i]$ for some $i$, therefore $f_n(x) = f(a_i)$, but $|x - a_i| \leq \frac{b-a}{n}$, therefore $|f_n(x) - f(x)| = |f(a_i) - f(x)| < \varepsilon$, so we have
$$\|f_n - f\| = \sqrt{\int_a^b |f_n - f|^2} \leq \varepsilon \sqrt{b - a} \tag{58}$$
And this is true for all $\varepsilon > 0$, hence every $f \in C[a, b]$ is indeed a limit of staircase functions.
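The staircase construction can be sketched in code. The example below is an illustration only: it takes $f = \sin$ on $[0, \pi]$ as a sample continuous function, builds $f_n$ with the value $f(a_i)$ on the $i$-th subinterval, and estimates $\|f_n - f\|$ with a midpoint rule. The helper names `staircase` and `l2_error` are hypothetical.

```python
import math

def staircase(f, a, b, n):
    """Return f_n, which equals f(a_i) on the i-th subinterval, with a_i = a + i (b - a) / n."""
    width = (b - a) / n
    def f_n(x):
        # index of a subinterval [a_{i-1}, a_i] containing x, clamped to 1..n
        i = min(n, max(1, math.ceil((x - a) / width)))
        return f(a + i * width)
    return f_n

def l2_error(f, g, a, b, samples=20000):
    """Midpoint-rule approximation of the L2 norm of f - g on [a, b]."""
    h = (b - a) / samples
    total = sum(abs(f(a + (j + 0.5) * h) - g(a + (j + 0.5) * h)) ** 2 for j in range(samples))
    return math.sqrt(h * total)

f = math.sin
for n in (4, 16, 64, 256):
    print(n, l2_error(f, staircase(f, 0.0, math.pi, n), 0.0, math.pi))
```

Since $|\sin x - \sin y| \leq |x - y|$, one can take $\varepsilon = \frac{b-a}{n}$ here, so the printed errors should stay below $\frac{\pi}{n}\sqrt{\pi}$, in line with (58).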

Given the above, take a sequence of staircase functions hn → f − g, then by continuity of the inner product,
we have
\[ \|f - g\|^2 = \langle f - g, f - g \rangle = \lim_{n \to \infty} \langle f - g, h_n \rangle = \lim_{n \to \infty} 0 = 0 \tag{59} \]
Therefore we must have that $f - g = 0 \implies f = g$ (In $L^2$).

We now want to show explicitly that, in some sense, there are non piecewise-continuous functions in L2 :

Proposition 4.6. Let $f : [0,1] \to \mathbb{C}$ be continuous on $(0,1]$ and $|f(x)| \to \infty$ as $x \downarrow 0$, but the (Improper) integral
$\int_0^1 |f|^2 < \infty$, then "$f \in L^2[0,1]$", in the sense that there is $g \in L^2[0,1]$ such that for each $[c,d] \subseteq [0,1]$
\[ \int_c^d f = \int_c^d g = \langle g, \chi_{[c,d]} \rangle \tag{60} \]
Where the LHS is the (Perhaps Improper) Riemann Integral.

Example. $f(x) = \frac{1}{x^\alpha}$ for $0 < \alpha < \frac{1}{2}$.

Proof. For each n ∈ N, define


\[ f_n(x) = \begin{cases} f(x) & \frac{1}{n} \le x \le 1 \\ 0 & 0 \le x < \frac{1}{n} \end{cases} \tag{61} \]
Then fn ∈ P C[0, 1]. We claim that fn is a Cauchy Sequence in L2 . For n > m, we have
\[ \|f_n - f_m\|^2 = \int_0^1 |f_n - f_m|^2 = \int_{1/n}^{1/m} |f_n - f_m|^2 = \int_{1/n}^{1/m} |f|^2 \le \int_0^{1/m} |f|^2 = \int_0^1 |f|^2 - \int_{1/m}^1 |f|^2 \xrightarrow{m \to \infty} 0 \tag{62} \]

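Where the limit is 0 as the integral of $|f|^2$ converges, hence $f_n$ is Cauchy, so it has some limit $g \in L^2[0,1]$. For
$[c,d] \subseteq [0,1]$, we have
\[ \int_c^d g = \langle g, \chi_{[c,d]} \rangle = \lim_{n \to \infty} \langle f_n, \chi_{[c,d]} \rangle = \lim_{n \to \infty} \int_c^d f_n = \int_c^d f \tag{63} \]
Where for $c > 0$ the last equality is as $f_n = f$ on $[c,d]$ for all $n$ sufficiently large that $\frac{1}{n} < c$; for $c = 0$ it is the
definition of the improper integral, as $\int_0^d f_n = \int_{1/n}^d f$.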

Remark. The above proposition, with $f(x) = \frac{1}{x^{1/3}}$ for example, shows that $PC[0,1]$ is not complete, since $f_n$ is Cauchy
in the $L^2$ norm, but converges to $\frac{1}{x^{1/3}}$, which is not piecewise continuous.

Definition 4.7. Let $\Omega \subseteq \mathbb{R}^d$ be Jordan Measurable ($\chi_\Omega$ is Riemann Integrable), we define $C(\Omega)$ as usual, with the
inner product $\langle f, g \rangle = \int_\Omega f \bar{g}$, and we call its completion $L^2(\Omega)$.

5 Orthogonal Projections
Definition 5.1. Let V be an IP space over R or C. We say that S ⊆ V is Convex if for every x, y ∈ S, λ ∈ [0, 1], we
have (1 − λ)x + λy ∈ S.

Remark. Visually, this means that if x, y are inside a convex set, then the line between them is contained entirely
within the set.

Theorem 5.2. (Hilbert Projection Theorem): Let H be a hilbert space, S ⊆ H closed and convex, then for every
h ∈ H, there exists g ∈ S such that
∀f ∈ S : ||g − h|| ≤ ||f − h|| (64)
Furthermore, such a g is unique.

We recall the following lemma:

Lemma 5.3. (Parallelogram law): Let G be an inner product space, then


\[ \forall x, y \in G : \|x + y\|^2 + \|x - y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right) \tag{65} \]

The proof is by opening parentheses.


Proof. Let d = inf{||g − h|| : g ∈ S}. Let fn ∈ S be a sequence such that
||fn − h|| → d (66)
We shall show that $f_n$ is a Cauchy sequence. We use the parallelogram law with $x = f_n - h$, $y = f_m - h$:
\[ \|f_n + f_m - 2h\|^2 + \|f_n - f_m\|^2 = 2\left(\|f_n - h\|^2 + \|f_m - h\|^2\right) \tag{67} \]
Since $S$ is convex, we have $\frac{f_n + f_m}{2} \in S$, hence by homogeneity we have
\[ \|f_n + f_m - 2h\| = 2\left\| \frac{f_n + f_m}{2} - h \right\| \ge 2d \tag{68} \]
Therefore, we get that
\[ \|f_n - f_m\|^2 = 2\left(\|f_n - h\|^2 + \|f_m - h\|^2\right) - \|f_n + f_m - 2h\|^2 \le 2\left(\|f_n - h\|^2 + \|f_m - h\|^2\right) - (2d)^2 \xrightarrow{m,n \to \infty} 2(d^2 + d^2) - 4d^2 = 0 \tag{69} \]
Therefore fn is cauchy, and hence (Since H is complete) converges to some g ∈ H, but S is closed, hence g ∈ S, and
by continuity of the norm ||g − h|| = d.

As for uniqueness, if g, g ′ are both elements such that ||g − h|| = ||g ′ − h|| = d, then we take the sequence
(fn )n∈N = (g, g ′ , g, g ′ , g, g ′ , g, g ′ , . . .), then the sequence satisfies the hypothesis of the first part (||fn − h|| → d),
hence it converges by the proof of existence, but this is only possible if g = g ′ , so g is unique.

Definition 5.4. g as in the theorem is called the Projection of h onto S, and denoted g = PS h.

Proposition 5.5. Let H be a hilbert space, S closed and convex, h ∈ H, g ∈ S. TFAE:


1. g = PS h
2. for every f ∈ S, Re ⟨h − g, f − g⟩ ≤ 0

Remark. Intuitively, we understand the second condition as the angle between f − g and h − g being at least a right angle (≥ 90°).
Proof. 1 =⇒ 2: Fix t ∈ (0, 1), then we know (By convexity and minimality of g):
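\[ \|g - h\|^2 \le \|(1 - t)g + tf - h\|^2 = \|(g - h) + t(f - g)\|^2 = \|g - h\|^2 + 2t \operatorname{Re}\langle g - h, f - g \rangle + t^2 \|f - g\|^2 \]
\[ \implies -\operatorname{Re}\langle g - h, f - g \rangle \le \frac{t}{2}\|f - g\|^2 \implies \operatorname{Re}\langle h - g, f - g \rangle \le \frac{t}{2}\|f - g\|^2 \xrightarrow{t \downarrow 0} 0 \tag{70} \]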
And as the LHS does not depend on t, we get the required.

2 =⇒ 1: Let f ∈ S, then we have


\[ \|f - h\|^2 = \|(f - g) - (h - g)\|^2 = \underbrace{\|f - g\|^2}_{\ge 0} \underbrace{- 2\operatorname{Re}\langle f - g, h - g \rangle}_{\ge 0} + \|h - g\|^2 \ge \|g - h\|^2 \tag{71} \]
Hence g is the element minimising the distance, so by uniqueness g = PS h.
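
To make the characterisation in Proposition 5.5 concrete, the following is a minimal numerical sketch, assuming H = R^3 with the standard inner product and S the closed unit ball (closed and convex). The helper names (`project_unit_ball`, `random_point_in_ball`) are illustrative and not part of the notes. It samples points f ∈ S and records the largest value of ⟨h − g, f − g⟩, which the proposition predicts is at most 0.

```python
import math
import random

def inner(x, y):
    """Standard real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def project_unit_ball(h):
    """Closest point to h in the closed unit ball S = {x : ||x|| <= 1}."""
    r = norm(h)
    return list(h) if r <= 1.0 else [a / r for a in h]

def random_point_in_ball(n, rng):
    """A point of S: a Gaussian direction rescaled to a random radius in [0, 1)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = rng.random() / norm(v)
    return [a * scale for a in v]

rng = random.Random(0)
h = [3.0, -4.0, 12.0]            # ||h|| = 13, so h lies outside S
g = project_unit_ball(h)         # candidate for P_S h
h_minus_g = [a - b for a, b in zip(h, g)]

# Largest sampled value of <h - g, f - g>; Proposition 5.5 predicts it is <= 0.
worst = max(
    inner(h_minus_g, [a - b for a, b in zip(f, g)])
    for f in (random_point_in_ball(3, rng) for _ in range(1000))
)
```

Sampling can only support the inequality, not prove it; for the unit ball it also follows directly from Cauchy-Schwarz.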

Remark. If M ≤ H is a closed subspace (Note that a subspace is always convex), in particular, the orthogonal
projection PM h is well-defined.
In fact, for a closed subspace we can say a bit more:

Proposition 5.6. Let H be a hilbert space, M ≤ H a closed subspace, g ∈ M, h ∈ H. TFAE:


1. g = PM h
2. For any m ∈ M , ⟨h − g, m⟩ = 0

Proof. 2 =⇒ 1 since 2 in particular implies condition 2 in Proposition 5.5 (Since if f ∈ M , then f − g ∈ M ).

1 =⇒ 2: By Proposition 5.5, we know that Re ⟨h − g, f − g⟩ ≤ 0 for all f ∈ M . If we take f = m + g ∈ M , we get


Re ⟨h − g, m⟩ ≤ 0 (72)
Similarly, we can take f = −m + g, and we get
Re ⟨h − g, −m⟩ ≤ 0 =⇒ − Re ⟨h − g, m⟩ ≤ 0 (73)
Therefore Re ⟨h − g, m⟩ = 0. Similarly, if we take f = im + g, we get
0 ≥ Re ⟨h − g, im⟩ = Re(−i ⟨h − g, m⟩) = Im ⟨h − g, m⟩ (74)
And taking f = −im + g, we get − Im ⟨h − g, m⟩ ≤ 0, hence ⟨h − g, m⟩ = 0.

Definition 5.7. Given S ⊆ H, The Orthogonal Subspace to S is defined to be


S ⊥ = {h ∈ H : ∀f ∈ S, ⟨f, h⟩ = 0} (75)

Remarks.
1. Note that $S^\perp = \operatorname{span}(S)^\perp = \left(\overline{\operatorname{span}(S)}\right)^\perp$, where the last equality is by continuity of the inner product (In the
first argument).
2. S ⊥ is itself a closed subspace, again by continuity of the inner product (In the second argument this time)

Corollary 5.8. Let H be a hilbert space, M a closed subspace, then


H = M ⊕ M⊥ (76)

Proof. Firstly, let us show that every h ∈ H can be written as a sum of an element of M and M ⊥ . We can write
h = PM h + (h − PM h) (77)
Where $P_M h \in M$ by definition, and by Proposition 5.6, we have that $h - P_M h \in M^\perp$.

To conclude the proof, we must show that $M \cap M^\perp = \{0\}$. Indeed, if $x \in M \cap M^\perp$, then ⟨x, x⟩ = 0, which
implies x = 0.
Remark. In particular, if M is any subspace (Not necessarily closed), then we have
\[ H = \overline{M} \oplus M^\perp \tag{78} \]
Where $\overline{M}$ is a closed subspace, and by our remark $\overline{M}^\perp = M^\perp$.

Conclusion. For a closed subspace M ≤ H, the Orthogonal Projection PM : H → M is a linear operator. This stems
from the above corollary, and the fact that if V = U ⊕ W is Any vector space, then the projection p : V → U ,
p(u + w) = u is linear.

Corollary 5.9. If M ≤ H is a closed subspace, then (M ⊥ )⊥ = M .

Proof. It’s clear that M ≤ (M ⊥ )⊥ . We know that H = M ⊥ ⊕ M , but we also have H = (M ⊥ )⊥ ⊕ M ⊥ (By using
the corollary on M ⊥ ). Let f ∈ (M ⊥ )⊥ . We may write f = m′ + m where m′ ∈ M ⊥ , m ∈ M , but this means we can
write f in two ways as a sum of an element of M ⊥ and (M ⊥ )⊥ , namely 0 + f = m′ + m (Since M ≤ (M ⊥ )⊥ ). By
uniqueness, we must have that m′ = 0, and f = m ∈ M , so (M ⊥ )⊥ ≤ M and hence we have equality.

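Remark. In particular, for any subset $S \subseteq H$, we have that $(S^\perp)^\perp = \overline{\operatorname{span}(S)}$, since by a previous remark
$(S^\perp)^\perp = \left(\left(\overline{\operatorname{span}(S)}\right)^\perp\right)^\perp = \overline{\operatorname{span}(S)}$.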

6 Orthonormal Bases
6.1 Definition of Orthonormal Bases
Definition 6.1. A collection {ei }i∈I ⊆ H is called an Orthonormal System (ONS) if
\[ \langle e_i, e_j \rangle = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases} \tag{79} \]

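Example. Define $e_n(x) = e^{2\pi i n x} \in C[0,1] \subseteq L^2[0,1]$. We claim that this is an orthonormal system, and indeed
\[ \langle e_n, e_m \rangle = \int_0^1 e_n \bar{e}_m \, dx = \int_0^1 e^{2\pi i (n-m) x} \, dx \tag{80} \]
If $n \ne m$, then the integral is
\[ \left[ \frac{e^{2\pi i (n-m) x}}{2\pi i (n-m)} \right]_0^1 = 0 \tag{81} \]
Since $x \mapsto e^{2\pi i (n-m) x}$ is 1-periodic. If $n = m$, we get
\[ \int_0^1 1 \, dx = 1 \tag{82} \]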

Fact: Every Hamel-Basis for an infinite-dimensional hilbert space is uncountable.

Hamel Bases are generally very hard to work with (And impossible to describe). We would like to find an orthonormal
system such that any h ∈ H can be written as an infinite sum

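\[ h = \sum_{n=1}^{\infty} \alpha_n e_n \tag{83} \]
In particular, if the above holds, we get
\[ \langle h, e_n \rangle = \left\langle \sum_{m=1}^{\infty} \alpha_m e_m, e_n \right\rangle = \sum_{m=1}^{\infty} \alpha_m \langle e_m, e_n \rangle = \alpha_n \tag{84} \]
Where we used continuity of the inner product, so we know that if such a linear combination exists, the coefficients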
must be ⟨h, en ⟩. The question is now, for some orthonormal system, does this series converge, and if so, does it converge
to our original element?

Proposition 6.2. Let H be a hilbert space, {en }n∈N an ONS, {αn } ⊆ C some sequence of coefficients, then
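\[ \sum_{n=1}^{\infty} \alpha_n e_n \text{ converges} \iff \sum_{n=1}^{\infty} |\alpha_n|^2 < \infty \tag{85} \]
In this case, we have
\[ \left\| \sum_{n=1}^{\infty} \alpha_n e_n \right\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2 \tag{86} \]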

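Proof. ⇒: Let $S_N = \sum_{n=1}^{N} \alpha_n e_n$ be the N-th partial sum. Convergence of the series is by definition the convergence
of the sequence $S_N$. Let us write $\lim_{N \to \infty} S_N = S$. We have
\[ \|S_N\|^2 = \langle S_N, S_N \rangle = \left\langle \sum_{n=1}^{N} \alpha_n e_n, \sum_{m=1}^{N} \alpha_m e_m \right\rangle = \sum_{n=1}^{N} \sum_{m=1}^{N} \alpha_n \bar{\alpha}_m \langle e_n, e_m \rangle \overset{ONS}{=} \sum_{n=1}^{N} |\alpha_n|^2 \tag{87} \]
By continuity of the norm, we then get
\[ \infty > \|S\|^2 = \lim_{N \to \infty} \|S_N\|^2 = \lim_{N \to \infty} \sum_{n=1}^{N} |\alpha_n|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2 \tag{88} \]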
As required.

⇐: Suppose $\sum_{n=1}^{\infty} |\alpha_n|^2 < \infty$, and we shall show that $S_N$ is a Cauchy sequence. Let $m > n$, then
\[ \|S_m - S_n\|^2 = \left\| \sum_{k=n+1}^{m} \alpha_k e_k \right\|^2 \overset{Pythagoras}{=} \sum_{k=n+1}^{m} |\alpha_k|^2 \xrightarrow{m,n \to \infty} 0 \tag{89} \]
Where the last part is as the series of $|\alpha_n|^2$ converges, hence it satisfies the Cauchy criterion for series, hence $S_n$ is
Cauchy, and therefore converges (since H is complete), and as before by continuity of the norm, we find that
\[ \left\| \sum_{n=1}^{\infty} \alpha_n e_n \right\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2 \tag{90} \]

Theorem 6.3. (Bessel’s Inequality): Let H be hilbert, {en } an ONS, h ∈ H, then


\[ \sum_{n=1}^{\infty} |\langle h, e_n \rangle|^2 \le \|h\|^2 \tag{91} \]
And in particular this series converges.

Proof. Let M = span{e1 , . . . , eN }. Define


\[ S_N = \sum_{n=1}^{N} \langle h, e_n \rangle e_n \tag{92} \]
We claim that $S_N = P_M h$ (Note that M is finite-dimensional and hence in particular closed). Clearly $S_N \in M$, we
shall show that $h - S_N \in M^\perp$ (This is sufficient by Proposition 5.6). For this (Since $M = \operatorname{span}\{e_1, \ldots, e_N\}$), it is
sufficient to show that $h - S_N \perp e_m$ for $1 \le m \le N$. Indeed
\[ \langle h - S_N, e_m \rangle = \langle h, e_m \rangle - \left\langle \sum_{n=1}^{N} \langle h, e_n \rangle e_n, e_m \right\rangle = \langle h, e_m \rangle - \sum_{n=1}^{N} \langle h, e_n \rangle \langle e_n, e_m \rangle \overset{ONS}{=} \langle h, e_m \rangle - \langle h, e_m \rangle = 0 \tag{93} \]
Therefore, by Pythagoras' Theorem, we have
\[ \|h\|^2 = \|S_N + (h - S_N)\|^2 = \|S_N\|^2 + \|h - S_N\|^2 \ge \|S_N\|^2 = \sum_{n=1}^{N} |\langle h, e_n \rangle|^2 \tag{94} \]
This holds for all N ∈ N, hence taking N → ∞ we get the required inequality.


Theorem 6.4. Let H be hilbert, $\{e_n\}$ an ONS, $h \in H$, then $P_M h = \sum_{n=1}^{\infty} \langle h, e_n \rangle e_n$, where $M = \overline{\operatorname{span}}\{e_n\}$.

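Proof. By Bessel's Inequality, $\sum_{n=1}^{\infty} |\langle h, e_n \rangle|^2 < \infty$, hence by Proposition 6.2, $S = \sum_{n=1}^{\infty} \langle h, e_n \rangle e_n$ converges. We shall
show that $S = P_M h$, that is, $S \in M$, and $h - S \in M^\perp$.

Clearly $S \in M$, since S is the limit of $\sum_{n=1}^{N} \langle h, e_n \rangle e_n \in \operatorname{span}\{e_n\}$, and $M = \overline{\operatorname{span}}\{e_n\}$ is closed.

We know that $M^\perp = \left(\overline{\operatorname{span}}\{e_n\}\right)^\perp = \{e_n\}^\perp$, so it is sufficient to check that $h - S \perp e_m$ for all $m \in \mathbb{N}$, and
indeed, we have
\[ \langle h - S, e_m \rangle = \langle h, e_m \rangle - \left\langle \sum_{n=1}^{\infty} \langle h, e_n \rangle e_n, e_m \right\rangle = \langle h, e_m \rangle - \sum_{n=1}^{\infty} \langle h, e_n \rangle \langle e_n, e_m \rangle \overset{ONS}{=} \langle h, e_m \rangle - \langle h, e_m \rangle = 0 \tag{95} \]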
Where we used continuity of the inner product, hence h − S ∈ M ⊥ , and so S = PM h as required.

Definition 6.5. An ONS {en } ⊆ H will be called Complete if {en }⊥ = {0}.

Theorem 6.6. Let {en } ⊆ H be an ONS. The following are equivalent:


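1. $\{e_n\}$ is complete.
2. $\overline{\operatorname{span}}\{e_n\} = H$
3. $\forall h \in H : h = \sum_{n=1}^{\infty} \langle h, e_n \rangle e_n$
4. (Parseval's Identity): $\forall h \in H : \|h\|^2 = \sum_{n=1}^{\infty} |\langle h, e_n \rangle|^2$.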

Definition 6.7. If an ONS {en } satisfies one of the above conditions, we say that {en } is an Orthonormal Basis.

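Proof. Let $M = \overline{\operatorname{span}}\{e_n\}$. In terms of M we may rephrase 1 as $M^\perp = \{0\}$ (since $\{e_n\}^\perp = M^\perp$), 2 as $M = H$, and 3 as
$P_M h = h$ for all $h \in H$ (by Theorem 6.4). It is clear that these are equivalent since $H = M^\perp \oplus M$. 3 =⇒ 4 is by continuity
of the norm and Pythagoras' theorem.

4 =⇒ 1: Let $f \in \{e_n\}^\perp$, then
\[ \|f\|^2 = \sum_{n=1}^{\infty} |\langle f, e_n \rangle|^2 = \sum_{n=1}^{\infty} 0^2 = 0 \tag{96} \]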
Therefore f = 0, so {en }⊥ = {0}, as required.

Definition 6.8. The coefficients $\{\langle h, e_n \rangle\}_{n=1}^{\infty}$ are called the Generalised Fourier Coefficients of h.

Example. H = ℓ2 . Let en be the sequence with 1 in the n-th place, and 0 otherwise. It is clear by any definition that
{en } is an orthonormal basis for ℓ2 . For example, if x = (xn )n∈N ∈ {en }⊥ , then ⟨x, en ⟩ = xn = 0, hence x = 0.

6.2 Existence of Orthonormal Bases


Fact: Every Hilbert Space has a (Not necessarily countable) complete orthonormal system, in the sense that
$\{e_i\}_{i \in I}^\perp = \{0\}$, and $\{e_i\}$ are orthonormal. The proof is by Zorn's Lemma.

We wish to discuss the cases in which there is a countable complete ONS. We recall the following definition:

Definition 6.9. A topological space X is called Separable if there exists a countable dense subset in X.

Theorem 6.10. A Hilbert space H is separable iff H has a countable orthonormal basis (In the sense previously
discussed).

Proof. Suppose that {en } is an orthonormal basis, then


\[ \operatorname{span}_{\mathbb{Q}[i]}\{e_n\} = \left\{ \sum_{n=1}^{N} (\alpha_n + i\beta_n) e_n : N \in \mathbb{N},\ \alpha_n, \beta_n \in \mathbb{Q} \right\} \tag{97} \]
Is countable and dense (Since we can approximate any complex number by numbers in $\mathbb{Q}[i]$, and we can approximate
any element of H by finite linear combinations of the $e_n$, since they form an orthonormal basis).

Suppose H is separable. Let $\{x_n\}_{n=1}^{\infty}$ be a countable dense set. Clearly
\[ \overline{\operatorname{span}}\{x_n\} = H \tag{98} \]
Let {yn } be a subsequence of {xn } such that span{yn } = span{xn }, and {yn } are linearly independent (For each n,
we take xn if xn is not in the span of x1 , . . . , xn−1 ). We now use the Gram-Schmidt Process on yn :

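Define $e_1 = \frac{y_1}{\|y_1\|}$.
Suppose we have defined $e_1, \ldots, e_{n-1}$ orthonormal, define
\[ \tilde{e}_n = P_{\{e_1, \ldots, e_{n-1}\}^\perp} y_n = y_n - \sum_{i=1}^{n-1} \langle y_n, e_i \rangle e_i \tag{99} \]
And define $e_n = \frac{\tilde{e}_n}{\|\tilde{e}_n\|}$ (note that $\tilde{e}_n \ne 0$ by linear independence of the $y_n$).
We can see that $\{e_n\}_{n=1}^{\infty}$ are orthonormal, and for any $N \in \mathbb{N}$, we have $\operatorname{span}\{y_i\}_{i=1}^{N} = \operatorname{span}\{e_i\}_{i=1}^{N}$.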
Note that

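\[ \operatorname{span}\{y_n\}_{n=1}^{\infty} = \bigcup_{N=1}^{\infty} \operatorname{span}\{y_n\}_{n=1}^{N} = \bigcup_{N=1}^{\infty} \operatorname{span}\{e_n\}_{n=1}^{N} = \operatorname{span}\{e_n\}_{n=1}^{\infty} \tag{100} \]
So in particular
\[ \overline{\operatorname{span}}\{e_n\} = \overline{\operatorname{span}}\{y_n\} = \overline{\operatorname{span}}\{x_n\} = H \tag{101} \]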
Hence {en } is an orthonormal basis.
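
The selection of the linearly independent subsequence and the Gram-Schmidt step can be sketched numerically as follows. This is only an illustration under simplifying assumptions: the space is R^3 with the real inner product, the input list is finite, and linear dependence is detected with a floating-point tolerance `tol` (a hypothetical parameter, not part of the notes).

```python
import math

def inner(x, y):
    """Standard real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise `vectors`, skipping any vector in the span of the earlier ones."""
    basis = []
    for y in vectors:
        # Subtract the projection of y onto span(basis), i.e. the sum of <y, e> e.
        residual = list(y)
        for e in basis:
            c = inner(y, e)
            residual = [r - c * a for r, a in zip(residual, e)]
        length = math.sqrt(inner(residual, residual))
        if length > tol:  # y is (numerically) independent of the earlier vectors
            basis.append([r / length for r in residual])
    return basis

# The second vector is a multiple of the first, so it is dropped.
xs = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
es = gram_schmidt(xs)
```

Skipping a vector whose residual vanishes plays the role of passing from $\{x_n\}$ to the subsequence $\{y_n\}$ in the proof.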

Theorem 6.11. Let H1 , H2 be separable, infinite dimensional hilbert spaces, then there exists a linear isomorphism
T : H1 → H2 such that ⟨x, y⟩H1 = ⟨T x, T y⟩H2 for all x, y ∈ H1 .

Proof. Let {en }, {fn } be orthonormal bases of H1 , H2 respectively (Which exist by the previous theorem). Define

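\[ T\left( \sum_{n=1}^{\infty} \alpha_n e_n \right) = \sum_{n=1}^{\infty} \alpha_n f_n \tag{102} \]
This is well-defined, since if $H_1 \ni x = \sum_{n=1}^{\infty} \alpha_n e_n$, then by Proposition 6.2 $\sum_{n=1}^{\infty} |\alpha_n|^2 < \infty$, hence $\sum_{n=1}^{\infty} \alpha_n f_n$ converges.
Moreover, the representation of x as such a series is unique: If
\[ x = \sum_{n=1}^{\infty} \alpha_n e_n = \sum_{n=1}^{\infty} \beta_n e_n \tag{103} \]
Then it is clear that
\[ 0 = \sum_{n=1}^{\infty} (\alpha_n - \beta_n) e_n \tag{104} \]
But by orthonormality, we have
\[ \|0\|^2 = \sum_{n=1}^{\infty} |\alpha_n - \beta_n|^2 \tag{105} \]
So $\alpha_n = \beta_n$ for all n, hence T is well-defined. Injectivity and surjectivity are also clear by the uniqueness of the series
representation. Finally, for $x = \sum_{n=1}^{\infty} \alpha_n e_n$ and $y = \sum_{n=1}^{\infty} \beta_n e_n$, we have
\[ \langle x, y \rangle = \sum_{n=1}^{\infty} \alpha_n \bar{\beta}_n = \langle Tx, Ty \rangle \tag{106} \]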

7 Orthonormal Basis of L2
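Theorem 7.1. Let $K = [0,1]^k$. For every $n = (n_1, \ldots, n_k) \in \mathbb{Z}^k$ define $e_n(x) = e^{2\pi i n \cdot x} = e_{n_1}(x_1) \cdots e_{n_k}(x_k)$ (Where $n \cdot x$
denotes the dot product in $\mathbb{R}^k$), then $\{e_n\}_{n \in \mathbb{Z}^k}$ is an orthonormal basis for $L^2(K)$. In other words, for every $f \in L^2(K)$,
we define $\hat{f}(n) = \langle f, e_n \rangle = \int_K f(x) e^{-2\pi i n \cdot x} \, dx$, then
\[ f(x) = \sum_{n \in \mathbb{Z}^k} \hat{f}(n) e^{2\pi i n \cdot x} \tag{107} \]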
Where this equality is in L2 (That is, the series converges to f in the L2 norm), and not an equality of functions.

Remark. This series representation is called the Fourier Series of f , and fˆ(n) are called the Fourier Coefficients of f .
Proof. First, let us show that these are orthonormal. Indeed,
\[ \langle e_n, e_m \rangle = \int_K e_n \bar{e}_m = \int_K e_{n-m} = \begin{cases} 1 & n = m \\ 0 & n \ne m \end{cases} \tag{108} \]
So the $e_n$ form an orthonormal system. Define
\[ P = \operatorname{span}\{e_n\}, \quad C_{per}(K) = \{ f \in C(K) : f(0, x_2, \ldots, x_k) = f(1, x_2, \ldots, x_k),\ f(x_1, 0, \ldots, x_k) = f(x_1, 1, \ldots, x_k),\ \ldots \} \tag{109} \]
We have the inclusions P ⊆ Cper (K) ⊆ C(K) ⊆ L2 (K). We shall show that every space in this sequence is dense in
the next, so that P will be dense in L2 (K), and hence by Theorem 6.6 we shall be done.

First, we show that P is dense in $C_{per}(K)$. Indeed, we claim that $C_{per}(K) \cong C(\mathbb{T}^k)$ are isometrically isomorphic,
where $\mathbb{T}^k = \mathbb{R}^k / \mathbb{Z}^k$ (This is similar to our proof in 1 dimension). P is an algebra (As the product of $e_n$,
$e_m$ is $e_{n+m}$), contains the constants, is self-adjoint ($\bar{e}_n = e_{-n}$) and separates points on the torus (Since the functions
$e_{(0,\ldots,1,\ldots,0)}$ separate points, by the dimension 1 case), hence by Stone-Weierstrass we have that P is dense in $C_{per}(K)$
in the $\|\cdot\|_\infty$ norm.

We now show that P is in fact dense in the L2 norm, and indeed, we have that
\[ \|f\|_2^2 = \int_K |f|^2 \le \|f\|_\infty^2 \operatorname{Vol}(K) = \|f\|_\infty^2 \tag{110} \]
So it is clear that if P is dense in the ∞-Norm, it is dense in the L2 -norm.

Next, we claim that Cper (K) is dense in C(K) in the L2 norm. Let f ∈ C(K). For ε > 0, define L = [ε, 1 − ε]k , and
choose g ∈ Cper (K) such that:
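1. $g|_L = 1$
2. $g|_{\partial K} = 0$
3. $0 \le g \le 1$
For example, we may take $g(x) = \frac{\min\{d(x, \partial K), \varepsilon\}}{\varepsilon}$. We now approximate f by $fg \in C_{per}(K)$ (Since g is 0 on the
boundary). We get
\[ \|f - fg\|_2^2 = \int_K |f - fg|^2 \overset{(1)}{=} \int_{K \setminus L} |f|^2 (1 - g)^2 \le \|f\|_\infty^2 \cdot 1 \cdot \operatorname{Vol}(K \setminus L) = \|f\|_\infty^2 \left(1 - (1 - 2\varepsilon)^k\right) \tag{111} \]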
Where (1) is as f − f g = f − f = 0 on L, but we can make this arbitrarily small, by choosing g for smaller ε, so
Cper (K) is dense in C(K).

Finally, C(K) is dense in L2 (K) in the L2 -Norm by construction, hence P is dense in L2 (K) (Since each set
is dense in the next).

Corollary 7.2. From the proof we get $\|f\|_{L^2}^2 = \int_K |f|^2 = \sum_{n \in \mathbb{Z}^k} |\hat{f}(n)|^2$.

Example. Let f (x) = x ∈ C[0, 1] ⊆ L2 [0, 1]. For all n ∈ Z, we have


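\[ \hat{f}(n) = \int_0^1 x e^{-2\pi i n x} \, dx \overset{IBP,\ n \ne 0}{=} \left[ \frac{x e^{-2\pi i n x}}{-2\pi i n} \right]_0^1 + \frac{1}{2\pi i n} \underbrace{\int_0^1 e^{-2\pi i n x} \, dx}_{0} = -\frac{1}{2\pi i n} = \frac{i}{2\pi n} \tag{112} \]
For $n = 0$ we get $\hat{f}(0) = \int_0^1 x = \frac{1}{2}$, so we get
\[ x - \frac{1}{2} = \sum_{n \ne 0} \frac{i}{2\pi n} e^{2\pi i n x} \tag{113} \]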

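Parseval's Equality gives
\[ \frac{1}{3} = \int_0^1 x^2 = \|x\|_2^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2 = \frac{1}{4} + \sum_{n \ne 0} \frac{1}{4\pi^2 n^2} = \frac{1}{4} + 2\sum_{n=1}^{\infty} \frac{1}{4\pi^2 n^2} \implies \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} \tag{114} \]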
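
As a sanity check on this computation, the following sketch sums the squared Fourier coefficients of f(x) = x numerically, using the closed forms $\hat{f}(0) = \frac{1}{2}$ and $\hat{f}(n) = \frac{i}{2\pi n}$ from the example. The truncation level `N` is an arbitrary choice made here for illustration.

```python
import math

def fourier_coefficient_abs_sq(n):
    """|f^(n)|^2 for f(x) = x on [0, 1], using the closed form from the example."""
    if n == 0:
        return 0.25                               # f^(0) = 1/2
    return 1.0 / (4.0 * math.pi ** 2 * n ** 2)    # |i / (2 pi n)|^2

N = 100_000  # truncation level (illustrative)

# Truncated Parseval sum; it should approach ||x||^2 = 1/3 from below.
partial = sum(fourier_coefficient_abs_sq(n) for n in range(-N, N + 1))

# Truncated sum of 1/n^2; it should approach pi^2 / 6 from below.
basel_partial = sum(1.0 / n ** 2 for n in range(1, N + 1))
```

The two partial sums fall short of 1/3 and π²/6 by roughly 1/(2π²N) and 1/N respectively, consistent with the tails of the series.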

Proposition 7.3. Suppose f ∈ Cper [0, 1] ∩ C 1 [0, 1], then the fourier series of f converges uniformly (That is, in the
∞-Norm).

Proof. Let us calculate for n ̸= 0:


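\[ \hat{f'}(n) = \int_0^1 f'(x) e^{-2\pi i n x} \, dx = \left[ f(x) e^{-2\pi i n x} \right]_0^1 + 2\pi i n \int_0^1 f(x) e^{-2\pi i n x} \, dx = 2\pi i n \hat{f}(n) \tag{115} \]
Where the last equality is by the periodicity of f. Therefore
\[ \sum_{n \ne 0} |\hat{f}(n)| = \sum_{n \ne 0} \frac{|\hat{f'}(n)|}{2\pi |n|} \overset{Cauchy-Schwarz}{\le} \sqrt{\sum_{n \ne 0} |\hat{f'}(n)|^2} \cdot \sqrt{\sum_{n \ne 0} \frac{1}{(2\pi n)^2}} < \infty \tag{116} \]
(The first factor is finite by Bessel's Inequality applied to $f'$.) Therefore by the Weierstrass M-Test, $\sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi i n x}$
converges uniformly to some continuous g, therefore in particular it converges in $L^2$ to g. By Theorem 7.1 it also converges in $L^2$
to f, and the $L^2$ norm is a norm on the space of continuous functions, therefore $\|f - g\|_2 = 0 \implies f = g$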
as continuous functions (Recall that elements in L2 are completely characterised by their integrals, hence equality
in L2 for continuous functions implies equality of functions, as otherwise we could find an interval on which their
integrals would differ).

8 Pointwise Convergence of Fourier Series
8.1 Pointwise Convergence and Dirichlet’s Theorem
Consider L2 [−π, π] with the inner product
\[ \langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f \bar{g} \tag{117} \]
Under this inner product, the functions en = einx form a complete orthonormal system (We have proven this for [0, 1],
but this is attained by a simple change of variables).

Definition 8.1. Define P C 1 [a, b] to be the set of functions [a, b] → C such that:
1. f ∈ P C[a, b]
2. f is differentiable except for perhaps a finite number of points.
3. f ′ ∈ P C[a, b]

Proposition 8.2. If f ∈ P C 1 [−π, π] ∩ Cper [−π, π], then the fourier series converges absolutely and uniformly.

The proof is identical to that of Proposition 7.3.

For the following discussion, we assume that f ∈ P C[−π, π]. We identify these functions with periodic functions with
period 2π on all of R (By gluing copies of the function on all of R). Define the N -th partial sum of the fourier series
\[ (S_N f)(x) = \sum_{n=-N}^{N} \hat{f}(n) e^{inx} \tag{118} \]
Let us calculate
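\[ (S_N f)(x) = \sum_{n=-N}^{N} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int} \, dt \right) e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \underbrace{\sum_{n=-N}^{N} e^{in(x-t)}}_{:= D_N(x-t)} \, dt \tag{119} \]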

DN (x) is called the Dirichlet Kernel. This is a geometric series, let us calculate
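\[ D_N(x) = \sum_{n=-N}^{N} e^{inx} = e^{-iNx} \frac{e^{ix(2N+1)} - 1}{e^{ix} - 1} = \frac{e^{i(N+1)x} - e^{-iNx}}{e^{ix} - 1} \cdot \frac{e^{-\frac{ix}{2}}}{e^{-\frac{ix}{2}}} = \frac{e^{i(N+\frac{1}{2})x} - e^{-i(N+\frac{1}{2})x}}{e^{\frac{ix}{2}} - e^{-\frac{ix}{2}}} = \frac{\sin\left((N + \frac{1}{2})x\right)}{\sin\frac{x}{2}} \tag{120} \]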
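
The closed form of the Dirichlet Kernel can be checked numerically. The sketch below compares the defining sum $\sum_{n=-N}^{N} e^{inx}$ with $\sin((N + \frac{1}{2})x) / \sin\frac{x}{2}$ at a few sample points; the sample points and the choice N = 8 are arbitrary, and the closed form is only evaluated away from multiples of 2π, where the denominator vanishes.

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) as the finite sum of e^{inx}, n = -N..N (the imaginary parts cancel)."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """Closed form sin((N + 1/2) x) / sin(x / 2), valid when x is not a multiple of 2*pi."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)

samples = [0.1, 0.7, 1.5, 3.0, -2.2]
max_gap = max(abs(dirichlet_sum(8, x) - dirichlet_closed(8, x)) for x in samples)
```

At x = 0 the sum equals 2N + 1, which is also the limit of the closed form as x → 0.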
Definition 8.3. Let f, g : R → C be periodic with period 2π and integrable on every finite interval, the Convolution
of f, g is
\[ (f * g)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - t) g(t) \, dt \tag{121} \]

Remark. Convolution is commutative, associative, and bilinear.


Proof. (Proof of Commutativity):
\[ (g * f)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x - t) f(t) \, dt \overset{s = x - t,\ dt = -ds}{=} \frac{1}{2\pi} \int_{x-\pi}^{x+\pi} f(x - s) g(s) \, ds \overset{(1)}{=} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - s) g(s) \, ds = (f * g)(x) \tag{122} \]
Where (1) is as f, g are periodic with period 2π, so the integral over any interval of length 2π is identical.
With the above definition, we get that in fact
SN f = f ∗ DN (123)

Theorem 8.4. (Dirichlet’s Theorem): Let f ∈ P C 1 [−π, π], then for all x ∈ R (Thinking of f as periodic with period
2π)

\[ \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{inx} = \lim_{N \to \infty} (S_N f)(x) = \frac{f(x^+) + f(x^-)}{2} \tag{124} \]

Remark. There exists f ∈ Cper [−π, π] such that its fourier series diverges on a dense set.
Although we have the tools for it, we shall not prove this theorem right now, but later after acquiring another tool
making the proof more convenient.

8.2 Cesaro Averages
Definition 8.5. For a function f ∈ P C[−π, π], we define the N -th Cesaro Sum to be
\[ \sigma_N f = \frac{1}{N + 1} \sum_{n=0}^{N} S_n f \tag{125} \]
That is, the average of the partial sums up to N .

Remark. From Calculus 1, clearly if SN f (x) converges, so does σN f (x), and to the same limit.

Theorem 8.6. (Cesaro's Theorem): Let $f \in PC[-\pi, \pi]$, then for all $x \in \mathbb{R}$, $\sigma_N f(x) \to \frac{f(x^+) + f(x^-)}{2}$.
Furthermore, if f ∈ Cper [−π, π], the convergence is uniform on R.

Proof. Let us calculate


\[ \sigma_N f = \frac{1}{N + 1} \sum_{n=0}^{N} S_n f = \frac{1}{N + 1} \sum_{n=0}^{N} f * D_n = f * \underbrace{\left( \frac{1}{N + 1} \sum_{n=0}^{N} D_n \right)}_{K_N} \tag{126} \]
KN is called the Fejer Kernel. Let us find an explicit formula for the Fejer Kernel. Note that
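\[ \cos(nx) - \cos((n + 1)x) = 2 \sin\left((n + \tfrac{1}{2})x\right) \sin\frac{x}{2} \tag{127} \]
Therefore, we get
\[ K_N(x) = \frac{1}{N + 1} \sum_{n=0}^{N} \frac{\sin\left((n + \frac{1}{2})x\right)}{\sin\frac{x}{2}} = \frac{1}{(N + 1) \cdot 2\sin^2\frac{x}{2}} \sum_{n=0}^{N} \left( \cos(nx) - \cos((n + 1)x) \right) \overset{Telescoping}{=} \frac{1 - \cos((N + 1)x)}{(N + 1) \cdot 2\sin^2\frac{x}{2}} = \frac{1}{N + 1} \cdot \frac{\sin^2\left(\frac{N+1}{2}x\right)}{\sin^2\frac{x}{2}} \tag{128} \]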
Note that in particular, we get that KN ≥ 0.
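
The closed form for the Fejer Kernel can be checked the same way. The sketch below, with arbitrarily chosen sample points and N = 6, compares the average of the Dirichlet kernels $D_0, \ldots, D_N$ with $\frac{1}{N+1} \cdot \frac{\sin^2(\frac{N+1}{2}x)}{\sin^2\frac{x}{2}}$; both are evaluated away from multiples of 2π.

```python
import math

def dirichlet(n, x):
    """Closed form of the Dirichlet kernel D_n(x), for x not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

def fejer_average(N, x):
    """K_N(x) as the average of D_0, ..., D_N."""
    return sum(dirichlet(n, x) for n in range(N + 1)) / (N + 1)

def fejer_closed(N, x):
    """Closed form sin^2((N + 1) x / 2) / ((N + 1) sin^2(x / 2))."""
    return math.sin((N + 1) * x / 2) ** 2 / ((N + 1) * math.sin(x / 2) ** 2)

samples = [0.2, 1.0, 2.5, -1.3, 3.1]
max_gap = max(abs(fejer_average(6, x) - fejer_closed(6, x)) for x in samples)
smallest = min(fejer_closed(6, x) for x in samples)
```

Unlike the Dirichlet kernel, which changes sign, the closed form makes it visible that the sampled values of $K_N$ are never negative.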

Note that the Fejer Kernel is even and periodic with period 2π as a trigonometric polynomial, and
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} K_N = 1 \tag{129} \]
As an average of Dn , which all clearly have integral 1.

We claim that for δ > 0, the sequence KN → 0 uniformly on the intervals [δ, π], [−π, −δ].

Indeed, we have
\[ \forall x \in [\delta, \pi] : |K_N(x)| \le \frac{1}{N + 1} \cdot \frac{1}{\sin^2\left(\frac{\delta}{2}\right)} \xrightarrow{N \to \infty} 0 \tag{130} \]

And this is a bound not dependent on x, hence we get uniform convergence on [δ, π]. The proof for [−π, −δ] is
similar.

We have
\[ \sigma_N f(x) = (f * K_N)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} K_N(t) f(x - t) \, dt \tag{131} \]
(Note we used commutativity of convolution). Let us bound
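\[ \left| \frac{1}{2\pi} \int_0^{\pi} K_N(t) f(x - t) \, dt - \frac{f(x^-)}{2} \right| \overset{(1)}{=} \left| \frac{1}{2\pi} \int_0^{\pi} K_N(t) f(x - t) \, dt - \frac{1}{2\pi} \int_0^{\pi} K_N(t) f(x^-) \, dt \right| \overset{(2)}{\le} \frac{1}{2\pi} \int_0^{\pi} K_N(t) \left| f(x - t) - f(x^-) \right| dt \]
\[ = \frac{1}{2\pi} \int_0^{\delta} K_N(t) \left| f(x - t) - f(x^-) \right| dt + \frac{1}{2\pi} \int_{\delta}^{\pi} K_N(t) \left| f(x - t) - f(x^-) \right| dt = (*) \tag{132} \]
For some δ > 0 not yet chosen, where (1) is as $\int_0^{\pi} K_N(t) \, dt = \pi$ (Since $K_N$ is even), and (2) is as $K_N \ge 0$. Let ε > 0,
since $f(x - t) \to f(x^-)$ when $t \downarrow 0$, we can choose δ > 0 such that for all $t \in (0, \delta]$ we have
\[ \left| f(x - t) - f(x^-) \right| < \varepsilon \tag{133} \]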
Choose N0 such that for all N > N0 we have |KN | < ε on [δ, π], then for all N > N0 we have
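\[ (*) \le \frac{1}{2\pi} \int_0^{\delta} K_N(t) \varepsilon \, dt + \frac{1}{2\pi} \int_{\delta}^{\pi} \varepsilon \cdot 2\|f\|_\infty \, dt < \frac{\varepsilon}{2} + \varepsilon \|f\|_\infty \tag{134} \]
Where we again used the fact that $\int_0^{\pi} K_N(t) \, dt = \pi$, so we have shown that
\[ \lim_{N \to \infty} \frac{1}{2\pi} \int_0^{\pi} K_N(t) f(x - t) \, dt = \frac{f(x^-)}{2} \tag{135} \]
We prove the limit for the negative part of the integral and $\frac{f(x^+)}{2}$ similarly. This concludes the proof, by using the
triangle inequality and additivity of the integral.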

For the ’Furthermore’, note that f ∈ Cper [−π, π] is uniformly continuous on R, hence we can choose δ > 0
not dependent on x, so the above bound is uniform in x.
Proof. (Proof of Dirichlet's Theorem): We claim that Dirichlet's Theorem holds for g(x) = x. Indeed, the fourier
series of g is (As seen last week)
\[ \sum_{n \ne 0} \frac{i}{n} (-1)^n e^{inx} = \sum_{n=1}^{\infty} \frac{i}{n} (-1)^n \left( e^{inx} - e^{-inx} \right) = \sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n} \sin(nx) \tag{136} \]
This is a convergent series (Dirichlet's Test), hence $S_N g(x)$ converges to some h(x), and therefore also $\sigma_N g(x) \to h(x)$.
But Cesaro's Theorem implies that $\sigma_N g(x) \to \frac{g(x^+) + g(x^-)}{2}$, hence by uniqueness of the limit $h(x) = \frac{g(x^+) + g(x^-)}{2}$, so
that indeed $S_N g(x) \to \frac{g(x^+) + g(x^-)}{2}$ as required.

Fact: If the fourier series of f, h satisfies Dirichlet’s Theorem, so do the fourier series of f + h, cf, f (x − c), f + c
(Easy to prove).

We now proceed by induction on the number of discontinuity points of f on [−π, π] (Including ±π). If n = 0, then
f ∈ P C 1 [−π, π] ∩ Cper , but the theorem holds for these functions by Proposition 8.2, so we have the basis of induction.

Induction Step: Suppose f has n > 0 points of discontinuity. By translation (Which doesn't change the convergence
of the fourier series), WLOG one of them is at ±π. By multiplication by a constant and addition of
constants (Which again doesn't change the convergence of the fourier series), we may assume that f ((−π)+ ) = −π, f (π − ) = π. Now
f − x has one fewer discontinuity (Since we 'fixed' the point of discontinuity at ±π), hence by IH f − x satisfies
Dirichlet's Theorem, and therefore so does f = (f − x) + x.

9 Bounded Linear Operators
Definition 9.1. Let H, K be IP spaces, and T : H → K be linear. we say that T is Bounded if
||T || := sup{||T v||K : ||v||H = 1} < ∞ (137)
||T || is called the Operator Norm of T .

Remark. In the future we shall see that ||T || is indeed a norm on a suitable space of operators.
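Remark. For any $0 \ne x \in H$, we have
\[ \|Tx\| = \left\| \|x\| \, T\left( \frac{x}{\|x\|} \right) \right\| = \|x\| \left\| T\left( \frac{x}{\|x\|} \right) \right\| \le \|T\| \cdot \|x\| \tag{138} \]
Furthermore, $\|T\|$ is the smallest constant for which this inequality holds, by definition.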
Example. Let H = K = Cn . Let T x := Ax where A = (aij ) is some n × n matrix. We claim that such a T is
always bounded. We may see this by noting that T is continuous, and ||T || is exactly the supremum of the contin-
uous map ||T x|| on the unit ball, which is compact, hence ||T x|| attains its maximum, and in particular, T is bounded.

We may also calculate ||T || directly:


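\[ \|Tx\|^2 = \|Ax\|^2 = \sum_{i=1}^{n} \left| \sum_{j=1}^{n} a_{ij} x_j \right|^2 \overset{(1)}{\le} \sum_{i=1}^{n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right) \left( \sum_{j=1}^{n} |x_j|^2 \right) = \|x\|^2 \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2 \tag{139} \]
Where (1) is by Cauchy-Schwarz in $\mathbb{C}^n$, so T is bounded, and we have
\[ \|T\| \le \left( \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} \tag{140} \]

Note that this upper bound for $\|T\|$ is usually strict, for example for the identity map $T = I_n$ (where $\|T\| = 1$ while the bound is $\sqrt{n}$).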
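
The gap between the operator norm and the bound (140) can be seen numerically. The sketch below is an illustration under stated assumptions: the matrices are real, and the operator norm is estimated by power iteration on $A^T A$ starting from the all-ones vector, a standard numerical technique that is not part of the notes (and which assumes the start vector is not orthogonal to the top eigenvector and that A ≠ 0). The function names are illustrative.

```python
import math

def mat_vec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(x):
    return math.sqrt(sum(a * a for a in x))

def operator_norm(A, iterations=500):
    """Estimate sup{||Ax|| : ||x|| = 1} by power iteration on A^T A (real, nonzero A)."""
    At = transpose(A)
    x = [1.0] * len(A[0])
    for _ in range(iterations):
        y = mat_vec(At, mat_vec(A, x))
        length = norm(y)
        x = [a / length for a in y]   # keep x a unit vector
    return norm(mat_vec(A, x))

def frobenius_bound(A):
    """The right-hand side of (140): square root of the sum of |a_ij|^2."""
    return math.sqrt(sum(a * a for row in A for a in row))

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[2.0, 1.0], [0.0, 1.0]]
```

For the 3 × 3 identity the estimate is 1 while the bound is √3; for the matrix `A` the estimate is $\sqrt{3 + \sqrt{5}} \approx 2.288$, below the bound $\sqrt{6} \approx 2.449$.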

Theorem 9.2. TFAE:


1. T is bounded.
2. T is continuous at a point.
3. T is continuous at every point.
4. T is uniformly continuous.
5. T is lipschitz.

Proof. 5 =⇒ 4 =⇒ 3 =⇒ 2 is trivial. 1 =⇒ 5 is clear, since by the above remark, we have


∀x, y ∈ H : ||T x − T y|| = ||T (x − y)|| ≤ ||T || · ||x − y|| (141)
So T is Lipschitz with Lipschitz Constant ||T || < ∞.

For 3 =⇒ 1, we show the contrapositive: Suppose T is not bounded, i.e. there is a sequence xn of unit
vectors such that ||T xn || → ∞. Let h ∈ H be arbitrary. Define
\[ h_n = h + \frac{x_n}{\|Tx_n\|} \tag{142} \]
Then we have
\[ \|h - h_n\| = \frac{\|x_n\|}{\|Tx_n\|} = \frac{1}{\|Tx_n\|} \to 0 \tag{143} \]
So $h_n$ converges to h, but
\[ \|Th_n - Th\| = \frac{\|Tx_n\|}{\|Tx_n\|} = 1 \not\to 0 \tag{144} \]
So T is discontinuous at h, and therefore T is discontinuous at every point.

Finally, 2 =⇒ 3 follows by linearity. Suppose WLOG that T is continuous at 0. Let hn → h ∈ H, then


hn − h → 0, but T is continuous at 0, therefore we get
||T (hn ) − T (h)|| = ||T (hn − h) − T (0)|| → 0 (145)
Therefore T is continuous at every point.

Theorem 9.3. Let H, K be Hilbert Spaces, $D \le H$ be a dense subspace, and $T : D \to K$ a bounded linear map. T
can be uniquely extended to a bounded map $\tilde{T} : H \to K$ with $\|T\| = \|\tilde{T}\|$.

Remark. In fact, Linearity is not needed for this theorem, just completeness of the spaces and continuity of the map.
Proof. Let $x \in H$, and let $D \ni x_n \to x$, then $Tx_n$ is Cauchy, since
\[ \|Tx_n - Tx_m\| \le \|T\| \cdot \|x_n - x_m\| \xrightarrow{m,n \to \infty} 0 \tag{146} \]
Therefore by completeness of K, $Tx_n$ converges to some limit, which we denote $\tilde{T}x$. By a standard trick, $\tilde{T}x$ is well-defined, since if $D \ni y_n \to x$,
then so does the sequence $x_1, y_1, x_2, y_2, \ldots$, therefore $Tx_1, Ty_1, Tx_2, Ty_2, \ldots$ converges by the previous argument, so
we must have that $\lim_{n \to \infty} Tx_n = \lim_{n \to \infty} Ty_n$.

It is clear from the definition that T̃ extends T , T̃ is linear (By standard extension by continuity arguments)
and is bounded, as we have
\[ \|\tilde{T}x\| = \left\| \lim_{n \to \infty} Tx_n \right\| = \lim_{n \to \infty} \|Tx_n\| \le \|T\| \lim_{n \to \infty} \|x_n\| = \|T\| \cdot \|x\| \tag{147} \]
And we have $\|\tilde{T}\| \le \|T\|$. Moreover, the converse inequality is obvious (Since $\tilde{T}$ extends T), therefore we have equality.

As for uniqueness, if S is a bounded map extending T, then for every $x \in H$, take $D \ni x_n \to x$, and we have by continuity
\[ Sx = \lim_{n \to \infty} Sx_n = \lim_{n \to \infty} Tx_n = \tilde{T}x \]
by definition, hence $S \equiv \tilde{T}$.

Example. Let H = K = L2 [0, 1]. Let D = C[0, 1]. Fix g ∈ C[0, 1], and define a map mg : D → K by mg (f ) = gf .
Clearly mg is linear. Moreover, this map is bounded, since we have
\[ \|m_g f\|_2^2 = \|gf\|_2^2 = \int_0^1 |gf|^2 \le \|g\|_\infty^2 \int_0^1 |f|^2 = \|g\|_\infty^2 \|f\|_2^2 \tag{148} \]
Therefore mg is bounded, and moreover ||mg || ≤ ||g||∞ . By the previous theorem, we may extend mg to a map
L2 [0, 1] → L2 [0, 1]. Let ||g||∞ = |g(x0 )| for some x0 ∈ [0, 1]. For each ε > 0, there is δ > 0 such that
|g(x)| ≥ ||g||∞ − ε (149)
For all x in (x0 − δ, x0 + δ) ⊆ [0, 1]. Define f ∈ C[0, 1] such that ||f ||2 = 1, and f is supported on (x0 − δ, x0 + δ), so
we have
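\[ \|m_g f\|_2^2 = \int_0^1 |g|^2 |f|^2 = \int_{x_0 - \delta}^{x_0 + \delta} |g|^2 |f|^2 \ge (\|g\|_\infty - \varepsilon)^2 \underbrace{\int_{x_0 - \delta}^{x_0 + \delta} |f|^2}_{\|f\|_2^2 = 1} = (\|g\|_\infty - \varepsilon)^2 \tag{150} \]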

Therefore ||mg || ≥ ||g||∞ − ε for all ε, and so ||mg || = ||g||∞ . Note that in general there is No f ∈ L2 [0, 1] with
||f ||2 = 1 and ||mg f || = ||g||∞ = ||mg ||, for example for g(x) = x (So that the supremum in the definition of ||mg || is
not a maximum).

10 Bounded Operators Contd.
10.1 Riesz’ Representation Theorem
Definition 10.1. Let H, K be Hilbert Spaces. The Space of Bounded Operators is denoted B(H, K).

Remark. Given T ∈ B(H, K), note that ker T = T −1 ({0}) ≤ H is a Closed subspace (As the inverse image of a
closed set under a continuous map), yet ImT ≤ K is a not necessarily closed subspace.

Definition 10.2. A Linear Functional is simply a bounded linear map ϕ : H → C. We denote the space of bounded
functionals on H by H ∗ := B(H, C)

Example. Let g ∈ H, define ϕg (h) = ⟨h, g⟩ (Note that we must take g to be in the antilinear component so that this
map is indeed linear). By Cauchy-Schwarz:
|ϕg (h)| ≤ ||g|| · ||h|| (151)
Therefore $\phi_g \in H^*$ and $\|\phi_g\| \le \|g\|$. If we take h = g, we get
\[ \|g\|^2 = |\phi_g(g)| \le \|\phi_g\| \cdot \|g\| \implies \|\phi_g\| \ge \|g\| \implies \|\phi_g\| = \|g\| \tag{152} \]

The question arises: Are there other bounded linear functionals on H? The answer, as it turns out, is No.

Theorem 10.3. (Riesz’ Representation Theorem): For every ϕ ∈ H ∗ , there exists a unique g ∈ H such that ϕ = ϕg .

Proof. If ϕ = 0, then we take g = 0. Suppose that ϕ ̸= 0. Since ker ϕ is closed, by Corollary 5.8 we have
H = ker ϕ ⊕ (ker ϕ)⊥ (153)
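Note that $\phi|_{(\ker \phi)^\perp} : (\ker \phi)^\perp \to \mathbb{C}$ is injective, therefore $\dim (\ker \phi)^\perp \le 1$, but $\phi \ne 0$ (So that $\ker \phi \ne H$), therefore
$\dim (\ker \phi)^\perp = 1$. Let $v \in (\ker \phi)^\perp$ with $\|v\| = 1$. For every $h \in H$, we have
\[ h = P_{\ker \phi} h + P_{(\ker \phi)^\perp} h \overset{(1)}{=} P_{\ker \phi} h + \langle h, v \rangle v \tag{154} \]
Where (1) is by Theorem 6.6 as $\{v\}$ is an orthonormal basis for $(\ker \phi)^\perp$. Applying ϕ to both sides, we get
\[ \phi(h) = \underbrace{\phi(P_{\ker \phi} h)}_{0} + \langle h, v \rangle \phi(v) = \left\langle h, \overline{\phi(v)} v \right\rangle \tag{155} \]
So we define $g = \overline{\phi(v)} v \in H$, and we get that $\phi = \phi_g$.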

As for uniqueness, if ϕg1 = ϕg2 , then for all h ∈ H, ⟨h, g1 ⟩ = ⟨h, g2 ⟩ =⇒ ⟨h, g1 − g2 ⟩ = 0, therefore g1 − g2 ⊥ H i.e.
g1 − g2 = 0 =⇒ g1 = g2 .
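
The construction in the proof can be traced in the finite-dimensional case. The sketch below assumes H = C^3 with ⟨x, y⟩ = Σ x_i conj(y_i) and a functional of the form ϕ(h) = Σ c_i h_i; the coefficient vector `c` and the test vector `h` are arbitrary illustrative choices. Here (ker ϕ)⊥ is spanned by the conjugate of c, and the representer built as in the proof, g = conj(ϕ(v)) v, should reproduce ϕ.

```python
import math

def inner(x, y):
    """<x, y> = sum of x_i * conj(y_i): linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

c = [2 + 1j, -1j, 3 + 0j]   # illustrative coefficients of the functional

def phi(h):
    """A bounded linear functional on C^3: phi(h) = sum of c_i * h_i."""
    return sum(a * b for a, b in zip(c, h))

# (ker phi)^perp is spanned by conj(c); normalise it to get the unit vector v of the proof.
c_bar = [a.conjugate() for a in c]
length = math.sqrt(inner(c_bar, c_bar).real)
v = [a / length for a in c_bar]

# Representer from the proof: g = conj(phi(v)) * v.
g = [phi(v).conjugate() * a for a in v]

h = [1 - 2j, 0.5 + 0.5j, -4 + 1j]
gap = abs(phi(h) - inner(h, g))   # phi(h) should equal <h, g>
```

In this example g comes out as the conjugate of c, which matches the remark that g must sit in the antilinear slot.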
Exercise. For a functional ϕ, ϕ is bounded ⇔ ker ϕ closed.

10.2 Adjoint Operators


Theorem 10.4. Let H, K be hilbert spaces, T ∈ B(H, K), then there exists a Unique operator T ∗ ∈ B(K, H) satisfying
∀h ∈ H, k ∈ K : ⟨T h, k⟩K = ⟨h, T ∗ k⟩H (156)

Definition 10.5. The above operator is called the Adjoint of T .

Example. If H = K = Cn , and TA : H → K is given by TA x = Ax, define $A^* = (\overline{A})^t$ (Where by $\overline{A}$ we mean the
matrix obtained by conjugating all entries of A), then we have
\[ \langle T_A x, y\rangle = \langle Ax, y\rangle = y^* A x \overset{(1)}{=} \overline{x^* A^* y} = \overline{\langle A^* y, x\rangle} = \langle x, A^* y\rangle \tag{157} \]
Where (1) is as $y^* A x$ is a number, therefore it is invariant under transpose, and we used the fact that conjugating
twice gives the identity. Therefore (TA )∗ = TA∗ .
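
The identity (157) can be sanity-checked numerically. The following is a minimal sketch, not part of the lecture: the matrix A and the vectors x, y are arbitrary choices, and the helper names `inner`, `matvec` and `conj_transpose` are hypothetical. The inner product is taken linear in the first argument and antilinear in the second, as in these notes.

```python
# Hermitian inner product on C^n: linear in x, antilinear in y.
def inner(x, y):
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

# Matrix-vector product for a matrix stored as a list of rows.
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Conjugate transpose A* = (conj A)^t.
def conj_transpose(A):
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Arbitrary test data (an assumption made for illustration only).
A = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5 + 1j, -3 + 2j]

lhs = inner(matvec(A, x), y)                  # <Ax, y>
rhs = inner(x, matvec(conj_transpose(A), y))  # <x, A* y>
print(abs(lhs - rhs))                         # expected to be at rounding-error level
```

Replacing `conj_transpose` by a plain transpose makes the two sides differ for complex A, which is one way to see why the conjugation is needed.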

Proof. For every k ∈ K, define ϕk : H → C by ϕk (h) = ⟨T h, k⟩. Clearly ϕk is linear, and bounded by Cauchy-Schwarz
as we have:
|ϕk (h)| ≤ ||T h|| · ||k|| ≤ ||T || · ||h|| · ||k|| = (||T || · ||k||)||h|| (158)
Where the constant (||T || · ||k||) does not depend on h, so ||ϕk || ≤ ||T || · ||k||. By Riesz’ Theorem, there exists a unique
T ∗ k ∈ H such that
⟨T h, k⟩ = ϕk (h) = ⟨h, T ∗ k⟩ (159)

So T ∗ is well-defined. It is left to show that T ∗ is indeed a bounded linear map.

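Linearity: We have
\[ \forall k_1, k_2 \in K,\ \alpha \in \mathbb{C}: \quad \langle h, T^*(\alpha k_1 + k_2)\rangle = \langle Th, \alpha k_1 + k_2\rangle = \bar\alpha \langle Th, k_1\rangle + \langle Th, k_2\rangle = \bar\alpha \langle h, T^* k_1\rangle + \langle h, T^* k_2\rangle = \langle h, \alpha T^* k_1 + T^* k_2\rangle \tag{160} \]
Hence ⟨h, T ∗ (αk1 + k2 ) − (αT ∗ k1 + T ∗ k2 )⟩ = 0.
But this is true for all h ∈ H, therefore we must have
T ∗ (αk1 + k2 ) = αT ∗ k1 + T ∗ k2 (161)
So T ∗ is indeed linear.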

Bounded: Let k ∈ K, then by the above example (Since T ∗ k is the Riesz Representation of ϕk ):
||T ∗ k|| = ||ϕk || ≤ ||T || · ||k|| (162)
Therefore ||T ∗ || ≤ ||T ||, so T ∗ is bounded.

Example. Consider the operator mg : L2 [0, 1] → L2 [0, 1] as defined in the previous lecture for some continuous g. Let
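f, h ∈ C[0, 1], then
\[ \langle m_g f, h\rangle = \langle gf, h\rangle = \int_0^1 g f \bar h = \int_0^1 f\, \overline{\bar g h} = \langle f, \bar g h\rangle = \langle f, m_{\bar g} h\rangle \tag{163} \]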
As mg , mḡ and the inner product are continuous in f, h, and C[0, 1] is dense, it follows that this equality (Although
obtained through explicit manipulation of continuous functions, which would not work for general elements of L2 )
holds for all f, h ∈ L2 [0, 1]. In particular, it follows by uniqueness of the adjoint that m∗g = mḡ

Proposition 10.6.
1. (T ∗ )∗ = T
2. ||T ∗ || = ||T ||
3. (αT + βS)∗ = ᾱT ∗ + β̄S ∗
4. (T S)∗ = S ∗ T ∗

Proof. We shall prove 1 and 2, the rest are left as exercises.


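1. For all h ∈ H, k ∈ K:
\[ \langle Th, k\rangle = \langle h, T^* k\rangle = \overline{\langle T^* k, h\rangle} = \overline{\langle k, (T^*)^* h\rangle} = \langle (T^*)^* h, k\rangle \tag{164} \]
This is true for all k ∈ K, therefore T h = (T ∗ )∗ h.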

2. We’ve seen in the proof that ||T ∗ || ≤ ||T ||, conversely, we have ||T || = ||(T ∗ )∗ || ≤ ||T ∗ ||, so we have equality.

Proposition 10.7.
1. (ImT )⊥ = ker T ∗
2. (ImT ∗ )⊥ = ker T
3. $(\ker T)^\perp = \overline{\operatorname{Im} T^*}$
4. $(\ker T^*)^\perp = \overline{\operatorname{Im} T}$

Proof.
1.
k ∈ (ImT )⊥ ⇔ ∀h ∈ H : ⟨T h, k⟩ = 0 ⇔ ∀h ∈ H : ⟨h, T ∗ k⟩ = 0 ⇔ T ∗ k = 0 ⇔ k ∈ ker T ∗ (165)
2. Follows from 1 on T ∗ .
3. Follows from 2 by taking orthogonal complements, since (M ⊥ )⊥ is the closure of M for any subspace M .
4. Follows from 1 by taking orthogonal complements in the same way.

Definition 10.8. Let T ∈ B(H, H) := B(H). We say that T is:


1. Self-Adjoint if T ∗ = T
2. Normal if T ∗ T = T T ∗
3. Positive Semi-Definite if ⟨T h, h⟩ ≥ 0 for all h ∈ H (And in particular real).

Definition 10.9. Let T ∈ B(H, K). We say that T is


1. An Isometry if ||T h|| = ||h|| for all h ∈ H.
2. Unitary if T is bijective, and ⟨T h1 , T h2 ⟩ = ⟨h1 , h2 ⟩ for all h1 , h2 ∈ H.

Proposition 10.10. TFAE:
1. T is an isometry.
2. T preserves the inner product, i.e. ⟨T h, T g⟩ = ⟨h, g⟩ for all h, g ∈ H.
3. T ∗ T = IH .

Proof.
2 =⇒ 1 is obvious.

1 =⇒ 2 Follows from the Polarisation Identity:


\[ \langle x, y\rangle = \frac{1}{4}\left(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right) \tag{166} \]
The proof is simply by opening parentheses. This shows that the inner product of H is completely determined by the
norm, i.e.
\[
\begin{aligned}
\langle Th, Tg\rangle &= \tfrac{1}{4}\left(\|Th + Tg\|^2 - \|Th - Tg\|^2 + i\|Th + iTg\|^2 - i\|Th - iTg\|^2\right) \\
&= \tfrac{1}{4}\left(\|T(h + g)\|^2 - \|T(h - g)\|^2 + i\|T(h + ig)\|^2 - i\|T(h - ig)\|^2\right) \\
&\overset{\text{Isometry}}{=} \tfrac{1}{4}\left(\|h + g\|^2 - \|h - g\|^2 + i\|h + ig\|^2 - i\|h - ig\|^2\right) = \langle h, g\rangle
\end{aligned}
\tag{167}
\]
2 ⇔ 3: For all h, g, we have
⟨h, g⟩ = ⟨T h, T g⟩ = ⟨h, T ∗ T g⟩ ⇔ ∀g ∈ H : T ∗ T g = g ⇔ T ∗ T = IH (168)
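
The Polarisation Identity used in 1 =⇒ 2 can also be checked on concrete vectors. This is only an illustrative sketch on C³ with arbitrarily chosen x and y; the helper names are hypothetical, and the inner product is again linear in the first argument.

```python
def inner(x, y):
    # <x, y>, linear in x and antilinear in y
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    # ||x||^2 = <x, x>, which is real
    return inner(x, x).real

def add(x, y, c=1):
    # returns the vector x + c*y
    return [a + c * b for a, b in zip(x, y)]

def polarisation(x, y):
    # right-hand side of the Polarisation Identity
    return 0.25 * (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
                   + 1j * norm_sq(add(x, y, 1j))
                   - 1j * norm_sq(add(x, y, -1j)))

# Arbitrary test vectors (an assumption for illustration).
x = [1 + 2j, -0.5j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -1 + 4j]

print(abs(polarisation(x, y) - inner(x, y)))  # expected to be at rounding-error level
```

Only norms appear on the right-hand side, which is exactly why a norm-preserving T must preserve inner products.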

Exercise. T is unitary iff T ∗ T = IH , T T ∗ = IK .

Example. 1. Define T : C2 → C3 by T (x, y) = (x, y, 0), then T is a non-unitary isometry, but every isometry
T : Cn → Cn is unitary (Since T is injective and therefore bijective).

2. Let H = ℓ2 , and define the Right Shift operator T : ℓ2 → ℓ2 by


T (a1 , a2 , . . .) = (0, a1 , a2 , . . .) (169)
Then clearly T is an isometry, but it is not unitary as it is not surjective. We claim that
T ∗ (a1 , a2 , . . .) = (a2 , a3 , . . .) (170)
i.e. the left shift operator. Indeed, we have

\[ \langle Ta, b\rangle = \sum_{n=1}^{\infty} (Ta)_n \bar b_n = \sum_{n=2}^{\infty} a_{n-1} \bar b_n = \sum_{n=1}^{\infty} a_n \bar b_{n+1} = \langle a, T^* b\rangle \tag{171} \]
And it is clear that T ∗ T = I, but T T ∗ ̸= I since T T ∗ (a1 , a2 , . . .) = (0, a2 , . . .). Let M = span{e1 }⊥ , then
T T ∗ = PM , so T is an isometry, but not normal or unitary.
3. Let M be a closed subspace of H, and let T = PM : H → H be the projection. We have
⟨PM h, g⟩ = ⟨PM h, PM g + PM ⊥ g⟩ = ⟨PM h, PM g⟩ = ⟨PM h + PM ⊥ h, PM g⟩ = ⟨h, PM g⟩ (172)
Therefore PM is self-adjoint. Furthermore, PM is positive semi-definite, since by the above formula, we have
\[ \langle P_M h, h\rangle = \langle P_M h, P_M h\rangle = \|P_M h\|^2 \tag{173} \]
Clearly PM is not an isometry when M is a proper subspace, since $P_M^* P_M = P_M^2 = P_M \ne I_H$.
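
The behaviour of the shift operators in the second example can be illustrated on finitely supported sequences. The sketch below is an illustration only: it stores such a sequence as a Python list (entries beyond the end are understood to be zero), restricts to real entries, and uses the hypothetical names `right_shift`, `left_shift` and `inner`.

```python
def right_shift(a):
    # T(a1, a2, ...) = (0, a1, a2, ...)
    return [0] + list(a)

def left_shift(a):
    # T*(a1, a2, ...) = (a2, a3, ...)
    return list(a[1:])

def inner(a, b):
    # Real inner product; zip stops at the shorter list,
    # which agrees with padding the shorter sequence by zeros.
    return sum(s * t for s, t in zip(a, b))

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]

print(left_shift(right_shift(a)))   # T*T a  -> [1, 2, 3, 4]
print(right_shift(left_shift(a)))   # TT* a  -> [0, 2, 3, 4]
print(inner(right_shift(a), b), inner(a, left_shift(b)))  # -> 80 80
```

The second line shows the first coordinate being erased, in agreement with T T ∗ = PM for M = span{e1 }⊥ .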

11 Matrix Representation of Operators
Let H be a Separable Hilbert Space with orthonormal basis $\{e_n\}_{n=1}^{\infty}$. Define the map
\[ U : H \to \ell^2, \quad U\Big(\sum_{n=1}^{\infty} a_n e_n\Big) = (a_1, a_2, \ldots) \tag{174} \]
We’ve proven in the proof of Theorem 6.11 that maps of the form of U (Sending an orthonormal basis to an orthonormal
basis) are linear isomorphisms. Furthermore, U is unitary, as it preserves the inner product. Given a map T ∈ B(H), we
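wish to calculate S = U T U ∗ : ℓ2 → ℓ2 . Let $a = (a_n)_{n=1}^{\infty} \in \ell^2$, let us calculate:
\[ (Sa)_n = (UTU^* a)_n = \Big(UT\Big(\sum_{m=1}^{\infty} a_m e_m\Big)\Big)_n \overset{(1)}{=} \Big\langle UT\Big(\sum_{m=1}^{\infty} a_m e_m\Big), U e_n\Big\rangle \overset{(2)}{=} \Big\langle T\Big(\sum_{m=1}^{\infty} a_m e_m\Big), e_n\Big\rangle \overset{(3)}{=} \sum_{m=1}^{\infty} a_m \langle T e_m, e_n\rangle \tag{175} \]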
Where (1) is as U sends en to the sequence with 1 in the n-th place and 0 otherwise, and taking the inner product of a
sequence with this element isolates the n-th component. (2) is as U is unitary, and (3) is by linearity and continuity of
T and the inner product.

If we define an ’Infinite Matrix’ $[T] = (t_{nm})_{n,m=1}^{\infty}$ where $t_{nm} = \langle T e_m, e_n\rangle$, then we’ve in fact shown that
\[ (Sa)_n = \sum_{m=1}^{\infty} t_{nm} a_m \tag{176} \]
So S is given by multiplication with the infinite matrix [T ] (Where matrix-vector multiplication is defined in the obvious
way, extending matrix-vector multiplication in the finite case).

Definition 11.1. [T ]nm = ⟨T em , en ⟩ is called the Transformation Matrix of T in the basis {en }.

Remark. We sometimes identify S with [T ].

! Given an arbitrary infinite matrix A, it is not clear that it defines a bounded linear transformation A : ℓ2 → ℓ2 .

Exercise. Let A = (anm ). If $\sum_{n,m=1}^{\infty} |a_{nm}|^2 < \infty$, then A defines a bounded linear map on ℓ2 (Check using Cauchy-Schwarz). This is a sufficient condition, but not necessary (For example the identity matrix defines a bounded linear
map, yet does not satisfy this condition).

Proposition 11.2. Let H be Separable Hilbert, {en } an orthonormal basis, T, S ∈ B(H).


1. [αT + βS] = α[T ] + β[S]
2. [T ][S] = [T S], where matrix multiplication is defined in the obvious way.
3. $[T^*] = \overline{[T]}^{\,t}$, i.e. the conjugate transpose of [T ].

Proof. All parts follow from the identity [T ] = U T U ∗ , thinking of [T ] as a map on ℓ2 .

Example. Fix f ∈ C[−π, π]. Define the Convolution Operator Cf : C[−π, π] → L2 [−π, π] by
\[ C_f(g)(x) = (f * g)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) g(x - t)\, dt \]
where as always, we think of the integrand as a piecewise-continuous periodic continuation of
f (t)g(x − t) (So the integral is defined). Let us check that Cf is bounded:


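\[
\begin{aligned}
\|C_f g\|_2^2 &= \frac{1}{2\pi}\int_{-\pi}^{\pi} |C_f g(x)|^2\, dx = \frac{1}{(2\pi)^3}\int_{-\pi}^{\pi} \left|\int_{-\pi}^{\pi} f(t) g(x - t)\, dt\right|^2 dx \\
&\overset{\text{C-S}}{\le} \frac{1}{(2\pi)^3}\int_{-\pi}^{\pi} \left(\int_{-\pi}^{\pi} |f(t)|^2\, dt \cdot \int_{-\pi}^{\pi} |g(x - t)|^2\, dt\right) dx \overset{\text{Periodic}}{=} \|f\|_2^2 \cdot \|g\|_2^2
\end{aligned}
\tag{177}
\]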
Therefore ||Cf || ≤ ||f ||2 , so Cf is bounded, and therefore by Theorem 9.3 has a unique extension to L2 [−π, π].

Choose the basis $e_n(x) = e^{inx}$ for all n ∈ Z, then the matrix of Cf is
\[ [C_f]_{n,m} = \langle C_f e_m, e_n\rangle = \langle f * e_m, e_n\rangle = \widehat{f * e_m}(n) \tag{178} \]
In recitation, we’ve seen (This is a simple direct calculation, using Fubini’s Theorem):
\[ \widehat{f * e_m}(n) = \hat f(n)\, \widehat{e_m}(n) = \hat f(n)\, \delta_{m,n} \tag{179} \]
Therefore we have $[C_f] = \operatorname{diag}(\ldots, \hat f(-1), \hat f(0), \hat f(1), \ldots)$.

12 Banach Spaces
Definition 12.1. A Banach Space X is simply a normed space that is complete w.r.t its norm.

Examples.
1. Hilbert Spaces.
2. Cb (X), the space of bounded continuous functions on X with ||·||∞
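3. For 1 ≤ p < ∞, define
\[ \ell^p = \Big\{ \{a_n\}_{n=1}^{\infty} \subseteq \mathbb{C} : \sum_{n=1}^{\infty} |a_n|^p < \infty \Big\} \tag{180} \]
With the norm $\|\{a_n\}_{n=1}^{\infty}\|_p = \big(\sum_{n=1}^{\infty} |a_n|^p\big)^{1/p}$. Furthermore, we define ℓ∞ as the set of bounded sequences, with
the norm $\|\{a_n\}\|_\infty = \sup_{n \in \mathbb{N}} |a_n|$.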

We wish to show that ℓp is a normed space. After we show this, the proof of completeness is nearly identical to the
proof of Proposition 3.1. For this, we need the following theorem:

Theorem 12.2. (Holder’s Inequality): Let p, q ≥ 1 with $\frac{1}{p} + \frac{1}{q} = 1$ (Where we allow for p = 1, q = ∞), then
\[ \sum_n |a_n b_n| \le \|a\|_p \cdot \|b\|_q \tag{181} \]

Proof. First, if ||a||p = 0 or ||b||q = 0, then the theorem is trivial. The case p = 1, q = ∞ is left as an (easy) exercise.
Suppose that ∞ > p, q > 1. By multiplying a, b by constants (Since this won’t affect the inequality), we can WLOG
assume that ||a||p = ||b||q = 1. Let x, y > 0, then we have, by concavity of the logarithm
\[ \log\Big(\frac{1}{p} x^p + \Big(1 - \frac{1}{p}\Big) y^q\Big) \ge \frac{1}{p}\log(x^p) + \Big(1 - \frac{1}{p}\Big)\log(y^q) = \log(x) + \log(y) = \log(xy) \implies \frac{1}{p} x^p + \frac{1}{q} y^q \ge xy \tag{182} \]
Note that if x = 0 or y = 0 the claim is trivially true. Now, we have
\[ \sum_n |a_n b_n| \le \sum_n \Big(\frac{|a_n|^p}{p} + \frac{|b_n|^q}{q}\Big) = \frac{1}{p}\|a\|_p^p + \frac{1}{q}\|b\|_q^q = \frac{1}{p} + \frac{1}{q} = 1 = \|a\|_p \cdot \|b\|_q \tag{183} \]

Exercise. For every f, g ∈ P C[a, b], we have


\[ \int_a^b |fg| \le \Big(\int_a^b |f|^p\Big)^{1/p} \Big(\int_a^b |g|^q\Big)^{1/q} \tag{184} \]

We now wish to prove the triangle inequality for ℓp

Theorem 12.3. (Minkowski’s Inequality) For any a, b ∈ ℓp we have


||a + b||p ≤ ||a||p + ||b||p (185)
And in particular a + b ∈ ℓp .

Proof. Suppose p < ∞ (The case p = ∞ is easy). Suppose we are working with finite sequences by truncating our
sequence at some N (And eventually let N → ∞), then we have
\[
\begin{aligned}
\|a + b\|_p^p &= \sum_n |a_n + b_n|\,|a_n + b_n|^{p-1} \le \sum_n |a_n|\,|a_n + b_n|^{p-1} + \sum_n |b_n|\,|a_n + b_n|^{p-1} \\
&\overset{\text{Holder}}{\le} \Big(\sum_n |a_n|^p\Big)^{1/p} \Big(\sum_n |a_n + b_n|^{(p-1)q}\Big)^{1/q} + \Big(\sum_n |b_n|^p\Big)^{1/p} \Big(\sum_n |a_n + b_n|^{(p-1)q}\Big)^{1/q} = (*)
\end{aligned}
\tag{186}
\]
Where $\frac{1}{p} + \frac{1}{q} = 1$, so pq = p + q, and hence q(p − 1) = p and $\frac{p}{q} = p - 1$, so we get
\[ (*) = \|a\|_p \|a + b\|_p^{p/q} + \|b\|_p \|a + b\|_p^{p/q} = \|a\|_p \|a + b\|_p^{p-1} + \|b\|_p \|a + b\|_p^{p-1} \tag{187} \]
Dividing both sides now by $\|a + b\|_p^{p-1}$ (If it is nonzero, if it is zero the inequality trivially holds), we get
||a + b||p ≤ ||a||p + ||b||p (188)
This is true for every truncation of our sequence, so taking the limit N → ∞, we get the required.
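
Both inequalities are easy to test on finite sequences. The following is a minimal sketch with arbitrarily chosen real sequences and p = 3; the helper name `p_norm` is hypothetical, and a numerical check of this kind illustrates the statements but of course does not prove them.

```python
def p_norm(a, p):
    # ||a||_p for a finite sequence and 1 <= p < infinity
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

# Arbitrary finite sequences (assumptions for illustration).
a = [1.0, -2.0, 0.5, 3.0]
b = [0.3, 1.5, -4.0, 2.0]
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so that 1/p + 1/q = 1

holder_lhs = sum(abs(s * t) for s, t in zip(a, b))
holder_rhs = p_norm(a, p) * p_norm(b, q)

mink_lhs = p_norm([s + t for s, t in zip(a, b)], p)
mink_rhs = p_norm(a, p) + p_norm(b, p)

print(holder_lhs <= holder_rhs, mink_lhs <= mink_rhs)
```

Trying other values of p > 1, or other sequences, is a quick way to get a feeling for how far from equality the two inequalities typically are.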

Theorem 12.4. For every 1 ≤ p ≤ ∞, ℓp (And ℓpn , the subspace of finite sequences) are Banach Spaces.

Proof. ℓp is a vector space: We’ve seen that it is closed under sums by Minkowski’s Inequality, and closure under
scalar multiplication is easy by taking limits of partial sums.

||·||p is a norm: It is easy to see that it is nonnegative and homogeneous (Homogeneity again is seen by tak-
ing limits of partial sums), and the triangle inequality is exactly Minkowski’s Inequality.

Completeness: Note that the proof of completeness in Proposition made no mention of the inner product
structure of ℓ2 , and only used the norm and sequence space structure, hence the proof readily generalises to ℓp for
p < ∞. The proof for p = ∞ is exactly the proof for completeness of the space of bounded functions from some set
into a complete metric space with the ∞-norm (Metric), in this case, from N to C, as seen in previous courses.

Proposition 12.5. If ℓp is a hilbert space, then p = 2.

Proof. We’ve seen that in a Hilbert Space the parallelogram identity holds:
\[ \|a + b\|^2 + \|a - b\|^2 = 2\|a\|^2 + 2\|b\|^2 \tag{189} \]
Take a = (1, 0, 0, . . .), b = (0, 1, 0, . . .). It’s clear that ||a||p = ||b||p = 1 for all p, and $\|a + b\|_p = \|a - b\|_p = 2^{1/p}$ (Where if
p = ∞ we take this to mean 1), then the parallelogram identity holds iff
\[ 2 \cdot 2^{2/p} = 2 \cdot 1 + 2 \cdot 1 \implies 2^{2/p} = 2 \implies \frac{2}{p} = 1 \implies p = 2 \tag{190} \]

In particular, ℓp is a non-hilbert banach space for p ̸= 2.
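
The computation in the proof can be reproduced numerically. The sketch below is illustrative only: it evaluates the defect of the parallelogram identity for the two unit vectors used above, restricted to their first two coordinates, with hypothetical helper names.

```python
def p_norm(a, p):
    # ||a||_p for a finite sequence and 1 <= p < infinity
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

def parallelogram_defect(p):
    # ||a+b||^2 + ||a-b||^2 - 2||a||^2 - 2||b||^2 for a = e1, b = e2
    a, b = [1.0, 0.0], [0.0, 1.0]
    s = [1.0, 1.0]    # a + b
    d = [1.0, -1.0]   # a - b
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(a, p) ** 2 - 2 * p_norm(b, p) ** 2)

for p in (1.0, 1.5, 2.0, 3.0, 10.0):
    print(p, parallelogram_defect(p))
```

The defect equals $2 \cdot 2^{2/p} - 4$, which is positive for p < 2, negative for p > 2, and vanishes (up to rounding) only at p = 2.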


Exercise. If a ∈ ℓ1 , then a ∈ ℓp for all 1 ≤ p ≤ ∞, and
\[ \lim_{p \to \infty} \|a\|_p = \|a\|_\infty \tag{191} \]
If it exists.

Theorem 12.6. Let Y be a normed space, then there exists a banach space X and a linear isometry i : Y → X such
that $\overline{i(Y)} = X$, i.e. i(Y ) is dense in X.

Proof. Identical to Theorem 3.2


Examples.
1. Define Cc (Rn ) to be the set of continuous, compactly supported functions f : Rn → C. That is, f ≡ 0 outside
of a compact set K. Also define the set
\[ C_0(\mathbb{R}^n) = \{ f \in C(\mathbb{R}^n) : \lim_{\|x\| \to \infty} f(x) = 0 \} \tag{192} \]

Note that Cc (Rn ) ⊆ C0 (Rn ) ⊆ Cb (Rn ). We endow all these spaces with the ∞-Norm, i.e. $\|f\|_\infty = \sup_{x \in \mathbb{R}^n} |f(x)|$.
We claim that C0 is a closed subspace of Cb (And hence Banach, since we know Cb is Banach), and Cc is Not
closed, and is in fact dense in C0 (So that C0 is the completion of Cc ).

Showing that C0 is closed is by a usual ε/3-type proof, similar to the proof of the uniform convergence
theorem from infi. Clearly since C0 is closed, we have $\overline{C_c} \subseteq C_0$. It is intuitively clear how we can approximate
functions vanishing at infinity in the ∞-Norm by functions with compact support (Cut off the function at an
ever increasing compact set, set it to 0 outside of this set, and let it decay very quickly towards the boundary),
but the rigorous proof is quite cumbersome (Convince yourselves that this works in Rn similarly to how it
would work in R).
2. Let D ⊆ Rn be a nice enough set (connected and open or compact). Define a norm on Cc (D) by
\[ \|f\|_p = \Big(\int_D |f|^p\Big)^{1/p} \tag{193} \]
For p < ∞, where the integral exists as the functions are compactly supported. This is a normed space (The
proof of Minkowski is similar for this norm), but this space is not complete. We call its completion Lp (D).

13 Bounded Operators On Banach Spaces
Note that the basic theory of bounded operators (In particular Definition 9.1, Theorem 9.2 and Theorem 9.3) holds
for bounded operators between banach spaces, as we only used properties of the norm, and not the inner product in our
proof.

Proposition 13.1. Let X, Y, Z be banach spaces. If A ∈ B(X, Y ), B ∈ B(Y, Z), then BA ∈ B(X, Z) with ||BA|| ≤
||B|| · ||A||.

Proof.
||BAx|| ≤ ||B|| · ||Ax|| ≤ ||B|| · ||A|| · ||x|| (194)

Theorem 13.2. Let X be a normed space, Y a banach space, then B(X, Y ) is a banach space with the operator norm.

Proof. Once we show that the operator norm is a norm, it will be clear that B(X, Y ) is a vector space with pointwise
addition and scalar multiplication, as sums and scalar multiples of bounded operators will clearly be bounded. It is
left as an (easy) exercise to check that the operator norm is indeed a norm, which follows from the properties of the
norm on Y .

Let us show that this space is complete. Let $(T_n)_{n=1}^{\infty}$ be a Cauchy sequence in B(X, Y ). For every x ∈ X,
we have
\[ \|T_n x - T_m x\| \le \|T_n - T_m\| \cdot \|x\| \xrightarrow{n,m \to \infty} 0 \tag{195} \]
Therefore (Tn x) is a Cauchy sequence in Y , but Y is Banach, therefore there exists a limit which we shall denote T x. T
is linear, since for every x, y ∈ X, λ ∈ C, we have
\[ T(x + \lambda y) = \lim_{n \to \infty} T_n(x + \lambda y) = \lim_{n \to \infty} T_n(x) + \lambda \lim_{n \to \infty} T_n(y) = T(x) + \lambda T(y) \tag{196} \]
Where we used linearity of the limit and of Tn for all n.

T is bounded: (Tn ) is Cauchy, therefore the Tn are uniformly bounded, since there exists N such that for all n, m ≥ N we have
||Tn − Tm || < 1, therefore for any n > N , we have
||Tn || ≤ ||Tn − TN || + ||TN || < 1 + ||TN || (197)
Therefore, since only finitely many Ti are not bounded by this bound, and ||Ti || < ∞ for all i ∈ N, we
can find some uniform bound C for all n ∈ N. Therefore we have for all x ∈ X
\[ \|Tx\| = \lim_{n \to \infty} \|T_n x\| \le C\|x\| \tag{198} \]
So T is bounded.

Finally, we show that this convergence is in norm. Let ε > 0, then there exists N such that for all n, m > N we have
||Tn − Tm || < ε, therefore if ||x|| = 1, we have for n > N :
\[ \|Tx - T_n x\| = \lim_{m \to \infty} \|T_m x - T_n x\| \le \varepsilon \|x\| = \varepsilon \tag{199} \]
Therefore ||Tn − T || ≤ ε for all n > N , so by definition ||Tn − T || → 0, therefore B(X, Y ) is complete, as required.
Examples.
1. Let X = Y = C(K) for K some compact topological space. Fix g ∈ C(K), and define T : X → Y by T f = gf .
It’s clear that T is linear. For f ∈ C(K), we have
\[ \|Tf\| = \max_K |fg| \le \max_K |f| \cdot \max_K |g| = \|f\| \, \|g\| \tag{200} \]
So ||T || ≤ ||g||. Conversely, taking f ≡ 1, we get ||g|| = ||T f || ≤ ||T ||||1|| = ||T ||. So we have equality.
2. X = Y = C(K). Fix α : K → K continuous and define T : X → Y , T f = f ◦ α. Clearly T is linear. We have
\[ \|Tf\| = \max_K |f \circ \alpha| \le \max_K |f| = \|f\| \tag{201} \]
Therefore T is bounded, and ||T || ≤ 1. In fact, by taking f ≡ 1, we get in fact that ||T || = 1. Note that if α is
also surjective, then T is an isometry.
3. Note: This example may be generalised further, but for the sake of simplicity, we only deal with functions on
[0, 1].

Let X = Y = Lp ([0, 1]), and let α : [0, 1] → [0, 1] be a piecewise continuous measure preserving transformation. That is, for f ∈ P C[0, 1], we have $\int_0^1 f \circ \alpha = \int_0^1 f$. Define a map T : P C[0, 1] → P C[0, 1] by

T f = f ◦ α. T is linear, and
\[ \|Tf\|_p = \Big(\int_0^1 |f \circ \alpha|^p\Big)^{1/p} = \Big(\int_0^1 |f|^p \circ \alpha\Big)^{1/p} = \Big(\int_0^1 |f|^p\Big)^{1/p} = \|f\|_p \tag{202} \]
So ||T || = 1, and in fact T is an isometry. As in the case of hilbert spaces, we may uniquely extend T to an
isometry on Lp [0, 1].
As a corollary of Theorem 13.2, for any normed space X, X ∗ = B(X, C) is a banach space (Since C is complete) with the operator
norm. We’ve seen that if X is hilbert, then by Riesz’ Theorem every φ ∈ X ∗ is of the form φ(x) = ⟨x, y⟩. For Banach
Spaces, the situation is much less obvious.
Example. Let X = ℓp . Given a sequence b = (bn ), define
\[ \varphi((a_n)_{n \in \mathbb{N}}) = \sum_{n=1}^{\infty} a_n b_n \tag{203} \]
For what sequences b = (bn ) is this functional well-defined? If (bn ) ∈ ℓq , then by Holder’s Inequality, we get
\[ |\varphi(a)| \le \sum_{n=1}^{\infty} |a_n b_n| = \|ab\|_1 \overset{\text{Holder}}{\le} \|a\|_p \cdot \|b\|_q \tag{204} \]
We see that not only is φ = φb well-defined, but it is bounded. In fact, we have the following two facts (Seen in
recitation):
1. ||φb || = ||b||q
2. If 1 ≤ p < ∞, every φ ∈ (ℓp )∗ is of the form φ = φb for some b ∈ ℓq , in other words, $\ell^q \cong (\ell^p)^*$ via the map
b ↦ φb .
What’s the problem with ℓ∞ ? Consider the unit vectors en = (δmn )m∈N . Let a = (an ) ∈ ℓp . If p < ∞, then
$a = \sum_{n=1}^{\infty} a_n e_n$, but for p = ∞, this does not hold. For example, we have
\[ \Big\| (1, 1, 1, \ldots) - \sum_{n=1}^{N} e_n \Big\|_\infty = 1 \not\to 0 \tag{205} \]
This shows us that while we can define functionals on ℓp by choosing images of the en ’s, for ℓ∞ , this does not hold, so
we cannot specify functionals this way. In this sense, ℓ∞ is very problematic.

14 Weak Convergence, Ergodic Theory
14.1 Weak Convergence
Definition 14.1. Let X be a banach space, and let (xn ) be a sequence in X. We say that xn Converges Weakly to x if for
every φ ∈ X ∗ , we have φ(xn ) → φ(x), and we write $x_n \xrightarrow{w} x$.

Remark. In a Hilbert space, $x_n \xrightarrow{w} x$ iff ⟨xn , v⟩ → ⟨x, v⟩ for all v ∈ H by Riesz’ Theorem.

Remark. There is a non-metrisable topology on X such that xn → x topologically iff $x_n \xrightarrow{w} x$.
Example. Let $(e_n)_{n=1}^{\infty}$ be an orthonormal system in a hilbert space H. For every z ∈ H, by Bessel’s Inequality, we
have
\[ \sum_{n=1}^{\infty} |\langle z, e_n\rangle|^2 \le \|z\|^2 < \infty \tag{206} \]
In particular, for every z ∈ H we have $\langle e_n, z\rangle = \overline{\langle z, e_n\rangle} \to 0 = \langle 0, z\rangle$, therefore $e_n \xrightarrow{w} 0$. Note that en does not
converge in norm, as it is not cauchy, since by Pythagoras’ Theorem $\|e_n - e_m\|^2 = \|e_n\|^2 + \|e_m\|^2 = 2$ for n ≠ m.
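
To make the example concrete, one can look at the standard unit vectors of ℓ2 truncated to finitely many coordinates. The sketch below is an illustration under assumptions: the truncation length M = 1000 and the test vector z = (1, 1/2, 1/3, . . .) are arbitrary choices, and the helper names are hypothetical.

```python
import math

M = 1000  # truncation length (an assumption; l^2 itself is infinite-dimensional)
z = [1.0 / k for k in range(1, M + 1)]   # truncation of (1, 1/2, 1/3, ...)

def e(n):
    # n-th standard unit vector, 1-indexed
    v = [0.0] * M
    v[n - 1] = 1.0
    return v

def inner(x, y):
    # real inner product of two vectors of equal length
    return sum(s * t for s, t in zip(x, y))

def dist(x, y):
    # Euclidean distance ||x - y||
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(x, y)))

pairings = [inner(e(n), z) for n in (1, 10, 100, 1000)]
gap = dist(e(3), e(7))
print(pairings)  # the pairings <e_n, z> = 1/n shrink towards 0
print(gap)       # stays at sqrt(2), so (e_n) is not Cauchy in norm
```

The pairings against a fixed z decay while the mutual distances do not, which is the difference between weak and norm convergence in this example.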

Theorem 14.2. Every weakly convergent sequence in a Hilbert Space is bounded. Moreover, if $(\langle x_n, z\rangle)_{n=1}^{\infty}$ is a
bounded sequence for all z ∈ H, then $(x_n)_{n=1}^{\infty}$ is bounded.

Recall The Baire Category Theorem: Any Complete Metric Space is not a countable union of closed nowhere dense
sets (i.e. closed sets A with empty interior, $A^\circ = \emptyset$).

Proof. Suppose that $(\langle x_n, z\rangle)_{n=1}^{\infty}$ is bounded for all z ∈ H. For all k ∈ N, define
Fk = {z ∈ H : ∀n ∈ N, |⟨xn , z⟩| ≤ k||z||} (207)
Each Fk is closed (It is easy to see that it is a countable intersection of closed sets, since the norm and
inner product are continuous). Moreover, by hypothesis, $\bigcup_{k=1}^{\infty} F_k = H$. By the BCT, there exists k such that Fk is not
nowhere dense, i.e. Fk◦ ≠ ∅, so there is some open ball B(y, r) ⊆ Fk . Explicitly, this means that if ||x − y|| < r, then
|⟨xn , x⟩| ≤ k||x|| for all n.

If ||z|| < r, then


|⟨xn , z⟩| = |⟨xn , z + y⟩ − ⟨xn , y⟩| ≤ |⟨xn , z + y⟩| + |⟨xn , y⟩| (208)
But y, z + y ∈ B(y, r), therefore we have
|⟨xn , z + y⟩| + |⟨xn , y⟩| ≤ k||z + y|| + k||y|| ≤ k(||z|| + 2||y||) < k(r + 2||y||) (209)
Let $z = \frac{r}{2} \frac{x_n}{\|x_n\|}$ (for xn ≠ 0) so that ||z|| = r/2 < r, then we have
\[ \frac{r}{2}\|x_n\| = |\langle x_n, z\rangle| < k(2\|y\| + r) \implies \|x_n\| \le \frac{2k}{r}(2\|y\| + r) \tag{210} \]
Therefore we have bounded xn by a bound not dependent on n.
Note that in particular, if $x_n \xrightarrow{w} x$, then (⟨xn , z⟩) is bounded for all z ∈ H (as a convergent sequence of numbers), therefore in particular (xn ) is
bounded in this case.

Theorem 14.3. Every bounded sequence in a hilbert space has a weakly convergent subsequence.

Proof. Sketch of Proof (Given as a guided exercise):

First, we may assume that H is separable by restricting ourselves to $G = \overline{\operatorname{span}}\{x_n\}$ where (xn ) is our bounded sequence
(Note that we must check that weak convergence in G implies weak convergence in H). G is indeed separable since
spanQ {xn } is a countable dense set.

Next, let {zi } be a countable dense set in H. For each fixed i, there is a subsequence (xnk ) of (xn ) such that $(\langle x_{n_k}, z_i\rangle)_{k=1}^{\infty}$ converges (As it is bounded by Cauchy-Schwarz). We use a diagonalisation argument to find a subsequence (xnm ) such
that (⟨xnm , zi ⟩) converges for all i.

We now prove that for every z ∈ H, (⟨xnm , z⟩) is cauchy and therefore converges using a standard ε/3 argument
(By approximating z with zi ).

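Finally, define $\varphi(z) = \lim_{m \to \infty} \langle z, x_{n_m}\rangle$ (The limit exists by the previous step, as conjugation is continuous). Show that φ is a continuous linear functional, therefore φ = ⟨·, x⟩ for
some x ∈ H, therefore $x_{n_m} \xrightarrow{w} x$.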

Theorem 14.4. (Hilbert-Saks): Let (xn ) be a sequence in a hilbert space such that $x_n \xrightarrow{w} x$, then there is a subsequence
(xnk ) such that
\[ \frac{1}{N}\sum_{k=1}^{N} x_{n_k} \to x \tag{211} \]
In norm.
Remark. In a Hilbert Space, the weak limit of a sequence is unique, i.e. if $x_n \xrightarrow{w} x$ and $x_n \xrightarrow{w} y$ then x = y, since this means
that ⟨x, z⟩ = ⟨y, z⟩ for all z ∈ H, therefore x − y ∈ H ⊥ = {0}.
Proof. WLOG by translation (Taking xn − x), $x_n \xrightarrow{w} 0$. Let n1 = 1. We know by the definition of weak convergence
that ⟨xn , xn1 ⟩ → 0, therefore there is n2 > n1 such that |⟨xn2 , xn1 ⟩| < 1/4.

Now, we know that ⟨xn , xn1 ⟩ , ⟨xn , xn2 ⟩ → 0, therefore there exists n3 > n2 such that
\[ |\langle x_{n_3}, x_{n_1}\rangle|,\ |\langle x_{n_3}, x_{n_2}\rangle| < \frac{1}{8} = \frac{1}{2^3} \tag{212} \]
We continue by induction, at the k-th step choosing nk such that $|\langle x_{n_k}, x_{n_i}\rangle| < \frac{1}{2^k}$ for all 1 ≤ i < k.

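Now we have
\[
\begin{aligned}
\Big\| \frac{1}{N}\sum_{k=1}^{N} x_{n_k} \Big\|^2 &= \frac{1}{N^2}\sum_{k,\ell=1}^{N} \langle x_{n_k}, x_{n_\ell}\rangle \le \frac{1}{N^2}\Big(\sum_{k=1}^{N} \|x_{n_k}\|^2 + 2\sum_{\ell < k} |\langle x_{n_k}, x_{n_\ell}\rangle|\Big) \\
&\overset{(1)}{\le} \frac{C^2}{N} + \frac{2}{N^2}\sum_{k=1}^{N}\sum_{\ell=1}^{k-1} \frac{1}{2^k} = \frac{C^2}{N} + \frac{2}{N^2}\sum_{k=1}^{N} \frac{k-1}{2^k} \xrightarrow{N \to \infty} 0
\end{aligned}
\tag{213}
\]
Where in (1) we used Theorem 14.2 to show that the sequence (xnk ) is bounded, say ||xnk || ≤ C, and our bound on |⟨xnk , xnℓ ⟩| for
ℓ < k, and finally we used the fact that the series $\sum_{k=1}^{\infty} \frac{k-1}{2^k}$ converges, therefore dividing by N² gives us a sequence
converging to 0.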

14.2 Ergodic Theory


Remark. For simplicity, we shall be working in the interval [0, 1] for this chapter, but the discussion and proofs extend
to more general sets.
Recall that we call a map T : [0, 1] → [0, 1] Measure-Preserving if for all f ∈ P C[0, 1], we have
\[ \int_0^1 f \circ T = \int_0^1 f \tag{214} \]
T defines an isometry UT : P C[0, 1] → P C[0, 1] by composition UT (f ) = f ◦ T , which we may uniquely extend to L2 [0, 1].

Let A ⊆ [0, 1] be a finite disjoint union of intervals, then χA ∈ P C[0, 1]. If T is measure-preserving, then
\[ \operatorname{len}(A) = \int_0^1 \chi_A = \int_0^1 \chi_A \circ T = \int_0^1 \chi_{T^{-1}A} = \operatorname{len}(T^{-1}A) \tag{215} \]
Where len(A) is the sum of the lengths of the intervals making up A. This in some sense justifies the name ’measure
preserving’, as it shows that T −1 preserves the ’measure’ of sets (At least, for finite disjoint unions of intervals, although
in general it may be shown that such T preserves the lebesgue measure). In fact, by linearity and the fact that step
functions are dense in L2 [0, 1] (As we’ve seen that they’re dense in C[0, 1]), it suffices to require this property on indicators
of intervals.
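Examples. Fix α ∈ (0, 1), and define
\[ T(x) = \begin{cases} x + \alpha & x + \alpha < 1 \\ x + \alpha - 1 & 1 \le x + \alpha < 2 \end{cases} \;=\; x + \alpha \bmod 1 \tag{216} \]
T is measure preserving, as we have for any f ∈ P C[0, 1]:
\[ \int_0^1 f \circ T = \int_0^{1-\alpha} f(x + \alpha)\, dx + \int_{1-\alpha}^{1} f(x + \alpha - 1)\, dx = \int_{\alpha}^{1} f(y)\, dy + \int_0^{\alpha} f(z)\, dz = \int_0^1 f \tag{217} \]
Where we used integration by substitution.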

Let T x = 2x mod 1, i.e.
\[ Tx = \begin{cases} 2x & x < \frac{1}{2} \\ 2x - 1 & \frac{1}{2} \le x \le 1 \end{cases} \tag{218} \]
T is again measure preserving, as we have
\[ \int_0^1 f \circ T = \int_0^{1/2} f(2x)\, dx + \int_{1/2}^{1} f(2x - 1)\, dx = \int_0^1 f(y)\, \frac{dy}{2} + \int_0^1 f(z)\, \frac{dz}{2} = \int_0^1 f \tag{219} \]
Where we again use integration by substitution.

Note that in this example, unlike the previous example, we do not have that len(A) = len(T A) (In fact, we
have len(T A) = 2len(A)), however, we still have len(T −1 A) = len(A), as there are two intervals, each half of A’s
length which get mapped to A under T .

In Ergodic Theory, we wish to prove that for some ’random’ x0 , the sequence $T x_0, T^2 x_0, T^3 x_0, \cdots$ acts as if random.
For example, for every [a, b] ⊆ [0, 1],
\[ \frac{|\{0 \le n \le N : T^n x_0 \in [a, b]\}|}{N + 1} \xrightarrow{N \to \infty} b - a \tag{220} \]
That is, as we iterate T on x0 , it falls inside [a, b] proportional to its length as N → ∞.

An equivalent way to phrase the above is that for any f ∈ P C[0, 1], we have
\[ \frac{1}{N + 1}\sum_{n=0}^{N} f(T^n x_0) \xrightarrow{N \to \infty} \int_0^1 f \tag{221} \]

Note that (220) is the special case of (221) where we take f = χ[a,b] , but in fact, if (220) holds for any subinterval, then
as step functions are dense in L2 [0, 1], (221) also holds, so they are indeed equivalent. (221) in fact says that the ’sample
average’ of f (Taken on the iterates of T ) converges to the ’real’ average of f (Its integral). This means that the sequence
T n x0 indeed acts as if random. That is, on average, T ’mixes’ the interval [0, 1] well.
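
The statement (220) can be observed experimentally for the rotation T x = x + α mod 1. The following is a rough sketch under assumptions: the choices α = √2 − 1, x0 = 0.1, [a, b] = [0.2, 0.5] and N = 100000 are arbitrary, the function name `visit_frequency` is hypothetical, and floating-point arithmetic only approximates the true orbit. (The doubling map T x = 2x mod 1 is deliberately not used here, since repeated doubling in binary floating point quickly destroys the orbit.)

```python
import math

def visit_frequency(alpha, x0, a, b, N):
    # Fraction of the iterates T^0 x0, ..., T^N x0 that land in [a, b],
    # where T x = x + alpha mod 1.
    x, hits = x0, 0
    for _ in range(N + 1):
        if a <= x <= b:
            hits += 1
        x = (x + alpha) % 1.0
    return hits / (N + 1)

# alpha = sqrt(2) - 1 is irrational; all parameters are illustrative assumptions.
freq = visit_frequency(math.sqrt(2) - 1, 0.1, 0.2, 0.5, 100000)
print(freq)  # should be close to b - a = 0.3
```

Repeating the experiment with a rational α such as 1/3 gives frequencies that depend on x0 and need not be close to b − a.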

Definition 14.5. A measure-preserving transformation T is called Ergodic if the only functions f ∈ L2 [0, 1] satisfying
UT f = ”f ◦ T ” = f are the constant functions (i.e. in span{1}).

Theorem 14.6. (Mean Ergodic Theorem): Let T : [0, 1] → [0, 1] be ergodic, then for all f ∈ L2 [0, 1]:
\[ \frac{1}{N + 1}\sum_{n=0}^{N} U_T^n f \xrightarrow{L^2} \int_0^1 f = \langle f, 1\rangle \tag{222} \]
Where $\int_0^1 f$ is regarded as a constant function in L2 .

15 The Mean Ergodic Theorem
Example. Non-Example: T x = (x + 1/3) mod 1 is Not an ergodic transformation (As it is in fact periodic). We can
find a disjoint union of 3 intervals A which are 1/3 (mod 1) apart such that T −1 A = A, then we can take f = χA , and
χA ◦ T = χA , therefore T is not ergodic. In particular, it is clear that the ergodic theorem does not hold.
Remark. T x = x + α mod 1 is Ergodic iff α is irrational. Additionally, T x = 2x mod 1 is also ergodic.
The Mean Ergodic Theorem will follow from the following:

Theorem 15.1. Let H be a hilbert space, A ∈ B(H) with ||A|| ≤ 1. Let M be the collection of fixed points of A, i.e.
M = {x ∈ H : Ax = x} = ker(I − A) (223)
This is a closed subspace of H. Then for all x ∈ H
\[ \frac{1}{N + 1}\sum_{n=0}^{N} A^n x \to P_M x \tag{224} \]

Proof. (Proof of MET from Theorem 15.1): Let H = L2 [0, 1], A = UT (||UT || = 1). In this case,
M = {f : UT f = f } = span{1} as T is ergodic, therefore $P_M f = \langle f, 1\rangle 1 = \int_0^1 f$ (As {1} is an orthonormal basis for
M).
Proof. (Proof of Theorem 15.1): Note that M is indeed closed as the kernel of a bounded linear operator. First, we
shall prove that ker(I − A) = ker(I − A∗ ). Let x ∈ ker(I − A), then Ax = x, and we have
\[ \|x\|^2 = \langle x, x\rangle = \langle Ax, x\rangle = \langle x, A^* x\rangle \overset{\text{C-S}}{\le} \|x\| \cdot \|A^* x\| \le \|x\| \cdot \|A^*\| \cdot \|x\| \overset{\|A\| = \|A^*\|}{=} \|x\|^2 \|A\| \le \|x\|^2 \tag{225} \]
Where the last inequality is as ||A|| ≤ 1. Therefore there is equality in each inequality. In particular, in the Cauchy-Schwarz Inequality, but this only occurs when A∗ x = cx for some c ∈ C, but we have
\[ \bar c \cdot \|x\|^2 = \langle x, A^* x\rangle = \|x\|^2 \tag{226} \]
Therefore c = 1 (for x ≠ 0), and so x ∈ ker(I − A∗ ). The converse inclusion is identical, switching the roles of A, A∗ (Since
A∗∗ = A).

Next, we have by Proposition 10.7
\[ M^\perp = \ker(I - A)^\perp = \ker(I - A^*)^\perp = \overline{\operatorname{Im}(I - A)} \tag{227} \]
To check the claim, it suffices to check it for x ∈ M and for x ∈ M⊥ by linearity, since H = M ⊕ M⊥ .
• Let x ∈ M, then
\[ \frac{1}{N + 1}\sum_{n=0}^{N} A^n x = \frac{1}{N + 1}\sum_{n=0}^{N} x = x \to x = P_M x \tag{228} \]
• First let x ∈ Im(I − A). Let x = y − Ay, then
\[ \frac{1}{N + 1}\sum_{n=0}^{N} A^n x = \frac{1}{N + 1}\Big(\sum_{n=0}^{N} A^n y - \sum_{n=0}^{N} A^{n+1} y\Big) = \frac{1}{N + 1}\big(y - A^{N+1} y\big) \to 0 = P_M x \tag{229} \]
Where the last step is as $\|y - A^{N+1} y\| \le 2\|y\|$ by the triangle inequality and the fact that ||A|| ≤ 1, therefore
this sequence goes to 0 (As 1/(N + 1) goes to 0).

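Now, let $x \in M^\perp = \overline{\operatorname{Im}(I - A)}$. Let ε > 0, choose x′ ∈ Im(I − A) such that ||x − x′ || < ε, therefore
\[ \Big\| \frac{1}{N + 1}\sum_{n=0}^{N} A^n x \Big\| \le \Big\| \frac{1}{N + 1}\sum_{n=0}^{N} A^n (x - x') \Big\| + \Big\| \frac{1}{N + 1}\sum_{n=0}^{N} A^n x' \Big\| \tag{230} \]
The first summand is at most ||x − x′ || < ε, since $\|A^n\| \le 1$ for all n. Choose N0 large enough such that for all N > N0 , the second summand is < ε (Since it converges to 0), therefore
the sum is smaller than 2ε, therefore it converges to 0 as required.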
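
A finite-dimensional instance of Theorem 15.1 can be simulated directly. In the sketch below (an illustration under assumptions, not the general proof) A acts on R³ by fixing the first coordinate and rotating the other two by an angle of 1 radian, so that ||A|| = 1 and M = span{e1 }; the angle, the starting vector and the helper names are arbitrary choices.

```python
import math

theta = 1.0  # rotation angle in radians (an arbitrary choice, not a multiple of 2*pi)
c, s = math.cos(theta), math.sin(theta)

def apply_A(v):
    # A fixes the first coordinate and rotates the last two, so ||A|| = 1
    x, y, z = v
    return (x, c * y - s * z, s * y + c * z)

def cesaro_average(v, N):
    # (1 / (N + 1)) * sum_{n=0}^{N} A^n v
    total, current = [0.0, 0.0, 0.0], v
    for _ in range(N + 1):
        total = [t + u for t, u in zip(total, current)]
        current = apply_A(current)
    return [t / (N + 1) for t in total]

avg = cesaro_average((2.0, 1.0, -3.0), 20000)
print(avg)  # should be close to P_M v = (2, 0, 0)
```

Note that the iterates A^n v themselves do not converge here; only their averages do, which is the point of the theorem.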

16 Invertible Operators And The Inverse Mapping Theorem, Spectrum
16.1 Invertible Operators And The Inverse Mapping Theorem
Definition 16.1. Let X, Y be banach spaces, A ∈ B(X, Y ). We say that A is Invertible if there exists B ∈ B(Y, X)
such that AB = IY , BA = IX , i.e. A is bijective and its inverse is Bounded.

Proposition 16.2. Let A ∈ B(X, Y ) be bijective, then A−1 is bounded iff A is bounded below, i.e. there exists c > 0 such that ||Ax|| ≥ c||x|| for
all x ∈ X.

Proof. Suppose A is bounded below. Let y ∈ Y , and let x = A−1 y, then there exists c > 0 such that
\[ \|Ax\| \ge c\|x\| \implies \|y\| \ge c\|A^{-1} y\| \implies \|A^{-1} y\| \le \frac{1}{c}\|y\| \tag{231} \]
The converse is similar.

Theorem 16.3. (Inverse Mapping Theorem): Let X, Y be banach spaces. If A ∈ B(X, Y ) is bijective, then
A−1 ∈ B(Y, X).

Remark. We shall only prove the theorem for Hilbert Spaces.


Proof. First we prove that A∗ is bounded below, i.e. there is c > 0 with ||A∗ k|| ≥ c||k|| for all k ∈ Y . Suppose not, then there exists a
sequence kn such that ||A∗ kn || = 1 and ||kn || → ∞. This is as for all n, there exists kn ∈ Y such that
\[ \|A^* k_n\| < \frac{1}{n}\|k_n\| \tag{232} \]
We normalise kn such that ||A∗ kn || = 1, then we get ||kn || > n.

Fix some z ∈ Y , and let z = Ax for x ∈ X, then we have
\[ |\langle k_n, z\rangle| = |\langle k_n, Ax\rangle| = |\langle A^* k_n, x\rangle| \overset{\text{C.S.}}{\le} \|A^* k_n\| \cdot \|x\| = \|x\| \tag{233} \]
But by Theorem 14.2, since the sequence ⟨kn , z⟩ is bounded for all z, kn must be bounded, a contradiction, hence we
must have that A∗ is bounded below.

Let us show that A∗ is injective with dense image: ker(A∗ ) = Im(A)⊥ = Y ⊥ = {0}, and $\overline{\operatorname{Im}(A^*)} = \ker(A)^\perp = \{0\}^\perp = X$.

We shall prove the following fact: An operator that is bounded below has closed image. Indeed, let $h \in \overline{\operatorname{Im}(A^*)}$, that
is, $h = \lim_{n \to \infty} A^* k_n$. For all n, m, we have (By boundedness below)
\[ \|k_n - k_m\| \le \frac{1}{c}\|A^* k_n - A^* k_m\| \xrightarrow{n,m \to \infty} 0 \tag{234} \]
As (A∗ kn ) converges and is therefore cauchy. This means that (kn ) is cauchy, and therefore converges, so kn → k, then
by continuity h = A∗ k, therefore h ∈ Im A∗ , therefore the image is closed. This proves that A∗ is surjective, so by
Proposition 16.2 we have that A∗ has bounded inverse (A∗ )−1 .

Now we show that A−1 is bounded. Taking the adjoint of both sides, we have
(A∗ )−1 A∗ = IY =⇒ A((A∗ )−1 )∗ = IY
(235)
A∗ (A∗ )−1 = IX =⇒ ((A∗ )−1 )∗ A = IX
So that by uniqueness of the inverse ((A∗ )−1 )∗ = A−1 . Since the adjoint of a bounded operator is bounded, A−1 is bounded, as
required.

Definition 16.4. Denote the set of invertible maps in B(X) by GL(X).

Theorem 16.5. GL(X) is open in B(X).

Lemma 16.6. Let A ∈ B(X). If ||A|| < 1, then I − A is invertible. In other words, if ||B − I|| < 1,
then B is invertible, that is, B1 (I) ⊆ GL(X). Moreover, the inverse is given by
\[ (I - A)^{-1} = \sum_{n=0}^{\infty} A^n \tag{236} \]

Proof. In a banach space, every absolutely convergent series converges (By an analogous proof as for numbers in
infi 2). Note that
\[ \sum_{n=0}^{\infty} \|A^n\| \le \sum_{n=0}^{\infty} \|A\|^n = \frac{1}{1 - \|A\|} < \infty \tag{237} \]
Therefore the series $S = \sum_{n=0}^{\infty} A^n$ converges in B(X) (As it is a banach space). Let us calculate
\[ S(I - A) = \lim_{N \to \infty} \Big(\sum_{n=0}^{N} A^n\Big)(I - A) = \lim_{N \to \infty} (I - A)\sum_{n=0}^{N} A^n = \lim_{N \to \infty} \big(I - A^{N+1}\big) = I \tag{238} \]
Where the last equality is as the series $\sum_{n=0}^{\infty} A^n$ converges absolutely, therefore we must have that $A^n \to 0$, therefore
S(I − A) = I, and similarly (I − A)S = I.
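
The series in Lemma 16.6 can be tried out on a small matrix. This is a minimal sketch on 2 × 2 real matrices: the matrix A below is an arbitrary choice with Frobenius norm √0.3 < 1 (so its operator norm is also below 1), the cutoff of 200 terms is arbitrary, and the helper names are hypothetical.

```python
def matmul(A, B):
    # product of two square matrices stored as lists of rows
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    # entrywise sum of two matrices
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.2, 0.3], [-0.1, 0.4]]   # illustrative matrix with ||A|| < 1

# Partial sum S_N = I + A + A^2 + ... + A^199 of the series in (236).
S, power = [[0.0, 0.0], [0.0, 0.0]], I
for _ in range(200):
    S = matadd(S, power)
    power = matmul(power, A)

I_minus_A = [[I[i][j] - A[i][j] for j in range(2)] for i in range(2)]
product = matmul(S, I_minus_A)
print(product)  # should be very close to the identity matrix
```

Since S_N (I − A) = I − A^{N+1}, the error of the partial sum is controlled by ||A||^{N+1}, which is why a modest number of terms already suffices here.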
Proof. (Proof of Theorem 16.5): Let A ∈ GL(X). For any B ∈ B(X), we have
A − B = A − AA−1 B = A(I − A−1 B) (239)
Therefore if $\|B\| < \frac{1}{\|A^{-1}\|}$, then $\|A^{-1} B\| < 1$ and hence $I - A^{-1} B \in GL(X)$ by the lemma, therefore A − B is invertible
as a composition of invertible maps. Therefore, we have that $B_{1/\|A^{-1}\|}(A) \subseteq GL(X)$, since if $\|A - B\| < \frac{1}{\|A^{-1}\|}$, then
we’ve shown that A − (A − B) = B is invertible.

Exercise. The map GL(X) → GL(X) given by A 7→ A−1 is continuous.

16.2 Spectrum of An Operator


Definition 16.7. Let A ∈ B(X). The Spectrum of A is defined to be
σ(A) = {λ ∈ C : A − λI ∉ GL(X)} (240)
The Point Spectrum of A is defined to be
σp (A) = {λ ∈ C : ker(A − λI) ̸= 0} ⊆ σ(A) (241)
i.e. the set of eigenvalues of A.

Remark. Generally, σ(A) ̸= σp (A), as A − λI may be injective but non-surjective.

Definition 16.8. The Resolvent Set of A is


ρ(A) = C − σ(A) (242)
Elements of the Resolvent are called Regular Values.

Example. Let X = ℓ2, and define
$$A(a_1, a_2, \ldots) = \Big(a_1, \frac{a_2}{2}, \frac{a_3}{3}, \ldots\Big) \tag{243}$$
It's clear that ||A|| ≤ 1. It is also clear that $\lambda_n = \frac{1}{n}$ is an eigenvalue with eigenvector en = (0, 0, . . . , 1, 0, . . .) for all n. In fact
$$\sigma_p(A) = \Big\{1, \frac{1}{2}, \frac{1}{3}, \ldots\Big\} \tag{244}$$
In particular 0 ∉ σp(A), but 0 ∈ σ(A). Indeed, A − 0I = A is noninvertible: it is injective (as 0 is not an eigenvalue) but not surjective.

A is not surjective since the vector $(1, \frac{1}{2}, \frac{1}{3}, \ldots) \in \ell^2$ is not in Im(A): its only possible preimage is (1, 1, 1, . . .) ∉ ℓ2.

We can also show that A is noninvertible as it is not bounded below, since ||en|| = 1 and $\|Ae_n\| = \frac{1}{n} \to 0$.
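
The failure of surjectivity can be seen on truncations. In this sketch (the function name and the truncation length are assumptions for illustration) the target (1, 1/2, ..., 1/N) stays bounded in norm, while its only preimage under A has norm about √N.

```python
import math

def preimage_norm(N):
    """Norm of the preimage of the truncated target (1, 1/2, ..., 1/N) under A."""
    target = [1.0 / n for n in range(1, N + 1)]
    # A divides the n-th coordinate by n, so the preimage multiplies it by n.
    preimage = [n * t for n, t in zip(range(1, N + 1), target)]
    return math.sqrt(sum(x * x for x in preimage))

def target_norm(N):
    """Norm of the truncated target, bounded by sqrt(pi^2 / 6) for every N."""
    return math.sqrt(sum(1.0 / (n * n) for n in range(1, N + 1)))

growth = [preimage_norm(N) for N in (100, 400, 1600)]   # roughly 10, 20, 40
```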

Proposition 16.9. σ(A) is closed, and σ(A) ⊆ {z ∈ C : |z| ≤ ||A||}, so σ(A) is compact.

Proof. The map φ : C → B(X) defined by φ(λ) = A − λI is an affine isometry (||φ(λ) − φ(µ)|| = |λ − µ|), and in particular continuous, therefore
ρ(A) = φ−1(GL(X)) (245)
Therefore ρ(A) is open since GL(X) is, and hence its complement σ(A) is closed. For boundedness, we need to show that if |λ| > ||A|| then $A - \lambda I = -\lambda\big(I - \frac{1}{\lambda}A\big)$ is invertible, but
$$\Big\|\frac{1}{\lambda}A\Big\| = \frac{\|A\|}{|\lambda|} < 1 \tag{246}$$

Therefore by Lemma 16.6, $I - \frac{1}{\lambda}A$ is invertible, and hence A − λI is invertible, so that
ρ(A) ⊇ {z : |z| > ||A||} =⇒ σ(A) ⊆ {z : |z| ≤ ||A||} (247)

Proposition 16.10. If H is hilbert and A ∈ B(H), then


σ(A∗ ) = {z̄ : z ∈ σ(A)} (248)

Proof. The identity (T S)∗ = S ∗ T ∗ shows that T is invertible iff T ∗ is invertible, and in this case (T ∗ )−1 = (T −1 )∗ .

Therefore, λ ∈ σ(A∗ ) iff A∗ − λI = (A − λ̄I)∗ is noninvertible iff A − λ̄I is noninvertible iff λ̄ ∈ σ(A).

Theorem 16.11. For all A ∈ B(X), σ(A) ̸= ∅.

Remark. The above theorem is only true over C.

Example. Let X = ℓ2, and let A ∈ B(X) be


A(a1 , a2 , . . .) = (0, a1 , a2 , . . .) (249)
Let us calculate σp (A). Suppose λ is an eigenvalue, then
(0, a1 , a2 , . . .) = Aa = λa = (λa1 , λa2 , . . .) (250)
If λ = 0, then clearly a = 0. If λ ̸= 0, then a1 = 0, and for all n, we have λan+1 = an , therefore by induction we get that
an = 0 for all n, i.e. a = 0, so we see that σp (A) = ∅ since for any λ the equation Aa = λa has no nontrivial solutions.

Recall that
A∗ (a1 , a2 , . . .) = (a2 , a3 , . . .) (251)
If |λ| < 1, let $a_\lambda = (1, \lambda, \lambda^2, \ldots) \in \ell^2$. Then $A^* a_\lambda = (\lambda, \lambda^2, \lambda^3, \ldots) = \lambda a_\lambda$, therefore λ ∈ σp(A∗), therefore
{|z| < 1} ⊆ σp(A∗) ⊆ σ(A∗) (252)
And in particular, since the spectrum is closed, we have that
{|z| ≤ 1} ⊆ σ(A∗) (253)
On the other hand, ||A∗|| = ||A|| = 1, therefore by Proposition 16.9 we have σ(A∗) ⊆ {|z| ≤ 1}, i.e. σ(A∗) = {|z| ≤ 1}.

Moreover, by Proposition 16.10 we have
σ(A) = {|z| ≤ 1} (254)
since {z : |z| ≤ 1} = {z̄ : |z| ≤ 1}.
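
A quick numerical sanity check of the eigenvector of the backward shift, on a truncation. This is a sketch only: the value of `lam`, the truncation length 60 and the helper name are assumptions for illustration.

```python
def backward_shift(a):
    """Truncated model of A*(a1, a2, ...) = (a2, a3, ...)."""
    return a[1:]

lam = 0.5 + 0.25j                        # any complex number with |lam| < 1
a = [lam ** n for n in range(60)]        # first 60 terms of (1, lam, lam^2, ...)
shifted = backward_shift(a)

# Compare A* a with lam * a on the coordinates that both truncations contain.
residual = max(abs(shifted[n] - lam * a[n]) for n in range(len(shifted)))

# The squared norm is a geometric series with ratio |lam|^2 < 1, hence finite.
norm_sq = sum(abs(x) ** 2 for x in a)
limit = 1 / (1 - abs(lam) ** 2)
```

The finiteness of `limit` is exactly where |λ| < 1 is used: for |λ| ≥ 1 the same sequence is not in ℓ2.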

17 Compact Operators
Definition 17.1. Let X, Y be Banach spaces. Denote by X1 = {x ∈ X : ||x|| ≤ 1} the unit ball in X. An operator A ∈ B(X, Y) is called Compact if A(X1) is precompact, i.e. the closure of A(X1) is compact. Equivalently, every sequence {xn} ⊆ X with ||xn|| ≤ 1 has a subsequence (xnk) such that (Axnk) converges. We denote the set of compact operators K(X, Y).

Remark. In particular, the above definition is equivalent to the fact that every bounded sequence xn has a subsequence
xnk such that Axnk converges (The equivalence is by normalising xn , as scalar multiplication preserves convergence).

Proposition 17.2. If A ∈ B(X, Y ) and dim Im A < ∞ (We say that A is of Finite Rank), then A is compact.

Proof. A is bounded, therefore the closure of A(X1) is a closed, bounded subset of a finite-dimensional normed space, hence by Heine-Borel it is compact (we know from a previous course that Heine-Borel holds for any finite-dimensional normed space).

Remark. Every compact operator is bounded, since if the closure of A(X1) is compact, it is bounded, therefore A(X1) is bounded, so ||A|| < ∞.

Example. Let k ∈ C([0, 1]²) and define
$$K : C[0,1] \to C[0,1], \qquad (Kf)(x) = \int_0^1 k(x,t)f(t)\,dt \tag{255}$$
We claim that this is a compact operator. We must prove that F = K(C[0, 1]1) is precompact. Recall the Arzela-Ascoli Theorem: a set in C[0, 1] is precompact iff it is uniformly bounded and equicontinuous. Let us show F satisfies both conditions:
• F is uniformly bounded: Let f ∈ C[0, 1]1, then
$$|(Kf)(x)| = \Big|\int_0^1 k(x,t)f(t)\,dt\Big| \le \int_0^1 |k(x,t)|\,|f(t)|\,dt \le \int_0^1 |k(x,t)|\,dt \le \|k\|_\infty \tag{256}$$
And this bound does not depend on x, f.
• F is equicontinuous: Let ε > 0. k is uniformly continuous, therefore there is δ > 0 such that if ||(x, t) − (x′, t′)|| < δ then |k(x, t) − k(x′, t′)| < ε. Let f ∈ C[0, 1]1 and x, y such that |x − y| < δ, we have
$$|(Kf)(x) - (Kf)(y)| = \Big|\int_0^1 k(x,t)f(t)\,dt - \int_0^1 k(y,t)f(t)\,dt\Big| \le \int_0^1 |k(x,t) - k(y,t)|\cdot|f(t)|\,dt < \varepsilon \tag{257}$$
Therefore by Arzela-Ascoli F is precompact, therefore K is a compact operator.

Theorem 17.3. K(X, Y ) is a closed subspace of B(X, Y ).

Proof. Subspace: Let A, B ∈ K(X, Y), let λ be a scalar, and let (xn) be a sequence in X1. By compactness of A, there exists a subsequence (xnk) such that (Axnk) converges. By compactness of B, there is a further subsequence (xnkj) such that (Bxnkj) converges, therefore so does
(λA + B)xnkj = λAxnkj + Bxnkj (258)
Therefore λA + B is compact.

Closedness: Let (An) be compact operators converging in norm to an operator A. Let (xn) be a sequence in X1. Define a sequence of subsequences inductively:

Take a subsequence (zn,1) of (xn) such that (A1 zn,1) converges. Take a subsequence (zn,2) of (zn,1) such that (A2 zn,2) converges. Continuing by induction, we get for all k a sequence $(z_{n,k})_{n=1}^{\infty}$ such that (Ak zn,k) converges. Take the diagonal sequence $(w_n) = (z_{n,n})_{n=1}^{\infty}$. We claim that (Awn) converges by showing that it is Cauchy.

Let ε > 0, then there exists N such that ||AN − A|| < ε. Consider
$$\|Aw_n - Aw_m\| \le \|Aw_n - A_Nw_n\| + \|A_Nw_n - A_Nw_m\| + \|A_Nw_m - Aw_m\| \le \|A_N - A\|\cdot\|w_n\| + \|A_Nw_n - A_Nw_m\| + \|A_N - A\|\cdot\|w_m\| < 2\varepsilon + \|A_Nw_n - A_Nw_m\| \tag{259}$$
By construction, $(w_n)_{n \ge N}$ is a subsequence of $(z_{k,N})_k$, and (AN zk,N) converges, therefore (AN wn) converges as well and is therefore Cauchy. So there exists M such that for all n, m > M we have ||AN wn − AN wm|| < ε, therefore we get
||Awn − Awm|| < 3ε (260)
Therefore (Awn) is Cauchy and therefore converges, so A is compact.

Proposition 17.4. Let A ∈ B(X, Y ), B ∈ B(Y, Z). If A or B are compact, then BA ∈ K(X, Z).

Proof. Suppose that A is compact, then let {xn } ⊆ X1 , by compactness, there is a subsequence xnk such that Axnk
converges, and therefore by continuity BAxnk converges.

Suppose that B is compact, and let {xn } ⊆ X1 , then {Axn } is bounded (Since A is bounded), therefore by
an equivalent definition of compact operators, there exists a subsequence Axnk such that BAxnk converges.

Question. Is the identity map IX ∈ B(X) compact?


Answer: Not generally. This question is equivalent to the compactness of the unit ball X1 (in fact precompactness, but X1 is closed, therefore it is precompact iff it is compact). In an infinite-dimensional Hilbert space, we've seen that we can find an orthonormal sequence {en} ⊆ X1 which has no convergent subsequence (since ||ei − ej|| = √2 for i ≠ j), so the identity operator is not compact in an infinite-dimensional Hilbert space. It turns out that the identity is not compact in any infinite-dimensional Banach space either. For this, we use the following lemma:

Lemma 17.5. (Riesz' Lemma): Let X be a Banach space, Y ⊊ X a closed subspace, then there exists x0 ∈ X such that ||x0|| = 1 and
$$d(x_0, Y) = \inf\{d(x_0, y) : y \in Y\} \ge \frac{1}{2} \tag{261}$$

Proof. Let x ∉ Y, then we have d = d(x, Y) > 0 (since Y is closed). By definition of the infimum, there is y ∈ Y such that
d ≤ ||x − y|| < 2d (262)
Choose $x_0 = \frac{x - y}{\|x - y\|}$. For all ỹ ∈ Y, we have
$$\|x_0 - \tilde{y}\| = \Big\|\frac{x-y}{\|x-y\|} - \frac{\tilde{y}\,\|x-y\|}{\|x-y\|}\Big\| = \frac{1}{\|x-y\|}\cdot\big\|x - y - \tilde{y}\cdot\|x-y\|\big\| \tag{263}$$
Then $\frac{1}{\|x-y\|} > \frac{1}{2d}$, and $\|x - y - \tilde{y}\,\|x-y\|\| \ge d$ by definition of d (since $y + \tilde{y}\,\|x-y\| \in Y$), therefore
$$\frac{1}{\|x-y\|}\cdot\big\|x - y - \tilde{y}\cdot\|x-y\|\big\| \ge \frac{1}{2} \tag{264}$$

Proposition 17.6. The identity operator IX on a Banach Space X is compact iff X is finite-dimensional.

Proof. ⇐ is obvious by Heine-Borel since it holds for every finite-dimensional normed space (Since every norm is
equivalent to the euclidean norm, in which Heine-Borel holds as in Cn ), therefore the unit ball is compact.

⇒: Suppose that dim X = ∞, we shall show the unit ball is not compact. Let x1 ∈ X with ||x1|| = 1. We construct a sequence by induction: Suppose we have constructed {x1, . . . , xn} with ||xi|| = 1 and ||xi − xj|| ≥ 1/2 for any i ≠ j. Consider Y = span{x1, . . . , xn}; this is a proper subspace by hypothesis, and is closed (as every finite-dimensional subspace is closed), so by Riesz' Lemma there is xn+1 with ||xn+1|| = 1 such that d(xn+1, Y) ≥ 1/2. We get a sequence {xn} ⊆ X1 with ||xn − xm|| ≥ 1/2 for all m ≠ n, so this is a sequence which does not have a convergent subsequence (in particular (IX xn) does not have a convergent subsequence), hence IX is not compact.

18 Spectrum of Compact Operators
Note: for the ensuing discussion, we consider only infinite-dimensional spaces.

Proposition 18.1. Let A ∈ K(X), then 0 ∈ σ(A)

Proof. If 0 ∉ σ(A), then A is invertible, but then IX = AA−1 is compact by Proposition 17.4, a contradiction to Proposition 17.6.

Theorem 18.2. (Fredholm Alternative): Let A ∈ K(X) and λ ̸= 0, then either λ ∈ ρ(A) or λ ∈ σp (A), and the
eigenspace Nλ = {x : Ax = λx} is finite-dimensional. In particular, σ(A) = {0} ∪ σp (A).

Lemma 18.3. Let A ∈ K(X), and let Y be a closed subspace such that ker(I − A) ∩ Y = {0}, then (I − A)|Y is bounded below. In particular, (I − A)(Y) is closed.

Proof. We prove the contrapositive: suppose that (I − A)|Y is not bounded below; we shall show that ker(I − A) ∩ Y ≠ {0}. By hypothesis, there is a sequence {yn} ⊆ Y such that ||yn|| = 1 and ||(I − A)yn|| → 0. By compactness of A, there is a subsequence (ynk) such that Aynk → y for some y, therefore
ynk = (I − A)ynk + Aynk → 0 + y = y (265)
Since Y is closed, y ∈ Y. Moreover, ||y|| = 1 by continuity of the norm, so y ≠ 0 in particular, and by continuity of A we have
$$y = \lim_{k\to\infty} Ay_{n_k} = A\lim_{k\to\infty} y_{n_k} = Ay \tag{266}$$
But this means that 0 ≠ y ∈ ker(I − A) ∩ Y. The fact that (I − A)(Y) is closed follows as in the proof of Theorem 16.3.

Lemma 18.4. Let A ∈ K(X). If I − A is injective then it is surjective.

Proof. Define $Y_n = (I - A)^n(X)$. By the previous lemma, I − A is bounded below (taking Y = X), therefore $(I - A)^n$ is bounded below, therefore Yn is a closed subspace. Moreover
Y0 = X ⊇ Y1 ⊇ Y2 ⊇ . . . ⊇ Yn ⊇ . . . (267)
Suppose by contradiction that I − A is not surjective, then there is x ∈ Y0 − Y1. This implies that $(I - A)^n x \in Y_n - Y_{n+1}$:

If $(I - A)^n x \in Y_{n+1}$, then $(I - A)^n x = (I - A)^{n+1}\tilde{x}$, and by injectivity this means that x = (I − A)x̃, in contradiction to our hypothesis. Therefore all the inclusions in (267) are strict. By Riesz' Lemma (applied in Yn), for all n ∈ N we may choose yn ∈ Yn such that ||yn|| = 1 and d(yn, Yn+1) ≥ 1/2. For any n > m ∈ N we have
$$\|Ay_m - Ay_n\| = \|y_m - ((I - A)y_m + Ay_n)\| \tag{268}$$
ym ∈ Ym, therefore $y_m = (I - A)^m x$ for some x, therefore $(I - A)y_m = (I - A)^{m+1}x \in Y_{m+1}$. Similarly, yn ∈ Yn ⊆ Ym+1 (since n > m), therefore $y_n = (I - A)^{m+1}x'$ for some x′, therefore
$$Ay_n = A(I - A)^{m+1}x' = (I - A)^{m+1}(Ax') \in Y_{m+1} \tag{269}$$
Therefore z = (I − A)ym + Ayn ∈ Ym+1, therefore
$$\|Ay_m - Ay_n\| = \|y_m - z\| \ge \frac{1}{2} \tag{270}$$
But this means that {Ayn} has no convergent subsequence, as the distance between any two elements is at least 1/2, so A is not compact, a contradiction.

Proof. (Proof of Fredholm Alternative): Suppose that λ ≠ 0 is not an eigenvalue, then $A - \lambda I = -\lambda\big(I - \frac{A}{\lambda}\big)$ is injective (note that A/λ is still compact), therefore by Lemma 18.4 and Lemma 18.3, I − A/λ is surjective and bounded below, therefore in particular so is A − λI, so A − λI is invertible, i.e. λ ∈ ρ(A).

Finally, suppose λ ∈ σp(A), then A|Nλ = λINλ is compact, hence so is INλ, and therefore by Proposition 17.6 (Nλ is closed, hence a Banach space) dim Nλ < ∞.

Theorem 18.5. Let A ∈ K(X). Either σp(A) is empty, finite, or σp(A) = {λn}n∈N with λn → 0 as n → ∞.

Proof. We shall show that for any ε > 0, there exist only finitely many eigenvalues with |λ| ≥ ε. Suppose not. Let ε > 0 and {λn} ⊆ σp(A) be a sequence of distinct eigenvalues such that |λn| ≥ ε. Let xn be an eigenvector of λn for all n, and let Yn = span{x1, . . . , xn}, then we have an ascending sequence of subspaces
Y1 ⊊ Y2 ⊊ . . . ⊊ Yn ⊊ . . . (271)
where the inclusions are proper as eigenvectors corresponding to different eigenvalues are linearly independent. By Riesz' Lemma (applied in Yn), for all n > 1 choose yn ∈ Yn such that ||yn|| = 1 and d(yn, Yn−1) ≥ 1/2. For all n > m, we have
$$\|Ay_n - Ay_m\| = \Big\|\lambda_n\Big(y_n - \Big(\big(I - \tfrac{A}{\lambda_n}\big)y_n + \tfrac{1}{\lambda_n}Ay_m\Big)\Big)\Big\| \tag{272}$$
Note that yn ∈ Yn, therefore $y_n = \sum_{i=1}^{n} c_i x_i$, so we have
$$\Big(I - \frac{A}{\lambda_n}\Big)y_n = \sum_{i=1}^{n} c_i x_i - \frac{1}{\lambda_n}\sum_{i=1}^{n}\lambda_i c_i x_i = \sum_{i=1}^{n}\Big(1 - \frac{\lambda_i}{\lambda_n}\Big)c_i x_i \in Y_{n-1} \tag{273}$$
since the coefficient of xn is 0. Likewise, since ym ∈ Ym ⊆ Yn−1, we may write $y_m = \sum_{i=1}^{n-1} c'_i x_i$, so
$$\frac{1}{\lambda_n}Ay_m = \frac{1}{\lambda_n}\sum_{i=1}^{n-1}\lambda_i c'_i x_i \in Y_{n-1} \tag{274}$$
Therefore $z = \big(I - \frac{A}{\lambda_n}\big)y_n + \frac{1}{\lambda_n}Ay_m \in Y_{n-1}$, so by construction
$$\|Ay_n - Ay_m\| = \|\lambda_n(y_n - z)\| \ge \frac{|\lambda_n|}{2} \ge \frac{\varepsilon}{2} \tag{275}$$
Therefore (Ayn) has no convergent subsequence, in contradiction to compactness.

This concludes the proof, since this condition implies there are only countably many eigenvalues in σp(A) (by taking ε = 1/n for all n ∈ N), so either σp(A) is finite or empty, or it is countably infinite, in which case our condition exactly implies that it forms a sequence converging to 0 (for any ε only finitely many eigenvalues are not in an ε-neighbourhood of 0).

Example. (Every sequence converging to 0 is the point spectrum of some compact operator) Take X = ℓ2. Fix a sequence $(w_n)_{n=1}^{\infty}$ of complex numbers converging to 0, and define
A(a1, a2, . . .) = (w1a1, w2a2, . . .) (276)
Similarly to the Example in Lecture 16, A is bounded, it's clear that the wn are eigenvalues with eigenvectors en, and it is clear there are no other eigenvalues, so σp(A) = {wn}n∈N. We shall show that A is compact. Define AN(a1, a2, . . .) = (w1a1, . . . , wNaN, 0, . . .). AN is bounded of finite rank, therefore compact by Proposition 17.2. We shall show that AN → A in norm, therefore by Theorem 17.3, A will be compact. Indeed, we have
$$\|A_N - A\| = \sup_{n > N}|w_n| \xrightarrow{N\to\infty} 0 \tag{277}$$
Note that we may take {wn} to be a sequence that is eventually 0, therefore this example shows that we may find compact operators in ℓ2 whose spectrum is any prescribed finite set containing 0 as well. In recitation, we shall see that there also exist compact operators with σp(A) = ∅, so σ(A) = {0}.
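
The approximation error in (277) can be tabulated for a concrete weight sequence. This is a sketch under assumptions: the weights wn = 1/n, the finite list length and the helper name are chosen only for illustration.

```python
def tail_sup(w, N):
    """sup of |w_n| over n > N, for a finite list w = [w_1, ..., w_L]."""
    tail = [abs(x) for x in w[N:]]     # w[N:] holds w_{N+1}, w_{N+2}, ...
    return max(tail) if tail else 0.0

w = [1.0 / n for n in range(1, 201)]   # assumed example weights w_n = 1/n
# Norm of A - A_N on the truncated space, for a few values of N.
errors = [tail_sup(w, N) for N in (1, 10, 100)]
```

For these weights the error after keeping N coordinates is 1/(N + 1), so the finite-rank operators AN converge to A in norm.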

19 Compact Operators In Hilbert Spaces
19.1 Compact Operators In A Hilbert Space
Example. Let k ∈ C([0, 1]²) and define K : C[0, 1] → C[0, 1] by
$$(Kf)(x) = \int_0^1 k(x,t)f(t)\,dt \tag{278}$$
We now think of C[0, 1] as a dense subset of L2[0, 1]. Let us check that K is bounded in L2:
$$\|Kf\|_2^2 = \int_0^1 |(Kf)(x)|^2\,dx = \int_0^1\Big|\int_0^1 k(x,t)f(t)\,dt\Big|^2 dx \overset{CS}{\le} \int_0^1\Big(\int_0^1 |k(x,t)|^2\,dt\Big)\Big(\int_0^1 |f(t)|^2\,dt\Big)dx = \|f\|_2^2\int_0^1\!\!\int_0^1 |k(x,t)|^2\,dt\,dx = \|f\|_2^2\cdot\|k\|_2^2 \tag{279}$$
Therefore K is bounded, and in fact ||K|| ≤ ||k||2.

Suppose now that k ∈ L2([0, 1]²). There is a sequence C([0, 1]²) ∋ kn → k in L2. Consider the corresponding integral operators Kn, then
||Kn − Km|| ≤ ||kn − km||2 → 0 (280)
where we used the fact that Kn − Km is an integral operator with integration kernel kn − km, and the above bound, which we calculated for an arbitrary integral operator. Therefore (Kn) is Cauchy and hence converges by completeness of B(L2[0, 1]). We call the limit K. By the usual argument, since the limit K exists for any sequence kn → k, it is independent of the choice of sequence kn.

Proposition 19.1. K as defined above is compact.

Proof. Choose the kernel $e_{n,m}(x,y) = e^{2\pi i(nx+my)}$, then the corresponding operator for f ∈ C[0, 1] is
$$(E_{n,m}f)(x) = \int_0^1 e^{2\pi i(nx+mt)}f(t)\,dt = e^{2\pi inx}\int_0^1 e^{2\pi imt}f(t)\,dt = e^{2\pi inx}\hat{f}(-m) \tag{281}$$
So En,m is continuous of finite rank (since its image is span{$e^{2\pi inx}$}), therefore it is compact. For any k ∈ L2([0, 1]²), we may write $k = \sum_{n,m} c_{n,m}e_{n,m}$, therefore $K = \sum_{n,m} c_{n,m}E_{n,m}$ (where the series converges in operator norm by the bound (280)), so K is compact as a limit of finite-rank, hence compact, operators.

19.2 Spectral Theorem For Compact Self-Adjoint Operators


Definition 19.2. Let A ∈ B(H). Denote the Numerical Radius
r(A) = sup{|⟨Ah, h⟩| : ||h|| = 1} (282)

Proposition 19.3. For all A ∈ B(H), we have r(A) ≤ ||A||. Moreover, if A is self-adjoint, r(A) = ||A||.

Proof. Let h ∈ H with ||h|| = 1, then
$$|\langle Ah, h\rangle| \overset{CS}{\le} \|Ah\|\cdot\|h\| \le \|A\|\cdot\|h\|^2 = \|A\| \tag{283}$$
Suppose now that A is self-adjoint. For every g, h ∈ H, we have (by direct calculation)
⟨A(g + h), g + h⟩ − ⟨A(g − h), g − h⟩ = 4 Re ⟨Ah, g⟩ (284)
Therefore, using |⟨Ax, x⟩| ≤ r(A)||x||² for all x,
$$4|\mathrm{Re}\,\langle Ah, g\rangle| \le r(A)\|g+h\|^2 + r(A)\|g-h\|^2 \overset{(1)}{=} 2r(A)\big(\|g\|^2 + \|h\|^2\big) \tag{285}$$
where in (1) we used the parallelogram equality. Now let $g = \frac{Ah}{\|Ah\|}\|h\|$ (assuming Ah ≠ 0), we get
$$4\|Ah\|\cdot\|h\| \le 2r(A)\cdot 2\|h\|^2 \implies \|Ah\| \le r(A)\|h\| \tag{286}$$
Therefore ||A|| ≤ r(A).
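
Proposition 19.3 can be illustrated on 2×2 real matrices. The sketch below makes two simplifying assumptions that are not in the notes: it samples only real unit vectors h = (cos t, sin t) (enough for these two example matrices), and the matrices and function names are hypothetical.

```python
import math

def quad_form(A, h):
    """<Ah, h> for a real 2x2 matrix A and a real vector h."""
    Ah = [A[0][0] * h[0] + A[0][1] * h[1], A[1][0] * h[0] + A[1][1] * h[1]]
    return Ah[0] * h[0] + Ah[1] * h[1]

def numerical_radius_real(A, samples=10000):
    """Approximate sup |<Ah, h>| over real unit vectors h = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples
        best = max(best, abs(quad_form(A, [math.cos(t), math.sin(t)])))
    return best

symmetric = [[2.0, 1.0], [1.0, 2.0]]   # self-adjoint, eigenvalues 1 and 3, norm 3
nilpotent = [[0.0, 1.0], [0.0, 0.0]]   # not self-adjoint, norm 1

r_sym = numerical_radius_real(symmetric)   # close to 3, equal to the norm
r_nil = numerical_radius_real(nilpotent)   # close to 0.5, strictly below the norm
```

The second matrix shows that the equality r(A) = ||A|| can fail without self-adjointness.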

Proposition 19.4. Suppose A ∈ B(H) is self-adjoint, then at least one of ||A||, −||A|| is in the spectrum. Moreover,
if A is compact, then at least one of ||A||, −||A|| ∈ σp (A).

Proof. By definition and the previous proposition there is a sequence (hn)n∈N with ||hn|| = 1 such that
$$|\langle Ah_n, h_n\rangle| \xrightarrow{n\to\infty} r(A) = \|A\| \tag{287}$$

Note that ⟨Ahn, hn⟩ is real since A is self-adjoint, as we have
$$\langle Ah_n, h_n\rangle = \langle h_n, Ah_n\rangle = \overline{\langle Ah_n, h_n\rangle} \implies \langle Ah_n, h_n\rangle \in \mathbb{R} \tag{288}$$
By passing to a subsequence, we may assume that ⟨Ahn, hn⟩ → λ, where λ = ±||A|| (one of them). We have
$$\|Ah_n - \lambda h_n\|^2 = \|Ah_n\|^2 - 2\lambda\langle Ah_n, h_n\rangle + \lambda^2\|h_n\|^2 \le \|A\|^2\cdot\|h_n\|^2 - 2\lambda\langle Ah_n, h_n\rangle + \lambda^2\|h_n\|^2 \to \lambda^2\cdot 1 - 2\lambda\cdot\lambda + \lambda^2 = 0 \tag{289}$$
where we used the fact that ||hn|| = 1 for all n ∈ N, ||A||² = λ², and ⟨Ahn, hn⟩ → λ. We get that (A − λI)hn → 0 with ||hn|| = 1, therefore A − λI is not bounded below and hence not invertible by Proposition 16.2, hence λ ∈ σ(A) by definition.

Suppose now that A is compact (it is possible to use the Fredholm Alternative, but we shall present a different proof), then there is a subsequence (hnk) such that Ahnk → h, then λhnk = Ahnk − (A − λI)hnk → h − 0 = h, therefore $h_{n_k} \to \frac{1}{\lambda}h$ (suppose that λ ≠ 0, otherwise A = 0 and the claim is trivial). In particular ||h|| = |λ| ≠ 0, and by continuity
$$h = \lim_{k\to\infty} Ah_{n_k} = A\lim_{k\to\infty} h_{n_k} = \frac{1}{\lambda}Ah \implies Ah = \lambda h \implies \lambda \in \sigma_p(A) \tag{290}$$

Definition 19.5. Let A ∈ B(H), and let M ≤ H be a closed subspace. We say that M is A-Invariant if A(M ) ⊆ M .
We say that M is Reducing if A(M ) ⊆ M And A(M ⊥ ) ⊆ M ⊥ .

Proposition 19.6. If A is Self-Adjoint, every invariant subspace is reducing.

Proof. Let M be invariant, and let h ∈ M ⊥ . For any m ∈ M , we have


⟨Ah, m⟩ = ⟨h, Am⟩ = 0 (291)
since Am ∈ M, so h ⊥ Am. Therefore Ah ∈ M⊥, so M⊥ is invariant and M is reducing.

Theorem 19.7. (Spectral Theorem): Let A ∈ B(H) be compact and self-adjoint in a separable Hilbert space, then there exists an orthonormal basis $\{e_n\}_{n=1}^{\infty}$ such that each en is an eigenvector for an eigenvalue λn. Moreover, λn ∈ R and λn → 0.

Proof. By Proposition 19.4, we may take λ1 = ||A|| or −||A|| with some arbitrary unit eigenvector e1. Suppose by induction we have constructed e1, . . . , en, an orthonormal system of eigenvectors with eigenvalues λ1, . . . , λn ∈ R. Define M = span{e1, . . . , en}; this is clearly an invariant subspace, so by Proposition 19.6 it is reducing, so M⊥ is also invariant. Now consider A|M⊥ ∈ B(M⊥), then A|M⊥ is still a bounded, compact (since the unit ball of M⊥ is contained in H1), self-adjoint (since $(A|_{M^\perp})^* = A^*|_{M^\perp}$) operator on a separable Hilbert space, therefore again by Proposition 19.4 it has an eigenvalue R ∋ λn+1 = ±||A|M⊥||, so we may take a unit eigenvector en+1 ∈ M⊥. We have constructed an orthonormal system {en} of eigenvectors. Note that if dim H < ∞, the process will end at an orthonormal basis. If dim H = ∞ it remains to show that this is a basis. Note that by construction, |λn| is a monotonically decreasing sequence (since |λn| is the operator norm of A restricted to smaller and smaller subspaces), so it has some limit |λn| → c. Suppose BWOC that c > 0, then
$$\|Ae_n - Ae_m\|^2 = \|\lambda_ne_n - \lambda_me_m\|^2 = \lambda_n^2 + \lambda_m^2 \xrightarrow{m,n\to\infty} 2c^2 > 0 \tag{292}$$
So (Aen) has no convergent subsequence, a contradiction to compactness of A, therefore we must have λn → 0.

Define now M to be the closure of span(en)n∈N. Let h ∈ M⊥. If we let Mn = span{e1, . . . , en}, then $M_n \subseteq M \implies M_n^\perp \supseteq M^\perp$, therefore $h \in M_n^\perp$. We have
$$0 \le \|Ah\| = \big\|A|_{M_n^\perp}h\big\| \le \big\|A|_{M_n^\perp}\big\|\cdot\|h\| = |\lambda_{n+1}|\cdot\|h\| \xrightarrow{n\to\infty} 0 \tag{293}$$
Therefore Ah = 0, so any nonzero element of M⊥ is an eigenvector of 0. If we take any orthonormal basis {fn} of M⊥ (this is where we use separability, where we can construct such a basis by Theorem 6.10), then {en} ∪ {fn} is an orthonormal basis of eigenvectors.
Remark. If H is not separable, then by the proof there is still a separable space M which is reducing which has an
orthonormal basis of eigenvectors, and A|M⊥ ≡ 0.
Remark. Equivalent formulations of the theorem:
1. If [A] is the matrix of A according to an orthonormal basis of eigenvectors {en }, then
[A]ij = ⟨Aej , ei ⟩ = λj ⟨ej , ei ⟩ = λj δij (294)
So [A] is diagonal with the eigenvalues of A on the diagonal.

   In particular, this shows that for every compact self-adjoint operator A on a separable Hilbert space H, there exists a unitary operator U : ℓ2 → H such that
   U∗AU(a1, a2, . . .) = (λ1a1, λ2a2, . . .) (295)
   We've seen such operators are compact, so we see that in a separable Hilbert space these are, up to unitary equivalence, the only compact self-adjoint operators.
2. For any h ∈ H,
$$h = \sum_{n=1}^{\infty}\langle h, e_n\rangle e_n \implies Ah = \sum_{n=1}^{\infty}\lambda_n\langle h, e_n\rangle e_n \tag{296}$$
3. For any λ ∈ σp(A) denote Nλ = ker(A − λI). If we denote Eλ = {en : en ∈ Nλ}, then this is a basis of Nλ.

   Indeed, for any h ∈ Nλ we can write $h = \sum_{n=1}^{\infty}\langle h, e_n\rangle e_n = \sum_{e_n\in E_\lambda}\langle h, e_n\rangle e_n$ (since from linear algebra we know that Nλ ⊥ Nµ for any λ ≠ µ whenever A is self-adjoint), so Eλ is indeed an orthonormal basis of Nλ. Therefore for any h ∈ H
$$h = \sum_{n=1}^{\infty}\langle h, e_n\rangle e_n = \sum_{\lambda\in\sigma_p(A)}\ \sum_{e_n\in E_\lambda}\langle h, e_n\rangle e_n = \sum_{\lambda\in\sigma_p(A)} P_{N_\lambda}h \tag{297}$$
   So we have
$$H = \bigoplus_{\lambda\in\sigma_p(A)} N_\lambda \tag{298}$$
   (where we use the notation ⊕ in this case to mean that every element of H can be written as a unique convergent series of elements from the Nλ; namely, since the Nλ are orthogonal, it can be written as the series of its projections). And of course
$$Ah = \sum_{\lambda\in\sigma_p(A)}\lambda P_{N_\lambda}h \tag{299}$$
For each of these formulations, we've proven them given the spectral theorem, but the converse is easy, so they are all equivalent.

20 Spectral Theorem For Compact Normal Operators, Functional Calculus
20.1 Spectral Theorem For Normal Operators
Lemma 20.1. If A, B ∈ B(H) commute, i.e. AB = BA, then every eigenspace Nλ of A is invariant under B (And
vice-versa).

Proof. Let h ∈ Nλ , then we have


A(Bh) = B(Ah) = B(λh) = λBh =⇒ Bh ∈ Nλ (300)
Therefore Nλ is B-Invariant.

Theorem 20.2. Let H be separable, A, B ∈ B(H) commuting, self-adjoint compact operators, then there exists an
orthonormal basis {en } of eigenvectors of both A, B (i.e. they are simultaneously diagonalisable).

Proof. Write (by an equivalent formulation of the spectral theorem) $H = \bigoplus_{\lambda\in\sigma_p(A)} N_\lambda$. By the previous lemma, every Nλ is B-invariant, therefore we may use the spectral theorem on B|Nλ (which is a self-adjoint, compact operator, and Nλ is closed and therefore a Hilbert space), so we get
$$N_\lambda = \bigoplus_{\mu\in\sigma_p(B)} N_{\lambda,\mu} \tag{301}$$
where Nλ,µ = ker(A − λI) ∩ ker(B − µI). Therefore
$$H = \bigoplus_{\lambda\in\sigma_p(A),\ \mu\in\sigma_p(B)} N_{\lambda,\mu} \tag{302}$$
If we now take an orthonormal basis {en,λ,µ} for every Nλ,µ, then their union will be an orthonormal basis for H of eigenvectors of both A, B, as required.

Theorem 20.3. Let H, K be Hilbert, A ∈ B(H, K), then A is compact iff A∗ is compact.

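Proof. Suppose A is compact, then by Proposition 17.4 AA∗ ∈ B(K) is compact. Let (kn) be a bounded sequence in K, then there is a convergent subsequence (AA∗kni), and we have
$$\|A^*k_{n_i} - A^*k_{n_j}\|^2 = \langle A^*(k_{n_i} - k_{n_j}),\ A^*(k_{n_i} - k_{n_j})\rangle = \langle AA^*(k_{n_i} - k_{n_j}),\ k_{n_i} - k_{n_j}\rangle \overset{CS}{\le} \|AA^*k_{n_i} - AA^*k_{n_j}\|\cdot\|k_{n_i} - k_{n_j}\| \xrightarrow{i,j\to\infty} 0 \tag{303}$$
where the last factor stays bounded because (kn) is bounded. Therefore (A∗kni) is Cauchy and hence convergent, so A∗ is compact. The converse follows by applying this to A∗, since (A∗)∗ = A.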
Remark. We’ve in fact shown that A is compact iff A∗ is compact iff AA∗ is compact iff A∗ A is compact.

Proposition 20.4. For every A ∈ B(H), we can write A = A1 + iA2 , where A1 , A2 are self-adjoint. If A is compact,
then so are A1 , A2 , and this decomposition is unique.

Proof. If A = A1 + iA2 with A1, A2 self-adjoint, then A∗ = A1 − iA2, therefore we must have
$$A_1 = \frac{A + A^*}{2}, \qquad A_2 = \frac{A - A^*}{2i} \tag{304}$$
This shows uniqueness and in fact also existence (since A1, A2 defined as above are self-adjoint). Moreover, if A is compact, then A∗ is compact, so that A1, A2 are also compact as linear combinations of compact operators.

Proposition 20.5. A is normal iff A1 , A2 commute.

Proof.
$$AA^* = (A_1 + iA_2)(A_1 - iA_2) = A_1^2 + A_2^2 + i(A_2A_1 - A_1A_2), \qquad A^*A = (A_1 - iA_2)(A_1 + iA_2) = A_1^2 + A_2^2 + i(A_1A_2 - A_2A_1) \tag{305}$$
So AA∗ = A∗A iff A2A1 − A1A2 = A1A2 − A2A1 iff A1A2 = A2A1.
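
The decomposition (304) and the commutation criterion can be tried on small complex matrices. In this sketch the two example matrices and all helper names are assumptions made for illustration.

```python
def mul(P, Q):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(P):
    """Conjugate transpose."""
    return [[P[j][i].conjugate() for j in range(2)] for i in range(2)]

def combine(a, P, b, Q):
    """Entrywise a*P + b*Q."""
    return [[a * P[i][j] + b * Q[i][j] for j in range(2)] for i in range(2)]

def cartesian_parts(A):
    """Return A1 = (A + A*)/2 and A2 = (A - A*)/(2i), as in (304)."""
    A_star = adjoint(A)
    return combine(0.5, A, 0.5, A_star), combine(1 / (2j), A, -1 / (2j), A_star)

def commutator_size(P, Q):
    """Largest absolute value of an entry of PQ - QP."""
    D = combine(1, mul(P, Q), -1, mul(Q, P))
    return max(abs(D[i][j]) for i in range(2) for j in range(2))

normal = [[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]]   # assumed example with A A* = A* A = 2I
nilpotent = [[0j, 1 + 0j], [0j, 0j]]              # assumed example, not normal

size_normal = commutator_size(*cartesian_parts(normal))        # 0: parts commute
size_nilpotent = commutator_size(*cartesian_parts(nilpotent))  # 0.5: parts do not
```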

Theorem 20.6. (Spectral Theorem For Normal Operators): Let H be a separable Hilbert space, A ∈ B(H) normal and compact, then there exists an orthonormal basis {en} of eigenvectors. Moreover, the eigenvalues {λn} are complex and λn → 0.

Proof. Let A = A1 + iA2. These are commuting, self-adjoint, compact operators, therefore by Theorem 20.2 there is an orthonormal basis {en} of common eigenvectors, with A1en = αnen and A2en = βnen. Then Aen = A1en + iA2en = (αn + iβn)en, so {en} is an orthonormal basis of eigenvectors of A with eigenvalues λn = αn + iβn, and since αn, βn → 0 (as A1, A2 are compact), also αn + iβn → 0.
Remark. All the equivalent formulations hold as in the case of self-adjoint operators.

20.2 Functions of Operators


Let H be a separable Hilbert space, A ∈ B(H) compact and normal. If $p = \sum_{i=1}^{n} c_i x^i$ is a polynomial, we can define
$$p(A) = \sum_{i=1}^{n} c_i A^i \tag{306}$$
Let (en) be an orthonormal basis of eigenvectors, then for any k ∈ N, $A^k e_n = \lambda_n^k e_n$, so we have p(A)en = p(λn)en. This leads us to the following definition:

Definition 20.7. Let g : σp(A) → C be a bounded function. Define
$$g(A)h = \sum_{n=1}^{\infty} g(\lambda_n)\langle h, e_n\rangle e_n \tag{307}$$
Equivalently,
$$g(A)h = \sum_{\lambda\in\sigma_p(A)} g(\lambda)P_{N_\lambda}h \tag{308}$$

Remark. From the equivalent definition, we see that g(A) does not depend on the choice of basis (en ).

Theorem 20.8.
1. For every g : σp(A) → C bounded, g(A) is well-defined, bounded, normal, and
$$\|g(A)\| = \sup_{\lambda\in\sigma_p(A)}|g(\lambda)| \tag{309}$$

2. If f, g : σp (A) → C are bounded, then


(f + g)(A) = f (A) + g(A), (f · g)(A) = f (A) ◦ g(A), (ḡ)(A) = g(A)∗ (310)
3. If gn → g uniformly, then gn (A) → g(A) in norm.

Proof.
1. We must check that for every h ∈ H
$$\sum_{n=1}^{\infty}|g(\lambda_n)\langle h, e_n\rangle|^2 < \infty \tag{311}$$
   Indeed, let M be the supremum of |g|, then
$$\|g(A)h\|^2 = \sum_{n=1}^{\infty}|g(\lambda_n)\langle h, e_n\rangle|^2 \le M^2\sum_{n=1}^{\infty}|\langle h, e_n\rangle|^2 = M^2\|h\|^2 \tag{312}$$
   Moreover, we see that ||g(A)|| ≤ M. Conversely, we have
   |g(λn)| = ||g(A)en|| ≤ ||g(A)|| · ||en|| = ||g(A)|| (313)
   Taking a supremum over λn, we see that we in fact have ||g(A)|| = M.
2. We have
$$[(fg)(A)] = \mathrm{diag}(\ldots, fg(\lambda_{-1}), fg(\lambda_0), fg(\lambda_1), \ldots) = \mathrm{diag}(\ldots, f(\lambda_{-1}), f(\lambda_0), f(\lambda_1), \ldots)\,\mathrm{diag}(\ldots, g(\lambda_{-1}), g(\lambda_0), g(\lambda_1), \ldots) = [f(A)][g(A)] = [f(A)g(A)] \tag{314}$$
   Therefore (fg)(A) = f(A)g(A), and similarly for addition and conjugation. This shows in particular that g(A) is normal, since we have g(A)g(A)∗ = (gḡ)(A) = (ḡg)(A) = g(A)∗g(A).
3.
$$\|g_n(A) - g(A)\| \overset{(1)}{=} \|(g_n - g)(A)\| \overset{(2)}{=} \sup_{\lambda\in\sigma_p(A)}|g_n(\lambda) - g(\lambda)| \xrightarrow{n\to\infty} 0 \tag{315}$$
   where (1) is by part 2, (2) is by part 1.
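
In an eigenbasis, Definition 20.7 simply rescales coordinates, which makes the rules of Theorem 20.8 easy to test. The following sketch works with a truncated, assumed list of eigenvalues and represents a vector h by its coefficients ⟨h, en⟩; all names are hypothetical.

```python
def apply_function(g, eigenvalues, coeffs):
    """Coefficients of g(A)h in the eigenbasis, given the coefficients <h, e_n> of h."""
    return [g(lam) * c for lam, c in zip(eigenvalues, coeffs)]

eigenvalues = [1.0 / n for n in range(1, 6)]   # assumed truncated point spectrum
h = [1.0, -2.0, 0.5, 0.0, 3.0]                 # assumed coefficients of h

f = lambda x: x * x
g = lambda x: x + 1.0

# (f * g)(A) h versus f(A) (g(A) h), coordinate by coordinate.
lhs = apply_function(lambda x: f(x) * g(x), eigenvalues, h)
rhs = apply_function(f, eigenvalues, apply_function(g, eigenvalues, h))

# Operator norm of g(A) on this truncation: the largest |g(lambda)|.
op_norm = max(abs(g(lam)) for lam in eigenvalues)
```

Here `op_norm` is attained at the eigenvalue 1, in line with (309).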

21 Root of Positive-Semidefinite Operator
Definition 21.1. An operator A ∈ B(H) is called Positive-Definite if ⟨Ah, h⟩ > 0 for all 0 ≠ h ∈ H, and Positive Semidefinite if ⟨Ah, h⟩ ≥ 0 for all h ∈ H.

Theorem 21.2. Let A ∈ K(H) be positive semidefinite, then there exists a unique positive semidefinite operator B such that B² = A. Moreover, B ∈ K(H). We write B = √A and call it the Root of A.

Lemma 21.3. Every positive semidefinite operator is self-adjoint.

Proof. For every x ∈ H, we have
$$\langle Ax, x\rangle = \overline{\langle Ax, x\rangle} = \langle x, Ax\rangle = \langle A^*x, x\rangle \tag{316}$$
where we used the fact that ⟨Ax, x⟩ ∈ R. From this, we get that ⟨A∗x, y⟩ = ⟨Ax, y⟩ for all x, y ∈ H by the polarisation identity, valid for every T ∈ B(H) over C:
$$\langle Tx, y\rangle = \frac{1}{4}\sum_{k=0}^{3} i^k\,\langle T(x + i^k y),\ x + i^k y\rangle \tag{317}$$
Applying it to T = A and to T = A∗ gives ⟨Ax, y⟩ = ⟨A∗x, y⟩, therefore (A − A∗)x ⊥ H for every x, so that A − A∗ = 0, i.e. A = A∗ as required.

Proof. (Proof of Theorem): First, note that σp(A) ⊆ [0, ∞), since if h ∈ H is an eigenvector with eigenvalue λ, then
$$0 \le \langle Ah, h\rangle = \langle\lambda h, h\rangle = \lambda\|h\|^2 \implies \lambda \ge 0 \tag{318}$$
Define (see Definition 20.7) g : σp(A) → [0, ∞) by g(λ) = √λ. By the above g is well-defined, and it is bounded since the spectrum is bounded by Proposition 16.9: σp(A) ⊆ σ(A) ⊆ {|z| ≤ ||A||}, so g(σp(A)) ⊆ [0, √||A||]. By Theorem 20.8, we then get that B := g(A) is well-defined, bounded, and by definition we get
$$[B] = \mathrm{diag}\big(\ldots, \sqrt{\lambda_{-1}}, \sqrt{\lambda_0}, \sqrt{\lambda_1}, \ldots\big) \tag{319}$$
Therefore B is positive semidefinite, since
$$\langle Bh, h\rangle = \Big\langle\sum_n\sqrt{\lambda_n}\,\langle h, e_n\rangle e_n,\ \sum_m\langle h, e_m\rangle e_m\Big\rangle = \sum_n\sqrt{\lambda_n}\,|\langle h, e_n\rangle|^2 \ge 0 \tag{320}$$
And B is compact as √λn → 0, since we've seen in a previous example that every operator on ℓ2 defined by a sequence converging to 0 is compact, and B is unitarily equivalent to such an operator and hence itself compact. Finally, by Theorem 20.8, B² = g(A)² = (g²)(A) = Id(A) = A.

Uniqueness: Let C ∈ B(H) be such that C² = A, and C is positive semidefinite. Recall that from the spectral theorem
$$H = \bigoplus_{\lambda\in\sigma_p(A)} N_\lambda \tag{321}$$
where Nλ = ker(A − λI). Moreover, note that CA = CC² = C³ = C²C = AC, therefore Nλ is C-invariant as in the proof of Theorem 20.2. There are two cases:
1. λ ≠ 0, then by the Fredholm Alternative, dim Nλ < ∞, therefore C|Nλ ∈ B(Nλ) is a self-adjoint operator on a finite-dimensional space, hence diagonalisable. Let µ1, . . . , µm ≥ 0 be the eigenvalues of C|Nλ, then by hypothesis
   C²|Nλ = A|Nλ = λINλ (322)
   Therefore $\mu_i^2 = \lambda$ for all i, and since C is positive semidefinite and hence has nonnegative eigenvalues, we get
   C|Nλ = √λ INλ = B|Nλ (323)
2. λ = 0: For every h ∈ N0, we get
$$\|Ch\|^2 = \langle Ch, Ch\rangle = \langle C^2h, h\rangle = \langle Ah, h\rangle = 0 \tag{324}$$
   where we used the fact that positive semidefinite operators are self-adjoint, therefore Ch = 0, so C|N0 = 0 = B|N0.
Therefore C, B agree on every subspace in the direct sum decomposition, hence C = B as required.
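
The construction B = Σ √λ · P_{Nλ} can be carried out by hand for a small symmetric matrix. This is a sketch under assumptions: the matrix [[2, 1], [1, 2]] with its known eigenpairs and the helper names are chosen for illustration only.

```python
import math

def mul(P, Q):
    """Product of two 2x2 real matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def projection(v):
    """Orthogonal projection onto the span of a real unit vector v."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def from_spectral_data(g, eigenpairs):
    """Sum of g(lam) * P_v over eigenpairs (lam, v) with orthonormal vectors v."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in eigenpairs:
        P = projection(v)
        total = [[total[i][j] + g(lam) * P[i][j] for j in range(2)]
                 for i in range(2)]
    return total

s = 1 / math.sqrt(2)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]      # spectral data of [[2, 1], [1, 2]]

A = from_spectral_data(lambda lam: lam, eigenpairs)   # rebuilds the matrix itself
B = from_spectral_data(math.sqrt, eigenpairs)         # candidate square root
B_squared = mul(B, B)                                  # should agree with A
```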

22 The Fourier Transform
Definition 22.1. Let PCc(R) be the space of all piecewise continuous f : R → C (identified up to finitely many points of inequality) with compact support (i.e. the set {f ≠ 0} is bounded, or equivalently there is a finite interval [a, b] such that f ≡ 0 outside of [a, b]).

We define the standard L2 inner product on this space by
$$\langle f, g\rangle = \int_{-\infty}^{\infty} f(x)\overline{g(x)}\,dx \tag{325}$$
where this integral is in fact a proper Riemann integral as f, g are compactly supported.

Definition 22.2. L2 (R) is the completion of P Cc (R) w.r.t the above inner product.

Similarly, for all 1 ≤ p < ∞ define
$$\|f\|_p = \Big(\int_{-\infty}^{\infty}|f|^p\Big)^{1/p} \tag{326}$$
This is a norm on PCc(R) (the proofs of Hölder's and Minkowski's inequalities for sums pass identically to functions), hence we define its completion Lp(R).
Remark. Suppose that f : R → C is Riemann integrable on every finite interval such that
$$\int_{-\infty}^{\infty}|f|^p < \infty \tag{327}$$
(where here this is the improper Riemann integral), then we may "think" of f as an element of Lp(R). We find this element by taking a sequence fn ∈ PCc(R) such that
$$\int_{-\infty}^{\infty}|f_n - f|^p \to 0 \tag{328}$$
{fn} is Cauchy in Lp(R) and hence converges, and its limit can be thought of as f.
Proposition 22.3. The set of staircase functions $S = \{\sum_{k=1}^{n} c_k\chi_{[a_k,b_k]}\}$ is dense in Lp(R) for all 1 ≤ p < ∞.

Remark. In class, we remarked that it suffices to show density of $S$ in $PC_c(\mathbb{R})$, and showed a 'proof by picture' of approximating a piecewise continuous function on a compact interval by staircase functions (similarly to how one would approximate such a function using partitions and Riemann sums).
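The 'proof by picture' can be imitated numerically. The following is only a sketch under stated assumptions: the tent function, the midpoint heights and the helper names `staircase` and `lp_distance` are illustrative choices, not part of the lecture. It builds a staircase function on $n$ equal subintervals and estimates its $L^p$ distance from $f$ by a Riemann sum.

```python
def staircase(f, a, b, n):
    """Staircase function equal to f(midpoint) on each of n equal
    subintervals of [a, b], and 0 outside [a, b)."""
    h = (b - a) / n
    heights = [f(a + (k + 0.5) * h) for k in range(n)]

    def s(x):
        if x < a or x >= b:
            return 0.0
        # Index of the subinterval containing x (clamped for safety).
        return heights[min(int((x - a) / h), n - 1)]

    return s


def lp_distance(f, g, a, b, p, samples=20000):
    """Midpoint Riemann-sum estimate of the L^p distance of f and g on [a, b]."""
    h = (b - a) / samples
    total = 0.0
    for k in range(samples):
        x = a + (k + 0.5) * h
        total += abs(f(x) - g(x)) ** p
    return (total * h) ** (1.0 / p)


def tent(x):
    """Sample element of PC_c(R): the tent function supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


# L^2 errors for finer and finer staircases (hypothetical choice of n).
errors = [lp_distance(tent, staircase(tent, -1.0, 1.0, n), -1.0, 1.0, p=2)
          for n in (4, 16, 64)]
```

In this example the entries of `errors` should decrease as $n$ grows, in line with the density claim of Proposition 22.3.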
While the way we defined $L^p(\mathbb{R})$ does not allow us to treat its elements as functions, we would still like to think of them as functions, and hence perform functional operations on them:
1. Fix $a \in \mathbb{R}$, and define $\tau_a : PC_c(\mathbb{R}) \to PC_c(\mathbb{R})$ by $\tau_a(f)(x) = f(x - a)$. It is clear that $\|\tau_a f\|_p = \|f\|_p$, hence $\tau_a$ is an isometry and in particular bounded, so we may extend it by Theorem 9.3 to a map $\tau_a : L^p(\mathbb{R}) \to L^p(\mathbb{R})$. For $f \in L^p(\mathbb{R})$ we will occasionally write $\tau_a f = f(x - a)$ (of course, this is an abuse of notation, as elements of $L^p(\mathbb{R})$ by our definition are not functions which can be evaluated pointwise).
2. Fix $g : \mathbb{R} \to \mathbb{C}$ piecewise continuous and bounded, and define $m_g : PC_c(\mathbb{R}) \to PC_c(\mathbb{R})$ by $m_g f = gf$. Then
$$\|m_g f\|_p^p = \int_{-\infty}^{\infty} |gf|^p \le \|g\|_\infty^p \int_{-\infty}^{\infty} |f|^p = \|g\|_\infty^p \cdot \|f\|_p^p \tag{329}$$
Hence $\|m_g\|_{p \to p} \le \|g\|_\infty < \infty$ (where the notation $\|\cdot\|_{p \to p}$ means the operator norm of an operator $L^p(\mathbb{R}) \to L^p(\mathbb{R})$), so we may extend $m_g : L^p(\mathbb{R}) \to L^p(\mathbb{R})$. As before, we sometimes write $m_g f = gf$ by abuse of notation.
3. For $[a, b] \subseteq \mathbb{R}$, define $i : PC[a, b] \to PC_c(\mathbb{R})$ by
$$i(f)(x) = \begin{cases} f(x) & x \in [a, b] \\ 0 & \text{otherwise} \end{cases} \tag{330}$$
It's clear that $i$ is an isometry, and hence we may extend $i : L^p[a, b] \to L^p(\mathbb{R})$. Since $i$ is an isometric embedding, we may consider $L^p[a, b]$ as a (closed) subspace of $L^p(\mathbb{R})$ (where $L^p[a, b]$ is closed in $L^p(\mathbb{R})$ as it is isometrically isomorphic to a complete space, and a subspace of a complete space is closed iff it is complete).

Conversely, we may define $R : PC_c(\mathbb{R}) \to PC[a, b]$ by $Rf = f|_{[a,b]}$. Clearly $\|R\|_{p \to p} \le 1$ (as restriction can only reduce the norm of a function), hence we may extend $R : L^p(\mathbb{R}) \to L^p[a, b]$.
4. For $p = 1$, define $I : PC_c(\mathbb{R}) \to \mathbb{C}$ by
$$I(f) = \int_{-\infty}^{\infty} f \tag{331}$$

And we have
$$|I(f)| \le \int_{-\infty}^{\infty} |f| = \|f\|_1 \tag{332}$$
Therefore $\|I\| \le 1$, so we may extend $I : L^1(\mathbb{R}) \to \mathbb{C}$.
Exercise.
1. Identifying $L^p[a, b]$ with its isometric embedding in $L^p(\mathbb{R})$, we have $R = m_{\chi_{[a,b]}}$.
2. When $p = 2$, $R$ is the orthogonal projection onto $L^2[a, b]$.

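Definition 22.4. For $f \in L^1(\mathbb{R})$, define the Fourier transform of $f$:
$$\hat{f} : \mathbb{R} \to \mathbb{C}, \qquad \hat{f}(\omega) = \int_{-\infty}^{\infty} f(x) e^{-i\omega x}\,dx = I(m_{e^{-i\omega x}}(f)) \tag{333}$$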

Example. Let $f = \chi_{[a,b]}$, then
$$\hat{f}(\omega) = \int_a^b e^{-i\omega x}\,dx = \frac{e^{-i\omega b} - e^{-i\omega a}}{-i\omega} \tag{334}$$
for $\omega \neq 0$, and $\hat{f}(0) = b - a$.

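In particular, if $[-a, a]$ is a symmetric interval and $f = \chi_{[-a,a]}$, then
$$\hat{f}(\omega) = \frac{-e^{-i\omega a} + e^{i\omega a}}{i\omega} = 2\,\frac{\sin(a\omega)}{\omega} \tag{335}$$
for $\omega \neq 0$, where we used the identity $\sin x = \frac{e^{ix} - e^{-ix}}{2i}$.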
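As a sanity check of (334) and (335), one can compare the closed forms with a direct numerical integration. The following is a minimal sketch, assuming a plain midpoint rule is accurate enough; the function names are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def fourier_indicator_numeric(a, b, omega, n=4000):
    """Midpoint-rule approximation of the integral of e^{-i omega x} over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * h
        total += cmath.exp(-1j * omega * x)
    return total * h


def fourier_indicator_exact(a, b, omega):
    """Closed form (334); at omega = 0 the value is b - a."""
    if omega == 0:
        return complex(b - a)
    return (cmath.exp(-1j * omega * b) - cmath.exp(-1j * omega * a)) / (-1j * omega)


def fourier_symmetric_exact(a, omega):
    """Closed form (335) for the indicator of [-a, a]; at omega = 0 the value is 2a."""
    if omega == 0:
        return 2.0 * a
    return 2.0 * math.sin(a * omega) / omega
```

For instance, `fourier_indicator_numeric(-1.0, 2.0, 3.0)` and `fourier_indicator_exact(-1.0, 2.0, 3.0)` should agree up to the quadrature error, and for a symmetric interval the exact transform is real, matching (335).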

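Theorem 22.5. The Fourier transform $\mathcal{F}_1 : L^1(\mathbb{R}) \to C_0(\mathbb{R})$ (where $C_0(\mathbb{R})$ is the space of continuous functions vanishing at infinity with the sup-norm) given by $\mathcal{F}_1 f = \hat{f}$ is a well-defined bounded linear operator, and in fact $\|\mathcal{F}_1\| = 1$.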

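Proof. We've seen in the example above that $\widehat{\chi_{[a,b]}} \in C_0(\mathbb{R})$ for a bounded interval $[a, b]$, therefore by linearity we have an operator $\mathcal{F}_1 : S \to C_0(\mathbb{R})$ ($S$ is the set of staircase functions). For every $f \in S$, we have
$$\|\mathcal{F}_1 f\|_\infty = \sup_\omega |\hat{f}(\omega)| = \sup_\omega \left|\int_{-\infty}^{\infty} f(x) e^{-i\omega x}\,dx\right| \le \sup_\omega \int_{-\infty}^{\infty} |f(x)| \cdot \underbrace{|e^{-i\omega x}|}_{1}\,dx = \|f\|_1 \tag{336}$$
Therefore $\mathcal{F}_1$ is bounded on $S$ with norm at most 1. But $S$ is dense in $L^1(\mathbb{R})$, therefore we may extend $\mathcal{F}_1$ to an operator $\mathcal{F}_1 : L^1(\mathbb{R}) \to C_0(\mathbb{R})$ with $\|\mathcal{F}_1\| \le 1$. Note that for $f = \chi_{[a,b]}$ we have $\|f\|_1 = b - a$ and $\|\mathcal{F}_1 f\|_\infty \ge |\hat{f}(0)| = b - a$, so $\|\mathcal{F}_1\| \ge 1$, and so we have equality.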

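Theorem 22.6. (Properties of the Fourier Transform): For all $f \in L^1(\mathbb{R})$, the following hold:
1. $\mathcal{F}_1(f(x - a))(\omega) = e^{-i\omega a}\,\hat{f}(\omega)$
2. $\mathcal{F}_1(e^{iax} f)(\omega) = \hat{f}(\omega - a)$
3. $\mathcal{F}_1(f(ax))(\omega) = \frac{1}{a}\,\hat{f}\!\left(\frac{\omega}{a}\right)$ for $a > 0$
4. If $f \in L^1(\mathbb{R}) \cap C^1(\mathbb{R})$, then $\widehat{f'} = i\omega \hat{f}$
5. If $f \in C_c(\mathbb{R})$ then $\hat{f}$ is differentiable, and
$$\mathcal{F}_1(ixf)(\omega) = -\frac{d}{d\omega}\hat{f}(\omega) \tag{337}$$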
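Properties 1 to 3 can be checked numerically on a concrete compactly supported function. The sketch below is only an illustration under assumptions: the tent function, the parameter values `a = 0.7`, `omega = 1.3`, and the helper `fourier` (a midpoint-rule quadrature) are hypothetical choices, and property 3 is tested for $a > 0$ only.

```python
import cmath


def tent(x):
    """Sample element of PC_c(R): the tent function supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


def fourier(f, omega, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of f(x) e^{-i omega x}
    over [lo, hi]; the interval must contain the support of f."""
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * omega * x)
    return total * h


def tent_hat(omega):
    """Numerical Fourier transform of the tent function."""
    return fourier(tent, omega, -1.0, 1.0)


a, omega = 0.7, 1.3

# Property 1: translation by a becomes multiplication by e^{-i omega a}.
shift_lhs = fourier(lambda x: tent(x - a), omega, -1.0 + a, 1.0 + a)
shift_rhs = cmath.exp(-1j * omega * a) * tent_hat(omega)

# Property 2: multiplication by e^{iax} becomes translation by a.
mod_lhs = fourier(lambda x: cmath.exp(1j * a * x) * tent(x), omega, -1.0, 1.0)
mod_rhs = tent_hat(omega - a)

# Property 3 (a > 0): dilation of the argument.
dil_lhs = fourier(lambda x: tent(a * x), omega, -1.0 / a, 1.0 / a)
dil_rhs = tent_hat(omega / a) / a
```

Each `*_lhs` value should match the corresponding `*_rhs` value up to rounding, since the quadrature grids on both sides are related by the same change of variables used in the proofs.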

Proof. Proof of 1: Let $f \in PC_c(\mathbb{R})$, then substituting $y = x - a$, $dy = dx$ we have
$$\mathcal{F}_1(f(x - a))(\omega) = \int_{-\infty}^{\infty} f(x - a) e^{-i\omega x}\,dx = \int_{-\infty}^{\infty} f(y) e^{-i\omega(y + a)}\,dy = e^{-i\omega a}\,\mathcal{F}_1(f)(\omega) \tag{338}$$
Hence the identity is true for $f \in PC_c(\mathbb{R})$. Note that for $f \in L^1$, the identity in 1 is actually the functional identity
$$\mathcal{F}_1(\tau_a f) = m_{e^{-i\omega a}}\,\mathcal{F}_1(f) \tag{339}$$
But this is true for all $f \in PC_c(\mathbb{R})$, hence by continuity the identity holds on $L^1$. 2 and 3 are proven by similar continuity arguments. 4 is by integration by parts (where we take $f \in L^1(\mathbb{R}) \cap C^1(\mathbb{R})$ to mean that $f$ is the $L^1$-norm limit of functions in $PC_c(\mathbb{R})$). Note that in particular, this means that $\|f\|_1 = \int_{-\infty}^{\infty} |f| < \infty$.

Proof of 5: Let us calculate
$$\frac{\hat{f}(\omega + h) - \hat{f}(\omega)}{h} = \frac{1}{h}\left(\int_{-\infty}^{\infty} f(x) e^{-i(\omega + h)x}\,dx - \int_{-\infty}^{\infty} f(x) e^{-i\omega x}\,dx\right) = \int_{-\infty}^{\infty} f(x) e^{-i\omega x}\,\frac{e^{-ihx} - 1}{h}\,dx \xrightarrow{h \to 0} \int_{-\infty}^{\infty} f(x) e^{-i\omega x}(-ix)\,dx = \widehat{-ixf}(\omega) \tag{340}$$
In the penultimate step we were able to interchange limit and integral: since $f$ is compactly supported, the integral is in fact an integral over some finite interval, and $\frac{e^{-ihx} - 1}{h} \to -ix$ uniformly in $x$ on any compact interval, so the integrands converge uniformly on that interval and the limit may be taken inside the integral.

Remark. In 5, it is actually sufficient that $f \in C_0(\mathbb{R})$ and $x^2 f \in L^1(\mathbb{R})$ (using stronger convergence theorems from measure theory).

Definition 22.7. Let $f, g \in PC_c(\mathbb{R})$, define their convolution to be
$$(f * g)(x) = \int_{-\infty}^{\infty} f(t) g(x - t)\,dt \tag{341}$$

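Example. Let $f = \chi_{[-M,M]}$, $g = \chi_{[-N,N]}$, $M \ge N$, then
$$(f * g)(x) = \int_{-\infty}^{\infty} \chi_{[-M,M]}(t)\,\chi_{[-N,N]}(x - t)\,dt = \int_{-\infty}^{\infty} \chi_{[-M,M]}(t)\,\chi_{[x-N,x+N]}(t)\,dt = \operatorname{len}([-M, M] \cap [x - N, x + N]) \tag{342}$$
where $\operatorname{len}$ denotes the length of the interval (with the length of the empty set taken to be 0). This is in fact a continuous function of $x$.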
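The closed form (342) can be compared against a direct Riemann-sum evaluation of (341). This is a minimal sketch, assuming a midpoint rule over an interval containing the support of $f$; the helper names `indicator`, `convolve_at` and `overlap_length` are hypothetical.

```python
def indicator(lo, hi):
    """Indicator function of the closed interval [lo, hi]."""
    return lambda t: 1.0 if lo <= t <= hi else 0.0


def convolve_at(f, g, x, lo, hi, n=20000):
    """Midpoint-rule approximation of (f * g)(x), integrating f(t) g(x - t)
    over [lo, hi]; the interval must contain the support of f."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        t = lo + (k + 0.5) * h
        total += f(t) * g(x - t)
    return total * h


def overlap_length(M, N, x):
    """Length of the intersection of [-M, M] and [x - N, x + N], as in (342)."""
    return max(0.0, min(M, x + N) - max(-M, x - N))


M, N = 2.0, 1.0
f = indicator(-M, M)
g = indicator(-N, N)
```

Plotting `overlap_length(M, N, x)` against `x` gives a trapezoid: it vanishes for $|x| \ge M + N$, equals $2N$ for $|x| \le M - N$, and is linear in between, which makes the continuity claim visible.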

Theorem 22.8. (Properties of Convolution): Let $f, g, h \in PC_c(\mathbb{R})$, the following hold:

1. $f * g \in C_c(\mathbb{R})$
2. $\|f * g\|_\infty \le \|f\|_\infty \cdot \|g\|_1$ and $\|f * g\|_\infty \le \|f\|_1 \cdot \|g\|_\infty$
3. $\|f * g\|_1 \le \|f\|_1 \cdot \|g\|_1$
4. $f * g = g * f$
5. $f * (g * h) = (f * g) * h$
6. $f * (g + h) = f * g + f * h$
7. $(\tau_a f) * g = \tau_a(f * g)$
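Properties 2, 3 and 4 have exact analogues for Riemann sums on a uniform grid, which makes them easy to test numerically. The following sketch is an illustration under assumptions: the step `h`, the two sample functions and the helper names are hypothetical, and the discrete sums only approximate the integrals in the theorem.

```python
def discrete_convolution(f_vals, g_vals, h):
    """Riemann-sum analogue of (341) for samples on a uniform grid of step h.
    Entry i + j collects the contributions f(t_i) * g(s_j) * h."""
    out = [0.0] * (len(f_vals) + len(g_vals) - 1)
    for i, fv in enumerate(f_vals):
        for j, gv in enumerate(g_vals):
            out[i + j] += fv * gv * h
    return out


def norm_1(vals, h):
    """Riemann-sum approximation of the L^1 norm."""
    return h * sum(abs(v) for v in vals)


def norm_sup(vals):
    """Maximum of |v| over the samples."""
    return max(abs(v) for v in vals)


h = 0.01
grid = [-1.0 + k * h for k in range(201)]        # sample points in [-1, 1]
f_vals = [1.0 - abs(x) for x in grid]            # tent function
g_vals = [x if x >= 0 else -0.5 for x in grid]   # a piecewise continuous function

fg = discrete_convolution(f_vals, g_vals, h)
gf = discrete_convolution(g_vals, f_vals, h)
```

Here `fg` has 401 samples, corresponding to the interval $[-2, 2]$, which reflects the support statement used in the proof of property 1; `fg` and `gf` agree up to rounding, and the discrete norms satisfy the same inequalities as in properties 2 and 3.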

Proof. Note that we've seen 4, 5 for periodic functions. 6, 7 are simple calculations.

Proof of 2:
$$|(f * g)(x)| = \left|\int_{-\infty}^{\infty} f(t) g(x - t)\,dt\right| \le \int_{-\infty}^{\infty} |f(t)| \cdot |g(x - t)|\,dt \le \|g\|_\infty \int_{-\infty}^{\infty} |f(t)|\,dt = \|g\|_\infty \cdot \|f\|_1 \tag{343}$$
Taking the supremum over $x$ we get the required inequality (the second case is completely symmetrical, up to the linear change of variables $y = x - t$).

Proof of 3:
$$\|f * g\|_1 = \int_{-\infty}^{\infty} \left|\int_{-\infty}^{\infty} f(t) g(x - t)\,dt\right| dx \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |f(t)| \cdot |g(x - t)|\,dx\,dt = \int_{-\infty}^{\infty} |f(t)| \cdot \|g\|_1\,dt = \|f\|_1 \cdot \|g\|_1 \tag{344}$$
where we exchanged the order of integration and implicitly used a linear change of variable in the inner integral, which does not change the limits of integration.

Proof of 1: If $f$ is supported on $[-N, N]$ and $g$ on $[-M, M]$, then $f * g$ is supported on $[-M - N, M + N]$, and hence compactly supported.

We've seen that in the case $f = \chi_{[-N,N]}$, $g = \chi_{[-M,M]}$ the convolution is continuous, hence by property 7 this is true if $f, g$ are indicators of arbitrary intervals, and therefore by property 6 and homogeneity of the convolution (which is obvious) this is true for staircase functions. Choose $f_n, g_n \in S$ such that $f_n \to f$, $g_n \to g$ in $L^1$. By the triangle inequality, then property 6, then property 2, we have
$$\|f * g - f_n * g_n\|_\infty \le \|f * g - f_n * g\|_\infty + \|f_n * g - f_n * g_n\|_\infty = \|(f - f_n) * g\|_\infty + \|f_n * (g - g_n)\|_\infty \le \|f_n - f\|_1 \cdot \|g\|_\infty + \|f_n\|_\infty \cdot \|g_n - g\|_1 \tag{345}$$
We may construct the $f_n$ to have their sup-norm bounded by that of $f$, so that both terms tend to 0 and $f_n * g_n$ converges uniformly to $f * g$. Hence, since the $f_n * g_n$ are continuous, so is $f * g$.
