Symmetric and self-adjoint matrices
A matrix $A$ in $M_n(\mathbb{F})$ is called symmetric if $A^T = A$, i.e. $A_{ij} = A_{ji}$ for each
$i, j$; and self-adjoint if $A^* = A$, i.e. $A_{ij} = \overline{A_{ji}}$ for each $i, j$. Note for $A$ in
$M_n(\mathbb{R})$ that $A^T = A^*$.
Notice that if $\mathbb{F} = \mathbb{R}$, then $A$ is symmetric if and only if $(Ax, y) = (x, Ay)$
for each $x, y$ in $\mathbb{R}^n$. Observe that the set
\[ \mathrm{Sym}_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) : A^T = A\} \]
is a linear subspace of $M_n(\mathbb{R})$.
A matrix $A$ in $M_n(\mathbb{C})$ is called self-adjoint or hermitian if $A^* = A$. Notice
that $A$ is hermitian if and only if $(Ax, y) = (x, Ay)$ for each $x, y$ in $\mathbb{C}^n$.
Observe that the set
\[ \mathrm{Herm}(n) = \{A \in M_n(\mathbb{C}) : A^* = A\} \]
is an $\mathbb{R}$-linear subspace of $M_n(\mathbb{C})$. Note that $\mathrm{Sym}_n(\mathbb{R}) \subset \mathrm{Herm}(n)$.
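The identity $(Ax, y) = (x, Ay)$ can be checked numerically on a small example. The sketch below assumes the inner product $(x, y) = \sum_i x_i \overline{y_i}$ (linear in the first slot, as used in the proofs that follow); the matrix `A` and the vectors `x`, `y` are arbitrary illustrative choices, not taken from the notes.

```python
# Illustrative check that a hermitian matrix satisfies (Ax, y) = (x, Ay).
# Assumption: (x, y) = sum_i x_i * conj(y_i).

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def adjoint(A):
    # Conjugate transpose of a square matrix.
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

A = [[2, 1 - 1j], [1 + 1j, 3]]   # hypothetical hermitian example
x = [1 + 2j, -1j]
y = [3, 1 - 1j]

lhs = inner(matvec(A, x), y)     # (Ax, y)
rhs = inner(x, matvec(A, y))     # (x, Ay)
print(adjoint(A) == A, lhs == rhs)
```

Only small integer entries are used, so the floating-point complex arithmetic here is exact.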
Consider the sets of (real) orthogonal and unitary matrices:
\[ \mathrm{O}(n) = \{u \in M_n(\mathbb{R}) : u^T u = I\} \quad\text{and}\quad \mathrm{U}(n) = \{u \in M_n(\mathbb{C}) : u^* u = I\}. \]
Clearly, $\mathrm{O}(n) \subset \mathrm{U}(n)$.
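Remark. It is easy to see that $u$ in $M_n(\mathbb{R})$ (respectively, in $M_n(\mathbb{C})$) is
in $\mathrm{O}(n)$ (respectively, is in $\mathrm{U}(n)$) if and only if the columns of $u$:
\[ u^{(1)} = \begin{pmatrix} u_{11} \\ \vdots \\ u_{1n} \end{pmatrix}, \ \dots, \ u^{(n)} = \begin{pmatrix} u_{n1} \\ \vdots \\ u_{nn} \end{pmatrix} \]
form an orthonormal basis for $\mathbb{R}^n$ (respectively $\mathbb{C}^n$).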
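The remark amounts to saying that the entries of $u^T u$ are the pairwise inner products of the columns of $u$. A minimal sketch in exact rational arithmetic, with an illustrative matrix `u` that is an assumption of this example:

```python
from fractions import Fraction as F

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical example: columns (3/5, 4/5) and (-4/5, 3/5).
u = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]

# Entry (i, j) of u^T u is the inner product of column i with column j.
gram = matmul(transpose(u), u)
print(gram == [[1, 0], [0, 1]])
```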
Lemma. If $A \in M_n(\mathbb{R})$ admits a real eigenvalue, then there is a corresponding
real eigenvector.
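Proof. Let $\lambda$ be a real eigenvalue of $A$ and $z \neq 0$ in $\mathbb{C}^n$ be an eigenvector.
Write $x_j = \operatorname{Re} z_j$ and $y_j = \operatorname{Im} z_j$, so $z = x + iy$ where $x, y \in \mathbb{R}^n$. Then
\[ \lambda x + i\lambda y = \lambda z = Az = Ax + iAy. \]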
Comparing the real and imaginary parts of each component of the above equality
gives $Ax = \lambda x$ and $Ay = \lambda y$. Since $z \neq 0$, at least one of $x$ or $y$ is non-zero,
and it is a real eigenvector.
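As a small worked instance of the lemma (the matrix and the complex eigenvector below are illustrative choices, not from the notes), only one of the two real parts may be non-zero:

```latex
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad \lambda = 1, \quad
z = \begin{pmatrix} i \\ 0 \end{pmatrix} = x + iy, \quad
x = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad
y = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
% Az = z, hence Ax = x and Ay = y; here x = 0, so the real eigenvector is y.
```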
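Diagonalization Theorem (i) If $A \in \mathrm{Herm}(n)$, then the eigenvalues of
$A$ are real. Furthermore, for any two distinct eigenvalues $\lambda, \mu$ of $A$ with
corresponding eigenvectors $x$ and $y$ in $\mathbb{C}^n$, we have $(x, y) = 0$.
(ii) If $A \in M_n(\mathbb{R})$ (respectively, is in $M_n(\mathbb{C})$) then $A \in \mathrm{Sym}_n(\mathbb{R})$ (respectively,
is in $\mathrm{Herm}(n)$) if and only if there is $u$ in $\mathrm{O}(n)$ (respectively, $u$ in
$\mathrm{U}(n)$) for which
\[ u^* A u = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix} \]
where $\lambda_1, \dots, \lambda_n$ are real, and are the eigenvalues of $A$ (with multiplicity).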
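Proof. (i) Let $\lambda$ be an eigenvalue of $A$ with corresponding eigenvector $x \neq 0$
in $\mathbb{C}^n$. Then
\[ \lambda(x, x) = (\lambda x, x) = (Ax, x) = (x, Ax) = (x, \lambda x) = \bar{\lambda}(x, x). \]
Dividing by $(x, x)$ we see that $\lambda = \bar{\lambda}$.
If eigenvalues $\lambda \neq \mu$ of $A$ correspond to eigenvectors $x$ and $y$ then, since $\mu$ is real,
\[ \lambda(x, y) = (Ax, y) = (x, Ay) = \mu(x, y) \]
so $(\lambda - \mu)(x, y) = 0$, and hence $(x, y) = 0$.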
(ii) Notice that sufficiency in both the real symmetric and hermitian cases
is trivial.
Let us consider necessity. Let $\mu_1, \dots, \mu_k$ denote the full set of distinct eigenvalues
of $A$ with respective eigenspaces $E_j = \ker_{\mathbb{F}^n}(\mu_j I - A)$, $j = 1, \dots, k$.
Since $E_i \perp E_j$ for $i \neq j$, as observed in (i), we may find an orthonormal basis
\[ u^{(1)} = \begin{pmatrix} u_{11} \\ \vdots \\ u_{1n} \end{pmatrix}, \ \dots, \ u^{(m)} = \begin{pmatrix} u_{m1} \\ \vdots \\ u_{mn} \end{pmatrix} \]
for $E = E_1 + \dots + E_k$ such that $u^{(1)}, \dots, u^{(n_1)}$ is a basis for $E_1$, $u^{(n_1+1)}, \dots, u^{(n_1+n_2)}$
is a basis for $E_2$, ..., and $u^{(n_1+\dots+n_{k-1}+1)}, \dots, u^{(m)}$ is a basis for $E_k$. (Here each
$n_j = \dim_{\mathbb{F}} E_j$.) For $j = 1, \dots, k$, let $P_j$ in $M_n(\mathbb{F})$ be the matrix corresponding
to the orthogonal projection onto $E_j$, i.e. with entries
\[ P_{j,i'j'} = \Big( \sum_{l=1+n_1+\dots+n_{j-1}}^{n_1+\dots+n_j} (e_{j'}, u^{(l)})\, u^{(l)}, \ e_{i'} \Big) = \sum_{l=1+n_1+\dots+n_{j-1}}^{n_1+\dots+n_j} (e_{j'}, u^{(l)})(u^{(l)}, e_{i'}) \]
which is easily seen to satisfy $P_j = P_j^*$. Let $B = A - \sum_{j=1}^{k} \mu_j P_j$, which is
in $\mathrm{Herm}(n)$ (respectively, in $\mathrm{Sym}_n(\mathbb{R})$ if $\mathbb{F} = \mathbb{R}$) and satisfies $E \subseteq \ker_{\mathbb{F}^n} B$.
If $B \neq 0$, then it admits an eigenvalue $\mu \neq 0$ so (i) provides that its corresponding
eigenvector $x \neq 0$ is in $E^\perp$. But then $\mu x = Bx = Ax$ (as each $P_j x = 0$), which means
that $\mu$ is one of $\mu_1, \dots, \mu_k$, above, contradicting that $x \notin E$. Hence $B = 0$.
Further, if $x \in E^\perp$, then $Ax = Bx = 0$, so $x \in E \cap E^\perp$, so $x = 0$. Thus
$E = \mathbb{F}^n$, i.e. $m = n$.
Let $u$ denote the matrix with columns $u^{(1)}, \dots, u^{(n)}$, which, by the remark
above, is unitary (and orthogonal if $\mathbb{F} = \mathbb{R}$); and let $\lambda_1, \dots, \lambda_n$ be the respective
eigenvalues $\mu_1$ ($n_1$ times), ..., $\mu_k$ ($n_k$ times). Let $e_1, \dots, e_n$ denote the standard basis for $\mathbb{F}^n$. Then
$ue_j = u^{(j)}$ and we have that
\[ (u^* A u e_j, e_i) = (A u^{(j)}, u e_i) = \lambda_j (u^{(j)}, u^{(i)}) = \lambda_j (e_j, e_i) \]
for each $i, j = 1, \dots, n$, so $u^* A u$ admits the claimed diagonal form.
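To make the diagonal form concrete, the following sketch verifies $u^T A u = \mathrm{diag}(\lambda_1, \lambda_2)$ for one real symmetric matrix. The matrix `A` and the orthogonal matrix `u` are illustrative assumptions, chosen so that all arithmetic is exact with rationals; nothing here computes eigenvectors, it only checks a diagonalization supplied by hand.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical symmetric matrix with eigenvalues 25 and 50.
A = [[41, -12],
     [-12, 34]]

# Columns of u: unit eigenvectors (3/5, 4/5) for 25 and (-4/5, 3/5) for 50.
u = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]

D = matmul(transpose(u), matmul(A, u))   # u^T A u
print(D == [[25, 0], [0, 50]])
```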
A representation of a symmetric/hermitian matrix. The proof above
tells us that if $\mu_1, \dots, \mu_k$ are the distinct eigenvalues of a symmetric (respectively,
hermitian) matrix $A$ and $P_1, \dots, P_k$ are the matrices representing the
orthogonal projections onto the respective eigenspaces $E_1, \dots, E_k$ (which span
all of $\mathbb{F}^n$), then
\[ A = \sum_{j=1}^{k} \mu_j P_j \quad\text{where}\quad I = \sum_{j=1}^{k} P_j. \tag{$\dagger$} \]
Since $E_i \perp E_j$ if $i \neq j$, $P_i P_j = 0 = P_j P_i$.
Hence if $p(X) = \sum_{l \geq 0} a_l X^l$ is any polynomial, we have
\[ p(A) = \sum_{l \geq 0} a_l A^l = \sum_{j=1}^{k} p(\mu_j) P_j \]
where $A^0 = I$, by convention.
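The representation of $A$ as a combination of orthogonal projections, and the resulting formula for $p(A)$, can be checked on a small case. In this sketch the matrix `A`, its eigenvalues `mu`, the projections `P`, and the polynomial `p` are all illustrative assumptions; the projections are built as $vv^T$ from unit eigenvectors $v$ supplied by hand.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def scale(c, X):
    return [[c * a for a in row] for row in X]

def outer(v):
    # v v^T: orthogonal projection onto the line spanned by a real unit vector v.
    return [[a * b for b in v] for a in v]

I2 = [[1, 0], [0, 1]]
A = [[41, -12], [-12, 34]]                 # hypothetical; eigenvalues 25 and 50
mu = [25, 50]
P = [outer([F(3, 5), F(4, 5)]), outer([F(-4, 5), F(3, 5)])]

# A = mu_1 P_1 + mu_2 P_2, with P_1 + P_2 = I and P_1 P_2 = 0.
spectral_sum = add(scale(mu[0], P[0]), scale(mu[1], P[1]))

def p(x):
    # Example polynomial p(X) = X^2 - 3X + 2.
    return x * x - 3 * x + 2

p_of_A = add(add(matmul(A, A), scale(-3, A)), scale(2, I2))
p_via_projections = add(scale(p(mu[0]), P[0]), scale(p(mu[1]), P[1]))
print(spectral_sum == A, p_of_A == p_via_projections)
```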
Lemma. Given $A$ and $P_1, \dots, P_k$ as above, another matrix $B$ commutes with
$A$, i.e. $[A, B] = AB - BA = 0$, if and only if $[P_j, B] = 0$ for each $j$.
Proof. Sufficiency is evident from (†).
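To see necessity, let for each $j$
\[ p_j(A) = \frac{(A - \mu_1 I) \cdots (A - \mu_{j-1} I)(A - \mu_{j+1} I) \cdots (A - \mu_k I)}{(\mu_j - \mu_1) \cdots (\mu_j - \mu_{j-1})(\mu_j - \mu_{j+1}) \cdots (\mu_j - \mu_k)}. \]
Clearly $[p_j(A), B] = 0$, since $B$ commutes with $A$. Let $x \in E_i = \ker_{\mathbb{F}^n}(A - \mu_i I)$. Then
\[ p_j(A)x = \begin{cases} 0 & \text{if } i \neq j \\ x & \text{if } i = j. \end{cases} \]
Hence, since $\mathbb{F}^n = E_1 + \dots + E_k$, $p_j(A)^2 = p_j(A)$. Also, it is obvious that
$p_j(A)^* = p_j(A)$. Thus $p_j(A) = P_j$, and so $[P_j, B] = 0$.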
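For a concrete instance of the interpolation formula for $p_j(A)$, take the illustrative symmetric matrix below (an assumption of this example), whose distinct eigenvalues are $\mu_1 = 25$ and $\mu_2 = 50$ with unit eigenvectors $(3/5, 4/5)^T$ and $(-4/5, 3/5)^T$:

```latex
A = \begin{pmatrix} 41 & -12 \\ -12 & 34 \end{pmatrix}, \qquad
p_1(A) = \frac{A - 50 I}{25 - 50}
       = \frac{1}{25}\begin{pmatrix} 9 & 12 \\ 12 & 16 \end{pmatrix} = P_1, \qquad
p_2(A) = \frac{A - 25 I}{50 - 25}
       = \frac{1}{25}\begin{pmatrix} 16 & -12 \\ -12 & 9 \end{pmatrix} = P_2.
% Check: P_1 + P_2 = I, P_1 P_2 = 0, and 25 P_1 + 50 P_2 = A.
```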
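Simultaneous Diagonalization Theorem. If $A, B \in \mathrm{Sym}_n(\mathbb{R})$ (or in
$\mathrm{Herm}(n)$), and $[A, B] = 0$, then there is $v$ in $\mathrm{O}(n)$ (respectively, in $\mathrm{U}(n)$) for
which
\[ v^* A v = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix} \quad\text{and}\quad v^* B v = \begin{pmatrix} \nu_1 & 0 & \cdots & 0 \\ 0 & \nu_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \nu_n \end{pmatrix} \]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ and $\nu_1, \dots, \nu_n$ are the eigenvalues
of $B$ (each with multiplicity).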
Proof. Consider the representations of $A$ and $B$ as in (†):
\[ A = \sum_{j=1}^{k} \mu_j P_j \quad\text{and}\quad B = \sum_{i=1}^{k'} \mu'_i P'_i. \]
Since $[A, B] = 0$, the lemma above provides that $[A, P'_i] = 0 = [P_j, B]$ for
any $i, j$, and the lemma again provides that $[P_j, P'_i] = 0$ for any $i, j$. Hence
each $P_j P'_i$ is self-adjoint and squares to itself, and is hence the orthogonal
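projection onto $E_{ij} = \ker_{\mathbb{F}^n}(A - \mu_j I) \cap \ker_{\mathbb{F}^n}(B - \mu'_i I)$. Further (†) provides
that
\[ \sum_{j=1}^{k} \sum_{i=1}^{k'} P_j P'_i = \Big( \sum_{j=1}^{k} P_j \Big) \Big( \sum_{i=1}^{k'} P'_i \Big) = I \tag{$*$} \]
so
\[ A = AI = \sum_{j=1}^{k} \sum_{i=1}^{k'} \mu_j P_j P'_i \quad\text{and}\quad B = IB = \sum_{j=1}^{k} \sum_{i=1}^{k'} \mu'_i P_j P'_i. \]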
Take orthonormal bases for each of the non-zero spaces $E_{ij}$ and combine them
into an orthonormal basis
\[ v^{(1)} = \begin{pmatrix} v_{11} \\ \vdots \\ v_{1n} \end{pmatrix}, \ \dots, \ v^{(n)} = \begin{pmatrix} v_{n1} \\ \vdots \\ v_{nn} \end{pmatrix} \]
for $\mathbb{F}^n$ (this is possible by (∗)). Let $v$ be the matrix with columns $v^{(1)}, \dots, v^{(n)}$,
and we obtain the desired diagonal forms.
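A minimal sketch of simultaneous diagonalization for one commuting pair. The matrices `A`, `B` and the orthogonal matrix `v` are illustrative assumptions; in this 2-by-2 example $A$ has distinct eigenvalues, so the common eigenbasis is forced, which is the simplest case of the theorem.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical commuting symmetric matrices.
A = [[41, -12], [-12, 34]]     # eigenvalues 25, 50
B = [[2, 36], [36, 23]]        # eigenvalues 50, -25

commute = matmul(A, B) == matmul(B, A)

# Columns of v: a common orthonormal eigenbasis, supplied by hand.
v = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]

DA = matmul(transpose(v), matmul(A, v))   # v^T A v
DB = matmul(transpose(v), matmul(B, v))   # v^T B v
print(commute, DA, DB)
```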
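Corollary. $A$ in $M_n(\mathbb{C})$ is normal, i.e. $[A, A^*] = 0$, if and only if there is $v$
in $\mathrm{U}(n)$ for which
\[ v^* A v = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix} \]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ (with multiplicity).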
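Proof. Sufficiency being evident, we show only necessity. Let
\[ \operatorname{Re} A = \frac{1}{2}(A + A^*) \quad\text{and}\quad \operatorname{Im} A = \frac{1}{2i}(A - A^*) \]
so $\operatorname{Re} A, \operatorname{Im} A \in \mathrm{Herm}(n)$ and $A = \operatorname{Re} A + i \operatorname{Im} A$. It is easy to verify that $A$ is
normal if and only if $[\operatorname{Re} A, \operatorname{Im} A] = 0$. Hence simultaneous diagonalization,
above, provides the necessary unitary diagonalizing matrix $v$.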
No real analogue. The real matrix $J_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is normal, but admits
only purely imaginary eigenvalues, and hence cannot be diagonalized by orthogonal
matrices (i.e. unitary matrices with real entries).
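The matrix $J_2$ is nevertheless unitarily diagonalizable over $\mathbb{C}$, as the corollary predicts. The sketch below uses an unnormalized eigenvector matrix `w` (an illustrative choice, with columns $(1, i)^T$ and $(1, -i)^T$) to avoid square roots; dividing `w` by $\sqrt{2}$ would give a unitary $v$ with $v^* J_2 v = \mathrm{diag}(i, -i)$.

```python
def adjoint(M):
    # Conjugate transpose of a square matrix.
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

J2 = [[0, 1], [-1, 0]]

# Normality: J2 J2* = J2* J2.
normal = matmul(J2, adjoint(J2)) == matmul(adjoint(J2), J2)

# Columns of w are eigenvectors for the eigenvalues i and -i.
w = [[1, 1],
     [1j, -1j]]

gram = matmul(adjoint(w), w)                   # equals 2 I
diag = matmul(adjoint(w), matmul(J2, w))       # equals 2 diag(i, -i)
print(normal, gram, diag)
```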
Real skew-symmetric matrices. A matrix $B$ in $M_n(\mathbb{R})$ is skew-symmetric
if $B^T = -B$.
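Real Skew-symmetric Block Diagonalization Theorem. If $B^T = -B$
in $M_n(\mathbb{R})$ then there is $u$ in $\mathrm{O}(n)$ and $\lambda_1, \dots, \lambda_m > 0$ in $\mathbb{R}$ such that
\[ u^T B u = \begin{pmatrix} \lambda_1 J_2 & 0 & \cdots & \cdots & \cdots & 0 \\ 0 & \ddots & & & & \vdots \\ \vdots & & \lambda_m J_2 & & & \vdots \\ \vdots & & & 0 & & \vdots \\ \vdots & & & & \ddots & \vdots \\ 0 & \cdots & \cdots & \cdots & \cdots & 0 \end{pmatrix} \]
where $J_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$; in particular $2m \leq n$.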
Proof. First, notice that $iB \in \mathrm{Herm}(n)$ and hence has real eigenvalues, so $B$
has purely imaginary eigenvalues (possibly including 0). In particular, any
real eigenvector of $B$ lies in $\ker B$.
Consider $B^T B$, which is in $\mathrm{Sym}_n(\mathbb{R})$. Any eigenvalue $\mu$ of $B^T B$ with eigenvector
$x$ in $\mathbb{R}^n \setminus \{0\}$ satisfies
\[ \mu(x, x) = (B^T B x, x) = (Bx, Bx) \geq 0 \]
so $\mu \geq 0$. Let $\mu_1, \dots, \mu_l$ denote the distinct non-zero eigenvalues of $B^T B$ and
$E_j$ the eigenspace of $\mu_j$, so $E_i \perp E_j$ for $i \neq j$ and each $E_j \perp \ker B^T B$. Let
$E = E_1 + \dots + E_l$.
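Let $x \in E_j \setminus \{0\}$ and $V_x = \mathbb{R}x + \mathbb{R}Bx$. Then
\[ B(Bx) = B^2 x = -B^T B x = -\mu_j x \in V_x \]
and it follows that $V_x$ is $B$-invariant, i.e. $BV_x \subseteq V_x$. Furthermore $\dim_{\mathbb{R}} V_x = 2$
as $B$ admits no non-zero real eigenvalues and $\ker B = \ker B^T B$. Also $V_x \subseteq E_j$, since
$B^T B = -B^2$ commutes with $B$ and hence $B^T B(Bx) = \mu_j Bx$. Now if
$y \in E_j \setminus \{0\}$ with $y \perp V_x$ then
\[ (By, x) = (y, B^T x) = -(y, Bx) = 0 \quad\text{and}\quad (By, Bx) = (y, B^T B x) = \mu_j (y, x) = 0 \]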
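so $V_y \perp V_x$. Hence we may build an orthonormal basis $u_1^{(j)}, \dots, u_{l_j}^{(j)}$ for $E_j$
such that each $u_{2i}^{(j)} \in V_{u_{2i-1}^{(j)}}$ and $V_{u_{2i-1+2i'}^{(j)}} \perp V_{u_{2i-1}^{(j)}}$ for $i' = 1, \dots, \lfloor l_j/2 \rfloor$
(whenever $2i - 1 + 2i' \leq l_j$), and, in particular, $l_j$ is even.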
Putting everything together we have that $\dim E = \sum_{j=1}^{l} \dim E_j$ is even,
and we can find an orthonormal basis $u_1, \dots, u_{2m}, u_{2m+1}, \dots, u_n$ for $\mathbb{R}^n$ for
which the spaces $V_j = V_{u_{2j-1}}$ are pairwise orthogonal and span $E$, and
$u_{2m+1}, \dots, u_n \in \ker B$. Letting $u$ be the matrix whose columns are $u_1, \dots, u_n$ we
find that
\[ u^T B u = \begin{pmatrix} B_1 & 0 & \cdots & \cdots & \cdots & 0 \\ 0 & \ddots & & & & \vdots \\ \vdots & & B_m & & & \vdots \\ \vdots & & & 0 & & \vdots \\ \vdots & & & & \ddots & \vdots \\ 0 & \cdots & \cdots & \cdots & \cdots & 0 \end{pmatrix} \]
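where each $B_j \in M_2(\mathbb{R})$. Since $(u^T B u)^T = u^T B^T u = -u^T B u$ we find that
each block must have the form $B_j = \lambda_j J_2$ with $\lambda_j$ in $\mathbb{R} \setminus \{0\}$, and by applying
block permutations we may assume $\lambda_j > 0$ (swapping the two basis vectors of a block
replaces $\lambda_j$ by $-\lambda_j$). (One may further check that the
values $\lambda_1, \dots, \lambda_m$ are the values $\sqrt{\mu_1}, \dots, \sqrt{\mu_l}$ with multiplicities.)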
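The block form can be checked by hand on a 3-by-3 example. In the sketch below, the skew-symmetric matrix `B` (the cross-product matrix of $w = (1, 2, 2)$) and the orthonormal vectors `u1`, `u2`, `u3` are illustrative assumptions chosen so that everything is rational; here $B^T B$ has the single non-zero eigenvalue $\mu = 9$, so one expects a single block $3 J_2$.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical skew-symmetric matrix: x -> w cross x with w = (1, 2, 2).
B = [[0, -2, 2],
     [2, 0, -1],
     [-2, 1, 0]]

u1 = [F(2, 3), F(-2, 3), F(1, 3)]     # unit vector orthogonal to w
u2 = [F(-2, 3), F(-1, 3), F(2, 3)]    # equals -B u1 / 3, so it lies in V_{u1}
u3 = [F(1, 3), F(2, 3), F(2, 3)]      # w / |w|, spans ker B

u = transpose([u1, u2, u3])           # matrix whose columns are u1, u2, u3

block_form = matmul(transpose(u), matmul(B, u))   # u^T B u
BtB_u1 = matmul(matmul(transpose(B), B), transpose([u1]))
print(block_form)
```

With these choices the result is $3 J_2$ in the top-left corner and zeros elsewhere, consistent with $\lambda_1 = \sqrt{9}$.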
Remark. The complex analogue of this result is much easier. If $B \in M_n(\mathbb{C})$
with $B^* = -B$ we call $B$ skew-hermitian. Notice that $iB \in \mathrm{Herm}(n)$ and
is hence unitarily diagonalizable with real eigenvalues, so $B$ too is unitarily
diagonalizable but with purely imaginary eigenvalues.
Written by Nico Spronk, for use by students of PMath 763 at
University of Waterloo.