0% found this document useful (0 votes)
10 views18 pages

Matrix Notes

Chapter 2 revisits matrices, emphasizing their role in representing graphs and discussing their properties, such as transposition and multiplication. It introduces concepts like linear combinations, linear independence, and vector spaces, while also explaining symmetric and diagonal matrices. The chapter concludes with exercises to reinforce understanding of these concepts, including the Cauchy-Schwarz inequality and orthogonal vectors.

Uploaded by

Rajesh Kanna
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views18 pages

Matrix Notes

Chapter 2 revisits matrices, emphasizing their role in representing graphs and discussing their properties, such as transposition and multiplication. It introduces concepts like linear combinations, linear independence, and vector spaces, while also explaining symmetric and diagonal matrices. The chapter concludes with exercises to reinforce understanding of these concepts, including the Cauchy-Schwarz inequality and orthogonal vectors.

Uploaded by

Rajesh Kanna
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Chapter 2

A warmup with matrices

“It is not knowledge, but the act of learning, not the possession of but the act of getting
there, which grants the greatest enjoyment.” Carl Friedrich Gauss

Most of the content of this chapter is a revision of what we already know. The
discussion is brief but sufficient for our purpose. There are many ‘why?’, consider them
as interesting exercises, it will help revising the concepts better. Ideally we should have
started with the introduction to graphs, but we will begin with the revision of matrices,
as they are a very natural way to represent a graph.

2.1 Matrices

Let us recall our school days where we have studied the matrices, the arrangment of
numbers (entries) in rows and columns. For example following is a matrix
 
1 0
 
Q=
−1 1 .
0 −1

The entries of a matrix can be complex numbers but we will restrict ourselves to real
numbers, in fact almost always to just integers. An another matrix
" #
1 −1 0
0 1 −1

9
has the same entries as Q but the arrangement is different; the rows of the Q become
corresponding columns of the above matrix. By making rows of a matrix M correspond-
ing columns we get another matrix called as transpose of M , denoted M T . The above
matrix is QT . We see in Q there are three rows and two columns. In general when a
matrix has m rows and n column we say the matrix has the order or the dimension
m × n, so Q has order 3 × 2. In particular when m = n, we say the matrix is a square
matrix of order n. A matrix of order 1 × n is called a row vector of order n; similarly, an
n × 1 matrix is a column vector of order n1 . The set of all the column vectors of order n
having real entries is denoted by Rn . We denote the i-th row of a matrix M by M (i, :),
and the i-th column by M (:, i). The entry that is in row M (i, :) and column M (:, j) is
denoted by M (i, j). If i = j, we call M (i, j) a diagonal entry, else nondiagonal entry.
Often we write M = (mij ) to denote the matrix where M (i, j) = mij .
Recall matrix product of two matrices. The product of Q with QT , that is, QQT is the
matrix  
1 −1 0
 
L=
−1 2 .
−1
0 −1 1

Check whether QQT = QT Q?...Big no!. Matrix product is probably one of the first
few counterintuitive notions in our school. Firstly, for any two matrices A, B it is not
necessary that both AB and BA are defined. Second, even if they both are defined it is
not necessary that AB = BA, that is, it is not commutative.
Now let us come back to L. Apart from being a square matrix L has an special
arrangement of entries; it is transpose of itself, that is, L = LT . Such a matrix is
called a symmetric matrix. Symmetric matrices are very interesting and useful matrices,
they are used in plethora of applications. A trivial symmetric matrix where all the
nondiagonal entries are zero is called a diagonal matrix. We denote a diagonal matrix
D = (dij ) of order n by diag(d11 , . . . , dnn ), where dii is the i-th diagonal entry. For
example diag(1, 2, 1) is the matrix
 
1 0 0
 
D=
0 2 0 .

0 0 1

An typical example of a diagonal matrix is the identity matrix In = diag(1T ), where 1


is all-one vector of order n.
1
When we simply say a matrix M is of order n, it means M is a square matrix of order n × n, and
when we say a vector of order n × 1, it means a column vector order n, that is, a matrix of order n × 1.
Although sometimes by a vector v we mean a row vector but it will be clear from the context.

10
Exercise 2.1.1. Prove that for any two square matrices of order n if AB = In , then
BA = In .

y y y

v v v
u
u
u
x x x

(a) Dependent vectors (b) Independent vectors (c) Orthogonal vectors

Figure 2.1: Different possibilities for vectors u, v. If u, v are orthogonal then they are
also independent.

2.2 Matrix multiplication as linear combinations

The multiplication of a matrix with


" #a column vector is quite interesting. Consider Q
2
and multiply it by column vector . Most likely someone will write immediately that
5
     
1 " #0 (1 × 2) + (0 × 5) 2
−1 1  2 = (−1 × 2) + (1 × 5) =  3  .
     
  5    
0 −1 (0 × 2) + (−1 × 5) −5

It is correct, now let us see the above product with a different way; the output vector
[2, 3, −5]T can be written as    
1 0
   
2
−1 + 5  1  ,
  
0 −1

that is, it is the result of adding 2 times the first column


" of
# Q with 5 times the second
2 3
column. What if we multiply Q with the matrix ? The first column of the
5 2
resulting matrix will be as above but the second column will be
   
1 0
   
3−1
 
 + 2 1 .
 
0 −1

11
Suppose we have vectors v1 , . . . , vn and scalars α1 , . . . , αn , not all zero, then the vector
α1 v1 + · · · + α2 vn is said to be a linear combination of v1 , . . . , vn . The scalars α1 , . . . , αn ,
are the coefficients of the linear combination.
Suppose we have matrices A of order m × n and B of order n × p. Its now easy to see
that in AB the columns are the linear combinations of the columns of A and coefficients
are given by the columns of B. We can think of product AB as linear combinations
of rows rather than columns. In AB the rows are the linear combinations of the rows
of B and coefficients are given by the rows of A. We will see in many occasions that
visualising the matrix product as linear combinations of rows or columns is quite handy.

Exercise 2.2.1. Let


   
2 −1 3 3 0 −1
   
A=
4 −1 0  , B = 2 1 −3 .
  
0 3 1 5 1 −1

Write down AB as (a) linear combination of rows, (b) linear combination of columns.

2.3 Linear independence, vector space, basis

Carrying forward the discussion on linear combination let us recall an interesting


phenomenon of linear dependency of vectors. A set of vectors v1 , . . . , vn is said to be
linearly dependent set, if there exist scalars α1 , . . . , αn , not all zero, such that

α1 v1 + · · · + α2 vn = 0. (2.1)

Check that columns of Q and the output vector [2, 3, −5]T form a set of linearly dependent
vectors. If the vectors v1 , . . . , vn are not dependent, they are called independent vectors.
For a given set of vectors V = {v1 , . . . , vk } we can form a set S of all the vectors which
are some linear combination of vectors in V . The( "set
# S" is# called
" # )the vector space of
1 0 1
V . For example the vector space for a set V = , , is the 2D space as
0 2 2
every vector in 2D(plane
" # "can # )written as some combination of vectors in V . But see
1 0
that the set V 0 = , can also generate (or span) the 2D space using linear
0 2
" #
1
combinations of its vectors. In this context the vector is redundant. For a vector
2
space a maximal independent set of vectors whose linear combinations can generate all
vectors in the vector space is called it basis. In the above example V 0 is basis for 2D

12
while V is not a basis as it is not a maximal independent set. Note that there can be
several basis for a vector space, but the number of vectors in every basis is the same.
This number is called the dimension of the vector space.

Exercise 2.3.1. Check whether the following vectors are linear independent or dependent
    
2 0 4
     
−1 ,  3  , 0 .
     
0 2.5 1

Exercise 2.3.2. Find a maximal set of pair-wise independent vectors from the following
vectors.          
1 1 2 −1 −1
         
 1 ,  0  , 1 , −1 ,  0 .
         
−1 −1 0 1 1
Exercise 2.3.3. Find a value of x for which the following set of vectors form a basis
for R3 .      
( 1 x 1 )
V0 = 
     
 0  ,  0  , x
    
x 0 1

Cauchy–Schwarz inequality, orthogonal vectors

Following inequality about the dot product of two vectors is quite helpful to know. The
proof is given in Appendix 19.

Cauchy–Schwarz inequality

Theorem 2.3.4. For any two vectors u, v ∈ Rn


√ √
|uT v| ≤ uT u v T v,


where, |.| denotes the absolute value. The quantity uT u is known as 2-norm of u
and is denoted by ||u||. Thus Cauchy–Schwarz inequality is often stated as

|uT v| ≤ ||u||||v||.

Exercise 2.3.5. Prove that equality in the Cauchy-Schwarz inequality holds if and only
if u = cv, where c is some scalar.

13
It is quite fruitful to know that if the dot product of two nonzero vectors u, v, that
is, uT v is 0, then they are always independent (why?), such vectors are known as
orthogonal vectors. Recall our 3D-coordinate space with famous x-axis, y-axis, z-axis,
pick one vector along each of these axes, check that they are pairwise orthogonal. In
general, if v1 , . . . , vn are pairwise orthogonal vectors, they form an independent set of
vectors (why?). Corresponding to a vector space if a basis has vectors which are all pair
wise orthogonal to each other, then this basis is called as orthogonal basis (its callled
orthonormal basis if in addition every vector has norm 1.)

Exercise 2.3.6. Find a vector v ∈ Rn which is orthogonal to the vectors


   
2 2
   
−1 , 7 .
   
3 1

Exercise 2.3.7. Prove that vectors u, v are orthogonal if and only if ||u + v|| = ||u − v||.

Exercise 2.3.8. Find a value of x for which the following set of vectors form an
orthogonal basis for R3 . Convert the orthogonal basis into a orthonormal basis.
     
( x −2 0 )
     
V = 
1 ,  x  ,  1  .
    
1 1 −x

2.4 Row space, column space, rank


Consider a general m × n matrix M with columns v1 , . . . , vn , and a nonzero vector
x = [α1 , . . . , αn ]T , we can write

M x = α1 v1 + · · · + αn vn .

So v1 , . . . , vn , M x are linearly dependent, that is, M x is a linear combination of v1 , . . . , vn .


The set of all possible vectors which can be written as a linear combination of columns
(rows) of M is called column (row) space of M . The dimension of column (row) space of
a matrix is called its column (row) rank. Suppose M is 20 × 50 matrix. What could be
the maximum row rank of M ? Its obvious that it could be at most 20. But what about
column rank; what could be the maximum value of column rank of M ? You might be
tempted to say that it can be at most 50, but it is significantly wrong!. The column
rank will also be at most 20. In fact even more surprise lies ahead. Suppose the row
rank of M is 15, what can we say about the column rank of M ? The answer is that the

14
column rank must also be 15. For any matrix the row rank and the column rank are
the same. This number is called the rank of M , denoted rank M .

2.4.1 Why row rank = column rank?

Consider a m × n matrix with columns v1 , . . . , vn . We can write


 
| | ... |
 
M =
v 1 v 2 . . . v n
.

| | ... |

Suppose the column rank of M is p. That means there exist basis vectors b1 , . . . , bp
and each column vector of M can be written as some linear combination of these
basis vectors. Suppose vi = α1i b1 + · · · + αki bp , (the coefficients α1i , . . . αpi can not be
simultaneously zero) and let ai = [α1i , α2i , . . . , αpi ]T , i = 1, . . . , n. Then we can write
    
| | ... | | | ... | | | ... |
    
M =
v 1 v 2 . . . vn
 = b1 b2 . . . bp a1 a2 . . . an .
   
| | ... | | | ... | | | ... |
| {z }| {z }
B A

Now think about the row rank of matrix M . Observe that any row of M is a linear
combination of rows in A and the coefficients are given by the rows in B. (In particular,
i-th row of M is a linear combination of rows of A, and the i-th row of B give the
corresponding coefficients.) This implies that row rank of M is at most p which is the
column rank.
Now suppose the row rank of M is q. Now visualise M in terms of its rows. Let
u1 , u2 , . . . , um be its rows. As the row rank is q, there exist basis vectors y1 , . . . , yq
and each row vector of M can be written as some linear combination of these basis
vectors. Suppose ui = βi1 y1 + · · · + βik yq , (again, the coefficients βi1 , . . . βiq can not be
simultaneously zero) and let xi = [βi1 , βi2 , . . . , βiq ], i = 1, . . . , m. Then we can write
    
− u1 − − x1 − − y1 −
    
− u2 − − x2 −− y2 −
M =. =.
    
 .. .. ..  . .. .. 
 ...
 .. .. 
 . . 
 
 . . .  . . 
− um − − xm − − yq −
| {z }| {z }
X Y

Now think about the column rank of matrix M . Observe that any column of M is a
linear combination of columns in X and the coefficients are given by the columns in Y .

15
(In particular, i-th column of M is a linear combination of columns of X, and the i-th
column of Y gives the corresponding coefficients.) This implies that column rank of M
is at most q which is the row rank.
Combining the facts that the row rank can be at most column rank, and column rank
can be at most rank rank we arrive at the conclusion that both the ranks must be equal.

Theorem 2.4.1. For any matrix the row rank and the column rank are the same.

Exercise 2.4.2. Find the row rank and the column rank of the following matrix and
verify that both are the same " #
1 −1 0
.
2 −1 4

Exercise 2.4.3. Prove that if M = BC, then rank M ≤ min(rank B, rank C).

2.5 Null Space


Consider the case when M x is a zero vector, that is,

M x = 0. (2.2)

Is this possible even if M, x are nonzero matrices? Think about two real number a, b such
that ab = 0. We know that this is possible when only at least one of a, b is zero. Can we
generalize this to matrices and say that at least one of M, x must be a zero matrix for
M x = 0? No!. Multiply QT by all-one vector [1, 1, 1]T it gives a zero vector [0, 0, 0]T .
The set of all x satisfying Equation (2.2) is called null space of M . Vector [1, 1, 1]T is in
the null space of QT . The minimal number of vectors whose linear combinations form
the null space of M is called the nullity of M , denoted, null M . Another fundamental
fact is

rank M + null M = n,

see Appendix 19 for a proof. Next, suppose x is a nonzero vector, under what condition
M x = 0? Notice that this is the same as the form (2.1). When M is a square matrix
what does this imply? Here comes our old strange old mate, the determinant! It is
possible only when the determinant of M , denoted det M is zero. Recall that M x = 0

16
Figure 2.2: 2D plane: on projecting (multiplying) each point in the left shaded region
with area a by matrix A and we get the shaded region on right with the area | det(A)|a.
(In 3D it happens for volume.)

says that the columns of M are linearly dependent which implies det M = 0, to figure
out why? Its time to revisit our old mate, the determinant. We had already seen
the applications of determinants in solving the system of linear equations, finding the
area of 2D figures and volume of 3D figures. In, fact determinants can determine the
areas or volumes of figures resulting after linear transformation, see Figure 2.2. We will
see many more applications of determinants. The following quote on determinant by
mathematician J. J. Sylvester tells its great importance.

For what is the theory of determinants? It is an algebra upon algebra; a calculus which
enables us to combine and foretell the results of algebraical operations, in the same way
as algebra itself enable us to dispense with the performance of the special operations of
arithmetic. All analysis must ultimately clothe itself under this form.

2.6 Determinant (definition by Laplace expansion)

We know that " #


a b
det = ad − bc (2.3)
c d

and

17
 
a b c " # " # " #
  e f d f d e
d e f  = a det h i − b det g i + c det g h
det  
(2.4)
g h i
= aei − af h − bdi + bgf + cdh − ceg.

In the above equation we expanded the determinant of the 3 × 3 matrix with respect to
the first row. This method is called Laplace expansion which is more formally defined
as follows. Let M = (mij ) be a matrix of order n, and let M (i|j) be the matrix that
results from M by removing i-th row and j-th column. Then for any i the following
recurrence relation holds

n
X
det M = (−1)i+j mij det M (i|j). (2.5)
j=1

2.6.1 Elementary matrices

Recall that the following elementary row operations; these are at the heart of determi-
nant.

1. Exchanging two rows: the determinant change its sign.

2. Multiplying a row by a nonzero constant c: the determinant get multiplied by c.

3. Adding to a row a multiple of another row: the determinant is unchanged.

If the rows (hence the columns) of M are linear dependent, by using these row operations
we can make a row to be zero, which implies det M = 0. Note that all the three row
operations on any matrix leave the rank of the matrix unchanged.
A matrix resulting after applying one row operation to the identity matrix I is known
as an elementary matrix. Those which involve switching rows of the identity matrix are
called permutation matrices. Performing any of the three row operations on a matrix A
is equivalent to take the product EA, where E is the elementary matrix obtained by
using the desired row operation on the identity matrix. Thus

rank EA = rank A (2.6)

18
6 0 (why?). Another important fact that
(similarly rank AE = rank A). Also det E =
follows from the elementary row operations is that for any square matrix M

det EM = det E det M. (2.7)

Recall that if M is full rank, that is, all the rows are independent, then there exist
elementary matrices, E1 , . . . , Ek such that

Ek . . . E1 M = I. (2.8)

The matrix Ek . . . E1 is known as the inverse of M , denoted M −1 . Thus using Equation


2.7 det M 6= 0. For a square matrix M the inverse exists if and only det M =
6 0, in this
case we call M to be invertible. Since inverse of an elemantary matrix is an elementary
matrix, using Equation 2.8 any invertible matrix can be written as the product of
elemenatry matrices. Thus using 2.6 we see that for any invertible matrix M ,

rank M A = rank A, rank BM = rank B, (2.9)

where A, B are any suitable order matrices. The above discussion leads to the following
important relation on determinant and determinant of the product of two square matrices
A, B of the same order.

det AB = det A det B. (2.10)

Exercise 2.6.1. Find the null spaces of the following n × n matrices


     
1 1 ... 1 0 1 ... 1 n 1 ... 1
     
1 1 . . . 1 1 0 . . . 1  1 n ... 1
 , . . . , 
     
. . .
 .. .. . . ..   .. .. . . ..   ... ...
. . ..
.
.. 
.
     
1 1 ... 1 1 1 ... 0 1 1 ... n

Exercise 2.6.2. A minor of a p×q matrix M is the determinant of any k ×k submatrix


of M . Prove that rank M equals the size of largest nonzero minor of M .

Exercise 2.6.3. Using determinants find the rank of the following matrix.
 
1 1 2 3
 
4 1 5 6
 
7 1 8 9

19
Exercise 2.6.4. If M is a nonsingular matrix, prove that det A−1 = 1
det A .

Exercise 2.6.5. Let A and C be square matrices. Prove that


" #
A B
det = det A det C.
0 C

Exercise 2.6.6. Prove that if M is a p × q matrix, then det M T M ≥ 0.

Exercise 2.6.7. Prove that for a p × q matrix M , det M T M > 0 if and only if
rank M = q.

2.7 Eigenvalues and eigenvectors

Till now we have seen plenty of examples for multiplying a matrix with a vector. If M is
a square matrix of order n and x is a vector of order n, then M x is some vector of order
n. Usually x and M x are oriented in different directions. But there are special vectors
x such that both x and M x have the same direction or just opposite direction. In other
words, M x is some scaling of x. These special vectors are known as eigenvectors of M ,
and the scaling factor is known as the eigenvalue. Formally we have

M x = λx. (2.11)

The scalar λ is the eigenvalue, and x is a corresponding eigenvector. Notice that any
nonzero multiple of x is still an eigenvector corresponding to the eigenvalue λ. Thus
most often, we consider x having unit length, that is, xT x = 1, also called a normal
vector.
y y y Mu
Mu  
−1
u=
  1  
1 1
u= u=
1 0
x x x
Mu

(a) u is an eigenvector, λ = 3 (b) u is an eigenvector, λ = −1 (c) u is not an eigenvector


 
1 2
Figure 2.3: Here M = . Observe that when u is an eigenvector u and M u are
2 1
along the same line, that is, M u is a scaled version of u.

20
The eigenvalues, eigenvectors play a significant role in understanding various phenomena
in many branches of science. How to find out the eigenvalues and eigenvectors? In
Equation (2.11) the only thing that we know is M ; the other two are unknown. But
somehow, if we know one of λ, x we can know the other. We can write

(M − λI)x = 0.

It has the same form as Equation (2.2). It implies that det(M − λI) = 0. This gives a
polynomial of λ having degree n, we call this polynomial as the characteristic polynomial.
The n roots λ1 , . . . , λ2 of this polynomial are the eigenvalues of M . The set of all the
eigenvalues are known as the spectrum of the matrix. We may factor the characteristic
polynomial as
det(M − λI) = (λ1 − λ) . . . (λn − λ).

The relationship between the eigenvalues and the determinants of submatrices is quite
crucial to know. A principal submatrix of M is a submatrix formed by a set of rows
and the corresponding set of columns. A principal minor of M is the determinant of a
principal submatrix. We state the following theorem which tells a relationship between
eigenvalues and the principal minors. A proof is given in Chapter 19 (not yet).

Theorem 2.7.1. The sum of product of the eigenvalues, of M , taken k at a time


equals the sum of k × k principal minors of M .

For k = 1, this implies that the sum of all the eigenvalues is equal to the trace of M (the
sum of the diagonal entries). For k = n, this implies that the product of the eigenvalues
is equal to the determinant of M .

" #
1 2
Exercise 2.7.2. Find the eigenvalues of in terms of k. Also find an eigenvector
k 1
corresponding to each of the eigenvalues.

Exercise 2.7.3. Find the eigenvalues and eigenvectors of the following n × n matrices
       
n 0 ... 0 0 0 ... 0 0 1 ... 1 n 1 ... 1
       
0 n . . . 0 0 0 . . . 0  1 0 . . . 1 1 n ... 1
, , ,
       
. . . . ..  . . .
 .. .. . . ... 
. . .
 .. .. . . ... 
. . .. .. 
 .. .. . .  .. .. . .
       
0 0 ... 0 0 0 ... n 1 1 ... 0 1 1 ... n

21
Exercise 2.7.4. Prove that, if v1 , v2 , . . . , vk are the eigenvectors of M associated with
the same eigenvalue λ, then any nonzero linear combination

v = α1 v1 + α2 v2 + · · · + αk vk

is also an eigenvector associated with λ.

Exercise 2.7.5. For a square matrix M , prove that M and M T have the same eigen-
values.

Exercise 2.7.6. For any Am×n , Bn×m matrices, prove that the nonzero eigenvalues of
AB and BA are the same.

2.8 Eigenvalues, eigenvectors of symmetric matrices


Symmetric matrices display very interesting properties with the eigenvalues and eigenvec-
tor. The eigenvalues of a symmetric matrix are real (why?). (Note that an eigenvector
of a symmetric matrix can have complex entries, but we can also choose the eigenvectors
to be real.)
Moreover, for distinct eigenvalues of a symmetric matrix, the orthogonality of eigenvec-
tors comes for free. Consider a symmetric matrix M with two distinct eigenvalues α, β,
and the corresponding eigenvectors x, y, respectively. That is, we have

M x = αx, (2.12)
M y = βy. (2.13)

On both the side of Equation (2.12), take dot product with y, similarly, on both the
side of Equation (2.13), take dot product with x, we get

y T M x = αy T x, (2.14)
xT M y = βxT y. (2.15)

Taking transpose on both the sides of Equation (2.14), and then subtracting it from
Equation (2.15) we get (α − β)(xT y) = 0, which implies xT y = 0, that is, x, y are
orthogonal.

Theorem 2.8.1. For a symmetric matrix the eigenvectors associated with distinct
eigenvalues are orthogonal.

22
2.8.1 Orthogonal diagonalization

In fact for symmetric matrices we can say more than what Theorem 2.8.1 conveys.
Even if the eigenvalues are not distinct we can choose associated eigenvectors to be
orthogonal. The next theorem states this fact, its proof is given in Appendix 19. It
is also known as the spectral theorem, and the decomposition it gives is known as
eigenvalue decomposition of a symmetric matrix.

Orthogonal diagonalization (spectral theorem)

Theorem 2.8.2. Let M be a symmetric matrix. Then


  
  λ1 0 ... 0 − v1T −
| | ... |  
T

 0 λ2 ... 0  − v2 −
 
M = v1 v2 . . . vn  , (2.16)

 ... .. .. ..   .. .. .. 

  
. . .  . . .
| | ... |  
0 0 . . . λn − vnT −

that is, M = V DV T , where V is the matrix with columns v1 , v2 , . . . , vn which are


the eigenvectors of M corresponding to the eigenvalues λ1 , . . . , λn , and V T V = I,
D = diag(λ1 , . . . , λn ).

2.8.2 The interlace theorem

Consider the following symmetric matrix A and a matrix B which is a principal submatrix
of A.
 
−2 4 2 −4  

4
 −2 4 2
4 2 1  
A= , B =  4 4 2.
 
2 2 0 −3 
  2 2 0
−4 1 −3 4

The eigenvalues of A are λ1 = 8.62, λ2 = 5.03, λ3 = −1.95, λ4 = −5.70, and the eigen-
values of B are β1 = 7.03, β2 = −0.81, β3 = −4.23. Let us plot these eigenvalues on the
number line.

β3 β2 β1
λ4 λ3 0 λ2 λ1

23
Any βi is trapped between λi and λi+1 ; we have" λi+1#≤ βi ≤ λi for i = 1, 2, 3.
4 2
Now let us consider another principal submatrix of A. Its eigenvalues are
2 0
β1 = 4.83, β2 = −0.83. Let us plot these too on the number line.

β2 β1
λ4 λ3 0 λ2 λ1

This time β1 ≤ λ1 but also β1 ≤ λ2 . However, β1 ≥ λ3 . So β1 is infact trapped between


λ1 and λ3 . The following well celebrated theorem on symmetric matrices relates the
eigenvalue of a matrix with the eigenvalues of principal submatrices.

Cauchy interlace theorem (Eigenvalue interlacing theorem)

Theorem 2.8.3. Let M be a symmetric matrix of order n, and B be its principal


submatrix of order m, where, m < n. Suppose the eigenvalues of A are λ1 ≥ · · · ≥ λn
and the eigenvalues of B are β1 ≥ · · · ≥ βm . Then

λi ≥ βi ≥ λi+n−m , for i = 1, . . . , m.

In particular if m = n − 1, then

λ1 ≥ β1 ≥ λ2 ≥ · · · ≥ βn−1 ≥ λn .

2.8.3 Positive definite, semi-definite matrices

Now consider the following symmetric matrix


 
3−1 −3
0
 
0 7 1 −1
M = .
 
−1 1 2 2
 
−3 −1 2 6

Its eigenvalues are λ1 = 8.76, λ2 = 7.13, λ3 = 1.35, λ4 = 0.76. On the number line they
are placed as follows.

0 λ4 λ3 λ2 λ1

24
All the eigenvalues of M are nonnegative. A symmetric matrix whose all the eigenvalues
are nonnegative is known as positive semi-definite matrix, and when the eigenvalues are
positive it is known as positive definite matrix. Check that M is positive definite. These
matrices have very high importance for different applications and display beautiful
properties. We will see two equivalent criteria for a matrix to be positive semidefinite
or definite.
Let M be a symmetric positive semidefinite matrix of order n. Using 2.8.2 we can
write M = V DV T , where V is the matrix with columns v1 , v2 , . . . , vn which are the
eigenvectors of M corresponding to the eigenvalues λ1 , . . . , λn , and V is a orthogonal
1 1 1
matrix. Since the eigenvalues are nonnegative we can write D = D 2 D 2 , where D 2
√ 1
is a diagonal matrix with i-th diagonal entry equals to λi . Let V̂ = V D 2 , then
1 1
we can write M = V D 2 D 2 V T = V̂ V̂ T . It implies that for any nonzero vector x, we
have xT M x = xT V̂ V̂ T x = (V̂ T x)T (V̂ T x) ≥ 0. We will see that this is a very useful
information. When M is positive definite, that is, all the eigenvalues are positive, then
V̂ has rank n and so V̂ T x is can not be a zero vector unless x is zero vector. Since x is
nonzero this implies that xT M x > 0.
Now, assume that xT M x ≥ 0, for any symmetric matrix M and nonzero vector x. Using
Theorem 19.0.5 the smallest eigenvalue λn of M is equal to

xT M x
λn = min xT M x = min .
||x||=1 ||x||2

Since xT M x ≥ 0, we have, λn ≥ 0 which implies that all the eigenvalues are nonnegative.
If xT M x > 0, it implies that all the eigenvalues are positive. Thus we state the following
theorem.

Theorem 2.8.4. Let M be a symmetric matrix. For any nonzero vector x,

1. xT M x ≥ 0 if and only if all the eigenvalues of M are nonnegative. (M is


called positive semi-definite matrix.)

2. xT M x > 0 if and only if all the eigenvalues of M are positive. (M is called


positive definite matrix.)

Exercise 2.8.5. If A and B are positive semidefinite matrices, then trace AB ≥ 0, and
equality holds if and only if AB = 0.

Exercise 2.8.6. Prove that the rank of a symmetric matrix is equal to the number of

25
its nonzero eigenvalues.

Exercise 2.8.7. Let A be a n × m matrix with n ≤ m, then prove that AT A is a positive


semi-definite matrix. Further show that each eigenvalue of AAT is also an eigenvalue
of AT A, and the rest of the m − n eigenvalues are zero.

Exercise 2.8.8. Show that trace AT A = 0 if and only if the eigenvalues of AT A are
zero.

Exercise 2.8.9. For the matrix


 
2 −1 3
 
M =
−1 0 4

3 4 1

compute an orthogonal matrix V and a diagonal matrix D such that M = V DV T .

Exercise 2.8.10. Show that for any matrix Am×n , nullspaces of A and AT A are the
same. Also, the column spaces of A and AT A are the same.

Theorem 2.8.11. For the matrix


 
0 −1 2
 
M =
−1 0 3

2 3 1

compute the eigenvalues, and verify the Cauchy interlace theorem by computing the
eigenvalues of its principal submatrices.

26

You might also like