A positive definite matrix is a special type of
square matrix that is central in many areas of
mathematics, including linear algebra,
optimization, and statistics.
Test for Positive Definite Matrix
Each of the following tests for the real symmetric matrix 𝐴 to
be positive definite:
(𝐼) 𝑥 𝑇 𝑘𝑥 > 0 for all nonzero real vectors 𝑥.
(𝐼𝐼) All the eigenvalues of 𝐴 satisfy 𝜆𝑖 > 0.
(𝐼𝐼𝐼) All the upper left 𝑠𝑢𝑏𝑚𝑎𝑡𝑟𝑖𝑐𝑒𝑠 have positive determinants.
(IV) All the pivots (without row exchanges) satisfy 𝑑𝐾 > 0.
Positive Definite Matrices and Least
Squares
The rectangular 𝑚𝑎𝑡𝑟𝑖𝑥 will be 𝑅 and the least-squares problem
will be 𝑅𝑥 = 𝑏. It has 𝑚 equations with 𝑚 ≥ 𝑛 (square systems are
included). 𝑇ℎ𝑒 𝑙𝑒𝑎𝑠𝑡 𝑠𝑞𝑢𝑎𝑟𝑒 𝑐ℎ𝑜𝑖𝑐𝑒 𝑥 𝑖𝑠 𝑡ℎ𝑒 𝑠𝑜𝑙𝑢𝑡𝑖𝑜𝑛 𝑜𝑓𝑅𝑇 𝑅𝑥 = 𝑅𝑇 𝑏.
That matrix 𝐴𝑅𝑇 𝑅 is not only symmetric but positive definite, as
we now show-provided that the 𝑛 columns of 𝑅 are linearly
independent. 𝑇ℎ𝑒 𝑠𝑦𝑚𝑚𝑒𝑡𝑟𝑖𝑐 𝑚𝑎𝑡𝑟𝑖𝑥 𝐴 𝑖𝑠 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑑𝑒𝑓𝑖𝑛𝑖𝑡𝑒 If and
only if there is a 𝑚𝑎𝑡𝑟𝑖𝑥 𝑅 with independent columns such that
𝐴 = 𝑅𝑇 𝑅.
Semi definite Matrices
Each of the following tests is a necessary and sufficient
condition for a symmetric matrix 𝐴 to be positive semi definite:
(𝐼) 𝑥 𝑇 𝐴𝑥 ≥ 0 for all vectors 𝑥.
(𝐼𝐼) All the eigenvalues of 𝐴 satisfy 𝜆𝑖 ≥ 0.
(𝐼𝐼𝐼) No principal submatrices have negative determinants.
(𝐼𝑉) No pivots are negative.
(𝑉) There is a 𝑚𝑎𝑡𝑟𝑖𝑥 𝑅, possibly with dependent columns, such that
𝐴 = 𝑅𝑇 𝑅.
The Generalized Eigenvalue Problem
Physics, engineering, and statistics are usually kind enough to
produce symmetric matrices in their eigenvalue problems. But
sometimes 𝐴𝑥 = 𝜆𝑥 is replaced by 𝐴𝑥 = 𝜆𝑀𝑥. There are two
matrices rather than one.
Singular Value Decomposition
𝐴 = 𝑈𝐷𝑉 𝑇 is known as the “𝑆𝑉𝐷” or the singular value decomposition.
The 𝑆𝑉𝐷 is closely associated with the eigenvalue-eigenvector
factorization 𝑄𝛬𝑄 𝑇 of a positive definite matrix. The eigenvalues
are in the diagonal matrix 𝛬. The eigenvector matrix 𝑄 is orthogonal
(𝑄 𝑇 𝑄) = 1 because eigenvectors of a symmetric matrix can be
chosen to be orthonormal.
Singular Value Decomposition: Any 𝑚 × 𝑛 matrix 𝐴 can be factored Into
𝐴 = 𝑈𝐷𝑉 𝑇 = (orthogonal)(diagonal)(orthogonal).
The columns of 𝑈 (𝑚 × 𝑚) are eigenvectors of 𝐴𝑇, and the columns of 𝑉 (𝑛 × 𝑛)
are eigenvectors of 𝐴𝑇𝐴 . The 𝑟 singular values on the diagonal of 𝐷 (𝑚 × 𝑛)
are the square roots of the nonzero eigenvalues of both 𝐴𝐴𝑇 and 𝐴𝑇𝐴.
Application of the SVD
1. Image processing: Suppose a satellite takes a picture, and
wants to send it to Earth. The picture may contain 1000 by 1000
“pixels” a million little squares, each with a definite color. We can
code the colors, and send back 1,000,000 numbers. It is better to
find the essential information inside the 1000 by 1000 matrix, and
send only that.
2. The effective rank: The rank of a matrix is the number of
independent rows, and the number of independent columns. That
can be hard to decide in computations! In exact arithmetic,
counting the pivots is correct.
𝟑. Polar decomposition Every nonzero complex number 𝑧 is
a positive number 𝑟 times number 𝑒iθ on the unit circle: 𝑧 = 𝑟
𝑒iθ. That expresses 𝑧 in “polar coordinates.” If we think of 𝑧
as a 1 × 1matrix, 𝑟 corresponds to a positive definite matrix
and 𝑒iθ corresponds to an orthogonal matrix . it forms a 1 × 1
unitary matrix: 𝑈𝐻𝑈 = 1. We take the complex conjugate as
well as the transpose, for 𝑈𝐻.
𝟒. Least Squares For a rectangular system 𝐴𝑥 = 𝑏. the
least-squares solution comes from the normal equations 𝐴𝑇𝐴
𝑥 = 𝐴𝑇𝑏. if 𝐴 has dependent columns then 𝐴𝑇 is not
invertible and 𝑥 is not determined. Any vector in the null
space could be added to 𝑥 . The optimal solution of 𝐴𝑥 = 𝑏
is the minimum length . solution of 𝐴𝑇 𝐴𝑥 = 𝐴𝑇 𝑏.
Minimum Principles
The unknown 𝑥 will not be given as the solution to 𝐴𝑥 = 𝑏 or
𝐴𝑥 = 𝜆𝑥. Instead, the vector 𝑥 will be determined by a
minimum principle.
The first goal in this section is to find the minimum principle
that is equivalent 𝑡𝑜 𝐴𝑥 = 𝑏, and the minimization equivalent
to 𝐴𝑥 = 𝜆𝑥. In every problem to solve the linear equation or
minimize the quadratic.
1
If 𝐴 is symmetric positive definite, then 𝑃 𝑥 = 𝑥 𝑇 𝐴𝑥 − 𝑥 𝑇 𝑏
2
reaches its minimum at the point where 𝐴𝑥 = b. At that point
1
𝑃𝑚𝑖𝑛 = − 𝑏 𝑇 𝐴−1 𝑏
2
Proof. Suppose 𝐴𝑥 = 𝑏. For any vector 𝑦, we show
that 𝑃(𝑦) ≥ 𝑃(𝑥):
1 1
𝑃 𝑦 − 𝑃 𝑥 = 𝑦 𝑇 𝐴𝑦 − 𝑦 𝑇 𝑏 − 𝑥 𝑇 𝐴𝑥 − 𝑥 𝑇 𝑏
2 2
1 1
= 𝑦 𝑇 𝐴𝑦 − 𝑦 𝑇 𝑏 + 𝑥 𝑇 𝐴𝑥 (𝑠𝑒t 𝑏 = 𝐴𝑥)
2 2
1
= 𝑦 − 𝑥 𝑇𝐴 𝑦 − 𝑥 .
2
This can’t be negative since 𝐴 is positive definite and it is zero only if
𝑦 − 𝑥 = [Link] all other points 𝑃(𝑦) is larger than 𝑃(𝑥),so the minimum
occurs at 𝑥.
1
Linear algebra recognizes this 𝑃(𝑥) as 𝑥 𝑇 𝐴𝑥 − 𝑥 𝑇 𝑏, and knows immediately
2
that 𝐴𝑥 = 𝑏 gives the minimum. Substitute 𝑥 = 𝐴−1𝑏 into 𝑃(𝑥):
1 1
Minimum value 𝒑𝒎𝒊𝒏 = (𝐴−1 𝑏) 𝑇 𝐴 𝐴−1 𝑏 − 𝐴−1 𝑏 𝑇 𝑏 = 𝑏 𝑇 𝐴−1𝑏
2 2
𝑥 = 𝐴−1 𝑏 , where the total energy 𝑃(𝑥) is a minimum.
Minimizing with Constraints
Add extra equations 𝐶𝑥 = 𝑑 on top of the minimization problem.
These equations are constraints. We minimize 𝑃(𝑥) subject to the
extra requirement 𝐶𝑥 = 𝑑. Usually 𝑥 can’t satisfy 𝑛 equations
𝐴𝑥 = 𝑏 and also 𝑙 extra constraints 𝐶𝑥 = 𝑑. We have too many
equations and we need 𝑙 more unknowns. Those new unknowns
𝑦1 , … , 𝑦𝑙 are called Lagrange multipliers. They build the constraint
into a function 𝐿(𝑥, 𝑦).
1
𝐿 𝑥, 𝑦 = 𝑃 𝑥 + 𝑦 𝑇 𝑐𝑥 − 𝑑 = 𝑥 𝑇 𝐴𝑥 − 𝑥 𝑇 𝑏 + 𝑥 𝑇 𝐶 𝑇 𝑦 − 𝑦 𝑇 𝑑.
2
That term in 𝐿 is chosen exactly so that 𝜕𝐿/𝜕 𝑦 = 0 brings back
𝐶𝑥 = 𝑑. When we set the derivatives of 𝐿 to zero, we have 𝑛 + 𝑙
equations for 𝑛 + 𝑙 unknowns 𝑥 𝑎𝑛𝑑 𝑦:
The Rayleigh quotient
To find a minimization problem equivalent to 𝐴𝑥 = 𝜆𝑥. The function
to minimize cannot be a quadratic, or its derivative would be linear
and the eigenvalue problem is nonlinear . The trick that succeeds is
to divide one quadratic by another one:
𝑥 𝑇 𝐴𝑥
Rayleigh quotient Minimize 𝑅 𝑥 = .
𝑥𝑇𝑥
Rayleigh’s Principle: The minimum value of the Rayleigh
quotient is the smallest eigenvalue. 𝑅(𝑥) reaches that minimum
at the first eigenvector 𝑥1 of 𝐴 and maximum at 𝑥𝑛 :
𝑥1𝑇 𝐴𝑥1 𝑥1𝑇 𝜆1 𝑥1
Minimum where 𝐴𝑥1 = 𝜆𝑥1 𝑅 𝑥1 = = = 𝜆1
𝑥1𝑇 𝑥1 𝑥1𝑇 𝑥1
𝑥𝑛𝑇 𝐴𝑥 𝑇𝜆 𝑥
𝑥𝑛
Maximum where 𝐴𝑥𝑛 = 𝜆𝑛 𝑥𝑛 𝑅 𝑥𝑛 = 𝑇
𝑛
= 𝑇
𝑛 𝑛
= 𝜆𝑛 .
𝑥𝑛 𝑥𝑛 𝑥𝑛 𝑥𝑛
The Finite Element Method
There were two main ideas in the preceding section on minimum
principles:
1
(𝑖) Solving 𝐴𝑥 = 𝑏 is equivalent to minimizing 𝑃(𝑥) = 𝑥 𝑇 𝐴𝑥 − 𝑥 𝑇 𝑏
2
(𝑖𝑖) Solving 𝐴𝑥 = 𝜆1 𝑥 is equivalent to minimizing 𝑅(𝑥) = 𝑥 𝑇 𝐴𝑥/𝑥 𝑇 𝑥
The new idea finite element method uses more of the power of
the computer in constructing a discrete approximation, solving
it, and displaying the results than any other technique in
scientific computation. If the basic idea is simple, the
applications can be complicated.
Trial Functions
Starting from the classical Rayleigh-Ritz principle, introduce the
new idea of finite elements. The equation can be – 𝑢” = 𝑓(𝑥) with
boundary conditions 𝑢(0) = 𝑢(1) = 0. This problem is infinite
dimensional. the energy whose minimum is required, replacing inner
products 𝑣 𝑇 𝑓 by integrals of 𝑣(𝑥)𝑓(𝑥):
1 1 1 1
Total energy 𝑝 𝑣 = 𝑣 𝑇 𝐴𝑣 − 𝑣 𝑇 𝑓 = 𝑣(𝑥)(−𝑣"(𝑥))𝑑𝑥 − 𝑣(𝑥)𝑓(𝑥)𝑑𝑥
2 2 0 0
𝑃(𝑣) is to be minimized over all functions 𝑣(𝑥) that satisfy 𝑣(0) = 𝑣(1) = 0.
The function that gives the minimum will be the solution 𝑢(𝑥).
Therefore the Rayleigh-Ritz method has three steps:
1. Choose the trial functions 𝑉1, … , 𝑉𝑛.
2. Compute the coefficients 𝐴𝑖𝑗 and 𝑏𝑗
3. Solve 𝐴𝑦 = 𝑏 to find (𝑥) = 𝑦1𝑉1(𝑥) +··· +𝑦𝑛𝑉𝑛(𝑥).
Linear Finite Elements
The simplest and most widely used finite element is piecewise
linear. Place nodes at the interior points 𝑥1 = ℎ, 𝑥2 = 2ℎ, … , 𝑥𝑛 = 𝑛
ℎ, just as for finite differences. Then 𝑉𝑗 is the “hat function” that
equals 1 at the node 𝑥 , and zero at all the other nodes. It is
concentrated in a small interval around its node, and it is zero
everywhere else.
Eigenvalue Problems
The Rayleigh-Ritz idea to minimize over a finite-dimensional family
of 𝑉’𝑠 in place of all admissible 𝑣’𝑠 is also useful for eigenvalue
problems. The true minimum of the Rayleigh quotient is the
fundamental frequency
Eigenfunction 𝒖(𝒙) −𝑢” = 𝜆𝑢, with 𝑢(0) = 𝑢(1) = 0.
The best example of an eigenvalue problem has (𝑥) = 𝑠𝑖𝑛𝜋𝑥 and 𝜆1 = 𝜋2
That function 𝑠𝑖𝑛𝜋𝑥 minimizes the Rayleigh quotient 𝑉𝑇𝐴𝑣/𝑣𝑇𝑣