0% found this document useful (0 votes)
7 views53 pages

Mathematics of Linear Elastic FEM

This document discusses the foundational mathematics necessary for understanding the Finite Element Method (FEM) in mechanical engineering, emphasizing the importance of expressing, solving, and interpreting mathematical functions, approximations, and physics. It covers various mathematical concepts, including coordinate systems, types of functions, and approaches to function approximation, providing examples and visualizations. The paper aims to elevate the relevance of mathematics in engineering by linking theoretical ideals with practical applications.

Uploaded by

Charl
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views53 pages

Mathematics of Linear Elastic FEM

This document discusses the foundational mathematics necessary for understanding the Finite Element Method (FEM) in mechanical engineering, emphasizing the importance of expressing, solving, and interpreting mathematical functions, approximations, and physics. It covers various mathematical concepts, including coordinate systems, types of functions, and approaches to function approximation, providing examples and visualizations. The paper aims to elevate the relevance of mathematics in engineering by linking theoretical ideals with practical applications.

Uploaded by

Charl
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Functions, Approximations, Physics, and

Equations: The Mathematical Pillars of Linear


Elastic FEM
Daniel N. Wilke1
1
Emerging Engineering Technology (EET) Lab, Mechanical and
Aeronautical Engineering, University of Pretoria, South Africa

1 Introduction
The Finite Element Method (FEM) is about constructing functions that approx-
imate physics using linear algebra. As Mechanical engineering is the engineering
of physical processes, the same way chemical engineering is the engineering of
chemical processes, being able to express physics mathematically is a basic yet
core skill for mechanical engineers.
This paper explores the background and foundational mathematics required
to understand and critically interpret FEM by recapping how to express, solve
and interpret

1. functions mathematically [2],


2. approximations mathematically [1],
3. physics mathematically [2],

4. linear systems mathematically [1].


The emphasis is on understanding and critically interpreting this mathemat-
ics, not just merely knowing this mathematics or knowing about this mathemat-
ics. Discussions and examples extend beyond FEM to give context and elevate
the relevance of the mathematics.

1
2 A coordinate system, a function, and an engi-
neer walk into a Space bar...
The art of problem-solving is the
art of finding the right function.

George Pólya

2.1 Space
To begin, we identify a space, such as the real number line (R), the Cartesian
plane (R2 ), or three-dimensional space (R3 ), and establish a suitable coordinate
system.

2.2 Cartesian Coordinates


Cartesian coordinate systems are fundamental to mathematics and physics.
They provide a framework and reference point for describing positions and
movements in space. Three Cartesian coordinate systems from 1D to 3D are
visualised in Figure 1, each showing a selected point in their respective coordi-
nate systems. A real number line represents each coordinate. The coordinates
are mutually orthogonal.

Figure 1: Visualisation of 1D, 2D, and 3D coordinate systems

2.2.1 1D Coordinate System


The leftmost panel shows a 1D coordinate system with a single axis, typically
labelled x or x1 . In this system:

ˆ A point is represented by a single number.

ˆ The example point shown is at x = 3.

2
2.2.2 2D Coordinate System
The middle panel shows a 2D coordinate system known as the Cartesian plane
typically labelled x, y or x1 , x2 . It consists of two perpendicular axes:

ˆ A point is represented by an ordered pair [x, y].

ˆ The example point shown is at [3, 2].

2.2.3 3D Coordinate System


The rightmost panel shows a 3D coordinate system typically labelled x, y, z or
x1 , x2 , x3 . It consists of three mutually perpendicular axes:

ˆ A point is represented by an ordered triple [x, y, z].

ˆ The example point shown is at [3, 2, 4].

2.2.4 Mathematical Notation


In this paper, we adopt the following notational conventions:
1. scalars are represented by non-bold italic lowercase letters (e.g., a, x, λ),

2. vectors are denoted by bold lowercase letters (e.g., v, x, f ),


3. and matrices are represented by bold uppercase letters (e.g., A, K, Σ).
For instance, a linear system Ax = b consists of a matrix A, a vector x, and a
vector b. The i-th component of a vector v is denoted by the scalar vi , and the
(i, j)-th element of a matrix A is represented by the scalar aij .

2.2.5 Mathematical Representation


In an n-dimensional Euclidean space Rn , a Cartesian coordinate system is de-
fined by n mutually orthogonal axes, each representing a real number line. A
point p in this space is uniquely specified by an ordered n-dimensional vector
x = [x1 , x2 , . . . , xn ], where each xi ∈ R represents the signed distance from
the origin to the orthogonal projection of p onto the i-th axis. The coordi-
nates xi are the Cartesian coordinates of the point p. In a Cartesian coordi-
nate system, a point p is uniquely specified by an ordered n-dimensional vector
x = [x1 , x2 , . . . , xn ], where each xi ∈ R represents the signed distance from the
origin to the orthogonal projection of p onto the i-th axis.
In general, we can represent points in these systems as vectors:

ˆ 1D: x = x or x = x1

ˆ 2D: x = [x, y] or x = [x1 , x2 ]

ˆ 3D: x = [x, y, z] or x = [x1 , x2 , x3 ]

3
This concept can be extended to higher dimensions, though they cannot be
easily visualised. An n-dimensional point would be represented as:
Let x = (x1 , . . . , xn ) ∈ Rn denote a point in n-dimensional Euclidean space,
where each xi represents the coordinate along the i-th orthogonal axis.

Figure 2: Orthogonal projections in 1D, 2D, and 3D Cartesian coordinate sys-


tems

Figure 2 illustrates this concept in 1D, 2D, and 3D:

ˆ 1D: The point p is represented by a single coordinate x1 = 3. The projec-


tion is simply the point itself on the number line.
ˆ 2D: The point p is represented by two coordinates [x1 , x2 ] = [3, 2]. The
red dashed lines show the orthogonal projections onto each axis:

– x1 = 3 is the distance from the origin to the projection on the x1 -axis.


– x2 = 2 is the distance from the origin to the projection on the x2 -axis.
ˆ 3D: The point p is represented by three coordinates [x1 , x2 , x3 ] = [3, 2, 4].
The red dashed lines show the orthogonal projections:

– x1 = 3 is the distance from the origin to the projection on the x1 -axis.


– x2 = 2 is the distance from the origin to the projection on the x2 -axis.
– x3 = 4 is the distance from the origin to the projection on the x3 -axis.
In each case, the coordinates xi uniquely specify the position of point p in
the respective space. The orthogonal projections ensure that each coordinate is
independent of the others, a key feature of Cartesian coordinate systems.
Understanding these coordinate systems is crucial for describing positions,
movements, and relationships in space. They form the foundation for more
advanced concepts in geometry, physics, and many other fields of science and
engineering.

4
2.3 Functions
Functions emerge as mathematical constructs that express relationships between
points in a space. For example, a function f : R → R defined by y = f (x) de-
scribes a relationship where each x-coordinate in the domain is associated with
exactly one y-coordinate in the codomain. Graphically, this can be represented
in the Cartesian plane, where each point [x, f (x)] satisfies the functional rela-
tionship. Hence, a function f is an equation that assigns to each element in a
set A exactly one element in a set B. We write this as:

f :A→B
A is called the domain and B is called the codomain.

2.4 Types of Functions


2.4.1 Univariate Functions
A univariate function is a function of a single variable. It can be represented as:

f :R→R

Example: f (x) = x2

2.4.2 Multivariate Functions


A multivariate function is a function of two or more variables. It can be repre-
sented as:
f : Rn → Rm
n is the number of input variables, and m is the number of output components.
Examples:

ˆ Scalar-valued: f (x, y) = x2 + y 2
T
ˆ Vector-valued: f (x, y) = x2 , y 2 , xy


2.4.3 Scalar Functions


Scalar functions return a single value for each input. For example, a scalar
function of two variables, x and y, is a function that assigns a single real value
to each pair of x and y values. In other words, a scalar function f (x, y) maps
the two-dimensional (x, y) coordinate space to the one-dimensional real number
line.
Temperature
The temperature function T (x, y) represents the temperature distribution in
a two-dimensional space:

T (x, y) = 20 + 0.1x2 + 0.05y 2

5
where T is measured in degrees Celsius, and x and y are spatial coordinates in
meters. The colour gradient and height of the surface represent the temperature
value at each point.
The top leftmost subfigure in Figure 3 depicts a temperature field with the
height and colours indicating the temperature.

Figure 3: Combined visualisation of functions. (a) Temperature distribution


T (x, y). (b) Pressure variation P (x, t). (c) Displacement field u(x, y) of a
structure under forming. (d) 2D velocity field v(x, y).

Pressure

The pressure function P (x, t) shows how pressure varies with position and
time:

P (x, t) = 101.325 + 0.01x + 0.5 sin(0.2t)


where P is measured in kilopascals (kPa), x is the spatial coordinate in
meters, and t is time in seconds. The colour and height of the surface indicate
the pressure value.
The top rightmost subfigure in Figure 3 depicts a pressure field with the
height and colours indicating the pressure.

6
2.4.4 Vector-valued Functions
Vector functions return multiple values (typically 3 for 3D space) for each input.
A vector function of two variables, x and y, is a function that assigns a vector
(with multiple components) to each pair of x and y values. In other words, a
vector function f (x, y) maps the two-dimensional (x, y) coordinate space to a
higher-dimensional vector space. For example, a two-dimensional vector func-
tion might look like  2 
x +y
f (x, y) =
x − y2
Displacement

The displacement function u(x, y) illustrates the two-dimensional displace-


ment of a structure under forming:
   
ux 0.01x(L − 2x)y/W
u(x, y) = =
uy 0.02x(L − x)(W − y)/W
where u is the displacement vector in meters, x and y are spatial coordinates
in meters, L is the length of the beam (set to 10 meters), and W is the width
of the beam (set to 2 meters). Each arrow represents the displacement vector
at a point, with the arrow’s direction indicating the direction of displacement
and its colour representing the magnitude of displacement.
The bottom leftmost subfigure in Figure 3 depicts a displacement field with
arrows indicating the displacement direction and magnitude by length and
colour.
Velocity

The velocity field v(x, y) represents the velocity of particles in a two-dimensional


space:
   
v 0.1x
v(x, y) = x =
vy 0.2y
where v is the velocity vector in meters per second, and x and y are spatial
coordinates in meters. Each arrow represents the velocity vector at a point,
with its direction indicating the direction of motion and its colour representing
the magnitude of velocity.
The bottom rightmost subfigure in Figure 3 depicts a velocity field with
arrows indicating the velocity direction and the length and colour of the velocity
magnitude.

7
3 Approximately Four Approaches to Approxi-
mation
Approximations bridge theoretical
ideals and practical realities.

Author

Four approaches are demonstrated to fit functions to data: fitting to discrete


data points, fitting to a continuous function, using the method of weighted
residuals, and using a stochastic integration approach.

3.1 Fitting to Discrete Data Points


In this approach, we have a set of discrete data points (xi , yi ), and we want
to find a function that best fits these points. We minimise the sum of squared
residuals:
n
X
minimise (yi − f (xi ))2
i=1

where f (x) is our fitting function. The square ensures that any difference is
added, whether yi is bigger or smaller than f (xi ).

3.2 Fitting to a Continuous Function


In this approach, we have a continuous target function g(x), and we want to
find another function f (x) that best approximates it. We minimise the integral
of the squared difference:
 b
minimise (g(x) − f (x))2 dx
a

where [a, b] is the interval of interest. This integral represents the area
between the two functions.

3.3 Method of Weighted Residuals


This approach, also known as the Galerkin method, minimises the weighted
integral of the residual. For a quadratic fit f (x) = ax2 + bx + c to a function
g(x), we solve:
 b
(g(x) − (ax2 + bx + c))wi (x)dx = 0, i = 1, 2, 3
a

where w1 (x) = x2 , w2 (x) = x, and w3 (x) = 1 are the weight functions. The
weight functions come from the terms of the fit function f (x), which is called
the Bubnov-Galerkin method. This leads to a system of linear equations:

8
 b  b  b  b
a x4 dx + b x3 dx + c x2 dx = g(x)x2 dx
a a a a
 b  b  b  b
a x3 dx + b x2 dx + c xdx = g(x)xdx
a a a a
 b  b  b  b
a x2 dx + b xdx + c dx = g(x)dx
a a a a

Solving this system gives us the coefficients a, b, and c.

3.4 Stochastic Integration Approach


This approach approximates the continuous integral using Monte Carlo integra-
tion. We randomly sample points from the domain and compute the average
squared difference:
N
1 X
minimize (g(xi ) − f (xi ))2
N i=1
xi are randomly sampled from the domain, and N is the number of samples.

3.5 Examples
3.5.1 1D Example: Fitting a Quadratic to a Sine Function
We fit a quadratic function f (x) = ax2 + bx + c to a sine function g(x) = sin(x)
over the interval [0, π].

Figure 4: 1D fitting examples

Far left plot: Fitting to discrete points


ˆ Blue dots: 20 data points sampled from sin(x) with added Gaussian noise
(mean 0, standard deviation 0.1)

ˆ Orange curve: Quadratic function fitted to these points using least squares
method
Middle left plot: Fitting to continuous function

9
ˆ Blue curve: Target function sin(x)
ˆ Orange curve: Quadratic function fitted by minimising the integral of the
squared difference
Middle right plot: Method of weighted residuals
ˆ Blue curve: Target function sin(x)
ˆ Orange curve: Quadratic function fitted using the method of weighted
residuals
Far right plot: Stochastic integration approach
ˆ Blue curve: Target function sin(x)
ˆ Orange curve: Quadratic function fitted using stochastic integration
To reproduce:
1. Generate 20 points from x ∈ [0, π]
2. Compute y = sin(x)
3. Fit a quadratic function to these points

4. For continuous fit, minimize 0
(sin(x) − (ax2 + bx + c))2 dx using gradient-
based optimisation [3].

5. For weighted residuals, solve the system of equations derived from 0 (sin(x)−
(ax2 +bx+c))xi dx = 0 for i = 0, 1, 2, which solves a solution for ∇f (x) = 0
(optimality criterion solution) [3].
6. For stochastic integration, randomly sample points from [0, π] and min-
imise the average squared difference using gradient-only optimisation [3].
7. Plot results using 100 points over [0, π]

3.5.2 2D Example: Fitting a 2D Quadratic to a 2D Sine Function


We fit a 2D quadratic function f (x, y) = ax2 + by 2 + cxy + dx + ey + f to a
2D sine function g(x, y) = sin(x) sin(y) over the region [−2, 2] × [−2, 2]. The far
left, middle left, middle right and far right plots of the fitted function are shown
in Figure 5.
To reproduce:
1. Generate 20x20 grid of points over [−2, 2] × [−2, 2]
2. Compute z = sin(x) sin(y)
3. Fit 2D quadratic function to these points
2 2
4. For continuous fit, minimize −2 −2
(sin(x) sin(y) − f (x, y))2 dxdy

10
Figure 5: 2D fitting examples

5. For weighted residuals, extend the 1D approach to 2D using appropriate


weight functions
6. For stochastic integration, randomly sample points from [−2, 2] × [−2, 2]
and minimize the average squared difference using gradient-only optimi-
sation [3].
7. Plot results using 50x50 grid over [−2, 2] × [−2, 2]
Each approach has its strengths:

ˆ Discrete point fitting is best when data is sparse or noisy and when data
is measured using sensors.
ˆ Continuous function fitting is ideal when the data we fit is given in a
functional form.

ˆ Method of weighted residuals often balances accuracy and computational


efficiency when the data we fit is given in a functional form.
ˆ Stochastic integration can be useful for high-dimensional problems but
requires careful implementation and possibly many samples.
The choice of method depends on the specific problem, computational re-
sources, and desired accuracy.

4 Maths’ Most Wanted: Functions and Calculus


Caught Red-Handed Describing Physics
Physics follows the path of least
action, and gradients show the
way.

Author

The mathematical description of physics fundamentally relies on functions


and calculus. Functions f (x) are the primary tool to model physical relation-
ships, mapping input variables to output quantities. For instance, position as a
function of time x(t) describes motion, whilst potential energy as a function of

11
position V (x) characterises force fields. Calculus then provides the machinery
df
to analyse how these quantities change. The derivative dx quantifies rates of
dx
change, crucial in defining velocity v(t) = dt or force f (x) = −∇V (x). Inte-
gration f (x) dx, conversely,
 accumulates infinitesimal contributions, essential
in calculating work W = f · dx or determining the centre of mass. Differen-
2
tial equations, combining functions and their derivatives (e.g., m ddt2x = f (x)),
encapsulate the dynamics of physical systems. This interplay of functions and
calculus forms the language in which the laws of physics are written and under-
stood, enabling precise description and prediction of natural phenomena across
scales from the quantum to the cosmic.

4.1 Calculus
Calculus, at its core, is the study of relationships between points in a defined
space by studying how functions change and accumulate:

4.1.1 Differential Calculus


For univariate functions:
f (x + h) − f (x)
f ′ (x) = lim
h→0 h
For multivariate functions:
ˆ Partial derivatives: ∂f
∂xi
h iT
ˆ Gradient (for scalar-valued functions): ∇f = ∂f ∂f
∂x1 , . . . , ∂xn

ˆ Jacobian matrix (for vector-valued functions): Jij = ∂fi


∂xj

Consider the 3D function defined as:


p 
f (x, y) = sin x2 + y 2

The partial derivatives of the function are:


∂f x p 
=p cos x2 + y 2
∂x x2 + y 2
∂f y p 
=p cos x2 + y 2
∂y x2 + y 2

Figure 4.1.1 visualises the top left: the function top left, top right: gradient
field, bottom left: partial derivative with respect to x, bottom right: partial
derivative with respect to y.

12
4.1.2 Integral Calculus
For univariate functions:
 b n
X
f (x)dx = lim f (x∗i )∆x
a n→∞
i=1

For multivariate functions:



ˆ Double integrals: D
f (x, y) dA

ˆ Triple integrals: V
f (x, y, z) dV

ˆ Line integrals: C
f (x, y, z) ds

ˆ Surface integrals: S
f (x, y, z) dS

The triple integral of a function f (x, y, z) over a region V is defined as:


 n
X
f (x, y, z) dx dy dz = lim f (x∗i , yi∗ , zi∗ )∆Vi
V n→∞
i=1

where ∆Vi is the volume of the i-th subregion and (x∗i , yi∗ , zi∗ ) is a point in
that subregion.
For a vector field f (x, y, z) and a closed surface s enclosing a volume V :

13
 
(∇ · f ) dV = f · ds
V s
ds is a vector pointing outward from the surface, perpendicular to the surface
at each point with magnitude of the area of an infinitesimal piece of the surface.
This is known as the Divergence Theorem.

4.2 Applications in 3D
ˆ Optimisation: ∇f = 0 at critical points (maxima, minima and saddle
points) [3]

ˆ Volume under a surface: V = V
f (x, y, z) dx dy dz

ˆ Flux through a surface: Φ = S


f · ds

Calculus provides tools to analyse how these relationships change over the
space. Differential calculus examines rates of change between nearby points,
quantified by derivatives, while integral calculus studies accumulation and ar-
eas represented by integrals. These fundamental operations allow us to explore
the behaviour of functions, such as finding extrema, analysing curvature, quan-
tifying differences between curves or calculating areas under curves.

4.3 Units in Differentiation and Integration


Understanding how units change under differentiation and integration is crucial
in physics and engineering:

ˆ Differentiation with respect to a spatial variable: When we differentiate


a quantity with respect to a spatial variable, the unit of the result is the
unit of the original quantity divided by the unit of the spatial variable.
Example: If f (x) represents a position in meters (m) and x is in seconds
df
(s), then dx will have units of m/s (velocity).
ˆ Integration with respect to a spatial variable: When we integrate a quan-
tity with respect to a spatial variable, the unit of the result is the unit of
the original quantity multiplied by the unit of the spatial variable.
Example: If we integrate velocity (m/s) with respect to time (s), the result
will have units of m (displacement).
ˆ In 3D: - The gradient (∇f ) of a scalar field f has units of [f]/[length]. -
The divergence (∇ · f ) of a vector field f has units of [f]/[length]. - The
curl (∇ × f ) of a vector field f has units of [F]/[length].

ˆ Volume integrals: When we perform a triple integral, we multiply the


original quantity by volume (length3 ).
Example: Integrating density (kg/m3 ) over a volume gives mass (kg).

14
These unit transformations are essential for dimensional consistency in phys-
ical equations and can be used to check the correctness of derived equations. Un-
derstanding how units transform under differentiation and integration is crucial
for correctly interpreting and applying these equations in real-world scenarios.

4.4 Physics and Gradients


Derivatives and partial derivatives are the building blocks for describing physics.
Many physical phenomena are described using Ordinary Differential Equa-
tions (ODEs) or Partial Differential Equations (PDEs):

ˆ ODEs involve functions of a single variable and their derivatives. For


2
example, Newton’s Second Law for a 1D motion: m ddt2x = F (x, t)

ˆ PDEs involve functions of multiple variables and their partial derivatives.


For example, the Heat Equation in 3D: ∂u 2
∂t = α∇ u, where u(x, y, z, t) is
2
the temperature and ∇ is the Laplacian operator.
ˆ These equations often arise from applying fundamental physical laws (con-
servation of mass, energy, momentum) to continuous systems.

ˆ Solving these equations typically involves techniques from calculus, such


as integration and finding extrema, often extended to multiple dimensions.
ˆ The units in these equations must be consistent. For instance, in the heat
equation, α must have units of [length2 ]/[time] for dimensional consis-
tency.

Calculus mathematics, particularly its multidimensional forms, provides the


foundation for describing and analysing complex physical systems through dif-
ferential equations.

4.4.1 Common Operators for Physics


The Divergence Operator
The divergence operator, denoted as ∇· or div, is a vector operator that mea-
sures the magnitude of a vector field’s source or sink at a given point. It maps
a vector field to a scalar field.
For a vector field F(x1 , . . . , xn ) = (F1 , . . . , Fn ) in n-dimensional space, the
divergence is defined as:
n
X ∂Fi
∇ · F = div F =
i=1
∂xi

ˆ In Cartesian coordinates (3D):

∂Fx ∂Fy ∂Fz


∇·F= + +
∂x ∂y ∂z

15
ˆ In cylindrical coordinates (r, θ, z):

1 ∂ 1 ∂Fθ ∂Fz
∇·F= (rFr ) + +
r ∂r r ∂θ ∂z

ˆ In spherical coordinates (r, θ, ϕ):

1 ∂ 2 1 ∂ 1 ∂Fϕ
∇·F= 2
(r Fr ) + (sin θFθ ) +
r ∂r r sin θ ∂θ r sin θ ∂ϕ

The divergence of a vector field F at a point P represents the net outward


flux of the field per unit volume as the volume shrinks to zero around P .

ˆ ∇ · f > 0: Net outward flux (source)

ˆ ∇ · f < 0: Net inward flux (sink)

ˆ ∇ · f = 0: No net flux (incompressible flow)

The key properties are:


ˆ Linearity: ∇ · (af + bg) = a∇ · f + b∇ · g

ˆ Product rule: ∇ · (f f ) = f (∇ · f ) + f · (∇f )



ˆ Divergence theorem: V
(∇ · f ) dV = S
f · ds
The divergence is crucial in many areas of physics and engineering:
ˆ Solid Mechanics:

– Describes equilibrium of a material under stress


– Used in studying the propagation of stress waves through materials
2
– Stress equilibrium equation: ∇ · σ + f = ρ ∂∂tu2
– Hooke’s law (isotropic linear elasticity): σ = λ(∇ · u)I + 2µε
ˆ Fluid Dynamics:

– Describes the compressibility of a fluid


– Identifies sources and sinks in a flow field
– Helps analyse the expansion or contraction of fluid elements
– Continuity equation for incompressible flow: ∇ · v = 0
∂ρ
– Continuity equation for compressible flow: ∂t + ∇ · (ρv) = 0

ˆ Electromagnetism:

– Used in Gauss’s law to relate electric fields to charge distributions


– Helps describe the behaviour of magnetic fields

16
– Gauss’s law for electricity: ∇ · E = ϵρ0
– Gauss’s law for magnetism: ∇ · B = 0
– Charge conservation: ∂ρ∂t + ∇ · J = 0

ˆ Heat Transfer:
– Describes heat flux in a material
– Helps identify heat sources and sinks in thermal systems
– Heat equation: ρcp ∂T
∂t = −∇ · q + Q

ˆ Acoustics:
– Used to study sound wave propagation and compression
– Helps analyse acoustic sources and their behaviour
– Continuity equation for sound waves: ∂ρ
∂t + ρ0 ∇ · v = 0
– Acoustic energy flux: ∇ · I = − ∂E
∂t c

ˆ Plasma Physics:
– Describes the behaviour of charged particle distributions
– Used in studying plasma dynamics and instabilities
– Continuity equation for plasma: ∂n
∂t + ∇ · (ns vs ) = 0
s

ˆ Meteorology:
– Helps analyse atmospheric pressure systems
– Used in studying wind patterns and air mass movements
– Mass continuity equation: ∂ρ
∂t + ∇ · (ρv) = 0
∂u ∂v ∂w
– Divergence of wind field: ∇ · v = ∂x + ∂y + ∂z

ˆ Geophysics:
– Applied in studying seismic wave propagation
– Used in analysing groundwater flow and contaminant transport
– Continuity equation for groundwater: Ss ∂h
∂t = ∇ · (K∇h) + W

ˆ Quantum Mechanics:
– Appears in the continuity equation for probability density
– Used in formulating conservation laws for quantum systems
∂|ψ|2
– Continuity equation: ∂t +∇·J=0
ˆ Image Processing:
– Applied in edge detection algorithms
– Used in analysing intensity variations in images
– Anisotropic diffusion: ∂I
∂t = ∇ · (c(x, y, t)∇I)
– Divergence of gradient for edge detection: ∇2 I = ∇ · (∇I)

17
The Divergence of the Gradient of a Function
The divergence of a function’s gradient is referred to as the Laplacian opera-
tor. The Laplacian operator denoted as ∇2 or ∆, is a second-order differential
operator of central importance in many mathematics, physics, and engineering
areas. The Laplacian of a function f at a point P measures how much the value
of f at P differs from its average value on a small sphere around P . It measures
the ’curvature’ of f in all directions.
For a scalar function f (x1 , . . . , xn ) of n variables, the Laplacian is defined
as:
n
X ∂2f
∇2 f = ∆f = ∇ · ∇f =
i=1
∂x2i
Common representations include:
ˆ In Cartesian coordinates (3D):

∂2f ∂2f ∂2f


∇2 f = 2
+ 2 + 2
∂x ∂y ∂z

ˆ In cylindrical coordinates (r, θ, z):

1 ∂2f ∂2f
 
2 1 ∂ ∂f
∇ f= r + 2 2 + 2
r ∂r ∂r r ∂θ ∂z

ˆ In spherical coordinates (r, θ, ϕ):

∂2f
   
2 1 ∂ 2 ∂f 1 ∂ ∂f 1
∇ f= 2 r + 2 sin θ + 2 2
r ∂r ∂r r sin θ ∂θ ∂θ r sin θ ∂ϕ2

The Laplacian operator is fundamental in numerous scientific and engineer-


ing disciplines, offering insights into various physical phenomena:
ˆ Solid Mechanics:
– Used in analysing stress and strain distributions in materials
– Helps study elastic deformations and vibrations in solids
– Applied in modelling crack propagation and fracture mechanics
2
– Navier’s equation for elasticity: µ∇2 u + (λ + µ)∇(∇ · u) + f = ρ ∂∂tu2
q
– Biharmonic equation for plate bending: ∇4 w = D

ˆ Fluid Dynamics:
– Appears in the Navier-Stokes equations, describing fluid motion
– Used in studying vorticity and stream functions
– Helps analyse diffusion processes in fluids

18
– Navier-Stokes equation: ρ Dv 2
Dt = −∇p + µ∇ v + ρg
– Navier-Stokes equation for incompressible flow: ρ ∂v
∂t + ρ(v · ∇)v =
−∇p + µ∇2 v + f
– Stream function in 2D flow: ∇2 ψ = −ω, where ω is vorticity
ˆ Heat Transfer:
– Central to the heat equation, describing temperature distribution
over time
– Used in analysing steady-state heat conduction
– Helps model thermal diffusion in materials
∂T
– Heat equation: ∂t = α∇2 T
– Steady-state heat conduction: ∇2 T = 0
ˆ Acoustics:
– Appears in the wave equation for sound propagation
– Helps analyse resonance in acoustic cavities
– Used in studying sound diffraction and scattering
2
1 ∂ p
– Wave equation for sound: ∇2 p = c2 ∂t2
– Helmholtz equation: ∇2 p + k 2 p = 0
ˆ Finance:
– Appears in the Black-Scholes equation for option pricing
– Used in modelling diffusion processes in financial markets
– Helps analyse risk spreading in portfolio theory
2
∂V
– Black-Scholes equation: ∂t + 12 σ 2 S 2 ∂∂SV2 + rS ∂V
∂S − rV = 0
∂u 1 ∂2u
– Heat equation for option pricing: ∂τ = 2 ∂x2

ˆ Biology and Ecology:


– Appears in reaction-diffusion equations modelling biological pattern
formation
– Used in studying population dynamics and species distribution
– Helps analyse neural signal propagation
∂u
– Reaction-diffusion equation: ∂t = D∇2 u + f (u)
∂u
– Fisher-KPP equation: ∂t = D∇2 u + ru(1 − u)
ˆ Electromagnetism:
– Describes the behaviour of electric and magnetic potentials
– Used in solving Poisson’s equation for electrostatic problems

19
– Helps analyse electromagnetic wave propagation
– Poisson’s equation for electrostatic potential: ∇2 ϕ = − ϵρ0
2
– Wave equation for electromagnetic waves: ∇2 E = µ0 ϵ0 ∂∂tE
2

ˆ Image Processing:
– Used in edge detection and image sharpening algorithms
– Helps in noise reduction and image smoothing
– Applied in feature extraction and image segmentation
– Laplacian filter for edge detection: L(x, y) = ∇2 I(x, y)
∂I
– Heat equation for image denoising: ∂t = ∇2 I
ˆ Geophysics:
– Used in gravity and magnetic field analysis
– Helps model seismic wave propagation
– Applied in studying groundwater flow and contaminant transport
– Laplace’s equation for gravitational potential: ∇2 V = 0
– Diffusion equation for groundwater flow: S ∂h 2
∂t = T ∇ h

ˆ Plasma Physics:
– Describes potential distributions in plasmas
– Used in studying plasma oscillations and instabilities
– Helps analyse diffusion processes in plasma
– Poisson’s equation for electrostatic potential in plasma: ∇2 ϕ = − ϵe0 (ni −
ne )
∂ 2 ∂
– Drift wave equation: ∂t ∇ ϕ + v∗ ∂y ∇2 ϕ = 0
ˆ Computer Graphics:
– Applied in mesh smoothing and surface fairing
– Used in generating realistic textures and terrain
– Helps in simulating fluid and cloth dynamics
– Laplacian smoothing: vinew = vi + λ∇2 vi
– Poisson equation for surface reconstruction: ∇2 f = ∇ · v
ˆ Quantum Mechanics:
– Fundamental component of the Schrödinger equation
– Describes the kinetic energy operator for quantum particles
– Used in studying quantum tunnelling and barrier penetration
2
– Time-independent Schrödinger equation: − 2m

∇2 ψ + V ψ = Eψ
2
– Time-dependent Schrödinger equation: iℏ ∂ψ 2
∂t = − 2m ∇ ψ + V ψ

20
4.5 Examples
4.5.1 Ordinary Differential Equation (ODE): Motion of a Mass-Spring-
Damper System
The following second-order ODE can describe the motion of a mass-spring-
damper system:

d2 x dx
m +b + kx = 0
dt2 dt
where m is the mass, b is the damping coefficient, k is the spring constant,
x is the displacement, and t is time.
This can be rewritten as a system of first-order ODEs:

dx
=v
dt
dv k b
=− x− v
dt m m
Figure 6 visualises the solution to this ODE system in two ways. The top
left shows the position (blue), velocity (green), and acceleration (red) of the
mass-spring-damper system over time. The top right is a phase plot of the
same system, showing how velocity changes with position.
Interpreting these plots is essential in engineering. The mass started at
position 1 on the x-axis, and it had no initial zero velocity. The increasing
amplitude and velocity show the system is gradually losing energy. The spiral
pattern indicates the system’s gradual energy loss due to damping.

4.5.2 Partial Differential Equation (PDE): Heat Equation


The heat equation in one dimension is given by:

∂T ∂2T
=α 2
∂t ∂x
where T (x, t) is the temperature at position x and time t, and α is the
thermal diffusivity.
The solution to this PDE is visualised in three ways:

1. A surface plot showing how temperature (T ) varies with position and time.
2. A surface plot showing how the rate of change of temperature (∂T /∂t)
varies with position and time.
3. A surface plot showing how the second spatial derivative of temperature
(∂ 2 T /∂x2 ) varies with position and time.

21
Figure 6: Visualisation of ODE and PDE solutions. Top row: Mass-spring-
damper system. (Left) Position, velocity, and acceleration over time. (Right)
Phase plot. Bottom row: Heat equation. (Left) Temperature T (x, t). (Mid-
dle) Rate of change of temperature ∂T /∂t. (Right) Second spatial derivative
∂ 2 T /∂x2 .

Figure 6 bottom left is a surface plot of the heat equation solution. The
x-axis represents position, the y-axis represents time, and the z-axis (colour)
represents temperature. The initial sinusoidal temperature distribution can be
seen to ”flatten out” over time due to heat diffusion.
Figure 6bottom middle is a surface plot of dT /dt for the heat equation.
This represents the temperature change rate at each point in space and time.
The magnitude of dT /dt decreases over time as the system approaches thermal
equilibrium.
Figure 6 bottom right is the heat equation’s surface plot of ∂ 2 T /∂x2 . This
represents the curvature of the temperature profile at each point in space and
time. According to the heat equation,∂ 2 T /∂x2 is proportional to dT /dt, which
is evident in the similarity of their plots.
We can observe that higher temperature regions decrease and lower tem-
perature region increase over time. ∂T /∂t and ∂ 2 T /∂x2 differ only by a scale
factor. That scale factor is α. Regions with high positive ∂ 2 T /∂x2 (high up-
ward curvature) correspond to regions where ∂T /∂t is positive (temperature is
increasing), and vice versa. This illustrates the physical principle that heat flows
from high-temperature to low-temperature regions, smoothing the temperature

22
distribution over time. Using the same colour map for all three plots allows for
easy comparison between T , ∂T /∂t, and ∂ 2 T /∂x2 .

5 All-You-Can-Compute Linear Algebra


If you want to see real
applications of mathematics,
teach linear algebra.

Steven J. Leon

Linear algebra operations are fundamental to understanding and solving


problems in physics and solid mechanics. Here, we explore key operations and
their physical significance.

5.1 Introduction to Scalars, Vectors, and Matrices


In linear algebra and many areas of mathematics and physics, we often work
with three fundamental types of mathematical objects: scalars, vectors, and
matrices.

5.1.1 Scalars
A scalar is a single number which can be real or complex. Scalars are typically
denoted by lowercase, non-bold letters.
Example: a = 5, b = −3.14, c = 2 + 3i

5.1.2 Vectors
A vector is an ordered list of numbers, often representing a point or direction
in space. Vectors are typically denoted by lowercase, bold letters.
1  
−1
Example: v = 2, w =
4
3
Vectors can be represented in different ways:
 
v1
 v2 
ˆ Column vector: v =  . 
 
 .. 
vn

ˆ Row vector: vT = v1
 
v2 ··· vn

5.1.3 Matrices
A matrix is a rectangular array of numbers arranged in rows and columns. It
is typically denoted by uppercase, bold letters.

23
 
1 2 3  
−1 0
Example: A = 4 5 6 ,B=

2 3
7 8 9
A matrix with m rows and n columns is called an m × n matrix.

5.1.4 Relationships
ˆ A scalar can be viewed as a 1 × 1 matrix.

ˆ A vector can be viewed as an n × 1 matrix (column vector) or a 1 × n


matrix (row vector).

ˆ Matrices generalise the concept of vectors to multiple dimensions.

These fundamental objects form the basis for more complex operations and
concepts in linear algebra and related fields.

5.2 Matrix Multiplication: When Rows Meet Columns


For compatible matrices A and B, their product AB represents a composition
of linear transformations.

5.2.1 Mathematical Definition


P
(AB)ij = k aik bkj

5.2.2 Properties
ˆ Associative: (AB)C = A(BC)

ˆ Distributive: A(B + C) = AB + AC

ˆ Not commutative in general: AB ̸= BA

5.2.3 The Projection Perspective


Given matrices A ∈ Rm×n and B ∈ Rn×p , their product C = AB can be
interpreted as a series of projections and transformations:
n
X
Cij = Aik Bkj = aTi bj (1)
k=1

where aTi is the i-th row of A and bj is the j-th column of B.


This can be visualised as:

1. Each column of B represents a vector in Rn .


2. Each row of A represents a direction in Rn .

24
3. The (i, j)-th element of C is the projection of the j-th column of B onto
the direction of the i-th row of A.

Geometrically, if we consider A as a linear transformation:

A : Rn → Rm (2)
Then AB can be interpreted as applying this transformation to each column
of B:

(AB)j = A(Bj ) (3)


where (AB)j and Bj are the j-th columns of AB and B respectively.
This projection interpretation provides insight into how matrix multiplica-
tion transforms space and how it can be used in applications such as linear
regression, coordinate transformations, and data compression.

5.2.4 Physical Interpretation


In physics and solid mechanics:
ˆ Often represents the composition of physical processes or transformations.

ˆ In rigid body dynamics, rotation matrices multiply to give composite ro-


tations.

ˆ In continuum mechanics, σ = Cε relates strain to stress through the


constitutive tensor C.

5.3 Trace: Sprinting Down the Diagonal


The trace of a square matrix A, denoted tr(A), is the sum of its diagonal
elements.

5.3.1 Mathematical Definition


P
tr(A) = i aii

5.3.2 Properties
ˆ tr(A + B) = tr(A) + tr(B)

ˆ tr(AB) = tr(BA)

ˆ tr(A) = i λi , where λi are eigenvalues of A


P

25
5.3.3 Physical Interpretation
In physics and solid mechanics:
ˆ Often represents the sum of principal components or invariants of a system.

ˆ In stress analysis, tr(σ) = σ11 + σ22 + σ33 gives the hydrostatic stress.

ˆ In quantum mechanics, tr(ρ) = 1 for a density matrix ρ, representing


conservation of probability.

5.4 Determinant: Crushing Numbers into One


The determinant of a square matrix A, denoted det(A) or |A|, is a scalar value
that provides information about the matrix’s invertibility and the volume scaling
of its linear transformation.

5.4.1 Mathematical Definition


 
a b
For a 2x2 matrix: det = ad − bc
c d

5.4.2 Properties
ˆ det(AB) = det(A) det(B)

ˆ det(AT ) = det(A)

ˆ det(A−1 ) = 1
det(A)

5.4.3 Physical Interpretation


In physics and solid mechanics:

ˆ Represents volume change in transformations.

ˆ In continuum mechanics, det(F) > 0 for the deformation gradient F en-


sures no material interpenetration.
ˆ In thermodynamics, the Jacobian determinant appears in a change of vari-
ables for multiple integrals.

These fundamental operations form the backbone of linear algebra applica-


tions in physics and solid mechanics, providing powerful tools for analysing and
solving complex problems in these fields.

26
Figure 7: Plane with Normal Vector

27
5.5 Normal Vectors: VIP Physics Vector
There are two common methods to compute a normal vector on an arbitrar-
ily oriented plane: the cross product method and the trigonometric method.
Consider the plane shown in Figure 7.
Method 1: Cross Product
Given three non-collinear points p1 (x1 , y1 , z1 ), p2 (x2 , y2 , z2 ), and p3 (x3 , y3 , z3 )
on the plane:
1. Calculate two vectors on the plane: v = p2 −p1 = (x2 −x1 , y2 −y1 , z2 −z1 )
w = p3 − p1 = (x3 − x1 , y3 − y1 , z3 − z1 )
2. Compute the normal vector n usingthe cross product: n = v × w 
(y2 − y1 )(z3 − z1 ) − (z2 − z1 )(y3 − y1 )
3. The resulting normal vector is: n =  (z2 − z1 )(x3 − x1 ) − (x2 − x1 )(z3 − z1 ) 
(x2 − x1 )(y3 − y1 ) − (y2 − y1 )(x3 − x1 )
Method 2: Trigonometric
Alternatively, we can use trigonometry to compute the normal vector:
1. Define the plane using the general equation: ax + by + cz + d = 0
2. The normal vector is parallel to (a, b, c)
3. We can calculate a, b, and c using the following: a = cos θ cos ϕ b =
cos θ sin ϕ c = sin θ
Where θ is the angle between the normal vector and the xy-plane, and ϕ is
the azimuthal angle in the xy-plane from the x-axis.
4. The resulting normal vector is: n = (a, b, c) = (cos θ cos ϕ, cos θ sin ϕ, sin θ)
n
For both methods, to get a unit normal vector, normalise n: n̂ = ∥n∥
Figure 7 illustrates these concepts, showing the plane (in blue), the three
points p1 , p2 , and p3 (in red), and the computed unit normal vectors (in green
for cross product method and purple for trigonometric method).

5.6 Matrix Transposition: When Matrices Do the Twist


For a matrix A, its transpose AT is obtained by interchanging its rows and
columns.
For A = (aij ), AT = (aji )

5.6.1 Properties
ˆ (AT )T = A
ˆ (A + B)T = AT + BT
ˆ (AB)T = BT AT

5.6.2 Physical Interpretation


In physics and solid mechanics:
ˆ Transposition often represents a change in perspective or reference frame.
ˆ For stress tensors σ, symmetry (σ = σ T ) implies moment equilibrium.

28
5.7 Matrix Inverse: Matrices That Un-matrix Themselves
For a square matrix A, its inverse A−1 is defined as the unique matrix satisfying:

AA−1 = A−1 A = I
where I is the identity matrix.

5.7.1 Properties
ˆ (A−1 )−1 = A

ˆ (AB)−1 = B−1 A−1

ˆ (AT )−1 = (A−1 )T

5.7.2 Orthogonal Matrices


An orthogonal matrix Q is a square matrix whose columns and rows are or-
thonormal vectors. It has the following properties:

QT Q = QQT = I
This leads to a crucial relationship:

Q−1 = QT
For orthogonal matrices, the inverse is equal to the transpose.

Geometric Interpretation
Orthogonal matrices represent rotations and reflections in Euclidean space.
They preserve lengths and angles between vectors.

Properties of Orthogonal Matrices


ˆ det(Q) = ±1

ˆ The columns (and rows) form an orthonormal basis

ˆ ∥Qx∥ = ∥x∥ for any vector x

5.7.3 Solving Linear Systems and Linear Superposition


For a general invertible matrix A, the solution to Ax = b is:

x = A−1 b

29
5.7.4 Columns of A−1 as Solutions
Examining AA−1 = I column by column:

A[a−1 −1
1 a2 · · · a−1
n ] = [e1 e2 · · · en ]

Each column a−1


i of A−1 is the solution to:

Aa−1
i = ei
where ei is the standard basis vector (columns of I).

5.7.5 Exploiting Linear Superposition


For a general right-hand side b = b1 e1 + b2 e2 + · · · + bn en , the solution is:

x = b1 a−1 −1 −1
1 + b2 a2 + · · · + bn an

This is the principle of linear superposition at work. We’re solving n sys-


tems simultaneously, one for each basis vector, and then combining the results
according to the components of b.

5.7.6 Orthogonal Systems and Linear Superposition


For orthogonal systems where A = Q is orthogonal, the solution simplifies:

x = Q−1 b = QT b
This has several important implications:

1. Computational Efficiency: Computing QT b is much faster and numeri-


cally stable than inverting a general matrix.
2. Geometric Interpretation: The solution is a simple rotation/reflection of
the right-hand side vector.
3. Linear Superposition: The columns of QT are the solutions for each stan-
dard basis vector. We’re simply taking a linear combination of these
columns for a general b.

4. Energy Conservation: In physical systems represented by orthogonal trans-


formations, energy (represented by vector norms) is conserved: ∥x∥ = ∥b∥.

5.7.7 Applications in Physics and Engineering


The relationship between inverse, transpose, and orthogonality has numerous
applications:

ˆ Quantum Mechanics: Unitary matrices (complex orthogonal matrices)


represent quantum operations, with U−1 = U† (conjugate transpose).

30
ˆ Signal Processing: The Discrete Fourier Transform matrix is unitary, al-
lowing efficient computation of inverse transforms.
ˆ Structural Dynamics: Modal analysis often involves orthogonal mode
shapes, simplifying the solution of dynamic equations.
ˆ Computer Graphics: The orthogonal rotation matrices in 3D graphics
allow efficient inverse transformations.
ˆ Data Analysis: Principal Component Analysis (PCA) uses an orthogonal
transformation to convert correlated variables into uncorrelated principal
components.

5.7.8 Generalisation to Non-Square Matrices


For non-square matrices, we can define the Moore-Penrose pseudoinverse A+ :
ˆ For overdetermined systems (more equations than unknowns):
A+ = (AT A)−1 AT

ˆ For underdetermined systems (fewer equations than unknowns):


A+ = AT (AAT )−1

These formulations maintain the connection between inverse and transpose,


generalising the concept to a broader class of systems.
Understanding the relationship between matrix inverse, transpose, and or-
thogonality provides powerful insights into the behaviour of linear systems and
their applications in physics and engineering. The principle of linear superpo-
sition, combined with the properties of orthogonal systems, offers both com-
putational efficiency and a deeper understanding of the underlying physical
processes.

5.8 Linear Systems: Equation Equinox


Consider a linear system of equations represented by Ax = b, where A is an
m × n matrix, x is an n × 1 vector of unknowns, and b is an m × 1 vector.

5.8.1 Full Rank Square Systems


When m = n and rank(A) = n, we have a full rank square system.

Derivation
For a full rank square system:
1. The system has a unique solution: x = A−1 b
2. The inverse exists because det(A) ̸= 0
3. The columns of A form a basis for Rn

31
Solution
The solution can be found using:
ˆ Direct methods: Gaussian elimination, LU decomposition

ˆ Iterative methods: Jacobi method, Gauss-Seidel method

5.8.2 Overdetermined Systems


When m > n, we have an overdetermined system. These systems typically arise
in data fitting problems.

Derivation
For an overdetermined system:

1. There are more equations than unknowns


2. An exact solution may not exist

3. We seek a solution that minimizes the error ∥Ax − b∥2

Least Squares Therapy: Helping Overdetermined Systems Find Their


Best Fit
Minimize:
E = ∥Ax − b∥2 = (Ax − b)T (Ax − b)
Differentiating with respect to x and setting to zero:
∂E
= 2AT (Ax − b) = 0
∂x
This leads to the normal equations:

AT Ax = AT b
The least squares solution is:

x = (AT A)−1 AT b

Underdetermined Systems
When m < n, we have an underdetermined system. These systems typically
have infinitely many solutions.

32
Derivation
For an underdetermined system:

1. There are fewer equations than unknowns


2. The system has either no solution or infinitely many solutions
3. If solutions exist, they form an affine subspace of Rn

5.8.3 The Pseudoinverse Chronicles: When Matrices Pretend to


Have Inverses
Among the infinite solutions, we often seek the one with the minimum norm.
This can be found using the Moore-Penrose pseudoinverse:

x = A+ b = AT (AAT )−1 b
This solution minimizes ∥x∥2 subject to Ax = b.

5.8.4 General Case: Singular Value Decomposition (SVD)


For any matrix A, we can use SVD:

A = UΣVT
where U and V are orthogonal matrices and Σ is a diagonal matrix of
singular values.
The pseudoinverse is then:

A+ = VΣ+ UT
where Σ+ is formed by reciprocating the non-zero singular values and trans-
posing.
This approach provides a unified framework for solving all types of linear
systems, handling issues of rank deficiency, and providing least squares solutions
when necessary.

5.9 Eigenvalues and Eigenvectors


For a square matrix A ∈ Rn×n , an eigenvector v ∈ Rn and its corresponding
eigenvalue λ ∈ R satisfy:

Av = λv
This equation tells us that when A operates on v, it only scales v by a factor
λ, without changing its direction.

33
5.9.1 Derivation and Interpretation
Let’s multiply both sides of the eigenvalue equation by vT :

vT Av = λvT v
Now, if we normalize v such that ∥v∥ = 1, then vT v = 1, and we get:

vT Av = λ
This result has a profound interpretation: the eigenvalue λ is the value of
the quadratic form vT Av when v is the corresponding eigenvector.

5.9.2 Maximization Property


Eigenvectors are the vectors that maximise or minimise the quadratic form
vT Av subject to the constraint ∥v∥ = 1. To see this, consider the Lagrangian:

L(v, µ) = vT Av − µ(vT v − 1)
where µ is a Lagrange multiplier. Setting the gradient to zero:
∂L
= 2Av − 2µv = 0
∂v
This gives us Av = µv, which is precisely the eigenvalue equation with
µ = λ.

5.9.3 Orthogonality of Eigenvectors


For a symmetric matrix A, eigenvectors corresponding to distinct eigenvalues
are orthogonal. To prove this, consider two eigenvectors v1 and v2 with eigen-
values λ1 and λ2 :

λ1 v1T v2 = (Av1 )T v2 = v1T AT v2 = v1T Av2


= v1T (λ2 v2 ) = λ2 v1T v2

If λ1 ̸= λ2 , this implies v1T v2 = 0, i.e., v1 and v2 are orthogonal.

5.9.4 Physical Interpretation


In physical systems:

ˆ Eigenvectors often represent principal axes or modes of the system.


ˆ Eigenvalues typically represent these modes’ strength, frequency, or im-
portance.
ˆ The orthogonality of eigenvectors for symmetric matrices reflects the in-
dependence of these modes in many physical systems.

34
For example, in structural mechanics:
ˆ Eigenvectors of the stiffness matrix represent mode shapes of vibration.
ˆ Corresponding eigenvalues give the squared natural frequencies of these
modes.
ˆ The orthogonality of eigenvectors means these modes can be excited in-
dependently.

5.9.5 Diagonalisation and Principal Component Analysis


For a symmetric matrix A, we can write:

A = QΛQT
where Q is the matrix of eigenvectors and Λ is a diagonal matrix of eigenval-
ues. This decomposition is the basis for Principal Component Analysis (PCA),
where:

ˆ Eigenvectors represent the data’s principal components or directions of


maximum variance.
ˆ Eigenvalues represent the amount of variance explained by each principal
component.

This connection between linear algebra and data analysis demonstrates the
far-reaching implications of eigenvector theory.

5.10 Positive Definiteness: Physical Connection


A symmetric matrix A is positive definite if for all non-zero vectors x:

xT Ax > 0
In the context of elasticity, the strain energy density U is given by:
1 T
U= ε Cε
2
where ε is the strain vector and C is the elasticity tensor. For the strain
energy to be physically meaningful, C must be positive definite.

5.10.1 Applications in Physics


ˆ In continuum mechanics, the stress and strain tensors are represented as
symmetric matrices, with their eigenvalues and eigenvectors giving prin-
cipal stresses/strains and their directions.
ˆ In vibration analysis, the eigenvectors of the stiffness matrix represent the
mode shapes, while the eigenvalues give the natural frequencies.

35
ˆ In quantum mechanics, the Hamiltonian operator is represented as a Her-
mitian matrix. Its eigenvectors are the stationary states, and eigenvalues
are the corresponding energies.

6 Matrix Personality Types: Find Your Matrix


Soulmate
To grasp the language of physics,
one must first master the
dictionary of matrices.

Author

6.1 Diagonal Matrices: When You’re Too Lazy to Fill in


the Blanks
A diagonal matrix is a square matrix with zero off-diagonal elements. Formally,
for an n × n diagonal matrix D:
(
di if i = j
dij =
0 if i ̸= j
It can be represented as:
 
d1 0 ··· 0
0 d2 ··· 0
D=.
 
.. .. .. 
 .. . . .
0 0 ··· dn
Despite their simple structure, diagonal matrices are fundamental in linear
algebra and its applications. Their special properties make them invaluable in
simplifying complex problems, improving computational efficiency, and provid-
ing insights into the behaviour of linear systems across various disciplines.

6.1.1 Properties
Diagonal matrices possess several important properties:

1. Determinant: The determinant is the product of the diagonal elements:


n
Y
det(D) = di
i=1

36
2. Inverse: If all diagonal elements are non-zero, the inverse is also diagonal:
 
1/d1 0 ··· 0
 0 1/d2 · · · 0 
D−1 =  .
 
. .
. . . .. 
 . . . . 
0 0 · · · 1/dn

3. Eigenvalues: The eigenvalues of a diagonal matrix are its diagonal ele-


ments.
4. Eigenvectors: The standard basis vectors are eigenvectors of any diago-
nal matrix.
5. Matrix multiplication: Multiplication of a vector by a diagonal matrix
scales each component:

Dx = [d1 x1 , d2 x2 , . . . , dn xn ]T

6.1.2 Special Types of Diagonal Matrices


ˆ Identity matrix: All diagonal elements are 1.

ˆ Scalar matrix: All diagonal elements are equal.

ˆ Zero matrix: All diagonal elements are 0.

6.1.3 Applications
Diagonal matrices play crucial roles in various fields:

Linear Algebra and Matrix Theory


ˆ Diagonalization: A matrix A is diagonalizable if A = PDP−1 , where
D is diagonal.
ˆ Spectral theorem: Symmetric matrices are diagonalisable by orthogonal
matrices.

Numerical Linear Algebra


ˆ Preconditioning: Diagonal matrices are used as simple preconditioners
in iterative methods.
ˆ Scaling: Diagonal matrices can scale rows or columns to improve numer-
ical stability.

37
Physics and Engineering
ˆ Moment of inertia tensor: In the principal axis system, the moment
of inertia tensor is diagonal.
ˆ Stress tensor: In principal stress directions, the stress tensor becomes
diagonal.
ˆ Stiffness matrices: In some simple structural problems, stiffness matri-
ces can be diagonal.

Statistics and Data Science


ˆ Covariance matrices: Diagonal covariance matrices represent uncorre-
lated variables.
ˆ Principal Component Analysis (PCA): The transformed covariance
matrix in PCA is diagonal.

Control Theory
ˆ Decoupled systems: Diagonal state-space matrices represent decoupled
systems.
ˆ Modal analysis: The modal matrix diagonalises the system matrix in
vibration analysis.

6.1.4 Computational Advantages


Working with diagonal matrices offers several computational benefits:

1. Storage efficiency: Only n elements need to be stored instead of n2 .


2. Computational efficiency: Many operations (e.g., multiplication, in-
version) are O(n) instead of O(n3 ).
3. Parallelisation: Operations on diagonal matrices are easily parallelis-
able.

6.1.5 Diagonalisation and Eigendecomposition


The process of diagonalisation is closely related to the eigendecomposition of a
matrix. For a diagonalisable matrix A:

A = PDP−1
where D is a diagonal matrix of eigenvalues and P is a matrix whose columns
are the corresponding eigenvectors.
This decomposition is particularly useful for:
ˆ Computing matrix powers efficiently: Ak = PDk P−1

38
ˆ Solving systems of differential equations
ˆ Analysing the long-term behaviour of dynamical systems

6.2 Symmetric Matrices: Where Every Off-Diagonal Ele-


ment Has a Twin
A symmetric matrix S is a square matrix that is equal to its transpose:

S = ST
In component form, this means sij = sji for all i and j.
Understanding symmetric matrices and their properties is crucial in many
areas of physics, engineering, and data science. Their special properties, particu-
larly with regard to eigenvalues and eigenvectors, make them invaluable in mod-
eling many physical and statistical phenomena. The connection to quadratic
forms further extends their applicability, making them a cornerstone of many
mathematical and scientific disciplines.

6.2.1 Properties
Symmetric matrices possess several important properties:

1. Real eigenvalues: All eigenvalues of a symmetric matrix are real.


2. Orthogonal eigenvectors: Eigenvectors corresponding to distinct eigen-
values are orthogonal.
3. Diagonalisability: Every symmetric matrix is diagonalisable.
4. Spectral theorem: S = QΛQT , where Q is orthogonal and Λ is diago-
nal.
5. Quadratic forms: xT Sx is always real for any real vector x.
6. Multiplicative property: The product of two symmetric matrices is
symmetric if and only if they commute.

6.2.2 Geometric Interpretation


Symmetric matrices represent self-adjoint transformations. In 2D and 3D, they
can represent:
ˆ Scaling along orthogonal axes
ˆ Reflections
ˆ Combinations of the above
The eigenvectors of a symmetric matrix define the principal axes of the
transformation.

39
6.2.3 Examples
1. Covariance matrix: C = E[(X − µ)(X − µ)T ]
 
Ixx −Ixy −Ixz
2. Inertia tensor: I = −Ixy Iyy −Iyz 
−Ixz −Iyz Izz
 
σxx τxy τxz
3. Stress tensor: σ =  τxy σyy τyz 
τxz τyz σzz
4. Adjacency matrix of an undirected graph

6.2.4 Applications
Symmetric matrices play crucial roles in various fields:

1. Mechanics:
ˆ Stress and strain tensors in continuum mechanics
ˆ Moment of inertia tensors in rigid body dynamics

2. Finite Element Method:


ˆ Stiffness and mass matrices are symmetric

3. Statistics and Data Science:

ˆ Covariance matrices in multivariate statistics


ˆ Correlation matrices in data analysis
ˆ Gram matrices in machine learning

4. Quantum Mechanics:

ˆ Hamiltonian operators for time-independent systems


ˆ Density matrices for mixed states

5. Optimisation:
ˆ Hessian matrices in multivariate calculus
ˆ Quadratic programming problems

6. Graph Theory:
ˆ Laplacian matrices of graphs
ˆ Adjacency matrices of undirected graphs

40
6.2.5 Computational Advantages
Working with symmetric matrices offers several computational benefits:

1. Reduced storage: Only the upper (or lower) triangular part needs to
be stored.
2. Efficient eigenvalue algorithms: Methods like Jacobi rotations con-
verge faster for symmetric matrices.
3. Numerical stability: Many algorithms for symmetric matrices are more
stable than their general counterparts.
4. Guaranteed real eigenvalues: Simplifies many calculations and inter-
pretations.

6.2.6 Relation to Quadratic Forms


For a symmetric matrix S, the quadratic form Q(x) = xT Sx is of particular
importance:

ˆ If Q(x) > 0 for all non-zero x, S is positive definite.


ˆ If Q(x) ≥ 0 for all x, S is positive semi-definite.
ˆ The signature of S (number of positive, negative, and zero eigenvalues)
determines the geometric properties of the quadratic form.

This relationship between symmetric matrices and quadratic forms is funda-


mental in many areas, including optimisation, mechanics, and quantum physics.
Understanding symmetric matrices and their properties is crucial in many
physics, engineering, and data science areas. Their special properties, particu-
larly with regard to eigenvalues and eigenvectors, make them invaluable in mod-
elling many physical and statistical phenomena. The connection to quadratic
forms extends their applicability, making them a cornerstone of many mathe-
matical and scientific disciplines.

6.3 Orthogonal Matrices: Perpendicular Perfectionists


An orthogonal matrix Q is a real square matrix whose transpose is equal to its
inverse:

QT Q = QQT = I
where I is the identity matrix.
Understanding orthogonal matrices and their properties is crucial in many
areas of mathematics, physics, and engineering. Their ability to preserve ge-
ometric properties while allowing efficient computation makes them invaluable
tools in both theoretical analysis and practical applications, from 3D graphics
to quantum mechanics.

41
6.3.1 Properties
Orthogonal matrices possess several important properties:

1. Inverse equals transpose: Q−1 = QT


2. Columns (and rows) form an orthonormal basis: qTi qj = δij
3. Preserve inner products: (Qx)T (Qy) = xT y

4. Preserve norms: ∥Qx∥ = ∥x∥


5. Determinant is ±1: det(Q) = ±1
6. Eigenvalues have unit magnitude: |λi | = 1 for all eigenvalues λi
7. Product of orthogonal matrices is orthogonal

Orthogonal matrices have a profound geometric interpretation, which can


be visualised using cube rotation. This perspective provides an intuitive under-
standing of how orthogonal matrices preserve angles and distances.

6.3.2 Geometric Interpretation


Orthogonal matrices represent rigid transformations in Euclidean space:
ˆ Rotations (when det(Q) = 1)

ˆ Reflections (when det(Q) = −1)

ˆ Combinations of rotations and reflections

They preserve distances, angles, and orientation (for rotations). Consider


a unit cube in 3D space. The normal vectors to the faces of this cube form
an orthonormal basis, which can be represented as the columns of the identity
matrix:
 
1 0 0
I = 0 1 0
0 0 1
When we rotate this cube, the normal vectors to its faces also rotate. These
rotated normal vectors form the columns of an orthogonal matrix.
Figure 8 illustrates this concept. The left cube shows the original orientation
with normal vectors aligned with the coordinate axes. The right cube shows a
rotation, where the normal vectors have been transformed. These transformed
vectors form the columns of the orthogonal rotation matrix.

42
Figure 8: Visualisation of cube rotation and orthogonal matrix

6.3.3 Example: Rotation Matrix


For a rotation of θ = π/4 (45 degrees) around the z-axis, the orthogonal rotation
matrix is:
   
cos(π/4) − sin(π/4) 0 0.707 −0.707 0
Q =  sin(π/4) cos(π/4) 0 ≈ 0.707 0.707 0
0 0 1 0 0 1
The columns of this matrix represent the rotated normal vectors of the cube
faces.

6.3.4 Construction from Normal Vectors


Orthogonal matrices can be constructed by using normalised vectors as columns.
Let v1 , v2 , v3 be three unit vectors in R3 . An orthogonal matrix Q can be
formed as:

Q = [v1 v2 v3 ]
where v1 , v2 , v3 are mutually orthogonal.

Example 1: Identity Orthogonal Matrix


Using the standard basis vectors, we can construct the identity matrix, which
is trivially orthogonal:
 
1 0 0
Q1 = 0 1 0
0 0 1

43
Figure 9: Visualisation of the Identity Orthogonal Matrix

Figure 9 shows the column vectors of Q1 , which align with the standard
basis vectors.

Example 2: Non-trivial Orthogonal Matrix


We can also construct a non-trivial orthogonal matrix from non-orthogonal vec-
tors by normalising them:
 
0.7071 −0.4082 0.5774
Q2 ≈ 0.7071 0.4082 −0.5774
0 0.8165 0.5774
Figure 10 shows the column vectors of Q2 , which are mutually orthogonal
but not aligned with the standard basis vectors.
In both cases, we can verify that QT Q = I, demonstrating the orthogonality
property.

6.3.5 Properties Illustrated


This geometric interpretation highlights several key properties of orthogonal
matrices:

1. Preservation of orthonormality: The normal vectors remain perpen-


dicular and of unit length after rotation.

44
Figure 10: Visualisation of a Non-trivial Orthogonal Matrix

2. Preservation of distances: The cube’s edges remain the same length,


illustrating how orthogonal transformations preserve distances.
3. Preservation of angles: The right angles of the cube are maintained,
showing that orthogonal matrices preserve angles between vectors.

6.3.6 Applications
This geometric understanding of orthogonal matrices is crucial in various fields:

ˆ Computer Graphics: For implementing rotations in 3D rendering.

ˆ Robotics: In describing the orientation of robotic arms or autonomous


vehicles.
ˆ Quantum Mechanics: Where unitary (complex orthogonal) matrices
represent rotations in Hilbert space.
ˆ Data Analysis: In techniques like Principal Component Analysis, or-
thogonal transformations are used to find uncorrelated variables.

By visualising orthogonal matrices as rotations of a cube, we gain a powerful


intuition for their properties and applications in various domains of science and
engineering.

45
6.3.7 Examples
 
cos θ − sin θ
1. 2D rotation matrix: Q =
sin θ cos θ
 
cos θ − sin θ 0
2. 3D rotation matrix (around z-axis): Q =  sin θ cos θ 0
0 0 1

3. Householder reflection: Q = I − 2vvT where ∥v∥ = 1

4. Permutation matrices

6.3.8 Eigenvectors of Symmetric Matrices


In solid mechanics, many important matrices are symmetric. The eigenvectors
of symmetric matrices have special properties:

1. Eigenvectors corresponding to distinct eigenvalues are orthogonal.


2. The eigenvalues are all real.
3. The matrix of eigenvectors is orthogonal.

This relationship between symmetric matrices and orthogonal matrices is


fundamental in solid mechanics.

6.3.9 Stress Tensors in Solid Mechanics


In solid mechanics, the stress state at a point is represented by the stress tensor,
a symmetric 3x3 matrix:
 
σxx τxy τxz
σ =  τxy σyy τyz 
τxz τyz σzz
The eigenvectors of the stress tensor represent the principal stress directions,
and the eigenvalues are the principal stresses.

6.3.10 Example: Principal Stresses


Consider the following stress tensor (units in MPa):
 
100 30 0
σ =  30 50 0
0 0 0
Figure 11 shows the principal stress directions (eigenvectors) and magnitudes
(eigenvalues) for this stress tensor. The orthogonal nature of the principal
directions is clearly visible.

46
Figure 11: Visualisation of Principal Stress Directions

6.3.11 Stiffness Matrices in Solid Mechanics


In linear elasticity, the stiffness matrix relates force to displacement. For an
isotropic material, the stiffness matrix is symmetric. The eigenvectors of this
matrix represent the principal modes of deformation.

6.3.12 Significance in Solid Mechanics


The orthogonality of eigenvectors in these symmetric matrices has important
physical interpretations:

ˆ In stress analysis, it means that the principal stresses act in mutually


perpendicular directions.
ˆ In vibration analysis, it leads to the concept of normal modes, where the
structure can vibrate in independent, orthogonal patterns.

ˆ In finite element analysis, it allows for efficient solution methods and pro-
vides insight into the behaviour of structures under load.

The connection between symmetric matrices (representing physical proper-


ties or states) and orthogonal matrices (representing their eigenvectors) is a
powerful tool for understanding and analysing mechanical systems.

47
6.3.13 Applications
Orthogonal matrices play crucial roles in various fields:
1. Computer Graphics:
ˆ Representing 3D rotations in rendering and animation
ˆ Implementing camera transformations
2. Robotics and Control:
ˆ Describing orientation of rigid bodies
ˆ Implementing attitude control in spacecraft
3. Signal Processing:
ˆ Orthogonal transformations like DCT (Discrete Cosine Transform)
ˆ Wavelet transforms
4. Quantum Mechanics:
ˆ Representing spin rotations
ˆ Describing symmetry operations in molecular systems
5. Data Analysis:
ˆ Principal Component Analysis (PCA)
ˆ Factor analysis in psychometrics

6.3.14 Computational Advantages


Working with orthogonal matrices offers several computational benefits:
1. Efficient inversion: Q−1 = QT
2. Numerical stability: Orthogonal transformations do not amplify nu-
merical errors
3. Preservation of conditioning: The condition number of an orthogonal
matrix is always 1
4. Efficient storage: For special orthogonal matrices (e.g., Givens rota-
tions), only a few parameters need to be stored

6.3.15 Relation to Linear Systems


For a linear system Qx = b where Q is orthogonal:

x = Q−1 b = QT b
This means that solving the system is as simple as a matrix-vector multipli-
cation without the need for more complex linear algebra operations.

48
6.3.16 Orthogonal Decompositions
Orthogonal matrices are fundamental in many matrix decompositions:

ˆ QR Decomposition: A = QR, where Q is orthogonal and R is upper


triangular
ˆ Singular Value Decomposition (SVD): A = UΣVT , where U and
V are orthogonal

ˆ Eigendecomposition of Symmetric Matrices: S = QΛQT , where Q


is orthogonal and Λ is diagonal

These decompositions are powerful tools in numerical linear algebra, data


analysis, and various engineering applications.

6.4 Unitary Matrices: The Quantum Leap of Linear Trans-


formations
A unitary matrix U is a complex square matrix whose conjugate transpose is
equal to its inverse:

U∗ U = UU∗ = I
where U∗ denotes the conjugate transpose (also written as U† ), and I is the
identity matrix.
Understanding unitary matrices and their properties is crucial in many ar-
eas of physics and engineering, particularly in quantum mechanics and signal
processing. Their ability to preserve inner products and norms while allowing
efficient computation makes them invaluable tools in both theoretical analysis
and practical applications.

6.4.1 Properties
Unitary matrices possess several important properties:

1. Inverse equals conjugate transpose: U−1 = U∗


2. Columns (and rows) form an orthonormal basis: ⟨ui , uj ⟩ = δij
3. Preserve inner products: ⟨Ux, Uy⟩ = ⟨x, y⟩

4. Preserve norms: ∥Ux∥ = ∥x∥


5. Determinant has unit magnitude: | det(U)| = 1
6. Eigenvalues have unit magnitude: |λi | = 1 for all eigenvalues λi

49
6.4.2 Relation to Orthogonal Matrices
Unitary matrices are the complex generalisation of real orthogonal matrices.
When the entries are real, a unitary matrix is orthogonal.

6.4.3 Geometric Interpretation


In complex vector spaces, unitary matrices represent rotations and reflections.
They preserve the Hermitian inner product, analogous to how orthogonal ma-
trices preserve the Euclidean inner product in real spaces.

6.4.4 Examples
 
cos θ − sin θ
1. Rotation matrices: In 2D, U =
sin θ cos θ
 
0 1
2. Pauli matrices: e.g., σx =
1 0
 
1 1 1
3. Hadamard matrix: H = √2
1 −1

4. Fourier transform matrix: Fjk = √1 e−2πijk/n


n

6.4.5 Applications
Unitary matrices play crucial roles in various fields:
1. Quantum Mechanics:
ˆ Unitary matrices represent Quantum gates.
ˆ Time evolution of closed quantum systems is unitary.
2. Signal Processing:
ˆ The Discrete Fourier Transform (DFT) is unitary, facilitating efficient
computation of inverse transforms.
ˆ Unitary transforms are used in data compression algorithms.
3. Numerical Linear Algebra:
ˆ QR decomposition involves unitary matrices.
ˆ Singular Value Decomposition (SVD) uses unitary matrices.
4. Control Theory:
ˆ State-space models often involve unitary state transition matrices for
stable systems.
5. Cryptography:
ˆ Some cryptographic protocols use unitary operations.

50
6.4.6 Computational Advantages
Working with unitary matrices offers several computational benefits:

1. Efficient inversion: U−1 = U∗


2. Numerical stability: Unitary transformations do not amplify numerical
errors.

3. Preservation of conditioning: The condition number of a unitary ma-


trix is always 1.

6.4.7 Relation to Linear Superposition


For a linear system Ux = b where U is unitary:

x = U−1 b = U∗ b
This means that the solution x is a linear combination of the columns of U∗ ,
with coefficients given by the components of b. This property, combined with
the orthonormality of the columns, makes unitary transformations particularly
useful in analysing and manipulating complex systems.

6.5 Positive Definite Matrices: Energy Guardians of Sta-


ble Systems
A symmetric real matrix A is positive definite if for any non-zero vector x, the
quadratic form xT Ax is strictly positive. In mathematical notation:

xT Ax > 0 for all non-zero x


This seemingly abstract definition has profound implications in physics and
engineering, particularly in energy and stability. Positive-definite matrices are
mathematical constructs and guardians of energy positivity and stability in
physical systems. They ensure that our mathematical models behave in ways
consistent with the fundamental principles of physics, such as the non-negativity
of energy and the tendency of systems to seek stable equilibrium states. Under-
standing positive definiteness is crucial for anyone working with physical models,
as it bridges abstract linear algebra and tangible physical reality.

6.5.1 Physical Interpretation


1. Energy Representation: In many physical systems, the quadratic form
xT Ax represents energy. For instance:
ˆ In mechanics, it might represent kinetic energy ( 21 mv 2 )
ˆ In electrostatics, it could represent electric field energy
ˆ In elasticity theory, it often represents strain energy

51
2. Stability Guarantee: The positive definiteness ensures that the system’s
energy is always positive, which is a fundamental requirement for stable
physical systems. Negative energy would imply an unstable or unphysical
state.
3. Potential Energy Surfaces: In molecular dynamics and quantum chem-
istry, positive definite matrices often represent the curvature of potential
energy surfaces near equilibrium points, ensuring the local stability of
molecular configurations.

6.5.2 Examples in Physics


1. Harmonic Oscillator: The potential energy of a harmonic oscillator is
V = 12 kx2 , where k is the spring constant. The positivity of k ensures the
system always tends to return to its equilibrium position.
2. Moment of Inertia Tensor: In rigid body dynamics, the moment of
inertia tensor I is positive definite. This ensures that rotational kinetic
energy 12 ω T Iω is always positive for any angular velocity ω.
3. Elastic Stiffness Matrix: The elastic stiffness matrix C is positive def-
inite in continuum mechanics. This guarantees that strain energy 21 εT Cε
is always positive for any strain ε.

6.5.3 Mathematical Properties with Physical Significance


1. All eigenvalues are positive:
ˆ Physically, this means all system modes have positive energy.
ˆ In vibration analysis, it ensures all natural frequencies are real.

2. Determinant is positive:
ˆ This property is related to the volume expansion in phase space,
which is crucial in statistical mechanics.

3. Cholesky decomposition:
ˆ A = LLT , where L is lower triangular.
ˆ This decomposition is useful in simulating correlated random vari-
ables in statistical physics.
4. Convex optimisation:

ˆ Positive definite matrices guarantee a unique minimum in optimisa-


tion problems, often related to finding equilibrium states in physical
systems. [3]

52
6.5.4 Importance in Numerical Methods
1. Stability in iterative methods:
ˆ In finite element analysis, a positive definite stiffness matrix ensures
convergence of iterative solvers.

2. Condition number:
ˆ The ratio of largest to smallest eigenvalue affects numerical stability
in simulations of physical systems.

6.5.5 Analogies and Visualizations


1. Bowl shape:

ˆ The energy surface of a system with a positive definite matrix resem-


bles a bowl, always curving upward from its minimum.
2. Spring network:
ˆ A network of positive stiffness springs creates a stable, positive defi-
nite system.

References
[1] Strang, G. (2006). Linear Algebra and Its Applications. 4th ed. Belmont,
CA: Thomson, Brooks/Cole.
[2] Apostol, T. M. (2007). Calculus, Volume 1: One-Variable Calculus, with
an Introduction to Linear Algebra. 2nd ed. New York: Wiley.
[3] Snyman, J. A., Wilke, D. N. (2018). Practical Mathematical Optimization:
Basic Optimization Theory and Gradient-Based Algorithms. Second Edi-
tion. New York: Springer.

53

You might also like