Understanding Linear Equation Systems
Understanding Linear Equation Systems
1. Introduction
Study of a linear system of equations is classical. First let’s consider a system having
only one equation:
2x + 3y + 4z = 5 (2.1)
Indeterminates x, y, z are referred to as unknowns of the equation. Both (1, 1, 0) and
(−1, 1, 1) satisfy Equation (2.1). In fact, assign whatever real numbers you like to x
and y and find z from (2.1) by substituting the values of x and y (you have assigned).
Geometrically, the collection of all the solutions of Equation (2.1) is a plane in the
(real) space R3 .
Suppose a, b, c, d ∈ R are fixed. Then, the collection of all (α, γ, β) ∈ R3 satisfying
equation
ax + by + cz = d (2.2)
is geometrically a plane in the space R3 if at least one of a, b, c is nonzero. More
generally, suppose, a1 , a2 , . . . , an , b ∈ R. Then, the collection of all (α1 , α2 , . . . , αn ) ∈
Rn satisfying
Xn
ai x i = b (2.3)
i=1
collection of all the solution is called the solution set of the equation.
By a system of linear equations we mean a finite set of linear equations in finitely
many indeterminates. For instance, the following is a system of two linear equations:
2x + 3y + 4z = 5
x + y + z = 2. (2.4)
By a solution of this system we mean a solution of the first equation which is also
a solution of the second equation. If Ai (i = 1, 2) is the set of solutions of the i-th
equation, then the set of solutions of the system is A1 ∩ A2 . In this example, we know
that for each i, Ai is geometrically a plane. Thus the solution of system (2.4) is the
intersection of two planes. We had learnt in school that two planes in R3 intersect
11
12 2. SYSTEM OF LINEAR EQUATIONS
in a line unless they are parallel or conincident. Further, we learn certain adhoc
methods to solve a linear system of equations (applicable to a restricted situations);
for instance, Cramer’s rule.
Suppose F is a field. While reading for the first time you may assume that F = R.
The set of m × n matrices with coefficients from F is denoted by Mm×n (F).
A system of m linear equations in n unknowns x1 , x2 , . . . , xn written as
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
····················· ···
am1 x1 + am2 x2 + · · · + amn xn = bm (2.5)
where ai,j ∈ F for (1 ≤ i ≤ m, 1 ≤ j ≤ n), called coefficients and b1 , b2 , . . . , bm ∈ F,
called constant terms of the system.
System (2.5) can be described by the following matrix equation:
AX = B (2.6)
where
a11 a12 a1n b1 x1
a21 a22 a2n b2 x2
· · · · · · · · · ∈ Mm×n (F), B = · · · ∈ Mm×1 (F), and X = . . .
A=
am1 am2 amn bm xn
is the matrix (or column) of unknowns. A is called the coefficient matrix and B
the matrix (or column) of constants. The matrix (A|B) ∈ Mm×(n+1) (F) obtained by
attaching the column B with A, is called the augmented matrix of the system. By a
solution of AX = B where A and B are as above, we mean an element Y ∈ Mn×1 (F)
such that AY = B. The collection of all the solutions is called the solution set of
AX = B. Thus, the solution set of AX = B is a subset of Mn×1 (F). Observe that
every element of Mn×1 (F) defines an element of Fn uniquely and vice versa. Thus,
often we express a solution of AX = B as an n-tuple rather than an n-column.
We seek solution of these systems in R3 . It is not difficult to verify that system a) has
no solution (subtracting the second equation from the sum of the first and the third
equation, we get 0 = 2); system b) has only one solution, namely, (1, 1, 1) (Cramme’s
rule applicable) and system c) has infinitely many solutions. Indeed, for any real
number λ, the tuple (λ, 3 − 2λ, λ) is a solution of system c). Since Crammer’s rule is
applicable when the number of unknowns and the number of equations are same, and
the determinant of the coefficient matrix is nonzero, we require a method to deal with
a system of linear equations. In fact, in general, we have three types of solution set
to a system of linear equations; namely, an empty set, a singleton set or an infinite
set (if the field is infinite).
Here wewrite matrices
A,X and B for
each system of linear equations above.
1 1 1 3 x1
In a), A = 1 2 3 , B = 6, X = x2 so that
0 1 2 1 x3
1 1 1 | 3
(A|B) = 1 2 3 | 6 .
0 1 2 | 1
1 1 1 3 x1
In b), A = 1 2 3 , B = 6, X = x2 so that
1 1 2 4 x3
1 1 1 | 3
(A|B) = 1 2 3 | 6 .
1 1 2 | 4
1 1 1 3 x1
In c), A = 1 2 3 , B = 6 , X = x2 so that
0 1 2 3 x3
1 1 1 | 3
(A|B) = 1 2 3 | 6 .
0 1 2 | 3
If the coefficient matrix A belongs to Mm×n (F), then the column of unknowns has
n components so that a solution of the system is an n-tuple which is geometrically
a point in Fn . In this case, column B of constants must have m components. If B
is the zero column, then the system (namely, AX = 0) of linear equations is called a
homogenous system. A system which is not homogenous is called a non-homogenous
system. A homogeneous system always has a solution (namely, the zero column, i.e.,
xi = 0 for each i). When scalars are complex numbers geometric description of the
solution set is not what we have when scalars are reals.
14 2. SYSTEM OF LINEAR EQUATIONS
Warning: Before we write the matrix equation we should arrange the unknowns in
the same order in every equation of the system. If an unkown is missing from an
equation the corresponding coefficient is assumed to be zero.
if (i, k)-th coefficient of A ∈ Mm×n (F) is aik and (k, j)-th coefficient of B ∈ Mn×p
is bkj , and AB ∈ Mm×p (F). Recall that addition of matrices is commutative, but
multiplication is not.
A matrix having equal number of rows and column is called a square matrix. Sup-
pose A ∈ Mm×m is a square matrix. Then tr(A) is denotes the sum of the coefficient
in the pricipal diagonal (top-left corner to the bottom-right corner); symbolically,
m
X
tr(A) = ai,i . (2.7)
i=1
where bij = (−1)i+j det(Ai,j ) and Ai,j is the matrix obtained from A by simply remov-
ing the i-th row and j-th column. Quantity det(A) is independent of choice of i in the
definition. Since we assume readers to have studied the concept of determinant, we
will not justify this statement. Another fact is that in the definition of determinant,
role of i and j can be interchanged. For instance,
a b
det = ad − bc
c d
a1,1 a1, 2 a1,3
a2,2 a2,3 a2,1 a2,3
det a2, 1 a2,2 a2,3
= a1,1 · det − a1,2 · det +
a3,2 a3,3 a3,1 a3,3
a3,1 a3,2 a3,3
a2,1 a2,2
a1,3 · det
a3,1 a3,2
= a1,1 (a2,2 a3,3 − a2,3 a3,2 ) − a1,2 (a2,1 a3,3 − a1,3 a3,2 ) +
a1,3 (a2,1 a3,2 − a2,2 a3,1 )
Thus, if you know the effect of ρ on the identity matrix, then you know its effect on
A (namely, by simply premultiplying A by ρ(I)). Readers are suggested to verify this
fact in each case.
A square matrix E of size n is called an elementary matrix if there is an elementary
row operation ρ such that E = ρ(In ), where In is the identity matrix of size n.
Use the above rule successively on a matrix A and get the following two statements
for finitely many elementary row operations ρ1 , ρ2 , . . . , ρs (by induction):
(1) (ρs ◦ · · · ◦ ρ2 ◦ ρ1 )(A) = ρs (I) · · · ρ2 (I)ρ1 (I)A.
(2) (ρs ◦ · · · ◦ ρ2 ◦ ρ1 )(A) = (ρs ◦ · · · ◦ ρ2 ◦ ρ1 )(I)A.
18 2. SYSTEM OF LINEAR EQUATIONS
The RRE matrix row equivalent to A is called row reduced echelon form of A.
0 0 4 1
0 3 0 1
Example 2. We find the RRE form of 0 0 0 0 .
0 4 2 0
0 0 4 1 0 3 0 1
0 3 0 1 0 0 4 1
Apply R3 ↔ R4 . Get 0 4 2 0. Apply R1 ↔ R2 . Get 0 4 2 0.
0 0 0 0 0 0 0 0
0 1 0 1/3
0 0 4 1
Next apply R1 → 1/3R1 . Get 0 4 2 0 . Apply R3 → R3 − 4R1 , get
0 0 0 0
0 1 0 1/3 0 1 0 1/3
0 0 4 1 . Appy R2 → 1/4R2 , get 0 0 1 1/4 . Apply R3 → R3 −
0 0 2 −4/3 0 0 2 −4/3
0 0 0 0 0 0 0 0
0 1 0 1 0 1 0 1
0 0 1 1/4 0 0 1 1/4
0 0 0 −11/6. Apply, R3 → −6/11R3 , get 0 0 0 1 . Apply
2R2 , get
0 0 0 0 0 0 0 0
0 1 0 0
0 0 1 0
R1 → R1 − R3 and R2 → R2 − 1/4R3 get 0 0 0 1, which is RRE.
0 0 0 0
Remark 1. There are more than one ways of getting the row reduced echelon
form of a matrix.
Theorem 9.1. (1) If A and B are row equivalent, then they have the same
rank.
(2) If A is an RRE matrix, then the rank of A is the number of nonzero rows.
(3) The rank of a matrix is the number of nonzero rows of the RRE form of the
matrix.
10. APPLICATION-III OF ROW REDUCTION: SOLVING SYSTEMS OF EQUATIONS 21
0 01 4 1 1 1 1 1
0 310 1 2 1 3 1
Example 4. The rank of
0 is 3. The Rank of
00 0 0 1 0 2 0
0 40 2 1 0 1 −1 1
1 0 1 −1 1
0 1 0 2 0
is two. Student must show that this matrix is row equivalent to
0
0 0 0 0
0 0 0 0 0
which is row reduced echelon (so that the rank of the given matrix is 2).
Remark 2. The rank of a matrix is a very important parameter. We will see its
use in several places in the remainder of the book.