Discrete Mathematics Lecture Notes 2025-2026
Discrete Mathematics Lecture Notes 2025-2026
AY 2025–2026
Discrete Mathematics Page 2
This is the complete set of notes covering all material taught in the course.
2 Probability 31
2.1 Probability Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.2 Conditional Probability and Independence . . . . . . . . . . . . . . . 37
2.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3 Recurrence relations 41
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2 Linear Homogeneous Recurrence Relations . . . . . . . . . . . . . . . 45
3.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Solving Inhomogeneous Recurrence Relations . . . . . . . . . . . . . . 48
3.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4 Further Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3
Discrete Mathematics Page 4
4 Graph Theory 54
4.1 Introduction to Graphs . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Adjacency Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.3 Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.3.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.4 Directed graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.4.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.5 Weighted Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.5.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.6.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.7 Spanning trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.7.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.8 Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . 72
4.8.1 Prim’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.8.2 Kruskal’s algorithm . . . . . . . . . . . . . . . . . . . . . . . . 74
4.8.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.9 Euler Paths and Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.9.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.10 Hamilton Paths and Circuits . . . . . . . . . . . . . . . . . . . . . . . 79
4.10.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
1.1 Sets
Definition 1.1. A set is an unordered collection of objects.
The notation A = {1, 3, 5, 7} means that the set A contains the four elements
1, 3, 5, 7 . We write a ∈ A to denote that a is an element of the set A . We write
a∈ / A to indicate that a does not belong to A . For example, given A = {1, 3, 5, 7}
we have that 5 ∈ A and 6 ∈ / A.
There are a number of ways of denoting sets. Thus for the set of positive even
numbers we could write
A = {2, 4, 6, 8, . . .}
or
A = {x | x is a positive even number} ,
where the vertical bar | is to be read “such that”.
A set cannot contain the same object more than once. Also, the elements in a
set are not ordered so the sets {a, b, c} and {c, a, b} are the same. In general, two
sets are equal if they contain the same elements.
3 ∈ A, {3, 5} ∈ A, 5∈
/ A, {3, 6} ⊂ A, {3} ⊂ A.
5
Discrete Mathematics Page 6
The set that has no elements is called the empty set and is denoted by ∅ . By
convention ∅ ⊂ A for any set A.
Remark 1.2. Note that the set ∅ and the set B = {∅} are different. The single
element of B is the empty set itself.
Example 1.5. Let A = {1, 2, 3} and B = {x, y} . Find the Cartesian product
A×B.
Solution. The Cartesian product A × B is
A × B = { (1, x), (1, y), (2, x), (2, y), (3, x), (3, y) }.
1.1.1 Exercises
A ∩ B := {x | x ∈ A and x ∈ B}.
Page 7 Dr Adrian Turcanu & Dr Mateja Presern
Note that the statement x ∈ A or x ∈ B does not exclude the possibility that
x is an element of both sets. For example, we have
{1, 3, 5, 7} ∪ {2, 5, 7} = {1, 2, 3, 5, 7}
and
{1, 3, 5, 7} ∩ {2, 5, 7} = {5, 7} .
Remark 1.3. It is useful to picture combinations of sets with the aid of Venn dia-
grams. This will be done in the lectures.
Definition 1.8. If A and B are sets, then the difference of A and B is denoted
by A \ B and defined as
A \ B := {x | x ∈ A and x ∈
/ B}.
For example, if A = {1, 3, 5, 7} and B = {4, 5, 6, 7} then A \ B = {1, 3} .
Often, all the sets under consideration are subsets of some larger set U called the
universal set.
Definition 1.9. The complement of a set A is A = U \ A .
Thus A = {x | x ∈
/ A} .
1.2.1 Exercises
− To find the bit string for A from the bit string for A we turn each 1 into a 0
and each 0 into a 1.
For the above example, the bit string for A is 0 1 1 0 0 0 so the bit string for A
is 1 0 0 1 1 1 .
− The k th bit in the bit string for A ∪ B is 1 if either (or both) of the bits in
the k th position of A and B is 1, and is 0 if both bits are 0.
− The k th bit in the bit string for A ∩ B is 1 if both of the bits in the k th
position of A and B are 1 , and is 0 if either (or both) of the two bits is 0.
1.3.1 Exercises
Exercise 1.3.2. Which subsets of a finite universal set U do these bit strings rep-
resent?
(a) the string with all zeros, (b) the string with all ones.
Suppose we want to find a formula for the sum of the first n odd integers:
For n = 1 the sum is 1 = 12 .
For n = 2 the sum is 1 + 3 = 4 = 22 .
For n = 3 the sum is 1 + 3 + 5 = 9 = 32 .
For n = 4 the sum is 1 + 3 + 5 + 7 = 16 = 42 .
Based on the above, it seems reasonable to conjecture that
1 + 3 + 5 + 7 + . . . + (2n − 1) = n2 .
But how does one rigorously prove this?
To start with, let us look at what it means for a statement (or proposition) to
depend on a positive integer n .
1 + 3 + 5 + 7 + . . . + (2n − 1) = n2 .
Then:
Discrete Mathematics Page 10
Example 1.13. Let P (n) be the statement n2 + n is even. Write down P (1) ,
P (3) , P (k) and P (k + 1) .
Solution. P (1) states that 12 + 1 is even. P (3) states that 32 + 3 is even. P (k)
states that k 2 + k is even. P (k + 1) states that (k + 1)2 + k + 1 is even.
Example 1.14. Let P (n) be the statement 50 − 2n ≥ 0 . Write down P (1) , P (5)
and P (30) . Which of them are true?
Solution. P (1) states that 50 − 2 ≥ 0 which is true. P (5) states that 50 − 10 ≥ 0
which is true. P (30) states that 50 − 60 ≥ 0 which is false.
an+1 = 2an − 3 , n ≥ 1 .
Let P (n) be the statement an = 2n−1 + 3 . Write down P (1) , P (2) and P (3) .
Which of them are true?
Solution. P (1) states that a1 = 20 + 3 , that is, a1 = 4 . Hence P (1) is true. P (2)
states that a2 = 21 + 3 , that is, a2 = 5 . P (3) states that a3 = 22 + 3 , that is,
a3 = 7 .
We use an+1 = 2an − 3 to find a2 and a3 : a2 = 2a1 − 3 = 2(4) − 3 so a2 = 5,
and a3 = 2a2 − 3 = 2(5) − 3 so a3 = 7. Hence, both P (2) and P (3) are true.
Page 11 Dr Adrian Turcanu & Dr Mateja Presern
Mathematical induction
Example 1.16. Use Mathematical Induction to prove that for any positive integer
n
1 + 3 + 5 + 7 + . . . + (2n − 1) = n2
Solution. Let P (n) be the statement
1 + 3 + 5 + 7 + . . . + (2n − 1) = n2 .
an+1 = an + 8n , n ≥ 1 .
ak+1 = ak + 8k
= (2k − 1)2 + 8k
= 4k 2 + 4k + 1
= (2k + 1)2 .
(c) Do the inductive step : show that if P (k) is true then P (k + 1) is true.
It is worth stressing that one needs to carry out ALL of the above steps. Indeed,
there are a number of pitfalls in using the principle of mathematical induction as
the next three examples show.
I. It does not suffice to prove only that P (1) is true. Indeed, let P (n) be the
statement: n2 = n . Then P (1) is true. But P (n) is true only when n = 1
or n = 0 and is false for n ≥ 2.
II. It does not suffice to prove only that if P (k) is true then P (k + 1) is true.
Indeed, let P (n) be the statement: n + 1 = n . Assume that P (k) holds so
that k + 1 = k . Adding 1 to each side gives k + 2 = k + 1 so P (k + 1) is
true. Of course, P (1) is false so we cannot get started.
III. Another typical error is to give a proof of P (k) implies P (k + 1) which only
works for some values of k , for example k ≥ 2 . Let P (n) be the statement:
any n students taking the same examination must all score the same mark.
Clearly P (1) is true.
Assume that P (k) is true and let any group of k + 1 candidates be given.
Let A and B be particular candidates and C the remaining group of k − 1
students. Then A and C form a group of k students so by P (k) they all
have the same mark. The same is true of B and C . Thus A and B and the
group C score the same mark so P (k + 1) is true. By induction, P (n) holds
for all n .
The mistake is that the group C must not be empty. If k = 1 then C is
empty and the marks of students from the groups A and C do not coincide.
We have proved that P (k) is true implies that P (k + 1) is true if k ≥ 2 but
there is no way of establishing P (2) which is false.
Page 13 Dr Adrian Turcanu & Dr Mateja Presern
The next example uses the factorial function n! . The number 5! (read five
factorial or factorial 5) means 5 × 4 × 3 × 2 × 1 . Note that 1! = 1 and that by
convention 0! = 1 . The reader should check by calculation that 4! = 24 and use a
calculator to check that 7! = 5040 .
2k+1 = 2 2k
< 2 (k!)
< (k + 1) k!
= (k + 1)!.
1.4.1 Exercises
Exercise 1.4.1. Let P (n) be the statement 3n − 100 ≥ 0 . Write down P (1) ,
P (4) and P (40) . Which of them are true?
Discrete Mathematics Page 14
an+1 = 5an − 8 , n ≥ 1 .
Let P (n) be the statement an = 5n−1 + 2 . Write down P (1) and P (2) . Which of
them are true?
a2n + 12
an+1 = , n≥1
7
where 3 < x < 4. Prove by induction that 3 < an < 4 for all n.
1.5 Relations
Let A and B be sets. A relation between A and B links certain elements of A with
certain elements of B. More formally, a relation from A to B is a subset of A × B.
A relation on the set A is a relation from A to A.
It is not symmetric since, for example, it contains (1, 2) but not (2, 1) .
It is antisymmetric since there is no pair of elements a and b with a ̸= b such
that (a, b) and (b, a) belong to R.
It is transitive. To check this we have to show that if (a, b) and (b, c) belong to
R, then so does (a, c) .
Example 1.22. As another example, let R be the relation on the set of people
consisting of pairs (a, b) where a likes b. Unfortunately it is not symmetric. It is
not antisymmetric or transitive. It is probably not even reflexive.
Example 1.24. Let R be the relation on the set A = {1, 3, 5, 9, 11, 18} defined by
the pairs (a, b) such that a − b is divisible by 4.
Given a set A with an equivalence relation R on it, we can break up all elements in
A into disjoint groups (subsets) such that within each group all elements are related
between themselves but no two elements from two different groups are related. Such
groups of elements are called equivalence classes. For Example 1.24 the equivalence
classes are C1 = {1, 5, 9}, C2 = {3, 11}, C3 = {18}.
Page 17 Dr Adrian Turcanu & Dr Mateja Presern
Example 1.26. Let R = {(1, 1), (1, 2), (1, 3), (2, 3), (3, 4)} be a relation on the set
A = {1, 2, 3, 4}.
Then Rr = {(1, 1), (1, 2), (2, 2), (1, 3), (2, 3), (3, 3), (3, 4), (4, 4)} is the reflexive
closure of R.
Example 1.28. Let R = {(1, 1), (1, 2), (1, 3), (2, 3), (3, 4)} be a relation on the set
A = {1, 2, 3, 4}.
Then Rs = {(1, 1), (1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2), (3, 4), (4, 3)} is the sym-
metric closure of R.
Example 1.30. Let R = {(1, 1), (1, 2), (2, 3), (3, 2)} be a relation on the set A =
{1, 2, 3}.
Then Rt = {(1, 1), (1, 2), (2, 3), (1, 3), (3, 2), (2, 2), (3, 3)} is the transitive closure
of R.
Discrete Mathematics Page 18
1.5.2 Exercises
Exercise 1.5.2. Let R be the relation on the set of real numbers consisting of
pairs (a, b) with a ≤ b. Determine whether the relation is reflexive, symmetric.
antisymmetric, or transitive.
Find:
(a) the reflexive closure of R
(b) the symmetric closure of R
(c) the trasitive closure of R
1.6 Functions
Let X and Y be sets.
Definition 1.31. A function from X to Y is a rule that assigns to each element of
X exactly one element of Y . We write f : X → Y . The set X is called the domain
of f and the set Y is called the codomain of f .
Here are some examples of functions.
Example 1.35. The next example is called the McCarthy 91 function and was
given by John McCarthy, one of the founders of artificial intelligence.
Here f : N → Z is defined for positive integers n by
n − 10 if n ≥ 101
f (n) =
f (f (n + 11)) if n ≤ 100
If we try to find f (1) directly, we put n = 1 in the definition. This shows that f (1)
depends on f (12) which in turn depends on f (23) etc. It seems hopeless to find
f (1) in this way. The trick is to work backwards. Putting n = 100 in the definition
gives
f (100) = f (f (111)) = f (101) = 91
Putting n = 99 in the definition gives
Continuing in this way we can show that f (n) = 91 for all n ≤ 100 .
In Example 1.32 the range of f is the set of all non-negative real numbers:
Range(f ) = {y ∈ R | y ≥ 0}. In Example 1.33 the range is the set Range(f ) =
{3, 22}. For the McCarthy 91 function (Example 1.35) the range is Range(f ) =
{y ∈ N | y ≥ 91}.
Definition 1.36. A function is called surjective if its range is equal to its codomain.
(f ◦ g)(x) = f (g(x)) = f (3x + 4) = 2(3x + 4)2 = 2(9x2 + 24x + 16) = 18x2 + 48x + 32
while
(g ◦ f )(x) = g(f (x)) = g(2x2 ) = 3(2x2 ) + 4 = 6x2 + 4
Remark 1.8. A function that is both injective and surjective is often called bijective.
1.6.1 Exercises
Exercise 1.6.1. For each of the following functions find their range. Determine
whether the function is surjective and whether it is injective.
Exercise 1.6.2. Let the functions f , g and h have R as their domain and codomain,
and be defined as
(
1 if x ≥ 0
f (x) = 4x − 3 , g(x) = x2 + 1 , h(x) = .
0 if x < 0
Find rules for the following functions (with domain and codomain R)
Exercise 1.6.3. Find the inverse function for each of the following functions, or
explain why no inverse exists.
√
(a) g : R → R , g(x) = x2 ,
(b) j : S → S
where S is the set of non-empty finite strings of lower case letters, and j(s) gives
a string obtained by moving the last character to the beginning of the string, e.g.
j(yncrfm) = myncrf.
Discrete Mathematics Page 22
Example 1.41. The mathematics teaching committee requires one student repre-
sentative from either the first year or the second year or the third year. If there
are 500 first year, 300 second year and 100 third year students, how many possible
different representatives are there?
Solution. There are obviously 500 different choices for picking a representative from
the first year students, 300 for the second and 100 for the third year. Since the
students are all different and we have to pick only one representative from either
year we can add the numbers of choices for a representative from each year to obtain
900 different possible representatives.
If there are n(A) ways to do A and, distinct from them, n(B) ways
to do B, then the number of ways to do A or B is n(A) + n(B).
Similarly, one adds n(A), n(B), n(C) when one must do A or B or C;
and so on.
To solve the following example we need to use both of the above rules.
Example 1.43. In a certain computer system a file name must be a string of letters
and digits which is 1,2, or 3 symbols long. The first symbol must be a letter and
the system does not make any distinction between uppercase and lowercase letters.
How many legitimate file names are there?
Solution. A file name either has one symbol, or two symbols or three symbols. In
each case we have 26 choices for the first letter (the 26 letters of the alphabet).
The number of one-symbol names is then 26. The number of 2-symbol names is
26 × 36 = 936 because for the second symbol we can have either a letter or a
digit, 36 = 26 + 10 choices. We multiply the number of choices for the first and
for the second symbol because both choices are done independently. Similarly the
number of three-symbol names is 26 × 36 × 36 = 33696. Summing the numbers for
the three different cases (by the sum rule) we obtain that the system at hand can
accommodate 26 + 936 + 33696 = 34658 different files.
1.7.1 Exercises
Exercise 1.7.1. A binary word is made of 0’s and 1’s. How many binary words are
there with length exactly 7? How many binary words are there with length up to
3?
Exercise 1.7.2. A salesman is to visit 5 towns. Assuming that he goes to one town
after another, visiting all of them once, in how many orders can he make his trip?
Exercise 1.7.3. How many numbers between 1 and 1000 have exactly one 7 in
them?
Discrete Mathematics Page 24
Note that the order in which the selection is made matters. Thus for a set of 5
letters {B, S, W, K, L} the lists {B, W, K}, {W, B, K} are two different permutations
of the 5 given letters taken 3 at a time.
Repetitions are not allowed. Thus for the above example a list {B, B, L} is not
a permutation. We will often say “ordered selection without repetitions” instead of
“permutations” as this is self-descriptive.
Example 1.45. In how many ways can a committee with 10 members select a
chairperson, a treasurer and an administrator?
Solution. Suppose that the first person to be selected is to be the chairperson. There
are 10 different was to do it. Suppose the second person selected is the treasurer for
whom we have 9 choices. Finally, suppose the last person chosen is the administrator,
for whom there are 8 possible people available. By the product rule the number of
selections is 10 × 9 × 8 = 720.
Theorem 1.46. The number of ordered selections of k distinct objects from a set
of n objects is n(n − 1)(n − 2) . . . (n − k + 2)(n − k + 1).
n!
n(n − 1)(n − 2) . . . (n − k + 2)(n − k + 1) = (1.1)
(n − k)!
n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1
0! = 1 .
With this convention the expression on the right hand side in (1.1) gives the same
result for n = k as the expression on the left hand side.
Page 25 Dr Adrian Turcanu & Dr Mateja Presern
Proof of Theorem 1.46. There are n ways to select the first object. After the first
object is selected there are n − 1 objects left and there are respectively n − 1 choices
for the second object. We continue selecting until the last object for which there
are n − k + 1 choices. By the product rule these selections can be made in n(n −
1)(n − 2) . . . (n − k + 1) ways.
Remark 1.10. Observe that using formula (1.1) for k = n we find that the number
of different ways to order n objects is n!.
Consider now a different kind of selections called combinations.
where nk stands for the number of unordered selections. Dividing both sides by k!
For example
5 5 0 5 5 1 4 5 2 3 5 3 2 5 4 1 5 5 0
(x + y) = xy + xy + xy + xy + xy + xy
0 1 2 3 4 5
= y 5 + 5xy 4 + 10x2 y 3 + 10x3 y 2 + 5x4 y + x5 .
Discrete Mathematics Page 26
Example 1.49. In the card game bridge, a hand has 13 cards. There are 52 cards
in a deck. How many different bridge hands are there?
Solution. We have to make an unordered selection of 13 cards out of 52. Repetitions
are not allowed. By the above theorem there are 52 13
= 635013559600 different
bridge hands.
1.8.1 Exercises
Exercise 1.8.1. What is the number of ways in which a subset of one or two
elements can be picked up from a set of n elements?
Exercise 1.8.2. A university department has 12 male professors and 3 female pro-
fessors. It is decided that every committee of professors in this department should
have at least one female member. How many different committees of 12 people can
be formed?
Exercise 1.8.3. The computer science students are entering a team of 5 in the
university relay race. There are 20 undergraduates and 10 postgraduates. How
many teams are possible if
(a)(∗) any student is eligible
Page 27 Dr Adrian Turcanu & Dr Mateja Presern
Exercise 1.8.4. How many ways are there to pick a combination of k distinct
numbers from {1, 2, . . . , n} if 1 and 2 cannot both be picked?
Inclusion-Exclusion Principle
For instance, suppose we want to find the number of cards in an ordinary deck
that are clubs or aces. Let A be the set of cards that are aces and let C be the set
of cards that are clubs.
We have |A| = 4 , |C| = 13 and |A ∩ C| = 1 (the ace of clubs). Then the
Exclusion Principle gives us
|A ∪ C| = |A| + |C| − |A ∩ C| = 4 + 13 − 1 = 16.
Example 1.51. A survey in Scotland finds that 86% of the population have heard
of product A while 80% have heard of product B. Also, 70% of the population
have heard about both products. What percentage of the population have not
heard about either product?
Solution. Let A be the set of people who have heard about product A and define
B similarly. Let N be the population of Scotland.
From the given data, |A| = (0.86)N , |B| = (0.8)N and |A ∩ B| = (0.7)N .
Then
Hence 96% of the population have heard of either product A or product B and 4%
have not heard about either product.
Discrete Mathematics Page 28
In the lectures I will indicate why (1.6) holds by means of a Venn diagram.
and
|A ∪ B ∪ C| = 50 + 33 + 20 − 16 − 10 − 6 + 3 = 74.
Example 1.53. At the University of Scabia all first year computer science students
have to study at least one of the three modules : accounting, ethics and mathematics.
There are 237 first year computer science students. Also, 150 study maths, 120
study accounting, 100 study ethics, 50 study both accounting and maths, 20 study
both accounting and ethics and 70 study both maths and ethics.
(b) Find the number who study accounting and maths but not ethics.
Solution. Let A be the set of students studying accounting, M those doing maths
and E those doing ethics. From the given data,
|A ∩ M | = 50 |A ∩ E| = 20 |E ∩ M | = 70
Also, since everyone is studying at least one of the three modules,
|A ∪ E ∪ M | = 237
Page 29 Dr Adrian Turcanu & Dr Mateja Presern
Using
|A ∪ E ∪ M | = |A| + |E| + |M | − |A ∩ E| − |A ∩ M | − |E ∩ M | + |A ∩ E ∩ M |
X ∩Y =∅ X ∪Y =A∩M
By (1.5) , |X ∪ Y | = |X| + |Y | − |X ∩ Y | so
|A ∩ M | = |A ∩ M ∩ E| + |A ∩ M ∩ E|
Hence
50 = 7 + |A ∩ M ∩ E|
and |A ∩ M ∩ E| = 43 .
1.9.1 Exercises
Exercise 1.9.1. A and B are sets with |A| = 30 and |B| = 20 . Find A ∪ B if
(a) A ∩ B = ∅ ,
(b) |A ∩ B| = 5 ,
(c) B ⊂ A .
Exercise 1.9.2. A , B and C are sets with |A| = |B| = |C| = 100 .
Find |A ∪ B ∪ C| in the following cases.
(a) A = B = C .
(b) A ∩ B = A ∩ C = B ∩ C = ∅ .
(c) |A ∩ B| = |A ∩ C| = |B ∩ C| = 50 and A ∩ B ∩ C = ∅ .
(d) |A ∩ B| = |A ∩ C| = |B ∩ C| = 50 and |A ∩ B ∩ C| = 20 .
Discrete Mathematics Page 30
Exercise 1.9.3. How many integers from 1 to 1000 are divisible by 7 or 11?
Exercise 1.9.5. At the University of Scabia all first year students have to study
at least one of the three modules : computing, physics and mathematics. There
are 1400 first year students. Also, 700 study maths, 600 study physics, 500
study computing, 200 study both physics and maths, 150 study both physics and
computing and 120 study both maths and computing.
(b) Find the number who study computing and maths but not physics.
Chapter 2
Probability
flipping of a coin;
rolling a die;
the top link in a Google search (it will either lead to the desired information
or it won’t);
Definition 2.1. The sample space S for a random experiment is the set of all
possible outcomes of the experiment.
31
Discrete Mathematics Page 32
Example 2.3. Suggest suitable sample spaces, and identify the subset correspond-
ing to the event A, for the following situations:
(a) A coin, which shows heads (H) or tails (T), is tossed three times; A is the event
that the coin shows head twice.
(b) A game of football is played; A is the event that the match is drawn.
(c) A couple have two children; A is the event that both are girls.
Solution. (a) We can take
S = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T }
and
A = {HHT, HT H, T HH}.
(b) One option is to take
S = {(n, m) | n, m are integers ≥ 0}
where n is the first team’s score and m is the second team’s score. In this case
A = {(n, m) | n = m}.
Alternatively we can consider S = {W, D, L} with W meaning the first team
wins, D — it’s a draw, L — first team loses. In this case A = {D}.
Page 33 Dr Adrian Turcanu & Dr Mateja Presern
(c) We can take S = {GG, GB, BG, BB} where, for example, GB means first child
is a girl, second child is a boy. In this case, A = {GG}. Alternatively we can
take S = {0, 1, 2} with each sample point corresponding to number of girls born.
In this case, A = {2}.
Definition 2.4. The probability function is a function P (E) that assigns a value
between 0 and 1 to any event E ⊂ S. The value 0 ≤ P (E) ≤ 1 is called the
probability of event E.
Probability axioms
2. P (∅) = 0, P (S) = 1;
P (A ∪ B) = P (A) + P (B) .
Axiom 3 can be extended to the case where one has more than two events: if the
events E1 , E2 , . . . En are mutually exclusive, that is Ei ∩ Ej = ∅ for any i ̸= j, then
In particular for any finite subset E the probability P (E) is given by the sum of
probabilities assigned to its individual points.
This gives a recipe for specifying and computing a probability function. By
Axiom 2 the sum of the probabilities assigned to all sample points must by equal to
1:
n
X
for S = {1, 2, 3, . . . , n} one has pi = 1, (2.1)
i=1
Example 2.5. Consider again the experiment in which we roll a die. The sample
space is S = {1, 2, 3, 4, 5, 6}. If we think that the die is fair, that is all possibilities
are equally likely, then we should set p1 = p2 = ... = p6 = 16 . Then condition (2.1)
is satisfied.
In general for a finite sample space S with N elements one can define an equiprob-
able probability function by assigning pi = N1 for each sample point i ∈ S. This
means that all outcomes are equally likely.
For any event E we call its complement E = S \ E an event “not E” (any other
outcome but E). The following theorem holds.
The proof uses the same idea as in the proof of Theorem 2.7. You are warmly
encouraged to try and write it down yourself as an exercise!
Example 2.9 (*). If P (A) = 2/3 and P (B) = 3/4 what are the largest and the
smallest values that P (A ∩ B) can take?
Solution. The intersection A ∩ B is a subset of A and also is a subset of B. Thus
its probability has to be smaller than P (A) and smaller than P (B). So P (A ∩ B) ≤
2
3
= P (A) - the smaller of the two. Moreover it may take that value if A ⊂ B in
which case P (A ∩ B) = P (A) = 23 . Thus the largest P (A ∩ B) can be is 2/3. Next
consider the equality: P (A ∩ B) = P (A) + P (B) − P (A ∪ B) = 1712
− P (A ∪ B). The
17 5
largest P (A ∪ B) can be is 1. Thus the smallest P (A ∩ B) can be is 12 − 1 = 12 .
Example 2.10. One often hears that the theory of probability started in the sev-
enteenth century, when a French nobleman, the Chevalier de Méré, proposed the
following problem in 1654 to his friend Pascal: Why is one more likely to obtain a
6 in four throws of a die than to obtain a double 6 in 24 throws of two dice? This
problem is known as de Méré’s paradox. We use the word paradox, because, based
on the fact that there are 6 possible results when we roll a die and 36 possible results
when we roll two dice, some people thought that the two events above should have
the same probability. Indeed, notice that the number of throws, divided by the num-
ber of possible results, is equal to 2/3 in both cases (4/6 = 24/36 = 2/3 = 0.66...).
Nowadays, we can easily compute the probability of each event. We find that the
probability of obtaining at least one 6 in four rolls of a (fair or non-biased) die is
1 − (5/6)4 = 671/1296 ≈ 0.5177, while the probability of getting at least a double
6 in throwing two dice 24 times is 1 − (35/36)24 ≈ 0.4914. One can conclude that
the Chevalier de Méré must have spent a lot of time throwing dice to discover such
a small difference.
Discrete Mathematics Page 36
2.1.1 Exercises
Exercise 2.1.1. In the following experiments decide what are the outcomes and
state if they are equally likely.
(a) A red, blue and a yellow ball are put into a bag and a ball drawn out at random.
(b) Two yellow balls, one red ball and one blue are in a bag. One ball is drawn at
random.
(d) Two children throw a die to see who starts the game. If the die shows a prime
number Joe starts, otherwise Ellen starts.
Exercise 2.1.4. Events A and B are such that P (A) = 0.75, P (B) = 0.65. Explain
why events A and B cannot be mutually exclusive. (Two events A and B are called
mutually exclusive if A ∩ B = ∅.) What can you say about P (A ∪ B) and P (A ∩ B)?
Page 37 Dr Adrian Turcanu & Dr Mateja Presern
Example 2.11 (*). A fair coin is tossed twice. Let A be the event that at least one
head is obtained. Suppose at the end of experiment we are told that A has occurred
(but we have not seen the actual result). What are the probabilities of other events?
Solution. In light of the occurrence of A we should revise the probabilities of other
events. We define a new conditional probability function P (·|A) given that event A
has occurred:
Sample point P (·) P (·|A)
1 1
HH
4 3
1 1
TH
4 3
1 1
HT
4 3
1
TT 0
4
How did we get the revised probabilities given that A has occurred? Sample
points outside A (the point TT) are assigned probability 0 (since we know that A has
occurred). Sample points inside event A have their original probabilities multiplied
by a constant in such a way that they still add up to 1. Original probabilities total
to 3/4 so if we divided each by 3/4 we get revised probabilities which sum to 1.
Note that 3/4 — the factor we had to divide by — is P (A). This is not a
coincidence. The construction of conditional probability follows the following general
rule
P (B ∩ A)
P (B|A) = . (2.4)
P (A)
(It is assumed that P (A) ̸= 0.)
Example 2.12 (*). A fair coin is tossed three times. What is the probability that
the first toss was tails given that at least one head is obtained?
Solution. We have that A = {HHH, HHT, HT H, HT T, T HH, T HT, T T H} is the
event that at least one head has occurred and B = {T HH, T HT, T T H, T T T } is the
event that the first toss was tails. Their intersection is A∩B = {T HH, T HT, T T H}.
Since each point has probability 1/8 we have P (A) = 7/8, P (A ∩ B) = 3/8. Hence
by the above general formula
P (B ∩ A) 3/8 3
P (B|A) = = = .
P (A) 7/8 7
Discrete Mathematics Page 38
Example 2.13 (*). An urn contains six red balls and three blue balls. Three balls
are drawn at a time from the urn (without replacement). What is the probability
that the first ball is red? How is your answer modified if you are supplied an
additional information that the last two balls are red?
Solution. First we should define a sample space for this model. Let us enumerate
the balls so we can distinguish them: balls numbered from 1 to 6 are red and the
balls numbered 7,8,9 are blue. Then the sample points are triples of distinct numbers
each from 1 to 9 corresponding to the three draws. The total number of points in the
sample space is |S| = 9×8×7 = 504. There is no reason to suppose that one sample
point is more likely than another so we set the probability of each sample point to
be 1/504. Let A1 be the event that the first ball is red. We have |A1 | = 6 × 8 × 7
(we have 6 choices for the red ball at the first draw, 8 choices for the 2nd draw since
one ball was already picked, and 7 choices for the 3rd draw). Therefore the answer
to the first question is
|A1 | 6×8×7 6 2
P (A1 ) = = = = .
|S| 9×8×7 9 3
Now let B be the event that the last two balls are red. Then A1 ∩ B is the event
that all three balls are red. We have |B| = 6 × 5 × 7 and |A1 ∩ B| = 6 × 5 × 4 = 120.
Therefore
P (A1 ∩ B) |A1 ∩ B| 6×5×4 4
P (A1 |B) = = = = .
P (B) |B| 6×5×7 7
Definition 2.14. We say that two events A and B are independent if the occurrence
of B has no effect on the probability of A. So
It follows from formula (2.4) (CHECK THIS!) that the events A and B are
independent if and only if
The criteria of independence stated in this form is manifestly symmetric with respect
to the roles of A and B. If B has no effect on A then A has no effect on B so that
P (B|A) = P (B)
holds as well.
Page 39 Dr Adrian Turcanu & Dr Mateja Presern
Example 2.15 (*). A coin is tossed 3 times. Let the event A be that three heads are
obtained and the event B be that the first toss is a head. Are A and B independent
events?
Solution. We have
1 4 1 1
P (A) = , P (B) = = , P (A ∩ B) = P (A) = .
8 8 2 8
Then
1 1
P (A)P (B) = ̸= P (A ∩ B) = .
16 8
Hence, the events are not independent.
Example 2.16 (*). In a class there are 4 left-handed men, 6 left-handed women
and 6 right-handed men. How many right-handed women must be present if sex
and handedness are to be independent when a student is selected at random?
Solution. Let A be an event that a student is a woman, B the event that the student
is left-handed and x the number of right-handed women students in the class. So
the number of women in the class is 6 + x, the number of left-handed students is
10 and the total number of students is 16 + x. Using formula (2.6) we obtain an
equation
6+x 10 6
P (A)P (B) = = P (A ∩ B) =
16 + x 16 + x 16 + x
which is equivalent to
10(6 + x) = 6(16 + x)
and the solution is x = 9. The class must have 9 right-handed women.
2.2.1 Exercises
Exercise 2.2.1. If P (A) = 2/3 and P (B) = 3/4, what is the largest and smallest
P (B|A) can be? Answer the same question for P (A|B).
Discrete Mathematics Page 40
Exercise 2.2.2. The following questions may shed some light on why many peo-
ple believe that past performance of coin-tossing (or repeatable random events in
general) affects future performance. In the answers define the sample space and
probability on it first.
(a) A fair coin was tossed 6 times and heads came up (exactly) 4 times. If the coin
is tossed again, what is the probability that the seventh toss will be a head?
(b) A fair coin is tossed 6 times and heads came up (exactly) 4 times. What is the
probability that the 6th toss was a head?
(c) A fair coin is tossed 6 times and heads came up (exactly) 4 times. It is also
known that 2 of the first 3 tosses were heads. What is the probability that the
6th toss was a head?
Exercise 2.2.3. A box contains 10 items of type I, of which 3 are defective, and
20 items of type II, of which 5 are defective. An item is chosen at random from
the 30 items in the box. Let A be the event that the item is of type I and B be
the event that the item is defective. Find the following probabilities: P (A), P (Ā),
P (B), P (A ∩ B), P (A ∪ B), P (A|B), P (B|A). Are A and B mutually exclusive?
Are A and B independent?
Chapter 3
Recurrence relations
3.1 Introduction
What are recurrence relations? We have seen a few examples already in previous
chapters; here are some additional examples:
an = 2an−1 + 1 , n ≥ 2 , (3.1)
(a) Check that the given formula gives the correct initial value.
To do (b) we evaluate 2an−1 + 1 using the given formula and show that it is
equal to an .
Now an−1 = 2n−1 − 1 so
2an−1 + 1 = 2 2n−1 − 1 + 1 = 2n − 1
so an = 2an−1 + 1 .
41
Discrete Mathematics Page 42
Example 3.2. Suppose that you put £100 into a savings account yielding 4%
compounded annually. Let an be the amount (in pounds) in the account after n
years.
Then an is equal to the amount in the account after n − 1 years plus the interest
for the nth year. For example, a1 is equal to 100 plus the interest which is 4. Hence
a1 = 104.
In general, an = an−1 + (0.04)an−1 so that
an = (1.04)an−1 , n ≥ 1
with a0 = 100.
Example 3.3. The tower of Hanoi is a puzzle consisting of three pegs mounted on
a board and n discs of different sizes, The picture in Figure 1 shows the n = 3 case.
Fig 1
A
B
The object is to transfer the tower to one of the other pegs, moving only one
disc at a time and never moving a larger disc onto a smaller disc.
Page 43 Dr Adrian Turcanu & Dr Mateja Presern
Let an be the minimum number of moves needed to solve the puzzle with n discs.
Clearly a1 = 1 and a2 = 3.
For the n disc problem we will derive a recurrence relation for an . We first move
the top n − 1 discs to peg 2. This takes an−1 moves. We transfer the largest disc
to peg 3 ; this takes one move. Another an−1 moves transfers the n − 1 discs on
peg 2 to peg 3. This shows that
an = 2an−1 + 1
Example 3.4. I can climb up stairs by taking either one step or two steps at a
time. Let an be the number of ways I can climb n steps.
There are an−1 ways of reaching the (n − 1) th stair and an−2 ways of reaching the
(n − 2) th stair. Hence an = an−1 + an−2 . Using a1 = 1 and a2 = 2, we obtain
a7 = 21.
Discrete Mathematics Page 44
3.1.1 Exercises
an = 5an−1 − 12
with a1 = 13 is an = 2(5)n + 3 .
an = 5an−1 , (3.3)
an = p an−1
is
an = c pn ,
where c is a constant determined by the initial condition.
rn = 7rn−1 − 12rn−2
an = c 3n + d 4n
is
an = c (r1 )n + d (r2 )n ,
where r1 and r2 are distinct solutions of the characteristic equation
r2 = pr + q,
r2 = r + 2 ,
r2 − r − 2 = (r + 1) (r − 2) = 0 .
The above equation tells us that r1 = −1 and r2 = 2 . The general solution we are
after therefore is
an = c 2n + d (−1)n .
r2 − 4r + 3 = (r − 3) (r − 1) = 0
an = c 3n + d
an = 2 (3)n − 6 .
Page 47 Dr Adrian Turcanu & Dr Mateja Presern
r2 − r − 1 = 0 .
Let us determine c and d by using the initial conditions. Substituting the initial
conditions a0 = 0 and a1 = 1 into the above general solution we get
c+d=0
√ ! √ !
1+ 5 1− 5
c +d = 1.
2 2
Solving these gives
1 1
c= √ d = −√ ,
5 5
so √ !n √ !n
1 1+ 5 1 1− 5
an = √ −√ .
5 2 5 2
The above method only works when the characteristic equation has two distinct real
solutions.
Fact 3.11. If the characteristic equation has a repeated (double) solution r = r1 ,
then the general solution of the recurrence relation (3.5) is
an = c r1n + d n r1n .
Example 3.12. Find the general solution of the recurrence relation an = 4an−1 −
4an−2 .
Solution. The characteristic equation
r2 − 4r + 4 = (r − 2)2 = 0
Discrete Mathematics Page 48
an = c 2n + d n 2n .
3.2.1 Exercises
an = 4an−1 − 15 , (3.6)
p = 4p − 15
pn + q = 3 [p(n − 1) + q] + 9 − 2n ,
pn + q = 3pn − 3p + 3q + 9 − 2n ,
0 = n(2p − 2) − 3p + 2q + 9 .
Hence 2p − 2 = 0 so p = 1. Then 2q + 6 = 0 so q = −3. A particular solution is
therefore an = n − 3 .
This method of finding a particular solution works in most cases. It would not work
for
an = 4an−1 − 3an−2 + 5 (3)n .
In this case we would look for a solution an = pn (3)n .
an = 3an−1
an = c 3n + n − 3 .
p = (0.5) p + 3
an = c (0.5)n + 6 .
c+6=8
so that c = 2 . Hence
an = 2 (0.5)n + 6 .
Example 3.17. The average number of comparisons xn made by the quick sort
algorithm when sorting n pieces of data satisfies
n xn = (n + 1) xn−1 + 2n
for n ≥ 1 with x0 = 0.
(a) Let xn = (n + 1) an . Find a recurrence relation for an .
so that
2
an = an−1 + . (3.8)
n+1
Also, since x0 = 0 we have that a0 = 0 . From (3.8),
2 2 2
a1 = 1 , a2 = 1 + , a3 = 1 + + ,
3 3 4
2 2 2
an = 1 + + + ··· + .
3 4 n+1
Using xn = (n + 1) an , we obtain
2 2 2
xn = (n + 1) 1 + + + · · · + .
3 4 n+1
Page 51 Dr Adrian Turcanu & Dr Mateja Presern
3.3.1 Exercises
Example 3.18. Many algorithms take a problem and then divide it into a number
of smaller problems. The binary search method reduces the search for an element
in a list of size m to a search in a list of size m/2. The method assumes that the
list has terms in order of increasing size; for example numbers in increasing size or
words in alphabetical order.
We show how the method works for finding the number 9 in the list of eight numbers
1 2 3 4 5 6 9 10
We first divide the list in two :
1 2 3 4 5 6 9 10
We then compare 9 with the largest number in the first list. Since 4 < 9 we keep
the second list and divide it in two :
5 6 9 10
Since 6 < 9 we keep the second list 9 10. After one more application we locate 9.
Example 3.19. A bit has two possible values, 0 and 1. A bit string is a sequence
of bits. The length of a bit string is the number of bits in the string, for example,
11010110 has length eight. There are 2n bit strings of length n. Let an be the
number of bit strings of length n that do not have two consecutive 0’s.
Find a1 and a2 . Find a recurrence relation for an and hence find hence find
a6 .
Solution. We call a bit string valid if it does not have two consecutive 0’s. Both 0
and 1 are valid bit strings of length 1 so a1 = 2 . a2 = 3 , since 01 , 10 and 11 are
valid.
We can form a valid bit string of length n by
Now, there are an−1 of type (i) and an−2 of type (ii) so
an = an−1 + an−2 .
(b) Find a1 and hence find the solution of the recurrence relation.
(i) Adding a digit other than 0 at the end of a valid string of length n − 1.
For item (i) there are nine digits that we can put in at the end. Hence there are
9an−1 ways of forming a valid codeword of length n in this way.
The number of ways we can do item (ii) is equal to the number of invalid strings
of length n − 1.
Since there are a total of 10n−1 strings of length n − 1 and an−1 of these are
valid, there are
10n−1 − an−1
Page 53 Dr Adrian Turcanu & Dr Mateja Presern
an = 8an−1 + 10n−1 .
We have to solve the above recurrence relation with a1 = 9. We first find a particular
solution of the form an = p 10n−1 . We have
an = c 8n−1 + 5 (10)n−1 .
Using a1 = 9 , we obtain
9 = c + 5.
Hence c = 4 and
an = 4 (8)n−1 + 5 (10)n−1 .
3.4.1 Exercises
Exercise 3.4.1. A ternary string is a string that only contains the symbols 0, 1
and 2. Let an be the number of ternary strings of length n that do not have two
consecutive 0’s. Find a recurrence relation for an and use the recurrence relation
to find a6 .
Exercise 3.4.2. This question is about the number of regions formed if we draw
n lines in the plane. With one line we get 2 regions and with two lines we get 4
regions. Find a recurrence relation for an , where an is number of regions that a
plane is divided into by n lines, if no two of the lines are parallel and no three of
the lines go through the same point.
Chapter 4
Graph Theory
a a e5
e3 e4
d
b e1
e2 c
c b
Fig 1 Fig 2
We can present graphs by showing only their diagrams. For example, we can read
off Fig. 2 that the diagram therein represents the graph with vertex set V = {a, b, c}
and edge list
E : {a, b}, {a, c}, {a, a}, {c, b}, {c, b} .
54
Page 55 Dr Adrian Turcanu & Dr Mateja Presern
To distinguish the multiple edges they can be equipped with additional labels. Thus
for the graph depicted in Fig. 2 we can label the edges {e1 , e2 , e3 , e4 , e5 } as shown
in the picture. With each label we associate the pair of vertices the corresponding
edge connects. For example e2 is associated with {b, c} and e5 with {a, a}. These
vertices are called the end points of the corresponding edge.
Definition 4.1. A graph is called simple if it has no loops and no multiple edges.
When edges and vertices are removed from a graph, without removing endpoints
of any remaining edges, a smaller graph is obtained. Such a graph is called a subgraph
of the original graph. The vertices of the subgraph form a subset of the vertices of
the original graph and the edges of the subgraph appear in the edge list of the
original graph.
For example a graph with vertices V = {a, b} and edges E : {a, a}, {a, b}, de-
picted below, is a subgraph of the graph from Fig. 2.
a e5
e3
4.1.1 Exercises
E : {a, f }, {a, e}, {b, c}, {b, f }, {c, f }, {d, e}, {e, f } .
Example 4.4 (*). Find the adjacency matrix for the graph from Fig. 2.
Solution. Ordering the vertices alphabetically we obtain
1 1 1
A = 1 0 2 .
1 2 0
Note that the adjacency matrix of a graph is symmetric, that is, aij = aji (the
matrix is symmetric upon reflecting across the diagonal).
For a simple graph all entries aij are either 0 or 1. Also, since for a simple graph
an edge does not connect a vertex to itself, we have that in this case the diagonal
elements vanish: aii = 0 .
Definition 4.5. Let v be a vertex of a graph G . The degree of v, denoted by
deg(v) , is the number of edges of G having v as an endpoint with each loop (v, v)
contributing twice.
Example 4.6. For the graph in Fig. 1, deg(a) = 2, deg(b) = 2, deg(c) = 1 and
deg(d) = 3.
Page 57 Dr Adrian Turcanu & Dr Mateja Presern
By examining structure of the graph, we notice that each edge in a graph has
two ends. Hence, each edge must contribute 2 to the sum of the degrees. This
means that the sum of the degrees of the vertices is twice the number of edges. This
observation is formalised by the following theorem.
Theorem 4.7 (Handshaking Theorem). Let G = (V, E) be a graph with vertex set
V = {v1 , v2 , . . . , vn } . Then
The name “Handshaking Theorem” comes from the fact that a graph can be
used to represent a group of people shaking hands.
4.2.1 Exercises
Exercise 4.2.2. What can you say about the sum of numbers in any row or column
of the adjacency matrix of a graph with no loops?
Exercise 4.2.3. A graph G = (V, E) has 23 edges and every vertex of G has
degree four or larger. What is the largest possible number of vertices that G can
have?
Exercise 4.2.4. Prove that in any graph the number of vertices of odd degree is
even.
Exercise 4.2.5. Let G be a simple graph with at least two vertices. Prove that G
has two vertices of the same degree.
4.3 Connectivity
Definition 4.8. A path of length n from vertex a to vertex b is a sequence of edges
e1 , e2 , . . . , en such that e1 is associated with vertices {a, v1 }, e2 is associated with
{v1 , v2 }, and so on, with en associated with {vn−1 , b}.
Discrete Mathematics Page 58
Definition 4.9. A path is called simple if it does not contain the same edge twice.
a b c
a
d
d e f e
b c
Fig 3 Fig 4
For example, the graph in Figure 3 is connected, whereas the graph in Figure 4
is not connected because, for example, the vertices a and d are not joined by a path.
Any graph can be represented as a union of its connected components which are
disjoint. The graph on Fig. 4 has two connected components.
The number of paths between vertices in a graph can be found by using its
adjacency matrix. Let A = (aij ) be the adjacency matrix of a graph G and let
A2 =: B = (bij ) . Assume for simplicity that the graph G is simple. Then
a b
c
a c d
Fig 5 Fig 6
and
1 0 1 0 2 0
A2 = 0 2 0 A3 = 2 0 2
1 0 1 0 2 0
Since the (2, 3) entry of A2 is 0 there are no paths of length 2 joining b and c .
Since the (1, 2) entry of A3 is 2 there are 2 paths of length 3 joining a and b .
Discrete Mathematics Page 60
4.3.1 Exercises
Exercise 4.3.1. If a graph is connected, then every vertex must be adjacent to some
other vertex. Give an example of a graph with four vertices which is not connected,
but has each vertex connected to some other vertex.
A digraph is depicted the same way as an ordinary graph with arrows put on
the edges marking the direction, that is for a directed edge {a, b} the arrow points
from a to b. Thus a digraph specified by V = {a, b, c, d} and
E : {b, b}, {b, a}, {a, b}, {a, a}, {c, a}, {c, d}, {d, d}
is depicted as
b
a
c d
Definition 4.14. A directed path from the vertex v0 to the vertex vn in a digraph
is a sequence of vertices v0 , v1 , v2 , . . . , vn such that for each pair of neighbouring
vertices vi , vi+1 in the sequence there is a directed edge {vi , vi+1 } belonging to the
digraph. The number of edges in the directed path is called its length.
Any strongly connected digraph is also weakly connected but the opposite is not
always true.
The digraph on Fig. 7A below is strongly connected while the digraph on Fig. 7B
is weakly but not strongly connected.
c c
b e d b e d
a a
Fig. 7A Fig. 7B
The subgraphs of a digraph G that are strongly connected but are not contained
in larger strongly connected subgraphs are called strongly connected components of
G.
b e d a
Fig. 7C
Example 4.16. Find the number of directed paths of length 3 from a to b in the
digraph depicted below:
b
a
c
Solution. Arranging the vertices alphabetically we obtain the adjacency matrix
1 1 1
A = 1 1 0 .
0 1 0
4.4.1 Exercises
Exercise 4.4.1. Find the number of directed paths from a to b of length 3 and the
number of directed paths of length 4 from c to b for the digraph
Page 63 Dr Adrian Turcanu & Dr Mateja Presern
b
a
4 6
2
a 7 c 3 g
3
4 4 z
4
d f
2 5
1
12
Fig 8
e
The length of a path in a weighted graph is the sum of the weights on the edges
of the path. In Figure 8 the path a, b, c, f has length 10.
We study the problem of finding the shortest path between two vertices a and
z. We use Dijkstra’s Algorithm to do this.
Dijkstra’s Algorithm
S := ∅, L(a) := 0
For i = 1 to n
L(vi ) := ∞
while z ∈
/S
begin
u := a vertex not in S with L(u) minimal
S := S ∪ {u}
for all vertices v not in S
if L(u) + w(u, v) < L(v) then L(v) := L(u) + w(u, v)
end
if v ∈
/ S then L(v) is the length of the shortest path from a to v that contains
only (other than v) vertices in S .
if the shortest path does not include u then Lk (v) = Lk−1 (v) .
Hence
Lk (v) = min{Lk−1 (u) + w(u, v), Lk−1 (v)}.
Example 4.17 (*). Use Dijkstra’s Algorithm to find the length of the shortest path
between the vertices a and z in the weighted graph displayed in Figure 9. Show the
main steps used in the algorithm.
Page 65 Dr Adrian Turcanu & Dr Mateja Presern
b 5 d
4 6
8
1 Fig 9
a 2 z
2
3
10
e
c
Example 4.18 (*). Use Dijkstra’s Algorithm to find the length of the shortest path
between the vertices a and z in the weighted graph displayed in Figure 8. Show the
main steps used in the algorithm.
Solution. S = ∅, L(a) = 0, L(b) = L(c) = L(d) = L(e) = L(f ) = L(g) = L(z) =
∞
4.5.1 Exercises
Exercise 4.5.1. The table below gives the cost of travelling on a toll road between
cities a, b, c, d and e. Represent this data by a weighted graph. Use Dijkstra’s
Algorithm to find the least expensive route in terms of total tolls between a and
each of the cities.
a b c d e
a - 4 18 5 10
b 4 - 2 5 8
c 18 2 - 11 5
d 5 5 11 - 2
e 10 8 5 2 -
4.6 Trees
Definition 4.19. A path in a graph is called a circuit if it starts and ends at the
same vertex. A circuit is simple if it does not contain the same edge twice.
Example 4.21. The graph shown in Figure 1 is a tree. The graph in Figure 2 is
not connected so it is not a tree. The graph in Figure 3 is not a tree since it contains
a simple circuit.
Page 67 Dr Adrian Turcanu & Dr Mateja Presern
Trees are useful for many problems. File structures in a computer system can
often be represented by a tree. Trees can be used to construct good algorithms
for locating items in a list. For weighted graphs, trees can be used to construct
minimum cost transportation networks.
Note that there are many equivalent ways of defining a tree. Let T be a graph
with n vertices. The following are all equivalent.
(i) T is a tree.
(ii) T is connected and has n − 1 edges.
(iii) any two vertices of T are connected by exactly one path.
(iv) T is connected, but removing any edge from T results in a disconnected graph.
Definition 4.22. A tree is called a full m-ary tree (binary for m = 2, ternary for
m = 3) if each internal vertex except for the root has degree m + 1 and the root
vertex has degree m.
Vertices of degree one in a tree, distinct from the root, are called leaves. The
vertices which are not leaves are called internal.
The root is always an internal vertex.
Example 4.24. As an example, the graph depicted below has two components G1
and G2 where
a
d
e
c
b
Definition 4.25. A forest is a graph for which every connected component is a tree.
Example 4.26. A forest graph contains 10 trees. The graph has 5 vertices of degree
10, 8 vertices of degree 7, 10 vertices of degree 3. The rest of the vertices are leaves.
Find the total number of vertices in the graph.
Solution. Let |Ei | and |Vi | be the number of edges and the number of vertices in the
i-th connected component respectively. Since each connected component is a tree
we have
|Ei | = |Vi | − 1
Summing up these 10 equations we obtain
10
X 10
X
|E| = |Ei | = |Vi | − 10 = |V | − 10 .
i=1 i=1
|E| = |V | − 10 = L + 23 − 10 = L + 13
2|E| = 5 · 10 + 8 · 7 + 10 · 3 + L · 1 = 136 + L
Example 4.27. A spam e-mail is initially sent to 4 people. The e-mail asks to
sent it out to 4 more people. Some people follow the request and send 4 e-mails to
their friends, others ignore it and do not send it out. The e-mail stops propagating
when 100 people received the e-mail but did not send it out. How many people in
total received the letter (including the very first person who sent it out) if nobody
received it twice.
Page 69 Dr Adrian Turcanu & Dr Mateja Presern
Solution. The condition that nobody received the spam e-mail twice means that the
propagation graph of the e-mail is a tree. Let i be the number of internal vertices
in the graph (including the root). Each internal vertex corresponds to a person who
received the letter and spread it further. Let l be the number of leaves in the graph.
Each leave corresponds to a person who received the letter but did not spread it
further. Thus l = 100 and the total number of vertices |V | = l + i = 100 + i. Since
the graph is a tree |E| = |V | − 1. Finally, since each edge comes from an internal
vertex and there are 4 edges coming from each internal vertex we have |E| = i · 4.
We thus obtain 3 equations for 3 unknowns:
|V | = 100 + i ,
|E| = |V | − 1 ,
|E| = i · 4 .
4.6.1 Exercises
Exercise 4.6.1. A full ternary tree has 71 leaves. What is the total number of
vertices in this tree?
Breadth-first search
Example 4.29. The pictures below show the breadth-first spanning tree T of a
graph G. We use the convention in which vertices with smaller labels are inspected
before vertices with larger labels.
9 10 11
8
6 1 2
7 G
5 4 3
12
13
Page 71 Dr Adrian Turcanu & Dr Mateja Presern
2 4 6 10
T
11
3 5 13 7 9
12 8
Depth-first search
Example 4.30. Below is the depth first spanning tree T ′ for the graph G from
Example 4.29.
1
6 13
7 12
9 8
10
.
11
Discrete Mathematics Page 72
4.7.1 Exercises
Exercise 4.7.1. A connected graph has n vertices and k edges. How many edges
must be removed to produce a spanning tree?
Exercise 4.7.2. Give a example of a graph with 4 vertices and 3 edges which is not
a tree.
Exercise 4.7.3. Use a breadth-first search to produce a spanning tree of the graph
in Figure 5 below.
Step 1. Start by putting any edge with the smallest weight into the spanning
tree T .
Example 4.32. Below we apply Prim’s algorithm to find a minimum spanning tree
of the graph in Figure 7.
a 2 b 3 c 1 d
3 1 2 5
4 f 3 g 3 h Fig 7
e
4 2 4 3
3 3 1
i j k l
The algorithm:
choice 1. edge {b, f } weight 1
choice 2. edge {a, b} weight 2
choice 3. edge {f, j} weight 2
choice 4. edge {a, e} weight 3
choice 5. edge {i, j} weight 3
choice 6. edge {f, g} weight 3
choice 7. edge {c, g} weight 2
choice 8. edge {c, d} weight 1
choice 9. edge {g, h} weight 3
choice 10 edge {h, l} weight 3
choice 11 edge {k, l} weight 1
The minimum spanning tree is shown in Figure 8 and has weight 24.
a b c d
f g h Fig 8
e
3
i j k l
Discrete Mathematics Page 74
Step 1. We start by putting any edge with the smallest weight into the spanning
tree T .
Step 2. We then successively add edges with smallest weight that do not form a
simple circuit with those edges already in T . The algorithm stops when n − 1 edges
are in T .
Example 4.33. Below we use Kruskal’s algorithm to find a minimum spanning tree
of the graph in Figure 7 from Example 4.32.
The algorithm:
4.8.3 Exercises
Exercise 4.8.1. Give an example of a connected graph with 3 vertices that has
more than one minimum spanning tree.
Exercise 4.8.2. What is the main difference between Prim’s algorithm and
Kruskal’s algorithm?
Exercise 4.8.3. Starting with the edge {k, l} , use Prim’s algorithm to find a min-
imum spanning tree of the graph in Figure 7.
Exercise 4.8.4. Use Kruskal’s algorithm to find a minimum spanning tree of the
graph in Figure 11 below.
a b 4 c
5
3 6
2 3
5
7 e 1 Fig 11
d f
3
6 8 4
4
g 4 h 2 i
d e c d
a b a b
Fig 12 Fig 13
Theorem 4.34. A connected graph G has an Euler circuit if and only if each of its
vertices has even degree.
Proof. We prove part 1. If we follow the Euler circuit, whenever we pass through a
vertex we add 2 towards the degree of a vertex. Since each edge is used once, the
degree of each vertex is a sum of 2’s.
For the graph in Figure 12, deg(a) = 2 , deg(b) = 2 , deg(c) = 4 , deg(d) = 2 and
deg(e) = 2. Hence there is an Euler circuit.
For the graph in Figure 13, deg(a) = 3 so it does not have an Euler circuit.
An edge in a graph is called a cut edge if its removal produces a subgraph with more
connected components than the original graph.
The cut edges in the graph in Figure 14 are {a, b} and {c, e} .
a d f g
Fig 14
b c e h
Fleury’s algorithm for constructing an Euler circuit begins with choosing a starting
vertex and then adding edges successively. Once a edge is chosen it is removed and
Page 77 Dr Adrian Turcanu & Dr Mateja Presern
vertices of degree 0 are erased. Each edge begins where the last edge ends. A cut
edge is only used if there is no alternative.
Example 4.35. Explain why the graph shown in Figure 15 has an Euler circuit.
Use Fleury’s algorithm to find an Euler circuit starting with the vertex c.
g f e
Fig 15
a b c d
g f e
2
1 8 Fig 16
4 3
7
a 5 b 6 c 9 d
Discrete Mathematics Page 78
Theorem 4.36. Let G be a connected graph. G has an Euler path but not an Euler
circuit if and only if exactly two of its vertices have odd degree.
Proof. We first prove that if there is an Euler path but not an Euler circuit then
exactly two vertices have odd degree. As in the proof of Theorem 1, every vertex
on the Euler path has even degree except for the initial and final vertex. Only two
vertices, the initial and final, have odd degree.
Now assume that G has only two vertices of odd degree, x and y. We have to prove
that there is an Euler path. We make a new graph H by adding a new vertex z to
G and two new vertices, {x, z} and {y, z} . All the vertices in H have even degree.
By Theorem 1, there is an Euler circuit in H. By removing the edges {x, z} and
{y, z} from the Euler circuit we are left with an Euler path, starting at x and ending
at y.
For the graph in Figure 13, deg(a) = deg(d) = 3 and deg(b) = deg(c) = 2 so it has
an Euler path.
In order to find an Euler path, we use Fleury’s algorithm, starting at a vertex which
has odd degree.
Euler circuits and paths have many applications. For example, we might want to
find a path that traverses exactly once each street in a town, or each connection in
a utility grid, or each link in a communications network.
4.9.1 Exercises
Exercise 4.9.1. Theorems 4.34 and 4.36 are about graphs with exactly zero or two
vertices of odd degree. What can you say about graphs with exactly one vertex of
odd degree?
Exercise 4.9.3. Explain why the graph shown in Figure 17 has an Euler path. Use
Fleury’s algorithm to find an Euler path.
Fig 17 Fig 18
Exercise 4.9.4. Explain why the graph shown in Figure 18 has an Euler circuit.
Use Fleury’s algorithm to find an Euler circuit.
Euler paths and circuits concern the edges of a graph. What about paths that
contain every vertex of a graph exactly once.
A Hamilton circuit in a graph G is a simple circuit that visits every vertex of G
exactly once. A Hamilton path is a simple path that visits every vertex exactly
once.
The graph in Figure 13 has a Hamilton circuit, for example a, b, d, c .
The graph in Figure 14 does not have a Hamiltonian circuit since any circuit would
have to contain the edge {c, e} twice. It has the Hamilton path a, b, d, c, e, f, g, h .
The complete graph on n vertices, denoted by Kn , is the graph that contains exactly
one edge between each pair of distict vertices. Clearly there is a Hamiltonian circuit
in Kn for all values of n.
There is no simple criteria for determining whether or not a graph has a Hamiltonian
circuit or a Hamiltonian path. However, simple properties can be used to show that
a graph does not have a Hamiltonian circuit. For example, suppose G is a graph
containing a vertex with degree 1. Then it cannot contain a Hamiltonian circuit
since in a Hamiltonian circuit each vertex is incident with two edges.
Discrete Mathematics Page 80
The best algorithms known for finding a Hamiltonian circuit or determining that
no such circuit exists have exponential time complexity in the number of vertices in
the graph.
Hamiltonian circuits have many applications. The traveling salesman problem is:
given a set of cities and the distance between each pair, find the shortest route
which visits each city and returns to the initial city. This is equivalent to finding
a Hamiltonian circuit in a weighted complete graph Kn such that the total weight
of its edges is as small as possible. In principle we can solve the traveling salesman
problem by looking at all possible Hamiltonian circuits and selecting the one with
the minimum weight. However, when we increase the number of cities we run into
difficulties.
Suppose there are n vertices. Once we have chosen a starting point there are (n−1)!
Hamiltonian circuits to look at. We only need look at half of these since we can
travel on a Hamiltonian circuit in reverse order. The number
(n − 1)!
2
grows very quickly. For n = 10 we have to check 181440 possible routes and this
is feasible. For n = 25 there are approximately 3.1 × 1023 different Hamiltonian
circuits to check. This is impossible. The best algorithms for this problem produce
a solution that is close to the correct solution.
One simple algorithm that can be used in finding approximate solutions to the
travelling salesman problem is the Nearest Neighboor algorithm. This is an example
of a greedy algorithm. It proceeds as follows
2. Travel along the edge with the smallest weight to a vertex that has not been
visited yet
Example 4.37. For a graph depicted below use the Nearest Neighboor algorithm
to find Hamiltonian paths starting from the vertex a and starting from the vertex
f.
Page 81 Dr Adrian Turcanu & Dr Mateja Presern
b 5 c
10 6
10 6 8
8 11
a 13 d
7 3
7
12 10
f 8 e
Solution. The Nearest Neighboor algorithm starting with vertex a gives the path:
aebcdf a with the total weight 7+6+5+6+3+12=39, while starting with the vertex
f it gives f cbeadf with weight 41.
One should be aware that for certain graphs the Nearest Neighboor algorithm
may produce results which are very far from optimal no matter from which vertex one
starts. For the graph depicted below each path produced by the Nearest Neighboor
algorithm has a total weight exceeding 100 while the optimal path adceba has the
weight of only 16.
b
3 100
100
2
a
10 c
5 4
3
1
d 2 e
The following Lower Bound algorithm allows one to obtain a lower bound on the
weight of any Hamilton path:
Discrete Mathematics Page 82
1. Choose any vertex v and delete it and all of its incident edges from the graph
2. Find a minimum weight spanning tree in the resulting graph using Kruskal’s
or Prim’s algorithm
3. Find the two edges incident with v that have the smallest weight
4. A lower bound is the sum of the weights of the minimum weight spanning tree
found at step 2 and the two edges found at step 3
Example 4.38. Use the lower bound algorithm to find a lower bound for Hamilton
paths for the graph in Example 4.2. Start with the vertex a.
Solution. We find that the minimal tree contains the edges {f, d}, {bc}, {cd}, {eb}
and has the weight 20. The two edges incident with a having the smallest weight
are {ae} of weight 7 and {ab} of weight 10. Adding these to the spanning tree
weight we obtain a lower bound of 7+10+20=37. We now know that the result
produced by the Nearest Neighboor algorithm starting with a, which equals 39, is
pretty close to the optimal. As the optimal result is in any case larger than 37 the
weight of 39 is within 2/37 × 100% ≈ 5% from the optimal result. Thus the Nearest
Neighboor algorithm combined with the Lowest Bound algorithm can lead to rather
good approximate solutions to the traveling salesman problem.
4.10.1 Exercises
Exercise 4.10.1. Apply the Nearest Neighboor algorithm to the graph below, start-
ing with the vertices a, d, c. Find the weights of the corresponding Hamilton circuits.
b 5 c
8 5
13 9
8 8
3 7
a 17 d
12
16
10 2
e 11 f
Page 83 Dr Adrian Turcanu & Dr Mateja Presern
Exercise 4.10.2. For the same graph as in the previous problem use the Lower
Bound algorithm to find the lower bound removing the vertex b. Same for the
vertex e.
Exercise 4.10.3. Use the largest lower bound found in problem 4.4.2 to estimate
how close to the optimal is the best result found in problem 4.4.1.
Chapter 5
Here x1 , x2 , . . . , xn are the unknowns and aij ’s are coefficients and bj ’s are constants.
Given such a system we would like to have answers to the following questions
Does it have a solution? What conditions are required on the coefficients aij
and the constants bj for the system to have a solution?
ax = b (5.2)
84
Page 85 Dr Adrian Turcanu & Dr Mateja Presern
b
x= .
a
For example the equation
3x = 12
has unique solution
12
x= = 4.
3
0 · x = b.
Since b is not zero no value of x will make this statement true, so (5.2) has no
solutions in this case.
0 · x = 0.
This is true for any x. So (5.2) has infinitely many solutions in this case.
We will see that the 3 cases in the above example exactly mirror the general
case.
AX = b (5.3)
where
x1 b1
x2 b2
A = (aij )m×n , X= , b=
.. ..
. .
xm bm
We put all of this data into an m × (n + 1) augmented matrix:
(A|b) .
x + 2y − 3z = 4
x + 3y + z = 11
2x + 5y − 4z = 13
We use the first row to kill the rest of the first column. This corresponds to elimi-
nating x from all but the first equation. The top left entry of the matrix, which is
used to drive all the entries directly below it to zero, is called a pivot and is marked
by an asterisk. We apply
R2 → R2 − R1 , R3 → R3 − 2R1
to obtain
1 2 −3 4
0 1∗ 4 7
0 1 2 5
The process of driving all the entries below a pivot to zero is called a downsweep.
Next we use the second enry on the second row as a pivot to kill the elements below
it. We do this by applying R3 → R3 − R2
1 2 −3 4
0 1 4 7
0 0 −2 −2
Page 87 Dr Adrian Turcanu & Dr Mateja Presern
We now reverse the strategy and use the third row to kill elements in the third
column. This process is called an upsweep. After applying
R2 → R2 − 4R3 , R1 → R1 + 3R3
we obtain
1 2 0 7
0 1∗ 0 3
0 0 1 1
We next use the second row to kill elements in the second column by applying
R1 → R1 − 2R2
1 0 0 1
0 1 0 3
0 0 1 1
The unique solution of the system can now be read off the rightmost column:
x = 1, y = 3, z = 1.
The process that we employed to solve the above problems is called Gaussian
elimination. In it one first carries out a sequence of downsweeps using pivots on the
diagonal sucessively starting with the first row. One next scales the last row and
uses it for upsweeping. Then one scales the second row and uses it for upsweeping,
etc. until one obtains a unit matrix in the left block of the augmented matrix. The
rightmost column of the resulting matrix contains solutions. This algorithm assumes
that all pivots encountered in the course of its execution are nonvanishing.
An alternative to upsweepings is back-substitutions. In the last example once
we were done with all the downsweeps we obtained matrix (5.4). The corresponding
system of equations is
x + 2y − 3z = 4
y + 4z = 7
z = 1
We can find the solution by starting with the bottom equation and substituting
z = 1 into the equation directly above it. This gives y + 4 = 7 so that y = 3.
Substituting now z = 1 and y = 3 into the top equation we get x + 2 · 3 − 3 = 4
that yields x = 1. More generally, when we have more equations, we start with
the bottom equation and substitute its solution into the equation above it, then
substitute all known variables into the the equation above the bottom two, and so
on, always moving upwards.
Discrete Mathematics Page 88
Sometimes rather than having a unique solution the system may have no solution
or infinitely many solutions. Consider the following example.
x + 2y + 3z = 1
4x + 5y + 6z = 2
7x + 8y + 9z = 1
From here on we are unable to proceed with upsweeping because the entry we
normally use as a pivot is zero. However we do not have to proceed because the last
equation now reads 0 · x + 0 · y + 0 · z = −2 which cannot be satisfied for any values
of x,y, z. The system therefore has no solutions.
which is true for any values of x, y, z. To simplify the system further we do the
upsweep using the entry on the second row. We obtain
1 0 1 2
0 1 −2 2
0 0 0 0
x+z = 2
y − 2z = 2
x = 2−z, y = 2 + 2z .
(x, y, z) = (2 − z, 2 + 2z, z)
where z is arbitrary and is called a free variable. There are thus infinitely many
solutions (for each value of z). A particular solution is obtained by fixing the
value of z. For example for z = 3, x = −1, y = 8 is a particular solution.
Sometimes pivots do not appear where we naively expect them to be and rows
may need to be interchanged. Consider the following example.
Example 5.5. Solve the system of linear equations represented by the augmented
matrix ∗
1 2 3 1
2 4 5 1
−1 1 2 4
Solution. After downsweeping from the first pivot we obtain
1 2 3 1
0 0∗ −1 −1
0 3 5 5
We are blocked from using the natural pivot by a zero. However we can interchange
rows 2 and 3 and use a new pivot.
1 2 3 1
0 3∗ 5 5
0 0 −1 −1
Discrete Mathematics Page 90
Example 5.6. Solve the system of linear equations represented by the augmented
matrix ∗
1 2 3 1
2 4 5 1
−1 −2 2 4
Solution. All that is changed from the previous example is the entry A3,2 but it
makes a big difference. After downsweeping the first column we get
1 2 3 1
0 0∗ −1 −1
0 0 5 5
and there is nothing further down in the second column. If we can’t move down we
move right:
1 2 3 1
0 0 −1∗ −1
0 0 5 5
We downsweep column 3 to obtain
1 2 3 1
0 0 −1∗ −1
0 0 0 0
There are infinitely many solutions. We can express the pivot variables (the ones
multiplied by the units) in terms of the non-pivot ones:
x = −2y − 2 , z = 1.
The number of solutions in a linear system can be read off from the echelon form.
Recall that a matrix is said to be in echelon form if
1. Any all-zero rows are below all other rows; and
2. The first non-zero entry of any row (called the pivot entry) is strictly father
right that the first nonzero entry of any row above it
If (A′ |b′ ) is the augmented matrix obtained by using the Gaussian elimination just
after the last downsweep than the matrix A′ is in echelon form. Here is an example
of an augmented matrix after the last downsweep:
1 2 1 1 4 1 2
0 0 2 −2 6 2 2
0 0 0 0 0 3 6
0 0 0 0 0 0 0
The coefficient part A′ is in echelon form. The pivot entries are put in boxes.
A matrix is said to be in reduced echelon form if all the following are true
1. It is in echelon form
2. The first nonzero entry of any row is 1. This is called a pivotal 1.
3. All entries above a pivotal 1 are 0
If (A′′ |b′′ ) is the final augmented matrix obtained by Gaussian elimination than A′′
is in the reduced echelon form. In the above example the augmented matrix after
scalings and upsweeps is brought to the form
1 2 0 2 1 0 1
0 0 1 −1 3 0 −1
0 0 0 0 0 1 2
0 0 0 0 0 0 0
Its coefficient part is in reduced echelon form. The pivotal 1’s are put into boxes.
Example 5.7. Determine how many solutions there is to the system of equations
x + 2y − z = 0
2x + 5y + 2z = 0
x + 4y + 7z = 0
x + 3y + 3z = 0
We see that in the echelon form there are two equations for 3 unknowns (l = 2 <
m = 3) so the system has infinitely many solutions.
x + 3y − 5z + w = 4
2x + 5y − 2z + 4w = 6
x + 3(2 + 8z + 2w) − 5z + w = 4
After some elementary algebra we obtain from this equation x = −2 − 19z − 7w.
The general solution is therefore
Definition 5.9. The number of non-zero rows in the echelon form of matrix A is
called the rank of A, and is denoted rank(A).
Solution. To find the rank we bring the matrix to echelon form by downsweeping:
1 1 1 2 1 1 1 2 1 1 1 2
3 4 4 7 0 1 1 1 0 1 1 1
−2 1 2 −3 → 0 3 4 1 → 0 0 1 −2
5 3 4 6 0 −2 −1 −4 0 0 1 −2
4 5 3 13 0 1 −1 5 0 0 −2 4
1 1 1 2 1 1 1 2
0 1 1 1
0 1 1 1
0 0 1 −2
→
0 0 1 −2
0 0 1 −2 0 0 0 0
0 0 −2 4 0 0 0 0
The matrix is now in echelon form. There are 3 non-zero rows so rank(A) = 3.
5.1.1 Exercises
Discrete Mathematics Page 94
2x − 2y + 4z = −6
2x − y + 6z = −6
−x + 2y − 3z = 6
x + 2y + 3z = 1
2x + 3y + 4z = 2
−x + y + 2z = 1
2x + 8y + 6z = 20
4x + 2y − 2z = −2
−6x + 4y + 10z = 24
x − 3y + 4z + 2w = 5
2y + 5z + w = 2
y − 3z = 4
Page 95 Dr Adrian Turcanu & Dr Mateja Presern
Exercise 5.1.6. Determine which condition must be placed on a,b, c so that the
following system in unknowns x, y, z has a solution
x + 2y − 3z = a
2x + 6y − 11z = b
x − 2y + 7z = c
There is a special n-vector called the zero vector whose every component is
a zero:
0
0
0 = ..
.
0
−1
be a point in R4 . Then
24
TA (X) := AX = 10
−11
Page 97 Dr Adrian Turcanu & Dr Mateja Presern
is a point in R3 .
TA (X) = 0
that is the vectors from the null space are solutions to the homogeneous system
AX = 0. A trivial solution is of course always there. To solve the homogeneous
system we apply Gaussian elimination to the augmented matrix (A|0) to bring it
to reduced echelon form. The sequence of elementary row operations employed for
that is exactly the same we used on (A|Y ) to find the rank of A.
Applying Gaussian elimination we bring this matrix to the reduced echelon form
1 0 −1 −1 −5
0 1 2 2 6
A= 0 0
0 0 0
0 0 0 0 0
Discrete Mathematics Page 98
So any vector in the null space of TA must be a linear combination of the three vectors
u,v, w. The null space is thus an R3 sitting inside R5 . The number of independent
vectors in the null space is called the nullity of A, and is the dimension of the null
space of TA . Thus in the above example nullity(A) = 3.
Notice now that in the above example the sum of the rank and the nullity equals
the dimension of the space on which TA acts:
rank(A) + nullity(A) = 2 + 3 = 5
This is no accident as the following general relation holds known as the rank-nullity
theorem.
rank(A) + nullity(A) = n.
5.2.1 Exercises
then
4 1
AX = 4 = 4 1 = 4X .
8 2
A matrix can have more than one eigenvector and eigenvalue. In the previous
example the matrix A was shown to have an eigenvalue λ = 4 with eigenvector
1
X= 1
2
One can check that λ = −2 is also an eigenvalue of A with corresponding eigenvectors
1 1
X1 = 1 X2 = 0
0 −1
Discrete Mathematics Page 100
that is AX1 = −2X1 and AX2 = −2X2 . This example also shows that there can
be more than one eigenvector corresponding to the same eigenvalue. In fact, for the
above example, one can check that any linear combination
Y = C 1 X1 + C 2 X2
where C1 , C2 are constants (not both equal to zero), is also an egenvector of A with
eigenvalue −2.
(A − λI)X = 0
where I is the n × n identity matrix. One can use the Gaussian elimination to
bring the augmented matrix (A − λI|0) to to a reduced echelon form. Then the
homogeneous system at hand has a nontrivial solution in the reduced echelon form
there is at least one zero row. This imposes a condition on the values of λ.
This matrix has a nontrivial null space if the lower right entry is zero, that is, if
λ2 − λ − 2 = (λ − 2)(λ + 1) = 0
Thus the eigenvalues of A coincide with the roots (=solutions) of its characteristic
polynomial.
(λ − 1)(λ + 1)(λ − 2) = 0
λ1 = 1 , λ2 = −1 , λ3 = 2.
5.3.1 Exercises
Exercise 5.3.2. Find the eigenvalues for the matrix A in the previous exercise.
In a full m-ary tree, internal vertices and leaves determine the hierarchical structure. Each internal vertex, except the root, connects to \( m+1 \) edges, forming levels, with the root connecting \( m \). Leaves, vertices of degree one, mark the tree's endpoints. Together, they uphold the tree's growth pattern and balance: the total number of vertices \( n \) is related to internal vertices \( i \) and leaves \( l \) by \( n = i + l \), with structural equations balancing between these elements.
Adjacency matrices can identify connected components by examining the transitive closure—effectively checking if there exists a path between every pair of vertices. For a graph with adjacency matrix \( A \), the power \( A^n \), where \( n \) is the number of vertices, provides information about connectivity: nonzero entries indicate connections between vertices within a component. Thus, connected components correspond to submatrices indicating strong internal connectivity with no links to vertices outside. Matrix exponentiation and analysis uncover underlying connected structures.
Gaussian elimination facilitates solving a system of linear equations by transforming the augmented matrix into row echelon form via row operations. This involves performing downsweeps to zero out sub-diagonal elements and arranging equations hierarchically, then upsweeps or back-substitution are performed to solve variables starting from the bottom row. The approach systematically reduces the system to simpler forms, allowing the extraction of variable values efficiently.
A tree in graph theory is a connected acyclic graph, possessing unique characteristics: being connected implies there is a path between any two vertices, and acyclic means it contains no cycles. Compared to general graphs, trees are minimally connected with \( n-1 \) edges given \( n \) vertices, emphasizing simplicity and structure. Their acyclic nature ensures only one path connects any two vertices, distinguishing them from more complex general graphs which may contain cycles and multiple connections.
For a subgraph to be considered a spanning tree of a graph, it must satisfy: 1) containing every vertex of the original graph, 2) being minimally connected with no cycles (i.e., a tree structure), and 3) including exactly \( n-1 \) edges where \( n \) is the number of vertices, thus connecting all vertices using the fewest possible edges. This ensures it spans all vertices efficiently without redundant connections.
The recurrence relation \( a_n = a_{n-1} + a_{n-2} \) describes how each term in the sequence is the sum of the two preceding terms, similar to the Fibonacci sequence. Starting with initial conditions \( a_1 = 1 \) and \( a_2 = 2 \), we calculate each term sequentially: \( a_3 = 3, a_4 = 5, a_5 = 8, a_6 = 13, \) and finally \( a_7 = 21 \). These calculations align with the sequence definition, ensuring \( a_7 \) is accurate.
The characteristic equation \( r^2 = pr + q \) arises when solving linear homogeneous recurrence relations of the form \( a_n = pa_{n-1} + qa_{n-2} \). Solving this quadratic equation provides the roots \( r_1 \) and \( r_2 \), which are used in the general solution \( a_n = c(r_1)^n + d(r_2)^n \) where \( c \) and \( d \) are constants determined by initial conditions. This method leverages the structure of the quadratic to derive a closed form for the recurrence, simplifying analysis and calculations.
Dijkstra's Algorithm is efficient for finding the shortest path in a graph because it systematically explores edges closest to the starting point, visiting each node once and updating shortest paths in a priority-based manner. However, its efficiency is affected by the implementation of the priority queue: using a Fibonacci heap results in a time complexity of \( O(E + V \log V) \), making it suitable for dense graphs. Issues arise with negative weight edges as it cannot handle them properly, requiring edge relaxation assumptions.
For \( a_n = c(3)^n + d(4)^n \) to be a valid solution of a recurrence relation like \( a_n = 7a_{n-1} - 12a_{n-2} \), it must satisfy the characteristic equation derived from the relation. The characteristic equation \( r^2 = 7r - 12 \) resolves to roots \( r = 3 \) and \( r = 4 \). With these roots, \( a_n = c(3)^n + d(4)^n \) accurately captures all solutions associated with the recurrence, provided constants \( c \) and \( d \) fit initial conditions.
The theorem stating that any graph must have an even number of vertices with odd degree is significant because it is a fundamental property derived from the degree sum formula. This formula asserts that the sum of the degrees of all vertices in a graph must be even, as each edge is counted twice (once at each endpoint). Therefore, since odd degree vertices contribute an odd sum, they must occur in pairs for the total to remain even, ensuring consistency with graph theory principles.