0% found this document useful (0 votes)
3 views11 pages

CS502 Chapter 3

Chapter 3 discusses the divide-and-conquer strategy in algorithm design, illustrating its principles through the example of Merge Sort. The algorithm involves dividing an array into smaller subsequences, recursively sorting them, and then combining the sorted subsequences. The chapter also analyzes the running time of Merge Sort, concluding that it operates in Θ(n log n) time complexity.

Uploaded by

2035amirkhan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views11 pages

CS502 Chapter 3

Chapter 3 discusses the divide-and-conquer strategy in algorithm design, illustrating its principles through the example of Merge Sort. The algorithm involves dividing an array into smaller subsequences, recursively sorting them, and then combining the sorted subsequences. The chapter also analyzes the running time of Merge Sort, concluding that it operates in Θ(n log n) time complexity.

Uploaded by

2035amirkhan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Chapter 3

Divide and Conquer Strategy

The ancient Roman politicians understood an important principle of good algorithm design (although
they were probably not thinking about algorithms at the time). You divide your enemies (by getting them
to distrust each other) and then conquer them piece by piece. This is called divide-and-conquer. In
algorithm design, the idea is to take a problem on a large input, break the input into smaller pieces, solve
the problem on each of the small pieces, and then combine the piecewise solutions into a global solution.
But once you have broken the problem into pieces, how do you solve these pieces? The answer is to
apply divide-and-conquer to them, thus further breaking them down. The process ends when you are left
with such tiny pieces remaining (e.g. one or two items) that it is trivial to solve them. Summarizing, the
main elements to a divide-and-conquer solution are

Divide: the problem into a small number of pieces

Conquer: solve each piece by applying divide and conquer to it recursively

Combine: the pieces together into a global solution.

3.1 Merge Sort


Divide and conquer strategy is applicable in a huge number of computational problems. The first
example of divide and conquer algorithm we will discuss is a simple and efficient sorting procedure
called We are given a sequence of n numbers A, which we will assume are stored in an array A[1..n].
The objective is to output a permutation of this sequence sorted in increasing order. This is normally
done by permuting the elements within the array A. Here is how the merge sort algorithm works:

• (Divide:) split A down the middle into two subsequences, each of size roughly n/2

• (Conquer:) sort each subsequence by calling merge sort recursively on each.

• (Combine:) merge the two sorted subsequences into a single sorted list.

27
28 CHAPTER 3. DIVIDE AND CONQUER STRATEGY

Let’s design the algorithm top-down. We’ll assume that the procedure that merges two sorted list is
available to us. We’ll implement it later. Because the algorithm is called recursively on sublists, in
addition to passing in the array itself, we will pass in two indices, which indicate the first and last indices
of the sub-array that we are to sort. The call MergeSort(A, p, r) will sort the sub-array A[p : r] and
return the sorted result in the same [Link] is the overview. If r = p, then this means that there is
only one element to sort, and we may return immediately. Otherwise if (p 6= r) there are at least two
elements, and we will invoke the divide-and-conquer. We find the index q, midway between p and r,
namely q = (p + r)/2 (rounded down to the nearest integer). Then we split the array into sub-arrays
A[p : q] and A[q + 1 : r]. Call MergeSort recursively to sort each sub-array. Finally, we invoke a
procedure (which we have yet to write) which merges these two sub-arrays into a single sorted array.
Here is the MergeSort algorithm.
MERGE - SORT( array A, int p, int r)
1 if (p < r)
2 then
3 q ← (p + r)/2
4 MERGE - SORT(A, p, q) // sort A[p..q]
5 MERGE - SORT(A, q + 1, r) // sort A[q + 1..r]
6 MERGE(A, p, q, r) // merge the two pieces

The following figure illustrates the dividing (splitting) procedure.

7 5 2 4 1 6 3 0

7 5 2 4 1 6 3 0

7 5 2 4 1 6 3 0

7 5 2 4 1 6 3 0

split
Figure 3.1: Merge sort divide phase

Merging: All that is left is to describe the procedure that merges two sorted lists. Merge(A, p, q, r)
assumes that the left sub-array, A[p : q], and the right sub-array, A[q + 1 : r], have already been sorted.
We merge these two sub-arrays by copying the elements to a temporary working array called B. For
convenience, we will assume that the array B has the same index range A, that is, B[p : r]. (One nice
thing about pseudo-code, is that we can make these assumptions, and leave them up to the programmer to
figure out how to implement it.) We have to indices i and j, that point to the current elements of each
3.1. MERGE SORT 29

sub-array. We move the smaller element into the next position of B (indicated by index k) and then
increment the corresponding index (either i or j). When we run out of elements in one array, then we just
copy the rest of the other array into B. Finally, we copy the entire contents of B back into A. (The use of
the temporary array is a bit unpleasant, but this is impossible to overcome entirely. It is one of the
shortcomings of MergeSort, compared to some of the other efficient sorting algorithms.)
Here is the merge algorithm:
MERGE( array A, int p, int q int r)
1 int B[p..r]; int i ← k ← p; int j ← q + 1
2 while (i ≤ q) and (j ≤ r)
3 do if (A[i] ≤ A[j])
4 then B[k++ ] ← A[i++ ]
5 else B[k++ ] ← A[j++ ]
6 while (i ≤ q)
7 do B[k++ ] ← A[i++ ]
8 while (j ≤ r)
9 do B[k++ ] ← A[j++ ]
10 for i ← p to r
11 do A[i] ← B[i]

0 1 2 3 4 5 6 7

2 4 5 7 0 1 3 6

5 7 2 4 1 6 0 3

7 5 2 4 1 6 3 0

merge
Figure 3.2: Merge sort: combine phase

3.1.1 Analysis of Merge Sort


First let us consider the running time of the procedure Merge(A, p, q, r). Let n = r − p + 1 denote the
total length of both the left and right sub-arrays. What is the running time of Merge as a function of n?
The algorithm contains four loops (none nested in the other). It is easy to see that each loop can be
executed at most n times. (If you are a bit more careful you can actually see that all the while-loops
30 CHAPTER 3. DIVIDE AND CONQUER STRATEGY

together can only be executed n times in total, because each execution copies one new element to the
array B, and B only has space for n elements.) Thus the running time to Merge n items is Θ(n). Let us
write this without the asymptotic notation, simply as n. (We’ll see later why we do this.)
Now, how do we describe the running time of the entire MergeSort algorithm? We will do this through
the use of a recurrence, that is, a function that is defined recursively in terms of itself. To avoid
circularity, the recurrence for a given value of n is defined in terms of values that are strictly smaller than
n. Finally, a recurrence has some basis values (e.g. for n = 1), which are defined explicitly.
Let T (n) denote the worst case running time of MergeSort on an array of length n. If we call MergeSort
with an array containing a single item (n = 1) then the running time is constant. We can just write
T (n) = 1, ignoring all constants. For n > 1, MergeSort splits into two halves, sorts the two and then
merges them together. The left half is of size dn/2e and the right half is bn/2c. How long does it take to
sort elements in sub array of size dn/2e? We do not know this but because dn/2e < n for n > 1, we can
express this as T (dn/2e). Similarly the time taken to sort right sub array is expressed as T (bn/2c). In
conclusion we have 
1 if n = 1,
T (n) =
T (dn/2e) + T (bn/2c) + n otherwise
This is called recurrence relation, i.e., a recursively defined function. Divide-and-conquer is an
important design technique, and it naturally gives rise to recursive algorithms. It is thus important to
develop mathematical techniques for solving recurrences, either exactly or asymptotically.
Let’s expand the terms.

T (1) = 1
T (2) = T (1) + T (1) + 2 = 1 + 1 + 2 = 4
T (3) = T (2) + T (1) + 3 = 4 + 1 + 3 = 8
T (4) = T (2) + T (2) + 4 =4 +4 +4 =12
T (5) = T (3) + T (2) + 5 = 8 + 4 + 5 = 17
...
T (8) = T (4) + T (4) + 8 = 12 + 12 + 8 = 32
...
T (16) = T (8) + T (8) + 16 = 32 + 32 + 16 = 80
...
T (32) = T (16) + T (16) + 32 = 80 + 80 + 32 = 192

What is the pattern here? Let’s consider the ratios T (n)/n for powers of 2:

T (1)/1 = 1 T (8)/8 = 4
T (2)/2 = 2 T (16)/16 = 5
T (4)/4 = 3 T (32)/32 = 6

This suggests T (n)/n = log n + 1 Or, T (n) = n log n + n which is Θ(n log n) (using the limit rule).
3.1. MERGE SORT 31

3.1.2 The Iteration Method for Solving Recurrence Relations


Floor and ceilings are a pain to deal with. If n is assumed to be a power of 2 (2k = n), this will simplify
the recurrence to 
1 if n = 1,
T (n) =
2T (n/2) + n otherwise
The iteration method turns the recurrence into a summation. Let’s see how it works. Let’s expand the
recurrence:

T (n) = 2T (n/2) + n
= 2(2T (n/4) + n/2) + n
= 4T (n/4) + n + n
= 4(2T (n/8) + n/4) + n + n
= 8T (n/8) + n + n + n
= 8(2T (n/16) + n/8) + n + n + n
= 16T (n/16) + n + n + n + n

If n is a power of 2 then let n = 2k or k = log n.

T (n) = 2kT (n/(2k)) + (n + n + n + · · · + n)


| {z }
k times
k k
= 2 T (n/(2 )) + kn
= 2(log n)T (n/(2(log n))) + (log n)n
= 2(log n)T (n/n) + (log n)n
= nT (1) + n log n = n + n log n

3.1.3 Visualizing Recurrences Using the Recursion Tree


Iteration is a very powerful technique for solving recurrences. But, it is easy to get lost in all the symbolic
manipulations and lose sight of what is going on. Here is a nice way to visualize what is going on in
iteration. We can describe any recurrence in terms of a tree, where each expansion of the recurrence takes
us one level deeper in the tree.
Recall that the recurrence for MergeSort (which we simplified by assuming that n is a power of 2, and
hence could drop the floors and ceilings)

1 if n = 1,
T (n) =
2T (n/2) + n otherwise

Suppose that we draw the recursion tree for MergeSort, but each time we merge two lists, we label that
node of the tree with the time it takes to perform the associated (nonrecursive) merge. Recall that to
32 CHAPTER 3. DIVIDE AND CONQUER STRATEGY

merge two lists of size m/2 to a list of size m takes Θ(m) time, which we will just write as m. Below is
an illustration of the resulting recursion tree.

time to merge

n =n
+
n/2 n/2 2(n/2) = n
+
log(n)+1
n/4 n/4 n/4 n/4 4(n/4) = n levels
+
n/8 n/8 n/8 n/8 n/8 n/8 n/8 n/8 8(n/8) = n
.....
+
1 1 1 1 ...... 1 1 1 1 1 ...... 1 n(n/n) = n

n(log(n)+1)

Figure 3.3: Merge sort Recurrence Tree

3.1.4 A Messier Example

The merge sort recurrence was too easy. Let’s try a messier recurrence.

1 if n = 1,
T (n) =
3T (n/4) + n otherwise

Assume n to be a power of 4, i.e., n = 4k and k = log4 n

T (n) = 3T (n/4) + n
= 3(3T (n/16) + (n/4) + n
= 9T (n/16) + 3(n/4) + n
= 27T (n/64) + 9(n/16) + 3(n/4) + n
= ...
n
= 3kT ( k ) + 3k−1(n/4k−1)
4
+ · · · + 9(n/16) + 3(n/4) + n
X 3i k−1
kn
= 3 T ( k) + n
4 i=0
4i
3.1. MERGE SORT 33

With n = 4k and T (1) = 1

X 3i k−1
k n
T (n) = 3 T ( k ) + n
4 i=0
4i
(log4 n)−1
X 3i
log4 n
=3 T (1) + n
i=0
4i
(log4 n)−1
X 3i
= nlog4 3 + n
i=0
4i

We used the formula alogb n = nlogb a. n remains constant throughout the sum and 3i/4i = (3/4)i; we
thus have
(log4 n)−1 i
X 3
log4 3
T (n) = n +n
i=0
4

The sum is a geometric series; recall that for x 6= 1


m
X xm+1 − 1
xi =
i=0
x−1

In this case x = 3/4 and m = log4 n − 1. We get


(3/4)log4 n+1 − 1
T (n) = nlog4 3 + n
(3/4) − 1
Applying the log identity once more

(3/4)log4 n = nlog4 (3/4) = nlog4 3−log4 4


nlog4 3
= nlog4 3−1 =
n
If we plug this back, we get
nlog4 3
log4 3 −1 n
T (n) = n +n
(3/4) − 1
log4 3
n −n
= nlog4 3 +
−1/4
log4 3
=n + 4(n − nlog4 3)
= 4n − 3nlog4 3

With log4 3 ≈ 0.79, we finally have the result!

T (n) = 4n − 3nlog4 3 ≈ 4n − 3n0.79 ∈ Θ(n)


34 CHAPTER 3. DIVIDE AND CONQUER STRATEGY

3.2 Selection Problem


Suppose we are given a set of n numbers. Define the rank of an element to be one plus the number of
elements that are smaller. Thus, the rank of an element is its final position if the set is sorted. The
minimum is of rank 1 and the maximum is of rank n.
Consider the set: {5, 7, 2, 10, 8, 15, 21, 37, 41}. The rank of each number is its position in the sorted
order.

position 1 2 3 4 5 6 7 8 9
Number 2 5 7 8 10 15 21 37 41

For example, rank of 8 is 4, one plus the number of elements smaller than 8 which is 3.
The selection problem is stated as follows:

Given a set A of n distinct numbers and an integer k, 1 ≤ k ≤ n, output the element of A of rank
k.

Of particular interest in statistics is the median. If n is odd then the median is defined to be element of
rank (n + 1)/2. When n is even, there are two choices: n/2 and (n + 1)/2. In statistics, it is common to
return the average of the two elements.
Medians are useful as a measures of the central tendency of a set especially when the distribution of
values is highly skewed. For example, the median income in a community is a more meaningful measure
than average. Suppose 7 households have monthly incomes 5000, 7000, 2000, 10000, 8000, 15000 and
16000. In sorted order, the incomes are 2000, 5000, 7000, 8000, 10000, 15000, 16000. The median
income is 8000; median is element with rank 4: (7 + 1)/2 = 4. The average income is 9000. Suppose the
income 16000 goes up to 450,000. The median is still 8000 but the average goes up to 71,000. Clearly,
the average is not a good representative of the majority income levels.
The selection problem can be easily solved by simply sorting the numbers of A and returning A[k].
Sorting, however, requires Θ(n log n) time. The question is: can we do better than that? In particular, is
it possible to solve the selections problem in Θ(n) time? The answer is yes. However, the solution is far
from obvious.

3.2.1 Sieve Technique


The reason for introducing this algorithm is that it illustrates a very important special case of
divide-and-conquer, which I call the sieve technique. We think of divide-and-conquer as breaking the
problem into a small number of smaller subproblems, which are then solved recursively. The sieve
technique is a special case, where the number of subproblems is just 1.
The sieve technique works in phases as follows. It applies to problems where we are interested in finding
a single item from a larger set of n items. We do not know which item is of interest, however after doing
some amount of analysis of the data, taking say Θ(nk) time, for some constant k, we find that we do not
3.2. SELECTION PROBLEM 35

know what the desired the item is, but we can identify a large enough number of elements that cannot be
the desired value, and can be eliminated from further consideration. In particular “large enough” means
that the number of items is at least some fixed constant fraction of n (e.g. n/2, n/3). Then we solve the
problem recursively on whatever items remain. Each of the resulting recursive solutions then do the same
thing, eliminating a constant fraction of the remaining set.

3.2.2 Applying the Sieve to Selection


To see more concretely how the sieve technique works, let us apply it to the selection problem. We will
begin with the given array A[1..n]. We will pick an item from the array, called the pivot element which
we will denote by x. We will talk about how an item is chosen as the pivot later; for now just think of it as
a random element of A.
We then partition A into three parts.

1. A[q] contains the pivot element x,

2. A[1..q − 1] will contain all the elements that are less than x and

3. A[q + 1..n] will contains all elements that are greater than x.

Within each sub array, the items may appear in any order. The following figure shows a partitioned array:

pivot
p r
5 92 64 1 3 7 Before partitioning

p q r
3 12 4 69 5 7 After partitioning
<x x >x
Figure 3.4: A[p..r] partitioned about the pivot x

3.2.3 Selection Algorithm


It is easy to see that the rank of the The rank of the pivot x is q − p + 1 in A[p..r]. Let
rank x = q − p + 1. If k = rank x then the pivot is kth smallest. If k < rank x then search
A[p..q − 1] recursively. If k > rank x then search A[q + 1..r] recursively. Find element of rank (k − q)
because we eliminated q smaller elements in A.
SELECT( array A, int p, int r, int k)
36 CHAPTER 3. DIVIDE AND CONQUER STRATEGY

1 if (p = r)
2 then return A[p]
3 else x ← CHOOSE PIVOT(A, p, r)
4 q ← PARTITION(A, p, r, x)
5 rank x ← q − p + 1
6 if k = rank x
7 then return x
8 if k < rank x
9 then return SELECT(A, p, q − 1, k)
10 else return SELECT(A, q + 1, r, k − q)

Example: select the 6th smallest element of the set {5, 9, 2, 6, 4, 1, 3, 7}

rankx= 4
5 3
9 1
2 2
rankx= 3 rankx= 2
6 4
pivot 4 6 6 6 pivot 6 5
1 9 9 5 5 6
3 5 5 7
7 7 pivot 7 9
initial partition recurse partition recurse partition
k=6 pivot=4 k=(6-4)=2 pivot=7 k=2 pivot=6
DONE!!
Figure 3.5: Sieve example: select 6th smallest element

3.2.4 Analysis of Selection


We will discuss how to choose a pivot and the partitioning later. For the moment, we will assume that
they both take Θ(n) time. How many elements do we eliminate in each time? If x is the largest or the
smallest then we may only succeed in eliminating one element.

5, 9, 2, 6, 4, 1 , 3, 7 pivot is 1
1 , 5, 9, 2, 6, 4, 3, 7 after partition

Ideally, x should have a rank that is neither too large or too small.
Suppose we are able to choose a pivot that causes exactly half of the array to be eliminated in each phase.
3.2. SELECTION PROBLEM 37

This means that we recurse on the remaining n/2 elements. This leads to the following recurrence:

1 if n = 1,
T (n) =
T (n/2) + n otherwise

If we expand this recurrence, we get


n n
T (n) = n + + + ...
2 4
X∞
n

i=0
2i
X∞
1
=n
i=0
2i

Recall the formula for infinite geometric series; for any |c| < 1,

X 1
ci =
i=0
1−c

Using this we have


T (n) ≤ 2n ∈ Θ(n)

Let’s think about how we ended up with a Θ(n) algorithm for selection. Normally, a Θ(n) time
algorithm would make a single or perhaps a constant number of passes of the data set. In this algorithm,
we make a number of passes. In fact it could be as many as log n.
However, because we eliminate a constant fraction of the array with each phase, we get the convergent
geometric series in the analysis. This shows that the total running time is indeed linear in n. This lesson
is well worth remembering. It is often possible to achieve linear running times in ways that you would
not expect.

You might also like