Heap Sort and Binary Tree Concepts
Heap Sort and Binary Tree Concepts
Sorting
Heap Sort
Special Types of Trees
• Def: Full binary tree = a 4
3
Almost Complete Binary Tree
• Canonical labeling of nodes: Label the Nodes in the level-
wise fashion from left to right, as shown below
4 Height of root = 3
1 3
Height of (2)= 1 2 16 9 10 Level of (10)= 2
14 8
5
Useful Properties
There are at most 2l nodes at level (or depth) l of a
binary tree
A binary tree with maximum level d has at most 2d+1 -
1 nodes
d 1
d
2 1
n 2l 2d 1 1 4 Height of root = 3
l 0 2 1
1 3
Height of (2)= 1 2 16 9 10 Level of (10)= 2
14 8
6
The Heap Data Structure
• Def: A heap is an almost complete binary tree with the
following two properties:
– Structural property: all levels are full, except possibly the
last one, which is filled from left to right
– Max heap property: for any node x, Parent(x) ≥ x
Max Heap
7
Array Representation of Heaps
• A heap can be stored as an array
A.
– Root of tree is A[1]
– Left child of A[i] = A[2i]
– Right child of A[i] = A[2i + 1]
– Parent of A[i] = A[ i/2 ]
– Heapsize[A] ≤ length[A]
• The elements in the subarray
A[(n/2+1) .. n] are leaves
8
Heap Types
• Max-heaps (largest element at root), have the
max-heap property:
– for all nodes i, excluding the root:
A[PARENT(i)] ≥ A[i]
9
Adding/Deleting Nodes
• New nodes are always inserted at the bottom
level (left to right)
• Nodes are removed from the bottom level
(right to left)
10
Operations on Heaps
• Maintain/Restore the max-heap property
– MAX-HEAPIFY
• Create a max-heap from an unordered array
– BUILD-MAX-HEAP
• Sort an array in place
– HEAPSORT
11
Maintaining the Heap Property
• Suppose a node is smaller than a child
– Left and Right subtrees of i are max-heaps
• To eliminate the violation:
– Exchange with larger child
– Move down the tree
– Continue until node is not smaller than
children
12
Example
MAX-HEAPIFY(A, 2, 10)
A[2] A[4]
A[2] violates the heap property A[4] violates the heap property
A[4] A[9]
14
MAX-HEAPIFY Running Time
• It checks a path starting from current node to leaf node. At
every level it performs exactly 2 comparisons. At max
length of this path is h. So total number of comparisons is
at most 2h. So complexity is O(h) or O(logn)
15
Building a Max-Heap
2 3
1 3
4 5 6 7
2 16 9 10
8 9 10
14 8 7
Building a Heap
• Convert an array A[1 … n] into a max-heap (n = length[A])
• The elements in the subarray A[(n/2+1) .. n] are leaves
• Apply MAX-HEAPIFY on elements between 1 and n/2
Alg: BUILD-MAX-HEAP(A) 1
1. n = length[A] 4
3. do MAX-HEAPIFY(A, i, n) 8
2 9 10
16 9 10
14 8 7
A: 4 1 3 2 16 9 10 14 8 7
17
Example: A 4 1 3 2 16 9 10 14 8 7
4 4 4
2 3 2 3 2 3
1 3 1 3 1 3
4 5 6 7 4 5 6 7 4 5 6 7
8
2 9 10
16 9 10 8
2 9 10
16 9 10 8
14 9 10
16 9 10
14 8 7 14 8 7 2 8 7
i=2 i=1
1 1 1
4 4 16
2 3 2 3 2 3
1 10 16 10 14 10
4 5 6 7 4 5 6 7 4 5 6 7
8
14 9 10
16 9 3 8
14 9 10
7 9 3 8
8 9 10
7 9 3
2 8 7 2 8 1 2 4 1
18
Running Time of BUILD MAX HEAP
Alg: BUILD-MAX-HEAP(A)
1. n = length[A]
2. for i ← n/2 downto 1
O(n)
3. do MAX-HEAPIFY(A, i, n) O(lgn)
19
Running Time of BUILD MAX HEAP
• HEAPIFY takes O(h) the cost of HEAPIFY on a node i is
proportional to the height of the node i in the tree
h
T (n) ni hi 2i h i O(n)
h
h1 = 2 i=1 21
h2 = 1 i=2 22
h3 = 0 i = 3 (lgn) 23
• Idea:
– Build a max-heap from the array
– Swap the root (the maximum element) with the last
element in the array
– “Discard” this last node by decreasing the heap size
– Call MAX-HEAPIFY on the new root
– Repeat this process until only one node remains
22
Example: A=[7, 4, 3, 1, 2]
MAX-HEAPIFY(A, 1, 1)
23
Alg: HEAPSORT(A)
1. BUILD-MAX-HEAP(A) O(n)
2. for i ← length[A] downto 2
3. do exchange A[1] ↔ A[i] n-1 times
4. MAX-HEAPIFY(A, 1, i - 1) O(lgn)
12 4
25
Operations
on Priority Queues
• Max-priority queues support the following
operations:
– INSERT(S, x): inserts element x into set S
– EXTRACT-MAX(S): removes and returns element of S
with largest key
– MAXIMUM(S): returns element of S with largest key
– INCREASE-KEY(S, x, k): increases value of element x’s
key to k (Assume k ≥ x’s current key value)
26
HEAP-MAXIMUM
Goal:
– Return the largest element of the heap
Heap-Maximum(A) returns 7
27
HEAP-EXTRACT-MAX
Goal:
– Extract the largest element of the heap (i.e., return the max value
and also remove that element from the heap
Idea:
– Exchange the root element with the last
– Decrease the size of the heap by 1 element
– Call MAX-HEAPIFY on the new root, on a heap of size n-1
28
Example: HEAP-EXTRACT-MAX
16 1
14 10 max = 16 14 10
8 7 9 3 8 7 9 3
2 4 1 2 4
Heap size decreased with 1
14
29
HEAP-EXTRACT-MAX
Alg: HEAP-EXTRACT-MAX(A, n)
1. if n < 1
3. max ← A[1]
4. A[1] ← A[n]
6. return max
Running time: O(lgn)
30
HEAP-INCREASE-KEY
• Goal:
– Increases the key of an element i in the heap
• Idea:
– Increment the key of A[i] to its new value
– If the max-heap property does not hold anymore: traverse
a path toward the root to find the proper place for the
newly increased key
16
14 10
8 i 7 9 3
Key [i] ← 15 2 4 1
31
Example: HEAP-INCREASE-KEY
16 16
14 10 14 10
8 i 7 9 3 8 i 7 9 3
2 4 1 2 15 1
Key [i ] ← 15
16 16
i
14 10 15 10
i
15 7 9 3 14 7 9 3
2 8 1 2 8 1
32
HEAP-INCREASE-KEY
Alg: HEAP-INCREASE-KEY(A, i, key)
33
MAX-HEAP-INSERT
• Goal:
16
– Inserts a new element into a
14 10
max-heap
8 7 9 3
• Idea: 2 4 1 -
14 10 14 10
8 7 9 3 8 7 9 3
2 4 1 - 2 4 1 15
16 16
14 10 15 10
8 15 9 3 8 14 9 3
2 4 1 7 2 4 1 7
35
MAX-HEAP-INSERT
16
Alg: MAX-HEAP-INSERT(A, key, n)
14 10
1. heap-size[A] ← n + 1 8 7 9 3
2 4 1 -
2. A[n + 1] ← -
3. HEAP-INCREASE-KEY(A, n + 1, key)
Running time: O(lgn)
36
Summary
• We can perform the following operations on heaps:
– MAX-HEAPIFY O(lgn)
– BUILD-MAX-HEAP O(n)
– HEAP-SORT O(nlgn)
– MAX-HEAP-INSERT O(lgn)
– HEAP-EXTRACT-MAX O(lgn)
– HEAP-INCREASE-KEY O(lgn)
Average
– HEAP-MAXIMUM O(1) O(lgn)
37
Quick Sort
QuickSort Design
• Follows the divide-and-conquer paradigm.
• Divide: Partition (separate) the array A[p..r] into two
(possibly empty) subarrays A[p..q–1] and A[q+1..r].
– Each element in A[p..q–1] <= A[q].
– A[q] < each element in A[q+1..r].
– Index q is computed as part of the partitioning procedure.
• Conquer: Sort the two subarrays by recursive calls to
quicksort.
• Combine: The subarrays are sorted in place – no work is
needed to combine them.
• How do the divide and conquer steps of quicksort
compare with those of merge sort?
Pseudocode
Quicksort(A, p, r)
if p < r then Partition(A, p, r)
q := Partition(A, p, r); x:= A[r]
Quicksort(A, p, q – 1); i=p – 1;
Quicksort(A, q + 1, r) for j := p to r – 1 do
if A[j] x then
A[p..r] i := i + 1;
A[i] A[j]
5 A[i + 1] A[r];
return i + 1
A[p..q – 1] A[q+1..r]
Partition 0 1 2 3 4 5 6 7
5 Z Y X
5 >5
i j
Example
Position of I and j after of line 3
p r
initially: 2 5 8 3 9 4 1 7 10 6 note: pivot (x) = 6
i j
Partition(A, p, r)
next iteration: 2 5 8 3 9 4 1 7 10 6 1 x:= A[r]
i j 2 i=p – 1;
3 for j := p to r – 1 do
next iteration: 2 5 8 3 9 4 1 7 10 6 4 if A[j] x then
i j 5 i := i + 1;
6 A[i] A[j]
next iteration: 2 5 8 3 9 4 1 7 10 6
7 A[i + 1] A[r];
i j
8 return i + 1
next iteration: 2 5 3 8 9 4 1 7 10 6
i j
Example (Continued)
next iteration: 2 5 3 8 9 4 1 7 10 6
i j
next iteration: 2 5 3 4 9 8 1 7 10 6
Partition(A, p, r)
i j
x:= A[r]
next iteration: 2 5 3 4 1 8 9 7 10 6
i=p – 1;
i j
for j := p to r – 1 do
next iteration: 2 5 3 4 1 8 9 7 10 6
if A[j] x then
i j
i := i + 1;
after final swap: 2 5 3 4 1 6 9 7 10 8
A[i] A[j]
i j
A[i + 1] A[r];
return i + 1
Partitioning
• Select the last element A[r] in the subarray A[p..r] as
the pivot – the element around which to partition.
• As the procedure executes, the array is partitioned into
four (possibly empty) regions.
1. A[p..i ] — All entries in this region are <= pivot.
2. A[i+1..j – 1] — All entries in this region are > pivot.
3. A[r] = pivot.
4. A[j..r – 1] — Not known how they compare to pivot.
• The above hold before each iteration of the for loop,
and constitute a loop invariant.
Correctness of Partition
• Use loop invariant.
• Initialization:
– Before first iteration
• A[p..i] and A[i+1..j – 1] are empty – Conds. 1 and 2 are satisfied
(trivially).
• r is the index of the pivot Partition(A, p, r)
– Cond. 3 is satisfied.
x:= A[r]
• Maintenance: i=p – 1;
– Case 1: A[j] > x for j := p to r – 1 do
if A[j] x then
• Increment j only.
i := i + 1;
• Loop Invariant is maintained. A[i] A[j]
A[i + 1] A[r];
return i + 1
Correctness of Partition
Case 1:
p i j r
>x x
x >x
p i j r
x
x >x
Correctness of Partition
• Case 2: A[j] x – Increment j
– Increment i • Condition 2 is maintained.
p i j r
x x
x >x
p i j r
x
x >x
Correctness of Partition
• Termination:
– When the loop terminates, j = r, so all elements in A
are partitioned into one of the three cases:
• A[p..i] pivot
• A[i+1..j – 1] > pivot
• A[r] = pivot
• The last two lines swap A[i+1] and A[r].
– Pivot moves from the end of the array to between the
two subarrays.
– Thus, procedure partition correctly performs the
divide step.
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• What is best case running time?
– Recursion:
1. Partition splits array in two sub-arrays of size n/2
2. Quicksort each sub-array
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• What is best case running time?
– Recursion:
1. Partition splits array in two sub-arrays of size n/2
2. Quicksort each sub-array
– Depth of recursion tree?
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• What is best case running time?
– Recursion:
1. Partition splits array in two sub-arrays of size n/2
2. Quicksort each sub-array
– Depth of recursion tree? O(log2n)
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• What is best case running time?
– Recursion:
1. Partition splits array in two sub-arrays of size n/2
2. Quicksort each sub-array
– Depth of recursion tree? O(log2n)
– Number of accesses in partition?
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• What is best case running time?
– Recursion:
1. Partition splits array in two sub-arrays of size n/2
2. Quicksort each sub-array
– Depth of recursion tree? O(log2n)
– Number of accesses in partition? O(n)
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• Recurrence relation T(n)=2T(n/2)+(n)
• Running time: O(n log2n)
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• Best case running time: O(n log2n)
• Worst case running time?
– Recursion:
1. Partition splits array in two sub-arrays:
• one sub-array of size 0
• the other sub-array of size n-1
2. Quicksort each sub-array
– Depth of recursion tree?
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• Best case running time: O(n log2n)
• Worst case running time?
– Recursion:
1. Partition splits array in two sub-arrays:
• one sub-array of size 0
• the other sub-array of size n-1
2. Quicksort each sub-array
– Depth of recursion tree? O(n)
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• Best case running time: O(n log2n)
• Worst case running time?
– Recursion:
1. Partition splits array in two sub-arrays:
• one sub-array of size 0
• the other sub-array of size n-1
2. Quicksort each sub-array
– Depth of recursion tree? O(n)
– Number of accesses per partition?
Quicksort Analysis
• Assume that keys are random, uniformly
distributed.
• Best case running time: O(n log2n)
• Worst case running time?
– Recursion:
1. Partition splits array in two sub-arrays:
• one sub-array of size 0
• the other sub-array of size n-1
2. Quicksort each sub-array
– Depth of recursion tree? O(n)
– Number of accesses per partition? O(n)
Quicksort Analysis
• Balanced partitioning
Quicksort Analysis
• mix of “good” and “bad” splits
A randomized version of quicksort
• RANDOMIZED-PARTITION (A, p, r)
1. i = RANDOM(p, r)
2. exchange A[r] with A[i]
3. return PARTITION(A, p, r)
• RANDOMIZED-QUICKSORT(A,p,r)
1. if p < r
1. q = RANDOMIZED-PARTITION(A,p,r)
2. RANDOMIZED-QUICKSORT(A,p,q -1)
3. RANDOMIZED-QUICKSORT(A,q+ 1, r)
Expected running time
• The running time of QUICKSORT is dominated by the
time spent in the PARTITION procedure.
• Each time the PARTITION procedure is called, it
selects a pivot element, and this element is never
included in any future recursive calls to QUICKSORT
and PARTITION.
• Thus, there can be at most n calls to PARTITION over
the entire execution of the quicksort algorithm.
Expected Running Time of Partition
• One call to PARTITION takes O(1) time plus an
amount of time that is proportional to the number of
iterations of the for loop in lines 3–6