0% found this document useful (0 votes)
10 views56 pages

Understanding Heaps and Heap Operations

Uploaded by

korayairdrop
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views56 pages

Understanding Heaps and Heap Operations

Uploaded by

korayairdrop
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd

Heaps

● A heap can be seen as a complete binary tree:

16

14 10

8 7 9 3

2 4 1

■ What makes a binary tree complete?


■ Is the example above complete?
Heaps
● A heap can be seen as a complete binary tree:

16

14 10

8 7 9 3

2 4 1 1 1 1 1 1

■ The book calls them “nearly complete” binary trees;


can think of unfilled slots as null pointers
Heaps
● In practice, heaps are usually implemented as
arrays:
16

14 10

8 7 9 3
A = 16 14 10 8 7 9 3 2 4 1 =
2 4 1
Heaps
● To represent a complete binary tree as an array:
■ The root node is A[1]
■ Node i is A[i]
■ The parent of node i is A[i/2] (note: integer divide)
■ The left child of node i is A[2i]
■ The right child of node i is A[2i + 1]
16

14 10

8 7 9 3
A = 16 14 10 8 7 9 3 2 4 1 =
2 4 1
Referencing Heap Elements
● So…
Parent(i) { return i/2; }
Left(i) { return 2*i; }
right(i) { return 2*i + 1; }
● An aside: How would you implement this
most efficiently?
● Another aside: Really?
The Heap Property
● Heaps also satisfy the heap property:
A[Parent(i)]  A[i] for all nodes i > 1
■ In other words, the value of a node is at most the
value of its parent
■ Where is the largest element in a heap stored?
● Definitions:
■ The height of a node in the tree = the number of
edges on the longest downward path to a leaf
■ The height of a tree = the height of its root
Heap Height
● What is the height of an n-element heap?
Why?
● This is nice: basic heap operations take at most
time proportional to the height of the heap
Heap Operations: Heapify()
● Heapify(): maintain the heap property
■ Given: a node i in the heap with children l and r
■ Given: two subtrees rooted at l and r, assumed to be
heaps
■ Problem: The subtree rooted at i may violate the heap
property (How?)
■ Action: let the value of the parent node “float down”
so subtree at i satisfies the heap property
○ What do you suppose will be the basic operation between i,
l, and r?
Heap Operations: Heapify()
Heapify(A, i)
{
l = Left(i); r = Right(i);
if (l <= heap_size(A) && A[l] > A[i])
largest = l;
else
largest = i;
if (r <= heap_size(A) && A[r] > A[largest])
largest = r;
if (largest != i)
Swap(A, i, largest);
Heapify(A, largest);
}
Heapify() Example

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
Heapify() Example

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
Heapify() Example

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
Heapify() Example

16

14 10

4 7 9 3

2 8 1

A = 16 14 10 4 7 9 3 2 8 1
Heapify() Example

16

14 10

4 7 9 3

2 8 1

A = 16 14 10 4 7 9 3 2 8 1
Heapify() Example

16

14 10

4 7 9 3

2 8 1

A = 16 14 10 4 7 9 3 2 8 1
Heapify() Example

16

14 10

8 7 9 3

2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Heapify() Example

16

14 10

8 7 9 3

2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Heapify() Example

16

14 10

8 7 9 3

2 4 1

A = 16 14 10 8 7 9 3 2 4 1
Analyzing Heapify(): Informal
● Aside from the recursive call, what is the
running time of Heapify()?
● How many times can Heapify() recursively
call itself?
● What is the worst-case running time of
Heapify() on a heap of size n?
Analyzing Heapify(): Formal
● Fixing up relationships between i, l, and r
takes (1) time
● If the heap at i has n elements, how many
elements can the subtrees at l or r have?
■ Draw it
● Answer: 2n/3 (worst case: bottom row 1/2 full)
● So time taken by Heapify() is given by
T(n)  T(2n/3) + (1)
Analyzing Heapify(): Formal
● So we have
T(n)  T(2n/3) + (1)
● By case 2 of the Master Theorem,
T(n) = O(lg n)
● Thus, Heapify() takes linear time
Heap Operations: BuildHeap()
● We can build a heap in a bottom-up manner by
running Heapify() on successive subarrays
■ Fact: for array of length n, all elements in range
A[n/2 + 1 .. n] are heaps (Why?)
■ So:
○ Walk backwards through the array from n/2 to 1, calling
Heapify() on each node.
○ Order of processing guarantees that the children of node
i are heaps when i is processed
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
heap_size(A) = length(A);
for (i = length[A]/2 downto 1)
Heapify(A, i);
}
BuildHeap() Example
● Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}

1 3

2 16 9 10

14 8 7
Analyzing BuildHeap()
● Each call to Heapify() takes O(lg n) time
● There are O(n) such calls (specifically, n/2)
● Thus the running time is O(n lg n)
■ Is this a correct asymptotic upper bound?
■ Is this an asymptotically tight bound?
● A tighter bound is O(n)
■ How can this be? Is there a flaw in the above
reasoning?
Analyzing BuildHeap(): Tight
● To Heapify() a subtree takes O(h) time
where h is the height of the subtree
■ h = O(lg m), m = # nodes in subtree
■ The height of most subtrees is small
● Fact: an n-element heap has at most n/2h+1
nodes of height h
● CLR 7.3 uses this fact to prove that
BuildHeap() takes O(n) time
Heapsort
● Given BuildHeap(), an in-place sorting
algorithm is easily constructed:
■ Maximum element is at A[1]
■ Discard by swapping with element at A[n]
○ Decrement heap_size[A]
○ A[n] now contains correct value
■ Restore heap property at A[1] by calling
Heapify()
■ Repeat, always swapping A[1] for A[heap_size(A)]
Heapsort
Heapsort(A)
{
BuildHeap(A);
for (i = length(A) downto 2)
{
Swap(A[1], A[i]);
heap_size(A) -= 1;
Heapify(A, 1);
}
}
Analyzing Heapsort
● The call to BuildHeap() takes O(n) time
● Each of the n - 1 calls to Heapify() takes
O(lg n) time
● Thus the total time taken by HeapSort()
= O(n) + (n - 1) O(lg n)
= O(n) + O(n lg n)
= O(n lg n)
Heapify
● Heapify picks the largest child key and compare it to the parent
key. If parent key is larger than heapify quits, otherwise it swaps
the parent key with the largest child key. So that the parent is now
becomes larger than its children.
Heapify(A, i)
{
l  left(i)
r  right(i)
if l <= heapsize[A] and A[l] > A[i]
then largest l
else largest  i
if r <= heapsize[A] and A[r] > A[largest]
then largest  r
if largest != i
then swap A[i]  A[largest]
Heapify(A, largest)
}
Build Heap
● We can use the procedure 'Heapify' in a bottom-up fashion to
convert an array A[1 . . n] into a heap. Since the elements in the
subarray A[n/2 +1 . . n] are all leaves, the procedure
BUILD_HEAP goes through the remaining nodes of the tree and
runs 'Heapify' on each one. The bottom-up order of processing
node guarantees that the subtree rooted at children are heap before
'Heapify' is run at their parent.

Buildheap(A)
{
heapsize[A] length[A]
for i |length[A]/2 //down to 1
do Heapify(A, i)
}
Heap Sort Algorithm
● The heap sort algorithm starts by using procedure BUILD-HEAP to
build a heap on the input array A[1 . . n]. Since the maximum
element of the array stored at the root A[1], it can be put into its
correct final position by exchanging it with A[n] (the last element in
A). If we now discard node n from the heap than the remaining
elements can be made into heap. Note that the new element at the
root may violate the heap property. All that is needed to restore the
heap property.

Heapsort(A)
{
Buildheap(A)
for i  length[A] //down to 2
do swap A[1]  A[i]
heapsize[A]  heapsize[A] - 1
Heapify(A, 1)
}
Example: Convert the following array to a heap

16 4 7 1 12 19
Picture the array as a complete binary tree:

16

4 7

1 12 19
16 16

4 7 4 19
swa
p

12 19 1 12 7
1

16 19
swa
p
12 19 12 16
swa
p

4 7 1 4 7
1
Heap Sort
● The heapsort algorithm consists of two phases:
- build a heap from an arbitrary array
- use the heap to sort the data

● To sort the elements in the decreasing order, use a min heap


● To sort the elements in the increasing order, use a max heap

19

12 16

1 4 7
Example of Heap Sort
Take out biggest
19

12 16
Move the last element
to the root

1 4 7

Sorted:
Array A

12 1 1 4 7 19
6
7
swap
HEAPIFY()
12 16

1 4

Sorted:
Array A

7 12 1 1 4 19
6
16

12 7

1 4

Sorted:
Array A

1 12 7 1 4 19
6
Take out biggest
16
Move the last element
to the root
12 7

1 4

Sorted:
Array A

12 7 1 4 1 19
6
4

12 7

Sorted:
Array A

4 12 7 1 1 19
6
swap 4
HEAPIFY()
12 7

Sorted:
Array A

4 12 7 1 1 19
6
12

4 7

Sorted:
Array A

12 4 7 1 1 19
6
Take out biggest
12
Move the last
element to the
root 4 7

Sorted:
Array A

4 7 1 12 1 19
6
1 swap

4 7

Sorted:
Array A
1 4 7 12 1 19
6
7

4 1

Sorted:
Array A

7 4 1 12 1 19
6
Take out biggest
7
Move the last
element to the
4 1 root

Sorted:
Array A

1 4 7 12 1 19
6
swap 1
HEAPIFY()
4

Sorted:
Array A

4 1 7 12 1 19
6
Take out biggest
Move the last 4
element to the
root
1

Sorted:
Array A

1 4 7 12 1 19
6
Take out biggest
1

Sorted:
Array A

1 4 7 12 1 19
6
Sorted:

1 4 7 12 16 19
Time Analysis

● Build Heap Algorithm will run in O(n)


time
● There are n-1 calls to Heapify each call
requires O(log n) time
● Heap sort program combine Build Heap
program and Heapify, therefore it has the
running time of O(n log n) time
● Total time complexity: O(n log n)
Comparison with Quick Sort and
Merge Sort
● Quick sort is typically somewhat faster, due to better cache
behavior and other factors, but the worst-case running time for
quick sort is O (n2), which is unacceptable for large data sets and
can be deliberately triggered given enough knowledge of the
implementation, creating a security risk.

● The quick sort algorithm also requires Ω (log n) extra storage


space, making it not a strictly in-place algorithm. This typically
does not pose a problem except on the smallest embedded systems,
or on systems where memory allocation is highly restricted.
Constant space (in-place) variants of quick sort are possible to
construct, but are rarely used in practice due to their extra
complexity.
Comparison with Quick Sort and
Merge Sort (cont)
● Thus, because of the O(n log n) upper bound on heap sort’s running
time and constant upper bound on its auxiliary storage, embedded
systems with real-time constraints or systems concerned with
security often use heap sort.

● Heap sort also competes with merge sort, which has the same time
bounds, but requires Ω(n) auxiliary space, whereas heap sort
requires only a constant amount. Heap sort also typically runs more
quickly in practice. However, merge sort is simpler to understand
than heap sort, is a stable sort, parallelizes better, and can be easily
adapted to operate on linked lists and very large lists stored on
slow-to-access media such as disk storage or network attached
storage. Heap sort shares none of these benefits; in particular, it
relies strongly on random access.
Possible Application

■ When we want to know the task that carry the


highest priority given a large number of
things to do

■ Interval scheduling, when we have a lists of


certain task with start and finish times and we
want to do as many tasks as possible

■ Sorting a list of elements that needs and


efficient sorting algorithm
Conclusion
● The primary advantage of the heap sort is its
efficiency. The execution time efficiency of the
heap sort is O(n log n). The memory efficiency of
the heap sort, unlike the other n log n sorts, is
constant, O(1), because the heap sort algorithm is
not recursive.
● The heap sort algorithm has two major steps. The
first major step involves transforming the complete
tree into a heap. The second major step is to
perform the actual sort by extracting the largest
element from the root and transforming the
remaining tree into a heap.

You might also like