The first sort we looked at was insertion sort, which had (average) run time O(n^2) but required only a constant (with respect to n) amount of additional storage. Mergesort, on the other hand, had O(n lg n) run time, but the amount of memory required grew with n (since at each level of recursion new partial arrays are allocated). Heapsort is a sorting algorithm that keeps the O(n lg n) run time (like mergesort) but operates in place (like insertion sort). It is based on a data structure known as a heap (which is also one way to implement a priority queue).

Heaps

A heap is simply an array viewed as a nearly complete binary tree (meaning that each level is filled from left to right and a subsequent level is not started until the previous level is completely filled). For example, the array

[image: example array]

can be viewed as the heap

[image: the same array drawn as a nearly complete binary tree]

Heap values

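We define the following values for an array A viewed as a heap:

A.length: the number of elements in the array
A.heapsize: the number of elements of the array that currently belong to the heap (A.heapsize ≤ A.length)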

Heap operations

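For heaps (with the root stored at A[1]) we assume the following operations:

PARENT(i) = ⌊i/2⌋: the index of the parent of node i
LEFT(i) = 2i: the index of the left child of node i
RIGHT(i) = 2i + 1: the index of the right child of node i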

Note: LEFT() and RIGHT() can be efficiently computed using bitwise shift and set operations (a left shift by one bit doubles the index, and setting the low bit of the result adds one).
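
The shift-and-set idea in the note can be sketched in Python as follows. This is only an illustration, assuming the usual 1-based convention in which the root sits at index 1 and the children of node i sit at 2i and 2i + 1; the lowercase function names are chosen here for the example.

```python
def parent(i: int) -> int:
    """Parent of node i (1-based): floor(i / 2), computed with a right shift."""
    return i >> 1


def left(i: int) -> int:
    """Left child of node i (1-based): 2i, computed with a left shift."""
    return i << 1


def right(i: int) -> int:
    """Right child of node i (1-based): 2i + 1, a left shift with the low bit set."""
    return (i << 1) | 1
```

With a 0-based array (as in most programming languages) the same relations become 2i + 1, 2i + 2 and (i - 1) // 2, which no longer reduce to a single shift.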

Max-heaps

We define a max-heap as a heap that satisfies the property that A[PARENT(i)] ≥ A[i] for every node i other than the root, i.e. the value of every parent is greater than or equal to the values of both of its children. Thus for a max-heap, the largest value is stored at the root.
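
As a small illustration of the max-heap property, the following sketch checks it for a Python list. It assumes 0-based indexing (so the parent of node i is at (i - 1) // 2), and the name is_max_heap is made up for this example.

```python
def is_max_heap(a: list) -> bool:
    """Return True if the 0-based list a satisfies the max-heap property."""
    # Every non-root node i must be no larger than its parent at (i - 1) // 2.
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))
```

Flipping the comparison to <= gives the corresponding check for a min-heap.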

Min-heaps

Similarly we define a min-heap as a heap that satisfies the property that A[PARENT(i)] ≤ A[i] for every node i other than the root, i.e. the value of every parent is less than or equal to the values of both of its children. Thus for a min-heap, the smallest value is stored at the root.

Creating a max-heap

Building a max-heap from an array involves two routines: MAX-HEAPIFY() and BUILD-MAX-HEAP().

MAX-HEAPIFY()

The MAX-HEAPIFY() routine assumes that the subtrees rooted at LEFT(i) and RIGHT(i) are already max-heaps. It swaps the value at node i with its largest child, recursively, until that value ends up in the correct location in the heap. The pseudocode for the procedure is

MAX-HEAPIFY(A,i)
1  l = LEFT(i)
2  r = RIGHT(i)
3  if l <= A.heapsize and A[l] > A[i]
4     largest = l
5  else
6     largest = i
7  if r <= A.heapsize and A[r] > A[largest]
8     largest = r
9  if largest != i
10    exchange A[i] with A[largest]
11    MAX-HEAPIFY(A,largest)
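
A possible Python rendering of the pseudocode above is sketched below. It assumes a 0-based list, so the children of node i are at 2i + 1 and 2i + 2, and it passes the heap size as an explicit argument (heap_size) instead of storing it on the array; both choices are conveniences for this example.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    left = 2 * i + 1
    right = 2 * i + 2
    largest = i
    # Pick the largest of the node and its (existing) children.
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        # The swapped-down value may still violate the property one level lower.
        max_heapify(a, largest, heap_size)
```

Because the recursive call is the last statement, the routine could equally be written as a loop, which avoids recursion depth limits on very large arrays.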

Lines 1-10 take constant time (i.e. are O(1)) so the worst case run time will be when the recursion occurs the maximal number of times. Since the recursion selects one of the two children to recurse down (i.e. largest), the worst case will happen when the most nodes are retained in the selected branch as shown in the following figure

[image: worst-case heap in which the last level is half full]

Since a heap is a nearly complete binary tree, for a given level i (where the root node is at level i = 0) there will be 2^i nodes at that level. If the last level k is only half full (to retain the most nodes), there will be 2^k/2 nodes in this level. For the levels above it (levels 0 through k-1) there will be a total of

$$\sum_{i=0}^{k-1} 2^i = 2^k - 1$$

Therefore the total number of nodes in the heap can be written as

$$n = (2^k - 1) + \frac{2^k}{2} = \frac{3}{2}\,2^k - 1$$

Thus in the worst case half of the nodes in levels 0 through k-1 (not counting the root) will be retained, in addition to all the ones in the kth level, giving

$$\frac{(2^k - 1) - 1}{2} + \frac{2^k}{2} = 2^k - 1$$

Substituting 2^k = (2/3) n + 2/3 from above gives

$$\left(\frac{2}{3}n + \frac{2}{3}\right) - 1 = \frac{2}{3}n - \frac{1}{3} \le \frac{2n}{3}$$

Hence at most two-thirds of the nodes are retained in the worst case, giving the recurrence for MAX-HEAPIFY()

$$T(n) \le T(2n/3) + \Theta(1)$$

This recurrence can be solved by inspection using Case 2 of the master theorem, giving

$$T(n) = O(\lg n) = O(h)$$

where h = lg n is the maximum number of levels the node can traverse in the binary tree.
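
For readers who want the master theorem step spelled out, here is one way to write it, assuming the recurrence has the form T(n) ≤ T(2n/3) + Θ(1) (at most two-thirds of the nodes retained, constant work per call):

```latex
a = 1, \qquad b = \tfrac{3}{2}, \qquad f(n) = \Theta(1)
\\
n^{\log_b a} = n^{\log_{3/2} 1} = n^0 = 1
\quad\Longrightarrow\quad f(n) = \Theta\!\left(n^{\log_b a}\right)
\\
\text{Case 2:}\quad T(n) = O\!\left(n^{\log_b a} \lg n\right) = O(\lg n)
```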

BUILD-MAX-HEAP()

Using MAX-HEAPIFY() we can construct a max-heap by starting with the last node that has children (which is at index ⌊A.length/2⌋) and iterating back to the root, calling MAX-HEAPIFY() for each node. This ensures that the max-heap property holds at each step for all nodes evaluated so far. The pseudocode for the routine is

BUILD-MAX-HEAP(A)
1  A.heapsize = A.length
2  for i = ⌊A.length/2⌋ downto 1
3     MAX-HEAPIFY(A,i)
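
The routine above might look as follows in Python. This is a sketch under the assumption of a 0-based list, where the last node with children is at index n // 2 - 1; the max_heapify helper is repeated here only so the example is self-contained.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    left, right, largest = 2 * i + 1, 2 * i + 2, i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


def build_max_heap(a: list) -> None:
    """Rearrange a in place so that it satisfies the max-heap property."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, so start at the last internal node
    # and work back to the root.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)
```

Working from the bottom up matters: when max_heapify is called on node i, both of its subtrees have already been turned into max-heaps.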

Since MAX-HEAPIFY() is called on O(n) nodes and each call takes O(lg n) time, an upper bound for the total run time of BUILD-MAX-HEAP() is

$$T(n) = O(n) \cdot O(\lg n) = O(n \lg n)$$

However, the run time for MAX-HEAPIFY() is really O(h), where h is the height of the node it is called on, and for the majority of the nodes h is much smaller than lg n (i.e. most of the nodes are at the bottom of the tree, with only the root having true worst case behavior). A node at level i (of which there are at most 2^i) will have h = lg n - i. Hence a better upper bound for the running time is given by

$$T(n) = \sum_{i=0}^{\lg n} 2^i \, O(\lg n - i)$$

If we let k = lg n - i ( ⇒ i = lg n - k) then we can rewrite and simplify the summation as

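$$T(n) = \sum_{k=0}^{\lg n} 2^{\lg n - k} \, O(k) = O\left(n \sum_{k=0}^{\lg n} \frac{k}{2^k}\right) = O\left(n \sum_{k=0}^{\infty} \frac{k}{2^k}\right) = O(2n) = O(n)$$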

Hence a better upper bound for BUILD-MAX-HEAP is O(n), i.e. we can build a max (or min) heap from an array in linear time.
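
The linear bound can also be observed empirically. The sketch below is not part of the analysis above; it is an illustrative experiment (with a made-up function name and a 0-based list) that counts the exchanges performed while building a max-heap. Each node can move down at most as many levels as its height, so the count stays below n.

```python
def build_max_heap_counting(a: list) -> int:
    """Build a max-heap in place and return the number of exchanges performed."""
    n = len(a)
    swaps = 0
    for i in range(n // 2 - 1, -1, -1):
        j = i
        # Iterative sift-down of the value that starts at index i.
        while True:
            left, right, largest = 2 * j + 1, 2 * j + 2, j
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == j:
                break
            a[j], a[largest] = a[largest], a[j]
            swaps += 1
            j = largest
    return swaps
```

Running it on an ascending list such as list(range(n)), which forces many exchanges, for a few values of n shows the count growing roughly in proportion to n rather than n lg n.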

Heapsort

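Once we have created a max-heap, to sort the elements we simply repeat the following steps until only one element remains in the heap:

1. exchange the root (the largest remaining value) with the last element of the heap
2. reduce the heap size by one, so that the largest value stays in its final sorted position
3. call MAX-HEAPIFY() on the new root to restore the max-heap property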

The pseudocode for heapsort is

HEAPSORT(A)
1  BUILD-MAX-HEAP(A)
2  for i = A.length downto 2
3     exchange A[1] with A[i]
4     A.heapsize = A.heapsize - 1
5     MAX-HEAPIFY(A,1)
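
Putting the pieces together, heapsort could be sketched in Python as shown below. As in any 0-based translation of the pseudocode, the root is a[0] and the heap size is tracked in a local variable instead of on the array; the sift-down step is written as a loop here, which is a choice made for this example.

```python
def max_heapify(a: list, i: int, heap_size: int) -> None:
    """Sift a[i] down within a[0:heap_size] until the max-heap property holds."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a: list) -> None:
    """Sort the list a in place in ascending order."""
    n = len(a)
    # Build the max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)
    # Repeatedly move the current maximum to the end of the shrinking heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        max_heapify(a, 0, end)  # the heap now occupies a[0:end]
```

Only element exchanges inside the list are used, so no storage beyond a few index variables is needed.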

Thus there is one call to BUILD-MAX-HEAP(), which is O(n), and n-1 calls to MAX-HEAPIFY(), each of which is O(lg n), giving a run time for HEAPSORT() of

$$T(n) = O(n) + (n-1)\,O(\lg n) = O(n \lg n)$$

Since all the operations are performed by simply exchanging array elements, heapsort works in place in O(n lg n) time.

Priority Queues

One nice application of heaps is as a way to implement a priority queue (a data structure where the elements are maintained in order based on a priority value). Priority queues occur frequently in OS task scheduling so that more important tasks get priority for execution on the CPU. The priority queue will require operations to add new elements, remove the highest priority element, or adjust the priority of elements in the queue (updating the ordering accordingly). Thus a priority queue can be created (using BUILD-MAX-HEAP()) and maintained efficiently as a max (or min, depending on the application) heap. The priority queue operations are

Thus a priority queue can be created in linear time and maintained in O(lg n) time per operation using a heap.
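
To make the idea concrete, here is a minimal sketch of a max-priority queue built on a heap. The class and method names (MaxPriorityQueue, insert, extract_max, increase_key, maximum) are hypothetical choices for this example, and the list is 0-based. Adding an element or raising a priority sifts a value up toward the root; removing the maximum sifts a value down, so each operation touches at most one node per level.

```python
class MaxPriorityQueue:
    """Max-priority queue stored as a max-heap in a 0-based list."""

    def __init__(self, items=()):
        self._a = list(items)
        # Linear-time construction: sift down every internal node, bottom-up.
        for i in range(len(self._a) // 2 - 1, -1, -1):
            self._sift_down(i)

    def _sift_down(self, i):
        n = len(self._a)
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and self._a[left] > self._a[largest]:
                largest = left
            if right < n and self._a[right] > self._a[largest]:
                largest = right
            if largest == i:
                return
            self._a[i], self._a[largest] = self._a[largest], self._a[i]
            i = largest

    def _sift_up(self, i):
        # Move a[i] toward the root while it is larger than its parent.
        while i > 0 and self._a[(i - 1) // 2] < self._a[i]:
            p = (i - 1) // 2
            self._a[i], self._a[p] = self._a[p], self._a[i]
            i = p

    def maximum(self):
        """Return the highest-priority element without removing it."""
        return self._a[0]

    def insert(self, key):
        """Add a new element."""
        self._a.append(key)
        self._sift_up(len(self._a) - 1)

    def extract_max(self):
        """Remove and return the highest-priority element."""
        if not self._a:
            raise IndexError("extract_max from an empty priority queue")
        top = self._a[0]
        last = self._a.pop()
        if self._a:
            self._a[0] = last
            self._sift_down(0)
        return top

    def increase_key(self, i, key):
        """Raise the priority of the element at heap index i to key."""
        if key < self._a[i]:
            raise ValueError("new key is smaller than the current key")
        self._a[i] = key
        self._sift_up(i)
```

For everyday use, Python's standard heapq module provides a min-heap over a plain list; a max-priority queue is commonly obtained from it by storing negated priorities.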