Due: Monday, Feb 20th in class Late assignments will be penalized 20\% per day.

Book Questions from Introduction to Algorithms - 3rd ed.

6.1-1, 6.1-4, 6.4-3

Heap sort and quick sort implementations

Hints:

6.1-1 - Recall that a heap is a nearly complete binary tree. Furthermore, the height of a heap is the maximal number of edges from the root to a leaf node.

6.1-4 - Consider what the heap property enforces with respect to the smallest node, i.e. where in the tree must it be.

6.4-3 - Explain the difference in run-times for best and worst case heapsort both practically and asymptotically.

Implementations

A skeleton project is provided in CS360_Sorter_Heap_Quick.zip. The zip file contains both a Visual Studio project and a Linux/OSX makefile to compile the code. Sorter.cpp contains both utility functions as well as empty sort function stubs - you should not need to modify main() or any of the utility functions.

Your Task

Implement the algorithms as given in the pseudocode below for heapsort and quicksort. Insert counter increment statements (note: a count global variable is provided), into each sorting function for each executable line of pseudocode (e.g. count all three lines required to implement a swap as a single operation). Use this counter to empirically measure the runtime of each sort. Only increment the counter for statements within the sorting functions, i.e. do not include any initialization overhead incurred in main() or the utility functions. Note that count is reset prior to each sort call but the results are stored in a 2D array counter which is used to display a table of all results once all the sorts and runs have completed.

Generate runs for 13 input sizes by changing the #define MAX_RUNS symbolic constant. This will generate data for increasing powers of 2 from 24 = 16 to 216 = 65536.

The #define NUM_AVG sets the number of data sets of each size to generate in order to compute an average runtime for each size. This value should be set to a reasonable number, e.g. 10, to give a good approximation of the average runtime of each sort. Note that the larger the value that is chosen, the longer the program will take to run.

You should generate tables for two different input ranges

Once the data for all input sizes and element ranges have been generated, make meaningful plots (e.g. using Excel) of the data showing important characteristics. In particular:

Hint: To plot cn lg n, consider making another column in the spreadsheet that computes the values for each empirical input size n and then plot that data as connected points without showing the individual data points.

Heap Sort

HEAPSORT(A)
1  BUILD-MAX-HEAP(A)
2  for i = A.length downto 2
3     exchange A[1] with A[i]
4     A.heapsize = A.heapsize - 1
5     MAX-HEAPIFY(A,1)

BUILD-MAX-HEAP(A)
1  A.heapsize = A.length
2  for i = A.length/2 downto 1
3     MAX-HEAPIFY(A,i)

MAX-HEAPIFY(A,i)
1  l = LEFT(i)
2  r = RIGHT(i)
3  if l <= A.heapsize and A[l] > A[i]
4     largest = l
5  else
6     largest = i
7  if r <= A.heapsize and A[r] > A[largest]
8     largest = r
9  if largest != i
10    exchange A[i] with A[largest]
11    MAX-HEAPIFY(A,largest)

Quick Sort

QUICKSORT(A,p,r)
1  if p < r
2     q = PARTITION(A,p,r)
3     QUICKSORT(A,p,q-1)
4     QUICKSORT(A,q+1,r)

PARTITION(A,p,r)
1  x = A[r]
2  i = p - 1
3  for j = p to r-1
4     if A[j] <= x
5        i = i + 1
6        exchange A[i] with A[j]
7  exchange A[i+1] with A[r]
8  return i+1

HINTS:

Function call statements DO NOT increment the counter since their runtime is evaluated by the execution of the function.

Return statement DO NOT increment the counter.

Loop statements, i.e. for and while, will execute one more time than the statements in the loop body. Hence a counter can be added to a loop as follows

for (...) {
   count++;
   // Body of loop
}
count++;

while (...) {
   count++;
   // Body of loop
}
count++;

For simple logic constructs, e.g. if, if/else, a count update can be added after the structure since only one branch will execute depending on the result of the condition

if (...) {
   // Body of if
}
count++;

if (...) {
   // Body of if branch
} else {
   // Body of else branch
}
count++;

For chained logical structures, i.e. if/else if/ else, there will need to be counters in each branch for the total number of conditions evaluated since they execute sequentially

if (...) {
   count++;
   // Body of first if branch
} else if (...) {
   count += 2;
   // Body of second if branch
} else if (...) {
   count += 3;
   // Body of third if branch
} else {
   count += however many if conditions there are
   // Body of else branch
}