Unit 11: Sorting
1 Sorting
The Sorting problem is perhaps the most well-studied (and most widely taught) algorithmic problem in computer science. There are good reasons for this: the problem is important and comes up in many practical situations; there are non-trivial fast algorithms that make a big difference even on relatively small inputs; and the problem is so well understood that we can answer almost all the hard questions about it.
In summary: you're going to learn about sorting in a computing major. Formally, we will define the problem as taking an input array \(A\) containing \(n\) objects that are comparable, and producing an array with the same objects, in increasing order. (By "comparable" we just mean that any one object is always less than, equal to, or greater than any other object.)
Beginning of Class 34
You've already seen two sorting algorithms early in the semester, Selection Sort and Insertion Sort. Here they are again:
def selectionSort(A):
    for i in range(len(A) - 1):
        # find smallest element from A[i] until the end
        min_index = i
        for j in range(i+1, len(A)):
            if A[j] < A[min_index]:
                min_index = j
        # swap A[i] and A[min_index]
        temp = A[i]
        A[i] = A[min_index]
        A[min_index] = temp
def insertionSort(A):
    for i in range(1, len(A)):
        temp = A[i]
        # go backwards from index i to find where A[i] belongs
        j = i-1
        while j >= 0 and A[j] > temp:
            A[j+1] = A[j]
            j -= 1
        A[j+1] = temp
How do they work? Well, both algorithms keep a growing, sorted sub-array in A[:i], with a growing i (in other words, the left side of the array is sorted, the right side is not, and step by step we make the sorted part one bigger and the unsorted part one smaller).
To enlarge this portion, step by step, both algorithms have a "Picking" step and a "Placing" step. In Selection Sort, the smallest element in the unsorted portion A[i:] is first picked, and it is then swapped (placed) into location i, thereby enlarging the sorted portion and shrinking the unsorted portion. In Insertion Sort, the first element in the unsorted portion is picked; then the algorithm finds where in the sorted portion it should be placed, moving the other sorted elements over to make room for it in that spot.
What's the runtime? Well, in Selection Sort, "Picking" is a \(O(n)\) operation, and "Placing" is \(O(1)\) (see why?). After doing this \(n\) times, the algorithm has done \(n(O(n)+O(1)) = O(n^2)\) work.
In contrast, in Insertion Sort, "Picking" is \(O(1)\), and "Placing" is \(O(n)\). Doing these two steps \(n\) times, you of course still get \(O(n^2)\) time.
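As a quick sanity check, note that both routines modify their argument in place (they return nothing):

data = [4, 1, 3, 2]
selectionSort(data)
print(data)   # prints [1, 2, 3, 4]

data = [4, 1, 3, 2]
insertionSort(data)
print(data)   # prints [1, 2, 3, 4]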
Beginning of Class 35
2 HeapSort
OK, fine. We've also learned about Heaps, though, which seem like they might be useful. And, it turns out, they are!
On the surface, heapsort is very easy: load everything into your heap, then pull it out, one by one. They'll come back out in order. Loading everything in? If we do it bottom-up, it's O(n). Taking everything out? O(n log(n)). So, total runtime O(n log(n)). But, it seemingly requires two arrays, one to keep track of the heap, and one to store the sorted numbers.
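To illustrate that two-array idea, here is a sketch using Python's built-in heapq module (a min-heap); the name heapSortSimple is mine, and this is not how we will ultimately do it, but it shows the load-then-unload structure:

import heapq

def heapSortSimple(A):
    # Load everything into a heap: bottom-up heapify is O(n).
    heap = list(A)
    heapq.heapify(heap)
    # Pull everything back out, smallest first: n pops, O(n log n) total.
    return [heapq.heappop(heap) for _ in range(len(A))]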
Consider the following sequence of numbers: 0 2 6 5 3 1 4. We'd like to put these in increasing order. Let's start by heapifying in place. I'll put a | in the array; everything to the right of it has been heapified, everything to the left has not.
We start by putting the last (n+1)/2 things onto the bottom level of the heap; this doesn't require any work: 0 2 6|5 3 1 4. We then work right to left, bubbling down as necessary. Before, we talked about this as heapifying the next (n+1)/4, (n+1)/8, etc., but functionally the same thing is happening. The old way was easier to analyze; the new way is easier to implement.
0 2|6 5 3 1 4 (6 doesn't need to bubble).
0|5 6 2 3 1 4 (2 has bubbled down).
6 5 4 2 3 1 0 (0 has bubbled down, twice).
Now we can do repeated removeMax operations. We know two things: (1) the 6 will need to leave the heap, and (2) the array index holding the 0 will need to be vacated. So why don't we just put the 6 in that newly-empty spot? From here on, the | will mean everything to the right is sorted, and everything to the left is still in the heap.
6 5 4 2 3 1 0
0 5 4 2 3 1|6 (swapped the last thing with the top thing)
5 0 4 2 3 1|6 (0 has bubbled down)
5 3 4 2 0 1|6 (0 has bubbled down again)
Do another removeMax, again swapping the top thing for the last thing:
5 3 4 2 0 1|6
1 3 4 2 0|5 6 (swap the 1 and the 5)
4 3 1 2 0|5 6 (1 has bubbled down)
Again:
4 3 1 2 0|5 6
0 3 1 2|4 5 6 (swap the 4 and the 0)
3 0 1 2|4 5 6 (0 has bubbled down)
3 2 1 0|4 5 6 (0 has bubbled down again)
Again:
3 2 1 0|4 5 6
0 2 1|3 4 5 6 (swap the 0 and the 3)
2 0 1|3 4 5 6 (0 has bubbled down)
Again:
2 0 1|3 4 5 6
1 0|2 3 4 5 6 (swap the 1 and the 2)
Again:
1 0|2 3 4 5 6
0 1 2 3 4 5 6 (swap, and we're sorted)
\(O(n \log(n))\), in place (meaning, using a single array)! That's a lot better than \(O(n^2)\).
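Putting the pieces together, here is a minimal sketch of in-place HeapSort; the names bubbleDown and heapSort are mine, assuming a max-heap stored in the usual array layout (children of index i at 2i+1 and 2i+2):

def bubbleDown(A, i, size):
    # Sift A[i] down within the heap A[:size] until the
    # max-heap property holds again.
    while True:
        left, right = 2*i + 1, 2*i + 2
        largest = i
        if left < size and A[left] > A[largest]:
            largest = left
        if right < size and A[right] > A[largest]:
            largest = right
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def heapSort(A):
    n = len(A)
    # Heapify in place, right to left; the leaves need no work. O(n).
    for i in range(n//2 - 1, -1, -1):
        bubbleDown(A, i, n)
    # Repeatedly swap the max to the end and shrink the heap. O(n log n).
    for end in range(n - 1, 0, -1):
        A[0], A[end] = A[end], A[0]
        bubbleDown(A, 0, end)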
Beginning of Class 36
3 MergeSort
Our fourth sorting algorithm is known as MergeSort, which is a great recursive algorithm:
def mergeSort(A):
    if len(A) <= 1:
        return A
    else:
        B = A[:len(A)//2]    # the first half of A
        C = A[len(A)//2:]    # the second half of A
        B = mergeSort(B)     # sort the first half
        C = mergeSort(C)     # sort the second half
        A[:] = merge(B, C)   # see below
        return A
So our base case is when the input array has only one element in it. In the recursive case, we split the array in two and sort those two halves. After those two recursive calls, we have two sorted arrays, B and C. How can we turn those into a single sorted array? Well, the smallest element overall must be the smaller of the two front elements of B and C. Once you've taken that one, the next smallest is again one of two front elements. And so on. That's exactly what happens in merge!
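Here is one possible implementation of merge, consistent with the description above (a sketch, assuming it builds and returns a new list):

def merge(B, C):
    # Repeatedly take the smaller of the two front elements.
    result = []
    i = j = 0
    while i < len(B) and j < len(C):
        if B[i] <= C[j]:
            result.append(B[i])
            i += 1
        else:
            result.append(C[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    result.extend(B[i:])
    result.extend(C[j:])
    return result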
This approach is a very common pattern called divide and conquer. This is a general paradigm for designing algorithms that works in three basic steps:
- Break the problem into similar subproblems
- Solve each of the subproblems recursively
- Combine the results to solve the original problem
In MergeSort, the first step is trivial: we break into subproblems just by dividing the array in half. The second step is the two recursive calls, and the third step is accomplished by the Merge subroutine, which combines the two sorted sublists back into one.
So how much work is done here? Well, for the \(n\) base cases, we do \(O(1)\) work each. In the next level up (there are \(n/2\) merges), we have to merge two total elements each: \((n/2)\cdot 2 = n\) work. In the next level up (\(n/4\) merges), we merge four total elements each: \((n/4)\cdot 4 = n\) work. So the summation is \(\sum_{i=0}^{\log(n)}2^i\frac{n}{2^i} = n(\log(n)+1)\), or \(O(n\log(n))\).
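Equivalently, if we let \(T(n)\) denote the running time, the divide-and-conquer structure gives the recurrence \(T(n) = 2\,T(n/2) + O(n)\) with \(T(1) = O(1)\): each call does \(O(n)\) merging work plus two half-size subproblems, and unrolling it level by level produces exactly the summation above.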
4 Space
Note that MergeSort is NOT an in-place algorithm. Unlike our other three, MergeSort requires additional arrays beyond the one that was input. Namely, beyond the array A, it must also store the arrays B and C, whose sizes total \(n\); even counting every level of the recursion still active on the stack, the extra storage sums to at most about \(2n\). Therefore, the additional space requirement of MergeSort is \(O(n)\).
Does that matter? Maybe! It depends heavily on your system and what it can provide in terms of memory. For some small systems like phones, memory can be in short supply, making MergeSort less attractive.