Unit 11: Sorting
1 Sorting
The Sorting problem is perhaps the most well-studied (and most widely taught) algorithmic problem in computer science. There are good reasons for this: the problem is important and comes up in many practical situations; there are non-trivial fast algorithms that make a big difference even on relatively small inputs; and the problem is so well understood that we can answer almost all the hard questions about it.
In summary: you're going to learn about sorting in a computing major. Formally, we will define the problem as taking an input array \(A\) containing \(n\) objects that are comparable, and producing an array with the same objects, in increasing order. (By "comparable" we just mean that any one object is always less than, equal to, or greater than any other object.)
Beginning of Class 34
You've already seen two sorting algorithms early in the semester, Selection Sort and Insertion Sort. Here they are again:
def selectionSort(A):
    for i in range(len(A) - 1):
        # find smallest element from A[i] until the end
        min_index = i
        for j in range(i+1, len(A)):
            if A[j] < A[min_index]:
                min_index = j
        # swap A[i] and A[min_index]
        temp = A[i]
        A[i] = A[min_index]
        A[min_index] = temp
def insertionSort(A):
    for i in range(1, len(A)):
        temp = A[i]
        # go backwards from index i to find where A[i] belongs
        j = i-1
        while j >= 0 and A[j] > temp:
            A[j+1] = A[j]
            j -= 1
        A[j+1] = temp
How do they work? Well, both algorithms keep a growing, sorted sub-array in A[:i], with a growing i (in other words, the left side of the array is sorted, the right side is not, and step by step we make the sorted part one bigger and the unsorted part one smaller).
To enlarge this portion, step by step, both algorithms have a "Picking" step and a "Placing" step. In Selection Sort, the smallest element in the unsorted portion A[i:] is first picked, and it is then swapped (placed) into location i, thereby enlarging the sorted portion and shrinking the unsorted portion. In Insertion Sort, the first element in the unsorted portion is picked; then the algorithm finds where in the sorted portion it should be placed, moving the other sorted elements over to make room for it in that spot.
What's the runtime? Well, in Selection Sort, "Picking" is a \(O(n)\) operation, and "Placing" is \(O(1)\) (see why?). After doing this \(n\) times, the algorithm has done \(n(O(n)+O(1)) = O(n^2)\) work.
In contrast, in Insertion Sort, "Picking" is \(O(1)\), and "Placing" is \(O(n)\). Doing these two steps \(n\) times, you of course still get \(O(n^2)\) time.
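As a quick sanity check, note that both routines modify their argument in place (they return nothing):

data = [4, 1, 3, 2]
selectionSort(data)
print(data)   # prints [1, 2, 3, 4]

data = [4, 1, 3, 2]
insertionSort(data)
print(data)   # prints [1, 2, 3, 4]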
Beginning of Class 35
2 HeapSort
OK, fine. We've also learned about Heaps, though, which seem like they might be useful. And, it turns out, they are!
On the surface, heapsort is very easy: load everything into your heap, then pull it out, one by one. They'll come back out in order. Loading everything in? If we do it bottom-up, it's O(n). Taking everything out? O(n log(n)). So, total runtime O(n log(n)). But, it seemingly requires two arrays, one to keep track of the heap, and one to store the sorted numbers.
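To illustrate that two-array idea, here is a sketch using Python's built-in heapq module (a min-heap); the name heapSortSimple is mine, and this is not how we will ultimately do it, but it shows the load-then-unload structure:

import heapq

def heapSortSimple(A):
    # Load everything into a heap: bottom-up heapify is O(n).
    heap = list(A)
    heapq.heapify(heap)
    # Pull everything back out, smallest first: n pops, O(n log n) total.
    return [heapq.heappop(heap) for _ in range(len(A))]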
Consider the following sequence of numbers: 0 2 6 5 3 1 4. We'd like to put these in increasing order. Let's start by heapifying in place. I'll put a | in the array; everything to the right of it has been heapified, everything to the left has not.
We start by putting the last (n+1)/2 things onto the bottom level of the heap; this doesn't require any work: 0 2 6|5 3 1 4. We then work right to left, bubbling down as necessary. Before, we talked about this as heapifying the next (n+1)/4, (n+1)/8, etc., but functionally the same thing is happening. The old way was easier to analyze; the new way is easier to implement.
0 2|6 5 3 1 4 (6 doesn't need to bubble).
0|5 6 2 3 1 4 (2 has bubbled down).
6 5 4 2 3 1 0 (0 has bubbled down, twice).
Now we can do repeated removeMax operations. We know two things: (1) the 6 will need to leave the heap, and (2) the array index holding the 0 will need to be vacated. So why don't we just put the 6 in that newly-empty spot? From here on, the | will mean everything to the right is sorted, and everything to the left is still in the heap.
6 5 4 2 3 1 0
0 5 4 2 3 1|6 (swapped the last thing with the top thing)
5 0 4 2 3 1|6 (0 has bubbled down)
5 3 4 2 0 1|6 (0 has bubbled down again)
Do another removeMax, again swapping the top thing for the last thing:
5 3 4 2 0 1|6
1 3 4 2 0|5 6 (swap the 1 and the 5)
4 3 1 2 0|5 6 (1 has bubbled down)
Again:
4 3 1 2 0|5 6
0 3 1 2|4 5 6 (swap the 4 and the 0)
3 0 1 2|4 5 6 (0 has bubbled down)
3 2 1 0|4 5 6 (0 has bubbled down again)
Again:
3 2 1 0|4 5 6
0 2 1|3 4 5 6 (swap the 0 and the 3)
2 0 1|3 4 5 6 (0 has bubbled down)
Again:
2 0 1|3 4 5 6
1 0|2 3 4 5 6 (swap the 1 and the 2)
Again:
1 0|2 3 4 5 6
0 1 2 3 4 5 6 (swap, and we're sorted)
\(O(n \log(n))\), in place (meaning, using a single array)! That's a lot better than \(O(n^2)\).
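Putting the pieces together, here is a minimal sketch of in-place HeapSort; the names bubbleDown and heapSort are mine, assuming a max-heap stored in the usual array layout (children of index i at 2i+1 and 2i+2):

def bubbleDown(A, i, size):
    # Sift A[i] down within the heap A[:size] until the
    # max-heap property holds again.
    while True:
        left, right = 2*i + 1, 2*i + 2
        largest = i
        if left < size and A[left] > A[largest]:
            largest = left
        if right < size and A[right] > A[largest]:
            largest = right
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest

def heapSort(A):
    n = len(A)
    # Heapify in place, right to left; the leaves need no work. O(n).
    for i in range(n//2 - 1, -1, -1):
        bubbleDown(A, i, n)
    # Repeatedly swap the max to the end and shrink the heap. O(n log n).
    for end in range(n - 1, 0, -1):
        A[0], A[end] = A[end], A[0]
        bubbleDown(A, 0, end)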
Beginning of Class 36
3 MergeSort
Our fourth sorting algorithm is known as MergeSort, which is a great recursive algorithm:
def mergeSort(A):
    if len(A) <= 1:
        return A
    else:
        B = A[:len(A)//2]    # the first half of A
        C = A[len(A)//2:]    # the second half of A
        B = mergeSort(B)     # sort the first half
        C = mergeSort(C)     # sort the second half
        A[:] = merge(B, C)   # see below
        return A
So our base case is when the input array has only one element in it. In the recursive case, we split the array in two and sort those two halves. After those two recursive calls, we have two sorted arrays, B and C. How can we turn those into a single sorted array? Well, the smallest element overall must be the smaller of the two front elements of B and C. Once you've taken that one, the next smallest is again one of two front elements. And so on. That's exactly what happens in merge!
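Here is one possible implementation of merge, consistent with the description above (a sketch, assuming it builds and returns a new list):

def merge(B, C):
    # Repeatedly take the smaller of the two front elements.
    result = []
    i = j = 0
    while i < len(B) and j < len(C):
        if B[i] <= C[j]:
            result.append(B[i])
            i += 1
        else:
            result.append(C[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    result.extend(B[i:])
    result.extend(C[j:])
    return result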
This approach is a very common pattern called divide and conquer. This is a general paradigm for designing algorithms that works in three basic steps:
- Break the problem into similar subproblems
- Solve each of the subproblems recursively
- Combine the results to solve the original problem
In MergeSort, the first step is trivial: we break into subproblems just by dividing the array in half. The second step is the two recursive calls, and the third step is accomplished by the Merge subroutine, which combines the two sorted sublists back into one.
So how much work is done here? Well, for the \(n\) base cases, we do \(O(1)\) work each. In the next level up (there are \(n/2\) merges), we have to merge two total elements each: \((n/2)\cdot 2 = n\) work. In the next level up (\(n/4\) merges), we merge four total elements each: \((n/4)\cdot 4 = n\) work. So the summation is \(\sum_{i=0}^{\log(n)}2^i\frac{n}{2^i} = n(\log(n)+1)\), or \(O(n\log(n))\).
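Equivalently, if we let \(T(n)\) denote the running time, the divide-and-conquer structure gives the recurrence \(T(n) = 2\,T(n/2) + O(n)\) with \(T(1) = O(1)\): each call does \(O(n)\) merging work plus two half-size subproblems, and unrolling it level by level produces exactly the summation above.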
4 Space
Note that MergeSort is NOT an in-place algorithm. Unlike our other three, MergeSort requires additional arrays beyond the one that was input. Namely, beyond the array A, it must also store the arrays B and C, whose sizes total \(n\); even counting every level of the recursion still active on the stack, the extra storage sums to at most about \(2n\). Therefore, the additional space requirement of MergeSort is \(O(n)\).
Does that matter? Maybe! It depends heavily on your system and what it can provide in terms of memory. For some small systems like phones, memory can be in short supply, making MergeSort less attractive.