Understanding Heapsort
Heapsort follows a very simple idea:
- Push all items in the list on a heap
- Pop all items from the heap. The heap guarantees that the items are sorted, because it will always pop the minimum element in the heap
Subsequently, a heapsort implementation typically is trivial. The following code
uses the MinHeap
class developed in the heaps article:
def heapsort(A):
h = MinHeap()
# push-em-all
for item in A:
h.Insert(item)
# pop-em-all
for i in range(len(A)):
A[i] = h.ExtractMin()
The same code, using the heapq module from the Python Standard Library:
def heapsort(A):
h = []
# push-em-all
for value in A:
heappush(h, value)
# pop-em-all
for i in range(len(A)):
A[i] = heappop(h)
The real fun starts if you compare this to the quicksort implementations developed in the quicksort article. It turns out that heapsort is dramatically faster than quicksort, in Python. Probably because of a combination of simplicity of algorithm and lack of recursion (and Python doesn't perform particularily well on recursion).
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
---|---|---|---|---|---|
Quicksort Hetland | 6.10 | 6.22 | 6.39 | 6.47 | 6.68 |
Quicksort Pyalda | 6.59 | 6.91 | 6.86 | 6.87 | 6.98 |
Heapsort with heapq | 3.26 | 3.16 | 3.14 | 3.17 | 3.08 |
At the end of mergesort we pointed out a situation where both quicksort implementations break badly: if asked to sort an already sorted array. Well, here are the results of comparing 1 million sorted float numbers in both heapsort and mergesort (quicksort won't sort either, because it'll exceed its maximum recursion depth):
Method | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
---|---|---|---|---|---|
Heapsort with heapq | 2.19 | 2.26 | 2.64 | 2.19 | 2.19 |
Mergesort with reverse() | 5.36 | 5.43 | 5.38 | 5.44 | 5.36 |
I, for one, welcome our new sorting overlord!