Understanding Heapsort

Heapsort follows a very simple idea:

Push all items in the list on a heap
Pop all items from the heap. The heap guarantees that the items are sorted, because it will always pop the minimum element in the heap

Subsequently, a heapsort implementation typically is trivial. The following code uses the MinHeap class developed in the heaps article:

def heapsort(A):
    h = MinHeap()

    # push-em-all
    for item in A:
        h.Insert(item)

    # pop-em-all
    for i in range(len(A)):
        A[i] = h.ExtractMin()

The same code, using the heapq module from the Python Standard Library:

def heapsort(A):
    h = []

    # push-em-all
    for value in A:
        heappush(h, value)

    # pop-em-all
    for i in range(len(A)):
        A[i] = heappop(h)

The real fun starts if you compare this to the quicksort implementations developed in the quicksort article. It turns out that heapsort is dramatically faster than quicksort, in Python. Probably because of a combination of simplicity of algorithm and lack of recursion (and Python doesn't perform particularily well on recursion).

Method	Run 1	Run 2	Run 3	Run 4	Run 5
Quicksort Hetland	6.10	6.22	6.39	6.47	6.68
Quicksort Pyalda	6.59	6.91	6.86	6.87	6.98
Heapsort with heapq	3.26	3.16	3.14	3.17	3.08

At the end of mergesort we pointed out a situation where both quicksort implementations break badly: if asked to sort an already sorted array. Well, here are the results of comparing 1 million sorted float numbers in both heapsort and mergesort (quicksort won't sort either, because it'll exceed its maximum recursion depth):

Method	Run 1	Run 2	Run 3	Run 4	Run 5
Heapsort with heapq	2.19	2.26	2.64	2.19	2.19
Mergesort with reverse()	5.36	5.43	5.38	5.44	5.36

I, for one, welcome our new sorting overlord!