Python libraries: heapq
TLDR
The Python
heapq standard module implements a priority queue using a heap data structure (see Cormen et al. Introduction to Algorithms - Chapter 6). Priority queues are often used to support more complex algorithms like sorting and compressing. This post explores the heapq module and how to use it.What is a heap
A heap is a data structure that can be used to store values in a tree-like structure maintaining a partial . Although it is convenient to think of a heap as a tree, its implementation uses arrays (lists in Python).
There are two types of heap: min-heap and max-heap, we cover the former, which is the one implemented in the
heapq Python module. A min-heap (from now on, simply heap) is a binary tree that satisfies the heap propertyFor each non-root node with parent
In words, children have values no smaller than the value of their parent (for a max-heap the relation between values is instead of ).
One use of the heap is in sorting, it is in fact the basis of an efficient sorting algorithm called heapsort and is based on the following fact.
Fact: The minimum value is at the root
If we apply the above relation all the way up to the root we can easily see that the minimum value can be found on the root.1
Heap structure VS Heap memory
It is important not to confuse the heap memory, where dynamic objects are stored with the heap data structure which is a specific type of data structure. The two things are completely different things and only share the name!
The heapq module
Python comes with a built-in implementation of a min-heap, which is in the
heapq module. The module offers several functions to work on heaps stored in ordinary Python lists. To begin with heaps, you can simply define a list with the content and heapify it with heapq.heapifyq = [7, 17, 3, 20, 2, 27, 40]
heapq.heapify(q)
print(q) # [2, 7, 3, 20, 17, 27, 40]
Once we have a heap, we can pop (extract) elements out of it
m = heapq.heappop(q)
print(m) # 2
print(q) # [3, 7, 27, 20, 17, 40]
or we can push (add) elements to it
heapq.heappush(q, 10)
print(q) # [3, 7, 10, 20, 17, 40, 27]
There is also a function that does the two operations in one: pop first and push second
k = heapq.heapreplace(q, 1)
print(k) # 3
print(q) # [1, 7, 10, 20, 17, 40, 27]
or push first and pop second
l = heapq.heappushpop(q, 0)
print(l) # 0
print(q) # [1, 7, 10, 20, 17, 40, 27]
To summarize the most commonly used functions in
heapqheapify(q): makesqa heap in-place;heappush(q,e): addseto the heapqheappop(q): extracts the smallest value from the heapqheapreplace(q,e): equivalent topopfollowed bypush(e)heappushpop(q,e): equivalent topush(e)followed bypop
All above functions maintain the heap property.
Max heap
The
heapq historically supported only min heap since its introduction in Python 2.3.Starting with Python 3.14, the library also supports max heap within the same heapq module with functions having the suffix _max, like heapify_max, heappush_max, ...Three ways to use heaps
So, what is a heap useful for? The key property is the fact that we can efficiently obtain the minimum (maximum for max-heap) stored value. One can see the heap as a kind-of sorted structure in that it always returns the current minimum, although the structure itself is not sorted. This idea leads to an immediate sorting algorithm.
Heapsort
Using the heap is relatively easy to implement a sorting algorithm that takes the name of heapsort. The idea is to create a heap with all values and continuously extract the minimum as long as the heap is not empty.
def heapsort(l):
s = []
heapq.heapify(l)
while(len(l) > 0):
s.append(heapq.heappop(l))
return s
print(heapsort([7, 17, 3, 20, 2, 27, 40]))
# [2, 3, 7, 17, 20, 27, 40]
Notice that this function "destroys" the input list
l and puts elements in s sorted by increasing key.In-place heapsort
The algorithm presented above works in-place (it doesn't require additional memory) as long as Python lists shrinks as we remove elements from them. In this case, the heap
q becomes smaller by the same amount the list a becomes larger and no additional space is required.It is possible to devise a strategy that implements heapsort in the same list (array). One way is to use a max-heap so that e push returns the maximum of the values stored. Once extracted these values are stored in the back of the array where the positions are no longer part of the heap.
Priority queue
A queue is a way to store objects so that the first that is added is the first that is eliminated (*First-In First-Out, FIFO). A priority queue is a way of storing objects so that each one has a priority number associated to it. The object to exit the queue next is the one that has the highest priority.
To implement a priority queue in Python using the
heapq module, we leverage the fact the Python tuples have an implicit ordering. For example the tuple (1,"Hello"), comes before the tuple (2, "Hello"), since 1 comes before 2. We use a tuple with the priority as first element and the object as second element. In our implementation, smaller numbers mean higher priority, because heapq is a min-heap.import heapq
people = [
(3, "Alice"),
(1, "Bob"),
(5, "Carol"),
(4, "Dave")
]
queue = []
for person in people:
heapq.heappush(queue, person)
while(len(queue) > 0):
extracted = heapq.heappop(queue)
print(f"Next: {extracted[1]} (priority={extracted[0]})")
Path finding algorithm (Dijkstra)
A priority queue is a useful in many applications, one of these applications is the Dijkstra algorithm for path finding. Path finding algorithms are ubiquitous, you find it in navigation apps and video games, to mention two examples.
I have written a post on Dijkstra algorithm where you can see an example of usage of the
heapq module.Conclusion
The
heapq module is one of many utility modules present in the Python standard library. We have seen a few example where such a module can become in handy like sorting and prioritizing. This concludes the first post on Python standard library.🐻 📆 Bear of the day: Blanky

Footnotes
-
The proof is quite simple for the case where all values in the tree are distinct. Let be the node with the minimum value. Assume that is not the root and let be its parent. By the property above, the value of must be smaller than the value of . However, this is not possible, since we chose to have the smallest value. Hence, we reached a contradiction and must be the root. The result still holds with duplicate values, but the minimum may not be unique. ↩