A heap is a complete binary tree (usually implemented with an array) that keeps the smallest or largest element at the root for fast access.
- Repeatedly get/remove minimum or maximum element.
- Top-
kproblems (largest/smallest/frequent). - Priority scheduling (tasks by priority/time).
- Streaming data (running median or dynamic extremes).
- Efficient graph algorithms (Dijkstra, Prim) with priority queues.
- Choose heap type:
- Min-heap: parent <= children (root is smallest).
- Max-heap: parent >= children (root is largest).
- Insert by appending to the array, then bubble up (
heapify-up). - Remove top by swapping root with last element, popping last, then bubble down (
heapify-down). - Build heap from list in linear time using bottom-up heapify.
- For top-
k, keep a heap of sizekand evict when needed.
2
/ \
5 8
/ \ /
9 12 14
Every parent <= its children.
Heap array: [2, 5, 8, 9, 12, 14]
index i:
left child = 2*i + 1
right child = 2*i + 2
parent = (i - 1) // 2
Use when: you need frequent minimum extraction. Input: list of numbers. Output: current minimum and updated heap.
import heapq
def min_heap_demo(nums):
heap = nums[:]
heapq.heapify(heap) # TC: O(n), make min-heap in-place
heapq.heappush(heap, 4) # TC: O(log n)
smallest = heapq.heappop(heap) # TC: O(log n)
return smallest, heapUse when: you need max-heap behavior in Python heapq.
Input: list of numbers.
Output: largest element after one push/pop step.
import heapq
def max_heap_demo(nums):
# Python has min-heap only; negate to simulate max-heap.
heap = [-x for x in nums]
heapq.heapify(heap)
heapq.heappush(heap, -10)
largest = -heapq.heappop(heap)
return largestUse when: keep only the largest k seen so far.
Input: list of numbers, integer k.
Output: largest k elements in descending order.
import heapq
def top_k_largest(nums, k):
if k <= 0:
return []
heap = []
for x in nums:
if len(heap) < k:
heapq.heappush(heap, x)
elif x > heap[0]:
# Invariant: heap stores current top-k largest values.
heapq.heapreplace(heap, x)
return sorted(heap, reverse=True)Use when: find kth largest without full sort.
Input: list of numbers, integer k (1-indexed).
Output: kth largest value.
import heapq
def kth_largest(nums, k):
if k <= 0 or k > len(nums):
return None
# Invariant: heap contains k largest seen so far.
heap = nums[:k]
heapq.heapify(heap)
for x in nums[k:]:
if x > heap[0]:
heapq.heapreplace(heap, x)
return heap[0]Use when: merge multiple sorted arrays/lists efficiently. Input: list of sorted lists. Output: one globally sorted list.
import heapq
def merge_k_sorted_lists(lists):
heap = []
result = []
for i, arr in enumerate(lists):
if arr:
# Entry: (value, list_id, idx_in_list)
heapq.heappush(heap, (arr[0], i, 0))
while heap:
val, li, ei = heapq.heappop(heap)
result.append(val)
ni = ei + 1
if ni < len(lists[li]):
heapq.heappush(heap, (lists[li][ni], li, ni))
return result- Peek top:
O(1) - Insert (
push):O(log n) - Remove top (
pop):O(log n) - Build heap (
heapify):O(n) - Top-k with size
kheap overnvalues:O(n log k)
- Why is
heapifyO(n)and notO(n log n)? - Why does a min-heap of size
khelp in top-k largest problems? - Why does heap sort use
O(1)extra space (array version) but Pythonheapqworkflows may use extra memory?
Answers
- Most nodes are near leaves and need little/no sinking, so total work sums to linear.
- The smallest among current top-k stays at root for fast comparison/replacement.
- In-place heap sort rearranges one array; practical
heapqpatterns often keep additional heaps or copies.
- Kth Largest Element in an Array
- Top K Frequent Elements
- Find Median from Data Stream
- Merge k Sorted Lists
- IPO
At first, I expected heaps to return fully sorted output directly, but I learned a heap guarantees order only at the root (global min/max), not between all siblings/subtrees.
- Dijkstra's Algorithm – uses a min-heap as its priority queue.
- Union-Find – another data structure used in graph algorithms.
- Topological Sort – Kahn's algorithm also uses a queue structure.