Sorting series post 6 focuses on heap sort: in-place O(n log n), unstable, slightly higher constants but worst-case guaranteed. It is also the foundation of streaming top-k.
Target Readers
- Engineers who need in-place sorting with worst-case O(n log n) guarantees.
- Learners connecting priority queues, top-k, and heap sort.
- People comparing quick/merge/heap selection.
Background and Motivation
- Heap sort builds a heap and repeatedly extracts the max; best/avg/worst are all O(n log n).
- Pros: in-place, worst-case safe. Cons: unstable, cache-unfriendly, higher constants than quicksort.
- Shares structure with priority queues and streaming top-k, giving strong engineering value.
A - Algorithm
Steps
- Build a max-heap (bottom-up O(n)).
- Repeatedly swap root with tail, shrink heap, sift down to restore heap (O(log n) each).
Basic example
Array [4, 10, 3, 5, 1]:
- After heapify:
[10, 5, 3, 4, 1]. - Swap root and tail ->
[1,5,3,4,10], sift down ->[5,4,3,1,10]. - Repeat until sorted.
C - Concepts
| Concept | Description |
|---|---|
| Heap property | Parent >= children (max-heap), children of i are 2i+1 and 2i+2. |
| Heapify | Sift down from last non-leaf to root, O(n). |
| Sift down | Swap node downward to restore heap, O(log n). |
| Stability | Unstable; swaps can reorder equals. |
| Space | In-place O(1) extra space. |
Complexity
- Time: heapify O(n) + n times sift O(log n) => O(n log n); worst-case same.
- Space: O(1); recursion uses O(log n) stack, iterative uses O(1).
E - Engineering
Scenario 1: In-place backend sorting (C)
Background: need in-place and worst-case guarantees.
void heapify(int *a, int n, int i){
while(1){
int l=2*i+1, r=2*i+2, largest=i;
if(l<n && a[l]>a[largest]) largest=l;
if(r<n && a[r]>a[largest]) largest=r;
if(largest==i) break;
int t=a[i]; a[i]=a[largest]; a[largest]=t;
i=largest;
}
}
void heap_sort(int *a, int n){
for(int i=n/2-1;i>=0;i--) heapify(a,n,i);
for(int end=n-1; end>0; end--){
int t=a[0]; a[0]=a[end]; a[end]=t;
heapify(a,end,0);
}
}
Scenario 2: Streaming top-k (Python, min-heap)
Background: maintain top k values in a stream.
import heapq
def topk(stream, k):
h=[]
for x in stream:
if len(h)<k:
heapq.heappush(h, x)
else:
if x>h[0]:
heapq.heapreplace(h, x)
return sorted(h, reverse=True)
print(topk([5,1,9,3,12,4], 3)) # [12,9,5]
Scenario 3: Go priority queue + sorting
Background: use container/heap to build a heap-based sort.
package main
import (
"container/heap"
"fmt"
)
type IntHeap []int
func (h IntHeap) Len() int { return len(h) }
func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h IntHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *IntHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *IntHeap) Pop() interface{} {
old := *h; n := len(old); x := old[n-1]; *h = old[:n-1]; return x
}
func heapSort(a []int) []int {
h := IntHeap(a)
heap.Init(&h)
res := make([]int, 0, len(a))
for h.Len()>0 { res = append(res, heap.Pop(&h).(int)) }
return res // ascending
}
func main(){ fmt.Println(heapSort([]int{4,10,3,5,1})) }
Scenario 4: Rust in-place heap sort
pub fn heap_sort(a: &mut [i32]) {
let n = a.len();
// build max-heap
for i in (0..n/2).rev() { sift_down(a, i, n); }
for end in (1..n).rev() {
a.swap(0, end);
sift_down(a, 0, end);
}
}
fn sift_down(a: &mut [i32], mut i: usize, n: usize) {
loop {
let l = 2*i+1; let r = l+1;
let mut largest = i;
if l < n && a[l] > a[largest] { largest = l; }
if r < n && a[r] > a[largest] { largest = r; }
if largest == i { break; }
a.swap(i, largest);
i = largest;
}
}
Scenario 5: JavaScript concise version
function heapify(a, n, i){
while(true){
let l=2*i+1, r=2*i+2, largest=i;
if(l<n && a[l]>a[largest]) largest=l;
if(r<n && a[r]>a[largest]) largest=r;
if(largest===i) break;
[a[i],a[largest]]=[a[largest],a[i]]; i=largest;
}
}
function heapSort(a){
const n=a.length;
for(let i=Math.floor(n/2)-1;i>=0;i--) heapify(a,n,i);
for(let end=n-1;end>0;end--){
[a[0],a[end]]=[a[end],a[0]];
heapify(a,end,0);
}
return a;
}
console.log(heapSort([4,10,3,5,1]));
R - Reflection
- Complexity: time O(n log n) in worst/avg/best; space O(1).
- Alternatives:
- Need stability -> merge/TimSort.
- Better constants/cache -> quicksort often faster.
- Known range -> counting/bucket/radix.
- Why it works:
- Worst-case guarantees, suitable when degradation is unacceptable.
- In-place, good for memory-constrained environments.
- Shares structure with priority queues and streaming top-k.
S - Summary
- Heap sort: in-place, unstable, worst-case O(n log n), higher constants than quicksort.
- Engineering: heaps are common for top-k/streaming; full heap sort is less visible in libraries but available (C++
make_heap/sort_heap). - If you need stability or near-sorted optimization, use merge/TimSort; for low constants, use quick/Introsort; heap sort shines when you need in-place and worst-case guarantees.
- Heapify bottom-up is O(n); iterative sift-down avoids recursion.
Practice Guide / Steps
- Implement heapify (bottom-up) and sift-down (iterative); verify child index math.
- If only top-k is required, maintain a min-heap of size k, space O(k).
- Benchmark random, reversed, heavy duplicates; record swap counts and time to observe cache effects.
- For stability, add original index as a secondary key (higher constant).
Common Pitfalls and Notes
- Child index math is easy to get wrong: 2i+1, 2i+2; continue sifting after swap.
- Recursive sift-down uses O(log n) stack; iterative avoids stack risk.
- Heap sort is unstable; equal elements may reorder.
Runnable Example: Minimal Python
def heap_sort(a):
n=len(a)
def sift(i, size):
while True:
l,r=2*i+1,2*i+2; largest=i
if l<size and a[l]>a[largest]: largest=l
if r<size and a[r]>a[largest]: largest=r
if largest==i: break
a[i],a[largest]=a[largest],a[i]; i=largest
for i in range(n//2-1,-1,-1): sift(i,n)
for end in range(n-1,0,-1):
a[0],a[end]=a[end],a[0]
sift(0,end)
return a
print(heap_sort([4,10,3,5,1]))
References and Further Reading
- CLRS “Introduction to Algorithms” heap sort
- C++
std::make_heap/std::sort_heapimplementation - William Cochran, “Heaps and Priority Queues” notes
Meta
- Reading time: approx. 14 min
- SEO keywords: heap sort, in-place sorting, top-k, priority queue
- Meta description: sorting series (6) explaining heap sort, heapify/sift, complexity, and engineering trade-offs with multilingual examples.
Call to Action (CTA)
- Compare quicksort vs heap sort on the same dataset and observe cache effects.
- If you need top-k, implement a min-heap and benchmark it.
- Follow the series: non-comparison sorting, TimSort/Introsort, selection guide.