Post 4 of the sorting series focuses on merge sort: a classic divide-and-conquer algorithm that is stable and runs in O(n log n) time, at the cost of O(n) extra space. It is the foundation of external sorting and many stable library sorts.
Target Readers
- Engineers who need stability and can afford O(n) extra space.
- Learners building divide-and-conquer foundations for quicksort/TimSort.
- People who work with huge files or streams and want to learn external merging.
Background and Motivation
- Merge sort is O(n log n) on all inputs; unlike quicksort, there is no pivot that can degenerate.
- The trade-off is O(n) extra space; in-place variants are complex and costly.
- External sorting (data larger than RAM) uses “chunk sort + k-way merge” based on merge ideas.
A - Algorithm
Problem: Sort a comparable sequence, stable, in O(n log n).
Steps (top-down)
- Split: recursively divide the array into halves.
- Conquer: sort left and right halves.
- Merge: merge two sorted halves using a buffer.
Basic example
Array [5,2,4,6,1,3]:
- Split into [5,2,4] and [6,1,3], then split further.
- Merge [2,4,5] and [1,3,6] -> [1,2,3,4,5,6] (stability preserves the relative order of equal elements).
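The merge step from this example can be sketched in a few lines of Python (the helper name merge is ours); note the <= comparison, which is exactly what makes the sort stable:

```python
def merge(left, right):
    """Merge two sorted lists; on ties, take from the left to stay stable."""
    res, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= (not <) preserves the order of equal keys
            res.append(left[i]); i += 1
        else:
            res.append(right[j]); j += 1
    res.extend(left[i:])          # at most one of these tails is non-empty
    res.extend(right[j:])
    return res

print(merge([2, 4, 5], [1, 3, 6]))  # [1, 2, 3, 4, 5, 6]
```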
C - Concepts
| Key Concept | Description |
|---|---|
| Divide and conquer | Split into subproblems, solve and merge. |
| Stability | When equal, take from left first to preserve relative order. |
| Space | Typical implementations use an O(n) buffer; bottom-up merge still needs a buffer. |
| Variants | Bottom-up merge, block merge, external k-way merge. |
Complexity
- Time: T(n) = 2T(n/2) + O(n) => O(n log n) (best/avg/worst).
- Space: O(n) buffer (external sort depends on block size).
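To see the T(n) = 2T(n/2) + O(n) bound empirically, here is a small Python sketch (names are ours) that counts comparisons and checks them against n * log2(n):

```python
import math
import random

def merge_sort_count(a):
    """Merge sort that also counts comparisons, to check the O(n log n) bound."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, cl = merge_sort_count(a[:mid])
    right, cr = merge_sort_count(a[mid:])
    res, i, j, comps = [], 0, 0, cl + cr
    while i < len(left) and j < len(right):
        comps += 1
        if left[i] <= right[j]:
            res.append(left[i]); i += 1
        else:
            res.append(right[j]); j += 1
    return res + left[i:] + right[j:], comps

n = 1024
data = random.sample(range(10 * n), n)
out, comps = merge_sort_count(data)
# Each of the log2(n) levels does at most n-1 comparisons.
print(out == sorted(data), comps <= n * math.ceil(math.log2(n)))  # True True
```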
E - Engineering
Scenario 1: Stable multi-key sorting (Python)
Background: sort logs by date then user_id, need stability.
Why: Python’s list.sort / sorted use TimSort, which is stable and adaptive, so equal keys keep their original order.
Why this example: it shows a real-world multi-key sort where stability matters, and demonstrates that you can rely on the built-in sort instead of hand-writing a merge routine.
from operator import itemgetter
logs = [("2025-11-21", "u2"), ("2025-11-21", "u1"), ("2025-11-20", "u3")]
logs.sort(key=itemgetter(0, 1))
print(logs)  # [('2025-11-20', 'u3'), ('2025-11-21', 'u1'), ('2025-11-21', 'u2')]
Scenario 2: External sorting for large files (C++)
Background: sort a 10GB integer file with 512MB RAM.
Why: chunk sort + k-way merge keeps memory bounded; add a global sequence id to preserve stability for equal keys.
Approach: read fixed-size chunks, sort in memory, spill to temp files, then k-way merge with a min-heap.
Notes: input is one integer per line; output is sorted integers; chunk_items controls memory use.
#include <algorithm> // std::sort
#include <cstdint> // uint64_t fixed-width integer
#include <cstdio> // std::remove for deleting temp files
#include <fstream> // std::ifstream/std::ofstream
#include <iostream> // std::cerr
#include <queue> // std::priority_queue
#include <string> // std::string, std::to_string
#include <vector> // std::vector
// A simple aggregate type. In C++, "struct" defaults to public members.
// "uint64_t" is a fixed-width unsigned integer used for a stable tie-breaker.
struct Record {
int value;
uint64_t seq;
};
// Each heap entry remembers which temp file it came from.
// "size_t" is an unsigned type for sizes/indices.
struct HeapItem {
Record rec;
size_t file_index;
};
// Comparator functor for std::priority_queue (a max-heap by default).
// By reversing the comparison, we effectively get a min-heap by (value, seq).
struct HeapCmp {
bool operator()(const HeapItem& a, const HeapItem& b) const {
if (a.rec.value != b.rec.value) return a.rec.value > b.rec.value;
return a.rec.seq > b.rec.seq;
}
};
// "static" gives internal linkage (file-local). The ">>" operator reads tokens.
// It returns false when the stream fails (EOF or invalid format).
static bool read_record(std::ifstream& in, Record& out) {
return static_cast<bool>(in >> out.value >> out.seq);
}
int main(int argc, char** argv) {
// argc is the count of arguments; argv is the array of C-strings.
if (argc < 3) {
std::cerr << "Usage: " << argv[0] << " <input> <output> [chunk_items]\n";
std::cerr << "Input format: one integer per line.\n";
return 1;
}
std::string input_path = argv[1];
std::string output_path = argv[2];
size_t chunk_items = 1000000; // Default chunk size (items, not bytes).
if (argc >= 4) {
// std::stoull converts a string to unsigned long long.
chunk_items = static_cast<size_t>(std::stoull(argv[3]));
if (chunk_items == 0) {
std::cerr << "chunk_items must be > 0\n";
return 1;
}
}
std::ifstream in(input_path);
if (!in) {
std::cerr << "Failed to open input: " << input_path << "\n";
return 1;
}
std::vector<std::string> temp_files;
std::vector<Record> buffer;
buffer.reserve(chunk_items);
uint64_t seq = 0; // Monotonic sequence to preserve stability across chunks.
while (true) {
buffer.clear();
int v;
// Read up to chunk_items values; "&&" short-circuits on EOF.
while (buffer.size() < chunk_items && (in >> v)) {
buffer.push_back(Record{v, seq++}); // Aggregate initialization.
}
if (buffer.empty()) break;
// std::sort uses the comparator lambda: "[]" means no captures.
std::sort(buffer.begin(), buffer.end(),
[](const Record& a, const Record& b) {
if (a.value != b.value) return a.value < b.value;
return a.seq < b.seq;
});
std::string tmp_name = "chunk_" + std::to_string(temp_files.size()) + ".tmp";
std::ofstream out(tmp_name);
if (!out) {
std::cerr << "Failed to create temp file: " << tmp_name << "\n";
return 1;
}
for (const auto& rec : buffer) {
out << rec.value << ' ' << rec.seq << '\n';
}
temp_files.push_back(tmp_name);
if (!in) break;
}
// Handle empty input: just create an empty output file.
if (temp_files.empty()) {
std::ofstream out(output_path);
return 0;
}
std::vector<std::ifstream> inputs;
inputs.reserve(temp_files.size());
for (const auto& name : temp_files) {
// emplace_back constructs the ifstream in-place.
inputs.emplace_back(name);
if (!inputs.back()) {
std::cerr << "Failed to open temp file: " << name << "\n";
return 1;
}
}
// priority_queue<value, container, comparator>
std::priority_queue<HeapItem, std::vector<HeapItem>, HeapCmp> heap;
for (size_t i = 0; i < inputs.size(); ++i) {
Record rec;
if (read_record(inputs[i], rec)) {
heap.push(HeapItem{rec, i});
}
}
std::ofstream out(output_path);
if (!out) {
std::cerr << "Failed to open output: " << output_path << "\n";
return 1;
}
// k-way merge: always take the smallest (value, seq) from the heap.
while (!heap.empty()) {
HeapItem top = heap.top();
heap.pop();
out << top.rec.value << '\n';
Record next;
if (read_record(inputs[top.file_index], next)) {
heap.push(HeapItem{next, top.file_index});
}
}
// Clean up temp files using std::remove from <cstdio>.
for (const auto& name : temp_files) {
std::remove(name.c_str());
}
return 0;
}
Example (simulate external sorting with tiny memory):
Use a small chunk_items so the program must spill to temp files and perform a k-way merge, just like the 10GB/512MB scenario.
Input (numbers.txt):
5
1
5
2
2
3
Compile:
g++ -O2 -std=c++17 external_sort.cpp -o external_sort
Run (set chunk_items=3 to force multiple chunks):
./external_sort numbers.txt sorted.txt 3
Output (sorted.txt):
1
2
2
3
5
5
Scenario 3: Frontend stable sorting (JavaScript)
Background: the table must keep the original order of rows with equal keys.
Why: Array.prototype.sort is required to be stable since ES2019; on older engines, fall back to an explicit stable merge.
function mergeSort(arr){
if(arr.length<=1) return arr;
const mid = arr.length>>1;
const left = mergeSort(arr.slice(0,mid));
const right = mergeSort(arr.slice(mid));
const res=[]; let i=0,j=0;
while(i<left.length && j<right.length){
if(left[i].key <= right[j].key) res.push(left[i++]);
else res.push(right[j++]);
}
return res.concat(left.slice(i)).concat(right.slice(j));
}
console.log(mergeSort([{key:1},{key:1},{key:0}]));
Scenario 4: Stable backend sorting (Go)
Background: stable sort by multiple fields.
Why: sort.SliceStable is merge-based and stable.
package main
import (
"fmt"
"sort"
)
type Item struct{ Date string; User string }
func main(){
items := []Item{{"2025-11-21","u2"},{"2025-11-21","u1"},{"2025-11-20","u3"}}
sort.SliceStable(items, func(i, j int) bool {
if items[i].Date == items[j].Date { return items[i].User < items[j].User }
return items[i].Date < items[j].Date
})
fmt.Println(items)
}
R - Reflection
- Complexity analysis: time O(n log n), space O(n); external sort cost dominated by I/O and block size.
- Alternatives:
- vs quicksort: quicksort is in-place with small constants, but it is unstable and can degrade to O(n^2); merge sort is stable with a guaranteed bound.
- vs heap sort: heap sort is in-place but unstable and cache-unfriendly; merge sort is the better fit for stability and external sorting.
- vs TimSort: TimSort is stable and faster on nearly sorted data but more complex; plain merge sort is its foundation.
- Why it is preferred: stable, predictable O(n log n), and the default for external sorting.
S - Summary
- Merge sort provides stable and predictable O(n log n) with O(n) extra space.
- External sorting, stable multi-key sorting, and many stable libraries rely on merge ideas.
- Bottom-up merge avoids recursion depth but still needs buffers.
- If input is nearly sorted and you want more speed, consider TimSort; if memory is tight and stability is not needed, use quick/heap.
- Evaluate stability needs, memory budget, data size, and I/O cost.
Practice Guide / Steps
- Decide stability and memory budget: if O(n) buffer is ok, use merge/stable libraries; otherwise quick/heap.
- Choose implementation: top-down is simplest; bottom-up avoids deep recursion.
- Ensure stability in merge: when equal, pick from left.
- Edge tests: empty, single, all equal, reverse, heavy duplicates.
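The edge tests above can be captured in one reusable harness; here is a minimal Python sketch (check_sort is our name), demonstrated against the built-in sorted:

```python
def check_sort(sort_fn):
    """Run a sort function through the classic edge cases."""
    cases = [
        [],                    # empty
        [7],                   # single element
        [3, 3, 3, 3],          # all equal
        [5, 4, 3, 2, 1],       # reverse sorted
        [2, 1, 2, 1, 2, 1],    # heavy duplicates
    ]
    for case in cases:
        assert sort_fn(case) == sorted(case), case
    return True

print(check_sort(sorted))  # True
```

Any merge sort from the runnable examples below can be passed to the same harness in place of sorted.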
Runnable Examples: Multilingual Implementations
Python (top-down)
def merge_sort(a):
if len(a) <= 1:
return a
mid = len(a)//2
left = merge_sort(a[:mid])
right = merge_sort(a[mid:])
i=j=0; res=[]
while i < len(left) and j < len(right):
if left[i] <= right[j]:
res.append(left[i]); i+=1
else:
res.append(right[j]); j+=1
res.extend(left[i:]); res.extend(right[j:])
return res
print(merge_sort([5,2,4,6,1,3]))
C (bottom-up)
#include <stdlib.h>
void merge(int *a, int *buf, int l, int m, int r){
int i=l, j=m, k=l;
while(i<m && j<r){
if(a[i] <= a[j]) buf[k++] = a[i++];
else buf[k++] = a[j++];
}
while(i<m) buf[k++] = a[i++];
while(j<r) buf[k++] = a[j++];
for(int t=l; t<r; ++t) a[t]=buf[t];
}
void merge_sort(int *a, int n){
int *buf = malloc(sizeof(int)*n);
if(!buf) return; /* allocation failed */
for(int width=1; width<n; width*=2){
for(int i=0; i<n; i+=2*width){
int l = i;
int m = (i + width < n) ? i + width : n;
int r = (i + 2*width < n) ? i + 2*width : n;
merge(a, buf, l, m, r);
}
}
free(buf);
}
C++ (top-down)
#include <vector> // std::vector
using std::vector;
void merge(vector<int>& a, int l, int m, int r, vector<int>& buf){
int i=l,j=m,k=l;
while(i<m && j<r){
if(a[i]<=a[j]) buf[k++]=a[i++];
else buf[k++]=a[j++];
}
while(i<m) buf[k++]=a[i++];
while(j<r) buf[k++]=a[j++];
for(int t=l;t<r;++t) a[t]=buf[t];
}
void merge_sort(vector<int>& a, int l, int r, vector<int>& buf){
if(r-l<=1) return;
int m = l + (r-l)/2;
merge_sort(a,l,m,buf); merge_sort(a,m,r,buf);
merge(a,l,m,r,buf);
}
Go (top-down)
func mergeSort(a []int) []int {
if len(a) <= 1 { return a }
mid := len(a)/2
left := mergeSort(a[:mid])
right := mergeSort(a[mid:])
res := make([]int, 0, len(a))
i, j := 0, 0
for i < len(left) && j < len(right) {
if left[i] <= right[j] { res = append(res, left[i]); i++ } else { res = append(res, right[j]); j++ }
}
res = append(res, left[i:]...)
res = append(res, right[j:]...)
return res
}
Rust (top-down with buffer)
fn merge_sort(a: &mut [i32]) {
let n = a.len();
if n <= 1 { return; }
let mid = n/2;
merge_sort(&mut a[..mid]);
merge_sort(&mut a[mid..]);
let mut buf = a.to_vec();
merge(&a[..mid], &a[mid..], &mut buf[..]);
a.copy_from_slice(&buf);
}
fn merge(left: &[i32], right: &[i32], out: &mut [i32]) {
let (mut i, mut j, mut k) = (0,0,0);
while i < left.len() && j < right.len() {
if left[i] <= right[j] { out[k]=left[i]; i+=1; }
else { out[k]=right[j]; j+=1; }
k+=1;
}
if i < left.len() { out[k..k+left.len()-i].copy_from_slice(&left[i..]); }
if j < right.len() { out[k..k+right.len()-j].copy_from_slice(&right[j..]); }
}
JavaScript (top-down)
function mergeSort(a){
if(a.length<=1) return a;
const mid = a.length>>1;
const left = mergeSort(a.slice(0,mid));
const right = mergeSort(a.slice(mid));
const res=[]; let i=0,j=0;
while(i<left.length && j<right.length){
if(left[i] <= right[j]) res.push(left[i++]);
else res.push(right[j++]);
}
return res.concat(left.slice(i)).concat(right.slice(j));
}
console.log(mergeSort([5,2,4,6,1,3]));
Common Pitfalls and Notes
- Recursion depth: for large n use bottom-up iteration or stack tuning.
- Space usage: evaluate O(n) buffer; external sort requires careful block sizing and merge fan-in.
- Stability: when equal, take left to preserve order.
- Performance: use double-buffering or alternating buffers to reduce copies.
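The alternating-buffer trick mentioned above can be sketched in Python: instead of copying the buffer back after every merge, swap the roles of source and destination each pass (function name is ours):

```python
def merge_sort_pingpong(a):
    """Bottom-up merge sort that alternates two buffers instead of copying back."""
    n = len(a)
    src, dst = list(a), [0] * n
    width = 1
    while width < n:
        for lo in range(0, n, 2 * width):
            mid, hi = min(lo + width, n), min(lo + 2 * width, n)
            i, j, k = lo, mid, lo
            while i < mid and j < hi:
                if src[i] <= src[j]:        # <= keeps the sort stable
                    dst[k] = src[i]; i += 1
                else:
                    dst[k] = src[j]; j += 1
                k += 1
            # Copy whichever side has elements left.
            dst[k:hi] = src[i:mid] if i < mid else src[j:hi]
        src, dst = dst, src                  # swap roles: no copy-back pass
        width *= 2
    return src

print(merge_sort_pingpong([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

This halves the data movement per pass compared with merging into the buffer and copying back.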
Best Practices
- Use stable library sort when available (Python/Java object sort, Go SliceStable).
- Factor out a merge helper to enforce stability; use bottom-up for large arrays.
- External sort: control chunk size, use a priority queue for the k-way merge, and batch I/O.
- For nearly sorted data, consider TimSort; merge is its foundation.
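For the k-way merge in an external-sort prototype, Python's standard library already ships a lazy heap-based merge (heapq.merge), so a first prototype needs no hand-written heap:

```python
import heapq

# Each list stands in for one sorted run (temp file) from the chunk phase.
runs = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
# heapq.merge consumes the runs lazily, keeping only a k-sized heap in memory.
merged = list(heapq.merge(*runs))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In a real prototype, each run would be a generator reading one temp file line by line, so memory stays bounded by the heap size plus I/O buffers.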
Conclusion
- Merge sort is stable and predictably O(n log n), ideal for stable multi-key sorting and external sorting.
- Extra space is the main trade-off; in-place variants are complex and uncommon.
- Bottom-up merge avoids recursion; external merge is essential for huge data.
References and Further Reading
- CLRS “Introduction to Algorithms” merge sort
- TimSort paper and CPython/Java sources (run merging)
- PostgreSQL tuplesort external sorting implementation
Meta
- Reading time: approx. 15 min
- SEO keywords: merge sort, stable sort, external sorting, divide and conquer
- Meta description: sorting series (4) explaining merge sort stability, space trade-offs, external sorting, and multilingual implementations.
Call to Action (CTA)
- Benchmark built-in stable sort vs your merge implementation on your dataset.
- If sorting large files, prototype chunk + k-way merge and measure I/O cost.
- Follow the series: quicksort, heap sort, non-comparison, TimSort/Introsort, selection guide.