This is the preface to a systematic sorting series. It builds a selection map with the ACERS framework so you can quickly decide when to use quicksort, merge sort, heap sort, counting/radix sort, TimSort, Introsort, and more.
Target Readers
- Practice and study: want a structured overview to write a sorting series.
- Backend/data engineers: care about memory, stability, and parallel scenarios.
- Teaching and sharing: need a reusable framework and examples.
Background and Motivation
- Pain points: too many sorting algorithms with similar names; stability and complexity guarantees are easy to mix up; real engineering also involves cache behavior, external sorting, and language-default sorts.
- Goal: provide a “selection cheat sheet + scenarios + code skeletons” so the rest of the series has consistent structure and language.
A - Algorithm
Theme: How to choose a sorting algorithm based on input size, data distribution, and stability requirements.
Basic examples
- Example 1: small array (<= 30) and nearly sorted -> insertion sort.
- Example 2: mid-size random array (~1e4) -> quicksort or Introsort.
- Example 3: huge integer keys with narrow range (<= 1e6) -> counting or bucket sort.
Simple input/output
Input: an array/slice of comparable elements
Output: array/slice sorted in non-decreasing order
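As a quick illustration of Example 1, here is a minimal insertion sort sketch in Python (the function name is mine, for illustration):

```python
def insertion_sort(arr):
    """Sort arr in place; fast when the input is small or nearly sorted."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger elements one slot right until key fits
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr

print(insertion_sort([1, 2, 4, 3, 5]))  # nearly sorted: only a few shifts needed
```

On a nearly sorted input the inner while loop rarely runs, which is why the cost approaches O(n) and why hybrids like TimSort use insertion sort on small runs.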
C - Concepts
| Algorithm | Avg Time | Space | Stable | In-place | Notes |
|---|---|---|---|---|---|
| Bubble/Selection/Insertion | O(n^2) | O(1) | Bubble/Insertion yes; Selection no | Yes | baseline/teaching |
| Shell | between O(n^2) and O(n log n) | O(1) | No | Yes | gap sequence matters |
| Merge | O(n log n) | O(n) | Yes | No | good for external sort |
| Quick | O(n log n) avg; worst O(n^2) | O(log n) | No | Yes | pivot choice is key |
| Heap | O(n log n) | O(1) | No | Yes | good for streaming top-k |
| Counting/Bucket/Radix | O(n + k) | O(n + k) | Counting/Radix yes; Bucket depends on inner sort | No | known range/digits required |
| TimSort | O(n log n) | O(n) | Yes | No | default in Python/Java |
| Introsort | O(n log n) | O(log n) | No | Yes | C++ std::sort |
Categories
- Divide and conquer: merge, quick.
- Heap-based: heap sort.
- Increment-based: shell.
- Non-comparison: counting, bucket, radix.
- Engineering hybrids: TimSort (insertion + merge), Introsort (quick + heap + insertion).
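To make the hybrid idea concrete, here is a toy Python sketch of Introsort-style dispatch. The cutoff value and names are illustrative, and real implementations (e.g. C++ std::sort) partition in place rather than allocating sublists:

```python
import heapq

SMALL = 16  # illustrative cutoff, not the value any real library uses

def hybrid_sort(arr, depth_limit=None):
    """Toy Introsort-style hybrid: insertion sort for tiny inputs, quicksort
    normally, heap sort when recursion gets too deep."""
    if depth_limit is None:
        depth_limit = 2 * max(1, len(arr)).bit_length()
    n = len(arr)
    if n <= SMALL:
        # Insertion sort: cheap on tiny inputs
        for i in range(1, n):
            key, j = arr[i], i - 1
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j -= 1
            arr[j + 1] = key
        return arr
    if depth_limit == 0:
        # Fall back to heap sort to dodge quicksort's O(n^2) worst case
        heapq.heapify(arr)
        out = [heapq.heappop(arr) for _ in range(n)]
        arr[:] = out
        return arr
    # Quicksort step with a middle pivot (allocates sublists for clarity)
    pivot = arr[n // 2]
    lo = [x for x in arr if x < pivot]
    eq = [x for x in arr if x == pivot]
    hi = [x for x in arr if x > pivot]
    arr[:] = hybrid_sort(lo, depth_limit - 1) + eq + hybrid_sort(hi, depth_limit - 1)
    return arr

print(hybrid_sort(list(range(30, 0, -1))))
```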
E - Engineering
Scenario 1: Batch analytics (Python)
Background: process 1e6 log rows and sort them stably by timestamp so ties keep their original order. Why: Python's built-in sort is TimSort, which is stable and fast on partially sorted data.
```python
from operator import itemgetter

logs = [
    ("2025-11-01T10:00:00", "user1", 3),
    ("2025-11-01T10:00:00", "user2", 1),
    ("2025-11-01T10:00:01", "user3", 2),
]

# Sort by timestamp ascending; the stable sort keeps original order for ties
logs.sort(key=itemgetter(0))
print(logs)
```
Scenario 2: Backend pagination sort (Go)
Background: sort items by price ascending and sales descending; dataset < 1e5.
Why: sort.Slice is in-place and flexible; the comparator breaks ties explicitly, so stability is not required here (use sort.SliceStable when it is). The standard library is sufficient.
```go
package main

import (
	"fmt"
	"sort"
)

type Item struct { Price int; Sales int }

func main() {
	items := []Item{{100, 50}, {80, 200}, {100, 120}}
	sort.Slice(items, func(i, j int) bool {
		if items[i].Price == items[j].Price {
			return items[i].Sales > items[j].Sales // sales descending
		}
		return items[i].Price < items[j].Price
	})
	fmt.Println(items)
}
```
Scenario 3: Memory-limited offline sort (C++, external merge)
Background: sort a 10GB integer file with only 512MB RAM. Why: data that exceeds RAM forces an external sort: sort fixed-size chunks in memory, spill them to disk, then k-way merge the sorted runs.
```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    vector<int> buf;
    buf.reserve(1 << 20); // ~1M ints per chunk
    vector<string> tmpFiles;
    int x; int chunk = 0;
    while (cin >> x) {
        buf.push_back(x);
        if (buf.size() == buf.capacity()) {
            sort(buf.begin(), buf.end());
            string name = "chunk" + to_string(chunk++) + ".tmp";
            ofstream out(name);
            for (int v : buf) out << v << "\n";
            tmpFiles.push_back(name);
            buf.clear();
        }
    }
    // The final partial chunk and the k-way merge are omitted; this shows the chunking idea
    cerr << "chunks: " << tmpFiles.size() << "\n";
}
```
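The k-way merge step that the C++ sketch omits can be illustrated with Python's heapq.merge, which lazily yields the globally sorted order from already-sorted streams (plain lists stand in for the chunk files here):

```python
import heapq

# Three already-sorted "chunk files", modeled as lists (stand-ins for file streams)
chunks = [
    [1, 4, 9],
    [2, 3, 8],
    [5, 6, 7],
]

# heapq.merge streams the merged order without loading everything at once
merged = list(heapq.merge(*chunks))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In a real external sort each input would be a file iterator, so memory usage stays proportional to the number of chunks, not the total data size.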
Scenario 4: Frontend table sorting (JavaScript)
Background: sort by multiple columns and keep order for equal keys (stable).
Why: Array.prototype.sort is required to be stable since ES2019; if you must support older engines, map the original index in as an explicit tie-breaker.
```javascript
const rows = [
  { price: 100, sales: 50 },
  { price: 100, sales: 120 },
  { price: 80, sales: 200 },
];

rows
  .map((row, idx) => ({ ...row, idx })) // capture original index as an explicit tie-breaker
  .sort((a, b) => a.price - b.price || a.idx - b.idx)
  .forEach(r => console.log(r));
```
R - Reflection
- Complexity and space:
- O(n log n) workhorses: merge (stable, not in-place), quick (in-place, but O(n^2) worst case), heap (in-place, cache-unfriendly).
- O(n + k) non-comparison: counting/bucket/radix, only when range/digits are bounded.
- O(n^2) baseline: bubble/selection/insertion, mostly for teaching or tiny arrays.
- Alternatives:
- External sort vs in-memory: if data exceeds RAM, chunk + merge is mandatory.
- TimSort vs plain merge: TimSort is faster on partially sorted data and is the engineering default.
- Introsort vs plain quicksort: depth fallback to heap avoids worst-case O(n^2).
- Why this selection is reasonable:
- Stability first: merge/TimSort/counting/radix.
- Memory first: quick/heap/Introsort (in-place).
- Range known: counting/bucket/radix.
- Huge data: external merge with multi-way streaming.
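As a concrete example of the O(n + k) non-comparison family, here is a minimal stable counting sort sketch, assuming integer keys in [0, k):

```python
def counting_sort(arr, k):
    """Stable counting sort for integers in [0, k); O(n + k) time and space."""
    counts = [0] * k
    for x in arr:
        counts[x] += 1
    # Prefix sums turn counts into starting output positions per key
    total = 0
    for i in range(k):
        counts[i], total = total, total + counts[i]
    out = [0] * len(arr)
    for x in arr:          # left-to-right pass keeps equal keys in input order
        out[counts[x]] = x
        counts[x] += 1
    return out

print(counting_sort([3, 1, 4, 1, 5], k=6))  # [1, 1, 3, 4, 5]
```

The O(n + k) space is also why the pitfall below about large ranges matters: if k is huge relative to n, the counts array dominates memory.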
S - Summary
- Selection depends on data size, distribution, stability needs, and memory/external constraints.
- Default to the language built-in sort (often TimSort/Introsort); customize only for special needs.
- Non-comparison sorting can drop complexity to O(n + k) when range/digits are bounded.
- External sorting is essential for data larger than memory; core idea is chunking + multi-way merge.
- Choose criteria first (time, space, stability), then the algorithm.
Practice Guide / Steps
- Step 1: Evaluate size and distribution (random/near-sorted/heavy duplicates).
- Step 2: Decide stability requirement and memory budget.
- Step 3: Use the table to pick a baseline; for Python/Java, prefer built-in stable sort.
- Step 4: Write three boundary tests: all equal, reverse, nearly sorted.
- Step 5: Benchmark large data and record time/memory.
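The three boundary tests from Step 4 can be sketched as a small reusable Python suite (the function names are illustrative):

```python
def is_sorted(a):
    """Check non-decreasing order."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def boundary_suite(sort_fn, n=1000):
    """Run the three boundary tests from Step 4 against any sort function."""
    cases = {
        "all equal": [7] * n,
        "reverse": list(range(n, 0, -1)),
        "nearly sorted": list(range(1, n)) + [0],  # one element out of place
    }
    for name, case in cases.items():
        assert is_sorted(sort_fn(case)), name
    return "ok"

print(boundary_suite(sorted))  # ok
```

Swap in your own sort function for `sorted` to smoke-test it before benchmarking.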
Runnable Example (Quick Benchmark, Python)
```python
import random, time

def bench(n=100000):
    arr = [random.randint(0, 1000000) for _ in range(n)]
    t0 = time.perf_counter()  # perf_counter is the right clock for benchmarks
    sorted(arr)
    t1 = time.perf_counter()
    print(f"n={n}, timsort time={t1 - t0:.3f}s")

if __name__ == "__main__":
    bench(200000)
```
Common Pitfalls and Notes
- Forgetting comparator logic in `Array.sort`/`sort.Slice` may break stability or ordering.
- A fixed pivot in quicksort degenerates on sorted input; use a random pivot or median-of-three.
- Counting/bucket sort can blow memory if range is large; estimate min/max first.
- External sort with too many temp files requires k-way merge or staged merging.
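The fixed-pivot pitfall can be avoided with a random pivot. A short, non-in-place Python sketch (real implementations partition in place):

```python
import random

def quicksort_random_pivot(arr):
    """Quicksort with a random pivot; avoids the O(n^2) degeneration that a
    fixed first-element pivot hits on already-sorted input."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)
    lo = [x for x in arr if x < pivot]
    eq = [x for x in arr if x == pivot]  # grouping equals also handles heavy duplicates
    hi = [x for x in arr if x > pivot]
    return quicksort_random_pivot(lo) + eq + quicksort_random_pivot(hi)

print(quicksort_random_pivot(list(range(10))))  # sorted input is no longer a worst case
```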
Best Practices
- Use standard library sorts in production unless you have explicit constraints.
- Write comparator + tests before implementing the sort; verify stability if needed.
- Sample large data to judge whether bucket/radix is appropriate or external sort is needed.
- Require a “sorting algorithm + rationale” note in PRs for review.
Conclusion
- This preface provides the ACERS selection map for the series.
- Next steps: expand by category (O(n^2) baseline, shell, merge, quick, heap, non-comparison, TimSort, Introsort, selection playbook).
References and Further Reading
- CLRS “Introduction to Algorithms” sorting chapters
- TimSort paper and the CPython `listobject.c` implementation
- C++ `std::sort`/`std::stable_sort` implementation notes
- PostgreSQL external sort (tuplesort)
Meta
- Reading time: approx. 12 min
- SEO keywords: sorting selection, algorithm stability, TimSort, Introsort, external sort
- Meta description: sorting series preface using ACERS to summarize complexity, stability, and engineering scenarios with runnable examples and a selection checklist.
Call to Action (CTA)
- Write a “sorting selection checklist” for your project with size/distribution/stability notes.
- Run the Python benchmark above with your real data distribution.
- Follow the series (quicksort, merge, heap, non-comparison, TimSort/Introsort, selection playbook) and replicate it with the ACERS template.