Sorting series post 7 focuses on non-comparison sorting: when key range or digit length is bounded, complexity can drop to O(n+k), but space, stability, and feasibility must be balanced.

Target Readers

  • Engineers sorting integer keys with known range/digits.
  • Learners seeking lower-than-n-log-n sorting for large batches.
  • People comparing standard library comparison sorts vs non-comparison.

Background and Motivation

  • Comparison sorts have a lower bound of Omega(n log n); non-comparison sorts bypass that by using range/digit information.
  • Cost: extra space and restricted applicability; implementation must handle stability and memory carefully.

A - Algorithm

Covered algorithms: counting sort, bucket sort, radix sort (LSD).

Basic examples

  • Counting: [4, 2, 2, 8, 3], range 0..9, count -> prefix sums -> stable fill.
  • Radix: group by ones/tens/hundreds digits, stable per digit.

C - Concepts

AlgorithmIdeaTimeSpaceStable
Countingfrequency + prefix sumsO(n+k)O(k+n)Can be stable
Bucketsplit by ranges, sort within bucketsexpected O(n+k)O(n+k)depends on inner sort
Radixstable per-digit sort in multiple passesO(d*(n+b))O(n+b)Yes if each pass is stable
  • k: range size; d: number of digits; b: base (number of buckets).
  • Stability: counting can be stable; radix requires stable passes; bucket depends on inner sort.

E - Engineering

Scenario 1: Small-range integer sort (Python counting)

def counting_sort(a, max_val):
    cnt = [0]*(max_val+1)
    for x in a: cnt[x]+=1
    # prefix sums
    for i in range(1, len(cnt)): cnt[i]+=cnt[i-1]
    out=[0]*len(a)
    for x in reversed(a):
        cnt[x]-=1
        out[cnt[x]] = x
    return out
print(counting_sort([4,2,2,8,3], 9))

Scenario 2: Bucket sort for known float distribution (JavaScript)

Background: floats uniformly distributed in [0,1).

function bucketSort(arr, buckets=10){
  const B=Array.from({length:buckets},()=>[]);
  for(const x of arr){
    const idx = Math.min(buckets-1, Math.floor(x*buckets));
    B[idx].push(x);
  }
  for(const b of B) b.sort((a,b)=>a-b);
  return B.flat();
}
console.log(bucketSort([0.78,0.17,0.39,0.26,0.72,0.94,0.21,0.12,0.23,0.68]));

Scenario 3: Radix sort for large integers (Go, LSD)

package main
import "fmt"

func radixLSD(a []int) {
    maxv := 0
    for _,v := range a { if v>maxv { maxv=v } }
    exp := 1
    buf := make([]int, len(a))
    for maxv/exp > 0 {
        cnt := make([]int, 10)
        for _,v := range a { digit := (v/exp)%10; cnt[digit]++ }
        for i:=1;i<10;i++ { cnt[i]+=cnt[i-1] }
        for i:=len(a)-1;i>=0;i-- {
            d := (a[i]/exp)%10
            cnt[d]--
            buf[cnt[d]] = a[i]
        }
        copy(a, buf)
        exp *= 10
    }
}

func main(){ a:=[]int{170,45,75,90,802,24,2,66}; radixLSD(a); fmt.Println(a) }

Scenario 4: C++ counting sort (small range)

#include <bits/stdc++.h>
using namespace std;
vector<int> counting_sort(const vector<int>& a, int maxv){
    vector<int> cnt(maxv+1), out(a.size());
    for(int x:a) cnt[x]++;
    for(int i=1;i<=maxv;i++) cnt[i]+=cnt[i-1];
    for(int i=(int)a.size()-1;i>=0;i--){
        int x=a[i]; cnt[x]--; out[cnt[x]]=x;
    }
    return out;
}

Scenario 5: Rust radix sort (LSD)

pub fn radix_lsd(a: &mut [u32]) {
    let mut maxv = *a.iter().max().unwrap();
    let mut exp = 1u32;
    let n = a.len();
    let mut buf = vec![0u32; n];
    while maxv/exp > 0 {
        let mut cnt = [0usize; 10];
        for &v in a.iter() { cnt[((v/exp)%10) as usize] += 1; }
        for i in 1..10 { cnt[i] += cnt[i-1]; }
        for &v in a.iter().rev() {
            let d = ((v/exp)%10) as usize;
            cnt[d] -= 1;
            buf[cnt[d]] = v;
        }
        a.copy_from_slice(&buf);
        exp *= 10;
    }
}

R - Reflection

  • Complexity and prerequisites:
    • Counting: O(n+k), k is range size; if k » n, not worth it.
    • Bucket: expected O(n+k) depends on distribution; worst-case can degrade.
    • Radix: O(d*(n+b)), d digits, base b; each pass must be stable.
  • Trade-offs:
    • Memory: counting/bucket needs O(k) or O(n+k); large ranges are infeasible.
    • Stability: counting and radix can be stable; bucket depends on inner sort.
    • Data type: best for integers or keys mappable to integers (dates, IPs, fixed-length strings).
  • Why it works:
    • With bounded range/digits, non-comparison sorts break the n log n lower bound.
    • Great for logs, bucketed stats, and bulk integer sorting.

S - Summary

  • Non-comparison sorts require known range/digits/distribution and can reach O(n+k).
  • Counting is simple and stable for small ranges; radix works for multi-digit integers; bucket depends on distribution.
  • Core risks: memory blow-up, distribution mismatch, missing stability.
  • Selection: small range -> counting; moderate digits + stable -> radix; uniform floats -> bucket; otherwise compare-based.

Practice Guide / Steps

  • Estimate range/digits: if k is close to or larger than n, be cautious.
  • Decide stability: radix needs stable per-digit sort; bucket may need stable inner sort.
  • Control memory: counting array length = max-min+1; radix buffers at least O(n).
  • Test with random, all equal, huge range, skewed distribution.

Common Pitfalls and Notes

  • Counting sort with negatives needs offset (or split positive/negative).
  • Radix loses stability if any digit pass is unstable.
  • Bucket sort degrades on skewed distributions; increase bucket count or sort large buckets with other methods.
  • Large memory usage calls for fallback to comparison sort or chunking.

Runnable Example: Counting with negatives (Python)

def counting_sort_with_neg(a):
    mn, mx = min(a), max(a)
    offset = -mn
    cnt = [0]*(mx - mn + 1)
    for x in a: cnt[x+offset]+=1
    for i in range(1,len(cnt)): cnt[i]+=cnt[i-1]
    out=[0]*len(a)
    for x in reversed(a):
        cnt[x+offset]-=1
        out[cnt[x+offset]] = x
    return out
print(counting_sort_with_neg([3,-1,2,-1,0]))

References and Further Reading

  • CLRS non-comparison sorting chapters
  • Donald Knuth, “The Art of Computer Programming, Vol. 3” (Sorting and Searching)
  • Lower bounds and word-RAM model discussions on integer sorting

Meta

  • Reading time: approx. 15 min
  • SEO keywords: counting sort, bucket sort, radix sort, non-comparison, O(n+k)
  • Meta description: sorting series (7) explaining prerequisites, complexity, and engineering trade-offs for counting/bucket/radix with multilingual examples.

Call to Action (CTA)

  • Estimate range/digits of your dataset and implement a counting or radix sort benchmark.
  • If distribution is skewed, tune bucket counts or use a hybrid inside large buckets.